From morph at debian.org Sat Sep 1 05:19:30 2012 From: morph at debian.org (Sandro Tosi) Date: Sat, 1 Sep 2012 11:19:30 +0200 Subject: [Numpy-discussion] ANN: NumPy 1.7.0b1 release In-Reply-To: References: Message-ID: On Fri, Aug 31, 2012 at 8:07 PM, Sandro Tosi wrote: > On Fri, Aug 31, 2012 at 7:17 PM, Ond?ej ?ert?k wrote: >> If you could create issues at github: https://github.com/numpy/numpy/issues >> that would be great. If you have time, also with some info about the platform >> and how to reproduce it. Or at least a link to the build logs. > > I've reported it here: https://github.com/numpy/numpy/issues/402 I've just spammed the issue tracker with additional issues, reporting all the test suite failures on Debian architectures; issues are 406 -> 414 . Don't hesitate to contact me if you need any support or clarification. Cheers, -- Sandro Tosi (aka morph, morpheus, matrixhasu) My website: http://matrixhasu.altervista.org/ Me at Debian: http://wiki.debian.org/SandroTosi From 275438859 at qq.com Sun Sep 2 09:59:53 2012 From: 275438859 at qq.com (=?gb18030?B?0MTI59byueI=?=) Date: Sun, 2 Sep 2012 21:59:53 +0800 Subject: [Numpy-discussion] encounter many warnings while it's installing scipy Message-ID: Hi,everybody. I encounter many warnings while it's installing scipy with the commend:pip install scipy Such as " warning:XX variable is uninitialized before used [-Wuninitialized]" What makes these warnings? And will they make something wrong with scipy?How can I fix it ?? Thank you so much for your help! -------------- next part -------------- An HTML attachment was scrubbed... URL: From chaoyuejoy at gmail.com Sun Sep 2 12:22:38 2012 From: chaoyuejoy at gmail.com (Chao YUE) Date: Sun, 2 Sep 2012 18:22:38 +0200 Subject: [Numpy-discussion] encounter many warnings while it's installing scipy In-Reply-To: References: Message-ID: I don't think so. Maybe you can just try some functions you're familiar to see if they work as expected. Chao On Sun, Sep 2, 2012 at 3:59 PM, ???? <275438859 at qq.com> wrote: > Hi,everybody. > I encounter many warnings while it's installing scipy with the > commend:pip install scipy > Such as " warning:XX variable is uninitialized before used > [-Wuninitialized]" > What makes these warnings? > And will they make something wrong with scipy?How can I fix it ?? > Thank you so much for your help! > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From tmp50 at ukr.net Sun Sep 2 15:16:58 2012 From: tmp50 at ukr.net (Dmitrey) Date: Sun, 02 Sep 2012 22:16:58 +0300 Subject: [Numpy-discussion] [ANN] New free tool for TSP solving Message-ID: <30653.1346613418.12034604965145542656@ffe12.ukr.net> Hi all, New free tool for TSP solving is available (for downloading as well) - OpenOpt TSP class: TSP (traveling salesman problem). 
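For example, a minimal sketch of the kind of input it expects and of a solve call (the networkx part is standard; the openopt names shown are only indicative of the interface announced here and should be checked against the OpenOpt docs):

import networkx as nx
# small asymmetric instance: directed edges with a 'time' cost attribute
G = nx.DiGraph()
G.add_edge(0, 1, time=5.0)
G.add_edge(1, 0, time=7.0)
G.add_edge(1, 2, time=3.0)
G.add_edge(2, 1, time=4.0)
G.add_edge(0, 2, time=2.0)
G.add_edge(2, 0, time=6.0)

# assumed call pattern, for illustration only:
# from openopt import TSP
# p = TSP(G, objective='time')
# r = p.solve('sa')   # or 'glpk', 'cplex', 'lpsolve', 'interalg'
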
It is written in Python, uses NetworkX graphs on input (another BSD-licensed Python library, de-facto standard graph lib for Python language programmers), can connect to MILP solvers like glpk, cplex, lpsolve, has a couple of other solvers - sa (simulated annealing, Python code by John Montgomery) and interalg If someone is interested, I could implement something from (or beyound) its future plans till next OpenOpt stable release 0.41, that will be 2 weeks later (Sept-15). Regards, D. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tmp50 at ukr.net Mon Sep 3 09:32:29 2012 From: tmp50 at ukr.net (Dmitrey) Date: Mon, 03 Sep 2012 16:32:29 +0300 Subject: [Numpy-discussion] [SciPy-User] [ANN] New free tool for TSP solving In-Reply-To: References: <30653.1346613418.12034604965145542656@ffe12.ukr.net> Message-ID: <49578.1346679149.16454580772003840000@ffe12.ukr.net> --- ???????? ????????? --- ?? ????: "Niki Spahiev" ????: scipy-user at scipy.org ????: 3 ???????? 2012, 13:57:49 ????: Re: [SciPy-User] [ANN] New free tool for TSP solving > > > New free tool for TSP solving is available (for downloading as well) - > OpenOpt TSP class: TSP (traveling salesman problem). Hello Dmitrey, Can this tool solve ATSP problems? Thanks, Niki > Hi, yes - asymmetric (see examples with networkx DiGraph), including multigraphs (networkx MultiDiGraph) as well. > ?_______________________________________________ SciPy-User mailing listSciPy-User at scipy.orghttp://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Sep 4 06:15:39 2012 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 4 Sep 2012 11:15:39 +0100 Subject: [Numpy-discussion] Mysterious test_pareto failure on Travis Message-ID: The last two Travis builds of master have failed consistently with the same error: http://travis-ci.org/#!/numpy/numpy/builds It looks like a real failure -- we're getting the same error on every build variant, some sort of problem in test_pareto. Example: http://travis-ci.org/#!/numpy/numpy/jobs/2328823 The obvious culprit would be the previous commit, which regenerated mtrand.c with Cython 0.17: http://github.com/numpy/numpy/commit/cd9092aa71d23359b33e89d938c55fb14b9bf606 What's weird, though, is that that commit passed just fine on Travis: http://travis-ci.org/#!/numpy/numpy/builds/2313124 It's just the two commits since then that failed. But these commits have been 1-line docstring changes, so I don't see how they could have possibly created the problem. Also, the test passes fine with python 2.7 on my laptop with current master. Can anyone reproduce this failure? Any ideas what might be going on? -n From matthew.brett at gmail.com Tue Sep 4 06:23:40 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 4 Sep 2012 11:23:40 +0100 Subject: [Numpy-discussion] Mysterious test_pareto failure on Travis In-Reply-To: References: Message-ID: Hi, On Tue, Sep 4, 2012 at 11:15 AM, Nathaniel Smith wrote: > The last two Travis builds of master have failed consistently with the > same error: > http://travis-ci.org/#!/numpy/numpy/builds > It looks like a real failure -- we're getting the same error on every > build variant, some sort of problem in test_pareto. 
Example: > http://travis-ci.org/#!/numpy/numpy/jobs/2328823 > > The obvious culprit would be the previous commit, which regenerated > mtrand.c with Cython 0.17: > http://github.com/numpy/numpy/commit/cd9092aa71d23359b33e89d938c55fb14b9bf606 > > What's weird, though, is that that commit passed just fine on Travis: > http://travis-ci.org/#!/numpy/numpy/builds/2313124 > > It's just the two commits since then that failed. But these commits > have been 1-line docstring changes, so I don't see how they could have > possibly created the problem. > > Also, the test passes fine with python 2.7 on my laptop with current master. > > Can anyone reproduce this failure? Any ideas what might be going on? I believe Travis just (a couple of days ago?) switched to Ubuntu 12.04 images - could that be the problem? See you, Matthew From scott.sinclair.za at gmail.com Tue Sep 4 08:47:57 2012 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Tue, 4 Sep 2012 14:47:57 +0200 Subject: [Numpy-discussion] Mysterious test_pareto failure on Travis In-Reply-To: References: Message-ID: On 4 September 2012 12:23, Matthew Brett wrote: > On Tue, Sep 4, 2012 at 11:15 AM, Nathaniel Smith wrote: >> The last two Travis builds of master have failed consistently with the >> same error: >> http://travis-ci.org/#!/numpy/numpy/builds >> It looks like a real failure -- we're getting the same error on every >> build variant, some sort of problem in test_pareto. Example: >> http://travis-ci.org/#!/numpy/numpy/jobs/2328823 >> >> The obvious culprit would be the previous commit, which regenerated >> mtrand.c with Cython 0.17: >> http://github.com/numpy/numpy/commit/cd9092aa71d23359b33e89d938c55fb14b9bf606 >> >> What's weird, though, is that that commit passed just fine on Travis: >> http://travis-ci.org/#!/numpy/numpy/builds/2313124 >> >> It's just the two commits since then that failed. But these commits >> have been 1-line docstring changes, so I don't see how they could have >> possibly created the problem. >> >> Also, the test passes fine with python 2.7 on my laptop with current master. >> >> Can anyone reproduce this failure? Any ideas what might be going on? > > I believe Travis just (a couple of days ago?) switched to Ubuntu 12.04 > images - could that be the problem? For whatever it's worth, tox reports: py24: commands succeeded py25: commands succeeded py26: commands succeeded py27: commands succeeded ERROR: py31: InterpreterNotFound: python3.1 py32: commands succeeded py27-separate: commands succeeded py32-separate: commands succeeded with git revision a72ce7e on my Ubuntu 12.04 machine (64-bit). Cheers, S From njs at pobox.com Tue Sep 4 09:18:18 2012 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 4 Sep 2012 14:18:18 +0100 Subject: [Numpy-discussion] Mysterious test_pareto failure on Travis In-Reply-To: References: Message-ID: On Tue, Sep 4, 2012 at 1:47 PM, Scott Sinclair wrote: > On 4 September 2012 12:23, Matthew Brett wrote: >> On Tue, Sep 4, 2012 at 11:15 AM, Nathaniel Smith wrote: >>> The last two Travis builds of master have failed consistently with the >>> same error: >>> http://travis-ci.org/#!/numpy/numpy/builds >>> It looks like a real failure -- we're getting the same error on every >>> build variant, some sort of problem in test_pareto. 
Example: >>> http://travis-ci.org/#!/numpy/numpy/jobs/2328823 >>> >>> The obvious culprit would be the previous commit, which regenerated >>> mtrand.c with Cython 0.17: >>> http://github.com/numpy/numpy/commit/cd9092aa71d23359b33e89d938c55fb14b9bf606 >>> >>> What's weird, though, is that that commit passed just fine on Travis: >>> http://travis-ci.org/#!/numpy/numpy/builds/2313124 >>> >>> It's just the two commits since then that failed. But these commits >>> have been 1-line docstring changes, so I don't see how they could have >>> possibly created the problem. >>> >>> Also, the test passes fine with python 2.7 on my laptop with current master. >>> >>> Can anyone reproduce this failure? Any ideas what might be going on? >> >> I believe Travis just (a couple of days ago?) switched to Ubuntu 12.04 >> images - could that be the problem? > > For whatever it's worth, tox reports: > > > py24: commands succeeded > py25: commands succeeded > py26: commands succeeded > py27: commands succeeded > ERROR: py31: InterpreterNotFound: python3.1 > py32: commands succeeded > py27-separate: commands succeeded > py32-separate: commands succeeded > > > with git revision a72ce7e on my Ubuntu 12.04 machine (64-bit). It works for me on 64-bit 12.04 too. However, the Travis machines are using *32*-bit 12.04, which might be the problem... anyone got one of those lying around to test? -n From matthew.brett at gmail.com Tue Sep 4 14:02:22 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 4 Sep 2012 19:02:22 +0100 Subject: [Numpy-discussion] Mysterious test_pareto failure on Travis In-Reply-To: References: Message-ID: Hi, On Tue, Sep 4, 2012 at 2:18 PM, Nathaniel Smith wrote: > On Tue, Sep 4, 2012 at 1:47 PM, Scott Sinclair > wrote: >> On 4 September 2012 12:23, Matthew Brett wrote: >>> On Tue, Sep 4, 2012 at 11:15 AM, Nathaniel Smith wrote: >>>> The last two Travis builds of master have failed consistently with the >>>> same error: >>>> http://travis-ci.org/#!/numpy/numpy/builds >>>> It looks like a real failure -- we're getting the same error on every >>>> build variant, some sort of problem in test_pareto. Example: >>>> http://travis-ci.org/#!/numpy/numpy/jobs/2328823 >>>> >>>> The obvious culprit would be the previous commit, which regenerated >>>> mtrand.c with Cython 0.17: >>>> http://github.com/numpy/numpy/commit/cd9092aa71d23359b33e89d938c55fb14b9bf606 >>>> >>>> What's weird, though, is that that commit passed just fine on Travis: >>>> http://travis-ci.org/#!/numpy/numpy/builds/2313124 >>>> >>>> It's just the two commits since then that failed. But these commits >>>> have been 1-line docstring changes, so I don't see how they could have >>>> possibly created the problem. >>>> >>>> Also, the test passes fine with python 2.7 on my laptop with current master. >>>> >>>> Can anyone reproduce this failure? Any ideas what might be going on? >>> >>> I believe Travis just (a couple of days ago?) switched to Ubuntu 12.04 >>> images - could that be the problem? >> >> For whatever it's worth, tox reports: >> >> >> py24: commands succeeded >> py25: commands succeeded >> py26: commands succeeded >> py27: commands succeeded >> ERROR: py31: InterpreterNotFound: python3.1 >> py32: commands succeeded >> py27-separate: commands succeeded >> py32-separate: commands succeeded >> >> >> with git revision a72ce7e on my Ubuntu 12.04 machine (64-bit). > > It works for me on 64-bit 12.04 too. However, the Travis machines are > using *32*-bit 12.04, which might be the problem... 
anyone got one of > those lying around to test? I do - send me your ssh public key somehow? See you, Matthew From ondrej.certik at gmail.com Tue Sep 4 15:31:11 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Tue, 4 Sep 2012 12:31:11 -0700 Subject: [Numpy-discussion] Mysterious test_pareto failure on Travis In-Reply-To: References: Message-ID: On Tue, Sep 4, 2012 at 3:15 AM, Nathaniel Smith wrote: > The last two Travis builds of master have failed consistently with the > same error: > http://travis-ci.org/#!/numpy/numpy/builds > It looks like a real failure -- we're getting the same error on every > build variant, some sort of problem in test_pareto. Example: > http://travis-ci.org/#!/numpy/numpy/jobs/2328823 > > The obvious culprit would be the previous commit, which regenerated > mtrand.c with Cython 0.17: > http://github.com/numpy/numpy/commit/cd9092aa71d23359b33e89d938c55fb14b9bf606 > > What's weird, though, is that that commit passed just fine on Travis: > http://travis-ci.org/#!/numpy/numpy/builds/2313124 > > It's just the two commits since then that failed. But these commits > have been 1-line docstring changes, so I don't see how they could have > possibly created the problem. > > Also, the test passes fine with python 2.7 on my laptop with current master. > > Can anyone reproduce this failure? Any ideas what might be going on? I made this: https://github.com/numpy/numpy/issues/424 It was me who updated the Cython file. It seemed to be working. I've added the issue to the release TODO. Ondrej From ondrej.certik at gmail.com Tue Sep 4 15:41:57 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Tue, 4 Sep 2012 12:41:57 -0700 Subject: [Numpy-discussion] Mysterious test_pareto failure on Travis In-Reply-To: References: Message-ID: On Tue, Sep 4, 2012 at 12:31 PM, Ond?ej ?ert?k wrote: > On Tue, Sep 4, 2012 at 3:15 AM, Nathaniel Smith wrote: >> The last two Travis builds of master have failed consistently with the >> same error: >> http://travis-ci.org/#!/numpy/numpy/builds >> It looks like a real failure -- we're getting the same error on every >> build variant, some sort of problem in test_pareto. Example: >> http://travis-ci.org/#!/numpy/numpy/jobs/2328823 >> >> The obvious culprit would be the previous commit, which regenerated >> mtrand.c with Cython 0.17: >> http://github.com/numpy/numpy/commit/cd9092aa71d23359b33e89d938c55fb14b9bf606 >> >> What's weird, though, is that that commit passed just fine on Travis: >> http://travis-ci.org/#!/numpy/numpy/builds/2313124 >> >> It's just the two commits since then that failed. But these commits >> have been 1-line docstring changes, so I don't see how they could have >> possibly created the problem. >> >> Also, the test passes fine with python 2.7 on my laptop with current master. >> >> Can anyone reproduce this failure? Any ideas what might be going on? > > I made this: > > https://github.com/numpy/numpy/issues/424 > > It was me who updated the Cython file. It seemed to be working. I've > added the issue > to the release TODO. 
Ok, here is how to reproduce the problem: 1) install my numpy-vendor vagrant image (32 bit Ubuntu), as directed in the README: https://github.com/certik/numpy-vendor 2) run tests, you'll get: https://gist.github.com/3625509 Ondrej From ondrej.certik at gmail.com Tue Sep 4 16:17:39 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Tue, 4 Sep 2012 13:17:39 -0700 Subject: [Numpy-discussion] Mysterious test_pareto failure on Travis In-Reply-To: References: Message-ID: On Tue, Sep 4, 2012 at 12:41 PM, Ond?ej ?ert?k wrote: > On Tue, Sep 4, 2012 at 12:31 PM, Ond?ej ?ert?k wrote: >> On Tue, Sep 4, 2012 at 3:15 AM, Nathaniel Smith wrote: >>> The last two Travis builds of master have failed consistently with the >>> same error: >>> http://travis-ci.org/#!/numpy/numpy/builds >>> It looks like a real failure -- we're getting the same error on every >>> build variant, some sort of problem in test_pareto. Example: >>> http://travis-ci.org/#!/numpy/numpy/jobs/2328823 >>> >>> The obvious culprit would be the previous commit, which regenerated >>> mtrand.c with Cython 0.17: >>> http://github.com/numpy/numpy/commit/cd9092aa71d23359b33e89d938c55fb14b9bf606 >>> >>> What's weird, though, is that that commit passed just fine on Travis: >>> http://travis-ci.org/#!/numpy/numpy/builds/2313124 >>> >>> It's just the two commits since then that failed. But these commits >>> have been 1-line docstring changes, so I don't see how they could have >>> possibly created the problem. >>> >>> Also, the test passes fine with python 2.7 on my laptop with current master. >>> >>> Can anyone reproduce this failure? Any ideas what might be going on? >> >> I made this: >> >> https://github.com/numpy/numpy/issues/424 >> >> It was me who updated the Cython file. It seemed to be working. I've >> added the issue >> to the release TODO. > > Ok, here is how to reproduce the problem: > > 1) install my numpy-vendor vagrant image (32 bit Ubuntu), as directed > in the README: > > https://github.com/certik/numpy-vendor > > 2) run tests, you'll get: > > https://gist.github.com/3625509 So the problem was actually introduced much earlier. Most probably it has never worked in 32bit Ubuntu 12.04. 
I tried for example this old commit: https://github.com/numpy/numpy/commit/3882d65c42acf6d5fff8cc9b3f410bb3e49c8af8 and it still fails: https://gist.github.com/3625943 I tried to test even the first commit that introduced the problem: https://github.com/numpy/numpy/commit/898e6bdc625cdd3c97865ef99f8d51c5f43eafff but while it compiled, I got some import error: (py)vagrant at precise32:~/repos/numpy/tools$ nosetests /home/vagrant/repos/numpy/py/local/lib/python2.7/site-packages/numpy/random/tests/test_random.py RuntimeError: module compiled against API version 6 but this version of numpy is 5 E ====================================================================== ERROR: Failure: ImportError (numpy.core.multiarray failed to import) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/vagrant/repos/numpy/py/local/lib/python2.7/site-packages/nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "/home/vagrant/repos/numpy/py/local/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/home/vagrant/repos/numpy/py/local/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/home/vagrant/repos/numpy/py/local/lib/python2.7/site-packages/numpy/random/tests/test_random.py", line 1, in from numpy.testing import TestCase, run_module_suite, assert_ File "/home/vagrant/repos/numpy/py/local/lib/python2.7/site-packages/numpy/__init__.py", line 137, in import add_newdocs File "/home/vagrant/repos/numpy/py/local/lib/python2.7/site-packages/numpy/add_newdocs.py", line 9, in from numpy.lib import add_newdoc File "/home/vagrant/repos/numpy/py/local/lib/python2.7/site-packages/numpy/lib/__init__.py", line 4, in from type_check import * File "/home/vagrant/repos/numpy/py/local/lib/python2.7/site-packages/numpy/lib/type_check.py", line 8, in import numpy.core.numeric as _nx File "/home/vagrant/repos/numpy/py/local/lib/python2.7/site-packages/numpy/core/__init__.py", line 10, in import _sort ImportError: numpy.core.multiarray failed to import ---------------------------------------------------------------------- Ran 1 test in 0.002s FAILED (errors=1) But in any case, this seems to be a problem with the actual 32bit Ubuntu 12.04 itself. So maybe something in gcc has changed that now triggers the problem. Ondrej From njs at pobox.com Tue Sep 4 16:37:35 2012 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 4 Sep 2012 21:37:35 +0100 Subject: [Numpy-discussion] Mysterious test_pareto failure on Travis In-Reply-To: References: Message-ID: On Tue, Sep 4, 2012 at 9:17 PM, Ond?ej ?ert?k wrote: > On Tue, Sep 4, 2012 at 12:41 PM, Ond?ej ?ert?k wrote: >> On Tue, Sep 4, 2012 at 12:31 PM, Ond?ej ?ert?k wrote: >>> On Tue, Sep 4, 2012 at 3:15 AM, Nathaniel Smith wrote: >>>> The last two Travis builds of master have failed consistently with the >>>> same error: >>>> http://travis-ci.org/#!/numpy/numpy/builds >>>> It looks like a real failure -- we're getting the same error on every >>>> build variant, some sort of problem in test_pareto. 
Example: >>>> http://travis-ci.org/#!/numpy/numpy/jobs/2328823 >>>> >>>> The obvious culprit would be the previous commit, which regenerated >>>> mtrand.c with Cython 0.17: >>>> http://github.com/numpy/numpy/commit/cd9092aa71d23359b33e89d938c55fb14b9bf606 >>>> >>>> What's weird, though, is that that commit passed just fine on Travis: >>>> http://travis-ci.org/#!/numpy/numpy/builds/2313124 >>>> >>>> It's just the two commits since then that failed. But these commits >>>> have been 1-line docstring changes, so I don't see how they could have >>>> possibly created the problem. >>>> >>>> Also, the test passes fine with python 2.7 on my laptop with current master. >>>> >>>> Can anyone reproduce this failure? Any ideas what might be going on? >>> >>> I made this: >>> >>> https://github.com/numpy/numpy/issues/424 >>> >>> It was me who updated the Cython file. It seemed to be working. I've >>> added the issue >>> to the release TODO. >> >> Ok, here is how to reproduce the problem: >> >> 1) install my numpy-vendor vagrant image (32 bit Ubuntu), as directed >> in the README: >> >> https://github.com/certik/numpy-vendor >> >> 2) run tests, you'll get: >> >> https://gist.github.com/3625509 > > So the problem was actually introduced much earlier. Most probably it > has never worked > in 32bit Ubuntu 12.04. I tried for example this old commit: > > https://github.com/numpy/numpy/commit/3882d65c42acf6d5fff8cc9b3f410bb3e49c8af8 > > and it still fails: [...] > But in any case, this seems to be a problem with the actual 32bit > Ubuntu 12.04 itself. So maybe something in gcc > has changed that now triggers the problem. To be clear, the mismatching value is: > /home/njs/numpy/.tox/py27/local/lib/python2.7/site-packages/numpy/random/tests/test_random.py(363)test_pareto() -> np.testing.assert_array_almost_equal(actual, desired, decimal=15) (Pdb) actual[1, 0] 52828779.702948704 (Pdb) desired[1, 0] 52828779.702948518 So they do match to 14 decimal points, and it's entirely possible that this is just a problem of the test being too stringent in requiring 15 decimal points of match. Maybe the 32-bit GCC is spilling registers differently, tripping the famous x86 idiosyncrasy where register spilling triggers rounding. But I'd feel better if someone familiar with the pareto code could confirm. -n From josef.pktd at gmail.com Tue Sep 4 16:48:23 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 4 Sep 2012 16:48:23 -0400 Subject: [Numpy-discussion] Mysterious test_pareto failure on Travis In-Reply-To: References: Message-ID: On Tue, Sep 4, 2012 at 4:37 PM, Nathaniel Smith wrote: > On Tue, Sep 4, 2012 at 9:17 PM, Ond?ej ?ert?k wrote: >> On Tue, Sep 4, 2012 at 12:41 PM, Ond?ej ?ert?k wrote: >>> On Tue, Sep 4, 2012 at 12:31 PM, Ond?ej ?ert?k wrote: >>>> On Tue, Sep 4, 2012 at 3:15 AM, Nathaniel Smith wrote: >>>>> The last two Travis builds of master have failed consistently with the >>>>> same error: >>>>> http://travis-ci.org/#!/numpy/numpy/builds >>>>> It looks like a real failure -- we're getting the same error on every >>>>> build variant, some sort of problem in test_pareto. 
Example: >>>>> http://travis-ci.org/#!/numpy/numpy/jobs/2328823 >>>>> >>>>> The obvious culprit would be the previous commit, which regenerated >>>>> mtrand.c with Cython 0.17: >>>>> http://github.com/numpy/numpy/commit/cd9092aa71d23359b33e89d938c55fb14b9bf606 >>>>> >>>>> What's weird, though, is that that commit passed just fine on Travis: >>>>> http://travis-ci.org/#!/numpy/numpy/builds/2313124 >>>>> >>>>> It's just the two commits since then that failed. But these commits >>>>> have been 1-line docstring changes, so I don't see how they could have >>>>> possibly created the problem. >>>>> >>>>> Also, the test passes fine with python 2.7 on my laptop with current master. >>>>> >>>>> Can anyone reproduce this failure? Any ideas what might be going on? >>>> >>>> I made this: >>>> >>>> https://github.com/numpy/numpy/issues/424 >>>> >>>> It was me who updated the Cython file. It seemed to be working. I've >>>> added the issue >>>> to the release TODO. >>> >>> Ok, here is how to reproduce the problem: >>> >>> 1) install my numpy-vendor vagrant image (32 bit Ubuntu), as directed >>> in the README: >>> >>> https://github.com/certik/numpy-vendor >>> >>> 2) run tests, you'll get: >>> >>> https://gist.github.com/3625509 >> >> So the problem was actually introduced much earlier. Most probably it >> has never worked >> in 32bit Ubuntu 12.04. I tried for example this old commit: >> >> https://github.com/numpy/numpy/commit/3882d65c42acf6d5fff8cc9b3f410bb3e49c8af8 >> >> and it still fails: > [...] >> But in any case, this seems to be a problem with the actual 32bit >> Ubuntu 12.04 itself. So maybe something in gcc >> has changed that now triggers the problem. > > To be clear, the mismatching value is: > >> /home/njs/numpy/.tox/py27/local/lib/python2.7/site-packages/numpy/random/tests/test_random.py(363)test_pareto() > -> np.testing.assert_array_almost_equal(actual, desired, decimal=15) > (Pdb) actual[1, 0] > 52828779.702948704 > (Pdb) desired[1, 0] > 52828779.702948518 > > So they do match to 14 decimal points, and it's entirely possible that > this is just a problem of the test being too stringent in requiring 15 > decimal points of match. Maybe the 32-bit GCC is spilling registers > differently, tripping the famous x86 idiosyncrasy where register > spilling triggers rounding. But I'd feel better if someone familiar > with the pareto code could confirm. I don't understand this. Isn't assert_almost_equal testing decimals not significant digits? As I remember, these tests were added to avoid or signal changes in the random number generator, and I don't think a digit up or down makes a difference. 
Josef > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Tue Sep 4 17:16:28 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 4 Sep 2012 15:16:28 -0600 Subject: [Numpy-discussion] Mysterious test_pareto failure on Travis In-Reply-To: References: Message-ID: On Tue, Sep 4, 2012 at 2:37 PM, Nathaniel Smith wrote: > On Tue, Sep 4, 2012 at 9:17 PM, Ond?ej ?ert?k > wrote: > > On Tue, Sep 4, 2012 at 12:41 PM, Ond?ej ?ert?k > wrote: > >> On Tue, Sep 4, 2012 at 12:31 PM, Ond?ej ?ert?k > wrote: > >>> On Tue, Sep 4, 2012 at 3:15 AM, Nathaniel Smith wrote: > >>>> The last two Travis builds of master have failed consistently with the > >>>> same error: > >>>> http://travis-ci.org/#!/numpy/numpy/builds > >>>> It looks like a real failure -- we're getting the same error on every > >>>> build variant, some sort of problem in test_pareto. Example: > >>>> http://travis-ci.org/#!/numpy/numpy/jobs/2328823 > >>>> > >>>> The obvious culprit would be the previous commit, which regenerated > >>>> mtrand.c with Cython 0.17: > >>>> > http://github.com/numpy/numpy/commit/cd9092aa71d23359b33e89d938c55fb14b9bf606 > >>>> > >>>> What's weird, though, is that that commit passed just fine on Travis: > >>>> http://travis-ci.org/#!/numpy/numpy/builds/2313124 > >>>> > >>>> It's just the two commits since then that failed. But these commits > >>>> have been 1-line docstring changes, so I don't see how they could have > >>>> possibly created the problem. > >>>> > >>>> Also, the test passes fine with python 2.7 on my laptop with current > master. > >>>> > >>>> Can anyone reproduce this failure? Any ideas what might be going on? > >>> > >>> I made this: > >>> > >>> https://github.com/numpy/numpy/issues/424 > >>> > >>> It was me who updated the Cython file. It seemed to be working. I've > >>> added the issue > >>> to the release TODO. > >> > >> Ok, here is how to reproduce the problem: > >> > >> 1) install my numpy-vendor vagrant image (32 bit Ubuntu), as directed > >> in the README: > >> > >> https://github.com/certik/numpy-vendor > >> > >> 2) run tests, you'll get: > >> > >> https://gist.github.com/3625509 > > > > So the problem was actually introduced much earlier. Most probably it > > has never worked > > in 32bit Ubuntu 12.04. I tried for example this old commit: > > > > > https://github.com/numpy/numpy/commit/3882d65c42acf6d5fff8cc9b3f410bb3e49c8af8 > > > > and it still fails: > [...] > > But in any case, this seems to be a problem with the actual 32bit > > Ubuntu 12.04 itself. So maybe something in gcc > > has changed that now triggers the problem. > > To be clear, the mismatching value is: > > > > /home/njs/numpy/.tox/py27/local/lib/python2.7/site-packages/numpy/random/tests/test_random.py(363)test_pareto() > -> np.testing.assert_array_almost_equal(actual, desired, decimal=15) > (Pdb) actual[1, 0] > 52828779.702948704 > (Pdb) desired[1, 0] > 52828779.702948518 > > So they do match to 14 decimal points, and it's entirely possible that > this is just a problem of the test being too stringent in requiring 15 > decimal points of match. Maybe the 32-bit GCC is spilling registers > differently, tripping the famous x86 idiosyncrasy where register > spilling triggers rounding. But I'd feel better if someone familiar > with the pareto code could confirm. 
> > I expect it is the tendency of 32 bit gcc to mix 8087 and sse instructions, which have different floating point register precisions. As a result the floating point values can depend on the compiler version and optimization flags. Here it seems that the test is requiring a bit too much precision, although it would be nice to know what the expected precision of the algorithm is. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej.certik at gmail.com Tue Sep 4 17:49:18 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Tue, 4 Sep 2012 14:49:18 -0700 Subject: [Numpy-discussion] Mysterious test_pareto failure on Travis In-Reply-To: References: Message-ID: On Tue, Sep 4, 2012 at 1:48 PM, wrote: > On Tue, Sep 4, 2012 at 4:37 PM, Nathaniel Smith wrote: >> On Tue, Sep 4, 2012 at 9:17 PM, Ond?ej ?ert?k wrote: >>> On Tue, Sep 4, 2012 at 12:41 PM, Ond?ej ?ert?k wrote: >>>> On Tue, Sep 4, 2012 at 12:31 PM, Ond?ej ?ert?k wrote: >>>>> On Tue, Sep 4, 2012 at 3:15 AM, Nathaniel Smith wrote: >>>>>> The last two Travis builds of master have failed consistently with the >>>>>> same error: >>>>>> http://travis-ci.org/#!/numpy/numpy/builds >>>>>> It looks like a real failure -- we're getting the same error on every >>>>>> build variant, some sort of problem in test_pareto. Example: >>>>>> http://travis-ci.org/#!/numpy/numpy/jobs/2328823 >>>>>> >>>>>> The obvious culprit would be the previous commit, which regenerated >>>>>> mtrand.c with Cython 0.17: >>>>>> http://github.com/numpy/numpy/commit/cd9092aa71d23359b33e89d938c55fb14b9bf606 >>>>>> >>>>>> What's weird, though, is that that commit passed just fine on Travis: >>>>>> http://travis-ci.org/#!/numpy/numpy/builds/2313124 >>>>>> >>>>>> It's just the two commits since then that failed. But these commits >>>>>> have been 1-line docstring changes, so I don't see how they could have >>>>>> possibly created the problem. >>>>>> >>>>>> Also, the test passes fine with python 2.7 on my laptop with current master. >>>>>> >>>>>> Can anyone reproduce this failure? Any ideas what might be going on? >>>>> >>>>> I made this: >>>>> >>>>> https://github.com/numpy/numpy/issues/424 >>>>> >>>>> It was me who updated the Cython file. It seemed to be working. I've >>>>> added the issue >>>>> to the release TODO. >>>> >>>> Ok, here is how to reproduce the problem: >>>> >>>> 1) install my numpy-vendor vagrant image (32 bit Ubuntu), as directed >>>> in the README: >>>> >>>> https://github.com/certik/numpy-vendor >>>> >>>> 2) run tests, you'll get: >>>> >>>> https://gist.github.com/3625509 >>> >>> So the problem was actually introduced much earlier. Most probably it >>> has never worked >>> in 32bit Ubuntu 12.04. I tried for example this old commit: >>> >>> https://github.com/numpy/numpy/commit/3882d65c42acf6d5fff8cc9b3f410bb3e49c8af8 >>> >>> and it still fails: >> [...] >>> But in any case, this seems to be a problem with the actual 32bit >>> Ubuntu 12.04 itself. So maybe something in gcc >>> has changed that now triggers the problem. 
>> >> To be clear, the mismatching value is: >> >>> /home/njs/numpy/.tox/py27/local/lib/python2.7/site-packages/numpy/random/tests/test_random.py(363)test_pareto() >> -> np.testing.assert_array_almost_equal(actual, desired, decimal=15) >> (Pdb) actual[1, 0] >> 52828779.702948704 >> (Pdb) desired[1, 0] >> 52828779.702948518 >> >> So they do match to 14 decimal points, and it's entirely possible that >> this is just a problem of the test being too stringent in requiring 15 >> decimal points of match. Maybe the 32-bit GCC is spilling registers >> differently, tripping the famous x86 idiosyncrasy where register >> spilling triggers rounding. But I'd feel better if someone familiar >> with the pareto code could confirm. > > I don't understand this. Isn't assert_almost_equal testing decimals > not significant digits? Yes it is, see the docs: The test is equivalent to ``abs(desired-actual) < 0.5 * 10**(-decimal)``. and the unit test is badly written, because that floating point number has maybe 14 or 15 good significant digits, but it simply does not have 15 digits after the decimal point, so in particular, that test is testing that the two floating point numbers are exactly equal. Here is a fix: https://github.com/numpy/numpy/pull/425 Let me know if it is ok to merge it, so that we can work on other issues and have a working test suite. Ondrej From josef.pktd at gmail.com Tue Sep 4 17:58:46 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 4 Sep 2012 17:58:46 -0400 Subject: [Numpy-discussion] Mysterious test_pareto failure on Travis In-Reply-To: References: Message-ID: On Tue, Sep 4, 2012 at 5:49 PM, Ond?ej ?ert?k wrote: > On Tue, Sep 4, 2012 at 1:48 PM, wrote: >> On Tue, Sep 4, 2012 at 4:37 PM, Nathaniel Smith wrote: >>> On Tue, Sep 4, 2012 at 9:17 PM, Ond?ej ?ert?k wrote: >>>> On Tue, Sep 4, 2012 at 12:41 PM, Ond?ej ?ert?k wrote: >>>>> On Tue, Sep 4, 2012 at 12:31 PM, Ond?ej ?ert?k wrote: >>>>>> On Tue, Sep 4, 2012 at 3:15 AM, Nathaniel Smith wrote: >>>>>>> The last two Travis builds of master have failed consistently with the >>>>>>> same error: >>>>>>> http://travis-ci.org/#!/numpy/numpy/builds >>>>>>> It looks like a real failure -- we're getting the same error on every >>>>>>> build variant, some sort of problem in test_pareto. Example: >>>>>>> http://travis-ci.org/#!/numpy/numpy/jobs/2328823 >>>>>>> >>>>>>> The obvious culprit would be the previous commit, which regenerated >>>>>>> mtrand.c with Cython 0.17: >>>>>>> http://github.com/numpy/numpy/commit/cd9092aa71d23359b33e89d938c55fb14b9bf606 >>>>>>> >>>>>>> What's weird, though, is that that commit passed just fine on Travis: >>>>>>> http://travis-ci.org/#!/numpy/numpy/builds/2313124 >>>>>>> >>>>>>> It's just the two commits since then that failed. But these commits >>>>>>> have been 1-line docstring changes, so I don't see how they could have >>>>>>> possibly created the problem. >>>>>>> >>>>>>> Also, the test passes fine with python 2.7 on my laptop with current master. >>>>>>> >>>>>>> Can anyone reproduce this failure? Any ideas what might be going on? >>>>>> >>>>>> I made this: >>>>>> >>>>>> https://github.com/numpy/numpy/issues/424 >>>>>> >>>>>> It was me who updated the Cython file. It seemed to be working. I've >>>>>> added the issue >>>>>> to the release TODO. 
>>>>> >>>>> Ok, here is how to reproduce the problem: >>>>> >>>>> 1) install my numpy-vendor vagrant image (32 bit Ubuntu), as directed >>>>> in the README: >>>>> >>>>> https://github.com/certik/numpy-vendor >>>>> >>>>> 2) run tests, you'll get: >>>>> >>>>> https://gist.github.com/3625509 >>>> >>>> So the problem was actually introduced much earlier. Most probably it >>>> has never worked >>>> in 32bit Ubuntu 12.04. I tried for example this old commit: >>>> >>>> https://github.com/numpy/numpy/commit/3882d65c42acf6d5fff8cc9b3f410bb3e49c8af8 >>>> >>>> and it still fails: >>> [...] >>>> But in any case, this seems to be a problem with the actual 32bit >>>> Ubuntu 12.04 itself. So maybe something in gcc >>>> has changed that now triggers the problem. >>> >>> To be clear, the mismatching value is: >>> >>>> /home/njs/numpy/.tox/py27/local/lib/python2.7/site-packages/numpy/random/tests/test_random.py(363)test_pareto() >>> -> np.testing.assert_array_almost_equal(actual, desired, decimal=15) >>> (Pdb) actual[1, 0] >>> 52828779.702948704 >>> (Pdb) desired[1, 0] >>> 52828779.702948518 >>> >>> So they do match to 14 decimal points, and it's entirely possible that >>> this is just a problem of the test being too stringent in requiring 15 >>> decimal points of match. Maybe the 32-bit GCC is spilling registers >>> differently, tripping the famous x86 idiosyncrasy where register >>> spilling triggers rounding. But I'd feel better if someone familiar >>> with the pareto code could confirm. >> >> I don't understand this. Isn't assert_almost_equal testing decimals >> not significant digits? > > Yes it is, see the docs: > > The test is equivalent to ``abs(desired-actual) < 0.5 * 10**(-decimal)``. > > and the unit test is badly written, because that floating point > number has maybe 14 or 15 good significant digits, but it simply does not have > 15 digits after the decimal point, so in particular, that test is > testing that the two > floating point numbers are exactly equal. > > > Here is a fix: > > https://github.com/numpy/numpy/pull/425 > > Let me know if it is ok to merge it, so that we can work on other > issues and have a working test suite. 
I think all the tests should be changed to use the new assert with relative tolerance (similar to approx equal) https://gist.github.com/3625943 shows also the other distributions with 15 decimals when the values are larger than 10 for example Josef > > Ondrej > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From ondrej.certik at gmail.com Tue Sep 4 18:29:59 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Tue, 4 Sep 2012 15:29:59 -0700 Subject: [Numpy-discussion] Mysterious test_pareto failure on Travis In-Reply-To: References: Message-ID: On Tue, Sep 4, 2012 at 2:58 PM, wrote: > On Tue, Sep 4, 2012 at 5:49 PM, Ond?ej ?ert?k wrote: >> On Tue, Sep 4, 2012 at 1:48 PM, wrote: >>> On Tue, Sep 4, 2012 at 4:37 PM, Nathaniel Smith wrote: >>>> On Tue, Sep 4, 2012 at 9:17 PM, Ond?ej ?ert?k wrote: >>>>> On Tue, Sep 4, 2012 at 12:41 PM, Ond?ej ?ert?k wrote: >>>>>> On Tue, Sep 4, 2012 at 12:31 PM, Ond?ej ?ert?k wrote: >>>>>>> On Tue, Sep 4, 2012 at 3:15 AM, Nathaniel Smith wrote: >>>>>>>> The last two Travis builds of master have failed consistently with the >>>>>>>> same error: >>>>>>>> http://travis-ci.org/#!/numpy/numpy/builds >>>>>>>> It looks like a real failure -- we're getting the same error on every >>>>>>>> build variant, some sort of problem in test_pareto. Example: >>>>>>>> http://travis-ci.org/#!/numpy/numpy/jobs/2328823 >>>>>>>> >>>>>>>> The obvious culprit would be the previous commit, which regenerated >>>>>>>> mtrand.c with Cython 0.17: >>>>>>>> http://github.com/numpy/numpy/commit/cd9092aa71d23359b33e89d938c55fb14b9bf606 >>>>>>>> >>>>>>>> What's weird, though, is that that commit passed just fine on Travis: >>>>>>>> http://travis-ci.org/#!/numpy/numpy/builds/2313124 >>>>>>>> >>>>>>>> It's just the two commits since then that failed. But these commits >>>>>>>> have been 1-line docstring changes, so I don't see how they could have >>>>>>>> possibly created the problem. >>>>>>>> >>>>>>>> Also, the test passes fine with python 2.7 on my laptop with current master. >>>>>>>> >>>>>>>> Can anyone reproduce this failure? Any ideas what might be going on? >>>>>>> >>>>>>> I made this: >>>>>>> >>>>>>> https://github.com/numpy/numpy/issues/424 >>>>>>> >>>>>>> It was me who updated the Cython file. It seemed to be working. I've >>>>>>> added the issue >>>>>>> to the release TODO. >>>>>> >>>>>> Ok, here is how to reproduce the problem: >>>>>> >>>>>> 1) install my numpy-vendor vagrant image (32 bit Ubuntu), as directed >>>>>> in the README: >>>>>> >>>>>> https://github.com/certik/numpy-vendor >>>>>> >>>>>> 2) run tests, you'll get: >>>>>> >>>>>> https://gist.github.com/3625509 >>>>> >>>>> So the problem was actually introduced much earlier. Most probably it >>>>> has never worked >>>>> in 32bit Ubuntu 12.04. I tried for example this old commit: >>>>> >>>>> https://github.com/numpy/numpy/commit/3882d65c42acf6d5fff8cc9b3f410bb3e49c8af8 >>>>> >>>>> and it still fails: >>>> [...] >>>>> But in any case, this seems to be a problem with the actual 32bit >>>>> Ubuntu 12.04 itself. So maybe something in gcc >>>>> has changed that now triggers the problem. 
>>>> >>>> To be clear, the mismatching value is: >>>> >>>>> /home/njs/numpy/.tox/py27/local/lib/python2.7/site-packages/numpy/random/tests/test_random.py(363)test_pareto() >>>> -> np.testing.assert_array_almost_equal(actual, desired, decimal=15) >>>> (Pdb) actual[1, 0] >>>> 52828779.702948704 >>>> (Pdb) desired[1, 0] >>>> 52828779.702948518 >>>> >>>> So they do match to 14 decimal points, and it's entirely possible that >>>> this is just a problem of the test being too stringent in requiring 15 >>>> decimal points of match. Maybe the 32-bit GCC is spilling registers >>>> differently, tripping the famous x86 idiosyncrasy where register >>>> spilling triggers rounding. But I'd feel better if someone familiar >>>> with the pareto code could confirm. >>> >>> I don't understand this. Isn't assert_almost_equal testing decimals >>> not significant digits? >> >> Yes it is, see the docs: >> >> The test is equivalent to ``abs(desired-actual) < 0.5 * 10**(-decimal)``. >> >> and the unit test is badly written, because that floating point >> number has maybe 14 or 15 good significant digits, but it simply does not have >> 15 digits after the decimal point, so in particular, that test is >> testing that the two >> floating point numbers are exactly equal. >> >> >> Here is a fix: >> >> https://github.com/numpy/numpy/pull/425 >> >> Let me know if it is ok to merge it, so that we can work on other >> issues and have a working test suite. > > I think all the tests should be changed to use the new assert with > relative tolerance (similar to approx equal) > https://gist.github.com/3625943 shows also the other distributions > with 15 decimals when the values are larger than 10 for example That sounds reasonable. Josef, would you mind sending a PR with this better fix? The PR #425 fixes it for now and for me it's now very important to get all the tests pass again on Travis due to their upgrade. I am now working on other failures at Travis: https://github.com/numpy/numpy/issues/394 https://github.com/numpy/numpy/issues/426 As those seem to be causing all the Debian buildbots test failures reported by Sandro. Ondrej From ondrej.certik at gmail.com Tue Sep 4 18:31:41 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Tue, 4 Sep 2012 15:31:41 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.7.0b1 release In-Reply-To: References: Message-ID: On Sat, Sep 1, 2012 at 2:19 AM, Sandro Tosi wrote: > On Fri, Aug 31, 2012 at 8:07 PM, Sandro Tosi wrote: >> On Fri, Aug 31, 2012 at 7:17 PM, Ond?ej ?ert?k wrote: >>> If you could create issues at github: https://github.com/numpy/numpy/issues >>> that would be great. If you have time, also with some info about the platform >>> and how to reproduce it. Or at least a link to the build logs. >> >> I've reported it here: https://github.com/numpy/numpy/issues/402 > > I've just spammed the issue tracker with additional issues, reporting > all the test suite failures on Debian architectures; issues are 406 -> > 414 . > > Don't hesitate to contact me if you need any support or clarification. Thanks Sandro for reporting it! I put all of them into my release issue: https://github.com/numpy/numpy/issues/396 most of the failures seem to be caused by these two issues: https://github.com/numpy/numpy/issues/394 https://github.com/numpy/numpy/issues/426 so I am looking into this now. 
Ondrej From amueller at ais.uni-bonn.de Tue Sep 4 18:31:20 2012 From: amueller at ais.uni-bonn.de (Andreas Mueller) Date: Tue, 04 Sep 2012 23:31:20 +0100 Subject: [Numpy-discussion] ANN: scikit-learn 0.12 Released Message-ID: <50468138.6090000@ais.uni-bonn.de> Dear fellow Pythonistas. I am pleased to announce the release of scikit-learn 0.12 . This release adds several new features, for example multidimensional scaling (MDS), multi-task Lasso and multi-output decision and regression forests. There has also been a lot of progress in documentation and ease of use. Details can be found on the what's new page . Sources and windows binaries are available on sourceforge, through pypi (http://pypi.python.org/pypi/scikit-learn/0.12) or can be installed directly using pip: pip install -U scikit-learn I want to thank all of the developers who made this release possible and welcome our new contributors. Keep on learning, Andy -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From dylantem at umich.edu Tue Sep 4 19:18:16 2012 From: dylantem at umich.edu (Dylan Temple) Date: Tue, 4 Sep 2012 19:18:16 -0400 Subject: [Numpy-discussion] Question Regarding numpy.i and 2-dimensional input array Message-ID: Hello, I am trying to use numpy.i to move a function from python to C that takes a 2-D array as an input and am having difficulty. The code is pretty long so I've made a simple example that recreates the error. example.c: #include #include "example.h" void matrix_example (int n, int m, double *x0, \ int p, double *x) { int i; int j; int k; for (i = 0; i < n; ++i) { for (j=0; j < 0; ++j) { x[k] = 2 * x0[i][j]; ++k; } } } example.h: void matrix_example (int n, int m, double *x0, int p, double *x); example.i: %module example %{ #define SWIG_FILE_WITH_INIT #include "example.h" %} %include "numpy.i" %init %{ import_array(); %} %apply (int DIM1, int DIM2, double *IN_ARRAY2) {(int n, int m, double *x0)}; %apply (int DIM1, double *ARGOUT_ARRAY1) {(int p, double *x)}; I then use prompts: $ swig -python example.i $ python setup.py build_ext --inplace However I get the error: example.c: In function ?matrix_example?: example.c:12: error: subscripted value is neither array nor pointer error: command 'gcc' failed with exit status 1 It seems as though x0 is not actually being passed to the function as a 2-D array. Any help on why this error is happening would be great! Thank you, Dylan Temple -- Dylan Temple Graduate Student, University of Michigan NA&ME Department B.E. Naval Architecture M.S.E Naval Architecture 121 NA&ME Department, 2600 Draper Dr. 607-592-1749 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej.certik at gmail.com Tue Sep 4 19:24:14 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Tue, 4 Sep 2012 16:24:14 -0700 Subject: [Numpy-discussion] Should abs([nan]) be supported? Message-ID: Hi, When running the test suite, there are problems of this kind: https://github.com/numpy/numpy/issues/394 which then causes for example the Debian buildbots tests to fail (https://github.com/numpy/numpy/issues/406). The problem is really simple: >>> from numpy import array, abs, nan >>> a = array([1, nan, 3]) >>> a array([ 1., nan, 3.]) >>> abs(a) __main__:1: RuntimeWarning: invalid value encountered in absolute array([ 1., nan, 3.]) See the issue #394 for detailed explanation why "nan" is being passed to abs(). 
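For a concrete picture, here is a minimal sketch of how the test helpers end up calling abs() on nan (allclose evaluates roughly abs(actual - desired) against the tolerances; whether abs(nan) actually raises the invalid-FPE warning appears to be platform dependent):

import numpy as np
np.seterr(invalid='warn')     # make the default handling explicit
a = np.array([1.0, np.nan, 3.0])
# something like this happens inside (assert_)allclose:
np.abs(a - a)                 # nan - nan -> nan, then abs(nan)
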
Now the question is, what should the right fix be? 1) Should the runtime warning be disabled? 2) Should the tests be reworked, so that "nan" is not tested in allclose()? 3) Should abs() be fixed to not emit the warning? 4) Should the test suite be somehow fixed not to fail if there are runtime warnings? Let me know which direction we should go. Thanks, Ondrej From travis at continuum.io Tue Sep 4 23:38:41 2012 From: travis at continuum.io (Travis Oliphant) Date: Tue, 4 Sep 2012 22:38:41 -0500 Subject: [Numpy-discussion] Should abs([nan]) be supported? In-Reply-To: References: Message-ID: There is an error context that controls how floating point signals are handled. There is a separate control for underflow, overflow, divide by zero, and invalid. IIRC, it was decided on this list a while ago to make the default ignore for underflow and warning for overflow, invalid and divide by zero. However, an oversight pushed versions of NumPy where all the error handlers where set to "ignore" and this test was probably written then. I think the test should be changed to check for RuntimeWarning on some of the cases. This might take a little work as it looks like the code uses generators across multiple tests and would have to be changed to handle expecting warnings. Alternatively, the error context can be set before the test runs and then restored afterwords: olderr = np.seterr(invalid='ignore') abs(a) np.seterr(**olderr) or, using an errstate context --- with np.errstate(invalid='ignore'): abs(a) -Travis On Sep 4, 2012, at 6:24 PM, Ond?ej ?ert?k wrote: > Hi, > > When running the test suite, there are problems of this kind: > > https://github.com/numpy/numpy/issues/394 > > which then causes for example the Debian buildbots tests to fail > (https://github.com/numpy/numpy/issues/406). > The problem is really simple: > > >>>> from numpy import array, abs, nan >>>> a = array([1, nan, 3]) >>>> a > array([ 1., nan, 3.]) >>>> abs(a) > __main__:1: RuntimeWarning: invalid value encountered in absolute > array([ 1., nan, 3.]) > > > See the issue #394 for detailed explanation why "nan" is being passed > to abs(). Now the question is, what should the right fix be? > > 1) Should the runtime warning be disabled? > > 2) Should the tests be reworked, so that "nan" is not tested in allclose()? > > 3) Should abs() be fixed to not emit the warning? > > 4) Should the test suite be somehow fixed not to fail if there are > runtime warnings? > > Let me know which direction we should go. > > Thanks, > Ondrej > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From ondrej.certik at gmail.com Tue Sep 4 23:49:14 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Tue, 4 Sep 2012 20:49:14 -0700 Subject: [Numpy-discussion] Should abs([nan]) be supported? In-Reply-To: References: Message-ID: On Tue, Sep 4, 2012 at 8:38 PM, Travis Oliphant wrote: > > There is an error context that controls how floating point signals are handled. There is a separate control for underflow, overflow, divide by zero, and invalid. IIRC, it was decided on this list a while ago to make the default ignore for underflow and warning for overflow, invalid and divide by zero. > > However, an oversight pushed versions of NumPy where all the error handlers where set to "ignore" and this test was probably written then. I think the test should be changed to check for RuntimeWarning on some of the cases. 
This might take a little work as it looks like the code uses generators across multiple tests and would have to be changed to handle expecting warnings. > > Alternatively, the error context can be set before the test runs and then restored afterwords: > > olderr = np.seterr(invalid='ignore') > abs(a) > np.seterr(**olderr) > > > or, using an errstate context --- > > with np.errstate(invalid='ignore'): > abs(a) I see --- so abs([nan]) should emit a warning, but in the test we should suppress it. I'll work on that. The only thing that I don't understand is why it only happens on some platforms and doesn't on some other platforms (apparently). But it's clear how to fix it now. Thanks for the information. Ondrej From travis at continuum.io Wed Sep 5 01:06:43 2012 From: travis at continuum.io (Travis Oliphant) Date: Wed, 5 Sep 2012 00:06:43 -0500 Subject: [Numpy-discussion] Should abs([nan]) be supported? In-Reply-To: References: Message-ID: <2F3BC0D2-E111-478C-B733-F414BFFAA76A@continuum.io> The framework for catching errors relies on hardware flags getting set and our C code making the right calls to detect those flags. This has usually worked correctly in the past --- but it is an area where changes in compilers or platforms could create problems. We should test to be sure that the correct warnings are issued, I would think. Perhaps using a catch_warnings context would be helpful (from http://docs.python.org/library/warnings.html) import warnings def fxn(): warnings.warn("deprecated", DeprecationWarning) with warnings.catch_warnings(record=True) as w: # Cause all warnings to always be triggered. warnings.simplefilter("always") # Trigger a warning. fxn() # Verify some things assert len(w) == 1 assert issubclass(w[-1].category, DeprecationWarning) assert "deprecated" in str(w[-1].message) -Travis On Sep 4, 2012, at 10:49 PM, Ond?ej ?ert?k wrote: > On Tue, Sep 4, 2012 at 8:38 PM, Travis Oliphant wrote: >> >> There is an error context that controls how floating point signals are handled. There is a separate control for underflow, overflow, divide by zero, and invalid. IIRC, it was decided on this list a while ago to make the default ignore for underflow and warning for overflow, invalid and divide by zero. >> >> However, an oversight pushed versions of NumPy where all the error handlers where set to "ignore" and this test was probably written then. I think the test should be changed to check for RuntimeWarning on some of the cases. This might take a little work as it looks like the code uses generators across multiple tests and would have to be changed to handle expecting warnings. >> >> Alternatively, the error context can be set before the test runs and then restored afterwords: >> >> olderr = np.seterr(invalid='ignore') >> abs(a) >> np.seterr(**olderr) >> >> >> or, using an errstate context --- >> >> with np.errstate(invalid='ignore'): >> abs(a) > > I see --- so abs([nan]) should emit a warning, but in the test we > should suppress it. > I'll work on that. > > The only thing that I don't understand is why it only happens on some > platforms and doesn't on some other platforms (apparently). But it's > clear how to fix it now. > > Thanks for the information. > > Ondrej > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wfspotz at sandia.gov Wed Sep 5 01:38:49 2012 From: wfspotz at sandia.gov (Bill Spotz) Date: Tue, 4 Sep 2012 23:38:49 -0600 Subject: [Numpy-discussion] [EXTERNAL] Question Regarding numpy.i and 2-dimensional input array In-Reply-To: References: Message-ID: Dylan, The error appears to be in example.c, not example_wrap.c. The way you have written it, I think it should be double **x0. But the SWIG typemaps require double*. You could write it with double *x0, but then index with x0[i*m+j]. Or, if you want the [i][j] indexing, you could write a wrapper function with double *x1 that calls your matrix_example() function with &x1. And then wrap the wrapper function. -Bill On Sep 4, 2012, at 5:18 PM, Dylan Temple wrote: > Hello, > > I am trying to use numpy.i to move a function from python to C that takes a 2-D array as an input and am having difficulty. The code is pretty long so I've made a simple example that recreates the error. > > example.c: > > #include > #include "example.h" > > void matrix_example (int n, int m, double *x0, \ > int p, double *x) > { > int i; > int j; > int k; > for (i = 0; i < n; ++i) { > for (j=0; j < 0; ++j) { > x[k] = 2 * x0[i][j]; > ++k; > } > } > } > > example.h: > > void matrix_example (int n, int m, double *x0, int p, double *x); > > example.i: > > %module example > > %{ > #define SWIG_FILE_WITH_INIT > #include "example.h" > %} > > %include "numpy.i" > > %init %{ > import_array(); > %} > > %apply (int DIM1, int DIM2, double *IN_ARRAY2) {(int n, int m, double *x0)}; > %apply (int DIM1, double *ARGOUT_ARRAY1) {(int p, double *x)}; > > I then use prompts: > > $ swig -python example.i > $ python setup.py build_ext --inplace > > However I get the error: > example.c: In function ?matrix_example?: > example.c:12: error: subscripted value is neither array nor pointer > error: command 'gcc' failed with exit status 1 > > It seems as though x0 is not actually being passed to the function as a 2-D array. Any help on why this error is happening would be great! > > > Thank you, > Dylan Temple > > -- > Dylan Temple > Graduate Student, University of Michigan NA&ME Department > B.E. Naval Architecture > M.S.E Naval Architecture > 121 NA&ME Department, 2600 Draper Dr. > 607-592-1749 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion ** Bill Spotz ** ** Sandia National Laboratories Voice: (505)845-0170 ** ** P.O. Box 5800 Fax: (505)284-0154 ** ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** From 275438859 at qq.com Wed Sep 5 03:50:18 2012 From: 275438859 at qq.com (=?gb18030?B?0MTI59byueI=?=) Date: Wed, 5 Sep 2012 15:50:18 +0800 Subject: [Numpy-discussion] encounter error while it's testing Message-ID: Hi,everybody. I have installed scipy with commend:"pip install scipy" (OSX lion 10.7.4) But I encounter error while testing.BTW,the test of numpy is OK. I input: >>> import scipy >>> scipy.test() And this is the respond: Running unit tests for scipy NumPy version 1.8.0.dev-e60c70d NumPy is installed in /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy SciPy version 0.12.0.dev-858610f SciPy is installed in /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy Python version 2.7.3 (default, Apr 19 2012, 00:55:09) [GCC 4.2.1 (Based on Apple Inc. 
build 5658) (LLVM build 2335.15.00)] nose version 1.1.2 .....................................................................................................................................................................................F.FFPython(334,0x7fff7d1a2960) malloc: *** error for object 0x7fa0544e7f28: incorrect checksum for freed object - object was probably modified after being freed. *** set a breakpoint in malloc_error_break to debug Abort trap: 6 -------------- next part -------------- An HTML attachment was scrubbed... URL: From 275438859 at qq.com Wed Sep 5 12:57:14 2012 From: 275438859 at qq.com (=?gb18030?B?0MTI59byueI=?=) Date: Thu, 6 Sep 2012 00:57:14 +0800 Subject: [Numpy-discussion] problem with scipy's test Message-ID: Hi,every body. I encounter the error while the scipy is testing . I wanna know why and how to fix it.(OSX lion 10.7.4) here is part of the respond: AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:general, typ=d, which=SA, sigma=0.5, mattype=asarray, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ 15.86892331, 0.0549568 ], [ 14.15864153, 0.31381369], [ 10.99691307, 0.37543458],... y: array([[ 3.19549052, 0.0549568 ], [ 2.79856422, 0.31381369], [ 1.67526354, 0.37543458],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SA', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Python/2.7/site-packages/nose-1.1.2-py2.7.egg/nose/case.py", line 197, in runTest self.test(*self.arg) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 249, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", line 1178, in assert_allclose verbose=verbose, header=header) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", line 644, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:general, typ=d, which=SA, sigma=0.5, mattype=asarray, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[-0.36892684, -0.01935691], [-0.26850996, -0.11053158], [-0.40976156, -0.13223572],... y: array([[-0.43633077, -0.01935691], [-0.25161386, -0.11053158], [-0.36756684, -0.13223572],... ---------------------------------------------------------------------- Ran 5501 tests in 56.993s FAILED (KNOWNFAIL=13, SKIP=42, failures=76) -------------- next part -------------- An HTML attachment was scrubbed... URL: From nouiz at nouiz.org Wed Sep 5 13:36:15 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Wed, 5 Sep 2012 13:36:15 -0400 Subject: [Numpy-discussion] Numpy 1.7b1 API change cause big trouble Message-ID: Hi, I spent up to now 2 or 3 days making change to Theano to support numpy 1.7b1. But now, I just find an interface change that will need recoding a function, not just small code change. The problem is that we can't access fields from PyArrayObject anymore, we absolutely must use the old macro/newly function. For the data field, the new function don't allow to set it. There is no function that allow to do this. 
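For context, a rough sketch (not Theano's actual code) of the pattern at issue: with the NumPy 1.7 deprecation guard defined, the struct fields are only reachable through accessor calls, and while the base object has a setter, the data pointer has a getter only.

/* Sketch only; assumes the 1.7 deprecation guard is active. */
#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
#include <Python.h>
#include <numpy/arrayobject.h>

static int
rebind_base(PyArrayObject *arr, PyObject *new_base)
{
    /* Reading goes through accessors now: */
    void *data     = PyArray_DATA(arr);     /* was arr->data        */
    npy_intp *dims = PyArray_DIMS(arr);     /* was arr->dimensions  */
    (void)data; (void)dims;

    /* Setting the base object has a dedicated function
       (it steals a reference to new_base)...            */
    if (PyArray_SetBaseObject(arr, new_base) < 0) {
        return -1;
    }
    /* ...but there is no PyArray_SetDataPtr() counterpart, so the old
       "arr->data = other_buffer;" assignment has no replacement here. */
    return 0;
}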
After so much time spent on small syntactic change, I don't feel making more complex change today. Also, I think there should be a function PyArray_SetDataPtr as similar to PyArray_SetBaseObject. Do you plan to add one? I though that you wanted to force the removing of the old API, but I never hear you wanted to disable this. The only current work around is to create a new Array, but then we need to update the refcount and other stuff. This will make the code slower too. I'll make another post on the problem I got when making Theano work with numpy 1.7b1, but I want to make this functionality removing problem into its own thread. So, do you plan to add an PyArray_SetDataPtr? Do you agree we should have one? Fred From cournape at gmail.com Wed Sep 5 13:54:16 2012 From: cournape at gmail.com (David Cournapeau) Date: Wed, 5 Sep 2012 18:54:16 +0100 Subject: [Numpy-discussion] encounter error while it's testing In-Reply-To: References: Message-ID: On Wed, Sep 5, 2012 at 8:50 AM, ???? <275438859 at qq.com> wrote: > Hi,everybody. > I have installed scipy with commend:"pip install scipy" (OSX lion > 10.7.4) > But I encounter error while testing.BTW,the test of numpy is OK. gcc-llvm (the default gcc) is known to not work with scipy. It may a bug in gcc-llvm (or, more unlikely, in scipy). I recommend you use the binaries with the python from python.org website, or to use clang to build it on lion. David From njs at pobox.com Wed Sep 5 13:56:45 2012 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 5 Sep 2012 18:56:45 +0100 Subject: [Numpy-discussion] Numpy 1.7b1 API change cause big trouble In-Reply-To: References: Message-ID: On Wed, Sep 5, 2012 at 6:36 PM, Fr?d?ric Bastien wrote: > Hi, > > I spent up to now 2 or 3 days making change to Theano to support numpy > 1.7b1. But now, I just find an interface change that will need > recoding a function, not just small code change. My understanding was that 1.7 is not supposed to require any code changes... so, separate from your actual question about assigning to the data field can I ask: are the changes you're talking about just to avoid *deprecated* APIs, or did you have actual problems running Theano against 1.7b1? And if you had actual problems, could you say what? (Or just post a diff of the changes you found you had to make, which should amount to the same thing?) -n From cournape at gmail.com Wed Sep 5 14:05:07 2012 From: cournape at gmail.com (David Cournapeau) Date: Wed, 5 Sep 2012 19:05:07 +0100 Subject: [Numpy-discussion] Numpy 1.7b1 API change cause big trouble In-Reply-To: References: Message-ID: Hi Frederic, On Wed, Sep 5, 2012 at 6:36 PM, Fr?d?ric Bastien wrote: > Hi, > > I spent up to now 2 or 3 days making change to Theano to support numpy > 1.7b1. But now, I just find an interface change that will need > recoding a function, not just small code change. > > The problem is that we can't access fields from PyArrayObject anymore, > we absolutely must use the old macro/newly function. Why can't you adress the PyArrayObject anymore ? It is deprecated, but the structure itself has not changed. It would certainly be a significant issue if that is not possible anymore, as it would be a significant API break. > > For the data field, the new function don't allow to set it. There is > no function that allow to do this. After so much time spent on small > syntactic change, I don't feel making more complex change today. > > Also, I think there should be a function PyArray_SetDataPtr as similar > to PyArray_SetBaseObject. > > Do you plan to add one? 
I though that you wanted to force the removing > of the old API, but I never hear you wanted to disable this. It was a design mistake to leak this in the first place, so the end goal (not for 1.7), is certainly to 'forbid' access. It is necessary to move numpy forward and keep ABI compatibility later on. Adding functions to directly access the underlying structures would defeat a lot of this. Regarding the need for new API: - speed issues: do you have any concrete measurements (or usecases) where this is problematic ? - updating the refcount: can you give an example ? thanks, David From lists at onerussian.com Wed Sep 5 16:38:15 2012 From: lists at onerussian.com (Yaroslav Halchenko) Date: Wed, 5 Sep 2012 16:38:15 -0400 Subject: [Numpy-discussion] FWIW: "regressions" of dependees of nukmpy 1.7.0b1 Message-ID: <20120905203815.GA5866@onerussian.com> Recently Sandro uploaded 1.7.0b1 into Debian experimental so I decided to see if this bleeding edge version doesn't break some of its dependees... Below is a copy of http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid.summary first FAILED/ok column is when building against sid numpy version 1.6.2-1 and the second one is against 1.7.0~b1. I think some 'ok -> FAILED' might be indicative of regressions (myself looking into two new funny failures in pymvpa2's master). Some FAILED->FAILED could be ignored (e.g. I forgotten to provide /dev/shm so multiprocessing was failing)... Enjoy Testing builds against python-numpy_1.7.0~b1-1.dsc aster_10.6.0-1-4.dsc FAILED FAILED aster_10.6.0-1-4_amd64.build avogadro_1.0.3-5.dsc FAILED ok babel_1.4.0.dfsg-8.dsc ok ok basemap_1.0.3+dfsg-2.dsc ok ok biosig4c++_1.3.0-2.dsc ok ok brian_1.3.1-1.dsc ok ok cfflib_2.0.5-1.dsc ok ok cmor_2.8.0-2.dsc ok ok connectomeviewer_2.1.0-1.dsc ok ok cython_0.15.1-2.dsc ok FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/cython_0.15.1-2_amd64.build dballe_5.18-1.dsc ok ok dipy_0.5.0-3.dsc ok ok dolfin_1.0.0-7.dsc FAILED ok flann_1.7.1-4.dsc ok ok fonttools_2.3-1.dsc ok ok gamera_3.3.3-2.dsc ok ok gdal_1.9.0-3.dsc ok ok getfem++_4.1.1-10.dsc FAILED ok gnudatalanguage_0.9.2-4.dsc ok ok gnuradio_3.6.1-1.dsc FAILED ok guiqwt_2.1.6-4.dsc FAILED ok h5py_2.0.1-2.dsc ok ok joblib_0.6.4-3.dsc ok ok lazyarray_0.1.0-1.dsc ok ok libfreenect_0.1.2+dfsg-6.dsc ok ok libgetdata_0.7.3-6.dsc ok ok libmpikmeans_1.5-1.dsc ok ok libvigraimpex_1.7.1+dfsg1-3.dsc ok FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/libvigraimpex_1.7.1+dfsg1-3_amd64.build lintian_2.5.10.1.dsc FAILED ok magics++_2.14.11-4.dsc ok ok mathgl_1.11.2-14.dsc FAILED ok matplotlib_1.1.1~rc2-1.dsc FAILED ok mayavi2_4.1.0-1.dsc FAILED ok mdp_3.2+git78-g7db3c50-3.dsc ok ok mgltools-bhtree_1.5.6~rc3~cvs.20120206-1.dsc ok ok mgltools-dejavu_1.5.6~rc3~cvs.20120206-1.dsc ok ok mgltools-geomutils_1.5.6~rc3~cvs.20120601-1.dsc ok ok mgltools-gle_1.5.6~rc3~cvs.20120601-1.dsc ok ok mgltools-molkit_1.5.6~rc3~cvs.20120206-1.dsc ok ok mgltools-opengltk_1.5.6~rc3~cvs.20120601-1.dsc ok ok mgltools-pyglf_1.5.6~rc3~cvs.20120601-1.dsc ok ok mgltools-sff_1.5.6~rc3~cvs.20120601-1.dsc ok ok mgltools-utpackages_1.5.6~rc3~cvs.20120601-1.dsc ok ok mgltools-vision_1.5.6~rc3~cvs.20120601-1.dsc ok ok mgltools-visionlibraries_1.5.6~rc3~cvs.20120601-1.dsc ok ok mlpy_2.2.0~dfsg1-2.dsc ok ok mmass_5.2.0-2.dsc ok ok model-builder_0.4.1-6.dsc ok ok 
mpi4py_1.3+hg20120611-1.dsc ok ok mypaint_1.0.0-1.dsc ok ok necpp_1.5.0+cvs20101003-2.1.dsc ok ok neo_0.2.0-1.dsc ok ok nexus_4.2.1-svn1614-1.dsc FAILED ok nibabel_1.2.2-1.dsc ok ok nipy_0.2.0-1.dsc ok FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/nipy_0.2.0-1_amd64.build nitime_0.4-2.dsc ok ok nlopt_2.2.4+dfsg-2.dsc ok ok numexpr_2.0.1-3.dsc FAILED FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/numexpr_2.0.1-3_amd64.build numm_0.4-1.dsc FAILED ok opencv_2.3.1-11.dsc ok ok openmeeg_2.0.0.dfsg-5.dsc FAILED ok openopt_0.38+svn1589-1.dsc ok ok pandas_0.8.1-1.dsc ok ok pdb2pqr_1.8-1.dsc ok ok pebl_1.0.2-2.dsc ok ok plplot_5.9.9-5.dsc FAILED ok psignifit3_3.0~beta.20120611.1-1.dsc ok ok pycuda_2012.1-1.dsc ok ok pydicom_0.9.6-1.dsc ok ok pyentropy_0.4.1-1.dsc ok ok pyepr_0.6.1-2.dsc ok ok pyevolve_0.6~rc1+svn398+dfsg-2.dsc ok ok pyfai_0.3.5-1.dsc ok ok pyfits_3.0.8-2.dsc ok ok pyformex_0.8.6-4.dsc ok ok pygame_1.9.1release+dfsg-6.dsc FAILED ok pygrib_1.9.3-1.dsc ok ok pygtk_2.24.0-3.dsc ok ok pylibtiff_0.3.0~svn78-3.dsc ok FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/pylibtiff_0.3.0~svn78-3_amd64.build pymca_4.6.0-2.dsc ok ok pymol_1.5.0.1-2.dsc ok ok pymvpa_0.4.8-1.dsc ok FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/pymvpa_0.4.8-1_amd64.build pymvpa2_2.1.0-1.dsc ok FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/pymvpa2_2.1.0-1_amd64.build pynifti_0.20100607.1-4.dsc ok ok pynn_0.7.4-1.dsc ok ok pyopencl_2012.1-1.dsc ok ok pyqwt3d_0.1.7~cvs20090625-9.dsc FAILED ok pyqwt5_5.2.1~cvs20091107+dfsg-6.dsc FAILED ok pysparse_1.1-1.dsc ok ok pysurfer_0.3+git15-gae6cbb1-1.1.dsc ok ok pytables_2.3.1-3.dsc FAILED FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/pytables_2.3.1-3_amd64.build pytango_7.2.3-2.dsc ok ok python-ase_3.6.0.2515-1.dsc ok FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/python-ase_3.6.0.2515-1_amd64.build python-biggles_1.6.6-1.dsc ok ok python-biom-format_1.0.0-1.dsc ok ok python-biopython_1.59-1.dsc ok ok python-chaco_4.1.0-1.dsc ok ok python-cogent_1.5.1-2.dsc ok ok python-cpl_0.3.6-1.dsc ok ok python-csa_0.1.0-1.1.dsc ok ok python-enable_4.1.0-1.dsc ok ok python-fabio_0.0.8-1.dsc ok ok python-fftw_0.2.2-1.dsc ok ok python-gnuplot_1.8-1.1.dsc ok ok python-networkx_1.7~rc1-3.dsc ok ok python-neuroshare_0.8.5-1.dsc ok ok python-pywcs_1.11-1.dsc ok ok python-scientific_2.8-3.dsc ok ok python-scipy_0.10.1+dfsg1-4.dsc ok ok python-shapely_1.2.14-1.dsc ok ok python-visual_5.12-1.4.dsc ok ok pytools_2011.5-2.dsc ok ok pywavelets_0.2.0-5.dsc ok ok pyzmq_2.2.0-1.dsc ok ok qiime_1.5.0-2.dsc ok ok rdkit_201203-3.dsc ok ok rpy_1.0.3-22.dsc ok ok rpy2_2.2.6-1.dsc ok ok scikit-learn_0.11.0-2.dsc ok FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/scikit-learn_0.11.0-2_amd64.build shogun_1.1.0-6.dsc FAILED ok skimage_0.6.1-1.dsc ok FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/skimage_0.6.1-1_amd64.build spherepack_3.2-4.dsc ok ok statsmodels_0.4.2-1.dsc ok FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/statsmodels_0.4.2-1_amd64.build 
stimfit_0.10.18-1.1.dsc ok ok syfi_1.0.0.dfsg-1.dsc ok ok taurus_3.0.0-1.dsc FAILED ok tifffile_20120421-1.dsc ok ok uncertainties_1.8-1.dsc ok ok veusz_1.15-1.dsc FAILED ok vistrails_2.0.alpha~1-3.dsc ok ok wrapitk-python_3.20.1.5.dsc FAILED FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/wrapitk-python_3.20.1.5_amd64.build wsjt_5.9.7.r383-1.6.dsc ok ok yade_0.80.1-2.dsc FAILED ok yp-svipc_0.14-2.dsc ok ok -- Yaroslav O. Halchenko Postdoctoral Fellow, Department of Psychological and Brain Sciences Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From lists at onerussian.com Wed Sep 5 17:02:19 2012 From: lists at onerussian.com (Yaroslav Halchenko) Date: Wed, 5 Sep 2012 17:02:19 -0400 Subject: [Numpy-discussion] FWIW: "regressions" of dependees of nukmpy 1.7.0b1 In-Reply-To: <20120905203815.GA5866@onerussian.com> References: <20120905203815.GA5866@onerussian.com> Message-ID: <20120905210219.GP5871@onerussian.com> quick question -- either this is a desired effect that ndarray.base is no longer chains to point to all parent arrays? following code produces different outputs with 1.6.3 and 1.7.0b1: $> python -c 'import numpy as np; print np.__version__; a=np.arange(10); print a[:4].base is a, a[:4][:3].base is a, a[:4][:3].base.base is a' 1.6.2 True False True 1.7.0rc1.dev-ea23de8 True True False On Wed, 05 Sep 2012, Yaroslav Halchenko wrote: > pymvpa2_2.1.0-1.dsc ok FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/pymvpa2_2.1.0-1_amd64.build -- Yaroslav O. Halchenko Postdoctoral Fellow, Department of Psychological and Brain Sciences Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From lists at onerussian.com Wed Sep 5 17:14:32 2012 From: lists at onerussian.com (Yaroslav Halchenko) Date: Wed, 5 Sep 2012 17:14:32 -0400 Subject: [Numpy-discussion] FWIW: "regressions" of dependees of numpy 1.7.0b1 In-Reply-To: <20120905203815.GA5866@onerussian.com> References: <20120905203815.GA5866@onerussian.com> Message-ID: <20120905211432.GQ5871@onerussian.com> and another, quite weird one -- initially it was crashing with the same error on np.dot(Vh.T, U.T) but while adding print statements to troubleshoot it, started to fail on print: File "/home/yoh/proj/pymvpa/pymvpa/mvpa2/mappers/procrustean.py", line 164, in _train print "Vh:", Vh File "/home/yoh/python-env/numpy/local/lib/python2.7/site-packages/numpy/core/numeric.py", line 1471, in array_str return array2string(a, max_line_width, precision, suppress_small, ' ', "", str) File "/home/yoh/python-env/numpy/local/lib/python2.7/site-packages/numpy/core/arrayprint.py", line 440, in array2string elif reduce(product, a.shape) == 0: TypeError: object of type 'float' has no len() here is part of pdb session: Vh: > /home/yoh/python-env/numpy/local/lib/python2.7/site-packages/numpy/core/arrayprint.py(440)array2string() -> elif reduce(product, a.shape) == 0: (Pdb) up > /home/yoh/python-env/numpy/local/lib/python2.7/site-packages/numpy/core/numeric.py(1471)array_str() -> return array2string(a, max_line_width, precision, suppress_small, ' ', "", str) (Pdb) print a [[-0.99818262 0.06026149] [ 0.06026149 0.99818262]] *(Pdb) print a.__class__ (Pdb) down > 
/home/yoh/python-env/numpy/local/lib/python2.7/site-packages/numpy/core/arrayprint.py(440)array2string() -> elif reduce(product, a.shape) == 0: (Pdb) print reduce(product, a.shape) 4 (Pdb) c ERROR it might be that this valgrind msg would be relevant ;) : ==10281== Invalid read of size 4 ==10281== at 0x88C6973: _descriptor_from_pep3118_format (buffer.c:791) ==10281== by 0x88C6B0E: _array_from_buffer_3118 (ctors.c:1193) ==10281== by 0x88E7ABB: PyArray_GetArrayParamsFromObject (ctors.c:1378) ==10281== by 0x88E7F98: PyArray_FromAny (ctors.c:1580) ==10281== by 0x88EE895: PyArray_CheckFromAny (ctors.c:1758) ==10281== by 0x88EF7E2: _array_fromobject (multiarraymodule.c:1644) ==10281== by 0x4F148D: PyEval_EvalFrameEx (in /home/yoh/python-env/numpy/bin/python) ==10281== by 0x4F1DAF: PyEval_EvalCodeEx (in /home/yoh/python-env/numpy/bin/python) ==10281== by 0x4EAFD7: PyEval_EvalFrameEx (in /home/yoh/python-env/numpy/bin/python) ==10281== by 0x4F1DAF: PyEval_EvalCodeEx (in /home/yoh/python-env/numpy/bin/python) ==10281== by 0x4EAFD7: PyEval_EvalFrameEx (in /home/yoh/python-env/numpy/bin/python) ==10281== by 0x4EB221: PyEval_EvalFrameEx (in /home/yoh/python-env/numpy/bin/python) ==10281== Address 0x75c3a04 is 4 bytes inside a block of size 6 alloc'd ==10281== at 0x4C28BED: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==10281== by 0x88C6911: _descriptor_from_pep3118_format (buffer.c:776) ==10281== by 0x88C6B0E: _array_from_buffer_3118 (ctors.c:1193) ==10281== by 0x88E7ABB: PyArray_GetArrayParamsFromObject (ctors.c:1378) ==10281== by 0x88E7F98: PyArray_FromAny (ctors.c:1580) ==10281== by 0x88EE895: PyArray_CheckFromAny (ctors.c:1758) ==10281== by 0x88EF7E2: _array_fromobject (multiarraymodule.c:1644) ==10281== by 0x4F148D: PyEval_EvalFrameEx (in /home/yoh/python-env/numpy/bin/python) ==10281== by 0x4F1DAF: PyEval_EvalCodeEx (in /home/yoh/python-env/numpy/bin/python) ==10281== by 0x4EAFD7: PyEval_EvalFrameEx (in /home/yoh/python-env/numpy/bin/python) ==10281== by 0x4F1DAF: PyEval_EvalCodeEx (in /home/yoh/python-env/numpy/bin/python) ==10281== by 0x4EAFD7: PyEval_EvalFrameEx (in /home/yoh/python-env/numpy/bin/python) ==10281== ==10281== Invalid read of size 4 ==10281== at 0x88C6973: _descriptor_from_pep3118_format (buffer.c:791) ==10281== by 0x88E0BAB: PyArray_DTypeFromObjectHelper (common.c:287) ==10281== by 0x88E1012: PyArray_DTypeFromObject.constprop.277 (common.c:111) ==10281== by 0x88E7C74: PyArray_GetArrayParamsFromObject (ctors.c:1453) ==10281== by 0x88E7F98: PyArray_FromAny (ctors.c:1580) ==10281== by 0x88EE895: PyArray_CheckFromAny (ctors.c:1758) ==10281== by 0x88EF7E2: _array_fromobject (multiarraymodule.c:1644) ==10281== by 0x4F148D: PyEval_EvalFrameEx (in /home/yoh/python-env/numpy/bin/python) ==10281== by 0x4F1DAF: PyEval_EvalCodeEx (in /home/yoh/python-env/numpy/bin/python) ==10281== by 0x4EAFD7: PyEval_EvalFrameEx (in /home/yoh/python-env/numpy/bin/python) ==10281== by 0x4F1DAF: PyEval_EvalCodeEx (in /home/yoh/python-env/numpy/bin/python) ==10281== by 0x4EAFD7: PyEval_EvalFrameEx (in /home/yoh/python-env/numpy/bin/python) ==10281== Address 0x7852e94 is 4 bytes inside a block of size 6 alloc'd ==10281== at 0x4C28BED: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==10281== by 0x88C6911: _descriptor_from_pep3118_format (buffer.c:776) ==10281== by 0x88E0BAB: PyArray_DTypeFromObjectHelper (common.c:287) ==10281== by 0x88E1012: PyArray_DTypeFromObject.constprop.277 (common.c:111) ==10281== by 0x88E7C74: PyArray_GetArrayParamsFromObject (ctors.c:1453) ==10281== 
by 0x88E7F98: PyArray_FromAny (ctors.c:1580) ==10281== by 0x88EE895: PyArray_CheckFromAny (ctors.c:1758) ==10281== by 0x88EF7E2: _array_fromobject (multiarraymodule.c:1644) ==10281== by 0x4F148D: PyEval_EvalFrameEx (in /home/yoh/python-env/numpy/bin/python) ==10281== by 0x4F1DAF: PyEval_EvalCodeEx (in /home/yoh/python-env/numpy/bin/python) ==10281== by 0x4EAFD7: PyEval_EvalFrameEx (in /home/yoh/python-env/numpy/bin/python) ==10281== by 0x4F1DAF: PyEval_EvalCodeEx (in /home/yoh/python-env/numpy/bin/python) On Wed, 05 Sep 2012, Yaroslav Halchenko wrote: > Recently Sandro uploaded 1.7.0b1 into Debian experimental so I decided to see > if this bleeding edge version doesn't break some of its dependees... Below is > a copy of > http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid.summary > first FAILED/ok column is when building against sid numpy version 1.6.2-1 and > the second one is against 1.7.0~b1. I think some 'ok -> FAILED' might be > indicative of regressions (myself looking into two new funny failures in > pymvpa2's master). Some FAILED->FAILED could be ignored (e.g. I forgotten to > provide /dev/shm so multiprocessing was failing)... Enjoy > Testing builds against python-numpy_1.7.0~b1-1.dsc > aster_10.6.0-1-4.dsc FAILED FAILED aster_10.6.0-1-4_amd64.build > avogadro_1.0.3-5.dsc FAILED ok > babel_1.4.0.dfsg-8.dsc ok ok > basemap_1.0.3+dfsg-2.dsc ok ok > biosig4c++_1.3.0-2.dsc ok ok > brian_1.3.1-1.dsc ok ok > cfflib_2.0.5-1.dsc ok ok > cmor_2.8.0-2.dsc ok ok > connectomeviewer_2.1.0-1.dsc ok ok > cython_0.15.1-2.dsc ok FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/cython_0.15.1-2_amd64.build > dballe_5.18-1.dsc ok ok > dipy_0.5.0-3.dsc ok ok > dolfin_1.0.0-7.dsc FAILED ok > flann_1.7.1-4.dsc ok ok > fonttools_2.3-1.dsc ok ok > gamera_3.3.3-2.dsc ok ok > gdal_1.9.0-3.dsc ok ok > getfem++_4.1.1-10.dsc FAILED ok > gnudatalanguage_0.9.2-4.dsc ok ok > gnuradio_3.6.1-1.dsc FAILED ok > guiqwt_2.1.6-4.dsc FAILED ok > h5py_2.0.1-2.dsc ok ok > joblib_0.6.4-3.dsc ok ok > lazyarray_0.1.0-1.dsc ok ok > libfreenect_0.1.2+dfsg-6.dsc ok ok > libgetdata_0.7.3-6.dsc ok ok > libmpikmeans_1.5-1.dsc ok ok > libvigraimpex_1.7.1+dfsg1-3.dsc ok FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/libvigraimpex_1.7.1+dfsg1-3_amd64.build > lintian_2.5.10.1.dsc FAILED ok > magics++_2.14.11-4.dsc ok ok > mathgl_1.11.2-14.dsc FAILED ok > matplotlib_1.1.1~rc2-1.dsc FAILED ok > mayavi2_4.1.0-1.dsc FAILED ok > mdp_3.2+git78-g7db3c50-3.dsc ok ok > mgltools-bhtree_1.5.6~rc3~cvs.20120206-1.dsc ok ok > mgltools-dejavu_1.5.6~rc3~cvs.20120206-1.dsc ok ok > mgltools-geomutils_1.5.6~rc3~cvs.20120601-1.dsc ok ok > mgltools-gle_1.5.6~rc3~cvs.20120601-1.dsc ok ok > mgltools-molkit_1.5.6~rc3~cvs.20120206-1.dsc ok ok > mgltools-opengltk_1.5.6~rc3~cvs.20120601-1.dsc ok ok > mgltools-pyglf_1.5.6~rc3~cvs.20120601-1.dsc ok ok > mgltools-sff_1.5.6~rc3~cvs.20120601-1.dsc ok ok > mgltools-utpackages_1.5.6~rc3~cvs.20120601-1.dsc ok ok > mgltools-vision_1.5.6~rc3~cvs.20120601-1.dsc ok ok > mgltools-visionlibraries_1.5.6~rc3~cvs.20120601-1.dsc ok ok > mlpy_2.2.0~dfsg1-2.dsc ok ok > mmass_5.2.0-2.dsc ok ok > model-builder_0.4.1-6.dsc ok ok > mpi4py_1.3+hg20120611-1.dsc ok ok > mypaint_1.0.0-1.dsc ok ok > necpp_1.5.0+cvs20101003-2.1.dsc ok ok > neo_0.2.0-1.dsc ok ok > nexus_4.2.1-svn1614-1.dsc FAILED ok > nibabel_1.2.2-1.dsc ok ok > 
nipy_0.2.0-1.dsc ok FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/nipy_0.2.0-1_amd64.build > nitime_0.4-2.dsc ok ok > nlopt_2.2.4+dfsg-2.dsc ok ok > numexpr_2.0.1-3.dsc FAILED FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/numexpr_2.0.1-3_amd64.build > numm_0.4-1.dsc FAILED ok > opencv_2.3.1-11.dsc ok ok > openmeeg_2.0.0.dfsg-5.dsc FAILED ok > openopt_0.38+svn1589-1.dsc ok ok > pandas_0.8.1-1.dsc ok ok > pdb2pqr_1.8-1.dsc ok ok > pebl_1.0.2-2.dsc ok ok > plplot_5.9.9-5.dsc FAILED ok > psignifit3_3.0~beta.20120611.1-1.dsc ok ok > pycuda_2012.1-1.dsc ok ok > pydicom_0.9.6-1.dsc ok ok > pyentropy_0.4.1-1.dsc ok ok > pyepr_0.6.1-2.dsc ok ok > pyevolve_0.6~rc1+svn398+dfsg-2.dsc ok ok > pyfai_0.3.5-1.dsc ok ok > pyfits_3.0.8-2.dsc ok ok > pyformex_0.8.6-4.dsc ok ok > pygame_1.9.1release+dfsg-6.dsc FAILED ok > pygrib_1.9.3-1.dsc ok ok > pygtk_2.24.0-3.dsc ok ok > pylibtiff_0.3.0~svn78-3.dsc ok FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/pylibtiff_0.3.0~svn78-3_amd64.build > pymca_4.6.0-2.dsc ok ok > pymol_1.5.0.1-2.dsc ok ok > pymvpa_0.4.8-1.dsc ok FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/pymvpa_0.4.8-1_amd64.build > pymvpa2_2.1.0-1.dsc ok FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/pymvpa2_2.1.0-1_amd64.build > pynifti_0.20100607.1-4.dsc ok ok > pynn_0.7.4-1.dsc ok ok > pyopencl_2012.1-1.dsc ok ok > pyqwt3d_0.1.7~cvs20090625-9.dsc FAILED ok > pyqwt5_5.2.1~cvs20091107+dfsg-6.dsc FAILED ok > pysparse_1.1-1.dsc ok ok > pysurfer_0.3+git15-gae6cbb1-1.1.dsc ok ok > pytables_2.3.1-3.dsc FAILED FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/pytables_2.3.1-3_amd64.build > pytango_7.2.3-2.dsc ok ok > python-ase_3.6.0.2515-1.dsc ok FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/python-ase_3.6.0.2515-1_amd64.build > python-biggles_1.6.6-1.dsc ok ok > python-biom-format_1.0.0-1.dsc ok ok > python-biopython_1.59-1.dsc ok ok > python-chaco_4.1.0-1.dsc ok ok > python-cogent_1.5.1-2.dsc ok ok > python-cpl_0.3.6-1.dsc ok ok > python-csa_0.1.0-1.1.dsc ok ok > python-enable_4.1.0-1.dsc ok ok > python-fabio_0.0.8-1.dsc ok ok > python-fftw_0.2.2-1.dsc ok ok > python-gnuplot_1.8-1.1.dsc ok ok > python-networkx_1.7~rc1-3.dsc ok ok > python-neuroshare_0.8.5-1.dsc ok ok > python-pywcs_1.11-1.dsc ok ok > python-scientific_2.8-3.dsc ok ok > python-scipy_0.10.1+dfsg1-4.dsc ok ok > python-shapely_1.2.14-1.dsc ok ok > python-visual_5.12-1.4.dsc ok ok > pytools_2011.5-2.dsc ok ok > pywavelets_0.2.0-5.dsc ok ok > pyzmq_2.2.0-1.dsc ok ok > qiime_1.5.0-2.dsc ok ok > rdkit_201203-3.dsc ok ok > rpy_1.0.3-22.dsc ok ok > rpy2_2.2.6-1.dsc ok ok > scikit-learn_0.11.0-2.dsc ok FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/scikit-learn_0.11.0-2_amd64.build > shogun_1.1.0-6.dsc FAILED ok > skimage_0.6.1-1.dsc ok FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/skimage_0.6.1-1_amd64.build > spherepack_3.2-4.dsc ok ok > statsmodels_0.4.2-1.dsc ok FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/statsmodels_0.4.2-1_amd64.build > stimfit_0.10.18-1.1.dsc ok ok > syfi_1.0.0.dfsg-1.dsc ok ok > 
taurus_3.0.0-1.dsc FAILED ok > tifffile_20120421-1.dsc ok ok > uncertainties_1.8-1.dsc ok ok > veusz_1.15-1.dsc FAILED ok > vistrails_2.0.alpha~1-3.dsc ok ok > wrapitk-python_3.20.1.5.dsc FAILED FAILED http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/wrapitk-python_3.20.1.5_amd64.build > wsjt_5.9.7.r383-1.6.dsc ok ok > yade_0.80.1-2.dsc FAILED ok > yp-svipc_0.14-2.dsc ok ok -- Yaroslav O. Halchenko Postdoctoral Fellow, Department of Psychological and Brain Sciences Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From njs at pobox.com Wed Sep 5 17:41:48 2012 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 5 Sep 2012 22:41:48 +0100 Subject: [Numpy-discussion] FWIW: "regressions" of dependees of nukmpy 1.7.0b1 In-Reply-To: <20120905210219.GP5871@onerussian.com> References: <20120905203815.GA5866@onerussian.com> <20120905210219.GP5871@onerussian.com> Message-ID: On Wed, Sep 5, 2012 at 10:02 PM, Yaroslav Halchenko wrote: > quick question -- either this is a desired effect that ndarray.base is no > longer chains to point to all parent arrays? following code produces > different outputs with 1.6.3 and 1.7.0b1: > > $> python -c 'import numpy as np; print np.__version__; a=np.arange(10); print a[:4].base is a, a[:4][:3].base is a, a[:4][:3].base.base is a' > > 1.6.2 > True False True > > 1.7.0rc1.dev-ea23de8 > True True False It is an intentional change: https://github.com/numpy/numpy/commit/b7cc20ad#L5R77 but the benefits aren't necessarily *that* compelling, so it could certainly be revisited if there are unforeseen downsides. (Mostly it means that intermediate view objects can be deallocated when not otherwise referenced.) Is it somehow causing a problem for you? AFAICT introspection on .base is just a bad idea to start with, but... -n From lists at onerussian.com Wed Sep 5 17:57:39 2012 From: lists at onerussian.com (Yaroslav Halchenko) Date: Wed, 5 Sep 2012 17:57:39 -0400 Subject: [Numpy-discussion] FWIW: "regressions" of dependees of nukmpy 1.7.0b1 In-Reply-To: References: <20120905203815.GA5866@onerussian.com> <20120905210219.GP5871@onerussian.com> Message-ID: <20120905215739.GB5866@onerussian.com> On Wed, 05 Sep 2012, Nathaniel Smith wrote: > It is an intentional change: > https://github.com/numpy/numpy/commit/b7cc20ad#L5R77 > but the benefits aren't necessarily *that* compelling, so it could > certainly be revisited if there are unforeseen downsides. (Mostly it > means that intermediate view objects can be deallocated when not > otherwise referenced.) Is it somehow causing a problem for you? not really -- just fails our unittests which relied upon > introspection on .base is just a bad idea to start with, but... public interface and previous assumptions ;) since this chaining is actually not of importance for that test (we just cared to not deal with a copy of the actual load), I will tune it up so it would work under any numpy's handling here (chain or not to chain). -- Yaroslav O. 
Halchenko Postdoctoral Fellow, Department of Psychological and Brain Sciences Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From ondrej.certik at gmail.com Wed Sep 5 18:03:51 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Wed, 5 Sep 2012 15:03:51 -0700 Subject: [Numpy-discussion] Numpy 1.7b1 API change cause big trouble In-Reply-To: References: Message-ID: Hi Fred, On Wed, Sep 5, 2012 at 10:56 AM, Nathaniel Smith wrote: > On Wed, Sep 5, 2012 at 6:36 PM, Fr?d?ric Bastien wrote: >> Hi, >> >> I spent up to now 2 or 3 days making change to Theano to support numpy >> 1.7b1. But now, I just find an interface change that will need >> recoding a function, not just small code change. > > My understanding was that 1.7 is not supposed to require any code > changes... so, separate from your actual question about assigning to > the data field can I ask: are the changes you're talking about just to > avoid *deprecated* APIs, or did you have actual problems running > Theano against 1.7b1? And if you had actual problems, could you say > what? (Or just post a diff of the changes you found you had to make, > which should amount to the same thing?) Thank you for trying the beta version, that was the purpose to put it out there and see if it breaks things. As others said, if you can give us more details, that'd be great. Let's get it fixed before rc1. Ondrej From friedrichromstedt at gmail.com Wed Sep 5 19:55:20 2012 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Thu, 6 Sep 2012 01:55:20 +0200 Subject: [Numpy-discussion] how is y += x computed when y.strides = (0, 8) and x.strides=(16, 8) ? In-Reply-To: References: Message-ID: Poor Sebastian, you make the mistake of asking difficult questions. I noticed that it should be [6, 10] not [6, 12], and in fact is with numpy-1.4.1; while I observe the [4, 6] result with numpy-1.6.1. Logs follow: numpy-1.4.1 in Python-2.6.5 on Mac (intel 64bit) with Python + numpy built from sources dual-arch. The result in terms of output does not depend on the architecture chosen for run. The other is numpy-1.6.1 with Python-2.7.2. numpy-1.4.1 (64bit Python 2.6.5): Python 2.6.5 (r265:79063, Jul 18 2010, 12:14:53) [GCC 4.2.1 (Apple Inc. build 5659)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> print numpy.__version__ 1.4.1 >>> import numpy >>> >>> x = numpy.arange(6).reshape((3,2)) >>> y = numpy.arange(2) >>> >>> print 'x=\n', x x= [[0 1] [2 3] [4 5]] >>> print 'y=\n', y y= [0 1] >>> >>> u,v = numpy.broadcast_arrays(x, y) >>> >>> print 'u=\n', u u= [[0 1] [2 3] [4 5]] >>> print 'v=\n', v v= [[0 1] [0 1] [0 1]] >>> print 'v.strides=\n', v.strides v.strides= (0, 8) >>> >>> v += u >>> >>> print 'v=\n', v # expectation: v = [[6,12], [6,12], [6,12]] v= [[ 6 10] [ 6 10] [ 6 10]] >>> print 'u=\n', u u= [[0 1] [2 3] [4 5]] >>> print 'y=\n', y # expectation: y = [6,12] y= [ 6 10] And numpy-1.6.1 (64bit Python-2.7.2): Python 2.7.2 (default, Mar 15 2012, 15:42:23) [GCC 4.2.1 (Apple Inc. build 5664)] on darwin Type "help", "copyright", "credits" or "license" for more information. 
[fiddling with sys.maxint edited out] >>> import numpy >>> >>> x = numpy.arange(6).reshape((3,2)) >>> y = numpy.arange(2) >>> >>> print 'x=\n', x x= [[0 1] [2 3] [4 5]] >>> print 'y=\n', y y= [0 1] >>> >>> u,v = numpy.broadcast_arrays(x, y) >>> >>> print 'u=\n', u u= [[0 1] [2 3] [4 5]] >>> print 'v=\n', v v= [[0 1] [0 1] [0 1]] >>> print 'v.strides=\n', v.strides v.strides= (0, 8) >>> >>> v += u >>> >>> print 'v=\n', v # expectation: v = [[6,12], [6,12], [6,12]] v= [[4 6] [4 6] [4 6]] >>> print 'u=\n', u u= [[0 1] [2 3] [4 5]] >>> print 'y=\n', y # expectation: y = [6,12] y= [4 6] Maybe this helps bisecting it. Friedrich. P.S.: I took the time scrolling through the tickets, with an empty set resulting (first three pages by date or so). This does not mean such a ticket does not exist. Also the docs are rather quiet about this (e.g. http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.__iadd__.html#numpy.ndarray.__iadd__, or http://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html#arithmetic-and-comparison-operations). My guess is that v is duplicated during the in-place addition (including strides), and the results of the calculation (using the original v) are written to that copy. After that, the copy's data would be [4, 6] with the strides as for v. So copying over to v's "original" data or letting it point to the copy and freeing the remaining copy would explain the behaviour after all. Maybe there's a sensible explanation for why it should be done that way. It is and remains a guess after all. I find the problem difficult to term after all. Maybe "In-place operation not in-place for zero-strided first operand?" I don't think it is easy to get a good notion of an "in-place operation" for a zero-strided first operand. It might be that your definition does just not match up. But I think in-place operations should be expected to be independent on the order of execution of the element-wise operations, even if these elements share data (as in v in this case, the first operand). This criterion is fulfilled by your expectation and numpy-1.4.1 but not by numpy-1.6.1. I noticed that it's not necessary to duplicate v's strides in the hypothesis noted above. The neglection of the other element-operations would then happen when copying the element-wise results over to v's storage. A remedy might be to let the back-copy respect the operation it is used for. So if the back-copying happens because an addition is to be in-place, it could use in-place addition instead of assigment. As a result, the element-wise operations on the "repeat()'ed" copy would be "chained together" by the operation under use. The non-commutative operations like subtraction or division would use addition or mulitiplication, resp., because the subtraction or division character can be thought of as an inversion of the second operand only before the whole in-place operations takes place, leading to a commutative in-place operation. Hence the "chained" operands in the back-copy operation form an associative and commutative expression after all. Nevertheless, a duplication of the first operands would occur in our case, since the same data from the first operand appears in several element's expressions in the copy buffer. The result would then be, in our example here, neither [6, 10] nor [4, 6], but [6, 12] instead. 
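For reference, the three numbers being compared here can each be written as an explicit, well-defined expression (a sketch using the same x and y as in the logs above), which makes the difference between the behaviours easier to see:

import numpy as np

x = np.arange(6).reshape(3, 2)
y = np.arange(2)

y + x.sum(axis=0)     # array([ 6, 10]) -- the numpy-1.4.1 result: each row of x lands on y exactly once
y + x[-1]             # array([4, 6])   -- the numpy-1.6.1 result: consistent with only the last row's update surviving
(y + x).sum(axis=0)   # array([ 6, 12]) -- the "[6, 12]" variant: y itself is counted once per broadcast row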
What makes actually more sense if you observe the symmetry regarding the quotient of the second column and the first column (the second column seen as a series, as it is to be collapes onto the same datum). It seems it comes down to giving the assigment a sensible meaning in case the first operand (i.e., the target) is partially zero-strided or contains less data elements that its view does. The second case might appear for stride tricks. It seems, it comes down to deciding in that case how to collapse the value of the second operand (the source of the assigment) with the values of the first operand (the target). And it seems from this consideration here, that the substitution from "a += b" to "a = a + b" is not generally valid if the substitute needs reshaping of the first operand "a" to match the shape of "b". This repetition is what makes the substitute generate a different result than the breakdown to element-wise operation in the substituted form. It seems, that in-place operations and parallelisation by array notions do not commute. To illustrate this noncummativity of in-place operations and parallelism, note that the implementation of "a += b" which yielded the 'expected result' (as in numpy-1.4.1) goes first by elements, and then into the in-place operation (the iteration happens on a higher code level than the operation). In the other case, which is done by numpy-1.6.1 in an equivalent way at least, the operations is executed first on array operands, and then an iterations is employed. Unfortunately, for the second approach, due to the nature of computers, an "execution of an operation on arrays" is abstract, and needs concrete implementation on the element level, s.t. the "a = a + b" substitute leads to an elementwise assigment, which is not accounted for by the algorithmic idea, and leads to the divergence between the two approaches' results. I like the equivalence criterion of "a += b" <=> "a = a + b", not caring for dtype downcasts. These downcasts are actually a Python idiosyncrasy and leads to the known paradox only when neglecting the character of Python assignment. The important notion of the substitutional equivalence of "a += b" <=> "a = a + b" seems to be to me, that the r.h.s. of the equivalence refers to additions, and explains the in-place operation by addition. Unfortunately, in the case of a zero-strided first operand, the assignment in between becomes ill-defined. But if then we then stand by "explaining in-place addition by pure addition" it would seem natural to me, to chain the operands resulting from the substitute "a + b" together by just that "+" operation. I don't know what symbol to put for that, maybe a "=+", how strange this might ever look. So the usual Python teaching of replacing "+=" by "=" and "+" seems to need a more precise definition of that "=" used in case of arrays as operands. For scalars, of course an ordinary assigment does very well. In numpy, I would hence state as a formula containing the preceding sentences: "a += b <=> a =+ a + b". It would be possible to implement this by filling the first operand by the unity of the respective operation (0 for addition and 1 for multiplication), carrying out the element-wise operation on broadcasted arrays, and feeding it back to the original first (left) operand by chaining using the resp. operation, which could in turn be done by scalar in-place operation using the unity data as a start. Funny is, e.g. 
the following hypothetical Python session (I turn the arrows upwards for this): ^^^ a = numpy.ones(()) ^^^ b = numpy.zeros((10, 1)) ^^^ a array(1.0) ^^^ b array([[ 0.], [ 0.], [ 0.], [ 0.], [ 0.], [ 0.], [ 0.], [ 0.], [ 0.], [ 0.]]) ^^^ u, v = numpy.broadcast_arrays(a, b) ^^^ u array([[ 1.], [ 1.], [ 1.], [ 1.], [ 1.], [ 1.], [ 1.], [ 1.], [ 1.], [ 1.]]) ^^^ v array([[ 0.], [ 0.], [ 0.], [ 0.], [ 0.], [ 0.], [ 0.], [ 0.], [ 0.], [ 0.]]) ^^^ u += v ^^^ u array([[10.], [10.], [10.], [10.], [10.], [10.], [10.], [10.], [10.], [10.]]) ^^^ v array([[ 0.], [ 0.], [ 0.], [ 0.], [ 0.], [ 0.], [ 0.], [ 0.], [ 0.], [ 0.]]) ^^^ a array(10.0) ^^^ b array([[ 0.], [ 0.], [ 0.], [ 0.], [ 0.], [ 0.], [ 0.], [ 0.], [ 0.], [ 0.]]) ^^^ Funnily too, the operations "a += b" (in my case), or "y += x" would be perfectly defined even if "a + b" or "y + x" (latter in Sebastian's case) require broadcasting of the first operand. Currently they yield the following result: (numpy-1.5.1, as above:) >>> a += b Traceback (most recent call last): File "", line 1, in ValueError: invalid return array shape (numpy-1.6.1, as above:) >>> y += x Traceback (most recent call last): File "", line 1, in ValueError: non-broadcastable output operand with shape (2) doesn't match the broadcast shape (3,2) The result for Sebstian would then be [6, 12] for the y variable as noted above already. Actually it's interesting that any arbitrarily added additional parallelisation by forcing the first operand to be broadcasted yields changes in it even if the second operand is the zero element w.r.t. the operation considered. But I think that's rather by design then. No-one was asking for invariance under arbitrary repetition, no? I don't know how this would work with multi-threading. Probably not very well as it would require locking on the then-shared target operand on the l.h.s. of, e.g., "a += b". It's not easy to tell apart the cases where all elements of "a" are independent on each other. But actually it's possible at all only in restriction of the dtype to an atomic data type. What's always the case if object arrays are considered as pointer arrays. But I really know too little on this multithreading optimisation as it's really not my point of interest and rather an additional idea which probably can be made consistent with the idea proposed here if some work is done on it. This was the post-scriptum, apparently. F. Am 31.08.2012 um 11:31 schrieb Sebastian Walter: > Hi, > > I'm using numpy 1.6.1 on Ubuntu 12.04.1 LTS. > A code that used to work with an older version of numpy now fails with an error. > > Were there any changes in the way inplace operations like +=, *=, etc. > work on arrays with non-standard strides? 
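As a side note (a sketch, not an existing NumPy helper): a crude way to recognise the situation underlying all of this -- an output operand whose view maps several elements onto the same memory -- is to look for a zero stride on an axis of length greater than one. Real overlap detection is harder than this, but it does catch the broadcast case from the example:

import numpy as np

def repeats_memory(a):
    # crude check: a zero stride on a non-degenerate axis means the view
    # revisits the same data, so in-place ufunc calls on it are not well defined
    return any(stride == 0 and length > 1
               for stride, length in zip(a.strides, a.shape))

x = np.arange(6).reshape(3, 2)
y = np.arange(2)
u, v = np.broadcast_arrays(x, y)

repeats_memory(v)   # True  -> "v += u" has no single well-defined result
repeats_memory(x)   # False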
> > For the script: > > ------- start of code ------- > > import numpy > > x = numpy.arange(6).reshape((3,2)) > y = numpy.arange(2) > > print 'x=\n', x > print 'y=\n', y > > u,v = numpy.broadcast_arrays(x, y) > > print 'u=\n', u > print 'v=\n', v > print 'v.strides=\n', v.strides > > v += u > > print 'v=\n', v # expectation: v = [[6,12], [6,12], [6,12]] > print 'u=\n', u > print 'y=\n', y # expectation: y = [6,12] > > ------- end of code ------- > > I get the output > > -------- start of output --------- > x= > [[0 1] > [2 3] > [4 5]] > y= > [0 1] > u= > [[0 1] > [2 3] > [4 5]] > v= > [[0 1] > [0 1] > [0 1]] > v.strides= > (0, 8) > v= > [[4 6] > [4 6] > [4 6]] > u= > [[0 1] > [2 3] > [4 5]] > y= > [4 6] > > -------- end of output -------- > > I would have expected that > > v += u > > performs an element-by-element += > > v[0,0] += u[0,0] # increments y[0] > v[0,1] += u[0,1] # increments y[1] > v[1,0] += u[1,0] # increments y[0] > v[1,1] += u[1,1] # increments y[1] > v[2,0] += u[2,0] # increments y[0] > v[2,1] += u[2,1] # increments y[1] > > yielding the result > > y = [6,12] > > but instead one obtains > > y = [4, 6] > > which could be the result of > > v[2,0] += u[2,0] # increments y[0] > v[2,1] += u[2,1] # increments y[1] > > > Is this the intended behavior? > > regards, > Sebastian > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From sebastian at sipsolutions.net Wed Sep 5 20:41:21 2012 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 06 Sep 2012 02:41:21 +0200 Subject: [Numpy-discussion] how is y += x computed when y.strides = (0, 8) and x.strides=(16, 8) ? In-Reply-To: References: Message-ID: <1346892081.1210.17.camel@sebastian-laptop> Hey, No idea if this is simply not support or just a bug, though I am guessing that such usage simply is not planned. However, this also has to do with buffering, so unless the behaviour is substantially changed, I would not expect even predictable results. I have used things like a[1:] += a[:-1] (cumsum is better here, and clear as to what it does), but this is different, as here the same data is read after being written. And here an example showing the buffer is involved: In [14]: np.setbufsize(1024) Out[14]: 1024 In [19]: x = numpy.arange(6*1000).reshape(-1,2) In [20]: y = numpy.arange(2) In [21]: u,v = numpy.broadcast_arrays(x, y) In [22]: v += u In [23]: v Out[23]: array([[21348, 21355], [21348, 21355], [21348, 21355], ..., [21348, 21355], [21348, 21355], [21348, 21355]]) In [24]: np.setbufsize(1000000) Out[24]: 1024 # note it gives old bufsize... In [25]: x = numpy.arange(6*1000).reshape(-1,2) In [26]: y = numpy.arange(2) In [27]: u,v = numpy.broadcast_arrays(x, y) In [28]: v += u In [29]: v Out[29]: array([[5998, 6000], [5998, 6000], [5998, 6000], ..., [5998, 6000], [5998, 6000], [5998, 6000]]) On Thu, 2012-09-06 at 01:55 +0200, Friedrich Romstedt wrote: > Poor Sebastian, you make the mistake of asking difficult questions. > > I noticed that it should be [6, 10] not [6, 12], and in fact is with numpy-1.4.1; while I observe the [4, 6] result with numpy-1.6.1. Logs follow: > > numpy-1.4.1 in Python-2.6.5 on Mac (intel 64bit) with Python + numpy built from sources dual-arch. The result in terms of output does not depend on the architecture chosen for run. The other is numpy-1.6.1 with Python-2.7.2. > > numpy-1.4.1 (64bit Python 2.6.5): > > Python 2.6.5 (r265:79063, Jul 18 2010, 12:14:53) > [GCC 4.2.1 (Apple Inc. 
build 5659)] on darwin > Type "help", "copyright", "credits" or "license" for more information. > >>> import numpy > >>> print numpy.__version__ > 1.4.1 > >>> import numpy > >>> > >>> x = numpy.arange(6).reshape((3,2)) > >>> y = numpy.arange(2) > >>> > >>> print 'x=\n', x > x= > [[0 1] > [2 3] > [4 5]] > >>> print 'y=\n', y > y= > [0 1] > >>> > >>> u,v = numpy.broadcast_arrays(x, y) > >>> > >>> print 'u=\n', u > u= > [[0 1] > [2 3] > [4 5]] > >>> print 'v=\n', v > v= > [[0 1] > [0 1] > [0 1]] > >>> print 'v.strides=\n', v.strides > v.strides= > (0, 8) > >>> > >>> v += u > >>> > >>> print 'v=\n', v # expectation: v = [[6,12], [6,12], [6,12]] > v= > [[ 6 10] > [ 6 10] > [ 6 10]] > >>> print 'u=\n', u > u= > [[0 1] > [2 3] > [4 5]] > >>> print 'y=\n', y # expectation: y = [6,12] > y= > [ 6 10] > > And numpy-1.6.1 (64bit Python-2.7.2): > > Python 2.7.2 (default, Mar 15 2012, 15:42:23) > [GCC 4.2.1 (Apple Inc. build 5664)] on darwin > Type "help", "copyright", "credits" or "license" for more information. > [fiddling with sys.maxint edited out] > >>> import numpy > >>> > >>> x = numpy.arange(6).reshape((3,2)) > >>> y = numpy.arange(2) > >>> > >>> print 'x=\n', x > x= > [[0 1] > [2 3] > [4 5]] > >>> print 'y=\n', y > y= > [0 1] > >>> > >>> u,v = numpy.broadcast_arrays(x, y) > >>> > >>> print 'u=\n', u > u= > [[0 1] > [2 3] > [4 5]] > >>> print 'v=\n', v > v= > [[0 1] > [0 1] > [0 1]] > >>> print 'v.strides=\n', v.strides > v.strides= > (0, 8) > >>> > >>> v += u > >>> > >>> print 'v=\n', v # expectation: v = [[6,12], [6,12], [6,12]] > v= > [[4 6] > [4 6] > [4 6]] > >>> print 'u=\n', u > u= > [[0 1] > [2 3] > [4 5]] > >>> print 'y=\n', y # expectation: y = [6,12] > y= > [4 6] > > Maybe this helps bisecting it. > > Friedrich. > > P.S.: I took the time scrolling through the tickets, with an empty set resulting (first three pages by date or so). This does not mean such a ticket does not exist. Also the docs are rather quiet about this (e.g. http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.__iadd__.html#numpy.ndarray.__iadd__, or http://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html#arithmetic-and-comparison-operations). > > My guess is that v is duplicated during the in-place addition (including strides), and the results of the calculation (using the original v) are written to that copy. After that, the copy's data would be [4, 6] with the strides as for v. So copying over to v's "original" data or letting it point to the copy and freeing the remaining copy would explain the behaviour after all. Maybe there's a sensible explanation for why it should be done that way. > > It is and remains a guess after all. > > I find the problem difficult to term after all. Maybe "In-place operation not in-place for zero-strided first operand?" I don't think it is easy to get a good notion of an "in-place operation" for a zero-strided first operand. It might be that your definition does just not match up. But I think in-place operations should be expected to be independent on the order of execution of the element-wise operations, even if these elements share data (as in v in this case, the first operand). This criterion is fulfilled by your expectation and numpy-1.4.1 but not by numpy-1.6.1. > > I noticed that it's not necessary to duplicate v's strides in the hypothesis noted above. The neglection of the other element-operations would then happen when copying the element-wise results over to v's storage. > > A remedy might be to let the back-copy respect the operation it is used for. 
So if the back-copying happens because an addition is to be in-place, it could use in-place addition instead of assigment. As a result, the element-wise operations on the "repeat()'ed" copy would be "chained together" by the operation under use. The non-commutative operations like subtraction or division would use addition or mulitiplication, resp., because the subtraction or division character can be thought of as an inversion of the second operand only before the whole in-place operations takes place, leading to a commutative in-place operation. Hence the "chained" operands in the back-copy operation form an associative and commutative expression after all. Nevertheless, a duplication of the first operands would occur in our case, since the same data from the first operand appears in several element's expressions in the copy buffer. > > The result would then be, in our example here, neither [6, 10] nor [4, 6], but [6, 12] instead. What makes actually more sense if you observe the symmetry regarding the quotient of the second column and the first column (the second column seen as a series, as it is to be collapes onto the same datum). It seems it comes down to giving the assigment a sensible meaning in case the first operand (i.e., the target) is partially zero-strided or contains less data elements that its view does. The second case might appear for stride tricks. > > It seems, it comes down to deciding in that case how to collapse the value of the second operand (the source of the assigment) with the values of the first operand (the target). And it seems from this consideration here, that the substitution from "a += b" to "a = a + b" is not generally valid if the substitute needs reshaping of the first operand "a" to match the shape of "b". This repetition is what makes the substitute generate a different result than the breakdown to element-wise operation in the substituted form. It seems, that in-place operations and parallelisation by array notions do not commute. > > To illustrate this noncummativity of in-place operations and parallelism, note that the implementation of "a += b" which yielded the 'expected result' (as in numpy-1.4.1) goes first by elements, and then into the in-place operation (the iteration happens on a higher code level than the operation). In the other case, which is done by numpy-1.6.1 in an equivalent way at least, the operations is executed first on array operands, and then an iterations is employed. Unfortunately, for the second approach, due to the nature of computers, an "execution of an operation on arrays" is abstract, and needs concrete implementation on the element level, s.t. the "a = a + b" substitute leads to an elementwise assigment, which is not accounted for by the algorithmic idea, and leads to the divergence between the two approaches' results. > > I like the equivalence criterion of "a += b" <=> "a = a + b", not caring for dtype downcasts. These downcasts are actually a Python idiosyncrasy and leads to the known paradox only when neglecting the character of Python assignment. The important notion of the substitutional equivalence of "a += b" <=> "a = a + b" seems to be to me, that the r.h.s. of the equivalence refers to additions, and explains the in-place operation by addition. Unfortunately, in the case of a zero-strided first operand, the assignment in between becomes ill-defined. 
But if then we then stand by "explaining in-place addition by pure addition" it would seem natural to me, to chain the operands resulting from the substitute "a + b" together by just that "+" operation. I don't know what symbol to put for that, maybe a "=+", how strange this might ever look. So the usual Python teaching of replacing "+=" by "=" and "+" seems to need a more precise definition of that "=" used in case of arrays as oper > ands. For scalars, of course an ordinary assigment does very well. In numpy, I would hence state as a formula containing the preceding sentences: "a += b <=> a =+ a + b". > > It would be possible to implement this by filling the first operand by the unity of the respective operation (0 for addition and 1 for multiplication), carrying out the element-wise operation on broadcasted arrays, and feeding it back to the original first (left) operand by chaining using the resp. operation, which could in turn be done by scalar in-place operation using the unity data as a start. > > Funny is, e.g. the following hypothetical Python session (I turn the arrows upwards for this): > > ^^^ a = numpy.ones(()) > ^^^ b = numpy.zeros((10, 1)) > ^^^ a > array(1.0) > ^^^ b > array([[ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.]]) > ^^^ u, v = numpy.broadcast_arrays(a, b) > ^^^ u > array([[ 1.], > [ 1.], > [ 1.], > [ 1.], > [ 1.], > [ 1.], > [ 1.], > [ 1.], > [ 1.], > [ 1.]]) > ^^^ v > array([[ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.]]) > ^^^ u += v > ^^^ u > array([[10.], > [10.], > [10.], > [10.], > [10.], > [10.], > [10.], > [10.], > [10.], > [10.]]) > ^^^ v > array([[ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.]]) > ^^^ a > array(10.0) > ^^^ b > array([[ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.], > [ 0.]]) > ^^^ > > Funnily too, the operations "a += b" (in my case), or "y += x" would be perfectly defined even if "a + b" or "y + x" (latter in Sebastian's case) require broadcasting of the first operand. Currently they yield the following result: > > (numpy-1.5.1, as above:) > >>> a += b > Traceback (most recent call last): > File "", line 1, in > ValueError: invalid return array shape > > (numpy-1.6.1, as above:) > >>> y += x > Traceback (most recent call last): > File "", line 1, in > ValueError: non-broadcastable output operand with shape (2) doesn't match the broadcast shape (3,2) > > The result for Sebstian would then be [6, 12] for the y variable as noted above already. > > Actually it's interesting that any arbitrarily added additional parallelisation by forcing the first operand to be broadcasted yields changes in it even if the second operand is the zero element w.r.t. the operation considered. But I think that's rather by design then. No-one was asking for invariance under arbitrary repetition, no? > > I don't know how this would work with multi-threading. Probably not very well as it would require locking on the then-shared target operand on the l.h.s. of, e.g., "a += b". It's not easy to tell apart the cases where all elements of "a" are independent on each other. But actually it's possible at all only in restriction of the dtype to an atomic data type. What's always the case if object arrays are considered as pointer arrays. 
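> At least the broadcast case from this thread is easy to spot, because the aliasing shows up directly in the strides. A crude check might look like the sketch below; it only catches the zero-stride case produced by broadcast_arrays, not general self-overlap built with stride tricks:
>
> import numpy
>
> def may_self_overlap(a):
>     # a view can hit the same memory location more than once if some
>     # axis of length > 1 has stride 0, as broadcast_arrays produces
>     return any(s == 0 and n > 1 for s, n in zip(a.strides, a.shape))
>
> x = numpy.arange(6).reshape((3, 2))
> y = numpy.arange(2)
> u, v = numpy.broadcast_arrays(x, y)
>
> assert not may_self_overlap(u)
> assert may_self_overlap(v)    # v repeats y three times along its first axis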
But I really know too little on this multithreading optimisation as it's really not my point of interest and rather an additional idea which probably can be made consistent with the idea proposed here if some work is done on it. > > This was the post-scriptum, apparently. F. > > > Am 31.08.2012 um 11:31 schrieb Sebastian Walter: > > > Hi, > > > > I'm using numpy 1.6.1 on Ubuntu 12.04.1 LTS. > > A code that used to work with an older version of numpy now fails with an error. > > > > Were there any changes in the way inplace operations like +=, *=, etc. > > work on arrays with non-standard strides? > > > > For the script: > > > > ------- start of code ------- > > > > import numpy > > > > x = numpy.arange(6).reshape((3,2)) > > y = numpy.arange(2) > > > > print 'x=\n', x > > print 'y=\n', y > > > > u,v = numpy.broadcast_arrays(x, y) > > > > print 'u=\n', u > > print 'v=\n', v > > print 'v.strides=\n', v.strides > > > > v += u > > > > print 'v=\n', v # expectation: v = [[6,12], [6,12], [6,12]] > > print 'u=\n', u > > print 'y=\n', y # expectation: y = [6,12] > > > > ------- end of code ------- > > > > I get the output > > > > -------- start of output --------- > > x= > > [[0 1] > > [2 3] > > [4 5]] > > y= > > [0 1] > > u= > > [[0 1] > > [2 3] > > [4 5]] > > v= > > [[0 1] > > [0 1] > > [0 1]] > > v.strides= > > (0, 8) > > v= > > [[4 6] > > [4 6] > > [4 6]] > > u= > > [[0 1] > > [2 3] > > [4 5]] > > y= > > [4 6] > > > > -------- end of output -------- > > > > I would have expected that > > > > v += u > > > > performs an element-by-element += > > > > v[0,0] += u[0,0] # increments y[0] > > v[0,1] += u[0,1] # increments y[1] > > v[1,0] += u[1,0] # increments y[0] > > v[1,1] += u[1,1] # increments y[1] > > v[2,0] += u[2,0] # increments y[0] > > v[2,1] += u[2,1] # increments y[1] > > > > yielding the result > > > > y = [6,12] > > > > but instead one obtains > > > > y = [4, 6] > > > > which could be the result of > > > > v[2,0] += u[2,0] # increments y[0] > > v[2,1] += u[2,1] # increments y[1] > > > > > > Is this the intended behavior? > > > > regards, > > Sebastian > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From njs at pobox.com Thu Sep 6 08:58:33 2012 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 6 Sep 2012 13:58:33 +0100 Subject: [Numpy-discussion] how is y += x computed when y.strides = (0, 8) and x.strides=(16, 8) ? In-Reply-To: <1346892081.1210.17.camel@sebastian-laptop> References: <1346892081.1210.17.camel@sebastian-laptop> Message-ID: On Thu, Sep 6, 2012 at 1:41 AM, Sebastian Berg wrote: > Hey, > > No idea if this is simply not support or just a bug, though I am > guessing that such usage simply is not planned. I think that's right... currently numpy simply makes no guarantees about what order ufunc loops will be performed in, or even if they will be performed in any strictly sequential order. In ordinary cases this lets it make various optimizations, but it means that you can't count on any specific behaviour for the unusual case where different locations in the output array are stored in overlapping memory. Fixing this would require two things: (a) Some code to detect when an array may have internal overlaps (sort of like np.may_share_memory for axes). Not entirely trivial. 
(b) A "fallback mode" for ufuncs where if the code in (a) detects that we are (probably) dealing with one of these arrays, it processes the operations in some predictable order without buffering. I suppose if someone wanted to come up with these two pieces, and it didn't look like it would cause slowdowns in common cases, the code in (b) avoided creating duplicate code paths that increased maintenance burden, etc., then probably no-one would object to making these arrays act in a better defined way? I don't think most people are that worried about this though. Your original code would be much clearer if it just used np.sum... -n From stefan at sun.ac.za Thu Sep 6 09:20:40 2012 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 6 Sep 2012 06:20:40 -0700 Subject: [Numpy-discussion] FWIW: "regressions" of dependees of nukmpy 1.7.0b1 In-Reply-To: <20120905203815.GA5866@onerussian.com> References: <20120905203815.GA5866@onerussian.com> Message-ID: On Wed, Sep 5, 2012 at 1:38 PM, Yaroslav Halchenko wrote: > skimage_0.6.1-1.dsc ok FAILED This breakage is due to https://github.com/numpy/numpy/issues/392 Thanks for checking! St?fan From aron at ahmadia.net Thu Sep 6 09:44:03 2012 From: aron at ahmadia.net (Aron Ahmadia) Date: Thu, 6 Sep 2012 14:44:03 +0100 Subject: [Numpy-discussion] FWIW: "regressions" of dependees of numpy 1.7.0b1 In-Reply-To: <20120905211432.GQ5871@onerussian.com> References: <20120905203815.GA5866@onerussian.com> <20120905211432.GQ5871@onerussian.com> Message-ID: Are you running the valgrind test with the Python suppression file: http://svn.python.org/projects/python/trunk/Misc/valgrind-python.supp ? Cheers, A On Wed, Sep 5, 2012 at 10:14 PM, Yaroslav Halchenko wrote: > and another, quite weird one -- initially it was crashing with the same > error on > > np.dot(Vh.T, U.T) > > but while adding print statements to troubleshoot it, started to fail on > print: > > File "/home/yoh/proj/pymvpa/pymvpa/mvpa2/mappers/procrustean.py", line > 164, in _train > print "Vh:", Vh > File > "/home/yoh/python-env/numpy/local/lib/python2.7/site-packages/numpy/core/numeric.py", > line 1471, in array_str > return array2string(a, max_line_width, precision, suppress_small, ' ', > "", str) > File > "/home/yoh/python-env/numpy/local/lib/python2.7/site-packages/numpy/core/arrayprint.py", > line 440, in array2string > elif reduce(product, a.shape) == 0: > TypeError: object of type 'float' has no len() > > here is part of pdb session: > > Vh: > > /home/yoh/python-env/numpy/local/lib/python2.7/site-packages/numpy/core/arrayprint.py(440)array2string() > -> elif reduce(product, a.shape) == 0: > (Pdb) up > > > /home/yoh/python-env/numpy/local/lib/python2.7/site-packages/numpy/core/numeric.py(1471)array_str() > -> return array2string(a, max_line_width, precision, suppress_small, ' ', > "", str) > (Pdb) print a > [[-0.99818262 0.06026149] > [ 0.06026149 0.99818262]] > *(Pdb) print a.__class__ > > (Pdb) down > > > /home/yoh/python-env/numpy/local/lib/python2.7/site-packages/numpy/core/arrayprint.py(440)array2string() > -> elif reduce(product, a.shape) == 0: > (Pdb) print reduce(product, a.shape) > 4 > (Pdb) c > ERROR > > it might be that this valgrind msg would be relevant ;) : > > ==10281== Invalid read of size 4 > ==10281== at 0x88C6973: _descriptor_from_pep3118_format (buffer.c:791) > ==10281== by 0x88C6B0E: _array_from_buffer_3118 (ctors.c:1193) > ==10281== by 0x88E7ABB: PyArray_GetArrayParamsFromObject (ctors.c:1378) > ==10281== by 0x88E7F98: PyArray_FromAny (ctors.c:1580) > 
==10281== by 0x88EE895: PyArray_CheckFromAny (ctors.c:1758) > ==10281== by 0x88EF7E2: _array_fromobject (multiarraymodule.c:1644) > ==10281== by 0x4F148D: PyEval_EvalFrameEx (in > /home/yoh/python-env/numpy/bin/python) > ==10281== by 0x4F1DAF: PyEval_EvalCodeEx (in > /home/yoh/python-env/numpy/bin/python) > ==10281== by 0x4EAFD7: PyEval_EvalFrameEx (in > /home/yoh/python-env/numpy/bin/python) > ==10281== by 0x4F1DAF: PyEval_EvalCodeEx (in > /home/yoh/python-env/numpy/bin/python) > ==10281== by 0x4EAFD7: PyEval_EvalFrameEx (in > /home/yoh/python-env/numpy/bin/python) > ==10281== by 0x4EB221: PyEval_EvalFrameEx (in > /home/yoh/python-env/numpy/bin/python) > ==10281== Address 0x75c3a04 is 4 bytes inside a block of size 6 alloc'd > ==10281== at 0x4C28BED: malloc (in > /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) > ==10281== by 0x88C6911: _descriptor_from_pep3118_format (buffer.c:776) > ==10281== by 0x88C6B0E: _array_from_buffer_3118 (ctors.c:1193) > ==10281== by 0x88E7ABB: PyArray_GetArrayParamsFromObject (ctors.c:1378) > ==10281== by 0x88E7F98: PyArray_FromAny (ctors.c:1580) > ==10281== by 0x88EE895: PyArray_CheckFromAny (ctors.c:1758) > ==10281== by 0x88EF7E2: _array_fromobject (multiarraymodule.c:1644) > ==10281== by 0x4F148D: PyEval_EvalFrameEx (in > /home/yoh/python-env/numpy/bin/python) > ==10281== by 0x4F1DAF: PyEval_EvalCodeEx (in > /home/yoh/python-env/numpy/bin/python) > ==10281== by 0x4EAFD7: PyEval_EvalFrameEx (in > /home/yoh/python-env/numpy/bin/python) > ==10281== by 0x4F1DAF: PyEval_EvalCodeEx (in > /home/yoh/python-env/numpy/bin/python) > ==10281== by 0x4EAFD7: PyEval_EvalFrameEx (in > /home/yoh/python-env/numpy/bin/python) > ==10281== > ==10281== Invalid read of size 4 > ==10281== at 0x88C6973: _descriptor_from_pep3118_format (buffer.c:791) > ==10281== by 0x88E0BAB: PyArray_DTypeFromObjectHelper (common.c:287) > ==10281== by 0x88E1012: PyArray_DTypeFromObject.constprop.277 > (common.c:111) > ==10281== by 0x88E7C74: PyArray_GetArrayParamsFromObject (ctors.c:1453) > ==10281== by 0x88E7F98: PyArray_FromAny (ctors.c:1580) > ==10281== by 0x88EE895: PyArray_CheckFromAny (ctors.c:1758) > ==10281== by 0x88EF7E2: _array_fromobject (multiarraymodule.c:1644) > ==10281== by 0x4F148D: PyEval_EvalFrameEx (in > /home/yoh/python-env/numpy/bin/python) > ==10281== by 0x4F1DAF: PyEval_EvalCodeEx (in > /home/yoh/python-env/numpy/bin/python) > ==10281== by 0x4EAFD7: PyEval_EvalFrameEx (in > /home/yoh/python-env/numpy/bin/python) > ==10281== by 0x4F1DAF: PyEval_EvalCodeEx (in > /home/yoh/python-env/numpy/bin/python) > ==10281== by 0x4EAFD7: PyEval_EvalFrameEx (in > /home/yoh/python-env/numpy/bin/python) > ==10281== Address 0x7852e94 is 4 bytes inside a block of size 6 alloc'd > ==10281== at 0x4C28BED: malloc (in > /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) > ==10281== by 0x88C6911: _descriptor_from_pep3118_format (buffer.c:776) > ==10281== by 0x88E0BAB: PyArray_DTypeFromObjectHelper (common.c:287) > ==10281== by 0x88E1012: PyArray_DTypeFromObject.constprop.277 > (common.c:111) > ==10281== by 0x88E7C74: PyArray_GetArrayParamsFromObject (ctors.c:1453) > ==10281== by 0x88E7F98: PyArray_FromAny (ctors.c:1580) > ==10281== by 0x88EE895: PyArray_CheckFromAny (ctors.c:1758) > ==10281== by 0x88EF7E2: _array_fromobject (multiarraymodule.c:1644) > ==10281== by 0x4F148D: PyEval_EvalFrameEx (in > /home/yoh/python-env/numpy/bin/python) > ==10281== by 0x4F1DAF: PyEval_EvalCodeEx (in > /home/yoh/python-env/numpy/bin/python) > ==10281== by 0x4EAFD7: PyEval_EvalFrameEx (in > 
/home/yoh/python-env/numpy/bin/python) > ==10281== by 0x4F1DAF: PyEval_EvalCodeEx (in > /home/yoh/python-env/numpy/bin/python) > > > > > On Wed, 05 Sep 2012, Yaroslav Halchenko wrote: > > > Recently Sandro uploaded 1.7.0b1 into Debian experimental so I decided > to see > > if this bleeding edge version doesn't break some of its dependees... > Below is > > a copy of > > > http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid.summary > > first FAILED/ok column is when building against sid numpy version > 1.6.2-1 and > > the second one is against 1.7.0~b1. I think some 'ok -> FAILED' might > be > > indicative of regressions (myself looking into two new funny failures in > > pymvpa2's master). Some FAILED->FAILED could be ignored (e.g. I > forgotten to > > provide /dev/shm so multiprocessing was failing)... Enjoy > > > Testing builds against python-numpy_1.7.0~b1-1.dsc > > aster_10.6.0-1-4.dsc FAILED FAILED > aster_10.6.0-1-4_amd64.build > > avogadro_1.0.3-5.dsc FAILED ok > > babel_1.4.0.dfsg-8.dsc ok ok > > basemap_1.0.3+dfsg-2.dsc ok ok > > biosig4c++_1.3.0-2.dsc ok ok > > brian_1.3.1-1.dsc ok ok > > cfflib_2.0.5-1.dsc ok ok > > cmor_2.8.0-2.dsc ok ok > > connectomeviewer_2.1.0-1.dsc ok ok > > cython_0.15.1-2.dsc ok FAILED > http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/cython_0.15.1-2_amd64.build > > dballe_5.18-1.dsc ok ok > > dipy_0.5.0-3.dsc ok ok > > dolfin_1.0.0-7.dsc FAILED ok > > flann_1.7.1-4.dsc ok ok > > fonttools_2.3-1.dsc ok ok > > gamera_3.3.3-2.dsc ok ok > > gdal_1.9.0-3.dsc ok ok > > getfem++_4.1.1-10.dsc FAILED ok > > gnudatalanguage_0.9.2-4.dsc ok ok > > gnuradio_3.6.1-1.dsc FAILED ok > > guiqwt_2.1.6-4.dsc FAILED ok > > h5py_2.0.1-2.dsc ok ok > > joblib_0.6.4-3.dsc ok ok > > lazyarray_0.1.0-1.dsc ok ok > > libfreenect_0.1.2+dfsg-6.dsc ok ok > > libgetdata_0.7.3-6.dsc ok ok > > libmpikmeans_1.5-1.dsc ok ok > > libvigraimpex_1.7.1+dfsg1-3.dsc ok FAILED > http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/libvigraimpex_1.7.1+dfsg1-3_amd64.build > > lintian_2.5.10.1.dsc FAILED ok > > magics++_2.14.11-4.dsc ok ok > > mathgl_1.11.2-14.dsc FAILED ok > > matplotlib_1.1.1~rc2-1.dsc FAILED ok > > mayavi2_4.1.0-1.dsc FAILED ok > > mdp_3.2+git78-g7db3c50-3.dsc ok ok > > mgltools-bhtree_1.5.6~rc3~cvs.20120206-1.dsc ok ok > > mgltools-dejavu_1.5.6~rc3~cvs.20120206-1.dsc ok ok > > mgltools-geomutils_1.5.6~rc3~cvs.20120601-1.dsc ok ok > > mgltools-gle_1.5.6~rc3~cvs.20120601-1.dsc ok ok > > mgltools-molkit_1.5.6~rc3~cvs.20120206-1.dsc ok ok > > mgltools-opengltk_1.5.6~rc3~cvs.20120601-1.dsc ok ok > > mgltools-pyglf_1.5.6~rc3~cvs.20120601-1.dsc ok ok > > mgltools-sff_1.5.6~rc3~cvs.20120601-1.dsc ok ok > > mgltools-utpackages_1.5.6~rc3~cvs.20120601-1.dsc ok ok > > mgltools-vision_1.5.6~rc3~cvs.20120601-1.dsc ok ok > > mgltools-visionlibraries_1.5.6~rc3~cvs.20120601-1.dsc ok ok > > mlpy_2.2.0~dfsg1-2.dsc ok ok > > mmass_5.2.0-2.dsc ok ok > > model-builder_0.4.1-6.dsc ok ok > > mpi4py_1.3+hg20120611-1.dsc ok ok > > mypaint_1.0.0-1.dsc ok ok > > necpp_1.5.0+cvs20101003-2.1.dsc ok ok > > neo_0.2.0-1.dsc ok ok > > nexus_4.2.1-svn1614-1.dsc FAILED ok > > nibabel_1.2.2-1.dsc ok ok > > nipy_0.2.0-1.dsc ok FAILED > http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/nipy_0.2.0-1_amd64.build > > nitime_0.4-2.dsc ok ok > > nlopt_2.2.4+dfsg-2.dsc ok ok > > numexpr_2.0.1-3.dsc 
FAILED FAILED > http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/numexpr_2.0.1-3_amd64.build > > numm_0.4-1.dsc FAILED ok > > opencv_2.3.1-11.dsc ok ok > > openmeeg_2.0.0.dfsg-5.dsc FAILED ok > > openopt_0.38+svn1589-1.dsc ok ok > > pandas_0.8.1-1.dsc ok ok > > pdb2pqr_1.8-1.dsc ok ok > > pebl_1.0.2-2.dsc ok ok > > plplot_5.9.9-5.dsc FAILED ok > > psignifit3_3.0~beta.20120611.1-1.dsc ok ok > > pycuda_2012.1-1.dsc ok ok > > pydicom_0.9.6-1.dsc ok ok > > pyentropy_0.4.1-1.dsc ok ok > > pyepr_0.6.1-2.dsc ok ok > > pyevolve_0.6~rc1+svn398+dfsg-2.dsc ok ok > > pyfai_0.3.5-1.dsc ok ok > > pyfits_3.0.8-2.dsc ok ok > > pyformex_0.8.6-4.dsc ok ok > > pygame_1.9.1release+dfsg-6.dsc FAILED ok > > pygrib_1.9.3-1.dsc ok ok > > pygtk_2.24.0-3.dsc ok ok > > pylibtiff_0.3.0~svn78-3.dsc ok FAILED > http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/pylibtiff_0.3.0~svn78-3_amd64.build > > pymca_4.6.0-2.dsc ok ok > > pymol_1.5.0.1-2.dsc ok ok > > pymvpa_0.4.8-1.dsc ok FAILED > http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/pymvpa_0.4.8-1_amd64.build > > pymvpa2_2.1.0-1.dsc ok FAILED > http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/pymvpa2_2.1.0-1_amd64.build > > pynifti_0.20100607.1-4.dsc ok ok > > pynn_0.7.4-1.dsc ok ok > > pyopencl_2012.1-1.dsc ok ok > > pyqwt3d_0.1.7~cvs20090625-9.dsc FAILED ok > > pyqwt5_5.2.1~cvs20091107+dfsg-6.dsc FAILED ok > > pysparse_1.1-1.dsc ok ok > > pysurfer_0.3+git15-gae6cbb1-1.1.dsc ok ok > > pytables_2.3.1-3.dsc FAILED FAILED > http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/pytables_2.3.1-3_amd64.build > > pytango_7.2.3-2.dsc ok ok > > python-ase_3.6.0.2515-1.dsc ok FAILED > http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/python-ase_3.6.0.2515-1_amd64.build > > python-biggles_1.6.6-1.dsc ok ok > > python-biom-format_1.0.0-1.dsc ok ok > > python-biopython_1.59-1.dsc ok ok > > python-chaco_4.1.0-1.dsc ok ok > > python-cogent_1.5.1-2.dsc ok ok > > python-cpl_0.3.6-1.dsc ok ok > > python-csa_0.1.0-1.1.dsc ok ok > > python-enable_4.1.0-1.dsc ok ok > > python-fabio_0.0.8-1.dsc ok ok > > python-fftw_0.2.2-1.dsc ok ok > > python-gnuplot_1.8-1.1.dsc ok ok > > python-networkx_1.7~rc1-3.dsc ok ok > > python-neuroshare_0.8.5-1.dsc ok ok > > python-pywcs_1.11-1.dsc ok ok > > python-scientific_2.8-3.dsc ok ok > > python-scipy_0.10.1+dfsg1-4.dsc ok ok > > python-shapely_1.2.14-1.dsc ok ok > > python-visual_5.12-1.4.dsc ok ok > > pytools_2011.5-2.dsc ok ok > > pywavelets_0.2.0-5.dsc ok ok > > pyzmq_2.2.0-1.dsc ok ok > > qiime_1.5.0-2.dsc ok ok > > rdkit_201203-3.dsc ok ok > > rpy_1.0.3-22.dsc ok ok > > rpy2_2.2.6-1.dsc ok ok > > scikit-learn_0.11.0-2.dsc ok FAILED > http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/scikit-learn_0.11.0-2_amd64.build > > shogun_1.1.0-6.dsc FAILED ok > > skimage_0.6.1-1.dsc ok FAILED > http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/skimage_0.6.1-1_amd64.build > > spherepack_3.2-4.dsc ok ok > > statsmodels_0.4.2-1.dsc ok FAILED > http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/statsmodels_0.4.2-1_amd64.build > > stimfit_0.10.18-1.1.dsc ok ok > > syfi_1.0.0.dfsg-1.dsc ok ok > > taurus_3.0.0-1.dsc FAILED ok > > tifffile_20120421-1.dsc ok ok > > 
uncertainties_1.8-1.dsc ok ok > > veusz_1.15-1.dsc FAILED ok > > vistrails_2.0.alpha~1-3.dsc ok ok > > wrapitk-python_3.20.1.5.dsc FAILED FAILED > http://www.onerussian.com/Linux/deb/logs/python-numpy_1.7.0~b1-1_amd64.testrdepends.debian-sid/wrapitk-python_3.20.1.5_amd64.build > > wsjt_5.9.7.r383-1.6.dsc ok ok > > yade_0.80.1-2.dsc FAILED ok > > yp-svipc_0.14-2.dsc ok ok > -- > Yaroslav O. Halchenko > Postdoctoral Fellow, Department of Psychological and Brain Sciences > Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 > Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 > WWW: http://www.linkedin.com/in/yarik > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nouiz at nouiz.org Thu Sep 6 10:07:46 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Thu, 6 Sep 2012 10:07:46 -0400 Subject: [Numpy-discussion] Numpy 1.7b1 API change cause big trouble In-Reply-To: References: Message-ID: Hi, I reply with more information probably later today or tomorrow, but I think i need to finish everything to give you the exact information. Part of the problem I had was that by default there is a warning that is generated. It tell that to remove this warning we need to set NPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION. I didn't saw it before my first email, but some code I didn't wrote was passing the -Werror flag to g++, so this created errors for me. At first, I added this macro, but this caused error because we used the old C-API. I'm gatering all the correct information/comments and reply later with them. Fred On Wed, Sep 5, 2012 at 6:03 PM, Ond?ej ?ert?k wrote: > Hi Fred, > > On Wed, Sep 5, 2012 at 10:56 AM, Nathaniel Smith wrote: >> On Wed, Sep 5, 2012 at 6:36 PM, Fr?d?ric Bastien wrote: >>> Hi, >>> >>> I spent up to now 2 or 3 days making change to Theano to support numpy >>> 1.7b1. But now, I just find an interface change that will need >>> recoding a function, not just small code change. >> >> My understanding was that 1.7 is not supposed to require any code >> changes... so, separate from your actual question about assigning to >> the data field can I ask: are the changes you're talking about just to >> avoid *deprecated* APIs, or did you have actual problems running >> Theano against 1.7b1? And if you had actual problems, could you say >> what? (Or just post a diff of the changes you found you had to make, >> which should amount to the same thing?) > > > Thank you for trying the beta version, that was the purpose to put it out there > and see if it breaks things. > > As others said, if you can give us more details, that'd be great. Let's get it > fixed before rc1. 
> > Ondrej > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From lists at onerussian.com Thu Sep 6 11:01:28 2012 From: lists at onerussian.com (Yaroslav Halchenko) Date: Thu, 6 Sep 2012 11:01:28 -0400 Subject: [Numpy-discussion] FWIW: "regressions" of dependees of numpy 1.7.0b1 In-Reply-To: References: <20120905203815.GA5866@onerussian.com> <20120905211432.GQ5871@onerussian.com> Message-ID: <20120906150128.GC5866@onerussian.com> On Thu, 06 Sep 2012, Aron Ahmadia wrote: > Are you running the valgrind test with the Python suppression > file:?[1]http://svn.python.org/projects/python/trunk/Misc/valgrind-python.supp yes -- on Debian there is /usr/lib/valgrind/python.supp which comes with python package and I believe enabled by default, and it is identical to above (just dynamic library versions different) but it still produces lots of false positives -- IIRC it needs additional tune ups per architecture etc... I just ignored those messages "manually" and listed the relevant one which comes from numpy functionality. -- Yaroslav O. Halchenko Postdoctoral Fellow, Department of Psychological and Brain Sciences Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From charlesr.harris at gmail.com Thu Sep 6 11:32:34 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 6 Sep 2012 11:32:34 -0400 Subject: [Numpy-discussion] Numpy 1.7b1 API change cause big trouble In-Reply-To: References: Message-ID: On Thu, Sep 6, 2012 at 10:07 AM, Fr?d?ric Bastien wrote: > Hi, > > I reply with more information probably later today or tomorrow, but I > think i need to finish everything to give you the exact information. > > Part of the problem I had was that by default there is a warning that > is generated. It tell that to remove this warning we need to set > NPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION. You don't want to define this macro if you need to directly access the fields. What warning are you getting if you don't define it? Are you using Cython? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu Sep 6 11:36:38 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 6 Sep 2012 16:36:38 +0100 Subject: [Numpy-discussion] Numpy 1.7b1 API change cause big trouble In-Reply-To: References: Message-ID: Hi, On Wed, Sep 5, 2012 at 7:05 PM, David Cournapeau wrote: > Hi Frederic, > > On Wed, Sep 5, 2012 at 6:36 PM, Fr?d?ric Bastien wrote: >> Hi, >> >> I spent up to now 2 or 3 days making change to Theano to support numpy >> 1.7b1. But now, I just find an interface change that will need >> recoding a function, not just small code change. >> >> The problem is that we can't access fields from PyArrayObject anymore, >> we absolutely must use the old macro/newly function. > > Why can't you adress the PyArrayObject anymore ? It is deprecated, but > the structure itself has not changed. It would certainly be a > significant issue if that is not possible anymore, as it would be a > significant API break. > >> >> For the data field, the new function don't allow to set it. There is >> no function that allow to do this. After so much time spent on small >> syntactic change, I don't feel making more complex change today. 
>> >> Also, I think there should be a function PyArray_SetDataPtr as similar >> to PyArray_SetBaseObject. >> >> Do you plan to add one? I though that you wanted to force the removing >> of the old API, but I never hear you wanted to disable this. > > It was a design mistake to leak this in the first place, so the end > goal (not for 1.7), is certainly to 'forbid' access. It is necessary > to move numpy forward and keep ABI compatibility later on. Is this still the goal? Is there a still a role for a simple numpy array structure for maximum speed of access? Best, Matthew From lists at onerussian.com Thu Sep 6 12:24:40 2012 From: lists at onerussian.com (Yaroslav Halchenko) Date: Thu, 6 Sep 2012 12:24:40 -0400 Subject: [Numpy-discussion] FWIW: "regressions" of dependees of numpy 1.7.0b1 In-Reply-To: <20120906150128.GC5866@onerussian.com> References: <20120905203815.GA5866@onerussian.com> <20120905211432.GQ5871@onerussian.com> <20120906150128.GC5866@onerussian.com> Message-ID: <20120906162440.GD5866@onerussian.com> preamble: the bug here seems to be due to incorrect np.asarray(ctypes.cdouble array) ok -- I tried with a debug build of python and -O0 build of numpy, and the same old valgrind... this time valgrind is silent BUT then python itself says test_simple (mvpa2.tests.test_procrust.ProcrusteanMapperTests) ... XXX undetected error ERROR and nose failure (now I see it) has nothing actually to do with the operation on the object but just get reported to the first line after the function call where I guess a problematic object was created... ====================================================================== ERROR: test_simple (mvpa2.tests.test_procrust.ProcrusteanMapperTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/yoh/proj/pymvpa/pymvpa/mvpa2/testing/sweepargs.py", line 67, in do_sweep method(*args_, **kwargs_) File "/home/yoh/proj/pymvpa/pymvpa/mvpa2/testing/sweepargs.py", line 67, in do_sweep method(*args_, **kwargs_) File "/home/yoh/proj/pymvpa/pymvpa/mvpa2/testing/tools.py", line 179, in newfunc return func(*arg, **kwargs) File "/home/yoh/proj/pymvpa/pymvpa/mvpa2/tests/test_procrust.py", line 67, in test_simple pm.train(ds) File "/home/yoh/proj/pymvpa/pymvpa/mvpa2/base/learner.py", line 119, in train result = self._train(ds) File "/home/yoh/proj/pymvpa/pymvpa/mvpa2/mappers/procrustean.py", line 161, in _train print "------------" TypeError: object of type 'float' has no len() that function is fancy only in that it uses ctypes to call a function from the cdll.LoadLibrary('liblapack.so'): https://github.com/PyMVPA/PyMVPA/blob/HEAD/mvpa2/support/lapack_svd.py but even if I comment out that call to lapacklib. it still screws up the same way -- so it has to do with those variable definitions before then I guess... ok -- it boils down to numpy.asarray(s) in return statement... I improved it with a printout now where I assigned constructed array to a variable first NB if I swap print lines, it would lead me to crash above, but with this order -- it manages to continue without crashing BUT showing incorrect values. s_arr = numpy.asarray(s) print "s_arr", s_arr print "s:", s return vt, s_arr, u so it gives me $> MVPA_SEED=1928295852 `which nosetests` -s -v mvpa2/tests/test_procrust.py T: MVPA_SEED=1928295852 test_simple (mvpa2.tests.test_procrust.ProcrusteanMapperTests) ... 
VERSION 1.7.0rc1.dev-ea23de8 s_arr [[ 6.90689888e-310 1.83759219e-316] [ -3.16388621e+134 -3.16388621e+134]] s: FAIL ====================================================================== FAIL: test_simple (mvpa2.tests.test_procrust.ProcrusteanMapperTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/yoh/proj/pymvpa/pymvpa/mvpa2/testing/tools.py", line 179, in newfunc return func(*arg, **kwargs) File "/home/yoh/proj/pymvpa/pymvpa/mvpa2/tests/test_procrust.py", line 93, in test_simple "perfectly. Now got d scale*R=%g" % dsR) AssertionError: Single scenario lead to failures of unittest test_simple: on oblique=False : Single scenario lead to failures of unittest test_simple: on svd=dgesvd : We should have got reconstructed rotation+scaling perfectly. Now got d scale*R=1.34232e+134 ---------------------------------------------------------------------- Ran 1 test in 0.222s FAILED (failures=1) [161643 refs] when with 1.6.2 (using the same seed, so if numpy's RNG didn't change -- data should be the same): $> MVPA_SEED=1928295852 nosetests -s -v mvpa2/tests/test_procrust.py T: MVPA_SEED=1928295852 test_simple (mvpa2.tests.test_procrust.ProcrusteanMapperTests) ... VERSION 1.6.2 s_arr [0.775771814652 0.224228185348] s: s_arr [0.775771814652 0.224228185348] s: s_arr [0.365993114976 0.101324191354 0.0959100894799 0.0861936270658 0.0712098129231 0.0694159685405 0.0636016740376 0.056058975141 0.0481853714064 0.042107175076] s: s_arr [0.365993114976 0.101324191354 0.0959100894799 0.0861936270658 0.0712098129231 0.0694159685405 0.0636016740376 0.056058975141 0.0481853714064 0.042107175076] s: s_arr [0.775771814652 0.224228185348 0.0] s: s_arr [0.775771814652 0.224228185348 0.0] s: s_arr [0.703907195999 0.200251541932 0.0] s: s_arr [0.703907195999 0.200251541932 0.0] s: ok ---------------------------------------------------------------------- Ran 1 test in 7.031s OK MVPA_SEED=1928295852 nosetests -s -v mvpa2/tests/test_procrust.py 7,14s user 0,79s system 96% cpu 8,219 total which immediately shows that np.asarray created a 2d array whenever it should have been a 1d (original definition of s is s=(c_double*min(x,y))()) ok -- added printing of dtype of that array -- with 1.6.2 it is reported as object while with 1.7.0b1 -- float64... from here I pass it onto experts! ;) On Thu, 06 Sep 2012, Yaroslav Halchenko wrote: > On Thu, 06 Sep 2012, Aron Ahmadia wrote: > > Are you running the valgrind test with the Python suppression > > file:?[1]http://svn.python.org/projects/python/trunk/Misc/valgrind-python.supp > yes -- on Debian there is /usr/lib/valgrind/python.supp which comes > with python package and I believe enabled by default, and it is > identical to above (just dynamic library versions different) but it > still produces lots of false positives -- IIRC it needs additional tune > ups per architecture etc... I just ignored those messages > "manually" and listed the relevant one which comes from numpy > functionality. -- Yaroslav O. 
Halchenko Postdoctoral Fellow, Department of Psychological and Brain Sciences Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From fperez.net at gmail.com Fri Sep 7 04:38:06 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Fri, 7 Sep 2012 01:38:06 -0700 Subject: [Numpy-discussion] John Hunter's memorial service Message-ID: Hi all, I have just received the following information from John's family regarding the memorial service: John's memorial service will be held on Monday, October 1, 2012, at 11.a.m. at Rockefeller Chapel at the University of Chicago. The exact address is 5850 S. Woodlawn Ave, Chicago, IL 60615. The service is open to the public. The service will be fully planned and scripted with no room for people to eulogize, however, we will have a reception after the service, hosted by Tradelink, where people can talk. Regards, f From ben.root at ou.edu Fri Sep 7 09:36:40 2012 From: ben.root at ou.edu (Benjamin Root) Date: Fri, 7 Sep 2012 09:36:40 -0400 Subject: [Numpy-discussion] numpy.ma.MaskedArray.min() makes a copy? Message-ID: An issue just reported on the matplotlib-users list involved a user who ran out of memory while attempting to do an imshow() on a large array. While this wouldn't be totally unexpected, the user's traceback shows that they ran out of memory before any actual building of the image occurred. Memory usage sky-rocketed when imshow() attempted to determine the min and max of the image. The input data was a masked array, and it appears that the implementation of min() for masked arrays goes something like this (paraphrasing here): obj.filled(inf).min() The idea is that any masked element is set to the largest possible value for their dtype in a copied array of itself, and then a min() is performed on that copied array. I am assuming that max() does the same thing. Can this be done differently/more efficiently? If the "filled" approach has to be done, maybe it would be a good idea to make the copy in chunks instead of all at once? Ideally, it would be nice to avoid the copying altogether and utilize some of the special iterators that Mark Weibe created last year. Cheers! Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Fri Sep 7 10:54:26 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 7 Sep 2012 16:54:26 +0200 Subject: [Numpy-discussion] Should abs([nan]) be supported? In-Reply-To: <2F3BC0D2-E111-478C-B733-F414BFFAA76A@continuum.io> References: <2F3BC0D2-E111-478C-B733-F414BFFAA76A@continuum.io> Message-ID: On Wed, Sep 5, 2012 at 7:06 AM, Travis Oliphant wrote: > The framework for catching errors relies on hardware flags getting set and > our C code making the right calls to detect those flags. > > This has usually worked correctly in the past --- but it is an area where > changes in compilers or platforms could create problems. > I don't think it ever did, for less common platforms at least. See all the Debian test issues that were filed by Sandro this week. And even between Windows and Linux, there are some inconsistencies. > > We should test to be sure that the correct warnings are issued, I would > think. Perhaps using a catch_warnings context would be helpful (from > http://docs.python.org/library/warnings.html) > There are some tests for that already, in core/test_numeric.py. 
For example: ====================================================================== FAIL: test_default (test_numeric.TestSeterr) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Users/rgommers/Code/numpy/numpy/core/tests/test_numeric.py", line 231, in test_default under='ignore', AssertionError: {'over': 'ignore', 'divide': 'ignore', 'invalid': 'ignore', 'under': 'ignore'} != {'over': 'warn', 'divide': 'warn', 'invalid': 'warn', 'under': 'ignore'} ---------------------------------------------------------------------- They're not exhaustive though. > > import warnings > def fxn(): > warnings.warn("deprecated", DeprecationWarning) > with warnings.catch_warnings(record=True) as w: > # Cause all warnings to always be triggered. > warnings.simplefilter("always") > # Trigger a warning. > fxn() > # Verify some things > assert len(w) == 1 > assert issubclass(w[-1].category, DeprecationWarning) > assert "deprecated" in str(w[-1].message) > > > > Use ``from numpy.testing import WarningManager`` for a 2.4-compatible version of catch_warnings (with explicitly calling its __enter__ and __exit__ methods). Ralf > -Travis > > > > On Sep 4, 2012, at 10:49 PM, Ond?ej ?ert?k wrote: > > On Tue, Sep 4, 2012 at 8:38 PM, Travis Oliphant > wrote: > > > There is an error context that controls how floating point signals are > handled. There is a separate control for underflow, overflow, divide by > zero, and invalid. IIRC, it was decided on this list a while ago to make > the default ignore for underflow and warning for overflow, invalid and > divide by zero. > > > However, an oversight pushed versions of NumPy where all the error > handlers where set to "ignore" and this test was probably written then. > I think the test should be changed to check for RuntimeWarning on some > of the cases. This might take a little work as it looks like the code > uses generators across multiple tests and would have to be changed to > handle expecting warnings. > > > Alternatively, the error context can be set before the test runs and then > restored afterwords: > > > olderr = np.seterr(invalid='ignore') > > abs(a) > > np.seterr(**olderr) > > > > or, using an errstate context --- > > > with np.errstate(invalid='ignore'): > > abs(a) > > > I see --- so abs([nan]) should emit a warning, but in the test we > should suppress it. > I'll work on that. > > The only thing that I don't understand is why it only happens on some > platforms and doesn't on some other platforms (apparently). But it's > clear how to fix it now. > > Thanks for the information. > > Ondrej > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Sep 7 12:05:25 2012 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 7 Sep 2012 17:05:25 +0100 Subject: [Numpy-discussion] numpy.ma.MaskedArray.min() makes a copy? In-Reply-To: References: Message-ID: On 7 Sep 2012 14:38, "Benjamin Root" wrote: > > An issue just reported on the matplotlib-users list involved a user who ran out of memory while attempting to do an imshow() on a large array. 
While this wouldn't be totally unexpected, the user's traceback shows that they ran out of memory before any actual building of the image occurred. Memory usage sky-rocketed when imshow() attempted to determine the min and max of the image. The input data was a masked array, and it appears that the implementation of min() for masked arrays goes something like this (paraphrasing here): > > obj.filled(inf).min() > > The idea is that any masked element is set to the largest possible value for their dtype in a copied array of itself, and then a min() is performed on that copied array. I am assuming that max() does the same thing. > > Can this be done differently/more efficiently? If the "filled" approach has to be done, maybe it would be a good idea to make the copy in chunks instead of all at once? Ideally, it would be nice to avoid the copying altogether and utilize some of the special iterators that Mark Weibe created last year. I think what you're looking for is where= support for ufunc.reduce. This isn't implemented yet but at least it's straightforward in principle... otherwise I don't know anything better than reimplementing .min() by hand. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Fri Sep 7 14:21:56 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Fri, 7 Sep 2012 11:21:56 -0700 Subject: [Numpy-discussion] John Hunter's memorial service In-Reply-To: References: Message-ID: I just received the official announcement, please note the RSVP requirement to Miriam at msierig at gmail.com. John Davidson Hunter, III 1968-2012 [image: Inline image 1] Our family invites you to join us to celebrate and remember the life of John Hunter Memorial Service Rockefeller Chapel 5850 South Woodlawn Chicago, IL 60637 Monday October 1, 2012 11am Service will be followed by a reception where family and friends may gather to share memories of John. Please RSVP to Miriam at msierig at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: jdhj.jpg Type: image/jpeg Size: 370050 bytes Desc: not available URL: From ralf.gommers at gmail.com Fri Sep 7 17:09:09 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 7 Sep 2012 23:09:09 +0200 Subject: [Numpy-discussion] Debian/Ubuntu patch help (was: ANN: NumPy 1.6.2 release candidate 1) In-Reply-To: References: Message-ID: On Fri, Aug 31, 2012 at 3:03 AM, Ond?ej ?ert?k wrote: > On Tue, May 15, 2012 at 11:52 AM, Ralf Gommers > wrote: > > > > > > On Sat, May 12, 2012 at 9:17 PM, Ralf Gommers < > ralf.gommers at googlemail.com> > > wrote: > >> > >> > >> > >> On Sat, May 12, 2012 at 6:22 PM, Sandro Tosi > wrote: > >>> > >>> Hello, > >>> > >>> On Sat, May 5, 2012 at 8:15 PM, Ralf Gommers > >>> wrote: > >>> > Hi, > >>> > > >>> > I'm pleased to announce the availability of the first release > candidate > >>> > of > >>> > NumPy 1.6.2. This is a maintenance release. Due to the delay of the > >>> > NumPy > >>> > 1.7.0, this release contains far more fixes than a regular NumPy > bugfix > >>> > release. It also includes a number of documentation and build > >>> > improvements. > >>> > > >>> > Sources and binary installers can be found at > >>> > https://sourceforge.net/projects/numpy/files/NumPy/1.6.2rc1/ > >>> > > >>> > Please test this release and report any issues on the > numpy-discussion > >>> > mailing list. > >>> ... 
> >>> > BLD: add support for the new X11 directory structure on Ubuntu & > co. > >>> > >>> We've just discovered that this fix is not enough. Actually the new > >>> directories are due to the "multi-arch" feature of Debian systems, > >>> that allows to install libraries from other (foreign) architectures > >>> than the one the machine is (the classic example, i386 libraries on a > >>> amd64 host). > >>> > >>> the fix included to look up in additional directories is currently > >>> only for X11, while for example Debian has fftw3 that's > >>> multi-arch-ified and thus will fail to be detected. > >>> > >>> Could this fix be extended to include all other things that are > >>> checked? for reference the bug in Debian is [1]; there was also a > >>> patch[2] in previous versions, that was using gcc to get the > >>> multi-arch paths - you might use as a reference, or to implement > >>> something debian-systems-specific. > >>> > >>> [1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=640940 > >>> [2] > >>> > http://anonscm.debian.org/viewvc/python-modules/packages/numpy/trunk/debian/patches/50_search-multiarch-paths.patch?view=markup&pathrev=21168 > >>> > >>> It would be awesome is such support would end up in 1.6.2 . > >> > >> > >> Hardcoding some more paths to check in distutils/system_info.py should > be > >> OK, also for 1.6.2 (will require a new RC). > >> > >> The --print-multiarch thing looks very questionable. As far as I can > tell, > >> it's a Debian specific gcc patch, only available in gcc 4.6 and up. > Ubuntu > >> before 11.10 release also doesn't have it. Therefore I don't think use > of > >> --print-multiarch is appropriate for numpy for now, and certainly not a > >> change I'd like to make to distutils right before a release. > >> > >> If anyone with access to a Debian/Ubuntu system could come up with a > patch > >> which adds the right paths to system_info.py, that would be great. > > > > > > Hi, if there's anyone wants to have a look at the above issue this week, > > that would be great. > > > > If there's a patch by this weekend I can create a second RC, so we can > still > > have the final release before the end of this month (needed for Debian > > freeze). Otherwise a second RC won't be needed. > > For NumPy 1.7.0, the issue is fixed for X11 by the following lines: > > if os.path.exists('/usr/lib/X11'): > globbed_x11_dir = glob('/usr/lib/*/libX11.so') > if globbed_x11_dir: > x11_so_dir = os.path.split(globbed_x11_dir[0])[0] > default_x11_lib_dirs.extend([x11_so_dir, '/usr/lib/X11']) > default_x11_include_dirs.extend(['/usr/lib/X11/include', > '/usr/include/X11']) > > > in numpy/distutils/system_info.py, there is still an issue of > supporting Debian multi-arch fully: > > http://projects.scipy.org/numpy/ticket/2150 > > However, I don't understand what exactly it means. Ralf, would would > be a canonical example to fix? > If I use for example x11, I get: > > > In [1]: from numpy.distutils.system_info import get_info > > In [2]: get_info("x11", 2) > > /home/ondrej/repos/numpy/py27/lib/python2.7/site-packages/numpy/distutils/system_info.py:551: > UserWarning: Specified path /usr/X11R6/lib64 is invalid. > warnings.warn('Specified path %s is invalid.' % d) > > /home/ondrej/repos/numpy/py27/lib/python2.7/site-packages/numpy/distutils/system_info.py:551: > UserWarning: Specified path /usr/X11R6/lib is invalid. > warnings.warn('Specified path %s is invalid.' 
% d) > > /home/ondrej/repos/numpy/py27/lib/python2.7/site-packages/numpy/distutils/system_info.py:551: > UserWarning: Specified path /usr/X11/lib64 is invalid. > warnings.warn('Specified path %s is invalid.' % d) > > /home/ondrej/repos/numpy/py27/lib/python2.7/site-packages/numpy/distutils/system_info.py:551: > UserWarning: Specified path /usr/X11/lib is invalid. > warnings.warn('Specified path %s is invalid.' % d) > > /home/ondrej/repos/numpy/py27/lib/python2.7/site-packages/numpy/distutils/system_info.py:551: > UserWarning: Specified path /usr/lib64 is invalid. > warnings.warn('Specified path %s is invalid.' % d) > > /home/ondrej/repos/numpy/py27/lib/python2.7/site-packages/numpy/distutils/system_info.py:551: > UserWarning: Specified path /usr/X11R6/include is invalid. > warnings.warn('Specified path %s is invalid.' % d) > > /home/ondrej/repos/numpy/py27/lib/python2.7/site-packages/numpy/distutils/system_info.py:551: > UserWarning: Specified path /usr/X11/include is invalid. > warnings.warn('Specified path %s is invalid.' % d) > > /home/ondrej/repos/numpy/py27/lib/python2.7/site-packages/numpy/distutils/system_info.py:551: > UserWarning: Specified path /usr/lib/X11/include is invalid. > warnings.warn('Specified path %s is invalid.' % d) > Out[2]: > {'include_dirs': ['/usr/include'], > 'libraries': ['X11'], > 'library_dirs': ['/usr/lib/x86_64-linux-gnu']} > > > > I am using Ubuntu 12.04. Is the task to remove the warnings, or is the > task to fix it for some other package from the get_info() list (which > one)? > The idea is to review and apply the patch linked to in this thread, in order for numpy builds to still work when other libs than X11 are multi-arched in Debian (FFTW was mentioned as an example by Julian). Here's a direct link to the patch again: http://anonscm.debian.org/viewvc/python-modules/packages/numpy/trunk/debian/patches/50_search-multiarch-paths.patch?view=markup&pathrev=21168 It looks to me like showing the warning is inappropriate and there should be a second dash in "-print-multi-arch" (disclaimer: didn't test the patch), but for the rest it's good to go. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Sep 8 12:37:35 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 8 Sep 2012 18:37:35 +0200 Subject: [Numpy-discussion] problem with scipy's test In-Reply-To: References: Message-ID: On Wed, Sep 5, 2012 at 6:57 PM, ???? <275438859 at qq.com> wrote: > Hi,every body. > I encounter the error while the scipy is testing . > I wanna know why and how to fix it.(OSX lion 10.7.4) > here is part of the respond: > See http://thread.gmane.org/gmane.comp.python.scientific.devel/15289/focus=15297 Seems hard to fix. Using gfortran 4.4 is a workaround. Ralf > > AssertionError: > Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 > error for eigsh:general, typ=d, which=SA, sigma=0.5, mattype=asarray, > OPpart=None, mode=buckling > (mismatch 100.0%) > x: array([[ 15.86892331, 0.0549568 ], > [ 14.15864153, 0.31381369], > [ 10.99691307, 0.37543458],... > y: array([[ 3.19549052, 0.0549568 ], > [ 2.79856422, 0.31381369], > [ 1.67526354, 0.37543458],... 
> > ====================================================================== > FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, > 'SA', None, 0.5, , None, 'cayley') > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/Library/Python/2.7/site-packages/nose-1.1.2-py2.7.egg/nose/case.py", line > 197, in runTest > self.test(*self.arg) > File > "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", > line 249, in eval_evec > assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) > File > "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", > line 1178, in assert_allclose > verbose=verbose, header=header) > File > "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", > line 644, in assert_array_compare > raise AssertionError(msg) > AssertionError: > Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 > error for eigsh:general, typ=d, which=SA, sigma=0.5, mattype=asarray, > OPpart=None, mode=cayley > (mismatch 100.0%) > x: array([[-0.36892684, -0.01935691], > [-0.26850996, -0.11053158], > [-0.40976156, -0.13223572],... > y: array([[-0.43633077, -0.01935691], > [-0.25161386, -0.11053158], > [-0.36756684, -0.13223572],... > > ---------------------------------------------------------------------- > Ran 5501 tests in 56.993s > > FAILED (KNOWNFAIL=13, SKIP=42, failures=76) > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanforeest at gmail.com Sat Sep 8 17:56:56 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Sat, 8 Sep 2012 23:56:56 +0200 Subject: [Numpy-discussion] sum and prod Message-ID: Hi, I ran the following code: args = np.array([4,8]) print np.sum( (arg > 0) for arg in args) print np.sum([(arg > 0) for arg in args]) print np.prod( (arg > 0) for arg in args) print np.prod([(arg > 0) for arg in args]) with this result: 2 1 at 0x1c70410> 1 Is the difference between prod and sum intentional? I would expect that numpy.prod would also work on a generator, just like numpy.sum. BTW: the last line does what I need: the product over the truth values of all elements of args. Is there perhaps a nicer (conciser) way to achieve this? Thanks. Nicky From hangenuit at gmail.com Sat Sep 8 18:06:32 2012 From: hangenuit at gmail.com (Han Genuit) Date: Sun, 9 Sep 2012 00:06:32 +0200 Subject: [Numpy-discussion] sum and prod In-Reply-To: References: Message-ID: Hi, Maybe try something like this? >>> args = np.array([4,8]) >>> np.prod(args > 0) 1 >>> np.sum(args > 0) 2 Cheers, Han From warren.weckesser at enthought.com Sat Sep 8 18:10:16 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Sat, 8 Sep 2012 17:10:16 -0500 Subject: [Numpy-discussion] sum and prod In-Reply-To: References: Message-ID: On Sat, Sep 8, 2012 at 4:56 PM, nicky van foreest wrote: > Hi, > > I ran the following code: > > args = np.array([4,8]) > print np.sum( (arg > 0) for arg in args) > print np.sum([(arg > 0) for arg in args]) > print np.prod( (arg > 0) for arg in args) > print np.prod([(arg > 0) for arg in args]) > > with this result: > > 2 > 1 > I get 2 here, not 1 (numpy version 1.6.1). 
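As far as I can tell (from a quick look at fromnumeric.py, so take this with a grain of salt), np.sum special-cases generator input and falls back to Python's builtin sum, while np.prod has no such special case and just wraps the generator in a 0-d object array, e.g.:

import numpy as np

args = np.array([4, 8])
gen = ((arg > 0) for arg in args)

np.asarray(gen)    # 0-d object array holding the generator itself, which is
                   # why np.prod hands the generator back instead of reducing it
np.prod([arg > 0 for arg in args])    # a list (or an array) works for both: 1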
> at 0x1c70410> > 1 > > Is the difference between prod and sum intentional? I would expect > that numpy.prod would also work on a generator, just like numpy.sum. > Whatever the correct result may be, I would expect them to have the same behavior with respect to a generator argument. > BTW: the last line does what I need: the product over the truth values > of all elements of args. Is there perhaps a nicer (conciser) way to > achieve this? Thanks. > How about: In [15]: np.all(args > 0) Out[15]: True Warren > Nicky > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From e.antero.tammi at gmail.com Sat Sep 8 18:30:40 2012 From: e.antero.tammi at gmail.com (eat) Date: Sun, 9 Sep 2012 01:30:40 +0300 Subject: [Numpy-discussion] sum and prod In-Reply-To: References: Message-ID: Hi, On Sun, Sep 9, 2012 at 12:56 AM, nicky van foreest wrote: > Hi, > > I ran the following code: > > args = np.array([4,8]) > print np.sum( (arg > 0) for arg in args) > print np.sum([(arg > 0) for arg in args]) > print np.prod( (arg > 0) for arg in args) > print np.prod([(arg > 0) for arg in args]) > Can't see why someone would write code like above, but anyway: In []: args = np.array([4,8]) In []: print np.sum( (arg > 0) for arg in args) 2 In []: print np.sum([(arg > 0) for arg in args]) 2 In []: print np.prod( (arg > 0) for arg in args) at 0x062BDA08> In []: print np.prod([(arg > 0) for arg in args]) 1 In []: print np.prod( (arg > 0) for arg in args).next() True In []: sys.version Out[]: '2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]' In []: np.version.version Out[]: '1.6.0' My 2 cents, -eat > > with this result: > > 2 > 1 > at 0x1c70410> > 1 > > Is the difference between prod and sum intentional? I would expect > that numpy.prod would also work on a generator, just like numpy.sum. > > BTW: the last line does what I need: the product over the truth values > of all elements of args. Is there perhaps a nicer (conciser) way to > achieve this? Thanks. > > Nicky > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanforeest at gmail.com Sun Sep 9 04:23:59 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Sun, 9 Sep 2012 10:23:59 +0200 Subject: [Numpy-discussion] sum and prod In-Reply-To: References: Message-ID: On 9 September 2012 00:10, Warren Weckesser wrote: > > > On Sat, Sep 8, 2012 at 4:56 PM, nicky van foreest > wrote: >> >> Hi, >> >> I ran the following code: >> >> args = np.array([4,8]) >> print np.sum( (arg > 0) for arg in args) >> print np.sum([(arg > 0) for arg in args]) >> print np.prod( (arg > 0) for arg in args) >> print np.prod([(arg > 0) for arg in args]) >> >> with this result: >> >> 2 >> 1 > > > > I get 2 here, not 1 (numpy version 1.6.1). Sorry. Typo. > > >> >> at 0x1c70410> >> 1 >> >> Is the difference between prod and sum intentional? I would expect >> that numpy.prod would also work on a generator, just like numpy.sum. > > > > Whatever the correct result may be, I would expect them to have the same > behavior with respect to a generator argument. > > >> >> BTW: the last line does what I need: the product over the truth values >> of all elements of args. 
Is there perhaps a nicer (conciser) way to >> achieve this? Thanks. > > > > How about: > > In [15]: np.all(args > 0) > Out[15]: True > > > Warren > > > >> >> Nicky >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From vanforeest at gmail.com Sun Sep 9 04:25:09 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Sun, 9 Sep 2012 10:25:09 +0200 Subject: [Numpy-discussion] sum and prod In-Reply-To: References: Message-ID: Thanks for your hints. NIcky On 9 September 2012 00:30, eat wrote: > Hi, > > On Sun, Sep 9, 2012 at 12:56 AM, nicky van foreest > wrote: >> >> Hi, >> >> I ran the following code: >> >> args = np.array([4,8]) >> print np.sum( (arg > 0) for arg in args) >> print np.sum([(arg > 0) for arg in args]) >> print np.prod( (arg > 0) for arg in args) >> print np.prod([(arg > 0) for arg in args]) > > Can't see why someone would write code like above, but anyway: > In []: args = np.array([4,8]) > In []: print np.sum( (arg > 0) for arg in args) > 2 > In []: print np.sum([(arg > 0) for arg in args]) > 2 > In []: print np.prod( (arg > 0) for arg in args) > at 0x062BDA08> > In []: print np.prod([(arg > 0) for arg in args]) > 1 > In []: print np.prod( (arg > 0) for arg in args).next() > True > In []: sys.version > Out[]: '2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]' > In []: np.version.version > Out[]: '1.6.0' > > My 2 cents, > -eat >> >> >> with this result: >> >> 2 >> 1 >> at 0x1c70410> >> 1 >> >> Is the difference between prod and sum intentional? I would expect >> that numpy.prod would also work on a generator, just like numpy.sum. >> >> BTW: the last line does what I need: the product over the truth values >> of all elements of args. Is there perhaps a nicer (conciser) way to >> achieve this? Thanks. >> >> Nicky >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From nouiz at nouiz.org Sun Sep 9 13:08:18 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Sun, 9 Sep 2012 13:08:18 -0400 Subject: [Numpy-discussion] ANN: NumPy 1.7.0b1 release In-Reply-To: References: Message-ID: All the PyArray_FLOAT*, ... to NPY_FLOAT*. PyObject *var; PyArrayObject * var_arr; var_arr->{dimensions,strides,nd,descr} to PyArray_{DIMS,...}(var_arr), need macro PyArray_ISCONTIGUOUS(var) to PyArray_ISCONTIGUOUS(var_arr) PyArray_{DATA,STRIDES,GETPTR2}(var) to PyArray_{DATA,STRIDES}(var_arr) The sed script didn't replace NPY_ALIGNED to NPY_ARRAY_ALIGNED. idem for NPY_WRITABLE, NPY_UPDATE_ALL, NPY_C_CONTIGUOUS, NPY_F_CONTIGUOUS, The sed script did change as well, but I think it should not be deprecated. This flag NPY_ARRAY_ENSURECOPY is a new one. It was not existing in numpy 1.6.0. We try to stay compitible with numpy 1.3 (maybe we will bump to numpy 1.4). 
This is the info on when this line was introduced: 263df0cc (Mark Wiebe 2011-07-19 17:06:08 -0500 784) #define NPY_ARRAY_ENSURECOPY 0x0020 In the trunk of Theano, I'll define NPY_ARRAY_ENSURECOPY for older version of numpy. PyArray_SetBaseObject On Tue, Sep 4, 2012 at 6:31 PM, Ond?ej ?ert?k wrote: > On Sat, Sep 1, 2012 at 2:19 AM, Sandro Tosi wrote: >> On Fri, Aug 31, 2012 at 8:07 PM, Sandro Tosi wrote: >>> On Fri, Aug 31, 2012 at 7:17 PM, Ond?ej ?ert?k wrote: >>>> If you could create issues at github: https://github.com/numpy/numpy/issues >>>> that would be great. If you have time, also with some info about the platform >>>> and how to reproduce it. Or at least a link to the build logs. >>> >>> I've reported it here: https://github.com/numpy/numpy/issues/402 >> >> I've just spammed the issue tracker with additional issues, reporting >> all the test suite failures on Debian architectures; issues are 406 -> >> 414 . >> >> Don't hesitate to contact me if you need any support or clarification. > > Thanks Sandro for reporting it! I put all of them into my release issue: > > https://github.com/numpy/numpy/issues/396 > > most of the failures seem to be caused by these two issues: > > https://github.com/numpy/numpy/issues/394 > https://github.com/numpy/numpy/issues/426 > > so I am looking into this now. > > Ondrej > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From nouiz at nouiz.org Sun Sep 9 13:09:12 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Sun, 9 Sep 2012 13:09:12 -0400 Subject: [Numpy-discussion] ANN: NumPy 1.7.0b1 release In-Reply-To: References: Message-ID: Hi, forget this email, it was sent by error. I'll write the info in a new email in a few minutes. Fred On Sun, Sep 9, 2012 at 1:08 PM, Fr?d?ric Bastien wrote: > All the PyArray_FLOAT*, ... to NPY_FLOAT*. > > PyObject *var; > PyArrayObject * var_arr; > var_arr->{dimensions,strides,nd,descr} to PyArray_{DIMS,...}(var_arr), > need macro > > PyArray_ISCONTIGUOUS(var) to PyArray_ISCONTIGUOUS(var_arr) > > PyArray_{DATA,STRIDES,GETPTR2}(var) to PyArray_{DATA,STRIDES}(var_arr) > > The sed script didn't replace NPY_ALIGNED to NPY_ARRAY_ALIGNED. idem > for NPY_WRITABLE, NPY_UPDATE_ALL, NPY_C_CONTIGUOUS, NPY_F_CONTIGUOUS, > > The sed script did change as well, but I think it should not be > deprecated. This flag NPY_ARRAY_ENSURECOPY is a new one. It was not > existing in numpy 1.6.0. We try to stay compitible with numpy 1.3 > (maybe we will bump to numpy 1.4). This is the info on when this line > was introduced: > > 263df0cc (Mark Wiebe 2011-07-19 17:06:08 -0500 784) #define > NPY_ARRAY_ENSURECOPY 0x0020 > > In the trunk of Theano, I'll define NPY_ARRAY_ENSURECOPY for older > version of numpy. > > PyArray_SetBaseObject > > > On Tue, Sep 4, 2012 at 6:31 PM, Ond?ej ?ert?k wrote: >> On Sat, Sep 1, 2012 at 2:19 AM, Sandro Tosi wrote: >>> On Fri, Aug 31, 2012 at 8:07 PM, Sandro Tosi wrote: >>>> On Fri, Aug 31, 2012 at 7:17 PM, Ond?ej ?ert?k wrote: >>>>> If you could create issues at github: https://github.com/numpy/numpy/issues >>>>> that would be great. If you have time, also with some info about the platform >>>>> and how to reproduce it. Or at least a link to the build logs. 
>>>> >>>> I've reported it here: https://github.com/numpy/numpy/issues/402 >>> >>> I've just spammed the issue tracker with additional issues, reporting >>> all the test suite failures on Debian architectures; issues are 406 -> >>> 414 . >>> >>> Don't hesitate to contact me if you need any support or clarification. >> >> Thanks Sandro for reporting it! I put all of them into my release issue: >> >> https://github.com/numpy/numpy/issues/396 >> >> most of the failures seem to be caused by these two issues: >> >> https://github.com/numpy/numpy/issues/394 >> https://github.com/numpy/numpy/issues/426 >> >> so I am looking into this now. >> >> Ondrej >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion From nouiz at nouiz.org Sun Sep 9 13:12:12 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Sun, 9 Sep 2012 13:12:12 -0400 Subject: [Numpy-discussion] Numpy 1.7b1 API change cause big trouble In-Reply-To: References: Message-ID: Hi, On Thu, Sep 6, 2012 at 11:32 AM, Charles R Harris wrote: > > > On Thu, Sep 6, 2012 at 10:07 AM, Fr?d?ric Bastien wrote: >> >> Hi, >> >> I reply with more information probably later today or tomorrow, but I >> think i need to finish everything to give you the exact information. >> >> Part of the problem I had was that by default there is a warning that >> is generated. It tell that to remove this warning we need to set >> NPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION. > > > You don't want to define this macro if you need to directly access the > fields. What warning are you getting if you don't define it? Are you using > Cython? If I don't define it and I remove the -Werror, I got 3 errors. 1 is related to an error message that was changed. The second was that we called numpy.dot() with 2 sparse matrix(from scipy). It worked in the past, but not now. Changing the test is easy. I don't expect people to have done this frequently, but maybe warning about this in the release note would help people to fix it faster. The error message is not helpful, it tell that it can't find a common dtype between float32 and float32 dtype. I changed the np.dot(a,b) to a*b as this is the matrix multiplication function for sparse matrix in scipy. This change remove the possibility to make a function that use matrix product to work with both ndarray and sparse matrix without special case for the object type. Not great, but there is an easy work around. So this stay like this in the release, there should be a warning. The third is releated to change to the casting rules in numpy. Before a scalar complex128 * vector float32 gived a vector of dtype complex128. Now it give a vector of complex64. The reason is that now the scalar of different category only change the category, not the precision. I would consider a must that we warn clearly about this interface change. Most people won't see it, but people that optimize there code heavily could depend on such thing. The other problem I had was related to the fact that I tryed to use only the new API. This took me a few day and it is not finished, as now I have a seg fault that is not easy to trigger. It happen in one tests, but only when other tests a ran before... This is probably an error from my change The sed script that replace some macro helped, but there is few macro change that is not in the file: NPY_ALIGNED to NPY_ARRAY_ALIGNED. 
idem for NPY_WRITABLE, NPY_UPDATE_ALL, NPY_C_CONTIGUOUS and NPY_F_CONTIGUOUS. The sed script change NPY_ENSURECOPY to NPY_ARRAY_ENSURECOPY, but I think that NPY_ARRAY_ENSURECOPY was introduced in numpy 1.7. Maybe warning somewhere in the API transition doc that if people want to stay compatible with older version of numpy, the should use an "#ifndef NPY_ARRAY_ENSURECOPY ..." in there code. I won't have the time to make a PR with those small change as I have a deadline the 16 september and the 1 october. I hope my comment will be helpful. If you still have questions, don't hesitate. Fred From charlesr.harris at gmail.com Sun Sep 9 15:42:35 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 9 Sep 2012 13:42:35 -0600 Subject: [Numpy-discussion] Numpy 1.7b1 API change cause big trouble In-Reply-To: References: Message-ID: On Sun, Sep 9, 2012 at 11:12 AM, Fr?d?ric Bastien wrote: > Hi, > > On Thu, Sep 6, 2012 at 11:32 AM, Charles R Harris > wrote: > > > > > > On Thu, Sep 6, 2012 at 10:07 AM, Fr?d?ric Bastien > wrote: > >> > >> Hi, > >> > >> I reply with more information probably later today or tomorrow, but I > >> think i need to finish everything to give you the exact information. > >> > >> Part of the problem I had was that by default there is a warning that > >> is generated. It tell that to remove this warning we need to set > >> NPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION. > > > > > > You don't want to define this macro if you need to directly access the > > fields. What warning are you getting if you don't define it? Are you > using > > Cython? > > If I don't define it and I remove the -Werror, I got 3 errors. 1 is > related to an error message that was changed. > > The second was that we called numpy.dot() with 2 sparse matrix(from > scipy). It worked in the past, but not now. Changing the test is easy. > I don't expect people to have done this frequently, but maybe warning > about this in the release note would help people to fix it faster. The > error message is not helpful, it tell that it can't find a common > dtype between float32 and float32 dtype. I changed the np.dot(a,b) to > a*b as this is the matrix multiplication function for sparse matrix in > scipy. This change remove the possibility to make a function that use > matrix product to work with both ndarray and sparse matrix without > special case for the object type. Not great, but there is an easy work > around. So this stay like this in the release, there should be a > warning. > > The third is releated to change to the casting rules in numpy. Before > a scalar complex128 * vector float32 gived a vector of dtype > complex128. Now it give a vector of complex64. The reason is that now > the scalar of different category only change the category, not the > precision. I would consider a must that we warn clearly about this > interface change. Most people won't see it, but people that optimize > there code heavily could depend on such thing. > > The other problem I had was related to the fact that I tryed to use > only the new API. This took me a few day and it is not finished, as > now I have a seg fault that is not easy to trigger. It happen in one > tests, but only when other tests a ran before... This is probably an > error from my change > > The sed script that replace some macro helped, but there is few macro > change that is not in the file: NPY_ALIGNED to NPY_ARRAY_ALIGNED. idem > for NPY_WRITABLE, NPY_UPDATE_ALL, NPY_C_CONTIGUOUS and > NPY_F_CONTIGUOUS. 
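Coming back to the casting-rule change Frédéric describes a few paragraphs up (a complex128 scalar times a float32 array), a minimal way to check a given NumPy build and to make the intended precision explicit. This is a sketch with made-up values, not code taken from Theano:

>>> import numpy as np
>>> vec = np.ones(3, dtype=np.float32)
>>> scal = np.complex128(1j)
>>> print (scal * vec).dtype    # complex128 on 1.6.x, complex64 on 1.7.0b1, per the report above
>>> print (scal * vec.astype(np.complex128)).dtype   # forcing the precision is version-proof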
> I can add those, they seem to have been present since at least Numpy 1.5. > The sed script change NPY_ENSURECOPY to NPY_ARRAY_ENSURECOPY, but I > think that NPY_ARRAY_ENSURECOPY was introduced in numpy 1.7. Maybe > warning somewhere in the API transition doc that if people want to > stay compatible with older version of numpy, the should use an > "#ifndef NPY_ARRAY_ENSURECOPY ..." in there code. > Hmm... Looks like you are right about NPY_ARRAY_ENSURECOPY. An alternative would be to not deprecate it, but an #ifndef would be better for the long term goal of having everyone use newer macros. > > I won't have the time to make a PR with those small change as I have a > deadline the 16 september and the 1 october. I hope my comment will be > helpful. If you still have questions, don't hesitate. > > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Sep 9 17:17:52 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 9 Sep 2012 15:17:52 -0600 Subject: [Numpy-discussion] Numpy 1.7b1 API change cause big trouble In-Reply-To: References: Message-ID: On Sun, Sep 9, 2012 at 1:42 PM, Charles R Harris wrote: > > > On Sun, Sep 9, 2012 at 11:12 AM, Fr?d?ric Bastien wrote: > >> Hi, >> >> On Thu, Sep 6, 2012 at 11:32 AM, Charles R Harris >> wrote: >> > >> > >> > On Thu, Sep 6, 2012 at 10:07 AM, Fr?d?ric Bastien >> wrote: >> >> >> >> Hi, >> >> >> >> I reply with more information probably later today or tomorrow, but I >> >> think i need to finish everything to give you the exact information. >> >> >> >> Part of the problem I had was that by default there is a warning that >> >> is generated. It tell that to remove this warning we need to set >> >> NPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION. >> > >> > >> > You don't want to define this macro if you need to directly access the >> > fields. What warning are you getting if you don't define it? Are you >> using >> > Cython? >> >> If I don't define it and I remove the -Werror, I got 3 errors. 1 is >> related to an error message that was changed. >> >> The second was that we called numpy.dot() with 2 sparse matrix(from >> scipy). It worked in the past, but not now. Changing the test is easy. >> I don't expect people to have done this frequently, but maybe warning >> about this in the release note would help people to fix it faster. The >> error message is not helpful, it tell that it can't find a common >> dtype between float32 and float32 dtype. I changed the np.dot(a,b) to >> a*b as this is the matrix multiplication function for sparse matrix in >> scipy. This change remove the possibility to make a function that use >> matrix product to work with both ndarray and sparse matrix without >> special case for the object type. Not great, but there is an easy work >> around. So this stay like this in the release, there should be a >> warning. >> >> The third is releated to change to the casting rules in numpy. Before >> a scalar complex128 * vector float32 gived a vector of dtype >> complex128. Now it give a vector of complex64. The reason is that now >> the scalar of different category only change the category, not the >> precision. I would consider a must that we warn clearly about this >> interface change. Most people won't see it, but people that optimize >> there code heavily could depend on such thing. >> >> The other problem I had was related to the fact that I tryed to use >> only the new API. 
This took me a few day and it is not finished, as >> now I have a seg fault that is not easy to trigger. It happen in one >> tests, but only when other tests a ran before... This is probably an >> error from my change >> >> The sed script that replace some macro helped, but there is few macro >> change that is not in the file: NPY_ALIGNED to NPY_ARRAY_ALIGNED. idem >> for NPY_WRITABLE, NPY_UPDATE_ALL, NPY_C_CONTIGUOUS and >> NPY_F_CONTIGUOUS. >> > > I can add those, they seem to have been present since at least Numpy 1.5. > > >> The sed script change NPY_ENSURECOPY to NPY_ARRAY_ENSURECOPY, but I >> think that NPY_ARRAY_ENSURECOPY was introduced in numpy 1.7. Maybe >> warning somewhere in the API transition doc that if people want to >> stay compatible with older version of numpy, the should use an >> "#ifndef NPY_ARRAY_ENSURECOPY ..." in there code. >> > > Hmm... Looks like you are right about NPY_ARRAY_ENSURECOPY. An alternative > would be to not deprecate it, but an #ifndef would be better for the long > term goal of having everyone use newer macros. > > And the other *_ARRAY_* macros seem to have been defined in 1.5. If 1.7 is intended to be a long term release, they probably shouldn't be deprecated until a later release. > >> I won't have the time to make a PR with those small change as I have a >> deadline the 16 september and the 1 october. I hope my comment will be >> helpful. If you still have questions, don't hesitate. >> >> > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From hangenuit at gmail.com Sun Sep 9 19:43:34 2012 From: hangenuit at gmail.com (Han Genuit) Date: Mon, 10 Sep 2012 01:43:34 +0200 Subject: [Numpy-discussion] sum and prod In-Reply-To: References: Message-ID: >> Is the difference between prod and sum intentional? I would expect >> that numpy.prod would also work on a generator, just like numpy.sum. > > > > Whatever the correct result may be, I would expect them to have the same > behavior with respect to a generator argument. > I found out that np.sum() has some special treatment in fromnumeric.py, where in case of a generator argument it uses the Python sum() function instead of the NumPy one. This is not the case for np.prod(), where the generator argument stays NPY_OBJECT in PyArray_GetArrayParamsFromObject. There is no NumPy code for handling generators, except for np.fromiter(), but that needs a dtype (which cannot be inferred automatically before running the generator). It might be more consistent to add special generator cases to other NumPy functions as well, using Python reduce() or imap(), but I'm not sure about the best way to solve this.. From chaoyuejoy at gmail.com Mon Sep 10 12:43:17 2012 From: chaoyuejoy at gmail.com (Chao YUE) Date: Mon, 10 Sep 2012 18:43:17 +0200 Subject: [Numpy-discussion] easy way to change part of only unmasked elements value? Message-ID: Dear all numpy users, what's the easy way if I just want to change part of the unmasked array elements into another new value? like an example below: in my real case, I would like to change a subgrid of a masked numpy array to another value, but this grid include both masked and unmasked data. If I do a simple array[index1:index2, index3:index4] = another_value, those data with original True mask will change into False. I am using numpy 1.6.2. Thanks for any ideas. 
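One option, if it is acceptable that the payload underneath the masked cells is overwritten too: write through the masked array's underlying ndarray, exposed as the ``data`` attribute, so the mask itself is never touched. A small sketch on the same toy array as the session that follows (standard numpy.ma behaviour, expected to work the same on 1.6.x):

In [1]: a = np.ma.masked_less(np.arange(10), 5)

In [2]: a.data[3:6] = 1     # assign through the underlying array; the mask stays as it was

In [3]: a
Out[3]:
masked_array(data = [-- -- -- -- -- 1 6 7 8 9],
             mask = [ True  True  True  True  True False False False False False],
       fill_value = 999999)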
In [91]: a = np.ma.masked_less(np.arange(10),5) In [92]: or_mask = a.mask.copy() In [93]: a Out[93]: masked_array(data = [-- -- -- -- -- 5 6 7 8 9], mask = [ True True True True True False False False False False], fill_value = 999999) In [94]: a[3:6]=1 In [95]: a Out[95]: masked_array(data = [-- -- -- 1 1 1 6 7 8 9], mask = [ True True True False False False False False False False], fill_value = 999999) In [96]: a = np.ma.masked_array(a,mask=or_mask) In [97]: a Out[97]: masked_array(data = [-- -- -- -- -- 1 6 7 8 9], mask = [ True True True True True False False False False False], fill_value = 999999) Chao -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From alok.jadhav at credit-suisse.com Mon Sep 10 23:46:38 2012 From: alok.jadhav at credit-suisse.com (Jadhav, Alok) Date: Tue, 11 Sep 2012 11:46:38 +0800 Subject: [Numpy-discussion] numpy sort is not working Message-ID: Hi everyone, I have a numpy array of dimensions >>> allRics.shape (583760, 1) To sort the array, I set the dtype of the array as follows: allRics.dtype = [('idx', np.float), ('opened', np.float), ('time', np.float),('trdp1',np.float),('trdp0',np.float),('dt',np.float),('value' ,np.float)] >>> allRics.dtype dtype([('idx', '>> x=np.sort(allRics,order='time') >>> x[17330:17350]['time'] array([[ 61184.4 ], [ 61188.51 ], [ 61188.979], [ 61188.979], [ 61189.989], [ 61191.66 ], [ 61194.35 ], [ 61194.35 ], [ 61198.79 ], [ 61198.145], [ 36126.217], [ 36126.217], [ 36126.218], [ 36126.218], [ 36126.219], [ 36126.271], [ 36126.271], [ 36126.271], [ 36126.293], [ 36126.293]]) Time column doesn't change its order. Could someone please advise what is missing here? Is this related to the bug http://www.mail-archive.com/numpy-discussion at scipy.org/msg23060.html (from 2010). Regards, Alok Alok Jadhav CREDIT SUISSE AG GAT IT Hong Kong, KVAG 67 International Commerce Centre | Hong Kong | Hong Kong Phone +852 2101 6274 | Mobile +852 9169 7172 alok.jadhav at credit-suisse.com | www.credit-suisse.com =============================================================================== Please access the attached hyperlink for an important electronic communications disclaimer: http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html =============================================================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Mon Sep 10 23:57:39 2012 From: travis at continuum.io (Travis Oliphant) Date: Mon, 10 Sep 2012 22:57:39 -0500 Subject: [Numpy-discussion] numpy sort is not working In-Reply-To: References: Message-ID: <2DCD812C-977A-42CA-A106-3AF5F6A9F391@continuum.io> Hey Alok, This is worth taking a look. What version of NumPy are you using? It is not related directly to the issue you referenced as that was an endian-ness issue and your data is native-order. 
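One detail in the original report that may matter here (an observation from the shapes quoted above, not something verified against the actual data): allRics has shape (583760, 1), and np.sort defaults to axis=-1, so with a single element per row along that last axis the sort has nothing to reorder and the array comes back unchanged. Sorting along axis 0, or flattening to shape (N,) first, behaves as expected. A reduced sketch with an invented two-field dtype:

>>> import numpy as np
>>> dt = [('idx', float), ('time', float)]
>>> a = np.zeros((4, 1), dtype=dt)             # same layout as allRics: 2-D, one column
>>> a['time'][:, 0] = [3., 1., 2., 0.]
>>> np.sort(a, order='time')['time'].ravel()   # default axis=-1: rows of length 1, nothing moves
array([ 3.,  1.,  2.,  0.])
>>> np.sort(a, order='time', axis=0)['time'].ravel()
array([ 0.,  1.,  2.,  3.])
>>> np.sort(a.ravel(), order='time')['time']   # or make it 1-D first
array([ 0.,  1.,  2.,  3.])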
Your example seems to work for me (with a simulated case on 1.6.1) Best, -Travis On Sep 10, 2012, at 10:46 PM, Jadhav, Alok wrote: > > Hi everyone, > > I have a numpy array of dimensions > > >>> allRics.shape > (583760, 1) > > To sort the array, I set the dtype of the array as follows: > > allRics.dtype = [('idx', np.float), ('opened', np.float), ('time', np.float),('trdp1',np.float),('trdp0',np.float),('dt',np.float),('value',np.float)] > >>> allRics.dtype > dtype([('idx', ' > I checked and the endianness in dtype is correct. > > When I sort the array, the output array of sort is same as original array without any change. I want to sort the allRics numpy array on ?time?. I do the following. > > >>> x=np.sort(allRics,order='time') > >>> x[17330:17350]['time'] > array([[ 61184.4 ], > [ 61188.51 ], > [ 61188.979], > [ 61188.979], > [ 61189.989], > [ 61191.66 ], > [ 61194.35 ], > [ 61194.35 ], > [ 61198.79 ], > [ 61198.145], > [ 36126.217], > [ 36126.217], > [ 36126.218], > [ 36126.218], > [ 36126.219], > [ 36126.271], > [ 36126.271], > [ 36126.271], > [ 36126.293], > [ 36126.293]]) > > Time column doesn?t change its order. Could someone please advise what is missing here? Is this related to the bug > http://www.mail-archive.com/numpy-discussion at scipy.org/msg23060.html (from 2010). > > Regards, > Alok > > > > > > Alok Jadhav > CREDIT SUISSE AG > GAT IT Hong Kong, KVAG 67 > International Commerce Centre | Hong Kong | Hong Kong > Phone +852 2101 6274 | Mobile +852 9169 7172 > alok.jadhav at credit-suisse.com | www.credit-suisse.com > > > ============================================================================== > Please access the attached hyperlink for an important electronic communications disclaimer: > http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html > ============================================================================== > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From alok.jadhav at credit-suisse.com Tue Sep 11 01:52:57 2012 From: alok.jadhav at credit-suisse.com (Jadhav, Alok) Date: Tue, 11 Sep 2012 13:52:57 +0800 Subject: [Numpy-discussion] numpy sort is not working In-Reply-To: <2DCD812C-977A-42CA-A106-3AF5F6A9F391@continuum.io> References: <2DCD812C-977A-42CA-A106-3AF5F6A9F391@continuum.io> Message-ID: Hi Travis, Very Strange. I am on version 1.6.2 L What could I be missing. I started using numpy quite recently. Is there a way to share the data with you? Regards, Alok Jadhav GAT IT Hong Kong +852 2101 6274 (*852 6274) From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Travis Oliphant Sent: Tuesday, September 11, 2012 11:58 AM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] numpy sort is not working Hey Alok, This is worth taking a look. What version of NumPy are you using? It is not related directly to the issue you referenced as that was an endian-ness issue and your data is native-order. 
Your example seems to work for me (with a simulated case on 1.6.1) Best, -Travis On Sep 10, 2012, at 10:46 PM, Jadhav, Alok wrote: Hi everyone, I have a numpy array of dimensions >>> allRics.shape (583760, 1) To sort the array, I set the dtype of the array as follows: allRics.dtype = [('idx', np.float), ('opened', np.float), ('time', np.float),('trdp1',np.float),('trdp0',np.float),('dt',np.float),('value' ,np.float)] >>> allRics.dtype dtype([('idx', '>> x=np.sort(allRics,order='time') >>> x[17330:17350]['time'] array([[ 61184.4 ], [ 61188.51 ], [ 61188.979], [ 61188.979], [ 61189.989], [ 61191.66 ], [ 61194.35 ], [ 61194.35 ], [ 61198.79 ], [ 61198.145], [ 36126.217], [ 36126.217], [ 36126.218], [ 36126.218], [ 36126.219], [ 36126.271], [ 36126.271], [ 36126.271], [ 36126.293], [ 36126.293]]) Time column doesn't change its order. Could someone please advise what is missing here? Is this related to the bug http://www.mail-archive.com/numpy-discussion at scipy.org/msg23060.html (from 2010). Regards, Alok Alok Jadhav CREDIT SUISSE AG GAT IT Hong Kong, KVAG 67 International Commerce Centre | Hong Kong | Hong Kong Phone +852 2101 6274 | Mobile +852 9169 7172 alok.jadhav at credit-suisse.com | www.credit-suisse.com ======================================================================== ====== Please access the attached hyperlink for an important electronic communications disclaimer: http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html ======================================================================== ====== _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion =============================================================================== Please access the attached hyperlink for an important electronic communications disclaimer: http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html =============================================================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From scopatz at gmail.com Tue Sep 11 02:03:53 2012 From: scopatz at gmail.com (Anthony Scopatz) Date: Tue, 11 Sep 2012 01:03:53 -0500 Subject: [Numpy-discussion] numpy sort is not working In-Reply-To: References: <2DCD812C-977A-42CA-A106-3AF5F6A9F391@continuum.io> Message-ID: On Tue, Sep 11, 2012 at 12:52 AM, Jadhav, Alok < alok.jadhav at credit-suisse.com> wrote: > Hi Travis,**** > > ** ** > > Very Strange. I am on version 1.6.2 L**** > > What could I be missing. I started using numpy quite recently. Is there a > way to share the data with you? > Hi Alok, Typically, a self-contained example which reproduces the error and generates its own data is more useful / valuable than sharing datasets. if you could come up with this, that'd be great! Be Well Anthony > **** > > ** ** > > Regards,**** > > ** ** > > Alok Jadhav**** > > GAT IT Hong Kong**** > > +852 2101 6274 (*852 6274)**** > > ** ** > > *From:* numpy-discussion-bounces at scipy.org [mailto: > numpy-discussion-bounces at scipy.org] *On Behalf Of *Travis Oliphant > *Sent:* Tuesday, September 11, 2012 11:58 AM > *To:* Discussion of Numerical Python > *Subject:* Re: [Numpy-discussion] numpy sort is not working**** > > ** ** > > Hey Alok, **** > > ** ** > > This is worth taking a look. What version of NumPy are you using? **** > > ** ** > > It is not related directly to the issue you referenced as that was an > endian-ness issue and your data is native-order. 
**** > > ** ** > > Your example seems to work for me (with a simulated case on 1.6.1)**** > > ** ** > > Best,**** > > ** ** > > -Travis**** > > ** ** > > ** ** > > On Sep 10, 2012, at 10:46 PM, Jadhav, Alok wrote:**** > > > > **** > > **** > > Hi everyone,**** > > **** > > I have a numpy array of dimensions**** > > **** > > >>> allRics.shape > (583760, 1) **** > > **** > > To sort the array, I set the dtype of the array as follows:**** > > **** > > allRics.dtype = [('idx', np.float), ('opened', np.float), ('time', np. > float),('trdp1',np.float),('trdp0',np.float),('dt',np.float),('value',np. > float)] **** > > >>> allRics.dtype**** > > dtype([('idx', ' ' > **** > > I checked and the endianness in dtype is correct.**** > > **** > > When I sort the array, the output array of sort is same as original array > without any change. I want to sort the allRics numpy array on ?time?. I do > the following.**** > > **** > > >>> x=np.sort(allRics,order='time')**** > > >>> x[17330:17350]['time']**** > > array([[ 61184.4 ],**** > > [ 61188.51 ],**** > > [ 61188.979],**** > > [ 61188.979],**** > > [ 61189.989],**** > > [ 61191.66 ],**** > > [ 61194.35 ],**** > > [ 61194.35 ],**** > > [ 61198.79 ],**** > > [ 61198.145],**** > > [ 36126.217],**** > > [ 36126.217],**** > > [ 36126.218],**** > > [ 36126.218],**** > > [ 36126.219],**** > > [ 36126.271],**** > > [ 36126.271],**** > > [ 36126.271],**** > > [ 36126.293],**** > > [ 36126.293]])**** > > **** > > Time column doesn?t change its order. Could someone please advise what is > missing here? Is this related to the bug**** > > http://www.mail-archive.com/numpy-discussion at scipy.org/msg23060.html > (from 2010).**** > > **** > > Regards,**** > > Alok**** > > **** > > **** > > **** > > **** > > **** > > Alok Jadhav**** > > *CREDIT SUISSE AG***** > > GAT IT Hong Kong, KVAG 67**** > > International Commerce Centre | Hong Kong | Hong Kong**** > > Phone +852 2101 6274 | Mobile +852 9169 7172**** > > alok.jadhav at credit-suisse.com | www.credit-suisse.com**** > > **** > > ** ** > > > ============================================================================== > Please access the attached hyperlink for an important electronic > communications disclaimer: > http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html > > ============================================================================== > **** > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion**** > > ** ** > > ** > > > ============================================================================== > Please access the attached hyperlink for an important electronic > communications disclaimer: > http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html > > ============================================================================== > **** > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alok.jadhav at credit-suisse.com Tue Sep 11 03:03:23 2012 From: alok.jadhav at credit-suisse.com (Jadhav, Alok) Date: Tue, 11 Sep 2012 15:03:23 +0800 Subject: [Numpy-discussion] numpy sort is not working In-Reply-To: References: <2DCD812C-977A-42CA-A106-3AF5F6A9F391@continuum.io> Message-ID: Anthony, Travis, I understand how an example that generates the data would be useful but it will be difficult to provide this in my case for following reasons - Data is read from hdf5 files (>50 MB) - Attaching the code which is used to generate the data. It may provide some more light. - numpy.sort() example on the site for small array works fine. So my guess is that only my data is having issue. - Fyi, I am on a windows 7 machine with python 2.6 and numpy 1.6.2 for j in range(0,len(rics)): ric=rics[j] trd=h5.getTrades(ric) idx=np.ones(len(trd))*j opened=np.zeros(len(trd)) time=trd[0:,0] trdp1=trd[0:,1] trdp0=np.insert(trd[1:,1], 0, basePx[ric]) dt=trd[0:,0]- np.insert(trd[0:-1,0], 0, trd[0,0]) value=trd[0:,1]*trd[0:,2] ricData=np.array([idx,opened,time,trdp1,trdp0,dt,value]) ricData=np.transpose(ricData) if allRics is None: allRics=ricData else: allRics=np.vstack((allRics, ricData)) allRics.dtype=[('idx', np.float), ('opened', np.float), ('time', np.float),('trdp1',np.float),('trdp0',np.float),('dt',np.float),('value' ,np.float)] allRics=np.sort(allRics,order='time') # This doesn't work Please notice that I am using vstack to generate the array. I just found out that I am able to sort numpy array before I set the dtype in following crude way allRics=allRics[allRics[:,2].argsort()] # This works I am able to continue with my code right now but not sure why structured array could not be sorted. Alok Jadhav GAT IT Hong Kong +852 2101 6274 (*852 6274) From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Anthony Scopatz Sent: Tuesday, September 11, 2012 2:04 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] numpy sort is not working On Tue, Sep 11, 2012 at 12:52 AM, Jadhav, Alok wrote: Hi Travis, Very Strange. I am on version 1.6.2 L What could I be missing. I started using numpy quite recently. Is there a way to share the data with you? Hi Alok, Typically, a self-contained example which reproduces the error and generates its own data is more useful / valuable than sharing datasets. if you could come up with this, that'd be great! Be Well Anthony Regards, Alok Jadhav GAT IT Hong Kong +852 2101 6274 (*852 6274) From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Travis Oliphant Sent: Tuesday, September 11, 2012 11:58 AM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] numpy sort is not working Hey Alok, This is worth taking a look. What version of NumPy are you using? It is not related directly to the issue you referenced as that was an endian-ness issue and your data is native-order. 
Your example seems to work for me (with a simulated case on 1.6.1) Best, -Travis On Sep 10, 2012, at 10:46 PM, Jadhav, Alok wrote: Hi everyone, I have a numpy array of dimensions >>> allRics.shape (583760, 1) To sort the array, I set the dtype of the array as follows: allRics.dtype = [('idx', np.float), ('opened', np.float), ('time', np.float),('trdp1',np.float),('trdp0',np.float),('dt',np.float),('value' ,np.float)] >>> allRics.dtype dtype([('idx', '>> x=np.sort(allRics,order='time') >>> x[17330:17350]['time'] array([[ 61184.4 ], [ 61188.51 ], [ 61188.979], [ 61188.979], [ 61189.989], [ 61191.66 ], [ 61194.35 ], [ 61194.35 ], [ 61198.79 ], [ 61198.145], [ 36126.217], [ 36126.217], [ 36126.218], [ 36126.218], [ 36126.219], [ 36126.271], [ 36126.271], [ 36126.271], [ 36126.293], [ 36126.293]]) Time column doesn't change its order. Could someone please advise what is missing here? Is this related to the bug http://www.mail-archive.com/numpy-discussion at scipy.org/msg23060.html (from 2010). Regards, Alok Alok Jadhav CREDIT SUISSE AG GAT IT Hong Kong, KVAG 67 International Commerce Centre | Hong Kong | Hong Kong Phone +852 2101 6274 | Mobile +852 9169 7172 alok.jadhav at credit-suisse.com | www.credit-suisse.com ======================================================================== ====== Please access the attached hyperlink for an important electronic communications disclaimer: http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html ======================================================================== ====== _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ======================================================================== ====== Please access the attached hyperlink for an important electronic communications disclaimer: http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html ======================================================================== ====== _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion =============================================================================== Please access the attached hyperlink for an important electronic communications disclaimer: http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html =============================================================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From markbak at gmail.com Tue Sep 11 06:01:03 2012 From: markbak at gmail.com (Mark Bakker) Date: Tue, 11 Sep 2012 12:01:03 +0200 Subject: [Numpy-discussion] status of 'present' to use optional parameters in f2py Message-ID: Hello List, I searched the list whether it is possible to use optional arguments in fortran functions that are compiled with f2py. In 2008, this feature was not yet included and appeared to be non-trivial. Does anybody know the status of putting that feature in f2py? Thanks, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From rhattersley at gmail.com Tue Sep 11 09:19:52 2012 From: rhattersley at gmail.com (Richard Hattersley) Date: Tue, 11 Sep 2012 14:19:52 +0100 Subject: [Numpy-discussion] easy way to change part of only unmasked elements value? 
In-Reply-To: References: Message-ID: Hi Chao, If you don't mind modifying masked values, then if you write to the underlying ndarray it won't touch the mask: >>> a = np.ma.masked_less(np.arange(10),5) >>> a.base[3:6] = 1 >>> a masked_array(data = [-- -- -- -- -- 1 6 7 8 9], mask = [ True True True True True False False False False False], fill_value = 999999) Regards, Richard Hattersley On 10 September 2012 17:43, Chao YUE wrote: > Dear all numpy users, > > what's the easy way if I just want to change part of the unmasked array > elements into another new value? like an example below: > in my real case, I would like to change a subgrid of a masked numpy array > to another value, but this grid include both masked and unmasked data. > If I do a simple array[index1:index2, index3:index4] = another_value, > those data with original True mask will change into False. I am using numpy > 1.6.2. > Thanks for any ideas. > > In [91]: a = np.ma.masked_less(np.arange(10),5) > > In [92]: or_mask = a.mask.copy() > In [93]: a > Out[93]: > masked_array(data = [-- -- -- -- -- 5 6 7 8 9], > mask = [ True True True True True False False False False > False], > fill_value = 999999) > > > In [94]: a[3:6]=1 > > In [95]: a > Out[95]: > masked_array(data = [-- -- -- 1 1 1 6 7 8 9], > mask = [ True True True False False False False False False > False], > fill_value = 999999) > > > In [96]: a = np.ma.masked_array(a,mask=or_mask) > > In [97]: a > Out[97]: > masked_array(data = [-- -- -- -- -- 1 6 7 8 9], > mask = [ True True True True True False False False False > False], > fill_value = 999999) > > Chao > > -- > > *********************************************************************************** > Chao YUE > Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) > UMR 1572 CEA-CNRS-UVSQ > Batiment 712 - Pe 119 > 91191 GIF Sur YVETTE Cedex > Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 > > ************************************************************************************ > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Tue Sep 11 10:07:05 2012 From: travis at continuum.io (Travis Oliphant) Date: Tue, 11 Sep 2012 09:07:05 -0500 Subject: [Numpy-discussion] numpy sort is not working In-Reply-To: References: Message-ID: I am wondering of this has to do with the size of the array. It looks like the array is sorted --- but in chunks. -- Travis Oliphant (on a mobile) 512-826-7480 On Sep 10, 2012, at 10:46 PM, "Jadhav, Alok" wrote: > > Hi everyone, > > I have a numpy array of dimensions > > >>> allRics.shape > (583760, 1) > > To sort the array, I set the dtype of the array as follows: > > allRics.dtype = [('idx', np.float), ('opened', np.float), ('time', np.float),('trdp1',np.float),('trdp0',np.float),('dt',np.float),('value',np.float)] > >>> allRics.dtype > dtype([('idx', ' > I checked and the endianness in dtype is correct. > > When I sort the array, the output array of sort is same as original array without any change. I want to sort the allRics numpy array on ?time?. I do the following. 
> > >>> x=np.sort(allRics,order='time') > >>> x[17330:17350]['time'] > array([[ 61184.4 ], > [ 61188.51 ], > [ 61188.979], > [ 61188.979], > [ 61189.989], > [ 61191.66 ], > [ 61194.35 ], > [ 61194.35 ], > [ 61198.79 ], > [ 61198.145], > [ 36126.217], > [ 36126.217], > [ 36126.218], > [ 36126.218], > [ 36126.219], > [ 36126.271], > [ 36126.271], > [ 36126.271], > [ 36126.293], > [ 36126.293]]) > > Time column doesn?t change its order. Could someone please advise what is missing here? Is this related to the bug > http://www.mail-archive.com/numpy-discussion at scipy.org/msg23060.html (from 2010). > > Regards, > Alok > > > > > > Alok Jadhav > CREDIT SUISSE AG > GAT IT Hong Kong, KVAG 67 > International Commerce Centre | Hong Kong | Hong Kong > Phone +852 2101 6274 | Mobile +852 9169 7172 > alok.jadhav at credit-suisse.com | www.credit-suisse.com > > > ============================================================================== > Please access the attached hyperlink for an important electronic communications disclaimer: > http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html > ============================================================================== > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From alok.jadhav at credit-suisse.com Tue Sep 11 11:10:03 2012 From: alok.jadhav at credit-suisse.com (Jadhav, Alok) Date: Tue, 11 Sep 2012 23:10:03 +0800 Subject: [Numpy-discussion] numpy sort is not working In-Reply-To: References: Message-ID: The sorted array you see is the same as original array. I have replied to code to generate the below data. (done in a loop. Each loop generates sorted numpy arrayas it reads from file) and combines all arrays into a single numpy array. I need to sort the final array into a single sorted array. It could be because of array size. It maybe silently failing somewhere? I don?t see any error, but output is not sorted. Array sorting works fine if the array is not structured. Alok Jadhav GAT IT Hong Kong +852 2101 6274 (*852 6274) From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Travis Oliphant Sent: Tuesday, September 11, 2012 10:07 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] numpy sort is not working I am wondering of this has to do with the size of the array. It looks like the array is sorted --- but in chunks. -- Travis Oliphant (on a mobile) 512-826-7480 On Sep 10, 2012, at 10:46 PM, "Jadhav, Alok" wrote: Hi everyone, I have a numpy array of dimensions >>> allRics.shape (583760, 1) To sort the array, I set the dtype of the array as follows: allRics.dtype = [('idx', np.float), ('opened', np.float), ('time', np.float),('trdp1',np.float),('trdp0',np.float),('dt',np.float),('value',np.float)] >>> allRics.dtype dtype([('idx', '>> x=np.sort(allRics,order='time') >>> x[17330:17350]['time'] array([[ 61184.4 ], [ 61188.51 ], [ 61188.979], [ 61188.979], [ 61189.989], [ 61191.66 ], [ 61194.35 ], [ 61194.35 ], [ 61198.79 ], [ 61198.145], [ 36126.217], [ 36126.217], [ 36126.218], [ 36126.218], [ 36126.219], [ 36126.271], [ 36126.271], [ 36126.271], [ 36126.293], [ 36126.293]]) Time column doesn?t change its order. Could someone please advise what is missing here? Is this related to the bug http://www.mail-archive.com/numpy-discussion at scipy.org/msg23060.html (from 2010). 
Regards, Alok Alok Jadhav CREDIT SUISSE AG GAT IT Hong Kong, KVAG 67 International Commerce Centre | Hong Kong | Hong Kong Phone +852 2101 6274 | Mobile +852 9169 7172 alok.jadhav at credit-suisse.com | www.credit-suisse.com ============================================================================== Please access the attached hyperlink for an important electronic communications disclaimer: http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html ============================================================================== _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion =============================================================================== Please access the attached hyperlink for an important electronic communications disclaimer: http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html =============================================================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From chaoyuejoy at gmail.com Tue Sep 11 11:24:19 2012 From: chaoyuejoy at gmail.com (Chao YUE) Date: Tue, 11 Sep 2012 17:24:19 +0200 Subject: [Numpy-discussion] easy way to change part of only unmasked elements value? In-Reply-To: References: Message-ID: Dear Richard, this is what I want. Thanks! Chao On Tue, Sep 11, 2012 at 3:19 PM, Richard Hattersley wrote: > Hi Chao, > > If you don't mind modifying masked values, then if you write to the > underlying ndarray it won't touch the mask: > > >>> a = np.ma.masked_less(np.arange(10),5) > >>> a.base[3:6] = 1 > >>> a > > masked_array(data = [-- -- -- -- -- 1 6 7 8 9], > mask = [ True True True True True False False False False > False], > fill_value = 999999) > > Regards, > Richard Hattersley > > > On 10 September 2012 17:43, Chao YUE wrote: > >> Dear all numpy users, >> >> what's the easy way if I just want to change part of the unmasked array >> elements into another new value? like an example below: >> in my real case, I would like to change a subgrid of a masked numpy array >> to another value, but this grid include both masked and unmasked data. >> If I do a simple array[index1:index2, index3:index4] = another_value, >> those data with original True mask will change into False. I am using numpy >> 1.6.2. >> Thanks for any ideas. 
>> >> In [91]: a = np.ma.masked_less(np.arange(10),5) >> >> In [92]: or_mask = a.mask.copy() >> In [93]: a >> Out[93]: >> masked_array(data = [-- -- -- -- -- 5 6 7 8 9], >> mask = [ True True True True True False False False >> False False], >> fill_value = 999999) >> >> >> In [94]: a[3:6]=1 >> >> In [95]: a >> Out[95]: >> masked_array(data = [-- -- -- 1 1 1 6 7 8 9], >> mask = [ True True True False False False False False >> False False], >> fill_value = 999999) >> >> >> In [96]: a = np.ma.masked_array(a,mask=or_mask) >> >> In [97]: a >> Out[97]: >> masked_array(data = [-- -- -- -- -- 1 6 7 8 9], >> mask = [ True True True True True False False False >> False False], >> fill_value = 999999) >> >> Chao >> >> -- >> >> *********************************************************************************** >> Chao YUE >> Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) >> UMR 1572 CEA-CNRS-UVSQ >> Batiment 712 - Pe 119 >> 91191 GIF Sur YVETTE Cedex >> Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 >> >> ************************************************************************************ >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Wed Sep 12 09:46:35 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 12 Sep 2012 14:46:35 +0100 Subject: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release Message-ID: Hi, I just noticed that this works for numpy 1.6.1: In [36]: np.concatenate(([2, 3], [1]), 1) Out[36]: array([2, 3, 1]) but the beta release branch: In [3]: np.concatenate(([2, 3], [1]), 1) --------------------------------------------------------------------------- IndexError Traceback (most recent call last) /Users/mb312/ in () ----> 1 np.concatenate(([2, 3], [1]), 1) IndexError: axis 1 out of bounds [0, 1) In the interests of backward compatibility maybe it would be better to raise a warning for this release, rather than an error? 
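To make the suggestion concrete, the compatibility behaviour being asked for might look roughly like the following at the Python level. This only illustrates the semantics (warn and fall back instead of raising for the 1-D corner case); the real fix would live in the C concatenate code, and the function name here is invented:

import warnings
import numpy as np

def concatenate_compat(arrays, axis=0):
    # axis=1 on 1-D inputs was accepted by 1.6.1 (see above), since there is
    # only one axis to join along anyway; keep accepting it, but warn.
    arrays = [np.asarray(a) for a in arrays]
    if axis != 0 and all(a.ndim == 1 for a in arrays):
        warnings.warn("axis != 0 for 1-D inputs is deprecated; using axis=0",
                      DeprecationWarning)
        axis = 0
    return np.concatenate(arrays, axis)

>>> concatenate_compat(([2, 3], [1]), 1)    # warns, then matches the 1.6.1 result
array([2, 3, 1])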
Best, Matthew From njs at pobox.com Wed Sep 12 11:19:35 2012 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 12 Sep 2012 16:19:35 +0100 Subject: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release In-Reply-To: References: Message-ID: On Wed, Sep 12, 2012 at 2:46 PM, Matthew Brett wrote: > Hi, > > I just noticed that this works for numpy 1.6.1: > > In [36]: np.concatenate(([2, 3], [1]), 1) > Out[36]: array([2, 3, 1]) > > but the beta release branch: > > In [3]: np.concatenate(([2, 3], [1]), 1) > --------------------------------------------------------------------------- > IndexError Traceback (most recent call last) > /Users/mb312/ in () > ----> 1 np.concatenate(([2, 3], [1]), 1) > > IndexError: axis 1 out of bounds [0, 1) > > In the interests of backward compatibility maybe it would be better to > raise a warning for this release, rather than an error? Yep, that'd be a good idea. Want to write a patch? :-) -n From matthew.brett at gmail.com Wed Sep 12 14:36:10 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 12 Sep 2012 19:36:10 +0100 Subject: [Numpy-discussion] Contiguity of result of astype changed - intentional? Message-ID: Hi, We hit a subtle behavior change for the ``astype`` array method between 1.6.1 and 1.7.0 beta. In 1.6.1: In [18]: a = np.arange(24).reshape((2, 3, 4)).transpose((1, 2, 0)) In [19]: a.flags Out[19]: C_CONTIGUOUS : False F_CONTIGUOUS : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [20]: b = a.astype(float) In [21]: b.flags Out[21]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [22]: b.strides Out[22]: (64, 16, 8) So - ``a.astype(float)`` here has made a new C-contiguous array, somewhat as implied by the 'copy' explanation in the docstring. In 1.7.0 beta, ``a`` is the same but: In [22]: b.flags Out[22]: C_CONTIGUOUS : False F_CONTIGUOUS : False OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [23]: b.strides Out[23]: (32, 8, 96) Is this intended? Is there a performance reason to keep the same strides in 1.7.0? Thanks for any pointers, Matthew From travis at continuum.io Wed Sep 12 14:58:49 2012 From: travis at continuum.io (Travis Oliphant) Date: Wed, 12 Sep 2012 13:58:49 -0500 Subject: [Numpy-discussion] Contiguity of result of astype changed - intentional? In-Reply-To: References: Message-ID: <24314CE2-6FA9-4EFD-B203-A9FBDEF75381@continuum.io> On Sep 12, 2012, at 1:36 PM, Matthew Brett wrote: > Hi, > > We hit a subtle behavior change for the ``astype`` array method > between 1.6.1 and 1.7.0 beta. > > In 1.6.1: > > > In [18]: a = np.arange(24).reshape((2, 3, 4)).transpose((1, 2, 0)) > > In [19]: a.flags > Out[19]: > C_CONTIGUOUS : False > F_CONTIGUOUS : False > OWNDATA : False > WRITEABLE : True > ALIGNED : True > UPDATEIFCOPY : False > > In [20]: b = a.astype(float) > > In [21]: b.flags > Out[21]: > C_CONTIGUOUS : True > F_CONTIGUOUS : False > OWNDATA : True > WRITEABLE : True > ALIGNED : True > UPDATEIFCOPY : False > > In [22]: b.strides > Out[22]: (64, 16, 8) > > So - ``a.astype(float)`` here has made a new C-contiguous array, > somewhat as implied by the 'copy' explanation in the docstring. In > 1.7.0 beta, ``a`` is the same but: > > In [22]: b.flags > Out[22]: > C_CONTIGUOUS : False > F_CONTIGUOUS : False > OWNDATA : True > WRITEABLE : True > ALIGNED : True > UPDATEIFCOPY : False > > In [23]: b.strides > Out[23]: (32, 8, 96) > > Is this intended? 
Is there a performance reason to keep the same > strides in 1.7.0? I believe that this could be because in 1.7.0, NumPy was changed so that copying does not always default to "C-order" but to "Keep-order". So, in 1.7.0, the strides of b is governed by the strides of a, while in 1.6.1, the strides of b is C-order (because of the copy). -Travis From matthew.brett at gmail.com Wed Sep 12 15:47:51 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 12 Sep 2012 20:47:51 +0100 Subject: [Numpy-discussion] Contiguity of result of astype changed - intentional? In-Reply-To: <24314CE2-6FA9-4EFD-B203-A9FBDEF75381@continuum.io> References: <24314CE2-6FA9-4EFD-B203-A9FBDEF75381@continuum.io> Message-ID: Hi, On Wed, Sep 12, 2012 at 7:58 PM, Travis Oliphant wrote: > > On Sep 12, 2012, at 1:36 PM, Matthew Brett wrote: > >> Hi, >> >> We hit a subtle behavior change for the ``astype`` array method >> between 1.6.1 and 1.7.0 beta. >> >> In 1.6.1: >> >> >> In [18]: a = np.arange(24).reshape((2, 3, 4)).transpose((1, 2, 0)) >> >> In [19]: a.flags >> Out[19]: >> C_CONTIGUOUS : False >> F_CONTIGUOUS : False >> OWNDATA : False >> WRITEABLE : True >> ALIGNED : True >> UPDATEIFCOPY : False >> >> In [20]: b = a.astype(float) >> >> In [21]: b.flags >> Out[21]: >> C_CONTIGUOUS : True >> F_CONTIGUOUS : False >> OWNDATA : True >> WRITEABLE : True >> ALIGNED : True >> UPDATEIFCOPY : False >> >> In [22]: b.strides >> Out[22]: (64, 16, 8) >> >> So - ``a.astype(float)`` here has made a new C-contiguous array, >> somewhat as implied by the 'copy' explanation in the docstring. In >> 1.7.0 beta, ``a`` is the same but: >> >> In [22]: b.flags >> Out[22]: >> C_CONTIGUOUS : False >> F_CONTIGUOUS : False >> OWNDATA : True >> WRITEABLE : True >> ALIGNED : True >> UPDATEIFCOPY : False >> >> In [23]: b.strides >> Out[23]: (32, 8, 96) >> >> Is this intended? Is there a performance reason to keep the same >> strides in 1.7.0? > > I believe that this could be because in 1.7.0, NumPy was changed so that copying does not always default to "C-order" but to "Keep-order". So, in 1.7.0, the strides of b is governed by the strides of a, while in 1.6.1, the strides of b is C-order (because of the copy). > Thanks for the reply. So maybe the bottom line is that the user should not assume any contiguity from ``astype``? If that's the case I'll submit a docstring PR to say that. Cheers, Matthew From travis at continuum.io Wed Sep 12 16:27:19 2012 From: travis at continuum.io (Travis Oliphant) Date: Wed, 12 Sep 2012 15:27:19 -0500 Subject: [Numpy-discussion] Contiguity of result of astype changed - intentional? In-Reply-To: References: <24314CE2-6FA9-4EFD-B203-A9FBDEF75381@continuum.io> Message-ID: <8A8CA780-BB15-408C-AE52-8D1C68BD7A31@continuum.io> >>> >>> Is this intended? Is there a performance reason to keep the same >>> strides in 1.7.0? >> >> I believe that this could be because in 1.7.0, NumPy was changed so that copying does not always default to "C-order" but to "Keep-order". So, in 1.7.0, the strides of b is governed by the strides of a, while in 1.6.1, the strides of b is C-order (because of the copy). >> > > Thanks for the reply. > > So maybe the bottom line is that the user should not assume any > contiguity from ``astype``? If that's the case I'll submit a > docstring PR to say that. > Yes, that would be a great addition to the docstring. Mark, can you confirm this is the desired behavior? Ondrej, this would be something to put in the release notes, if it isn't already. 
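For code that was relying on ``astype`` handing back a C-contiguous copy, it seems safest to say so explicitly rather than depend on either default. A sketch that works the same on 1.6.x and 1.7.x (and, if the expanded 1.7 signature is available, ``a.astype(float, order='C')`` expresses the same thing in one call):

>>> import numpy as np
>>> a = np.arange(24).reshape((2, 3, 4)).transpose((1, 2, 0))
>>> b = np.ascontiguousarray(a.astype(float))   # force C order regardless of what astype returns
>>> b.flags['C_CONTIGUOUS']
True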
Thanks, -Trvis From ondrej.certik at gmail.com Wed Sep 12 17:24:09 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Wed, 12 Sep 2012 14:24:09 -0700 Subject: [Numpy-discussion] Contiguity of result of astype changed - intentional? In-Reply-To: <8A8CA780-BB15-408C-AE52-8D1C68BD7A31@continuum.io> References: <24314CE2-6FA9-4EFD-B203-A9FBDEF75381@continuum.io> <8A8CA780-BB15-408C-AE52-8D1C68BD7A31@continuum.io> Message-ID: Hi Matt, On Wed, Sep 12, 2012 at 1:27 PM, Travis Oliphant wrote: >>>> >>>> Is this intended? Is there a performance reason to keep the same >>>> strides in 1.7.0? >>> >>> I believe that this could be because in 1.7.0, NumPy was changed so that copying does not always default to "C-order" but to "Keep-order". So, in 1.7.0, the strides of b is governed by the strides of a, while in 1.6.1, the strides of b is C-order (because of the copy). >>> >> >> Thanks for the reply. >> >> So maybe the bottom line is that the user should not assume any >> contiguity from ``astype``? If that's the case I'll submit a >> docstring PR to say that. >> > > Yes, that would be a great addition to the docstring. Mark, can you confirm this is the desired behavior? Ondrej, this would be something to put in the release notes, if it isn't already. If you could submit the PR with the docs, that'd be awesome. In the meantime, I've created an issue for it: https://github.com/numpy/numpy/issues/437 and put it into my TODO list: https://github.com/numpy/numpy/issues/396 Ondrej From matthew.brett at gmail.com Thu Sep 13 06:12:40 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 13 Sep 2012 11:12:40 +0100 Subject: [Numpy-discussion] Obscure code in concatenate code path? Message-ID: Hi, While writing some tests for np.concatenate, I ran foul of this code: if (axis >= NPY_MAXDIMS) { ret = PyArray_ConcatenateFlattenedArrays(narrays, arrays, NPY_CORDER); } else { ret = PyArray_ConcatenateArrays(narrays, arrays, axis); } in multiarraymodule.c So, if the user passes an axis >= (by default) 32 the arrays to concatenate get flattened, and must all have the same number of elements (it turns out). This seems obscure. Obviously it is not likely that someone would pass in an axis no >= 32 by accident, but if they did, they would not get the result they expect. Is there some code-path that needs this? Is there another way of doing it? Best, Matthew From matthew.brett at gmail.com Thu Sep 13 06:31:29 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 13 Sep 2012 11:31:29 +0100 Subject: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release In-Reply-To: References: Message-ID: On Wed, Sep 12, 2012 at 4:19 PM, Nathaniel Smith wrote: > On Wed, Sep 12, 2012 at 2:46 PM, Matthew Brett wrote: >> Hi, >> >> I just noticed that this works for numpy 1.6.1: >> >> In [36]: np.concatenate(([2, 3], [1]), 1) >> Out[36]: array([2, 3, 1]) >> >> but the beta release branch: >> >> In [3]: np.concatenate(([2, 3], [1]), 1) >> --------------------------------------------------------------------------- >> IndexError Traceback (most recent call last) >> /Users/mb312/ in () >> ----> 1 np.concatenate(([2, 3], [1]), 1) >> >> IndexError: axis 1 out of bounds [0, 1) >> >> In the interests of backward compatibility maybe it would be better to >> raise a warning for this release, rather than an error? > > Yep, that'd be a good idea. Want to write a patch? 
:-) https://github.com/numpy/numpy/pull/440 Thanks, Matthew From matthew.brett at gmail.com Thu Sep 13 08:59:06 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 13 Sep 2012 13:59:06 +0100 Subject: [Numpy-discussion] Contiguity of result of astype changed - intentional? In-Reply-To: References: <24314CE2-6FA9-4EFD-B203-A9FBDEF75381@continuum.io> <8A8CA780-BB15-408C-AE52-8D1C68BD7A31@continuum.io> Message-ID: Hi, On Wed, Sep 12, 2012 at 10:24 PM, Ond?ej ?ert?k wrote: > Hi Matt, > > On Wed, Sep 12, 2012 at 1:27 PM, Travis Oliphant wrote: >>>>> >>>>> Is this intended? Is there a performance reason to keep the same >>>>> strides in 1.7.0? >>>> >>>> I believe that this could be because in 1.7.0, NumPy was changed so that copying does not always default to "C-order" but to "Keep-order". So, in 1.7.0, the strides of b is governed by the strides of a, while in 1.6.1, the strides of b is C-order (because of the copy). >>>> >>> >>> Thanks for the reply. >>> >>> So maybe the bottom line is that the user should not assume any >>> contiguity from ``astype``? If that's the case I'll submit a >>> docstring PR to say that. >>> >> >> Yes, that would be a great addition to the docstring. Mark, can you confirm this is the desired behavior? Ondrej, this would be something to put in the release notes, if it isn't already. > > If you could submit the PR with the docs, that'd be awesome. In the > meantime, I've created an issue for it: > > https://github.com/numpy/numpy/issues/437 > > and put it into my TODO list: > > https://github.com/numpy/numpy/issues/396 Sorry - inadequate research on my part - the current docstring for ``astype`` is clear and comprehensive, and the change was obviously intentional. Cheers, Matthew From njs at pobox.com Thu Sep 13 09:40:06 2012 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 13 Sep 2012 14:40:06 +0100 Subject: [Numpy-discussion] Obscure code in concatenate code path? In-Reply-To: References: Message-ID: On Thu, Sep 13, 2012 at 11:12 AM, Matthew Brett wrote: > Hi, > > While writing some tests for np.concatenate, I ran foul of this code: > > if (axis >= NPY_MAXDIMS) { > ret = PyArray_ConcatenateFlattenedArrays(narrays, arrays, NPY_CORDER); > } > else { > ret = PyArray_ConcatenateArrays(narrays, arrays, axis); > } > > in multiarraymodule.c How deeply weird. > So, if the user passes an axis >= (by default) 32 the arrays to > concatenate get flattened, and must all have the same number of > elements (it turns out). This seems obscure. Obviously it is not > likely that someone would pass in an axis no >= 32 by accident, but if > they did, they would not get the result they expect. Is there some > code-path that needs this? Is there another way of doing it? This behaviour seems to be older -- I can reproduce it empirically with 1.6.2. But the actual code you encountered was introduced along with PyArray_ConcatenateFlattenedArrays itself by Mark Wiebe in 9194b3af. So @Mark, you were the last one to look at this closely, any thoughts? :-) Though, in 1.6.2, there doesn't seem to be any requirement that the arrays have the same length: In [11]: np.concatenate(([[1, 2]], [[3]]), axis=100) Out[11]: array([1, 2, 3]) My first guess is that this was some ill-advised "defensive programming" thing where someone wanted to do *something* with these weird axis arguments, and picked *something* at more-or-less random. I like that theory better than the one where someone introduced this on purpose and then used it... 
It might even be that rare case where the best solution is to just rip it out and see if anyone notices. -n From travis at continuum.io Thu Sep 13 10:01:05 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 13 Sep 2012 09:01:05 -0500 Subject: [Numpy-discussion] Obscure code in concatenate code path? In-Reply-To: References: Message-ID: On Sep 13, 2012, at 8:40 AM, Nathaniel Smith wrote: > On Thu, Sep 13, 2012 at 11:12 AM, Matthew Brett wrote: >> Hi, >> >> While writing some tests for np.concatenate, I ran foul of this code: >> >> if (axis >= NPY_MAXDIMS) { >> ret = PyArray_ConcatenateFlattenedArrays(narrays, arrays, NPY_CORDER); >> } >> else { >> ret = PyArray_ConcatenateArrays(narrays, arrays, axis); >> } >> >> in multiarraymodule.c > > How deeply weird This is expected behavior. It's how the concatenate Python function manages to handle axis=None to flatten the arrays before concatenation. This has been in NumPy since 1.0 and should not be changed without deprecation warnings which I am -0 on. Now, it is true that the C-API could have been written differently (I think this is what Mark was trying to encourage) so that there are two C-API functions and they are dispatched separately from the array_concatenate method depending on whether or not a None is passed in. But, the behavior is documented and has been for a long time. Reference PyArray_AxisConverter (which turns a "None" Python argument into an axis=MAX_DIMS). This is consistent behavior throughout the C-API. -Travis From warren.weckesser at enthought.com Thu Sep 13 12:39:25 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Thu, 13 Sep 2012 11:39:25 -0500 Subject: [Numpy-discussion] Obscure code in concatenate code path? In-Reply-To: References: Message-ID: On Thu, Sep 13, 2012 at 9:01 AM, Travis Oliphant wrote: > > On Sep 13, 2012, at 8:40 AM, Nathaniel Smith wrote: > > > On Thu, Sep 13, 2012 at 11:12 AM, Matthew Brett > wrote: > >> Hi, > >> > >> While writing some tests for np.concatenate, I ran foul of this code: > >> > >> if (axis >= NPY_MAXDIMS) { > >> ret = PyArray_ConcatenateFlattenedArrays(narrays, arrays, > NPY_CORDER); > >> } > >> else { > >> ret = PyArray_ConcatenateArrays(narrays, arrays, axis); > >> } > >> > >> in multiarraymodule.c > > > > How deeply weird > > > This is expected behavior. Heh, I guess "expected" is subjective: In [23]: np.__version__ Out[23]: '1.6.1' In [24]: a = zeros((2,2)) In [25]: b = ones((2,3)) In [26]: concatenate((a, b), axis=0) # Expected error. --------------------------------------------------------------------------- ValueError Traceback (most recent call last) /Users/warren/gitwork/class-material/demo/pytables/ in () ----> 1 concatenate((a, b), axis=0) # Expected error. ValueError: array dimensions must agree except for d_0 In [27]: concatenate((a, b), axis=1) # Normal behavior. Out[27]: array([[ 0., 0., 1., 1., 1.], [ 0., 0., 1., 1., 1.]]) In [28]: concatenate((a, b), axis=2) # Cryptic error message. --------------------------------------------------------------------------- ValueError Traceback (most recent call last) /Users/warren/gitwork/class-material/demo/pytables/ in () ----> 1 concatenate((a, b), axis=2) # Cryptic error message. ValueError: bad axis1 argument to swapaxes In [29]: concatenate((a, b), axis=32) # What the... ? Out[29]: array([ 0., 0., 0., 0., 1., 1., 1., 1., 1., 1.]) I would expect an error, consistent with the behavior when 1 < axis < 32. 
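For what it's worth, the documented spelling of the flattening behavior is axis=None (as in the explanation quoted below); axis=32 just happens to fall into the same branch. A quick sketch of what I mean, continuing the session above (expected output, not re-run):

In [30]: concatenate((a, b), axis=None)   # documented: flatten, then join
Out[30]: array([ 0.,  0.,  0.,  0.,  1.,  1.,  1.,  1.,  1.,  1.])

which is fine when you ask for it by name, but surprising when it comes from a too-large integer axis.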
Warren > It's how the concatenate Python function manages to handle axis=None to > flatten the arrays before concatenation. This has been in NumPy since > 1.0 and should not be changed without deprecation warnings which I am -0 on. > > Now, it is true that the C-API could have been written differently (I > think this is what Mark was trying to encourage) so that there are two > C-API functions and they are dispatched separately from the > array_concatenate method depending on whether or not a None is passed in. > But, the behavior is documented and has been for a long time. > > Reference PyArray_AxisConverter (which turns a "None" Python argument into > an axis=MAX_DIMS). This is consistent behavior throughout the C-API. > > -Travis > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Thu Sep 13 12:51:52 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 13 Sep 2012 11:51:52 -0500 Subject: [Numpy-discussion] Obscure code in concatenate code path? In-Reply-To: References: Message-ID: On Sep 13, 2012, at 11:39 AM, Warren Weckesser wrote: > > > On Thu, Sep 13, 2012 at 9:01 AM, Travis Oliphant wrote: > > On Sep 13, 2012, at 8:40 AM, Nathaniel Smith wrote: > > > On Thu, Sep 13, 2012 at 11:12 AM, Matthew Brett wrote: > >> Hi, > >> > >> While writing some tests for np.concatenate, I ran foul of this code: > >> > >> if (axis >= NPY_MAXDIMS) { > >> ret = PyArray_ConcatenateFlattenedArrays(narrays, arrays, NPY_CORDER); > >> } > >> else { > >> ret = PyArray_ConcatenateArrays(narrays, arrays, axis); > >> } > >> > >> in multiarraymodule.c > > > > How deeply weird > > > This is expected behavior. > > > Heh, I guess "expected" is subjective: "Expected" only in the sense that the current C-API has been intentional for 6 years. The side-effect of the Python-side being confusing can be changed --- it just hasn't been yet --- the documented approach is to use None. A patch to PyArray_AxisConverter might be the answer. -Travis > > In [23]: np.__version__ > Out[23]: '1.6.1' > > In [24]: a = zeros((2,2)) > > In [25]: b = ones((2,3)) > > In [26]: concatenate((a, b), axis=0) # Expected error. > --------------------------------------------------------------------------- > ValueError Traceback (most recent call last) > /Users/warren/gitwork/class-material/demo/pytables/ in () > ----> 1 concatenate((a, b), axis=0) # Expected error. > > ValueError: array dimensions must agree except for d_0 > > In [27]: concatenate((a, b), axis=1) # Normal behavior. > Out[27]: > array([[ 0., 0., 1., 1., 1.], > [ 0., 0., 1., 1., 1.]]) > > In [28]: concatenate((a, b), axis=2) # Cryptic error message. > --------------------------------------------------------------------------- > ValueError Traceback (most recent call last) > /Users/warren/gitwork/class-material/demo/pytables/ in () > ----> 1 concatenate((a, b), axis=2) # Cryptic error message. > > ValueError: bad axis1 argument to swapaxes > > In [29]: concatenate((a, b), axis=32) # What the... ? > Out[29]: array([ 0., 0., 0., 0., 1., 1., 1., 1., 1., 1.]) > > > I would expect an error, consistent with the behavior when 1 < axis < 32. > > > Warren > > > > It's how the concatenate Python function manages to handle axis=None to flatten the arrays before concatenation. 
This has been in NumPy since 1.0 and should not be changed without deprecation warnings which I am -0 on. > > Now, it is true that the C-API could have been written differently (I think this is what Mark was trying to encourage) so that there are two C-API functions and they are dispatched separately from the array_concatenate method depending on whether or not a None is passed in. But, the behavior is documented and has been for a long time. > > Reference PyArray_AxisConverter (which turns a "None" Python argument into an axis=MAX_DIMS). This is consistent behavior throughout the C-API. > > -Travis > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidmenhur at gmail.com Thu Sep 13 12:57:32 2012 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Thu, 13 Sep 2012 18:57:32 +0200 Subject: [Numpy-discussion] Obscure code in concatenate code path? In-Reply-To: References: Message-ID: On Thu, Sep 13, 2012 at 6:39 PM, Warren Weckesser wrote: > I would expect an error, consistent with the behavior when 1 < axis < 32. In that case, you are hitting the dimension limit. np.concatenate((a,b), axis=31) ValueError: bad axis1 argument to swapaxes Where axis=32, axis=3500, axis=None all return the flattened array. I have been trying with other functions and got something interesting. With the same a, b as before: np.sum((a,b), axis=0) ValueError: operands could not be broadcast together with shapes (2) (3) np.sum((a,b), axis=1) array([[ 0. 0.], [ 2. 2. 2.]], dtype=object) np.sum((a,b), axis=2) ValueError: axis(=2) out of bounds This is to be expected, but now this is not consistent: np.sum((a,b), axis=32) ValueError: operands could not be broadcast together with shapes (2) (3) np.sum((a,b), axis=500) ValueError: axis(=500) out of bounds From matthew.brett at gmail.com Thu Sep 13 13:34:07 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 13 Sep 2012 18:34:07 +0100 Subject: [Numpy-discussion] Obscure code in concatenate code path? In-Reply-To: References: Message-ID: Hi, On Thu, Sep 13, 2012 at 3:01 PM, Travis Oliphant wrote: > > On Sep 13, 2012, at 8:40 AM, Nathaniel Smith wrote: > >> On Thu, Sep 13, 2012 at 11:12 AM, Matthew Brett wrote: >>> Hi, >>> >>> While writing some tests for np.concatenate, I ran foul of this code: >>> >>> if (axis >= NPY_MAXDIMS) { >>> ret = PyArray_ConcatenateFlattenedArrays(narrays, arrays, NPY_CORDER); >>> } >>> else { >>> ret = PyArray_ConcatenateArrays(narrays, arrays, axis); >>> } >>> >>> in multiarraymodule.c >> >> How deeply weird > > > This is expected behavior. It's how the concatenate Python function manages to handle axis=None to flatten the arrays before concatenation. This has been in NumPy since 1.0 and should not be changed without deprecation warnings which I am -0 on. > > Now, it is true that the C-API could have been written differently (I think this is what Mark was trying to encourage) so that there are two C-API functions and they are dispatched separately from the array_concatenate method depending on whether or not a None is passed in. But, the behavior is documented and has been for a long time. 
> > Reference PyArray_AxisConverter (which turns a "None" Python argument into an axis=MAX_DIMS). This is consistent behavior throughout the C-API. How about something like: #define NPY_NONE_AXIS NPY_MAXDIMS to make it clearer what is intended? Best, Matthew From travis at continuum.io Thu Sep 13 13:35:34 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 13 Sep 2012 12:35:34 -0500 Subject: [Numpy-discussion] Obscure code in concatenate code path? In-Reply-To: References: Message-ID: >> >> >> This is expected behavior. It's how the concatenate Python function manages to handle axis=None to flatten the arrays before concatenation. This has been in NumPy since 1.0 and should not be changed without deprecation warnings which I am -0 on. >> >> Now, it is true that the C-API could have been written differently (I think this is what Mark was trying to encourage) so that there are two C-API functions and they are dispatched separately from the array_concatenate method depending on whether or not a None is passed in. But, the behavior is documented and has been for a long time. >> >> Reference PyArray_AxisConverter (which turns a "None" Python argument into an axis=MAX_DIMS). This is consistent behavior throughout the C-API. > > How about something like: > > #define NPY_NONE_AXIS NPY_MAXDIMS > > to make it clearer what is intended? +1 -Travis From matthew.brett at gmail.com Thu Sep 13 14:00:19 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 13 Sep 2012 19:00:19 +0100 Subject: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release In-Reply-To: References: Message-ID: Hi, On Thu, Sep 13, 2012 at 11:31 AM, Matthew Brett wrote: > On Wed, Sep 12, 2012 at 4:19 PM, Nathaniel Smith wrote: >> On Wed, Sep 12, 2012 at 2:46 PM, Matthew Brett wrote: >>> Hi, >>> >>> I just noticed that this works for numpy 1.6.1: >>> >>> In [36]: np.concatenate(([2, 3], [1]), 1) >>> Out[36]: array([2, 3, 1]) >>> >>> but the beta release branch: >>> >>> In [3]: np.concatenate(([2, 3], [1]), 1) >>> --------------------------------------------------------------------------- >>> IndexError Traceback (most recent call last) >>> /Users/mb312/ in () >>> ----> 1 np.concatenate(([2, 3], [1]), 1) >>> >>> IndexError: axis 1 out of bounds [0, 1) >>> >>> In the interests of backward compatibility maybe it would be better to >>> raise a warning for this release, rather than an error? >> >> Yep, that'd be a good idea. Want to write a patch? :-) > > https://github.com/numpy/numpy/pull/440 Thinking about the other thread, and the 'number of elements' check, I noticed this: In [51]: np.__version__ Out[51]: '1.6.1' In [52]: r4 = range(4) In [53]: r3 = range(3) In [54]: np.concatenate((r4, r3), None) Out[54]: array([0, 1, 2, 3, 0, 1, 2]) but: In [46]: np.__version__ Out[46]: '1.7.0rc1.dev-ea23de8' In [47]: np.concatenate((r4, r3), None) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) /Users/mb312/tmp/ in () ----> 1 np.concatenate((r4, r3), None) ValueError: all the input arrays must have same number of elements The change requiring the same number of elements appears to have been added explicitly by Mark in commit 9194b3af . Mark - what was the reason for that check? 
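(For the record, the behaviour I'd expect from axis=None -- and what 1.6.1 gives -- is plain flatten-then-join, with no restriction on lengths. In pure-Python terms, something like this sketch, where concat_flat is just a made-up name for what I think PyArray_ConcatenateFlattenedArrays should reduce to:

import numpy as np

def concat_flat(arrays):
    # flatten each input, then join the 1-d pieces
    return np.concatenate([np.ravel(np.asarray(a)) for a in arrays])

concat_flat((range(4), range(3)))
# expected: array([0, 1, 2, 3, 0, 1, 2])

so I can't see where a same-number-of-elements requirement comes in.)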
Best, Matthew From travis at continuum.io Thu Sep 13 14:16:50 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 13 Sep 2012 13:16:50 -0500 Subject: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release In-Reply-To: References: Message-ID: <98F2481B-8225-4ADC-AFA0-96AA61F0D451@continuum.io> >>> Yep, that'd be a good idea. Want to write a patch? :-) >> >> https://github.com/numpy/numpy/pull/440 > > Thinking about the other thread, and the 'number of elements' check, I > noticed this: > > In [51]: np.__version__ > Out[51]: '1.6.1' > > In [52]: r4 = range(4) > > In [53]: r3 = range(3) > > In [54]: np.concatenate((r4, r3), None) > Out[54]: array([0, 1, 2, 3, 0, 1, 2]) > > but: > > In [46]: np.__version__ > Out[46]: '1.7.0rc1.dev-ea23de8' > > In [47]: np.concatenate((r4, r3), None) > --------------------------------------------------------------------------- > ValueError Traceback (most recent call last) > /Users/mb312/tmp/ in () > ----> 1 np.concatenate((r4, r3), None) > > ValueError: all the input arrays must have same number of elements > > The change requiring the same number of elements appears to have been > added explicitly by Mark in commit 9194b3af . Mark - what was the > reason for that check? This looks like a regression. That should still work. -Travis > > Best, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From francesc at continuum.io Fri Sep 14 09:31:39 2012 From: francesc at continuum.io (Francesc Alted) Date: Fri, 14 Sep 2012 15:31:39 +0200 Subject: [Numpy-discussion] ANN: python-blosc 1.0.4 released Message-ID: <505331BB.6030207@continuum.io> ============================= Announcing python-blosc 1.0.4 ============================= What is it? =========== A Python wrapper for the Blosc compression library. Blosc (http://blosc.pytables.org) is a high performance compressor optimized for binary data. It has been designed to transmit data to the processor cache faster than the traditional, non-compressed, direct memory fetch approach via a memcpy() OS call. Blosc works well for compressing numerical arrays that contains data with relatively low entropy, like sparse data, time series, grids with regular-spaced values, etc. python-blosc is a Python package that wraps it. What is new? ============ Optimized the amount of data copied during compression (using _PyBytes_Resize() now instead of previous PyBytes_FromStringAndSize()). This leads to improvements in compression speed ranging from 1.2x for highly compressible chunks up to 7x for mostly uncompressible data. Thanks to Valentin Haenel for this nice contribution. For more info, you can see the release notes in: https://github.com/FrancescAlted/python-blosc/wiki/Release-notes More docs and examples are available in the Quick User's Guide wiki page: https://github.com/FrancescAlted/python-blosc/wiki/Quick-User's-Guide Download sources ================ Go to: http://github.com/FrancescAlted/python-blosc and download the most recent release from there. Blosc is distributed using the MIT license, see LICENSES/BLOSC.txt for details. 
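Quick example
=============

A minimal sketch of the bytes-level API (untested as written here; see the Quick User's Guide above for the authoritative examples, and note that keyword names may differ slightly between versions):

import numpy as np
import blosc

a = np.linspace(0, 100, 1000000)
raw = a.tostring()
packed = blosc.compress(raw, typesize=a.dtype.itemsize)
restored = np.fromstring(blosc.decompress(packed), dtype=a.dtype)

assert (a == restored).all()
# len(packed) should be much smaller than len(raw) for regular data like this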
Mailing list ============ There is an official mailing list for Blosc at: blosc at googlegroups.com http://groups.google.es/group/blosc -- Francesc Alted From matthew.brett at gmail.com Fri Sep 14 13:48:01 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 14 Sep 2012 18:48:01 +0100 Subject: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release In-Reply-To: References: Message-ID: Hi, On Thu, Sep 13, 2012 at 7:00 PM, Matthew Brett wrote: > Hi, > > On Thu, Sep 13, 2012 at 11:31 AM, Matthew Brett wrote: >> On Wed, Sep 12, 2012 at 4:19 PM, Nathaniel Smith wrote: >>> On Wed, Sep 12, 2012 at 2:46 PM, Matthew Brett wrote: >>>> Hi, >>>> >>>> I just noticed that this works for numpy 1.6.1: >>>> >>>> In [36]: np.concatenate(([2, 3], [1]), 1) >>>> Out[36]: array([2, 3, 1]) >>>> >>>> but the beta release branch: >>>> >>>> In [3]: np.concatenate(([2, 3], [1]), 1) >>>> --------------------------------------------------------------------------- >>>> IndexError Traceback (most recent call last) >>>> /Users/mb312/ in () >>>> ----> 1 np.concatenate(([2, 3], [1]), 1) >>>> >>>> IndexError: axis 1 out of bounds [0, 1) >>>> >>>> In the interests of backward compatibility maybe it would be better to >>>> raise a warning for this release, rather than an error? >>> >>> Yep, that'd be a good idea. Want to write a patch? :-) >> >> https://github.com/numpy/numpy/pull/440 > > Thinking about the other thread, and the 'number of elements' check, I > noticed this: > > In [51]: np.__version__ > Out[51]: '1.6.1' > > In [52]: r4 = range(4) > > In [53]: r3 = range(3) > > In [54]: np.concatenate((r4, r3), None) > Out[54]: array([0, 1, 2, 3, 0, 1, 2]) > > but: > > In [46]: np.__version__ > Out[46]: '1.7.0rc1.dev-ea23de8' > > In [47]: np.concatenate((r4, r3), None) > --------------------------------------------------------------------------- > ValueError Traceback (most recent call last) > /Users/mb312/tmp/ in () > ----> 1 np.concatenate((r4, r3), None) > > ValueError: all the input arrays must have same number of elements > > The change requiring the same number of elements appears to have been > added explicitly by Mark in commit 9194b3af . Mark - what was the > reason for that check? Appealing for anyone who might understand that part of the code : there's a check in multiarraymodule.c at around line 477: /* * Figure out the final concatenated shape starting from the first * array's shape. */ for (iarrays = 1; iarrays < narrays; ++iarrays) { if (PyArray_SIZE(arrays[iarrays]) != shape[1]) { PyErr_SetString(PyExc_ValueError, "all the input arrays must have same " "number of elements"); return NULL; } } I don't understand the following code so I don't know what this check is for... Cheers, Matthew From fperez.net at gmail.com Fri Sep 14 15:46:37 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Fri, 14 Sep 2012 12:46:37 -0700 Subject: [Numpy-discussion] John Hunter has been awarded the first Distinguished Service Award by the PSF Message-ID: Hi folks, you may have already seen this, but in case you haven't, I'm thrilled to share that the Python Software Foundation has just created its newest and highest distinction, the Distinguished Service Award, and has chosen John as its first recipient: http://pyfound.blogspot.com/2012/09/announcing-2012-distinctive-service.html This is a fitting tribute to his many contributions. 
Cheers, f From hangenuit at gmail.com Fri Sep 14 18:25:05 2012 From: hangenuit at gmail.com (Han Genuit) Date: Sat, 15 Sep 2012 00:25:05 +0200 Subject: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release In-Reply-To: References: Message-ID: I think there is something wrong with the implementation.. I would expect each incoming array in PyArray_ConcatenateFlattenedArrays to be flattened and the sizes of all of them added into a one-dimensional shape. Now the shape is two-dimensional, which does not make sense to me. Also the requirement that all sizes must be equal between the incoming arrays only makes sense when you want to stack them into a two-dimensional array, which makes it unnecessarily complicated. The difficulty here is to use PyArray_CopyAsFlat without having to transform/copy each incoming array to the priority dtype, because they can have different item sizes between them, but other than that it should be pretty straightforward, imo. From tmp50 at ukr.net Sat Sep 15 06:51:55 2012 From: tmp50 at ukr.net (Dmitrey) Date: Sat, 15 Sep 2012 13:51:55 +0300 Subject: [Numpy-discussion] [ANN] OpenOpt Suite release 0.42 Message-ID: <69972.1347706315.12104809919653740544@ffe16.ukr.net> Hi all, I'm glad to inform you about new OpenOpt Suite release 0.42 (2012-Sept-15). Main changes: * Some improvements for solver interalg, including handling of categorical variables * Some parameters for solver gsubg * Speedup objective function for de and pswarm on FuncDesigner models * New global (GLP) solver: asa (adaptive simulated annealing) * Some new classes for network problems: TSP (traveling salesman problem), STAB (maximum graph stable set)], MCP (maximum clique problem) * Improvements for FD XOR (and now it can handle many inputs) * Solver de has parameter "seed", also, now it works with PyPy * Function sign now is available in FuncDesigner * FuncDesigner interval analysis (and thus solver interalg) now can handle non-monotone splines of 1st order * FuncDesigner now can handle parameter fixedVars as Python dict * Now scipy InterpolatedUnivariateSpline is used in FuncDesigner interpolator() instead of UnivariateSpline. This creates backward incompatibility - you cannot pass smoothing parameter (s) to interpolator no longer. * SpaceFuncs: add Point weight, Disk, Ball and method contains(), bugfix for importing Sphere, some new examples * Some improvements (essential speedup, new parameter interpolate for P()) for our (currently commercial) FuncDesigner Stochastic Programming addon * Some bugfixes In our website ( http://openopt.org ) you could vote for most required OpenOpt Suite development direction(s) (poll has been renewed, previous results are here). Regards, D. -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Sat Sep 15 07:44:09 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 15 Sep 2012 12:44:09 +0100 Subject: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release In-Reply-To: References: Message-ID: Hi, On Fri, Sep 14, 2012 at 11:25 PM, Han Genuit wrote: > I think there is something wrong with the implementation.. I would > expect each incoming array in PyArray_ConcatenateFlattenedArrays to be > flattened and the sizes of all of them added into a one-dimensional > shape. Now the shape is two-dimensional, which does not make sense to > me. 
Also the requirement that all sizes must be equal between the > incoming arrays only makes sense when you want to stack them into a > two-dimensional array, which makes it unnecessarily complicated. The > difficulty here is to use PyArray_CopyAsFlat without having to > transform/copy each incoming array to the priority dtype, because they > can have different item sizes between them, but other than that it > should be pretty straightforward, imo. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion Thanks for the feedback. Feeling inadequate to a full understanding of the code there, I've entered an issue for it: https://github.com/numpy/numpy/issues/442 Ondrej - would you consider this a blocker for release? Best, Matthew From matthew.brett at gmail.com Sat Sep 15 09:15:36 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 15 Sep 2012 14:15:36 +0100 Subject: [Numpy-discussion] Change in ``round`` behavior for numpy scalars in python 3 Message-ID: Hi, I just noticed that Python 3 raises an error for 0 dimensional numpy arrays. Here's Python 2.6: In [14]: a = np.array(1.1) In [15]: round(a) Out[15]: 1.0 and Python 3.2: In [3]: a = np.array(1.1) In [4]: round(a) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) /Users/mb312/dev_trees/ in () ----> 1 round(a) TypeError: type numpy.ndarray doesn't define __round__ method Should arrays implement __round__ ? Returning an error for 1D or above? Best, Matthew From chaoyuejoy at gmail.com Sat Sep 15 11:20:30 2012 From: chaoyuejoy at gmail.com (Chao YUE) Date: Sat, 15 Sep 2012 17:20:30 +0200 Subject: [Numpy-discussion] easy way to change part of only unmasked elements value? In-Reply-To: References: Message-ID: but I think I personally prefer the reverse. I would expect when I do a[3:6]=1 the mask state would not change. then I want to change the "base", I would use a.base[3:6]=1 then the mask state would change also. By the way, I found b.data always be equal to b.base? cheers, Chao On Tue, Sep 11, 2012 at 5:24 PM, Chao YUE wrote: > Dear Richard, > > this is what I want. Thanks! > > Chao > > > On Tue, Sep 11, 2012 at 3:19 PM, Richard Hattersley > wrote: > >> Hi Chao, >> >> If you don't mind modifying masked values, then if you write to the >> underlying ndarray it won't touch the mask: >> >> >>> a = np.ma.masked_less(np.arange(10),5) >> >>> a.base[3:6] = 1 >> >>> a >> >> masked_array(data = [-- -- -- -- -- 1 6 7 8 9], >> mask = [ True True True True True False False False >> False False], >> fill_value = 999999) >> >> Regards, >> Richard Hattersley >> >> >> On 10 September 2012 17:43, Chao YUE wrote: >> >>> Dear all numpy users, >>> >>> what's the easy way if I just want to change part of the unmasked array >>> elements into another new value? like an example below: >>> in my real case, I would like to change a subgrid of a masked numpy >>> array to another value, but this grid include both masked and unmasked data. >>> If I do a simple array[index1:index2, index3:index4] = another_value, >>> those data with original True mask will change into False. I am using numpy >>> 1.6.2. >>> Thanks for any ideas. 
>>> >>> In [91]: a = np.ma.masked_less(np.arange(10),5) >>> >>> In [92]: or_mask = a.mask.copy() >>> In [93]: a >>> Out[93]: >>> masked_array(data = [-- -- -- -- -- 5 6 7 8 9], >>> mask = [ True True True True True False False False >>> False False], >>> fill_value = 999999) >>> >>> >>> In [94]: a[3:6]=1 >>> >>> In [95]: a >>> Out[95]: >>> masked_array(data = [-- -- -- 1 1 1 6 7 8 9], >>> mask = [ True True True False False False False False >>> False False], >>> fill_value = 999999) >>> >>> >>> In [96]: a = np.ma.masked_array(a,mask=or_mask) >>> >>> In [97]: a >>> Out[97]: >>> masked_array(data = [-- -- -- -- -- 1 6 7 8 9], >>> mask = [ True True True True True False False False >>> False False], >>> fill_value = 999999) >>> >>> Chao >>> >>> -- >>> >>> *********************************************************************************** >>> Chao YUE >>> Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) >>> UMR 1572 CEA-CNRS-UVSQ >>> Batiment 712 - Pe 119 >>> 91191 GIF Sur YVETTE Cedex >>> Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 >>> >>> ************************************************************************************ >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > > *********************************************************************************** > Chao YUE > Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) > UMR 1572 CEA-CNRS-UVSQ > Batiment 712 - Pe 119 > 91191 GIF Sur YVETTE Cedex > Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 > > ************************************************************************************ > > -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From hangenuit at gmail.com Sat Sep 15 12:06:38 2012 From: hangenuit at gmail.com (Han Genuit) Date: Sat, 15 Sep 2012 18:06:38 +0200 Subject: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release In-Reply-To: References: Message-ID: Okay, sent in a pull request: https://github.com/numpy/numpy/pull/443. From travis at continuum.io Sat Sep 15 15:52:26 2012 From: travis at continuum.io (Travis Oliphant) Date: Sat, 15 Sep 2012 14:52:26 -0500 Subject: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release In-Reply-To: References: Message-ID: <0F2D4700-031E-4865-9C73-B7651839FE91@continuum.io> I was working on the same fix and so I saw your code was similar and merged it. It needs to be back-ported to 1.7.0 Thanks, -Travis On Sep 15, 2012, at 11:06 AM, Han Genuit wrote: > Okay, sent in a pull request: https://github.com/numpy/numpy/pull/443. 
> _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From hangenuit at gmail.com Sat Sep 15 16:14:32 2012 From: hangenuit at gmail.com (Han Genuit) Date: Sat, 15 Sep 2012 22:14:32 +0200 Subject: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release In-Reply-To: <0F2D4700-031E-4865-9C73-B7651839FE91@continuum.io> References: <0F2D4700-031E-4865-9C73-B7651839FE91@continuum.io> Message-ID: Yeah, that merge was fast. :-) Regards, Han From travis at continuum.io Sat Sep 15 16:33:28 2012 From: travis at continuum.io (Travis Oliphant) Date: Sat, 15 Sep 2012 15:33:28 -0500 Subject: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release In-Reply-To: References: <0F2D4700-031E-4865-9C73-B7651839FE91@continuum.io> Message-ID: <3D200D46-E048-41F3-9669-59641336E19B@continuum.io> It's very nice to get your help. I hope I haven't inappropriately set expectations :-) -Travis On Sep 15, 2012, at 3:14 PM, Han Genuit wrote: > Yeah, that merge was fast. :-) > > Regards, > Han > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From hangenuit at gmail.com Sat Sep 15 17:03:07 2012 From: hangenuit at gmail.com (Han Genuit) Date: Sat, 15 Sep 2012 23:03:07 +0200 Subject: [Numpy-discussion] Change in behavior of np.concatenate for upcoming release In-Reply-To: <3D200D46-E048-41F3-9669-59641336E19B@continuum.io> References: <0F2D4700-031E-4865-9C73-B7651839FE91@continuum.io> <3D200D46-E048-41F3-9669-59641336E19B@continuum.io> Message-ID: You're welcome. I do not have many expectations; only those you can expect from an open source project. ;-) On Sat, Sep 15, 2012 at 10:33 PM, Travis Oliphant wrote: > It's very nice to get your help. I hope I haven't inappropriately set expectations :-) > > -Travis > > On Sep 15, 2012, at 3:14 PM, Han Genuit wrote: > >> Yeah, that merge was fast. :-) >> >> Regards, >> Han From ondrej.certik at gmail.com Sun Sep 16 03:19:23 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Sun, 16 Sep 2012 00:19:23 -0700 Subject: [Numpy-discussion] Status of fixing bugs for the 1.7.0rc1 release Message-ID: Hi, I've finally finished review of https://github.com/numpy/numpy/pull/439 which back-ports all the PRs from master into the release branch and pushed it in. Here is the current status of bugs for the 1.7.0 release: https://github.com/numpy/numpy/issues/396 I believe that for example a lot of the Debian based bugs were fixed by now (in the 1.7.0 branch). Can I release 1.7.0b2? So that others can try it out, while we work on the rest of the issues. I don't think it's ready for rc1 yet, but we've done a lot of work since beta1 I think. Ondrej From ralf.gommers at gmail.com Sun Sep 16 05:10:57 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 16 Sep 2012 11:10:57 +0200 Subject: [Numpy-discussion] Status of fixing bugs for the 1.7.0rc1 release In-Reply-To: References: Message-ID: On Sun, Sep 16, 2012 at 9:19 AM, Ond?ej ?ert?k wrote: > Hi, > > I've finally finished review of > > https://github.com/numpy/numpy/pull/439 > > which back-ports all the PRs from master into the release branch and > pushed it in. 
Here is the current status of bugs for the 1.7.0 > release: > > https://github.com/numpy/numpy/issues/396 > > I believe that for example a lot of the Debian based bugs were fixed > by now (in the 1.7.0 branch). Can I release 1.7.0b2? So that others > can try it out, > while we work on the rest of the issues. I don't think it's ready for > rc1 yet, but we've done a lot of work since beta1 I think. > Sounds like a good idea to me. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From francesc at continuum.io Sun Sep 16 06:44:13 2012 From: francesc at continuum.io (Francesc Alted) Date: Sun, 16 Sep 2012 12:44:13 +0200 Subject: [Numpy-discussion] [ANN] python-blosc 1.0.5 released Message-ID: <5055AD7D.3030105@continuum.io> ============================= Announcing python-blosc 1.0.5 ============================= What is it? =========== A Python wrapper for the Blosc compression library. Blosc (http://blosc.pytables.org) is a high performance compressor optimized for binary data. It has been designed to transmit data to the processor cache faster than the traditional, non-compressed, direct memory fetch approach via a memcpy() OS call. Blosc works well for compressing numerical arrays that contains data with relatively low entropy, like sparse data, time series, grids with regular-spaced values, etc. python-blosc is a Python package that wraps it. What is new? ============ - Upgraded to latest Blosc 1.1.4. - Better handling of condition errors, and improved memory releasing in case of errors (thanks to Valentin Haenel and Han Genuit). - Better handling of types (should compile without warning now, at least with GCC). For more info, you can see the release notes in: https://github.com/FrancescAlted/python-blosc/wiki/Release-notes More docs and examples are available in the Quick User's Guide wiki page: https://github.com/FrancescAlted/python-blosc/wiki/Quick-User's-Guide Download sources ================ Go to: http://github.com/FrancescAlted/python-blosc and download the most recent release from there. Blosc is distributed using the MIT license, see LICENSES/BLOSC.txt for details. Mailing list ============ There is an official mailing list for Blosc at: blosc at googlegroups.com http://groups.google.es/group/blosc -- Francesc Alted From ecarlson at eng.ua.edu Sun Sep 16 10:10:55 2012 From: ecarlson at eng.ua.edu (Eric Carlson) Date: Sun, 16 Sep 2012 09:10:55 -0500 Subject: [Numpy-discussion] some vectorization help In-Reply-To: <5055AD7D.3030105@continuum.io> References: <5055AD7D.3030105@continuum.io> Message-ID: Hello All, I have a bit of code that nicely accomplishes what I need for a course I am teaching. I'd like to extend this for larger 3D grids, and I suspect that the looping will be a brutal performance hit. 
Even if my suspicions are not confirmed, I still would like to know if it's possible to vectorize the following code: from scipy import shape, prod, array, zeros,ravel, reshape,sin, mgrid from scipy.misc import derivative def gradient2D_vect(func,x,y): the_shape = shape(x) x1=ravel(x) y1=ravel(y) N = prod(the_shape) the_result = zeros([N,2]) for k in range(N): func_x=lambda x: func(x,y1[k]) func_y=lambda y: func(x1[k],y) the_result[k,:]= array([derivative(func_x,x1[k],dx=.01, order=5), derivative(func_y,y1[k],dx=.01, order=5)]) if prod(shape(the_shape))==1: return the_result else: return reshape(the_result,[the_shape[0],the_shape[1],2]) fxy = lambda x,y: sin(x*y) #just a little test x,y=mgrid[0:5,0:4] the_gradient = gradient2D_vect(fxy, x,y) Cheers, Eric Carlson From cgohlke at uci.edu Sun Sep 16 14:03:34 2012 From: cgohlke at uci.edu (Christoph Gohlke) Date: Sun, 16 Sep 2012 11:03:34 -0700 Subject: [Numpy-discussion] Status of fixing bugs for the 1.7.0rc1 release In-Reply-To: References: Message-ID: <50561476.9090909@uci.edu> On 9/16/2012 12:19 AM, Ond?ej ?ert?k wrote: > Hi, > > I've finally finished review of > > https://github.com/numpy/numpy/pull/439 > > which back-ports all the PRs from master into the release branch and > pushed it in. Here is the current status of bugs for the 1.7.0 > release: > > https://github.com/numpy/numpy/issues/396 > > I believe that for example a lot of the Debian based bugs were fixed > by now (in the 1.7.0 branch). Can I release 1.7.0b2? So that others > can try it out, > while we work on the rest of the issues. I don't think it's ready for > rc1 yet, but we've done a lot of work since beta1 I think. > > Ondrej Hello, I ran some compatibility tests on Windows, using numpy-MKL-1.7.x.dev.win-amd64-py2.7 with packages built against numpy-MKL-1.6.2. There are new test failures in scipy, bottleneck, pymc, and mvpa2 of the following types: IndexError: too many indices ValueError: negative dimensions are not allowed The test results are at Christoph From hangenuit at gmail.com Sun Sep 16 18:06:38 2012 From: hangenuit at gmail.com (Han Genuit) Date: Mon, 17 Sep 2012 00:06:38 +0200 Subject: [Numpy-discussion] Status of fixing bugs for the 1.7.0rc1 release In-Reply-To: <50561476.9090909@uci.edu> References: <50561476.9090909@uci.edu> Message-ID: [snip] > Hello, > > I ran some compatibility tests on Windows, using > numpy-MKL-1.7.x.dev.win-amd64-py2.7 with packages built against > numpy-MKL-1.6.2. > > There are new test failures in scipy, bottleneck, pymc, and mvpa2 of the > following types: > > IndexError: too many indices > ValueError: negative dimensions are not allowed > > The test results are at > > > Christoph Hi, https://github.com/numpy/numpy/pull/445 should fix "negative dimensions are not allowed", the other one I have not yet been able to pinpoint. Regards, Han From cgohlke at uci.edu Sun Sep 16 18:23:42 2012 From: cgohlke at uci.edu (Christoph Gohlke) Date: Sun, 16 Sep 2012 15:23:42 -0700 Subject: [Numpy-discussion] Status of fixing bugs for the 1.7.0rc1 release In-Reply-To: References: <50561476.9090909@uci.edu> Message-ID: <5056516E.5020907@uci.edu> On 9/16/2012 3:06 PM, Han Genuit wrote: > [snip] > >> Hello, >> >> I ran some compatibility tests on Windows, using >> numpy-MKL-1.7.x.dev.win-amd64-py2.7 with packages built against >> numpy-MKL-1.6.2. 
>> >> There are new test failures in scipy, bottleneck, pymc, and mvpa2 of the >> following types: >> >> IndexError: too many indices >> ValueError: negative dimensions are not allowed >> >> The test results are at >> >> >> Christoph > > Hi, > > https://github.com/numpy/numpy/pull/445 should fix "negative > dimensions are not allowed", the other one I have not yet been able to > pinpoint. > > Regards, > Han I just tracked the "IndexError: too many indices" errors to >>> list(np.ndindex(*())) [(0,)] I'll check your PR. It might fix this too. Christoph From matthew.brett at gmail.com Mon Sep 17 05:22:44 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 17 Sep 2012 10:22:44 +0100 Subject: [Numpy-discussion] tests for casting table? (was: Numpy 1.7b1 API change cause big trouble) Message-ID: Hi, On Sun, Sep 9, 2012 at 6:12 PM, Fr?d?ric Bastien wrote: > The third is releated to change to the casting rules in numpy. Before > a scalar complex128 * vector float32 gived a vector of dtype > complex128. Now it give a vector of complex64. The reason is that now > the scalar of different category only change the category, not the > precision. I would consider a must that we warn clearly about this > interface change. Most people won't see it, but people that optimize > there code heavily could depend on such thing. It seems to me that it would be a very good idea to put the casting table results into the tests to make sure we are keeping track of this kind of thing. I'm happy to try to do it if no-one else more qualified has time. Best, Matthew From ben.root at ou.edu Mon Sep 17 09:42:08 2012 From: ben.root at ou.edu (Benjamin Root) Date: Mon, 17 Sep 2012 09:42:08 -0400 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) Message-ID: Consider the following code: import numpy as np a = np.array([1, 2, 3, 4, 5], dtype=np.int16) a *= float(255) / 15 In v1.6.x, this yields: array([17, 34, 51, 68, 85], dtype=int16) But in master, this throws an exception about failing to cast via same_kind. Note that numpy was smart about this operation before, consider: a = np.array([1, 2, 3, 4, 5], dtype=np.int16) a *= float(128) / 256 yields: array([0, 1, 1, 2, 2], dtype=int16) Of course, this is different than if one does it in a non-in-place manner: np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 which yields an array with floating point dtype in both versions. I can appreciate the arguments for preventing this kind of implicit casting between non-same_kind dtypes, but I argue that because the operation is in-place, then I (as the programmer) am explicitly stating that I desire to utilize the current array to store the results of the operation, dtype and all. Obviously, we can't completely turn off this rule (for example, an in-place addition between integer array and a datetime64 makes no sense), but surely there is some sort of happy medium that would allow these sort of operations to take place? Lastly, if it is determined that it is desirable to allow in-place operations to continue working like they have before, I would like to see such a fix in v1.7 because if it isn't in 1.7, then other libraries (such as matplotlib, where this issue was first found) would have to change their code anyway just to be compatible with numpy. Cheers! Ben Root -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ronan.lamy at gmail.com Mon Sep 17 15:27:34 2012 From: ronan.lamy at gmail.com (Ronan Lamy) Date: Mon, 17 Sep 2012 20:27:34 +0100 Subject: [Numpy-discussion] Inconsistencies with string indices Message-ID: <1347910054.2237.6.camel@ronan-desktop> Consider the following: >>> import numpy as np >>> np.__version__ '1.6.1' >>> arr = np.asarray([[1, 2, 3]]) >>> arr["0"] Traceback (most recent call last): File "", line 1, in arr["0"] ValueError: field named 0 not found. >>> arr["0",] array([1, 2, 3]) >>> arr["0", 1] 2 >>> arr[0, "1"] 2 >>> arr[1] Traceback (most recent call last): File "", line 1, in arr[1] IndexError: index out of bounds >>> arr[1, 1] Traceback (most recent call last): File "", line 1, in arr[1, 1] IndexError: index (1) out of range (0<=index<1) in dimension 0 >>> arr["1", "1"] Traceback (most recent call last): File "", line 1, in arr["1", "1"] IndexError: index (1) out of range (0<=index<0) in dimension 0 Is there some kind of logic here, or is this just accumulated cruft? IMHO, strings should simply never be coerced to int when indexing. From travis at continuum.io Mon Sep 17 17:40:32 2012 From: travis at continuum.io (Travis Oliphant) Date: Mon, 17 Sep 2012 16:40:32 -0500 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: Message-ID: On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: > Consider the following code: > > import numpy as np > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) > a *= float(255) / 15 > > In v1.6.x, this yields: > array([17, 34, 51, 68, 85], dtype=int16) > > But in master, this throws an exception about failing to cast via same_kind. > > Note that numpy was smart about this operation before, consider: > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) > a *= float(128) / 256 > yields: > array([0, 1, 1, 2, 2], dtype=int16) > > Of course, this is different than if one does it in a non-in-place manner: > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 > > which yields an array with floating point dtype in both versions. I can appreciate the arguments for preventing this kind of implicit casting between non-same_kind dtypes, but I argue that because the operation is in-place, then I (as the programmer) am explicitly stating that I desire to utilize the current array to store the results of the operation, dtype and all. Obviously, we can't completely turn off this rule (for example, an in-place addition between integer array and a datetime64 makes no sense), but surely there is some sort of happy medium that would allow these sort of operations to take place? > > Lastly, if it is determined that it is desirable to allow in-place operations to continue working like they have before, I would like to see such a fix in v1.7 because if it isn't in 1.7, then other libraries (such as matplotlib, where this issue was first found) would have to change their code anyway just to be compatible with numpy. I agree that in-place operations should allow different casting rules. There are different opinions on this, of course, but generally this is how NumPy has worked in the past. We did decide to change the default casting rule to "same_kind" but making an exception for in-place seems reasonable. 
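Whatever we decide about the default, note that the explicit spelling is already available through the ufunc itself -- something like this sketch (untested; I believe ufunc calls accept a casting keyword on master, so treat the exact spelling as an assumption):

import numpy as np

a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
np.multiply(a, float(255) / 15, out=a, casting='unsafe')
# expected: a == array([17, 34, 51, 68, 85], dtype=int16)

so code that really wants the down-cast has a way to say so without relying on the default rule.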
-Travis From charlesr.harris at gmail.com Mon Sep 17 21:33:04 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 17 Sep 2012 19:33:04 -0600 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: Message-ID: On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant wrote: > > On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: > > > Consider the following code: > > > > import numpy as np > > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) > > a *= float(255) / 15 > > > > In v1.6.x, this yields: > > array([17, 34, 51, 68, 85], dtype=int16) > > > > But in master, this throws an exception about failing to cast via > same_kind. > > > > Note that numpy was smart about this operation before, consider: > > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) > > a *= float(128) / 256 > > > yields: > > array([0, 1, 1, 2, 2], dtype=int16) > > > > Of course, this is different than if one does it in a non-in-place > manner: > > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 > > > > which yields an array with floating point dtype in both versions. I can > appreciate the arguments for preventing this kind of implicit casting > between non-same_kind dtypes, but I argue that because the operation is > in-place, then I (as the programmer) am explicitly stating that I desire to > utilize the current array to store the results of the operation, dtype and > all. Obviously, we can't completely turn off this rule (for example, an > in-place addition between integer array and a datetime64 makes no sense), > but surely there is some sort of happy medium that would allow these sort > of operations to take place? > > > > Lastly, if it is determined that it is desirable to allow in-place > operations to continue working like they have before, I would like to see > such a fix in v1.7 because if it isn't in 1.7, then other libraries (such > as matplotlib, where this issue was first found) would have to change their > code anyway just to be compatible with numpy. > > I agree that in-place operations should allow different casting rules. > There are different opinions on this, of course, but generally this is how > NumPy has worked in the past. > > We did decide to change the default casting rule to "same_kind" but making > an exception for in-place seems reasonable. > I think that in these cases same_kind will flag what are most likely programming errors and sloppy code. It is easy to be explicit and doing so will make the code more readable because it will be immediately obvious what the multiplicand is without the need to recall what the numpy casting rules are in this exceptional case. IISTR several mentions of this before (Gael?), and in some of those cases it turned out that bugs were being turned up. Catching bugs with minimal effort is a good thing. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ben.root at ou.edu Tue Sep 18 13:39:11 2012 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 18 Sep 2012 13:39:11 -0400 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: Message-ID: On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris wrote: > > > On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant wrote: > >> >> On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: >> >> > Consider the following code: >> > >> > import numpy as np >> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >> > a *= float(255) / 15 >> > >> > In v1.6.x, this yields: >> > array([17, 34, 51, 68, 85], dtype=int16) >> > >> > But in master, this throws an exception about failing to cast via >> same_kind. >> > >> > Note that numpy was smart about this operation before, consider: >> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >> > a *= float(128) / 256 >> >> > yields: >> > array([0, 1, 1, 2, 2], dtype=int16) >> > >> > Of course, this is different than if one does it in a non-in-place >> manner: >> > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 >> > >> > which yields an array with floating point dtype in both versions. I >> can appreciate the arguments for preventing this kind of implicit casting >> between non-same_kind dtypes, but I argue that because the operation is >> in-place, then I (as the programmer) am explicitly stating that I desire to >> utilize the current array to store the results of the operation, dtype and >> all. Obviously, we can't completely turn off this rule (for example, an >> in-place addition between integer array and a datetime64 makes no sense), >> but surely there is some sort of happy medium that would allow these sort >> of operations to take place? >> > >> > Lastly, if it is determined that it is desirable to allow in-place >> operations to continue working like they have before, I would like to see >> such a fix in v1.7 because if it isn't in 1.7, then other libraries (such >> as matplotlib, where this issue was first found) would have to change their >> code anyway just to be compatible with numpy. >> >> I agree that in-place operations should allow different casting rules. >> There are different opinions on this, of course, but generally this is how >> NumPy has worked in the past. >> >> We did decide to change the default casting rule to "same_kind" but >> making an exception for in-place seems reasonable. >> > > I think that in these cases same_kind will flag what are most likely > programming errors and sloppy code. It is easy to be explicit and doing so > will make the code more readable because it will be immediately obvious > what the multiplicand is without the need to recall what the numpy casting > rules are in this exceptional case. IISTR several mentions of this before > (Gael?), and in some of those cases it turned out that bugs were being > turned up. Catching bugs with minimal effort is a good thing. > > Chuck > > True, it is quite likely to be a programming error, but then again, there are many cases where it isn't. Is the problem strictly that we are trying to downcast the float to an int, or is it that we are trying to downcast to a lower precision? Is there a way for one to explicitly relax the same_kind restriction? Thanks, Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Tue Sep 18 13:40:23 2012 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 18 Sep 2012 13:40:23 -0400 Subject: [Numpy-discussion] numpy.ma.MaskedArray.min() makes a copy? 
In-Reply-To: References: Message-ID: On Fri, Sep 7, 2012 at 12:05 PM, Nathaniel Smith wrote: > On 7 Sep 2012 14:38, "Benjamin Root" wrote: > > > > An issue just reported on the matplotlib-users list involved a user who > ran out of memory while attempting to do an imshow() on a large array. > While this wouldn't be totally unexpected, the user's traceback shows that > they ran out of memory before any actual building of the image occurred. > Memory usage sky-rocketed when imshow() attempted to determine the min and > max of the image. The input data was a masked array, and it appears that > the implementation of min() for masked arrays goes something like this > (paraphrasing here): > > > > obj.filled(inf).min() > > > > The idea is that any masked element is set to the largest possible value > for their dtype in a copied array of itself, and then a min() is performed > on that copied array. I am assuming that max() does the same thing. > > > > Can this be done differently/more efficiently? If the "filled" approach > has to be done, maybe it would be a good idea to make the copy in chunks > instead of all at once? Ideally, it would be nice to avoid the copying > altogether and utilize some of the special iterators that Mark Weibe > created last year. > > I think what you're looking for is where= support for ufunc.reduce. This > isn't implemented yet but at least it's straightforward in principle... > otherwise I don't know anything better than reimplementing .min() by hand. > > -n > > Yes, it was the where= support that I was thinking of. I take it that it was pulled out of the 1.7 branch with the rest of the NA stuff? Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Tue Sep 18 14:42:19 2012 From: efiring at hawaii.edu (Eric Firing) Date: Tue, 18 Sep 2012 08:42:19 -1000 Subject: [Numpy-discussion] numpy.ma.MaskedArray.min() makes a copy? In-Reply-To: References: Message-ID: <5058C08B.9020608@hawaii.edu> On 2012/09/18 7:40 AM, Benjamin Root wrote: > > > On Fri, Sep 7, 2012 at 12:05 PM, Nathaniel Smith > wrote: > > On 7 Sep 2012 14:38, "Benjamin Root" > wrote: > > > > An issue just reported on the matplotlib-users list involved a > user who ran out of memory while attempting to do an imshow() on a > large array. While this wouldn't be totally unexpected, the user's > traceback shows that they ran out of memory before any actual > building of the image occurred. Memory usage sky-rocketed when > imshow() attempted to determine the min and max of the image. The > input data was a masked array, and it appears that the > implementation of min() for masked arrays goes something like this > (paraphrasing here): > > > > obj.filled(inf).min() > > > > The idea is that any masked element is set to the largest > possible value for their dtype in a copied array of itself, and then > a min() is performed on that copied array. I am assuming that max() > does the same thing. > > > > Can this be done differently/more efficiently? If the "filled" > approach has to be done, maybe it would be a good idea to make the > copy in chunks instead of all at once? Ideally, it would be nice to > avoid the copying altogether and utilize some of the special > iterators that Mark Weibe created last year. > > I think what you're looking for is where= support for ufunc.reduce. > This isn't implemented yet but at least it's straightforward in > principle... otherwise I don't know anything better than > reimplementing .min() by hand. 
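A minimal sketch of that hand-rolled route, reducing one filled block at a time so the temporary stays near the block size rather than a full copy of the input; the block size, the 1-d flattening, and the assumption that the array is not fully masked are all choices made here purely for illustration:

import numpy as np
import numpy.ma as ma

def chunked_masked_min(marr, blocksize=1 << 20):
    # Fill and reduce one block at a time instead of calling
    # marr.filled(...) on the whole array at once.
    flat = marr.ravel()
    fill = np.inf if flat.dtype.kind == 'f' else np.iinfo(flat.dtype).max
    best = fill
    for start in range(0, flat.size, blocksize):
        block = flat[start:start + blocksize].filled(fill)
        best = min(best, block.min())
    return best

a = ma.masked_greater(np.arange(12, dtype=np.float64), 8.0)
print(chunked_masked_min(a, blocksize=4))   # 0.0

Peak extra memory is roughly one block of the input dtype instead of a full filled copy, at the cost of a Python-level loop over the blocks.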
> > -n > > > > Yes, it was the where= support that I was thinking of. I take it that > it was pulled out of the 1.7 branch with the rest of the NA stuff? The where= support was left in: http://docs.scipy.org/doc/numpy/reference/ufuncs.html See also get_ufunc_arguments in ufunc_object.c. Eric > > Ben Root > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Tue Sep 18 14:47:49 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 18 Sep 2012 12:47:49 -0600 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: Message-ID: On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root wrote: > > > On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant wrote: >> >>> >>> On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: >>> >>> > Consider the following code: >>> > >>> > import numpy as np >>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>> > a *= float(255) / 15 >>> > >>> > In v1.6.x, this yields: >>> > array([17, 34, 51, 68, 85], dtype=int16) >>> > >>> > But in master, this throws an exception about failing to cast via >>> same_kind. >>> > >>> > Note that numpy was smart about this operation before, consider: >>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>> > a *= float(128) / 256 >>> >>> > yields: >>> > array([0, 1, 1, 2, 2], dtype=int16) >>> > >>> > Of course, this is different than if one does it in a non-in-place >>> manner: >>> > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 >>> > >>> > which yields an array with floating point dtype in both versions. I >>> can appreciate the arguments for preventing this kind of implicit casting >>> between non-same_kind dtypes, but I argue that because the operation is >>> in-place, then I (as the programmer) am explicitly stating that I desire to >>> utilize the current array to store the results of the operation, dtype and >>> all. Obviously, we can't completely turn off this rule (for example, an >>> in-place addition between integer array and a datetime64 makes no sense), >>> but surely there is some sort of happy medium that would allow these sort >>> of operations to take place? >>> > >>> > Lastly, if it is determined that it is desirable to allow in-place >>> operations to continue working like they have before, I would like to see >>> such a fix in v1.7 because if it isn't in 1.7, then other libraries (such >>> as matplotlib, where this issue was first found) would have to change their >>> code anyway just to be compatible with numpy. >>> >>> I agree that in-place operations should allow different casting rules. >>> There are different opinions on this, of course, but generally this is how >>> NumPy has worked in the past. >>> >>> We did decide to change the default casting rule to "same_kind" but >>> making an exception for in-place seems reasonable. >>> >> >> I think that in these cases same_kind will flag what are most likely >> programming errors and sloppy code. It is easy to be explicit and doing so >> will make the code more readable because it will be immediately obvious >> what the multiplicand is without the need to recall what the numpy casting >> rules are in this exceptional case. IISTR several mentions of this before >> (Gael?), and in some of those cases it turned out that bugs were being >> turned up. 
Catching bugs with minimal effort is a good thing. >> >> Chuck >> >> > True, it is quite likely to be a programming error, but then again, there > are many cases where it isn't. Is the problem strictly that we are trying > to downcast the float to an int, or is it that we are trying to downcast to > a lower precision? Is there a way for one to explicitly relax the > same_kind restriction? > I think the problem is down casting across kinds, with the result that floats are truncated and the imaginary parts of imaginaries might be discarded. That is, the value, not just the precision, of the rhs changes. So I'd favor an explicit cast in code like this, i.e., cast the rhs to an integer. It is true that this forces downstream to code up to a higher standard, but I don't see that as a bad thing, especially if it exposes bugs. And it isn't difficult to fix. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Tue Sep 18 15:08:07 2012 From: travis at continuum.io (Travis Oliphant) Date: Tue, 18 Sep 2012 14:08:07 -0500 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: Message-ID: On Sep 18, 2012, at 1:47 PM, Charles R Harris wrote: > > > On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root wrote: > > > On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris wrote: > > > On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant wrote: > > On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: > > > Consider the following code: > > > > import numpy as np > > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) > > a *= float(255) / 15 > > > > In v1.6.x, this yields: > > array([17, 34, 51, 68, 85], dtype=int16) > > > > But in master, this throws an exception about failing to cast via same_kind. > > > > Note that numpy was smart about this operation before, consider: > > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) > > a *= float(128) / 256 > > > yields: > > array([0, 1, 1, 2, 2], dtype=int16) > > > > Of course, this is different than if one does it in a non-in-place manner: > > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 > > > > which yields an array with floating point dtype in both versions. I can appreciate the arguments for preventing this kind of implicit casting between non-same_kind dtypes, but I argue that because the operation is in-place, then I (as the programmer) am explicitly stating that I desire to utilize the current array to store the results of the operation, dtype and all. Obviously, we can't completely turn off this rule (for example, an in-place addition between integer array and a datetime64 makes no sense), but surely there is some sort of happy medium that would allow these sort of operations to take place? > > > > Lastly, if it is determined that it is desirable to allow in-place operations to continue working like they have before, I would like to see such a fix in v1.7 because if it isn't in 1.7, then other libraries (such as matplotlib, where this issue was first found) would have to change their code anyway just to be compatible with numpy. > > I agree that in-place operations should allow different casting rules. There are different opinions on this, of course, but generally this is how NumPy has worked in the past. > > We did decide to change the default casting rule to "same_kind" but making an exception for in-place seems reasonable. > > I think that in these cases same_kind will flag what are most likely programming errors and sloppy code. 
It is easy to be explicit and doing so will make the code more readable because it will be immediately obvious what the multiplicand is without the need to recall what the numpy casting rules are in this exceptional case. IISTR several mentions of this before (Gael?), and in some of those cases it turned out that bugs were being turned up. Catching bugs with minimal effort is a good thing. > > Chuck > > > True, it is quite likely to be a programming error, but then again, there are many cases where it isn't. Is the problem strictly that we are trying to downcast the float to an int, or is it that we are trying to downcast to a lower precision? Is there a way for one to explicitly relax the same_kind restriction? > > I think the problem is down casting across kinds, with the result that floats are truncated and the imaginary parts of imaginaries might be discarded. That is, the value, not just the precision, of the rhs changes. So I'd favor an explicit cast in code like this, i.e., cast the rhs to an integer. > > It is true that this forces downstream to code up to a higher standard, but I don't see that as a bad thing, especially if it exposes bugs. And it isn't difficult to fix. Shouldn't we be issuing a warning, though? Even if the desire is to change the casting rules? The fact that multiple codes are breaking and need to be "upgraded" seems like a hard thing to require of someone going straight from 1.6 to 1.7. That's what I'm opposed to. All of these efforts move NumPy to its use as a library instead of an interactive "environment" where it started which is a good direction to move, but managing this move in the context of a very large user-community is the challenge we have. -Travis > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Sep 18 15:12:32 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 18 Sep 2012 13:12:32 -0600 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: Message-ID: On Tue, Sep 18, 2012 at 1:08 PM, Travis Oliphant wrote: > > On Sep 18, 2012, at 1:47 PM, Charles R Harris wrote: > > > > On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root wrote: > >> >> >> On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> >>> >>> On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant wrote: >>> >>>> >>>> On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: >>>> >>>> > Consider the following code: >>>> > >>>> > import numpy as np >>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>> > a *= float(255) / 15 >>>> > >>>> > In v1.6.x, this yields: >>>> > array([17, 34, 51, 68, 85], dtype=int16) >>>> > >>>> > But in master, this throws an exception about failing to cast via >>>> same_kind. >>>> > >>>> > Note that numpy was smart about this operation before, consider: >>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>> > a *= float(128) / 256 >>>> >>>> > yields: >>>> > array([0, 1, 1, 2, 2], dtype=int16) >>>> > >>>> > Of course, this is different than if one does it in a non-in-place >>>> manner: >>>> > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 >>>> > >>>> > which yields an array with floating point dtype in both versions. 
I >>>> can appreciate the arguments for preventing this kind of implicit casting >>>> between non-same_kind dtypes, but I argue that because the operation is >>>> in-place, then I (as the programmer) am explicitly stating that I desire to >>>> utilize the current array to store the results of the operation, dtype and >>>> all. Obviously, we can't completely turn off this rule (for example, an >>>> in-place addition between integer array and a datetime64 makes no sense), >>>> but surely there is some sort of happy medium that would allow these sort >>>> of operations to take place? >>>> > >>>> > Lastly, if it is determined that it is desirable to allow in-place >>>> operations to continue working like they have before, I would like to see >>>> such a fix in v1.7 because if it isn't in 1.7, then other libraries (such >>>> as matplotlib, where this issue was first found) would have to change their >>>> code anyway just to be compatible with numpy. >>>> >>>> I agree that in-place operations should allow different casting rules. >>>> There are different opinions on this, of course, but generally this is how >>>> NumPy has worked in the past. >>>> >>>> We did decide to change the default casting rule to "same_kind" but >>>> making an exception for in-place seems reasonable. >>>> >>> >>> I think that in these cases same_kind will flag what are most likely >>> programming errors and sloppy code. It is easy to be explicit and doing so >>> will make the code more readable because it will be immediately obvious >>> what the multiplicand is without the need to recall what the numpy casting >>> rules are in this exceptional case. IISTR several mentions of this before >>> (Gael?), and in some of those cases it turned out that bugs were being >>> turned up. Catching bugs with minimal effort is a good thing. >>> >>> Chuck >>> >>> >> True, it is quite likely to be a programming error, but then again, there >> are many cases where it isn't. Is the problem strictly that we are trying >> to downcast the float to an int, or is it that we are trying to downcast to >> a lower precision? Is there a way for one to explicitly relax the >> same_kind restriction? >> > > I think the problem is down casting across kinds, with the result that > floats are truncated and the imaginary parts of imaginaries might be > discarded. That is, the value, not just the precision, of the rhs changes. > So I'd favor an explicit cast in code like this, i.e., cast the rhs to an > integer. > > It is true that this forces downstream to code up to a higher standard, > but I don't see that as a bad thing, especially if it exposes bugs. And it > isn't difficult to fix. > > > Shouldn't we be issuing a warning, though? Even if the desire is to > change the casting rules? The fact that multiple codes are breaking and > need to be "upgraded" seems like a hard thing to require of someone going > straight from 1.6 to 1.7. That's what I'm opposed to. > I think a warning would do just as well. I'd tend to regard the broken codes as already broken, but that's just me ;) > > All of these efforts move NumPy to its use as a library instead of an > interactive "environment" where it started which is a good direction to > move, but managing this move in the context of a very large user-community > is the challenge we have. > Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
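For reference, the rule being debated here can be probed directly with np.can_cast; the dtype pair below mirrors the int16 in-place multiply from this thread, with expected results shown as comments:

import numpy as np

print(np.can_cast(np.float64, np.int16, casting='same_kind'))   # False: crosses kinds
print(np.can_cast(np.int16, np.float64, casting='same_kind'))   # True: widening is fine
print(np.can_cast(np.float64, np.int16, casting='unsafe'))      # True: explicit opt-out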
URL: From ben.root at ou.edu Tue Sep 18 15:13:49 2012 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 18 Sep 2012 15:13:49 -0400 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: Message-ID: On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris wrote: > > > On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root wrote: > >> >> >> On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> >>> >>> On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant wrote: >>> >>>> >>>> On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: >>>> >>>> > Consider the following code: >>>> > >>>> > import numpy as np >>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>> > a *= float(255) / 15 >>>> > >>>> > In v1.6.x, this yields: >>>> > array([17, 34, 51, 68, 85], dtype=int16) >>>> > >>>> > But in master, this throws an exception about failing to cast via >>>> same_kind. >>>> > >>>> > Note that numpy was smart about this operation before, consider: >>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>> > a *= float(128) / 256 >>>> >>>> > yields: >>>> > array([0, 1, 1, 2, 2], dtype=int16) >>>> > >>>> > Of course, this is different than if one does it in a non-in-place >>>> manner: >>>> > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 >>>> > >>>> > which yields an array with floating point dtype in both versions. I >>>> can appreciate the arguments for preventing this kind of implicit casting >>>> between non-same_kind dtypes, but I argue that because the operation is >>>> in-place, then I (as the programmer) am explicitly stating that I desire to >>>> utilize the current array to store the results of the operation, dtype and >>>> all. Obviously, we can't completely turn off this rule (for example, an >>>> in-place addition between integer array and a datetime64 makes no sense), >>>> but surely there is some sort of happy medium that would allow these sort >>>> of operations to take place? >>>> > >>>> > Lastly, if it is determined that it is desirable to allow in-place >>>> operations to continue working like they have before, I would like to see >>>> such a fix in v1.7 because if it isn't in 1.7, then other libraries (such >>>> as matplotlib, where this issue was first found) would have to change their >>>> code anyway just to be compatible with numpy. >>>> >>>> I agree that in-place operations should allow different casting rules. >>>> There are different opinions on this, of course, but generally this is how >>>> NumPy has worked in the past. >>>> >>>> We did decide to change the default casting rule to "same_kind" but >>>> making an exception for in-place seems reasonable. >>>> >>> >>> I think that in these cases same_kind will flag what are most likely >>> programming errors and sloppy code. It is easy to be explicit and doing so >>> will make the code more readable because it will be immediately obvious >>> what the multiplicand is without the need to recall what the numpy casting >>> rules are in this exceptional case. IISTR several mentions of this before >>> (Gael?), and in some of those cases it turned out that bugs were being >>> turned up. Catching bugs with minimal effort is a good thing. >>> >>> Chuck >>> >>> >> True, it is quite likely to be a programming error, but then again, there >> are many cases where it isn't. Is the problem strictly that we are trying >> to downcast the float to an int, or is it that we are trying to downcast to >> a lower precision? 
Is there a way for one to explicitly relax the >> same_kind restriction? >> > > I think the problem is down casting across kinds, with the result that > floats are truncated and the imaginary parts of imaginaries might be > discarded. That is, the value, not just the precision, of the rhs changes. > So I'd favor an explicit cast in code like this, i.e., cast the rhs to an > integer. > > It is true that this forces downstream to code up to a higher standard, > but I don't see that as a bad thing, especially if it exposes bugs. And it > isn't difficult to fix. > > Chuck > > Mind you, in my case, casting the rhs as an integer before doing the multiplication would be a bug, since our value for the rhs is usually between zero and one. Multiplying first by the integer numerator before dividing by the integer denominator would likely cause issues with overflowing the 16 bit integer. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Sep 18 15:19:22 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 18 Sep 2012 21:19:22 +0200 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: Message-ID: On Tue, Sep 18, 2012 at 9:13 PM, Benjamin Root wrote: > > > On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root wrote: >> >>> >>> >>> On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris < >>> charlesr.harris at gmail.com> wrote: >>> >>>> >>>> >>>> On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant wrote: >>>> >>>>> >>>>> On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: >>>>> >>>>> > Consider the following code: >>>>> > >>>>> > import numpy as np >>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>>> > a *= float(255) / 15 >>>>> > >>>>> > In v1.6.x, this yields: >>>>> > array([17, 34, 51, 68, 85], dtype=int16) >>>>> > >>>>> > But in master, this throws an exception about failing to cast via >>>>> same_kind. >>>>> > >>>>> > Note that numpy was smart about this operation before, consider: >>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>>> > a *= float(128) / 256 >>>>> >>>>> > yields: >>>>> > array([0, 1, 1, 2, 2], dtype=int16) >>>>> > >>>>> > Of course, this is different than if one does it in a non-in-place >>>>> manner: >>>>> > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 >>>>> > >>>>> > which yields an array with floating point dtype in both versions. I >>>>> can appreciate the arguments for preventing this kind of implicit casting >>>>> between non-same_kind dtypes, but I argue that because the operation is >>>>> in-place, then I (as the programmer) am explicitly stating that I desire to >>>>> utilize the current array to store the results of the operation, dtype and >>>>> all. Obviously, we can't completely turn off this rule (for example, an >>>>> in-place addition between integer array and a datetime64 makes no sense), >>>>> but surely there is some sort of happy medium that would allow these sort >>>>> of operations to take place? >>>>> > >>>>> > Lastly, if it is determined that it is desirable to allow in-place >>>>> operations to continue working like they have before, I would like to see >>>>> such a fix in v1.7 because if it isn't in 1.7, then other libraries (such >>>>> as matplotlib, where this issue was first found) would have to change their >>>>> code anyway just to be compatible with numpy. 
>>>>> >>>>> I agree that in-place operations should allow different casting rules. >>>>> There are different opinions on this, of course, but generally this is how >>>>> NumPy has worked in the past. >>>>> >>>>> We did decide to change the default casting rule to "same_kind" but >>>>> making an exception for in-place seems reasonable. >>>>> >>>> >>>> I think that in these cases same_kind will flag what are most likely >>>> programming errors and sloppy code. It is easy to be explicit and doing so >>>> will make the code more readable because it will be immediately obvious >>>> what the multiplicand is without the need to recall what the numpy casting >>>> rules are in this exceptional case. IISTR several mentions of this before >>>> (Gael?), and in some of those cases it turned out that bugs were being >>>> turned up. Catching bugs with minimal effort is a good thing. >>>> >>>> Chuck >>>> >>>> >>> True, it is quite likely to be a programming error, but then again, >>> there are many cases where it isn't. Is the problem strictly that we are >>> trying to downcast the float to an int, or is it that we are trying to >>> downcast to a lower precision? Is there a way for one to explicitly relax >>> the same_kind restriction? >>> >> >> I think the problem is down casting across kinds, with the result that >> floats are truncated and the imaginary parts of imaginaries might be >> discarded. That is, the value, not just the precision, of the rhs changes. >> So I'd favor an explicit cast in code like this, i.e., cast the rhs to an >> integer. >> >> It is true that this forces downstream to code up to a higher standard, >> but I don't see that as a bad thing, especially if it exposes bugs. And it >> isn't difficult to fix. >> >> Chuck >> >> > Mind you, in my case, casting the rhs as an integer before doing the > multiplication would be a bug, since our value for the rhs is usually > between zero and one. Multiplying first by the integer numerator before > dividing by the integer denominator would likely cause issues with > overflowing the 16 bit integer. > Then you'd have to do >>> a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>> np.multiply(a, 0.5, out=a, casting="unsafe") array([0, 1, 1, 2, 2], dtype=int16) Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Tue Sep 18 15:23:20 2012 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 18 Sep 2012 15:23:20 -0400 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: Message-ID: On Tue, Sep 18, 2012 at 3:19 PM, Ralf Gommers wrote: > > > On Tue, Sep 18, 2012 at 9:13 PM, Benjamin Root wrote: > >> >> >> On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> >>> >>> On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root wrote: >>> >>>> >>>> >>>> On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris < >>>> charlesr.harris at gmail.com> wrote: >>>> >>>>> >>>>> >>>>> On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant wrote: >>>>> >>>>>> >>>>>> On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: >>>>>> >>>>>> > Consider the following code: >>>>>> > >>>>>> > import numpy as np >>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>>>> > a *= float(255) / 15 >>>>>> > >>>>>> > In v1.6.x, this yields: >>>>>> > array([17, 34, 51, 68, 85], dtype=int16) >>>>>> > >>>>>> > But in master, this throws an exception about failing to cast via >>>>>> same_kind. 
>>>>>> > >>>>>> > Note that numpy was smart about this operation before, consider: >>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>>>> > a *= float(128) / 256 >>>>>> >>>>>> > yields: >>>>>> > array([0, 1, 1, 2, 2], dtype=int16) >>>>>> > >>>>>> > Of course, this is different than if one does it in a non-in-place >>>>>> manner: >>>>>> > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 >>>>>> > >>>>>> > which yields an array with floating point dtype in both versions. >>>>>> I can appreciate the arguments for preventing this kind of implicit >>>>>> casting between non-same_kind dtypes, but I argue that because the >>>>>> operation is in-place, then I (as the programmer) am explicitly stating >>>>>> that I desire to utilize the current array to store the results of the >>>>>> operation, dtype and all. Obviously, we can't completely turn off this >>>>>> rule (for example, an in-place addition between integer array and a >>>>>> datetime64 makes no sense), but surely there is some sort of happy medium >>>>>> that would allow these sort of operations to take place? >>>>>> > >>>>>> > Lastly, if it is determined that it is desirable to allow in-place >>>>>> operations to continue working like they have before, I would like to see >>>>>> such a fix in v1.7 because if it isn't in 1.7, then other libraries (such >>>>>> as matplotlib, where this issue was first found) would have to change their >>>>>> code anyway just to be compatible with numpy. >>>>>> >>>>>> I agree that in-place operations should allow different casting >>>>>> rules. There are different opinions on this, of course, but generally this >>>>>> is how NumPy has worked in the past. >>>>>> >>>>>> We did decide to change the default casting rule to "same_kind" but >>>>>> making an exception for in-place seems reasonable. >>>>>> >>>>> >>>>> I think that in these cases same_kind will flag what are most likely >>>>> programming errors and sloppy code. It is easy to be explicit and doing so >>>>> will make the code more readable because it will be immediately obvious >>>>> what the multiplicand is without the need to recall what the numpy casting >>>>> rules are in this exceptional case. IISTR several mentions of this before >>>>> (Gael?), and in some of those cases it turned out that bugs were being >>>>> turned up. Catching bugs with minimal effort is a good thing. >>>>> >>>>> Chuck >>>>> >>>>> >>>> True, it is quite likely to be a programming error, but then again, >>>> there are many cases where it isn't. Is the problem strictly that we are >>>> trying to downcast the float to an int, or is it that we are trying to >>>> downcast to a lower precision? Is there a way for one to explicitly relax >>>> the same_kind restriction? >>>> >>> >>> I think the problem is down casting across kinds, with the result that >>> floats are truncated and the imaginary parts of imaginaries might be >>> discarded. That is, the value, not just the precision, of the rhs changes. >>> So I'd favor an explicit cast in code like this, i.e., cast the rhs to an >>> integer. >>> >>> It is true that this forces downstream to code up to a higher standard, >>> but I don't see that as a bad thing, especially if it exposes bugs. And it >>> isn't difficult to fix. >>> >>> Chuck >>> >>> >> Mind you, in my case, casting the rhs as an integer before doing the >> multiplication would be a bug, since our value for the rhs is usually >> between zero and one. 
Multiplying first by the integer numerator before >> dividing by the integer denominator would likely cause issues with >> overflowing the 16 bit integer. >> > > Then you'd have to do > > > >>> a = np.array([1, 2, 3, 4, 5], dtype=np.int16) > >>> np.multiply(a, 0.5, out=a, casting="unsafe") > > array([0, 1, 1, 2, 2], dtype=int16) > > Ralf > > That is exactly what I am looking for! When did the "casting" kwarg come about? I am unfamiliar with it. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Sep 18 15:24:56 2012 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 18 Sep 2012 21:24:56 +0200 Subject: [Numpy-discussion] numpy.ma.MaskedArray.min() makes a copy? In-Reply-To: <5058C08B.9020608@hawaii.edu> References: <5058C08B.9020608@hawaii.edu> Message-ID: <1347996296.6880.2.camel@sebastian-laptop> On Tue, 2012-09-18 at 08:42 -1000, Eric Firing wrote: > On 2012/09/18 7:40 AM, Benjamin Root wrote: > > > > > > On Fri, Sep 7, 2012 at 12:05 PM, Nathaniel Smith > > wrote: > > > > On 7 Sep 2012 14:38, "Benjamin Root" > > wrote: > > > > > > An issue just reported on the matplotlib-users list involved a > > user who ran out of memory while attempting to do an imshow() on a > > large array. While this wouldn't be totally unexpected, the user's > > traceback shows that they ran out of memory before any actual > > building of the image occurred. Memory usage sky-rocketed when > > imshow() attempted to determine the min and max of the image. The > > input data was a masked array, and it appears that the > > implementation of min() for masked arrays goes something like this > > (paraphrasing here): > > > > > > obj.filled(inf).min() > > > > > > The idea is that any masked element is set to the largest > > possible value for their dtype in a copied array of itself, and then > > a min() is performed on that copied array. I am assuming that max() > > does the same thing. > > > > > > Can this be done differently/more efficiently? If the "filled" > > approach has to be done, maybe it would be a good idea to make the > > copy in chunks instead of all at once? Ideally, it would be nice to > > avoid the copying altogether and utilize some of the special > > iterators that Mark Weibe created last year. > > > > I think what you're looking for is where= support for ufunc.reduce. > > This isn't implemented yet but at least it's straightforward in > > principle... otherwise I don't know anything better than > > reimplementing .min() by hand. > > > > -n > > > > > > > > Yes, it was the where= support that I was thinking of. I take it that > > it was pulled out of the 1.7 branch with the rest of the NA stuff? > > The where= support was left in: > http://docs.scipy.org/doc/numpy/reference/ufuncs.html > It seems though that the keyword argument is still missing from the ufunc help (`help(np.add)` and individual `np.info(np.add)`) though. > See also get_ufunc_arguments in ufunc_object.c. 
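For an elementwise ufunc call, the where= support that was left in behaves roughly as below; the arrays are illustrative, and the output is preinitialized because slots where the condition is False are left untouched:

import numpy as np

data = np.array([5.0, 2.0, 7.0, 1.0])
valid = np.array([True, False, True, True])
out = np.empty_like(data)
out.fill(np.inf)
# Only positions where valid is True are computed and written.
np.minimum(data, 3.0, out=out, where=valid)
print(out)   # [  3.  inf   3.   1.]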
> > Eric > > > > > > Ben Root > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Tue Sep 18 15:25:56 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 18 Sep 2012 13:25:56 -0600 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: Message-ID: On Tue, Sep 18, 2012 at 1:13 PM, Benjamin Root wrote: > > > On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root wrote: >> >>> >>> >>> On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris < >>> charlesr.harris at gmail.com> wrote: >>> >>>> >>>> >>>> On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant wrote: >>>> >>>>> >>>>> On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: >>>>> >>>>> > Consider the following code: >>>>> > >>>>> > import numpy as np >>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>>> > a *= float(255) / 15 >>>>> > >>>>> > In v1.6.x, this yields: >>>>> > array([17, 34, 51, 68, 85], dtype=int16) >>>>> > >>>>> > But in master, this throws an exception about failing to cast via >>>>> same_kind. >>>>> > >>>>> > Note that numpy was smart about this operation before, consider: >>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>>> > a *= float(128) / 256 >>>>> >>>>> > yields: >>>>> > array([0, 1, 1, 2, 2], dtype=int16) >>>>> > >>>>> > Of course, this is different than if one does it in a non-in-place >>>>> manner: >>>>> > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 >>>>> > >>>>> > which yields an array with floating point dtype in both versions. I >>>>> can appreciate the arguments for preventing this kind of implicit casting >>>>> between non-same_kind dtypes, but I argue that because the operation is >>>>> in-place, then I (as the programmer) am explicitly stating that I desire to >>>>> utilize the current array to store the results of the operation, dtype and >>>>> all. Obviously, we can't completely turn off this rule (for example, an >>>>> in-place addition between integer array and a datetime64 makes no sense), >>>>> but surely there is some sort of happy medium that would allow these sort >>>>> of operations to take place? >>>>> > >>>>> > Lastly, if it is determined that it is desirable to allow in-place >>>>> operations to continue working like they have before, I would like to see >>>>> such a fix in v1.7 because if it isn't in 1.7, then other libraries (such >>>>> as matplotlib, where this issue was first found) would have to change their >>>>> code anyway just to be compatible with numpy. >>>>> >>>>> I agree that in-place operations should allow different casting rules. >>>>> There are different opinions on this, of course, but generally this is how >>>>> NumPy has worked in the past. >>>>> >>>>> We did decide to change the default casting rule to "same_kind" but >>>>> making an exception for in-place seems reasonable. >>>>> >>>> >>>> I think that in these cases same_kind will flag what are most likely >>>> programming errors and sloppy code. 
It is easy to be explicit and doing so >>>> will make the code more readable because it will be immediately obvious >>>> what the multiplicand is without the need to recall what the numpy casting >>>> rules are in this exceptional case. IISTR several mentions of this before >>>> (Gael?), and in some of those cases it turned out that bugs were being >>>> turned up. Catching bugs with minimal effort is a good thing. >>>> >>>> Chuck >>>> >>>> >>> True, it is quite likely to be a programming error, but then again, >>> there are many cases where it isn't. Is the problem strictly that we are >>> trying to downcast the float to an int, or is it that we are trying to >>> downcast to a lower precision? Is there a way for one to explicitly relax >>> the same_kind restriction? >>> >> >> I think the problem is down casting across kinds, with the result that >> floats are truncated and the imaginary parts of imaginaries might be >> discarded. That is, the value, not just the precision, of the rhs changes. >> So I'd favor an explicit cast in code like this, i.e., cast the rhs to an >> integer. >> >> It is true that this forces downstream to code up to a higher standard, >> but I don't see that as a bad thing, especially if it exposes bugs. And it >> isn't difficult to fix. >> >> Chuck >> >> > Mind you, in my case, casting the rhs as an integer before doing the > multiplication would be a bug, since our value for the rhs is usually > between zero and one. Multiplying first by the integer numerator before > dividing by the integer denominator would likely cause issues with > overflowing the 16 bit integer. > > For the case in point I'd do In [1]: a = np.array([1, 2, 3, 4, 5], dtype=np.int16) In [2]: a //= 2 In [3]: a Out[3]: array([0, 1, 1, 2, 2], dtype=int16) Although I expect you would want something different in practice. But the current code already looks fragile to me and I think it is a good thing you are taking a closer look at it. If you really intend going through a float, then it should be something like a = (a*(float(128)/256)).astype(int16) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Tue Sep 18 15:35:10 2012 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 18 Sep 2012 15:35:10 -0400 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: Message-ID: On Tue, Sep 18, 2012 at 3:25 PM, Charles R Harris wrote: > > > On Tue, Sep 18, 2012 at 1:13 PM, Benjamin Root wrote: > >> >> >> On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> >>> >>> On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root wrote: >>> >>>> >>>> >>>> On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris < >>>> charlesr.harris at gmail.com> wrote: >>>> >>>>> >>>>> >>>>> On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant wrote: >>>>> >>>>>> >>>>>> On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: >>>>>> >>>>>> > Consider the following code: >>>>>> > >>>>>> > import numpy as np >>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>>>> > a *= float(255) / 15 >>>>>> > >>>>>> > In v1.6.x, this yields: >>>>>> > array([17, 34, 51, 68, 85], dtype=int16) >>>>>> > >>>>>> > But in master, this throws an exception about failing to cast via >>>>>> same_kind. 
>>>>>> > >>>>>> > Note that numpy was smart about this operation before, consider: >>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>>>> > a *= float(128) / 256 >>>>>> >>>>>> > yields: >>>>>> > array([0, 1, 1, 2, 2], dtype=int16) >>>>>> > >>>>>> > Of course, this is different than if one does it in a non-in-place >>>>>> manner: >>>>>> > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 >>>>>> > >>>>>> > which yields an array with floating point dtype in both versions. >>>>>> I can appreciate the arguments for preventing this kind of implicit >>>>>> casting between non-same_kind dtypes, but I argue that because the >>>>>> operation is in-place, then I (as the programmer) am explicitly stating >>>>>> that I desire to utilize the current array to store the results of the >>>>>> operation, dtype and all. Obviously, we can't completely turn off this >>>>>> rule (for example, an in-place addition between integer array and a >>>>>> datetime64 makes no sense), but surely there is some sort of happy medium >>>>>> that would allow these sort of operations to take place? >>>>>> > >>>>>> > Lastly, if it is determined that it is desirable to allow in-place >>>>>> operations to continue working like they have before, I would like to see >>>>>> such a fix in v1.7 because if it isn't in 1.7, then other libraries (such >>>>>> as matplotlib, where this issue was first found) would have to change their >>>>>> code anyway just to be compatible with numpy. >>>>>> >>>>>> I agree that in-place operations should allow different casting >>>>>> rules. There are different opinions on this, of course, but generally this >>>>>> is how NumPy has worked in the past. >>>>>> >>>>>> We did decide to change the default casting rule to "same_kind" but >>>>>> making an exception for in-place seems reasonable. >>>>>> >>>>> >>>>> I think that in these cases same_kind will flag what are most likely >>>>> programming errors and sloppy code. It is easy to be explicit and doing so >>>>> will make the code more readable because it will be immediately obvious >>>>> what the multiplicand is without the need to recall what the numpy casting >>>>> rules are in this exceptional case. IISTR several mentions of this before >>>>> (Gael?), and in some of those cases it turned out that bugs were being >>>>> turned up. Catching bugs with minimal effort is a good thing. >>>>> >>>>> Chuck >>>>> >>>>> >>>> True, it is quite likely to be a programming error, but then again, >>>> there are many cases where it isn't. Is the problem strictly that we are >>>> trying to downcast the float to an int, or is it that we are trying to >>>> downcast to a lower precision? Is there a way for one to explicitly relax >>>> the same_kind restriction? >>>> >>> >>> I think the problem is down casting across kinds, with the result that >>> floats are truncated and the imaginary parts of imaginaries might be >>> discarded. That is, the value, not just the precision, of the rhs changes. >>> So I'd favor an explicit cast in code like this, i.e., cast the rhs to an >>> integer. >>> >>> It is true that this forces downstream to code up to a higher standard, >>> but I don't see that as a bad thing, especially if it exposes bugs. And it >>> isn't difficult to fix. >>> >>> Chuck >>> >>> >> Mind you, in my case, casting the rhs as an integer before doing the >> multiplication would be a bug, since our value for the rhs is usually >> between zero and one. 
Multiplying first by the integer numerator before >> dividing by the integer denominator would likely cause issues with >> overflowing the 16 bit integer. >> >> > For the case in point I'd do > > In [1]: a = np.array([1, 2, 3, 4, 5], dtype=np.int16) > > In [2]: a //= 2 > > In [3]: a > Out[3]: array([0, 1, 1, 2, 2], dtype=int16) > > Although I expect you would want something different in practice. But the > current code already looks fragile to me and I think it is a good thing you > are taking a closer look at it. If you really intend going through a float, > then it should be something like > > a = (a*(float(128)/256)).astype(int16) > > Chuck > > And thereby losing the memory benefit of an in-place multiplication? That is sort of the point of all this. We are using 16 bit integers because we wanted to be as efficient as possible and didn't need anything larger. Note, that is what we changed the code to, I am just wondering if we are being too cautious. The casting kwarg looks to be what I might want, though it isn't as clean as just writing an "*=" statement. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Sep 18 15:44:25 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 18 Sep 2012 13:44:25 -0600 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: Message-ID: On Tue, Sep 18, 2012 at 1:35 PM, Benjamin Root wrote: > > > On Tue, Sep 18, 2012 at 3:25 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Tue, Sep 18, 2012 at 1:13 PM, Benjamin Root wrote: >> >>> >>> >>> On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris < >>> charlesr.harris at gmail.com> wrote: >>> >>>> >>>> >>>> On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root wrote: >>>> >>>>> >>>>> >>>>> On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris < >>>>> charlesr.harris at gmail.com> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant >>>>> > wrote: >>>>>> >>>>>>> >>>>>>> On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: >>>>>>> >>>>>>> > Consider the following code: >>>>>>> > >>>>>>> > import numpy as np >>>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>>>>> > a *= float(255) / 15 >>>>>>> > >>>>>>> > In v1.6.x, this yields: >>>>>>> > array([17, 34, 51, 68, 85], dtype=int16) >>>>>>> > >>>>>>> > But in master, this throws an exception about failing to cast via >>>>>>> same_kind. >>>>>>> > >>>>>>> > Note that numpy was smart about this operation before, consider: >>>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>>>>> > a *= float(128) / 256 >>>>>>> >>>>>>> > yields: >>>>>>> > array([0, 1, 1, 2, 2], dtype=int16) >>>>>>> > >>>>>>> > Of course, this is different than if one does it in a non-in-place >>>>>>> manner: >>>>>>> > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 >>>>>>> > >>>>>>> > which yields an array with floating point dtype in both versions. >>>>>>> I can appreciate the arguments for preventing this kind of implicit >>>>>>> casting between non-same_kind dtypes, but I argue that because the >>>>>>> operation is in-place, then I (as the programmer) am explicitly stating >>>>>>> that I desire to utilize the current array to store the results of the >>>>>>> operation, dtype and all. 
Obviously, we can't completely turn off this >>>>>>> rule (for example, an in-place addition between integer array and a >>>>>>> datetime64 makes no sense), but surely there is some sort of happy medium >>>>>>> that would allow these sort of operations to take place? >>>>>>> > >>>>>>> > Lastly, if it is determined that it is desirable to allow in-place >>>>>>> operations to continue working like they have before, I would like to see >>>>>>> such a fix in v1.7 because if it isn't in 1.7, then other libraries (such >>>>>>> as matplotlib, where this issue was first found) would have to change their >>>>>>> code anyway just to be compatible with numpy. >>>>>>> >>>>>>> I agree that in-place operations should allow different casting >>>>>>> rules. There are different opinions on this, of course, but generally this >>>>>>> is how NumPy has worked in the past. >>>>>>> >>>>>>> We did decide to change the default casting rule to "same_kind" but >>>>>>> making an exception for in-place seems reasonable. >>>>>>> >>>>>> >>>>>> I think that in these cases same_kind will flag what are most likely >>>>>> programming errors and sloppy code. It is easy to be explicit and doing so >>>>>> will make the code more readable because it will be immediately obvious >>>>>> what the multiplicand is without the need to recall what the numpy casting >>>>>> rules are in this exceptional case. IISTR several mentions of this before >>>>>> (Gael?), and in some of those cases it turned out that bugs were being >>>>>> turned up. Catching bugs with minimal effort is a good thing. >>>>>> >>>>>> Chuck >>>>>> >>>>>> >>>>> True, it is quite likely to be a programming error, but then again, >>>>> there are many cases where it isn't. Is the problem strictly that we are >>>>> trying to downcast the float to an int, or is it that we are trying to >>>>> downcast to a lower precision? Is there a way for one to explicitly relax >>>>> the same_kind restriction? >>>>> >>>> >>>> I think the problem is down casting across kinds, with the result that >>>> floats are truncated and the imaginary parts of imaginaries might be >>>> discarded. That is, the value, not just the precision, of the rhs changes. >>>> So I'd favor an explicit cast in code like this, i.e., cast the rhs to an >>>> integer. >>>> >>>> It is true that this forces downstream to code up to a higher standard, >>>> but I don't see that as a bad thing, especially if it exposes bugs. And it >>>> isn't difficult to fix. >>>> >>>> Chuck >>>> >>>> >>> Mind you, in my case, casting the rhs as an integer before doing the >>> multiplication would be a bug, since our value for the rhs is usually >>> between zero and one. Multiplying first by the integer numerator before >>> dividing by the integer denominator would likely cause issues with >>> overflowing the 16 bit integer. >>> >>> >> For the case in point I'd do >> >> In [1]: a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >> >> In [2]: a //= 2 >> >> In [3]: a >> Out[3]: array([0, 1, 1, 2, 2], dtype=int16) >> >> Although I expect you would want something different in practice. But the >> current code already looks fragile to me and I think it is a good thing you >> are taking a closer look at it. If you really intend going through a float, >> then it should be something like >> >> a = (a*(float(128)/256)).astype(int16) >> >> Chuck >> >> > And thereby losing the memory benefit of an in-place multiplication? > What makes you think you are getting that? 
I'd have to check the numpy C source, but I expect the multiplication is handled just as I wrote it out. I don't recall any loops that handle mixed types likes that. I'd like to see some, though, scaling integers is a common problem. > That is sort of the point of all this. We are using 16 bit integers > because we wanted to be as efficient as possible and didn't need anything > larger. Note, that is what we changed the code to, I am just wondering if > we are being too cautious. The casting kwarg looks to be what I might > want, though it isn't as clean as just writing an "*=" statement. > > I think even there you will have an intermediate float array followed by a cast. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Tue Sep 18 15:55:16 2012 From: efiring at hawaii.edu (Eric Firing) Date: Tue, 18 Sep 2012 09:55:16 -1000 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: Message-ID: <5058D1A4.8060200@hawaii.edu> On 2012/09/18 9:25 AM, Charles R Harris wrote: > > > On Tue, Sep 18, 2012 at 1:13 PM, Benjamin Root > wrote: > > > > On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris > > wrote: > > > > On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root > wrote: > > > > On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris > > wrote: > > > > On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant > > wrote: > > > On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: > > > Consider the following code: > > > > import numpy as np > > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) > > a *= float(255) / 15 > > > > In v1.6.x, this yields: > > array([17, 34, 51, 68, 85], dtype=int16) > > > > But in master, this throws an exception about > failing to cast via same_kind. > > > > Note that numpy was smart about this operation > before, consider: > > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) > > a *= float(128) / 256 > > > yields: > > array([0, 1, 1, 2, 2], dtype=int16) > > > > Of course, this is different than if one does it > in a non-in-place manner: > > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 > > > > which yields an array with floating point dtype > in both versions. I can appreciate the arguments > for preventing this kind of implicit casting between > non-same_kind dtypes, but I argue that because the > operation is in-place, then I (as the programmer) am > explicitly stating that I desire to utilize the > current array to store the results of the operation, > dtype and all. Obviously, we can't completely turn > off this rule (for example, an in-place addition > between integer array and a datetime64 makes no > sense), but surely there is some sort of happy > medium that would allow these sort of operations to > take place? > > > > Lastly, if it is determined that it is desirable > to allow in-place operations to continue working > like they have before, I would like to see such a > fix in v1.7 because if it isn't in 1.7, then other > libraries (such as matplotlib, where this issue was > first found) would have to change their code anyway > just to be compatible with numpy. > > I agree that in-place operations should allow > different casting rules. There are different > opinions on this, of course, but generally this is > how NumPy has worked in the past. > > We did decide to change the default casting rule to > "same_kind" but making an exception for in-place > seems reasonable. 
> > > I think that in these cases same_kind will flag what are > most likely programming errors and sloppy code. It is > easy to be explicit and doing so will make the code more > readable because it will be immediately obvious what the > multiplicand is without the need to recall what the > numpy casting rules are in this exceptional case. IISTR > several mentions of this before (Gael?), and in some of > those cases it turned out that bugs were being turned > up. Catching bugs with minimal effort is a good thing. > > Chuck > > > True, it is quite likely to be a programming error, but then > again, there are many cases where it isn't. Is the problem > strictly that we are trying to downcast the float to an int, > or is it that we are trying to downcast to a lower > precision? Is there a way for one to explicitly relax the > same_kind restriction? > > > I think the problem is down casting across kinds, with the > result that floats are truncated and the imaginary parts of > imaginaries might be discarded. That is, the value, not just the > precision, of the rhs changes. So I'd favor an explicit cast in > code like this, i.e., cast the rhs to an integer. > > It is true that this forces downstream to code up to a higher > standard, but I don't see that as a bad thing, especially if it > exposes bugs. And it isn't difficult to fix. > > Chuck > > > Mind you, in my case, casting the rhs as an integer before doing the > multiplication would be a bug, since our value for the rhs is > usually between zero and one. Multiplying first by the integer > numerator before dividing by the integer denominator would likely > cause issues with overflowing the 16 bit integer. > > > For the case in point I'd do > > In [1]: a = np.array([1, 2, 3, 4, 5], dtype=np.int16) > > In [2]: a //= 2 > > In [3]: a > Out[3]: array([0, 1, 1, 2, 2], dtype=int16) > > Although I expect you would want something different in practice. But > the current code already looks fragile to me and I think it is a good > thing you are taking a closer look at it. If you really intend going > through a float, then it should be something like > > a = (a*(float(128)/256)).astype(int16) That's actually what we had been doing for years until a seemingly harmless "optimization" snuck in via an unrelated PR. Fortunately, Ben caught it after only a few days. Eric > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From njs at pobox.com Tue Sep 18 16:18:25 2012 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 18 Sep 2012 21:18:25 +0100 Subject: [Numpy-discussion] numpy.ma.MaskedArray.min() makes a copy? In-Reply-To: References: Message-ID: On 18 Sep 2012 18:40, "Benjamin Root" wrote: > > > > On Fri, Sep 7, 2012 at 12:05 PM, Nathaniel Smith wrote: >> >> On 7 Sep 2012 14:38, "Benjamin Root" wrote: >> > >> > An issue just reported on the matplotlib-users list involved a user who ran out of memory while attempting to do an imshow() on a large array. While this wouldn't be totally unexpected, the user's traceback shows that they ran out of memory before any actual building of the image occurred. Memory usage sky-rocketed when imshow() attempted to determine the min and max of the image. 
The input data was a masked array, and it appears that the implementation of min() for masked arrays goes something like this (paraphrasing here): >> > >> > obj.filled(inf).min() >> > >> > The idea is that any masked element is set to the largest possible value for their dtype in a copied array of itself, and then a min() is performed on that copied array. I am assuming that max() does the same thing. >> > >> > Can this be done differently/more efficiently? If the "filled" approach has to be done, maybe it would be a good idea to make the copy in chunks instead of all at once? Ideally, it would be nice to avoid the copying altogether and utilize some of the special iterators that Mark Weibe created last year. >> >> I think what you're looking for is where= support for ufunc.reduce. This isn't implemented yet but at least it's straightforward in principle... otherwise I don't know anything better than reimplementing .min() by hand. >> >> -n >> >> > > Yes, it was the where= support that I was thinking of. I take it that it was pulled out of the 1.7 branch with the rest of the NA stuff? where= was left in, but it was only implemented for regular vectorized ufunc operations in the first place. Supporting it in reductions still needs to be written. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Tue Sep 18 16:33:10 2012 From: travis at continuum.io (Travis Oliphant) Date: Tue, 18 Sep 2012 15:33:10 -0500 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: Message-ID: On Sep 18, 2012, at 2:44 PM, Charles R Harris wrote: > > > On Tue, Sep 18, 2012 at 1:35 PM, Benjamin Root wrote: > > > On Tue, Sep 18, 2012 at 3:25 PM, Charles R Harris wrote: > > > On Tue, Sep 18, 2012 at 1:13 PM, Benjamin Root wrote: > > > On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris wrote: > > > On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root wrote: > > > On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris wrote: > > > On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant wrote: > > On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: > > > Consider the following code: > > > > import numpy as np > > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) > > a *= float(255) / 15 > > > > In v1.6.x, this yields: > > array([17, 34, 51, 68, 85], dtype=int16) > > > > But in master, this throws an exception about failing to cast via same_kind. > > > > Note that numpy was smart about this operation before, consider: > > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) > > a *= float(128) / 256 > > > yields: > > array([0, 1, 1, 2, 2], dtype=int16) > > > > Of course, this is different than if one does it in a non-in-place manner: > > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 > > > > which yields an array with floating point dtype in both versions. I can appreciate the arguments for preventing this kind of implicit casting between non-same_kind dtypes, but I argue that because the operation is in-place, then I (as the programmer) am explicitly stating that I desire to utilize the current array to store the results of the operation, dtype and all. Obviously, we can't completely turn off this rule (for example, an in-place addition between integer array and a datetime64 makes no sense), but surely there is some sort of happy medium that would allow these sort of operations to take place? 
> > > > Lastly, if it is determined that it is desirable to allow in-place operations to continue working like they have before, I would like to see such a fix in v1.7 because if it isn't in 1.7, then other libraries (such as matplotlib, where this issue was first found) would have to change their code anyway just to be compatible with numpy. > > I agree that in-place operations should allow different casting rules. There are different opinions on this, of course, but generally this is how NumPy has worked in the past. > > We did decide to change the default casting rule to "same_kind" but making an exception for in-place seems reasonable. > > I think that in these cases same_kind will flag what are most likely programming errors and sloppy code. It is easy to be explicit and doing so will make the code more readable because it will be immediately obvious what the multiplicand is without the need to recall what the numpy casting rules are in this exceptional case. IISTR several mentions of this before (Gael?), and in some of those cases it turned out that bugs were being turned up. Catching bugs with minimal effort is a good thing. > > Chuck > > > True, it is quite likely to be a programming error, but then again, there are many cases where it isn't. Is the problem strictly that we are trying to downcast the float to an int, or is it that we are trying to downcast to a lower precision? Is there a way for one to explicitly relax the same_kind restriction? > > I think the problem is down casting across kinds, with the result that floats are truncated and the imaginary parts of imaginaries might be discarded. That is, the value, not just the precision, of the rhs changes. So I'd favor an explicit cast in code like this, i.e., cast the rhs to an integer. > > It is true that this forces downstream to code up to a higher standard, but I don't see that as a bad thing, especially if it exposes bugs. And it isn't difficult to fix. > > Chuck > > > Mind you, in my case, casting the rhs as an integer before doing the multiplication would be a bug, since our value for the rhs is usually between zero and one. Multiplying first by the integer numerator before dividing by the integer denominator would likely cause issues with overflowing the 16 bit integer. > > > For the case in point I'd do > > In [1]: a = np.array([1, 2, 3, 4, 5], dtype=np.int16) > > In [2]: a //= 2 > > In [3]: a > Out[3]: array([0, 1, 1, 2, 2], dtype=int16) > > Although I expect you would want something different in practice. But the current code already looks fragile to me and I think it is a good thing you are taking a closer look at it. If you really intend going through a float, then it should be something like > > a = (a*(float(128)/256)).astype(int16) > > Chuck > > > And thereby losing the memory benefit of an in-place multiplication? > > What makes you think you are getting that? I'd have to check the numpy C source, but I expect the multiplication is handled just as I wrote it out. I don't recall any loops that handle mixed types likes that. I'd like to see some, though, scaling integers is a common problem. > > That is sort of the point of all this. We are using 16 bit integers because we wanted to be as efficient as possible and didn't need anything larger. Note, that is what we changed the code to, I am just wondering if we are being too cautious. The casting kwarg looks to be what I might want, though it isn't as clean as just writing an "*=" statement. 
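A minimal sketch of what that casting kwarg spelling looks like (assuming NumPy >= 1.6, where ufuncs accept out= and casting= keywords; the scale factor is the one from the example at the top of the thread):

import numpy as np

a = np.array([1, 2, 3, 4, 5], dtype=np.int16)

# State the intent explicitly: multiply by a float scale factor and put
# the result back into the existing int16 buffer, truncating as needed.
np.multiply(a, float(255) / 15, out=a, casting='unsafe')
print(a)   # [17 34 51 68 85], dtype is still int16

It is wordier than "a *= x", but the downcast is visible at the call site instead of being implied by the in-place operator.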
> > > I think even there you will have an intermediate float array followed by a cast. This is true, but it is done in chunks of a fixed size (controllable by a thread-local variable or keyword argument to the ufunc). How difficult would it be to change in-place operations back to the "unsafe" default? -Travis -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Sep 18 16:42:10 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 18 Sep 2012 14:42:10 -0600 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: Message-ID: On Tue, Sep 18, 2012 at 2:33 PM, Travis Oliphant wrote: > > On Sep 18, 2012, at 2:44 PM, Charles R Harris wrote: > > > > On Tue, Sep 18, 2012 at 1:35 PM, Benjamin Root wrote: > >> >> >> On Tue, Sep 18, 2012 at 3:25 PM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> >>> >>> On Tue, Sep 18, 2012 at 1:13 PM, Benjamin Root wrote: >>> >>>> >>>> >>>> On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris < >>>> charlesr.harris at gmail.com> wrote: >>>> >>>>> >>>>> >>>>> On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root wrote: >>>>> >>>>>> >>>>>> >>>>>> On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris < >>>>>> charlesr.harris at gmail.com> wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>> On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant < >>>>>>> travis at continuum.io> wrote: >>>>>>> >>>>>>>> >>>>>>>> On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: >>>>>>>> >>>>>>>> > Consider the following code: >>>>>>>> > >>>>>>>> > import numpy as np >>>>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>>>>>> > a *= float(255) / 15 >>>>>>>> > >>>>>>>> > In v1.6.x, this yields: >>>>>>>> > array([17, 34, 51, 68, 85], dtype=int16) >>>>>>>> > >>>>>>>> > But in master, this throws an exception about failing to cast via >>>>>>>> same_kind. >>>>>>>> > >>>>>>>> > Note that numpy was smart about this operation before, consider: >>>>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>>>>>> > a *= float(128) / 256 >>>>>>>> >>>>>>>> > yields: >>>>>>>> > array([0, 1, 1, 2, 2], dtype=int16) >>>>>>>> > >>>>>>>> > Of course, this is different than if one does it in a >>>>>>>> non-in-place manner: >>>>>>>> > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 >>>>>>>> > >>>>>>>> > which yields an array with floating point dtype in both versions. >>>>>>>> I can appreciate the arguments for preventing this kind of implicit >>>>>>>> casting between non-same_kind dtypes, but I argue that because the >>>>>>>> operation is in-place, then I (as the programmer) am explicitly stating >>>>>>>> that I desire to utilize the current array to store the results of the >>>>>>>> operation, dtype and all. Obviously, we can't completely turn off this >>>>>>>> rule (for example, an in-place addition between integer array and a >>>>>>>> datetime64 makes no sense), but surely there is some sort of happy medium >>>>>>>> that would allow these sort of operations to take place? >>>>>>>> > >>>>>>>> > Lastly, if it is determined that it is desirable to allow >>>>>>>> in-place operations to continue working like they have before, I would like >>>>>>>> to see such a fix in v1.7 because if it isn't in 1.7, then other libraries >>>>>>>> (such as matplotlib, where this issue was first found) would have to change >>>>>>>> their code anyway just to be compatible with numpy. >>>>>>>> >>>>>>>> I agree that in-place operations should allow different casting >>>>>>>> rules. 
There are different opinions on this, of course, but generally this >>>>>>>> is how NumPy has worked in the past. >>>>>>>> >>>>>>>> We did decide to change the default casting rule to "same_kind" but >>>>>>>> making an exception for in-place seems reasonable. >>>>>>>> >>>>>>> >>>>>>> I think that in these cases same_kind will flag what are most likely >>>>>>> programming errors and sloppy code. It is easy to be explicit and doing so >>>>>>> will make the code more readable because it will be immediately obvious >>>>>>> what the multiplicand is without the need to recall what the numpy casting >>>>>>> rules are in this exceptional case. IISTR several mentions of this before >>>>>>> (Gael?), and in some of those cases it turned out that bugs were being >>>>>>> turned up. Catching bugs with minimal effort is a good thing. >>>>>>> >>>>>>> Chuck >>>>>>> >>>>>>> >>>>>> True, it is quite likely to be a programming error, but then again, >>>>>> there are many cases where it isn't. Is the problem strictly that we are >>>>>> trying to downcast the float to an int, or is it that we are trying to >>>>>> downcast to a lower precision? Is there a way for one to explicitly relax >>>>>> the same_kind restriction? >>>>>> >>>>> >>>>> I think the problem is down casting across kinds, with the result that >>>>> floats are truncated and the imaginary parts of imaginaries might be >>>>> discarded. That is, the value, not just the precision, of the rhs changes. >>>>> So I'd favor an explicit cast in code like this, i.e., cast the rhs to an >>>>> integer. >>>>> >>>>> It is true that this forces downstream to code up to a higher >>>>> standard, but I don't see that as a bad thing, especially if it exposes >>>>> bugs. And it isn't difficult to fix. >>>>> >>>>> Chuck >>>>> >>>>> >>>> Mind you, in my case, casting the rhs as an integer before doing the >>>> multiplication would be a bug, since our value for the rhs is usually >>>> between zero and one. Multiplying first by the integer numerator before >>>> dividing by the integer denominator would likely cause issues with >>>> overflowing the 16 bit integer. >>>> >>>> >>> For the case in point I'd do >>> >>> In [1]: a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>> >>> In [2]: a //= 2 >>> >>> In [3]: a >>> Out[3]: array([0, 1, 1, 2, 2], dtype=int16) >>> >>> Although I expect you would want something different in practice. But >>> the current code already looks fragile to me and I think it is a good thing >>> you are taking a closer look at it. If you really intend going through a >>> float, then it should be something like >>> >>> a = (a*(float(128)/256)).astype(int16) >>> >>> Chuck >>> >>> >> And thereby losing the memory benefit of an in-place multiplication? >> > > What makes you think you are getting that? I'd have to check the numpy C > source, but I expect the multiplication is handled just as I wrote it out. > I don't recall any loops that handle mixed types likes that. I'd like to > see some, though, scaling integers is a common problem. > > > > >> That is sort of the point of all this. We are using 16 bit integers >> because we wanted to be as efficient as possible and didn't need anything >> larger. Note, that is what we changed the code to, I am just wondering if >> we are being too cautious. The casting kwarg looks to be what I might >> want, though it isn't as clean as just writing an "*=" statement. >> >> > I think even there you will have an intermediate float array followed by a > cast. 
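One way to avoid a full-size float temporary when scaling an integer array is to walk it a block at a time; a rough sketch follows (the helper name and chunk size are made up for illustration, this is not an existing NumPy routine, and it assumes a contiguous input array):

import numpy as np

def scale_int_inplace(a, factor, chunk=4096):
    # Illustrative only: scale an integer array in place by a float
    # factor, one block at a time, so the float temporary is at most
    # `chunk` elements instead of the full array size.
    flat = a.reshape(-1)              # a view (no copy) for a contiguous array
    for start in range(0, flat.size, chunk):
        block = flat[start:start + chunk]
        # explicit float multiply, then truncate back into the int buffer
        block[...] = (block * factor).astype(a.dtype)
    return a

a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
scale_int_inplace(a, float(255) / 15)
print(a)    # [17 34 51 68 85], still int16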
> > > This is true, but it is done in chunks of a fixed size (controllable by a > thread-local variable or keyword argument to the ufunc). > > How difficult would it be to change in-place operations back to the > "unsafe" default? > Probably not too difficult, but I think it would be a mistake. What keyword argument are you referring to? In the current case, I think what is wanted is a scaling function that will actually do things in place. The matplotlib folks would probably be happier with the result if they simply coded up a couple of small Cython routines to do that. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Tue Sep 18 16:52:05 2012 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 18 Sep 2012 16:52:05 -0400 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: Message-ID: On Tue, Sep 18, 2012 at 4:42 PM, Charles R Harris wrote: > > > On Tue, Sep 18, 2012 at 2:33 PM, Travis Oliphant wrote: > >> >> On Sep 18, 2012, at 2:44 PM, Charles R Harris wrote: >> >> >> >> On Tue, Sep 18, 2012 at 1:35 PM, Benjamin Root wrote: >> >>> >>> >>> On Tue, Sep 18, 2012 at 3:25 PM, Charles R Harris < >>> charlesr.harris at gmail.com> wrote: >>> >>>> >>>> >>>> On Tue, Sep 18, 2012 at 1:13 PM, Benjamin Root wrote: >>>> >>>>> >>>>> >>>>> On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris < >>>>> charlesr.harris at gmail.com> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>> On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris < >>>>>>> charlesr.harris at gmail.com> wrote: >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant < >>>>>>>> travis at continuum.io> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: >>>>>>>>> >>>>>>>>> > Consider the following code: >>>>>>>>> > >>>>>>>>> > import numpy as np >>>>>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>>>>>>> > a *= float(255) / 15 >>>>>>>>> > >>>>>>>>> > In v1.6.x, this yields: >>>>>>>>> > array([17, 34, 51, 68, 85], dtype=int16) >>>>>>>>> > >>>>>>>>> > But in master, this throws an exception about failing to cast >>>>>>>>> via same_kind. >>>>>>>>> > >>>>>>>>> > Note that numpy was smart about this operation before, consider: >>>>>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>>>>>>> > a *= float(128) / 256 >>>>>>>>> >>>>>>>>> > yields: >>>>>>>>> > array([0, 1, 1, 2, 2], dtype=int16) >>>>>>>>> > >>>>>>>>> > Of course, this is different than if one does it in a >>>>>>>>> non-in-place manner: >>>>>>>>> > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 >>>>>>>>> > >>>>>>>>> > which yields an array with floating point dtype in both >>>>>>>>> versions. I can appreciate the arguments for preventing this kind of >>>>>>>>> implicit casting between non-same_kind dtypes, but I argue that because the >>>>>>>>> operation is in-place, then I (as the programmer) am explicitly stating >>>>>>>>> that I desire to utilize the current array to store the results of the >>>>>>>>> operation, dtype and all. Obviously, we can't completely turn off this >>>>>>>>> rule (for example, an in-place addition between integer array and a >>>>>>>>> datetime64 makes no sense), but surely there is some sort of happy medium >>>>>>>>> that would allow these sort of operations to take place? 
>>>>>>>>> > >>>>>>>>> > Lastly, if it is determined that it is desirable to allow >>>>>>>>> in-place operations to continue working like they have before, I would like >>>>>>>>> to see such a fix in v1.7 because if it isn't in 1.7, then other libraries >>>>>>>>> (such as matplotlib, where this issue was first found) would have to change >>>>>>>>> their code anyway just to be compatible with numpy. >>>>>>>>> >>>>>>>>> I agree that in-place operations should allow different casting >>>>>>>>> rules. There are different opinions on this, of course, but generally this >>>>>>>>> is how NumPy has worked in the past. >>>>>>>>> >>>>>>>>> We did decide to change the default casting rule to "same_kind" >>>>>>>>> but making an exception for in-place seems reasonable. >>>>>>>>> >>>>>>>> >>>>>>>> I think that in these cases same_kind will flag what are most >>>>>>>> likely programming errors and sloppy code. It is easy to be explicit and >>>>>>>> doing so will make the code more readable because it will be immediately >>>>>>>> obvious what the multiplicand is without the need to recall what the numpy >>>>>>>> casting rules are in this exceptional case. IISTR several mentions of this >>>>>>>> before (Gael?), and in some of those cases it turned out that bugs were >>>>>>>> being turned up. Catching bugs with minimal effort is a good thing. >>>>>>>> >>>>>>>> Chuck >>>>>>>> >>>>>>>> >>>>>>> True, it is quite likely to be a programming error, but then again, >>>>>>> there are many cases where it isn't. Is the problem strictly that we are >>>>>>> trying to downcast the float to an int, or is it that we are trying to >>>>>>> downcast to a lower precision? Is there a way for one to explicitly relax >>>>>>> the same_kind restriction? >>>>>>> >>>>>> >>>>>> I think the problem is down casting across kinds, with the result >>>>>> that floats are truncated and the imaginary parts of imaginaries might be >>>>>> discarded. That is, the value, not just the precision, of the rhs changes. >>>>>> So I'd favor an explicit cast in code like this, i.e., cast the rhs to an >>>>>> integer. >>>>>> >>>>>> It is true that this forces downstream to code up to a higher >>>>>> standard, but I don't see that as a bad thing, especially if it exposes >>>>>> bugs. And it isn't difficult to fix. >>>>>> >>>>>> Chuck >>>>>> >>>>>> >>>>> Mind you, in my case, casting the rhs as an integer before doing the >>>>> multiplication would be a bug, since our value for the rhs is usually >>>>> between zero and one. Multiplying first by the integer numerator before >>>>> dividing by the integer denominator would likely cause issues with >>>>> overflowing the 16 bit integer. >>>>> >>>>> >>>> For the case in point I'd do >>>> >>>> In [1]: a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>> >>>> In [2]: a //= 2 >>>> >>>> In [3]: a >>>> Out[3]: array([0, 1, 1, 2, 2], dtype=int16) >>>> >>>> Although I expect you would want something different in practice. But >>>> the current code already looks fragile to me and I think it is a good thing >>>> you are taking a closer look at it. If you really intend going through a >>>> float, then it should be something like >>>> >>>> a = (a*(float(128)/256)).astype(int16) >>>> >>>> Chuck >>>> >>>> >>> And thereby losing the memory benefit of an in-place multiplication? >>> >> >> What makes you think you are getting that? I'd have to check the numpy C >> source, but I expect the multiplication is handled just as I wrote it out. >> I don't recall any loops that handle mixed types likes that. 
I'd like to >> see some, though, scaling integers is a common problem. >> >> >> >> >>> That is sort of the point of all this. We are using 16 bit integers >>> because we wanted to be as efficient as possible and didn't need anything >>> larger. Note, that is what we changed the code to, I am just wondering if >>> we are being too cautious. The casting kwarg looks to be what I might >>> want, though it isn't as clean as just writing an "*=" statement. >>> >>> >> I think even there you will have an intermediate float array followed by >> a cast. >> >> >> This is true, but it is done in chunks of a fixed size (controllable by a >> thread-local variable or keyword argument to the ufunc). >> >> How difficult would it be to change in-place operations back to the >> "unsafe" default? >> > > Probably not too difficult, but I think it would be a mistake. What > keyword argument are you referring to? In the current case, I think what is > wanted is a scaling function that will actually do things in place. The > matplotlib folks would probably be happier with the result if they simply > coded up a couple of small Cython routines to do that. > > Chuck > > As far as matplotlib is concerned, the problem was solved when we reverted a change. The issue that I am raising is that it was such an innocuous, and frankly, obvious change to do an in-place operation in the first place. I have to wonder if we are being overly cautious with "same_kind". You are right, we probably would benefit greatly from creating some CXX scaling functions (contrary to popular belief, we don't use Cython), however, I would imagine that such general-purpose function would fare better within NumPy. But, ultimately, Python is about there being one right way of doing something, and so I think the goal should be to have a somewhat more restrictive casting rule than "unsafe" for in-place operations, but restrictive enough to catch the sort of errors "same_kind" was catching. This way, I have one way of doing an inplace operation, regardless of the types of my operands. Cheers, Ben Root -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at gmail.com Tue Sep 18 17:04:13 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 18 Sep 2012 23:04:13 +0200 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: Message-ID: On Tue, Sep 18, 2012 at 10:52 PM, Benjamin Root wrote: > > > On Tue, Sep 18, 2012 at 4:42 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Tue, Sep 18, 2012 at 2:33 PM, Travis Oliphant wrote: >> >>> >>> On Sep 18, 2012, at 2:44 PM, Charles R Harris wrote: >>> >>> >>> >>> On Tue, Sep 18, 2012 at 1:35 PM, Benjamin Root wrote: >>> >>>> >>>> >>>> On Tue, Sep 18, 2012 at 3:25 PM, Charles R Harris < >>>> charlesr.harris at gmail.com> wrote: >>>> >>>>> >>>>> >>>>> On Tue, Sep 18, 2012 at 1:13 PM, Benjamin Root wrote: >>>>> >>>>>> >>>>>> >>>>>> On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris < >>>>>> charlesr.harris at gmail.com> wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>> On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root wrote: >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris < >>>>>>>> charlesr.harris at gmail.com> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant < >>>>>>>>> travis at continuum.io> wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: >>>>>>>>>> >>>>>>>>>> > Consider the following code: >>>>>>>>>> > >>>>>>>>>> > import numpy as np >>>>>>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>>>>>>>> > a *= float(255) / 15 >>>>>>>>>> > >>>>>>>>>> > In v1.6.x, this yields: >>>>>>>>>> > array([17, 34, 51, 68, 85], dtype=int16) >>>>>>>>>> > >>>>>>>>>> > But in master, this throws an exception about failing to cast >>>>>>>>>> via same_kind. >>>>>>>>>> > >>>>>>>>>> > Note that numpy was smart about this operation before, consider: >>>>>>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>>>>>>>> > a *= float(128) / 256 >>>>>>>>>> >>>>>>>>>> > yields: >>>>>>>>>> > array([0, 1, 1, 2, 2], dtype=int16) >>>>>>>>>> > >>>>>>>>>> > Of course, this is different than if one does it in a >>>>>>>>>> non-in-place manner: >>>>>>>>>> > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 >>>>>>>>>> > >>>>>>>>>> > which yields an array with floating point dtype in both >>>>>>>>>> versions. I can appreciate the arguments for preventing this kind of >>>>>>>>>> implicit casting between non-same_kind dtypes, but I argue that because the >>>>>>>>>> operation is in-place, then I (as the programmer) am explicitly stating >>>>>>>>>> that I desire to utilize the current array to store the results of the >>>>>>>>>> operation, dtype and all. Obviously, we can't completely turn off this >>>>>>>>>> rule (for example, an in-place addition between integer array and a >>>>>>>>>> datetime64 makes no sense), but surely there is some sort of happy medium >>>>>>>>>> that would allow these sort of operations to take place? >>>>>>>>>> > >>>>>>>>>> > Lastly, if it is determined that it is desirable to allow >>>>>>>>>> in-place operations to continue working like they have before, I would like >>>>>>>>>> to see such a fix in v1.7 because if it isn't in 1.7, then other libraries >>>>>>>>>> (such as matplotlib, where this issue was first found) would have to change >>>>>>>>>> their code anyway just to be compatible with numpy. >>>>>>>>>> >>>>>>>>>> I agree that in-place operations should allow different casting >>>>>>>>>> rules. 
There are different opinions on this, of course, but generally this >>>>>>>>>> is how NumPy has worked in the past. >>>>>>>>>> >>>>>>>>>> We did decide to change the default casting rule to "same_kind" >>>>>>>>>> but making an exception for in-place seems reasonable. >>>>>>>>>> >>>>>>>>> >>>>>>>>> I think that in these cases same_kind will flag what are most >>>>>>>>> likely programming errors and sloppy code. It is easy to be explicit and >>>>>>>>> doing so will make the code more readable because it will be immediately >>>>>>>>> obvious what the multiplicand is without the need to recall what the numpy >>>>>>>>> casting rules are in this exceptional case. IISTR several mentions of this >>>>>>>>> before (Gael?), and in some of those cases it turned out that bugs were >>>>>>>>> being turned up. Catching bugs with minimal effort is a good thing. >>>>>>>>> >>>>>>>>> Chuck >>>>>>>>> >>>>>>>>> >>>>>>>> True, it is quite likely to be a programming error, but then again, >>>>>>>> there are many cases where it isn't. Is the problem strictly that we are >>>>>>>> trying to downcast the float to an int, or is it that we are trying to >>>>>>>> downcast to a lower precision? Is there a way for one to explicitly relax >>>>>>>> the same_kind restriction? >>>>>>>> >>>>>>> >>>>>>> I think the problem is down casting across kinds, with the result >>>>>>> that floats are truncated and the imaginary parts of imaginaries might be >>>>>>> discarded. That is, the value, not just the precision, of the rhs changes. >>>>>>> So I'd favor an explicit cast in code like this, i.e., cast the rhs to an >>>>>>> integer. >>>>>>> >>>>>>> It is true that this forces downstream to code up to a higher >>>>>>> standard, but I don't see that as a bad thing, especially if it exposes >>>>>>> bugs. And it isn't difficult to fix. >>>>>>> >>>>>>> Chuck >>>>>>> >>>>>>> >>>>>> Mind you, in my case, casting the rhs as an integer before doing the >>>>>> multiplication would be a bug, since our value for the rhs is usually >>>>>> between zero and one. Multiplying first by the integer numerator before >>>>>> dividing by the integer denominator would likely cause issues with >>>>>> overflowing the 16 bit integer. >>>>>> >>>>>> >>>>> For the case in point I'd do >>>>> >>>>> In [1]: a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>>> >>>>> In [2]: a //= 2 >>>>> >>>>> In [3]: a >>>>> Out[3]: array([0, 1, 1, 2, 2], dtype=int16) >>>>> >>>>> Although I expect you would want something different in practice. But >>>>> the current code already looks fragile to me and I think it is a good thing >>>>> you are taking a closer look at it. If you really intend going through a >>>>> float, then it should be something like >>>>> >>>>> a = (a*(float(128)/256)).astype(int16) >>>>> >>>>> Chuck >>>>> >>>>> >>>> And thereby losing the memory benefit of an in-place multiplication? >>>> >>> >>> What makes you think you are getting that? I'd have to check the numpy >>> C source, but I expect the multiplication is handled just as I wrote it >>> out. I don't recall any loops that handle mixed types likes that. I'd like >>> to see some, though, scaling integers is a common problem. >>> >>> >>> >>> >>>> That is sort of the point of all this. We are using 16 bit integers >>>> because we wanted to be as efficient as possible and didn't need anything >>>> larger. Note, that is what we changed the code to, I am just wondering if >>>> we are being too cautious. 
The casting kwarg looks to be what I might >>>> want, though it isn't as clean as just writing an "*=" statement. >>>> >>>> >>> I think even there you will have an intermediate float array followed by >>> a cast. >>> >>> >>> This is true, but it is done in chunks of a fixed size (controllable by >>> a thread-local variable or keyword argument to the ufunc). >>> >>> How difficult would it be to change in-place operations back to the >>> "unsafe" default? >>> >> >> Probably not too difficult, but I think it would be a mistake. What >> keyword argument are you referring to? In the current case, I think what is >> wanted is a scaling function that will actually do things in place. The >> matplotlib folks would probably be happier with the result if they simply >> coded up a couple of small Cython routines to do that. >> >> Chuck >> >> > As far as matplotlib is concerned, the problem was solved when we reverted > a change. The issue that I am raising is that it was such an innocuous, > and frankly, obvious change to do an in-place operation in the first > place. I have to wonder if we are being overly cautious with "same_kind". > You are right, we probably would benefit greatly from creating some CXX > scaling functions (contrary to popular belief, we don't use Cython), > however, I would imagine that such general-purpose function would fare > better within NumPy. But, ultimately, Python is about there being one > right way of doing something, and so I think the goal should be to have a > somewhat more restrictive casting rule than "unsafe" for in-place > operations, but restrictive enough to catch the sort of errors "same_kind" > was catching. > That sentence doesn't parse. ("more restrictive" & "restrictive enough") == "same_kind" ? Ralf > This way, I have one way of doing an inplace operation, regardless of the > types of my operands. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Tue Sep 18 17:07:48 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 18 Sep 2012 15:07:48 -0600 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: Message-ID: On Tue, Sep 18, 2012 at 2:52 PM, Benjamin Root wrote: > > > On Tue, Sep 18, 2012 at 4:42 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Tue, Sep 18, 2012 at 2:33 PM, Travis Oliphant wrote: >> >>> >>> On Sep 18, 2012, at 2:44 PM, Charles R Harris wrote: >>> >>> >>> >>> On Tue, Sep 18, 2012 at 1:35 PM, Benjamin Root wrote: >>> >>>> >>>> >>>> On Tue, Sep 18, 2012 at 3:25 PM, Charles R Harris < >>>> charlesr.harris at gmail.com> wrote: >>>> >>>>> >>>>> >>>>> On Tue, Sep 18, 2012 at 1:13 PM, Benjamin Root wrote: >>>>> >>>>>> >>>>>> >>>>>> On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris < >>>>>> charlesr.harris at gmail.com> wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>> On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root wrote: >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris < >>>>>>>> charlesr.harris at gmail.com> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant < >>>>>>>>> travis at continuum.io> wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote: >>>>>>>>>> >>>>>>>>>> > Consider the following code: >>>>>>>>>> > >>>>>>>>>> > import numpy as np >>>>>>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>>>>>>>> > a *= float(255) / 15 >>>>>>>>>> > >>>>>>>>>> > In v1.6.x, this yields: >>>>>>>>>> > array([17, 34, 51, 68, 85], dtype=int16) >>>>>>>>>> > >>>>>>>>>> > But in master, this throws an exception about failing to cast >>>>>>>>>> via same_kind. >>>>>>>>>> > >>>>>>>>>> > Note that numpy was smart about this operation before, consider: >>>>>>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>>>>>>>> > a *= float(128) / 256 >>>>>>>>>> >>>>>>>>>> > yields: >>>>>>>>>> > array([0, 1, 1, 2, 2], dtype=int16) >>>>>>>>>> > >>>>>>>>>> > Of course, this is different than if one does it in a >>>>>>>>>> non-in-place manner: >>>>>>>>>> > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5 >>>>>>>>>> > >>>>>>>>>> > which yields an array with floating point dtype in both >>>>>>>>>> versions. I can appreciate the arguments for preventing this kind of >>>>>>>>>> implicit casting between non-same_kind dtypes, but I argue that because the >>>>>>>>>> operation is in-place, then I (as the programmer) am explicitly stating >>>>>>>>>> that I desire to utilize the current array to store the results of the >>>>>>>>>> operation, dtype and all. Obviously, we can't completely turn off this >>>>>>>>>> rule (for example, an in-place addition between integer array and a >>>>>>>>>> datetime64 makes no sense), but surely there is some sort of happy medium >>>>>>>>>> that would allow these sort of operations to take place? >>>>>>>>>> > >>>>>>>>>> > Lastly, if it is determined that it is desirable to allow >>>>>>>>>> in-place operations to continue working like they have before, I would like >>>>>>>>>> to see such a fix in v1.7 because if it isn't in 1.7, then other libraries >>>>>>>>>> (such as matplotlib, where this issue was first found) would have to change >>>>>>>>>> their code anyway just to be compatible with numpy. >>>>>>>>>> >>>>>>>>>> I agree that in-place operations should allow different casting >>>>>>>>>> rules. 
There are different opinions on this, of course, but generally this >>>>>>>>>> is how NumPy has worked in the past. >>>>>>>>>> >>>>>>>>>> We did decide to change the default casting rule to "same_kind" >>>>>>>>>> but making an exception for in-place seems reasonable. >>>>>>>>>> >>>>>>>>> >>>>>>>>> I think that in these cases same_kind will flag what are most >>>>>>>>> likely programming errors and sloppy code. It is easy to be explicit and >>>>>>>>> doing so will make the code more readable because it will be immediately >>>>>>>>> obvious what the multiplicand is without the need to recall what the numpy >>>>>>>>> casting rules are in this exceptional case. IISTR several mentions of this >>>>>>>>> before (Gael?), and in some of those cases it turned out that bugs were >>>>>>>>> being turned up. Catching bugs with minimal effort is a good thing. >>>>>>>>> >>>>>>>>> Chuck >>>>>>>>> >>>>>>>>> >>>>>>>> True, it is quite likely to be a programming error, but then again, >>>>>>>> there are many cases where it isn't. Is the problem strictly that we are >>>>>>>> trying to downcast the float to an int, or is it that we are trying to >>>>>>>> downcast to a lower precision? Is there a way for one to explicitly relax >>>>>>>> the same_kind restriction? >>>>>>>> >>>>>>> >>>>>>> I think the problem is down casting across kinds, with the result >>>>>>> that floats are truncated and the imaginary parts of imaginaries might be >>>>>>> discarded. That is, the value, not just the precision, of the rhs changes. >>>>>>> So I'd favor an explicit cast in code like this, i.e., cast the rhs to an >>>>>>> integer. >>>>>>> >>>>>>> It is true that this forces downstream to code up to a higher >>>>>>> standard, but I don't see that as a bad thing, especially if it exposes >>>>>>> bugs. And it isn't difficult to fix. >>>>>>> >>>>>>> Chuck >>>>>>> >>>>>>> >>>>>> Mind you, in my case, casting the rhs as an integer before doing the >>>>>> multiplication would be a bug, since our value for the rhs is usually >>>>>> between zero and one. Multiplying first by the integer numerator before >>>>>> dividing by the integer denominator would likely cause issues with >>>>>> overflowing the 16 bit integer. >>>>>> >>>>>> >>>>> For the case in point I'd do >>>>> >>>>> In [1]: a = np.array([1, 2, 3, 4, 5], dtype=np.int16) >>>>> >>>>> In [2]: a //= 2 >>>>> >>>>> In [3]: a >>>>> Out[3]: array([0, 1, 1, 2, 2], dtype=int16) >>>>> >>>>> Although I expect you would want something different in practice. But >>>>> the current code already looks fragile to me and I think it is a good thing >>>>> you are taking a closer look at it. If you really intend going through a >>>>> float, then it should be something like >>>>> >>>>> a = (a*(float(128)/256)).astype(int16) >>>>> >>>>> Chuck >>>>> >>>>> >>>> And thereby losing the memory benefit of an in-place multiplication? >>>> >>> >>> What makes you think you are getting that? I'd have to check the numpy >>> C source, but I expect the multiplication is handled just as I wrote it >>> out. I don't recall any loops that handle mixed types likes that. I'd like >>> to see some, though, scaling integers is a common problem. >>> >>> >>> >>> >>>> That is sort of the point of all this. We are using 16 bit integers >>>> because we wanted to be as efficient as possible and didn't need anything >>>> larger. Note, that is what we changed the code to, I am just wondering if >>>> we are being too cautious. 
The casting kwarg looks to be what I might >>>> want, though it isn't as clean as just writing an "*=" statement. >>>> >>>> >>> I think even there you will have an intermediate float array followed by >>> a cast. >>> >>> >>> This is true, but it is done in chunks of a fixed size (controllable by >>> a thread-local variable or keyword argument to the ufunc). >>> >>> How difficult would it be to change in-place operations back to the >>> "unsafe" default? >>> >> >> Probably not too difficult, but I think it would be a mistake. What >> keyword argument are you referring to? In the current case, I think what is >> wanted is a scaling function that will actually do things in place. The >> matplotlib folks would probably be happier with the result if they simply >> coded up a couple of small Cython routines to do that. >> >> Chuck >> >> > As far as matplotlib is concerned, the problem was solved when we reverted > a change. The issue that I am raising is that it was such an innocuous, > and frankly, obvious change to do an in-place operation in the first > place. I have to wonder if we are being overly cautious with "same_kind". > You are right, we probably would benefit greatly from creating some CXX > scaling functions (contrary to popular belief, we don't use Cython), > however, I would imagine that such general-purpose function would fare > better within NumPy. But, ultimately, Python is about there being one > right way of doing something, and so I think the goal should be to have a > somewhat more restrictive casting rule than "unsafe" for in-place > operations, but restrictive enough to catch the sort of errors "same_kind" > was catching. This way, I have one way of doing an inplace operation, > regardless of the types of my operands. > > I think there can be a difference of opinion as to "right" way. I think Mark got it right. Making these sort of things explicit lets the programmer tell the world that, yes, I meant to do that, it isn't something I overlooked. Remember, you aren't writing code for yourself, you are writing it for the next person who reads it. Python is loosely typed, but it compensates for that by having only doubles, complex doubles, and unlimited precision integers. Numpy isn't like that, and for numerical work it is especially important to account for downcasts because of the potential dangers. The way things have trended in practice is for downcasts to be made more and more explicit. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Tue Sep 18 19:02:00 2012 From: travis at continuum.io (Travis Oliphant) Date: Tue, 18 Sep 2012 18:02:00 -0500 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: Message-ID: <21151E34-1DAD-4943-BCE6-70B9C217F937@continuum.io> > >> >> That is sort of the point of all this. We are using 16 bit integers because we wanted to be as efficient as possible and didn't need anything larger. Note, that is what we changed the code to, I am just wondering if we are being too cautious. The casting kwarg looks to be what I might want, though it isn't as clean as just writing an "*=" statement. >> >> >> I think even there you will have an intermediate float array followed by a cast. > > This is true, but it is done in chunks of a fixed size (controllable by a thread-local variable or keyword argument to the ufunc). > > How difficult would it be to change in-place operations back to the "unsafe" default? 
> > Probably not too difficult, but I think it would be a mistake. What keyword argument are you referring to? In the current case, I think what is wanted is a scaling function that will actually do things in place. The matplotlib folks would probably be happier with the result if they simply coded up a couple of small Cython routines to do that. http://docs.scipy.org/doc/numpy/reference/ufuncs.html#ufunc In particular, the extobj keyword argument or the thread-local variable at umath.UFUNC_PYVALS_NAME But, the problem is not just for matplotlib. Matplotlib is showing a symptom of the problem of just changing the default casting mode in one release. I think this is too stark of a change for a single minor release without some kind of glide path or warning system. I think we need to change in-place multiplication back to "unsafe" and then put in the release notes that we are planning on changing this for 1.8. It would be ideal if we could raise a warning when "unsafe" castings occur. -Travis -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Sep 18 19:52:24 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 18 Sep 2012 17:52:24 -0600 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: <21151E34-1DAD-4943-BCE6-70B9C217F937@continuum.io> References: <21151E34-1DAD-4943-BCE6-70B9C217F937@continuum.io> Message-ID: On Tue, Sep 18, 2012 at 5:02 PM, Travis Oliphant wrote: > >> >> >>> That is sort of the point of all this. We are using 16 bit integers >>> because we wanted to be as efficient as possible and didn't need anything >>> larger. Note, that is what we changed the code to, I am just wondering if >>> we are being too cautious. The casting kwarg looks to be what I might >>> want, though it isn't as clean as just writing an "*=" statement. >>> >>> >> I think even there you will have an intermediate float array followed by >> a cast. >> >> >> This is true, but it is done in chunks of a fixed size (controllable by a >> thread-local variable or keyword argument to the ufunc). >> >> How difficult would it be to change in-place operations back to the >> "unsafe" default? >> > > Probably not too difficult, but I think it would be a mistake. What > keyword argument are you referring to? In the current case, I think what is > wanted is a scaling function that will actually do things in place. The > matplotlib folks would probably be happier with the result if they simply > coded up a couple of small Cython routines to do that. > > > http://docs.scipy.org/doc/numpy/reference/ufuncs.html#ufunc > > In particular, the extobj keyword argument or the thread-local variable at > umath.UFUNC_PYVALS_NAME > Hmm, the ufunc documentation that comes with the functions needs an upgrade. > > But, the problem is not just for matplotlib. Matplotlib is showing a > symptom of the problem of just changing the default casting mode in one > release. I think this is too stark of a change for a single minor > release without some kind of glide path or warning system. > > I think we need to change in-place multiplication back to "unsafe" and then > put in the release notes that we are planning on changing this for 1.8. > It would be ideal if we could raise a warning when "unsafe" castings occur. > I think that raising a warning would be appropriate, maybe with a note concerning the future change since I expect few to read the release notes. 
The new casting modes were introduced in 1.6 so code that needs to work with older versions of numpy won't be able to use that option to work around the default. Type specific functions for scaling integers would be helpful, although I'd probably restrict it to float32/float64 scaling factors to avoid combinatorial bloat. Having such a function do the normal broadcasting would probably be desirable. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Sep 18 20:08:51 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 18 Sep 2012 18:08:51 -0600 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: <21151E34-1DAD-4943-BCE6-70B9C217F937@continuum.io> Message-ID: The relevant setting is in numpy/core/include/numpy/ndarraytypes.h #define NPY_DEFAULT_ASSIGN_CASTING NPY_SAME_KIND_CASTING I think that if we want to raise a warning we could define a new rule, NPY_WARN_SAME_KIND_CASTING Which would do the same as unsafe, only raise a warning on the way. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Sep 18 21:31:27 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 18 Sep 2012 19:31:27 -0600 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: <21151E34-1DAD-4943-BCE6-70B9C217F937@continuum.io> Message-ID: On Tue, Sep 18, 2012 at 6:08 PM, Charles R Harris wrote: > > > The relevant setting is in numpy/core/include/numpy/ndarraytypes.h > > #define NPY_DEFAULT_ASSIGN_CASTING NPY_SAME_KIND_CASTING > > I think that if we want to raise a warning we could define a new rule, > > NPY_WARN_SAME_KIND_CASTING > > Which would do the same as unsafe, only raise a warning on the way. > On second thought, it might be easier to set a warn bit on the usual casting macros, i.e., #define NPY_WARN_CASTING 256 #define NPY_MASK_CASTING 255 #define NPY_DEFAULT_ASSIGN_CASTING (NPY_UNSAFE_CASTING | NPY_WARN_CASTING) and replace the current checks with masked checks. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nouiz at nouiz.org Tue Sep 18 21:32:29 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Tue, 18 Sep 2012 21:32:29 -0400 Subject: [Numpy-discussion] Where is the code for PyArray_UpdateFlags? Message-ID: Hi, I would like to reuse the code of this function: PyArray_UpdateFlags I don't find its definition in the numpy source code. In the build directory, it get defined in the file build/src.linux-x86_64-2.7/numpy/core/include/numpy/__multiarray_api.h like this: #define PyArray_UpdateFlags \ (*(void (*)(PyArrayObject *, int)) \ PyArray_API[92]) This file is generated by this file numpy/core/code_generators/generate_numpy_api.py, that use the numpy_api_order.txt file. But I'm not able up to now to find the code itself. Do someone know where I can get it? thanks Fr?d?ric From charlesr.harris at gmail.com Tue Sep 18 21:43:42 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 18 Sep 2012 19:43:42 -0600 Subject: [Numpy-discussion] Where is the code for PyArray_UpdateFlags? In-Reply-To: References: Message-ID: On Tue, Sep 18, 2012 at 7:32 PM, Fr?d?ric Bastien wrote: > Hi, > > I would like to reuse the code of this function: PyArray_UpdateFlags > > I don't find its definition in the numpy source code. 
In the build > directory, it get defined in the file > build/src.linux-x86_64-2.7/numpy/core/include/numpy/__multiarray_api.h > like this: > > #define PyArray_UpdateFlags \ > (*(void (*)(PyArrayObject *, int)) \ > PyArray_API[92]) > > This file is generated by this file > numpy/core/code_generators/generate_numpy_api.py, that use the > numpy_api_order.txt file. But I'm not able up to now to find the code > itself. Do someone know where I can get it? > > gvim numpy/core/src/multiarray/flagsobject.c +63 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nouiz at nouiz.org Tue Sep 18 22:01:41 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Tue, 18 Sep 2012 22:01:41 -0400 Subject: [Numpy-discussion] Where is the code for PyArray_UpdateFlags? In-Reply-To: References: Message-ID: Hi I had done a checkout of a old version of numpy where this file didn't existed. It is there in the trunk. thanks Fred On Tue, Sep 18, 2012 at 9:43 PM, Charles R Harris wrote: > > > On Tue, Sep 18, 2012 at 7:32 PM, Fr?d?ric Bastien wrote: >> >> Hi, >> >> I would like to reuse the code of this function: PyArray_UpdateFlags >> >> I don't find its definition in the numpy source code. In the build >> directory, it get defined in the file >> build/src.linux-x86_64-2.7/numpy/core/include/numpy/__multiarray_api.h >> like this: >> >> #define PyArray_UpdateFlags \ >> (*(void (*)(PyArrayObject *, int)) \ >> PyArray_API[92]) >> >> This file is generated by this file >> numpy/core/code_generators/generate_numpy_api.py, that use the >> numpy_api_order.txt file. But I'm not able up to now to find the code >> itself. Do someone know where I can get it? >> > > gvim numpy/core/src/multiarray/flagsobject.c +63 > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From ondrej.certik at gmail.com Thu Sep 20 02:24:07 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Wed, 19 Sep 2012 23:24:07 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.7.0b2 release Message-ID: Hi, I'm pleased to announce the availability of the second beta release of NumPy 1.7.0b2. Sources and binary installers can be found at https://sourceforge.net/projects/numpy/files/NumPy/1.7.0b2/ Please test this release and report any issues on the numpy-discussion mailing list. Since beta1, we've fixed most of the known (back then) issues, except: http://projects.scipy.org/numpy/ticket/2076 http://projects.scipy.org/numpy/ticket/2101 http://projects.scipy.org/numpy/ticket/2108 http://projects.scipy.org/numpy/ticket/2150 And many other issues that were reported since the beta1 release. The log of changes is attached. The full list of issues that we still need to work on is at: https://github.com/numpy/numpy/issues/396 Any help is welcome, the best is to send a PR fixing any of the issues -- against master, and I'll then back-port it to the release branch (unless it is something release specific, in which case just send the PR against the release branch). Cheers, Ondrej * f217517 Release 1.7.0b2 * 50f71cb MAINT: silence Cython warnings about changes dtype/ufunc size. * fcacdcc FIX: use py24-compatible version of virtualenv on Travis * d01354e FIX: loosen numerical tolerance in test_pareto() * 65ec87e TST: Add test for boolean insert * 9ee9984 TST: Add extra test for multidimensional inserts. 
* 8460514 BUG: Fix for issues #378 and #392 This should fix the problems with numpy.insert(), where the input values were not checked for all scalar types and where values did not get inserted properly, but got duplicated by default. * 07e02d0 BUG: fix npymath install location. * 6da087e BUG: fix custom post_check. * 095a3ab BUG: forgot to build _dotblas in bento build. * cb0de72 REF: remove unused imports in bscript. * 6e3e289 FIX: Regenerate mtrand.c with Cython 0.17 * 3dc3b1b Retain backward compatibility. Enforce C order. * 5a471b5 Improve ndindex execution speed. * 2f28db6 FIX: Add a test for Ticket #2066 * ca29849 BUG: Add a test for Ticket #2189 * 1ee4a00 BUG: Add a test for Ticket #1588 * 7b5dba0 BUG: Fix ticket #1588/gh issue #398, refcount error in clip * f65ff87 FIX: simplify the import statement * 124a608 Fix returned copy * 996a9fb FIX: bug in np.where and recarray swapping * 7583adc MAINT: silence DeprecationWarning in np.safe_eval(). * 416af9a pavement.py: rename "yop" to "atlas" * 3930881 BUG: fix bento build. * fbad4a7 Remove test_recarray_from_long_formats * 5cb80f8 Add test for long number in shape specifier of dtype string * 24da7f6 Add test for long numbers in numpy.rec.array formats string * 77da3f8 Allow long numbers in numpy.rec.array formats string * 99c9397 Use PyUnicode_DecodeUTF32() * 31660d0 Follow the C guidelines * d5d6894 Fix memory leak in concatenate. * 8141e1e FIX: Make sure the tests produce valid unicode * d67785b FIX: Fixes the PyUnicodeObject problem in py-3.3 * a022015 Re-enable unpickling optimization for large py3k bytes objects. * 470486b Copy bytes object when unpickling an array * d72280f Fix tests for empty shape, strides and suboffsets on Python 3.3 * a1561c2 [FIX] Add missing header so separate compilation works again * ea23de8 TST: set raise-on-warning behavior of NoseTester to release mode. * 28ffac7 REL: set version number to 1.7.0rc1-dev. From rhattersley at gmail.com Thu Sep 20 07:50:37 2012 From: rhattersley at gmail.com (Richard Hattersley) Date: Thu, 20 Sep 2012 12:50:37 +0100 Subject: [Numpy-discussion] ANN: NumPy 1.7.0b2 release In-Reply-To: References: Message-ID: Hi, [First of all - thanks to everyone involved in the 1.7 release. Especially Ond?ej - it takes a lot of time & energy to coordinate something like this.] Is there an up to date release schedule anywhere? The trac milestone still references June. Regards, Richard Hattersley On 20 September 2012 07:24, Ond?ej ?ert?k wrote: > Hi, > > I'm pleased to announce the availability of the second beta release of > NumPy 1.7.0b2. > > Sources and binary installers can be found at > https://sourceforge.net/projects/numpy/files/NumPy/1.7.0b2/ > > Please test this release and report any issues on the numpy-discussion > mailing list. Since beta1, we've fixed most of the known (back then) > issues, except: > > http://projects.scipy.org/numpy/ticket/2076 > http://projects.scipy.org/numpy/ticket/2101 > http://projects.scipy.org/numpy/ticket/2108 > http://projects.scipy.org/numpy/ticket/2150 > > And many other issues that were reported since the beta1 release. The > log of changes is attached. The full list of issues that we still need > to work on is at: > > https://github.com/numpy/numpy/issues/396 > > Any help is welcome, the best is to send a PR fixing any of the issues > -- against master, and I'll then back-port it to the release branch > (unless it is something release specific, in which case just send the > PR against the release branch). 
> > Cheers, > Ondrej > > > * f217517 Release 1.7.0b2 > * 50f71cb MAINT: silence Cython warnings about changes dtype/ufunc size. > * fcacdcc FIX: use py24-compatible version of virtualenv on Travis > * d01354e FIX: loosen numerical tolerance in test_pareto() > * 65ec87e TST: Add test for boolean insert > * 9ee9984 TST: Add extra test for multidimensional inserts. > * 8460514 BUG: Fix for issues #378 and #392 This should fix the > problems with numpy.insert(), where the input values were not checked > for all scalar types and where values did not get inserted properly, > but got duplicated by default. > * 07e02d0 BUG: fix npymath install location. > * 6da087e BUG: fix custom post_check. > * 095a3ab BUG: forgot to build _dotblas in bento build. > * cb0de72 REF: remove unused imports in bscript. > * 6e3e289 FIX: Regenerate mtrand.c with Cython 0.17 > * 3dc3b1b Retain backward compatibility. Enforce C order. > * 5a471b5 Improve ndindex execution speed. > * 2f28db6 FIX: Add a test for Ticket #2066 > * ca29849 BUG: Add a test for Ticket #2189 > * 1ee4a00 BUG: Add a test for Ticket #1588 > * 7b5dba0 BUG: Fix ticket #1588/gh issue #398, refcount error in clip > * f65ff87 FIX: simplify the import statement > * 124a608 Fix returned copy > * 996a9fb FIX: bug in np.where and recarray swapping > * 7583adc MAINT: silence DeprecationWarning in np.safe_eval(). > * 416af9a pavement.py: rename "yop" to "atlas" > * 3930881 BUG: fix bento build. > * fbad4a7 Remove test_recarray_from_long_formats > * 5cb80f8 Add test for long number in shape specifier of dtype string > * 24da7f6 Add test for long numbers in numpy.rec.array formats string > * 77da3f8 Allow long numbers in numpy.rec.array formats string > * 99c9397 Use PyUnicode_DecodeUTF32() > * 31660d0 Follow the C guidelines > * d5d6894 Fix memory leak in concatenate. > * 8141e1e FIX: Make sure the tests produce valid unicode > * d67785b FIX: Fixes the PyUnicodeObject problem in py-3.3 > * a022015 Re-enable unpickling optimization for large py3k bytes objects. > * 470486b Copy bytes object when unpickling an array > * d72280f Fix tests for empty shape, strides and suboffsets on Python 3.3 > * a1561c2 [FIX] Add missing header so separate compilation works again > * ea23de8 TST: set raise-on-warning behavior of NoseTester to release mode. > * 28ffac7 REL: set version number to 1.7.0rc1-dev. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej.certik at gmail.com Thu Sep 20 10:33:56 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 20 Sep 2012 07:33:56 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.7.0b2 release In-Reply-To: References: Message-ID: On Thu, Sep 20, 2012 at 4:50 AM, Richard Hattersley wrote: > Hi, > > [First of all - thanks to everyone involved in the 1.7 release. Especially > Ond?ej - it takes a lot of time & energy to coordinate something like this.] > > Is there an up to date release schedule anywhere? The trac milestone still > references June. Well, originally we were supposed to release about a month ago, but it turned out there are more things to fix. Currently, we just need to fix all the issues here: https://github.com/numpy/numpy/issues/396 it looks like a lot, but many of them are really easy to fix, so my hope is that it will not take long. 
The hardest one is this: http://projects.scipy.org/numpy/ticket/2108 if anyone wants to help with this one, that'd be very much appreciated. After these are fixed, the rc1 (possibly rc2) and the final release should go quickly, as I already know how to make the binaries easily. Ondrej From njs at pobox.com Thu Sep 20 15:00:35 2012 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 20 Sep 2012 20:00:35 +0100 Subject: [Numpy-discussion] ANN: NumPy 1.7.0b2 release In-Reply-To: References: Message-ID: On Thu, Sep 20, 2012 at 3:33 PM, Ond?ej ?ert?k wrote: > On Thu, Sep 20, 2012 at 4:50 AM, Richard Hattersley > wrote: >> Hi, >> >> [First of all - thanks to everyone involved in the 1.7 release. Especially >> Ond?ej - it takes a lot of time & energy to coordinate something like this.] >> >> Is there an up to date release schedule anywhere? The trac milestone still >> references June. > > Well, originally we were supposed to release about a month ago, but it > turned out there are more things to fix. > Currently, we just need to fix all the issues here: > > https://github.com/numpy/numpy/issues/396 > > it looks like a lot, but many of them are really easy to fix, so my > hope is that it will not take long. The hardest one is this: > > http://projects.scipy.org/numpy/ticket/2108 > > if anyone wants to help with this one, that'd be very much appreciated. This particular bug should actually be pretty trivial to fix if anyone is looking for something to do (esp. if you have a working win32 build environment to test your work): http://thread.gmane.org/gmane.comp.python.numeric.general/50950/focus=50980 -n From ondrej.certik at gmail.com Thu Sep 20 15:51:58 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 20 Sep 2012 12:51:58 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.7.0b2 release In-Reply-To: References: Message-ID: On Thu, Sep 20, 2012 at 12:00 PM, Nathaniel Smith wrote: > On Thu, Sep 20, 2012 at 3:33 PM, Ond?ej ?ert?k wrote: >> On Thu, Sep 20, 2012 at 4:50 AM, Richard Hattersley >> wrote: >>> Hi, >>> >>> [First of all - thanks to everyone involved in the 1.7 release. Especially >>> Ond?ej - it takes a lot of time & energy to coordinate something like this.] >>> >>> Is there an up to date release schedule anywhere? The trac milestone still >>> references June. >> >> Well, originally we were supposed to release about a month ago, but it >> turned out there are more things to fix. >> Currently, we just need to fix all the issues here: >> >> https://github.com/numpy/numpy/issues/396 >> >> it looks like a lot, but many of them are really easy to fix, so my >> hope is that it will not take long. The hardest one is this: >> >> http://projects.scipy.org/numpy/ticket/2108 >> >> if anyone wants to help with this one, that'd be very much appreciated. > > This particular bug should actually be pretty trivial to fix if anyone > is looking for something to do (esp. if you have a working win32 build > environment to test your work): > http://thread.gmane.org/gmane.comp.python.numeric.general/50950/focus=50980 Ah, that looks easy. I'll try to give it a shot. See my repo here how to get a working win32 environment: https://github.com/certik/numpy-vendor However, I don't have access to MSVC, but I am sure somebody else can test it there, once the PR is ready. Ondrej From njs at pobox.com Thu Sep 20 16:09:16 2012 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 20 Sep 2012 21:09:16 +0100 Subject: [Numpy-discussion] tests for casting table? 
(was: Numpy 1.7b1 API change cause big trouble) In-Reply-To: References: Message-ID: On Mon, Sep 17, 2012 at 10:22 AM, Matthew Brett wrote: > Hi, > > On Sun, Sep 9, 2012 at 6:12 PM, Fr?d?ric Bastien wrote: >> The third is releated to change to the casting rules in numpy. Before >> a scalar complex128 * vector float32 gived a vector of dtype >> complex128. Now it give a vector of complex64. The reason is that now >> the scalar of different category only change the category, not the >> precision. I would consider a must that we warn clearly about this >> interface change. Most people won't see it, but people that optimize >> there code heavily could depend on such thing. > > It seems to me that it would be a very good idea to put the casting > table results into the tests to make sure we are keeping track of this > kind of thing. > > I'm happy to try to do it if no-one else more qualified has time. I haven't seen any PRs show up from anyone else in the last few days, and this would indeed be an excellent test to have, so that would be awesome. -n From travis at continuum.io Thu Sep 20 16:20:22 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 20 Sep 2012 15:20:22 -0500 Subject: [Numpy-discussion] tests for casting table? (was: Numpy 1.7b1 API change cause big trouble) In-Reply-To: References: Message-ID: <2F489A53-EE68-4E62-9DB2-5224FA4EEB14@continuum.io> Here are a couple of scripts that might help (I used them to compare casting tables between various versions of NumPy): Casting Table Creation Script ======================== import numpy as np operators = np.set_numeric_ops().values() types = '?bhilqpBHILQPfdgFDGO' to_check = ['add', 'divide', 'minimum', 'maximum', 'remainder', 'true_divide', 'logical_or', 'bitwise_or', 'right_shift', 'less', 'equal'] operators = [op for op in operators if op.__name__ in to_check] def type_wrap(op): def func(obj1, obj2): try: result = op(obj1, obj2) char = result.dtype.char except: char = 'X' return char return func def coerce(): result = {} for op in operators: d = {} name = op.__name__ print name op = type_wrap(op) for type1 in types: s1 = np.dtype(type1).type(2) a1 = np.dtype(type1).type([1,2,3]) for type2 in types: s2 = np.dtype(type2).type(1) a2 = np.dtype(type2).type([2,3,4]) codes = [] # scalar scalar codes.append(op(s1, s2)) # scalar array codes.append(op(s1, a2)) # array scalar codes.append(op(a1, s2)) # array array codes.append(op(a1, a2)) d[type1,type2] = codes result[name] = d #for check_key in to_check: # for key in result.keys(): # if key == check_key: # continue # if result[key] == result[check_key]: # del result[key] #assert set(result.keys()) == set(to_check) return result import sys if sys.maxint > 2**33: bits = 64 else: bits = 32 def write(): import cPickle file = open('coercion-%s-%sbit.pkl'%(np.__version__, bits),'w') cPickle.dump(coerce(),file,protocol=2) file.close() if __name__ == '__main__': write() Comparison Script ================ import numpy as np def compare(result1, result2): for op in result1.keys(): print "**** ", op, " ****" if op not in result2: print op, " not in the first" table1 = result1[op] table2 = result2[op] if table1 == table2: print "Tables are the same" else: if set(table1.keys()) != set(table2.keys()): print "Keys are not the same" continue for key in table1.keys(): if table1[key] != table2[key]: print "Different at ", key, ": ", table1[key], table2[key] import cPickle import sys if __name__ == '__main__': name1 = 'coercion-1.5.1-64bit.pkl' name2 = 'coercion-1.6.1-64bit.pkl' if len(sys.argv) > 1: 
name1 = 'coercion-%s-64bit.pkl' % sys.argv[1] if len(sys.argv) > 2: name2 = 'coercion-%s-64bit.pkl' % sys.argv[2] result1 = cPickle.load(open(name1)) result2 = cPickle.load(open(name2)) compare(result1, result2) On Sep 20, 2012, at 3:09 PM, Nathaniel Smith wrote: > On Mon, Sep 17, 2012 at 10:22 AM, Matthew Brett wrote: >> Hi, >> >> On Sun, Sep 9, 2012 at 6:12 PM, Fr?d?ric Bastien wrote: >>> The third is releated to change to the casting rules in numpy. Before >>> a scalar complex128 * vector float32 gived a vector of dtype >>> complex128. Now it give a vector of complex64. The reason is that now >>> the scalar of different category only change the category, not the >>> precision. I would consider a must that we warn clearly about this >>> interface change. Most people won't see it, but people that optimize >>> there code heavily could depend on such thing. >> >> It seems to me that it would be a very good idea to put the casting >> table results into the tests to make sure we are keeping track of this >> kind of thing. >> >> I'm happy to try to do it if no-one else more qualified has time. > > I haven't seen any PRs show up from anyone else in the last few days, > and this would indeed be an excellent test to have, so that would be > awesome. > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From njs at pobox.com Thu Sep 20 17:48:26 2012 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 20 Sep 2012 22:48:26 +0100 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: <21151E34-1DAD-4943-BCE6-70B9C217F937@continuum.io> Message-ID: On Wed, Sep 19, 2012 at 1:08 AM, Charles R Harris wrote: > > > The relevant setting is in numpy/core/include/numpy/ndarraytypes.h > > #define NPY_DEFAULT_ASSIGN_CASTING NPY_SAME_KIND_CASTING > > I think that if we want to raise a warning we could define a new rule, > > NPY_WARN_SAME_KIND_CASTING > > Which would do the same as unsafe, only raise a warning on the way. https://github.com/numpy/numpy/pull/451 Query: I would have thought that NPY_DEFAULT_ASSIGN_CASTING would determine the default casting used for assignments. But in current master: >>> a = np.zeros(3, dtype=int) >>> a[0] = 1.1 >>> a array([1, 0, 0]) In fact, this variable seems to only be used by PyArray_Std, PyArray_Round, and ufuncs. Okay, so, NPY_DEFAULT_ASSIGN_CASTING is just misnamed, but -- what casting rule *should* plain old assignment follow? I'd think same_kind casting is probably a good default here for the same reason it's a good default for ufuncs, and because a += b really should be the same as a = a + b. But, the only problem is, how could you override it if desired? a.__setitem__(0, casting="unsafe")? -n From orion at cora.nwra.com Thu Sep 20 17:56:25 2012 From: orion at cora.nwra.com (Orion Poplawski) Date: Thu, 20 Sep 2012 15:56:25 -0600 Subject: [Numpy-discussion] Fwd: Package: scipy-0.11.0-0.1.rc2.fc18 Tag: f18-updates-candidate Status: failed Built by: orion In-Reply-To: <20120920210154.D759123187@bastion01.phx2.fedoraproject.org> References: <20120920210154.D759123187@bastion01.phx2.fedoraproject.org> Message-ID: <505B9109.60607@cora.nwra.com> This is a plea for some help. We've been having trouble getting scipy to pass all of the tests in the Fedora 18 build with python 3.3 (although it seems to build okay in Fedora 19). Below are the logs of the build. 
There appears to be some kind of memory corruption that manifests itself a little differently on 32-bit vs. 64-bit. I really have no idea myself how to pursue debugging this, though I'm happy to provide any more needed information. Thanks, Orion -------- Original Message -------- Subject: Package: scipy-0.11.0-0.1.rc2.fc18 Tag: f18-updates-candidate Status: failed Built by: orion Date: Thu, 20 Sep 2012 21:01:54 +0000 (UTC) From: Fedora Koji Build System To: ausil at fedoraproject.org, jspaleta at fedoraproject.org, voronov at fedoraproject.org, torwangjl at fedoraproject.org, alagunambi at fedoraproject.org, urkle at fedoraproject.org, orion at fedoraproject.org Package: scipy-0.11.0-0.1.rc2.fc18 Tag: f18-updates-candidate Status: failed Built by: orion ID: 350761 Started: Thu, 20 Sep 2012 20:39:45 UTC Finished: Thu, 20 Sep 2012 21:01:32 UTC scipy-0.11.0-0.1.rc2.fc18 (350761) failed on buildvm-33.phx2.fedoraproject.org (x86_64), buildvm-35.phx2.fedoraproject.org (i386), buildvm-34.phx2.fedoraproject.org (noarch): BuildError: error building package (arch x86_64), mock exited with status 1; see build.log for more information SRPMS: scipy-0.11.0-0.1.rc2.fc18.src.rpm Failed tasks: ------------- Task 4509076 on buildvm-33.phx2.fedoraproject.org Task Type: buildArch (scipy-0.11.0-0.1.rc2.fc18.src.rpm, x86_64) logs: http://koji.fedoraproject.org/koji/getfile?taskID=4509076&name=build.log http://koji.fedoraproject.org/koji/getfile?taskID=4509076&name=mock_output.log http://koji.fedoraproject.org/koji/getfile?taskID=4509076&name=root.log http://koji.fedoraproject.org/koji/getfile?taskID=4509076&name=state.log Task 4509077 on buildvm-35.phx2.fedoraproject.org Task Type: buildArch (scipy-0.11.0-0.1.rc2.fc18.src.rpm, i686) logs: http://koji.fedoraproject.org/koji/getfile?taskID=4509077&name=build.log http://koji.fedoraproject.org/koji/getfile?taskID=4509077&name=mock_output.log http://koji.fedoraproject.org/koji/getfile?taskID=4509077&name=root.log http://koji.fedoraproject.org/koji/getfile?taskID=4509077&name=state.log Task 4509063 on buildvm-34.phx2.fedoraproject.org Task Type: build (f18-candidate, /scipy:cb69bd06f0d930fbe8840d89b918b617e28af63f) Closed tasks: ------------- Task 4509064 on buildvm-30.phx2.fedoraproject.org Task Type: buildSRPMFromSCM (/scipy:cb69bd06f0d930fbe8840d89b918b617e28af63f) logs: http://koji.fedoraproject.org/koji/getfile?taskID=4509064&name=build.log http://koji.fedoraproject.org/koji/getfile?taskID=4509064&name=checkout.log http://koji.fedoraproject.org/koji/getfile?taskID=4509064&name=mock_output.log http://koji.fedoraproject.org/koji/getfile?taskID=4509064&name=root.log http://koji.fedoraproject.org/koji/getfile?taskID=4509064&name=state.log Task Info: http://koji.fedoraproject.org/koji/taskinfo?taskID=4509063 Build Info: http://koji.fedoraproject.org/koji/buildinfo?buildID=350761 From njs at pobox.com Thu Sep 20 18:04:37 2012 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 20 Sep 2012 23:04:37 +0100 Subject: [Numpy-discussion] Fwd: Package: scipy-0.11.0-0.1.rc2.fc18 Tag: f18-updates-candidate Status: failed Built by: orion In-Reply-To: <505B9109.60607@cora.nwra.com> References: <20120920210154.D759123187@bastion01.phx2.fedoraproject.org> <505B9109.60607@cora.nwra.com> Message-ID: On Thu, Sep 20, 2012 at 10:56 PM, Orion Poplawski wrote: > This is a plea for some help. We've been having trouble getting scipy to > pass all of the tests in the Fedora 18 build with python 3.3 (although it > seems to build okay in Fedora 19). Below are the logs of the build. 
There
> appears to be some kind of memory corruption that manifests itself a little
> differently on 32-bit vs. 64-bit. I really have no idea myself how to
> pursue debugging this, though I'm happy to provide any more needed
> information.

You should probably ask the scipy list, but since we're on the numpy
list... what version of numpy are you even building against? There's
no released version of numpy that works with python 3.3...

-n

From orion at cora.nwra.com  Thu Sep 20 19:02:05 2012
From: orion at cora.nwra.com (Orion Poplawski)
Date: Thu, 20 Sep 2012 17:02:05 -0600
Subject: [Numpy-discussion] Fwd: Package: scipy-0.11.0-0.1.rc2.fc18 Tag:
 f18-updates-candidate Status: failed Built by: orion
In-Reply-To: 
References: <20120920210154.D759123187@bastion01.phx2.fedoraproject.org>
 <505B9109.60607@cora.nwra.com>
Message-ID: <505BA06D.5090305@cora.nwra.com>

On 09/20/2012 04:04 PM, Nathaniel Smith wrote:
> On Thu, Sep 20, 2012 at 10:56 PM, Orion Poplawski wrote:
>> This is a plea for some help. We've been having trouble getting scipy to
>> pass all of the tests in the Fedora 18 build with python 3.3 (although it
>> seems to build okay in Fedora 19). Below are the logs of the build. There
>> appears to be some kind of memory corruption that manifests itself a little
>> differently on 32-bit vs. 64-bit. I really have no idea myself how to
>> pursue debugging this, though I'm happy to provide any more needed
>> information.
>
> You should probably ask the scipy list, but since we're on the numpy
> list... what version of numpy are you even building against? There's
> no released version of numpy that works with python 3.3...
>
> -n

I'm building against numpy 1.7.0b2. I'll ask on the scipy list as well, but
one reason I'm mentioning it here is that the backtrace on the 32-bit dump
mentions numpy functions.

-- 
Orion Poplawski
Technical Manager                    303-415-9701 x222
NWRA, Boulder Office                 FAX: 303-415-9702
3380 Mitchell Lane                   orion at nwra.com
Boulder, CO 80301                    http://www.nwra.com

From charlesr.harris at gmail.com  Thu Sep 20 19:30:10 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 20 Sep 2012 17:30:10 -0600
Subject: [Numpy-discussion] tests for casting table?
(was: Numpy 1.7b1 API change cause big trouble) In-Reply-To: <2F489A53-EE68-4E62-9DB2-5224FA4EEB14@continuum.io> References: <2F489A53-EE68-4E62-9DB2-5224FA4EEB14@continuum.io> Message-ID: On Thu, Sep 20, 2012 at 2:20 PM, Travis Oliphant wrote: > Here are a couple of scripts that might help (I used them to compare > casting tables between various versions of NumPy): > > Casting Table Creation Script > ======================== > import numpy as np > > operators = np.set_numeric_ops().values() > types = '?bhilqpBHILQPfdgFDGO' > to_check = ['add', 'divide', 'minimum', 'maximum', 'remainder', > 'true_divide', 'logical_or', 'bitwise_or', 'right_shift', 'less', 'equal'] > operators = [op for op in operators if op.__name__ in to_check] > > > def type_wrap(op): > def func(obj1, obj2): > try: > result = op(obj1, obj2) > char = result.dtype.char > except: > char = 'X' > return char > > return func > > def coerce(): > result = {} > for op in operators: > d = {} > name = op.__name__ > print name > op = type_wrap(op) > for type1 in types: > s1 = np.dtype(type1).type(2) > a1 = np.dtype(type1).type([1,2,3]) > for type2 in types: > s2 = np.dtype(type2).type(1) > a2 = np.dtype(type2).type([2,3,4]) > codes = [] > # scalar scalar > codes.append(op(s1, s2)) > # scalar array > codes.append(op(s1, a2)) > # array scalar > codes.append(op(a1, s2)) > # array array > codes.append(op(a1, a2)) > d[type1,type2] = codes > result[name] = d > > #for check_key in to_check: > # for key in result.keys(): > # if key == check_key: > # continue > # if result[key] == result[check_key]: > # del result[key] > #assert set(result.keys()) == set(to_check) > return result > > import sys > if sys.maxint > 2**33: > bits = 64 > else: > bits = 32 > > def write(): > import cPickle > file = open('coercion-%s-%sbit.pkl'%(np.__version__, bits),'w') > cPickle.dump(coerce(),file,protocol=2) > file.close() > > if __name__ == '__main__': > write() > > > > > > Comparison Script > ================ > > import numpy as np > > > def compare(result1, result2): > for op in result1.keys(): > print "**** ", op, " ****" > if op not in result2: > print op, " not in the first" > table1 = result1[op] > table2 = result2[op] > if table1 == table2: > print "Tables are the same" > else: > if set(table1.keys()) != set(table2.keys()): > print "Keys are not the same" > continue > for key in table1.keys(): > if table1[key] != table2[key]: > print "Different at ", key, ": ", table1[key], > table2[key] > > import cPickle > import sys > > if __name__ == '__main__': > name1 = 'coercion-1.5.1-64bit.pkl' > name2 = 'coercion-1.6.1-64bit.pkl' > > if len(sys.argv) > 1: > name1 = 'coercion-%s-64bit.pkl' % sys.argv[1] > if len(sys.argv) > 2: > name2 = 'coercion-%s-64bit.pkl' % sys.argv[2] > result1 = cPickle.load(open(name1)) > result2 = cPickle.load(open(name2)) > compare(result1, result2) > > > > On Sep 20, 2012, at 3:09 PM, Nathaniel Smith wrote: > > > On Mon, Sep 17, 2012 at 10:22 AM, Matthew Brett > wrote: > >> Hi, > >> > >> On Sun, Sep 9, 2012 at 6:12 PM, Fr?d?ric Bastien > wrote: > >>> The third is releated to change to the casting rules in numpy. Before > >>> a scalar complex128 * vector float32 gived a vector of dtype > >>> complex128. Now it give a vector of complex64. The reason is that now > >>> the scalar of different category only change the category, not the > >>> precision. I would consider a must that we warn clearly about this > >>> interface change. Most people won't see it, but people that optimize > >>> there code heavily could depend on such thing. 
> >> > >> It seems to me that it would be a very good idea to put the casting > >> table results into the tests to make sure we are keeping track of this > >> kind of thing. > >> > >> I'm happy to try to do it if no-one else more qualified has time. > > > > I haven't seen any PRs show up from anyone else in the last few days, > > and this would indeed be an excellent test to have, so that would be > > awesome. > > > IIRC, there are some scripts in the numpy repository. But I forget where I saw them. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nouiz at nouiz.org Thu Sep 20 20:05:12 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Thu, 20 Sep 2012 20:05:12 -0400 Subject: [Numpy-discussion] tests for casting table? (was: Numpy 1.7b1 API change cause big trouble) In-Reply-To: References: <2F489A53-EE68-4E62-9DB2-5224FA4EEB14@continuum.io> Message-ID: Hi, Finally, the change about the casting rule was done in NumPy 1.6. It is our test that checked specifically for numpy 1.6 behavior. But adding the test to make sure it don't change is an excellent idea. Fred On Thu, Sep 20, 2012 at 7:30 PM, Charles R Harris wrote: > > > On Thu, Sep 20, 2012 at 2:20 PM, Travis Oliphant > wrote: >> >> Here are a couple of scripts that might help (I used them to compare >> casting tables between various versions of NumPy): >> >> Casting Table Creation Script >> ======================== >> import numpy as np >> >> operators = np.set_numeric_ops().values() >> types = '?bhilqpBHILQPfdgFDGO' >> to_check = ['add', 'divide', 'minimum', 'maximum', 'remainder', >> 'true_divide', 'logical_or', 'bitwise_or', 'right_shift', 'less', 'equal'] >> operators = [op for op in operators if op.__name__ in to_check] >> >> >> def type_wrap(op): >> def func(obj1, obj2): >> try: >> result = op(obj1, obj2) >> char = result.dtype.char >> except: >> char = 'X' >> return char >> >> return func >> >> def coerce(): >> result = {} >> for op in operators: >> d = {} >> name = op.__name__ >> print name >> op = type_wrap(op) >> for type1 in types: >> s1 = np.dtype(type1).type(2) >> a1 = np.dtype(type1).type([1,2,3]) >> for type2 in types: >> s2 = np.dtype(type2).type(1) >> a2 = np.dtype(type2).type([2,3,4]) >> codes = [] >> # scalar scalar >> codes.append(op(s1, s2)) >> # scalar array >> codes.append(op(s1, a2)) >> # array scalar >> codes.append(op(a1, s2)) >> # array array >> codes.append(op(a1, a2)) >> d[type1,type2] = codes >> result[name] = d >> >> #for check_key in to_check: >> # for key in result.keys(): >> # if key == check_key: >> # continue >> # if result[key] == result[check_key]: >> # del result[key] >> #assert set(result.keys()) == set(to_check) >> return result >> >> import sys >> if sys.maxint > 2**33: >> bits = 64 >> else: >> bits = 32 >> >> def write(): >> import cPickle >> file = open('coercion-%s-%sbit.pkl'%(np.__version__, bits),'w') >> cPickle.dump(coerce(),file,protocol=2) >> file.close() >> >> if __name__ == '__main__': >> write() >> >> >> >> >> >> Comparison Script >> ================ >> >> import numpy as np >> >> >> def compare(result1, result2): >> for op in result1.keys(): >> print "**** ", op, " ****" >> if op not in result2: >> print op, " not in the first" >> table1 = result1[op] >> table2 = result2[op] >> if table1 == table2: >> print "Tables are the same" >> else: >> if set(table1.keys()) != set(table2.keys()): >> print "Keys are not the same" >> continue >> for key in table1.keys(): >> if table1[key] != table2[key]: >> print "Different at ", 
key, ": ", table1[key], >> table2[key] >> >> import cPickle >> import sys >> >> if __name__ == '__main__': >> name1 = 'coercion-1.5.1-64bit.pkl' >> name2 = 'coercion-1.6.1-64bit.pkl' >> >> if len(sys.argv) > 1: >> name1 = 'coercion-%s-64bit.pkl' % sys.argv[1] >> if len(sys.argv) > 2: >> name2 = 'coercion-%s-64bit.pkl' % sys.argv[2] >> result1 = cPickle.load(open(name1)) >> result2 = cPickle.load(open(name2)) >> compare(result1, result2) >> >> >> >> On Sep 20, 2012, at 3:09 PM, Nathaniel Smith wrote: >> >> > On Mon, Sep 17, 2012 at 10:22 AM, Matthew Brett >> > wrote: >> >> Hi, >> >> >> >> On Sun, Sep 9, 2012 at 6:12 PM, Fr?d?ric Bastien >> >> wrote: >> >>> The third is releated to change to the casting rules in numpy. Before >> >>> a scalar complex128 * vector float32 gived a vector of dtype >> >>> complex128. Now it give a vector of complex64. The reason is that now >> >>> the scalar of different category only change the category, not the >> >>> precision. I would consider a must that we warn clearly about this >> >>> interface change. Most people won't see it, but people that optimize >> >>> there code heavily could depend on such thing. >> >> >> >> It seems to me that it would be a very good idea to put the casting >> >> table results into the tests to make sure we are keeping track of this >> >> kind of thing. >> >> >> >> I'm happy to try to do it if no-one else more qualified has time. >> > >> > I haven't seen any PRs show up from anyone else in the last few days, >> > and this would indeed be an excellent test to have, so that would be >> > awesome. >> > > > > IIRC, there are some scripts in the numpy repository. But I forget where I > saw them. > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sebastian at sipsolutions.net Thu Sep 20 21:05:31 2012 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 21 Sep 2012 03:05:31 +0200 Subject: [Numpy-discussion] np.delete fix Message-ID: <1348189531.6880.8.camel@sebastian-laptop> Hey, I have written a small PR, to fix np.delete, since it would change the behavior a little (to the better IMO) I think I should also write to the list? So here is the problem with np.delete: 1. When using slices with negative strides, it does not work (best case) or give even wrong results. 2. When using an index array, it ignores negative indexes. 3. The fact that it uses setdiff1d makes it unnecessarily slow. https://github.com/numpy/numpy/pull/452/files fixes these things. The change is that out of bounds indices would raise an Exception (even if one might say that they do not matter to deletion), however I consider that a feature. And a small example how badly its wrong with slices: In [1]: arr = np.arange(4) In [2]: set(arr) - set(arr[1::-1]) Out[2]: set([2, 3]) In [3]: np.delete(arr, np.s_[1::-1]) Out[3]: array([3, 3]) Regards, Sebastian From spitskip at gmail.com Fri Sep 21 08:59:35 2012 From: spitskip at gmail.com (Wim Bakker) Date: Fri, 21 Sep 2012 14:59:35 +0200 Subject: [Numpy-discussion] ZeroRank memmap behavior? Message-ID: I'm deeply puzzled by the recently changed behavior of zero-rank memmaps. I think this change happened from version 1.6.0 to 1.6.1, which I'm currently using. >>> import numpy as np Create a zero-rank memmap. >>> x = np.memmap(filename='/tmp/m', dtype=float, mode='w+', shape=()) Give it a value: >>> x[...] = 22 >>> x memmap(22.0) So far so good. 
But now: >>> b = (x + x) / 1.5 >>> b memmap(29.333333333333332) WT.? Why is the result of this calculation a memmap? It even thinks that it's still linked to the file, but it's not: >>> b.filename '/tmp/m' If I try this with arrays then I don't get this weird behavior: >>> a = np.array(2, dtype=float) >>> (a + a) / 2.5 1.6000000000000001 which gives me a Python float, not a zero-rank array. Why does the memmap behave like that? Why do I get a memmap even though it's not connected to any file? Regards, Wim -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Fri Sep 21 10:26:44 2012 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 21 Sep 2012 16:26:44 +0200 Subject: [Numpy-discussion] ZeroRank memmap behavior? In-Reply-To: References: Message-ID: <1348237604.6880.11.camel@sebastian-laptop> Hey, this is indirectly related (I think it might fix many of these memmap oddities though?)... Why does the memmap object not implement: def __array_wrap__(self, obj): if self is obj: return obj return np.array(obj, copy=False, subok=False) By doing so if we have a ufunc with only memmap objects, the result will not be a memmap object, but if the ufunc has an output parameter, then "self is obj" which means that it is not casted. The problem is that subclass automatically inherit an __array_wrap__ method that sets the result to the subclasses type (which is not generally wanted for memmaps). May make a PR for this... Regards, Sebastian On Fri, 2012-09-21 at 14:59 +0200, Wim Bakker wrote: > I'm deeply puzzled by the recently changed behavior of zero-rank > memmaps. I think this change happened from version 1.6.0 to 1.6.1, > which I'm currently using. > > > >>> import numpy as np > > > Create a zero-rank memmap. > > > >>> x = np.memmap(filename='/tmp/m', dtype=float, mode='w+', shape=()) > > > Give it a value: > > > >>> x[...] = 22 > >>> x > memmap(22.0) > > > So far so good. But now: > > > >>> b = (x + x) / 1.5 > >>> b > memmap(29.333333333333332) > > > WT.? Why is the result of this calculation a memmap? > > > It even thinks that it's still linked to the file, but it's not: > > > >>> b.filename > '/tmp/m' > > > If I try this with arrays then I don't get this weird behavior: > > > >>> a = np.array(2, dtype=float) > > > >>> (a + a) / 2.5 > 1.6000000000000001 > > > which gives me a Python float, not a zero-rank array. > > > Why does the memmap behave like that? Why do I get a memmap even > though it's not connected to any file? > > > Regards, > > > Wim > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From chris.barker at noaa.gov Fri Sep 21 12:24:20 2012 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 21 Sep 2012 09:24:20 -0700 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: <21151E34-1DAD-4943-BCE6-70B9C217F937@continuum.io> Message-ID: On Thu, Sep 20, 2012 at 2:48 PM, Nathaniel Smith wrote: > because a += b > really should be the same as a = a + b. 
I don't think that's the case - the in-place operators should be (and
are) more than syntactic sugar -- they have a different meaning and
use (in fact, I think they shouldn't work at all for immutables, but I
guess the common increment-a-counter use was too good to pass up)

in the numpy case:

a = a + b

means "make a new array, from the result of adding a and b"

whereas:

a += b

means "change a in place by adding b to it"

In the first case, I'd expect the type of the result to be determined
by both a and b -- casting rules.

In the second case, a should certainly not be a different object, and
should not have a new data buffer, therefore should not change type.

Whereas in the general case, there is no assumption that with:

a = b + c

a is the same type as either b or c, but it is certainly not the same object.

-Chris

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From njs at pobox.com  Fri Sep 21 13:03:09 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 21 Sep 2012 18:03:09 +0100
Subject: [Numpy-discussion] Regression: in-place operations (possibly
 intentional)
In-Reply-To: 
References: <21151E34-1DAD-4943-BCE6-70B9C217F937@continuum.io>
Message-ID: 

On 21 Sep 2012 17:31, "Chris Barker" wrote:
>
> On Thu, Sep 20, 2012 at 2:48 PM, Nathaniel Smith wrote:
> > because a += b
> > really should be the same as a = a + b.
>
> I don't think that's the case - the in-place operators should be (and
> are) more than syntactic sugar -- they have a different meaning and
> use (in fact, I think they shouldn't work at all for immutables, but I
> guess the common increment-a-counter use was too good to pass up)
>
> in the numpy case:
>
> a = a + b
>
> means "make a new array, from the result of adding a and b"
>
> whereas:
>
> a += b
>
> means "change a in place by adding b to it"
>
> In the first case, I'd expect the type of the result to be determined
> by both a and b -- casting rules.
>
> In the second case, a should certainly not be a different object, and
> should not have a new data buffer, therefore should not change type.

You're right of course. What I meant is that
a += b
should produce the same result as
a[...] = a + b

If we change the casting rule for the first one but not the second, though,
then these will produce different results if a is integer and b is float:
the first will produce an error, while the second will succeed, silently
discarding fractional parts.

-n
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ondrej.certik at gmail.com  Fri Sep 21 13:41:24 2012
From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=)
Date: Fri, 21 Sep 2012 10:41:24 -0700
Subject: [Numpy-discussion] Fwd: Package: scipy-0.11.0-0.1.rc2.fc18 Tag:
 f18-updates-candidate Status: failed Built by: orion
In-Reply-To: <505B9109.60607@cora.nwra.com>
References: <20120920210154.D759123187@bastion01.phx2.fedoraproject.org>
 <505B9109.60607@cora.nwra.com>
Message-ID: 

Hi Orion,

On Thu, Sep 20, 2012 at 2:56 PM, Orion Poplawski wrote:
> This is a plea for some help. We've been having trouble getting scipy to
> pass all of the tests in the Fedora 18 build with python 3.3 (although it
> seems to build okay in Fedora 19). Below are the logs of the build.
I really have no idea myself how to > pursue debugging this, though I'm happy to provide any more needed > information. Thanks for testing the latest beta2 release. > Task 4509077 on buildvm-35.phx2.fedoraproject.org > Task Type: buildArch (scipy-0.11.0-0.1.rc2.fc18.src.rpm, i686) > logs: > http://koji.fedoraproject.org/koji/getfile?taskID=4509077&name=build.log This link has the following stacktrace: /lib/libpython3.3m.so.1.0(PyMem_Free+0x1c)[0xbf044c] /usr/lib/python3.3/site-packages/numpy/core/multiarray.cpython-33m.so(+0x4d52b)[0x42252b] /usr/lib/python3.3/site-packages/numpy/core/multiarray.cpython-33m.so(+0xcb7c5)[0x4a07c5] /usr/lib/python3.3/site-packages/numpy/core/multiarray.cpython-33m.so(+0xcbc5e)[0x4a0c5e] Which indeed looks like in NumPy. Would you be able to obtain full stacktrace? There has certainly been segfaults in Python 3.3 with NumPy, but we've fixed all that we could reproduce. That doesn't mean there couldn't be more. If you could nail it down a little bit more, that would be great. I'll help once I can reproduce it somehow. Ondrej From ralf.gommers at gmail.com Fri Sep 21 16:13:41 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 21 Sep 2012 22:13:41 +0200 Subject: [Numpy-discussion] specifying numpy as dependency in your project, install_requires Message-ID: Hi, An issue I keep running into is that packages use: install_requires = ["numpy"] or install_requires = ['numpy >= 1.6'] in their setup.py. This simply doesn't work a lot of the time. I actually filed a bug against patsy for that (https://github.com/pydata/patsy/issues/5), but Nathaniel is right that it would be better to bring it up on this list. The problem is that if you use pip, it doesn't detect numpy (may work better if you had installed numpy with setuptools) and tries to automatically install or upgrade numpy. That won't work if users don't have the right compiler. Just as bad would be that it does work, and the user didn't want to upgrade for whatever reason. This isn't just my problem; at Wes' pandas tutorial at EuroScipy I saw other people have the exact same problem. My recommendation would be to not use install_requires for numpy, but simply do something like this in setup.py: try: import numpy except ImportError: raise ImportError("my_package requires numpy") or try: from numpy.version import short_version as npversion except ImportError: raise ImportError("my_package requires numpy") if npversion < '1.6': raise ImportError("Numpy version is %s; required is version >= 1.6" % npversion) Any objections, better ideas? Is there a good place to put it in the numpy docs somewhere? Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Fri Sep 21 16:19:13 2012 From: travis at continuum.io (Travis Oliphant) Date: Fri, 21 Sep 2012 15:19:13 -0500 Subject: [Numpy-discussion] specifying numpy as dependency in your project, install_requires In-Reply-To: References: Message-ID: <9644D8AB-2BA7-4537-989B-F1B80DF73155@continuum.io> On Sep 21, 2012, at 3:13 PM, Ralf Gommers wrote: > Hi, > > An issue I keep running into is that packages use: > install_requires = ["numpy"] > or > install_requires = ['numpy >= 1.6'] > > in their setup.py. This simply doesn't work a lot of the time. I actually filed a bug against patsy for that (https://github.com/pydata/patsy/issues/5), but Nathaniel is right that it would be better to bring it up on this list. 
> > The problem is that if you use pip, it doesn't detect numpy (may work better if you had installed numpy with setuptools) and tries to automatically install or upgrade numpy. That won't work if users don't have the right compiler. Just as bad would be that it does work, and the user didn't want to upgrade for whatever reason. > > This isn't just my problem; at Wes' pandas tutorial at EuroScipy I saw other people have the exact same problem. My recommendation would be to not use install_requires for numpy, but simply do something like this in setup.py: > > try: > import numpy > except ImportError: > raise ImportError("my_package requires numpy") > > or > > try: > from numpy.version import short_version as npversion > except ImportError: > raise ImportError("my_package requires numpy") > if npversion < '1.6': > raise ImportError("Numpy version is %s; required is version >= 1.6" % npversion) > > Any objections, better ideas? Is there a good place to put it in the numpy docs somewhere? I agree. I would recommend against using install requires. -Travis -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Fri Sep 21 16:37:13 2012 From: ben.root at ou.edu (Benjamin Root) Date: Fri, 21 Sep 2012 16:37:13 -0400 Subject: [Numpy-discussion] specifying numpy as dependency in your project, install_requires In-Reply-To: <9644D8AB-2BA7-4537-989B-F1B80DF73155@continuum.io> References: <9644D8AB-2BA7-4537-989B-F1B80DF73155@continuum.io> Message-ID: On Fri, Sep 21, 2012 at 4:19 PM, Travis Oliphant wrote: > > On Sep 21, 2012, at 3:13 PM, Ralf Gommers wrote: > > Hi, > > An issue I keep running into is that packages use: > install_requires = ["numpy"] > or > install_requires = ['numpy >= 1.6'] > > in their setup.py. This simply doesn't work a lot of the time. I actually > filed a bug against patsy for that ( > https://github.com/pydata/patsy/issues/5), but Nathaniel is right that it > would be better to bring it up on this list. > > The problem is that if you use pip, it doesn't detect numpy (may work > better if you had installed numpy with setuptools) and tries to > automatically install or upgrade numpy. That won't work if users don't have > the right compiler. Just as bad would be that it does work, and the user > didn't want to upgrade for whatever reason. > > This isn't just my problem; at Wes' pandas tutorial at EuroScipy I saw > other people have the exact same problem. My recommendation would be to not > use install_requires for numpy, but simply do something like this in > setup.py: > > try: > import numpy > except ImportError: > raise ImportError("my_package requires numpy") > > or > > try: > from numpy.version import short_version as npversion > except ImportError: > raise ImportError("my_package requires numpy") > if npversion < '1.6': > raise ImportError("Numpy version is %s; required is version >= 1.6" > % npversion) > > Any objections, better ideas? Is there a good place to put it in the numpy > docs somewhere? > > > I agree. I would recommend against using install requires. > > -Travis > > > Why? I have personally never had an issue with this. The only way I could imagine that this wouldn't work is if numpy was installed via some other means and there wasn't an entry in the easy-install.pth (or whatever equivalent pip uses). If pip is having a problem detecting numpy, then that is a bug that needs fixing somewhere. 
As for packages getting updated unintentionally, easy_install and pip both require an argument to upgrade any existing packages (I think -U), so I am not sure how you are running into such a situation. I have found install_requires to be a powerful feature in my setup.py scripts, and I have seen no reason to discourage it. Perhaps I am the only one? Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at hilboll.de Fri Sep 21 16:41:33 2012 From: lists at hilboll.de (Andreas Hilboll) Date: Fri, 21 Sep 2012 22:41:33 +0200 Subject: [Numpy-discussion] specifying numpy as dependency in your project, install_requires In-Reply-To: References: <9644D8AB-2BA7-4537-989B-F1B80DF73155@continuum.io> Message-ID: <505CD0FD.1070002@hilboll.de> Am Fr 21 Sep 2012 22:37:13 CEST schrieb Benjamin Root: > > > On Fri, Sep 21, 2012 at 4:19 PM, Travis Oliphant > wrote: > > > On Sep 21, 2012, at 3:13 PM, Ralf Gommers wrote: > >> Hi, >> >> An issue I keep running into is that packages use: >> install_requires = ["numpy"] >> or >> install_requires = ['numpy >= 1.6'] >> >> in their setup.py. This simply doesn't work a lot of the time. I >> actually filed a bug against patsy for that >> (https://github.com/pydata/patsy/issues/5), but Nathaniel is >> right that it would be better to bring it up on this list. >> >> The problem is that if you use pip, it doesn't detect numpy (may >> work better if you had installed numpy with setuptools) and tries >> to automatically install or upgrade numpy. That won't work if >> users don't have the right compiler. Just as bad would be that it >> does work, and the user didn't want to upgrade for whatever reason. >> >> This isn't just my problem; at Wes' pandas tutorial at EuroScipy >> I saw other people have the exact same problem. My recommendation >> would be to not use install_requires for numpy, but simply do >> something like this in setup.py: >> >> try: >> import numpy >> except ImportError: >> raise ImportError("my_package requires numpy") >> >> or >> >> try: >> from numpy.version import short_version as npversion >> except ImportError: >> raise ImportError("my_package requires numpy") >> if npversion < '1.6': >> raise ImportError("Numpy version is %s; required is >> version >= 1.6" % npversion) >> >> Any objections, better ideas? Is there a good place to put it in >> the numpy docs somewhere? > > I agree. I would recommend against using install requires. > > -Travis > > > > Why? I have personally never had an issue with this. The only way I > could imagine that this wouldn't work is if numpy was installed via > some other means and there wasn't an entry in the easy-install.pth (or > whatever equivalent pip uses). If pip is having a problem detecting > numpy, then that is a bug that needs fixing somewhere. > > As for packages getting updated unintentionally, easy_install and pip > both require an argument to upgrade any existing packages (I think > -U), so I am not sure how you are running into such a situation. Quite easily, actually. I ran into pip wanting to upgrade numpy when I was installing/upgrading a package depending on numpy. Problem is, -U upgrades both the package you explicitly select *and* its dependencies. I know there's some way around this, but it's not obvious -- at least not for users. Cheers, Andreas. 
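A minimal sketch of the workaround alluded to above, using only the pip
flags that come up in this thread (-U and --no-deps); "mypackage" here is a
placeholder name, not a real project:

    # Upgrade only the named package; --no-deps keeps pip away from numpy
    # and every other already-installed dependency.
    pip install -U --no-deps mypackage

    # A plain second install then pulls in any dependency that is genuinely
    # missing, without upgrading ones that are already present.
    pip install mypackage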
From ralf.gommers at gmail.com Fri Sep 21 16:42:05 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 21 Sep 2012 22:42:05 +0200 Subject: [Numpy-discussion] specifying numpy as dependency in your project, install_requires In-Reply-To: References: <9644D8AB-2BA7-4537-989B-F1B80DF73155@continuum.io> Message-ID: On Fri, Sep 21, 2012 at 10:37 PM, Benjamin Root wrote: > > > On Fri, Sep 21, 2012 at 4:19 PM, Travis Oliphant wrote: > >> >> On Sep 21, 2012, at 3:13 PM, Ralf Gommers wrote: >> >> Hi, >> >> An issue I keep running into is that packages use: >> install_requires = ["numpy"] >> or >> install_requires = ['numpy >= 1.6'] >> >> in their setup.py. This simply doesn't work a lot of the time. I actually >> filed a bug against patsy for that ( >> https://github.com/pydata/patsy/issues/5), but Nathaniel is right that >> it would be better to bring it up on this list. >> >> The problem is that if you use pip, it doesn't detect numpy (may work >> better if you had installed numpy with setuptools) and tries to >> automatically install or upgrade numpy. That won't work if users don't have >> the right compiler. Just as bad would be that it does work, and the user >> didn't want to upgrade for whatever reason. >> >> This isn't just my problem; at Wes' pandas tutorial at EuroScipy I saw >> other people have the exact same problem. My recommendation would be to not >> use install_requires for numpy, but simply do something like this in >> setup.py: >> >> try: >> import numpy >> except ImportError: >> raise ImportError("my_package requires numpy") >> >> or >> >> try: >> from numpy.version import short_version as npversion >> except ImportError: >> raise ImportError("my_package requires numpy") >> if npversion < '1.6': >> raise ImportError("Numpy version is %s; required is version >= >> 1.6" % npversion) >> >> Any objections, better ideas? Is there a good place to put it in the >> numpy docs somewhere? >> >> >> I agree. I would recommend against using install requires. >> >> -Travis >> >> >> > Why? I have personally never had an issue with this. The only way I > could imagine that this wouldn't work is if numpy was installed via some > other means and there wasn't an entry in the easy-install.pth (or whatever > equivalent pip uses). > Eh, just installing numpy with "python setup.py install" uses plain distutils, not setuptools. So there indeed isn't an entry in easy-install.pth. Which some consider a feature:) > If pip is having a problem detecting numpy, then that is a bug that needs > fixing somewhere. > Sure. But who's going to do that? > As for packages getting updated unintentionally, easy_install and pip both > require an argument to upgrade any existing packages (I think -U), so I am > not sure how you are running into such a situation. > No, if the version detection fails pip will happily "upgrade" my 1.8.0-dev to 1.6.2. > I have found install_requires to be a powerful feature in my setup.py > scripts, and I have seen no reason to discourage it. Perhaps I am the only > one? > I'm sure you're not the only one. But it's still severely broken. Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
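For illustration, a rough, untested sketch of that idea -- it reuses the
numpy.version check from Ralf's snippet earlier in the thread, keeps the same
naive string comparison, and uses "mypackage" as a placeholder project name:

    from setuptools import setup

    install_requires = []
    try:
        from numpy.version import short_version as npversion
        if npversion < '1.6':
            # numpy is present but too old: let the installer handle it
            install_requires.append('numpy>=1.6')
    except ImportError:
        # numpy is missing entirely: let the installer fetch it
        install_requires.append('numpy>=1.6')

    setup(
        name='mypackage',
        version='0.1',
        install_requires=install_requires,
    )

With something like this, installing the package on a machine that already
has a recent numpy never touches it, while a machine without numpy still gets
it pulled in (and can still fail if no C compiler is available, as noted
above).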
URL: From nouiz at nouiz.org Fri Sep 21 16:45:25 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Fri, 21 Sep 2012 16:45:25 -0400 Subject: [Numpy-discussion] specifying numpy as dependency in your project, install_requires In-Reply-To: References: <9644D8AB-2BA7-4537-989B-F1B80DF73155@continuum.io> Message-ID: On Fri, Sep 21, 2012 at 4:37 PM, Benjamin Root wrote: > > > On Fri, Sep 21, 2012 at 4:19 PM, Travis Oliphant > wrote: >> >> >> On Sep 21, 2012, at 3:13 PM, Ralf Gommers wrote: >> >> Hi, >> >> An issue I keep running into is that packages use: >> install_requires = ["numpy"] >> or >> install_requires = ['numpy >= 1.6'] >> >> in their setup.py. This simply doesn't work a lot of the time. I actually >> filed a bug against patsy for that >> (https://github.com/pydata/patsy/issues/5), but Nathaniel is right that it >> would be better to bring it up on this list. >> >> The problem is that if you use pip, it doesn't detect numpy (may work >> better if you had installed numpy with setuptools) and tries to >> automatically install or upgrade numpy. That won't work if users don't have >> the right compiler. Just as bad would be that it does work, and the user >> didn't want to upgrade for whatever reason. >> >> This isn't just my problem; at Wes' pandas tutorial at EuroScipy I saw >> other people have the exact same problem. My recommendation would be to not >> use install_requires for numpy, but simply do something like this in >> setup.py: >> >> try: >> import numpy >> except ImportError: >> raise ImportError("my_package requires numpy") >> >> or >> >> try: >> from numpy.version import short_version as npversion >> except ImportError: >> raise ImportError("my_package requires numpy") >> if npversion < '1.6': >> raise ImportError("Numpy version is %s; required is version >= 1.6" >> % npversion) >> >> Any objections, better ideas? Is there a good place to put it in the numpy >> docs somewhere? >> >> >> I agree. I would recommend against using install requires. >> >> -Travis >> >> > > Why? I have personally never had an issue with this. The only way I could > imagine that this wouldn't work is if numpy was installed via some other > means and there wasn't an entry in the easy-install.pth (or whatever > equivalent pip uses). If pip is having a problem detecting numpy, then that > is a bug that needs fixing somewhere. > > As for packages getting updated unintentionally, easy_install and pip both > require an argument to upgrade any existing packages (I think -U), so I am > not sure how you are running into such a situation. If a user use that option, it will also try to updaet NumPy. This is a bad default behavior. The work aroud is to pass -U and --no-deps to don't update the dependency. People don't want to update numpy when they update there package other package as Theano. > I have found install_requires to be a powerful feature in my setup.py > scripts, and I have seen no reason to discourage it. Perhaps I am the only > one? What about if numpy is installed and recent enough, don't put in in the install_require. If not there, add it there? It will still fail if not c compiler is there, but maybe it won't update it at then same time? 
Fred From chris.barker at noaa.gov Fri Sep 21 17:04:33 2012 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 21 Sep 2012 14:04:33 -0700 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: <21151E34-1DAD-4943-BCE6-70B9C217F937@continuum.io> Message-ID: On Fri, Sep 21, 2012 at 10:03 AM, Nathaniel Smith wrote: > You're right of course. What I meant is that > a += b > should produce the same result as > a[...] = a + b > > If we change the casting rule for the first one but not the second, though, > then these will produce different results if a is integer and b is float: I certainly agree that we would want that, however, numpy still needs to deal tih pyton symantics, which means that wile (at the numpy level) we can control what "a[...] =" means, and we can control what "a + b" produces, we can't change what "a + b" means depending on the context of the left hand side. that means we need to do the casting at the assignment stage, which I gues is your point -- so: a_int += a_float should do the addition with the "regular" casting rules, then cast to an int after doing that. not sure the implimentation details. Oh, and: a += b should be the same as a[..] = a + b should be the same as np.add(a, b, out=a) not sure what the story is with that at this point. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From njs at pobox.com Fri Sep 21 17:39:04 2012 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 21 Sep 2012 22:39:04 +0100 Subject: [Numpy-discussion] specifying numpy as dependency in your project, install_requires In-Reply-To: References: <9644D8AB-2BA7-4537-989B-F1B80DF73155@continuum.io> Message-ID: On Fri, Sep 21, 2012 at 9:42 PM, Ralf Gommers wrote: > Eh, just installing numpy with "python setup.py install" uses plain > distutils, not setuptools. So there indeed isn't an entry in > easy-install.pth. Which some consider a feature:) I don't think this is correct. To be clear on the technical issue: what's going on is that when pip sees install_requires=["numpy"], it needs to check whether you already have the distribution called "numpy" installed. It turns out that in the wonderful world of python packaging, "distributions" are not quite the same as "packages", so it can't do this by searching PYTHONPATH for a "numpy" directory. What it does is search PYTHONPATH for a file named numpy-.egg-info[1]. This isn't *quite* as dumb as it seems, because in practice there really isn't a 1-to-1 mapping between source distributions and installed packages, but it's... pretty dumb. Anyway. The problem is that Ralf installed numpy by doing an in-place build in his source tree, and then adding his source tree to his PYTHONPATH. But, he didn't put a .egg-info on his PYTHONPATH, so pip couldn't tell that numpy was installed, and did something dumb. So the question is, how do we get a .egg-info? For the specific case Ralf ran into, I'm pretty sure the solution is just that if you're clever enough to do an in-place build and add it to your PYTHONPATH, you should be clever enough to also run 'python setupegg.py egg_info' which will create a .egg-info to go with your in-place build and everything will be fine. The question is whether there are any other situations where this can break. I'm not aware of any. 
Contrary to what's claimed in the bit I quoted above, I just ran a plain vanilla 'python setup.py install' on numpy inside a virtualenv, and I ended up with a .egg-info installed. I'm pretty sure plain old distutils installs .egg-infos these days too. In that bug report Ralf says there's some problem with virtualenvs, but I'm not sure what (I use virtualenvs extensively and have never run into anything). Can anyone elaborate? [1] or several other variants, see some PEP or another for the tedious details. -n P.S.: yeah the thing where pip decides to upgrade the world is REALLY OBNOXIOUS. It also appears to be on the list to be fixed in the next release or the next release+1, so I guess there's hope?: https://github.com/pypa/pip/pull/571 From njs at pobox.com Fri Sep 21 18:20:30 2012 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 21 Sep 2012 23:20:30 +0100 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: <21151E34-1DAD-4943-BCE6-70B9C217F937@continuum.io> Message-ID: On Fri, Sep 21, 2012 at 10:04 PM, Chris Barker wrote: > On Fri, Sep 21, 2012 at 10:03 AM, Nathaniel Smith wrote: > >> You're right of course. What I meant is that >> a += b >> should produce the same result as >> a[...] = a + b >> >> If we change the casting rule for the first one but not the second, though, >> then these will produce different results if a is integer and b is float: > > I certainly agree that we would want that, however, numpy still needs > to deal tih pyton symantics, which means that wile (at the numpy > level) we can control what "a[...] =" means, and we can control what > "a + b" produces, we can't change what "a + b" means depending on the > context of the left hand side. > > that means we need to do the casting at the assignment stage, which I > gues is your point -- so: > > a_int += a_float > > should do the addition with the "regular" casting rules, then cast to > an int after doing that. > > not sure the implimentation details. Yes, that seems to be what happens. In [1]: a = np.arange(3) In [2]: a *= 1.5 In [3]: a Out[3]: array([0, 1, 3]) But still, the question is, can and should we tighten up the assignment casting rules to same_kind or similar? -n From efiring at hawaii.edu Fri Sep 21 19:51:12 2012 From: efiring at hawaii.edu (Eric Firing) Date: Fri, 21 Sep 2012 13:51:12 -1000 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: References: <21151E34-1DAD-4943-BCE6-70B9C217F937@continuum.io> Message-ID: <505CFD70.7010801@hawaii.edu> On 2012/09/21 12:20 PM, Nathaniel Smith wrote: > On Fri, Sep 21, 2012 at 10:04 PM, Chris Barker wrote: >> On Fri, Sep 21, 2012 at 10:03 AM, Nathaniel Smith wrote: >> >>> You're right of course. What I meant is that >>> a += b >>> should produce the same result as >>> a[...] = a + b >>> >>> If we change the casting rule for the first one but not the second, though, >>> then these will produce different results if a is integer and b is float: >> >> I certainly agree that we would want that, however, numpy still needs >> to deal tih pyton symantics, which means that wile (at the numpy >> level) we can control what "a[...] =" means, and we can control what >> "a + b" produces, we can't change what "a + b" means depending on the >> context of the left hand side. 
>> >> that means we need to do the casting at the assignment stage, which I >> gues is your point -- so: >> >> a_int += a_float >> >> should do the addition with the "regular" casting rules, then cast to >> an int after doing that. >> >> not sure the implimentation details. > > Yes, that seems to be what happens. > > In [1]: a = np.arange(3) > > In [2]: a *= 1.5 > > In [3]: a > Out[3]: array([0, 1, 3]) > > But still, the question is, can and should we tighten up the > assignment casting rules to same_kind or similar? An example of where tighter casting seems undesirable is the case of functions that return integer values with floating point dtype, such as rint(). It seems natural to do something like In [1]: ind = np.empty((3,), dtype=int) In [2]: rint(np.arange(3, dtype=float) / 3, out=ind) Out[2]: array([0, 0, 1]) where one is generating integer indices based on some manipulation of floating point numbers. This works in 1.6 but fails in 1.7. Eric > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Fri Sep 21 21:28:22 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 21 Sep 2012 21:28:22 -0400 Subject: [Numpy-discussion] specifying numpy as dependency in your project, install_requires In-Reply-To: References: <9644D8AB-2BA7-4537-989B-F1B80DF73155@continuum.io> Message-ID: On Fri, Sep 21, 2012 at 5:39 PM, Nathaniel Smith wrote: > On Fri, Sep 21, 2012 at 9:42 PM, Ralf Gommers wrote: >> Eh, just installing numpy with "python setup.py install" uses plain >> distutils, not setuptools. So there indeed isn't an entry in >> easy-install.pth. Which some consider a feature:) > > I don't think this is correct. To be clear on the technical issue: > what's going on is that when pip sees install_requires=["numpy"], it > needs to check whether you already have the distribution called > "numpy" installed. It turns out that in the wonderful world of python > packaging, "distributions" are not quite the same as "packages", so it > can't do this by searching PYTHONPATH for a "numpy" directory. What it > does is search PYTHONPATH for a file named > numpy-.egg-info[1]. This isn't *quite* as dumb as it > seems, because in practice there really isn't a 1-to-1 mapping between > source distributions and installed packages, but it's... pretty dumb. > Anyway. The problem is that Ralf installed numpy by doing an in-place > build in his source tree, and then adding his source tree to his > PYTHONPATH. But, he didn't put a .egg-info on his PYTHONPATH, so pip > couldn't tell that numpy was installed, and did something dumb. > > So the question is, how do we get a .egg-info? For the specific case > Ralf ran into, I'm pretty sure the solution is just that if you're > clever enough to do an in-place build and add it to your PYTHONPATH, > you should be clever enough to also run 'python setupegg.py egg_info' > which will create a .egg-info to go with your in-place build and > everything will be fine. > > The question is whether there are any other situations where this can > break. I'm not aware of any. Contrary to what's claimed in the bit I > quoted above, I just ran a plain vanilla 'python setup.py install' on > numpy inside a virtualenv, and I ended up with a .egg-info installed. > I'm pretty sure plain old distutils installs .egg-infos these days > too. 
In that bug report Ralf says there's some problem with > virtualenvs, but I'm not sure what (I use virtualenvs extensively and > have never run into anything). Can anyone elaborate? > > [1] or several other variants, see some PEP or another for the tedious details. > > -n > > P.S.: yeah the thing where pip decides to upgrade the world is REALLY > OBNOXIOUS. It also appears to be on the list to be fixed in the next > release or the next release+1, so I guess there's hope?: > https://github.com/pypa/pip/pull/571 In statsmodels we moved to the check that Ralf proposes, and no requires. When I'm easy_installing a package I always need to watch out when a package tries to upgrade numpy. I just had to hit Crtl-C several times when the requires of pandas tried to update my numpy version. Josef > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Fri Sep 21 21:29:12 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 21 Sep 2012 19:29:12 -0600 Subject: [Numpy-discussion] Regression: in-place operations (possibly intentional) In-Reply-To: <505CFD70.7010801@hawaii.edu> References: <21151E34-1DAD-4943-BCE6-70B9C217F937@continuum.io> <505CFD70.7010801@hawaii.edu> Message-ID: On Fri, Sep 21, 2012 at 5:51 PM, Eric Firing wrote: > On 2012/09/21 12:20 PM, Nathaniel Smith wrote: > > On Fri, Sep 21, 2012 at 10:04 PM, Chris Barker > wrote: > >> On Fri, Sep 21, 2012 at 10:03 AM, Nathaniel Smith > wrote: > >> > >>> You're right of course. What I meant is that > >>> a += b > >>> should produce the same result as > >>> a[...] = a + b > >>> > >>> If we change the casting rule for the first one but not the second, > though, > >>> then these will produce different results if a is integer and b is > float: > >> > >> I certainly agree that we would want that, however, numpy still needs > >> to deal tih pyton symantics, which means that wile (at the numpy > >> level) we can control what "a[...] =" means, and we can control what > >> "a + b" produces, we can't change what "a + b" means depending on the > >> context of the left hand side. > >> > >> that means we need to do the casting at the assignment stage, which I > >> gues is your point -- so: > >> > >> a_int += a_float > >> > >> should do the addition with the "regular" casting rules, then cast to > >> an int after doing that. > >> > >> not sure the implimentation details. > > > > Yes, that seems to be what happens. > > > > In [1]: a = np.arange(3) > > > > In [2]: a *= 1.5 > > > > In [3]: a > > Out[3]: array([0, 1, 3]) > > > > But still, the question is, can and should we tighten up the > > assignment casting rules to same_kind or similar? > > An example of where tighter casting seems undesirable is the case of > functions that return integer values with floating point dtype, such as > rint(). It seems natural to do something like > > In [1]: ind = np.empty((3,), dtype=int) > > In [2]: rint(np.arange(3, dtype=float) / 3, out=ind) > Out[2]: array([0, 0, 1]) > > where one is generating integer indices based on some manipulation of > floating point numbers. This works in 1.6 but fails in 1.7. > In [16]: rint(arange(3, dtype=float)/3, out=ind, casting='unsafe') Out[16]: array([0, 0, 1]) I'm not sure how to make this backward compatible though. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at gmail.com Sat Sep 22 08:18:23 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 22 Sep 2012 14:18:23 +0200 Subject: [Numpy-discussion] specifying numpy as dependency in your project, install_requires In-Reply-To: References: <9644D8AB-2BA7-4537-989B-F1B80DF73155@continuum.io> Message-ID: On Fri, Sep 21, 2012 at 11:39 PM, Nathaniel Smith wrote: > On Fri, Sep 21, 2012 at 9:42 PM, Ralf Gommers > wrote: > > Eh, just installing numpy with "python setup.py install" uses plain > > distutils, not setuptools. So there indeed isn't an entry in > > easy-install.pth. Which some consider a feature:) > > I don't think this is correct. To be clear on the technical issue: > what's going on is that when pip sees install_requires=["numpy"], it > needs to check whether you already have the distribution called > "numpy" installed. It turns out that in the wonderful world of python > packaging, "distributions" are not quite the same as "packages", so it > can't do this by searching PYTHONPATH for a "numpy" directory. What it > does is search PYTHONPATH for a file named > numpy-.egg-info[1]. This isn't *quite* as dumb as it > seems, because in practice there really isn't a 1-to-1 mapping between > source distributions and installed packages, but it's... pretty dumb. > Anyway. The problem is that Ralf installed numpy by doing an in-place > build in his source tree, and then adding his source tree to his > PYTHONPATH. But, he didn't put a .egg-info on his PYTHONPATH, so pip > couldn't tell that numpy was installed, and did something dumb. > > So the question is, how do we get a .egg-info? For the specific case > Ralf ran into, I'm pretty sure the solution is just that if you're > clever enough to do an in-place build and add it to your PYTHONPATH, > you should be clever enough to also run 'python setupegg.py egg_info' > which will create a .egg-info to go with your in-place build and > everything will be fine. > That command first starts rebuilding numpy. The correct one seems to be 'python setupegg.py install_egg_info'. This does install the egg_info file in site-packages, but it's still not working: $ python -c "import numpy as np; print(np.__version__)" 1.8.0.dev-d8988ab $ ls /Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/ ... numpy-1.8.0.dev_d8988ab-py2.6.egg-info ... $ pip install -U --no-deps pandas Exception: Traceback (most recent call last): ... VersionConflict: (numpy 1.5.1 (/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages), Requirement.parse('numpy>=1.6')). As long as you try to do anything with PYTHONPATH, I think pip/easy_install/setuptools are broken in a quite fundamental way. The question is whether there are any other situations where this can > break. I'm not aware of any. Contrary to what's claimed in the bit I > quoted above, I just ran a plain vanilla 'python setup.py install' on > numpy inside a virtualenv, and I ended up with a .egg-info installed. > I'm pretty sure plain old distutils installs .egg-infos these days > too. You're right, that is the case. > In that bug report Ralf says there's some problem with > virtualenvs, but I'm not sure what (I use virtualenvs extensively and > have never run into anything). Can anyone elaborate? > I haven't used them in a while, so I can't explain in detail now. Basic numpy install into virtualenvs is working now AFAIK (which was quite painful too), but I remember having problems when using them in combination with PYTHONPATH too. 
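For illustration, the alternative being discussed in this thread (an explicit
check at build time instead of install_requires, as statsmodels reportedly
does) presumably amounts to a few lines near the top of setup.py. This is only
an untested sketch, and the minimum version shown is an example:

    # Hypothetical sketch of a setup-time check instead of install_requires.
    # The required version below is illustrative.
    try:
        import numpy
    except ImportError:
        raise SystemExit("This package requires numpy >= 1.6; please install it first.")

    from distutils.version import LooseVersion
    if LooseVersion(numpy.__version__) < LooseVersion('1.6'):
        raise SystemExit("numpy >= 1.6 is required, found %s" % numpy.__version__)
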
> > [1] or several other variants, see some PEP or another for the tedious > details. > > -n > > P.S.: yeah the thing where pip decides to upgrade the world is REALLY > OBNOXIOUS. It also appears to be on the list to be fixed in the next > release or the next release+1, so I guess there's hope?: > https://github.com/pypa/pip/pull/571 > Good to know. Let's hope that does make it in. Given it's development model, I'm less optimistic that easy_install will receive the same fix though .... Until both pip and easy_install are fixed, this alone should be enough for the advice to be "don't use install_requires". It's not like my alternative suggestion takes away any information or valuable functionality. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Sat Sep 22 09:54:08 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 22 Sep 2012 15:54:08 +0200 Subject: [Numpy-discussion] Views of memmaps and offset Message-ID: <20120922135408.GH1292@phare.normalesup.org> Hi list, I am struggling with offsets on the view of a memmaped array. Consider the following: import numpy as np a = np.memmap('tmp.mmap', dtype=np.float64, shape=50, mode='w+') a[:] = np.arange(50) b = a[10:] Here, I have a.offset == 0 and b.offset == 0. In practice, the data in b is offset compared to the start of the file, given that it is a view computed with an offset. My goal is, given b, to find a way to open a new view on the file, e.g. in a different process. For this I need the offset. Any idea of how I can retrieve it? In the previous numpy versions, I could go from b to a using the 'base' attribute of a. This is no longer possible. Also, should the above behavior be considered as a bug? Cheers, Ga?l From olivier.grisel at ensta.org Sat Sep 22 10:08:50 2012 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Sat, 22 Sep 2012 16:08:50 +0200 Subject: [Numpy-discussion] Views of memmaps and offset In-Reply-To: <20120922135408.GH1292@phare.normalesup.org> References: <20120922135408.GH1292@phare.normalesup.org> Message-ID: 2012/9/22 Gael Varoquaux : > Hi list, > > I am struggling with offsets on the view of a memmaped array. Consider > the following: > > import numpy as np > > a = np.memmap('tmp.mmap', dtype=np.float64, shape=50, mode='w+') > a[:] = np.arange(50) > b = a[10:] > > Here, I have a.offset == 0 and b.offset == 0. In practice, the data in b > is offset compared to the start of the file, given that it is a view > computed with an offset. > > My goal is, given b, to find a way to open a new view on the file, e.g. > in a different process. For this I need the offset. > > Any idea of how I can retrieve it? In the previous numpy versions, I > could go from b to a using the 'base' attribute of a. This is no longer > possible. > > Also, should the above behavior be considered as a bug? Note: this question on applies on the current master of numpy. On previously released versions of numpy it's possible to introspect `b.base.strides`. A similar question apply if a was itself open with an offset: orig = np.memmap('tmp.mmap', dtype=np.float64, shape=100, mode='w+') orig[:] = np.arange(orig.shape[0]) * -1.0 # negative markers to detect under / overflows a = np.memmap('tmp.mmap', dtype=np.float64, shape=50, mode='r+', offset=16) a[:] = np.arange(50) b = a[10:] How to reopen the same view as b on the buffer allocated by orig with the current API in numpy master? 
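For concreteness, a rough sketch (editorial, not from the thread) of what
reopening such a view would look like if the absolute file offset of b could
still be recovered -- which is exactly the information that gets lost when the
base is collapsed:

    import numpy as np

    # Recompute b's absolute offset by hand from the example above:
    # 16 bytes for the second mapping plus 10 float64 items skipped by the slice.
    itemsize = np.dtype(np.float64).itemsize
    abs_offset = 16 + 10 * itemsize

    # Reopen the same region of the file as a new view, e.g. from another process.
    b_again = np.memmap('tmp.mmap', dtype=np.float64, mode='r+',
                        shape=50 - 10, offset=abs_offset)
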
These questions stem from the following effort to build tools for
efficient memory management of numpy-based data structures when working
with python multiprocessing pools:

https://github.com/joblib/joblib/pull/44

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

From olivier.grisel at ensta.org  Sat Sep 22 11:46:16 2012
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Sat, 22 Sep 2012 17:46:16 +0200
Subject: [Numpy-discussion] Views of memmaps and offset
In-Reply-To: 
References: <20120922135408.GH1292@phare.normalesup.org>
Message-ID: 

There is also a third use case that is problematic on numpy master:

orig = np.memmap('tmp.mmap', dtype=np.float64, shape=100, mode='w+')
orig[:] = np.arange(orig.shape[0]) * -1.0  # negative markers to detect under / overflows

a = np.memmap('tmp.mmap', dtype=np.float64, shape=50, mode='r+', offset=16)
a[:] = np.arange(50)
b = np.asarray(a[10:])

Now b does not even have a 'filename' attribute anymore. `b.base` is a
python mmap instance, but the latter is created with a file descriptor.

It would still be possible to use:

from _multiprocessing import address_of_buffer

to find the memory address of the mmap buffer and use that to open new
buffer views on the same memory segment from subprocesses using
`numpy.frombuffer((ctypes.c_byte * n_byte).from_address(addr))`, but in
case of failure (e.g. the file has been deleted on the HDD) one gets a
segmentation fault instead of a much more user-friendly, catchable
file-not-found exception.

From charlesr.harris at gmail.com  Sat Sep 22 12:15:52 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 22 Sep 2012 10:15:52 -0600
Subject: [Numpy-discussion] Views of memmaps and offset
In-Reply-To: 
References: <20120922135408.GH1292@phare.normalesup.org>
Message-ID: 

On Sat, Sep 22, 2012 at 9:46 AM, Olivier Grisel wrote:

> There is also a third use case that is problematic on numpy master:
>
> orig = np.memmap('tmp.mmap', dtype=np.float64, shape=100, mode='w+')
> orig[:] = np.arange(orig.shape[0]) * -1.0  # negative markers to
> detect under / overflows
>
> a = np.memmap('tmp.mmap', dtype=np.float64, shape=50, mode='r+', offset=16)
> a[:] = np.arange(50)
> b = np.asarray(a[10:])
>
> Now b does not even have a 'filename' attribute anymore. `b.base` is a
> python mmap instance, but the latter is created with a file descriptor.
>
> It would still be possible to use:
>
> from _multiprocessing import address_of_buffer
>
> to find the memory address of the mmap buffer and use that to open new
> buffer views on the same memory segment from subprocesses using
> `numpy.frombuffer((ctypes.c_byte * n_byte).from_address(addr))`, but in
> case of failure (e.g. the file has been deleted on the HDD) one gets a
> segmentation fault instead of a much more user-friendly, catchable
> file-not-found exception.
>

Would some sort of 'dup' method be useful?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From gael.varoquaux at normalesup.org  Sat Sep 22 12:16:59 2012
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Sat, 22 Sep 2012 18:16:59 +0200
Subject: [Numpy-discussion] Views of memmaps and offset
In-Reply-To: 
References: <20120922135408.GH1292@phare.normalesup.org>
Message-ID: <20120922161659.GB4650@phare.normalesup.org>

On Sat, Sep 22, 2012 at 10:15:52AM -0600, Charles R Harris wrote:
> Would some sort of 'dup' method be useful?

What do you mean by dup?
G From olivier.grisel at ensta.org Sat Sep 22 12:30:27 2012 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Sat, 22 Sep 2012 18:30:27 +0200 Subject: [Numpy-discussion] Views of memmaps and offset In-Reply-To: <20120922161659.GB4650@phare.normalesup.org> References: <20120922135408.GH1292@phare.normalesup.org> <20120922161659.GB4650@phare.normalesup.org> Message-ID: A posix dup (http://www.unix.com/man-page/POSIX/3posix/dup/) would not solve it as the fd is hidden inside the python `mmap.mmap` instance that is a builtin that just exposes the python buffer interface and hides the implementation details. The only clean solution would be to make `numpy.memmap` use a wrapper buffer object that would keep track of the filename and offset attributes instead of using a `mmap.mmap` instance directly. -- Olivier From charlesr.harris at gmail.com Sat Sep 22 13:16:27 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Sep 2012 11:16:27 -0600 Subject: [Numpy-discussion] Views of memmaps and offset In-Reply-To: <20120922135408.GH1292@phare.normalesup.org> References: <20120922135408.GH1292@phare.normalesup.org> Message-ID: On Sat, Sep 22, 2012 at 7:54 AM, Gael Varoquaux < gael.varoquaux at normalesup.org> wrote: > Hi list, > > I am struggling with offsets on the view of a memmaped array. Consider > the following: > > import numpy as np > > a = np.memmap('tmp.mmap', dtype=np.float64, shape=50, mode='w+') > a[:] = np.arange(50) > b = a[10:] > > Here, I have a.offset == 0 and b.offset == 0. In practice, the data in b > is offset compared to the start of the file, given that it is a view > computed with an offset. > > My goal is, given b, to find a way to open a new view on the file, e.g. > in a different process. For this I need the offset. > > Any idea of how I can retrieve it? In the previous numpy versions, I > could go from b to a using the 'base' attribute of a. This is no longer > possible. > > Also, should the above behavior be considered as a bug? > I think this is a bug, taking a view should probably update the offset. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Sat Sep 22 13:31:11 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 22 Sep 2012 19:31:11 +0200 Subject: [Numpy-discussion] Views of memmaps and offset In-Reply-To: References: <20120922135408.GH1292@phare.normalesup.org> <20120922161659.GB4650@phare.normalesup.org> Message-ID: <20120922173111.GB31321@phare.normalesup.org> On Sat, Sep 22, 2012 at 06:30:27PM +0200, Olivier Grisel wrote: > The only clean solution would be to make `numpy.memmap` use a wrapper > buffer object that would keep track of the filename and offset > attributes instead of using a `mmap.mmap` instance directly. Indeed, Olivier and I have been struggling to find a solution to our problem: knowing whether the data in an array is a view on a file, and if so which file. This was possible in previous numpy versions, by going up the chain of 'base' references, and looking on the top-most object. It is no longer possible as the base now is a Python mmap that does not keep track of this information. Would people agree with a patch proposing to change the base of an np.memmap with a wrapper on the Python mmap keeping track of this information? If so, we'll try to submit a patch quickly. 
Gaël

From gael.varoquaux at normalesup.org  Sat Sep 22 13:31:50 2012
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Sat, 22 Sep 2012 19:31:50 +0200
Subject: [Numpy-discussion] Views of memmaps and offset
In-Reply-To: 
References: <20120922135408.GH1292@phare.normalesup.org>
Message-ID: <20120922173150.GC31321@phare.normalesup.org>

On Sat, Sep 22, 2012 at 11:16:27AM -0600, Charles R Harris wrote:
> I think this is a bug, taking a view should probably update the offset.

OK, we can include a fix for that alongside the patch to keep track
of the filename.

Cheers,

Gaël

From sebastian at sipsolutions.net  Sat Sep 22 13:50:40 2012
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Sat, 22 Sep 2012 19:50:40 +0200
Subject: [Numpy-discussion] Ignore axes with dimension==1 for contiguous flags
Message-ID: <1348336240.20038.20.camel@sebastian-laptop>

Hey,

Numpy currently assumes that if "ndim > 1" then it is impossible for
any array to be both C- and F-contiguous; however, an axis of dimension
1 has no effect on the memory layout. I think I have made most of the
important changes (actually really very few), though I bet some parts
of numpy still need adapting because of smaller quirks:

https://github.com/seberg/numpy/compare/master...cflags

This example sums up two advantages. On that branch:

In [9]: a = np.arange(9).reshape(3,3)[::3,:]

In [10]: a.flags.contiguous, a.flags.fortran
Out[10]: (True, True)

Note that currently _both_ are false, because numpy does not reset the
strides for the first dimension.

The only real problem I see is that someone who assumes that for a
contiguous array strides[0] or strides[-1] is elemsize has to change
the code or face segmentation faults, but maybe I am missing something
big?

Any comments on whether this is the right idea? And if so, where would
more changes be needed?

Regards,

Sebastian

From charlesr.harris at gmail.com  Sat Sep 22 13:52:49 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 22 Sep 2012 11:52:49 -0600
Subject: [Numpy-discussion] Views of memmaps and offset
In-Reply-To: <20120922173150.GC31321@phare.normalesup.org>
References: <20120922135408.GH1292@phare.normalesup.org> <20120922173150.GC31321@phare.normalesup.org>
Message-ID: 

On Sat, Sep 22, 2012 at 11:31 AM, Gael Varoquaux <
gael.varoquaux at normalesup.org> wrote:

> On Sat, Sep 22, 2012 at 11:16:27AM -0600, Charles R Harris wrote:
> > I think this is a bug, taking a view should probably update the
> offset.
>
> OK, we can include a fix for that alongside the patch to keep track
> of the filename.
>

It already tracks the file name

In [1]: a = np.memmap('tmp.mmap', dtype=np.float64, shape=50, mode='w+', offset=4)

In [2]: b = a[10:]

In [3]: b.filename
Out[3]: '/home/charris/tmp.mmap'

or did you mean something else? I was guessing the fix could be made in
the same place that copied over the filename.

> > Cheers,
> > Gaël
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From charlesr.harris at gmail.com Sat Sep 22 13:55:30 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Sep 2012 11:55:30 -0600 Subject: [Numpy-discussion] Views of memmaps and offset In-Reply-To: References: <20120922135408.GH1292@phare.normalesup.org> <20120922173150.GC31321@phare.normalesup.org> Message-ID: On Sat, Sep 22, 2012 at 11:52 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Sat, Sep 22, 2012 at 11:31 AM, Gael Varoquaux < > gael.varoquaux at normalesup.org> wrote: > >> On Sat, Sep 22, 2012 at 11:16:27AM -0600, Charles R Harris wrote: >> > I think this is a bug, taking a view should probably update the >> offset. >> >> OK, we can include a fix for that alongside with the patch to keep track >> of the filename. >> > > It already tracks the file name > > In [1]: a = np.memmap('tmp.mmap', dtype=np.float64, shape=50, mode='w+', > offset=4) > > In [2]: b = a[10:] > > In [3]: b.filename > Out[3]: '/home/charris/tmp.mmap' > > or did you mean something else? I was guessing the fix could be mad in the > same place that copied over the filename. > > You can also tell it is a memmap In [4]: b._mmap Out[4]: Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Sat Sep 22 14:01:21 2012 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sat, 22 Sep 2012 20:01:21 +0200 Subject: [Numpy-discussion] np.array execution path Message-ID: <1348336881.20038.28.camel@sebastian-laptop> Hi, I have a bit of trouble figuring this out. I would have expected np.asarray(array) to go through ctors, PyArray_NewFromArray, but it seems to me it does not, so which execution path is exactly taken here? The reason I am asking is that I want to figure out this behavior/bug, and I really am not sure which function is responsible: In [69]: o = np.ones(3) In [70]: no = np.asarray(o, order='C') In [71]: no[:] = 10 In [72]: o # OK, o was changed in place: Out[72]: array([ 10., 10., 10.]) In [73]: no.flags # But no claims to own its data! Out[73]: C_CONTIGUOUS : True F_CONTIGUOUS : True OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [74]: no = np.asarray(o, order='F') In [75]: no[:] = 11 In [76]: o # Here asarray actually returned a real copy! Out[76]: array([ 10., 10., 10.]) Thanks, Sebastian From olivier.grisel at ensta.org Sat Sep 22 14:06:56 2012 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Sat, 22 Sep 2012 20:06:56 +0200 Subject: [Numpy-discussion] Views of memmaps and offset In-Reply-To: References: <20120922135408.GH1292@phare.normalesup.org> <20120922173150.GC31321@phare.normalesup.org> Message-ID: 2012/9/22 Charles R Harris : > > > On Sat, Sep 22, 2012 at 11:52 AM, Charles R Harris > wrote: >> >> >> >> On Sat, Sep 22, 2012 at 11:31 AM, Gael Varoquaux >> wrote: >>> >>> On Sat, Sep 22, 2012 at 11:16:27AM -0600, Charles R Harris wrote: >>> > I think this is a bug, taking a view should probably update the >>> > offset. >>> >>> OK, we can include a fix for that alongside with the patch to keep track >>> of the filename. >> >> >> It already tracks the file name >> >> In [1]: a = np.memmap('tmp.mmap', dtype=np.float64, shape=50, mode='w+', >> offset=4) >> >> In [2]: b = a[10:] >> >> In [3]: b.filename >> Out[3]: '/home/charris/tmp.mmap' >> >> or did you mean something else? I was guessing the fix could be mad in the >> same place that copied over the filename. 
>> > > You can also tell it is a memmap > > In [4]: b._mmap > Out[4]: The problem is with: >>> c = np.asarray(b) >>> c.base But you loose the pointer to the filename and the offset. In previous versions of numpy c.base used to be the np.memmap instance from which c is an array view. That allowed to make efficient pickling without any memory copy when doing single machine multiprocessing stuff by introspecting the base ancestry. This is no longer possible with the current base collapsing that is happening in numpy master. The only way would be to replace the mmap.mmap instance of a numpy.memmap object by a buffer implementation that would wrap or derive from mmap.mmap but also preserve the original filename and offset. -- Olivier http://twitter.com/ogrisel - http://github.com/ogrisel From travis at continuum.io Sat Sep 22 14:12:02 2012 From: travis at continuum.io (Travis Oliphant) Date: Sat, 22 Sep 2012 13:12:02 -0500 Subject: [Numpy-discussion] np.array execution path In-Reply-To: <1348336881.20038.28.camel@sebastian-laptop> References: <1348336881.20038.28.camel@sebastian-laptop> Message-ID: Check to see if this expression is true no is o In the first case no and o are the same object Travis -- Travis Oliphant (on a mobile) 512-826-7480 On Sep 22, 2012, at 1:01 PM, Sebastian Berg wrote: > Hi, > > I have a bit of trouble figuring this out. I would have expected > np.asarray(array) to go through ctors, PyArray_NewFromArray, but it > seems to me it does not, so which execution path is exactly taken here? > The reason I am asking is that I want to figure out this behavior/bug, > and I really am not sure which function is responsible: > > In [69]: o = np.ones(3) > > In [70]: no = np.asarray(o, order='C') > > In [71]: no[:] = 10 > > In [72]: o # OK, o was changed in place: > Out[72]: array([ 10., 10., 10.]) > > In [73]: no.flags # But no claims to own its data! > Out[73]: > C_CONTIGUOUS : True > F_CONTIGUOUS : True > OWNDATA : True > WRITEABLE : True > ALIGNED : True > UPDATEIFCOPY : False > > In [74]: no = np.asarray(o, order='F') > > In [75]: no[:] = 11 > > In [76]: o # Here asarray actually returned a real copy! > Out[76]: array([ 10., 10., 10.]) > > > Thanks, > > Sebastian > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Sat Sep 22 14:19:53 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Sep 2012 12:19:53 -0600 Subject: [Numpy-discussion] Views of memmaps and offset In-Reply-To: References: <20120922135408.GH1292@phare.normalesup.org> <20120922173150.GC31321@phare.normalesup.org> Message-ID: On Sat, Sep 22, 2012 at 12:06 PM, Olivier Grisel wrote: > 2012/9/22 Charles R Harris : > > > > > > On Sat, Sep 22, 2012 at 11:52 AM, Charles R Harris > > wrote: > >> > >> > >> > >> On Sat, Sep 22, 2012 at 11:31 AM, Gael Varoquaux > >> wrote: > >>> > >>> On Sat, Sep 22, 2012 at 11:16:27AM -0600, Charles R Harris wrote: > >>> > I think this is a bug, taking a view should probably update the > >>> > offset. > >>> > >>> OK, we can include a fix for that alongside with the patch to keep > track > >>> of the filename. > >> > >> > >> It already tracks the file name > >> > >> In [1]: a = np.memmap('tmp.mmap', dtype=np.float64, shape=50, mode='w+', > >> offset=4) > >> > >> In [2]: b = a[10:] > >> > >> In [3]: b.filename > >> Out[3]: '/home/charris/tmp.mmap' > >> > >> or did you mean something else? 
I was guessing the fix could be mad in > the > >> same place that copied over the filename. > >> > > > > You can also tell it is a memmap > > > > In [4]: b._mmap > > Out[4]: > > The problem is with: > > >>> c = np.asarray(b) > >>> c.base > > > But you loose the pointer to the filename and the offset. In previous > versions of numpy c.base used to be the np.memmap instance from which > c is an array view. That allowed to make efficient pickling without > any memory copy when doing single machine multiprocessing stuff by > introspecting the base ancestry. > > This is no longer possible with the current base collapsing that is > happening in numpy master. The only way would be to replace the > mmap.mmap instance of a numpy.memmap object by a buffer implementation > that would wrap or derive from mmap.mmap but also preserve the > original filename and offset. > Pickling was left as an unresolved problem after to offset updates to memmap. It would be nice to get all those issues fixed up. As to the 1.7 release, I've been thinking we are violating the release early, release often maxim. Bugs trickle in at a constant rate and if we wait to fix them all we wait forever. So while it would be nice to have this in 1.7.0, I think we should also plan on a 1.7.1 bug fix release a few months after the 1.7.0 release. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Sat Sep 22 14:26:11 2012 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sat, 22 Sep 2012 20:26:11 +0200 Subject: [Numpy-discussion] np.array execution path In-Reply-To: References: <1348336881.20038.28.camel@sebastian-laptop> Message-ID: <1348338371.20038.37.camel@sebastian-laptop> Ooops obviously thanks a lot, stupid me. Thanks was also enough to figure the rest out myself... On Sat, 2012-09-22 at 13:12 -0500, Travis Oliphant wrote: > Check to see if this expression is true > > no is o > > In the first case no and o are the same object > > > Travis > > -- > Travis Oliphant > (on a mobile) > 512-826-7480 > > > On Sep 22, 2012, at 1:01 PM, Sebastian Berg wrote: > > > Hi, > > > > I have a bit of trouble figuring this out. I would have expected > > np.asarray(array) to go through ctors, PyArray_NewFromArray, but it > > seems to me it does not, so which execution path is exactly taken here? > > The reason I am asking is that I want to figure out this behavior/bug, > > and I really am not sure which function is responsible: > > > > In [69]: o = np.ones(3) > > > > In [70]: no = np.asarray(o, order='C') > > > > In [71]: no[:] = 10 > > > > In [72]: o # OK, o was changed in place: > > Out[72]: array([ 10., 10., 10.]) > > > > In [73]: no.flags # But no claims to own its data! > > Out[73]: > > C_CONTIGUOUS : True > > F_CONTIGUOUS : True > > OWNDATA : True > > WRITEABLE : True > > ALIGNED : True > > UPDATEIFCOPY : False > > > > In [74]: no = np.asarray(o, order='F') > > > > In [75]: no[:] = 11 > > > > In [76]: o # Here asarray actually returned a real copy! 
> > Out[76]: array([ 10., 10., 10.]) > > > > > > Thanks, > > > > Sebastian > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Sat Sep 22 14:30:46 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Sep 2012 12:30:46 -0600 Subject: [Numpy-discussion] Views of memmaps and offset In-Reply-To: References: <20120922135408.GH1292@phare.normalesup.org> <20120922173150.GC31321@phare.normalesup.org> Message-ID: On Sat, Sep 22, 2012 at 12:19 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Sat, Sep 22, 2012 at 12:06 PM, Olivier Grisel > wrote: > >> 2012/9/22 Charles R Harris : >> > >> > >> > On Sat, Sep 22, 2012 at 11:52 AM, Charles R Harris >> > wrote: >> >> >> >> >> >> >> >> On Sat, Sep 22, 2012 at 11:31 AM, Gael Varoquaux >> >> wrote: >> >>> >> >>> On Sat, Sep 22, 2012 at 11:16:27AM -0600, Charles R Harris wrote: >> >>> > I think this is a bug, taking a view should probably update the >> >>> > offset. >> >>> >> >>> OK, we can include a fix for that alongside with the patch to keep >> track >> >>> of the filename. >> >> >> >> >> >> It already tracks the file name >> >> >> >> In [1]: a = np.memmap('tmp.mmap', dtype=np.float64, shape=50, >> mode='w+', >> >> offset=4) >> >> >> >> In [2]: b = a[10:] >> >> >> >> In [3]: b.filename >> >> Out[3]: '/home/charris/tmp.mmap' >> >> >> >> or did you mean something else? I was guessing the fix could be mad in >> the >> >> same place that copied over the filename. >> >> >> > >> > You can also tell it is a memmap >> > >> > In [4]: b._mmap >> > Out[4]: >> >> The problem is with: >> >> >>> c = np.asarray(b) >> >>> c.base >> >> >> But you loose the pointer to the filename and the offset. In previous >> versions of numpy c.base used to be the np.memmap instance from which >> c is an array view. That allowed to make efficient pickling without >> any memory copy when doing single machine multiprocessing stuff by >> introspecting the base ancestry. >> >> This is no longer possible with the current base collapsing that is >> happening in numpy master. The only way would be to replace the >> mmap.mmap instance of a numpy.memmap object by a buffer implementation >> that would wrap or derive from mmap.mmap but also preserve the >> original filename and offset. >> > > Pickling was left as an unresolved problem after to offset updates to > memmap. It would be nice to get all those issues fixed up. > > As to the 1.7 release, I've been thinking we are violating the release > early, release often maxim. Bugs trickle in at a constant rate and if we > wait to fix them all we wait forever. So while it would be nice to have > this in 1.7.0, I think we should also plan on a 1.7.1 bug fix release a few > months after the 1.7.0 release. > > Previous work was at http://projects.scipy.org/numpy/ticket/1452, which is probabably worth a read. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From seb.haase at gmail.com Sat Sep 22 16:00:30 2012 From: seb.haase at gmail.com (Sebastian Haase) Date: Sat, 22 Sep 2012 22:00:30 +0200 Subject: [Numpy-discussion] np.array execution path In-Reply-To: References: <1348336881.20038.28.camel@sebastian-laptop> Message-ID: Oh, is this actually documented - I knew that np.array would (by default) only create copies as need ... but I never knew it would - if all fits - even just return the original Python-object... Thanks, Sebastian Haase On Sat, Sep 22, 2012 at 8:12 PM, Travis Oliphant wrote: > Check to see if this expression is true > > no is o > > In the first case no and o are the same object > > > Travis > > -- > Travis Oliphant > (on a mobile) > 512-826-7480 > > > On Sep 22, 2012, at 1:01 PM, Sebastian Berg wrote: > >> Hi, >> >> I have a bit of trouble figuring this out. I would have expected >> np.asarray(array) to go through ctors, PyArray_NewFromArray, but it >> seems to me it does not, so which execution path is exactly taken here? >> The reason I am asking is that I want to figure out this behavior/bug, >> and I really am not sure which function is responsible: >> >> In [69]: o = np.ones(3) >> >> In [70]: no = np.asarray(o, order='C') >> >> In [71]: no[:] = 10 >> >> In [72]: o # OK, o was changed in place: >> Out[72]: array([ 10., 10., 10.]) >> >> In [73]: no.flags # But no claims to own its data! >> Out[73]: >> C_CONTIGUOUS : True >> F_CONTIGUOUS : True >> OWNDATA : True >> WRITEABLE : True >> ALIGNED : True >> UPDATEIFCOPY : False >> >> In [74]: no = np.asarray(o, order='F') >> >> In [75]: no[:] = 11 >> >> In [76]: o # Here asarray actually returned a real copy! >> Out[76]: array([ 10., 10., 10.]) >> >> >> Thanks, >> >> Sebastian >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From gael.varoquaux at normalesup.org Sat Sep 22 18:15:01 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 23 Sep 2012 00:15:01 +0200 Subject: [Numpy-discussion] Views of memmaps and offset In-Reply-To: References: <20120922135408.GH1292@phare.normalesup.org> <20120922173150.GC31321@phare.normalesup.org> Message-ID: <20120922221501.GA8823@phare.normalesup.org> On Sat, Sep 22, 2012 at 12:30:46PM -0600, Charles R Harris wrote: > As to the 1.7 release, I've been thinking we are violating the release > early, release often maxim. Bugs trickle in at a constant rate and if we > wait to fix them all we wait forever. So while it would be nice to have > this in 1.7.0, I think we should also plan on a 1.7.1 bug fix release a > few months after the 1.7.0 release. Indeed, we are having a hard time releasing early. Maybe the reason is the massive amount of changes to core behavior in numpy. The situation here is that a usecase that was working up to 1.6 included will stop working in 1.7 with no possible workaround. I am biassed because that usecase is important to me. However, I do find that having a core package that is such a moving target makes it hard to build upon. 
Ga?l From gael.varoquaux at normalesup.org Sat Sep 22 18:20:02 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 23 Sep 2012 00:20:02 +0200 Subject: [Numpy-discussion] Views of memmaps and offset In-Reply-To: References: <20120922135408.GH1292@phare.normalesup.org> <20120922173150.GC31321@phare.normalesup.org> Message-ID: <20120922222002.GB8823@phare.normalesup.org> On Sat, Sep 22, 2012 at 12:19:53PM -0600, Charles R Harris wrote: > But you loose the pointer to the filename and the offset. In previous > versions of numpy c.base used to be the np.memmap instance from which > c is an array view. That allowed to make efficient pickling without > any memory copy when doing single machine multiprocessing stuff by > introspecting the base ancestry. > This is no longer possible with the current base collapsing that is > happening in numpy master. The only way would be to replace the > mmap.mmap instance of a numpy.memmap object by a buffer implementation > that would wrap or derive from mmap.mmap but also preserve the > original filename and offset. > Pickling was left as an unresolved problem after to offset updates to > memmap. To be clear, the issue here is not really a pickling issue where pickling is a general purpose persistence model, but rather a specific and focussed I/O problem. We don't want to swap out to disk and load back an array that is a view on the disk. With the current numpy, we can tell that it is indeed a view on the disk, and we can tell what offset and strides, but we cannot tell from which file it comes from, because that information is lost when deriving children arrays. Gael From sebastian at sipsolutions.net Sat Sep 22 17:24:13 2012 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sat, 22 Sep 2012 23:24:13 +0200 Subject: [Numpy-discussion] np.array execution path In-Reply-To: References: <1348336881.20038.28.camel@sebastian-laptop> Message-ID: <1348349053.20038.43.camel@sebastian-laptop> In case you are interested, the second (real odditiy), is caused by ISFORTRAN and IS_F_CONTIGUOUS mixup, I have found three occurances where I think ISFORTRAN should be replaced by the latter. Check also: https://github.com/seberg/numpy/commit/4d2713ce8f2107d225fe291f5da6c6a75436647e Sebastian On Sat, 2012-09-22 at 13:12 -0500, Travis Oliphant wrote: > Check to see if this expression is true > > no is o > > In the first case no and o are the same object > > > Travis > > -- > Travis Oliphant > (on a mobile) > 512-826-7480 > > > On Sep 22, 2012, at 1:01 PM, Sebastian Berg wrote: > > > Hi, > > > > I have a bit of trouble figuring this out. I would have expected > > np.asarray(array) to go through ctors, PyArray_NewFromArray, but it > > seems to me it does not, so which execution path is exactly taken here? > > The reason I am asking is that I want to figure out this behavior/bug, > > and I really am not sure which function is responsible: > > > > In [69]: o = np.ones(3) > > > > In [70]: no = np.asarray(o, order='C') > > > > In [71]: no[:] = 10 > > > > In [72]: o # OK, o was changed in place: > > Out[72]: array([ 10., 10., 10.]) > > > > In [73]: no.flags # But no claims to own its data! > > Out[73]: > > C_CONTIGUOUS : True > > F_CONTIGUOUS : True > > OWNDATA : True > > WRITEABLE : True > > ALIGNED : True > > UPDATEIFCOPY : False > > > > In [74]: no = np.asarray(o, order='F') > > > > In [75]: no[:] = 11 > > > > In [76]: o # Here asarray actually returned a real copy! 
> > Out[76]: array([ 10., 10., 10.]) > > > > > > Thanks, > > > > Sebastian > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From 275438859 at qq.com Sun Sep 23 12:23:37 2012 From: 275438859 at qq.com (=?gb18030?B?0MTI59byueI=?=) Date: Mon, 24 Sep 2012 00:23:37 +0800 Subject: [Numpy-discussion] errors of scipy build Message-ID: Hi,all. I have installed numpy and scipy step by step carefully as the instructions from website:http://www.scipy.org/Installing_SciPy/Mac_OS_X But still get many errors and warnings while it's building. (OSX lion 10.7.4 /Xcode 4.5 /clang /gfortran4.2.3) Do scipy and numpy must be built by gcc?? Such as: 1 error generated. _configtest.c:5:28: error: 'test_array' declared as an array with a negative size static int test_array [1 - 2 * !(((long) (sizeof (npy_check_sizeof_type))) == 4)]; ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. C compiler: clang -fno-strict-aliasing -fno-common -dynamic -pipe -O2 -fwrapv -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes /////////////////////////////////////////////////////////// Constructing wrapper function "drotm"... getarrdims:warning: assumed shape array, using 0 instead of '*' getarrdims:warning: assumed shape array, using 0 instead of '*' x,y = drotm(x,y,param,[n,offx,incx,offy,incy,overwrite_x,overwrite_y]) -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Sun Sep 23 13:11:30 2012 From: cournape at gmail.com (David Cournapeau) Date: Sun, 23 Sep 2012 18:11:30 +0100 Subject: [Numpy-discussion] errors of scipy build In-Reply-To: References: Message-ID: On Sun, Sep 23, 2012 at 5:23 PM, ???? <275438859 at qq.com> wrote: > Hi,all. > I have installed numpy and scipy step by step carefully as the > instructions from website:http://www.scipy.org/Installing_SciPy/Mac_OS_X > But still get many errors and warnings while it's building. > (OSX lion 10.7.4 /Xcode 4.5 /clang /gfortran4.2.3) Those error are parts of the configuration. As long as the build runs until the end, the build is successful. You should not build scipy with gcc on mac os x 10.7, as it is known to cause issues. David From njs at pobox.com Sun Sep 23 13:54:46 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 23 Sep 2012 18:54:46 +0100 Subject: [Numpy-discussion] np.array execution path In-Reply-To: <1348349053.20038.43.camel@sebastian-laptop> References: <1348336881.20038.28.camel@sebastian-laptop> <1348349053.20038.43.camel@sebastian-laptop> Message-ID: On Sat, Sep 22, 2012 at 10:24 PM, Sebastian Berg wrote: > In case you are interested, the second (real odditiy), is caused by > ISFORTRAN and IS_F_CONTIGUOUS mixup, I have found three occurances where > I think ISFORTRAN should be replaced by the latter. Check also: > > https://github.com/seberg/numpy/commit/4d2713ce8f2107d225fe291f5da6c6a75436647e So I guess we have this ISFORTRAN function (also exposed to Python[1]). It's documented as checking the rather odd condition of an array being in fortran-order AND having ndim > 1. Sebastian, as part of polishing up some of our contiguity-handling code, is suggesting changing this so that ISFORTRAN is true for an array that is (fortran order && !C order). 
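For readers following along, a few editorial examples of what the
Python-level np.isfortran returns on simple inputs (both the current and the
proposed definition agree on all of these):

    import numpy as np

    np.isfortran(np.ones(3))                    # False: 1-D, both C and F contiguous
    np.isfortran(np.ones((2, 3), order='F'))    # True: Fortran order, ndim > 1
    np.isfortran(np.ones((2, 3)))               # False: C order
    np.isfortran(np.ones((2, 3), order='F').T)  # False: the transpose is C contiguous
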
Off the top of my head I can't think of any situation where *either* of these predicates is actually useful. (I can see why you want to check if an array is in fortran order, but not why it'd be natural to check whether it's in fortran order and also these other conditions together in one function call.) The problem is, this makes it hard to know whether Sebastian's change is a good idea. Can anyone think of legitimate uses for ISFORTRAN? Or should it just be deprecated altogether? -n [1] http://docs.scipy.org/doc/numpy/reference/generated/numpy.isfortran.html From njs at pobox.com Sun Sep 23 14:34:59 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 23 Sep 2012 19:34:59 +0100 Subject: [Numpy-discussion] Views of memmaps and offset In-Reply-To: References: <20120922135408.GH1292@phare.normalesup.org> Message-ID: On Sat, Sep 22, 2012 at 4:46 PM, Olivier Grisel wrote: > There is also a third use case that is problematic on numpy master: > > orig = np.memmap('tmp.mmap', dtype=np.float64, shape=100, mode='w+') > orig[:] = np.arange(orig.shape[0]) * -1.0 # negative markers to > detect under / overflows > > a = np.memmap('tmp.mmap', dtype=np.float64, shape=50, mode='r+', offset=16) > a[:] = np.arange(50) > b = np.asarray(a[10:]) > > Now b does not even have a 'filename' attribute anymore. `b.base` is a > python mmap instance but the later is created with a file descriptor. > > It would still be possible to use: > > from _multiprocessing import address_of_buffer > > to find the memory address of the mmap buffer and use than to open new > buffer views on the same memory segment from subprocesses using > `numpy.frombuffer((ctypes.c_byte * n_byte).fromaddress(addr))` but in > case of failure (e.g. the file has been deleted on the HDD) one gets a > segmentation fault instead of a much more userfriendly catchable file > not found exception. On Unix, if the processes are related in a way that lets this work, then this would actually be a far better solution... it will always refer to the same file that was opened in the parent, even if it's has since been deleted or renamed or replaced by a different file. (And if they aren't related by fork(), then sending the fd would be better than sending the filename, for the same reason.) Of course that doesn't help for Windows; no idea what happens there. Numpy in general really does not provide any reliable way of tracking the relationship between different views of the same buffer. Introspecting on .base will work in many cases, but it's not guaranteed to even in earlier versions. Maybe you don't care because it works well enough but it's an inherently rickety design :-). Trying to think of the correct solution here, I think it would have to be something like... have the numpy mmap code keep a global scorecard of all extant memory mappings -- filename, offset, length, memory address. And then when you want to do an "mmap aware pickle", you check the address of the array you're trying to save to see if it falls into an mmap'ed region. That'd be simpler and more reliable than anything involving base tracking. 
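A rough sketch of the "scorecard" idea described above -- purely illustrative;
nothing like this exists in numpy, and the names are made up:

    # Registry of live memory maps: (start_address, length, filename, offset).
    _live_mmaps = []

    def register_mmap(start, length, filename, offset):
        _live_mmaps.append((start, length, filename, offset))

    def find_backing_file(addr):
        # Return (filename, offset_into_file) if addr falls inside a
        # registered mapping, else None.
        for start, length, filename, offset in _live_mmaps:
            if start <= addr < start + length:
                return filename, offset + (addr - start)
        return None
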
-n From olivier.grisel at ensta.org Sun Sep 23 14:51:46 2012 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Sun, 23 Sep 2012 20:51:46 +0200 Subject: [Numpy-discussion] Views of memmaps and offset In-Reply-To: References: <20120922135408.GH1292@phare.normalesup.org> Message-ID: 2012/9/23 Nathaniel Smith : > On Sat, Sep 22, 2012 at 4:46 PM, Olivier Grisel > wrote: >> There is also a third use case that is problematic on numpy master: >> >> orig = np.memmap('tmp.mmap', dtype=np.float64, shape=100, mode='w+') >> orig[:] = np.arange(orig.shape[0]) * -1.0 # negative markers to >> detect under / overflows >> >> a = np.memmap('tmp.mmap', dtype=np.float64, shape=50, mode='r+', offset=16) >> a[:] = np.arange(50) >> b = np.asarray(a[10:]) >> >> Now b does not even have a 'filename' attribute anymore. `b.base` is a >> python mmap instance but the later is created with a file descriptor. >> >> It would still be possible to use: >> >> from _multiprocessing import address_of_buffer >> >> to find the memory address of the mmap buffer and use than to open new >> buffer views on the same memory segment from subprocesses using >> `numpy.frombuffer((ctypes.c_byte * n_byte).fromaddress(addr))` but in >> case of failure (e.g. the file has been deleted on the HDD) one gets a >> segmentation fault instead of a much more userfriendly catchable file >> not found exception. > > On Unix, if the processes are related in a way that lets this work, > then this would actually be a far better solution... it will always > refer to the same file that was opened in the parent, even if it's has > since been deleted or renamed or replaced by a different file. (And if > they aren't related by fork(), then sending the fd would be better > than sending the filename, for the same reason.) Of course that > doesn't help for Windows; no idea what happens there. > > Numpy in general really does not provide any reliable way of tracking > the relationship between different views of the same buffer. > Introspecting on .base will work in many cases, but it's not > guaranteed to even in earlier versions. Maybe you don't care because > it works well enough but it's an inherently rickety design :-). Trying > to think of the correct solution here, I think it would have to be > something like... have the numpy mmap code keep a global scorecard of > all extant memory mappings -- filename, offset, length, memory > address. And then when you want to do an "mmap aware pickle", you > check the address of the array you're trying to save to see if it > falls into an mmap'ed region. That'd be simpler and more reliable than > anything involving base tracking. Well, base tracking seems to work really well on 1.6.2. Here is the code that does the introspection / reconstruction of shared memory views from sub-process using the python multiprocessing Pool API: https://github.com/joblib/joblib/pull/44/files#L5R55 The only clean solution for the collapsed base of numpy 1.7 I see would be to replace the direct mmap.mmap buffer instance from the numpy.memmap class to use a custom wrapper of mmap.mmap that would still implement the buffer python API but would also store the filename and offset as additional attributes. To me that sounds like a much cleaner than a "global scorecard of all extant memory mappings". 
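A minimal sketch of the kind of wrapper being proposed here (class, attribute
and helper names are made up for illustration; this is not how numpy.memmap is
actually implemented):

    import mmap

    class NamedMmap(mmap.mmap):
        # Subclassing only so instances can carry extra attributes;
        # user code checking isinstance(buf, mmap.mmap) keeps working.
        pass

    def open_named_mmap(filename, length, offset=0, access=mmap.ACCESS_WRITE):
        # Note: offset must be a multiple of mmap.ALLOCATIONGRANULARITY.
        f = open(filename, 'r+b')
        m = NamedMmap(f.fileno(), length, access=access, offset=offset)
        m.filename = filename
        m.offset = offset
        return m
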
-- Olivier http://twitter.com/ogrisel - http://github.com/ogrisel From olivier.grisel at ensta.org Sun Sep 23 14:55:19 2012 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Sun, 23 Sep 2012 20:55:19 +0200 Subject: [Numpy-discussion] Views of memmaps and offset In-Reply-To: References: <20120922135408.GH1292@phare.normalesup.org> Message-ID: 2012/9/23 Olivier Grisel : > > The only clean solution for the collapsed base of numpy 1.7 I see > would be to replace the direct mmap.mmap buffer instance from the > numpy.memmap class to use a custom wrapper of mmap.mmap that would > still implement the buffer python API but would also store the > filename and offset as additional attributes. To me that sounds like a > much cleaner than a "global scorecard of all extant memory mappings". Rather than a wrapper for mmap.mmap we could just subclass it actually. This is even cleaner: very few code change and would not break user code testing for `isintance(a.base, mmap.mmap)` or similar. -- Olivier http://twitter.com/ogrisel - http://github.com/ogrisel From njs at pobox.com Sun Sep 23 15:24:38 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 23 Sep 2012 20:24:38 +0100 Subject: [Numpy-discussion] Views of memmaps and offset In-Reply-To: References: <20120922135408.GH1292@phare.normalesup.org> Message-ID: On Sun, Sep 23, 2012 at 7:55 PM, Olivier Grisel wrote: > 2012/9/23 Olivier Grisel : >> >> The only clean solution for the collapsed base of numpy 1.7 I see >> would be to replace the direct mmap.mmap buffer instance from the >> numpy.memmap class to use a custom wrapper of mmap.mmap that would >> still implement the buffer python API but would also store the >> filename and offset as additional attributes. To me that sounds like a >> much cleaner than a "global scorecard of all extant memory mappings". > > Rather than a wrapper for mmap.mmap we could just subclass it actually. > This is even cleaner: very few code change and would not break user > code testing for `isintance(a.base, mmap.mmap)` or similar. You'd need a subclass in either case, but the advantage of the "global scorecard" (which would just be a sorted python list) is that in your approach, you depend on all code everywhere passing around .base values in the way you expect, but in my version the memmap pickle code would only need to rely on a tiny bit of code that's maintained alongside it in the same file. That's what I consider cleaner. -n From njs at pobox.com Sun Sep 23 16:20:47 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 23 Sep 2012 21:20:47 +0100 Subject: [Numpy-discussion] specifying numpy as dependency in your project, install_requires In-Reply-To: References: <9644D8AB-2BA7-4537-989B-F1B80DF73155@continuum.io> Message-ID: On Sat, Sep 22, 2012 at 1:18 PM, Ralf Gommers wrote: > On Fri, Sep 21, 2012 at 11:39 PM, Nathaniel Smith wrote: >> So the question is, how do we get a .egg-info? For the specific case >> Ralf ran into, I'm pretty sure the solution is just that if you're >> clever enough to do an in-place build and add it to your PYTHONPATH, >> you should be clever enough to also run 'python setupegg.py egg_info' >> which will create a .egg-info to go with your in-place build and >> everything will be fine. > > That command first starts rebuilding numpy. No, it just seems to run the config and source-generation bits, not build anything. It also leaves the .egg-info in the source directory, which is what you want. > The correct one seems to be > 'python setupegg.py install_egg_info'. 
This does install the egg_info file > in site-packages, but it's still not working: > > $ python -c "import numpy as np; print(np.__version__)" > 1.8.0.dev-d8988ab > $ ls > /Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/ > ... > numpy-1.8.0.dev_d8988ab-py2.6.egg-info > ... > $ pip install -U --no-deps pandas > Exception: > Traceback (most recent call last): > ... > VersionConflict: (numpy 1.5.1 > (/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages), > Requirement.parse('numpy>=1.6')). The problem here is that you have numpy 1.5.1 installed in a directory that appears on your PYTHONPATH *before* the directory that you installed the .egg-info into. The .egg-info is supposed to go in the same directory as the package; that way 'import numpy' and pip will always find corresponding versions of numpy, no matter how you change your PYTHONPATH. It does look like you can also use: python setup.py install_egg_info -d . This just uses standard distutils and skips running the config/source-generation step. (I guess this is because the vanilla distutils egg-info metadata is less thorough, and doesn't include a list of installed files.) > As long as you try to do anything with PYTHONPATH, I think > pip/easy_install/setuptools are broken in a quite fundamental way. I still see no evidence of this. Just put the .egg-info next to the package and move on... >> P.S.: yeah the thing where pip decides to upgrade the world is REALLY >> OBNOXIOUS. It also appears to be on the list to be fixed in the next >> release or the next release+1, so I guess there's hope?: >> https://github.com/pypa/pip/pull/571 > > Good to know. Let's hope that does make it in. Given it's development model, > I'm less optimistic that easy_install will receive the same fix though .... Yeah, easy_install is abandoned and bit-rotting, which is why people usually recommend pip :-). But in this case, I thought that easy_install already doesn't upgrade the world when it runs? Is there something to fix here? > Until both pip and easy_install are fixed, this alone should be enough for > the advice to be "don't use install_requires". It's not like my alternative > suggestion takes away any information or valuable functionality. pandas, for example, requires several other packages, and I found it quite convenient the other day when I wanted to try out a new version and pip automatically took care of setting all that up for me. It even correctly upgraded numpy, since the virtualenv I was using for testing had inherited my system-installed 1.5.2, but this was the first version of pandas that needed 1.6. Python packaging tools make me feel grumpy and traumatized too but I don't see how the solution is to just give up on computer-readable dependency-tracking altogether. -n From sebastian at sipsolutions.net Sun Sep 23 17:45:30 2012 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 23 Sep 2012 23:45:30 +0200 Subject: [Numpy-discussion] np.array execution path In-Reply-To: References: <1348336881.20038.28.camel@sebastian-laptop> <1348349053.20038.43.camel@sebastian-laptop> Message-ID: <1348436730.20038.64.camel@sebastian-laptop> On Sun, 2012-09-23 at 18:54 +0100, Nathaniel Smith wrote: > On Sat, Sep 22, 2012 at 10:24 PM, Sebastian Berg > wrote: > > In case you are interested, the second (real odditiy), is caused by > > ISFORTRAN and IS_F_CONTIGUOUS mixup, I have found three occurances where > > I think ISFORTRAN should be replaced by the latter. 
Check also: > > > > https://github.com/seberg/numpy/commit/4d2713ce8f2107d225fe291f5da6c6a75436647e > > So I guess we have this ISFORTRAN function (also exposed to > Python[1]). It's documented as checking the rather odd condition of an > array being in fortran-order AND having ndim > 1. Sebastian, as part > of polishing up some of our contiguity-handling code, is suggesting > changing this so that ISFORTRAN is true for an array that is (fortran > order && !C order). Off the top of my head I can't think of any > situation where *either* of these predicates is actually useful. (I > can see why you want to check if an array is in fortran order, but not > why it'd be natural to check whether it's in fortran order and also > these other conditions together in one function call.) The problem is, > this makes it hard to know whether Sebastian's change is a good idea. > > Can anyone think of legitimate uses for ISFORTRAN? Or should it just > be deprecated altogether? Maybe I am missing things, but I think ISFORTRAN is used to decide the order in which a new array is requested when "Anyorder" is used. In some use cases it does not matter, but for example in these cases (where the new array has a different shape then the original) it would change if you just changed ISFORTRAN: a = np.ones(4) # C/F-Contig a.reshape(2,2, order='A') np.add.outer(a, a, order='A') These would return Fortran order instead of C if ISFORTRAN did not check dimension > 1 (or equivalently !c-contig). > > -n > > [1] http://docs.scipy.org/doc/numpy/reference/generated/numpy.isfortran.html > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From spitskip at gmail.com Mon Sep 24 10:25:59 2012 From: spitskip at gmail.com (Wim Bakker) Date: Mon, 24 Sep 2012 16:25:59 +0200 Subject: [Numpy-discussion] ZeroRank memmap behavior? Message-ID: Thanks Sebastian. Casting it to an array would certainly help. Another oddity of zero-ranked scalars is that they look iterable, but in fact are not. Because all they do is generate an error. >>> a = np.array(22) Test if iterable: >>> hasattr(a, __iter__) True Or: >>> import collections >>> isinstance(a, collections.Iterable) True >>> for e in a: print e TypeError: iteration over a 0-d array Wouldn't it be better to have a different type altogether for numpy scalars? Because scalars don't seem to quack quite like arrays... (I haven't been following the discussion around scalars, so if this is a silly remark just ignore it) Another question. What would be the preferred method of testing for zero-ranked array? Something like this? >>> if a.shape: print "array" else: print "scalar" Regards, Wim -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Mon Sep 24 13:04:34 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 24 Sep 2012 19:04:34 +0200 Subject: [Numpy-discussion] Memory leak with numpy master Message-ID: <20120924170434.GB8368@phare.normalesup.org> Hi list, I think that I am hit a memory leak with numpy master. 
The following code enables to reproduce it: ________________________________________________________________________________ import numpy as np n = 100 m = np.eye(n) for i in range(30000): #np.linalg.slogdet(m) t, result_t = np.linalg.linalg._commonType(m) a = np.linalg.linalg._fastCopyAndTranspose(t, m) pivots = np.zeros((n,), np.linalg.linalg.fortran_int) results = np.linalg.lapack_lite.dgetrf(n, n, a, n, pivots, 0) d = np.diagonal(a) if not i % 1000: print i ________________________________________________________________________________ If you execute this code, you'll see the memory go steadily up. The reason that I came up with such a strange looking code is that in my codebase, I do repeated calls to np.linalg.slogdet. I came up with the code above by simplifying what is done in slogdet. I don't think that I can simplify any further and still reproduce the memory leak. Should I submit a bug report (in other words, can people reproduce?)? Cheers, Ga?l From chris.barker at noaa.gov Mon Sep 24 13:09:45 2012 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 24 Sep 2012 10:09:45 -0700 Subject: [Numpy-discussion] np.array execution path In-Reply-To: References: <1348336881.20038.28.camel@sebastian-laptop> Message-ID: On Sat, Sep 22, 2012 at 1:00 PM, Sebastian Haase wrote: > Oh, > is this actually documented - I knew that np.array would (by default) > only create copies as need ... but I never knew it would - if all fits > - even just return the original Python-object... was that a typo? is is "asarray" that returns the orignal object if it can. That's kin dof the point. Perhaps the OP was confusing asarray() with .view(). IIUC, .view() will always create a new ndarray object, but will use the same internal data pointer. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From seb.haase at gmail.com Mon Sep 24 13:45:53 2012 From: seb.haase at gmail.com (Sebastian Haase) Date: Mon, 24 Sep 2012 19:45:53 +0200 Subject: [Numpy-discussion] np.array execution path In-Reply-To: References: <1348336881.20038.28.camel@sebastian-laptop> Message-ID: On Mon, Sep 24, 2012 at 7:09 PM, Chris Barker wrote: > On Sat, Sep 22, 2012 at 1:00 PM, Sebastian Haase wrote: >> Oh, >> is this actually documented - I knew that np.array would (by default) >> only create copies as need ... but I never knew it would - if all fits >> - even just return the original Python-object... > > was that a typo? is is "asarray" that returns the orignal object if it > can. That's kin dof the point. well, I have misread the original post ..... so never mind my question ... - Sebastian Haase > > Perhaps the OP was confusing asarray() with .view(). IIUC, .view() > will always create a new ndarray object, but will use the same > internal data pointer. > > -Chris > > > -- > > Christopher Barker, Ph.D. 
> Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From nouiz at nouiz.org Mon Sep 24 14:17:16 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Mon, 24 Sep 2012 14:17:16 -0400 Subject: [Numpy-discussion] Memory leak with numpy master In-Reply-To: <20120924170434.GB8368@phare.normalesup.org> References: <20120924170434.GB8368@phare.normalesup.org> Message-ID: Hi, with numpy '1.6.1', I have no problem. With numpy 1.7.0b2, I can reproduce the problem. HTH Fred On Mon, Sep 24, 2012 at 1:04 PM, Gael Varoquaux wrote: > Hi list, > > I think that I am hit a memory leak with numpy master. The following code > enables to reproduce it: > > ________________________________________________________________________________ > import numpy as np > n = 100 > m = np.eye(n) > for i in range(30000): > #np.linalg.slogdet(m) > > t, result_t = np.linalg.linalg._commonType(m) > a = np.linalg.linalg._fastCopyAndTranspose(t, m) > > pivots = np.zeros((n,), np.linalg.linalg.fortran_int) > results = np.linalg.lapack_lite.dgetrf(n, n, a, n, pivots, 0) > d = np.diagonal(a) > > if not i % 1000: > print i > ________________________________________________________________________________ > > If you execute this code, you'll see the memory go steadily up. > > The reason that I came up with such a strange looking code is that in my > codebase, I do repeated calls to np.linalg.slogdet. I came up with the > code above by simplifying what is done in slogdet. I don't think that I > can simplify any further and still reproduce the memory leak. > > Should I submit a bug report (in other words, can people reproduce?)? > > Cheers, > > Ga?l > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From gael.varoquaux at normalesup.org Mon Sep 24 14:19:12 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 24 Sep 2012 20:19:12 +0200 Subject: [Numpy-discussion] Memory leak with numpy master In-Reply-To: References: <20120924170434.GB8368@phare.normalesup.org> Message-ID: <20120924181912.GB30520@phare.normalesup.org> Hi Fred, On Mon, Sep 24, 2012 at 02:17:16PM -0400, Fr?d?ric Bastien wrote: > with numpy '1.6.1', I have no problem. > With numpy 1.7.0b2, I can reproduce the problem. OK, thanks. I think that I'll start a bisect to figure out when it crept in. Gael From njs at pobox.com Mon Sep 24 14:45:45 2012 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 24 Sep 2012 19:45:45 +0100 Subject: [Numpy-discussion] Memory leak with numpy master In-Reply-To: <20120924181912.GB30520@phare.normalesup.org> References: <20120924170434.GB8368@phare.normalesup.org> <20120924181912.GB30520@phare.normalesup.org> Message-ID: On Mon, Sep 24, 2012 at 7:19 PM, Gael Varoquaux wrote: > Hi Fred, > > On Mon, Sep 24, 2012 at 02:17:16PM -0400, Fr?d?ric Bastien wrote: >> with numpy '1.6.1', I have no problem. > >> With numpy 1.7.0b2, I can reproduce the problem. > > OK, thanks. I think that I'll start a bisect to figure out when it crept > in. 
This also seems to reproduce it: while True: a = np.zeros((1000, 1000)) a.diagonal() which means I probably forgot a DECREF while doing the PyArray_Diagonal changes... --n From njs at pobox.com Mon Sep 24 14:59:11 2012 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 24 Sep 2012 19:59:11 +0100 Subject: [Numpy-discussion] Memory leak with numpy master In-Reply-To: References: <20120924170434.GB8368@phare.normalesup.org> <20120924181912.GB30520@phare.normalesup.org> Message-ID: On Mon, Sep 24, 2012 at 7:45 PM, Nathaniel Smith wrote: > On Mon, Sep 24, 2012 at 7:19 PM, Gael Varoquaux > wrote: >> Hi Fred, >> >> On Mon, Sep 24, 2012 at 02:17:16PM -0400, Fr?d?ric Bastien wrote: >>> with numpy '1.6.1', I have no problem. >> >>> With numpy 1.7.0b2, I can reproduce the problem. >> >> OK, thanks. I think that I'll start a bisect to figure out when it crept >> in. > > This also seems to reproduce it: > > while True: > a = np.zeros((1000, 1000)) > a.diagonal() > > which means I probably forgot a DECREF while doing the > PyArray_Diagonal changes... Yep: https://github.com/numpy/numpy/pull/457 -n From gael.varoquaux at normalesup.org Mon Sep 24 15:05:02 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 24 Sep 2012 21:05:02 +0200 Subject: [Numpy-discussion] Memory leak with numpy master In-Reply-To: References: <20120924170434.GB8368@phare.normalesup.org> <20120924181912.GB30520@phare.normalesup.org> Message-ID: <20120924190502.GD30520@phare.normalesup.org> On Mon, Sep 24, 2012 at 07:59:11PM +0100, Nathaniel Smith wrote: > > which means I probably forgot a DECREF while doing the > > PyArray_Diagonal changes... > Yep: https://github.com/numpy/numpy/pull/457 Awesome. I can confirm that this fixes the problem. Script below to check. You are my hero! Gael _______________________________________________________________________________ os.system('python setup.py build_ext -i') def get_mem_usage(): pid = os.getpid() usage = open('/proc/%i/statm' % pid, 'r').read().split(' ')[0] return int(usage) import numpy as np n = 100 m = np.eye(n) for i in range(30000): #np.linalg.slogdet(m) t, result_t = np.linalg.linalg._commonType(m) a = np.linalg.linalg._fastCopyAndTranspose(t, m) pivots = np.zeros((n,), np.linalg.linalg.fortran_int) results = np.linalg.lapack_lite.dgetrf(n, n, a, n, pivots, 0) d = np.diagonal(a) if i == 0: initial_usage = get_mem_usage() if not i % 1000: usage = get_mem_usage() print i, usage if usage > 4*initial_usage: sys.exit(10) sys.exit(0) _______________________________________________________________________________ From pierre.raybaut at gmail.com Mon Sep 24 15:22:38 2012 From: pierre.raybaut at gmail.com (Pierre Raybaut) Date: Mon, 24 Sep 2012 21:22:38 +0200 Subject: [Numpy-discussion] ANN: WinPython v2.7.3.0 Message-ID: Hi all, I'm pleased to introduce my new contribution to the Python community: WinPython. WinPython v2.7.3.0 has been released and is available for 32-bit and 64-bit Windows platforms: http://code.google.com/p/winpython/ WinPython is a free open-source portable distribution of Python for Windows, designed for scientists. 
It is a full-featured (see http://code.google.com/p/winpython/wiki/PackageIndex) Python-based scientific environment: * Designed for scientists (thanks to the integrated libraries NumPy, SciPy, Matplotlib, guiqwt, etc.: * Regular *scientific users*: interactive data processing and visualization using Python with Spyder * *Advanced scientific users and software developers*: Python applications development with Spyder, version control with Mercurial and other development tools (like gettext) * *Portable*: preconfigured, it should run out of the box on any machine under Windows (without any installation requirements) and the folder containing WinPython can be moved to any location (local, network or removable drive) * *Flexible*: one can install (or should I write "use" as it's portable) as many WinPython versions as necessary (like isolated and self-consistent environments), even if those versions are running different versions of Python (2.7, 3.x in the near future) or different architectures (32bit or 64bit) on the same machine * *Customizable*: using the integrated package manager (wppm, as WinPython Package Manager), it's possible to install, uninstall or upgrade Python packages (see http://code.google.com/p/winpython/wiki/WPPM for more details on supported package formats). *WinPython is not an attempt to replace Python(x,y)*, this is just something different (see http://code.google.com/p/winpython/wiki/Roadmap): more flexible, easier to maintain, movable and less invasive for the OS, but certainly less user-friendly, with less packages/contents and without any integration to Windows explorer [*]. [*] Actually there is an optional integration into Windows explorer, providing the same features as the official Python installer regarding file associations and context menu entry (this option may be activated through the WinPython Control Panel). Enjoy! -Pierre From nouiz at nouiz.org Mon Sep 24 16:25:44 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Mon, 24 Sep 2012 16:25:44 -0400 Subject: [Numpy-discussion] ANN: NumPy 1.7.0b2 release In-Reply-To: References: Message-ID: Hi, I tested this new beta on Theano and discovered an interface change that was not there in the beta 1. New behavior: numpy.ndindex().next() (0,) Old behavior: numpy.ndindex().next() () This break some Theano code that look like this: import numpy shape=() out_shape=[12] random_state=numpy.random.RandomState() out = numpy.zeros(out_shape, int) for i in numpy.ndindex(*shape): out[i] = random_state.permutation(5) I suppose this is an regression as the only mention of ndindex in the first email of this change is that it is faster. There is a second "regression" in ndindex This was working in the past, but it raise an ValueError now: numpy.ndindex((2,1,1,1)) But If I call numpy.ndindex(2,1,1,1) The documentation[1] do not talk about receiving a tuple as input. I already make a commit to change Theano code to make it work. But this could break other people code. It is up to you to decide if you want this, but a warning in the release note would be great to help people know that the old not documented behavior changed. Do you know if the first change is expected? This will probably cause bad results in some people code if you intended this change. 
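In the meantime, one possible user-side workaround (a sketch with a
hypothetical helper name, not the change that was actually made in Theano) is
a small wrapper that accepts both calling styles and restores the documented
empty-tuple behaviour for a 0-d shape:

import numpy

def compat_ndindex(*shape):
    # Accept both compat_ndindex(2, 3) and compat_ndindex((2, 3)).
    if len(shape) == 1 and isinstance(shape[0], tuple):
        shape = shape[0]
    if len(shape) == 0:
        # 0-d case: iterate exactly once with the empty tuple, as
        # numpy.ndindex() did before these changes.
        yield ()
    else:
        for idx in numpy.ndindex(*shape):
            yield idx

out = numpy.zeros(5, int)
for i in compat_ndindex():   # i is (), so the whole array gets filled
    out[i] = numpy.random.permutation(5)

This keeps such loops working whether ndindex yields () or (0,) for an empty
shape, and whether or not it accepts a tuple argument.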
Fred [1] http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndindex.html On Thu, Sep 20, 2012 at 3:51 PM, Ond?ej ?ert?k wrote: > On Thu, Sep 20, 2012 at 12:00 PM, Nathaniel Smith wrote: >> On Thu, Sep 20, 2012 at 3:33 PM, Ond?ej ?ert?k wrote: >>> On Thu, Sep 20, 2012 at 4:50 AM, Richard Hattersley >>> wrote: >>>> Hi, >>>> >>>> [First of all - thanks to everyone involved in the 1.7 release. Especially >>>> Ond?ej - it takes a lot of time & energy to coordinate something like this.] >>>> >>>> Is there an up to date release schedule anywhere? The trac milestone still >>>> references June. >>> >>> Well, originally we were supposed to release about a month ago, but it >>> turned out there are more things to fix. >>> Currently, we just need to fix all the issues here: >>> >>> https://github.com/numpy/numpy/issues/396 >>> >>> it looks like a lot, but many of them are really easy to fix, so my >>> hope is that it will not take long. The hardest one is this: >>> >>> http://projects.scipy.org/numpy/ticket/2108 >>> >>> if anyone wants to help with this one, that'd be very much appreciated. >> >> This particular bug should actually be pretty trivial to fix if anyone >> is looking for something to do (esp. if you have a working win32 build >> environment to test your work): >> http://thread.gmane.org/gmane.comp.python.numeric.general/50950/focus=50980 > > Ah, that looks easy. I'll try to give it a shot. See my repo here how > to get a working win32 environment: > > https://github.com/certik/numpy-vendor > > However, I don't have access to MSVC, but I am sure somebody else can > test it there, once the PR is ready. > > Ondrej > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Mon Sep 24 17:47:37 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 24 Sep 2012 15:47:37 -0600 Subject: [Numpy-discussion] ANN: NumPy 1.7.0b2 release In-Reply-To: References: Message-ID: On Mon, Sep 24, 2012 at 2:25 PM, Fr?d?ric Bastien wrote: > Hi, > > I tested this new beta on Theano and discovered an interface change > that was not there in the beta 1. > > New behavior: > numpy.ndindex().next() > (0,) > > Old behavior: > numpy.ndindex().next() > () > > This break some Theano code that look like this: > > import numpy > shape=() > out_shape=[12] > random_state=numpy.random.RandomState() > > out = numpy.zeros(out_shape, int) > for i in numpy.ndindex(*shape): > out[i] = random_state.permutation(5) > > > I suppose this is an regression as the only mention of ndindex in the > first email of this change is that it is faster. > > I think this problem has been brought up on the list. It is interesting that it turned up after the first beta. Could you do a bisection to discover which commit is responsible? > > > There is a second "regression" in ndindex This was working in the > past, but it raise an ValueError now: > > numpy.ndindex((2,1,1,1)) > > But If I call numpy.ndindex(2,1,1,1) > > The documentation[1] do not talk about receiving a tuple as input. I > already make a commit to change Theano code to make it work. But this > could break other people code. It is up to you to decide if you want > this, but a warning in the release note would be great to help people > know that the old not documented behavior changed. > > Do you know if the first change is expected? This will probably cause > bad results in some people code if you intended this change. 
> > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nouiz at nouiz.org Mon Sep 24 17:55:20 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Mon, 24 Sep 2012 17:55:20 -0400 Subject: [Numpy-discussion] ANN: NumPy 1.7.0b2 release In-Reply-To: References: Message-ID: On Mon, Sep 24, 2012 at 5:47 PM, Charles R Harris wrote: > > > On Mon, Sep 24, 2012 at 2:25 PM, Fr?d?ric Bastien wrote: >> >> Hi, >> >> I tested this new beta on Theano and discovered an interface change >> that was not there in the beta 1. >> >> New behavior: >> numpy.ndindex().next() >> (0,) >> >> Old behavior: >> numpy.ndindex().next() >> () >> >> This break some Theano code that look like this: >> >> import numpy >> shape=() >> out_shape=[12] >> random_state=numpy.random.RandomState() >> >> out = numpy.zeros(out_shape, int) >> for i in numpy.ndindex(*shape): >> out[i] = random_state.permutation(5) >> >> >> I suppose this is an regression as the only mention of ndindex in the >> first email of this change is that it is faster. >> > > I think this problem has been brought up on the list. It is interesting that > it turned up after the first beta. Could you do a bisection to discover > which commit is responsible? I'll check that. Do I need to reinstall numpy from scratch everytimes or is there a better way to do that? Fred From njs at pobox.com Mon Sep 24 18:49:27 2012 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 24 Sep 2012 23:49:27 +0100 Subject: [Numpy-discussion] ANN: NumPy 1.7.0b2 release In-Reply-To: References: Message-ID: On Mon, Sep 24, 2012 at 10:47 PM, Charles R Harris wrote: > > > On Mon, Sep 24, 2012 at 2:25 PM, Fr?d?ric Bastien wrote: >> >> Hi, >> >> I tested this new beta on Theano and discovered an interface change >> that was not there in the beta 1. >> >> New behavior: >> numpy.ndindex().next() >> (0,) >> >> Old behavior: >> numpy.ndindex().next() >> () >> >> This break some Theano code that look like this: >> >> import numpy >> shape=() >> out_shape=[12] >> random_state=numpy.random.RandomState() >> >> out = numpy.zeros(out_shape, int) >> for i in numpy.ndindex(*shape): >> out[i] = random_state.permutation(5) >> >> >> I suppose this is an regression as the only mention of ndindex in the >> first email of this change is that it is faster. >> > > I think this problem has been brought up on the list. It is interesting that > it turned up after the first beta. Could you do a bisection to discover > which commit is responsible? No need, the problem is already known. It was introduced by that ndindex speed up patch, PR #393, which was backported into the first beta as well. There's a follow-up patch in PR #445 that fixes both of these issues, though it also exposes some more fundamental issues with the nditer API, so there's lots of discussion there about if we want some more changes... this is a good summary: https://github.com/numpy/numpy/pull/445#issuecomment-8740982 For 1.7 purposes though the bottom line is that we already have multiple acceptable solutions, so both the issues reported here should definitely be fixed. -n From pav at iki.fi Mon Sep 24 18:52:24 2012 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 25 Sep 2012 01:52:24 +0300 Subject: [Numpy-discussion] ANN: NumPy 1.7.0b2 release In-Reply-To: References: Message-ID: 25.09.2012 00:55, Fr?d?ric Bastien kirjoitti: > On Mon, Sep 24, 2012 at 5:47 PM, Charles R Harris [clip] >> I think this problem has been brought up on the list. 
It is interesting that >> it turned up after the first beta. Could you do a bisection to discover >> which commit is responsible? > > I'll check that. Do I need to reinstall numpy from scratch everytimes > or is there a better way to do that? Reinstallation is needed, but this is reasonably simple to automate, check "git bisect run". For instance like so: https://github.com/pv/scipy-build-makefile/blob/master/bisectrun.py -- Pauli Virtanen From ondrej.certik at gmail.com Mon Sep 24 20:27:52 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Mon, 24 Sep 2012 17:27:52 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.7.0b2 release In-Reply-To: References: Message-ID: On Mon, Sep 24, 2012 at 3:49 PM, Nathaniel Smith wrote: > On Mon, Sep 24, 2012 at 10:47 PM, Charles R Harris > wrote: >> >> >> On Mon, Sep 24, 2012 at 2:25 PM, Fr?d?ric Bastien wrote: >>> >>> Hi, >>> >>> I tested this new beta on Theano and discovered an interface change >>> that was not there in the beta 1. >>> >>> New behavior: >>> numpy.ndindex().next() >>> (0,) >>> >>> Old behavior: >>> numpy.ndindex().next() >>> () >>> >>> This break some Theano code that look like this: >>> >>> import numpy >>> shape=() >>> out_shape=[12] >>> random_state=numpy.random.RandomState() >>> >>> out = numpy.zeros(out_shape, int) >>> for i in numpy.ndindex(*shape): >>> out[i] = random_state.permutation(5) >>> >>> >>> I suppose this is an regression as the only mention of ndindex in the >>> first email of this change is that it is faster. >>> >> >> I think this problem has been brought up on the list. It is interesting that >> it turned up after the first beta. Could you do a bisection to discover >> which commit is responsible? > > No need, the problem is already known. It was introduced by that > ndindex speed up patch, PR #393, which was backported into the first > beta as well. There's a follow-up patch in PR #445 that fixes both of > these issues, though it also exposes some more fundamental issues with > the nditer API, so there's lots of discussion there about if we want > some more changes... this is a good summary: > https://github.com/numpy/numpy/pull/445#issuecomment-8740982 > > For 1.7 purposes though the bottom line is that we already have > multiple acceptable solutions, so both the issues reported here should > definitely be fixed. Should we just remove (revert) this PR #393 patch from the release branch? It shouldn't have been there in the first place, the only reason I included it is because other patches depended on it and I would have to fix collisions, and we thought it would be harmless to just include it. Which turned out to be a mistake, for which I apologize. That way we'll feel confident that the branch works, and we can get the right solution into master and test it there. So I am actually convinced I should simply revert this patch in the release branch. Let me know what you think. Ondrej From scollis.acrf at gmail.com Mon Sep 24 20:37:13 2012 From: scollis.acrf at gmail.com (Scott Collis) Date: Mon, 24 Sep 2012 19:37:13 -0500 Subject: [Numpy-discussion] Job opportunity working with python Message-ID: <366B4D54-B0F1-460E-9C02-78DEA95C8842@gmail.com> Good evening fellow Numpy-ers! I sent an email a while ago about a employment opportunity at Argonne National Laboratory but I think I sent it from the wrong address so it bounced. We are seeking to hire some one to work on a python based toolkit for working with scanning weather radar. 
The main role for this person will be to find novel ways of working with large radar based data sets, probably writing code in C, C++ and Fortran to make advanced modules available in Python. Working with communities such as this will be encouraged in this new position. Argonne is a multidisciplinary lab in the Chicago burbs.. once you get over the cold winters you will enjoy good parks, schools and easy access to Chicago! Take a look at the description here: http://www.arm.gov/news/jobs/post/18955 and use the PD to apply here: http://web.anl.gov/jobsearch/detail.jsp?userreqid=319854+EVS&lsBrowse=ALL Basically you can read the job description as: This person will build and maintain a scikit for working with weather radars? Thanks for your time, Scott --- Dr Scott Collis ARM Precipitation Radar Translator Environmental Sciences Division Argonne National Laboratory Mb: +1 630 235 8025 Of: +1 630 252 0550 http://radar.arm.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From feraudy at phimeca.com Mon Sep 24 11:54:14 2012 From: feraudy at phimeca.com (Raphael de Feraudy) Date: Mon, 24 Sep 2012 15:54:14 +0000 (UTC) Subject: [Numpy-discussion] Long-standing issue with using numpy in embedded CPython References: Message-ID: Yang Zhang gmail.com> writes: > > > I'm curious how to disable threads in numpy (not an ideal solution). > > Googling seems to point me to setting NPY_ALLOW_THREADS to > > 0....somewhere. > > Anyone? > It's appearing to me I had to face this very issue, which I reported @Numpy TRAC : http://projects.scipy.org/numpy/ticket/2213. I just tried your suggestion : set NPY_ALLOW_THREADS to 0 in numpy/core/include/numpy/ndarraytypes.h. It allowed my atomic example to run without stalling, and also fixed the issue in my application. Though i'm not entirely satisfied by this workaround, which might slow down heavy computations. I also find it too intrusive in numpy source code and don't wish to maintain a powerless numpy fork. Has anyone else settled with this fix ? Or may anybody have any other suggestion / comments ? Thanks. Raphael. From will at thearete.co.uk Tue Sep 25 05:03:59 2012 From: will at thearete.co.uk (William Furnass) Date: Tue, 25 Sep 2012 10:03:59 +0100 Subject: [Numpy-discussion] Double-ended queues Message-ID: Hi all, I want to be able to within a loop a) apply a mathematical operation to all elements in a vector (can be done atomically) then b) pop zero or more elements from one end of the vector and c) push zero or more elements on to the other end. So far I've used a collections.deque to store my vector as it should be more efficient than a numpy array for the appending and deletion of elements. However, I was wondering whether performance could be improved through the use of a homogeneously-typed double-ended queue i.e. a linked list equivalent of numpy.ndarray. Has anyone previously considered whether it would be worth including such a thing within the numpy package? Cheers, Will From lists at hilboll.de Tue Sep 25 05:31:59 2012 From: lists at hilboll.de (Andreas Hilboll) Date: Tue, 25 Sep 2012 11:31:59 +0200 Subject: [Numpy-discussion] variable number of columns in loadtxt/genfromtxt Message-ID: <390e840f1a882a108e0bfa3cfe3eccf2.squirrel@srv2.s4y.tournesol-consulting.eu> Hi, I commonly have to deal with legacy ASCII files, which don't have a constant number of columns. The standard is 10 values per row, but sometimes, there are less columns. 
loadtxt doesn't support this, and in genfromtext, the rows which have less than 10 values are excluded from the resulting array. Is there any way around this? Thanks for your insight, Andreas. From njs at pobox.com Tue Sep 25 05:38:40 2012 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 25 Sep 2012 10:38:40 +0100 Subject: [Numpy-discussion] Double-ended queues In-Reply-To: References: Message-ID: On Tue, Sep 25, 2012 at 10:03 AM, William Furnass wrote: > Hi all, > > I want to be able to within a loop a) apply a mathematical operation > to all elements in a vector (can be done atomically) then b) pop zero > or more elements from one end of the vector and c) push zero or more > elements on to the other end. So far I've used a collections.deque to > store my vector as it should be more efficient than a numpy array for > the appending and deletion of elements. However, I was wondering > whether performance could be improved through the use of a > homogeneously-typed double-ended queue i.e. a linked list equivalent > of numpy.ndarray. Implementing a ring buffer on top of ndarray would be pretty straightforward and probably work better than a linked-list implementation. -n From aron at ahmadia.net Tue Sep 25 07:10:10 2012 From: aron at ahmadia.net (Aron Ahmadia) Date: Tue, 25 Sep 2012 14:10:10 +0300 Subject: [Numpy-discussion] Long-standing issue with using numpy in embedded CPython In-Reply-To: References: Message-ID: Can you expand a bit? Are you trying to disable threads at compile-time or at run-time? Which threaded functionality are you trying to disable? Are you using numpy as a computational library with multiple threads making calls into its functions? I think NPY_ALLOW_THREADS is for interacting with the GIL, but I have not played with it much. A On Mon, Sep 24, 2012 at 6:54 PM, Raphael de Feraudy wrote: > Yang Zhang gmail.com> writes: > >> >> > I'm curious how to disable threads in numpy (not an ideal solution). >> > Googling seems to point me to setting NPY_ALLOW_THREADS to >> > 0....somewhere. >> >> Anyone? >> > > It's appearing to me I had to face this very issue, > which I reported @Numpy TRAC : http://projects.scipy.org/numpy/ticket/2213. > > I just tried your suggestion : > set NPY_ALLOW_THREADS to 0 in numpy/core/include/numpy/ndarraytypes.h. > It allowed my atomic example to run without stalling, > and also fixed the issue in my application. > > Though i'm not entirely satisfied by this workaround, > which might slow down heavy computations. > I also find it too intrusive in numpy source code > and don't wish to maintain a powerless numpy fork. > > Has anyone else settled with this fix ? > Or may anybody have any other suggestion / comments ? > > Thanks. > Raphael. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From sturla at molden.no Tue Sep 25 07:31:59 2012 From: sturla at molden.no (Sturla Molden) Date: Tue, 25 Sep 2012 13:31:59 +0200 Subject: [Numpy-discussion] Double-ended queues In-Reply-To: References: Message-ID: <5061962F.5000809@molden.no> On 25.09.2012 11:38, Nathaniel Smith wrote: > Implementing a ring buffer on top of ndarray would be pretty > straightforward and probably work better than a linked-list > implementation. 
Amazingly, many do not know that a ringbuffer is simply an array indexed modulus its length: foo = np.zeros(n) i = 0 while 1: foo[i % n] # access ringbuffer i += 1 Also, instead of writing a linked list, consider collections.deque. A deque is by definition a double-ended queue. It is just waste of time to implement a deque (double-ended queue) and hope it will perform better than Python's standard lib collections.deque object. Sturla From cournape at gmail.com Tue Sep 25 08:06:45 2012 From: cournape at gmail.com (David Cournapeau) Date: Tue, 25 Sep 2012 13:06:45 +0100 Subject: [Numpy-discussion] API, ABI compatibility Message-ID: From njs at pobox.com Tue Sep 25 08:50:34 2012 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 25 Sep 2012 13:50:34 +0100 Subject: [Numpy-discussion] Double-ended queues In-Reply-To: <5061962F.5000809@molden.no> References: <5061962F.5000809@molden.no> Message-ID: On Tue, Sep 25, 2012 at 12:31 PM, Sturla Molden wrote: > On 25.09.2012 11:38, Nathaniel Smith wrote: > >> Implementing a ring buffer on top of ndarray would be pretty >> straightforward and probably work better than a linked-list >> implementation. > > Amazingly, many do not know that a ringbuffer is simply an array indexed > modulus its length: > > foo = np.zeros(n) > i = 0 > while 1: > foo[i % n] # access ringbuffer > i += 1 Good trick, but to be reliable I think you need to either be willing for i to overflow into a long (arbitrary width) integer, or else make sure that i is an unsigned integer and that n is 2**k where k <= sizeof(i)? Just doing i %= n on each pass through the loop might be less error-prone. > Also, instead of writing a linked list, consider collections.deque. > A deque is by definition a double-ended queue. It is just waste of time > to implement a deque (double-ended queue) and hope it will perform > better than Python's standard lib collections.deque object. The original poster is using collections.deque now, but wants a version that supports efficient vectorized operations. -n From njs at pobox.com Tue Sep 25 09:02:29 2012 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 25 Sep 2012 14:02:29 +0100 Subject: [Numpy-discussion] ANN: NumPy 1.7.0b2 release In-Reply-To: References: Message-ID: On Tue, Sep 25, 2012 at 1:27 AM, Ond?ej ?ert?k wrote: > On Mon, Sep 24, 2012 at 3:49 PM, Nathaniel Smith wrote: >> On Mon, Sep 24, 2012 at 10:47 PM, Charles R Harris >> wrote: >>> >>> >>> On Mon, Sep 24, 2012 at 2:25 PM, Fr?d?ric Bastien wrote: >>>> >>>> Hi, >>>> >>>> I tested this new beta on Theano and discovered an interface change >>>> that was not there in the beta 1. >>>> >>>> New behavior: >>>> numpy.ndindex().next() >>>> (0,) >>>> >>>> Old behavior: >>>> numpy.ndindex().next() >>>> () >>>> >>>> This break some Theano code that look like this: >>>> >>>> import numpy >>>> shape=() >>>> out_shape=[12] >>>> random_state=numpy.random.RandomState() >>>> >>>> out = numpy.zeros(out_shape, int) >>>> for i in numpy.ndindex(*shape): >>>> out[i] = random_state.permutation(5) >>>> >>>> >>>> I suppose this is an regression as the only mention of ndindex in the >>>> first email of this change is that it is faster. >>>> >>> >>> I think this problem has been brought up on the list. It is interesting that >>> it turned up after the first beta. Could you do a bisection to discover >>> which commit is responsible? >> >> No need, the problem is already known. It was introduced by that >> ndindex speed up patch, PR #393, which was backported into the first >> beta as well. 
There's a follow-up patch in PR #445 that fixes both of >> these issues, though it also exposes some more fundamental issues with >> the nditer API, so there's lots of discussion there about if we want >> some more changes... this is a good summary: >> https://github.com/numpy/numpy/pull/445#issuecomment-8740982 >> >> For 1.7 purposes though the bottom line is that we already have >> multiple acceptable solutions, so both the issues reported here should >> definitely be fixed. > > Should we just remove (revert) this PR #393 patch from the release branch? > It shouldn't have been there in the first place, the only reason I included it > is because other patches depended on it and I would have to fix collisions, > and we thought it would be harmless to just include it. Which turned out > to be a mistake, for which I apologize. > > That way we'll feel confident that the branch works, and we can get the right > solution into master and test it there. > > So I am actually convinced I should simply revert this patch in the > release branch. > Let me know what you think. Sounds good to me. (I also thought it would be harmless to include it, and also missed that the other patch that depended on it was part of the same change and could be reverted too.) -n From charlesr.harris at gmail.com Tue Sep 25 09:23:54 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 Sep 2012 07:23:54 -0600 Subject: [Numpy-discussion] Double-ended queues In-Reply-To: References: <5061962F.5000809@molden.no> Message-ID: On Tue, Sep 25, 2012 at 6:50 AM, Nathaniel Smith wrote: > On Tue, Sep 25, 2012 at 12:31 PM, Sturla Molden wrote: > > On 25.09.2012 11:38, Nathaniel Smith wrote: > > > >> Implementing a ring buffer on top of ndarray would be pretty > >> straightforward and probably work better than a linked-list > >> implementation. > > > > Amazingly, many do not know that a ringbuffer is simply an array indexed > > modulus its length: > > > > foo = np.zeros(n) > > i = 0 > > while 1: > > foo[i % n] # access ringbuffer > > i += 1 > > Good trick, but to be reliable I think you need to either be willing > for i to overflow into a long (arbitrary width) integer, or else make > sure that i is an unsigned integer and that n is 2**k where k <= > sizeof(i)? Just doing i %= n on each pass through the loop might be > less error-prone. > > > Also, instead of writing a linked list, consider collections.deque. > > A deque is by definition a double-ended queue. It is just waste of time > > to implement a deque (double-ended queue) and hope it will perform > > better than Python's standard lib collections.deque object. > > The original poster is using collections.deque now, but wants a > version that supports efficient vectorized operations. > > The C++ stdlib has an efficient deque object, but it moves through memory. Hmm, it wouldn't be easy to make that work with numpy arrays what with views and all. Efficient circular lists are often implemented using powers of two so that modulo indexing can be done using a mask. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nouiz at nouiz.org Tue Sep 25 10:47:27 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Tue, 25 Sep 2012 10:47:27 -0400 Subject: [Numpy-discussion] ANN: NumPy 1.7.0b2 release In-Reply-To: References: Message-ID: Hi, thanks for that script. It seam very useful for that case. As other people know about this problem, I won't need to bisect. 
thanks Fred On Mon, Sep 24, 2012 at 6:52 PM, Pauli Virtanen wrote: > 25.09.2012 00:55, Fr?d?ric Bastien kirjoitti: >> On Mon, Sep 24, 2012 at 5:47 PM, Charles R Harris > [clip] >>> I think this problem has been brought up on the list. It is interesting that >>> it turned up after the first beta. Could you do a bisection to discover >>> which commit is responsible? >> >> I'll check that. Do I need to reinstall numpy from scratch everytimes >> or is there a better way to do that? > > Reinstallation is needed, but this is reasonably simple to automate, > check "git bisect run". For instance like so: > > https://github.com/pv/scipy-build-makefile/blob/master/bisectrun.py > > -- > Pauli Virtanen > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Tue Sep 25 10:56:12 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 Sep 2012 08:56:12 -0600 Subject: [Numpy-discussion] ANN: NumPy 1.7.0b2 release In-Reply-To: References: Message-ID: On Thu, Sep 20, 2012 at 12:24 AM, Ond?ej ?ert?k wrote: > Hi, > > I'm pleased to announce the availability of the second beta release of > NumPy 1.7.0b2. > > Sources and binary installers can be found at > https://sourceforge.net/projects/numpy/files/NumPy/1.7.0b2/ > > Please test this release and report any issues on the numpy-discussion > mailing list. Since beta1, we've fixed most of the known (back then) > issues, except: > > http://projects.scipy.org/numpy/ticket/2076 > http://projects.scipy.org/numpy/ticket/2101 > http://projects.scipy.org/numpy/ticket/2108 > http://projects.scipy.org/numpy/ticket/2150 > > And many other issues that were reported since the beta1 release. The > log of changes is attached. The full list of issues that we still need > to work on is at: > > https://github.com/numpy/numpy/issues/396 > > Any help is welcome, the best is to send a PR fixing any of the issues > -- against master, and I'll then back-port it to the release branch > (unless it is something release specific, in which case just send the > PR against the release branch). > > Cheers, > Ondrej > > > * f217517 Release 1.7.0b2 > * 50f71cb MAINT: silence Cython warnings about changes dtype/ufunc size. > * fcacdcc FIX: use py24-compatible version of virtualenv on Travis > * d01354e FIX: loosen numerical tolerance in test_pareto() > * 65ec87e TST: Add test for boolean insert > * 9ee9984 TST: Add extra test for multidimensional inserts. > * 8460514 BUG: Fix for issues #378 and #392 This should fix the > problems with numpy.insert(), where the input values were not checked > for all scalar types and where values did not get inserted properly, > but got duplicated by default. > * 07e02d0 BUG: fix npymath install location. > * 6da087e BUG: fix custom post_check. > * 095a3ab BUG: forgot to build _dotblas in bento build. > * cb0de72 REF: remove unused imports in bscript. > * 6e3e289 FIX: Regenerate mtrand.c with Cython 0.17 > * 3dc3b1b Retain backward compatibility. Enforce C order. > * 5a471b5 Improve ndindex execution speed. 
> * 2f28db6 FIX: Add a test for Ticket #2066 > * ca29849 BUG: Add a test for Ticket #2189 > * 1ee4a00 BUG: Add a test for Ticket #1588 > * 7b5dba0 BUG: Fix ticket #1588/gh issue #398, refcount error in clip > * f65ff87 FIX: simplify the import statement > * 124a608 Fix returned copy > * 996a9fb FIX: bug in np.where and recarray swapping > * 7583adc MAINT: silence DeprecationWarning in np.safe_eval(). > * 416af9a pavement.py: rename "yop" to "atlas" > * 3930881 BUG: fix bento build. > * fbad4a7 Remove test_recarray_from_long_formats > * 5cb80f8 Add test for long number in shape specifier of dtype string > * 24da7f6 Add test for long numbers in numpy.rec.array formats string > * 77da3f8 Allow long numbers in numpy.rec.array formats string > * 99c9397 Use PyUnicode_DecodeUTF32() > * 31660d0 Follow the C guidelines > * d5d6894 Fix memory leak in concatenate. > * 8141e1e FIX: Make sure the tests produce valid unicode > * d67785b FIX: Fixes the PyUnicodeObject problem in py-3.3 > * a022015 Re-enable unpickling optimization for large py3k bytes objects. > * 470486b Copy bytes object when unpickling an array > * d72280f Fix tests for empty shape, strides and suboffsets on Python 3.3 > * a1561c2 [FIX] Add missing header so separate compilation works again > * ea23de8 TST: set raise-on-warning behavior of NoseTester to release mode. > * 28ffac7 REL: set version number to 1.7.0rc1-dev. > Ticket #2218 needs to be fixed. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Sep 25 11:11:22 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 Sep 2012 09:11:22 -0600 Subject: [Numpy-discussion] ANN: NumPy 1.7.0b2 release In-Reply-To: References: Message-ID: On Tue, Sep 25, 2012 at 8:56 AM, Charles R Harris wrote: > > > On Thu, Sep 20, 2012 at 12:24 AM, Ond?ej ?ert?k wrote: > >> Hi, >> >> I'm pleased to announce the availability of the second beta release of >> NumPy 1.7.0b2. >> >> Sources and binary installers can be found at >> https://sourceforge.net/projects/numpy/files/NumPy/1.7.0b2/ >> >> Please test this release and report any issues on the numpy-discussion >> mailing list. Since beta1, we've fixed most of the known (back then) >> issues, except: >> >> http://projects.scipy.org/numpy/ticket/2076 >> http://projects.scipy.org/numpy/ticket/2101 >> http://projects.scipy.org/numpy/ticket/2108 >> http://projects.scipy.org/numpy/ticket/2150 >> >> And many other issues that were reported since the beta1 release. The >> log of changes is attached. The full list of issues that we still need >> to work on is at: >> >> https://github.com/numpy/numpy/issues/396 >> >> Any help is welcome, the best is to send a PR fixing any of the issues >> -- against master, and I'll then back-port it to the release branch >> (unless it is something release specific, in which case just send the >> PR against the release branch). >> >> Cheers, >> Ondrej >> >> >> * f217517 Release 1.7.0b2 >> * 50f71cb MAINT: silence Cython warnings about changes dtype/ufunc size. >> * fcacdcc FIX: use py24-compatible version of virtualenv on Travis >> * d01354e FIX: loosen numerical tolerance in test_pareto() >> * 65ec87e TST: Add test for boolean insert >> * 9ee9984 TST: Add extra test for multidimensional inserts. >> * 8460514 BUG: Fix for issues #378 and #392 This should fix the >> problems with numpy.insert(), where the input values were not checked >> for all scalar types and where values did not get inserted properly, >> but got duplicated by default. 
>> * 07e02d0 BUG: fix npymath install location. >> * 6da087e BUG: fix custom post_check. >> * 095a3ab BUG: forgot to build _dotblas in bento build. >> * cb0de72 REF: remove unused imports in bscript. >> * 6e3e289 FIX: Regenerate mtrand.c with Cython 0.17 >> * 3dc3b1b Retain backward compatibility. Enforce C order. >> * 5a471b5 Improve ndindex execution speed. >> * 2f28db6 FIX: Add a test for Ticket #2066 >> * ca29849 BUG: Add a test for Ticket #2189 >> * 1ee4a00 BUG: Add a test for Ticket #1588 >> * 7b5dba0 BUG: Fix ticket #1588/gh issue #398, refcount error in clip >> * f65ff87 FIX: simplify the import statement >> * 124a608 Fix returned copy >> * 996a9fb FIX: bug in np.where and recarray swapping >> * 7583adc MAINT: silence DeprecationWarning in np.safe_eval(). >> * 416af9a pavement.py: rename "yop" to "atlas" >> * 3930881 BUG: fix bento build. >> * fbad4a7 Remove test_recarray_from_long_formats >> * 5cb80f8 Add test for long number in shape specifier of dtype string >> * 24da7f6 Add test for long numbers in numpy.rec.array formats string >> * 77da3f8 Allow long numbers in numpy.rec.array formats string >> * 99c9397 Use PyUnicode_DecodeUTF32() >> * 31660d0 Follow the C guidelines >> * d5d6894 Fix memory leak in concatenate. >> * 8141e1e FIX: Make sure the tests produce valid unicode >> * d67785b FIX: Fixes the PyUnicodeObject problem in py-3.3 >> * a022015 Re-enable unpickling optimization for large py3k bytes objects. >> * 470486b Copy bytes object when unpickling an array >> * d72280f Fix tests for empty shape, strides and suboffsets on Python 3.3 >> * a1561c2 [FIX] Add missing header so separate compilation works again >> * ea23de8 TST: set raise-on-warning behavior of NoseTester to release >> mode. >> * 28ffac7 REL: set version number to 1.7.0rc1-dev. >> > > Ticket #2218 needs to be fixed. > The all method fails also. In [1]: a = zeros(5, complex) In [2]: a.imag = 1 In [3]: a.all() Out[3]: False Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Sep 25 11:56:30 2012 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 25 Sep 2012 08:56:30 -0700 Subject: [Numpy-discussion] variable number of columns in loadtxt/genfromtxt In-Reply-To: <390e840f1a882a108e0bfa3cfe3eccf2.squirrel@srv2.s4y.tournesol-consulting.eu> References: <390e840f1a882a108e0bfa3cfe3eccf2.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: On Tue, Sep 25, 2012 at 2:31 AM, Andreas Hilboll wrote: > I commonly have to deal with legacy ASCII files, which don't have a > constant number of columns. The standard is 10 values per row, but > sometimes, there are less columns. loadtxt doesn't support this, and in > genfromtext, the rows which have less than 10 values are excluded from the > resulting array. > > Is there any way around this? the trick is: what does it mean when there are fewer values in a row? There is no way to universally define that. Anyway, I'd just punt on using a standard ascii file reader, in the time it took to write this question, you'd be halfway to writing a custom file parser -- it's really easy in Python, at least if you don't need absolutely top performance (which loadtext and genfromtext doen't give you anyway) -Chris > Thanks for your insight, > Andreas. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From chris.barker at noaa.gov Tue Sep 25 12:03:18 2012 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 25 Sep 2012 09:03:18 -0700 Subject: [Numpy-discussion] Double-ended queues In-Reply-To: <5061962F.5000809@molden.no> References: <5061962F.5000809@molden.no> Message-ID: On Tue, Sep 25, 2012 at 4:31 AM, Sturla Molden wrote: > Also, instead of writing a linked list, consider collections.deque. > A deque is by definition a double-ended queue. It is just waste of time > to implement a deque (double-ended queue) and hope it will perform > better than Python's standard lib collections.deque object. not for insertion, deletion, etc, but there _may_ be a benefit to a class that stores the data in a homogenous data data buffer compatible with numpy: - you could use non-standard data types (uint, etc...) - It would be more memory efficient *not having to store all those python objects for each value) - you could round-trip to/from numpy arrays without data copying (or with efficient data copying...) for other operations. Whether it's worth the work would depend on teh use case, of course. Writing such a thing in Cython would be pretty easy though, particularly if you only needed to support a couple types. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From lists at hilboll.de Tue Sep 25 12:35:36 2012 From: lists at hilboll.de (Andreas Hilboll) Date: Tue, 25 Sep 2012 18:35:36 +0200 Subject: [Numpy-discussion] variable number of columns in loadtxt/genfromtxt In-Reply-To: References: <390e840f1a882a108e0bfa3cfe3eccf2.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: > On Tue, Sep 25, 2012 at 2:31 AM, Andreas Hilboll wrote: >> I commonly have to deal with legacy ASCII files, which don't have a >> constant number of columns. The standard is 10 values per row, but >> sometimes, there are less columns. loadtxt doesn't support this, and in >> genfromtext, the rows which have less than 10 values are excluded from >> the >> resulting array. >> >> Is there any way around this? > > the trick is: what does it mean when there are fewer values in a row? > There is no way to universally define that. > > Anyway, I'd just punt on using a standard ascii file reader, in the > time it took to write this question, you'd be halfway to writing a > custom file parser -- it's really easy in Python, at least if you > don't need absolutely top performance (which loadtext and genfromtext > doen't give you anyway) Actually, that's just what I did before writing this question ;) I was just wondering if there were some solution available which I didn't know about. Cheers, Andreas. From ralf.gommers at gmail.com Tue Sep 25 14:13:41 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 25 Sep 2012 20:13:41 +0200 Subject: [Numpy-discussion] API, ABI compatibility In-Reply-To: References: Message-ID: is a good thing:) On Tue, Sep 25, 2012 at 2:06 PM, David Cournapeau wrote: > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cournape at gmail.com Tue Sep 25 15:26:53 2012 From: cournape at gmail.com (David Cournapeau) Date: Tue, 25 Sep 2012 20:26:53 +0100 Subject: [Numpy-discussion] API, ABI compatibility In-Reply-To: References: Message-ID: Ok, so since many people asked: this was sent by mistake, and intended to be a discarded draft instead. David On Tue, Sep 25, 2012 at 7:13 PM, Ralf Gommers wrote: > is a good thing:) > > > > > On Tue, Sep 25, 2012 at 2:06 PM, David Cournapeau > wrote: >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pmhobson at gmail.com Tue Sep 25 20:02:50 2012 From: pmhobson at gmail.com (Paul Hobson) Date: Tue, 25 Sep 2012 17:02:50 -0700 Subject: [Numpy-discussion] variable number of columns in loadtxt/genfromtxt In-Reply-To: References: <390e840f1a882a108e0bfa3cfe3eccf2.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: On Tue, Sep 25, 2012 at 9:35 AM, Andreas Hilboll wrote: >> On Tue, Sep 25, 2012 at 2:31 AM, Andreas Hilboll wrote: >>> I commonly have to deal with legacy ASCII files, which don't have a >>> constant number of columns. The standard is 10 values per row, but >>> sometimes, there are less columns. loadtxt doesn't support this, and in >>> genfromtext, the rows which have less than 10 values are excluded from >>> the >>> resulting array. >>> >>> Is there any way around this? >> >> the trick is: what does it mean when there are fewer values in a row? >> There is no way to universally define that. >> >> Anyway, I'd just punt on using a standard ascii file reader, in the >> time it took to write this question, you'd be halfway to writing a >> custom file parser -- it's really easy in Python, at least if you >> don't need absolutely top performance (which loadtext and genfromtext >> doen't give you anyway) > > Actually, that's just what I did before writing this question ;) I was > just wondering if there were some solution available which I didn't know > about. This may or may not be relevant, but pandas does a pretty good job of handling this sort of thing... http://nbviewer.maxdrawdown.com/3785198 Notebook Viewer hasn't quite caught up with the dev version of ipython. I've attached a screen shot too. -paul -------------- next part -------------- A non-text attachment was scrubbed... Name: variable_cols.png Type: image/png Size: 44286 bytes Desc: not available URL: From cournape at gmail.com Thu Sep 27 08:46:30 2012 From: cournape at gmail.com (David Cournapeau) Date: Thu, 27 Sep 2012 13:46:30 +0100 Subject: [Numpy-discussion] Issue tracking In-Reply-To: References: <3C64F2BE-50E5-403C-9022-71233A6E3449@continuum.io> Message-ID: On Sun, Jun 24, 2012 at 12:31 PM, Ralf Gommers wrote: > > > On Sat, Jun 23, 2012 at 11:30 AM, Thouis (Ray) Jones > wrote: >> >> On Sat, Jun 23, 2012 at 11:13 AM, Ralf Gommers >> wrote: >> > >> > >> > On Sat, Jun 23, 2012 at 11:03 AM, Thouis (Ray) Jones >> > wrote: >> >> >> >> On Fri, Jun 22, 2012 at 7:29 PM, Ralf Gommers >> >> wrote: >> >> > >> >> > >> >> > On Fri, Jun 22, 2012 at 9:49 AM, Thouis (Ray) Jones >> >> > >> >> > wrote: >> >> >> >> >> >> On Mon, Jun 4, 2012 at 7:43 PM, Travis Oliphant >> >> >> >> >> >> wrote: >> >> >> > I have turned on issue tracking and started a few labels. 
Feel >> >> >> > free >> >> >> > to >> >> >> > add >> >> >> > more / adjust the names as appropriate. I am trying to find >> >> >> > someone >> >> >> > who >> >> >> > can help manage the migration from Trac. >> >> >> >> >> >> Are the github issues set up sufficiently for Trac to be disabled >> >> >> and >> >> >> github to take over? >> >> > >> >> > >> >> > You lost me here. You were going to set up a test site where we could >> >> > see >> >> > the Trac --> Github conversion could be tested, before actually >> >> > pushing >> >> > that >> >> > conversion to the numpy Github repo. If you sent a message that that >> >> > was >> >> > ready, I must have missed it. >> >> > >> >> > The current state of labels on https://github.com/numpy/numpy/issues >> >> > is >> >> > also >> >> > far from complete (no prios, components). >> >> >> >> I wasn't completely clear. What I meant to ask: >> >> >> >> "Are the github issues (and labels) set up well enough for Trac to be >> >> disabled for accepting new bugs and to point users filing new bugs to >> >> github instead?" >> >> >> >> (The answer to which is "no", based on your reply). >> > >> > >> > I don't think it's a problem that a few issues have already been filed >> > on >> > Github, but we'll have to properly label them by hand later. >> > >> > Making Github the default or only option now would be a bit strange. It >> > would be better to first do the conversion, or at least have it far >> > enough >> > along that we have agreed on workflow and labels to use. >> >> My concern is that transitioning first would define the >> workflow/labels based on what's in Trac, rather than on what would >> work best with github. > > > Trac is not unique, most bug trackers have similar concepts (milestones, > components, prios, issue types). > >> >> But maybe the best way to move things forward >> is to do the transition to a test project, and see what comes out. Ok, so scipy.org was down again because of trac. Unfortunately, the machine on which scipy.org lives is the same as trac, and is a bit messy. I would really like to accelerate whatever needs to be done to get that done, both to get out of trac's misery, and to make scipy.org more responsive. I can't promise a lot of spare cycles, but I will make sure there are no roadblocks on Enthought side when we need to make the actual migration. Thouis, what needs to be done to make a testbed of the conversion ? David From thouis at gmail.com Thu Sep 27 10:22:29 2012 From: thouis at gmail.com (Thouis (Ray) Jones) Date: Thu, 27 Sep 2012 10:22:29 -0400 Subject: [Numpy-discussion] Issue tracking In-Reply-To: References: <3C64F2BE-50E5-403C-9022-71233A6E3449@continuum.io> Message-ID: On Thu, Sep 27, 2012 at 8:46 AM, David Cournapeau wrote: > Thouis, what needs to be done to make a testbed of the conversion ? I just returned to this a couple of days ago [*], and last night successfully imported all the trac issues (from my somewhat out-of-date snapshot) to this repo: https://github.com/thouis/numpy-trac-migration Note that there are currently multiple copies of many issues. This is because github has no way to actually delete an issue, and I didn't bother marking issues that were from failed runs during debugging. If you see two copies of an issue with differences in their body/comments, the one with the higher github issue # is the one generated last night. The code that actually does the import is in that repository, as well. Thanks to the networkX group for alpha-testing the code. Ray [*] sorry for the delay... 
international move + starting a new job From charlesr.harris at gmail.com Thu Sep 27 11:46:27 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 27 Sep 2012 09:46:27 -0600 Subject: [Numpy-discussion] Issue tracking In-Reply-To: References: <3C64F2BE-50E5-403C-9022-71233A6E3449@continuum.io> Message-ID: On Thu, Sep 27, 2012 at 8:22 AM, Thouis (Ray) Jones wrote: > On Thu, Sep 27, 2012 at 8:46 AM, David Cournapeau > wrote: > > > Thouis, what needs to be done to make a testbed of the conversion ? > > I just returned to this a couple of days ago [*], and last night > successfully imported all the trac issues (from my somewhat > out-of-date snapshot) to this repo: > https://github.com/thouis/numpy-trac-migration > > Note that there are currently multiple copies of many issues. This is > because github has no way to actually delete an issue, and I didn't > bother marking issues that were from failed runs during debugging. If > you see two copies of an issue with differences in their > body/comments, the one with the higher github issue # is the one > generated last night. > > The code that actually does the import is in that repository, as well. > Thanks to the networkX group for alpha-testing the code. > > Thouis, will you want commit privileges? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From thouis at gmail.com Thu Sep 27 11:58:06 2012 From: thouis at gmail.com (Thouis (Ray) Jones) Date: Thu, 27 Sep 2012 11:58:06 -0400 Subject: [Numpy-discussion] Issue tracking In-Reply-To: References: <3C64F2BE-50E5-403C-9022-71233A6E3449@continuum.io> Message-ID: On Thu, Sep 27, 2012 at 11:46 AM, Charles R Harris wrote: > > > On Thu, Sep 27, 2012 at 8:22 AM, Thouis (Ray) Jones > wrote: >> >> On Thu, Sep 27, 2012 at 8:46 AM, David Cournapeau >> wrote: >> >> > Thouis, what needs to be done to make a testbed of the conversion ? >> >> I just returned to this a couple of days ago [*], and last night >> successfully imported all the trac issues (from my somewhat >> out-of-date snapshot) to this repo: >> https://github.com/thouis/numpy-trac-migration >> >> Note that there are currently multiple copies of many issues. This is >> because github has no way to actually delete an issue, and I didn't >> bother marking issues that were from failed runs during debugging. If >> you see two copies of an issue with differences in their >> body/comments, the one with the higher github issue # is the one >> generated last night. >> >> The code that actually does the import is in that repository, as well. >> Thanks to the networkX group for alpha-testing the code. >> > > Thouis, will you want commit privileges? Yes, though I created a bot account to keep from injecting myself into the history: https://github.com/numpy-gitbot Someone could also just run the import as the "numpy" github user. Ray From sergio.pasra at gmail.com Thu Sep 27 12:08:11 2012 From: sergio.pasra at gmail.com (Sergio Pascual) Date: Thu, 27 Sep 2012 18:08:11 +0200 Subject: [Numpy-discussion] Reductions with nditer working only with the last axis Message-ID: Hello, I'm trying to understand how to work with nditer to do a reduction, in my case converting a 3d array into a 2d array. I followed the help here http://docs.scipy.org/doc/numpy/reference/arrays.nditer.html and managed to create a function that applies reduction over the last axis of the input. 
With this function

def nditer_sum(data, red_axes):
    it = numpy.nditer([data, None],
                      flags=['reduce_ok', 'external_loop'],
                      op_flags=[['readonly'], ['readwrite', 'allocate']],
                      op_axes=[None, red_axes])
    it.operands[1][...] = 0
    for x, y in it:
        y[...] = x.sum()
    return it.operands[1]

I can get something equivalent to data.sum(axis=2)

>>> data = numpy.arange(2*3*4).reshape((2,3,4))
>>> nditer_sum(data, [0, 1, -1])
[[ 6 22 38]
 [54 70 86]]
>>> data.sum(axis=2)
[[ 6 22 38]
 [54 70 86]]

So to get something equivalent to data.sum(axis=0) I thought that it was enough to change the argument red_axes to [-1, 0,1]
But the result is quite different.

>>> data = numpy.arange(2*3*4).reshape((2,3,4))
>>> data.sum(axis=0)
[[12 14 16 18]
 [20 22 24 26]
 [28 30 32 34]]
>>> nditer_sum(data, [-1, 0, 1])
[[210 210 210 210]
 [210 210 210 210]
 [210 210 210 210]]

In the for loop inside nditer_sum (for x,y in it:), the iterator is looping 2 times and giving an array of length 12 each time, instead of looping 12 times and giving an array of length 2 each time. I have read the numpy documentation several times and googled about this to no avail. Does anybody have an example of a reduction in the first axis of an array using nditer? Is this a bug?

Regards, Sergio

From pav at iki.fi Thu Sep 27 14:45:21 2012 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 27 Sep 2012 21:45:21 +0300 Subject: [Numpy-discussion] Issue tracking In-Reply-To: References: <3C64F2BE-50E5-403C-9022-71233A6E3449@continuum.io> Message-ID: 27.09.2012 15:46, David Cournapeau kirjoitti: [clip] > Ok, so scipy.org was down again because of trac. Unfortunately, the > machine on which scipy.org lives is the same as trac, and is a bit > messy. Trac runs on new.scipy.org, which AFAIK *is* a different machine from scipy.org. Pauli
From pav at iki.fi Thu Sep 27 14:55:45 2012 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 27 Sep 2012 21:55:45 +0300 Subject: [Numpy-discussion] Issue tracking In-Reply-To: References: <3C64F2BE-50E5-403C-9022-71233A6E3449@continuum.io> Message-ID: 27.09.2012 21:45, Pauli Virtanen kirjoitti: > 27.09.2012 15:46, David Cournapeau kirjoitti: > [clip] >> Ok, so scipy.org was down again because of trac. Unfortunately, the >> machine on which scipy.org lives is the same as trac, and is a bit >> messy. > > Trac runs on new.scipy.org, which AFAIK *is* a different machine from > scipy.org. Well, turns out that I was mistaken here, and David was right. Carry on... -- Pauli Virtanen
From charlesr.harris at gmail.com Thu Sep 27 18:38:25 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 27 Sep 2012 16:38:25 -0600 Subject: [Numpy-discussion] Issue tracking In-Reply-To: References: <3C64F2BE-50E5-403C-9022-71233A6E3449@continuum.io> Message-ID: On Thu, Sep 27, 2012 at 9:58 AM, Thouis (Ray) Jones wrote: > On Thu, Sep 27, 2012 at 11:46 AM, Charles R Harris > wrote: > > > > > > On Thu, Sep 27, 2012 at 8:22 AM, Thouis (Ray) Jones > > wrote: > >> > >> On Thu, Sep 27, 2012 at 8:46 AM, David Cournapeau > >> wrote: > >> > >> > Thouis, what needs to be done to make a testbed of the conversion ? > >> > >> I just returned to this a couple of days ago [*], and last night > >> successfully imported all the trac issues (from my somewhat > >> out-of-date snapshot) to this repo: > >> https://github.com/thouis/numpy-trac-migration > >> > >> Note that there are currently multiple copies of many issues. This is > >> because github has no way to actually delete an issue, and I didn't > >> bother marking issues that were from failed runs during debugging.
If > >> you see two copies of an issue with differences in their > >> body/comments, the one with the higher github issue # is the one > >> generated last night. > >> > >> The code that actually does the import is in that repository, as well. > >> Thanks to the networkX group for alpha-testing the code. > >> > > > > Thouis, will you want commit privileges? > > Yes, though I created a bot account to keep from injecting myself into > the history: https://github.com/numpy-gitbot > > Someone could also just run the import as the "numpy" github user. > > You're in. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Sep 27 19:10:29 2012 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 28 Sep 2012 00:10:29 +0100 Subject: [Numpy-discussion] Issue tracking In-Reply-To: References: <3C64F2BE-50E5-403C-9022-71233A6E3449@continuum.io> Message-ID: On Thu, Sep 27, 2012 at 3:22 PM, Thouis (Ray) Jones wrote: > On Thu, Sep 27, 2012 at 8:46 AM, David Cournapeau wrote: > >> Thouis, what needs to be done to make a testbed of the conversion ? > > I just returned to this a couple of days ago [*], and last night > successfully imported all the trac issues (from my somewhat > out-of-date snapshot) to this repo: > https://github.com/thouis/numpy-trac-migration Skimmed some of the bugs: The "(migrated from Trac #xxx)" in all the bug titles is kind of intrusive and distracting -- maybe reduce this to just "(was Trac #xxx)" or even "(Trac#xxx)" or so? https://github.com/thouis/numpy-trac-migration/issues/3783: "Reported 2011-03-04 by atmention:rgommers, assigned to unknown." <- This atmention thing looks like a bug. In the main body of the bug (and also in some comments), your code managed to convert one {{{source-code block}}} into a ```source-code block```, but failed on several others. "Comment in Trac by atmention:rgommers, 2012-05-19" <-- aside from the atmention: thing, wouldn't it be better to put the commenter first so it's more visible when skimming comments? Maybe something like "@rgommers wrote on 2012-05-19:"? "Marked as knownfail in 1.6.x branch in commit:6922cf8" <- fails to create a link to the commit. https://github.com/thouis/numpy-trac-migration/issues/3777: There are several mysterious empty comments here. (Apparently they came from milestone change messages.) Maybe this is already fixed, b/c I saw other milestone change messages, but fyi. https://github.com/thouis/numpy-trac-migration/issues/3774: Some originally-bold text has mysterious punctuation marks instead. ('''Problem''', '''Expectation of Behavior''', ...) https://github.com/thouis/numpy-trac-migration/issues/3760#issuecomment-8939552: This comment refers to an attachment by linking directly into the original trac instance. Are we going to keep trac alive indefinitely to serve these links, or is there some other plan...? For the boilerplate text at the beginning of every ticket ("Original ticket http://projects.scipy.org/numpy/ticket/1155\nReported 2009-06-30 by trac user gorm, assigned to atmention:rgommers."), it would be nice if it were somehow set off visually to make clearer where the actual start of the ticket text is. Maybe it could be italicized, or we could put a rule underneath it, or something? --- Anyway, these are mostly nitpicks (all except the attachment thing I guess, that seems like it could be a problem). Thanks so much for your work on this; it's really appreciated. 
-n From thouis at gmail.com Thu Sep 27 22:09:48 2012 From: thouis at gmail.com (Thouis (Ray) Jones) Date: Thu, 27 Sep 2012 22:09:48 -0400 Subject: [Numpy-discussion] Issue tracking In-Reply-To: References: <3C64F2BE-50E5-403C-9022-71233A6E3449@continuum.io> Message-ID: tl;dr I think I fixed everything mentioned below. On Thu, Sep 27, 2012 at 7:10 PM, Nathaniel Smith wrote: > On Thu, Sep 27, 2012 at 3:22 PM, Thouis (Ray) Jones wrote: >> On Thu, Sep 27, 2012 at 8:46 AM, David Cournapeau wrote: >> >>> Thouis, what needs to be done to make a testbed of the conversion ? >> >> I just returned to this a couple of days ago [*], and last night >> successfully imported all the trac issues (from my somewhat >> out-of-date snapshot) to this repo: >> https://github.com/thouis/numpy-trac-migration > > Skimmed some of the bugs: > > The "(migrated from Trac #xxx)" in all the bug titles is kind of > intrusive and distracting -- maybe reduce this to just "(was Trac > #xxx)" or even "(Trac#xxx)" or so? Changed to the last suggestion. > https://github.com/thouis/numpy-trac-migration/issues/3783: > > "Reported 2011-03-04 by atmention:rgommers, assigned to unknown." <- > This atmention thing looks like a bug. This is to prevent @user notifications from occurring during testing. I'll switch it to an actual "@" when it's time to do the actual import. > In the main body of the bug (and also in some comments), your code > managed to convert one {{{source-code block}}} into a ```source-code > block```, but failed on several others. It appears that github markup requires indented blocks to be separated by an empty line. Fixed. See: https://github.com/thouis/numpy-trac-migration/issues/3794 > "Comment in Trac by atmention:rgommers, 2012-05-19" <-- aside from the > atmention: thing, wouldn't it be better to put the commenter first so > it's more visible when skimming comments? Maybe something like > "@rgommers wrote on 2012-05-19:"? Done. > "Marked as knownfail in 1.6.x branch in commit:6922cf8" <- fails to > create a link to the commit. Fixed. > https://github.com/thouis/numpy-trac-migration/issues/3777: There are > several mysterious empty comments here. (Apparently they came from > milestone change messages.) Maybe this is already fixed, b/c I saw > other milestone change messages, but fyi. I think this has been fixed in the latest code (but not all issues have used that for importing). See: https://github.com/thouis/numpy-trac-migration/issues/3786 > https://github.com/thouis/numpy-trac-migration/issues/3774: Some > originally-bold text has mysterious punctuation marks instead. > ('''Problem''', '''Expectation of Behavior''', ...) Fixed. https://github.com/thouis/numpy-trac-migration/issues/3788 > https://github.com/thouis/numpy-trac-migration/issues/3760#issuecomment-8939552: > This comment refers to an attachment by linking directly into the > original trac instance. Are we going to keep trac alive indefinitely > to serve these links, or is there some other plan...? The plan is to keep these around indefinitely (possibly as a snapshot). If they are moved in the future, automatic rewriting should be possible. > For the boilerplate text at the beginning of every ticket ("Original > ticket http://projects.scipy.org/numpy/ticket/1155\nReported > 2009-06-30 by trac user gorm, assigned to atmention:rgommers."), it > would be nice if it were somehow set off visually to make clearer > where the actual start of the ticket text is. Maybe it could be > italicized, or we could put a rule underneath it, or something? 
Like this? https://github.com/thouis/numpy-trac-migration/issues/3792 Ray From thouis at gmail.com Thu Sep 27 22:20:43 2012 From: thouis at gmail.com (Thouis (Ray) Jones) Date: Thu, 27 Sep 2012 22:20:43 -0400 Subject: [Numpy-discussion] Issue tracking In-Reply-To: References: <3C64F2BE-50E5-403C-9022-71233A6E3449@continuum.io> Message-ID: On Thu, Sep 27, 2012 at 10:09 PM, Thouis (Ray) Jones wrote: > tl;dr I think I fixed everything mentioned below. By the way, my current method is to address bugs in the import by just reimporting tickets that demonstrate the bug, and not worrying about old versions of that ticket. If in browsing through you come upon an apparent bug, it's worth searching for the Trac # to make sure there isn't a more recent version with the bug fixed. I think things are close to ready. I still need to file a numpy warning/issue that mentions everyone that's going to be @mentioned in the full import, to make sure they're aware of what's coming and can filter appropriately, or request I remove them from the trac-to-github username map. Ray From ralf.gommers at gmail.com Fri Sep 28 01:48:06 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 28 Sep 2012 07:48:06 +0200 Subject: [Numpy-discussion] Issue tracking In-Reply-To: References: <3C64F2BE-50E5-403C-9022-71233A6E3449@continuum.io> Message-ID: On Fri, Sep 28, 2012 at 4:20 AM, Thouis (Ray) Jones wrote: > On Thu, Sep 27, 2012 at 10:09 PM, Thouis (Ray) Jones > wrote: > > tl;dr I think I fixed everything mentioned below. > > By the way, my current method is to address bugs in the import by just > reimporting tickets that demonstrate the bug, and not worrying about > old versions of that ticket. If in browsing through you come upon an > apparent bug, it's worth searching for the Trac # to make sure there > isn't a more recent version with the bug fixed. > > I think things are close to ready. I still need to file a numpy > warning/issue that mentions everyone that's going to be @mentioned in > the full import, to make sure they're aware of what's coming and can > filter appropriately, or request I remove them from the trac-to-github > username map. > Looks great! After a quick browse, the only thing I noticed that still needs some thought is the color scheme for the labels. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From thouis at gmail.com Fri Sep 28 09:17:16 2012 From: thouis at gmail.com (Thouis (Ray) Jones) Date: Fri, 28 Sep 2012 09:17:16 -0400 Subject: [Numpy-discussion] Issue tracking In-Reply-To: References: <3C64F2BE-50E5-403C-9022-71233A6E3449@continuum.io> Message-ID: On Fri, Sep 28, 2012 at 1:48 AM, Ralf Gommers wrote: > Looks great! After a quick browse, the only thing I noticed that still needs > some thought is the color scheme for the labels. That's easy to adjust afterwards. From njs at pobox.com Fri Sep 28 10:45:50 2012 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 28 Sep 2012 15:45:50 +0100 Subject: [Numpy-discussion] numpy.distutils, lapack, and cython Message-ID: Hi all, I've gotten a pull request for scikits-sparse to switch it to using numpy.distutils: https://github.com/njsmith/scikits-sparse/pull/2 Overall this seems fair enough, finding libraries is a pain and numpy.distutils has that knowledge. 1) What's the proper way to find lapack using numpy.distutils? The patch tries lapack_mkl_info and lapack_info, but this is 2 of the 6 variants that my numpy.distutils.system_info contains. 
Really what I probably want is a way to ask numpy how it was built (including any custom paths the user used) and to default to doing the same? Is that possible? 2) Is there a better way to build Cython files than this weird monkey-patching thing they propose? (It's still better than the horror that setuptools/distribute require, but I guess I have higher expectations...) 3) Can I use 2to3 along with numpy.distutils? -n From matthew.brett at gmail.com Fri Sep 28 12:58:33 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 28 Sep 2012 17:58:33 +0100 Subject: [Numpy-discussion] numpy.distutils, lapack, and cython In-Reply-To: References: Message-ID: Hi, On Fri, Sep 28, 2012 at 3:45 PM, Nathaniel Smith wrote: > Hi all, > > I've gotten a pull request for scikits-sparse to switch it to using > numpy.distutils: > https://github.com/njsmith/scikits-sparse/pull/2 > > Overall this seems fair enough, finding libraries is a pain and > numpy.distutils has that knowledge. > > 1) What's the proper way to find lapack using numpy.distutils? The > patch tries lapack_mkl_info and lapack_info, but this is 2 of the 6 > variants that my numpy.distutils.system_info contains. Really what I > probably want is a way to ask numpy how it was built (including any > custom paths the user used) and to default to doing the same? Is that > possible? > > 2) Is there a better way to build Cython files than this weird > monkey-patching thing they propose? (It's still better than the horror > that setuptools/distribute require, but I guess I have higher > expectations...) > > 3) Can I use 2to3 along with numpy.distutils? For this last - current nipy trunk uses 2to3 with numpy.distutils, for porting the tests and the doctests. Cheers, Matthew From robert.kern at gmail.com Fri Sep 28 13:45:56 2012 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 28 Sep 2012 12:45:56 -0500 Subject: [Numpy-discussion] numpy.distutils, lapack, and cython In-Reply-To: References: Message-ID: On Fri, Sep 28, 2012 at 9:45 AM, Nathaniel Smith wrote: > Hi all, > > I've gotten a pull request for scikits-sparse to switch it to using > numpy.distutils: > https://github.com/njsmith/scikits-sparse/pull/2 > > Overall this seems fair enough, finding libraries is a pain and > numpy.distutils has that knowledge. > > 1) What's the proper way to find lapack using numpy.distutils? The > patch tries lapack_mkl_info and lapack_info, but this is 2 of the 6 > variants that my numpy.distutils.system_info contains. Really what I > probably want is a way to ask numpy how it was built (including any > custom paths the user used) and to default to doing the same? Is that > possible? You can get some of that information from the generated numpy.__config__ module (see its show() and get_info() functions). I'm not sure if that includes custom paths that were hacked in by the user by editing the setup.py files, but I think it includes custom paths specified in the site.cfg that the builder used. Of course, if one installed numpy using a binary built on another machine, it is possible for that information to be not applicable to the current build. I believe that you want to use 'lapack_opt' as the most generic optimized LAPACK build information. That should dispatch to the vendor-specific ones if they are present, I think. It's what numpy.linalg and scipy.linalg do. I have rather blissfully forgotten such details, so you may want to do some digging of your own. 
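For what it's worth, a rough, untested sketch of that approach (the package and extension names below are only placeholders, not anything from the scikits-sparse pull request) would be to let a numpy.distutils-based setup.py reuse whatever 'lapack_opt' resolves to, and compare it with what numpy recorded about its own build in numpy.__config__:

# Hypothetical setup.py fragment -- names are placeholders.
from numpy.distutils.misc_util import Configuration
from numpy.distutils.system_info import get_info
import numpy.__config__ as npconfig

def configuration(parent_package='', top_path=None):
    config = Configuration('sparse', parent_package, top_path)

    # 'lapack_opt' dispatches to MKL/ATLAS/Accelerate/plain LAPACK and
    # honours the user's site.cfg, like numpy.linalg and scipy.linalg do.
    lapack = get_info('lapack_opt', notfound_action=2)  # 2 = raise if missing

    # What numpy itself was built against (may be empty for some binary
    # installs); not used below, only there for comparison/debugging.
    numpy_lapack = npconfig.get_info('lapack_opt_info')

    config.add_extension('cholmod_wrapper',            # placeholder name
                         sources=['cholmod_wrapper.c'],
                         extra_info=lapack)
    return config

if __name__ == '__main__':
    from numpy.distutils.core import setup
    setup(configuration=configuration)

That doesn't answer the "exactly the same custom paths numpy was built with" question, but between numpy.__config__ and site.cfg it is usually close enough.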
> 2) Is there a better way to build Cython files than this weird > monkey-patching thing they propose? (It's still better than the horror > that setuptools/distribute require, but I guess I have higher > expectations...) Sadly, probably not. numpy.distutils is not much less horrifying than setuptools. -- Robert Kern From ndbecker2 at gmail.com Fri Sep 28 15:02:02 2012 From: ndbecker2 at gmail.com (Neal Becker) Date: Fri, 28 Sep 2012 15:02:02 -0400 Subject: [Numpy-discussion] silently ignored size mismatch (bug??) Message-ID: In [19]: u = np.arange (10) In [20]: v = np.arange (10) In [21]: u[v] = u In [22]: u[v] = np.arange(11) silence... From pwang at continuum.io Fri Sep 28 16:30:19 2012 From: pwang at continuum.io (Peter Wang) Date: Fri, 28 Sep 2012 15:30:19 -0500 Subject: [Numpy-discussion] PyData NYC 2012 Speakers and Talks announced! Message-ID: Hi everyone, The PyData NYC team and Continuum Analytics are proud to announce the full lineup of talks and speakers for the PyData NYC 2012 event! We're thrilled with the exciting lineup of workshops, hands-on tutorials, and talks about real-world uses of Python for data analysis. http://nyc2012.pydata.org/schedule The list of presenters and talk abstracts are also available, and are linked from the schedule page. For those who will be in town on Thursday evening of October 25th, there will be a special PyData edition of Drinks & Data at Dewey's Flatiron. It'll be a great chance to socialize and meet with PyData presenters and other attendees. Register here: http://drinks-and-data-pydata-conf-ny.eventbrite.com/ We're also proud to be part of the NYC DataWeek: http://oreilly.com/dataweek/?cmp=tw-strata-ev-dr. The week of October 22nd is going to be a great time to be in New York! Lastly, we are still looking for sponsors! If you want to get your company recognition in front of a few hundred Python data hackers and hardcore developers, PyData will be a premier venue to showcase your products or recruit exceptional talent. Please visit http://nyc2012.pydata.org/sponsors/becoming/ to inquire about sponsorship. In addition to the conference sponsorship, charter sponsorships for dinner Friday night, as well as the Sunday Hack-a-thon event are all open. Please help us promote the conference! Tell your friends, email your meetup groups, and follow @PyDataConf on Twitter. Early registration ends in just a few weeks, so register today! http://pydata.eventbrite.com/ See you there! -Peter Wang Organizer, PyData NYC 2012 -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Fri Sep 28 17:23:27 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 28 Sep 2012 23:23:27 +0200 Subject: [Numpy-discussion] Making numpy sensible: backward compatibility please Message-ID: <20120928212327.GA22708@phare.normalesup.org> Hi numpy developers, First of all, thanks a lot for the hard work you put in numpy. I know very well that maintaining such a core library is a lot of effort and a service to the community. But "with great dedication, comes great responsibility" :). I find that Numpy is a bit of a wild horse, a moving target. I have just fixed a fairly nasty bug in scikit-learn [1] that was introduced by change of semantics in ordering when doing copies with numpy. I have been running working and developing the scikit-learn while tracking numpy's development tree and, as far as I can tell, I never saw warnings raised in our code that something was going to change, or had changed. 
In other settings, changes in array inheritance and 'base' propagation have made impossible some of our memmap-related usecases that used to work under previous numpy [2]. Others have been hitting difficulties related to these changes in behavior [3]. Not to mention the new casting rules (default: 'same_kind') that break a lot of code, or the ABI change that, while not done on purpose, ended up causing us a lot of pain.

My point here is that having code that works and gives correct results with new releases of numpy is more challenging than it should be. I cannot claim that I disagree with the changes that I mention above. They were all implemented for a good reason and can all be considered as overall improvements to numpy. However, the situation is that given a complex codebase relying on numpy that works at a time t, the chances that it works flawlessly at time t + 1y are thin. I am not too proud that we managed to release scikit-learn 0.12 with a very ugly bug under numpy 1.7. That happened although we have 90% test coverage, buildbots under different numpy versions, and a lot of people, including me, using our development tree on a day-to-day basis with bleeding edge numpy. Most code in research settings or R&D industry does not benefit from such software engineering and I believe is much more likely to suffer from changes in numpy.

I think that this is a cultural issue: priority is not given to stability and backward compatibility. I think that this culture is very much ingrained in the Python world, which likes iteratively cleaning its software design. For instance, I have the feeling that in scikit-learn we probably fall into the same trap. That said, such a behavior cannot fare well for a base scientific environment. People tell me that if they take old matlab code, the odds that it will still work are much higher than with Python code. As a geek, I tend to reply that we get a lot out of this mobility, because we accumulate less cruft. However, in research settings, for reproducibility reasons, one needs to be able to pick up an old codebase and trust its results without knowing its intricacies.

From a practical standpoint, I believe that people implementing large changes to the numpy codebase, or any other core scipy package, should think really hard about their impact. I do realise that the changes are discussed on the mailing lists, but there is a lot of activity to follow and I don't believe that it is possible for many of us to monitor the discussions. Also, putting more emphasis on backward compatibility is possible. For instance, the 'order' parameter added to np.copy could have defaulted to the old behavior, 'K', for a year, with a DeprecationWarning; same thing for the casting rules.

Thank you for reading this long email. I don't mean it to be a complaint about the past, but more a suggestion on something to keep in mind when making changes to core projects.
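To make the np.copy example concrete, a rough sketch of such a transition default (a hypothetical helper, not numpy's actual implementation) could look like this:

import warnings
import numpy as np

_not_given = object()   # sentinel: caller did not pass `order` explicitly

def copy_with_transition(a, order=_not_given):
    # Hypothetical shim: keep the old default ('K') for a release cycle,
    # and warn so that downstream code has time to adapt before it changes.
    if order is _not_given:
        warnings.warn("the default copy order will change in a future "
                      "release; pass `order` explicitly to silence this",
                      DeprecationWarning, stacklevel=2)
        order = 'K'
    return np.array(a, order=order, copy=True)

Downstream packages would then see the warning in their test suites at least one release before the behaviour actually changes.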
Cheers, Ga?l ____ [1] https://github.com/scikit-learn/scikit-learn/commit/7842748cf777412c506a8c0ed28090711d3a3783 [2] http://mail.scipy.org/pipermail/numpy-discussion/2012-September/063985.html [3] http://mail.scipy.org/pipermail/numpy-discussion/2012-July/063126.html From travis at continuum.io Fri Sep 28 17:43:00 2012 From: travis at continuum.io (Travis Oliphant) Date: Fri, 28 Sep 2012 16:43:00 -0500 Subject: [Numpy-discussion] Making numpy sensible: backward compatibility please In-Reply-To: <20120928212327.GA22708@phare.normalesup.org> References: <20120928212327.GA22708@phare.normalesup.org> Message-ID: Thank you for expressing this voice, Gael. It is an important perspective. The main reason that 1.7 has taken so long to get released is because I'm concerned about these kinds of changes and really want to either remove them or put in adequate warnings prior to moving forward. It's a long and complex process. Thanks for providing feedback when you encounter problems so that we can do our best to address them. I agree that we should be much more cautious about semantic changes in the 1.X series of NumPy. How we handle situations where 1.6 changed things from 1.5 and wasn't reported until now is an open question and depends on the particular problem in question. I agree that we should be much more cautious about changes (particularly semantic changes that will break existing code). -Travis On Sep 28, 2012, at 4:23 PM, Gael Varoquaux wrote: > Hi numpy developers, > > First of all, thanks a lot for the hard work you put in numpy. I know > very well that maintaining such a core library is a lot of effort and a > service to the community. But "with great dedication, comes great > responsibility" :). > > I find that Numpy is a bit of a wild horse, a moving target. I have just > fixed a fairly nasty bug in scikit-learn [1] that was introduced by > change of semantics in ordering when doing copies with numpy. I have been > running working and developing the scikit-learn while tracking numpy's > development tree and, as far as I can tell, I never saw warnings raised > in our code that something was going to change, or had changed. > > In other settings, changes in array inheritance and 'base' propagation > have made impossible some of our memmap-related usecase that used to work > under previous numpy [2]. Other's have been hitting difficulties related > to these changes in behavior [3]. Not to mention the new casting rules > (default: 'same_kind') that break a lot of code, or the ABI change that, > while not done an purpose, ended up causing us a lot of pain. > > My point here is that having code that works and gives correct results > with new releases of numpy is more challenging that it should be. I > cannot claim that I disagree with the changes that I mention above. They > were all implemented for a good reason and can all be considered as > overall improvements to numpy. However the situation is that given a > complex codebase relying on numpy that works at a time t, the chances > that it works flawlessly at time t + 1y are thin. I am not too proud that > we managed to release scikit-learn 0.12 with a very ugly bug under numpy > 1.7. That happened although we have 90% of test coverage, buildbots under > different numpy versions, and a lot of people, including me, using our > development tree on a day to day basis with bleeding edge numpy. 
Most > code in research settings or RD industry does not benefit from such > software engineering and I believe is much more likely to suffer from > changes in numpy. > > I think that this is a cultural issue: priority is not given to stability > and backward compatibility. I think that this culture is very much > ingrained in the Python world, that likes iteratively cleaning its > software design. For instance, I have the feeling that in the > scikit-learn, we probably fall in the same trap. That said, such a > behavior cannot fare well for a base scientific environment. People tell > me that if they take old matlab code, the odds that it will still works > is much higher than with Python code. As a geek, I tend to reply that we > get a lot out of this mobility, because we accumulate less cruft. > However, in research settings, for reproducibility reasons, ones need to > be able to pick up an old codebase and trust its results without knowing > its intricacies. > >> From a practical standpoint, I believe that people implementing large > changes to the numpy codebase, or any other core scipy package, should > think really hard about their impact. I do realise that the changes are > discussed on the mailing lists, but there is a lot of activity to follow > and I don't believe that it is possible for many of us to monitor the > discussions. Also, putting more emphasis on backward compatibility is > possible. For instance, the 'order' parameter added to np.copy could have > defaulted to the old behavior, 'K', for a year, with a > DeprecationWarning, same thing for the casting rules. > > Thank you for reading this long email. I don't mean it to be a complaint > about the past, but more a suggestion on something to keep in mind when > making changes to core projects. > > Cheers, > > Ga?l > > ____ > > [1] https://github.com/scikit-learn/scikit-learn/commit/7842748cf777412c506a8c0ed28090711d3a3783 > > [2] http://mail.scipy.org/pipermail/numpy-discussion/2012-September/063985.html > > [3] http://mail.scipy.org/pipermail/numpy-discussion/2012-July/063126.html > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From heng at cantab.net Fri Sep 28 17:53:28 2012 From: heng at cantab.net (Henry Gomersall) Date: Fri, 28 Sep 2012 22:53:28 +0100 Subject: [Numpy-discussion] Making numpy sensible: backward compatibility please In-Reply-To: References: <20120928212327.GA22708@phare.normalesup.org> Message-ID: <1348869208.6529.40.camel@farnsworth> On Fri, 2012-09-28 at 16:43 -0500, Travis Oliphant wrote: > I agree that we should be much more cautious about semantic changes in > the 1.X series of NumPy. How we handle situations where 1.6 changed > things from 1.5 and wasn't reported until now is an open question and > depends on the particular problem in question. I agree that we > should be much more cautious about changes (particularly semantic > changes that will break existing code). One thing I noticed in my (short and shallow) foray into numpy development was the rather limited scope of the tests in the area I touched (fft). I know not the extent to which this is true across the code base, but I know from experience the value of a truly exhaustive test set (every line tested for every condition). Perhaps someone with a deeper knowledge could comment on this? 
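For the fft case, the kind of brute-force check I have in mind would be something like this rough sketch (only an illustration, not numpy's actual test suite):

import numpy as np
from numpy.testing import assert_allclose

def naive_dft(x):
    # O(n^2) reference implementation to compare np.fft.fft against.
    x = np.asarray(x, dtype=complex)
    n = len(x)
    k = np.arange(n)
    w = np.exp(-2j * np.pi * np.outer(k, k) / n)
    return w.dot(x)

def test_fft_matches_naive_dft():
    rng = np.random.RandomState(1234)
    # Powers of two, primes and composites, real and complex input.
    for n in [1, 2, 3, 4, 5, 8, 16, 17, 30, 64]:
        for make_complex in (False, True):
            x = rng.randn(n)
            if make_complex:
                x = x + 1j * rng.randn(n)
            assert_allclose(np.fft.fft(x), naive_dft(x), atol=1e-10)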
I also know from experience the huge discipline it takes to do such test driven coding, especially when one has limited time and motivation on a project! Also, writing tests for legacy code is painful! This is not meant as a criticism. Like Gael, I think Numpy is a fantastic project that has achieved great things. Henry From charlesr.harris at gmail.com Fri Sep 28 18:11:37 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 28 Sep 2012 16:11:37 -0600 Subject: [Numpy-discussion] Making numpy sensible: backward compatibility please In-Reply-To: <20120928212327.GA22708@phare.normalesup.org> References: <20120928212327.GA22708@phare.normalesup.org> Message-ID: On Fri, Sep 28, 2012 at 3:23 PM, Gael Varoquaux < gael.varoquaux at normalesup.org> wrote: > Hi numpy developers, > > First of all, thanks a lot for the hard work you put in numpy. I know > very well that maintaining such a core library is a lot of effort and a > service to the community. But "with great dedication, comes great > responsibility" :). > > I find that Numpy is a bit of a wild horse, a moving target. I have just > fixed a fairly nasty bug in scikit-learn [1] that was introduced by > change of semantics in ordering when doing copies with numpy. I have been > running working and developing the scikit-learn while tracking numpy's > development tree and, as far as I can tell, I never saw warnings raised > in our code that something was going to change, or had changed. > IIRC, the copy order was not specced and should not have been assumed. Some copy orders are faster than others and I believe numpy now takes advantage of that fact. Admittedly, numpy has started to move, mostly due to Mark's work, but I don't think that is all bad, I feel that it has to move some and the users need to be pushed a bit. It's a balancing act, but I don't think copy order goes over the line. One way to look at that is that 1.8 might have been a better release to make the change, on the other hand, Mark has moved on and dropped into the ContinuumIO black hole. Sometimes you need to ride the train when it is there at the station. > > In other settings, changes in array inheritance and 'base' propagation > have made impossible some of our memmap-related usecase that used to work > under previous numpy [2]. Other's have been hitting difficulties related > to these changes in behavior [3]. Not to mention the new casting rules > (default: 'same_kind') that break a lot of code, or the ABI change that, > while not done an purpose, ended up causing us a lot of pain. > IIRC, the base propagation changes fixed a bug, an old bug. > > My point here is that having code that works and gives correct results > with new releases of numpy is more challenging that it should be. I > cannot claim that I disagree with the changes that I mention above. They > were all implemented for a good reason and can all be considered as > overall improvements to numpy. However the situation is that given a > complex codebase relying on numpy that works at a time t, the chances > that it works flawlessly at time t + 1y are thin. I am not too proud that > we managed to release scikit-learn 0.12 with a very ugly bug under numpy > 1.7. That happened although we have 90% of test coverage, buildbots under > different numpy versions, and a lot of people, including me, using our > development tree on a day to day basis with bleeding edge numpy. 
Most > code in research settings or RD industry does not benefit from such > software engineering and I believe is much more likely to suffer from > changes in numpy. > If the behaviour is not specified and tested, there is no guarantee that it will continue. > > I think that this is a cultural issue: priority is not given to stability > and backward compatibility. I think that this culture is very much > ingrained in the Python world, that likes iteratively cleaning its > software design. For instance, I have the feeling that in the > scikit-learn, we probably fall in the same trap. That said, such a > behavior cannot fare well for a base scientific environment. People tell > me that if they take old matlab code, the odds that it will still works > is much higher than with Python code. As a geek, I tend to reply that we > get a lot out of this mobility, because we accumulate less cruft. > However, in research settings, for reproducibility reasons, ones need to > be able to pick up an old codebase and trust its results without knowing > its intricacies. > Bitch, bitch, bitch. Look, I know you are pissed and venting a bit, but this problem could have been detected and reported 6 months ago, that is, unless it is new due to development on your end. > > >From a practical standpoint, I believe that people implementing large > changes to the numpy codebase, or any other core scipy package, should > think really hard about their impact. I do realise that the changes are > discussed on the mailing lists, but there is a lot of activity to follow > and I don't believe that it is possible for many of us to monitor the > discussions. Also, putting more emphasis on backward compatibility is > possible. For instance, the 'order' parameter added to np.copy could have > defaulted to the old behavior, 'K', for a year, with a > DeprecationWarning, same thing for the casting rules. > > Thank you for reading this long email. I don't mean it to be a complaint > about the past, but more a suggestion on something to keep in mind when > making changes to core projects. > > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Fri Sep 28 18:15:36 2012 From: travis at continuum.io (Travis Oliphant) Date: Fri, 28 Sep 2012 17:15:36 -0500 Subject: [Numpy-discussion] Making numpy sensible: backward compatibility please In-Reply-To: <1348869208.6529.40.camel@farnsworth> References: <20120928212327.GA22708@phare.normalesup.org> <1348869208.6529.40.camel@farnsworth> Message-ID: On Sep 28, 2012, at 4:53 PM, Henry Gomersall wrote: > On Fri, 2012-09-28 at 16:43 -0500, Travis Oliphant wrote: >> I agree that we should be much more cautious about semantic changes in >> the 1.X series of NumPy. How we handle situations where 1.6 changed >> things from 1.5 and wasn't reported until now is an open question and >> depends on the particular problem in question. I agree that we >> should be much more cautious about changes (particularly semantic >> changes that will break existing code). > > One thing I noticed in my (short and shallow) foray into numpy > development was the rather limited scope of the tests in the area I > touched (fft). I know not the extent to which this is true across the > code base, but I know from experience the value of a truly exhaustive > test set (every line tested for every condition). Perhaps someone with a > deeper knowledge could comment on this? Thank you for bringing this up. 
It is definitely a huge flaw of NumPy that it does not have more extensive testing. It is a result of the limited resources under which NumPy has been developed. We are trying to correct this problem over time --- but it takes time. In the mean time, there is a huge install base of code out there which acts as a de-facto test suite of NumPy. We just need to make sure those tests actually get run on new versions of NumPy and we get reports back of failures --- especially when subtle changes have taken place in the way things work (iteration in ufuncs and coercion rules being the most obvious). This results in longer release cycles if releases contain code that significantly change the way things work (removed APIs, altered coercion rules, etc.) The alteration of the semantics of how the base attribute works is a good example. Everyone felt it was a good idea to have the .base attribute point to the actual object holding the memory (and it fixed a well-known example of how you could crash Python by building up a stack of array-object references). However, our fix created a problem for code that uses memmap objects and relied on the fact that the .base attribute would hold a reference to the most recent *memmap* object. This was an unforeseen problem with our change. On the other hand, change is a good thing and we don't want NumPy to stop getting improvements. We just have to be careful that we don't allow our enthusiasm for new features and changes to over-ride our responsibility to end-users. I appreciate the efforts of all the NumPy developers in working through the inevitable debates that differences in perspective on that fundamental trade-off will bring. Best, -Travis From rhl at astro.princeton.edu Fri Sep 28 18:18:14 2012 From: rhl at astro.princeton.edu (Robert Lupton the Good) Date: Fri, 28 Sep 2012 18:18:14 -0400 Subject: [Numpy-discussion] Making numpy sensible: backward compatibility please In-Reply-To: References: Message-ID: <24DD436C-13B4-45B0-A222-FDE721EE4147@astro.princeton.edu> Gael puts in a plea for backward compatibility; I totally agree. Numpy sometimes goes out of its way to make this hard. For example, when the syntax of histogram were changed you got a nice DepreciationWarning about an option to switch to the new behaviour; great. But a few releases later that option went away and code carefully written to survive the transition stopped working; that didn't make me a happy user. R > Also, putting more emphasis on backward compatibility is possible. For instance, the 'order' parameter added to np.copy could have defaulted to the old behavior, 'K', for a year, with a DeprecationWarning, same thing for the casting rules. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 495 bytes Desc: Message signed with OpenPGP using GPGMail URL: From fperez.net at gmail.com Fri Sep 28 20:44:09 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Fri, 28 Sep 2012 17:44:09 -0700 Subject: [Numpy-discussion] Making numpy sensible: backward compatibility please In-Reply-To: References: <20120928212327.GA22708@phare.normalesup.org> Message-ID: On Fri, Sep 28, 2012 at 3:11 PM, Charles R Harris wrote: > Bitch, bitch, bitch. Look, I know you are pissed and venting a bit, but this > problem could have been detected and reported 6 months ago, that is, unless > it is new due to development on your end. It would be great if we could keep these kinds of comments out of our regular discourse. 
Gael's email was a calmly composed, friendly and balanced view from someone developing widely used code (sklearn) and using numpy daily for production work. He also happens to be a member of this community from long ago, so I'm sure he'll know to ignore the above and won't be deterred from providing feedback again in the future when necessary. But if newcomers find that the feedback they provide on this list with real-world issues is met with replies like the above, I suspect it will greatly diminish the likelihood they will engage us. Numpy will overall be hurt as a project. Chuck, I'd like to encourage you to use a more civil discourse and dispense with the unnecessarily aggressive angles. Regards, f From scopatz at gmail.com Fri Sep 28 21:25:35 2012 From: scopatz at gmail.com (Anthony Scopatz) Date: Fri, 28 Sep 2012 20:25:35 -0500 Subject: [Numpy-discussion] Making numpy sensible: backward compatibility please In-Reply-To: References: <20120928212327.GA22708@phare.normalesup.org> Message-ID: On Fri, Sep 28, 2012 at 7:44 PM, Fernando Perez wrote: > On Fri, Sep 28, 2012 at 3:11 PM, Charles R Harris > wrote: > > Bitch, bitch, bitch. Look, I know you are pissed and venting a bit, but > this > > problem could have been detected and reported 6 months ago, that is, > unless > > it is new due to development on your end. > > It would be great if we could keep these kinds of comments out of our > regular discourse. > +1, I think we should endeavor to have a respectful and welcoming community. > > Gael's email was a calmly composed, friendly and balanced view from > someone developing widely used code (sklearn) and using numpy daily > for production work. He also happens to be a member of this > community from long ago, so I'm sure he'll know to ignore the above > and won't be deterred from providing feedback again in the future when > necessary. > > But if newcomers find that the feedback they provide on this list with > real-world issues is met with replies like the above, I suspect it > will greatly diminish the likelihood they will engage us. Numpy will > overall be hurt as a project. > > Chuck, I'd like to encourage you to use a more civil discourse and > dispense with the unnecessarily aggressive angles. > > Regards, > > f > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Sep 28 21:26:04 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 28 Sep 2012 19:26:04 -0600 Subject: [Numpy-discussion] Making numpy sensible: backward compatibility please In-Reply-To: References: <20120928212327.GA22708@phare.normalesup.org> Message-ID: On Fri, Sep 28, 2012 at 6:44 PM, Fernando Perez wrote: > On Fri, Sep 28, 2012 at 3:11 PM, Charles R Harris > wrote: > > Bitch, bitch, bitch. Look, I know you are pissed and venting a bit, but > this > > problem could have been detected and reported 6 months ago, that is, > unless > > it is new due to development on your end. > > It would be great if we could keep these kinds of comments out of our > regular discourse. > > Gael's email was a calmly composed, friendly and balanced view from > someone developing widely used code (sklearn) and using numpy daily > for production work. 
He also happens to be a member of this > community from long ago, so I'm sure he'll know to ignore the above > and won't be deterred from providing feedback again in the future when > necessary. > > But if newcomers find that the feedback they provide on this list with > real-world issues is met with replies like the above, I suspect it > will greatly diminish the likelihood they will engage us. Numpy will > overall be hurt as a project. > > Chuck, I'd like to encourage you to use a more civil discourse and > dispense with the unnecessarily aggressive angles. > > Your note is a bit off topic. What do you think of the subject of the post? Regards, Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Sep 28 21:37:35 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 28 Sep 2012 19:37:35 -0600 Subject: [Numpy-discussion] Making numpy sensible: backward compatibility please In-Reply-To: References: <20120928212327.GA22708@phare.normalesup.org> Message-ID: On Fri, Sep 28, 2012 at 7:25 PM, Anthony Scopatz wrote: > On Fri, Sep 28, 2012 at 7:44 PM, Fernando Perez wrote: > >> On Fri, Sep 28, 2012 at 3:11 PM, Charles R Harris >> wrote: >> > Bitch, bitch, bitch. Look, I know you are pissed and venting a bit, but >> this >> > problem could have been detected and reported 6 months ago, that is, >> unless >> > it is new due to development on your end. >> >> It would be great if we could keep these kinds of comments out of our >> regular discourse. >> > > +1, I think we should endeavor to have a respectful and welcoming > community. > With a bit of humour now and then among the old timers, no? Look, I've been outright insulted on the list and I don't recall anyone weighing in about civility at the time. But nor am I a tender flower. But we are going way off topic, no? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Sep 28 22:03:01 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 29 Sep 2012 03:03:01 +0100 Subject: [Numpy-discussion] Making numpy sensible: backward compatibility please In-Reply-To: <20120928212327.GA22708@phare.normalesup.org> References: <20120928212327.GA22708@phare.normalesup.org> Message-ID: On Fri, Sep 28, 2012 at 10:23 PM, Gael Varoquaux < gael.varoquaux at normalesup.org> wrote: > Hi numpy developers, > > First of all, thanks a lot for the hard work you put in numpy. I know > very well that maintaining such a core library is a lot of effort and a > service to the community. But "with great dedication, comes great > responsibility" :). There've been several long discussions about this on numpy-discussion over the last few months, actually... a few that I remember off the top of my head: http://mail.scipy.org/pipermail/numpy-discussion/2012-May/062496.html http://www.mail-archive.com/numpy-discussion at scipy.org/msg37500.html http://mail.scipy.org/pipermail/numpy-discussion/2012-May/thread.html#62298 > I find that Numpy is a bit of a wild horse, a moving target. I have just > fixed a fairly nasty bug in scikit-learn [1] that was introduced by > change of semantics in ordering when doing copies with numpy. I have been > running working and developing the scikit-learn while tracking numpy's > development tree and, as far as I can tell, I never saw warnings raised > in our code that something was going to change, or had changed. It looks like this is a bug caused by the 1.7 pre-release versions? Have you reported it? 
(I swear I saw some code go by recently that involved guessing array orders to match the input, but I can't recall where.) > In other settings, changes in array inheritance and 'base' propagation > have made impossible some of our memmap-related usecase that used to work > under previous numpy [2]. Other's have been hitting difficulties related > to these changes in behavior [3]. Not to mention the new casting rules > (default: 'same_kind') that break a lot of code, or the ABI change that, > while not done an purpose, ended up causing us a lot of pain. The same_kind rule change has been reverted for 1.7 for exactly this reason, and several dozen changes have gone in in the last month or two trying to clear up all the little regressions we've found so far in 1.7pre. And we've been trying to be more rigorous about following a formal deprecation schedule in general. https://github.com/numpy/numpy/pull/440 https://github.com/numpy/numpy/pull/451 https://github.com/numpy/numpy/pull/280 https://github.com/numpy/numpy/pull/350 I have mixed feelings about the .base change. If it were possible to do a deprecation period I'd definitely be in favor, but I don't see how, unless we were to remove accessing it from python altogether, and that's pretty unlikely. The problem is it's an attractive nuisance; the *only* reliable thing you've *ever* been able to do with it is pin an object in memory when constructing an array directly, but people keep expecting more, so all the breakages have been in code that was IMHO already on thin ice. And from my point of view it wouldn't be the *most* terrible thing if the result here is that you're forced to make memmap pickling work in numpy directly for everybody ;-). But I go back and forth in my own mind, because of the things you say. Other ideas very welcome. (Maybe we should rename the python attribute to ._base - with appropriate deprecation period of course - just to encourage people to stop doing unwise things with it, and *then* make the change that's tripping you up now?) > My point here is that having code that works and gives correct results > with new releases of numpy is more challenging that it should be. I > cannot claim that I disagree with the changes that I mention above. They > were all implemented for a good reason and can all be considered as > overall improvements to numpy. However the situation is that given a > complex codebase relying on numpy that works at a time t, the chances > that it works flawlessly at time t + 1y are thin. I am not too proud that > we managed to release scikit-learn 0.12 with a very ugly bug under numpy > 1.7. That happened although we have 90% of test coverage, buildbots under > different numpy versions, and a lot of people, including me, using our > development tree on a day to day basis with bleeding edge numpy. Most > code in research settings or RD industry does not benefit from such > software engineering and I believe is much more likely to suffer from > changes in numpy. > > I think that this is a cultural issue: priority is not given to stability > and backward compatibility. I think that this culture is very much > ingrained in the Python world, that likes iteratively cleaning its > software design. For instance, I have the feeling that in the > scikit-learn, we probably fall in the same trap. That said, such a > behavior cannot fare well for a base scientific environment. People tell > me that if they take old matlab code, the odds that it will still works > is much higher than with Python code. 
As a geek, I tend to reply that we > get a lot out of this mobility, because we accumulate less cruft. > However, in research settings, for reproducibility reasons, ones need to > be able to pick up an old codebase and trust its results without knowing > its intricacies. > > >From a practical standpoint, I believe that people implementing large > changes to the numpy codebase, or any other core scipy package, should > think really hard about their impact. I do realise that the changes are > discussed on the mailing lists, but there is a lot of activity to follow > and I don't believe that it is possible for many of us to monitor the > discussions. Also, putting more emphasis on backward compatibility is > possible. For instance, the 'order' parameter added to np.copy could have > defaulted to the old behavior, 'K', for a year, with a > DeprecationWarning, same thing for the casting rules. Maybe it still can, but you have to tell us details :-) In general numpy development just needs more people keeping track of these things. If you want to keep an open source stack functional sometimes you have to pay a tax of your time to making sure the levels below you will continue to suit your needs. -n > Thank you for reading this long email. I don't mean it to be a complaint > about the past, but more a suggestion on something to keep in mind when > making changes to core projects. > > Cheers, > > Ga?l > > ____ > > [1] https://github.com/scikit-learn/scikit-learn/commit/7842748cf777412c506a8c0ed28090711d3a3783 > > [2] http://mail.scipy.org/pipermail/numpy-discussion/2012-September/063985.html > > [3] http://mail.scipy.org/pipermail/numpy-discussion/2012-July/063126.html > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Fri Sep 28 22:22:28 2012 From: travis at continuum.io (Travis Oliphant) Date: Fri, 28 Sep 2012 21:22:28 -0500 Subject: [Numpy-discussion] Making numpy sensible: backward compatibility please In-Reply-To: References: <20120928212327.GA22708@phare.normalesup.org> Message-ID: <2FBBB67D-B767-44E8-9DCE-DFCB77ABCEFB@continuum.io> > > > > >From a practical standpoint, I believe that people implementing large > > changes to the numpy codebase, or any other core scipy package, should > > think really hard about their impact. I do realise that the changes are > > discussed on the mailing lists, but there is a lot of activity to follow > > and I don't believe that it is possible for many of us to monitor the > > discussions. Also, putting more emphasis on backward compatibility is > > possible. For instance, the 'order' parameter added to np.copy could have > > defaulted to the old behavior, 'K', for a year, with a > > DeprecationWarning, same thing for the casting rules. > > Maybe it still can, but you have to tell us details :-) > > In general numpy development just needs more people keeping track of these things. If you want to keep an open source stack functional sometimes you have to pay a tax of your time to making sure the levels below you will continue to suit your needs. > > Thanks for the thorough and thoughtful response. Well spoken... -Travis > -n > > > Thank you for reading this long email. I don't mean it to be a complaint > > about the past, but more a suggestion on something to keep in mind when > > making changes to core projects. 
> > > > Cheers, > > > > Ga?l > > > > ____ > > > > [1] https://github.com/scikit-learn/scikit-learn/commit/7842748cf777412c506a8c0ed28090711d3a3783 > > > > [2] http://mail.scipy.org/pipermail/numpy-discussion/2012-September/063985.html > > > > [3] http://mail.scipy.org/pipermail/numpy-discussion/2012-July/063126.html > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Sat Sep 29 05:16:00 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 29 Sep 2012 11:16:00 +0200 Subject: [Numpy-discussion] Making numpy sensible: backward compatibility please In-Reply-To: References: <20120928212327.GA22708@phare.normalesup.org> Message-ID: <20120929091600.GA13677@phare.normalesup.org> On Fri, Sep 28, 2012 at 07:37:35PM -0600, Charles R Harris wrote: > +1, I think we should?endeavor?to have a respectful and welcoming > community. > With a bit of humour now and then among the old timers, no? Look, I've been > outright insulted on the list and I don't recall anyone weighing in about > civility at the time. But nor am I a tender flower. But we are going way off > topic, no? I think that Chuck and I know each other well enough to understand that this pique is not intended to be disrespectful and is full of humour. I bystander might feel different, but I am actually laughing and smiling at Chuck's answer. I know that he is serious and means it, and respect his point of view I somehow feel that this is just a very opinionated way of expressing his enthusiasm about numpy, almost like a Frenchman would do :}. Chuck: point taken. I agree with some of your points, and I am certainly not taking it badly. I know you well enough to know that you are an enthusiastic, and very friendly person who invests a lot of time in numpy. Next time I see you, I owe you a beer for making you cross :). I should be at scipy next year. I'll settle my debt in beer hopefully, if I have the chance to see you. Take care, Ga?l From gael.varoquaux at normalesup.org Sat Sep 29 06:43:32 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 29 Sep 2012 12:43:32 +0200 Subject: [Numpy-discussion] Making numpy sensible: backward compatibility please In-Reply-To: References: <20120928212327.GA22708@phare.normalesup.org> Message-ID: <20120929104331.GB13677@phare.normalesup.org> Hi Nathaniel, First of all, thanks for your investment in numpy. You have really been moving the project forward lately. On Sat, Sep 29, 2012 at 03:03:01AM +0100, Nathaniel Smith wrote: > > I have just fixed a fairly nasty bug in scikit-learn that was > > introduced by change of semantics in ordering when doing copies with > > numpy. > It looks like this is a bug caused by the 1.7 pre-release versions? Have you > reported it? I just found this bug yesterday morning, because a user reported a bug in scikit-learn. I wrote this email after fixing our bug using the "order='K'" option of np.copy. I think that we didn't find this problem because all the core developers of scikit-learn are very careful to pass in their arrays in the right ordering to avoid copies. 
That said, you have a point that this also reveals a failure in our test suite: we don't test systematically for fortran and C ordered inputs. We probably should. > The same_kind rule change has been reverted for 1.7 for exactly this reason, Sorry, I haven't been following well enough. I think that this is probably a good idea. I would vote for warnings to be raised (maybe it is the case), and maybe in the long term (2.0) relying on the same_kind rule. > And we've been trying to be more rigorous about following a formal > deprecation schedule in general. Yes, I think that formal deprecation schedules are important. We try to do the same in scikit-learn. It's a pain and as developers we have to force ourselves, but it's useful for users. > I have mixed feelings about the .base change. I like it. I think that it's useful. I just think that it's implications are not fully understood yet, and that new mechanisms need to be offered to replace what it changed. > all the breakages have been in code that was IMHO already on thin ice. But that served useful usecases. > And from my point of view it wouldn't be the *most* terrible thing if > the result here is that you're forced to make memmap pickling work in > numpy directly for everybody ;-). I am not sure that I understand your sentence here. Actually, getting off topic here, but would people be interested in discussing a technical solution to our problem: http://mail.scipy.org/pipermail/numpy-discussion/2012-September/063985.html i.e. finding the filename and offset of an array inheriting for memmapped memory, when such a filename exists. I am ready to put in effort and send in a patch, but before writing such a patch, I'd like to have some consensus of core developers on an acceptable solution. > > For instance, the 'order' parameter added to np.copy could have > > defaulted to the old behavior, 'K', for a year, with a > > DeprecationWarning, same thing for the casting rules. > Maybe it still can, but you have to tell us details :-) Well, I would think that having a default "order='K'" in np.copy, and adding such an argument to ndarray.copy would avoid breakage. > In general numpy development just needs more people keeping track of > these things. If you want to keep an open source stack functional > sometimes you have to pay a tax of your time to making sure the levels > below you will continue to suit your needs. I partly agree. I think that it goes both ways. I think that downstream needs to follow upstream, which I do, by running the development tree on my work desktop. Downstream needs to push bugs and difficulties upstream. On the other hand, upstream developers should think in terms of impact and deprecation. When somebody (I can't remember whether it was Joseph Perktold or Christophe Gohlke) ran different downstream package with the an RC of numpy or scipy a few months ago, that was terribly useful. I can offer only limited help, as my schedule is way too packed. On the one hand I may appear as useless to the community because I spend a lot of time in meetings, managing students, or writing grant proposals or papers, instead of following the technical developments. On the other hand, such activities enable my to hire people to work on open source software, to have students that invest time on those technologies, and to have the scipy ecosystem be accepted by the 'establishment'. 
To avoid having the discussion going in circles, what I can suggest in concrete terms with my limited time available is: - I can invest time on the mmap issue, and work with core numpy developers on a patch - I suggest that np.copy and ndarray.copy should take an order='K' as default (ndarray.copy does not have such a keyword argument). Let's make numpy 1.7 rock! Cheers, Ga?l PS: rereading my previous mail in the thread, I found that it was full of sentences that did not make grammatical sens. I apologize for this. It may look like I am writing my mails hastily, but in fact I am very slightly dyslexic, and I tend not to see missing words or letters. From ondrej.certik at gmail.com Sat Sep 29 18:09:08 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Sat, 29 Sep 2012 15:09:08 -0700 Subject: [Numpy-discussion] Making numpy sensible: backward compatibility please In-Reply-To: <20120929091600.GA13677@phare.normalesup.org> References: <20120928212327.GA22708@phare.normalesup.org> <20120929091600.GA13677@phare.normalesup.org> Message-ID: On Sat, Sep 29, 2012 at 2:16 AM, Gael Varoquaux wrote: > On Fri, Sep 28, 2012 at 07:37:35PM -0600, Charles R Harris wrote: >> +1, I think we should endeavor to have a respectful and welcoming >> community. > >> With a bit of humour now and then among the old timers, no? Look, I've been >> outright insulted on the list and I don't recall anyone weighing in about >> civility at the time. But nor am I a tender flower. But we are going way off >> topic, no? > > I think that Chuck and I know each other well enough to understand that > this pique is not intended to be disrespectful and is full of humour. I > bystander might feel different, but I am actually laughing and smiling at > Chuck's answer. I know that he is serious and means it, and respect his > point of view I somehow feel that this is just a very opinionated way of > expressing his enthusiasm about numpy, almost like a Frenchman would do > :}. > > Chuck: point taken. I agree with some of your points, and I am certainly > not taking it badly. I know you well enough to know that you are an > enthusiastic, and very friendly person who invests a lot of time in > numpy. Next time I see you, I owe you a beer for making you cross :). I > should be at scipy next year. I'll settle my debt in beer hopefully, if I > have the chance to see you. I agree with what Fernando wrote above. If possible, please let's keep the emails respectful. I am glad that you didn't get offended Gael and I am sure that you Chuck wouldn't be either with such a response. However, for me, as a newcomer to the numpy development, I didn't know that you Chuck and Gael know each other well. And even if you know each other well and while it's ok with either of you, it might not be for other people, and it definitely creates an unwelcoming atmosphere on the mailinglist. If I can offer a suggestion that works for me --- I personally love to say what I frankly think, and I am not against swear words at all, I believe it's better to express what I think in one simple word, than in two paragraphs --- but it only works over a beer if I know that nobody will get offended. Over private email it's harder and on public mailinglist, where a lot of people will read it who have never met me, it's counter productive. Chuck, Gael, here is my todo list for the 1.7.0 release: https://github.com/numpy/numpy/issues/396 Is there anything missing in terms of backwards compatibility? 
Gael, if you have a minute, would you mind creating issues for things that broke since the 1.6.2 release that are not mentioned in #396 yet? It's my top priority to make sure that as little things break as possible by the new 1.7.0 release, that's also why we decided to release two beta versions to give people more chance to test it well. My apologies that it has taken me longer than it should have to get it all done. Ondrej P.S. For beers I would love to join both of you anytime. :) From paul.anton.letnes at gmail.com Sun Sep 30 10:14:45 2012 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Sun, 30 Sep 2012 16:14:45 +0200 Subject: [Numpy-discussion] memory-efficient loadtxt Message-ID: Hello everyone, I've modified loadtxt to make it (potentially) more memory efficient. The idea is that if a user passes a seekable file, (s)he can also pass the 'seekable=True' kwarg. Then, loadtxt will count the number of lines (containing data) and allocate an array of exactly the right size to hold the loaded data. The downside is that the line counting more than doubles the runtime, as it loops over the file twice, and there's a sort-of unnecessary np.array function call in the loop. The branch is called faster-loadtxt, which is silly due to the runtime doubling, but I'm hoping that the false advertising is acceptable :) (I naively expected a speedup by removing some needless list manipulation.) I'm pretty sure that the function can be micro-optimized quite a bit here and there, and in particular, the main for loop is a bit duplicated right now. However, I got the impression that someone was working on a More Advanced (TM) C-based file reader, which will replace loadtxt; this patch is intended as a useful thing to have while we're waiting for that to appear. The patch passes all tests in the test suite, and documentation for the kwarg has been added. I've modified all tests to include the seekable kwarg, but that was mostly to check that all tests are passed also with this kwarg. I guess it's bit too late for 1.7.0 though? Should I make a pull request? I'm happy to take any and all suggestions before I do. Cheers Paul From paul.anton.letnes at gmail.com Sun Sep 30 10:16:41 2012 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Sun, 30 Sep 2012 16:16:41 +0200 Subject: [Numpy-discussion] memory-efficient loadtxt In-Reply-To: References: Message-ID: For convenience and clarity, this is the diff in question: https://github.com/Dynetrekk/numpy-1/commit/5bde67531a2005ef80a2690a75c65cebf97c9e00 And this is my numpy fork: https://github.com/Dynetrekk/numpy-1/ Paul On Sun, Sep 30, 2012 at 4:14 PM, Paul Anton Letnes wrote: > Hello everyone, > > I've modified loadtxt to make it (potentially) more memory efficient. > The idea is that if a user passes a seekable file, (s)he can also pass > the 'seekable=True' kwarg. Then, loadtxt will count the number of > lines (containing data) and allocate an array of exactly the right > size to hold the loaded data. The downside is that the line counting > more than doubles the runtime, as it loops over the file twice, and > there's a sort-of unnecessary np.array function call in the loop. The > branch is called faster-loadtxt, which is silly due to the runtime > doubling, but I'm hoping that the false advertising is acceptable :) > (I naively expected a speedup by removing some needless list > manipulation.) 
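A rough, standalone sketch of the two-pass idea described above (not the actual faster-loadtxt patch; the function name and the simplified comment/delimiter handling are assumptions made only for illustration):

    import numpy as np

    def loadtxt_two_pass(fname, dtype=float, comments='#', delimiter=None):
        # Pass 1: count the data rows so the output array can be allocated once.
        with open(fname) as fh:
            nrows = 0
            ncols = 0
            for line in fh:
                line = line.split(comments)[0].strip()
                if not line:
                    continue
                if nrows == 0:
                    ncols = len(line.split(delimiter))
                nrows += 1
            # Pass 2: rewind (hence the need for a seekable file) and fill the
            # pre-allocated array in place instead of growing a list of rows.
            fh.seek(0)
            out = np.empty((nrows, ncols), dtype=dtype)
            i = 0
            for line in fh:
                line = line.split(comments)[0].strip()
                if not line:
                    continue
                out[i] = [dtype(x) for x in line.split(delimiter)]
                i += 1
        return out

The patch itself wires this counting pass into np.loadtxt behind the proposed seekable=True keyword rather than adding a separate function.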
> > I'm pretty sure that the function can be micro-optimized quite a bit > here and there, and in particular, the main for loop is a bit > duplicated right now. However, I got the impression that someone was > working on a More Advanced (TM) C-based file reader, which will > replace loadtxt; this patch is intended as a useful thing to have > while we're waiting for that to appear. > > The patch passes all tests in the test suite, and documentation for > the kwarg has been added. I've modified all tests to include the > seekable kwarg, but that was mostly to check that all tests are passed > also with this kwarg. I guess it's bit too late for 1.7.0 though? > > Should I make a pull request? I'm happy to take any and all > suggestions before I do. > > Cheers > Paul From beamesleach at gmail.com Sun Sep 30 11:14:19 2012 From: beamesleach at gmail.com (Alex Leach) Date: Sun, 30 Sep 2012 15:14:19 +0000 (UTC) Subject: [Numpy-discussion] Should abs([nan]) be supported? References: Message-ID: Ond?ej ?ert?k gmail.com> writes: > > On Tue, Sep 4, 2012 at 8:38 PM, Travis Oliphant continuum.io> > wrote: > I think the test should be changed to check for RuntimeWarning on some of > the cases. This might take a little work as it looks like the code uses > generators across multiple tests and > would have to be changed to handle expecting warnings. > > > > Alternatively, the error context can be set before the test runs and then > > restored afterwords: > > > > olderr = np.seterr(invalid='ignore') > > abs(a) > > np.seterr(**olderr) > > > > > > or, using an errstate context --- > > > > with np.errstate(invalid='ignore'): > > abs(a) > > I see --- so abs([nan]) should emit a warning, but in the test we > should suppress it. > I'll work on that. > Just witnessing this error now.. I'm building numpy on 64bit Linux with Intel's icc, and on OSX Mountain Lion with clang. I thought it was a problem with the built-in abs, as I used the following test case:- $ /usr/local/bin/python >>> import numpy as np >>> abs(np.array([1e-08, 1, 1000020.0000000099] ) - \ ... np.array([0, np.nan, 1000000.0] ) ) With the OSX clang build, this returns without an error message. array([ 1.00000000e-08, nan, 2.00000000e+01]) But on Ubuntu with my icc build of Python-2.7.3, I get that FloatingPointError, and corresponding numpy test failures. I figure this is caused by a compile flag that I did or didn't use, so dug around the icc man page, and think I found the cause for it... I built my Ubuntu python with the '-fp-model strict' option, as per recommendations I've seen, but this turns on floating point exceptions, so I'm going to rebuild with '-fp-model precise -fp-model source', and see how it goes... Cheers, Alex From gael.varoquaux at normalesup.org Sun Sep 30 11:56:57 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 30 Sep 2012 17:56:57 +0200 Subject: [Numpy-discussion] Making numpy sensible: backward compatibility please In-Reply-To: References: <20120928212327.GA22708@phare.normalesup.org> <20120929091600.GA13677@phare.normalesup.org> Message-ID: <20120930155657.GA1253@phare.normalesup.org> On Sat, Sep 29, 2012 at 03:09:08PM -0700, Ond?ej ?ert?k wrote: > Chuck, Gael, here is my todo list for the 1.7.0 release: > https://github.com/numpy/numpy/issues/396 I have created issues and mentionned them in the comments on your issue. 
Cheers, Ga?l From njs at pobox.com Sun Sep 30 14:17:42 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 30 Sep 2012 19:17:42 +0100 Subject: [Numpy-discussion] Memory order of array copies Message-ID: There are three basic Python APIs to copy an array in numpy: a.copy(): has always returned a C-contiguous array by default. has always taken an order= argument, which defaults to "C". np.array(a, copy=True): by default, produces an array with whatever memory ordering that 'a' had. Can also specify order="C", "F" to get C or Fortran contiguity instead. np.copy(a): has always been a simple alias for np.array(a, copy=True), which means that it also preserves memory ordering. BUT in current master and the 1.7 betas, an extra argument order= has been added, and this has been set to default to "C" ordering. The extra argument and behavioural change occurred in 0e1a4e95. It's not clear why; the change isn't mentioned in the commit message. The change has to be reverted for 1.7, at least, because it caused regressions in scikit-learn (and presumably other packages too). So the question is, what *should* these interface look like. Then we can figure out what kind of transition scheme is needed, if any. My gut reaction is that if we want to change this at all from it's pre-1.7 status quo, it would be the opposite of the change that was made in master... I'd expect np.copy(a) and a.copy() to return arrays that are as nearly-identical to 'a' as possible, unless I explicitly requested something different by passing an order= argument. But, I bet there is code out there that's doing something like: my_fragile_C_function_wrapper(a.copy()) when it really should be doing something more explicit like my_fragile_C_function_wrapper(np.array(a, copy=False, order="C", dtype=float)) i.e., they're depending on the current behaviour where a.copy() normalizes order. I don't see any way to detect these cases and issue a proper warning, so we may not be able to change this at all. Any ideas? Is there anything better to do than simply revert np.copy() to its traditional behaviour and accept that np.copy(a) and a.copy() will continue to have different semantics indefinitely? -n From gael.varoquaux at normalesup.org Sun Sep 30 14:22:59 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 30 Sep 2012 20:22:59 +0200 Subject: [Numpy-discussion] Memory order of array copies In-Reply-To: References: Message-ID: <20120930182259.GC15857@phare.normalesup.org> On Sun, Sep 30, 2012 at 07:17:42PM +0100, Nathaniel Smith wrote: > Is there anything better to do than simply revert np.copy() to its > traditional behaviour and accept that np.copy(a) and a.copy() will > continue to have different semantics indefinitely? Have np.copy take an 'order=None', which would translate to 'K'. Detect 'None' as a sentinel that order as not been specified. If the order is not specified, raise a FutureWarning that np.copy will change semantics in 2 releases. In two releases, do the change. That's how I would deal with it. From nouiz at nouiz.org Sun Sep 30 14:41:14 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Sun, 30 Sep 2012 14:41:14 -0400 Subject: [Numpy-discussion] Memory order of array copies In-Reply-To: <20120930182259.GC15857@phare.normalesup.org> References: <20120930182259.GC15857@phare.normalesup.org> Message-ID: As always, I think it is better to don't change the default behaviour. There is many people that don't update frequently and 2 releases is not enough. 
This will lead to many hard to find bug. This will also give the impression what we can't rely on numpy default behaviour and numpy is not stable. As a rule of thumb, we need to compare the benefit and consequence of changing default behaviour. In this case I see only a marginal speed gain (marginal in the sense that in the global user script, this won't matter, but locally it could be significant) vs silent and hard to find bug. If speed in that case is important, i think it would be much better to write an optimizer version that will take stride and cache line length into account. Even if we hard code the cache line lenght, this will probably bring most of the local speed up, without the inconvenient. If people still want to do this change, I think only a big release like numpy 2.0 make this acceptable but with the warning as Gael told. But I still prefer it not done and if people matter about the speed, they can write optimized code. Fred On Sun, Sep 30, 2012 at 2:22 PM, Gael Varoquaux wrote: > On Sun, Sep 30, 2012 at 07:17:42PM +0100, Nathaniel Smith wrote: >> Is there anything better to do than simply revert np.copy() to its >> traditional behaviour and accept that np.copy(a) and a.copy() will >> continue to have different semantics indefinitely? > > Have np.copy take an 'order=None', which would translate to 'K'. Detect > 'None' as a sentinel that order as not been specified. If the order is > not specified, raise a FutureWarning that np.copy will change semantics > in 2 releases. In two releases, do the change. > > That's how I would deal with it. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From travis at continuum.io Sun Sep 30 15:59:58 2012 From: travis at continuum.io (Travis Oliphant) Date: Sun, 30 Sep 2012 14:59:58 -0500 Subject: [Numpy-discussion] Behavior of .base Message-ID: <15149FE7-75A4-4335-912C-92433F0AC98A@continuum.io> Hey all, In a github-discussion with Gael and Nathaniel, we came up with a proposal for .base that we should put before this list. Traditionally, .base has always pointed to None for arrays that owned their own memory and to the "most immediate" array object parent for arrays that did not own their own memory. There was a long-standing issue related to running out of stack space that this behavior created. Recently this behavior was altered so that .base always points to "the original" object holding the memory (something exposing the buffer interface). This created some problems for users who relied on the fact that most of the time .base pointed to an instance of an array object. The proposal here is to change the behavior of .base for arrays that don't own their own memory so that the .base attribute of an array points to "the most original object" that is still an instance of the type of the array. This would go into the 1.7.0 release so as to correct the issues reported. What are reactions to this proposal? -Travis From hangenuit at gmail.com Sun Sep 30 16:30:52 2012 From: hangenuit at gmail.com (Han Genuit) Date: Sun, 30 Sep 2012 22:30:52 +0200 Subject: [Numpy-discussion] Behavior of .base In-Reply-To: <15149FE7-75A4-4335-912C-92433F0AC98A@continuum.io> References: <15149FE7-75A4-4335-912C-92433F0AC98A@continuum.io> Message-ID: On Sun, Sep 30, 2012 at 9:59 PM, Travis Oliphant wrote: > Hey all, > > In a github-discussion with Gael and Nathaniel, we came up with a proposal for .base that we should put before this list. 
Traditionally, .base has always pointed to None for arrays that owned their own memory and to the "most immediate" array object parent for arrays that did not own their own memory. There was a long-standing issue related to running out of stack space that this behavior created. > > Recently this behavior was altered so that .base always points to "the original" object holding the memory (something exposing the buffer interface). This created some problems for users who relied on the fact that most of the time .base pointed to an instance of an array object. > > The proposal here is to change the behavior of .base for arrays that don't own their own memory so that the .base attribute of an array points to "the most original object" that is still an instance of the type of the array. This would go into the 1.7.0 release so as to correct the issues reported. > > What are reactions to this proposal? > > -Travis > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion I think the current behaviour of the .base attribute is much more stable and predictable than past behaviour. For views for instance, this makes sure you don't hold references of 'intermediate' views, but always point to the original *base* object. Also, I think a lot of internal logic depends on this behaviour, so I am not in favour of changing this back (yet) again. Also, considering that this behaviour already exists in past versions of NumPy, namely 1.6, and is very fundamental to how arrays work, I find it strange that it is now up for change in 1.7 at the last minute. From travis at continuum.io Sun Sep 30 16:35:16 2012 From: travis at continuum.io (Travis Oliphant) Date: Sun, 30 Sep 2012 15:35:16 -0500 Subject: [Numpy-discussion] Behavior of .base In-Reply-To: References: <15149FE7-75A4-4335-912C-92433F0AC98A@continuum.io> Message-ID: <2747FDE3-B109-483F-BFE3-1C5BE140E592@continuum.io> We are not talking about changing it "back". The change in 1.6 caused problems that need to be addressed. Can you clarify your concerns? The proposal is not a major change to the behavior on master, but it does fix a real issue. -- Travis Oliphant (on a mobile) 512-826-7480 On Sep 30, 2012, at 3:30 PM, Han Genuit wrote: > On Sun, Sep 30, 2012 at 9:59 PM, Travis Oliphant wrote: >> Hey all, >> >> In a github-discussion with Gael and Nathaniel, we came up with a proposal for .base that we should put before this list. Traditionally, .base has always pointed to None for arrays that owned their own memory and to the "most immediate" array object parent for arrays that did not own their own memory. There was a long-standing issue related to running out of stack space that this behavior created. >> >> Recently this behavior was altered so that .base always points to "the original" object holding the memory (something exposing the buffer interface). This created some problems for users who relied on the fact that most of the time .base pointed to an instance of an array object. >> >> The proposal here is to change the behavior of .base for arrays that don't own their own memory so that the .base attribute of an array points to "the most original object" that is still an instance of the type of the array. This would go into the 1.7.0 release so as to correct the issues reported. >> >> What are reactions to this proposal? 
>> >> -Travis >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > I think the current behaviour of the .base attribute is much more > stable and predictable than past behaviour. For views for instance, > this makes sure you don't hold references of 'intermediate' views, but > always point to the original *base* object. Also, I think a lot of > internal logic depends on this behaviour, so I am not in favour of > changing this back (yet) again. > > Also, considering that this behaviour already exists in past versions > of NumPy, namely 1.6, and is very fundamental to how arrays work, I > find it strange that it is now up for change in 1.7 at the last > minute. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From hangenuit at gmail.com Sun Sep 30 16:50:38 2012 From: hangenuit at gmail.com (Han Genuit) Date: Sun, 30 Sep 2012 22:50:38 +0200 Subject: [Numpy-discussion] Behavior of .base In-Reply-To: <2747FDE3-B109-483F-BFE3-1C5BE140E592@continuum.io> References: <15149FE7-75A4-4335-912C-92433F0AC98A@continuum.io> <2747FDE3-B109-483F-BFE3-1C5BE140E592@continuum.io> Message-ID: On Sun, Sep 30, 2012 at 10:35 PM, Travis Oliphant wrote: > We are not talking about changing it "back". The change in 1.6 caused problems that need to be addressed. > > Can you clarify your concerns? The proposal is not a major change to the behavior on master, but it does fix a real issue. > > -- > Travis Oliphant > (on a mobile) > 512-826-7480 > > > On Sep 30, 2012, at 3:30 PM, Han Genuit wrote: > >> On Sun, Sep 30, 2012 at 9:59 PM, Travis Oliphant wrote: >>> Hey all, >>> >>> In a github-discussion with Gael and Nathaniel, we came up with a proposal for .base that we should put before this list. Traditionally, .base has always pointed to None for arrays that owned their own memory and to the "most immediate" array object parent for arrays that did not own their own memory. There was a long-standing issue related to running out of stack space that this behavior created. >>> >>> Recently this behavior was altered so that .base always points to "the original" object holding the memory (something exposing the buffer interface). This created some problems for users who relied on the fact that most of the time .base pointed to an instance of an array object. >>> >>> The proposal here is to change the behavior of .base for arrays that don't own their own memory so that the .base attribute of an array points to "the most original object" that is still an instance of the type of the array. This would go into the 1.7.0 release so as to correct the issues reported. >>> >>> What are reactions to this proposal? >>> >>> -Travis >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> I think the current behaviour of the .base attribute is much more >> stable and predictable than past behaviour. For views for instance, >> this makes sure you don't hold references of 'intermediate' views, but >> always point to the original *base* object. Also, I think a lot of >> internal logic depends on this behaviour, so I am not in favour of >> changing this back (yet) again. 
>> >> Also, considering that this behaviour already exists in past versions >> of NumPy, namely 1.6, and is very fundamental to how arrays work, I >> find it strange that it is now up for change in 1.7 at the last >> minute. >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion Well, the current behaviour makes sure you can have an endless chain of views derived from each other without keeping a copy of each view alive. If I understand correctly, you propose to change this behaviour to where it would keep a copy of each view alive.. My concern is that the problems that occurred from the 1.6 change are now seen as paramount above a correct implementation. There are problems with backward compatibility, but most of these are due to lack of documentation and testing. And now there will be a lot of people depending on the new behaviour, which is also something to take into account. From travis at continuum.io Sun Sep 30 16:55:51 2012 From: travis at continuum.io (Travis Oliphant) Date: Sun, 30 Sep 2012 15:55:51 -0500 Subject: [Numpy-discussion] Behavior of .base In-Reply-To: References: <15149FE7-75A4-4335-912C-92433F0AC98A@continuum.io> <2747FDE3-B109-483F-BFE3-1C5BE140E592@continuum.io> Message-ID: <73176955-DD3D-42E6-A5EA-3BC5F17AA1BF@continuum.io> I think you are misunderstanding the proposal. The proposal is to traverse the views as far as you can but stop just short of having base point to an object of a different type. This fixes the infinite chain of views problem but also fixes the problem sklearn was having with base pointing to an unexpected mmap object. -- Travis Oliphant (on a mobile) 512-826-7480 On Sep 30, 2012, at 3:50 PM, Han Genuit wrote: > On Sun, Sep 30, 2012 at 10:35 PM, Travis Oliphant wrote: >> We are not talking about changing it "back". The change in 1.6 caused problems that need to be addressed. >> >> Can you clarify your concerns? The proposal is not a major change to the behavior on master, but it does fix a real issue. >> >> -- >> Travis Oliphant >> (on a mobile) >> 512-826-7480 >> >> >> On Sep 30, 2012, at 3:30 PM, Han Genuit wrote: >> >>> On Sun, Sep 30, 2012 at 9:59 PM, Travis Oliphant wrote: >>>> Hey all, >>>> >>>> In a github-discussion with Gael and Nathaniel, we came up with a proposal for .base that we should put before this list. Traditionally, .base has always pointed to None for arrays that owned their own memory and to the "most immediate" array object parent for arrays that did not own their own memory. There was a long-standing issue related to running out of stack space that this behavior created. >>>> >>>> Recently this behavior was altered so that .base always points to "the original" object holding the memory (something exposing the buffer interface). This created some problems for users who relied on the fact that most of the time .base pointed to an instance of an array object. >>>> >>>> The proposal here is to change the behavior of .base for arrays that don't own their own memory so that the .base attribute of an array points to "the most original object" that is still an instance of the type of the array. This would go into the 1.7.0 release so as to correct the issues reported. >>>> >>>> What are reactions to this proposal? 
>>>> >>>> -Travis >>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> I think the current behaviour of the .base attribute is much more >>> stable and predictable than past behaviour. For views for instance, >>> this makes sure you don't hold references of 'intermediate' views, but >>> always point to the original *base* object. Also, I think a lot of >>> internal logic depends on this behaviour, so I am not in favour of >>> changing this back (yet) again. >>> >>> Also, considering that this behaviour already exists in past versions >>> of NumPy, namely 1.6, and is very fundamental to how arrays work, I >>> find it strange that it is now up for change in 1.7 at the last >>> minute. >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > Well, the current behaviour makes sure you can have an endless chain > of views derived from each other without keeping a copy of each view > alive. If I understand correctly, you propose to change this behaviour > to where it would keep a copy of each view alive.. My concern is that > the problems that occurred from the 1.6 change are now seen as > paramount above a correct implementation. There are problems with > backward compatibility, but most of these are due to lack of > documentation and testing. And now there will be a lot of people > depending on the new behaviour, which is also something to take into > account. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From hangenuit at gmail.com Sun Sep 30 17:00:29 2012 From: hangenuit at gmail.com (Han Genuit) Date: Sun, 30 Sep 2012 23:00:29 +0200 Subject: [Numpy-discussion] Behavior of .base In-Reply-To: <73176955-DD3D-42E6-A5EA-3BC5F17AA1BF@continuum.io> References: <15149FE7-75A4-4335-912C-92433F0AC98A@continuum.io> <2747FDE3-B109-483F-BFE3-1C5BE140E592@continuum.io> <73176955-DD3D-42E6-A5EA-3BC5F17AA1BF@continuum.io> Message-ID: On Sun, Sep 30, 2012 at 10:55 PM, Travis Oliphant wrote: > I think you are misunderstanding the proposal. The proposal is to traverse the views as far as you can but stop just short of having base point to an object of a different type. > > This fixes the infinite chain of views problem but also fixes the problem sklearn was having with base pointing to an unexpected mmap object. > > -- > Travis Oliphant > (on a mobile) > 512-826-7480 > > > On Sep 30, 2012, at 3:50 PM, Han Genuit wrote: > >> On Sun, Sep 30, 2012 at 10:35 PM, Travis Oliphant wrote: >>> We are not talking about changing it "back". The change in 1.6 caused problems that need to be addressed. >>> >>> Can you clarify your concerns? The proposal is not a major change to the behavior on master, but it does fix a real issue. 
>>> >>> -- >>> Travis Oliphant >>> (on a mobile) >>> 512-826-7480 >>> >>> >>> On Sep 30, 2012, at 3:30 PM, Han Genuit wrote: >>> >>>> On Sun, Sep 30, 2012 at 9:59 PM, Travis Oliphant wrote: >>>>> Hey all, >>>>> >>>>> In a github-discussion with Gael and Nathaniel, we came up with a proposal for .base that we should put before this list. Traditionally, .base has always pointed to None for arrays that owned their own memory and to the "most immediate" array object parent for arrays that did not own their own memory. There was a long-standing issue related to running out of stack space that this behavior created. >>>>> >>>>> Recently this behavior was altered so that .base always points to "the original" object holding the memory (something exposing the buffer interface). This created some problems for users who relied on the fact that most of the time .base pointed to an instance of an array object. >>>>> >>>>> The proposal here is to change the behavior of .base for arrays that don't own their own memory so that the .base attribute of an array points to "the most original object" that is still an instance of the type of the array. This would go into the 1.7.0 release so as to correct the issues reported. >>>>> >>>>> What are reactions to this proposal? >>>>> >>>>> -Travis >>>>> >>>>> >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> I think the current behaviour of the .base attribute is much more >>>> stable and predictable than past behaviour. For views for instance, >>>> this makes sure you don't hold references of 'intermediate' views, but >>>> always point to the original *base* object. Also, I think a lot of >>>> internal logic depends on this behaviour, so I am not in favour of >>>> changing this back (yet) again. >>>> >>>> Also, considering that this behaviour already exists in past versions >>>> of NumPy, namely 1.6, and is very fundamental to how arrays work, I >>>> find it strange that it is now up for change in 1.7 at the last >>>> minute. >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> Well, the current behaviour makes sure you can have an endless chain >> of views derived from each other without keeping a copy of each view >> alive. If I understand correctly, you propose to change this behaviour >> to where it would keep a copy of each view alive.. My concern is that >> the problems that occurred from the 1.6 change are now seen as >> paramount above a correct implementation. There are problems with >> backward compatibility, but most of these are due to lack of >> documentation and testing. And now there will be a lot of people >> depending on the new behaviour, which is also something to take into >> account. >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion Ah, sorry, I get it. 
You mean to make sure that base is an object of type ndarray. No problems there. :-) From gael.varoquaux at normalesup.org Sun Sep 30 17:08:06 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 30 Sep 2012 23:08:06 +0200 Subject: [Numpy-discussion] Behavior of .base In-Reply-To: References: <15149FE7-75A4-4335-912C-92433F0AC98A@continuum.io> Message-ID: <20120930210806.GA3430@phare.normalesup.org> On Sun, Sep 30, 2012 at 10:30:52PM +0200, Han Genuit wrote: > Also, considering that this behaviour already exists in past versions > of NumPy, namely 1.6, I just checked: in numpy 1.6.1, the behaviour is to create an endless chain of base.base.base... In some sens, what Travis is proposing is going one step in the direction of the old behavior, without its major drawbacks. I am actually very favorable to his suggestion. My 2 cents, Ga?l From travis at continuum.io Sun Sep 30 17:09:43 2012 From: travis at continuum.io (Travis Oliphant) Date: Sun, 30 Sep 2012 16:09:43 -0500 Subject: [Numpy-discussion] Behavior of .base In-Reply-To: References: <15149FE7-75A4-4335-912C-92433F0AC98A@continuum.io> <2747FDE3-B109-483F-BFE3-1C5BE140E592@continuum.io> <73176955-DD3D-42E6-A5EA-3BC5F17AA1BF@continuum.io> Message-ID: -- Travis Oliphant (on a mobile) 512-826-7480 On Sep 30, 2012, at 4:00 PM, Han Genuit wrote: > On Sun, Sep 30, 2012 at 10:55 PM, Travis Oliphant wrote: >> I think you are misunderstanding the proposal. The proposal is to traverse the views as far as you can but stop just short of having base point to an object of a different type. >> >> This fixes the infinite chain of views problem but also fixes the problem sklearn was having with base pointing to an unexpected mmap object. >> >> -- >> Travis Oliphant >> (on a mobile) >> 512-826-7480 >> >> >> On Sep 30, 2012, at 3:50 PM, Han Genuit wrote: >> >>> On Sun, Sep 30, 2012 at 10:35 PM, Travis Oliphant wrote: >>>> We are not talking about changing it "back". The change in 1.6 caused problems that need to be addressed. >>>> >>>> Can you clarify your concerns? The proposal is not a major change to the behavior on master, but it does fix a real issue. >>>> >>>> -- >>>> Travis Oliphant >>>> (on a mobile) >>>> 512-826-7480 >>>> >>>> >>>> On Sep 30, 2012, at 3:30 PM, Han Genuit wrote: >>>> >>>>> On Sun, Sep 30, 2012 at 9:59 PM, Travis Oliphant wrote: >>>>>> Hey all, >>>>>> >>>>>> In a github-discussion with Gael and Nathaniel, we came up with a proposal for .base that we should put before this list. Traditionally, .base has always pointed to None for arrays that owned their own memory and to the "most immediate" array object parent for arrays that did not own their own memory. There was a long-standing issue related to running out of stack space that this behavior created. >>>>>> >>>>>> Recently this behavior was altered so that .base always points to "the original" object holding the memory (something exposing the buffer interface). This created some problems for users who relied on the fact that most of the time .base pointed to an instance of an array object. >>>>>> >>>>>> The proposal here is to change the behavior of .base for arrays that don't own their own memory so that the .base attribute of an array points to "the most original object" that is still an instance of the type of the array. This would go into the 1.7.0 release so as to correct the issues reported. >>>>>> >>>>>> What are reactions to this proposal? 
>>>>>> >>>>>> -Travis >>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> NumPy-Discussion mailing list >>>>>> NumPy-Discussion at scipy.org >>>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>> >>>>> I think the current behaviour of the .base attribute is much more >>>>> stable and predictable than past behaviour. For views for instance, >>>>> this makes sure you don't hold references of 'intermediate' views, but >>>>> always point to the original *base* object. Also, I think a lot of >>>>> internal logic depends on this behaviour, so I am not in favour of >>>>> changing this back (yet) again. >>>>> >>>>> Also, considering that this behaviour already exists in past versions >>>>> of NumPy, namely 1.6, and is very fundamental to how arrays work, I >>>>> find it strange that it is now up for change in 1.7 at the last >>>>> minute. >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> Well, the current behaviour makes sure you can have an endless chain >>> of views derived from each other without keeping a copy of each view >>> alive. If I understand correctly, you propose to change this behaviour >>> to where it would keep a copy of each view alive.. My concern is that >>> the problems that occurred from the 1.6 change are now seen as >>> paramount above a correct implementation. There are problems with >>> backward compatibility, but most of these are due to lack of >>> documentation and testing. And now there will be a lot of people >>> depending on the new behaviour, which is also something to take into >>> account. >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > Ah, sorry, I get it. You mean to make sure that base is an object of > type ndarray. No problems there. :-) Yes. Exactly. I realize I didn't explain it very well. For a subtype it would ensure base is a subtype. Thanks for feedback. Travis > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From tsyu80 at gmail.com Sun Sep 30 17:23:54 2012 From: tsyu80 at gmail.com (Tony Yu) Date: Sun, 30 Sep 2012 17:23:54 -0400 Subject: [Numpy-discussion] ANN: scikits-image 0.7.0 release Message-ID: Announcement: scikits-image 0.7.0 ================================= We're happy to announce the 7th version of scikits-image! Scikits-image is an image processing toolbox for SciPy that includes algorithms for segmentation, geometric transformations, color space manipulation, analysis, filtering, morphology, feature detection, and more. 
For more information, examples, and documentation, please visit our website http://skimage.org New Features ------------ It's been only 3 months since scikits-image 0.6 was released, but in that short time, we've managed to add plenty of new features and enhancements, including - Geometric image transforms - 3 new image segmentation routines (Felsenzwalb, Quickshift, SLIC) - Local binary patterns for texture characterization - Morphological reconstruction - Polygon approximation - CIE Lab color space conversion - Image pyramids - Multispectral support in random walker segmentation - Slicing, concatenation, and natural sorting of image collections - Perimeter and coordinates measurements in regionprops - An extensible image viewer based on Qt and Matplotlib, with plugins for edge detection, line-profiling, and viewing image collections Plus, this release adds a number of bug fixes, new examples, and performance enhancements. Contributors to this release ---------------------------- This release was only possible due to the efforts of many contributors, both new and old. - Andreas Mueller - Andreas Wuerl - Andy Wilson - Brian Holt - Christoph Gohlke - Dharhas Pothina - Emmanuelle Gouillart - Guillaume Gay - Josh Warner - James Bergstra - Johannes Schonberger - Jonathan J. Helmus - Juan Nunez-Iglesias - Leon Tietz - Marianne Corvellec - Matt McCormick - Neil Yager - Nicolas Pinto - Nicolas Poilvert - Pavel Campr - Petter Strandmark - Stefan van der Walt - Tim Sheerman-Chase - Tomas Kazmar - Tony S Yu - Wei Li -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Sep 30 22:30:17 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 30 Sep 2012 20:30:17 -0600 Subject: [Numpy-discussion] Behavior of .base In-Reply-To: <15149FE7-75A4-4335-912C-92433F0AC98A@continuum.io> References: <15149FE7-75A4-4335-912C-92433F0AC98A@continuum.io> Message-ID: On Sun, Sep 30, 2012 at 1:59 PM, Travis Oliphant wrote: > Hey all, > > In a github-discussion with Gael and Nathaniel, we came up with a proposal > for .base that we should put before this list. Traditionally, .base has > always pointed to None for arrays that owned their own memory and to the > "most immediate" array object parent for arrays that did not own their own > memory. There was a long-standing issue related to running out of stack > space that this behavior created. > > Recently this behavior was altered so that .base always points to "the > original" object holding the memory (something exposing the buffer > interface). This created some problems for users who relied on the fact > that most of the time .base pointed to an instance of an array object. > > The proposal here is to change the behavior of .base for arrays that don't > own their own memory so that the .base attribute of an array points to "the > most original object" that is still an instance of the type of the array. > This would go into the 1.7.0 release so as to correct the issues > reported. > > What are reactions to this proposal? > > It sounds like this would solve the problem in the short term, but it is a bit of a hack in that the behaviour is more complicated than either the original or the current version. So I could see this in 1.7, but it might be preferable in the long term to work out what attributes are needed to solve Gael's problem more directly. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
From charlesr.harris at gmail.com  Sun Sep 30 22:30:17 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 30 Sep 2012 20:30:17 -0600
Subject: [Numpy-discussion] Behavior of .base
In-Reply-To: <15149FE7-75A4-4335-912C-92433F0AC98A@continuum.io>
References: <15149FE7-75A4-4335-912C-92433F0AC98A@continuum.io>
Message-ID: 

On Sun, Sep 30, 2012 at 1:59 PM, Travis Oliphant wrote:

> Hey all,
>
> In a github-discussion with Gael and Nathaniel, we came up with a proposal
> for .base that we should put before this list. Traditionally, .base has
> always pointed to None for arrays that owned their own memory and to the
> "most immediate" array object parent for arrays that did not own their own
> memory. There was a long-standing issue related to running out of stack
> space that this behavior created.
>
> Recently this behavior was altered so that .base always points to "the
> original" object holding the memory (something exposing the buffer
> interface). This created some problems for users who relied on the fact
> that most of the time .base pointed to an instance of an array object.
>
> The proposal here is to change the behavior of .base for arrays that don't
> own their own memory so that the .base attribute of an array points to "the
> most original object" that is still an instance of the type of the array.
> This would go into the 1.7.0 release so as to correct the issues reported.
>
> What are reactions to this proposal?
>

It sounds like this would solve the problem in the short term, but it is a
bit of a hack in that the behaviour is more complicated than either the
original or the current version. So I could see this in 1.7, but it might
be preferable in the long term to work out what attributes are needed to
solve Gael's problem more directly.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From charlesr.harris at gmail.com  Sun Sep 30 23:11:14 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 30 Sep 2012 21:11:14 -0600
Subject: [Numpy-discussion] Behavior of .base
In-Reply-To: 
References: <15149FE7-75A4-4335-912C-92433F0AC98A@continuum.io>
Message-ID: 

On Sun, Sep 30, 2012 at 8:30 PM, Charles R Harris wrote:

>
>
> On Sun, Sep 30, 2012 at 1:59 PM, Travis Oliphant wrote:
>
>> Hey all,
>>
>> In a github-discussion with Gael and Nathaniel, we came up with a
>> proposal for .base that we should put before this list. Traditionally,
>> .base has always pointed to None for arrays that owned their own memory and
>> to the "most immediate" array object parent for arrays that did not own
>> their own memory. There was a long-standing issue related to running out
>> of stack space that this behavior created.
>>
>> Recently this behavior was altered so that .base always points to "the
>> original" object holding the memory (something exposing the buffer
>> interface). This created some problems for users who relied on the fact
>> that most of the time .base pointed to an instance of an array object.
>>
>> The proposal here is to change the behavior of .base for arrays that
>> don't own their own memory so that the .base attribute of an array points
>> to "the most original object" that is still an instance of the type of the
>> array. This would go into the 1.7.0 release so as to correct the
>> issues reported.
>>
>> What are reactions to this proposal?
>>
>
> It sounds like this would solve the problem in the short term, but it is a
> bit of a hack in that the behaviour is more complicated than either the
> original or the current version. So I could see this in 1.7, but it might
> be preferable in the long term to work out what attributes are needed to
> solve Gael's problem more directly.
>

Although I think the proposal needs to be laid out more exactly, with more
detail, in order to understand what it is. Perhaps an explanation of the
problem along with an explanation of how it is solved. A diagram would be
helpful and could go into the documentation.

Chuck

> Chuck
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
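To spell out the difference between the current 1.7 behaviour and the proposal quoted above, here is a small hypothetical sketch. The bytearray only stands in for "something exposing the buffer interface"; the comments describe the two sets of semantics being compared, and what the final print shows depends on which behaviour the installed NumPy actually implements.

    import numpy as np

    buf = bytearray(16)                      # not an ndarray, but exposes the buffer interface
    a = np.frombuffer(buf, dtype=np.uint8)   # a does not own its memory
    b = a[2:]
    c = b[1:]

    # Current 1.7 behaviour: .base is walked all the way back to the object
    # that actually holds the memory, so c.base ends up being the bytearray.
    # Proposed behaviour: the walk stops at the most original object that is
    # still an ndarray, so c.base ends up being a, and code that expects
    # .base to be an array keeps working.
    print(type(c.base))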
From travis at continuum.io  Sun Sep 30 23:17:09 2012
From: travis at continuum.io (Travis Oliphant)
Date: Sun, 30 Sep 2012 22:17:09 -0500
Subject: [Numpy-discussion] Behavior of .base
In-Reply-To: 
References: <15149FE7-75A4-4335-912C-92433F0AC98A@continuum.io>
Message-ID: <659CEF43-32AC-4D6E-86F6-34CA30AA415E@continuum.io>

It sounds like there are no objections and this has a strong chance to fix
the problems. We will put it on the TODO list for the 1.7.0 release.

-Travis

On Sep 30, 2012, at 9:30 PM, Charles R Harris wrote:

>
>
> On Sun, Sep 30, 2012 at 1:59 PM, Travis Oliphant wrote:
> Hey all,
>
> In a github-discussion with Gael and Nathaniel, we came up with a proposal
> for .base that we should put before this list. Traditionally, .base has
> always pointed to None for arrays that owned their own memory and to the
> "most immediate" array object parent for arrays that did not own their own
> memory. There was a long-standing issue related to running out of stack
> space that this behavior created.
>
> Recently this behavior was altered so that .base always points to "the
> original" object holding the memory (something exposing the buffer
> interface). This created some problems for users who relied on the fact
> that most of the time .base pointed to an instance of an array object.
>
> The proposal here is to change the behavior of .base for arrays that don't
> own their own memory so that the .base attribute of an array points to "the
> most original object" that is still an instance of the type of the array.
> This would go into the 1.7.0 release so as to correct the issues reported.
>
> What are reactions to this proposal?
>
>
> It sounds like this would solve the problem in the short term, but it is a
> bit of a hack in that the behaviour is more complicated than either the
> original or the current version. So I could see this in 1.7, but it might
> be preferable in the long term to work out what attributes are needed to
> solve Gael's problem more directly.
>
> Chuck
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
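Since the thread above also touches on subclasses ("for a subtype it would ensure base is a subtype"), here is a hypothetical sketch of what that part of the proposal means. MyArray is an invented, do-nothing subclass; the comment describes the proposed behaviour rather than what any particular released version does.

    import numpy as np

    class MyArray(np.ndarray):
        # A do-nothing ndarray subclass, used only for illustration.
        pass

    m = np.arange(12).view(MyArray)   # a MyArray view of a plain ndarray's memory
    v = m[2:]
    w = v[1:]

    # Under the proposal, the .base walk stops at the most original object
    # that is still an instance of the view's own type, so w.base would be
    # m (a MyArray) rather than the plain ndarray further down the chain.
    print(type(w.base))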