From njs at pobox.com Sun Jul 1 13:36:51 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 1 Jul 2012 18:36:51 +0100 Subject: [Numpy-discussion] Combined versus separate build In-Reply-To: References: Message-ID: On Wed, Jun 27, 2012 at 9:05 PM, David Cournapeau wrote: > On Wed, Jun 27, 2012 at 8:53 PM, Nathaniel Smith wrote: >> But seriously, what compilers do we support that don't have >> -fvisibility=hidden? ...Is there even a list of compilers we support >> available anywhere? > > Well, I am not sure how all this is handled on the big guys (bluegen > and co), for once. > > There is also the issue of the consequence on statically linking numpy > to python: I don't what they are (I would actually like to make > statically linked numpy into python easier, not harder). All the docs I can find in a quick google seem to say that bluegene doesn't do shared libraries at all, though those may be out of date. Also, it looks like our current approach is not doing a great job of avoiding symbol table pollution... despite all the NPY_NO_EXPORTS all over the source, I still count ~170 exported symbols on Linux with numpy 1.6, many of them with non-namespaced names ("_n_to_n_data_copy", "_next", "npy_tan", etc.) Of course this is fixable, but it's interesting that no-one has noticed. (Current master brings this up to ~300 exported symbols.) It sounds like as far as our "officially supported" platforms go (linux/windows/osx with gcc/msvc), then the ideal approach would be to use -fvisibility=hidden or --retain-symbols-file to convince gcc to hide symbols by default, like msvc does. That would let us remove cruft from the source code, produce a more reliable result, and let us use the more convenient separate build, with no real downsides. (Static linking is trickier because no-one uses it anymore so the docs aren't great, but I think on Linux at least you could accomplish the equivalent by building the static library with 'ld -r ... -o tmp-multiarray.a; objcopy --keep-global-symbol=initmultiarray tmp-multiarray.a multiarray.a'.) Of course there are presumably other platforms that we don't support or test on, but where we have users anyway. Building on such a platform sort of intrinsically requires build system hacks, and some equivalent to the above may well be available (e.g. I know icc supports -fvisibility). So I while I'm not going to do anything about this myself in the near future, I'd argue that it would be a good idea to: - Switch the build-system to export nothing by default when using gcc, using -fvisibility=hidden - Switch the default build to "separate" - Leave in the single-file build, but not "officially supported", i.e., we're happy to get patches but it's not used on any systems that we can actually test ourselves. (I suspect it's less fragile than the separate build anyway, since name clashes are less common than forgotten include files.) -N From cournape at gmail.com Sun Jul 1 14:36:49 2012 From: cournape at gmail.com (David Cournapeau) Date: Sun, 1 Jul 2012 19:36:49 +0100 Subject: [Numpy-discussion] Combined versus separate build In-Reply-To: References: Message-ID: On Sun, Jul 1, 2012 at 6:36 PM, Nathaniel Smith wrote: > On Wed, Jun 27, 2012 at 9:05 PM, David Cournapeau wrote: >> On Wed, Jun 27, 2012 at 8:53 PM, Nathaniel Smith wrote: >>> But seriously, what compilers do we support that don't have >>> -fvisibility=hidden? ...Is there even a list of compilers we support >>> available anywhere? 
>> >> Well, I am not sure how all this is handled on the big guys (bluegen >> and co), for once. >> >> There is also the issue of the consequence on statically linking numpy >> to python: I don't what they are (I would actually like to make >> statically linked numpy into python easier, not harder). > > All the docs I can find in a quick google seem to say that bluegene > doesn't do shared libraries at all, though those may be out of date. > > Also, it looks like our current approach is not doing a great job of > avoiding symbol table pollution... despite all the NPY_NO_EXPORTS all > over the source, I still count ~170 exported symbols on Linux with > numpy 1.6, many of them with non-namespaced names > ("_n_to_n_data_copy", "_next", "npy_tan", etc.) Of course this is > fixable, but it's interesting that no-one has noticed. (Current master > brings this up to ~300 exported symbols.) > > It sounds like as far as our "officially supported" platforms go > (linux/windows/osx with gcc/msvc), then the ideal approach would be to > use -fvisibility=hidden or --retain-symbols-file to convince gcc to > hide symbols by default, like msvc does. That would let us remove > cruft from the source code, produce a more reliable result, and let us > use the more convenient separate build, with no real downsides. What cruft would it allow us to remove ? Whatever method we use, we need a whitelist of symbols to export. On the exported list I see on mac, most of them are either from npymath (npy prefix) or npysort (no prefix, I think this should be added). Once those are ignored as they should be, there are < 30 symbols exported. > (Static linking is trickier because no-one uses it anymore so the docs > aren't great, but I think on Linux at least you could accomplish the > equivalent by building the static library with 'ld -r ... -o > tmp-multiarray.a; objcopy --keep-global-symbol=initmultiarray > tmp-multiarray.a multiarray.a'.) I am not sure why you say that static linking is not used anymore: I have met some people who do statically link numpy into python. > > Of course there are presumably other platforms that we don't support > or test on, but where we have users anyway. Building on such a > platform sort of intrinsically requires build system hacks, and some > equivalent to the above may well be available (e.g. I know icc > supports -fvisibility). So I while I'm not going to do anything about > this myself in the near future, I'd argue that it would be a good idea > to: > - Switch the build-system to export nothing by default when using > gcc, using -fvisibility=hidden > - Switch the default build to "separate" > - Leave in the single-file build, but not "officially supported", > i.e., we're happy to get patches but it's not used on any systems that > we can actually test ourselves. (I suspect it's less fragile than the > separate build anyway, since name clashes are less common than > forgotten include files.) I am fine with making the separate build the default (I have a patch somewhere that does that on supported platforms), but not with using -fvisibility=hidden. When I implemented the initial support around this, fvisibility was buggy on some platforms, including mingw 3.x I don't think changing what our implementation does here is worthwhile given that it works, and fsibility=hidden has no big advantages (you would still need to mark the functions to be exported). 
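For the record, the symbol counts discussed above come from just listing the dynamic symbol table of the built extension. A quick sketch of how to do that (the path and nm flags are illustrative and may need adjusting per platform; on Linux "nm -D --defined-only" on the installed .so gives the same information):

import subprocess

# Illustrative only: point this at wherever the built extension ends up.
so_path = "numpy/core/multiarray.so"
output = subprocess.check_output(["nm", "-g", so_path])

exported = []
for line in output.splitlines():
    parts = line.split()
    # undefined references have no address and/or type "U"; keep the rest
    if len(parts) == 3 and parts[1] != "U":
        exported.append(parts[2])

print len(exported)
for name in sorted(exported):
    print name
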
David From njs at pobox.com Sun Jul 1 15:32:41 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 1 Jul 2012 20:32:41 +0100 Subject: [Numpy-discussion] Combined versus separate build In-Reply-To: References: Message-ID: On Sun, Jul 1, 2012 at 7:36 PM, David Cournapeau wrote: > On Sun, Jul 1, 2012 at 6:36 PM, Nathaniel Smith wrote: >> On Wed, Jun 27, 2012 at 9:05 PM, David Cournapeau wrote: >>> On Wed, Jun 27, 2012 at 8:53 PM, Nathaniel Smith wrote: >>>> But seriously, what compilers do we support that don't have >>>> -fvisibility=hidden? ...Is there even a list of compilers we support >>>> available anywhere? >>> >>> Well, I am not sure how all this is handled on the big guys (bluegen >>> and co), for once. >>> >>> There is also the issue of the consequence on statically linking numpy >>> to python: I don't what they are (I would actually like to make >>> statically linked numpy into python easier, not harder). >> >> All the docs I can find in a quick google seem to say that bluegene >> doesn't do shared libraries at all, though those may be out of date. >> >> Also, it looks like our current approach is not doing a great job of >> avoiding symbol table pollution... despite all the NPY_NO_EXPORTS all >> over the source, I still count ~170 exported symbols on Linux with >> numpy 1.6, many of them with non-namespaced names >> ("_n_to_n_data_copy", "_next", "npy_tan", etc.) Of course this is >> fixable, but it's interesting that no-one has noticed. (Current master >> brings this up to ~300 exported symbols.) >> >> It sounds like as far as our "officially supported" platforms go >> (linux/windows/osx with gcc/msvc), then the ideal approach would be to >> use -fvisibility=hidden or --retain-symbols-file to convince gcc to >> hide symbols by default, like msvc does. That would let us remove >> cruft from the source code, produce a more reliable result, and let us >> use the more convenient separate build, with no real downsides. > > What cruft would it allow us to remove ? Whatever method we use, we > need a whitelist of symbols to export. No, right now we don't have a whitelist, we have a blacklist -- every time we add a new function or global variable, we have to remember to add a NPY_NO_EXPORT tag to its definition. Except the evidence says that we don't do that reliably. (Everyone always sucks at maintaining blacklists, that's the nature of blacklists.) I'm saying that we'd better off if we did have a whitelist. Especially since CPython API makes maintaining this whitelist so very trivial -- each module exports exactly one symbol! > On the exported list I see on mac, most of them are either from > npymath (npy prefix) or npysort (no prefix, I think this should be > added). Once those are ignored as they should be, there are < 30 > symbols exported. > >> (Static linking is trickier because no-one uses it anymore so the docs >> aren't great, but I think on Linux at least you could accomplish the >> equivalent by building the static library with 'ld -r ... -o >> tmp-multiarray.a; objcopy --keep-global-symbol=initmultiarray >> tmp-multiarray.a multiarray.a'.) > > I am not sure why you say that static linking is not used anymore: I > have met some people who do statically link numpy into python. Yes, of course, or I wouldn't have bothered researching it. But this research would have been easier if there were enough of a user base that the tools makers actually paid any attention to supporting this use case, is all I was saying :-). 
>> Of course there are presumably other platforms that we don't support >> or test on, but where we have users anyway. Building on such a >> platform sort of intrinsically requires build system hacks, and some >> equivalent to the above may well be available (e.g. I know icc >> supports -fvisibility). So I while I'm not going to do anything about >> this myself in the near future, I'd argue that it would be a good idea >> to: >> - Switch the build-system to export nothing by default when using >> gcc, using -fvisibility=hidden >> - Switch the default build to "separate" >> - Leave in the single-file build, but not "officially supported", >> i.e., we're happy to get patches but it's not used on any systems that >> we can actually test ourselves. (I suspect it's less fragile than the >> separate build anyway, since name clashes are less common than >> forgotten include files.) > > I am fine with making the separate build the default (I have a patch > somewhere that does that on supported platforms), but not with using > -fvisibility=hidden. When I implemented the initial support around > this, fvisibility was buggy on some platforms, including mingw 3.x It's true that mingw doesn't support -fvisibility=hidden, but that's because it would be a no-op; windows already works that way by default... > I don't think changing what our implementation does here is worthwhile > given that it works, and fsibility=hidden has no big advantages (you > would still need to mark the functions to be exported). But there are only about 10 functions that we need to export, and that list never changes; OTOH there are tons and tons of functions that we want to *not* export, and that list changes constantly. -N From cournape at gmail.com Sun Jul 1 16:17:22 2012 From: cournape at gmail.com (David Cournapeau) Date: Sun, 1 Jul 2012 21:17:22 +0100 Subject: [Numpy-discussion] Combined versus separate build In-Reply-To: References: Message-ID: On Sun, Jul 1, 2012 at 8:32 PM, Nathaniel Smith wrote: > On Sun, Jul 1, 2012 at 7:36 PM, David Cournapeau wrote: >> On Sun, Jul 1, 2012 at 6:36 PM, Nathaniel Smith wrote: >>> On Wed, Jun 27, 2012 at 9:05 PM, David Cournapeau wrote: >>>> On Wed, Jun 27, 2012 at 8:53 PM, Nathaniel Smith wrote: >>>>> But seriously, what compilers do we support that don't have >>>>> -fvisibility=hidden? ...Is there even a list of compilers we support >>>>> available anywhere? >>>> >>>> Well, I am not sure how all this is handled on the big guys (bluegen >>>> and co), for once. >>>> >>>> There is also the issue of the consequence on statically linking numpy >>>> to python: I don't what they are (I would actually like to make >>>> statically linked numpy into python easier, not harder). >>> >>> All the docs I can find in a quick google seem to say that bluegene >>> doesn't do shared libraries at all, though those may be out of date. >>> >>> Also, it looks like our current approach is not doing a great job of >>> avoiding symbol table pollution... despite all the NPY_NO_EXPORTS all >>> over the source, I still count ~170 exported symbols on Linux with >>> numpy 1.6, many of them with non-namespaced names >>> ("_n_to_n_data_copy", "_next", "npy_tan", etc.) Of course this is >>> fixable, but it's interesting that no-one has noticed. (Current master >>> brings this up to ~300 exported symbols.) 
>>> >>> It sounds like as far as our "officially supported" platforms go >>> (linux/windows/osx with gcc/msvc), then the ideal approach would be to >>> use -fvisibility=hidden or --retain-symbols-file to convince gcc to >>> hide symbols by default, like msvc does. That would let us remove >>> cruft from the source code, produce a more reliable result, and let us >>> use the more convenient separate build, with no real downsides. >> >> What cruft would it allow us to remove ? Whatever method we use, we >> need a whitelist of symbols to export. > > No, right now we don't have a whitelist, we have a blacklist -- every > time we add a new function or global variable, we have to remember to > add a NPY_NO_EXPORT tag to its definition. Except the evidence says > that we don't do that reliably. (Everyone always sucks at maintaining > blacklists, that's the nature of blacklists.) I'm saying that we'd > better off if we did have a whitelist. Especially since CPython API > makes maintaining this whitelist so very trivial -- each module > exports exactly one symbol! There may be some confusion on what NPY_NP_EXPORT does: it marks a function that can be used between compilation units but is not exported. The choice is between static and NPY_NO_EXPORT, not between NPY_NO_EXPORT and nothing. In that sense, marking something NPY_NO_EXPORT is a whitelist. If we were to use -fvisibility=hidden, we would still need to mark those functions static (as it would otherwise publish functions in the single file build). > > Yes, of course, or I wouldn't have bothered researching it. But this > research would have been easier if there were enough of a user base > that the tools makers actually paid any attention to supporting this > use case, is all I was saying :-). > >>> Of course there are presumably other platforms that we don't support >>> or test on, but where we have users anyway. Building on such a >>> platform sort of intrinsically requires build system hacks, and some >>> equivalent to the above may well be available (e.g. I know icc >>> supports -fvisibility). So I while I'm not going to do anything about >>> this myself in the near future, I'd argue that it would be a good idea >>> to: >>> - Switch the build-system to export nothing by default when using >>> gcc, using -fvisibility=hidden >>> - Switch the default build to "separate" >>> - Leave in the single-file build, but not "officially supported", >>> i.e., we're happy to get patches but it's not used on any systems that >>> we can actually test ourselves. (I suspect it's less fragile than the >>> separate build anyway, since name clashes are less common than >>> forgotten include files.) >> >> I am fine with making the separate build the default (I have a patch >> somewhere that does that on supported platforms), but not with using >> -fvisibility=hidden. When I implemented the initial support around >> this, fvisibility was buggy on some platforms, including mingw 3.x > > It's true that mingw doesn't support -fvisibility=hidden, but that's > because it would be a no-op; windows already works that way by > default... That's not my understanding: gcc behaves on windows as on linux (it would break too many softwares that are the usual target of mingw otherwise), but the -fvisibility flag is broken on gcc 3.x. 
The more recent mingw supposedly handle this better, but we can't use gcc 4.x because of another issue regarding private dll sharing :) David From heng at cantab.net Sun Jul 1 16:29:22 2012 From: heng at cantab.net (Henry Gomersall) Date: Sun, 01 Jul 2012 21:29:22 +0100 Subject: [Numpy-discussion] running a development branch? Message-ID: <1341174562.2275.6.camel@farnsworth> Forgive me for what seems to me should be an obvious question. How do people run development code without the need to build an entire source distribution each time? My current strategy is to develop in a virtualenv and then copy the changes to my numpy fork when done, but there are lots of obvious problems with that. Trying to run modules in the numpy branch gives me the error about trying to import numpy from its source tree. Cheers, Henry From ralf.gommers at googlemail.com Sun Jul 1 16:35:50 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 1 Jul 2012 22:35:50 +0200 Subject: [Numpy-discussion] running a development branch? In-Reply-To: <1341174562.2275.6.camel@farnsworth> References: <1341174562.2275.6.camel@farnsworth> Message-ID: On Sun, Jul 1, 2012 at 10:29 PM, Henry Gomersall wrote: > Forgive me for what seems to me should be an obvious question. > > How do people run development code without the need to build an entire > source distribution each time? My current strategy is to develop in a > virtualenv and then copy the changes to my numpy fork when done, but > there are lots of obvious problems with that. > > Trying to run modules in the numpy branch gives me the error about > trying to import numpy from its source tree. > You need an in-place build, "python setup.py build_ext -i" (or its Bento or Numscons equivalent) and then add the base dir of the numpy repo to your PYTHONPATH. You can find some more details in the 2nd question of the FAQ of https://github.com/scipy/scipy/blob/master/HACKING.rst.txt Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From sveinugu at gmail.com Mon Jul 2 10:53:43 2012 From: sveinugu at gmail.com (Sveinung Gundersen) Date: Mon, 2 Jul 2012 16:53:43 +0200 Subject: [Numpy-discussion] Change in memmap behaviour Message-ID: <00082D11-6BC4-4FBE-829E-682314FE73CA@gmail.com> Hi, We are developing a large project for genome analysis (http://hyperbrowser.uio.no), where we use memmap vectors as the basic data structure for storage. The stored data are accessed in slices, and used as basis for calculations. As the stored data may be large (up to 24 GB), the memory footprint is important. We experienced a problem with 64-bit addressing for the function concatenate (using quite old numpy version 1.5.1rc), and have thus updated the version of numpy to 1.7.0.dev-651ef74, where the problem has been fixed. We have, however, experienced another problem connected to a change in memmap behaviour. This change seems to have come with the 1.6 release. 
Before (1.5.1rc1): >>> import platform; print platform.python_version() 2.7.0 >>> import numpy as np >>> np.version.version '1.5.1rc1' >>> a = np.memmap('testmemmap', 'int32', 'w+', shape=20) >>> a[:] = 2 >>> a[0:2] memmap([2, 2], dtype=int32) >>> a[0:2]._mmap >>> a.sum() 40 >>> a.sum()._mmap Traceback (most recent call last): File "", line 1, in AttributeError: 'numpy.int64' object has no attribute '_mmap' After (1.6.2): >>> import platform; print platform.python_version() 2.7.0 >>> import numpy as np >>> np.version.version '1.6.2' >>> a = np.memmap('testmemmap', 'int32', 'w+', shape=20) >>> a[:] = 2 >>> a[0:2] memmap([2, 2], dtype=int32) >>> a[0:2]._mmap >>> a.sum() memmap(40) >>> a.sum()._mmap The problem is then that doing calculations of memmap objects, resulting in scalar results, previously returned a numpy scalar, with no reference to the memmap object. We could then just keep the result, and mark the memmap for garbage collection. Now, the memory usage of the system has increased dramatically, as we now longer have this option. So, the question is twofold: 1) What is the reason behind this change? It makes sense to keep the reference to the mmap when slicing, but to go from a scalar value to the mmap does not seem very useful. Is there a possibility to return to the old solution? 2) If not, do you have any advice how we can retain the old solution without rewriting the system. We could cast the results of all functions on the memmap, but these are scattered throughout the system and would probably cause much headache. So we would rather implement a general solution, for instance wrapping the memmap object somehow. Do you have any ideas? Connected to this is the rather puzzling fact that the 'new' memmap scalar object has an __iter__ method, but no length. Should not the __iter__ method be removed, as this signals that the object is iterable? Before (1.5.1rc1): >>> a[0:2].__iter__() >>> len(a[0:2]) 2 >>> a.sum().__iter__ Traceback (most recent call last): File "", line 1, in AttributeError: 'numpy.int64' object has no attribute '__iter__' >>> len(a.sum()) Traceback (most recent call last): File "", line 1, in TypeError: object of type 'numpy.int64' has no len() After (1.6.2): >>> a[0:2].__iter__() >>> len(a[0:2]) 2 >>> a.sum().__iter__ >>> len(a.sum()) Traceback (most recent call last): File "", line 1, in TypeError: len() of unsized object >>> [x for x in a.sum()] Traceback (most recent call last): File "", line 1, in TypeError: iteration over a 0-d array Regards, Sveinung Gundersen -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Jul 2 12:51:10 2012 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 2 Jul 2012 17:51:10 +0100 Subject: [Numpy-discussion] Change in memmap behaviour In-Reply-To: <00082D11-6BC4-4FBE-829E-682314FE73CA@gmail.com> References: <00082D11-6BC4-4FBE-829E-682314FE73CA@gmail.com> Message-ID: On Mon, Jul 2, 2012 at 3:53 PM, Sveinung Gundersen wrote: > Hi, > > We are developing a large project for genome analysis > (http://hyperbrowser.uio.no), where we use memmap vectors as the basic data > structure for storage. The stored data are accessed in slices, and used as > basis for calculations. As the stored data may be large (up to 24 GB), the > memory footprint is important. > > We experienced a problem with 64-bit addressing for the function concatenate > (using quite old numpy version 1.5.1rc), and have thus updated the version > of numpy to 1.7.0.dev-651ef74, where the problem has been fixed. 
We have, > however, experienced another problem connected to a change in memmap > behaviour. This change seems to have come with the 1.6 release. > > Before (1.5.1rc1): > >>>> import platform; print platform.python_version() > 2.7.0 >>>> import numpy as np >>>> np.version.version > '1.5.1rc1' >>>> a = np.memmap('testmemmap', 'int32', 'w+', shape=20) >>>> a[:] = 2 >>>> a[0:2] > memmap([2, 2], dtype=int32) >>>> a[0:2]._mmap > >>>> a.sum() > 40 >>>> a.sum()._mmap > Traceback (most recent call last): > File "", line 1, in > AttributeError: 'numpy.int64' object has no attribute '_mmap' > > After (1.6.2): > >>>> import platform; print platform.python_version() > 2.7.0 >>>> import numpy as np >>>> np.version.version > '1.6.2' >>>> a = np.memmap('testmemmap', 'int32', 'w+', shape=20) >>>> a[:] = 2 >>>> a[0:2] > memmap([2, 2], dtype=int32) >>>> a[0:2]._mmap > >>>> a.sum() > memmap(40) >>>> a.sum()._mmap > > > The problem is then that doing calculations of memmap objects, resulting in > scalar results, previously returned a numpy scalar, with no reference to the > memmap object. We could then just keep the result, and mark the memmap for > garbage collection. Now, the memory usage of the system has increased > dramatically, as we now longer have this option. Your actual memory usage may not have increased as much as you think, since memmap objects don't necessarily take much memory -- it sounds like you're leaking virtual memory, but your resident set size shouldn't go up as much. That said, this is clearly a bug, and it's even worse than you mention -- *all* operations on memmap arrays are holding onto references to the original mmap object, regardless of whether they share any memory: >>> a = np.memmap("/etc/passwd", np.uint8, "r") # arithmetic >>> (a + 10)._mmap is a._mmap True # fancy indexing (doesn't return a view!) >>> a[[1, 2, 3]]._mmap is a._mmap True >>> a.sum()._mmap is a._mmap True Really, only slicing should be returning a np.memmap object at all. Unfortunately, it is currently impossible to create an ndarray subclass that returns base-class ndarrays from any operations -- __array_finalize__() has no way to do this. And this is the third ndarray subclass in a row that I've looked at that wanted to be able to do this, so I guess maybe it's something we should implement... In the short term, the numpy-upstream fix is to change numpy.core.memmap:memmap.__array_finalize__ so that it only copies over the ._mmap attribute of its parent if np.may_share_memory(self, parent) is True. Patches gratefully accepted ;-) In the short term, you have a few options for hacky workarounds. You could monkeypatch the above fix into the memmap class. You could manually assign None to the _mmap attribute of offending arrays (being careful only to do this to arrays where you know it is safe!). And for reduction operations like sum() in particular, what you have right now is not actually a scalar object -- it is a 0-dimensional array that holds a single scalar. You can pull this scalar out by calling .item() on the array, and then throw away the array itself -- the scalar won't have any _mmap attribute. 
def scalarify(scalar_or_0d_array): if isinstance(scalar_or_0d_array, np.ndarray): return scalar_or_0d_array.item() else: return scalar_or_0d_array # works on both numpy 1.5 and numpy 1.6: total = scalarify(a.sum()) -N From sveinugu at gmail.com Mon Jul 2 13:54:39 2012 From: sveinugu at gmail.com (Sveinung Gundersen) Date: Mon, 2 Jul 2012 19:54:39 +0200 Subject: [Numpy-discussion] Change in memmap behaviour In-Reply-To: References: Message-ID: <50B02802-AF3A-4D88-8875-AF8298FA55FD@gmail.com> [snip] > > Your actual memory usage may not have increased as much as you think, > since memmap objects don't necessarily take much memory -- it sounds > like you're leaking virtual memory, but your resident set size > shouldn't go up as much. As I understand it, memmap objects retain the contents of the memmap in memory after it has been read the first time (in a lazy manner). Thus, when reading a slice of a 24GB file, only that part recides in memory. Our system reads a slice of a memmap, calculates something (say, the sum), and then deletes the memmap. It then loops through this for consequitive slices, retaining a low memory usage. Consider the following code: import numpy as np res = [] vecLen = 3095677412 for i in xrange(vecLen/10**8+1): x = i * 10**8 y = min((i+1) * 10**8, vecLen) res.append(np.memmap('val.float64', dtype='float64')[x:y].sum()) The memory usage of this code on a 24GB file (one value for each nucleotide in the human DNA!) is 23g resident memory after the loop is finished (not 24g for some reason..). Running the same code on 1.5.1rc1 gives a resident memory of 23m after the loop. > > That said, this is clearly a bug, and it's even worse than you mention > -- *all* operations on memmap arrays are holding onto references to > the original mmap object, regardless of whether they share any memory: >>>> a = np.memmap("/etc/passwd", np.uint8, "r") > # arithmetic >>>> (a + 10)._mmap is a._mmap > True > # fancy indexing (doesn't return a view!) >>>> a[[1, 2, 3]]._mmap is a._mmap > True >>>> a.sum()._mmap is a._mmap > True > Really, only slicing should be returning a np.memmap object at all. > Unfortunately, it is currently impossible to create an ndarray > subclass that returns base-class ndarrays from any operations -- > __array_finalize__() has no way to do this. And this is the third > ndarray subclass in a row that I've looked at that wanted to be able > to do this, so I guess maybe it's something we should implement... > > In the short term, the numpy-upstream fix is to change > numpy.core.memmap:memmap.__array_finalize__ so that it only copies > over the ._mmap attribute of its parent if np.may_share_memory(self, > parent) is True. Patches gratefully accepted ;-) Great! Any idea on whether such a patch may be included in 1.7? > > In the short term, you have a few options for hacky workarounds. You > could monkeypatch the above fix into the memmap class. You could > manually assign None to the _mmap attribute of offending arrays (being > careful only to do this to arrays where you know it is safe!). And for > reduction operations like sum() in particular, what you have right now > is not actually a scalar object -- it is a 0-dimensional array that > holds a single scalar. You can pull this scalar out by calling .item() > on the array, and then throw away the array itself -- the scalar won't > have any _mmap attribute. 
> def scalarify(scalar_or_0d_array): > if isinstance(scalar_or_0d_array, np.ndarray): > return scalar_or_0d_array.item() > else: > return scalar_or_0d_array > # works on both numpy 1.5 and numpy 1.6: > total = scalarify(a.sum()) Thank you for this! However, such a solution would have to be scattered throughout the code (probably over 100 places), and I would rather not do that. I guess the abovementioned patch would be the best solution. I do not have experience in the numpy core code, so I am also eagerly awaiting such a patch! Sveinung -- Sveinung Gundersen PhD Student, Bioinformatics, Dept. of Tumor Biology, Inst. for Cancer Research, The Norwegian Radium Hospital, Montebello, 0310 Oslo, Norway E-mail: sveinung.gundersen at medisin.uio.no, Phone: +47 93 00 94 54 -------------- next part -------------- An HTML attachment was scrubbed... URL: From matrixhasu at gmail.com Mon Jul 2 13:58:28 2012 From: matrixhasu at gmail.com (Sandro Tosi) Date: Mon, 2 Jul 2012 19:58:28 +0200 Subject: [Numpy-discussion] Numpy regression in 1.6.2 in deducing the dtype for record array Message-ID: Hello, I'd like to point you to this bug report just reported to Debian: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=679948 It would be really awesome if you could give a look and comment if the proposed fix would be appropriate. Thanks a lot, -- Sandro Tosi (aka morph, morpheus, matrixhasu) My website: http://matrixhasu.altervista.org/ Me at Debian: http://wiki.debian.org/SandroTosi From dalke at dalkescientific.com Mon Jul 2 15:17:58 2012 From: dalke at dalkescientific.com (Andrew Dalke) Date: Mon, 2 Jul 2012 21:17:58 +0200 Subject: [Numpy-discussion] "import numpy" performance Message-ID: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> In this email I propose a few changes which I think are minor and which don't really affect the external NumPy API but which I think could improve the "import numpy" performance by at least 40%. This affects me because I and my clients use a chemistry toolkit which uses only NumPy arrays, and where we run short programs often on the command-line. In July of 2008 I started a thread about how "import numpy" was noticeably slow for one of my customers. They had chemical analysis software, often even run on a single molecular structure using command-line tools, and the several invocations with 0.1 seconds overhead was one of the dominant costs even when numpy wasn't needed. I fixed most of their problems by deferring numpy imports until needed. I remember well the Steve Jobs anecdote at http://folklore.org/StoryView.py?project=Macintosh&story=Saving_Lives.txt and spent another day of my time in 2008 to identify the parts of the numpy import sequence which seemed excessive. I managed to get the import time down from 0.21 seconds to 0.08 seconds. Very little of that made it into NumPy. The three biggest changes I would like are: 1) remove "add_newdocs" and put the docstrings in the C code 'add_newdocs' still needs to be there, The code says: # This is only meant to add docs to objects defined in C-extension modules. # The purpose is to allow easier editing of the docstrings without # requiring a re-compile. However, the change log shows that there are relatively few commits to this module Year Number of commits ==== ================= 2012 8 2011 62 2010 9 2009 18 2008 17 so I propose moving the docstrings to the C code, and perhaps leaving 'add_newdocs' there, but only used when testing new docstrings. 2) Don't optimistically assume that all submodules are needed. 
For example, some current code uses >>> import numpy >>> numpy.fft.ifft (See a real-world example at http://stackoverflow.com/questions/10222812/python-numpy-fft-and-inverse-fft ) IMO, this optimizes the needs of the interactive shell NumPy author over the needs of the many-fold more people who don't spend their time in the REPL and/or don't need those extra features added to every NumPy startup. Please bear in mind that NumPy users of the first category will be active on the mailing list, go to SciPy conferences, etc. while members of the second category are less visible. I recognize that this is backwards incompatible, and will not change. However, I understand that "NumPy 2.0" is a glimmer in the future, which might be a natural place for a transition to the more standard Python style of from numpy import fft Personally, I think the documentation now (if it doesn't already) should transition to use this form. 3) Especially: don't always import 'numpy.testing' As far as I can tell, automatic import of this module is not needed, so is pure overhead for the vast majority of NumPy users. Unfortunately, there's a large number of user-facing 'test' and 'bench' bound methods acting as functions. from numpy.testing import Tester test = Tester().test bench = Tester().test They seem rather pointless to me but could be replaced with per-module functions like def test(...): from numpy.testing import Tester Tester().test(...) I have not worried about numpy import performance for 4 years. While I have been developing scientific software for 20 years, and in Python for 15 years, it has been in areas of biology and chemistry which don't use arrays. I use numpy for a day about once every two years, and so far I have had no reason to use scipy. This has changed. I talked with one of my clients last week. They (and I) use a chemistry toolkit called "RDKit". RDKit uses numpy as a way to store coordinate data for molecules. I checked with the package author and he confirms: yeah, it's just using the homogenous array most of the time. My client complained about RDKit's high startup cost, due to the NumPy dependency. On my laptop, with a warm disk cache, it take 0.119s to "import rdkit". On a cold cache it can take 3 seconds. On their cluster filesytem, with a cold cache, it can take over 10 seconds. (I told them about zipimport. They will be looking into that as a solution. However, it doesn't easily help the other people who use the RDKit toolkit.) With instrumentation I found that 0.083s of the 0.119s is spent loading numpy.core.multiarray. 
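(The instrumentation is nothing fancy -- roughly the following idea, wrapping __import__ so the first import of each module gets timed. This is an illustrative sketch, not the exact script I used, which also records which module triggered each import:)

import __builtin__
import time

_real_import = __builtin__.__import__
_times = {}

def _timed_import(name, *args, **kwargs):
    t0 = time.time()
    try:
        return _real_import(name, *args, **kwargs)
    finally:
        # cumulative: a module's time includes everything it imports,
        # which matches the numbers reported below
        _times[name] = _times.get(name, 0.0) + (time.time() - t0)

__builtin__.__import__ = _timed_import

import rdkit  # the package being investigated

for secs, name in sorted(((t, n) for n, t in _times.items()), reverse=True)[:20]:
    print "%.3f %s" % (secs, name)
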
The slowest module import times are listed here, with the cumulative time for each module and the name of the (first) importing parent in parentheses: 0.119 rdkit 0.089 rdchem (pyPgSQL) 0.083 numpy.core.multiarray (rdchem) 0.038 add_newdocs (numpy.core.multiarray) 0.032 numpy.lib (add_newdocs) 0.023 type_check (numpy.lib) 0.023 numpy.core.numeric (type_check) 0.012 numpy.testing (numpy.core.numeric) 0.010 unittest (numpy.testing) 0.008 cDataStructs (pyPgSQL) 0.007 random (numpy.core.multiarray) 0.007 mtrand (random) 0.006 case (unittest) 0.005 rdmolfiles (pyPgSQL) 0.005 rdmolops (pyPgSQL) 0.005 difflib (case) 0.005 chebyshev (numpy.core.multiarray) 0.004 hermite (numpy.core.multiarray) 0.004 hermite_e (numpy.core.multiarray) 0.004 laguerre (numpy.core.multiarray) These timing were on a MacBook Pro I bought this year, using The minimal cheminformatics program is % time python -c 'from rdkit import Chem; print Chem.MolToSmiles(Chem.MolFromSmiles("OCC"))' CCO 0.126u 0.035s 0:00.16 93.7% 0+0k 0+0io 0pf+0w (This chemical structure doesn't contain coordinates, so numpy is pure overhead. However, other formats do contain 2D or 3D coordinates. None need hermite or other polynomials, which together are 10% of the wall-clock time.) With a hot disk cache, 1/2 of the time is spent importing. With a cold cache, this is worse. (Eg, trying again now, it takes 1.955 seconds to "import rdkit." Trying much later, it takes 1.17 seconds to "import numpy". My web browser windows and tabs have filled most of memory.) Real code is of course more complex than this trivial bit of code. But on the other hand, the typical development cycle is to write the code to work for one compound, get that working, and then run it on thousands of compounds. The typical algorithm will be less than 1 second long, so during early development it's obvious that much of the run-time is dominated by the import time startup. My hope is to get the single-compound time down to under 0.1 seconds, or about 40% faster than it is now. Below the 0.1 second threshold, human factors studies show that people consider the time to be "instantaneous." I do not think I'll be able to shave of the full 0.06s which I want, but I can get close. If I can figure out how to get rid of add_newdocs (0.038s) and numpy.testing (0.012s) then I'll end up removing 0.05s. If I can remove automatic inclusion of the polynomial modules then I'm well into my goal. What can be done to make these changes to NumPy? What are the objections to my providing an updated set of patches removing add_newdocs and numpy.testing ? Cheers, Andrew dalke at dalkescientific.com From fperez.net at gmail.com Mon Jul 2 15:26:56 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 2 Jul 2012 12:26:56 -0700 Subject: [Numpy-discussion] Fwd: an interesting single-file, cross-platform Python deployment tool. Message-ID: Hi all, sorry for the slightly off-topic post, but I know that in our community many people often struggle with deployment issues (to colleagues, to experimental/hardware control machines, to one-off test machines, ...). I just stumbled upon this announcement by accident, and figured it might prove useful. If anyone happens to try it out and has any feedback on whether it works well or not, I suspect I wouldn't be the only one interested. Cheers, f On Mon, 02 Jul 2012 15:52:20 +0200, eGenix Team: M.-A. 
Lemburg wrote: > ________________________________________________________________________ > ANNOUNCING > > eGenix PyRun - One file Python Runtime > > Version 1.0.0 > > > An easy-to-use single file relocatable Python run-time - > available for Windows, Mac OS X and Unix platforms > > > This announcement is also available on our web-site for online reading: > http://www.egenix.com/company/news/eGenix-PyRun-1.0.0.html > > ________________________________________________________________________ > INTRODUCTION > > Our new eGenix PyRun combines a Python interpreter with an almost > complete Python standard library into a single easy-to-use executable, > that does not require a system wide installation and is fully > relocatable. > > eGenix PyRun's executable only needs 12MB, but still supports most > Python application and scripts - and it can be further compressed to > 3-4MB using gzexe or upx. > > Compared to a regular Python installation of typically 100MB on disk, > this makes eGenix PyRun ideal for applications and scripts that need to > be distributed to many target machines, client installations or > customers. > > It makes "installing" Python on a Unix based system as simple as copying > a single file. > > http://www.egenix.com/products/python/PyRun/ > > ________________________________________________________________________ > NEWS > > This is the first public release of eGenix PyRun. We have been using the > product internally in our mxODBC Connect Server since 2008 with great > success and have now extracted it into a stand-alone open-source > product. > > We provide both the source archive to build your own eGenix PyRun, as > well as provide pre-compiled binaries for Linux, > FreeBSD and Mac OS X, for the resp. 32- and 64-bit platforms. > > Presentation at EuroPython 2012 ------------------------------- > > Marc-Andr?, CEO of eGenix, will be giving a presentation about eGenix > PyRun at EuroPython 2012 in Florence, Italy on Wednesday, > July 4th in the Room Tagliatelle. > > He will also be available during the conference to answer questions. > > ________________________________________________________________________ > DOWNLOADS > > The download archives and instructions for installing the product can be > found at: > > http://www.egenix.com/products/python/PyRun/ > > _______________________________________________________________________ > SUPPORT > > Commercial support for these packages is available from eGenix.com. > Please see > > http://www.egenix.com/services/support/ > > for details about our support offerings. > > ________________________________________________________________________ > MORE INFORMATION > > For more information about eGenix PyRun, licensing and download > instructions, please visit our web-site or write to sales at egenix.com. > > Enjoy, > -- > Marc-Andre Lemburg eGenix.com > > Professional Python Services directly from the Source >>>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ > ________________________________________________________________________ > > ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: > > > eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 > D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg > Registered at Amtsgericht Duesseldorf: HRB 46611 > http://www.egenix.com/company/contact/ From klonuo at gmail.com Mon Jul 2 16:11:31 2012 From: klonuo at gmail.com (klo uo) Date: Mon, 2 Jul 2012 22:11:31 +0200 Subject: [Numpy-discussion] Fwd: an interesting single-file, cross-platform Python deployment tool. In-Reply-To: References: Message-ID: On Mon, Jul 2, 2012 at 9:26 PM, Fernando Perez wrote: >> ANNOUNCING >> >> eGenix PyRun - One file Python Runtime >> >> Version 1.0.0 >> >> >> An easy-to-use single file relocatable Python run-time - >> available for Windows, Mac OS X and Unix platforms Quote from http://www.egenix.com/products/python/PyRun/: Windows (x86 - 32/64-bit): eGenix PyRun does not support Windows in the current release. Otherwise seems like interesting for those who need/want to demonstrate some Python approach, in non-Pythonic environment (I don't know of such environment except default Windows). I've heard about Portable Python, has anyone any experience with it, or what's the difference between it and PyRun, in regards of possible NumPy usage? From fperez.net at gmail.com Mon Jul 2 16:16:26 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 2 Jul 2012 13:16:26 -0700 Subject: [Numpy-discussion] Multivariate hypergeometric distribution? Message-ID: Hi all, in recent work with a colleague, the need came up for a multivariate hypergeometric sampler; I had a look in the numpy code and saw we have the bivariate version, but not the multivariate one. I had a look at the code in scipy.stats.distributions, and it doesn't look too difficult to add a proper multivariate hypergeometric by extending the bivariate code, with one important caveat: the hard part is the implementation of the actual discrete hypergeometric sampler, which lives inside of numpy/random/mtrand/distributions.c: https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions.c#L743 That code is hand-written C, and it only works for the bivariate case right now. It doesn't look terribly difficult to extend, but it will certainly take a bit of care and testing to ensure all edge cases are handled correctly. Does anyone happen to have that implemented lying around, in a form that would be easy to merge to add this capability to numpy? Thanks, f From cournape at gmail.com Mon Jul 2 16:33:21 2012 From: cournape at gmail.com (David Cournapeau) Date: Mon, 2 Jul 2012 21:33:21 +0100 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> Message-ID: On Mon, Jul 2, 2012 at 8:17 PM, Andrew Dalke wrote: > In this email I propose a few changes which I think are minor > and which don't really affect the external NumPy API but which > I think could improve the "import numpy" performance by at > least 40%. This affects me because I and my clients use a > chemistry toolkit which uses only NumPy arrays, and where > we run short programs often on the command-line. > > > In July of 2008 I started a thread about how "import numpy" > was noticeably slow for one of my customers. They had > chemical analysis software, often even run on a single > molecular structure using command-line tools, and the > several invocations with 0.1 seconds overhead was one of > the dominant costs even when numpy wasn't needed. > > I fixed most of their problems by deferring numpy imports > until needed. 
I remember well the Steve Jobs anecdote at > http://folklore.org/StoryView.py?project=Macintosh&story=Saving_Lives.txt > and spent another day of my time in 2008 to identify the > parts of the numpy import sequence which seemed excessive. > I managed to get the import time down from 0.21 seconds to > 0.08 seconds. I will answer to your other remarks later, but 0.21 sec to import numpy is very slow, especially on a recent computer. It is 0.095 sec on my mac, and 0.075 sec on a linux VM on the same computer (both hot cache of course). importing multiarray.so only is negligible for me (i.e. difference between python -c "import multiarray" and python -c "" is statistically insignificant). I would check external factors, like the size of your sys.path as well. David From njs at pobox.com Mon Jul 2 16:34:48 2012 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 2 Jul 2012 21:34:48 +0100 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> Message-ID: On Mon, Jul 2, 2012 at 8:17 PM, Andrew Dalke wrote: > In this email I propose a few changes which I think are minor > and which don't really affect the external NumPy API but which > I think could improve the "import numpy" performance by at > least 40%. This affects me because I and my clients use a > chemistry toolkit which uses only NumPy arrays, and where > we run short programs often on the command-line. > > > In July of 2008 I started a thread about how "import numpy" > was noticeably slow for one of my customers. They had > chemical analysis software, often even run on a single > molecular structure using command-line tools, and the > several invocations with 0.1 seconds overhead was one of > the dominant costs even when numpy wasn't needed. > > I fixed most of their problems by deferring numpy imports > until needed. I remember well the Steve Jobs anecdote at > http://folklore.org/StoryView.py?project=Macintosh&story=Saving_Lives.txt > and spent another day of my time in 2008 to identify the > parts of the numpy import sequence which seemed excessive. > I managed to get the import time down from 0.21 seconds to > 0.08 seconds. > > Very little of that made it into NumPy. > > > The three biggest changes I would like are: > > 1) remove "add_newdocs" and put the docstrings in the C code > 'add_newdocs' still needs to be there, > > The code says: > > # This is only meant to add docs to objects defined in C-extension modules. > # The purpose is to allow easier editing of the docstrings without > # requiring a re-compile. > > However, the change log shows that there are relatively few commits > to this module > > Year Number of commits > ==== ================= > 2012 8 > 2011 62 > 2010 9 > 2009 18 > 2008 17 > > so I propose moving the docstrings to the C code, and perhaps > leaving 'add_newdocs' there, but only used when testing new > docstrings. I don't have any opinion on how acceptable this would be, but I also don't see a benchmark showing how much this would help? > 2) Don't optimistically assume that all submodules are > needed. 
For example, some current code uses > >>>> import numpy >>>> numpy.fft.ifft > > > (See a real-world example at > http://stackoverflow.com/questions/10222812/python-numpy-fft-and-inverse-fft > ) > > IMO, this optimizes the needs of the interactive shell > NumPy author over the needs of the many-fold more people > who don't spend their time in the REPL and/or don't need > those extra features added to every NumPy startup. Please > bear in mind that NumPy users of the first category will > be active on the mailing list, go to SciPy conferences, > etc. while members of the second category are less visible. > > I recognize that this is backwards incompatible, and will > not change. However, I understand that "NumPy 2.0" is a > glimmer in the future, which might be a natural place for > a transition to the more standard Python style of > > from numpy import fft > > Personally, I think the documentation now (if it doesn't > already) should transition to use this form. I think this ship has sailed, but it'd be worth looking into lazy importing, where 'numpy.fft' isn't actually imported until someone starts using it. There are a bunch of libraries that do this, and one would have to fiddle to get compatibility with all the different python versions and make sure you're not killing performance (might have to be in C) but something along the lines of class _FFTModule(object): def __getattribute__(self, name): mod = importlib.import_module("numpy.fft") _FFTModule.__getattribute__ = mod.__getattribute__ return getattr(mod, name) fft = _FFTModule() > 3) Especially: don't always import 'numpy.testing' > > As far as I can tell, automatic import of this module > is not needed, so is pure overhead for the vast majority > of NumPy users. Unfortunately, there's a large number > of user-facing 'test' and 'bench' bound methods acting > as functions. > > from numpy.testing import Tester > test = Tester().test > bench = Tester().test > > They seem rather pointless to me but could be replaced > with per-module functions like > > def test(...): > from numpy.testing import Tester > Tester().test(...) > > > > > I have not worried about numpy import performance for > 4 years. While I have been developing scientific software > for 20 years, and in Python for 15 years, it has been > in areas of biology and chemistry which don't use arrays. > I use numpy for a day about once every two years, and > so far I have had no reason to use scipy. > > > This has changed. > > I talked with one of my clients last week. They (and I) > use a chemistry toolkit called "RDKit". RDKit uses > numpy as a way to store coordinate data for molecules. > I checked with the package author and he confirms: > > yeah, it's just using the homogenous array most of the time. > > My client complained about RDKit's high startup cost, > due to the NumPy dependency. On my laptop, with a warm > disk cache, it take 0.119s to "import rdkit". On a cold > cache it can take 3 seconds. On their cluster filesytem, > with a cold cache, it can take over 10 seconds. > > (I told them about zipimport. They will be looking into > that as a solution. However, it doesn't easily help the other > people who use the RDKit toolkit.) > > > With instrumentation I found that 0.083s of the 0.119s > is spent loading numpy.core.multiarray. Sounds like this would be useful to profile. I have no idea why importing this would need so much time, since the actual import just requires filling in a few C structs. Could be the linker, could be lots of things, profiling is useful. 
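Something along these lines is usually enough to tell whether the time is going into Python-level work (add_newdocs, the polynomial modules, ...) or into the dynamic linker, which won't show up in a Python-level profile at all -- just a rough starting point, not a careful benchmark:

import cProfile
import pstats

# Run this in a fresh interpreter so the import actually happens.
cProfile.run("import numpy.core.multiarray", "multiarray_import.prof")
stats = pstats.Stats("multiarray_import.prof")
stats.sort_stats("cumulative").print_stats(20)
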
> The slowest module > import times are listed here, with the cumulative time for > each module and the name of the (first) importing parent > in parentheses: > > 0.119 rdkit > 0.089 rdchem (pyPgSQL) > 0.083 numpy.core.multiarray (rdchem) > 0.038 add_newdocs (numpy.core.multiarray) > 0.032 numpy.lib (add_newdocs) > 0.023 type_check (numpy.lib) > 0.023 numpy.core.numeric (type_check) > 0.012 numpy.testing (numpy.core.numeric) > 0.010 unittest (numpy.testing) > 0.008 cDataStructs (pyPgSQL) > 0.007 random (numpy.core.multiarray) > 0.007 mtrand (random) > 0.006 case (unittest) > 0.005 rdmolfiles (pyPgSQL) > 0.005 rdmolops (pyPgSQL) > 0.005 difflib (case) > 0.005 chebyshev (numpy.core.multiarray) > 0.004 hermite (numpy.core.multiarray) > 0.004 hermite_e (numpy.core.multiarray) > 0.004 laguerre (numpy.core.multiarray) > > These timing were on a MacBook Pro I bought this year, using > > > > > The minimal cheminformatics program is > > % time python -c 'from rdkit import Chem; print Chem.MolToSmiles(Chem.MolFromSmiles("OCC"))' > CCO > 0.126u 0.035s 0:00.16 93.7% 0+0k 0+0io 0pf+0w > > (This chemical structure doesn't contain coordinates, > so numpy is pure overhead. However, other formats do > contain 2D or 3D coordinates. None need hermite or > other polynomials, which together are 10% of the > wall-clock time.) > > With a hot disk cache, 1/2 of the time is spent importing. > With a cold cache, this is worse. (Eg, trying again now, it > takes 1.955 seconds to "import rdkit." Trying much later, > it takes 1.17 seconds to "import numpy". My web browser > windows and tabs have filled most of memory.) If you want reproducible numbers for cold cache load times you should drop caches explicitly -- I think on OS X you use 'purge' to do this. > Real code is of course more complex than this trivial > bit of code. But on the other hand, the typical > development cycle is to write the code to work for one > compound, get that working, and then run it on thousands > of compounds. The typical algorithm will be less than 1 > second long, so during early development it's obvious that > much of the run-time is dominated by the import time > startup. > > My hope is to get the single-compound time down to > under 0.1 seconds, or about 40% faster than it is now. > Below the 0.1 second threshold, human factors studies > show that people consider the time to be "instantaneous." > > I do not think I'll be able to shave of the full 0.06s > which I want, but I can get close. If I can figure out > how to get rid of add_newdocs (0.038s) and numpy.testing > (0.012s) then I'll end up removing 0.05s. If I can remove > automatic inclusion of the polynomial modules then I'm > well into my goal. > > What can be done to make these changes to NumPy? What > are the objections to my providing an updated set of > patches removing add_newdocs and numpy.testing ? 
> > Cheers, > > > Andrew > dalke at dalkescientific.com > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From njs at pobox.com Mon Jul 2 16:40:10 2012 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 2 Jul 2012 21:40:10 +0100 Subject: [Numpy-discussion] Change in memmap behaviour In-Reply-To: <50B02802-AF3A-4D88-8875-AF8298FA55FD@gmail.com> References: <50B02802-AF3A-4D88-8875-AF8298FA55FD@gmail.com> Message-ID: On Mon, Jul 2, 2012 at 6:54 PM, Sveinung Gundersen wrote: > [snip] > > > > Your actual memory usage may not have increased as much as you think, > since memmap objects don't necessarily take much memory -- it sounds > like you're leaking virtual memory, but your resident set size > shouldn't go up as much. > > > As I understand it, memmap objects retain the contents of the memmap in > memory after it has been read the first time (in a lazy manner). Thus, when > reading a slice of a 24GB file, only that part recides in memory. Our system > reads a slice of a memmap, calculates something (say, the sum), and then > deletes the memmap. It then loops through this for consequitive slices, > retaining a low memory usage. Consider the following code: > > import numpy as np > res = [] > vecLen = 3095677412 > for i in xrange(vecLen/10**8+1): > x = i * 10**8 > y = min((i+1) * 10**8, vecLen) > res.append(np.memmap('val.float64', dtype='float64')[x:y].sum()) > > The memory usage of this code on a 24GB file (one value for each nucleotide > in the human DNA!) is 23g resident memory after the loop is finished (not > 24g for some reason..). > > Running the same code on 1.5.1rc1 gives a resident memory of 23m after the > loop. Your memory measurement tools are misleading you. The same memory is resident in both cases, just in one case your tools say it is operating system disk cache (and not attributed to your app), and in the other case that same memory, treated in the same way by the OS, is shown as part of your app's resident memory. Virtual memory is confusing... > That said, this is clearly a bug, and it's even worse than you mention > -- *all* operations on memmap arrays are holding onto references to > the original mmap object, regardless of whether they share any memory: > > a = np.memmap("/etc/passwd", np.uint8, "r") > > # arithmetic > > (a + 10)._mmap is a._mmap > > True > # fancy indexing (doesn't return a view!) > > a[[1, 2, 3]]._mmap is a._mmap > > True > > a.sum()._mmap is a._mmap > > True > Really, only slicing should be returning a np.memmap object at all. > Unfortunately, it is currently impossible to create an ndarray > subclass that returns base-class ndarrays from any operations -- > __array_finalize__() has no way to do this. And this is the third > ndarray subclass in a row that I've looked at that wanted to be able > to do this, so I guess maybe it's something we should implement... > > In the short term, the numpy-upstream fix is to change > numpy.core.memmap:memmap.__array_finalize__ so that it only copies > over the ._mmap attribute of its parent if np.may_share_memory(self, > parent) is True. Patches gratefully accepted ;-) > > > Great! Any idea on whether such a patch may be included in 1.7? Not really, if I or you or someone else gets inspired to take the time to write a patch soon then it will be, otherwise not... 
-N From ben.root at ou.edu Mon Jul 2 16:43:09 2012 From: ben.root at ou.edu (Benjamin Root) Date: Mon, 2 Jul 2012 16:43:09 -0400 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> Message-ID: On Mon, Jul 2, 2012 at 4:34 PM, Nathaniel Smith wrote: > On Mon, Jul 2, 2012 at 8:17 PM, Andrew Dalke > wrote: > > In this email I propose a few changes which I think are minor > > and which don't really affect the external NumPy API but which > > I think could improve the "import numpy" performance by at > > least 40%. This affects me because I and my clients use a > > chemistry toolkit which uses only NumPy arrays, and where > > we run short programs often on the command-line. > > > > > > In July of 2008 I started a thread about how "import numpy" > > was noticeably slow for one of my customers. They had > > chemical analysis software, often even run on a single > > molecular structure using command-line tools, and the > > several invocations with 0.1 seconds overhead was one of > > the dominant costs even when numpy wasn't needed. > > > > I fixed most of their problems by deferring numpy imports > > until needed. I remember well the Steve Jobs anecdote at > > > http://folklore.org/StoryView.py?project=Macintosh&story=Saving_Lives.txt > > and spent another day of my time in 2008 to identify the > > parts of the numpy import sequence which seemed excessive. > > I managed to get the import time down from 0.21 seconds to > > 0.08 seconds. > > > > Very little of that made it into NumPy. > > > > > > The three biggest changes I would like are: > > > > 1) remove "add_newdocs" and put the docstrings in the C code > > 'add_newdocs' still needs to be there, > > > > The code says: > > > > # This is only meant to add docs to objects defined in C-extension > modules. > > # The purpose is to allow easier editing of the docstrings without > > # requiring a re-compile. > > > > However, the change log shows that there are relatively few commits > > to this module > > > > Year Number of commits > > ==== ================= > > 2012 8 > > 2011 62 > > 2010 9 > > 2009 18 > > 2008 17 > > > > so I propose moving the docstrings to the C code, and perhaps > > leaving 'add_newdocs' there, but only used when testing new > > docstrings. > > I don't have any opinion on how acceptable this would be, but I also > don't see a benchmark showing how much this would help? > > > 2) Don't optimistically assume that all submodules are > > needed. For example, some current code uses > > > >>>> import numpy > >>>> numpy.fft.ifft > > > > > > (See a real-world example at > > > http://stackoverflow.com/questions/10222812/python-numpy-fft-and-inverse-fft > > ) > > > > IMO, this optimizes the needs of the interactive shell > > NumPy author over the needs of the many-fold more people > > who don't spend their time in the REPL and/or don't need > > those extra features added to every NumPy startup. Please > > bear in mind that NumPy users of the first category will > > be active on the mailing list, go to SciPy conferences, > > etc. while members of the second category are less visible. > > > > I recognize that this is backwards incompatible, and will > > not change. 
However, I understand that "NumPy 2.0" is a > > glimmer in the future, which might be a natural place for > > a transition to the more standard Python style of > > > > from numpy import fft > > > > Personally, I think the documentation now (if it doesn't > > already) should transition to use this form. > > I think this ship has sailed, but it'd be worth looking into lazy > importing, where 'numpy.fft' isn't actually imported until someone > starts using it. There are a bunch of libraries that do this, and one > would have to fiddle to get compatibility with all the different > python versions and make sure you're not killing performance (might > have to be in C) but something along the lines of > > class _FFTModule(object): > def __getattribute__(self, name): > mod = importlib.import_module("numpy.fft") > _FFTModule.__getattribute__ = mod.__getattribute__ > return getattr(mod, name) > fft = _FFTModule() > > Not sure how this would impact projects like ipython that does tab-completion support, but I know that that would drive me nuts in my basic tab-completion setup I have for my regular python terminal. Of course, in the grand scheme of things, that really isn't all that important, I don't think. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Jul 2 17:06:31 2012 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 2 Jul 2012 22:06:31 +0100 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> Message-ID: On Mon, Jul 2, 2012 at 9:43 PM, Benjamin Root wrote: > > On Mon, Jul 2, 2012 at 4:34 PM, Nathaniel Smith wrote: >> I think this ship has sailed, but it'd be worth looking into lazy >> importing, where 'numpy.fft' isn't actually imported until someone >> starts using it. There are a bunch of libraries that do this, and one >> would have to fiddle to get compatibility with all the different >> python versions and make sure you're not killing performance (might >> have to be in C) but something along the lines of >> >> class _FFTModule(object): >> ? def __getattribute__(self, name): >> ? ? mod = importlib.import_module("numpy.fft") >> ? ? _FFTModule.__getattribute__ = mod.__getattribute__ >> ? ? return getattr(mod, name) >> fft = _FFTModule() > > Not sure how this would impact projects like ipython that does > tab-completion support, but I know that that would drive me nuts in my basic > tab-completion setup I have for my regular python terminal.? Of course, in > the grand scheme of things, that really isn't all that important, I don't > think. We used to do it for scipy. It did interfere with tab completion. It did drive many people nuts. -- Robert Kern From pav at iki.fi Mon Jul 2 17:24:29 2012 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 02 Jul 2012 23:24:29 +0200 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> Message-ID: 02.07.2012 21:17, Andrew Dalke kirjoitti: [clip] > 1) remove "add_newdocs" and put the docstrings in the C code > 'add_newdocs' still needs to be there, The docstrings need to be in an easily parseable format, because of the online documentation editor. Keeping the current format may be the easiest as that already works. Moving them in the middle of other C code won't do, but a header file e.g. generated at build-time should work. 
This is how it's currently done with the ufunc docstrings, and it should work also for everything else. The commit statistics for add_newdocs.py are somewhat misleading --- since 2008, many of the documentation edits went in the online way, and these only show up in a single large commit, usually before releases. -- Pauli Virtanen From dalke at dalkescientific.com Mon Jul 2 17:26:16 2012 From: dalke at dalkescientific.com (Andrew Dalke) Date: Mon, 2 Jul 2012 23:26:16 +0200 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> Message-ID: On Jul 2, 2012, at 10:33 PM, David Cournapeau wrote: > On Mon, Jul 2, 2012 at 8:17 PM, Andrew Dalke wrote: >> In July of 2008 I started a thread about how "import numpy" >> was noticeably slow for one of my customers. ... >> I managed to get the import time down from 0.21 seconds to >> 0.08 seconds. > > I will answer to your other remarks later, but 0.21 sec to import > numpy is very slow, especially on a recent computer. It is 0.095 sec > on my mac, and 0.075 sec on a linux VM on the same computer (both hot > cache of course). That quote was historical review from 4 years ago. I described the problems I had then, the work-around solution I implemented, and my additional work to see if I could identify ways which would have kept me from needing to find a work-around solution. I then described why I have not worked on this problem for the last four years, and what has changed to make me interested in it again. That included current details, such as how "import numpy" with a warm cache takes 0.083 seconds on my Mac. > importing multiarray.so only is negligible for me (i.e. difference > between python -c "import multiarray" and python -c "" is > statistically insignificant). The NumPy initialization is being done in C++ code through "import_array()". That C function does (among other things) PyObject *numpy = PyImport_ImportModule("numpy.core.multiarray"); so the relevant timing test is more likely: % time python -c 'import numpy.core.multiarray' 0.086u 0.031s 0:00.12 91.6% 0+0k 0+0io 0pf+0w % time python -c 'import numpy.core.multiarray' 0.083u 0.031s 0:00.11 100.0% 0+0k 0+0io 0pf+0w % time python -c 'import numpy.core.multiarray' 0.083u 0.030s 0:00.12 91.6% 0+0k 0+0io 0pf+0w I do not know how to run the timing test you did, as I get: % python -c "import multiarray" Traceback (most recent call last): File "", line 1, in ImportError: No module named multi array > I would check external factors, like the size of your sys.path as well. I have checked that, and inspected the output of python -v -v. Andrew dalke at dalkescientific.com From andrea.gavana at gmail.com Mon Jul 2 17:27:24 2012 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Mon, 2 Jul 2012 23:27:24 +0200 Subject: [Numpy-discussion] Fwd: an interesting single-file, cross-platform Python deployment tool. In-Reply-To: References: Message-ID: On 2 July 2012 22:11, klo uo wrote: > On Mon, Jul 2, 2012 at 9:26 PM, Fernando Perez wrote: >>> ANNOUNCING >>> >>> ? ? ? ? ? ? ? ? ?eGenix PyRun - One file Python Runtime >>> >>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?Version 1.0.0 >>> >>> >>> ? ? ? ? ? An easy-to-use single file relocatable Python run-time - >>> ? ? ? ? ? ? available for Windows, Mac OS X and Unix platforms > > Quote from http://www.egenix.com/products/python/PyRun/: > > ? ? Windows (x86 - 32/64-bit): eGenix PyRun does not support Windows > in the current release. 
> > > Otherwise seems like interesting for those who need/want to > demonstrate some Python approach, in non-Pythonic environment (I don't > know of such environment except default Windows). > > I've heard about Portable Python, has anyone any experience with it, > or what's the difference between it and PyRun, in regards of possible > NumPy usage? I have used PortablePython without any problem on Windows, combining wxPython, numpy, matplotlib and other stuff in more than one application. I can't comment on numpy speed or benchmarks, as I couldn't care less about that. But PP gave me a way to use Python (and a lot of 3rd party libraries) in an armour-plated Windows environment where every installation is forbidden. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ From fperez.net at gmail.com Mon Jul 2 17:38:39 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 2 Jul 2012 14:38:39 -0700 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> Message-ID: On Mon, Jul 2, 2012 at 2:26 PM, Andrew Dalke wrote: > > so the relevant timing test is more likely: > > % time python -c 'import numpy.core.multiarray' > 0.086u 0.031s 0:00.12 91.6% 0+0k 0+0io 0pf+0w No, that's the wrong thing to test, because it effectively amounts to 'import numpy', sicne the numpy __init__ file is still executed. As David indicated, you must import multarray.so by itself. > I do not know how to run the timing test you did, as I get: > > % python -c "import multiarray" > Traceback (most recent call last): > File "", line 1, in > ImportError: No module named multi array You just have to cd to the directory where multiarray.so lives. I get the same numbers as David: longs[core]> time python -c '' real 0m0.038s user 0m0.032s sys 0m0.000s longs[core]> time python -c 'import multiarray' real 0m0.035s user 0m0.020s sys 0m0.012s longs[core]> pwd /usr/lib/python2.7/dist-packages/numpy/core Cheers, f From dalke at dalkescientific.com Mon Jul 2 17:44:12 2012 From: dalke at dalkescientific.com (Andrew Dalke) Date: Mon, 2 Jul 2012 23:44:12 +0200 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> Message-ID: On Jul 2, 2012, at 10:34 PM, Nathaniel Smith wrote: > I don't have any opinion on how acceptable this would be, but I also > don't see a benchmark showing how much this would help? The profile output was lower in that email. The relevant line is 0.038 add_newdocs (numpy.core.multiarray) This says that 'add_newdocs', which is imported from numpy.core.multiarray (though there may be other importers) takes 0.038 seconds to go through __import__, including all of its children module imports. I have attached my import profile script. It has only minor changes since the one I posted on this list 4 years ago. Its output is here, showing the import dependency tree first, and then the list of slowest modules to import. 
== Tree == rdkit: 0.150 (None) os: 0.000 (rdkit) sys: 0.000 (rdkit) exceptions: 0.000 (rdkit) sqlite3: 0.003 (pyPgSQL) dbapi2: 0.002 (sqlite3) datetime: 0.001 (dbapi2) time: 0.000 (dbapi2) _sqlite3: 0.001 (dbapi2) cDataStructs: 0.008 (pyPgSQL) rdkit.Geometry: 0.003 (pyPgSQL) rdGeometry: 0.003 (rdkit.Geometry) PeriodicTable: 0.002 (pyPgSQL) re: 0.000 (PeriodicTable) rdchem: 0.116 (pyPgSQL) numpy.core.multiarray: 0.109 (rdchem) numpy.__config__: 0.000 (numpy.core.multiarray) version: 0.000 (numpy.core.multiarray) _import_tools: 0.000 (numpy.core.multiarray) add_newdocs: 0.067 (numpy.core.multiarray) numpy.lib: 0.061 (add_newdocs) info: 0.000 (numpy.lib) numpy.version: 0.000 (numpy.lib) type_check: 0.053 (numpy.lib) numpy.core.numeric: 0.053 (type_check) multiarray: 0.001 (numpy.core.numeric) umath: 0.000 (numpy.core.numeric) _internal: 0.004 (numpy.core.numeric) warnings: 0.000 (_internal) numpy.compat: 0.000 (_internal) _inspect: 0.000 (numpy.compat) types: 0.000 (_inspect) py3k: 0.000 (numpy.compat) numerictypes: 0.001 (numpy.core.numeric) __builtin__: 0.000 (numerictypes) _sort: 0.000 (numpy.core.numeric) numeric: 0.003 (numpy.core.numeric) _dotblas: 0.000 (numeric) arrayprint: 0.001 (numeric) fromnumeric: 0.000 (arrayprint) cPickle: 0.001 (numeric) copy_reg: 0.000 (cPickle) cStringIO: 0.000 (cPickle) defchararray: 0.001 (numpy.core.numeric) numpy: 0.000 (defchararray) records: 0.000 (numpy.core.numeric) memmap: 0.000 (numpy.core.numeric) scalarmath: 0.000 (numpy.core.numeric) numpy.core.umath: 0.000 (scalarmath) function_base: 0.000 (numpy.core.numeric) machar: 0.000 (numpy.core.numeric) numpy.core.fromnumeric: 0.000 (machar) getlimits: 0.000 (numpy.core.numeric) shape_base: 0.000 (numpy.core.numeric) numpy.testing: 0.041 (numpy.core.numeric) unittest: 0.039 (numpy.testing) result: 0.002 (unittest) traceback: 0.000 (result) linecache: 0.000 (traceback) StringIO: 0.000 (result) errno: 0.000 (StringIO) : 0.000 (result) functools: 0.001 (result) _functools: 0.000 (functools) case: 0.035 (unittest) difflib: 0.034 (case) heapq: 0.031 (difflib) itertools: 0.029 (heapq) operator: 0.001 (heapq) bisect: 0.001 (heapq) _bisect: 0.000 (bisect) _heapq: 0.000 (heapq) collections: 0.001 (difflib) _abcoll: 0.000 (collections) _collections: 0.000 (collections) keyword: 0.000 (collections) thread: 0.000 (collections) pprint: 0.000 (case) util: 0.000 (case) suite: 0.000 (unittest) loader: 0.001 (unittest) fnmatch: 0.000 (loader) main: 0.001 (unittest) signals: 0.001 (main) signal: 0.000 (signals) weakref: 0.001 (signals) UserDict: 0.000 (weakref) _weakref: 0.000 (weakref) _weakrefset: 0.000 (weakref) runner: 0.000 (unittest) decorators: 0.001 (numpy.testing) numpy.testing.utils: 0.001 (decorators) nosetester: 0.000 (numpy.testing.utils) utils: 0.000 (numpy.testing) numpytest: 0.000 (numpy.testing) ufunclike: 0.000 (type_check) index_tricks: 0.003 (numpy.lib) numpy.core.numerictypes: 0.000 (index_tricks) math: 0.000 (index_tricks) numpy.core: 0.000 (index_tricks) numpy.lib.twodim_base: 0.000 (index_tricks) _compiled_base: 0.000 (index_tricks) arraysetops: 0.001 (index_tricks) numpy.lib.utils: 0.001 (arraysetops) numpy.matrixlib: 0.001 (index_tricks) defmatrix: 0.001 (numpy.matrixlib) numpy.lib._compiled_base: 0.000 (index_tricks) stride_tricks: 0.000 (numpy.lib) twodim_base: 0.000 (numpy.lib) scimath: 0.000 (numpy.lib) numpy.lib.type_check: 0.000 (scimath) polynomial: 0.001 (numpy.lib) numpy.lib.function_base: 0.000 (polynomial) numpy.linalg: 0.001 (polynomial) linalg: 0.000 (numpy.linalg) 
numpy.matrixlib.defmatrix: 0.000 (linalg) npyio: 0.002 (numpy.lib) format: 0.000 (npyio) _datasource: 0.001 (npyio) shutil: 0.001 (_datasource) stat: 0.000 (shutil) os.path: 0.000 (shutil) pwd: 0.000 (shutil) grp: 0.000 (shutil) _iotools: 0.000 (npyio) financial: 0.000 (numpy.lib) arrayterator: 0.000 (numpy.lib) __future__: 0.000 (arrayterator) numpy.lib.index_tricks: 0.000 (add_newdocs) testing: 0.000 (numpy.core.multiarray) core: 0.000 (numpy.core.multiarray) compat: 0.000 (numpy.core.multiarray) lib: 0.000 (numpy.core.multiarray) fft: 0.001 (numpy.core.multiarray) fftpack: 0.000 (fft) fftpack_lite: 0.000 (fftpack) helper: 0.000 (fft) polyutils: 0.000 (numpy.core.multiarray) polytemplate: 0.002 (numpy.core.multiarray) string: 0.002 (polytemplate) strop: 0.000 (string) chebyshev: 0.005 (numpy.core.multiarray) legendre: 0.004 (numpy.core.multiarray) hermite: 0.004 (numpy.core.multiarray) hermite_e: 0.004 (numpy.core.multiarray) laguerre: 0.004 (numpy.core.multiarray) random: 0.007 (numpy.core.multiarray) mtrand: 0.006 (random) ctypeslib: 0.004 (numpy.core.multiarray) ctypes: 0.003 (ctypeslib) _ctypes: 0.001 (ctypes) struct: 0.001 (ctypes) _struct: 0.000 (struct) ctypes._endian: 0.000 (ctypes) numpy.core._internal: 0.000 (ctypeslib) ma: 0.003 (numpy.core.multiarray) extras: 0.001 (ma) matrixlib: 0.000 (numpy.core.multiarray) rdmolfiles: 0.005 (pyPgSQL) rdmolops: 0.006 (pyPgSQL) inchi: 0.000 (pyPgSQL) == Slowest (including children) == 0.150 rdkit (None) 0.116 rdchem (pyPgSQL) 0.109 numpy.core.multiarray (rdchem) 0.067 add_newdocs (numpy.core.multiarray) 0.061 numpy.lib (add_newdocs) 0.053 type_check (numpy.lib) 0.053 numpy.core.numeric (type_check) 0.041 numpy.testing (numpy.core.numeric) 0.039 unittest (numpy.testing) 0.035 case (unittest) 0.034 difflib (case) 0.031 heapq (difflib) 0.029 itertools (heapq) 0.008 cDataStructs (pyPgSQL) 0.007 random (numpy.core.multiarray) 0.006 mtrand (random) 0.006 rdmolops (pyPgSQL) 0.005 rdmolfiles (pyPgSQL) 0.005 chebyshev (numpy.core.multiarray) 0.004 hermite (numpy.core.multiarray) >> With instrumentation I found that 0.083s of the 0.119s >> is spent loading numpy.core.multiarray. > > Sounds like this would be useful to profile. I have no idea why > importing this would need so much time, since the actual import just > requires filling in a few C structs. Could be the linker, could be > lots of things, profiling is useful. I also have no idea why "import_array()" needs to include ~90 modules. My hope it to prune a good number of those. Cheers, Andrew dalke at dalkescientific.com -------------- next part -------------- A non-text attachment was scrubbed... Name: import_profile.py Type: text/x-python-script Size: 1168 bytes Desc: not available URL: From sveinugu at gmail.com Mon Jul 2 17:52:55 2012 From: sveinugu at gmail.com (Sveinung Gundersen) Date: Mon, 2 Jul 2012 23:52:55 +0200 Subject: [Numpy-discussion] Change in memmap behaviour In-Reply-To: References: <50B02802-AF3A-4D88-8875-AF8298FA55FD@gmail.com> Message-ID: On 2. juli 2012, at 22.40, Nathaniel Smith wrote: > On Mon, Jul 2, 2012 at 6:54 PM, Sveinung Gundersen wrote: >> [snip] >> >> >> >> Your actual memory usage may not have increased as much as you think, >> since memmap objects don't necessarily take much memory -- it sounds >> like you're leaking virtual memory, but your resident set size >> shouldn't go up as much. >> >> >> As I understand it, memmap objects retain the contents of the memmap in >> memory after it has been read the first time (in a lazy manner). 
Thus, when >> reading a slice of a 24GB file, only that part recides in memory. Our system >> reads a slice of a memmap, calculates something (say, the sum), and then >> deletes the memmap. It then loops through this for consequitive slices, >> retaining a low memory usage. Consider the following code: >> >> import numpy as np >> res = [] >> vecLen = 3095677412 >> for i in xrange(vecLen/10**8+1): >> x = i * 10**8 >> y = min((i+1) * 10**8, vecLen) >> res.append(np.memmap('val.float64', dtype='float64')[x:y].sum()) >> >> The memory usage of this code on a 24GB file (one value for each nucleotide >> in the human DNA!) is 23g resident memory after the loop is finished (not >> 24g for some reason..). >> >> Running the same code on 1.5.1rc1 gives a resident memory of 23m after the >> loop. > > Your memory measurement tools are misleading you. The same memory is > resident in both cases, just in one case your tools say it is > operating system disk cache (and not attributed to your app), and in > the other case that same memory, treated in the same way by the OS, is > shown as part of your app's resident memory. Virtual memory is > confusing... But the crucial difference is perhaps that the disk cache can be cleared by the OS if needed, but not the application memory in the same way, which must be swapped to disk? Or am I still confused? (snip) >> >> Great! Any idea on whether such a patch may be included in 1.7? > > Not really, if I or you or someone else gets inspired to take the time > to write a patch soon then it will be, otherwise not... > > -N I have now tried to add a patch, in the way you proposed, but I may have gotten it wrong.. http://projects.scipy.org/numpy/ticket/2179 Sveinung From njs at pobox.com Mon Jul 2 17:59:41 2012 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 2 Jul 2012 22:59:41 +0100 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> Message-ID: On Mon, Jul 2, 2012 at 10:06 PM, Robert Kern wrote: > On Mon, Jul 2, 2012 at 9:43 PM, Benjamin Root wrote: >> >> On Mon, Jul 2, 2012 at 4:34 PM, Nathaniel Smith wrote: > >>> I think this ship has sailed, but it'd be worth looking into lazy >>> importing, where 'numpy.fft' isn't actually imported until someone >>> starts using it. There are a bunch of libraries that do this, and one >>> would have to fiddle to get compatibility with all the different >>> python versions and make sure you're not killing performance (might >>> have to be in C) but something along the lines of >>> >>> class _FFTModule(object): >>> def __getattribute__(self, name): >>> mod = importlib.import_module("numpy.fft") >>> _FFTModule.__getattribute__ = mod.__getattribute__ >>> return getattr(mod, name) >>> fft = _FFTModule() >> >> Not sure how this would impact projects like ipython that does >> tab-completion support, but I know that that would drive me nuts in my basic >> tab-completion setup I have for my regular python terminal. Of course, in >> the grand scheme of things, that really isn't all that important, I don't >> think. > > We used to do it for scipy. It did interfere with tab completion. It > did drive many people nuts. Sounds like a bug in your old code, or else the REPLs have gotten better? I just pasted the above code into both ipython and python prompts, and typing 'fft.' worked fine in both cases. dir(fft) works first try as well. (If you try this, don't forget to 'import importlib' first, and note importlib is 2.7+ only. 
Obviously importlib is not necessary but it makes the minimal example less tedious.) -n From dalke at dalkescientific.com Mon Jul 2 18:15:46 2012 From: dalke at dalkescientific.com (Andrew Dalke) Date: Tue, 3 Jul 2012 00:15:46 +0200 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> Message-ID: <7219F010-A3FC-4E89-9834-5C29EE79198F@dalkescientific.com> On Jul 2, 2012, at 11:38 PM, Fernando Perez wrote: > No, that's the wrong thing to test, because it effectively amounts to > 'import numpy', sicne the numpy __init__ file is still executed. As > David indicated, you must import multarray.so by itself. I understand that clarification. However, it does not affect me. I do "import rdkit.Chem". This is all I really care about. That imports "rdkit.Chem.rdchem" which is a shared library. That shared library calls the C function/macro "import_array", which appears to be: #define import_array() { if (_import_array() < 0) {PyErr_Print(); PyErr_SetString(PyExc_ImportError, "numpy.core.multiarray failed to import"); } } The _import_array looks to be defined via numpy/core/code_generators/generate_numpy_api.py which contains static int _import_array(void) { int st; PyObject *numpy = PyImport_ImportModule("numpy.core.multiarray"); PyObject *c_api = NULL; ... Thus, I don't see any way that I can import 'multiarray' directly, because the underlying C code is the one which imports 'numpy.core.multiarray' and by design it is inaccessible to change from Python code. Thus, the correct reference benchmark is "import numpy.core.multiarray" Unless I'm lost in a set of header files? Cheers, Andrew dalke at dalkescientific.com From njs at pobox.com Mon Jul 2 18:21:08 2012 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 2 Jul 2012 23:21:08 +0100 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> Message-ID: On Mon, Jul 2, 2012 at 10:44 PM, Andrew Dalke wrote: > On Jul 2, 2012, at 10:34 PM, Nathaniel Smith wrote: >> I don't have any opinion on how acceptable this would be, but I also >> don't see a benchmark showing how much this would help? > > The profile output was lower in that email. The relevant line is > > 0.038 add_newdocs (numpy.core.multiarray) Yes, but for a proper benchmark we need to compare this to the number that we would get with some other implementation... I'm assuming you aren't proposing we just delete the docstrings :-). > This says that 'add_newdocs', which is imported from > numpy.core.multiarray (though there may be other importers) > takes 0.038 seconds to go through __import__, including > all of its children module imports. There are no "children modules", all these modules refer to each other, and you're assuming that whichever module you happen to load first is responsible for all the other modules it happens to reference. > add_newdocs: 0.067 (numpy.core.multiarray) > numpy.lib: 0.061 (add_newdocs) I'm pretty sure that what these two lines say is that the actual add_newdocs code only takes 0.006 seconds? > numpy.testing: 0.041 (numpy.core.numeric) However, it does look like numpy.testing is responsible for something like 35% of our startup overhead and for pulling in a ton of extra modules (with associated disk seeks), which is pretty dumb. >>> With instrumentation I found that 0.083s of the 0.119s >>> is spent loading numpy.core.multiarray. 
The number 0.083 doesn't appear anywhere in that profile you pasted, so I don't know where this comes from... Anyway, it sounds like the answer is that importing numpy.core.multiarray doesn't take that long; you're measuring the total time to do 'import numpy', and it just happens that numpy.core.multiarray is the first module you load. (BTW, you probably shouldn't be importing numpy.core.multiarray directly at all, just do 'import numpy'.) -N From fperez.net at gmail.com Mon Jul 2 18:21:57 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 2 Jul 2012 15:21:57 -0700 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: <7219F010-A3FC-4E89-9834-5C29EE79198F@dalkescientific.com> References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> <7219F010-A3FC-4E89-9834-5C29EE79198F@dalkescientific.com> Message-ID: On Mon, Jul 2, 2012 at 3:15 PM, Andrew Dalke wrote: > > Thus, I don't see any way that I can import 'multiarray' directly, > because the underlying C code is the one which imports > 'numpy.core.multiarray' and by design it is inaccessible to change > from Python code. I was just referring to how David was benchmarking the cost of multiarray in isolation, which can indeed be done, and is useful for understanding the cumulative effect. Indeed for your case, it's the sum total of what import_array does that ultimately matters, but it's still useful to be able to understand these pieces in isolation. Cheers, f From njs at pobox.com Mon Jul 2 18:23:34 2012 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 2 Jul 2012 23:23:34 +0100 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: <7219F010-A3FC-4E89-9834-5C29EE79198F@dalkescientific.com> References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> <7219F010-A3FC-4E89-9834-5C29EE79198F@dalkescientific.com> Message-ID: On Mon, Jul 2, 2012 at 11:15 PM, Andrew Dalke wrote: > On Jul 2, 2012, at 11:38 PM, Fernando Perez wrote: >> No, that's the wrong thing to test, because it effectively amounts to >> 'import numpy', sicne the numpy __init__ file is still executed. As >> David indicated, you must import multarray.so by itself. > > I understand that clarification. However, it does not affect me. > > I do "import rdkit.Chem". This is all I really care about. > > That imports "rdkit.Chem.rdchem" which is a shared library. > > That shared library calls the C function/macro "import_array", which appears to be: > > #define import_array() { if (_import_array() < 0) {PyErr_Print(); PyErr_SetString(PyExc_ImportError, "numpy.core.multiarray failed to import"); } } > > > The _import_array looks to be defined via numpy/core/code_generators/generate_numpy_api.py > which contains > > static int > _import_array(void) > { > int st; > PyObject *numpy = PyImport_ImportModule("numpy.core.multiarray"); > PyObject *c_api = NULL; > ... > > > Thus, I don't see any way that I can import 'multiarray' directly, > because the underlying C code is the one which imports > 'numpy.core.multiarray' and by design it is inaccessible to change > from Python code. > > Thus, the correct reference benchmark is "import numpy.core.multiarray" Oh, I see. I withdraw my comment about how you shouldn't import numpy.core.multiarray directly, I forgot import_array() does that. 
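(For anyone following along, a quick way to see the effect being discussed -- importing just the C submodule still runs numpy/__init__.py, so essentially the whole package comes along -- is something like this rough sketch:

import sys
before = set(sys.modules)
import numpy.core.multiarray
added = sorted(m for m in set(sys.modules) - before if m.startswith('numpy'))
print len(added)

which on a 1.6-era install should be in the neighborhood of the ~90 modules Andrew counted.)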
-n From njs at pobox.com Mon Jul 2 18:34:49 2012 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 2 Jul 2012 23:34:49 +0100 Subject: [Numpy-discussion] Combined versus separate build In-Reply-To: References: Message-ID: On Sun, Jul 1, 2012 at 9:17 PM, David Cournapeau wrote: > On Sun, Jul 1, 2012 at 8:32 PM, Nathaniel Smith wrote: >> On Sun, Jul 1, 2012 at 7:36 PM, David Cournapeau wrote: >>> On Sun, Jul 1, 2012 at 6:36 PM, Nathaniel Smith wrote: >>>> On Wed, Jun 27, 2012 at 9:05 PM, David Cournapeau wrote: >>>>> On Wed, Jun 27, 2012 at 8:53 PM, Nathaniel Smith wrote: >>>>>> But seriously, what compilers do we support that don't have >>>>>> -fvisibility=hidden? ...Is there even a list of compilers we support >>>>>> available anywhere? >>>>> >>>>> Well, I am not sure how all this is handled on the big guys (bluegen >>>>> and co), for once. >>>>> >>>>> There is also the issue of the consequence on statically linking numpy >>>>> to python: I don't what they are (I would actually like to make >>>>> statically linked numpy into python easier, not harder). >>>> >>>> All the docs I can find in a quick google seem to say that bluegene >>>> doesn't do shared libraries at all, though those may be out of date. >>>> >>>> Also, it looks like our current approach is not doing a great job of >>>> avoiding symbol table pollution... despite all the NPY_NO_EXPORTS all >>>> over the source, I still count ~170 exported symbols on Linux with >>>> numpy 1.6, many of them with non-namespaced names >>>> ("_n_to_n_data_copy", "_next", "npy_tan", etc.) Of course this is >>>> fixable, but it's interesting that no-one has noticed. (Current master >>>> brings this up to ~300 exported symbols.) >>>> >>>> It sounds like as far as our "officially supported" platforms go >>>> (linux/windows/osx with gcc/msvc), then the ideal approach would be to >>>> use -fvisibility=hidden or --retain-symbols-file to convince gcc to >>>> hide symbols by default, like msvc does. That would let us remove >>>> cruft from the source code, produce a more reliable result, and let us >>>> use the more convenient separate build, with no real downsides. >>> >>> What cruft would it allow us to remove ? Whatever method we use, we >>> need a whitelist of symbols to export. >> >> No, right now we don't have a whitelist, we have a blacklist -- every >> time we add a new function or global variable, we have to remember to >> add a NPY_NO_EXPORT tag to its definition. Except the evidence says >> that we don't do that reliably. (Everyone always sucks at maintaining >> blacklists, that's the nature of blacklists.) I'm saying that we'd >> better off if we did have a whitelist. Especially since CPython API >> makes maintaining this whitelist so very trivial -- each module >> exports exactly one symbol! > > There may be some confusion on what NPY_NP_EXPORT does: it marks a > function that can be used between compilation units but is not > exported. The choice is between static and NPY_NO_EXPORT, not between > NPY_NO_EXPORT and nothing. In that sense, marking something > NPY_NO_EXPORT is a whitelist. > > If we were to use -fvisibility=hidden, we would still need to mark > those functions static (as it would otherwise publish functions in the > single file build). To be clear, this subthread started with the caveat *as far as our "officially supported" platforms go* -- I'm not saying that we should go around and remove all the NPY_NO_EXPORT macros tomorrow. 
However, the only reason they're actually needed is for supporting platforms where you can't control symbol visibility from the linker, and AFAICT we have no examples of such platforms to hand. So I'm questioning the wisdom of maintaining multiple parallel build systems etc. just for this hypothetical benefit. >> Yes, of course, or I wouldn't have bothered researching it. But this >> research would have been easier if there were enough of a user base >> that the tools makers actually paid any attention to supporting this >> use case, is all I was saying :-). >> >>>> Of course there are presumably other platforms that we don't support >>>> or test on, but where we have users anyway. Building on such a >>>> platform sort of intrinsically requires build system hacks, and some >>>> equivalent to the above may well be available (e.g. I know icc >>>> supports -fvisibility). So I while I'm not going to do anything about >>>> this myself in the near future, I'd argue that it would be a good idea >>>> to: >>>> - Switch the build-system to export nothing by default when using >>>> gcc, using -fvisibility=hidden >>>> - Switch the default build to "separate" >>>> - Leave in the single-file build, but not "officially supported", >>>> i.e., we're happy to get patches but it's not used on any systems that >>>> we can actually test ourselves. (I suspect it's less fragile than the >>>> separate build anyway, since name clashes are less common than >>>> forgotten include files.) >>> >>> I am fine with making the separate build the default (I have a patch >>> somewhere that does that on supported platforms), but not with using >>> -fvisibility=hidden. When I implemented the initial support around >>> this, fvisibility was buggy on some platforms, including mingw 3.x >> >> It's true that mingw doesn't support -fvisibility=hidden, but that's >> because it would be a no-op; windows already works that way by >> default... > > That's not my understanding: gcc behaves on windows as on linux (it > would break too many softwares that are the usual target of mingw > otherwise), but the -fvisibility flag is broken on gcc 3.x. The more > recent mingw supposedly handle this better, but we can't use gcc 4.x > because of another issue regarding private dll sharing :) I don't have windows to test, but everyone else on the internet seems to think mingw works the way I said, with __declspec and all... you aren't thinking of cygwin, are you? (see e.g. http://mingw.org/wiki/sampleDLL) -N From cournape at gmail.com Mon Jul 2 18:46:41 2012 From: cournape at gmail.com (David Cournapeau) Date: Mon, 2 Jul 2012 23:46:41 +0100 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: <7219F010-A3FC-4E89-9834-5C29EE79198F@dalkescientific.com> References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> <7219F010-A3FC-4E89-9834-5C29EE79198F@dalkescientific.com> Message-ID: On Mon, Jul 2, 2012 at 11:15 PM, Andrew Dalke wrote: > On Jul 2, 2012, at 11:38 PM, Fernando Perez wrote: >> No, that's the wrong thing to test, because it effectively amounts to >> 'import numpy', sicne the numpy __init__ file is still executed. As >> David indicated, you must import multarray.so by itself. > > I understand that clarification. However, it does not affect me. 
It is indeed irrelevant to your end goal, but it does affect the interpretation of what import_array does, and thus of your benchmark polynomial is definitely the big new overhead (I don't remember it being significant last time I optimized numpy import times), it is roughly 30 % of the total cost of importing numpy (95 -> 70 ms total time, of which numpy went from 70 to 50 ms). Then ctypeslib and test are the two other significant ones. I use profile_imports.py from bzr as follows: import sys import profile_imports profile_imports.install() import numpy profile_imports.log_stack_info(sys.stdout) Focusing on polynomial seems the only sensible action. Except for test, all the other stuff seem difficult to change without breaking anything. David From dalke at dalkescientific.com Mon Jul 2 18:54:46 2012 From: dalke at dalkescientific.com (Andrew Dalke) Date: Tue, 3 Jul 2012 00:54:46 +0200 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> Message-ID: <50ADB535-464E-445F-ABF7-2BB83CBB4842@dalkescientific.com> On Jul 3, 2012, at 12:21 AM, Nathaniel Smith wrote: > Yes, but for a proper benchmark we need to compare this to the number > that we would get with some other implementation... I'm assuming you > aren't proposing we just delete the docstrings :-). I suspect that we have a different meaning of the term 'benchmark'. A benchmark establishes first the baseline by which future implementations are measured. Which is why I did. Once there are changes, the benchmark, rerun, helps judge the usefulness of those changes. This I did not do. I do not believe that a benchmark requires the changed code as well before it can be considered a "proper benchmark" >> This says that 'add_newdocs', which is imported from >> numpy.core.multiarray (though there may be other importers) >> takes 0.038 seconds to go through __import__, including >> all of its children module imports. > > There are no "children modules", all these modules refer to each > other, and you're assuming that whichever module you happen to load > first is responsible for all the other modules it happens to > reference. While I believe there is an "import tree" analogous to a "call tree" and Python's import scheme helps ensure that it's a DAG (so that 'children modules' has a real meaning), you are correct in identifying that I was only pointing out the first parent, and not all of the parents. add_newdocs is the first module to import 'numpy.lib', but after further testing (I stubbed out the import and made a fake function), I see that other modules import numpy.lib and there's no measurable performance increase. I retract therefore my proposal to move the documentation which is currently in add_newdocs into the C code. >>>> With instrumentation I found that 0.083s of the 0.119s >>>> is spent loading numpy.core.multiarray. > > The number 0.083 doesn't appear anywhere in that profile you pasted, > so I don't know where this comes from... I did not save the output run which I used for my original email. It's easy to generate, so I just ran it again. 
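For anyone reading this in the archives without the attachment, a minimal stand-in showing the technique looks roughly like this (simplified sketch; not the actual import_profile.py): wrap __builtin__.__import__, time each call, and remember the nesting depth so the cumulative numbers line up with the import tree.

import __builtin__
import time

_real_import = __builtin__.__import__
_records = []
_depth = [0]

def _timed_import(name, *args, **kwargs):
    _depth[0] += 1
    t0 = time.time()
    try:
        return _real_import(name, *args, **kwargs)
    finally:
        _depth[0] -= 1
        # elapsed includes the cost of any child imports triggered here
        _records.append((time.time() - t0, _depth[0], name))

__builtin__.__import__ = _timed_import
import numpy
__builtin__.__import__ = _real_import

for elapsed, depth, name in sorted(_records, reverse=True)[:20]:
    print "%.3f %s%s" % (elapsed, "  " * depth, name)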
Cheers, Andrew dalke at dalkescientific.com From dalke at dalkescientific.com Mon Jul 2 19:16:40 2012 From: dalke at dalkescientific.com (Andrew Dalke) Date: Tue, 3 Jul 2012 01:16:40 +0200 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> <7219F010-A3FC-4E89-9834-5C29EE79198F@dalkescientific.com> Message-ID: On Jul 3, 2012, at 12:46 AM, David Cournapeau wrote: > It is indeed irrelevant to your end goal, but it does affect the > interpretation of what import_array does, and thus of your benchmark Indeed. > Focusing on polynomial seems the only sensible action. Except for > test, all the other stuff seem difficult to change without breaking > anything. I confirm that when I comment out numpy/__init__.py's "import polynomial" then the import time for numpy.core.multiarray goes from 0.084u 0.031s 0:00.11 100.0% 0+0k 0+0io 0pf+0w to 0.058u 0.028s 0:00.08 87.5% 0+0k 0+0io 0pf+0w numpy/polynomial imports: from polynomial import Polynomial from chebyshev import Chebyshev from legendre import Legendre from hermite import Hermite from hermite_e import HermiteE from laguerre import Laguerre and there's no easy way to make these be lazy imports. Strange! The bottom of hermite.py has: exec polytemplate.substitute(name='Hermite', nick='herm', domain='[-1,1]') as well as similar code in laguerre.py, chebyshev.py, hermite_e.py, and polynomial.py. I bet there's a lot of overhead generating and exec'ing those for each import! Andrew dalke at dalkescientific.com From cournape at gmail.com Mon Jul 2 19:29:26 2012 From: cournape at gmail.com (David Cournapeau) Date: Tue, 3 Jul 2012 00:29:26 +0100 Subject: [Numpy-discussion] Combined versus separate build In-Reply-To: References: Message-ID: On Mon, Jul 2, 2012 at 11:34 PM, Nathaniel Smith wrote: > > To be clear, this subthread started with the caveat *as far as our > "officially supported" platforms go* -- I'm not saying that we should > go around and remove all the NPY_NO_EXPORT macros tomorrow. > > However, the only reason they're actually needed is for supporting > platforms where you can't control symbol visibility from the linker, > and AFAICT we have no examples of such platforms to hand. I gave you one, mingw 3.x. Actually, reading a bit more around, it seems this is not specific to mingw, but all gcc < 4 (http://gcc.gnu.org/gcc-4.0/changes.html#visibility) > I don't have windows to test, but everyone else on the internet seems > to think mingw works the way I said, with __declspec and all... you > aren't thinking of cygwin, are you? (see e.g. > http://mingw.org/wiki/sampleDLL) Well, I did check myself, but looking more into it, I was tricked by nm output, which makes little sense on windows w.r.t. visibility with dll. You can define the same function in multiple dll, they will all appear as a public symbol (T label with nm), but the windows linker will not see them when linking for an executable. I am still biased toward the conservative option, especially that it is still followed by pretty much every C extension out there (including python itself). I trust their experience in dealing with cross platform more than ours. I cannot find my patch for detecting platforms where this can safely become the default, I will reprepare one. 
David From jsalvati at u.washington.edu Mon Jul 2 19:48:01 2012 From: jsalvati at u.washington.edu (John Salvatier) Date: Mon, 2 Jul 2012 16:48:01 -0700 Subject: [Numpy-discussion] Would a patch with a function for incrementing an array with advanced indexing be accepted? In-Reply-To: References: Message-ID: Hi Fred, That's an excellent idea, but I am not too familiar with this use case. What do you mean by list in 'matrix[list]'? Is the use case, just incrementing in place a sub matrix of a numpy matrix? John On Fri, Jun 29, 2012 at 11:43 AM, Fr?d?ric Bastien wrote: > Hi, > > I personnaly can't review this as this is too much in NumPy internal. > > My only comments is that you could add a test and an example in the > doc for matrix[list]. I think it will be the most used case. > > Fred > > On Wed, Jun 27, 2012 at 7:47 PM, John Salvatier > wrote: > > I've submitted a pull request ( https://github.com/numpy/numpy/pull/326 > ). > > I'm new to the numpy and python internals, so feedback is greatly > > appreciated. > > > > > > On Tue, Jun 26, 2012 at 12:10 PM, Travis Oliphant > > wrote: > >> > >> > >> On Jun 26, 2012, at 1:34 PM, Fr?d?ric Bastien wrote: > >> > >> > Hi, > >> > > >> > I think he was referring that making NUMPY_ARRAY_OBJECT[...] syntax > >> > support the operation that you said is hard. But having a separate > >> > function do it is less complicated as you said. > >> > >> Yes. That's precisely what I meant. Thank you for clarifying. > >> > >> -Travis > >> > >> > > >> > Fred > >> > > >> > On Tue, Jun 26, 2012 at 1:27 PM, John Salvatier > >> > wrote: > >> >> Can you clarify why it would be super hard? I just reused the code > for > >> >> advanced indexing (a modification of PyArray_SetMap). Am I missing > >> >> something > >> >> crucial? > >> >> > >> >> > >> >> > >> >> On Tue, Jun 26, 2012 at 9:57 AM, Travis Oliphant < > travis at continuum.io> > >> >> wrote: > >> >>> > >> >>> > >> >>> On Jun 26, 2012, at 11:46 AM, John Salvatier wrote: > >> >>> > >> >>> Hello, > >> >>> > >> >>> If you increment an array using advanced indexing and have repeated > >> >>> indexes, the array doesn't get repeatedly > >> >>> incremented, > >> >>> http://comments.gmane.org/gmane.comp.python.numeric.general/50291. > >> >>> I wrote a C function that does incrementing with repeated indexes > >> >>> correctly. > >> >>> The branch is here (https://github.com/jsalvatier/numpy see the > last > >> >>> two > >> >>> commits). Would a patch with a cleaned up version of a function like > >> >>> this be > >> >>> accepted into numpy? I'm not experienced writing numpy C code so I'm > >> >>> sure it > >> >>> still needs improvement. > >> >>> > >> >>> > >> >>> This is great. It is an often-requested feature. It's *very > >> >>> difficult* > >> >>> to do without changing fundamentally what NumPy is. But, yes this > >> >>> would be > >> >>> a great pull request. 
> >> >>> > >> >>> Thanks, > >> >>> > >> >>> -Travis > >> >>> > >> >>> > >> >>> > >> >>> _______________________________________________ > >> >>> NumPy-Discussion mailing list > >> >>> NumPy-Discussion at scipy.org > >> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> >>> > >> >> > >> >> > >> >> _______________________________________________ > >> >> NumPy-Discussion mailing list > >> >> NumPy-Discussion at scipy.org > >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> >> > >> > _______________________________________________ > >> > NumPy-Discussion mailing list > >> > NumPy-Discussion at scipy.org > >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Mon Jul 2 20:07:22 2012 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 2 Jul 2012 17:07:22 -0700 Subject: [Numpy-discussion] Buildbot status Message-ID: Hi all, I'd like to find out what the current status of continuous integration is for numpy. I'm aware of: a) http://buildbot.scipy.org -- used by Ralf for testing releases? b) http://travis-ci.org -- connected via GitHub c) http://184.73.247.160:8111 -- dedicated Amazon EC2 with TeamCity d) http://build.pydata.org:8111/ -- dedicated Rackspace instance with TeamCity e) https://jenkins.shiningpanda.com/numpy/ -- python 2.4 on Debian Is there interest in maintaining the buildbot setup? If so, I suggest we move (a) onto (c), and then connect in several of the NiPy buildbots [including a Raspberry Pi!] (offered by Matthew Brett). It'd be nice to have a semi-official set up, so that we don't duplicate too much effort. St?fan From josef.pktd at gmail.com Mon Jul 2 20:08:47 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 2 Jul 2012 20:08:47 -0400 Subject: [Numpy-discussion] Multivariate hypergeometric distribution? In-Reply-To: References: Message-ID: On Mon, Jul 2, 2012 at 4:16 PM, Fernando Perez wrote: > Hi all, > > in recent work with a colleague, the need came up for a multivariate > hypergeometric sampler; I had a look in the numpy code and saw we have > the bivariate version, but not the multivariate one. > > I had a look at the code in scipy.stats.distributions, and it doesn't > look too difficult to add a proper multivariate hypergeometric by > extending the bivariate code, with one important caveat: the hard part > is the implementation of the actual discrete hypergeometric sampler, > which lives inside of numpy/random/mtrand/distributions.c: > > https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions.c#L743 > > That code is hand-written C, and it only works for the bivariate case > right now. ?It doesn't look terribly difficult to extend, but it will > certainly take a bit of care and testing to ensure all edge cases are > handled correctly. 
My only foray into this http://projects.scipy.org/numpy/ticket/921 http://projects.scipy.org/numpy/ticket/923 This looks difficult to add without a good reference and clear description of the algorithm. > > Does anyone happen to have that implemented lying around, in a form > that would be easy to merge to add this capability to numpy? not me, I have never even heard of multivariate hypergeometric distribution. maybe http://hal.inria.fr/docs/00/11/00/56/PDF/perm.pdf p.11 with some properties http://www.math.uah.edu/stat/urn/MultiHypergeometric.html I've seen one other algorithm, that seems to need N (number of draws in hypergeom) random variables for one multivariate hypergeometric random draw, which seems slow to me. But maybe someone has it lying around. Josef > > Thanks, > > f > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From fperez.net at gmail.com Mon Jul 2 20:21:36 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 2 Jul 2012 17:21:36 -0700 Subject: [Numpy-discussion] Buildbot status In-Reply-To: References: Message-ID: Useful-looking: http://gcc.gnu.org/wiki/CompileFarm From ondrej.certik at gmail.com Mon Jul 2 20:31:06 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Mon, 2 Jul 2012 17:31:06 -0700 Subject: [Numpy-discussion] Buildbot status In-Reply-To: References: Message-ID: Hi Stefan, On Mon, Jul 2, 2012 at 5:07 PM, St?fan van der Walt wrote: > Hi all, > > I'd like to find out what the current status of continuous integration > is for numpy. I'm aware of: > > a) http://buildbot.scipy.org -- used by Ralf for testing releases? > b) http://travis-ci.org -- connected via GitHub > c) http://184.73.247.160:8111 -- dedicated Amazon EC2 with TeamCity > d) http://build.pydata.org:8111/ -- dedicated Rackspace instance with TeamCity > e) https://jenkins.shiningpanda.com/numpy/ -- python 2.4 on Debian > > Is there interest in maintaining the buildbot setup? If so, I suggest > we move (a) onto (c), and then connect in several of the NiPy > buildbots [including a Raspberry Pi!] (offered by Matthew Brett). > > It'd be nice to have a semi-official set up, so that we don't > duplicate too much effort. Yes, definitely. I will have time to work on the tests in about 2 weeks. Could you coordinate with Travis? He can make it "official". Ondrej From caseywstark at gmail.com Mon Jul 2 21:17:59 2012 From: caseywstark at gmail.com (Casey W. Stark) Date: Mon, 2 Jul 2012 18:17:59 -0700 Subject: [Numpy-discussion] f2py with allocatable arrays Message-ID: Hi numpy. Does anyone know if f2py supports allocatable arrays, allocated inside fortran subroutines? The old f2py docs seem to indicate that the allocatable array must be created with numpy, and dropped in the module. Here's more background to explain... I have a fortran subroutine that returns allocatable positions and velocities arrays. I wish I could get rid of the allocatable part, but you don't know how many particles it will create until the subroutine does some work (it checks if each particle it perturbs ends up in the domain). module zp implicit none contains subroutine ics(..., num_particles, particle_mass, positions, velocities) use data_types, only : dp implicit none ... inputs ... integer, intent(out) :: num_particles real (kind=dp), intent(out) :: particle_mass real (kind=dp), intent(out), dimension(:, :), allocatable :: positions, velocities ... 
end subroutine end module I tested this with a fortran driver program and it looks good, but when I try with f2py, it cannot compile. It throws the error "Error: Actual argument for 'positions' must be ALLOCATABLE at (1)". I figure this has something to do with the auto-generated "*-f2pywrappers2.f90" file, but the build deletes the file. If anyone knows an f2py friendly way to rework this, I would be happy to try. I'm also fine with using ctypes if it can handle this case. Best, Casey -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Mon Jul 2 21:18:26 2012 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 2 Jul 2012 18:18:26 -0700 Subject: [Numpy-discussion] Buildbot status In-Reply-To: References: Message-ID: On Mon, Jul 2, 2012 at 5:31 PM, Ond?ej ?ert?k wrote: > Yes, definitely. I will have time to work on the tests in about 2 weeks. > Could you coordinate with Travis? He can make it "official". I'd gladly coordinate with everyone, but I'd like to do it here on the mailing list so that we're on the same page. Numerous parties have spoken about this informally, but now there are multiple efforts that I'd like to consolidate. Here are some questions we need to answer: 1) Do we keep the current buildbot.scipy.org? I suggest that we move it to the EC2 machine, if we do. Both Chris Ball and Matthew Brett has set up fairly sophisticated buildbots before, so we can leverage their knowledge. 2) Do we switch to another system, such as Jenkins? It seems as though you've investigated some of those alternatives. Did you also look at TeamCity? If anyone needs access to the EC2 machine, just let me know. Regards St?fan From travis at continuum.io Mon Jul 2 21:31:11 2012 From: travis at continuum.io (Travis Oliphant) Date: Mon, 2 Jul 2012 19:31:11 -0600 Subject: [Numpy-discussion] Buildbot status In-Reply-To: References: Message-ID: <9C4E5D4B-7E6D-4640-A5E1-C4CF162B60E0@continuum.io> Ondrej should have time to work on this full time in the coming days. I think your list, Stefan, is as complete a list as we have. A few interns have investigated Team City and other CI systems and a combination of Jenkins and Travis CI has been suggested. NumFocus can provide some funding needed for maintaining servers, etc, but keeping build bots active requires the efforts of multiple volunteers. If anyone has build machines to offer, please let Ondrej know so he can coordinate getting Jenkins slaves onto them and hooking them up to the master. Ondrej would especially appreciate any experience with Windows nodes. Best regards, Travis -- Travis Oliphant (on a mobile) 512-826-7480 On Jul 2, 2012, at 7:18 PM, St?fan van der Walt wrote: > On Mon, Jul 2, 2012 at 5:31 PM, Ond?ej ?ert?k wrote: >> Yes, definitely. I will have time to work on the tests in about 2 weeks. >> Could you coordinate with Travis? He can make it "official". > > I'd gladly coordinate with everyone, but I'd like to do it here on the > mailing list so that we're on the same page. Numerous parties have > spoken about this informally, but now there are multiple efforts that > I'd like to consolidate. > > Here are some questions we need to answer: > > 1) Do we keep the current buildbot.scipy.org? > > I suggest that we move it to the EC2 machine, if we do. Both Chris > Ball and Matthew Brett has set up fairly sophisticated buildbots > before, so we can leverage their knowledge. > > 2) Do we switch to another system, such as Jenkins? 
It seems as > though you've investigated some of those alternatives. Did you also > look at TeamCity? > > If anyone needs access to the EC2 machine, just let me know. > > Regards > St?fan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From josef.pktd at gmail.com Mon Jul 2 21:35:14 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 2 Jul 2012 21:35:14 -0400 Subject: [Numpy-discussion] Multivariate hypergeometric distribution? In-Reply-To: References: Message-ID: On Mon, Jul 2, 2012 at 8:08 PM, wrote: > On Mon, Jul 2, 2012 at 4:16 PM, Fernando Perez wrote: >> Hi all, >> >> in recent work with a colleague, the need came up for a multivariate >> hypergeometric sampler; I had a look in the numpy code and saw we have >> the bivariate version, but not the multivariate one. >> >> I had a look at the code in scipy.stats.distributions, and it doesn't >> look too difficult to add a proper multivariate hypergeometric by >> extending the bivariate code, with one important caveat: the hard part >> is the implementation of the actual discrete hypergeometric sampler, >> which lives inside of numpy/random/mtrand/distributions.c: >> >> https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions.c#L743 >> >> That code is hand-written C, and it only works for the bivariate case >> right now. ?It doesn't look terribly difficult to extend, but it will >> certainly take a bit of care and testing to ensure all edge cases are >> handled correctly. > > My only foray into this > > http://projects.scipy.org/numpy/ticket/921 > http://projects.scipy.org/numpy/ticket/923 > > This looks difficult to add without a good reference and clear > description of the algorithm. > >> >> Does anyone happen to have that implemented lying around, in a form >> that would be easy to merge to add this capability to numpy? > > not me, I have never even heard of multivariate hypergeometric distribution. > > > maybe http://hal.inria.fr/docs/00/11/00/56/PDF/perm.pdf ?p.11 > with some properties http://www.math.uah.edu/stat/urn/MultiHypergeometric.html > > I've seen one other algorithm, that seems to need N (number of draws > in hypergeom) random variables for one multivariate hypergeometric > random draw, which seems slow to me. > > But maybe someone has it lying around. Now I have a pure num/sci/python version around. A bit more than an hour, so no guarantees, but freq and pmf look close enough. 
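The idea, condensed from the script attached below (a rough sketch only -- the function name and signature here are made up for illustration and are not an existing numpy API): draw each color in turn with the univariate np.random.hypergeometric, conditioning on how many of the nsample draws are still unassigned, so one batch costs only len(colors) - 1 calls to the univariate sampler.

import numpy as np

def sample_mv_hypergeom(colors, nsample, size=1):
    # colors[i] = number of balls of color i in the urn; draw nsample balls
    # without replacement, repeated `size` times (hypothetical helper).
    colors = np.asarray(colors)
    out = np.zeros((size, len(colors)), dtype=int)
    remaining_draws = nsample * np.ones(size, dtype=int)
    other = colors.sum()              # balls of the colors not yet processed
    for i, good in enumerate(colors[:-1]):
        other -= good                 # balls belonging to the later colors
        mask = remaining_draws > 0
        if mask.any():
            out[mask, i] = np.random.hypergeometric(good, other,
                                                    remaining_draws[mask],
                                                    size=mask.sum())
        remaining_draws -= out[:, i]
    out[:, -1] = remaining_draws      # whatever is left goes to the last color
    return out

# sanity check: sample_mv_hypergeom([5, 3, 4], 5, size=100000).mean(0)
# should be close to 5.0 * np.array([5, 3, 4]) / 12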
Josef > > Josef > >> >> Thanks, >> >> f >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- # -*- coding: utf-8 -*- """ Created on Mon Jul 02 20:23:08 2012 Author: Josef Perktold """ import numpy as np n_balls = [5, 3, 4] n_sample = 5 size = 100000 p = len(n_balls) n_all = sum(n_balls) rvs = np.zeros((size, p), int) n_bad = n_all n_remain = n_sample * np.ones(size, int) for ii in range(p-1): n_good = n_balls[ii] n_bad = n_bad - n_good rvs_ii = rvs[:,ii] mask = n_remain >= 1 need = mask.sum() rvs_ii[mask] = np.random.hypergeometric(n_good, n_bad, n_remain[mask], size=need) rvs[:,ii] = rvs_ii n_remain = np.maximum(n_remain - rvs_ii, 0) rvs[:, -1] = n_sample - rvs.sum(1) #print rvs print rvs.mean(0) * n_all / n_sample u, idx = np.unique(rvs.view([('', int)]*3), return_inverse=True) u_arr = u.view(int).reshape(len(u), -1) count = np.bincount(idx) freq = count * 1. / len(idx) from scipy.misc import comb def pmf(x, n_balls): x = np.asarray(x) n_balls = np.asarray(n_balls) #p = len(n_balls) ret = np.product(comb(n_balls, x)) / comb(n_balls.sum(), x.sum()) return ret print print freq for x,fr in zip(u_arr, freq): th = pmf(x, n_balls) print x, np.round(th, 5), fr, np.round(fr - th, 10) From jsseabold at gmail.com Mon Jul 2 22:31:43 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 2 Jul 2012 22:31:43 -0400 Subject: [Numpy-discussion] Multivariate hypergeometric distribution? In-Reply-To: References: Message-ID: On Mon, Jul 2, 2012 at 9:35 PM, wrote: > On Mon, Jul 2, 2012 at 8:08 PM, wrote: > > On Mon, Jul 2, 2012 at 4:16 PM, Fernando Perez > wrote: > >> Hi all, > >> > >> in recent work with a colleague, the need came up for a multivariate > >> hypergeometric sampler; I had a look in the numpy code and saw we have > >> the bivariate version, but not the multivariate one. > >> > >> I had a look at the code in scipy.stats.distributions, and it doesn't > >> look too difficult to add a proper multivariate hypergeometric by > >> extending the bivariate code, with one important caveat: the hard part > >> is the implementation of the actual discrete hypergeometric sampler, > >> which lives inside of numpy/random/mtrand/distributions.c: > >> > >> > https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions.c#L743 > >> > >> That code is hand-written C, and it only works for the bivariate case > >> right now. It doesn't look terribly difficult to extend, but it will > >> certainly take a bit of care and testing to ensure all edge cases are > >> handled correctly. > > > > My only foray into this > > > > http://projects.scipy.org/numpy/ticket/921 > > http://projects.scipy.org/numpy/ticket/923 > > > > This looks difficult to add without a good reference and clear > > description of the algorithm. > > > >> > >> Does anyone happen to have that implemented lying around, in a form > >> that would be easy to merge to add this capability to numpy? > > > > not me, I have never even heard of multivariate hypergeometric > distribution. > > > > > > maybe http://hal.inria.fr/docs/00/11/00/56/PDF/perm.pdf p.11 > > with some properties > http://www.math.uah.edu/stat/urn/MultiHypergeometric.html > > > > I've seen one other algorithm, that seems to need N (number of draws > > in hypergeom) random variables for one multivariate hypergeometric > > random draw, which seems slow to me. > > > > But maybe someone has it lying around. 
> > Now I have a pure num/sci/python version around. > > A bit more than an hour, so no guarantees, but freq and pmf look close > enough. I could be wrong, but I think PyMC has sampling and likelihood. Skipper -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Mon Jul 2 22:49:11 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 2 Jul 2012 19:49:11 -0700 Subject: [Numpy-discussion] Multivariate hypergeometric distribution? In-Reply-To: References: Message-ID: On Mon, Jul 2, 2012 at 7:31 PM, Skipper Seabold wrote: > I could be wrong, but I think PyMC has sampling and likelihood. It appears you're right! http://pymc-devs.github.com/pymc/distributions.html?highlight=hypergeometric#pymc.distributions.multivariate_hypergeometric_like Thanks :) Cheers, f From fperez.net at gmail.com Mon Jul 2 22:53:04 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 2 Jul 2012 19:53:04 -0700 Subject: [Numpy-discussion] Multivariate hypergeometric distribution? In-Reply-To: References: Message-ID: On Mon, Jul 2, 2012 at 7:49 PM, Fernando Perez wrote: > It appears you're right! > > http://pymc-devs.github.com/pymc/distributions.html?highlight=hypergeometric#pymc.distributions.multivariate_hypergeometric_like Furthermore, the code actually calls a sampler implemented in Fortran: http://pymc-devs.github.com/pymc/_modules/pymc/distributions.html#multivariate_hypergeometric_like which calls this: https://github.com/pymc-devs/pymc/blob/master/pymc/flib.f#L4379 Thanks again to both of you for the help. It's always productive to ask around here ;) Cheers, f From josef.pktd at gmail.com Mon Jul 2 23:27:56 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 2 Jul 2012 23:27:56 -0400 Subject: [Numpy-discussion] Multivariate hypergeometric distribution? In-Reply-To: References: Message-ID: On Mon, Jul 2, 2012 at 10:53 PM, Fernando Perez wrote: > On Mon, Jul 2, 2012 at 7:49 PM, Fernando Perez wrote: >> It appears you're right! nice idea: https://github.com/pymc-devs/pymc/blob/master/pymc/distributions.py#L1670 >> >> http://pymc-devs.github.com/pymc/distributions.html?highlight=hypergeometric#pymc.distributions.multivariate_hypergeometric_like > > Furthermore, the code actually calls a sampler implemented in Fortran: > > http://pymc-devs.github.com/pymc/_modules/pymc/distributions.html#multivariate_hypergeometric_like > > which calls this: > > https://github.com/pymc-devs/pymc/blob/master/pymc/flib.f#L4379 > > Thanks again to both of you for the help. 
?It's always productive to > ask around here ;) the loglikelihood function is just some calls to scipy.special.gammaln I changed my version to get better numerical precision #taken from scipy.misc but returns log of comb def log_comb(n, k): from scipy import special from scipy.special import gammaln k = np.asarray(k) n = np.asarray(n) cond = (k <= n) & (n >= 0) & (k >= 0) sv = special.errprint(0) vals = gammaln(n + 1) - gammaln(n - k + 1) - gammaln(k + 1) sv = special.errprint(sv) return np.where(cond, vals, -np.inf) def log_pmf(x, n_balls): x = np.asarray(x) n_balls = np.asarray(n_balls) ret = np.sum(log_comb(n_balls, x)) - log_comb(n_balls.sum(), x.sum()) return ret def pmf(x, n_balls): x = np.asarray(x) n_balls = np.asarray(n_balls) ret = np.sum(log_comb(n_balls, x)) - log_comb(n_balls.sum(), x.sum()) ret = np.exp(ret) return ret proof by picture https://picasaweb.google.com/106983885143680349926/Joepy#5760777856741667746 https://picasaweb.google.com/106983885143680349926/Joepy#5760777861891130962 Cheers, Josef > > Cheers, > > f > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From thouis at gmail.com Tue Jul 3 05:35:24 2012 From: thouis at gmail.com (Thouis (Ray) Jones) Date: Tue, 3 Jul 2012 11:35:24 +0200 Subject: [Numpy-discussion] Change in memmap behaviour In-Reply-To: References: <50B02802-AF3A-4D88-8875-AF8298FA55FD@gmail.com> Message-ID: On Mon, Jul 2, 2012 at 11:52 PM, Sveinung Gundersen wrote: > > On 2. juli 2012, at 22.40, Nathaniel Smith wrote: > >> On Mon, Jul 2, 2012 at 6:54 PM, Sveinung Gundersen wrote: >>> [snip] >>> >>> >>> >>> Your actual memory usage may not have increased as much as you think, >>> since memmap objects don't necessarily take much memory -- it sounds >>> like you're leaking virtual memory, but your resident set size >>> shouldn't go up as much. >>> >>> >>> As I understand it, memmap objects retain the contents of the memmap in >>> memory after it has been read the first time (in a lazy manner). Thus, when >>> reading a slice of a 24GB file, only that part recides in memory. Our system >>> reads a slice of a memmap, calculates something (say, the sum), and then >>> deletes the memmap. It then loops through this for consequitive slices, >>> retaining a low memory usage. Consider the following code: >>> >>> import numpy as np >>> res = [] >>> vecLen = 3095677412 >>> for i in xrange(vecLen/10**8+1): >>> x = i * 10**8 >>> y = min((i+1) * 10**8, vecLen) >>> res.append(np.memmap('val.float64', dtype='float64')[x:y].sum()) >>> >>> The memory usage of this code on a 24GB file (one value for each nucleotide >>> in the human DNA!) is 23g resident memory after the loop is finished (not >>> 24g for some reason..). >>> >>> Running the same code on 1.5.1rc1 gives a resident memory of 23m after the >>> loop. >> >> Your memory measurement tools are misleading you. The same memory is >> resident in both cases, just in one case your tools say it is >> operating system disk cache (and not attributed to your app), and in >> the other case that same memory, treated in the same way by the OS, is >> shown as part of your app's resident memory. Virtual memory is >> confusing... > > But the crucial difference is perhaps that the disk cache can be cleared by the OS if needed, but not the application memory in the same way, which must be swapped to disk? Or am I still confused? > > (snip) > >>> >>> Great! 
Any idea on whether such a patch may be included in 1.7? >> >> Not really, if I or you or someone else gets inspired to take the time >> to write a patch soon then it will be, otherwise not... >> >> -N > > I have now tried to add a patch, in the way you proposed, but I may have gotten it wrong.. > > http://projects.scipy.org/numpy/ticket/2179 I put this in a github repo, and added tests (author credit to Sveinung) https://github.com/thouis/numpy/tree/mmap_children I'm not sure which branch to issue a PR request against, though. From gnurser at gmail.com Tue Jul 3 05:54:36 2012 From: gnurser at gmail.com (George Nurser) Date: Tue, 3 Jul 2012 10:54:36 +0100 Subject: [Numpy-discussion] f2py with allocatable arrays In-Reply-To: References: Message-ID: Can you interface your fortran program twice? First time return the number of particles, dimensions etc to python python then creates work array of right size Second interface pass work array as in/out array, dimension in fortran argument list, to fortran fortran copies allocatable arrays to argument arrays clumsy, I know. George Nurser On 3 July 2012 02:17, Casey W. Stark wrote: > Hi numpy. > > Does anyone know if f2py supports allocatable arrays, allocated inside > fortran subroutines? The old f2py docs seem to indicate that the allocatable > array must be created with numpy, and dropped in the module. Here's more > background to explain... > > I have a fortran subroutine that returns allocatable positions and > velocities arrays. I wish I could get rid of the allocatable part, but you > don't know how many particles it will create until the subroutine does some > work (it checks if each particle it perturbs ends up in the domain). > > module zp > implicit none > contains > subroutine ics(..., num_particles, particle_mass, positions, velocities) > use data_types, only : dp > implicit none > ... inputs ... > integer, intent(out) :: num_particles > real (kind=dp), intent(out) :: particle_mass > real (kind=dp), intent(out), dimension(:, :), allocatable :: positions, > velocities > ... > end subroutine > end module > > I tested this with a fortran driver program and it looks good, but when I > try with f2py, it cannot compile. It throws the error "Error: Actual > argument for 'positions' must be ALLOCATABLE at (1)". I figure this has > something to do with the auto-generated "*-f2pywrappers2.f90" file, but the > build deletes the file. > > If anyone knows an f2py friendly way to rework this, I would be happy to > try. I'm also fine with using ctypes if it can handle this case. > > Best, > Casey > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From lists at hilboll.de Tue Jul 3 06:19:21 2012 From: lists at hilboll.de (Andreas Hilboll) Date: Tue, 3 Jul 2012 12:19:21 +0200 Subject: [Numpy-discussion] f2py with allocatable arrays In-Reply-To: References: Message-ID: <516a311a670992a778397afd00c49b35.squirrel@srv2.s4y.tournesol-consulting.eu> > Hi numpy. > > Does anyone know if f2py supports allocatable arrays, allocated inside > fortran subroutines? The old f2py docs seem to indicate that the > allocatable array must be created with numpy, and dropped in the module. > Here's more background to explain... > > I have a fortran subroutine that returns allocatable positions and > velocities arrays. 
I wish I could get rid of the allocatable part, but you > don't know how many particles it will create until the subroutine does > some > work (it checks if each particle it perturbs ends up in the domain). > > module zp > implicit none > contains > subroutine ics(..., num_particles, particle_mass, positions, velocities) > use data_types, only : dp > implicit none > ... inputs ... > integer, intent(out) :: num_particles > real (kind=dp), intent(out) :: particle_mass > real (kind=dp), intent(out), dimension(:, :), allocatable :: > positions, > velocities > ... > end subroutine > end module > > I tested this with a fortran driver program and it looks good, but when I > try with f2py, it cannot compile. It throws the error "Error: Actual > argument for 'positions' must be ALLOCATABLE at (1)". I figure this has > something to do with the auto-generated "*-f2pywrappers2.f90" file, but > the > build deletes the file. > > If anyone knows an f2py friendly way to rework this, I would be happy to > try. I'm also fine with using ctypes if it can handle this case. Can you split your code so that you have a subroutine which calculates the number of particles only, and call this subroutine from your 'original' routine? If yes, then you might be able to say somehting like real (kind=dp), intent(out), dimension(get_particle_number(WHATEVER) :: Not entirely sure, though ... Can someone with more f2py knowledge confirm this? From jim.vickroy at noaa.gov Tue Jul 3 07:09:24 2012 From: jim.vickroy at noaa.gov (Jim Vickroy) Date: Tue, 03 Jul 2012 05:09:24 -0600 Subject: [Numpy-discussion] f2py with allocatable arrays In-Reply-To: References: Message-ID: <4FF2D2E4.1080903@noaa.gov> On 7/2/2012 7:17 PM, Casey W. Stark wrote: > Hi numpy. > > Does anyone know if f2py supports allocatable arrays, allocated inside > fortran subroutines? The old f2py docs seem to indicate that the > allocatable array must be created with numpy, and dropped in the > module. Here's more background to explain... > > I have a fortran subroutine that returns allocatable positions and > velocities arrays. I wish I could get rid of the allocatable part, but > you don't know how many particles it will create until the subroutine > does some work (it checks if each particle it perturbs ends up in the > domain). > > module zp > implicit none > contains > subroutine ics(..., num_particles, particle_mass, positions, velocities) > use data_types, only : dp > implicit none > ... inputs ... > integer, intent(out) :: num_particles > real (kind=dp), intent(out) :: particle_mass > real (kind=dp), intent(out), dimension(:, :), allocatable :: > positions, velocities > ... > end subroutine > end module > > I tested this with a fortran driver program and it looks good, but > when I try with f2py, it cannot compile. It throws the error "Error: > Actual argument for 'positions' must be ALLOCATABLE at (1)". I figure > this has something to do with the auto-generated "*-f2pywrappers2.f90" > file, but the build deletes the file. > > If anyone knows an f2py friendly way to rework this, I would be happy > to try. I'm also fine with using ctypes if it can handle this case. > > Best, > Casey > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion Hi, ... not sure what version of numpy you are using but it can be done with the f2py included with numpy 1.6.x. Here is a (hopefully intelligible) code fragment from a working application: module Grid_ ! 
a representation of a two-dimensional, rectangular, (image) sensor grid ! -- this component is used in the "smoothing" portion of the Thematic MaP solution algorithm (smoothness prior probabilities) use runtime_ implicit none integer, private, parameter :: begin_ = 1, end_ = 2, row_ = 1, column_ = 2 integer, private, dimension(begin_:end_) :: rows__ = (/0,0/), columns__ = (/0,0/) ! grid rows and columns extent integer, allocatable, dimension(:,:) :: pixel_neighbors ! the neighbors of a specified pixel -- ((row,column), ..., (row,column)) ! + this array is managed (allocated, deallocated, populated) by the neighbors_of() procedure ! + when allocated, the first dimension is always 2 -- pixel (row,column) ! + it would be preferable for this array to be the return result, of procedure neighbors_of(), ! but F2PY does not seem to support an allocatable array as a function result contains [snip] function neighbors_of (pixel) result(completed) ! determines all second-order (grid) neighbors of *pixel* ! -- returns .true. if successful ! -- use the error_diagnostic() procedure if .false. is returned ! -- upon successful completion of this procedure, Grid_.pixel_neighbors contains the neighbors of *pixel* ! -- *pixel* may not be on the grid; in this case, Grid_.pixel_neighbors is not allocated and .false. is returned ! integer, dimension(row_:column_), intent(in) :: pixel ! pixel to be considered ... f2py does not support this integer, dimension(2), intent(in) :: pixel ! pixel under consideration (row,column) logical :: completed ! integer, dimension(begin_:end_) :: neighbor_rows, neighbor_columns ... f2py does not support this ! integer, dimension(row_:column_) :: neighbor ... f2py does not support this integer, dimension(2) :: neighbor_rows, neighbor_columns ! each is: (begin,end) integer, dimension(2) :: neighbor ! (row,column) integer :: row, column, code, count character (len=100) :: diagnostic completed = .false. count = 0 ! *pixel* has no neighbors if (allocated (pixel_neighbors)) deallocate (pixel_neighbors) if (is_interior (pixel)) then count = 8 ! interior pixels have eight, second-order neighbors neighbor_rows (begin_) = pixel(row_)-1 neighbor_rows (end_) = pixel(row_)+1 neighbor_columns (begin_) = pixel(column_)-1 neighbor_columns (end_) = pixel(column_)+1 else if (is_border (pixel)) then count = 5 ! non-corner, border pixels have five, second-order neighbors, but ... if (is_corner (pixel)) count = 3 ! corner pixels have three, second-order neighbors neighbor_rows (begin_) = max (pixel(row_)-1, rows__(begin_)) neighbor_rows (end_) = min (pixel(row_)+1, rows__(end_)) neighbor_columns (begin_) = max (pixel(column_)-1, columns__(begin_)) neighbor_columns (end_) = min (pixel(column_)+1, columns__(end_)) end if if (count > 0) then allocate (pixel_neighbors(row_:column_,count), stat=code, errmsg=diagnostic) if (code /= 0) then call set_error (code,diagnostic) return end if count = 0 do row = neighbor_rows(begin_), neighbor_rows(end_) do column = neighbor_columns(begin_), neighbor_columns(end_) neighbor(row_) = row; neighbor(column_) = column if (neighbor(row_) == pixel(row_) .and. neighbor(column_) == pixel(column_)) cycle ! neighbor is pixel count = count + 1; pixel_neighbors(:,count) = neighbor end do ! columns end do ! rows completed = .true. end if end function neighbors_of end module Grid_ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sturla at molden.no Tue Jul 3 10:20:04 2012 From: sturla at molden.no (Sturla Molden) Date: Tue, 03 Jul 2012 16:20:04 +0200 Subject: [Numpy-discussion] f2py with allocatable arrays In-Reply-To: References: Message-ID: <4FF2FF94.4080904@molden.no> Den 03.07.2012 11:54, skrev George Nurser: >> module zp >> implicit none >> contains >> subroutine ics(..., num_particles, particle_mass, positions, velocities) >> use data_types, only : dp >> implicit none >> ... inputs ... >> integer, intent(out) :: num_particles >> real (kind=dp), intent(out) :: particle_mass >> real (kind=dp), intent(out), dimension(:, :), allocatable :: positions, >> velocities >> ... >> end subroutine >> end module >> This looks like a Fortran error. deallocate is called automatically on allocatable arrays when a subroutine exit (guarranteed from Fortran 95, optional in Fortran 90). Allocatable arrays might even be put on the stack depending on the requested size. I am not even sure intent(out) on an allocatable array is legal or an error the compiler should trap, but it is obviously very bad Fortran. So "positions" and "velocities" should here be declared pointer, not allocatable. As for f2py: Allocatable arrays are local variables for internal use, and they are not a part of the subroutine's calling interface. f2py only needs to know about the interface, not the local variables. Sturla From njs at pobox.com Tue Jul 3 11:08:04 2012 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 3 Jul 2012 16:08:04 +0100 Subject: [Numpy-discussion] Change in memmap behaviour In-Reply-To: References: <50B02802-AF3A-4D88-8875-AF8298FA55FD@gmail.com> Message-ID: On Tue, Jul 3, 2012 at 10:35 AM, Thouis (Ray) Jones wrote: > On Mon, Jul 2, 2012 at 11:52 PM, Sveinung Gundersen wrote: >> >> On 2. juli 2012, at 22.40, Nathaniel Smith wrote: >> >>> On Mon, Jul 2, 2012 at 6:54 PM, Sveinung Gundersen wrote: >>>> [snip] >>>> >>>> >>>> >>>> Your actual memory usage may not have increased as much as you think, >>>> since memmap objects don't necessarily take much memory -- it sounds >>>> like you're leaking virtual memory, but your resident set size >>>> shouldn't go up as much. >>>> >>>> >>>> As I understand it, memmap objects retain the contents of the memmap in >>>> memory after it has been read the first time (in a lazy manner). Thus, when >>>> reading a slice of a 24GB file, only that part recides in memory. Our system >>>> reads a slice of a memmap, calculates something (say, the sum), and then >>>> deletes the memmap. It then loops through this for consequitive slices, >>>> retaining a low memory usage. Consider the following code: >>>> >>>> import numpy as np >>>> res = [] >>>> vecLen = 3095677412 >>>> for i in xrange(vecLen/10**8+1): >>>> x = i * 10**8 >>>> y = min((i+1) * 10**8, vecLen) >>>> res.append(np.memmap('val.float64', dtype='float64')[x:y].sum()) >>>> >>>> The memory usage of this code on a 24GB file (one value for each nucleotide >>>> in the human DNA!) is 23g resident memory after the loop is finished (not >>>> 24g for some reason..). >>>> >>>> Running the same code on 1.5.1rc1 gives a resident memory of 23m after the >>>> loop. >>> >>> Your memory measurement tools are misleading you. The same memory is >>> resident in both cases, just in one case your tools say it is >>> operating system disk cache (and not attributed to your app), and in >>> the other case that same memory, treated in the same way by the OS, is >>> shown as part of your app's resident memory. Virtual memory is >>> confusing... 
>> >> But the crucial difference is perhaps that the disk cache can be cleared by the OS if needed, but not the application memory in the same way, which must be swapped to disk? Or am I still confused? >> >> (snip) >> >>>> >>>> Great! Any idea on whether such a patch may be included in 1.7? >>> >>> Not really, if I or you or someone else gets inspired to take the time >>> to write a patch soon then it will be, otherwise not... >>> >>> -N >> >> I have now tried to add a patch, in the way you proposed, but I may have gotten it wrong.. >> >> http://projects.scipy.org/numpy/ticket/2179 > > I put this in a github repo, and added tests (author credit to Sveinung) > https://github.com/thouis/numpy/tree/mmap_children > > I'm not sure which branch to issue a PR request against, though. Looks good to me, thanks to both of you! Obviously should be merged to master; beyond that I'm not sure. We definitely want it in 1.7, but I'm not sure if that's been branched yet or not. (Or rather, it has been branched, but then maybe it was unbranched again? Travis?) Since it was a 1.6 regression it'd make sense to cherry-pick to the 1.6 branch too, just in case it gets another release. -n From chris.barker at noaa.gov Tue Jul 3 12:58:03 2012 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 3 Jul 2012 09:58:03 -0700 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> Message-ID: On Mon, Jul 2, 2012 at 12:17 PM, Andrew Dalke wrote: > In this email I propose a few changes which I think are minor > and which don't really affect the external NumPy API but which > I think could improve the "import numpy" performance by at > least 40%. +1 -- I think I remember that thread -- at the time, I was experiencing some really, really slow import times myself -- it turned out to be something really weird with my system (though I don't remember exactly what), but numpy still is too big an import. Another note -- I ship stuff with py2exe and friends a fair bit -- numpy's "Import a whole bunch of stuff you may well not be using" approach means I have to include all that stuff, or hack the heck out of numpy -- not ideal. > 1) remove "add_newdocs" and put the docstrings in the C code > 'add_newdocs' still needs to be there, > > The code says: > > # This is only meant to add docs to objects defined in C-extension modules. > # The purpose is to allow easier editing of the docstrings without > # requiring a re-compile. +1 -- isn't it better for the docs to be with the code, anyway? > 2) Don't optimistically assume that all submodules are > needed. For example, some current code uses > >>>> import numpy >>>> numpy.fft.ifft > +1 see above -- really, what fraction of code uses fft and polynomial, and ... "namespaces are one honking great idea" I appreciate the legacy, and the ease of use at the interpreter, but it sure would be nice to clean this up -- maybe keep the legacy by having a new import: import just_numpy as np that would import the core stuff, and offer the "extra" packages as specific imports -- ideally, we'd deprecate the old way, and recommend the extra importing for the future, and some day have "numpy" and "numpy_plus". (Kind of like pylab, I suppose) lazy importing may work OK, too, though more awkward for py2exe and friends, and perhaps a bit "magic" for my taste.
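To make the "magic" concrete, here is a minimal sketch of what a lazy submodule import could look like (purely illustrative -- the proxy class is made up, and this is not how numpy currently behaves):

import importlib  # available from Python 2.7; older versions would need __import__

class _LazyModule(object):
    """Placeholder that imports the real module on first attribute access."""
    def __init__(self, name):
        self._name = name
    def __getattr__(self, attr):
        mod = importlib.import_module(self._name)  # the real import happens here
        self.__dict__.update(mod.__dict__)         # cache so later lookups skip __getattr__
        return getattr(mod, attr)

fft = _LazyModule('numpy.fft')  # nothing actually imported yet
# fft.ifft([0, 1, 0, 0])        # first use would trigger the real import

The obvious cost is that dir() and tab completion on the proxy show nothing useful until the first real attribute access, which is exactly the interactive-use concern quoted below.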
> 3) Especially: don't always import 'numpy.testing' +1 > I have not worried about numpy import performance for > 4 years. While I have been developing scientific software > for 20 years, and in Python for 15 years, it has been > in areas of biology and chemistry which don't use arrays. remarkable -- I use arrays for everything! most of which are not classic big arrays you process with lapack type stuff ;-) > ? yeah, it's just using the homogenous array most of the time. exactly -- I know Travis says: "if you're going to use numpy arrays, use numpy", but they really are pretty darn handy even if you just use them as containers. Ben root wrote: > Not sure how this would impact projects like ipython that does tab-completion support, > but I know that that would drive me nuts in my basic tab-completion setup I have for >my regular python terminal. Of course, in the grand scheme of things, that really > isn't all that important, I don't think. I do think it's important to support easy interactive use, Ipyhton, etc -- with nice tab completion, easy access to doc string, etc. But it should alo be possible to not have all that where it isn't required -- hence my "import numpy_plus" type proposal. I never did get why the polynomial stuff was added to core numpy.... -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R ? ? ? ? ? ?(206) 526-6959?? voice 7600 Sand Point Way NE ??(206) 526-6329?? fax Seattle, WA ?98115 ? ? ??(206) 526-6317?? main reception Chris.Barker at noaa.gov From pearu.peterson at gmail.com Tue Jul 3 13:24:27 2012 From: pearu.peterson at gmail.com (Pearu Peterson) Date: Tue, 3 Jul 2012 20:24:27 +0300 Subject: [Numpy-discussion] f2py with allocatable arrays In-Reply-To: <4FF2FF94.4080904@molden.no> References: <4FF2FF94.4080904@molden.no> Message-ID: On Tue, Jul 3, 2012 at 5:20 PM, Sturla Molden wrote: > > As for f2py: Allocatable arrays are local variables for internal use, > and they are not a part of the subroutine's calling interface. f2py only > needs to know about the interface, not the local variables. > One can have allocatable arrays in module data block, for instance, where they a global. f2py supports wrapping these allocatable arrays to python. See, for example, http://cens.ioc.ee/projects/f2py2e/usersguide/index.html#allocatable-arrays Pearu -------------- next part -------------- An HTML attachment was scrubbed... URL: From caseywstark at gmail.com Tue Jul 3 14:38:36 2012 From: caseywstark at gmail.com (Casey W. Stark) Date: Tue, 3 Jul 2012 11:38:36 -0700 Subject: [Numpy-discussion] f2py with allocatable arrays In-Reply-To: References: <4FF2FF94.4080904@molden.no> Message-ID: Hi all. Thanks for the speedy responses! I'll try to respond to all... The first idea is to split up the routine into two -- one to compute the final size of the arrays, and the second to fill them in. I might end up doing this, because it is simplest, but it means creating the initial conditions twice, throwing them away the first time. Not a huge deal because this setup is not that big (a few minutes), but it will take twice as long. Sturla, this is valid Fortran, but I agree it might just be a bad idea. The Fortran 90/95 Explained book mentions this in the allocatable dummy arguments section and has an example using an array with allocatable, intent(out) in a subrountine. You can also see this in the PDF linked from http://fortranwiki.org/fortran/show/Allocatable+enhancements. I have never used array pointers in Fortran, but I might give this a shot. 
Are there any problems returning pointers to arrays back to python though? As for storing the arrays in the module data block, I would like to avoid that approach if possible. It doesn't make sense for these to be module level components when the size and values depend on the input to a subroutine in the module. Best, Casey On Tue, Jul 3, 2012 at 10:24 AM, Pearu Peterson wrote: > > > On Tue, Jul 3, 2012 at 5:20 PM, Sturla Molden wrote: > >> >> As for f2py: Allocatable arrays are local variables for internal use, >> and they are not a part of the subroutine's calling interface. f2py only >> needs to know about the interface, not the local variables. >> > > One can have allocatable arrays in module data block, for instance, where > they a global. f2py supports wrapping these allocatable arrays to python. > See, for example, > > > http://cens.ioc.ee/projects/f2py2e/usersguide/index.html#allocatable-arrays > > Pearu > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nouiz at nouiz.org Tue Jul 3 14:41:59 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Tue, 3 Jul 2012 14:41:59 -0400 Subject: [Numpy-discussion] Would a patch with a function for incrementing an array with advanced indexing be accepted? In-Reply-To: References: Message-ID: Hi, Here is code example that work only with different index: import numpy x=numpy.zeros((5,5)) x[[0,2,4]]+=numpy.random.rand(3,5) print x This won't work if in the list [0,2,4], there is index duplication, but with your new code, it will. I think it is the most used case of advanced indexing. At least, for our lab:) Fred On Mon, Jul 2, 2012 at 7:48 PM, John Salvatier wrote: > Hi Fred, > > That's an excellent idea, but I am not too familiar with this use case. What > do you mean by list in 'matrix[list]'? Is the use case, just incrementing > in place a sub matrix of a numpy matrix? > > John > > > On Fri, Jun 29, 2012 at 11:43 AM, Fr?d?ric Bastien wrote: >> >> Hi, >> >> I personnaly can't review this as this is too much in NumPy internal. >> >> My only comments is that you could add a test and an example in the >> doc for matrix[list]. I think it will be the most used case. >> >> Fred >> >> On Wed, Jun 27, 2012 at 7:47 PM, John Salvatier >> wrote: >> > I've submitted a pull request ( https://github.com/numpy/numpy/pull/326 >> > ). >> > I'm new to the numpy and python internals, so feedback is greatly >> > appreciated. >> > >> > >> > On Tue, Jun 26, 2012 at 12:10 PM, Travis Oliphant >> > wrote: >> >> >> >> >> >> On Jun 26, 2012, at 1:34 PM, Fr?d?ric Bastien wrote: >> >> >> >> > Hi, >> >> > >> >> > I think he was referring that making NUMPY_ARRAY_OBJECT[...] syntax >> >> > support the operation that you said is hard. But having a separate >> >> > function do it is less complicated as you said. >> >> >> >> Yes. That's precisely what I meant. Thank you for clarifying. >> >> >> >> -Travis >> >> >> >> > >> >> > Fred >> >> > >> >> > On Tue, Jun 26, 2012 at 1:27 PM, John Salvatier >> >> > wrote: >> >> >> Can you clarify why it would be super hard? I just reused the code >> >> >> for >> >> >> advanced indexing (a modification of PyArray_SetMap). Am I missing >> >> >> something >> >> >> crucial? 
>> >> >> >> >> >> >> >> >> >> >> >> On Tue, Jun 26, 2012 at 9:57 AM, Travis Oliphant >> >> >> >> >> >> wrote: >> >> >>> >> >> >>> >> >> >>> On Jun 26, 2012, at 11:46 AM, John Salvatier wrote: >> >> >>> >> >> >>> Hello, >> >> >>> >> >> >>> If you increment an array using advanced indexing and have repeated >> >> >>> indexes, the array doesn't get repeatedly >> >> >>> incremented, >> >> >>> http://comments.gmane.org/gmane.comp.python.numeric.general/50291. >> >> >>> I wrote a C function that does incrementing with repeated indexes >> >> >>> correctly. >> >> >>> The branch is here (https://github.com/jsalvatier/numpy see the >> >> >>> last >> >> >>> two >> >> >>> commits). Would a patch with a cleaned up version of a function >> >> >>> like >> >> >>> this be >> >> >>> accepted into numpy? I'm not experienced writing numpy C code so >> >> >>> I'm >> >> >>> sure it >> >> >>> still needs improvement. >> >> >>> >> >> >>> >> >> >>> This is great. It is an often-requested feature. It's *very >> >> >>> difficult* >> >> >>> to do without changing fundamentally what NumPy is. But, yes this >> >> >>> would be >> >> >>> a great pull request. >> >> >>> >> >> >>> Thanks, >> >> >>> >> >> >>> -Travis >> >> >>> >> >> >>> >> >> >>> >> >> >>> _______________________________________________ >> >> >>> NumPy-Discussion mailing list >> >> >>> NumPy-Discussion at scipy.org >> >> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >>> >> >> >> >> >> >> >> >> >> _______________________________________________ >> >> >> NumPy-Discussion mailing list >> >> >> NumPy-Discussion at scipy.org >> >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> >> > _______________________________________________ >> >> > NumPy-Discussion mailing list >> >> > NumPy-Discussion at scipy.org >> >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> _______________________________________________ >> >> NumPy-Discussion mailing list >> >> NumPy-Discussion at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> > >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From stefan at sun.ac.za Tue Jul 3 16:59:17 2012 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 3 Jul 2012 13:59:17 -0700 Subject: [Numpy-discussion] Buildbot status In-Reply-To: <9C4E5D4B-7E6D-4640-A5E1-C4CF162B60E0@continuum.io> References: <9C4E5D4B-7E6D-4640-A5E1-C4CF162B60E0@continuum.io> Message-ID: On Mon, Jul 2, 2012 at 6:31 PM, Travis Oliphant wrote: > Ondrej should have time to work on this full time in the coming days. That's great; having Ondrej on this full time will help a great deal. > NumFocus can provide some funding needed for maintaining servers, etc, but keeping build bots active requires the efforts of multiple volunteers. > If anyone has build machines to offer, please let Ondrej know so he can coordinate getting Jenkins slaves onto them and hooking them up to the master. I'd be glad if we could discuss it mainly on list, just to keep everyone in the loop. 
For now, I think we need to answer the two questions mentioned above: 1) What happens to the current installation on buildbot.scipy.org? 2) If we're not keeping buildbot, or if we want additional systems, which ones should we use? Jenkins? and then also 3) Which build slaves should we employ? We have the current build slaves, the nipy ones have been volunteered, and then there's the GCC build farm mentioned by Fernando. Ondrej, perhaps you can comment on what you had in mind? If we have a clear plan of action before you start off, we can all help out in putting the pieces together. St?fan From pivanov314 at gmail.com Tue Jul 3 19:47:20 2012 From: pivanov314 at gmail.com (Paul Ivanov) Date: Tue, 3 Jul 2012 16:47:20 -0700 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> Message-ID: On Mon, Jul 2, 2012 at 2:59 PM, Nathaniel Smith wrote: > On Mon, Jul 2, 2012 at 10:06 PM, Robert Kern wrote: >> On Mon, Jul 2, 2012 at 9:43 PM, Benjamin Root wrote: >>> >>> On Mon, Jul 2, 2012 at 4:34 PM, Nathaniel Smith wrote: >> >>>> I think this ship has sailed, but it'd be worth looking into lazy >>>> importing, where 'numpy.fft' isn't actually imported until someone >>>> starts using it. There are a bunch of libraries that do this, and one >>>> would have to fiddle to get compatibility with all the different >>>> python versions and make sure you're not killing performance (might >>>> have to be in C) but something along the lines of >>>> >>>> class _FFTModule(object): >>>> def __getattribute__(self, name): >>>> mod = importlib.import_module("numpy.fft") >>>> _FFTModule.__getattribute__ = mod.__getattribute__ >>>> return getattr(mod, name) >>>> fft = _FFTModule() >>> >>> Not sure how this would impact projects like ipython that does >>> tab-completion support, but I know that that would drive me nuts in my basic >>> tab-completion setup I have for my regular python terminal. Of course, in >>> the grand scheme of things, that really isn't all that important, I don't >>> think. >> >> We used to do it for scipy. It did interfere with tab completion. It >> did drive many people nuts. > > Sounds like a bug in your old code, or else the REPLs have gotten > better? I just pasted the above code into both ipython and python > prompts, and typing 'fft.' worked fine in both cases. dir(fft) > works first try as well. > > (If you try this, don't forget to 'import importlib' first, and note > importlib is 2.7+ only. Obviously importlib is not necessary but it > makes the minimal example less tedious.) For anyone interested, I worked out a small lazy-loading class that we use in nitime [1], which does not need importlib and thus works on python versions before 2.7 and also has a bit of repr pretty printing. I wrote about this to Scipy-Dev [2], and in the original nitime PR [3] tested that it works in python 2.5, 2.6, 2.7, 3.0, 3.1 and 3.2. Since that time, we've only changed how we deal with the one known limitation: reloading a lazily-loaded module was a noop in that PR, but now throws an error (there's one line commented out if the noop behavior is preferred). Here's a link to the rendered docs [4], but if you just grab the LazyImport class from [1], you can do fft = LazyImport('numpy.fft') 1. https://github.com/nipy/nitime/blob/master/nitime/lazyimports.py 2. http://mail.scipy.org/pipermail/scipy-dev/2011-September/016606.html 3. https://github.com/nipy/nitime/pull/88 4. 
http://nipy.sourceforge.net/nitime/api/generated/nitime.lazyimports.html#module-nitime.lazyimports best, -- Paul Ivanov 314 address only used for lists, off-list direct email at: http://pirsquared.org | GPG/PGP key id: 0x0F3E28F7 From sturla at molden.no Tue Jul 3 19:59:26 2012 From: sturla at molden.no (Sturla Molden) Date: Wed, 04 Jul 2012 01:59:26 +0200 Subject: [Numpy-discussion] f2py with allocatable arrays In-Reply-To: References: <4FF2FF94.4080904@molden.no> Message-ID: <4FF3875E.6060004@molden.no> Den 03.07.2012 19:24, skrev Pearu Peterson: > > One can have allocatable arrays in module data block, for instance, where > they a global In Fortran 2003 one can also have allocatable arrays as members in derived types. But neither was the case here. The allocatable was a dummy variable in a subroutine's interface, declared with intent(out). That is an error the compiler should trap, because it is doomed to segfault. Sturla From sturla at molden.no Tue Jul 3 20:23:37 2012 From: sturla at molden.no (Sturla Molden) Date: Wed, 04 Jul 2012 02:23:37 +0200 Subject: [Numpy-discussion] f2py with allocatable arrays In-Reply-To: References: <4FF2FF94.4080904@molden.no> Message-ID: <4FF38D09.7080907@molden.no> Den 03.07.2012 20:38, skrev Casey W. Stark: > > Sturla, this is valid Fortran, but I agree it might just be a bad > idea. The Fortran 90/95 Explained book mentions this in the > allocatable dummy arguments section and has an example using an array > with allocatable, intent(out) in a subrountine. You can also see this > in the PDF linked from > http://fortranwiki.org/fortran/show/Allocatable+enhancements. Ok, so it's valid Fortran 2003. I never came any longer than to Fortran 95 :-) Make sure any Fortran code using this have the extension .f03 -- not .f95 or .f90 -- or it might crash horribly. > I have never used array pointers in Fortran, but I might give this a > shot. Are there any problems returning pointers to arrays back to > python though? In the Fortran 2003 ISO C binding you can find the methods F_C_POINTER and C_F_POINTER that will convert between C and Fortran pointers. A Fortran pointer is an array struct, much like a NumPy view array, and has dimensions and strides. It can reference contiguous and discontiguous blocks of memory. A C pointer is just a memory address. You must therefore use special methods to convert between C and Fortran pointers. There is an extension to Cython called fwrap that will generate this kind of boiler-plate code for conversion between NumPy and Fortran using the ISO C bindings in Fortran 2003. It is an alternative to f2py, though less mature. At the binary level, a Fortran pointer and an allocatable array usually have the same representation. But there are semantic differences between them. In Fortran 2003, pointers are less useful than in Fortran 95, except when interfacing with C through the ISO C bindings. That is because you can put allocatable arrays as memers in derived types, and (as I learned today) use them as dummy variables. In Fortran 95 those would be common cases for using pointers. By the way: The Fortran counter part to a C pointer is a Cray pointer, which is a common non-standard extension to the language. A Cray pointer, when supported, is a pair of variables: one storing the address and the other dereferencing the address. 
Sturla From sturla at molden.no Tue Jul 3 20:27:32 2012 From: sturla at molden.no (Sturla Molden) Date: Wed, 04 Jul 2012 02:27:32 +0200 Subject: [Numpy-discussion] f2py with allocatable arrays In-Reply-To: <4FF3875E.6060004@molden.no> References: <4FF2FF94.4080904@molden.no> <4FF3875E.6060004@molden.no> Message-ID: <4FF38DF4.2010208@molden.no> Den 04.07.2012 01:59, skrev Sturla Molden: > But neither was the case here. The allocatable was a dummy variable in > a subroutine's interface, declared with intent(out). That is an error > the compiler should trap, because it is doomed to segfault. Ok, so the answer here seems to be: In Fortran 90 is compiler dependent. In Fortran 95 it is an error. In extensions to Fortran 95 it is legal. In Fortran 2003 and 2008 it is legal. Sturla From paul.anton.letnes at gmail.com Wed Jul 4 00:51:29 2012 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Wed, 4 Jul 2012 06:51:29 +0200 Subject: [Numpy-discussion] f2py with allocatable arrays In-Reply-To: <4FF38D09.7080907@molden.no> References: <4FF2FF94.4080904@molden.no> <4FF38D09.7080907@molden.no> Message-ID: <73D0660F-B5A7-4CEB-967A-D3829D4F1EC3@gmail.com> On 4. juli 2012, at 02:23, Sturla Molden wrote: > Den 03.07.2012 20:38, skrev Casey W. Stark: >> >> Sturla, this is valid Fortran, but I agree it might just be a bad >> idea. The Fortran 90/95 Explained book mentions this in the >> allocatable dummy arguments section and has an example using an array >> with allocatable, intent(out) in a subrountine. You can also see this >> in the PDF linked from >> http://fortranwiki.org/fortran/show/Allocatable+enhancements. > > Ok, so it's valid Fortran 2003. I never came any longer than to Fortran > 95 :-) Make sure any Fortran code using this have the extension .f03 -- > not .f95 or .f90 -- or it might crash horribly. > To be pedantic: to my knowledge, the common convention is .f for fixed and .f for free form source code. As is stated in the link, "..the Fortran standard itself does not define any extension..." http://fortranwiki.org/fortran/show/File+extensions As one example, ifort doesn't even want to read files with the .f95 suffix. You'll have to pass it a flag stating that "yep, that's a fortran file all right". I use the .f90 suffix everywhere, but maybe that's just me. Paul From heng at cantab.net Wed Jul 4 08:09:00 2012 From: heng at cantab.net (Henry Gomersall) Date: Wed, 04 Jul 2012 13:09:00 +0100 Subject: [Numpy-discussion] 32-bit numpy build on a 64-bit machine Message-ID: <1341403740.31565.11.camel@farnsworth> Does anyone have any experience building a 32-bit version of numpy on a 64-bit linux machine? I'm trying to build a python stack that I can use to handle a (closed source) 32-bit library. Much messing around with environment variables and linker flags has got some of the way, perhaps, but not enough to give me confidence I'm going down the right track. Advice would be appreciated! Cheers, Henry From cannonjunk at hotmail.co.uk Wed Jul 4 10:16:16 2012 From: cannonjunk at hotmail.co.uk (abc def) Date: Wed, 4 Jul 2012 15:16:16 +0100 Subject: [Numpy-discussion] numpy and readline installation fails Message-ID: Hello, I'm new to python and I'd like to learn about numpy / scipy / matplotlib, but I'm having trouble getting started. 
I'm following the instructions here: http://www.scipy.org/Getting_Started First I installed the latest version of python from python.org by downloading the dmg file, since I read that it doesn't work with apple's installer, and then installed numpy / scipy / matplotlib by downloading the relevent dmg files. I then downloaded ipython, ran "easy_install readline" and then ran "python setup.py install". Then I started ipython with "ipython -pylab" as per the instructions but then I get muliple error messages: $ ipython --pylab /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/utils/rlineimpl.py:111: RuntimeWarning: ****************************************************************************** libedit detected - readline will not be well behaved, including but not limited to: * crashes on tab completion * incorrect history navigation * corrupting long-lines * failure to wrap or indent lines properly It is highly recommended that you install readline, which is easy_installable: easy_install readline Note that `pip install readline` generally DOES NOT WORK, because it installs to site-packages, which come *after* lib-dynload in sys.path, where readline is located. It must be `easy_install readline`, or to a custom location on your PYTHONPATH (even --user comes after lib-dyload). ****************************************************************************** RuntimeWarning) Python 2.7.3 (v2.7.3:70274d53c1dd, Apr 9 2012, 20:52:43) Type "copyright", "credits" or "license" for more information. IPython 0.13 -- An enhanced Interactive Python. ? -> Introduction and overview of IPython's features. %quickref -> Quick reference. help -> Python's own help system. object? -> Details about 'object', use 'object??' for extra details. [TerminalIPythonApp] GUI event loop or pylab initialization failed --------------------------------------------------------------------------- ImportError Traceback (most recent call last) /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/core/pylabtools.pyc in find_gui_and_backend(gui) 194 """ 195 --> 196 import matplotlib 197 198 if gui and gui != 'auto': /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/__init__.py in () 131 import sys, os, tempfile 132 --> 133 from matplotlib.rcsetup import (defaultParams, 134 validate_backend, 135 validate_toolbar, /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/rcsetup.py in () 17 import warnings 18 from matplotlib.fontconfig_pattern import parse_fontconfig_pattern ---> 19 from matplotlib.colors import is_color_like 20 21 #interactive_bk = ['gtk', 'gtkagg', 'gtkcairo', 'fltkagg', 'qtagg', 'qt4agg', /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/colors.py in () 50 """ 51 import re ---> 52 import numpy as np 53 from numpy import ma 54 import matplotlib.cbook as cbook ImportError: No module named numpy In [1]: ? it seems the installation of numpy and readline didn't work, and there are problems with matplotlib, even though I think I followed all the instructions carefully. I can't figure out what I did wrong. Can anybody help? I'm running mac os 10.6. Thank you! 
Tom From paul.anton.letnes at gmail.com Wed Jul 4 12:03:49 2012 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Wed, 4 Jul 2012 18:03:49 +0200 Subject: [Numpy-discussion] numpy and readline installation fails In-Reply-To: References: Message-ID: Hello, I don't know exactly what went wrong. I'd start out my debugging by 1) which python # see whether you're running apple's python in /usr/bin/python, or the one you tried to install 2) which easy_install # did you run Apple-python's easy_install, or the one you (tried to) installed? 3) If all of the above match, try running python (not ipython) and try to import numpy. Apple's python ships with numpy (at least the Lion / 10.7 one does). 4) Next, print numpy.__file__ to see whether numpy got installed to where it should In general, I'd advice you to install one package at a time, then test it to see whether it has been installed properly. When you're confident everything is OK, move on to the next package. For instance, test numpy by $ python -c 'import numpy; numpy.test()' and scipy with $ python -c 'import scipy;scipy.test()' (for instance). When you're sure the fundament (python, numpy) is in order, proceed with the house (scipy, matplotlib). Cheers Paul On 4. juli 2012, at 16:16, abc def wrote: > > Hello, > > I'm new to python and I'd like to learn about numpy / scipy / matplotlib, but I'm having trouble getting started. > > I'm following the instructions here: http://www.scipy.org/Getting_Started > > First I installed the latest version of python from python.org by downloading the dmg file, since I read that it doesn't work with apple's installer, and then installed numpy / scipy / matplotlib by downloading the relevent dmg files. > I then downloaded ipython, ran "easy_install readline" and then ran "python setup.py install". > > Then I started ipython with "ipython -pylab" as per the instructions but then I get muliple error messages: > > > > > > $ ipython --pylab > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/utils/rlineimpl.py:111: RuntimeWarning: > ****************************************************************************** > libedit detected - readline will not be well behaved, including but not limited to: > * crashes on tab completion > * incorrect history navigation > * corrupting long-lines > * failure to wrap or indent lines properly > It is highly recommended that you install readline, which is easy_installable: > easy_install readline > Note that `pip install readline` generally DOES NOT WORK, because > it installs to site-packages, which come *after* lib-dynload in sys.path, > where readline is located. It must be `easy_install readline`, or to a custom > location on your PYTHONPATH (even --user comes after lib-dyload). > ****************************************************************************** > RuntimeWarning) > Python 2.7.3 (v2.7.3:70274d53c1dd, Apr 9 2012, 20:52:43) > Type "copyright", "credits" or "license" for more information. > > IPython 0.13 -- An enhanced Interactive Python. > ? -> Introduction and overview of IPython's features. > %quickref -> Quick reference. > help -> Python's own help system. > object? -> Details about 'object', use 'object??' for extra details. 
> [TerminalIPythonApp] GUI event loop or pylab initialization failed > --------------------------------------------------------------------------- > ImportError Traceback (most recent call last) > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/core/pylabtools.pyc in find_gui_and_backend(gui) > 194 """ > 195 > --> 196 import matplotlib > 197 > 198 if gui and gui != 'auto': > > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/__init__.py in () > 131 import sys, os, tempfile > 132 > --> 133 from matplotlib.rcsetup import (defaultParams, > 134 validate_backend, > 135 validate_toolbar, > > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/rcsetup.py in () > 17 import warnings > 18 from matplotlib.fontconfig_pattern import parse_fontconfig_pattern > ---> 19 from matplotlib.colors import is_color_like > 20 > 21 #interactive_bk = ['gtk', 'gtkagg', 'gtkcairo', 'fltkagg', 'qtagg', 'qt4agg', > > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/colors.py in () > 50 """ > 51 import re > ---> 52 import numpy as np > 53 from numpy import ma > 54 import matplotlib.cbook as cbook > > ImportError: No module named numpy > > In [1]: > > > > > it seems the installation of numpy and readline didn't work, and there are problems with matplotlib, even though I think I followed all the instructions carefully. > I can't figure out what I did wrong. Can anybody help? > > I'm running mac os 10.6. > > Thank you! > > Tom > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From aron at ahmadia.net Wed Jul 4 12:24:46 2012 From: aron at ahmadia.net (Aron Ahmadia) Date: Wed, 4 Jul 2012 19:24:46 +0300 Subject: [Numpy-discussion] numpy and readline installation fails In-Reply-To: References: Message-ID: How do the .dmg files work? Are these binary installers into the system Python? I would expect that these wouldn't work with a manually installed Python from python.org, but I have no experience with them. Tom, I have had very good luck with the brew Python 2.7 installer, which will give you a /usr/local Python base install and easy_install. Brew can also install readline correctly for you. https://github.com/mxcl/homebrew/wiki/Homebrew-and-Python If you want an all-in-one solution, you can also grab either the free EPD distribution or another of the "all-in-one" packs. A On Wed, Jul 4, 2012 at 7:03 PM, Paul Anton Letnes < paul.anton.letnes at gmail.com> wrote: > Hello, > > I don't know exactly what went wrong. I'd start out my debugging by > 1) which python # see whether you're running apple's python in > /usr/bin/python, or the one you tried to install > 2) which easy_install # did you run Apple-python's easy_install, or the > one you (tried to) installed? > 3) If all of the above match, try running python (not ipython) and try to > import numpy. Apple's python ships with numpy (at least the Lion / 10.7 one > does). > 4) Next, print numpy.__file__ to see whether numpy got installed to where > it should > > In general, I'd advice you to install one package at a time, then test it > to see whether it has been installed properly. When you're confident > everything is OK, move on to the next package. For instance, test numpy by > $ python -c 'import numpy; numpy.test()' > and scipy with > $ python -c 'import scipy;scipy.test()' > (for instance). 
> > When you're sure the fundament (python, numpy) is in order, proceed with > the house (scipy, matplotlib). > > Cheers > Paul > > > On 4. juli 2012, at 16:16, abc def wrote: > > > > > Hello, > > > > I'm new to python and I'd like to learn about numpy / scipy / > matplotlib, but I'm having trouble getting started. > > > > I'm following the instructions here: > http://www.scipy.org/Getting_Started > > > > First I installed the latest version of python from python.org by > downloading the dmg file, since I read that it doesn't work with apple's > installer, and then installed numpy / scipy / matplotlib by downloading the > relevent dmg files. > > I then downloaded ipython, ran "easy_install readline" and then ran > "python setup.py install". > > > > Then I started ipython with "ipython -pylab" as per the instructions but > then I get muliple error messages: > > > > > > > > > > > > $ ipython --pylab > > > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/utils/rlineimpl.py:111: > RuntimeWarning: > > > ****************************************************************************** > > libedit detected - readline will not be well behaved, including but not > limited to: > > * crashes on tab completion > > * incorrect history navigation > > * corrupting long-lines > > * failure to wrap or indent lines properly > > It is highly recommended that you install readline, which is > easy_installable: > > easy_install readline > > Note that `pip install readline` generally DOES NOT WORK, because > > it installs to site-packages, which come *after* lib-dynload in sys.path, > > where readline is located. It must be `easy_install readline`, or to a > custom > > location on your PYTHONPATH (even --user comes after lib-dyload). > > > ****************************************************************************** > > RuntimeWarning) > > Python 2.7.3 (v2.7.3:70274d53c1dd, Apr 9 2012, 20:52:43) > > Type "copyright", "credits" or "license" for more information. > > > > IPython 0.13 -- An enhanced Interactive Python. > > ? -> Introduction and overview of IPython's features. > > %quickref -> Quick reference. > > help -> Python's own help system. > > object? -> Details about 'object', use 'object??' for extra details. 
> > [TerminalIPythonApp] GUI event loop or pylab initialization failed > > > --------------------------------------------------------------------------- > > ImportError Traceback (most recent call > last) > > > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/core/pylabtools.pyc > in find_gui_and_backend(gui) > > 194 """ > > 195 > > --> 196 import matplotlib > > 197 > > 198 if gui and gui != 'auto': > > > > > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/__init__.py > in () > > 131 import sys, os, tempfile > > 132 > > --> 133 from matplotlib.rcsetup import (defaultParams, > > 134 validate_backend, > > 135 validate_toolbar, > > > > > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/rcsetup.py > in () > > 17 import warnings > > 18 from matplotlib.fontconfig_pattern import parse_fontconfig_pattern > > ---> 19 from matplotlib.colors import is_color_like > > 20 > > 21 #interactive_bk = ['gtk', 'gtkagg', 'gtkcairo', 'fltkagg', > 'qtagg', 'qt4agg', > > > > > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/colors.py > in () > > 50 """ > > 51 import re > > ---> 52 import numpy as np > > 53 from numpy import ma > > 54 import matplotlib.cbook as cbook > > > > ImportError: No module named numpy > > > > In [1]: > > > > > > > > > > it seems the installation of numpy and readline didn't work, and there > are problems with matplotlib, even though I think I followed all the > instructions carefully. > > I can't figure out what I did wrong. Can anybody help? > > > > I'm running mac os 10.6. > > > > Thank you! > > > > Tom > > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Wed Jul 4 12:44:46 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 4 Jul 2012 18:44:46 +0200 Subject: [Numpy-discussion] numpy and readline installation fails In-Reply-To: References: Message-ID: On Wed, Jul 4, 2012 at 6:24 PM, Aron Ahmadia wrote: > How do the .dmg files work? Are these binary installers into the system > Python? I would expect that these wouldn't work with a manually installed > Python from python.org, but I have no experience with them. > They're binary installers for the python.org Python. So they install into /Library/Frameworks/Python.framework/. They don't work with the system (Apple) Python; I think leaving that alone and installing from python.orgis in general good advice. > > Tom, I have had very good luck with the brew Python 2.7 installer, which > will give you a /usr/local Python base install and easy_install. Brew can > also install readline correctly for you. > > https://github.com/mxcl/homebrew/wiki/Homebrew-and-Python > > Homebrew appears to be better than MacPorts/Fink, but installing everything from source is not the first advice I'd give to a new user. Ralf If you want an all-in-one solution, you can also grab either the free EPD > distribution or another of the "all-in-one" packs. 
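For illustration, a quick way to check which interpreter and which numpy a command actually picks up, along the lines of Paul's checklist, is a short session like this (a sketch only; the paths and version shown are made-up examples and will differ from machine to machine):

    $ which python
    /usr/local/bin/python
    $ which ipython
    /Library/Frameworks/Python.framework/Versions/2.7/bin/ipython
    $ python -c "import numpy; print numpy.__version__; print numpy.__file__"
    1.6.2
    /usr/local/lib/python2.7/site-packages/numpy/__init__.pyc

If python and ipython resolve to different installations, packages installed for one are invisible to the other, which is exactly the kind of mismatch that produces an "ImportError: No module named numpy" like the one above.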
> > A > > > On Wed, Jul 4, 2012 at 7:03 PM, Paul Anton Letnes < > paul.anton.letnes at gmail.com> wrote: > >> Hello, >> >> I don't know exactly what went wrong. I'd start out my debugging by >> 1) which python # see whether you're running apple's python in >> /usr/bin/python, or the one you tried to install >> 2) which easy_install # did you run Apple-python's easy_install, or the >> one you (tried to) installed? >> 3) If all of the above match, try running python (not ipython) and try to >> import numpy. Apple's python ships with numpy (at least the Lion / 10.7 one >> does). >> 4) Next, print numpy.__file__ to see whether numpy got installed to where >> it should >> >> In general, I'd advice you to install one package at a time, then test it >> to see whether it has been installed properly. When you're confident >> everything is OK, move on to the next package. For instance, test numpy by >> $ python -c 'import numpy; numpy.test()' >> and scipy with >> $ python -c 'import scipy;scipy.test()' >> (for instance). >> >> When you're sure the fundament (python, numpy) is in order, proceed with >> the house (scipy, matplotlib). >> >> Cheers >> Paul >> >> >> On 4. juli 2012, at 16:16, abc def wrote: >> >> > >> > Hello, >> > >> > I'm new to python and I'd like to learn about numpy / scipy / >> matplotlib, but I'm having trouble getting started. >> > >> > I'm following the instructions here: >> http://www.scipy.org/Getting_Started >> > >> > First I installed the latest version of python from python.org by >> downloading the dmg file, since I read that it doesn't work with apple's >> installer, and then installed numpy / scipy / matplotlib by downloading the >> relevent dmg files. >> > I then downloaded ipython, ran "easy_install readline" and then ran >> "python setup.py install". >> > >> > Then I started ipython with "ipython -pylab" as per the instructions >> but then I get muliple error messages: >> > >> > >> > >> > >> > >> > $ ipython --pylab >> > >> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/utils/rlineimpl.py:111: >> RuntimeWarning: >> > >> ****************************************************************************** >> > libedit detected - readline will not be well behaved, including but not >> limited to: >> > * crashes on tab completion >> > * incorrect history navigation >> > * corrupting long-lines >> > * failure to wrap or indent lines properly >> > It is highly recommended that you install readline, which is >> easy_installable: >> > easy_install readline >> > Note that `pip install readline` generally DOES NOT WORK, because >> > it installs to site-packages, which come *after* lib-dynload in >> sys.path, >> > where readline is located. It must be `easy_install readline`, or to a >> custom >> > location on your PYTHONPATH (even --user comes after lib-dyload). >> > >> ****************************************************************************** >> > RuntimeWarning) >> > Python 2.7.3 (v2.7.3:70274d53c1dd, Apr 9 2012, 20:52:43) >> > Type "copyright", "credits" or "license" for more information. >> > >> > IPython 0.13 -- An enhanced Interactive Python. >> > ? -> Introduction and overview of IPython's features. >> > %quickref -> Quick reference. >> > help -> Python's own help system. >> > object? -> Details about 'object', use 'object??' for extra details. 
>> > [TerminalIPythonApp] GUI event loop or pylab initialization failed >> > >> --------------------------------------------------------------------------- >> > ImportError Traceback (most recent call >> last) >> > >> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/core/pylabtools.pyc >> in find_gui_and_backend(gui) >> > 194 """ >> > 195 >> > --> 196 import matplotlib >> > 197 >> > 198 if gui and gui != 'auto': >> > >> > >> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/__init__.py >> in () >> > 131 import sys, os, tempfile >> > 132 >> > --> 133 from matplotlib.rcsetup import (defaultParams, >> > 134 validate_backend, >> > 135 validate_toolbar, >> > >> > >> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/rcsetup.py >> in () >> > 17 import warnings >> > 18 from matplotlib.fontconfig_pattern import >> parse_fontconfig_pattern >> > ---> 19 from matplotlib.colors import is_color_like >> > 20 >> > 21 #interactive_bk = ['gtk', 'gtkagg', 'gtkcairo', 'fltkagg', >> 'qtagg', 'qt4agg', >> > >> > >> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/colors.py >> in () >> > 50 """ >> > 51 import re >> > ---> 52 import numpy as np >> > 53 from numpy import ma >> > 54 import matplotlib.cbook as cbook >> > >> > ImportError: No module named numpy >> > >> > In [1]: >> > >> > >> > >> > >> > it seems the installation of numpy and readline didn't work, and there >> are problems with matplotlib, even though I think I followed all the >> instructions carefully. >> > I can't figure out what I did wrong. Can anybody help? >> > >> > I'm running mac os 10.6. >> > >> > Thank you! >> > >> > Tom >> > >> > >> > >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Jul 4 14:21:38 2012 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 4 Jul 2012 19:21:38 +0100 Subject: [Numpy-discussion] Change in memmap behaviour In-Reply-To: References: <50B02802-AF3A-4D88-8875-AF8298FA55FD@gmail.com> Message-ID: On Tue, Jul 3, 2012 at 4:08 PM, Nathaniel Smith wrote: > On Tue, Jul 3, 2012 at 10:35 AM, Thouis (Ray) Jones wrote: >> On Mon, Jul 2, 2012 at 11:52 PM, Sveinung Gundersen wrote: >>> >>> On 2. juli 2012, at 22.40, Nathaniel Smith wrote: >>> >>>> On Mon, Jul 2, 2012 at 6:54 PM, Sveinung Gundersen wrote: >>>>> [snip] >>>>> >>>>> >>>>> >>>>> Your actual memory usage may not have increased as much as you think, >>>>> since memmap objects don't necessarily take much memory -- it sounds >>>>> like you're leaking virtual memory, but your resident set size >>>>> shouldn't go up as much. >>>>> >>>>> >>>>> As I understand it, memmap objects retain the contents of the memmap in >>>>> memory after it has been read the first time (in a lazy manner). Thus, when >>>>> reading a slice of a 24GB file, only that part recides in memory. 
Our system >>>>> reads a slice of a memmap, calculates something (say, the sum), and then >>>>> deletes the memmap. It then loops through this for consequitive slices, >>>>> retaining a low memory usage. Consider the following code: >>>>> >>>>> import numpy as np >>>>> res = [] >>>>> vecLen = 3095677412 >>>>> for i in xrange(vecLen/10**8+1): >>>>> x = i * 10**8 >>>>> y = min((i+1) * 10**8, vecLen) >>>>> res.append(np.memmap('val.float64', dtype='float64')[x:y].sum()) >>>>> >>>>> The memory usage of this code on a 24GB file (one value for each nucleotide >>>>> in the human DNA!) is 23g resident memory after the loop is finished (not >>>>> 24g for some reason..). >>>>> >>>>> Running the same code on 1.5.1rc1 gives a resident memory of 23m after the >>>>> loop. >>>> >>>> Your memory measurement tools are misleading you. The same memory is >>>> resident in both cases, just in one case your tools say it is >>>> operating system disk cache (and not attributed to your app), and in >>>> the other case that same memory, treated in the same way by the OS, is >>>> shown as part of your app's resident memory. Virtual memory is >>>> confusing... >>> >>> But the crucial difference is perhaps that the disk cache can be cleared by the OS if needed, but not the application memory in the same way, which must be swapped to disk? Or am I still confused? >>> >>> (snip) >>> >>>>> >>>>> Great! Any idea on whether such a patch may be included in 1.7? >>>> >>>> Not really, if I or you or someone else gets inspired to take the time >>>> to write a patch soon then it will be, otherwise not... >>>> >>>> -N >>> >>> I have now tried to add a patch, in the way you proposed, but I may have gotten it wrong.. >>> >>> http://projects.scipy.org/numpy/ticket/2179 >> >> I put this in a github repo, and added tests (author credit to Sveinung) >> https://github.com/thouis/numpy/tree/mmap_children >> >> I'm not sure which branch to issue a PR request against, though. > > Looks good to me, thanks to both of you! > > Obviously should be merged to master; beyond that I'm not sure. We > definitely want it in 1.7, but I'm not sure if that's been branched > yet or not. (Or rather, it has been branched, but then maybe it was > unbranched again? Travis?) Since it was a 1.6 regression it'd make > sense to cherrypick to the 1.6 branch too, just in case it gets > another release. Merged into master and maintenance/1.6.x, but not maintenance/1.7.x, I'll let Ondrej or Travis figure that out... -N From matrixhasu at gmail.com Wed Jul 4 16:56:58 2012 From: matrixhasu at gmail.com (Sandro Tosi) Date: Wed, 4 Jul 2012 22:56:58 +0200 Subject: [Numpy-discussion] Numpy regression in 1.6.2 in deducing the dtype for record array In-Reply-To: References: Message-ID: Hello, On Mon, Jul 2, 2012 at 7:58 PM, Sandro Tosi wrote: > Hello, > I'd like to point you to this bug report just reported to Debian: > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=679948 > > It would be really awesome if you could give a look and comment if the > proposed fix would be appropriate. Did you have a chance to look at this email? 
-- Sandro Tosi (aka morph, morpheus, matrixhasu) My website: http://matrixhasu.altervista.org/ Me at Debian: http://wiki.debian.org/SandroTosi From ralf.gommers at googlemail.com Thu Jul 5 02:10:56 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 5 Jul 2012 08:10:56 +0200 Subject: [Numpy-discussion] Numpy regression in 1.6.2 in deducing the dtype for record array In-Reply-To: References: Message-ID: On Wed, Jul 4, 2012 at 10:56 PM, Sandro Tosi wrote: > Hello, > > On Mon, Jul 2, 2012 at 7:58 PM, Sandro Tosi wrote: > > Hello, > > I'd like to point you to this bug report just reported to Debian: > > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=679948 > > > > It would be really awesome if you could give a look and comment if the > > proposed fix would be appropriate. > > Did you have a chance to look at this email? > The commit identified by Yaroslav looks like the right one. It just needs to be backported to 1.6.x. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From scott.sinclair.za at gmail.com Thu Jul 5 02:18:23 2012 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Thu, 5 Jul 2012 08:18:23 +0200 Subject: [Numpy-discussion] Numpy regression in 1.6.2 in deducing the dtype for record array In-Reply-To: References: Message-ID: On 5 July 2012 08:10, Ralf Gommers wrote: > > > On Wed, Jul 4, 2012 at 10:56 PM, Sandro Tosi wrote: >> >> Hello, >> >> On Mon, Jul 2, 2012 at 7:58 PM, Sandro Tosi wrote: >> > Hello, >> > I'd like to point you to this bug report just reported to Debian: >> > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=679948 >> > >> > It would be really awesome if you could give a look and comment if the >> > proposed fix would be appropriate. >> >> Did you have a chance to look at this email? > > > The commit identified by Yaroslav looks like the right one. It just needs to > be backported to 1.6.x. Except that cherry picking the commit to the 1.6.x branch doesn't apply cleanly. It'll take some work by someone familiar with that part of the code.. Cheers, Scott From matrixhasu at gmail.com Thu Jul 5 11:46:26 2012 From: matrixhasu at gmail.com (Sandro Tosi) Date: Thu, 5 Jul 2012 17:46:26 +0200 Subject: [Numpy-discussion] Numpy regression in 1.6.2 in deducing the dtype for record array In-Reply-To: References: Message-ID: On Thu, Jul 5, 2012 at 8:18 AM, Scott Sinclair wrote: > Except that cherry picking the commit to the 1.6.x branch doesn't > apply cleanly. It'll take some work by someone familiar with that part > of the code.. That's actually what I was looking for :) someone knowing that code to backport the fix to 1.6.x Cheers -- Sandro Tosi (aka morph, morpheus, matrixhasu) My website: http://matrixhasu.altervista.org/ Me at Debian: http://wiki.debian.org/SandroTosi From ralf.gommers at googlemail.com Thu Jul 5 12:38:53 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 5 Jul 2012 18:38:53 +0200 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> <7219F010-A3FC-4E89-9834-5C29EE79198F@dalkescientific.com> Message-ID: On Tue, Jul 3, 2012 at 1:16 AM, Andrew Dalke wrote: > On Jul 3, 2012, at 12:46 AM, David Cournapeau wrote: > > It is indeed irrelevant to your end goal, but it does affect the > > interpretation of what import_array does, and thus of your benchmark > > Indeed. > > > Focusing on polynomial seems the only sensible action. 
Except for > > test, all the other stuff seem difficult to change without breaking > > anything. > > I confirm that when I comment out numpy/__init__.py's "import polynomial" > then the import time for numpy.core.multiarray goes from > > 0.084u 0.031s 0:00.11 100.0% 0+0k 0+0io 0pf+0w > > to > > 0.058u 0.028s 0:00.08 87.5% 0+0k 0+0io 0pf+0w > > > numpy/polynomial imports: > from polynomial import Polynomial > from chebyshev import Chebyshev > from legendre import Legendre > from hermite import Hermite > from hermite_e import HermiteE > from laguerre import Laguerre > > and there's no easy way to make these be lazy imports. > > > Strange! The bottom of hermite.py has: > > exec polytemplate.substitute(name='Hermite', nick='herm', domain='[-1,1]') > > as well as similar code in laguerre.py, chebyshev.py, hermite_e.py, > and polynomial.py. > > I bet there's a lot of overhead generating and exec'ing > those for each import! > Looks like it. That could easily be done at build time though. Making that change and your proposed change to the test functions, which seems fine, will likely be enough to reach your 40% target. No need for new imports or lazy loading then I hope. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Thu Jul 5 12:49:14 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 5 Jul 2012 18:49:14 +0200 Subject: [Numpy-discussion] Preferring gfortran over g77 on OS X and other distributions? In-Reply-To: References: Message-ID: On Wed, Jun 27, 2012 at 11:26 PM, Aron Ahmadia wrote: > I've promoted gfortran to be the default compiler on OS X over vendor > compilers (to be more compatible with Linux), and made a similar adjustment > for the platform detection. I've promoted gfortran over g77 but not vendor > compilers on the other 'nixes. I left the Windows compiler options alone. > > https://github.com/numpy/numpy/pull/325 Aron made this change for {mac, linux, posix, sun, irix, aix} in his PR. I think it's a good idea for {mac, linux} and perhaps posix, and to leave the other platforms alone. Does anyone else have an opinion on this? Ralf On Wed, Jun 27, 2012 at 10:46 PM, Ralf Gommers wrote: > > > On Mon, Jun 18, 2012 at 9:47 AM, Aron Ahmadia wrote: > >> f2py, by default, seems to prefer g77 (no longer maintained, deprecated, >> speedy, doesn't support Fortran 90 or Fortran 95) over gfortran >> (maintained, slower, Fortran 90 and Fortran 95 support). >> >> This causes problems when we try to compile Fortran 90 extensions using >> f2py on platforms where both g77 and gfortran are installed without >> manually switching the compiler's flags. It is a very minor edit to the >> fcompiler/__init__.py file to prefer gfortran over g77 on OS X, and I can >> think of almost no reason not to do so, since the Vectorize framework (OS X >> tuned LAPACK/BLAS) appears to be ABI compatible with gfortran. I am not >> sure what the situation is on the distributions that numpy is trying to >> support, but my feeling is that g77 should not be preferred when gfortran >> is available. >> > > On Windows g77 is still the default. But indeed, on OS X gfortran is the > recommended Fortran compiler. A PR for this would be useful. 
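As a practical aside, whatever the default ends up being, a user can always ask numpy.distutils for gfortran explicitly; a sketch (the module and file names here are made up):

    # list the Fortran compilers numpy.distutils can find on this machine
    $ f2py -c --help-fcompiler

    # build with gfortran (gnu95) rather than g77 (gnu)
    $ python setup.py build --fcompiler=gnu95
    $ f2py -c --fcompiler=gnu95 -m example example.f90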
> > Ralf > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From caseywstark at gmail.com Thu Jul 5 15:12:55 2012 From: caseywstark at gmail.com (Casey W. Stark) Date: Thu, 5 Jul 2012 12:12:55 -0700 Subject: [Numpy-discussion] f2py with allocatable arrays In-Reply-To: <73D0660F-B5A7-4CEB-967A-D3829D4F1EC3@gmail.com> References: <4FF2FF94.4080904@molden.no> <4FF38D09.7080907@molden.no> <73D0660F-B5A7-4CEB-967A-D3829D4F1EC3@gmail.com> Message-ID: Hi all. Thanks for the help again. I ended up going with running it twice -- once for the final number of particles and second for the positions. Sturla, given that this functionality is so standard dependent, I decided to ditch it. It works with my gfortran, but who knows with other machines and compilers. Paul, I also stick with .f90 for everything. Best, Casey On Tue, Jul 3, 2012 at 9:51 PM, Paul Anton Letnes < paul.anton.letnes at gmail.com> wrote: > > On 4. juli 2012, at 02:23, Sturla Molden wrote: > > > Den 03.07.2012 20:38, skrev Casey W. Stark: > >> > >> Sturla, this is valid Fortran, but I agree it might just be a bad > >> idea. The Fortran 90/95 Explained book mentions this in the > >> allocatable dummy arguments section and has an example using an array > >> with allocatable, intent(out) in a subrountine. You can also see this > >> in the PDF linked from > >> http://fortranwiki.org/fortran/show/Allocatable+enhancements. > > > > Ok, so it's valid Fortran 2003. I never came any longer than to Fortran > > 95 :-) Make sure any Fortran code using this have the extension .f03 -- > > not .f95 or .f90 -- or it might crash horribly. > > > > To be pedantic: to my knowledge, the common convention is .f for fixed and > .f for free form source code. As is stated in the link, "..the Fortran > standard itself does not define any extension..." > > http://fortranwiki.org/fortran/show/File+extensions > > As one example, ifort doesn't even want to read files with the .f95 > suffix. You'll have to pass it a flag stating that "yep, that's a fortran > file all right". > > I use the .f90 suffix everywhere, but maybe that's just me. > > Paul > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej.certik at gmail.com Thu Jul 5 19:00:17 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 5 Jul 2012 16:00:17 -0700 Subject: [Numpy-discussion] f2py with allocatable arrays In-Reply-To: <4FF38DF4.2010208@molden.no> References: <4FF2FF94.4080904@molden.no> <4FF3875E.6060004@molden.no> <4FF38DF4.2010208@molden.no> Message-ID: On Tue, Jul 3, 2012 at 5:27 PM, Sturla Molden wrote: > Den 04.07.2012 01:59, skrev Sturla Molden: >> But neither was the case here. The allocatable was a dummy variable in >> a subroutine's interface, declared with intent(out). That is an error >> the compiler should trap, because it is doomed to segfault. > > Ok, so the answer here seems to be: > > In Fortran 90 is compiler dependent. > In Fortran 95 it is an error. 
> In extensions to Fortran 95 it is legal. > In Fortran 2003 and 2008 it is legal. That's exactly right. I only use allocatable arrays (in intent(out) as well), never pointers, and I never call deallocate(), because Fortran frees it for me automatically (even for intent(out)). I think it's a good programming practice, as there can't be any memory leaks. If Fortran 95 compatibility is required, I changed allocatable, intent(out) variables to pointers, and then go up the call chain and explicitly deallocate the arrays when they are not needed, to avoid memory leaks. Ondrej From ondrej.certik at gmail.com Thu Jul 5 19:03:17 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 5 Jul 2012 16:03:17 -0700 Subject: [Numpy-discussion] f2py with allocatable arrays In-Reply-To: <73D0660F-B5A7-4CEB-967A-D3829D4F1EC3@gmail.com> References: <4FF2FF94.4080904@molden.no> <4FF38D09.7080907@molden.no> <73D0660F-B5A7-4CEB-967A-D3829D4F1EC3@gmail.com> Message-ID: On Tue, Jul 3, 2012 at 9:51 PM, Paul Anton Letnes wrote: > > On 4. juli 2012, at 02:23, Sturla Molden wrote: > >> Den 03.07.2012 20:38, skrev Casey W. Stark: >>> >>> Sturla, this is valid Fortran, but I agree it might just be a bad >>> idea. The Fortran 90/95 Explained book mentions this in the >>> allocatable dummy arguments section and has an example using an array >>> with allocatable, intent(out) in a subrountine. You can also see this >>> in the PDF linked from >>> http://fortranwiki.org/fortran/show/Allocatable+enhancements. >> >> Ok, so it's valid Fortran 2003. I never came any longer than to Fortran >> 95 :-) Make sure any Fortran code using this have the extension .f03 -- >> not .f95 or .f90 -- or it might crash horribly. >> > > To be pedantic: to my knowledge, the common convention is .f for fixed and .f for free form source code. As is stated in the link, "..the Fortran standard itself does not define any extension..." I assume you meant ".f for fixed and .f90 for free form" > > http://fortranwiki.org/fortran/show/File+extensions > > As one example, ifort doesn't even want to read files with the .f95 suffix. You'll have to pass it a flag stating that "yep, that's a fortran file all right". Yep. > > I use the .f90 suffix everywhere, but maybe that's just me. Exactly, the same here. Ondrej From ondrej.certik at gmail.com Thu Jul 5 19:36:41 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 5 Jul 2012 16:36:41 -0700 Subject: [Numpy-discussion] Buildbot status In-Reply-To: References: <9C4E5D4B-7E6D-4640-A5E1-C4CF162B60E0@continuum.io> Message-ID: Hi Stefan, On Tue, Jul 3, 2012 at 1:59 PM, St?fan van der Walt wrote: > On Mon, Jul 2, 2012 at 6:31 PM, Travis Oliphant wrote: >> Ondrej should have time to work on this full time in the coming days. > > That's great; having Ondrej on this full time will help a great deal. > >> NumFocus can provide some funding needed for maintaining servers, etc, but keeping build bots active requires the efforts of multiple volunteers. >> If anyone has build machines to offer, please let Ondrej know so he can coordinate getting Jenkins slaves onto them and hooking them up to the master. > > I'd be glad if we could discuss it mainly on list, just to keep > everyone in the loop. > > For now, I think we need to answer the two questions mentioned above: > > 1) What happens to the current installation on buildbot.scipy.org? > 2) If we're not keeping buildbot, or if we want additional systems, > which ones should we use? Jenkins? 
> > and then also > > 3) Which build slaves should we employ? We have the current build > slaves, the nipy ones have been volunteered, and then there's the GCC > build farm mentioned by Fernando. > > Ondrej, perhaps you can comment on what you had in mind? If we have a > clear plan of action before you start off, we can all help out in > putting the pieces together. The only work that I did so far was to learn Jenkins and write these Fabric files to automatically setup testing for pretty much any project, from the command line: https://github.com/certik/vagrant-jenkins If we go with EC2, then the same Fabric files can be used to provision the EC2. That way, we can keep the configuration in one public repository and people can send pull requests with improvements. And thus pretty much anyone should be able to help with the maintenance. Currently it works in a way that somebody sets it up, then becomes busy, and then the buildbots stop working. It might be too idealistic though, but if we can have most of the setup in Fabric and the rest well documented, it will be much easier for other people to help out. So feel free to go ahead with what you think is the best and I will join you in a few days. Ondrej From hhchen at psu.edu Thu Jul 5 21:00:37 2012 From: hhchen at psu.edu (Hung-Hsuan Chen) Date: Thu, 5 Jul 2012 21:00:37 -0400 Subject: [Numpy-discussion] Numpy Installation Problem on Redhat Linux Message-ID: Dear all, I've built blas, lapack, and atlas libraries, as shown below. $ ls ~/lib/atlas/lib/ libatlas.a libcblas.a libf77blas.a liblapack.a libptcblas.a libptf77blas.a The library location are specified by site.cfg file, as shown below. [DEFAULT] library_dirs = /home/username/lib/atlas/lib include_dirs = /home/username/lib/atlas/include [blas] libraries = libf77blas, libcblas, libatlas [lapack] libraries = liblapack, libf77blas, libcblas, libatlas I've tried to build numpy (version 1.6.2) by $ python setup.py build --fcompiler=gnu However, I got the following error message: error: Command "/usr/bin/g77 -g -Wall -g -Wall -shared build/temp.linux-x86_64-2.6/build/src.linux-x86_64-2.6/scipy/integrate/vodemodule.o build/temp.linux-x86_64-2.6/build/src.linux-x86_64-2.6/fortranobject.o -L/home/username/lib/ -L/usr/lib64 -Lbuild/temp.linux-x86_64-2.6 -lodepack -llinpack_lite -lmach -lblas -lpython2.6 -lg2c -o build/lib.linux-x86_64-2.6/scipy/integrate/vode.so" failed with exit status I've searched internet for possible solutions whole day but don't have any progress so far. Anyone has any idea of how to fix this? Thanks! From paul.anton.letnes at gmail.com Fri Jul 6 00:42:39 2012 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Fri, 6 Jul 2012 06:42:39 +0200 Subject: [Numpy-discussion] Numpy Installation Problem on Redhat Linux In-Reply-To: References: Message-ID: Hi, are you sure that you want g77 and not gfortran? If you want gfortran, you should pass the > --fcompiler=gnu95 flag to setup.py. Which redhat version are you building on? (I don't know red hat well enough to comment, but perhaps someone else do...) Paul On 6. juli 2012, at 03:00, Hung-Hsuan Chen wrote: > Dear all, > > I've built blas, lapack, and atlas libraries, as shown below. > > $ ls ~/lib/atlas/lib/ > libatlas.a libcblas.a libf77blas.a liblapack.a libptcblas.a libptf77blas.a > > The library location are specified by site.cfg file, as shown below. 
> > [DEFAULT] > library_dirs = /home/username/lib/atlas/lib > include_dirs = /home/username/lib/atlas/include > > [blas] > libraries = libf77blas, libcblas, libatlas > > [lapack] > libraries = liblapack, libf77blas, libcblas, libatlas > > I've tried to build numpy (version 1.6.2) by > $ python setup.py build --fcompiler=gnu > > However, I got the following error message: > error: Command "/usr/bin/g77 -g -Wall -g -Wall -shared > build/temp.linux-x86_64-2.6/build/src.linux-x86_64-2.6/scipy/integrate/vodemodule.o > build/temp.linux-x86_64-2.6/build/src.linux-x86_64-2.6/fortranobject.o > -L/home/username/lib/ -L/usr/lib64 -Lbuild/temp.linux-x86_64-2.6 > -lodepack -llinpack_lite -lmach -lblas -lpython2.6 -lg2c -o > build/lib.linux-x86_64-2.6/scipy/integrate/vode.so" failed with exit > status > > I've searched internet for possible solutions whole day but don't have > any progress so far. Anyone has any idea of how to fix this? Thanks! > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From hhchen at psu.edu Fri Jul 6 05:12:06 2012 From: hhchen at psu.edu (Hung-Hsuan Chen) Date: Fri, 6 Jul 2012 05:12:06 -0400 Subject: [Numpy-discussion] Numpy Installation Problem on Redhat Linux In-Reply-To: References: Message-ID: Thank you for pointing this out. Actually I tried both --fcompiler=gnu95 and --fcompiler=gnu flags, but I got the same error message. As for the redhat version, I'm using Red Hat Enterprise Linux Server release 6.3 (Santiago). On Fri, Jul 6, 2012 at 12:42 AM, Paul Anton Letnes wrote: > Hi, > > are you sure that you want g77 and not gfortran? If you want gfortran, you should pass the >> --fcompiler=gnu95 > flag to setup.py. > > Which redhat version are you building on? (I don't know red hat well enough to comment, but perhaps someone else do...) > > Paul > > On 6. juli 2012, at 03:00, Hung-Hsuan Chen wrote: > >> Dear all, >> >> I've built blas, lapack, and atlas libraries, as shown below. >> >> $ ls ~/lib/atlas/lib/ >> libatlas.a libcblas.a libf77blas.a liblapack.a libptcblas.a libptf77blas.a >> >> The library location are specified by site.cfg file, as shown below. >> >> [DEFAULT] >> library_dirs = /home/username/lib/atlas/lib >> include_dirs = /home/username/lib/atlas/include >> >> [blas] >> libraries = libf77blas, libcblas, libatlas >> >> [lapack] >> libraries = liblapack, libf77blas, libcblas, libatlas >> >> I've tried to build numpy (version 1.6.2) by >> $ python setup.py build --fcompiler=gnu >> >> However, I got the following error message: >> error: Command "/usr/bin/g77 -g -Wall -g -Wall -shared >> build/temp.linux-x86_64-2.6/build/src.linux-x86_64-2.6/scipy/integrate/vodemodule.o >> build/temp.linux-x86_64-2.6/build/src.linux-x86_64-2.6/fortranobject.o >> -L/home/username/lib/ -L/usr/lib64 -Lbuild/temp.linux-x86_64-2.6 >> -lodepack -llinpack_lite -lmach -lblas -lpython2.6 -lg2c -o >> build/lib.linux-x86_64-2.6/scipy/integrate/vode.so" failed with exit >> status >> >> I've searched internet for possible solutions whole day but don't have >> any progress so far. Anyone has any idea of how to fix this? Thanks! 
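One quick sanity check here is to ask numpy.distutils what it actually resolves from your site.cfg before running the full build (a sketch; 'atlas' is the standard system_info section name and the printed dictionary will vary):

    $ python -c "from numpy.distutils.system_info import get_info; print get_info('atlas')"

If the dictionary is empty, or lists libraries other than the ones under ~/lib/atlas/lib, then the site.cfg is not being picked up the way you intend; the same detection results are also printed near the top of the setup.py build output.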
>> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From aron at ahmadia.net Fri Jul 6 05:15:00 2012 From: aron at ahmadia.net (Aron Ahmadia) Date: Fri, 6 Jul 2012 12:15:00 +0300 Subject: [Numpy-discussion] Preferring gfortran over g77 on OS X and other distributions? In-Reply-To: References: Message-ID: > > Aron made this change for {mac, linux, posix, sun, irix, aix} in his PR. I > think it's a good idea for {mac, linux} and perhaps posix, and to leave the > other platforms alone. Does anyone else have an opinion on this? > Ralf, that sounds good to me. If you feel that we should leave sun, irix, and aix alone, we should probably leave out posix as well. If everyone concurs, I'll issue a new PR adjusting mac and linux. A -------------- next part -------------- An HTML attachment was scrubbed... URL: From paul.anton.letnes at gmail.com Fri Jul 6 05:21:24 2012 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Fri, 6 Jul 2012 11:21:24 +0200 Subject: [Numpy-discussion] Numpy Installation Problem on Redhat Linux In-Reply-To: References: Message-ID: <81872A39-786E-49CC-9177-1EC413FB045B@gmail.com> > > However, I got the following error message: > error: Command "/usr/bin/g77 -g -Wall -g -Wall -shared > build/temp.linux-x86_64-2.6/build/src.linux-x86_64-2.6/scipy/integrate/vodemodule.o > build/temp.linux-x86_64-2.6/build/src.linux-x86_64-2.6/fortranobject.o > -L/home/username/lib/ -L/usr/lib64 -Lbuild/temp.linux-x86_64-2.6 > -lodepack -llinpack_lite -lmach -lblas -lpython2.6 -lg2c -o > build/lib.linux-x86_64-2.6/scipy/integrate/vode.so" failed with exit > status I'm sure there must have been more output? It does say that the command "failed", but not _why_ it failed. I suggest posting the entire output either in an email, or on a webpage (gist.github.com, for instance) and giving the link. It's very very hard to debug a build without the build log, so I'd suggest always giving it in the first instance. Paul From hhchen at psu.edu Fri Jul 6 06:00:26 2012 From: hhchen at psu.edu (Hung-Hsuan Chen) Date: Fri, 6 Jul 2012 06:00:26 -0400 Subject: [Numpy-discussion] Numpy Installation Problem on Redhat Linux In-Reply-To: <81872A39-786E-49CC-9177-1EC413FB045B@gmail.com> References: <81872A39-786E-49CC-9177-1EC413FB045B@gmail.com> Message-ID: Link is a great suggestion! I was hesitating about whether or not to paste such a long output. The site.cfg file is shown in the following link. https://gist.github.com/3059209 The output message for $ python setup.py build --fcompiler=gnu95 can be found at the URL. https://gist.github.com/3059320 Any suggestion is appreciated. On Fri, Jul 6, 2012 at 5:21 AM, Paul Anton Letnes wrote: >> >> However, I got the following error message: >> error: Command "/usr/bin/g77 -g -Wall -g -Wall -shared >> build/temp.linux-x86_64-2.6/build/src.linux-x86_64-2.6/scipy/integrate/vodemodule.o >> build/temp.linux-x86_64-2.6/build/src.linux-x86_64-2.6/fortranobject.o >> -L/home/username/lib/ -L/usr/lib64 -Lbuild/temp.linux-x86_64-2.6 >> -lodepack -llinpack_lite -lmach -lblas -lpython2.6 -lg2c -o >> build/lib.linux-x86_64-2.6/scipy/integrate/vode.so" failed with exit >> status > > I'm sure there must have been more output? 
It does say that the command "failed", but not _why_ it failed. I suggest posting the entire output either in an email, or on a webpage (gist.github.com, for instance) and giving the link. It's very very hard to debug a build without the build log, so I'd suggest always giving it in the first instance. > > Paul > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From aron at ahmadia.net Fri Jul 6 06:23:03 2012 From: aron at ahmadia.net (Aron Ahmadia) Date: Fri, 6 Jul 2012 06:23:03 -0400 Subject: [Numpy-discussion] Numpy Installation Problem on Redhat Linux In-Reply-To: References: <81872A39-786E-49CC-9177-1EC413FB045B@gmail.com> Message-ID: I usually find these problems by searching for "error" in the output, in your case the complete problem is at the bottom of the log. The relocation errors you're seeing are happening because the build process is trying to link in Atlas libraries (located here: /home/hxc249/lib/atlas/lib/) that were not compiled with -fPIC . Are you building ATLAS from source? If so, then you follow the instructions to recompile ATLAS with -fPIC enabled here: http://math-atlas.sourceforge.net/atlas_install/atlas_install.html#SECTION00043000000000000000 creating build/temp.linux-x86_64-2.6/numpy/core/blasdot compile options: '-DATLAS_INFO="\"3.9.83\"" -Inumpy/core/blasdot -I/home/hxc249/lib/atlas/include -Inumpy/core/include -Ibuild/src.linux-x86_64-2.6/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/usr/include/python2.6 -Ibuild/src.linux-x86_64-2.6/numpy/core/src/multiarray -Ibuild/src.linux-x86_64-2.6/numpy/core/src/umath -c' gcc: numpy/core/blasdot/_dotblas.c numpy/core/blasdot/_dotblas.c: In function ?dotblas_matrixproduct?: numpy/core/blasdot/_dotblas.c:239: warning: comparison of distinct pointer types lacks a cast numpy/core/blasdot/_dotblas.c:257: warning: passing argument 3 of ?(struct PyObject * (*)(struct PyObject *, struct PyObject *, struct PyArrayObject *))*(PyArray_API + 2240u)? from incompatible pointer type numpy/core/blasdot/_dotblas.c:257: note: expected ?struct PyArrayObject *? but argument is of type ?struct PyObject *? numpy/core/blasdot/_dotblas.c:292: warning: passing argument 3 of ?(struct PyObject * (*)(struct PyObject *, struct PyObject *, struct PyArrayObject *))*(PyArray_API + 2240u)? from incompatible pointer type numpy/core/blasdot/_dotblas.c:292: note: expected ?struct PyArrayObject *? but argument is of type ?struct PyObject *? 
gcc -pthread -shared build/temp.linux-x86_64-2.6/numpy/core/blasdot/_dotblas.o -L/home/hxc249/lib/atlas/lib -L/usr/lib64 -Lbuild/temp.linux-x86_64-2.6 -lptf77blas -lptcblas -latlas -lpython2.6 -o build/lib.linux-x86_64-2.6/numpy/core/_dotblas.so /usr/bin/ld: /home/hxc249/lib/atlas/lib/libptcblas.a(cblas_dptgemm.o): relocation R_X86_64_32 against `.rodata.str1.8' can not be used when making a shared object; recompile with -fPIC /home/hxc249/lib/atlas/lib/libptcblas.a: could not read symbols: Bad value collect2: ld returned 1 exit status /usr/bin/ld: /home/hxc249/lib/atlas/lib/libptcblas.a(cblas_dptgemm.o): relocation R_X86_64_32 against `.rodata.str1.8' can not be used when making a shared object; recompile with -fPIC /home/hxc249/lib/atlas/lib/libptcblas.a: could not read symbols: Bad value collect2: ld returned 1 exit status error: Command "gcc -pthread -shared build/temp.linux-x86_64-2.6/numpy/core/blasdot/_dotblas.o -L/home/hxc249/lib/atlas/lib -L/usr/lib64 -Lbuild/temp.linux-x86_64-2.6 -lptf77blas -lptcblas -latlas -lpython2.6 -o build/lib.linux-x86_64-2.6/numpy/core/_dotblas.so" failed with exit status 1 On Fri, Jul 6, 2012 at 6:00 AM, Hung-Hsuan Chen wrote: > Link is a great suggestion! I was hesitating about whether or not to > paste such a long output. > > The site.cfg file is shown in the following link. > https://gist.github.com/3059209 > > The output message for > $ python setup.py build --fcompiler=gnu95 > can be found at the URL. > https://gist.github.com/3059320 > > Any suggestion is appreciated. > > On Fri, Jul 6, 2012 at 5:21 AM, Paul Anton Letnes > wrote: > >> > >> However, I got the following error message: > >> error: Command "/usr/bin/g77 -g -Wall -g -Wall -shared > >> > build/temp.linux-x86_64-2.6/build/src.linux-x86_64-2.6/scipy/integrate/vodemodule.o > >> build/temp.linux-x86_64-2.6/build/src.linux-x86_64-2.6/fortranobject.o > >> -L/home/username/lib/ -L/usr/lib64 -Lbuild/temp.linux-x86_64-2.6 > >> -lodepack -llinpack_lite -lmach -lblas -lpython2.6 -lg2c -o > >> build/lib.linux-x86_64-2.6/scipy/integrate/vode.so" failed with exit > >> status > > > > I'm sure there must have been more output? It does say that the command > "failed", but not _why_ it failed. I suggest posting the entire output > either in an email, or on a webpage (gist.github.com, for instance) and > giving the link. It's very very hard to debug a build without the build > log, so I'd suggest always giving it in the first instance. > > > > Paul > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dmccully at mail.nih.gov Fri Jul 6 07:30:45 2012 From: dmccully at mail.nih.gov (McCully, Dwayne (NIH/NLM/LHC) [C]) Date: Fri, 6 Jul 2012 07:30:45 -0400 Subject: [Numpy-discussion] Numpy test failure - How to fix Message-ID: <8A8D24241B862C41B5D5EADB47CE39E205728546E6@NIHMLBX01.nih.gov> Hope this is the right list to post this problem! I'm getting two errors when running a numpy (see below). Could someone tell me how to fix this or if the errors are not a concern. 
Dwayne python -c 'import numpy; numpy.test(verbose=2)' Python 2.7.3 Numpy 1.6.2 Nose 1.1.2 PowerPC Red Hat Linux 64 bit ====================================================================== FAIL: test_umath.test_nextafterl ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/usr/local/lib/python2.7/site-packages/numpy/testing/decorators.py", line 215, in knownfailer return f(*args, **kwargs) File "/usr/local/lib/python2.7/site-packages/numpy/core/tests/test_umath.py", line 1119, in test_nextafterl return _test_nextafter(np.longdouble) File "/usr/local/lib/python2.7/site-packages/numpy/core/tests/test_umath.py", line 1103, in _test_nextafter assert np.nextafter(one, two) - one == eps AssertionError ====================================================================== FAIL: test_umath.test_spacingl ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/usr/local/lib/python2.7/site-packages/numpy/testing/decorators.py", line 215, in knownfailer return f(*args, **kwargs) File "/usr/local/lib/python2.7/site-packages/numpy/core/tests/test_umath.py", line 1146, in test_spacingl return _test_spacing(np.longdouble) File "/usr/local/lib/python2.7/site-packages/numpy/core/tests/test_umath.py", line 1128, in _test_spacing assert np.spacing(one) == eps AssertionError ---------------------------------------------------------------------- Ran 3576 tests in 22.974s FAILED (KNOWNFAIL=6, failures=2) -------------- next part -------------- An HTML attachment was scrubbed... URL: From aron at ahmadia.net Fri Jul 6 07:36:01 2012 From: aron at ahmadia.net (Aron Ahmadia) Date: Fri, 6 Jul 2012 07:36:01 -0400 Subject: [Numpy-discussion] Numpy test failure - How to fix In-Reply-To: <8A8D24241B862C41B5D5EADB47CE39E205728546E6@NIHMLBX01.nih.gov> References: <8A8D24241B862C41B5D5EADB47CE39E205728546E6@NIHMLBX01.nih.gov> Message-ID: Disclosure: I'm not a numpy developer You've certainly got a functional numpy installation if those are the only two tests failing. Those two tests are related to the distance between floating-point numbers using long double precision (128 bits). If you're not using long doubles, then you probably don't need to worry about them. Still, the developers are interested in tests failing on non-x86 corner cases like this so thanks for reporting. -Aron -------------- next part -------------- An HTML attachment was scrubbed... URL: From dmccully at mail.nih.gov Fri Jul 6 07:38:45 2012 From: dmccully at mail.nih.gov (McCully, Dwayne (NIH/NLM/LHC) [C]) Date: Fri, 6 Jul 2012 07:38:45 -0400 Subject: [Numpy-discussion] Numpy test failure - How to fix In-Reply-To: References: <8A8D24241B862C41B5D5EADB47CE39E205728546E6@NIHMLBX01.nih.gov>, Message-ID: <8A8D24241B862C41B5D5EADB47CE39E2057299423F@NIHMLBX01.nih.gov> Good information to know Aron which I'll pass on to the group. Lets also see what the developers have to say since I'm hoping they watch this distribution list. 
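For anyone who wants to poke at this locally, the two failing assertions can be reproduced outside the test suite with a few lines that mirror _test_nextafter and _test_spacing (a probe only; the results are platform dependent):

    import numpy as np

    one = np.longdouble(1)
    two = np.longdouble(2)
    eps = np.finfo(np.longdouble).eps

    # the tests expect both of these to be True
    print np.nextafter(one, two) - one == eps
    print np.spacing(one) == eps

On the PowerPC box above these presumably come out False, consistent with Aron's point that the failures are about the platform's long double format rather than a broken installation.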
Dwayne ________________________________________ From: Aron Ahmadia [aron at ahmadia.net] Sent: Friday, July 06, 2012 7:36 AM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Numpy test failure - How to fix Disclosure: I'm not a numpy developer You've certainly got a functional numpy installation if those are the only two tests failing. Those two tests are related to the distance between floating-point numbers using long double precision (128 bits). If you're not using long doubles, then you probably don't need to worry about them. Still, the developers are interested in tests failing on non-x86 corner cases like this so thanks for reporting. -Aron From dalke at dalkescientific.com Fri Jul 6 09:48:29 2012 From: dalke at dalkescientific.com (Andrew Dalke) Date: Fri, 6 Jul 2012 15:48:29 +0200 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> <7219F010-A3FC-4E89-9834-5C29EE79198F@dalkescientific.com> Message-ID: <2C456B75-0CC0-4DDE-9943-3C96EAD7ECF8@dalkescientific.com> I followed the instructions at http://docs.scipy.org/doc/numpy/dev/gitwash/patching.html and added Ticket #2181 (with patch) at http://projects.scipy.org/numpy/ticket/2181 This remove the 5 'exec' calls from polynomial/*.py and improves the 'import numpy' time by about 25-30%. That is, on my laptop python -c 'import time; t1=time.time(); import numpy; print time.time()-t1' goes from 0.079 seconds to 0.057 (best of 10 for both cases). The patch does mean that if someone edits the template then they will need to run the template expansion script manually. I think it's well worth the effort. Cheers, Andrew dalke at dalkescientific.com From cannonjunk at hotmail.co.uk Fri Jul 6 09:57:43 2012 From: cannonjunk at hotmail.co.uk (abc def) Date: Fri, 6 Jul 2012 14:57:43 +0100 Subject: [Numpy-discussion] numpy and readline installation fails In-Reply-To: References: , , , Message-ID: Thank you for your advice. I followed the advice about installing python from homebrew. I then ran "brew doctor" and re-arranged the paths as it told me to. It also gives another list of warnings which I'll put at the bottom of this email. My question is now, what is the best way to install numpy (and scipy, and matplotlib), so that it will couple with the brew-installed python? My earlier dmg install of numpy still doesn't work: (I've tried "python -pylab" and just "python" with importing numpy) kuuki:~ tom$ which python /usr/local/bin/python kuuki:~ tom$ ipython -pylab WARNING: `-pylab` flag has been deprecated. ??? Use `--pylab` instead, or `--pylab=foo` to specify a backend./Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/utils/rlineimpl.py:111: RuntimeWarning: ****************************************************************************** libedit detected - readline will not be well behaved, including but not limited to: ?? * crashes on tab completion ?? * incorrect history navigation ?? * corrupting long-lines ?? * failure to wrap or indent lines properly It is highly recommended that you install readline, which is easy_installable: ???? easy_install readline Note that `pip install readline` generally DOES NOT WORK, because it installs to site-packages, which come *after* lib-dynload in sys.path, where readline is located.? It must be `easy_install readline`, or to a custom location on your PYTHONPATH (even --user comes after lib-dyload). 
****************************************************************************** ? RuntimeWarning) Python 2.7.3 (v2.7.3:70274d53c1dd, Apr? 9 2012, 20:52:43) Type "copyright", "credits" or "license" for more information. IPython 0.13 -- An enhanced Interactive Python. ????????? -> Introduction and overview of IPython's features. %quickref -> Quick reference. help????? -> Python's own help system. object??? -> Details about 'object', use 'object??' for extra details. [TerminalIPythonApp] GUI event loop or pylab initialization failed --------------------------------------------------------------------------- ImportError?????????????????????????????? Traceback (most recent call last) /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/core/pylabtools.pyc in find_gui_and_backend(gui) ??? 194???? """ ??? 195 --> 196???? import matplotlib ??? 197 ??? 198???? if gui and gui != 'auto': /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/__init__.py in () ??? 131 import sys, os, tempfile ??? 132 --> 133 from matplotlib.rcsetup import (defaultParams, ??? 134???????????????????????????????? validate_backend, ??? 135???????????????????????????????? validate_toolbar, /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/rcsetup.py in () ???? 17 import warnings ???? 18 from matplotlib.fontconfig_pattern import parse_fontconfig_pattern ---> 19 from matplotlib.colors import is_color_like ???? 20 ???? 21 #interactive_bk = ['gtk', 'gtkagg', 'gtkcairo', 'fltkagg', 'qtagg', 'qt4agg', /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/colors.py in () ???? 50 """ ???? 51 import re ---> 52 import numpy as np ???? 53 from numpy import ma ???? 54 import matplotlib.cbook as cbook ImportError: No module named numpy In [1]: KeyboardInterrupt In [1]: ^D Do you really want to exit ([y]/n)? y kuuki:~ tom$ python Python 2.7.3 (default, Jul? 5 2012, 08:21:27) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np Traceback (most recent call last): ? File "", line 1, in ImportError: No module named numpy >>> And the list of warnings: kuuki:~ tom$ brew doctor Error: Some directories in /usr/local/share/man aren't writable. This can happen if you "sudo make install" software that isn't managed by Homebrew. If a brew tries to add locale information to one of these directories, then the install will fail during the link step. You should probably `chown` them: ??? /usr/local/share/man/de ??? /usr/local/share/man/de/man1 ??? /usr/local/share/man/es ??? /usr/local/share/man/es/man1 ??? /usr/local/share/man/fr ??? /usr/local/share/man/fr/man1 ??? /usr/local/share/man/hr ??? /usr/local/share/man/hr/man1 ??? /usr/local/share/man/hu ??? /usr/local/share/man/hu/man1 ??? /usr/local/share/man/it ??? /usr/local/share/man/it/man1 ??? /usr/local/share/man/jp ??? /usr/local/share/man/jp/man1 ??? /usr/local/share/man/pl ??? /usr/local/share/man/pl/man1 ??? /usr/local/share/man/pt_BR ??? /usr/local/share/man/pt_BR/man1 ??? /usr/local/share/man/pt_PT ??? /usr/local/share/man/pt_PT/man1 ??? /usr/local/share/man/ro ??? /usr/local/share/man/ro/man1 ??? /usr/local/share/man/ru ??? /usr/local/share/man/ru/man1 ??? /usr/local/share/man/sk ??? /usr/local/share/man/sk/man1 ??? /usr/local/share/man/zh ??? /usr/local/share/man/zh/man1 Error: "config" scripts exist outside your system or Homebrew directories. 
`./configure` scripts often look for *-config scripts to determine if software packages are installed, and what additional flags to use when compiling and linking. Having additional scripts in your path can confuse software installed via Homebrew if the config script overrides a system or Homebrew provided script of the same name. We found the following "config" scripts: ??? /Library/Frameworks/Python.framework/Versions/2.7/bin/python-config ??? /Library/Frameworks/Python.framework/Versions/2.7/bin/python2-config ??? /Library/Frameworks/Python.framework/Versions/2.7/bin/python2.7-config ??? /Library/Frameworks/Python.framework/Versions/2.6/bin/python-config ??? /Library/Frameworks/Python.framework/Versions/2.6/bin/python2.6-config Error: Your Homebrew is outdated You haven't updated for at least 24 hours, this is a long time in brewland! Error: Unbrewed dylibs were found in /usr/local/lib. If you didn't put them there on purpose they could cause problems when building Homebrew formulae, and may need to be deleted. Unexpected dylibs: ??? /usr/local/lib/libgcc_ext.10.4.dylib??? /usr/local/lib/libgcc_ext.10.5.dylib??? /usr/local/lib/libgcc_s.1.dylib??? /usr/local/lib/libgfortran.3.dylib??? /usr/local/lib/libgomp.1.dylib??? /usr/local/lib/libmca_common_sm.1.dylib??? /usr/local/lib/libmpi.0.dylib??? /usr/local/lib/libmpi_cxx.0.dylib??? /usr/local/lib/libmpi_f77.0.dylib??? /usr/local/lib/libopen-pal.0.dylib??? /usr/local/lib/libopen-rte.0.dylib??? /usr/local/lib/libssp.0.dylib Error: Unbrewed .la files were found in /usr/local/lib. If you didn't put them there on purpose they could cause problems when building Homebrew formulae, and may need to be deleted. Unexpected .la files: ??? /usr/local/lib/libgfortran.la??? /usr/local/lib/libgmp.la??? /usr/local/lib/libgomp.la??? /usr/local/lib/libmca_common_sm.la??? /usr/local/lib/libmpc.la??? /usr/local/lib/libmpfr.la??? /usr/local/lib/libmpi.la??? /usr/local/lib/libmpi_cxx.la??? /usr/local/lib/libmpi_f77.la??? /usr/local/lib/libmpi_f90.la??? /usr/local/lib/libopen-pal.la??? /usr/local/lib/libopen-rte.la??? /usr/local/lib/libotf.la??? /usr/local/lib/libssp.la??? /usr/local/lib/libssp_nonshared.la Error: Unbrewed static libraries were found in /usr/local/lib. If you didn't put them there on purpose they could cause problems when building Homebrew formulae, and may need to be deleted. Unexpected static libraries: ??? /usr/local/lib/libgfortran.a??? /usr/local/lib/libgmp.a??? /usr/local/lib/libgomp.a??? /usr/local/lib/libiberty.a??? /usr/local/lib/libmpc.a??? /usr/local/lib/libmpfr.a??? /usr/local/lib/libmpi_f90.a??? /usr/local/lib/libotf.a??? /usr/local/lib/libssp.a??? /usr/local/lib/libssp_nonshared.a??? /usr/local/lib/libvt.a??? /usr/local/lib/libvt.fmpi.a??? /usr/local/lib/libvt.mpi.a??? /usr/local/lib/libvt.omp.a??? /usr/local/lib/libvt.ompi.a kuuki:~ tom$ brew install numpy Error: No available formula for numpy Some of these unexpected libraries seem to be from my earlier install of gfortran and mpi (nothing to do with homebrew). By the way, there seem to be many versions of python (apple, dmg, homebrew) on my macbook. Thanks again for your help!! Tom ________________________________ > Date: Wed, 4 Jul 2012 18:44:46 +0200 > From: ralf.gommers at googlemail.com > To: numpy-discussion at scipy.org > Subject: Re: [Numpy-discussion] numpy and readline installation fails > > > > On Wed, Jul 4, 2012 at 6:24 PM, Aron Ahmadia > > wrote: > How do the .dmg files work? Are these binary installers into the > system Python? 
I would expect that these wouldn't work with a manually > installed Python from python.org, but I have no > experience with them. > > They're binary installers for the python.org Python. > So they install into /Library/Frameworks/Python.framework/. They don't > work with the system (Apple) Python; I think leaving that alone and > installing from python.org is in general good > advice. > > > Tom, I have had very good luck with the brew Python 2.7 installer, > which will give you a /usr/local Python base install and easy_install. > Brew can also install readline correctly for you. > > https://github.com/mxcl/homebrew/wiki/Homebrew-and-Python > > Homebrew appears to be better than MacPorts/Fink, but installing > everything from source is not the first advice I'd give to a new user. > > Ralf > > If you want an all-in-one solution, you can also grab either the free > EPD distribution or another of the "all-in-one" packs. > > A > > > On Wed, Jul 4, 2012 at 7:03 PM, Paul Anton Letnes > > > wrote: > Hello, > > I don't know exactly what went wrong. I'd start out my debugging by > 1) which python # see whether you're running apple's python in > /usr/bin/python, or the one you tried to install > 2) which easy_install # did you run Apple-python's easy_install, or the > one you (tried to) installed? > 3) If all of the above match, try running python (not ipython) and try > to import numpy. Apple's python ships with numpy (at least the Lion / > 10.7 one does). > 4) Next, print numpy.__file__ to see whether numpy got installed to > where it should > > In general, I'd advice you to install one package at a time, then test > it to see whether it has been installed properly. When you're confident > everything is OK, move on to the next package. For instance, test numpy > by > $ python -c 'import numpy; numpy.test()' > and scipy with > $ python -c 'import scipy;scipy.test()' > (for instance). > > When you're sure the fundament (python, numpy) is in order, proceed > with the house (scipy, matplotlib). > > Cheers > Paul > > > On 4. juli 2012, at 16:16, abc def wrote: > > > > > Hello, > > > > I'm new to python and I'd like to learn about numpy / scipy / > matplotlib, but I'm having trouble getting started. > > > > I'm following the instructions here: http://www.scipy.org/Getting_Started > > > > First I installed the latest version of python from > python.org by downloading the dmg file, since I read > that it doesn't work with apple's installer, and then installed numpy / > scipy / matplotlib by downloading the relevent dmg files. > > I then downloaded ipython, ran "easy_install readline" and then ran > "python setup.py install". 
> > > > Then I started ipython with "ipython -pylab" as per the instructions > but then I get muliple error messages: > > > > > > > > > > > > $ ipython --pylab > > > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/utils/rlineimpl.py:111: > RuntimeWarning: > > > ****************************************************************************** > > libedit detected - readline will not be well behaved, including but > not limited to: > > * crashes on tab completion > > * incorrect history navigation > > * corrupting long-lines > > * failure to wrap or indent lines properly > > It is highly recommended that you install readline, which is > easy_installable: > > easy_install readline > > Note that `pip install readline` generally DOES NOT WORK, because > > it installs to site-packages, which come *after* lib-dynload in sys.path, > > where readline is located. It must be `easy_install readline`, or to > a custom > > location on your PYTHONPATH (even --user comes after lib-dyload). > > > ****************************************************************************** > > RuntimeWarning) > > Python 2.7.3 (v2.7.3:70274d53c1dd, Apr 9 2012, 20:52:43) > > Type "copyright", "credits" or "license" for more information. > > > > IPython 0.13 -- An enhanced Interactive Python. > > ? -> Introduction and overview of IPython's features. > > %quickref -> Quick reference. > > help -> Python's own help system. > > object? -> Details about 'object', use 'object??' for extra details. > > [TerminalIPythonApp] GUI event loop or pylab initialization failed > > --------------------------------------------------------------------------- > > ImportError Traceback (most recent call last) > > > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/core/pylabtools.pyc > in find_gui_and_backend(gui) > > 194 """ > > 195 > > --> 196 import matplotlib > > 197 > > 198 if gui and gui != 'auto': > > > > > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/__init__.py > in () > > 131 import sys, os, tempfile > > 132 > > --> 133 from matplotlib.rcsetup import (defaultParams, > > 134 validate_backend, > > 135 validate_toolbar, > > > > > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/rcsetup.py > in () > > 17 import warnings > > 18 from matplotlib.fontconfig_pattern import parse_fontconfig_pattern > > ---> 19 from matplotlib.colors import is_color_like > > 20 > > 21 #interactive_bk = ['gtk', 'gtkagg', 'gtkcairo', 'fltkagg', > 'qtagg', 'qt4agg', > > > > > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/colors.py > in () > > 50 """ > > 51 import re > > ---> 52 import numpy as np > > 53 from numpy import ma > > 54 import matplotlib.cbook as cbook > > > > ImportError: No module named numpy > > > > In [1]: > > > > > > > > > > it seems the installation of numpy and readline didn't work, and > there are problems with matplotlib, even though I think I followed all > the instructions carefully. > > I can't figure out what I did wrong. Can anybody help? > > > > I'm running mac os 10.6. > > > > Thank you! 
> > > > Tom > > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ NumPy-Discussion > mailing list NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From nouiz at nouiz.org Fri Jul 6 10:30:45 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Fri, 6 Jul 2012 10:30:45 -0400 Subject: [Numpy-discussion] PyArray_FILLWBYTE dangerous doc In-Reply-To: References: Message-ID: Hi, I just did a PR that update the doc to warn about this. https://github.com/numpy/numpy/pull/332 Fred On Thu, Jun 28, 2012 at 10:28 PM, Fr?d?ric Bastien wrote: > Hi, > > The doc of PyArray_FILLWBYTE here > http://docs.scipy.org/doc/numpy/reference/c-api.array.html is this > > > PyArray_FILLWBYTE(PyObject* obj, int val) > Fill the array pointed to by obj ?which must be a (subclass of) > bigndarray?with the contents of val (evaluated as a byte). > > In the code, what it does is call memset: > > numpy/core/include/numpy/ndarrayobject.h > #define PyArray_FILLWBYTE(obj, val) memset(PyArray_DATA(obj), val, \ > PyArray_NBYTES(obj)) > > This make it ignore completely the strides! > > So the easy fix would be to update the doc, the real fix is to test > the contiguity before calling memset, if not contiguous, call > something else appropriate. > > Fred From cannonjunk at hotmail.co.uk Fri Jul 6 11:03:40 2012 From: cannonjunk at hotmail.co.uk (abc def) Date: Fri, 6 Jul 2012 16:03:40 +0100 Subject: [Numpy-discussion] numpy and readline installation fails In-Reply-To: References: , , , , , , , Message-ID: My apologies for sending the last email out too quickly. I found this page (http://www.scipy.org/Installing_SciPy/Mac_OS_X) and, ignoring the parts about needing 4.0-version compilers (it doesn't work if you use 4.0, but does work for 4.2 (on osx 10.6, with homebrew's python)), it works fine for numpy and scipy - I recommend this for anyone else who is having similar trouble. Thank you again all very much for your help! Tom ---------------------------------------- > From: cannonjunk at hotmail.co.uk > To: numpy-discussion at scipy.org > Date: Fri, 6 Jul 2012 14:57:43 +0100 > Subject: Re: [Numpy-discussion] numpy and readline installation fails > > > Thank you for your advice. > I followed the advice about installing python from homebrew. I then ran "brew doctor" and re-arranged the paths as it told me to. It also gives another list of warnings which I'll put at the bottom of this email. > > My question is now, what is the best way to install numpy (and scipy, and matplotlib), so that it will couple with the brew-installed python? > My earlier dmg install of numpy still doesn't work: (I've tried "python -pylab" and just "python" with importing numpy) > > kuuki:~ tom$ which python > /usr/local/bin/python > kuuki:~ tom$ ipython -pylab > WARNING: `-pylab` flag has been deprecated. 
> Use `--pylab` instead, or `--pylab=foo` to specify a backend./Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/utils/rlineimpl.py:111: RuntimeWarning: > ****************************************************************************** > libedit detected - readline will not be well behaved, including but not limited to: > * crashes on tab completion > * incorrect history navigation > * corrupting long-lines > * failure to wrap or indent lines properly > It is highly recommended that you install readline, which is easy_installable: > easy_install readline > Note that `pip install readline` generally DOES NOT WORK, because > it installs to site-packages, which come *after* lib-dynload in sys.path, > where readline is located. It must be `easy_install readline`, or to a custom > location on your PYTHONPATH (even --user comes after lib-dyload). > ****************************************************************************** > RuntimeWarning) > Python 2.7.3 (v2.7.3:70274d53c1dd, Apr 9 2012, 20:52:43) > Type "copyright", "credits" or "license" for more information. > > IPython 0.13 -- An enhanced Interactive Python. > ? -> Introduction and overview of IPython's features. > %quickref -> Quick reference. > help -> Python's own help system. > object? -> Details about 'object', use 'object??' for extra details. > [TerminalIPythonApp] GUI event loop or pylab initialization failed > --------------------------------------------------------------------------- > ImportError Traceback (most recent call last) > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/core/pylabtools.pyc in find_gui_and_backend(gui) > 194 """ > 195 > --> 196 import matplotlib > 197 > 198 if gui and gui != 'auto': > > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/__init__.py in () > 131 import sys, os, tempfile > 132 > --> 133 from matplotlib.rcsetup import (defaultParams, > 134 validate_backend, > 135 validate_toolbar, > > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/rcsetup.py in () > 17 import warnings > 18 from matplotlib.fontconfig_pattern import parse_fontconfig_pattern > ---> 19 from matplotlib.colors import is_color_like > 20 > 21 #interactive_bk = ['gtk', 'gtkagg', 'gtkcairo', 'fltkagg', 'qtagg', 'qt4agg', > > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/colors.py in () > 50 """ > 51 import re > ---> 52 import numpy as np > 53 from numpy import ma > 54 import matplotlib.cbook as cbook > > ImportError: No module named numpy > > In [1]: > KeyboardInterrupt > > In [1]: ^D > Do you really want to exit ([y]/n)? y > kuuki:~ tom$ python > Python 2.7.3 (default, Jul 5 2012, 08:21:27) > [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin > Type "help", "copyright", "credits" or "license" for more information. > >>> import numpy as np > Traceback (most recent call last): > File "", line 1, in > ImportError: No module named numpy > >>> > > > > And the list of warnings: > kuuki:~ tom$ brew doctor > > Error: Some directories in /usr/local/share/man aren't writable. > This can happen if you "sudo make install" software that isn't managed > by Homebrew. If a brew tries to add locale information to one of these > directories, then the install will fail during the link step. 
> You should probably `chown` them: > > /usr/local/share/man/de > /usr/local/share/man/de/man1 > /usr/local/share/man/es > /usr/local/share/man/es/man1 > /usr/local/share/man/fr > /usr/local/share/man/fr/man1 > /usr/local/share/man/hr > /usr/local/share/man/hr/man1 > /usr/local/share/man/hu > /usr/local/share/man/hu/man1 > /usr/local/share/man/it > /usr/local/share/man/it/man1 > /usr/local/share/man/jp > /usr/local/share/man/jp/man1 > /usr/local/share/man/pl > /usr/local/share/man/pl/man1 > /usr/local/share/man/pt_BR > /usr/local/share/man/pt_BR/man1 > /usr/local/share/man/pt_PT > /usr/local/share/man/pt_PT/man1 > /usr/local/share/man/ro > /usr/local/share/man/ro/man1 > /usr/local/share/man/ru > /usr/local/share/man/ru/man1 > /usr/local/share/man/sk > /usr/local/share/man/sk/man1 > /usr/local/share/man/zh > /usr/local/share/man/zh/man1 > Error: "config" scripts exist outside your system or Homebrew directories. > `./configure` scripts often look for *-config scripts to determine if > software packages are installed, and what additional flags to use when > compiling and linking. > > Having additional scripts in your path can confuse software installed via > Homebrew if the config script overrides a system or Homebrew provided > script of the same name. We found the following "config" scripts: > > /Library/Frameworks/Python.framework/Versions/2.7/bin/python-config > /Library/Frameworks/Python.framework/Versions/2.7/bin/python2-config > /Library/Frameworks/Python.framework/Versions/2.7/bin/python2.7-config > /Library/Frameworks/Python.framework/Versions/2.6/bin/python-config > /Library/Frameworks/Python.framework/Versions/2.6/bin/python2.6-config > Error: Your Homebrew is outdated > You haven't updated for at least 24 hours, this is a long time in brewland! > Error: Unbrewed dylibs were found in /usr/local/lib. > If you didn't put them there on purpose they could cause problems when > building Homebrew formulae, and may need to be deleted. > > Unexpected dylibs: > /usr/local/lib/libgcc_ext.10.4.dylib /usr/local/lib/libgcc_ext.10.5.dylib /usr/local/lib/libgcc_s.1.dylib /usr/local/lib/libgfortran.3.dylib /usr/local/lib/libgomp.1.dylib /usr/local/lib/libmca_common_sm.1.dylib /usr/local/lib/libmpi.0.dylib /usr/local/lib/libmpi_cxx.0.dylib /usr/local/lib/libmpi_f77.0.dylib /usr/local/lib/libopen-pal.0.dylib /usr/local/lib/libopen-rte.0.dylib /usr/local/lib/libssp.0.dylib > Error: Unbrewed .la files were found in /usr/local/lib. > If you didn't put them there on purpose they could cause problems when > building Homebrew formulae, and may need to be deleted. > > Unexpected .la files: > /usr/local/lib/libgfortran.la /usr/local/lib/libgmp.la /usr/local/lib/libgomp.la /usr/local/lib/libmca_common_sm.la /usr/local/lib/libmpc.la /usr/local/lib/libmpfr.la /usr/local/lib/libmpi.la /usr/local/lib/libmpi_cxx.la /usr/local/lib/libmpi_f77.la /usr/local/lib/libmpi_f90.la /usr/local/lib/libopen-pal.la /usr/local/lib/libopen-rte.la /usr/local/lib/libotf.la /usr/local/lib/libssp.la /usr/local/lib/libssp_nonshared.la > Error: Unbrewed static libraries were found in /usr/local/lib. > If you didn't put them there on purpose they could cause problems when > building Homebrew formulae, and may need to be deleted. 
> > Unexpected static libraries: > /usr/local/lib/libgfortran.a /usr/local/lib/libgmp.a /usr/local/lib/libgomp.a /usr/local/lib/libiberty.a /usr/local/lib/libmpc.a /usr/local/lib/libmpfr.a /usr/local/lib/libmpi_f90.a /usr/local/lib/libotf.a /usr/local/lib/libssp.a /usr/local/lib/libssp_nonshared.a /usr/local/lib/libvt.a /usr/local/lib/libvt.fmpi.a /usr/local/lib/libvt.mpi.a /usr/local/lib/libvt.omp.a /usr/local/lib/libvt.ompi.a > kuuki:~ tom$ brew install numpy > Error: No available formula for numpy > > > Some of these unexpected libraries seem to be from my earlier install of gfortran and mpi (nothing to do with homebrew). > By the way, there seem to be many versions of python (apple, dmg, homebrew) on my macbook. > Thanks again for your help!! > > Tom > > ________________________________ > > Date: Wed, 4 Jul 2012 18:44:46 +0200 > > From: ralf.gommers at googlemail.com > > To: numpy-discussion at scipy.org > > Subject: Re: [Numpy-discussion] numpy and readline installation fails > > > > > > > > On Wed, Jul 4, 2012 at 6:24 PM, Aron Ahmadia > > > wrote: > > How do the .dmg files work? Are these binary installers into the > > system Python? I would expect that these wouldn't work with a manually > > installed Python from python.org, but I have no > > experience with them. > > > > They're binary installers for the python.org Python. > > So they install into /Library/Frameworks/Python.framework/. They don't > > work with the system (Apple) Python; I think leaving that alone and > > installing from python.org is in general good > > advice. > > > > > > Tom, I have had very good luck with the brew Python 2.7 installer, > > which will give you a /usr/local Python base install and easy_install. > > Brew can also install readline correctly for you. > > > > https://github.com/mxcl/homebrew/wiki/Homebrew-and-Python > > > > Homebrew appears to be better than MacPorts/Fink, but installing > > everything from source is not the first advice I'd give to a new user. > > > > Ralf > > > > If you want an all-in-one solution, you can also grab either the free > > EPD distribution or another of the "all-in-one" packs. > > > > A > > > > > > On Wed, Jul 4, 2012 at 7:03 PM, Paul Anton Letnes > > > > > wrote: > > Hello, > > > > I don't know exactly what went wrong. I'd start out my debugging by > > 1) which python # see whether you're running apple's python in > > /usr/bin/python, or the one you tried to install > > 2) which easy_install # did you run Apple-python's easy_install, or the > > one you (tried to) installed? > > 3) If all of the above match, try running python (not ipython) and try > > to import numpy. Apple's python ships with numpy (at least the Lion / > > 10.7 one does). > > 4) Next, print numpy.__file__ to see whether numpy got installed to > > where it should > > > > In general, I'd advice you to install one package at a time, then test > > it to see whether it has been installed properly. When you're confident > > everything is OK, move on to the next package. For instance, test numpy > > by > > $ python -c 'import numpy; numpy.test()' > > and scipy with > > $ python -c 'import scipy;scipy.test()' > > (for instance). > > > > When you're sure the fundament (python, numpy) is in order, proceed > > with the house (scipy, matplotlib). > > > > Cheers > > Paul > > > > > > On 4. juli 2012, at 16:16, abc def wrote: > > > > > > > > Hello, > > > > > > I'm new to python and I'd like to learn about numpy / scipy / > > matplotlib, but I'm having trouble getting started. 
> > > > > > I'm following the instructions here: http://www.scipy.org/Getting_Started > > > > > > First I installed the latest version of python from > > python.org by downloading the dmg file, since I read > > that it doesn't work with apple's installer, and then installed numpy / > > scipy / matplotlib by downloading the relevent dmg files. > > > I then downloaded ipython, ran "easy_install readline" and then ran > > "python setup.py install". > > > > > > Then I started ipython with "ipython -pylab" as per the instructions > > but then I get muliple error messages: > > > > > > > > > > > > > > > > > > $ ipython --pylab > > > > > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/utils/rlineimpl.py:111: > > RuntimeWarning: > > > > > ****************************************************************************** > > > libedit detected - readline will not be well behaved, including but > > not limited to: > > > * crashes on tab completion > > > * incorrect history navigation > > > * corrupting long-lines > > > * failure to wrap or indent lines properly > > > It is highly recommended that you install readline, which is > > easy_installable: > > > easy_install readline > > > Note that `pip install readline` generally DOES NOT WORK, because > > > it installs to site-packages, which come *after* lib-dynload in sys.path, > > > where readline is located. It must be `easy_install readline`, or to > > a custom > > > location on your PYTHONPATH (even --user comes after lib-dyload). > > > > > ****************************************************************************** > > > RuntimeWarning) > > > Python 2.7.3 (v2.7.3:70274d53c1dd, Apr 9 2012, 20:52:43) > > > Type "copyright", "credits" or "license" for more information. > > > > > > IPython 0.13 -- An enhanced Interactive Python. > > > ? -> Introduction and overview of IPython's features. > > > %quickref -> Quick reference. > > > help -> Python's own help system. > > > object? -> Details about 'object', use 'object??' for extra details. 
> > > [TerminalIPythonApp] GUI event loop or pylab initialization failed > > > --------------------------------------------------------------------------- > > > ImportError Traceback (most recent call last) > > > > > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/core/pylabtools.pyc > > in find_gui_and_backend(gui) > > > 194 """ > > > 195 > > > --> 196 import matplotlib > > > 197 > > > 198 if gui and gui != 'auto': > > > > > > > > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/__init__.py > > in () > > > 131 import sys, os, tempfile > > > 132 > > > --> 133 from matplotlib.rcsetup import (defaultParams, > > > 134 validate_backend, > > > 135 validate_toolbar, > > > > > > > > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/rcsetup.py > > in () > > > 17 import warnings > > > 18 from matplotlib.fontconfig_pattern import parse_fontconfig_pattern > > > ---> 19 from matplotlib.colors import is_color_like > > > 20 > > > 21 #interactive_bk = ['gtk', 'gtkagg', 'gtkcairo', 'fltkagg', > > 'qtagg', 'qt4agg', > > > > > > > > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/colors.py > > in () > > > 50 """ > > > 51 import re > > > ---> 52 import numpy as np > > > 53 from numpy import ma > > > 54 import matplotlib.cbook as cbook > > > > > > ImportError: No module named numpy > > > > > > In [1]: > > > > > > > > > > > > > > > it seems the installation of numpy and readline didn't work, and > > there are problems with matplotlib, even though I think I followed all > > the instructions carefully. > > > I can't figure out what I did wrong. Can anybody help? > > > > > > I'm running mac os 10.6. > > > > > > Thank you! > > > > > > Tom > > > > > > > > > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ NumPy-Discussion > > mailing list NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From ralf.gommers at googlemail.com Fri Jul 6 11:20:43 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 6 Jul 2012 17:20:43 +0200 Subject: [Numpy-discussion] Preferring gfortran over g77 on OS X and other distributions? In-Reply-To: References: Message-ID: On Fri, Jul 6, 2012 at 11:15 AM, Aron Ahmadia wrote: > Aron made this change for {mac, linux, posix, sun, irix, aix} in his PR. I >> think it's a good idea for {mac, linux} and perhaps posix, and to leave the >> other platforms alone. Does anyone else have an opinion on this? >> > > Ralf, that sounds good to me. If you feel that we should leave sun, irix, > and aix alone, we should probably leave out posix as well. > > If everyone concurs, I'll issue a new PR adjusting mac and linux. 
> You can just make the change in the existing PR. Either add a new commit, or amend the existing one and force push. Multiple PRs for the same issue makes things harder to follow. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From hhchen at psu.edu Fri Jul 6 11:28:42 2012 From: hhchen at psu.edu (Hung-Hsuan Chen) Date: Fri, 6 Jul 2012 11:28:42 -0400 Subject: [Numpy-discussion] Numpy Installation Problem on Redhat Linux In-Reply-To: References: <81872A39-786E-49CC-9177-1EC413FB045B@gmail.com> Message-ID: Thank you. This is very helpful! On Fri, Jul 6, 2012 at 6:23 AM, Aron Ahmadia wrote: > I usually find these problems by searching for "error" in the output, in > your case the complete problem is at the bottom of the log. The relocation > errors you're seeing are happening because the build process is trying to > link in Atlas libraries (located here: /home/hxc249/lib/atlas/lib/) that > were not compiled with -fPIC . Are you building ATLAS from source? If so, > then you follow the instructions to recompile ATLAS with -fPIC enabled here: > http://math-atlas.sourceforge.net/atlas_install/atlas_install.html#SECTION00043000000000000000 > > > creating build/temp.linux-x86_64-2.6/numpy/core/blasdot > compile options: '-DATLAS_INFO="\"3.9.83\"" -Inumpy/core/blasdot > -I/home/hxc249/lib/atlas/include -Inumpy/core/include > -Ibuild/src.linux-x86_64-2.6/numpy/core/include/numpy > -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core > -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath > -Inumpy/core/include -I/usr/include/python2.6 > -Ibuild/src.linux-x86_64-2.6/numpy/core/src/multiarray > -Ibuild/src.linux-x86_64-2.6/numpy/core/src/umath -c' > gcc: numpy/core/blasdot/_dotblas.c > numpy/core/blasdot/_dotblas.c: In function ?dotblas_matrixproduct?: > numpy/core/blasdot/_dotblas.c:239: warning: comparison of distinct pointer > types lacks a cast > numpy/core/blasdot/_dotblas.c:257: warning: passing argument 3 of ?(struct > PyObject * (*)(struct PyObject *, struct PyObject *, struct PyArrayObject > *))*(PyArray_API + 2240u)? from incompatible pointer type > numpy/core/blasdot/_dotblas.c:257: note: expected ?struct PyArrayObject *? > but argument is of type ?struct PyObject *? > numpy/core/blasdot/_dotblas.c:292: warning: passing argument 3 of ?(struct > PyObject * (*)(struct PyObject *, struct PyObject *, struct PyArrayObject > *))*(PyArray_API + 2240u)? from incompatible pointer type > numpy/core/blasdot/_dotblas.c:292: note: expected ?struct PyArrayObject *? > but argument is of type ?struct PyObject *? 
> gcc -pthread -shared > build/temp.linux-x86_64-2.6/numpy/core/blasdot/_dotblas.o > -L/home/hxc249/lib/atlas/lib -L/usr/lib64 -Lbuild/temp.linux-x86_64-2.6 > -lptf77blas -lptcblas -latlas -lpython2.6 -o > build/lib.linux-x86_64-2.6/numpy/core/_dotblas.so > /usr/bin/ld: /home/hxc249/lib/atlas/lib/libptcblas.a(cblas_dptgemm.o): > relocation R_X86_64_32 against `.rodata.str1.8' can not be used when making > a shared object; recompile with -fPIC > /home/hxc249/lib/atlas/lib/libptcblas.a: could not read symbols: Bad value > collect2: ld returned 1 exit status > /usr/bin/ld: /home/hxc249/lib/atlas/lib/libptcblas.a(cblas_dptgemm.o): > relocation R_X86_64_32 against `.rodata.str1.8' can not be used when making > a shared object; recompile with -fPIC > /home/hxc249/lib/atlas/lib/libptcblas.a: could not read symbols: Bad value > collect2: ld returned 1 exit status > error: Command "gcc -pthread -shared > build/temp.linux-x86_64-2.6/numpy/core/blasdot/_dotblas.o > -L/home/hxc249/lib/atlas/lib -L/usr/lib64 -Lbuild/temp.linux-x86_64-2.6 > -lptf77blas -lptcblas -latlas -lpython2.6 -o > build/lib.linux-x86_64-2.6/numpy/core/_dotblas.so" failed with exit status 1 > > > On Fri, Jul 6, 2012 at 6:00 AM, Hung-Hsuan Chen wrote: >> >> Link is a great suggestion! I was hesitating about whether or not to >> paste such a long output. >> >> The site.cfg file is shown in the following link. >> https://gist.github.com/3059209 >> >> The output message for >> $ python setup.py build --fcompiler=gnu95 >> can be found at the URL. >> https://gist.github.com/3059320 >> >> Any suggestion is appreciated. >> >> On Fri, Jul 6, 2012 at 5:21 AM, Paul Anton Letnes >> wrote: >> >> >> >> However, I got the following error message: >> >> error: Command "/usr/bin/g77 -g -Wall -g -Wall -shared >> >> >> >> build/temp.linux-x86_64-2.6/build/src.linux-x86_64-2.6/scipy/integrate/vodemodule.o >> >> build/temp.linux-x86_64-2.6/build/src.linux-x86_64-2.6/fortranobject.o >> >> -L/home/username/lib/ -L/usr/lib64 -Lbuild/temp.linux-x86_64-2.6 >> >> -lodepack -llinpack_lite -lmach -lblas -lpython2.6 -lg2c -o >> >> build/lib.linux-x86_64-2.6/scipy/integrate/vode.so" failed with exit >> >> status >> > >> > I'm sure there must have been more output? It does say that the command >> > "failed", but not _why_ it failed. I suggest posting the entire output >> > either in an email, or on a webpage (gist.github.com, for instance) and >> > giving the link. It's very very hard to debug a build without the build log, >> > so I'd suggest always giving it in the first instance. >> > >> > Paul >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From jsalvati at u.washington.edu Fri Jul 6 11:37:48 2012 From: jsalvati at u.washington.edu (John Salvatier) Date: Fri, 6 Jul 2012 08:37:48 -0700 Subject: [Numpy-discussion] Would a patch with a function for incrementing an array with advanced indexing be accepted? In-Reply-To: References: Message-ID: Okay, done ( https://github.com/jsalvatier/numpy/commit/7d03753e6305dbc878ed7df3e21e9b099eae32ed ). 
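For anyone joining the thread here, a minimal sketch of the behaviour being
discussed (plain NumPy as of the 1.6 era; the np.bincount workaround below is
only an illustration for the 1-D integer-index case, not the function added in
the commit linked above):

import numpy as np

x = np.zeros(5)
idx = [0, 0, 2]

# Fancy-index assignment is buffered, so the duplicated index 0 is
# only incremented once:
x[idx] += 1
print(x)    # [ 1.  0.  1.  0.  0.]

# Workaround for 1-D integer indices: accumulate with np.bincount.
y = np.zeros(5)
y += np.bincount(idx, minlength=5)
print(y)    # [ 2.  0.  1.  0.  0.]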
On Tue, Jul 3, 2012 at 11:41 AM, Fr?d?ric Bastien wrote: > Hi, > > Here is code example that work only with different index: > > import numpy > x=numpy.zeros((5,5)) > x[[0,2,4]]+=numpy.random.rand(3,5) > print x > > This won't work if in the list [0,2,4], there is index duplication, > but with your new code, it will. I think it is the most used case of > advanced indexing. At least, for our lab:) > > Fred > > On Mon, Jul 2, 2012 at 7:48 PM, John Salvatier > wrote: > > Hi Fred, > > > > That's an excellent idea, but I am not too familiar with this use case. > What > > do you mean by list in 'matrix[list]'? Is the use case, just > incrementing > > in place a sub matrix of a numpy matrix? > > > > John > > > > > > On Fri, Jun 29, 2012 at 11:43 AM, Fr?d?ric Bastien > wrote: > >> > >> Hi, > >> > >> I personnaly can't review this as this is too much in NumPy internal. > >> > >> My only comments is that you could add a test and an example in the > >> doc for matrix[list]. I think it will be the most used case. > >> > >> Fred > >> > >> On Wed, Jun 27, 2012 at 7:47 PM, John Salvatier > >> wrote: > >> > I've submitted a pull request ( > https://github.com/numpy/numpy/pull/326 > >> > ). > >> > I'm new to the numpy and python internals, so feedback is greatly > >> > appreciated. > >> > > >> > > >> > On Tue, Jun 26, 2012 at 12:10 PM, Travis Oliphant < > travis at continuum.io> > >> > wrote: > >> >> > >> >> > >> >> On Jun 26, 2012, at 1:34 PM, Fr?d?ric Bastien wrote: > >> >> > >> >> > Hi, > >> >> > > >> >> > I think he was referring that making NUMPY_ARRAY_OBJECT[...] syntax > >> >> > support the operation that you said is hard. But having a separate > >> >> > function do it is less complicated as you said. > >> >> > >> >> Yes. That's precisely what I meant. Thank you for clarifying. > >> >> > >> >> -Travis > >> >> > >> >> > > >> >> > Fred > >> >> > > >> >> > On Tue, Jun 26, 2012 at 1:27 PM, John Salvatier > >> >> > wrote: > >> >> >> Can you clarify why it would be super hard? I just reused the code > >> >> >> for > >> >> >> advanced indexing (a modification of PyArray_SetMap). Am I missing > >> >> >> something > >> >> >> crucial? > >> >> >> > >> >> >> > >> >> >> > >> >> >> On Tue, Jun 26, 2012 at 9:57 AM, Travis Oliphant > >> >> >> > >> >> >> wrote: > >> >> >>> > >> >> >>> > >> >> >>> On Jun 26, 2012, at 11:46 AM, John Salvatier wrote: > >> >> >>> > >> >> >>> Hello, > >> >> >>> > >> >> >>> If you increment an array using advanced indexing and have > repeated > >> >> >>> indexes, the array doesn't get repeatedly > >> >> >>> incremented, > >> >> >>> > http://comments.gmane.org/gmane.comp.python.numeric.general/50291. > >> >> >>> I wrote a C function that does incrementing with repeated indexes > >> >> >>> correctly. > >> >> >>> The branch is here (https://github.com/jsalvatier/numpy see the > >> >> >>> last > >> >> >>> two > >> >> >>> commits). Would a patch with a cleaned up version of a function > >> >> >>> like > >> >> >>> this be > >> >> >>> accepted into numpy? I'm not experienced writing numpy C code so > >> >> >>> I'm > >> >> >>> sure it > >> >> >>> still needs improvement. > >> >> >>> > >> >> >>> > >> >> >>> This is great. It is an often-requested feature. It's *very > >> >> >>> difficult* > >> >> >>> to do without changing fundamentally what NumPy is. But, yes > this > >> >> >>> would be > >> >> >>> a great pull request. 
> >> >> >>> > >> >> >>> Thanks, > >> >> >>> > >> >> >>> -Travis > >> >> >>> > >> >> >>> > >> >> >>> > >> >> >>> _______________________________________________ > >> >> >>> NumPy-Discussion mailing list > >> >> >>> NumPy-Discussion at scipy.org > >> >> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> >> >>> > >> >> >> > >> >> >> > >> >> >> _______________________________________________ > >> >> >> NumPy-Discussion mailing list > >> >> >> NumPy-Discussion at scipy.org > >> >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> >> >> > >> >> > _______________________________________________ > >> >> > NumPy-Discussion mailing list > >> >> > NumPy-Discussion at scipy.org > >> >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> >> > >> >> _______________________________________________ > >> >> NumPy-Discussion mailing list > >> >> NumPy-Discussion at scipy.org > >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > >> > > >> > > >> > _______________________________________________ > >> > NumPy-Discussion mailing list > >> > NumPy-Discussion at scipy.org > >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tmp50 at ukr.net Sat Jul 7 05:33:03 2012 From: tmp50 at ukr.net (Dmitrey) Date: Sat, 07 Jul 2012 12:33:03 +0300 Subject: [Numpy-discussion] [ANN] Stochastic programming and optimization addon for FuncDesigner Message-ID: <38680.1341653583.12878990481238851584@ffe15.ukr.net> hi all, you may be interested in stochastic programming and optimization with free Python module FuncDesigner. We have wrote Stochastic addon for FuncDesigner, but (at least for several years) it will be commercional (currently it's free for some small-scaled problems only and for noncommercial research / educational purposes only). However, we will try to keep our prices several times less than our competitors have. Also, we will provide some discounts, including region-based ones, and first 15 customers will also got a discount. For further information, documentation and some examples etc read more at http://openopt.org/StochasticProgramming Regards, D. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sat Jul 7 10:54:25 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 7 Jul 2012 16:54:25 +0200 Subject: [Numpy-discussion] frexp/ldexp docstrings Message-ID: Hi, The ldexp/frexp docstrings, now in add_newdocs.py, aren't picked up. It looks like they're in the wrong place, but moving them to code_generators/ufunc_docstrings.py and editing generate_umath.py also doesn't work. They seem to be a special case, see umathmodule.c. What should be done here, add the docstrings directly in umathmodule.c or make the functions less of a special case? 
Only after investigating I found that I'd already filed a bug for this:
http://projects.scipy.org/numpy/ticket/1759.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...

From stefan at sun.ac.za  Sat Jul  7 17:01:32 2012
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Sat, 7 Jul 2012 14:01:32 -0700
Subject: [Numpy-discussion] Buildbot status
In-Reply-To:
References: <9C4E5D4B-7E6D-4640-A5E1-C4CF162B60E0@continuum.io>
Message-ID:

On Thu, Jul 5, 2012 at 4:36 PM, Ondřej Čertík wrote:
> So feel free to go ahead with what you think is the best and I will join you
> in a few days.

I propose that we follow a simple migration path for now: move the
current buildbot onto the EC2 instance, redirect buildbot.scipy.org,
and then connect the nipy build slaves. This should require minimal
effort, but provide us with fairly wide coverage until you can invest
more time in Jenkins etc.

Stéfan

From ondrej.certik at gmail.com  Sat Jul  7 17:14:58 2012
From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=)
Date: Sat, 7 Jul 2012 14:14:58 -0700
Subject: [Numpy-discussion] Buildbot status
In-Reply-To:
References: <9C4E5D4B-7E6D-4640-A5E1-C4CF162B60E0@continuum.io>
Message-ID:

On Sat, Jul 7, 2012 at 2:01 PM, Stéfan van der Walt wrote:
> On Thu, Jul 5, 2012 at 4:36 PM, Ondřej Čertík wrote:
>> So feel free to go ahead with what you think is the best and I will join you
>> in a few days.
>
> I propose that we follow a simple migration path for now: move the
> current buildbot onto the EC2 instance, redirect buildbot.scipy.org,
> and then connect the nipy build slaves. This should require minimal
> effort, but provide us with fairly wide coverage until you can invest
> more time in Jenkins etc.

I agree.

Ondrej

From charlesr.harris at gmail.com  Sat Jul  7 21:22:23 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 7 Jul 2012 19:22:23 -0600
Subject: [Numpy-discussion] Debug print while running tests
Message-ID:

There are a bunch of these:

--------------------------------------
Dump of NumPy ndarray at address 0x5598430
 ndim   : 0
 shape  :
 dtype  : dtype('int64')
 data   : 0x5595c60
 strides:
 base   : (nil)
 flags  : NPY_C_CONTIGUOUS NPY_F_CONTIGUOUS NPY_ALIGNED
-------------------------------------------------------

Are these needed, or did someone forget to comment out some debug statements?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...

From charlesr.harris at gmail.com  Sat Jul  7 22:23:14 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 7 Jul 2012 20:23:14 -0600
Subject: [Numpy-discussion] Bento build fails
Message-ID:

/usr/bin/ld: numpy/core/libnpymath.a(halffloat.c.12.o): relocation
R_X86_64_PC32 against symbol `npy_set_floatstatus_invalid' can not be used
when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From scott.sinclair.za at gmail.com  Sun Jul  8 03:22:21 2012
From: scott.sinclair.za at gmail.com (Scott Sinclair)
Date: Sun, 8 Jul 2012 09:22:21 +0200
Subject: [Numpy-discussion] "import numpy" performance
In-Reply-To: <2C456B75-0CC0-4DDE-9943-3C96EAD7ECF8@dalkescientific.com>
References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com>
	<7219F010-A3FC-4E89-9834-5C29EE79198F@dalkescientific.com>
	<2C456B75-0CC0-4DDE-9943-3C96EAD7ECF8@dalkescientific.com>
Message-ID:

On 6 July 2012 15:48, Andrew Dalke wrote:
> I followed the instructions at
> http://docs.scipy.org/doc/numpy/dev/gitwash/patching.html
> and added Ticket #2181 (with patch) at
> http://projects.scipy.org/numpy/ticket/2181

Those instructions need to be updated to reflect the current preferred
practice. You'll make code review easier and increase the chances of
getting your patch accepted by submitting the patch as a Github pull
request instead (see
http://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html
for a how-to). It's not very much extra work.

Cheers,
Scott

From ralf.gommers at googlemail.com  Sun Jul  8 05:36:51 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sun, 8 Jul 2012 11:36:51 +0200
Subject: [Numpy-discussion] Numpy test failure - How to fix
In-Reply-To: <8A8D24241B862C41B5D5EADB47CE39E205728546E6@NIHMLBX01.nih.gov>
References: <8A8D24241B862C41B5D5EADB47CE39E205728546E6@NIHMLBX01.nih.gov>
Message-ID:

On Fri, Jul 6, 2012 at 1:30 PM, McCully, Dwayne (NIH/NLM/LHC) [C] <
dmccully at mail.nih.gov> wrote:

> Hope this is the right list to post this problem! I'm getting two errors
> when running a numpy (see below).
>
> Could someone tell me how to fix this or if the errors are not a concern.
>
> Dwayne
>
> python -c 'import numpy; numpy.test(verbose=2)'
>
> Python 2.7.3
> Numpy 1.6.2
> Nose 1.1.2
> PowerPC
> Red Hat Linux 64 bit
>
These tests are known to fail on PowerPC, see
http://projects.scipy.org/numpy/ticket/1664
https://github.com/numpy/numpy/commit/1b99089

The question is why the above commit is not effective on your system. Could
you check that?
For example, is this not true:

import platform
"powerpc" in platform.processor()

Ralf

> ======================================================================
> FAIL: test_umath.test_nextafterl
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/usr/local/lib/python2.7/site-packages/nose/case.py", line 197, in
> runTest
>     self.test(*self.arg)
>   File
> "/usr/local/lib/python2.7/site-packages/numpy/testing/decorators.py", line
> 215, in knownfailer
>     return f(*args, **kwargs)
>   File
> "/usr/local/lib/python2.7/site-packages/numpy/core/tests/test_umath.py",
> line 1119, in test_nextafterl
>     return _test_nextafter(np.longdouble)
>   File
> "/usr/local/lib/python2.7/site-packages/numpy/core/tests/test_umath.py",
> line 1103, in _test_nextafter
>     assert np.nextafter(one, two) - one == eps
> AssertionError
>
> ======================================================================
> FAIL: test_umath.test_spacingl
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/usr/local/lib/python2.7/site-packages/nose/case.py", line 197, in
> runTest
>     self.test(*self.arg)
>   File
> "/usr/local/lib/python2.7/site-packages/numpy/testing/decorators.py", line
> 215, in knownfailer
>     return f(*args, **kwargs)
>   File
> "/usr/local/lib/python2.7/site-packages/numpy/core/tests/test_umath.py",
> line 1146, in test_spacingl
>     return _test_spacing(np.longdouble)
>   File
> "/usr/local/lib/python2.7/site-packages/numpy/core/tests/test_umath.py",
> line 1128, in _test_spacing
>     assert np.spacing(one) == eps
> AssertionError
>
> ----------------------------------------------------------------------
> Ran 3576 tests in 22.974s
>
> FAILED (KNOWNFAIL=6, failures=2)
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From matthew.brett at gmail.com  Sun Jul  8 05:47:11 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Sun, 8 Jul 2012 02:47:11 -0700
Subject: [Numpy-discussion] Numpy test failure - How to fix
In-Reply-To:
References: <8A8D24241B862C41B5D5EADB47CE39E205728546E6@NIHMLBX01.nih.gov>
Message-ID:

Hi,

On Sun, Jul 8, 2012 at 2:36 AM, Ralf Gommers wrote:
>
> On Fri, Jul 6, 2012 at 1:30 PM, McCully, Dwayne (NIH/NLM/LHC) [C]
> wrote:
>>
>> Hope this is the right list to post this problem! I'm getting two errors
>> when running a numpy (see below).
>>
>> Could someone tell me how to fix this or if the errors are not a concern.
>>
>> Dwayne
>>
>> python -c 'import numpy; numpy.test(verbose=2)'
>>
>> Python 2.7.3
>> Numpy 1.6.2
>> Nose 1.1.2
>> PowerPC
>> Red Hat Linux 64 bit
>
> These tests are known to fail on PowerPC, see
> http://projects.scipy.org/numpy/ticket/1664
> https://github.com/numpy/numpy/commit/1b99089
>
> The question is why the above commit is not effective on your system. Could
> you check that?
For example, is this not true: > > import platform > "powerpc" in platform.processor() I get this on OSX PPC: In [1]: import platform In [2]: platform.processor() Out[2]: 'powerpc' In [3]: platform.machine() Out[3]: 'Power Macintosh' and this on Debian Wheezy PPC: In [1]: import platform In [2]: platform.processor() Out[2]: '' In [3]: platform.machine() Out[3]: 'ppc' In my own code I ended up making a one-line function, 'on_powerpc': https://github.com/nipy/nibabel/blob/master/nibabel/casting.py#L171 def on_powerpc(): return processor() == 'powerpc' or machine().startswith('ppc') See you, Matthew From s0454615 at sms.ed.ac.uk Sun Jul 8 13:44:11 2012 From: s0454615 at sms.ed.ac.uk (Chris Ball) Date: Sun, 8 Jul 2012 18:44:11 +0100 Subject: [Numpy-discussion] Option parsing: tox and test-installed-numpy.py Message-ID: Hi, When calling tools/test-installed-numpy.py (https://github.com/numpy/numpy/blob/master/tools/test-installed-numpy.py), I can pass options to nose by supplying those options after "--", eg: $ python tools/test-installed-numpy.py -- --with-xunit (which passes "--with-xunit" to nose). NumPy's tox.ini (https://github.com/numpy/numpy/blob/master/tox.ini) uses tools/test-installed-numpy.py to run the tests. To pass options to test-installed-numpy.py when calling tox, I can pass the options after "--", eg: $ tox -- -v (which passes "-v" to test-installed-numpy.py). However, what I want to do is supply an option to tox that gets all the way through to nose! Is there a way I can do that, or do I need to edit tools/test-installed-numpy.py to have an option corresponding to nose's option? I hope that makes sense to someone! Thanks, Chris From njs at pobox.com Sun Jul 8 13:52:28 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 8 Jul 2012 18:52:28 +0100 Subject: [Numpy-discussion] Option parsing: tox and test-installed-numpy.py In-Reply-To: References: Message-ID: On Sun, Jul 8, 2012 at 6:44 PM, Chris Ball wrote: > Hi, > > When calling tools/test-installed-numpy.py > (https://github.com/numpy/numpy/blob/master/tools/test-installed-numpy.py), > I can pass options to nose by supplying those options after "--", eg: > $ python tools/test-installed-numpy.py -- --with-xunit > (which passes "--with-xunit" to nose). > > NumPy's tox.ini (https://github.com/numpy/numpy/blob/master/tox.ini) > uses tools/test-installed-numpy.py to run the tests. To pass options > to test-installed-numpy.py when calling tox, I can pass the options > after "--", eg: > $ tox -- -v > (which passes "-v" to test-installed-numpy.py). > > However, what I want to do is supply an option to tox that gets all > the way through to nose! Is there a way I can do that, or do I need to > edit tools/test-installed-numpy.py to have an option corresponding to > nose's option? tox -- -- --with-xunit ? -n From s0454615 at sms.ed.ac.uk Sun Jul 8 14:30:55 2012 From: s0454615 at sms.ed.ac.uk (Chris Ball) Date: Sun, 8 Jul 2012 19:30:55 +0100 Subject: [Numpy-discussion] Option parsing: tox and test-installed-numpy.py In-Reply-To: References: Message-ID: On Sun, Jul 8, 2012 at 6:52 PM, Nathaniel Smith wrote: > On Sun, Jul 8, 2012 at 6:44 PM, Chris Ball wrote: >> Hi, >> >> When calling tools/test-installed-numpy.py >> (https://github.com/numpy/numpy/blob/master/tools/test-installed-numpy.py), >> I can pass options to nose by supplying those options after "--", eg: >> $ python tools/test-installed-numpy.py -- --with-xunit >> (which passes "--with-xunit" to nose). 
>> >> NumPy's tox.ini (https://github.com/numpy/numpy/blob/master/tox.ini) >> uses tools/test-installed-numpy.py to run the tests. To pass options >> to test-installed-numpy.py when calling tox, I can pass the options >> after "--", eg: >> $ tox -- -v >> (which passes "-v" to test-installed-numpy.py). >> >> However, what I want to do is supply an option to tox that gets all >> the way through to nose! Is there a way I can do that, or do I need to >> edit tools/test-installed-numpy.py to have an option corresponding to >> nose's option? > > tox -- -- --with-xunit ? Thanks - I'd tried that (was just guessing), but it didn't work: $ tox -e py26 -- -- --with-xunit ... [TOX] numpy/.tox/py26$ bin/python numpy/tools/test-installed-numpy.py --with-xunit Usage: test-installed-numpy.py [options] -- [nosetests options] test-installed-numpy.py: error: no such option: --with-xunit [TOX] ERROR: InvocationError: 'bin/python numpy/tools/test-installed-numpy.py --with-xunit' If the double "--" is standard (is it?), maybe it's a problem with tox? Chris From njs at pobox.com Sun Jul 8 14:38:03 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 8 Jul 2012 19:38:03 +0100 Subject: [Numpy-discussion] Option parsing: tox and test-installed-numpy.py In-Reply-To: References: Message-ID: On Sun, Jul 8, 2012 at 7:30 PM, Chris Ball wrote: > On Sun, Jul 8, 2012 at 6:52 PM, Nathaniel Smith wrote: >> On Sun, Jul 8, 2012 at 6:44 PM, Chris Ball wrote: >>> Hi, >>> >>> When calling tools/test-installed-numpy.py >>> (https://github.com/numpy/numpy/blob/master/tools/test-installed-numpy.py), >>> I can pass options to nose by supplying those options after "--", eg: >>> $ python tools/test-installed-numpy.py -- --with-xunit >>> (which passes "--with-xunit" to nose). >>> >>> NumPy's tox.ini (https://github.com/numpy/numpy/blob/master/tox.ini) >>> uses tools/test-installed-numpy.py to run the tests. To pass options >>> to test-installed-numpy.py when calling tox, I can pass the options >>> after "--", eg: >>> $ tox -- -v >>> (which passes "-v" to test-installed-numpy.py). >>> >>> However, what I want to do is supply an option to tox that gets all >>> the way through to nose! Is there a way I can do that, or do I need to >>> edit tools/test-installed-numpy.py to have an option corresponding to >>> nose's option? >> >> tox -- -- --with-xunit ? > > Thanks - I'd tried that (was just guessing), but it didn't work: > > $ tox -e py26 -- -- --with-xunit > ... > [TOX] numpy/.tox/py26$ bin/python numpy/tools/test-installed-numpy.py > --with-xunit > Usage: test-installed-numpy.py [options] -- [nosetests options] > > test-installed-numpy.py: error: no such option: --with-xunit > [TOX] ERROR: InvocationError: 'bin/python > numpy/tools/test-installed-numpy.py --with-xunit' > > If the double "--" is standard (is it?), maybe it's a problem with tox? The double "--" isn't a standard, it's just what you get when you compose the two rules you quoted in your original email :-). "--" is supposed to mean, "stop doing option processing and treat everything after this as positional arguments". So tox really should see the first "--" and then treat the second as a positional argument. Instead it is throwing the second one away. That is definitely a bug in tox or whatever option parsing library it's using. -N From s0454615 at sms.ed.ac.uk Sun Jul 8 18:44:18 2012 From: s0454615 at sms.ed.ac.uk (Chris Ball) Date: Sun, 8 Jul 2012 23:44:18 +0100 Subject: [Numpy-discussion] Buildbot status Message-ID: St?fan van der Walt sun.ac.za> writes: ... 
> I'd like to find out what the current status of continuous integration > is for numpy. I'm aware of: > > a) http://buildbot.scipy.org -- used by Ralf for testing releases? > b) http://travis-ci.org -- connected via GitHub > c) http://184.73.247.160:8111 -- dedicated Amazon EC2 with TeamCity > d) http://build.pydata.org:8111/ -- dedicated Rackspace instance with TeamCity > e) https://jenkins.shiningpanda.com/numpy/ -- python 2.4 on Debian There's also: f) https://jenkins.shiningpanda.com/scipy/ -- python 2.4, 2.5, 2.6, 2.7 on Debian 6 Could easily add: Windows 7 slave, and various other versions of python (3, pypy, etc.). Could also produce graphical test and coverage reports. g) buildbot on the EC2 Currently only has one old, temporary linux test slave. Should work with any available platform and python versions, but - in contrast to shiningpanda - each new addition is a machine (or VM) that someone has to volunteer and look after. Volunteer slaves would require tox and virtualenv (in addition to numpy's requirements). > I propose that we following a simple migration path for now: move the > current buildbot onto the EC2 instance, redirect buildbot.scipy.org, > and then connect the nipy build slaves. This should require minimal > effort, but provide us with fairly wide coverage until you can invest > more time in Jenkins etc. I'm happy to help connect slaves to the EC2 buildbot if someone sends connection details to me. Another thing to consider is where to send test results. Travis-CI already comments on pull requests, which is great; the other solutions could be set up to email a list (or even the individuals potentially responsible for causing test failures). Chris From stefan at sun.ac.za Mon Jul 9 16:45:29 2012 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 9 Jul 2012 13:45:29 -0700 Subject: [Numpy-discussion] Buildbot status In-Reply-To: References: Message-ID: On Sun, Jul 8, 2012 at 3:44 PM, Chris Ball wrote: > I'm happy to help connect slaves to the EC2 buildbot if someone sends > connection details to me. Thanks very much, Chris. Matthew gave us access to the NiPy build slaves, so catch me online then we can hook those up. > Another thing to consider is where to send test results. Travis-CI > already comments on pull requests, which is great; the other solutions > could be set up to email a list (or even the individuals potentially > responsible for causing test failures). If we notify the individuals, for now, that's fine. Others can keep track by looking at the buildbot homepage. St?fan From silberman.six at gmail.com Mon Jul 9 22:20:49 2012 From: silberman.six at gmail.com (Six Silberman) Date: Mon, 9 Jul 2012 22:20:49 -0400 Subject: [Numpy-discussion] Looking for the most important bugs, documentation needs, etc. Message-ID: Hi all, Some colleagues and I are interested in contributing to numpy. We have a range of backgrounds -- I for example am new to contributing to open source software but have a (small) bit of background in scientific computation, while others have extensive experience contributing to open source projects. We've looked at the issue tracker and submitted a couple patches today but we would be interested to hear what active contributors to the project consider the most pressing, important, and/or interesting needs at the moment. I personally am quite interested in hearing about the most pressing documentation needs (including example code). 
Thanks very much, six From tang.yan at gmail.com Mon Jul 9 22:24:00 2012 From: tang.yan at gmail.com (Yan Tang) Date: Mon, 9 Jul 2012 22:24:00 -0400 Subject: [Numpy-discussion] Convert recarray to list (is this a bug?) Message-ID: Hi, I noticed there is an odd issue when I am trying to convert a recarray to list. See below for the example/test case. $ cat a.csv date,count 2011-07-25,91 2011-07-26,118 $ cat b.csv name,count foo,1233 bar,100 $ python >>> from matplotlib import mlab >>> import numpy as np >>> a = mlab.csv2rec('a.csv') >>> b = mlab.csv2rec('b.csv') >>> a rec.array([(datetime.date(2011, 7, 25), 91), (datetime.date(2011, 7, 26), 118)], dtype=[('date', '|O8'), ('count', '>> b rec.array([('foo', 1233), ('bar', 100)], dtype=[('name', '|S3'), ('count', '>> np.array(a.tolist()).tolist() [[datetime.date(2011, 7, 25), 91], [datetime.date(2011, 7, 26), 118]] >>> np.array(b.tolist()).tolist() [['foo', '1233'], ['bar', '100']] The odd case is, 1233 becomes a string '1233' in the second command. But 91 is still a number 91. Why would this happen? What's the correct way to do this conversion? Thanks. -uris- -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Jul 9 23:32:35 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 9 Jul 2012 21:32:35 -0600 Subject: [Numpy-discussion] Type specific sorts: objects, structured arrays, and all that. Message-ID: Hi All, I've been adding type specific sorts for object and structured arrays. It seems that datetime64 and timedelta64 are also not supported. Is there any reason why those types should not be sorted as int64? Also, when sorting object arrays, what should be done with NULL pointers to objects? I'm tempted to treat them as nans and sort them to the end. OTOH, perhaps an error should be raised. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Tue Jul 10 03:02:56 2012 From: travis at continuum.io (Travis Oliphant) Date: Tue, 10 Jul 2012 02:02:56 -0500 Subject: [Numpy-discussion] Convert recarray to list (is this a bug?) In-Reply-To: References: Message-ID: <6E9B6344-22FC-41CB-B125-FE2D5C4EDB09@continuum.io> On Jul 9, 2012, at 9:24 PM, Yan Tang wrote: > Hi, > > I noticed there is an odd issue when I am trying to convert a recarray to list. See below for the example/test case. > > $ cat a.csv > date,count > 2011-07-25,91 > 2011-07-26,118 > $ cat b.csv > name,count > foo,1233 > bar,100 > > $ python > > >>> from matplotlib import mlab > >>> import numpy as np > > >>> a = mlab.csv2rec('a.csv') > >>> b = mlab.csv2rec('b.csv') > >>> a > rec.array([(datetime.date(2011, 7, 25), 91), (datetime.date(2011, 7, 26), 118)], > dtype=[('date', '|O8'), ('count', ' >>> b > rec.array([('foo', 1233), ('bar', 100)], > dtype=[('name', '|S3'), ('count', ' > > >>> np.array(a.tolist()).tolist() > [[datetime.date(2011, 7, 25), 91], [datetime.date(2011, 7, 26), 118]] > >>> np.array(b.tolist()).tolist() > [['foo', '1233'], ['bar', '100']] > > > The odd case is, 1233 becomes a string '1233' in the second command. But 91 is still a number 91. > > Why would this happen? What's the correct way to do this conversion? You are trying to convert the record array into a list of lists, I presume? The tolist() method on the rec.array produces a list of tuples. Be sure that a list of tuples does not actually satisfy your requirements --- it might. 
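For instance, a small sketch of the difference (using np.rec.fromrecords here only as a stand-in
for the csv2rec output, so the exact dtypes are assumptions):

>>> import numpy as np
>>> b = np.rec.fromrecords([('foo', 1233), ('bar', 100)], names=['name', 'count'])
>>> b.tolist()                      # a list of tuples, element types preserved
[('foo', 1233), ('bar', 100)]
>>> [list(x) for x in b.tolist()]   # a list of lists, still without any dtype coercion
[['foo', 1233], ['bar', 100]]
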
Passing this back to np.array is going to try to come up with a data-type that satisfies all the elements in the list of tuples. You are relying here on np.array's "intelligence" for trying to figure out what kind of array you have. It tries to do it's best, but it is limited to determining a "primitive" data-type (float, int, string, object). It can't always predict what you expect --- especially when the original data source was a record like this. In the first case, because of the date-time object, it decides the data is an "object" array which works. In the second it decides that the data can all be represented as a "string" and so choose that. The second .tolist() just produces a list out of the 2-d array. Likely what you want to do is just create a list of lists from the original output of .tolist. Like this: [list(x) for x in a.tolist()] [list(x) for x in b.tolist()] This wil be faster as well... Best, -Travis > > Thanks. > > -uris- > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalke at dalkescientific.com Tue Jul 10 03:05:22 2012 From: dalke at dalkescientific.com (Andrew Dalke) Date: Tue, 10 Jul 2012 09:05:22 +0200 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> <7219F010-A3FC-4E89-9834-5C29EE79198F@dalkescientific.com> <2C456B75-0CC0-4DDE-9943-3C96EAD7ECF8@dalkescientific.com> Message-ID: On Jul 8, 2012, at 9:22 AM, Scott Sinclair wrote: > On 6 July 2012 15:48, Andrew Dalke wrote: >> I followed the instructions at >> http://docs.scipy.org/doc/numpy/dev/gitwash/patching.html >> and added Ticket #2181 (with patch) ... > > Those instructions need to be updated to reflect the current preferred > practice. You'll make code review easier and increase the chances of > getting your patch accepted by submitting the patch as a Github pull > request instead (see > http://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html > for a how-to). It's not very much extra work. Both of those URLs point to related documentation under the same root, so I assumed that both are equally valid. The 'patching' one I linked to says: Making a patch is the simplest and quickest, but if you?re going to be doing anything more than simple quick things, please consider following the Git for development model instead. That really fits me the best, because I don't know git or github, and I don't plan to get involved in numpy development other than two patches (one already posted, and the other, after my holiday, to get rid of required the numpy.testing import). I did look at the development_workflow documentation, and am already bewildered by the terms 'rebase','fast-foward' etc. It seems to that last week I made a mistake because I did a "git pull" on my local copy (which is what I do with Mercurial to get the current trunk code) instead of: git fetch followed by gitrebase, git merge --ff-only or git merge --no-ff, depending on what you intend. I don't know if I made a "common mistake", and I don't know "what [I] intend." 
I realize that for someone who plans to be a long term contributor, understanding git, github, and the NumPy development model is "not very much extra work", but in terms of extra work for me, or at least minimizing my level of confusion, I would rather do what the documentation suggests and continue with the submitted patch. Andrew dalke at dalkescientific.com From travis at continuum.io Tue Jul 10 03:05:49 2012 From: travis at continuum.io (Travis Oliphant) Date: Tue, 10 Jul 2012 02:05:49 -0500 Subject: [Numpy-discussion] Type specific sorts: objects, structured arrays, and all that. In-Reply-To: References: Message-ID: On Jul 9, 2012, at 10:32 PM, Charles R Harris wrote: > Hi All, > > I've been adding type specific sorts for object and structured arrays. It seems that datetime64 and timedelta64 are also not supported. Is there any reason why those types should not be sorted as int64? > > Also, when sorting object arrays, what should be done with NULL pointers to objects? I'm tempted to treat them as nans and sort them to the end. OTOH, perhaps an error should be raised. My understanding is that people using missing data for object arrays will use the "None" object. I don't think it's appropriate to ever have NULL pointers for OBJECT arrays except during initialization. For example, empty([10,5], dtype=object) produces an array of "None". What are you planning to do with None's? You could treat them as nans, I would think. -Travis > > Chuck > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From travis at continuum.io Tue Jul 10 03:11:46 2012 From: travis at continuum.io (Travis Oliphant) Date: Tue, 10 Jul 2012 02:11:46 -0500 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> <7219F010-A3FC-4E89-9834-5C29EE79198F@dalkescientific.com> <2C456B75-0CC0-4DDE-9943-3C96EAD7ECF8@dalkescientific.com> Message-ID: <856FE2AB-1F5A-4F1C-9E73-BA0617CA4275@continuum.io> Andrew, Thank you for your comments. I agree it's confusing coming to github at first. I still have to refer to the jargon-file to understand what everything means. There are a lot of unfamiliar terms. Thank you for your patches. It does imply more work for developers on NumPy, which is why we prefer the github pull request mechanism. But, having patches is better than not having them. Having easy ways to upload a patch somewhere is something to think about with the intended move to github issue tracker. Best, -Travis On Jul 10, 2012, at 2:05 AM, Andrew Dalke wrote: > On Jul 8, 2012, at 9:22 AM, Scott Sinclair wrote: >> On 6 July 2012 15:48, Andrew Dalke wrote: >>> I followed the instructions at >>> http://docs.scipy.org/doc/numpy/dev/gitwash/patching.html >>> and added Ticket #2181 (with patch) ... >> >> Those instructions need to be updated to reflect the current preferred >> practice. You'll make code review easier and increase the chances of >> getting your patch accepted by submitting the patch as a Github pull >> request instead (see >> http://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html >> for a how-to). It's not very much extra work. > > Both of those URLs point to related documentation under the same > root, so I assumed that both are equally valid. 
The 'patching' one I > linked to says: > > Making a patch is the simplest and quickest, but if you?re going to be > doing anything more than simple quick things, please consider following > the Git for development model instead. > > That really fits me the best, because I don't know git or github, and > I don't plan to get involved in numpy development other than two patches > (one already posted, and the other, after my holiday, to get rid of > required the numpy.testing import). > > I did look at the development_workflow documentation, and am already > bewildered by the terms 'rebase','fast-foward' etc. It seems to that > last week I made a mistake because I did a "git pull" on my local copy > (which is what I do with Mercurial to get the current trunk code) > instead of: > > git fetch followed by gitrebase, git merge --ff-only or > git merge --no-ff, depending on what you intend. > > I don't know if I made a "common mistake", and I don't know "what [I] > intend." > > I realize that for someone who plans to be a long term contributor, > understanding git, github, and the NumPy development model is > "not very much extra work", but in terms of extra work for me, > or at least minimizing my level of confusion, I would rather do > what the documentation suggests and continue with the submitted > patch. > > Andrew > dalke at dalkescientific.com > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From robert.kern at gmail.com Tue Jul 10 03:37:24 2012 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 10 Jul 2012 08:37:24 +0100 Subject: [Numpy-discussion] Type specific sorts: objects, structured arrays, and all that. In-Reply-To: References: Message-ID: On Tue, Jul 10, 2012 at 4:32 AM, Charles R Harris wrote: > Hi All, > > I've been adding type specific sorts for object and structured arrays. It > seems that datetime64 and timedelta64 are also not supported. Is there any > reason why those types should not be sorted as int64? You need special handling for NaTs to be consistent with how we deal with NaNs in floats. -- Robert Kern From emmanuelle.gouillart at nsup.org Tue Jul 10 04:06:13 2012 From: emmanuelle.gouillart at nsup.org (Emmanuelle Gouillart) Date: Tue, 10 Jul 2012 10:06:13 +0200 Subject: [Numpy-discussion] Euroscipy 2012: early bird registration ending soon Message-ID: <20120710080613.GA18828@phare.normalesup.org> Hello, early bird registration for Euroscipy 2012 is soon coming to an end, with the deadline on July 22nd. Don't forget to register soon! Reduced fees are available for academics, students and speakers. Registration takes place online on http://www.euroscipy.org/conference/euroscipy2012. Euroscipy 2012 is the annual European conference for scientists using the Python language. It will be held August 23-27 2012 in Brussels, Belgium. Our program has been online for a few weeks now: we're very excited to have a great selection of tutorials, as well as talks and poster presentations. During the two days of tutorials (August 23 and 24), it will possible to attend either the introduction track, or the advanced track, or to combine both tracks (see http://www.euroscipy.org/track/6538?tab=tracktalkslist and http://www.euroscipy.org/track/6539?tab=tracktalkslist). 
As for the highlights of the two days of conference (August 25 and 26), we are very happy to have David Beazley (http://www.dabeaz.com) and Eric Jones (http://www.enthought.com/company/support-team.php) as our keynote speakers. The list of talks is available on http://www.euroscipy.org/track/6540?tab=tracktalkslist, with subjects ranging from extension programming to machine learning, or cellular biology. We're looking forward to exciting discussions during the talk sessions or around the posters, as happened during the previous editions of Euroscipy! Sprints may be organized at the conference venue during the days following the conference, from Monday 27th on. Since there is a limited number of rooms booked for the sprints, please contact the organizers by July 22 if you intend to organize one. Two sprints are already planned by the scikit-learn and the scikits-image teams. The EuroSciPy 2012 conference will feature a best talk, a best poster and a jury award. All conference participants will be given the opportunity to cast a vote for the best talk and best poster awards while the jury award is selected by the members of the program committee. Each prize consists of a Commercial Use license for Wing IDE Professional, an integrated development environment designed specifically for Python programmers. The licenses are generously donated by Wingware. Financial support may be granted by Numfocus to a small number of eligible students. See http://www.euroscipy.org/card/euroscipy2012_support_numfocus for more details on how to apply. For information that cannot be found on the conference website, please contact the organizing team at org-team at lists.euroscipy.org Cheers, Emmanuelle, for the organizing team From ralf.gommers at googlemail.com Tue Jul 10 05:01:14 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 10 Jul 2012 11:01:14 +0200 Subject: [Numpy-discussion] Looking for the most important bugs, documentation needs, etc. In-Reply-To: References: Message-ID: Hi, On Tue, Jul 10, 2012 at 4:20 AM, Six Silberman wrote: > Hi all, > > Some colleagues and I are interested in contributing to numpy. That's great, welcome! > We have a range of backgrounds -- I for example am new to contributing to > open > source software but have a (small) bit of background in scientific > computation, while others have extensive experience contributing to > open source projects. We've looked at the issue tracker and submitted > a couple patches today but we would be interested to hear what active > contributors to the project consider the most pressing, important, > and/or interesting needs at the moment. I personally am quite > interested in hearing about the most pressing documentation needs > (including example code). > For documentation we have docstrings for each function and tutorial-style docs (http://docs.scipy.org/doc/numpy/user/, http://scipy-lectures.github.com/intro/numpy/index.html) . All docstrings should have clear usage examples, but I'm actually finding it quite hard to find functions that don't have any right now. The only one I could dig up so quickly is corrcoef(). There must be a few more. There are two ways to contribute to the docs, either send a pull request on Github if you familiar with git (or want to learn it), or use our doc wiki: http://docs.scipy.org/numpy/docs/numpy.lib.function_base.corrcoef In the doc wiki you can immediate see if the rendered version looks OK. 
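(For what it's worth, the kind of example a docstring like corrcoef()'s is missing only needs to
be a few lines -- something roughly like this, picking trivially correlated data so the output is
exact:

>>> import numpy as np
>>> x = np.array([0., 1., 2., 3.])
>>> np.corrcoef(x, 2 * x)
array([[ 1.,  1.],
       [ 1.,  1.]])

just enough to show the call signature and the shape of the result.)
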
You have to register a username and then ask on this list for edit rights if you want to use the wiki. Besides those few docstrings that miss examples it's mainly the user guide that needs some work I think. For example the "performance" section is still empty. Filling that in will require some in-depth numpy/python knowledge though. If you would like like to work on improving the documentation with examples, my suggestion would be to actually work on a part of scipy that interests you. We aim to get the scipy docstrings to the same level of quality as the numpy ones, and there's a lot to do there. Most docstrings miss examples, and some even miss more basic stuff (parameter/return value descriptions, formatting issues). This is a good overview of important docstrings per topic: docs.scipy.org/scipy/Milestones/ Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Tue Jul 10 05:36:49 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 10 Jul 2012 11:36:49 +0200 Subject: [Numpy-discussion] Looking for the most important bugs, documentation needs, etc. In-Reply-To: References: Message-ID: On Tue, Jul 10, 2012 at 4:20 AM, Six Silberman wrote: > Hi all, > > Some colleagues and I are interested in contributing to numpy. We have > a range of backgrounds -- I for example am new to contributing to open > source software but have a (small) bit of background in scientific > computation, while others have extensive experience contributing to > open source projects. We've looked at the issue tracker and submitted > a couple patches today but we would be interested to hear what active > contributors to the project consider the most pressing, important, > and/or interesting needs at the moment. I personally am quite > interested in hearing about the most pressing documentation needs > (including example code). > As for important issues, I think many of them are related to the core of numpy. But there's some more isolated ones, which is probably better to get started. Here are some that are high on my list of things to fix/improve: - Numpy doesn't work well (or at all) on OS X 10.7 when built with llvm-gcc, which is the default compiler on that platform. With Clang it seems to work fine. Same for Scipy. http://projects.scipy.org/numpy/ticket/1951 - We don't have binary installers for Python 3.x on OS X yet. This requires adapting the installer build scripts that work for 2.x. See pavement.py in the base dir of the repo. - Something that's more straightforward: improving test coverage. It's lacking in a number of places; one of the things that comes to mind is that all functions should be tested for correct behavior with empty input. Normally the expected behavior is empty in --> empty out. When that's not tested, we get things like http://projects.scipy.org/numpy/ticket/2078. Ticket for "empty" test coverage: http://projects.scipy.org/numpy/ticket/2007 - There's a large amount of "normal" bugs, working on any of those would be very helpful too. Hard to say here which ones out of the several hundred are important. It is safe to say though I think that the ones requiring touching the C code are more in need of attention than the pure Python ones. I see a patch for f2py already, and a second ticket opened. This is of course useful, but not too many devs are working on it. Unless Pearu has time to respond this week, it may be hard to get feedback on that topic quickly. 
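On the "empty in --> empty out" item above, the missing tests are usually tiny -- roughly this
shape (which functions and which expected results to cover is exactly what the ticket is about):

import numpy as np
from numpy.testing import assert_equal

def test_empty_input():
    # empty in --> empty out for a couple of representative functions
    empty = np.array([])
    assert_equal(np.sort(empty).shape, (0,))
    assert_equal(np.cumsum(empty).shape, (0,))

so they make a good first contribution.
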
Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From scott.sinclair.za at gmail.com Tue Jul 10 05:39:30 2012 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Tue, 10 Jul 2012 11:39:30 +0200 Subject: [Numpy-discussion] "import numpy" performance In-Reply-To: References: <8E530AA9-E652-484B-88C7-D9CC74373AC7@dalkescientific.com> <7219F010-A3FC-4E89-9834-5C29EE79198F@dalkescientific.com> <2C456B75-0CC0-4DDE-9943-3C96EAD7ECF8@dalkescientific.com> Message-ID: On 10 July 2012 09:05, Andrew Dalke wrote: > On Jul 8, 2012, at 9:22 AM, Scott Sinclair wrote: >> On 6 July 2012 15:48, Andrew Dalke wrote: >>> I followed the instructions at >>> http://docs.scipy.org/doc/numpy/dev/gitwash/patching.html >>> and added Ticket #2181 (with patch) ... >> >> Those instructions need to be updated to reflect the current preferred >> practice. You'll make code review easier and increase the chances of >> getting your patch accepted by submitting the patch as a Github pull >> request instead (see >> http://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html >> for a how-to). It's not very much extra work. > > Both of those URLs point to related documentation under the same > root, so I assumed that both are equally valid. That's a valid assumption. > I did look at the development_workflow documentation, and am already > bewildered by the terms 'rebase','fast-foward' etc. It seems to that > last week I made a mistake because I did a "git pull" on my local copy > (which is what I do with Mercurial to get the current trunk code) > instead of: > > git fetch followed by gitrebase, git merge --ff-only or > git merge --no-ff, depending on what you intend. > > I don't know if I made a "common mistake", and I don't know "what [I] > intend." Fair enough, new terminology is seldom fun. Using git pull wasn't necessary in your case, neither was git rebase. > I realize that for someone who plans to be a long term contributor, > understanding git, github, and the NumPy development model is > "not very much extra work", but in terms of extra work for me, > or at least minimizing my level of confusion, I would rather do > what the documentation suggests and continue with the submitted > patch. By "not very much extra work" I assumed that you'd already done most of the legwork towards submitting a pull request (Github account, forking numpy repo, etc..) My mistake, I now retract that statement :) and submitted your patch in https://github.com/numpy/numpy/pull/334 as a peace offering. Cheers, Scott From ralf.gommers at googlemail.com Tue Jul 10 06:07:02 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 10 Jul 2012 12:07:02 +0200 Subject: [Numpy-discussion] Looking for the most important bugs, documentation needs, etc. In-Reply-To: References: Message-ID: On Tue, Jul 10, 2012 at 11:36 AM, Ralf Gommers wrote: > > > On Tue, Jul 10, 2012 at 4:20 AM, Six Silberman wrote: > >> Hi all, >> >> Some colleagues and I are interested in contributing to numpy. We have >> a range of backgrounds -- I for example am new to contributing to open >> source software but have a (small) bit of background in scientific >> computation, while others have extensive experience contributing to >> open source projects. We've looked at the issue tracker and submitted >> a couple patches today but we would be interested to hear what active >> contributors to the project consider the most pressing, important, >> and/or interesting needs at the moment. 
I personally am quite >> interested in hearing about the most pressing documentation needs >> (including example code). >> > > As for important issues, I think many of them are related to the core of > numpy. But there's some more isolated ones, which is probably better to get > started. Here are some that are high on my list of things to fix/improve: > > - Numpy doesn't work well (or at all) on OS X 10.7 when built with > llvm-gcc, which is the default compiler on that platform. With Clang it > seems to work fine. Same for Scipy. > http://projects.scipy.org/numpy/ticket/1951 > > - We don't have binary installers for Python 3.x on OS X yet. This > requires adapting the installer build scripts that work for 2.x. See > pavement.py in the base dir of the repo. > > - Something that's more straightforward: improving test coverage. It's > lacking in a number of places; one of the things that comes to mind is that > all functions should be tested for correct behavior with empty input. > Normally the expected behavior is empty in --> empty out. When that's not > tested, we get things like http://projects.scipy.org/numpy/ticket/2078. > Ticket for "empty" test coverage: > http://projects.scipy.org/numpy/ticket/2007 > > - There's a large amount of "normal" bugs, working on any of those would > be very helpful too. Hard to say here which ones out of the several hundred > are important. It is safe to say though I think that the ones requiring > touching the C code are more in need of attention than the pure Python ones. > > > I see a patch for f2py already, and a second ticket opened. This is of > course useful, but not too many devs are working on it. Unless Pearu has > time to respond this week, it may be hard to get feedback on that topic > quickly. > Here are some relatively straightforward issues which only require touching Python code: http://projects.scipy.org/numpy/ticket/808 http://projects.scipy.org/numpy/ticket/1968 http://projects.scipy.org/numpy/ticket/1976 http://projects.scipy.org/numpy/ticket/1989 And a Cython one (numpy.random): http://projects.scipy.org/numpy/ticket/1492 I ran into one more patch that I assume one of you just attached: http://projects.scipy.org/numpy/ticket/2074. It's important to understand a little of how our infrastructure works. We changed to git + github last year; submitting patches as pull requests on Github has the lowest overhead for us, and we get notifications. For patches on Trac, we have to manually download and apply them. Plus we don't get notifications, which is quite unhelpful unfortunately. Therefore I suggest using git, and if you can't or you feel that the overhead / learning curve is too large, please ping this mailing list about patches you submit on Trac. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Tue Jul 10 08:30:44 2012 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 10 Jul 2012 08:30:44 -0400 Subject: [Numpy-discussion] Type specific sorts: objects, structured arrays, and all that. In-Reply-To: References: Message-ID: On Tue, Jul 10, 2012 at 3:37 AM, Robert Kern wrote: > On Tue, Jul 10, 2012 at 4:32 AM, Charles R Harris > wrote: > > Hi All, > > > > I've been adding type specific sorts for object and structured arrays. It > > seems that datetime64 and timedelta64 are also not supported. Is there > any > > reason why those types should not be sorted as int64? > > You need special handling for NaTs to be consistent with how we deal > with NaNs in floats. 
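(For reference, the float behaviour being referred to -- NaNs get sorted to the end:

>>> import numpy as np
>>> np.sort(np.array([3.0, np.nan, 1.0, 2.0]))
array([  1.,   2.,   3.,  nan])

so presumably NaT would want to end up in the same place, whatever int64 value it maps to.)
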
> > Not sure if this is an issue or not, but different datetime64 objects can be set for different units: http://docs.scipy.org/doc/numpy/reference/arrays.datetime.html#datetime-units. A straight-out comparison of the values as int64 would likely drop the units, correct? On second thought, though, I guess all datetime64's in a numpy array would all have the same units, so it shouldn't matter, right? Just thinking aloud. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Tue Jul 10 09:04:38 2012 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 10 Jul 2012 09:04:38 -0400 Subject: [Numpy-discussion] Looking for the most important bugs, documentation needs, etc. In-Reply-To: References: Message-ID: On Tue, Jul 10, 2012 at 6:07 AM, Ralf Gommers wrote: > > > On Tue, Jul 10, 2012 at 11:36 AM, Ralf Gommers < > ralf.gommers at googlemail.com> wrote: > >> >> >> On Tue, Jul 10, 2012 at 4:20 AM, Six Silberman wrote: >> >>> Hi all, >>> >>> Some colleagues and I are interested in contributing to numpy. We have >>> a range of backgrounds -- I for example am new to contributing to open >>> source software but have a (small) bit of background in scientific >>> computation, while others have extensive experience contributing to >>> open source projects. We've looked at the issue tracker and submitted >>> a couple patches today but we would be interested to hear what active >>> contributors to the project consider the most pressing, important, >>> and/or interesting needs at the moment. I personally am quite >>> interested in hearing about the most pressing documentation needs >>> (including example code). >>> >> >> As for important issues, I think many of them are related to the core of >> numpy. But there's some more isolated ones, which is probably better to get >> started. Here are some that are high on my list of things to fix/improve: >> >> - Numpy doesn't work well (or at all) on OS X 10.7 when built with >> llvm-gcc, which is the default compiler on that platform. With Clang it >> seems to work fine. Same for Scipy. >> http://projects.scipy.org/numpy/ticket/1951 >> >> - We don't have binary installers for Python 3.x on OS X yet. This >> requires adapting the installer build scripts that work for 2.x. See >> pavement.py in the base dir of the repo. >> >> - Something that's more straightforward: improving test coverage. It's >> lacking in a number of places; one of the things that comes to mind is that >> all functions should be tested for correct behavior with empty input. >> Normally the expected behavior is empty in --> empty out. When that's not >> tested, we get things like http://projects.scipy.org/numpy/ticket/2078. >> Ticket for "empty" test coverage: >> http://projects.scipy.org/numpy/ticket/2007 >> >> - There's a large amount of "normal" bugs, working on any of those would >> be very helpful too. Hard to say here which ones out of the several hundred >> are important. It is safe to say though I think that the ones requiring >> touching the C code are more in need of attention than the pure Python ones. >> >> >> I see a patch for f2py already, and a second ticket opened. This is of >> course useful, but not too many devs are working on it. Unless Pearu has >> time to respond this week, it may be hard to get feedback on that topic >> quickly. 
>> > > Here are some relatively straightforward issues which only require > touching Python code: > > http://projects.scipy.org/numpy/ticket/808 > http://projects.scipy.org/numpy/ticket/1968 > http://projects.scipy.org/numpy/ticket/1976 > http://projects.scipy.org/numpy/ticket/1989 > > And a Cython one (numpy.random): > http://projects.scipy.org/numpy/ticket/1492 > > I ran into one more patch that I assume one of you just attached: > http://projects.scipy.org/numpy/ticket/2074. It's important to understand > a little of how our infrastructure works. We changed to git + github last > year; submitting patches as pull requests on Github has the lowest overhead > for us, and we get notifications. For patches on Trac, we have to manually > download and apply them. Plus we don't get notifications, which is quite > unhelpful unfortunately. Therefore I suggest using git, and if you can't or > you feel that the overhead / learning curve is too large, please ping this > mailing list about patches you submit on Trac. > > Cheers, > Ralf > > By the way, for those who are looking to learn how to use git and github: https://github.com/blog/1183-try-git-in-your-browser Cheers! Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim at cerazone.net Tue Jul 10 09:16:38 2012 From: tim at cerazone.net (Cera, Tim) Date: Tue, 10 Jul 2012 09:16:38 -0400 Subject: [Numpy-discussion] Looking for the most important bugs, documentation needs, etc. In-Reply-To: References: Message-ID: > For documentation we have docstrings for each function and tutorial-style > docs (http://docs.scipy.org/doc/numpy/user/, > http://scipy-lectures.github.com/intro/numpy/index.html) . All docstrings > should have clear usage examples, but I'm actually finding it quite hard to > find functions that don't have any right now. The only one I could dig up > so quickly is corrcoef(). There must be a few more. > The documentation wiki has a little known feature to list functions that do not have docstrings and docstrings that do not have examples. Go to http://docs.scipy.org/numpy/search/ and click on the 'No Examples' or 'No Documentation' links. Same searches are available with scipy at http://docs.scipy.org/scipy/search/, which Ralf already pointed out needs the most work. Kindest regards, Tim -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Jul 10 09:51:45 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 10 Jul 2012 07:51:45 -0600 Subject: [Numpy-discussion] Type specific sorts: objects, structured arrays, and all that. In-Reply-To: References: Message-ID: On Tue, Jul 10, 2012 at 1:05 AM, Travis Oliphant wrote: > > On Jul 9, 2012, at 10:32 PM, Charles R Harris wrote: > > > Hi All, > > > > I've been adding type specific sorts for object and structured arrays. > It seems that datetime64 and timedelta64 are also not supported. Is there > any reason why those types should not be sorted as int64? > > > > Also, when sorting object arrays, what should be done with NULL pointers > to objects? I'm tempted to treat them as nans and sort them to the end. > OTOH, perhaps an error should be raised. > > My understanding is that people using missing data for object arrays will > use the "None" object. I don't think it's appropriate to ever have NULL > pointers for OBJECT arrays except during initialization. > > For example, empty([10,5], dtype=object) produces an array of "None". > What are you planning to do with None's? 
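(A quick check of that, for what it's worth -- smaller shape, same behaviour:

>>> import numpy as np
>>> np.empty((2, 3), dtype=object)
array([[None, None, None],
       [None, None, None]], dtype=object)

so None rather than NULL is what freshly created object arrays actually hold.)
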
You could treat them as nans, I > would think. > It looks like empty returns arrays filled with None. The main reason I was asking was that the compare function looks rather strange, checking for NULL, but also potentially dereferencing NULL ;) Thinking about things a bit more, it looks like qsort compatible versions of heapsort and mergesort, say hsort and msort, would be useful for general compatibility. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From tang.yan at gmail.com Tue Jul 10 09:53:52 2012 From: tang.yan at gmail.com (Yan Tang) Date: Tue, 10 Jul 2012 09:53:52 -0400 Subject: [Numpy-discussion] Convert recarray to list (is this a bug?) In-Reply-To: <6E9B6344-22FC-41CB-B125-FE2D5C4EDB09@continuum.io> References: <6E9B6344-22FC-41CB-B125-FE2D5C4EDB09@continuum.io> Message-ID: Thank you very much. On Tue, Jul 10, 2012 at 3:02 AM, Travis Oliphant wrote: > > On Jul 9, 2012, at 9:24 PM, Yan Tang wrote: > > Hi, > > I noticed there is an odd issue when I am trying to convert a recarray to > list. See below for the example/test case. > > $ cat a.csv > date,count > 2011-07-25,91 > 2011-07-26,118 > $ cat b.csv > name,count > foo,1233 > bar,100 > > $ python > > >>> from matplotlib import mlab > >>> import numpy as np > > >>> a = mlab.csv2rec('a.csv') > >>> b = mlab.csv2rec('b.csv') > >>> a > rec.array([(datetime.date(2011, 7, 25), 91), (datetime.date(2011, 7, 26), > 118)], > dtype=[('date', '|O8'), ('count', ' >>> b > rec.array([('foo', 1233), ('bar', 100)], > dtype=[('name', '|S3'), ('count', ' > > >>> np.array(a.tolist()).tolist() > [[datetime.date(2011, 7, 25), 91], [datetime.date(2011, 7, 26), 118]] > >>> np.array(b.tolist()).tolist() > [['foo', '1233'], ['bar', '100']] > > > The odd case is, 1233 becomes a string '1233' in the second command. But > 91 is still a number 91. > > Why would this happen? What's the correct way to do this conversion? > > > You are trying to convert the record array into a list of lists, I > presume? The tolist() method on the rec.array produces a list of tuples. > Be sure that a list of tuples does not actually satisfy your requirements > --- it might. > > Passing this back to np.array is going to try to come up with a data-type > that satisfies all the elements in the list of tuples. You are relying > here on np.array's "intelligence" for trying to figure out what kind of > array you have. It tries to do it's best, but it is limited to > determining a "primitive" data-type (float, int, string, object). It > can't always predict what you expect --- especially when the original data > source was a record like this. In the first case, because of the > date-time object, it decides the data is an "object" array which works. In > the second it decides that the data can all be represented as a "string" > and so choose that. The second .tolist() just produces a list out of the > 2-d array. > > Likely what you want to do is just create a list of lists from the > original output of .tolist. Like this: > > [list(x) for x in a.tolist()] > [list(x) for x in b.tolist()] > > This wil be faster as well... > > Best, > > -Travis > > > > > > > > > > Thanks. 
> > -uris- > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericfode at gmail.com Tue Jul 10 11:07:49 2012 From: ericfode at gmail.com (Fode) Date: Tue, 10 Jul 2012 11:07:49 -0400 Subject: [Numpy-discussion] SSE Optimization Message-ID: I am interested in adding SSE optimizations to numpy, where should I start? Fode -------------- next part -------------- An HTML attachment was scrubbed... URL: From francesc at continuum.io Tue Jul 10 11:43:15 2012 From: francesc at continuum.io (Francesc Alted) Date: Tue, 10 Jul 2012 17:43:15 +0200 Subject: [Numpy-discussion] SSE Optimization In-Reply-To: References: Message-ID: <4FFC4D93.8080303@continuum.io> On 7/10/12 5:07 PM, Fode wrote: > I am interested in adding SSE optimizations to numpy, where should I > start? Well, to my knowledge there is not many open source code (Intel MKL and AMD ACML do not enter in this section) that uses the SSE, but a good start could be: http://gruntthepeon.free.fr/ssemath/ I'd say that NumPy could benefit a lot of integrating optimized versions for transcendental functions (as the link above). Good luck! -- Francesc Alted From d.s.seljebotn at astro.uio.no Tue Jul 10 14:10:37 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Tue, 10 Jul 2012 20:10:37 +0200 Subject: [Numpy-discussion] SSE Optimization In-Reply-To: References: Message-ID: <7b1d3ef0-e83d-44fb-a648-15c24c1f149d@email.android.com> Some more context over what Francesc said: If you mean using SSE for simple things like addition and multiplication, then you must be aware that NumPy's way of working means that it lends itself very badly to such optimizations. For small arrays, the Python interpreter overhead tends to dominate and for large arrays it's all about memory bus speef. There's a video online of a talk Francesc gave at PyData this year that explains this and the current options. People are working on it (e.g. right now in Numba and Cython) and down the road perhaps NumPy 3.0 or 4.0 can have better performance. But it's a pretty complicated work, it'd be difficult to dive in without learning more first. (Mark Florisson is currently working on a library that is reusable across projects which will bring SSE/vectorization to Cython (it beats Intel Fortran in some benchmarks! :-)) Dag -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. Fode wrote: I am interested in adding SSE optimizations to numpy, where should I start? Fode -------------- next part -------------- An HTML attachment was scrubbed... URL: From pjoshi at numenta.com Tue Jul 10 14:45:24 2012 From: pjoshi at numenta.com (Prakash Joshi) Date: Tue, 10 Jul 2012 18:45:24 +0000 Subject: [Numpy-discussion] build numpy 1.6.2 In-Reply-To: <2BA6BFB370CEB34993BB1193833C1A9E87CA95@MBX021-W3-CA-5.exch021.domain.local> Message-ID: <2BA6BFB370CEB34993BB1193833C1A9E87CAC4@MBX021-W3-CA-5.exch021.domain.local> Hi All, I built numpy 1.6.2 on linux 64 bit and installed numpy in site-packages, It pass all the test cases of numpy, but I am not sure if this is good build; As I did not specified any fortran compiler while setup, also I do not have fortran compiler on my machine. 
Thanks Prakash -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Jul 10 14:48:07 2012 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 10 Jul 2012 19:48:07 +0100 Subject: [Numpy-discussion] [ANN] Patsy 0.1.0, a python library for statistical formulas Message-ID: [Apologies for cross-posting. Please direct any replies to pydata at googlegroups.com.] I'm pleased to announce the first release of Patsy, a Python package for describing statistical models and building design matrices using "formulas". Patsy's formulas are inspired by and largely compatible with those used in R. Patsy makes it easy to quickly try different models and work with categorical data and interactions. (Note: Patsy was originally known as "Charlton" during development. Long story.) Patsy can be used directly by users to generate model matrices for passing to other libraries, or used by those libraries to offer a high-level formula interface. Patsy's goal is to become the standard format used by different Python statistical packages for specifying models, just as formulas are the standard interface used by R packages. While this is an initial release, we already have robust test coverage (>98% statement coverage), comprehensive documentation, and a number of advanced features. (We even correctly handle a few corner cases that R itself gets wrong.) For more information, see: Overview: http://patsy.readthedocs.org/en/latest/overview.html Quickstart: http://patsy.readthedocs.org/en/latest/quickstart.html How to integrate it into your library: http://patsy.readthedocs.org/en/latest/library-developers.html Downloads: http://pypi.python.org/pypi/patsy/ Source: https://github.com/pydata/patsy Share and enjoy, -n From ben.root at ou.edu Tue Jul 10 14:54:58 2012 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 10 Jul 2012 14:54:58 -0400 Subject: [Numpy-discussion] build numpy 1.6.2 In-Reply-To: <2BA6BFB370CEB34993BB1193833C1A9E87CAC4@MBX021-W3-CA-5.exch021.domain.local> References: <2BA6BFB370CEB34993BB1193833C1A9E87CA95@MBX021-W3-CA-5.exch021.domain.local> <2BA6BFB370CEB34993BB1193833C1A9E87CAC4@MBX021-W3-CA-5.exch021.domain.local> Message-ID: On Tue, Jul 10, 2012 at 2:45 PM, Prakash Joshi wrote: > Hi All, > > I built numpy 1.6.2 on linux 64 bit and installed numpy in > site-packages, It pass all the test cases of numpy, but I am not sure if > this is good build; As I did not specified any fortran compiler while > setup, also I do not have fortran compiler on my machine. > > Thanks > Prakash > > NumPy does not need Fortran for its build. SciPy, however, does. Cheers! Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From pjoshi at numenta.com Tue Jul 10 15:26:47 2012 From: pjoshi at numenta.com (Prakash Joshi) Date: Tue, 10 Jul 2012 19:26:47 +0000 Subject: [Numpy-discussion] build numpy 1.6.2 In-Reply-To: Message-ID: <2BA6BFB370CEB34993BB1193833C1A9E87CB0A@MBX021-W3-CA-5.exch021.domain.local> Thanks Ben. Also I did not specified any of BLAS, LAPACK, ATLAS libraries, do we need these libraries for numpy? I simply used following command to build: python setup.py build python setup.py install ?prefix=/usr/local If above commands are sufficient, than I hope same steps to build will work on Mac OSX? 
Best Regards, Prakash Joshi From: Benjamin Root > Reply-To: Discussion of Numerical Python > Date: Tuesday, July 10, 2012 11:54 AM To: Discussion of Numerical Python > Subject: Re: [Numpy-discussion] build numpy 1.6.2 On Tue, Jul 10, 2012 at 2:45 PM, Prakash Joshi > wrote: Hi All, I built numpy 1.6.2 on linux 64 bit and installed numpy in site-packages, It pass all the test cases of numpy, but I am not sure if this is good build; As I did not specified any fortran compiler while setup, also I do not have fortran compiler on my machine. Thanks Prakash NumPy does not need Fortran for its build. SciPy, however, does. Cheers! Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Tue Jul 10 15:41:50 2012 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 10 Jul 2012 15:41:50 -0400 Subject: [Numpy-discussion] build numpy 1.6.2 In-Reply-To: <2BA6BFB370CEB34993BB1193833C1A9E87CB0A@MBX021-W3-CA-5.exch021.domain.local> References: <2BA6BFB370CEB34993BB1193833C1A9E87CB0A@MBX021-W3-CA-5.exch021.domain.local> Message-ID: Prakash, On Tue, Jul 10, 2012 at 3:26 PM, Prakash Joshi wrote: > Thanks Ben. > > Also I did not specified any of BLAS, LAPACK, ATLAS libraries, do we > need these libraries for numpy? > "Need", no, you do not "need" them in the sense that NumPy does not require them to work. NumPy will work just fine without those libraries. However, if you "want" them, then that is where the choice of Fortran compiler comes in. Look at the INSTALL.txt file for more detailed instructions. > I simply used following command to build: > python setup.py build > python setup.py install ?prefix=/usr/local > > If above commands are sufficient, than I hope same steps to build will > work on Mac OSX? > > That entirely depends on your development setup on your Mac. I will leave that discussion up to others on the list to answer. Cheers! Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From silberman.six at gmail.com Tue Jul 10 16:52:45 2012 From: silberman.six at gmail.com (Six Silberman) Date: Tue, 10 Jul 2012 16:52:45 -0400 Subject: [Numpy-discussion] Operations on integer matrices known to return integers Message-ID: We now have >>> a = array([[1, 2], [3, 4]], dtype=int8) array([[1, 2], [3, 4]], dtype=int8) >>> d = linalg.det(a) >>> d -2.0000000000000004 >>> d.dtype dtype('float64') This is at least partly due to use of LU factorization in computing the determinant. Some operations on integer matrices always return integers. It occurred to me that it might be nice to be able to ask functions performing such operations to always return an integer. We could do >>> print d -2.0 but this seems a little unfortunate. If this has been discussed many times previously and you just groaned inwardly, or if this is simply outside the scope of numpy, please let me know and I'll move along. Thanks very much, six From n.nikandish at gmail.com Tue Jul 10 18:22:37 2012 From: n.nikandish at gmail.com (Naser Nikandish) Date: Tue, 10 Jul 2012 18:22:37 -0400 Subject: [Numpy-discussion] Numpy and scipy on python 2.6 Message-ID: Hi, I am using python 2.6 that comes pre installed on Mac Lion 10.7.4. I am also using Gurobipy which is an optimization solver. Now, I need to install numpy and scipy to use their statistical functions. I read mixed reviews about compatibility of numpy and scipy with preinstalled python 2.6. 
I was wondering if I can get a firm answer to the following question: 1- How can I install numpy and scipy to be used with my Mac OS? Is there any specific version that needs to be installed? I rally appreciate if you include a link to the download website. Cheers From ericfode at gmail.com Wed Jul 11 18:06:14 2012 From: ericfode at gmail.com (Fode) Date: Wed, 11 Jul 2012 18:06:14 -0400 Subject: [Numpy-discussion] Cython files Message-ID: Should build_ext regenerate the cython files? It is not doing so for me. Fode -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Wed Jul 11 18:13:51 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 12 Jul 2012 00:13:51 +0200 Subject: [Numpy-discussion] Cython files In-Reply-To: References: Message-ID: On Thu, Jul 12, 2012 at 12:06 AM, Fode wrote: > Should build_ext regenerate the cython files? It is not doing so for me. No, Cython is not a build requirement for numpy. Therefore Cython is run either by hand, or in the case of numpy.random by running generate_mtrand_c.py. Then we check in the generated C files. This is normally done in a separate commit, with prefix GEN for the commit message. Ralf > Fode > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Wed Jul 11 18:40:59 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 12 Jul 2012 00:40:59 +0200 Subject: [Numpy-discussion] Numpy test failure - How to fix In-Reply-To: References: <8A8D24241B862C41B5D5EADB47CE39E205728546E6@NIHMLBX01.nih.gov> Message-ID: On Sun, Jul 8, 2012 at 11:47 AM, Matthew Brett wrote: > Hi, > > On Sun, Jul 8, 2012 at 2:36 AM, Ralf Gommers > wrote: > > > > > > On Fri, Jul 6, 2012 at 1:30 PM, McCully, Dwayne (NIH/NLM/LHC) [C] > > wrote: > >> > >> Hope this is the right list to post this problem! I?m getting two > errors > >> when running a numpy (see below). > >> > >> Could someone tell me how to fix this or if the errors are not a > concern. > >> > >> > >> > >> Dwayne > >> > >> > >> > >> python -c 'import numpy; numpy.test(verbose=2)' > >> > >> > >> > >> Python 2.7.3 > >> > >> Numpy 1.6.2 > >> > >> Nose 1.1.2 > >> > >> PowerPC > >> > >> Red Hat Linux 64 bit > > > > > > These tests are known to fail on PowerPC, see > > http://projects.scipy.org/numpy/ticket/1664 > > https://github.com/numpy/numpy/commit/1b99089 > > > > The question is why the above commit is not effective on your system. > Could > > you check that? For example, is this not true: > > > > import platform > > "powerpc" in platform.processor() > > I get this on OSX PPC: > > In [1]: import platform > > In [2]: platform.processor() > Out[2]: 'powerpc' > > In [3]: platform.machine() > Out[3]: 'Power Macintosh' > > and this on Debian Wheezy PPC: > > In [1]: import platform > > In [2]: platform.processor() > Out[2]: '' > > In [3]: platform.machine() > Out[3]: 'ppc' > > In my own code I ended up making a one-line function, 'on_powerpc': > > https://github.com/nipy/nibabel/blob/master/nibabel/casting.py#L171 > > def on_powerpc(): > return processor() == 'powerpc' or machine().startswith('ppc') > Thanks Matthew. Sent a PR for this: https://github.com/numpy/numpy/pull/345 Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Wed Jul 11 19:28:22 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 11 Jul 2012 17:28:22 -0600 Subject: [Numpy-discussion] Remove current 1.7 branch? Message-ID: Hi All, Travis and I agree that it would be appropriate to remove the current 1.7.x branch and branch again after a code freeze. That way we can avoid the pain and potential errors of backports. It is considered bad form to mess with public repositories that way, so another option would be to rename the branch, although I'm not sure how well that would work. Suggestions? I've forward ported the 1.7 release notes, which probably should have been in master to start with. Are there any other commits that should be forward ported? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From thouis at gmail.com Thu Jul 12 01:21:41 2012 From: thouis at gmail.com (Thouis (Ray) Jones) Date: Thu, 12 Jul 2012 07:21:41 +0200 Subject: [Numpy-discussion] Remove current 1.7 branch? In-Reply-To: References: Message-ID: On Thu, Jul 12, 2012 at 1:28 AM, Charles R Harris wrote: > Hi All, > > Travis and I agree that it would be appropriate to remove the current 1.7.x > branch and branch again after a code freeze. That way we can avoid the pain > and potential errors of backports. It is considered bad form to mess with > public repositories that way, so another option would be to rename the > branch, although I'm not sure how well that would work. Suggestions? I might be mistaken, but if the branch is merged into master (even if that merge makes no changes), I think it's safe to delete it at that point (and recreate it at a later date with the same name) with regards to remote repositories. It should be fairly easy to test. Ray Jones From charlesr.harris at gmail.com Thu Jul 12 01:42:19 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 11 Jul 2012 23:42:19 -0600 Subject: [Numpy-discussion] Remove current 1.7 branch? In-Reply-To: References: Message-ID: On Wed, Jul 11, 2012 at 11:21 PM, Thouis (Ray) Jones wrote: > On Thu, Jul 12, 2012 at 1:28 AM, Charles R Harris > wrote: > > Hi All, > > > > Travis and I agree that it would be appropriate to remove the current > 1.7.x > > branch and branch again after a code freeze. That way we can avoid the > pain > > and potential errors of backports. It is considered bad form to mess with > > public repositories that way, so another option would be to rename the > > branch, although I'm not sure how well that would work. Suggestions? > > I might be mistaken, but if the branch is merged into master (even if > that merge makes no changes), I think it's safe to delete it at that > point (and recreate it at a later date with the same name) with > regards to remote repositories. It should be fairly easy to test. > > Looking at the log, the only commits I see are the release notes and a backport of bento build fixes. I think it is safe to delete the branch. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Thu Jul 12 02:34:59 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 12 Jul 2012 01:34:59 -0500 Subject: [Numpy-discussion] Remove current 1.7 branch? In-Reply-To: References: Message-ID: <0FD40AEB-6CB0-41D2-BAE1-32313294127F@continuum.io> Thanks for the review. I think it will be safe as well. Ondrej is traveling to Prague at the moment. When he arrives, let's here what he has to say and then move forward. 
-Travis On Jul 12, 2012, at 12:42 AM, Charles R Harris wrote: > > > On Wed, Jul 11, 2012 at 11:21 PM, Thouis (Ray) Jones wrote: > On Thu, Jul 12, 2012 at 1:28 AM, Charles R Harris > wrote: > > Hi All, > > > > Travis and I agree that it would be appropriate to remove the current 1.7.x > > branch and branch again after a code freeze. That way we can avoid the pain > > and potential errors of backports. It is considered bad form to mess with > > public repositories that way, so another option would be to rename the > > branch, although I'm not sure how well that would work. Suggestions? > > I might be mistaken, but if the branch is merged into master (even if > that merge makes no changes), I think it's safe to delete it at that > point (and recreate it at a later date with the same name) with > regards to remote repositories. It should be fairly easy to test. > > > Looking at the log, the only commits I see are the release notes and a backport of bento build fixes. I think it is safe to delete the branch. > > Chuck > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From klin938 at gmail.com Thu Jul 12 03:44:34 2012 From: klin938 at gmail.com (Derrick Lin) Date: Thu, 12 Jul 2012 17:44:34 +1000 Subject: [Numpy-discussion] Compile numpy 1.6.2 with ACML 4.4.0 on CentOS6 Message-ID: Hi guys, I have been trying to compile numpy 1.6.2 with the ACML4.4.0 on our AMD based cluster. I mainly followed some guides but most likely outdated like this one: http://projects.scipy.org/numpy/attachment/ticket/740/acml2.log I have compiled CBLAS linked with ACML4.4.0 gfortran, and it appears to be working fine (I ran the supplied built tests). Then when I compiled numpy 1.6.2, it refused to pick up the CBLAS and ACML no matter how. It always uses the ATLAS which is installed via OS package by default. That legacy guide as well as some very old posts suggested that all magics happen in site.cfg, which is what I did: [blas_opt] libraries = cblas, acml library_dirs = /share/ClusterShare/software/centos6/CBLAS/lib/ACML440:/share/ClusterShare/software/noarch/acml4.4.0/gfortran64/lib include_dirs = /share/ClusterShare/software/centos6/CBLAS/include:/share/ClusterShare/software/noarch/acml4.4.0/gfortran64/include [lapack_opt] libraries = acml library_dirs = /share/ClusterShare/software/noarch/acml4.4.0/gfortran64/lib include_dirs = /share/ClusterShare/software/noarch/acml4.4.0/gfortran64/include Later I read the official installation guide http://www.scipy.org/Installing_SciPy/Linux, saying we can export BLAS, LAPACK to the specified libraries. I tried but didn't work either. Anyone has done the similar thing successfully? Thanks in adv. Derrick -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Thu Jul 12 04:00:45 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 12 Jul 2012 10:00:45 +0200 Subject: [Numpy-discussion] Remove current 1.7 branch? In-Reply-To: References: Message-ID: On Thu, Jul 12, 2012 at 7:42 AM, Charles R Harris wrote: > > > On Wed, Jul 11, 2012 at 11:21 PM, Thouis (Ray) Jones wrote: > >> On Thu, Jul 12, 2012 at 1:28 AM, Charles R Harris >> wrote: >> > Hi All, >> > >> > Travis and I agree that it would be appropriate to remove the current >> 1.7.x >> > branch and branch again after a code freeze. 
That way we can avoid the >> pain >> > and potential errors of backports. It is considered bad form to mess >> with >> > public repositories that way, so another option would be to rename the >> > branch, although I'm not sure how well that would work. Suggestions? >> >> I might be mistaken, but if the branch is merged into master (even if >> that merge makes no changes), I think it's safe to delete it at that >> point (and recreate it at a later date with the same name) with >> regards to remote repositories. It should be fairly easy to test. >> >> > Looking at the log, the only commits I see are the release notes and a > backport of bento build fixes. I think it is safe to delete the branch. > I think Ray is right, it needs to be merged back into master. If you just delete it you'll create an issue for people tracking the 1.7.x branch. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Jul 12 05:06:08 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 12 Jul 2012 03:06:08 -0600 Subject: [Numpy-discussion] Remove current 1.7 branch? In-Reply-To: References: Message-ID: On Thu, Jul 12, 2012 at 2:00 AM, Ralf Gommers wrote: > > > On Thu, Jul 12, 2012 at 7:42 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Wed, Jul 11, 2012 at 11:21 PM, Thouis (Ray) Jones wrote: >> >>> On Thu, Jul 12, 2012 at 1:28 AM, Charles R Harris >>> wrote: >>> > Hi All, >>> > >>> > Travis and I agree that it would be appropriate to remove the current >>> 1.7.x >>> > branch and branch again after a code freeze. That way we can avoid the >>> pain >>> > and potential errors of backports. It is considered bad form to mess >>> with >>> > public repositories that way, so another option would be to rename the >>> > branch, although I'm not sure how well that would work. Suggestions? >>> >>> I might be mistaken, but if the branch is merged into master (even if >>> that merge makes no changes), I think it's safe to delete it at that >>> point (and recreate it at a later date with the same name) with >>> regards to remote repositories. It should be fairly easy to test. >>> >>> >> Looking at the log, the only commits I see are the release notes and a >> backport of bento build fixes. I think it is safe to delete the branch. >> > > I think Ray is right, it needs to be merged back into master. If you just > delete it you'll create an issue for people tracking the 1.7.x branch. > > Hmmm, since everything is in master already what happens if we just reset to the branch point, then merge (there were already conflicts with the bento backport) OK, that works, also reverting everything in the branch and then merging works. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Jul 12 05:29:13 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 12 Jul 2012 03:29:13 -0600 Subject: [Numpy-discussion] Remove current 1.7 branch? 
In-Reply-To: References: Message-ID: On Thu, Jul 12, 2012 at 3:06 AM, Charles R Harris wrote: > > > On Thu, Jul 12, 2012 at 2:00 AM, Ralf Gommers > wrote: > >> >> >> On Thu, Jul 12, 2012 at 7:42 AM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> >>> >>> On Wed, Jul 11, 2012 at 11:21 PM, Thouis (Ray) Jones wrote: >>> >>>> On Thu, Jul 12, 2012 at 1:28 AM, Charles R Harris >>>> wrote: >>>> > Hi All, >>>> > >>>> > Travis and I agree that it would be appropriate to remove the current >>>> 1.7.x >>>> > branch and branch again after a code freeze. That way we can avoid >>>> the pain >>>> > and potential errors of backports. It is considered bad form to mess >>>> with >>>> > public repositories that way, so another option would be to rename the >>>> > branch, although I'm not sure how well that would work. Suggestions? >>>> >>>> I might be mistaken, but if the branch is merged into master (even if >>>> that merge makes no changes), I think it's safe to delete it at that >>>> point (and recreate it at a later date with the same name) with >>>> regards to remote repositories. It should be fairly easy to test. >>>> >>>> >>> Looking at the log, the only commits I see are the release notes and a >>> backport of bento build fixes. I think it is safe to delete the branch. >>> >> >> I think Ray is right, it needs to be merged back into master. If you just >> delete it you'll create an issue for people tracking the 1.7.x branch. >> >> > Hmmm, since everything is in master already what happens if we just reset > to the branch point, then merge (there were already conflicts with the > bento backport) OK, that works, also reverting everything > in the branch and then merging works. > > The branch is still there, however. I think the simplest thing to do is just delete the thing. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Thu Jul 12 07:48:56 2012 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 12 Jul 2012 07:48:56 -0400 Subject: [Numpy-discussion] Remove current 1.7 branch? In-Reply-To: References: Message-ID: On Thursday, July 12, 2012, Thouis (Ray) Jones wrote: > On Thu, Jul 12, 2012 at 1:28 AM, Charles R Harris > > wrote: > > Hi All, > > > > Travis and I agree that it would be appropriate to remove the current > 1.7.x > > branch and branch again after a code freeze. That way we can avoid the > pain > > and potential errors of backports. It is considered bad form to mess with > > public repositories that way, so another option would be to rename the > > branch, although I'm not sure how well that would work. Suggestions? > > I might be mistaken, but if the branch is merged into master (even if > that merge makes no changes), I think it's safe to delete it at that > point (and recreate it at a later date with the same name) with > regards to remote repositories. It should be fairly easy to test. > > Ray Jones No, that is not the case. We had a situation occur awhile back where one of the public branches of mpl got completely messed up. You can't even rename it since the rename doesn't occur in the pulls and merges. What we ended up doing was creating a brand new branch "v1.0.x-maint" and making sure all the devs knew to switch over to that. You might even go a step further and make a final commit to the bad branch that makes the build fail with a big note explaining what to do. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Thu Jul 12 07:54:40 2012 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 12 Jul 2012 12:54:40 +0100 Subject: [Numpy-discussion] Remove current 1.7 branch? In-Reply-To: References: Message-ID: On Thu, Jul 12, 2012 at 12:48 PM, Benjamin Root wrote: > > > On Thursday, July 12, 2012, Thouis (Ray) Jones wrote: >> >> On Thu, Jul 12, 2012 at 1:28 AM, Charles R Harris >> wrote: >> > Hi All, >> > >> > Travis and I agree that it would be appropriate to remove the current >> > 1.7.x >> > branch and branch again after a code freeze. That way we can avoid the >> > pain >> > and potential errors of backports. It is considered bad form to mess >> > with >> > public repositories that way, so another option would be to rename the >> > branch, although I'm not sure how well that would work. Suggestions? >> >> I might be mistaken, but if the branch is merged into master (even if >> that merge makes no changes), I think it's safe to delete it at that >> point (and recreate it at a later date with the same name) with >> regards to remote repositories. It should be fairly easy to test. >> >> Ray Jones > > > No, that is not the case. We had a situation occur awhile back where one of > the public branches of mpl got completely messed up. You can't even rename > it since the rename doesn't occur in the pulls and merges. > > What we ended up doing was creating a brand new branch "v1.0.x-maint" and > making sure all the devs knew to switch over to that. You might even go a > step further and make a final commit to the bad branch that makes the build > fail with a big note explaining what to do. The branch isn't bad, it's just out of date. So long as the new version of the branch has the current version of the branch in its ancestry, then everything will be fine. Option 1: git checkout master git merge maint1.7.x git checkout maint1.7.x git merge master # will be a fast-forward Option 2: git checkout master git merge maint1.7.x git branch -d maint1.7.x # delete the branch git checkout -b maint1.7.x # recreate it In git terms these two options are literally identical; they result in the exact same repo state... -N From ben.root at ou.edu Thu Jul 12 08:14:40 2012 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 12 Jul 2012 08:14:40 -0400 Subject: [Numpy-discussion] Remove current 1.7 branch? In-Reply-To: References: Message-ID: On Thursday, July 12, 2012, Nathaniel Smith wrote: > On Thu, Jul 12, 2012 at 12:48 PM, Benjamin Root > > wrote: > > > > > > On Thursday, July 12, 2012, Thouis (Ray) Jones wrote: > >> > >> On Thu, Jul 12, 2012 at 1:28 AM, Charles R Harris > >> > wrote: > >> > Hi All, > >> > > >> > Travis and I agree that it would be appropriate to remove the current > >> > 1.7.x > >> > branch and branch again after a code freeze. That way we can avoid the > >> > pain > >> > and potential errors of backports. It is considered bad form to mess > >> > with > >> > public repositories that way, so another option would be to rename the > >> > branch, although I'm not sure how well that would work. Suggestions? > >> > >> I might be mistaken, but if the branch is merged into master (even if > >> that merge makes no changes), I think it's safe to delete it at that > >> point (and recreate it at a later date with the same name) with > >> regards to remote repositories. It should be fairly easy to test. > >> > >> Ray Jones > > > > > > No, that is not the case. We had a situation occur awhile back where > one of > > the public branches of mpl got completely messed up. 
You can't even > rename > > it since the rename doesn't occur in the pulls and merges. > > > > What we ended up doing was creating a brand new branch "v1.0.x-maint" and > > making sure all the devs knew to switch over to that. You might even go > a > > step further and make a final commit to the bad branch that makes the > build > > fail with a big note explaining what to do. > > The branch isn't bad, it's just out of date. So long as the new > version of the branch has the current version of the branch in its > ancestry, then everything will be fine. > > Option 1: > git checkout master > git merge maint1.7.x > git checkout maint1.7.x > git merge master # will be a fast-forward > > Option 2: > git checkout master > git merge maint1.7.x > git branch -d maint1.7.x # delete the branch > git checkout -b maint1.7.x # recreate it > > In git terms these two options are literally identical; they result in > the exact same repo state... > > -N Ah, I misunderstood. Then yes, I think this is correct. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From valentin.haenel at epfl.ch Thu Jul 12 08:45:30 2012 From: valentin.haenel at epfl.ch (=?iso-8859-1?Q?H=E4nel?= Nikolaus Valentin) Date: Thu, 12 Jul 2012 14:45:30 +0200 Subject: [Numpy-discussion] Remove current 1.7 branch? In-Reply-To: References: Message-ID: <20120712124530.GB3093@kudu.in-berlin.de> Hi, * Charles R Harris [2012-07-12]: > Travis and I agree that it would be appropriate to remove the current 1.7.x > branch and branch again after a code freeze. That way we can avoid the pain > and potential errors of backports. It is considered bad form to mess with > public repositories that way, so another option would be to rename the > branch, although I'm not sure how well that would work. Suggestions? I am a silent listener on this list, but in this case I may have some valuable insights to share, on the two proposed options: 1) delete the branch You can delete the branch in the remote repository. Anyone using the --prune option for fetch, pull or remote update will have their corresponding remote-tracking-branch deleted (pruned) automatically. The problem arises, if people have created a local branch from the remote-tracking-branch, they need to explicitly delete it. If they don't, they risk pushing that branch again after it has been recreated in the remote repository (depends on the setting of 'push.default'). In fact, what I have seen in the past is people not deleting their local branch and when the branch is recreated, they receive a message from git when they push, that the branch could not be updated. This is because their setting for 'push.default' is set to 'matching' (which is the default) and suddenly the branch has appeared again in the remote repo, so git tries to push the local branch to remote branch (since they have the same name). But their local version of this branch is still at the old position and because the new remote branch does not contain the old branch's commits, a fast-forward is not possible and they end up merging the old commits into the new branch in a feeble attempt to somehow get rid of the annoying error message. In the end, you have a really bad mess including random merge commits that make no sense and suddenly re-appearing commits that were supposed to have been long forgotten. Do you trust everyone to a) read the mail the explains the need to delete their local branch and b) actually go and delete it? 
;) Incidentally, if you wish the commits on the branch to be included in master, you must cherry-pick them, but from what I gather that has already been done. 2) merge the branch to master You can merge the branch to master and leave it at that. In this case, when you wish to make use of the branch again at a later stage, you fast-forward the branch to the commit from where it should be used again (which is most probably the commit that master points to) and then commit on that branch to actually create the bifurcation. With this option, nobody needs to take any explicit action. Anyone who has created a local branch to track that branch can simply merge the new remote-tracking-branch by fast-forward as and when it is used again. Of course, you must include all commits that are on that branch. If there are any commits who's changes you would not like to see in master, first merge, and the use 'git revert' on those commits to undo the changes. So, as you can see it depends on the git skill-level of the developers, how much risk you are willing to take, and how clean and pure you would like to have your history. Personally, I would favour option 1) since it is the cleaner solution, but in the light of the probable number of forks of the numpy repo, option 2) sounds like a safer bet IMHO. Hope that helps. V- From ndbecker2 at gmail.com Thu Jul 12 10:53:04 2012 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 12 Jul 2012 10:53:04 -0400 Subject: [Numpy-discussion] m-ary logical functions Message-ID: I've been bitten several times by this. logical_or (a, b, c) is silently accepted when I really meant logical_or (logical_or (a, b), c) because the logic functions are binary, where I expected them to be m-ary. Dunno if anything can be done about it. Sure would like it if they were m-ary and out was a kw arg. From heng at cantab.net Thu Jul 12 11:13:52 2012 From: heng at cantab.net (Henry Gomersall) Date: Thu, 12 Jul 2012 16:13:52 +0100 Subject: [Numpy-discussion] m-ary logical functions In-Reply-To: References: Message-ID: <1342106032.2920.10.camel@farnsworth> On Thu, 2012-07-12 at 10:53 -0400, Neal Becker wrote: > I've been bitten several times by this. > > logical_or (a, b, c) > > is silently accepted when I really meant > > logical_or (logical_or (a, b), c) > > because the logic functions are binary, where I expected them to be > m-ary. I don't think you mean m-ary. It's just a simple binary OR of more than one variable. I don't even know what a m-ary OR would mean (a bit-wise OR of the binary representation?) It's already a bit-wise OR of an array, that's the whole point (otherwise you could just use `or'!) Henry From njs at pobox.com Thu Jul 12 11:21:30 2012 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 12 Jul 2012 16:21:30 +0100 Subject: [Numpy-discussion] m-ary logical functions In-Reply-To: <1342106032.2920.10.camel@farnsworth> References: <1342106032.2920.10.camel@farnsworth> Message-ID: On Thu, Jul 12, 2012 at 4:13 PM, Henry Gomersall wrote: > On Thu, 2012-07-12 at 10:53 -0400, Neal Becker wrote: >> I've been bitten several times by this. >> >> logical_or (a, b, c) >> >> is silently accepted when I really meant >> >> logical_or (logical_or (a, b), c) >> >> because the logic functions are binary, where I expected them to be >> m-ary. > > I don't think you mean m-ary. It's just a simple binary OR of more than > one variable. I don't even know what a m-ary OR would mean (a bit-wise > OR of the binary representation?) 
> > It's already a bit-wise OR of an array, that's the whole point > (otherwise you could just use `or'!) Different "ary": https://en.wikipedia.org/wiki/Arity -n From heng at cantab.net Thu Jul 12 12:08:24 2012 From: heng at cantab.net (Henry Gomersall) Date: Thu, 12 Jul 2012 17:08:24 +0100 Subject: [Numpy-discussion] m-ary logical functions In-Reply-To: References: <1342106032.2920.10.camel@farnsworth> Message-ID: <1342109304.2920.13.camel@farnsworth> On Thu, 2012-07-12 at 16:21 +0100, Nathaniel Smith wrote: > On Thu, Jul 12, 2012 at 4:13 PM, Henry Gomersall > wrote: > > On Thu, 2012-07-12 at 10:53 -0400, Neal Becker wrote: > >> I've been bitten several times by this. > >> > >> logical_or (a, b, c) > >> > >> is silently accepted when I really meant > >> > >> logical_or (logical_or (a, b), c) > >> > >> because the logic functions are binary, where I expected them to be > >> m-ary. > > > > I don't think you mean m-ary. It's just a simple binary OR of more > than > > one variable. I don't even know what a m-ary OR would mean (a > bit-wise > > OR of the binary representation?) > > > > It's already a bit-wise OR of an array, that's the whole point > > (otherwise you could just use `or'!) > > Different "ary": https://en.wikipedia.org/wiki/Arity haha! Apologies, my bad. Though talking about m-ary in a context like this is just damned confusing. Henry From mluessi at gmail.com Thu Jul 12 13:26:25 2012 From: mluessi at gmail.com (Martin Luessi) Date: Thu, 12 Jul 2012 13:26:25 -0400 Subject: [Numpy-discussion] m-ary logical functions In-Reply-To: References: Message-ID: On Thu, Jul 12, 2012 at 10:53 AM, Neal Becker wrote: > I've been bitten several times by this. > > logical_or (a, b, c) > > is silently accepted when I really meant > > logical_or (logical_or (a, b), c) > > because the logic functions are binary, where I expected them to be m-ary. > > Dunno if anything can be done about it. > > Sure would like it if they were m-ary and out was a kw arg. I never had the problem that I tried to use or with more than two arguments. But I agree, it's easy to make that mistake. Instead of logical_or (logical_or (a, b), c) I usually use any((a, b, c)) From njs at pobox.com Thu Jul 12 13:45:52 2012 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 12 Jul 2012 18:45:52 +0100 Subject: [Numpy-discussion] m-ary logical functions In-Reply-To: References: Message-ID: On Thu, Jul 12, 2012 at 3:53 PM, Neal Becker wrote: > I've been bitten several times by this. > > logical_or (a, b, c) > > is silently accepted when I really meant > > logical_or (logical_or (a, b), c) > > because the logic functions are binary, where I expected them to be m-ary. > > Dunno if anything can be done about it. > > Sure would like it if they were m-ary and out was a kw arg. I'd actually like to see out= as a kw-only arg. But I don't know how we could get there; there's a ton of code in numpy itself that passes 'out' as a positional argument, and surely even more out there in the world. I think some of the backwards-compatibility goals thrown about on the list lately are... impractical... but this would still be a fairly disruptive change. We could add another ufunc method, I guess, like np.logical_or.areduce(a, b, c) For that matter, it'd be super useful to be able to write np.dot(a, b, c, d), with the bonus that we could dynamically pick the fastest way to evaluate it ... unfortunately it looks like we've already shipped support for out= as a positional argument in np.dot too (added in 1.6). 
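For reference, the n-argument behaviour Neal asks for can already be spelled with the existing ufunc machinery. A small sketch, with made-up example arrays, and not a proposal for a new API:

import numpy as np
from functools import reduce

a = np.array([True, False, False])
b = np.array([False, True, False])
c = np.array([False, False, True])

# ufunc.reduce stacks the inputs along a new axis and ORs them elementwise
print(np.logical_or.reduce([a, b, c]))     # [ True  True  True]

# folding the binary ufunc over the arguments gives the same result
print(reduce(np.logical_or, (a, b, c)))    # [ True  True  True]

# np.any over the stacked arrays, reducing along the new axis
print(np.any([a, b, c], axis=0))           # [ True  True  True]

None of these change the original complaint, though: np.logical_or(a, b, c) is still silently accepted, with c taken as the output argument.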
-N From alan.isaac at gmail.com Thu Jul 12 14:13:20 2012 From: alan.isaac at gmail.com (Alan G Isaac) Date: Thu, 12 Jul 2012 14:13:20 -0400 Subject: [Numpy-discussion] m-ary logical functions In-Reply-To: References: Message-ID: <4FFF13C0.7030209@gmail.com> On 7/12/2012 1:45 PM, Nathaniel Smith wrote: > I'd actually like to see out= as a kw-only arg. That would be great. Numpy 2.0? Alan Isaac From chaoyuejoy at gmail.com Thu Jul 12 15:38:34 2012 From: chaoyuejoy at gmail.com (Chao YUE) Date: Thu, 12 Jul 2012 21:38:34 +0200 Subject: [Numpy-discussion] use slicing as argument values? Message-ID: Dear all, I want to create a function and I would like one of the arguments of the function to determine what slicing of numpy array I want to use. a simple example: a=np.arange(100).reshape(10,10) suppose I want to have a imaging function to show image of part of this data: def show_part_of_data(m,n): plt.imshow(a[m,n]) like I can give m=3:5, n=2:7, when I call function show_part_of_data(3:5,2:7), this means I try to do plt.imshow(a[3:5,2:7]). the above example doesn't work in reality. but it illustrates something similar that I desire, that is, I can specify what slicing of number array I want by giving values to function arguments. thanks a lot, Chao -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Thu Jul 12 15:42:44 2012 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 12 Jul 2012 15:42:44 -0400 Subject: [Numpy-discussion] use slicing as argument values? In-Reply-To: References: Message-ID: On Thu, Jul 12, 2012 at 3:38 PM, Chao YUE wrote: > Dear all, > > I want to create a function and I would like one of the arguments of the > function to determine what slicing of numpy array I want to use. > a simple example: > > a=np.arange(100).reshape(10,10) > > suppose I want to have a imaging function to show image of part of this > data: > > def show_part_of_data(m,n): > plt.imshow(a[m,n]) > > like I can give m=3:5, n=2:7, when I call function > show_part_of_data(3:5,2:7), this means I try to do plt.imshow(a[3:5,2:7]). > the above example doesn't work in reality. but it illustrates something > similar that I desire, that is, I can specify what slicing of > number array I want by giving values to function arguments. > > thanks a lot, > > Chao > > What you want to do is create slice objects. a[3:5] is equivalent to sl = slice(3, 5) a[sl] and a[3:5, 5:14] is equivalent to sl = (slice(3, 5), slice(5, 14)) a[sl] Furthermore, notation such as "::-1" is equivalent to slice(None, None, -1) I hope this helps! Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From chaoyuejoy at gmail.com Thu Jul 12 16:46:39 2012 From: chaoyuejoy at gmail.com (Chao YUE) Date: Thu, 12 Jul 2012 22:46:39 +0200 Subject: [Numpy-discussion] use slicing as argument values? In-Reply-To: References: Message-ID: Hi Ben, it helps a lot. I am nearly finishing a function in a way I think pythonic. 
Just one more question, I have: In [24]: b=np.arange(1,11) In [25]: b Out[25]: array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]) In [26]: b[slice(1)] Out[26]: array([1]) In [27]: b[slice(4)] Out[27]: array([1, 2, 3, 4]) In [28]: b[slice(None,4)] Out[28]: array([1, 2, 3, 4]) so slice(4) is actually slice(None,4), how can I exactly want retrieve a[4] using slice object? thanks again! Chao 2012/7/12 Benjamin Root > > > On Thu, Jul 12, 2012 at 3:38 PM, Chao YUE wrote: > >> Dear all, >> >> I want to create a function and I would like one of the arguments of the >> function to determine what slicing of numpy array I want to use. >> a simple example: >> >> a=np.arange(100).reshape(10,10) >> >> suppose I want to have a imaging function to show image of part of this >> data: >> >> def show_part_of_data(m,n): >> plt.imshow(a[m,n]) >> >> like I can give m=3:5, n=2:7, when I call function >> show_part_of_data(3:5,2:7), this means I try to do plt.imshow(a[3:5,2:7]). >> the above example doesn't work in reality. but it illustrates something >> similar that I desire, that is, I can specify what slicing of >> number array I want by giving values to function arguments. >> >> thanks a lot, >> >> Chao >> >> > > What you want to do is create slice objects. > > a[3:5] > > is equivalent to > > sl = slice(3, 5) > a[sl] > > > and > > a[3:5, 5:14] > > is equivalent to > > sl = (slice(3, 5), slice(5, 14)) > a[sl] > > Furthermore, notation such as "::-1" is equivalent to slice(None, None, -1) > > I hope this helps! > Ben Root > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Jul 12 16:49:16 2012 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 12 Jul 2012 21:49:16 +0100 Subject: [Numpy-discussion] use slicing as argument values? In-Reply-To: References: Message-ID: On Thu, Jul 12, 2012 at 9:46 PM, Chao YUE wrote: > Hi Ben, > > it helps a lot. I am nearly finishing a function in a way I think pythonic. > Just one more question, I have: > > In [24]: b=np.arange(1,11) > > In [25]: b > Out[25]: array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]) > > In [26]: b[slice(1)] > Out[26]: array([1]) > > In [27]: b[slice(4)] > Out[27]: array([1, 2, 3, 4]) > > In [28]: b[slice(None,4)] > Out[28]: array([1, 2, 3, 4]) > > so slice(4) is actually slice(None,4), how can I exactly want retrieve a[4] > using slice object? You don't. You use 4. -- Robert Kern From jjhelmus at gmail.com Thu Jul 12 16:55:00 2012 From: jjhelmus at gmail.com (Jonathan Helmus) Date: Thu, 12 Jul 2012 16:55:00 -0400 Subject: [Numpy-discussion] use slicing as argument values? In-Reply-To: References: Message-ID: <4FFF39A4.8070905@gmail.com> On 07/12/2012 04:46 PM, Chao YUE wrote: > Hi Ben, > > it helps a lot. I am nearly finishing a function in a way I think > pythonic. 
> Just one more question, I have: > > In [24]: b=np.arange(1,11) > > In [25]: b > Out[25]: array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]) > > In [26]: b[slice(1)] > Out[26]: array([1]) > > In [27]: b[slice(4)] > Out[27]: array([1, 2, 3, 4]) > > In [28]: b[slice(None,4)] > Out[28]: array([1, 2, 3, 4]) > > so slice(4) is actually slice(None,4), how can I exactly want retrieve > a[4] using slice object? > > thanks again! > > Chao slice is a build in python function and the online docs explain its use (http://docs.python.org/library/functions.html#slice). b[slice(4,5)] will give you something close to b[4], but not quite the same. In [8]: b[4] Out[8]: 5 In [9]: b[slice(4,5)] Out[9]: array([5]) - Jonathan Helmus From ben.root at ou.edu Thu Jul 12 16:56:24 2012 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 12 Jul 2012 16:56:24 -0400 Subject: [Numpy-discussion] use slicing as argument values? In-Reply-To: References: Message-ID: On Thu, Jul 12, 2012 at 4:46 PM, Chao YUE wrote: > Hi Ben, > > it helps a lot. I am nearly finishing a function in a way I think > pythonic. > Just one more question, I have: > > In [24]: b=np.arange(1,11) > > In [25]: b > Out[25]: array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]) > > In [26]: b[slice(1)] > Out[26]: array([1]) > > In [27]: b[slice(4)] > Out[27]: array([1, 2, 3, 4]) > > In [28]: b[slice(None,4)] > Out[28]: array([1, 2, 3, 4]) > > so slice(4) is actually slice(None,4), how can I exactly want retrieve > a[4] using slice object? > > thanks again! > > Chao > > Tricky question. Note the difference between a[4] and a[4:5] The first returns a scalar, while the second returns an array. The first, though, is not a slice, just an integer. Also, note that the arguments for slice() behaves very similar to the arguments for range() (with some exceptions/differences). Cheers! Ben Root > 2012/7/12 Benjamin Root > >> >> >> On Thu, Jul 12, 2012 at 3:38 PM, Chao YUE wrote: >> >>> Dear all, >>> >>> I want to create a function and I would like one of the arguments of the >>> function to determine what slicing of numpy array I want to use. >>> a simple example: >>> >>> a=np.arange(100).reshape(10,10) >>> >>> suppose I want to have a imaging function to show image of part of this >>> data: >>> >>> def show_part_of_data(m,n): >>> plt.imshow(a[m,n]) >>> >>> like I can give m=3:5, n=2:7, when I call function >>> show_part_of_data(3:5,2:7), this means I try to do plt.imshow(a[3:5,2:7]). >>> the above example doesn't work in reality. but it illustrates something >>> similar that I desire, that is, I can specify what slicing of >>> number array I want by giving values to function arguments. >>> >>> thanks a lot, >>> >>> Chao >>> >>> >> >> What you want to do is create slice objects. >> >> a[3:5] >> >> is equivalent to >> >> sl = slice(3, 5) >> a[sl] >> >> >> and >> >> a[3:5, 5:14] >> >> is equivalent to >> >> sl = (slice(3, 5), slice(5, 14)) >> a[sl] >> >> Furthermore, notation such as "::-1" is equivalent to slice(None, None, >> -1) >> >> I hope this helps! 
>> Ben Root >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > > *********************************************************************************** > Chao YUE > Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) > UMR 1572 CEA-CNRS-UVSQ > Batiment 712 - Pe 119 > 91191 GIF Sur YVETTE Cedex > Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 > > ************************************************************************************ > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chaoyuejoy at gmail.com Thu Jul 12 17:32:12 2012 From: chaoyuejoy at gmail.com (Chao YUE) Date: Thu, 12 Jul 2012 23:32:12 +0200 Subject: [Numpy-discussion] use slicing as argument values? In-Reply-To: References: Message-ID: Thanks all for the discussion. Actually I am trying to use something like numpy ndarray indexing in the function. Like when I call: func(a,'1:3,:,2:4'), it knows I want to retrieve a[1:3,:,2:4], and func(a,'1:3,:,4') for a[1:3,:,4] ect. I am very close now. #so this function changes the string to list of slice objects. def convert_string_to_slice(slice_string): """ provide slice_string as '2:3,:', it will return [slice(2, 3, None), slice(None, None, None)] """ slice_list=[] split_slice_string_list=slice_string.split(',') for sub_slice_string in split_slice_string_list: split_sub=sub_slice_string.split(':') if len(split_sub)==1: sub_slice=slice(int(split_sub[0])) else: if split_sub[0]=='': sub1=None else: sub1=int(split_sub[0]) if split_sub[1]=='': sub2=None else: sub2=int(split_sub[1]) sub_slice=slice(sub1,sub2) slice_list.append(sub_slice) return slice_list In [119]: a=np.arange(3*4*5).reshape(3,4,5) for this it works fine. In [120]: convert_string_to_slice('1:3,:,2:4') Out[120]: [slice(1, 3, None), slice(None, None, None), slice(2, 4, None)] In [121]: a[slice(1, 3, None), slice(None, None, None), slice(2, 4, None)]==a[1:3,:,2:4] Out[121]: array([[[ True, True], [ True, True], [ True, True], [ True, True]], [[ True, True], [ True, True], [ True, True], [ True, True]]], dtype=bool) And problems happens when I want to retrieve a single number along a given dimension: because it treats 1:3,:,4 as 1:3,:,:4, as shown below: In [122]: convert_string_to_slice('1:3,:,4') Out[122]: [slice(1, 3, None), slice(None, None, None), slice(None, 4, None)] In [123]: a[1:3,:,4] Out[123]: array([[24, 29, 34, 39], [44, 49, 54, 59]]) In [124]: a[slice(1, 3, None), slice(None, None, None), slice(None, 4, None)] Out[124]: array([[[20, 21, 22, 23], [25, 26, 27, 28], [30, 31, 32, 33], [35, 36, 37, 38]], [[40, 41, 42, 43], [45, 46, 47, 48], [50, 51, 52, 53], [55, 56, 57, 58]]]) Then I have a function: #this function retrieves data from ndarray a by specifying slice_string: def retrieve_data(a,slice_string): slice_list=convert_string_to_slice(slice_string) return a[*slice_list] In the list line of the fuction "retrieve_data" I have problem, I get an invalid syntax error. return a[*slice_list] ^ SyntaxError: invalid syntax I hope it's not too long, please comment as you like. Thanks a lot!!!! Chao 2012/7/12 Benjamin Root > > On Thu, Jul 12, 2012 at 4:46 PM, Chao YUE wrote: > >> Hi Ben, >> >> it helps a lot. 
I am nearly finishing a function in a way I think >> pythonic. >> Just one more question, I have: >> >> In [24]: b=np.arange(1,11) >> >> In [25]: b >> Out[25]: array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]) >> >> In [26]: b[slice(1)] >> Out[26]: array([1]) >> >> In [27]: b[slice(4)] >> Out[27]: array([1, 2, 3, 4]) >> >> In [28]: b[slice(None,4)] >> Out[28]: array([1, 2, 3, 4]) >> >> so slice(4) is actually slice(None,4), how can I exactly want retrieve >> a[4] using slice object? >> >> thanks again! >> >> Chao >> >> > Tricky question. Note the difference between > > a[4] > > and > > a[4:5] > > The first returns a scalar, while the second returns an array. The first, > though, is not a slice, just an integer. > > Also, note that the arguments for slice() behaves very similar to the > arguments for range() (with some exceptions/differences). > > Cheers! > Ben Root > > > >> 2012/7/12 Benjamin Root >> >>> >>> >>> On Thu, Jul 12, 2012 at 3:38 PM, Chao YUE wrote: >>> >>>> Dear all, >>>> >>>> I want to create a function and I would like one of the arguments of >>>> the function to determine what slicing of numpy array I want to use. >>>> a simple example: >>>> >>>> a=np.arange(100).reshape(10,10) >>>> >>>> suppose I want to have a imaging function to show image of part of this >>>> data: >>>> >>>> def show_part_of_data(m,n): >>>> plt.imshow(a[m,n]) >>>> >>>> like I can give m=3:5, n=2:7, when I call function >>>> show_part_of_data(3:5,2:7), this means I try to do plt.imshow(a[3:5,2:7]). >>>> the above example doesn't work in reality. but it illustrates something >>>> similar that I desire, that is, I can specify what slicing of >>>> number array I want by giving values to function arguments. >>>> >>>> thanks a lot, >>>> >>>> Chao >>>> >>>> >>> >>> What you want to do is create slice objects. >>> >>> a[3:5] >>> >>> is equivalent to >>> >>> sl = slice(3, 5) >>> a[sl] >>> >>> >>> and >>> >>> a[3:5, 5:14] >>> >>> is equivalent to >>> >>> sl = (slice(3, 5), slice(5, 14)) >>> a[sl] >>> >>> Furthermore, notation such as "::-1" is equivalent to slice(None, None, >>> -1) >>> >>> I hope this helps! >>> Ben Root >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> >> -- >> >> *********************************************************************************** >> Chao YUE >> Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) >> UMR 1572 CEA-CNRS-UVSQ >> Batiment 712 - Pe 119 >> 91191 GIF Sur YVETTE Cedex >> Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 >> >> ************************************************************************************ >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ben.root at ou.edu Thu Jul 12 17:52:18 2012 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 12 Jul 2012 17:52:18 -0400 Subject: [Numpy-discussion] use slicing as argument values? In-Reply-To: References: Message-ID: On Thursday, July 12, 2012, Chao YUE wrote: > Thanks all for the discussion. Actually I am trying to use something like > numpy ndarray indexing in the function. Like when I call: > > func(a,'1:3,:,2:4'), it knows I want to retrieve a[1:3,:,2:4], and > func(a,'1:3,:,4') for a[1:3,:,4] ect. > I am very close now. > > #so this function changes the string to list of slice objects. > def convert_string_to_slice(slice_string): > """ > provide slice_string as '2:3,:', it will return [slice(2, 3, None), > slice(None, None, None)] > """ > slice_list=[] > split_slice_string_list=slice_string.split(',') > for sub_slice_string in split_slice_string_list: > split_sub=sub_slice_string.split(':') > if len(split_sub)==1: > sub_slice=slice(int(split_sub[0])) > else: > if split_sub[0]=='': > sub1=None > else: > sub1=int(split_sub[0]) > if split_sub[1]=='': > sub2=None > else: > sub2=int(split_sub[1]) > sub_slice=slice(sub1,sub2) > slice_list.append(sub_slice) > return slice_list > > In [119]: a=np.arange(3*4*5).reshape(3,4,5) > > for this it works fine. > In [120]: convert_string_to_slice('1:3,:,2:4') > Out[120]: [slice(1, 3, None), slice(None, None, None), slice(2, 4, None)] > > In [121]: a[slice(1, 3, None), slice(None, None, None), slice(2, 4, > None)]==a[1:3,:,2:4] > Out[121]: > array([[[ True, True], > [ True, True], > [ True, True], > [ True, True]], > > [[ True, True], > [ True, True], > [ True, True], > [ True, True]]], dtype=bool) > > And problems happens when I want to retrieve a single number along a given > dimension: > because it treats 1:3,:,4 as 1:3,:,:4, as shown below: > > In [122]: convert_string_to_slice('1:3,:,4') > Out[122]: [slice(1, 3, None), slice(None, None, None), slice(None, 4, > None)] > > In [123]: a[1:3,:,4] > Out[123]: > array([[24, 29, 34, 39], > [44, 49, 54, 59]]) > > In [124]: a[slice(1, 3, None), slice(None, None, None), slice(None, 4, > None)] > Out[124]: > array([[[20, 21, 22, 23], > [25, 26, 27, 28], > [30, 31, 32, 33], > [35, 36, 37, 38]], > > [[40, 41, 42, 43], > [45, 46, 47, 48], > [50, 51, 52, 53], > [55, 56, 57, 58]]]) > > > Then I have a function: > > #this function retrieves data from ndarray a by specifying slice_string: > def retrieve_data(a,slice_string): > slice_list=convert_string_to_slice(slice_string) > return a[*slice_list] > > In the list line of the fuction "retrieve_data" I have problem, I get an > invalid syntax error. > > return a[*slice_list] > ^ > SyntaxError: invalid syntax > > I hope it's not too long, please comment as you like. Thanks a lot!!!! > > Chao I won't comment on the wisdom of your approach, but for you very last part, don't try unpacking the slice list. Also, I think it has to be a tuple, but I could be wrong on that. Ben Root > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Fri Jul 13 06:20:44 2012 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 13 Jul 2012 10:20:44 +0000 (UTC) Subject: [Numpy-discussion] use slicing as argument values? 
References: Message-ID: Benjamin Root ou.edu> writes: [clip] > a[sl] and a[3:5, 5:14] is equivalent to > sl = (slice(3, 5), slice(5, 14)) > a[sl] [clip] which is also equivalent to sl = np.s_[3:5, 5:14] From daniele at grinta.net Fri Jul 13 06:58:22 2012 From: daniele at grinta.net (Daniele Nicolodi) Date: Fri, 13 Jul 2012 12:58:22 +0200 Subject: [Numpy-discussion] use slicing as argument values? In-Reply-To: References: Message-ID: <4FFFFF4E.7090003@grinta.net> On 12/07/2012 23:32, Chao YUE wrote: > Thanks all for the discussion. Actually I am trying to use something > like numpy ndarray indexing in the function. Like when I call: > > func(a,'1:3,:,2:4'), it knows I want to retrieve a[1:3,:,2:4], and > func(a,'1:3,:,4') for a[1:3,:,4] ect. > I am very close now. I don't see the advantage of this approach over directly using the sliced array as an argument of your function, as in func(a[1:3,:,4]). Can you elaborate more why you are going through this route? Cheers, Daniele From chaoyuejoy at gmail.com Fri Jul 13 08:07:54 2012 From: chaoyuejoy at gmail.com (Chao YUE) Date: Fri, 13 Jul 2012 14:07:54 +0200 Subject: [Numpy-discussion] use slicing as argument values? In-Reply-To: <4FFFFF4E.7090003@grinta.net> References: <4FFFFF4E.7090003@grinta.net> Message-ID: Thanks Daniele. I am writing a small plotting function that can receive the index range as argument value. like I have variables var1, var2, var3, var4, var5 which have exactly the same dimensions. def plot_eg(index_range): #here I need the function above which can use the index_range to retrieve data from variables plot(func(var1,index_range))) plot(func(var2,index_range)) plot(func(var3,index_range)) plot(func(var4,index_range)) plot(func(var5,index_range)) actually I can also put the [var1,var2,var3,var4,var5] as arguments in the plot_eg function so that I can pick any variables I want to plot as long as they have the same dimension. otherwise, I have to change the index_range for every variable. cheers, Chao 2012/7/13 Daniele Nicolodi > On 12/07/2012 23:32, Chao YUE wrote: > > Thanks all for the discussion. Actually I am trying to use something > > like numpy ndarray indexing in the function. Like when I call: > > > > func(a,'1:3,:,2:4'), it knows I want to retrieve a[1:3,:,2:4], and > > func(a,'1:3,:,4') for a[1:3,:,4] ect. > > I am very close now. > > I don't see the advantage of this approach over directly using the > sliced array as an argument of your function, as in func(a[1:3,:,4]). > > Can you elaborate more why you are going through this route? > > Cheers, > Daniele > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... 
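Putting the suggestions in this thread together, here is a minimal sketch of the kind of helper Chao describes, with the index passed around as a tuple of slice objects instead of a string (retrieve_data follows the name used above; the other names are only illustrative):

import numpy as np

a = np.arange(3 * 4 * 5).reshape(3, 4, 5)

def retrieve_data(arr, index):
    # numpy wants a single tuple as the index, so turn a list of
    # slices/integers into a tuple rather than unpacking it in the brackets
    return arr[tuple(index)]

# np.s_ (np.index_exp works too) builds the tuple of slice objects,
# and a plain integer picks out a single position along that axis
idx = np.s_[1:3, :, 4]
print(np.all(retrieve_data(a, idx) == a[1:3, :, 4]))    # True

With this, plot_eg(index_range) needs no string parsing at all: the caller passes np.s_[3:5, 2:7] and every variable is indexed with the same tuple.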
URL: From mail at paul.kishimoto.name Fri Jul 13 11:15:11 2012 From: mail at paul.kishimoto.name (Paul Natsuo Kishimoto) Date: Fri, 13 Jul 2012 11:15:11 -0400 Subject: [Numpy-discussion] Proposed change in genfromtxt(..., comments='#', names=True) behaviour Message-ID: <1342192511.19936.22.camel@esdceeprjpstudent1.mit.edu> Hello everyone, I am a longtime NumPy user, and I just filed my first contribution to the code as pull request to fix what I felt was a bug in the behaviour of genfromtxt() https://github.com/numpy/numpy/pull/351 It turns out this alters existing behaviour that some people may depend on, so I was encouraged to raise the issue on this list to see what the consensus was. This behaviour happens in the specific situation where: * Comments are used in the file (the default comment character is '#', which I'll use here), AND * The kwarg names=True is given. In this case, genfromtxt() is supposed to read an initial row containing the names of the columns and return an array with a structured dtype. Currently, these options work with a file like (Example #1): # gender age weight M 21 72.100000 F 35 58.330000 M 33 21.99 ?but NOT with a file like (Example #2): # here is a general file comment # it is spread over multiple lines gender age weight M 21 72.100000 F 35 58.330000 M 33 21.99 ?genfromtxt() believes the column names are 'here', 'is', 'a', etc., and thinks all of the columns are strings because 'gender', 'age' and 'weight' are not numbers. This is because genfromtxt() (after skipping a number of lines as specified in the optional kwarg skip_header) will use the *first* line it encounters to produce column names. If that line contains a comment character, genfromtxt() discards everything *up to and including* the comment character, and tries to use the content *after* the comment character as headers (Example 3): gender age weight # wrong column names M 21 72.100000 F 35 58.330000 M 33 21.99 ?the resulting column names are 'wrong', 'column' and 'names'. My proposed change was that, if the first (or any subsequent) line contains a comment character, it should be treated as an *actual comment*, and discarded along with anything that follows it on the line. In Example 2, the result would be that the first two lines appear empty (no text before '#'), and the third line ("gender age weight") is used for column names. In Example 3, the result would be that "gender age weight" is used for column names while "# wrong column names" is ignored. BUT! In Example 1, the result would be that the first line appears empty, and "M 21 72.100000" are used for column names. In other words, this change would do away with the previous behaviour where the very first commented line was (magically?) treated not as a comment but instead as column headers. This might break some existing code. On the positive side, it would allow the user to be more liberal with the format of input files (Example 4): # here is a general file comment # the columns in this table are gender age weight # here is a comment on the header line # following this line are the data M 21 72.100000 F 35 58.330000 # here is a comment on a data line M 33 21.99 I feel that this is a better/more flexible behaviour for genfromtxt(), but?as stated?I am interested in your thoughts. Cheers, -- Paul Natsuo Kishimoto SM candidate, Technology & Policy Program (2012) Research assistant, http://globalchange.mit.edu https://paul.kishimoto.name +1 617 302 6105 -------------- next part -------------- A non-text attachment was scrubbed... 
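For anyone who wants to try the current behaviour before weighing in, Paul's Example #1 can be reproduced in a few lines (a sketch; io.StringIO only stands in for a real file):

import io
import numpy as np

data = io.StringIO("# gender age weight\n"
                   "M 21 72.100000\n"
                   "F 35 58.330000\n"
                   "M 33 21.99\n")

# with names=True the first line supplies the field names, even though it
# starts with the comment character
arr = np.genfromtxt(data, dtype=None, names=True)
print(arr.dtype.names)    # ('gender', 'age', 'weight')

Moving the real header below a block of general comments, as in Example #2, is what currently goes wrong.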
Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part URL: From aldcroft at head.cfa.harvard.edu Fri Jul 13 12:13:38 2012 From: aldcroft at head.cfa.harvard.edu (Tom Aldcroft) Date: Fri, 13 Jul 2012 12:13:38 -0400 Subject: [Numpy-discussion] Proposed change in genfromtxt(..., comments='#', names=True) behaviour In-Reply-To: <1342192511.19936.22.camel@esdceeprjpstudent1.mit.edu> References: <1342192511.19936.22.camel@esdceeprjpstudent1.mit.edu> Message-ID: On Fri, Jul 13, 2012 at 11:15 AM, Paul Natsuo Kishimoto wrote: > Hello everyone, > > I am a longtime NumPy user, and I just filed my first contribution to > the code as pull request to fix what I felt was a bug in the behaviour > of genfromtxt() https://github.com/numpy/numpy/pull/351 > It turns out this alters existing behaviour that some people may depend > on, so I was encouraged to raise the issue on this list to see what the > consensus was. > > This behaviour happens in the specific situation where: > * Comments are used in the file (the default comment character is > '#', which I'll use here), AND > * The kwarg names=True is given. In this case, genfromtxt() is > supposed to read an initial row containing the names of the > columns and return an array with a structured dtype. > > Currently, these options work with a file like (Example #1): > > # gender age weight > M 21 72.100000 > F 35 58.330000 > M 33 21.99 > > ?but NOT with a file like (Example #2): > > # here is a general file comment > # it is spread over multiple lines > gender age weight > M 21 72.100000 > F 35 58.330000 > M 33 21.99 > > ?genfromtxt() believes the column names are 'here', 'is', 'a', etc., and > thinks all of the columns are strings because 'gender', 'age' and > 'weight' are not numbers. > > This is because genfromtxt() (after skipping a number of lines as > specified in the optional kwarg skip_header) will use the *first* line > it encounters to produce column names. If that line contains a comment > character, genfromtxt() discards everything *up to and including* the > comment character, and tries to use the content *after* the comment > character as headers (Example 3): > > gender age weight # wrong column names > M 21 72.100000 > F 35 58.330000 > M 33 21.99 > > ?the resulting column names are 'wrong', 'column' and 'names'. > > My proposed change was that, if the first (or any subsequent) line > contains a comment character, it should be treated as an *actual > comment*, and discarded along with anything that follows it on the line. > > In Example 2, the result would be that the first two lines appear empty > (no text before '#'), and the third line ("gender age weight") is used > for column names. > > In Example 3, the result would be that "gender age weight" is used for > column names while "# wrong column names" is ignored. > > BUT! > > In Example 1, the result would be that the first line appears empty, > and "M 21 72.100000" are used for column names. > > In other words, this change would do away with the previous behaviour > where the very first commented line was (magically?) treated not as a > comment but instead as column headers. This might break some existing > code. 
On the positive side, it would allow the user to be more liberal > with the format of input files (Example 4): > > # here is a general file comment > # the columns in this table are > gender age weight # here is a comment on the header line > # following this line are the data > M 21 72.100000 > F 35 58.330000 # here is a comment on a data line > M 33 21.99 > > I feel that this is a better/more flexible behaviour for genfromtxt(), > but?as stated?I am interested in your thoughts. > > Cheers, > -- > Paul Natsuo Kishimoto > > SM candidate, Technology & Policy Program (2012) > Research assistant, http://globalchange.mit.edu > https://paul.kishimoto.name +1 617 302 6105 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Hi Paul, At least in astronomy tabular files with the column definitions in the first commented line are reasonably common. This is driven in part by wide use of legacy packages like supermongo etc that don't have intelligent table readers, so users document the column names as a comment line. I think making this break might be unfortunate for users in astronomy. Dealing with commented header definitions is annoying. Not that it matters specifically for your genfromtext() proposal, but in the asciitable reader this case is handled with a particular reader class that expects the first comment line to contain the column definitions: http://cxc.harvard.edu/contrib/asciitable/#asciitable.CommentedHeader Cheers, Tom From chris.barker at noaa.gov Fri Jul 13 12:18:53 2012 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 13 Jul 2012 09:18:53 -0700 Subject: [Numpy-discussion] use slicing as argument values? In-Reply-To: References: Message-ID: On Thu, Jul 12, 2012 at 2:32 PM, Chao YUE wrote: > numpy ndarray indexing in the function. Like when I call: > > func(a,'1:3,:,2:4'), it knows I want to retrieve a[1:3,:,2:4], and > func(a,'1:3,:,4') for a[1:3,:,4] ect. why do the string packing/unpacking? why not use an interface much like the slice() and range() functions? func(a, ( (start, stop, step),(start, stop, step),(sart, stop, step) )) or, I agree, jsut pass in the sliced array: func( a[start:stop:step, start:stop:step, start_stop:step] ) Will the rank of a always be 3? Do you ned to support "step" that could simplify it a bit. -Chris > I am very close now. > > #so this function changes the string to list of slice objects. > def convert_string_to_slice(slice_string): > """ > provide slice_string as '2:3,:', it will return [slice(2, 3, None), > slice(None, None, None)] > """ > slice_list=[] > split_slice_string_list=slice_string.split(',') > for sub_slice_string in split_slice_string_list: > split_sub=sub_slice_string.split(':') > if len(split_sub)==1: > sub_slice=slice(int(split_sub[0])) > else: > if split_sub[0]=='': > sub1=None > else: > sub1=int(split_sub[0]) > if split_sub[1]=='': > sub2=None > else: > sub2=int(split_sub[1]) > sub_slice=slice(sub1,sub2) > slice_list.append(sub_slice) > return slice_list > > In [119]: a=np.arange(3*4*5).reshape(3,4,5) > > for this it works fine. 
> In [120]: convert_string_to_slice('1:3,:,2:4') > Out[120]: [slice(1, 3, None), slice(None, None, None), slice(2, 4, None)] > > In [121]: a[slice(1, 3, None), slice(None, None, None), slice(2, 4, > None)]==a[1:3,:,2:4] > Out[121]: > array([[[ True, True], > [ True, True], > [ True, True], > [ True, True]], > > [[ True, True], > [ True, True], > [ True, True], > [ True, True]]], dtype=bool) > > And problems happens when I want to retrieve a single number along a given > dimension: > because it treats 1:3,:,4 as 1:3,:,:4, as shown below: > > In [122]: convert_string_to_slice('1:3,:,4') > Out[122]: [slice(1, 3, None), slice(None, None, None), slice(None, 4, None)] > > In [123]: a[1:3,:,4] > Out[123]: > array([[24, 29, 34, 39], > [44, 49, 54, 59]]) > > In [124]: a[slice(1, 3, None), slice(None, None, None), slice(None, 4, > None)] > Out[124]: > array([[[20, 21, 22, 23], > [25, 26, 27, 28], > [30, 31, 32, 33], > [35, 36, 37, 38]], > > [[40, 41, 42, 43], > [45, 46, 47, 48], > [50, 51, 52, 53], > [55, 56, 57, 58]]]) > > > Then I have a function: > > #this function retrieves data from ndarray a by specifying slice_string: > def retrieve_data(a,slice_string): > slice_list=convert_string_to_slice(slice_string) > return a[*slice_list] > > In the list line of the fuction "retrieve_data" I have problem, I get an > invalid syntax error. > > return a[*slice_list] > ^ > SyntaxError: invalid syntax > > I hope it's not too long, please comment as you like. Thanks a lot!!!! > > Chao > > > 2012/7/12 Benjamin Root >> >> >> On Thu, Jul 12, 2012 at 4:46 PM, Chao YUE wrote: >>> >>> Hi Ben, >>> >>> it helps a lot. I am nearly finishing a function in a way I think >>> pythonic. >>> Just one more question, I have: >>> >>> In [24]: b=np.arange(1,11) >>> >>> In [25]: b >>> Out[25]: array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]) >>> >>> In [26]: b[slice(1)] >>> Out[26]: array([1]) >>> >>> In [27]: b[slice(4)] >>> Out[27]: array([1, 2, 3, 4]) >>> >>> In [28]: b[slice(None,4)] >>> Out[28]: array([1, 2, 3, 4]) >>> >>> so slice(4) is actually slice(None,4), how can I exactly want retrieve >>> a[4] using slice object? >>> >>> thanks again! >>> >>> Chao >>> >> >> Tricky question. Note the difference between >> >> a[4] >> >> and >> >> a[4:5] >> >> The first returns a scalar, while the second returns an array. The first, >> though, is not a slice, just an integer. >> >> Also, note that the arguments for slice() behaves very similar to the >> arguments for range() (with some exceptions/differences). >> >> Cheers! >> Ben Root >> >> >>> >>> 2012/7/12 Benjamin Root >>>> >>>> >>>> >>>> On Thu, Jul 12, 2012 at 3:38 PM, Chao YUE wrote: >>>>> >>>>> Dear all, >>>>> >>>>> I want to create a function and I would like one of the arguments of >>>>> the function to determine what slicing of numpy array I want to use. >>>>> a simple example: >>>>> >>>>> a=np.arange(100).reshape(10,10) >>>>> >>>>> suppose I want to have a imaging function to show image of part of this >>>>> data: >>>>> >>>>> def show_part_of_data(m,n): >>>>> plt.imshow(a[m,n]) >>>>> >>>>> like I can give m=3:5, n=2:7, when I call function >>>>> show_part_of_data(3:5,2:7), this means I try to do plt.imshow(a[3:5,2:7]). >>>>> the above example doesn't work in reality. but it illustrates something >>>>> similar that I desire, that is, I can specify what slicing of >>>>> number array I want by giving values to function arguments. >>>>> >>>>> thanks a lot, >>>>> >>>>> Chao >>>>> >>>> >>>> >>>> What you want to do is create slice objects. 
>>>> >>>> a[3:5] >>>> >>>> is equivalent to >>>> >>>> sl = slice(3, 5) >>>> a[sl] >>>> >>>> >>>> and >>>> >>>> a[3:5, 5:14] >>>> >>>> is equivalent to >>>> >>>> sl = (slice(3, 5), slice(5, 14)) >>>> a[sl] >>>> >>>> Furthermore, notation such as "::-1" is equivalent to slice(None, None, >>>> -1) >>>> >>>> I hope this helps! >>>> Ben Root >>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>> >>> >>> >>> -- >>> >>> *********************************************************************************** >>> Chao YUE >>> Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) >>> UMR 1572 CEA-CNRS-UVSQ >>> Batiment 712 - Pe 119 >>> 91191 GIF Sur YVETTE Cedex >>> Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 >>> >>> ************************************************************************************ >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > > -- > *********************************************************************************** > Chao YUE > Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) > UMR 1572 CEA-CNRS-UVSQ > Batiment 712 - Pe 119 > 91191 GIF Sur YVETTE Cedex > Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 > ************************************************************************************ > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From robert.kern at gmail.com Fri Jul 13 12:24:24 2012 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 13 Jul 2012 17:24:24 +0100 Subject: [Numpy-discussion] use slicing as argument values? In-Reply-To: References: Message-ID: On Thu, Jul 12, 2012 at 10:32 PM, Chao YUE wrote: > Thanks all for the discussion. Actually I am trying to use something like > numpy ndarray indexing in the function. Like when I call: > > func(a,'1:3,:,2:4'), it knows I want to retrieve a[1:3,:,2:4], and > func(a,'1:3,:,4') for a[1:3,:,4] ect. > I am very close now. [~] |1> from numpy import index_exp [~] |2> index_exp[1:3,:,2:4] (slice(1, 3, None), slice(None, None, None), slice(2, 4, None)) -- Robert Kern From n.nikandish at gmail.com Fri Jul 13 13:22:25 2012 From: n.nikandish at gmail.com (Naser Nikandish) Date: Fri, 13 Jul 2012 13:22:25 -0400 Subject: [Numpy-discussion] Istalling Numpy and Scipy on preinstalled Python 2.6 on Mac Message-ID: Hi, I need to install numpy and scipy on preinstalled Python 2.6 on my Mac Lion. Is there anyway to do it? I am aware that Lion OS comes with Python 2.7 as well. But I have to install it on Python 2.6. I really appreciate any help. 
Cheers From mail at paul.kishimoto.name Fri Jul 13 13:29:31 2012 From: mail at paul.kishimoto.name (Paul Natsuo Kishimoto) Date: Fri, 13 Jul 2012 13:29:31 -0400 Subject: [Numpy-discussion] Proposed change in genfromtxt(..., comments='#', names=True) behaviour In-Reply-To: References: <1342192511.19936.22.camel@esdceeprjpstudent1.mit.edu> Message-ID: <1342200571.19936.32.camel@esdceeprjpstudent1.mit.edu> On Fri, 2012-07-13 at 12:13 -0400, Tom Aldcroft wrote: > On Fri, Jul 13, 2012 at 11:15 AM, Paul Natsuo Kishimoto > wrote: > > Hello everyone, > > > > I am a longtime NumPy user, and I just filed my first contribution to > > the code as pull request to fix what I felt was a bug in the behaviour > > of genfromtxt() https://github.com/numpy/numpy/pull/351 > > It turns out this alters existing behaviour that some people may depend > > on, so I was encouraged to raise the issue on this list to see what the > > consensus was. > > > > This behaviour happens in the specific situation where: > > * Comments are used in the file (the default comment character is > > '#', which I'll use here), AND > > * The kwarg names=True is given. In this case, genfromtxt() is > > supposed to read an initial row containing the names of the > > columns and return an array with a structured dtype. > > > > Currently, these options work with a file like (Example #1): > > > > # gender age weight > > M 21 72.100000 > > F 35 58.330000 > > M 33 21.99 > > > > ?but NOT with a file like (Example #2): > > > > # here is a general file comment > > # it is spread over multiple lines > > gender age weight > > M 21 72.100000 > > F 35 58.330000 > > M 33 21.99 > > > > ?genfromtxt() believes the column names are 'here', 'is', 'a', etc., and > > thinks all of the columns are strings because 'gender', 'age' and > > 'weight' are not numbers. > > > > This is because genfromtxt() (after skipping a number of lines as > > specified in the optional kwarg skip_header) will use the *first* line > > it encounters to produce column names. If that line contains a comment > > character, genfromtxt() discards everything *up to and including* the > > comment character, and tries to use the content *after* the comment > > character as headers (Example 3): > > > > gender age weight # wrong column names > > M 21 72.100000 > > F 35 58.330000 > > M 33 21.99 > > > > ?the resulting column names are 'wrong', 'column' and 'names'. > > > > My proposed change was that, if the first (or any subsequent) line > > contains a comment character, it should be treated as an *actual > > comment*, and discarded along with anything that follows it on the line. > > > > In Example 2, the result would be that the first two lines appear empty > > (no text before '#'), and the third line ("gender age weight") is used > > for column names. > > > > In Example 3, the result would be that "gender age weight" is used for > > column names while "# wrong column names" is ignored. > > > > BUT! > > > > In Example 1, the result would be that the first line appears empty, > > and "M 21 72.100000" are used for column names. > > > > In other words, this change would do away with the previous behaviour > > where the very first commented line was (magically?) treated not as a > > comment but instead as column headers. This might break some existing > > code. 
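For reference, a minimal sketch of the existing (Example #1) behaviour just described; this assumes a NumPy of this era and Python 2's StringIO, and is not part of the proposed patch:

    import numpy as np
    from StringIO import StringIO

    example1 = StringIO("# gender age weight\n"
                        "M 21 72.100000\n"
                        "F 35 58.330000\n"
                        "M 33 21.99\n")
    data = np.genfromtxt(example1, comments='#', names=True, dtype=None)
    # the commented first line supplies the field names:
    # data.dtype.names == ('gender', 'age', 'weight')

Under the proposed change, the same call would instead take the names from the first uncommented line.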
On the positive side, it would allow the user to be more liberal
> > with the format of input files (Example 4):
> >
> > # here is a general file comment
> > # the columns in this table are
> > gender age weight # here is a comment on the header line
> > # following this line are the data
> > M 21 72.100000
> > F 35 58.330000 # here is a comment on a data line
> > M 33 21.99
> >
> > I feel that this is a better/more flexible behaviour for genfromtxt(),
> > but, as stated, I am interested in your thoughts.
> >
> > Cheers,
> > --
> > Paul Natsuo Kishimoto
> >
> > SM candidate, Technology & Policy Program (2012)
> > Research assistant, http://globalchange.mit.edu
> > https://paul.kishimoto.name +1 617 302 6105
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
>
> Hi Paul,
>
> At least in astronomy, tabular files with the column definitions in the
> first commented line are reasonably common. This is driven in part by
> wide use of legacy packages like supermongo etc that don't have
> intelligent table readers, so users document the column names as a
> comment line. I think making this break might be unfortunate for
> users in astronomy.
>
> Dealing with commented header definitions is annoying. Not that it
> matters specifically for your genfromtxt() proposal, but in the
> asciitable reader this case is handled with a particular reader class
> that expects the first comment line to contain the column definitions:
>
> http://cxc.harvard.edu/contrib/asciitable/#asciitable.CommentedHeader
>
> Cheers,
> Tom

Tom,

Thanks for this information. In thinking about how people would work
around this, I figured it would be fairly easy to discard a comment
character that occurred as the very first character in a file, e.g.:

    raw = StringIO(open('example.txt').read()[1:])
    data = numpy.genfromtxt(raw, comments='#', names=True)

...but I realize that making this change in many places would still be an
annoyance.

I should perhaps also add that my view of 'proper' table formats is
partly influenced by another plotting package, namely pgfplots for LaTeX
(http://pgfplots.sourceforge.net/ ,
http://pgfplots.sourceforge.net/gallery.html) which uses uncommented
headers. To the extent NumPy users are also LaTeX users, similar
semantics could be more friendly.

Looking forward to more input from other users,
--
Paul Natsuo Kishimoto

SM candidate, Technology & Policy Program (2012)
Research assistant, http://globalchange.mit.edu
https://paul.kishimoto.name +1 617 302 6105
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: This is a digitally signed message part
URL: 

From chaoyuejoy at gmail.com Fri Jul 13 17:29:30 2012
From: chaoyuejoy at gmail.com (Chao YUE)
Date: Fri, 13 Jul 2012 23:29:30 +0200
Subject: [Numpy-discussion] use slicing as argument values?
In-Reply-To: 
References: 
Message-ID: 

Thanks Robert. This is exactly what I want. I have a feeling that there
must be something in numpy that can do the job and I didn't know.

Thanks again,

Chao

2012/7/13 Robert Kern

> On Thu, Jul 12, 2012 at 10:32 PM, Chao YUE wrote:
> > Thanks all for the discussion. Actually I am trying to use something like
> > numpy ndarray indexing in the function. Like when I call:
> >
> > func(a,'1:3,:,2:4'), it knows I want to retrieve a[1:3,:,2:4], and
> > func(a,'1:3,:,4') for a[1:3,:,4] ect.
> > I am very close now. > > [~] > |1> from numpy import index_exp > > [~] > |2> index_exp[1:3,:,2:4] > (slice(1, 3, None), slice(None, None, None), slice(2, 4, None)) > > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Fri Jul 13 17:30:50 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 13 Jul 2012 14:30:50 -0700 Subject: [Numpy-discussion] use slicing as argument values? In-Reply-To: References: Message-ID: On Fri, Jul 13, 2012 at 9:24 AM, Robert Kern wrote: > On Thu, Jul 12, 2012 at 10:32 PM, Chao YUE wrote: >> Thanks all for the discussion. Actually I am trying to use something like >> numpy ndarray indexing in the function. Like when I call: >> >> func(a,'1:3,:,2:4'), it knows I want to retrieve a[1:3,:,2:4], and >> func(a,'1:3,:,4') for a[1:3,:,4] ect. >> I am very close now. > > [~] > |1> from numpy import index_exp > > [~] > |2> index_exp[1:3,:,2:4] > (slice(1, 3, None), slice(None, None, None), slice(2, 4, None)) Nice - thanks for the pointer, Matthew From tmp50 at ukr.net Sat Jul 14 07:16:53 2012 From: tmp50 at ukr.net (Dmitrey) Date: Sat, 14 Jul 2012 14:16:53 +0300 Subject: [Numpy-discussion] New Python tool for searching maximum stable set of a graph Message-ID: <65118.1342264613.18046577829960613888@ffe5.ukr.net> Hi all, In the OpenOpt software (BSD-licensed, http://openopt.org ) we have implemented new class - STAB - searching for maximum stable set of a graph. networkx graphs are used as input arguments. Unlike networkx maximum_independent_set() we focus on searching for exact solution (this is NP-Hard problem). interalg or OpenOpt MILP solvers are used, some GUI features and stop criterion (e.g. maxTime, maxCPUTime, fEnough) can be used. Optional arguments are includedNodes and excludedNodes - nodes that have to be present/absent in solution. See http://openopt.org/STAB for details. Future plans (probably very long-term although) include TSP and some other graph problems. ------------------------- Regards, Dmitrey. -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Sat Jul 14 18:31:35 2012 From: travis at continuum.io (Travis Oliphant) Date: Sat, 14 Jul 2012 17:31:35 -0500 Subject: [Numpy-discussion] Extracting sub-fields from an array as a view (PR 350) Message-ID: <38C3231B-F771-4002-83D5-6F1C1D94291B@gmail.com> In https://github.com/numpy/numpy/pull/350/files , javius provides a patch to allow field extraction from a structured array to return a view instead of a copy. Generally, this is consistent with the desire to have NumPy return views whenever it can. The same idea underlies the change to the diagonal method. Suppose 'myarr' is a structured array with fields ['lat', 'long', 'meas1', 'meas2', 'meas3', 'meas4']. Currently, myarr[['lat', 'long', 'mesa3']] will return a copy of the data in the underlying array. 
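As a concrete illustration of the current copy semantics (a minimal sketch with made-up data, reflecting NumPy's behaviour at the time rather than anything in the patch):

    import numpy as np

    myarr = np.zeros(3, dtype=[('lat', 'f8'), ('long', 'f8'),
                               ('meas1', 'f8'), ('meas2', 'f8'),
                               ('meas3', 'f8'), ('meas4', 'f8')])

    sub = myarr[['lat', 'long', 'meas3']]   # currently a copy
    sub['lat'] = 1.0                        # modifies only the copy
    # myarr['lat'] is still all zeros, because sub is an independent copy.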
The proposal is to have this return a view, but do it in a two-stage approach so that a first version returns a copy with the WARN_ON_WRITE flag set introduced in NumPy 1.7. A later version will remove the flag (and the copy). What are thoughts on this proposal and which version of NumPy it should go in? -Travis -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Sat Jul 14 18:45:30 2012 From: travis at continuum.io (Travis Oliphant) Date: Sat, 14 Jul 2012 17:45:30 -0500 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 Message-ID: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> Hey all, We are nearing a code-freeze for NumPy 1.7. Are there any last-minute changes people are wanting to push into NumPy 1.7? We should discuss them as soon as possible. I'm proposing a code-freeze at midnight UTC on July 18th (7:00pm CDT on July 17th). This will allow the creation of beta releases of NumPy on the 18th of July. This is a few days later than originally hoped for --- largely due to unexpected travel schedules of Ondrej and I, but it does give people a few more days to get patches in. Of course, we will be able to apply bug-fixes to the 1.7.x branch once the tag is made. If you have a pull-request that is not yet applied and would like to discuss it for inclusion, the time to do it is now. Best, -Travis From nouiz at nouiz.org Sat Jul 14 19:17:26 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Sat, 14 Jul 2012 19:17:26 -0400 Subject: [Numpy-discussion] Extracting sub-fields from an array as a view (PR 350) In-Reply-To: <38C3231B-F771-4002-83D5-6F1C1D94291B@gmail.com> References: <38C3231B-F771-4002-83D5-6F1C1D94291B@gmail.com> Message-ID: I think it is better that we bundle those change together. As it is done for diagonal, doing it for this case as fine too. Fred On Sat, Jul 14, 2012 at 6:31 PM, Travis Oliphant wrote: > > In https://github.com/numpy/numpy/pull/350/files , > > javius provides a patch to allow field extraction from a structured array to > return a view instead of a copy. Generally, this is consistent with the > desire to have NumPy return views whenever it can. The same idea underlies > the change to the diagonal method. > > Suppose 'myarr' is a structured array with fields ['lat', 'long', 'meas1', > 'meas2', 'meas3', 'meas4']. > > Currently, > > myarr[['lat', 'long', 'mesa3']] will return a copy of the data in the > underlying array. The proposal is to have this return a view, but do it in > a two-stage approach so that a first version returns a copy with the > WARN_ON_WRITE flag set introduced in NumPy 1.7. A later version will > remove the flag (and the copy). > > What are thoughts on this proposal and which version of NumPy it should go > in? > > -Travis > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From nouiz at nouiz.org Sat Jul 14 19:20:59 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Sat, 14 Jul 2012 19:20:59 -0400 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 In-Reply-To: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> Message-ID: A very small PR about documentation: https://github.com/numpy/numpy/pull/332 Fred On Sat, Jul 14, 2012 at 6:45 PM, Travis Oliphant wrote: > > Hey all, > > We are nearing a code-freeze for NumPy 1.7. 
Are there any last-minute changes people are wanting to push into NumPy 1.7? We should discuss them as soon as possible. > > I'm proposing a code-freeze at midnight UTC on July 18th (7:00pm CDT on July 17th). This will allow the creation of beta releases of NumPy on the 18th of July. This is a few days later than originally hoped for --- largely due to unexpected travel schedules of Ondrej and I, but it does give people a few more days to get patches in. Of course, we will be able to apply bug-fixes to the 1.7.x branch once the tag is made. > > If you have a pull-request that is not yet applied and would like to discuss it for inclusion, the time to do it is now. > > Best, > > -Travis > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From scopatz at gmail.com Sat Jul 14 20:17:28 2012 From: scopatz at gmail.com (Anthony Scopatz) Date: Sat, 14 Jul 2012 19:17:28 -0500 Subject: [Numpy-discussion] Extracting sub-fields from an array as a view (PR 350) In-Reply-To: References: <38C3231B-F771-4002-83D5-6F1C1D94291B@gmail.com> Message-ID: +1 for more views. I agree with Fred about bundling the changes together. On Sat, Jul 14, 2012 at 6:17 PM, Fr?d?ric Bastien wrote: > I think it is better that we bundle those change together. As it is > done for diagonal, doing it for this case as fine too. > > Fred > > On Sat, Jul 14, 2012 at 6:31 PM, Travis Oliphant > wrote: > > > > In https://github.com/numpy/numpy/pull/350/files , > > > > javius provides a patch to allow field extraction from a structured > array to > > return a view instead of a copy. Generally, this is consistent with > the > > desire to have NumPy return views whenever it can. The same idea > underlies > > the change to the diagonal method. > > > > Suppose 'myarr' is a structured array with fields ['lat', 'long', > 'meas1', > > 'meas2', 'meas3', 'meas4']. > > > > Currently, > > > > myarr[['lat', 'long', 'mesa3']] will return a copy of the data in the > > underlying array. The proposal is to have this return a view, but do > it in > > a two-stage approach so that a first version returns a copy with the > > WARN_ON_WRITE flag set introduced in NumPy 1.7. A later version will > > remove the flag (and the copy). > > > > What are thoughts on this proposal and which version of NumPy it should > go > > in? > > > > -Travis > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Sat Jul 14 21:02:32 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 14 Jul 2012 18:02:32 -0700 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 In-Reply-To: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> Message-ID: Hi, On Sat, Jul 14, 2012 at 3:45 PM, Travis Oliphant wrote: > > Hey all, > > We are nearing a code-freeze for NumPy 1.7. Are there any last-minute changes people are wanting to push into NumPy 1.7? We should discuss them as soon as possible. > > I'm proposing a code-freeze at midnight UTC on July 18th (7:00pm CDT on July 17th). 
This will allow the creation of beta releases of NumPy on the 18th of July. This is a few days later than originally hoped for --- largely due to unexpected travel schedules of Ondrej and I, but it does give people a few more days to get patches in. Of course, we will be able to apply bug-fixes to the 1.7.x branch once the tag is made. > > If you have a pull-request that is not yet applied and would like to discuss it for inclusion, the time to do it is now. An appeal for (just submitted): https://github.com/numpy/numpy/pull/357 Cheers, Matthew From matthew.brett at gmail.com Sat Jul 14 21:03:06 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 14 Jul 2012 18:03:06 -0700 Subject: [Numpy-discussion] Matrix rank default tolerance - is it too low? In-Reply-To: References: Message-ID: Hi, On Tue, Jun 26, 2012 at 7:29 PM, Matthew Brett wrote: > Hi, > > On Tue, Jun 26, 2012 at 5:04 PM, Charles R Harris > wrote: >> >> >> On Tue, Jun 26, 2012 at 5:46 PM, Matthew Brett >> wrote: >>> >>> Hi, >>> >>> On Tue, Jun 26, 2012 at 4:39 PM, Benjamin Root wrote: >>> > >>> > >>> > On Tuesday, June 26, 2012, Charles R Harris wrote: >>> >> >>> >> >>> >> >>> >> On Tue, Jun 26, 2012 at 3:42 PM, Matthew Brett >>> >> >>> >> wrote: >>> >> >>> >> Hi, >>> >> >>> >> On Mon, Jun 18, 2012 at 3:50 PM, Matthew Brett >>> >> >>> >> wrote: >>> >> > Hi, >>> >> > >>> >> > On Sun, Jun 17, 2012 at 7:22 PM, Charles R Harris >>> >> > wrote: >>> >> >> >>> >> >> >>> >> >> On Sat, Jun 16, 2012 at 2:33 PM, Matthew Brett >>> >> >> >>> >> >> wrote: >>> >> >>> >>> >> >>> Hi, >>> >> >>> >>> >> >>> On Sat, Jun 16, 2012 at 8:03 PM, Matthew Brett >>> >> >>> >>> >> >>> wrote: >>> >> >>> > Hi, >>> >> >>> > >>> >> >>> > On Sat, Jun 16, 2012 at 10:40 AM, Nathaniel Smith >>> >> >>> > wrote: >>> >> >>> >> On Fri, Jun 15, 2012 at 4:10 AM, Charles R Harris >>> >> >>> >> wrote: >>> >> >>> >>> >>> >> >>> >>> >>> >> >>> >>> On Thu, Jun 14, 2012 at 8:06 PM, Matthew Brett >>> One potential problem is that it implies that it will always be the >>> same as any version of matlab's tolerance. What if they change it in >>> a future release? How likely are we to even notice? >>> >>> >>> >> >>> >>> >>> >> >>> >>> wrote: >>> >> >>> >>>> >>> >> >>> >>>> Hi, >>> >> >>> >>>> >>> >> >>> >>>> I noticed that numpy.linalg.matrix_rank sometimes gives full >>> >> >>> >>>> rank >>> >> >>> >>>> for >>> >> >>> >>>> matrices that are numerically rank deficient: >>> >> >>> >>>> >>> >> >>> >>>> If I repeatedly make random matrices, then set the first >>> >> >>> >>>> column >>> >> >>> >>>> to be >>> >> >>> >>>> equal to the sum of the second and third columns: >>> >> >>> >>>> >>> >> >>> >>>> def make_deficient(): >>> >> >>> >>>> X = np.random.normal(size=(40, 10)) >>> >> >>> >>>> deficient_X = X.copy() >>> >> >>> >>>> deficient_X[:, 0] = deficient_X[:, 1] + deficient_X[:, 2] >>> >> >>> >>>> return deficient_X >>> >> >>> >>>> >>> >> >>> >>>> then the current numpy.linalg.matrix_rank algorithm returns >>> >> >>> >>>> full >>> >> >>> >>>> rank >>> >> >>> >>>> (10) in about 8 percent of cases (see appended script). >>> >> >>> >>>> >>> >> >>> >>>> I think this is a tolerance problem. 
The ``matrix_rank`` >>> >> >>> >>>> algorithm >>> >> >>> >>>> does this by default: >>> >> >>> >>>> >>> >> >>> >>>> S = spl.svd(M, compute_uv=False) >>> >> >>> >>>> tol = S.max() * np.finfo(S.dtype).eps >>> >> >>> >>>> return np.sum(S > tol) >>> >> >>> >>>> >>> >> >>> >>>> I guess we'd we want the lowest tolerance that nearly always >>> >> >>> >>>> or >>> >> >>> >>>> always >>> >> >>> >>>> identifies numerically rank deficient matrices. I suppose one >>> >> >>> >>>> way of >>> >> >>> >>>> looking at whether the tolerance is in the right range is to >>> >> >>> >>>> compare >>> >> >>> >>>> the calculated tolerance (``tol``) to the minimum singular >>> >> >>> >>>> value >>> >> >>> >>>> (``S.min()``) because S.min() in our case should be very small >>> >> >>> >>>> and >>> >> >>> >>>> indicate the rank deficiency. The mean value of tol / S.min() >>> >> >>> >>>> for >>> >> >>> >>>> the >>> >> >>> >>>> current algorithm, across many iterations, is about 2.8. We >>> >> >>> >>>> might >>> >> >>> >>>> hope this value would be higher than 1, but not much higher, >>> >> >>> >>>> otherwise >>> >> >>> >>>> we might be rejecting too many columns. >>> >> >>> >>>> >>> >> >>> >>>> Our current algorithm for tolerance is the same as the 2-norm >>> >> >>> >>>> of >>> >> >>> >>>> M * >>> >> >>> >>>> eps. We're citing Golub and Van Loan for this, but now I look >>> >> >>> >>>> at >>> >> >>> >>>> our >>> >> >>> >>>> copy (p 261, last para) - they seem to be suggesting using u * >>> >> >>> >>>> |M| >>> >> >>> >>>> where u = (p 61, section 2.4.2) eps / 2. (see [1]). I think >>> >> >>> >>>> the >>> >> >>> >>>> Golub >>> >> >>> >> >>> >> I'm fine with that, and agree that it is likely to lead to fewer folks >>> >> wondering why Matlab and numpy are different. A good explanation in the >>> >> function documentation would be useful. >>> >> >>> >> Chuck >>> >> >>> > >>> > One potential problem is that it implies that it will always be the same >>> > as >>> > any version of matlab's tolerance. What if they change it in a future >>> > release? How likely are we to even notice? >>> >>> I guess that matlab is unlikely to change for the same reason that we >>> would be reluctant to change, once we've found an acceptable value. >>> >>> I was thinking that we would say something like: >>> >>> """ >>> The default tolerance is : >>> >>> tol = S.max() * np.finfo(M.dtype).eps * max((m, n)) >>> >>> This corresponds to the tolerance suggested in NR page X, and to the >>> tolerance used by MATLAB at the time of writing (June 2012; see >>> http://www.mathworks.com/help/techdoc/ref/rank.html). >>> """ >>> >>> I don't know whether we would want to track changes made by matlab - >>> maybe we could have that discussion if they do change? >> >> >> I wouldn't bother tracking Matlab, but I think the alternative threshold >> could be mentioned in the notes. Something like >> >> A less conservative threshold is ... >> >> Maybe mention that because of numerical uncertainty there will always be a >> chance that the computed rank could be wrong, but that with the conservative >> threshold the rank is very unlikely to be less than the computed rank. > > Sounds good to me. Would anyone object to a pull request with these > changes (matlab tolerance default, description in docstring)? 
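For concreteness, a small sketch of the two tolerances being compared; these are just the formulas quoted above, not the numpy.linalg.matrix_rank source, applied to the rank-deficient construction from earlier in the thread:

    import numpy as np

    def rank_with_tol(M, matlab_style=True):
        # formulas quoted in this thread; illustrative only
        S = np.linalg.svd(M, compute_uv=False)
        tol = S.max() * np.finfo(S.dtype).eps
        if matlab_style:
            tol *= max(M.shape)   # the proposed MATLAB-style default
        return int(np.sum(S > tol))

    # column 0 is the sum of columns 1 and 2, so the true rank is 9
    X = np.random.normal(size=(40, 10))
    X[:, 0] = X[:, 1] + X[:, 2]
    # rank_with_tol(X, matlab_style=False) occasionally reports 10;
    # the larger MATLAB-style tolerance should almost always report 9.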
Pull request here: https://github.com/numpy/numpy/pull/357 Cheers, Matthew From charlesr.harris at gmail.com Sat Jul 14 23:56:05 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 14 Jul 2012 21:56:05 -0600 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 In-Reply-To: References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> Message-ID: On Sat, Jul 14, 2012 at 7:02 PM, Matthew Brett wrote: > Hi, > > On Sat, Jul 14, 2012 at 3:45 PM, Travis Oliphant > wrote: > > > > Hey all, > > > > We are nearing a code-freeze for NumPy 1.7. Are there any last-minute > changes people are wanting to push into NumPy 1.7? We should discuss them > as soon as possible. > > > > I'm proposing a code-freeze at midnight UTC on July 18th (7:00pm CDT on > July 17th). This will allow the creation of beta releases of NumPy on the > 18th of July. This is a few days later than originally hoped for --- > largely due to unexpected travel schedules of Ondrej and I, but it does > give people a few more days to get patches in. Of course, we will be able > to apply bug-fixes to the 1.7.x branch once the tag is made. > > > > If you have a pull-request that is not yet applied and would like to > discuss it for inclusion, the time to do it is now. > > An appeal for (just submitted): > > https://github.com/numpy/numpy/pull/357 > > Already committed. I think we should look over the current PR's to identify any others that should go in. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sun Jul 15 08:08:26 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 15 Jul 2012 14:08:26 +0200 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 In-Reply-To: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> Message-ID: On Sun, Jul 15, 2012 at 12:45 AM, Travis Oliphant wrote: > > Hey all, > > We are nearing a code-freeze for NumPy 1.7. Are there any last-minute > changes people are wanting to push into NumPy 1.7? We should discuss them > as soon as possible. > > I'm proposing a code-freeze at midnight UTC on July 18th (7:00pm CDT on > July 17th). This will allow the creation of beta releases of NumPy on the > 18th of July. This is a few days later than originally hoped for --- > largely due to unexpected travel schedules of Ondrej and I, but it does > give people a few more days to get patches in. Of course, we will be able > to apply bug-fixes to the 1.7.x branch once the tag is made. > What about the tickets still open for 1.7.0 ( http://projects.scipy.org/numpy/report/3)? There are a few important ones left. These I would consider blockers: - #2108 Datetime failures with MinGW - #2076 Bus error for F order ndarray creation on SPARC These have patches available which should be reviewed: - #2150 Distutils should support Debian multi-arch fully - #2179 Memmap children retain _mmap reference in all cases Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From d.s.seljebotn at astro.uio.no Sun Jul 15 10:33:20 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Sun, 15 Jul 2012 16:33:20 +0200 Subject: [Numpy-discussion] Extracting sub-fields from an array as a view (PR 350) In-Reply-To: <38C3231B-F771-4002-83D5-6F1C1D94291B@gmail.com> References: <38C3231B-F771-4002-83D5-6F1C1D94291B@gmail.com> Message-ID: <5002D4B0.2090005@astro.uio.no> On 07/15/2012 12:31 AM, Travis Oliphant wrote: > > In https://github.com/numpy/numpy/pull/350/files , > > javius provides a patch to allow field extraction from a structured > array to return a view instead of a copy. Generally, this is consistent > with the desire to have NumPy return views whenever it can. The same > idea underlies the change to the diagonal method. > > Suppose 'myarr' is a structured array with fields ['lat', 'long', > 'meas1', 'meas2', 'meas3', 'meas4']. > > Currently, > > myarr[['lat', 'long', 'mesa3']] will return a copy of the data in the > underlying array. The proposal is to have this return a view, but do it > in a two-stage approach so that a first version returns a copy with the > WARN_ON_WRITE flag set introduced in NumPy 1.7. A later version will > remove the flag (and the copy). > > What are thoughts on this proposal and which version of NumPy it should > go in? > There would at least need to be a deprecation plan where you use warnings to get users to insert extra explicit copy() wherever it's needed. With some very ugly hacks you could have copy() return "self" if the refcount is 1, but that also requires some knowledge of locals() of the calling frame to be safe and wouldn't work that well with Cython etc., so probably way too ugly. I hesitate to write the below, but if you start going down this road, I feel someone should at least mention it: I would prefer it if NumPy returned views in a lot more situations than today. Using the suboffsets idea of PEP 3118 you could also return a view for a[[1, 2, 4]] which would mean that y = x[a] y[...] = 4 would finally mean the same as x[a] = 4 and be a lot more consistent overall. It wouldn't be efficient, it wouldn't be a good idea for most users -- but it would still be within the structure of PEP 3118 (using suboffsets and allocating a pointer table temporarily), and it would lower the learning curve. I realize the immense backwards compatability challenges and implementation challenges and that this probably won't ever happen, but I felt this was the time to at least bring it up. Dag From thouis at gmail.com Sun Jul 15 11:33:15 2012 From: thouis at gmail.com (Thouis (Ray) Jones) Date: Sun, 15 Jul 2012 17:33:15 +0200 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 In-Reply-To: References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> Message-ID: On Jul 15, 2012 2:08 PM, "Ralf Gommers" wrote: > > > > On Sun, Jul 15, 2012 at 12:45 AM, Travis Oliphant wrote: >> >> >> Hey all, >> >> We are nearing a code-freeze for NumPy 1.7. Are there any last-minute changes people are wanting to push into NumPy 1.7? We should discuss them as soon as possible. >> >> I'm proposing a code-freeze at midnight UTC on July 18th (7:00pm CDT on July 17th). This will allow the creation of beta releases of NumPy on the 18th of July. This is a few days later than originally hoped for --- largely due to unexpected travel schedules of Ondrej and I, but it does give people a few more days to get patches in. Of course, we will be able to apply bug-fixes to the 1.7.x branch once the tag is made. 
> > > What about the tickets still open for 1.7.0 ( http://projects.scipy.org/numpy/report/3)? There are a few important ones left. > > These I would consider blockers: > - #2108 Datetime failures with MinGW > - #2076 Bus error for F order ndarray creation on SPARC > > These have patches available which should be reviewed: > - #2150 Distutils should support Debian multi-arch fully > - #2179 Memmap children retain _mmap reference in all cases This one was fixed in a recent PR of mine, but I can't find it right now (on phone). Njsmith committed the merge. Ray > > Ralf > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sun Jul 15 11:36:33 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 15 Jul 2012 17:36:33 +0200 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 In-Reply-To: References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> Message-ID: On Sun, Jul 15, 2012 at 5:33 PM, Thouis (Ray) Jones wrote: > > On Jul 15, 2012 2:08 PM, "Ralf Gommers" > wrote: > > > > > > > > On Sun, Jul 15, 2012 at 12:45 AM, Travis Oliphant > wrote: > >> > >> > >> Hey all, > >> > >> We are nearing a code-freeze for NumPy 1.7. Are there any last-minute > changes people are wanting to push into NumPy 1.7? We should discuss them > as soon as possible. > >> > >> I'm proposing a code-freeze at midnight UTC on July 18th (7:00pm CDT on > July 17th). This will allow the creation of beta releases of NumPy on the > 18th of July. This is a few days later than originally hoped for --- > largely due to unexpected travel schedules of Ondrej and I, but it does > give people a few more days to get patches in. Of course, we will be able > to apply bug-fixes to the 1.7.x branch once the tag is made. > > > > > > What about the tickets still open for 1.7.0 ( > http://projects.scipy.org/numpy/report/3)? There are a few important ones > left. > > > > These I would consider blockers: > > - #2108 Datetime failures with MinGW > > - #2076 Bus error for F order ndarray creation on SPARC > > > > These have patches available which should be reviewed: > > - #2150 Distutils should support Debian multi-arch fully > > - #2179 Memmap children retain _mmap reference in all cases > > This one was fixed in a recent PR of mine, but I can't find it right now > (on phone). Njsmith committed the merge. > OK thanks, closed the ticket. Looks like it would be handy for you and Nathaniel to get admin rights on the tracker. Can you help with that, Pauli? Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Jul 15 11:57:13 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 15 Jul 2012 16:57:13 +0100 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 In-Reply-To: References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> Message-ID: On Sun, Jul 15, 2012 at 1:08 PM, Ralf Gommers wrote: > > > On Sun, Jul 15, 2012 at 12:45 AM, Travis Oliphant > wrote: >> >> >> Hey all, >> >> We are nearing a code-freeze for NumPy 1.7. Are there any last-minute >> changes people are wanting to push into NumPy 1.7? We should discuss them >> as soon as possible. >> >> I'm proposing a code-freeze at midnight UTC on July 18th (7:00pm CDT on >> July 17th). This will allow the creation of beta releases of NumPy on the >> 18th of July. 
This is a few days later than originally hoped for --- largely >> due to unexpected travel schedules of Ondrej and I, but it does give people >> a few more days to get patches in. Of course, we will be able to apply >> bug-fixes to the 1.7.x branch once the tag is made. > > > What about the tickets still open for 1.7.0 > (http://projects.scipy.org/numpy/report/3)? There are a few important ones > left. > > These I would consider blockers: > - #2108 Datetime failures with MinGW Is there a description anywhere of what the problem actually is here? I looked at the ticket, which referred to a PR, and it's hard to work out from the PR discussion what the actual remaining test failures are -- and there definitely doesn't seem to be any description of the underlying problem. (Something about working 64-bit time_t on windows being difficult depending on the compiler used?) > - #2076 Bus error for F order ndarray creation on SPARC Yes, this looks like a regression... Mark, any thoughts? > These have patches available which should be reviewed: > - #2150 Distutils should support Debian multi-arch fully This PR still needs tests and has unresolved comments, but is probably a blocker: https://github.com/numpy/numpy/pull/327 -N From ralf.gommers at googlemail.com Sun Jul 15 12:32:41 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 15 Jul 2012 18:32:41 +0200 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 In-Reply-To: References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> Message-ID: On Sun, Jul 15, 2012 at 5:57 PM, Nathaniel Smith wrote: > On Sun, Jul 15, 2012 at 1:08 PM, Ralf Gommers > wrote: > > > > > > On Sun, Jul 15, 2012 at 12:45 AM, Travis Oliphant > > wrote: > >> > >> > >> Hey all, > >> > >> We are nearing a code-freeze for NumPy 1.7. Are there any last-minute > >> changes people are wanting to push into NumPy 1.7? We should discuss > them > >> as soon as possible. > >> > >> I'm proposing a code-freeze at midnight UTC on July 18th (7:00pm CDT on > >> July 17th). This will allow the creation of beta releases of NumPy on > the > >> 18th of July. This is a few days later than originally hoped for --- > largely > >> due to unexpected travel schedules of Ondrej and I, but it does give > people > >> a few more days to get patches in. Of course, we will be able to apply > >> bug-fixes to the 1.7.x branch once the tag is made. > > > > > > What about the tickets still open for 1.7.0 > > (http://projects.scipy.org/numpy/report/3)? There are a few important > ones > > left. > > > > These I would consider blockers: > > - #2108 Datetime failures with MinGW > > Is there a description anywhere of what the problem actually is here? > I looked at the ticket, which referred to a PR, and it's hard to work > out from the PR discussion what the actual remaining test failures are > -- and there definitely doesn't seem to be any description of the > underlying problem. (Something about working 64-bit time_t on windows > being difficult depending on the compiler used?) > There's a lot more discussion on http://projects.scipy.org/numpy/ticket/1909 https://github.com/numpy/numpy/pull/156 https://github.com/numpy/numpy/pull/161. The issue is that for MinGW 3.x some _s / _t functions seem to be missing. And we don't yet support MinGW 4.x. 
Current issues can be seen from the last test log on our Windows XP buildbot (June 29, http://buildbot.scipy.org/builders/Windows_XP_x86/builds/1124/steps/shell_1/logs/stdio ): ====================================================================== ERROR: test_datetime_arange (test_datetime.TestDateTime) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", line 1351, in test_datetime_arange assert_raises(ValueError, np.arange, np.datetime64('today'), OSError: Failed to use '_localtime64_s' to convert to a local time ====================================================================== ERROR: test_datetime_y2038 (test_datetime.TestDateTime) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", line 1706, in test_datetime_y2038 a = np.datetime64('2038-01-20T13:21:14') OSError: Failed to use '_gmtime64_s' to convert to a UTC time ====================================================================== ERROR: test_pydatetime_creation (test_datetime.TestDateTime) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", line 467, in test_pydatetime_creation a = np.array(['today', datetime.date.today()], dtype='M8[D]') OSError: Failed to use '_localtime64_s' to convert to a local time ====================================================================== ERROR: test_string_parser_variants (test_datetime.TestDateTime) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", line 1054, in test_string_parser_variants assert_equal(np.array(['1980-02-29T01:02:03'], np.dtype('M8[s]')), OSError: Failed to use '_gmtime64_s' to convert to a UTC time ====================================================================== ERROR: test_timedelta_scalar_construction_units (test_datetime.TestDateTime) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", line 287, in test_timedelta_scalar_construction_units assert_equal(np.datetime64('2010-03-12T17').dtype, OSError: Failed to use '_gmtime64_s' to convert to a UTC time ====================================================================== ERROR: Failure: OSError (Failed to use '_gmtime64_s' to convert to a UTC time) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python26\lib\site-packages\nose\loader.py", line 382, in loadTestsFromName addr.filename, addr.module) File "C:\Python26\lib\site-packages\nose\importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "C:\Python26\lib\site-packages\nose\importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_multiarray.py", line 916, in class TestArgmax(TestCase): File "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_multiarray.py", line 938, in TestArgmax 
np.datetime64('1994-06-21T14:43:15'), OSError: Failed to use '_gmtime64_s' to convert to a UTC time -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Jul 15 12:42:29 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 15 Jul 2012 10:42:29 -0600 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 In-Reply-To: References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> Message-ID: On Sun, Jul 15, 2012 at 10:32 AM, Ralf Gommers wrote: > > > On Sun, Jul 15, 2012 at 5:57 PM, Nathaniel Smith wrote: > >> On Sun, Jul 15, 2012 at 1:08 PM, Ralf Gommers >> wrote: >> > >> > >> > On Sun, Jul 15, 2012 at 12:45 AM, Travis Oliphant >> > wrote: >> >> >> >> >> >> Hey all, >> >> >> >> We are nearing a code-freeze for NumPy 1.7. Are there any last-minute >> >> changes people are wanting to push into NumPy 1.7? We should discuss >> them >> >> as soon as possible. >> >> >> >> I'm proposing a code-freeze at midnight UTC on July 18th (7:00pm CDT on >> >> July 17th). This will allow the creation of beta releases of NumPy >> on the >> >> 18th of July. This is a few days later than originally hoped for --- >> largely >> >> due to unexpected travel schedules of Ondrej and I, but it does give >> people >> >> a few more days to get patches in. Of course, we will be able to apply >> >> bug-fixes to the 1.7.x branch once the tag is made. >> > >> > >> > What about the tickets still open for 1.7.0 >> > (http://projects.scipy.org/numpy/report/3)? There are a few important >> ones >> > left. >> > >> > These I would consider blockers: >> > - #2108 Datetime failures with MinGW >> >> Is there a description anywhere of what the problem actually is here? >> I looked at the ticket, which referred to a PR, and it's hard to work >> out from the PR discussion what the actual remaining test failures are >> -- and there definitely doesn't seem to be any description of the >> underlying problem. (Something about working 64-bit time_t on windows >> being difficult depending on the compiler used?) >> > > There's a lot more discussion on > http://projects.scipy.org/numpy/ticket/1909 > https://github.com/numpy/numpy/pull/156 > https://github.com/numpy/numpy/pull/161. > > The issue is that for MinGW 3.x some _s / _t functions seem to be missing. > And we don't yet support MinGW 4.x. 
> > Current issues can be seen from the last test log on our Windows XP > buildbot (June 29, > http://buildbot.scipy.org/builders/Windows_XP_x86/builds/1124/steps/shell_1/logs/stdio > ): > > ====================================================================== > ERROR: test_datetime_arange (test_datetime.TestDateTime) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", line 1351, in test_datetime_arange > assert_raises(ValueError, np.arange, np.datetime64('today'), > OSError: Failed to use '_localtime64_s' to convert to a local time > > ====================================================================== > ERROR: test_datetime_y2038 (test_datetime.TestDateTime) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", line 1706, in test_datetime_y2038 > a = np.datetime64('2038-01-20T13:21:14') > OSError: Failed to use '_gmtime64_s' to convert to a UTC time > > ====================================================================== > ERROR: test_pydatetime_creation (test_datetime.TestDateTime) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", line 467, in test_pydatetime_creation > a = np.array(['today', datetime.date.today()], dtype='M8[D]') > OSError: Failed to use '_localtime64_s' to convert to a local time > > ====================================================================== > ERROR: test_string_parser_variants (test_datetime.TestDateTime) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", line 1054, in test_string_parser_variants > assert_equal(np.array(['1980-02-29T01:02:03'], np.dtype('M8[s]')), > OSError: Failed to use '_gmtime64_s' to convert to a UTC time > > ====================================================================== > ERROR: test_timedelta_scalar_construction_units (test_datetime.TestDateTime) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", line 287, in test_timedelta_scalar_construction_units > assert_equal(np.datetime64('2010-03-12T17').dtype, > OSError: Failed to use '_gmtime64_s' to convert to a UTC time > > ====================================================================== > ERROR: Failure: OSError (Failed to use '_gmtime64_s' to convert to a UTC time) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "C:\Python26\lib\site-packages\nose\loader.py", line 382, in loadTestsFromName > addr.filename, addr.module) > File "C:\Python26\lib\site-packages\nose\importer.py", line 39, in importFromPath > return self.importFromDir(dir_path, fqname) > File "C:\Python26\lib\site-packages\nose\importer.py", line 86, in importFromDir > mod = load_module(part_fqname, fh, filename, desc) > File "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_multiarray.py", line 916, in > class TestArgmax(TestCase): > File 
"C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_multiarray.py", line 938, in TestArgmax > np.datetime64('1994-06-21T14:43:15'), > OSError: Failed to use '_gmtime64_s' to convert to a UTC time > > > I've wondered about the current status of MinGW 4.x, the mingw.org release of GCC 4.7.0 was June 7. Looks like it is still 32 bits and breaks the ABI ... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Sun Jul 15 12:56:06 2012 From: cournape at gmail.com (David Cournapeau) Date: Sun, 15 Jul 2012 17:56:06 +0100 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 In-Reply-To: References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> Message-ID: On Sun, Jul 15, 2012 at 5:42 PM, Charles R Harris wrote: > > > On Sun, Jul 15, 2012 at 10:32 AM, Ralf Gommers > wrote: >> >> >> >> On Sun, Jul 15, 2012 at 5:57 PM, Nathaniel Smith wrote: >>> >>> On Sun, Jul 15, 2012 at 1:08 PM, Ralf Gommers >>> wrote: >>> > >>> > >>> > On Sun, Jul 15, 2012 at 12:45 AM, Travis Oliphant >>> > wrote: >>> >> >>> >> >>> >> Hey all, >>> >> >>> >> We are nearing a code-freeze for NumPy 1.7. Are there any >>> >> last-minute >>> >> changes people are wanting to push into NumPy 1.7? We should discuss >>> >> them >>> >> as soon as possible. >>> >> >>> >> I'm proposing a code-freeze at midnight UTC on July 18th (7:00pm CDT >>> >> on >>> >> July 17th). This will allow the creation of beta releases of NumPy >>> >> on the >>> >> 18th of July. This is a few days later than originally hoped for --- >>> >> largely >>> >> due to unexpected travel schedules of Ondrej and I, but it does give >>> >> people >>> >> a few more days to get patches in. Of course, we will be able to >>> >> apply >>> >> bug-fixes to the 1.7.x branch once the tag is made. >>> > >>> > >>> > What about the tickets still open for 1.7.0 >>> > (http://projects.scipy.org/numpy/report/3)? There are a few important >>> > ones >>> > left. >>> > >>> > These I would consider blockers: >>> > - #2108 Datetime failures with MinGW >>> >>> Is there a description anywhere of what the problem actually is here? >>> I looked at the ticket, which referred to a PR, and it's hard to work >>> out from the PR discussion what the actual remaining test failures are >>> -- and there definitely doesn't seem to be any description of the >>> underlying problem. (Something about working 64-bit time_t on windows >>> being difficult depending on the compiler used?) >> >> >> There's a lot more discussion on >> http://projects.scipy.org/numpy/ticket/1909 >> https://github.com/numpy/numpy/pull/156 >> https://github.com/numpy/numpy/pull/161. >> >> The issue is that for MinGW 3.x some _s / _t functions seem to be missing. >> And we don't yet support MinGW 4.x. 
>> >> Current issues can be seen from the last test log on our Windows XP >> buildbot (June 29, >> http://buildbot.scipy.org/builders/Windows_XP_x86/builds/1124/steps/shell_1/logs/stdio): >> >> ====================================================================== >> ERROR: test_datetime_arange (test_datetime.TestDateTime) >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File >> "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", >> line 1351, in test_datetime_arange >> assert_raises(ValueError, np.arange, np.datetime64('today'), >> OSError: Failed to use '_localtime64_s' to convert to a local time >> >> ====================================================================== >> ERROR: test_datetime_y2038 (test_datetime.TestDateTime) >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File >> "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", >> line 1706, in test_datetime_y2038 >> a = np.datetime64('2038-01-20T13:21:14') >> OSError: Failed to use '_gmtime64_s' to convert to a UTC time >> >> ====================================================================== >> ERROR: test_pydatetime_creation (test_datetime.TestDateTime) >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File >> "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", >> line 467, in test_pydatetime_creation >> a = np.array(['today', datetime.date.today()], dtype='M8[D]') >> OSError: Failed to use '_localtime64_s' to convert to a local time >> >> ====================================================================== >> ERROR: test_string_parser_variants (test_datetime.TestDateTime) >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File >> "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", >> line 1054, in test_string_parser_variants >> assert_equal(np.array(['1980-02-29T01:02:03'], np.dtype('M8[s]')), >> OSError: Failed to use '_gmtime64_s' to convert to a UTC time >> >> ====================================================================== >> ERROR: test_timedelta_scalar_construction_units >> (test_datetime.TestDateTime) >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File >> "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", >> line 287, in test_timedelta_scalar_construction_units >> assert_equal(np.datetime64('2010-03-12T17').dtype, >> OSError: Failed to use '_gmtime64_s' to convert to a UTC time >> >> ====================================================================== >> ERROR: Failure: OSError (Failed to use '_gmtime64_s' to convert to a UTC >> time) >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File "C:\Python26\lib\site-packages\nose\loader.py", line 382, in >> loadTestsFromName >> addr.filename, addr.module) >> File "C:\Python26\lib\site-packages\nose\importer.py", line 39, in >> importFromPath >> return self.importFromDir(dir_path, fqname) >> File "C:\Python26\lib\site-packages\nose\importer.py", line 86, in >> importFromDir >> mod = load_module(part_fqname, fh, filename, desc) >> File >> 
"C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_multiarray.py", >> line 916, in >> class TestArgmax(TestCase): >> File >> "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_multiarray.py", >> line 938, in TestArgmax >> np.datetime64('1994-06-21T14:43:15'), >> OSError: Failed to use '_gmtime64_s' to convert to a UTC time >> >> > > I've wondered about the current status of MinGW 4.x, the mingw.org release > of GCC 4.7.0 was June 7. Looks like it is still 32 bits and breaks the ABI The main issue with mingw 4.x is the dependency on mingw runtimes. With 3.x, it was possible to statically link everything, but not with 4.x. Distributing DLL outside numpy is not a good idea IMO, and I don't know how to share DLL across python extensions without putting them in the python installation (which does not sound like a good idea either). David From stefan-usenet at bytereef.org Sun Jul 15 13:06:18 2012 From: stefan-usenet at bytereef.org (Stefan Krah) Date: Sun, 15 Jul 2012 19:06:18 +0200 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 In-Reply-To: References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> Message-ID: <20120715170618.GA12281@sleipnir.bytereef.org> Nathaniel Smith wrote: > > What about the tickets still open for 1.7.0 > > (http://projects.scipy.org/numpy/report/3)? There are a few important ones > > left. > > > > These I would consider blockers: > > - #2108 Datetime failures with MinGW I wonder if this might be a blocker: Python-3.3 will be released in August and I don't think the issue is fixed yet: http://projects.scipy.org/numpy/ticket/2145 Stefan Krah From charlesr.harris at gmail.com Sun Jul 15 13:10:36 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 15 Jul 2012 11:10:36 -0600 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 In-Reply-To: References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> Message-ID: On Sun, Jul 15, 2012 at 10:56 AM, David Cournapeau wrote: > On Sun, Jul 15, 2012 at 5:42 PM, Charles R Harris > wrote: > > > > > > On Sun, Jul 15, 2012 at 10:32 AM, Ralf Gommers < > ralf.gommers at googlemail.com> > > wrote: > >> > >> > >> > >> On Sun, Jul 15, 2012 at 5:57 PM, Nathaniel Smith wrote: > >>> > >>> On Sun, Jul 15, 2012 at 1:08 PM, Ralf Gommers > >>> wrote: > >>> > > >>> > > >>> > On Sun, Jul 15, 2012 at 12:45 AM, Travis Oliphant < > travis at continuum.io> > >>> > wrote: > >>> >> > >>> >> > >>> >> Hey all, > >>> >> > >>> >> We are nearing a code-freeze for NumPy 1.7. Are there any > >>> >> last-minute > >>> >> changes people are wanting to push into NumPy 1.7? We should > discuss > >>> >> them > >>> >> as soon as possible. > >>> >> > >>> >> I'm proposing a code-freeze at midnight UTC on July 18th (7:00pm CDT > >>> >> on > >>> >> July 17th). This will allow the creation of beta releases of NumPy > >>> >> on the > >>> >> 18th of July. This is a few days later than originally hoped for --- > >>> >> largely > >>> >> due to unexpected travel schedules of Ondrej and I, but it does give > >>> >> people > >>> >> a few more days to get patches in. Of course, we will be able to > >>> >> apply > >>> >> bug-fixes to the 1.7.x branch once the tag is made. > >>> > > >>> > > >>> > What about the tickets still open for 1.7.0 > >>> > (http://projects.scipy.org/numpy/report/3)? There are a few > important > >>> > ones > >>> > left. 
> >>> > > >>> > These I would consider blockers: > >>> > - #2108 Datetime failures with MinGW > >>> > >>> Is there a description anywhere of what the problem actually is here? > >>> I looked at the ticket, which referred to a PR, and it's hard to work > >>> out from the PR discussion what the actual remaining test failures are > >>> -- and there definitely doesn't seem to be any description of the > >>> underlying problem. (Something about working 64-bit time_t on windows > >>> being difficult depending on the compiler used?) > >> > >> > >> There's a lot more discussion on > >> http://projects.scipy.org/numpy/ticket/1909 > >> https://github.com/numpy/numpy/pull/156 > >> https://github.com/numpy/numpy/pull/161. > >> > >> The issue is that for MinGW 3.x some _s / _t functions seem to be > missing. > >> And we don't yet support MinGW 4.x. > >> > >> Current issues can be seen from the last test log on our Windows XP > >> buildbot (June 29, > >> > http://buildbot.scipy.org/builders/Windows_XP_x86/builds/1124/steps/shell_1/logs/stdio > ): > >> > >> ====================================================================== > >> ERROR: test_datetime_arange (test_datetime.TestDateTime) > >> ---------------------------------------------------------------------- > >> Traceback (most recent call last): > >> File > >> > "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", > >> line 1351, in test_datetime_arange > >> assert_raises(ValueError, np.arange, np.datetime64('today'), > >> OSError: Failed to use '_localtime64_s' to convert to a local time > >> > >> ====================================================================== > >> ERROR: test_datetime_y2038 (test_datetime.TestDateTime) > >> ---------------------------------------------------------------------- > >> Traceback (most recent call last): > >> File > >> > "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", > >> line 1706, in test_datetime_y2038 > >> a = np.datetime64('2038-01-20T13:21:14') > >> OSError: Failed to use '_gmtime64_s' to convert to a UTC time > >> > >> ====================================================================== > >> ERROR: test_pydatetime_creation (test_datetime.TestDateTime) > >> ---------------------------------------------------------------------- > >> Traceback (most recent call last): > >> File > >> > "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", > >> line 467, in test_pydatetime_creation > >> a = np.array(['today', datetime.date.today()], dtype='M8[D]') > >> OSError: Failed to use '_localtime64_s' to convert to a local time > >> > >> ====================================================================== > >> ERROR: test_string_parser_variants (test_datetime.TestDateTime) > >> ---------------------------------------------------------------------- > >> Traceback (most recent call last): > >> File > >> > "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", > >> line 1054, in test_string_parser_variants > >> assert_equal(np.array(['1980-02-29T01:02:03'], np.dtype('M8[s]')), > >> OSError: Failed to use '_gmtime64_s' to convert to a UTC time > >> > >> ====================================================================== > >> ERROR: test_timedelta_scalar_construction_units > >> (test_datetime.TestDateTime) > >> ---------------------------------------------------------------------- > >> Traceback (most recent call last): > >> File > >> > 
"C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", > >> line 287, in test_timedelta_scalar_construction_units > >> assert_equal(np.datetime64('2010-03-12T17').dtype, > >> OSError: Failed to use '_gmtime64_s' to convert to a UTC time > >> > >> ====================================================================== > >> ERROR: Failure: OSError (Failed to use '_gmtime64_s' to convert to a UTC > >> time) > >> ---------------------------------------------------------------------- > >> Traceback (most recent call last): > >> File "C:\Python26\lib\site-packages\nose\loader.py", line 382, in > >> loadTestsFromName > >> addr.filename, addr.module) > >> File "C:\Python26\lib\site-packages\nose\importer.py", line 39, in > >> importFromPath > >> return self.importFromDir(dir_path, fqname) > >> File "C:\Python26\lib\site-packages\nose\importer.py", line 86, in > >> importFromDir > >> mod = load_module(part_fqname, fh, filename, desc) > >> File > >> > "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_multiarray.py", > >> line 916, in > >> class TestArgmax(TestCase): > >> File > >> > "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_multiarray.py", > >> line 938, in TestArgmax > >> np.datetime64('1994-06-21T14:43:15'), > >> OSError: Failed to use '_gmtime64_s' to convert to a UTC time > >> > >> > > > > I've wondered about the current status of MinGW 4.x, the mingw.orgrelease > > of GCC 4.7.0 was June 7. Looks like it is still 32 bits and breaks the > ABI > > The main issue with mingw 4.x is the dependency on mingw runtimes. > With 3.x, it was possible to statically link everything, but not with > 4.x. Distributing DLL outside numpy is not a good idea IMO, and I > don't know how to share DLL across python extensions without putting > them in the python installation (which does not sound like a good idea > either). > > Have you looked at TDM ? A cursory look implies that the std/stdc++ libraries can be statically linked. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jay.bourque at continuum.io Sun Jul 15 13:18:34 2012 From: jay.bourque at continuum.io (jay bourque) Date: Sun, 15 Jul 2012 12:18:34 -0500 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 In-Reply-To: References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> Message-ID: Just added PR #359. The purpose is to allow the nditer object operand and iter flags to be set for a ufunc to provide better control over how an array is iterated over by a ufunc and how the ufunc uses the operands passed to it. One specific motivation for this is to be able to specify an input operand to a ufunc as being read/write instead of read only. For example, to allow your ufunc to write back to the first operand: PyObject *ufunc = PyUFunc_FromFuncAndData((PyUFuncGenericFunction*)func, data, types, 1, 2, 1, PyUFunc_None, "ufunc_name", "ufunc_doc", 0); /* override the default NPY_ITER_READONLY flag */ ((PyUFuncObject*)ufunc)->op_flags[0] = NPY_ITER_READWRITE; /* global iter flag that needs to be set for read/write flag to work */ ((PyUFuncObject*)ufunc)->iter_flags = NPY_ITER_REDUCE_OK; Thoughts? 
-Jay On Sun, Jul 15, 2012 at 12:10 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Sun, Jul 15, 2012 at 10:56 AM, David Cournapeau wrote: > >> On Sun, Jul 15, 2012 at 5:42 PM, Charles R Harris >> wrote: >> > >> > >> > On Sun, Jul 15, 2012 at 10:32 AM, Ralf Gommers < >> ralf.gommers at googlemail.com> >> > wrote: >> >> >> >> >> >> >> >> On Sun, Jul 15, 2012 at 5:57 PM, Nathaniel Smith >> wrote: >> >>> >> >>> On Sun, Jul 15, 2012 at 1:08 PM, Ralf Gommers >> >>> wrote: >> >>> > >> >>> > >> >>> > On Sun, Jul 15, 2012 at 12:45 AM, Travis Oliphant < >> travis at continuum.io> >> >>> > wrote: >> >>> >> >> >>> >> >> >>> >> Hey all, >> >>> >> >> >>> >> We are nearing a code-freeze for NumPy 1.7. Are there any >> >>> >> last-minute >> >>> >> changes people are wanting to push into NumPy 1.7? We should >> discuss >> >>> >> them >> >>> >> as soon as possible. >> >>> >> >> >>> >> I'm proposing a code-freeze at midnight UTC on July 18th (7:00pm >> CDT >> >>> >> on >> >>> >> July 17th). This will allow the creation of beta releases of >> NumPy >> >>> >> on the >> >>> >> 18th of July. This is a few days later than originally hoped for >> --- >> >>> >> largely >> >>> >> due to unexpected travel schedules of Ondrej and I, but it does >> give >> >>> >> people >> >>> >> a few more days to get patches in. Of course, we will be able to >> >>> >> apply >> >>> >> bug-fixes to the 1.7.x branch once the tag is made. >> >>> > >> >>> > >> >>> > What about the tickets still open for 1.7.0 >> >>> > (http://projects.scipy.org/numpy/report/3)? There are a few >> important >> >>> > ones >> >>> > left. >> >>> > >> >>> > These I would consider blockers: >> >>> > - #2108 Datetime failures with MinGW >> >>> >> >>> Is there a description anywhere of what the problem actually is here? >> >>> I looked at the ticket, which referred to a PR, and it's hard to work >> >>> out from the PR discussion what the actual remaining test failures are >> >>> -- and there definitely doesn't seem to be any description of the >> >>> underlying problem. (Something about working 64-bit time_t on windows >> >>> being difficult depending on the compiler used?) >> >> >> >> >> >> There's a lot more discussion on >> >> http://projects.scipy.org/numpy/ticket/1909 >> >> https://github.com/numpy/numpy/pull/156 >> >> https://github.com/numpy/numpy/pull/161. >> >> >> >> The issue is that for MinGW 3.x some _s / _t functions seem to be >> missing. >> >> And we don't yet support MinGW 4.x. 
>> >> >> >> Current issues can be seen from the last test log on our Windows XP >> >> buildbot (June 29, >> >> >> http://buildbot.scipy.org/builders/Windows_XP_x86/builds/1124/steps/shell_1/logs/stdio >> ): >> >> >> >> ====================================================================== >> >> ERROR: test_datetime_arange (test_datetime.TestDateTime) >> >> ---------------------------------------------------------------------- >> >> Traceback (most recent call last): >> >> File >> >> >> "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", >> >> line 1351, in test_datetime_arange >> >> assert_raises(ValueError, np.arange, np.datetime64('today'), >> >> OSError: Failed to use '_localtime64_s' to convert to a local time >> >> >> >> ====================================================================== >> >> ERROR: test_datetime_y2038 (test_datetime.TestDateTime) >> >> ---------------------------------------------------------------------- >> >> Traceback (most recent call last): >> >> File >> >> >> "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", >> >> line 1706, in test_datetime_y2038 >> >> a = np.datetime64('2038-01-20T13:21:14') >> >> OSError: Failed to use '_gmtime64_s' to convert to a UTC time >> >> >> >> ====================================================================== >> >> ERROR: test_pydatetime_creation (test_datetime.TestDateTime) >> >> ---------------------------------------------------------------------- >> >> Traceback (most recent call last): >> >> File >> >> >> "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", >> >> line 467, in test_pydatetime_creation >> >> a = np.array(['today', datetime.date.today()], dtype='M8[D]') >> >> OSError: Failed to use '_localtime64_s' to convert to a local time >> >> >> >> ====================================================================== >> >> ERROR: test_string_parser_variants (test_datetime.TestDateTime) >> >> ---------------------------------------------------------------------- >> >> Traceback (most recent call last): >> >> File >> >> >> "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", >> >> line 1054, in test_string_parser_variants >> >> assert_equal(np.array(['1980-02-29T01:02:03'], np.dtype('M8[s]')), >> >> OSError: Failed to use '_gmtime64_s' to convert to a UTC time >> >> >> >> ====================================================================== >> >> ERROR: test_timedelta_scalar_construction_units >> >> (test_datetime.TestDateTime) >> >> ---------------------------------------------------------------------- >> >> Traceback (most recent call last): >> >> File >> >> >> "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", >> >> line 287, in test_timedelta_scalar_construction_units >> >> assert_equal(np.datetime64('2010-03-12T17').dtype, >> >> OSError: Failed to use '_gmtime64_s' to convert to a UTC time >> >> >> >> ====================================================================== >> >> ERROR: Failure: OSError (Failed to use '_gmtime64_s' to convert to a >> UTC >> >> time) >> >> ---------------------------------------------------------------------- >> >> Traceback (most recent call last): >> >> File "C:\Python26\lib\site-packages\nose\loader.py", line 382, in >> >> loadTestsFromName >> >> addr.filename, addr.module) >> >> File "C:\Python26\lib\site-packages\nose\importer.py", line 39, in >> >> importFromPath >> >> return 
self.importFromDir(dir_path, fqname) >> >> File "C:\Python26\lib\site-packages\nose\importer.py", line 86, in >> >> importFromDir >> >> mod = load_module(part_fqname, fh, filename, desc) >> >> File >> >> >> "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_multiarray.py", >> >> line 916, in >> >> class TestArgmax(TestCase): >> >> File >> >> >> "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_multiarray.py", >> >> line 938, in TestArgmax >> >> np.datetime64('1994-06-21T14:43:15'), >> >> OSError: Failed to use '_gmtime64_s' to convert to a UTC time >> >> >> >> >> > >> > I've wondered about the current status of MinGW 4.x, the mingw.orgrelease >> > of GCC 4.7.0 was June 7. Looks like it is still 32 bits and breaks the >> ABI >> >> The main issue with mingw 4.x is the dependency on mingw runtimes. >> With 3.x, it was possible to statically link everything, but not with >> 4.x. Distributing DLL outside numpy is not a good idea IMO, and I >> don't know how to share DLL across python extensions without putting >> them in the python installation (which does not sound like a good idea >> either). >> >> > Have you looked at TDM ? A cursory look > implies that the std/stdc++ libraries can be statically linked. > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Jul 15 15:23:22 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 15 Jul 2012 20:23:22 +0100 Subject: [Numpy-discussion] ufunc and nditer flags (was Re: Code Freeze for NumPy 1.7) Message-ID: On Sun, Jul 15, 2012 at 6:18 PM, jay bourque wrote: > Just added PR #359. The purpose is to allow the nditer object operand and > iter flags to be set for a ufunc to provide better control over how an array > is iterated over by a ufunc and how the ufunc uses the operands passed to > it. One specific motivation for this is to be able to specify an input > operand to a ufunc as being read/write instead of read only. Huh. My first gut reaction to this is that it's an argument *against* merging this change, because ufuncs *shouldn't* be writing to their inputs. Maybe I'm wrong, but... obviously there is more context here than we've heard so far. Can you explain what you're actually trying to accomplish? -n From jay.bourque at continuum.io Sun Jul 15 16:03:04 2012 From: jay.bourque at continuum.io (jay bourque) Date: Sun, 15 Jul 2012 15:03:04 -0500 Subject: [Numpy-discussion] ufunc and nditer flags (was Re: Code Freeze for NumPy 1.7) In-Reply-To: References: Message-ID: Travis can better speak to specific use cases, but one example where this might be useful is an "in place" ufunc, or a ufunc operand that's broadcasted and can hold a reduce value. On Sun, Jul 15, 2012 at 2:23 PM, Nathaniel Smith wrote: > On Sun, Jul 15, 2012 at 6:18 PM, jay bourque > wrote: > > Just added PR #359. The purpose is to allow the nditer object operand and > > iter flags to be set for a ufunc to provide better control over how an > array > > is iterated over by a ufunc and how the ufunc uses the operands passed to > > it. One specific motivation for this is to be able to specify an input > > operand to a ufunc as being read/write instead of read only. > > Huh. 
My first gut reaction to this is that it's an argument *against* > merging this change, because ufuncs *shouldn't* be writing to their > inputs. Maybe I'm wrong, but... obviously there is more context here > than we've heard so far. Can you explain what you're actually trying > to accomplish? > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From s0454615 at sms.ed.ac.uk Sun Jul 15 16:17:46 2012 From: s0454615 at sms.ed.ac.uk (Chris Ball) Date: Sun, 15 Jul 2012 20:17:46 +0000 (UTC) Subject: [Numpy-discussion] Help building NumPy on Windows Message-ID: Hi, I'm having some trouble building numpy on a 64-bit Windows 7 machine. I'm probably accidentally missing a step following the build process described at http://scipy.org/Installing_SciPy/Windows; it would be great if someone could spot what! Here's what I did: 1. installed python 2.7 from python.org 2. installed mingw32 from the link above (results in gcc 3.4.5) 3. added c:\mingw\bin to front of path 4. ran "python setup.py build --compiler=mingw32 install --prefix=numpy-install" I got the following error: [...] C:\Users\ceball\npslavetest\Windows7_x64_mingw32_3.4.5- py27\build\numpy\distutils\system_info.py:1409: UserWarning: Lapack (http://www.netlib.org/lapack/) sources not found. Directories to search for the sources can be specified in the numpy/distutils/site.cfg file (section [lapack_src]) or by setting the LAPACK_SRC environment variable. warnings.warn(LapackSrcNotFoundError.__doc__) customize GnuFCompiler gnu: no Fortran 90 compiler found gnu: no Fortran 90 compiler found customize GnuFCompiler using config C compiler: gcc -mno-cygwin -O2 -Wall -Wstrict-prototypes compile options: '-DNPY_MINGW_USE_CUSTOM_MSVCR -D__MSVCRT_VERSION__=0x0900 - Inumpy\core\src\private -Inumpy\core\src -Inumpy\core -Inumpy\core\src\npymath - Inumpy\core\src\multiarray -Inumpy\core\src\umath -Inumpy\core\src\npysort - Inumpy\core\include -Ic:\Python27\include -Ic:\Python27\PC -c' gcc -mno-cygwin -O2 -Wall -Wstrict-prototypes -DNPY_MINGW_USE_CUSTOM_MSVCR - D__MSVCRT_VERSION__=0x0900 -Inumpy\core\src\private -Inumpy\core\src - Inumpy\core -Inumpy\core\src\npymath -Inumpy\core\src\multiarray - Inumpy\core\src\umath -Inumpy\core\src\npysort -Inumpy\core\include - Ic:\Python27\include -Ic:\Python27\PC -c _configtest.c -o _configtest.o Found executable c:\mingw\bin\gcc.exe g++ -mno-cygwin _configtest.o -lmsvcr90 -o _configtest.exe Found executable c:\mingw\bin\g++.exe c:\mingw\bin\..\lib\gcc\mingw32\3.4.5\..\..\..\..\mingw32\bin\ld.exe: cannot find -lmsvcr90 collect2: ld returned 1 exit status failure. 
removing: _configtest.exe.manifest _configtest.c _configtest.o Traceback (most recent call last): File "setup.py", line 214, in setup_package() File "setup.py", line 207, in setup_package configuration=configuration ) File "C:\Users\ceball\npslavetest\Windows7_x64_mingw32_3.4.5- py27\build\numpy\distutils\core.py", line 186, in setup return old_setup(**new_attr) File "c:\Python27\lib\distutils\core.py", line 152, in setup dist.run_commands() File "c:\Python27\lib\distutils\dist.py", line 953, in run_commands self.run_command(cmd) File "c:\Python27\lib\distutils\dist.py", line 972, in run_command cmd_obj.run() File "C:\Users\ceball\npslavetest\Windows7_x64_mingw32_3.4.5- py27\build\numpy\distutils\command\build.py", line 37, in run old_build.run(self) File "c:\Python27\lib\distutils\command\build.py", line 127, in run self.run_command(cmd_name) File "c:\Python27\lib\distutils\cmd.py", line 326, in run_command self.distribution.run_command(command) File "c:\Python27\lib\distutils\dist.py", line 972, in run_command cmd_obj.run() File "C:\Users\ceball\npslavetest\Windows7_x64_mingw32_3.4.5- py27\build\numpy\distutils\command\build_src.py", line 152, in run self.build_sources() File "C:\Users\ceball\npslavetest\Windows7_x64_mingw32_3.4.5- py27\build\numpy\distutils\command\build_src.py", line 163, in build_sources self.build_library_sources(*libname_info) File "C:\Users\ceball\npslavetest\Windows7_x64_mingw32_3.4.5- py27\build\numpy\distutils\command\build_src.py", line 298, in build_library_sources sources = self.generate_sources(sources, (lib_name, build_info)) File "C:\Users\ceball\npslavetest\Windows7_x64_mingw32_3.4.5- py27\build\numpy\distutils\command\build_src.py", line 385, in generate_sources source = func(extension, build_dir) File "numpy\core\setup.py", line 648, in get_mathlib_info raise RuntimeError("Broken toolchain: cannot link a simple C program") RuntimeError: Broken toolchain: cannot link a simple C program Can anyone see what I've missed from this? I'm not sure what the situation is for supporting building with Microsoft compilers, but on another Windows 7 machine (also 64 bits) with MS Visual Studio 9.0 installed, the build completes but I get test failures. You can see the full output of this here: https://jenkins.shiningpanda.com/scipy/job/NumPy-Windows7_x64- py27/26/consoleFull From travis at continuum.io Sun Jul 15 16:29:01 2012 From: travis at continuum.io (Travis Oliphant) Date: Sun, 15 Jul 2012 15:29:01 -0500 Subject: [Numpy-discussion] ufunc and nditer flags (was Re: Code Freeze for NumPy 1.7) In-Reply-To: References: Message-ID: <2AF4D496-59B0-4CFC-AEB9-6D383D42772C@continuum.io> On Jul 15, 2012, at 2:23 PM, Nathaniel Smith wrote: > On Sun, Jul 15, 2012 at 6:18 PM, jay bourque wrote: >> Just added PR #359. The purpose is to allow the nditer object operand and >> iter flags to be set for a ufunc to provide better control over how an array >> is iterated over by a ufunc and how the ufunc uses the operands passed to >> it. One specific motivation for this is to be able to specify an input >> operand to a ufunc as being read/write instead of read only. > > Huh. My first gut reaction to this is that it's an argument *against* > merging this change, because ufuncs *shouldn't* be writing to their > inputs. Maybe I'm wrong, but... obviously there is more context here > than we've heard so far. Can you explain what you're actually trying > to accomplish? > This is a generalization that allows ufuncs to be more flexible. 
It's particularly important as the changes to the ufunc implementation in 1.6 where a lot more buffering is taking place has changed the implicit behavior that some users were relying on. In particular, there are several NumPy users who have assumed that they could treat "read-only" inputs as "read-write" and modify the inputs in the ufunc for a variety of reasons (to hold state, to implement interesting functions that depend on the order in which it's called, etc.). With the changes in 1.6 to the way ufuncs are buffered their code broke as buffered inputs were not copied back to the underlying arrays after the ufunc was called. It would be great if such people would use this list to communicate their concerns in more detail, but some are not able to. That doesn't mean their concerns are not valid and should not be considered. We can argue that people "should not" be using ufuncs in that way, or we could look at whether or not it makes sense to have input-and-output arguments for ufuncs. It's helpful to remember that ufuncs can be more general than the simple unary and binary ones that most are used to. Fortran has "inout" arguments for it's subroutines which is an argument for the general utility of such a device in programming. If we want ufunc kernels to grow beyond element-wise, or be used with structured arrays, etc., then allowing a ufunc to be created that defines arguments as inout seems reasonable. We already "sort-of" have the ability to define "inout" arguments in that one can pass an output array into a ufunc and it can be pre-filled with whatever one wants (and one can use the data in the "out" array as if it were input). But, this is also a hack, I think. I think it's better to just allow the user to specify their intent so that the nditer buffering mechanism to do the right thing with arrays that are inputs and arrays that are outputs and arrays that are specified as *both* input and output. My view is that intelligent programmers have found a use-case for treating ufunc arguments as inout. This is a general paradigm that exists in other lanaguages for scientific computing. We already have the ability to specify an "out" parameter which can be abused for this sort of thing as well, but I'd rather let people be explicit about it so that we can reason correctly in the future about what people are trying to do. This will especially be useful as more "generalized ufunc kernels" get written. Thus, I think it makes a lot of sense to allow people to be explicit about the intent of arguments as inout instead of trying to find loop-holes in the current implementation to get what they want. Thanks, -Travis From s0454615 at sms.ed.ac.uk Sun Jul 15 17:00:42 2012 From: s0454615 at sms.ed.ac.uk (Chris Ball) Date: Sun, 15 Jul 2012 21:00:42 +0000 (UTC) Subject: [Numpy-discussion] Possible test failure on Debian 6 with Python 2.6 Message-ID: Hi, I'm trying to set up various build machines. Some of these are with ShiningPanda.com, which provides a 64-bit Debian 6 machine (as well as Windows 7). This machine has multiple versions of Python installed. Using the build procedure below, I see a test failure with Python 2.6 (and 2.7) but not 2.5 (and 2.4). Any ideas? I guess other people are testing on Debian 6, and if they don't see the test failure with Python 2.6 maybe there's something unusual about the setup of this machine. 
Build procedure: $ python setup.py build install --prefix=./numpy-install $ cd numpy-install $ export PYTHONPATH=/path/to/numpy-install/lib/python2.6/site-packages $ python ../tools/test-installed-numpy.py --coverage -- --with-xunit Test error: ====================================================================== ERROR: test_multiarray.TestNewBufferProtocol.test_roundtrip ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/slave/jenkins/shiningpanda/jobs/51551925/virtualenvs/d41d8cd9/lib/python2 .6/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/slave/jenkins/workspace/NumPy-Debian6_x64-py26/numpy- install/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py", line 2537, in test_roundtrip assert_raises(ValueError, self._check_roundtrip, x) File "/home/slave/jenkins/workspace/NumPy-Debian6_x64-py26/numpy- install/lib/python2.6/site-packages/numpy/testing/utils.py", line 1018, in assert_raises return nose.tools.assert_raises(*args,**kwargs) File "/sp/lib/python/cpython-2.6/lib/python2.6/unittest.py", line 336, in failUnlessRaises callableObj(*args, **kwargs) File "/home/slave/jenkins/workspace/NumPy-Debian6_x64-py26/numpy- install/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py", line 2471, in _check_roundtrip y = np.asarray(x) RuntimeWarning: tp_compare didn't return -1 or -2 for exception [...] Ran 4781 tests in 68.179s FAILED (KNOWNFAIL=4, SKIP=6, errors=1) The full build and test output is here: https://jenkins.shiningpanda.com/scipy/job/NumPy-Debian6_x64-py26/8/consoleFull For comparison, with Python 2.5: https://jenkins.shiningpanda.com/scipy/job/NumPy-Debian6_x64-py25/9/consoleFull From s0454615 at sms.ed.ac.uk Sun Jul 15 17:07:01 2012 From: s0454615 at sms.ed.ac.uk (Chris Ball) Date: Sun, 15 Jul 2012 21:07:01 +0000 (UTC) Subject: [Numpy-discussion] Error building with Python 3.3 (was Re: Code Freeze for NumPy 1.7) References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> <20120715170618.GA12281@sleipnir.bytereef.org> Message-ID: Stefan Krah bytereef.org> writes: ... > I wonder if this might be a blocker: Python-3.3 will be released in August > and I don't think the issue is fixed yet: > > http://projects.scipy.org/numpy/ticket/2145 In case it helps, on a 64-bit Debian 6 machine where building with Python 3.1 and 3.2 seems fine, I think I see the same error as above. Here's the build log: https://jenkins.shiningpanda.com/scipy/job/NumPy-Debian6_x64-py33/7/consoleFull Chris From mike.ressler at alum.mit.edu Sun Jul 15 17:28:00 2012 From: mike.ressler at alum.mit.edu (Mike Ressler) Date: Sun, 15 Jul 2012 14:28:00 -0700 Subject: [Numpy-discussion] Select-based median (in light of code freeze) Message-ID: Hi, A couple of years ago there was a flurry of work partially at my instigation at SciPy 2009 to build a better median function based on a select algorithm rather than a sort algorithm. It seemed that it had progressed quite far, but the code in lib/function_base.py still uses a sort. Has the select work been abandoned or could it be merged in with a little spit-and-polish? (Or has it been and I'm just too naive to have seen it?) It seemed that the work had gotten quite close to release - I would hate to see it lost. (Not to mention I am greedy for the speed-up.) 
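For reference, the idea itself is not complicated; here is a rough pure-Python sketch of
a select-based median (toy code only -- the names and details below are illustrative and
are not taken from the 2009 work):

import numpy as np

def _select(a, k):
    # Return the k-th smallest element (0-based) of a 1-d array using
    # Hoare-style partitioning: expected O(n), versus O(n log n) for a
    # full sort.
    a = np.array(a, copy=True)
    lo, hi = 0, a.size - 1
    while True:
        pivot = a[(lo + hi) // 2]
        i, j = lo, hi
        while i <= j:
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        if k <= j:
            hi = j        # k-th element lies in the left partition
        elif k >= i:
            lo = i        # k-th element lies in the right partition
        else:
            return a[k]   # element is already in its final position

def select_median(a):
    # Median via at most two selections; the array is never fully sorted.
    a = np.asarray(a, dtype=float).ravel()
    n = a.size
    if n % 2:
        return _select(a, n // 2)
    return 0.5 * (_select(a, n // 2 - 1) + _select(a, n // 2))

# select_median([3, 1, 4, 1, 5]) -> 3.0

Nothing clever about it -- the real work would be doing the same thing efficiently in C
for all the supported dtypes and axes.
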
Regards, Mike -- mike.ressler at alum.mit.edu From charlesr.harris at gmail.com Sun Jul 15 17:40:04 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 15 Jul 2012 15:40:04 -0600 Subject: [Numpy-discussion] Select-based median (in light of code freeze) In-Reply-To: References: Message-ID: On Sun, Jul 15, 2012 at 3:28 PM, Mike Ressler wrote: > Hi, > > A couple of years ago there was a flurry of work partially at my > instigation at SciPy 2009 to build a better median function based on a > select algorithm rather than a sort algorithm. It seemed that it had > progressed quite far, but the code in lib/function_base.py still uses > a sort. Has the select work been abandoned or could it be merged in > with a little spit-and-polish? (Or has it been and I'm just too naive > to have seen it?) It seemed that the work had gotten quite close to > release - I would hate to see it lost. (Not to mention I am greedy for > the speed-up.) > > I was thinking of adding quickselect, but if you have made a start ... go for it. I think a good place to put the algorithm code would be in numpy/core/src/npysort, but I don't think you should count on having it go into 1.7. Chuck. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike.ressler at alum.mit.edu Sun Jul 15 17:50:51 2012 From: mike.ressler at alum.mit.edu (Mike Ressler) Date: Sun, 15 Jul 2012 14:50:51 -0700 Subject: [Numpy-discussion] Select-based median (in light of code freeze) In-Reply-To: References: Message-ID: On Sun, Jul 15, 2012 at 2:40 PM, Charles R Harris wrote: > I was thinking of adding quickselect, but if you have made a start ... go > for it. This is territory where I personally am fearful to tread - I'm no developer, but I am an awfully good alpha/beta tester! I can go back to the archives and try to see how far people got and what it would take to finish it. If you have a good quickselect, I could wrap median around it and see how it behaves. I am happy to help push things along, but frankly would be embarrassed to have any low-level code I wrote see the light of day. I can test and document it in exchange though. Mike -- mike.ressler at alum.mit.edu From pgmdevlist at gmail.com Sun Jul 15 17:51:25 2012 From: pgmdevlist at gmail.com (Pierre GM) Date: Sun, 15 Jul 2012 23:51:25 +0200 Subject: [Numpy-discussion] Istalling Numpy and Scipy on preinstalled Python 2.6 on Mac In-Reply-To: References: Message-ID: <1F5DA083EC6E4B08B5F824444AF9F7DA@gmail.com> Hello, I assume you checked that ? http://www.scipy.org/Installing_SciPy/Mac_OS_X You basically have everything you need there. A basic warning, though: you don't want to overwrite Mac OS X's own numpy, but to install it either locally (in ~/.local, using python setup.install --user) or in a virtual environment (http://www.doughellmann.com/projects/virtualenvwrapper/) Note that virtualenvwrapper allows you to specify the version of python to use during the compilation (using the -p or --python= flag). But you could do it yourself when compiling the sources: just use `your_path_to_your_python_version setup install --user` Let me know how it goes -- Pierre GM On Friday, July 13, 2012 at 19:22 , Naser Nikandish wrote: > Hi, > > I need to install numpy and scipy on preinstalled Python 2.6 on my Mac Lion. Is there anyway to do it? I am aware that Lion OS comes with Python 2.7 as well. But I have to install it on Python 2.6. I really appreciate any help. 
> > Cheers > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org (mailto:NumPy-Discussion at scipy.org) > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Sun Jul 15 18:14:49 2012 From: travis at continuum.io (Travis Oliphant) Date: Sun, 15 Jul 2012 17:14:49 -0500 Subject: [Numpy-discussion] Error building with Python 3.3 (was Re: Code Freeze for NumPy 1.7) In-Reply-To: References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> <20120715170618.GA12281@sleipnir.bytereef.org> Message-ID: <6E2A45BA-172A-4400-A544-ED80A35901A5@continuum.io> On Jul 15, 2012, at 4:07 PM, Chris Ball wrote: > Stefan Krah bytereef.org> writes: > ... >> I wonder if this might be a blocker: Python-3.3 will be released in August >> and I don't think the issue is fixed yet: >> >> http://projects.scipy.org/numpy/ticket/2145 > > In case it helps, on a 64-bit Debian 6 machine where building with Python 3.1 > and 3.2 seems fine, I think I see the same error as above. Here's the build > log: > > https://jenkins.shiningpanda.com/scipy/job/NumPy-Debian6_x64-py33/7/consoleFull This looks related to Unicode Object changes in Python 3.3 http://docs.python.org/dev/c-api/unicode.html At first blush, it looks like a blocker.. -Travis > > Chris > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Jul 15 18:20:17 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 15 Jul 2012 16:20:17 -0600 Subject: [Numpy-discussion] Select-based median (in light of code freeze) In-Reply-To: References: Message-ID: On Sun, Jul 15, 2012 at 3:50 PM, Mike Ressler wrote: > On Sun, Jul 15, 2012 at 2:40 PM, Charles R Harris > wrote: > > I was thinking of adding quickselect, but if you have made a start ... go > > for it. > > This is territory where I personally am fearful to tread - I'm no > developer, but I am an awfully good alpha/beta tester! I can go back > to the archives and try to see how far people got and what it would > take to finish it. > > If you have a good quickselect, I could wrap median around it and see > how it behaves. I am happy to help push things along, but frankly > would be embarrassed to have any low-level code I wrote see the light > of day. I can test and document it in exchange though. > > It might be a good sprint for scipy2012 if any interested folks are going to be there. I was going to do additional cleanups of the sorting functions in any case. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail at paul.kishimoto.name Sun Jul 15 19:05:28 2012 From: mail at paul.kishimoto.name (Paul Natsuo Kishimoto) Date: Sun, 15 Jul 2012 19:05:28 -0400 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 In-Reply-To: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> Message-ID: <1342393528.28368.3.camel@esdceeprjpstudent1.mit.edu> On Sat, 2012-07-14 at 17:45 -0500, Travis Oliphant wrote: > Hey all, > > We are nearing a code-freeze for NumPy 1.7. Are there any > last-minute changes people are wanting to push into NumPy 1.7? We > should discuss them as soon as possible. 
> > I'm proposing a code-freeze at midnight UTC on July 18th (7:00pm CDT > on July 17th). This will allow the creation of beta releases of > NumPy on the 18th of July. This is a few days later than originally > hoped for --- largely due to unexpected travel schedules of Ondrej and > I, but it does give people a few more days to get patches in. Of > course, we will be able to apply bug-fixes to the 1.7.x branch once > the tag is made. > > If you have a pull-request that is not yet applied and would like to > discuss it for inclusion, the time to do it is now. > > Best, > > -Travis Bump for: https://github.com/numpy/numpy/pull/351 As requested by njsmith, I gave a more detailed explanation and asked the list for input at: http://www.mail-archive.com/numpy-discussion at scipy.org/msg38306.html There was one qualified negative reply and nothing (yet) further. I'd appreciate if some other devs could weigh in. Thanks, -- Paul Natsuo Kishimoto SM candidate, Technology & Policy Program (2012) Research assistant, http://globalchange.mit.edu https://paul.kishimoto.name +1 617 302 6105 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part URL: From travis at continuum.io Sun Jul 15 20:01:28 2012 From: travis at continuum.io (Travis Oliphant) Date: Sun, 15 Jul 2012 19:01:28 -0500 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 In-Reply-To: References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> Message-ID: <1D55C51D-5D90-496E-964E-ED7B74342557@continuum.io> On Jul 15, 2012, at 7:08 AM, Ralf Gommers wrote: > > > On Sun, Jul 15, 2012 at 12:45 AM, Travis Oliphant wrote: > > Hey all, > > We are nearing a code-freeze for NumPy 1.7. Are there any last-minute changes people are wanting to push into NumPy 1.7? We should discuss them as soon as possible. > > I'm proposing a code-freeze at midnight UTC on July 18th (7:00pm CDT on July 17th). This will allow the creation of beta releases of NumPy on the 18th of July. This is a few days later than originally hoped for --- largely due to unexpected travel schedules of Ondrej and I, but it does give people a few more days to get patches in. Of course, we will be able to apply bug-fixes to the 1.7.x branch once the tag is made. > > What about the tickets still open for 1.7.0 (http://projects.scipy.org/numpy/report/3)? There are a few important ones left. > > These I would consider blockers: > - #2108 Datetime failures with MinGW > - #2076 Bus error for F order ndarray creation on SPARC > > These have patches available which should be reviewed: > - #2150 Distutils should support Debian multi-arch fully > - #2179 Memmap children retain _mmap reference in all cases These blockers could certainly delay the rc1 release, but would they need to hold up the beta? -Travis -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Sun Jul 15 22:26:50 2012 From: travis at continuum.io (Travis Oliphant) Date: Sun, 15 Jul 2012 21:26:50 -0500 Subject: [Numpy-discussion] Select-based median (in light of code freeze) In-Reply-To: References: Message-ID: <886D2EB9-E65D-4F48-BE10-C21F68548033@continuum.io> That reminds me. How many NumPy devs are going to be at SciPy this year? It would be good to have a NumPy sprint there. 
Ideas for what we could work on: 1) Make progress on the 1.7.0 release 2) Make progress on the conversion to Github 3) Make progress against tickets and PRs If enough NumPy devs will be there, we should plan to have a BOF as well. Best, -Travis On Jul 15, 2012, at 5:20 PM, Charles R Harris wrote: > > > On Sun, Jul 15, 2012 at 3:50 PM, Mike Ressler wrote: > On Sun, Jul 15, 2012 at 2:40 PM, Charles R Harris > wrote: > > I was thinking of adding quickselect, but if you have made a start ... go > > for it. > > This is territory where I personally am fearful to tread - I'm no > developer, but I am an awfully good alpha/beta tester! I can go back > to the archives and try to see how far people got and what it would > take to finish it. > > If you have a good quickselect, I could wrap median around it and see > how it behaves. I am happy to help push things along, but frankly > would be embarrassed to have any low-level code I wrote see the light > of day. I can test and document it in exchange though. > > > It might be a good sprint for scipy2012 if any interested folks are going to be there. I was going to do additional cleanups of the sorting functions in any case. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Sun Jul 15 22:40:07 2012 From: travis at continuum.io (Travis Oliphant) Date: Sun, 15 Jul 2012 21:40:07 -0500 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 In-Reply-To: <1342393528.28368.3.camel@esdceeprjpstudent1.mit.edu> References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> <1342393528.28368.3.camel@esdceeprjpstudent1.mit.edu> Message-ID: <4243560E-C63A-496F-9A10-9C1D78CEB27A@continuum.io> > > Bump for: https://github.com/numpy/numpy/pull/351 > > > As requested by njsmith, I gave a more detailed explanation and asked > the list for input at: > http://www.mail-archive.com/numpy-discussion at scipy.org/msg38306.html > > There was one qualified negative reply and nothing (yet) further. I'd > appreciate if some other devs could weigh in. I can see the point of your proposal, but I don't think we can just change the behavior of genfromtxt quite like this. I'm not an expert on the genfromtxt code, so I don't know if I exactly understand what is changing and what is staying the same. From the look of the patch, it looks like you are changing the interpretation so that whereas headers used to be allowed in the "first-line" of the comment, they would no longer be allowed like that. I don't think we can do that, because it breaks code for someone without a path for change. Now, we could add another keyword (headers=True, for example), that interpreted the first non-commented line as the header line. Something like that has a much higher chance of getting accepted from my perspective. 
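As a rough illustration of what that would mean in practice, the same effect can already
be had by stripping comments before the text reaches genfromtxt. The helper below is
hypothetical (its name and details are made up for illustration; it is not an existing
option or part of any patch):

import numpy as np
from io import BytesIO

def genfromtxt_plain_header(fname, comments='#', **kwargs):
    # Drop everything from the comment character onward on every line,
    # discard lines that are then empty, and let the first remaining line
    # supply the column names.
    with open(fname, 'rb') as f:
        lines = [line.split(comments.encode())[0].strip() for line in f]
    lines = [line for line in lines if line]
    return np.genfromtxt(BytesIO(b'\n'.join(lines)), names=True, **kwargs)

On a file whose header line is not commented out (with or without a trailing comment)
this should give dtype names ('gender', 'age', 'weight') for the files discussed in this
thread; a file whose only header sits inside a comment would, as noted above, no longer
be picked up.
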
Best, -Travis > > > Thanks, > -- > Paul Natsuo Kishimoto > > SM candidate, Technology & Policy Program (2012) > Research assistant, http://globalchange.mit.edu > https://paul.kishimoto.name +1 617 302 6105 > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From scopatz at gmail.com Sun Jul 15 22:52:28 2012 From: scopatz at gmail.com (Anthony Scopatz) Date: Sun, 15 Jul 2012 21:52:28 -0500 Subject: [Numpy-discussion] Select-based median (in light of code freeze) In-Reply-To: <886D2EB9-E65D-4F48-BE10-C21F68548033@continuum.io> References: <886D2EB9-E65D-4F48-BE10-C21F68548033@continuum.io> Message-ID: On Sun, Jul 15, 2012 at 9:26 PM, Travis Oliphant wrote: > That reminds me. > > How many NumPy devs are going to be at SciPy this year? It would be > good to have a NumPy sprint there. Ideas for what we could work on: > > 1) Make progress on the 1.7.0 release > 2) Make progress on the conversion to Github > 3) Make progress against tickets and PRs > > If enough NumPy devs will be there, we should plan to have a BOF as well. > Please be sure to use the sprints and birds-of-a-feather sign-up pages (which are limited web forums)! Be Well Anthony > > Best, > > -Travis > > > > > On Jul 15, 2012, at 5:20 PM, Charles R Harris wrote: > > > > On Sun, Jul 15, 2012 at 3:50 PM, Mike Ressler wrote: > >> On Sun, Jul 15, 2012 at 2:40 PM, Charles R Harris >> wrote: >> > I was thinking of adding quickselect, but if you have made a start ... >> go >> > for it. >> >> This is territory where I personally am fearful to tread - I'm no >> developer, but I am an awfully good alpha/beta tester! I can go back >> to the archives and try to see how far people got and what it would >> take to finish it. >> >> If you have a good quickselect, I could wrap median around it and see >> how it behaves. I am happy to help push things along, but frankly >> would be embarrassed to have any low-level code I wrote see the light >> of day. I can test and document it in exchange though. >> >> > It might be a good sprint for scipy2012 if any interested folks are going > to be there. I was going to do additional cleanups of the sorting functions > in any case. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nouiz at nouiz.org Sun Jul 15 22:52:44 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Sun, 15 Jul 2012 22:52:44 -0400 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 In-Reply-To: <4243560E-C63A-496F-9A10-9C1D78CEB27A@continuum.io> References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> <1342393528.28368.3.camel@esdceeprjpstudent1.mit.edu> <4243560E-C63A-496F-9A10-9C1D78CEB27A@continuum.io> Message-ID: Hi, there is a PR that I think could be merged before the relase: https://github.com/numpy/numpy/pull/326 It is the addition of the inplace_increment function. It seam good, but I can't review it enough as it use many numpy internal that I never used or looked at. But the tests seam to cover all cases and it don't change current functions. 
So it should not have any side effect problems. This was a feature frequently requested. Fred On Sun, Jul 15, 2012 at 10:40 PM, Travis Oliphant wrote: >> >> Bump for: https://github.com/numpy/numpy/pull/351 >> >> >> As requested by njsmith, I gave a more detailed explanation and asked >> the list for input at: >> http://www.mail-archive.com/numpy-discussion at scipy.org/msg38306.html >> >> There was one qualified negative reply and nothing (yet) further. I'd >> appreciate if some other devs could weigh in. > > I can see the point of your proposal, but I don't think we can just change the behavior of genfromtxt quite like this. I'm not an expert on the genfromtxt code, so I don't know if I exactly understand what is changing and what is staying the same. > > >From the look of the patch, it looks like you are changing the interpretation so that whereas headers used to be allowed in the "first-line" of the comment, they would no longer be allowed like that. I don't think we can do that, because it breaks code for someone without a path for change. > > Now, we could add another keyword (headers=True, for example), that interpreted the first non-commented line as the header line. Something like that has a much higher chance of getting accepted from my perspective. > > Best, > > -Travis > > > > >> >> >> Thanks, >> -- >> Paul Natsuo Kishimoto >> >> SM candidate, Technology & Policy Program (2012) >> Research assistant, http://globalchange.mit.edu >> https://paul.kishimoto.name +1 617 302 6105 >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From travis at continuum.io Sun Jul 15 23:12:05 2012 From: travis at continuum.io (Travis Oliphant) Date: Sun, 15 Jul 2012 22:12:05 -0500 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 In-Reply-To: References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> <1342393528.28368.3.camel@esdceeprjpstudent1.mit.edu> <4243560E-C63A-496F-9A10-9C1D78CEB27A@continuum.io> Message-ID: <45C044E1-1EFD-452D-A575-5B3E47AE7C5B@continuum.io> This looked like an interesting one for sure. I can't look at the PR right now for some reason (Github gave me a 500 error). I know there were some comments, though. -Travis On Jul 15, 2012, at 9:52 PM, Fr?d?ric Bastien wrote: > Hi, > > there is a PR that I think could be merged before the relase: > > https://github.com/numpy/numpy/pull/326 > > It is the addition of the inplace_increment function. It seam good, > but I can't review it enough as it use many numpy internal that I > never used or looked at. But the tests seam to cover all cases and it > don't change current functions. So it should not have any side effect > problems. > > This was a feature frequently requested. > > Fred > > On Sun, Jul 15, 2012 at 10:40 PM, Travis Oliphant wrote: >>> >>> Bump for: https://github.com/numpy/numpy/pull/351 >>> >>> >>> As requested by njsmith, I gave a more detailed explanation and asked >>> the list for input at: >>> http://www.mail-archive.com/numpy-discussion at scipy.org/msg38306.html >>> >>> There was one qualified negative reply and nothing (yet) further. I'd >>> appreciate if some other devs could weigh in. >> >> I can see the point of your proposal, but I don't think we can just change the behavior of genfromtxt quite like this. 
I'm not an expert on the genfromtxt code, so I don't know if I exactly understand what is changing and what is staying the same. >> >>> From the look of the patch, it looks like you are changing the interpretation so that whereas headers used to be allowed in the "first-line" of the comment, they would no longer be allowed like that. I don't think we can do that, because it breaks code for someone without a path for change. >> >> Now, we could add another keyword (headers=True, for example), that interpreted the first non-commented line as the header line. Something like that has a much higher chance of getting accepted from my perspective. >> >> Best, >> >> -Travis >> >> >> >> >>> >>> >>> Thanks, >>> -- >>> Paul Natsuo Kishimoto >>> >>> SM candidate, Technology & Policy Program (2012) >>> Research assistant, http://globalchange.mit.edu >>> https://paul.kishimoto.name +1 617 302 6105 >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From ralf.gommers at googlemail.com Mon Jul 16 02:26:14 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 16 Jul 2012 08:26:14 +0200 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 In-Reply-To: <1D55C51D-5D90-496E-964E-ED7B74342557@continuum.io> References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> <1D55C51D-5D90-496E-964E-ED7B74342557@continuum.io> Message-ID: On Mon, Jul 16, 2012 at 2:01 AM, Travis Oliphant wrote: > > On Jul 15, 2012, at 7:08 AM, Ralf Gommers wrote: > > > > On Sun, Jul 15, 2012 at 12:45 AM, Travis Oliphant wrote: > >> >> Hey all, >> >> We are nearing a code-freeze for NumPy 1.7. Are there any last-minute >> changes people are wanting to push into NumPy 1.7? We should discuss them >> as soon as possible. >> >> I'm proposing a code-freeze at midnight UTC on July 18th (7:00pm CDT on >> July 17th). This will allow the creation of beta releases of NumPy on the >> 18th of July. This is a few days later than originally hoped for --- >> largely due to unexpected travel schedules of Ondrej and I, but it does >> give people a few more days to get patches in. Of course, we will be able >> to apply bug-fixes to the 1.7.x branch once the tag is made. >> > > What about the tickets still open for 1.7.0 ( > http://projects.scipy.org/numpy/report/3)? There are a few important ones > left. > > These I would consider blockers: > - #2108 Datetime failures with MinGW > - #2076 Bus error for F order ndarray creation on SPARC > > These have patches available which should be reviewed: > - #2150 Distutils should support Debian multi-arch fully > - #2179 Memmap children retain _mmap reference in all cases > > These blockers could certainly delay the rc1 release, but would they need > to hold up the beta? > For the SPARC issue, not really. For the datetime issue, I think it at least needs to be clear what the way forward is. If the issue isn't solvable without any changes to either the API or the compiler support, which may well be the case given the amount of effort that has gone into this already, we need a decision on what to do. 
It would also be a good idea to apply the distutils patch before the beta. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Mon Jul 16 02:52:00 2012 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 16 Jul 2012 08:52:00 +0200 Subject: [Numpy-discussion] Proposed change in genfromtxt(..., comments='#', names=True) behaviour In-Reply-To: <1342200571.19936.32.camel@esdceeprjpstudent1.mit.edu> References: <1342192511.19936.22.camel@esdceeprjpstudent1.mit.edu> <1342200571.19936.32.camel@esdceeprjpstudent1.mit.edu> Message-ID: Hello, I'm siding w/ Tom, Nathaniel and Travis. I don't think the change as it is is advisable. It's a regression, and breaking=bad. Now, I can understand your frustration, so, what about a trade-off? The first line w/ a comment after the first 'skip_header' ones should be parsed for column titles (and we call it 'first_commented_line'). We split it along the comment character, say, #. If there's some non-space character before the #, we keep this part of 'first_commented_line' as titles: that should work for your case. If the first non-space character was #, then what comes after are the titles (that's Tom's case and the current default). I'm not looking forward to introducing yet another keyword, genfromtxt is enough of a mess as it is (unless we add a 'need_coffee' one). What y'all think? On Jul 13, 2012 7:29 PM, "Paul Natsuo Kishimoto" wrote: > On Fri, 2012-07-13 at 12:13 -0400, Tom Aldcroft wrote: > > On Fri, Jul 13, 2012 at 11:15 AM, Paul Natsuo Kishimoto > > wrote: > > > Hello everyone, > > > > > > I am a longtime NumPy user, and I just filed my first > contribution to > > > the code as pull request to fix what I felt was a bug in the behaviour > > > of genfromtxt() https://github.com/numpy/numpy/pull/351 > > > It turns out this alters existing behaviour that some people may depend > > > on, so I was encouraged to raise the issue on this list to see what the > > > consensus was. > > > > > > This behaviour happens in the specific situation where: > > > * Comments are used in the file (the default comment character is > > > '#', which I'll use here), AND > > > * The kwarg names=True is given. In this case, genfromtxt() is > > > supposed to read an initial row containing the names of the > > > columns and return an array with a structured dtype. > > > > > > Currently, these options work with a file like (Example #1): > > > > > > # gender age weight > > > M 21 72.100000 > > > F 35 58.330000 > > > M 33 21.99 > > > > > > ?but NOT with a file like (Example #2): > > > > > > # here is a general file comment > > > # it is spread over multiple lines > > > gender age weight > > > M 21 72.100000 > > > F 35 58.330000 > > > M 33 21.99 > > > > > > ?genfromtxt() believes the column names are 'here', 'is', 'a', etc., > and > > > thinks all of the columns are strings because 'gender', 'age' and > > > 'weight' are not numbers. > > > > > > This is because genfromtxt() (after skipping a number of lines > as > > > specified in the optional kwarg skip_header) will use the *first* line > > > it encounters to produce column names. 
If that line contains a comment > > > character, genfromtxt() discards everything *up to and including* the > > > comment character, and tries to use the content *after* the comment > > > character as headers (Example 3): > > > > > > gender age weight # wrong column names > > > M 21 72.100000 > > > F 35 58.330000 > > > M 33 21.99 > > > > > > ?the resulting column names are 'wrong', 'column' and 'names'. > > > > > > My proposed change was that, if the first (or any subsequent) line > > > contains a comment character, it should be treated as an *actual > > > comment*, and discarded along with anything that follows it on the > line. > > > > > > In Example 2, the result would be that the first two lines > appear empty > > > (no text before '#'), and the third line ("gender age weight") is used > > > for column names. > > > > > > In Example 3, the result would be that "gender age weight" is > used for > > > column names while "# wrong column names" is ignored. > > > > > > BUT! > > > > > > In Example 1, the result would be that the first line appears > empty, > > > and "M 21 72.100000" are used for column names. > > > > > > In other words, this change would do away with the previous behaviour > > > where the very first commented line was (magically?) treated not as a > > > comment but instead as column headers. This might break some existing > > > code. On the positive side, it would allow the user to be more liberal > > > with the format of input files (Example 4): > > > > > > # here is a general file comment > > > # the columns in this table are > > > gender age weight # here is a comment on the header line > > > # following this line are the data > > > M 21 72.100000 > > > F 35 58.330000 # here is a comment on a data line > > > M 33 21.99 > > > > > > I feel that this is a better/more flexible behaviour for genfromtxt(), > > > but?as stated?I am interested in your thoughts. > > > > > > Cheers, > > > -- > > > Paul Natsuo Kishimoto > > > > > > SM candidate, Technology & Policy Program (2012) > > > Research assistant, http://globalchange.mit.edu > > > https://paul.kishimoto.name +1 617 302 6105 > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > Hi Paul, > > > > At least in astronomy tabular files with the column definitions in the > > first commented line are reasonably common. This is driven in part by > > wide use of legacy packages like supermongo etc that don't have > > intelligent table readers, so users document the column names as a > > comment line. I think making this break might be unfortunate for > > users in astronomy. > > > > Dealing with commented header definitions is annoying. Not that it > > matters specifically for your genfromtext() proposal, but in the > > asciitable reader this case is handled with a particular reader class > > that expects the first comment line to contain the column definitions: > > > > http://cxc.harvard.edu/contrib/asciitable/#asciitable.CommentedHeader > > > > Cheers, > > Tom > > Tom, > > Thanks for this information. In thinking about how people would work > around this, I figured it would be fairly easy to discard a comment > character that occurred as the very first character in a file, e.g.: > > raw = StringIO(open('example.txt').read()[1:]) > data = numpy.genfromtxt(raw, comment='#', names=True) > > ?but I realize that making this change in many places would still be an > annoyance. 
> > I should perhaps also add that my view of 'proper' table formats is > partly influenced by another plotting package, namely pgfplots for LaTeX > (http://pgfplots.sourceforge.net/ , > http://pgfplots.sourceforge.net/gallery.html) which uses uncommented > headers. To the extent NumPy users are also LaTeX users, similar > semantics could be more friendly. > > Looking forward to more input from other users, > -- > Paul Natsuo Kishimoto > > SM candidate, Technology & Policy Program (2012) > Research assistant, http://globalchange.mit.edu > https://paul.kishimoto.name +1 617 302 6105 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Mon Jul 16 02:54:38 2012 From: travis at continuum.io (Travis Oliphant) Date: Mon, 16 Jul 2012 01:54:38 -0500 Subject: [Numpy-discussion] Proposed change in genfromtxt(..., comments='#', names=True) behaviour In-Reply-To: References: <1342192511.19936.22.camel@esdceeprjpstudent1.mit.edu> <1342200571.19936.32.camel@esdceeprjpstudent1.mit.edu> Message-ID: <35FEFC2B-E373-410D-971F-8A3AA3E0FCBF@continuum.io> On Jul 16, 2012, at 1:52 AM, Pierre GM wrote: > Hello, > I'm siding w/ Tom, Nathaniel and Travis. I don't think the change as it is is advisable. It's a regression, and breaking=bad. > Now, I can understand your frustration, so, what about a trade-off? The first line w/ a comment after the first 'skip_header' ones should be parsed for column titles (and we call it 'first_commented_line'). We split it along the comment character, say, #. If there's some non-space character before the #, we keep this part of 'first_commented_line' as titles: that should work for your case. If the first non-space character was #, then what comes after are the titles (that's Tom's case and the current default). > I'm not looking forward to introducing yet another keyword, genfromtxt is enough of a mess as it is (unless we add a 'need_coffee' one). > What y'all think? > That seems like an acceptable proposal --- it is consistent with current behavior but also satisfies the use-case (without another keyword which is a bonus). So, +1 from me. -Travis > On Jul 13, 2012 7:29 PM, "Paul Natsuo Kishimoto" wrote: > On Fri, 2012-07-13 at 12:13 -0400, Tom Aldcroft wrote: > > On Fri, Jul 13, 2012 at 11:15 AM, Paul Natsuo Kishimoto > > wrote: > > > Hello everyone, > > > > > > I am a longtime NumPy user, and I just filed my first contribution to > > > the code as pull request to fix what I felt was a bug in the behaviour > > > of genfromtxt() https://github.com/numpy/numpy/pull/351 > > > It turns out this alters existing behaviour that some people may depend > > > on, so I was encouraged to raise the issue on this list to see what the > > > consensus was. > > > > > > This behaviour happens in the specific situation where: > > > * Comments are used in the file (the default comment character is > > > '#', which I'll use here), AND > > > * The kwarg names=True is given. In this case, genfromtxt() is > > > supposed to read an initial row containing the names of the > > > columns and return an array with a structured dtype. 
> > > > > > Currently, these options work with a file like (Example #1): > > > > > > # gender age weight > > > M 21 72.100000 > > > F 35 58.330000 > > > M 33 21.99 > > > > > > ?but NOT with a file like (Example #2): > > > > > > # here is a general file comment > > > # it is spread over multiple lines > > > gender age weight > > > M 21 72.100000 > > > F 35 58.330000 > > > M 33 21.99 > > > > > > ?genfromtxt() believes the column names are 'here', 'is', 'a', etc., and > > > thinks all of the columns are strings because 'gender', 'age' and > > > 'weight' are not numbers. > > > > > > This is because genfromtxt() (after skipping a number of lines as > > > specified in the optional kwarg skip_header) will use the *first* line > > > it encounters to produce column names. If that line contains a comment > > > character, genfromtxt() discards everything *up to and including* the > > > comment character, and tries to use the content *after* the comment > > > character as headers (Example 3): > > > > > > gender age weight # wrong column names > > > M 21 72.100000 > > > F 35 58.330000 > > > M 33 21.99 > > > > > > ?the resulting column names are 'wrong', 'column' and 'names'. > > > > > > My proposed change was that, if the first (or any subsequent) line > > > contains a comment character, it should be treated as an *actual > > > comment*, and discarded along with anything that follows it on the line. > > > > > > In Example 2, the result would be that the first two lines appear empty > > > (no text before '#'), and the third line ("gender age weight") is used > > > for column names. > > > > > > In Example 3, the result would be that "gender age weight" is used for > > > column names while "# wrong column names" is ignored. > > > > > > BUT! > > > > > > In Example 1, the result would be that the first line appears empty, > > > and "M 21 72.100000" are used for column names. > > > > > > In other words, this change would do away with the previous behaviour > > > where the very first commented line was (magically?) treated not as a > > > comment but instead as column headers. This might break some existing > > > code. On the positive side, it would allow the user to be more liberal > > > with the format of input files (Example 4): > > > > > > # here is a general file comment > > > # the columns in this table are > > > gender age weight # here is a comment on the header line > > > # following this line are the data > > > M 21 72.100000 > > > F 35 58.330000 # here is a comment on a data line > > > M 33 21.99 > > > > > > I feel that this is a better/more flexible behaviour for genfromtxt(), > > > but?as stated?I am interested in your thoughts. > > > > > > Cheers, > > > -- > > > Paul Natsuo Kishimoto > > > > > > SM candidate, Technology & Policy Program (2012) > > > Research assistant, http://globalchange.mit.edu > > > https://paul.kishimoto.name +1 617 302 6105 > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > Hi Paul, > > > > At least in astronomy tabular files with the column definitions in the > > first commented line are reasonably common. This is driven in part by > > wide use of legacy packages like supermongo etc that don't have > > intelligent table readers, so users document the column names as a > > comment line. I think making this break might be unfortunate for > > users in astronomy. 
> > > > Dealing with commented header definitions is annoying. Not that it > > matters specifically for your genfromtext() proposal, but in the > > asciitable reader this case is handled with a particular reader class > > that expects the first comment line to contain the column definitions: > > > > http://cxc.harvard.edu/contrib/asciitable/#asciitable.CommentedHeader > > > > Cheers, > > Tom > > Tom, > > Thanks for this information. In thinking about how people would work > around this, I figured it would be fairly easy to discard a comment > character that occurred as the very first character in a file, e.g.: > > raw = StringIO(open('example.txt').read()[1:]) > data = numpy.genfromtxt(raw, comment='#', names=True) > > ?but I realize that making this change in many places would still be an > annoyance. > > I should perhaps also add that my view of 'proper' table formats is > partly influenced by another plotting package, namely pgfplots for LaTeX > (http://pgfplots.sourceforge.net/ , > http://pgfplots.sourceforge.net/gallery.html) which uses uncommented > headers. To the extent NumPy users are also LaTeX users, similar > semantics could be more friendly. > > Looking forward to more input from other users, > -- > Paul Natsuo Kishimoto > > SM candidate, Technology & Policy Program (2012) > Research assistant, http://globalchange.mit.edu > https://paul.kishimoto.name +1 617 302 6105 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From andyfaff at gmail.com Mon Jul 16 05:30:14 2012 From: andyfaff at gmail.com (Andrew Nelson) Date: Mon, 16 Jul 2012 19:30:14 +1000 Subject: [Numpy-discussion] swig + numpy + variable length arrays Message-ID: Dear list, I am trying to SWIG a C function with the signature: void abelescalcall(long numcoefs, double *coefP, long npoints , double *yP, double *xP); numcoefs corresponds to the number of points in the coefP array. npoints corresponds to the number of points in yP and xP arrays. coefP and xP are inputs (IN_ARRAY1), yP is an output (ARGOUT_ARRAY1). I have been trying to use the following input file: %module reflect %{ #define SWIG_FILE_WITH_INIT #include "myfitfunctions.h" %} %include "numpy.i" %init %{ import_array(); %} %apply (long DIM1, double* IN_ARRAY1){(long numcoefs, double* coefP)} %apply (long DIM1, double* ARGOUT_ARRAY1){(long len2, double* yP)} %apply (long DIM1, double* IN_ARRAY1){(long len3, double *xP)} %include "myfitfunctions.h" %rename (abelescalcall) my_abeles; %inline %{ my_abeles(long numcoefs, double* coefP, long len2, double* yP, long len3, double *xP) { if (len2 != len3) { PyErr_Format(PyExc_ValueError, "Arrays of lengths (%d,%d) given", len2, len3); return 0.0; } abelescalcall(numcoefs, coefP, len2, yP, xP); } %} However, if I look at the wrapper code, and if I try to call my_abeles, then I am asked for 6 parameters. I only want to supply two numpy arrays - coefP and xP. thanks for any help you are able to give. (I spent the whole afternoon trying to get this bloomin thing to work). cheers, Andrew. -- _____________________________________ Dr. 
Andrew Nelson _____________________________________ From mail at paul.kishimoto.name Mon Jul 16 08:13:54 2012 From: mail at paul.kishimoto.name (Paul Natsuo Kishimoto) Date: Mon, 16 Jul 2012 08:13:54 -0400 Subject: [Numpy-discussion] Proposed change in genfromtxt(..., comments='#', names=True) behaviour In-Reply-To: <35FEFC2B-E373-410D-971F-8A3AA3E0FCBF@continuum.io> References: <1342192511.19936.22.camel@esdceeprjpstudent1.mit.edu> <1342200571.19936.32.camel@esdceeprjpstudent1.mit.edu> <35FEFC2B-E373-410D-971F-8A3AA3E0FCBF@continuum.io> Message-ID: <1342440834.12313.26.camel@khaeru-desktop> Hi Pierre, On Mon, 2012-07-16 at 01:54 -0500, Travis Oliphant wrote: > On Jul 16, 2012, at 1:52 AM, Pierre GM wrote: > > > Hello, > > I'm siding w/ Tom, Nathaniel and Travis. I don't think the change as > > it is is advisable. It's a regression, and breaking=bad. > > Now, I can understand your frustration, so, what about a trade-off? > > The first line w/ a comment after the first 'skip_header' ones > > should be parsed for column titles (and we call it > > 'first_commented_line'). We split it along the comment character, > > say, #. If there's some non-space character before the #, we keep > > this part of 'first_commented_line' as titles: that should work for > > your case. If the first non-space character was #, then what comes > > after are the titles (that's Tom's case and the current default). > > I'm not looking forward to introducing yet another keyword, > > genfromtxt is enough of a mess as it is (unless we add a > > 'need_coffee' one). > > What y'all think? > That seems like an acceptable proposal --- it is consistent with > current behavior but also satisfies the use-case (without another > keyword which is a bonus). > So, > +1 from me. > -Travis > Thanks for jumping in, and for offering a compromise solution. I agree that genfromtxt() has too many kwargs?it took me several minutes of reading the docs to realize why it wasn't behaving as expected! To be ultra clear (since I want to code this), you are suggesting that 'first_commented_line' be a *new* accepted value for the kwarg 'names', to invoke the behaviour you suggest? --- If this IS what you mean, I'd counter-propose something in the same spirit, but a bit simpler?we let the kwarg 'skip_header' take some additional value, say int(0), int(-1), str('auto'), or True. In this case, instead of skipping a fixed number of lines, it will skip any number of consecutive empty OR commented lines; THEN apply the behaviour you describe. The semantics of this are more intuitive, because this is what I am really after: to *skip* a commented *header* of arbitrary length. So my four examples below could be parsed with: 1. genfromtxt(..., names=True) 2. genfromtxt(..., names=True, skip_header=True) 3. genfromtxt(..., names=True) 4. genfromtxt(..., names=True, skip_header=True) ?crucially #1 avoids the regression. Does this seem good to everyone? --- But if this is NOT what you mean, then what you say does not actually work with the simple use-case of my Example #2 below. The first commented line is "# here is a..." with # as the first non-space character, so the part after becomes the names 'here', 'is', 'a' etc. In short, the code can't resolve the ambiguity without some extra information from the user. 
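(For concreteness, the two central cases above can be reproduced in a few lines. The Example #1 result reflects the current behaviour that the astronomy files rely on; the Example #2 call is the one that goes wrong, so it is wrapped in a try/except rather than asserting exactly how it fails.)

from StringIO import StringIO
import numpy as np

# Example #1: header given only as a comment line -- works today.
ex1 = "# gender age weight\nM 21 72.1\nF 35 58.33\nM 33 21.99\n"
print(np.genfromtxt(StringIO(ex1), names=True, dtype=None).dtype.names)
# -> ('gender', 'age', 'weight')

# Example #2: general comments above an uncommented header row -- the
# names are taken from the first comment line instead of the
# 'gender age weight' row, which is what the proposal tries to address.
ex2 = ("# here is a general file comment\n"
       "# it is spread over multiple lines\n"
       "gender age weight\n"
       "M 21 72.1\nF 35 58.33\nM 33 21.99\n")
try:
    print(np.genfromtxt(StringIO(ex2), names=True, dtype=None).dtype.names)
except ValueError as err:
    print(err)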
> > > On Jul 13, 2012 7:29 PM, "Paul Natsuo Kishimoto" > > wrote: > > On Fri, 2012-07-13 at 12:13 -0400, Tom Aldcroft wrote: > > > On Fri, Jul 13, 2012 at 11:15 AM, Paul Natsuo Kishimoto > > > wrote: > > > > Hello everyone, > > > > > > > > I am a longtime NumPy user, and I just filed my > > first contribution to > > > > the code as pull request to fix what I felt was a bug in > > the behaviour > > > > of genfromtxt() https://github.com/numpy/numpy/pull/351 > > > > It turns out this alters existing behaviour that some > > people may depend > > > > on, so I was encouraged to raise the issue on this list > > to see what the > > > > consensus was. > > > > > > > > This behaviour happens in the specific situation where: > > > > * Comments are used in the file (the default > > comment character is > > > > '#', which I'll use here), AND > > > > * The kwarg names=True is given. In this case, > > genfromtxt() is > > > > supposed to read an initial row containing the > > names of the > > > > columns and return an array with a structured > > dtype. > > > > > > > > Currently, these options work with a file like (Example > > #1): > > > > > > > > # gender age weight > > > > M 21 72.100000 > > > > F 35 58.330000 > > > > M 33 21.99 > > > > > > > > ?but NOT with a file like (Example #2): > > > > > > > > # here is a general file comment > > > > # it is spread over multiple lines > > > > gender age weight > > > > M 21 72.100000 > > > > F 35 58.330000 > > > > M 33 21.99 > > > > > > > > ?genfromtxt() believes the column names are 'here', > > 'is', 'a', etc., and > > > > thinks all of the columns are strings because 'gender', > > 'age' and > > > > 'weight' are not numbers. > > > > > > > > This is because genfromtxt() (after skipping a > > number of lines as > > > > specified in the optional kwarg skip_header) will use > > the *first* line > > > > it encounters to produce column names. If that line > > contains a comment > > > > character, genfromtxt() discards everything *up to and > > including* the > > > > comment character, and tries to use the content *after* > > the comment > > > > character as headers (Example 3): > > > > > > > > gender age weight # wrong column names > > > > M 21 72.100000 > > > > F 35 58.330000 > > > > M 33 21.99 > > > > > > > > ?the resulting column names are 'wrong', 'column' and > > 'names'. > > > > > > > > My proposed change was that, if the first (or any > > subsequent) line > > > > contains a comment character, it should be treated as an > > *actual > > > > comment*, and discarded along with anything that follows > > it on the line. > > > > > > > > In Example 2, the result would be that the first > > two lines appear empty > > > > (no text before '#'), and the third line ("gender age > > weight") is used > > > > for column names. > > > > > > > > In Example 3, the result would be that "gender > > age weight" is used for > > > > column names while "# wrong column names" is ignored. > > > > > > > > BUT! > > > > > > > > In Example 1, the result would be that the first > > line appears empty, > > > > and "M 21 72.100000" are used for column names. > > > > > > > > In other words, this change would do away with the > > previous behaviour > > > > where the very first commented line was (magically?) > > treated not as a > > > > comment but instead as column headers. This might break > > some existing > > > > code. 
On the positive side, it would allow the user to > > be more liberal > > > > with the format of input files (Example 4): > > > > > > > > # here is a general file comment > > > > # the columns in this table are > > > > gender age weight # here is a comment on the > > header line > > > > # following this line are the data > > > > M 21 72.100000 > > > > F 35 58.330000 # here is a comment on a data > > line > > > > M 33 21.99 > > > > > > > > I feel that this is a better/more flexible behaviour for > > genfromtxt(), > > > > but?as stated?I am interested in your thoughts. > > > > > > > > Cheers, > > > > -- > > > > Paul Natsuo Kishimoto > > > > > > > > SM candidate, Technology & Policy Program (2012) > > > > Research assistant, http://globalchange.mit.edu > > > > https://paul.kishimoto.name +1 617 302 6105 > > > > > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at scipy.org > > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > Hi Paul, > > > > > > At least in astronomy tabular files with the column > > definitions in the > > > first commented line are reasonably common. This is > > driven in part by > > > wide use of legacy packages like supermongo etc that don't > > have > > > intelligent table readers, so users document the column > > names as a > > > comment line. I think making this break might be > > unfortunate for > > > users in astronomy. > > > > > > Dealing with commented header definitions is annoying. > > Not that it > > > matters specifically for your genfromtext() proposal, but > > in the > > > asciitable reader this case is handled with a particular > > reader class > > > that expects the first comment line to contain the column > > definitions: > > > > > > > > http://cxc.harvard.edu/contrib/asciitable/#asciitable.CommentedHeader > > > > > > Cheers, > > > Tom > > > > Tom, > > > > Thanks for this information. In thinking about how people > > would work > > around this, I figured it would be fairly easy to discard a > > comment > > character that occurred as the very first character in a > > file, e.g.: > > > > raw = StringIO(open('example.txt').read()[1:]) > > data = numpy.genfromtxt(raw, comment='#', > > names=True) > > > > ?but I realize that making this change in many places would > > still be an > > annoyance. > > > > I should perhaps also add that my view of 'proper' > > table formats is > > partly influenced by another plotting package, namely > > pgfplots for LaTeX > > (http://pgfplots.sourceforge.net/ , > > http://pgfplots.sourceforge.net/gallery.html) which uses > > uncommented > > headers. To the extent NumPy users are also LaTeX users, > > similar > > semantics could be more friendly. > > > > Looking forward to more input from other users, > > -- > > Paul Natsuo Kishimoto > > > > SM candidate, Technology & Policy Program (2012) > > Research assistant, http://globalchange.mit.edu > > https://paul.kishimoto.name +1 617 302 6105 -- Paul Natsuo Kishimoto SM candidate, Technology & Policy Program (2012) Research assistant, http://globalchange.mit.edu http://paul.kishimoto.name +1 617 302 6105 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part URL: From njs at pobox.com Mon Jul 16 09:03:58 2012 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 16 Jul 2012 14:03:58 +0100 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 In-Reply-To: References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> <1342393528.28368.3.camel@esdceeprjpstudent1.mit.edu> <4243560E-C63A-496F-9A10-9C1D78CEB27A@continuum.io> Message-ID: On Mon, Jul 16, 2012 at 3:52 AM, Fr?d?ric Bastien wrote: > Hi, > > there is a PR that I think could be merged before the relase: > > https://github.com/numpy/numpy/pull/326 > > It is the addition of the inplace_increment function. It seam good, > but I can't review it enough as it use many numpy internal that I > never used or looked at. But the tests seam to cover all cases and it > don't change current functions. So it should not have any side effect > problems. > > This was a feature frequently requested. There are a number of unresolved issues with that patch still -- see the comment thread on the PR. -N From njs at pobox.com Mon Jul 16 09:27:06 2012 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 16 Jul 2012 14:27:06 +0100 Subject: [Numpy-discussion] py3 datetime woes (was Re: Code Freeze for NumPy 1.7) Message-ID: On Sun, Jul 15, 2012 at 5:32 PM, Ralf Gommers wrote: > > > On Sun, Jul 15, 2012 at 5:57 PM, Nathaniel Smith wrote: >> >> On Sun, Jul 15, 2012 at 1:08 PM, Ralf Gommers >> wrote: >> > >> > >> > On Sun, Jul 15, 2012 at 12:45 AM, Travis Oliphant >> > wrote: >> >> >> >> >> >> Hey all, >> >> >> >> We are nearing a code-freeze for NumPy 1.7. Are there any last-minute >> >> changes people are wanting to push into NumPy 1.7? We should discuss >> >> them >> >> as soon as possible. >> >> >> >> I'm proposing a code-freeze at midnight UTC on July 18th (7:00pm CDT on >> >> July 17th). This will allow the creation of beta releases of NumPy on >> >> the >> >> 18th of July. This is a few days later than originally hoped for --- >> >> largely >> >> due to unexpected travel schedules of Ondrej and I, but it does give >> >> people >> >> a few more days to get patches in. Of course, we will be able to apply >> >> bug-fixes to the 1.7.x branch once the tag is made. >> > >> > >> > What about the tickets still open for 1.7.0 >> > (http://projects.scipy.org/numpy/report/3)? There are a few important >> > ones >> > left. >> > >> > These I would consider blockers: >> > - #2108 Datetime failures with MinGW >> >> Is there a description anywhere of what the problem actually is here? >> I looked at the ticket, which referred to a PR, and it's hard to work >> out from the PR discussion what the actual remaining test failures are >> -- and there definitely doesn't seem to be any description of the >> underlying problem. (Something about working 64-bit time_t on windows >> being difficult depending on the compiler used?) > > There's a lot more discussion on > http://projects.scipy.org/numpy/ticket/1909 > https://github.com/numpy/numpy/pull/156 > https://github.com/numpy/numpy/pull/161. > > The issue is that for MinGW 3.x some _s / _t functions seem to be missing. > And we don't yet support MinGW 4.x. 
> > Current issues can be seen from the last test log on our Windows XP buildbot > (June 29, > http://buildbot.scipy.org/builders/Windows_XP_x86/builds/1124/steps/shell_1/logs/stdio): > > ====================================================================== > ERROR: test_datetime_arange (test_datetime.TestDateTime) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", > line 1351, in test_datetime_arange > assert_raises(ValueError, np.arange, np.datetime64('today'), > OSError: Failed to use '_localtime64_s' to convert to a local time ...I don't understand how this is even building if the functions are missing. Well, anyway. MSVC provides both _s and regular versions of localtime and friends. Mingw only provides the regular version. It looks like there are two difference between the regular and _s versions of these functions: 1) The _s version reports errors by calling the "invalid parameter handler" instead of just returning an error code. (I.e., by default, if you try to work with a date that's out of range, then your program will just abort(). Very friendly.)[1] 2) MSVC whines if you use the regular version Basically AFAICT this is an interface designed to make it easier for lousy programmers to write broken code that will still limp along more successfully than it otherwise would. There's a legitimate role for such interfaces, but I'm not sure numpy is it -- I think we can write actual error handling code? The classic localtime() interface AFAICT works perfectly -- it's even threadsafe, unlike most unix versions [2]. I think the best solution here is to just switch to using localtime() everywhere, and tell MSVC to shut up by using a #pragma. Here's an example of how to do this taken from the Boost developer guidelines [3]: #if defined(_MSC_VER) #pragma warning(push) // Save warning settings. #pragma warning(disable : 4996) // Disable deprecated localtime/gmtime warning. #endif ... #if defined(_MSC_VER) #pragma warning(pop) // Restore warnings to previous state. #endif (Yes, Boost's canonical example of how you control warnings on MSVC is specifically for letting you use localtime() in piece -- I didn't add that. Of course this could be hidden in a macro or helper function... npy_localtime() or something.) Or alternatively one can just #define _CRT_SECURE_NO_DEPRECATE to kill this warning in general. The Boost developer's general advice on this particular warning is "Unless you strongly believe that the 'secure' versions are useful, suppress."[3] [1] http://msdn.microsoft.com/en-us/library/a442x3ye%28v=vs.80%29.aspx , and note the "community content" at the bottom. 
[2] http://old.nabble.com/Re%3A-Need-localtime_s-p13088250.html [3] https://svn.boost.org/trac/boost/wiki/Guidelines/WarningsGuidelines -n From pgmdevlist at gmail.com Mon Jul 16 10:12:56 2012 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 16 Jul 2012 16:12:56 +0200 Subject: [Numpy-discussion] Proposed change in genfromtxt(..., comments='#', names=True) behaviour In-Reply-To: <1342440834.12313.26.camel@khaeru-desktop> References: <1342192511.19936.22.camel@esdceeprjpstudent1.mit.edu> <1342200571.19936.32.camel@esdceeprjpstudent1.mit.edu> <35FEFC2B-E373-410D-971F-8A3AA3E0FCBF@continuum.io> <1342440834.12313.26.camel@khaeru-desktop> Message-ID: To be ultra clear (since I want to code this), you are suggesting that 'first_commented_line' be a *new* accepted value for the kwarg 'names', to invoke the behaviour you suggest? Nope, I was just referring to some hypothetical variable name. I meant that: first_values = None try: while not first_values: first_line = fhd.next() if names is True: parsed = [m for m in first_line.split(comments) if m.strip()] if parsed: first_value = split_line(parsed[0]) else: ... (it's not tested, I'm writing it as it comes. And I didn't even use the `first_commented_line` name, sorry) If this IS what you mean, I'd counter-propose something in the same spirit, but a bit simpler?we let the kwarg 'skip_header' take some additional value, say int(0), int(-1), str('auto'), or True. In this case, instead of skipping a fixed number of lines, it will skip any number of consecutive empty OR commented lines; I really like the idea of having `skip_header=-1` skip all the empty or commented lines (that is, lines whose first non-space character is the `comments` character). That'd be rather convenient. The semantics of this are more intuitive, because this is what I am really after: to *skip* a commented *header* of arbitrary length. So my four examples below could be parsed with: 1. genfromtxt(..., names=True) 2. genfromtxt(..., names=True, skip_header=True) 3. genfromtxt(..., names=True) 4. genfromtxt(..., names=True, skip_header=True) ?crucially #1 avoids the regression. Does this seem good to everyone? Sounds good w/ `skip_header=-1` But if this is NOT what you mean, then what you say does not actually work with the simple use-case of my Example #2 below. The first commented line is "# here is a..." with # as the first non-space character, so the part after becomes the names 'here', 'is', 'a' etc. In that case, you could always use `skip_header=2` In short, the code can't resolve the ambiguity without some extra information from the user. It's always best not to let the code guess too much anyway... Well, no regression, and you have a nice plan. I'm for it. Anybody else? -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Jul 16 12:28:19 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 16 Jul 2012 10:28:19 -0600 Subject: [Numpy-discussion] Lazy imports again Message-ID: Hi All, Working lazy imports would be useful to have. Ralf is opposed to the idea because it caused all sorts of problems on different platforms when it was tried in scipy. I thought I'd open the topic for discussion so that folks who had various problems/solutions could offer input and the common experience could be collected in one place. Perhaps there is a solution that actually works. Ideas? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gael.varoquaux at normalesup.org Mon Jul 16 12:30:19 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 16 Jul 2012 18:30:19 +0200 Subject: [Numpy-discussion] Lazy imports again In-Reply-To: References: Message-ID: <20120716163019.GB25591@phare.normalesup.org> On Mon, Jul 16, 2012 at 10:28:19AM -0600, Charles R Harris wrote: > Working lazy imports would be useful to have. Ralf is opposed to the idea > because it caused all sorts of problems on different platforms when it was > tried in scipy. I thought I'd open the topic for discussion so that folks > who had various problems/solutions could offer input and the common > experience could be collected in one place. Perhaps there is a solution > that actually works. > Ideas? I feel like Ralf: this kind of magic always ends up costing more than the benefits. G From chris.barker at noaa.gov Mon Jul 16 12:48:42 2012 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 16 Jul 2012 09:48:42 -0700 Subject: [Numpy-discussion] Istalling Numpy and Scipy on preinstalled Python 2.6 on Mac In-Reply-To: <1F5DA083EC6E4B08B5F824444AF9F7DA@gmail.com> References: <1F5DA083EC6E4B08B5F824444AF9F7DA@gmail.com> Message-ID: On Sun, Jul 15, 2012 at 2:51 PM, Pierre GM wrote: > A basic warning, though: you > don't want to overwrite Mac OS X's own numpy, ... Which is one of the reasons many of us recommend installing the pyton,org python and leaving Apple's alone. And why the standard numpy/scipy binaries are built for the python.org pythons. Do you really NEED to use Apple's Python? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From tmp50 at ukr.net Mon Jul 16 13:35:09 2012 From: tmp50 at ukr.net (Dmitrey) Date: Mon, 16 Jul 2012 20:35:09 +0300 Subject: [Numpy-discussion] routine for linear least norms problems with specifiable accuracy Message-ID: <79633.1342460109.14890437708640157696@ffe16.ukr.net> hi all, I have wrote a routine to solve dense / sparse problems min {alpha1*||A1 x - b1||_1 + alpha2*||A2 x - b2||^2 + beta1 * ||x||_1 + beta2 * ||x||^2} with specifiable accuracy fTol > 0: abs(f-f*) <= fTol (this parameter is handled by solvers gsubg and maybe amsg2p, latter requires known good enough fOpt estimation). Constraints (box-bound, linear, quadratic) also could be easily connected. This problem is very often encountered in many areas, e.g. machine learning, sparse approximation, see for example http://scikit-learn.org/stable/modules/ ? lastic-net First of all solver large-scale gsubg is recommended. Some hand-tuning of its parameters also could essentially speedup the solver. Also you could be interested in other OpenOpt NSP solvers - ralg and amsg2p (they are medium-scaled although). You can see the source of the routine and its demo result here. You shouldn't expect gsubg will always solve your problem and inform of obtained result with specifiable accuracy - for some very difficult, e.g. extremely ill-conditioned problems it may * fail to solve QP subproblem (default QP solver is cvxopt, you may involve another one, e.g. commercial or free-for-educational cplex) * exit with another stop criterion, e.g. 
maxIter has been reached, or maxShoots have been exceeded (usually latter means you have reached solution, but it cannot be guaranteed in the case) First of all I have created the routine to demonstrate gsubg abilities; I haven't decided yet commit or not commit the routine to OpenOpt, with or without special class for this problem; in either case you can very easily create problems like this one in FuncDesigner (without having to write a routine for derivatives) to solve them by gsubg or another NSP solver; however, IIRC FuncDesigner dot() doesn't work with sparse matrices yet -------------- next part -------------- An HTML attachment was scrubbed... URL: From heng at cantab.net Mon Jul 16 14:47:38 2012 From: heng at cantab.net (Henry Gomersall) Date: Mon, 16 Jul 2012 19:47:38 +0100 Subject: [Numpy-discussion] routine for linear least norms problems with specifiable accuracy In-Reply-To: <79633.1342460109.14890437708640157696@ffe16.ukr.net> References: <79633.1342460109.14890437708640157696@ffe16.ukr.net> Message-ID: <1342464458.11238.0.camel@farnsworth> On Mon, 2012-07-16 at 20:35 +0300, Dmitrey wrote: > I have wrote a routine to solve dense / sparse problems > min {alpha1*||A1 x - b1||_1 + alpha2*||A2 x - b2||^2 + beta1 * ||x||_1 > + beta2 * ||x||^2} > with specifiable accuracy fTol > 0: abs(f-f*) <= fTol (this parameter > is handled by solvers gsubg and maybe amsg2p, latter requires known > good enough fOpt estimation). Constraints (box-bound, linear, > quadratic) also could be easily connected. > Interesting. What algorithm are you using? Henry From mail at paul.kishimoto.name Mon Jul 16 15:06:15 2012 From: mail at paul.kishimoto.name (Paul Natsuo Kishimoto) Date: Mon, 16 Jul 2012 15:06:15 -0400 Subject: [Numpy-discussion] Proposed change in genfromtxt(..., comments='#', names=True) behaviour In-Reply-To: References: <1342192511.19936.22.camel@esdceeprjpstudent1.mit.edu> <1342200571.19936.32.camel@esdceeprjpstudent1.mit.edu> <35FEFC2B-E373-410D-971F-8A3AA3E0FCBF@continuum.io> <1342440834.12313.26.camel@khaeru-desktop> Message-ID: <1342465575.14478.15.camel@esdceeprjpstudent1.mit.edu> I've implemented this feature with skip_header=-1 as suggested by Pierre, and in doing so removed the regression. TravisBot seems to like it: https://github.com/numpy/numpy/pull/351 On Mon, 2012-07-16 at 16:12 +0200, Pierre GM wrote: > To be ultra clear (since I want to code this), you are > suggesting that > 'first_commented_line' be a *new* accepted value for the kwarg > 'names', to invoke the behaviour you suggest? > > > > Nope, I was just referring to some hypothetical variable name. I meant > that: > > first_values = None > try: > while not first_values: > first_line = fhd.next() > if names is True: > parsed = [m for m in first_line.split(comments) if > m.strip()] > if parsed: > first_value = split_line(parsed[0]) > else: > ... > > (it's not tested, I'm writing it as it comes. And I didn't even use > the `first_commented_line` name, sorry) > > > If this IS what you mean, I'd counter-propose something in the > same spirit, but a bit simpler?we let the kwarg 'skip_header' > take some additional value, say int(0), int(-1), str('auto'), > or True. > > > > > In this case, instead of skipping a fixed number of lines, it > will skip any number of consecutive empty OR commented lines; > > > > > I really like the idea of having `skip_header=-1` skip all the empty > or commented lines (that is, lines whose first non-space character is > the `comments` character). That'd be rather convenient. 
> > > > > The semantics of this are more intuitive, because this is what > I am > really after: to *skip* a commented *header* of arbitrary > length. So my four examples below could be parsed with: > > 1. genfromtxt(..., names=True) > 2. genfromtxt(..., names=True, skip_header=True) > 3. genfromtxt(..., names=True) > 4. genfromtxt(..., names=True, skip_header=True) > > ?crucially #1 avoids the regression. > > > Does this seem good to everyone? > > > > > Sounds good w/ `skip_header=-1` > > > But if this is NOT what you mean, then what you say does not > actually work with the simple use-case of my Example #2 below. > The first commented line is "# here is a..." with # as the > first non-space character, so the part after becomes the names > 'here', 'is', 'a' etc. > > > > > In that case, you could always use `skip_header=2` > > In short, the code can't resolve the ambiguity without some > extra > information from the user. > > > > > It's always best not to let the code guess too much anyway... > > Well, no regression, and you have a nice plan. I'm for it. > Anybody else? > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Paul Natsuo Kishimoto SM candidate, Technology & Policy Program (2012) Research assistant, http://globalchange.mit.edu https://paul.kishimoto.name +1 617 302 6105 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part URL: From ralf.gommers at googlemail.com Mon Jul 16 15:29:36 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 16 Jul 2012 21:29:36 +0200 Subject: [Numpy-discussion] Help building NumPy on Windows In-Reply-To: References: Message-ID: On Sun, Jul 15, 2012 at 10:17 PM, Chris Ball wrote: > Hi, > > I'm having some trouble building numpy on a 64-bit Windows 7 machine. I'm > probably accidentally missing a step following the build process described > at > http://scipy.org/Installing_SciPy/Windows; it would be great if someone > could > spot what! > It complains about not being able to find msvcr90.dll. From some googling that regularly seems to happen for the 64-bit version of MinGW, but not for the 32-bit version. Is that dll actually present on your system? > > Here's what I did: > 1. installed python 2.7 from python.org > 2. installed mingw32 from the link above (results in gcc 3.4.5) > 3. added c:\mingw\bin to front of path > 4. ran "python setup.py build --compiler=mingw32 install > --prefix=numpy-install" > > I got the following error: > [...] > C:\Users\ceball\npslavetest\Windows7_x64_mingw32_3.4.5- > py27\build\numpy\distutils\system_info.py:1409: UserWarning: > Lapack (http://www.netlib.org/lapack/) sources not found. > Directories to search for the sources can be specified in the > numpy/distutils/site.cfg file (section [lapack_src]) or by setting > the LAPACK_SRC environment variable. 
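(As a quick diagnostic on the affected machine -- only a sketch, and Windows-only -- Python itself can report which MSVC runtime it was built against; the MinGW linker then still needs a matching import library such as libmsvcr90.a, or the DLL itself, somewhere on its own library search path for -lmsvcr90 to resolve.)

import ctypes.util

# For the python.org 2.7 installers this typically prints 'msvcr90.dll'.
print(ctypes.util.find_msvcrt())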
> warnings.warn(LapackSrcNotFoundError.__doc__) > customize GnuFCompiler > gnu: no Fortran 90 compiler found > gnu: no Fortran 90 compiler found > customize GnuFCompiler using config > C compiler: gcc -mno-cygwin -O2 -Wall -Wstrict-prototypes > > compile options: '-DNPY_MINGW_USE_CUSTOM_MSVCR -D__MSVCRT_VERSION__=0x0900 > - > Inumpy\core\src\private -Inumpy\core\src -Inumpy\core > -Inumpy\core\src\npymath - > Inumpy\core\src\multiarray -Inumpy\core\src\umath -Inumpy\core\src\npysort > - > Inumpy\core\include -Ic:\Python27\include -Ic:\Python27\PC -c' > gcc -mno-cygwin -O2 -Wall -Wstrict-prototypes -DNPY_MINGW_USE_CUSTOM_MSVCR > - > D__MSVCRT_VERSION__=0x0900 -Inumpy\core\src\private -Inumpy\core\src - > Inumpy\core -Inumpy\core\src\npymath -Inumpy\core\src\multiarray - > Inumpy\core\src\umath -Inumpy\core\src\npysort -Inumpy\core\include - > Ic:\Python27\include -Ic:\Python27\PC -c _configtest.c -o _configtest.o > Found executable c:\mingw\bin\gcc.exe > g++ -mno-cygwin _configtest.o -lmsvcr90 -o _configtest.exe > Found executable c:\mingw\bin\g++.exe > c:\mingw\bin\..\lib\gcc\mingw32\3.4.5\..\..\..\..\mingw32\bin\ld.exe: > cannot > find -lmsvcr90 > collect2: ld returned 1 exit status > failure. > removing: _configtest.exe.manifest _configtest.c _configtest.o > Traceback (most recent call last): > File "setup.py", line 214, in > setup_package() > File "setup.py", line 207, in setup_package > configuration=configuration ) > File "C:\Users\ceball\npslavetest\Windows7_x64_mingw32_3.4.5- > py27\build\numpy\distutils\core.py", line 186, in setup > return old_setup(**new_attr) > File "c:\Python27\lib\distutils\core.py", line 152, in setup > dist.run_commands() > File "c:\Python27\lib\distutils\dist.py", line 953, in run_commands > self.run_command(cmd) > File "c:\Python27\lib\distutils\dist.py", line 972, in run_command > cmd_obj.run() > File "C:\Users\ceball\npslavetest\Windows7_x64_mingw32_3.4.5- > py27\build\numpy\distutils\command\build.py", line 37, in run > old_build.run(self) > File "c:\Python27\lib\distutils\command\build.py", line 127, in run > self.run_command(cmd_name) > File "c:\Python27\lib\distutils\cmd.py", line 326, in run_command > self.distribution.run_command(command) > File "c:\Python27\lib\distutils\dist.py", line 972, in run_command > cmd_obj.run() > File "C:\Users\ceball\npslavetest\Windows7_x64_mingw32_3.4.5- > py27\build\numpy\distutils\command\build_src.py", line 152, in run > self.build_sources() > File "C:\Users\ceball\npslavetest\Windows7_x64_mingw32_3.4.5- > py27\build\numpy\distutils\command\build_src.py", line 163, in > build_sources > self.build_library_sources(*libname_info) > File "C:\Users\ceball\npslavetest\Windows7_x64_mingw32_3.4.5- > py27\build\numpy\distutils\command\build_src.py", line 298, in > build_library_sources > sources = self.generate_sources(sources, (lib_name, build_info)) > File "C:\Users\ceball\npslavetest\Windows7_x64_mingw32_3.4.5- > py27\build\numpy\distutils\command\build_src.py", line 385, in > generate_sources > source = func(extension, build_dir) > File "numpy\core\setup.py", line 648, in get_mathlib_info > raise RuntimeError("Broken toolchain: cannot link a simple C program") > RuntimeError: Broken toolchain: cannot link a simple C program > > > Can anyone see what I've missed from this? > > > I'm not sure what the situation is for supporting building with Microsoft > compilers, but on another Windows 7 machine (also 64 bits) with MS > Visual Studio 9.0 installed, the build completes but I get test failures. 
> You can see the full output of this here: > https://jenkins.shiningpanda.com/scipy/job/NumPy-Windows7_x64- > py27/26/consoleFull > The few errors due to invalid values are nothing to worry about, although they should be fixed. In released versions those are not errors but warnings. It's likely that those values are correct and the warning should simply be silenced (disclaimer: I didn't look at this in detail). That leaves one error: ====================================================================== ERROR: test_not_closing_opened_fid (test_io.TestSavezLoad) ---------------------------------------------------------------------- Traceback (most recent call last): File "S:\Users\slave\Jenkins\workspace\NumPy-Windows7_x64-py27\numpy-install\Lib\site-packages\numpy\lib\tests\test_io.py", line 190, in test_not_closing_opened_fid os.remove(tmp) WindowsError: [Error 32] The process cannot access the file because it is being used by another process: 'c:\\users\\slave\\appdata\\local\\temp\\tmp46nhch.npz' Looks odd at first sight. Is it repeatable? Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From tmp50 at ukr.net Mon Jul 16 15:36:42 2012 From: tmp50 at ukr.net (Dmitrey) Date: Mon, 16 Jul 2012 22:36:42 +0300 Subject: [Numpy-discussion] routine for linear least norms problems with specifiable accuracy In-Reply-To: <1342464458.11238.0.camel@farnsworth> References: <1342464458.11238.0.camel@farnsworth> <79633.1342460109.14890437708640157696@ffe16.ukr.net> Message-ID: <11828.1342467402.1956568605669523456@ffe12.ukr.net> gsubg uses N.Zhurbenko ( http://openopt.org/NikolayZhurbenko ) epsilon-subgradient method ralg and amsg2p use other algorithms --- ???????? ????????? --- ?? ????: "Henry Gomersall" ????: "Discussion of Numerical Python" ????: 16 ???? 2012, 21:47:47 ????: Re: [Numpy-discussion] routine for linear least norms problems with specifiable accuracy > On Mon, 2012-07-16 at 20:35 +0300, Dmitrey wrote: > I have wrote a routine to solve dense / sparse problems > min {alpha1*||A1 x - b1||_1 + alpha2*||A2 x - b2||^2 + beta1 * ||x||_1 > + beta2 * ||x||^2} > with specifiable accuracy fTol > 0: abs(f-f*) <= fTol (this parameter > is handled by solvers gsubg and maybe amsg2p, latter requires known > good enough fOpt estimation). Constraints (box-bound, linear, > quadratic) also could be easily connected. > Interesting. What algorithm are you using? Henry _______________________________________________ NumPy-Discussion mailing listNumPy-Discussion at scipy.orghttp://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Mon Jul 16 15:48:22 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 16 Jul 2012 21:48:22 +0200 Subject: [Numpy-discussion] Possible test failure on Debian 6 with Python 2.6 In-Reply-To: References: Message-ID: On Sun, Jul 15, 2012 at 11:00 PM, Chris Ball wrote: > Hi, > > I'm trying to set up various build machines. Some of these are with > ShiningPanda.com, which provides a 64-bit Debian 6 machine (as well as > Windows > 7). This machine has multiple versions of Python installed. > > Using the build procedure below, I see a test failure with Python 2.6 (and > 2.7) > but not 2.5 (and 2.4). Any ideas? I guess other people are testing on > Debian 6, > and if they don't see the test failure with Python 2.6 maybe there's > something > unusual about the setup of this machine. 
> > Build procedure: > > $ python setup.py build install --prefix=./numpy-install > $ cd numpy-install > $ export PYTHONPATH=/path/to/numpy-install/lib/python2.6/site-packages > $ python ../tools/test-installed-numpy.py --coverage -- --with-xunit > > Test error: > > ====================================================================== > ERROR: test_multiarray.TestNewBufferProtocol.test_roundtrip > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > > "/home/slave/jenkins/shiningpanda/jobs/51551925/virtualenvs/d41d8cd9/lib/python2 > .6/site-packages/nose/case.py", line 197, in runTest > self.test(*self.arg) > File "/home/slave/jenkins/workspace/NumPy-Debian6_x64-py26/numpy- > install/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py", > line > 2537, in test_roundtrip > assert_raises(ValueError, self._check_roundtrip, x) > File "/home/slave/jenkins/workspace/NumPy-Debian6_x64-py26/numpy- > install/lib/python2.6/site-packages/numpy/testing/utils.py", line 1018, in > assert_raises > return nose.tools.assert_raises(*args,**kwargs) > File "/sp/lib/python/cpython-2.6/lib/python2.6/unittest.py", line 336, in > failUnlessRaises > callableObj(*args, **kwargs) > File "/home/slave/jenkins/workspace/NumPy-Debian6_x64-py26/numpy- > install/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py", > line > 2471, in _check_roundtrip > y = np.asarray(x) > RuntimeWarning: tp_compare didn't return -1 or -2 for exception > > Perhaps related to http://projects.scipy.org/numpy/ticket/1605 Was also an issue for pandas: https://github.com/pydata/pandas/issues/1546 Ralf > [...] > > Ran 4781 tests in 68.179s > > FAILED (KNOWNFAIL=4, SKIP=6, errors=1) > > > The full build and test output is here: > > https://jenkins.shiningpanda.com/scipy/job/NumPy-Debian6_x64-py26/8/consoleFull > > For comparison, with Python 2.5: > > https://jenkins.shiningpanda.com/scipy/job/NumPy-Debian6_x64-py25/9/consoleFull > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Jul 16 15:56:35 2012 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 16 Jul 2012 20:56:35 +0100 Subject: [Numpy-discussion] Proposed change in genfromtxt(..., comments='#', names=True) behaviour In-Reply-To: <1342465575.14478.15.camel@esdceeprjpstudent1.mit.edu> References: <1342192511.19936.22.camel@esdceeprjpstudent1.mit.edu> <1342200571.19936.32.camel@esdceeprjpstudent1.mit.edu> <35FEFC2B-E373-410D-971F-8A3AA3E0FCBF@continuum.io> <1342440834.12313.26.camel@khaeru-desktop> <1342465575.14478.15.camel@esdceeprjpstudent1.mit.edu> Message-ID: On Mon, Jul 16, 2012 at 8:06 PM, Paul Natsuo Kishimoto wrote: > I've implemented this feature with skip_header=-1 as suggested by > Pierre, and in doing so removed the regression. TravisBot seems to like > it: https://github.com/numpy/numpy/pull/351 Can we please not use weird magic values like this? This isn't C. skip_header="comments" would work just as well and be far more self-explanatory... 
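(Whatever the spelling, the behaviour being toggled is easy to state in plain Python. A rough sketch, not NumPy code: drop leading blank or all-comment lines before the name row is read.)

def drop_leading_comments(lines, comments='#'):
    # Skip lines that are empty or start with the comment character;
    # return the remainder, starting with the first "real" line.
    lines = iter(lines)
    for line in lines:
        stripped = line.strip()
        if stripped and not stripped.startswith(comments):
            return [line] + list(lines)
    return []

print(drop_leading_comments(
    ["# a general comment\n", "# another one\n",
     "gender age weight\n", "M 21 72.1\n"]))
# -> ['gender age weight\n', 'M 21 72.1\n']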
-n From aldcroft at head.cfa.harvard.edu Mon Jul 16 16:00:32 2012 From: aldcroft at head.cfa.harvard.edu (Tom Aldcroft) Date: Mon, 16 Jul 2012 16:00:32 -0400 Subject: [Numpy-discussion] Proposed change in genfromtxt(..., comments='#', names=True) behaviour In-Reply-To: <1342465575.14478.15.camel@esdceeprjpstudent1.mit.edu> References: <1342192511.19936.22.camel@esdceeprjpstudent1.mit.edu> <1342200571.19936.32.camel@esdceeprjpstudent1.mit.edu> <35FEFC2B-E373-410D-971F-8A3AA3E0FCBF@continuum.io> <1342440834.12313.26.camel@khaeru-desktop> <1342465575.14478.15.camel@esdceeprjpstudent1.mit.edu> Message-ID: On Mon, Jul 16, 2012 at 3:06 PM, Paul Natsuo Kishimoto wrote: > I've implemented this feature with skip_header=-1 as suggested by > Pierre, and in doing so removed the regression. TravisBot seems to like > it: https://github.com/numpy/numpy/pull/351 > > On Mon, 2012-07-16 at 16:12 +0200, Pierre GM wrote: >> To be ultra clear (since I want to code this), you are >> suggesting that >> 'first_commented_line' be a *new* accepted value for the kwarg >> 'names', to invoke the behaviour you suggest? >> >> >> >> Nope, I was just referring to some hypothetical variable name. I meant >> that: >> >> first_values = None >> try: >> while not first_values: >> first_line = fhd.next() >> if names is True: >> parsed = [m for m in first_line.split(comments) if >> m.strip()] >> if parsed: >> first_value = split_line(parsed[0]) >> else: >> ... >> >> (it's not tested, I'm writing it as it comes. And I didn't even use >> the `first_commented_line` name, sorry) >> >> >> If this IS what you mean, I'd counter-propose something in the >> same spirit, but a bit simpler?we let the kwarg 'skip_header' >> take some additional value, say int(0), int(-1), str('auto'), >> or True. >> >> >> >> >> In this case, instead of skipping a fixed number of lines, it >> will skip any number of consecutive empty OR commented lines; >> >> >> >> >> I really like the idea of having `skip_header=-1` skip all the empty >> or commented lines (that is, lines whose first non-space character is >> the `comments` character). That'd be rather convenient. >> >> >> >> >> The semantics of this are more intuitive, because this is what >> I am >> really after: to *skip* a commented *header* of arbitrary >> length. So my four examples below could be parsed with: >> >> 1. genfromtxt(..., names=True) >> 2. genfromtxt(..., names=True, skip_header=True) >> 3. genfromtxt(..., names=True) >> 4. genfromtxt(..., names=True, skip_header=True) >> >> ?crucially #1 avoids the regression. >> >> >> Does this seem good to everyone? >> >> >> >> >> Sounds good w/ `skip_header=-1` >> >> >> But if this is NOT what you mean, then what you say does not >> actually work with the simple use-case of my Example #2 below. >> The first commented line is "# here is a..." with # as the >> first non-space character, so the part after becomes the names >> 'here', 'is', 'a' etc. >> >> >> >> >> In that case, you could always use `skip_header=2` >> >> In short, the code can't resolve the ambiguity without some >> extra >> information from the user. >> >> >> >> >> It's always best not to let the code guess too much anyway... >> >> Well, no regression, and you have a nice plan. I'm for it. >> Anybody else? 
>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- > Paul Natsuo Kishimoto > > SM candidate, Technology & Policy Program (2012) > Research assistant, http://globalchange.mit.edu > https://paul.kishimoto.name +1 617 302 6105 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > I think that the proposed solution is OK, but it does make it even trickier for the average user to predict the behavior of genfromtxt() for different situations. Perhaps as part of this pull request Paul should also update the documentation to include a section describing this behavior and usage with examples 1 to 4. - Tom From pgmdevlist at gmail.com Mon Jul 16 16:01:42 2012 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 16 Jul 2012 22:01:42 +0200 Subject: [Numpy-discussion] =?utf-8?q?Proposed_change_in_genfromtxt=28=2E?= =?utf-8?b?Li4sIGNvbW1lbnRzPScjJywgbmFtZXM9VHJ1ZSkgYmVoYXZpb3Vy?= In-Reply-To: References: <1342192511.19936.22.camel@esdceeprjpstudent1.mit.edu> <1342200571.19936.32.camel@esdceeprjpstudent1.mit.edu> <35FEFC2B-E373-410D-971F-8A3AA3E0FCBF@continuum.io> <1342440834.12313.26.camel@khaeru-desktop> <1342465575.14478.15.camel@esdceeprjpstudent1.mit.edu> Message-ID: Well, as `skip_header` is a number of lines, I don't really see anything particular magical about a `skip_header=-1`. Plus, range(-1) == [], while range("comments") raises a TypeError. And then you'd have to figure why the exception was raised. -- Pierre GM On Monday, July 16, 2012 at 21:56 , Nathaniel Smith wrote: > On Mon, Jul 16, 2012 at 8:06 PM, Paul Natsuo Kishimoto > wrote: > > I've implemented this feature with skip_header=-1 as suggested by > > Pierre, and in doing so removed the regression. TravisBot seems to like > > it: https://github.com/numpy/numpy/pull/351 > > > > > Can we please not use weird magic values like this? This isn't C. > skip_header="comments" would work just as well and be far more > self-explanatory... > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org (mailto:NumPy-Discussion at scipy.org) > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Jul 16 16:14:48 2012 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 16 Jul 2012 21:14:48 +0100 Subject: [Numpy-discussion] Proposed change in genfromtxt(..., comments='#', names=True) behaviour In-Reply-To: References: <1342192511.19936.22.camel@esdceeprjpstudent1.mit.edu> <1342200571.19936.32.camel@esdceeprjpstudent1.mit.edu> <35FEFC2B-E373-410D-971F-8A3AA3E0FCBF@continuum.io> <1342440834.12313.26.camel@khaeru-desktop> <1342465575.14478.15.camel@esdceeprjpstudent1.mit.edu> Message-ID: On Mon, Jul 16, 2012 at 9:01 PM, Pierre GM wrote: > Well, as `skip_header` is a number of lines, I don't really see anything > particular magical about a `skip_header=-1`. The logic here is: - if names=True, then genfromtext expects the names to be given in the first line, and they may or may not be commented out - BUT, if skip_header=, then any all-comment lines will be skipped before looking for names, i.e. the names are not expected to be commented out and comments are given their original meaning again. 
I have no idea how one could derive this understanding by looking at skip_header=-1. "Ah, that -1 is a number of lines, everyone knows that skipping -1 lines is equivalent to toggling our expectation for whether the column names will appear inside a comment"? The API is pretty convoluted at this point and I'm not convinced we wouldn't be better off with adding a new argument like names_in_comment=False/True, but skip_header="comments" at least gives the reader a fighting chance... -n From pgmdevlist at gmail.com Mon Jul 16 16:15:53 2012 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 16 Jul 2012 22:15:53 +0200 Subject: [Numpy-discussion] =?utf-8?q?Proposed_change_in_genfromtxt=28=2E?= =?utf-8?b?Li4sIGNvbW1lbnRzPScjJywgbmFtZXM9VHJ1ZSkgYmVoYXZpb3Vy?= In-Reply-To: References: <1342192511.19936.22.camel@esdceeprjpstudent1.mit.edu> <1342200571.19936.32.camel@esdceeprjpstudent1.mit.edu> <35FEFC2B-E373-410D-971F-8A3AA3E0FCBF@continuum.io> <1342440834.12313.26.camel@khaeru-desktop> <1342465575.14478.15.camel@esdceeprjpstudent1.mit.edu> Message-ID: Tom, I agree that the documentation should be updated (both the doctoring and the relevant parts of the user manual), and specific unit-tests added. Paul, that's a direct nudge ;) (I'm sure you don't mind). I was also considering the weird case >>> first_line = "# A B C #1 #2 #3" How many columns in that case ? 6 ? 3 ? So, instead of using a `split`, maybe we should just check >>> index=first_line.index(comment) and take `first_line[:index]` (or `first_line[index+1:]` after depending on the case). But then again, it's a weird case. -- Pierre GM On Monday, July 16, 2012 at 22:00 , Tom Aldcroft wrote: > On Mon, Jul 16, 2012 at 3:06 PM, Paul Natsuo Kishimoto > wrote: > > I've implemented this feature with skip_header=-1 as suggested by > > Pierre, and in doing so removed the regression. TravisBot seems to like > > it: https://github.com/numpy/numpy/pull/351 > > > > On Mon, 2012-07-16 at 16:12 +0200, Pierre GM wrote: > > > To be ultra clear (since I want to code this), you are > > > suggesting that > > > 'first_commented_line' be a *new* accepted value for the kwarg > > > 'names', to invoke the behaviour you suggest? > > > > > > > > > > > > Nope, I was just referring to some hypothetical variable name. I meant > > > that: > > > > > > first_values = None > > > try: > > > while not first_values: > > > first_line = fhd.next() > > > if names is True: > > > parsed = [m for m in first_line.split(comments) if > > > m.strip()] > > > if parsed: > > > first_value = split_line(parsed[0]) > > > else: > > > ... > > > > > > (it's not tested, I'm writing it as it comes. And I didn't even use > > > the `first_commented_line` name, sorry) > > > > > > > > > If this IS what you mean, I'd counter-propose something in the > > > same spirit, but a bit simpler?we let the kwarg 'skip_header' > > > take some additional value, say int(0), int(-1), str('auto'), > > > or True. > > > > > > > > > > > > > > > In this case, instead of skipping a fixed number of lines, it > > > will skip any number of consecutive empty OR commented lines; > > > > > > > > > > > > > > > I really like the idea of having `skip_header=-1` skip all the empty > > > or commented lines (that is, lines whose first non-space character is > > > the `comments` character). That'd be rather convenient. > > > > > > > > > > > > > > > The semantics of this are more intuitive, because this is what > > > I am > > > really after: to *skip* a commented *header* of arbitrary > > > length. 
So my four examples below could be parsed with: > > > > > > 1. genfromtxt(..., names=True) > > > 2. genfromtxt(..., names=True, skip_header=True) > > > 3. genfromtxt(..., names=True) > > > 4. genfromtxt(..., names=True, skip_header=True) > > > > > > ?crucially #1 avoids the regression. > > > > > > > > > Does this seem good to everyone? > > > > > > > > > > > > > > > Sounds good w/ `skip_header=-1` > > > > > > > > > But if this is NOT what you mean, then what you say does not > > > actually work with the simple use-case of my Example #2 below. > > > The first commented line is "# here is a..." with # as the > > > first non-space character, so the part after becomes the names > > > 'here', 'is', 'a' etc. > > > > > > > > > > > > > > > In that case, you could always use `skip_header=2` > > > > > > In short, the code can't resolve the ambiguity without some > > > extra > > > information from the user. > > > > > > > > > > > > > > > It's always best not to let the code guess too much anyway... > > > > > > Well, no regression, and you have a nice plan. I'm for it. > > > Anybody else? > > > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org (mailto:NumPy-Discussion at scipy.org) > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > -- > > Paul Natsuo Kishimoto > > > > SM candidate, Technology & Policy Program (2012) > > Research assistant, http://globalchange.mit.edu > > https://paul.kishimoto.name +1 617 302 6105 > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org (mailto:NumPy-Discussion at scipy.org) > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > I think that the proposed solution is OK, but it does make it even > trickier for the average user to predict the behavior of genfromtxt() > for different situations. Perhaps as part of this pull request Paul > should also update the documentation to include a section describing > this behavior and usage with examples 1 to 4. > > - Tom > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org (mailto:NumPy-Discussion at scipy.org) > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wfspotz at sandia.gov Mon Jul 16 16:49:19 2012 From: wfspotz at sandia.gov (Bill Spotz) Date: Mon, 16 Jul 2012 15:49:19 -0500 Subject: [Numpy-discussion] [EXTERNAL] swig + numpy + variable length arrays In-Reply-To: References: Message-ID: <0450404C-B33B-450E-86F6-F23C1281694B@sandia.gov> Andrew, Since you are writing your own %inline function, the ARGOUT_ARRAY1 typemap is not appropriate. You should put the argout logic in your function (ie, allocate the numpy array and give its data pointer to abelescalcall when you call it, and then return the array at the end). Your signature for my_abeles should only take 4 arguments, and the typemaps should reduce it to two. -Bill On Jul 16, 2012, at 4:30 AM, Andrew Nelson wrote: > Dear list, > I am trying to SWIG a C function with the signature: > > void abelescalcall(long numcoefs, double *coefP, long npoints , double > *yP, double *xP); > > numcoefs corresponds to the number of points in the coefP array. > npoints corresponds to the number of points in yP and xP arrays. > > coefP and xP are inputs (IN_ARRAY1), yP is an output (ARGOUT_ARRAY1). 
> > I have been trying to use the following input file: > > %module reflect > > %{ > #define SWIG_FILE_WITH_INIT > #include "myfitfunctions.h" > %} > > %include "numpy.i" > > %init %{ > import_array(); > %} > > %apply (long DIM1, double* IN_ARRAY1){(long numcoefs, double* coefP)} > %apply (long DIM1, double* ARGOUT_ARRAY1){(long len2, double* yP)} > %apply (long DIM1, double* IN_ARRAY1){(long len3, double *xP)} > > %include "myfitfunctions.h" > %rename (abelescalcall) my_abeles; > > %inline %{ > my_abeles(long numcoefs, double* coefP, long len2, double* yP, long > len3, double *xP) { > if (len2 != len3) { > PyErr_Format(PyExc_ValueError, "Arrays of lengths (%d,%d) > given", len2, len3); > return 0.0; > } > abelescalcall(numcoefs, coefP, len2, yP, xP); > } > %} > > However, if I look at the wrapper code, and if I try to call > my_abeles, then I am asked for 6 parameters. I only want to supply two > numpy arrays - coefP and xP. > > thanks for any help you are able to give. (I spent the whole afternoon > trying to get this bloomin thing to work). > > cheers, > Andrew. > > -- > _____________________________________ > Dr. Andrew Nelson > > > _____________________________________ > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > ** Bill Spotz ** ** Sandia National Laboratories Voice: (505)845-0170 ** ** P.O. Box 5800 Fax: (505)284-0154 ** ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** From mail at paul.kishimoto.name Mon Jul 16 17:00:40 2012 From: mail at paul.kishimoto.name (Paul Natsuo Kishimoto) Date: Mon, 16 Jul 2012 17:00:40 -0400 Subject: [Numpy-discussion] Proposed change in genfromtxt(..., comments='#', names=True) behaviour In-Reply-To: References: <1342192511.19936.22.camel@esdceeprjpstudent1.mit.edu> <1342200571.19936.32.camel@esdceeprjpstudent1.mit.edu> <35FEFC2B-E373-410D-971F-8A3AA3E0FCBF@continuum.io> <1342440834.12313.26.camel@khaeru-desktop> <1342465575.14478.15.camel@esdceeprjpstudent1.mit.edu> Message-ID: <1342472440.14478.38.camel@esdceeprjpstudent1.mit.edu> On Mon, 2012-07-16 at 21:14 +0100, Nathaniel Smith wrote: > On Mon, Jul 16, 2012 at 9:01 PM, Pierre GM wrote: > > Well, as `skip_header` is a number of lines, I don't really see anything > > particular magical about a `skip_header=-1`. > > The logic here is: > - if names=True, then genfromtext expects the names to be given in the > first line, and they may or may not be commented out > - BUT, if skip_header=, then any all-comment lines > will be skipped before looking for names, i.e. the names are not > expected to be commented out and comments are given their original > meaning again. > > I have no idea how one could derive this understanding by looking at > skip_header=-1. "Ah, that -1 is a number of lines, everyone knows that > skipping -1 lines is equivalent to toggling our expectation for > whether the column names will appear inside a comment"? The API is > pretty convoluted at this point and I'm not convinced we wouldn't be > better off with adding a new argument like > names_in_comment=False/True, but skip_header="comments" at least gives > the reader a fighting chance... > > -n Another option is to use skip_header=True. The internal monologue accompanying this is "Ah, do I want it to skip the header? Yes, true, I do," with no thought needed on the number of lines involved. Pierre, checking the type of the argument is trivial. Nathaniel, is this less weird/magical? Anyone else? 
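To spell out what "trivial" means here, a hypothetical sketch of the dispatch (not the actual genfromtxt code; note that True is itself an int in Python, so the bool case is handled by an identity check first):

def _interpret_skip_header(skip_header):
    # hypothetical helper, for illustration only
    if skip_header is True:
        # proposed: skip every leading blank or all-comment line
        return 'skip-commented-header'
    # existing behaviour: a fixed number of lines to skip
    return int(skip_header)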
I don't care what the value is and it's easy to change, but I'll await some agreement on this point so I don't have to change it yet again in response to more objections. Pierre, for a line "# A B C #1 #2 #3" the user gets six columns 'A', 'B', 'C', '#1', '#2', '#3', which is messy but what they deserve for using such messy input :) Also, if you look closely, the use of index() you propose is equivalent to my current code, just more verbose. Tom, in my branch I rewrote the documentation for the `names` kwarg in an attempt to be more clear, but I agree a documentation example of the non-legacy use would go a long way. I've also realized I neglected to update the documentation for `skip_header`. I'll do these once there is consensus on the value to use. If there was willingness to tolerate a backwards-incompatible change, the resulting behaviour would be quite simple and intuitive overall, but that's out of my hands. At the moment I'm just concerned with making an intuitive behaviour *possible*. Thanks everyone for your input, -- Paul Natsuo Kishimoto SM candidate, Technology & Policy Program (2012) Research assistant, http://globalchange.mit.edu https://paul.kishimoto.name +1 617 302 6105 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part URL: From rowen at uw.edu Mon Jul 16 17:28:22 2012 From: rowen at uw.edu (Russell E. Owen) Date: Mon, 16 Jul 2012 14:28:22 -0700 Subject: [Numpy-discussion] Code Freeze for NumPy 1.7 References: <976D1214-75DF-493E-8609-CFB1340BC823@continuum.io> <1342393528.28368.3.camel@esdceeprjpstudent1.mit.edu> Message-ID: In article <1342393528.28368.3.camel at esdceeprjpstudent1.mit.edu>, Paul Natsuo Kishimoto wrote: > On Sat, 2012-07-14 at 17:45 -0500, Travis Oliphant wrote: > > Hey all, > > > > We are nearing a code-freeze for NumPy 1.7. Are there any > > last-minute changes people are wanting to push into NumPy 1.7? We > > should discuss them as soon as possible. > > > > I'm proposing a code-freeze at midnight UTC on July 18th (7:00pm CDT > > on July 17th). This will allow the creation of beta releases of > > NumPy on the 18th of July. This is a few days later than originally > > hoped for --- largely due to unexpected travel schedules of Ondrej and > > I, but it does give people a few more days to get patches in. Of > > course, we will be able to apply bug-fixes to the 1.7.x branch once > > the tag is made. > > > > If you have a pull-request that is not yet applied and would like to > > discuss it for inclusion, the time to do it is now. > > > > Best, > > > > -Travis > > > Bump for: https://github.com/numpy/numpy/pull/351 > > > As requested by njsmith, I gave a more detailed explanation and asked > the list for input at: > http://www.mail-archive.com/numpy-discussion at scipy.org/msg38306.html > > There was one qualified negative reply and nothing (yet) further. I'd > appreciate if some other devs could weigh in. My personal opinion is that the improvement is not sufficient to justify breaking backword compatibility. 
-- Russell From pgmdevlist at gmail.com Mon Jul 16 17:39:33 2012 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 16 Jul 2012 23:39:33 +0200 Subject: [Numpy-discussion] =?utf-8?q?Proposed_change_in_genfromtxt=28=2E?= =?utf-8?b?Li4sIGNvbW1lbnRzPScjJywgbmFtZXM9VHJ1ZSkgYmVoYXZpb3Vy?= In-Reply-To: <1342472440.14478.38.camel@esdceeprjpstudent1.mit.edu> References: <1342192511.19936.22.camel@esdceeprjpstudent1.mit.edu> <1342200571.19936.32.camel@esdceeprjpstudent1.mit.edu> <35FEFC2B-E373-410D-971F-8A3AA3E0FCBF@continuum.io> <1342440834.12313.26.camel@khaeru-desktop> <1342465575.14478.15.camel@esdceeprjpstudent1.mit.edu> <1342472440.14478.38.camel@esdceeprjpstudent1.mit.edu> Message-ID: I don't really have any deep issue with `skip_header=True`, besides not really liking having an argument whose type can vary. But that's only a matter of personal taste. And yes, we could always check the type? > > Pierre, for a line "# A B C #1 #2 #3" the user gets six columns 'A', > 'B', 'C', '#1', '#2', '#3', which is messy but what they deserve for > using such messy input :) > > OK, we're on the same page. > Also, if you look closely, the use of index() > you propose is equivalent to my current code, just more verbose. > > I'm not convinced by line 1353: unless you change it to asbyte(comment).join(first_line.split(comments)[1:]) you gonna lose the '#', aren't you ? With the 'index' way, we just pick the first one, as intended. But it's late and multitasking isn't really working for me now. > Tom, in my branch I rewrote the documentation for the `names` kwarg in > an attempt to be more clear, but I agree a documentation example of the > non-legacy use would go a long way. I've also realized I neglected to > update the documentation for `skip_header`. I'll do these once there is > consensus on the value to use. > > Good. Don't forget to add some unit-tests too. > If there was willingness to tolerate a backwards-incompatible change, > the resulting behaviour would be quite simple and intuitive overall, but > that's out of my hands. > > See, that't the problem here: we can't have any backwards incompatibility, lest we upset users that may not even be on the list. So that's a no-no. Nevertheless, you're raising some interesting cases, and I'm sure a consensus will be found quite soon. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Mon Jul 16 17:50:23 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 16 Jul 2012 23:50:23 +0200 Subject: [Numpy-discussion] Lazy imports again In-Reply-To: References: Message-ID: On Mon, Jul 16, 2012 at 6:28 PM, Charles R Harris wrote: > Hi All, > > Working lazy imports would be useful to have. Ralf is opposed to the idea > because it caused all sorts of problems on different platforms when it was > tried in scipy. Note that my being opposed is because the benefits are smaller than the cost. If there was a better reason than shaving a couple of extra ms off the import time or being able to more cleanly deprecate modules, then of course it's possible I'd change my mind. Ralf > I thought I'd open the topic for discussion so that folks who had various > problems/solutions could offer input and the common experience could be > collected in one place. Perhaps there is a solution that actually works. > > Ideas? 
> > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Mon Jul 16 23:37:31 2012 From: travis at continuum.io (Travis Oliphant) Date: Mon, 16 Jul 2012 22:37:31 -0500 Subject: [Numpy-discussion] Lazy imports again In-Reply-To: References: Message-ID: On Jul 16, 2012, at 4:50 PM, Ralf Gommers wrote: > > > On Mon, Jul 16, 2012 at 6:28 PM, Charles R Harris wrote: > Hi All, > > Working lazy imports would be useful to have. Ralf is opposed to the idea because it caused all sorts of problems on different platforms when it was tried in scipy. > > Note that my being opposed is because the benefits are smaller than the cost. If there was a better reason than shaving a couple of extra ms off the import time or being able to more cleanly deprecate modules, then of course it's possible I'd change my mind. This sort of cost / benefit analysis is always tricky. The "benefits" are always subjective. For some, shaving a few ms off of import times is *extremely* valuable and so they would opine that it absolutely outweights the "cost". Others see such benefits as barely worth mentioning. I doubt we can define an absolute scale that would allow such an optimization to actually be done. My personal view is that speeding up import times is an important goal (though not the *most* important goal). The "costs" here are also very loosely defined. Lazy imports were tried with SciPy. Several people had problems with them. Mostly it was difficult to not have a "brittle" implementation that didn't break someone downstream libraries concept of what "import" meant --- systems that try to "freeze" Python programs were particularly annoyed at SciPy's lazy import mechanism. However, I don't think that lazy imports should be universally eschewed. They are probably not right for the NumPy library by default, but I think good use could be made of a system loaded before any other import by users who are looking to avoid the costs of importing all the packages imported by a library. In particular, I suspect we could come up with an approach for C-API users like Andrew so that they could call _import_lazy() and then _import_array() and get what they want (hooks to the C-API) without what they don't (everything else on the Python side). I've implemented something like this in the past by using the metapath import hook to intercept import of a specific library. It could be done for NumPy in a way that doesn't affect people who don't want it, but can be used by people who want to speed up their import times and don't need much from NumPy. Certainly, in retrospect, it would have been a better design to allow the C-API to be loaded without everything else being loaded as well... -- maybe even providing a module *just* for importing the C-API that was in the top-level namespace "_numpy" or some such... Best, -Travis > > Ralf > > > I thought I'd open the topic for discussion so that folks who had various problems/solutions could offer input and the common experience could be collected in one place. Perhaps there is a solution that actually works. > > Ideas? 
> > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Tue Jul 17 00:34:34 2012 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Mon, 16 Jul 2012 21:34:34 -0700 Subject: [Numpy-discussion] Problem w/ Win installer Message-ID: Hi, folks! Having a problem w/ the Windows installer; first, the "back-story": I have both Python 2.7 and 3.2 installed. When I run the installer and click next on the first dialog, I get the message that I need Python 2.7, which was not found in my registry. I ran regedit and searched for Python and get multiple hits on both Python 2.7 and 3.2. So, precisely which registry key has to have the value Python 2.7 for the installer to find it? Thanks! OlyDLG -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Tue Jul 17 03:31:06 2012 From: cournape at gmail.com (David Cournapeau) Date: Tue, 17 Jul 2012 08:31:06 +0100 Subject: [Numpy-discussion] Lazy imports again In-Reply-To: References: Message-ID: On Mon, Jul 16, 2012 at 5:28 PM, Charles R Harris wrote: > Hi All, > > Working lazy imports would be useful to have. Ralf is opposed to the idea > because it caused all sorts of problems on different platforms when it was > tried in scipy. I thought I'd open the topic for discussion so that folks > who had various problems/solutions could offer input and the common > experience could be collected in one place. Perhaps there is a solution that > actually works. I have never seen a lazy import system that did not cause issues in one way or the other. Lazy imports make a lot of sense for an application (e.g. mercurial), but I think it is a mistake to solve this at the numpy level. This should be solved at the application level, and there are solutions for that. For example, using the demandimport code from mercurial (GPL) cuts down the numpy import time by 3 on my mac if one uses np.zeros (100ms -> 50 ms, of which 25 are taken by python itself): """ import demandimport demandimport.enable() import numpy as np a = np.zeros(10) """ To help people who need fast numpy imports, I would suggest the following course of actions: - start benchmarking numpy import in a per-commit manner to detect significant regressions (like what happens with polynomial code) - have a small FAQ on it, with suggestion for people who need to optimize their short-lived script cheers, David From charlesr.harris at gmail.com Tue Jul 17 08:13:13 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 17 Jul 2012 06:13:13 -0600 Subject: [Numpy-discussion] Lazy imports again In-Reply-To: References: Message-ID: On Tue, Jul 17, 2012 at 1:31 AM, David Cournapeau wrote: > On Mon, Jul 16, 2012 at 5:28 PM, Charles R Harris > wrote: > > Hi All, > > > > Working lazy imports would be useful to have. Ralf is opposed to the idea > > because it caused all sorts of problems on different platforms when it > was > > tried in scipy. I thought I'd open the topic for discussion so that folks > > who had various problems/solutions could offer input and the common > > experience could be collected in one place. 
Perhaps there is a solution > that > > actually works. > > I have never seen a lazy import system that did not cause issues in > one way or the other. Lazy imports make a lot of sense for an > application (e.g. mercurial), but I think it is a mistake to solve > this at the numpy level. > > This should be solved at the application level, and there are > solutions for that. For example, using the demandimport code from > mercurial (GPL) cuts down the numpy import time by 3 on my mac if one > uses np.zeros (100ms -> 50 ms, of which 25 are taken by python > itself): > > """ > import demandimport > demandimport.enable() > > import numpy as np > > a = np.zeros(10) > """ > > To help people who need fast numpy imports, I would suggest the > following course of actions: > - start benchmarking numpy import in a per-commit manner to detect > significant regressions (like what happens with polynomial code) > - have a small FAQ on it, with suggestion for people who need to > optimize their short-lived script > > That's really interesting. I'd like to see some folks try that solution. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Jul 17 11:47:48 2012 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 17 Jul 2012 16:47:48 +0100 Subject: [Numpy-discussion] Proposed change in genfromtxt(..., comments='#', names=True) behaviour In-Reply-To: References: <1342192511.19936.22.camel@esdceeprjpstudent1.mit.edu> <1342200571.19936.32.camel@esdceeprjpstudent1.mit.edu> <35FEFC2B-E373-410D-971F-8A3AA3E0FCBF@continuum.io> <1342440834.12313.26.camel@khaeru-desktop> <1342465575.14478.15.camel@esdceeprjpstudent1.mit.edu> <1342472440.14478.38.camel@esdceeprjpstudent1.mit.edu> Message-ID: On Mon, Jul 16, 2012 at 10:39 PM, Pierre GM wrote: > I don't really have any deep issue with `skip_header=True`, besides not > really liking having an argument whose type can vary. But that's only a > matter of personal taste. And yes, we could always check the type? I guess I still have a small preference for skip_header="comments" over skip_header=True, since the latter is more opaque for no purpose. Also it makes me slightly antsy since skip_header is normally an integer, and True is, in fact, just an integer with a special __repr__: In [2]: isinstance(True, int) Out[2]: True In [3]: True + True Out[3]: 2 Not that there are likely to be people using skip_header=True as an alias for skip_header=1, but if they were it would currently work. > Pierre, for a line "# A B C #1 #2 #3" the user gets six columns 'A', > 'B', 'C', '#1', '#2', '#3', which is messy but what they deserve for > using such messy input :) > > OK, we're on the same page. > > > Also, if you look closely, the use of index() > you propose is equivalent to my current code, just more verbose. > > I'm not convinced by line 1353: unless you change it to > asbyte(comment).join(first_line.split(comments)[1:]) > you gonna lose the '#', aren't you ? With the 'index' way, we just pick the > first one, as intended. But it's late and multitasking isn't really working > for me now. I think you guys are looking for .split(comments, 1). 
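A quick check with the messy example line from earlier (plain Python, outputs as the interpreter prints them):

>>> "# A B C #1 #2 #3".split("#", 1)
['', ' A B C #1 #2 #3']
>>> "#".join("# A B C #1 #2 #3".split("#")[1:])
' A B C #1 #2 #3'

i.e. the maxsplit form keeps the later '#' characters in one step, giving the same result as the index() approach.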
-n From pgmdevlist at gmail.com Tue Jul 17 11:56:17 2012 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 17 Jul 2012 17:56:17 +0200 Subject: [Numpy-discussion] Proposed change in genfromtxt(..., comments='#', names=True) behaviour In-Reply-To: References: <1342192511.19936.22.camel@esdceeprjpstudent1.mit.edu> <1342200571.19936.32.camel@esdceeprjpstudent1.mit.edu> <35FEFC2B-E373-410D-971F-8A3AA3E0FCBF@continuum.io> <1342440834.12313.26.camel@khaeru-desktop> <1342465575.14478.15.camel@esdceeprjpstudent1.mit.edu> <1342472440.14478.38.camel@esdceeprjpstudent1.mit.edu> Message-ID: On Tuesday, July 17, 2012 at 17:47 , Nathaniel Smith wrote: I guess I still have a small preference for skip_header="comments" over skip_header=True, since the latter is more opaque for no purpose. Also it makes me slightly antsy since skip_header is normally an integer, and True is, in fact, just an integer with a special __repr__: Nathaniel, that's basically my problem with `skip_header=True`. I still prefer my -1 to your "comments", but whatever, personal taste again. I'm not convinced by line 1353: unless you change it to asbyte(comment).join(first_line.split(comments)[1:]) you gonna lose the '#', aren't you ? With the 'index' way, we just pick the first one, as intended. But it's late and multitasking isn't really working for me now. I think you guys are looking for .split(comments, 1). -n Tadah! Thanks a lot! -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Tue Jul 17 12:09:17 2012 From: cournape at gmail.com (David Cournapeau) Date: Tue, 17 Jul 2012 17:09:17 +0100 Subject: [Numpy-discussion] Lazy imports again In-Reply-To: References: Message-ID: On Tue, Jul 17, 2012 at 1:13 PM, Charles R Harris wrote: > > > On Tue, Jul 17, 2012 at 1:31 AM, David Cournapeau > wrote: >> >> On Mon, Jul 16, 2012 at 5:28 PM, Charles R Harris >> wrote: >> > Hi All, >> > >> > Working lazy imports would be useful to have. Ralf is opposed to the >> > idea >> > because it caused all sorts of problems on different platforms when it >> > was >> > tried in scipy. I thought I'd open the topic for discussion so that >> > folks >> > who had various problems/solutions could offer input and the common >> > experience could be collected in one place. Perhaps there is a solution >> > that >> > actually works. >> >> I have never seen a lazy import system that did not cause issues in >> one way or the other. Lazy imports make a lot of sense for an >> application (e.g. mercurial), but I think it is a mistake to solve >> this at the numpy level. >> >> This should be solved at the application level, and there are >> solutions for that. For example, using the demandimport code from >> mercurial (GPL) cuts down the numpy import time by 3 on my mac if one >> uses np.zeros (100ms -> 50 ms, of which 25 are taken by python >> itself): >> >> """ >> import demandimport >> demandimport.enable() >> >> import numpy as np >> >> a = np.zeros(10) >> """ >> >> To help people who need fast numpy imports, I would suggest the >> following course of actions: >> - start benchmarking numpy import in a per-commit manner to detect >> significant regressions (like what happens with polynomial code) >> - have a small FAQ on it, with suggestion for people who need to >> optimize their short-lived script >> > > That's really interesting. I'd like to see some folks try that solution. 
Anyone can:) the file is self-contained last time I checked: http://www.selenic.com/hg/file/67b8cca2f12b/mercurial/demandimport.py cheers, David From chris.barker at noaa.gov Tue Jul 17 13:31:54 2012 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 17 Jul 2012 10:31:54 -0700 Subject: [Numpy-discussion] Lazy imports again In-Reply-To: References: Message-ID: On Mon, Jul 16, 2012 at 2:50 PM, Ralf Gommers wrote: >> Working lazy imports would be useful to have. Ralf is opposed to the idea > Note that my being opposed is because the benefits are smaller than the > cost. If there was a better reason than shaving a couple of extra ms off the > import time or being able to more cleanly deprecate modules, then of course > it's possible I'd change my mind. Whether it's compelling or not is up to you, but I'd love to not have to include a bunch of stuff I'm not using in a py2exe or py2app or ... bundle when shipping self-contained executables. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From chris.barker at noaa.gov Tue Jul 17 13:36:17 2012 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 17 Jul 2012 10:36:17 -0700 Subject: [Numpy-discussion] Lazy imports again In-Reply-To: References: Message-ID: On Mon, Jul 16, 2012 at 8:37 PM, Travis Oliphant wrote: >--- systems that try to "freeze" Python programs were > particularly annoyed at SciPy's lazy import mechanism. That's ironic to me -- while the solution to a lot of "freezing" problems is to include everything including the kitchen sink -- I really hate having to do that. With lazy imports, you *may* have to hand-include some stuff (or the while package -- that's actually pretty easy), but without lazy importing you *have* to include everything -- worse, as far as I'm concerned. In fact, bumpy aside, I've occasionally needed a tiny module from scipy for an I"m Im bundling, and I've ended up having to hack away at scipy to pull out what I needed without getting all sorts of stuff I didn't need. So I vote for lazy imports in Scipy too! -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From ralf.gommers at googlemail.com Tue Jul 17 16:05:19 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 17 Jul 2012 22:05:19 +0200 Subject: [Numpy-discussion] py3 datetime woes (was Re: Code Freeze for NumPy 1.7) In-Reply-To: References: Message-ID: On Mon, Jul 16, 2012 at 3:27 PM, Nathaniel Smith wrote: > On Sun, Jul 15, 2012 at 5:32 PM, Ralf Gommers > wrote: > > Current issues can be seen from the last test log on our Windows XP > buildbot > > (June 29, > > > http://buildbot.scipy.org/builders/Windows_XP_x86/builds/1124/steps/shell_1/logs/stdio > ): > > > > ====================================================================== > > ERROR: test_datetime_arange (test_datetime.TestDateTime) > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > File > > > "C:\buildbot\numpy\b11\numpy-install\Lib\site-packages\numpy\core\tests\test_datetime.py", > > line 1351, in test_datetime_arange > > assert_raises(ValueError, np.arange, np.datetime64('today'), > > OSError: Failed to use '_localtime64_s' to convert to a local time > > ...I don't understand how this is even building if the functions are > missing. > Because of the build magic added in https://github.com/numpy/numpy/pull/156 Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Tue Jul 17 16:36:20 2012 From: travis at continuum.io (Travis Oliphant) Date: Tue, 17 Jul 2012 15:36:20 -0500 Subject: [Numpy-discussion] Would a patch with a function for incrementing an array with advanced indexing be accepted? In-Reply-To: References: Message-ID: <364D6508-78E4-40CB-981B-0F04C875F8CA@continuum.io> Hey John, Will you be able to clean up the indentation issues for your inplace increment patch today. I would like to include it in NumPy 1.7. Thanks, -Travis On Jul 6, 2012, at 10:37 AM, John Salvatier wrote: > Okay, done ( https://github.com/jsalvatier/numpy/commit/7d03753e6305dbc878ed7df3e21e9b099eae32ed ). > > > > On Tue, Jul 3, 2012 at 11:41 AM, Fr?d?ric Bastien wrote: > Hi, > > Here is code example that work only with different index: > > import numpy > x=numpy.zeros((5,5)) > x[[0,2,4]]+=numpy.random.rand(3,5) > print x > > This won't work if in the list [0,2,4], there is index duplication, > but with your new code, it will. I think it is the most used case of > advanced indexing. At least, for our lab:) > > Fred > > On Mon, Jul 2, 2012 at 7:48 PM, John Salvatier > wrote: > > Hi Fred, > > > > That's an excellent idea, but I am not too familiar with this use case. What > > do you mean by list in 'matrix[list]'? Is the use case, just incrementing > > in place a sub matrix of a numpy matrix? > > > > John > > > > > > On Fri, Jun 29, 2012 at 11:43 AM, Fr?d?ric Bastien wrote: > >> > >> Hi, > >> > >> I personnaly can't review this as this is too much in NumPy internal. > >> > >> My only comments is that you could add a test and an example in the > >> doc for matrix[list]. I think it will be the most used case. > >> > >> Fred > >> > >> On Wed, Jun 27, 2012 at 7:47 PM, John Salvatier > >> wrote: > >> > I've submitted a pull request ( https://github.com/numpy/numpy/pull/326 > >> > ). > >> > I'm new to the numpy and python internals, so feedback is greatly > >> > appreciated. 
> >> > > >> > > >> > On Tue, Jun 26, 2012 at 12:10 PM, Travis Oliphant > >> > wrote: > >> >> > >> >> > >> >> On Jun 26, 2012, at 1:34 PM, Fr?d?ric Bastien wrote: > >> >> > >> >> > Hi, > >> >> > > >> >> > I think he was referring that making NUMPY_ARRAY_OBJECT[...] syntax > >> >> > support the operation that you said is hard. But having a separate > >> >> > function do it is less complicated as you said. > >> >> > >> >> Yes. That's precisely what I meant. Thank you for clarifying. > >> >> > >> >> -Travis > >> >> > >> >> > > >> >> > Fred > >> >> > > >> >> > On Tue, Jun 26, 2012 at 1:27 PM, John Salvatier > >> >> > wrote: > >> >> >> Can you clarify why it would be super hard? I just reused the code > >> >> >> for > >> >> >> advanced indexing (a modification of PyArray_SetMap). Am I missing > >> >> >> something > >> >> >> crucial? > >> >> >> > >> >> >> > >> >> >> > >> >> >> On Tue, Jun 26, 2012 at 9:57 AM, Travis Oliphant > >> >> >> > >> >> >> wrote: > >> >> >>> > >> >> >>> > >> >> >>> On Jun 26, 2012, at 11:46 AM, John Salvatier wrote: > >> >> >>> > >> >> >>> Hello, > >> >> >>> > >> >> >>> If you increment an array using advanced indexing and have repeated > >> >> >>> indexes, the array doesn't get repeatedly > >> >> >>> incremented, > >> >> >>> http://comments.gmane.org/gmane.comp.python.numeric.general/50291. > >> >> >>> I wrote a C function that does incrementing with repeated indexes > >> >> >>> correctly. > >> >> >>> The branch is here (https://github.com/jsalvatier/numpy see the > >> >> >>> last > >> >> >>> two > >> >> >>> commits). Would a patch with a cleaned up version of a function > >> >> >>> like > >> >> >>> this be > >> >> >>> accepted into numpy? I'm not experienced writing numpy C code so > >> >> >>> I'm > >> >> >>> sure it > >> >> >>> still needs improvement. > >> >> >>> > >> >> >>> > >> >> >>> This is great. It is an often-requested feature. It's *very > >> >> >>> difficult* > >> >> >>> to do without changing fundamentally what NumPy is. But, yes this > >> >> >>> would be > >> >> >>> a great pull request. 
> >> >> >>> > >> >> >>> Thanks, > >> >> >>> > >> >> >>> -Travis > >> >> >>> > >> >> >>> > >> >> >>> > >> >> >>> _______________________________________________ > >> >> >>> NumPy-Discussion mailing list > >> >> >>> NumPy-Discussion at scipy.org > >> >> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> >> >>> > >> >> >> > >> >> >> > >> >> >> _______________________________________________ > >> >> >> NumPy-Discussion mailing list > >> >> >> NumPy-Discussion at scipy.org > >> >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> >> >> > >> >> > _______________________________________________ > >> >> > NumPy-Discussion mailing list > >> >> > NumPy-Discussion at scipy.org > >> >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> >> > >> >> _______________________________________________ > >> >> NumPy-Discussion mailing list > >> >> NumPy-Discussion at scipy.org > >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > >> > > >> > > >> > _______________________________________________ > >> > NumPy-Discussion mailing list > >> > NumPy-Discussion at scipy.org > >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Tue Jul 17 16:56:39 2012 From: travis at continuum.io (Travis Oliphant) Date: Tue, 17 Jul 2012 15:56:39 -0500 Subject: [Numpy-discussion] Pull Requests I'm planning to merge Message-ID: I would like to merge the following pull requests sometime today: * 326 -- inplace increment function * 325 -- prefer gfortran on OSX and Linux * 192 -- meshgrid enhancements * 327 -- restore commas and update C-API doc a bit * 352 -- simplifying case for insert and adding tests (#2028) * 350 -- return view for subset of fields. Also, I would like to do * 356 -- with Charles's edited fix in the comments (Bug #808) Are there concerns with the above list, or do others have PR they would especially like to see merged... Thanks, -Travis From aron at ahmadia.net Tue Jul 17 17:04:42 2012 From: aron at ahmadia.net (Aron Ahmadia) Date: Tue, 17 Jul 2012 16:04:42 -0500 Subject: [Numpy-discussion] Pull Requests I'm planning to merge In-Reply-To: References: Message-ID: I (or somebody else) needs to fix 325, Ralf pointed out that I was slightly too aggressive preferring gfortran on vendor operating systems. I think I should be able to fix this in the next couple of hours, I've been delinquent in getting back to it. 
A On Tue, Jul 17, 2012 at 3:56 PM, Travis Oliphant wrote: > > I would like to merge the following pull requests sometime today: > > * 326 -- inplace increment function > * 325 -- prefer gfortran on OSX and Linux > * 192 -- meshgrid enhancements > * 327 -- restore commas and update C-API doc a bit > * 352 -- simplifying case for insert and adding tests (#2028) > * 350 -- return view for subset of fields. > > Also, I would like to do > > * 356 -- with Charles's edited fix in the comments (Bug #808) > > Are there concerns with the above list, or do others have PR they would > especially like to see merged... > > Thanks, > > -Travis > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Jul 17 19:48:53 2012 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 18 Jul 2012 00:48:53 +0100 Subject: [Numpy-discussion] Pull Requests I'm planning to merge In-Reply-To: References: Message-ID: On Tue, Jul 17, 2012 at 9:56 PM, Travis Oliphant wrote: > > I would like to merge the following pull requests sometime today: > > * 326 -- inplace increment function -1, for the reasons stated in the comment thread -- we shouldn't lock ourselves into an ugly API when there's discussion going on about the right solution, and the submitter has expressed interest in rewriting it in a better way. Also to be clear, I'm not saying I will try to block this functionality forever unless it's perfect or anything -- I could probably be convinced that putting in something sub-optimal here was the best trade-off. Really what I'm -1 on is shoving something in without proper discussion just because the release is happening, on the theory that we can clean up any mess later. Until your comment a few hours ago no-one even looked at this new API except me and the submitter... > * 325 -- prefer gfortran on OSX and Linux > * 192 -- meshgrid enhancements > * 327 -- restore commas and update C-API doc a bit Still needs a test. > * 352 -- simplifying case for insert and adding tests (#2028) > * 350 -- return view for subset of fields. > > Also, I would like to do > > * 356 -- with Charles's edited fix in the comments (Bug #808) Otherwise looks fine to me. -n From ondrej.certik at gmail.com Tue Jul 17 20:20:25 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Wed, 18 Jul 2012 02:20:25 +0200 Subject: [Numpy-discussion] "Symbol table not found" compiling numpy from git repository on Windows In-Reply-To: References: Message-ID: On Thu, Jan 5, 2012 at 8:22 PM, John Salvatier wrote: > Hello, > > I'm trying to compile numpy on Windows 7 using the command: "python setup.py > config --compiler=mingw32 build" but I get an error about a symbol table not > found. Anyone know how to work around this or what to look into? 
> > building library "npymath" sources > Building msvcr library: "C:\Python26\libs\libmsvcr90.a" (from > C:\Windows\winsxs\amd64_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.21022.8_none_750b37ff97f4f68b\msvcr90.dll) > objdump.exe: > C:\Windows\winsxs\amd64_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.21022.8_none_750b37ff97f4f68b\msvcr90.dll: > File format not recognized > Traceback (most recent call last): > File "setup.py", line 214, in > setup_package() > File "setup.py", line 207, in setup_package > configuration=configuration ) > File "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\core.py", line > 186, in setup > return old_setup(**new_attr) > File "C:\Python26\lib\distutils\core.py", line 152, in setup > dist.run_commands() > File "C:\Python26\lib\distutils\dist.py", line 975, in run_commands > self.run_command(cmd) > File "C:\Python26\lib\distutils\dist.py", line 995, in run_command > cmd_obj.run() > File > "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\command\build.py", line > 37, in run > old_build.run(self) > File "C:\Python26\lib\distutils\command\build.py", line 134, in run > self.run_command(cmd_name) > File "C:\Python26\lib\distutils\cmd.py", line 333, in run_command > self.distribution.run_command(command) > File "C:\Python26\lib\distutils\dist.py", line 995, in run_command > cmd_obj.run() > File > "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\command\build_src.py", > line 152, in run > self.build_sources() > File > "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\command\build_src.py", > line 163, in build_sources > self.build_library_sources(*libname_info) > File > "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\command\build_src.py", > line 298, in build_library_sources > sources = self.generate_sources(sources, (lib_name, build_info)) > File > "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\command\build_src.py", > line 385, in generate_sources > source = func(extension, build_dir) > File "numpy\core\setup.py", line 646, in get_mathlib_info > st = config_cmd.try_link('int main(void) { return 0;}') > File "C:\Python26\lib\distutils\command\config.py", line 257, in try_link > self._check_compiler() > File > "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\command\config.py", > line 45, in _check_compiler > old_config._check_compiler(self) > File "C:\Python26\lib\distutils\command\config.py", line 107, in > _check_compiler > dry_run=self.dry_run, force=1) > File "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\ccompiler.py", > line 560, in new_compiler > compiler = klass(None, dry_run, force) > File > "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\mingw32ccompiler.py", > line 94, in __init__ > msvcr_success = build_msvcr_library() > File > "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\mingw32ccompiler.py", > line 362, in build_msvcr_library > generate_def(dll_file, def_file) > File > "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\mingw32ccompiler.py", > line 282, in generate_def > raise ValueError("Symbol table not found") > ValueError: Symbol table not found Did you find a workaround? I am having exactly the same problem. 
Ondrej From travis at continuum.io Tue Jul 17 22:20:25 2012 From: travis at continuum.io (Travis Oliphant) Date: Tue, 17 Jul 2012 21:20:25 -0500 Subject: [Numpy-discussion] Pull Requests I'm planning to merge In-Reply-To: References: Message-ID: <8E141C42-1EDC-47BD-AB05-A64835ADCEB6@continuum.io> On Jul 17, 2012, at 6:48 PM, Nathaniel Smith wrote: > On Tue, Jul 17, 2012 at 9:56 PM, Travis Oliphant wrote: >> >> I would like to merge the following pull requests sometime today: >> >> * 326 -- inplace increment function > > -1, for the reasons stated in the comment thread -- we shouldn't lock > ourselves into an ugly API when there's discussion going on about the > right solution, and the submitter has expressed interest in rewriting > it in a better way. Also to be clear, I'm not saying I will try to > block this functionality forever unless it's perfect or anything -- I > could probably be convinced that putting in something sub-optimal here > was the best trade-off. Really what I'm -1 on is shoving something in > without proper discussion just because the release is happening, on > the theory that we can clean up any mess later. Until your comment a > few hours ago no-one even looked at this new API except me and the > submitter... You are incorrect about that. I've looked at it several times since it was posted. I've looked at and studied all the PRs multiple times in fact over the past several months. I've waited for others to express an opinion. I don't think it's accurate to assume that people have not seen something because they don't comment. I will hold off on this one, but only because you raised your objection to -1 (instead of -0.5). But, I disagree very strongly with you that it is "sub-optimal" in the context of NumPy. I actually think it is a very reasonable solution. He has re-used the MapIter code correctly (which is the only code that *does* fancy indexing). The author has done a lot of work to support multiple data-types and satisfy an often-requested feature in a very reasonable way. Your objection seems to be that you would prefer that it were a method on ufuncs. I don't think this will work without a *major* refactor. I'm not sure it's even worth it then either. I'm glad to hear you will not block this being added in the future. > >> * 325 -- prefer gfortran on OSX and Linux >> * 192 -- meshgrid enhancements >> * 327 -- restore commas and update C-API doc a bit > > Still needs a test. That would be nice, but the pull request that removed it also needed a test. This is just restoring behavior that should be there, so it has to go in before the release --- test or no test, unfortunately. > >> * 352 -- simplifying case for insert and adding tests (#2028) >> * 350 -- return view for subset of fields. >> >> Also, I would like to do >> >> * 356 -- with Charles's edited fix in the comments (Bug #808) > > Otherwise looks fine to me. > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From travis at continuum.io Wed Jul 18 02:09:18 2012 From: travis at continuum.io (Travis Oliphant) Date: Wed, 18 Jul 2012 01:09:18 -0500 Subject: [Numpy-discussion] Finished maintenance/1.7.x branch Message-ID: <60C559B9-B858-4344-A7C3-50702D629ED1@continuum.io> Hey all, We are going to work on a beta release on the 1.7.x branch. The master is open again for changes for 1.8.x. 
There will be some work on the 1.7.x branch to fix bugs including bugs that are already reported but have not yet been addressed (like the regression against data-type detection for Sage). It would be great if 1.7.x gets as much testing as possible so that we can discover regressions that may have occurred. But, it was important to draw the line for 1.7.0 features. Thanks for all the many hours of reviewing, coding, writing, reviewing and merging pull requests that has gone into this tag. Just perusing the commit history shows there have been dozens of people (including several new contributors) who have contributed to this release. You are a very talented and productive group and it's time to start thinking about 1.8.x. I think the target should be around November or December. Thanks, -Travis From ondrej.certik at gmail.com Wed Jul 18 06:30:08 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Wed, 18 Jul 2012 12:30:08 +0200 Subject: [Numpy-discussion] "Symbol table not found" compiling numpy from git repository on Windows In-Reply-To: References: Message-ID: On Wed, Jul 18, 2012 at 2:20 AM, Ond?ej ?ert?k wrote: > On Thu, Jan 5, 2012 at 8:22 PM, John Salvatier > wrote: >> Hello, >> >> I'm trying to compile numpy on Windows 7 using the command: "python setup.py >> config --compiler=mingw32 build" but I get an error about a symbol table not >> found. Anyone know how to work around this or what to look into? >> >> building library "npymath" sources >> Building msvcr library: "C:\Python26\libs\libmsvcr90.a" (from >> C:\Windows\winsxs\amd64_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.21022.8_none_750b37ff97f4f68b\msvcr90.dll) >> objdump.exe: >> C:\Windows\winsxs\amd64_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.21022.8_none_750b37ff97f4f68b\msvcr90.dll: >> File format not recognized >> Traceback (most recent call last): >> File "setup.py", line 214, in >> setup_package() >> File "setup.py", line 207, in setup_package >> configuration=configuration ) >> File "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\core.py", line >> 186, in setup >> return old_setup(**new_attr) >> File "C:\Python26\lib\distutils\core.py", line 152, in setup >> dist.run_commands() >> File "C:\Python26\lib\distutils\dist.py", line 975, in run_commands >> self.run_command(cmd) >> File "C:\Python26\lib\distutils\dist.py", line 995, in run_command >> cmd_obj.run() >> File >> "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\command\build.py", line >> 37, in run >> old_build.run(self) >> File "C:\Python26\lib\distutils\command\build.py", line 134, in run >> self.run_command(cmd_name) >> File "C:\Python26\lib\distutils\cmd.py", line 333, in run_command >> self.distribution.run_command(command) >> File "C:\Python26\lib\distutils\dist.py", line 995, in run_command >> cmd_obj.run() >> File >> "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\command\build_src.py", >> line 152, in run >> self.build_sources() >> File >> "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\command\build_src.py", >> line 163, in build_sources >> self.build_library_sources(*libname_info) >> File >> "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\command\build_src.py", >> line 298, in build_library_sources >> sources = self.generate_sources(sources, (lib_name, build_info)) >> File >> "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\command\build_src.py", >> line 385, in generate_sources >> source = func(extension, build_dir) >> File "numpy\core\setup.py", line 646, in get_mathlib_info >> st = 
config_cmd.try_link('int main(void) { return 0;}') >> File "C:\Python26\lib\distutils\command\config.py", line 257, in try_link >> self._check_compiler() >> File >> "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\command\config.py", >> line 45, in _check_compiler >> old_config._check_compiler(self) >> File "C:\Python26\lib\distutils\command\config.py", line 107, in >> _check_compiler >> dry_run=self.dry_run, force=1) >> File "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\ccompiler.py", >> line 560, in new_compiler >> compiler = klass(None, dry_run, force) >> File >> "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\mingw32ccompiler.py", >> line 94, in __init__ >> msvcr_success = build_msvcr_library() >> File >> "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\mingw32ccompiler.py", >> line 362, in build_msvcr_library >> generate_def(dll_file, def_file) >> File >> "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\mingw32ccompiler.py", >> line 282, in generate_def >> raise ValueError("Symbol table not found") >> ValueError: Symbol table not found > > > Did you find a workaround? I am having exactly the same problem. So this happens both in Windows and in Wine and the problem is that the numpy distutils is trying to read the symbol table using "objdump" from msvcr90.dll but it can't recognize the format: objdump.exe: C:\windows\winsxs\x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.4148_none_deadbeef\msvcr90.dll: File format not recognized The file exists: $ file ~/.wine/drive_c/windows/winsxs/x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.4148_none_deadbeef/msvcr90.dll /home/ondrej/.wine/drive_c/windows/winsxs/x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.4148_none_deadbeef/msvcr90.dll: PE32 executable for MS Windows (DLL) (unknown subsystem) Intel 80386 32-bit But objdump doesn't work on it. Ondrej From ondrej.certik at gmail.com Wed Jul 18 06:38:25 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Wed, 18 Jul 2012 12:38:25 +0200 Subject: [Numpy-discussion] "Symbol table not found" compiling numpy from git repository on Windows In-Reply-To: References: Message-ID: On Wed, Jul 18, 2012 at 12:30 PM, Ond?ej ?ert?k wrote: > On Wed, Jul 18, 2012 at 2:20 AM, Ond?ej ?ert?k wrote: >> On Thu, Jan 5, 2012 at 8:22 PM, John Salvatier >> wrote: >>> Hello, >>> >>> I'm trying to compile numpy on Windows 7 using the command: "python setup.py >>> config --compiler=mingw32 build" but I get an error about a symbol table not >>> found. Anyone know how to work around this or what to look into? 
>>> >>> building library "npymath" sources >>> Building msvcr library: "C:\Python26\libs\libmsvcr90.a" (from >>> C:\Windows\winsxs\amd64_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.21022.8_none_750b37ff97f4f68b\msvcr90.dll) >>> objdump.exe: >>> C:\Windows\winsxs\amd64_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.21022.8_none_750b37ff97f4f68b\msvcr90.dll: >>> File format not recognized >>> Traceback (most recent call last): >>> File "setup.py", line 214, in >>> setup_package() >>> File "setup.py", line 207, in setup_package >>> configuration=configuration ) >>> File "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\core.py", line >>> 186, in setup >>> return old_setup(**new_attr) >>> File "C:\Python26\lib\distutils\core.py", line 152, in setup >>> dist.run_commands() >>> File "C:\Python26\lib\distutils\dist.py", line 975, in run_commands >>> self.run_command(cmd) >>> File "C:\Python26\lib\distutils\dist.py", line 995, in run_command >>> cmd_obj.run() >>> File >>> "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\command\build.py", line >>> 37, in run >>> old_build.run(self) >>> File "C:\Python26\lib\distutils\command\build.py", line 134, in run >>> self.run_command(cmd_name) >>> File "C:\Python26\lib\distutils\cmd.py", line 333, in run_command >>> self.distribution.run_command(command) >>> File "C:\Python26\lib\distutils\dist.py", line 995, in run_command >>> cmd_obj.run() >>> File >>> "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\command\build_src.py", >>> line 152, in run >>> self.build_sources() >>> File >>> "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\command\build_src.py", >>> line 163, in build_sources >>> self.build_library_sources(*libname_info) >>> File >>> "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\command\build_src.py", >>> line 298, in build_library_sources >>> sources = self.generate_sources(sources, (lib_name, build_info)) >>> File >>> "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\command\build_src.py", >>> line 385, in generate_sources >>> source = func(extension, build_dir) >>> File "numpy\core\setup.py", line 646, in get_mathlib_info >>> st = config_cmd.try_link('int main(void) { return 0;}') >>> File "C:\Python26\lib\distutils\command\config.py", line 257, in try_link >>> self._check_compiler() >>> File >>> "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\command\config.py", >>> line 45, in _check_compiler >>> old_config._check_compiler(self) >>> File "C:\Python26\lib\distutils\command\config.py", line 107, in >>> _check_compiler >>> dry_run=self.dry_run, force=1) >>> File "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\ccompiler.py", >>> line 560, in new_compiler >>> compiler = klass(None, dry_run, force) >>> File >>> "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\mingw32ccompiler.py", >>> line 94, in __init__ >>> msvcr_success = build_msvcr_library() >>> File >>> "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\mingw32ccompiler.py", >>> line 362, in build_msvcr_library >>> generate_def(dll_file, def_file) >>> File >>> "C:\Users\jsalvatier\workspace\numpy\numpy\distutils\mingw32ccompiler.py", >>> line 282, in generate_def >>> raise ValueError("Symbol table not found") >>> ValueError: Symbol table not found >> >> >> Did you find a workaround? I am having exactly the same problem. 
> > So this happens both in Windows and in Wine and the problem is that > the numpy distutils is trying to read the symbol table using "objdump" > from msvcr90.dll but it can't recognize the format: > > objdump.exe: C:\windows\winsxs\x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.4148_none_deadbeef\msvcr90.dll: > File format not recognized > > The file exists: > > > $ file ~/.wine/drive_c/windows/winsxs/x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.4148_none_deadbeef/msvcr90.dll > /home/ondrej/.wine/drive_c/windows/winsxs/x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.4148_none_deadbeef/msvcr90.dll: > PE32 executable for MS Windows (DLL) (unknown subsystem) Intel 80386 > 32-bit > > > But objdump doesn't work on it. So the following patch fixes it: diff --git a/numpy/distutils/mingw32ccompiler.py b/numpy/distutils/mingw32ccompi index 5b9aa33..72ff5ed 100644 --- a/numpy/distutils/mingw32ccompiler.py +++ b/numpy/distutils/mingw32ccompiler.py @@ -91,11 +91,11 @@ class Mingw32CCompiler(distutils.cygwinccompiler.CygwinCComp build_import_library() # Check for custom msvc runtime library on Windows. Build if it doesn't - msvcr_success = build_msvcr_library() - msvcr_dbg_success = build_msvcr_library(debug=True) - if msvcr_success or msvcr_dbg_success: - # add preprocessor statement for using customized msvcr lib - self.define_macro('NPY_MINGW_USE_CUSTOM_MSVCR') + #msvcr_success = build_msvcr_library() + #msvcr_dbg_success = build_msvcr_library(debug=True) + #if msvcr_success or msvcr_dbg_success: + # # add preprocessor statement for using customized msvcr lib + # self.define_macro('NPY_MINGW_USE_CUSTOM_MSVCR') # Define the MSVC version as hint for MinGW msvcr_version = '0x%03i0' % int(msvc_runtime_library().lstrip('msvcr')) Now things work and start compiling. Any ideas what is going on here? Why is it trying to "build the msvcr" library? Ondrej From ondrej.certik at gmail.com Wed Jul 18 07:24:20 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Wed, 18 Jul 2012 13:24:20 +0200 Subject: [Numpy-discussion] Segfault in mingw in test_arrayprint.TestComplexArray Message-ID: Hi, I managed to compile NumPy in MinGW under Wine in Ubuntu 11.10 and here is a full log of the tests: https://gist.github.com/3135607 It fails at the test test_str (test_arrayprint.TestComplexArray) with a segfault like this: test_str (test_arrayprint.TestComplexArray) ... wine: Unhandled page fault on read access to 0x00000000 at address (nil) (thread 0009), starting debugger... Unhandled exception: page fault on read access to 0x00000000 in 32-bit code (0x00000000). Register dump: CS:0023 SS:002b DS:002b ES:002b FS:0063 GS:006b EIP:00000000 ESP:0041c230 EBP:00000000 EFLAGS:00010202( R- -- I - - - ) EAX:00000000 EBX:1e00807f ECX:00000000 EDX:0041c208 ESI:00f46de0 EDI:00000000 ... See the gist for the full log. Any ideas? I downloaded Python from python.org, is it supposed to work with numpy compiled using mingw? Ondrej From celine.molinaro at telecom-bretagne.eu Wed Jul 18 09:14:00 2012 From: celine.molinaro at telecom-bretagne.eu (=?ISO-8859-1?Q?Molinaro_C=E9line?=) Date: Wed, 18 Jul 2012 15:14:00 +0200 Subject: [Numpy-discussion] numpy.complex Message-ID: <5006B698.1040005@telecom-bretagne.eu> Hello, In [2]: numpy.real(arange(3)) Out[2]: array([0, 1, 2]) In [3]: numpy.complex(arange(3)) TypeError: only length-1 arrays can be converted to Python scalars Are there any reasons why numpy.complex doesn't work on arrays? Should it be bug reported? Thanks for your help C. 
Molinaro From heng at cantab.net Wed Jul 18 09:18:38 2012 From: heng at cantab.net (Henry Gomersall) Date: Wed, 18 Jul 2012 14:18:38 +0100 Subject: [Numpy-discussion] numpy.complex In-Reply-To: <5006B698.1040005@telecom-bretagne.eu> References: <5006B698.1040005@telecom-bretagne.eu> Message-ID: <1342617518.2835.9.camel@farnsworth> On Wed, 2012-07-18 at 15:14 +0200, Molinaro C?line wrote: > In [2]: numpy.real(arange(3)) > Out[2]: array([0, 1, 2]) > In [3]: numpy.complex(arange(3)) > TypeError: only length-1 arrays can be converted to Python scalars > > > Are there any reasons why numpy.complex doesn't work on arrays? > Should it be bug reported? numpy.complex is just a reference to the built in complex, so only works on scalars: In [5]: numpy.complex is complex Out[5]: True Try numpy.complex128 (or another numpy.complexN, with N being the bit length). Henry From friedrichromstedt at gmail.com Wed Jul 18 09:54:18 2012 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Wed, 18 Jul 2012 15:54:18 +0200 Subject: [Numpy-discussion] Lazy imports again In-Reply-To: References: Message-ID: <561003E7-3457-4919-B399-8DBBA6493CB3@gmail.com> A.S.: Some forewords, below the first paragraphs are annotated after I got to the conclusion at the end of the email; read these annotation in square brackets as if it were footnotes. The main point to grasp and keep in mind for me was, when it comes to package hierarchies, that "import package.module" and all sorts of import statements are completely distinct from variable lookup "package.module" although the syntactic similarity is very sweet syntactic sugar and makes things look consistent. It did confuse me, so some initial words are without this clarity; it might explain it (I don't want to delete it since it's written and hence provides context, although it leads to the end I cannot just drop it for the sake of the ending; there is no ending in fact :-). So now, have fun! [This email is because it asks if there is interest in an existing working (see below) importing delay mechanism without use of meta levels inside of Python.] Hi, Some Years ago I made a "lazy" (who's lazy here? We have work with it, the code has, so?), so let's say, a postponed import system when I made Pmw py2exe compatible (it has or "had" a version choice built-in which was broken when loading from zips, so that files were not readable; this was the issue I wrote it for). It's standard Python 2.6 or so. I would have to lookup myself what the precise mechanism was, but in principle, the postponed-for-load modules of a package are objects with overloaded __getitem__ which does the import. After this, I think [and was wrong, see below], the loaded module is not placed in Python system space sys.modules but lookup always happens through this pseudo module which does the broadcast and yields the same behaviour in terms of attribute access [correct w.r.t. the lookup]. I would not bet my hand on that the module isn't placed in sys.modules [well done, Friedrich; good decision! :-)], but I think it isn't, the usage of that Pmw package was historically to never import submodules but always leave that to the system of the package and act as if it was already imported [what is true, nevertheless, it needs to import the modules somehow, and I didn't like the importing standard module to circumvent ordinary Python syntax, so I used exec instead]. 
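(Purely for illustration -- a minimal sketch of the kind of import-delaying object described above, not the actual Pmw/pmw2 code; the class and package names here are made up. The proxy overloads attribute access and performs the real import on first use:)

import sys

class PostponedLoad(object):
    """Stand-in for a package submodule whose import is postponed."""
    def __init__(self, name):
        self._name = name          # dotted module name, e.g. "package.submodule"
        self._module = None

    def __getattr__(self, attr):
        # The first attribute access triggers the real import; done here via an
        # exec'd import statement (Python 2 style), as described above.
        if self._module is None:
            exec "import %s" % self._name
            self._module = sys.modules[self._name]
        return getattr(self._module, attr)

# and in package/__init__.py, in this style:
#     submodule = PostponedLoad("package.submodule")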
I think this would actually play nice with what numpy was doing in past and would conserve that hence [this seems to be true; a subtlety is pointed out in the postpostscriptum]. Imports from the submodules in the style "from package.submodule import Class" did work, by design and not by fiddling afair [By Python design, yes; it has nothing to do with the import-delaying objects acting as pseudo modules]. Attributes of a package which are supposed to act as to be modules whose loading is postponed are coded in the __init__.py of the package, for instance, by code like "submodule = PostponedLoad("package.submodule")", this is the style, don't take it literally. I said it's years ago and I would look it up only if there is interest [probably I meant 'if I develop interest myself'; I didnt know :-)]. I think I even made a standalone module of the functionality already but I'm not sure. There might have been some cleanup which made me never publish it although there was even interest on the Pmw list. Sigh. You can be sure that it was a clean solution with all the constraints by modeling submodules as objects of an instantiated class clearly visible. It was fully compatible with the handling Pmw was doing before, and although I don't have user feedback, I never had to change any code after switching to my own remake. This includes for sure [not for sure; there were no submodules to be specified] syntax like I mentioned above (from p.m import C). I guess I would be very much mistaken in this; so I conclude without further reading that Python first looks up the named object "p.m" before traversing into sys.modules? This are details. [Yes, I was very much mistaken, it's only "from p import C" although it goes internally down to "p.m"; and yes, this are details indeed.] What I am not sure on are local [that is, relative] imports because I guess they might rely on something what causes incompatibility. This is a "might", and there is chance that it turns into a "not", or not, depending on how local imports (which I never use; bad style :-) are tied to file structure. If they are inherently object and name based ... well I don't know. What speculation. I'm not knowledgabe on this. [The import mechanism isn't touched by this proposal here; the only thing is how e.g. "numpy.random.randn" is made available to the user without importing "numpy.random" before: and this is done by making numpy.random (when looked up by name, not in an import statement) an object importing on demand and then pretending it would be the module object itself.] I'm currently not expecting myself to do anything substantial to this. I would just give the code free (public domain) and you can do whatever you want with it. Just use it freely. Even if I would like to get involved I probably won't do more that wise recommendations on how to use it. It's well documented inline. I don't think I get into coding here: No inclination and too much responsibility. After a few monthes another one would have to maintain it anyway. So better do it yourself from the beginning. What I can help with is to give the working code free; it was really a bit of thinking involved in getting into it and solved cleanly. I'm on the iPad currently (so I don't have the code around); let me know. [I looked the code up on github; could not stand it :-)] The other person interested was from NASA ... they said they had a huge codebase where it would be handy. Didn't drive me anyway to make it published; probably for perfectionism. So it might help them also. 
I still have him in mind from time to time. So you would also do NASA peer review ... so that I can give it to him with your reference. :-D. Funny imagination. Would be good to have some good testers on this. Pmw isn't too well-known. Actually it's pretty dead in terms of development. It doesn't change anymore. (Don't know why we call this "dead" all the time.) If anyone knows, I hereby clearly distant myself from the mess in other parts of the codebase of that. Ralf, maybe we got in touch with this earlier about it; I have a vague memory. If, then it didn't work out that time either. I just got interested in the topic; not reading the list regularly. If it helps you fine; for me it's not important. I think or pretend ;-) Yours, Friedrich P.S.: If it wouldn't be important for me I wouldn't write this email. I hope I can leave the bikeshedding to others. Probably it's part of that I would like to finish this project after three years or so having it lingering around unfinished in terms of feeling. Too much of this stuff. It'll hurt to hand it over to you but I think it would be the right step still. Things that can be handed over cannot be so important to make oneself believe they are; publication is probably just that. I would like to make this a history although I see the contradiction with that my interest shows that it isn't still for me; and probably won't even succeed in this. Coding and open source is probably part of getting involved in things because one doesn't want to get involved in it; well: life. Why do we want to always make things with functionality where noone cares about how we made it and why; not even we ourselves? I don't know, I don't believe into the separation of content and form. I don't know why I'm trying to help you here with this tiny piece of postponed module loading. I don't want to know it. Even if I like to and had some knowledge it would be plain useless and negligible, unimportant. :-) ? F. P.P.S.: Couldn't help it; usage illustrated here: https://github.com/friedrichromstedt/pmw2/blob/pmw2/__init__.py, and code which matters here: https://github.com/friedrichromstedt/pmw2/blob/pmw2/dynload.py. Did this because I wanted to know about if my arising doubts on "from p.m import C" are sensible; since in pmw2, everything is toplevel. I'm looking up the Python docs on how that works ... ah ya, and it (the dynload module) places the loaded module in sys.packages since it's an simple "exec 'import ' + name" statement; so ... Ya, importing .. but I think it should work, although not as expected. When saying e.g. "from numpy.random import randn", and the object present in the numpy module as attribute being intended to import numpy.random when its attributes are accessed e.g. by "numpy.random.randn(...)" has not been used yet s.t. numpy.random isn't loaded yet and not yet present in sys.modules, then the importing mechanism for Python 2.7.3 (currently at http://docs.python.org/reference/simple_stmts.html#the-import-statement) will import numpy.random, although without knowing or taking use of the attribute "random" already present in the "numpy" module object. It is a confusion between the syntax in the import statement "from numpy.random import (...)" and the syntax used at attribute access "a = numpy.random.rand(...)", which both use the dotted hierarchy naming, but are actually something very much different. 
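(An illustrative aside to make that distinction concrete. Note that stock numpy imports numpy.random eagerly; the "delaying object" referred to in the comments is the hypothetical one under discussion here:)

from numpy.random import randn    # import statement: resolved by the import system
                                  # (sys.modules, the package's __path__), not by the
                                  # "random" attribute already sitting on the numpy object
import numpy
a = numpy.random.rand(3)          # attribute access: a plain getattr() chain on the
                                  # numpy module object -- exactly the path an
                                  # import-delaying pseudo module can intercept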
In the end, if "from numpy.random import randn" is executed before accessing any of the attributes of the "random" attribute of "numpy", accessing numpy.random.randn afterwards will cause that import-delaying object to import "numpy.random" as a module in an import statement once more[,] what doesn't harm although it's not strictly necessary [see below for refinement; there might very well be a slight alteration causing this overhead to disappear completely without trace; it's rather an implication, not an alteration.]. I think everything should just be fine as long as no-one expects the result of name lookup for "numpy.random" to yield a module object, since this lookup will instead return an import-delaying class instance behaving similar to the module object it stands for in terms of attribute access. So as long as this similarity is sufficient it would suffice (pleaonasm! Say: Code would not have to be changed to reflect the change of object type.) Since import statements seem to never lookup local variables (they just use the same syntax), judging from abovenoted docs for 2.7.3, it wouldn't even harm to have numpy as package in a local variable with the same name when doing "from numpy.random import randn" later in the same scope. Oh my god, so much of writing for so little code. And then it's still untested! :-) Am 17.07.2012 um 05:37 schrieb Travis Oliphant : > > On Jul 16, 2012, at 4:50 PM, Ralf Gommers wrote: > >> >> >> On Mon, Jul 16, 2012 at 6:28 PM, Charles R Harris wrote: >> Hi All, >> >> Working lazy imports would be useful to have. Ralf is opposed to the idea because it caused all sorts of problems on different platforms when it was tried in scipy. >> >> Note that my being opposed is because the benefits are smaller than the cost. If there was a better reason than shaving a couple of extra ms off the import time or being able to more cleanly deprecate modules, then of course it's possible I'd change my mind. > > This sort of cost / benefit analysis is always tricky. The "benefits" are always subjective. For some, shaving a few ms off of import times is *extremely* valuable and so they would opine that it absolutely outweights the "cost". Others see such benefits as barely worth mentioning. I doubt we can define an absolute scale that would allow such an optimization to actually be done. > [Precisely, there is no absolute scale allowing such optimisations to be done. There are absolute scales, but they all say: Stay away from optimisation! :-)] [Or to frame it differently: There is a language barrier which is made to have it and to be able to stick to what is called "Subjectivism". I don't think Ralf was subjective here; only the ironic or nearly cynical, a bit polemic example was. The first sentence is what counts, and Ralf, am I understanding you correctly that in some sense it contains the delusion that benefits have to outweight costs; would you agree on that you just don't see why to pay for this, although not being reluctant to pay a lot for good stuff (so to speak); s.t. what is called cost and benefit isn't an opposing pair to be balanced but instead always comes together, but here's little of both? :-)] [I think the incinerating subject of milliseconds could be called barely worth mentioning if not taking that literal since it actually is mentioned while, e.g, well, it is mentioned, I cannot mention things without mentioning them. So it cannot be barely worth mentioning soo much. 
Nevertheless I think we all agree it would have been better if we never came to that point. :-)] > My personal view is that speeding up import times is an important goal (though not the *most* important goal). The "costs" here are also very loosely defined. Lazy imports were tried with SciPy. Several people had problems with them. Mostly it was difficult to not have a "brittle" implementation that didn't break someone downstream libraries concept of what "import" meant --- systems that try to "freeze" Python programs were particularly annoyed at SciPy's lazy import mechanism. It wouldn't offend anything here what I'm proposing as the Python import mechanism is used as is; no change there. All changes inside on the code base of the package concerned are just as outlined above. This might be brittle too if there is needed more than attribute access. For instance reloads might not work with "reload(numpy.random)", I think because it depends on the name lookup of "numpy.random", which isn't a module anymore and hence cannot be reloaded (even though there is a module with that name in sys.modules). One would be able to use the numpy.random.reload() method I coded for that task apparently (in 2009?). :-) Amendment: The abovenoted docs (http://docs.python.org/reference/simple_stmts.html#the-import-statement) do not state clearly how hierarchical imports are bound, but I guess that "import numpy.random" binds the module object for numpy.random to a variable called "random" being an attribute of the variable "numpy" (or its object, to be precise). This variable "numpy" is then the only local variable bound in current scope to the "numpy" module (or package). This would mean, since the __init__.py of the "numpy" module would initiate also a "random" attribute to be the import-delaying class instance, this "random" attribute would be overwritten, since initialisation of the "numpy" package happens first (not entirely sure, but it should), before the import traversal goes on to import "numpy.random". At this point, the import-delaying object is lost, without effect, i.e. not harmfully, since it was the goal to import numpy.random anyway, and thus the delaying object can be discarded from then; it is even more efficient. I notice that this should happen also when this delay object is used first by accessing its attributes, because it just exec's the same import: "exec 'import numpy.random' " essentially. Nevertheless, overloaded attribute access is still needed, because the delaying object is already looked up when the user issues "numpy.random.randn(...)" the first time "numpy.random" needs to be imported for this. After this, the delay mechanism should hence destruct itself and leave no trace. It should be as if it has never been there, for all following calls to "numpy.random.anything". This should be verified, though; I never cared for it since I didn't consider this. It would be highly elegant, also highly efficient (as there is no penalty after the first access anymore; everything just as right now), and I probably never noted this because the package I wrote dynload for (Pmw) couldn't make use of it, as it accesses the classes in the package's modules as if they were imported into the package module object via "from package.module import *", in principle. (Thus they are callable objects which instantiate on call an attribute retrieved from the import-delaying pseudo module.) 
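(Again a sketch only, reusing the hypothetical PostponedLoad proxy from the earlier sketch and a made-up package "pkg", to picture the self-removal described in the previous paragraph:)

import types
import pkg                        # hypothetical package; its __init__.py did
                                  #     sub = PostponedLoad("pkg.sub")
pkg.sub.func()                    # first attribute access: the proxy exec's
                                  # "import pkg.sub", and the import machinery then
                                  # rebinds the "sub" attribute of the pkg module
                                  # object to the freshly loaded real module
assert isinstance(pkg.sub, types.ModuleType)
                                  # the proxy is gone; later "pkg.sub.anything"
                                  # lookups pay no extra cost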
I think the mechanism is highly versatile if tests do not fail and would be superior to metalevel approaches both in terms of simplicity and elegancy as well as penalty freeness. (Also it does not affect other packages, since the package wanting to make us of it needs to do this itself by publishing the submodules as import-delaying class instances as outlined in the beginning and in the illustration code, so by "module = dynload.Dynload('package.module')"; there might have been some renaming for the publication I planned later after doing the Pmw stuff with it.) > However, I don't think that lazy imports should be universally eschewed. They are probably not right for the NumPy library by default, but I think good use could be made of a system loaded before any other import by users who are looking to avoid the costs of importing all the packages imported by a library. In particular, I suspect we could come up with an approach for C-API users like Andrew so that they could call _import_lazy() and then _import_array() and get what they want (hooks to the C-API) without what they don't (everything else on the Python side). You can safely skip this paragraph; it says less then nothing: [I think this is rather nonoperational, no? From a universal not-eschewing follows just plain nothing than that the not-eschewing is then universal; something what's right for everything. I don't understand the relation between the second part and the third; I feel like lamped by technicalities? Here you have a perfect example of bikeshedding on my side: The discussion stays on low level and still a lot of text is produced. :-)] I'm not sure on the use; but frankly speaking I think nothing is made because people think it's useful. If people say something wouldn't be useful it often is nothing more than an objectivation of the observation of a lack of personal interest. I don't know, people who do their work with heart are probably easily driven to locate the reason in the outside world as soon as they are driven to notice the difference. If Use arguments come up it might be a sign of that the subject is a bike shed, though. Or that it's too far away for people to relate to it; I don't know. Bike sheds more tend to create arguments on the colour and size; not on its use. (So on how to do it; not if.) > I've implemented something like this in the past by using the metapath import hook to intercept import of a specific library. It could be done for NumPy in a way that doesn't affect people who don't want it, but can be used by people who want to speed up their import times and don't need much from NumPy. {Was this the thing Ilan had the talk about at PyCon DE Leipzig last year?} I never liked that; it's too much elegant for me. Somehow I feel like adding meta levels takes out the meaning. And a meta level which solves "everything" or looks like[,] even if it is not claimed to do; is just another thing on its own where it is not really clear why it is done other than actually to replace flaws in the lower layer by other flaws. I agree that this even applies to the import-delaying class instances behaving like modules. Somehow it's like a motorised bike: it will never look sensible. I have in general the impression that Python is growing beyond its bounds in terms of complexity and the package/module notion together with importing submodules as if they were just groups of toplevel objects (as it is done in numpy) might be not a design flaw but a sign that numpy is bigger than it really is or wants to be. 
Same applies probably to scipy, I don't know good enough. There should be something beyond packages and package installation/deinstallation but aside of this need probably no-one knows anything? > Certainly, in retrospect, it would have been a better design to allow the C-API to be loaded without everything else being loaded as well... -- maybe even providing a module *just* for importing the C-API that was in the top-level namespace "_numpy" or some such... In future, I'm afraid Python is expecting too much of itself, which means to have any expectations at all ... this is my personal opinion although I don't think the truth it is bound to is related to me other than by experience ... :-/ after we've seen that we can do a lot with Python we want a bit more. I don't think this will work. When people are changing from inventing to enhancing and embettering normally to my perceiving the "new" things arising then are not as sensible as the "invented" things before (which were not "new"; but shared what all inventions share). It's probably not pessimistic to say that Python is beyond peak; Python 3k might be a sign? Things like this just happen, all the time. People live, grow, mature, and die; it's very normal. Same with other entities; same with Python. Too fast-growing entities probably have a lack on other sides. And I really adore Python; I love it, although I didn't go beyond Python 2.6. I did ten thousands line of code per year without trash; I used it every day and it worked just fine (mostly). I did my whole scientific education in University with it as long as this was computer-related and not about text. I'm using it since about 2002 where it was about Python . And it worked that time also. Ten or twenty years are a long life time for computer languages I think. If we need to change something "substantial" without having another "idea" with it; I think we're doing something wrong then. It won't lead anywhere because it knows already what it wants. And the boundaries impressed by what is already there are nothing more than an obstacle then. Again, I'm not pessimistic, I see that it is unrelated to the list but I also find that the subject is in the sense I outlined above. It might explain that inconceivable colour smashing that happened recently ("by what?" ? "exactly."). Unrelated people are attracted by unrelated things. I don't like this development on the list, I find it annoying, and although I'm about to unsubscribe I have to confess that I believe myself and this post is somehow part of it. It's just that I'm doing critics about it and do not believe it would matter in the sense it should or people might be inclined to think I would like it to see alike. I think mailing lists allowing its members to have thoughts like this without being dreamy or "unrelated", but serious and honest; there must be something wrong with the subject, no? Things grow with their people, does anyone have similar thoughts or feelings? I might touch an open miracle here. It's because I notice this tendency without account for age in the minds of many people around me. And they agree with my thoughts, what is the most horrid. Maybe we should >>> import lazy again. Friedrich. 
P.B.: I guess I will get now the usual banners saying "You know very well that tenthousands of users and innocent and serious people will receive this text, even if they don't read it, and waste at least a button click on it or some minutes and some thinking otherwise?"; as well as "You wrote this email at speed xy.z characters per minute, do you think that was too fast"; and "You want to let it cool off a bit?"; so they have been there already: I think I will answer them "Yes", "hm, typing is probably fractal so how can you know, but what I know is that the speed was here a quarter of an email per hour!", and finally "No, not again! Then it will fall to ashes and be stone like!" :-) let's see ... ah right, the horrid impression mentioned above also is because people rather agree on that they don't understand what I'm talking about or say that themselves, if they don't plainly agree. So it cannot be that horrid. Probably it's an application of Goedel's theorem ;-) Quality Assurance is near to be passed ... about to smash QA ... Okay I have let it cool off one day actually, I was driven to sending and redacting it [now] by the solution which came up and which I find inelegant and highly agressive to the interpreter internals; I know it's possible but I find it a misuse for production environments. Yes, I don't think so of what I'm proposing here; I don't even call it a solution, not only because it's untested in numpy context, but also because it doesn't solve a problem but rather implements a concept. :-). F. > Best, > > -Travis > > > >> >> Ralf >> >> >> I thought I'd open the topic for discussion so that folks who had various problems/solutions could offer input and the common experience could be collected in one place. Perhaps there is a solution that actually works. Some bikesheds for people to fling on (e.g., me): What is a common experience? What is a solution that doesn't work? I agree on that there are things which work but aren't solutions. :-) Only joking, I'm just smiling because I got through editing (I was afraid I never would :-). >> Ideas? Yes! :-D >> Chuck Friedrich. Sending now. Hey, you got to here? Did you actually read? :-) >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Jul 18 10:02:18 2012 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 18 Jul 2012 15:02:18 +0100 Subject: [Numpy-discussion] "Symbol table not found" compiling numpy from git repository on Windows In-Reply-To: References: Message-ID: On Wed, Jul 18, 2012 at 11:38 AM, Ond?ej ?ert?k wrote: > Now things work and start compiling. Any ideas what is going on here? > Why is it trying to "build the msvcr" library? I believe that it's actually trying to link to the msvcr library (which requires first creating some file that describes its contents). And this is being done because it's trying to workaround mingw's lack of a "localtime_s" function. 
AFAICT the solution is to just delete this code and go back to using plain old "localtime" (see the other thread). -n From njs at pobox.com Wed Jul 18 10:06:25 2012 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 18 Jul 2012 15:06:25 +0100 Subject: [Numpy-discussion] Finished maintenance/1.7.x branch In-Reply-To: <60C559B9-B858-4344-A7C3-50702D629ED1@continuum.io> References: <60C559B9-B858-4344-A7C3-50702D629ED1@continuum.io> Message-ID: On Wed, Jul 18, 2012 at 7:09 AM, Travis Oliphant wrote: > > Hey all, > > We are going to work on a beta release on the 1.7.x branch. The master is open again for changes for 1.8.x. There will be some work on the 1.7.x branch to fix bugs including bugs that are already reported but have not yet been addressed (like the regression against data-type detection for Sage). It would be great if 1.7.x gets as much testing as possible so that we can discover regressions that may have occurred. But, it was important to draw the line for 1.7.0 features. There are two strategies for working with maintenance branches: 1) commit bug-fixes to maintenance, merge maintenance back into master as needed 2) commit bug-fixes to master, cherry-pick to maintenance, never merge maintenance back into master Which are we using? (I.e., which branch should bug-fix PRs be submitted against?) -n From travis at continuum.io Wed Jul 18 10:17:02 2012 From: travis at continuum.io (Travis Oliphant) Date: Wed, 18 Jul 2012 09:17:02 -0500 Subject: [Numpy-discussion] Finished maintenance/1.7.x branch In-Reply-To: References: <60C559B9-B858-4344-A7C3-50702D629ED1@continuum.io> Message-ID: I don't have a strong preference. Which one do others prefer? Travis -- Travis Oliphant (on a mobile) 512-826-7480 On Jul 18, 2012, at 9:06 AM, Nathaniel Smith wrote: > On Wed, Jul 18, 2012 at 7:09 AM, Travis Oliphant wrote: >> >> Hey all, >> >> We are going to work on a beta release on the 1.7.x branch. The master is open again for changes for 1.8.x. There will be some work on the 1.7.x branch to fix bugs including bugs that are already reported but have not yet been addressed (like the regression against data-type detection for Sage). It would be great if 1.7.x gets as much testing as possible so that we can discover regressions that may have occurred. But, it was important to draw the line for 1.7.0 features. > > There are two strategies for working with maintenance branches: > 1) commit bug-fixes to maintenance, merge maintenance back into master as needed > 2) commit bug-fixes to master, cherry-pick to maintenance, never merge > maintenance back into master > > Which are we using? (I.e., which branch should bug-fix PRs be > submitted against?) 
> > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From scott.sinclair.za at gmail.com Wed Jul 18 10:45:28 2012 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Wed, 18 Jul 2012 16:45:28 +0200 Subject: [Numpy-discussion] numpy.complex In-Reply-To: <5006B698.1040005@telecom-bretagne.eu> References: <5006B698.1040005@telecom-bretagne.eu> Message-ID: On 18 July 2012 15:14, Molinaro C?line wrote: > Hello, > > In [2]: numpy.real(arange(3)) > Out[2]: array([0, 1, 2]) > In [3]: numpy.complex(arange(3)) > TypeError: only length-1 arrays can be converted to Python scalars I think you're looking for the dtype keyword to the ndarray constructor: import numpy as np np.arange(3, dtype=np.complex) Out[2]: array([ 0.+0.j, 1.+0.j, 2.+0.j]) or if you have an existing array to cast: np.asarray(np.arange(3), dtype=np.complex) Out[3]: array([ 0.+0.j, 1.+0.j, 2.+0.j]) You can get the real and imaginary components of your complex array like so: a = np.arange(3, dtype=np.complex) a Out[9]: array([ 0.+0.j, 1.+0.j, 2.+0.j]) a.real Out[10]: array([ 0., 1., 2.]) a.imag Out[11]: array([ 0., 0., 0.]) Cheers, Scott From cournape at gmail.com Wed Jul 18 11:10:23 2012 From: cournape at gmail.com (David Cournapeau) Date: Wed, 18 Jul 2012 16:10:23 +0100 Subject: [Numpy-discussion] "Symbol table not found" compiling numpy from git repository on Windows In-Reply-To: References: Message-ID: On Wed, Jul 18, 2012 at 11:38 AM, Ond?ej ?ert?k wrote: > On Wed, Jul 18, 2012 at 12:30 PM, Ond?ej ?ert?k wrote: >> On Wed, Jul 18, 2012 at 2:20 AM, Ond?ej ?ert?k wrote: >>> On Thu, Jan 5, 2012 at 8:22 PM, John Salvatier >>> wrote: >>>> Hello, >>>> >>>> I'm trying to compile numpy on Windows 7 using the command: "python setup.py >>>> config --compiler=mingw32 build" but I get an error about a symbol table not >>>> found. Anyone know how to work around this or what to look into? 
>>>> [... same full "python setup.py config --compiler=mingw32 build" log and
>>>> traceback as quoted earlier in this thread, ending with ...]
>>>> ValueError: Symbol table not found
>>>
>>>
>>> Did you find a workaround? I am having exactly the same problem.
>> >> So this happens both in Windows and in Wine and the problem is that >> the numpy distutils is trying to read the symbol table using "objdump" >> from msvcr90.dll but it can't recognize the format: >> >> objdump.exe: C:\windows\winsxs\x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.4148_none_deadbeef\msvcr90.dll: >> File format not recognized >> >> The file exists: >> >> >> $ file ~/.wine/drive_c/windows/winsxs/x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.4148_none_deadbeef/msvcr90.dll >> /home/ondrej/.wine/drive_c/windows/winsxs/x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.4148_none_deadbeef/msvcr90.dll: >> PE32 executable for MS Windows (DLL) (unknown subsystem) Intel 80386 >> 32-bit >> >> >> But objdump doesn't work on it. > > So the following patch fixes it: > > > diff --git a/numpy/distutils/mingw32ccompiler.py b/numpy/distutils/mingw32ccompi > index 5b9aa33..72ff5ed 100644 > --- a/numpy/distutils/mingw32ccompiler.py > +++ b/numpy/distutils/mingw32ccompiler.py > @@ -91,11 +91,11 @@ class Mingw32CCompiler(distutils.cygwinccompiler.CygwinCComp > build_import_library() > > # Check for custom msvc runtime library on Windows. Build if it doesn't > - msvcr_success = build_msvcr_library() > - msvcr_dbg_success = build_msvcr_library(debug=True) > - if msvcr_success or msvcr_dbg_success: > - # add preprocessor statement for using customized msvcr lib > - self.define_macro('NPY_MINGW_USE_CUSTOM_MSVCR') > + #msvcr_success = build_msvcr_library() > + #msvcr_dbg_success = build_msvcr_library(debug=True) > + #if msvcr_success or msvcr_dbg_success: > + # # add preprocessor statement for using customized msvcr lib > + # self.define_macro('NPY_MINGW_USE_CUSTOM_MSVCR') > > # Define the MSVC version as hint for MinGW > msvcr_version = '0x%03i0' % int(msvc_runtime_library().lstrip('msvcr')) > > > > > Now things work and start compiling. Any ideas what is going on here? > Why is it trying to "build the msvcr" library? Because the import library did not exist in older mingw versions, we had to build it IIRC. The fact that objdump does not recognize the dll format is strange. David From cournape at gmail.com Wed Jul 18 11:29:27 2012 From: cournape at gmail.com (David Cournapeau) Date: Wed, 18 Jul 2012 16:29:27 +0100 Subject: [Numpy-discussion] Finished maintenance/1.7.x branch In-Reply-To: References: <60C559B9-B858-4344-A7C3-50702D629ED1@continuum.io> Message-ID: On Wed, Jul 18, 2012 at 3:17 PM, Travis Oliphant wrote: > I don't have a strong preference. Which one do others prefer? > We've used 2) in the past, and I don't think Ralf changed this when he took over release maintenance. David From ralf.gommers at googlemail.com Wed Jul 18 13:12:59 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 18 Jul 2012 19:12:59 +0200 Subject: [Numpy-discussion] Finished maintenance/1.7.x branch In-Reply-To: References: <60C559B9-B858-4344-A7C3-50702D629ED1@continuum.io> Message-ID: On Wed, Jul 18, 2012 at 5:29 PM, David Cournapeau wrote: > On Wed, Jul 18, 2012 at 3:17 PM, Travis Oliphant > wrote: > > I don't have a strong preference. Which one do others prefer? > > > > We've used 2) in the past, and I don't think Ralf changed this when he > took over release maintenance. Indeed, everything normally lands in master first. I also have the habit of waiting a few days before backporting when possible, because that way a lot of testing already happens by people tracking master. Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at googlemail.com Wed Jul 18 13:25:04 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 18 Jul 2012 19:25:04 +0200 Subject: [Numpy-discussion] Segfault in mingw in test_arrayprint.TestComplexArray In-Reply-To: References: Message-ID: On Wed, Jul 18, 2012 at 1:24 PM, Ond?ej ?ert?k wrote: > Hi, > > I managed to compile NumPy in MinGW under Wine in Ubuntu 11.10 and > here is a full log of the tests: > > https://gist.github.com/3135607 > > It fails at the test test_str (test_arrayprint.TestComplexArray) with > a segfault like this: > > > test_str (test_arrayprint.TestComplexArray) ... wine: Unhandled page > fault on read access to 0x00000000 at address (nil) (thread 0009), > starting debugger... > Unhandled exception: page fault on read access to 0x00000000 in 32-bit > code (0x00000000). > Register dump: > CS:0023 SS:002b DS:002b ES:002b FS:0063 GS:006b > EIP:00000000 ESP:0041c230 EBP:00000000 EFLAGS:00010202( R- -- I - - - > ) > EAX:00000000 EBX:1e00807f ECX:00000000 EDX:0041c208 > ESI:00f46de0 EDI:00000000 > > ... > > See the gist for the full log. Any ideas? I downloaded Python from > python.org, is it supposed to work with numpy compiled using mingw? > It is. I have Python 2.5 ... 3.2 from Python.org, with MinGW from http://prdownloads.sourceforge.net/mingw/MinGW-5.0.3.exe?download and ATLAS binaries from https://github.com/numpy/vendor. Then I normally build numpy/scipy with "paver bdist_wininst_simple" or "paver bdist_wininst_superpack". This also requires MakeNSIS and the CpuId plugin for it, as documented at https://github.com/numpy/numpy/blob/master/doc/HOWTO_RELEASE.rst.txt Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Jul 18 16:22:37 2012 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 18 Jul 2012 21:22:37 +0100 Subject: [Numpy-discussion] Finished maintenance/1.7.x branch In-Reply-To: References: <60C559B9-B858-4344-A7C3-50702D629ED1@continuum.io> Message-ID: On Wed, Jul 18, 2012 at 6:12 PM, Ralf Gommers wrote: > > > On Wed, Jul 18, 2012 at 5:29 PM, David Cournapeau > wrote: >> >> On Wed, Jul 18, 2012 at 3:17 PM, Travis Oliphant >> wrote: >> > I don't have a strong preference. Which one do others prefer? >> > >> >> We've used 2) in the past, and I don't think Ralf changed this when he >> took over release maintenance. > > > Indeed, everything normally lands in master first. I also have the habit of > waiting a few days before backporting when possible, because that way a lot > of testing already happens by people tracking master. Okay then: https://github.com/numpy/numpy/pull/362 -n From travis at vaught.net Wed Jul 18 16:48:43 2012 From: travis at vaught.net (Travis Vaught) Date: Wed, 18 Jul 2012 15:48:43 -0500 Subject: [Numpy-discussion] Proposed change in genfromtxt(..., comments='#', names=True) behaviour In-Reply-To: References: <1342192511.19936.22.camel@esdceeprjpstudent1.mit.edu> <1342200571.19936.32.camel@esdceeprjpstudent1.mit.edu> <35FEFC2B-E373-410D-971F-8A3AA3E0FCBF@continuum.io> <1342440834.12313.26.camel@khaeru-desktop> <1342465575.14478.15.camel@esdceeprjpstudent1.mit.edu> <1342472440.14478.38.camel@esdceeprjpstudent1.mit.edu> Message-ID: On Jul 17, 2012, at 10:47 AM, Nathaniel Smith wrote: > > Not that there are likely to be people using skip_header=True as an > alias for skip_header=1, but if they were it would currently work. I write messy code like that all the time. Best, Travis -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ondrej.certik at gmail.com Wed Jul 18 20:24:12 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 19 Jul 2012 02:24:12 +0200 Subject: [Numpy-discussion] Segfault in mingw in test_arrayprint.TestComplexArray In-Reply-To: References: Message-ID: Hi Ralf, On Wed, Jul 18, 2012 at 7:25 PM, Ralf Gommers wrote: > > > On Wed, Jul 18, 2012 at 1:24 PM, Ond?ej ?ert?k > wrote: >> >> Hi, >> >> I managed to compile NumPy in MinGW under Wine in Ubuntu 11.10 and >> here is a full log of the tests: >> >> https://gist.github.com/3135607 >> >> It fails at the test test_str (test_arrayprint.TestComplexArray) with >> a segfault like this: >> >> >> test_str (test_arrayprint.TestComplexArray) ... wine: Unhandled page >> fault on read access to 0x00000000 at address (nil) (thread 0009), >> starting debugger... >> Unhandled exception: page fault on read access to 0x00000000 in 32-bit >> code (0x00000000). >> Register dump: >> CS:0023 SS:002b DS:002b ES:002b FS:0063 GS:006b >> EIP:00000000 ESP:0041c230 EBP:00000000 EFLAGS:00010202( R- -- I - - - >> ) >> EAX:00000000 EBX:1e00807f ECX:00000000 EDX:0041c208 >> ESI:00f46de0 EDI:00000000 >> >> ... >> >> See the gist for the full log. Any ideas? I downloaded Python from >> python.org, is it supposed to work with numpy compiled using mingw? > > > It is. I have Python 2.5 ... 3.2 from Python.org, with MinGW from > http://prdownloads.sourceforge.net/mingw/MinGW-5.0.3.exe?download and ATLAS > binaries from https://github.com/numpy/vendor. > > Then I normally build numpy/scipy with "paver bdist_wininst_simple" or > "paver bdist_wininst_superpack". This also requires MakeNSIS and the CpuId > plugin for it, as documented at > https://github.com/numpy/numpy/blob/master/doc/HOWTO_RELEASE.rst.txt Thanks for the details. I'll try to reproduce it. I have no idea why my setup doesn't work. I've nailed it to: >>> import numpy as np >>> np.array([complex(0, 1)], np.complex64) wine: Unhandled page fault on read access to 0x00000000 at address (nil) (thread 0009), starting debugger... ... So it is something with the complex types. Ondrej From nouiz at nouiz.org Thu Jul 19 01:04:53 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Thu, 19 Jul 2012 01:04:53 -0400 Subject: [Numpy-discussion] Pull Requests I'm planning to merge In-Reply-To: <8E141C42-1EDC-47BD-AB05-A64835ADCEB6@continuum.io> References: <8E141C42-1EDC-47BD-AB05-A64835ADCEB6@continuum.io> Message-ID: Hi, On Tue, Jul 17, 2012 at 10:20 PM, Travis Oliphant wrote: > > On Jul 17, 2012, at 6:48 PM, Nathaniel Smith wrote: > >> On Tue, Jul 17, 2012 at 9:56 PM, Travis Oliphant wrote: >>> >>> I would like to merge the following pull requests sometime today: >>> >>> * 326 -- inplace increment function >> >> -1, for the reasons stated in the comment thread -- we shouldn't lock >> ourselves into an ugly API when there's discussion going on about the >> right solution, and the submitter has expressed interest in rewriting >> it in a better way. Also to be clear, I'm not saying I will try to >> block this functionality forever unless it's perfect or anything -- I >> could probably be convinced that putting in something sub-optimal here >> was the best trade-off. Really what I'm -1 on is shoving something in >> without proper discussion just because the release is happening, on >> the theory that we can clean up any mess later. Until your comment a >> few hours ago no-one even looked at this new API except me and the >> submitter... > > You are incorrect about that. 
I've looked at it several times since it was posted. I've looked at and studied all the PRs multiple times in fact over the past several months. I've waited for others to express an opinion. I don't think it's accurate to assume that people have not seen something because they don't comment. > > I will hold off on this one, but only because you raised your objection to -1 (instead of -0.5). But, I disagree very strongly with you that it is "sub-optimal" in the context of NumPy. I actually think it is a very reasonable solution. He has re-used the MapIter code correctly (which is the only code that *does* fancy indexing). The author has done a lot of work to support multiple data-types and satisfy an often-requested feature in a very reasonable way. Your objection seems to be that you would prefer that it were a method on ufuncs. I don't think this will work without a *major* refactor. I'm not sure it's even worth it then either. Due to Travis comments, I expect the "better" solution won't be done in the short time. So blocking a feature requeted frequently because we could do sometimes in an unknow futur something better is not a good idea. (if I get it wrong on the expectation, please, correct me!) Also, if we decide to change the interface, we could do it for NumPy 2, so I don't see a need for an infinit time support of this API. So in conclusion, can we say that when John will have make is PR work with all dtype, we merge it? Also, I understood that it won't be included in NumPy 1.7. Is that right? Fred From ralf.gommers at googlemail.com Thu Jul 19 02:56:00 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 19 Jul 2012 08:56:00 +0200 Subject: [Numpy-discussion] Remove current 1.7 branch? In-Reply-To: References: Message-ID: On Thu, Jul 12, 2012 at 1:54 PM, Nathaniel Smith wrote: > On Thu, Jul 12, 2012 at 12:48 PM, Benjamin Root wrote: > > > > > > On Thursday, July 12, 2012, Thouis (Ray) Jones wrote: > >> > >> On Thu, Jul 12, 2012 at 1:28 AM, Charles R Harris > >> wrote: > >> > Hi All, > >> > > >> > Travis and I agree that it would be appropriate to remove the current > >> > 1.7.x > >> > branch and branch again after a code freeze. That way we can avoid the > >> > pain > >> > and potential errors of backports. It is considered bad form to mess > >> > with > >> > public repositories that way, so another option would be to rename the > >> > branch, although I'm not sure how well that would work. Suggestions? > >> > >> I might be mistaken, but if the branch is merged into master (even if > >> that merge makes no changes), I think it's safe to delete it at that > >> point (and recreate it at a later date with the same name) with > >> regards to remote repositories. It should be fairly easy to test. > >> > >> Ray Jones > > > > > > No, that is not the case. We had a situation occur awhile back where > one of > > the public branches of mpl got completely messed up. You can't even > rename > > it since the rename doesn't occur in the pulls and merges. > > > > What we ended up doing was creating a brand new branch "v1.0.x-maint" and > > making sure all the devs knew to switch over to that. You might even go > a > > step further and make a final commit to the bad branch that makes the > build > > fail with a big note explaining what to do. > > The branch isn't bad, it's just out of date. So long as the new > version of the branch has the current version of the branch in its > ancestry, then everything will be fine. 
> > Option 1: > git checkout master > git merge maint1.7.x > git checkout maint1.7.x > git merge master # will be a fast-forward > > Option 2: > git checkout master > git merge maint1.7.x > git branch -d maint1.7.x # delete the branch > git checkout -b maint1.7.x # recreate it > > In git terms these two options are literally identical; they result in > the exact same repo state... > $ git co 1.7.x Switched to branch '1.7.x' Your branch and 'upstream/maintenance/1.7.x' have diverged, and have 1 and 124 different commit(s) each, respectively. $ git pull Auto-merging numpy/core/SConscript Auto-merging numpy/core/bscript CONFLICT (content): Of course I can fix this easily, but why are we having this long thread, coming to a conclusion and then doing something else? Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Jul 19 04:50:22 2012 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 19 Jul 2012 09:50:22 +0100 Subject: [Numpy-discussion] Indexing API Message-ID: So the underlying problem with the controversial inplace_increment PR[1] is that currently, there's actually nothing in the public numpy API that exposes the workings of numpy indexing. The only thing you can do with a numpy index is call ndarray.__getattr__ or __setattr__. This is a pretty obvious gap, given how fundamental an operation indexing is in numpy (and how difficult to emulate). So how can we expose something that fixes it? Make PyArrayMapIterObject part of the public API? Something else? -n [1] https://github.com/numpy/numpy/pull/326 From njs at pobox.com Thu Jul 19 05:56:33 2012 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 19 Jul 2012 10:56:33 +0100 Subject: [Numpy-discussion] indexed increment (was Re: Pull Requests I'm planning to merge) Message-ID: On Thu, Jul 19, 2012 at 6:04 AM, Fr?d?ric Bastien wrote: > On Tue, Jul 17, 2012 at 10:20 PM, Travis Oliphant wrote: >> I will hold off on this one, but only because you raised your objection to -1 (instead of -0.5). But, I disagree very strongly with you that it is "sub-optimal" in the context of NumPy. I actually think it is a very reasonable solution. He has re-used the MapIter code correctly (which is the only code that *does* fancy indexing). The author has done a lot of work to support multiple data-types and satisfy an often-requested feature in a very reasonable way. Your objection seems to be that you would prefer that it were a method on ufuncs. I don't think this will work without a *major* refactor. I'm not sure it's even worth it then either. > > Due to Travis comments, I expect the "better" solution won't be done > in the short time. So blocking a feature requeted frequently because > we could do sometimes in an unknow futur something better is not a > good idea. (if I get it wrong on the expectation, please, correct me!) Really I'm making two points: 1- this design smells just "off" enough that we should discuss it to figure out whether or not it's actually the best way forward. 2- such discussions are too important to just skip whenever a release is imminent. there will be another release... > Also, if we decide to change the interface, we could do it for NumPy > 2, so I don't see a need for an infinit time support of this API. The "NumPy 2" idea is not very realistic IMHO, and shouldn't be used as an excuse for being sloppy. We're never going to be able to go through and make all the APIs perfect in one release cycle. Especially if we keep deferring all hard questions until then. 
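(An aside to make the feature under discussion concrete: the snippet below only restates the well-known buffering behaviour of fancy-index assignment, and shows np.bincount as one pre-existing workaround for the integer case. It assumes nothing about the proposed inplace_increment / ufunc-method API itself:)

import numpy as np

a = np.zeros(5)
idx = np.array([1, 1, 1, 3])
a[idx] += 1                     # buffered fancy indexing: each repeated index is
                                # incremented only once
print a                         # [ 0.  1.  0.  1.  0.]   not [ 0.  3.  0.  1.  0.]

counts = np.zeros(5, dtype=int)
counts += np.bincount(idx, minlength=5)   # accumulates duplicate indices correctly
print counts                    # [0 3 0 1 0]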
> So in conclusion, can we say that when John will have make is PR work > with all dtype, we merge it? Well, we'll come up with something :-). (Assuming that people stay interested, which seems likely since this is such a common problem people have.) But I'm not sure what. Making the PR work with all dtypes is exactly what Travis is arguing is too much work. > Also, I understood that it won't be > included in NumPy 1.7. Is that right? Yes. -n From ondrej.certik at gmail.com Thu Jul 19 06:07:30 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 19 Jul 2012 12:07:30 +0200 Subject: [Numpy-discussion] Remove current 1.7 branch? In-Reply-To: References: Message-ID: On Thu, Jul 19, 2012 at 8:56 AM, Ralf Gommers wrote: > > > On Thu, Jul 12, 2012 at 1:54 PM, Nathaniel Smith wrote: >> >> On Thu, Jul 12, 2012 at 12:48 PM, Benjamin Root wrote: >> > >> > >> > On Thursday, July 12, 2012, Thouis (Ray) Jones wrote: >> >> >> >> On Thu, Jul 12, 2012 at 1:28 AM, Charles R Harris >> >> wrote: >> >> > Hi All, >> >> > >> >> > Travis and I agree that it would be appropriate to remove the current >> >> > 1.7.x >> >> > branch and branch again after a code freeze. That way we can avoid >> >> > the >> >> > pain >> >> > and potential errors of backports. It is considered bad form to mess >> >> > with >> >> > public repositories that way, so another option would be to rename >> >> > the >> >> > branch, although I'm not sure how well that would work. Suggestions? >> >> >> >> I might be mistaken, but if the branch is merged into master (even if >> >> that merge makes no changes), I think it's safe to delete it at that >> >> point (and recreate it at a later date with the same name) with >> >> regards to remote repositories. It should be fairly easy to test. >> >> >> >> Ray Jones >> > >> > >> > No, that is not the case. We had a situation occur awhile back where >> > one of >> > the public branches of mpl got completely messed up. You can't even >> > rename >> > it since the rename doesn't occur in the pulls and merges. >> > >> > What we ended up doing was creating a brand new branch "v1.0.x-maint" >> > and >> > making sure all the devs knew to switch over to that. You might even go >> > a >> > step further and make a final commit to the bad branch that makes the >> > build >> > fail with a big note explaining what to do. >> >> The branch isn't bad, it's just out of date. So long as the new >> version of the branch has the current version of the branch in its >> ancestry, then everything will be fine. >> >> Option 1: >> git checkout master >> git merge maint1.7.x >> git checkout maint1.7.x >> git merge master # will be a fast-forward >> >> Option 2: >> git checkout master >> git merge maint1.7.x >> git branch -d maint1.7.x # delete the branch >> git checkout -b maint1.7.x # recreate it >> >> In git terms these two options are literally identical; they result in >> the exact same repo state... > > > $ git co 1.7.x > Switched to branch '1.7.x' > Your branch and 'upstream/maintenance/1.7.x' have diverged, > and have 1 and 124 different commit(s) each, respectively. > > $ git pull > Auto-merging numpy/core/SConscript > Auto-merging numpy/core/bscript > CONFLICT (content): > > Of course I can fix this easily, but why are we having this long thread, > coming to a conclusion and then doing something else? Unfortunately the maintenance/1.7.x was rebased, but when I pulled the new branch into my local repository, I have carefully checked and I think that no work/patches were lost. 
As to me, one should try not to do any rebasing once the branch is up there, visible to everybody. But apart from that, for example in SymPy I think we always try to merge the release branch (after the release) with the master, and remove it. The tag stays there, so if (in the future) one has to add some new patches and create a bug-fix release, one can easily do so. At the same time, one doesn't have to carry all these branches around. Ondrej From ondrej.certik at gmail.com Thu Jul 19 06:46:22 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 19 Jul 2012 12:46:22 +0200 Subject: [Numpy-discussion] Segfault in mingw in test_arrayprint.TestComplexArray In-Reply-To: References: Message-ID: On Thu, Jul 19, 2012 at 2:24 AM, Ond?ej ?ert?k wrote: > Hi Ralf, > > On Wed, Jul 18, 2012 at 7:25 PM, Ralf Gommers > wrote: >> >> >> On Wed, Jul 18, 2012 at 1:24 PM, Ond?ej ?ert?k >> wrote: >>> >>> Hi, >>> >>> I managed to compile NumPy in MinGW under Wine in Ubuntu 11.10 and >>> here is a full log of the tests: >>> >>> https://gist.github.com/3135607 >>> >>> It fails at the test test_str (test_arrayprint.TestComplexArray) with >>> a segfault like this: >>> >>> >>> test_str (test_arrayprint.TestComplexArray) ... wine: Unhandled page >>> fault on read access to 0x00000000 at address (nil) (thread 0009), >>> starting debugger... >>> Unhandled exception: page fault on read access to 0x00000000 in 32-bit >>> code (0x00000000). >>> Register dump: >>> CS:0023 SS:002b DS:002b ES:002b FS:0063 GS:006b >>> EIP:00000000 ESP:0041c230 EBP:00000000 EFLAGS:00010202( R- -- I - - - >>> ) >>> EAX:00000000 EBX:1e00807f ECX:00000000 EDX:0041c208 >>> ESI:00f46de0 EDI:00000000 >>> >>> ... >>> >>> See the gist for the full log. Any ideas? I downloaded Python from >>> python.org, is it supposed to work with numpy compiled using mingw? >> >> >> It is. I have Python 2.5 ... 3.2 from Python.org, with MinGW from >> http://prdownloads.sourceforge.net/mingw/MinGW-5.0.3.exe?download and ATLAS >> binaries from https://github.com/numpy/vendor. >> >> Then I normally build numpy/scipy with "paver bdist_wininst_simple" or >> "paver bdist_wininst_superpack". This also requires MakeNSIS and the CpuId >> plugin for it, as documented at >> https://github.com/numpy/numpy/blob/master/doc/HOWTO_RELEASE.rst.txt > > Thanks for the details. I'll try to reproduce it. I have no idea why > my setup doesn't work. > I've nailed it to: > >>>> import numpy as np >>>> np.array([complex(0, 1)], np.complex64) > wine: Unhandled page fault on read access to 0x00000000 at address > (nil) (thread 0009), starting debugger... > ... > > So it is something with the complex types. I've also tried these compilers: http://tdm-gcc.tdragon.net/ Those are gcc 4.6.1: $ gcc -v Using built-in specs. 
COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=c:/mingw32/bin/../libexec/gcc/mingw32/4.6.1/lto-wrapper.exe Target: mingw32 Configured with: ../../src/gcc-4.6.1/configure --build=mingw32 --enable-languages=c,c++,ada,fortran,objc,obj-c++ --enable-threads=win32 --enable-libgomp --enable-lto --enable-fully-dynamic-string --enable-libstdcxx-debug --enable-version-specific-runtime-libs --with-gnu-ld --disable-nls --disable-win32-registry --disable-symvers --disable-werror --prefix=/mingw32tdm --with-local-prefix=/mingw32tdm --enable-cxx-flags='-fno-function-sections -fno-data-sections' --with-pkgversion=tdm-1 --enable-sjlj-exceptions --with-bugurl=http://tdm-gcc.tdragon.net/bugs Thread model: win32 gcc version 4.6.1 (tdm-1) and I use Python from python.org, I use the following installer: python-2.7.3.msi, that is, a 32bit one. The tdm-gcc is also 32 bit. And the same segfault. Ralf, I am going to try your setup now. Ondrej From scrappedprince.li at gmail.com Thu Jul 19 06:52:02 2012 From: scrappedprince.li at gmail.com (Cheng Li) Date: Thu, 19 Jul 2012 18:52:02 +0800 Subject: [Numpy-discussion] numpy.fromfunction() doesn't work as expected? Message-ID: <016f01cd659c$8a59b1d0$9f0d1570$@gmail.com> Hi All, I have spot a strange behavior of numpy.fromfunction(). The sample codes are as follows: >>> import numpy as np >>> def myOnes(i,j): return 1.0 >>> a = np.fromfunction(myOnes,(2000,2000)) >>> a 1.0 Actually what I expected is that the function will return a 2000*2000 2d array with unit value. The returned single float value really confused me. Is this a known bug? The numpy version I used is 1.6.1. Regards, Cheng -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej.certik at gmail.com Thu Jul 19 06:58:42 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 19 Jul 2012 12:58:42 +0200 Subject: [Numpy-discussion] Segfault in mingw in test_arrayprint.TestComplexArray In-Reply-To: References: Message-ID: >> I've nailed it to: >> >>>>> import numpy as np >>>>> np.array([complex(0, 1)], np.complex64) >> wine: Unhandled page fault on read access to 0x00000000 at address >> (nil) (thread 0009), starting debugger... >> ... Btw, I tried to debug it using: $ winedbg --gdb "C:\Python27\python" and I got: Wine-gdb> backtrace #0 0x7bc73735 in call_exception_handler () from /usr/bin/../lib32/wine/ntdll.dll.so #1 0x7bc76397 in ?? () from /usr/bin/../lib32/wine/ntdll.dll.so #2 0xdeadbabe in ?? () Backtrace stopped: Not enough registers or memory available to unwind further So unfortunately it doesn't show where exactly it fails, I would need to get the full stack trace. Also I noticed even simpler way to segfault it: import numpy numpy.array([1j]) Ondrej From travis at continuum.io Thu Jul 19 09:45:21 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 19 Jul 2012 08:45:21 -0500 Subject: [Numpy-discussion] Remove current 1.7 branch? In-Reply-To: References: Message-ID: I'm not sure what the conclusion actually was from this long thread. But, in trying to update the 1.7.x branch, I ended up in a very odd state with respect to the github pages. I used git filter-branch to try and get rid of "empty" commits that were showing up for some reason. 
However, this resulted in a branch that seemed fine on my local version but when pushed to github ended up duplicating nearly every commit in the maintenance branch so that the commits page for maintenance/1.7.x showed a duplicate commit for every actual commit, I didn't know how to fix this except to delete the branch (after doing a diff with master), recreate the branch, and apply the saved diff as a patch. I'm very sorry if I messed anyone up. I thought the plan was to delete the branch anyway. There could be something else wrong as well, but I'm not sure what the implication of your message is, exactly. People using the maintenance/1.7.x branch (how many people were actually using it?) will need to delete their local branch and re-pull from github. Best, -Travis On Jul 19, 2012, at 1:56 AM, Ralf Gommers wrote: > > > On Thu, Jul 12, 2012 at 1:54 PM, Nathaniel Smith wrote: > On Thu, Jul 12, 2012 at 12:48 PM, Benjamin Root wrote: > > > > > > On Thursday, July 12, 2012, Thouis (Ray) Jones wrote: > >> > >> On Thu, Jul 12, 2012 at 1:28 AM, Charles R Harris > >> wrote: > >> > Hi All, > >> > > >> > Travis and I agree that it would be appropriate to remove the current > >> > 1.7.x > >> > branch and branch again after a code freeze. That way we can avoid the > >> > pain > >> > and potential errors of backports. It is considered bad form to mess > >> > with > >> > public repositories that way, so another option would be to rename the > >> > branch, although I'm not sure how well that would work. Suggestions? > >> > >> I might be mistaken, but if the branch is merged into master (even if > >> that merge makes no changes), I think it's safe to delete it at that > >> point (and recreate it at a later date with the same name) with > >> regards to remote repositories. It should be fairly easy to test. > >> > >> Ray Jones > > > > > > No, that is not the case. We had a situation occur awhile back where one of > > the public branches of mpl got completely messed up. You can't even rename > > it since the rename doesn't occur in the pulls and merges. > > > > What we ended up doing was creating a brand new branch "v1.0.x-maint" and > > making sure all the devs knew to switch over to that. You might even go a > > step further and make a final commit to the bad branch that makes the build > > fail with a big note explaining what to do. > > The branch isn't bad, it's just out of date. So long as the new > version of the branch has the current version of the branch in its > ancestry, then everything will be fine. > > Option 1: > git checkout master > git merge maint1.7.x > git checkout maint1.7.x > git merge master # will be a fast-forward > > Option 2: > git checkout master > git merge maint1.7.x > git branch -d maint1.7.x # delete the branch > git checkout -b maint1.7.x # recreate it > > In git terms these two options are literally identical; they result in > the exact same repo state... > > $ git co 1.7.x > Switched to branch '1.7.x' > Your branch and 'upstream/maintenance/1.7.x' have diverged, > and have 1 and 124 different commit(s) each, respectively. > > $ git pull > Auto-merging numpy/core/SConscript > Auto-merging numpy/core/bscript > CONFLICT (content): > > Of course I can fix this easily, but why are we having this long thread, coming to a conclusion and then doing something else? 
> > Ralf > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Thu Jul 19 09:48:54 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 19 Jul 2012 08:48:54 -0500 Subject: [Numpy-discussion] indexed increment (was Re: Pull Requests I'm planning to merge) In-Reply-To: References: Message-ID: <3F5DBAA7-CD8B-4D22-998C-CE4C372A2BAC@continuum.io> On Jul 19, 2012, at 4:56 AM, Nathaniel Smith wrote: > On Thu, Jul 19, 2012 at 6:04 AM, Fr?d?ric Bastien wrote: >> On Tue, Jul 17, 2012 at 10:20 PM, Travis Oliphant wrote: >>> I will hold off on this one, but only because you raised your objection to -1 (instead of -0.5). But, I disagree very strongly with you that it is "sub-optimal" in the context of NumPy. I actually think it is a very reasonable solution. He has re-used the MapIter code correctly (which is the only code that *does* fancy indexing). The author has done a lot of work to support multiple data-types and satisfy an often-requested feature in a very reasonable way. Your objection seems to be that you would prefer that it were a method on ufuncs. I don't think this will work without a *major* refactor. I'm not sure it's even worth it then either. >> >> Due to Travis comments, I expect the "better" solution won't be done >> in the short time. So blocking a feature requeted frequently because >> we could do sometimes in an unknow futur something better is not a >> good idea. (if I get it wrong on the expectation, please, correct me!) > > Really I'm making two points: > 1- this design smells just "off" enough that we should discuss it to > figure out whether or not it's actually the best way forward. > 2- such discussions are too important to just skip whenever a release > is imminent. there will be another release... > >> Also, if we decide to change the interface, we could do it for NumPy >> 2, so I don't see a need for an infinit time support of this API. > > The "NumPy 2" idea is not very realistic IMHO, and shouldn't be used > as an excuse for being sloppy. We're never going to be able to go > through and make all the APIs perfect in one release cycle. Especially > if we keep deferring all hard questions until then. > >> So in conclusion, can we say that when John will have make is PR work >> with all dtype, we merge it? > > Well, we'll come up with something :-). (Assuming that people stay > interested, which seems likely since this is such a common problem > people have.) But I'm not sure what. Making the PR work with all > dtypes is exactly what Travis is arguing is too much work. I don't think that's what I was arguing would be too much work exactly. I must have mis-communicated. What I was saying was trying to use the ufunc mechanism to do this would be too much work at this point. Having the PR work with all dtypes (that support addition) should be done (and I believe with the exception of long double dtypes it has been done). 
-Travis From travis at continuum.io Thu Jul 19 09:53:35 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 19 Jul 2012 08:53:35 -0500 Subject: [Numpy-discussion] Indexing API In-Reply-To: References: Message-ID: <33C105EF-9C05-45D2-80E5-7B49F8F76F09@continuum.io> On Jul 19, 2012, at 3:50 AM, Nathaniel Smith wrote: > So the underlying problem with the controversial inplace_increment > PR[1] is that currently, there's actually nothing in the public numpy > API that exposes the workings of numpy indexing. The only thing you > can do with a numpy index is call ndarray.__getattr__ or __setattr__. > This is a pretty obvious gap, given how fundamental an operation > indexing is in numpy (and how difficult to emulate). So how can we > expose something that fixes it? Make PyArrayMapIterObject part of the > public API? Something else? I think you meant ndarray.__getitem__ and ndarray.__setitem__ As I mentioned in the comments, the original intention was to make PyArrayMapIterObject part of the public API. However, I was not able to make it work in the way I had intended back then. Exposing the MapIterObject is a good idea (but it would have to be exposed already bound to an array) --- i.e. you create a new API that binds to a particular array and then expose the PyArray_MapIterNext, etc. functions. Perhaps something like: PyArray_MapIterArray -Travis From charlesr.harris at gmail.com Thu Jul 19 10:14:38 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 19 Jul 2012 08:14:38 -0600 Subject: [Numpy-discussion] Remove current 1.7 branch? In-Reply-To: References: Message-ID: On Thu, Jul 19, 2012 at 7:45 AM, Travis Oliphant wrote: > I'm not sure what the conclusion actually was from this long thread. > > But, in trying to update the 1.7.x branch, I ended up in a very odd state > with respect to the github pages. I used git filter-branch to try and get > rid of "empty" commits that were showing up for some reason. However, > this resulted in a branch that seemed fine on my local version but when > pushed to github ended up duplicating nearly every commit in the > maintenance branch so that the commits page for maintenance/1.7.x showed a > duplicate commit for every actual commit, > > I didn't know how to fix this except to delete the branch (after doing a > diff with master), recreate the branch, and apply the saved diff as a > patch. I'm very sorry if I messed anyone up. > > I thought the plan was to delete the branch anyway. There could be > something else wrong as well, but I'm not sure what the implication of your > message is, exactly. > > People using the maintenance/1.7.x branch (how many people were actually > using it?) will need to delete their local branch and re-pull from github. > > I agree that the easiest thing to do is remove the current 1.7 branch and branch again. It isn't quite according to the book of Linus, but it will get us where we need to be. git push upstream :maintenance/1.7.x Chuck > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Thu Jul 19 11:04:23 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 19 Jul 2012 17:04:23 +0200 Subject: [Numpy-discussion] Remove current 1.7 branch? In-Reply-To: References: Message-ID: On Thu, Jul 19, 2012 at 4:14 PM, Charles R Harris wrote: > > > On Thu, Jul 19, 2012 at 7:45 AM, Travis Oliphant wrote: > >> I'm not sure what the conclusion actually was from this long thread. 
>> >> But, in trying to update the 1.7.x branch, I ended up in a very odd state >> with respect to the github pages. I used git filter-branch to try and get >> rid of "empty" commits that were showing up for some reason. However, >> this resulted in a branch that seemed fine on my local version but when >> pushed to github ended up duplicating nearly every commit in the >> maintenance branch so that the commits page for maintenance/1.7.x showed a >> duplicate commit for every actual commit, >> >> I didn't know how to fix this except to delete the branch (after doing a >> diff with master), recreate the branch, and apply the saved diff as a >> patch. I'm very sorry if I messed anyone up. >> >> I thought the plan was to delete the branch anyway. There could be >> something else wrong as well, but I'm not sure what the implication of your >> message is, exactly. >> >> People using the maintenance/1.7.x branch (how many people were actually >> using it?) will need to delete their local branch and re-pull from github. >> >> > I agree that the easiest thing to do is remove the current 1.7 branch and > branch again. It isn't quite according to the book of Linus, but it will > get us where we need to be. > > git push upstream :maintenance/1.7.x > No, why? The damage is already done, this doesn't change anything. The point was, as several people pointed out, to merge 1.7.x into master. Then it could have either been deleted and recreated, or fast-forwarded. The merge should have been straightforward. Nathaniel provided all commands needed. For now, let's leave it as is. Everyone who was using 1.7.x should just delete his branch and recreate it. Then force push it to their own Github account if necessary. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej.certik at gmail.com Thu Jul 19 11:58:19 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 19 Jul 2012 17:58:19 +0200 Subject: [Numpy-discussion] Segfault in mingw in test_arrayprint.TestComplexArray In-Reply-To: References: Message-ID: On Thu, Jul 19, 2012 at 12:58 PM, Ond?ej ?ert?k wrote: >>> I've nailed it to: >>> >>>>>> import numpy as np >>>>>> np.array([complex(0, 1)], np.complex64) >>> wine: Unhandled page fault on read access to 0x00000000 at address >>> (nil) (thread 0009), starting debugger... >>> ... > > Btw, I tried to debug it using: > > $ winedbg --gdb "C:\Python27\python" > > and I got: > > > Wine-gdb> backtrace > #0 0x7bc73735 in call_exception_handler () > from /usr/bin/../lib32/wine/ntdll.dll.so > #1 0x7bc76397 in ?? () from /usr/bin/../lib32/wine/ntdll.dll.so > #2 0xdeadbabe in ?? () > Backtrace stopped: Not enough registers or memory available to unwind further > > > > So unfortunately it doesn't show where exactly it fails, I would need > to get the full stack trace. Also I noticed even simpler > way to segfault it: > > import numpy > numpy.array([1j]) So I have tried the MinGW-5.0.3.exe in Wine, but it tries to install from some wrong url and it fails to install. 
I have unpacked the tarballs by hand into "~/.wine/drive_c/MinGW": binutils-2.17.50-20070129-1.tar.gz w32api-3.7.tar.gz gcc-g77-3.4.5-20051220-1.tar.gz gcc-g++-3.4.5-20051220-1.tar.gz gcc-core-3.4.5-20051220-1.tar.gz mingw-runtime-3.10.tar.gz also in the same directory, I had to do: cp ../windows/system32/msvcr90.dll lib/ Also I've added the bin directory to PATH using the following trick: $ cat > tmp < References: Message-ID: On Thu, Jul 19, 2012 at 5:58 PM, Ond?ej ?ert?k wrote: > On Thu, Jul 19, 2012 at 12:58 PM, Ond?ej ?ert?k > wrote: > >>> I've nailed it to: > >>> > >>>>>> import numpy as np > >>>>>> np.array([complex(0, 1)], np.complex64) > >>> wine: Unhandled page fault on read access to 0x00000000 at address > >>> (nil) (thread 0009), starting debugger... > >>> ... > > > > Btw, I tried to debug it using: > > > > $ winedbg --gdb "C:\Python27\python" > > > > and I got: > > > > > > Wine-gdb> backtrace > > #0 0x7bc73735 in call_exception_handler () > > from /usr/bin/../lib32/wine/ntdll.dll.so > > #1 0x7bc76397 in ?? () from /usr/bin/../lib32/wine/ntdll.dll.so > > #2 0xdeadbabe in ?? () > > Backtrace stopped: Not enough registers or memory available to unwind > further > > > > > > > > So unfortunately it doesn't show where exactly it fails, I would need > > to get the full stack trace. Also I noticed even simpler > > way to segfault it: > > > > import numpy > > numpy.array([1j]) > > > So I have tried the MinGW-5.0.3.exe in Wine, but it tries to install > from some wrong url and it fails to install. > I have unpacked the tarballs by hand into "~/.wine/drive_c/MinGW": > > Not surprising, that MinGW is really getting old. It's still the last available one with gcc 3.x as IIRC. > binutils-2.17.50-20070129-1.tar.gz > w32api-3.7.tar.gz > gcc-g77-3.4.5-20051220-1.tar.gz > gcc-g++-3.4.5-20051220-1.tar.gz > gcc-core-3.4.5-20051220-1.tar.gz > mingw-runtime-3.10.tar.gz > > also in the same directory, I had to do: > > cp ../windows/system32/msvcr90.dll lib/ > Looks like I have an older Wine, not sure if it makes a difference: $ locate msvcr90.dll /Users/rgommers/.wine/drive_c/windows/winsxs/x86_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0.21022.8_x-ww_d08d0375/msvcr90.dll /Users/rgommers/__wine/drive_c/windows/winsxs/x86_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0.21022.8_x-ww_d08d0375/msvcr90.dll $ locate msvcr71.dll /Users/rgommers/.wine/drive_c/windows/system32/msvcr71.dll /Users/rgommers/Code/wine/dlls/msvcr71/msvcr71.dll.fake /Users/rgommers/Code/wine/dlls/msvcr71/msvcr71.dll.so /Users/rgommers/__wine/drive_c/windows/system32/msvcr71.dll /Users/rgommers/wine/build/wine-1.1.39/dlls/msvcr71/msvcr71.dll.fake /Users/rgommers/wine/build/wine-1.1.39/dlls/msvcr71/msvcr71.dll.so /Users/rgommers/wine/wine-1.1.39/lib/wine/fakedlls/msvcr71.dll /Users/rgommers/wine/wine-1.1.39/lib/wine/msvcr71.dll.so /usr/local/lib/wine/fakedlls/msvcr71.dll /usr/local/lib/wine/msvcr71.dll.so > > Also I've added the bin directory to PATH using the following trick: > > $ cat > tmp < REGEDIT4 > > [HKEY_CURRENT_USER\Environment] > "PATH"="C:\\\\MinGW\\\\bin" > EOF > $ wine regedit tmp > > > > Then I build and installed numpy using: > > wine "C:\Python27\python" setup.py build --compiler=mingw32 install > > And now there is no segfault when constructing a complex array! So > newer (newest) mingw miscompiles NumPy somehow... > > > Anyway, running tests, it gets much farther then before, now it hangs at: > > > test_multiarray.TestIO.test_ascii ... > err:ntdll:RtlpWaitForCriticalSection section 0x785b7428 "?" 
wait timed > out in thread 0009, blocked by 0000, retrying (60 sec) > fixme:keyboard:X11DRV_ActivateKeyboardLayout 0x4090409, 0000: semi-stub! > err:ntdll:RtlpWaitForCriticalSection section 0x785b7428 "?" wait timed > out in thread 0009, blocked by 0000, retrying (60 sec) > err:ntdll:RtlpWaitForCriticalSection section 0x785b7428 "?" wait timed > out in thread 0009, blocked by 0000, retrying (60 sec) > ... > > Not sure what this problem is yet. > > Ondrej > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Thu Jul 19 14:10:28 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 19 Jul 2012 20:10:28 +0200 Subject: [Numpy-discussion] ANN: SciPy 0.11.0 release candidate 1 Message-ID: Hi, I am pleased to announce the availability of the first release candidate of SciPy 0.11.0. For this release many new features have been added, and over 120 tickets and pull requests have been closed. Also noteworthy is that the number of contributors for this release has risen to over 50. Some of the highlights are: - A new module, sparse.csgraph, has been added which provides a number of common sparse graph algorithms. - New unified interfaces to the existing optimization and root finding functions have been added. Sources and binaries can be found at https://sourceforge.net/projects/scipy/files/scipy/0.11.0rc1/, release notes are copied below. Please try this release candidate and report any problems on the scipy mailing lists. Cheers, Ralf ========================== SciPy 0.11.0 Release Notes ========================== .. note:: Scipy 0.11.0 is not released yet! .. contents:: SciPy 0.11.0 is the culmination of 8 months of hard work. It contains many new features, numerous bug-fixes, improved test coverage and better documentation. Highlights of this release are: - A new module has been added which provides a number of common sparse graph algorithms. - New unified interfaces to the existing optimization and root finding functions have been added. All users are encouraged to upgrade to this release, as there are a large number of bug-fixes and optimizations. Our development attention will now shift to bug-fix releases on the 0.11.x branch, and on adding new features on the master branch. This release requires Python 2.4-2.7 or 3.1-3.2 and NumPy 1.5.1 or greater. New features ============ Sparse Graph Submodule ---------------------- The new submodule :mod:`scipy.sparse.csgraph` implements a number of efficient graph algorithms for graphs stored as sparse adjacency matrices. 
Available routines are: - :func:`connected_components` - determine connected components of a graph - :func:`laplacian` - compute the laplacian of a graph - :func:`shortest_path` - compute the shortest path between points on a positive graph - :func:`dijkstra` - use Dijkstra's algorithm for shortest path - :func:`floyd_warshall` - use the Floyd-Warshall algorithm for shortest path - :func:`breadth_first_order` - compute a breadth-first order of nodes - :func:`depth_first_order` - compute a depth-first order of nodes - :func:`breadth_first_tree` - construct the breadth-first tree from a given node - :func:`depth_first_tree` - construct a depth-first tree from a given node - :func:`minimum_spanning_tree` - construct the minimum spanning tree of a graph ``scipy.optimize`` improvements ------------------------------- The optimize module has received a lot of attention this release. In addition to added tests, documentation improvements, bug fixes and code clean-up, the following improvements were made: - A unified interface to minimizers of univariate and multivariate functions has been added. - A unified interface to root finding algorithms for multivariate functions has been added. - The L-BFGS-B algorithm has been updated to version 3.0. Unified interfaces to minimizers ```````````````````````````````` Two new functions ``scipy.optimize.minimize`` and ``scipy.optimize.minimize_scalar`` were added to provide a common interface to minimizers of multivariate and univariate functions respectively. For multivariate functions, ``scipy.optimize.minimize`` provides an interface to methods for unconstrained optimization (`fmin`, `fmin_powell`, `fmin_cg`, `fmin_ncg`, `fmin_bfgs` and `anneal`) or constrained optimization (`fmin_l_bfgs_b`, `fmin_tnc`, `fmin_cobyla` and `fmin_slsqp`). For univariate functions, ``scipy.optimize.minimize_scalar`` provides an interface to methods for unconstrained and bounded optimization (`brent`, `golden`, `fminbound`). This allows for easier comparing and switching between solvers. Unified interface to root finding algorithms ```````````````````````````````````````````` The new function ``scipy.optimize.root`` provides a common interface to root finding algorithms for multivariate functions, embeding `fsolve`, `leastsq` and `nonlin` solvers. ``scipy.linalg`` improvements ----------------------------- New matrix equation solvers ``````````````````````````` Solvers for the Sylvester equation (``scipy.linalg.solve_sylvester``, discrete and continuous Lyapunov equations (``scipy.linalg.solve_lyapunov``, ``scipy.linalg.solve_discrete_lyapunov``) and discrete and continuous algebraic Riccati equations (``scipy.linalg.solve_continuous_are``, ``scipy.linalg.solve_discrete_are``) have been added to ``scipy.linalg``. These solvers are often used in the field of linear control theory. QZ and QR Decomposition ```````````````````````` It is now possible to calculate the QZ, or Generalized Schur, decomposition using ``scipy.linalg.qz``. This function wraps the LAPACK routines sgges, dgges, cgges, and zgges. The function ``scipy.linalg.qr_multiply``, which allows efficient computation of the matrix product of Q (from a QR decompostion) and a vector, has been added. Pascal matrices ``````````````` A function for creating Pascal matrices, ``scipy.linalg.pascal``, was added. 
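For example, the default (symmetric) Pascal matrix of order 4 looks roughly
like this (a quick usage sketch; the exact printed dtype may vary)::

    >>> from scipy.linalg import pascal
    >>> pascal(4)
    array([[ 1,  1,  1,  1],
           [ 1,  2,  3,  4],
           [ 1,  3,  6, 10],
           [ 1,  4, 10, 20]], dtype=uint64)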
Sparse matrix construction and operations ----------------------------------------- Two new functions, ``scipy.sparse.diags`` and ``scipy.sparse.block_diag``, were added to easily construct diagonal and block-diagonal sparse matrices respectively. ``scipy.sparse.csc_matrix`` and ``csr_matrix`` now support the operations ``sin``, ``tan``, ``arcsin``, ``arctan``, ``sinh``, ``tanh``, ``arcsinh``, ``arctanh``, ``rint``, ``sign``, ``expm1``, ``log1p``, ``deg2rad``, ``rad2deg``, ``floor``, ``ceil`` and ``trunc``. Previously, these operations had to be performed by operating on the matrices' ``data`` attribute. LSMR iterative solver --------------------- LSMR, an iterative method for solving (sparse) linear and linear least-squares systems, was added as ``scipy.sparse.linalg.lsmr``. Discrete Sine Transform ----------------------- Bindings for the discrete sine transform functions have been added to ``scipy.fftpack``. ``scipy.interpolate`` improvements ---------------------------------- For interpolation in spherical coordinates, the three classes ``scipy.interpolate.SmoothSphereBivariateSpline``, ``scipy.interpolate.LSQSphereBivariateSpline``, and ``scipy.interpolate.RectSphereBivariateSpline`` have been added. Binned statistics (``scipy.stats``) ----------------------------------- The stats module has gained functions to do binned statistics, which are a generalization of histograms, in 1-D, 2-D and multiple dimensions: ``scipy.stats.binned_statistic``, ``scipy.stats.binned_statistic_2d`` and ``scipy.stats.binned_statistic_dd``. Deprecated features =================== ``scipy.sparse.cs_graph_components`` has been made a part of the sparse graph submodule, and renamed to ``scipy.sparse.csgraph.connected_components``. Calling the former routine will result in a deprecation warning. ``scipy.misc.radon`` has been deprecated. A more full-featured radon transform can be found in scikits-image. ``scipy.io.save_as_module`` has been deprecated. A better way to save multiple Numpy arrays is the ``numpy.savez`` function. The `xa` and `xb` parameters for all distributions in ``scipy.stats.distributions`` already weren't used; they have now been deprecated. Backwards incompatible changes ============================== Removal of ``scipy.maxentropy`` ------------------------------- The ``scipy.maxentropy`` module, which was deprecated in the 0.10.0 release, has been removed. Logistic regression in scikits.learn is a good and modern alternative for this functionality. Minor change in behavior of ``splev`` ------------------------------------- The spline evaluation function now behaves similarly to ``interp1d`` for size-1 arrays. Previous behavior:: >>> from scipy.interpolate import splev, splrep, interp1d >>> x = [1,2,3,4,5] >>> y = [4,5,6,7,8] >>> tck = splrep(x, y) >>> splev([1], tck) 4. >>> splev(1, tck) 4. Corrected behavior:: >>> splev([1], tck) array([ 4.]) >>> splev(1, tck) array(4.) This affects also the ``UnivariateSpline`` classes. Behavior of ``scipy.integrate.complex_ode`` ------------------------------------------- The behavior of the ``y`` attribute of ``complex_ode`` is changed. Previously, it expressed the complex-valued solution in the form:: z = ode.y[::2] + 1j * ode.y[1::2] Now, it is directly the complex-valued solution:: z = ode.y Minor change in behavior of T-tests ----------------------------------- The T-tests ``scipy.stats.ttest_ind``, ``scipy.stats.ttest_rel`` and ``scipy.stats.ttest_1samp`` have been changed so that 0 / 0 now returns NaN instead of 1. 
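As a rough illustration (the exact formatting of the returned tuple may
differ), a one-sample test of a zero-variance sample against its own mean
now yields NaNs::

    >>> from scipy.stats import ttest_1samp
    >>> ttest_1samp([1., 1., 1., 1.], 1.)
    (nan, nan)

whereas the statistic in this 0 / 0 case used to be reported as 1.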
Other changes ============= The SuperLU sources in ``scipy.sparse.linalg`` have been updated to version 4.3 from upstream. The function ``scipy.signal.bode``, which calculates magnitude and phase data for a continuous-time system, has been added. The two-sample T-test ``scipy.stats.ttest_ind`` gained an option to compare samples with unequal variances, i.e. Welch's T-test. ``scipy.misc.logsumexp`` now takes an optional ``axis`` keyword argument. Authors ======= This release contains work by the following people (contributed at least one patch to this release, names in alphabetical order): * Jeff Armstrong * Chad Baker * Brandon Beacher + * behrisch + * borishim + * Matthew Brett * Lars Buitinck * Luis Pedro Coelho + * Johann Cohen-Tanugi * David Cournapeau * dougal + * Ali Ebrahim + * endolith + * Bj?rn Forsman + * Robert Gantner + * Sebastian Gassner + * Christoph Gohlke * Ralf Gommers * Yaroslav Halchenko * Charles Harris * Jonathan Helmus + * Andreas Hilboll + * Marc Honnorat + * Jonathan Hunt + * Maxim Ivanov + * Thouis (Ray) Jones * Christopher Kuster + * Josh Lawrence + * Denis Laxalde + * Travis Oliphant * Joonas Paalasmaa + * Fabian Pedregosa * Josef Perktold * Gavin Price + * Jim Radford + * Andrew Schein + * Skipper Seabold * Jacob Silterra + * Scott Sinclair * Alexis Tabary + * Martin Teichmann * Matt Terry + * Nicky van Foreest + * Jacob Vanderplas * Patrick Varilly + * Pauli Virtanen * Nils Wagner + * Darryl Wally + * Stefan van der Walt * Liming Wang + * David Warde-Farley + * Warren Weckesser * Sebastian Werk + * Mike Wimmer + * Tony S Yu + A total of 55 people contributed to this release. People with a "+" by their names contributed a patch for the first time. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Thu Jul 19 14:21:31 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 19 Jul 2012 20:21:31 +0200 Subject: [Numpy-discussion] Problem w/ Win installer In-Reply-To: References: Message-ID: On Tue, Jul 17, 2012 at 6:34 AM, David Goldsmith wrote: > Hi, folks! Having a problem w/ the Windows installer; first, the > "back-story": I have both Python 2.7 and 3.2 installed. When I run the > installer and click next on the first dialog, I get the message that I need > Python 2.7, which was not found in my registry. I ran regedit and searched > for Python and get multiple hits on both Python 2.7 and 3.2. So, precisely > which registry key has to have the value Python 2.7 for the installer to > find it? Thanks! > You probably have a 64-bit Python and using the 32-bit installer, or something similar. Hard to tell without more details. How exactly did you install things? Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Thu Jul 19 15:01:28 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 19 Jul 2012 14:01:28 -0500 Subject: [Numpy-discussion] Remove current 1.7 branch? In-Reply-To: References: Message-ID: <51D95F2B-AEB5-4B9D-96CC-3097FD6C8C2F@continuum.io> On Jul 19, 2012, at 9:14 AM, Charles R Harris wrote: > > > On Thu, Jul 19, 2012 at 7:45 AM, Travis Oliphant wrote: > I'm not sure what the conclusion actually was from this long thread. > > But, in trying to update the 1.7.x branch, I ended up in a very odd state with respect to the github pages. I used git filter-branch to try and get rid of "empty" commits that were showing up for some reason. 
However, this resulted in a branch that seemed fine on my local version but when pushed to github ended up duplicating nearly every commit in the maintenance branch so that the commits page for maintenance/1.7.x showed a duplicate commit for every actual commit, > > I didn't know how to fix this except to delete the branch (after doing a diff with master), recreate the branch, and apply the saved diff as a patch. I'm very sorry if I messed anyone up. > > I thought the plan was to delete the branch anyway. There could be something else wrong as well, but I'm not sure what the implication of your message is, exactly. > > People using the maintenance/1.7.x branch (how many people were actually using it?) will need to delete their local branch and re-pull from github. > > > I agree that the easiest thing to do is remove the current 1.7 branch and branch again. It isn't quite according to the book of Linus, but it will get us where we need to be. > > git push upstream :maintenance/1.7.x That's basically what I already did on Tuesday night. -Travis > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From valentin.haenel at epfl.ch Thu Jul 19 16:12:33 2012 From: valentin.haenel at epfl.ch (=?iso-8859-1?Q?H=E4nel?= Nikolaus Valentin) Date: Thu, 19 Jul 2012 22:12:33 +0200 Subject: [Numpy-discussion] Remove current 1.7 branch? In-Reply-To: References: Message-ID: <20120719201233.GF23275@kudu.in-berlin.de> Hi, * Ralf Gommers [2012-07-19]: > On Thu, Jul 19, 2012 at 4:14 PM, Charles R Harris > On Thu, Jul 19, 2012 at 7:45 AM, Travis Oliphant wrote: > > > >> I'm not sure what the conclusion actually was from this long thread. > >> > >> But, in trying to update the 1.7.x branch, I ended up in a very odd state > >> with respect to the github pages. I used git filter-branch to try and get > >> rid of "empty" commits that were showing up for some reason. However, > >> this resulted in a branch that seemed fine on my local version but when > >> pushed to github ended up duplicating nearly every commit in the > >> maintenance branch so that the commits page for maintenance/1.7.x showed a > >> duplicate commit for every actual commit, > >> > >> I didn't know how to fix this except to delete the branch (after doing a > >> diff with master), recreate the branch, and apply the saved diff as a > >> patch. I'm very sorry if I messed anyone up. > >> > >> I thought the plan was to delete the branch anyway. There could be > >> something else wrong as well, but I'm not sure what the implication of your > >> message is, exactly. > >> > >> People using the maintenance/1.7.x branch (how many people were actually > >> using it?) will need to delete their local branch and re-pull from github. > >> > >> > > I agree that the easiest thing to do is remove the current 1.7 branch and > > branch again. It isn't quite according to the book of Linus, but it will > > get us where we need to be. > > > > git push upstream :maintenance/1.7.x > > > > No, why? The damage is already done, this doesn't change anything. The > point was, as several people pointed out, to merge 1.7.x into master. Then > it could have either been deleted and recreated, or fast-forwarded. The > merge should have been straightforward. Nathaniel provided all commands > needed. 
Indeed, deleting the branch is no longer required, as the wrong history which was erroneously pushed has already been deleted. I suppose it would be good to get confirmation that the maintenance/1.7.x branch should now be at commit f93774d. Is that correct? > For now, let's leave it as is. Everyone who was using 1.7.x should just > delete his branch and recreate it. Then force push it to their own Github > account if necessary. +1 Alternatively you can use 'git reset --hard @{u}' where '@{u}' is short for upstream-branch: zsh? git co maintenance/1.7.x Switched to branch 'maintenance/1.7.x' Your branch and 'origin/maintenance/1.7.x' have diverged, and have 4 and 124 different commits each, respectively. zsh? git log -1 --oneline 95c84bf Merge pull request #318 from certik/ondrej1 zsh? git reset --hard @{u} HEAD is now at f93774d Update release notes and version number. zsh? git log -1 --oneline f93774d Update release notes and version number. Using '@{u}' requires you to have an appropriate upstream-branch set: zsh? git config remote.origin.url git://github.com/numpy/numpy.git zsh? git config branch.maintenance/1.7.x.remote origin zsh? git config branch.maintenance/1.7.x.merge refs/heads/maintenance/1.7.x Or from the .git/config: zsh? cat .git/config |& grep -A 2 "remote \"origin\"" [remote "origin"] fetch = +refs/heads/*:refs/remotes/origin/* url = git://github.com/numpy/numpy.git zsh? cat .git/config |& grep -A 3 maintenance/1.7.x [branch "maintenance/1.7.x"] remote = origin merge = refs/heads/maintenance/1.7.x Hope that helps. V- From lists at hilboll.de Fri Jul 20 05:34:01 2012 From: lists at hilboll.de (Andreas Hilboll) Date: Fri, 20 Jul 2012 11:34:01 +0200 Subject: [Numpy-discussion] [SciPy-Dev] ANN: SciPy 0.11.0 release candidate 1 In-Reply-To: References: Message-ID: <3fb4522421d3b87335b5d040a0e6cd4b.squirrel@srv2.s4y.tournesol-consulting.eu> > Hi, > > I am pleased to announce the availability of the first release candidate > of > SciPy 0.11.0. For this release many new features have been added, and over > 120 tickets and pull requests have been closed. Also noteworthy is that > the > number of contributors for this release has risen to over 50. Some of the > highlights are: > > - A new module, sparse.csgraph, has been added which provides a number > of > common sparse graph algorithms. > - New unified interfaces to the existing optimization and root finding > functions have been added. > > Sources and binaries can be found at > https://sourceforge.net/projects/scipy/files/scipy/0.11.0rc1/, release > notes are copied below. > > Please try this release candidate and report any problems on the scipy > mailing lists. 
Failure on Archlinux 64bit, Python 2.7.3, Numpy 1.6.1: This is what I did: mkvirtualenv --system-site-packages --distribute scipy_test_rc1 cd ~/.virtualenvs/scipy_test_rc1 mkdir src wget -O src/scipy-0.11.0rc1.tar.gz "http://downloads.sourceforge.net/project/scipy/scipy/0.11.0rc1/scipy-0.11.0rc1.tar.gz?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Fscipy%2Ffiles%2Fscipy%2F0.11.0rc1%2F&ts=1342772081&use_mirror=netcologne" cd src tar xzf scipy-0.11.0rc1.tar.gz cd scipy-0.11.0rc1 python setup.py build python setup.py install cd python -c "import scipy; scipy.test('full')" Running unit tests for scipy NumPy version 1.6.1 NumPy is installed in /usr/lib/python2.7/site-packages/numpy SciPy version 0.11.0rc1 SciPy is installed in /home2/hilboll/.virtualenvs/scipy_test_rc1/lib/python2.7/site-packages/scipy Python version 2.7.3 (default, Apr 24 2012, 00:00:54) [GCC 4.7.0 20120414 (prerelease)] nose version 1.1.2 [...] ====================================================================== FAIL: test_basic.TestNorm.test_stable ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home2/hilboll/.virtualenvs/scipy_test_rc1/lib/python2.7/site-packages/scipy/linalg/tests/test_basic.py", line 585, in test_stable assert_almost_equal(norm(a) - 1e4, 0.5) File "/usr/lib/python2.7/site-packages/numpy/testing/utils.py", line 468, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal to 7 decimals ACTUAL: 0.0 DESIRED: 0.5 ---------------------------------------------------------------------- FAILED (KNOWNFAIL=16, SKIP=42, failures=1) From warren.weckesser at enthought.com Fri Jul 20 05:50:22 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Fri, 20 Jul 2012 04:50:22 -0500 Subject: [Numpy-discussion] numpy.fromfunction() doesn't work as expected? In-Reply-To: <016f01cd659c$8a59b1d0$9f0d1570$@gmail.com> References: <016f01cd659c$8a59b1d0$9f0d1570$@gmail.com> Message-ID: On Thu, Jul 19, 2012 at 5:52 AM, Cheng Li wrote: > Hi All,**** > > ** ** > > I have spot a strange behavior of numpy.fromfunction(). The sample codes > are as follows:**** > > >>> import numpy as np**** > > >>> def myOnes(i,j):**** > > return 1.0**** > > >>> a = np.fromfunction(myOnes,(2000,2000))**** > > >>> a**** > > 1.0**** > > ** ** > > Actually what I expected is that the function will return a 2000*2000 2d > array with unit value. The returned single float value really confused me. > Is this a known bug? The numpy version I used is 1.6.1.**** > > ** > Your function will be called *once*, with arguments that are *arrays* of coordinate values. It must handle these arrays when it computes the values of the array to be created. To see what is happening, print the values of i and j from within your function, e.g.: In [57]: def ijsum(i, j): ....: print "i =", i ....: print "j =", j ....: return i + j ....: In [58]: fromfunction(ijsum, (3, 4)) i = [[ 0. 0. 0. 0.] [ 1. 1. 1. 1.] [ 2. 2. 2. 2.]] j = [[ 0. 1. 2. 3.] [ 0. 1. 2. 3.] [ 0. 1. 2. 3.]] Out[58]: array([[ 0., 1., 2., 3.], [ 1., 2., 3., 4.], [ 2., 3., 4., 5.]]) Your `myOnes` function will work if you modify it something like this: In [59]: def myOnes(i, j): ....: return np.ones(i.shape) ....: In [60]: fromfunction(myOnes, (3, 4)) Out[60]: array([[ 1., 1., 1., 1.], [ 1., 1., 1., 1.], [ 1., 1., 1., 1.]]) The bug is in the docstring for fromfunction. 
In the description of the `function` argument, it says "`function` must be capable of operating on arrays, and should return a scalar value." But the function should *not* return a scalar value. It should return an array of values appropriate for the given arguments. Warren > ** > > Regards,**** > > Cheng**** > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vs at it.uu.se Fri Jul 20 06:12:24 2012 From: vs at it.uu.se (Virgil Stokes) Date: Fri, 20 Jul 2012 12:12:24 +0200 Subject: [Numpy-discussion] [SciPy-Dev] ANN: SciPy 0.11.0 release candidate 1 In-Reply-To: <3fb4522421d3b87335b5d040a0e6cd4b.squirrel@srv2.s4y.tournesol-consulting.eu> References: <3fb4522421d3b87335b5d040a0e6cd4b.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: <50092F08.3080605@it.uu.se> On 20-Jul-2012 11:34, Andreas Hilboll wrote: >> Hi, >> >> I am pleased to announce the availability of the first release candidate >> of >> SciPy 0.11.0. For this release many new features have been added, and over >> 120 tickets and pull requests have been closed. Also noteworthy is that >> the >> number of contributors for this release has risen to over 50. Some of the >> highlights are: >> >> - A new module, sparse.csgraph, has been added which provides a number >> of >> common sparse graph algorithms. >> - New unified interfaces to the existing optimization and root finding >> functions have been added. >> >> Sources and binaries can be found at >> https://sourceforge.net/projects/scipy/files/scipy/0.11.0rc1/, release >> notes are copied below. >> >> Please try this release candidate and report any problems on the scipy >> mailing lists. > Failure on Archlinux 64bit, Python 2.7.3, Numpy 1.6.1: > > This is what I did: > > mkvirtualenv --system-site-packages --distribute scipy_test_rc1 > cd ~/.virtualenvs/scipy_test_rc1 > mkdir src > wget -O src/scipy-0.11.0rc1.tar.gz > "http://downloads.sourceforge.net/project/scipy/scipy/0.11.0rc1/scipy-0.11.0rc1.tar.gz?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Fscipy%2Ffiles%2Fscipy%2F0.11.0rc1%2F&ts=1342772081&use_mirror=netcologne" > cd src > tar xzf scipy-0.11.0rc1.tar.gz > cd scipy-0.11.0rc1 > python setup.py build > python setup.py install > cd > python -c "import scipy; scipy.test('full')" > > Running unit tests for scipy > NumPy version 1.6.1 > NumPy is installed in /usr/lib/python2.7/site-packages/numpy > SciPy version 0.11.0rc1 > SciPy is installed in > /home2/hilboll/.virtualenvs/scipy_test_rc1/lib/python2.7/site-packages/scipy > Python version 2.7.3 (default, Apr 24 2012, 00:00:54) [GCC 4.7.0 20120414 > (prerelease)] > nose version 1.1.2 > > [...] 
> > ====================================================================== > FAIL: test_basic.TestNorm.test_stable > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/usr/lib/python2.7/site-packages/nose/case.py", line 197, in runTest > self.test(*self.arg) > File > "/home2/hilboll/.virtualenvs/scipy_test_rc1/lib/python2.7/site-packages/scipy/linalg/tests/test_basic.py", > line 585, in test_stable > assert_almost_equal(norm(a) - 1e4, 0.5) > File "/usr/lib/python2.7/site-packages/numpy/testing/utils.py", line > 468, in assert_almost_equal > raise AssertionError(msg) > AssertionError: > Arrays are not almost equal to 7 decimals > ACTUAL: 0.0 > DESIRED: 0.5 > > ---------------------------------------------------------------------- > FAILED (KNOWNFAIL=16, SKIP=42, failures=1) > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion But when I use > pip install --upgrade SciPy Installation of vers. 0.10.1 is attempted. From ondrej.certik at gmail.com Fri Jul 20 07:24:49 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Fri, 20 Jul 2012 13:24:49 +0200 Subject: [Numpy-discussion] Segfault in mingw in test_arrayprint.TestComplexArray In-Reply-To: References: Message-ID: >> So I have tried the MinGW-5.0.3.exe in Wine, but it tries to install >> from some wrong url and it fails to install. >> I have unpacked the tarballs by hand into "~/.wine/drive_c/MinGW": >> > Not surprising, that MinGW is really getting old. It's still the last > available one with gcc 3.x as IIRC. To make things reproducible, I've put all my packages in this repository: https://github.com/certik/numpy-vendor > >> >> binutils-2.17.50-20070129-1.tar.gz >> w32api-3.7.tar.gz >> gcc-g77-3.4.5-20051220-1.tar.gz >> gcc-g++-3.4.5-20051220-1.tar.gz >> gcc-core-3.4.5-20051220-1.tar.gz >> mingw-runtime-3.10.tar.gz >> >> also in the same directory, I had to do: >> >> cp ../windows/system32/msvcr90.dll lib/ > > > Looks like I have an older Wine, not sure if it makes a difference: > > $ locate msvcr90.dll > /Users/rgommers/.wine/drive_c/windows/winsxs/x86_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0.21022.8_x-ww_d08d0375/msvcr90.dll > /Users/rgommers/__wine/drive_c/windows/winsxs/x86_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0.21022.8_x-ww_d08d0375/msvcr90.dll > > $ locate msvcr71.dll > /Users/rgommers/.wine/drive_c/windows/system32/msvcr71.dll > /Users/rgommers/Code/wine/dlls/msvcr71/msvcr71.dll.fake > /Users/rgommers/Code/wine/dlls/msvcr71/msvcr71.dll.so > /Users/rgommers/__wine/drive_c/windows/system32/msvcr71.dll > /Users/rgommers/wine/build/wine-1.1.39/dlls/msvcr71/msvcr71.dll.fake > /Users/rgommers/wine/build/wine-1.1.39/dlls/msvcr71/msvcr71.dll.so > /Users/rgommers/wine/wine-1.1.39/lib/wine/fakedlls/msvcr71.dll > /Users/rgommers/wine/wine-1.1.39/lib/wine/msvcr71.dll.so > /usr/local/lib/wine/fakedlls/msvcr71.dll > /usr/local/lib/wine/msvcr71.dll.so Actually, I made a mistake --- the one in drive_c/windows/system32/msvcr90.dll does not work for me. The one I use is installed by the Python installer (as I found out) and it is in: drive_c/windows/winsxs/x86_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0.21022.8_x-ww_d08d0375/msvcr90.dll Which seems to be the same as the one that you use. 
Just in case, I've put it here: https://github.com/certik/numpy-vendor/blob/master/msvcr90.dll > > >> >> >> Also I've added the bin directory to PATH using the following trick: >> >> $ cat > tmp <> REGEDIT4 >> >> [HKEY_CURRENT_USER\Environment] >> "PATH"="C:\\\\MinGW\\\\bin" >> EOF >> $ wine regedit tmp >> >> >> >> Then I build and installed numpy using: >> >> wine "C:\Python27\python" setup.py build --compiler=mingw32 install >> >> And now there is no segfault when constructing a complex array! So >> newer (newest) mingw miscompiles NumPy somehow... >> >> >> Anyway, running tests, it gets much farther then before, now it hangs at: >> >> >> test_multiarray.TestIO.test_ascii ... >> err:ntdll:RtlpWaitForCriticalSection section 0x785b7428 "?" wait timed >> out in thread 0009, blocked by 0000, retrying (60 sec) >> fixme:keyboard:X11DRV_ActivateKeyboardLayout 0x4090409, 0000: semi-stub! >> err:ntdll:RtlpWaitForCriticalSection section 0x785b7428 "?" wait timed >> out in thread 0009, blocked by 0000, retrying (60 sec) >> err:ntdll:RtlpWaitForCriticalSection section 0x785b7428 "?" wait timed >> out in thread 0009, blocked by 0000, retrying (60 sec) >> ... >> >> Not sure what this problem is yet. This however is a big problem. I've tested it on the actual Windows 64bit XP box, and the test simply segfaults at this place. Ralf, I should note, that your latest scipy RC tests also segfault on my Windows machine, so maybe something is wrong with the machine... Ondrej From ondrej.certik at gmail.com Fri Jul 20 08:50:47 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Fri, 20 Jul 2012 14:50:47 +0200 Subject: [Numpy-discussion] Finished maintenance/1.7.x branch In-Reply-To: <60C559B9-B858-4344-A7C3-50702D629ED1@continuum.io> References: <60C559B9-B858-4344-A7C3-50702D629ED1@continuum.io> Message-ID: On Wed, Jul 18, 2012 at 8:09 AM, Travis Oliphant wrote: > > Hey all, > > We are going to work on a beta release on the 1.7.x branch. The master is open again for changes for 1.8.x. There will be some work on the 1.7.x branch to fix bugs including bugs that are already reported but have not yet been addressed (like the regression against data-type detection for Sage). It would be great if 1.7.x gets as much testing as possible so that we can discover regressions that may have occurred. But, it was important to draw the line for 1.7.0 features. I think we need to fix the MinGW build problems before the release: https://github.com/numpy/numpy/pull/363 Ondrej From cournape at gmail.com Fri Jul 20 10:58:21 2012 From: cournape at gmail.com (David Cournapeau) Date: Fri, 20 Jul 2012 15:58:21 +0100 Subject: [Numpy-discussion] Finished maintenance/1.7.x branch In-Reply-To: References: <60C559B9-B858-4344-A7C3-50702D629ED1@continuum.io> Message-ID: On Fri, Jul 20, 2012 at 1:50 PM, Ond?ej ?ert?k wrote: > On Wed, Jul 18, 2012 at 8:09 AM, Travis Oliphant wrote: >> >> Hey all, >> >> We are going to work on a beta release on the 1.7.x branch. The master is open again for changes for 1.8.x. There will be some work on the 1.7.x branch to fix bugs including bugs that are already reported but have not yet been addressed (like the regression against data-type detection for Sage). It would be great if 1.7.x gets as much testing as possible so that we can discover regressions that may have occurred. But, it was important to draw the line for 1.7.0 features. 
> > I think we need to fix the MinGW build problems before the release: > > https://github.com/numpy/numpy/pull/363 I am looking into some windows issues at work, so will have time to look into it in the next few hours. You got the issue for Mingw 3.x, right ? David From lists at hilboll.de Fri Jul 20 11:11:30 2012 From: lists at hilboll.de (Andreas Hilboll) Date: Fri, 20 Jul 2012 17:11:30 +0200 Subject: [Numpy-discussion] Problems understanding histogram2d Message-ID: <3b0b00673342e3ac2bfc3d7e890a587c.squirrel@srv2.s4y.tournesol-consulting.eu> Hi, I have a problem using histogram2d: from numpy import linspace, histogram2d bins_x = linspace(-180., 180., 360) bins_y = linspace(-90., 90., 180) data_x = linspace(-179.96875, 179.96875, 5760) data_y = linspace(-89.96875, 89.96875, 2880) histogram2d(data_x, data_y, (bins_x, bins_y)) AttributeError: The dimension of bins must be equal to the dimension of the sample x. I would expect histogram2d to return a 2d array of shape (360,180), which is full of 256s. What am I missing here? Cheers, Andreas. From yogeshkarpate at gmail.com Fri Jul 20 11:42:13 2012 From: yogeshkarpate at gmail.com (yogesh karpate) Date: Fri, 20 Jul 2012 17:42:13 +0200 Subject: [Numpy-discussion] Problems understanding histogram2d In-Reply-To: <3b0b00673342e3ac2bfc3d7e890a587c.squirrel@srv2.s4y.tournesol-consulting.eu> References: <3b0b00673342e3ac2bfc3d7e890a587c.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: I think since its a joint histogram, you need to have equal no. of data points and bins in both x and y. On Fri, Jul 20, 2012 at 5:11 PM, Andreas Hilboll wrote: > Hi, > > I have a problem using histogram2d: > > from numpy import linspace, histogram2d > bins_x = linspace(-180., 180., 360) > bins_y = linspace(-90., 90., 180) > data_x = linspace(-179.96875, 179.96875, 5760) > data_y = linspace(-89.96875, 89.96875, 2880) > histogram2d(data_x, data_y, (bins_x, bins_y)) > > AttributeError: The dimension of bins must be equal to the dimension of > the sample x. > > I would expect histogram2d to return a 2d array of shape (360,180), which > is full of 256s. What am I missing here? > > Cheers, > Andreas. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Regards Yogesh Karpate -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Fri Jul 20 14:40:37 2012 From: cournape at gmail.com (David Cournapeau) Date: Fri, 20 Jul 2012 19:40:37 +0100 Subject: [Numpy-discussion] Segfault in mingw in test_arrayprint.TestComplexArray In-Reply-To: References: Message-ID: On Fri, Jul 20, 2012 at 12:24 PM, Ond?ej ?ert?k wrote: >>> So I have tried the MinGW-5.0.3.exe in Wine, but it tries to install >>> from some wrong url and it fails to install. >>> I have unpacked the tarballs by hand into "~/.wine/drive_c/MinGW": >>> >> Not surprising, that MinGW is really getting old. It's still the last >> available one with gcc 3.x as IIRC. 
> > To make things reproducible, I've put all my packages in this repository: > > https://github.com/certik/numpy-vendor > >> >>> >>> binutils-2.17.50-20070129-1.tar.gz >>> w32api-3.7.tar.gz >>> gcc-g77-3.4.5-20051220-1.tar.gz >>> gcc-g++-3.4.5-20051220-1.tar.gz >>> gcc-core-3.4.5-20051220-1.tar.gz >>> mingw-runtime-3.10.tar.gz >>> >>> also in the same directory, I had to do: >>> >>> cp ../windows/system32/msvcr90.dll lib/ >> >> >> Looks like I have an older Wine, not sure if it makes a difference: >> >> $ locate msvcr90.dll >> /Users/rgommers/.wine/drive_c/windows/winsxs/x86_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0.21022.8_x-ww_d08d0375/msvcr90.dll >> /Users/rgommers/__wine/drive_c/windows/winsxs/x86_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0.21022.8_x-ww_d08d0375/msvcr90.dll >> >> $ locate msvcr71.dll >> /Users/rgommers/.wine/drive_c/windows/system32/msvcr71.dll >> /Users/rgommers/Code/wine/dlls/msvcr71/msvcr71.dll.fake >> /Users/rgommers/Code/wine/dlls/msvcr71/msvcr71.dll.so >> /Users/rgommers/__wine/drive_c/windows/system32/msvcr71.dll >> /Users/rgommers/wine/build/wine-1.1.39/dlls/msvcr71/msvcr71.dll.fake >> /Users/rgommers/wine/build/wine-1.1.39/dlls/msvcr71/msvcr71.dll.so >> /Users/rgommers/wine/wine-1.1.39/lib/wine/fakedlls/msvcr71.dll >> /Users/rgommers/wine/wine-1.1.39/lib/wine/msvcr71.dll.so >> /usr/local/lib/wine/fakedlls/msvcr71.dll >> /usr/local/lib/wine/msvcr71.dll.so > > Actually, I made a mistake --- the one in > drive_c/windows/system32/msvcr90.dll does not work for me. > The one I use is installed by the Python installer (as I found out) > and it is in: > > drive_c/windows/winsxs/x86_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0.21022.8_x-ww_d08d0375/msvcr90.dll > > Which seems to be the same as the one that you use. Just in case, I've > put it here: > > https://github.com/certik/numpy-vendor/blob/master/msvcr90.dll > >> >> >>> >>> >>> Also I've added the bin directory to PATH using the following trick: >>> >>> $ cat > tmp <>> REGEDIT4 >>> >>> [HKEY_CURRENT_USER\Environment] >>> "PATH"="C:\\\\MinGW\\\\bin" >>> EOF >>> $ wine regedit tmp >>> >>> >>> >>> Then I build and installed numpy using: >>> >>> wine "C:\Python27\python" setup.py build --compiler=mingw32 install >>> >>> And now there is no segfault when constructing a complex array! So >>> newer (newest) mingw miscompiles NumPy somehow... >>> >>> >>> Anyway, running tests, it gets much farther then before, now it hangs at: >>> >>> >>> test_multiarray.TestIO.test_ascii ... >>> err:ntdll:RtlpWaitForCriticalSection section 0x785b7428 "?" wait timed >>> out in thread 0009, blocked by 0000, retrying (60 sec) >>> fixme:keyboard:X11DRV_ActivateKeyboardLayout 0x4090409, 0000: semi-stub! >>> err:ntdll:RtlpWaitForCriticalSection section 0x785b7428 "?" wait timed >>> out in thread 0009, blocked by 0000, retrying (60 sec) >>> err:ntdll:RtlpWaitForCriticalSection section 0x785b7428 "?" wait timed >>> out in thread 0009, blocked by 0000, retrying (60 sec) >>> ... >>> >>> Not sure what this problem is yet. > > > This however is a big problem. I've tested it on the actual Windows > 64bit XP box, and the test simply segfaults at this place. > Ralf, I should note, that your latest scipy RC tests also segfault on > my Windows machine, so maybe something is wrong with the machine... I have some good news for numpy, but bad news for you :) - first, building numpy and testing mostly work for me (tried the last commit from 1.7.x branch) with mingw 5.0.4 with python 2.7.3 and *without* any change in the code (i.e. 
I did not comment out the part to build the msvcr90 import library).
- I don't know what the issue is in your environment for msvc90, but I
can confirm that it is required. gcc 3.x which was built around
2005/2006 cannot possibly provide the import library for msvcr90, and
the build works ok
- I strongly suspect some issues because you started with mingw / gcc
4.x. If you moved some libraries in system directories, I suggest you
start fresh from a clean state in your VM (or rm -rf .wine :) ).

I noticed that when VS 2008 is available, distutils does the
configuration with MS compilers, which is broken. I will test later on
a machine without VS 2008.

cheers,
David

From cournape at gmail.com Fri Jul 20 14:51:18 2012
From: cournape at gmail.com (David Cournapeau)
Date: Fri, 20 Jul 2012 19:51:18 +0100
Subject: [Numpy-discussion] Segfault in mingw in test_arrayprint.TestComplexArray
In-Reply-To:
References:
Message-ID:

On Thu, Jul 19, 2012 at 4:58 PM, Ond?ej ?ert?k wrote:
>
> So I have tried the MinGW-5.0.3.exe in Wine, but it tries to install
> from some wrong url and it fails to install.
> I have unpacked the tarballs by hand into "~/.wine/drive_c/MinGW":
>
> binutils-2.17.50-20070129-1.tar.gz
> w32api-3.7.tar.gz
> gcc-g77-3.4.5-20051220-1.tar.gz
> gcc-g++-3.4.5-20051220-1.tar.gz
> gcc-core-3.4.5-20051220-1.tar.gz
> mingw-runtime-3.10.tar.gz
>
> also in the same directory, I had to do:
>
> cp ../windows/system32/msvcr90.dll lib/

I think that's your problem right there. You should not need to do
that, and doing so will likely result in having multiple copies of the
DLL in your process (you can confirm with process dependency walker).
This should be avoided at all cost, as the python C API is not
designed to deal with this, and your crashes are pretty typical of
what happens in those cases.

David

From oc-spam66 at laposte.net Fri Jul 20 16:17:27 2012
From: oc-spam66 at laposte.net (OC)
Date: Fri, 20 Jul 2012 22:17:27 +0200
Subject: [Numpy-discussion] numpy.complex
Message-ID: <5009BCD7.1000306@laposte.net>

The syntax "numpy.complex(A)" seems to be the most natural and obvious
thing a user would want for casting an array A to complex values.

Expressions like "A.astype(complex)", "array(A, dtype=complex)",
"numpy.complex128(A)" are less obvious, especially the last two, which
look a bit far-fetched.

Of course, these tricks can be learned. But Python is a language where
natural and obvious things most often work as expected. Here, that is
not the case. It also breaks the Principle of Least Astonishment, by
comparison with "numpy.real(A)".

> numpy.complex is just a reference to the built in complex, so only works
> on scalars:
>
> In [5]: numpy.complex is complex
> Out[5]: True

Thank you for pointing this out.

What is the use of storing the "complex()" built-in function in the
numpy namespace, when it is already accessible from everywhere?

Best regards,

--
O.C.

From chris.barker at noaa.gov Fri Jul 20 16:24:49 2012
From: chris.barker at noaa.gov (Chris Barker)
Date: Fri, 20 Jul 2012 13:24:49 -0700
Subject: [Numpy-discussion] numpy.complex
In-Reply-To: <5009BCD7.1000306@laposte.net>
References: <5009BCD7.1000306@laposte.net>
Message-ID:

On Fri, Jul 20, 2012 at 1:17 PM, OC wrote:
>> numpy.complex is just a reference to the built in complex, so only works
>> on scalars:

> What is the use of storing the "complex()" built-in function in the
> numpy namespace, when it is already accessible from everywhere?

for consistency with the rest of the numpy types.
When I create a numpy array, I might do: np.zeros( (3,4), dtype=np.float32 ) so for the numpy types that have a direct relationship with the python types, we put the type in the numpy namespace as well. But, since in numpy, you generally really want to control your types closely, I"d tend to use: np.zeros( (3,4), dtype=np.complex128 ) (or np.complex64) anyway. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From scopatz at gmail.com Fri Jul 20 16:40:43 2012 From: scopatz at gmail.com (Anthony Scopatz) Date: Fri, 20 Jul 2012 15:40:43 -0500 Subject: [Numpy-discussion] ANN: PyTables 2.4.0 released Message-ID: ========================== Announcing PyTables 2.4.0 ========================== We are happy to announce PyTables 2.4.0. This is an incremental release which includes many changes to prepare for future Python 3 support. What's new ========== This release includes support for the float16 data type and read-only support for variable length string attributes. The handling of HDF5 errors has been improved. The user will no longer see HDF5 error stacks dumped to the console. All HDF5 error messages are trapped and attached to a proper Python exception. Now PyTables only supports HDF5 v1.8.4+. All the code has been updated to the new HDF5 API. Supporting only HDF5 1.8 series is beneficial for future development. Documentation has been improved. As always, a large amount of bugs have been addressed and squashed as well. In case you want to know more in detail what has changed in this version, please refer to: http://pytables.github.com/release_notes.html You can download a source package with generated PDF and HTML docs, as well as binaries for Windows, from: http://sourceforge.net/projects/pytables/files/pytables/2.4.0 For an online version of the manual, visit: http://pytables.github.com/usersguide/index.html What it is? =========== PyTables is a library for managing hierarchical datasets and designed to efficiently cope with extremely large amounts of data with support for full 64-bit file addressing. PyTables runs on top of the HDF5 library and NumPy package for achieving maximum throughput and convenient use. PyTables includes OPSI, a new indexing technology, allowing to perform data lookups in tables exceeding 10 gigarows (10**10 rows) in less than a tenth of a second. Resources ========= About PyTables: http://www.pytables.org About the HDF5 library: http://hdfgroup.org/HDF5/ About NumPy: http://numpy.scipy.org/ Acknowledgments =============== Thanks to many users who provided feature improvements, patches, bug reports, support and suggestions. See the ``THANKS`` file in the distribution package for a (incomplete) list of contributors. Most specially, a lot of kudos go to the HDF5 and NumPy (and numarray!) makers. Without them, PyTables simply would not exist. Share your experience ===================== Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. ---- **Enjoy data!** -- The PyTables Team -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pav at iki.fi Fri Jul 20 17:05:22 2012 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 20 Jul 2012 23:05:22 +0200 Subject: [Numpy-discussion] numpy.complex In-Reply-To: <5009BCD7.1000306@laposte.net> References: <5009BCD7.1000306@laposte.net> Message-ID: 20.07.2012 22:17, OC kirjoitti: > The syntax "numpy.complex(A)" seems to be the most natural and obvious > thing a user would want for casting an array A to complex values. I think I disagree here -- that something like that works at all is rather surprising. Remember that numpy.complex, complex64, complex128, float64, ... et al. are types that represent scalar numbers, not arrays. That you get an array out from `float64(some_array)` rather than a ValueError is un-Pythonic. Happens to work, but probably for the wrong reasons. -- Pauli Virtanen From e.antero.tammi at gmail.com Fri Jul 20 18:58:23 2012 From: e.antero.tammi at gmail.com (eat) Date: Sat, 21 Jul 2012 01:58:23 +0300 Subject: [Numpy-discussion] Problems understanding histogram2d In-Reply-To: References: <3b0b00673342e3ac2bfc3d7e890a587c.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: Hi, On Fri, Jul 20, 2012 at 6:42 PM, yogesh karpate wrote: > I think since its a joint histogram, you need to have equal no. of data > points and bins > in both x and y. Makes sense that number of elements of data points (x, y) is equal. Perhaps the documentation like http://docs.scipy.org/doc/numpy-1.6.0/reference/generated/numpy.histogram2d.html could make this aspect more clearer. Especially confused is the requirement that *x* : array_like, shape(N,) and *y* : array_like, shape(M,), may indicate that N!= M could be a feasible case. A slightly better way would just state that x and y must be one dimensional and they must be equal length. My 2 cents, -eat > > > On Fri, Jul 20, 2012 at 5:11 PM, Andreas Hilboll wrote: > >> Hi, >> >> I have a problem using histogram2d: >> >> from numpy import linspace, histogram2d >> bins_x = linspace(-180., 180., 360) >> bins_y = linspace(-90., 90., 180) >> data_x = linspace(-179.96875, 179.96875, 5760) >> data_y = linspace(-89.96875, 89.96875, 2880) >> histogram2d(data_x, data_y, (bins_x, bins_y)) >> >> AttributeError: The dimension of bins must be equal to the dimension of >> the sample x. >> >> I would expect histogram2d to return a 2d array of shape (360,180), which >> is full of 256s. What am I missing here? >> >> Cheers, >> Andreas. >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > > -- > Regards > Yogesh Karpate > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aronne.merrelli at gmail.com Fri Jul 20 21:10:06 2012 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Fri, 20 Jul 2012 20:10:06 -0500 Subject: [Numpy-discussion] Problems understanding histogram2d In-Reply-To: <3b0b00673342e3ac2bfc3d7e890a587c.squirrel@srv2.s4y.tournesol-consulting.eu> References: <3b0b00673342e3ac2bfc3d7e890a587c.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: On Fri, Jul 20, 2012 at 10:11 AM, Andreas Hilboll wrote: > Hi, > > I have a problem using histogram2d: > > from numpy import linspace, histogram2d > bins_x = linspace(-180., 180., 360) > bins_y = linspace(-90., 90., 180) > data_x = linspace(-179.96875, 179.96875, 5760) > data_y = linspace(-89.96875, 89.96875, 2880) > histogram2d(data_x, data_y, (bins_x, bins_y)) > > AttributeError: The dimension of bins must be equal to the dimension of > the sample x. > > I would expect histogram2d to return a 2d array of shape (360,180), which > is full of 256s. What am I missing here? > It is a joint histogram, so the x and y inputs represent each dimension of a 2-dimensional sample. So, the x and y arrays must be the same length. (the documentation does appear to be incorrect here). The bins do not need to have the same length. Here is your example adjusted (with many fewer bins so I could print it in the console) - note since you just have two "ramps" from linspace as the data, most of the points are near the diagonal. In [15]: bins_x = linspace(-180,180,6) In [16]: bins_y = linspace(-90,90,4) In [17]: data_x = linspace(-179.96875, 179.96875, 2880) In [18]: data_y = linspace(-89.96875, 89.96875, 2880) In [19]: H, x_edges, y_edges = np.histogram2d(data_x, data_y, (bins_x, bins_y)) In [20]: H Out[20]: array([[ 576., 0., 0.], [ 384., 192., 0.], [ 0., 576., 0.], [ 0., 192., 384.], [ 0., 0., 576.]]) In [21]: x_edges Out[21]: array([-180., -108., -36., 36., 108., 180.]) In [22]: y_edges Out[22]: array([-90., -30., 30., 90.]) So, back to that AttributeError - it is clearly unhelpful. Looking through the code, it looks like the x,y input arrays are joined into a 2D array with a numpy core function 'atleast_2d'. If this function sees inputs that are not the same length, it actually produces a 2-element numpy object array: In [57]: data_x.shape, data_y.shape Out[57]: ((5760,), (2880,)) In [58]: data_xy = atleast_2d([data_x, data_y]) In [59]: data_xy.shape, data_xy.dtype Out[59]: ((1, 2), dtype('object')) In [60]: data_xy[0,0].shape, data_xy[0,1].shape Out[60]: ((5760,), (2880,)) If the x, y array have the same length this looks a lot more logical: In [62]: data_x.shape, data_y.shape Out[62]: ((2880,), (2880,)) In [63]: data_xy = atleast_2d([data_x, data_y]) In [64]: data_xy.shape, data_xy.dtype Out[64]: ((2, 2880), dtype('float64')) So, that Assertion error comes up histogramdd (which actually does the work), expects the data array to be [Ndimension, Nsample], and the number of dimensions is set by the number of bin arrays that were input (2). Since it sees that [1,2] shaped object array, it treats that as a 2-element, 1-dimension dataset; thus, at that level, the AssertionError actually makes sense. Hope that helps, Aronne From russel at appliedminds.com Fri Jul 20 23:04:56 2012 From: russel at appliedminds.com (Russel Howe) Date: Fri, 20 Jul 2012 20:04:56 -0700 Subject: [Numpy-discussion] Memory Leak Message-ID: <500A1C58.5080801@appliedminds.com> The attached program leaks about 24 bytes per loop. The comments give a bit more detail as to when the leak occurs and doesn't. 
How can I track down where this leak is actually coming from? Here is a sample run on my machine: $ python simple.py Python Version: 2.7.3 (default, Apr 20 2012, 22:39:59) [GCC 4.6.3] numpy version: 1.6.1 /etc/lsb-release: DISTRIB_ID=Ubuntu DISTRIB_RELEASE=12.04 DISTRIB_CODENAME=precise DISTRIB_DESCRIPTION="Ubuntu 12.04 LTS" 110567424.0 24.465408 135168000.0 24.600576 159768576.0 24.600576 184369152.0 24.600576 208969728.0 24.600576 233570304.0 24.600576 258035712.0 24.465408 282636288.0 24.600576 307236864.0 24.600576 331837440.0 24.600576 -------------- next part -------------- A non-text attachment was scrubbed... Name: simple.py Type: text/x-python Size: 1124 bytes Desc: not available URL: From oc-spam66 at laposte.net Sat Jul 21 11:44:43 2012 From: oc-spam66 at laposte.net (OC) Date: Sat, 21 Jul 2012 17:44:43 +0200 Subject: [Numpy-discussion] numpy.complex In-Reply-To: Message-ID: <500ACE6B.8050102@laposte.net> Thank you for your answers. Chris Barker wrote: > for consistency with the rest of the numpy types Then, why do "numpy.complex64(A)", "numpy.complex128(A)", "numpy.uint8(A)",... all work with arrays? It's very convenient that it works like this! It's awkward that "numpy.complex(A)" is the only one that does not. Is there a problem to extend "numpy.complex" so that it acts the same as "numpy.complex64"? Pauli Virtanen wrote: > Remember that "numpy.complex", "numpy.complex64" (...) are types that > represent scalar numbers, not arrays. (...) That you get an array > out from "numpy.complex64(A)" rather than a "ValueError" is > un-Pythonic. Thanks for pointing this out. I don't see why it would be un-pythonic, and on the contrary this behavior is useful. Why shouldn't a "type" object offer such useful method/constructor? Is there a design mistake here? (from the Python point of view, not from the C++ point of view). All the types you mention inherit from "numpy.generic", except "numpy.complex". Is there a reason for this? I find it awkward and misleading. I understand that "numpy.real" and "numpy.complex" are different things from a programmer's point of view, the first being a "function" and the latter being a "type". However, from the syntax point of view, I think that an average user is founded to believe that they behave similarly with arrays. And such an improvement seems to be easy. For example, why isn't "numpy.complex" simply equal to "numpy.complex_" instead of "__builtin__.complex"? Note: same remark for "numpy.bool" and "numpy.bool_" From njs at pobox.com Sat Jul 21 12:09:50 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 21 Jul 2012 17:09:50 +0100 Subject: [Numpy-discussion] numpy.complex In-Reply-To: <500ACE6B.8050102@laposte.net> References: <500ACE6B.8050102@laposte.net> Message-ID: On Sat, Jul 21, 2012 at 4:44 PM, OC wrote: > Thank you for your answers. > > Chris Barker wrote: > > for consistency with the rest of the numpy types > > Then, why do "numpy.complex64(A)", "numpy.complex128(A)", > "numpy.uint8(A)",... all work with arrays? It's very convenient that it > works like this! It's awkward that "numpy.complex(A)" is the only one > that does not. > > Is there a problem to extend "numpy.complex" so that it acts the same as > "numpy.complex64"? > > > Pauli Virtanen wrote: >> Remember that "numpy.complex", "numpy.complex64" (...) are types that >> represent scalar numbers, not arrays. (...) That you get an array >> out from "numpy.complex64(A)" rather than a "ValueError" is >> un-Pythonic. > > Thanks for pointing this out. 
I don't see why it would be un-pythonic, > and on the contrary this behavior is useful. Why shouldn't a "type" > object offer such useful method/constructor? Is there a design mistake > here? (from the Python point of view, not from the C++ point of view). It's unPythonic just in the sense that it is unlike every other type constructor in Python. int(x) returns an int, list(x) returns a list, but np.complex64(x) sometimes returns a np.complex64, and sometimes it returns a np.ndarray, depending on what 'x' is. I can see an argument for deprecating this behaviour altogether and referring people to the np.asarray(x, dtype=complex) form; that would be cleaner and reduce confusion. Don't know if it's worth it, but that's the only cleanup that I can see even being considered for these constructors. > All the types you mention inherit from "numpy.generic", except > "numpy.complex". Is there a reason for this? I find it awkward and > misleading. > > I understand that "numpy.real" and "numpy.complex" are different things > from a programmer's point of view, the first being a "function" and the > latter being a "type". However, from the syntax point of view, I think > that an average user is founded to believe that they behave similarly > with arrays. > > And such an improvement seems to be easy. For example, why isn't > "numpy.complex" simply equal to "numpy.complex_" instead of > "__builtin__.complex"? > > Note: same remark for "numpy.bool" and "numpy.bool_" There's also np.int/np.int_ and np.float/np.float_. It's considered poor form to have variables names that clash with builtins. People do write "from numpy import *", and it would be very bad if that overwrote basic builtins like int/float/bool. I'd probably have just had np.int_ be what it is, and np.int be undefined. But hindsight is 20/20 and all that; it's a bit late to do anything about it now. (I suppose there probably aren't many people who depend on np.int, and the ones who do are probably confused, so maybe that is a good argument for deprecating the builtin aliases. But it's very difficult to deprecate exports, so I guess this will probably never happen.) -n From fhaxbox66 at googlemail.com Sun Jul 22 08:54:36 2012 From: fhaxbox66 at googlemail.com (Dr.Leo) Date: Sun, 22 Jul 2012 14:54:36 +0200 Subject: [Numpy-discussion] Making cdecimal.Decimal a native numpy type In-Reply-To: <20120722100727.GA1781@sleipnir.bytereef.org> References: <20120722100727.GA1781@sleipnir.bytereef.org> Message-ID: <500BF80C.5030703@googlemail.com> Hi, I am a seasoned numpy/pandas user mainly interested in financial applications. These and other applications would greatly benefit from a decimal data type with flexible rounding rules, precision etc. Yes, there is cdecimal, the traditional decimal module from the Python stdlib rewritten in C, - http://www.bytereef.org/mpdecimal/index.html - which has become part of the stdlib from Python 3.3. However, it appears that cdecimal cannot be meaningfully used with numpy (see the benchmark below). Squaring an n=10000 ndarray is 1500 times faster with float64 than with a dtype=object ndarray based on cdecimal.Decimal, and even simple operations fail in the first place. I am not deeply enough into ufuncs etc. to judge if some of these problems can be avoided with a few lines of Python code. However, my impression is that ultimately we would all benefit from cdecimal.Decimal becoming a native numpy type. Put bluntly, cdecimal is a great tool. But it is not yet where we most need it. 
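For what it's worth, a rough sketch of the ufunc route mentioned above:
np.frompyfunc wraps a Python-level Decimal operation into an object-dtype
ufunc, which makes things like a Decimal-aware variance work, but of course
stays at Python-object speed rather than float64 speed. The names below are
just for illustration:

import numpy as np
from decimal import Decimal   # cdecimal.Decimal would be used the same way

d = np.array([Decimal('1.10'), Decimal('2.25'), Decimal('3.50')], dtype=object)

# object-dtype ufuncs built from plain Python callables
dec_add = np.frompyfunc(lambda x, y: x + y, 2, 1)
dec_square = np.frompyfunc(lambda x: x * x, 1, 1)

mean = dec_add.reduce(d) / len(d)                    # a Decimal, not a float
var = dec_add.reduce(dec_square(d - mean)) / len(d)  # Decimal variance
print(mean, var)

So part of the breakage can be scripted around, but not the speed gap.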
The author of cdecimal, Stefan Krah, would probably have a great deal of the skillset needed to successfully take such a project forward. He happens to have also written the new memoryview implementation of Python 3.3. And from recent correspondence I understand he might be willing to get involved in an effort to marry numpy and cdecimal. The main question is if such project would fit into what core developers see as the future of numpy. Regards Leo And here is the benchmark: In [1]: from numpy import * In [2]: from cdecimal import Decimal In [3]: r=random.rand(10000) In [4]: d=ndarray(10000, dtype=Decimal) In [5]: d.dtype Out[5]: dtype('object') In [6]: r.dtype Out[6]: dtype('float64') In [7]: for i in range(10000): d[i] = Decimal(r[i]) In [8]: %timeit r**2 100000 loops, best of 3: 14.7 us per loop In [9]: %timeit d**2 10 loops, best of 3: 21.2 ms per loop In [10]: r.var() Out[10]: 0.082478142261349557 In [11]: d.var() --------------------------------------------------------------------------- TypeError Traceback (most recent call last) C:\ in () ----> 1 d.var() From aldcroft at head.cfa.harvard.edu Sun Jul 22 09:08:35 2012 From: aldcroft at head.cfa.harvard.edu (Tom Aldcroft) Date: Sun, 22 Jul 2012 09:08:35 -0400 Subject: [Numpy-discussion] Making cdecimal.Decimal a native numpy type In-Reply-To: <500BF80C.5030703@googlemail.com> References: <20120722100727.GA1781@sleipnir.bytereef.org> <500BF80C.5030703@googlemail.com> Message-ID: On Sun, Jul 22, 2012 at 8:54 AM, Dr.Leo wrote: > Hi, > > I am a seasoned numpy/pandas user mainly interested in financial > applications. These and other applications would greatly benefit from a > decimal data type with flexible rounding rules, precision etc. > > Yes, there is cdecimal, the traditional decimal module from the Python > stdlib rewritten in C, > > - http://www.bytereef.org/mpdecimal/index.html - > > which has become part of the stdlib from Python 3.3. > > However, it appears that cdecimal cannot be meaningfully used with numpy > (see the benchmark below). Squaring an n=10000 ndarray is 1500 times > faster with float64 than with a dtype=object ndarray based on > cdecimal.Decimal, and even simple operations fail in the first place. > > I am not deeply enough into ufuncs etc. to judge if some of these > problems can be avoided with a few lines of Python code. However, my > impression is that ultimately we would all benefit from cdecimal.Decimal > becoming a native numpy type. Put bluntly, cdecimal is a great tool. But > it is not yet where we most need it. > > The author of cdecimal, Stefan Krah, would probably have a great deal of > the skillset needed to successfully take such a project forward. He > happens to have also written the new memoryview implementation of Python > 3.3. And from recent correspondence I understand he might be willing to > get involved in an effort to marry numpy and cdecimal. > > The main question is if such project would fit into what core developers > see as the future of numpy. 
> > Regards > > Leo > > And here is the benchmark: > > In [1]: from numpy import * > > In [2]: from cdecimal import Decimal > > In [3]: r=random.rand(10000) > > In [4]: d=ndarray(10000, dtype=Decimal) > > In [5]: d.dtype > Out[5]: dtype('object') > > In [6]: r.dtype > Out[6]: dtype('float64') > > In [7]: for i in range(10000): d[i] = Decimal(r[i]) > > In [8]: %timeit r**2 > 100000 loops, best of 3: 14.7 us per loop > > In [9]: %timeit d**2 > 10 loops, best of 3: 21.2 ms per loop > > In [10]: r.var() > Out[10]: 0.082478142261349557 > > In [11]: d.var() > --------------------------------------------------------------------------- > TypeError Traceback (most recent call last) > C:\ -11-bf09d28e33ab> in () > ----> 1 d.var() > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > The numpy-dtypes repository (https://github.com/numpy/numpy-dtypes) has been created recently as a repository for extension dtypes for numpy. This would be the natural place for a decimal dtype. Currently there is a rational and quaternion type, and documentation on how to implement a new dtype. This project is at an early stage and moving somewhat slowly, so contributions and input would be quite welcome. - Tom From ralf.gommers at googlemail.com Sun Jul 22 14:15:14 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 22 Jul 2012 20:15:14 +0200 Subject: [Numpy-discussion] view of recarray issue Message-ID: Hi, Just a heads up that right now views of recarrays seem to be problematic, this doesn't work anymore: >>> import statsmodels.api as sm >>> dta = sm.datasets.macrodata.load() # returns a record array with 14 fields >>> dta.data[['infl', 'realgdp']].view((float,2)) I opened http://projects.scipy.org/numpy/ticket/2187 for this. Probably a blocker for 1.7.0. Question: is that really the recommended way to get an (N, 2) size float array from two columns of a larger record array? If so, why isn't there a better way? If you'd want to write to that (N, 2) array you have to append a copy, making it even uglier. Also, then there really should be tests for views in test_records.py. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From russel at appliedminds.com Sun Jul 22 17:39:17 2012 From: russel at appliedminds.com (Russel Howe) Date: Sun, 22 Jul 2012 21:39:17 +0000 Subject: [Numpy-discussion] Memory Leak In-Reply-To: <500A1C58.5080801@appliedminds.com> References: <500A1C58.5080801@appliedminds.com> Message-ID: <2A6ED067A7843A488BC6382803A1BBB6146D2F67@am-ex00.amthinking.net> Never mind, this was fixed with commit 3a7e61c7d55be9a84929747c38cd71e62593129d. Russel ________________________________________ From: numpy-discussion-bounces at scipy.org [numpy-discussion-bounces at scipy.org] on behalf of Russel Howe [russel at appliedminds.com] Sent: Friday, July 20, 2012 8:04 PM To: numpy-discussion at scipy.org Subject: [Numpy-discussion] Memory Leak The attached program leaks about 24 bytes per loop. The comments give a bit more detail as to when the leak occurs and doesn't. How can I track down where this leak is actually coming from? 
Here is a sample run on my machine: $ python simple.py Python Version: 2.7.3 (default, Apr 20 2012, 22:39:59) [GCC 4.6.3] numpy version: 1.6.1 /etc/lsb-release: DISTRIB_ID=Ubuntu DISTRIB_RELEASE=12.04 DISTRIB_CODENAME=precise DISTRIB_DESCRIPTION="Ubuntu 12.04 LTS" 110567424.0 24.465408 135168000.0 24.600576 159768576.0 24.600576 184369152.0 24.600576 208969728.0 24.600576 233570304.0 24.600576 258035712.0 24.465408 282636288.0 24.600576 307236864.0 24.600576 331837440.0 24.600576 From ndbecker2 at gmail.com Mon Jul 23 07:41:45 2012 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 23 Jul 2012 07:41:45 -0400 Subject: [Numpy-discussion] gnu mpc 1.0 released Message-ID: Perhaps of some interest here: http://lwn.net/Articles/507756/rss From oc-spam66 at laposte.net Mon Jul 23 15:29:25 2012 From: oc-spam66 at laposte.net (OC) Date: Mon, 23 Jul 2012 21:29:25 +0200 Subject: [Numpy-discussion] numpy.complex In-Reply-To: <500ACE6B.8050102@laposte.net> References: <500ACE6B.8050102@laposte.net> Message-ID: <500DA615.8080008@laposte.net> > It's unPythonic just in the sense that it is unlike every other type > constructor in Python. int(x) returns an int, list(x) returns a list, > but np.complex64(x) sometimes returns a np.complex64, and sometimes it > returns a np.ndarray, depending on what 'x' is. This "object factory" design pattern adds useful and natural functionality. > I can see an argument for deprecating this behaviour altogether and > referring people to the np.asarray(x, dtype=complex) form; that would > be cleaner and reduce confusion. Don't know if it's worth it, but > that's the only cleanup that I can see even being considered for these > constructors. From my experience in teaching, I can tell that even beginners have no problem with the fact that "complex128(1)" returns a scalar and that "complex128(r_[1])" returns an array. It seems to be pretty natural. Also, from the duck-typing point of view, both returned values are complex, i.e. provide 'real' and 'imag' attributes and 'conjugate()' method. On the contrary a real confusion is with "numpy.complex" acting differently than the other "numpy.complex*". > People do write "from numpy import *" Yeah, that's what I do very often in interactive "ipython" sessions. Other than this, people are warned often enough that this shouldn't be used in real programs. > and it would be very bad if that overwrote basic builtins like > int/float/bool True, but is it so bad? All the following is true when 'x' is a scalar: complex(x) == numpy.complex_(x) bool(x) == numpy.bool_(x) float(x) == numpy.float_(x) int(x) == numpy.int_(x) isinstance(numpy.complex_(x), complex) isinstance(numpy.int_(x), int) isinstance(numpy.float_(x), float) However, it's indeed a problem that "numpy.complex_(1,1)" is not defined, contrary to "__builtin__.complex(1,1)" (but this looks easy to implement). > it's very difficult to deprecate exports, so I guess this will probably > never happen. With the move to Python 3, people are used to movement :-) I will just summarize my opinion and understanding, in case someone is interested: * I find it ugly that: - "numpy.real(A)" and "numpy.complex(A)" do not behave the same. - "numpy.complex(A)" and "numpy.complex128(A)" do not behave the same. * A solution could be to let "numpy.complex" be equal to "numpy.complex_" (and the same for "int", "float", "bool"). - This would overwrite builtins in case of "from numpy import *". 
- This may not be that harmful, except for the fact that "numpy.complex_(1,1)" is not implemented, contrary to "__builtin__.complex(1,1)". From ben.root at ou.edu Mon Jul 23 20:58:16 2012 From: ben.root at ou.edu (Benjamin Root) Date: Mon, 23 Jul 2012 20:58:16 -0400 Subject: [Numpy-discussion] numpy.complex In-Reply-To: <500DA615.8080008@laposte.net> References: <500ACE6B.8050102@laposte.net> <500DA615.8080008@laposte.net> Message-ID: On Monday, July 23, 2012, OC wrote: > > It's unPythonic just in the sense that it is unlike every other type > > constructor in Python. int(x) returns an int, list(x) returns a list, > > but np.complex64(x) sometimes returns a np.complex64, and sometimes it > > returns a np.ndarray, depending on what 'x' is. > > This "object factory" design pattern adds useful and natural functionality. > > > I can see an argument for deprecating this behaviour altogether and > > referring people to the np.asarray(x, dtype=complex) form; that would > > be cleaner and reduce confusion. Don't know if it's worth it, but > > that's the only cleanup that I can see even being considered for these > > constructors. > > From my experience in teaching, I can tell that even beginners have no > problem with the fact that "complex128(1)" returns a scalar and that > "complex128(r_[1])" returns an array. It seems to be pretty natural. > > Also, from the duck-typing point of view, both returned values are > complex, i.e. provide 'real' and 'imag' attributes and 'conjugate()' > method. > > On the contrary a real confusion is with "numpy.complex" acting > differently than the other "numpy.complex*". > > > People do write "from numpy import *" > > Yeah, that's what I do very often in interactive "ipython" sessions. > Other than this, people are warned often enough that this shouldn't be > used in real programs. Don't be so sure of that. The "pylab" mode from matplotlib has been both a blessing and a curse. This mode is very popular and for many, "it is all they need/want to know". While it has made the transition from other languages easier for many, the polluted namespace comes at a small cost. And it is only going to get worse when moving over to py3k where just about everything is a generator. __builtin__.any can handle generators, but np.any does not. Same goes for several other functions. Note, I do agree with you that the discrepancy needs to be fixed, I just am not sure which way. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From jason.stelzer at gmail.com Tue Jul 24 10:00:42 2012 From: jason.stelzer at gmail.com (Jason Stelzer) Date: Tue, 24 Jul 2012 10:00:42 -0400 Subject: [Numpy-discussion] building a numpy egg for centos Message-ID: I'm building numpy 1.6.2 for python 2.5 on centos 5.8. I ran into a problem where bdist_egg was not working. It seems there's a minor bug in numpy/distutils/core.py Under python 2.5 the check for setuptools does not work, so the bdist target for eggs is not available. I've attached a patch that works around the issue for me. It is my understanding that python 2.5 should still be a valid target for building this release. If not, ignore this. -- J. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: numpy-1.6.2-bdist_egg.patch Type: application/octet-stream Size: 1326 bytes Desc: not available URL: From robert.kern at gmail.com Tue Jul 24 10:06:22 2012 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 24 Jul 2012 15:06:22 +0100 Subject: [Numpy-discussion] building a numpy egg for centos In-Reply-To: References: Message-ID: On Tue, Jul 24, 2012 at 3:00 PM, Jason Stelzer wrote: > I'm building numpy 1.6.2 for python 2.5 on centos 5.8. > > I ran into a problem where bdist_egg was not working. It seems there's > a minor bug in numpy/distutils/core.py > > Under python 2.5 the check for setuptools does not work, so the bdist > target for eggs is not available. > > I've attached a patch that works around the issue for me. It is my > understanding that python 2.5 should still be a valid target for > building this release. If not, ignore this. This was an explicit design choice. numpy.distutils will never import setuptools for you even if you have it installed. It will simply integrate with it if you have run the setup.py script from something that has setuptools imported, like the setupegg.py script. What exactly is failing to work under Python 2.5 on your system? "python setup.py bdist_egg" should never work, but "python setupegg.py bdist_egg" should. -- Robert Kern From jason.stelzer at gmail.com Tue Jul 24 10:44:18 2012 From: jason.stelzer at gmail.com (Jason Stelzer) Date: Tue, 24 Jul 2012 10:44:18 -0400 Subject: [Numpy-discussion] building a numpy egg for centos In-Reply-To: References: Message-ID: On Tue, Jul 24, 2012 at 10:06 AM, Robert Kern wrote: > This was an explicit design choice. numpy.distutils will never import > setuptools for you even if you have it installed. It will simply > integrate with it if you have run the setup.py script from something > that has setuptools imported, like the setupegg.py script. > > What exactly is failing to work under Python 2.5 on your system? > "python setup.py bdist_egg" should never work, but "python setupegg.py > bdist_egg" should. > Thanks Robert, The explanation is user error on my part. I got too used to the convention of setup.py/eggs. -- J. From ondrej.certik at gmail.com Tue Jul 24 11:14:31 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Tue, 24 Jul 2012 08:14:31 -0700 Subject: [Numpy-discussion] Segfault in mingw in test_arrayprint.TestComplexArray In-Reply-To: References: Message-ID: On Fri, Jul 20, 2012 at 11:51 AM, David Cournapeau wrote: > On Thu, Jul 19, 2012 at 4:58 PM, Ond?ej ?ert?k wrote: > >> >> So I have tried the MinGW-5.0.3.exe in Wine, but it tries to install >> from some wrong url and it fails to install. >> I have unpacked the tarballs by hand into "~/.wine/drive_c/MinGW": >> >> binutils-2.17.50-20070129-1.tar.gz >> w32api-3.7.tar.gz >> gcc-g77-3.4.5-20051220-1.tar.gz >> gcc-g++-3.4.5-20051220-1.tar.gz >> gcc-core-3.4.5-20051220-1.tar.gz >> mingw-runtime-3.10.tar.gz >> >> also in the same directory, I had to do: >> >> cp ../windows/system32/msvcr90.dll lib/ > > I think that's your problem right there. You should not need to do > that, and doing so will likely result in having multiple copies of the > DLL in your process (you can confirm with process dependency walker). > This should be avoided at all cost, as the python C API is not > designed to deal with this, and your crashes are pretty typical of > what happens in those cases. Ah, that could be it. David, what version of binutils do you use? 
I use 2.17.50 (https://github.com/certik/numpy-vendor) and maybe that's the problem, that the objdump from there can't read the msvcr library. I use gcc 3.4.5. What exact version do you use in wine? I always do rm -rf .wine and install things from scratch. Here is my script: https://gist.github.com/3170576 As you can see, it removes .wine and then installs things automatically. This exact setup fails to build the msvcr library, and I have to copy it manually on the line 51. Ralf, David, I really appreciate your help --- I think I am very close to having a working environment, but so far I didn't manage to install numpy and pass all tests yet, so that's why I don't know what is normal and what is not. Once I can reproduce the build at least once, things will go much faster from there. Ondrej From ondrej.certik at gmail.com Tue Jul 24 11:21:57 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Tue, 24 Jul 2012 08:21:57 -0700 Subject: [Numpy-discussion] Finished maintenance/1.7.x branch In-Reply-To: References: <60C559B9-B858-4344-A7C3-50702D629ED1@continuum.io> Message-ID: On Fri, Jul 20, 2012 at 7:58 AM, David Cournapeau wrote: > On Fri, Jul 20, 2012 at 1:50 PM, Ond?ej ?ert?k wrote: >> On Wed, Jul 18, 2012 at 8:09 AM, Travis Oliphant wrote: >>> >>> Hey all, >>> >>> We are going to work on a beta release on the 1.7.x branch. The master is open again for changes for 1.8.x. There will be some work on the 1.7.x branch to fix bugs including bugs that are already reported but have not yet been addressed (like the regression against data-type detection for Sage). It would be great if 1.7.x gets as much testing as possible so that we can discover regressions that may have occurred. But, it was important to draw the line for 1.7.0 features. >> >> I think we need to fix the MinGW build problems before the release: >> >> https://github.com/numpy/numpy/pull/363 > > I am looking into some windows issues at work, so will have time to > look into it in the next few hours. You got the issue for Mingw 3.x, > right ? I think so --- as we talked in the other threads, it looks now to me, that I should never copy any dll around, and that is probably the problem. As such, I think the problem might be in my wine setup. Ondrej From ralf.gommers at googlemail.com Tue Jul 24 15:32:23 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 24 Jul 2012 21:32:23 +0200 Subject: [Numpy-discussion] Segfault in mingw in test_arrayprint.TestComplexArray In-Reply-To: References: Message-ID: On Tue, Jul 24, 2012 at 5:14 PM, Ond?ej ?ert?k wrote: > On Fri, Jul 20, 2012 at 11:51 AM, David Cournapeau > wrote: > > On Thu, Jul 19, 2012 at 4:58 PM, Ond?ej ?ert?k > wrote: > > > >> > >> So I have tried the MinGW-5.0.3.exe in Wine, but it tries to install > >> from some wrong url and it fails to install. > >> I have unpacked the tarballs by hand into "~/.wine/drive_c/MinGW": > >> > >> binutils-2.17.50-20070129-1.tar.gz > >> w32api-3.7.tar.gz > >> gcc-g77-3.4.5-20051220-1.tar.gz > >> gcc-g++-3.4.5-20051220-1.tar.gz > >> gcc-core-3.4.5-20051220-1.tar.gz > >> mingw-runtime-3.10.tar.gz > >> > >> also in the same directory, I had to do: > >> > >> cp ../windows/system32/msvcr90.dll lib/ > > > > I think that's your problem right there. You should not need to do > > that, and doing so will likely result in having multiple copies of the > > DLL in your process (you can confirm with process dependency walker). 
> > This should be avoided at all cost, as the python C API is not > > designed to deal with this, and your crashes are pretty typical of > > what happens in those cases. > > Ah, that could be it. > > David, what version of binutils do you use? > I use 2.17.50 (https://github.com/certik/numpy-vendor) > and maybe that's the problem, that the objdump from there > can't read the msvcr library. > > I use gcc 3.4.5. What exact version do you use in wine? > > I have gcc 3.4.5 and binutils 2.20. I'll send you a tar file with all the binaries off-list. Apologies in advance for clogging your inbox. Ralf > I always do rm -rf .wine and install things from scratch. Here is my > script: > > https://gist.github.com/3170576 > > As you can see, it removes .wine and then installs things automatically. > This exact setup fails to build the msvcr library, and I have to copy > it manually on the line 51. > > > Ralf, David, I really appreciate your help --- I think I am very close to > having a working environment, but so far I didn't manage to install > numpy and pass all tests yet, so that's why I don't know what is normal > and what is not. Once I can reproduce the build at least once, > things will go much faster from there. > > Ondrej > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thouis at gmail.com Tue Jul 24 17:46:37 2012 From: thouis at gmail.com (Thouis (Ray) Jones) Date: Tue, 24 Jul 2012 23:46:37 +0200 Subject: [Numpy-discussion] Github notifications and trac-to-github migration Message-ID: Hello, I would estimate I'm between a fourth and halfway through the implementation of the trac-to-github-issues migration code. The work lives in at https://github.com/thouis/numpy-trac-migration , though without a copy of the trac DB, it's not really possible to experiment with it. I haven't included the DB because of its size and potential privacy issues. My plan is to use this repository for testing. One concern I have is that when issues are assigned or someone is mentioned via @username, a notification will be sent (for the default notifications settings). I believe this is the case even for closed issues. My plan had been to add @user mentions for the reporter and CCs listed in each trac issue, but there are other options: - only add @users for open trac tickets, don't assign closed tickets to their original owner, - don't add any @users for reporter and CCs (though owners will be notified when issues are assigned), - a combination of the above. I also thought it might be good to send a warning, via an issue on the test repo with an @user for everyone that might be messaged in the final transition, warning them of what's about to take place. During testing, all @users will be replaced with something else, to keep from flooding people with notifications during debugging. Thank you for any feedback. 
Ray Jones


From giuseppe.amatulli at gmail.com Tue Jul 24 18:09:13 2012
From: giuseppe.amatulli at gmail.com (Giuseppe Amatulli)
Date: Tue, 24 Jul 2012 17:09:13 -0500
Subject: [Numpy-discussion] np.unique for one bi-dimensional array
Message-ID:

Hi,

I would like to identify unique pairs of numbers in two arrays, or in one
bi-dimensional array, and count the observations

a_clean=array([4,4,5,4,4,4])
b_clean=array([3,5,4,4,3,4])

and obtain

(4,3,2)
(4,5,1)
(5,4,1)
(4,4,2)

I solved this with two loops, but of course there will be a faster
solution. Any idea?
What about using np.unique for one bi-dimensional array?

In bash I usually use the uniq command.

thanks in advance
Giuseppe

--
Giuseppe Amatulli
Web: www.spatial-ecology.net

From ondrej.certik at gmail.com Tue Jul 24 18:58:40 2012
From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=)
Date: Tue, 24 Jul 2012 15:58:40 -0700
Subject: [Numpy-discussion] Segfault in mingw in test_arrayprint.TestComplexArray
In-Reply-To:
References:
Message-ID:

Ralf,

>> David, what version of binutils do you use?
>> I use 2.17.50 (https://github.com/certik/numpy-vendor)
>> and maybe that's the problem, that the objdump from there
>> can't read the msvcr library.
>>
>> I use gcc 3.4.5. What exact version do you use in wine?
>>
>
> I have gcc 3.4.5 and binutils 2.20. I'll send you a tar file with all the
> binaries off-list. Apologies in advance for clogging your inbox.

Thanks a lot for this. It helped a lot.


I got your binutils 2.20, installed (into a clean .wine environment),
and used the original numpy code
However, perhaps something like the following lines will help you: In []: lot= zip(a_clean, b_clean) In []: lot Out[]: [(4, 3), (4, 5), (5, 4), (4, 4), (4, 3), (4, 4)] In []: [[x, lot.count(x)] for x in set(lot)] Out[]: [[(4, 5), 1], [(5, 4), 1], [(4, 4), 2], [(4, 3), 2]] My 2 cents, -eat > > In bash I usually unique command > > thanks in advance > Giuseppe > > -- > Giuseppe Amatulli > Web: www.spatial-ecology.net > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej.certik at gmail.com Tue Jul 24 20:04:57 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Tue, 24 Jul 2012 17:04:57 -0700 Subject: [Numpy-discussion] Segfault in mingw in test_arrayprint.TestComplexArray In-Reply-To: References: Message-ID: On Tue, Jul 24, 2012 at 3:58 PM, Ond?ej ?ert?k wrote: > Ralf, > >>> David, what version of binutils do you use? >>> I use 2.17.50 (https://github.com/certik/numpy-vendor) >>> and maybe that's the problem, that the objdump from there >>> can't read the msvcr library. >>> >>> I use gcc 3.4.5. What exact version do you use in wine? >>> >> >> I have gcc 3.4.5 and binutils 2.20. I'll send you a tar file with all the >> binaries off-list. Apologies in advance for clogging your inbox. > > Thanks a lot for this. It helped a lot. > > > I got your binutils 2.20, installed (into a clean .wine environment), > and used the original numpy code > from the maintenance/1.7.x, it still fails with the "ValueError: > Symbol table not found" exception > due to this problem: > > > Z:\home\ondrej>objdump -t > C:\windows\winsxs\x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.4148_none_deadbeef\msvcr90.dll > objdump: C:\windows\winsxs\x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.4148_none_deadbeef\msvcr90.dll: > File format not recognized > > > My binutils are correct now: > > > Z:\home\ondrej>objdump --version > GNU objdump (GNU Binutils) 2.20 > Copyright 2009 Free Software Foundation, Inc. > This program is free software; you may redistribute it under the terms of > the GNU General Public License version 3 or (at your option) any later version. > This program has absolutely no warranty. > > > So now I know that the problem is not there, and so it must be in my > msvcr90.dll library somehow. I will keep digging. I've meed a little progress. Here are all my msvcr90.dll on my system: ondrej at eagle:~/.wine$ find . -name msvcr90.dll ./drive_c/windows/winsxs/x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.4148_none_deadbeef/msvcr90.dll ./drive_c/windows/winsxs/x86_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0.21022.8_x-ww_d08d0375/msvcr90.dll ./drive_c/windows/system32/msvcr90.dll since the "deadbeef" one gets picked up by numpy and it doesn't work, I did the following: rm ./drive_c/windows/winsxs/x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.4148_none_deadbeef/msvcr90.dll And now numpy gets further: Building msvcr library: "C:\Python27\libs\libmsvcr90.a" (from C:\windows\winsxs\x86_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0.21022.8_x-ww_d08d0375\msvcr90.dll) Cannot build msvcr library: "msvcr90d.dll" not found The msvcr90d.dll is some debugging version of the library, which indeed is not even installed in my wine. 
So I did some debugging of the build_msvcr_library() function in numpy/distutils/mingw32ccompiler.py and it turns out that when it is called with debug=False, it indeed builds the new library in C:\Python27\libs\libmsvcr90.a. When called with debug=True, it simply prints the message that msvcr90d.dll was not found, but this should be harmless. So this looks like things are actually working for me. Except that later, I get this error: Found executable C:\MinGW\bin\g++.exe C:\MinGW\bin\..\lib\gcc\mingw32\3.4.5\..\..\..\..\mingw32\bin\ld.exe: cannot find -lmsvcr90 collect2: ld returned 1 exit status failure. removing: _configtest.exe.manifest _configtest.c _configtest.o Traceback (most recent call last): File "setup.py", line 214, in setup_package() File "setup.py", line 207, in setup_package configuration=configuration ) File "Z:\home\ondrej\repos\numpy\numpy\distutils\core.py", line 186, in setup return old_setup(**new_attr) File "C:\Python27\lib\distutils\core.py", line 152, in setup dist.run_commands() File "C:\Python27\lib\distutils\dist.py", line 953, in run_commands self.run_command(cmd) File "C:\Python27\lib\distutils\dist.py", line 972, in run_command cmd_obj.run() File "Z:\home\ondrej\repos\numpy\numpy\distutils\command\build.py", line 37, in run old_build.run(self) File "C:\Python27\lib\distutils\command\build.py", line 127, in run self.run_command(cmd_name) File "C:\Python27\lib\distutils\cmd.py", line 326, in run_command self.distribution.run_command(command) File "C:\Python27\lib\distutils\dist.py", line 972, in run_command cmd_obj.run() File "Z:\home\ondrej\repos\numpy\numpy\distutils\command\build_src.py", line 152, in run self.build_sources() File "Z:\home\ondrej\repos\numpy\numpy\distutils\command\build_src.py", line 163, in build_sources self.build_library_sources(*libname_info) File "Z:\home\ondrej\repos\numpy\numpy\distutils\command\build_src.py", line 298, in build_library_sources sources = self.generate_sources(sources, (lib_name, build_info)) File "Z:\home\ondrej\repos\numpy\numpy\distutils\command\build_src.py", line 385, in generate_sources source = func(extension, build_dir) File "numpy\core\setup.py", line 648, in get_mathlib_info raise RuntimeError("Broken toolchain: cannot link a simple C program") RuntimeError: Broken toolchain: cannot link a simple C program The reason is that "ld" doesn't find the msvcr90.dll. If I copy the right one into C:\MinGW\lib, then it starts working. But I think that this copying by hand should not be done... Ondrej From ondrej.certik at gmail.com Tue Jul 24 20:06:49 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Tue, 24 Jul 2012 17:06:49 -0700 Subject: [Numpy-discussion] Segfault in mingw in test_arrayprint.TestComplexArray In-Reply-To: References: Message-ID: On Tue, Jul 24, 2012 at 5:04 PM, Ond?ej ?ert?k wrote: > On Tue, Jul 24, 2012 at 3:58 PM, Ond?ej ?ert?k wrote: >> Ralf, >> >>>> David, what version of binutils do you use? >>>> I use 2.17.50 (https://github.com/certik/numpy-vendor) >>>> and maybe that's the problem, that the objdump from there >>>> can't read the msvcr library. >>>> >>>> I use gcc 3.4.5. What exact version do you use in wine? >>>> >>> >>> I have gcc 3.4.5 and binutils 2.20. I'll send you a tar file with all the >>> binaries off-list. Apologies in advance for clogging your inbox. >> >> Thanks a lot for this. It helped a lot. 
>> >> >> I got your binutils 2.20, installed (into a clean .wine environment), >> and used the original numpy code >> from the maintenance/1.7.x, it still fails with the "ValueError: >> Symbol table not found" exception >> due to this problem: >> >> >> Z:\home\ondrej>objdump -t >> C:\windows\winsxs\x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.4148_none_deadbeef\msvcr90.dll >> objdump: C:\windows\winsxs\x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.4148_none_deadbeef\msvcr90.dll: >> File format not recognized >> >> >> My binutils are correct now: >> >> >> Z:\home\ondrej>objdump --version >> GNU objdump (GNU Binutils) 2.20 >> Copyright 2009 Free Software Foundation, Inc. >> This program is free software; you may redistribute it under the terms of >> the GNU General Public License version 3 or (at your option) any later version. >> This program has absolutely no warranty. >> >> >> So now I know that the problem is not there, and so it must be in my >> msvcr90.dll library somehow. I will keep digging. > > I've meed a little progress. Here are all my msvcr90.dll on my system: > > ondrej at eagle:~/.wine$ find . -name msvcr90.dll > ./drive_c/windows/winsxs/x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.4148_none_deadbeef/msvcr90.dll > ./drive_c/windows/winsxs/x86_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0.21022.8_x-ww_d08d0375/msvcr90.dll > ./drive_c/windows/system32/msvcr90.dll > > > since the "deadbeef" one gets picked up by numpy and it doesn't work, > I did the following: > > rm ./drive_c/windows/winsxs/x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.4148_none_deadbeef/msvcr90.dll > > And now numpy gets further: > > Building msvcr library: "C:\Python27\libs\libmsvcr90.a" (from > C:\windows\winsxs\x86_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0.21022.8_x-ww_d08d0375\msvcr90.dll) > Cannot build msvcr library: "msvcr90d.dll" not found > > > > The msvcr90d.dll is some debugging version of the library, which > indeed is not even installed in my wine. > > So I did some debugging of the build_msvcr_library() function in > numpy/distutils/mingw32ccompiler.py and it turns out > that when it is called with debug=False, it indeed builds the new > library in C:\Python27\libs\libmsvcr90.a. > When called with debug=True, it simply prints the message that > msvcr90d.dll was not found, but this should be harmless. > > So this looks like things are actually working for me. Except that > later, I get this error: > > Found executable C:\MinGW\bin\g++.exe > C:\MinGW\bin\..\lib\gcc\mingw32\3.4.5\..\..\..\..\mingw32\bin\ld.exe: > cannot find -lmsvcr90 > collect2: ld returned 1 exit status > failure. 
> removing: _configtest.exe.manifest _configtest.c _configtest.o > Traceback (most recent call last): > File "setup.py", line 214, in > setup_package() > File "setup.py", line 207, in setup_package > configuration=configuration ) > File "Z:\home\ondrej\repos\numpy\numpy\distutils\core.py", line 186, in setup > return old_setup(**new_attr) > File "C:\Python27\lib\distutils\core.py", line 152, in setup > dist.run_commands() > File "C:\Python27\lib\distutils\dist.py", line 953, in run_commands > self.run_command(cmd) > File "C:\Python27\lib\distutils\dist.py", line 972, in run_command > cmd_obj.run() > File "Z:\home\ondrej\repos\numpy\numpy\distutils\command\build.py", > line 37, in run > old_build.run(self) > File "C:\Python27\lib\distutils\command\build.py", line 127, in run > self.run_command(cmd_name) > File "C:\Python27\lib\distutils\cmd.py", line 326, in run_command > self.distribution.run_command(command) > File "C:\Python27\lib\distutils\dist.py", line 972, in run_command > cmd_obj.run() > File "Z:\home\ondrej\repos\numpy\numpy\distutils\command\build_src.py", > line 152, in run > self.build_sources() > File "Z:\home\ondrej\repos\numpy\numpy\distutils\command\build_src.py", > line 163, in build_sources > self.build_library_sources(*libname_info) > File "Z:\home\ondrej\repos\numpy\numpy\distutils\command\build_src.py", > line 298, in build_library_sources > sources = self.generate_sources(sources, (lib_name, build_info)) > File "Z:\home\ondrej\repos\numpy\numpy\distutils\command\build_src.py", > line 385, in generate_sources > source = func(extension, build_dir) > File "numpy\core\setup.py", line 648, in get_mathlib_info > raise RuntimeError("Broken toolchain: cannot link a simple C program") > RuntimeError: Broken toolchain: cannot link a simple C program > > > The reason is that "ld" doesn't find the msvcr90.dll. If I copy the > right one into C:\MinGW\lib, then it starts working. But I think that > this copying by hand should not be done... But --- I can run all numpy tests for the first time!! 
Here are the failures: ====================================================================== ERROR: test_datetime_y2038 (test_datetime.TestDateTime) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python27\lib\site-packages\numpy\core\tests\test_datetime.py", line 1749, in test_datetime_y2038 assert_equal(str(a)[:-5], '2038-01-20T13:21:14') OSError: Failed to use '_localtime64_s' to convert to a local time ====================================================================== ERROR: test_combinations (test_multiarray.TestArgmax) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python27\lib\site-packages\numpy\core\tests\test_multiarray.py", line 1280, in test_combinations assert_equal(np.argmax(arr), pos, err_msg="%r"%arr) OSError: Failed to use '_localtime64_s' to convert to a local time ====================================================================== ERROR: test_combinations (test_multiarray.TestArgmin) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Python27\lib\site-packages\numpy\core\tests\test_multiarray.py", line 1348, in test_combinations assert_equal(np.argmin(arr), pos, err_msg="%r"%arr) OSError: Failed to use '_localtime64_s' to convert to a local time ---------------------------------------------------------------------- Ran 4258 tests in 21.161s FAILED (KNOWNFAIL=9, SKIP=8, errors=3) Ondrej From ondrej.certik at gmail.com Tue Jul 24 20:38:05 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Tue, 24 Jul 2012 17:38:05 -0700 Subject: [Numpy-discussion] Segfault in mingw in test_arrayprint.TestComplexArray In-Reply-To: References: Message-ID: Possible conclusion: I think I know what is going on. I use $ wine --version wine-1.3.28 and it installs (by default) the following msvcr libraries: $ ls ~/.wine/drive_c/windows/winsxs/ manifests Policies x86_microsoft.msxml2_6bd6b9abf345378f_4.1.0.0_none_deadbeef x86_microsoft.vc80.crt_1fc8b3b9a1e18e3b_8.0.50727.4053_none_deadbeef x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.4148_none_deadbeef x86_microsoft.windows.common-controls_6595b64144ccf1df_6.0.2600.2982_none_deadbeef x86_microsoft.windows.gdiplus_6595b64144ccf1df_1.0.6000.16386_none_deadbeef x86_microsoft-windows-msxml30_31bf3856ad364e35_6.0.6000.16386_none_deadbeef x86_microsoft-windows-msxml60_31bf3856ad364e35_6.0.6000.16386_none_deadbeef And the msvcr90.dll is in the *vc90* directory. This msvcr90.dll cannot be read by our objdump. As such, I just remove it using "rm". The Python installer installs the following directory: x86_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0.21022.8_x-ww_d08d0375 and that is a good library. NumPy picks it up automatically and all is ok. I also copy this good dll into C:\MinGW\libs, so that it is picked up by the "ld" linker. Otherwise the numpy build also fails. As such, here is my updated setup-wine.sh script: https://gist.github.com/3170576 In there, I still need to copy $tarballs/msvcr90.dll into lib/ (which David does not recommend, but if I don't do that, then any compilation with "gcc something.c -lmsvcr90" will fail, even though "gcc something.c" works). Once this script is run, then numpy builds out of the box just using: wine python setup.py build --compiler=mingw32 install and all tests run via: wine python -c "import numpy; numpy.test()" with no segfault. So I think that this is a lot of progress. 
Let me now test this numpy binary on a Windows machine. Ondrej From ondrej.certik at gmail.com Tue Jul 24 20:57:18 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Tue, 24 Jul 2012 17:57:18 -0700 Subject: [Numpy-discussion] Segfault in mingw in test_arrayprint.TestComplexArray In-Reply-To: References: Message-ID: On Tue, Jul 24, 2012 at 5:38 PM, Ond?ej ?ert?k wrote: > Possible conclusion: > > I think I know what is going on. I use > > $ wine --version > wine-1.3.28 > > and it installs (by default) the following msvcr libraries: > > > $ ls ~/.wine/drive_c/windows/winsxs/ > manifests > Policies > x86_microsoft.msxml2_6bd6b9abf345378f_4.1.0.0_none_deadbeef > x86_microsoft.vc80.crt_1fc8b3b9a1e18e3b_8.0.50727.4053_none_deadbeef > x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.4148_none_deadbeef > x86_microsoft.windows.common-controls_6595b64144ccf1df_6.0.2600.2982_none_deadbeef > x86_microsoft.windows.gdiplus_6595b64144ccf1df_1.0.6000.16386_none_deadbeef > x86_microsoft-windows-msxml30_31bf3856ad364e35_6.0.6000.16386_none_deadbeef > x86_microsoft-windows-msxml60_31bf3856ad364e35_6.0.6000.16386_none_deadbeef > > > And the msvcr90.dll is in the *vc90* directory. This msvcr90.dll > cannot be read by our objdump. As such, I just remove it using "rm". > The Python installer installs the following directory: > > x86_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0.21022.8_x-ww_d08d0375 > > and that is a good library. NumPy picks it up automatically and all is > ok. I also copy this good dll into C:\MinGW\libs, so that it is picked > up by the "ld" linker. Otherwise the numpy build also fails. > > As such, here is my updated setup-wine.sh script: > > https://gist.github.com/3170576 > > In there, I still need to copy $tarballs/msvcr90.dll into lib/ (which > David does not recommend, but if I don't do that, then any compilation > with "gcc something.c -lmsvcr90" will fail, even though "gcc > something.c" works). > > Once this script is run, then numpy builds out of the box just using: > > wine python setup.py build --compiler=mingw32 install > > and all tests run via: > > wine python -c "import numpy; numpy.test()" > > with no segfault. So I think that this is a lot of progress. Let me > now test this numpy binary on a Windows machine. Also works! There is even one more test failure, but I'll post these in another thread. So I think I am set for creating Windows binaries now. Thanks Ralf and David for your help! Ondrej From ondrej.certik at gmail.com Tue Jul 24 21:03:22 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Tue, 24 Jul 2012 18:03:22 -0700 Subject: [Numpy-discussion] Test failures on Windows XP 64-bit Message-ID: Hi, I've created a Windows installer of the maintenance/1.7.x branch (commit f93774d), the binary is available here (created in Wine on linux): https://github.com/certik/numpy-vendor/blob/69555c40dff5ae9f2d27d827f283bc6d9c53fccf/test/ and when I install it on the Windows XP 64-bit machine, here are the test results: https://gist.github.com/3173696 There are four failures --- three of them are related to the same exception: OSError: Failed to use '_localtime64_s' to convert to a local time and the fourth one seems like a regular bug in the test_iterator.test_iter_array_cast test. Are these known failures? Note that I can reproduce the _localtime64_s failures in Wine, but not the test_iterator.test_iter_array_cast one. 
Ondrej From kalatsky at gmail.com Tue Jul 24 21:15:35 2012 From: kalatsky at gmail.com (Val Kalatsky) Date: Tue, 24 Jul 2012 20:15:35 -0500 Subject: [Numpy-discussion] np.unique for one bi-dimensional array In-Reply-To: References: Message-ID: There are various ways to repack the pair of arrays into one array. The most universal is probably to use structured array (can repack more than a pair): x = np.array(zip(a, b), dtype=[('a',int), ('b',int)]) After repacking you can use unique and other numpy methods: xu = np.unique(x) zip(xu['a'], xu['b'], np.bincount(np.searchsorted(xu, x))) [(4, 3, 2), (4, 4, 2), (4, 5, 1), (5, 4, 1)] Val On Tue, Jul 24, 2012 at 5:09 PM, Giuseppe Amatulli < giuseppe.amatulli at gmail.com> wrote: > Hi, > > would like to identify unique pairs of numbers in two arrays o in one > bi-dimensional array, and count the observation > > a_clean=array([4,4,5,4,4,4]) > b_clean=array([3,5,4,4,3,4]) > > and obtain > (4,3,2) > (4,5,1) > (5,4,1) > (4,4,2) > > I solved with tow loops but off course there will be a fast solution > Any idea? > what about using np.unique for one bi-dimensional array? > > In bash I usually unique command > > thanks in advance > Giuseppe > > -- > Giuseppe Amatulli > Web: www.spatial-ecology.net > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Wed Jul 25 00:53:31 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 24 Jul 2012 21:53:31 -0700 Subject: [Numpy-discussion] Github notifications and trac-to-github migration In-Reply-To: References: Message-ID: Hi Thouis, On Tue, Jul 24, 2012 at 2:46 PM, Thouis (Ray) Jones wrote: > I would estimate I'm between a fourth and halfway through the > implementation of the trac-to-github-issues migration code. The work > lives in at https://github.com/thouis/numpy-trac-migration mmh, I would have thought you're farther ahead... Aric Hagberg (@hagberg) and Jordi Torrents (@jtorrents) of NetworkX fame last weekend completed the trac2github migration for nx, and he said he'd only had to make a few improvements to your code. I'm cc'ing Aric here so he can give us more details, but based on the fact that they were able to successfully migrate nx completely to GH, I would have imagined you'd be much, much closer for numpy/scipy. Their migration looks pretty solid, including all old comments and attachments being correctly linked, cf this one: https://github.com/networkx/networkx/issues/693 Many thanks for getting that code in place, since it made Aric and Jordi's job much easier (they still worked super hard, I saw them hunkered down in the sprints area for a long time, but at least this made it *possible* to get nx on github, which makes me very happy). 
Cheers, f From ralf.gommers at googlemail.com Wed Jul 25 02:02:28 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 25 Jul 2012 08:02:28 +0200 Subject: [Numpy-discussion] Test failures on Windows XP 64-bit In-Reply-To: References: Message-ID: On Wed, Jul 25, 2012 at 3:03 AM, Ond?ej ?ert?k wrote: > Hi, > > I've created a Windows installer of the maintenance/1.7.x branch > (commit f93774d), the binary is available here (created in Wine on > linux): > > > https://github.com/certik/numpy-vendor/blob/69555c40dff5ae9f2d27d827f283bc6d9c53fccf/test/ > > and when I install it on the Windows XP 64-bit machine, here are the > test results: > > https://gist.github.com/3173696 > > There are four failures --- three of them are related to the same > exception: > > OSError: Failed to use '_localtime64_s' to convert to a local time > These are known, and are being discussed on another thread. > > and the fourth one seems like a regular bug in the > test_iterator.test_iter_array_cast test. Are these known failures? > This one is new. Ralf > > Note that I can reproduce the _localtime64_s failures in Wine, but not > the test_iterator.test_iter_array_cast one. > > Ondrej > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thouis at gmail.com Wed Jul 25 05:51:43 2012 From: thouis at gmail.com (Thouis (Ray) Jones) Date: Wed, 25 Jul 2012 11:51:43 +0200 Subject: [Numpy-discussion] Github notifications and trac-to-github migration In-Reply-To: References: Message-ID: On Wed, Jul 25, 2012 at 6:53 AM, Fernando Perez wrote: > Hi Thouis, > > On Tue, Jul 24, 2012 at 2:46 PM, Thouis (Ray) Jones wrote: >> I would estimate I'm between a fourth and halfway through the >> implementation of the trac-to-github-issues migration code. The work >> lives in at https://github.com/thouis/numpy-trac-migration > > mmh, I would have thought you're farther ahead... Aric Hagberg > (@hagberg) and Jordi Torrents (@jtorrents) of NetworkX fame last > weekend completed the trac2github migration for nx, and he said he'd > only had to make a few improvements to your code. > > I'm cc'ing Aric here so he can give us more details, but based on the > fact that they were able to successfully migrate nx completely to GH, > I would have imagined you'd be much, much closer for numpy/scipy. Perhaps my estimate was low. I hadn't done any work with creating issues on github (only extracting them from Trac into a form that maps onto github issues), but I expect the PyGithub library (https://github.com/jacquev6/PyGithub) helps make the rest of the work easier. Glad to hear it helped. > Their migration looks pretty solid, including all old comments and > attachments being correctly linked, cf this one: > > https://github.com/networkx/networkx/issues/693 Based on that issue, it looks like I wasn't careful enough in temporal ordering of comments, not that it's that critical. Aric, is the code you ended up using available somewhere? 
Thanks, Ray From giuseppe.amatulli at gmail.com Wed Jul 25 10:31:41 2012 From: giuseppe.amatulli at gmail.com (Giuseppe Amatulli) Date: Wed, 25 Jul 2012 09:31:41 -0500 Subject: [Numpy-discussion] Fwd: np.unique for one bi-dimensional array In-Reply-To: References: Message-ID: Hi, would like to identify unique pairs of numbers in two arrays o in one bi-dimensional array, and count the observation a_clean=array([4,4,5,4,4,4]) b_clean=array([3,5,4,4,3,4]) and obtain (4,3,2) (4,5,1) (5,4,1) (4,4,2) I solved with tow loops but off course there will be a faster solution. I was checking also for np.unique but i did not find how to apply for a bi-dimensional array. or Concatenate the two arrays a_concatenate=array([4_3,4_5,5_4,4_4,4_3,4_4]), then np.unique, then split again. Any other/faster solutions? In bash I usually unique command Thanks in advance -- Giuseppe Amatulli Web: www.spatial-ecology.net From lists at hilboll.de Wed Jul 25 10:50:48 2012 From: lists at hilboll.de (Andreas Hilboll) Date: Wed, 25 Jul 2012 16:50:48 +0200 Subject: [Numpy-discussion] Fwd: np.unique for one bi-dimensional array In-Reply-To: References: Message-ID: <17408a956bb2b8bd5a7cfec759b83e18.squirrel@srv2.s4y.tournesol-consulting.eu> > Hi, > > would like to identify unique pairs of numbers in two arrays o in one > bi-dimensional array, and count the observation > > a_clean=array([4,4,5,4,4,4]) > b_clean=array([3,5,4,4,3,4]) > > and obtain > (4,3,2) > (4,5,1) > (5,4,1) > (4,4,2) > > I solved with tow loops but off course there will be a faster solution. > > I was checking also > > for np.unique but i did not find how to apply for a bi-dimensional array. > or > Concatenate the two arrays > a_concatenate=array([4_3,4_5,5_4,4_4,4_3,4_4]), then np.unique, then > split again. > > Any other/faster solutions? > In bash I usually unique command > Thanks in advance I'd try np.histogram2d, but probably only because I don't know np.unique. A. From jsseabold at gmail.com Wed Jul 25 11:15:10 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Wed, 25 Jul 2012 11:15:10 -0400 Subject: [Numpy-discussion] view of recarray issue In-Reply-To: References: Message-ID: On Sun, Jul 22, 2012 at 2:15 PM, Ralf Gommers wrote: > Hi, > > Just a heads up that right now views of recarrays seem to be problematic, > this doesn't work anymore: > >>>> import statsmodels.api as sm >>>> dta = sm.datasets.macrodata.load() # returns a record array with 14 >>>> fields >>>> dta.data[['infl', 'realgdp']].view((float,2)) > > I opened http://projects.scipy.org/numpy/ticket/2187 for this. Probably a > blocker for 1.7.0. > > Question: is that really the recommended way to get an (N, 2) size float > array from two columns of a larger record array? If so, why isn't there a > better way? If you'd want to write to that (N, 2) array you have to append a > copy, making it even uglier. Also, then there really should be tests for > views in test_records.py. > Any comments on this? I have a lot of broken code to deal with if ((float, shape[1])) is no longer allowed on structured and rec arrays. Skipper From jay.bourque at continuum.io Wed Jul 25 13:29:50 2012 From: jay.bourque at continuum.io (Jay Bourque) Date: Wed, 25 Jul 2012 12:29:50 -0500 Subject: [Numpy-discussion] view of recarray issue In-Reply-To: References: Message-ID: I'm actively looking at this issue since it was my pull request that broke this (https://github.com/numpy/numpy/pull/350). We definitely don't want to break this functionality for 1.7. 
The problem is that even though indexing with a subset of fields still returns a copy (for now), it now returns a copy of a view of the original array. When you call copy() on a view, it copies the entire original structured array with the view dtype. A short term fix would be to "manually" create a proper copy to return similar to what _index_fields() did before my change, but since the idea is to eventually return the view instead of a copy, long term we need a way to do a proper copy of a structured array view that doesn't copy the unwanted fields. -Jay On Wed, Jul 25, 2012 at 10:15 AM, Skipper Seabold wrote: > On Sun, Jul 22, 2012 at 2:15 PM, Ralf Gommers > wrote: > > Hi, > > > > Just a heads up that right now views of recarrays seem to be problematic, > > this doesn't work anymore: > > > >>>> import statsmodels.api as sm > >>>> dta = sm.datasets.macrodata.load() # returns a record array with 14 > >>>> fields > >>>> dta.data[['infl', 'realgdp']].view((float,2)) > > > > I opened http://projects.scipy.org/numpy/ticket/2187 for this. Probably > a > > blocker for 1.7.0. > > > > Question: is that really the recommended way to get an (N, 2) size float > > array from two columns of a larger record array? If so, why isn't there a > > better way? If you'd want to write to that (N, 2) array you have to > append a > > copy, making it even uglier. Also, then there really should be tests for > > views in test_records.py. > > > > Any comments on this? I have a lot of broken code to deal with if > ((float, shape[1])) is no longer allowed on structured and rec arrays. > > Skipper > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Wed Jul 25 13:36:31 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 25 Jul 2012 19:36:31 +0200 Subject: [Numpy-discussion] Github notifications and trac-to-github migration In-Reply-To: References: Message-ID: On Tue, Jul 24, 2012 at 11:46 PM, Thouis (Ray) Jones wrote: > Hello, > > I would estimate I'm between a fourth and halfway through the > implementation of the trac-to-github-issues migration code. The work > lives in at https://github.com/thouis/numpy-trac-migration , though > without a copy of the trac DB, it's not really possible to experiment > with it. I haven't included the DB because of its size and potential > privacy issues. My plan is to use this repository for testing. > > One concern I have is that when issues are assigned or someone is > mentioned via @username, a notification will be sent (for the default > notifications settings). I believe this is the case even for closed > issues. My plan had been to add @user mentions for the reporter and > CCs listed in each trac issue, but there are other options: > I would prefer the above. Having CC's correct on issues that are closed is imho more important than a larger one-time flood of emails. Otherwise it becomes quite hard to revisit issues. > - only add @users for open trac tickets, don't assign closed tickets > to their original owner, > - don't add any @users for reporter and CCs (though owners will be > notified when issues are assigned), > - a combination of the above. 
> > I also thought it might be good to send a warning, via an issue on the > test repo with an @user for everyone that might be messaged in the > final transition, warning them of what's about to take place. > Good idea. > During testing, all @users will be replaced with something else, to > keep from flooding people with notifications during debugging. > > Thank you for any feedback. > It looks pretty good, looking forward to trying it out in a test Github repo. Below are a few things that I noticed in the code. It looks like you want to discard the Milestones, except for the 1.7.0, 1.8.0 and 2.0.0 ones. Why not keep all of them? I think the bug/enhancement/task label is missing. This shouldn't work I think (and I did a short test): def t2g_markup(s): return s.replace('{{{', "'''").replace('}}}', "'''") The {{{}}} part should be replaced by ```` for inline markup and indented for multi-line comments in Github I believe, then it will render the same way. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From aric.hagberg at gmail.com Wed Jul 25 21:01:45 2012 From: aric.hagberg at gmail.com (Aric Hagberg) Date: Wed, 25 Jul 2012 19:01:45 -0600 Subject: [Numpy-discussion] Github notifications and trac-to-github migration In-Reply-To: References: Message-ID: On Wed, Jul 25, 2012 at 3:51 AM, Thouis (Ray) Jones wrote: > On Wed, Jul 25, 2012 at 6:53 AM, Fernando Perez wrote: >> Hi Thouis, >> >> On Tue, Jul 24, 2012 at 2:46 PM, Thouis (Ray) Jones wrote: >>> I would estimate I'm between a fourth and halfway through the >>> implementation of the trac-to-github-issues migration code. The work >>> lives in at https://github.com/thouis/numpy-trac-migration >> >> mmh, I would have thought you're farther ahead... Aric Hagberg >> (@hagberg) and Jordi Torrents (@jtorrents) of NetworkX fame last >> weekend completed the trac2github migration for nx, and he said he'd >> only had to make a few improvements to your code. >> >> I'm cc'ing Aric here so he can give us more details, but based on the >> fact that they were able to successfully migrate nx completely to GH, >> I would have imagined you'd be much, much closer for numpy/scipy. > > Perhaps my estimate was low. I hadn't done any work with creating > issues on github (only extracting them from Trac into a form that maps > onto github issues), but I expect the PyGithub library > (https://github.com/jacquev6/PyGithub) helps make the rest of the work > easier. Glad to hear it helped. > >> Their migration looks pretty solid, including all old comments and >> attachments being correctly linked, cf this one: >> >> https://github.com/networkx/networkx/issues/693 > > Based on that issue, it looks like I wasn't careful enough in temporal > ordering of comments, not that it's that critical. > > Aric, is the code you ended up using available somewhere? We obviously didn't get all of the details quite right when we migrated the Trac tickets to Github issues. We gave ourselves a day or so at the SciPy sprints to do it and made a best effort. We would have never been able to accomplish what we did without the code Ray wrote. Really there wasn't much more to add and we are happy to share what we wrote (though it is a hack). I've cc'd Jordi who wrote the extra code. Briefly here are some of the issues we encountered - Jordi can probably add more. 1) We made and applied a mapping from the changeset hashes in our old repository (Mercurial) to the Git changeset hashes. This mostly worked. 
2) We didn't make a mapping between the Git issue numbers and the Trac issue numbers so many of the cross references were wrong. I recommend doing that. 3) You are right that many messages will get sent out so considering the impact of that is worthwhile. 4) Some of the tickets/comments (maybe 50 of the approx 800 tickets we converted) had some formatting that broke during conversion. Jordi might have some thoughts on how to fix that. Aric From fperez.net at gmail.com Thu Jul 26 13:47:51 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 26 Jul 2012 10:47:51 -0700 Subject: [Numpy-discussion] Fwd: Github notifications and trac-to-github migration In-Reply-To: References: Message-ID: Forwarding Jordi's message on the trac migration, he's having issues sending it directly. ---------- Forwarded message ---------- From: Jordi Torrents Date: 2012/7/26 Subject: Re: [Numpy-discussion] Github notifications and trac-to-github migration To: Aric Hagberg Cc: Discussion of Numerical Python Hi, Sorry for the delay in reporting feedback on the migration. I was planning to do it when I came back to Barcelona from Scipy but had some domestic problems (broken pipe, partially flooded flat) that got in the way. 2012/7/26 Aric Hagberg : > On Wed, Jul 25, 2012 at 3:51 AM, Thouis (Ray) Jones wrote: >> On Wed, Jul 25, 2012 at 6:53 AM, Fernando Perez wrote: >>> Hi Thouis, >>> >>> On Tue, Jul 24, 2012 at 2:46 PM, Thouis (Ray) Jones wrote: >>>> I would estimate I'm between a fourth and halfway through the >>>> implementation of the trac-to-github-issues migration code. The work >>>> lives in at https://github.com/thouis/numpy-trac-migration >>> >>> mmh, I would have thought you're farther ahead... Aric Hagberg >>> (@hagberg) and Jordi Torrents (@jtorrents) of NetworkX fame last >>> weekend completed the trac2github migration for nx, and he said he'd >>> only had to make a few improvements to your code. >>> >>> I'm cc'ing Aric here so he can give us more details, but based on the >>> fact that they were able to successfully migrate nx completely to GH, >>> I would have imagined you'd be much, much closer for numpy/scipy. >> >> Perhaps my estimate was low. I hadn't done any work with creating >> issues on github (only extracting them from Trac into a form that maps >> onto github issues), but I expect the PyGithub library >> (https://github.com/jacquev6/PyGithub) helps make the rest of the work >> easier. Glad to hear it helped. >> >>> Their migration looks pretty solid, including all old comments and >>> attachments being correctly linked, cf this one: >>> >>> https://github.com/networkx/networkx/issues/693 >> >> Based on that issue, it looks like I wasn't careful enough in temporal >> ordering of comments, not that it's that critical. >> >> Aric, is the code you ended up using available somewhere? > > We obviously didn't get all of the details quite right when we > migrated the Trac tickets to Github issues. We gave ourselves a day > or so at the SciPy sprints to do it and made a best effort. We would > have never been able to accomplish what we did without the code Ray > wrote. Really there wasn't much more to add and we are happy to share > what we wrote (though it is a hack). I've cc'd Jordi who wrote the > extra code. As Aric says, Ray's code was a life saver for us. We tried several other scripts for the migration before knowing about Ray's code, and all of them failed badly. The only important part missing in Ray's code was the method push() of the class issue in issue.py. 
We didn't migrate milestones nor labels, so more work is needed to do that. Here is how we implemented the push method: def push(self, repo): github_issue = repo.create_issue(title=self.github.title, body=self.github.body)#, #assignee=self.github.assignee, #milestone=self.github.milestone, #labels=self.github.labels) try: for comment in self.github.comments: github_issue.create_comment(comment) except: print("!!! Error in ticket %s" % self.trac.id) finally: if self.github.state == "closed": github_issue.edit(state='closed') Some formatting in the comments was problematic (more on that below). In our case this affected approximately 5% of the tickets. Github returned an ugly HTTP 500 (Internal Server Error) and all comments coming after a problematic one were lost. A more careful handling of individual comments would have prevented the loss of subsequent comments. Then the move_issues.py code was simply: import trac from ghissues import gh_repo repo = gh_repo() # Manual auth here for issue in trac.issues('data/trac.db'): print("processing issue %s" % issue.trac.id) issue.githubify() issue.push(repo) > Briefly here are some of the issues we encountered - Jordi can > probably add more. > > 1) We made and applied a mapping from the changeset hashes in our old > repository (Mercurial) to the Git changeset hashes. This mostly > worked. We used a regular expression to match the mercurial hash for commits in trac comments and a map generated by hg-git. It turns out that the regular expression was not general enough and we missed some hashes. The quick and dirty code that we used is: m = re.compile('\[(.*)/networkx\]') def load_hg_map(fname='git-mapfile'): hg_map = {} f = open(fname,'r') for row in f: hg_map[row.split(" ")[1].strip()] = row.split(" ")[0].strip() f.close() return hg_map hg_map = load_hg_map() def map_hg(hg_hash): if hg_hash in hg_map: return hg_map[hg_hash] else: return hg_hash def t2g_markup(s): h = m.search(s) if h: hh = h.group(1) s = s.replace(hh,map_hg(hh)) return s.replace('{{{', "'''").replace('}}}', "'''") > 2) We didn't make a mapping between the Git issue numbers and the Trac > issue numbers so many of the cross references were wrong. I recommend > doing that. > > 3) You are right that many messages will get sent out so considering > the impact of that is worthwhile. > > 4) Some of the tickets/comments (maybe 50 of the approx 800 tickets > we converted) had some formatting that broke during conversion. Jordi > might have some thoughts on how to fix that. As I said above, the problematic comments triggered an ugly HTTP 500 (Internal Server Error) from Github. We didn't spend time trying to debug and fix that. Most of the comments that failed in the migration had python code in their code using trac syntax ({{{#!python .... }}}). However not all comments with python code failed, my feeling is that the problematic parts were the prints that used the percent sign (%) but I'm not sure about that. Also, the edited trac comments also triggered a HTTP 500 error, see for instance: https://networkx.lanl.gov/trac/ticket/609#comment:26 The list of almost all issues with problematic comments is (to look at them https://networkx.lanl.gov/trac/ticket/{id} ): 231, 255, 273, 282, 283, 301, 310, 314, 346, 348, 362, 401, 431, 450, 466, 477, 494, 501, 539, 563, 574, 583, 598, 609, 623, 628, 632, 637, 643, 676, 704, 707, 713, 740 Ray, thank you very much for your work, without your code our migration would have been a lot more painful and slow. Salut! 
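(For reference, a minimal sketch of the per-comment error handling described above. This is an assumption about how it could be done, not the code that was actually run; it only moves the try/except of the quoted push() method inside the loop, so that one failing comment no longer discards the comments that come after it.)

    def push(self, repo):
        # Create the issue first (title/body prepared by githubify()).
        github_issue = repo.create_issue(title=self.github.title,
                                         body=self.github.body)
        # Push comments one at a time, so an HTTP 500 on a single
        # comment does not abort the remaining comments.
        for comment in self.github.comments:
            try:
                github_issue.create_comment(comment)
            except Exception:
                print("!!! Error in a comment of ticket %s" % self.trac.id)
        # Close the issue last, mirroring the original implementation.
        if self.github.state == "closed":
            github_issue.edit(state='closed')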
From hodge at stsci.edu Thu Jul 26 14:33:45 2012 From: hodge at stsci.edu (Phil Hodge) Date: Thu, 26 Jul 2012 14:33:45 -0400 Subject: [Numpy-discussion] bug in numpy.where? Message-ID: <50118D89.3090001@stsci.edu> On a Linux machine: > uname -srvop Linux 2.6.18-308.8.2.el5 #1 SMP Tue May 29 11:54:17 EDT 2012 x86_64 GNU/Linux this example shows an apparent problem with the where function: Python 2.7.1 (r271:86832, Dec 21 2010, 11:19:43) [GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> print np.__version__ 1.5.1 >>> net = np.zeros(3, dtype='>f4') >>> net[1] = 0.00458849 >>> net[2] = 0.605202 >>> max_net = net.max() >>> test = np.where(net <= 0., max_net, net) >>> print test [ -2.23910537e-35 4.58848989e-03 6.05202019e-01] When I specified the dtype for net as '>f8', test[0] was 3.46244974e+68. It worked as expected (i.e. test[0] should be 0.605202) when I specified float(max_net) as the second argument to np.where. Phil From carlosadean at linea.gov.br Thu Jul 26 16:01:10 2012 From: carlosadean at linea.gov.br (Carlos Adean) Date: Thu, 26 Jul 2012 17:01:10 -0300 Subject: [Numpy-discussion] =?windows-1252?q?afte_run_python_setup=2Epy_bu?= =?windows-1252?q?ild_-_error=3A_expected_specifier-qualifier-list_before_?= =?windows-1252?q?=91complex=92?= Message-ID: <5011A206.7070603@linea.gov.br> Hi all, After run the command below (numpy 1.6.2): python setup.py build I receive the following message error: gcc: build/src.linux-x86_64-2.5/numpy/core/src/npymath/ieee754.c In file included from numpy/core/src/npymath/ieee754.c.src:8: numpy/core/src/npymath/npy_math_private.h:452: error: expected specifier-qualifier-list before ?complex? numpy/core/src/npymath/npy_math_private.h:457: error: expected specifier-qualifier-list before ?complex? numpy/core/src/npymath/npy_math_private.h:462: error: expected specifier-qualifier-list before ?complex? In file included from numpy/core/src/npymath/ieee754.c.src:8: numpy/core/src/npymath/npy_math_private.h:452: error: expected specifier-qualifier-list before ?complex? numpy/core/src/npymath/npy_math_private.h:457: error: expected specifier-qualifier-list before ?complex? numpy/core/src/npymath/npy_math_private.h:462: error: expected specifier-qualifier-list before ?complex? error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Inumpy/core/include -Ibuild/src.linux-x86_64-2.5/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/tawala/packages/python-2.5.4-2/include/python2.5 -Ibuild/src.linux-x86_64-2.5/numpy/core/src/multiarray -Ibuild/src.linux-x86_64-2.5/numpy/core/src/umath -c build/src.linux-x86_64-2.5/numpy/core/src/npymath/ieee754.c -o build/temp.linux-x86_64-2.5/build/src.linux-x86_64-2.5/numpy/core/src/npymath/ieee754.o" failed with exit status 1 CentOS 5.8 python version 2.5.4 numpy 1.6.2 any ideia to solve this? 
thanks -- -- Carlos Adean IT Team linea.gov.br From pjoshi at numenta.com Thu Jul 26 16:29:55 2012 From: pjoshi at numenta.com (Prakash Joshi) Date: Thu, 26 Jul 2012 20:29:55 +0000 Subject: [Numpy-discussion] Warnings and errors while building numpy 1.6.2 on 64 bit system Message-ID: <2BA6BFB370CEB34993BB1193833C1A9E8870F1@MBX021-W3-CA-5.exch021.domain.local> Hi All, I am building numpy 1.6.2 on 64 bit Mac OSX/CentOS and I found many warnings and errors of similar kind, however the numpy passes all the tests. Please let me know if I can consider this build as good? llvm-gcc-4.2: _configtest.c _configtest.c:5: error: size of array ?test_array? is negative _configtest.c:1:20: error: endian.h: No such file or directory _configtest.c:7: error: ?SIZEOF_LONGDOUBLE? undeclared (first use in this function) _configtest.c:1: warning: conflicting types for built-in function ?exp? numpy/core/src/npymath/ieee754.c.src:260: warning: implicit conversion shortens 64-bit value into a 32-bit value _configtest.c:5: warning: function declaration isn?t a prototype _configtest.c:4: warning: conflicting types for built-in function ?fabs? _configtest.c:8: warning: control reaches end of non-void function numpy/core/src/umath/loops.c.src:1403: warning: ?FLOAT_ldexp_long? defined but not used numpy/core/src/private/lowlevel_strided_loops.h:36: warning: ?PyArray_FreeStridedTransferData? declared ?static? but never defined numpy/lib/src/_compiled_base.c:120: warning: assignment from incompatible pointer type Best Regards, Prakash Joshi From fn681 at ncf.ca Thu Jul 26 16:45:43 2012 From: fn681 at ncf.ca (Colin J. Williams) Date: Thu, 26 Jul 2012 16:45:43 -0400 Subject: [Numpy-discussion] Synonym standards Message-ID: <5011AC77.3070406@ncf.ca> An HTML attachment was scrubbed... URL: From ben.root at ou.edu Thu Jul 26 16:57:16 2012 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 26 Jul 2012 16:57:16 -0400 Subject: [Numpy-discussion] Synonym standards In-Reply-To: <5011AC77.3070406@ncf.ca> References: <5011AC77.3070406@ncf.ca> Message-ID: On Thu, Jul 26, 2012 at 4:45 PM, Colin J. Williams wrote: > It seems that these standards have been adopted, which is good: > > The following import conventions are used throughout the NumPy source and > documentation: > > import numpy as np > import matplotlib as mpl > import matplotlib.pyplot as plt > > Source: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt > > Is there some similar standard for PyLab? > > Thanks, > > Colin W. > > Colin, Typically, with pylab mode of matplotlib, you do: from pylab import * This is essentially equivalent to: from numpy import * from matplotlib.pyplot import * Note that the pylab "module" is actually a part of matplotlib and is a shortcut to provide an environment that is very familiar to Matlab users. Converts are then encouraged to use the imports you mentioned in order to properly utilize python namespaces. I hope that helps! Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjwilliams43 at gmail.com Thu Jul 26 21:31:06 2012 From: cjwilliams43 at gmail.com (Colin J. Williams) Date: Thu, 26 Jul 2012 21:31:06 Subject: [Numpy-discussion] Synonym standards In-Reply-To: CANNq6FnWMmFL=mdT9OwBvOvBwF4Z-wD1uCWmBfZiT+mQDn2rJg@mail.gmail.com References: <5011AC77.3070406@ncf.ca> Message-ID: <5515207906894721089@unknownmsgid> Sent from my BlackBerry? PlayBook? 
www.blackberry.com ------------------------------ *From:* "Benjamin Root" *To:* "Discussion of Numerical Python" *Sent:* 26 July 2012 16:57 *Subject:* Re: [Numpy-discussion] Synonym standards On Thu, Jul 26, 2012 at 4:45 PM, Colin J. Williams wrote: > It seems that these standards have been adopted, which is good: > > The following import conventions are used throughout the NumPy source and > documentation: > > import numpy as np > import matplotlib as mpl > import matplotlib.pyplot as plt > > Source: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt > > Is there some similar standard for PyLab? > > Thanks, > > Colin W. > > Colin, Typically, with pylab mode of matplotlib, you do: from pylab import * This is essentially equivalent to: from numpy import * from matplotlib.pyplot import * Note that the pylab "module" is actually a part of matplotlib and is a shortcut to provide an environment that is very familiar to Matlab users. Converts are then encouraged to use the imports you mentioned in order to properly utilize python namespaces. I hope that helps! Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjwilliams43 at gmail.com Thu Jul 26 19:05:40 2012 From: cjwilliams43 at gmail.com (Colin J. Williams) Date: Thu, 26 Jul 2012 19:05:40 -0400 Subject: [Numpy-discussion] Synonym standards In-Reply-To: References: <5011AC77.3070406@ncf.ca> Message-ID: <5011CD44.2050709@gmail.com> An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Jul 26 19:12:45 2012 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 27 Jul 2012 00:12:45 +0100 Subject: [Numpy-discussion] Synonym standards In-Reply-To: <5011CD44.2050709@gmail.com> References: <5011AC77.3070406@ncf.ca> <5011CD44.2050709@gmail.com> Message-ID: On Fri, Jul 27, 2012 at 12:05 AM, Colin J. Williams wrote: > On 26/07/2012 4:57 PM, Benjamin Root wrote: > > > On Thu, Jul 26, 2012 at 4:45 PM, Colin J. Williams wrote: >> >> It seems that these standards have been adopted, which is good: >> >> The following import conventions are used throughout the NumPy source and >> documentation: >> >> import numpy as np >> import matplotlib as mpl >> import matplotlib.pyplot as plt >> >> Source: >> https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt >> >> Is there some similar standard for PyLab? >> >> Thanks, >> >> Colin W. >> > > > Colin, > > Typically, with pylab mode of matplotlib, you do: > > from pylab import * > > This is essentially equivalent to: > > from numpy import * > from matplotlib.pyplot import * > > Note that the pylab "module" is actually a part of matplotlib and is a > shortcut to provide an environment that is very familiar to Matlab users. > Converts are then encouraged to use the imports you mentioned in order to > properly utilize python namespaces. > > I hope that helps! > Ben Root > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > Thanks Ben, > > I would prefer not to use: from xxx import *, > > because of the name pollution. > > The name convention that I copied above facilitates avoiding the pollution. > > In the same spirit, I've used: > import pylab as plb But in that same spirit, using np and plt separately is preferred. 
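For example, with the conventions quoted at the top of this thread (just an illustration of the np/plt style, nothing matplotlib-specific beyond a basic plot):

    import numpy as np
    import matplotlib.pyplot as plt

    x = np.linspace(0, 2 * np.pi, 100)
    plt.plot(x, np.sin(x))   # explicit np./plt. names instead of the flat pylab namespace
    plt.show()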
-- Robert Kern From charlesr.harris at gmail.com Thu Jul 26 21:21:44 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 26 Jul 2012 19:21:44 -0600 Subject: [Numpy-discussion] Warnings and errors while building numpy 1.6.2 on 64 bit system In-Reply-To: <2BA6BFB370CEB34993BB1193833C1A9E8870F1@MBX021-W3-CA-5.exch021.domain.local> References: <2BA6BFB370CEB34993BB1193833C1A9E8870F1@MBX021-W3-CA-5.exch021.domain.local> Message-ID: On Thu, Jul 26, 2012 at 2:29 PM, Prakash Joshi wrote: > Hi All, > > I am building numpy 1.6.2 on 64 bit Mac OSX/CentOS and I found many > warnings and errors of similar kind, however the numpy passes all the > tests. > Please let me know if I can consider this build as good? > > > llvm-gcc-4.2: _configtest.c > > _configtest.c:5: error: size of array ?test_array? is negative > > _configtest.c:1:20: error: endian.h: No such file or directory > > _configtest.c:7: error: ?SIZEOF_LONGDOUBLE? undeclared (first use in this > function) > > > _configtest.c:1: warning: conflicting types for built-in function ?exp? > > > numpy/core/src/npymath/ieee754.c.src:260: warning: implicit conversion > shortens 64-bit value into a 32-bit value > > _configtest.c:5: warning: function declaration isn?t a prototype > > _configtest.c:4: warning: conflicting types for built-in function ?fabs? > > _configtest.c:8: warning: control reaches end of non-void function > > You can ignore all the _configtest.c warnings, they result from numpy running configuration tests to figure out what the system looks like. > numpy/core/src/umath/loops.c.src:1403: warning: ?FLOAT_ldexp_long? defined > but not used > > numpy/core/src/private/lowlevel_strided_loops.h:36: warning: > ?PyArray_FreeStridedTransferData? declared ?static? but never defined > > > numpy/lib/src/_compiled_base.c:120: warning: assignment from incompatible > pointer type > > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From aldcroft at head.cfa.harvard.edu Thu Jul 26 23:26:56 2012 From: aldcroft at head.cfa.harvard.edu (Tom Aldcroft) Date: Thu, 26 Jul 2012 23:26:56 -0400 Subject: [Numpy-discussion] Bug in numpy.mean() revisited Message-ID: There was a thread in January discussing the non-obvious behavior of numpy.mean() for large arrays of float32 values [1]. This issue is nicely discussed at the end of the numpy.mean() documentation [2] with an example: >>> a = np.zeros((2, 512*512), dtype=np.float32) >>> a[0, :] = 1.0 >>> a[1, :] = 0.1 >>> np.mean(a) 0.546875 >From the docs and previous discussion it seems there is no technical difficulty in choosing a different (higher precision) type for the accumulator using the dtype arg, and in fact this is done automatically for int values. My question is whether there would be any support for doing something more than documenting this behavior. I suspect very few people ever make it below the fold for the np.mean() documentation. Taking the mean of large arrays of float32 values is a *very* common use case and giving the wrong answer with default inputs is really disturbing. I recently had to rebuild a complex science data archive because of corrupted mean values. Possible ideas to stimulate discussion: 1. Always use float64 to accumulate float types that are 64 bits or less. Are there serious performance impacts to automatically using float64 to accumulate float32 arrays? I appreciate this would likely introduce unwanted regressions (sometimes suddenly getting the right answer is a bad thing). So could this be considered for numpy 2.0? 2. 
Might there be a way to emit a warning if the number of values and the max accumulated value [3] are such that the estimated fractional error is above some tolerance? I'm not even sure if this is a good idea or if there will be howls from the community as their codes start warning about inaccurate mean values. Better idea along this line?? Cheers, Tom [1]: http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059960.html [2]: http://docs.scipy.org/doc/numpy/reference/generated/numpy.mean.html [3]: Using the max accumulated value during accumulation instead of the final accumulated value seems like the right thing for estimating precision loss. But this would affect performance so maybe just using the final value would catch many cases. From charlesr.harris at gmail.com Fri Jul 27 00:15:14 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 26 Jul 2012 22:15:14 -0600 Subject: [Numpy-discussion] Bug in numpy.mean() revisited In-Reply-To: References: Message-ID: On Thu, Jul 26, 2012 at 9:26 PM, Tom Aldcroft wrote: > There was a thread in January discussing the non-obvious behavior of > numpy.mean() for large arrays of float32 values [1]. This issue is > nicely discussed at the end of the numpy.mean() documentation [2] with > an example: > > >>> a = np.zeros((2, 512*512), dtype=np.float32) > >>> a[0, :] = 1.0 > >>> a[1, :] = 0.1 > >>> np.mean(a) > 0.546875 > > >From the docs and previous discussion it seems there is no technical > difficulty in choosing a different (higher precision) type for the > accumulator using the dtype arg, and in fact this is done > automatically for int values. > > My question is whether there would be any support for doing something > more than documenting this behavior. I suspect very few people ever > make it below the fold for the np.mean() documentation. Taking the > mean of large arrays of float32 values is a *very* common use case and > giving the wrong answer with default inputs is really disturbing. I > recently had to rebuild a complex science data archive because of > corrupted mean values. > > Possible ideas to stimulate discussion: > 1. Always use float64 to accumulate float types that are 64 bits or > less. Are there serious performance impacts to automatically using > float64 to accumulate float32 arrays? I appreciate this would likely > introduce unwanted regressions (sometimes suddenly getting the right > answer is a bad thing). So could this be considered for numpy 2.0? > > 2. Might there be a way to emit a warning if the number of values and > the max accumulated value [3] are such that the estimated fractional > error is above some tolerance? I'm not even sure if this is a good > idea or if there will be howls from the community as their codes start > warning about inaccurate mean values. Better idea along this line?? > > I would support accumulating in 64 bits but, IIRC, the function will need to be rewritten so that it works by adding 32 bit floats to the accumulator to save space. There are also more stable methods that could also be investigated. There is a nice little project there for someone to cut their teeth on. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Fri Jul 27 02:30:36 2012 From: travis at continuum.io (Travis Oliphant) Date: Fri, 27 Jul 2012 01:30:36 -0500 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 Message-ID: Hey all, I'm wondering who has tried to make NumPy work with Python 3.3. 
The Unicode handling was significantly improved in Python 3.3 and the array-scalar code (which assumed a certain structure for UnicodeObjects) is not working now. It would be nice to get 1.7.0 working with Python 3.3 if possible before the release. Anyone interested in tackling that little challenge? If someone has already tried it would be nice to hear your experience. -Travis From heng at cantab.net Fri Jul 27 03:22:46 2012 From: heng at cantab.net (Henry Gomersall) Date: Fri, 27 Jul 2012 08:22:46 +0100 Subject: [Numpy-discussion] Bug in numpy.mean() revisited In-Reply-To: References: Message-ID: <1343373766.29750.14.camel@farnsworth> On Thu, 2012-07-26 at 22:15 -0600, Charles R Harris wrote: > I would support accumulating in 64 bits but, IIRC, the function will > need to be rewritten so that it works by adding 32 bit floats to the > accumulator to save space. There are also more stable methods that > could also be investigated. There is a nice little project there for > someone to cut their teeth on. So a (very) quick read around suggests that using an interim mean gives a more robust algorithm. The problem being, that these techniques are either multi-pass, or inherently slower (due to say a division in the loop). Higher precision would not suffer the same potential slow down and would solve most cases of this problem. Henry From cournape at gmail.com Fri Jul 27 04:28:38 2012 From: cournape at gmail.com (David Cournapeau) Date: Fri, 27 Jul 2012 09:28:38 +0100 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: References: Message-ID: On Fri, Jul 27, 2012 at 7:30 AM, Travis Oliphant wrote: > Hey all, > > I'm wondering who has tried to make NumPy work with Python 3.3. The Unicode handling was significantly improved in Python 3.3 and the array-scalar code (which assumed a certain structure for UnicodeObjects) is not working now. > > It would be nice to get 1.7.0 working with Python 3.3 if possible before the release. Anyone interested in tackling that little challenge? If someone has already tried it would be nice to hear your experience. Given that we're late with 1.7, I would suggest passing this to the next release, unless the fix is simple (just a change of API). cheers, David From cournape at gmail.com Fri Jul 27 04:43:01 2012 From: cournape at gmail.com (David Cournapeau) Date: Fri, 27 Jul 2012 09:43:01 +0100 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: References: Message-ID: On Fri, Jul 27, 2012 at 9:28 AM, David Cournapeau wrote: > On Fri, Jul 27, 2012 at 7:30 AM, Travis Oliphant wrote: >> Hey all, >> >> I'm wondering who has tried to make NumPy work with Python 3.3. The Unicode handling was significantly improved in Python 3.3 and the array-scalar code (which assumed a certain structure for UnicodeObjects) is not working now. >> >> It would be nice to get 1.7.0 working with Python 3.3 if possible before the release. Anyone interested in tackling that little challenge? If someone has already tried it would be nice to hear your experience. > > Given that we're late with 1.7, I would suggest passing this to the > next release, unless the fix is simple (just a change of API). I took a brief look at it, and from the errors I have seen, one is cosmetic, the other one is a bit more involved (rewriting PyArray_Scalar unicode support). While it is not difficult in nature, the current code has multiple #ifdef of Py_UNICODE_WIDE, meaning it would require multiple configurations on multiple python versions to be tested. 
I don't think python 3.3 support is critical - people who want to play with bet interpreters can build numpy by themselves from master, so I am -1 on integrating this into 1.7. I may have a fix within tonight for it, though, David From njs at pobox.com Fri Jul 27 06:24:32 2012 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 27 Jul 2012 11:24:32 +0100 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: References: Message-ID: On Fri, Jul 27, 2012 at 9:28 AM, David Cournapeau wrote: > On Fri, Jul 27, 2012 at 7:30 AM, Travis Oliphant wrote: >> Hey all, >> >> I'm wondering who has tried to make NumPy work with Python 3.3. The Unicode handling was significantly improved in Python 3.3 and the array-scalar code (which assumed a certain structure for UnicodeObjects) is not working now. >> >> It would be nice to get 1.7.0 working with Python 3.3 if possible before the release. Anyone interested in tackling that little challenge? If someone has already tried it would be nice to hear your experience. > > Given that we're late with 1.7, I would suggest passing this to the > next release, unless the fix is simple (just a change of API). IMO, it's not a regression so it's not a release blocker. Of course we should release the fix whenever it's ready (in 1.7 if it's ready by then, else in 1.7.1), but we shouldn't hold up the release for it. -n From njs at pobox.com Fri Jul 27 06:53:03 2012 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 27 Jul 2012 11:53:03 +0100 Subject: [Numpy-discussion] Bug in numpy.mean() revisited In-Reply-To: References: Message-ID: On Fri, Jul 27, 2012 at 5:15 AM, Charles R Harris wrote: > I would support accumulating in 64 bits but, IIRC, the function will need to > be rewritten so that it works by adding 32 bit floats to the accumulator to > save space. There are also more stable methods that could also be > investigated. There is a nice little project there for someone to cut their > teeth on. So the obvious solution here would be to make the ufunc reduce loop smart enough that x = np.zeros(2 ** 30, dtype=float32) np.sum(x, dtype=float64) does not upcast 'x' to float64's as a whole. This shouldn't be too terrible to implement -- iterate over the float32 array, and only upcast each inner-loop "buffer" as you go, instead of upcasting the whole thing. In fact, nditer might do this already? Then using a wide accumulator by default would just take a few lines of code in numpy.core._methods._mean to select the proper dtype and downcast the result. -n From cwg at falma.de Fri Jul 27 08:04:23 2012 From: cwg at falma.de (Christoph Groth) Date: Fri, 27 Jul 2012 14:04:23 +0200 Subject: [Numpy-discussion] numpy.dot: why second-to-last index Message-ID: <87zk6lcsvs.fsf@falma.de> Dear numpy historians, When multiplying two arrays with numpy.dot, the summation is made over the last index of the first argument, and over the *second-to-last* index of the second argument. I wonder why the convention has been chosen like this? The only reason I can think of is that this allows to use GEMM as a building block also for the >2d case. Is this the motivation? However, the actual implementation of numpy.dot uses GEMM only in the 2d x 2d case... Summation over the last index of the first argument and the first index of the second would seem a more obvious choice. 
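Concretely, the convention being asked about (this only demonstrates the documented behaviour of numpy.dot for >2d arrays; it is not an answer to the historical question):

    import numpy as np

    a = np.ones((2, 3, 4))
    b = np.ones((5, 4, 6))

    # dot(a, b)[i, j, k, m] = sum over l of a[i, j, l] * b[k, l, m],
    # i.e. the last axis of `a` is contracted with the second-to-last axis of `b`.
    c = np.dot(a, b)
    print(c.shape)                        # (2, 3, 5, 6)

    # The same contraction written explicitly with einsum:
    d = np.einsum('ijl,klm->ijkm', a, b)
    print(np.allclose(c, d))              # True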
From ben.root at ou.edu Fri Jul 27 09:27:48 2012 From: ben.root at ou.edu (Benjamin Root) Date: Fri, 27 Jul 2012 09:27:48 -0400 Subject: [Numpy-discussion] Synonym standards In-Reply-To: References: <5011AC77.3070406@ncf.ca> <5011CD44.2050709@gmail.com> Message-ID: On Thu, Jul 26, 2012 at 7:12 PM, Robert Kern wrote: > On Fri, Jul 27, 2012 at 12:05 AM, Colin J. Williams > wrote: > > On 26/07/2012 4:57 PM, Benjamin Root wrote: > > > > > > On Thu, Jul 26, 2012 at 4:45 PM, Colin J. Williams wrote: > >> > >> It seems that these standards have been adopted, which is good: > >> > >> The following import conventions are used throughout the NumPy source > and > >> documentation: > >> > >> import numpy as np > >> import matplotlib as mpl > >> import matplotlib.pyplot as plt > >> > >> Source: > >> https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt > >> > >> Is there some similar standard for PyLab? > >> > >> Thanks, > >> > >> Colin W. > >> > > > > > > Colin, > > > > Typically, with pylab mode of matplotlib, you do: > > > > from pylab import * > > > > This is essentially equivalent to: > > > > from numpy import * > > from matplotlib.pyplot import * > > > > Note that the pylab "module" is actually a part of matplotlib and is a > > shortcut to provide an environment that is very familiar to Matlab users. > > Converts are then encouraged to use the imports you mentioned in order to > > properly utilize python namespaces. > > > > I hope that helps! > > Ben Root > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > Thanks Ben, > > > > I would prefer not to use: from xxx import *, > > > > because of the name pollution. > > > > The name convention that I copied above facilitates avoiding the > pollution. > > > > In the same spirit, I've used: > > import pylab as plb > > But in that same spirit, using np and plt separately is preferred. > > "Namespaces are one honking great idea -- let's do more of those!" from http://www.python.org/dev/peps/pep-0020/ Absolutely correct. The namespace pollution is exactly why we encourage converts to move over from the pylab mode to separating out the numpy and pyplot namespaces. There are very subtle issues that arise when doing "from pylab import *" such as overriding the built-in "any" and "all". The only real advantage of the pylab mode over separating out numpy and pyplot is conciseness, which many matlab users expect at first. Cheers! Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Fri Jul 27 09:47:24 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 27 Jul 2012 15:47:24 +0200 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: References: Message-ID: On Fri, Jul 27, 2012 at 10:43 AM, David Cournapeau wrote: > On Fri, Jul 27, 2012 at 9:28 AM, David Cournapeau > wrote: > > On Fri, Jul 27, 2012 at 7:30 AM, Travis Oliphant > wrote: > >> Hey all, > >> > >> I'm wondering who has tried to make NumPy work with Python 3.3. The > Unicode handling was significantly improved in Python 3.3 and the > array-scalar code (which assumed a certain structure for UnicodeObjects) is > not working now. > >> > >> It would be nice to get 1.7.0 working with Python 3.3 if possible > before the release. Anyone interested in tackling that little > challenge? 
If someone has already tried it would be nice to hear your > experience. > > > > Given that we're late with 1.7, I would suggest passing this to the > > next release, unless the fix is simple (just a change of API). > > I took a brief look at it, and from the errors I have seen, one is > cosmetic, the other one is a bit more involved (rewriting > PyArray_Scalar unicode support). While it is not difficult in nature, > the current code has multiple #ifdef of Py_UNICODE_WIDE, meaning it > would require multiple configurations on multiple python versions to > be tested. > > I don't think python 3.3 support is critical - people who want to play > with bet interpreters can build numpy by themselves from master, so I > am -1 on integrating this into 1.7. > > I may have a fix within tonight for it, though, > There are 2 tickets about this: http://projects.scipy.org/numpy/ticket/2145 http://projects.scipy.org/numpy/ticket/1471 Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From derek at astro.physik.uni-goettingen.de Fri Jul 27 11:39:58 2012 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Fri, 27 Jul 2012 17:39:58 +0200 Subject: [Numpy-discussion] Synonym standards In-Reply-To: References: <5011AC77.3070406@ncf.ca> <5011CD44.2050709@gmail.com> Message-ID: On 27.07.2012, at 3:27PM, Benjamin Root wrote: > > I would prefer not to use: from xxx import *, > > > > because of the name pollution. > > > > The name convention that I copied above facilitates avoiding the pollution. > > > > In the same spirit, I've used: > > import pylab as plb > > But in that same spirit, using np and plt separately is preferred. > > > "Namespaces are one honking great idea -- let's do more of those!" > from http://www.python.org/dev/peps/pep-0020/ > > Absolutely correct. The namespace pollution is exactly why we encourage converts to move over from the pylab mode to separating out the numpy and pyplot namespaces. There are very subtle issues that arise when doing "from pylab import *" such as overriding the built-in "any" and "all". The only real advantage of the pylab mode over separating out numpy and pyplot is conciseness, which many matlab users expect at first. It unfortunately also comes with the convenience of using the "ipython --pylab" mode - does anyone know how to turn the "import *" part of, or how to create a similar working environment with ipython that does keep namespaces clean? Cheers, Derek From tsyu80 at gmail.com Fri Jul 27 11:58:26 2012 From: tsyu80 at gmail.com (Tony Yu) Date: Fri, 27 Jul 2012 11:58:26 -0400 Subject: [Numpy-discussion] Synonym standards In-Reply-To: References: <5011AC77.3070406@ncf.ca> <5011CD44.2050709@gmail.com> Message-ID: On Fri, Jul 27, 2012 at 11:39 AM, Derek Homeier < derek at astro.physik.uni-goettingen.de> wrote: > On 27.07.2012, at 3:27PM, Benjamin Root wrote: > > > > I would prefer not to use: from xxx import *, > > > > > > because of the name pollution. > > > > > > The name convention that I copied above facilitates avoiding the > pollution. > > > > > > In the same spirit, I've used: > > > import pylab as plb > > > > But in that same spirit, using np and plt separately is preferred. > > > > > > "Namespaces are one honking great idea -- let's do more of those!" > > from http://www.python.org/dev/peps/pep-0020/ > > > > Absolutely correct. The namespace pollution is exactly why we encourage > converts to move over from the pylab mode to separating out the numpy and > pyplot namespaces. 
There are very subtle issues that arise when doing > "from pylab import *" such as overriding the built-in "any" and "all". The > only real advantage of the pylab mode over separating out numpy and pyplot > is conciseness, which many matlab users expect at first. > > It unfortunately also comes with the convenience of using the "ipython > --pylab" mode - > does anyone know how to turn the "import *" part of, or how to create a > similar working > environment with ipython that does keep namespaces clean? > > Cheers, > Derek > There's a config flag that you can add to your ipython profile: c.TerminalIPythonApp.pylab_import_all = False For example, my profile is in ~/.ipython/profile_default/ipython_config.py Cheers, -Tony -------------- next part -------------- An HTML attachment was scrubbed... URL: From derek at astro.physik.uni-goettingen.de Fri Jul 27 12:43:14 2012 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Fri, 27 Jul 2012 18:43:14 +0200 Subject: [Numpy-discussion] Synonym standards In-Reply-To: References: <5011AC77.3070406@ncf.ca> <5011CD44.2050709@gmail.com> Message-ID: <0063D7BC-EE51-4C0F-B8D5-9B3BEE752ADF@astro.physik.uni-goettingen.de> On 27 Jul 2012, at 17:58, Tony Yu wrote: > On Fri, Jul 27, 2012 at 11:39 AM, Derek Homeier wrote: > On 27.07.2012, at 3:27PM, Benjamin Root wrote: > > > > I would prefer not to use: from xxx import *, > > > > > > because of the name pollution. > > > > > > The name convention that I copied above facilitates avoiding the pollution. > > > > > > In the same spirit, I've used: > > > import pylab as plb > > > > But in that same spirit, using np and plt separately is preferred. > > > > > > "Namespaces are one honking great idea -- let's do more of those!" > > from http://www.python.org/dev/peps/pep-0020/ > > > > Absolutely correct. The namespace pollution is exactly why we encourage converts to move over from the pylab mode to separating out the numpy and pyplot namespaces. There are very subtle issues that arise when doing "from pylab import *" such as overriding the built-in "any" and "all". The only real advantage of the pylab mode over separating out numpy and pyplot is conciseness, which many matlab users expect at first. > > It unfortunately also comes with the convenience of using the "ipython --pylab" mode - > does anyone know how to turn the "import *" part of, or how to create a similar working > environment with ipython that does keep namespaces clean? > > Cheers, > Derek > > > There's a config flag that you can add to your ipython profile: > > c.TerminalIPythonApp.pylab_import_all = False > > For example, my profile is in ~/.ipython/profile_default/ipython_config.py > thanks, that was exactly what I was looking for - together with c.TerminalIPythonApp.exec_lines = ['import sys', 'import numpy as np', 'import matplotlib as mpl', 'import matplotlib.pyplot as plt'] etc. to have the shortcuts. Cheers, Derek From ben.root at ou.edu Fri Jul 27 14:01:40 2012 From: ben.root at ou.edu (Benjamin Root) Date: Fri, 27 Jul 2012 14:01:40 -0400 Subject: [Numpy-discussion] bug in numpy.where? 
In-Reply-To: <50118D89.3090001@stsci.edu> References: <50118D89.3090001@stsci.edu> Message-ID: On Thu, Jul 26, 2012 at 2:33 PM, Phil Hodge wrote: > On a Linux machine: > > > uname -srvop > Linux 2.6.18-308.8.2.el5 #1 SMP Tue May 29 11:54:17 EDT 2012 x86_64 > GNU/Linux > > this example shows an apparent problem with the where function: > > Python 2.7.1 (r271:86832, Dec 21 2010, 11:19:43) > [GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import numpy as np > >>> print np.__version__ > 1.5.1 > >>> net = np.zeros(3, dtype='>f4') > >>> net[1] = 0.00458849 > >>> net[2] = 0.605202 > >>> max_net = net.max() > >>> test = np.where(net <= 0., max_net, net) > >>> print test > [ -2.23910537e-35 4.58848989e-03 6.05202019e-01] > > When I specified the dtype for net as '>f8', test[0] was > 3.46244974e+68. It worked as expected (i.e. test[0] should be 0.605202) > when I specified float(max_net) as the second argument to np.where. > > Phil > Confirmed with version 1.7.0.dev-470c857 on a CentOS6 64-bit machine. Strange indeed. Breaking it down further: >>> res = (net <= 0.) >>> print res [ True False False] >>> np.where(res, max_net, net) array([ -2.23910537e-35, 4.58848989e-03, 6.05202019e-01], dtype=float32) Very Strange... Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From chanley at gmail.com Fri Jul 27 14:07:27 2012 From: chanley at gmail.com (Christopher Hanley) Date: Fri, 27 Jul 2012 14:07:27 -0400 Subject: [Numpy-discussion] bug in numpy.where? In-Reply-To: References: <50118D89.3090001@stsci.edu> Message-ID: On Fri, Jul 27, 2012 at 2:01 PM, Benjamin Root wrote: > > > On Thu, Jul 26, 2012 at 2:33 PM, Phil Hodge wrote: > >> On a Linux machine: >> >> > uname -srvop >> Linux 2.6.18-308.8.2.el5 #1 SMP Tue May 29 11:54:17 EDT 2012 x86_64 >> GNU/Linux >> >> this example shows an apparent problem with the where function: >> >> Python 2.7.1 (r271:86832, Dec 21 2010, 11:19:43) >> [GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2 >> Type "help", "copyright", "credits" or "license" for more information. >> >>> import numpy as np >> >>> print np.__version__ >> 1.5.1 >> >>> net = np.zeros(3, dtype='>f4') >> >>> net[1] = 0.00458849 >> >>> net[2] = 0.605202 >> >>> max_net = net.max() >> >>> test = np.where(net <= 0., max_net, net) >> >>> print test >> [ -2.23910537e-35 4.58848989e-03 6.05202019e-01] >> >> When I specified the dtype for net as '>f8', test[0] was >> 3.46244974e+68. It worked as expected (i.e. test[0] should be 0.605202) >> when I specified float(max_net) as the second argument to np.where. >> >> Phil >> > > Confirmed with version 1.7.0.dev-470c857 on a CentOS6 64-bit machine. > Strange indeed. > > Breaking it down further: > > >>> res = (net <= 0.) > >>> print res > [ True False False] > >>> np.where(res, max_net, net) > array([ -2.23910537e-35, 4.58848989e-03, 6.05202019e-01], > dtype=float32) > > Very Strange... > > Ben Root > What if find really interesting is that -2.23910537e-35 is the byte swapped version of 6.05202019e-01. Chris -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fperez.net at gmail.com Fri Jul 27 14:30:36 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Fri, 27 Jul 2012 11:30:36 -0700 Subject: [Numpy-discussion] Synonym standards In-Reply-To: <0063D7BC-EE51-4C0F-B8D5-9B3BEE752ADF@astro.physik.uni-goettingen.de> References: <5011AC77.3070406@ncf.ca> <5011CD44.2050709@gmail.com> <0063D7BC-EE51-4C0F-B8D5-9B3BEE752ADF@astro.physik.uni-goettingen.de> Message-ID: On Fri, Jul 27, 2012 at 9:43 AM, Derek Homeier wrote: > thanks, that was exactly what I was looking for - together with > > c.TerminalIPythonApp.exec_lines = ['import sys', > 'import numpy as np', > 'import matplotlib as mpl', > 'import matplotlib.pyplot as plt'] Note that if you do this only and don't use %pylab interactively or the --pylab flag, then you will *not* get the proper non-blocking control of the matplotlib event loop integrated with the terminal or qtconsole. In summary, following Tony's suggestion is enough to give you: - event loop integration when you do --pylab at the prompt or %pylab in ipython. - the np, mpl and plt shortcuts - no 'import *' at all. So that should be sufficient, but you should still use --pylab or %pylab to indicate to IPython that you want the mpl event loops to work in conjunction with the shell. Cheers, f From derek at astro.physik.uni-goettingen.de Fri Jul 27 15:49:33 2012 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Fri, 27 Jul 2012 21:49:33 +0200 Subject: [Numpy-discussion] Synonym standards In-Reply-To: References: <5011AC77.3070406@ncf.ca> <5011CD44.2050709@gmail.com> <0063D7BC-EE51-4C0F-B8D5-9B3BEE752ADF@astro.physik.uni-goettingen.de> Message-ID: On 27.07.2012, at 8:30PM, Fernando Perez wrote: > On Fri, Jul 27, 2012 at 9:43 AM, Derek Homeier > wrote: >> thanks, that was exactly what I was looking for - together with >> >> c.TerminalIPythonApp.exec_lines = ['import sys', >> 'import numpy as np', >> 'import matplotlib as mpl', >> 'import matplotlib.pyplot as plt'] > > Note that if you do this only and don't use %pylab interactively or > the --pylab flag, then you will *not* get the proper non-blocking > control of the matplotlib event loop integrated with the terminal or > qtconsole. > > In summary, following Tony's suggestion is enough to give you: > > - event loop integration when you do --pylab at the prompt or %pylab in ipython. > - the np, mpl and plt shortcuts > - no 'import *' at all. > > So that should be sufficient, but you should still use --pylab or > %pylab to indicate to IPython that you want the mpl event loops to > work in conjunction with the shell. Yes, I was aware of that, without the pylab option at least with the macosx backend windows either would not draw and refresh properly, or block the shell after a draw() or show(); that's why I was asking how to avoid the 'import *' with it. I have not used the %pylab builtin before, though. Cheers, Derek From amueller at ais.uni-bonn.de Fri Jul 27 15:58:57 2012 From: amueller at ais.uni-bonn.de (Andreas Mueller) Date: Fri, 27 Jul 2012 20:58:57 +0100 Subject: [Numpy-discussion] bug in numpy.where? In-Reply-To: References: <50118D89.3090001@stsci.edu> Message-ID: <5012F301.5020309@ais.uni-bonn.de> Hi Everybody. The bug is that no error is raised, right? The docs say where(condition, [x, y]) x, y : array_like, optional Values from which to choose. `x` and `y` need to have the same shape as `condition` In the example you gave, x was a scalar. 
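For what it's worth, with a native-byte-order array the scalar form seems to behave as expected (a quick sketch, so this alone does not settle whether the docstring or the code is at fault):

>>> import numpy as np
>>> net = np.zeros(3, dtype=np.float32)   # native byte order, unlike the '>f4' in Phil's report
>>> net[1] = 0.00458849
>>> net[2] = 0.605202
>>> test = np.where(net <= 0., net.max(), net)
>>> test[0] == net.max()   # the scalar is broadcast where the condition is True
True
>>> np.all(test[1:] == net[1:])   # the other elements pass through unchanged
True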
Cheers, Andy From ben.root at ou.edu Fri Jul 27 16:10:31 2012 From: ben.root at ou.edu (Benjamin Root) Date: Fri, 27 Jul 2012 16:10:31 -0400 Subject: [Numpy-discussion] bug in numpy.where? In-Reply-To: <5012F301.5020309@ais.uni-bonn.de> References: <50118D89.3090001@stsci.edu> <5012F301.5020309@ais.uni-bonn.de> Message-ID: On Fri, Jul 27, 2012 at 3:58 PM, Andreas Mueller wrote: > Hi Everybody. > The bug is that no error is raised, right? > The docs say > > where(condition, [x, y]) > > x, y : array_like, optional > Values from which to choose. `x` and `y` need to have the same > shape as `condition` > > In the example you gave, x was a scalar. > > Cheers, > Andy > Hmm, that is incorrect, I believe. I have used a scalar before. Maybe it works because a scalar is broadcastable to the same shape as any other N-dim array? If so, then the wording of that docstring needs to be fixed. No, I think Christopher hit it on the head. For whatever reason, the endian-ness somewhere is not being respected and causes a byte-swapped version to show up. How that happens, though, is beyond me. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From amueller at ais.uni-bonn.de Fri Jul 27 16:21:08 2012 From: amueller at ais.uni-bonn.de (Andreas Mueller) Date: Fri, 27 Jul 2012 21:21:08 +0100 Subject: [Numpy-discussion] bug in numpy.where? In-Reply-To: References: <50118D89.3090001@stsci.edu> <5012F301.5020309@ais.uni-bonn.de> Message-ID: <5012F834.9020205@ais.uni-bonn.de> On 07/27/2012 09:10 PM, Benjamin Root wrote: > > > On Fri, Jul 27, 2012 at 3:58 PM, Andreas Mueller > > wrote: > > Hi Everybody. > The bug is that no error is raised, right? > The docs say > > where(condition, [x, y]) > > x, y : array_like, optional > Values from which to choose. `x` and `y` need to have the same > shape as `condition` > > In the example you gave, x was a scalar. > > Cheers, > Andy > > > Hmm, that is incorrect, I believe. I have used a scalar before. > Maybe it works because a scalar is broadcastable to the same shape as > any other N-dim array? > > If so, then the wording of that docstring needs to be fixed. > > No, I think Christopher hit it on the head. For whatever reason, the > endian-ness somewhere is not being respected and causes a byte-swapped > version to show up. How that happens, though, is beyond me. Well, if you use np.repeat(max_net, 3) instead of max_net, it works as expected. So if you use the function as documented, it does the right thing. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chanley at gmail.com Fri Jul 27 16:36:02 2012 From: chanley at gmail.com (Christopher Hanley) Date: Fri, 27 Jul 2012 16:36:02 -0400 Subject: [Numpy-discussion] bug in numpy.where? In-Reply-To: References: <50118D89.3090001@stsci.edu> <5012F301.5020309@ais.uni-bonn.de> Message-ID: On Fri, Jul 27, 2012 at 4:10 PM, Benjamin Root wrote: > > > On Fri, Jul 27, 2012 at 3:58 PM, Andreas Mueller > wrote: > >> Hi Everybody. >> The bug is that no error is raised, right? >> The docs say >> >> where(condition, [x, y]) >> >> x, y : array_like, optional >> Values from which to choose. `x` and `y` need to have the same >> shape as `condition` >> >> In the example you gave, x was a scalar. >> >> Cheers, >> Andy >> > > Hmm, that is incorrect, I believe. I have used a scalar before. Maybe it > works because a scalar is broadcastable to the same shape as any other > N-dim array? > > If so, then the wording of that docstring needs to be fixed. 
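(As a quick sanity check of the byte-swap point from my earlier mail -- just a sketch, reusing the values printed in Phil's session:)

>>> import numpy as np
>>> good = np.float32(6.05202019e-01)
>>> bad = np.float32(-2.23910537e-35)
>>> good.byteswap() == bad   # the bogus value is exactly the byte-swapped good one
True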
> > No, I think Christopher hit it on the head. For whatever reason, the > endian-ness somewhere is not being respected and causes a byte-swapped > version to show up. How that happens, though, is beyond me. > > Ben Root > > > It may have something to do with the dtype size as well. The problem seen with, net = np.zeros(3, dtype='>f4') Disappears for net = np.zeros(3, dtype='>f8') and above. Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej.certik at gmail.com Fri Jul 27 17:00:41 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Fri, 27 Jul 2012 14:00:41 -0700 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: References: Message-ID: On Fri, Jul 27, 2012 at 6:47 AM, Ralf Gommers wrote: > > > On Fri, Jul 27, 2012 at 10:43 AM, David Cournapeau > wrote: >> >> On Fri, Jul 27, 2012 at 9:28 AM, David Cournapeau >> wrote: >> > On Fri, Jul 27, 2012 at 7:30 AM, Travis Oliphant >> > wrote: >> >> Hey all, >> >> >> >> I'm wondering who has tried to make NumPy work with Python 3.3. The >> >> Unicode handling was significantly improved in Python 3.3 and the >> >> array-scalar code (which assumed a certain structure for UnicodeObjects) is >> >> not working now. >> >> >> >> It would be nice to get 1.7.0 working with Python 3.3 if possible >> >> before the release. Anyone interested in tackling that little challenge? >> >> If someone has already tried it would be nice to hear your experience. >> > >> > Given that we're late with 1.7, I would suggest passing this to the >> > next release, unless the fix is simple (just a change of API). >> >> I took a brief look at it, and from the errors I have seen, one is >> cosmetic, the other one is a bit more involved (rewriting >> PyArray_Scalar unicode support). While it is not difficult in nature, >> the current code has multiple #ifdef of Py_UNICODE_WIDE, meaning it >> would require multiple configurations on multiple python versions to >> be tested. >> >> I don't think python 3.3 support is critical - people who want to play >> with bet interpreters can build numpy by themselves from master, so I >> am -1 on integrating this into 1.7. >> >> I may have a fix within tonight for it, though, > > > There are 2 tickets about this: > http://projects.scipy.org/numpy/ticket/2145 > http://projects.scipy.org/numpy/ticket/1471 I am currently working on a PR trying to fix the unicode failures: https://github.com/numpy/numpy/pull/366 It's a work in progress, I am still have some little issues, see the PR for up-to-date details. Ondrej From stefan-usenet at bytereef.org Sat Jul 28 05:36:22 2012 From: stefan-usenet at bytereef.org (Stefan Krah) Date: Sat, 28 Jul 2012 11:36:22 +0200 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: References: Message-ID: <20120728093622.GA27387@sleipnir.bytereef.org> Ond??ej ??ert??k wrote: > >> I took a brief look at it, and from the errors I have seen, one is > >> cosmetic, the other one is a bit more involved (rewriting > >> PyArray_Scalar unicode support). While it is not difficult in nature, > >> the current code has multiple #ifdef of Py_UNICODE_WIDE, meaning it > >> would require multiple configurations on multiple python versions to > >> be tested. The cleanest way might be to leave the existing code in place and write completely new and independent code for Python 3.3. > https://github.com/numpy/numpy/pull/366 > > It's a work in progress, I am still have some little issues, see the > PR for up-to-date details. 
I'm not a Unicode expert, but I think it's best to avoid Py_UNICODE altogether. What should matter in 3.3 is the maximum character in a Unicode string that determines the kind of the string: PyUnicode_1BYTE_KIND -> Py_UCS1 PyUnicode_2BYTE_KIND -> Py_UCS2 PyUnicode_4BYTE_KIND -> Py_UCS4 So Py_UNICODE_WIDE should not matter as all builds support PyUnicode_4BYTE_KIND. That's why I /think/ it's possible to drop Py_UNICODE altogether. For instance, the line in https://github.com/certik/numpy/commit/d02e36e5c85d5ee444614254643037aafc8deccc should probably be: itemsize = PyUnicode_GetLength(robj) * PyUnicode_KIND(robj) Stefan Krah From ondrej.certik at gmail.com Sat Jul 28 10:58:57 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Sat, 28 Jul 2012 07:58:57 -0700 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: <20120728093622.GA27387@sleipnir.bytereef.org> References: <20120728093622.GA27387@sleipnir.bytereef.org> Message-ID: Stefan, On Sat, Jul 28, 2012 at 2:36 AM, Stefan Krah wrote: > Ond??ej ??ert??k wrote: >> >> I took a brief look at it, and from the errors I have seen, one is >> >> cosmetic, the other one is a bit more involved (rewriting >> >> PyArray_Scalar unicode support). While it is not difficult in nature, >> >> the current code has multiple #ifdef of Py_UNICODE_WIDE, meaning it >> >> would require multiple configurations on multiple python versions to >> >> be tested. > > The cleanest way might be to leave the existing code in place and write > completely new and independent code for Python 3.3. > > >> https://github.com/numpy/numpy/pull/366 >> >> It's a work in progress, I am still have some little issues, see the >> PR for up-to-date details. > > I'm not a Unicode expert, but I think it's best to avoid Py_UNICODE altogether. I think so too. > > What should matter in 3.3 is the maximum character in a Unicode string that > determines the kind of the string: > > PyUnicode_1BYTE_KIND -> Py_UCS1 > PyUnicode_2BYTE_KIND -> Py_UCS2 > PyUnicode_4BYTE_KIND -> Py_UCS4 > > > So Py_UNICODE_WIDE should not matter as all builds support PyUnicode_4BYTE_KIND. > That's why I /think/ it's possible to drop Py_UNICODE altogether. For instance, > the line in https://github.com/certik/numpy/commit/d02e36e5c85d5ee444614254643037aafc8deccc > should probably be: > > itemsize = PyUnicode_GetLength(robj) * PyUnicode_KIND(robj) Yes, I think that's it. I've changed it and pushed in the change into the PR. 
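(For anyone following along without the CPython docs at hand, the PEP 393 calls being discussed look roughly like this -- a sketch only, with made-up helper names, not the exact code that went into the PR:)

#include <Python.h>

/* itemsize of a (ready) str object: code points times bytes per code point (1, 2 or 4) */
static Py_ssize_t
str_itemsize(PyObject *robj)
{
    return PyUnicode_GetLength(robj) * PyUnicode_KIND(robj);
}

/* building a str from raw UCS4 data; CPython picks the narrowest internal kind itself */
static PyObject *
str_from_ucs4(const Py_UCS4 *data, Py_ssize_t n_codepoints)
{
    return PyUnicode_FromKindAndData(PyUnicode_4BYTE_KIND, data, n_codepoints);
}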
I am now seeing failures like these: ====================================================================== ERROR: test_rmul (test_defchararray.TestOperations) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/ondrej/py33/lib/python3.3/site-packages/numpy/core/tests/test_defchararray.py", line 592, in test_rmul Ar = np.array([[A[0,0]*r, A[0,1]*r], File "/home/ondrej/py33/lib/python3.3/site-packages/numpy/core/defchararray.py", line 1916, in __getitem__ if issubclass(val.dtype.type, character) and not _len(val) == 0: AttributeError: 'str' object has no attribute 'dtype' Here is the code in defchararray.py: 1911 if not _globalvar and self.dtype.char not in 'SUbc': 1912 raise ValueError("Can only create a chararray from string data.") 1913 1914 def __getitem__(self, obj): 1915 val = ndarray.__getitem__(self, obj) 1916 -> if issubclass(val.dtype.type, character) and not _len(val) == 0: 1917 temp = val.rstrip() 1918 if _len(temp) == 0: 1919 val = '' 1920 else: 1921 val = temp and here is some debugging info: (Pdb) p self (Pdb) p obj (0, 0) (Pdb) p val 'abc' (Pdb) p type(val) So "val" is a Python string, which of course doesn't have .dtype. What I don't understand yet is why val = ndarray.__getitem__(self, obj) returns a Python string. I've been debugging it for a few hours yesterday, but so far no luck. Then there are failures in the test_unicode.py of the following type: ====================================================================== FAIL: Check byteorder of single-dimensional objects ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/ondrej/py33/lib/python3.3/site-packages/numpy/core/tests/test_unicode.py", line 286, in test_valuesSD self.assertTrue(ua[0] != ua2[0]) AssertionError: False is not true I didn't dig into those yet. If anyone has any ideas, let me know. Ondrej From ondrej.certik at gmail.com Sat Jul 28 11:04:57 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Sat, 28 Jul 2012 08:04:57 -0700 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: References: <20120728093622.GA27387@sleipnir.bytereef.org> Message-ID: On Sat, Jul 28, 2012 at 7:58 AM, Ond?ej ?ert?k wrote: [...] > Here is the code in defchararray.py: > > > 1911 if not _globalvar and self.dtype.char not in 'SUbc': > 1912 raise ValueError("Can only create a chararray from > string data.") > 1913 > 1914 def __getitem__(self, obj): > 1915 val = ndarray.__getitem__(self, obj) > 1916 -> if issubclass(val.dtype.type, character) and not _len(val) == 0: > 1917 temp = val.rstrip() > 1918 if _len(temp) == 0: > 1919 val = '' > 1920 else: > 1921 val = temp > > > and here is some debugging info: > Python 3.3: > > (Pdb) p self > (Pdb) p obj > (0, 0) > (Pdb) p val > 'abc' > (Pdb) p type(val) > Python 3.2: (Pdb) p self chararray([['abc', '123'], ['789', 'xyz']], dtype=' So I think there might be some conversion issues int the chararray, that instead of using numpy.str_, it uses Python's str. Weird. Ondrej From ondrej.certik at gmail.com Sat Jul 28 11:12:15 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Sat, 28 Jul 2012 08:12:15 -0700 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: References: <20120728093622.GA27387@sleipnir.bytereef.org> Message-ID: On Sat, Jul 28, 2012 at 8:04 AM, Ond?ej ?ert?k wrote: > On Sat, Jul 28, 2012 at 7:58 AM, Ond?ej ?ert?k wrote: > [...] 
>> Here is the code in defchararray.py: >> >> >> 1911 if not _globalvar and self.dtype.char not in 'SUbc': >> 1912 raise ValueError("Can only create a chararray from >> string data.") >> 1913 >> 1914 def __getitem__(self, obj): >> 1915 val = ndarray.__getitem__(self, obj) >> 1916 -> if issubclass(val.dtype.type, character) and not _len(val) == 0: >> 1917 temp = val.rstrip() >> 1918 if _len(temp) == 0: >> 1919 val = '' >> 1920 else: >> 1921 val = temp >> >> >> and here is some debugging info: >> > > Python 3.3: > >> >> (Pdb) p self >> (Pdb) p obj >> (0, 0) >> (Pdb) p val >> 'abc' >> (Pdb) p type(val) >> > > Python 3.2: > > (Pdb) p self > chararray([['abc', '123'], > ['789', 'xyz']], > dtype=' (Pdb) p obj > (0, 0) > (Pdb) p val > 'abc' > (Pdb) p type(val) > > > > So I think there might be some conversion issues int the chararray, > that instead of using numpy.str_, it uses Python's str. > Weird. Ok, found this minimal example of the problem. Python 3.3: >>> from numpy import array >>> a = array(["123", "abc"]) >>> a array(['123', 'abc'], dtype='>> a[0] '123' >>> type(a[0]) Python 3.2: >>> from numpy import array >>> a = array(["123", "abc"]) >>> a array(['123', 'abc'], dtype='>> a[0] '123' >>> type(a[0]) So at some point, the strings get converted to numpy strings in 3.2, but not in 3.3. Ondrej From stefan-usenet at bytereef.org Sat Jul 28 14:19:57 2012 From: stefan-usenet at bytereef.org (Stefan Krah) Date: Sat, 28 Jul 2012 20:19:57 +0200 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: References: <20120728093622.GA27387@sleipnir.bytereef.org> Message-ID: <20120728181956.GA30702@sleipnir.bytereef.org> Ond??ej ??ert??k wrote: > So at some point, the strings get converted to numpy strings in 3.2, > but not in 3.3. PyArray_Scalar() must return a subtype of PyUnicodeObject. I'm boldly assuming that data is in utf-32. If so, then this unoptimized version should work: diff --git a/numpy/core/src/multiarray/scalarapi.c b/numpy/core/src/multiarray/scalarapi.c index 2e255c0..c134aed 100644 --- a/numpy/core/src/multiarray/scalarapi.c +++ b/numpy/core/src/multiarray/scalarapi.c @@ -643,7 +643,20 @@ PyArray_Scalar(void *data, PyArray_Descr *descr, PyObject *base) } #if PY_VERSION_HEX >= 0x03030000 if (type_num == NPY_UNICODE) { - return PyUnicode_FromKindAndData(PyUnicode_4BYTE_KIND, data, itemsize/4); + PyObject *b, *args; + b = PyBytes_FromStringAndSize(data, itemsize); + if (b == NULL) { + return NULL; + } + args = Py_BuildValue("(Os)", b, "utf-32"); + if (args == NULL) { + Py_DECREF(b); + return NULL; + } + obj = type->tp_new(type, args, NULL); + Py_DECREF(b); + Py_DECREF(args); + return obj; } #endif if (type->tp_itemsize != 0) { Stefan Krah From ondrej.certik at gmail.com Sat Jul 28 16:43:11 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Sat, 28 Jul 2012 13:43:11 -0700 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: <20120728181956.GA30702@sleipnir.bytereef.org> References: <20120728093622.GA27387@sleipnir.bytereef.org> <20120728181956.GA30702@sleipnir.bytereef.org> Message-ID: On Sat, Jul 28, 2012 at 11:19 AM, Stefan Krah wrote: > Ond??ej ??ert??k wrote: >> So at some point, the strings get converted to numpy strings in 3.2, >> but not in 3.3. > > PyArray_Scalar() must return a subtype of PyUnicodeObject. I'm boldly > assuming that data is in utf-32. 
If so, then this unoptimized version > should work: > > diff --git a/numpy/core/src/multiarray/scalarapi.c b/numpy/core/src/multiarray/scalarapi.c > index 2e255c0..c134aed 100644 > --- a/numpy/core/src/multiarray/scalarapi.c > +++ b/numpy/core/src/multiarray/scalarapi.c > @@ -643,7 +643,20 @@ PyArray_Scalar(void *data, PyArray_Descr *descr, PyObject *base) > } > #if PY_VERSION_HEX >= 0x03030000 > if (type_num == NPY_UNICODE) { > - return PyUnicode_FromKindAndData(PyUnicode_4BYTE_KIND, data, itemsize/4); Why doesn't PyUnicode_FromKindAndData return a subtype of PyUnicodeObject? http://docs.python.org/dev/c-api/unicode.html#PyUnicode_FromKindAndData > + PyObject *b, *args; > + b = PyBytes_FromStringAndSize(data, itemsize); > + if (b == NULL) { > + return NULL; > + } > + args = Py_BuildValue("(Os)", b, "utf-32"); > + if (args == NULL) { > + Py_DECREF(b); > + return NULL; > + } > + obj = type->tp_new(type, args, NULL); > + Py_DECREF(b); > + Py_DECREF(args); > + return obj; > } > #endif > if (type->tp_itemsize != 0) { Nice!! I pushed your patch into the PR, now it works great in Python 3.3. There are still other failures: https://gist.github.com/3194707 But this particular bug is fixed. Thanks for your help! Ondrej From ondrej.certik at gmail.com Sat Jul 28 18:04:29 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Sat, 28 Jul 2012 15:04:29 -0700 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: References: <20120728093622.GA27387@sleipnir.bytereef.org> <20120728181956.GA30702@sleipnir.bytereef.org> Message-ID: Many of the failures in https://gist.github.com/3194707/5696c8d3091b16ba8a9f00a921d512ed02e94d71 are of the type: ====================================================================== FAIL: Check byteorder of single-dimensional objects ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/ondrej/py33/lib/python3.3/site-packages/numpy/core/tests/test_unicode.py", line 286, in test_valuesSD self.assertTrue(ua[0] != ua2[0]) AssertionError: False is not true and those are caused by the following minimal example: Python 3.2: >>> from numpy import array >>> a = array(["abc"]) >>> b = a.newbyteorder() >>> a.dtype dtype('>> b.dtype dtype('>U3') >>> a[0].dtype dtype('>> b[0].dtype dtype('>> a[0] == b[0] False >>> a[0] 'abc' >>> b[0] '?\udc00?\udc00?\udc00' Python 3.3: >>> from numpy import array >>> a = array(["abc"]) >>> b = a.newbyteorder() >>> a.dtype dtype('>> b.dtype dtype('>U3') >>> a[0].dtype dtype('>> b[0].dtype dtype('>> a[0] == b[0] True >>> a[0] 'abc' >>> b[0] 'abc' So somehow the newbyteorder() method doesn't change the dtype of the elements in our new code. This method is implemented in numpy/core/src/multiarray/descriptor.c (I think), but so far I don't see where the problem could be. Any ideas? 
Ondrej From ondrej.certik at gmail.com Sat Jul 28 18:31:11 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Sat, 28 Jul 2012 15:31:11 -0700 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: References: <20120728093622.GA27387@sleipnir.bytereef.org> <20120728181956.GA30702@sleipnir.bytereef.org> Message-ID: On Sat, Jul 28, 2012 at 3:04 PM, Ond?ej ?ert?k wrote: > Many of the failures in > https://gist.github.com/3194707/5696c8d3091b16ba8a9f00a921d512ed02e94d71 > are of the type: > > ====================================================================== > FAIL: Check byteorder of single-dimensional objects > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/home/ondrej/py33/lib/python3.3/site-packages/numpy/core/tests/test_unicode.py", > line 286, in test_valuesSD > self.assertTrue(ua[0] != ua2[0]) > AssertionError: False is not true > > > and those are caused by the following minimal example: > > Python 3.2: > >>>> from numpy import array >>>> a = array(["abc"]) >>>> b = a.newbyteorder() >>>> a.dtype > dtype('>>> b.dtype > dtype('>U3') >>>> a[0].dtype > dtype('>>> b[0].dtype > dtype('>>> a[0] == b[0] > False >>>> a[0] > 'abc' >>>> b[0] > '?\udc00?\udc00?\udc00' > > > Python 3.3: > > >>>> from numpy import array >>>> a = array(["abc"]) >>>> b = a.newbyteorder() >>>> a.dtype > dtype('>>> b.dtype > dtype('>U3') >>>> a[0].dtype > dtype('>>> b[0].dtype > dtype('>>> a[0] == b[0] > True >>>> a[0] > 'abc' >>>> b[0] > 'abc' > > > So somehow the newbyteorder() method doesn't change the dtype of the > elements in our new code. > This method is implemented in numpy/core/src/multiarray/descriptor.c > (I think), but so far I don't see > where the problem could be. > > Any ideas? Ok, after some investigating, I think we need to do something along these lines: diff --git a/numpy/core/src/multiarray/scalarapi.c b/numpy/core/src/multiarray/s index c134aed..daf7fc4 100644 --- a/numpy/core/src/multiarray/scalarapi.c +++ b/numpy/core/src/multiarray/scalarapi.c @@ -644,7 +644,20 @@ PyArray_Scalar(void *data, PyArray_Descr *descr, PyObject * #if PY_VERSION_HEX >= 0x03030000 if (type_num == NPY_UNICODE) { PyObject *b, *args; - b = PyBytes_FromStringAndSize(data, itemsize); + if (swap) { + char *buffer; + buffer = malloc(itemsize); + if (buffer == NULL) { + PyErr_NoMemory(); + } + memcpy(buffer, data, itemsize); + byte_swap_vector(buffer, itemsize, 4); + b = PyBytes_FromStringAndSize(buffer, itemsize); + // We have to deallocate this later, otherwise we get a segfault... + //free(buffer); + } else { + b = PyBytes_FromStringAndSize(data, itemsize); + } if (b == NULL) { return NULL; } This particular implementation still fails though: >>> from numpy import array >>> a = array(["abc"]) >>> b = a.newbyteorder() >>> a.dtype dtype('>> b.dtype dtype('>U3') >>> a[0].dtype dtype('>> b[0].dtype Traceback (most recent call last): File "", line 1, in UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-3: codepoint not in range(0x110000) >>> a[0] == b[0] Traceback (most recent call last): File "", line 1, in UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-3: codepoint not in range(0x110000) >>> a[0] 'abc' >>> b[0] Traceback (most recent call last): File "", line 1, in UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-3: codepoint not in range(0x110000) But I think that we simply need to take into account the "swap" flag. 
Ondrej From ondrej.certik at gmail.com Sat Jul 28 20:09:20 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Sat, 28 Jul 2012 17:09:20 -0700 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: References: <20120728093622.GA27387@sleipnir.bytereef.org> <20120728181956.GA30702@sleipnir.bytereef.org> Message-ID: On Sat, Jul 28, 2012 at 3:31 PM, Ond?ej ?ert?k wrote: > On Sat, Jul 28, 2012 at 3:04 PM, Ond?ej ?ert?k wrote: >> Many of the failures in >> https://gist.github.com/3194707/5696c8d3091b16ba8a9f00a921d512ed02e94d71 >> are of the type: >> >> ====================================================================== >> FAIL: Check byteorder of single-dimensional objects >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File "/home/ondrej/py33/lib/python3.3/site-packages/numpy/core/tests/test_unicode.py", >> line 286, in test_valuesSD >> self.assertTrue(ua[0] != ua2[0]) >> AssertionError: False is not true >> >> >> and those are caused by the following minimal example: >> >> Python 3.2: >> >>>>> from numpy import array >>>>> a = array(["abc"]) >>>>> b = a.newbyteorder() >>>>> a.dtype >> dtype('>>>> b.dtype >> dtype('>U3') >>>>> a[0].dtype >> dtype('>>>> b[0].dtype >> dtype('>>>> a[0] == b[0] >> False >>>>> a[0] >> 'abc' >>>>> b[0] >> '?\udc00?\udc00?\udc00' >> >> >> Python 3.3: >> >> >>>>> from numpy import array >>>>> a = array(["abc"]) >>>>> b = a.newbyteorder() >>>>> a.dtype >> dtype('>>>> b.dtype >> dtype('>U3') >>>>> a[0].dtype >> dtype('>>>> b[0].dtype >> dtype('>>>> a[0] == b[0] >> True >>>>> a[0] >> 'abc' >>>>> b[0] >> 'abc' >> >> >> So somehow the newbyteorder() method doesn't change the dtype of the >> elements in our new code. >> This method is implemented in numpy/core/src/multiarray/descriptor.c >> (I think), but so far I don't see >> where the problem could be. >> >> Any ideas? > > Ok, after some investigating, I think we need to do something along these lines: > > diff --git a/numpy/core/src/multiarray/scalarapi.c b/numpy/core/src/multiarray/s > index c134aed..daf7fc4 100644 > --- a/numpy/core/src/multiarray/scalarapi.c > +++ b/numpy/core/src/multiarray/scalarapi.c > @@ -644,7 +644,20 @@ PyArray_Scalar(void *data, PyArray_Descr *descr, PyObject * > #if PY_VERSION_HEX >= 0x03030000 > if (type_num == NPY_UNICODE) { > PyObject *b, *args; > - b = PyBytes_FromStringAndSize(data, itemsize); > + if (swap) { > + char *buffer; > + buffer = malloc(itemsize); > + if (buffer == NULL) { > + PyErr_NoMemory(); > + } > + memcpy(buffer, data, itemsize); > + byte_swap_vector(buffer, itemsize, 4); > + b = PyBytes_FromStringAndSize(buffer, itemsize); > + // We have to deallocate this later, otherwise we get a segfault... 
> + //free(buffer); > + } else { > + b = PyBytes_FromStringAndSize(data, itemsize); > + } > if (b == NULL) { > return NULL; > } > > This particular implementation still fails though: > > >>>> from numpy import array >>>> a = array(["abc"]) >>>> b = a.newbyteorder() >>>> a.dtype > dtype('>>> b.dtype > dtype('>U3') >>>> a[0].dtype > dtype('>>> b[0].dtype > Traceback (most recent call last): > File "", line 1, in > UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-3: > codepoint not in range(0x110000) >>>> a[0] == b[0] > Traceback (most recent call last): > File "", line 1, in > UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-3: > codepoint not in range(0x110000) >>>> a[0] > 'abc' >>>> b[0] > Traceback (most recent call last): > File "", line 1, in > UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-3: > codepoint not in range(0x110000) > > > > But I think that we simply need to take into account the "swap" flag. Ok, so first of all, I tried to disable the swapping in Python 3.2: if (swap) { byte_swap_vector(buffer, itemsize >> 2, 4); } And then it behaves *exactly* as in Python 3.3. So I am pretty sure that the problem is right there and something along the lines of my patch above should fix it. I had a few bugs there, here is the correct version: diff --git a/numpy/core/src/multiarray/scalarapi.c b/numpy/core/src/multiarray/s index c134aed..bed73f7 100644 --- a/numpy/core/src/multiarray/scalarapi.c +++ b/numpy/core/src/multiarray/scalarapi.c @@ -644,7 +644,19 @@ PyArray_Scalar(void *data, PyArray_Descr *descr, PyObject * #if PY_VERSION_HEX >= 0x03030000 if (type_num == NPY_UNICODE) { PyObject *b, *args; - b = PyBytes_FromStringAndSize(data, itemsize); + if (swap) { + char *buffer; + buffer = malloc(itemsize); + if (buffer == NULL) { + PyErr_NoMemory(); + } + memcpy(buffer, data, itemsize); + byte_swap_vector(buffer, itemsize >> 2, 4); + b = PyBytes_FromStringAndSize(buffer, itemsize); + free(buffer); + } else { + b = PyBytes_FromStringAndSize(data, itemsize); + } if (b == NULL) { return NULL; } That works well, except that it gives the UnicodeDecodeError: >>> b[0].dtype NULL Traceback (most recent call last): File "", line 1, in UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-3: codepoint not in range(0x110000) This error is actually triggered by this line: obj = type->tp_new(type, args, NULL); in the patch by Stefan above. So I think what is happening is that it simply tries to convert it from bytes to a string and fails. That makes great sense. The question is why doesn't it fail in exactly the same way in Python 3.2? I think it's because the conversion check is bypassed somehow. Stefan, I think we need to swap it after the object is created. I am still experimenting with this. Ondrej From aclark at aclark.net Sat Jul 28 19:25:26 2012 From: aclark at aclark.net (Alex Clark) Date: Sat, 28 Jul 2012 19:25:26 -0400 Subject: [Numpy-discussion] ANN: pythonpackages.com beta Message-ID: Hi NumPy folks, I am reaching out to various Python-related programming communities in order to offer new help packaging your software. If you have ever struggled with packaging and releasing Python software (e.g. to PyPI), please check out this service: - http://pythonpackages.com The basic idea is to automate packaging by checking out code, testing, and uploading (e.g. 
to PyPI) all through the web, as explained in this introduction: - http://docs.pythonpackages.com/en/latest/introduction.html Also, I will be available to answer your Python packaging questions most days/nights in #pythonpackages on irc.freenode.net. Hope to meet/talk with all of you soon. Alex -- Alex Clark ? http://pythonpackages.com/ONE_CLICK From ondrej.certik at gmail.com Sat Jul 28 21:09:23 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Sat, 28 Jul 2012 18:09:23 -0700 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: References: <20120728093622.GA27387@sleipnir.bytereef.org> <20120728181956.GA30702@sleipnir.bytereef.org> Message-ID: On Sat, Jul 28, 2012 at 5:09 PM, Ond?ej ?ert?k wrote: > On Sat, Jul 28, 2012 at 3:31 PM, Ond?ej ?ert?k wrote: >> On Sat, Jul 28, 2012 at 3:04 PM, Ond?ej ?ert?k wrote: >>> Many of the failures in >>> https://gist.github.com/3194707/5696c8d3091b16ba8a9f00a921d512ed02e94d71 >>> are of the type: >>> >>> ====================================================================== >>> FAIL: Check byteorder of single-dimensional objects >>> ---------------------------------------------------------------------- >>> Traceback (most recent call last): >>> File "/home/ondrej/py33/lib/python3.3/site-packages/numpy/core/tests/test_unicode.py", >>> line 286, in test_valuesSD >>> self.assertTrue(ua[0] != ua2[0]) >>> AssertionError: False is not true >>> >>> >>> and those are caused by the following minimal example: >>> >>> Python 3.2: >>> >>>>>> from numpy import array >>>>>> a = array(["abc"]) >>>>>> b = a.newbyteorder() >>>>>> a.dtype >>> dtype('>>>>> b.dtype >>> dtype('>U3') >>>>>> a[0].dtype >>> dtype('>>>>> b[0].dtype >>> dtype('>>>>> a[0] == b[0] >>> False >>>>>> a[0] >>> 'abc' >>>>>> b[0] >>> '?\udc00?\udc00?\udc00' >>> >>> >>> Python 3.3: >>> >>> >>>>>> from numpy import array >>>>>> a = array(["abc"]) >>>>>> b = a.newbyteorder() >>>>>> a.dtype >>> dtype('>>>>> b.dtype >>> dtype('>U3') >>>>>> a[0].dtype >>> dtype('>>>>> b[0].dtype >>> dtype('>>>>> a[0] == b[0] >>> True >>>>>> a[0] >>> 'abc' >>>>>> b[0] >>> 'abc' >>> >>> >>> So somehow the newbyteorder() method doesn't change the dtype of the >>> elements in our new code. >>> This method is implemented in numpy/core/src/multiarray/descriptor.c >>> (I think), but so far I don't see >>> where the problem could be. >>> >>> Any ideas? >> >> Ok, after some investigating, I think we need to do something along these lines: >> >> diff --git a/numpy/core/src/multiarray/scalarapi.c b/numpy/core/src/multiarray/s >> index c134aed..daf7fc4 100644 >> --- a/numpy/core/src/multiarray/scalarapi.c >> +++ b/numpy/core/src/multiarray/scalarapi.c >> @@ -644,7 +644,20 @@ PyArray_Scalar(void *data, PyArray_Descr *descr, PyObject * >> #if PY_VERSION_HEX >= 0x03030000 >> if (type_num == NPY_UNICODE) { >> PyObject *b, *args; >> - b = PyBytes_FromStringAndSize(data, itemsize); >> + if (swap) { >> + char *buffer; >> + buffer = malloc(itemsize); >> + if (buffer == NULL) { >> + PyErr_NoMemory(); >> + } >> + memcpy(buffer, data, itemsize); >> + byte_swap_vector(buffer, itemsize, 4); >> + b = PyBytes_FromStringAndSize(buffer, itemsize); >> + // We have to deallocate this later, otherwise we get a segfault... 
>> + //free(buffer); >> + } else { >> + b = PyBytes_FromStringAndSize(data, itemsize); >> + } >> if (b == NULL) { >> return NULL; >> } >> >> This particular implementation still fails though: >> >> >>>>> from numpy import array >>>>> a = array(["abc"]) >>>>> b = a.newbyteorder() >>>>> a.dtype >> dtype('>>>> b.dtype >> dtype('>U3') >>>>> a[0].dtype >> dtype('>>>> b[0].dtype >> Traceback (most recent call last): >> File "", line 1, in >> UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-3: >> codepoint not in range(0x110000) >>>>> a[0] == b[0] >> Traceback (most recent call last): >> File "", line 1, in >> UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-3: >> codepoint not in range(0x110000) >>>>> a[0] >> 'abc' >>>>> b[0] >> Traceback (most recent call last): >> File "", line 1, in >> UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-3: >> codepoint not in range(0x110000) >> >> >> >> But I think that we simply need to take into account the "swap" flag. > > Ok, so first of all, I tried to disable the swapping in Python 3.2: > > if (swap) { > byte_swap_vector(buffer, itemsize >> 2, 4); > } > > And then it behaves *exactly* as in Python 3.3. So I am pretty sure > that the problem is right there and something > along the lines of my patch above should fix it. I had a few bugs > there, here is the correct version: > > diff --git a/numpy/core/src/multiarray/scalarapi.c b/numpy/core/src/multiarray/s > index c134aed..bed73f7 100644 > --- a/numpy/core/src/multiarray/scalarapi.c > +++ b/numpy/core/src/multiarray/scalarapi.c > @@ -644,7 +644,19 @@ PyArray_Scalar(void *data, PyArray_Descr *descr, PyObject * > #if PY_VERSION_HEX >= 0x03030000 > if (type_num == NPY_UNICODE) { > PyObject *b, *args; > - b = PyBytes_FromStringAndSize(data, itemsize); > + if (swap) { > + char *buffer; > + buffer = malloc(itemsize); > + if (buffer == NULL) { > + PyErr_NoMemory(); > + } > + memcpy(buffer, data, itemsize); > + byte_swap_vector(buffer, itemsize >> 2, 4); > + b = PyBytes_FromStringAndSize(buffer, itemsize); > + free(buffer); > + } else { > + b = PyBytes_FromStringAndSize(data, itemsize); > + } > if (b == NULL) { > return NULL; > } > > > That works well, except that it gives the UnicodeDecodeError: > >>>> b[0].dtype > NULL > Traceback (most recent call last): > File "", line 1, in > UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-3: > codepoint not in range(0x110000) > > This error is actually triggered by this line: > > > obj = type->tp_new(type, args, NULL); > > in the patch by Stefan above. So I think what is happening is that it > simply tries to convert it from bytes > to a string and fails. That makes great sense. The question is why > doesn't it fail in exactly the same way > in Python 3.2? I think it's because the conversion check is bypassed > somehow. Stefan, I think > we need to swap it after the object is created. I am still > experimenting with this. Well, I simply went to the Python sources and then implemented a solution that works with this patch: https://github.com/certik/numpy/commit/36fcd1327746a3d0ad346ce58ffbe00506e27654 So now the PR actually seems to work. The rest of the failures are here: https://gist.github.com/3195520 and they seem to be unrelated. Can somebody please review this PR? https://github.com/numpy/numpy/pull/366 I will squash the commits after it's reviewed (I want to keep the history there for now). 
Ondrej From cgohlke at uci.edu Sat Jul 28 21:17:04 2012 From: cgohlke at uci.edu (Christoph Gohlke) Date: Sat, 28 Jul 2012 18:17:04 -0700 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: References: <20120728093622.GA27387@sleipnir.bytereef.org> <20120728181956.GA30702@sleipnir.bytereef.org> Message-ID: <50148F10.4040900@uci.edu> On 7/28/2012 6:09 PM, Ond?ej ?ert?k wrote: > On Sat, Jul 28, 2012 at 5:09 PM, Ond?ej ?ert?k wrote: >> On Sat, Jul 28, 2012 at 3:31 PM, Ond?ej ?ert?k wrote: >>> On Sat, Jul 28, 2012 at 3:04 PM, Ond?ej ?ert?k wrote: >>>> Many of the failures in >>>> https://gist.github.com/3194707/5696c8d3091b16ba8a9f00a921d512ed02e94d71 >>>> are of the type: >>>> >>>> ====================================================================== >>>> FAIL: Check byteorder of single-dimensional objects >>>> ---------------------------------------------------------------------- >>>> Traceback (most recent call last): >>>> File "/home/ondrej/py33/lib/python3.3/site-packages/numpy/core/tests/test_unicode.py", >>>> line 286, in test_valuesSD >>>> self.assertTrue(ua[0] != ua2[0]) >>>> AssertionError: False is not true >>>> >>>> >>>> and those are caused by the following minimal example: >>>> >>>> Python 3.2: >>>> >>>>>>> from numpy import array >>>>>>> a = array(["abc"]) >>>>>>> b = a.newbyteorder() >>>>>>> a.dtype >>>> dtype('>>>>>> b.dtype >>>> dtype('>U3') >>>>>>> a[0].dtype >>>> dtype('>>>>>> b[0].dtype >>>> dtype('>>>>>> a[0] == b[0] >>>> False >>>>>>> a[0] >>>> 'abc' >>>>>>> b[0] >>>> '?\udc00?\udc00?\udc00' >>>> >>>> >>>> Python 3.3: >>>> >>>> >>>>>>> from numpy import array >>>>>>> a = array(["abc"]) >>>>>>> b = a.newbyteorder() >>>>>>> a.dtype >>>> dtype('>>>>>> b.dtype >>>> dtype('>U3') >>>>>>> a[0].dtype >>>> dtype('>>>>>> b[0].dtype >>>> dtype('>>>>>> a[0] == b[0] >>>> True >>>>>>> a[0] >>>> 'abc' >>>>>>> b[0] >>>> 'abc' >>>> >>>> >>>> So somehow the newbyteorder() method doesn't change the dtype of the >>>> elements in our new code. >>>> This method is implemented in numpy/core/src/multiarray/descriptor.c >>>> (I think), but so far I don't see >>>> where the problem could be. >>>> >>>> Any ideas? >>> >>> Ok, after some investigating, I think we need to do something along these lines: >>> >>> diff --git a/numpy/core/src/multiarray/scalarapi.c b/numpy/core/src/multiarray/s >>> index c134aed..daf7fc4 100644 >>> --- a/numpy/core/src/multiarray/scalarapi.c >>> +++ b/numpy/core/src/multiarray/scalarapi.c >>> @@ -644,7 +644,20 @@ PyArray_Scalar(void *data, PyArray_Descr *descr, PyObject * >>> #if PY_VERSION_HEX >= 0x03030000 >>> if (type_num == NPY_UNICODE) { >>> PyObject *b, *args; >>> - b = PyBytes_FromStringAndSize(data, itemsize); >>> + if (swap) { >>> + char *buffer; >>> + buffer = malloc(itemsize); >>> + if (buffer == NULL) { >>> + PyErr_NoMemory(); >>> + } >>> + memcpy(buffer, data, itemsize); >>> + byte_swap_vector(buffer, itemsize, 4); >>> + b = PyBytes_FromStringAndSize(buffer, itemsize); >>> + // We have to deallocate this later, otherwise we get a segfault... 
>>> + //free(buffer); >>> + } else { >>> + b = PyBytes_FromStringAndSize(data, itemsize); >>> + } >>> if (b == NULL) { >>> return NULL; >>> } >>> >>> This particular implementation still fails though: >>> >>> >>>>>> from numpy import array >>>>>> a = array(["abc"]) >>>>>> b = a.newbyteorder() >>>>>> a.dtype >>> dtype('>>>>> b.dtype >>> dtype('>U3') >>>>>> a[0].dtype >>> dtype('>>>>> b[0].dtype >>> Traceback (most recent call last): >>> File "", line 1, in >>> UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-3: >>> codepoint not in range(0x110000) >>>>>> a[0] == b[0] >>> Traceback (most recent call last): >>> File "", line 1, in >>> UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-3: >>> codepoint not in range(0x110000) >>>>>> a[0] >>> 'abc' >>>>>> b[0] >>> Traceback (most recent call last): >>> File "", line 1, in >>> UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-3: >>> codepoint not in range(0x110000) >>> >>> >>> >>> But I think that we simply need to take into account the "swap" flag. >> >> Ok, so first of all, I tried to disable the swapping in Python 3.2: >> >> if (swap) { >> byte_swap_vector(buffer, itemsize >> 2, 4); >> } >> >> And then it behaves *exactly* as in Python 3.3. So I am pretty sure >> that the problem is right there and something >> along the lines of my patch above should fix it. I had a few bugs >> there, here is the correct version: >> >> diff --git a/numpy/core/src/multiarray/scalarapi.c b/numpy/core/src/multiarray/s >> index c134aed..bed73f7 100644 >> --- a/numpy/core/src/multiarray/scalarapi.c >> +++ b/numpy/core/src/multiarray/scalarapi.c >> @@ -644,7 +644,19 @@ PyArray_Scalar(void *data, PyArray_Descr *descr, PyObject * >> #if PY_VERSION_HEX >= 0x03030000 >> if (type_num == NPY_UNICODE) { >> PyObject *b, *args; >> - b = PyBytes_FromStringAndSize(data, itemsize); >> + if (swap) { >> + char *buffer; >> + buffer = malloc(itemsize); >> + if (buffer == NULL) { >> + PyErr_NoMemory(); >> + } >> + memcpy(buffer, data, itemsize); >> + byte_swap_vector(buffer, itemsize >> 2, 4); >> + b = PyBytes_FromStringAndSize(buffer, itemsize); >> + free(buffer); >> + } else { >> + b = PyBytes_FromStringAndSize(data, itemsize); >> + } >> if (b == NULL) { >> return NULL; >> } >> >> >> That works well, except that it gives the UnicodeDecodeError: >> >>>>> b[0].dtype >> NULL >> Traceback (most recent call last): >> File "", line 1, in >> UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-3: >> codepoint not in range(0x110000) >> >> This error is actually triggered by this line: >> >> >> obj = type->tp_new(type, args, NULL); >> >> in the patch by Stefan above. So I think what is happening is that it >> simply tries to convert it from bytes >> to a string and fails. That makes great sense. The question is why >> doesn't it fail in exactly the same way >> in Python 3.2? I think it's because the conversion check is bypassed >> somehow. Stefan, I think >> we need to swap it after the object is created. I am still >> experimenting with this. > > Well, I simply went to the Python sources and then implemented a > solution that works with this patch: > > https://github.com/certik/numpy/commit/36fcd1327746a3d0ad346ce58ffbe00506e27654 > > So now the PR actually seems to work. The rest of the failures are here: > > https://gist.github.com/3195520 > > and they seem to be unrelated. Can somebody please review this PR? 
> > https://github.com/numpy/numpy/pull/366 > > > I will squash the commits after it's reviewed (I want to keep the > history there for now). > > > Ondrej Thank you. I backported the PR to numpy 1.6.2 and it works for me on win-amd64-py3.3 with the msvc10 compiler. I get the same 5 test failures of the kind: AssertionError: Items are not equal: ACTUAL: () DESIRED: None Christoph From cgohlke at uci.edu Sun Jul 29 02:25:25 2012 From: cgohlke at uci.edu (Christoph Gohlke) Date: Sat, 28 Jul 2012 23:25:25 -0700 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: <50148F10.4040900@uci.edu> References: <20120728093622.GA27387@sleipnir.bytereef.org> <20120728181956.GA30702@sleipnir.bytereef.org> <50148F10.4040900@uci.edu> Message-ID: <5014D755.3030103@uci.edu> On 7/28/2012 6:17 PM, Christoph Gohlke wrote: > On 7/28/2012 6:09 PM, Ond?ej ?ert?k wrote: >> On Sat, Jul 28, 2012 at 5:09 PM, Ond?ej ?ert?k wrote: >>> On Sat, Jul 28, 2012 at 3:31 PM, Ond?ej ?ert?k wrote: >>>> On Sat, Jul 28, 2012 at 3:04 PM, Ond?ej ?ert?k wrote: >>>>> Many of the failures in >>>>> https://gist.github.com/3194707/5696c8d3091b16ba8a9f00a921d512ed02e94d71 >>>>> are of the type: >>>>> >>>>> ====================================================================== >>>>> FAIL: Check byteorder of single-dimensional objects >>>>> ---------------------------------------------------------------------- >>>>> Traceback (most recent call last): >>>>> File "/home/ondrej/py33/lib/python3.3/site-packages/numpy/core/tests/test_unicode.py", >>>>> line 286, in test_valuesSD >>>>> self.assertTrue(ua[0] != ua2[0]) >>>>> AssertionError: False is not true >>>>> >>>>> >>>>> and those are caused by the following minimal example: >>>>> >>>>> Python 3.2: >>>>> >>>>>>>> from numpy import array >>>>>>>> a = array(["abc"]) >>>>>>>> b = a.newbyteorder() >>>>>>>> a.dtype >>>>> dtype('>>>>>>> b.dtype >>>>> dtype('>U3') >>>>>>>> a[0].dtype >>>>> dtype('>>>>>>> b[0].dtype >>>>> dtype('>>>>>>> a[0] == b[0] >>>>> False >>>>>>>> a[0] >>>>> 'abc' >>>>>>>> b[0] >>>>> '?\udc00?\udc00?\udc00' >>>>> >>>>> >>>>> Python 3.3: >>>>> >>>>> >>>>>>>> from numpy import array >>>>>>>> a = array(["abc"]) >>>>>>>> b = a.newbyteorder() >>>>>>>> a.dtype >>>>> dtype('>>>>>>> b.dtype >>>>> dtype('>U3') >>>>>>>> a[0].dtype >>>>> dtype('>>>>>>> b[0].dtype >>>>> dtype('>>>>>>> a[0] == b[0] >>>>> True >>>>>>>> a[0] >>>>> 'abc' >>>>>>>> b[0] >>>>> 'abc' >>>>> >>>>> >>>>> So somehow the newbyteorder() method doesn't change the dtype of the >>>>> elements in our new code. >>>>> This method is implemented in numpy/core/src/multiarray/descriptor.c >>>>> (I think), but so far I don't see >>>>> where the problem could be. >>>>> >>>>> Any ideas? 
>>>> >>>> Ok, after some investigating, I think we need to do something along these lines: >>>> >>>> diff --git a/numpy/core/src/multiarray/scalarapi.c b/numpy/core/src/multiarray/s >>>> index c134aed..daf7fc4 100644 >>>> --- a/numpy/core/src/multiarray/scalarapi.c >>>> +++ b/numpy/core/src/multiarray/scalarapi.c >>>> @@ -644,7 +644,20 @@ PyArray_Scalar(void *data, PyArray_Descr *descr, PyObject * >>>> #if PY_VERSION_HEX >= 0x03030000 >>>> if (type_num == NPY_UNICODE) { >>>> PyObject *b, *args; >>>> - b = PyBytes_FromStringAndSize(data, itemsize); >>>> + if (swap) { >>>> + char *buffer; >>>> + buffer = malloc(itemsize); >>>> + if (buffer == NULL) { >>>> + PyErr_NoMemory(); >>>> + } >>>> + memcpy(buffer, data, itemsize); >>>> + byte_swap_vector(buffer, itemsize, 4); >>>> + b = PyBytes_FromStringAndSize(buffer, itemsize); >>>> + // We have to deallocate this later, otherwise we get a segfault... >>>> + //free(buffer); >>>> + } else { >>>> + b = PyBytes_FromStringAndSize(data, itemsize); >>>> + } >>>> if (b == NULL) { >>>> return NULL; >>>> } >>>> >>>> This particular implementation still fails though: >>>> >>>> >>>>>>> from numpy import array >>>>>>> a = array(["abc"]) >>>>>>> b = a.newbyteorder() >>>>>>> a.dtype >>>> dtype('>>>>>> b.dtype >>>> dtype('>U3') >>>>>>> a[0].dtype >>>> dtype('>>>>>> b[0].dtype >>>> Traceback (most recent call last): >>>> File "", line 1, in >>>> UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-3: >>>> codepoint not in range(0x110000) >>>>>>> a[0] == b[0] >>>> Traceback (most recent call last): >>>> File "", line 1, in >>>> UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-3: >>>> codepoint not in range(0x110000) >>>>>>> a[0] >>>> 'abc' >>>>>>> b[0] >>>> Traceback (most recent call last): >>>> File "", line 1, in >>>> UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-3: >>>> codepoint not in range(0x110000) >>>> >>>> >>>> >>>> But I think that we simply need to take into account the "swap" flag. >>> >>> Ok, so first of all, I tried to disable the swapping in Python 3.2: >>> >>> if (swap) { >>> byte_swap_vector(buffer, itemsize >> 2, 4); >>> } >>> >>> And then it behaves *exactly* as in Python 3.3. So I am pretty sure >>> that the problem is right there and something >>> along the lines of my patch above should fix it. 
I had a few bugs >>> there, here is the correct version: >>> >>> diff --git a/numpy/core/src/multiarray/scalarapi.c b/numpy/core/src/multiarray/s >>> index c134aed..bed73f7 100644 >>> --- a/numpy/core/src/multiarray/scalarapi.c >>> +++ b/numpy/core/src/multiarray/scalarapi.c >>> @@ -644,7 +644,19 @@ PyArray_Scalar(void *data, PyArray_Descr *descr, PyObject * >>> #if PY_VERSION_HEX >= 0x03030000 >>> if (type_num == NPY_UNICODE) { >>> PyObject *b, *args; >>> - b = PyBytes_FromStringAndSize(data, itemsize); >>> + if (swap) { >>> + char *buffer; >>> + buffer = malloc(itemsize); >>> + if (buffer == NULL) { >>> + PyErr_NoMemory(); >>> + } >>> + memcpy(buffer, data, itemsize); >>> + byte_swap_vector(buffer, itemsize >> 2, 4); >>> + b = PyBytes_FromStringAndSize(buffer, itemsize); >>> + free(buffer); >>> + } else { >>> + b = PyBytes_FromStringAndSize(data, itemsize); >>> + } >>> if (b == NULL) { >>> return NULL; >>> } >>> >>> >>> That works well, except that it gives the UnicodeDecodeError: >>> >>>>>> b[0].dtype >>> NULL >>> Traceback (most recent call last): >>> File "", line 1, in >>> UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-3: >>> codepoint not in range(0x110000) >>> >>> This error is actually triggered by this line: >>> >>> >>> obj = type->tp_new(type, args, NULL); >>> >>> in the patch by Stefan above. So I think what is happening is that it >>> simply tries to convert it from bytes >>> to a string and fails. That makes great sense. The question is why >>> doesn't it fail in exactly the same way >>> in Python 3.2? I think it's because the conversion check is bypassed >>> somehow. Stefan, I think >>> we need to swap it after the object is created. I am still >>> experimenting with this. >> >> Well, I simply went to the Python sources and then implemented a >> solution that works with this patch: >> >> https://github.com/certik/numpy/commit/36fcd1327746a3d0ad346ce58ffbe00506e27654 >> >> So now the PR actually seems to work. The rest of the failures are here: >> >> https://gist.github.com/3195520 >> >> and they seem to be unrelated. Can somebody please review this PR? >> >> https://github.com/numpy/numpy/pull/366 >> >> >> I will squash the commits after it's reviewed (I want to keep the >> history there for now). >> >> >> Ondrej > > > Thank you. I backported the PR to numpy 1.6.2 and it works for me on > win-amd64-py3.3 with the msvc10 compiler. I get the same 5 test failures > of the kind: > > AssertionError: > Items are not equal: > ACTUAL: () > DESIRED: None > > > Christoph Pull request #367 should fix the NewBufferProtocol test failures. https://github.com/numpy/numpy/pull/367 Christoph From pierre.raybaut at gmail.com Sun Jul 29 03:42:50 2012 From: pierre.raybaut at gmail.com (Pierre Raybaut) Date: Sun, 29 Jul 2012 09:42:50 +0200 Subject: [Numpy-discussion] ANN: Spyder v2.1.11 Message-ID: Hi all, On the behalf of Spyder's development team (http://code.google.com/p/spyderlib/people/list), I'm pleased to announce that Spyder v2.1.11 has been released and is available for Windows XP/Vista/7, GNU/Linux and MacOS X: http://code.google.com/p/spyderlib/ This is a pure maintenance release -- a lot of bugs were fixed since v2.1.10: http://code.google.com/p/spyderlib/wiki/ChangeLog Spyder is a free, open-source (MIT license) interactive development environment for the Python language with advanced editing, interactive testing, debugging and introspection features. 
Originally designed to provide MATLAB-like features (integrated help, interactive console, variable explorer with GUI-based editors for dictionaries, NumPy arrays, ...), it is strongly oriented towards scientific computing and software development. Thanks to the `spyderlib` library, Spyder also provides powerful ready-to-use widgets: embedded Python console (example: http://packages.python.org/guiqwt/_images/sift3.png), NumPy array editor (example: http://packages.python.org/guiqwt/_images/sift2.png), dictionary editor, source code editor, etc. Description of key features with tasty screenshots can be found at: http://code.google.com/p/spyderlib/wiki/Features On Windows platforms, Spyder is also available as a stand-alone executable (don't forget to disable UAC on Vista/7). This all-in-one portable version is still experimental (for example, it does not embed sphinx -- meaning no rich text mode for the object inspector) but it should provide a working version of Spyder for Windows platforms without having to install anything else (except Python 2.x itself, of course). Don't forget to follow Spyder updates/news: * on the project website: http://code.google.com/p/spyderlib/ * and on our official blog: http://spyder-ide.blogspot.com/ Last, but not least, we welcome any contribution that helps making Spyder an efficient scientific development/computing environment. Join us to help creating your favourite environment! (http://code.google.com/p/spyderlib/wiki/NoteForContributors) Enjoy! -Pierre From stefan-usenet at bytereef.org Sun Jul 29 05:20:21 2012 From: stefan-usenet at bytereef.org (Stefan Krah) Date: Sun, 29 Jul 2012 11:20:21 +0200 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: References: <20120728093622.GA27387@sleipnir.bytereef.org> <20120728181956.GA30702@sleipnir.bytereef.org> Message-ID: <20120729092021.GA3243@sleipnir.bytereef.org> Ond??ej ??ert??k wrote: > Why doesn't PyUnicode_FromKindAndData return a subtype of PyUnicodeObject? > > http://docs.python.org/dev/c-api/unicode.html#PyUnicode_FromKindAndData Well, it would need a PyTypeObject * parameter to do that. I agree that many C-API functions would be more useful if they did this. Stefan Krah From stefan-usenet at bytereef.org Sun Jul 29 06:40:34 2012 From: stefan-usenet at bytereef.org (Stefan Krah) Date: Sun, 29 Jul 2012 12:40:34 +0200 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: References: <20120728093622.GA27387@sleipnir.bytereef.org> <20120728181956.GA30702@sleipnir.bytereef.org> Message-ID: <20120729104034.GB3243@sleipnir.bytereef.org> Ond??ej ??ert??k wrote: > Well, I simply went to the Python sources and then implemented a > solution that works with this patch: > > https://github.com/certik/numpy/commit/36fcd1327746a3d0ad346ce58ffbe00506e27654 > https://github.com/numpy/numpy/pull/366 Nice! I hit the same problem yesterday: unicode_new() does not accept byte-swapped input with an encoding, since the input is not valid. But your solution circumvents the validation. I'm not sure what the use case is for byte-swapped (invalid?) unicode strings, but the approach looks good to me in the sense that it does the same thing as the Py_UNICODE_WIDE path in 3.2. In PyArray_Scalar() I only have these comments, two of which are stylistic: - I think the 'size' parameter in PyUnicode_New() refers to the number of code points (UCS4 in this case), so: PyUnicode_New(itemsize >> 2, max_char) - The 'b' variable could be renamed to 'u' now. 
- PyArray_Scalar() is beginning to look a little crowded. Perhaps the whole PY_VERSION_HEX >= 0x03030000 block could go into a separate function such as: NPY_NO_EXPORT PyObject * get_unicode_scalar_3_3(PyTypeObject *type, void *data, Py_ssize_t itemsize, int swap); Then there's another problem in numpy.test() if Python 3.3 is compiled --with-pydebug: .python3.3: numpy/core/src/multiarray/common.c:161: PyArray_DTypeFromObjectHelper: Assertion `((((((PyObject*)(temp))->ob_type))->tp_flags & ((1L<<27))) != 0)' failed. Aborted Stefan Krah From stefan-usenet at bytereef.org Sun Jul 29 07:52:12 2012 From: stefan-usenet at bytereef.org (Stefan Krah) Date: Sun, 29 Jul 2012 13:52:12 +0200 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: <20120729104034.GB3243@sleipnir.bytereef.org> References: <20120728181956.GA30702@sleipnir.bytereef.org> <20120729104034.GB3243@sleipnir.bytereef.org> Message-ID: <20120729115212.GA4336@sleipnir.bytereef.org> Stefan Krah wrote: > Then there's another problem in numpy.test() if Python 3.3 is compiled > --with-pydebug: > > .python3.3: numpy/core/src/multiarray/common.c:161: PyArray_DTypeFromObjectHelper: Assertion `((((((PyObject*)(temp))->ob_type))->tp_flags & ((1L<<27))) != 0)' failed. > Aborted This also occurs with Python 3.2, so it's unrelated to the Unicode changes: http://projects.scipy.org/numpy/ticket/2193 Stefan Krah From stefan-usenet at bytereef.org Sun Jul 29 09:42:05 2012 From: stefan-usenet at bytereef.org (Stefan Krah) Date: Sun, 29 Jul 2012 15:42:05 +0200 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: <20120729115212.GA4336@sleipnir.bytereef.org> References: <20120728181956.GA30702@sleipnir.bytereef.org> <20120729104034.GB3243@sleipnir.bytereef.org> <20120729115212.GA4336@sleipnir.bytereef.org> Message-ID: <20120729134205.GA4840@sleipnir.bytereef.org> Stefan Krah wrote: > > .python3.3: numpy/core/src/multiarray/common.c:161: PyArray_DTypeFromObjectHelper: Assertion `((((((PyObject*)(temp))->ob_type))->tp_flags & ((1L<<27))) != 0)' failed. > > Aborted > > This also occurs with Python 3.2, so it's unrelated to the Unicode changes: > > http://projects.scipy.org/numpy/ticket/2193 I've uploaded a patch for the issue. Stefan Krah From stefan-usenet at bytereef.org Sun Jul 29 09:56:44 2012 From: stefan-usenet at bytereef.org (Stefan Krah) Date: Sun, 29 Jul 2012 15:56:44 +0200 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: References: <20120728093622.GA27387@sleipnir.bytereef.org> <20120728181956.GA30702@sleipnir.bytereef.org> Message-ID: <20120729135644.GA4986@sleipnir.bytereef.org> Ond??ej ??ert??k wrote: > https://github.com/numpy/numpy/pull/366 Using Python 3.3 compiled --with-pydebug it appears to be impossible to fool the new Unicode implementation with byte-swapped data: Apply the patch from: http://projects.scipy.org/numpy/ticket/2193 Then: Python 3.3.0b1 (default:68e2690a471d+, Jul 29 2012, 15:28:41) [GCC 4.4.3] on linux Type "help", "copyright", "credits" or "license" for more information. >>> from numpy import array [206376 refs] >>> a = array(["abc"]) [206382 refs] >>> b = a.newbyteorder() [206387 refs] >>> b python3.3: Objects/unicodeobject.c:401: _PyUnicode_CheckConsistency: Assertion `maxchar <= 0x10ffff' failed. Program received signal SIGABRT, Aborted. 0x00007ffff71e6a75 in *__GI_raise (sig=) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 64 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory. 
in ../nptl/sysdeps/unix/sysv/linux/raise.c (gdb) This should be expected since the byte-swapped strings aren't valid. Stefan Krah From ondrej.certik at gmail.com Sun Jul 29 11:12:43 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Sun, 29 Jul 2012 08:12:43 -0700 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: <20120729135644.GA4986@sleipnir.bytereef.org> References: <20120728093622.GA27387@sleipnir.bytereef.org> <20120728181956.GA30702@sleipnir.bytereef.org> <20120729135644.GA4986@sleipnir.bytereef.org> Message-ID: On Sun, Jul 29, 2012 at 6:56 AM, Stefan Krah wrote: > Ond??ej ??ert??k wrote: >> https://github.com/numpy/numpy/pull/366 > > Using Python 3.3 compiled --with-pydebug it appears to be impossible > to fool the new Unicode implementation with byte-swapped data: > > > Apply the patch from: > > http://projects.scipy.org/numpy/ticket/2193 > > > Then: > > Python 3.3.0b1 (default:68e2690a471d+, Jul 29 2012, 15:28:41) > [GCC 4.4.3] on linux > Type "help", "copyright", "credits" or "license" for more information. >>>> from numpy import array > [206376 refs] >>>> a = array(["abc"]) > [206382 refs] >>>> b = a.newbyteorder() > [206387 refs] >>>> b > python3.3: Objects/unicodeobject.c:401: _PyUnicode_CheckConsistency: Assertion `maxchar <= 0x10ffff' failed. > > Program received signal SIGABRT, Aborted. > 0x00007ffff71e6a75 in *__GI_raise (sig=) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 > 64 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory. > in ../nptl/sysdeps/unix/sysv/linux/raise.c > (gdb) > > > This should be expected since the byte-swapped strings aren't valid. Exactly, I am aware that my solution is a hack. So is the Python 3.2 solution, except that Python 3.2 doesn't seem to have the _PyUnicode_CheckConsistency() function, so no checks are done. As such, I think that my PR simply extends the numpy approach to Python 3.3. A separate issue is that the swapping thing is a hack -- Travis, what is the purpose of the newbyteorder() and the need to swap the internals of the unicode object? Ondrej From stefan-usenet at bytereef.org Sun Jul 29 11:26:46 2012 From: stefan-usenet at bytereef.org (Stefan Krah) Date: Sun, 29 Jul 2012 17:26:46 +0200 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: References: <20120728181956.GA30702@sleipnir.bytereef.org> <20120729135644.GA4986@sleipnir.bytereef.org> Message-ID: <20120729152646.GA5652@sleipnir.bytereef.org> Ond??ej ??ert??k wrote: > > This should be expected since the byte-swapped strings aren't valid. > > Exactly, I am aware that my solution is a hack. So is the Python 3.2 > solution, except that Python 3.2 doesn't seem to have > the _PyUnicode_CheckConsistency() function, so no checks are done. > As such, I think that my PR simply extends the numpy approach to Python 3.3. Absolutely, I also think that using invalid Unicode strings in 3.2 looks kind of hackish. -- Nothing wrong with your 3.3 implementation, it's the general concept that I don't understand. 
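For anyone following along, here is a plain-Python sketch of the arithmetic
(this is not the numpy code path, just an illustration): byte-swapping the
UCS4 word for 'a' (U+0061) turns it into the "code point" 0x61000000, far
above the 0x10FFFF limit, which is exactly what _PyUnicode_CheckConsistency()
and the utf-32 codec complain about:

>>> data = "abc".encode("utf-32-le")    # UCS4 bytes for 'abc', little-endian
>>> swapped = b"".join(data[i:i+4][::-1] for i in range(0, len(data), 4))
>>> hex(int.from_bytes(swapped[:4], "little"))
'0x61000000'
>>> swapped.decode("utf-32-le")   # -> UnicodeDecodeError, codepoint not in range(0x110000)
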
Stefan Krah From ondrej.certik at gmail.com Sun Jul 29 11:50:03 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Sun, 29 Jul 2012 08:50:03 -0700 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: <20120729104034.GB3243@sleipnir.bytereef.org> References: <20120728093622.GA27387@sleipnir.bytereef.org> <20120728181956.GA30702@sleipnir.bytereef.org> <20120729104034.GB3243@sleipnir.bytereef.org> Message-ID: On Sun, Jul 29, 2012 at 3:40 AM, Stefan Krah wrote: > Ond??ej ??ert??k wrote: >> Well, I simply went to the Python sources and then implemented a >> solution that works with this patch: >> >> https://github.com/certik/numpy/commit/36fcd1327746a3d0ad346ce58ffbe00506e27654 > >> https://github.com/numpy/numpy/pull/366 > > > Nice! I hit the same problem yesterday: unicode_new() does not accept > byte-swapped input with an encoding, since the input is not valid. But > your solution circumvents the validation. > > I'm not sure what the use case is for byte-swapped (invalid?) unicode > strings, but the approach looks good to me in the sense that it does > the same thing as the Py_UNICODE_WIDE path in 3.2. > > > In PyArray_Scalar() I only have these comments, two of which are stylistic: > > - I think the 'size' parameter in PyUnicode_New() refers to the number > of code points (UCS4 in this case), so: > > PyUnicode_New(itemsize >> 2, max_char) Right. Done. > > - The 'b' variable could be renamed to 'u' now. Done. > > - PyArray_Scalar() is beginning to look a little crowded. Perhaps the whole > PY_VERSION_HEX >= 0x03030000 block could go into a separate function such > as: > > NPY_NO_EXPORT PyObject * > get_unicode_scalar_3_3(PyTypeObject *type, void *data, Py_ssize_t itemsize, > int swap); I didn't do this, as I think the function is fine as it is. If further refactoring is needed, then one should probably create 3 functions, one for 3.3, one for <3.3-wide and one for <3.3-narrow. I've also rebased and squashed the commits, so now it is ready to be merged: https://github.com/numpy/numpy/pull/366 Thanks Stefan for your help. Can somebody with push access please review it? Ondrej From ronan.lamy at gmail.com Sun Jul 29 17:27:03 2012 From: ronan.lamy at gmail.com (Ronan Lamy) Date: Sun, 29 Jul 2012 22:27:03 +0100 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: References: <20120728093622.GA27387@sleipnir.bytereef.org> <20120728181956.GA30702@sleipnir.bytereef.org> Message-ID: <1343597223.2223.131.camel@ronan-desktop> Le samedi 28 juillet 2012 ? 18:09 -0700, Ond?ej ?ert?k a ?crit : > > So now the PR actually seems to work. The rest of the failures are here: > > https://gist.github.com/3195520 > I wanted to have a look at the import errors in your previous gist. How did you get rid of them? I can't even install numpy on 3.3 as setup.py chokes on 'import numpy.distutils.core': (py33)ronan at ronan-desktop:~/dev/numpy$ python setup.py install Converting to Python3 via 2to3... Running from numpy source directory. 
/home/ronan/dev/numpy/py33/lib/python3.3/distutils/__init__.py:16: ResourceWarning: unclosed file <_io.TextIOWrapper name='/usr/local/lib/python3.3/distutils/__init__.py' mode='r' encoding='UTF-8'> exec(open(os.path.join(distutils_path, '__init__.py')).read()) Traceback (most recent call last): File "setup.py", line 214, in setup_package() File "setup.py", line 191, in setup_package from numpy.distutils.core import setup File "/home/ronan/dev/numpy/build/py3k/numpy/distutils/core.py", line 25, in from numpy.distutils.command import config, config_compiler, \ File "/home/ronan/dev/numpy/build/py3k/numpy/distutils/command/__init__.py", line 17, in __import__('distutils.command',globals(),locals(),distutils_all) ImportError: No module named 'distutils.command.install_clib' Actually, I don't even understand how this __import__() call can work on earlier versions, nor what it's trying to achieve. From ondrej.certik at gmail.com Sun Jul 29 17:45:21 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Sun, 29 Jul 2012 14:45:21 -0700 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: <1343597223.2223.131.camel@ronan-desktop> References: <20120728093622.GA27387@sleipnir.bytereef.org> <20120728181956.GA30702@sleipnir.bytereef.org> <1343597223.2223.131.camel@ronan-desktop> Message-ID: Hi Ronan! On Sun, Jul 29, 2012 at 2:27 PM, Ronan Lamy wrote: > Le samedi 28 juillet 2012 ? 18:09 -0700, Ond?ej ?ert?k a ?crit : > >> >> So now the PR actually seems to work. The rest of the failures are here: >> >> https://gist.github.com/3195520 >> > I wanted to have a look at the import errors in your previous gist. How > did you get rid of them? I can't even install numpy on 3.3 as setup.py Do you mean this gist: https://gist.github.com/3194707/482382fb6fd6f0d756128d97ea6c892ddb31fff9 ? I have incorrectly run the tests from the wrong directory and numpy was picking up the wrong files to import --- I think either from the numpy directory directly (there is a check for this though), or from numpy/core or something, I don't remember anymore. So I then run the tests from /tmp and posted the correct result into the same gist as a new commit: https://gist.github.com/3194707/5696c8d3091b16ba8a9f00a921d512ed02e94d71 > chokes on 'import numpy.distutils.core': > > (py33)ronan at ronan-desktop:~/dev/numpy$ python setup.py install > Converting to Python3 via 2to3... > Running from numpy source directory. > /home/ronan/dev/numpy/py33/lib/python3.3/distutils/__init__.py:16: > ResourceWarning: unclosed file <_io.TextIOWrapper > name='/usr/local/lib/python3.3/distutils/__init__.py' mode='r' > encoding='UTF-8'> > exec(open(os.path.join(distutils_path, '__init__.py')).read()) > Traceback (most recent call last): > File "setup.py", line 214, in > setup_package() > File "setup.py", line 191, in setup_package > from numpy.distutils.core import setup > File "/home/ronan/dev/numpy/build/py3k/numpy/distutils/core.py", line > 25, in > from numpy.distutils.command import config, config_compiler, \ > File > "/home/ronan/dev/numpy/build/py3k/numpy/distutils/command/__init__.py", > line 17, in > __import__('distutils.command',globals(),locals(),distutils_all) > ImportError: No module named 'distutils.command.install_clib' > > Actually, I don't even understand how this __import__() call can work on > earlier versions, nor what it's trying to achieve. That's weird, I've never seen this error before. 
Try to install numpy using your regular Python like this: python setup.py install --prefix /tmp let's say. If it works, then something is wrong with your Python 3.3 installation. If you want to reproduce my setup, checkout my repo: https://github.com/certik/python-3.3 and from inside it, run: SPKG_LOCAL=`pwd`/xx MAKEFLAGS="-j4" sh spkg-install (adjust the "-j4" flag, or remove it). You need a few packages installed like zlib1g-dev and so on. Then install virtualenv by downloading the tar.gz and from inside it doing "/path/to/my/python-3.3/xx/bin/python3.3 setup.py install". Add the file /path/to/my/python-3.3/xx/bin/virtualenv-3.3 into your $PATH. Then: rm -rf $HOME/py33 virtualenv-3.3 $HOME/py33 . $HOME/py33/bin/activate go to your numpy directory and do "python setup.py install". To run tests, you also need to: TMPDIR=/tmp/numpy-env rm -rf $TMPDIR mkdir $TMPDIR cd $TMPDIR tar xzf $tarballs/nose-1.1.2.tar.gz cd nose-1.1.2 python setup.py install using the virtualenv environment. When I tried to install nose into the python installation in python-3.3./xx, then it failed... Ondrej From stefan-usenet at bytereef.org Sun Jul 29 17:55:55 2012 From: stefan-usenet at bytereef.org (Stefan Krah) Date: Sun, 29 Jul 2012 23:55:55 +0200 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: <1343597223.2223.131.camel@ronan-desktop> References: <20120728181956.GA30702@sleipnir.bytereef.org> <1343597223.2223.131.camel@ronan-desktop> Message-ID: <20120729215555.GA8674@sleipnir.bytereef.org> Ronan Lamy wrote: > ImportError: No module named 'distutils.command.install_clib' I'm seeing the same with Python 3.3.0b1 (68e2690a471d+) and this patch solves the problem: diff --git a/numpy/distutils/command/__init__.py b/numpy/distutils/command/__init__.py index f8f0884..b9f0d09 100644 --- a/numpy/distutils/command/__init__.py +++ b/numpy/distutils/command/__init__.py @@ -7,13 +7,13 @@ __revision__ = "$Id: __init__.py,v 1.3 2005/05/16 11:08:49 pearu Exp $" distutils_all = [ #'build_py', 'clean', - 'install_clib', 'install_scripts', 'bdist', 'bdist_dumb', 'bdist_wininst', ] +from numpy.distutils.command import install_clib __import__('distutils.command',globals(),locals(),distutils_all) __all__ = ['build', Stefan Krah From nouiz at nouiz.org Sun Jul 29 20:27:04 2012 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Sun, 29 Jul 2012 20:27:04 -0400 Subject: [Numpy-discussion] ANN: pythonpackages.com beta In-Reply-To: References: Message-ID: Hi, Do you have some help on installing an optimized BLAS on Windows, MacOS and/or linux? That is what is the most complicated part when installing our packages. Fred On Sat, Jul 28, 2012 at 7:25 PM, Alex Clark wrote: > Hi NumPy folks, > > > I am reaching out to various Python-related programming communities in > order to offer new help packaging your software. > > If you have ever struggled with packaging and releasing Python software > (e.g. to PyPI), please check out this service: > > > - http://pythonpackages.com > > > The basic idea is to automate packaging by checking out code, testing, > and uploading (e.g. to PyPI) all through the web, as explained in this > introduction: > > > - http://docs.pythonpackages.com/en/latest/introduction.html > > > Also, I will be available to answer your Python packaging questions most > days/nights in #pythonpackages on irc.freenode.net. Hope to meet/talk > with all of you soon. > > > > Alex > > > > -- > Alex Clark ? 
http://pythonpackages.com/ONE_CLICK > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From ronan.lamy at gmail.com Sun Jul 29 20:52:32 2012 From: ronan.lamy at gmail.com (Ronan Lamy) Date: Mon, 30 Jul 2012 01:52:32 +0100 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: <20120729215555.GA8674@sleipnir.bytereef.org> References: <20120728181956.GA30702@sleipnir.bytereef.org> <1343597223.2223.131.camel@ronan-desktop> <20120729215555.GA8674@sleipnir.bytereef.org> Message-ID: <1343609552.2223.150.camel@ronan-desktop> Le dimanche 29 juillet 2012 ? 23:55 +0200, Stefan Krah a ?crit : > Ronan Lamy wrote: > > ImportError: No module named 'distutils.command.install_clib' > > I'm seeing the same with Python 3.3.0b1 (68e2690a471d+) and this patch > solves the problem: > > diff --git a/numpy/distutils/command/__init__.py b/numpy/distutils/command/__init__.py > index f8f0884..b9f0d09 100644 > --- a/numpy/distutils/command/__init__.py > +++ b/numpy/distutils/command/__init__.py > @@ -7,13 +7,13 @@ __revision__ = "$Id: __init__.py,v 1.3 2005/05/16 11:08:49 pearu Exp $" > > distutils_all = [ #'build_py', > 'clean', > - 'install_clib', > 'install_scripts', > 'bdist', > 'bdist_dumb', > 'bdist_wininst', > ] > > +from numpy.distutils.command import install_clib > __import__('distutils.command',globals(),locals(),distutils_all) > > __all__ = ['build', That does indeed solve the problem, thanks. However, I'm quite sure that 'rm numpy/distutils/command/__init__.py && touch numpy/distutils/command/__init__.py' works just as well - or probably better, in fact, as it allows 'from numpy.distutils.command import *' to run without error. From ronan.lamy at gmail.com Sun Jul 29 21:00:23 2012 From: ronan.lamy at gmail.com (Ronan Lamy) Date: Mon, 30 Jul 2012 02:00:23 +0100 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: References: <20120728093622.GA27387@sleipnir.bytereef.org> <20120728181956.GA30702@sleipnir.bytereef.org> <1343597223.2223.131.camel@ronan-desktop> Message-ID: <1343610023.2223.156.camel@ronan-desktop> Le dimanche 29 juillet 2012 ? 14:45 -0700, Ond?ej ?ert?k a ?crit : > Hi Ronan! > > On Sun, Jul 29, 2012 at 2:27 PM, Ronan Lamy wrote: > > Le samedi 28 juillet 2012 ? 18:09 -0700, Ond?ej ?ert?k a ?crit : > > > >> > >> So now the PR actually seems to work. The rest of the failures are here: > >> > >> https://gist.github.com/3195520 > >> > > I wanted to have a look at the import errors in your previous gist. How > > did you get rid of them? I can't even install numpy on 3.3 as setup.py > > Do you mean this gist: > > https://gist.github.com/3194707/482382fb6fd6f0d756128d97ea6c892ddb31fff9 > > ? I have incorrectly run the tests from the wrong directory and numpy > was picking up the wrong files to import --- I think either from the > numpy directory directly (there is a check for this though), or from > numpy/core or something, I don't remember anymore. So I then run the > tests from /tmp and posted the correct result into the same gist as a > new commit: > > https://gist.github.com/3194707/5696c8d3091b16ba8a9f00a921d512ed02e94d71 Ah, OK. False alarm, then. I'm on the lookout for import errors with Python 3.3, as the import system has been completely rewritten and anything that relied on undocumented behaviour is likely to break. 
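The numpy.distutils.command/__init__.py failure quoted below looks like
exactly that kind of breakage. My guess (untested sketch, only mirroring
what that __init__.py already does) is that the old C import machinery
silently skipped fromlist entries that cannot be imported, while the
importlib-based machinery in 3.3.0b1 propagates the error:

# install_clib is a numpy.distutils command, not a stdlib distutils one,
# so listing it in the fromlist only "works" if the failed submodule
# import is ignored.
__import__('distutils.command', globals(), locals(),
           ['clean', 'install_clib', 'install_scripts'])
# 2.7 / 3.2: returns the distutils.command package, missing names skipped
# 3.3.0b1:   ImportError: No module named 'distutils.command.install_clib'
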
> > > chokes on 'import numpy.distutils.core': > > > > (py33)ronan at ronan-desktop:~/dev/numpy$ python setup.py install > > Converting to Python3 via 2to3... > > Running from numpy source directory. > > /home/ronan/dev/numpy/py33/lib/python3.3/distutils/__init__.py:16: > > ResourceWarning: unclosed file <_io.TextIOWrapper > > name='/usr/local/lib/python3.3/distutils/__init__.py' mode='r' > > encoding='UTF-8'> > > exec(open(os.path.join(distutils_path, '__init__.py')).read()) > > Traceback (most recent call last): > > File "setup.py", line 214, in > > setup_package() > > File "setup.py", line 191, in setup_package > > from numpy.distutils.core import setup > > File "/home/ronan/dev/numpy/build/py3k/numpy/distutils/core.py", line > > 25, in > > from numpy.distutils.command import config, config_compiler, \ > > File > > "/home/ronan/dev/numpy/build/py3k/numpy/distutils/command/__init__.py", > > line 17, in > > __import__('distutils.command',globals(),locals(),distutils_all) > > ImportError: No module named 'distutils.command.install_clib' > > > > Actually, I don't even understand how this __import__() call can work on > > earlier versions, nor what it's trying to achieve. > > That's weird, I've never seen this error before. Try to install numpy > using your regular Python like this: > > python setup.py install --prefix /tmp > > let's say. If it works, then something is wrong with your Python 3.3 I simply used a virtualenv (you might need to get the latest from PyPI), roughly as follows: virtualenv -p python3.3 py33 py33/bin/python setup.py install It worked fine with 3.2 and 2.7, but not with 3.3. > installation. If you want to reproduce my setup, checkout my repo: > > https://github.com/certik/python-3.3 > > and from inside it, run: > > SPKG_LOCAL=`pwd`/xx MAKEFLAGS="-j4" sh spkg-install > > > (adjust the "-j4" flag, or remove it). You need a few packages > installed like zlib1g-dev and so on. Then install virtualenv by > downloading the tar.gz and from inside it doing > "/path/to/my/python-3.3/xx/bin/python3.3 setup.py install". Add the > file > /path/to/my/python-3.3/xx/bin/virtualenv-3.3 into your $PATH. > > Then: > > > rm -rf $HOME/py33 > virtualenv-3.3 $HOME/py33 > . $HOME/py33/bin/activate > > go to your numpy directory and do "python setup.py install". To run > tests, you also need to: > > TMPDIR=/tmp/numpy-env > rm -rf $TMPDIR > mkdir $TMPDIR > cd $TMPDIR > tar xzf $tarballs/nose-1.1.2.tar.gz > cd nose-1.1.2 > python setup.py install > > using the virtualenv environment. When I tried to install nose into > the python installation in python-3.3./xx, then it failed... Installing nose from a git checkout works fine for me. Maybe nose-1.1.2 isn't really compatible with Python 3.3? Anyway, I managed to compile (by blanking numpy/distutils/command/__init__.py) and to run the tests. I only see the 2 pickle errors from your latest gist. So that's all good! From aclark at aclark.net Sun Jul 29 21:47:04 2012 From: aclark at aclark.net (Alex Clark) Date: Sun, 29 Jul 2012 21:47:04 -0400 Subject: [Numpy-discussion] ANN: pythonpackages.com beta In-Reply-To: References: Message-ID: Hi Fred, On 7/29/12 8:27 PM, Fr?d?ric Bastien wrote: > Hi, > > Do you have some help on installing an optimized BLAS on Windows, > MacOS and/or linux? That is what is the most complicated part when > installing our packages. No but I'd be willing to consider helping with this, can you open a ticket here with some more details about the problem? 
- https://bitbucket.org/pythonpackages/pythonpackages.com/issues/new IIUC an "optimized BLAS" is some shared library that makes numpy's operations peform better? Alex > > Fred > > On Sat, Jul 28, 2012 at 7:25 PM, Alex Clark wrote: >> Hi NumPy folks, >> >> >> I am reaching out to various Python-related programming communities in >> order to offer new help packaging your software. >> >> If you have ever struggled with packaging and releasing Python software >> (e.g. to PyPI), please check out this service: >> >> >> - http://pythonpackages.com >> >> >> The basic idea is to automate packaging by checking out code, testing, >> and uploading (e.g. to PyPI) all through the web, as explained in this >> introduction: >> >> >> - http://docs.pythonpackages.com/en/latest/introduction.html >> >> >> Also, I will be available to answer your Python packaging questions most >> days/nights in #pythonpackages on irc.freenode.net. Hope to meet/talk >> with all of you soon. >> >> >> >> Alex >> >> >> >> -- >> Alex Clark ? http://pythonpackages.com/ONE_CLICK >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Alex Clark ? http://pythonpackages.com/ONE_CLICK From ronan.lamy at gmail.com Sun Jul 29 23:57:47 2012 From: ronan.lamy at gmail.com (Ronan Lamy) Date: Mon, 30 Jul 2012 04:57:47 +0100 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: <1343610023.2223.156.camel@ronan-desktop> References: <20120728093622.GA27387@sleipnir.bytereef.org> <20120728181956.GA30702@sleipnir.bytereef.org> <1343597223.2223.131.camel@ronan-desktop> <1343610023.2223.156.camel@ronan-desktop> Message-ID: <1343620667.2223.160.camel@ronan-desktop> Le lundi 30 juillet 2012 ? 02:00 +0100, Ronan Lamy a ?crit : > > Anyway, I managed to compile (by blanking > numpy/distutils/command/__init__.py) and to run the tests. I only see > the 2 pickle errors from your latest gist. So that's all good! And the cause of these errors is that running the test suite somehow corrupts Python's internal cache of bytes objects, causing the following: >>> b'\x01XXX'[0:1] b'\xbb' From hodge at stsci.edu Mon Jul 30 09:30:46 2012 From: hodge at stsci.edu (Phil Hodge) Date: Mon, 30 Jul 2012 09:30:46 -0400 Subject: [Numpy-discussion] bug in numpy.where? In-Reply-To: <5012F301.5020309@ais.uni-bonn.de> References: <50118D89.3090001@stsci.edu> <5012F301.5020309@ais.uni-bonn.de> Message-ID: <50168C86.9020704@stsci.edu> On 07/27/2012 03:58 PM, Andreas Mueller wrote: > Hi Everybody. > The bug is that no error is raised, right? > The docs say > > where(condition, [x, y]) > > x, y : array_like, optional > Values from which to choose. `x` and `y` need to have the same > shape as `condition` > > In the example you gave, x was a scalar. net.max() returns an array: >>> print type(net.max()) That was the reason I cast it to a float to check that that did result in the correct behavior for `where`. Phil From robert.kern at gmail.com Mon Jul 30 09:51:42 2012 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 30 Jul 2012 14:51:42 +0100 Subject: [Numpy-discussion] bug in numpy.where? In-Reply-To: <50168C86.9020704@stsci.edu> References: <50118D89.3090001@stsci.edu> <5012F301.5020309@ais.uni-bonn.de> <50168C86.9020704@stsci.edu> Message-ID: On Mon, Jul 30, 2012 at 2:30 PM, Phil Hodge wrote: > On 07/27/2012 03:58 PM, Andreas Mueller wrote: >> Hi Everybody. >> The bug is that no error is raised, right? 
>> The docs say >> >> where(condition, [x, y]) >> >> x, y : array_like, optional >> Values from which to choose. `x` and `y` need to have the same >> shape as `condition` >> >> In the example you gave, x was a scalar. > > net.max() returns an array: > > >>> print type(net.max()) > No, that's a scalar. The type would be numpy.ndarray if it were an array. -- Robert Kern From travis at continuum.io Mon Jul 30 10:53:13 2012 From: travis at continuum.io (Travis Oliphant) Date: Mon, 30 Jul 2012 09:53:13 -0500 Subject: [Numpy-discussion] bug in numpy.where? In-Reply-To: <50118D89.3090001@stsci.edu> References: <50118D89.3090001@stsci.edu> Message-ID: <3B8AA7C9-2E13-4266-8899-F314F40D01A7@continuum.io> Can you file a bug report on Github's issue tracker? Thanks, -Travis On Jul 26, 2012, at 1:33 PM, Phil Hodge wrote: > On a Linux machine: > >> uname -srvop > Linux 2.6.18-308.8.2.el5 #1 SMP Tue May 29 11:54:17 EDT 2012 x86_64 > GNU/Linux > > this example shows an apparent problem with the where function: > > Python 2.7.1 (r271:86832, Dec 21 2010, 11:19:43) > [GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> import numpy as np >>>> print np.__version__ > 1.5.1 >>>> net = np.zeros(3, dtype='>f4') >>>> net[1] = 0.00458849 >>>> net[2] = 0.605202 >>>> max_net = net.max() >>>> test = np.where(net <= 0., max_net, net) >>>> print test > [ -2.23910537e-35 4.58848989e-03 6.05202019e-01] > > When I specified the dtype for net as '>f8', test[0] was > 3.46244974e+68. It worked as expected (i.e. test[0] should be 0.605202) > when I specified float(max_net) as the second argument to np.where. > > Phil > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From hodge at stsci.edu Mon Jul 30 11:08:38 2012 From: hodge at stsci.edu (Phil Hodge) Date: Mon, 30 Jul 2012 11:08:38 -0400 Subject: [Numpy-discussion] bug in numpy.where? In-Reply-To: <3B8AA7C9-2E13-4266-8899-F314F40D01A7@continuum.io> References: <50118D89.3090001@stsci.edu> <3B8AA7C9-2E13-4266-8899-F314F40D01A7@continuum.io> Message-ID: <5016A376.8080000@stsci.edu> On 07/30/2012 10:53 AM, Travis Oliphant wrote: > Can you file a bug report on Github's issue tracker? It's https://github.com/numpy/numpy/issues/369 Phil From ronan.lamy at gmail.com Mon Jul 30 12:10:53 2012 From: ronan.lamy at gmail.com (Ronan Lamy) Date: Mon, 30 Jul 2012 17:10:53 +0100 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: <1343620667.2223.160.camel@ronan-desktop> References: <20120728093622.GA27387@sleipnir.bytereef.org> <20120728181956.GA30702@sleipnir.bytereef.org> <1343597223.2223.131.camel@ronan-desktop> <1343610023.2223.156.camel@ronan-desktop> <1343620667.2223.160.camel@ronan-desktop> Message-ID: <1343664653.2223.166.camel@ronan-desktop> Le lundi 30 juillet 2012 ? 04:57 +0100, Ronan Lamy a ?crit : > Le lundi 30 juillet 2012 ? 02:00 +0100, Ronan Lamy a ?crit : > > > > > Anyway, I managed to compile (by blanking > > numpy/distutils/command/__init__.py) and to run the tests. I only see > > the 2 pickle errors from your latest gist. So that's all good! > > And the cause of these errors is that running the test suite somehow > corrupts Python's internal cache of bytes objects, causing the > following: > >>> b'\x01XXX'[0:1] > b'\xbb' The culprit is test_pickle_string_overwrite() in test_regression.py. 
The test actually tries to check for that kind of problem, but on Python 3, it only manages to trigger it without detecting it. Here's a simple way to reproduce the issue: >>> a = numpy.array([1], 'b') >>> b = pickle.loads(pickle.dumps(a)) >>> b[0] = 77 >>> b'\x01 '[0:1] b'M' Actually, this problem is probably quite old: I can see it in 1.6.1 w/ Python 3.2.3. 3.3 only makes it more visible. I'll open an issue on GitHub ASAP. From Wolfgang.Draxinger at physik.uni-muenchen.de Mon Jul 30 12:33:01 2012 From: Wolfgang.Draxinger at physik.uni-muenchen.de (Wolfgang Draxinger) Date: Mon, 30 Jul 2012 18:33:01 +0200 Subject: [Numpy-discussion] Updating SciPy from 0.9.0 to 0.10.1 triggers undesired behavior in NumPy 1.6.2 Message-ID: <20120730183301.4d5f172b@gar-ws-bl08.garching.physik.uni-muenchen.de> Hi, first my apologies for crossposting this to two maillists, but as both projects are affected I think this is in order. Like the subject says I encountered some undesired behavior in the interaction of SciPy with NumPy. Using the "old" SciPy version 0.9.0 everything works fine and smooth. But upgrading to SciPy-0.10.1 triggers some ValueError in numpy.histogram2d of NumPy-1.6.2 when executing one of my programs. I developed a numerical evaluation system for our detectors here. One of the key operations is determining the distribution of some 2-dimensional variable space based on the values found in the image delivered by the detector, where each pixel has associated values for the target variables. This goes something like the following ABdist, binsA, binsB = numpy.histogram2d( B_yz.ravel(), A_yz.ravel(), [B_bins, A_bins], weights=image.ravel() ) The bins parameter can be either [int, int] or [array, array], that makes no difference in the outcome. The mappings A_yz and B_yz are created using scipy.interpolate.griddata. We have a list of pairs of pairs which are determined by measurement. Basically in the calibration step we vary variables A,B and store at which Y,Z we get the corresponding signal. So essentially this is a (A,B) -> (Y,Z) mapping. In the region of interest is has a bijective subset that's also smooth. However the original data also contains areas where (Y,Z) has no corresponding (A,B) or where multiple (A,B) map to the same (Y,Z); like said, those lie outside the RoI. For our measurements we need to reverse this process, i.e. we want to do (Y,Z) -> (A,B). So I use griddata to evaluate a discrete reversal for this mapping, of the same dimensions that the to be evaluated image has: gry, grz = numpy.mgrid[self.y_lim[0]:self.y_lim[1]:self.y_res*1j, self.z_lim[0]:self.z_lim[1]:self.z_res*1j] # for whatever reason I have to do the following # assigning to evalgrid directly breaks the program. evalgrid = (gry, grz) points = (Y.ravel(), Z.ravel()) def gd(a): return scinp.griddata( points, a.ravel(), evalgrid, method='cubic' ) A_yz = gd(A) B_yz = gd(B) where A,B,Y,Z have the same dimensions and are the ordered lists/arrays of the scalar values of the two sets mapped between. As you can see, this approach does also involve the elements of the sets, which are not mapped bijectively. As lying outside the convex boundary or not being properly interpolatable they should receive the fill value. As long as I stay with SciPy-0.9.0 everything works fine. However after upgrading to SciPy-0.10.1 the histogram2d step fails with a ValueError. The version of NumPy is 1.6.2 for both cases. 
/usr/lib64/python2.7/site-packages/numpy/ma/core.py:772: RuntimeWarning: invalid value encountered in absolute return umath.absolute(a) * self.tolerance >= umath.absolute(b) Traceback (most recent call last): File "./ephi.py", line 71, in ABdist, binsA, binsB = numpy.histogram2d(B_yz.ravel(), A_yz.ravel(), [B_bins, A_bins], weights=image.ravel()) File "/usr/lib64/python2.7/site-packages/numpy/lib/twodim_base.py", line 615, in histogram2d hist, edges = histogramdd([x,y], bins, range, normed, weights) File "/usr/lib64/python2.7/site-packages/numpy/lib/function_base.py", line 357, in histogramdd decimal = int(-log10(mindiff)) + 6 ValueError: cannot convert float NaN to integer Any ideas on this? Regards, Wolfgang -- Fakult?t f?r Physik, LMU M?nchen Beschleunigerlabor/MLL Am Coulombwall 6 85748 Garching Deutschland Tel: +49-89-289-14286 Fax: +49-89-289-14280 From ronan.lamy at gmail.com Mon Jul 30 13:04:08 2012 From: ronan.lamy at gmail.com (Ronan Lamy) Date: Mon, 30 Jul 2012 18:04:08 +0100 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: <1343664653.2223.166.camel@ronan-desktop> References: <20120728093622.GA27387@sleipnir.bytereef.org> <20120728181956.GA30702@sleipnir.bytereef.org> <1343597223.2223.131.camel@ronan-desktop> <1343610023.2223.156.camel@ronan-desktop> <1343620667.2223.160.camel@ronan-desktop> <1343664653.2223.166.camel@ronan-desktop> Message-ID: <1343667848.2223.167.camel@ronan-desktop> Le lundi 30 juillet 2012 ? 17:10 +0100, Ronan Lamy a ?crit : > Le lundi 30 juillet 2012 ? 04:57 +0100, Ronan Lamy a ?crit : > > Le lundi 30 juillet 2012 ? 02:00 +0100, Ronan Lamy a ?crit : > > > > > > > > Anyway, I managed to compile (by blanking > > > numpy/distutils/command/__init__.py) and to run the tests. I only see > > > the 2 pickle errors from your latest gist. So that's all good! > > > > And the cause of these errors is that running the test suite somehow > > corrupts Python's internal cache of bytes objects, causing the > > following: > > >>> b'\x01XXX'[0:1] > > b'\xbb' > > The culprit is test_pickle_string_overwrite() in test_regression.py. The > test actually tries to check for that kind of problem, but on Python 3, > it only manages to trigger it without detecting it. Here's a simple way > to reproduce the issue: > > >>> a = numpy.array([1], 'b') > >>> b = pickle.loads(pickle.dumps(a)) > >>> b[0] = 77 > >>> b'\x01 '[0:1] > b'M' > > Actually, this problem is probably quite old: I can see it in 1.6.1 w/ > Python 3.2.3. 3.3 only makes it more visible. > > I'll open an issue on GitHub ASAP. > https://github.com/numpy/numpy/issues/370 From thouis at gmail.com Mon Jul 30 13:52:22 2012 From: thouis at gmail.com (Thouis (Ray) Jones) Date: Mon, 30 Jul 2012 19:52:22 +0200 Subject: [Numpy-discussion] Github notifications and trac-to-github migration In-Reply-To: References: Message-ID: On Wed, Jul 25, 2012 at 7:36 PM, Ralf Gommers wrote: > [...] > It looks like you want to discard the Milestones, except for the 1.7.0, > 1.8.0 and 2.0.0 ones. Why not keep all of them? These were the only ones defined in the current numpy repository. I'll add code to create any missing ones. > I think the bug/enhancement/task label is missing. Noted. > This shouldn't work I think (and I did a short test): > def t2g_markup(s): > return s.replace('{{{', "'''").replace('}}}', "'''") > The {{{}}} part should be replaced by ```` for inline markup and > indented for multi-line comments in Github I believe, then it will render > the same way. Okay. I'll fix this. 
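Probably something along these lines (untested sketch; it turns the
single-line form into inline code and the multi-line form into fenced
blocks, which GitHub renders as code just like indentation would):

import re

def t2g_markup(s):
    # {{{...}}} on a single line -> `...` (inline code)
    s = re.sub(r'\{\{\{(.+?)\}\}\}', r'`\1`', s)
    # any remaining (multi-line) {{{ / }}} pairs -> fenced code blocks
    return s.replace('{{{', '```').replace('}}}', '```')
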
My plan is to do a test import of all the issues to the test repository (without any @notifications) when I return from vacation in late August or early September. It should be possible to run through several iterations of this to make sure everything is behaving as expected before the actual import. Thanks for the feedback. Thanks also to Aric and Jordi for their pre-alpha testing and suggestions. Ray Jones From ondrej.certik at gmail.com Mon Jul 30 14:07:02 2012 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Mon, 30 Jul 2012 11:07:02 -0700 Subject: [Numpy-discussion] Status of NumPy and Python 3.3 In-Reply-To: <1343667848.2223.167.camel@ronan-desktop> References: <20120728093622.GA27387@sleipnir.bytereef.org> <20120728181956.GA30702@sleipnir.bytereef.org> <1343597223.2223.131.camel@ronan-desktop> <1343610023.2223.156.camel@ronan-desktop> <1343620667.2223.160.camel@ronan-desktop> <1343664653.2223.166.camel@ronan-desktop> <1343667848.2223.167.camel@ronan-desktop> Message-ID: On Mon, Jul 30, 2012 at 10:04 AM, Ronan Lamy wrote: > Le lundi 30 juillet 2012 ? 17:10 +0100, Ronan Lamy a ?crit : >> Le lundi 30 juillet 2012 ? 04:57 +0100, Ronan Lamy a ?crit : >> > Le lundi 30 juillet 2012 ? 02:00 +0100, Ronan Lamy a ?crit : >> > >> > > >> > > Anyway, I managed to compile (by blanking >> > > numpy/distutils/command/__init__.py) and to run the tests. I only see >> > > the 2 pickle errors from your latest gist. So that's all good! >> > >> > And the cause of these errors is that running the test suite somehow >> > corrupts Python's internal cache of bytes objects, causing the >> > following: >> > >>> b'\x01XXX'[0:1] >> > b'\xbb' >> >> The culprit is test_pickle_string_overwrite() in test_regression.py. The >> test actually tries to check for that kind of problem, but on Python 3, >> it only manages to trigger it without detecting it. Here's a simple way >> to reproduce the issue: >> >> >>> a = numpy.array([1], 'b') >> >>> b = pickle.loads(pickle.dumps(a)) >> >>> b[0] = 77 >> >>> b'\x01 '[0:1] >> b'M' >> >> Actually, this problem is probably quite old: I can see it in 1.6.1 w/ >> Python 3.2.3. 3.3 only makes it more visible. >> >> I'll open an issue on GitHub ASAP. >> > https://github.com/numpy/numpy/issues/370 Thanks Ronan, nice work! Since you looked into this -- do you know a way to fix this? (Both NumPy and the test.) Ondrej From vlastimil.brom at gmail.com Mon Jul 30 15:33:13 2012 From: vlastimil.brom at gmail.com (Vlastimil Brom) Date: Mon, 30 Jul 2012 21:33:13 +0200 Subject: [Numpy-discussion] array slicing questions In-Reply-To: References: Message-ID: Hi all, I'd like to ask for some hints or advice regarding the usage of numpy.array and especially slicing. I only recently tried numpy and was impressed by the speedup in some parts of the code, hence I suspect, that I might miss some other oportunities in this area. I currently use the following code for a simple visualisation of the search matches within the text, the arrays are generally much larger than the sample - the texts size is generally hundreds of kilobytes up to a few MB - with an index position for each character. First there is a list of spans(obtained form the regex match objects), the respective character indices in between these slices should be set to 1: >>> import numpy >>> characters_matches = numpy.zeros(10) >>> matches_spans = numpy.array([[2,4], [5,9]]) >>> for start, stop in matches_spans: ... characters_matches[start:stop] = 1 ... 
>>> characters_matches array([ 0., 0., 1., 1., 0., 1., 1., 1., 1., 0.]) Is there maybe a way tu achieve this in a numpy-only way - without the python loop? (I got the impression, the powerful slicing capabilities could make it possible, bud haven't found this kind of solution.) In the next piece of code all the character positions are evaluated with their "neighbourhood" and a kind of running proportions of the matched text parts are computed (the checks_distance could be generally up to the order of the half the text length, usually less : >>> >>> check_distance = 1 >>> floating_checks_proportions = [] >>> for i in numpy.arange(len(characters_matches)): ... lo = i - check_distance ... if lo < 0: ... lo = None ... hi = i + check_distance + 1 ... checked_sublist = characters_matches[lo:hi] ... proportion = (checked_sublist.sum() / (check_distance * 2 + 1.0)) ... floating_checks_proportions.append(proportion) ... >>> floating_checks_proportions [0.0, 0.33333333333333331, 0.66666666666666663, 0.66666666666666663, 0.66666666666666663, 0.66666666666666663, 1.0, 1.0, 0.66666666666666663, 0.33333333333333331] >>> I'd like to ask about the possible better approaches, as it doesn't look very elegant to me, and I obviously don't know the implications or possible drawbacks of numpy arrays in some scenarios. the pattern for i in range(len(...)): is usually considered inadequate in python, but what should be used in this case as the indices are primarily needed? is something to be gained or lost using (x)range or np.arange as the python loop is (probably?) inevitable anyway? Is there some mor elegant way to check for the "underflowing" lower bound "lo" to replace with None? Is it significant, which container is used to collect the results of the computation in the python loop - i.e. python list or a numpy array? (Could possibly matplotlib cooperate better with either container?) And of course, are there maybe other things, which should be made better/differently? (using Numpy 1.6.2, python 2.7.3, win XP) Thanks in advance for any hints or suggestions, regards, Vlastimil Brom From e.antero.tammi at gmail.com Mon Jul 30 17:59:12 2012 From: e.antero.tammi at gmail.com (eat) Date: Tue, 31 Jul 2012 00:59:12 +0300 Subject: [Numpy-discussion] array slicing questions In-Reply-To: References: Message-ID: Hi, A partial answer to your questions: On Mon, Jul 30, 2012 at 10:33 PM, Vlastimil Brom wrote: > Hi all, > I'd like to ask for some hints or advice regarding the usage of > numpy.array and especially slicing. > > I only recently tried numpy and was impressed by the speedup in some > parts of the code, hence I suspect, that I might miss some other > oportunities in this area. > > I currently use the following code for a simple visualisation of the > search matches within the text, the arrays are generally much larger > than the sample - the texts size is generally hundreds of kilobytes up > to a few MB - with an index position for each character. > First there is a list of spans(obtained form the regex match objects), > the respective character indices in between these slices should be set > to 1: > > >>> import numpy > >>> characters_matches = numpy.zeros(10) > >>> matches_spans = numpy.array([[2,4], [5,9]]) > >>> for start, stop in matches_spans: > ... characters_matches[start:stop] = 1 > ... > >>> characters_matches > array([ 0., 0., 1., 1., 0., 1., 1., 1., 1., 0.]) > > Is there maybe a way tu achieve this in a numpy-only way - without the > python loop? 
> (I got the impression, the powerful slicing capabilities could make it > possible, bud haven't found this kind of solution.) > > > In the next piece of code all the character positions are evaluated > with their "neighbourhood" and a kind of running proportions of the > matched text parts are computed (the checks_distance could be > generally up to the order of the half the text length, usually less : > > >>> > >>> check_distance = 1 > >>> floating_checks_proportions = [] > >>> for i in numpy.arange(len(characters_matches)): > ... lo = i - check_distance > ... if lo < 0: > ... lo = None > ... hi = i + check_distance + 1 > ... checked_sublist = characters_matches[lo:hi] > ... proportion = (checked_sublist.sum() / (check_distance * 2 + 1.0)) > ... floating_checks_proportions.append(proportion) > ... > >>> floating_checks_proportions > [0.0, 0.33333333333333331, 0.66666666666666663, 0.66666666666666663, > 0.66666666666666663, 0.66666666666666663, 1.0, 1.0, > 0.66666666666666663, 0.33333333333333331] > >>> > Define a function for proportions: from numpy import r_ from numpy.lib.stride_tricks import as_strided as ast def proportions(matches, distance= 1): cd, cd2p1, s= distance, 2* distance+ 1, matches.strides[0] # pad m= r_[[0.]* cd, matches, [0.]* cd] # create a suitable view m= ast(m, shape= (m.shape[0], cd2p1), strides= (s, s)) # average return m[:-2* cd].sum(1)/ cd2p1 and use it like: In []: matches Out[]: array([ 0., 0., 1., 1., 0., 1., 1., 1., 1., 0.]) In []: proportions(matches).round(2) Out[]: array([ 0. , 0.33, 0.67, 0.67, 0.67, 0.67, 1. , 1. , 0.67, 0.33]) In []: proportions(matches, 5).round(2) Out[]: array([ 0.27, 0.36, 0.45, 0.55, 0.55, 0.55, 0.55, 0.55, 0.45, 0.36]) > > I'd like to ask about the possible better approaches, as it doesn't > look very elegant to me, and I obviously don't know the implications > or possible drawbacks of numpy arrays in some scenarios. > > the pattern > for i in range(len(...)): is usually considered inadequate in python, > but what should be used in this case as the indices are primarily > needed? > is something to be gained or lost using (x)range or np.arange as the > python loop is (probably?) inevitable anyway? > Here np.arange(.) will create a new array and potentially wasting memory if it's not otherwise used. IMO nothing wrong looping with xrange(.) (if you really need to loop ;). > Is there some mor elegant way to check for the "underflowing" lower > bound "lo" to replace with None? > > Is it significant, which container is used to collect the results of > the computation in the python loop - i.e. python list or a numpy > array? > (Could possibly matplotlib cooperate better with either container?) > > And of course, are there maybe other things, which should be made > better/differently? > > (using Numpy 1.6.2, python 2.7.3, win XP) > My 2 cents, -eat > Thanks in advance for any hints or suggestions, > regards, > Vlastimil Brom > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From doutriaux1 at llnl.gov Mon Jul 30 18:45:08 2012 From: doutriaux1 at llnl.gov (Doutriaux, Charles) Date: Mon, 30 Jul 2012 15:45:08 -0700 Subject: [Numpy-discussion] SWIG Numpy and C++ extensions Message-ID: Hi, I have wrapped a c++ code with SWIG. Now that code used to read input from ASCII files. I'm trying to replace that part with an input coming from numpy. 
I would rather not use setup.py if I have to (but if I can't avoid it, fine).
I'm looking at SWIG/numpy tutorials; it looks like you need to do something like %apply.
2 questions:
1- How do I use "apply" for class functions: %apply (bla) myobject::foo ?
2- that's ok if your C++ deals with arrays, but what if I actually want to
receive the Numpy object so that I can manipulate it directly (or if, for
example, the array isn't contiguous in memory)?
A "dummy" example of the foo function I'd like to wrap:

void FOO::fooNumpy(PyArrayObject *nparray) {
    int j;
    for(j=0;j<nparray->nd;j++) {
        printf("Ok array dim %i has length: %i\n",j,nparray->dimensions[j]);
    }
}

Thanks,
C.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ronan.lamy at gmail.com Mon Jul 30 20:00:31 2012
From: ronan.lamy at gmail.com (Ronan Lamy)
Date: Tue, 31 Jul 2012 01:00:31 +0100
Subject: [Numpy-discussion] Status of NumPy and Python 3.3
In-Reply-To:
References: <20120728093622.GA27387@sleipnir.bytereef.org> <20120728181956.GA30702@sleipnir.bytereef.org> <1343597223.2223.131.camel@ronan-desktop> <1343610023.2223.156.camel@ronan-desktop> <1343620667.2223.160.camel@ronan-desktop> <1343664653.2223.166.camel@ronan-desktop> <1343667848.2223.167.camel@ronan-desktop>
Message-ID: <1343692831.2223.171.camel@ronan-desktop>

On Monday, 30 July 2012 at 11:07 -0700, Ondřej Čertík wrote:
> On Mon, Jul 30, 2012 at 10:04 AM, Ronan Lamy wrote:
> > On Monday, 30 July 2012 at 17:10 +0100, Ronan Lamy wrote:
> >> On Monday, 30 July 2012 at 04:57 +0100, Ronan Lamy wrote:
> >> > On Monday, 30 July 2012 at 02:00 +0100, Ronan Lamy wrote:
> >> >
> >> > >
> >> > > Anyway, I managed to compile (by blanking
> >> > > numpy/distutils/command/__init__.py) and to run the tests. I only see
> >> > > the 2 pickle errors from your latest gist. So that's all good!
> >> >
> >> > And the cause of these errors is that running the test suite somehow
> >> > corrupts Python's internal cache of bytes objects, causing the
> >> > following:
> >> > >>> b'\x01XXX'[0:1]
> >> > b'\xbb'
> >>
> >> The culprit is test_pickle_string_overwrite() in test_regression.py. The
> >> test actually tries to check for that kind of problem, but on Python 3,
> >> it only manages to trigger it without detecting it. Here's a simple way
> >> to reproduce the issue:
> >>
> >> >>> a = numpy.array([1], 'b')
> >> >>> b = pickle.loads(pickle.dumps(a))
> >> >>> b[0] = 77
> >> >>> b'\x01 '[0:1]
> >> b'M'
> >>
> >> Actually, this problem is probably quite old: I can see it in 1.6.1 w/
> >> Python 3.2.3. 3.3 only makes it more visible.
> >>
> >> I'll open an issue on GitHub ASAP.
> >>
> > https://github.com/numpy/numpy/issues/370
>
> Thanks Ronan, nice work!
>
> Since you looked into this -- do you know a way to fix this? (Both
> NumPy and the test.)

Pauli found out how to fix the code, so I'll try to send a PR tonight.

From vlastimil.brom at gmail.com Tue Jul 31 03:23:19 2012
From: vlastimil.brom at gmail.com (Vlastimil Brom)
Date: Tue, 31 Jul 2012 09:23:19 +0200
Subject: [Numpy-discussion] array slicing questions
In-Reply-To:
References:
Message-ID:

2012/7/30 eat :
> Hi,
>
> A partial answer to your questions:
>
> On Mon, Jul 30, 2012 at 10:33 PM, Vlastimil Brom
> wrote:
>>
>> Hi all,
>> I'd like to ask for some hints or advice regarding the usage of
>> numpy.array and especially slicing.
>>
>> I only recently tried numpy and was impressed by the speedup in some
>> parts of the code, hence I suspect, that I might miss some other
>> oportunities in this area.
>> >> I currently use the following code for a simple visualisation of the >> search matches within the text, the arrays are generally much larger >> than the sample - the texts size is generally hundreds of kilobytes up >> to a few MB - with an index position for each character. >> First there is a list of spans(obtained form the regex match objects), >> the respective character indices in between these slices should be set >> to 1: >> >> >>> import numpy >> >>> characters_matches = numpy.zeros(10) >> >>> matches_spans = numpy.array([[2,4], [5,9]]) >> >>> for start, stop in matches_spans: >> ... characters_matches[start:stop] = 1 >> ... >> >>> characters_matches >> array([ 0., 0., 1., 1., 0., 1., 1., 1., 1., 0.]) >> >> Is there maybe a way tu achieve this in a numpy-only way - without the >> python loop? >> (I got the impression, the powerful slicing capabilities could make it >> possible, bud haven't found this kind of solution.) >> >> >> In the next piece of code all the character positions are evaluated >> with their "neighbourhood" and a kind of running proportions of the >> matched text parts are computed (the checks_distance could be >> generally up to the order of the half the text length, usually less : >> >> >>> >> >>> check_distance = 1 >> >>> floating_checks_proportions = [] >> >>> for i in numpy.arange(len(characters_matches)): >> ... lo = i - check_distance >> ... if lo < 0: >> ... lo = None >> ... hi = i + check_distance + 1 >> ... checked_sublist = characters_matches[lo:hi] >> ... proportion = (checked_sublist.sum() / (check_distance * 2 + 1.0)) >> ... floating_checks_proportions.append(proportion) >> ... >> >>> floating_checks_proportions >> [0.0, 0.33333333333333331, 0.66666666666666663, 0.66666666666666663, >> 0.66666666666666663, 0.66666666666666663, 1.0, 1.0, >> 0.66666666666666663, 0.33333333333333331] >> >>> > > Define a function for proportions: > > from numpy import r_ > > from numpy.lib.stride_tricks import as_strided as ast > > def proportions(matches, distance= 1): > > cd, cd2p1, s= distance, 2* distance+ 1, matches.strides[0] > > # pad > > m= r_[[0.]* cd, matches, [0.]* cd] > > # create a suitable view > > m= ast(m, shape= (m.shape[0], cd2p1), strides= (s, s)) > > # average > > return m[:-2* cd].sum(1)/ cd2p1 > and use it like: > In []: matches > Out[]: array([ 0., 0., 1., 1., 0., 1., 1., 1., 1., 0.]) > > In []: proportions(matches).round(2) > Out[]: array([ 0. , 0.33, 0.67, 0.67, 0.67, 0.67, 1. , 1. , 0.67, > 0.33]) > In []: proportions(matches, 5).round(2) > Out[]: array([ 0.27, 0.36, 0.45, 0.55, 0.55, 0.55, 0.55, 0.55, 0.45, > 0.36]) >> >> >> I'd like to ask about the possible better approaches, as it doesn't >> look very elegant to me, and I obviously don't know the implications >> or possible drawbacks of numpy arrays in some scenarios. >> >> the pattern >> for i in range(len(...)): is usually considered inadequate in python, >> but what should be used in this case as the indices are primarily >> needed? >> is something to be gained or lost using (x)range or np.arange as the >> python loop is (probably?) inevitable anyway? > > Here np.arange(.) will create a new array and potentially wasting memory if > it's not otherwise used. IMO nothing wrong looping with xrange(.) (if you > really need to loop ;). >> >> Is there some mor elegant way to check for the "underflowing" lower >> bound "lo" to replace with None? >> >> Is it significant, which container is used to collect the results of >> the computation in the python loop - i.e. 
python list or a numpy >> array? >> (Could possibly matplotlib cooperate better with either container?) >> >> And of course, are there maybe other things, which should be made >> better/differently? >> >> (using Numpy 1.6.2, python 2.7.3, win XP) > > > My 2 cents, > -eat >> >> Thanks in advance for any hints or suggestions, >> regards, >> Vlastimil Brom >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > Hi, thank you very much for your suggestions! do I understand it correctly, that I have to special-case the function for distance = 0 (which should return the matches themselves without recalculation)? However, more importantly, I am getting a ValueError for some larger, (but not completely unreasonable) "distance" >>> proportions(matches, distance= 8190) Traceback (most recent call last): File "", line 1, in File "", line 11, in proportions File "C:\Python27\lib\site-packages\numpy\lib\stride_tricks.py", line 28, in as_strided return np.asarray(DummyArray(interface, base=x)) File "C:\Python27\lib\site-packages\numpy\core\numeric.py", line 235, in asarray return array(a, dtype, copy=False, order=order) ValueError: array is too big. >>> the distance= 8189 was the largest which worked in this snippet, however, it might be data-dependent, as I got this error as well e.g. for distance=4529 for a 20k text. Is this implementation-limited, or could it be solved in some alternative way which wouldn't have such limits (up to the order of, say, millions)? Thanks again regards vbr From e.antero.tammi at gmail.com Tue Jul 31 09:23:34 2012 From: e.antero.tammi at gmail.com (eat) Date: Tue, 31 Jul 2012 16:23:34 +0300 Subject: [Numpy-discussion] array slicing questions In-Reply-To: References: Message-ID: Hi, On Tue, Jul 31, 2012 at 10:23 AM, Vlastimil Brom wrote: > 2012/7/30 eat : > > Hi, > > > > A partial answer to your questions: > > > > On Mon, Jul 30, 2012 at 10:33 PM, Vlastimil Brom < > vlastimil.brom at gmail.com> > > wrote: > >> > >> Hi all, > >> I'd like to ask for some hints or advice regarding the usage of > >> numpy.array and especially slicing. > >> > >> I only recently tried numpy and was impressed by the speedup in some > >> parts of the code, hence I suspect, that I might miss some other > >> oportunities in this area. > >> > >> I currently use the following code for a simple visualisation of the > >> search matches within the text, the arrays are generally much larger > >> than the sample - the texts size is generally hundreds of kilobytes up > >> to a few MB - with an index position for each character. > >> First there is a list of spans(obtained form the regex match objects), > >> the respective character indices in between these slices should be set > >> to 1: > >> > >> >>> import numpy > >> >>> characters_matches = numpy.zeros(10) > >> >>> matches_spans = numpy.array([[2,4], [5,9]]) > >> >>> for start, stop in matches_spans: > >> ... characters_matches[start:stop] = 1 > >> ... > >> >>> characters_matches > >> array([ 0., 0., 1., 1., 0., 1., 1., 1., 1., 0.]) > >> > >> Is there maybe a way tu achieve this in a numpy-only way - without the > >> python loop? > >> (I got the impression, the powerful slicing capabilities could make it > >> possible, bud haven't found this kind of solution.) 
> >> > >> > >> In the next piece of code all the character positions are evaluated > >> with their "neighbourhood" and a kind of running proportions of the > >> matched text parts are computed (the checks_distance could be > >> generally up to the order of the half the text length, usually less : > >> > >> >>> > >> >>> check_distance = 1 > >> >>> floating_checks_proportions = [] > >> >>> for i in numpy.arange(len(characters_matches)): > >> ... lo = i - check_distance > >> ... if lo < 0: > >> ... lo = None > >> ... hi = i + check_distance + 1 > >> ... checked_sublist = characters_matches[lo:hi] > >> ... proportion = (checked_sublist.sum() / (check_distance * 2 + > 1.0)) > >> ... floating_checks_proportions.append(proportion) > >> ... > >> >>> floating_checks_proportions > >> [0.0, 0.33333333333333331, 0.66666666666666663, 0.66666666666666663, > >> 0.66666666666666663, 0.66666666666666663, 1.0, 1.0, > >> 0.66666666666666663, 0.33333333333333331] > >> >>> > > > > Define a function for proportions: > > > > from numpy import r_ > > > > from numpy.lib.stride_tricks import as_strided as ast > > > > def proportions(matches, distance= 1): > > > > cd, cd2p1, s= distance, 2* distance+ 1, matches.strides[0] > > > > # pad > > > > m= r_[[0.]* cd, matches, [0.]* cd] > > > > # create a suitable view > > > > m= ast(m, shape= (m.shape[0], cd2p1), strides= (s, s)) > > > > # average > > > > return m[:-2* cd].sum(1)/ cd2p1 > > and use it like: > > In []: matches > > Out[]: array([ 0., 0., 1., 1., 0., 1., 1., 1., 1., 0.]) > > > > In []: proportions(matches).round(2) > > Out[]: array([ 0. , 0.33, 0.67, 0.67, 0.67, 0.67, 1. , 1. , > 0.67, > > 0.33]) > > In []: proportions(matches, 5).round(2) > > Out[]: array([ 0.27, 0.36, 0.45, 0.55, 0.55, 0.55, 0.55, 0.55, > 0.45, > > 0.36]) > >> > >> > >> I'd like to ask about the possible better approaches, as it doesn't > >> look very elegant to me, and I obviously don't know the implications > >> or possible drawbacks of numpy arrays in some scenarios. > >> > >> the pattern > >> for i in range(len(...)): is usually considered inadequate in python, > >> but what should be used in this case as the indices are primarily > >> needed? > >> is something to be gained or lost using (x)range or np.arange as the > >> python loop is (probably?) inevitable anyway? > > > > Here np.arange(.) will create a new array and potentially wasting memory > if > > it's not otherwise used. IMO nothing wrong looping with xrange(.) (if you > > really need to loop ;). > >> > >> Is there some mor elegant way to check for the "underflowing" lower > >> bound "lo" to replace with None? > >> > >> Is it significant, which container is used to collect the results of > >> the computation in the python loop - i.e. python list or a numpy > >> array? > >> (Could possibly matplotlib cooperate better with either container?) > >> > >> And of course, are there maybe other things, which should be made > >> better/differently? > >> > >> (using Numpy 1.6.2, python 2.7.3, win XP) > > > > > > My 2 cents, > > -eat > >> > >> Thanks in advance for any hints or suggestions, > >> regards, > >> Vlastimil Brom > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > Hi, > thank you very much for your suggestions! > > do I understand it correctly, that I have to special-case the function > for distance = 0 (which should return the matches themselves without > recalculation)? > Yes. 
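For instance, just a rough, untested sketch (the wrapper name is made up here, and it
relies on the proportions(.) defined earlier):

def proportions_any(matches, distance= 1):
    # distance= 0 means the 'neighbourhood' is only the character itself,
    # so the matches are already the proportions; hand back a float copy
    if distance == 0:
        return matches.astype(float)
    # otherwise use the strided version from the earlier post
    return proportions(matches, distance)
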
> > However, more importantly, I am getting a ValueError for some larger, > (but not completely unreasonable) "distance" > > >>> proportions(matches, distance= 8190) > Traceback (most recent call last): > File "", line 1, in > File "", line 11, in proportions > File "C:\Python27\lib\site-packages\numpy\lib\stride_tricks.py", > line 28, in as_strided > return np.asarray(DummyArray(interface, base=x)) > File "C:\Python27\lib\site-packages\numpy\core\numeric.py", line > 235, in asarray > return array(a, dtype, copy=False, order=order) > ValueError: array is too big. > >>> > > the distance= 8189 was the largest which worked in this snippet, > however, it might be data-dependent, as I got this error as well e.g. > for distance=4529 for a 20k text. > > Is this implementation-limited, or could it be solved in some > alternative way which wouldn't have such limits (up to the order of, > say, millions)? > Apparently ast(.) does not return a view of the original matches rather a copy of size (n* (2* distance+ 1)), thus you may run out of memory. Surely it can be solved up to millions of matches, but perhaps much slower speed. Regards, -eat > > Thanks again > regards > vbr > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vlastimil.brom at gmail.com Tue Jul 31 10:01:43 2012 From: vlastimil.brom at gmail.com (Vlastimil Brom) Date: Tue, 31 Jul 2012 16:01:43 +0200 Subject: [Numpy-discussion] array slicing questions In-Reply-To: References: Message-ID: 2012/7/31 eat : > Hi, > > On Tue, Jul 31, 2012 at 10:23 AM, Vlastimil Brom > wrote: >> >> 2012/7/30 eat : >> > Hi, >> > >> > A partial answer to your questions: >> > >> > On Mon, Jul 30, 2012 at 10:33 PM, Vlastimil Brom >> > >> > wrote: >> >> >> >> Hi all, >> >> I'd like to ask for some hints or advice regarding the usage of >> >> numpy.array and especially slicing. >> >> >> >> I only recently tried numpy and was impressed by the speedup in some >> >> parts of the code, hence I suspect, that I might miss some other >> >> oportunities in this area. >> >> >> >> I currently use the following code for a simple visualisation of the >> >> search matches within the text, the arrays are generally much larger >> >> than the sample - the texts size is generally hundreds of kilobytes up >> >> to a few MB - with an index position for each character. >> >> First there is a list of spans(obtained form the regex match objects), >> >> the respective character indices in between these slices should be set >> >> to 1: >> >> >> >> >>> import numpy >> >> >>> characters_matches = numpy.zeros(10) >> >> >>> matches_spans = numpy.array([[2,4], [5,9]]) >> >> >>> for start, stop in matches_spans: >> >> ... characters_matches[start:stop] = 1 >> >> ... >> >> >>> characters_matches >> >> array([ 0., 0., 1., 1., 0., 1., 1., 1., 1., 0.]) >> >> >> >> Is there maybe a way tu achieve this in a numpy-only way - without the >> >> python loop? >> >> (I got the impression, the powerful slicing capabilities could make it >> >> possible, bud haven't found this kind of solution.) 
>> >> >> >> >> >> In the next piece of code all the character positions are evaluated >> >> with their "neighbourhood" and a kind of running proportions of the >> >> matched text parts are computed (the checks_distance could be >> >> generally up to the order of the half the text length, usually less : >> >> >> >> >>> >> >> >>> check_distance = 1 >> >> >>> floating_checks_proportions = [] >> >> >>> for i in numpy.arange(len(characters_matches)): >> >> ... lo = i - check_distance >> >> ... if lo < 0: >> >> ... lo = None >> >> ... hi = i + check_distance + 1 >> >> ... checked_sublist = characters_matches[lo:hi] >> >> ... proportion = (checked_sublist.sum() / (check_distance * 2 + >> >> 1.0)) >> >> ... floating_checks_proportions.append(proportion) >> >> ... >> >> >>> floating_checks_proportions >> >> [0.0, 0.33333333333333331, 0.66666666666666663, 0.66666666666666663, >> >> 0.66666666666666663, 0.66666666666666663, 1.0, 1.0, >> >> 0.66666666666666663, 0.33333333333333331] >> >> >>> >> > >> > Define a function for proportions: >> > >> > from numpy import r_ >> > >> > from numpy.lib.stride_tricks import as_strided as ast >> > >> > def proportions(matches, distance= 1): >> > >> > cd, cd2p1, s= distance, 2* distance+ 1, matches.strides[0] >> > >> > # pad >> > >> > m= r_[[0.]* cd, matches, [0.]* cd] >> > >> > # create a suitable view >> > >> > m= ast(m, shape= (m.shape[0], cd2p1), strides= (s, s)) >> > >> > # average >> > >> > return m[:-2* cd].sum(1)/ cd2p1 >> > and use it like: >> > In []: matches >> > Out[]: array([ 0., 0., 1., 1., 0., 1., 1., 1., 1., 0.]) >> > >> > In []: proportions(matches).round(2) >> > Out[]: array([ 0. , 0.33, 0.67, 0.67, 0.67, 0.67, 1. , 1. , >> > 0.67, >> > 0.33]) >> > In []: proportions(matches, 5).round(2) >> > Out[]: array([ 0.27, 0.36, 0.45, 0.55, 0.55, 0.55, 0.55, 0.55, >> > 0.45, >> > 0.36]) >> >> >> >> >> >> I'd like to ask about the possible better approaches, as it doesn't >> >> look very elegant to me, and I obviously don't know the implications >> >> or possible drawbacks of numpy arrays in some scenarios. >> >> >> >> the pattern >> >> for i in range(len(...)): is usually considered inadequate in python, >> >> but what should be used in this case as the indices are primarily >> >> needed? >> >> is something to be gained or lost using (x)range or np.arange as the >> >> python loop is (probably?) inevitable anyway? >> > >> > Here np.arange(.) will create a new array and potentially wasting memory >> > if >> > it's not otherwise used. IMO nothing wrong looping with xrange(.) (if >> > you >> > really need to loop ;). >> >> >> >> Is there some mor elegant way to check for the "underflowing" lower >> >> bound "lo" to replace with None? >> >> >> >> Is it significant, which container is used to collect the results of >> >> the computation in the python loop - i.e. python list or a numpy >> >> array? >> >> (Could possibly matplotlib cooperate better with either container?) >> >> >> >> And of course, are there maybe other things, which should be made >> >> better/differently? >> >> >> >> (using Numpy 1.6.2, python 2.7.3, win XP) >> > >> > >> > My 2 cents, >> > -eat >> >> >> >> Thanks in advance for any hints or suggestions, >> >> regards, >> >> Vlastimil Brom >> >> _______________________________________________ >> >> NumPy-Discussion mailing list >> >> NumPy-Discussion at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> Hi, >> thank you very much for your suggestions! 
>> >> do I understand it correctly, that I have to special-case the function >> for distance = 0 (which should return the matches themselves without >> recalculation)? > > Yes. >> >> >> However, more importantly, I am getting a ValueError for some larger, >> (but not completely unreasonable) "distance" >> >> >>> proportions(matches, distance= 8190) >> Traceback (most recent call last): >> File "", line 1, in >> File "", line 11, in proportions >> File "C:\Python27\lib\site-packages\numpy\lib\stride_tricks.py", >> line 28, in as_strided >> return np.asarray(DummyArray(interface, base=x)) >> File "C:\Python27\lib\site-packages\numpy\core\numeric.py", line >> 235, in asarray >> return array(a, dtype, copy=False, order=order) >> ValueError: array is too big. >> >>> >> >> the distance= 8189 was the largest which worked in this snippet, >> however, it might be data-dependent, as I got this error as well e.g. >> for distance=4529 for a 20k text. >> >> Is this implementation-limited, or could it be solved in some >> alternative way which wouldn't have such limits (up to the order of, >> say, millions)? > > Apparently ast(.) does not return a view of the original matches rather a > copy of size (n* (2* distance+ 1)), thus you may run out of memory. > > Surely it can be solved up to millions of matches, but perhaps much slower > speed. > > > Regards, > -eat >> >> >> Thanks again >> regards >> vbr >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > Thank you for the confirmation, I'll wait and see, whether the current speed isn't actually already acceptable for the most cases... I could already gain a speedup by using the array.sum() and other features, maybe I will find yet other possibilities. regards, vbr From e.antero.tammi at gmail.com Tue Jul 31 11:28:49 2012 From: e.antero.tammi at gmail.com (eat) Date: Tue, 31 Jul 2012 18:28:49 +0300 Subject: [Numpy-discussion] array slicing questions In-Reply-To: References: Message-ID: Hi, On Tue, Jul 31, 2012 at 5:01 PM, Vlastimil Brom wrote: > 2012/7/31 eat : > > Hi, > > > > On Tue, Jul 31, 2012 at 10:23 AM, Vlastimil Brom < > vlastimil.brom at gmail.com> > > wrote: > >> > >> 2012/7/30 eat : > >> > Hi, > >> > > >> > A partial answer to your questions: > >> > > >> > On Mon, Jul 30, 2012 at 10:33 PM, Vlastimil Brom > >> > > >> > wrote: > >> >> > >> >> Hi all, > >> >> I'd like to ask for some hints or advice regarding the usage of > >> >> numpy.array and especially slicing. > >> >> > >> >> I only recently tried numpy and was impressed by the speedup in some > >> >> parts of the code, hence I suspect, that I might miss some other > >> >> oportunities in this area. > >> >> > >> >> I currently use the following code for a simple visualisation of the > >> >> search matches within the text, the arrays are generally much larger > >> >> than the sample - the texts size is generally hundreds of kilobytes > up > >> >> to a few MB - with an index position for each character. > >> >> First there is a list of spans(obtained form the regex match > objects), > >> >> the respective character indices in between these slices should be > set > >> >> to 1: > >> >> > >> >> >>> import numpy > >> >> >>> characters_matches = numpy.zeros(10) > >> >> >>> matches_spans = numpy.array([[2,4], [5,9]]) > >> >> >>> for start, stop in matches_spans: > >> >> ... characters_matches[start:stop] = 1 > >> >> ... 
> >> >> >>> characters_matches > >> >> array([ 0., 0., 1., 1., 0., 1., 1., 1., 1., 0.]) > >> >> > >> >> Is there maybe a way tu achieve this in a numpy-only way - without > the > >> >> python loop? > >> >> (I got the impression, the powerful slicing capabilities could make > it > >> >> possible, bud haven't found this kind of solution.) > >> >> > >> >> > >> >> In the next piece of code all the character positions are evaluated > >> >> with their "neighbourhood" and a kind of running proportions of the > >> >> matched text parts are computed (the checks_distance could be > >> >> generally up to the order of the half the text length, usually less : > >> >> > >> >> >>> > >> >> >>> check_distance = 1 > >> >> >>> floating_checks_proportions = [] > >> >> >>> for i in numpy.arange(len(characters_matches)): > >> >> ... lo = i - check_distance > >> >> ... if lo < 0: > >> >> ... lo = None > >> >> ... hi = i + check_distance + 1 > >> >> ... checked_sublist = characters_matches[lo:hi] > >> >> ... proportion = (checked_sublist.sum() / (check_distance * 2 + > >> >> 1.0)) > >> >> ... floating_checks_proportions.append(proportion) > >> >> ... > >> >> >>> floating_checks_proportions > >> >> [0.0, 0.33333333333333331, 0.66666666666666663, 0.66666666666666663, > >> >> 0.66666666666666663, 0.66666666666666663, 1.0, 1.0, > >> >> 0.66666666666666663, 0.33333333333333331] > >> >> >>> > >> > > >> > Define a function for proportions: > >> > > >> > from numpy import r_ > >> > > >> > from numpy.lib.stride_tricks import as_strided as ast > >> > > >> > def proportions(matches, distance= 1): > >> > > >> > cd, cd2p1, s= distance, 2* distance+ 1, matches.strides[0] > >> > > >> > # pad > >> > > >> > m= r_[[0.]* cd, matches, [0.]* cd] > >> > > >> > # create a suitable view > >> > > >> > m= ast(m, shape= (m.shape[0], cd2p1), strides= (s, s)) > >> > > >> > # average > >> > > >> > return m[:-2* cd].sum(1)/ cd2p1 > >> > and use it like: > >> > In []: matches > >> > Out[]: array([ 0., 0., 1., 1., 0., 1., 1., 1., 1., 0.]) > >> > > >> > In []: proportions(matches).round(2) > >> > Out[]: array([ 0. , 0.33, 0.67, 0.67, 0.67, 0.67, 1. , 1. , > >> > 0.67, > >> > 0.33]) > >> > In []: proportions(matches, 5).round(2) > >> > Out[]: array([ 0.27, 0.36, 0.45, 0.55, 0.55, 0.55, 0.55, 0.55, > >> > 0.45, > >> > 0.36]) > >> >> > >> >> > >> >> I'd like to ask about the possible better approaches, as it doesn't > >> >> look very elegant to me, and I obviously don't know the implications > >> >> or possible drawbacks of numpy arrays in some scenarios. > >> >> > >> >> the pattern > >> >> for i in range(len(...)): is usually considered inadequate in python, > >> >> but what should be used in this case as the indices are primarily > >> >> needed? > >> >> is something to be gained or lost using (x)range or np.arange as the > >> >> python loop is (probably?) inevitable anyway? > >> > > >> > Here np.arange(.) will create a new array and potentially wasting > memory > >> > if > >> > it's not otherwise used. IMO nothing wrong looping with xrange(.) (if > >> > you > >> > really need to loop ;). > >> >> > >> >> Is there some mor elegant way to check for the "underflowing" lower > >> >> bound "lo" to replace with None? > >> >> > >> >> Is it significant, which container is used to collect the results of > >> >> the computation in the python loop - i.e. python list or a numpy > >> >> array? > >> >> (Could possibly matplotlib cooperate better with either container?) 
> >> >> > >> >> And of course, are there maybe other things, which should be made > >> >> better/differently? > >> >> > >> >> (using Numpy 1.6.2, python 2.7.3, win XP) > >> > > >> > > >> > My 2 cents, > >> > -eat > >> >> > >> >> Thanks in advance for any hints or suggestions, > >> >> regards, > >> >> Vlastimil Brom > >> >> _______________________________________________ > >> >> NumPy-Discussion mailing list > >> >> NumPy-Discussion at scipy.org > >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > >> Hi, > >> thank you very much for your suggestions! > >> > >> do I understand it correctly, that I have to special-case the function > >> for distance = 0 (which should return the matches themselves without > >> recalculation)? > > > > Yes. > >> > >> > >> However, more importantly, I am getting a ValueError for some larger, > >> (but not completely unreasonable) "distance" > >> > >> >>> proportions(matches, distance= 8190) > >> Traceback (most recent call last): > >> File "", line 1, in > >> File "", line 11, in proportions > >> File "C:\Python27\lib\site-packages\numpy\lib\stride_tricks.py", > >> line 28, in as_strided > >> return np.asarray(DummyArray(interface, base=x)) > >> File "C:\Python27\lib\site-packages\numpy\core\numeric.py", line > >> 235, in asarray > >> return array(a, dtype, copy=False, order=order) > >> ValueError: array is too big. > >> >>> > >> > >> the distance= 8189 was the largest which worked in this snippet, > >> however, it might be data-dependent, as I got this error as well e.g. > >> for distance=4529 for a 20k text. > >> > >> Is this implementation-limited, or could it be solved in some > >> alternative way which wouldn't have such limits (up to the order of, > >> say, millions)? > > > > Apparently ast(.) does not return a view of the original matches rather a > > copy of size (n* (2* distance+ 1)), thus you may run out of memory. > > > > Surely it can be solved up to millions of matches, but perhaps much > slower > > speed. > > > > > > Regards, > > -eat > >> > >> > >> Thanks again > >> regards > >> vbr > >> > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > Thank you for the confirmation, > I'll wait and see, whether the current speed isn't actually already > acceptable for the most cases... > I could already gain a speedup by using the array.sum() and other > features, maybe I will find yet other possibilities. > I just cooked up some pure pyhton and running sum based solution, which actually may be faster and it scales quite well up to millions of matches: def proportions_p(matches, distance= 1): cd, cd2p1= distance, 2* distance+ 1 m, r= [0]* cd+ matches+ [0]* (cd+ 1), [0]* len(matches) s= sum(m[:cd2p1]) for k in xrange(len(matches)): r[k]= s/ cd2p1 s-= m[k] s+= m[cd2p1+ k] return r Some verification and timings: In []: a= arange(1, 100000, dtype= float) In []: allclose(proportions(a, 1000), proportions_p(a.tolist(), 1000)) Out[]: True In []: %timeit proportions(a, 1000) 1 loops, best of 3: 288 ms per loop In []: %timeit proportions_p(a.tolist(), 1000) 10 loops, best of 3: 66.2 ms per loop In []: a= arange(1, 1000000, dtype= float) In []: %timeit proportions(a, 10000) ------------------------------------------------------------ Traceback (most recent call last): [snip] ValueError: array is too big. 
In []: %timeit proportions_p(a.tolist(), 10000) 1 loops, best of 3: 680 ms per loop Regards, -eat > > regards, > vbr > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Jul 31 11:43:30 2012 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 31 Jul 2012 16:43:30 +0100 Subject: [Numpy-discussion] array slicing questions In-Reply-To: References: Message-ID: On Tue, Jul 31, 2012 at 2:23 PM, eat wrote: > Apparently ast(.) does not return a view of the original matches rather a > copy of size (n* (2* distance+ 1)), thus you may run out of memory. The problem isn't memory, it's that on 32-bit Python, np.prod(arr.shape) must be <2**32 (or maybe 2**31 -- something like that). Normally you wouldn't be creating such arrays anyway because they would be too big to fit into memory, so this problem isn't observed, but when you're using stride_tricks then it's very easy to create arrays that use only a small amount of memory but that have very large shapes. Solution: don't buy more memory, just use a 64-bit Python, where the limit is 2**64 (or 2**63 or whatever). -n From e.antero.tammi at gmail.com Tue Jul 31 11:57:31 2012 From: e.antero.tammi at gmail.com (eat) Date: Tue, 31 Jul 2012 18:57:31 +0300 Subject: [Numpy-discussion] array slicing questions In-Reply-To: References: Message-ID: Hi, On Tue, Jul 31, 2012 at 6:43 PM, Nathaniel Smith wrote: > On Tue, Jul 31, 2012 at 2:23 PM, eat wrote: > > Apparently ast(.) does not return a view of the original matches rather a > > copy of size (n* (2* distance+ 1)), thus you may run out of memory. > > The problem isn't memory, it's that on 32-bit Python, > np.prod(arr.shape) must be <2**32 (or maybe 2**31 -- something like > that). I think this is what the traceback is indicating. > Normally you wouldn't be creating such arrays anyway because > they would be too big to fit into memory, so this problem isn't > observed, but when you're using stride_tricks then it's very easy to > create arrays that use only a small amount of memory but that have > very large shapes. But in this specific case .nbytes attribute indicates that a huge amount of memory is used. So I guess stride_tricks(.) is not returning a view. > Solution: don't buy more memory, just use a 64-bit > Python, where the limit is 2**64 (or 2**63 or whatever). > Regards, -eat > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vlastimil.brom at gmail.com Tue Jul 31 12:20:27 2012 From: vlastimil.brom at gmail.com (Vlastimil Brom) Date: Tue, 31 Jul 2012 18:20:27 +0200 Subject: [Numpy-discussion] array slicing questions In-Reply-To: References: Message-ID: 2012/7/31 eat : > Hi, > > On Tue, Jul 31, 2012 at 5:01 PM, Vlastimil Brom > wrote: >> >> 2012/7/31 eat : >> > Hi, >> > >> > On Tue, Jul 31, 2012 at 10:23 AM, Vlastimil Brom >> > >> > wrote: >> >> >> >> 2012/7/30 eat : >> >> > Hi, >> >> > >> >> > A partial answer to your questions: >> >> > >> >> > On Mon, Jul 30, 2012 at 10:33 PM, Vlastimil Brom >> >> > >> >> > wrote: >> >> >> >> >> >> Hi all, >> >> >> I'd like to ask for some hints or advice regarding the usage of >> >> >> numpy.array and especially slicing. 
>> >> >> >> >> >> I only recently tried numpy and was impressed by the speedup in some >> >> >> parts of the code, hence I suspect, that I might miss some other >> >> >> oportunities in this area. >> >> >> >> >> >> I currently use the following code for a simple visualisation of the >> >> >> search matches within the text, the arrays are generally much larger >> >> >> than the sample - the texts size is generally hundreds of kilobytes >> >> >> up >> >> >> to a few MB - with an index position for each character. >> >> >> First there is a list of spans(obtained form the regex match >> >> >> objects), >> >> >> the respective character indices in between these slices should be >> >> >> set >> >> >> to 1: >> >> >> >> >> >> >>> import numpy >> >> >> >>> characters_matches = numpy.zeros(10) >> >> >> >>> matches_spans = numpy.array([[2,4], [5,9]]) >> >> >> >>> for start, stop in matches_spans: >> >> >> ... characters_matches[start:stop] = 1 >> >> >> ... >> >> >> >>> characters_matches >> >> >> array([ 0., 0., 1., 1., 0., 1., 1., 1., 1., 0.]) >> >> >> >> >> >> Is there maybe a way tu achieve this in a numpy-only way - without >> >> >> the >> >> >> python loop? >> >> >> (I got the impression, the powerful slicing capabilities could make >> >> >> it >> >> >> possible, bud haven't found this kind of solution.) >> >> >> >> >> >> >> >> >> In the next piece of code all the character positions are evaluated >> >> >> with their "neighbourhood" and a kind of running proportions of the >> >> >> matched text parts are computed (the checks_distance could be >> >> >> generally up to the order of the half the text length, usually less >> >> >> : >> >> >> >> >> >> >>> >> >> >> >>> check_distance = 1 >> >> >> >>> floating_checks_proportions = [] >> >> >> >>> for i in numpy.arange(len(characters_matches)): >> >> >> ... lo = i - check_distance >> >> >> ... if lo < 0: >> >> >> ... lo = None >> >> >> ... hi = i + check_distance + 1 >> >> >> ... checked_sublist = characters_matches[lo:hi] >> >> >> ... proportion = (checked_sublist.sum() / (check_distance * 2 + >> >> >> 1.0)) >> >> >> ... floating_checks_proportions.append(proportion) >> >> >> ... >> >> >> >>> floating_checks_proportions >> >> >> [0.0, 0.33333333333333331, 0.66666666666666663, 0.66666666666666663, >> >> >> 0.66666666666666663, 0.66666666666666663, 1.0, 1.0, >> >> >> 0.66666666666666663, 0.33333333333333331] >> >> >> >>> >> >> > >> >> > Define a function for proportions: >> >> > >> >> > from numpy import r_ >> >> > >> >> > from numpy.lib.stride_tricks import as_strided as ast >> >> > >> >> > def proportions(matches, distance= 1): >> >> > >> >> > cd, cd2p1, s= distance, 2* distance+ 1, matches.strides[0] >> >> > >> >> > # pad >> >> > >> >> > m= r_[[0.]* cd, matches, [0.]* cd] >> >> > >> >> > # create a suitable view >> >> > >> >> > m= ast(m, shape= (m.shape[0], cd2p1), strides= (s, s)) >> >> > >> >> > # average >> >> > >> >> > return m[:-2* cd].sum(1)/ cd2p1 >> >> > and use it like: >> >> > In []: matches >> >> > Out[]: array([ 0., 0., 1., 1., 0., 1., 1., 1., 1., 0.]) >> >> > >> >> > In []: proportions(matches).round(2) >> >> > Out[]: array([ 0. , 0.33, 0.67, 0.67, 0.67, 0.67, 1. , 1. 
, >> >> > 0.67, >> >> > 0.33]) >> >> > In []: proportions(matches, 5).round(2) >> >> > Out[]: array([ 0.27, 0.36, 0.45, 0.55, 0.55, 0.55, 0.55, 0.55, >> >> > 0.45, >> >> > 0.36]) >> >> >> >> >> >> >> >> >> I'd like to ask about the possible better approaches, as it doesn't >> >> >> look very elegant to me, and I obviously don't know the implications >> >> >> or possible drawbacks of numpy arrays in some scenarios. >> >> >> >> >> >> the pattern >> >> >> for i in range(len(...)): is usually considered inadequate in >> >> >> python, >> >> >> but what should be used in this case as the indices are primarily >> >> >> needed? >> >> >> is something to be gained or lost using (x)range or np.arange as the >> >> >> python loop is (probably?) inevitable anyway? >> >> > >> >> > Here np.arange(.) will create a new array and potentially wasting >> >> > memory >> >> > if >> >> > it's not otherwise used. IMO nothing wrong looping with xrange(.) (if >> >> > you >> >> > really need to loop ;). >> >> >> >> >> >> Is there some mor elegant way to check for the "underflowing" lower >> >> >> bound "lo" to replace with None? >> >> >> >> >> >> Is it significant, which container is used to collect the results of >> >> >> the computation in the python loop - i.e. python list or a numpy >> >> >> array? >> >> >> (Could possibly matplotlib cooperate better with either container?) >> >> >> >> >> >> And of course, are there maybe other things, which should be made >> >> >> better/differently? >> >> >> >> >> >> (using Numpy 1.6.2, python 2.7.3, win XP) >> >> > >> >> > >> >> > My 2 cents, >> >> > -eat >> >> >> >> >> >> Thanks in advance for any hints or suggestions, >> >> >> regards, >> >> >> Vlastimil Brom >> >> >> _______________________________________________ >> >> >> NumPy-Discussion mailing list >> >> >> NumPy-Discussion at scipy.org >> >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > >> >> Hi, >> >> thank you very much for your suggestions! >> >> >> >> do I understand it correctly, that I have to special-case the function >> >> for distance = 0 (which should return the matches themselves without >> >> recalculation)? >> > >> > Yes. >> >> >> >> >> >> However, more importantly, I am getting a ValueError for some larger, >> >> (but not completely unreasonable) "distance" >> >> >> >> >>> proportions(matches, distance= 8190) >> >> Traceback (most recent call last): >> >> File "", line 1, in >> >> File "", line 11, in proportions >> >> File "C:\Python27\lib\site-packages\numpy\lib\stride_tricks.py", >> >> line 28, in as_strided >> >> return np.asarray(DummyArray(interface, base=x)) >> >> File "C:\Python27\lib\site-packages\numpy\core\numeric.py", line >> >> 235, in asarray >> >> return array(a, dtype, copy=False, order=order) >> >> ValueError: array is too big. >> >> >>> >> >> >> >> the distance= 8189 was the largest which worked in this snippet, >> >> however, it might be data-dependent, as I got this error as well e.g. >> >> for distance=4529 for a 20k text. >> >> >> >> Is this implementation-limited, or could it be solved in some >> >> alternative way which wouldn't have such limits (up to the order of, >> >> say, millions)? >> > >> > Apparently ast(.) does not return a view of the original matches rather >> > a >> > copy of size (n* (2* distance+ 1)), thus you may run out of memory. >> > >> > Surely it can be solved up to millions of matches, but perhaps much >> > slower >> > speed. 
>> > >> > >> > Regards, >> > -eat >> >> >> >> >> >> Thanks again >> >> regards >> >> vbr >> >> >> >> _______________________________________________ >> >> NumPy-Discussion mailing list >> >> NumPy-Discussion at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> >> Thank you for the confirmation, >> I'll wait and see, whether the current speed isn't actually already >> acceptable for the most cases... >> I could already gain a speedup by using the array.sum() and other >> features, maybe I will find yet other possibilities. > > I just cooked up some pure pyhton and running sum based solution, which > actually may be faster and it scales quite well up to millions of matches: > > def proportions_p(matches, distance= 1): > > cd, cd2p1= distance, 2* distance+ 1 > > m, r= [0]* cd+ matches+ [0]* (cd+ 1), [0]* len(matches) > > s= sum(m[:cd2p1]) > > for k in xrange(len(matches)): > > r[k]= s/ cd2p1 > > s-= m[k] > > s+= m[cd2p1+ k] > > return r > > > Some verification and timings: > In []: a= arange(1, 100000, dtype= float) > In []: allclose(proportions(a, 1000), proportions_p(a.tolist(), 1000)) > Out[]: True > > In []: %timeit proportions(a, 1000) > 1 loops, best of 3: 288 ms per loop > In []: %timeit proportions_p(a.tolist(), 1000) > 10 loops, best of 3: 66.2 ms per loop > > In []: a= arange(1, 1000000, dtype= float) > In []: %timeit proportions(a, 10000) > ------------------------------------------------------------ > Traceback (most recent call last): > [snip] > ValueError: array is too big. > > In []: %timeit proportions_p(a.tolist(), 10000) > 1 loops, best of 3: 680 ms per loop > > > Regards, > -eat >> >> >> regards, >> >> vbr >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > Thanks for further assistance; I hope, I am not misunderstanding something in the code, but this calculation of proportions is supposed to be run over an array of either 0 or 1, rather than a range; a test data would be something like: import random test_lst = [0,1]*500 random.shuffle(test_lst) For this data, I am not getting the same results like with my previously posted python function. Thanks and regards vbr From njs at pobox.com Tue Jul 31 12:30:15 2012 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 31 Jul 2012 17:30:15 +0100 Subject: [Numpy-discussion] array slicing questions In-Reply-To: References: Message-ID: On Tue, Jul 31, 2012 at 4:57 PM, eat wrote: > Hi, > > On Tue, Jul 31, 2012 at 6:43 PM, Nathaniel Smith wrote: >> >> On Tue, Jul 31, 2012 at 2:23 PM, eat wrote: >> > Apparently ast(.) does not return a view of the original matches rather >> > a >> > copy of size (n* (2* distance+ 1)), thus you may run out of memory. >> >> The problem isn't memory, it's that on 32-bit Python, >> np.prod(arr.shape) must be <2**32 (or maybe 2**31 -- something like >> that). > > I think this is what the traceback is indicating. >> >> Normally you wouldn't be creating such arrays anyway because >> they would be too big to fit into memory, so this problem isn't >> observed, but when you're using stride_tricks then it's very easy to >> create arrays that use only a small amount of memory but that have >> very large shapes. > > But in this specific case .nbytes attribute indicates that a huge amount of > memory is used. So I guess stride_tricks(.) is not returning a view. No, .nbytes is lying to you -- it just returns np.prod(arr.shape) * arr.dtype.itemsize. 
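For example, something like this (typed from memory, untested):

>>> import numpy as np
>>> from numpy.lib.stride_tricks import as_strided
>>> base = np.zeros(10)          # only 80 bytes of real data
>>> view = as_strided(base, shape=(10, 1000000), strides=(base.strides[0], 0))
>>> view.nbytes                  # just shape times itemsize, ~80 MB reported
80000000
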
It isn't smart enough to realize that you have wacky strides that cause the same memory region to be referenced by many different array elements. -n From e.antero.tammi at gmail.com Tue Jul 31 12:39:20 2012 From: e.antero.tammi at gmail.com (eat) Date: Tue, 31 Jul 2012 19:39:20 +0300 Subject: [Numpy-discussion] array slicing questions In-Reply-To: References: Message-ID: Hi, On Tue, Jul 31, 2012 at 7:30 PM, Nathaniel Smith wrote: > On Tue, Jul 31, 2012 at 4:57 PM, eat wrote: > > Hi, > > > > On Tue, Jul 31, 2012 at 6:43 PM, Nathaniel Smith wrote: > >> > >> On Tue, Jul 31, 2012 at 2:23 PM, eat wrote: > >> > Apparently ast(.) does not return a view of the original matches > rather > >> > a > >> > copy of size (n* (2* distance+ 1)), thus you may run out of memory. > >> > >> The problem isn't memory, it's that on 32-bit Python, > >> np.prod(arr.shape) must be <2**32 (or maybe 2**31 -- something like > >> that). > > > > I think this is what the traceback is indicating. > >> > >> Normally you wouldn't be creating such arrays anyway because > >> they would be too big to fit into memory, so this problem isn't > >> observed, but when you're using stride_tricks then it's very easy to > >> create arrays that use only a small amount of memory but that have > >> very large shapes. > > > > But in this specific case .nbytes attribute indicates that a huge amount > of > > memory is used. So I guess stride_tricks(.) is not returning a view. > > No, .nbytes is lying to you -- it just returns np.prod(arr.shape) * > arr.dtype.itemsize. It isn't smart enough to realize that you have > wacky strides that cause the same memory region to be referenced by > many different array elements. > Aha, very good to know. Thanks, -eat > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From e.antero.tammi at gmail.com Tue Jul 31 12:49:13 2012 From: e.antero.tammi at gmail.com (eat) Date: Tue, 31 Jul 2012 19:49:13 +0300 Subject: [Numpy-discussion] array slicing questions In-Reply-To: References: Message-ID: Hi, On Tue, Jul 31, 2012 at 7:20 PM, Vlastimil Brom wrote: > 2012/7/31 eat : > > Hi, > > > > On Tue, Jul 31, 2012 at 5:01 PM, Vlastimil Brom < > vlastimil.brom at gmail.com> > > wrote: > >> > >> 2012/7/31 eat : > >> > Hi, > >> > > >> > On Tue, Jul 31, 2012 at 10:23 AM, Vlastimil Brom > >> > > >> > wrote: > >> >> > >> >> 2012/7/30 eat : > >> >> > Hi, > >> >> > > >> >> > A partial answer to your questions: > >> >> > > >> >> > On Mon, Jul 30, 2012 at 10:33 PM, Vlastimil Brom > >> >> > > >> >> > wrote: > >> >> >> > >> >> >> Hi all, > >> >> >> I'd like to ask for some hints or advice regarding the usage of > >> >> >> numpy.array and especially slicing. > >> >> >> > >> >> >> I only recently tried numpy and was impressed by the speedup in > some > >> >> >> parts of the code, hence I suspect, that I might miss some other > >> >> >> oportunities in this area. > >> >> >> > >> >> >> I currently use the following code for a simple visualisation of > the > >> >> >> search matches within the text, the arrays are generally much > larger > >> >> >> than the sample - the texts size is generally hundreds of > kilobytes > >> >> >> up > >> >> >> to a few MB - with an index position for each character. 
> >> >> >> First there is a list of spans(obtained form the regex match > >> >> >> objects), > >> >> >> the respective character indices in between these slices should be > >> >> >> set > >> >> >> to 1: > >> >> >> > >> >> >> >>> import numpy > >> >> >> >>> characters_matches = numpy.zeros(10) > >> >> >> >>> matches_spans = numpy.array([[2,4], [5,9]]) > >> >> >> >>> for start, stop in matches_spans: > >> >> >> ... characters_matches[start:stop] = 1 > >> >> >> ... > >> >> >> >>> characters_matches > >> >> >> array([ 0., 0., 1., 1., 0., 1., 1., 1., 1., 0.]) > >> >> >> > >> >> >> Is there maybe a way tu achieve this in a numpy-only way - without > >> >> >> the > >> >> >> python loop? > >> >> >> (I got the impression, the powerful slicing capabilities could > make > >> >> >> it > >> >> >> possible, bud haven't found this kind of solution.) > >> >> >> > >> >> >> > >> >> >> In the next piece of code all the character positions are > evaluated > >> >> >> with their "neighbourhood" and a kind of running proportions of > the > >> >> >> matched text parts are computed (the checks_distance could be > >> >> >> generally up to the order of the half the text length, usually > less > >> >> >> : > >> >> >> > >> >> >> >>> > >> >> >> >>> check_distance = 1 > >> >> >> >>> floating_checks_proportions = [] > >> >> >> >>> for i in numpy.arange(len(characters_matches)): > >> >> >> ... lo = i - check_distance > >> >> >> ... if lo < 0: > >> >> >> ... lo = None > >> >> >> ... hi = i + check_distance + 1 > >> >> >> ... checked_sublist = characters_matches[lo:hi] > >> >> >> ... proportion = (checked_sublist.sum() / (check_distance * 2 > + > >> >> >> 1.0)) > >> >> >> ... floating_checks_proportions.append(proportion) > >> >> >> ... > >> >> >> >>> floating_checks_proportions > >> >> >> [0.0, 0.33333333333333331, 0.66666666666666663, > 0.66666666666666663, > >> >> >> 0.66666666666666663, 0.66666666666666663, 1.0, 1.0, > >> >> >> 0.66666666666666663, 0.33333333333333331] > >> >> >> >>> > >> >> > > >> >> > Define a function for proportions: > >> >> > > >> >> > from numpy import r_ > >> >> > > >> >> > from numpy.lib.stride_tricks import as_strided as ast > >> >> > > >> >> > def proportions(matches, distance= 1): > >> >> > > >> >> > cd, cd2p1, s= distance, 2* distance+ 1, matches.strides[0] > >> >> > > >> >> > # pad > >> >> > > >> >> > m= r_[[0.]* cd, matches, [0.]* cd] > >> >> > > >> >> > # create a suitable view > >> >> > > >> >> > m= ast(m, shape= (m.shape[0], cd2p1), strides= (s, s)) > >> >> > > >> >> > # average > >> >> > > >> >> > return m[:-2* cd].sum(1)/ cd2p1 > >> >> > and use it like: > >> >> > In []: matches > >> >> > Out[]: array([ 0., 0., 1., 1., 0., 1., 1., 1., 1., 0.]) > >> >> > > >> >> > In []: proportions(matches).round(2) > >> >> > Out[]: array([ 0. , 0.33, 0.67, 0.67, 0.67, 0.67, 1. , 1. > , > >> >> > 0.67, > >> >> > 0.33]) > >> >> > In []: proportions(matches, 5).round(2) > >> >> > Out[]: array([ 0.27, 0.36, 0.45, 0.55, 0.55, 0.55, 0.55, > 0.55, > >> >> > 0.45, > >> >> > 0.36]) > >> >> >> > >> >> >> > >> >> >> I'd like to ask about the possible better approaches, as it > doesn't > >> >> >> look very elegant to me, and I obviously don't know the > implications > >> >> >> or possible drawbacks of numpy arrays in some scenarios. > >> >> >> > >> >> >> the pattern > >> >> >> for i in range(len(...)): is usually considered inadequate in > >> >> >> python, > >> >> >> but what should be used in this case as the indices are primarily > >> >> >> needed? 
> >> >> >> is something to be gained or lost using (x)range or np.arange as > the > >> >> >> python loop is (probably?) inevitable anyway? > >> >> > > >> >> > Here np.arange(.) will create a new array and potentially wasting > >> >> > memory > >> >> > if > >> >> > it's not otherwise used. IMO nothing wrong looping with xrange(.) > (if > >> >> > you > >> >> > really need to loop ;). > >> >> >> > >> >> >> Is there some mor elegant way to check for the "underflowing" > lower > >> >> >> bound "lo" to replace with None? > >> >> >> > >> >> >> Is it significant, which container is used to collect the results > of > >> >> >> the computation in the python loop - i.e. python list or a numpy > >> >> >> array? > >> >> >> (Could possibly matplotlib cooperate better with either > container?) > >> >> >> > >> >> >> And of course, are there maybe other things, which should be made > >> >> >> better/differently? > >> >> >> > >> >> >> (using Numpy 1.6.2, python 2.7.3, win XP) > >> >> > > >> >> > > >> >> > My 2 cents, > >> >> > -eat > >> >> >> > >> >> >> Thanks in advance for any hints or suggestions, > >> >> >> regards, > >> >> >> Vlastimil Brom > >> >> >> _______________________________________________ > >> >> >> NumPy-Discussion mailing list > >> >> >> NumPy-Discussion at scipy.org > >> >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> >> > > >> >> Hi, > >> >> thank you very much for your suggestions! > >> >> > >> >> do I understand it correctly, that I have to special-case the > function > >> >> for distance = 0 (which should return the matches themselves without > >> >> recalculation)? > >> > > >> > Yes. > >> >> > >> >> > >> >> However, more importantly, I am getting a ValueError for some larger, > >> >> (but not completely unreasonable) "distance" > >> >> > >> >> >>> proportions(matches, distance= 8190) > >> >> Traceback (most recent call last): > >> >> File "", line 1, in > >> >> File "", line 11, in proportions > >> >> File "C:\Python27\lib\site-packages\numpy\lib\stride_tricks.py", > >> >> line 28, in as_strided > >> >> return np.asarray(DummyArray(interface, base=x)) > >> >> File "C:\Python27\lib\site-packages\numpy\core\numeric.py", line > >> >> 235, in asarray > >> >> return array(a, dtype, copy=False, order=order) > >> >> ValueError: array is too big. > >> >> >>> > >> >> > >> >> the distance= 8189 was the largest which worked in this snippet, > >> >> however, it might be data-dependent, as I got this error as well e.g. > >> >> for distance=4529 for a 20k text. > >> >> > >> >> Is this implementation-limited, or could it be solved in some > >> >> alternative way which wouldn't have such limits (up to the order of, > >> >> say, millions)? > >> > > >> > Apparently ast(.) does not return a view of the original matches > rather > >> > a > >> > copy of size (n* (2* distance+ 1)), thus you may run out of memory. > >> > > >> > Surely it can be solved up to millions of matches, but perhaps much > >> > slower > >> > speed. > >> > > >> > > >> > Regards, > >> > -eat > >> >> > >> >> > >> >> Thanks again > >> >> regards > >> >> vbr > >> >> > >> >> _______________________________________________ > >> >> NumPy-Discussion mailing list > >> >> NumPy-Discussion at scipy.org > >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > >> > >> Thank you for the confirmation, > >> I'll wait and see, whether the current speed isn't actually already > >> acceptable for the most cases... 
> >> I could already gain a speedup by using the array.sum() and other > >> features, maybe I will find yet other possibilities. > > > > I just cooked up some pure pyhton and running sum based solution, which > > actually may be faster and it scales quite well up to millions of > matches: > > > > def proportions_p(matches, distance= 1): > > > > cd, cd2p1= distance, 2* distance+ 1 > > > > m, r= [0]* cd+ matches+ [0]* (cd+ 1), [0]* len(matches) > > > > s= sum(m[:cd2p1]) > > > > for k in xrange(len(matches)): > > > > r[k]= s/ cd2p1 > > > > s-= m[k] > > > > s+= m[cd2p1+ k] > > > > return r > > > > > > Some verification and timings: > > In []: a= arange(1, 100000, dtype= float) > > In []: allclose(proportions(a, 1000), proportions_p(a.tolist(), 1000)) > > Out[]: True > > > > In []: %timeit proportions(a, 1000) > > 1 loops, best of 3: 288 ms per loop > > In []: %timeit proportions_p(a.tolist(), 1000) > > 10 loops, best of 3: 66.2 ms per loop > > > > In []: a= arange(1, 1000000, dtype= float) > > In []: %timeit proportions(a, 10000) > > ------------------------------------------------------------ > > Traceback (most recent call last): > > [snip] > > ValueError: array is too big. > > > > In []: %timeit proportions_p(a.tolist(), 10000) > > 1 loops, best of 3: 680 ms per loop > > > > > > Regards, > > -eat > >> > >> > >> regards, > >> > >> vbr > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > Thanks for further assistance; > I hope, I am not misunderstanding something in the code, but this > calculation of proportions is supposed to be run over an array of > either 0 or 1, rather than a range; > a test data would be something like: > It really shouldn't matter what the test case numbers are. We are just calculating averages over of certain 'window', like: In []: proportions(arange(1, 4, dtype= float)) Out[]: array([ 1. , 2. , 1.66666667]) In []: (0.+ 1.+ 2.)/ 3 Out[]: 1.0 In []: (1.+ 2.+ 3.)/ 3 Out[]: 2.0 In []: (2.+ 3.+ 0.)/ 3 Out[]: 1.6666666666666667 Regards, -eat > > import random > test_lst = [0,1]*500 > random.shuffle(test_lst) > > For this data, I am not getting the same results like with my > previously posted python function. > > Thanks and regards > vbr > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vlastimil.brom at gmail.com Tue Jul 31 14:12:44 2012 From: vlastimil.brom at gmail.com (Vlastimil Brom) Date: Tue, 31 Jul 2012 20:12:44 +0200 Subject: [Numpy-discussion] array slicing questions In-Reply-To: References: Message-ID: 2012/7/31 eat : > Hi, > > On Tue, Jul 31, 2012 at 7:20 PM, Vlastimil Brom > wrote: >> >> 2012/7/31 eat : >> > Hi, >> > >> > On Tue, Jul 31, 2012 at 5:01 PM, Vlastimil Brom >> > >> > wrote: >> >> >> >> 2012/7/31 eat : >> >> > Hi, >> >> > >> >> > On Tue, Jul 31, 2012 at 10:23 AM, Vlastimil Brom >> >> > >> >> > wrote: >> >> >> >> >> >> 2012/7/30 eat : >> >> >> > Hi, >> >> >> > >> >> >> > A partial answer to your questions: >> >> >> > >> >> >> > On Mon, Jul 30, 2012 at 10:33 PM, Vlastimil Brom >> >> >> > >> >> >> > wrote: >> >> >> >> >> >> >> >> Hi all, >> >> >> >> I'd like to ask for some hints or advice regarding the usage of >> >> >> >> numpy.array and especially slicing. 
>> >> >> >> >> >> >> >> I only recently tried numpy and was impressed by the speedup in >> >> >> >> some >> >> >> >> parts of the code, hence I suspect, that I might miss some other >> >> >> >> oportunities in this area. >> >> >> >> >> >> >> >> I currently use the following code for a simple visualisation of >> >> >> >> the >> >> >> >> search matches within the text, the arrays are generally much >> >> >> >> larger >> >> >> >> than the sample - the texts size is generally hundreds of >> >> >> >> kilobytes >> >> >> >> up >> >> >> >> to a few MB - with an index position for each character. >> >> >> >> First there is a list of spans(obtained form the regex match >> >> >> >> objects), >> >> >> >> the respective character indices in between these slices should >> >> >> >> be >> >> >> >> set >> >> >> >> to 1: >> >> >> >> >> >> >> >> >>> import numpy >> >> >> >> >>> characters_matches = numpy.zeros(10) >> >> >> >> >>> matches_spans = numpy.array([[2,4], [5,9]]) >> >> >> >> >>> for start, stop in matches_spans: >> >> >> >> ... characters_matches[start:stop] = 1 >> >> >> >> ... >> >> >> >> >>> characters_matches >> >> >> >> array([ 0., 0., 1., 1., 0., 1., 1., 1., 1., 0.]) >> >> >> >> >> >> >> >> Is there maybe a way tu achieve this in a numpy-only way - >> >> >> >> without >> >> >> >> the >> >> >> >> python loop? >> >> >> >> (I got the impression, the powerful slicing capabilities could >> >> >> >> make >> >> >> >> it >> >> >> >> possible, bud haven't found this kind of solution.) >> >> >> >> >> >> >> >> >> >> >> >> In the next piece of code all the character positions are >> >> >> >> evaluated >> >> >> >> with their "neighbourhood" and a kind of running proportions of >> >> >> >> the >> >> >> >> matched text parts are computed (the checks_distance could be >> >> >> >> generally up to the order of the half the text length, usually >> >> >> >> less >> >> >> >> : >> >> >> >> >> >> >> >> >>> >> >> >> >> >>> check_distance = 1 >> >> >> >> >>> floating_checks_proportions = [] >> >> >> >> >>> for i in numpy.arange(len(characters_matches)): >> >> >> >> ... lo = i - check_distance >> >> >> >> ... if lo < 0: >> >> >> >> ... lo = None >> >> >> >> ... hi = i + check_distance + 1 >> >> >> >> ... checked_sublist = characters_matches[lo:hi] >> >> >> >> ... proportion = (checked_sublist.sum() / (check_distance * 2 >> >> >> >> + >> >> >> >> 1.0)) >> >> >> >> ... floating_checks_proportions.append(proportion) >> >> >> >> ... >> >> >> >> >>> floating_checks_proportions >> >> >> >> [0.0, 0.33333333333333331, 0.66666666666666663, >> >> >> >> 0.66666666666666663, >> >> >> >> 0.66666666666666663, 0.66666666666666663, 1.0, 1.0, >> >> >> >> 0.66666666666666663, 0.33333333333333331] >> >> >> >> >>> >> >> >> > >> >> >> > Define a function for proportions: >> >> >> > >> >> >> > from numpy import r_ >> >> >> > >> >> >> > from numpy.lib.stride_tricks import as_strided as ast >> >> >> > >> >> >> > def proportions(matches, distance= 1): >> >> >> > >> >> >> > cd, cd2p1, s= distance, 2* distance+ 1, matches.strides[0] >> >> >> > >> >> >> > # pad >> >> >> > >> >> >> > m= r_[[0.]* cd, matches, [0.]* cd] >> >> >> > >> >> >> > # create a suitable view >> >> >> > >> >> >> > m= ast(m, shape= (m.shape[0], cd2p1), strides= (s, s)) >> >> >> > >> >> >> > # average >> >> >> > >> >> >> > return m[:-2* cd].sum(1)/ cd2p1 >> >> >> > and use it like: >> >> >> > In []: matches >> >> >> > Out[]: array([ 0., 0., 1., 1., 0., 1., 1., 1., 1., 0.]) >> >> >> > >> >> >> > In []: proportions(matches).round(2) >> >> >> > Out[]: array([ 0. 
, 0.33, 0.67, 0.67, 0.67, 0.67, 1. , 1. >> >> >> > , >> >> >> > 0.67, >> >> >> > 0.33]) >> >> >> > In []: proportions(matches, 5).round(2) >> >> >> > Out[]: array([ 0.27, 0.36, 0.45, 0.55, 0.55, 0.55, 0.55, >> >> >> > 0.55, >> >> >> > 0.45, >> >> >> > 0.36]) >> >> >> >> >> >> >> >> >> >> >> >> I'd like to ask about the possible better approaches, as it >> >> >> >> doesn't >> >> >> >> look very elegant to me, and I obviously don't know the >> >> >> >> implications >> >> >> >> or possible drawbacks of numpy arrays in some scenarios. >> >> >> >> >> >> >> >> the pattern >> >> >> >> for i in range(len(...)): is usually considered inadequate in >> >> >> >> python, >> >> >> >> but what should be used in this case as the indices are primarily >> >> >> >> needed? >> >> >> >> is something to be gained or lost using (x)range or np.arange as >> >> >> >> the >> >> >> >> python loop is (probably?) inevitable anyway? >> >> >> > >> >> >> > Here np.arange(.) will create a new array and potentially wasting >> >> >> > memory >> >> >> > if >> >> >> > it's not otherwise used. IMO nothing wrong looping with xrange(.) >> >> >> > (if >> >> >> > you >> >> >> > really need to loop ;). >> >> >> >> >> >> >> >> Is there some mor elegant way to check for the "underflowing" >> >> >> >> lower >> >> >> >> bound "lo" to replace with None? >> >> >> >> >> >> >> >> Is it significant, which container is used to collect the results >> >> >> >> of >> >> >> >> the computation in the python loop - i.e. python list or a numpy >> >> >> >> array? >> >> >> >> (Could possibly matplotlib cooperate better with either >> >> >> >> container?) >> >> >> >> >> >> >> >> And of course, are there maybe other things, which should be made >> >> >> >> better/differently? >> >> >> >> >> >> >> >> (using Numpy 1.6.2, python 2.7.3, win XP) >> >> >> > >> >> >> > >> >> >> > My 2 cents, >> >> >> > -eat >> >> >> >> >> >> >> >> Thanks in advance for any hints or suggestions, >> >> >> >> regards, >> >> >> >> Vlastimil Brom >> >> >> >> _______________________________________________ >> >> >> >> NumPy-Discussion mailing list >> >> >> >> NumPy-Discussion at scipy.org >> >> >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> > >> >> >> Hi, >> >> >> thank you very much for your suggestions! >> >> >> >> >> >> do I understand it correctly, that I have to special-case the >> >> >> function >> >> >> for distance = 0 (which should return the matches themselves without >> >> >> recalculation)? >> >> > >> >> > Yes. >> >> >> >> >> >> >> >> >> However, more importantly, I am getting a ValueError for some >> >> >> larger, >> >> >> (but not completely unreasonable) "distance" >> >> >> >> >> >> >>> proportions(matches, distance= 8190) >> >> >> Traceback (most recent call last): >> >> >> File "", line 1, in >> >> >> File "", line 11, in proportions >> >> >> File "C:\Python27\lib\site-packages\numpy\lib\stride_tricks.py", >> >> >> line 28, in as_strided >> >> >> return np.asarray(DummyArray(interface, base=x)) >> >> >> File "C:\Python27\lib\site-packages\numpy\core\numeric.py", line >> >> >> 235, in asarray >> >> >> return array(a, dtype, copy=False, order=order) >> >> >> ValueError: array is too big. >> >> >> >>> >> >> >> >> >> >> the distance= 8189 was the largest which worked in this snippet, >> >> >> however, it might be data-dependent, as I got this error as well >> >> >> e.g. >> >> >> for distance=4529 for a 20k text. 
>> >> >> >> >> >> Is this implementation-limited, or could it be solved in some >> >> >> alternative way which wouldn't have such limits (up to the order of, >> >> >> say, millions)? >> >> > >> >> > Apparently ast(.) does not return a view of the original matches >> >> > rather >> >> > a >> >> > copy of size (n* (2* distance+ 1)), thus you may run out of memory. >> >> > >> >> > Surely it can be solved up to millions of matches, but perhaps much >> >> > slower >> >> > speed. >> >> > >> >> > >> >> > Regards, >> >> > -eat >> >> >> >> >> >> >> >> >> Thanks again >> >> >> regards >> >> >> vbr >> >> >> >> >> >> _______________________________________________ >> >> >> NumPy-Discussion mailing list >> >> >> NumPy-Discussion at scipy.org >> >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > >> >> >> >> Thank you for the confirmation, >> >> I'll wait and see, whether the current speed isn't actually already >> >> acceptable for the most cases... >> >> I could already gain a speedup by using the array.sum() and other >> >> features, maybe I will find yet other possibilities. >> > >> > I just cooked up some pure pyhton and running sum based solution, which >> > actually may be faster and it scales quite well up to millions of >> > matches: >> > >> > def proportions_p(matches, distance= 1): >> > >> > cd, cd2p1= distance, 2* distance+ 1 >> > >> > m, r= [0]* cd+ matches+ [0]* (cd+ 1), [0]* len(matches) >> > >> > s= sum(m[:cd2p1]) >> > >> > for k in xrange(len(matches)): >> > >> > r[k]= s/ cd2p1 >> > >> > s-= m[k] >> > >> > s+= m[cd2p1+ k] >> > >> > return r >> > >> > >> > Some verification and timings: >> > In []: a= arange(1, 100000, dtype= float) >> > In []: allclose(proportions(a, 1000), proportions_p(a.tolist(), 1000)) >> > Out[]: True >> > >> > In []: %timeit proportions(a, 1000) >> > 1 loops, best of 3: 288 ms per loop >> > In []: %timeit proportions_p(a.tolist(), 1000) >> > 10 loops, best of 3: 66.2 ms per loop >> > >> > In []: a= arange(1, 1000000, dtype= float) >> > In []: %timeit proportions(a, 10000) >> > ------------------------------------------------------------ >> > Traceback (most recent call last): >> > [snip] >> > ValueError: array is too big. >> > >> > In []: %timeit proportions_p(a.tolist(), 10000) >> > 1 loops, best of 3: 680 ms per loop >> > >> > >> > Regards, >> > -eat >> >> >> >> >> >> regards, >> >> >> >> vbr >> >> _______________________________________________ >> >> NumPy-Discussion mailing list >> >> NumPy-Discussion at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> > >> Thanks for further assistance; >> I hope, I am not misunderstanding something in the code, but this >> calculation of proportions is supposed to be run over an array of >> either 0 or 1, rather than a range; >> a test data would be something like: > > It really shouldn't matter what the test case numbers are. We are just > calculating averages over of certain 'window', like: > In []: proportions(arange(1, 4, dtype= float)) > Out[]: array([ 1. , 2. , 1.66666667]) > In []: (0.+ 1.+ 2.)/ 3 > Out[]: 1.0 > In []: (1.+ 2.+ 3.)/ 3 > Out[]: 2.0 > In []: (2.+ 3.+ 0.)/ 3 > Out[]: 1.6666666666666667 > > > Regards, > -eat >> >> >> import random >> test_lst = [0,1]*500 >> random.shuffle(test_lst) >> >> For this data, I am not getting the same results like with my >> previously posted python function. 
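For reference, the windowed proportion discussed above can also be written with a cumulative sum. It stays O(n) in time and memory for any distance, so it avoids the as_strided "array is too big" error, it casts the input to float up front (which also sidesteps problems with integer test data), and it keeps the same zero-padding convention as eat's proportions(). This is only a sketch, not code posted in the thread:

import numpy as np

def proportions_cumsum(matches, distance=1):
    # Running mean over a window of 2*distance + 1 positions,
    # zero-padded at both ends, same convention as proportions().
    w = 2 * distance + 1
    m = np.concatenate((np.zeros(distance),
                        np.asarray(matches, dtype=float),
                        np.zeros(distance)))
    # c[k] is the sum of the first k padded elements, so every window
    # sum is a single subtraction.
    c = np.concatenate(([0.0], np.cumsum(m)))
    return (c[w:] - c[:-w]) / w

For the small matches example above this should reproduce array([ 0., 0.33, 0.67, ...]) after rounding, np.allclose(proportions(a, d), proportions_cumsum(a, d)) is expected to hold wherever proportions() can still allocate its strided copy, and distance=0 needs no special-casing.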
>>
>> Thanks and regards
>> vbr
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
>
Thanks, and sorry for the stupid mistake, I somehow forgot dtype=float
in my testdata, which apparently caused the errors and different output
of the functions in question.
I hope it is fixed now.

Regards,
vbr

From david.froger at gmail.com  Tue Jul 31 16:04:06 2012
From: david.froger at gmail.com (David Froger)
Date: Tue, 31 Jul 2012 22:04:06 +0200
Subject: [Numpy-discussion] SWIG Numpy and C++ extensions
In-Reply-To: 
References: 
Message-ID: <20120731220406.GA18071@david-desktop.localdomain>

Hi,

> I'm looking at SWIG/numpy tutorials
They are these tutorials:
http://docs.scipy.org/doc/numpy/reference/swig.interface-file.html
http://www.scipy.org/Cookbook/SWIG_NumPy_examples

Reading numpy.i is also very instructive.

> 1- How do I use "apply" for class functions %apply (bla) myobject::foo ?
%apply is specified on function/method argument names and types only,
never on function names. So if for example you use:
%apply (int* ARGOUT_ARRAY1, int DIM1) {(int* rangevec, int n)}
it will apply to every function that has arguments "int* ARGOUT_ARRAY1,
int DIM1"

> 2- that's ok if your C++ deals with arrays but what if I actually want
> to receive the Numpy object so that I can manipulate it directly (or if
> for example the array isn't contiguous in memory)
>
> A "dummy" example of a foo function I'd like to wrap:
>
> void FOO::fooNumpy(PyArrayObject *nparray) {
>
> int j;
> for(j=0;j<nparray->nd;j++) {
>   printf("Ok array dim %i has length: %i\n",j,nparray->dimensions[j]);
> }
> }

I never do it with Swig, will try to make this example work!

David

From david.froger at gmail.com  Tue Jul 31 16:07:17 2012
From: david.froger at gmail.com (David Froger)
Date: Tue, 31 Jul 2012 22:07:17 +0200
Subject: [Numpy-discussion] SWIG Numpy and C++ extensions
In-Reply-To: <20120731220406.GA18071@david-desktop.localdomain>
References: <20120731220406.GA18071@david-desktop.localdomain>
Message-ID: <20120731220717.GB18071@david-desktop.localdomain>

> > I'm looking at SWIG/numpy tutorials
> They are these tutorials:
> http://docs.scipy.org/doc/numpy/reference/swig.interface-file.html
> http://www.scipy.org/Cookbook/SWIG_NumPy_examples

Sorry, I've read "look for"...

From wfspotz at sandia.gov  Tue Jul 31 16:48:24 2012
From: wfspotz at sandia.gov (Bill Spotz)
Date: Tue, 31 Jul 2012 14:48:24 -0600
Subject: [Numpy-discussion] [EXTERNAL] Re: SWIG Numpy and C++ extensions
In-Reply-To: <20120731220406.GA18071@david-desktop.localdomain>
References: <20120731220406.GA18071@david-desktop.localdomain>
Message-ID: 

Use %inline %{ ... %} around your function.  SWIG will add your function
directly to the wrapper file as well as add a wrapper function for
calling it from python.

On Jul 31, 2012, at 2:04 PM, David Froger wrote:

>> 2- that's ok if your C++ deals with arrays but what if I actually want
>> to receive the Numpy object so that I can manipulate it directly (or if
>> for example the array isn't contiguous in memory)
>>
>> A "dummy" example of a foo function I'd like to wrap:
>>
>> void FOO::fooNumpy(PyArrayObject *nparray) {
>>
>> int j;
>> for(j=0;j<nparray->nd;j++) {
>>   printf("Ok array dim %i has length: %i\n",j,nparray->dimensions[j]);
>> }
>> }

** Bill Spotz                                   **
** Sandia National Laboratories                 **
** P.O. Box 5800            Voice: (505)845-0170            **
** Albuquerque, NM 87185-0370   Fax: (505)284-0154   Email: wfspotz at sandia.gov **

From doutriaux1 at llnl.gov  Tue Jul 31 18:28:02 2012
From: doutriaux1 at llnl.gov (Doutriaux, Charles)
Date: Tue, 31 Jul 2012 15:28:02 -0700
Subject: [Numpy-discussion] SWIG Numpy and C++ extensions
In-Reply-To: <20120731220406.GA18071@david-desktop.localdomain>
Message-ID: 

Thanks David,

The clarification on apply is actually an important one!

C.

On 7/31/12 1:04 PM, "David Froger" wrote:

>Hi,
>
>> I'm looking at SWIG/numpy tutorials
>They are these tutorials:
>http://docs.scipy.org/doc/numpy/reference/swig.interface-file.html
>http://www.scipy.org/Cookbook/SWIG_NumPy_examples
>
>Reading numpy.i is also very instructive.
>
>> 1- How do I use "apply" for class functions %apply (bla) myobject::foo ?
>%apply is specified on function/method argument names and types only,
>never on function names. So if for example you use:
>%apply (int* ARGOUT_ARRAY1, int DIM1) {(int* rangevec, int n)}
>it will apply to every function that has arguments "int* ARGOUT_ARRAY1,
>int DIM1"
>
>> 2- that's ok if your C++ deals with arrays but what if I actually want
>> to receive the Numpy object so that I can manipulate it directly (or if
>> for example the array isn't contiguous in memory)
>>
>> A "dummy" example of a foo function I'd like to wrap:
>>
>> void FOO::fooNumpy(PyArrayObject *nparray) {
>>
>> int j;
>> for(j=0;j<nparray->nd;j++) {
>>   printf("Ok array dim %i has length: %i\n",j,nparray->dimensions[j]);
>> }
>> }
>I never do it with Swig, will try to make this example work!
>
>David
>_______________________________________________
>NumPy-Discussion mailing list
>NumPy-Discussion at scipy.org
>http://mail.scipy.org/mailman/listinfo/numpy-discussion
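To make the %apply rule described above concrete, a minimal interface-file sketch in the style of the numpy.i documentation linked earlier might look as follows; the module name, the foo.h header and the (int* rangevec, int n) signature are illustrative only and are not taken from this thread:

%module example

%{
#define SWIG_FILE_WITH_INIT
#include "foo.h"
%}

%include "numpy.i"

%init %{
import_array();
%}

/* Matches any function or method whose arguments have these names and
   types, including FOO methods, so the class is never named in the
   %apply line itself. */
%apply (int* ARGOUT_ARRAY1, int DIM1) {(int* rangevec, int n)};

%include "foo.h"

With a method such as void FOO::range(int* rangevec, int n) declared in foo.h, the wrapped call from Python would then look roughly like f.range(5) and return a length-5 NumPy array. Receiving the PyArrayObject* itself, as in the fooNumpy example, is a different problem: that needs either the %inline route Bill mentions or a hand-written typemap, since the stock numpy.i typemaps are built around plain C pointers plus dimension arguments.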