From hanni.ali at gmail.com Tue Jul 1 06:13:21 2008 From: hanni.ali at gmail.com (Hanni Ali) Date: Tue, 1 Jul 2008 11:13:21 +0100 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730806301722h335cd094n5cc28ff4c37002b1@mail.gmail.com> Message-ID: <789d27b10807010313w46ba53a2gd54c852bcb70487d@mail.gmail.com> Would it not be possible to import just the specific module of numpy that provides the functionality your application needs? i.e. import numpy.core or whatever you're using you could even do: import numpy.core as numpy I think that would simplify your code, though I'm no expert. Hanni 2008/7/1 Andrew Dalke : > On Jul 1, 2008, at 2:22 AM, Robert Kern wrote: > > Your use case isn't so typical and so suffers on the import time > > end of the > > balance. > > I'm working on my presentation for EuroSciPy. "Isn't so typical" > seems to be a good summary of my first slide. :) > > >> Any chance of cutting down on the number, in order > >> to improve startup costs? > > > > Not at this point in time, no. That would break too much code. > > Understood. > > Thanks for the response, > > Andrew > dalke at dalkescientific.com > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Tue Jul 1 06:24:13 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 1 Jul 2008 12:24:13 +0200 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <789d27b10807010313w46ba53a2gd54c852bcb70487d@mail.gmail.com> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730806301722h335cd094n5cc28ff4c37002b1@mail.gmail.com> <789d27b10807010313w46ba53a2gd54c852bcb70487d@mail.gmail.com> Message-ID: Hi, IIRC, if you do import numpy.core as numpy, it starts by importing numpy, so it will be even slower. Matthieu 2008/7/1 Hanni Ali : > Would it not be possible to import just the specific module of numpy that > provides the functionality your application needs? > > i.e. > > import numpy.core > > or whatever you're using > > you could even do: > > import numpy.core as numpy > > I think that would simplify your code, though I'm no expert. > > Hanni > > > 2008/7/1 Andrew Dalke : >> >> On Jul 1, 2008, at 2:22 AM, Robert Kern wrote: >> > Your use case isn't so typical and so suffers on the import time >> > end of the >> > balance. >> >> I'm working on my presentation for EuroSciPy. "Isn't so typical" >> seems to be a good summary of my first slide. :) >> >> >> Any chance of cutting down on the number, in order >> >> to improve startup costs? >> > >> > Not at this point in time, no. That would break too much code. >> >> Understood.
>> >> Thanks for the response, >> >> Andrew >> dalke at dalkescientific.com >> >> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher From hanni.ali at gmail.com Tue Jul 1 06:44:55 2008 From: hanni.ali at gmail.com (Hanni Ali) Date: Tue, 1 Jul 2008 11:44:55 +0100 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730806301722h335cd094n5cc28ff4c37002b1@mail.gmail.com> <789d27b10807010313w46ba53a2gd54c852bcb70487d@mail.gmail.com> Message-ID: <789d27b10807010344v613a0f20x707432a65e253d0f@mail.gmail.com> You are correct, it appears to take slightly longer to import numpy.core, and longer again to import numpy.core as numpy. I should obviously check first in future. Hanni 2008/7/1 Matthieu Brucher : > Hi, > > IIRC, if you do import numpy.core as numpy, it starts by importing > numpy, so it will be even slower. > > Matthieu > > 2008/7/1 Hanni Ali : > > Would it not be possible to import just the specific module of numpy that > > provides the functionality your application needs? > > > > i.e. > > > > import numpy.core > > > > or whatever you're using > > > > you could even do: > > > > import numpy.core as numpy > > > > I think that would simplify your code, though I'm no expert. > > > > Hanni > > > > > > 2008/7/1 Andrew Dalke : > >> > >> On Jul 1, 2008, at 2:22 AM, Robert Kern wrote: > >> > Your use case isn't so typical and so suffers on the import time > >> > end of the > >> > balance. > >> > >> I'm working on my presentation for EuroSciPy. "Isn't so typical" > >> seems to be a good summary of my first slide. :) > >> > >> >> Any chance of cutting down on the number, in order > >> >> to improve startup costs? > >> > > >> > Not at this point in time, no. That would break too much code. > >> > >> Understood. > >> > >> Thanks for the response, > >> > >> Andrew > >> dalke at dalkescientific.com > >> > >> > >> _______________________________________________ > >> Numpy-discussion mailing list > >> Numpy-discussion at scipy.org > >> http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > -- > French PhD student > Website : http://matthieu-brucher.developpez.com/ > Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 > LinkedIn : http://www.linkedin.com/in/matthieubrucher > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL:
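A quick way to sanity-check such differences is to time each import in a fresh interpreter; an illustrative sketch (absolute numbers vary by machine, Python version, and disk cache state):

$ python -c "import time; t = time.time(); import numpy; print time.time() - t"
$ python -c "import time; t = time.time(); import numpy.core; print time.time() - t"

Since numpy.core itself imports the numpy package first, the second command should never come out faster than the first, which matches what Hanni observed.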
From dalke at dalkescientific.com Tue Jul 1 06:53:02 2008 From: dalke at dalkescientific.com (Andrew Dalke) Date: Tue, 1 Jul 2008 12:53:02 +0200 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730806301722h335cd094n5cc28ff4c37002b1@mail.gmail.com> <789d27b10807010313w46ba53a2gd54c852bcb70487d@mail.gmail.com> Message-ID: <12500070-087E-482B-821F-DF6D5CC67738@dalkescientific.com> 2008/7/1 Hanni Ali : > Would it not be possible to import just the specific module of numpy > that provides the functionality your application needs? Matthieu Brucher responded: > IIRC, if you do import numpy.core as numpy, it starts by importing > numpy, so it will be even slower. which you can see if you start python with the "-v" option to display imports.

>>> import numpy.core
import numpy # directory /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy
# /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/__init__.pyc matches /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/__init__.py
import numpy # precompiled from /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/__init__.pyc
# /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/__config__.pyc matches /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/__config__.py
import numpy.__config__ # precompiled from /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/__config__.pyc
... and many more

Andrew dalke at dalkescientific.com From cimrman3 at ntc.zcu.cz Tue Jul 1 10:13:46 2008 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Tue, 01 Jul 2008 16:13:46 +0200 Subject: [Numpy-discussion] ANN: SfePy 00.46.02 Message-ID: <486A3B9A.8060106@ntc.zcu.cz> I am pleased to announce the release of SfePy 00.46.02. SfePy is finite element analysis software in Python, based primarily on Numpy and SciPy. Mailing lists, issue tracking, mercurial repository: http://sfepy.org Home page: http://sfepy.kme.zcu.cz

Major improvements:
- alternative short syntax for specifying essential boundary conditions, variables and regions
- manufactured solutions tests:
  - SymPy support
- site configuration now via script/config.py + site_cfg.py
- new solvers
- new terms

For more information on this release, see http://sfepy.googlecode.com/svn/web/releases/004602_RELEASE_NOTES.txt If you happen to come to Leipzig for EuroSciPy 2008, see you there! Best regards, Robert Cimrman & SfePy developers From alan.mcintyre at gmail.com Tue Jul 1 11:26:09 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Tue, 1 Jul 2008 11:26:09 -0400 Subject: [Numpy-discussion] More pending test framework changes (please give feedback) In-Reply-To: <1d36917a0806301054t578dfc94n64ffc213cb3c79df@mail.gmail.com> References: <1d36917a0806301054t578dfc94n64ffc213cb3c79df@mail.gmail.com> Message-ID: <1d36917a0807010826n62daaf03x8d8dd6fdd42ca37b@mail.gmail.com> On Mon, Jun 30, 2008 at 1:54 PM, Alan McIntyre wrote: > 1. All doctests in NumPy will have the numpy module available in their > execution context as "np". > > 2. Turn on the normalized whitespace option for all doctests. Having > a doctest fail just because there's a space after your result seems > like an unnecessary hassle for documenters. > > 3.
Output will be ignored for each doctest expected output line that > contains "#random". I figured this can serve both as an ignore flag > and indication to the reader that the listed output may differ from > what they see if they execute the associated command. So you would be > able to do: >>>> random.random() > 0.1234567890 #random: output may differ on your system > > And have the example executed but not cause a failure. You could also > use this to ignore the output from plot > methods as well. Since I didn't see any objections, these changes are now committed. I'll be updating some doctests to take advantage of them later today. Alan From millman at berkeley.edu Tue Jul 1 13:41:28 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 1 Jul 2008 10:41:28 -0700 Subject: [Numpy-discussion] More pending test framework changes (please give feedback) In-Reply-To: <1d36917a0807010826n62daaf03x8d8dd6fdd42ca37b@mail.gmail.com> References: <1d36917a0806301054t578dfc94n64ffc213cb3c79df@mail.gmail.com> <1d36917a0807010826n62daaf03x8d8dd6fdd42ca37b@mail.gmail.com> Message-ID: On Tue, Jul 1, 2008 at 8:26 AM, Alan McIntyre wrote: > Since I didn't see any objections, these changes are now committed. > I'll be updating some doctests to take advantage of them later today. Excellent. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From alan.mcintyre at gmail.com Tue Jul 1 13:56:45 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Tue, 1 Jul 2008 13:56:45 -0400 Subject: [Numpy-discussion] Doctest items Message-ID: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> Just a few questions/comments about doctests: 1. Should all doctests be written such that you could start Python, do an "import numpy as np", and then type in the examples verbatim? There are a number that currently wouldn't work that way (they depend on the function under test being in the local namespace, for example). 2. In regard to the auto-ignore of "plt.", etc. in commands: using the existing ellipsis feature of doctest should cover a significant portion of the cases that the auto-ignore was suggested to solve, and it's a very minor change to enable it (whereas the auto-ignore is more involved). If nobody objects, I will enable ellipsis for all doctests (which doesn't cause any obvious problems in existing NumPy tests), and use it to clean up existing doctests where appropriate. If the auto-ignore capability is still needed after that, I'll work on it. 3. When the test suite is run with doctests enabled, some unit tests fail that normally wouldn't, and some doctests fail that shouldn't. There's probably some state that needs to be reset (or otherwise managed) between doctest runs; I'll look into that and provide a fix as soon as possible. I just figured I should mention it in case it causes somebody problems before it gets fixed. Thanks, Alan From alan.mcintyre at gmail.com Tue Jul 1 14:13:30 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Tue, 1 Jul 2008 14:13:30 -0400 Subject: [Numpy-discussion] Coverage improvement requests Message-ID: <1d36917a0807011113m44cbf4cfl760678dd9e8e25f4@mail.gmail.com> Hi all, This week I'm going to start working on new tests to improve Python code coverage in NumPy (C code coverage will come later in the summer, or maybe even after GSoC). Does anyone have recommendations for particularly important bits of code that need coverage? 
If not, I'm just going to start on modules with the largest number of uncovered lines, and go from there. Thanks, Alan (Sorry for the flood of posts) From charlesr.harris at gmail.com Tue Jul 1 14:37:07 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 1 Jul 2008 12:37:07 -0600 Subject: [Numpy-discussion] More pending test framework changes (please give feedback) In-Reply-To: <1d36917a0807010826n62daaf03x8d8dd6fdd42ca37b@mail.gmail.com> References: <1d36917a0806301054t578dfc94n64ffc213cb3c79df@mail.gmail.com> <1d36917a0807010826n62daaf03x8d8dd6fdd42ca37b@mail.gmail.com> Message-ID: On Tue, Jul 1, 2008 at 9:26 AM, Alan McIntyre wrote: > On Mon, Jun 30, 2008 at 1:54 PM, Alan McIntyre > wrote: > > 1. All doctests in NumPy will have the numpy module available in their > > execution context as "np". > > > > 2. Turn on the normalized whitespace option for all doctests. Having > > a doctest fail just because there's a space after your result seems > > like an unnecessary hassle for documenters. > > > > 3. Output will be ignored for each doctest expected output line that > > contains "#random". I figured this can serve both as an ignore flag > > and indication to the reader that the listed output may differ from > > what they see if they execute the associated command. So you would be > > able to do: > >>>> random.random() > > 0.1234567890 #random: output may differ on your > system > > > > And have the example executed but not cause a failure. You could also > > use this to ignore the output from plot > > methods as well. > > Since I didn't see any objections, these changes are now committed. > I'll be updating some doctests to take advantage of them later today. > I note that a lot of unit test files import tons of specific functions, numpy.core, etc., etc. Is there any reason not to fix things up to import numpy as np from numpy.testing import * I fixed one file this way, but I wonder if we shouldn't make all of them work like that. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Jul 1 14:38:57 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 1 Jul 2008 12:38:57 -0600 Subject: [Numpy-discussion] Doctest items In-Reply-To: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> References: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> Message-ID: On Tue, Jul 1, 2008 at 11:56 AM, Alan McIntyre wrote: > Just a few questions/comments about doctests: > > 1. Should all doctests be written such that you could start Python, do > an "import numpy as np", and then type in the examples verbatim? There > are a number that currently wouldn't work that way (they depend on the > function under test being in the local namespace, for example). > +1 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Jul 1 14:42:05 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 1 Jul 2008 13:42:05 -0500 Subject: [Numpy-discussion] Coverage improvement requests In-Reply-To: <1d36917a0807011113m44cbf4cfl760678dd9e8e25f4@mail.gmail.com> References: <1d36917a0807011113m44cbf4cfl760678dd9e8e25f4@mail.gmail.com> Message-ID: <3d375d730807011142q613ebf74sae0d700ed534ae19@mail.gmail.com> On Tue, Jul 1, 2008 at 13:13, Alan McIntyre wrote: > Hi all, > > This week I'm going to start working on new tests to improve Python > code coverage in NumPy (C code coverage will come later in the summer, > or maybe even after GSoC). 
Does anyone have recommendations for > particularly important bits of code that need coverage? If not, I'm > just going to start on modules with the largest number of uncovered > lines, and go from there. numpy.core and numpy.lib are good places to start. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Tue Jul 1 14:45:58 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 1 Jul 2008 13:45:58 -0500 Subject: [Numpy-discussion] Doctest items In-Reply-To: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> References: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> Message-ID: <3d375d730807011145m54a6f8d8u65bf54abeadf58ee@mail.gmail.com> On Tue, Jul 1, 2008 at 12:56, Alan McIntyre wrote: > Just a few questions/comments about doctests: > > 1. Should all doctests be written such that you could start Python, do > an "import numpy as np", and then type in the examples verbatim? There > are a number that currently wouldn't work that way (they depend on the > function under test being in the local namespace, for example). > > 2. In regard to the auto-ignore of "plt.", etc. in commands: using the > existing ellipsis feature of doctest should cover a significant > portion of the cases that the auto-ignore was suggested to solve, and > it's a very minor change to enable it (whereas the auto-ignore is more > involved). If nobody objects, I will enable ellipsis for all doctests > (which doesn't cause any obvious problems in existing NumPy tests), > and use it to clean up existing doctests where appropriate. +1 > If the > auto-ignore capability is still needed after that, I'll work on it. It seems to me that the ellipsis mechanism just allows the output to differ. However, matplotlib would still be required because plt.plot() would still be executed. matplotlib should not be a requirement for running the tests. > 3. When the test suite is run with doctests enabled, some unit tests > fail that normally wouldn't, and some doctests fail that shouldn't. > There's probably some state that needs to be reset (or otherwise > managed) between doctest runs; I'll look into that and provide a fix > as soon as possible. I just figured I should mention it in case it > causes somebody problems before it gets fixed. Thank you for the notice. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From alan.mcintyre at gmail.com Tue Jul 1 15:14:32 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Tue, 1 Jul 2008 15:14:32 -0400 Subject: [Numpy-discussion] Doctest items In-Reply-To: <3d375d730807011145m54a6f8d8u65bf54abeadf58ee@mail.gmail.com> References: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> <3d375d730807011145m54a6f8d8u65bf54abeadf58ee@mail.gmail.com> Message-ID: <1d36917a0807011214q42199fe7ydbf85c1638b4ba20@mail.gmail.com> On Tue, Jul 1, 2008 at 2:45 PM, Robert Kern wrote: >> If the >> auto-ignore capability is still needed after that, I'll work on it. > > It seems to me that the ellipsis mechanism just allows the output to > differ. However, matplotlib would still be required because plt.plot() > would still be executed. matplotlib should not be a requirement for > running the tests. 
Oops, I misunderstood, then: I thought the intent was to execute the statement but not compare the output (because they returned objects that had their address in the repr). I didn't look before to see how many times this feature would be needed (yeah, should have done that before complaining about using #doctest: +SKIP), but now that I look, I only see one batch of plt. commands, in numpy.lib.function_base.bartlett. In view of that, does it make more sense to use the SKIP directive for the ten plt. lines in that one example? From robert.kern at gmail.com Tue Jul 1 15:20:32 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 1 Jul 2008 14:20:32 -0500 Subject: [Numpy-discussion] Doctest items In-Reply-To: <1d36917a0807011214q42199fe7ydbf85c1638b4ba20@mail.gmail.com> References: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> <3d375d730807011145m54a6f8d8u65bf54abeadf58ee@mail.gmail.com> <1d36917a0807011214q42199fe7ydbf85c1638b4ba20@mail.gmail.com> Message-ID: <3d375d730807011220w10bd8038x84088f5d93364eca@mail.gmail.com> On Tue, Jul 1, 2008 at 14:14, Alan McIntyre wrote: > On Tue, Jul 1, 2008 at 2:45 PM, Robert Kern wrote: >>> If the >>> auto-ignore capability is still needed after that, I'll work on it. >> >> It seems to me that the ellipsis mechanism just allows the output to >> differ. However, matplotlib would still be required because plt.plot() >> would still be executed. matplotlib should not be a requirement for >> running the tests. > > Oops, I misunderstood, then: I thought the intent was to execute the > statement but not compare the output (because they returned objects > that had their address in the repr). > > I didn't look before to see how many times this feature would be > needed (yeah, should have done that before complaining about using > #doctest: +SKIP), but now that I look, I only see one batch of plt. > commands, in numpy.lib.function_base.bartlett. In view of that, does > it make more sense to use the SKIP directive for the ten plt. lines in > that one example? Can it work on an entire section? If not, can we do something that works on a whole section? Everything after "Plot the window and its frequency response:" is not required for testing. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From alan.mcintyre at gmail.com Tue Jul 1 15:21:31 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Tue, 1 Jul 2008 15:21:31 -0400 Subject: [Numpy-discussion] More pending test framework changes (please give feedback) In-Reply-To: References: <1d36917a0806301054t578dfc94n64ffc213cb3c79df@mail.gmail.com> <1d36917a0807010826n62daaf03x8d8dd6fdd42ca37b@mail.gmail.com> Message-ID: <1d36917a0807011221g40db9e1cie41c7a1a35c60855@mail.gmail.com> On Tue, Jul 1, 2008 at 2:37 PM, Charles R Harris wrote: > I note that a lot of unit test files import tons of specific functions, > numpy.core, etc., etc. Is there any reason not to fix things up to > > import numpy as np > from numpy.testing import * > > I fixed one file this way, but I wonder if we shouldn't make all of them > work like that. Personally, I prefer the imports to be as simple as possible, but I managed to restrain myself from cleaning up test module imports when I was making my changes. ;) If making them somewhat standardized is desirable, I might as well do it while I'm cleaning up and fixing tests. 
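Under that convention, a test module would open with just the two imports Charles suggested. A minimal sketch (the example test itself is illustrative, not taken from the numpy sources):

import numpy as np
from numpy.testing import *

def test_mean():
    # assert_almost_equal comes from the numpy.testing star import
    a = np.array([1.0, 2.0, 3.0])
    assert_almost_equal(a.mean(), 2.0)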
From alan.mcintyre at gmail.com Tue Jul 1 15:30:47 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Tue, 1 Jul 2008 15:30:47 -0400 Subject: [Numpy-discussion] Doctest items In-Reply-To: <3d375d730807011220w10bd8038x84088f5d93364eca@mail.gmail.com> References: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> <3d375d730807011145m54a6f8d8u65bf54abeadf58ee@mail.gmail.com> <1d36917a0807011214q42199fe7ydbf85c1638b4ba20@mail.gmail.com> <3d375d730807011220w10bd8038x84088f5d93364eca@mail.gmail.com> Message-ID: <1d36917a0807011230o6f7df4d3j86f378dcb3a3d2f@mail.gmail.com> On Tue, Jul 1, 2008 at 3:20 PM, Robert Kern wrote: > Can it work on an entire section? If not, can we do something that > works on a whole section? Everything after "Plot the window and its > frequency response:" is not required for testing. It's on a per-line basis at the moment, so each line needs a "#doctest: +SKIP". Changing a directive to apply to multiple lines probably isn't trivial (I haven't really looked into doing that, though). We could always just make the plotting section one of those "it's just an example not a doctest" things and remove the ">>>" (since it doesn't appear to provide any useful test coverage or anything). From robert.kern at gmail.com Tue Jul 1 15:33:19 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 1 Jul 2008 14:33:19 -0500 Subject: [Numpy-discussion] Doctest items In-Reply-To: <1d36917a0807011230o6f7df4d3j86f378dcb3a3d2f@mail.gmail.com> References: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> <3d375d730807011145m54a6f8d8u65bf54abeadf58ee@mail.gmail.com> <1d36917a0807011214q42199fe7ydbf85c1638b4ba20@mail.gmail.com> <3d375d730807011220w10bd8038x84088f5d93364eca@mail.gmail.com> <1d36917a0807011230o6f7df4d3j86f378dcb3a3d2f@mail.gmail.com> Message-ID: <3d375d730807011233i3b0bbc60n718ab3169b900279@mail.gmail.com> On Tue, Jul 1, 2008 at 14:30, Alan McIntyre wrote: > On Tue, Jul 1, 2008 at 3:20 PM, Robert Kern wrote: >> Can it work on an entire section? If not, can we do something that >> works on a whole section? Everything after "Plot the window and its >> frequency response:" is not required for testing. > > It's on a per-line basis at the moment, so each line needs a > "#doctest: +SKIP". Changing a directive to apply to multiple lines > probably isn't trivial (I haven't really looked into doing that, > though). > > We could always just make the plotting section one of those "it's just > an example not a doctest" things and remove the ">>>" (since it > doesn't appear to provide any useful test coverage or anything). That's not a bad idea. Coordinate with Stéfan about the details (if any are left to be decided). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
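For reference, the per-line directive Alan mentions is standard doctest syntax. Applied to the bartlett docstring it would look something like this (a sketch only):

>>> from matplotlib import pyplot as plt  # doctest: +SKIP
>>> window = np.bartlett(51)
>>> plt.plot(window)  # doctest: +SKIP
>>> plt.title("Bartlett window")  # doctest: +SKIP

Examples marked +SKIP are never executed, so matplotlib need not be installed to run the suite; the cost is repeating the directive on every plotting line, which is what motivates the section-level alternatives discussed here.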
From charlesr.harris at gmail.com Tue Jul 1 15:37:44 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 1 Jul 2008 13:37:44 -0600 Subject: [Numpy-discussion] More pending test framework changes (please give feedback) In-Reply-To: <1d36917a0807011221g40db9e1cie41c7a1a35c60855@mail.gmail.com> References: <1d36917a0806301054t578dfc94n64ffc213cb3c79df@mail.gmail.com> <1d36917a0807010826n62daaf03x8d8dd6fdd42ca37b@mail.gmail.com> <1d36917a0807011221g40db9e1cie41c7a1a35c60855@mail.gmail.com> Message-ID: On Tue, Jul 1, 2008 at 1:21 PM, Alan McIntyre wrote: > On Tue, Jul 1, 2008 at 2:37 PM, Charles R Harris > wrote: > > I note that a lot of unit test files import tons of specific functions, > > numpy.core, etc., etc. Is there any reason not to fix things up to > > > > import numpy as np > > from numpy.testing import * > > > > I fixed one file this way, but I wonder if we shouldn't make all of them > > work like that. > > Personally, I prefer the imports to be as simple as possible, but I > managed to restrain myself from cleaning up test module imports when I > was making my changes. ;) If making them somewhat standardized is > desirable, I might as well do it while I'm cleaning up and fixing > tests. > A lot of the imports seem to have just grown over the years, some even contain duplicates. So I think cleaning up would be a good idea if no one objects. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Jul 1 15:39:36 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 1 Jul 2008 13:39:36 -0600 Subject: [Numpy-discussion] Doctest items In-Reply-To: <1d36917a0807011230o6f7df4d3j86f378dcb3a3d2f@mail.gmail.com> References: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> <3d375d730807011145m54a6f8d8u65bf54abeadf58ee@mail.gmail.com> <1d36917a0807011214q42199fe7ydbf85c1638b4ba20@mail.gmail.com> <3d375d730807011220w10bd8038x84088f5d93364eca@mail.gmail.com> <1d36917a0807011230o6f7df4d3j86f378dcb3a3d2f@mail.gmail.com> Message-ID: On Tue, Jul 1, 2008 at 1:30 PM, Alan McIntyre wrote: > On Tue, Jul 1, 2008 at 3:20 PM, Robert Kern wrote: > > Can it work on an entire section? If not, can we do something that > > works on a whole section? Everything after "Plot the window and its > > frequency response:" is not required for testing. > > It's on a per-line basis at the moment, so each line needs a > "#doctest: +SKIP". Changing a directive to apply to multiple lines > probably isn't trivial (I haven't really looked into doing that, > though). > > We could always just make the plotting section one of those "it's just > an example not a doctest" things and remove the ">>>" (since it > doesn't appear to provide any useful test coverage or anything). Would it serve to overload plot with a function that does zippo? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL:
From alan.mcintyre at gmail.com Tue Jul 1 15:44:52 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Tue, 1 Jul 2008 15:44:52 -0400 Subject: [Numpy-discussion] Doctest items In-Reply-To: References: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> <3d375d730807011145m54a6f8d8u65bf54abeadf58ee@mail.gmail.com> <1d36917a0807011214q42199fe7ydbf85c1638b4ba20@mail.gmail.com> <3d375d730807011220w10bd8038x84088f5d93364eca@mail.gmail.com> <1d36917a0807011230o6f7df4d3j86f378dcb3a3d2f@mail.gmail.com> Message-ID: <1d36917a0807011244r78937d32re5cb83ac8ae63238@mail.gmail.com> On Tue, Jul 1, 2008 at 3:39 PM, Charles R Harris wrote: >> We could always just make the plotting section one of those "it's just >> an example not a doctest" things and remove the ">>>" (since it >> doesn't appear to provide any useful test coverage or anything). > > Would it serve to overload plot with a function that does zippo? Probably not in this case; there's an explicit matplotlib import, and then a bunch of method calls on a matplotlib object:

>>> from matplotlib import pyplot as plt
>>> window = np.bartlett(51)
>>> plt.plot(window)
>>> plt.title("Bartlett window")

and so on. From alan.mcintyre at gmail.com Tue Jul 1 15:46:22 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Tue, 1 Jul 2008 15:46:22 -0400 Subject: [Numpy-discussion] Doctest items In-Reply-To: <3d375d730807011233i3b0bbc60n718ab3169b900279@mail.gmail.com> References: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> <3d375d730807011145m54a6f8d8u65bf54abeadf58ee@mail.gmail.com> <1d36917a0807011214q42199fe7ydbf85c1638b4ba20@mail.gmail.com> <3d375d730807011220w10bd8038x84088f5d93364eca@mail.gmail.com> <1d36917a0807011230o6f7df4d3j86f378dcb3a3d2f@mail.gmail.com> <3d375d730807011233i3b0bbc60n718ab3169b900279@mail.gmail.com> Message-ID: <1d36917a0807011246r4c5d97adm3fba92d8f5c1714f@mail.gmail.com> On Tue, Jul 1, 2008 at 3:33 PM, Robert Kern wrote: > That's not a bad idea. Coordinate with Stéfan about the details (if > any are left to be decided). Ok, will do. I'll also update all the test documentation I can find so that documenters have a chance of being aware of the doctest assumptions/requirements/capabilities. So, unless anyone else has objections, I'll:

1. Enable ellipsis for all doctests
2. Update all doctests so that they only assume "import numpy as np".

I'll also check into restricting the doctest execution environment so that tests that make other assumptions should fail (as part of figuring out the test state pollution problem). From alan.mcintyre at gmail.com Tue Jul 1 15:49:52 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Tue, 1 Jul 2008 15:49:52 -0400 Subject: [Numpy-discussion] More pending test framework changes (please give feedback) In-Reply-To: References: <1d36917a0806301054t578dfc94n64ffc213cb3c79df@mail.gmail.com> <1d36917a0807010826n62daaf03x8d8dd6fdd42ca37b@mail.gmail.com> <1d36917a0807011221g40db9e1cie41c7a1a35c60855@mail.gmail.com> Message-ID: <1d36917a0807011249kd84684anacc125c10e214c61@mail.gmail.com> On Tue, Jul 1, 2008 at 3:37 PM, Charles R Harris wrote: > A lot of the imports seem to have just grown over the years, some even > contain duplicates. So I think cleaning up would be a good idea if no one > objects. Ok. As a pre-emptive clarification, I'll only be tweaking imports in unit test files--I don't want to mess with any of the magic that goes on in the package imports. ;)
From robert.kern at gmail.com Tue Jul 1 16:12:24 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 1 Jul 2008 15:12:24 -0500 Subject: [Numpy-discussion] Doctest items In-Reply-To: References: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> <3d375d730807011145m54a6f8d8u65bf54abeadf58ee@mail.gmail.com> <1d36917a0807011214q42199fe7ydbf85c1638b4ba20@mail.gmail.com> <3d375d730807011220w10bd8038x84088f5d93364eca@mail.gmail.com> <1d36917a0807011230o6f7df4d3j86f378dcb3a3d2f@mail.gmail.com> Message-ID: <3d375d730807011312q7382e54ei6469fdd010f6b477@mail.gmail.com> On Tue, Jul 1, 2008 at 14:39, Charles R Harris wrote: > > On Tue, Jul 1, 2008 at 1:30 PM, Alan McIntyre > wrote: >> >> On Tue, Jul 1, 2008 at 3:20 PM, Robert Kern wrote: >> > Can it work on an entire section? If not, can we do something that >> > works on a whole section? Everything after "Plot the window and its >> > frequency response:" is not required for testing. >> >> It's on a per-line basis at the moment, so each line needs a >> "#doctest: +SKIP". Changing a directive to apply to multiple lines >> probably isn't trivial (I haven't really looked into doing that, >> though). >> >> We could always just make the plotting section one of those "it's just >> an example not a doctest" things and remove the ">>>" (since it >> doesn't appear to provide any useful test coverage or anything). > > Would it serve to overload plot with a function that does zippo? If it's not going to test anything, I would prefer that it not be part of the tests. Admittedly, that's just my sense of aesthetics, not a technical objection. A technical objection would be that some of the matplotlib functions actually do return something, and we would still have to uglify the examples with the ellipsis stuff. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pav at iki.fi Tue Jul 1 16:41:41 2008 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 1 Jul 2008 20:41:41 +0000 (UTC) Subject: [Numpy-discussion] Doctest items References: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> <3d375d730807011145m54a6f8d8u65bf54abeadf58ee@mail.gmail.com> <1d36917a0807011214q42199fe7ydbf85c1638b4ba20@mail.gmail.com> <3d375d730807011220w10bd8038x84088f5d93364eca@mail.gmail.com> <1d36917a0807011230o6f7df4d3j86f378dcb3a3d2f@mail.gmail.com> Message-ID: Tue, 01 Jul 2008 15:30:47 -0400, Alan McIntyre wrote: > On Tue, Jul 1, 2008 at 3:20 PM, Robert Kern > wrote: >> Can it work on an entire section? If not, can we do something that >> works on a whole section? Everything after "Plot the window and its >> frequency response:" is not required for testing. > > It's on a per-line basis at the moment, so each line needs a "#doctest: > +SKIP". Changing a directive to apply to multiple lines probably isn't > trivial (I haven't really looked into doing that, though). I think this can be done without too many problems: Looking at the doctest.py source code, the easiest way to change how docstrings are parsed into doctests is to subclass `DocTestParser` and override its `parse` method. The `testmod` and `testfile` functions don't take a parser argument, but they appear to be only thin wrappers for instantiating `DocTestFinder` (which does take a parser argument) and then calling a method in `DocTestRunner`. Now, things appear to work a bit differently in nose. There, DocTestParser is instantiated by Doctest.loadTestsFromFile in the doctests.py plugin. I don't see an easy way to override this, except for monkeypatching doctest.DocTestParser or the whole nose plugin with our stuff. All in all, I'd estimate this to be ~100 lines, put in a suitable location. But it's a custom tweak to doctest, so it might break at some point in the future, and I don't love the monkeypatching here... > We could always just make the plotting section one of those "it's just > an example not a doctest" things and remove the ">>>" (since it doesn't > appear to provide any useful test coverage or anything). If possible, I'd like other possibilities to be considered first before going down this route. I think it would be nice to retain the ability to run also the matplotlib examples as (optional) doctests, to make sure they also execute correctly. Also, using two different markups in the documentation to work around a shortcoming of doctest is IMHO not very elegant. -- Pauli Virtanen
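For concreteness, the wiring Pauli outlines might look like the sketch below; the parser subclass is hypothetical, and its parse() would hold whatever example-filtering logic gets decided on:

import doctest
import numpy as np

class SectionAwareParser(doctest.DocTestParser):
    # Hypothetical subclass: post-process the examples doctest extracts,
    # e.g. drop everything after a plotting marker, before they are run.
    def parse(self, string, name='<string>'):
        return doctest.DocTestParser.parse(self, string, name)

# DocTestFinder accepts a custom parser even though testmod/testfile do not.
finder = doctest.DocTestFinder(parser=SectionAwareParser())
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(np.bartlett, extraglobs={'np': np}):
    runner.run(test)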
From sdb at cloud9.net Tue Jul 1 17:18:55 2008 From: sdb at cloud9.net (Stuart Brorson) Date: Tue, 1 Jul 2008 17:18:55 -0400 (EDT) Subject: [Numpy-discussion] Change of behavior in flatten between 1.0.4 and 1.1 Message-ID: Hi -- I have noticed a change in the behavior of numpy.flatten(True) between NumPy 1.0.4 and NumPy 1.1. The change affects 3D arrays. I am wondering if this is a bug or a feature. Here's the change. Note that the output from flatten(True) is different between 1.0.4 and 1.1.

======= First the preliminary set up: =======

In [3]: A = numpy.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]])

In [4]: A
Out[4]:
array([[[ 1, 2],
        [ 3, 4]],

       [[ 5, 6],
        [ 7, 8]],

       [[ 9, 10],
        [11, 12]]])

======= Now the change: Numpy 1.0.4 =======

In [5]: A.flatten()
Out[5]: array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

In [6]: A.flatten(True)
Out[6]: array([ 1, 5, 9, 2, 6, 10, 3, 7, 11, 4, 8, 12])

======= Numpy 1.1 =======

In [4]: A.flatten()
Out[4]: array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

In [5]: A.flatten(True)
Out[5]: array([ 1, 5, 9, 3, 7, 11, 2, 6, 10, 4, 8, 12])

Note that the output of A.flatten(True) is different. Is this a bug or a feature? Cheers, Stuart Brorson Interactive Supercomputing, inc. 135 Beaver Street | Waltham | MA | 02452 | USA http://www.interactivesupercomputing.com/ From pav at iki.fi Tue Jul 1 17:38:50 2008 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 1 Jul 2008 21:38:50 +0000 (UTC) Subject: [Numpy-discussion] Change of behavior in flatten between 1.0.4 and 1.1 References: Message-ID: Tue, 01 Jul 2008 17:18:55 -0400, Stuart Brorson wrote: > Hi -- > > I have noticed a change in the behavior of numpy.flatten(True) between > NumPy 1.0.4 and NumPy 1.1. The change affects 3D arrays. I am > wondering if this is a bug or a feature. > > Here's the change. Note that the output from flatten(True) is different > between 1.0.4 and 1.1. I think it was this one http://scipy.org/scipy/numpy/ticket/676 The rationale was to make the output of .flatten(1) equal to interpreting the data as it would appear in a multidimensional Fortran array (equivalent to reshape(a, (prod(a.shape),), order='F'), IIRC). In 1.0.4, a.flatten(1) only swapped the first two axes and then flattened in C-order. In 1.1, a.flatten(1) == a.transpose().flatten(). To me, it appeared that the behavior in 1.0.4 was incorrect, so I filed the bug (after being bitten by it in real code...) and submitted a patch that got applied. -- Pauli Virtanen

> ======= First the preliminary set up: =======
>
> In [3]: A = numpy.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]])
>
> In [4]: A
> Out[4]:
> array([[[ 1, 2],
>         [ 3, 4]],
>
>        [[ 5, 6],
>         [ 7, 8]],
>
>        [[ 9, 10],
>         [11, 12]]])
>
> ======= Now the change: Numpy 1.0.4 =======
>
> In [5]: A.flatten()
> Out[5]: array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
>
> In [6]: A.flatten(True)
> Out[6]: array([ 1, 5, 9, 2, 6, 10, 3, 7, 11, 4, 8, 12])

Here the first two dimensions are swapped and the data is interpreted in C-order.

> ======= Numpy 1.1 =======
>
> In [4]: A.flatten()
> Out[4]: array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
>
> In [5]: A.flatten(True)
> Out[5]: array([ 1, 5, 9, 3, 7, 11, 2, 6, 10, 4, 8, 12])

Here dimensions are transposed, and data is interpreted in C-order. -- Pauli Virtanen
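The equivalences Pauli describes are easy to check against Stuart's array; a quick sketch, assuming the numpy 1.1 semantics discussed above:

import numpy as np
from numpy import prod, reshape

A = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]])
f = A.flatten(True)  # Fortran-order flatten under the 1.1 behavior

# .flatten(1) == a.transpose().flatten() == a Fortran-order reshape
assert (f == A.transpose().flatten()).all()
assert (f == reshape(A, (prod(A.shape),), order='F')).all()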
From fperez.net at gmail.com Tue Jul 1 18:50:27 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 1 Jul 2008 15:50:27 -0700 Subject: [Numpy-discussion] Doctest items In-Reply-To: References: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> <3d375d730807011145m54a6f8d8u65bf54abeadf58ee@mail.gmail.com> <1d36917a0807011214q42199fe7ydbf85c1638b4ba20@mail.gmail.com> <3d375d730807011220w10bd8038x84088f5d93364eca@mail.gmail.com> <1d36917a0807011230o6f7df4d3j86f378dcb3a3d2f@mail.gmail.com> Message-ID: On Tue, Jul 1, 2008 at 1:41 PM, Pauli Virtanen wrote: > But it's a custom tweak to doctest, so it might break at some point in > the future, and I don't love the monkeypatching here... Welcome to the joys of extending doctest/unittest. They hardcoded so much stuff in there that the only way to reuse that code is by copy/paste/monkeypatch. It's absolutely atrocious. >> We could always just make the plotting section one of those "it's just >> an example not a doctest" things and remove the ">>>" (since it doesn't >> appear to provide any useful test coverage or anything). > > If possible, I'd like other possibilities to be considered first before > going down this route. I think it would be nice to retain the ability to run > also the matplotlib examples as (optional) doctests, to make sure they also > execute correctly. Also, using two different markups in the > documentation to work around a shortcoming of doctest is IMHO not very > elegant. How about a much simpler approach? Just pre-populate the globals dict where doctest executes with an object called 'plt' that basically does

def noop(*a, **k): pass

class dummy():
    def __getattr__(self, k): return noop

plt = dummy()

This would ensure that all calls to plt.anything() silently succeed in the doctests. Granted, we're not testing matplotlib, but it has the benefit of simplicity and of letting us keep consistent formatting, and examples that *users* can still paste into their sessions where plt refers to the real matplotlib. Just an idea... f
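One way to wire a stub like this into a doctest run, without touching doctest internals, is the extraglobs argument of doctest.testmod; a sketch (the target module is just an example):

import doctest
import numpy as np
import numpy.lib.function_base  # holds bartlett() and its plotting example

def noop(*a, **k):
    pass

class DummyPlt(object):
    def __getattr__(self, name):
        return noop

# Pre-populate the doctest globals with 'np' and a do-nothing 'plt'.
doctest.testmod(numpy.lib.function_base,
                extraglobs={'np': np, 'plt': DummyPlt()})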
From dbrown at ucar.edu Tue Jul 1 18:53:49 2008 From: dbrown at ucar.edu (David Brown) Date: Tue, 1 Jul 2008 16:53:49 -0600 Subject: [Numpy-discussion] problem with NumPy test harness in 1.1.0? Message-ID: Hi, Our code depends on NumPy and we have been taking advantage of the NumPy test harness to create test scripts for new features as we develop them. As recently as version 1.1.0.dev5064 we had no problem running our tests from the directory in which they live, e.g.:

python test_script.py

.....
.
----------------------------------------------------------------------
Ran 10 tests in 0.397s

OK

However, with the released version NumPy 1.1.0 none of our tests work any longer and instead we get this error message:

Traceback (most recent call last):
  File "test_mfio.py", line 426, in <module>
    NumpyTest().run()
  File "/usr/local/lib/python2.5/site-packages/numpy/testing/numpytest.py", line 655, in run
    testcase_pattern=options.testcase_pattern)
  File "/usr/local/lib/python2.5/site-packages/numpy/testing/numpytest.py", line 575, in test
    level, verbosity)
  File "/usr/local/lib/python2.5/site-packages/numpy/testing/numpytest.py", line 453, in _test_suite_from_all_tests
    importall(this_package)
  File "/usr/local/lib/python2.5/site-packages/numpy/testing/numpytest.py", line 681, in importall
    for subpackage_name in os.listdir(package_dir):
OSError: [Errno 2] No such file or directory: ''

Looking through the archives I have found a reference to the same error at http://www.nabble.com/NumpyTest-problem-td17603890.html. However, this thread is focused on the use of NumpyTest for test scripts within the NumPy source tree and the suggestions do not seem to apply to our situation. My question is whether using NumpyTest for our own tests is considered to be a proper application of the test module. If not, why not? If it is, is there some modification we can make to our tests to make them work with NumpyTest as it is now configured? Or is this possibly a bug in NumPyTest? Dave Brown PyNGL/PyNIO development team From charlesr.harris at gmail.com Tue Jul 1 19:03:22 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 1 Jul 2008 17:03:22 -0600 Subject: [Numpy-discussion] Doctest items In-Reply-To: References: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> <3d375d730807011145m54a6f8d8u65bf54abeadf58ee@mail.gmail.com> <1d36917a0807011214q42199fe7ydbf85c1638b4ba20@mail.gmail.com> <3d375d730807011220w10bd8038x84088f5d93364eca@mail.gmail.com> <1d36917a0807011230o6f7df4d3j86f378dcb3a3d2f@mail.gmail.com> Message-ID: On Tue, Jul 1, 2008 at 4:50 PM, Fernando Perez wrote: > On Tue, Jul 1, 2008 at 1:41 PM, Pauli Virtanen wrote: > > > But it's a custom tweak to doctest, so it might break at some point in > > the future, and I don't love the monkeypatching here... > > Welcome to the joys of extending doctest/unittest. They hardcoded so > much stuff in there that the only way to reuse that code is by > copy/paste/monkeypatch. It's absolutely atrocious. > > >> We could always just make the plotting section one of those "it's just > >> an example not a doctest" things and remove the ">>>" (since it doesn't > >> appear to provide any useful test coverage or anything). > > > > If possible, I'd like other possibilities to be considered first before > > going down this route. I think it would be nice to retain the ability to run > > also the matplotlib examples as (optional) doctests, to make sure they also > > execute correctly. Also, using two different markups in the > > documentation to work around a shortcoming of doctest is IMHO not very > > elegant. > > How about a much simpler approach? Just pre-populate the globals dict > where doctest executes with an object called 'plt' that basically does
>
> def noop(*a, **k): pass
>
> class dummy():
>     def __getattr__(self, k): return noop
>
> plt = dummy()
>
> This would ensure that all calls to plt.anything() silently succeed in > the doctests. 
Granted, we're not testing matplotlib, but it has the > benefit of simplicity and of letting us keep consistent formatting, > and examples that *users* can still paste into their sessions where > plt refers to the real matplotlib. > > Just an idea... > That was my thought, but Robert didn't like it. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Tue Jul 1 19:11:03 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 1 Jul 2008 16:11:03 -0700 Subject: [Numpy-discussion] Doctest items In-Reply-To: References: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> <3d375d730807011145m54a6f8d8u65bf54abeadf58ee@mail.gmail.com> <1d36917a0807011214q42199fe7ydbf85c1638b4ba20@mail.gmail.com> <3d375d730807011220w10bd8038x84088f5d93364eca@mail.gmail.com> <1d36917a0807011230o6f7df4d3j86f378dcb3a3d2f@mail.gmail.com> Message-ID: On Tue, Jul 1, 2008 at 4:03 PM, Charles R Harris wrote: > That was my thought, but Robert didn't like it. Mmh, ok, sorry that I'd missed the original. Cheers, f From robert.kern at gmail.com Tue Jul 1 19:13:10 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 1 Jul 2008 18:13:10 -0500 Subject: [Numpy-discussion] Doctest items In-Reply-To: References: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> <3d375d730807011145m54a6f8d8u65bf54abeadf58ee@mail.gmail.com> <1d36917a0807011214q42199fe7ydbf85c1638b4ba20@mail.gmail.com> <3d375d730807011220w10bd8038x84088f5d93364eca@mail.gmail.com> <1d36917a0807011230o6f7df4d3j86f378dcb3a3d2f@mail.gmail.com> Message-ID: <3d375d730807011613y28a5a1cbqed0d85fe30535f67@mail.gmail.com> On Tue, Jul 1, 2008 at 17:50, Fernando Perez wrote: > On Tue, Jul 1, 2008 at 1:41 PM, Pauli Virtanen wrote: > >> But it's a custom tweak to doctest, so it might break at some point in >> the future, and I don't love the monkeypatching here... > > Welcome to the joys of extending doctest/unittest. They hardcoded so > much stuff in there that the only way to reuse that code is by > copy/paste/monkeypatch. It's absolutely atrocious. > >>> We could always just make the plotting section one of those "it's just >>> an example not a doctest" things and remove the ">>>" (since it doesn't >>> appear to provide any useful test coverage or anything). >> >> If possible, I'd like other possibilities be considered first before >> jumping this route. I think it would be nice to retain the ability to run >> also the matplotlib examples as (optional) doctests, to make sure also >> they execute correctly. Also, using two different markups in the >> documentation to work around a shortcoming of doctest is IMHO not very >> elegant. > > How about a much simpler approach? Just pre-populate the globals dict > where doctest executes with an object called 'plt' that basically does > > def noop(*a,**k): pass > > class dummy(): > def __getattr__(self,k): return noop > > plt = dummy() > > This would ensure that all calls to plt.anything() silently succeed in > the doctests. Granted, we're not testing matplotlib, but it has the > benefit of simplicity and of letting us keep consistent formatting, > and examples that *users* can still paste into their sessions where > plt refers to the real matplotlib. It's actually easier for users to paste the non-doctestable examples since they don't have the >>> markers and any stdout the examples produce as a byproduct. 
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Tue Jul 1 19:24:53 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 1 Jul 2008 18:24:53 -0500 Subject: [Numpy-discussion] problem with NumPy test harness in 1.1.0? In-Reply-To: References: Message-ID: <3d375d730807011624y64c46ad2o5ee01821dde0dabe@mail.gmail.com> On Tue, Jul 1, 2008 at 17:53, David Brown wrote: > > Hi, > Our code depends on NumPy and we have been taking advantage of the > NumPy test harness > to create test scripts for new features as we develop them. As > recently as version 1.1.0.dev5064 > we had no problem running our tests from the directory in which they > live, e.g.: > > python test_script.py > > ..... > > . > ---------------------------------------------------------------------- > Ran 10 tests in 0.397s > > OK > > However, with the released version NumPy 1.1.0 none of our tests work > any longer and instead we get this > error message: > > Traceback (most recent call last): > File "test_mfio.py", line 426, in > NumpyTest().run() > File "/usr/local/lib/python2.5/site-packages/numpy/testing/ > numpytest.py", line 655, in run > testcase_pattern=options.testcase_pattern) > File "/usr/local/lib/python2.5/site-packages/numpy/testing/ > numpytest.py", line 575, in test > level, verbosity) > File "/usr/local/lib/python2.5/site-packages/numpy/testing/ > numpytest.py", line 453, in _test_suite_from_all_tests > importall(this_package) > File "/usr/local/lib/python2.5/site-packages/numpy/testing/ > numpytest.py", line 681, in importall > for subpackage_name in os.listdir(package_dir): > OSError: [Errno 2] No such file or directory: '' > > Looking through the archives I have found a reference to the same > error at > http://www.nabble.com/NumpyTest-problem-td17603890.html. > > However, this thread is focused on the use of NumpyTest for test > scripts within the NumPy source tree and the suggestions > do not seem to apply to our situation. > > My question is whether using NumpyTest for our own tests is > considered to be a proper application of the test module. > If not why not? You should migrate your tests to some other test runner. NumpyTest and NumpyTestCase will be deprecated in numpy 1.2 and removed as early as numpy 1.3. They were never particularly intended to be used outside of numpy and scipy. We are migrating to using nose as our test runner so we do not have to maintain our own; it's not our core competency. As you can see. > If it is, is there some modification we can make to our tests to make > them work with NumpyTest as it is now configured? Try NumpyTest().test(level=11, all=False) > Or is this possibly a bug in NumPyTest? Possibly a bug. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From dbrown at ucar.edu Tue Jul 1 19:39:17 2008 From: dbrown at ucar.edu (David Brown) Date: Tue, 1 Jul 2008 17:39:17 -0600 Subject: [Numpy-discussion] problem with NumPy test harness in 1.1.0? In-Reply-To: <3d375d730807011624y64c46ad2o5ee01821dde0dabe@mail.gmail.com> References: <3d375d730807011624y64c46ad2o5ee01821dde0dabe@mail.gmail.com> Message-ID: <86DB7ECB-D004-4985-A140-5F977813722C@ucar.edu> OK, thanks for the info Robert. Any recommendations for another test harness? 
-dave On Jul 1, 2008, at 5:24 PM, Robert Kern wrote: > On Tue, Jul 1, 2008 at 17:53, David Brown wrote: >> >> Hi, >> Our code depends on NumPy and we have been taking advantage of the >> NumPy test harness >> to create test scripts for new features as we develop them. As >> recently as version 1.1.0.dev5064 >> we had no problem running our tests from the directory in which they >> live, e.g.: >> >> python test_script.py >> >> ..... >> >> . >> --------------------------------------------------------------------- >> - >> Ran 10 tests in 0.397s >> >> OK >> >> However, with the released version NumPy 1.1.0 none of our tests work >> any longer and instead we get this >> error message: >> >> Traceback (most recent call last): >> File "test_mfio.py", line 426, in >> NumpyTest().run() >> File "/usr/local/lib/python2.5/site-packages/numpy/testing/ >> numpytest.py", line 655, in run >> testcase_pattern=options.testcase_pattern) >> File "/usr/local/lib/python2.5/site-packages/numpy/testing/ >> numpytest.py", line 575, in test >> level, verbosity) >> File "/usr/local/lib/python2.5/site-packages/numpy/testing/ >> numpytest.py", line 453, in _test_suite_from_all_tests >> importall(this_package) >> File "/usr/local/lib/python2.5/site-packages/numpy/testing/ >> numpytest.py", line 681, in importall >> for subpackage_name in os.listdir(package_dir): >> OSError: [Errno 2] No such file or directory: '' >> >> Looking through the archives I have found a reference to the same >> error at >> http://www.nabble.com/NumpyTest-problem-td17603890.html. >> >> However, this thread is focused on the use of NumpyTest for test >> scripts within the NumPy source tree and the suggestions >> do not seem to apply to our situation. >> >> My question is whether using NumpyTest for our own tests is >> considered to be a proper application of the test module. >> If not why not? > > You should migrate your tests to some other test runner. NumpyTest and > NumpyTestCase will be deprecated in numpy 1.2 and removed as early as > numpy 1.3. They were never particularly intended to be used outside of > numpy and scipy. We are migrating to using nose as our test runner so > we do not have to maintain our own; it's not our core competency. As > you can see. > >> If it is, is there some modification we can make to our tests to make >> them work with NumpyTest as it is now configured? > > Try NumpyTest().test(level=11, all=False) > >> Or is this possibly a bug in NumPyTest? > > Possibly a bug. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From robert.kern at gmail.com Tue Jul 1 19:43:04 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 1 Jul 2008 18:43:04 -0500 Subject: [Numpy-discussion] problem with NumPy test harness in 1.1.0? In-Reply-To: <86DB7ECB-D004-4985-A140-5F977813722C@ucar.edu> References: <3d375d730807011624y64c46ad2o5ee01821dde0dabe@mail.gmail.com> <86DB7ECB-D004-4985-A140-5F977813722C@ucar.edu> Message-ID: <3d375d730807011643t2400b68eu88841cfdc216ef@mail.gmail.com> On Tue, Jul 1, 2008 at 18:39, David Brown wrote: > OK, thanks for the info Robert. > Any recommendations for another test harness? numpy and scipy are moving to nose as of numpy 1.2 and scipy 0.7. 
http://www.somethingaboutorange.com/mrl/projects/nose/ It has been a personal favorite of mine for some time, too. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From fperez.net at gmail.com Tue Jul 1 19:48:44 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 1 Jul 2008 16:48:44 -0700 Subject: [Numpy-discussion] problem with NumPy test harness in 1.1.0? In-Reply-To: <3d375d730807011643t2400b68eu88841cfdc216ef@mail.gmail.com> References: <3d375d730807011624y64c46ad2o5ee01821dde0dabe@mail.gmail.com> <86DB7ECB-D004-4985-A140-5F977813722C@ucar.edu> <3d375d730807011643t2400b68eu88841cfdc216ef@mail.gmail.com> Message-ID: On Tue, Jul 1, 2008 at 4:43 PM, Robert Kern wrote: > On Tue, Jul 1, 2008 at 18:39, David Brown wrote: >> OK, thanks for the info Robert. >> Any recommendations for another test harness? > > numpy and scipy are moving to nose as of numpy 1.2 and scipy 0.7. > > http://www.somethingaboutorange.com/mrl/projects/nose/ > > It has been a personal favorite of mine for some time, too. So has ipython recently, for what it's worth. It does make easy a lot of things that should be easy but aren't under unittest, which is a big plus in my view. We also adopted it after advice from Robert, others at Enthought and Titus Brown during a BOF session at last year's SciPy. The fact that nose absorbs/recognizes all regular unittest tests makes the transition fairly easy. Cheers, f From dbrown at ucar.edu Tue Jul 1 19:53:58 2008 From: dbrown at ucar.edu (David Brown) Date: Tue, 1 Jul 2008 17:53:58 -0600 Subject: [Numpy-discussion] problem with NumPy test harness in 1.1.0? In-Reply-To: References: <3d375d730807011624y64c46ad2o5ee01821dde0dabe@mail.gmail.com> <86DB7ECB-D004-4985-A140-5F977813722C@ucar.edu> <3d375d730807011643t2400b68eu88841cfdc216ef@mail.gmail.com> Message-ID: Thanks for the input Fernando. -dave On Jul 1, 2008, at 5:48 PM, Fernando Perez wrote: > On Tue, Jul 1, 2008 at 4:43 PM, Robert Kern > wrote: >> On Tue, Jul 1, 2008 at 18:39, David Brown wrote: >>> OK, thanks for the info Robert. >>> Any recommendations for another test harness? >> >> numpy and scipy are moving to nose as of numpy 1.2 and scipy 0.7. >> >> http://www.somethingaboutorange.com/mrl/projects/nose/ >> >> It has been a personal favorite of mine for some time, too. > > So has ipython recently, for what it's worth. It does make easy a lot > of things that should be easy but aren't under unittest, which is a > big plus in my view. We also adopted it after advice from Robert, > others at Enthought and Titus Brown during a BOF session at last > year's SciPy. > > The fact that nose absorbs/recognizes all regular unittest tests makes > the transition fairly easy. 
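That does make the switch look painless -- e.g. a file like the
following (everything here is made up) should run unmodified under
both the stdlib runner and nose:

    # test_example.py -- nose discovers this by its test_* filename;
    # no NumpyTest boilerplate is needed.
    import unittest
    import numpy as np
    from numpy.testing import assert_array_almost_equal

    class TestFFTRoundtrip(unittest.TestCase):
        def test_ifft_inverts_fft(self):
            x = np.random.rand(8)
            assert_array_almost_equal(np.fft.ifft(np.fft.fft(x)).real, x)

    # nose also collects plain functions named test_*:
    def test_flatten_shape():
        assert np.ones((2, 3)).flatten().shape == (6,)

    if __name__ == '__main__':
        unittest.main()   # or, from a shell: nosetests test_example.py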
> > Cheers, > > f > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From rmay31 at gmail.com Tue Jul 1 20:19:14 2008 From: rmay31 at gmail.com (Ryan May) Date: Tue, 01 Jul 2008 20:19:14 -0400 Subject: [Numpy-discussion] Doctest items In-Reply-To: <3d375d730807011613y28a5a1cbqed0d85fe30535f67@mail.gmail.com> References: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> <3d375d730807011145m54a6f8d8u65bf54abeadf58ee@mail.gmail.com> <1d36917a0807011214q42199fe7ydbf85c1638b4ba20@mail.gmail.com> <3d375d730807011220w10bd8038x84088f5d93364eca@mail.gmail.com> <1d36917a0807011230o6f7df4d3j86f378dcb3a3d2f@mail.gmail.com> <3d375d730807011613y28a5a1cbqed0d85fe30535f67@mail.gmail.com> Message-ID: <486AC982.1010006@gmail.com> Robert Kern wrote: > On Tue, Jul 1, 2008 at 17:50, Fernando Perez wrote: >> On Tue, Jul 1, 2008 at 1:41 PM, Pauli Virtanen wrote: >> >>> But it's a custom tweak to doctest, so it might break at some point in >>> the future, and I don't love the monkeypatching here... >> Welcome to the joys of extending doctest/unittest. They hardcoded so >> much stuff in there that the only way to reuse that code is by >> copy/paste/monkeypatch. It's absolutely atrocious. >> >>>> We could always just make the plotting section one of those "it's just >>>> an example not a doctest" things and remove the ">>>" (since it doesn't >>>> appear to provide any useful test coverage or anything). >>> If possible, I'd like other possibilities be considered first before >>> jumping this route. I think it would be nice to retain the ability to run >>> also the matplotlib examples as (optional) doctests, to make sure also >>> they execute correctly. Also, using two different markups in the >>> documentation to work around a shortcoming of doctest is IMHO not very >>> elegant. >> How about a much simpler approach? Just pre-populate the globals dict >> where doctest executes with an object called 'plt' that basically does >> >> def noop(*a,**k): pass >> >> class dummy(): >> def __getattr__(self,k): return noop >> >> plt = dummy() >> >> This would ensure that all calls to plt.anything() silently succeed in >> the doctests. Granted, we're not testing matplotlib, but it has the >> benefit of simplicity and of letting us keep consistent formatting, >> and examples that *users* can still paste into their sessions where >> plt refers to the real matplotlib. > > It's actually easier for users to paste the non-doctestable examples > since they don't have the >>> markers and any stdout the examples > produce as a byproduct. > I'm with Robert here. It's definitely easier as an example without the >>>>. I also don't see the utility of being able to have the matplotlib code as tests of anything. We're not testing matplotlib here and any behavior that matplotlib relies on (and hence tests) should be captured in a test for that behavior separate from matplotlib code. 
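(For reference, Fernando's stand-in is tiny in runnable form -- the
doctest wiring in the trailing comment is an assumption on my part:)

    # Every attribute lookup on the dummy returns a no-op callable, so
    # plt.plot(...), plt.show(), etc. succeed silently and print
    # nothing, keeping doctest output clean.
    def noop(*args, **kwargs):
        pass

    class DummyPlt(object):
        def __getattr__(self, name):
            return noop

    plt = DummyPlt()

    # hypothetical wiring: hand it to doctest as a global, e.g.
    #   import doctest, numpy
    #   doctest.testmod(numpy.lib.function_base,
    #                   globs={'np': numpy, 'plt': plt})

...but I still think a standalone test is the right place for anything
numpy actually relies on.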
Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From alan.mcintyre at gmail.com Tue Jul 1 20:33:08 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Tue, 1 Jul 2008 20:33:08 -0400 Subject: [Numpy-discussion] Doctest items In-Reply-To: References: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> <3d375d730807011145m54a6f8d8u65bf54abeadf58ee@mail.gmail.com> <1d36917a0807011214q42199fe7ydbf85c1638b4ba20@mail.gmail.com> <3d375d730807011220w10bd8038x84088f5d93364eca@mail.gmail.com> <1d36917a0807011230o6f7df4d3j86f378dcb3a3d2f@mail.gmail.com> Message-ID: <1d36917a0807011733h47861da7u4d78eb4eef06e8f3@mail.gmail.com> On Tue, Jul 1, 2008 at 4:41 PM, Pauli Virtanen wrote: > All in all, I'd estimate this to be ~100 lines, put in a suitable > location. > > If possible, I'd like other possibilities be considered first before > jumping this route. I think it would be nice to retain the ability to run > also the matplotlib examples as (optional) doctests, to make sure also > they execute correctly. Also, using two different markups in the > documentation to work around a shortcoming of doctest is IMHO not very > elegant. Well, for the moment, it's 100 lines of new code that's needed in order to *avoid* running 10 lines of doctest code in one function's docstring. Maybe down the road there will be more examples that need a bulk "don't execute me" mechanism, and if we do I'll be glad to work on it, but for right now I need to spend more time increasing coverage. I also agree with Ryan that matplotlib is where the tests for this particular use case of the object returned by bartlett should be. So at the moment I'm inclined to just remove the ">>>" and come back to it later. From robert.kern at gmail.com Tue Jul 1 20:39:29 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 1 Jul 2008 19:39:29 -0500 Subject: [Numpy-discussion] Doctest items In-Reply-To: <486AC982.1010006@gmail.com> References: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> <3d375d730807011145m54a6f8d8u65bf54abeadf58ee@mail.gmail.com> <1d36917a0807011214q42199fe7ydbf85c1638b4ba20@mail.gmail.com> <3d375d730807011220w10bd8038x84088f5d93364eca@mail.gmail.com> <1d36917a0807011230o6f7df4d3j86f378dcb3a3d2f@mail.gmail.com> <3d375d730807011613y28a5a1cbqed0d85fe30535f67@mail.gmail.com> <486AC982.1010006@gmail.com> Message-ID: <3d375d730807011739j1b8dff15n1577a31d2c7d4d83@mail.gmail.com> On Tue, Jul 1, 2008 at 19:19, Ryan May wrote: > Robert Kern wrote: >> On Tue, Jul 1, 2008 at 17:50, Fernando Perez wrote: >>> On Tue, Jul 1, 2008 at 1:41 PM, Pauli Virtanen wrote: >>> >>>> But it's a custom tweak to doctest, so it might break at some point in >>>> the future, and I don't love the monkeypatching here... >>> Welcome to the joys of extending doctest/unittest. They hardcoded so >>> much stuff in there that the only way to reuse that code is by >>> copy/paste/monkeypatch. It's absolutely atrocious. >>> >>>>> We could always just make the plotting section one of those "it's just >>>>> an example not a doctest" things and remove the ">>>" (since it doesn't >>>>> appear to provide any useful test coverage or anything). >>>> If possible, I'd like other possibilities be considered first before >>>> jumping this route. I think it would be nice to retain the ability to run >>>> also the matplotlib examples as (optional) doctests, to make sure also >>>> they execute correctly. 
Also, using two different markups in the >>>> documentation to work around a shortcoming of doctest is IMHO not very >>>> elegant. >>> How about a much simpler approach? Just pre-populate the globals dict >>> where doctest executes with an object called 'plt' that basically does >>> >>> def noop(*a,**k): pass >>> >>> class dummy(): >>> def __getattr__(self,k): return noop >>> >>> plt = dummy() >>> >>> This would ensure that all calls to plt.anything() silently succeed in >>> the doctests. Granted, we're not testing matplotlib, but it has the >>> benefit of simplicity and of letting us keep consistent formatting, >>> and examples that *users* can still paste into their sessions where >>> plt refers to the real matplotlib. >> >> It's actually easier for users to paste the non-doctestable examples >> since they don't have the >>> markers and any stdout the examples >> produce as a byproduct. >> > > I'm with Robert here. It's definitely easier as an example without the > >>>>. I also don't see the utility of being able to have the > matplotlib code as tests of anything. We're not testing matplotlib here > and any behavior that matplotlib relies on (and hence tests) should be > captured in a test for that behavior separate from matplotlib code. To be clear, these aren't tests of the numpy code. The tests would be to make sure the examples still run. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From rmay31 at gmail.com Tue Jul 1 20:52:27 2008 From: rmay31 at gmail.com (Ryan May) Date: Tue, 01 Jul 2008 20:52:27 -0400 Subject: [Numpy-discussion] Doctest items In-Reply-To: <3d375d730807011739j1b8dff15n1577a31d2c7d4d83@mail.gmail.com> References: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> <3d375d730807011145m54a6f8d8u65bf54abeadf58ee@mail.gmail.com> <1d36917a0807011214q42199fe7ydbf85c1638b4ba20@mail.gmail.com> <3d375d730807011220w10bd8038x84088f5d93364eca@mail.gmail.com> <1d36917a0807011230o6f7df4d3j86f378dcb3a3d2f@mail.gmail.com> <3d375d730807011613y28a5a1cbqed0d85fe30535f67@mail.gmail.com> <486AC982.1010006@gmail.com> <3d375d730807011739j1b8dff15n1577a31d2c7d4d83@mail.gmail.com> Message-ID: <486AD14B.1040502@gmail.com> Robert Kern wrote: > On Tue, Jul 1, 2008 at 19:19, Ryan May wrote: >> Robert Kern wrote: >>> On Tue, Jul 1, 2008 at 17:50, Fernando Perez wrote: >>>> On Tue, Jul 1, 2008 at 1:41 PM, Pauli Virtanen wrote: >>>> >>>>> But it's a custom tweak to doctest, so it might break at some point in >>>>> the future, and I don't love the monkeypatching here... >>>> Welcome to the joys of extending doctest/unittest. They hardcoded so >>>> much stuff in there that the only way to reuse that code is by >>>> copy/paste/monkeypatch. It's absolutely atrocious. >>>> >>>>>> We could always just make the plotting section one of those "it's just >>>>>> an example not a doctest" things and remove the ">>>" (since it doesn't >>>>>> appear to provide any useful test coverage or anything). >>>>> If possible, I'd like other possibilities be considered first before >>>>> jumping this route. I think it would be nice to retain the ability to run >>>>> also the matplotlib examples as (optional) doctests, to make sure also >>>>> they execute correctly. Also, using two different markups in the >>>>> documentation to work around a shortcoming of doctest is IMHO not very >>>>> elegant. >>>> How about a much simpler approach? 
Just pre-populate the globals dict >>>> where doctest executes with an object called 'plt' that basically does >>>> >>>> def noop(*a,**k): pass >>>> >>>> class dummy(): >>>> def __getattr__(self,k): return noop >>>> >>>> plt = dummy() >>>> >>>> This would ensure that all calls to plt.anything() silently succeed in >>>> the doctests. Granted, we're not testing matplotlib, but it has the >>>> benefit of simplicity and of letting us keep consistent formatting, >>>> and examples that *users* can still paste into their sessions where >>>> plt refers to the real matplotlib. >>> It's actually easier for users to paste the non-doctestable examples >>> since they don't have the >>> markers and any stdout the examples >>> produce as a byproduct. >>> >> I'm with Robert here. It's definitely easier as an example without the >> >>>>. I also don't see the utility of being able to have the >> matplotlib code as tests of anything. We're not testing matplotlib here >> and any behavior that matplotlib relies on (and hence tests) should be >> captured in a test for that behavior separate from matplotlib code. > > To be clear, these aren't tests of the numpy code. The tests would be > to make sure the examples still run. > Right. I just don't think effort should be put into making examples using matplotlib run as doctests. If the behavior is important, numpy should have a standalone test for it. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From zbyszek at in.waw.pl Wed Jul 2 06:21:59 2008 From: zbyszek at in.waw.pl (Zbyszek Szmek) Date: Wed, 2 Jul 2008 12:21:59 +0200 Subject: [Numpy-discussion] Should we fix Ticket #709? In-Reply-To: References: Message-ID: <20080702102159.GB21783@szyszka.in.waw.pl> On Sun, Jun 29, 2008 at 09:57:52AM -0600, Charles R Harris wrote: Hi, > That's Ticket #709 : > > > I'm faily sure that: > > numpy.isnan(datetime.datetime.now() > > ...should just return False and not raise an exception. IMHO numpy.isnan() makes no sense for non-numerical types. What should numpy.isnan({}) return? It is neither a valid number, nor an invalid one. So TypeError, as returned currently, seems best to me. Zbyszek From zbyszek at in.waw.pl Wed Jul 2 06:29:44 2008 From: zbyszek at in.waw.pl (Zbyszek Szmek) Date: Wed, 2 Jul 2008 12:29:44 +0200 Subject: [Numpy-discussion] Time to fix ticket #390? In-Reply-To: References: Message-ID: <20080702102944.GC21783@szyszka.in.waw.pl> On Sat, Jun 28, 2008 at 04:31:22PM -0600, Charles R Harris wrote: > Questions about ticket #390: Unfortunately, Trac has a problem, it's impossible to view the ticket: SubversionException: ("Can't open file '/home/scipy/svn/numpy/db/revprops/5331': Permission denied", 13) It seems like some to do with permissions? Thanks, Zbyszek From falted at pytables.org Wed Jul 2 09:12:23 2008 From: falted at pytables.org (Francesc Alted) Date: Wed, 2 Jul 2008 15:12:23 +0200 Subject: [Numpy-discussion] Change in the representation of complex numbers in NumPy 1.1 Message-ID: <200807021512.23907.falted@pytables.org> Hi, I've seen that NumPy has changed the representation of complex numbers starting with NumPy 1.1. 
Before, it was: >>> numpy.__version__ '1.0.3' >>> repr(numpy.complex(0)) # The Python type '0j' >>> repr(numpy.complex128(0)) # The NumPy type '0j' Now, it is: >>> numpy.__version__ '1.2.0.dev5313' >>> repr(numpy.complex(0)) '0j' >>> repr(numpy.complex128(0)) '(0.0+0.0j)' Not that I don't like the new way, but that broke a couple of tests of the PyTables suite, and before fixing it, I'd like to know if the new way would stay. Also, I'm not certain why you have chosen a different representation than the Python type. Thanks, -- Francesc Alted From alan.mcintyre at gmail.com Wed Jul 2 09:18:45 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 2 Jul 2008 09:18:45 -0400 Subject: [Numpy-discussion] set_local_path in test files Message-ID: <1d36917a0807020618n4e6a8d23g89dccfdb153714f5@mail.gmail.com> Some test files have a set_local_path()/restore_path() pair at the top, and some don't. Is there any reason to be changing sys.path like this in the test modules? If not, I'll take them out when I see them. Thanks, Alan From pearu at cens.ioc.ee Wed Jul 2 09:35:12 2008 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Wed, 2 Jul 2008 16:35:12 +0300 (EEST) Subject: [Numpy-discussion] set_local_path in test files In-Reply-To: <1d36917a0807020618n4e6a8d23g89dccfdb153714f5@mail.gmail.com> References: <1d36917a0807020618n4e6a8d23g89dccfdb153714f5@mail.gmail.com> Message-ID: <38330.172.17.0.4.1215005712.squirrel@cens.ioc.ee> Alan McIntyre wrote: > Some test files have a set_local_path()/restore_path() pair at the > top, and some don't. Is there any reason to be changing sys.path like > this in the test modules? If not, I'll take them out when I see them. The idea behind set_local_path is that it allows running tests inside subpackages without the need to rebuild the entire package. set_local_path()/restore_path() are convenient when debugging or developing a subpackage. If you are sure that there are no bugs in numpy subpackges that need such debugging process, then the set_local_path() restore_path() calls can be removed. (But please do not remove them from scipy tests files, rebuilding scipy just takes too much time and debugging subpackages globally would be too painful). Pearu From alan.mcintyre at gmail.com Wed Jul 2 10:01:07 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 2 Jul 2008 10:01:07 -0400 Subject: [Numpy-discussion] set_local_path in test files In-Reply-To: <38330.172.17.0.4.1215005712.squirrel@cens.ioc.ee> References: <1d36917a0807020618n4e6a8d23g89dccfdb153714f5@mail.gmail.com> <38330.172.17.0.4.1215005712.squirrel@cens.ioc.ee> Message-ID: <1d36917a0807020701k7ac84a5as884689dece4240ce@mail.gmail.com> On Wed, Jul 2, 2008 at 9:35 AM, Pearu Peterson wrote: > Alan McIntyre wrote: >> Some test files have a set_local_path()/restore_path() pair at the >> top, and some don't. Is there any reason to be changing sys.path like >> this in the test modules? If not, I'll take them out when I see them. > > The idea behind set_local_path is that it allows running tests > inside subpackages without the need to rebuild the entire package. Ah, thanks; I'd forgotten about that. I'll leave them alone, then. I made a note for myself to make sure it's possible to run tests locally without doing a full build/install (where practical). 
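For anyone else who has forgotten the idiom, the pair sits at the top
of a test file roughly like this (the package name is made up):

    # tests/test_basic.py -- the sys.path boilerplate in question
    from numpy.testing import set_local_path, restore_path, NumpyTestCase

    set_local_path()    # temporarily put the local, uninstalled tree
                        # first on sys.path...
    import mypackage    # ...so this import wins over any installed copy
    restore_path()      # then undo the sys.path change

    class TestBasic(NumpyTestCase):
        def check_basic(self):
            pass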
From travis at enthought.com Wed Jul 2 10:34:48 2008 From: travis at enthought.com (Travis Vaught) Date: Wed, 2 Jul 2008 09:34:48 -0500 Subject: [Numpy-discussion] Enthought Python Distribution Message-ID: <05B4EAF2-B182-4E33-A622-B3F65D98429A@enthought.com> Greetings, We're pleased to announce the beta release of the Enthought Python Distribution for *Mac OS X*. http://www.enthought.com/products/epd.php This release should safely install alongside other existing Python installations on your Mac. With the Mac OS X platform support, EPD now provides a consistent scientific application tool set across three major platforms (Windows, RedHat Linux (32 and 64 bit) and OS X). This is a _beta_ release, so install at your own risk. Please provide any feedback to info at enthought.com. See the included EPD Readme.txt for instructions and known issues. About EPD --------- The Enthought Python Distribution (EPD) is a "kitchen-sink-included" distribution of the Python? Programming Language, including over 60 additional tools and libraries. The EPD bundle includes the following major packages: Python Core Python NumPy Multidimensional arrays and fast numerics for Python SciPy Scientific Library for Python Enthought Tool Suite (ETS) A suite of tools including: Traits Manifest typing, validation, visualization, delegation, etc. Mayavi 3D interactive data exploration environment. Chaco Advanced 2D plotting toolkit for interactive 2D visualization. Kiva 2D drawing library in the spirit of DisplayPDF. Enable Object-based canvas for interacting with 2D components and widgets. Matplotlib 2D plotting library wxPython Cross-platform windowing and widget library. Visualization Toolkit (VTK) 3D visualization framework There are many more included packages as well. There's a complete list here: http://www.enthought.com/products/epdlibraries.php License ------- EPD is a bundle of software--every piece of which is available for free under various open-source licenses. The bundle itself is offered as a free download to academic and individual hobbyist use. Commercial and non-degree granting institutions and agencies may purchase individual subscriptions for the bundle (http://www.enthought.com/products/order.php?ver=MacOSX ) or contact Enthought to discuss an Enterprise license (http://www.enthought.com/products/enterprise.php ). Please see the FAQ for further explanation about how the software came together. (http://www.enthought.com/products/epdfaq.php) Thanks, Travis From matthew.brett at gmail.com Wed Jul 2 10:49:30 2008 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 2 Jul 2008 15:49:30 +0100 Subject: [Numpy-discussion] set_local_path in test files In-Reply-To: <1d36917a0807020701k7ac84a5as884689dece4240ce@mail.gmail.com> References: <1d36917a0807020618n4e6a8d23g89dccfdb153714f5@mail.gmail.com> <38330.172.17.0.4.1215005712.squirrel@cens.ioc.ee> <1d36917a0807020701k7ac84a5as884689dece4240ce@mail.gmail.com> Message-ID: <1e2af89e0807020749v4a87c2h80eac4af211582fc@mail.gmail.com> Hi, >> The idea behind set_local_path is that it allows running tests >> inside subpackages without the need to rebuild the entire package. > > Ah, thanks; I'd forgotten about that. I'll leave them alone, then. I > made a note for myself to make sure it's possible to run tests locally > without doing a full build/install (where practical). I noticed that people were often putting this at the top of scipy test files as boilerplate when they didn't in fact need the local directory on the path. 
I did it myself when I first started using the Numpy test rig. So, to avoid that happening, it would be good to remove all the cases where it is not necessary... Best, Matthew From charlesr.harris at gmail.com Wed Jul 2 11:58:38 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 2 Jul 2008 09:58:38 -0600 Subject: [Numpy-discussion] Change in the representation of complex numbers in NumPy 1.1 In-Reply-To: <200807021512.23907.falted@pytables.org> References: <200807021512.23907.falted@pytables.org> Message-ID: On Wed, Jul 2, 2008 at 7:12 AM, Francesc Alted wrote: > Hi, > > I've seen that NumPy has changed the representation of complex numbers > starting with NumPy 1.1. Before, it was: > > >>> numpy.__version__ > '1.0.3' > >>> repr(numpy.complex(0)) # The Python type > '0j' > >>> repr(numpy.complex128(0)) # The NumPy type > '0j' > > Now, it is: > > >>> numpy.__version__ > '1.2.0.dev5313' > >>> repr(numpy.complex(0)) > '0j' > >>> repr(numpy.complex128(0)) > '(0.0+0.0j)' > > Not that I don't like the new way, but that broke a couple of tests of > the PyTables suite, and before fixing it, I'd like to know if the new > way would stay. Also, I'm not certain why you have chosen a different > representation than the Python type. > Looks like different functions are being called, as identical code is available for all the complex types. Hmm... probably float is promoted to double and for double the python repr is called. Since python can't handle longdoubles the following code is called. static PyObject * c at name@type_ at kind@(PyObject *self) { static char buf1[100]; static char buf2[100]; static char buf3[202]; c at name@ x; x = ((PyC at Name@ScalarObject *)self)->obval; format_ at name@(buf1, sizeof(buf1), x.real, @NAME at PREC_@KIND@); format_ at name@(buf2, sizeof(buf2), x.imag, @NAME at PREC_@KIND@); snprintf(buf3, sizeof(buf3), "(%s+%sj)", buf1, buf2); return PyString_FromString(buf3); } So this can be fixed two ways, changing the cfloat and cdouble types to call the above, or fixing the above to look like python. Whichever way is chosen, I would rather they go through the same generated functions as it keeps the code paths simpler, puts the format choice in a single location, and separates numpy from whatever might happen in python. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Jul 2 12:16:55 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 2 Jul 2008 10:16:55 -0600 Subject: [Numpy-discussion] Change in the representation of complex numbers in NumPy 1.1 In-Reply-To: References: <200807021512.23907.falted@pytables.org> Message-ID: On Wed, Jul 2, 2008 at 9:58 AM, Charles R Harris wrote: > > > On Wed, Jul 2, 2008 at 7:12 AM, Francesc Alted > wrote: > >> Hi, >> >> I've seen that NumPy has changed the representation of complex numbers >> starting with NumPy 1.1. Before, it was: >> >> >>> numpy.__version__ >> '1.0.3' >> >>> repr(numpy.complex(0)) # The Python type >> '0j' >> >>> repr(numpy.complex128(0)) # The NumPy type >> '0j' >> >> Now, it is: >> >> >>> numpy.__version__ >> '1.2.0.dev5313' >> >>> repr(numpy.complex(0)) >> '0j' >> >>> repr(numpy.complex128(0)) >> '(0.0+0.0j)' >> >> Not that I don't like the new way, but that broke a couple of tests of >> the PyTables suite, and before fixing it, I'd like to know if the new >> way would stay. Also, I'm not certain why you have chosen a different >> representation than the Python type. 
>> > > Looks like different functions are being called, as identical code is > available for all the complex types. Hmm... probably float is promoted to > double and for double the python repr is called. Since python can't handle > longdoubles the following code is called. > > static PyObject * > c at name@type_ at kind@(PyObject *self) > { > static char buf1[100]; > static char buf2[100]; > static char buf3[202]; > c at name@ x; > x = ((PyC at Name@ScalarObject *)self)->obval; > format_ at name@(buf1, sizeof(buf1), x.real, @NAME at PREC_@KIND@); > format_ at name@(buf2, sizeof(buf2), x.imag, @NAME at PREC_@KIND@); > > snprintf(buf3, sizeof(buf3), "(%s+%sj)", buf1, buf2); > return PyString_FromString(buf3); > } > > So this can be fixed two ways, changing the cfloat and cdouble types to > call the above, or fixing the above to look like python. Whichever way is > chosen, I would rather they go through the same generated functions as it > keeps the code paths simpler, puts the format choice in a single location, > and separates numpy from whatever might happen in python. > And I suspect this might be fallout from changeset #5014: Fix missing format code so longdoubles print with proper precision. The clongdouble repr function used to be missing and probably defaulted to cdouble. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwagner at iam.uni-stuttgart.de Wed Jul 2 12:56:41 2008 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Wed, 02 Jul 2008 18:56:41 +0200 Subject: [Numpy-discussion] New numpy.test() failures Message-ID: Hi all, If I run numpy.test() >>> numpy.__version__ '1.2.0.dev5331' I obtain ====================================================================== FAIL: Tests count ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib64/python2.5/site-packages/numpy/ma/tests/test_core.py", line 566, in test_count_func assert_equal(3, count(ott)) File "/usr/local/lib64/python2.5/site-packages/numpy/ma/testutils.py", line 97, in assert_equal assert desired == actual, msg AssertionError: Items are not equal: ACTUAL: 3 DESIRED: 4 ====================================================================== FAIL: Tests reshape ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib64/python2.5/site-packages/numpy/ma/tests/test_core.py", line 1461, in test_reshape assert_equal(y._mask.shape, (2,2,)) File "/usr/local/lib64/python2.5/site-packages/numpy/ma/testutils.py", line 94, in assert_equal return _assert_equal_on_sequences(actual, desired, err_msg='') File "/usr/local/lib64/python2.5/site-packages/numpy/ma/testutils.py", line 66, in _assert_equal_on_sequences assert_equal(len(actual),len(desired),err_msg) File "/usr/local/lib64/python2.5/site-packages/numpy/ma/testutils.py", line 97, in assert_equal assert desired == actual, msg AssertionError: Items are not equal: ACTUAL: 0 DESIRED: 2 ====================================================================== FAIL: Tests dot product ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib64/python2.5/site-packages/numpy/ma/tests/test_extras.py", line 223, in test_dot assert_equal(c.mask, [[1,1],[1,0]]) File "/usr/local/lib64/python2.5/site-packages/numpy/ma/testutils.py", line 111, in assert_equal return assert_array_equal(actual, desired, err_msg) File 
"/usr/local/lib64/python2.5/site-packages/numpy/ma/testutils.py", line 177, in assert_array_equal header='Arrays are not equal') File "/usr/local/lib64/python2.5/site-packages/numpy/ma/testutils.py", line 171, in assert_array_compare verbose=verbose, header=header) File "/usr/local/lib64/python2.5/site-packages/numpy/testing/utils.py", line 240, in assert_array_compare assert cond, msg AssertionError: Arrays are not equal (mismatch 75.0%) x: array([[False, False], [False, False]], dtype=bool) y: array([[1, 1], [1, 0]]) ====================================================================== FAIL: Test of average. ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib64/python2.5/site-packages/numpy/ma/tests/test_extras.py", line 35, in test_testAverage1 assert_equal(average(ott,axis=0), [2.0, 0.0]) File "/usr/local/lib64/python2.5/site-packages/numpy/ma/testutils.py", line 111, in assert_equal return assert_array_equal(actual, desired, err_msg) File "/usr/local/lib64/python2.5/site-packages/numpy/ma/testutils.py", line 177, in assert_array_equal header='Arrays are not equal') File "/usr/local/lib64/python2.5/site-packages/numpy/ma/testutils.py", line 171, in assert_array_compare verbose=verbose, header=header) File "/usr/local/lib64/python2.5/site-packages/numpy/testing/utils.py", line 240, in assert_array_compare assert cond, msg AssertionError: Arrays are not equal (mismatch 50.0%) x: array([ 1., 1.]) y: array([ 2., 1.]) ====================================================================== FAIL: Test of average. ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib64/python2.5/site-packages/numpy/ma/tests/test_old_ma.py", line 526, in test_testAverage1 self.failUnless(eq(average(ott,axis=0), [2.0, 0.0])) AssertionError ====================================================================== FAIL: Test count ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib64/python2.5/site-packages/numpy/ma/tests/test_old_ma.py", line 159, in test_xtestCount self.failUnless (eq(3, count(ott))) AssertionError ---------------------------------------------------------------------- Ran 1660 tests in 12.483s FAILED (failures=6) Nils From charlesr.harris at gmail.com Wed Jul 2 13:07:04 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 2 Jul 2008 11:07:04 -0600 Subject: [Numpy-discussion] New numpy.test() failures In-Reply-To: References: Message-ID: On Wed, Jul 2, 2008 at 10:56 AM, Nils Wagner wrote: > Hi all, > > If I run numpy.test() > > >>> numpy.__version__ > '1.2.0.dev5331' > > I obtain > This shows up on all the 64-bit buildbots also. But the 32 bit Mac still works. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Wed Jul 2 13:25:57 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 2 Jul 2008 12:25:57 -0500 Subject: [Numpy-discussion] set_local_path in test files In-Reply-To: <1d36917a0807020701k7ac84a5as884689dece4240ce@mail.gmail.com> References: <1d36917a0807020618n4e6a8d23g89dccfdb153714f5@mail.gmail.com> <38330.172.17.0.4.1215005712.squirrel@cens.ioc.ee> <1d36917a0807020701k7ac84a5as884689dece4240ce@mail.gmail.com> Message-ID: <3d375d730807021025u602736ffpc958821f1a13cd85@mail.gmail.com> On Wed, Jul 2, 2008 at 09:01, Alan McIntyre wrote: > On Wed, Jul 2, 2008 at 9:35 AM, Pearu Peterson wrote: >> Alan McIntyre wrote: >>> Some test files have a set_local_path()/restore_path() pair at the >>> top, and some don't. Is there any reason to be changing sys.path like >>> this in the test modules? If not, I'll take them out when I see them. >> >> The idea behind set_local_path is that it allows running tests >> inside subpackages without the need to rebuild the entire package. > > Ah, thanks; I'd forgotten about that. I'll leave them alone, then. I > made a note for myself to make sure it's possible to run tests locally > without doing a full build/install (where practical). Please remove them and adjust the imports. As I've mentioned before, numpy and scipy can now reliably be built in-place with "python setup.py build_src --inplace build_ext --inplace". This is a more robust method to test uninstalled code than adjusting sys.path. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Wed Jul 2 13:31:58 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 2 Jul 2008 12:31:58 -0500 Subject: [Numpy-discussion] Time to fix ticket #390? In-Reply-To: <20080702102944.GC21783@szyszka.in.waw.pl> References: <20080702102944.GC21783@szyszka.in.waw.pl> Message-ID: <3d375d730807021031v4a61fce2kf8f0c2b9a0559d86@mail.gmail.com> On Wed, Jul 2, 2008 at 05:29, Zbyszek Szmek wrote: > On Sat, Jun 28, 2008 at 04:31:22PM -0600, Charles R Harris wrote: >> Questions about ticket #390: > > Unfortunately, Trac has a problem, it's impossible to view the ticket: > > SubversionException: ("Can't open file '/home/scipy/svn/numpy/db/revprops/5331': > Permission denied", 13) > > It seems like some to do with permissions? It is working for me right now. Can you try again? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From nwagner at iam.uni-stuttgart.de Wed Jul 2 13:41:56 2008 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Wed, 02 Jul 2008 19:41:56 +0200 Subject: [Numpy-discussion] New numpy.test() failures In-Reply-To: References: Message-ID: > > This shows up on all the 64-bit buildbots also. But the >32 bit Mac still > works. > > Chuck There are also new test failures in scipy ====================================================================== FAIL: Tests the confidence intervals of the trimmed mean. 
---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib64/python2.5/site-packages/scipy/stats/tests/test_mmorestats.py", line 35, in test_trimmedmeanci assert_almost_equal(ms.trimmed_mean(data,0.2), 596.2, 1) File "/usr/local/lib64/python2.5/site-packages/numpy/testing/utils.py", line 158, in assert_almost_equal assert round(abs(desired - actual),decimal) == 0, msg AssertionError: Items are not equal: ACTUAL: 600.26666666666665 DESIRED: 596.20000000000005 ====================================================================== FAIL: Tests trimming ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib64/python2.5/site-packages/scipy/stats/tests/test_mstats.py", line 213, in test_trim [None,1,2,3,4,5,6,7,None,None]) File "/usr/local/lib64/python2.5/site-packages/numpy/ma/testutils.py", line 111, in assert_equal return assert_array_equal(actual, desired, err_msg) File "/usr/local/lib64/python2.5/site-packages/numpy/ma/testutils.py", line 177, in assert_array_equal header='Arrays are not equal') File "/usr/local/lib64/python2.5/site-packages/numpy/ma/testutils.py", line 171, in assert_array_compare verbose=verbose, header=header) File "/usr/local/lib64/python2.5/site-packages/numpy/testing/utils.py", line 240, in assert_array_compare assert cond, msg AssertionError: Arrays are not equal (mismatch 30.0%) x: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) y: array([None, 1, 2, 3, 4, 5, 6, 7, None, None], dtype=object) ====================================================================== FAIL: Tests trimming. ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib64/python2.5/site-packages/scipy/stats/tests/test_mstats.py", line 240, in test_trim_old assert_equal(mstats.trimboth(x).count(), 60) File "/usr/local/lib64/python2.5/site-packages/numpy/ma/testutils.py", line 97, in assert_equal assert desired == actual, msg AssertionError: Items are not equal: ACTUAL: 100 DESIRED: 60 ====================================================================== FAIL: Tests the trimmed mean. ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib64/python2.5/site-packages/scipy/stats/tests/test_mstats.py", line 255, in test_trimmedmean assert_almost_equal(mstats.trimmed_mean(data,0.1), 343, 0) File "/usr/local/lib64/python2.5/site-packages/numpy/ma/testutils.py", line 145, in assert_almost_equal assert round(abs(desired - actual),decimal) == 0, msg AssertionError: Items are not equal: ACTUAL: 448.10526315789474 DESIRED: 343 Nils From sdb at cloud9.net Wed Jul 2 13:55:19 2008 From: sdb at cloud9.net (Stuart Brorson) Date: Wed, 2 Jul 2008 13:55:19 -0400 (EDT) Subject: [Numpy-discussion] Change of behavior in flatten between 1.0.4 and 1.1 In-Reply-To: References: Message-ID: On Tue, 1 Jul 2008, Pauli Virtanen wrote: > Tue, 01 Jul 2008 17:18:55 -0400, Stuart Brorson wrote: >> I have noticed a change in the behavior of numpy.flatten(True) between >> NumPy 1.0.4 and NumPy 1.1. The change affects 3D arrays. I am >> wondering if this is a bug or a feature. [...] > To me, it appeared that the behavior in 1.0.4 was incorrect, so I filed > the bug (after being bitten by it in real code...) and submitted a patch > that got applied. OK, it's a feature. Thanks for the reply! Cheers, Stuart Brorson Interactive Supercomputing, inc. 
135 Beaver Street | Waltham | MA | 02452 | USA
http://www.interactivesupercomputing.com/

From pearu at cens.ioc.ee Wed Jul 2 14:58:28 2008
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Wed, 2 Jul 2008 21:58:28 +0300 (EEST)
Subject: [Numpy-discussion] set_local_path in test files
In-Reply-To: <3d375d730807021025u602736ffpc958821f1a13cd85@mail.gmail.com>
References: <1d36917a0807020618n4e6a8d23g89dccfdb153714f5@mail.gmail.com>
	<38330.172.17.0.4.1215005712.squirrel@cens.ioc.ee>
	<1d36917a0807020701k7ac84a5as884689dece4240ce@mail.gmail.com>
	<3d375d730807021025u602736ffpc958821f1a13cd85@mail.gmail.com>
Message-ID: <47116.82.131.23.17.1215025108.squirrel@cens.ioc.ee>

On Wed, July 2, 2008 8:25 pm, Robert Kern wrote:
> On Wed, Jul 2, 2008 at 09:01, Alan McIntyre wrote:
>> On Wed, Jul 2, 2008 at 9:35 AM, Pearu Peterson wrote:
>>> Alan McIntyre wrote:
>>>> Some test files have a set_local_path()/restore_path() pair at the
>>>> top, and some don't. Is there any reason to be changing sys.path like
>>>> this in the test modules? If not, I'll take them out when I see them.
>>>
>>> The idea behind set_local_path is that it allows running tests
>>> inside subpackages without the need to rebuild the entire package.
>>
>> Ah, thanks; I'd forgotten about that. I'll leave them alone, then. I
>> made a note for myself to make sure it's possible to run tests locally
>> without doing a full build/install (where practical).
>
> Please remove them and adjust the imports. As I've mentioned before,
> numpy and scipy can now reliably be built in-place with "python
> setup.py build_src --inplace build_ext --inplace". This is a more
> robust method to test uninstalled code than adjusting sys.path.

Note that the point of set_local_path is not to test uninstalled
code but to test only a subpackage. For example,

  cd svn/scipy/scipy/fftpack
  python setup.py build
  python tests/test_basic.py

would run the tests using the extensions from the build directory.
Well, at least it used to do that in the past but it seems that the
feature has been removed from scipy svn:(

Scipy subpackages used to be usable as standalone packages (even not
requiring scipy itself) but this seems to be changed. This is not good
from the refactoring point of view.

Pearu

From millman at berkeley.edu Wed Jul 2 15:04:14 2008
From: millman at berkeley.edu (Jarrod Millman)
Date: Wed, 2 Jul 2008 14:04:14 -0500
Subject: [Numpy-discussion] set_local_path in test files
In-Reply-To: <3d375d730807021025u602736ffpc958821f1a13cd85@mail.gmail.com>
References: <1d36917a0807020618n4e6a8d23g89dccfdb153714f5@mail.gmail.com>
	<38330.172.17.0.4.1215005712.squirrel@cens.ioc.ee>
	<1d36917a0807020701k7ac84a5as884689dece4240ce@mail.gmail.com>
	<3d375d730807021025u602736ffpc958821f1a13cd85@mail.gmail.com>
Message-ID: 

On Wed, Jul 2, 2008 at 12:25 PM, Robert Kern wrote:
> Please remove them and adjust the imports. As I've mentioned before,
> numpy and scipy can now reliably be built in-place with "python
> setup.py build_src --inplace build_ext --inplace". This is a more
> robust method to test uninstalled code than adjusting sys.path.
+1 -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From robert.kern at gmail.com Wed Jul 2 15:07:01 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 2 Jul 2008 14:07:01 -0500 Subject: [Numpy-discussion] set_local_path in test files In-Reply-To: <47116.82.131.23.17.1215025108.squirrel@cens.ioc.ee> References: <1d36917a0807020618n4e6a8d23g89dccfdb153714f5@mail.gmail.com> <38330.172.17.0.4.1215005712.squirrel@cens.ioc.ee> <1d36917a0807020701k7ac84a5as884689dece4240ce@mail.gmail.com> <3d375d730807021025u602736ffpc958821f1a13cd85@mail.gmail.com> <47116.82.131.23.17.1215025108.squirrel@cens.ioc.ee> Message-ID: <3d375d730807021207v5ef65a80r428341e4bd2babc5@mail.gmail.com> On Wed, Jul 2, 2008 at 13:58, Pearu Peterson wrote: > On Wed, July 2, 2008 8:25 pm, Robert Kern wrote: >> On Wed, Jul 2, 2008 at 09:01, Alan McIntyre >> wrote: >>> On Wed, Jul 2, 2008 at 9:35 AM, Pearu Peterson >>> wrote: >>>> Alan McIntyre wrote: >>>>> Some test files have a set_local_path()/restore_path() pair at the >>>>> top, and some don't. Is there any reason to be changing sys.path like >>>>> this in the test modules? If not, I'll take them out when I see them. >>>> >>>> The idea behind set_local_path is that it allows running tests >>>> inside subpackages without the need to rebuild the entire package. >>> >>> Ah, thanks; I'd forgotten about that. I'll leave them alone, then. I >>> made a note for myself to make sure it's possible to run tests locally >>> without doing a full build/install (where practical). >> >> Please remove them and adjust the imports. As I've mentioned before, >> numpy and scipy can now reliably be built in-place with "python >> setup.py build_src --inplace build_ext --inplace". This is a more >> robust method to test uninstalled code than adjusting sys.path. > > Note that the point of set_local_path is not to test uninstalled > code but to test only a subpackage. And nose does that just fine by itself. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From nwagner at iam.uni-stuttgart.de Wed Jul 2 15:13:37 2008 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Wed, 02 Jul 2008 21:13:37 +0200 Subject: [Numpy-discussion] New numpy.test() failures In-Reply-To: References: Message-ID: On Wed, 02 Jul 2008 19:41:56 +0200 "Nils Wagner" wrote: >> >> This shows up on all the 64-bit buildbots also. But the >>32 bit Mac still >> works. >> >> Chuck I can reproduce the test failures on my old 32-bit laptop. Linux linux 2.6.11.4-21.17-default #1 Fri Apr 6 08:42:34 UTC 2007 i686 athlon i386 GNU/Linux Cheers, Nils From pgmdevlist at gmail.com Wed Jul 2 15:26:05 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 2 Jul 2008 15:26:05 -0400 Subject: [Numpy-discussion] New numpy.test() failures In-Reply-To: References: Message-ID: <200807021526.05954.pgmdevlist@gmail.com> On Wednesday 02 July 2008 15:13:37 Nils Wagner wrote: > I can reproduce the test failures on my old 32-bit laptop. As you should. My bad, I messed up on my last commit. I'll fix that later this afternoon. 
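In the meantime, re-checking just the offending module is easy with
nose and needs no path juggling -- point it at the file or at the
dotted name (paths illustrative):

    nosetests numpy/ma/tests/test_core.py

    # or, by dotted name against an installed numpy:
    nosetests numpy.ma.tests.test_core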
From alan.mcintyre at gmail.com Wed Jul 2 15:34:15 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 2 Jul 2008 15:34:15 -0400 Subject: [Numpy-discussion] set_local_path in test files In-Reply-To: <3d375d730807021025u602736ffpc958821f1a13cd85@mail.gmail.com> References: <1d36917a0807020618n4e6a8d23g89dccfdb153714f5@mail.gmail.com> <38330.172.17.0.4.1215005712.squirrel@cens.ioc.ee> <1d36917a0807020701k7ac84a5as884689dece4240ce@mail.gmail.com> <3d375d730807021025u602736ffpc958821f1a13cd85@mail.gmail.com> Message-ID: <1d36917a0807021234x64163bfcx2c0be4b67e8e94cd@mail.gmail.com> On Wed, Jul 2, 2008 at 1:25 PM, Robert Kern wrote: > Please remove them and adjust the imports. As I've mentioned before, > numpy and scipy can now reliably be built in-place with "python > setup.py build_src --inplace build_ext --inplace". This is a more > robust method to test uninstalled code than adjusting sys.path. Ok. Since set_package_path is also just used for adjusting sys.path, should that go away as well? If so, I'll remove set_package_path, set_local_path, and restore_path from numpy/testing/numpytest.py, and of course remove all their uses. From robert.kern at gmail.com Wed Jul 2 15:49:18 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 2 Jul 2008 14:49:18 -0500 Subject: [Numpy-discussion] set_local_path in test files In-Reply-To: <1d36917a0807021234x64163bfcx2c0be4b67e8e94cd@mail.gmail.com> References: <1d36917a0807020618n4e6a8d23g89dccfdb153714f5@mail.gmail.com> <38330.172.17.0.4.1215005712.squirrel@cens.ioc.ee> <1d36917a0807020701k7ac84a5as884689dece4240ce@mail.gmail.com> <3d375d730807021025u602736ffpc958821f1a13cd85@mail.gmail.com> <1d36917a0807021234x64163bfcx2c0be4b67e8e94cd@mail.gmail.com> Message-ID: <3d375d730807021249s348bd776heddbdf11f2fdacbb@mail.gmail.com> On Wed, Jul 2, 2008 at 14:34, Alan McIntyre wrote: > On Wed, Jul 2, 2008 at 1:25 PM, Robert Kern wrote: >> Please remove them and adjust the imports. As I've mentioned before, >> numpy and scipy can now reliably be built in-place with "python >> setup.py build_src --inplace build_ext --inplace". This is a more >> robust method to test uninstalled code than adjusting sys.path. > > Ok. Since set_package_path is also just used for adjusting sys.path, > should that go away as well? If so, I'll remove set_package_path, > set_local_path, and restore_path from numpy/testing/numpytest.py, and > of course remove all their uses. Please deprecate the functions for 1.2 and schedule them for removal in 1.3. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Wed Jul 2 16:03:11 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 2 Jul 2008 14:03:11 -0600 Subject: [Numpy-discussion] New numpy.test() failures In-Reply-To: <200807021526.05954.pgmdevlist@gmail.com> References: <200807021526.05954.pgmdevlist@gmail.com> Message-ID: On Wed, Jul 2, 2008 at 1:26 PM, Pierre GM wrote: > On Wednesday 02 July 2008 15:13:37 Nils Wagner wrote: > > I can reproduce the test failures on my old 32-bit laptop. > > As you should. My bad, I messed up on my last commit. I'll fix that later > this > afternoon. > ___ > Hmmm. So I check the Mac output and it's almost all test failures, yet the test shows up as a success. Something's not quite right. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alan.mcintyre at gmail.com Wed Jul 2 16:07:12 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 2 Jul 2008 16:07:12 -0400 Subject: [Numpy-discussion] New numpy.test() failures In-Reply-To: References: <200807021526.05954.pgmdevlist@gmail.com> Message-ID: <1d36917a0807021307w23e653afpf52c265759687b67@mail.gmail.com> The buildbot test command should be using sys.exit to return the success flag from the test run, but it's not. The FreeBSD's test command is: /usr/local/bin/python2.4 -c 'import numpy,sys;sys.exit(not numpy.test(level=9999,verbosity=9999).wasSuccessful())' while the OSX bot's command is python -c 'import sys; sys.path=["numpy-install/lib/python2.5/site-packages"]+sys.path;import numpy;numpy.test(doctests=True)' On Wed, Jul 2, 2008 at 4:03 PM, Charles R Harris wrote: > > > On Wed, Jul 2, 2008 at 1:26 PM, Pierre GM wrote: >> >> On Wednesday 02 July 2008 15:13:37 Nils Wagner wrote: >> > I can reproduce the test failures on my old 32-bit laptop. >> >> As you should. My bad, I messed up on my last commit. I'll fix that later >> this >> afternoon. >> ___ > > Hmmm. So I check the Mac output and it's almost all test failures, yet the > test shows up as a success. Something's not quite right. > > Chuck > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From stefan at sun.ac.za Wed Jul 2 16:33:19 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 2 Jul 2008 22:33:19 +0200 Subject: [Numpy-discussion] set_local_path in test files In-Reply-To: <3d375d730807021249s348bd776heddbdf11f2fdacbb@mail.gmail.com> References: <1d36917a0807020618n4e6a8d23g89dccfdb153714f5@mail.gmail.com> <38330.172.17.0.4.1215005712.squirrel@cens.ioc.ee> <1d36917a0807020701k7ac84a5as884689dece4240ce@mail.gmail.com> <3d375d730807021025u602736ffpc958821f1a13cd85@mail.gmail.com> <1d36917a0807021234x64163bfcx2c0be4b67e8e94cd@mail.gmail.com> <3d375d730807021249s348bd776heddbdf11f2fdacbb@mail.gmail.com> Message-ID: <9457e7c80807021333g5056b879y25eb84f677856526@mail.gmail.com> 2008/7/2 Robert Kern : > On Wed, Jul 2, 2008 at 14:34, Alan McIntyre wrote: >> On Wed, Jul 2, 2008 at 1:25 PM, Robert Kern wrote: >>> Please remove them and adjust the imports. As I've mentioned before, >>> numpy and scipy can now reliably be built in-place with "python >>> setup.py build_src --inplace build_ext --inplace". This is a more >>> robust method to test uninstalled code than adjusting sys.path. >> >> Ok. Since set_package_path is also just used for adjusting sys.path, >> should that go away as well? If so, I'll remove set_package_path, >> set_local_path, and restore_path from numpy/testing/numpytest.py, and >> of course remove all their uses. > > Please deprecate the functions for 1.2 and schedule them for removal in 1.3. I'm really glad these are disappearing -- they were never working properly for arbitrarily deeply nested modules. Regards St?fan From barrywark at gmail.com Wed Jul 2 16:45:18 2008 From: barrywark at gmail.com (Barry Wark) Date: Wed, 2 Jul 2008 13:45:18 -0700 Subject: [Numpy-discussion] New numpy.test() failures In-Reply-To: <1d36917a0807021307w23e653afpf52c265759687b67@mail.gmail.com> References: <200807021526.05954.pgmdevlist@gmail.com> <1d36917a0807021307w23e653afpf52c265759687b67@mail.gmail.com> Message-ID: Fixed. Sorry. 
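The OSX step now ends with a sys.exit of the result, along the lines
of the FreeBSD command (typed from memory, not pasted from the
buildbot config):

    python -c 'import sys; sys.path=["numpy-install/lib/python2.5/site-packages"]+sys.path; import numpy; sys.exit(not numpy.test(doctests=True).wasSuccessful())'

so a failing suite gives a nonzero exit status and the buildbot marks
the step as failed.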
On Wed, Jul 2, 2008 at 1:07 PM, Alan McIntyre wrote: > The buildbot test command should be using sys.exit to return the > success flag from the test run, but it's not. The FreeBSD's test > command is: > > /usr/local/bin/python2.4 -c 'import numpy,sys;sys.exit(not > numpy.test(level=9999,verbosity=9999).wasSuccessful())' > > while the OSX bot's command is > > python -c 'import sys; > sys.path=["numpy-install/lib/python2.5/site-packages"]+sys.path;import > numpy;numpy.test(doctests=True)' > > On Wed, Jul 2, 2008 at 4:03 PM, Charles R Harris > wrote: >> >> >> On Wed, Jul 2, 2008 at 1:26 PM, Pierre GM wrote: >>> >>> On Wednesday 02 July 2008 15:13:37 Nils Wagner wrote: >>> > I can reproduce the test failures on my old 32-bit laptop. >>> >>> As you should. My bad, I messed up on my last commit. I'll fix that later >>> this >>> afternoon. >>> ___ >> >> Hmmm. So I check the Mac output and it's almost all test failures, yet the >> test shows up as a success. Something's not quite right. >> >> Chuck >> >> >> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From stefan at sun.ac.za Wed Jul 2 16:55:12 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 2 Jul 2008 22:55:12 +0200 Subject: [Numpy-discussion] Should we fix Ticket #709? In-Reply-To: <20080702102159.GB21783@szyszka.in.waw.pl> References: <20080702102159.GB21783@szyszka.in.waw.pl> Message-ID: <9457e7c80807021355g7b021e92pc13bf37679633b1f@mail.gmail.com> 2008/7/2 Zbyszek Szmek : >> That's Ticket #709 : >> >> > I'm faily sure that: >> > numpy.isnan(datetime.datetime.now() >> > ...should just return False and not raise an exception. > IMHO numpy.isnan() makes no sense for non-numerical types. I agree with Chuck's rationale [if someone asks me whether a peanut butter sandwhich is a Koala bear, then I'd say no, without trying to cast the sandwhich to a mammal], and it seems that other ufuncs try to be general this way, e.g. np.add handles arbitrary classes with __add__ methods. Cheers St?fan From charlesr.harris at gmail.com Wed Jul 2 17:22:59 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 2 Jul 2008 15:22:59 -0600 Subject: [Numpy-discussion] New numpy.test() failures In-Reply-To: References: <200807021526.05954.pgmdevlist@gmail.com> <1d36917a0807021307w23e653afpf52c265759687b67@mail.gmail.com> Message-ID: On Wed, Jul 2, 2008 at 2:45 PM, Barry Wark wrote: > Fixed. Sorry. The Mac seems to have a whole different set of errors than the other bots, lots of import errors like ERROR: Failure: ImportError (cannot import name log) I wonder if there is a path issue somewhere? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From barrywark at gmail.com Wed Jul 2 17:31:43 2008 From: barrywark at gmail.com (Barry Wark) Date: Wed, 2 Jul 2008 14:31:43 -0700 Subject: [Numpy-discussion] New numpy.test() failures In-Reply-To: References: <200807021526.05954.pgmdevlist@gmail.com> <1d36917a0807021307w23e653afpf52c265759687b67@mail.gmail.com> Message-ID: very likely a path issue. i've had two hard drive crashed on the buildslave box this week. i'm sure something's fubar'd. i'll take a look. thanks for the heads up. 
On Wed, Jul 2, 2008 at 2:22 PM, Charles R Harris wrote: > > > On Wed, Jul 2, 2008 at 2:45 PM, Barry Wark wrote: >> >> Fixed. Sorry. > > The Mac seems to have a whole different set of errors than the other bots, > lots of import errors like > > ERROR: Failure: ImportError (cannot import name log) > > > I wonder if there is a path issue somewhere? > > Chuck > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From msarahan at gmail.com Wed Jul 2 17:33:04 2008 From: msarahan at gmail.com (Mike Sarahan) Date: Wed, 02 Jul 2008 14:33:04 -0700 Subject: [Numpy-discussion] FFT's & IFFT's on images Message-ID: <1215034384.6901.30.camel@helios> Hi, I'm trying to do phase reconstruction on images which involves switching back and forth between Fourier space and real space. I'm trying to test numpy (& scipy, for that matter) just to see if I can go back and forth. After an FFT/iFFT, the resulting image is garbage. I'm using numpy.fft.fftn, but I've also tried fft2, rfftn, rfft2, and the corresponding inverse FFT's. >From looking at the matrices, it appears to be creating complex components that aren't in the matrix prior to any FFT's. Real fft's seem to add some small component to each value (<1). I'm using Image.fromarray to convert arrays to images, and I'm working with 8-bit grayscale images. Any help is much appreciated! Thanks, Mike Sarahan From jturner at gemini.edu Wed Jul 2 17:43:50 2008 From: jturner at gemini.edu (James Turner) Date: Wed, 02 Jul 2008 17:43:50 -0400 Subject: [Numpy-discussion] Ctypes required? Fails to build. Message-ID: <486BF696.40200@gemini.edu> Hello, I'm trying to build Python 2.5.1 on Solaris 9 with the Sun WorkShop 6 compiler, but it is failing to build the ctypes extension. Can anyone tell me whether NumPy 1.1 (or 1.04) can work without ctypes, please? What about SciPy 0.6? Maybe that's a silly question, but I can't see how to make it compile so I'm a bit stuck! I have been through one previous iteration where I tried building NumPy 1.0.4 after ignoring the ctypes problem and it produces undefined symbol errors for various math functions like "exp" and "sqrt". So perhaps that answers my question ... but I'm not 100% clear about it being due to the ctypes problem so I'd be grateful if someone can confirm that. In fact, ctypes is not listed under NumPy "prerequisites" (which, incidentally, is mistyped) in the FAQ and I believe it wasn't part of Python either until 2.5. Thanks, James. From stefan at sun.ac.za Wed Jul 2 17:47:32 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 2 Jul 2008 23:47:32 +0200 Subject: [Numpy-discussion] Doctest items In-Reply-To: <486AD14B.1040502@gmail.com> References: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> <1d36917a0807011214q42199fe7ydbf85c1638b4ba20@mail.gmail.com> <3d375d730807011220w10bd8038x84088f5d93364eca@mail.gmail.com> <1d36917a0807011230o6f7df4d3j86f378dcb3a3d2f@mail.gmail.com> <3d375d730807011613y28a5a1cbqed0d85fe30535f67@mail.gmail.com> <486AC982.1010006@gmail.com> <3d375d730807011739j1b8dff15n1577a31d2c7d4d83@mail.gmail.com> <486AD14B.1040502@gmail.com> Message-ID: <9457e7c80807021447g5b3d431el15bff39f03a904ed@mail.gmail.com> 2008/7/2 Ryan May : >> To be clear, these aren't tests of the numpy code. The tests would be >> to make sure the examples still run. >> > Right. 
I just don't think effort should be put into making examples > using matplotlib run as doctests. If the behavior is important, numpy > should have a standalone test for it. Still, I'd like to see consistent markup for code snippets. Currently, those are indicated by '>>>'. First and foremost, docstrings (in NumPy, with its test suite) are not tests, but serve as documentation. Running them is simply a way of verifying that we did not make mistakes in the examples. Matplotlib docstrings: Currently, the plotting docstrings have not been written as valid doctests, because I didn't want them riddled with: [] How about a slight modification to Fernando's idea: a dummy function that a) Does nothing if matplotlib is not installed b) Otherwise passes through calls to matplotlib, after setting the backend to /dev/null. Any results (lines, figures, etc.) are suppressed. This limits us to very simple graphical examples, but I'm ok with that. If you really need to get hold of the figure, for example, you could always use `figure(); f = gcf()` instead of `f = figure()`. Pasting code: I recommend using the %cpaste magic in IPython. It has been updated to support line-spanning examples. Alan: The latest versions of the docstrings are on the wiki, and there are a number of examples using matplotlib. These are always listed last in the examples section, so that they can be easily ignored by terminal-only users. We'll definitely merge the wiki docstrings to into the source tree before the conference. I've been at a workshop this past week, but on Monday I'll proceed full-steam again. Regards St?fan From alan.mcintyre at gmail.com Wed Jul 2 17:48:58 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 2 Jul 2008 17:48:58 -0400 Subject: [Numpy-discussion] New numpy.test() failures In-Reply-To: References: <200807021526.05954.pgmdevlist@gmail.com> <1d36917a0807021307w23e653afpf52c265759687b67@mail.gmail.com> Message-ID: <1d36917a0807021448n24b44012p69d2e6a6876e14e5@mail.gmail.com> On Wed, Jul 2, 2008 at 5:22 PM, Charles R Harris wrote: > The Mac seems to have a whole different set of errors than the other bots, > lots of import errors like > > ERROR: Failure: ImportError (cannot import name log) > > I wonder if there is a path issue somewhere? At least one of those looks nose-related, and is probably due to a difference between 0.10 and 0.11. I was mistakenly using nose from svn locally, and I think I might need to check in a change to NoseTester. I won't be able to get to it for a few hours though. I'm not sure what those other import errors are about. From stefan at sun.ac.za Wed Jul 2 17:56:03 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 2 Jul 2008 23:56:03 +0200 Subject: [Numpy-discussion] FFT's & IFFT's on images In-Reply-To: <1215034384.6901.30.camel@helios> References: <1215034384.6901.30.camel@helios> Message-ID: <9457e7c80807021456m2f7f9316j8566df424042649f@mail.gmail.com> Hi Mike 2008/7/2 Mike Sarahan : > I'm trying to do phase reconstruction on images which involves switching > back and forth between Fourier space and real space. I'm trying to test > numpy (& scipy, for that matter) just to see if I can go back and forth. > After an FFT/iFFT, the resulting image is garbage. I'm using > numpy.fft.fftn, but I've also tried fft2, rfftn, rfft2, and the > corresponding inverse FFT's. > > >From looking at the matrices, it appears to be creating complex > components that aren't in the matrix prior to any FFT's. 
Real fft's > seem to add some small component to each value (<1). I'm using > Image.fromarray to convert arrays to images, and I'm working with 8-bit > grayscale images. Those components are very small! In [59]: x = (np.random.random((15,15)) * 255).astype(np.uint8) In [60]: np.fft.fft2(x).imag.sum() Out[60]: -2.5011104298755527e-12 And you can see that the forward-reverse transformed values compare well to the original: In [61]: z = np.fft.ifft2(np.fft.fft2(x)) In [62]: np.abs(x - z).sum() Out[62]: 2.5060395252422397e-11 If you have bigger problems, send us a code snippet and we'll take a look. Regards St?fan From pgmdevlist at gmail.com Wed Jul 2 17:53:12 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 2 Jul 2008 17:53:12 -0400 Subject: [Numpy-discussion] New numpy.test() failures In-Reply-To: <200807021526.05954.pgmdevlist@gmail.com> References: <200807021526.05954.pgmdevlist@gmail.com> Message-ID: <200807021753.12818.pgmdevlist@gmail.com> On Wednesday 02 July 2008 15:26:05 you wrote: > On Wednesday 02 July 2008 15:13:37 Nils Wagner wrote: > > I can reproduce the test failures on my old 32-bit laptop. > > As you should. My bad, I messed up on my last commit. I'll fix that later > this afternoon. OK, so it should be fixed in v5332. Sorry again for the noise. From robert.kern at gmail.com Wed Jul 2 17:59:50 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 2 Jul 2008 16:59:50 -0500 Subject: [Numpy-discussion] FFT's & IFFT's on images In-Reply-To: <1215034384.6901.30.camel@helios> References: <1215034384.6901.30.camel@helios> Message-ID: <3d375d730807021459n59779d6bq8b3e88130a8f8696@mail.gmail.com> On Wed, Jul 2, 2008 at 16:33, Mike Sarahan wrote: > Hi, > > I'm trying to do phase reconstruction on images which involves switching > back and forth between Fourier space and real space. I'm trying to test > numpy (& scipy, for that matter) just to see if I can go back and forth. > After an FFT/iFFT, the resulting image is garbage. I'm using > numpy.fft.fftn, but I've also tried fft2, rfftn, rfft2, and the > corresponding inverse FFT's. > > >From looking at the matrices, it appears to be creating complex > components that aren't in the matrix prior to any FFT's. Are the all of the order 1e-14 or so? These are expected. Double-precision floating point operations each have about 1e-16 relative error. For an FFT on an image with values in [0..255], you will get absolute errors about 1e-16 to 1e-13 reconstructing the original array. > Real fft's > seem to add some small component to each value (<1). Exactly how large are these? If they are the same 1e-14 or so, then everything is working as expected. For example, here is what I get when doing this with fft2 and rfft2 on Images/lena.gif from the PIL's source tree. In [23]: from scipy.misc import imread In [24]: !ls IPython system call: ls courB08.bdf courB08.pbm courB08.pil lena.gif lena.jpg lena.ppm In [25]: lena = imread('lena.gif') In [26]: lena.shape, lena.dtype Out[26]: ((128, 128), dtype('uint8')) In [27]: from numpy import fft In [28]: flena = fft.fft2(lena) In [29]: lena2 = fft.ifft2(flena) In [30]: lena Out[30]: array([[100, 100, 100, ..., 197, 100, 104], [124, 62, 58, ..., 65, 21, 24], [207, 124, 104, ..., 24, 59, 59], ..., [216, 23, 84, ..., 84, 216, 193], [169, 216, 169, ..., 216, 118, 26], [ 59, 59, 59, ..., 193, 87, 128]], dtype=uint8) In [31]: lena2 Out[31]: array([[ 100. -1.02418074e-14j, 100. -4.73237530e-15j, 100. -1.80966353e-14j, ..., 197. -1.54082828e-14j, 100. +1.42941214e-14j, 104. 
-3.92740898e-14j], [ 124. +2.68025506e-14j, 62. -1.36449897e-14j, 58. +6.41674693e-15j, ..., 65. -3.95411363e-14j, 21. +4.25895005e-14j, 24. -1.08962931e-14j], [ 207. -4.84334794e-14j, 124. +4.98905975e-14j, 104. -2.94209102e-14j, ..., 24. -5.80408469e-14j, 59. +1.02973186e-14j, 59. +2.53408902e-14j], ..., [ 216. +5.40427922e-14j, 23. -1.46100124e-13j, 84. -3.71689877e-15j, ..., 84. +8.35485795e-14j, 216. +1.27404625e-13j, 193. +1.54715424e-14j], [ 169. -3.69981823e-14j, 216. +1.79161744e-14j, 169. -3.98847622e-14j, ..., 216. +7.45197822e-14j, 118. -1.14630527e-14j, 26. -7.45791820e-14j], [ 59. -3.23731472e-16j, 59. -5.38406684e-14j, 59. +7.23899628e-15j, ..., 193. -1.76964932e-14j, 87. -1.37792130e-14j, 128. +4.76010179e-14j]]) In [32]: (abs(lena2 - lena) < 1e-12).all() Out[32]: True In [34]: rflena = fft.rfft2(lena) In [35]: rlena2 = fft.irfft2(rflena) In [36]: (abs(rlena2 - lena) < 1e-12).all() Out[36]: True In [37]: rlena2 Out[37]: array([[ 100., 100., 100., ..., 197., 100., 104.], [ 124., 62., 58., ..., 65., 21., 24.], [ 207., 124., 104., ..., 24., 59., 59.], ..., [ 216., 23., 84., ..., 84., 216., 193.], [ 169., 216., 169., ..., 216., 118., 26.], [ 59., 59., 59., ..., 193., 87., 128.]]) Note that when printing arrays, floating point numbers are rounded to 11 decimal places or so for convenience, so the teeny-tiny errors aren't seen in the real parts of the reconstructed images. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From stefan at sun.ac.za Wed Jul 2 18:06:04 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 3 Jul 2008 00:06:04 +0200 Subject: [Numpy-discussion] Ctypes required? Fails to build. In-Reply-To: <486BF696.40200@gemini.edu> References: <486BF696.40200@gemini.edu> Message-ID: <9457e7c80807021506q73e68629l74766d948bb06253@mail.gmail.com> Hi James 2008/7/2 James Turner : > I'm trying to build Python 2.5.1 on Solaris 9 with the Sun > WorkShop 6 compiler, but it is failing to build the ctypes > extension. Can anyone tell me whether NumPy 1.1 (or 1.04) can > work without ctypes, please? What about SciPy 0.6? Maybe > that's a silly question, but I can't see how to make it > compile so I'm a bit stuck! Neither NumPy nor SciPy requires ctypes, as far as I know. The error you see seems to be unrelated (sorry, I don't have a solution at hand!). Regards St?fan From robert.kern at gmail.com Wed Jul 2 18:10:06 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 2 Jul 2008 17:10:06 -0500 Subject: [Numpy-discussion] Ctypes required? Fails to build. In-Reply-To: <486BF696.40200@gemini.edu> References: <486BF696.40200@gemini.edu> Message-ID: <3d375d730807021510j4e951740kf8332ab4ab33e1bb@mail.gmail.com> On Wed, Jul 2, 2008 at 16:43, James Turner wrote: > Hello, > > I'm trying to build Python 2.5.1 on Solaris 9 with the Sun > WorkShop 6 compiler, but it is failing to build the ctypes > extension. Can anyone tell me whether NumPy 1.1 (or 1.04) can > work without ctypes, please? It should work unless if you specifically use the ctypes features. Some tests related to that functionality might fail, but you can ignore them. > What about SciPy 0.6? Yes. > Maybe > that's a silly question, but I can't see how to make it > compile so I'm a bit stuck! 
> > I have been through one previous iteration where I tried > building NumPy 1.0.4 after ignoring the ctypes problem and it > produces undefined symbol errors for various math functions > like "exp" and "sqrt". You may need to link in your system's math library explicitly. Typically, the link flags provided from Python during the build process have this, but your system may be atypical. Try this: python setup.py build_ext -lm build If your system's math library is not named "libm.so" (or .a or whatever extension is appropriate), then replace "m" in "-lm" with the correct name. Your problem may also be that you have the environment variable $LDFLAGS defined. This can override the default link flags when building numpy, so only do so if you know exactly what flags need to go there. Instead, use the -L and -l flags as above if you just need to change those. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From alan.mcintyre at gmail.com Wed Jul 2 18:10:27 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 2 Jul 2008 18:10:27 -0400 Subject: [Numpy-discussion] Doctest items In-Reply-To: <9457e7c80807021447g5b3d431el15bff39f03a904ed@mail.gmail.com> References: <1d36917a0807011056s1e49932fh2219c8ff7de6cae8@mail.gmail.com> <3d375d730807011220w10bd8038x84088f5d93364eca@mail.gmail.com> <1d36917a0807011230o6f7df4d3j86f378dcb3a3d2f@mail.gmail.com> <3d375d730807011613y28a5a1cbqed0d85fe30535f67@mail.gmail.com> <486AC982.1010006@gmail.com> <3d375d730807011739j1b8dff15n1577a31d2c7d4d83@mail.gmail.com> <486AD14B.1040502@gmail.com> <9457e7c80807021447g5b3d431el15bff39f03a904ed@mail.gmail.com> Message-ID: <1d36917a0807021510u16add370s7e94abeda96d6caa@mail.gmail.com> On Wed, Jul 2, 2008 at 5:47 PM, St?fan van der Walt wrote: > How about a slight modification to Fernando's idea: a dummy function that > > a) Does nothing if matplotlib is not installed > b) Otherwise passes through calls to matplotlib, after setting the > backend to /dev/null. Any results (lines, figures, etc.) are > suppressed. That's doable; it turned out to be easy to explicitly specify the execution context of the doctests, so we can add in any number of dummy objects/functions. In my working NumPy, the only globally available items are __builtins__ and np; I was going to update all the doctests that won't run under that context before changing the tester behavior, though. At the moment I don't know anything about configuring matplotlib's backend and all that, but I'm sure it shouldn't be hard to figure out. If somebody was to write a first draft of these dummy objects before I got around to it I wouldn't complain about it, though. ;) (By the way, I'm not assuming that there's a consensus about the matplotlib dummy objects, just saying it turns out to be easy to add if we want to.) From msarahan at gmail.com Wed Jul 2 18:14:48 2008 From: msarahan at gmail.com (Mike Sarahan) Date: Wed, 02 Jul 2008 15:14:48 -0700 Subject: [Numpy-discussion] FFT's & IFFT's on images In-Reply-To: <9457e7c80807021456m2f7f9316j8566df424042649f@mail.gmail.com> References: <1215034384.6901.30.camel@helios> <9457e7c80807021456m2f7f9316j8566df424042649f@mail.gmail.com> Message-ID: <1215036888.6901.35.camel@helios> I agree that the components are very small, and in a numeric sense, I wouldn't worry at all about them, but the image result is simply noise, albeit periodic-looking noise. 
Here's a code snippet: ---------------------------------------- import numpy,Image img=Image.open('LlamaTeeth.jpg') arr=numpy.asarray(img) fftarr=numpy.fft.fftn(arr) ifftarr=numpy.fft.ifftn(fftarr) img2=Image.fromarray(ifftarr) img2.show() ---------------------------------------- Please try it on an image that you have lying around. Thanks for looking at this! -Mike On Wed, 2008-07-02 at 23:56 +0200, St?fan van der Walt wrote: > Hi Mike > > 2008/7/2 Mike Sarahan : > > I'm trying to do phase reconstruction on images which involves switching > > back and forth between Fourier space and real space. I'm trying to test > > numpy (& scipy, for that matter) just to see if I can go back and forth. > > After an FFT/iFFT, the resulting image is garbage. I'm using > > numpy.fft.fftn, but I've also tried fft2, rfftn, rfft2, and the > > corresponding inverse FFT's. > > > > >From looking at the matrices, it appears to be creating complex > > components that aren't in the matrix prior to any FFT's. Real fft's > > seem to add some small component to each value (<1). I'm using > > Image.fromarray to convert arrays to images, and I'm working with 8-bit > > grayscale images. > > Those components are very small! > > In [59]: x = (np.random.random((15,15)) * 255).astype(np.uint8) > > In [60]: np.fft.fft2(x).imag.sum() > Out[60]: -2.5011104298755527e-12 > > And you can see that the forward-reverse transformed values compare > well to the original: > > In [61]: z = np.fft.ifft2(np.fft.fft2(x)) > > In [62]: np.abs(x - z).sum() > Out[62]: 2.5060395252422397e-11 > > If you have bigger problems, send us a code snippet and we'll take a look. > > Regards > St?fan > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From strawman at astraw.com Wed Jul 2 18:33:11 2008 From: strawman at astraw.com (Andrew Straw) Date: Wed, 02 Jul 2008 15:33:11 -0700 Subject: [Numpy-discussion] FFT's & IFFT's on images In-Reply-To: <1215036888.6901.35.camel@helios> References: <1215034384.6901.30.camel@helios> <9457e7c80807021456m2f7f9316j8566df424042649f@mail.gmail.com> <1215036888.6901.35.camel@helios> Message-ID: <486C0227.7080003@astraw.com> Mike Sarahan wrote: > I agree that the components are very small, and in a numeric sense, I > wouldn't worry at all about them, but the image result is simply noise, > albeit periodic-looking noise. Fernando Perez and John Hunter have written a nice FFT image denoising example: http://matplotlib.svn.sourceforge.net/viewvc/matplotlib/trunk/py4science/examples/fft_imdenoise.py?view=markup with documentation, even: http://matplotlib.svn.sourceforge.net/viewvc/matplotlib/trunk/py4science/workbook/fft_imdenoise.tex?view=markup From Nathan_Jensen at raytheon.com Wed Jul 2 18:43:00 2008 From: Nathan_Jensen at raytheon.com (Nathan Jensen) Date: Wed, 02 Jul 2008 22:43:00 +0000 Subject: [Numpy-discussion] slow import of numpy modules Message-ID: <1215038580.27861.14.camel@omad-ac002598.oma.us.ray.com> Hi, I was wondering if there was any way to speed up the global import of numpy modules. For a simple import numpy, it takes ~250 ms. In comparison, importing Numeric is only taking 40 ms. It appears that even if you only import a numpy submodule, it loads all the libraries, resulting in the painful performance hit. Are there plans to speed up the importing of numpy, or at least have it not load libraries that aren't requested? 
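
(For anyone who wants to reproduce this kind of number on their own box, here is the sort of crude measurement I mean -- the helper below is purely illustrative, and it includes interpreter startup, so treat the results as rough:)

import subprocess, sys, time

def import_time(module):
    # launch a fresh interpreter each time so nothing is already
    # cached in sys.modules and we pay the full import cost
    start = time.time()
    subprocess.call([sys.executable, '-c', 'import ' + module])
    return time.time() - start

print(import_time('numpy'))    # ~0.25 s on this machine
print(import_time('Numeric'))  # ~0.04 s on this machine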
Nathan From robert.kern at gmail.com Wed Jul 2 18:49:20 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 2 Jul 2008 17:49:20 -0500 Subject: [Numpy-discussion] FFT's & IFFT's on images In-Reply-To: <1215036888.6901.35.camel@helios> References: <1215034384.6901.30.camel@helios> <9457e7c80807021456m2f7f9316j8566df424042649f@mail.gmail.com> <1215036888.6901.35.camel@helios> Message-ID: <3d375d730807021549o5cf4910ey655a77fdf86941d0@mail.gmail.com> On Wed, Jul 2, 2008 at 17:14, Mike Sarahan wrote: > I agree that the components are very small, and in a numeric sense, I > wouldn't worry at all about them, but the image result is simply noise, > albeit periodic-looking noise. > > Here's a code snippet: > ---------------------------------------- > import numpy,Image > > img=Image.open('LlamaTeeth.jpg') > arr=numpy.asarray(img) > fftarr=numpy.fft.fftn(arr) > ifftarr=numpy.fft.ifftn(fftarr) > img2=Image.fromarray(ifftarr) > > img2.show() > ---------------------------------------- > Please try it on an image that you have lying around. Image.fromarray() does not know much about numpy dtypes other than uint8, bool_, uint32, and float32. Cast (the real component of) the array to a uint8 array and then use Image.fromarray(). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From fperez.net at gmail.com Wed Jul 2 18:53:16 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 2 Jul 2008 15:53:16 -0700 Subject: [Numpy-discussion] FFT's & IFFT's on images In-Reply-To: <486C0227.7080003@astraw.com> References: <1215034384.6901.30.camel@helios> <9457e7c80807021456m2f7f9316j8566df424042649f@mail.gmail.com> <1215036888.6901.35.camel@helios> <486C0227.7080003@astraw.com> Message-ID: On Wed, Jul 2, 2008 at 3:33 PM, Andrew Straw wrote: > Mike Sarahan wrote: >> I agree that the components are very small, and in a numeric sense, I >> wouldn't worry at all about them, but the image result is simply noise, >> albeit periodic-looking noise. > > Fernando Perez and John Hunter have written a nice FFT image denoising > example: > http://matplotlib.svn.sourceforge.net/viewvc/matplotlib/trunk/py4science/examples/fft_imdenoise.py?view=markup > > with documentation, even: > http://matplotlib.svn.sourceforge.net/viewvc/matplotlib/trunk/py4science/workbook/fft_imdenoise.tex?view=markup Ahem. Stefan wrote that :) Cheers, f From robert.kern at gmail.com Wed Jul 2 18:59:51 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 2 Jul 2008 17:59:51 -0500 Subject: [Numpy-discussion] slow import of numpy modules In-Reply-To: <1215038580.27861.14.camel@omad-ac002598.oma.us.ray.com> References: <1215038580.27861.14.camel@omad-ac002598.oma.us.ray.com> Message-ID: <3d375d730807021559l2c505632j9cd47a67ced6ba0@mail.gmail.com> On Wed, Jul 2, 2008 at 17:43, Nathan Jensen wrote: > Hi, > > I was wondering if there was any way to speed up the global import of > numpy modules. For a simple import numpy, it takes ~250 ms. In > comparison, importing Numeric is only taking 40 ms. It appears that > even if you only import a numpy submodule, it loads all the libraries, > resulting in the painful performance hit. Are there plans to speed up > the importing of numpy, I am not sure how much is possible. > or at least have it not load libraries that > aren't requested? At this point in time, it is too late to make such sweeping changes to the API. 
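
That said, if startup latency really matters for a particular application, nothing stops you from deferring the cost yourself until the first call that actually needs numpy. A minimal sketch of the idea -- the wrapper is illustrative, not anything numpy provides:

_numpy = None

def get_numpy():
    # pay the import cost on first use instead of at program startup,
    # so short runs that never touch arrays skip it entirely
    global _numpy
    if _numpy is None:
        import numpy
        _numpy = numpy
    return _numpy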
--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From stefan at sun.ac.za Wed Jul 2 19:00:57 2008
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Thu, 3 Jul 2008 01:00:57 +0200
Subject: [Numpy-discussion] FFT's & IFFT's on images
In-Reply-To: <1215036888.6901.35.camel@helios>
References: <1215034384.6901.30.camel@helios> <9457e7c80807021456m2f7f9316j8566df424042649f@mail.gmail.com> <1215036888.6901.35.camel@helios>
Message-ID: <9457e7c80807021600p67d91da2o8ecc6b65a33af2ea@mail.gmail.com>

Hi Mike

2008/7/3 Mike Sarahan :
> I agree that the components are very small, and in a numeric sense, I wouldn't worry at all about them, but the image result is simply noise, albeit periodic-looking noise.
>
> Here's a code snippet:
> ----------------------------------------
> import numpy,Image
>
> img=Image.open('LlamaTeeth.jpg')
> arr=numpy.asarray(img)
> fftarr=numpy.fft.fftn(arr)
> ifftarr=numpy.fft.ifftn(fftarr)
> img2=Image.fromarray(ifftarr)
>
> img2.show()
> ----------------------------------------

There is a bug in PIL, so you must make sure that your array is of type uint8 before you convert it to an Image, i.e.

ifftarr = ifftarr.astype(numpy.uint8)
img2 = Image.fromarray(ifftarr)

Hope that works.

Cheers
Stéfan

From msarahan at gmail.com Wed Jul 2 19:28:32 2008
From: msarahan at gmail.com (Mike Sarahan)
Date: Wed, 02 Jul 2008 16:28:32 -0700
Subject: [Numpy-discussion] FFT's & IFFT's on images
In-Reply-To: <9457e7c80807021600p67d91da2o8ecc6b65a33af2ea@mail.gmail.com>
References: <1215034384.6901.30.camel@helios> <9457e7c80807021456m2f7f9316j8566df424042649f@mail.gmail.com> <1215036888.6901.35.camel@helios> <9457e7c80807021600p67d91da2o8ecc6b65a33af2ea@mail.gmail.com>
Message-ID: <1215041312.6901.42.camel@helios>

Beautiful! Thanks Stefan! It was the PIL bug.

Thanks for all the replies.

-Mike

On Thu, 2008-07-03 at 01:00 +0200, Stéfan van der Walt wrote:
> Hi Mike
>
> 2008/7/3 Mike Sarahan :
> > I agree that the components are very small, and in a numeric sense, I wouldn't worry at all about them, but the image result is simply noise, albeit periodic-looking noise.
> >
> > Here's a code snippet:
> > ----------------------------------------
> > import numpy,Image
> >
> > img=Image.open('LlamaTeeth.jpg')
> > arr=numpy.asarray(img)
> > fftarr=numpy.fft.fftn(arr)
> > ifftarr=numpy.fft.ifftn(fftarr)
> > img2=Image.fromarray(ifftarr)
> >
> > img2.show()
> > ----------------------------------------
>
> There is a bug in PIL, so you must make sure that your array is of type uint8 before you convert it to an Image, i.e.
>
> ifftarr = ifftarr.astype(numpy.uint8)
> img2 = Image.fromarray(ifftarr)
>
> Hope that works.
>
> Cheers
> Stéfan

From mforbes at physics.ubc.ca Wed Jul 2 20:00:24 2008
From: mforbes at physics.ubc.ca (Michael McNeil Forbes)
Date: Wed, 2 Jul 2008 17:00:24 -0700
Subject: [Numpy-discussion] slow import of numpy modules
In-Reply-To: <3d375d730807021559l2c505632j9cd47a67ced6ba0@mail.gmail.com>
References: <1215038580.27861.14.camel@omad-ac002598.oma.us.ray.com> <3d375d730807021559l2c505632j9cd47a67ced6ba0@mail.gmail.com>
Message-ID:

On 2 Jul 2008, at 3:59 PM, Robert Kern wrote:
> On Wed, Jul 2, 2008 at 17:43, Nathan Jensen wrote:
>> Hi,
>>
>> I was wondering if there was any way to speed up the global import of numpy modules. For a simple import numpy, it takes ~250 ms. In comparison, importing Numeric is only taking 40 ms. It appears that even if you only import a numpy submodule, it loads all the libraries, resulting in the painful performance hit. Are there plans to speed up the importing of numpy,
>
> I am not sure how much is possible.
>
>> or at least have it not load libraries that aren't requested?
>
> At this point in time, it is too late to make such sweeping changes to the API.

One could use an environment variable such as NUMPY_SUPPRESS_TOP_LEVEL_IMPORTS that, if defined, suppresses the importing of unneeded packages. This would only affect systems that define this variable, thus not breaking the API but providing the flexibility for those that need it. (This or a similar variable could also contain a list of the numpy components to import automatically.)

If you want to try this, just modify numpy/__init__.py with something like the following:

import os
fast_import = 'NUMPY_SUPPRESS_TOP_LEVEL_IMPORTS' in os.environ
del os
if fast_import:
    # import only the minimal set of subpackages here
    pass
else:
    # the current full set of imports goes here
    pass
del fast_import

Michael.

From cournapeau at cslab.kecl.ntt.co.jp Wed Jul 2 21:23:59 2008
From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau)
Date: Thu, 03 Jul 2008 10:23:59 +0900
Subject: [Numpy-discussion] slow import of numpy modules
In-Reply-To:
References: <1215038580.27861.14.camel@omad-ac002598.oma.us.ray.com> <3d375d730807021559l2c505632j9cd47a67ced6ba0@mail.gmail.com>
Message-ID: <1215048239.15230.2.camel@bbc8>

On Wed, 2008-07-02 at 17:00 -0700, Michael McNeil Forbes wrote:
> One could use an environment variable such as NUMPY_SUPPRESS_TOP_LEVEL_IMPORTS that, if defined, suppresses the importing of unneeded packages. This would only affect systems that define this variable, thus not breaking the API but providing the flexibility for those that need it. (This or a similar variable could also contain a list of the numpy components to import automatically.)

This does not sound like a good idea to me. It would mean that you effectively have two code paths depending on the environment variable, with more problems to support (people would use this option, but a lot of other software would break without their knowing why; typically, scipy would not work anymore).

I think that import numpy.core being slower than import numpy is a bug which can be solved without breaking anything, though.
cheers,

David

From robert.kern at gmail.com Wed Jul 2 22:21:47 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 2 Jul 2008 21:21:47 -0500
Subject: [Numpy-discussion] slow import of numpy modules
In-Reply-To: <1215048239.15230.2.camel@bbc8>
References: <1215038580.27861.14.camel@omad-ac002598.oma.us.ray.com> <3d375d730807021559l2c505632j9cd47a67ced6ba0@mail.gmail.com> <1215048239.15230.2.camel@bbc8>
Message-ID: <3d375d730807021921g283c8750x70d0962d6749a2e1@mail.gmail.com>

On Wed, Jul 2, 2008 at 20:23, David Cournapeau wrote:
> I think that import numpy.core being slower than import numpy is a bug which can be solved without breaking anything, though.

It does not appear to be slower to me.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From cournapeau at cslab.kecl.ntt.co.jp Wed Jul 2 22:38:09 2008
From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau)
Date: Thu, 03 Jul 2008 11:38:09 +0900
Subject: [Numpy-discussion] slow import of numpy modules
In-Reply-To: <3d375d730807021921g283c8750x70d0962d6749a2e1@mail.gmail.com>
References: <1215038580.27861.14.camel@omad-ac002598.oma.us.ray.com> <3d375d730807021559l2c505632j9cd47a67ced6ba0@mail.gmail.com> <1215048239.15230.2.camel@bbc8> <3d375d730807021921g283c8750x70d0962d6749a2e1@mail.gmail.com>
Message-ID: <1215052689.15230.11.camel@bbc8>

On Wed, 2008-07-02 at 21:21 -0500, Robert Kern wrote:
> On Wed, Jul 2, 2008 at 20:23, David Cournapeau wrote:
> > I think that import numpy.core being slower than import numpy is a bug which can be solved without breaking anything, though.
>
> It does not appear to be slower to me.

It isn't either on my computer.

While we are talking about import timings, there was a system for lazy import at some point, right (this is when I first tried python and numpy a few years ago, so I may be mixing it up with something else)? Because we could win between 20 and 40 % of the import time by lazily importing a few modules (namely urllib, which I guess is not often used, and already takes around 20-30 ms; inspect and compiler are taking a long time too, but maybe those are always needed, I have not checked carefully). Maybe this would be complicated to implement for numpy, though.

cheers,

David

From robert.kern at gmail.com Wed Jul 2 22:50:32 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 2 Jul 2008 21:50:32 -0500
Subject: [Numpy-discussion] slow import of numpy modules
In-Reply-To: <1215052689.15230.11.camel@bbc8>
References: <1215038580.27861.14.camel@omad-ac002598.oma.us.ray.com> <3d375d730807021559l2c505632j9cd47a67ced6ba0@mail.gmail.com> <1215048239.15230.2.camel@bbc8> <3d375d730807021921g283c8750x70d0962d6749a2e1@mail.gmail.com> <1215052689.15230.11.camel@bbc8>
Message-ID: <3d375d730807021950u7b2dc828y37b7275b2f2f1fae@mail.gmail.com>

On Wed, Jul 2, 2008 at 21:38, David Cournapeau wrote:
> On Wed, 2008-07-02 at 21:21 -0500, Robert Kern wrote:
>> It does not appear to be slower to me.
>
> It isn't either on my computer.

So ... what were you referring to?

> While we are talking about import timings, there was a system for lazy import at some point, right (this is when I first tried python and numpy a few years ago, so I may be mixing it up with something else)?

There is special purpose code, yes. We used to use it to load proxy objects for scipy subpackages such that "import scipy" would have scipy.stats semi-immediately available. We have stopped using it because of fragility, confusing behavior at the interpreter, py2exe problems, and my general abhorrence of things which mess too deeply with imports. It is not a general-purpose solution for lazily loading stdlib modules, I don't think.

> Because we could win between 20 and 40 % of the import time by lazily importing a few modules (namely urllib, which I guess is not often used, and already takes around 20-30 ms; inspect and compiler are taking a long time too, but maybe those are always needed, I have not checked carefully). Maybe this would be complicated to implement for numpy, though.

These imports could easily be pushed down into the handful of functions that need them (with an appropriate comment about why they are down there). There is no need to have complicated machinery involved.

Do you have a breakdown of the import costs?

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From cournapeau at cslab.kecl.ntt.co.jp Thu Jul 3 00:14:49 2008
From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau)
Date: Thu, 03 Jul 2008 13:14:49 +0900
Subject: [Numpy-discussion] slow import of numpy modules
In-Reply-To: <3d375d730807021950u7b2dc828y37b7275b2f2f1fae@mail.gmail.com>
References: <1215038580.27861.14.camel@omad-ac002598.oma.us.ray.com> <3d375d730807021559l2c505632j9cd47a67ced6ba0@mail.gmail.com> <1215048239.15230.2.camel@bbc8> <3d375d730807021921g283c8750x70d0962d6749a2e1@mail.gmail.com> <1215052689.15230.11.camel@bbc8> <3d375d730807021950u7b2dc828y37b7275b2f2f1fae@mail.gmail.com>
Message-ID: <1215058489.15230.24.camel@bbc8>

On Wed, 2008-07-02 at 21:50 -0500, Robert Kern wrote:
>
> So ... what were you referring to?

To a former email from Matthieu in this thread (or Stefan?).

> There is special purpose code, yes. We used to use it to load proxy objects for scipy subpackages such that "import scipy" would have scipy.stats semi-immediately available. We have stopped using it because of fragility, confusing behavior at the interpreter, py2exe problems, and my general abhorrence of things which mess too deeply with imports. It is not a general-purpose solution for lazily loading stdlib modules, I don't think.

I was afraid of something like this.

> These imports could easily be pushed down into the handful of functions that need them (with an appropriate comment about why they are down there). There is no need to have complicated machinery involved.
>
> Do you have a breakdown of the import costs?

I don't have the precise timings/scripts at the moment, but even by using a really crude method:
- urllib2 (in numpy.lib._datasource) by itself takes 30 ms of the 180 ms. That's an easy 20 % win, since it is not often called.
- inspect in numpy.lib.utils: this costs around 25 ms

If I just comment out the above imports, I go from 180 to 120 ms.

Then, something which takes an awful lot of time is finfo to get the floating point limits. This takes like 30-40 ms. I wonder if there are some ways to make it faster. After that, there is no obvious spot I remember, but I can get them tonight when I go back to my lab.

cheers,

David

From robert.kern at gmail.com Thu Jul 3 00:36:31 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 2 Jul 2008 23:36:31 -0500
Subject: [Numpy-discussion] slow import of numpy modules
In-Reply-To: <1215058489.15230.24.camel@bbc8>
References: <1215038580.27861.14.camel@omad-ac002598.oma.us.ray.com> <3d375d730807021559l2c505632j9cd47a67ced6ba0@mail.gmail.com> <1215048239.15230.2.camel@bbc8> <3d375d730807021921g283c8750x70d0962d6749a2e1@mail.gmail.com> <1215052689.15230.11.camel@bbc8> <3d375d730807021950u7b2dc828y37b7275b2f2f1fae@mail.gmail.com> <1215058489.15230.24.camel@bbc8>
Message-ID: <3d375d730807022136w60fc4eabrd8fc5ac0fd3f34bc@mail.gmail.com>

On Wed, Jul 2, 2008 at 23:14, David Cournapeau wrote:
> On Wed, 2008-07-02 at 21:50 -0500, Robert Kern wrote:
>>
>> So ... what were you referring to?
>
> To a former email from Matthieu in this thread (or Stefan?).

Neither one has participated in this thread. At least, no such email has made it to my inbox.

> I was afraid of something like this.

> I don't have the precise timings/scripts at the moment, but even by using a really crude method:
> - urllib2 (in numpy.lib._datasource) by itself takes 30 ms of the 180 ms. That's an easy 20 % win, since it is not often called.
> - inspect in numpy.lib.utils: this costs around 25 ms
>
> If I just comment out the above imports, I go from 180 to 120 ms.

I think it's worth moving these imports into the functions, then.

> Then, something which takes an awful lot of time is finfo to get the floating point limits. This takes like 30-40 ms. I wonder if there are some ways to make it faster. After that, there is no obvious spot I remember, but I can get them tonight when I go back to my lab.

They can all be turned into properties that look up in a cache first. iinfo already does this.
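
For concreteness, the kind of push-down I mean looks like this -- a sketch only, the real function in numpy.lib._datasource does more than this:

def open_url(path):
    # deferred on purpose: urllib2 is expensive to import and is only
    # needed by callers that actually open a URL, so everyone else
    # skips the cost at "import numpy" time
    import urllib2
    return urllib2.urlopen(path)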
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From cournapeau at cslab.kecl.ntt.co.jp Thu Jul 3 00:56:01 2008 From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau) Date: Thu, 03 Jul 2008 13:56:01 +0900 Subject: [Numpy-discussion] slow import of numpy modules In-Reply-To: <3d375d730807022136w60fc4eabrd8fc5ac0fd3f34bc@mail.gmail.com> References: <1215038580.27861.14.camel@omad-ac002598.oma.us.ray.com> <3d375d730807021559l2c505632j9cd47a67ced6ba0@mail.gmail.com> <1215048239.15230.2.camel@bbc8> <3d375d730807021921g283c8750x70d0962d6749a2e1@mail.gmail.com> <1215052689.15230.11.camel@bbc8> <3d375d730807021950u7b2dc828y37b7275b2f2f1fae@mail.gmail.com> <1215058489.15230.24.camel@bbc8> <3d375d730807022136w60fc4eabrd8fc5ac0fd3f34bc@mail.gmail.com> Message-ID: <1215060961.15230.37.camel@bbc8> On Wed, 2008-07-02 at 23:36 -0500, Robert Kern wrote: > > Neither one has participated in this thread. At least, no such email > has made it to my inbox. This was in the thread "import numpy" is slow, I mixed the two, sorry. > > I think it's worth moving these imports into the functions, then. Ok, will do it, then. > > > Then, something which takes a awful lot of time is finfo to get floating > > points limits. This takes like 30-40 ms. I wonder if there are some ways > > to make it faster. After that, there is no obvious spot I remember, but > > I can get them tonight when I go back to my lab. > > They can all be turned into properties that look up in a cache first. > iinfo already does this. Yes, it is cached, but the first run is slow and seems to take a long time. As it is used as a default argument in numpy.ma.extras, it is run when you import numpy. Just to check, I set the default argument to None, and now import numpy is ~85ms instead of 180ms. 40ms to get the tiny attribute of float sounds slow, but maybe there is no way around it (maybe MachAr can be sped up a bit, but this looks like quite sensitive code). cheers, David From cournapeau at cslab.kecl.ntt.co.jp Thu Jul 3 01:36:32 2008 From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau) Date: Thu, 03 Jul 2008 14:36:32 +0900 Subject: [Numpy-discussion] slow import of numpy modules In-Reply-To: <1215060961.15230.37.camel@bbc8> References: <1215038580.27861.14.camel@omad-ac002598.oma.us.ray.com> <3d375d730807021559l2c505632j9cd47a67ced6ba0@mail.gmail.com> <1215048239.15230.2.camel@bbc8> <3d375d730807021921g283c8750x70d0962d6749a2e1@mail.gmail.com> <1215052689.15230.11.camel@bbc8> <3d375d730807021950u7b2dc828y37b7275b2f2f1fae@mail.gmail.com> <1215058489.15230.24.camel@bbc8> <3d375d730807022136w60fc4eabrd8fc5ac0fd3f34bc@mail.gmail.com> <1215060961.15230.37.camel@bbc8> Message-ID: <1215063392.15230.39.camel@bbc8> On Thu, 2008-07-03 at 13:56 +0900, David Cournapeau wrote: > > Ok, will do it, then. I put the patches in ticket 838. I tried to commit the changes directly, but it looks like they disabled some proxy settings necessary to commit to svn at my company. On my computer, the changes cut 1/3 of total numpy import time, which is not bad since the changes are trivial. 
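
(The ma.extras part of the patch is just the usual lazy-default trick; schematically, with a made-up function signature:)

import numpy

def tolerance_demo(a, tol=None):
    # resolve the expensive default at call time, so finfo() no longer
    # runs as a side effect of importing the module
    if tol is None:
        tol = numpy.finfo(float).tiny
    return abs(a) > tol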
cheers, David From fperez.net at gmail.com Thu Jul 3 01:37:55 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 2 Jul 2008 22:37:55 -0700 Subject: [Numpy-discussion] FFT's & IFFT's on images In-Reply-To: References: <1215034384.6901.30.camel@helios> <9457e7c80807021456m2f7f9316j8566df424042649f@mail.gmail.com> <1215036888.6901.35.camel@helios> <486C0227.7080003@astraw.com> Message-ID: On Wed, Jul 2, 2008 at 3:53 PM, Fernando Perez wrote: > On Wed, Jul 2, 2008 at 3:33 PM, Andrew Straw wrote: >> Mike Sarahan wrote: >>> I agree that the components are very small, and in a numeric sense, I >>> wouldn't worry at all about them, but the image result is simply noise, >>> albeit periodic-looking noise. >> >> Fernando Perez and John Hunter have written a nice FFT image denoising >> example: >> http://matplotlib.svn.sourceforge.net/viewvc/matplotlib/trunk/py4science/examples/fft_imdenoise.py?view=markup >> >> with documentation, even: >> http://matplotlib.svn.sourceforge.net/viewvc/matplotlib/trunk/py4science/workbook/fft_imdenoise.tex?view=markup > > Ahem. Stefan wrote that :) By the way, I should apologize for a comment that was inadvertently rather curt. I simply meant to note that in py4science there's code from several people, and the fft piece was written by Stefan when we taught the workshop together. But Andrew had no way to know that, since that code isn't individually marked for authorship (there are acknowledgements at the start of the main document). Sorry about that, mate. Cheers, f From robert.kern at gmail.com Thu Jul 3 02:25:10 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 3 Jul 2008 01:25:10 -0500 Subject: [Numpy-discussion] slow import of numpy modules In-Reply-To: <1215060961.15230.37.camel@bbc8> References: <1215038580.27861.14.camel@omad-ac002598.oma.us.ray.com> <3d375d730807021559l2c505632j9cd47a67ced6ba0@mail.gmail.com> <1215048239.15230.2.camel@bbc8> <3d375d730807021921g283c8750x70d0962d6749a2e1@mail.gmail.com> <1215052689.15230.11.camel@bbc8> <3d375d730807021950u7b2dc828y37b7275b2f2f1fae@mail.gmail.com> <1215058489.15230.24.camel@bbc8> <3d375d730807022136w60fc4eabrd8fc5ac0fd3f34bc@mail.gmail.com> <1215060961.15230.37.camel@bbc8> Message-ID: <3d375d730807022325l6cd28e31q7a7783c4708d698@mail.gmail.com> On Wed, Jul 2, 2008 at 23:56, David Cournapeau wrote: > On Wed, 2008-07-02 at 23:36 -0500, Robert Kern wrote: >> >> Neither one has participated in this thread. At least, no such email >> has made it to my inbox. > > This was in the thread "import numpy" is slow, I mixed the two, sorry. Ah yes. It's probably not worth tracking down. >> I think it's worth moving these imports into the functions, then. > > Ok, will do it, then. I've checked in your changes with some modifications to the comments. >> > Then, something which takes a awful lot of time is finfo to get floating >> > points limits. This takes like 30-40 ms. I wonder if there are some ways >> > to make it faster. After that, there is no obvious spot I remember, but >> > I can get them tonight when I go back to my lab. >> >> They can all be turned into properties that look up in a cache first. >> iinfo already does this. > > Yes, it is cached, but the first run is slow and seems to take a long > time. As it is used as a default argument in numpy.ma.extras, it is run > when you import numpy. Just to check, I set the default argument to > None, and now import numpy is ~85ms instead of 180ms. 
40ms to get the > tiny attribute of float sounds slow, but maybe there is no way around it > (maybe MachAr can be sped up a bit, but this looks like quite sensitive > code). So here are all of the places that use the computed finfo values on the module level: numpy.lib.polynomial: Makes _single_eps and _double_eps global but only uses them inside functions. numpy.ma.extras: Just imports _single_eps and _double_eps from numpy.lib.polynomial but uses them inside functions. numpy.ma.core: Makes a global divide_tolerance that is used as a default in a constructor. This class is then instantiated at the module's level. I think the first two are easily replaced by actual calls to finfo() inside their functions. Because of the _finfo_cache, this should be fast enough, and it makes the code cleaner. Currently numpy.lib.polynomial and numpy.ma.extras go through if: tests to determine which of _single_eps and _double_eps to use. The last one is a bit tricky. I've pushed down the computation into the actual __call__ where it is used. It caches the result. It's not ideal, but it works. I hope this is acceptable, Pierre. Here are my timings (Intel OS X 10.5.3 with numpy.linalg linked to Accelerate.framework; warm disk caches; taking the consistent user and system times from repeated executions; I'd ignore the wall-clock time): Before: $ time python -c "import numpy" python -c "import numpy" 0.30s user 0.82s system 91% cpu 1.232 total Removal of finfo: $ time python -c "import numpy" python -c "import numpy" 0.27s user 0.82s system 94% cpu 1.156 total Removal of finfo and delayed imports: $ time python -c "import numpy" python -c "import numpy" 0.19s user 0.56s system 93% cpu 0.811 total Not too shabby. Anyways, I've checked it all in. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From cournapeau at cslab.kecl.ntt.co.jp Thu Jul 3 02:38:42 2008 From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau) Date: Thu, 03 Jul 2008 15:38:42 +0900 Subject: [Numpy-discussion] slow import of numpy modules In-Reply-To: <3d375d730807022325l6cd28e31q7a7783c4708d698@mail.gmail.com> References: <1215038580.27861.14.camel@omad-ac002598.oma.us.ray.com> <3d375d730807021559l2c505632j9cd47a67ced6ba0@mail.gmail.com> <1215048239.15230.2.camel@bbc8> <3d375d730807021921g283c8750x70d0962d6749a2e1@mail.gmail.com> <1215052689.15230.11.camel@bbc8> <3d375d730807021950u7b2dc828y37b7275b2f2f1fae@mail.gmail.com> <1215058489.15230.24.camel@bbc8> <3d375d730807022136w60fc4eabrd8fc5ac0fd3f34bc@mail.gmail.com> <1215060961.15230.37.camel@bbc8> <3d375d730807022325l6cd28e31q7a7783c4708d698@mail.gmail.com> Message-ID: <1215067123.15230.56.camel@bbc8> On Thu, 2008-07-03 at 01:25 -0500, Robert Kern wrote: > > Before: > $ time python -c "import numpy" > python -c "import numpy" 0.30s user 0.82s system 91% cpu 1.232 total > > Removal of finfo: > $ time python -c "import numpy" > python -c "import numpy" 0.27s user 0.82s system 94% cpu 1.156 total > > Removal of finfo and delayed imports: > $ time python -c "import numpy" > python -c "import numpy" 0.19s user 0.56s system 93% cpu 0.811 total I don't know how much is due to the hardware and how much is due to OS differences, but in my case (Linux with core 2 duo), with your changes, it went from: real 0m0.184s user 0m0.146s sys 0m0.034s To real 0m0.081s user 0m0.056s sys 0m0.022s Definitely worthwhile (now, importing numpy has no noticeable latency on a fast computer, which feels nice). Thanks for committing the changes, David From fperez.net at gmail.com Thu Jul 3 03:02:54 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 3 Jul 2008 00:02:54 -0700 Subject: [Numpy-discussion] slow import of numpy modules In-Reply-To: <1215067123.15230.56.camel@bbc8> References: <1215038580.27861.14.camel@omad-ac002598.oma.us.ray.com> <1215048239.15230.2.camel@bbc8> <3d375d730807021921g283c8750x70d0962d6749a2e1@mail.gmail.com> <1215052689.15230.11.camel@bbc8> <3d375d730807021950u7b2dc828y37b7275b2f2f1fae@mail.gmail.com> <1215058489.15230.24.camel@bbc8> <3d375d730807022136w60fc4eabrd8fc5ac0fd3f34bc@mail.gmail.com> <1215060961.15230.37.camel@bbc8> <3d375d730807022325l6cd28e31q7a7783c4708d698@mail.gmail.com> <1215067123.15230.56.camel@bbc8> Message-ID: Hardy, Core 2 Duo laptop, picking a typical score, warm disk caches. Before: maqroll[research]> time python -c 'import numpy' 0.180u 0.032s 0:00.20 105.0% 0+0k 0+0io 0pf+0w After: maqroll[research]> time python -c 'import numpy' 0.100u 0.032s 0:00.12 108.3% 0+0k 0+0io 0pf+0w Definitely a worthwhile improvement. Many thanks to all responsible! Cheers, f From robert.kern at gmail.com Thu Jul 3 03:06:14 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 3 Jul 2008 02:06:14 -0500 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> Message-ID: <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> On Mon, Jun 30, 2008 at 18:32, Andrew Dalke wrote: > Why does numpy/__init__.py need to import all of these other modules > and submodules? Any chance of cutting down on the number, in order > to improve startup costs? Can you try the SVN trunk? In another thread (it must be "numpy imports slowly!" 
week), David Cournapeau found some optimizations that could be done that don't affect the API. They seem to cut down my import times (on OS X) by about 1/3; on his Linux machine, it seems to be more. I would be interested to know how significantly it improves your use case. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From stefan at sun.ac.za Thu Jul 3 03:16:39 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 3 Jul 2008 09:16:39 +0200 Subject: [Numpy-discussion] Running doctests on buildbots In-Reply-To: <3d375d730806301208u35d1abe9y86a46d9866f35d8e@mail.gmail.com> References: <1d36917a0806301138n3e2b799fwe73decb137f55bd5@mail.gmail.com> <3d375d730806301155n4e53414er56a28b8db339a1b4@mail.gmail.com> <1d36917a0806301206g574dbb96m7a3a485d0e04cfbc@mail.gmail.com> <3d375d730806301208u35d1abe9y86a46d9866f35d8e@mail.gmail.com> Message-ID: <9457e7c80807030016rd6164c0gdd3f4acd14b79bcf@mail.gmail.com> 2008/6/30 Robert Kern : > On Mon, Jun 30, 2008 at 14:06, Alan McIntyre wrote: >> On Mon, Jun 30, 2008 at 2:55 PM, Robert Kern wrote: >>> Add the doctests argument. >> >> Where is the buildbot configuration kept? > > I have not the slightest idea. St?fan? Sorry for the slow response, I'm still catching up. The Buildbot configuration is kept on buildmaster.scipy.org, that won't help you. It sends a request to the client for (something similar to) "make build; make install; make test" to be run. The administrator of each slave has control over the Makefile itself, so we'll have to ask those individuals to fix the problem. Regards St?fan From stefan at sun.ac.za Thu Jul 3 03:34:46 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 3 Jul 2008 09:34:46 +0200 Subject: [Numpy-discussion] Record arrays In-Reply-To: References: <9457e7c80806260913m15badd84sdf895fea5cdf7556@mail.gmail.com> <4863C605.1080103@enthought.com> <88e473830806260948t7774bb8ene889a3f808acd5e9@mail.gmail.com> <20080626193425.GB3848@phare.normalesup.org> <15e4667e0806261313g6f3140aaj4be0e4ac306d0295@mail.gmail.com> <3d375d730806261325g194c0488lcdc0bcc4bf13f31f@mail.gmail.com> Message-ID: <9457e7c80807030034v4e79798ajab9b97fd1a2a156a@mail.gmail.com> 2008/6/27 Fernando Perez : > On Thu, Jun 26, 2008 at 1:25 PM, Robert Kern wrote: > >> One downside of this is that the attribute access feature slows down >> all field accesses, even the r['foo'] form, because it sticks a bunch >> of pure Python code in the middle. Much code won't notice this, but if >> you end up having to iterate over an array of records (as I have), >> this will be a hotspot for you. > > I wonder if it wouldn't be useful for *all* numpy arrays to have a .f > attribute that would provide attribute access to fields for complex > dtypes: > > In [13]: r['foo'] > Out[13]: array([1, 1, 1]) > > In [14]: r.f.foo > -> Hypothetically, same as [13] above I like this idea, and think it is worth exploring further. It would have been even better if we could have done x.f.field.subfield Unfortunately, there is no way (I know of) to tell `f` whether getattribute is being called further down the chain. But even having x.f.field.f.subfield would already be useful. 
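
Just to make that concrete, a rough sketch of such a proxy -- an illustration of the idea, not a proposed implementation:

class FieldAccess(object):
    # wrap a structured array; attribute access becomes field lookup
    def __init__(self, arr):
        self._arr = arr
    def __getattr__(self, name):
        return self._arr[name]

With `f` returning FieldAccess(x), `x.f.field` works, but the lookup hands back a plain array again, which is exactly where the chaining question above comes in.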
Stéfan

From falted at pytables.org Thu Jul 3 04:00:23 2008
From: falted at pytables.org (Francesc Alted)
Date: Thu, 3 Jul 2008 10:00:23 +0200
Subject: [Numpy-discussion] Change in the representation of complex numbers in NumPy 1.1
In-Reply-To:
References: <200807021512.23907.falted@pytables.org>
Message-ID: <200807031000.24328.falted@pytables.org>

A Wednesday 02 July 2008, Charles R Harris escrigué:
> On Wed, Jul 2, 2008 at 9:58 AM, Charles R Harris wrote:
> > On Wed, Jul 2, 2008 at 7:12 AM, Francesc Alted wrote:
> >> Hi,
> >>
> >> I've seen that NumPy has changed the representation of complex numbers starting with NumPy 1.1. Before, it was:
> >>
> >> >>> numpy.__version__
> >> '1.0.3'
> >>
> >> >>> repr(numpy.complex(0)) # The Python type
> >> '0j'
> >>
> >> >>> repr(numpy.complex128(0)) # The NumPy type
> >> '0j'
> >>
> >> Now, it is:
> >>
> >> >>> numpy.__version__
> >> '1.2.0.dev5313'
> >>
> >> >>> repr(numpy.complex(0))
> >> '0j'
> >>
> >> >>> repr(numpy.complex128(0))
> >> '(0.0+0.0j)'
> >>
> >> Not that I don't like the new way, but that broke a couple of tests of the PyTables suite, and before fixing it, I'd like to know if the new way will stay. Also, I'm not certain why you have chosen a different representation than the Python type.
> >
> > Looks like different functions are being called, as identical code is available for all the complex types. Hmm... probably float is promoted to double, and for double the python repr is called. Since python can't handle longdoubles, the following code is called.
> >
> > static PyObject *
> > c@name@type_@kind@(PyObject *self)
> > {
> >     static char buf1[100];
> >     static char buf2[100];
> >     static char buf3[202];
> >     c@name@ x;
> >     x = ((PyC@Name@ScalarObject *)self)->obval;
> >     format_@name@(buf1, sizeof(buf1), x.real, @NAME@PREC_@KIND@);
> >     format_@name@(buf2, sizeof(buf2), x.imag, @NAME@PREC_@KIND@);
> >
> >     snprintf(buf3, sizeof(buf3), "(%s+%sj)", buf1, buf2);
> >     return PyString_FromString(buf3);
> > }
> >
> > So this can be fixed in two ways: changing the cfloat and cdouble types to call the above, or fixing the above to look like python. Whichever way is chosen, I would rather they go through the same generated functions, as it keeps the code paths simpler, puts the format choice in a single location, and separates numpy from whatever might happen in python.
>
> And I suspect this might be fallout from changeset #5014: Fix missing format code so longdoubles print with proper precision. The clongdouble repr function used to be missing and probably defaulted to cdouble.

I'm not sure I follow you. Are you saying that this is a result of upcasting cfloats and cdoubles to clongdoubles when representing NumPy complex numbers? If so, why should this happen at all?

Cheers,

--
Francesc Alted

From zbyszek at in.waw.pl Thu Jul 3 05:25:08 2008
From: zbyszek at in.waw.pl (Zbyszek Szmek)
Date: Thu, 3 Jul 2008 11:25:08 +0200
Subject: [Numpy-discussion] Should we fix Ticket #709?
In-Reply-To: <9457e7c80807021355g7b021e92pc13bf37679633b1f@mail.gmail.com>
References: <20080702102159.GB21783@szyszka.in.waw.pl> <9457e7c80807021355g7b021e92pc13bf37679633b1f@mail.gmail.com>
Message-ID: <20080703092508.GA25596@szyszka.in.waw.pl>

On Wed, Jul 02, 2008 at 10:55:12PM +0200, Stéfan van der Walt wrote:
> 2008/7/2 Zbyszek Szmek :
> >> That's Ticket #709 :
> >>
> >> > I'm fairly sure that:
> >> > numpy.isnan(datetime.datetime.now())
> >> > ...should just return False and not raise an exception.
> > IMHO numpy.isnan() makes no sense for non-numerical types.
>
> I agree with Chuck's rationale [if someone asks me whether a peanut butter sandwich is a Koala bear, then I'd say no, without trying to cast the sandwich to a mammal], and it seems that other ufuncs try to be general this way, e.g. np.add handles arbitrary classes with __add__ methods.

Exactly: you can add things only when they are numerical or when you have defined __add__; otherwise it's a TypeError. If it were important and general enough, we could add an attribute __isnan__ and extend isnan-iness to other things, but it probably isn't.

The operator 'is' makes sense for both mammals and sandwiches, so the question 'is a butter sandwich a Koala?' makes sense. I would compare isnan-iness to asking whether 'a butter sandwich is even-toed': that property is defined for some mammals, and doesn't have a definite value for other things.

Defining isnan for arbitrary objects seems like a workaround for objects being passed to the wrong function, and it pollutes the code with a kludge for a special case.

Cheers,
Zbyszek

From alan.mcintyre at gmail.com Thu Jul 3 09:26:36 2008
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Thu, 3 Jul 2008 09:26:36 -0400
Subject: [Numpy-discussion] Running doctests on buildbots
In-Reply-To: <9457e7c80807030016rd6164c0gdd3f4acd14b79bcf@mail.gmail.com>
References: <1d36917a0806301138n3e2b799fwe73decb137f55bd5@mail.gmail.com> <3d375d730806301155n4e53414er56a28b8db339a1b4@mail.gmail.com> <1d36917a0806301206g574dbb96m7a3a485d0e04cfbc@mail.gmail.com> <3d375d730806301208u35d1abe9y86a46d9866f35d8e@mail.gmail.com> <9457e7c80807030016rd6164c0gdd3f4acd14b79bcf@mail.gmail.com>
Message-ID: <1d36917a0807030626i7dd00bc9n92f7c78b60394ca1@mail.gmail.com>

On Thu, Jul 3, 2008 at 3:16 AM, Stéfan van der Walt wrote:
> Sorry for the slow response, I'm still catching up.

I think if we turned on doctests right now, there would be about 100 test failures, so I've got plenty of my own catching up to do before doctests really need to be enabled. ;)

> The Buildbot configuration is kept on buildmaster.scipy.org, that won't help you. It sends a request to the client for (something similar to) "make build; make install; make test" to be run. The administrator of each slave has control over the Makefile itself, so we'll have to ask those individuals to fix the problem.

I wonder how hard it would be to have simpler local makefiles that pull the build/install/test instructions out of svn after the update? That way we don't have to bother the slave maintainers whenever we want to tweak the build/test parameters. If I recall correctly, Python does something like this. I could look into it if it seems worth doing.
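
As a strawman, the slave side could be reduced to a tiny runner that never changes, with the actual commands living in the tree -- the script name below is hypothetical, just to show the shape of it:

import subprocess, sys

# tools/buildbot_steps.py is an assumed checked-in script that knows the
# current build/install/test commands; the slave only ever runs this stub
for step in ('build', 'install', 'test'):
    if subprocess.call([sys.executable, 'tools/buildbot_steps.py', step]) != 0:
        sys.exit(1)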
From brnstrmrs at gmail.com  Thu Jul  3 09:57:48 2008
From: brnstrmrs at gmail.com (Brain Stormer)
Date: Thu, 3 Jul 2008 09:57:48 -0400
Subject: [Numpy-discussion] Numpy Array filling
Message-ID: <24bc7f6c0807030657i5668176sf89cc750fc7875b8@mail.gmail.com>

I am using numpy to create an array and then filling some of the values
using a for loop.  I was wondering if there is a way to easily fill the
values without iterating through, sort of like
"array.fill[start:stop, start:stop]"?  The reason for my question is
that in some cases I might have to fill hundreds of values (within a
10,000x10,000 matrix), and I am not sure if iteration is the right way
to go.

Code:

    from numpy import *
    x, y = 5, 5
    matrix = zeros((y, x), int)
    print matrix
    fillxstart, fillystart = 1, 1
    fillxstop, fillystop = 4, 4
    for i in range(fillystart, fillystop, 1):
        for j in range(fillxstart, fillxstop, 1):
            matrix[i, j] = 1
    print matrix

Output before filling:

    [[0 0 0 0 0]
     [0 0 0 0 0]
     [0 0 0 0 0]
     [0 0 0 0 0]
     [0 0 0 0 0]]

Output after filling:

    [[0 0 0 0 0]
     [0 1 1 1 0]
     [0 1 1 1 0]
     [0 1 1 1 0]
     [0 0 0 0 0]]

From kwgoodman at gmail.com  Thu Jul  3 10:17:02 2008
From: kwgoodman at gmail.com (Keith Goodman)
Date: Thu, 3 Jul 2008 07:17:02 -0700
Subject: [Numpy-discussion] Numpy Array filling
In-Reply-To: <24bc7f6c0807030657i5668176sf89cc750fc7875b8@mail.gmail.com>
References: <24bc7f6c0807030657i5668176sf89cc750fc7875b8@mail.gmail.com>
Message-ID:

On Thu, Jul 3, 2008 at 6:57 AM, Brain Stormer wrote:
> I am using numpy to create an array and then filling some of the
> values using a for loop. [rest of the question snipped]

>> matrix[fillxstart:fillxstop, fillystart:fillystop] = 1
>> matrix

array([[0, 0, 0, 0, 0],
       [0, 1, 1, 1, 0],
       [0, 1, 1, 1, 0],
       [0, 1, 1, 1, 0],
       [0, 0, 0, 0, 0]])
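[An aside the thread doesn't spell out: slice assignment covers one
rectangular block, but if the cells to fill are scattered, index arrays
do the same one-shot assignment.  A minimal sketch; the rows/cols values
are purely illustrative:]

    import numpy

    a = numpy.zeros((5, 5), int)
    rows = numpy.array([0, 2, 4])   # row index of each cell to fill
    cols = numpy.array([1, 3, 0])   # column index of each cell to fill
    a[rows, cols] = 1               # fills (0,1), (2,3) and (4,0) at once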
From charlesr.harris at gmail.com  Thu Jul  3 11:52:38 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 3 Jul 2008 09:52:38 -0600
Subject: [Numpy-discussion] Change in the representation of complex numbers in NumPy 1.1
In-Reply-To: <200807031000.24328.falted@pytables.org>
References: <200807021512.23907.falted@pytables.org>
	<200807031000.24328.falted@pytables.org>
Message-ID:

On Thu, Jul 3, 2008 at 2:00 AM, Francesc Alted wrote:
> [full re-quote of the earlier exchange snipped]
>
> I'm not sure I follow you.  Are you saying that this is a result of
> upcasting cfloats and cdoubles to clongdoubles when representing NumPy
> complex numbers?  If so, why should this happen at all?

No, just that clongdoubles didn't use to print with sufficient repr
precision for reasons I didn't understand; I added a NPY_ prefix to the
name of the generated printing function, and voila, it worked.  For
some reason cfloat and cdouble are using other print functions, which I
suspect call the Python repr function.  Anyway, I propose the
following:

1) Make all the prints go through the same generated code.  This makes
   the appearance consistent.

2) Decide how the output should be formatted.

So I can make either (0.0+0.0j), or perhaps something shorter, the
standard format.  If you have a preference, speak up.

Chuck
From eads at soe.ucsc.edu  Thu Jul  3 11:53:24 2008
From: eads at soe.ucsc.edu (Damian Eads)
Date: Thu, 3 Jul 2008 08:53:24 -0700 (PDT)
Subject: [Numpy-discussion] Numpy Array filling
In-Reply-To:
References: <24bc7f6c0807030657i5668176sf89cc750fc7875b8@mail.gmail.com>
Message-ID: <51213.128.165.112.85.1215100404.squirrel@squirrelmail.soe.ucsc.edu>

> On Thu, Jul 3, 2008 at 6:57 AM, Brain Stormer wrote:
>> I am using numpy to create an array and then filling some of the
>> values using a for loop. [rest of the question snipped]

Brain,

I would avoid calling your variable "matrix", since it collides with
the numpy.matrix class, which you've imported with *.

The topic to look up in the documentation is called "slicing".  You'll
find good explanations here:

  http://www.scipy.org/NumPy_for_Matlab_Users
  http://www.scipy.org/Tentative_NumPy_Tutorial
  http://www.tramy.us/ (Guide to NumPy)

Try to vectorize your code (i.e. avoid for loops) whenever possible.
When for loops are needed, use xrange for large loops.

I hope this helps. :-)

Damian
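[A rough illustration of Damian's advice, not from the thread: on a
large array, one slice assignment beats the double loop by orders of
magnitude.  The exact timings will vary by machine:]

    import time
    import numpy

    n = 2000
    a = numpy.zeros((n, n), int)

    t0 = time.time()
    for i in xrange(1, n - 1):          # the loop version from the question
        for j in xrange(1, n - 1):
            a[i, j] = 1
    t1 = time.time()

    a[:] = 0
    t2 = time.time()
    a[1:n-1, 1:n-1] = 1                 # the vectorized version
    t3 = time.time()

    print "loop fill: %.3f s   slice fill: %.6f s" % (t1 - t0, t3 - t2)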
From charlesr.harris at gmail.com  Thu Jul  3 12:19:22 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 3 Jul 2008 10:19:22 -0600
Subject: [Numpy-discussion] New test failure on SPARC buildbots.
Message-ID:

======================================================================
ERROR: test_ValidHTTP (test__datasource.TestDataSourceOpen)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/numpybb/Buildbot/numpy/b13/numpy-install/lib/python2.4/site-packages/numpy/lib/tests/test__datasource.py", line 81, in test_ValidHTTP
    assert self.ds.open(valid_httpurl())
  File "../numpy-install/lib/python2.4/site-packages/numpy/lib/_datasource.py", line 394, in open
  File "../numpy-install/lib/python2.4/site-packages/numpy/lib/_datasource.py", line 252, in _findfile
  File "../numpy-install/lib/python2.4/site-packages/numpy/lib/_datasource.py", line 214, in _cache
URLError:

It seems to be from the changes to speed up imports.

Chuck

From falted at pytables.org  Thu Jul  3 12:27:50 2008
From: falted at pytables.org (Francesc Alted)
Date: Thu, 3 Jul 2008 18:27:50 +0200
Subject: [Numpy-discussion] Change in the representation of complex numbers in NumPy 1.1
In-Reply-To:
References: <200807021512.23907.falted@pytables.org>
	<200807031000.24328.falted@pytables.org>
Message-ID: <200807031827.50919.falted@pytables.org>

A Thursday 03 July 2008, Charles R Harris escrigué:
> [full re-quote of the earlier exchange snipped]
> No, just that clongdoubles didn't use to print with sufficient repr
> precision for reasons I didn't understand; I added a NPY_ prefix to
> the name of the generated printing function, and voila, it worked.
> For some reason cfloat and cdouble are using other print functions,
> which I suspect call the Python repr function.  Anyway, I propose the
> following:
>
> 1) Make all the prints go through the same generated code.  This makes
>    the appearance consistent.

+1

> 2) Decide how the output should be formatted.
>
> So I can make either (0.0+0.0j), or perhaps something shorter, the
> standard format.  If you have a preference, speak up.

My personal preference goes for following the Python standard (I
suppose that this would be good for tests that check the
representation).  If this is difficult, the (0.0+0.0j) representation
will do.

Cheers,

--
Francesc Alted

From drnlmuller+scipy at gmail.com  Thu Jul  3 12:41:02 2008
From: drnlmuller+scipy at gmail.com (Neil Muller)
Date: Thu, 3 Jul 2008 18:41:02 +0200
Subject: [Numpy-discussion] New test failure on SPARC buildbots.
In-Reply-To:
References:
Message-ID:

On Thu, Jul 3, 2008 at 6:19 PM, Charles R Harris wrote:
> ======================================================================
> ERROR: test_ValidHTTP (test__datasource.TestDataSourceOpen)
> ----------------------------------------------------------------------
> [traceback snipped]
>
> It seems to be from the changes to speed up imports.

This is due to a local configuration issue on the buildbots - the proxy
isn't being set correctly, so trying to open the url fails.  I'll fix
this tomorrow.

--
Neil Muller
drnlmuller at gmail.com
I've got a gmail account. Why haven't I become cool?

From charlesr.harris at gmail.com  Thu Jul  3 12:50:16 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 3 Jul 2008 10:50:16 -0600
Subject: [Numpy-discussion] Change in the representation of complex numbers in NumPy 1.1
In-Reply-To: <200807031827.50919.falted@pytables.org>
References: <200807021512.23907.falted@pytables.org>
	<200807031827.50919.falted@pytables.org>
Message-ID:

On Thu, Jul 3, 2008 at 10:27 AM, Francesc Alted wrote:
> [full re-quote of the earlier exchange snipped]
>
> My personal preference goes for following the Python standard (I
> suppose that this would be good for tests that check the
> representation).  If this is difficult, the (0.0+0.0j) representation
> will do.
I'll add that it is not possible to guarantee reproducible results for
repr, especially for longdoubles, because of hardware and/or compiler
differences.  MSVC, for instance, defines long doubles to be the same
as doubles, while PPC uses some odd pairing of doubles.  One reason str
uses less precision is to hide those differences.

Chuck

From falted at pytables.org  Thu Jul  3 13:44:06 2008
From: falted at pytables.org (Francesc Alted)
Date: Thu, 3 Jul 2008 19:44:06 +0200
Subject: [Numpy-discussion] Change in the representation of complex numbers in NumPy 1.1
In-Reply-To:
References: <200807021512.23907.falted@pytables.org>
	<200807031827.50919.falted@pytables.org>
Message-ID: <200807031944.07066.falted@pytables.org>

A Thursday 03 July 2008, Charles R Harris escrigué:
> [full re-quote of the earlier exchange snipped]
>
> I'll add that it is not possible to guarantee reproducible results for
> repr, especially for longdoubles, because of hardware and/or compiler
> differences.  MSVC, for instance, defines long doubles to be the same
> as doubles, while PPC uses some odd pairing of doubles.  One reason
> str uses less precision is to hide those differences.

Ok.  But str also represents the 0j differently:

In [24]: str(numpy.complex64(0))
Out[24]: '(0.0+0.0j)'

In [25]: str(numpy.complex(0))
Out[25]: '0j'

In addition, I find the new representation not too nice looking:

In [35]: str(numpy.complex128(5/3j))
Out[35]: '(0.0+-1.66666666667j)'   # note the '+-' thing

In [36]: str(numpy.complex(5/3j))
Out[36]: '-1.66666666667j'

So perhaps it would be a wise thing to mimic the Python behaviour for
this sort of thing, if possible.

Cheers,

--
Francesc Alted
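[An aside on where the '+-' comes from; a sketch, not from the thread.
The C template quoted earlier pastes a '+' between the two halves
unconditionally, so a negative imaginary part keeps its own minus sign.
A signed conversion avoids the glitch:]

    # the current template is roughly equivalent to this:
    real, imag = '0.0', '-1.66666666667'
    print '(%s+%sj)' % (real, imag)      # -> (0.0+-1.66666666667j)

    # letting the conversion emit the sign fixes it:
    print '(%g%+gj)' % (0.0, -5 / 3.0)   # -> (0-1.66667j)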
From robert.kern at gmail.com  Thu Jul  3 14:50:29 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 3 Jul 2008 13:50:29 -0500
Subject: [Numpy-discussion] New test failure on SPARC buildbots.
In-Reply-To:
References:
Message-ID: <3d375d730807031150p5a11faf2h949ad3261a3fba6d@mail.gmail.com>

On Thu, Jul 3, 2008 at 11:41, Neil Muller wrote:
> On Thu, Jul 3, 2008 at 6:19 PM, Charles R Harris wrote:
>> ERROR: test_ValidHTTP (test__datasource.TestDataSourceOpen)
>> [traceback snipped]
>>
>> It seems to be from the changes to speed up imports.
>
> This is due to a local configuration issue on the buildbots - the
> proxy isn't being set correctly, so trying to open the url fails.
> I'll fix this tomorrow.

On the other hand, the unit tests should not have been written to
require network access.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From robert.kern at gmail.com  Thu Jul  3 15:02:28 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 3 Jul 2008 14:02:28 -0500
Subject: [Numpy-discussion] New test failure on SPARC buildbots.
In-Reply-To: <3d375d730807031150p5a11faf2h949ad3261a3fba6d@mail.gmail.com>
References: <3d375d730807031150p5a11faf2h949ad3261a3fba6d@mail.gmail.com>
Message-ID: <3d375d730807031202h2a147c58x705a430bebc7af3b@mail.gmail.com>

On Thu, Jul 3, 2008 at 13:50, Robert Kern wrote:
> [quote snipped]
>
> On the other hand, the unit tests should not have been written to
> require network access.

Actually it *is* related to the import changes.  The stub replaced
_datasource.urlopen(), which is no longer the location of the
urlopen() that is actually used.  This is fixed in SVN.

--
Robert Kern
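[A sketch of the pitfall Robert describes, not the actual test code: a
test must stub urlopen in the namespace where the code under test
actually looks it up, or the stub silently stops working when imports
are reorganized.  The fake function here is illustrative:]

    import numpy.lib._datasource as _datasource

    def fake_urlopen(url):
        raise AssertionError('network access attempted: %r' % (url,))

    # This interception only works while _datasource resolves the name
    # "urlopen" through its own module globals; if the module is changed
    # to import and call urlopen from somewhere else (as happened with
    # the import-speedup work), the stub no longer takes effect.
    _datasource.urlopen = fake_urlopen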
From charlesr.harris at gmail.com  Thu Jul  3 15:30:54 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 3 Jul 2008 13:30:54 -0600
Subject: [Numpy-discussion] Change in the representation of complex numbers in NumPy 1.1
In-Reply-To: <200807031944.07066.falted@pytables.org>
References: <200807021512.23907.falted@pytables.org>
	<200807031827.50919.falted@pytables.org>
	<200807031944.07066.falted@pytables.org>
Message-ID:

Hmm,

On Thu, Jul 3, 2008 at 11:44 AM, Francesc Alted wrote:
> Ok.  But str also represents the 0j differently:
> [examples snipped]
>
> So perhaps it would be a wise thing to mimic the Python behaviour for
> this sort of thing, if possible.

Looks like the numpy.complex scalar is the Python type:

In [2]: str(numpy.complex64(0))
Out[2]: '(0.0+0.0j)'

In [3]: str(numpy.complex128(0))
Out[3]: '(0.0+0.0j)'

In [4]: str(numpy.complex192(0))
Out[4]: '(0.0+0.0j)'

In [5]: str(numpy.complex(0))
Out[5]: '0j'

...

In [9]: type(numpy.complex(0))
Out[9]: <type 'complex'>

In [10]: type(numpy.complex128(0))
Out[10]: <type 'numpy.complex128'>

In [11]: ones(1, numpy.complex)
Out[11]: array([ 1.+0.j])

Unless it is used as a dtype:

In [12]: ones(1, numpy.complex).dtype
Out[12]: dtype('complex128')

So one fix would be to use the specific numpy type.  Because we don't
want to overload the usual Python complex type, this distinction
probably has to be kept in mind.  Note that cfloat, cdouble, and
clongdouble are more portable ways of getting at the C precisions.

The other fix is to format numpy complex in exactly the same way as
Python complex.  That is more complicated, not least because we have to
figure out what the rules are, but it can be done.

I would like some folks to weigh in on the desirability of changing the
format before I go off to do it.

Chuck
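[A sketch of what "format it like Python" would mean, not the eventual
numpy code: CPython drops a real part that is exactly zero and
otherwise wraps the pair in parentheses with a signed imaginary part.
%g is used here for brevity; Python's own repr keeps more significant
digits, and negative zero is glossed over:]

    def python_style_repr(z):
        # rough reimplementation of CPython's complex repr rules
        if z.real == 0.0:
            return '%gj' % z.imag
        return '(%g%+gj)' % (z.real, z.imag)

    python_style_repr(complex(0))   # -> '0j'
    python_style_repr(5 - 1.5j)     # -> '(5-1.5j)'
    python_style_repr(5 / 3j)       # -> '-1.66667j', no '+-' artifact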
From charlesr.harris at gmail.com  Thu Jul  3 15:44:46 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 3 Jul 2008 13:44:46 -0600
Subject: [Numpy-discussion] Change in the representation of complex numbers in NumPy 1.1
In-Reply-To:
References: <200807021512.23907.falted@pytables.org>
	<200807031944.07066.falted@pytables.org>
Message-ID:

On Thu, Jul 3, 2008 at 1:30 PM, Charles R Harris wrote:
> [previous message snipped]
>
> I would like some folks to weigh in on the desirability of changing
> the format before I go off to do it.

Note that Python itself is a bit inconsistent:

In [7]: complex(0)
Out[7]: 0j

In [8]: float(0)
Out[8]: 0.0

Putting on my pedant hat, I would think 0.0j would be more appropriate.

Chuck

From jonnojohnson at gmail.com  Thu Jul  3 15:53:28 2008
From: jonnojohnson at gmail.com (Jonno)
Date: Thu, 3 Jul 2008 14:53:28 -0500
Subject: [Numpy-discussion] Combination of element-wise and matrix multiplication
Message-ID: <3d15ebce0807031253p38e7e8f5ha9e4e7041aa8d72f@mail.gmail.com>

I have two 2d arrays a & b, for example:

    a = array([[c, d], [e, f]])
    b = array([[g, h], [i, j]])

Each of the elements of a & b is actually a 1d array of length N, so I
guess technically a & b have shape (2, 2, N).  However, I want to
matrix multiply a & b to create a 2d array x, where the elements of x
are created with element-wise math like so:

    x[0,0] = c*g + d*i
    x[0,1] = c*h + d*j
    x[1,0] = e*g + f*i
    x[1,1] = e*h + f*j

What is the simplest way to do this?  I ended up doing the matrix
multiplication of a & b manually as above, but this doesn't scale very
nicely if a & b become larger in size.

Cheers,

Jonno.

--
"If a theory can't produce hypotheses, can't be tested, can't be
disproven, and can't make predictions, then it's not a theory and
certainly not science." by spisska on Slashdot, Monday April 21, 2008

From jh at physics.ucf.edu  Thu Jul  3 17:59:08 2008
From: jh at physics.ucf.edu (Joe Harrington)
Date: Thu, 03 Jul 2008 17:59:08 -0400
Subject: [Numpy-discussion] Summer Doc Marathon status report and request for more writers
Message-ID:

This is an interim status report on the Summer Documentation Marathon.
It is also an invitation and plea for all experienced users to
participate!  I am cross-posting in an effort to get broader
participation.  Please hold any discussion on the scipy-dev mailing
list.

As you know, our immediate goal is to produce first-draft docstrings
for the user-visible parts of Numpy in time for a release before Fall
classes (about 1 August).  The short version is: we are really moving
along!  But we need *your* help to make it in time for August.  Here's
the scoop:

1. We have all our infrastructure, standards, and procedures in place:
we have a wiki that makes editing the docs easy and even fun, and it
communicates directly with the numpy sources.
We have PDF and HTML reference guides being generated essentially
automatically:

  http://sd-2116.dedibox.fr/pydocweb
  http://mentat.za.net/numpy/refguide/NumPy.pdf
  http://mentat.za.net/numpy/refguide/

The wiki front page contains or points to all you need to get started.
The wiki lets you pull down a docstring with a few mouse clicks, edit
it on your machine, upload it, and see how it will look in its HTML
version right away.  You can also read everyone else's docstrings,
comment on them, see the status of the project, and so on.  The
formatted versions necessarily lag the docstrings on the wiki because
they are made whenever the docstrings are checked into the sources.

2. We have documented about 1/4 of numpy in a fairly professional way,
comparable to the reference pages of the major commercial packages.
The doc wiki is probably the next place to go if your question isn't
answered by the docstring in the current version's help(), since you
can look at the new docstrings we've generated, and they're *good*!

3. But we're only 1/4 of the way there, we're halfway through the
summer, and some of the initial enthusiasm is waning.  The following
page tells the tale:

  http://sd-2116.dedibox.fr/pydocweb/stats/

As you can see (you did click, right?  please click...), there are 2323
numpy objects with docstrings.  Of these, 1464 we deemed "unimportant"
(for now); these are generally items not seen by regular users.  That
left 859 objects to document in this first pass.  We've done 24% of
them at this writing.

Now, 24% is really exciting, and I'd like to take a moment to say a
public "Hooray!" for the team (in no particular order):

Stéfan van der Walt
Pauli Virtanen
Robert Hetland
Gael Varoquaux
Scott Sinclair
Alan Jackson
Tim Cera
Johann Cohen-Tanugi
David Huard
Keith Goodman

Together these ten have written around 7500 words of documentation on
the community's behalf, mainly as volunteers.

HOWEVER, we can all do the math.  We've spent one of our two months.
We are 1/4 of the way there.  Progress is slowing, and even if it
weren't, we wouldn't make it in time.  This is not a sprint, it's a
MARATHON.

WE NEED YOUR HELP!  And we need it now.

Are you excited by the idea of having documentation for numpy by the
Fall release?  Of having docs that answer your questions, that have
*good* examples, that really save you time?  If so, then please invest
just a fraction of the time that documentation will save you in the
next year alone by signing up on the wiki and writing some.  If each
experienced user wrote just a few pages, we'd be done!

If you don't think you know enough to write, then do some reviewing.
Are the docs readable?  Do you understand the examples?  Each docstring
on the wiki has an easy comment box waiting for your thoughts.

You will have a reference guide in the next release of numpy!  I hope
you will help make it a complete one.  Sign up on the doc wiki today:

  http://sd-2116.dedibox.fr/pydocweb/

Thanks,

--jh-- and the SciPy Doc Team

Prof. Joseph Harrington
Department of Physics
MAP 414
4000 Central Florida Blvd.
University of Central Florida
Orlando, FL 32816-2385
(407) 823-3416 voice
(407) 823-5112 fax
(407) 823-2325 physics office
jh at physics.ucf.edu

From alan.mcintyre at gmail.com  Thu Jul  3 19:49:05 2008
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Thu, 3 Jul 2008 19:49:05 -0400
Subject: [Numpy-discussion] More doctest questions
Message-ID: <1d36917a0807031649g2b9d5a2aqa70e9c22f753dd02@mail.gmail.com>

Just a couple of quick questions for whomever knows:
1. Should we skip the numpy/f2py directory when looking for doctests?

2. Are the functions in numpy/lib/convdtype.py used anywhere?  I can
fix the doctests so they run, but I can't find anywhere they are used,
so I wanted to see if they were intended for removal but forgotten.

Thanks,
Alan

From robert.kern at gmail.com  Thu Jul  3 20:26:40 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 3 Jul 2008 19:26:40 -0500
Subject: [Numpy-discussion] More doctest questions
In-Reply-To: <1d36917a0807031649g2b9d5a2aqa70e9c22f753dd02@mail.gmail.com>
References: <1d36917a0807031649g2b9d5a2aqa70e9c22f753dd02@mail.gmail.com>
Message-ID: <3d375d730807031726k548bf6ddl7d3b8d81787ca6e0@mail.gmail.com>

On Thu, Jul 3, 2008 at 18:49, Alan McIntyre wrote:
> Just a couple of quick questions for whomever knows:
>
> 1. Should we skip the numpy/f2py directory when looking for doctests?

I don't see anything there that should cause problems (according to my
understanding of the collector).  Are you seeing problems?

> 2. Are the functions in numpy/lib/convdtype.py used anywhere?  I can
> fix the doctests so they run, but I can't find anywhere they are
> used, so I wanted to see if they were intended for removal but
> forgotten.

I think they were intended to be used for the Numeric->numpy code
transformation.  It never got used.  Delete it.

--
Robert Kern

From alan.mcintyre at gmail.com  Thu Jul  3 20:42:38 2008
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Thu, 3 Jul 2008 20:42:38 -0400
Subject: [Numpy-discussion] More doctest questions
In-Reply-To: <3d375d730807031726k548bf6ddl7d3b8d81787ca6e0@mail.gmail.com>
References: <1d36917a0807031649g2b9d5a2aqa70e9c22f753dd02@mail.gmail.com>
	<3d375d730807031726k548bf6ddl7d3b8d81787ca6e0@mail.gmail.com>
Message-ID: <1d36917a0807031742s66af7e69v8c687dbdc260c375@mail.gmail.com>

On Thu, Jul 3, 2008 at 8:26 PM, Robert Kern wrote:
>> 1. Should we skip the numpy/f2py directory when looking for doctests?
>
> I don't see anything there that should cause problems (according to
> my understanding of the collector).  Are you seeing problems?

Actually it's a problem with the execution context restriction we added
for NumPy doctests; they no longer have access to all the stuff in the
module where they are declared.  So in f2py/lib/extgen/base.py, the
Container doctest expects the Container class to be available, but it
isn't.

>> 2. Are the functions in numpy/lib/convdtype.py used anywhere? [...]
>
> I think they were intended to be used for the Numeric->numpy code
> transformation.  It never got used.  Delete it.

Ok.
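[A sketch of the kind of restriction Alan mentions; illustrative, not
the actual NumPy test collector.  The idea is to run a module's
doctests against a minimal namespace instead of the module's own
globals, so examples only see what they import themselves:]

    import doctest
    import numpy
    import numpy.lib.polynomial as mod

    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner()
    # hand every test a fresh, minimal namespace instead of mod.__dict__
    for test in finder.find(mod, globs={'np': numpy}):
        runner.run(test)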
From robert.kern at gmail.com  Thu Jul  3 23:21:34 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 3 Jul 2008 22:21:34 -0500
Subject: [Numpy-discussion] More doctest questions
In-Reply-To: <1d36917a0807031742s66af7e69v8c687dbdc260c375@mail.gmail.com>
References: <1d36917a0807031649g2b9d5a2aqa70e9c22f753dd02@mail.gmail.com>
	<3d375d730807031726k548bf6ddl7d3b8d81787ca6e0@mail.gmail.com>
	<1d36917a0807031742s66af7e69v8c687dbdc260c375@mail.gmail.com>
Message-ID: <3d375d730807032021m2b1589acn3006ad11b1f61074@mail.gmail.com>

On Thu, Jul 3, 2008 at 19:42, Alan McIntyre wrote:
> Actually it's a problem with the execution context restriction we
> added for NumPy doctests; they no longer have access to all the stuff
> in the module where they are declared.  So in f2py/lib/extgen/base.py,
> the Container doctest expects the Container class to be available,
> but it isn't.

f2py/lib is now gone as of today.  It was the implementation of the
next version of f2py.  The development of that has moved elsewhere.
Any other problems?

--
Robert Kern

From gael.varoquaux at normalesup.org  Fri Jul  4 00:34:18 2008
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Fri, 4 Jul 2008 06:34:18 +0200
Subject: [Numpy-discussion] Should we fix Ticket #709?
In-Reply-To: <9457e7c80807021355g7b021e92pc13bf37679633b1f@mail.gmail.com>
References: <20080702102159.GB21783@szyszka.in.waw.pl>
	<9457e7c80807021355g7b021e92pc13bf37679633b1f@mail.gmail.com>
Message-ID: <20080704043418.GB19428@phare.normalesup.org>

On Wed, Jul 02, 2008 at 10:55:12PM +0200, Stéfan van der Walt wrote:
> I agree with Chuck's rationale [if someone asks me whether a peanut
> butter sandwich is a Koala bear, then I'd say no, without trying to
> cast the sandwich to a mammal],

Man, you need to get more sleep.

Gaël

From haase at msg.ucsf.edu  Fri Jul  4 01:24:13 2008
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Fri, 4 Jul 2008 07:24:13 +0200
Subject: [Numpy-discussion] Should we fix Ticket #709?
In-Reply-To: <20080704043418.GB19428@phare.normalesup.org>
References: <20080702102159.GB21783@szyszka.in.waw.pl>
	<9457e7c80807021355g7b021e92pc13bf37679633b1f@mail.gmail.com>
	<20080704043418.GB19428@phare.normalesup.org>
Message-ID:

On Fri, Jul 4, 2008 at 6:34 AM, Gael Varoquaux wrote:
> On Wed, Jul 02, 2008 at 10:55:12PM +0200, Stéfan van der Walt wrote:
>> I agree with Chuck's rationale [if someone asks me whether a peanut
>> butter sandwich is a Koala bear, then I'd say no, without trying to
>> cast the sandwich to a mammal],
>
> Man, you need to get more sleep.

I thought it was a good explanation.

- Sebastian

From stefan at sun.ac.za  Fri Jul  4 02:48:05 2008
From: stefan at sun.ac.za (Stéfan van der Walt)
Date: Fri, 4 Jul 2008 08:48:05 +0200
Subject: [Numpy-discussion] Should we fix Ticket #709?
In-Reply-To: <20080703092508.GA25596@szyszka.in.waw.pl>
References: <20080702102159.GB21783@szyszka.in.waw.pl>
	<9457e7c80807021355g7b021e92pc13bf37679633b1f@mail.gmail.com>
	<20080703092508.GA25596@szyszka.in.waw.pl>
Message-ID: <9457e7c80807032348n108caf64pbc0840f69427fe97@mail.gmail.com>

2008/7/3 Zbyszek Szmek :
> Exactly: you can add things only when they are numerical or when you
> have defined __add__; otherwise it's a TypeError.  If it were
> important and general enough, we could add an __isnan__ attribute and
> extend isnan-ness to other things, but it probably isn't.
>
> I would compare isnan-ness to asking whether 'a butter sandwich is
> even-toed': that is defined for some mammals, but doesn't have a
> definite value for other things.

We may be talking about two different things here:

  - NaN: the IEEE floating point number
  - Not a number: anything that is not a number

The `isnan` function specifically aims to identify the first case.

> Defining isnan for arbitrary objects seems like a workaround for
> passing objects to the wrong function, and it pollutes the code with
> a kludge for a special case.

Sure, this can be done.  Then the function will simply become: return
False unless the input object is a NumPy floating-point scalar (or a
Python float) that matches the IEEE NaN bit-pattern.

Long live even-toed peanut-butter eating Koalas!

Stéfan
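[A sketch of the behaviour Stéfan describes, not a proposed patch: an
isnan wrapper that answers False for anything that isn't a float-like
value, instead of raising.  The helper name is made up:]

    import numpy

    def lenient_isnan(obj):
        # NaN is the one value that is not equal to itself in IEEE
        # arithmetic, so no bit-twiddling is needed for the check
        if isinstance(obj, (float, numpy.floating, numpy.complexfloating)):
            return obj != obj
        return False

    lenient_isnan(numpy.float64('nan'))   # -> True
    lenient_isnan(1.0)                    # -> False
    import datetime
    lenient_isnan(datetime.datetime.now())  # -> False, no TypeError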
From falted at pytables.org  Fri Jul  4 03:17:27 2008
From: falted at pytables.org (Francesc Alted)
Date: Fri, 4 Jul 2008 09:17:27 +0200
Subject: [Numpy-discussion] Change in the representation of complex numbers in NumPy 1.1
In-Reply-To:
References: <200807021512.23907.falted@pytables.org>
Message-ID: <200807040917.28567.falted@pytables.org>

A Thursday 03 July 2008, Charles R Harris escrigué:
> [re-quote of the earlier exchange snipped]
>
> > The other fix is to format numpy complex in exactly the same way as
> > Python complex.  That is more complicated, not least because we
> > have to figure out what the rules are, but it can be done.

Well, I'd choose that route.  Perhaps digging into the Python code we
can find the relevant code and copy it 'as is' for maximum
compatibility.

> > I would like some folks to weigh in on the desirability of changing
> > the format before I go off to do it.
>
> Note that Python itself is a bit inconsistent:
>
> In [7]: complex(0)
> Out[7]: 0j
>
> In [8]: float(0)
> Out[8]: 0.0
>
> Putting on my pedant hat, I would think 0.0j would be more
> appropriate.

Well, I'm not sure about that.  Technically, the '0.0' notation has
been chosen to differentiate it from the integer zero '0'.  However,
all complex types are 'floating point', so a '0j' representation
refers clearly to the complex type and there is no room for confusion
here.  In fact, I prefer the '0j' over the '0.0j' representation
(don't ask me why, but my gut tells me that '0j' is more 'zero' than
'0.0j' ;-)

Cheers,

--
Francesc Alted

From lbolla at gmail.com  Fri Jul  4 03:38:19 2008
From: lbolla at gmail.com (lorenzo bolla)
Date: Fri, 4 Jul 2008 09:38:19 +0200
Subject: [Numpy-discussion] Combination of element-wise and matrix multiplication
In-Reply-To: <3d15ebce0807031253p38e7e8f5ha9e4e7041aa8d72f@mail.gmail.com>
References: <3d15ebce0807031253p38e7e8f5ha9e4e7041aa8d72f@mail.gmail.com>
Message-ID: <80c99e790807040038l3abb40b0qa731c96d42195fe1@mail.gmail.com>

If a and b are 2d arrays, you can use numpy.dot:

In [36]: a
Out[36]:
array([[1, 2],
       [3, 4]])

In [37]: b
Out[37]:
array([[5, 6],
       [7, 8]])

In [38]: numpy.dot(a, b)
Out[38]:
array([[19, 22],
       [43, 50]])

If a and b are 3d arrays of shape 2x2xN, you can use something like
this:

In [52]: a = numpy.arange(16).reshape(2,2,4)

In [53]: b = numpy.arange(16,32).reshape(2,2,4)

In [54]: c = numpy.array([numpy.dot(a[...,i], b[...,i])
   ....:                  for i in xrange(a.shape[-1])])

In [55]: c.shape
Out[55]: (4, 2, 2)

Here c has shape (4,2,2) instead of (2,2,4), but you get the idea!

hth,
L.

On Thu, Jul 3, 2008 at 9:53 PM, Jonno wrote:
> I have two 2d arrays a & b, for example:
> [question snipped]

--
"Whereof one cannot speak, thereof one must be silent." -- Ludwig
Wittgenstein
From stefan at sun.ac.za  Fri Jul  4 04:06:29 2008
From: stefan at sun.ac.za (Stéfan van der Walt)
Date: Fri, 4 Jul 2008 10:06:29 +0200
Subject: [Numpy-discussion] Running doctests on buildbots
In-Reply-To: <1d36917a0807030626i7dd00bc9n92f7c78b60394ca1@mail.gmail.com>
References: <1d36917a0806301138n3e2b799fwe73decb137f55bd5@mail.gmail.com>
	<9457e7c80807030016rd6164c0gdd3f4acd14b79bcf@mail.gmail.com>
	<1d36917a0807030626i7dd00bc9n92f7c78b60394ca1@mail.gmail.com>
Message-ID: <9457e7c80807040106t60ed74c6v1bfab17aa1cbf1cd@mail.gmail.com>

2008/7/3 Alan McIntyre :
> I wonder how hard it would be to have simpler local makefiles that
> pull the build/install/test instructions out of svn after the update?
> That way we don't have to bother the slave maintainers whenever we
> want to tweak the build/test parameters.
>
> If I recall correctly, Python does something like this.  I could look
> into it if it seems worth doing.

We used to keep the build instructions in a central location, but it
turned out that many maintainers needed to modify the Makefile to do
machine-specific setups (setting up paths, mainly).

What we could do is to leave the current "make build" in place, but
change the "make test" step to be common to all.  We'd have to figure
out which Python binary to run because, again, that depends on the
build-slave setup.

Regards
Stéfan

From falted at pytables.org  Fri Jul  4 07:04:35 2008
From: falted at pytables.org (Francesc Alted)
Date: Fri, 4 Jul 2008 13:04:35 +0200
Subject: [Numpy-discussion] Change in the representation of complex numbers in NumPy 1.1
In-Reply-To: <200807040917.28567.falted@pytables.org>
References: <200807021512.23907.falted@pytables.org>
	<200807040917.28567.falted@pytables.org>
Message-ID: <200807041304.35799.falted@pytables.org>

A Friday 04 July 2008, Francesc Alted escrigué:
> [re-quote of the earlier exchange snipped]
>
> Well, I'd choose that route.  Perhaps digging into the Python code we
> can find the relevant code and copy it 'as is' for maximum
> compatibility.

I've added a ticket about this issue:

http://scipy.org/scipy/numpy/ticket/841

Cheers,

--
Francesc Alted

From michael at araneidae.co.uk  Fri Jul  4 07:51:19 2008
From: michael at araneidae.co.uk (Michael Abbott)
Date: Fri, 4 Jul 2008 11:51:19 +0000 (GMT)
Subject: [Numpy-discussion] Python ref count leak in numpy
Message-ID: <20080704113400.C69716@saturn.araneidae.co.uk>

I hope this is going to the right place.  I've tried to submit a Trac
ticket at http://projects.scipy.org/scipy/numpy/ but unfortunately it
won't let me log in, even though I've registered and successfully
logged in as a SciPy user!  Grr.

The bug itself is very easy to see in Python debug mode: adding arrays
of differing shapes causes the reference count to increase on each
operation.  For example:

    array(1) + 1    # Leaks one ref count per call
    1 + array(1)    # Leaks 12 ref counts per call(!)

It turns out that PyArray_CanCoerceScalar is being called rather a lot
of times (some of the lower-level numpy code is really not nice, which
is a pity, as there's some really clean code there too) ... and it has
a ref count leak.  The patch below fixes this problem.

This patch was made against version 5331 from subversion.  The leak was
introduced in revision 2575 in June '06, over two years ago -- I'm a
little surprised it hasn't been discovered already.

commit ea66a7ee65e1ae855bbc432a48eb1f48176a85d9
Author: Michael Abbott
Date:   Fri Jul 4 12:08:24 2008 +0100

    Fix memory leak in PyArray_CanCoerceScalar

    This routine (in multiarraymodule.c) was calling
    PyArray_DescrFromType without properly decrementing the reference
    count afterwards.  As this is used widely, it results in memory
    leaks in the most basic of operations.
diff --git a/numpy/core/src/multiarraymodule.c b/numpy/core/src/multiarraymodule.c index 1b752d6..e44ebac 100644 --- a/numpy/core/src/multiarraymodule.c +++ b/numpy/core/src/multiarraymodule.c @@ -2143,8 +2143,13 @@ PyArray_CanCoerceScalar(int thistype, int neededtype, if (from->f->cancastscalarkindto && (castlist = from->f->cancastscalarkindto[scalar])) { while (*castlist != PyArray_NOTYPE) - if (*castlist++ == neededtype) return 1; + if (*castlist++ == neededtype) { + Py_DECREF(from); + return 1; + } } + Py_DECREF(from); + switch(scalar) { case PyArray_BOOL_SCALAR: case PyArray_OBJECT_SCALAR: From dalke at dalkescientific.com Fri Jul 4 08:22:59 2008 From: dalke at dalkescientific.com (Andrew Dalke) Date: Fri, 4 Jul 2008 14:22:59 +0200 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> Message-ID: <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> On Jul 3, 2008, at 9:06 AM, Robert Kern wrote: > Can you try the SVN trunk? Sure. Though did you know it's not easy to find how to get numpy from SVN? I had to go to the second page of Google, which linked to someone's talk. I expected to find a link to it at http://numpy.scipy.org/ . Just like I expected to find a link to the numpy mailing list. Okay, compiled. [josiah:numpy/build/lib.macosx-10.3-fat-2.5] dalke% time python -c 'pass' 0.015u 0.042s 0:00.06 83.3% 0+0k 0+0io 0pf+0w [josiah:numpy/build/lib.macosx-10.3-fat-2.5] dalke% time python -c 'import numpy' 0.084u 0.231s 0:00.33 93.9% 0+0k 0+8io 0pf+0w [josiah:numpy/build/lib.macosx-10.3-fat-2.5] dalke% Previously it took 0.44 seconds so it's now 24% faster. > I would be interested to know how significantly it improves your > use case. For one of my clients I wrote a tool to analyze import times. 
I don't have it, but here's something similar I just now whipped up: import time seen = set() import_order = [] elapsed_times = {} level = 0 parent = None children = {} def new_import(name, globals, locals, fromlist): global level, parent if name in seen: return old_import(name, globals, locals, fromlist) seen.add(name) import_order.append((name, level, parent)) t1 = time.time() old_parent = parent parent = name level += 1 module = old_import(name, globals, locals, fromlist) level -= 1 parent = old_parent t2 = time.time() elapsed_times[name] = t2-t1 return module old_import = __builtins__.__import__ __builtins__.__import__ = new_import import numpy parents = {} for name, level, parent in import_order: parents[name] = parent print "== Tree ==" for name, level,parent in import_order: print "%s%s: %.3f (%s)" % (" "*level, name, elapsed_times[name], parent) print "\n" print "== Slowest (including children) ==" slowest = sorted((t, name) for (name, t) in elapsed_times.items())[-20:] for elapsed_time, name in slowest[::-1]: print "%.3f %s (%s)" % (elapsed_time, name, parents[name]) The result using the version out of subversion is == Tree == numpy: 0.237 (None) numpy.__config__: 0.000 (numpy) version: 0.000 (numpy) os: 0.000 (version) imp: 0.000 (version) _import_tools: 0.024 (numpy) sys: 0.000 (_import_tools) glob: 0.024 (_import_tools) fnmatch: 0.020 (glob) re: 0.018 (fnmatch) sre_compile: 0.009 (re) _sre: 0.000 (sre_compile) sre_constants: 0.004 (sre_compile) sre_parse: 0.006 (re) copy_reg: 0.000 (re) add_newdocs: 0.156 (numpy) lib: 0.150 (add_newdocs) info: 0.000 (lib) numpy.version: 0.000 (lib) type_check: 0.091 (lib) ... many lines removed ... mtrand: 0.021 (numpy) ctypeslib: 0.024 (numpy) ctypes: 0.023 (ctypeslib) _ctypes: 0.003 (ctypes) gestalt: 0.013 (ctypes) ctypes._endian: 0.001 (ctypes) numpy.core._internal: 0.000 (ctypeslib) ma: 0.005 (numpy) extras: 0.001 (ma) numpy.lib.index_tricks: 0.000 (extras) numpy.lib.polynomial: 0.000 (extras) == Slowest (including children) == 0.237 numpy (None) 0.156 add_newdocs (numpy) 0.150 lib (add_newdocs) 0.091 type_check (lib) 0.090 numpy.core.numeric (type_check) 0.049 io (lib) 0.048 numpy.testing (numpy.core.numeric) 0.024 _import_tools (numpy) 0.024 ctypeslib (numpy) 0.024 glob (_import_tools) 0.023 ctypes (ctypeslib) 0.022 utils (numpy.testing) 0.022 difflib (utils) 0.021 mtrand (numpy) 0.020 fnmatch (glob) 0.020 _datasource (io) 0.020 tempfile (io) 0.018 re (fnmatch) 0.018 heapq (difflib) 0.013 gestalt (ctypes) This only reports the first time a module is imported so fixing, say, the 'glob' in _import_tools doesn't mean it won't appear elsewhere. Andrew dalke at dalkescientific.com From jonnojohnson at gmail.com Fri Jul 4 10:09:59 2008 From: jonnojohnson at gmail.com (Jonno) Date: Fri, 4 Jul 2008 09:09:59 -0500 Subject: [Numpy-discussion] Combination of element-wise and matrix multiplication In-Reply-To: <80c99e790807040038l3abb40b0qa731c96d42195fe1@mail.gmail.com> References: <3d15ebce0807031253p38e7e8f5ha9e4e7041aa8d72f@mail.gmail.com> <80c99e790807040038l3abb40b0qa731c96d42195fe1@mail.gmail.com> Message-ID: <3d15ebce0807040709l344937b4tfe7f813fcf89a7bc@mail.gmail.com> Awesome. I just added rollaxis(c,0,3) and was done. Cheers mate. 
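For reference, a minimal self-contained sketch of the recipe that emerges
from the messages quoted below: per-slice dot products stacked with a list
comprehension, then the rollaxis call mentioned above to restore the
(2, 2, N) layout. The names a, b, c and the 2x2xN shapes are taken from the
thread; the sample values just mirror the quoted example:

import numpy

N = 4
a = numpy.arange(2 * 2 * N).reshape(2, 2, N)
b = numpy.arange(2 * 2 * N, 2 * 2 * N * 2).reshape(2, 2, N)

# dot() each 2x2 slice; stacking the results gives shape (N, 2, 2)
c = numpy.array([numpy.dot(a[..., i], b[..., i])
                 for i in range(a.shape[-1])])

# roll the stacking axis to the end, restoring shape (2, 2, N)
c = numpy.rollaxis(c, 0, 3)
assert c.shape == (2, 2, N)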
On Fri, Jul 4, 2008 at 2:38 AM, lorenzo bolla wrote: > If a and b are 2d arrays, you can use numpy.dot: > > In [36]: a > Out[36]: > array([[1, 2], > [3, 4]]) > In [37]: b > Out[37]: > array([[5, 6], > [7, 8]]) > In [38]: numpy.dot(a,b) > Out[38]: > array([[19, 22], > [43, 50]]) > > If a and b are 3d arrays of shape 2x2xN, you can use something like that: > In [52]: a = numpy.arange(16).reshape(2,2,4) > In [53]: b = numpy.arange(16,32).reshape(2,2,4) > In [54]: c = numpy.array([numpy.dot(a[...,i],b[...,i]) for i in > xrange(a.shape[-1])]) > In [55]: c.shape > Out[55]: (4, 2, 2) > > Here c has shape (4,2,2) instead (2,2,4), but you got the idea! > > hth, > L. > > > On Thu, Jul 3, 2008 at 9:53 PM, Jonno wrote: >> >> I have two 2d arrays a & b for example: >> a=array([c,d],[e,f]) >> b=array([g,h],[i,j]) >> Each of the elements of a & b are actually 1d arrays of length N so I >> guess technically a & b have shape (2,2,N). >> However I want to matrix multiply a & b to create a 2d array x, where >> the elements of x are created with element-wise math as so: >> x[0,0] = c*g+d*i >> x[0,1] = c*h+d*j >> x[1,0] = e*g+f*i >> x[1,1] = e*h+f*j >> >> What is the simplest way to do this? I ended up doing the matrix >> multiplication of a & b manually as above but this doesn't scale very >> nicely if a & b become larger in size. >> >> Cheers, >> >> Jonno. >> >> -- >> "If a theory can't produce hypotheses, can't be tested, can't be >> disproven, and can't make predictions, then it's not a theory and >> certainly not science." by spisska on Slashdot, Monday April 21, 2008 >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > -- > "Whereof one cannot speak, thereof one must be silent." -- Ludwig > Wittgenstein > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -- "If a theory can't produce hypotheses, can't be tested, can't be disproven, and can't make predictions, then it's not a theory and certainly not science." by spisska on Slashdot, Monday April 21, 2008 From pav at iki.fi Fri Jul 4 14:57:36 2008 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 4 Jul 2008 18:57:36 +0000 (UTC) Subject: [Numpy-discussion] Python ref count leak in numpy References: <20080704113400.C69716@saturn.araneidae.co.uk> Message-ID: Fri, 04 Jul 2008 11:51:19 +0000, Michael Abbott wrote: > I hope this is going to the right place. I've tried to submit a Trac > ticket at http://projects.scipy.org/scipy/numpy/ but unfortunately it > won't let me log in, even though I've registered and successfully logged > in as a SciPy user! Grr. Logged on the scipy.org wiki, or on the Trac? Numpy Trac, Scipy Trac and the scipy.org website all require separate registrations. > The bug itself is very easy to see in Python debug mode: adding arrays > of differing shapes causes the reference count to increase on each > operation. 
For example: [clip]

I put this report in as ticket #843
http://scipy.org/scipy/numpy/ticket/843

-- Pauli Virtanen

From michael at araneidae.co.uk Fri Jul 4 15:37:12 2008
From: michael at araneidae.co.uk (Michael Abbott)
Date: Fri, 4 Jul 2008 19:37:12 +0000 (GMT)
Subject: [Numpy-discussion] Python ref count leak in numpy
In-Reply-To:
References: <20080704113400.C69716@saturn.araneidae.co.uk>
Message-ID: <20080704190801.W76180@saturn.araneidae.co.uk>

On Fri, 4 Jul 2008, Pauli Virtanen wrote:
> Fri, 04 Jul 2008 11:51:19 +0000, Michael Abbott wrote:
> > I hope this is going to the right place. I've tried to submit a Trac
> > ticket at http://projects.scipy.org/scipy/numpy/ but unfortunately it
> > won't let me log in, even though I've registered and successfully logged
> > in as a SciPy user! Grr.
> Logged on the scipy.org wiki, or on the Trac? Numpy Trac, Scipy Trac and
> the scipy.org website all require separate registrations.

Well. Bear with me...

First the web sites. I can find three different websites that appear to
be relevant to NumPy. Let's take a look, shall we?

1. http://numpy.scipy.org

This is the first link on Google and appears to be the proper home of
Numpy ... but although there's an extended essay on what Numpy is, and
there are links to download the current version and to trelgol.com
(bizarrely blocked by my employers!), there are no development resources
there. Oh, tell a lie: there's an e-mail address for this mailing list,
but no links to the sign-on page or the archives.

2. http://sourceforge.net/projects/numpy/

As the downloads are hosted here, this must be the proper home? Evidently
not.

Ok, back to Google.

3. http://www.scipy.org/NumPy

Oh, that's odd. We're back on scipy.org, but in a different place. How
very strange. Well, at least there's a Mailing Lists link ... and where
do bugs go? Developer Zone? Ok.

FINALLY I find out where the current SVN branch is hosted. Good grief.
Now to report my bug...

Evidently I go to NumPy Developer's wiki (interesting -- is it Numpy or
NumPy?) at http://projects.scipy.org/scipy/numpy and register by following
the link to "register first" at ... wait for it ... at

http://projects.scipy.org/scipy/scipy/register

Notice that there is no numpy in that URL?

When I finish registering (I notice with surprise that there's no e-mail
for any notification, validation or lost password recovery; that's a
shame), I'm taken to http://projects.scipy.org/scipy/scipy , which is
hardly surprising, but very confusing -- I wrote my ticket but finally
noticed that I was on the SciPy tracker before submitting it.

Earlier I'm sure I found text on the Developer Zone page to the effect
that only one login was required. I can't find it now, maybe it's been
edited, or I'm looking in the wrong place. Anyhow, the fact remains that
there is no link to the numpy registration page.

Well well well. If I manually edit the link (Trac won't actually let me
visit the SciPy registration page anymore, oddly enough) to

http://projects.scipy.org/scipy/numpy/register

then I get a new registration page. And, remarkably, it works (though
giving separate passwords for the two logins creates its own headaches
with Firefox).

Ah well. Some work needed on the web sites I guess...

Thanks for entering my ticket. I've managed to register now.

From charlesr.harris at gmail.com Fri Jul 4 15:48:19 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 4 Jul 2008 13:48:19 -0600
Subject: [Numpy-discussion] Should the numpy complex print the same way as python complex numbers?
Message-ID: Currently they print differently: In [6]: str(np.complex128(0)) Out[6]: '(0.0+0.0j)' In [7]: str(complex(0)) Out[7]: '0j' It looks pretty easy to make numpy complex print the same as python complex. Shall we make that change? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Jul 4 16:02:43 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 4 Jul 2008 14:02:43 -0600 Subject: [Numpy-discussion] Python ref count leak in numpy In-Reply-To: <20080704190801.W76180@saturn.araneidae.co.uk> References: <20080704113400.C69716@saturn.araneidae.co.uk> <20080704190801.W76180@saturn.araneidae.co.uk> Message-ID: On Fri, Jul 4, 2008 at 1:37 PM, Michael Abbott wrote: > On Fri, 4 Jul 2008, Pauli Virtanen wrote: > > Fri, 04 Jul 2008 11:51:19 +0000, Michael Abbott wrote: > > > I hope this is going to the right place. I've tried to submit a Trac > > > ticket at http://projects.scipy.org/scipy/numpy/ but unfortunately it > > > won't let me log in, even though I've registered and successfully > logged > > > in as a SciPy user! Grr. > > Logged on the scipy.org wiki, or on the Trac? Numpy Trac, Scipy Trac and > > the scipy.org website all require separate registrations. > > Well. Bear with me... > > First the web sites. I can find three different websites that appear to > be relevant to NumPy. Let's take a look, shall we? > > 1. http://numpy.scipy.org > > This is the first link on Google and appears to be the proper home of > Numpy ... but although there's an extended essay on what Numpy is, and > there are links to download the current version and to trelgol.com > (bizzarely blocked by my employers!), there are no development resources > there. Oh, tell a lie: there's an e-mail address for this mailing list, > but no links to the sign-on page or the archives. > > 2. http://sourceforge.net/projects/numpy/ > > As the downloads are hosted here, this must be the proper home? Evidently > not. > > Ok, back to Google. > > 3. http://www.scipy.org/NumPy > > Oh, that's odd. We're back on scipy.org, but in a different place. How > very strange. Well, at least there's a Mailing Lists link ... and where > do bugs go? Developer Zone? Ok. > > FINALLY I find out where the current SVN branch is hosted. Good grief. > Now to report my bug... > > > Evidently I go to NumPy Developer's wiki (interesting -- is it Numpy or > NumPy?) at http://projects.scipy.org/scipy/numpy and register by following > the link to "register first" at ... wait for ... at > > http://projects.scipy.org/scipy/scipy/register > > Notice that there is no numpy in that URL? > > When I finish registering (I notice with surprise that there's no e-mail > for any notification, validation or lost password recovery; that's a > shame), I'm taken to http://projects.scipy.org/scipy/scipy , which is > hardly surprising, but very confusing -- I wrote my ticket but finally > noticed that I was on the SciPy tracker before submitting it. > > Earlier I'm sure I found text on the Developer Zone page to the effect > that only one login was required. I can't find it now, maybe it's been > edited, or I'm looking in the wrong place. Anyhow, the fact remains that > there is no link to the numpy registration page. > > Well well well. If I manually edit the link (Trac won't actually let me > visit the SciPy registration page anymore, oddly enough) to > > http://projects.scipy.org/scipy/numpy/register > > then I get a new registration page. 
And, remarkably, it works (though > giving separate passwords for the two logins creates its own headaches > with Firefox). > > Curiously, you didn't find the root page: http://www.scipy.org/. At the top there are icons that link to all the things you were looking for. I blame Google ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael at araneidae.co.uk Fri Jul 4 16:06:52 2008 From: michael at araneidae.co.uk (Michael Abbott) Date: Fri, 4 Jul 2008 20:06:52 +0000 (GMT) Subject: [Numpy-discussion] Python ref count leak in numpy In-Reply-To: References: <20080704113400.C69716@saturn.araneidae.co.uk> <20080704190801.W76180@saturn.araneidae.co.uk> Message-ID: <20080704200512.T76446@saturn.araneidae.co.uk> On Fri, 4 Jul 2008, Charles R Harris wrote: > Curiously, you didn't find the root page: http://www.scipy.org/. At the top > there are icons that link to all the things you were looking for. I blame > Google ;) Ah, yes. That's where the text I was looking for was: "You only need to [register] once (i.e, SciPy and NumPy Developer Pages use the same login/password)." From pav at iki.fi Fri Jul 4 16:22:55 2008 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 4 Jul 2008 20:22:55 +0000 (UTC) Subject: [Numpy-discussion] Python ref count leak in numpy References: <20080704113400.C69716@saturn.araneidae.co.uk> <20080704190801.W76180@saturn.araneidae.co.uk> <20080704200512.T76446@saturn.araneidae.co.uk> Message-ID: Fri, 04 Jul 2008 20:06:52 +0000, Michael Abbott wrote: > On Fri, 4 Jul 2008, Charles R Harris wrote: >> Curiously, you didn't find the root page: http://www.scipy.org/. At the >> top there are icons that link to all the things you were looking for. I >> blame Google ;) > > Ah, yes. That's where the text I was looking for was: > > "You only need to [register] once (i.e, SciPy and NumPy Developer > Pages use the same login/password)." Hmm, I'm not sure if I was correct in claiming that two registrations are needed for scipy and numpy Tracs. Two logins are needed, but I don't remember if I actually registered twice... -- Pauli Virtanen From matthew.brett at gmail.com Fri Jul 4 17:03:01 2008 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 4 Jul 2008 22:03:01 +0100 Subject: [Numpy-discussion] Python ref count leak in numpy In-Reply-To: <20080704190801.W76180@saturn.araneidae.co.uk> References: <20080704113400.C69716@saturn.araneidae.co.uk> <20080704190801.W76180@saturn.araneidae.co.uk> Message-ID: <1e2af89e0807041403s65995cg83a6801ebe397195@mail.gmail.com> Hi, > First the web sites. I can find three different websites that appear to > be relevant to NumPy. Let's take a look, shall we? > > 1. http://numpy.scipy.org > > This is the first link on Google and appears to be the proper home of > Numpy ... Thanks for this - no really - it's funny, and you're right, it's a bit of a mess at the moment. I seem to remember that this has been discussed before, but is there any chance of moving the content at http://numpy.scipy.org to the www.scipy.org site, and redirecting the original link? Pretty please? 
Matthew

From michael at araneidae.co.uk Fri Jul 4 18:05:47 2008
From: michael at araneidae.co.uk (Michael Abbott)
Date: Fri, 4 Jul 2008 22:05:47 +0000 (GMT)
Subject: [Numpy-discussion] Python ref count leak in numpy
In-Reply-To:
References: <20080704113400.C69716@saturn.araneidae.co.uk>
	<20080704190801.W76180@saturn.araneidae.co.uk>
	<20080704200512.T76446@saturn.araneidae.co.uk>
Message-ID: <20080704220329.E76864@saturn.araneidae.co.uk>

On Fri, 4 Jul 2008, Pauli Virtanen wrote:
> Fri, 04 Jul 2008 20:06:52 +0000, Michael Abbott wrote:
> > "You only need to [register] once (i.e, SciPy and NumPy Developer
> > Pages use the same login/password)."
> Hmm, I'm not sure if I was correct in claiming that two registrations are
> needed for scipy and numpy Tracs. Two logins are needed, but I don't
> remember if I actually registered twice...

Yes, you do need to register twice, and they are separate passwords (I'll
have to change one of mine to make Firefox be a bit happier). There are
two problems here:

1. The text on the page quoted above is quite wrong;

2. The "register here" link on the numpy Trac page goes to the wrong
registration page!

From haase at msg.ucsf.edu Fri Jul 4 19:36:43 2008
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Sat, 5 Jul 2008 01:36:43 +0200
Subject: [Numpy-discussion] Python ref count leak in numpy
In-Reply-To: <20080704220329.E76864@saturn.araneidae.co.uk>
References: <20080704113400.C69716@saturn.araneidae.co.uk>
	<20080704190801.W76180@saturn.araneidae.co.uk>
	<20080704200512.T76446@saturn.araneidae.co.uk>
	<20080704220329.E76864@saturn.araneidae.co.uk>
Message-ID:

On Sat, Jul 5, 2008 at 12:05 AM, Michael Abbott wrote:
> On Fri, 4 Jul 2008, Pauli Virtanen wrote:
>> Fri, 04 Jul 2008 20:06:52 +0000, Michael Abbott wrote:
>> > "You only need to [register] once (i.e, SciPy and NumPy Developer
>> > Pages use the same login/password)."
>> Hmm, I'm not sure if I was correct in claiming that two registrations are
>> needed for scipy and numpy Tracs. Two logins are needed, but I don't
>> remember if I actually registered twice...
>
> Yes, you do need to register twice, and they are separate passwords (I'll
> have to change one of mine to make Firefox be a bit happier). There are
> two problems here:
>
> 1. The text on the page quoted above is quite wrong;
>
> 2. The "register here" link on the numpy Trac page goes to the wrong
> registration page!

I (hope I just) fixed this (#2).
On the other hand it would be nice
a) if numpy and scipy could use the same username/password database
and
b) as mentioned, an email field would make sense here

- Sebastian Haase

From jturner at gemini.edu Fri Jul 4 20:47:35 2008
From: jturner at gemini.edu (James Turner)
Date: Fri, 04 Jul 2008 20:47:35 -0400
Subject: [Numpy-discussion] Ctypes required? Fails to build.
In-Reply-To: 3d375d730807021510j4e951740kf8332ab4ab33e1bb@mail.gmail.com
References: 3d375d730807021510j4e951740kf8332ab4ab33e1bb@mail.gmail.com
Message-ID: <486EC4A7.1020306@gemini.edu>

Thanks, Robert and Stefan for your helpful replies. It makes
a big difference to know which problem I need to solve and which
I don't :-).

Unfortunately I'm still getting those undefined symbol errors
for certain maths functions. I tried "python setup.py build_ext
-lm build" and don't have $LDFLAGS defined, though I am setting
LD_LIBRARY_PATH and LD_RUN_PATH. But I think the compiler is
actually finding libm because "-lm" is included on some lines
without any complaint.
Oddly enough, when I do numpy.test() in Python, everything passes except for ctypes, so maybe it's OK? I'll include the relevant output below in case anyone has ideas. I also found a second problem that I was able to solve but I'm wondering whether it's a bug? The C compiler failed with a syntax error on line 1518 of numpy/core/src/umathmodule.c.src because of a C++ style comment ("//"). When I remove the comment it works. Should NumPy coding style cater for my eccentric C compiler or is this all above board? Hope I'm not being stupid with the first thing. Thanks, James. --- cc: _configtest.c cc _configtest.o -L/astro/iraf/solsparc/gempylocal/lib -L/usr/local/lib -L/usr/lib -o _configtest _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: cc -DNDEBUG -O -xcode=pic32 compile options: '-Inumpy/core/src -Inumpy/core/include -I/astro/iraf/solsparc/gempylocal/include/python2.5 -c' cc: _configtest.c cc _configtest.o -o _configtest Undefined first referenced symbol in file exp _configtest.o ld: fatal: Symbol referencing errors. No output written to _configtest Undefined first referenced symbol in file exp _configtest.o ld: fatal: Symbol referencing errors. No output written to _configtest failure. removing: _configtest.c _configtest.o C compiler: cc -DNDEBUG -O -xcode=pic32 compile options: '-Inumpy/core/src -Inumpy/core/include -I/astro/iraf/solsparc/gempylocal/include/python2.5 -c' cc: _configtest.c cc _configtest.o -lm -o _configtest _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: cc -DNDEBUG -O -xcode=pic32 compile options: '-Inumpy/core/src -Inumpy/core/include -I/astro/iraf/solsparc/gempylocal/include/python2.5 -c' cc: _configtest.c "_configtest.c", line 4: undefined symbol: expl cc: acomp failed for _configtest.c "_configtest.c", line 4: undefined symbol: expl cc: acomp failed for _configtest.c failure. removing: _configtest.c _configtest.o C compiler: cc -DNDEBUG -O -xcode=pic32 compile options: '-Inumpy/core/src -Inumpy/core/include -I/astro/iraf/solsparc/gempylocal/include/python2.5 -c' cc: _configtest.c "_configtest.c", line 4: undefined symbol: expf cc: acomp failed for _configtest.c "_configtest.c", line 4: undefined symbol: expf cc: acomp failed for _configtest.c failure. removing: _configtest.c _configtest.o C compiler: cc -DNDEBUG -O -xcode=pic32 compile options: '-Inumpy/core/src -Inumpy/core/include -I/astro/iraf/solsparc/gempylocal/include/python2.5 -c' cc: _configtest.c cc _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: cc -DNDEBUG -O -xcode=pic32 compile options: '-Inumpy/core/src -Inumpy/core/include -I/astro/iraf/solsparc/gempylocal/include/python2.5 -c' cc: _configtest.c cc _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: cc -DNDEBUG -O -xcode=pic32 compile options: '-Inumpy/core/src -Inumpy/core/include -I/astro/iraf/solsparc/gempylocal/include/python2.5 -c' cc: _configtest.c cc _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: cc -DNDEBUG -O -xcode=pic32 compile options: '-Inumpy/core/src -Inumpy/core/include -I/astro/iraf/solsparc/gempylocal/include/python2.5 -c' cc: _configtest.c "_configtest.c", line 4: undefined symbol: atanhf cc: acomp failed for _configtest.c "_configtest.c", line 4: undefined symbol: atanhf cc: acomp failed for _configtest.c failure. 
removing: _configtest.c _configtest.o C compiler: cc -DNDEBUG -O -xcode=pic32 compile options: '-Inumpy/core/src -Inumpy/core/include -I/astro/iraf/solsparc/gempylocal/include/python2.5 -c' cc: _configtest.c cc _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest C compiler: cc -DNDEBUG -O -xcode=pic32 compile options: '-Inumpy/core/src -Inumpy/core/include -I/astro/iraf/solsparc/gempylocal/include/python2.5 -c' cc: _configtest.c "_configtest.c", line 4: undefined symbol: isinf cc: acomp failed for _configtest.c "_configtest.c", line 4: undefined symbol: isinf cc: acomp failed for _configtest.c failure. removing: _configtest.c _configtest.o C compiler: cc -DNDEBUG -O -xcode=pic32 compile options: '-Inumpy/core/src -Inumpy/core/include -I/astro/iraf/solsparc/gempylocal/include/python2.5 -c' cc: _configtest.c cc _configtest.o -lm -o _configtest success! removing: _configtest.c _configtest.o _configtest From robert.kern at gmail.com Fri Jul 4 21:25:03 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 4 Jul 2008 20:25:03 -0500 Subject: [Numpy-discussion] Ctypes required? Fails to build. In-Reply-To: <486EC4A7.1020306@gemini.edu> References: <486EC4A7.1020306@gemini.edu> Message-ID: <3d375d730807041825l1f722affnc2fc3699f7515417@mail.gmail.com> On Fri, Jul 4, 2008 at 19:47, James Turner wrote: > Thanks, Robert and Stefan for your helpful replies. It makes > a big difference to know which problem I need to solve and which > I don't :-). > > Unfortunately I'm still getting those undefined symbol errors > for certain maths functions. I tried "python setup.py build_ext > -lm build" and don't have $LDFLAGS defined, though I am setting > LD_LIBRARY_PATH and LD_RUN_PATH. But I think the compiler is > actually finding libm because "-lm" is included on some lines > without any complaint. Oddly enough, when I do numpy.test() in > Python, everything passes except for ctypes, so maybe it's OK? > I'll include the relevant output below in case anyone has ideas. Oh, don't worry about these. These are configuration tests to look for various math functions which vary in their availability, like those for long-double floats. numpy itself will build fine once it has this information. Your installation is fine. > I also found a second problem that I was able to solve but I'm > wondering whether it's a bug? The C compiler failed with a > syntax error on line 1518 of numpy/core/src/umathmodule.c.src > because of a C++ style comment ("//"). When I remove the comment > it works. Should NumPy coding style cater for my eccentric C > compiler or is this all above board? Yes, we should only use /**/. I have fixed this. Thank you. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From david at ar.media.kyoto-u.ac.jp Sat Jul 5 06:29:13 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 05 Jul 2008 19:29:13 +0900 Subject: [Numpy-discussion] [numscons] 0.8.1 released Message-ID: <486F4CF9.9060606@ar.media.kyoto-u.ac.jp> Hi, I am pleased to announce numscons 0.8.1. numscons is a tentative for a new build system for numpy/scipy and other packages depending on numpy.distutils for extension code. 
You can get it through the usual channels: http://projects.scipy.org/scipy/numpy/wiki/NumScons I did not announce releases of numscons for quite some time, but this did not mean I did not continue working on numscons. The goal is to have a beta available for scipy conference in August, where I will present numscons. Most changes since 0.8.0 are internal, with code simplification (compiler customization, etc...). The goal is to put most of the code upstream into scons (numscons is now around 3700 lines of code, and I hope to be able to reduce this to ~ 2000 lines of code). In particular: - no custom code for fortran anymore, everything is now integrated upstream. - python extensions are now built using a scons tool. This means more reliability, and hopefully at some point upstream integration. There are still some rough edges, but hopefully, nothing fundamental. - VS 2003/2005/2008 are supported as well (numpy can build with python 2.6 alpha and Visual Studio express), using a method similar to distutils in python 2.6 to get informations from the var*.bat files (much simpler and more robust than relying on the brain-dead registry). Since 0.8, the changes are not backward compatible with < 0.8. In particular the names of python builders have changed, and you should now use 2 scons scripts instead of one. I am sorry I have not yet updated the documentation, but I really want to get a beta ready for scipy conference, and I don't have so much time to keep the doc in sync. The examples are up to date, though. cheers, David From falted at pytables.org Sat Jul 5 07:36:51 2008 From: falted at pytables.org (Francesc Alted) Date: Sat, 5 Jul 2008 13:36:51 +0200 Subject: [Numpy-discussion] ANN: PyTables 2.0.4 released Message-ID: <200807051336.51783.falted@pytables.org> =========================== Announcing PyTables 2.0.4 =========================== PyTables is a library for managing hierarchical datasets and designed to efficiently cope with extremely large amounts of data with support for full 64-bit file addressing. PyTables runs on top of the HDF5 library and NumPy package for achieving maximum throughput and convenient use. After some months without new versions (I have been busy for a while doing things not related with PyTables, unfortunately), I'm happy to announce the availability of PyTables 2.0.4. It fixes some important issues, and now it is possible to use table selections in threaded environments. Also, ``EArray.truncate(0)`` can be used so that you can completely void existing EArrays (only enabled if you have a recent version, i.e. >= 1.8.0, of the HDF5 library installed). Finally, the usage of recent versions of NumPy (1.1) and HDF5 (1.8.1) has been tested and, fortunately, they work just fine. In case you want to know more in detail what has changed in this version, have a look at ``RELEASE_NOTES.txt``. Find the HTML version for this document at: http://www.pytables.org/moin/ReleaseNotes/Release_2.0.4 You can download a source package of the version 2.0.4 with generated PDF and HTML docs and binaries for Windows from http://www.pytables.org/download/stable/ For an on-line version of the manual, visit: http://www.pytables.org/docs/manual-2.0.4 *Important note for PyTables Pro users*: due to lack of resources, I'll not be delivering a MacOSX binary version of Pro for the time being (this is pretty easy to compile, though). However, I'll continue offering the all-in-one binary for Windows (32-bit). 
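Returning to the new ``EArray.truncate(0)`` capability mentioned above, a
minimal sketch (not taken from the release notes; it assumes PyTables 2.0.4
with HDF5 >= 1.8.0 installed, and the file name is arbitrary):

import numpy
import tables

fileh = tables.openFile('demo.h5', mode='w')
earray = fileh.createEArray(fileh.root, 'data',
                            tables.Float64Atom(), shape=(0,))
earray.append(numpy.arange(10.))
print earray.nrows    # -> 10
earray.truncate(0)    # completely void the existing EArray
print earray.nrows    # -> 0
fileh.close()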
Migration Notes for PyTables 1.x users
======================================

If you are a user of PyTables 1.x, it is probably worth looking at the
``MIGRATING_TO_2.x.txt`` file, where you will find directions on how to
migrate your existing PyTables 1.x apps to the 2.x versions. You can find
an HTML version of this document at
http://www.pytables.org/moin/ReleaseNotes/Migrating_To_2.x

Resources
=========

Go to the PyTables web site for more details:
http://www.pytables.org

About the HDF5 library:
http://hdfgroup.org/HDF5/

About NumPy:
http://numpy.scipy.org/

Acknowledgments
===============

Thanks to many users who provided feature improvements, patches, bug
reports, support and suggestions. See the ``THANKS`` file in the
distribution package for an (incomplete) list of contributors. Many
thanks also to SourceForge who have helped to make and distribute this
package! And last, but not least thanks a lot to the HDF5 and NumPy (and
numarray!) makers. Without them, PyTables simply would not exist.

Share your experience
=====================

Let us know of any bugs, suggestions, gripes, kudos, etc. you may have.

Enjoy your data!

-- Francesc Alted

From dagss at student.matnat.uio.no Sat Jul 5 09:44:58 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Sat, 05 Jul 2008 15:44:58 +0200
Subject: [Numpy-discussion] NumPy, buffers (PEP 3118), complex floats, and Cython
Message-ID: <486F7ADA.40609@student.matnat.uio.no>

I'd like some advice for what way people feel would be the best for
supporting complex datatypes in NumPy in Cython; as well as ask in what
way it is likely that NumPy will make use of PEP 3118.

It seems like NumPy defines its complex data to be a struct of two
doubles, for instance:

typedef struct { double real, imag; } npy_cdouble;

According to PEP 3118 [1], it would be natural to export this as "dd"
(record of two doubles) rather than "Zd" (complex double), when
exporting ndarrays through a buffer interface. Right?

I'm implementing native Cython support/syntax candy for PEP 3118 [2]
and at the same time a backwards-compatibility wrapper so that Cython
can access NumPy through PEP 3118 for Python <= 2.5 (essentially I
implement PyObject_GetBuffer etc. on behalf of NumPy for Python <= 2.5).
Then the question arises about how to deal with complex datatypes.

For Cython, it seems most user-friendly to use the C99 standard for
complex numbers (not currently supported, but it wouldn't be much work)
and have code like this:

cdef ndarray[complex double, 2] arr = ...
cdef complex double item
item = arr[34, 23]

So the natural questions then are:

- Is it ok to assume that (complex double*) can be safely cast to
(npy_cdouble*) on all platforms which both a) NumPy compiles and runs
on, and b) support complex double? (It seems to be ok with GCC but I
wouldn't know about other compilers.)

- Will NumPy ever be rewritten to use the C99 complex datatype if
available? (I.e. #ifdef how npy_cdouble is defined, and define
corresponding #ifdef-ed macros for all complex operations). This would
remove the need for such a cast.

- Does NumPy plan to support PEP3118, and if so, will complex numbers be
exported as "dd" or "Zd"?
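One quick sanity check that the layout itself is not in question, only its
interpretation (a small sketch, separate from the original mail; it relies
only on numpy's tostring() and the stdlib struct module):

import struct
import numpy

z = numpy.array([1.5 + 2.5j], dtype=numpy.complex128)
raw = z.tostring()                     # 16 raw bytes: real part, then imaginary part
real, imag = struct.unpack('dd', raw)  # reinterpret as two packed native doubles
assert (real, imag) == (1.5, 2.5)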
[1]: http://www.python.org/dev/peps/pep-3118/
[2]: http://wiki.cython.org/enhancements/buffer

--
Dag Sverre

From gregor.thalhammer at gmail.com Sat Jul 5 10:03:19 2008
From: gregor.thalhammer at gmail.com (Gregor Thalhammer)
Date: Sat, 05 Jul 2008 16:03:19 +0200
Subject: [Numpy-discussion] failure with numpy.inner
Message-ID: <486F7F27.2080801@googlemail.com>

After upgrading to NumPy 1.1.0 (I installed
numpy-1.1.0-win32-superpack-python2.5) I observed a fatal failure with
the following code which uses numpy.inner

import numpy
F = numpy.zeros(shape = (1,79), dtype = numpy.float64)
#this succeeds
FtF = numpy.inner(F,F.copy())
#this fails
FtF = numpy.inner(F,F)

The failure (Exception code 0xc0000005) happens in _dotblas.pyd. I use
Windows XP on an Intel Core2Duo system. Further, I could observe that it
depends on the size of the array whether this failure appears or not.

I submitted this report also as ticket #844.

I hope somebody can track down the reason for this behaviour (and fix it).

Gregor Thalhammer

From oliphant at enthought.com Sat Jul 5 14:50:53 2008
From: oliphant at enthought.com (Travis E. Oliphant)
Date: Sat, 05 Jul 2008 13:50:53 -0500
Subject: [Numpy-discussion] NumPy, buffers (PEP 3118), complex floats, and Cython
In-Reply-To: <486F7ADA.40609@student.matnat.uio.no>
References: <486F7ADA.40609@student.matnat.uio.no>
Message-ID: <486FC28D.2010502@enthought.com>

Dag Sverre Seljebotn wrote:
> I'd like some advice for what way people feel would be the best for
> supporting complex datatypes in NumPy in Cython; as well as ask in what
> way it is likely that NumPy will make use of PEP 3118.
>
> It seems like NumPy defines its complex data to be a struct of two
> doubles, for instance:
>
> typedef struct { double real, imag; } npy_cdouble;
>
> According to PEP 3118 [1], it would be natural to export this as "dd"
> (record of two doubles) rather than "Zd" (complex double), when
> exporting ndarrays through a buffer interface. Right?
>
No, it is more natural to use Zd because then you know you are dealing
with complex numbers and not just two separate floats.
> For Cython, it seems most user-friendly to use the C99 standard for
> complex numbers (not currently supported, but it wouldn't be much work)
> and have code like this:
>
> cdef ndarray[complex double, 2] arr = ...
> cdef complex double item
> item = arr[34, 23]
>
That is fine to use the C99 standard.
> So the natural questions then are:
>
> - Is it ok to assume that (complex double*) can be safely cast to
> (npy_cdouble*) on all platforms which both a) NumPy compiles and runs
> on, and b) support complex double? (It seems to be ok with GCC but I
> wouldn't know about other compilers.)
>
I think so, but I'm not 100% sure.
> - Will NumPy ever be rewritten to use the C99 complex datatype if
> available? (I.e. #ifdef how npy_cdouble is defined, and define
> corresponding #ifdef-ed macros for all complex operations). This would
> remove the need for such a cast.
>
Can't answer that one. C99 is not supported by Microsoft's compilers
and other compilers and so could not be used though I would have liked
to.
> - Does NumPy plan to support PEP3118, and if so, will complex numbers be
> exported as "dd" or "Zd"?
>
Yes, it will support it eventually. PEP3118 came from NumPy.
Complex numbers will be exported as "Zd" -Travis From robert.kern at gmail.com Sat Jul 5 19:45:02 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 5 Jul 2008 18:45:02 -0500 Subject: [Numpy-discussion] failure with numpy.inner In-Reply-To: <486F7F27.2080801@googlemail.com> References: <486F7F27.2080801@googlemail.com> Message-ID: <3d375d730807051645g78e32b08l3a89218b90de82f7@mail.gmail.com> On Sat, Jul 5, 2008 at 09:03, Gregor Thalhammer wrote: > After upgrading to NumPy 1.1.0 (I installed > numpy-1.1.0-win32-superpack-pyhon2.5) I observed a fatal failure with > the following code which uses numpy.inner > > import numpy > F = numpy.zeros(shape = (1,79), dtype = numpy.float64) > #this suceeds > FtF = numpy.inner(F,F.copy()) > #this fails > FtF = numpy.inner(F,F) > > The failure (Exception code 0xc0000005) happens in _dotblas.pyd. I use > Windows XP on a Intel Core2Duo system. Further, I could observe that it > depends on the size of the array wether this failure appears or not. > > I submitted this report also as ticket #844. > > I hope somebody can track down the reason for this behaviour (and fix it). Hmm. This looks like a problem with that particular build's ATLAS. I do not see a problem on OS X. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From wojciechowski_m at o2.pl Sat Jul 5 20:43:02 2008 From: wojciechowski_m at o2.pl (Marek Wojciechowski) Date: Sun, 6 Jul 2008 02:43:02 +0200 Subject: [Numpy-discussion] Numpy on AIX 5.3 In-Reply-To: References: Message-ID: <200807060243.02500.wojciechowski_m@o2.pl> Hi! I'm trying to install numpy-1.1 on AIX 5.3 but i'm getting an error: running build running scons customize UnixCCompiler Found executable /usr/bin/cc_r customize IBMFCompiler Found executable /usr/bin/xlf90 Found executable /usr/bin/xlf Found executable /usr/bin/xlf95 Creating /tmp/tmp5j_OiW/qV0MJ4_xlf.cfg customize IBMFCompiler Creating /tmp/tmp5j_OiW/-LWcxB_xlf.cfg customize UnixCCompiler customize UnixCCompiler using scons Traceback (most recent call last): File "setup.py", line 96, in setup_package() File "setup.py", line 89, in setup_package configuration=configuration ) File "/home/marek/tmp/numpy-1.1.0/numpy/distutils/core.py", line 184, in setup return old_setup(**new_attr) File "/home/marek/apython/lib/python2.5/distutils/core.py", line 151, in setup dist.run_commands() File "/home/marek/apython/lib/python2.5/distutils/dist.py", line 974, in run_commands self.run_command(cmd) File "/home/marek/apython/lib/python2.5/distutils/dist.py", line 994, in run_command cmd_obj.run() File "/home/marek/tmp/numpy-1.1.0/numpy/distutils/command/build.py", line 38, in run self.run_command('scons') File "/home/marek/apython/lib/python2.5/distutils/cmd.py", line 333, in run_command self.distribution.run_command(command) File "/home/marek/apython/lib/python2.5/distutils/dist.py", line 993, in run_command cmd_obj.ensure_finalized() File "/home/marek/apython/lib/python2.5/distutils/cmd.py", line 117, in ensure_finalized self.finalize_options() File "/home/marek/tmp/numpy-1.1.0/numpy/distutils/command/scons.py", line 289, in finalize_options self.cxxcompiler = cxxcompiler.cxx_compiler() File "/home/marek/tmp/numpy-1.1.0/numpy/distutils/ccompiler.py", line 303, in CCompiler_cxx_compiler + cxx.linker_so[2:] TypeError: can only concatenate list (not "str") to list Setting CXX=xlc++_r (which is proper C++ compiler) does not work. 
How to fix this?

Greetings
-- Marek Wojciechowski

From david at ar.media.kyoto-u.ac.jp Sun Jul 6 00:36:24 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sun, 06 Jul 2008 13:36:24 +0900
Subject: [Numpy-discussion] Numpy on AIX 5.3
In-Reply-To: <200807060243.02500.wojciechowski_m@o2.pl>
References: <200807060243.02500.wojciechowski_m@o2.pl>
Message-ID: <48704BC8.6070900@ar.media.kyoto-u.ac.jp>

Marek Wojciechowski wrote:
> Hi!
>
> I'm trying to install numpy-1.1 on AIX 5.3 but i'm getting an error:
>
> running build
> running scons
> customize UnixCCompiler
> Found executable /usr/bin/cc_r
> customize IBMFCompiler
> Found executable /usr/bin/xlf90
> Found executable /usr/bin/xlf
> Found executable /usr/bin/xlf95
> Creating /tmp/tmp5j_OiW/qV0MJ4_xlf.cfg
> customize IBMFCompiler
> Creating /tmp/tmp5j_OiW/-LWcxB_xlf.cfg
> customize UnixCCompiler
> customize UnixCCompiler using scons
> Traceback (most recent call last):
> File "setup.py", line 96, in
> setup_package()
> File "setup.py", line 89, in setup_package
> configuration=configuration )
> File "/home/marek/tmp/numpy-1.1.0/numpy/distutils/core.py", line 184, in
> setup
> return old_setup(**new_attr)
> File "/home/marek/apython/lib/python2.5/distutils/core.py", line 151, in
> setup
> dist.run_commands()
> File "/home/marek/apython/lib/python2.5/distutils/dist.py", line 974, in
> run_commands
> self.run_command(cmd)
> File "/home/marek/apython/lib/python2.5/distutils/dist.py", line 994, in
> run_command
> cmd_obj.run()
> File "/home/marek/tmp/numpy-1.1.0/numpy/distutils/command/build.py", line
> 38, in run
> self.run_command('scons')
> File "/home/marek/apython/lib/python2.5/distutils/cmd.py", line 333, in
> run_command
> self.distribution.run_command(command)
> File "/home/marek/apython/lib/python2.5/distutils/dist.py", line 993, in
> run_command
> cmd_obj.ensure_finalized()
> File "/home/marek/apython/lib/python2.5/distutils/cmd.py", line 117, in
> ensure_finalized
> self.finalize_options()
> File "/home/marek/tmp/numpy-1.1.0/numpy/distutils/command/scons.py", line
> 289, in finalize_options
> self.cxxcompiler = cxxcompiler.cxx_compiler()
> File "/home/marek/tmp/numpy-1.1.0/numpy/distutils/ccompiler.py", line 303,
> in CCompiler_cxx_compiler
> + cxx.linker_so[2:]
> TypeError: can only concatenate list (not "str") to list
>

Just from reading the code, the line

[cxx.linker_so[0]] + cxx.compiler_cxx[0] + cxx.linker_so[2:]

cannot work unless cxx.compiler_cxx is a nested list. Since AIX is not
that common, it is quite possible that this mistake was hidden for a long
time. So I would first try something like:

cxx.linker_so = [cxx.linker_so[0], cxx.compiler_cxx[0]] +cxx.linker_so[2:]

cheers,

David

From wilson.t.thompson at gmail.com Sun Jul 6 08:55:44 2008
From: wilson.t.thompson at gmail.com (wilson)
Date: Sun, 6 Jul 2008 05:55:44 -0700 (PDT)
Subject: [Numpy-discussion] calling numpy from java
Message-ID:

hi all,
is it possible to use numpy functions (like eigh(), etc.) from java
code? is there a java wrapper for numpy?
thanks
wilson

From robert.kern at gmail.com Sun Jul 6 09:04:30 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Sun, 6 Jul 2008 08:04:30 -0500
Subject: [Numpy-discussion] calling numpy from java
In-Reply-To:
References:
Message-ID: <3d375d730807060604q15d2410dv2e0ee9f612c6215b@mail.gmail.com>

On Sun, Jul 6, 2008 at 07:55, wilson wrote:
> hi all,
> is it possible to use numpy functions (like eigh(), etc.) from java
> code?

Not particularly, no.

> is there a java wrapper for numpy?

No.
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From cournape at gmail.com Sun Jul 6 09:10:49 2008 From: cournape at gmail.com (David Cournapeau) Date: Sun, 6 Jul 2008 22:10:49 +0900 Subject: [Numpy-discussion] calling numpy from java In-Reply-To: References: Message-ID: <5b8d13220807060610l1d6fe36ar53aacecadd10e9bd@mail.gmail.com> On Sun, Jul 6, 2008 at 9:55 PM, wilson wrote: > hi all, > is it possible to use numpy functions (like eigh()..etc)from java > code? isthere a java wrapper for numpy? As Robert said, not really possible. A large part of numpy (around 50%) is pure C, of which a large part is tied to the python C api. If all you want is some linear algebra things like eigh, I would be really surprised if java did not have wrappers around LAPACK, though, so that's where you should look. cheers, David From wright at esrf.fr Sun Jul 6 11:07:46 2008 From: wright at esrf.fr (Jon Wright) Date: Sun, 06 Jul 2008 17:07:46 +0200 Subject: [Numpy-discussion] calling numpy from java In-Reply-To: <5b8d13220807060610l1d6fe36ar53aacecadd10e9bd@mail.gmail.com> References: <5b8d13220807060610l1d6fe36ar53aacecadd10e9bd@mail.gmail.com> Message-ID: <4870DFC2.3050301@esrf.fr> David Cournapeau wrote: > On Sun, Jul 6, 2008 at 9:55 PM, wilson wrote: >> hi all, >> is it possible to use numpy functions (like eigh()..etc)from java >> code? isthere a java wrapper for numpy? Yes, it is possible, but not yet 100% convenient. Have a look at jepp, from jepp.sourceforge.net. The idiom is something like array.tostring being picked up as a floatarray or bytearray in java. Best, Jon From dagss at student.matnat.uio.no Sun Jul 6 11:24:49 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sun, 06 Jul 2008 17:24:49 +0200 Subject: [Numpy-discussion] NumPy, buffers (PEP 3118), complex floats, and Cython In-Reply-To: <486FC28D.2010502@enthought.com> References: <486F7ADA.40609@student.matnat.uio.no> <486FC28D.2010502@enthought.com> Message-ID: <4870E3C1.8070706@student.matnat.uio.no> Travis E. Oliphant wrote: > Dag Sverre Seljebotn wrote: >> I'd like some advice for what way people feel would be the best for >> supporting complex datatypes in NumPy in Cython; as well as ask in what >> way it is likely that NumPy will make use of PEP 3118. >> >> It seems like NumPy defines its complex data to be a struct of two >> doubles, for instance: >> >> typedef struct { double real, imag; } npy_cdouble; >> >> According to PEP 3118 [1], it would be natural to export this as "dd" >> (record of two doubles) rather than "Zd" (complex double), when >> exporting ndarrays through a buffer interface. Right? >> > No, it is more natural to use Zd because then you know you are dealing > with complex numbers and not just two separate floats. I guess my confusion comes from assuming the wrong thing about what "complex double" refers to in the PEP; since that's the only reference I found to the concept in the PEP I assumed it corresponded to the C99 float. Can I assume then that Zd means that the data always has the form of npy_cdouble and friends? (I guess this might seem "obvious", but as you see I've already had one wrong guess, and I can't seem to find it defined anywhere...) 
-- Dag Sverre From rblove_lists at comcast.net Sun Jul 6 12:08:07 2008 From: rblove_lists at comcast.net (Robert Love) Date: Sun, 6 Jul 2008 11:08:07 -0500 Subject: [Numpy-discussion] Enthought Python Distribution In-Reply-To: <05B4EAF2-B182-4E33-A622-B3F65D98429A@enthought.com> References: <05B4EAF2-B182-4E33-A622-B3F65D98429A@enthought.com> Message-ID: On Jul 2, 2008, at 9:34 AM, Travis Vaught wrote: > Greetings, > > We're pleased to announce the beta release of the Enthought Python > Distribution for *Mac OS X*. > Oh Happiness, Oh Joy. I'm downloading it now. Have a virtual beer on me until I can get to Austin and buy you all a real one. /s/ Bob From wojciechowski_m at o2.pl Sun Jul 6 13:05:21 2008 From: wojciechowski_m at o2.pl (Marek Wojciechowski) Date: Sun, 6 Jul 2008 19:05:21 +0200 Subject: [Numpy-discussion] Numpy-discussion Digest, Vol 22, Issue 21 In-Reply-To: References: Message-ID: <200807061905.21251.wojciechowski_m@o2.pl> > Message: 4 > Date: Sun, 06 Jul 2008 13:36:24 +0900 > From: David Cournapeau > Subject: Re: [Numpy-discussion] Numpy on AIX 5.3 > To: Discussion of Numerical Python > Message-ID: <48704BC8.6070900 at ar.media.kyoto-u.ac.jp> > Content-Type: text/plain; charset=ISO-8859-1 > > Marek Wojciechowski wrote: > > Hi! > > > > I'm trying to install numpy-1.1 on AIX 5.3 but i'm getting an error: > > > > running build > > running scons > > customize UnixCCompiler > > Found executable /usr/bin/cc_r > > customize IBMFCompiler > > Found executable /usr/bin/xlf90 > > Found executable /usr/bin/xlf > > Found executable /usr/bin/xlf95 > > Creating /tmp/tmp5j_OiW/qV0MJ4_xlf.cfg > > customize IBMFCompiler > > Creating /tmp/tmp5j_OiW/-LWcxB_xlf.cfg > > customize UnixCCompiler > > customize UnixCCompiler using scons > > Traceback (most recent call last): > > File "setup.py", line 96, in > > setup_package() > > File "setup.py", line 89, in setup_package > > configuration=configuration ) > > File "/home/marek/tmp/numpy-1.1.0/numpy/distutils/core.py", line 184, > > in setup > > return old_setup(**new_attr) > > File "/home/marek/apython/lib/python2.5/distutils/core.py", line 151, > > in setup > > dist.run_commands() > > File "/home/marek/apython/lib/python2.5/distutils/dist.py", line 974, > > in run_commands > > self.run_command(cmd) > > File "/home/marek/apython/lib/python2.5/distutils/dist.py", line 994, > > in run_command > > cmd_obj.run() > > File "/home/marek/tmp/numpy-1.1.0/numpy/distutils/command/build.py", > > line 38, in run > > self.run_command('scons') > > File "/home/marek/apython/lib/python2.5/distutils/cmd.py", line 333, in > > run_command > > self.distribution.run_command(command) > > File "/home/marek/apython/lib/python2.5/distutils/dist.py", line 993, > > in run_command > > cmd_obj.ensure_finalized() > > File "/home/marek/apython/lib/python2.5/distutils/cmd.py", line 117, in > > ensure_finalized > > self.finalize_options() > > File "/home/marek/tmp/numpy-1.1.0/numpy/distutils/command/scons.py", > > line 289, in finalize_options > > self.cxxcompiler = cxxcompiler.cxx_compiler() > > File "/home/marek/tmp/numpy-1.1.0/numpy/distutils/ccompiler.py", line > > 303, in CCompiler_cxx_compiler > > + cxx.linker_so[2:] > > TypeError: can only concatenate list (not "str") to list > > Just by reading at the code, the line > > [cxx.linker_so[0]] + cxx.compiler_cxx[0] + cxx.linker_so[2:] > > Cannot work unless cxx.compiler_cxx is a nested list. Since AIX is not > that common, it is well possible that this mistake was hidden for a long > time. 
So I would first try something like: > > cxx.linker_so = [cxx.linker_so[0], cxx.compiler_cxx[0]] +cxx.linker_so[2:] > Yeah, this pushed building further. However there's another bug: creating build/temp.aix-5.3-2.5/build creating build/temp.aix-5.3-2.5/build/src.aix-5.3-2.5 creating build/temp.aix-5.3-2.5/build/src.aix-5.3-2.5/numpy creating build/temp.aix-5.3-2.5/build/src.aix-5.3-2.5/numpy/core creating build/temp.aix-5.3-2.5/build/src.aix-5.3-2.5/numpy/core/src compile options: '-Ibuild/src.aix-5.3-2.5/numpy/core/src -Inumpy/core/include -Ibuild/src.aix-5.3-2.5/numpy/core -Inumpy/core/src -Inumpy/core/include -I/home/marek/apython/include/python2.5 -I/home/marek/apython/include/python2.5 -c' cc_r: build/src.aix-5.3-2.5/numpy/core/src/umathmodule.c "numpy/core/src/umathmodule.c.src", line 1518.29: 1506-046 (S) Syntax error. "numpy/core/src/umathmodule.c.src", line 1518.30: 1506-046 (S) Syntax error. "numpy/core/src/umathmodule.c.src", line 1518.28: 1506-046 (S) Syntax error. "numpy/core/src/umathmodule.c.src", line 1518.29: 1506-046 (S) Syntax error. "numpy/core/src/umathmodule.c.src", line 1518.33: 1506-046 (S) Syntax error. "numpy/core/src/umathmodule.c.src", line 1518.30: 1506-046 (S) Syntax error. "numpy/core/src/umathmodule.c.src", line 1518.31: 1506-046 (S) Syntax error. "numpy/core/src/umathmodule.c.src", line 1518.35: 1506-046 (S) Syntax error. "numpy/core/src/umathmodule.c.src", line 1518.29: 1506-046 (S) Syntax error. "numpy/core/src/umathmodule.c.src", line 1518.30: 1506-046 (S) Syntax error. "numpy/core/src/umathmodule.c.src", line 1518.28: 1506-046 (S) Syntax error. "numpy/core/src/umathmodule.c.src", line 1518.29: 1506-046 (S) Syntax error. "numpy/core/src/umathmodule.c.src", line 1518.33: 1506-046 (S) Syntax error. "numpy/core/src/umathmodule.c.src", line 1518.30: 1506-046 (S) Syntax error. "numpy/core/src/umathmodule.c.src", line 1518.31: 1506-046 (S) Syntax error. "numpy/core/src/umathmodule.c.src", line 1518.35: 1506-046 (S) Syntax error. error: Command "cc_r -DNDEBUG -O -Ibuild/src.aix-5.3-2.5/numpy/core/src -Inumpy/core/include -Ibuild/src.aix-5.3-2.5/numpy/core -Inumpy/core/src -Inumpy/core/include -I/home/marek/apython/include/python2.5 -I/home/marek/apython/include/python2.5 -c build/src.aix-5.3-2.5/numpy/core/src/umathmodule.c -o build/temp.aix-5.3-2.5/build/src.aix-5.3-2.5/numpy/core/src/umathmodule.o" failed with exit status 1 The solution is to change line 1518 of umathmodule.c from: *((@typ@ *)op) += 0; // clear sign-bit to *((@typ@ *)op) += 0; /* clear sign-bit */ Generally IBM C compiler doesn't like // style commenting. I found similar problems also in scipy. Now numpy-1.1.0 compiles on AIX 5.3 (with ActivePython 2.5). Greetings, Marek -- Marek Wojciechowski From charlesr.harris at gmail.com Sun Jul 6 13:13:49 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 6 Jul 2008 11:13:49 -0600 Subject: [Numpy-discussion] Numpy-discussion Digest, Vol 22, Issue 21 In-Reply-To: <200807061905.21251.wojciechowski_m@o2.pl> References: <200807061905.21251.wojciechowski_m@o2.pl> Message-ID: On Sun, Jul 6, 2008 at 11:05 AM, Marek Wojciechowski wrote: > > Message: 4 > > Date: Sun, 06 Jul 2008 13:36:24 +0900 > > From: David Cournapeau > > Subject: Re: [Numpy-discussion] Numpy on AIX 5.3 > > To: Discussion of Numerical Python > > Message-ID: <48704BC8.6070900 at ar.media.kyoto-u.ac.jp> > > Content-Type: text/plain; charset=ISO-8859-1 > > > > Marek Wojciechowski wrote: > > > Hi! 
> > > > > > I'm trying to install numpy-1.1 on AIX 5.3 but i'm getting an error: > > > > > > running build > > > running scons > > > customize UnixCCompiler > > > Found executable /usr/bin/cc_r > > > customize IBMFCompiler > > > Found executable /usr/bin/xlf90 > > > Found executable /usr/bin/xlf > > > Found executable /usr/bin/xlf95 > > > Creating /tmp/tmp5j_OiW/qV0MJ4_xlf.cfg > > > customize IBMFCompiler > > > Creating /tmp/tmp5j_OiW/-LWcxB_xlf.cfg > > > customize UnixCCompiler > > > customize UnixCCompiler using scons > > > Traceback (most recent call last): > > > File "setup.py", line 96, in > > > setup_package() > > > File "setup.py", line 89, in setup_package > > > configuration=configuration ) > > > File "/home/marek/tmp/numpy-1.1.0/numpy/distutils/core.py", line 184, > > > in setup > > > return old_setup(**new_attr) > > > File "/home/marek/apython/lib/python2.5/distutils/core.py", line 151, > > > in setup > > > dist.run_commands() > > > File "/home/marek/apython/lib/python2.5/distutils/dist.py", line 974, > > > in run_commands > > > self.run_command(cmd) > > > File "/home/marek/apython/lib/python2.5/distutils/dist.py", line 994, > > > in run_command > > > cmd_obj.run() > > > File "/home/marek/tmp/numpy-1.1.0/numpy/distutils/command/build.py", > > > line 38, in run > > > self.run_command('scons') > > > File "/home/marek/apython/lib/python2.5/distutils/cmd.py", line 333, > in > > > run_command > > > self.distribution.run_command(command) > > > File "/home/marek/apython/lib/python2.5/distutils/dist.py", line 993, > > > in run_command > > > cmd_obj.ensure_finalized() > > > File "/home/marek/apython/lib/python2.5/distutils/cmd.py", line 117, > in > > > ensure_finalized > > > self.finalize_options() > > > File "/home/marek/tmp/numpy-1.1.0/numpy/distutils/command/scons.py", > > > line 289, in finalize_options > > > self.cxxcompiler = cxxcompiler.cxx_compiler() > > > File "/home/marek/tmp/numpy-1.1.0/numpy/distutils/ccompiler.py", line > > > 303, in CCompiler_cxx_compiler > > > + cxx.linker_so[2:] > > > TypeError: can only concatenate list (not "str") to list > > > > Just by reading at the code, the line > > > > [cxx.linker_so[0]] + cxx.compiler_cxx[0] + cxx.linker_so[2:] > > > > Cannot work unless cxx.compiler_cxx is a nested list. Since AIX is not > > that common, it is well possible that this mistake was hidden for a long > > time. So I would first try something like: > > > > cxx.linker_so = [cxx.linker_so[0], cxx.compiler_cxx[0]] > +cxx.linker_so[2:] > > > > Yeah, this pushed building further. However there's another bug: > > creating build/temp.aix-5.3-2.5/build > creating build/temp.aix-5.3-2.5/build/src.aix-5.3-2.5 > creating build/temp.aix-5.3-2.5/build/src.aix-5.3-2.5/numpy > creating build/temp.aix-5.3-2.5/build/src.aix-5.3-2.5/numpy/core > creating build/temp.aix-5.3-2.5/build/src.aix-5.3-2.5/numpy/core/src > compile > options: '-Ibuild/src.aix-5.3-2.5/numpy/core/src -Inumpy/core/include > -Ibuild/src.aix-5.3-2.5/numpy/core -Inumpy/core/src -Inumpy/core/include > -I/home/marek/apython/include/python2.5 > -I/home/marek/apython/include/python2.5 -c' > cc_r: build/src.aix-5.3-2.5/numpy/core/src/umathmodule.c > "numpy/core/src/umathmodule.c.src", line 1518.29: 1506-046 (S) Syntax > error. > "numpy/core/src/umathmodule.c.src", line 1518.30: 1506-046 (S) Syntax > error. > "numpy/core/src/umathmodule.c.src", line 1518.28: 1506-046 (S) Syntax > error. > "numpy/core/src/umathmodule.c.src", line 1518.29: 1506-046 (S) Syntax > error. 
> "numpy/core/src/umathmodule.c.src", line 1518.33: 1506-046 (S) Syntax > error. > "numpy/core/src/umathmodule.c.src", line 1518.30: 1506-046 (S) Syntax > error. > "numpy/core/src/umathmodule.c.src", line 1518.31: 1506-046 (S) Syntax > error. > "numpy/core/src/umathmodule.c.src", line 1518.35: 1506-046 (S) Syntax > error. > "numpy/core/src/umathmodule.c.src", line 1518.29: 1506-046 (S) Syntax > error. > "numpy/core/src/umathmodule.c.src", line 1518.30: 1506-046 (S) Syntax > error. > "numpy/core/src/umathmodule.c.src", line 1518.28: 1506-046 (S) Syntax > error. > "numpy/core/src/umathmodule.c.src", line 1518.29: 1506-046 (S) Syntax > error. > "numpy/core/src/umathmodule.c.src", line 1518.33: 1506-046 (S) Syntax > error. > "numpy/core/src/umathmodule.c.src", line 1518.30: 1506-046 (S) Syntax > error. > "numpy/core/src/umathmodule.c.src", line 1518.31: 1506-046 (S) Syntax > error. > "numpy/core/src/umathmodule.c.src", line 1518.35: 1506-046 (S) Syntax > error. > error: > Command "cc_r -DNDEBUG -O -Ibuild/src.aix-5.3-2.5/numpy/core/src > -Inumpy/core/include -Ibuild/src.aix-5.3-2.5/numpy/core -Inumpy/core/src > -Inumpy/core/include -I/home/marek/apython/include/python2.5 > -I/home/marek/apython/include/python2.5 -c > build/src.aix-5.3-2.5/numpy/core/src/umathmodule.c -o > build/temp.aix-5.3-2.5/build/src.aix-5.3-2.5/numpy/core/src/umathmodule.o" > failed with exit status 1 > > The solution is to change line 1518 of umathmodule.c from: > *((@typ@ *)op) += 0; // clear sign-bit > > to > > *((@typ@ *)op) += 0; /* clear sign-bit */ > > Generally IBM C compiler doesn't like // style commenting. I found similar > problems also in scipy. > > Now numpy-1.1.0 compiles on AIX 5.3 (with ActivePython 2.5). > This has been done in mainline, I'll backport it to the 1.1.x branch. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Sun Jul 6 17:07:31 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 6 Jul 2008 23:07:31 +0200 Subject: [Numpy-discussion] Schedule for the SciPy08 conferencez Message-ID: <20080706210731.GC25810@phare.normalesup.org> We have received a large number of excellent contributions for papers for the SciPy 2008 conference. 
The program committee has had to make a difficult selection and we are
happy to bring you a preliminary schedule:

Thursday
=========

**8:00** Registration/Breakfast

**8:55** Welcome (Travis Vaught)

**9:10** Keynote (Alex Martelli)

**10:00** State of SciPy (Travis Vaught, Jarrod Millman)

**10:40** -- Break --

**11:00** Sympy - Python library for symbolic mathematics: introduction and
applications (Ondřej Čertík)

**11:40** Interval arithmetic: Python implementation and applications
(Stefano Taschini)

**12:00** Experiences Using Scipy for Computer Vision Research (Damian Eads)

**12:20** -- Lunch --

**1:40** The new NumPy documentation framework (Stéfan van der Walt)

**2:00** Matplotlib solves the riddle of the sphinx (Michael Droettboom)

**2:40** The SciPy documentation project (Joe Harrington)

**3:00** -- Break --

**3:40** Sage: creating a viable free Python-based open source alternative
to Magma, Maple, Mathematica and Matlab (William Stein)

**4:20** Open space for lightning talks

Friday
========

**8:30** Breakfast

**9:00** Pysynphot: A Python Re-Implementation of a Legacy App in Astronomy
(Perry Greenfield)

**9:40** How the Large Synoptic Survey Telescope (LSST) is using Python
(Robert Lupton)

**10:00** Real-time Astronomical Time-series Classification and Broadcast
Pipeline (Dan Starr)

**10:20** Analysis and Visualization of Multi-Scale Astrophysical
Simulations using Python and NumPy (Matthew Turk)

**10:40** -- Break --

**11:00** Exploring network structure, dynamics, and function using
NetworkX (Aric Hagberg)

**11:40** Mayavi: Making 3D data visualization reusable (Prabhu
Ramachandran, Gaël Varoquaux)

**12:00** Finite Element Modeling of Contact and Impact Problems Using
Python (Ryan Krauss)

**12:20** -- Lunch --

**2:00** PyCircuitScape: A Tool for Landscape Ecology (Viral Shah)

**2:20** Summarizing Complexity in High Dimensional Spaces (Karl Young)

**2:40** UFuncs: A generic function mechanism in Python (Travis Oliphant)

**3:20** -- Break --

**3:40** NumPy Optimization: Manual tuning and automated approaches (Evan
Patterson)

**4:00** Converting Python functions to dynamically-compiled C (Ilan Schnell)

**4:20** unPython: Converting Python numerical programs into C (Rahul Garg)

**4:40** Implementing the Grammar of Graphics for Python (Robert Kern)

**5:00** Ask the experts session.

A more detailed booklet including the abstract text will be available soon.

We are looking forward to seeing you at Caltech,

Gaël Varoquaux, on behalf of the program committee.

--
SciPy2008 conference. Program committee
Anne Archibald, McGill University
Matthew Brett
Perry Greenfield, Space Telescope Science Institute
Charles Harris
Ryan Krauss, Southern Illinois University
Gaël Varoquaux
Stéfan van der Walt, University of Stellenbosch

From c.l.l.bartels at gmail.com Mon Jul 7 04:20:29 2008
From: c.l.l.bartels at gmail.com (Chris Bartels)
Date: Mon, 7 Jul 2008 10:20:29 +0200
Subject: [Numpy-discussion] numpy installation issues
Message-ID: <8d00cdad0807070120v55ec5b54q8ee1a0e51db3cd88@mail.gmail.com>

Hi,

I have been trying to install numpy (as a dependency of matplotlib) on
cygwin. I would like to use the native cygwin-python numpy install, as the
cygwin development environment I use is portable on a USB disk
(http://www.dam.brown.edu/people/sezer/software/cygwin/). (Extremely
useful!)

It would be even better if I can actually install numpy/scipy/matplotlib
on this portable distribution.

Unfortunately "python setup.py build" does give an error.
(I installed python with sources, so /usr/lib/python2.5/config is existing. Which seemed to be a problem with another python package when I googled for this error.) I included the build messages... Anybody any clue? (I know there exists now a windows enthought distribution but I cannot use that as it is not portable and I don't know how to combine it with cygwin. I essentially only need numpy and matplotlib. These should not be impossible to install?) Kind regards, Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: buildlog.zip Type: application/zip Size: 3990 bytes Desc: not available URL: From david at ar.media.kyoto-u.ac.jp Mon Jul 7 04:27:40 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 07 Jul 2008 17:27:40 +0900 Subject: [Numpy-discussion] numpy installation issues In-Reply-To: <8d00cdad0807070120v55ec5b54q8ee1a0e51db3cd88@mail.gmail.com> References: <8d00cdad0807070120v55ec5b54q8ee1a0e51db3cd88@mail.gmail.com> Message-ID: <4871D37C.6040208@ar.media.kyoto-u.ac.jp> Chris Bartels wrote: > Hi, > > I have been trying to install numpy (as a dependancy of matplotlib) on > cygwin. I would like to use the native cygwin-python numpy install, as > the cygwin development environment I use is portable on usb disk > (http://www.dam.brown.edu/people/sezer/software/cygwin/). (Extremely > useful!) > > It would be even better if i can actually install > numpy/scipy/matplotlib on this portable distribution. > > Unfortunately "python setup.py build" does give an error. (I installed > python with sources, so /usr/lib/python2.5/config is existing. Which > seemed to be a problem with another python package when I googled for > this error.) It looks like you did not install python correctly. I strongly recommend you to use the python available for cygwin, after having removed the python you installed from sources. Installing the python package does give you /usr/lib/python2.5/config, as can be seen on the cygwin package list. Once you correctly installed the python package from cygwin, remove the build directory in numpy sources, and start again. This should get you further, cheers, David From c.l.l.bartels at gmail.com Mon Jul 7 05:17:24 2008 From: c.l.l.bartels at gmail.com (Chris Bartels) Date: Mon, 7 Jul 2008 11:17:24 +0200 Subject: [Numpy-discussion] numpy installation issues In-Reply-To: <4871D37C.6040208@ar.media.kyoto-u.ac.jp> References: <8d00cdad0807070120v55ec5b54q8ee1a0e51db3cd88@mail.gmail.com> <4871D37C.6040208@ar.media.kyoto-u.ac.jp> Message-ID: <8d00cdad0807070217g2912ae03t1f6f6e9f6e15e957@mail.gmail.com> Hi David, Thank you for the quick reply. It looks like you did not install python correctly. I strongly recommend > you to use the python available for cygwin, after having removed the > python you installed from sources. Installing the python package does > give you /usr/lib/python2.5/config, as can be seen on the cygwin package > list. > Sorry, my message was ambiguous: I indeed did this, I installed python from the cygwin installer (in cygwin you can opt to install both binary and sources of a package. Sources are needed, I guess, if you want to compile python modules.) So, I did not compile python myself. Once you correctly installed the python package from cygwin, remove the > build directory in numpy sources, and start again. This should get you > further, I did this exactly. 
I even tried with the two different versions of python available in cygwin (2.5.2 and 2.5.1 I believe). Removing the build directory each time. I'm really at a loss here. It apparently does not find the required library, but everything is available in /usr/lib/python ? Kind regards, Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Mon Jul 7 05:16:45 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 07 Jul 2008 18:16:45 +0900 Subject: [Numpy-discussion] numpy installation issues In-Reply-To: <8d00cdad0807070217g2912ae03t1f6f6e9f6e15e957@mail.gmail.com> References: <8d00cdad0807070120v55ec5b54q8ee1a0e51db3cd88@mail.gmail.com> <4871D37C.6040208@ar.media.kyoto-u.ac.jp> <8d00cdad0807070217g2912ae03t1f6f6e9f6e15e957@mail.gmail.com> Message-ID: <4871DEFD.3050102@ar.media.kyoto-u.ac.jp> Chris Bartels wrote: > > Sorry, my message was ambiguous: I indeed did this, I installed python > from the cygwin installer Ok. Python sources are not needed to build python extensions. Only the headers and the python runtime are needed. Basically, the problem is that for some reason, the library path flags are not passed to the linker, and I thought this was because of a bad python build (from which numpy build system find those informations). Now, the most likely reason for the problem is that you have LIBPATH/LINKFLAGS in your environment. If so, unset them: they do not work as you would expect with numpy/scipy. cheers, David From c.l.l.bartels at gmail.com Mon Jul 7 08:09:09 2008 From: c.l.l.bartels at gmail.com (Chris Bartels) Date: Mon, 7 Jul 2008 14:09:09 +0200 Subject: [Numpy-discussion] numpy installation issues In-Reply-To: <4871DEFD.3050102@ar.media.kyoto-u.ac.jp> References: <8d00cdad0807070120v55ec5b54q8ee1a0e51db3cd88@mail.gmail.com> <4871D37C.6040208@ar.media.kyoto-u.ac.jp> <8d00cdad0807070217g2912ae03t1f6f6e9f6e15e957@mail.gmail.com> <4871DEFD.3050102@ar.media.kyoto-u.ac.jp> Message-ID: <8d00cdad0807070509g4dd4c6dbt43954a4cb9d104be@mail.gmail.com> Hi David, Thanks again. Unfortunately these variables are not set in the environment. (I checked with 'env'.) I also reinstalled the cygwin python package (again) and ran 'find . -name libpython* -print', this gives: /usr/bin/libpython2.5.dll /usr/lib/python2.5/config/libpython2.5.dll.a The /usr/lib/python2.5/config/ directory contains the following files: Makefile Setup Setup.config Setup.local config.c config.c.in install-sh libpython2.5.dll.a makesetup python.o But the numpy install is still unable to use these, it gives the same error. Kind regards, Chris On Mon, Jul 7, 2008 at 11:16 AM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Chris Bartels wrote: > > > > Sorry, my message was ambiguous: I indeed did this, I installed python > > from the cygwin installer > > Ok. Python sources are not needed to build python extensions. Only the > headers and the python runtime are needed. > > Basically, the problem is that for some reason, the library path flags > are not passed to the linker, and I thought this was because of a bad > python build (from which numpy build system find those informations). > Now, the most likely reason for the problem is that you have > LIBPATH/LINKFLAGS in your environment. If so, unset them: they do not > work as you would expect with numpy/scipy. 
> > cheers, > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Mon Jul 7 08:22:12 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 07 Jul 2008 21:22:12 +0900 Subject: [Numpy-discussion] numpy installation issues In-Reply-To: <8d00cdad0807070509g4dd4c6dbt43954a4cb9d104be@mail.gmail.com> References: <8d00cdad0807070120v55ec5b54q8ee1a0e51db3cd88@mail.gmail.com> <4871D37C.6040208@ar.media.kyoto-u.ac.jp> <8d00cdad0807070217g2912ae03t1f6f6e9f6e15e957@mail.gmail.com> <4871DEFD.3050102@ar.media.kyoto-u.ac.jp> <8d00cdad0807070509g4dd4c6dbt43954a4cb9d104be@mail.gmail.com> Message-ID: <48720A74.306@ar.media.kyoto-u.ac.jp> Chris Bartels wrote: > Hi David, > > Thanks again. > > Unfortunately these variables are not set in the environment. (I > checked with 'env'.) > I also reinstalled the cygwin python package (again) and ran 'find . > -name libpython* -print', this gives: > > /usr/bin/libpython2.5.dll > /usr/lib/python2.5/config/libpython2.5.dll.a > Mmh, ok, I am out of ideas by only looking at your build.log. I don't understand why the linker does not get any LIBPATH information; I will take a look by compiling it by myself, then, cheers, David From c.l.l.bartels at gmail.com Mon Jul 7 08:38:59 2008 From: c.l.l.bartels at gmail.com (Chris Bartels) Date: Mon, 7 Jul 2008 14:38:59 +0200 Subject: [Numpy-discussion] numpy installation issues In-Reply-To: <48720A74.306@ar.media.kyoto-u.ac.jp> References: <8d00cdad0807070120v55ec5b54q8ee1a0e51db3cd88@mail.gmail.com> <4871D37C.6040208@ar.media.kyoto-u.ac.jp> <8d00cdad0807070217g2912ae03t1f6f6e9f6e15e957@mail.gmail.com> <4871DEFD.3050102@ar.media.kyoto-u.ac.jp> <8d00cdad0807070509g4dd4c6dbt43954a4cb9d104be@mail.gmail.com> <48720A74.306@ar.media.kyoto-u.ac.jp> Message-ID: <8d00cdad0807070538j6c445d28uf4bdb7be8bd01a75@mail.gmail.com> Hi David, I just discovered this: http://mail.python.org/pipermail/python-bugs-list/2006-April/032817.html I'm currently trying. On Mon, Jul 7, 2008 at 2:22 PM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Chris Bartels wrote: > > Hi David, > > > > Thanks again. > > > > Unfortunately these variables are not set in the environment. (I > > checked with 'env'.) > > I also reinstalled the cygwin python package (again) and ran 'find . > > -name libpython* -print', this gives: > > > > /usr/bin/libpython2.5.dll > > /usr/lib/python2.5/config/libpython2.5.dll.a > > > > Mmh, ok, I am out of ideas by only looking at your build.log. I don't > understand why the linker does not get any LIBPATH information; I will > take a look by compiling it by myself, then, > > cheers, > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From c.l.l.bartels at gmail.com Mon Jul 7 08:55:13 2008 From: c.l.l.bartels at gmail.com (Chris Bartels) Date: Mon, 7 Jul 2008 14:55:13 +0200 Subject: [Numpy-discussion] numpy installation issues In-Reply-To: <8d00cdad0807070538j6c445d28uf4bdb7be8bd01a75@mail.gmail.com> References: <8d00cdad0807070120v55ec5b54q8ee1a0e51db3cd88@mail.gmail.com> <4871D37C.6040208@ar.media.kyoto-u.ac.jp> <8d00cdad0807070217g2912ae03t1f6f6e9f6e15e957@mail.gmail.com> <4871DEFD.3050102@ar.media.kyoto-u.ac.jp> <8d00cdad0807070509g4dd4c6dbt43954a4cb9d104be@mail.gmail.com> <48720A74.306@ar.media.kyoto-u.ac.jp> <8d00cdad0807070538j6c445d28uf4bdb7be8bd01a75@mail.gmail.com> Message-ID: <8d00cdad0807070555r1ea4e2av21adad201983389e@mail.gmail.com> I created 2 symlinks (libpython2.5.a and libpython2.5.dll.a) in the /usr/lib directory which both point to /usr/lib/python2.5/config/libpython2.5.dll.a This seems to solve the linker problem, now I just get some assembly errors :p Included is a new buildlog, it ends with: gcc: build/src.cygwin-1.5.25-i686-2.5/numpy/core/src/umathmodule.c In file included from numpy/core/src/umathmodule.c.src:2183: numpy/core/src/ufuncobject.c: In function `_extract_pyvals': numpy/core/src/ufuncobject.c:1164: warning: int format, long int arg (arg 4) numpy/core/src/ufuncobject.c:1164: warning: int format, long int arg (arg 5) /tmp/ccWY4IyY.s: Assembler messages: /tmp/ccWY4IyY.s:72160: Error: suffix or operands invalid for `fnstsw' /tmp/ccWY4IyY.s:72415: Error: suffix or operands invalid for `fnstsw' In file included from numpy/core/src/umathmodule.c.src:2183: numpy/core/src/ufuncobject.c: In function `_extract_pyvals': numpy/core/src/ufuncobject.c:1164: warning: int format, long int arg (arg 4) numpy/core/src/ufuncobject.c:1164: warning: int format, long int arg (arg 5) /tmp/ccWY4IyY.s: Assembler messages: /tmp/ccWY4IyY.s:72160: Error: suffix or operands invalid for `fnstsw' /tmp/ccWY4IyY.s:72415: Error: suffix or operands invalid for `fnstsw' error: Command "gcc -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -Ibuild/src.cygwin-1.5.25-i686-2.5/numpy/core/src -Inumpy/core/include -Ibuild/src.cygwin-1.5.25-i686-2.5/numpy/core -Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 -I/usr/include/python2.5 -c build/src.cygwin-1.5.25-i686-2.5/numpy/core/src/umathmodule.c -o build/temp.cygwin-1.5.25-i686-2.5/build/src.cygwin-1.5.25-i686-2.5/numpy/core/src/umathmodule.o" failed with exit status 1 Kind regards, Chris On Mon, Jul 7, 2008 at 2:38 PM, Chris Bartels wrote: > Hi David, > > I just discovered this: > http://mail.python.org/pipermail/python-bugs-list/2006-April/032817.html > > I'm currently trying. > > > > > On Mon, Jul 7, 2008 at 2:22 PM, David Cournapeau < > david at ar.media.kyoto-u.ac.jp> wrote: > >> Chris Bartels wrote: >> > Hi David, >> > >> > Thanks again. >> > >> > Unfortunately these variables are not set in the environment. (I >> > checked with 'env'.) >> > I also reinstalled the cygwin python package (again) and ran 'find . >> > -name libpython* -print', this gives: >> > >> > /usr/bin/libpython2.5.dll >> > /usr/lib/python2.5/config/libpython2.5.dll.a >> > >> >> Mmh, ok, I am out of ideas by only looking at your build.log. 
I don't >> understand why the linker does not get any LIBPATH information; I will >> take a look by compiling it by myself, then, >> >> cheers, >> >> David >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: buildlog2.zip Type: application/zip Size: 4268 bytes Desc: not available URL: From c.l.l.bartels at gmail.com Mon Jul 7 09:11:55 2008 From: c.l.l.bartels at gmail.com (Chris Bartels) Date: Mon, 7 Jul 2008 15:11:55 +0200 Subject: [Numpy-discussion] numpy installation issues In-Reply-To: <8d00cdad0807070555r1ea4e2av21adad201983389e@mail.gmail.com> References: <8d00cdad0807070120v55ec5b54q8ee1a0e51db3cd88@mail.gmail.com> <4871D37C.6040208@ar.media.kyoto-u.ac.jp> <8d00cdad0807070217g2912ae03t1f6f6e9f6e15e957@mail.gmail.com> <4871DEFD.3050102@ar.media.kyoto-u.ac.jp> <8d00cdad0807070509g4dd4c6dbt43954a4cb9d104be@mail.gmail.com> <48720A74.306@ar.media.kyoto-u.ac.jp> <8d00cdad0807070538j6c445d28uf4bdb7be8bd01a75@mail.gmail.com> <8d00cdad0807070555r1ea4e2av21adad201983389e@mail.gmail.com> Message-ID: <8d00cdad0807070611p332a12e1w18a784c69ab20ede@mail.gmail.com> Hi David (and others) This issue is known: http://www.scipy.org/scipy/numpy/ticket/811 I think this is an issue for the numpy developers. (I don't know how to fix this easily, i can try to install an older version of binutils (if cygwin has these), but this will probably break a lot of other stuff. So that is not my preferred solution.) Kind regards, Chris On Mon, Jul 7, 2008 at 2:55 PM, Chris Bartels wrote: > I created 2 symlinks (libpython2.5.a and libpython2.5.dll.a) in the > /usr/lib directory which both point to > /usr/lib/python2.5/config/libpython2.5.dll.a > This seems to solve the linker problem, now I just get some assembly > errors :p > > Included is a new buildlog, it ends with: > > gcc: build/src.cygwin-1.5.25-i686-2.5/numpy/core/src/umathmodule.c > In file included from numpy/core/src/umathmodule.c.src:2183: > numpy/core/src/ufuncobject.c: In function `_extract_pyvals': > numpy/core/src/ufuncobject.c:1164: warning: int format, long int arg (arg > 4) > numpy/core/src/ufuncobject.c:1164: warning: int format, long int arg (arg > 5) > /tmp/ccWY4IyY.s: Assembler messages: > /tmp/ccWY4IyY.s:72160: Error: suffix or operands invalid for `fnstsw' > /tmp/ccWY4IyY.s:72415: Error: suffix or operands invalid for `fnstsw' > In file included from numpy/core/src/umathmodule.c.src:2183: > numpy/core/src/ufuncobject.c: In function `_extract_pyvals': > numpy/core/src/ufuncobject.c:1164: warning: int format, long int arg (arg > 4) > numpy/core/src/ufuncobject.c:1164: warning: int format, long int arg (arg > 5) > /tmp/ccWY4IyY.s: Assembler messages: > /tmp/ccWY4IyY.s:72160: Error: suffix or operands invalid for `fnstsw' > /tmp/ccWY4IyY.s:72415: Error: suffix or operands invalid for `fnstsw' > error: Command "gcc -fno-strict-aliasing -DNDEBUG -g -O3 -Wall > -Wstrict-prototypes -Ibuild/src.cygwin-1.5.25-i686-2.5/numpy/core/src > -Inumpy/core/include -Ibuild/src.cygwin-1.5.25-i686-2.5/numpy/core > -Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 > -I/usr/include/python2.5 -c > build/src.cygwin-1.5.25-i686-2.5/numpy/core/src/umathmodule.c -o > build/temp.cygwin-1.5.25-i686-2.5/build/src.cygwin-1.5.25-i686-2.5/numpy/core/src/umathmodule.o" > 
failed with exit status 1 > > Kind regards, > Chris > > > On Mon, Jul 7, 2008 at 2:38 PM, Chris Bartels > wrote: > >> Hi David, >> >> I just discovered this: >> http://mail.python.org/pipermail/python-bugs-list/2006-April/032817.html >> >> I'm currently trying. >> >> >> >> >> On Mon, Jul 7, 2008 at 2:22 PM, David Cournapeau < >> david at ar.media.kyoto-u.ac.jp> wrote: >> >>> Chris Bartels wrote: >>> > Hi David, >>> > >>> > Thanks again. >>> > >>> > Unfortunately these variables are not set in the environment. (I >>> > checked with 'env'.) >>> > I also reinstalled the cygwin python package (again) and ran 'find . >>> > -name libpython* -print', this gives: >>> > >>> > /usr/bin/libpython2.5.dll >>> > /usr/lib/python2.5/config/libpython2.5.dll.a >>> > >>> >>> Mmh, ok, I am out of ideas by only looking at your build.log. I don't >>> understand why the linker does not get any LIBPATH information; I will >>> take a look by compiling it by myself, then, >>> >>> cheers, >>> >>> David >>> _______________________________________________ >>> Numpy-discussion mailing list >>> Numpy-discussion at scipy.org >>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From c.l.l.bartels at gmail.com Mon Jul 7 09:34:04 2008 From: c.l.l.bartels at gmail.com (Chris Bartels) Date: Mon, 7 Jul 2008 15:34:04 +0200 Subject: [Numpy-discussion] numpy installation issues In-Reply-To: <8d00cdad0807070611p332a12e1w18a784c69ab20ede@mail.gmail.com> References: <8d00cdad0807070120v55ec5b54q8ee1a0e51db3cd88@mail.gmail.com> <4871D37C.6040208@ar.media.kyoto-u.ac.jp> <8d00cdad0807070217g2912ae03t1f6f6e9f6e15e957@mail.gmail.com> <4871DEFD.3050102@ar.media.kyoto-u.ac.jp> <8d00cdad0807070509g4dd4c6dbt43954a4cb9d104be@mail.gmail.com> <48720A74.306@ar.media.kyoto-u.ac.jp> <8d00cdad0807070538j6c445d28uf4bdb7be8bd01a75@mail.gmail.com> <8d00cdad0807070555r1ea4e2av21adad201983389e@mail.gmail.com> <8d00cdad0807070611p332a12e1w18a784c69ab20ede@mail.gmail.com> Message-ID: <8d00cdad0807070634x177b7972ge34f6305572dd4fb@mail.gmail.com> After installing the old binutils. (I hope this will not be necessary for future numpy versions, that these can run on an unmodified cygwin install) I managed to build and install. Running numpy in python now gives some errors in the unit test: Type "help", "copyright", "credits" or "license" for more information. 
>>> import numpy >>> numpy.test(all=True) Numpy is installed in /usr/lib/python2.5/site-packages/numpy Numpy version 1.1.0 Python version 2.5.1 (r251:54863, May 18 2007, 16:56:43) [GCC 3.4.4 (cygming spe cial, gdc 0.12, using dmd 0.125)] Found 18/18 tests for numpy.core.tests.test_defmatrix Found 3/3 tests for numpy.core.tests.test_errstate Found 3/3 tests for numpy.core.tests.test_memmap Found 286/286 tests for numpy.core.tests.test_multiarray Found 70/70 tests for numpy.core.tests.test_numeric Found 36/36 tests for numpy.core.tests.test_numerictypes Found 12/12 tests for numpy.core.tests.test_records Found 143/143 tests for numpy.core.tests.test_regression Found 7/7 tests for numpy.core.tests.test_scalarmath Found 2/2 tests for numpy.core.tests.test_ufunc Found 16/16 tests for numpy.core.tests.test_umath Found 63/63 tests for numpy.core.tests.test_unicode Found 4/4 tests for numpy.distutils.tests.test_fcompiler_gnu Found 5/5 tests for numpy.distutils.tests.test_misc_util Found 2/2 tests for numpy.fft.tests.test_fftpack Found 3/3 tests for numpy.fft.tests.test_helper Found 10/10 tests for numpy.lib.tests.test_arraysetops Found 1/1 tests for numpy.lib.tests.test_financial Found 53/53 tests for numpy.lib.tests.test_function_base Found 5/5 tests for numpy.lib.tests.test_getlimits Found 6/6 tests for numpy.lib.tests.test_index_tricks Found 15/15 tests for numpy.lib.tests.test_io Found 1/1 tests for numpy.lib.tests.test_machar Found 4/4 tests for numpy.lib.tests.test_polynomial Found 1/1 tests for numpy.lib.tests.test_regression Found 49/49 tests for numpy.lib.tests.test_shape_base Found 15/15 tests for numpy.lib.tests.test_twodim_base Found 43/43 tests for numpy.lib.tests.test_type_check Found 1/1 tests for numpy.lib.tests.test_ufunclike Found 24/24 tests for numpy.lib.tests.test__datasource Found 89/89 tests for numpy.linalg.tests.test_linalg Found 3/3 tests for numpy.linalg.tests.test_regression Found 94/94 tests for numpy.ma.tests.test_core Found 15/15 tests for numpy.ma.tests.test_extras Found 17/17 tests for numpy.ma.tests.test_mrecords Found 36/36 tests for numpy.ma.tests.test_old_ma Found 4/4 tests for numpy.ma.tests.test_subclassing Found 7/7 tests for numpy.tests.test_random Found 16/16 tests for numpy.testing.tests.test_utils Found 5/5 tests for numpy.tests.test_ctypeslib ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................................................................ ................................................................................ 
..............................................................................E. ... ====================================================================== ERROR: check_basic (numpy.tests.test_ctypeslib.TestLoadLibrary) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python2.5/site-packages/numpy/tests/test_ctypeslib.py", line 9, in check_basic np.core.multiarray.__file__) File "/usr/lib/python2.5/site-packages/numpy/ctypeslib.py", line 55, in load_l ibrary raise e OSError: No such file or directory ---------------------------------------------------------------------- Ran 1283 tests in 7.533s FAILED (errors=1) >>> Kind regards, Chris On Mon, Jul 7, 2008 at 3:11 PM, Chris Bartels wrote: > Hi David (and others) > > This issue is known: > http://www.scipy.org/scipy/numpy/ticket/811 > > I think this is an issue for the numpy developers. (I don't know how to fix > this easily, i can try to install an older version of binutils (if cygwin > has these), but this will probably break a lot of other stuff. So that is > not my preferred solution.) > > Kind regards, > Chris > > > > On Mon, Jul 7, 2008 at 2:55 PM, Chris Bartels > wrote: > >> I created 2 symlinks (libpython2.5.a and libpython2.5.dll.a) in the >> /usr/lib directory which both point to >> /usr/lib/python2.5/config/libpython2.5.dll.a >> This seems to solve the linker problem, now I just get some assembly >> errors :p >> >> Included is a new buildlog, it ends with: >> >> gcc: build/src.cygwin-1.5.25-i686-2.5/numpy/core/src/umathmodule.c >> In file included from numpy/core/src/umathmodule.c.src:2183: >> numpy/core/src/ufuncobject.c: In function `_extract_pyvals': >> numpy/core/src/ufuncobject.c:1164: warning: int format, long int arg (arg >> 4) >> numpy/core/src/ufuncobject.c:1164: warning: int format, long int arg (arg >> 5) >> /tmp/ccWY4IyY.s: Assembler messages: >> /tmp/ccWY4IyY.s:72160: Error: suffix or operands invalid for `fnstsw' >> /tmp/ccWY4IyY.s:72415: Error: suffix or operands invalid for `fnstsw' >> In file included from numpy/core/src/umathmodule.c.src:2183: >> numpy/core/src/ufuncobject.c: In function `_extract_pyvals': >> numpy/core/src/ufuncobject.c:1164: warning: int format, long int arg (arg >> 4) >> numpy/core/src/ufuncobject.c:1164: warning: int format, long int arg (arg >> 5) >> /tmp/ccWY4IyY.s: Assembler messages: >> /tmp/ccWY4IyY.s:72160: Error: suffix or operands invalid for `fnstsw' >> /tmp/ccWY4IyY.s:72415: Error: suffix or operands invalid for `fnstsw' >> error: Command "gcc -fno-strict-aliasing -DNDEBUG -g -O3 -Wall >> -Wstrict-prototypes -Ibuild/src.cygwin-1.5.25-i686-2.5/numpy/core/src >> -Inumpy/core/include -Ibuild/src.cygwin-1.5.25-i686-2.5/numpy/core >> -Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5 >> -I/usr/include/python2.5 -c >> build/src.cygwin-1.5.25-i686-2.5/numpy/core/src/umathmodule.c -o >> build/temp.cygwin-1.5.25-i686-2.5/build/src.cygwin-1.5.25-i686-2.5/numpy/core/src/umathmodule.o" >> failed with exit status 1 >> >> Kind regards, >> Chris >> >> >> On Mon, Jul 7, 2008 at 2:38 PM, Chris Bartels >> wrote: >> >>> Hi David, >>> >>> I just discovered this: >>> http://mail.python.org/pipermail/python-bugs-list/2006-April/032817.html >>> >>> I'm currently trying. >>> >>> >>> >>> >>> On Mon, Jul 7, 2008 at 2:22 PM, David Cournapeau < >>> david at ar.media.kyoto-u.ac.jp> wrote: >>> >>>> Chris Bartels wrote: >>>> > Hi David, >>>> > >>>> > Thanks again. 
>>>> > >
>>>> > Unfortunately these variables are not set in the environment. (I
>>>> > checked with 'env'.)
>>>> > I also reinstalled the cygwin python package (again) and ran 'find .
>>>> > -name libpython* -print', this gives:
>>>> >
>>>> > /usr/bin/libpython2.5.dll
>>>> > /usr/lib/python2.5/config/libpython2.5.dll.a
>>>> >
>>>>
>>>> Mmh, ok, I am out of ideas by only looking at your build.log. I don't
>>>> understand why the linker does not get any LIBPATH information; I will
>>>> take a look by compiling it by myself, then,
>>>>
>>>> cheers,
>>>>
>>>> David
>>>> _______________________________________________
>>>> Numpy-discussion mailing list
>>>> Numpy-discussion at scipy.org
>>>> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>>>>
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ondrej at certik.cz Mon Jul 7 10:44:54 2008
From: ondrej at certik.cz (Ondrej Certik)
Date: Mon, 7 Jul 2008 16:44:54 +0200
Subject: [Numpy-discussion] Debian: numpy not building _dotblas.so
Message-ID: <85b5c3130807070744g593be9ddr2b77124d221a6fd5@mail.gmail.com>

Hi,

we have this problem in Debian:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=489726

The problem is that numpy should not depend on atlas unconditionally,
yet it should allow it for users that have it.

I am not much of an expert in blas/lapack/atlas or its Debian packaging
(I know some people say that atlas packaging in Debian is not very
good, actually pretty bad), so I am just forwarding the question here.

The problem is with this patch:

http://projects.scipy.org/scipy/numpy/changeset/3854

and the question that we have is:

I'd like to know if the code was changed to only work with
atlas, or if it was never working. If it's the latter, then we should
use atlas.

Matthias, Tiziano, feel free to clarify this more. See the above
Debian bug for more information and background.

Thanks,
Ondrej

From cournape at gmail.com Mon Jul 7 11:31:15 2008
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 8 Jul 2008 00:31:15 +0900
Subject: [Numpy-discussion] Debian: numpy not building _dotblas.so
In-Reply-To: <85b5c3130807070744g593be9ddr2b77124d221a6fd5@mail.gmail.com>
References: <85b5c3130807070744g593be9ddr2b77124d221a6fd5@mail.gmail.com>
Message-ID: <5b8d13220807070831wb5fff17rc4e27c8503322f84@mail.gmail.com>

On Mon, Jul 7, 2008 at 11:44 PM, Ondrej Certik wrote:
> Hi,
>
> we have this problem in Debian:
>
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=489726
>
> The problem is that numpy should not depend on atlas unconditionally,
> yet it should allow it for users that have it.

Why can't numpy depend on ATLAS? That's something I don't understand.
The g77 to gfortran transition is done, right? Is it because it is not
available on all archs? In that case, wouldn't it be easier to change
the dependencies depending on the arch?

>
> I am not much of an expert in blas/lapack/atlas or its Debian packaging
> (I know some people say that atlas packaging in Debian is not very
> good, actually pretty bad)

Well, at least the package exists and works. That's actually the only
distribution I know of that has packaged a working atlas/blas/lapack. To
be fair, atlas is really difficult to package. Its very nature makes it
almost impossible to package it while keeping its advantages (speed),
because the binary is extremely dependent on the host machine, meaning
reproducible builds are practically impossible (even on the same machine).
> > The problem is with this patch: > > http://projects.scipy.org/scipy/numpy/changeset/3854 > > and the question that we have is: > > I'd like to know, if the code was changed to only work with > atlas, or if was never working. if it's the latter, then we should use > atlas _dotblas depends on cblas, not blas. IIRC, Robert said that using the cblas interface around an existing blas (for the case where ATLAS is not available) would not be effective because of change of row/column order. It may be worth checking that it is indeed not useful from a speed point of view. Note also that replacing blas/lapack by the ATLAS blas/lapack, as already done on debian, makes quite a big difference already. IOW, if you install numpy, and after that atlas, then thanks to hwcap, the os loader will pick up blas/lapack in /usr/lib/sse2 (for arch with sse2) instead of /usr/lib, and the sse2 ones are the atlas ones. It won't work for _dotblas, but will work for linalg. cheers, David From robert.kern at gmail.com Mon Jul 7 12:24:22 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 7 Jul 2008 11:24:22 -0500 Subject: [Numpy-discussion] Debian: numpy not building _dotblas.so In-Reply-To: <5b8d13220807070831wb5fff17rc4e27c8503322f84@mail.gmail.com> References: <85b5c3130807070744g593be9ddr2b77124d221a6fd5@mail.gmail.com> <5b8d13220807070831wb5fff17rc4e27c8503322f84@mail.gmail.com> Message-ID: <3d375d730807070924u355ac351j44c25862556ac57d@mail.gmail.com> On Mon, Jul 7, 2008 at 10:31, David Cournapeau wrote: > On Mon, Jul 7, 2008 at 11:44 PM, Ondrej Certik wrote: >> The problem is with this patch: >> >> http://projects.scipy.org/scipy/numpy/changeset/3854 >> >> and the question that we have is: >> >> I'd like to know, if the code was changed to only work with >> atlas, or if was never working. if it's the latter, then we should use >> atlas > > _dotblas depends on cblas, not blas. IIRC, Robert said that using the > cblas interface around an existing blas (for the case where ATLAS is > not available) would not be effective because of change of row/column > order. It may be worth checking that it is indeed not useful from a > speed point of view. I was probably talking out of my ass at the time. Fortran BLAS does have flags to handle any combination of transposition. The cblas interface just picks the right ones. We probably should relax our assumption that only ATLAS provides a cblas interface. It's not as trivial as just reverting that changeset, though. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From kwgoodman at gmail.com Mon Jul 7 12:30:08 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 7 Jul 2008 09:30:08 -0700 Subject: [Numpy-discussion] array_equal returns True and 0 Message-ID: I'm writing the doc string for array_equal. From the existing one-line doc string I expect array_equal to return True or False. But I get this: >> np.array_equal([1,2], [1,2]) True >> np.array_equal([1,2], [1,2,3]) 0 >> np.array_equal(np.array([1,2]), np.array([1,2,3])) 0 >> np.__version__ '1.1.0' From kwgoodman at gmail.com Mon Jul 7 12:40:18 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 7 Jul 2008 09:40:18 -0700 Subject: [Numpy-discussion] array_equal returns True and 0 In-Reply-To: References: Message-ID: On Mon, Jul 7, 2008 at 9:30 AM, Keith Goodman wrote: > I'm writing the doc string for array_equal. 
From the existing one-line
> doc string I expect array_equal to return True or False. But I get
> this:
>
>>>> np.array_equal([1,2], [1,2])
> True
>>>> np.array_equal([1,2], [1,2,3])
> 0
>>>> np.array_equal(np.array([1,2]), np.array([1,2,3]))
> 0
>>>> np.__version__
> '1.1.0'

Oh, it's written in python:

def array_equal(a1, a2):
    """Returns True if a1 and a2 have identical shapes
    and all elements equal and False otherwise.
    """
    try:
        a1, a2 = asarray(a1), asarray(a2)
    except:
        return 0
    if a1.shape != a2.shape:
        return 0
    return logical_and.reduce(equal(a1,a2).ravel())

Could someone change the 0's to False? Otherwise it will return True,
False or 0.

>> np.array_equal([3,2], [1,2])
False

From kwgoodman at gmail.com Mon Jul 7 12:47:58 2008
From: kwgoodman at gmail.com (Keith Goodman)
Date: Mon, 7 Jul 2008 09:47:58 -0700
Subject: [Numpy-discussion] array_equal returns True and 0
In-Reply-To: 
References: 
Message-ID: 

On Mon, Jul 7, 2008 at 9:40 AM, Keith Goodman wrote:
> On Mon, Jul 7, 2008 at 9:30 AM, Keith Goodman wrote:
>> I'm writing the doc string for array_equal. From the existing one-line
>> doc string I expect array_equal to return True or False. But I get
>> this:
>>
>>>> np.array_equal([1,2], [1,2])
>> True
>>>> np.array_equal([1,2], [1,2,3])
>> 0
>>>> np.array_equal(np.array([1,2]), np.array([1,2,3]))
>> 0
>>>> np.__version__
>> '1.1.0'
>
> Oh, it's written in python:
>
> def array_equal(a1, a2):
>     """Returns True if a1 and a2 have identical shapes
>     and all elements equal and False otherwise.
>     """
>     try:
>         a1, a2 = asarray(a1), asarray(a2)
>     except:
>         return 0
>     if a1.shape != a2.shape:
>         return 0
>     return logical_and.reduce(equal(a1,a2).ravel())
>
> Could someone change the 0's to False? Otherwise it will return True,
> False or 0.
>
>>> np.array_equal([3,2], [1,2])
> False

Sorry for all the messages. array_equiv also returns 0.

From robert.kern at gmail.com Mon Jul 7 12:48:08 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 7 Jul 2008 11:48:08 -0500
Subject: [Numpy-discussion] array_equal returns True and 0
In-Reply-To: 
References: 
Message-ID: <3d375d730807070948qfe1d5d8yf8a94ab6ba80d7c2@mail.gmail.com>

On Mon, Jul 7, 2008 at 11:30, Keith Goodman wrote:
> I'm writing the doc string for array_equal. From the existing one-line
> doc string I expect array_equal to return True or False. But I get
> this:
>
>>> np.array_equal([1,2], [1,2])
> True
>>> np.array_equal([1,2], [1,2,3])
> 0
>>> np.array_equal(np.array([1,2]), np.array([1,2,3]))
> 0
>>> np.__version__
> '1.1.0'

Fixed on the trunk. array_equal() and array_equiv() now consistently
return either True or False exactly.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth." -- Umberto Eco

From timmichelsen at gmx-topmail.de Mon Jul 7 16:08:36 2008
From: timmichelsen at gmx-topmail.de (Tim Michelsen)
Date: Mon, 07 Jul 2008 22:08:36 +0200
Subject: [Numpy-discussion] removing and replacing certain values in arrays
Message-ID: 

Hello,
how do I remove all rows (or columns) from an array which contain a
certain value or string?

Something like:
array.delete_row_from_array_which_contains('March')
array.delete_row_from_array_which_contains(-333)

Is there also a possibility to replace all occurrences of a certain value
in an array with nothing or with another value?

I am not looking to mask the values in question in this case.

I would appreciate any help.
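(For illustration, a minimal sketch of the boolean-indexing approach
suggested in the replies below -- it assumes a 2D numeric array and numpy
imported as np:)

import numpy as np

a = np.array([[1, 2, 3],
              [4, -333, 6],
              [7, 8, 9]])

# Build a boolean mask that is True for the rows free of the unwanted
# value, then pull those rows out into a new array (no in-place deletion).
mask = (a != -333).all(axis=1)
cleaned = a[mask]
# cleaned == array([[1, 2, 3],
#                   [7, 8, 9]])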
Kind regards, Timmie From timmichelsen at gmx-topmail.de Mon Jul 7 16:14:33 2008 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Mon, 07 Jul 2008 22:14:33 +0200 Subject: [Numpy-discussion] Tip: Backporting the latest numpy release in Ubuntu Hardy (8.04) Message-ID: Hello, due to fixed release cycles Ubuntu 8.04 still lacks the latest numpy 1.1. I succeeded backporting the package python-numpy from the upcoming release Intrepid 8.10 with the help of the automatic tool Prevu. Instructions can be found here: * https://wiki.ubuntu.com/Prevu * http://ubuntuforums.org/showthread.php?t=268687 Information on the package: http://packages.ubuntu.com/search?keywords=python-numpy I can encourage other Ubuntu users to do the same because it is not difficult. Just follow the instructions on the wiki. Regards, Timmie From pgmdevlist at gmail.com Mon Jul 7 16:25:19 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 7 Jul 2008 16:25:19 -0400 Subject: [Numpy-discussion] removing and replacing certain values in arrays In-Reply-To: References: Message-ID: <200807071625.22570.pgmdevlist@gmail.com> On Monday 07 July 2008 16:08:36 Tim Michelsen wrote: > Hello, > how do I remove all rows (or column) from an array which contain a > certain value or sting? Timmie, You could try a combination of masking the values you want to discard, followed by ma.compress_rows/cols, provided your array is 2D. For example: >>> import numpy.ma as ma >>> x=ma.array([[1,2,3,4,5]]) >>> x[x==3] = ma.masked >>> ma.compress_cols(x) array([[1, 2, 4, 5]]) From robert.kern at gmail.com Mon Jul 7 16:38:33 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 7 Jul 2008 15:38:33 -0500 Subject: [Numpy-discussion] removing and replacing certain values in arrays In-Reply-To: References: Message-ID: <3d375d730807071338p2a6c97e8vd7a39aa820c7b743@mail.gmail.com> On Mon, Jul 7, 2008 at 15:08, Tim Michelsen wrote: > Hello, > how do I remove all rows (or column) from an array which contain a > certain value or sting? Modification in-place is not possible. You will have to make a boolean mask, then use boolean indexing to pull out a new array with the desired values removed. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From alan.mcintyre at gmail.com Mon Jul 7 23:44:32 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Mon, 7 Jul 2008 23:44:32 -0400 Subject: [Numpy-discussion] chararray behavior Message-ID: <1d36917a0807072044n528d1f41o4bcde8a0389fa388@mail.gmail.com> Since chararray doesn't currently have any tests, I'm writing some, and I ran across a couple of things that didn't make sense to me: 1. The code for __mul__ is exactly the same as that for __rmul__; is there any reason __rmul__ shouldn't just call __mul__? 1.5. __radd__ seems like it doesn't do anything fundamentally different from __add__, is there a reason to have a separate implementation of __radd__? 2. The behavior of __mul__ seems odd: >>> Q=np.chararray((2,2),itemsize=1,buffer='abcd') >>> Q chararray([['a', 'b'], ['c', 'd']], dtype='|S1') >>> Q*3 chararray([['aaa', 'bbb'], ['ccc', 'ddd']], dtype='|S4') >>> Q*4 chararray([['aaaa', 'bbbb'], ['cccc', 'dddd']], dtype='|S4') >>> Q*5 chararray([['aaaa', 'bbbb'], ['cccc', 'dddd']], dtype='|S4') Is it supposed to work this way? 
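(A quick sketch reproducing the cap shown above: the result dtype stays
'|S4' no matter how large the multiplier, so repetitions beyond four
characters are silently dropped. This only restates the observed behavior;
it does not explain it:)

import numpy as np

Q = np.chararray((2, 2), itemsize=1, buffer='abcd')
print (Q * 3).dtype   # |S4
print (Q * 9).dtype   # still |S4 -- same values as Q * 4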
Thanks,
Alan

From stefan at sun.ac.za Tue Jul 8 04:34:11 2008
From: stefan at sun.ac.za (Stéfan van der Walt)
Date: Tue, 8 Jul 2008 10:34:11 +0200
Subject: [Numpy-discussion] Patch for `numpy.doc`
Message-ID: <9457e7c80807080134m616d7422id7572606ea2fa97b@mail.gmail.com>

Hi all,

Please review http://codereview.appspot.com/2485 which adds `numpy.doc`
as a way of documenting topics such as indexing and broadcasting.

The corresponding trac ticket is http://scipy.org/scipy/numpy/ticket/846

Regards
Stéfan

From michael at araneidae.co.uk Tue Jul 8 05:35:40 2008
From: michael at araneidae.co.uk (Michael Abbott)
Date: Tue, 8 Jul 2008 09:35:40 +0000 (GMT)
Subject: [Numpy-discussion] Another reference count leak: ticket #848
Message-ID: <20080708093048.G5947@saturn.araneidae.co.uk>

The attached patch fixes another reference count leak in the use of
PyArray_DescrFromType.

Could I ask that both this patch and my earlier one (ticket #843) be
applied to subversion. Thank you.

Definitely not enjoying this low level code.

commit 80e1aca1725dd4cd8e091126cf515c39ac3a33ff
Author: Michael Abbott
Date: Tue Jul 8 10:10:59 2008 +0100

    Another reference leak using PyArray_DescrFromType

    This change fixes two issues: a spurious ADDREF on a typecode returned
    from PyArray_DescrFromType and a return path with no DECREF.

diff --git a/numpy/core/src/scalartypes.inc.src b/numpy/core/src/scalartypes.inc.src
index 3feefc0..772cf94 100644
--- a/numpy/core/src/scalartypes.inc.src
+++ b/numpy/core/src/scalartypes.inc.src
@@ -1886,7 +1886,6 @@ static PyObject *
     if (!PyArg_ParseTuple(args, "|O", &obj)) return NULL;
     typecode = PyArray_DescrFromType(PyArray_@TYPE@);
-    Py_INCREF(typecode);
     if (obj == NULL) {
 #if @default@ == 0
         char *mem;
@@ -1904,7 +1903,10 @@ static PyObject *
     }
     arr = PyArray_FromAny(obj, typecode, 0, 0, FORCECAST, NULL);
-    if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) return arr;
+    if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) {
+        Py_XDECREF(typecode);
+        return arr;
+    }
     robj = PyArray_Return((PyArrayObject *)arr);
 finish:

From opossumnano at gmail.com Tue Jul 8 05:48:51 2008
From: opossumnano at gmail.com (Tiziano Zito)
Date: Tue, 8 Jul 2008 11:48:51 +0200
Subject: [Numpy-discussion] Debian: numpy not building _dotblas.so
In-Reply-To: <3d375d730807070924u355ac351j44c25862556ac57d@mail.gmail.com>
References: <85b5c3130807070744g593be9ddr2b77124d221a6fd5@mail.gmail.com>
	<5b8d13220807070831wb5fff17rc4e27c8503322f84@mail.gmail.com>
	<3d375d730807070924u355ac351j44c25862556ac57d@mail.gmail.com>
Message-ID: 

Hi numpy-devs, I was the one reporting the original bug about missing ATLAS
support in the debian lenny python-numpy package. AFAICT the source
python-numpy package in etch (numpy version 1.0.1) does not require atlas
to build _dotblas.c, only lapack is needed. If you install the resulting
binary package on a system where ATLAS is present, ATLAS libraries are
used instead of plain lapack. So basically it was already working before
the check for ATLAS was introduced into the numpy building system. Why
should ATLAS now be required?

> It's not as trivial as just reverting that changeset, though.

why is that? I mean, it was *working* before...
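(For reference, a minimal way to check whether the accelerated dot path got
built -- numpy.core._dotblas only exists when a CBLAS was found at build
time; a sketch, to be run under each installation being compared:)

try:
    import numpy.core._dotblas
    # ATLAS/CBLAS was available when this numpy was built
    print "dot is accelerated: _dotblas is present"
except ImportError:
    print "plain dot: numpy was built without a CBLAS, no _dotblas"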
thank you,
tiziano

From charlesr.harris at gmail.com Tue Jul 8 08:30:42 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 8 Jul 2008 06:30:42 -0600
Subject: [Numpy-discussion] Another reference count leak: ticket #848
In-Reply-To: <20080708093048.G5947@saturn.araneidae.co.uk>
References: <20080708093048.G5947@saturn.araneidae.co.uk>
Message-ID: 

On Tue, Jul 8, 2008 at 3:35 AM, Michael Abbott wrote:

> The attached patch fixes another reference count leak in the use of
> PyArray_DescrFromType.
>
> Could I ask that both this patch and my earlier one (ticket #843) be
> applied to subversion. Thank you.
>

I'll take a look at them.

>
> Definitely not enjoying this low level code.
>
>
It can leave one a bit boggled, no?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ondrej at certik.cz Tue Jul 8 09:06:00 2008
From: ondrej at certik.cz (Ondrej Certik)
Date: Tue, 8 Jul 2008 15:06:00 +0200
Subject: [Numpy-discussion] Debian: numpy not building _dotblas.so
In-Reply-To: 
References: <85b5c3130807070744g593be9ddr2b77124d221a6fd5@mail.gmail.com>
	<5b8d13220807070831wb5fff17rc4e27c8503322f84@mail.gmail.com>
	<3d375d730807070924u355ac351j44c25862556ac57d@mail.gmail.com>
Message-ID: <85b5c3130807080606o56cb53f8g69d25e638fe5cb0b@mail.gmail.com>

On Tue, Jul 8, 2008 at 11:48 AM, Tiziano Zito wrote:
> Hi numpy-devs, I was the one reporting the original bug about missing ATLAS
> support in the debian lenny python-numpy package. AFAICT the source
> python-numpy package in etch (numpy version 1.0.1) does not require
> atlas to build
> _dotblas.c, only lapack is needed. If you install the resulting binary
> package on a
> system where ATLAS is present, ATLAS libraries are used instead of plain lapack.
> So basically it was already working before the check for ATLAS was
> introduced into
> the numpy building system. Why should ATLAS now be required?
>
>> It's not as trivial as just reverting that changeset, though.
> why is that? I mean, it was *working* before...

So just removing the two lines from numpy seems to fix the problem in
Debian. So far all tests seem to run both on i386 and amd64, both with
and without atlas packages installed. And it is indeed faster with the
atlas packages installed, yet it doesn't need them to build. I think
that's what we want, no?

Ondrej

From wojciechowski_m at o2.pl Tue Jul 8 11:08:48 2008
From: wojciechowski_m at o2.pl (Marek Wojciechowski)
Date: Tue, 8 Jul 2008 17:08:48 +0200
Subject: [Numpy-discussion] Numpy on AIX 5.3
In-Reply-To: 
References: 
Message-ID: <200807081708.48781.wojciechowski_m@o2.pl>

On Monday, 7 July 2008, numpy-discussion-request at scipy.org wrote:
> > > >   File "/home/marek/tmp/numpy-1.1.0/numpy/distutils/ccompiler.py",
> > > > line 303, in CCompiler_cxx_compiler
> > > >     + cxx.linker_so[2:]
> > > > TypeError: can only concatenate list (not "str") to list
> > >
> > > Just by reading at the code, the line
> > >
> > > [cxx.linker_so[0]] + cxx.compiler_cxx[0] + cxx.linker_so[2:]
> > >
> > > Cannot work unless cxx.compiler_cxx is a nested list. Since AIX is not
> > > that common, it is well possible that this mistake was hidden for a
> > > long time. So I would first try something like:
> > >
> > > cxx.linker_so = [cxx.linker_so[0], cxx.compiler_cxx[0]]
> > +cxx.linker_so[2:]

Please also apply the above bugfix to trunk and numpy-1.1, i.e.
change

cxx.linker_so = [cxx.linker_so[0]] + cxx.compiler_cxx[0] + cxx.linker_so[2:]

to

cxx.linker_so = [cxx.linker_so[0], cxx.compiler_cxx[0]] + cxx.linker_so[2:]

in line 303 of ccompiler.py in distutils.

Greetings,
--
Marek Wojciechowski

From charlesr.harris at gmail.com Tue Jul 8 11:28:47 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 8 Jul 2008 09:28:47 -0600
Subject: [Numpy-discussion] Schedule for 1.1.1
Message-ID: 

I think we should try to get a quick bug fix version out by the end of the
month. What do others think?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From stefan at sun.ac.za Tue Jul 8 12:19:22 2008
From: stefan at sun.ac.za (Stéfan van der Walt)
Date: Tue, 8 Jul 2008 18:19:22 +0200
Subject: [Numpy-discussion] Schedule for 1.1.1
In-Reply-To: 
References: 
Message-ID: <9457e7c80807080919x2bc54cb9v935b4f2c7fa3b3cd@mail.gmail.com>

2008/7/8 Charles R Harris :
> I think we should try to get a quick bug fix version out by the end of the
> month. What do others think?

That was the plan. We wanted a release before the conference -- early
enough so that Enthought could push out an EPD release. We'd also
like to include the latest docstrings to benefit the tutorials.

Cheers
Stéfan

From stefan at sun.ac.za Tue Jul 8 12:22:28 2008
From: stefan at sun.ac.za (Stéfan van der Walt)
Date: Tue, 8 Jul 2008 18:22:28 +0200
Subject: [Numpy-discussion] Schedule for 1.1.1
In-Reply-To: <9457e7c80807080919x2bc54cb9v935b4f2c7fa3b3cd@mail.gmail.com>
References: <9457e7c80807080919x2bc54cb9v935b4f2c7fa3b3cd@mail.gmail.com>
Message-ID: <9457e7c80807080922y4720846fj4a3bbb4e65eb64aa@mail.gmail.com>

2008/7/8 Stéfan van der Walt :
> 2008/7/8 Charles R Harris :
>> I think we should try to get a quick bug fix version out by the end of the
>> month. What do others think?
>
> That was the plan. We wanted a release before the conference -- early
> enough so that Enthought could push out an EPD release. We'd also
> like to include the latest docstrings to benefit the tutorials.

Just to be clear, I don't think Enthought made any commitment to make
an EPD release. It is more of a wish-list item, since so many people
depend on the EPD for a fully-working tool-suite. We certainly need
to get a release out with the latest docstrings, though.

Regards
Stéfan

From charlesr.harris at gmail.com Tue Jul 8 12:23:36 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 8 Jul 2008 10:23:36 -0600
Subject: [Numpy-discussion] Schedule for 1.1.1
In-Reply-To: <9457e7c80807080919x2bc54cb9v935b4f2c7fa3b3cd@mail.gmail.com>
References: <9457e7c80807080919x2bc54cb9v935b4f2c7fa3b3cd@mail.gmail.com>
Message-ID: 

On Tue, Jul 8, 2008 at 10:19 AM, Stéfan van der Walt wrote:

> 2008/7/8 Charles R Harris :
> > I think we should try to get a quick bug fix version out by the end of the
> > month. What do others think?
>
> That was the plan. We wanted a release before the conference -- early
> enough so that Enthought could push out an EPD release. We'd also
> like to include the latest docstrings to benefit the tutorials.
>

So what is the schedule? I have 4 bugs to fix and will get to them this
weekend. Should we put together a bug action list?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From millman at berkeley.edu Tue Jul 8 12:58:18 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 8 Jul 2008 09:58:18 -0700 Subject: [Numpy-discussion] Schedule for 1.1.1 In-Reply-To: References: <9457e7c80807080919x2bc54cb9v935b4f2c7fa3b3cd@mail.gmail.com> Message-ID: On Tue, Jul 8, 2008 at 9:23 AM, Charles R Harris wrote: > So what is the schedule? I have 4 bugs to fix and will get to them this > weekend. Should we put together a bug action list? Thanks for getting this conversation going. I have been meaning to send an email for some time now. I would like to get the 1.1.1 release out on 7/31/08, which gives us three weeks. I want to get this out soon because I would like to stick to the original plan to get the 1.2.0 release out on 8/31/2008. I will send an email about 1.2 out next, so if you want to comment on 1.2 please respond to that email. Here is a schedule for 1.1.1: - 7/20/08 tag the 1.1.1rc1 release and prepare packages - 7/27/08 tag the 1.1.1 release and prepare packages - 7/31/08 announce release Of course, this is assuming that there are no issues with the rc. This release should include only bug-fixes and possible improved documentation. Also, as a reminder, the trunk is for 1.2 development; so please remember that 1.1.1 will be tagged off the 1.1.x branch: svn co http://svn.scipy.org/svn/numpy/branches/1.1.x numpy-1.1.x Please use the NumPy 1.1.1 milestone if you want to create a bug action list: http://scipy.org/scipy/numpy/milestone/1.1.1 Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From oliphant at enthought.com Tue Jul 8 13:26:25 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Tue, 08 Jul 2008 12:26:25 -0500 Subject: [Numpy-discussion] Another reference count leak: ticket #848 In-Reply-To: <20080708093048.G5947@saturn.araneidae.co.uk> References: <20080708093048.G5947@saturn.araneidae.co.uk> Message-ID: <4873A341.1030402@enthought.com> Michael Abbott wrote: > The attached patch fixes another reference count leak in the use of > PyArray_DescrFromType. > The first part of this patch is good. The second is not needed. Also, it would be useful if you could write a test case that shows what is leaking and how you determined that it is leaking. > Could I ask that both this patch and my earlier one (ticket #843) be > applied to subversion. Thank you. > > Definitely not enjoying this low level code. > What doesn't kill you makes you stronger :-) But, you are correct that reference counting is a bear. -Travis From oliphant at enthought.com Tue Jul 8 13:29:25 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Tue, 08 Jul 2008 12:29:25 -0500 Subject: [Numpy-discussion] chararray behavior In-Reply-To: <1d36917a0807072044n528d1f41o4bcde8a0389fa388@mail.gmail.com> References: <1d36917a0807072044n528d1f41o4bcde8a0389fa388@mail.gmail.com> Message-ID: <4873A3F5.4030100@enthought.com> Alan McIntyre wrote: > Since chararray doesn't currently have any tests, I'm writing some, > and I ran across a couple of things that didn't make sense to me: > > 1. The code for __mul__ is exactly the same as that for __rmul__; is > there any reason __rmul__ shouldn't just call __mul__? > Just additional function call overhead, but it's probably fine to just call __mul__. > 1.5. __radd__ seems like it doesn't do anything fundamentally > different from __add__, is there a reason to have a separate > implementation of __radd__? > Possibly. I'm not sure. > 2. 
The behavior of __mul__ seems odd: > What is odd about this? It is patterned after >>> 'a' * 3 >>> 'a' * 4 >>> 'a' * 5 for regular python strings. -Travis From charlesr.harris at gmail.com Tue Jul 8 13:43:46 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 8 Jul 2008 11:43:46 -0600 Subject: [Numpy-discussion] Another reference count leak: ticket #848 In-Reply-To: <4873A341.1030402@enthought.com> References: <20080708093048.G5947@saturn.araneidae.co.uk> <4873A341.1030402@enthought.com> Message-ID: Hi Travis, On Tue, Jul 8, 2008 at 11:26 AM, Travis E. Oliphant wrote: > Michael Abbott wrote: > > The attached patch fixes another reference count leak in the use of > > PyArray_DescrFromType. > > > The first part of this patch is good. The second is not needed. Also, > it would be useful if you could write a test case that shows what is > leaking and how you determined that it is leaking. > > Could I ask that both this patch and my earlier one (ticket #843) be > > applied to subversion. Thank you. > > > > Definitely not enjoying this low level code. > > > What doesn't kill you makes you stronger :-) But, you are correct that > reference counting is a bear. > Could you backport your fixes to 1.1.x also? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.mcintyre at gmail.com Tue Jul 8 13:53:24 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Tue, 8 Jul 2008 13:53:24 -0400 Subject: [Numpy-discussion] chararray behavior In-Reply-To: <4873A3F5.4030100@enthought.com> References: <1d36917a0807072044n528d1f41o4bcde8a0389fa388@mail.gmail.com> <4873A3F5.4030100@enthought.com> Message-ID: <1d36917a0807081053i1e6612acl2b53169c91a9498@mail.gmail.com> On Tue, Jul 8, 2008 at 1:29 PM, Travis E. Oliphant wrote: > Alan McIntyre wrote: >> Since chararray doesn't currently have any tests, I'm writing some, >> and I ran across a couple of things that didn't make sense to me: >> >> 1. The code for __mul__ is exactly the same as that for __rmul__; is >> there any reason __rmul__ shouldn't just call __mul__? >> > Just additional function call overhead, but it's probably fine to just > call __mul__. > >> 1.5. __radd__ seems like it doesn't do anything fundamentally >> different from __add__, is there a reason to have a separate >> implementation of __radd__? >> > Possibly. I'm not sure. I'll probably leave them alone; I was just curious, mostly. >> 2. The behavior of __mul__ seems odd: >> > What is odd about this? > > It is patterned after > > >>> 'a' * 3 > >>> 'a' * 4 > >>> 'a' * 5 > > for regular python strings. That's what I would have expected, but for N >= 4, Q*N is the same as Q*4. From kwgoodman at gmail.com Tue Jul 8 15:01:05 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 8 Jul 2008 12:01:05 -0700 Subject: [Numpy-discussion] alterdot and restoredot Message-ID: I don't know what to write for a doc string for alterdot and restoredot. Any ideas? From nouiz at nouiz.org Tue Jul 8 15:14:21 2008 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Tue, 8 Jul 2008 15:14:21 -0400 Subject: [Numpy-discussion] numpy with fftw Message-ID: <2d1d7fe70807081214y354c2c22y80d2343f3ea92c6e@mail.gmail.com> Hi, I want to compile numpy so that it use the optimized fftw librairy. 
In the site.cfg.example there is a section [fftw3] that I fill with:

[fftw3]
include_dirs = /usr/include
library_dirs = /usr/lib64
fftw3_libs = fftw3, fftw3f
fftw3_opt_libs = fftw3_threads, fftw3f_threads

when I compile it, there is no message that it found it, like there is
for the atlas library (the optimized BLAS library). numpy.show_config()
doesn't tell which version of fft it uses. How can I know it was well
compiled?

I tried comparing the speed of the numpy version in FC9 and the one I
built, but they have the same speed. Here is the code I used to do my
timing:

time python -c "
import numpy.fft
a = numpy.random.rand(30000)
for i in xrange(10000):
    numpy.fft.fft(a)
numpy.show_config()
"

thanks for your time

Frédéric Bastien

From robert.kern at gmail.com Tue Jul 8 15:15:22 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 8 Jul 2008 14:15:22 -0500
Subject: [Numpy-discussion] Debian: numpy not building _dotblas.so
In-Reply-To: <85b5c3130807080606o56cb53f8g69d25e638fe5cb0b@mail.gmail.com>
References: <85b5c3130807070744g593be9ddr2b77124d221a6fd5@mail.gmail.com>
	<5b8d13220807070831wb5fff17rc4e27c8503322f84@mail.gmail.com>
	<3d375d730807070924u355ac351j44c25862556ac57d@mail.gmail.com>
	<85b5c3130807080606o56cb53f8g69d25e638fe5cb0b@mail.gmail.com>
Message-ID: <3d375d730807081215j38dafa06gd44a2630ab9fb067@mail.gmail.com>

On Tue, Jul 8, 2008 at 08:06, Ondrej Certik wrote:
> On Tue, Jul 8, 2008 at 11:48 AM, Tiziano Zito wrote:
>> Hi numpy-devs, I was the one reporting the original bug about missing ATLAS
>> support in the debian lenny python-numpy package. AFAICT the source
>> python-numpy package in etch (numpy version 1.0.1) does not require
>> atlas to build
>> _dotblas.c, only lapack is needed. If you install the resulting binary
>> package on a
>> system where ATLAS is present, ATLAS libraries are used instead of plain lapack.
>> So basically it was already working before the check for ATLAS was
>> introduced into
>> the numpy building system. Why should ATLAS now be required?
>>
>>> It's not as trivial as just reverting that changeset, though.
>> why is that? I mean, it was *working* before...
>
> So just removing the two lines from numpy seems to fix the problem in
> Debian. So far all tests seem to run both on i386 and amd64, both with
> and without atlas packages installed. And it is indeed faster with the
> altas packages instaled, yet it doesn't need them to build. I think
> that's what we want, no?

Can you give me more details? Was the binary built on a machine with
an absent ATLAS? Show me the output of ldd on _dotblas.so with both
ATLAS installed and not. Can you import numpy.core._dotblas explicitly
under both?

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From robert.kern at gmail.com Tue Jul 8 15:16:49 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 8 Jul 2008 14:16:49 -0500
Subject: [Numpy-discussion] numpy with fftw
In-Reply-To: <2d1d7fe70807081214y354c2c22y80d2343f3ea92c6e@mail.gmail.com>
References: <2d1d7fe70807081214y354c2c22y80d2343f3ea92c6e@mail.gmail.com>
Message-ID: <3d375d730807081216u49cc5dd1yeda316f942b0d567@mail.gmail.com>

On Tue, Jul 8, 2008 at 14:14, Frédéric Bastien wrote:
> Hi,
>
> I want to compile numpy so that it use the optimized fftw librairy.

numpy itself does not support this. scipy does.
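(As a quick way to see what a given build actually picked up, the
show_config() helpers print the libraries recorded at build time -- a
minimal sketch; the exact sections shown depend entirely on how the
packages were built:)

import numpy
numpy.show_config()   # BLAS/LAPACK/ATLAS sections found when numpy was built

import scipy
scipy.show_config()   # a scipy built against FFTW should list an fftw section here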
> In
> the site.cfg.example there is a section [fftw3] that I fill with:
>
> [fftw3]
> include_dirs = /usr/include
> library_dirs = /usr/lib64
> fftw3_libs = fftw3, fftw3f
> fftw3_opt_libs = fftw3_threads, fftw3f_threads

This is for scipy.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From peridot.faceted at gmail.com Tue Jul 8 15:30:16 2008
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Tue, 8 Jul 2008 15:30:16 -0400
Subject: [Numpy-discussion] chararray behavior
In-Reply-To: <1d36917a0807081053i1e6612acl2b53169c91a9498@mail.gmail.com>
References: <1d36917a0807072044n528d1f41o4bcde8a0389fa388@mail.gmail.com>
	<4873A3F5.4030100@enthought.com>
	<1d36917a0807081053i1e6612acl2b53169c91a9498@mail.gmail.com>
Message-ID: 

2008/7/8 Alan McIntyre :
> On Tue, Jul 8, 2008 at 1:29 PM, Travis E. Oliphant
> wrote:
>> Alan McIntyre wrote:
>>> 2. The behavior of __mul__ seems odd:
>>>
>> What is odd about this?
>>
>> It is patterned after
>>
>> >>> 'a' * 3
>> >>> 'a' * 4
>> >>> 'a' * 5
>>
>> for regular python strings.
>
> That's what I would have expected, but for N >= 4, Q*N is the same as Q*4.

In particular, the returned type is always "string of length four",
which is very peculiar - why four? I realize that variable-length
strings are a problem (object arrays, I guess?), as is returning
arrays of varying dtypes (strings of length N), but this definitely
violates the principle of least surprise...

Anne

From nouiz at nouiz.org Tue Jul 8 15:33:02 2008
From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=)
Date: Tue, 8 Jul 2008 15:33:02 -0400
Subject: [Numpy-discussion] numpy with fftw
In-Reply-To: <3d375d730807081216u49cc5dd1yeda316f942b0d567@mail.gmail.com>
References: <2d1d7fe70807081214y354c2c22y80d2343f3ea92c6e@mail.gmail.com>
	<3d375d730807081216u49cc5dd1yeda316f942b0d567@mail.gmail.com>
Message-ID: <2d1d7fe70807081233g187f1237s98cb5413c6f0b96@mail.gmail.com>

Thanks for the information. What made me think it was possible is
that in the file site.cfg.example there is:

# Given only this section, numpy.distutils will try to figure out which version
# of FFTW you are using.
#[fftw]
#libraries = fftw3

Is this fftw section still useful?

Frédéric Bastien

On Tue, Jul 8, 2008 at 3:16 PM, Robert Kern wrote:
> On Tue, Jul 8, 2008 at 14:14, Frédéric Bastien wrote:
>> Hi,
>>
>> I want to compile numpy so that it use the optimized fftw librairy.
>
> numpy itself does not support this. scipy does.
>
>> In
>> the site.cfg.example there is a section [fftw3] that I fill with:
>>
>> [fftw3]
>> include_dirs = /usr/include
>> library_dirs = /usr/lib64
>> fftw3_libs = fftw3, fftw3f
>> fftw3_opt_libs = fftw3_threads, fftw3f_threads
>
> This is for scipy.
>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
> -- Umberto Eco
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>

From robert.kern at gmail.com Tue Jul 8 15:43:52 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 8 Jul 2008 14:43:52 -0500
Subject: [Numpy-discussion] numpy with fftw
In-Reply-To: <2d1d7fe70807081233g187f1237s98cb5413c6f0b96@mail.gmail.com>
References: <2d1d7fe70807081214y354c2c22y80d2343f3ea92c6e@mail.gmail.com>
	<3d375d730807081216u49cc5dd1yeda316f942b0d567@mail.gmail.com>
	<2d1d7fe70807081233g187f1237s98cb5413c6f0b96@mail.gmail.com>
Message-ID: <3d375d730807081243v1a1d56c3r597707a7c9194da1@mail.gmail.com>

On Tue, Jul 8, 2008 at 14:33, Frédéric Bastien wrote:
> Thanks for the information. What made me think it was possible is
> that in the file site.cfg.example there is:
>
> # Given only this section, numpy.distutils will try to figure out which version
> # of FFTW you are using.
> #[fftw]
> #libraries = fftw3
>
> Is this fftw section still useful?

Yes, for building scipy.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From ondrej at certik.cz Tue Jul 8 15:47:50 2008
From: ondrej at certik.cz (Ondrej Certik)
Date: Tue, 8 Jul 2008 21:47:50 +0200
Subject: [Numpy-discussion] Debian: numpy not building _dotblas.so
In-Reply-To: <3d375d730807081215j38dafa06gd44a2630ab9fb067@mail.gmail.com>
References: <85b5c3130807070744g593be9ddr2b77124d221a6fd5@mail.gmail.com>
	<5b8d13220807070831wb5fff17rc4e27c8503322f84@mail.gmail.com>
	<3d375d730807070924u355ac351j44c25862556ac57d@mail.gmail.com>
	<85b5c3130807080606o56cb53f8g69d25e638fe5cb0b@mail.gmail.com>
	<3d375d730807081215j38dafa06gd44a2630ab9fb067@mail.gmail.com>
Message-ID: <85b5c3130807081247p74fbdde7g4f76be89ffbde46e@mail.gmail.com>

On Tue, Jul 8, 2008 at 9:15 PM, Robert Kern wrote:
> On Tue, Jul 8, 2008 at 08:06, Ondrej Certik wrote:
>> On Tue, Jul 8, 2008 at 11:48 AM, Tiziano Zito wrote:
>>> Hi numpy-devs, I was the one reporting the original bug about missing ATLAS
>>> support in the debian lenny python-numpy package. AFAICT the source
>>> python-numpy package in etch (numpy version 1.0.1) does not require
>>> atlas to build
>>> _dotblas.c, only lapack is needed. If you install the resulting binary
>>> package on a
>>> system where ATLAS is present, ATLAS libraries are used instead of plain lapack.
>>> So basically it was already working before the check for ATLAS was
>>> introduced into
>>> the numpy building system. Why should ATLAS now be required?
>>>
>>>> It's not as trivial as just reverting that changeset, though.
>>> why is that? I mean, it was *working* before...
>>
>> So just removing the two lines from numpy seems to fix the problem in
>> Debian. So far all tests seem to run both on i386 and amd64, both with
>> and without atlas packages installed. And it is indeed faster with the
>> altas packages instaled, yet it doesn't need them to build. I think
>> that's what we want, no?
>
> Can you give me more details?

Sure. :)

> Was the binary built on a machine with
> an absent ATLAS?

Yes, the binary is always built on a machine with an absent atlas, as
the package is build-conflicting with atlas.

> Show me the output of ldd on _dotblas.so with both
> ATLAS installed and not. Can you import numpy.core._dotblas explicitly
> under both?
ATLAS installed: ondra at fuji:~/debian$ ldd /usr/lib/python2.5/site-packages/numpy/core/_dotblas.so linux-gate.so.1 => (0xb7fba000) libblas.so.3gf => /usr/lib/atlas/libblas.so.3gf (0xb7c19000) libgfortran.so.3 => /usr/lib/libgfortran.so.3 (0xb7b67000) libm.so.6 => /lib/i686/cmov/libm.so.6 (0xb7b40000) libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xb7b33000) libc.so.6 => /lib/i686/cmov/libc.so.6 (0xb79d8000) /lib/ld-linux.so.2 (0xb7fbb000) ondra at fuji:~/debian$ python Python 2.5.2 (r252:60911, Jun 25 2008, 17:58:32) [GCC 4.3.1] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy.core._dotblas >>> ATLAS not installed: ondra at fuji:~/debian$ ldd /usr/lib/python2.5/site-packages/numpy/core/_dotblas.so linux-gate.so.1 => (0xb7f2f000) libblas.so.3gf => /usr/lib/libblas.so.3gf (0xb7e82000) libgfortran.so.3 => /usr/lib/libgfortran.so.3 (0xb7dd0000) libm.so.6 => /lib/i686/cmov/libm.so.6 (0xb7da9000) libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xb7d9c000) libc.so.6 => /lib/i686/cmov/libc.so.6 (0xb7c41000) /lib/ld-linux.so.2 (0xb7f30000) ondra at fuji:~/debian$ python Python 2.5.2 (r252:60911, Jun 25 2008, 17:58:32) [GCC 4.3.1] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy.core._dotblas >>> Ondrej From robert.kern at gmail.com Tue Jul 8 16:19:13 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 8 Jul 2008 15:19:13 -0500 Subject: [Numpy-discussion] Debian: numpy not building _dotblas.so In-Reply-To: <85b5c3130807081247p74fbdde7g4f76be89ffbde46e@mail.gmail.com> References: <85b5c3130807070744g593be9ddr2b77124d221a6fd5@mail.gmail.com> <5b8d13220807070831wb5fff17rc4e27c8503322f84@mail.gmail.com> <3d375d730807070924u355ac351j44c25862556ac57d@mail.gmail.com> <85b5c3130807080606o56cb53f8g69d25e638fe5cb0b@mail.gmail.com> <3d375d730807081215j38dafa06gd44a2630ab9fb067@mail.gmail.com> <85b5c3130807081247p74fbdde7g4f76be89ffbde46e@mail.gmail.com> Message-ID: <3d375d730807081319t2abfbafale3e2bb28275c7e26@mail.gmail.com> On Tue, Jul 8, 2008 at 14:47, Ondrej Certik wrote: > On Tue, Jul 8, 2008 at 9:15 PM, Robert Kern wrote: >> On Tue, Jul 8, 2008 at 08:06, Ondrej Certik wrote: >>> On Tue, Jul 8, 2008 at 11:48 AM, Tiziano Zito wrote: >>>> Hi numpy-devs, I was the one reporting the original bug about missing ATLAS >>>> support in the debian lenny python-numpy package. AFAICT the source >>>> python-numpy package in etch (numpy version 1.0.1) does not require >>>> atlas to build >>>> _dotblas.c, only lapack is needed. If you install the resulting binary >>>> package on a >>>> system where ATLAS is present, ATLAS libraries are used instead of plain lapack. >>>> So basically it was already working before the check for ATLAS was >>>> introduced into >>>> the numpy building system. Why should ATLAS now be required? >>>> >>>>> It's not as trivial as just reverting that changeset, though. >>>> why is that? I mean, it was *working* before... >>> >>> So just removing the two lines from numpy seems to fix the problem in >>> Debian. So far all tests seem to run both on i386 and amd64, both with >>> and without atlas packages installed. And it is indeed faster with the >>> altas packages instaled, yet it doesn't need them to build. I think >>> that's what we want, no? >> >> Can you give me more details? > > Sure. :) > >> Was the binary built on a machine with >> an absent ATLAS? > > Yes, the binary is always built on a machine with an absent atlas, as > the package is build-conflicting with atlas. 
> >> Show me the output of ldd on _dotblas.so with both >> ATLAS installed and not. Can you import numpy.core._dotblas explicitly >> under both? > > ATLAS installed: > > ondra at fuji:~/debian$ ldd /usr/lib/python2.5/site-packages/numpy/core/_dotblas.so > linux-gate.so.1 => (0xb7fba000) > libblas.so.3gf => /usr/lib/atlas/libblas.so.3gf (0xb7c19000) > libgfortran.so.3 => /usr/lib/libgfortran.so.3 (0xb7b67000) > libm.so.6 => /lib/i686/cmov/libm.so.6 (0xb7b40000) > libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xb7b33000) > libc.so.6 => /lib/i686/cmov/libc.so.6 (0xb79d8000) > /lib/ld-linux.so.2 (0xb7fbb000) > ondra at fuji:~/debian$ python > Python 2.5.2 (r252:60911, Jun 25 2008, 17:58:32) > [GCC 4.3.1] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> import numpy.core._dotblas >>>> > > > ATLAS not installed: > > ondra at fuji:~/debian$ ldd /usr/lib/python2.5/site-packages/numpy/core/_dotblas.so > linux-gate.so.1 => (0xb7f2f000) > libblas.so.3gf => /usr/lib/libblas.so.3gf (0xb7e82000) > libgfortran.so.3 => /usr/lib/libgfortran.so.3 (0xb7dd0000) > libm.so.6 => /lib/i686/cmov/libm.so.6 (0xb7da9000) > libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xb7d9c000) > libc.so.6 => /lib/i686/cmov/libc.so.6 (0xb7c41000) > /lib/ld-linux.so.2 (0xb7f30000) > ondra at fuji:~/debian$ python > Python 2.5.2 (r252:60911, Jun 25 2008, 17:58:32) > [GCC 4.3.1] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> import numpy.core._dotblas >>>> Okay, it turns out that libblas on Ubuntu (and I'm guessing Debian) includes the CBLAS interface. $ nm /usr/lib/libblas.a | grep "T cblas_" 00000000 T cblas_caxpy 00000000 T cblas_ccopy ... This is specific to Debian and its derivatives. Not all libblas's have this. So I stand by my statement that just reverting the change is not acceptable. We need a real check for the CBLAS interface. In the meantime, the Debian package maintainer can patch the file to remove that check during the build for Debian systems. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ondrej at certik.cz Tue Jul 8 16:47:57 2008 From: ondrej at certik.cz (Ondrej Certik) Date: Tue, 8 Jul 2008 22:47:57 +0200 Subject: [Numpy-discussion] Debian: numpy not building _dotblas.so In-Reply-To: <3d375d730807081319t2abfbafale3e2bb28275c7e26@mail.gmail.com> References: <85b5c3130807070744g593be9ddr2b77124d221a6fd5@mail.gmail.com> <5b8d13220807070831wb5fff17rc4e27c8503322f84@mail.gmail.com> <3d375d730807070924u355ac351j44c25862556ac57d@mail.gmail.com> <85b5c3130807080606o56cb53f8g69d25e638fe5cb0b@mail.gmail.com> <3d375d730807081215j38dafa06gd44a2630ab9fb067@mail.gmail.com> <85b5c3130807081247p74fbdde7g4f76be89ffbde46e@mail.gmail.com> <3d375d730807081319t2abfbafale3e2bb28275c7e26@mail.gmail.com> Message-ID: <85b5c3130807081347u3f9c4efehdb0819130e41929e@mail.gmail.com> On Tue, Jul 8, 2008 at 10:19 PM, Robert Kern wrote: > On Tue, Jul 8, 2008 at 14:47, Ondrej Certik wrote: >> On Tue, Jul 8, 2008 at 9:15 PM, Robert Kern wrote: >>> On Tue, Jul 8, 2008 at 08:06, Ondrej Certik wrote: >>>> On Tue, Jul 8, 2008 at 11:48 AM, Tiziano Zito wrote: >>>>> Hi numpy-devs, I was the one reporting the original bug about missing ATLAS >>>>> support in the debian lenny python-numpy package. 
AFAICT the source >>>>> python-numpy package in etch (numpy version 1.0.1) does not require >>>>> atlas to build >>>>> _dotblas.c, only lapack is needed. If you install the resulting binary >>>>> package on a >>>>> system where ATLAS is present, ATLAS libraries are used instead of plain lapack. >>>>> So basically it was already working before the check for ATLAS was >>>>> introduced into >>>>> the numpy building system. Why should ATLAS now be required? >>>>> >>>>>> It's not as trivial as just reverting that changeset, though. >>>>> why is that? I mean, it was *working* before... >>>> >>>> So just removing the two lines from numpy seems to fix the problem in >>>> Debian. So far all tests seem to run both on i386 and amd64, both with >>>> and without atlas packages installed. And it is indeed faster with the >>>> altas packages instaled, yet it doesn't need them to build. I think >>>> that's what we want, no? >>> >>> Can you give me more details? >> >> Sure. :) >> >>> Was the binary built on a machine with >>> an absent ATLAS? >> >> Yes, the binary is always built on a machine with an absent atlas, as >> the package is build-conflicting with atlas. >> >>> Show me the output of ldd on _dotblas.so with both >>> ATLAS installed and not. Can you import numpy.core._dotblas explicitly >>> under both? >> >> ATLAS installed: >> >> ondra at fuji:~/debian$ ldd /usr/lib/python2.5/site-packages/numpy/core/_dotblas.so >> linux-gate.so.1 => (0xb7fba000) >> libblas.so.3gf => /usr/lib/atlas/libblas.so.3gf (0xb7c19000) >> libgfortran.so.3 => /usr/lib/libgfortran.so.3 (0xb7b67000) >> libm.so.6 => /lib/i686/cmov/libm.so.6 (0xb7b40000) >> libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xb7b33000) >> libc.so.6 => /lib/i686/cmov/libc.so.6 (0xb79d8000) >> /lib/ld-linux.so.2 (0xb7fbb000) >> ondra at fuji:~/debian$ python >> Python 2.5.2 (r252:60911, Jun 25 2008, 17:58:32) >> [GCC 4.3.1] on linux2 >> Type "help", "copyright", "credits" or "license" for more information. >>>>> import numpy.core._dotblas >>>>> >> >> >> ATLAS not installed: >> >> ondra at fuji:~/debian$ ldd /usr/lib/python2.5/site-packages/numpy/core/_dotblas.so >> linux-gate.so.1 => (0xb7f2f000) >> libblas.so.3gf => /usr/lib/libblas.so.3gf (0xb7e82000) >> libgfortran.so.3 => /usr/lib/libgfortran.so.3 (0xb7dd0000) >> libm.so.6 => /lib/i686/cmov/libm.so.6 (0xb7da9000) >> libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xb7d9c000) >> libc.so.6 => /lib/i686/cmov/libc.so.6 (0xb7c41000) >> /lib/ld-linux.so.2 (0xb7f30000) >> ondra at fuji:~/debian$ python >> Python 2.5.2 (r252:60911, Jun 25 2008, 17:58:32) >> [GCC 4.3.1] on linux2 >> Type "help", "copyright", "credits" or "license" for more information. >>>>> import numpy.core._dotblas >>>>> > > Okay, it turns out that libblas on Ubuntu (and I'm guessing Debian) > includes the CBLAS interface. > > $ nm /usr/lib/libblas.a | grep "T cblas_" > 00000000 T cblas_caxpy > 00000000 T cblas_ccopy > ... > > This is specific to Debian and its derivatives. Not all libblas's have > this. So I stand by my statement that just reverting the change is not > acceptable. We need a real check for the CBLAS interface. In the Right. > meantime, the Debian package maintainer can patch the file to remove > that check during the build for Debian systems. Yes, I just did that. Thanks for the clarification. 
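(For anyone following along, the import check Robert used above doubles
as a quick end-user test -- a minimal sketch; it relies only on the fact,
used earlier in this thread, that numpy.core._dotblas exists exactly when
the accelerated dot was built:)

try:
    import numpy.core._dotblas
    print "dot() is the CBLAS-accelerated version"
except ImportError:
    print "dot() fell back to the default implementation"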
Ondrej From michael at araneidae.co.uk Tue Jul 8 17:23:48 2008 From: michael at araneidae.co.uk (Michael Abbott) Date: Tue, 8 Jul 2008 21:23:48 +0000 (GMT) Subject: [Numpy-discussion] Another reference count leak: ticket #848 In-Reply-To: <4873A341.1030402@enthought.com> References: <20080708093048.G5947@saturn.araneidae.co.uk> <4873A341.1030402@enthought.com> Message-ID: <20080708210520.Y42061@saturn.araneidae.co.uk> On Tue, 8 Jul 2008, Travis E. Oliphant wrote: > Michael Abbott wrote: > > The attached patch fixes another reference count leak in the use of > > PyArray_DescrFromType. > > > The first part of this patch is good. The second is not needed. I don't see that. The second part of the patch addresses the case of an early return: this means that the DECREF that occurs later on in the code is bypassed, and so a reference leak will still occur if this early return case occurs. Don't forget that PyArray_DescrFromType returns an incremented reference that has to be decremented, returned or explicitly assigned -- the DECREF obligation has to be met somewhere. > Also, it would be useful if you could write a test case that shows what > is leaking and how you determined that it is leaking. Roughly r = range(n) i = 0 refs = 0 refs = sys.gettotalrefcount() for i in r: float32() print refs - sys.gettotalrefcount() in debug mode python. This isn't quite the whole story (reference counts can be annoyingly fluid), but that's the most of it. In trunk this leaks 2 refs per n, with the attached patch there remains one leak I haven't chased down yet. Is there a framework for writing test cases? I'm constructing tests just to pin down leaks that I find in my application (uses numpy and ctypes extensively), so they're terribly ad-hoc at the moment. I'm chasing down leaks in a leaky application (it ate 1G over a weekend), and numpy is just one source of leaks -- I suspec that the two reference count leaks I've identified so far don't actually leak significant memory, so alas my search continues... > > Definitely not enjoying this low level code. > What doesn't kill you makes you stronger :-) Heh. > But, you are correct that reference counting is a bear. Actually, it's not the reference counting I'm grumbling about -- the rules for reference counting are, on the whole, pretty straightforward. From robert.kern at gmail.com Tue Jul 8 17:35:08 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 8 Jul 2008 16:35:08 -0500 Subject: [Numpy-discussion] Another reference count leak: ticket #848 In-Reply-To: <20080708210520.Y42061@saturn.araneidae.co.uk> References: <20080708093048.G5947@saturn.araneidae.co.uk> <4873A341.1030402@enthought.com> <20080708210520.Y42061@saturn.araneidae.co.uk> Message-ID: <3d375d730807081435q1b66b7afo68aad4d4e18abebf@mail.gmail.com> On Tue, Jul 8, 2008 at 16:23, Michael Abbott wrote: > On Tue, 8 Jul 2008, Travis E. Oliphant wrote: >> Michael Abbott wrote: >> > The attached patch fixes another reference count leak in the use of >> > PyArray_DescrFromType. >> > >> The first part of this patch is good. The second is not needed. > I don't see that. The second part of the patch addresses the case of an > early return: this means that the DECREF that occurs later on in the code > is bypassed, and so a reference leak will still occur if this early return > case occurs. Don't forget that PyArray_DescrFromType returns an > incremented reference that has to be decremented, returned or explicitly > assigned -- the DECREF obligation has to be met somewhere. 
> >> Also, it would be useful if you could write a test case that shows what >> is leaking and how you determined that it is leaking. > > Roughly > > r = range(n) > i = 0 > refs = 0 > refs = sys.gettotalrefcount() > for i in r: float32() > print refs - sys.gettotalrefcount() > > in debug mode python. This isn't quite the whole story (reference counts > can be annoyingly fluid), but that's the most of it. In trunk this leaks > 2 refs per n, with the attached patch there remains one leak I haven't > chased down yet. > > Is there a framework for writing test cases? I'm constructing tests just > to pin down leaks that I find in my application (uses numpy and ctypes > extensively), so they're terribly ad-hoc at the moment. If you can measure the leak in-process with sys.getrefcount() and friends on a standard non-debug build of Python, then it might be useful for you to write a unit test for us. For example, in numpy/core/tests/test_regression.py, you can see several tests involving reference counts. If the leak isn't particularly measurable without resorting to top(1), then don't bother trying to make a unit test, but a small, complete example that demonstrates the leak is very useful for the rest of us to see if the problem exists on our systems and so we can try our hands at fixing it, too. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Tue Jul 8 18:05:46 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 8 Jul 2008 16:05:46 -0600 Subject: [Numpy-discussion] Another reference count leak: ticket #848 In-Reply-To: <20080708210520.Y42061@saturn.araneidae.co.uk> References: <20080708093048.G5947@saturn.araneidae.co.uk> <4873A341.1030402@enthought.com> <20080708210520.Y42061@saturn.araneidae.co.uk> Message-ID: On Tue, Jul 8, 2008 at 3:23 PM, Michael Abbott wrote: > On Tue, 8 Jul 2008, Travis E. Oliphant wrote: > > Michael Abbott wrote: > > > The attached patch fixes another reference count leak in the use of > > > PyArray_DescrFromType. > > > > > The first part of this patch is good. The second is not needed. > I don't see that. The second part of the patch addresses the case of an > early return: this means that the DECREF that occurs later on in the code > is bypassed, and so a reference leak will still occur if this early return > case occurs. Don't forget that PyArray_DescrFromType returns an > incremented reference that has to be decremented, returned or explicitly > assigned -- the DECREF obligation has to be met somewhere. > Some function calls do the DECREF on an error return. I haven't looked, but that might be the case here. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Tue Jul 8 18:46:32 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Tue, 08 Jul 2008 17:46:32 -0500 Subject: [Numpy-discussion] Another reference count leak: ticket #848 In-Reply-To: <20080708210520.Y42061@saturn.araneidae.co.uk> References: <20080708093048.G5947@saturn.araneidae.co.uk> <4873A341.1030402@enthought.com> <20080708210520.Y42061@saturn.araneidae.co.uk> Message-ID: <4873EE48.1090500@enthought.com> Michael Abbott wrote: > On Tue, 8 Jul 2008, Travis E. Oliphant wrote: > >> Michael Abbott wrote: >> >>> The attached patch fixes another reference count leak in the use of >>> PyArray_DescrFromType. 
>>> >>> >> The first part of this patch is good. The second is not needed. >> > I don't see that. The second part of the patch addresses the case of an > early return: this means that the DECREF that occurs later on in the code > is bypassed, and so a reference leak will still occur if this early return > case occurs. Don't forget that PyArray_DescrFromType returns an > incremented reference that has to be decremented, returned or explicitly > assigned -- the DECREF obligation has to be met somewhere. > Don't forget that PyArray_FromAny consumes the reference even if it returns with an error. -Travis From haase at msg.ucsf.edu Tue Jul 8 20:21:26 2008 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Wed, 9 Jul 2008 02:21:26 +0200 Subject: [Numpy-discussion] Schedule for 1.1.1 In-Reply-To: References: Message-ID: Hi, I haven't checked out a recent numpy (( >>> N.__version__ '1.0.3.1')) But could someone please check if the division has been changed from '/' to '//' in these places: C:\Priithon_25_win\numpy\core\numerictypes.py:142: DeprecationWarning: classic int division bytes = bits / 8 C:\Priithon_25_win\numpy\core\numerictypes.py:182: DeprecationWarning: classic int division na_name = '%s%d' % (base.capitalize(), bit/2) C:\Priithon_25_win\numpy\core\numerictypes.py:212: DeprecationWarning: classic int division charname = 'i%d' % (bits/8,) C:\Priithon_25_win\numpy\core\numerictypes.py:213: DeprecationWarning: classic int division ucharname = 'u%d' % (bits/8,) C:\Priithon_25_win\numpy\core\numerictypes.py:409: DeprecationWarning: classic int division nbytes[obj] = val[2] / 8 I found these by starting python using the -Qwarn option. Thanks, Sebastian Haase On Tue, Jul 8, 2008 at 5:28 PM, Charles R Harris wrote: > I think we should try to get a quick bug fix version out by the end of the > month. What do others think? > > Chuck From robert.kern at gmail.com Wed Jul 9 00:03:52 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 8 Jul 2008 23:03:52 -0500 Subject: [Numpy-discussion] alterdot and restoredot In-Reply-To: References: Message-ID: <3d375d730807082103x3f1f9a2emad53349d82912b1f@mail.gmail.com> On Tue, Jul 8, 2008 at 14:01, Keith Goodman wrote: > I don't know what to write for a doc string for alterdot and > restoredot. Then maybe you're the best one to figure it out. What details do you think are missing from the current docstrings? What questions do they leave you with? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco

From cournape at gmail.com Wed Jul 9 01:46:40 2008
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 9 Jul 2008 07:46:40 +0200
Subject: [Numpy-discussion] numpy installation issues
In-Reply-To: <8d00cdad0807070611p332a12e1w18a784c69ab20ede@mail.gmail.com>
References: <8d00cdad0807070120v55ec5b54q8ee1a0e51db3cd88@mail.gmail.com>
	<4871D37C.6040208@ar.media.kyoto-u.ac.jp>
	<8d00cdad0807070217g2912ae03t1f6f6e9f6e15e957@mail.gmail.com>
	<4871DEFD.3050102@ar.media.kyoto-u.ac.jp>
	<8d00cdad0807070509g4dd4c6dbt43954a4cb9d104be@mail.gmail.com>
	<48720A74.306@ar.media.kyoto-u.ac.jp>
	<8d00cdad0807070538j6c445d28uf4bdb7be8bd01a75@mail.gmail.com>
	<8d00cdad0807070555r1ea4e2av21adad201983389e@mail.gmail.com>
	<8d00cdad0807070611p332a12e1w18a784c69ab20ede@mail.gmail.com>
Message-ID: <5b8d13220807082246k7d45f2f6od6aaa844381ed5d5@mail.gmail.com>

On Mon, Jul 7, 2008 at 3:11 PM, Chris Bartels wrote:
> Hi David (and others)
>
> This issue is known:
> http://www.scipy.org/scipy/numpy/ticket/811
>
> I think this is an issue for the numpy developers. (I don't know how to fix
> this easily, i can try to install an older version of binutils (if cygwin
> has these), but this will probably break a lot of other stuff. So that is
> not my preferred solution.)

Well, AFAIK, numpy does not have a single line of assembly, so this
looks more like a bug in the cygwin packaging (incompatibilities
between gcc and binutils versions). There is not much we can do about
it.

cheers,

David

From alan.mcintyre at gmail.com Wed Jul 9 01:49:59 2008
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Wed, 9 Jul 2008 01:49:59 -0400
Subject: [Numpy-discussion] chararray behavior
In-Reply-To: 
References: <1d36917a0807072044n528d1f41o4bcde8a0389fa388@mail.gmail.com>
	<4873A3F5.4030100@enthought.com>
	<1d36917a0807081053i1e6612acl2b53169c91a9498@mail.gmail.com>
Message-ID: <1d36917a0807082249t67cf5610he395bb050d6905ad@mail.gmail.com>

On Tue, Jul 8, 2008 at 3:30 PM, Anne Archibald wrote:
> In particular, the returned type is always "string of length four",
> which is very peculiar - why four? I realize that variable-length
> strings are a problem (object arrays, I guess?), as is returning
> arrays of varying dtypes (strings of length N), but this definitely
> violates the principle of least surprise...

Hmm... __mul__ calculates the required size of the result array, but
the result of the calculation is a numpy.int32. So ndarray.__new__ is
given this int32 as the itemsize argument, and it looks like the
itemsize of the argument (rather than its contained value) is used as
the itemsize of the new array:

>>> np.chararray((1,2), itemsize=5)
chararray([['', '']],
      dtype='|S5')
>>> np.chararray((1,2), itemsize=np.int32(5))
chararray([['{5', '']],
      dtype='|S4')
>>> np.chararray((1,2), itemsize=np.int16(5))
chararray([['{5', '']],
      dtype='|S2')

Is this expected behavior? I can fix this particular case by forcing
the calculated size to be a Python int, but this treatment of the
itemsize argument seems like it might be an easy way to cause subtle
bugs.
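(The fix Alan describes -- forcing the computed size to a plain Python
int before it reaches the constructor -- would look roughly like this;
a sketch with illustrative names, not the actual chararray internals:)

import numpy as np

size = np.int32(5)                    # a size computed with numpy integer arithmetic
a = np.chararray((1, 2), itemsize=int(size))   # int() makes the value be used
print a.itemsize                      # 5, as intended
print a.dtype                         # |S5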
From cournape at gmail.com Wed Jul 9 02:07:10 2008 From: cournape at gmail.com (David Cournapeau) Date: Wed, 9 Jul 2008 08:07:10 +0200 Subject: [Numpy-discussion] Numpy on AIX 5.3 In-Reply-To: <200807081708.48781.wojciechowski_m@o2.pl> References: <200807081708.48781.wojciechowski_m@o2.pl> Message-ID: <5b8d13220807082307h62419e19vf0b09212ff4ffdf@mail.gmail.com> On Tue, Jul 8, 2008 at 5:08 PM, Marek Wojciechowski wrote: > > cxx.linker_so = [cxx.linker_so[0], cxx.compiler_cxx[0]] + cxx.linker_so[2:] > > in line 303 of cccompiler.py in distutils. > Should be fixed in r5368. I will merge the change into 1.1.1 as well cheers, David From alan.mcintyre at gmail.com Wed Jul 9 02:57:58 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 9 Jul 2008 02:57:58 -0400 Subject: [Numpy-discussion] A couple of testing issues Message-ID: <1d36917a0807082357n350c6bf5l5fd7aa40936ac02e@mail.gmail.com> Hi all, I wanted to point out a couple of things about the new test framework that you should keep in mind if you're writing tests: - Don't use NumpyTestCase any more, just use TestCase (which is available if you do from numpy.testing import *). Using NumpyTestCase now causes a deprecation warning. - Test functions and methods will only be picked up based on name if they begin with "test"; "check_*" will no longer be seen as a test function. I figured I should mention these since there probably hasn't been a general announcement about the testing changes. Thanks, Alan From pav at iki.fi Wed Jul 9 03:53:48 2008 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 9 Jul 2008 07:53:48 +0000 (UTC) Subject: [Numpy-discussion] alterdot and restoredot References: <3d375d730807082103x3f1f9a2emad53349d82912b1f@mail.gmail.com> Message-ID: Tue, 08 Jul 2008 23:03:52 -0500, Robert Kern wrote: > On Tue, Jul 8, 2008 at 14:01, Keith Goodman wrote: >> I don't know what to write for a doc string for alterdot and >> restoredot. > > Then maybe you're the best one to figure it out. What details do you > think are missing from the current docstrings? What questions do they > leave you with? I have the following for starters: - Are these meant as user-visible functions? - Should the user call them? When? What is the advantage? - Are BLAS routines used by default? (And if not, why not?) - Which operations do the functions exactly affect? It seems that alterdot sets the "dot" function slot to a BLAS version, but what operations does this affect? Pauli From stefan at sun.ac.za Wed Jul 9 04:28:23 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 9 Jul 2008 10:28:23 +0200 Subject: [Numpy-discussion] Documentation: topical docs and reviewing our work Message-ID: <9457e7c80807090128u2194c8datc2ccef2be65d855c@mail.gmail.com> Hi all, A `numpy.doc` sub-module has been added, which contains documentation for topics such as indexing, broadcasting, array operations etc. These can be edited from the documentation wiki: http://sd-2116.dedibox.fr/pydocweb/doc/numpy.doc/ If you'd like to document a topic that is not there, let me know and I'll add it. Further, we have documented a large number of functions, and the list is growing by the day. If you go to the docstring summary page: http://sd-2116.dedibox.fr/pydocweb/doc/ the ones ready for review are marked in pink, right at the top. Please log in and leave comments on those. Your input would be much appreciated! 
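(For anyone new to the effort, entries on the wiki follow the numpy
docstring standard -- roughly this shape; an abbreviated sketch around
a made-up function, not something from numpy itself:)

import numpy as np

def clip_lower(x, xmin):
    """
    Clip values below a threshold.

    Parameters
    ----------
    x : ndarray
        Input array.
    xmin : float
        Values smaller than `xmin` are replaced by it.

    Returns
    -------
    out : ndarray
        Array of the same shape as `x`.

    Examples
    --------
    >>> clip_lower(np.array([-1, 2]), 0)
    array([0, 2])
    """
    return np.maximum(x, xmin)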
Regards St?fan From michael at araneidae.co.uk Wed Jul 9 04:36:27 2008 From: michael at araneidae.co.uk (Michael Abbott) Date: Wed, 9 Jul 2008 08:36:27 +0000 (GMT) Subject: [Numpy-discussion] Another reference count leak: ticket #848 In-Reply-To: <4873EE48.1090500@enthought.com> References: <20080708093048.G5947@saturn.araneidae.co.uk> <4873A341.1030402@enthought.com> <20080708210520.Y42061@saturn.araneidae.co.uk> <4873EE48.1090500@enthought.com> Message-ID: <20080709064836.Q65045@saturn.araneidae.co.uk> There are three separate patches in this message plus some remarks on "stealing" reference counts at the bottom. On Tue, 8 Jul 2008, Travis E. Oliphant wrote: > Michael Abbott wrote: > > On Tue, 8 Jul 2008, Travis E. Oliphant wrote: > >> The first part of this patch is good. The second is not needed. > > I don't see that. > Don't forget that PyArray_FromAny consumes the reference even if it > returns with an error. Oh dear. That's not good. Well then, I need to redo my patch. Here's the new patch for ..._arrtype_new: commit 431d99f40ca200201ba59c74a88b0bd972022ff0 Author: Michael Abbott Date: Tue Jul 8 10:10:59 2008 +0100 Another reference leak using PyArray_DescrFromType This change fixes two issues: a spurious ADDREF on a typecode returned from PyArray_DescrFromType and an awkward interaction with PyArray_FromAny. diff --git a/numpy/core/src/scalartypes.inc.src b/numpy/core/src/scalartypes.inc.src index 3feefc0..7d3e562 100644 --- a/numpy/core/src/scalartypes.inc.src +++ b/numpy/core/src/scalartypes.inc.src @@ -1886,7 +1886,6 @@ static PyObject * if (!PyArg_ParseTuple(args, "|O", &obj)) return NULL; typecode = PyArray_DescrFromType(PyArray_ at TYPE@); - Py_INCREF(typecode); if (obj == NULL) { #if @default@ == 0 char *mem; @@ -1903,8 +1902,12 @@ static PyObject * goto finish; } + Py_XINCREF(typecode); arr = PyArray_FromAny(obj, typecode, 0, 0, FORCECAST, NULL); - if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) return arr; + if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) { + Py_XDECREF(typecode); + return arr; + } robj = PyArray_Return((PyArrayObject *)arr); finish: I don't think we can dispense with the extra INCREF and DECREF. Looking at the uses of PyArray_FromAny I can see the motivation for this design: core/include/numpy/ndarrayobject.h has a lot of calls which take a value returned by PyArray_DescrFromType as argument. This has prompted me to take a trawl through the code to see what else is going on, and I note a couple more issues with patches below. In the patch below the problem being fixed is that the first call to PyArray_FromAny can result in the erasure of dtype *before* Py_INCREF is called. Perhaps you can argue that this only occurs when NULL is returned... diff --git a/numpy/core/blasdot/_dotblas.c b/numpy/core/blasdot/_dotblas.c index e2619b6..0b34ec7 100644 --- a/numpy/core/blasdot/_dotblas.c +++ b/numpy/core/blasdot/_dotblas.c @@ -234,9 +234,9 @@ dotblas_matrixproduct(PyObject *dummy, PyObject *args) } dtype = PyArray_DescrFromType(typenum); - ap1 = (PyArrayObject *)PyArray_FromAny(op1, dtype, 0, 0, ALIGNED, NULL); - if (ap1 == NULL) return NULL; Py_INCREF(dtype); + ap1 = (PyArrayObject *)PyArray_FromAny(op1, dtype, 0, 0, ALIGNED, NULL); + if (ap1 == NULL) { Py_DECREF(dtype); return NULL; } ap2 = (PyArrayObject *)PyArray_FromAny(op2, dtype, 0, 0, ALIGNED, NULL); if (ap2 == NULL) goto fail; The next patch deals with an interestingly subtle memory leak in _string_richcompare where if casting to a common type fails then a reference count will leaked. 
Actually this one has nothing to do with PyArray_FromAny, but I spotted it in passing. diff --git a/numpy/core/src/arrayobject.c b/numpy/core/src/arrayobject.c index ee4e945..2294b8d 100644 --- a/numpy/core/src/arrayobject.c +++ b/numpy/core/src/arrayobject.c @@ -4715,7 +4715,6 @@ _strings_richcompare(PyArrayObject *self, PyArrayObject *other, int cmp_op, PyObject *new; if (self->descr->type_num == PyArray_STRING && \ other->descr->type_num == PyArray_UNICODE) { - Py_INCREF(other); Py_INCREF(other->descr); new = PyArray_FromAny((PyObject *)self, other->descr, 0, 0, 0, NULL); @@ -4723,16 +4722,17 @@ _strings_richcompare(PyArrayObject *self, PyArrayObject *other, int cmp_op, return NULL; } self = (PyArrayObject *)new; + Py_INCREF(other); } else if (self->descr->type_num == PyArray_UNICODE && \ other->descr->type_num == PyArray_STRING) { - Py_INCREF(self); Py_INCREF(self->descr); new = PyArray_FromAny((PyObject *)other, self->descr, 0, 0, 0, NULL); if (new == NULL) { return NULL; } + Py_INCREF(self); other = (PyArrayObject *)new; } else { I really don't think that this design of reference count handling in PyArray_FromAny (and consequently PyArray_CheckFromAny) is a good idea. Unfortunately these seem to be part of the published API, so presumably it's too late to change this? (Otherwise I might see how the corresponding patch comes out.) Not only is this not a good idea, it's not documented in the API documentation (I'm referring to the "Guide to NumPy" book), although at least the inline comment on the implementation of PyArray_FromAny does mention that the reference is "stolen". I've been trying to find some documentation on stealing references. The Python C API reference (http://docs.python.org/api/refcountDetails.html) says Few functions steal references; the two notable exceptions are PyList_SetItem() and PyTuple_SetItem() An interesting essay on reference counting is at http://lists.blender.org/pipermail/bf-python/2005-September/003092.html In conclusion, I can't find much about the role of stealing in reference count management, but it's such a source of surprise (and frankly doesn't actually work out all that well in numpy) that I don't think it's justified. If PyList_SetItem() and PyTuple_SetItem() could remain the only examples of this it would be good. From robert.kern at gmail.com Wed Jul 9 04:47:59 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 9 Jul 2008 03:47:59 -0500 Subject: [Numpy-discussion] Documentation: topical docs and reviewing our work In-Reply-To: <9457e7c80807090128u2194c8datc2ccef2be65d855c@mail.gmail.com> References: <9457e7c80807090128u2194c8datc2ccef2be65d855c@mail.gmail.com> Message-ID: <3d375d730807090147u72a1425tf382a1519dcf19c6@mail.gmail.com> On Wed, Jul 9, 2008 at 03:28, St?fan van der Walt wrote: > Please log in and leave comments on those. Your input would be much > appreciated! Each docstring page could use a "Next" link to move to the next docstring with the same review status (actually, the same review status that it was when the user first came to the page; not sure how hard that makes the implementation). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From michael at araneidae.co.uk Wed Jul 9 05:33:53 2008 From: michael at araneidae.co.uk (Michael Abbott) Date: Wed, 9 Jul 2008 09:33:53 +0000 (GMT) Subject: [Numpy-discussion] Another reference count leak: ticket #848 In-Reply-To: <20080709064836.Q65045@saturn.araneidae.co.uk> References: <20080708093048.G5947@saturn.araneidae.co.uk> <4873A341.1030402@enthought.com> <20080708210520.Y42061@saturn.araneidae.co.uk> <4873EE48.1090500@enthought.com> <20080709064836.Q65045@saturn.araneidae.co.uk> Message-ID: <20080709092456.W65706@saturn.araneidae.co.uk> On Wed, 9 Jul 2008, Michael Abbott wrote: > Well then, I need to redo my patch. Here's the new patch for > ..._arrtype_new: I'm sorry about this, I posted too early. Here is the final patch (and I'll update the ticket accordingly). commit a1ff570cbd3ca6c28f87c55cebf2675b395c6fa0 Author: Michael Abbott Date: Tue Jul 8 10:10:59 2008 +0100 Another reference leak using PyArray_DescrFromType This change fixes the following issues resulting in reference count leaks: a spurious ADDREF on a typecode returned from PyArray_DescrFromType, an awkward interaction with PyArray_FromAny, and a couple of early returns which need DECREFs. diff --git a/numpy/core/src/scalartypes.inc.src b/numpy/core/src/scalartypes.inc.src index 3feefc0..d54ae1b 100644 --- a/numpy/core/src/scalartypes.inc.src +++ b/numpy/core/src/scalartypes.inc.src @@ -1886,7 +1886,6 @@ static PyObject * if (!PyArg_ParseTuple(args, "|O", &obj)) return NULL; typecode = PyArray_DescrFromType(PyArray_ at TYPE@); - Py_INCREF(typecode); if (obj == NULL) { #if @default@ == 0 char *mem; @@ -1903,19 +1902,30 @@ static PyObject * goto finish; } + Py_XINCREF(typecode); arr = PyArray_FromAny(obj, typecode, 0, 0, FORCECAST, NULL); - if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) return arr; + if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) { + Py_XDECREF(typecode); + return arr; + } robj = PyArray_Return((PyArrayObject *)arr); finish: - if ((robj==NULL) || (robj->ob_type == type)) return robj; + if ((robj==NULL) || (robj->ob_type == type)) { + Py_XDECREF(typecode); + return robj; + } /* Need to allocate new type and copy data-area over */ if (type->tp_itemsize) { itemsize = PyString_GET_SIZE(robj); } else itemsize = 0; obj = type->tp_alloc(type, itemsize); - if (obj == NULL) {Py_DECREF(robj); return NULL;} + if (obj == NULL) { + Py_XDECREF(typecode); + Py_DECREF(robj); + return NULL; + } if (typecode==NULL) typecode = PyArray_DescrFromType(PyArray_ at TYPE@); dest = scalar_value(obj, typecode); The corresponding test case is (sorry it's crude): import sys from numpy import float32 refs = 0 refs = sys.gettotalrefcount() float32() print sys.gettotalrefcount() - refs I'm afraid I haven't tested all the possible paths through this routine. I need to get back to chasing my other leaks. From robert.kern at gmail.com Wed Jul 9 05:43:54 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 9 Jul 2008 04:43:54 -0500 Subject: [Numpy-discussion] alterdot and restoredot In-Reply-To: References: <3d375d730807082103x3f1f9a2emad53349d82912b1f@mail.gmail.com> Message-ID: <3d375d730807090243t5bae7058p7798799d586c207@mail.gmail.com> On Wed, Jul 9, 2008 at 02:53, Pauli Virtanen wrote: > > Tue, 08 Jul 2008 23:03:52 -0500, Robert Kern wrote: >> On Tue, Jul 8, 2008 at 14:01, Keith Goodman wrote: >>> I don't know what to write for a doc string for alterdot and >>> restoredot. >> >> Then maybe you're the best one to figure it out. What details do you >> think are missing from the current docstrings? 
What questions do they leave you with?

> I have the following for starters:
>
> - Are these meant as user-visible functions?

Yes, with the caveats below.

> - Should the user call them? When? What is the advantage?

Typically, one would only want to call them when trying to
troubleshoot an installation problem, benchmark their installation, or
otherwise need complete control over what code is used. Most users
will never need to touch them.

> - Are BLAS routines used by default? (And if not, why not?)

If numpy.core._dotblas was built and imports, then yes.

> - Which operations do the functions exactly affect?
>   It seems that alterdot sets the "dot" function slot to a BLAS
>   version, but what operations does this affect?

dot(), vdot(), and innerproduct() on C-contiguous arrays which are
Matrix-Matrix, Matrix-Vector or Vector-Vector products.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From gruben at bigpond.net.au Wed Jul 9 07:13:59 2008
From: gruben at bigpond.net.au (Gary Ruben)
Date: Wed, 09 Jul 2008 12:13:59 +0100
Subject: [Numpy-discussion] Detecting phase windings
In-Reply-To: 
References: <10495791.1213633824742.JavaMail.root@nschwwebs02p>
Message-ID: <48749D77.3090304@bigpond.net.au>

I had a chance to look at Anne's suggestion from this thread and I
thought I should post my phase winding finder solution, which is
slightly modified from her idea. Thanks Anne. This is a vast
improvement over my original slow code, and is useful to me now, but I
will probably have to rewrite it in C, weave or Cython when I start
generating large data sets.

import sys
import numpy as np
from pyvtk import *

def find_vortices(x, axis=0):
    xx = np.rollaxis(x, axis)
    r = np.empty_like(xx).astype(np.bool)
    for i in range(xx.shape[0]):
        print i,
        xxx = xx[i,...]
        # walk a loop of neighbouring samples; after unwrapping, the
        # phase fails to return to its starting value exactly when a
        # phase winding (vortex) threads the loop
        loop = np.concatenate(([xxx],
                               [np.roll(xxx,1,0)],
                               [np.roll(np.roll(xxx,1,0),1,1)],
                               [np.roll(xxx,1,1)],
                               [xxx]), axis=0)
        loop = np.unwrap(loop, axis=0)
        r[i,...] = np.abs(loop[0,...]-loop[-1,...]) > np.pi/2
    return np.rollaxis(r, 0, axis+1)[1:-1,1:-1,1:-1]

and call it like so on the 3D phaseField array, which is a float32
array containing the phase angle at each point:

# Detect the nodal lines
b0 = find_vortices(phaseField, axis=0)
b0 |= find_vortices(phaseField, axis=1)
b0 |= find_vortices(phaseField, axis=2)

# output vortices to vtk
indices = np.transpose(np.nonzero(b0)).tolist()
vtk = VtkData(UnstructuredGrid(indices))
vtk.tofile('%s_vol'%sys.argv[0][:-3],'binary')
del vtk

-- 
Gary R.

From cournape at gmail.com Wed Jul 9 07:28:58 2008
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 9 Jul 2008 13:28:58 +0200
Subject: [Numpy-discussion] Another reference count leak: ticket #848
In-Reply-To: <20080709064836.Q65045@saturn.araneidae.co.uk>
References: <20080708093048.G5947@saturn.araneidae.co.uk>
	<4873A341.1030402@enthought.com>
	<20080708210520.Y42061@saturn.araneidae.co.uk>
	<4873EE48.1090500@enthought.com>
	<20080709064836.Q65045@saturn.araneidae.co.uk>
Message-ID: <5b8d13220807090428t5975021coc0d8de8dc77f5a3e@mail.gmail.com>

> I really don't think that this design of reference count handling in
> PyArray_FromAny (and consequently PyArray_CheckFromAny) is a good idea.
> Unfortunately these seem to be part of the published API, so presumably
> it's too late to change this?  (Otherwise I might see how the
> corresponding patch comes out.)

Changing it would break almost all code that uses the numpy C API ...
Don't forget that numpy did build on numarray and numeric, with an almost backward compatible C API. cheers David From peridot.faceted at gmail.com Wed Jul 9 07:36:00 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 9 Jul 2008 07:36:00 -0400 Subject: [Numpy-discussion] alterdot and restoredot In-Reply-To: <3d375d730807090243t5bae7058p7798799d586c207@mail.gmail.com> References: <3d375d730807082103x3f1f9a2emad53349d82912b1f@mail.gmail.com> <3d375d730807090243t5bae7058p7798799d586c207@mail.gmail.com> Message-ID: 2008/7/9 Robert Kern : >> - Which operations do the functions exactly affect? >> It seems that alterdot sets the "dot" function slot to a BLAS >> version, but what operations does this affect? > > dot(), vdot(), and innerproduct() on C-contiguous arrays which are > Matrix-Matrix, Matrix-Vector or Vector-Vector products. Really? Not, say, tensordot()? Anne From peridot.faceted at gmail.com Wed Jul 9 09:26:39 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 9 Jul 2008 09:26:39 -0400 Subject: [Numpy-discussion] A couple of testing issues In-Reply-To: <1d36917a0807082357n350c6bf5l5fd7aa40936ac02e@mail.gmail.com> References: <1d36917a0807082357n350c6bf5l5fd7aa40936ac02e@mail.gmail.com> Message-ID: 2008/7/9 Alan McIntyre : > - Test functions and methods will only be picked up based on name if > they begin with "test"; "check_*" will no longer be seen as a test > function. Is it possible to induce nose to pick these up and, if not actually run them, warn about them? It's not so good to have some tests silently not being run... Anne From charlesr.harris at gmail.com Wed Jul 9 09:55:08 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 9 Jul 2008 07:55:08 -0600 Subject: [Numpy-discussion] Another reference count leak: ticket #848 In-Reply-To: <20080709064836.Q65045@saturn.araneidae.co.uk> References: <20080708093048.G5947@saturn.araneidae.co.uk> <4873A341.1030402@enthought.com> <20080708210520.Y42061@saturn.araneidae.co.uk> <4873EE48.1090500@enthought.com> <20080709064836.Q65045@saturn.araneidae.co.uk> Message-ID: On Wed, Jul 9, 2008 at 2:36 AM, Michael Abbott wrote: > There are three separate patches in this message plus some remarks on > "stealing" reference counts at the bottom. > > > I really don't think that this design of reference count handling in > PyArray_FromAny (and consequently PyArray_CheckFromAny) is a good idea. > Unfortunately these seem to be part of the published API, so presumably > it's too late to change this? (Otherwise I might see how the > corresponding patch comes out.) > There was a previous discussion along those lines initiated by, ahem, myself. But changing things at this point would be too much churn. Better, I think, to get these things documented and the code cleaned up. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From Catherine.M.Moroney at jpl.nasa.gov Wed Jul 9 12:21:54 2008 From: Catherine.M.Moroney at jpl.nasa.gov (Catherine Moroney) Date: Wed, 9 Jul 2008 09:21:54 -0700 Subject: [Numpy-discussion] element-wise logical operations on numpy arrays Message-ID: <07696CD2-45C1-4214-AB1A-FC09369B8FDF@jpl.nasa.gov> Hello, I have a question about performing element-wise logical operations on numpy arrays. If "a", "b" and "c" are numpy arrays of the same size, does the following syntax work? 
mask = (a > 1.0) & ((b > 3.0) | (c > 10.0))

It seems to be performing correctly, but the documentation that I've read indicates that "&" and "|" are for bitwise operations, not element-by-element operations in arrays.

I'm trying to avoid using "logical_and" and "logical_or" because they make the code more cumbersome and difficult to read. Are "&" and "|" acceptable substitutes for numpy arrays?

Thanks,
Catherine

From charlesr.harris at gmail.com Wed Jul 9 12:34:06 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 9 Jul 2008 10:34:06 -0600
Subject: [Numpy-discussion] element-wise logical operations on numpy arrays
In-Reply-To: <07696CD2-45C1-4214-AB1A-FC09369B8FDF@jpl.nasa.gov>
References: <07696CD2-45C1-4214-AB1A-FC09369B8FDF@jpl.nasa.gov>
Message-ID:

On Wed, Jul 9, 2008 at 10:21 AM, Catherine Moroney < Catherine.M.Moroney at jpl.nasa.gov> wrote:

> Hello,
>
> I have a question about performing element-wise logical operations
> on numpy arrays.
>
> If "a", "b" and "c" are numpy arrays of the same size, does the
> following syntax work?
>
> mask = (a > 1.0) & ((b > 3.0) | (c > 10.0))
>
> It seems to be performing correctly, but the documentation that I've
> read indicates that "&" and "|" are for bitwise operations, not
> element-by-element operations in arrays.

They perform bitwise operations element by element. They only work for integer/bool arrays, and you should avoid mixing signed/unsigned types because of the type promotion rules. Other than that, things should work fine.

> I'm trying to avoid using "logical_and" and "logical_or" because they
> make the code more cumbersome and difficult to read. Are "&" and "|"
> acceptable substitutes for numpy arrays?

Generally, yes, but they are more restrictive in the types they accept.

Chuck

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From peridot.faceted at gmail.com Wed Jul 9 12:35:20 2008
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Wed, 9 Jul 2008 12:35:20 -0400
Subject: [Numpy-discussion] element-wise logical operations on numpy arrays
In-Reply-To: <07696CD2-45C1-4214-AB1A-FC09369B8FDF@jpl.nasa.gov>
References: <07696CD2-45C1-4214-AB1A-FC09369B8FDF@jpl.nasa.gov>
Message-ID:

2008/7/9 Catherine Moroney :

> I have a question about performing element-wise logical operations
> on numpy arrays.
>
> If "a", "b" and "c" are numpy arrays of the same size, does the
> following syntax work?
>
> mask = (a > 1.0) & ((b > 3.0) | (c > 10.0))
>
> It seems to be performing correctly, but the documentation that I've
> read indicates that "&" and "|" are for bitwise operations, not
> element-by-element operations in arrays.
>
> I'm trying to avoid using "logical_and" and "logical_or" because they
> make the code more cumbersome and difficult to read. Are "&" and "|"
> acceptable substitutes for numpy arrays?

Yes. Unfortunately it is impossible to make python's usual logical operators, "and", "or", etcetera, behave correctly on numpy arrays. So the decision was made to use the bitwise operators to express logical operations on boolean arrays. If you like, you can think of boolean arrays as containing single bits, so that the bitwise operators *are* the logical operators.

Confusing, but I'm afraid there really isn't anything the numpy developers can do about it, besides write good documentation.
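
For example (a quick illustration, with made-up arrays):

>>> import numpy as np
>>> a = np.array([0.5, 1.5, 2.5])
>>> b = np.array([4.0, 2.0, 12.0])
>>> (a > 1.0) & (b > 3.0)        # elementwise logical "and"
array([False, False,  True], dtype=bool)
>>> ~(a > 1.0)                   # elementwise logical "not"
array([ True, False, False], dtype=bool)
>>> (a > 1.0) and (b > 3.0)      # the keyword form fails on arrays
Traceback (most recent call last):
  ...
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()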
Good luck,
Anne

From millman at berkeley.edu Wed Jul 9 13:03:08 2008
From: millman at berkeley.edu (Jarrod Millman)
Date: Wed, 9 Jul 2008 10:03:08 -0700
Subject: [Numpy-discussion] REMINDER: SciPy 2008 Early Registration ends in 2 days
Message-ID:

Hello,

This is a reminder that early registration for SciPy 2008 ends in two days, on Friday, July 11th. To register, please see: http://conference.scipy.org/to_register

This year's conference has two days of tutorials, two days of presentations, and ends with a two-day coding sprint. If you want to learn more, see my blog post: http://jarrodmillman.blogspot.com/2008/07/scipy-2008-conference-program-posted.html

Cheers,

--
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/

From Catherine.M.Moroney at jpl.nasa.gov Wed Jul 9 13:11:03 2008
From: Catherine.M.Moroney at jpl.nasa.gov (Catherine Moroney)
Date: Wed, 9 Jul 2008 10:11:03 -0700
Subject: [Numpy-discussion] Numpy-discussion Digest, Vol 22, Issue 32
In-Reply-To:
References:
Message-ID: <6E873B64-D961-45E5-AD19-10E6A4171D65@jpl.nasa.gov>

On Jul 9, 2008, at 10:00 AM, numpy-discussion-request at scipy.org wrote:

> Send Numpy-discussion mailing list submissions to
> numpy-discussion at scipy.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
> or, via email, send a message with subject or body 'help' to
> numpy-discussion-request at scipy.org
>
> You can reach the person managing the list at
> numpy-discussion-owner at scipy.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Numpy-discussion digest..."
>
> Today's Topics:
>
>   1. Re: element-wise logical operations on numpy arrays (Anne Archibald)
>
> From: "Anne Archibald"
> Date: July 9, 2008 9:35:20 AM PDT
> To: "Discussion of Numerical Python"
> Subject: Re: [Numpy-discussion] element-wise logical operations on numpy arrays
> Reply-To: Discussion of Numerical Python
>
> 2008/7/9 Catherine Moroney :
>
>> I have a question about performing element-wise logical operations
>> on numpy arrays.
>>
>> If "a", "b" and "c" are numpy arrays of the same size, does the
>> following syntax work?
>>
>> mask = (a > 1.0) & ((b > 3.0) | (c > 10.0))
>>
>> It seems to be performing correctly, but the documentation that I've
>> read indicates that "&" and "|" are for bitwise operations, not
>> element-by-element operations in arrays.
>>
>> I'm trying to avoid using "logical_and" and "logical_or" because they
>> make the code more cumbersome and difficult to read. Are "&" and "|"
>> acceptable substitutes for numpy arrays?
>
> Yes. Unfortunately it is impossible to make python's usual logical
> operators, "and", "or", etcetera, behave correctly on numpy arrays. So
> the decision was made to use the bitwise operators to express logical
> operations on boolean arrays. If you like, you can think of boolean
> arrays as containing single bits, so that the bitwise operators *are*
> the logical operators.
>
> Confusing, but I'm afraid there really isn't anything the numpy
> developers can do about it, besides write good documentation.

Do "&" and "|" work on all types of numpy arrays (i.e. floats and 16 and 32-bit integers), or only on arrays of booleans? The short tests I've done seem to indicate that they do, but I'd like to have some confirmation.
> Good luck,
> Anne

Catherine

> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

From charlesr.harris at gmail.com Wed Jul 9 13:13:41 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 9 Jul 2008 11:13:41 -0600
Subject: [Numpy-discussion] Numpy-discussion Digest, Vol 22, Issue 32
In-Reply-To: <6E873B64-D961-45E5-AD19-10E6A4171D65@jpl.nasa.gov>
References: <6E873B64-D961-45E5-AD19-10E6A4171D65@jpl.nasa.gov>
Message-ID:

On Wed, Jul 9, 2008 at 11:11 AM, Catherine Moroney < Catherine.M.Moroney at jpl.nasa.gov> wrote:

> On Jul 9, 2008, at 10:00 AM, numpy-discussion-request at scipy.org wrote:
>
> > Send Numpy-discussion mailing list submissions to
> > numpy-discussion at scipy.org
> >
> > To subscribe or unsubscribe via the World Wide Web, visit
> > http://projects.scipy.org/mailman/listinfo/numpy-discussion
> > or, via email, send a message with subject or body 'help' to
> > numpy-discussion-request at scipy.org
> >
> > You can reach the person managing the list at
> > numpy-discussion-owner at scipy.org
> >
> > When replying, please edit your Subject line so it is more specific
> > than "Re: Contents of Numpy-discussion digest..."
> >
> > Today's Topics:
> >
> >   1. Re: element-wise logical operations on numpy arrays (Anne Archibald)
> >
> > From: "Anne Archibald"
> > Date: July 9, 2008 9:35:20 AM PDT
> > To: "Discussion of Numerical Python"
> > Subject: Re: [Numpy-discussion] element-wise logical operations on numpy arrays
> > Reply-To: Discussion of Numerical Python
> >
> > 2008/7/9 Catherine Moroney :
> >
> >> I have a question about performing element-wise logical operations
> >> on numpy arrays.
> >>
> >> If "a", "b" and "c" are numpy arrays of the same size, does the
> >> following syntax work?
> >>
> >> mask = (a > 1.0) & ((b > 3.0) | (c > 10.0))
> >>
> >> It seems to be performing correctly, but the documentation that I've
> >> read indicates that "&" and "|" are for bitwise operations, not
> >> element-by-element operations in arrays.
> >>
> >> I'm trying to avoid using "logical_and" and "logical_or" because they
> >> make the code more cumbersome and difficult to read. Are "&" and "|"
> >> acceptable substitutes for numpy arrays?
> >
> > Yes. Unfortunately it is impossible to make python's usual logical
> > operators, "and", "or", etcetera, behave correctly on numpy arrays. So
> > the decision was made to use the bitwise operators to express logical
> > operations on boolean arrays. If you like, you can think of boolean
> > arrays as containing single bits, so that the bitwise operators *are*
> > the logical operators.
> >
> > Confusing, but I'm afraid there really isn't anything the numpy
> > developers can do about it, besides write good documentation.
>
> Do "&" and "|" work on all types of numpy arrays (i.e. floats and
> 16 and 32-bit integers), or only on arrays of booleans? The short
> tests I've done seem to indicate that they do, but I'd like to have
> some confirmation.

They work for all integer types but not for float or complex types:

In [1]: x = ones(3)

In [2]: x | x
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)

/home/charris/ in ()

TypeError: unsupported operand type(s) for |: 'float' and 'float'

Comparisons always return boolean arrays, so you don't have to worry about that.
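
For instance, with integer arrays the operators do apply, bitwise and element by element (made-up values):

In [3]: a = array([1, 2, 3]); b = array([3, 2, 1])

In [4]: a & b
Out[4]: array([1, 2, 1])

In [5]: a | b
Out[5]: array([3, 2, 3])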
Chuck

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From marlin_rowley at hotmail.com Wed Jul 9 15:16:47 2008
From: marlin_rowley at hotmail.com (Marlin Rowley)
Date: Wed, 9 Jul 2008 14:16:47 -0500
Subject: [Numpy-discussion] Multiplying every 3 elements by a vector?
Message-ID:

All:

I'm trying to take a constant vector:

v = (0.122169, 0.61516, 0.262671)

and multiply those values by every 3 components in an array of length N:

A = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ....]

So what I want is:

v[0]*A[0]
v[1]*A[1]
v[2]*A[2]
v[0]*A[3]
v[1]*A[4]
v[2]*A[5]
v[0]*A[6]
...

How do I do this with one command in NumPy?

-M

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From alan.mcintyre at gmail.com Wed Jul 9 15:19:30 2008
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Wed, 9 Jul 2008 15:19:30 -0400
Subject: [Numpy-discussion] A couple of testing issues
In-Reply-To:
References: <1d36917a0807082357n350c6bf5l5fd7aa40936ac02e@mail.gmail.com>
Message-ID: <1d36917a0807091219m6275c7c9mfd9b3bbdcf9f60d5@mail.gmail.com>

On Wed, Jul 9, 2008 at 9:26 AM, Anne Archibald wrote:

>> - Test functions and methods will only be picked up based on name if
>> they begin with "test"; "check_*" will no longer be seen as a test
>> function.
>
> Is it possible to induce nose to pick these up and, if not actually
> run them, warn about them? It's not so good to have some tests
> silently not being run...

Having nose pick up "check_" functions as tests may interfere with SciPy testing; it looks like there are a couple dozen functions/methods named that way in the SciPy tree. I didn't look at all of them, though; it could be that some are tests that still need renaming.

Since I'm looking at coverage (including test code coverage), any tests that don't get run will be found, at least while I'm working on tests. Still, it might not hurt to have something automated looking for potentially missed tests for 1.2. That would also help with third-party code that depends on NumPy for testing, since they probably don't have the luxury of someone able to spend all their time worrying over test coverage.

I can make a pass through all the test_* modules in the source tree under test and post a warning if "def check_" is found in them before handing things over to nose. Anyone else have thoughts on this?

From charlesr.harris at gmail.com Wed Jul 9 15:26:01 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 9 Jul 2008 13:26:01 -0600
Subject: [Numpy-discussion] Multiplying every 3 elements by a vector?
In-Reply-To:
References:
Message-ID:

On Wed, Jul 9, 2008 at 1:16 PM, Marlin Rowley wrote:

> All:
>
> I'm trying to take a constant vector:
>
> v = (0.122169, 0.61516, 0.262671)
>
> and multiply those values by every 3 components in an array of length N:
>
> A = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ....]
>
> So what I want is:
>
> v[0]*A[0]
> v[1]*A[1]
> v[2]*A[2]
> v[0]*A[3]
> v[1]*A[4]
> v[2]*A[5]
> v[0]*A[6]
> ...
>
> How do I do this with one command in NumPy?

If the length of A is divisible by 3:

A.reshape((-1,3))*v

You might want to reshape the result to 1-D.
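
For example, with made-up numbers:

import numpy as np

v = np.array([0.122169, 0.61516, 0.262671])
A = np.arange(9.)
result = (A.reshape((-1, 3)) * v).ravel()
# result is [v[0]*A[0], v[1]*A[1], v[2]*A[2], v[0]*A[3], ...]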
Chuck

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From robert.kern at gmail.com Wed Jul 9 15:26:01 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 9 Jul 2008 14:26:01 -0500
Subject: [Numpy-discussion] A couple of testing issues
In-Reply-To: <1d36917a0807091219m6275c7c9mfd9b3bbdcf9f60d5@mail.gmail.com>
References: <1d36917a0807082357n350c6bf5l5fd7aa40936ac02e@mail.gmail.com> <1d36917a0807091219m6275c7c9mfd9b3bbdcf9f60d5@mail.gmail.com>
Message-ID: <3d375d730807091226q31ebdaf8t7c6f4d65666338d9@mail.gmail.com>

On Wed, Jul 9, 2008 at 14:19, Alan McIntyre wrote:

> I can make a pass through all the test_* modules in the source tree
> under test and post a warning if "def check_" is found in them before
> handing things over to nose. Anyone else have thoughts on this?

I don't think it's worth automating on every run. People can see for themselves if they have any such check_* methods and make the conversion once:

  nosetests -v --include "check_.*" --exclude "test_.*"

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From robert.kern at gmail.com Wed Jul 9 15:28:10 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 9 Jul 2008 14:28:10 -0500
Subject: [Numpy-discussion] A couple of testing issues
In-Reply-To: <3d375d730807091226q31ebdaf8t7c6f4d65666338d9@mail.gmail.com>
References: <1d36917a0807082357n350c6bf5l5fd7aa40936ac02e@mail.gmail.com> <1d36917a0807091219m6275c7c9mfd9b3bbdcf9f60d5@mail.gmail.com> <3d375d730807091226q31ebdaf8t7c6f4d65666338d9@mail.gmail.com>
Message-ID: <3d375d730807091228k3752f7a3sa9472469580fc51e@mail.gmail.com>

On Wed, Jul 9, 2008 at 14:26, Robert Kern wrote:

> On Wed, Jul 9, 2008 at 14:19, Alan McIntyre wrote:
>
>> I can make a pass through all the test_* modules in the source tree
>> under test and post a warning if "def check_" is found in them before
>> handing things over to nose. Anyone else have thoughts on this?
>
> I don't think it's worth automating on every run. People can see for
> themselves if they have any such check_* methods and make the
> conversion once:
>
>   nosetests -v --include "check_.*" --exclude "test_.*"

Hmm, could be wrong about that. Let me find the right incantation.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From alan.mcintyre at gmail.com Wed Jul 9 15:35:37 2008
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Wed, 9 Jul 2008 15:35:37 -0400
Subject: [Numpy-discussion] A couple of testing issues
In-Reply-To: <3d375d730807091226q31ebdaf8t7c6f4d65666338d9@mail.gmail.com>
References: <1d36917a0807082357n350c6bf5l5fd7aa40936ac02e@mail.gmail.com> <1d36917a0807091219m6275c7c9mfd9b3bbdcf9f60d5@mail.gmail.com> <3d375d730807091226q31ebdaf8t7c6f4d65666338d9@mail.gmail.com>
Message-ID: <1d36917a0807091235i48cffb9r3826805f2fb7c288@mail.gmail.com>

On Wed, Jul 9, 2008 at 3:26 PM, Robert Kern wrote:

> I don't think it's worth automating on every run. People can see for
> themselves if they have any such check_* methods and make the
> conversion once:

Does this fall into the "how in the world should I have known to do that" category?
As long as there's a prominent note in the release notes, either containing such suggestions or linking to a page that does, I don't have any problem just including this in the list of things that people should do if they're planning on upgrading to 1.2.

From Catherine.M.Moroney at jpl.nasa.gov Wed Jul 9 15:43:53 2008
From: Catherine.M.Moroney at jpl.nasa.gov (Catherine Moroney)
Date: Wed, 9 Jul 2008 12:43:53 -0700
Subject: [Numpy-discussion] Numpy-discussion Digest, Vol 22, Issue 33
In-Reply-To:
References:
Message-ID: <199CC9A4-507C-4F85-80BF-F995D84B6AA0@jpl.nasa.gov>

> 2008/7/9 Catherine Moroney :
>
>> I have a question about performing element-wise logical operations
>> on numpy arrays.
>> If "a", "b" and "c" are numpy arrays of the same size, does the
>> following syntax work?
>>
>> mask = (a > 1.0) & ((b > 3.0) | (c > 10.0))
>>
>> It seems to be performing correctly, but the documentation that I've
>> read indicates that "&" and "|" are for bitwise operations, not
>> element-by-element operations in arrays.
>>
>> I'm trying to avoid using "logical_and" and "logical_or" because they
>> make the code more cumbersome and difficult to read. Are "&" and "|"
>> acceptable substitutes for numpy arrays?
>
> Yes. Unfortunately it is impossible to make python's usual logical
> operators, "and", "or", etcetera, behave correctly on numpy arrays. So
> the decision was made to use the bitwise operators to express logical
> operations on boolean arrays. If you like, you can think of boolean
> arrays as containing single bits, so that the bitwise operators *are*
> the logical operators.
>
> Confusing, but I'm afraid there really isn't anything the numpy
> developers can do about it, besides write good documentation.
>
> Do "&" and "|" work on all types of numpy arrays (i.e. floats and
> 16 and 32-bit integers), or only on arrays of booleans? The short
> tests I've done seem to indicate that they do, but I'd like to have
> some confirmation.
>
> They work for all integer types but not for float or complex types:
>
> In [1]: x = ones(3)
>
> In [2]: x | x
> ---------------------------------------------------------------------------
> TypeError                                 Traceback (most recent call last)
>
> /home/charris/ in ()
>
> TypeError: unsupported operand type(s) for |: 'float' and 'float'
>
> Comparisons always return boolean arrays, so you don't have to worry
> about that.
>
> Chuck

I've attached a short test program for numpy arrays of floats for which "&" and "|" seem to work. If, as you say, "&" and "|" don't work for floats, why does this program work?

from numpy import *

a = array([(1.1, 2.1),(3.1, 4.1)],'float')
b = a + 1
c = b + 1

print "a = ",a
print "b = ",b
print "c = ",c

mask = (a < 4.5) & (b < 4.5) & (c < 4.5)
print "mask = ",mask

print "masked a = ",a[mask]
print "masked b = ",b[mask]
print "masked c = ",c[mask]

From kwgoodman at gmail.com Wed Jul 9 15:45:17 2008
From: kwgoodman at gmail.com (Keith Goodman)
Date: Wed, 9 Jul 2008 12:45:17 -0700
Subject: [Numpy-discussion] Numpy-discussion Digest, Vol 22, Issue 33
In-Reply-To: <199CC9A4-507C-4F85-80BF-F995D84B6AA0@jpl.nasa.gov>
References: <199CC9A4-507C-4F85-80BF-F995D84B6AA0@jpl.nasa.gov>
Message-ID:

On Wed, Jul 9, 2008 at 12:43 PM, Catherine Moroney wrote:

>>> I have a question about performing element-wise logical operations
>>> on numpy arrays.
>>>
>>> If "a", "b" and "c" are numpy arrays of the same size, does the
>>> following syntax work?
>>>
>>> mask = (a > 1.0) & ((b > 3.0) | (c > 10.0))
>>>
>>> It seems to be performing correctly, but the documentation that I've
>>> read indicates that "&" and "|" are for bitwise operations, not
>>> element-by-element operations in arrays.
>>>
>>> I'm trying to avoid using "logical_and" and "logical_or" because they
>>> make the code more cumbersome and difficult to read. Are "&" and "|"
>>> acceptable substitutes for numpy arrays?
>>
>> Yes. Unfortunately it is impossible to make python's usual logical
>> operators, "and", "or", etcetera, behave correctly on numpy arrays. So
>> the decision was made to use the bitwise operators to express logical
>> operations on boolean arrays. If you like, you can think of boolean
>> arrays as containing single bits, so that the bitwise operators *are*
>> the logical operators.
>>
>> Do "&" and "|" work on all types of numpy arrays (i.e. floats and
>> 16 and 32-bit integers), or only on arrays of booleans? The short
>> tests I've done seem to indicate that they do, but I'd like to have
>> some confirmation.
>>
>> They work for all integer types but not for float or complex types:
>>
>> In [1]: x = ones(3)
>>
>> In [2]: x | x
>> TypeError: unsupported operand type(s) for |: 'float' and 'float'
>>
>> Comparisons always return boolean arrays, so you don't have to worry
>> about that.
>
> I've attached a short test program for numpy arrays of floats for which
> "&" and "|" seem to work. If, as you say, "&" and "|" don't work for
> floats, why does this program work?
>
> from numpy import *
>
> a = array([(1.1, 2.1),(3.1, 4.1)],'float')
> b = a + 1
> c = b + 1
>
> print "a = ",a
> print "b = ",b
> print "c = ",c
>
> mask = (a < 4.5) & (b < 4.5) & (c < 4.5)
> print "mask = ",mask
>
> print "masked a = ",a[mask]
> print "masked b = ",b[mask]
> print "masked c = ",c[mask]

a contains floats. But a < 4.5 doesn't:

>> a = np.array([(1.1, 2.1),(3.1, 4.1)],'float')
>> a
array([[ 1.1,  2.1],
       [ 3.1,  4.1]])
>> a < 4.5
array([[ True,  True],
       [ True,  True]], dtype=bool)
>> a | a
---------------------------------------------------------------------------
TypeError: unsupported operand type(s) for |: 'float' and 'float'

From robert.kern at gmail.com Wed Jul 9 15:45:32 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 9 Jul 2008 14:45:32 -0500
Subject: [Numpy-discussion] A couple of testing issues
In-Reply-To: <1d36917a0807091235i48cffb9r3826805f2fb7c288@mail.gmail.com>
References: <1d36917a0807082357n350c6bf5l5fd7aa40936ac02e@mail.gmail.com> <1d36917a0807091219m6275c7c9mfd9b3bbdcf9f60d5@mail.gmail.com> <3d375d730807091226q31ebdaf8t7c6f4d65666338d9@mail.gmail.com> <1d36917a0807091235i48cffb9r3826805f2fb7c288@mail.gmail.com>
Message-ID: <3d375d730807091245y473ca660ib327fdce587b3871@mail.gmail.com>

On Wed, Jul 9, 2008 at 14:35, Alan McIntyre wrote:

> On Wed, Jul 9, 2008 at 3:26 PM, Robert Kern wrote:
>> I don't think it's worth automating on every run.
>> People can see for themselves if they have any such check_* methods
>> and make the conversion once:
>
> Does this fall into the "how in the world should I have known to do
> that" category?

Doesn't matter. It doesn't work anyway; those arguments are for matching classes and module-level functions, not TestCase methods. Fortunately, grep works just as well.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From robert.kern at gmail.com Wed Jul 9 15:52:45 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 9 Jul 2008 14:52:45 -0500
Subject: [Numpy-discussion] alterdot and restoredot
In-Reply-To:
References: <3d375d730807082103x3f1f9a2emad53349d82912b1f@mail.gmail.com> <3d375d730807090243t5bae7058p7798799d586c207@mail.gmail.com>
Message-ID: <3d375d730807091252i25624563qa696e6c3f8cd0ab6@mail.gmail.com>

On Wed, Jul 9, 2008 at 06:36, Anne Archibald wrote:

> 2008/7/9 Robert Kern :
>
>>> - Which operations do the functions exactly affect?
>>> It seems that alterdot sets the "dot" function slot to a BLAS
>>> version, but what operations does this affect?
>>
>> dot(), vdot(), and innerproduct() on C-contiguous arrays which are
>> Matrix-Matrix, Matrix-Vector or Vector-Vector products.
>
> Really? Not, say, tensordot()?

If the ultimate dot() call inside tensordot() is one of the above forms, then yes. If it's a 3D-3D product, for example, or one where the shape manipulations leave the arrays discontiguous, then no.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From alan.mcintyre at gmail.com Wed Jul 9 16:16:05 2008
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Wed, 9 Jul 2008 16:16:05 -0400
Subject: [Numpy-discussion] Unused matrix method
Message-ID: <1d36917a0807091316y393a2735w35cf4149fd161b78@mail.gmail.com>

There's a _get_truendim method on matrix that isn't referenced anywhere in NumPy, SciPy, or matplotlib. Should this get deprecated or removed in 1.2?

From robert.kern at gmail.com Wed Jul 9 16:23:59 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 9 Jul 2008 15:23:59 -0500
Subject: [Numpy-discussion] Unused matrix method
In-Reply-To: <1d36917a0807091316y393a2735w35cf4149fd161b78@mail.gmail.com>
References: <1d36917a0807091316y393a2735w35cf4149fd161b78@mail.gmail.com>
Message-ID: <3d375d730807091323k16552e67o53c03233061c1e35@mail.gmail.com>

On Wed, Jul 9, 2008 at 15:16, Alan McIntyre wrote:

> There's a _get_truendim method on matrix that isn't referenced
> anywhere in NumPy, SciPy, or matplotlib. Should this get deprecated
> or removed in 1.2?

We could remove it. It's a private method.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco

From alan.mcintyre at gmail.com Wed Jul 9 16:26:01 2008
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Wed, 9 Jul 2008 16:26:01 -0400
Subject: [Numpy-discussion] Unused matrix method
In-Reply-To: <3d375d730807091323k16552e67o53c03233061c1e35@mail.gmail.com>
References: <1d36917a0807091316y393a2735w35cf4149fd161b78@mail.gmail.com> <3d375d730807091323k16552e67o53c03233061c1e35@mail.gmail.com>
Message-ID: <1d36917a0807091326t5e6f51d1y67177890d075cd01@mail.gmail.com>

On Wed, Jul 9, 2008 at 4:23 PM, Robert Kern wrote:

> On Wed, Jul 9, 2008 at 15:16, Alan McIntyre wrote:
>> There's a _get_truendim method on matrix that isn't referenced
>> anywhere in NumPy, SciPy, or matplotlib. Should this get deprecated
>> or removed in 1.2?
>
> We could remove it. It's a private method.

Done.

From marlin_rowley at hotmail.com Wed Jul 9 16:34:08 2008
From: marlin_rowley at hotmail.com (Marlin Rowley)
Date: Wed, 9 Jul 2008 15:34:08 -0500
Subject: [Numpy-discussion] Multiplying every 3 elements by a vector?
In-Reply-To:
References:
Message-ID:

Thanks Chuck, but I wasn't quite clear with my question.

You answered exactly according to what I asked, but I failed to mention needing the dot product instead of just the product.

So,

v dot A = v'

v'[0] = v[0]*A[0] + v[1]*A[1] + v[2]*A[2]
v'[1] = v[0]*A[3] + v[1]*A[4] + v[2]*A[5]
v'[2] = v[0]*A[6] + v[1]*A[7] + v[2]*A[8]

-M

Date: Wed, 9 Jul 2008 13:26:01 -0600
From: charlesr.harris at gmail.com
To: numpy-discussion at scipy.org
Subject: Re: [Numpy-discussion] Multiplying every 3 elements by a vector?

On Wed, Jul 9, 2008 at 1:16 PM, Marlin Rowley wrote:

All:

I'm trying to take a constant vector:

v = (0.122169, 0.61516, 0.262671)

and multiply those values by every 3 components in an array of length N:

A = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ....]

So what I want is:

v[0]*A[0]
v[1]*A[1]
v[2]*A[2]
v[0]*A[3]
v[1]*A[4]
v[2]*A[5]
v[0]*A[6]
...

How do I do this with one command in NumPy?

If the length of A is divisible by 3:

A.reshape((-1,3))*v

You might want to reshape the result to 1-D.

Chuck

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From alan.mcintyre at gmail.com Wed Jul 9 16:35:12 2008
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Wed, 9 Jul 2008 16:35:12 -0400
Subject: [Numpy-discussion] chararray constructor change
Message-ID: <1d36917a0807091335k41a2bc08wa1d98743a4ab02f@mail.gmail.com>

I'd like to make the following change to the chararray constructor. This is motivated by some of chararray's methods constructing new chararrays with NumPy integer arguments for itemsize, and it just seemed easier to fix this in the constructor.

Index: numpy/numpy/core/defchararray.py
===================================================================
--- numpy/numpy/core/defchararray.py    (revision 5378)
+++ numpy/numpy/core/defchararray.py    (working copy)
@@ -25,6 +25,11 @@
         else:
             dtype = string_

+        # force itemsize to be a Python long, since using Numpy integer
+        # types results in itemsize.itemsize being used as the size of
+        # strings in the new array.
+        itemsize = long(itemsize)
+
         _globalvar = 1
         if buffer is None:
             self = ndarray.__new__(subtype, shape, (dtype, itemsize),

From charlesr.harris at gmail.com Wed Jul 9 17:17:24 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 9 Jul 2008 15:17:24 -0600
Subject: [Numpy-discussion] Multiplying every 3 elements by a vector?
In-Reply-To:
References:
Message-ID:

On Wed, Jul 9, 2008 at 2:34 PM, Marlin Rowley wrote:

> Thanks Chuck, but I wasn't quite clear with my question.
>
> You answered exactly according to what I asked, but I failed to mention
> needing the dot product instead of just the product.
>
> So,
>
> v dot A = v'
>
> v'[0] = v[0]*A[0] + v[1]*A[1] + v[2]*A[2]
> v'[1] = v[0]*A[3] + v[1]*A[4] + v[2]*A[5]
> v'[2] = v[0]*A[6] + v[1]*A[7] + v[2]*A[8]

There is no built-in method for this specific problem (stacks of vectors and matrices), but you can make things work:

sum(A.reshape((-1,3))*v, axis=1)

You can do lots of interesting things using such manipulations and newaxis. For instance, multiplying stacks of matrices by stacks of matrices etc. I put up a post of such things once if you are interested.

Chuck

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From peridot.faceted at gmail.com Wed Jul 9 17:26:11 2008
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Wed, 9 Jul 2008 17:26:11 -0400
Subject: [Numpy-discussion] Multiplying every 3 elements by a vector?
In-Reply-To:
References:
Message-ID:

2008/7/9 Charles R Harris :

> On Wed, Jul 9, 2008 at 2:34 PM, Marlin Rowley wrote:
>>
>> Thanks Chuck, but I wasn't quite clear with my question.
>>
>> You answered exactly according to what I asked, but I failed to mention
>> needing the dot product instead of just the product.
>>
>> So,
>>
>> v dot A = v'
>>
>> v'[0] = v[0]*A[0] + v[1]*A[1] + v[2]*A[2]
>> v'[1] = v[0]*A[3] + v[1]*A[4] + v[2]*A[5]
>> v'[2] = v[0]*A[6] + v[1]*A[7] + v[2]*A[8]
>
> There is no built-in method for this specific problem (stacks of vectors and
> matrices), but you can make things work:
>
> sum(A.reshape((-1,3))*v, axis=1)
>
> You can do lots of interesting things using such manipulations and newaxis.
> For instance, multiplying stacks of matrices by stacks of matrices etc. I
> put up a post of such things once if you are interested.

This particular instance can be viewed as a matrix multiplication (np.dot(A.reshape((-1,3)), v), I think).

Anne

From charlesr.harris at gmail.com Wed Jul 9 17:44:13 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 9 Jul 2008 15:44:13 -0600
Subject: [Numpy-discussion] Multiplying every 3 elements by a vector?
In-Reply-To:
References:
Message-ID:

On Wed, Jul 9, 2008 at 3:26 PM, Anne Archibald wrote:

> 2008/7/9 Charles R Harris :
>> On Wed, Jul 9, 2008 at 2:34 PM, Marlin Rowley wrote:
>>>
>>> Thanks Chuck, but I wasn't quite clear with my question.
>>>
>>> You answered exactly according to what I asked, but I failed to mention
>>> needing the dot product instead of just the product.
>>
>> There is no built-in method for this specific problem (stacks of vectors
>> and matrices), but you can make things work:
>>
>> sum(A.reshape((-1,3))*v, axis=1)
>>
>> You can do lots of interesting things using such manipulations and
>> newaxis. For instance, multiplying stacks of matrices by stacks of
>> matrices etc. I put up a post of such things once if you are interested.
>
> This particular instance can be viewed as a matrix multiplication
> (np.dot(A.reshape((-1,3)), v), I think).

Yep, that should work.

Chuck

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From peridot.faceted at gmail.com Wed Jul 9 19:55:10 2008
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Wed, 9 Jul 2008 19:55:10 -0400
Subject: [Numpy-discussion] "expected a single-segment buffer object"
Message-ID:

Hi,

When trying to construct an ndarray, I sometimes run into the more-or-less mystifying error "expected a single-segment buffer object":

Out[54]: (0, 16, 8)

In [55]: A=np.zeros(2); A=A[np.newaxis,...]; np.ndarray(strides=A.strides,shape=A.shape,buffer=A,dtype=A.dtype)
---------------------------------------------------------------------------
Traceback (most recent call last)

/home/peridot/ in ()

: expected a single-segment buffer object

In [56]: A.strides
Out[56]: (0, 8)

That is, when I try to construct an ndarray based on an array with a zero stride, I get this mystifying error. Zero-strided arrays appear naturally when one uses newaxis, but they are valuable in their own right (for example for broadcasting purposes). So it's a bit awkward to have this error appearing when one tries to feed them to ndarray.__new__ as a buffer. I can, I think, work around it by removing all axes with stride zero:

def bufferize(A):
    idx = []
    for v in A.strides:
        if v == 0:
            idx.append(0)
        else:
            idx.append(slice(None, None, None))
    return A[tuple(idx)]

Is there any reason for this restriction?

Thanks,
Anne

From robert.kern at gmail.com Wed Jul 9 21:42:14 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 9 Jul 2008 20:42:14 -0500
Subject: [Numpy-discussion] "expected a single-segment buffer object"
In-Reply-To:
References:
Message-ID: <3d375d730807091842s710ba524o7a1fb77124140d0d@mail.gmail.com>

On Wed, Jul 9, 2008 at 18:55, Anne Archibald wrote:

> Hi,
>
> When trying to construct an ndarray, I sometimes run into the
> more-or-less mystifying error "expected a single-segment buffer
> object":
>
> Out[54]: (0, 16, 8)
> In [55]: A=np.zeros(2); A=A[np.newaxis,...];
> np.ndarray(strides=A.strides,shape=A.shape,buffer=A,dtype=A.dtype)
> ---------------------------------------------------------------------------
> Traceback (most recent call last)
>
> /home/peridot/ in ()
>
> : expected a single-segment buffer object
>
> In [56]: A.strides
> Out[56]: (0, 8)
>
> That is, when I try to construct an ndarray based on an array with a
> zero stride, I get this mystifying error. Zero-strided arrays appear
> naturally when one uses newaxis, but they are valuable in their own
> right (for example for broadcasting purposes). So it's a bit awkward
> to have this error appearing when one tries to feed them to
> ndarray.__new__ as a buffer. I can, I think, work around it by
> removing all axes with stride zero:
>
> def bufferize(A):
>     idx = []
>     for v in A.strides:
>         if v == 0:
>             idx.append(0)
>         else:
>             idx.append(slice(None, None, None))
>     return A[tuple(idx)]
>
> Is there any reason for this restriction?

Yes, the buffer interface, at least the subset that ndarray() consumes, requires that all of the data be contiguous in memory.
array_as_buffer() checks for that using PyArray_ISONESEGMENT(), which looks like this:

#define PyArray_ISONESEGMENT(m) (PyArray_NDIM(m) == 0 || \
                                 PyArray_CHKFLAGS(m, NPY_CONTIGUOUS) || \
                                 PyArray_CHKFLAGS(m, NPY_FORTRAN))

Trying to get a buffer object from anything that is neither C- nor Fortran-contiguous will fail. E.g.

In [1]: from numpy import *

In [2]: A = arange(10)

In [3]: B = A[::2]

In [4]: ndarray(strides=B.strides, shape=B.shape, buffer=B, dtype=B.dtype)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)

/Users/rkern/today/ in ()

TypeError: expected a single-segment buffer object

What is the use case, here? One rarely has to use the ndarray constructor by itself. For example, the result you seem to want from the call you make above can be done just fine with .view().

In [8]: C = B.view(ndarray)

In [9]: C
Out[9]: array([0, 2, 4, 6, 8])

In [10]: B
Out[10]: array([0, 2, 4, 6, 8])

In [11]: C is B
Out[11]: False

In [12]: B[0] = 10

In [13]: C
Out[13]: array([10, 2, 4, 6, 8])

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From peridot.faceted at gmail.com Wed Jul 9 22:29:38 2008
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Wed, 9 Jul 2008 22:29:38 -0400
Subject: [Numpy-discussion] "expected a single-segment buffer object"
In-Reply-To: <3d375d730807091842s710ba524o7a1fb77124140d0d@mail.gmail.com>
References: <3d375d730807091842s710ba524o7a1fb77124140d0d@mail.gmail.com>
Message-ID:

2008/7/9 Robert Kern :

> Yes, the buffer interface, at least the subset that ndarray()
> consumes, requires that all of the data be contiguous in memory.
> array_as_buffer() checks for that using PyArray_ISONESEGMENT(), which
> looks like this:
>
> #define PyArray_ISONESEGMENT(m) (PyArray_NDIM(m) == 0 || \
>                                  PyArray_CHKFLAGS(m, NPY_CONTIGUOUS) || \
>                                  PyArray_CHKFLAGS(m, NPY_FORTRAN))
>
> Trying to get a buffer object from anything that is neither C- nor
> Fortran-contiguous will fail. E.g.
>
> In [1]: from numpy import *
>
> In [2]: A = arange(10)
>
> In [3]: B = A[::2]
>
> In [4]: ndarray(strides=B.strides, shape=B.shape, buffer=B, dtype=B.dtype)
> ---------------------------------------------------------------------------
> TypeError                                 Traceback (most recent call last)
>
> /Users/rkern/today/ in ()
>
> TypeError: expected a single-segment buffer object

Is this really necessary? What does making this restriction gain? It certainly means that many arrays whose storage is a contiguous block of memory can still not be used (just permute the axes of a 3d array, say; it may even be possible for an array to be in C-contiguous order but for the flag not to be set), but how is one to construct exotic slices of an array that is strided in memory? (The real part of a complex array, say.)
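
For instance (a quick demonstration of both cases, on throwaway arrays; strides shown for a complex128 build):

>>> import numpy as np
>>> A = np.zeros((2,3,4)).transpose(1,0,2)
>>> A.flags['C_CONTIGUOUS'], A.flags['F_CONTIGUOUS']
(False, False)
>>> Z = np.zeros(4, dtype=complex)
>>> Z.real.strides    # stride (16) is twice the float itemsize -- not one segment
(16,)
>>> Z.real.flags['C_CONTIGUOUS']
False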
For example, the result you seem to want from > the call you make above can be done just fine with .view(). I was presenting a simple example. I was actually trying to use zero-strided arrays to implement broadcasting. The code was rather long, but essentially what it was meant to do was generate a view of an array in which an axis of length one had been replaced by an axis of length m with stride zero. (The point of all this was to create a class like vectorize that was suitable for use on, for example, np.linalg.inv().) But I also ran into this problem while writing segmentaxis.py, the code to produce a "matrix" of sliding windows. (See http://www.scipy.org/Cookbook/SegmentAxis .) There I caught the exception and copied the array (unnecessarily) if this came up. Anne From robert.kern at gmail.com Wed Jul 9 23:03:18 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 9 Jul 2008 22:03:18 -0500 Subject: [Numpy-discussion] "expected a single-segment buffer object" In-Reply-To: References: <3d375d730807091842s710ba524o7a1fb77124140d0d@mail.gmail.com> Message-ID: <3d375d730807092003k2b8fad96w4f25b1d762fbbec@mail.gmail.com> On Wed, Jul 9, 2008 at 21:29, Anne Archibald wrote: > 2008/7/9 Robert Kern : > >> Yes, the buffer interface, at least the subset that ndarray() >> consumes, requires that all of the data be contiguous in memory. >> array_as_buffer() checks for that using PyArray_ISONE_SEGMENT(), which >> looks like this: >> >> #define PyArray_ISONESEGMENT(m) (PyArray_NDIM(m) == 0 || \ >> PyArray_CHKFLAGS(m, NPY_CONTIGUOUS) || \ >> PyArray_CHKFLAGS(m, NPY_FORTRAN)) >> >> Trying to get a buffer object from anything that is neither C- or >> Fortran-contiguous will fail. E.g. >> >> In [1]: from numpy import * >> >> In [2]: A = arange(10) >> >> In [3]: B = A[::2] >> >> In [4]: ndarray(strides=B.strides, shape=B.shape, buffer=B, dtype=B.dtype) >> --------------------------------------------------------------------------- >> TypeError Traceback (most recent call last) >> >> /Users/rkern/today/ in () >> >> TypeError: expected a single-segment buffer object > > Is this really necessary? What does making this restriction gain? It > certainly means that many arrays whose storage is a contiguous block > of memory can still not be used (just permute the axes of a 3d array, > say; it may even be possible for an array to be in C contiguous order > but for the flag not to be set), but how is one to construct exotic > slices of an array that is strided in memory? (The real part of a > complex array, say.) Because that's just what a buffer= argument *is*. It is not a place for presenting the starting pointer to exotically-strided memory. Use __array_interface__s to describe the full range of representable memory. See below. > I suppose one could follow the linked list of .bases up to the > original ndarray, which should normally be C- or Fortran-contiguous, > then work out the offset, but even this may not always work: what if > the original array was constructed with non-C-contiguous strides from > some preexisting buffer? > > If the concern is that this allows users to shoot themselves in the > foot, it's worth noting that even with the current setup you can > easily fabricate strides and shapes that go outside the allocated part > of memory. > >> What is the use case, here? One rarely has to use the ndarray >> constructor by itself. For example, the result you seem to want from >> the call you make above can be done just fine with .view(). > > I was presenting a simple example. 
> I was actually trying to use zero-strided arrays to implement
> broadcasting. The code was rather long, but essentially what it was
> meant to do was generate a view of an array in which an axis of length
> one had been replaced by an axis of length m with stride zero. (The
> point of all this was to create a class like vectorize that was
> suitable for use on, for example, np.linalg.inv().) But I also ran
> into this problem while writing segmentaxis.py, the code to produce a
> "matrix" of sliding windows. (See
> http://www.scipy.org/Cookbook/SegmentAxis .) There I caught the
> exception and copied the array (unnecessarily) if this came up.

I was about a week ahead of you. See numpy/lib/stride_tricks.py in the trunk.
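
It lets you build exactly these kinds of zero-strided views without going through the buffer interface. A rough sketch of its use (assuming the broadcast_arrays() helper that module exposes):

>>> import numpy as np
>>> from numpy.lib.stride_tricks import broadcast_arrays
>>> x = np.arange(3).reshape(3, 1)
>>> y = np.arange(4)
>>> X, Y = broadcast_arrays(x, y)
>>> X.shape, Y.shape
((3, 4), (3, 4))
>>> Y.strides[0]    # the broadcast axis is a zero stride -- no copy
0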
--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From chanley at stsci.edu Thu Jul 10 10:06:29 2008
From: chanley at stsci.edu (Christopher Hanley)
Date: Thu, 10 Jul 2008 10:06:29 -0400
Subject: [Numpy-discussion] new chararray test fails on Mac OS X
Message-ID: <48761765.2020100@stsci.edu>

From the svn log it looks like the tests are intended to fail? However, I would prefer tests that are designed to fail only when problems occur. Otherwise we end up with problems with our automatic build and test scripts.

Thanks,
Chris

======================================================================
FAIL: test_mul (test_defchararray.TestOperations)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/chanley/dev/site-packages/lib/python/numpy/core/tests/test_defchararray.py", line 49, in test_mul
    assert all(Ar == (self.A * r))
AssertionError

======================================================================
FAIL: test_rmul (test_defchararray.TestOperations)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/chanley/dev/site-packages/lib/python/numpy/core/tests/test_defchararray.py", line 65, in test_rmul
    assert all(Ar == (r * self.A))
AssertionError

----------------------------------------------------------------------
Ran 1930 tests in 13.130s

FAILED (failures=2)

>>> n.__version__
'1.2.0.dev5385'

--
Christopher Hanley
Systems Software Engineer
Space Telescope Science Institute
3700 San Martin Drive
Baltimore MD, 21218
(410) 338-4338

From alan.mcintyre at gmail.com Thu Jul 10 10:26:17 2008
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Thu, 10 Jul 2008 10:26:17 -0400
Subject: [Numpy-discussion] new chararray test fails on Mac OS X
In-Reply-To: <48761765.2020100@stsci.edu>
References: <48761765.2020100@stsci.edu>
Message-ID: <1d36917a0807100726x5cbe021bh5976a467fe2f7683@mail.gmail.com>

On Thu, Jul 10, 2008 at 10:06 AM, Christopher Hanley wrote:

> From the svn log it looks like the tests are intended to fail? However,
> I would prefer tests that are designed to fail only when problems
> occur. Otherwise we end up with problems with our automatic build and
> test scripts.

Sorry about that, I just wanted to make sure the tests would actually fail on all the builders before the bug in chararray is fixed. I just checked in a change that disables the failing portions of the test.

From peridot.faceted at gmail.com Thu Jul 10 11:33:58 2008
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Thu, 10 Jul 2008 11:33:58 -0400
Subject: [Numpy-discussion] "expected a single-segment buffer object"
In-Reply-To: <3d375d730807092003k2b8fad96w4f25b1d762fbbec@mail.gmail.com>
References: <3d375d730807091842s710ba524o7a1fb77124140d0d@mail.gmail.com> <3d375d730807092003k2b8fad96w4f25b1d762fbbec@mail.gmail.com>
Message-ID:

2008/7/9 Robert Kern :

> Because that's just what a buffer= argument *is*. It is not a place
> for presenting the starting pointer to exotically-strided memory. Use
> __array_interface__s to describe the full range of representable
> memory. See below.

Aha! Is this stuff documented somewhere?

> I was about a week ahead of you. See numpy/lib/stride_tricks.py in the trunk.

Nice! Unfortunately it can't quite do what I want... for the linear algebra I need something that can broadcast all but certain axes. For example, take an array of matrices and an array of vectors. The "array" axes need broadcasting, but you can't broadcast on all axes without (incorrectly) turning the vector into a matrix. I've written a (messy) implementation, but the corner cases are giving me headaches. I'll let you know when I have something that works.

Thanks,
Anne

From charlesr.harris at gmail.com Thu Jul 10 11:55:35 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 10 Jul 2008 09:55:35 -0600
Subject: [Numpy-discussion] "expected a single-segment buffer object"
In-Reply-To:
References: <3d375d730807091842s710ba524o7a1fb77124140d0d@mail.gmail.com> <3d375d730807092003k2b8fad96w4f25b1d762fbbec@mail.gmail.com>
Message-ID:

On Thu, Jul 10, 2008 at 9:33 AM, Anne Archibald wrote:

> 2008/7/9 Robert Kern :
>
>> Because that's just what a buffer= argument *is*. It is not a place
>> for presenting the starting pointer to exotically-strided memory. Use
>> __array_interface__s to describe the full range of representable
>> memory. See below.
>
> Aha! Is this stuff documented somewhere?
>
>> I was about a week ahead of you. See numpy/lib/stride_tricks.py in the trunk.
>
> Nice! Unfortunately it can't quite do what I want... for the linear
> algebra I need something that can broadcast all but certain axes. For
> example, take an array of matrices and an array of vectors. The
> "array" axes need broadcasting, but you can't broadcast on all axes
> without (incorrectly) turning the vector into a matrix. I've written a
> (messy) implementation, but the corner cases are giving me headaches.
> I'll let you know when I have something that works.

I think something like a matrix/vector dtype would be another way to go for stacks of matrices and vectors. It would have to be a user defined type to fit into the current type hierarchy for ufuncs, but I think the base machinery is there with the generic inner loops.

Chuck

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dtlussier at gmail.com Thu Jul 10 12:38:43 2008
From: dtlussier at gmail.com (Dan Lussier)
Date: Thu, 10 Jul 2008 17:38:43 +0100
Subject: [Numpy-discussion] huge array calculation speed
Message-ID: <12D2AD1D-7742-4248-9C5D-9EF3EE3CEE06@gmail.com>

Hello,

I am relatively new to numpy and am having trouble with the speed of a specific array-based calculation that I'm trying to do.

What I'm trying to do is to calculate the total potential energy and coordination number of each atom within a relatively large simulation.
Each atom is at a position (x,y,z) given by a row in a large array (approximately 1e6 by 3), and presently I have no information about its nearest neighbours, so each atom's position must be checked against all others before cutting the list down prior to calculating the energy.

My current numpy code is below; in it I have tried to push as much of the computation as possible into numpy's compiled C code. However, it is still taking a very long time, at approximately 1 second per row. At this speed even simple speed-ups like weave won't really help. Is there another way to calculate this in numpy, or should I start looking at going to C/C++?

I am also left wondering if there is something wrong with my installation of numpy (i.e. not making use of lapack/ATLAS). In case it is relevant: I am running 32-bit x86 CentOS 5.1 Linux on a dual Xeon 3.0 GHz Dell tower with 4 GB of memory.

Any suggestions of things to try would be great.

Dan

===============================================================================

import numpy

def epairANDcoord(xyz, cutoff=4.0):
    box_length = 160.0
    cut2 = cutoff**2
    ljTotal = numpy.zeros(xyz.shape[0])
    coord = numpy.zeros(xyz.shape[0])
    i = 0
    for r0 in xyz:
        r2 = r0 - xyz
        # application of minimum image convention so each atom interacts
        # with nearest periodic image w/i box_length/2
        r2 -= box_length*numpy.rint(r2/box_length)
        r2 = numpy.power(r2,2).sum(axis=1)
        r2 = numpy.extract(r2 < cut2, r2)
        # NOTE: the two lines below were mangled in the archive; this is a
        # guess consistent with the follow-ups (a Lennard-Jones 12-6 sum
        # over the squared distances, plus a neighbour count)
        ljTotal[i] = 2.0*(numpy.power(r2,-6) - numpy.power(r2,-3)).sum()
        coord[i] = r2.shape[0]
        i += 1
        if i > 25: break  # used to cut down the length of calculation as needed
    return ljTotal, coord

From charlesr.harris at gmail.com Thu Jul 10 13:35:41 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 10 Jul 2008 11:35:41 -0600
Subject: [Numpy-discussion] huge array calculation speed
In-Reply-To: <12D2AD1D-7742-4248-9C5D-9EF3EE3CEE06@gmail.com>
References: <12D2AD1D-7742-4248-9C5D-9EF3EE3CEE06@gmail.com>
Message-ID:

On Thu, Jul 10, 2008 at 10:38 AM, Dan Lussier wrote:

> Hello,
>
> I am relatively new to numpy and am having trouble with the speed of
> a specific array-based calculation that I'm trying to do.
>
> What I'm trying to do is to calculate the total potential
> energy and coordination number of each atom within a relatively large
> simulation. Each atom is at a position (x,y,z) given by a row in a
> large array (approximately 1e6 by 3), and presently I have no
> information about its nearest neighbours, so each atom's position must
> be checked against all others before cutting the list down prior to
> calculating the energy.
URL: From stefan at sun.ac.za Thu Jul 10 14:03:26 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 10 Jul 2008 20:03:26 +0200 Subject: [Numpy-discussion] "expected a single-segment buffer object" In-Reply-To: <3d375d730807092003k2b8fad96w4f25b1d762fbbec@mail.gmail.com> References: <3d375d730807091842s710ba524o7a1fb77124140d0d@mail.gmail.com> <3d375d730807092003k2b8fad96w4f25b1d762fbbec@mail.gmail.com> Message-ID: <9457e7c80807101103obd957cdgad50a820a63da07d@mail.gmail.com> 2008/7/10 Robert Kern : > I was about a week ahead of you. See numpy/lib/stride_tricks.py in the trunk. Robert, this is fantastic! I think people are going to enjoy your talk at SciPy'08. If you want, we could also tutor this in the advanced NumPy session. Cheers St?fan From Nicolas.Rougier at loria.fr Thu Jul 10 14:48:18 2008 From: Nicolas.Rougier at loria.fr (Nicolas Rougier) Date: Thu, 10 Jul 2008 20:48:18 +0200 Subject: [Numpy-discussion] python user defined type Message-ID: <1215715698.6369.7.camel@oxygen> Hi all, I'm rather new to the list so maybe the question is well known but from my researches on the web and list archive, I did not find a clear explanation. I would like to create numpy array with my own (python) datatype, so I tried the obvious solution: from numpy import * class Unit(object): def __init__(self,value=0.0): self.value = value def __float__(self): return self.value def __repr__(self): return str(self.value) a = array([[Unit(1), Unit(2)], [Unit(3), Unit(4)]]) print a print a.dtype.name print a[0,0].value #a[0,0] = 0 #print a[0,0].value a = array (array([[1,2],[3,4]]), dtype=Unit) print a print a.dtype print a[0,0].value but the last print make python to raise an AttributeError stating that int object do not have a 'value' attribute. I guess I missed something concerning numpy object data type. Do I need to implement some special methods in Unit to make numpy happy ? Also, the commented line (a[0,0] = 0) makes the item to become an int while I need to set the value of the item instead, is that possible ? Nicolas From bsouthey at gmail.com Thu Jul 10 15:13:00 2008 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 10 Jul 2008 14:13:00 -0500 Subject: [Numpy-discussion] huge array calculation speed In-Reply-To: References: <12D2AD1D-7742-4248-9C5D-9EF3EE3CEE06@gmail.com> Message-ID: <48765F3C.10903@gmail.com> Charles R Harris wrote: > > > On Thu, Jul 10, 2008 at 10:38 AM, Dan Lussier > wrote: > > Hello, > > I am relatively new to numpy and am having trouble with the speed of > a specific array based calculation that I'm trying to do. > > What I'm trying to do is to calculate the total total potential > energy and coordination number of each atom within a relatively large > simulation. Each atom is at a position (x,y,z) given by a row in a > large array (approximately 1e6 by 3) and presently I have no > information about its nearest neighbours so each its position must be > checked against all others before cutting the list down prior to > calculating the energy. > > > This looks to be O(n^2) and might well be the bottle neck. There are > various ways to speed up such things but more information would help > determine the method, i.e., is this operation within a loop so that > the values change a lot. However, one quick thing to try is a sort on > one of the coordinates so you only need to check a subset of the > vectors. Searchsorted could be useful here also. 
>
> Chuck
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>

If I understand correctly, I notice that you are doing many scalar
multiplications where some are the same or constants. For example, you
compute r2*r2 a few times:

numpy.power(r2,2) = r2*r2
numpy.power(r2,3) = r2*r2*r2
numpy.power(r2,6) = r2*r2*r2*r2*r2*r2

But you don't need this numpy.power(r2,6) as it can be factored. Also
the division is unnecessary. So the ljTotal is really:

ljTotal[i] = 2.0*((numpy.power(r2,-3)*(numpy.power(r2,-3)-1)).sum(axis=0))

However, the real problem is the handling of the coordinates, which I do
not follow and which limits how you can remove the loop over xyz. Like,
why the need for the repeated (r0-xyz)? This is a huge cost that you
need to avoid, perhaps by just extracting those elements within your
criteria.

Bruce

From peridot.faceted at gmail.com  Thu Jul 10 16:25:16 2008
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Thu, 10 Jul 2008 16:25:16 -0400
Subject: [Numpy-discussion] "expected a single-segment buffer object"
In-Reply-To: 
References: <3d375d730807091842s710ba524o7a1fb77124140d0d@mail.gmail.com>
	<3d375d730807092003k2b8fad96w4f25b1d762fbbec@mail.gmail.com>
Message-ID: 

2008/7/10 Charles R Harris :
> On Thu, Jul 10, 2008 at 9:33 AM, Anne Archibald wrote:
>>
>> 2008/7/9 Robert Kern :
>>
>> > Because that's just what a buffer= argument *is*. It is not a place
>> > for presenting the starting pointer to exotically-strided memory. Use
>> > __array_interface__s to describe the full range of representable
>> > memory. See below.
>>
>> Aha! Is this stuff documented somewhere?
>>
>> > I was about a week ahead of you. See numpy/lib/stride_tricks.py in the
>> > trunk.
>>
>> Nice! Unfortunately it can't quite do what I want... for the linear
>> algebra I need something that can broadcast all but certain axes. For
>> example, take an array of matrices and an array of vectors. The
>> "array" axes need broadcasting, but you can't broadcast on all axes
>> without (incorrectly) turning the vector into a matrix. I've written a
>> (messy) implementation, but the corner cases are giving me headaches.
>> I'll let you know when I have something that works.
>
> I think something like a matrix/vector dtype would be another way to go for
> stacks of matrices and vectors. It would have to be a user defined type to
> fit into the current type hierarchy for ufuncs, but I think the base
> machinery is there with the generic inner loops.

There's definitely room for discussion about how such a linear algebra
system should ultimately work. In the short term, though, I think it's
valuable to create a prototype system, even if much of the looping has
to be in python, just to figure out how it should look to the user.

For example, I'm not sure whether a matrix/vector dtype would make easy
things like indexing operations - fancy indexing spanning both array and
matrix axes, for example. dtypes can also be a little impenetrable to
use, so even if this is how elementwise linear algebra is ultimately
implemented, we may want to provide a different user frontend.

My idea for a first sketch was simply to provide (for example)
np.elementwise_linalg.dot_mm, that interprets its arguments both as
arrays of matrices and yields an array of matrices.
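Something like this minimal, untested sketch, using nothing but
broadcasting (dot_mm here is just the placeholder name from above):

import numpy as np

def dot_mm(a, b):
    # a has shape (..., m, n) and b has shape (..., n, p); the leading
    # "array" axes broadcast against each other while the last two axes
    # are treated as matrix axes
    a = np.asarray(a)
    b = np.asarray(b)
    # line up the contraction axis using size-1 axes, then sum it out
    return (a[..., :, :, np.newaxis] * b[..., np.newaxis, :, :]).sum(axis=-2)

For instance, dot_mm on shapes (10, 3, 4) and (10, 4, 5) gives an array
of ten 3x5 matrix products. (This materializes the whole (..., m, n, p)
intermediate, so it is only a prototype, not an efficient implementation.)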
A second sketch might be a subclass MatrixArray, which could serve much
as Matrix does now, to tell various functions which axes to do linear
algebra over and which axes to iterate over.

There's also the question of how to make these efficient; one could of
course write a C wrapper for each linear algebra function that simply
iterates, but your suggestion of using the ufunc machinery with a matrix
dtype might be better. Or, if cython acquires sensible polymorphism over
numpy dtypes, a cython wrapper might be the way to go. But I think
establishing what it should look like to users is a good first step, and
I think that's best done with sample implementations.

Anne

From Chris.Barker at noaa.gov  Thu Jul 10 16:43:18 2008
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Thu, 10 Jul 2008 13:43:18 -0700
Subject: [Numpy-discussion] python user defined type
In-Reply-To: <1215715698.6369.7.camel@oxygen>
References: <1215715698.6369.7.camel@oxygen>
Message-ID: <48767466.6000504@noaa.gov>

Nicolas Rougier wrote:
> I would like to create numpy array with my own (python) datatype, so I
> tried the obvious solution:
>
> from numpy import *
> class Unit(object):
>     def __init__(self,value=0.0):
>         self.value = value
>     def __float__(self):
>         return self.value
>     def __repr__(self):
>         return str(self.value)

> a = array (array([[1,2],[3,4]]), dtype=Unit)

the dtype argument is designed to take a numpy type object, not an
arbitrary class -- what you want in this case is dtype=numpy.object,
which is what you did before. I'm surprised this didn't raise an error,
but it looks like you got an object array, and the objects you gave it
are python integers. All python classes are "objects" as far as numpy is
concerned. The numpy dtypes are a description of a given number of bytes
in memory -- python classes are stored as a pointer to the python object.

(and you really don't want to use "import *")

> Also, the commented line (a[0,0] = 0) makes the item to become an int
> while I need to set the value of the item instead, is that possible ?

a[0,0] = Unit(0)

You're setting the [0,0]th element to an object, so you need to give it
the object you want; the literal "0" is a python integer with the value
zero.

numpy arrays of python objects act a whole lot like other python
containers. What would you expect from:

a = [1,2,3,4]

a list of integers, no?

or

a = [Unit(1), Unit(2)]

# a list of Units..

then

# a[0] = 3

now a list with the integer 3 in the zeroth position, and a Unit in the
1st.

You did it right the first time:

a = array([[Unit(1), Unit(2)], [Unit(3), Unit(4)]])

though you need to be careful building object arrays of nested lists --
numpy won't necessarily figure out how to de-nest it. You might want to
do:

 >>> import numpy as np
 >>> a = np.empty((2,2), np.object)
 >>> a
array([[None, None],
       [None, None]], dtype=object)
 >>> a[:,:] = [[Unit(1), Unit(2)], [Unit(3), Unit(4)]]
 >>> a
array([[1, 2],
       [3, 4]], dtype=object)

One more note:

class Unit(object):
    def __init__(self,value=0.0):
        self.value = value
    def __float__(self):
        return self.value
    def __repr__(self):
        return str(self.value)

__repr__ really should be something like:

    def __repr__(self):
        return "Unit(%g)" % self.value

eval(repr(Object)) == Object, ideally, plus it'll be easier to debug if
you can see what it is.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From peridot.faceted at gmail.com Thu Jul 10 16:55:26 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Thu, 10 Jul 2008 16:55:26 -0400 Subject: [Numpy-discussion] huge array calculation speed In-Reply-To: <12D2AD1D-7742-4248-9C5D-9EF3EE3CEE06@gmail.com> References: <12D2AD1D-7742-4248-9C5D-9EF3EE3CEE06@gmail.com> Message-ID: 2008/7/10 Dan Lussier : > I am relatively new to numpy and am having trouble with the speed of > a specific array based calculation that I'm trying to do. > > What I'm trying to do is to calculate the total total potential > energy and coordination number of each atom within a relatively large > simulation. Each atom is at a position (x,y,z) given by a row in a > large array (approximately 1e6 by 3) and presently I have no > information about its nearest neighbours so each its position must be > checked against all others before cutting the list down prior to > calculating the energy. The way you've implemented this goes as the square of the number of atoms. This is going to absolutely kill performance, and you can spend weeks trying to optimize the code for a factor of two speedup. I would look hard for algorithms that do this in less than O(n^2). This problem of finding the neighbours of an object has seen substantial research, and there are a variety of well-established techniques covering many possible situations. My knowledge of it is far from up-to-date, but since you have only a three-dimensional problem, you could try a three-dimensional grid (if your atoms aren't too clumpy) or octrees (if they are somewhat clumpy); kd-trees are probably overkill (since they're designed for higher-dimensional problems). > My current numpy code is below and in it I have tried to push as much > of the work for this computation into compiled C (ideally) of numpy. > However it is still taking a very long time at approx 1 second per > row. At this speed even some simple speed ups like weave won't > really help. > > Are there any other numpy ways that it would be possible to calculate > this, or should I start looking at going to C/C++? > > I am also left wondering if there is something wrong with my > installation of numpy (i.e. not making use of lapack/ATLAS). Other > than that if it is relevant - I am running 32 bit x86 Centos 5.1 > linux on a dual Xeon 3.0 GHz Dell tower with 4 GB of memory. Unfortunately, implementing most of the algorithms in the literature within numpy is going to be fairly cumbersome. But I think there are better algorithms you could try: * Put all your atoms in a grid. Ideally the cells would be about the size of your cutoff radius, so that for each atom you need to pull out all atoms in at most eight cells for checking against the cutoff. I think this can be done fairly efficiently in numpy. * Sort the atoms along a coordinate and use searchsorted to pull out those for which that coordinate is within the cutoff. This should get you down to reasonably modest lists fairly quickly. There is of course a tradeoff between preprocessing time and lookup time. We seem to get quite a few posts from people wanting some kind of spatial data structure (whether they know it or not). Would it make sense to come up with some kind of compiled spatial data structure to include in scipy? 
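To make the sorting idea concrete, a rough and untested sketch (the
helper name is made up, and I'm ignoring the periodic boundary
conditions):

import numpy as np

def coordination_counts(xyz, cutoff):
    # sort atoms by their x coordinate; only atoms whose x lies within
    # +-cutoff of a given atom can possibly be within the cutoff distance
    order = np.argsort(xyz[:, 0])
    s = xyz[order]
    x = s[:, 0]
    counts = np.empty(len(s), dtype=int)
    for i in xrange(len(s)):
        lo = np.searchsorted(x, x[i] - cutoff, side='left')
        hi = np.searchsorted(x, x[i] + cutoff, side='right')
        d2 = ((s[lo:hi] - s[i])**2).sum(axis=1)
        counts[i] = (d2 < cutoff**2).sum() - 1  # don't count the atom itself
    return counts, order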
Anne From charlesr.harris at gmail.com Thu Jul 10 17:30:36 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 10 Jul 2008 15:30:36 -0600 Subject: [Numpy-discussion] huge array calculation speed In-Reply-To: References: <12D2AD1D-7742-4248-9C5D-9EF3EE3CEE06@gmail.com> Message-ID: On Thu, Jul 10, 2008 at 2:55 PM, Anne Archibald wrote: > 2008/7/10 Dan Lussier : > > > I am relatively new to numpy and am having trouble with the speed of > > a specific array based calculation that I'm trying to do. > > > > What I'm trying to do is to calculate the total total potential > > energy and coordination number of each atom within a relatively large > > simulation. Each atom is at a position (x,y,z) given by a row in a > > large array (approximately 1e6 by 3) and presently I have no > > information about its nearest neighbours so each its position must be > > checked against all others before cutting the list down prior to > > calculating the energy. > > The way you've implemented this goes as the square of the number of > atoms. This is going to absolutely kill performance, and you can spend > weeks trying to optimize the code for a factor of two speedup. I would > look hard for algorithms that do this in less than O(n^2). > > This problem of finding the neighbours of an object has seen > substantial research, and there are a variety of well-established > techniques covering many possible situations. My knowledge of it is > far from up-to-date, but since you have only a three-dimensional > problem, you could try a three-dimensional grid (if your atoms aren't > too clumpy) or octrees (if they are somewhat clumpy); kd-trees are > probably overkill (since they're designed for higher-dimensional > problems). > > > My current numpy code is below and in it I have tried to push as much > > of the work for this computation into compiled C (ideally) of numpy. > > However it is still taking a very long time at approx 1 second per > > row. At this speed even some simple speed ups like weave won't > > really help. > > > > Are there any other numpy ways that it would be possible to calculate > > this, or should I start looking at going to C/C++? > > > > I am also left wondering if there is something wrong with my > > installation of numpy (i.e. not making use of lapack/ATLAS). Other > > than that if it is relevant - I am running 32 bit x86 Centos 5.1 > > linux on a dual Xeon 3.0 GHz Dell tower with 4 GB of memory. > > Unfortunately, implementing most of the algorithms in the literature > within numpy is going to be fairly cumbersome. But I think there are > better algorithms you could try: > > * Put all your atoms in a grid. Ideally the cells would be about the > size of your cutoff radius, so that for each atom you need to pull out > all atoms in at most eight cells for checking against the cutoff. I > think this can be done fairly efficiently in numpy. > > * Sort the atoms along a coordinate and use searchsorted to pull out > those for which that coordinate is within the cutoff. This should get > you down to reasonably modest lists fairly quickly. > > There is of course a tradeoff between preprocessing time and lookup time. > > We seem to get quite a few posts from people wanting some kind of > spatial data structure (whether they know it or not). Would it make > sense to come up with some kind of compiled spatial data structure to > include in scipy? > I think there should be a "computer science" module containing such things as r-b trees, k-d trees, equivalence relations, and so forth. 
For this problem one could probably make a low resolution cube of list objects and index atoms into the appropriate list by severe rounding. I have done similar things for indexing stars in a (fairly) wide field image and it worked well. But I think the sorting approach would be a good first try here. Argsort on the proper column followed by take would be the way to go for that. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From wbaxter at gmail.com Thu Jul 10 17:36:04 2008 From: wbaxter at gmail.com (Bill Baxter) Date: Fri, 11 Jul 2008 06:36:04 +0900 Subject: [Numpy-discussion] huge array calculation speed In-Reply-To: References: <12D2AD1D-7742-4248-9C5D-9EF3EE3CEE06@gmail.com> Message-ID: On Fri, Jul 11, 2008 at 5:55 AM, Anne Archibald wrote: > 2008/7/10 Dan Lussier : > > We seem to get quite a few posts from people wanting some kind of > spatial data structure (whether they know it or not). Would it make > sense to come up with some kind of compiled spatial data structure to > include in scipy? Yes! http://en.wikipedia.org/wiki/Nearest_neighbor_search I've been using cover trees recently. http://hunch.net/~jl/projects/cover_tree/cover_tree.html Though I haven't hammered on it very hard, it seems to be working well so far. --bb -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Jul 10 17:38:01 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 10 Jul 2008 15:38:01 -0600 Subject: [Numpy-discussion] "expected a single-segment buffer object" In-Reply-To: References: <3d375d730807091842s710ba524o7a1fb77124140d0d@mail.gmail.com> <3d375d730807092003k2b8fad96w4f25b1d762fbbec@mail.gmail.com> Message-ID: On Thu, Jul 10, 2008 at 2:25 PM, Anne Archibald wrote: > 2008/7/10 Charles R Harris : > > > On Thu, Jul 10, 2008 at 9:33 AM, Anne Archibald < > peridot.faceted at gmail.com> > > wrote: > >> > >> 2008/7/9 Robert Kern : > >> > >> > Because that's just what a buffer= argument *is*. It is not a place > >> > for presenting the starting pointer to exotically-strided memory. Use > >> > __array_interface__s to describe the full range of representable > >> > memory. See below. > >> > >> Aha! Is this stuff documented somewhere? > >> > >> > I was about a week ahead of you. See numpy/lib/stride_tricks.py in the > >> > trunk. > >> > >> Nice! Unfortunately it can't quite do what I want... for the linear > >> algebra I need something that can broadcast all but certain axes. For > >> example, take an array of matrices and an array of vectors. The > >> "array" axes need broadcasting, but you can't broadcast on all axes > >> without (incorrectly) turning the vector into a matrix. I've written a > >> (messy) implementation, but the corner cases are giving me headaches. > >> I'll let you know when I have something that works. > > > > I think something like a matrix/vector dtype would be another way to go > for > > stacks of matrices and vectors. It would have to be a user defined type > to > > fit into the current type hierarchy for ufuncs, but I think the base > > machinery is there with the generic inner loops. > > There's definitely room for discussion about how such a linear algebra > system should ultimately work. In the short term, though, I think it's > valuable to create a prototype system, even if much of the looping has > to be in python, just to figure out how it should look to the user. 
> For example, I'm not sure whether a matrix/vector dtype would make
> easy things like indexing operations - fancy indexing spanning both
> array and matrix axes,

True enough. And I would expect the ufunc approach to require the sub
matrices to be contiguous. In any case, ufuncs aren't ready for this yet;
in fact they aren't ready for string types or other numeric types, so
there is a lot of work to be done just at that level.

> for example. dtypes can also be a little
> impenetrable to use, so even if this is how elementwise linear algebra
> is ultimately implemented, we may want to provide a different user
> frontend.
>
> My idea for a first sketch was simply to provide (for example)
> np.elementwise_linalg.dot_mm, that interprets its arguments both as
> arrays of matrices and yields an array of matrices. A second sketch
> might be a subclass MatrixArray, which could serve much as Matrix does
> now, to tell various functions which axes to do linear algebra over
> and which axes to iterate over.

Definitely needs doing. It's hard to see how things work out in practice
without some experimentation.

> There's also the question of how to make these efficient; one could of
> course write a C wrapper for each linear algebra function that simply
> iterates, but your suggestion of using the ufunc machinery with a
> matrix dtype might be better. Or, if cython acquires sensible
> polymorphism over numpy dtypes, a cython wrapper might be the way to
> go. But I think establishing what it should look like to users is a
> good first step, and I think that's best done with sample
> implementations.

Amen.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Nicolas.Rougier at loria.fr  Thu Jul 10 17:45:29 2008
From: Nicolas.Rougier at loria.fr (Nicolas Rougier)
Date: Thu, 10 Jul 2008 23:45:29 +0200
Subject: [Numpy-discussion] python user defined type
In-Reply-To: <48767466.6000504@noaa.gov>
References: <1215715698.6369.7.camel@oxygen> <48767466.6000504@noaa.gov>
Message-ID: <1215726329.6669.27.camel@oxygen>

Thanks for the precise explanation that makes things clearer. In fact,
the example I gave was mainly to illustrate my question in the quickest
way.

Concerning the dtype argument during array creation, I thought it was
here for somehow controlling the type of array elements. For example, if
I use a "regular" numpy array (let's say a float array), I cannot set an
item to a string value (it raises a ValueError: setting an array element
with a sequence).

So what would be the best way to use numpy arrays with "foreign" types
(or is it possible at all)? I've designed the "real" Unit in C++ and
exported it to python (via boost and shared pointers) and I would like
to create arrays of such Units (in fact, I also created an array-like
class, but I would prefer to use the real array interface directly to
benefit from the great work of numpy instead of re-inventing the wheel).

Ideally, I would like to be able to write

z = N.array(a, dtype=Unit)

and would expect numpy to make a copy of the array by calling my type
with each element of a. Then, if my type accepts the argument during
creation, everything's fine, else it raises an error.
Nicolas On Thu, 2008-07-10 at 13:43 -0700, Christopher Barker wrote: > Nicolas Rougier wrote: > > I would like to create numpy array with my own (python) datatype, so I > > tried the obvious solution: > > > > from numpy import * > > class Unit(object): > > def __init__(self,value=0.0): > > self.value = value > > def __float__(self): > > return self.value > > def __repr__(self): > > return str(self.value) > > > a = array (array([[1,2],[3,4]]), dtype=Unit) > > the dtype argument is designed to take a numpy type object, not an > arbitrary class -- what you want in this case is dtype=numpy.object, > which is what you did before. I'm a surprised this didn't raise an > error, but it looks like you got an object array, but you objects you > gave it are python integers. All python classes are "objects" as far an > numpy is concerned. The numpy dtypes are a description of a given number > of bytes in memory -- python classes are stored as a pointer to the > python object. > > (and you really don't want to use "import *") > > > Also, the commented line (a[0,0] = 0) makes the item to become an int > > while I need to set the value of the item instead, is that possible ? > > a[0,0] = Unit(0) > > You're setting the [0,0]th element to an object, you need to give it the > object you want, the literal "0" is a python integer with the value zero. > > numpy arrays of python objects act a whole lot like other python > containers. What would you expect from : > > a = [1,2,3,4] > > a list of integers, no? > > or > > a = [Unit(1), Unit(2)] > > # a list of Units.. > > then > > # a[0] = 3 > > now a list with the integer3 in the zeroth position, and a Unit in the 1st. > > You did it right the first time: > > a = array([[Unit(1), Unit(2)], [Unit(3), Unit(4)]]) > > though you need to be careful building object arrays of nested lists -- > numpy won't unnecessarily figure out how do d-nest it. You might want to do: > > >>> import numpy as np > >>> a = np.empty((2,2), np.object) > >>> a > array([[None, None], > [None, None]], dtype=object) > >>> a[:,:] = [[Unit(1), Unit(2)], [Unit(3), Unit(4)]] > >>> a > array([[1, 2], > [3, 4]], dtype=object) > > One more note: > > class Unit(object): > def __init__(self,value=0.0): > self.value = value > def __float__(self): > return self.value > def __repr__(self): > return str(self.value) > > > __repr__ really should be something like: > def __repr__(self): > return "Unit(%g)%self.value" > > eval(repr(Object)) == Object, ideally, plus it'll be easier to debug if > you can see what it is. > > -Chris > > > From stefan at sun.ac.za Thu Jul 10 18:49:19 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 11 Jul 2008 00:49:19 +0200 Subject: [Numpy-discussion] huge array calculation speed In-Reply-To: References: <12D2AD1D-7742-4248-9C5D-9EF3EE3CEE06@gmail.com> Message-ID: <9457e7c80807101549m574e2438l22038860b0d13601@mail.gmail.com> 2008/7/10 Anne Archibald : > Unfortunately, implementing most of the algorithms in the literature > within numpy is going to be fairly cumbersome. 
But I think there are > better algorithms you could try: There's also http://pypi.python.org/pypi/scikits.ann Regards St?fan From Chris.Barker at noaa.gov Thu Jul 10 19:30:13 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 10 Jul 2008 16:30:13 -0700 Subject: [Numpy-discussion] python user defined type In-Reply-To: <1215726329.6669.27.camel@oxygen> References: <1215715698.6369.7.camel@oxygen> <48767466.6000504@noaa.gov> <1215726329.6669.27.camel@oxygen> Message-ID: <48769B85.3060807@noaa.gov> Nicolas Rougier wrote: > Concerning the dtype argument during array creation, I thought it was > here for somehow controlling the type of array elements. For example, if > I use a "regular" numpy array (let's say a float array), I cannot set an > item to a string value (it raises a ValueError: setting an array element > with a sequence). Yes, but numpy is designed primarily for numeric types: ints, floats, etc. It can also be used for custom types that are essentially like C structs (see recarray). The key is that a dtype desribes a data type in terms of bytes and that they represent -- It can not be a python type. The only way to use arbitrary python types is a object array, which you've discovered, but then numpy doesn't know any thing about the objects, other than that they are python objects. > So what would be the best way to use numpy arrays with "foreign" types > (or is it possible at all) ? I've designed the "real" Unit in C++ and > exported it to python (via boost and shared pointers) and I would like > to create array of such Units If your type is a C++ class, then it may be possible, with some C hackary to get numpy to understand it, but you're getting beyong my depth here -- also I doubt that you'd get the full features like array math and all anyway -- that's all set up for basic numeric types. Maybe others will have some idea, but I think you're pushing what numpy is capable of. > (in fact, I also created an array-like > class but I would prefer to use directly the real array interface to > benefit from the great work of numpy instead of re-inventing the > wheel). What operations do you expect to perform with these arrays of Units? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Thu Jul 10 19:32:09 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 10 Jul 2008 16:32:09 -0700 Subject: [Numpy-discussion] huge array calculation speed In-Reply-To: References: <12D2AD1D-7742-4248-9C5D-9EF3EE3CEE06@gmail.com> Message-ID: <48769BF9.1000304@noaa.gov> Anne Archibald wrote: > Unfortunately, implementing most of the algorithms in the literature > within numpy is going to be fairly cumbersome. Maybe this rtree implementation would help: http://pypi.python.org/pypi/Rtree -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From dalke at dalkescientific.com Thu Jul 10 19:48:12 2008 From: dalke at dalkescientific.com (Andrew Dalke) Date: Fri, 11 Jul 2008 01:48:12 +0200 Subject: [Numpy-discussion] huge array calculation speed In-Reply-To: <12D2AD1D-7742-4248-9C5D-9EF3EE3CEE06@gmail.com> References: <12D2AD1D-7742-4248-9C5D-9EF3EE3CEE06@gmail.com> Message-ID: <4886AB12-21CC-4488-B9A9-32EBB95DEA12@dalkescientific.com> On Jul 10, 2008, at 6:38 PM, Dan Lussier wrote: > What I'm trying to do is to calculate the total total potential > energy and coordination number of each atom within a relatively large > simulation. Anne Archibald already responded: > you could try a three-dimensional grid (if your atoms aren't > too clumpy) or octrees (if they are somewhat clumpy); kd-trees are > probably overkill (since they're designed for higher-dimensional > problems). I implemented something like what you did in VMD about 14(!) years ago. (VMD is a molecular structure visualization program.) I needed to find which atoms might be bonded to each other. I assume you have a cutoff value, which means you don't need to worry about the general nearest-neighbor problem. Molecules are nice because the distributions are not clumpy. Atoms don't get that close to nor that far from other atoms. I implemented a grid. It goes something like this: import collections # Search within 3 A d = 3.0 coordinates = [ ("C1", 34.287, 50.970, 115.006), ("O1", 34.972, 51.144, 113.870), ("C2", 33.929, 52.255, 115.739), ("N2", 34.753, 52.387, 116.954), ("C3", 32.448, 52.219, 116.121), ("O3", 32.033, 50.877, 116.336), ("C4", 31.528, 52.817, 115.054), ("C5", 30.095, 53.013, 115.558), ("C6", 29.226, 53.835, 114.609), ("C7", 29.807, 55.217, 114.304), ("C8", 29.092, 55.920, 113.147), ("C9", 29.525, 57.375, 112.971), ("C10", 28.409, 58.267, 112.422), ("C11", 28.828, 59.734, 112.294), ("C12", 27.902, 60.542, 111.385), ("C13", 26.617, 60.996, 112.085), ("C14", 26.182, 62.401, 111.667), ("C15", 24.739, 62.731, 112.054), ("C16", 24.251, 64.046, 111.441), ("C17", 23.026, 64.624, 112.150), ("C18", 22.631, 66.007, 111.623),] def dict_of_list(): return collections.defaultdict(list) def dict_of_dict(): return collections.defaultdict(dict_of_list) grid = collections.defaultdict(dict_of_dict) for atom in coordinates: name,x,y,z = atom i = int(x/d) j = int(y/d) k = int(z/d) grid[i][j][k].append(atom) query_name, query_x, query_y, query_z = coordinates[8] query_i = int(query_x/d) query_j = int(query_y/d) query_k = int(query_z/d) # Given the search distance 'd', only need to check cells up to 1 unit away within_names = set() for i in range(query_i-1, query_i+2): for j in range(query_j-1, query_j+2): for k in range(query_k-1, query_k+2): for atom in grid[i][j][k]: name, x, y, z = atom print "Check", atom, query_d2 = (x-query_x)**2+(y-query_y)**2+(z-query_z)**2 if query_d2 < d*d: print "Within!", query_d2 within_names.add(name) else: print "Too far", query_d2 print len(within_names), "were close enough" # Linear search to verify count = 0 for name, x, y, z in coordinates: query_d2 = (x-query_x)**2+(y-query_y)**2+(z-query_z)**2 if query_d2 < d*d: print "Within", name if name not in within_names: raise AssertionError(name) count += 1 if count != len(within_names): raise AssertionError("count problem") You can also grab the KDTree from Biopython, which is implemented in C. 
http://www.biopython.org/DIST/docs/api/Bio.KDTree.KDTree'-module.html It was designed for just this task. Andrew dalke at dalkescientific.com From robert.kern at gmail.com Thu Jul 10 20:08:41 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 10 Jul 2008 19:08:41 -0500 Subject: [Numpy-discussion] "expected a single-segment buffer object" In-Reply-To: References: <3d375d730807091842s710ba524o7a1fb77124140d0d@mail.gmail.com> <3d375d730807092003k2b8fad96w4f25b1d762fbbec@mail.gmail.com> Message-ID: <3d375d730807101708r37dc43ceg1868a45169436eb4@mail.gmail.com> On Thu, Jul 10, 2008 at 10:33, Anne Archibald wrote: > 2008/7/9 Robert Kern : > >> Because that's just what a buffer= argument *is*. It is not a place >> for presenting the starting pointer to exotically-strided memory. Use >> __array_interface__s to describe the full range of representable >> memory. See below. > > Aha! Is this stuff documented somewhere? _The Guide to Numpy_, section 3.1.4. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Thu Jul 10 20:14:02 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 10 Jul 2008 19:14:02 -0500 Subject: [Numpy-discussion] new chararray test fails on Mac OS X In-Reply-To: <1d36917a0807100726x5cbe021bh5976a467fe2f7683@mail.gmail.com> References: <48761765.2020100@stsci.edu> <1d36917a0807100726x5cbe021bh5976a467fe2f7683@mail.gmail.com> Message-ID: <3d375d730807101714w1e10673ci3914fe640f56c8e6@mail.gmail.com> On Thu, Jul 10, 2008 at 09:26, Alan McIntyre wrote: > On Thu, Jul 10, 2008 at 10:06 AM, Christopher Hanley wrote: >> From the svn log it looks like the tests are intended to fail? However >> I would prefer tests that are designed only to fail when problems >> occur. Otherwise we end up with problems with our automatic build and >> test scripts. > > Sorry about that, I just wanted to make sure the tests would actually > fail on all the builders before the bug in chararray is fixed. I just > checked in a change that disables the failing portions of the test. For things that you don't expect to be platform specific, there is no need. For things that you do expect to be platform specific on platforms that you cannot access, please ask for volunteers. Make an SVN branch if the changes are extensive. I would like to keep to the rule of not checking in unit tests (on trunk or a 1.1.x branch, etc.) that you expect to fail. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From alan.mcintyre at gmail.com Thu Jul 10 20:20:29 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Thu, 10 Jul 2008 20:20:29 -0400 Subject: [Numpy-discussion] new chararray test fails on Mac OS X In-Reply-To: <3d375d730807101714w1e10673ci3914fe640f56c8e6@mail.gmail.com> References: <48761765.2020100@stsci.edu> <1d36917a0807100726x5cbe021bh5976a467fe2f7683@mail.gmail.com> <3d375d730807101714w1e10673ci3914fe640f56c8e6@mail.gmail.com> Message-ID: <1d36917a0807101720l35d2be8eu355a2c2445b37833@mail.gmail.com> On Thu, Jul 10, 2008 at 8:14 PM, Robert Kern wrote: > For things that you don't expect to be platform specific, there is no > need. For things that you do expect to be platform specific on > platforms that you cannot access, please ask for volunteers. 
Make an > SVN branch if the changes are extensive. I would like to keep to the > rule of not checking in unit tests (on trunk or a 1.1.x branch, etc.) > that you expect to fail. Ok. From robert.kern at gmail.com Thu Jul 10 20:24:14 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 10 Jul 2008 19:24:14 -0500 Subject: [Numpy-discussion] new chararray test fails on Mac OS X In-Reply-To: <1d36917a0807101720l35d2be8eu355a2c2445b37833@mail.gmail.com> References: <48761765.2020100@stsci.edu> <1d36917a0807100726x5cbe021bh5976a467fe2f7683@mail.gmail.com> <3d375d730807101714w1e10673ci3914fe640f56c8e6@mail.gmail.com> <1d36917a0807101720l35d2be8eu355a2c2445b37833@mail.gmail.com> Message-ID: <3d375d730807101724s626a133ay4da28f7d1d69ada4@mail.gmail.com> On Thu, Jul 10, 2008 at 19:20, Alan McIntyre wrote: > On Thu, Jul 10, 2008 at 8:14 PM, Robert Kern wrote: >> For things that you don't expect to be platform specific, there is no >> need. For things that you do expect to be platform specific on >> platforms that you cannot access, please ask for volunteers. Make an >> SVN branch if the changes are extensive. I would like to keep to the >> rule of not checking in unit tests (on trunk or a 1.1.x branch, etc.) >> that you expect to fail. > > Ok. Thanks. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Thu Jul 10 22:44:56 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 10 Jul 2008 20:44:56 -0600 Subject: [Numpy-discussion] huge array calculation speed In-Reply-To: <4886AB12-21CC-4488-B9A9-32EBB95DEA12@dalkescientific.com> References: <12D2AD1D-7742-4248-9C5D-9EF3EE3CEE06@gmail.com> <4886AB12-21CC-4488-B9A9-32EBB95DEA12@dalkescientific.com> Message-ID: On Thu, Jul 10, 2008 at 5:48 PM, Andrew Dalke wrote: > You can also grab the KDTree from Biopython, which is implemented in C. > > http://www.biopython.org/DIST/docs/api/Bio.KDTree.KDTree'-module.html > > It was designed for just this task. > Looks nice, has a BSD type license, but uses Numeric :( Oh well, a little fixing up should do the trick. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Fri Jul 11 02:41:48 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Fri, 11 Jul 2008 08:41:48 +0200 Subject: [Numpy-discussion] huge array calculation speed In-Reply-To: References: <12D2AD1D-7742-4248-9C5D-9EF3EE3CEE06@gmail.com> <4886AB12-21CC-4488-B9A9-32EBB95DEA12@dalkescientific.com> Message-ID: Hi, scikits.learn.machine/manifold_learning.regression.neighbor contains a kd-tree for neighbors search as well (implemented in C++). Matthieu 2008/7/11 Charles R Harris : > > > On Thu, Jul 10, 2008 at 5:48 PM, Andrew Dalke > wrote: > > > >> >> You can also grab the KDTree from Biopython, which is implemented in C. >> >> http://www.biopython.org/DIST/docs/api/Bio.KDTree.KDTree'-module.html >> >> It was designed for just this task. > > Looks nice, has a BSD type license, but uses Numeric :( Oh well, a little > fixing up should do the trick. 
> Chuck
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>

--
French PhD student
Website : http://matthieu-brucher.developpez.com/
Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn : http://www.linkedin.com/in/matthieubrucher

From stefan at sun.ac.za  Fri Jul 11 03:06:05 2008
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Fri, 11 Jul 2008 09:06:05 +0200
Subject: [Numpy-discussion] new chararray test fails on Mac OS X
In-Reply-To: <3d375d730807101714w1e10673ci3914fe640f56c8e6@mail.gmail.com>
References: <48761765.2020100@stsci.edu>
	<1d36917a0807100726x5cbe021bh5976a467fe2f7683@mail.gmail.com>
	<3d375d730807101714w1e10673ci3914fe640f56c8e6@mail.gmail.com>
Message-ID: <9457e7c80807110006p56652397p1ee9d6752955f48f@mail.gmail.com>

2008/7/11 Robert Kern :
> On Thu, Jul 10, 2008 at 09:26, Alan McIntyre wrote:
>> On Thu, Jul 10, 2008 at 10:06 AM, Christopher Hanley wrote:
>>> From the svn log it looks like the tests are intended to fail? However
>>> I would prefer tests that are designed only to fail when problems
>>> occur. Otherwise we end up with problems with our automatic build and
>>> test scripts.
>>
>> Sorry about that, I just wanted to make sure the tests would actually
>> fail on all the builders before the bug in chararray is fixed. I just
>> checked in a change that disables the failing portions of the test.
>
> For things that you don't expect to be platform specific, there is no
> need. For things that you do expect to be platform specific on
> platforms that you cannot access, please ask for volunteers. Make an
> SVN branch if the changes are extensive.

Branches may also be built using the buildbot.

Stéfan

From charlesr.harris at gmail.com  Fri Jul 11 03:11:54 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 11 Jul 2008 01:11:54 -0600
Subject: [Numpy-discussion] doctests failing in ipython
Message-ID:

The problem is the Out[#] appended to the output.

................................................Out[4]: poly1d([ 1., 2., 3.])
**********************************************************************
File "/usr/lib/python2.5/site-packages/numpy/lib/tests/test_polynomial.py", line 6, in test_polynomial
Failed example:
    p
Expected:
    poly1d([ 1., 2., 3.])
Got nothing

Tons of these.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com  Fri Jul 11 03:12:48 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 11 Jul 2008 01:12:48 -0600
Subject: [Numpy-discussion] doctests failing in ipython
In-Reply-To: 
References: 
Message-ID:

On Fri, Jul 11, 2008 at 1:11 AM, Charles R Harris wrote:

> The problem is the Out[#] appended to the output.

^^^^^^ prepended.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From robert.kern at gmail.com  Fri Jul 11 03:13:14 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 11 Jul 2008 02:13:14 -0500
Subject: [Numpy-discussion] new chararray test fails on Mac OS X
In-Reply-To: <9457e7c80807110006p56652397p1ee9d6752955f48f@mail.gmail.com>
References: <48761765.2020100@stsci.edu>
	<1d36917a0807100726x5cbe021bh5976a467fe2f7683@mail.gmail.com>
	<3d375d730807101714w1e10673ci3914fe640f56c8e6@mail.gmail.com>
	<9457e7c80807110006p56652397p1ee9d6752955f48f@mail.gmail.com>
Message-ID: <3d375d730807110013r595c352er8ce5ca5628e0a105@mail.gmail.com>

On Fri, Jul 11, 2008 at 02:06, Stéfan van der Walt wrote:
> 2008/7/11 Robert Kern :
>> On Thu, Jul 10, 2008 at 09:26, Alan McIntyre wrote:
>>> On Thu, Jul 10, 2008 at 10:06 AM, Christopher Hanley wrote:
>>>> From the svn log it looks like the tests are intended to fail? However
>>>> I would prefer tests that are designed only to fail when problems
>>>> occur. Otherwise we end up with problems with our automatic build and
>>>> test scripts.
>>>
>>> Sorry about that, I just wanted to make sure the tests would actually
>>> fail on all the builders before the bug in chararray is fixed. I just
>>> checked in a change that disables the failing portions of the test.
>>
>> For things that you don't expect to be platform specific, there is no
>> need. For things that you do expect to be platform specific on
>> platforms that you cannot access, please ask for volunteers. Make an
>> SVN branch if the changes are extensive.
>
> Branches may also be built using the buildbot.

What is the procedure for requesting this? Do we just email you and
ask for the buildbots to build a particular branch?

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From Nicolas.Rougier at loria.fr  Fri Jul 11 03:24:05 2008
From: Nicolas.Rougier at loria.fr (Nicolas Rougier)
Date: Fri, 11 Jul 2008 09:24:05 +0200
Subject: [Numpy-discussion] python user defined type
In-Reply-To: <48769B85.3060807@noaa.gov>
References: <1215715698.6369.7.camel@oxygen> <48767466.6000504@noaa.gov>
	<1215726329.6669.27.camel@oxygen> <48769B85.3060807@noaa.gov>
Message-ID: <1215761045.23456.12.camel@sulfur.loria.fr>

My Unit class is supposed to represent a neuron that can be linked to
any other unit. The neuron itself is merely a (float) potential that can
vary along time under the influence of other units and learning. I
gather these units into groups, which are in fact 2D matrices of units.
Currently, I implemented the Unit and the Group, and I "talk" with numpy
through an attribute of groups which represents all available
potentials. Finally, my group is like a simple 2d matrix of floats, but
I need an underlying object to perform computation on each Unit at each
time step. Currently I'm able to write something like:

>>> group = Unit()*[2,2]
>>> group.potentials = numpy.zeros([2,2])
>>> print group.potentials
[[ 0. 0.]
 [ 0. 0.]]
>>> group[0,0].potential = 1
>>> print group.potentials
[[ 1. 0.]
 [ 0. 0.]]

Nicolas

On Thu, 2008-07-10 at 16:30 -0700, Christopher Barker wrote:
> Nicolas Rougier wrote:
> > Concerning the dtype argument during array creation, I thought it was
> > here for somehow controlling the type of array elements. For example, if
> > I use a "regular" numpy array (let's say a float array), I cannot set an
> > item to a string value (it raises a ValueError: setting an array element
> > with a sequence).
> > Yes, but numpy is designed primarily for numeric types: ints, floats, > etc. It can also be used for custom types that are essentially like C > structs (see recarray). The key is that a dtype desribes a data type in > terms of bytes and that they represent -- It can not be a python type. > > The only way to use arbitrary python types is a object array, which > you've discovered, but then numpy doesn't know any thing about the > objects, other than that they are python objects. > > > So what would be the best way to use numpy arrays with "foreign" types > > (or is it possible at all) ? I've designed the "real" Unit in C++ and > > exported it to python (via boost and shared pointers) and I would like > > to create array of such Units > > If your type is a C++ class, then it may be possible, with some C > hackary to get numpy to understand it, but you're getting beyong my > depth here -- also I doubt that you'd get the full features like array > math and all anyway -- that's all set up for basic numeric types. > > Maybe others will have some idea, but I think you're pushing what numpy > is capable of. > > > (in fact, I also created an array-like > > class but I would prefer to use directly the real array interface to > > benefit from the great work of numpy instead of re-inventing the > > wheel). > > What operations do you expect to perform with these arrays of Units? > > -Chris > From robert.kern at gmail.com Fri Jul 11 03:29:16 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 11 Jul 2008 02:29:16 -0500 Subject: [Numpy-discussion] doctests failing in ipython In-Reply-To: References: Message-ID: <3d375d730807110029h6a775fc8v26465318ee64d224@mail.gmail.com> On Fri, Jul 11, 2008 at 02:11, Charles R Harris wrote: > The problem is the Out[#] appended to the output. > > ................................................Out[4]: poly1d([ 1., 2., > 3.]) > ********************************************************************** > File "/usr/lib/python2.5/site-packages/numpy/lib/tests/test_polynomial.py", > line 6, in test_polynomial > Failed example: > p > Expected: > poly1d([ 1., 2., 3.]) > Got nothing > > Tons of these. Yes. This is well-known. IPython cannot run doctests in general without modification. This is not a bug in numpy's tests; just an incompatibility between IPython and doctest. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Fri Jul 11 03:31:55 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 11 Jul 2008 02:31:55 -0500 Subject: [Numpy-discussion] python user defined type In-Reply-To: <1215726329.6669.27.camel@oxygen> References: <1215715698.6369.7.camel@oxygen> <48767466.6000504@noaa.gov> <1215726329.6669.27.camel@oxygen> Message-ID: <3d375d730807110031n1fddf338r636ff87895caba19@mail.gmail.com> On Thu, Jul 10, 2008 at 16:45, Nicolas Rougier wrote: > Ideally, I would like to be able to write > > z = N.array (a, dtype=Unit) > > and would expect numpy to make a copy of the array by calling my type > with each element of a. Then, if my type accepts the argument during > creation, everything's fine, else it raises an error. Just use dtype=object and make any copies you need yourself. Encapsulate this logic in a function if you need it frequently. 
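For example, something along these lines (untested, and the name is
arbitrary):

import numpy as np

def object_array(a, cls):
    # make a copy of the input, wrapping every element in cls; cls will
    # raise on elements it does not accept, as you wanted
    a = np.asarray(a)
    out = np.empty(a.shape, dtype=object)
    out.flat = [cls(x) for x in a.flat]
    return out

Then object_array([[1, 2], [3, 4]], Unit) gives a 2x2 object array of
Units.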
--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From fperez.net at gmail.com  Fri Jul 11 03:37:32 2008
From: fperez.net at gmail.com (Fernando Perez)
Date: Fri, 11 Jul 2008 00:37:32 -0700
Subject: [Numpy-discussion] doctests failing in ipython
In-Reply-To: <3d375d730807110029h6a775fc8v26465318ee64d224@mail.gmail.com>
References: <3d375d730807110029h6a775fc8v26465318ee64d224@mail.gmail.com>
Message-ID:

On Fri, Jul 11, 2008 at 12:29 AM, Robert Kern wrote:

> Yes. This is well-known. IPython cannot run doctests in general
> without modification. This is not a bug in numpy's tests; just an
> incompatibility between IPython and doctest.

Couple of questions:

- how are these being run? I'm trying

np.test('full',doctests=True)

and I get the same

Ran 1746 tests in 5.104s

FAILED (errors=5)

from current SVN. But I don't get any doctest failure.

- Does %doctest_mode not help with the error Charles is getting?

Cheers,

f

From stefan at sun.ac.za  Fri Jul 11 03:39:09 2008
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Fri, 11 Jul 2008 09:39:09 +0200
Subject: [Numpy-discussion] new chararray test fails on Mac OS X
In-Reply-To: <3d375d730807110013r595c352er8ce5ca5628e0a105@mail.gmail.com>
References: <48761765.2020100@stsci.edu>
	<1d36917a0807100726x5cbe021bh5976a467fe2f7683@mail.gmail.com>
	<3d375d730807101714w1e10673ci3914fe640f56c8e6@mail.gmail.com>
	<9457e7c80807110006p56652397p1ee9d6752955f48f@mail.gmail.com>
	<3d375d730807110013r595c352er8ce5ca5628e0a105@mail.gmail.com>
Message-ID: <9457e7c80807110039we187cf1nf6e17425da7483d5@mail.gmail.com>

2008/7/11 Robert Kern :
>>> Branches may also be built using the buildbot.
>>
>> What is the procedure for requesting this? Do we just email you and
>> ask for the buildbots to build a particular branch?

Go to the waterfall display and click on a build-slave name at the
top. Use "Force build", and type the branch name in the appropriate
field. By default it is 'trunk', so I think you'll have to use
'branches/branchname'.

Stéfan

From charlesr.harris at gmail.com  Fri Jul 11 03:40:25 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 11 Jul 2008 01:40:25 -0600
Subject: [Numpy-discussion] doctests failing in ipython
In-Reply-To: <3d375d730807110029h6a775fc8v26465318ee64d224@mail.gmail.com>
References: <3d375d730807110029h6a775fc8v26465318ee64d224@mail.gmail.com>
Message-ID:

On Fri, Jul 11, 2008 at 1:29 AM, Robert Kern wrote:

> On Fri, Jul 11, 2008 at 02:11, Charles R Harris wrote:
> > The problem is the Out[#] appended to the output.
> >
> > ................................................Out[4]: poly1d([ 1., 2.,
> > 3.])
> > **********************************************************************
> > File "/usr/lib/python2.5/site-packages/numpy/lib/tests/test_polynomial.py",
> > line 6, in test_polynomial
> > Failed example:
> > p
> > Expected:
> > poly1d([ 1., 2., 3.])
> > Got nothing
> >
> > Tons of these.
>
> Yes. This is well-known. IPython cannot run doctests in general
> without modification. This is not a bug in numpy's tests; just an
> incompatibility between IPython and doctest.

I don't think any of the unit tests should be doctests. They look ugly
and are hard to read. Second, I didn't use to see the problem.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From robert.kern at gmail.com Fri Jul 11 03:40:55 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 11 Jul 2008 02:40:55 -0500 Subject: [Numpy-discussion] doctests failing in ipython In-Reply-To: References: <3d375d730807110029h6a775fc8v26465318ee64d224@mail.gmail.com> Message-ID: <3d375d730807110040p35604d2bpb4849be420dedb35@mail.gmail.com> On Fri, Jul 11, 2008 at 02:37, Fernando Perez wrote: > On Fri, Jul 11, 2008 at 12:29 AM, Robert Kern wrote: > >> Yes. This is well-known. IPython cannot run doctests in general >> without modification. This is not a bug in numpy's tests; just an >> incompatibility between IPython and doctest. > > Couple of questions: > > - how are these being run? I'm trying > > np.test('full',doctests=True) > > and I get the same > > Ran 1746 tests in 5.104s > > FAILED (errors=5) > > from current SVN. But I don't get any doctest failure. > > - Does %doctest_mode not help with the error Charles is getting? Probably. I forgot about that. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Fri Jul 11 03:41:33 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 11 Jul 2008 02:41:33 -0500 Subject: [Numpy-discussion] new chararray test fails on Mac OS X In-Reply-To: <9457e7c80807110039we187cf1nf6e17425da7483d5@mail.gmail.com> References: <48761765.2020100@stsci.edu> <1d36917a0807100726x5cbe021bh5976a467fe2f7683@mail.gmail.com> <3d375d730807101714w1e10673ci3914fe640f56c8e6@mail.gmail.com> <9457e7c80807110006p56652397p1ee9d6752955f48f@mail.gmail.com> <3d375d730807110013r595c352er8ce5ca5628e0a105@mail.gmail.com> <9457e7c80807110039we187cf1nf6e17425da7483d5@mail.gmail.com> Message-ID: <3d375d730807110041q431d2111i2089402d5eb10415@mail.gmail.com> On Fri, Jul 11, 2008 at 02:39, St?fan van der Walt wrote: > 2008/7/11 Robert Kern : >>> Branches may also be built using the buildbot. >> >> What is the procedure for requesting this? Do we just email you and >> ask for the buildbots to build a particular branch? > > Go to the waterfall display and click on a build-slave name at the > top. Use "Force build", and type the branch name in the appropriate > field. By default it is 'trunk', so I think you'll have to use > 'branches/branchname'. Excellent! Thank you! -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Fri Jul 11 03:50:16 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 11 Jul 2008 02:50:16 -0500 Subject: [Numpy-discussion] doctests failing in ipython In-Reply-To: References: <3d375d730807110029h6a775fc8v26465318ee64d224@mail.gmail.com> Message-ID: <3d375d730807110050u549f9a15k83266552b05bf15e@mail.gmail.com> On Fri, Jul 11, 2008 at 02:40, Charles R Harris wrote: > > > On Fri, Jul 11, 2008 at 1:29 AM, Robert Kern wrote: >> >> On Fri, Jul 11, 2008 at 02:11, Charles R Harris >> wrote: >> > The problem is the Out[#] appended to the output. 
>> > >> > ................................................Out[4]: poly1d([ 1., >> > 2., >> > 3.]) >> > ********************************************************************** >> > File >> > "/usr/lib/python2.5/site-packages/numpy/lib/tests/test_polynomial.py", >> > line 6, in test_polynomial >> > Failed example: >> > p >> > Expected: >> > poly1d([ 1., 2., 3.]) >> > Got nothing >> > >> > Tons of these. >> >> Yes. This is well-known. IPython cannot run doctests in general >> without modification. This is not a bug in numpy's tests; just an >> incompatibility between IPython and doctest. > > I don't think any of the unit tests should be doctests. They look ugly and > are hard to read. Sometimes, they're the most convenient way to express a bunch of tests, in my opinion. You're welcome to have a differing opinion, but you haven't convinced me to reconsider mine. > Second, I didn't used to see the problem. When? Exactly what did you run? I don't see this problem with the trunk of numpy (and IPython, incidentally): In [1]: import numpy In [2]: numpy.test() ................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................ ---------------------------------------------------------------------- Ran 1744 tests in 8.595s OK Out[2]: -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From robert.kern at gmail.com Fri Jul 11 03:52:48 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 11 Jul 2008 02:52:48 -0500 Subject: [Numpy-discussion] doctests failing in ipython In-Reply-To: <3d375d730807110050u549f9a15k83266552b05bf15e@mail.gmail.com> References: <3d375d730807110029h6a775fc8v26465318ee64d224@mail.gmail.com> <3d375d730807110050u549f9a15k83266552b05bf15e@mail.gmail.com> Message-ID: <3d375d730807110052m286a1ee7s8af0b37e078d05f8@mail.gmail.com> On Fri, Jul 11, 2008 at 02:50, Robert Kern wrote: > I don't see this problem with the > trunk of numpy (and IPython, incidentally): Also, nose 0.10.3, which may be part of the solution. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Fri Jul 11 03:58:28 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 11 Jul 2008 01:58:28 -0600 Subject: [Numpy-discussion] doctests failing in ipython In-Reply-To: <3d375d730807110040p35604d2bpb4849be420dedb35@mail.gmail.com> References: <3d375d730807110029h6a775fc8v26465318ee64d224@mail.gmail.com> <3d375d730807110040p35604d2bpb4849be420dedb35@mail.gmail.com> Message-ID: On Fri, Jul 11, 2008 at 1:40 AM, Robert Kern wrote: > On Fri, Jul 11, 2008 at 02:37, Fernando Perez > wrote: > > On Fri, Jul 11, 2008 at 12:29 AM, Robert Kern > wrote: > > > >> Yes. This is well-known. IPython cannot run doctests in general > >> without modification. This is not a bug in numpy's tests; just an > >> incompatibility between IPython and doctest. > > > > Couple of questions: > > > > - how are these being run? I'm trying > > > > np.test('full',doctests=True) > > > > and I get the same > > > > Ran 1746 tests in 5.104s > > > > FAILED (errors=5) > > > > from current SVN. But I don't get any doctest failure. > > > > - Does %doctest_mode not help with the error Charles is getting? > > Probably. I forgot about that. > The problem might be the old ipython version (8.1) shipped with ubuntu 8.04. Debian is slow to update and I've been trying out ubuntu for 64 bit testing. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Jul 11 04:21:43 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 11 Jul 2008 02:21:43 -0600 Subject: [Numpy-discussion] doctests failing in ipython In-Reply-To: <3d375d730807110052m286a1ee7s8af0b37e078d05f8@mail.gmail.com> References: <3d375d730807110029h6a775fc8v26465318ee64d224@mail.gmail.com> <3d375d730807110050u549f9a15k83266552b05bf15e@mail.gmail.com> <3d375d730807110052m286a1ee7s8af0b37e078d05f8@mail.gmail.com> Message-ID: On Fri, Jul 11, 2008 at 1:52 AM, Robert Kern wrote: > On Fri, Jul 11, 2008 at 02:50, Robert Kern wrote: > > I don't see this problem with the > > trunk of numpy (and IPython, incidentally): > > Also, nose 0.10.3, which may be part of the solution. > Maybe. Upgrading nose alone didn't help, but going to ipython 0.8.4 did the trick. I think we are going to see a lot of error reports from people using the default versions that come with Ubuntu Hardy, which is, after all, the latest and greatest. Maybe we should ping the Ubuntu people before 1.2 goes out. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From falted at pytables.org  Fri Jul 11 09:59:03 2008
From: falted at pytables.org (Francesc Alted)
Date: Fri, 11 Jul 2008 15:59:03 +0200
Subject: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy
Message-ID: <200807111559.03986.falted@pytables.org>

Hi,

We are planning to implement some date/time types for NumPy, and I'm
sending a document that explains our approach. We would love to hear
the feedback of the NumPy community in order to cover their needs as
much as possible.

Cheers,

Francesc

===========================================================
 A proposal for implementing some date/time types in NumPy
===========================================================

:Author: Francesc Alted i Abad
:Contact: faltet at pytables.com
:Author: Ivan Vilata i Balaguer
:Contact: ivan at selidor.net

Executive summary
=================

A date/time mark is something very handy to have in many fields where
one has to deal with data sets. While Python has several modules that
define a date/time type, like ``mx.DateTime`` or the integrated
``datetime`` [1]_, NumPy lacks one. In this document, we are proposing
the addition of a series of date/time types to fill this gap. The
requirements for the proposed types are twofold: 1) they have to be
fast to operate with, and 2) they have to be as compatible as possible
with the existing ``datetime`` module that comes with Python.

Types proposed
==============

To start with, it is virtually impossible to come up with a single
date/time type that fills the needs of every use case. So, after
pondering different possibilities, we have settled on three different
types, namely ``datetime64``, ``timestamp64`` and ``timefloat64`` --
these names are preliminary and can certainly be changed; they are
mostly useful for the sake of the discussion -- that cover different
needs. Here is a detailed description of the different types:

* ``datetime64``

  - Implemented internally as an ``int64`` type.

  - Expressed in microseconds since POSIX epoch (January 1, 1970).

  - Resolution: nanoseconds.

  - Time span: 278922 years in each direction since the POSIX epoch.

  Observations::

    This will be compatible with the Python ``datetime`` module not
    only in terms of precision (it also has a resolution of
    microseconds) and time span (its range is year 1 to year 9999),
    but also in that we will provide getters and setters for it.

* ``timestamp64``

  - Implemented internally as an ``int64`` type.

  - Expressed in nanoseconds since POSIX epoch (January 1, 1970).

  - Resolution: nanoseconds.

  - Time span: 272 years in each direction since the POSIX epoch.

  Observations::

    This will not be fully compatible with the Python ``datetime``
    module in terms of either precision or time span. However,
    getters and setters will be provided for it (losing precision or
    overflowing as needed).

* ``timefloat64``

  - Implemented internally as a float64.

  - Expressed in microseconds since POSIX epoch.

  - Resolution: 1 microsecond (for +-32 years from epoch) or 14 digits
    (for distant years from epoch). So the precision is *variable*.

  - Time span: 1e+308 years in each direction since the POSIX epoch.

  Observations::

    In general, this will not be fully compatible with the Python
    ``datetime`` module in terms of either precision or time span.
    However, getters and setters will be provided for it (losing
    precision or overflowing as needed).
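As a rough illustration of the int64-microseconds encoding described
above for ``datetime64`` (a minimal sketch of ours, not part of the
proposal; the helper names ``to_datetime64`` and ``from_datetime64``
are hypothetical), the mapping to and from the microsecond count is
plain arithmetic::

  import datetime

  EPOCH = datetime.datetime(1970, 1, 1)

  def to_datetime64(dt):
      # microseconds elapsed since the POSIX epoch, as a plain integer
      delta = dt - EPOCH
      return (delta.days * 86400 + delta.seconds) * 1000000 \
             + delta.microseconds

  def from_datetime64(ticks):
      # inverse mapping: a microsecond count back to a datetime object
      return EPOCH + datetime.timedelta(microseconds=ticks)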
Example of use
==============

Here is an example of usage of one of the types described above
(``datetime64``)::

  In [10]: t = numpy.zeros(5, dtype="datetime64")

  In [11]: t[0] = datetime.datetime.now()  # setter in action

  In [12]: t[0]
  Out[12]: 733234384724  # representation as an int64 (scalar)

  In [13]: t
  Out[13]: array([12155899511985929, 0, 0, 0, 0], dtype=datetime64)

  In [14]: t[0].item()  # getter in action
  Out[14]: datetime.datetime(2008, 7, 11, 14, 27, 3, 384724)

Final considerations
====================

About the ``mx.DateTime`` module
--------------------------------

In this document, the emphasis has been put on comparing the
compatibility of future NumPy date/time types against the ``datetime``
module that comes with Python. Should we consider compatibility with
mx.DateTime as well? Are there many people using ``mx.DateTime`` [2]_
out there? If so, what are their advantages over the ``datetime``
module?

A final note on time scales
---------------------------

[Only for people with high precision time requirements or just for
those that love the feel of their brain exploding]

The POSIX time scale [3]_ (or UTC, on which POSIX is based) follows
the rotation of the Earth with respect to the Sun. However, after the
adoption of more precise time patterns (read: atomic clocks), it
became clear that this rotation is pretty imprecise compared with the
latter. As a result, the UTC standard adds ``leap seconds`` from time
to time (at a rate of approximately 1 second per year) in order to
compensate for these differences. Because of this, when computing time
deltas (using the UTC standard) between two instants that differ by
more than one year, it is extremely probable that an error of one
second or several (depending on the time span) will be introduced.
While this is generally harmless for common use cases, there are
situations where it can bite people quite badly. For example, suppose
that IERS (International Earth Rotation and Reference Systems Service)
decided to add a leap second this past June 30th at 00:00:00 UTC and
that you were running an experiment precisely at that time (this is
not so rare, and has probably happened already to someone, somewhere).
Then, in the analysis phase of your experiment, you could be surprised
to find that the time deltas computed during this leap second are
actually 1 second shorter (and then, why the heck do we want to use
types that supposedly support micro- or nano-seconds of precision?).

Because of this, we were initially tempted to use the TAI (Temps
Atomique International) [4]_ standard because it is strictly
*continuous* (contrary to UTC or POSIX), thus avoiding the sort of
problems exposed above. However, the omnipresence of UTC clocks in the
computing world would force us to be continuously converting UTC
timestamps to TAI ones and vice versa. Unfortunately, this is not easy
to do because there is no simple mathematical relationship between UTC
and TAI. Instead, you have to use a table to check when IERS added the
leap seconds, and take them into account. The problem is that, as the
rotation of the Earth is irregular, one cannot determine ahead of time
when the leap seconds will be added. This can lead to problems with
code that performs the TAI <-> UTC conversion, in the sense that the
internal conversion table has to be continuously updated if we don't
want precision problems. And this is too much added complication to be
worth the effort, in our opinion.
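To make the table-driven TAI <-> UTC conversion just described
concrete, here is a tiny sketch (ours, not part of the proposed
implementation; the table is deliberately abbreviated and would have
to be extended whenever IERS announces a new leap second)::

  # Abbreviated leap-second table: (POSIX timestamp of change, TAI-UTC).
  # A real table is much longer and must be kept up to date.
  LEAP_SECONDS = [
      (915148800, 32),    # 1999-01-01
      (1136073600, 33),   # 2006-01-01
  ]

  def tai_minus_utc(posix_ts):
      # seconds to add to a UTC timestamp to obtain TAI
      offset = 31  # value in force before the first entry above
      for when, delta in LEAP_SECONDS:
          if posix_ts >= when:
              offset = delta
      return offset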
These are the difficulties that drove us to prefer the POSIX time
scale over TAI for this implementation. However, more input on this
issue is very welcome.

.. [1] http://docs.python.org/lib/module-datetime.html
.. [2] http://www.egenix.com/products/python/mxBase/mxDateTime
.. [3] http://en.wikipedia.org/wiki/Unix_time
.. [4] http://en.wikipedia.org/wiki/International_Atomic_Time
From kwgoodman at gmail.com  Fri Jul 11 11:11:38 2008
From: kwgoodman at gmail.com (Keith Goodman)
Date: Fri, 11 Jul 2008 08:11:38 -0700
Subject: [Numpy-discussion] doctests failing in ipython
In-Reply-To:
References: <3d375d730807110029h6a775fc8v26465318ee64d224@mail.gmail.com>
	<3d375d730807110040p35604d2bpb4849be420dedb35@mail.gmail.com>
Message-ID:

On Fri, Jul 11, 2008 at 12:58 AM, Charles R Harris wrote:
> The problem might be the old ipython version (0.8.1) shipped with
> ubuntu 8.04. Debian is slow to update and I've been trying out ubuntu
> for 64-bit testing.

Debian Lenny is at ipython 0.8.4.

From pgmdevlist at gmail.com  Fri Jul 11 12:10:23 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 11 Jul 2008 12:10:23 -0400
Subject: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy
In-Reply-To: <200807111559.03986.falted@pytables.org>
References: <200807111559.03986.falted@pytables.org>
Message-ID: <200807111210.23484.pgmdevlist@gmail.com>

Francesc,

> We are planning to implement some date/time types for NumPy, and I'm
> sending a document that explains our approach. We would love to hear
> the feedback of the NumPy community in order to cover their needs as
> much as possible.

That sounds like an excellent idea. Matt Knox and I tried something
similar with the scikits.timeseries module we've been developing over
the last 18 months (scipy.org/scipy/scikits/wiki/TimeSeries).

Our approach for dealing with dates was to translate them into
integers through a particular class (Date). The trick was to change
the reference depending on the problem at hand: when dealing with
annual series, the Date object is simply the year (since CE); when
dealing with months, the number of months since 0CE; when dealing with
hours, the number of hours since 1970... All the nitty-gritty parts
were coded by Matt in C. And yes, we have routines to transform a
datetime object into a Date object and back. We also used a parser
from mxDate when dealing with dates in string formats.

We thought about creating specific dtypes to simplify the interface,
but had problems finding proper documentation for that and were in any
case more interested in having something running. The approach works
well for us, but one of the biggest limitations we have is that we
can't handle series with a time step shorter than one second (as we
need integers), and your idea of a float for higher frequencies is
great.

About the types you propose, isn't there a typo somewhere in the
resolution? What's the difference between your datetime64 and
timestamp64?

> In this document, the emphasis has been put on comparing the
> compatibility of future NumPy date/time types against the
> ``datetime`` module that comes with Python. Should we consider
> compatibility with mx.DateTime as well? Are there many people using
> ``mx.DateTime`` [2]_ out there? If so, what are their advantages
> over the ``datetime`` module?
mx.DateTime has a great parser for strings, but using it adds yet more
requirements (you need to have the module installed, it doesn't come
with Python by default, there are some licensing issues...), so I
wouldn't focus on that for now, if I were you.

> A final note on time scales
> ---------------------------

Wow, indeed. In environmental sciences (my side) and in finance
(Matt's), we very rarely have a need for that precision, thankfully...

From peridot.faceted at gmail.com  Fri Jul 11 12:28:30 2008
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Fri, 11 Jul 2008 12:28:30 -0400
Subject: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy
In-Reply-To: <200807111210.23484.pgmdevlist@gmail.com>
References: <200807111559.03986.falted@pytables.org>
	<200807111210.23484.pgmdevlist@gmail.com>
Message-ID:

2008/7/11 Pierre GM :
>> A final note on time scales
>> ---------------------------
> Wow, indeed. In environmental sciences (my side) and in finance
> (Matt's), we very rarely have a need for that precision,
> thankfully...

We do, sometimes, in pulsar astronomy. But I think it's reasonable to
force us to use our own custom time representations. For example, I've
dealt with time series representing (X-ray) photon arrival times,
expressed in modified Julian days. The effect of representing
microsecond-scale quantities as differences between numbers on the
order of fifty thousand days I leave to your imagination. But 64 bits
was certainly not enough. Moreover, there are even subtler effects
relating to differences between TAI and TT, and issues relating to
whether your times are measured at the earth's barycenter or the solar
system barycenter... A date/time class that tries to do everything is
quickly going to become unfeasibly complicated.

Anne

From pgmdevlist at gmail.com  Fri Jul 11 12:35:20 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 11 Jul 2008 12:35:20 -0400
Subject: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy
In-Reply-To:
References: <200807111559.03986.falted@pytables.org>
	<200807111210.23484.pgmdevlist@gmail.com>
Message-ID: <200807111235.20313.pgmdevlist@gmail.com>

On Friday 11 July 2008 12:28:30 Anne Archibald wrote:
> A date/time class that tries to do everything is quickly going to
> become unfeasibly complicated.

I quite agree, and I think that's why Francesc and Ivan are
considering different classes for different problems: one targeting
series at a frequency at most daily, and some others for when
sub-second resolution is needed.

From falted at pytables.org  Fri Jul 11 12:47:46 2008
From: falted at pytables.org (Francesc Alted)
Date: Fri, 11 Jul 2008 18:47:46 +0200
Subject: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy
In-Reply-To: <200807111210.23484.pgmdevlist@gmail.com>
References: <200807111559.03986.falted@pytables.org>
	<200807111210.23484.pgmdevlist@gmail.com>
Message-ID: <200807111847.47434.falted@pytables.org>

A Friday 11 July 2008, Pierre GM escrigué:
> Francesc,
>
> > We are planning to implement some date/time types for NumPy, and
> > I'm sending a document that explains our approach. We would love
> > to hear the feedback of the NumPy community in order to cover
> > their needs as much as possible.
>
> That sounds like an excellent idea. Matt Knox and I tried something
> similar with the scikits.timeseries module we've been developing
> over the last 18 months (scipy.org/scipy/scikits/wiki/TimeSeries).
> Our approach for dealing with dates was to translate them into
> integers through a particular class (Date). The trick was to change
> the reference depending on the problem at hand: when dealing with
> annual series, the Date object is simply the year (since CE); when
> dealing with months, the number of months since 0CE; when dealing
> with hours, the number of hours since 1970... All the nitty-gritty
> parts were coded by Matt in C. And yes, we have routines to
> transform a datetime object into a Date object and back. We also
> used a parser from mxDate when dealing with dates in string formats.

That's very interesting. We will have a look at your implementation
and see if we can reuse code/ideas. I suppose this code is in your
TimeSeries module, right?

> We thought about creating specific dtypes to simplify the interface,
> but had problems finding proper documentation for that and were in
> any case more interested in having something running. The approach
> works well for us, but one of the biggest limitations we have is
> that we can't handle series with a time step shorter than one second
> (as we need integers), and your idea of a float for higher
> frequencies is great.

You can obtain at least a precision of microseconds with any of the
proposed int64-based types. For the float64-based one, you can get
that precision too if you are dealing with dates in the [1970, 2038]
range.

> About the types you propose, isn't there a typo somewhere in the
> resolution? What's the difference between your datetime64 and
> timestamp64?

It's a typo indeed. I'm attaching the new version here (with some
additional minor fixes, mainly in the format).

> > In this document, the emphasis has been put on comparing the
> > compatibility of future NumPy date/time types against the
> > ``datetime`` module that comes with Python. Should we consider
> > compatibility with mx.DateTime as well? Are there many people
> > using ``mx.DateTime`` [2]_ out there? If so, what are their
> > advantages over the ``datetime`` module?
>
> mx.DateTime has a great parser for strings, but using it adds yet
> more requirements (you need to have the module installed, it doesn't
> come with Python by default, there are some licensing issues...), so
> I wouldn't focus on that for now, if I were you.

Interesting...

> > A final note on time scales
> > ---------------------------
>
> Wow, indeed. In environmental sciences (my side) and in finance
> (Matt's), we very rarely have a need for that precision,
> thankfully...

I was surprised about this too when Ivan brought it to my attention :)

Thanks for the excellent feedback!

--
Francesc Alted

===========================================================
 A proposal for implementing some date/time types in NumPy
===========================================================

:Author: Francesc Alted i Abad
:Contact: faltet at pytables.com
:Author: Ivan Vilata i Balaguer
:Contact: ivan at selidor.net

Executive summary
=================

A date/time mark is something very handy to have in many fields where
one has to deal with data sets. While Python has several modules that
define a date/time type, like ``mx.DateTime`` or the integrated
``datetime`` [1]_, NumPy lacks one. In this document, we are proposing
the addition of a series of date/time types to fill this gap. The
requirements for the proposed types are twofold: 1) they have to be
fast to operate with, and 2) they have to be as compatible as possible
with the existing ``datetime`` module that comes with Python.
Types proposed
==============

To start with, it is virtually impossible to come up with a single
date/time type that fills the needs of every use case. So, after
pondering different possibilities, we have settled on three different
types, namely ``datetime64``, ``timestamp64`` and ``timefloat64`` --
these names are preliminary and can certainly be changed; they are
mostly useful for the sake of the discussion -- that cover different
needs. Here is a detailed description of the different types:

* ``datetime64``

  - Implemented internally as an ``int64`` type.

  - Expressed in microseconds since POSIX epoch (January 1, 1970).

  - Resolution: microseconds.

  - Time span: 278922 years in each direction since the POSIX epoch.

  Observations::

    This will be compatible with the Python ``datetime`` module not
    only in terms of precision (it also has a resolution of
    microseconds) and time span (its range is year 1 to year 9999),
    but also in that we will provide getters and setters for it.

* ``timestamp64``

  - Implemented internally as an ``int64`` type.

  - Expressed in nanoseconds since POSIX epoch (January 1, 1970).

  - Resolution: nanoseconds.

  - Time span: 272 years in each direction since the POSIX epoch.

  Observations::

    This will not be fully compatible with the Python ``datetime``
    module in terms of either precision or time span. However,
    getters and setters will be provided for it (losing precision or
    overflowing as needed).

* ``timefloat64``

  - Implemented internally as a float64.

  - Expressed in microseconds since POSIX epoch (January 1, 1970).

  - Resolution: 1 microsecond (for +-32 years from epoch) or 14 digits
    (for distant years from epoch). So the precision is *variable*.

  - Time span: 1e+308 years in each direction since the POSIX epoch.

  Observations::

    In general, this will not be fully compatible with the Python
    ``datetime`` module in terms of either precision or time span.
    However, getters and setters will be provided for it (losing
    precision or overflowing as needed).

Example of use
==============

Here is an example of usage of one of the types described above
(``datetime64``)::

  In [10]: t = numpy.zeros(5, dtype="datetime64")

  In [11]: t[0] = datetime.datetime.now()  # setter in action

  In [12]: t[0]
  Out[12]: 733234384724  # representation as an int64 (scalar)

  In [13]: t
  Out[13]: array([12155899511985929, 0, 0, 0, 0], dtype=datetime64)

  In [14]: t[0].item()  # getter in action
  Out[14]: datetime.datetime(2008, 7, 11, 14, 27, 3, 384724)

Final considerations
====================

About the ``mx.DateTime`` module
--------------------------------

In this document, the emphasis has been put on comparing the
compatibility of future NumPy date/time types against the ``datetime``
module that comes with Python. Should we consider compatibility with
``mx.DateTime`` as well? Are there many people using ``mx.DateTime``
[2]_ out there? If so, what are their advantages over the ``datetime``
module?

A final note on time scales
---------------------------

[Only for people with high precision time requirements or just for
those that love the feel of their brain exploding]

The POSIX time scale [3]_ (or UTC, on which POSIX is based) follows
the rotation of the Earth with respect to the Sun. However, after the
adoption of more precise time patterns (read: atomic clocks), it
became clear that this rotation is pretty imprecise compared with the
latter.
As a result, the UTC standard adds ``leap seconds`` from time to time
(at a rate of approximately 1 second per year) in order to compensate
for these differences. Because of this, when computing time deltas
(using the UTC standard) between two instants that differ by more than
one year, it is extremely probable that an error of one second or
several (depending on the time span) will be introduced. While this is
generally harmless for common use cases, there are situations where it
can bite people quite badly. For example, suppose that IERS
(International Earth Rotation and Reference Systems Service) decided
to add a leap second this past June 30th at 00:00:00 UTC and that you
were running an experiment precisely at that time (this is not so
rare, and has probably happened already to someone, somewhere). Then,
in the analysis phase of your experiment, you could be surprised to
find that the time deltas computed during this leap second are
actually 1 second shorter (and then, why the heck do we want to use
types that supposedly support micro- or nano-seconds of precision?).

Because of this, we were initially tempted to use the TAI (Temps
Atomique International) [4]_ standard because it is strictly
*continuous* (contrary to UTC or POSIX), thus avoiding the sort of
problems exposed above. However, the omnipresence of UTC clocks in the
computing world would force us to be continuously converting UTC
timestamps to TAI ones and vice versa. Unfortunately, this is not easy
to do because there is no simple mathematical relationship between UTC
and TAI. Instead, you have to use a table to check when IERS added the
leap seconds, and take them into account. The problem is that, as the
rotation of the Earth is irregular, one cannot determine ahead of time
when the leap seconds will be added. This can lead to problems with
code that performs the TAI <-> UTC conversion, in the sense that the
internal conversion table has to be continuously updated if we don't
want precision problems. And this is too much added complication to be
worth the effort, in our opinion.

These are the difficulties that drove us to prefer the POSIX time
scale over TAI for this implementation. However, more input on this
issue is very welcome.

.. [1] http://docs.python.org/lib/module-datetime.html
.. [2] http://www.egenix.com/products/python/mxBase/mxDateTime
.. [3] http://en.wikipedia.org/wiki/Unix_time
.. [4] http://en.wikipedia.org/wiki/International_Atomic_Time
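As a back-of-the-envelope check of the *variable* precision claimed
for ``timefloat64`` above (our own illustration, not part of the
proposal; the helper name is hypothetical), one can compute the
spacing between adjacent float64 values at a given distance from the
epoch::

  import math

  def float64_resolution_us(t_us):
      # distance between adjacent IEEE 754 doubles near t_us, i.e. the
      # best attainable resolution (in microseconds) at that instant
      m, e = math.frexp(t_us)
      return math.ldexp(1.0, e - 53)

  # ~32 years from the epoch: still well under a microsecond
  print float64_resolution_us(32 * 365.25 * 86400 * 1e6)    # 0.125
  # ~100000 years from the epoch: about half a millisecond
  print float64_resolution_us(1e5 * 365.25 * 86400 * 1e6)   # 512.0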
From Chris.Barker at noaa.gov  Fri Jul 11 12:56:31 2008
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Fri, 11 Jul 2008 09:56:31 -0700
Subject: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy
In-Reply-To: <200807111559.03986.falted@pytables.org>
References: <200807111559.03986.falted@pytables.org>
Message-ID: <487790BF.1010808@noaa.gov>

Francesc Alted wrote:
> We are planning to implement some date/time types for NumPy,

+1

A couple of questions/comments:

> ``datetime64``
> - Expressed in microseconds since POSIX epoch (January 1, 1970).
>
> - Resolution: nanoseconds.

how is that possible? Is that a typo?

> This will be compatible with the Python ``datetime`` module

very important!

> Observations::
>
> This will not be fully compatible with the Python ``datetime``
> module in terms of either precision or time span. However,
> getters and setters will be provided for it (losing precision or
> overflowing as needed).
How do you propose handling overflow? Would it raise an exception?

Another option would be to have a version that stored the datetime in
two values: say two int64s or something (kind of like complex numbers
are handled). This would allow a long time span and nanosecond (or
finer) precision. I guess it would require a bunch of math code to be
written, however.

> * ``timefloat64``
> - Resolution: 1 microsecond (for +-32 years from epoch) or 14 digits
> (for distant years from epoch). So the precision is *variable*.

I'm not sure this is that useful, exactly for that reason. What's the
motivation for it? I can see using a float for timedelta -- as, in
general, you'll need less precision the longer your time span -- but
having precision depend on how far you happen to be from the epoch
seems risky (though for anything I do, it wouldn't matter in the
least).

> Example of use
> In [11]: t[0] = datetime.datetime.now()  # setter in action
>
> In [12]: t[0]
> Out[12]: 733234384724  # representation as an int64 (scalar)

hmm - could it return a numpy.datetime object instead, rather than a
straight int64? I'd like to see a representation that is clearly
datetime.

> About the ``mx.DateTime`` module
> --------------------------------
>
> In this document, the emphasis has been put on comparing the
> compatibility of future NumPy date/time types against the
> ``datetime`` module that comes with Python. Should we consider
> compatibility with mx.DateTime as well?

No. The whole point of python's standard datetime is to have a common
system with which to deal with date-time values -- it's too bad it
didn't come sooner, so that mx.DateTime could have been built on it,
but at this point, I think supporting the standard lib one is most
important.

I couldn't find documentation (not quickly, anyway) of how the
datetime object stores its data internally, but it might be nice to
support that protocol directly -- maybe that would make for too much
math code to write, though.

What about timedelta types?

My final thought is that while I see that different applications need
different properties, having multiple representations seems like it
will introduce a lot of maintenance, documentation and support issues.
Maybe a single, more complicated representation would be a better bet
(like using two ints, rather than one, to get both range and
precision).

Thanks for working on this -- I think it will be a great addition to
numpy!

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov

From falted at pytables.org  Fri Jul 11 12:55:13 2008
From: falted at pytables.org (Francesc Alted)
Date: Fri, 11 Jul 2008 18:55:13 +0200
Subject: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy
In-Reply-To: <200807111235.20313.pgmdevlist@gmail.com>
References: <200807111559.03986.falted@pytables.org>
	<200807111235.20313.pgmdevlist@gmail.com>
Message-ID: <200807111855.13209.falted@pytables.org>

A Friday 11 July 2008, Pierre GM escrigué:
> On Friday 11 July 2008 12:28:30 Anne Archibald wrote:
> > A date/time class that tries to do everything is quickly going to
> > become unfeasibly complicated.
>
> I quite agree, and I think that's why Francesc and Ivan are
> considering different classes for different problems: one targeting
> series at a frequency at most daily, and some others for when
> sub-second resolution is needed.
Exactly, that's the idea: two of the proposed types have a fixed
sub-second resolution. ``datetime64`` has microsecond precision, while
``timestamp64`` has a precision of nanoseconds (at the expense of a
much more limited time span). For people needing geological or
astronomical time scales, ``timefloat64`` comes in handy (at the
expense of losing resolution for large time spans).

Cheers,

--
Francesc Alted

From pgmdevlist at gmail.com  Fri Jul 11 13:01:58 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 11 Jul 2008 13:01:58 -0400
Subject: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy
In-Reply-To: <200807111847.47434.falted@pytables.org>
References: <200807111559.03986.falted@pytables.org>
	<200807111210.23484.pgmdevlist@gmail.com>
	<200807111847.47434.falted@pytables.org>
Message-ID: <777651ce0807111001w2a65307cg30bcf11aa43f1c9c@mail.gmail.com>

On Fri, Jul 11, 2008 at 12:47 PM, Francesc Alted wrote:
> A Friday 11 July 2008, Pierre GM escrigué:
>
> > Our approach for dealing with dates was to translate them into
> > integers through a particular class (Date).
>
> That's very interesting. We will have a look at your implementation
> and see if we can reuse code/ideas. I suppose this code is in your
> TimeSeries module, right?

Yes, in the src directory (c_dates.c). In addition, we have a
DateArray class, which implements an array of Dates (you didn't see
that coming...) and whose dtype is int64.

The reason why we wanted to stick to ints was to permit a direct
correspondence index<->date in an array.
It only requires one sort of the data for each dimension initially, then uses a simple, but clever look up to find neighbors within some epsilon of a chosen point. Speeds appear to be about equal to k-d trees. Programming is vastly simpler than k-d trees, however. See, [1] "A Simple Algorithm for Nearest Neighbor Search in High Dimensions," Sameer A. Nene and Shree K. Nayar, IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 19 (9), 989 (1997). -- Lou Pecora, my views are my own. --- On Thu, 7/10/08, Dan Lussier wrote: > From: Dan Lussier > Subject: [Numpy-discussion] huge array calculation speed > To: numpy-discussion at scipy.org > Date: Thursday, July 10, 2008, 12:38 PM > Hello, > > I am relatively new to numpy and am having trouble with the > speed of > a specific array based calculation that I'm trying to > do. > > What I'm trying to do is to calculate the total total > potential > energy and coordination number of each atom within a > relatively large > simulation. Each atom is at a position (x,y,z) given by a > row in a > large array (approximately 1e6 by 3) and presently I have > no > information about its nearest neighbours so each its > position must be > checked against all others before cutting the list down > prior to > calculating the energy. From falted at pytables.org Fri Jul 11 13:52:32 2008 From: falted at pytables.org (Francesc Alted) Date: Fri, 11 Jul 2008 19:52:32 +0200 Subject: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy In-Reply-To: <487790BF.1010808@noaa.gov> References: <200807111559.03986.falted@pytables.org> <487790BF.1010808@noaa.gov> Message-ID: <200807111952.32849.falted@pytables.org> A Friday 11 July 2008, Christopher Barker escrigu?: > Francesc Alted wrote: > > We are planning to implement some date/time types for NumPy, > > +1 > > A couple questions/comments: > > ``datetime64`` > > - Expressed in microseconds since POSIX epoch (January 1, 1970). > > > > - Resolution: nanoseconds. > > how is that possible? Is that a typo? Exactly. This should read *microseconds*. I've sent the corrected version before. > > > This will be compatible with the Python ``datetime`` module > > very important! > > > Observations:: > > > > This will be not be fully compatible with the Python > > ``datetime`` module neither in terms of precision nor time-span. > > However, getters and setters will be provided for it (loosing > > precision or overflowing as needed). > > How to you propose handling overflowing? Would it raise an exception? Yes. We propose to use exactly the same exception handling than NumPy (so it will be configurable by the user). > > Another option would be to have a version that stored the datetime in > two values: say two int64s or something (kind of like complex numbers > are handled). This would allow a long time span and nanosecond (or > finer) precision. I guess it would require a bunch of math code to be > written, however. I suppose so, yes. Besides, this certainly violates the requeriment of having a fast implementation (unless we want to use a lot of time optimizing such a 'complex' date/time type). There is also the problem of requiring more space. See later. > > > * ``timefloat64`` > > - Resolution: 1 microsecond (for +-32 years from epoch) or 14 > > digits (for distant years from epoch). So the precision is > > *variable*. > > I'm not sure this is that useful, exactly for that reason. What's the > motivation for it? 
I can see using a float for timedelta -- as, in > general, you'll need less precision the linger your time span, but > having precision depend on how far you happen to be from the epoch > seems risky (though for anything I do, it wouldn't matter in the > least). Well, as I said before, we wanted this mainly for geological/astronomical uses, but as this type has the property of having microsecond resolution during the years [1902 - 2038], it would be definitely useful for many other cases too. I can say that Postgres, as for one, implements a datetime type based on a float64 by default (although you can choose an int64 in compilation time) with exactly the same properties than ``timefloat64``. So, if Postgres is doing this, it should be definitely useful in many use cases. > > > Example of use > > > > In [11]: t[0] = datetime.datetime.now() # setter in action > > > > In [12]: t[0] > > Out[12]: 733234384724 # representation as an int64 (scalar) > > hmm - could it return a numpy.datetime object instead, rather than a > straight int64? I'd like to see a representation that is clearly > datetime. Could be. But we should not forget that we are implementing the type for an array package, and the output can become cumbersome very soon. What I wanted to avoid here was having this: [datetime(2008, 7, 11, 19, 16, 10, 996509), datetime(2008, 7, 11, 19, 16, 10, 996535), datetime(2008, 7, 11, 19, 16, 10, 996547), datetime(2008, 7, 11, 19, 16, 10, 996559), datetime(2008, 7, 11, 19, 16, 10, 996568), dtype="datetime64"] I prefer to see this: [733234000000, 733234000000, 733234000000, 733234000000, 733234000000, dtype="datetime64"] Hmm, although for a scalar representation, I agree that this is a bit too terse. Maybe adding a 'T' (meaning 'T'ime type) and the end would be better?: In [12]: t[0] Out[12]: 733234384724T and hence: [733234000000T, 733234000000T, 733234000000T, 733234000000T, 733234000000T, dtype="datetime64"] But it would be interesting to see what other people thinks. > > > About the ``mx.DateTime`` module > > -------------------------------- > > > > In this document, the emphasis has been put in comparing the > > compatibility of future NumPy date/time types against the > > ``datetime`` module that comes with Python. Should we consider the > > compatibility with mx.DateTime as well? > > No. The whole point of python's standard datetime is to have a common > system with which to deal with date-time values -- it's too bad it > didn't come sooner, so that mx.DateTime could have been built on it, > but at this point, I think supporting the standard lib one is most > important. I see. > I couldn't find documentation (not quickly, anyway) of how the > datetime object stores its data internally, but it might be nice to > support that protocol directly -- maybe that would make for too much > math code to write, though. The internal format for the datetime module is documented in the sources, and at first sight, supporting the protocol shouldn't be too difficult. > What about timedelta types? Well, we deliberately have left timedelta out because we think that any of the three proposed types can act as a timedelta (this is also another reason for keeping the proposed representation, i.e. don't show year/month/day/etc... info). In fact, if they represent an absolute time is by the convention of having the origin of time in the UNIX epoch. But if you don't impose this convention for your array, all of timetypes can represent timedeltas. 
However, I suppose that there is a problem with the getters and setters here, that is, how external ``datetime`` timedeltas interacts with the new NumPy date/time types. Thinking a bit, the setter should be relatively easy to implement: In [37]: numpy.datetime64(datetime.timedelta(12)) Out [37]: 12T For the getter, one can think on adding a new method (only available for the date/time types): In [38]: t = numpy.datetime64(datetime.timedelta(12)) In [39]: t.totimedelta() Out [39]: datetime.timedelta(12) IMO, that would solve the issue without having to implement specific timedelta types. > My final thought is that while I see that different applications need > different properties, having multiple representations seems like it > will introduce a lot of maintenance, documentation and support > issues. Maybe a single, more complicated representation would be a > better bet (like using two ints, rather than one, to get both range > and precision) Yeah, but besides the fact that implementation would be quite slower, this sort of structs of two 'int64' would take twice the space of the proposed timetypes, and this can be killer for a package that is meant for dealing with large arrays of data. [Incidentally, I was even pondering to introduce some 32-bit date/time precisely for saving space, but as the usability of such a type would be really restricted, in the end I've opted to not including it]. > Thanks for working on this -- I think it will be a great addition to > numpy! Thanks for excellent feedback too! -- Francesc Alted From wright at esrf.fr Fri Jul 11 13:54:31 2008 From: wright at esrf.fr (Jon Wright) Date: Fri, 11 Jul 2008 19:54:31 +0200 Subject: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy In-Reply-To: <200807111559.03986.falted@pytables.org> References: <200807111559.03986.falted@pytables.org> Message-ID: <48779E57.6070400@esrf.fr> Hello, Nice idea - please can you make it work with matplotlib's time/date stuff too? Thanks, Jon Francesc Alted wrote: ... > =========================================================== > A proposal for implementing some date/time types in NumPy > =========================================================== From falted at pytables.org Fri Jul 11 14:01:39 2008 From: falted at pytables.org (Francesc Alted) Date: Fri, 11 Jul 2008 20:01:39 +0200 Subject: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy In-Reply-To: <777651ce0807111001w2a65307cg30bcf11aa43f1c9c@mail.gmail.com> References: <200807111559.03986.falted@pytables.org> <200807111847.47434.falted@pytables.org> <777651ce0807111001w2a65307cg30bcf11aa43f1c9c@mail.gmail.com> Message-ID: <200807112001.39395.falted@pytables.org> A Friday 11 July 2008, Pierre GM escrigu?: > On Fri, Jul 11, 2008 at 12:47 PM, Francesc Alted > > > wrote: > > A Friday 11 July 2008, Pierre GM escrigu?: > > > Our approach for dealing with dates was to translate them into > > > integers through a particular class (Date). > > > > That's very interesting. We will have a look at your > > implementation and see if we can reuse code/ideas. I suppose this > > code in your TimeSeries module, right? > > Yes, in the src directory (c_dates.c). > In addition, we have a DateArray class, which implements an array of > Dates (you didn't see that coming...) and whose dtype is int64. > > The reason why we wanted to stick to ints was to permit a direct > correspondence index<->date in an array. 
If you know the first date > and the timestep, you can get the date corresponding to any element > of your series, and vice-versa. Ah! Very smart! I wonder if we could use this to implement a special array with a fixed timestep that could be indexed by time instead than by index. Something like: t1 = datetime.datetime(1,2,3) t2 = datetime.datetime(3,4,5) and then: arr[numpy.datetime(t1):numpy.datetime(t2)] would select the events between t1 and t2 timestamps. Powerful! But that would introduce more complexity and besides this is not directly related with our goal. Interesting anyway. -- Francesc Alted From falted at pytables.org Fri Jul 11 14:14:01 2008 From: falted at pytables.org (Francesc Alted) Date: Fri, 11 Jul 2008 20:14:01 +0200 Subject: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy In-Reply-To: <48779E57.6070400@esrf.fr> References: <200807111559.03986.falted@pytables.org> <48779E57.6070400@esrf.fr> Message-ID: <200807112014.01248.falted@pytables.org> A Friday 11 July 2008, Jon Wright escrigu?: > Hello, > > Nice idea - please can you make it work with matplotlib's time/date > stuff too? Hmmm, following the matplotlib docstrings: """ datetime objects are converted to floating point numbers which represent the number of days since 0001-01-01 UTC """ So it is using something similar to the ``timefloat64`` in our proposal, but with a different scale (it is counting days instead of microseconds) and a different epoch (0001-01-01 UTC instead of 1970-01-01 UTC). So, it seems that setters/getters for matplotlib datetime could be supported, maybe at the risk of loosing precision. We should study this more carefully, but I suppose that if there is interest enough that could be implemented, yes. Thanks for pointing out this, -- Francesc Alted From pgmdevlist at gmail.com Fri Jul 11 14:20:26 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 11 Jul 2008 14:20:26 -0400 Subject: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy In-Reply-To: <200807112001.39395.falted@pytables.org> References: <200807111559.03986.falted@pytables.org> <777651ce0807111001w2a65307cg30bcf11aa43f1c9c@mail.gmail.com> <200807112001.39395.falted@pytables.org> Message-ID: <200807111420.26794.pgmdevlist@gmail.com> On Friday 11 July 2008 14:01:39 Francesc Alted wrote: > Ah! Very smart! I wonder if we could use this to implement a special > array with a fixed timestep that could be indexed by time instead than > by index. Something like: > > t1 = datetime.datetime(1,2,3) > t2 = datetime.datetime(3,4,5) Well, we coded something like that in our TimeSeries class: its __getitem__ is quite bloated, but you can use integers/dates/strings as indices and get your result. We implemented in Python, so that's slow, but it works great. On Friday 11 July 2008 13:54:31 Jon Wright wrote: > Hello, > > Nice idea - please can you make it work with matplotlib's time/date > stuff too? FYI, the scikits.timeseries has a module for plotting w/ TimeSeries objects. We had fun implementing the part where the labels change depending on the level of zoom... About the representation (datetime vs integer): I think that everything depends on what you want to do. Our DateArray class pretty-prints results in a human format while still using integers internally. 
For example,

>>> import scikits.timeseries as ts
>>> example=ts.date_array(start_date=ts.now('M'), length=6)
>>> print example
[Jul-2008 Aug-2008 Sep-2008 Oct-2008 Nov-2008 Dec-2008]
>>> print example.tovalue()
[24091 24092 24093 24094 24095 24096]
>>> print example.tolist()
[datetime.datetime(2008, 7, 31, 0, 0), datetime.datetime(2008, 8, 31,
0, 0), datetime.datetime(2008, 9, 30, 0, 0), datetime.datetime(2008,
10, 31, 0, 0), datetime.datetime(2008, 11, 30, 0, 0),
datetime.datetime(2008, 12, 31, 0, 0)]

Et voila (like we say at home)

Francesc:
A few weeks back, I coded some interface between TimeSeries and
pytables. I haven't really cleaned it yet but will post it very soon.
Roughly, a TimeSeries object is the combination of a MaskedArray and a
DateArray, and it can be readily transformed into a record array which
in turn can be transformed into a table.
I experimented with various levels of nesting in the definition of
dtypes, and I've been amazed by how powerful tailor-made dtypes can be.
I bow to Travis O. et al. for the implementation...

From falted at pytables.org  Fri Jul 11 14:37:54 2008
From: falted at pytables.org (Francesc Alted)
Date: Fri, 11 Jul 2008 20:37:54 +0200
Subject: [Numpy-discussion] RFC: A proposal for implementing some
	date/time types in NumPy
In-Reply-To: <200807112014.01248.falted@pytables.org>
References: <200807111559.03986.falted@pytables.org>
	<48779E57.6070400@esrf.fr>
	<200807112014.01248.falted@pytables.org>
Message-ID: <200807112037.54947.falted@pytables.org>

A Friday 11 July 2008, Francesc Alted escrigué:
> A Friday 11 July 2008, Jon Wright escrigué:
> > Hello,
> >
> > Nice idea - please can you make it work with matplotlib's time/date
> > stuff too?
>
> Hmmm, following the matplotlib docstrings:
>
> """
>     datetime objects are converted to floating point numbers
>     which represent the number of days since 0001-01-01 UTC
> """
>
> So it is using something similar to the ``timefloat64`` in our
> proposal, but with a different scale (it is counting days instead of
> microseconds) and a different epoch (0001-01-01 UTC instead of
> 1970-01-01 UTC).
>
> So, it seems that setters/getters for the matplotlib datetime could
> be supported, maybe at the risk of losing precision. We should study
> this more carefully, but I suppose that if there is enough interest
> it could be implemented, yes.

Now that I think about this, wouldn't it be better if, after the
eventual introduction of the new datetime types in NumPy, matplotlib
used one of these three and threw away its current datetime class?
[Unless they have good reasons for keeping their epoch and/or scale]

Cheers,

--
Francesc Alted

From charlesr.harris at gmail.com  Fri Jul 11 14:51:15 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 11 Jul 2008 12:51:15 -0600
Subject: [Numpy-discussion] RFC: A proposal for implementing some
	date/time types in NumPy
In-Reply-To: <200807112037.54947.falted@pytables.org>
References: <200807111559.03986.falted@pytables.org>
	<48779E57.6070400@esrf.fr>
	<200807112014.01248.falted@pytables.org>
	<200807112037.54947.falted@pytables.org>
Message-ID:

On Fri, Jul 11, 2008 at 12:37 PM, Francesc Alted
wrote:

> A Friday 11 July 2008, Francesc Alted escrigué:
> > A Friday 11 July 2008, Jon Wright escrigué:
> > > Hello,
> > >
> > > Nice idea - please can you make it work with matplotlib's time/date
> > > stuff too?
> >
> > Hmmm, following the matplotlib docstrings:
> >
> > """
> >     datetime objects are converted to floating point numbers
> >     which represent the number of days since 0001-01-01 UTC
> > """
> >
> > So it is using something similar to the ``timefloat64`` in our
> > proposal, but with a different scale (it is counting days instead of
> > microseconds) and a different epoch (0001-01-01 UTC instead of
> > 1970-01-01 UTC).
> >
> > So, it seems that setters/getters for the matplotlib datetime could
> > be supported, maybe at the risk of losing precision. We should study
> > this more carefully, but I suppose that if there is enough interest
> > it could be implemented, yes.
>
> Now that I think about this, wouldn't it be better if, after the
> eventual introduction of the new datetime types in NumPy, matplotlib
> used one of these three and threw away its current datetime class?
> [Unless they have good reasons for keeping their epoch and/or scale]
>

Especially as there was a ten-day adjustment made with the adoption of
the Gregorian calendar on Oct 4, 1582; early dates can be hard to
interpret. Curiously, IIRC, 01/01/0001 was a Monday.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From falted at pytables.org  Fri Jul 11 14:58:33 2008
From: falted at pytables.org (Francesc Alted)
Date: Fri, 11 Jul 2008 20:58:33 +0200
Subject: [Numpy-discussion] RFC: A proposal for implementing some
	date/time types in NumPy
In-Reply-To: <200807111420.26794.pgmdevlist@gmail.com>
References: <200807111559.03986.falted@pytables.org>
	<200807112001.39395.falted@pytables.org>
	<200807111420.26794.pgmdevlist@gmail.com>
Message-ID: <200807112058.33840.falted@pytables.org>

A Friday 11 July 2008, Pierre GM escrigué:
> On Friday 11 July 2008 14:01:39 Francesc Alted wrote:
> > Ah! Very smart! I wonder if we could use this to implement a
> > special array with a fixed timestep that could be indexed by time
> > instead of by index. Something like:
> >
> > t1 = datetime.datetime(1,2,3)
> > t2 = datetime.datetime(3,4,5)
>
> Well, we coded something like that in our TimeSeries class: its
> __getitem__ is quite bloated, but you can use integers/dates/strings
> as indices and get your result. We implemented it in Python, so
> that's slow, but it works great.

That's nice!  But it would be even nicer if that could be integrated in
general NumPy arrays after the introduction of the datetime types (just
thinking aloud ;-)

> About the representation (datetime vs integer): I think that
> everything depends on what you want to do. Our DateArray class
> pretty-prints results in a human format while still using integers
> internally. For example,
>
> >>> import scikits.timeseries as ts
> >>> example=ts.date_array(start_date=ts.now('M'), length=6)
> >>> print example
>
> [Jul-2008 Aug-2008 Sep-2008 Oct-2008 Nov-2008 Dec-2008]

That's ok. But my point is that this forces you to represent absolute
dates, and that's what I was trying to avoid. The proposed date/time
types could work either as absolute or relative, depending on the needs
of the user. Only when they are converted to Python
``datetime.datetime`` containers is a time origin set, and hence only
then do they represent an absolute date. However, if you convert the
NumPy datetimes into a ``datetime.timedelta``, your times will continue
to be relative. That would be utterly important so as not to clutter
NumPy too much with another set of 'timedelta' types, IMO.
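To make that duality concrete, here is how I picture the conversions
working (only a sketch: ``.totimedelta()`` is the getter from the
examples earlier in this thread, while ``.todatetime()`` is just a
hypothetical name for the absolute counterpart):

In [40]: t = numpy.datetime64(datetime.timedelta(12))

In [41]: t.totimedelta()     # relative: no time origin involved
Out [41]: datetime.timedelta(12)

In [42]: t.todatetime()      # absolute: anchored at the 1970-01-01 epoch
Out [42]: datetime.datetime(1970, 1, 13, 0, 0)

The very same internal value serves both purposes; only the conversion
chooses an origin.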
> > >>> print example.tovalue()
>
> [24091 24092 24093 24094 24095 24096]
>
> >>> print example.tolist()
>
> [datetime.datetime(2008, 7, 31, 0, 0), datetime.datetime(2008, 8, 31,
> 0, 0), datetime.datetime(2008, 9, 30, 0, 0), datetime.datetime(2008,
> 10, 31, 0, 0), datetime.datetime(2008, 11, 30, 0, 0),
> datetime.datetime(2008, 12, 31, 0, 0)]

Yes. With our proposal, '.tolist()' would return the same
representation as yours.

> Et voila (like we say at home)

And at many other parts of the planet too ;-)

> Francesc:
> A few weeks back, I coded some interface between TimeSeries and
> pytables. I haven't really cleaned it yet but will post it very soon.
> Roughly, a TimeSeries object is the combination of a MaskedArray and
> a DateArray, and it can be readily transformed into a record array
> which in turn can be transformed into a table.
> I experimented with various levels of nesting in the definition of
> dtypes, and I've been amazed by how powerful tailor-made dtypes can
> be. I bow to Travis O. et al. for the implementation...

I completely agree. Travis did a stunning job with the implementation
of the nested dtypes (as with many other things in NumPy, but this is
perhaps the one I appreciate the most).

Cheers,

--
Francesc Alted

From Chris.Barker at noaa.gov  Fri Jul 11 15:16:39 2008
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Fri, 11 Jul 2008 12:16:39 -0700
Subject: [Numpy-discussion] RFC: A proposal for implementing some
	date/time types in NumPy
In-Reply-To: <200807111420.26794.pgmdevlist@gmail.com>
References: <200807111559.03986.falted@pytables.org>
	<777651ce0807111001w2a65307cg30bcf11aa43f1c9c@mail.gmail.com>
	<200807112001.39395.falted@pytables.org>
	<200807111420.26794.pgmdevlist@gmail.com>
Message-ID: <4877B197.7050607@noaa.gov>

Pierre GM wrote:
> but you can use integers/dates/strings as indices and get your
> result.

cool! I like that.

>>>> print example
> [Jul-2008 Aug-2008 Sep-2008 Oct-2008 Nov-2008 Dec-2008]

I like this -- seeing the integers for the times makes me wonder what
the point is -- we've all been using numbers for time for years already
-- what would a datetime array give us other than auto-conversion from
datetime objects, if it doesn't include nicer display, timedeltas, etc.?

>>>> print example.tovalue()
> [24091 24092 24093 24094 24095 24096]

And is that a regular array of integers?

>>>> print example.tolist()
> [datetime.datetime(2008, 7, 31, 0, 0), datetime.datetime(2008, 8, 31, 0, 0),

nice, too.

> Now that I think about this, wouldn't it be better if, after the
> eventual introduction of the new datetime types in NumPy, matplotlib
> used one of these three and threw away its current datetime class?

yes, that would be better, but what to do during the transition?

> [Unless they have good reasons for keeping their epoch and/or scale]

If they do, those should be taken into account when designing numpy's
datetime types.

> That's nice! But it would be even nicer if that could be integrated in
> general NumPy arrays after the introduction of the datetime types
> (just thinking aloud ;-)

what would using dates/strings as indices mean for general numpy arrays?

> That's ok. But my point is that this forces you to represent absolute
> dates, and that's what I was trying to avoid. The proposed date/time
> types could work either as absolute or relative, depending on the
> needs of the user. Only when they are converted to Python
> ``datetime.datetime`` containers is a time origin set, and hence only
> then do they represent an absolute date.
> However, if you convert the NumPy datetimes into a
> ``datetime.timedelta``, your times will continue to be relative. That
> would be utterly important so as not to clutter NumPy too much with
> another set of 'timedelta' types, IMO.

hmm -- I see the tradeoff, but I like the timedelta concept too. I'm
ambivalent now...

I'm also imagining some extra utility functions/methods that would be
nice:

aDateTimeArray.hours(dtype=float)

to convert to hours (and days, and seconds, etc). And maybe some that
would create a DateTimeArray from various time units.

I often have to read/write data files that have time in various units
like that -- it would be nice to use array operations to work with
them.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From pgmdevlist at gmail.com  Fri Jul 11 15:22:30 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 11 Jul 2008 15:22:30 -0400
Subject: [Numpy-discussion] RFC: A proposal for implementing some
	date/time types in NumPy
In-Reply-To: <200807112058.33840.falted@pytables.org>
References: <200807111559.03986.falted@pytables.org>
	<200807111420.26794.pgmdevlist@gmail.com>
	<200807112058.33840.falted@pytables.org>
Message-ID: <200807111522.30751.pgmdevlist@gmail.com>

On Friday 11 July 2008 14:58:33 Francesc Alted wrote:
> > Well, we coded something like that in our TimeSeries class: its
> > __getitem__ is quite bloated, but you can use
> > integers/dates/strings as indices and get your result. We
> > implemented it in Python, so that's slow, but it works great.
>
> That's nice! But it would be even nicer if that could be integrated
> in general NumPy arrays after the introduction of the datetime types
> (just thinking aloud ;-)

Oh yes. Matt and I have a plan to implement that part in C, but I doubt
it's gonna happen anytime soon: I'd have to learn proper C first and

> > About the representation (datetime vs integer):
> That's ok. But my point is that this forces you to represent absolute
> dates, and that's what I was trying to avoid.

Mmh, it's only a matter of repr/str, in fact. Internally, your array
would still be datetime64, and you'd let the user decide how s/he wants
to display it.

> The proposed date/time types could work either as absolute or
> relative, depending on the needs of the user. Only when they are
> converted to Python ``datetime.datetime`` containers is a time origin
> set, and hence only then do they represent an absolute date. However,
> if you convert the NumPy datetimes into a ``datetime.timedelta``,
> your times will continue to be relative. That would be utterly
> important so as not to clutter NumPy too much with another set of
> 'timedelta' types, IMO.

+1
With DateArray, timedeltas are just integers; the behaviour depends on
the objects they're added to. With an annual DateArray, +1 means "add
one year". With a monthly DateArray, it means "add one month", and so
forth...
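For instance, something like this (typed from memory and untested, but
it shows the idea):

>>> import scikits.timeseries as ts
>>> months = ts.date_array(start_date=ts.now('M'), length=3)
>>> months + 1    # at monthly frequency, +1 shifts each date one month
>>> hours = ts.date_array(start_date=ts.now('H'), length=3)
>>> hours + 1     # the very same +1 now shifts each date one hour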
From wright at esrf.fr Fri Jul 11 18:56:47 2008 From: wright at esrf.fr (Jon Wright) Date: Sat, 12 Jul 2008 00:56:47 +0200 Subject: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy In-Reply-To: References: <200807111559.03986.falted@pytables.org> <48779E57.6070400@esrf.fr> <200807112014.01248.falted@pytables.org> <200807112037.54947.falted@pytables.org> Message-ID: <4877E52F.3060209@esrf.fr> Charles R Harris wrote: > On Fri, Jul 11, 2008 at 12:37 PM, Francesc Alted > A Friday 11 July 2008, Francesc Alted escrigu?: > > A Friday 11 July 2008, Jon Wright escrigu?: > > > Nice idea - please can you make it work with matplotlib's time/date > > Hmmm, following the matplotlib docstrings: > > > > """ > > datetime objects are converted to floating point numbers > > which represent the number of days since 0001-01-01 UTC > > """ ... > > this more carefully, but I suppose that if there is interest enough > > that could be implemented, yes. > > Now that I think about this, wouldn't be better if, after the eventual > introduction of the new datetime types in NumPy, the matplotlib would > use any of these three and throw away their current datetime class? > [Unless they have good reasons for keeping their epoch and/or scale] > > Especially as there was a ten day adjustment made with the adoption of > the Gregorian calender on Oct 4, 1582; early dates can be hard to > interpret. Curiously, IIRC, 01/01/0001 was a Monday. So I think I will just want to plot timeseries without (ever please) caring about date formats again. If you're proposing a "new" format then I'm assuming you want me to once again care that: 1) my temperature logger is recording data in Romance Standard Time, but not saying so, just day/month/year : time. 2) When we read that data we cannot tell which time zone it was recorded in, even if we think we remember where the logger was when it logged. 3) That the program I am running could currently be in any time zone 4) Whether the program is plotting compared to "now" in the current time zone or "then" that the data were recorded. None of these problems are new, or indeed unique, I think we only want a to_ and from_ converter to "what we mean" that we can plot, using matplotlib. Timezones are a heck of a problem if you want to be accurate. You are talking about nanosecond resolutions, however, atomic clocks in orbit apparently suffer from relativistic corrections of the order 38000 nanoseconds per day [1]. What will you do about data recorded on the international space station? Getting into time formats at this level seems to be rather complicated - there is no absolute time you can reference to - it is all relative :-) Thanks, and bon chance, Jon [1] http://www.phys.lsu.edu/mog/mog9/node9.html From peridot.faceted at gmail.com Fri Jul 11 20:17:23 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Fri, 11 Jul 2008 20:17:23 -0400 Subject: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy In-Reply-To: <4877E52F.3060209@esrf.fr> References: <200807111559.03986.falted@pytables.org> <48779E57.6070400@esrf.fr> <200807112014.01248.falted@pytables.org> <200807112037.54947.falted@pytables.org> <4877E52F.3060209@esrf.fr> Message-ID: 2008/7/11 Jon Wright : > Timezones are a heck of a problem if you want to be accurate. You are > talking about nanosecond resolutions, however, atomic clocks in orbit > apparently suffer from relativistic corrections of the order 38000 > nanoseconds per day [1]. 
What will you do about data recorded on the > international space station? Getting into time formats at this level > seems to be rather complicated - there is no absolute time you can > reference to - it is all relative :-) This particular issue turns up in pulsar timing; if you use X-ray observations, the satellite's orbit introduces all kinds of time variations. If you care about this you need to think about using (say) TAI, referenced to (say) the solar system barycenter (if there were no mass there). You can do all this, but I think it's out of scope for an ordinary date/time class. Anne From charlesr.harris at gmail.com Sat Jul 12 00:23:12 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 11 Jul 2008 22:23:12 -0600 Subject: [Numpy-discussion] Missing NULL return checks? Message-ID: PyArray_DescrFromType can return NULL static PyArray_Descr * PyArray_DescrFromType(int type) { PyArray_Descr *ret = NULL; if (type < PyArray_NTYPES) { ret = _builtin_descrs[type]; } else if (type == PyArray_NOTYPE) { /* * This needs to not raise an error so * that PyArray_DescrFromType(PyArray_NOTYPE) * works for backwards-compatible C-API */ return NULL; } else if ((type == PyArray_CHAR) || (type == PyArray_CHARLTR)) { ret = PyArray_DescrNew(_builtin_descrs[PyArray_STRING]); if (ret == NULL) { return NULL; } ret->elsize = 1; ret->type = PyArray_CHARLTR; return ret; } else if (PyTypeNum_ISUSERDEF(type)) { ret = userdescrs[type - PyArray_USERDEF]; } else { int num = PyArray_NTYPES; if (type < _MAX_LETTER) { num = (int) _letter_to_num[type]; } if (num >= PyArray_NTYPES) { ret = NULL; } else { ret = _builtin_descrs[num]; } } if (ret == NULL) { PyErr_SetString(PyExc_ValueError, "Invalid data-type for array"); } else { Py_INCREF(ret); } return ret; } Yet it is unchecked in several places: static int PyArray_CanCastSafely(int fromtype, int totype) { PyArray_Descr *from, *to; register int felsize, telsize; if (fromtype == totype) return 1; if (fromtype == PyArray_BOOL) return 1; if (totype == PyArray_BOOL) return 0; if (totype == PyArray_OBJECT || totype == PyArray_VOID) return 1; if (fromtype == PyArray_OBJECT || fromtype == PyArray_VOID) return 0; from = PyArray_DescrFromType(fromtype); /* * cancastto is a PyArray_NOTYPE terminated C-int-array of types that * the data-type can be cast to safely. 
*/ if (from->f->cancastto) { int *curtype; curtype = from->f->cancastto; while (*curtype != PyArray_NOTYPE) { if (*curtype++ == totype) return 1; } } if (PyTypeNum_ISUSERDEF(totype)) return 0; to = PyArray_DescrFromType(totype); telsize = to->elsize; felsize = from->elsize; Py_DECREF(from); Py_DECREF(to); switch(fromtype) { case PyArray_BYTE: case PyArray_SHORT: case PyArray_INT: case PyArray_LONG: case PyArray_LONGLONG: if (PyTypeNum_ISINTEGER(totype)) { if (PyTypeNum_ISUNSIGNED(totype)) { return 0; } else { return (telsize >= felsize); } } else if (PyTypeNum_ISFLOAT(totype)) { if (felsize < 8) return (telsize > felsize); else return (telsize >= felsize); } else if (PyTypeNum_ISCOMPLEX(totype)) { if (felsize < 8) return ((telsize >> 1) > felsize); else return ((telsize >> 1) >= felsize); } else return totype > fromtype; case PyArray_UBYTE: case PyArray_USHORT: case PyArray_UINT: case PyArray_ULONG: case PyArray_ULONGLONG: if (PyTypeNum_ISINTEGER(totype)) { if (PyTypeNum_ISSIGNED(totype)) { return (telsize > felsize); } else { return (telsize >= felsize); } } else if (PyTypeNum_ISFLOAT(totype)) { if (felsize < 8) return (telsize > felsize); else return (telsize >= felsize); } else if (PyTypeNum_ISCOMPLEX(totype)) { if (felsize < 8) return ((telsize >> 1) > felsize); else return ((telsize >> 1) >= felsize); } else return totype > fromtype; case PyArray_FLOAT: case PyArray_DOUBLE: case PyArray_LONGDOUBLE: if (PyTypeNum_ISCOMPLEX(totype)) return ((telsize >> 1) >= felsize); else return (totype > fromtype); case PyArray_CFLOAT: case PyArray_CDOUBLE: case PyArray_CLONGDOUBLE: return (totype > fromtype); case PyArray_STRING: case PyArray_UNICODE: return (totype > fromtype); default: return 0; } } Furthermore, the last function can fail, but doesn't seem to have an error return. What is the best way to go about cleaning this up? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jan.tore.korneliussen at tandberg.com Sat Jul 12 09:56:08 2008 From: jan.tore.korneliussen at tandberg.com (Jan Tore Korneliussen) Date: Sat, 12 Jul 2008 15:56:08 +0200 Subject: [Numpy-discussion] Intel Math Kernel Library "FATAL ERROR" Message-ID: <1215870968.24069.7.camel@jtk-desktop> I found that the NumPy regression test error for MKL that was reported a while ago happens with MKL 10.0.3.020, but not with MKL 10.0.1.014 (everything else equal) Here is a dump of the regression test on my machine with 10.0.3.020 >>> import numpy >>> numpy.test() Numpy is installed in /usr/lib/python2.5/site-packages/numpy Numpy version 1.1.0 Python version 2.5.2 (r252:60911, Apr 21 2008, 11:12:42) [GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] Found 36/36 tests for numpy.core.tests.test_numerictypes Found 16/16 tests for numpy.core.tests.test_umath Found 3/3 tests for numpy.core.tests.test_memmap Found 7/7 tests for numpy.core.tests.test_scalarmath Found 12/12 tests for numpy.core.tests.test_records Found 286/286 tests for numpy.core.tests.test_multiarray Found 143/143 tests for numpy.core.tests.test_regression Found 2/2 tests for numpy.core.tests.test_ufunc Found 18/18 tests for numpy.core.tests.test_defmatrix Found 3/3 tests for numpy.core.tests.test_errstate Found 70/70 tests for numpy.core.tests.test_numeric Found 63/63 tests for numpy.core.tests.test_unicode Found 4/4 tests for numpy.distutils.tests.test_fcompiler_gnu Found 5/5 tests for numpy.distutils.tests.test_misc_util Found 3/3 tests for numpy.fft.tests.test_helper Found 2/2 tests for numpy.fft.tests.test_fftpack Found 5/5 tests for numpy.lib.tests.test_getlimits Found 15/15 tests for numpy.lib.tests.test_io Found 6/6 tests for numpy.lib.tests.test_index_tricks Found 49/49 tests for numpy.lib.tests.test_shape_base Found 53/53 tests for numpy.lib.tests.test_function_base Found 24/24 tests for numpy.lib.tests.test__datasource Found 4/4 tests for numpy.lib.tests.test_polynomial Found 15/15 tests for numpy.lib.tests.test_twodim_base Found 1/1 tests for numpy.lib.tests.test_regression Found 1/1 tests for numpy.lib.tests.test_financial Found 43/43 tests for numpy.lib.tests.test_type_check Found 1/1 tests for numpy.lib.tests.test_machar Found 1/1 tests for numpy.lib.tests.test_ufunclike Found 10/10 tests for numpy.lib.tests.test_arraysetops Found 89/89 tests for numpy.linalg.tests.test_linalg Found 3/3 tests for numpy.linalg.tests.test_regression Found 36/36 tests for numpy.ma.tests.test_old_ma Found 94/94 tests for numpy.ma.tests.test_core Found 15/15 tests for numpy.ma.tests.test_extras Found 4/4 tests for numpy.ma.tests.test_subclassing Found 17/17 tests for numpy.ma.tests.test_mrecords Found 7/7 tests for numpy.tests.test_random Found 16/16 tests for numpy.testing.tests.test_utils Found 5/5 tests for numpy.tests.test_ctypeslib .......................................................................................................................................................................................................MKL FATAL ERROR: /opt/intel/mkl/10.0.3.020/lib/32//: cannot read file data: Is a directory From jdh2358 at gmail.com Sat Jul 12 10:01:33 2008 From: jdh2358 at gmail.com (John Hunter) Date: Sat, 12 Jul 2008 09:01:33 -0500 Subject: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy In-Reply-To: <200807112014.01248.falted@pytables.org> References: <200807111559.03986.falted@pytables.org> <48779E57.6070400@esrf.fr> <200807112014.01248.falted@pytables.org> Message-ID: 
<88e473830807120701j2a0693f6o5c246b55d68211f9@mail.gmail.com>

On Fri, Jul 11, 2008 at 1:14 PM, Francesc Alted wrote:
> So, it seems that setters/getters for the matplotlib datetime could be
> supported, maybe at the risk of losing precision.  We should study
> this more carefully, but I suppose that if there is enough interest
> it could be implemented, yes.

You don't need to worry about mpl -- we will support whatever datetime
handling numpy implements (I think your proposal would be a great
addition). We have been moving away from the date2num floats in mpl
(as you note, using floats was not a great idea because of the
precision issue). We now support native python datetime handling, but
a numpy datetime would be ideal. The infrastructure we use to handle
python datetimes can easily support other datetime objects.

JDH

From bryan.fodness at gmail.com  Sat Jul 12 10:07:06 2008
From: bryan.fodness at gmail.com (Bryan Fodness)
Date: Sat, 12 Jul 2008 10:07:06 -0400
Subject: [Numpy-discussion] loadtxt and usecols
Message-ID:

I would like to load my data without knowing its length; I have
explicitly stated the columns:

data = loadtxt('18B180.dat', skiprows = 1, usecols =
(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45))

and would like to use something like,

data = loadtxt('18B180.dat', skiprows = 1, usecols = (1,:))

the first column is the only one that I do not need.

--
"The game of science can accurately be described as a never-ending
insult to human intelligence." - João Magueijo

"Any intelligent fool can make things bigger, more complex, and more
violent. It takes a touch of genius - and a lot of courage - to move in
the opposite direction. " -Albert Einstein
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From cournape at gmail.com  Sat Jul 12 10:31:14 2008
From: cournape at gmail.com (David Cournapeau)
Date: Sat, 12 Jul 2008 16:31:14 +0200
Subject: [Numpy-discussion] Intel Math Kernel Library "FATAL ERROR"
In-Reply-To: <1215870968.24069.7.camel@jtk-desktop>
References: <1215870968.24069.7.camel@jtk-desktop>
Message-ID: <5b8d13220807120731m19694e0cic87470355a055d08@mail.gmail.com>

On Sat, Jul 12, 2008 at 3:56 PM, Jan Tore Korneliussen
wrote:
> I found that the NumPy regression test error for MKL that was reported a
> while ago happens with MKL 10.0.3.020, but not with MKL 10.0.1.014
> (everything else equal)
>

It is more likely a bug in the MKL. Please update to a more recent
version, or downgrade to a version which works.

David

From lbolla at gmail.com  Sat Jul 12 10:35:20 2008
From: lbolla at gmail.com (Lorenzo Bolla)
Date: Sat, 12 Jul 2008 16:35:20 +0200
Subject: [Numpy-discussion] loadtxt and usecols
In-Reply-To: 
References: 
Message-ID: <20080712143519.GA15362@lollo-laptop>

why not use:

data = loadtxt('18B180.dat', skiprows = 1, usecols = xrange(1,46))

obviously, you need to know how many columns you have.
hth,
L.

On Sat, Jul 12, 2008 at 10:07:06AM -0400, Bryan Fodness wrote:
> I would like to load my data without knowing its length; I have
> explicitly stated the columns:
>
> data = loadtxt('18B180.dat', skiprows = 1, usecols =
> (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45))
>
> and would like to use something like,
>
> data = loadtxt('18B180.dat', skiprows = 1, usecols = (1,:))
>
> the first column is the only one that I do not need.
>
> --
> "The game of science can accurately be described as a never-ending
> insult to human intelligence." - João Magueijo
>
> "Any intelligent fool can make things bigger, more complex, and more
> violent. It takes a touch of genius - and a lot of courage - to move
> in the opposite direction. " -Albert Einstein
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

From jan.tore.korneliussen at tandberg.com  Sat Jul 12 10:38:22 2008
From: jan.tore.korneliussen at tandberg.com (Jan Tore Korneliussen)
Date: Sat, 12 Jul 2008 16:38:22 +0200
Subject: [Numpy-discussion] Intel Math Kernel Library "FATAL ERROR"
In-Reply-To: <5b8d13220807120731m19694e0cic87470355a055d08@mail.gmail.com>
References: <1215870968.24069.7.camel@jtk-desktop>
	<5b8d13220807120731m19694e0cic87470355a055d08@mail.gmail.com>
Message-ID: <1215873502.24069.12.camel@jtk-desktop>

Yes, I downgraded from 10.0.3.020 to 10.0.1.014, and then the test
worked. Since 10.0.3.020 is the newest version, perhaps it should be
reported to Intel to prevent it from appearing in coming versions? (If
it is actually a bug in MKL)

On Sat, 2008-07-12 at 16:31 +0200, David Cournapeau wrote:
> On Sat, Jul 12, 2008 at 3:56 PM, Jan Tore Korneliussen
> wrote:
> > I found that the NumPy regression test error for MKL that was reported a
> > while ago happens with MKL 10.0.3.020, but not with MKL 10.0.1.014
> > (everything else equal)
> >
>
> It is more likely a bug in the MKL. Please update to a more recent
> version, or downgrade to a version which works.
>
> David
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

From michael at araneidae.co.uk  Sat Jul 12 10:42:54 2008
From: michael at araneidae.co.uk (Michael Abbott)
Date: Sat, 12 Jul 2008 14:42:54 +0000 (GMT)
Subject: [Numpy-discussion] Missing NULL return checks?
In-Reply-To: 
References: 
Message-ID: <20080712083351.P30480@saturn.araneidae.co.uk>

> PyArray_DescrFromType can return NULL
Yah, you noticed ;)

> Yet it is unchecked in several places:
Pity about that.  Easy enough to fix though -- just don't lose track of
ref counts.  In fact, I've already submitted a patch to this function
(but not addressing this issue).

> static int
> PyArray_CanCastSafely(int fromtype, int totype)
> {
>     PyArray_Descr *from, *to;
>     register int felsize, telsize;
>
>     if (fromtype == totype) return 1;
>     if (fromtype == PyArray_BOOL) return 1;
>     if (totype == PyArray_BOOL) return 0;
>     if (totype == PyArray_OBJECT || totype == PyArray_VOID) return 1;
>     if (fromtype == PyArray_OBJECT || fromtype == PyArray_VOID) return 0;
>
>     from = PyArray_DescrFromType(fromtype);
    if (from == NULL) return 0;
>     /*
>      * cancastto is a PyArray_NOTYPE terminated C-int-array of types that
>      * the data-type can be cast to safely.
>      */
>     if (from->f->cancastto) {
>         int *curtype;
>         curtype = from->f->cancastto;
>         while (*curtype != PyArray_NOTYPE) {
>             if (*curtype++ == totype) return 1;
>         }
>     }
>     if (PyTypeNum_ISUSERDEF(totype)) return 0;
>
>     to = PyArray_DescrFromType(totype);
    if (to == NULL) { Py_DECREF(from); return 0; }
>     telsize = to->elsize;
>     felsize = from->elsize;
>     Py_DECREF(from);
>     Py_DECREF(to);
> ...
> }
>
> Furthermore, the last function can fail, but doesn't seem to have an error
> return. What is the best way to go about cleaning this up?
Given the question the function is asking, returning false seems good enough for "failure". From nadavh at visionsense.com Sat Jul 12 10:39:49 2008 From: nadavh at visionsense.com (Nadav Horesh) Date: Sat, 12 Jul 2008 17:39:49 +0300 Subject: [Numpy-discussion] A correction to numpy trapz function Message-ID: <710F2847B0018641891D9A216027636029C1C7@ex3.envision.co.il> The function trapz accepts x axis vector only for axis=-1. Here is my modification (correction?) to let it accept a vector x for integration along any axis: def trapz(y, x=None, dx=1.0, axis=-1): """ Integrate y(x) using samples along the given axis and the composite trapezoidal rule. If x is None, spacing given by dx is assumed. If x is an array, it must have either the dimensions of y, or a vector of length matching the dimension of y along the integration axis. """ y = asarray(y) nd = y.ndim slice1 = [slice(None)]*nd slice2 = [slice(None)]*nd slice1[axis] = slice(1,None) slice2[axis] = slice(None,-1) if x is None: d = dx else: x = asarray(x) if x.ndim == 1: if len(x) != y.shape[axis]: raise ValueError('x length (%d) does not match y axis %d length (%d)' % (len(x), axis, y.shape[axis])) d = diff(x) return tensordot(d, (y[slice1]+y[slice2])/2.0,(0, axis)) d = diff(x, axis=axis) return add.reduce(d * (y[slice1]+y[slice2])/2.0,axis) Nadav. From rmay31 at gmail.com Sat Jul 12 11:31:11 2008 From: rmay31 at gmail.com (Ryan May) Date: Sat, 12 Jul 2008 11:31:11 -0400 Subject: [Numpy-discussion] A correction to numpy trapz function In-Reply-To: <710F2847B0018641891D9A216027636029C1C7@ex3.envision.co.il> References: <710F2847B0018641891D9A216027636029C1C7@ex3.envision.co.il> Message-ID: <4878CE3F.9050308@gmail.com> Nadav Horesh wrote: > The function trapz accepts x axis vector only for axis=-1. Here is my modification (correction?) to let it accept a vector x for integration along any axis: > > def trapz(y, x=None, dx=1.0, axis=-1): > """ > Integrate y(x) using samples along the given axis and the composite > trapezoidal rule. If x is None, spacing given by dx is assumed. If x > is an array, it must have either the dimensions of y, or a vector of > length matching the dimension of y along the integration axis. > """ > y = asarray(y) > nd = y.ndim > slice1 = [slice(None)]*nd > slice2 = [slice(None)]*nd > slice1[axis] = slice(1,None) > slice2[axis] = slice(None,-1) > if x is None: > d = dx > else: > x = asarray(x) > if x.ndim == 1: > if len(x) != y.shape[axis]: > raise ValueError('x length (%d) does not match y axis %d length (%d)' % (len(x), axis, y.shape[axis])) > d = diff(x) > return tensordot(d, (y[slice1]+y[slice2])/2.0,(0, axis)) > d = diff(x, axis=axis) > return add.reduce(d * (y[slice1]+y[slice2])/2.0,axis) > What version were you working with originally? With 1.1, this is what I have: def trapz(y, x=None, dx=1.0, axis=-1): """Integrate y(x) using samples along the given axis and the composite trapezoidal rule. If x is None, spacing given by dx is assumed. """ y = asarray(y) if x is None: d = dx else: d = diff(x,axis=axis) nd = len(y.shape) slice1 = [slice(None)]*nd slice2 = [slice(None)]*nd slice1[axis] = slice(1,None) slice2[axis] = slice(None,-1) return add.reduce(d * (y[slice1]+y[slice2])/2.0,axis) For me, this works fine with supplying x for axis != -1. 
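For instance, a quick check (numpy 1.1; note the reshape, which gives x
a trailing unit axis so it broadcasts against y along axis 0 -- the
expected output is what I compute by hand):

import numpy as np
y = np.arange(24).reshape(6,4)
x = np.arange(6).reshape(-1,1)   # column vector: broadcasts along axis 0
np.trapz(y, x, axis=0)           # -> array([ 50.,  55.,  60.,  65.])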
Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From travis at enthought.com Sat Jul 12 11:34:57 2008 From: travis at enthought.com (Travis Vaught) Date: Sat, 12 Jul 2008 10:34:57 -0500 Subject: [Numpy-discussion] SciPy 2008 Registration Deadline Extended Message-ID: <3A00FC0B-B463-4308-B0CF-7D6F3195E609@enthought.com> Greetings, The merchant account processor that we use for the SciPy Conference online registration has been experiencing some inexplicable problems authorizing some registrations. Apologies to those who have struggled to register and have not been successful. Because of the problems, we're extending the early-bird rates through Monday at midnight Central Time. If you experience any problems registering, please give us a call during business hours Monday (9:00am - 5:00pm Central - 512.536.1057). http://conference.scipy.org/ For those of you who have set up an account on the conference site, but have not yet registered, I encourage you to do so in time to take advantage of the lower rates. I also encourage everyone to make sure you've specified which tutorial track, T-shirt size, whether you'll attend the sprint, and meal preferences in your profile (http://conference.scipy.org/profile ). Please send me an email if you have any questions. Best, Travis From mattknox.ca at gmail.com Sat Jul 12 11:50:40 2008 From: mattknox.ca at gmail.com (Matt Knox) Date: Sat, 12 Jul 2008 15:50:40 +0000 (UTC) Subject: [Numpy-discussion] =?utf-8?q?RFC=3A_A_proposal_for_implementing_s?= =?utf-8?q?ome=09date/time_types_in_NumPy?= References: <200807111559.03986.falted@pytables.org> <777651ce0807111001w2a65307cg30bcf11aa43f1c9c@mail.gmail.com> <200807112001.39395.falted@pytables.org> <200807111420.26794.pgmdevlist@gmail.com> <4877B197.7050607@noaa.gov> Message-ID: Christopher Barker noaa.gov> writes: >> I'm also imaging some extra utility functions/method that would be nice: >> >> aDateTimeArray.hours(dtype=float) >> >> to convert to hours (and days, and seconds, etc). And maybe some that >> would create a DateTimeArray from various time units. The DateArray class in the timeseries scikits can do part of what you want. Observe... >>> import scikits.timeseries as ts >>> a = ts.date_array(start_date=ts.now('hourly'), length=15) >>> a DateArray([12-Jul-2008 11:00, 12-Jul-2008 12:00, 12-Jul-2008 13:00, 12-Jul-2008 14:00, 12-Jul-2008 15:00, 12-Jul-2008 16:00, 12-Jul-2008 17:00, 12-Jul-2008 18:00, 12-Jul-2008 19:00, 12-Jul-2008 20:00, 12-Jul-2008 21:00, 12-Jul-2008 22:00, 12-Jul-2008 23:00, 13-Jul-2008 00:00, 13-Jul-2008 01:00], freq='H') >>> a.year array([2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008]) >>> a.hour array([11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 0, 1]) >>> a.day array([12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 13, 13]) >>> Note that the DateArray (or TimeSeries) need not be continuous, I just constructed a continuous DateArray in this example for simplicity. I would encourage you to take a look at the wiki (http://scipy.org/scipy/scikits/wiki/TimeSeries) as you may find some surprises in there that prove useful. >> >> I often have to read/write data files that have time in various units >> like that -- it would be nice to use array operations to work with them. If peak performance is not a concern, parsing of most date formats can be done automatically using the built in parser in the timeseries module (borrowed from mx.DateTime). Observe... 
>>> dlist = ['14-jan-2001 14:34:33', '16-jan-2001 10:09:11'] >>> a = ts.date_array(dlist, freq='secondly') >>> a DateArray([14-Jan-2001 14:34:33, 16-Jan-2001 10:09:11], freq='S') >>> a.second array([33, 11]) - Matt From charlesr.harris at gmail.com Sat Jul 12 12:13:11 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 12 Jul 2008 10:13:11 -0600 Subject: [Numpy-discussion] Missing NULL return checks? In-Reply-To: <20080712083351.P30480@saturn.araneidae.co.uk> References: <20080712083351.P30480@saturn.araneidae.co.uk> Message-ID: On Sat, Jul 12, 2008 at 8:42 AM, Michael Abbott wrote: > > PyArray_DescrFromType can return NULL > Yah, you noticed ;) > > > Yet it is unchecked in several places: > Pity about that. Easy enough to fix though -- just don't lose track of > ref counts. In fact, I've already submitted a patch to this function (but > not addressing this issue). > > > static int > > PyArray_CanCastSafely(int fromtype, int totype) > > { > > PyArray_Descr *from, *to; > > register int felsize, telsize; > > > > if (fromtype == totype) return 1; > > if (fromtype == PyArray_BOOL) return 1; > > if (totype == PyArray_BOOL) return 0; > > if (totype == PyArray_OBJECT || totype == PyArray_VOID) return 1; > > if (fromtype == PyArray_OBJECT || fromtype == PyArray_VOID) return 0; > > > > from = PyArray_DescrFromType(fromtype); > if (from == NULL) return 0; > > > /* > > * cancastto is a PyArray_NOTYPE terminated C-int-array of types that > > * the data-type can be cast to safely. > > */ > > if (from->f->cancastto) { > > int *curtype; > > curtype = from->f->cancastto; > > while (*curtype != PyArray_NOTYPE) { > > if (*curtype++ == totype) return 1; > > } > > } > > if (PyTypeNum_ISUSERDEF(totype)) return 0; > > > > to = PyArray_DescrFromType(totype); > if (to == NULL) { Py_DECREF(from); return 0; } > > > telsize = to->elsize; > > felsize = from->elsize; > > Py_DECREF(from); > > Py_DECREF(to); > > ... > > } > > > > Furthermore, the last function can fail, but doesn't seem to have an > error > > return. What is the best way to go about cleaning this up? > > Given the question the function is asking, returning false seems good > enough for "failure". Good point, but a memory error may have been set by PyArray_DescrNew. My impression is that the routine was originally intended to return references to static singleton instances, in which case it couldn't fail. I think we need a separate static instance for PyArray_CHARLTR instead of making a copyof a string type and fudging a few members so that it too can't fail. The array indexing for user defined types probably needs bounds checking also, but I'm not sure what to do there. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Jul 12 12:49:49 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 12 Jul 2008 10:49:49 -0600 Subject: [Numpy-discussion] Missing NULL return checks? In-Reply-To: References: <20080712083351.P30480@saturn.araneidae.co.uk> Message-ID: On Sat, Jul 12, 2008 at 10:13 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Sat, Jul 12, 2008 at 8:42 AM, Michael Abbott > wrote: > >> > PyArray_DescrFromType can return NULL >> Yah, you noticed ;) >> >> > Yet it is unchecked in several places: >> Pity about that. Easy enough to fix though -- just don't lose track of >> ref counts. In fact, I've already submitted a patch to this function (but >> not addressing this issue). 
>> >> > static int >> > PyArray_CanCastSafely(int fromtype, int totype) >> > { >> > PyArray_Descr *from, *to; >> > register int felsize, telsize; >> > >> > if (fromtype == totype) return 1; >> > if (fromtype == PyArray_BOOL) return 1; >> > if (totype == PyArray_BOOL) return 0; >> > if (totype == PyArray_OBJECT || totype == PyArray_VOID) return 1; >> > if (fromtype == PyArray_OBJECT || fromtype == PyArray_VOID) return >> 0; >> > >> > from = PyArray_DescrFromType(fromtype); >> if (from == NULL) return 0; >> >> > /* >> > * cancastto is a PyArray_NOTYPE terminated C-int-array of types >> that >> > * the data-type can be cast to safely. >> > */ >> > if (from->f->cancastto) { >> > int *curtype; >> > curtype = from->f->cancastto; >> > while (*curtype != PyArray_NOTYPE) { >> > if (*curtype++ == totype) return 1; >> > } >> > } >> > if (PyTypeNum_ISUSERDEF(totype)) return 0; >> > >> > to = PyArray_DescrFromType(totype); >> if (to == NULL) { Py_DECREF(from); return 0; } >> >> > telsize = to->elsize; >> > felsize = from->elsize; >> > Py_DECREF(from); >> > Py_DECREF(to); >> > ... >> > } >> > >> > Furthermore, the last function can fail, but doesn't seem to have an >> error >> > return. What is the best way to go about cleaning this up? >> >> Given the question the function is asking, returning false seems good >> enough for "failure". > > > Good point, but a memory error may have been set by PyArray_DescrNew. My > impression is that the routine was originally intended to return references > to static singleton instances, in which case it couldn't fail. I think we > need a separate static instance for PyArray_CHARLTR instead of making a > copy of a string type and fudging a few members so that it too can't fail. > The array indexing for user defined types probably needs bounds checking > also, but I'm not sure what to do there. > This bit looks hinky, too: else { int num = PyArray_NTYPES; if (type < _MAX_LETTER) { num = (int) _letter_to_num[type]; } if (num >= PyArray_NTYPES) { ret = NULL; } else { ret = _builtin_descrs[num]; } Type shouldn't have alternate meanings. Maybe this is a backwards compatibility thing. We could write a new function that doesn't set any Python errors, leaving that to higher level routines. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Jul 12 14:11:01 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 12 Jul 2008 12:11:01 -0600 Subject: [Numpy-discussion] snprintf vs PyOS_snprintf Message-ID: Numpy uses a mix of snprintf and PyOS_snprintf. The Python version is there because snprintf wasn't part of the standard until C99. So either we should stick to the python version or make the decision that we only support compilers with a working snprintf. Which way should we go? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From nadavh at visionsense.com  Sat Jul 12 14:34:04 2008
From: nadavh at visionsense.com (Nadav Horesh)
Date: Sat, 12 Jul 2008 21:34:04 +0300
Subject: [Numpy-discussion] A correction to numpy trapz function
References: <710F2847B0018641891D9A216027636029C1C7@ex3.envision.co.il>
	<4878CE3F.9050308@gmail.com>
Message-ID: <710F2847B0018641891D9A216027636029C1C8@ex3.envision.co.il>

Here is what I get with the original trapz function:

IDLE 1.2.2
>>> import numpy as np
>>> np.__version__
'1.1.0'
>>> y = np.arange(24).reshape(6,4)
>>> x = np.arange(6)
>>> np.trapz(y, x, axis=0)

Traceback (most recent call last):
  File "", line 1, in
    np.trapz(y, x, axis=0)
  File "C:\Python25\Lib\site-packages\numpy\lib\function_base.py", line 1536, in trapz
    return add.reduce(d * (y[slice1]+y[slice2])/2.0,axis)
ValueError: shape mismatch: objects cannot be broadcast to a single shape
>>>

  Nadav.

-----Original Message-----
From: numpy-discussion-bounces at scipy.org on behalf of Ryan May
Sent: Sat 12-Jul-08 18:31
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] A correction to numpy trapz function

Nadav Horesh wrote:
> The function trapz accepts x axis vector only for axis=-1. Here is my modification (correction?) to let it accept a vector x for integration along any axis:
>
> def trapz(y, x=None, dx=1.0, axis=-1):
>     """
>     Integrate y(x) using samples along the given axis and the composite
>     trapezoidal rule. If x is None, spacing given by dx is assumed. If x
>     is an array, it must have either the dimensions of y, or a vector of
>     length matching the dimension of y along the integration axis.
>     """
>     y = asarray(y)
>     nd = y.ndim
>     slice1 = [slice(None)]*nd
>     slice2 = [slice(None)]*nd
>     slice1[axis] = slice(1,None)
>     slice2[axis] = slice(None,-1)
>     if x is None:
>         d = dx
>     else:
>         x = asarray(x)
>         if x.ndim == 1:
>             if len(x) != y.shape[axis]:
>                 raise ValueError('x length (%d) does not match y axis %d length (%d)' % (len(x), axis, y.shape[axis]))
>             d = diff(x)
>             return tensordot(d, (y[slice1]+y[slice2])/2.0,(0, axis))
>         d = diff(x, axis=axis)
>     return add.reduce(d * (y[slice1]+y[slice2])/2.0,axis)
>

What version were you working with originally?  With 1.1, this is what
I have:

def trapz(y, x=None, dx=1.0, axis=-1):
    """Integrate y(x) using samples along the given axis and the composite
    trapezoidal rule.  If x is None, spacing given by dx is assumed.
    """
    y = asarray(y)
    if x is None:
        d = dx
    else:
        d = diff(x,axis=axis)
    nd = len(y.shape)
    slice1 = [slice(None)]*nd
    slice2 = [slice(None)]*nd
    slice1[axis] = slice(1,None)
    slice2[axis] = slice(None,-1)
    return add.reduce(d * (y[slice1]+y[slice2])/2.0,axis)

For me, this works fine with supplying x for axis != -1.

Ryan

--
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion at scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: winmail.dat Type: application/ms-tnef Size: 4084 bytes Desc: not available URL: From rmay31 at gmail.com Sat Jul 12 15:24:52 2008 From: rmay31 at gmail.com (Ryan May) Date: Sat, 12 Jul 2008 15:24:52 -0400 Subject: [Numpy-discussion] A correction to numpy trapz function In-Reply-To: <710F2847B0018641891D9A216027636029C1C8@ex3.envision.co.il> References: <710F2847B0018641891D9A216027636029C1C7@ex3.envision.co.il> <4878CE3F.9050308@gmail.com> <710F2847B0018641891D9A216027636029C1C8@ex3.envision.co.il> Message-ID: <48790504.3090406@gmail.com> Nadav Horesh wrote: > Here is what I get with the orriginal trapz function: > > IDLE 1.2.2 >>>> import numpy as np >>>> np.__version__ > '1.1.0' >>>> y = np.arange(24).reshape(6,4) >>>> x = np.arange(6) >>>> np.trapz(y, x, axis=0) > > Traceback (most recent call last): > File "", line 1, in > np.trapz(y, x, axis=0) > File "C:\Python25\Lib\site-packages\numpy\lib\function_base.py", line 1536, in trapz > return add.reduce(d * (y[slice1]+y[slice2])/2.0,axis) > ValueError: shape mismatch: objects cannot be broadcast to a single shape > (Try not to top post on this list.) I can get it to work like this: import numpy as np y = np.arange(24).reshape(6,4) x = np.arange(6).reshape(-1,1) np.trapz(y, x, axis=0) From the text of the error message, you can see this is a problem with broadcasting. Due to broadcasting rules (which will *prepend* dimensions with size 1), you need to manually add an extra dimension to the end. Once I resize x, I can get this to work. You might want to look at this: http://www.scipy.org/EricsBroadcastingDoc Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From robert.kern at gmail.com Sat Jul 12 19:14:16 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 12 Jul 2008 18:14:16 -0500 Subject: [Numpy-discussion] snprintf vs PyOS_snprintf In-Reply-To: References: Message-ID: <3d375d730807121614w9aae3d7n7b5657be53d79db3@mail.gmail.com> On Sat, Jul 12, 2008 at 13:11, Charles R Harris wrote: > Numpy uses a mix of snprintf and PyOS_snprintf. The Python version is there > because snprintf wasn't part of the standard until C99. So either we should > stick to the python version or make the decision that we only support > compilers with a working snprintf. Which way should we go? PyOS_snprintf -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Sat Jul 12 20:39:00 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 12 Jul 2008 18:39:00 -0600 Subject: [Numpy-discussion] huge array calculation speed In-Reply-To: <92000.48372.qm@web34408.mail.mud.yahoo.com> References: <12D2AD1D-7742-4248-9C5D-9EF3EE3CEE06@gmail.com> <92000.48372.qm@web34408.mail.mud.yahoo.com> Message-ID: On Fri, Jul 11, 2008 at 11:04 AM, Lou Pecora wrote: > If your positions are static (I'm not clear on that from your message), > then you might want to check the technique of "slice searching". It only > requires one sort of the data for each dimension initially, then uses a > simple, but clever look up to find neighbors within some epsilon of a chosen > point. Speeds appear to be about equal to k-d trees. Programming is vastly > simpler than k-d trees, however. > This one is actually easy to implement in numpy using argsort. 
I'm not sure how much speed the integer comparisons buy as opposed to
straight floating comparisons; they probably did it that way for the
hardware implementation. It might be interesting to make a comparison.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From nadavh at visionsense.com  Sun Jul 13 00:30:25 2008
From: nadavh at visionsense.com (Nadav Horesh)
Date: Sun, 13 Jul 2008 07:30:25 +0300
Subject: [Numpy-discussion] A correction to numpy trapz function
References: <710F2847B0018641891D9A216027636029C1C7@ex3.envision.co.il>
	<4878CE3F.9050308@gmail.com>
	<710F2847B0018641891D9A216027636029C1C8@ex3.envision.co.il>
	<48790504.3090406@gmail.com>
Message-ID: <710F2847B0018641891D9A216027636029C1C9@ex3.envision.co.il>

I am aware that the error is related to the broadcasting, and that it
can be solved by matching the shape of x to that of y --- this is how I
solved it in the first place. I was thinking that the function
"promises" to integrate over an array given an x vector and the axis,
so it should hide the broadcasting rules and just do the work. There is
a reason to leave trapz as it is (or even drop it), since numpy should
stay as close as possible to the "bare metal", but this function is
also borrowed by the scipy integration package, so I'd rather give it a
face lift.

  Nadav.

-----Original Message-----
From: numpy-discussion-bounces at scipy.org on behalf of Ryan May
Sent: Sat 12-Jul-08 22:24
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] A correction to numpy trapz function

Nadav Horesh wrote:
> Here is what I get with the original trapz function:
>
> IDLE 1.2.2
>>>> import numpy as np
>>>> np.__version__
> '1.1.0'
>>>> y = np.arange(24).reshape(6,4)
>>>> x = np.arange(6)
>>>> np.trapz(y, x, axis=0)
>
> Traceback (most recent call last):
>   File "", line 1, in
>     np.trapz(y, x, axis=0)
>   File "C:\Python25\Lib\site-packages\numpy\lib\function_base.py", line 1536, in trapz
>     return add.reduce(d * (y[slice1]+y[slice2])/2.0,axis)
> ValueError: shape mismatch: objects cannot be broadcast to a single shape
>

(Try not to top post on this list.)

I can get it to work like this:

import numpy as np
y = np.arange(24).reshape(6,4)
x = np.arange(6).reshape(-1,1)
np.trapz(y, x, axis=0)

From the text of the error message, you can see this is a problem with
broadcasting. Due to broadcasting rules (which will *prepend*
dimensions with size 1), you need to manually add an extra dimension to
the end. Once I resize x, I can get this to work. You might want to
look at this:

http://www.scipy.org/EricsBroadcastingDoc

Ryan

--
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion at scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

From charlesr.harris at gmail.com  Sun Jul 13 01:44:25 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 12 Jul 2008 23:44:25 -0600
Subject: [Numpy-discussion] How to handle a busted API?
Message-ID:

Hi All,

This is apropos ticket #805. The reporter wants to change the signature
of the functions PyArray_FromDims and PyArray_FromDimsAndDataAndDesc,
which we really can't do at this point because they are part of the
Numpy API. The problem can be seen in the current signatures:

PyArray_FromDims(int nd, int *d, int type)
PyArray_FromDimsAndDataAndDescr(int nd, int *d, PyArray_Descr *descr, char *data)

where d points to the desired dimensions.
On 64 bit architectures the int type can be too small to hold the dimensions. Now these functions are old and retained for compatibility; the user's problem turned up in num_utils, which is a C++ interface to Numeric. Quite possibly other programs use it also (BOOST?). So the question is, how do we go about dealing with this. Do we remove them at some point, breaking compatibility and the current API? If so, when do we do this? Should we issue a deprecation warning? If so, how do we do it from C? Should it show up at run time or compile time? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Jul 13 01:49:06 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 12 Jul 2008 23:49:06 -0600 Subject: [Numpy-discussion] A correction to numpy trapz function In-Reply-To: <710F2847B0018641891D9A216027636029C1C9@ex3.envision.co.il> References: <710F2847B0018641891D9A216027636029C1C7@ex3.envision.co.il> <4878CE3F.9050308@gmail.com> <710F2847B0018641891D9A216027636029C1C8@ex3.envision.co.il> <48790504.3090406@gmail.com> <710F2847B0018641891D9A216027636029C1C9@ex3.envision.co.il> Message-ID: On Sat, Jul 12, 2008 at 10:30 PM, Nadav Horesh wrote: > I am aware that the error is related to the broadcasting, and that it can > be solved by matching the shape of x to that of y --- this is how I solved > it in the first place. I was thinking that the function "promises" to > integrate over an array given a x vector and the axis, so let obscure the > broadcasting rules and just enable it to do the work. There is a reason to > leave trapz as it is (or even drop it) since numpy should stay as close as > possible to "bare metal", but this function is borrowed also by scipy > integration package, thus I rather give it a face lift. > I think you can pretty much borrow the average function to do this, all you need to do is generate the proper weights and scaling. It's in numpy/lib/function_base.py Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sun Jul 13 02:02:28 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 13 Jul 2008 01:02:28 -0500 Subject: [Numpy-discussion] How to handle a busted API? In-Reply-To: References: Message-ID: <3d375d730807122302p1e254200va1c2b4eaba3e321f@mail.gmail.com> On Sun, Jul 13, 2008 at 00:44, Charles R Harris wrote: > Hi All, > > This is apropos ticket #805. The reporter wants to change the signature of > the functions PyArray_FromDims and PyArray_FromDimsAndDataAndDesc, which we > really can't do at this point because they are part of the Numpy API. The > problem can be seen in the current signatures: > > PyArray_FromDims(int nd, int *d, int type) > PyArray_FromDimsAndDataAndDescr(int nd, int *d, PyArray_Descr *descr, char > *data) > > where d points to the desired dimensions. On 64 bit architectures the int > type can be too small to hold the dimensions. Now these functions are old > and retained for compatibility; the user's problem turned up in num_utils, > which is a C++ interface to Numeric. Quite possibly other programs use it > also (BOOST?). > > So the question is, how do we go about dealing with this. Do we remove them > at some point, breaking compatibility and the current API? If so, when do we > do this? The original vision was to remove numpy.oldnumeric and (I think) numpy/oldnumeric.h at what was envisioned as "1.1" long ago when we were still at 0.9 or so. That vision has been overcome by events, I think. 
Given the evidence of people's adoption, I don't quite think it's time
to remove the compatibility APIs wholesale, yet. However, for
problematic APIs like this one, I think we can issue a
DeprecationWarning (see below) in 1.2, and schedule them for removal
in 1.3. In 1.3, until the whole compatibility API is removed, we can
have these APIs just contain an #error such that they stop the build
at compile time.

> Should we issue a deprecation warning? If so, how do we do it from
> C? Should it show up at run time or compile time?

Compile-time warnings will be ignored if they aren't errors that stop
the build. Run-time DeprecationWarnings are feasible:

http://docs.python.org/dev/c-api/exceptions.html#PyErr_WarnEx

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From fperez.net at gmail.com  Sun Jul 13 02:32:35 2008
From: fperez.net at gmail.com (Fernando Perez)
Date: Sat, 12 Jul 2008 23:32:35 -0700
Subject: [Numpy-discussion] Doctests for extensions/cython code
Message-ID: 

Hi all (esp. Alan McIntyre),

I'm attaching two little tarballs with a set of tools that may come in
handy for testing numpy:

- plugin.tgz contains a Nose plugin that works around various issues in
the python stdlib, in nose and in cython, to enable the testing of
doctests embedded in extension modules. I must note that this code also
provides a second plugin that is ipython-aware, and the code as shipped
is NOT yet acceptable for public use in numpy, because it instantiates a
full ipython on import. But I wrote this primarily for ipython, so for
us that's OK.

For numpy, we obviously must first remove all the ipython-specific code
from there. The two plugins are separated, so it's perfectly doable, I
just ran out of time. I'm putting it here in the hopes that it will be
useful to Alan, who can strip it of the ipython dependencies and start
using it in the numpy tests.

The one thing I didn't figure out yet was how to load the plugin from
within a python script (instead of doing it at the command line via
'nosetests --extdoctests'). But this should be trivial, it's just a
matter of finding the right call in nose, and you may already know it.

- primes.tgz is the cython 'primes' example, souped up with trivial code
that contains some extra doctests both in python and in extension code.
It's just meant to serve as a test for the above plugin (I also used it
to provide a self-contained cython example complete with a setup.py file
in a seminar, so it can be useful as a starter example for some).

I don't know if today's numpy.test() picks up doctests in extension
modules (if it does, I'd like to know how). I suspect the answer is no,
but if we are to encourage better examples that serve as doctests, then
actually testing them would be good.

I hope this helps in that regard.

Cheers

f
-------------- next part --------------
A non-text attachment was scrubbed...
Name: plugin.tgz
Type: application/x-gzip
Size: 10149 bytes
Desc: not available
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: primes.tgz
Type: application/x-gzip
Size: 2609 bytes
Desc: not available
URL: 

From charlesr.harris at gmail.com  Sun Jul 13 02:49:20 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 13 Jul 2008 00:49:20 -0600
Subject: [Numpy-discussion] How to handle a busted API?
In-Reply-To: <3d375d730807122302p1e254200va1c2b4eaba3e321f@mail.gmail.com> References: <3d375d730807122302p1e254200va1c2b4eaba3e321f@mail.gmail.com> Message-ID: On Sun, Jul 13, 2008 at 12:02 AM, Robert Kern wrote: > On Sun, Jul 13, 2008 at 00:44, Charles R Harris > wrote: > > Hi All, > > > > This is apropos ticket #805. The reporter wants to change the signature > of > > the functions PyArray_FromDims and PyArray_FromDimsAndDataAndDesc, which > we > > really can't do at this point because they are part of the Numpy API. The > > problem can be seen in the current signatures: > > > > PyArray_FromDims(int nd, int *d, int type) > > PyArray_FromDimsAndDataAndDescr(int nd, int *d, PyArray_Descr *descr, > char > > *data) > > > > where d points to the desired dimensions. On 64 bit architectures the int > > type can be too small to hold the dimensions. Now these functions are old > > and retained for compatibility; the user's problem turned up in > num_utils, > > which is a C++ interface to Numeric. Quite possibly other programs use it > > also (BOOST?). > > > > So the question is, how do we go about dealing with this. Do we remove > them > > at some point, breaking compatibility and the current API? If so, when do > we > > do this? > > The original vision was to remove numpy.oldnumeric and (I think) > numpy/oldnumeric.h at what was envisioned as "1.1" long ago when we > were still at 0.9 or so. That vision has been overcome by events, I > think. > > Given the evidence of people's adoption, I don't quite think it's time > to remove the compatibility APIs wholesale, yet. However, for > problematic APIs like this one, I think we can issue a > DeprecationWarning (see below) in 1.2, and schedule them for removal > in 1.3. In 1.3 until the whole compatibility API is removed, we can > have these APIs just contain an #error such that they stop the build > at compile time. > > > Should we issue a deprecation warning? If so, how do we do it from > > C? Should it show up at run time or compile time? > > Compile-time warnings will be ignored if they aren't errors that stop > the build. Run-time DeprecationWarnings are feasible: > > http://docs.python.org/dev/c-api/exceptions.html#PyErr_WarnEx > OK, will do. The same user wants to fix up fftpack_lite. This should actually be pretty easy, just replace all ints by longs in fftpack and fftpack_lite. Or maybe we should use one of the python.h types. As far as I know, these functions are only exposed through the python interface. Chuck > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sun Jul 13 02:53:46 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 13 Jul 2008 01:53:46 -0500 Subject: [Numpy-discussion] How to handle a busted API? In-Reply-To: References: <3d375d730807122302p1e254200va1c2b4eaba3e321f@mail.gmail.com> Message-ID: <3d375d730807122353heacefd8k842be6c3fdf7f2a5@mail.gmail.com> On Sun, Jul 13, 2008 at 01:49, Charles R Harris wrote: > > On Sun, Jul 13, 2008 at 12:02 AM, Robert Kern wrote: >> Compile-time warnings will be ignored if they aren't errors that stop >> the build. 
Run-time DeprecationWarnings are feasible: >> >> http://docs.python.org/dev/c-api/exceptions.html#PyErr_WarnEx > > OK, will do. The same user wants to fix up fftpack_lite. This should > actually be pretty easy, just replace all ints by longs in fftpack and > fftpack_lite. Or maybe we should use one of the python.h types. As far as I > know, these functions are only exposed through the python interface. Py_ssize_t and Py_size_t are probably the most appropriate, in this case. long is not always the same size as a pointer address. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Sun Jul 13 03:17:32 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 13 Jul 2008 01:17:32 -0600 Subject: [Numpy-discussion] How to handle a busted API? In-Reply-To: <3d375d730807122353heacefd8k842be6c3fdf7f2a5@mail.gmail.com> References: <3d375d730807122302p1e254200va1c2b4eaba3e321f@mail.gmail.com> <3d375d730807122353heacefd8k842be6c3fdf7f2a5@mail.gmail.com> Message-ID: On Sun, Jul 13, 2008 at 12:53 AM, Robert Kern wrote: > On Sun, Jul 13, 2008 at 01:49, Charles R Harris > wrote: > > > > On Sun, Jul 13, 2008 at 12:02 AM, Robert Kern > wrote: > > >> Compile-time warnings will be ignored if they aren't errors that stop > >> the build. Run-time DeprecationWarnings are feasible: > >> > >> http://docs.python.org/dev/c-api/exceptions.html#PyErr_WarnEx > > > > OK, will do. The same user wants to fix up fftpack_lite. This should > > actually be pretty easy, just replace all ints by longs in fftpack and > > fftpack_lite. Or maybe we should use one of the python.h types. As far as > I > > know, these functions are only exposed through the python interface. > > Py_ssize_t and Py_size_t are probably the most appropriate, in this > case. long is not always the same size as a pointer address. > I'll go with Py_ssize_t then, I'd have to vet the code before using an unsigned type. Hmm, I wonder if any of the npy types defined in terms of the corresponding python types? If not, npy_intp might be the best choice since it will be needed to create arrays. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sun Jul 13 03:25:54 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 13 Jul 2008 02:25:54 -0500 Subject: [Numpy-discussion] How to handle a busted API? In-Reply-To: References: <3d375d730807122302p1e254200va1c2b4eaba3e321f@mail.gmail.com> <3d375d730807122353heacefd8k842be6c3fdf7f2a5@mail.gmail.com> Message-ID: <3d375d730807130025g26ee3d0ep693fdbe210f10adc@mail.gmail.com> On Sun, Jul 13, 2008 at 02:17, Charles R Harris wrote: > > On Sun, Jul 13, 2008 at 12:53 AM, Robert Kern wrote: >> >> On Sun, Jul 13, 2008 at 01:49, Charles R Harris >> wrote: >> > >> > On Sun, Jul 13, 2008 at 12:02 AM, Robert Kern >> > wrote: >> >> >> Compile-time warnings will be ignored if they aren't errors that stop >> >> the build. Run-time DeprecationWarnings are feasible: >> >> >> >> http://docs.python.org/dev/c-api/exceptions.html#PyErr_WarnEx >> > >> > OK, will do. The same user wants to fix up fftpack_lite. This should >> > actually be pretty easy, just replace all ints by longs in fftpack and >> > fftpack_lite. Or maybe we should use one of the python.h types. As far >> > as I >> > know, these functions are only exposed through the python interface. 
>> >> Py_ssize_t and Py_size_t are probably the most appropriate, in this >> case. long is not always the same size as a pointer address. > > I'll go with Py_ssize_t then, I'd have to vet the code before using an > unsigned type. Hmm, I wonder if any of the npy types defined in terms of the > corresponding python types? If not, npy_intp might be the best choice since > it will be needed to create arrays. Yes, npy_intp would probably be better from the reader's point of view. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From millman at berkeley.edu Sun Jul 13 04:51:24 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Sun, 13 Jul 2008 01:51:24 -0700 Subject: [Numpy-discussion] Schedule for 1.2.0 Message-ID: The NumPy 1.2.0 release date (8/24/08) is rapidly approaching and we need everyone's help. David Cournapeau has volunteered to take the lead on coordinating this release. The main focus of this release is on improved documentation and a new testing framework based on nose with only a few new features. The only new features that I am aware of being planned for this release are: * Robert Kern has a function that does efficient, pure-Python broadcasting of arrays. It is quite useful in certain instances where you want to have the broadcasting feature of ufuncs but cannot cast your function into a form where you can make use of ufuncs. * When writing big arrays to filelike objects in the new NPY format introduced in 1.1, it would be good to do so in chunks. Robert de Almeida's arrayterator would do the job well. When Robert K. implemented the NPY format, he asked him if we could use arrayterator in numpy and expose it as a public API in numpy, too. He agreed, but he had some fixes to do and unit tests to write. This has now been completed and arrayterator 1.0 has been released: http://pypi.python.org/pypi/arrayterator/1.0 Another issue that we should address is whether it is OK to postpone the planned API changes to histogram and median. A couple of people have mentioned to me that they would like to delay the API changes to 1.3, which seems reasonable to me. If anyone would prefer that we make the planned API changes for histogram and median in 1.2, please speak now. Here is the schedule for 1.2.0: - 8/05/08 tag the 1.1.1rc1 release and prepare packages - 8/12/08 tag the 1.1.1rc2 release and prepare packages - 8/23/08 tag the 1.1.1 release and prepare packages - 8/24/08 announce release We need to follow this schedule as closely as possible because we should get SciPy 0.7.0 out ASAP (I will send an email out about scipy 0.7.0 tomorrow). Also it would be very good to have release candidates of the newest NumPy and SciPy available for the SciPy 2008 conference. I would also like to get them both out in time for the new school year. If you have any additional features that you have been working on, please let me know ASAP. Otherwise, all feature development should take place on a branch from this point forward. 
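As a rough sketch of the chunked-writing idea mentioned above (this is
not the actual arrayterator API, which may differ -- just the general
technique of walking a big array in fixed-size blocks along the first
axis and writing each block out in turn):

    import numpy

    def iter_blocks(arr, blocksize):
        # yield successive blocks of `arr` along its first axis
        for start in range(0, arr.shape[0], blocksize):
            yield arr[start:start + blocksize]

    big = numpy.arange(1000000).reshape(10000, 100)
    fh = open('out.bin', 'wb')
    for block in iter_blocks(big, 1024):
        fh.write(block.tostring())   # raw bytes of each block
    fh.close()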
If you want to open a ticket specifically for this bug-fix release,
please use the NumPy 1.2.0 milestone:
http://scipy.org/scipy/numpy/milestone/1.2.0

Thanks,

-- 
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/

From millman at berkeley.edu  Sun Jul 13 05:30:53 2008
From: millman at berkeley.edu (Jarrod Millman)
Date: Sun, 13 Jul 2008 02:30:53 -0700
Subject: [Numpy-discussion] Schedule for 1.2.0
In-Reply-To: 
References: 
Message-ID: 

On Sun, Jul 13, 2008 at 1:51 AM, Jarrod Millman wrote:
> Here is the schedule for 1.2.0:
> - 8/05/08 tag the 1.1.1rc1 release and prepare packages
> - 8/12/08 tag the 1.1.1rc2 release and prepare packages
> - 8/23/08 tag the 1.1.1 release and prepare packages
> - 8/24/08 announce release

Whoops, that should read:

Here is the schedule for 1.2.0:
- 8/05/08 tag the 1.2.0rc1 release and prepare packages
- 8/12/08 tag the 1.2.0rc2 release and prepare packages
- 8/23/08 tag the 1.2.0 release and prepare packages
- 8/24/08 announce release

-- 
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/

From bryan.fodness at gmail.com  Sun Jul 13 09:50:24 2008
From: bryan.fodness at gmail.com (Bryan Fodness)
Date: Sun, 13 Jul 2008 09:50:24 -0400
Subject: [Numpy-discussion] loadtxt and N/A
Message-ID: 

I am using loadtxt and I have missing values that show up as N/A.

I get a

ValueError: invalid literal for float(): N/A

Is there a way to ignore these?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From aisaac at american.edu  Sun Jul 13 11:55:01 2008
From: aisaac at american.edu (Alan G Isaac)
Date: Sun, 13 Jul 2008 11:55:01 -0400
Subject: [Numpy-discussion] trapz documentation on Examples pages
Message-ID: 

There are two examples pages:
http://www.scipy.org/Numpy_Example_List_With_Doc
http://www.scipy.org/Numpy_Example_List

The latter says:

    This page contains a large database of examples demonstrating most
    of the Numpy functionality. Numpy Example List With Doc has these
    examples interleaved with the built-in documentation, but is not as
    regularly updated as this page.

So I expect the latter to be most complete and up-to-date. (Indeed, I
assume it is automagically generated.) But ``trapz`` is documented on
the former but not the latter. Is there a reason?

Thank you,
Alan Isaac

From lbolla at gmail.com  Sun Jul 13 12:32:12 2008
From: lbolla at gmail.com (Lorenzo Bolla)
Date: Sun, 13 Jul 2008 18:32:12 +0200
Subject: [Numpy-discussion] loadtxt and N/A
In-Reply-To: 
Message-ID: <20080713163210.GA11628@lollo-laptop>

You can use the 'converters' keyword in numpy.loadtxt. First define a
function that converts a string into a float and can handle your 'N/A'
entries:

def converter(x):
    if x == 'N/A':
        return numpy.nan
    else:
        return float(x)

then use:

>>> numpy.loadtxt('test.dat', converters={1:converter, 2:converter})
array([[  1.,   2.,  NaN,   4.],
       [  1.,  NaN,   3.,   4.]])

where the file test.dat I used looks like this:

1 2 N/A 4
1 N/A 3 4

hth,
L.

On Sun, Jul 13, 2008 at 09:50:24AM -0400, Bryan Fodness wrote:
> I am using loadtxt and I have missing values that show up as N/A.
>
> I get a
>
> ValueError: invalid literal for float(): N/A
>
> Is there a way to ignore these?
> _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From alan.mcintyre at gmail.com Sun Jul 13 10:28:08 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Sun, 13 Jul 2008 10:28:08 -0400 Subject: [Numpy-discussion] Doctests for extensions/cython code In-Reply-To: References: Message-ID: <1d36917a0807130728n2ef7a788l17b0dd3cc1bcaf76@mail.gmail.com> On Sun, Jul 13, 2008 at 2:32 AM, Fernando Perez wrote: > For numpy, we obviously must first remove all the ipython-specific > code from there. The two plugins are separated, so it's perfectly > doable, I just ran out of time. I'm putting it here in the hopes that > it will be useful to Alan, who can strip it of the ipython > dependencies and start using it in the numpy tests. Thanks! I'll see if I can work this in soon. > The one thing I didn't figure out yet was how to load the plugin from > within a python script (instead of doing it at the command line via > 'nosetests --extdoctests'). But this should be trivial, it's just a > matter of finding the right call in nose, and you may already know > it. If you're constructing a TestProgram, you can pass in a list of plugin instances. If I recall correctly, you have to pass in every plugin that you might want to use, even if they're builtin (I suppose the given list replaces the internal plugin list somewhere, I haven't traced it). So if you leave the doctest plugin out of this list, for example, the "--with-doctest" will cause an error. (This is all from memory, though, so take it with a grain of salt) > I don't know if today's numpy.test() picks up doctests in extension > modules (if it does, I'd like to know how). I suspect the answer is > not, but if we are to encourage better examples that serve as > doctests, then actdually testing them would be good. I'd be surprised if it did; I'll have a look. > I hope this helps in that regard. Certainly, thanks for doing it! From aisaac at american.edu Sun Jul 13 13:51:29 2008 From: aisaac at american.edu (Alan G Isaac) Date: Sun, 13 Jul 2008 13:51:29 -0400 Subject: [Numpy-discussion] trapz In-Reply-To: References: Message-ID: One other thing: I believe ``trapz`` could get a small efficiency gain by returning (dx * (y[slice1]+y[slice2])).sum(axis=axis)/2.0 Cheers, Alan Isaac From pav at iki.fi Sun Jul 13 13:53:02 2008 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 13 Jul 2008 17:53:02 +0000 (UTC) Subject: [Numpy-discussion] trapz documentation on Examples pages References: Message-ID: Sun, 13 Jul 2008 11:55:01 -0400, Alan G Isaac wrote: > There are two examples pages: > http://www.scipy.org/Numpy_Example_List_With_Doc > http://www.scipy.org/Numpy_Example_List > > The latter says: > > This page contains a large database of examples demonstrating most > of the Numpy functionality. Numpy Example List With Doc has these > examples interleaved with the built-in documentation, but is not as > regularly updated as this page. > > So I expect the latter to be most complete and up-to-date. (Indeed, I > assume it automagically generated.) But ``trapz`` is documented on the > former but not the latter. Is there a reason? The "... With Doc" is the automatically generated one. It appears that "trapz" does not have additional examples, only the "builtin" reference documentation, so it is not listed on the latter page. 
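For reference, the sort of minimal example that could be contributed
for ``trapz`` is short; the following is only a sketch, not taken from
either page:

    import numpy

    # trapezoidal integration of y = x**2 on [0, 1]; the exact value is 1/3
    x = numpy.linspace(0.0, 1.0, 101)
    y = x**2
    area = numpy.trapz(y, x)    # approximately 0.33335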
-- 
Pauli Virtanen

From aisaac at american.edu  Sun Jul 13 14:07:44 2008
From: aisaac at american.edu (Alan G Isaac)
Date: Sun, 13 Jul 2008 14:07:44 -0400
Subject: [Numpy-discussion] trapz documentation on Examples pages
In-Reply-To: 
References: 
Message-ID: 

On Sun, 13 Jul 2008, (UTC) Pauli Virtanen apparently wrote:
> The "... With Doc" is the automatically generated one. It appears that
> "trapz" does not have additional examples, only the "builtin" reference
> documentation, so it is not listed on the latter page.

So do I understand that if I'd like to see ``trapz`` there, then I
should provide an example by directly editing that page?

Thank you,
Alan Isaac

From ralf.gommers at googlemail.com  Sun Jul 13 15:58:32 2008
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sun, 13 Jul 2008 15:58:32 -0400
Subject: [Numpy-discussion] binary_repr dtype dependence
Message-ID: 

Hi all,

binary_repr() behaves differently for different types of ints/uints:

In [210]: binary_repr(255)
Out[210]: '11111111'

In [211]: binary_repr(uint32(255))
Out[211]: '11111111'

In [212]: binary_repr(uint16(255))
Out[212]: '1'

In [213]: np.__version__
Out[213]: '1.0.4'

Is this intended/acceptable behavior? It is due to the use of hex()
(checked that it is still present in trunk), which expects int or long
int.

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From stefan at sun.ac.za  Sun Jul 13 17:35:29 2008
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Sun, 13 Jul 2008 23:35:29 +0200
Subject: [Numpy-discussion] binary_repr dtype dependence
In-Reply-To: 
References: 
Message-ID: <9457e7c80807131435p17dcc333ka66e6de09aa81bfb@mail.gmail.com>

Hi Ralf

2008/7/13 Ralf Gommers :
> Hi all,
>
> binary_repr() behaves differently for different types of ints/uints:
>
> In [210]: binary_repr(255)
> Out[210]: '11111111'
>
> In [211]: binary_repr(uint32(255))
> Out[211]: '11111111'
>
> In [212]: binary_repr(uint16(255))
> Out[212]: '1'

That's a bug. It happens because

In [9]: hex(np.uint16(255))
Out[9]: '0x1'

Please create a ticket.

Regards
Stéfan

From charlesr.harris at gmail.com  Sun Jul 13 19:04:07 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 13 Jul 2008 17:04:07 -0600
Subject: [Numpy-discussion] Is PyArray_AsCArray really the best
	replacement for deprecated PyArray_As1D?
Message-ID: 

Just askin'. Neither is mentioned in the numpy book.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From alan.mcintyre at gmail.com  Sun Jul 13 20:26:46 2008
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Sun, 13 Jul 2008 20:26:46 -0400
Subject: [Numpy-discussion] Unused private functions
Message-ID: <1d36917a0807131726u1ad500acq15e82868d14004a3@mail.gmail.com>

Does anybody know whether there's any reason to keep the following
functions? They don't appear to be used anywhere.

linalg/linalg.py:95 _castCopyAndTranspose
lib/function_base.py:659 _asarray1d

From charlesr.harris at gmail.com  Sun Jul 13 23:14:37 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 13 Jul 2008 21:14:37 -0600
Subject: [Numpy-discussion] Newly deprecated API functions
Message-ID: 

The following numpy API functions have been deprecated and will be
removed in 1.3:

PyArray_FromDims
PyArray_FromDimsAndDataAndDescr
PyArray_As1D

If there are other functions that should be added to the list, let me
know.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From oliphant at enthought.com Sun Jul 13 23:47:26 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Sun, 13 Jul 2008 22:47:26 -0500 Subject: [Numpy-discussion] snprintf vs PyOS_snprintf In-Reply-To: <3d375d730807121614w9aae3d7n7b5657be53d79db3@mail.gmail.com> References: <3d375d730807121614w9aae3d7n7b5657be53d79db3@mail.gmail.com> Message-ID: <487ACC4E.3070903@enthought.com> Robert Kern wrote: > On Sat, Jul 12, 2008 at 13:11, Charles R Harris > wrote: > >> Numpy uses a mix of snprintf and PyOS_snprintf. The Python version is there >> because snprintf wasn't part of the standard until C99. So either we should >> stick to the python version or make the decision that we only support >> compilers with a working snprintf. Which way should we go? >> > > PyOS_snprintf > > +1 From charlesr.harris at gmail.com Mon Jul 14 01:31:48 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 13 Jul 2008 23:31:48 -0600 Subject: [Numpy-discussion] Run np.test() twice, get message. Message-ID: Alan, Any idea what this is: *** DocTestRunner.merge: '__main__' in both testers; summing outcomes. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Jul 14 01:50:56 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 13 Jul 2008 23:50:56 -0600 Subject: [Numpy-discussion] Schedule for 1.1.1 In-Reply-To: References: Message-ID: On Tue, Jul 8, 2008 at 6:21 PM, Sebastian Haase wrote: > Hi, > I haven't checked out a recent numpy (( >>> N.__version__ > '1.0.3.1')) > But could someone please check if the division has been changed from > '/' to '//' in these places: > > C:\Priithon_25_win\numpy\core\numerictypes.py:142: DeprecationWarning: > classic int division > bytes = bits / 8 > C:\Priithon_25_win\numpy\core\numerictypes.py:182: DeprecationWarning: > classic int division > na_name = '%s%d' % (base.capitalize(), bit/2) > C:\Priithon_25_win\numpy\core\numerictypes.py:212: DeprecationWarning: > classic int division > charname = 'i%d' % (bits/8,) > C:\Priithon_25_win\numpy\core\numerictypes.py:213: DeprecationWarning: > classic int division > ucharname = 'u%d' % (bits/8,) > C:\Priithon_25_win\numpy\core\numerictypes.py:409: DeprecationWarning: > classic int division > nbytes[obj] = val[2] / 8 > I don't believe we have made any changes to '/'. It is going to be tricky making the transition to 3.0, a lot of code is going to break, and we haven't started down that path. Maybe next summer... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.mcintyre at gmail.com Mon Jul 14 01:59:41 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Mon, 14 Jul 2008 01:59:41 -0400 Subject: [Numpy-discussion] Run np.test() twice, get message. In-Reply-To: References: Message-ID: <1d36917a0807132259l1e13da83l6833de7b4d7e1fb3@mail.gmail.com> On Mon, Jul 14, 2008 at 1:31 AM, Charles R Harris wrote: > Any idea what this is: > > *** DocTestRunner.merge: '__main__' in both testers; summing outcomes. Hmm..that's coming from nose. I'll see what it's about tomorrow. 
From charlesr.harris at gmail.com  Mon Jul 14 02:04:27 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 14 Jul 2008 00:04:27 -0600
Subject: [Numpy-discussion] Unused private functions
In-Reply-To: <1d36917a0807131726u1ad500acq15e82868d14004a3@mail.gmail.com>
References: <1d36917a0807131726u1ad500acq15e82868d14004a3@mail.gmail.com>
Message-ID: 

On Sun, Jul 13, 2008 at 6:26 PM, Alan McIntyre wrote:

> Does anybody know whether there's any reason to keep the following
> functions? They don't appear to be used anywhere.
>
> linalg/linalg.py:95 _castCopyAndTranspose
> lib/function_base.py:659 _asarray1d
>

+1 to remove them if they aren't used.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From haase at msg.ucsf.edu  Mon Jul 14 04:22:22 2008
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Mon, 14 Jul 2008 10:22:22 +0200
Subject: [Numpy-discussion] Schedule for 1.1.1
In-Reply-To: 
References: 
Message-ID: 

On Mon, Jul 14, 2008 at 7:50 AM, Charles R Harris wrote:
>
> On Tue, Jul 8, 2008 at 6:21 PM, Sebastian Haase wrote:
>>
>> Hi,
>> I haven't checked out a recent numpy (( >>> N.__version__
>> '1.0.3.1'))
>> But could someone please check if the division has been changed from
>> '/' to '//' in these places:
>>
>> C:\Priithon_25_win\numpy\core\numerictypes.py:142: DeprecationWarning:
>> classic int division
>>   bytes = bits / 8
>> C:\Priithon_25_win\numpy\core\numerictypes.py:182: DeprecationWarning:
>> classic int division
>>   na_name = '%s%d' % (base.capitalize(), bit/2)
>> C:\Priithon_25_win\numpy\core\numerictypes.py:212: DeprecationWarning:
>> classic int division
>>   charname = 'i%d' % (bits/8,)
>> C:\Priithon_25_win\numpy\core\numerictypes.py:213: DeprecationWarning:
>> classic int division
>>   ucharname = 'u%d' % (bits/8,)
>> C:\Priithon_25_win\numpy\core\numerictypes.py:409: DeprecationWarning:
>> classic int division
>>   nbytes[obj] = val[2] / 8
>
> I don't believe we have made any changes to '/'. It is going to be tricky
> making the transition to 3.0, a lot of code is going to break, and we
> haven't started down that path. Maybe next summer...
>
> Chuck

Hi,
thanks for the reply. Could you maybe please check in these changes, and
replace the '/' with '//' at those places!? The '//' operator is
supported by Python 2.4 (if not before). For a "matlab / IDL - like"
interactive environment I'm using numpy already for some 5+ months with
the -Qnew option. And everything seems to work smoothly - so those
minimal changes would advance a small step further towards the already
started "true division" support.

Thanks,
Sebastian Haase

From falted at pytables.org  Mon Jul 14 06:58:56 2008
From: falted at pytables.org (Francesc Alted)
Date: Mon, 14 Jul 2008 12:58:56 +0200
Subject: [Numpy-discussion] RFC: A proposal for implementing some
	date/time types in NumPy
In-Reply-To: <4877B197.7050607@noaa.gov>
References: <200807111559.03986.falted@pytables.org>
	<200807111420.26794.pgmdevlist@gmail.com>
	<4877B197.7050607@noaa.gov>
Message-ID: <200807141258.56833.falted@pytables.org>

On Friday 11 July 2008, Christopher Barker wrote:
> >>>> print example
> >
> > [Jul-2008 Aug-2008 Sep-2008 Oct-2008 Nov-2008 Dec-2008]
>
> I like this -- seeing the integers for the times makes me wonder what
> that point is -- we've all been using numbers for time for years
> already -- what would a datetime array give us other than
> auto-conversion from datetime objects, if it doesn't include nicer
> display, timedelta, etc.
I see your point and I think that it would be a great addition to the
date/time types to support additional resolution meta-information in
order to offer the most proper string representation. And I'm starting
to see the merit of a timedelta type too.

> > Now that I think about this, wouldn't it be better if, after the
> > eventual introduction of the new datetime types in NumPy, the
> > matplotlib would use any of these three and throw away their
> > current datetime class?
>
> yes, that would be better, but what to do during the transition?

Well, John Hunter has already agreed to adapt mpl to the NumPy date/time
as soon as they are in, so I suppose they will have to decide the path
for the transition.

> I'm also imagining some extra utility functions/methods that would be
> nice:
>
> aDateTimeArray.hours(dtype=float)
>
> to convert to hours (and days, and seconds, etc). And maybe some that
> would create a DateTimeArray from various time units.
>
> I often have to read/write data files that have time in various units
> like that -- it would be nice to use array operations to work with
> them.

I think that a datetime type with a resolution property can help here.
I hope to post soon the new proposal about this.

Cheers,

-- 
Francesc Alted

From falted at pytables.org  Mon Jul 14 07:36:33 2008
From: falted at pytables.org (Francesc Alted)
Date: Mon, 14 Jul 2008 13:36:33 +0200
Subject: [Numpy-discussion] RFC: A proposal for implementing some
	date/time types in NumPy
In-Reply-To: <4877E52F.3060209@esrf.fr>
References: <200807111559.03986.falted@pytables.org>
	<4877E52F.3060209@esrf.fr>
Message-ID: <200807141336.33730.falted@pytables.org>

On Saturday 12 July 2008, Jon Wright wrote:
> Charles R Harris wrote:
> > On Fri, Jul 11, 2008 at 12:37 PM, Francesc Alted
> >
> > > On Friday 11 July 2008, Francesc Alted wrote:
> > > > On Friday 11 July 2008, Jon Wright wrote:
> > > > > Nice idea - please can you make it work with matplotlib's
> > > > > time/date
> > > >
> > > > Hmmm, following the matplotlib docstrings:
> > > >
> > > > """
> > > > datetime objects are converted to floating point numbers
> > > > which represent the number of days since 0001-01-01 UTC
> > > > """
> > ...
> > > > this more carefully, but I suppose that if there is interest
> > > > enough that could be implemented, yes.
> > >
> > > Now that I think about this, wouldn't it be better if, after the
> > > eventual introduction of the new datetime types in NumPy, the
> > > matplotlib would use any of these three and throw away their
> > > current datetime class? [Unless they have good reasons for keeping
> > > their epoch and/or scale]
> >
> > Especially as there was a ten day adjustment made with the adoption
> > of the Gregorian calendar on Oct 4, 1582; early dates can be hard
> > to interpret. Curiously, IIRC, 01/01/0001 was a Monday.
>
> So I think I will just want to plot timeseries without (ever please)
> caring about date formats again. If you're proposing a "new" format
> then I'm assuming you want me to once again care that:
>
> 1) my temperature logger is recording data in Romance Standard Time,
> but not saying so, just day/month/year : time.
> 2) When we read that data we cannot tell which time zone it was
> recorded in, even if we think we remember where the logger was when
> it logged.
> 3) That the program I am running could currently be in any time zone
> 4) Whether the program is plotting compared to "now" in the current
> time zone or "then" that the data were recorded.
> None of these problems are new, or indeed unique, I think we only
> want a to_ and from_ converter to "what we mean" that we can plot,
> using matplotlib.

Well, timezones are a hairy issue. From the docstring of Python's
``datetime`` module:

"""
Supporting timezones at whatever level of detail is required is up to
the application. The rules for time adjustment across the world are
more political than rational, and there is no standard suitable for
every application.
"""

In fact, the approach for the ``datetime`` module is that the user has
to provide a concrete class (derived from the abstract ``tzinfo``
class) with *user-defined* methods that are supposed to do the
necessary computations to convert UTC <--> 'localtime' timestamps.
However, doing the same in NumPy itself would be prohibitive in terms
of computational effort. So, our intention was to ignore this part of
reality (timezones) by always working internally with UTC time and let
the user do the necessary conversions by using the machinery (together
with the user's own methods) that the ``datetime`` module provides. We
really think that this is a sensible way to proceed with timezones in
the NumPy context.

> Timezones are a heck of a problem if you want to be accurate. You are
> talking about nanosecond resolutions, however, atomic clocks in orbit
> apparently suffer from relativistic corrections of the order of 38000
> nanoseconds per day [1]. What will you do about data recorded on the
> international space station? Getting into time formats at this level
> seems to be rather complicated - there is no absolute time you can
> reference to - it is all relative :-)

Wow, this is really a much larger jitter than I expected. I suppose
that this reinforces the decision to deprecate the use of the TAI
standard just for improving the precision in large time spans.

Thanks,

-- 
Francesc Alted

From silva at lma.cnrs-mrs.fr  Mon Jul 14 07:47:51 2008
From: silva at lma.cnrs-mrs.fr (Fabrice Silva)
Date: Mon, 14 Jul 2008 13:47:51 +0200
Subject: [Numpy-discussion] error importing a f2py compiled module.
In-Reply-To: <1214223058.3133.25.camel@Portable-s2m.cnrs-mrs.fr>
References: <1214206686.3133.14.camel@Portable-s2m.cnrs-mrs.fr>
	<63206.88.89.32.166.1214218810.squirrel@cens.ioc.ee>
	<1214223058.3133.25.camel@Portable-s2m.cnrs-mrs.fr>
Message-ID: <1216036071.3027.3.camel@Portable-s2m.cnrs-mrs.fr>

On Monday 23 June 2008 at 14:10 +0200, Fabrice Silva wrote:
> > I don't have ideas what is causing this import error. Try
> > the instructions above, maybe it is due to some compiled object
> > conflicts.
> The only posts on mailing lists I've read mention security policies
> (SElinux) and /tmp execution limitations...

Another point is that the working directory has been created by a
subversion checkout command but has proper permissions

drwxr-xr-x 4 fab fab 4096 jui 14 13:29 lib

as do the fortran and the shared object files:

-rwxr-xr-x 1 fab fab 6753 jui 9 14:14 systeme.f
-rwxr-xr-x 1 fab fab 85746 jui 14 13:21 systeme.so

Moving these files to another directory (and adding the latter to the
path) suppresses the import problem...
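A minimal illustration of the workaround just described (the
destination directory below is hypothetical): put the f2py-built
extension somewhere outside the checkout and extend sys.path before
importing it.

    import sys
    sys.path.insert(0, '/home/fab/modules')   # hypothetical directory
    import systeme   # the f2py-compiled module from this thread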
-- 
Fabrice Silva
LMA UPR CNRS 7051 - Équipe S2M

From faltet at pytables.org  Mon Jul 14 07:58:10 2008
From: faltet at pytables.org (Francesc Alted)
Date: Mon, 14 Jul 2008 13:58:10 +0200
Subject: [Numpy-discussion] RFC: A proposal for implementing some
	date/time types in NumPy
In-Reply-To: 
References: <200807111559.03986.falted@pytables.org>
	<4877B197.7050607@noaa.gov>
Message-ID: <200807141358.11079.faltet@pytables.org>

On Saturday 12 July 2008, Matt Knox wrote:
> Christopher Barker noaa.gov> writes:
> >> I'm also imagining some extra utility functions/methods that would
> >> be nice:
> >>
> >> aDateTimeArray.hours(dtype=float)
> >>
> >> to convert to hours (and days, and seconds, etc). And maybe some
> >> that would create a DateTimeArray from various time units.
>
> The DateArray class in the timeseries scikits can do part of what you
> want. Observe...
>
> >>> import scikits.timeseries as ts
> >>> a = ts.date_array(start_date=ts.now('hourly'), length=15)
> >>> a
>
> DateArray([12-Jul-2008 11:00, 12-Jul-2008 12:00, 12-Jul-2008 13:00,
>            12-Jul-2008 14:00, 12-Jul-2008 15:00, 12-Jul-2008 16:00,
>            12-Jul-2008 17:00, 12-Jul-2008 18:00, 12-Jul-2008 19:00,
>            12-Jul-2008 20:00, 12-Jul-2008 21:00, 12-Jul-2008 22:00,
>            12-Jul-2008 23:00, 13-Jul-2008 00:00, 13-Jul-2008 01:00],
>           freq='H')

Mmh, I like very much your notion of 'frequency' as meta-information of
your DateArray class. I was in fact thinking of something similar for
the (more general) date/time in NumPy, but based on the notion of
'resolution' instead of 'frequency'. I'll expand more about this in our
next proposal.

> >>> a.year
>
> array([2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008,
>        2008, 2008, 2008, 2008, 2008])
>
> >>> a.hour
>
> array([11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 0, 1])
>
> >>> a.day
>
> array([12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 13, 13])

Well, while I see the merits of the '.year', '.hour' and so on
properties, I'm not sure whether this would be useful for a general
date/time type. I'd prefer what was suggested by Chris before, i.e.
something like:

a.hours(dtype=float)

to convert to hours (and days, and seconds, etc).

> I would encourage you to take a look at the wiki
> (http://scipy.org/scipy/scikits/wiki/TimeSeries) as you may find some
> surprises in there that prove useful.

I've had a look at it, and it is clear that you guys have put a lot of
thought into it. We will be sure to have your implementation in mind.

> >> I often have to read/write data files that have time in various
> >> units like that -- it would be nice to use array operations to
> >> work with them.
>
> If peak performance is not a concern, parsing of most date formats
> can be done automatically using the built in parser in the timeseries
> module (borrowed from mx.DateTime). Observe...
>
> >>> dlist = ['14-jan-2001 14:34:33', '16-jan-2001 10:09:11']
> >>> a = ts.date_array(dlist, freq='secondly')
> >>> a
>
> DateArray([14-Jan-2001 14:34:33, 16-Jan-2001 10:09:11],
>           freq='S')

That's great. However we only planned to import/export dates from the
``datetime`` module for the time being, mainly because of efficiency
but also simplicity. Would many people be interested in seeing this
kind of string date parsing integrated into the native NumPy types?
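As a rough sketch of the limited parsing being proposed -- ISO-8601-like
strings only, mapped to int64 seconds since the epoch (the helper name
below is invented for illustration):

    from datetime import datetime
    import calendar
    import numpy

    def parse_iso8601(strings):
        # parse 'YYYY-MM-DDThh:mm:ss' strings into int64 seconds since
        # the epoch, treating the timestamps as UTC
        stamps = [datetime.strptime(s, '%Y-%m-%dT%H:%M:%S')
                  for s in strings]
        return numpy.array([calendar.timegm(t.timetuple()) for t in stamps],
                           dtype=numpy.int64)

    ticks = parse_iso8601(['2001-01-14T14:34:33', '2001-01-16T10:09:11'])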
Thanks, -- Francesc Alted From alan.mcintyre at gmail.com Mon Jul 14 08:05:19 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Mon, 14 Jul 2008 08:05:19 -0400 Subject: [Numpy-discussion] Unused private functions In-Reply-To: <1d36917a0807131726u1ad500acq15e82868d14004a3@mail.gmail.com> References: <1d36917a0807131726u1ad500acq15e82868d14004a3@mail.gmail.com> Message-ID: <1d36917a0807140505j6e224a2cmca890b32d521a6bf@mail.gmail.com> On Sun, Jul 13, 2008 at 8:26 PM, Alan McIntyre wrote: > Does anybody know whether there's any reason to keep the following > functions? They don't appear to be used anywhere. > > linalg/linalg.py:95 _castCopyAndTranspose > lib/function_base.py:659 _asarray1d _castCopyAndTranspose is referenced in the SciPy weave tutorial (but not actually used anywhere); if I have some time later I'll look into updating that. From faltet at pytables.org Mon Jul 14 09:07:47 2008 From: faltet at pytables.org (Francesc Alted) Date: Mon, 14 Jul 2008 15:07:47 +0200 Subject: [Numpy-discussion] NumPy date/time types and the resolution concept Message-ID: <200807141507.47484.faltet@pytables.org> Hi, Before giving more thought to the new proposal of the date/time types for NumPy based in the concept of 'resolution', I'd like to gather more feedback on your opinions about this. After pondering about the opinions about the first proposal, the idea we are incubating is to complement the ``datetime64`` with a 'resolution' metainfo. The ``datetime64`` will still be based on a int64 type, but the meaning of the 'ticks' would depend on a 'resolution' property. This is best seen with an example: In [21]: numpy.arange(3, dtype=numpy.dtype('datetime64', 'sec')) Out [21]: [1970-01-01T00:00:00, 1970-01-01T00:00:01, 1970-01-01T00:00:02] In [22]: numpy.arange(3, dtype=numpy.dtype('datetime64', 'hour')) Out [22]: [1970-01-01T00, 1970-01-01T01, 1970-01-01T02] i.e. the 'resolution' gives the actual meaning to the 'int64' counter. The advantage of this abstraction is that the user can easily choose the scale of resolution that better fits his need. I'm thinking in providing the next resolutions: ["femtosec", "picosec", "nanosec", "microsec", "millisec", "sec", "min", "hour", "month", "year"] Also, together with the absolute ``datetime64`` one can have a relative counterpart, say, ``timedelta64`` that also supports the notion of 'resolution'. Between both one would cover the needs for most uses, while providing the user with a lot of flexibility, IMO. We very much prefer this new approach than the stated in our first proposal. Now, it comes the tricky part: how to integrate the notion of 'resolution' with the 'dtype' data type factory of NumPy? Well, we have thought a couple of possibilities. 1) Using the NumPy 'dtype' factory: nanoabs = numpy.dtype('datetime64', resolution="nanosec") nanorel = numpy.dtype('timedelta64', resolution="nanosec") 2) Extending the string notation by using the '[]' square brackets: nanoabs = numpy.dtype('datetime64[nanosec]') # long notation nanoabs = numpy.dtype('T[nanosec]') # short notation nanorel = numpy.dtype('timedelta64[nanosec]') # long notation nanorel = numpy.dtype('t[nanosec]') # short notation With these building blocks, one may obtain more complex dtype structures easily. Now, the question is: would that proposal enter in conflict with the spirit of the current 'dtype' factory? And another important one, would that complicate the implementation too much? 
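Purely to make the intended semantics concrete while those questions are
open, here is a toy stand-in (not the proposed implementation; the class
name is invented) that pairs an int64 'ticks' array with a resolution
label:

    import numpy

    class DatetimeArray(object):
        # toy stand-in only: int64 'ticks' plus a resolution string
        def __init__(self, ticks, resolution='sec'):
            self.ticks = numpy.asarray(ticks, dtype=numpy.int64)
            self.resolution = resolution
        def __repr__(self):
            return 'DatetimeArray(%s, resolution=%r)' % (
                self.ticks.tolist(), self.resolution)

    a = DatetimeArray(numpy.arange(3), resolution='hour')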
If the answer to both of the previous questions is 'no', then we will
study this more and provide another proposal based on this.

BTW, I suppose that the best candidate to answer these would be Travis
O., but if anybody feels brave enough ;-) please go ahead and give your
advice.

Cheers,

-- 
Francesc Alted

From david.huard at gmail.com  Mon Jul 14 09:34:07 2008
From: david.huard at gmail.com (David Huard)
Date: Mon, 14 Jul 2008 09:34:07 -0400
Subject: [Numpy-discussion] RFC: A proposal for implementing some
	date/time types in NumPy
In-Reply-To: <200807141358.11079.faltet@pytables.org>
References: <200807111559.03986.falted@pytables.org>
	<4877B197.7050607@noaa.gov>
	<200807141358.11079.faltet@pytables.org>
Message-ID: <91cf711d0807140634m4074e4aaw3514b26a8b7c1035@mail.gmail.com>

2008/7/14 Francesc Alted :
> [...]
> > DateArray([14-Jan-2001 14:34:33, 16-Jan-2001 10:09:11],
> >           freq='S')
>
> That's great. However we only planned to import/export dates from the
> ``datetime`` module for the time being, mainly because of efficiency
> but also simplicity. Would many people be interested in seeing this
> kind of string date parsing integrated into the native NumPy types?
>

It's useful to have a complete string representation to write dates to a
file and be able to retrieve them later on. In this sense, a
strftime-like write/read method would be appreciated (where the date
format is specified by the user or set by convention).

On the other hand, trying to second-guess the format the date is
formatted in can quickly turn into a regular expression nightmare (look
at the mx.datetime module that does this). I'd hate to see you waste
time on this.

David
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pgmdevlist at gmail.com  Mon Jul 14 09:34:49 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Mon, 14 Jul 2008 09:34:49 -0400
Subject: [Numpy-discussion] NumPy date/time types and the resolution
	concept
In-Reply-To: <200807141507.47484.faltet@pytables.org>
References: <200807141507.47484.faltet@pytables.org>
Message-ID: <200807140934.49929.pgmdevlist@gmail.com>

On Monday 14 July 2008 09:07:47 Francesc Alted wrote:
> The advantage of this abstraction is that the user can easily choose
> the scale of resolution that better fits his need. I'm thinking in
> providing the next resolutions:
>
> ["femtosec", "picosec", "nanosec", "microsec", "millisec", "sec",
> "min", "hour", "month", "year"]

In TimeSeries, we don't have anything less than a second, but we have
'daily', 'business daily', 'weekly' and 'quarterly' resolutions.

A very useful point that Matt Knox had coded is the possibility to
specify starting points for switching from one resolution to another.
For example, you can have a series with an 'ANN_MAR' frequency, that
corresponds to 1 point a year, the year starting in April. When
switching back to a monthly resolution, the points from January to
March of the first year will be masked.

Another useful point would be to allow the user to define his/her own
resolution (every 15min, every 12h...). Right now it's a bit clunky in
TimeSeries; we have to use the lowest resolution of the series (min,
hour) and leave a lot of blanks (TimeSeries don't have to be regularly
spaced, but it helps...)

> Now, it comes the tricky part: how to integrate the notion
> of 'resolution' with the 'dtype' data type factory of NumPy?

In TimeSeries, the frequency is stored as an integer. For example, a
daily frequency is stored as 6000, an annual frequency as 1000, an
'ANN_MAR' frequency as 1003...
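For illustration only, the integer encoding described above, using the
values quoted in the thread (annual = 1000, daily = 6000, 'ANN_MAR' =
1003, reading the +3 offset as March, the 'MAR' in 'ANN_MAR'):

    FR_ANN = 1000
    FR_DAY = 6000

    def describe(code):
        # decode a frequency integer back into a human-readable label
        if code == FR_DAY:
            return 'daily'
        if code == FR_ANN:
            return 'annual'
        if FR_ANN < code <= FR_ANN + 12:
            return 'annual, anchored on month %d' % (code - FR_ANN)
        raise ValueError('unknown frequency code: %d' % code)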
From aisaac at american.edu Mon Jul 14 09:47:57 2008 From: aisaac at american.edu (Alan G Isaac) Date: Mon, 14 Jul 2008 09:47:57 -0400 Subject: [Numpy-discussion] NumPy date/time types and the resolution concept In-Reply-To: <200807141507.47484.faltet@pytables.org> References: <200807141507.47484.faltet@pytables.org> Message-ID: On Mon, 14 Jul 2008, Francesc Alted apparently wrote: > Before giving more thought to the new proposal of the > date/time types for NumPy based in the concept of > 'resolution', I'd like to gather more feedback on your > opinions about this. It might be a good idea to run the proposal(s) past Marc-Andre Lemburg mal (at) egenix (dot) com Cheers, Alan Isaac From peridot.faceted at gmail.com Mon Jul 14 10:21:32 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Mon, 14 Jul 2008 10:21:32 -0400 Subject: [Numpy-discussion] NumPy date/time types and the resolution concept In-Reply-To: <200807141507.47484.faltet@pytables.org> References: <200807141507.47484.faltet@pytables.org> Message-ID: 2008/7/14 Francesc Alted : > After pondering about the opinions about the first proposal, the idea we > are incubating is to complement the ``datetime64`` with a 'resolution' > metainfo. The ``datetime64`` will still be based on a int64 type, but > the meaning of the 'ticks' would depend on a 'resolution' property. This is an interesting idea. To be useful, though, you would also need a flexible "offset" defining the zero of time. After all, the reason not to just always use (say) femtosecond accuracy is that 2**64 femtoseconds is only about five hours. So if you're going to use femtosecond steps, you really want to choose your start point carefully. (It's also worth noting that there is little need for more time accuracy than atomic clocks can provide, since anyone looking for more than that is going to be doing some tricky metrology anyway.) One might take guidance from the FITS format, which represents (arrays of) quantities as (usually) fixed-point numbers, but has a global "scale" and "offset" parameter for each array. This allows one to accurately represent many common arrays with relatively few bits. The FITS libraries transparently convert these quantities. Of course, this isn't so convenient if you don't have basic machine datatypes with enough precision to handle all the quantities of interest. Anne From zbyszek at in.waw.pl Mon Jul 14 11:13:55 2008 From: zbyszek at in.waw.pl (Zbyszek Szmek) Date: Mon, 14 Jul 2008 17:13:55 +0200 Subject: [Numpy-discussion] loadtxt and usecols In-Reply-To: <20080712143519.GA15362@lollo-laptop> References: <20080712143519.GA15362@lollo-laptop> Message-ID: <20080714151355.GB23477@szyszka.in.waw.pl> > data = loadtxt('18B180.dat', skiprows = 1, usecols = xrange(1,46)) On Sat, Jul 12, 2008 at 04:35:20PM +0200, Lorenzo Bolla wrote: > why not using: or data = loadtxt('18B180.dat', skiprows=1, unpack=True)[1:] > > obviously, you need to know how many columns you have. Or not, if you don't mind the very small cost of parsing an extra column. 
- Zbyszek

> On Sat, Jul 12, 2008 at 10:07:06AM -0400, Bryan Fodness wrote:
> > i would like to load my data without knowing the length, i have
> > explicitly stated the rows
> >
> > data = loadtxt('18B180.dat', skiprows = 1, usecols =
> > 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45))
> > and would like to use something like,
> >
> > data = loadtxt('18B180.dat', skiprows = 1, usecols = (1,:))
> >
> > the first column is the only one that i do not need.

From faltet at pytables.org  Mon Jul 14 12:09:51 2008
From: faltet at pytables.org (Francesc Alted)
Date: Mon, 14 Jul 2008 18:09:51 +0200
Subject: [Numpy-discussion] RFC: A proposal for implementing some
	date/time types in NumPy
In-Reply-To: <91cf711d0807140634m4074e4aaw3514b26a8b7c1035@mail.gmail.com>
References: <200807111559.03986.falted@pytables.org>
	<200807141358.11079.faltet@pytables.org>
	<91cf711d0807140634m4074e4aaw3514b26a8b7c1035@mail.gmail.com>
Message-ID: <200807141809.51523.faltet@pytables.org>

On Monday 14 July 2008, David Huard wrote:
> 2008/7/14 Francesc Alted :
> > [...]
> >
> > > DateArray([14-Jan-2001 14:34:33, 16-Jan-2001 10:09:11],
> > >           freq='S')
> >
> > That's great. However we only planned to import/export dates from
> > the ``datetime`` module for the time being, mainly because of
> > efficiency but also simplicity. Would many people be interested in
> > seeing this kind of string date parsing integrated into the native
> > NumPy types?
>
> It's useful to have a complete string representation to write dates
> to a file and be able to retrieve them later on. In this sense, a
> strftime-like write/read method would be appreciated (where the date
> format is specified by the user or set by convention).
>
> On the other hand, trying to second-guess the format the date is
> formatted in can quickly turn into a regular expression nightmare
> (look at the mx.datetime module that does this). I'd hate to see you
> waste time on this.

Ok. With the proposal based on the 'resolution' concept we were going
to output the times in string format (more specifically, the ISO 8601
format so as to follow a clear standard). My guess is that adding code
to parse this specific format on input should be easy and is a
reasonable thing to do. I agree that adding parsers for more formats
would unnecessarily complicate things.

Thanks,

-- 
Francesc Alted

From millman at berkeley.edu  Mon Jul 14 12:25:47 2008
From: millman at berkeley.edu (Jarrod Millman)
Date: Mon, 14 Jul 2008 09:25:47 -0700
Subject: [Numpy-discussion] Schedule for 1.1.1
In-Reply-To: 
References: <9457e7c80807080919x2bc54cb9v935b4f2c7fa3b3cd@mail.gmail.com>
Message-ID: 

The NumPy 1.1.1 release date (7/31/08) is rapidly approaching and we
need everyone's help. Chuck Harris has volunteered to take the lead on
coordinating this release.

As a reminder, here is the schedule for 1.1.1:
- 7/20/08 tag the 1.1.1rc1 release and prepare packages
- 7/27/08 tag the 1.1.1 release and prepare packages
- 7/31/08 announce release

This release should include only bug-fixes and improved documentation.
We need to follow this schedule as closely as possible because we will
need to start focusing on the upcoming NumPy 1.2.0 release as soon as
1.1.1 is released. As a reminder, the trunk is for 1.2.0 development;
1.1.1 will be tagged off the 1.1.x branch:
svn co http://svn.scipy.org/svn/numpy/branches/1.1.x numpy-1.1.x

If you have any fixes that you haven't back-ported yet, please do so
ASAP.
According to our release schedule we will be tagging the release
candidate for 1.1.1 next Sunday (7/20). We will be asking for
wide-spread testing of the release candidate during the week of the
20th.

If you want to open a ticket specifically for this bug-fix release,
please use the NumPy 1.1.1 milestone:
http://scipy.org/scipy/numpy/milestone/1.1.1

Thanks,

-- 
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/

From faltet at pytables.org  Mon Jul 14 12:50:21 2008
From: faltet at pytables.org (Francesc Alted)
Date: Mon, 14 Jul 2008 18:50:21 +0200
Subject: [Numpy-discussion] NumPy date/time types and the resolution
	concept
In-Reply-To: <200807140934.49929.pgmdevlist@gmail.com>
References: <200807141507.47484.faltet@pytables.org>
	<200807140934.49929.pgmdevlist@gmail.com>
Message-ID: <200807141850.21269.faltet@pytables.org>

On Monday 14 July 2008, Pierre GM wrote:
> On Monday 14 July 2008 09:07:47 Francesc Alted wrote:
> > The advantage of this abstraction is that the user can easily
> > choose the scale of resolution that better fits his need. I'm
> > thinking in providing the next resolutions:
> >
> > ["femtosec", "picosec", "nanosec", "microsec", "millisec", "sec",
> > "min", "hour", "month", "year"]
>
> In TimeSeries, we don't have anything less than a second, but we
> have 'daily', 'business daily', 'weekly' and 'quarterly' resolutions.

Yes, I forgot the "day" resolution. I suppose that "weekly" and
"quarterly" could be added too. However, if we adopt a new way to
specify the resolution (see later), these can be stated as '7d' and
'3m' respectively. Mmh, not sure about "business daily"; this maybe is
useful in time series, but I don't find a reasonable meaning for it as
a 'time resolution' (which is a different concept from 'time
frequency'). So I'd leave it out.

> A very useful point that Matt Knox had coded is the possibility to
> specify starting points for switching from one resolution to another.
> For example, you can have a series with an 'ANN_MAR' frequency, that
> corresponds to 1 point a year, the year starting in April. When
> switching back to a monthly resolution, the points from January to
> March of the first year will be masked.

Ok. Anne was also suggesting that the origin of time would be
configurable, but then, you are talking about *masking* values. Mmm, I
don't think we should try to incorporate masking capabilities in the
NumPy date/time types. At any rate, I've not thought about the
possibility of having an origin defined by the user, but if we could
add the 'resolution' metainfo, I don't see why we couldn't do the same
with the 'origin' metainfo too.

> Another useful point would be to allow the user to define his/her own
> resolution (every 15min, every 12h...). Right now it's a bit clunky
> in TimeSeries; we have to use the lowest resolution of the series
> (min, hour) and leave a lot of blanks (TimeSeries don't have to be
> regularly spaced, but it helps...)

Ok. I see the use case for this, but for implementation purposes, we
should come up with a more complete way to specify the resolution than
I realized before. Hmm, what about the following:

[N]timeunit

where ``timeunit`` can take the values in:

['y', 'm', 'd', 'h', 'min', 's', 'ms', 'us', 'ns', 'fs']

so, for example, '14d' means a resolution of 14 days, or '10ms' means a
resolution of one hundredth of a second. Sounds good to me. What do
other people think?
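To make the notation concrete, a sketch of parsing such '[N]timeunit'
resolution strings (e.g. '14d' -> (14, 'd'), '10ms' -> (10, 'ms'); an
omitted count defaults to 1, and 'min' stands for minutes as in the
list above):

    import re

    _RESOLUTION = re.compile(r'^(\d*)(min|ms|us|ns|fs|y|m|d|h|s)$')

    def parse_resolution(res):
        match = _RESOLUTION.match(res)
        if match is None:
            raise ValueError('unsupported resolution: %r' % (res,))
        count = int(match.group(1) or 1)
        return count, match.group(2)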
> > > Now comes the tricky part: how to integrate the notion > > of 'resolution' with the 'dtype' data type factory of NumPy? > > In TimeSeries, the frequency is stored as an integer. For example, a > daily frequency is stored as 6000, an annual frequency as 1000, an > 'ANN_MAR' frequency as 1003... Well, I initially planned to keep the resolution as an enumerated type (int8 would be enough), but if the new way to specify resolutions goes ahead, I'm afraid that we may need a full int64 to save this. But apart from that, this should not be a problem (in general, the metainfo is a very tiny part of the space taken by a dataset). Cheers, -- Francesc Alted From faltet at pytables.org Mon Jul 14 12:55:29 2008 From: faltet at pytables.org (Francesc Alted) Date: Mon, 14 Jul 2008 18:55:29 +0200 Subject: [Numpy-discussion] NumPy date/time types and the resolution concept In-Reply-To: References: <200807141507.47484.faltet@pytables.org> Message-ID: <200807141855.29884.faltet@pytables.org> A Monday 14 July 2008, Alan G Isaac escrigué: > On Mon, 14 Jul 2008, Francesc Alted apparently wrote: > > Before giving more thought to the new proposal of the > > date/time types for NumPy based on the concept of > > 'resolution', I'd like to gather more feedback on your > > opinions about this. > > It might be a good idea to run the proposal(s) past > Marc-Andre Lemburg mal (at) egenix (dot) com Sure. And maybe also to Fred Drake, the original author of the ``datetime`` module. However, I'd prefer to send them something in a more advanced state of refinement than it is now. Thanks for the suggestion, -- Francesc Alted From pgmdevlist at gmail.com Mon Jul 14 13:11:22 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 14 Jul 2008 13:11:22 -0400 Subject: [Numpy-discussion] NumPy date/time types and the resolution concept In-Reply-To: <200807141850.21269.faltet@pytables.org> References: <200807141507.47484.faltet@pytables.org> <200807140934.49929.pgmdevlist@gmail.com> <200807141850.21269.faltet@pytables.org> Message-ID: <200807141311.22746.pgmdevlist@gmail.com> On Monday 14 July 2008 12:50:21 Francesc Alted wrote: > > A very useful point that Matt Knox had coded is the possibility to > > specify starting points for switching from one resolution to another. > > For example, you can have a series with an 'ANN_MAR' frequency, that > > corresponds to 1 point a year, the year starting in April. When > > switching back to a monthly resolution, the points from January to > > March of the first year will be masked. > > Ok. Anne was also suggesting that the origin of time would be > configurable, but then, you are talking about *masking* values. Mmm, I > don't think we should try to incorporate masking capabilities in the > NumPy date/time types. Francesc, In scikits.timeseries, we have 2 different objects: * DateArray, which is basically an ndarray of integers with a given 'frequency' attribute. * TimeSeries, which is basically the combination of a MaskedArray (the data part) and a DateArray (which keeps track of the date corresponding to each data point). TimeSeries objects have the resolution/origin of the companion DateArray, and when they're converted from one resolution to another, some masking may occur. My understanding is that you intend to define an object similar to DateArray. You want to define a new dtype (datetime64 or other); we used yet another class instead, Date.
A dtype would be easier to manipulate, but as neither Matt nor I were particularly experienced with that at the time, we followed the simpler approach of a class... > [N]timeunit > > where ``timeunit`` can take the values in: > > ['y', 'm', 'd', 'h', 'm', 's', 'ms', 'us', 'ns', 'fs'] > > so, for example, '14d' means a resolution of 14 days, or '10ms' means a > resolution of 1 hundredth of a second. Sounds good to me. What do other > people think? Sounds pretty cool and intuitive to use. However, writing the conversion rules from one to another will be a lot of fun. Take weekly, for example: that's a period of 7 days, but when does it start? On a Monday? Then, 12/31/2007 was the start of the first week of 2008... OK, we can leave that problem for the moment... From faltet at pytables.org Mon Jul 14 13:18:28 2008 From: faltet at pytables.org (Francesc Alted) Date: Mon, 14 Jul 2008 19:18:28 +0200 Subject: [Numpy-discussion] NumPy date/time types and the resolution concept In-Reply-To: References: <200807141507.47484.faltet@pytables.org> Message-ID: <200807141918.28367.faltet@pytables.org> A Monday 14 July 2008, Anne Archibald escrigué: > 2008/7/14 Francesc Alted : > > After pondering about the opinions about the first proposal, the > > idea we are incubating is to complement the ``datetime64`` with a > > 'resolution' metainfo. The ``datetime64`` will still be based on an > > int64 type, but the meaning of the 'ticks' would depend on a > > 'resolution' property. > > This is an interesting idea. To be useful, though, you would also > need a flexible "offset" defining the zero of time. After all, the > reason not to just always use (say) femtosecond accuracy is that > 2**64 femtoseconds is only about five hours. So if you're going to > use femtosecond steps, you really want to choose your start point > carefully. (It's also worth noting that there is little need for more > time accuracy than atomic clocks can provide, since anyone looking > for more than that is going to be doing some tricky metrology > anyway.) That's a good point indeed. Well, to start with, I suppose that picosecond resolution is more than enough for today's precision standards (even when using atomic clocks). However, given that atomic clocks are always improving their precision [1], having a femtosecond resolution is not going to bother people, I think. [1] http://en.wikipedia.org/wiki/Image:Clock_accurcy.jpg But the time origin is certainly an issue, yes. See later. > One might take guidance from the FITS format, which represents > (arrays of) quantities as (usually) fixed-point numbers, but has a > global "scale" and "offset" parameter for each array. This allows one > to accurately represent many common arrays with relatively few bits. > The FITS libraries transparently convert these quantities. Of course, > this isn't so convenient if you don't have basic machine datatypes > with enough precision to handle all the quantities of interest. That's pretty interesting in that the "scale" is certainly something similar to the "resolution" concept that we want to introduce. And definitely, "offset" would be similar to "origin". So yes, we will try to introduce both concepts. However, one thing that we would try to avoid is to use fixed-point arithmetic (we plan to use integer arithmetic only). The rationale is that fixed-point arithmetic is computationally more complex (it has to be implemented in software, while integer arithmetic is implemented in hardware) and that would slow down things too much. Thanks!
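As a toy illustration of the kind of integer-only conversion we have in mind (the names and the factor table below are made up for the example; a real implementation would live in C and be constrained to a fixed int64):

# Factors expressing each resolution in femtoseconds.  Calendar units
# (months, years) are deliberately left out, as they are not a fixed
# number of seconds.
FACTORS_FS = {'s': 10**15, 'ms': 10**12, 'us': 10**9,
              'ns': 10**6, 'ps': 10**3, 'fs': 1}

def convert_ticks(ticks, from_unit, to_unit):
    """Re-express an integer tick count in another resolution.

    Going to a coarser unit drops the sub-unit remainder (precision
    loss); going to a finer unit multiplies, which could overflow a
    real int64 even though Python integers will not.
    """
    fs = ticks * FACTORS_FS[from_unit]   # exact, integers only
    return fs // FACTORS_FS[to_unit]     # integer division, no rounding

# convert_ticks(1500, 'ms', 's') -> 1, the odd 500 ms being lost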
-- Francesc Alted From jdh2358 at gmail.com Mon Jul 14 13:22:32 2008 From: jdh2358 at gmail.com (John Hunter) Date: Mon, 14 Jul 2008 12:22:32 -0500 Subject: [Numpy-discussion] permissions on tests in numpy and scipy Message-ID: <88e473830807141022h2ddc874ai4743e11421cc391@mail.gmail.com> I have a rather unconventional install pipeline at work and owner-only read permissions on a number of the tests are causing me minor problems. It appears the permissions on the tests are set rather inconsistently in numpy and scipy -- is there any reason not to make these all 644? johnh at flag:site-packages> find numpy -name "test_*.py"|xargs ls -l -rw------- 1 johnh research 2769 Jul 14 12:01 numpy/core/tests/test_defchararray.py -rw------- 1 johnh research 8283 Jul 14 12:01 numpy/core/tests/test_defmatrix.py -rw------- 1 johnh research 1769 Jun 25 10:00 numpy/core/tests/test_errstate.py -rw------- 1 johnh research 1508 Jun 25 10:00 numpy/core/tests/test_memmap.py -rw------- 1 johnh research 33334 Jul 14 12:01 numpy/core/tests/test_multiarray.py -rw------- 1 johnh research 26695 Jul 14 12:01 numpy/core/tests/test_numeric.py -rw------- 1 johnh research 13781 Jul 14 12:01 numpy/core/tests/test_numerictypes.py -rw------- 1 johnh research 1113 Jul 14 12:01 numpy/core/tests/test_print.py -rw------- 1 johnh research 4290 Jul 14 12:01 numpy/core/tests/test_records.py -rw------- 1 johnh research 39370 Jul 14 12:01 numpy/core/tests/test_regression.py -rw------- 1 johnh research 4097 Jul 14 12:01 numpy/core/tests/test_scalarmath.py -rw------- 1 johnh research 9330 Jun 25 10:00 numpy/core/tests/test_ufunc.py -rw------- 1 johnh research 7583 Jul 14 12:01 numpy/core/tests/test_umath.py -rw------- 1 johnh research 11264 Jul 14 12:01 numpy/core/tests/test_unicode.py -rw------- 1 johnh research 222 Jul 14 12:00 numpy/distutils/tests/f2py_ext/tests/test_fib2.py -rw------- 1 johnh research 220 Jul 14 12:00 numpy/distutils/tests/f2py_f90_ext/tests/test_foo.py -rw------- 1 johnh research 220 Jul 14 12:00 numpy/distutils/tests/gen_ext/tests/test_fib3.py -rw------- 1 johnh research 277 Jul 14 12:00 numpy/distutils/tests/pyrex_ext/tests/test_primes.py -rw------- 1 johnh research 387 Jul 14 12:00 numpy/distutils/tests/swig_ext/tests/test_example.py -rw------- 1 johnh research 276 Jul 14 12:00 numpy/distutils/tests/swig_ext/tests/test_example2.py -rw------- 1 johnh research 1800 Jul 14 12:00 numpy/distutils/tests/test_fcompiler_gnu.py -rw------- 1 johnh research 2421 Jun 25 09:59 numpy/distutils/tests/test_misc_util.py -rw-r--r-- 1 johnh research 64338 Jun 25 09:59 numpy/f2py/lib/parser/test_Fortran2003.py -rw-r--r-- 1 johnh research 24785 Jun 25 09:59 numpy/f2py/lib/parser/test_parser.py -rw------- 1 johnh research 574 Jul 14 12:00 numpy/fft/tests/test_fftpack.py -rw------- 1 johnh research 1256 Jul 14 12:00 numpy/fft/tests/test_helper.py -rw------- 1 johnh research 9948 Jul 14 12:01 numpy/lib/tests/test__datasource.py -rw------- 1 johnh research 4088 Jul 14 12:01 numpy/lib/tests/test_arraysetops.py -rw------- 1 johnh research 809 Jul 14 12:01 numpy/lib/tests/test_financial.py -rw------- 1 johnh research 23794 Jul 14 12:01 numpy/lib/tests/test_format.py -rw------- 1 johnh research 30772 Jul 14 12:01 numpy/lib/tests/test_function_base.py -rw------- 1 johnh research 1587 Jun 25 10:00 numpy/lib/tests/test_getlimits.py -rw------- 1 johnh research 2111 Jul 14 12:01 numpy/lib/tests/test_index_tricks.py -rw------- 1 johnh research 7691 Jul 14 12:01 numpy/lib/tests/test_io.py -rw------- 1 johnh research 999 Jun 25 10:00
numpy/lib/tests/test_machar.py -rw------- 1 johnh research 2431 Jun 25 10:00 numpy/lib/tests/test_polynomial.py -rw------- 1 johnh research 1409 Jul 14 12:01 numpy/lib/tests/test_regression.py -rw------- 1 johnh research 14353 Jul 14 12:01 numpy/lib/tests/test_shape_base.py -rw------- 1 johnh research 6958 Jul 14 12:01 numpy/lib/tests/test_stride_tricks.py -rw------- 1 johnh research 6735 Jul 14 12:01 numpy/lib/tests/test_twodim_base.py -rw------- 1 johnh research 9579 Jul 14 12:01 numpy/lib/tests/test_type_check.py -rw------- 1 johnh research 1771 Jun 25 10:00 numpy/lib/tests/test_ufunclike.py -rw------- 1 johnh research 8279 Jul 14 12:01 numpy/linalg/tests/test_linalg.py -rw------- 1 johnh research 1739 Jul 14 12:01 numpy/linalg/tests/test_regression.py -rw------- 1 johnh research 78819 Jun 25 10:00 numpy/ma/tests/test_core.py -rw------- 1 johnh research 15744 Jul 14 12:01 numpy/ma/tests/test_extras.py -rw------- 1 johnh research 17759 Jun 25 10:00 numpy/ma/tests/test_mrecords.py -rw------- 1 johnh research 33009 Jun 25 10:00 numpy/ma/tests/test_old_ma.py -rw------- 1 johnh research 5956 Jun 25 10:00 numpy/ma/tests/test_subclassing.py -rw------- 1 johnh research 3173 Jul 14 12:01 numpy/oldnumeric/tests/test_oldnumeric.py -rw------- 1 johnh research 2043 Jun 25 09:59 numpy/random/tests/test_random.py -rw------- 1 johnh research 4330 Jun 25 10:00 numpy/testing/tests/test_utils.py -rw------- 1 johnh research 3356 Jun 25 10:00 numpy/tests/test_ctypeslib.py johnh at flag:site-packages> find scipy -name "test_*.py"|xargs ls -l -rw------- 1 johnh research 7653 Jul 14 12:01 scipy/cluster/tests/test_hierarchy.py -rw------- 1 johnh research 5852 Jul 14 12:01 scipy/cluster/tests/test_vq.py -rw------- 1 johnh research 13552 Jul 14 12:01 scipy/fftpack/tests/test_basic.py -rw-r--r-- 1 johnh research 1846 Mar 10 12:57 scipy/fftpack/tests/test_helper.py -rw-r--r-- 1 johnh research 11554 Mar 10 12:57 scipy/fftpack/tests/test_pseudo_diffs.py -rw------- 1 johnh research 4192 Apr 25 09:15 scipy/integrate/tests/test_integrate.py -rw------- 1 johnh research 3728 Feb 19 17:12 scipy/integrate/tests/test_quadpack.py -rw------- 1 johnh research 1035 Feb 19 17:12 scipy/integrate/tests/test_quadrature.py -rw------- 1 johnh research 5310 Jul 14 12:01 scipy/interpolate/tests/test_fitpack.py -rw------- 1 johnh research 8014 Jul 14 12:01 scipy/interpolate/tests/test_interpolate.py -rw------- 1 johnh research 11273 May 7 12:43 scipy/interpolate/tests/test_polyint.py -rw------- 1 johnh research 929 Apr 25 09:15 scipy/interpolate/tests/test_rbf.py -rw------- 1 johnh research 8625 Jul 14 12:01 scipy/io/matlab/tests/test_mio.py -rw------- 1 johnh research 2010 Apr 25 09:15 scipy/io/tests/test_array_import.py -rw------- 1 johnh research 10246 Jul 14 12:01 scipy/io/tests/test_mmio.py -rw-r--r-- 1 johnh research 3621 Mar 10 12:57 scipy/io/tests/test_npfile.py -rw------- 1 johnh research 7276 Apr 25 09:15 scipy/io/tests/test_recaster.py -rw-r--r-- 1 johnh research 8320 Mar 10 12:57 scipy/lib/blas/tests/test_blas.py -rw-r--r-- 1 johnh research 17225 Mar 10 12:57 scipy/lib/blas/tests/test_fblas.py -rw------- 1 johnh research 4271 Apr 25 09:15 scipy/lib/lapack/tests/test_lapack.py -rw-r--r-- 1 johnh research 230 Mar 10 12:57 scipy/linalg/tests/test_atlas_version.py -rw-r--r-- 1 johnh research 13148 Mar 10 12:57 scipy/linalg/tests/test_basic.py -rw-r--r-- 1 johnh research 7597 Mar 10 12:57 scipy/linalg/tests/test_blas.py -rw------- 1 johnh research 32051 Apr 25 09:15 scipy/linalg/tests/test_decomp.py -rw-r--r-- 1 johnh research 
17095 Mar 10 12:57 scipy/linalg/tests/test_fblas.py -rw-r--r-- 1 johnh research 2017 Mar 10 12:57 scipy/linalg/tests/test_lapack.py -rw-r--r-- 1 johnh research 3515 Mar 10 12:57 scipy/linalg/tests/test_matfuncs.py -rw-r--r-- 1 johnh research 961 Mar 10 12:57 scipy/maxentropy/tests/test_maxentropy.py -rw------- 1 johnh research 1399 Feb 19 17:12 scipy/misc/tests/test_pilutil.py -rw------- 1 johnh research 12120 Feb 19 17:12 scipy/odr/tests/test_odr.py -rw-r--r-- 1 johnh research 576 Mar 10 12:57 scipy/optimize/tests/test_cobyla.py -rw-r--r-- 1 johnh research 2632 Mar 10 12:57 scipy/optimize/tests/test_nonlin.py -rw------- 1 johnh research 10392 Apr 25 09:15 scipy/optimize/tests/test_optimize.py -rw-r--r-- 1 johnh research 3032 Mar 10 12:57 scipy/optimize/tests/test_slsqp.py -rw-r--r-- 1 johnh research 977 Mar 10 12:57 scipy/optimize/tests/test_zeros.py -rw-r--r-- 1 johnh research 1943 Apr 3 10:22 scipy/signal/tests/test_signaltools.py -rw-r--r-- 1 johnh research 3009 Apr 3 10:22 scipy/signal/tests/test_wavelets.py -rw------- 1 johnh research 1181 Apr 25 09:15 scipy/sparse/linalg/dsolve/tests/test_linsolve.py -rw------- 1 johnh research 5763 Apr 25 09:15 scipy/sparse/linalg/dsolve/umfpack/tests/test_umfpack.py -rw------- 1 johnh research 8458 Apr 25 09:15 scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py -rw-r--r-- 1 johnh research 2048 Mar 10 12:54 scipy/sparse/linalg/eigen/arpack/tests/test_speigs.py -rw------- 1 johnh research 1896 Jul 14 12:01 scipy/sparse/linalg/eigen/lobpcg/tests/test_lobpcg.py -rw------- 1 johnh research 4535 Apr 25 09:15 scipy/sparse/linalg/isolve/tests/test_iterative.py -rw------- 1 johnh research 1706 Apr 25 09:15 scipy/sparse/linalg/tests/test_interface.py -rw------- 1 johnh research 47594 Jul 14 12:01 scipy/sparse/tests/test_base.py -rw------- 1 johnh research 7454 Apr 25 09:15 scipy/sparse/tests/test_construct.py -rw------- 1 johnh research 3259 Apr 25 09:15 scipy/sparse/tests/test_spfuncs.py -rw------- 1 johnh research 2411 Apr 25 09:15 scipy/sparse/tests/test_sputils.py -rw------- 1 johnh research 74547 Jul 14 12:01 scipy/special/tests/test_basic.py -rw------- 1 johnh research 1008 Feb 19 17:12 scipy/special/tests/test_spfun_stats.py -rw-r--r-- 1 johnh research 877 Mar 10 12:55 scipy/stats/models/tests/test_glm.py -rw-r--r-- 1 johnh research 1241 Mar 10 12:55 scipy/stats/models/tests/test_scale.py -rw-r--r-- 1 johnh research 1585 Mar 10 12:55 scipy/stats/models/tests/test_utils.py -rw-r--r-- 1 johnh research 237473 Mar 10 12:57 scipy/ndimage/tests/test_ndimage.py -rw------- 1 johnh research 5394 Jul 14 12:01 scipy/ndimage/tests/test_registration.py -rw------- 1 johnh research 478 Jul 14 12:01 scipy/ndimage/tests/test_regression.py -rw------- 1 johnh research 6929 Jul 14 12:01 scipy/ndimage/tests/test_segment.py -rw------- 1 johnh research 557 Jul 14 12:01 scipy/stats/models/tests/test_bspline.py -rw------- 1 johnh research 10045 Apr 25 09:15 scipy/stats/models/tests/test_formula.py -rw-r--r-- 1 johnh research 1120 Mar 10 12:55 scipy/stats/models/tests/test_regression.py -rw-r--r-- 1 johnh research 662 Mar 10 12:55 scipy/stats/models/tests/test_rlm.py -rw-r--r-- 1 johnh research 7993 Mar 10 12:55 scipy/stats/tests/test_distributions.py -rw------- 1 johnh research 4701 Apr 25 09:15 scipy/stats/tests/test_mmorestats.py -rw-r--r-- 1 johnh research 4278 Mar 10 12:55 scipy/stats/tests/test_morestats.py -rw------- 1 johnh research 20707 Apr 25 09:15 scipy/stats/tests/test_mstats.py -rw------- 1 johnh research 29167 Jul 14 12:01 scipy/stats/tests/test_stats.py 
-rw-r--r-- 1 johnh research 760 Mar 10 12:57 scipy/weave/tests/test_ast_tools.py -rw------- 1 johnh research 7687 Apr 25 09:15 scipy/weave/tests/test_blitz_tools.py -rw-r--r-- 1 johnh research 2178 Mar 10 12:57 scipy/weave/tests/test_build_tools.py -rw------- 1 johnh research 23557 Apr 25 09:15 scipy/weave/tests/test_c_spec.py -rw-r--r-- 1 johnh research 11938 Mar 10 12:57 scipy/weave/tests/test_catalog.py -rw-r--r-- 1 johnh research 4623 Mar 10 12:57 scipy/weave/tests/test_ext_tools.py -rw-r--r-- 1 johnh research 1240 Mar 10 12:57 scipy/weave/tests/test_inline_tools.py -rw-r--r-- 1 johnh research 5136 Mar 10 12:57 scipy/weave/tests/test_numpy_scalar_spec.py -rw-r--r-- 1 johnh research 291 Mar 10 12:57 scipy/weave/tests/test_scxx.py -rw-r--r-- 1 johnh research 8661 Mar 10 12:57 scipy/weave/tests/test_scxx_dict.py -rw-r--r-- 1 johnh research 27851 Mar 10 12:57 scipy/weave/tests/test_scxx_object.py -rw-r--r-- 1 johnh research 13442 Mar 10 12:57 scipy/weave/tests/test_scxx_sequence.py -rw-r--r-- 1 johnh research 12072 Mar 10 12:57 scipy/weave/tests/test_size_check.py -rw-r--r-- 1 johnh research 6367 Mar 10 12:57 scipy/weave/tests/test_slice_handler.py -rw-r--r-- 1 johnh research 1197 Mar 10 12:57 scipy/weave/tests/test_standard_array_spec.py -rw------- 1 johnh research 3486 Jul 14 12:01 scipy/weave/tests/test_wx_spec.py johnh at flag:site-packages> From robert.kern at gmail.com Mon Jul 14 13:34:50 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 14 Jul 2008 12:34:50 -0500 Subject: [Numpy-discussion] permissions on tests in numpy and scipy In-Reply-To: <88e473830807141022h2ddc874ai4743e11421cc391@mail.gmail.com> References: <88e473830807141022h2ddc874ai4743e11421cc391@mail.gmail.com> Message-ID: <3d375d730807141034xa79c581jfda42982e117811b@mail.gmail.com> On Mon, Jul 14, 2008 at 12:22, John Hunter wrote: > I have a rather unconventional install pipeline at work and owner-only > read permissions on a number of the tests are causing me minor > problems. It appears the permissions on the tests are set rather > inconsistently in numpy and scipy -- is there any reason not to make > these all 644? We're not doing anything special here. When I install using "sudo python setup.py install" on OS X, all of the permissions are 644. I think the problem may be in your pipeline. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Chris.Barker at noaa.gov Mon Jul 14 13:46:53 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 14 Jul 2008 10:46:53 -0700 Subject: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy In-Reply-To: References: <200807111559.03986.falted@pytables.org> <777651ce0807111001w2a65307cg30bcf11aa43f1c9c@mail.gmail.com> <200807112001.39395.falted@pytables.org> <200807111420.26794.pgmdevlist@gmail.com> <4877B197.7050607@noaa.gov> Message-ID: <487B910D.50405@noaa.gov> Matt Knox wrote: > The DateArray class in the timeseries scikits can do part of what you want. > Observe...
>>>> a.year > array([2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, > 2008, 2008, 2008, 2008]) >>>> a.hour > array([11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 0, 1]) >>>> a.day > array([12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 13, 13]) This is great for what I often need: to output data in a format with columns of: year, month, day, hour, min, sec But I also often need to be able to convert a "TimeDelta" to a particular unit, for example (using the lib datetime): >>> td = datetime.datetime(2008, 7, 14, 12) - datetime.datetime(2008, 7, 13, 10) >>> td datetime.timedelta(1, 7200) so we have a timedelta of one day, 7200 seconds. I'd like: >>> td.as_hours Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: 'datetime.timedelta' object has no attribute 'as_hours' which doesn't exist in the datetime module, so I do: >>> hours = td.days*24 + td.seconds/3600. >>> hours 26.0 I find myself writing this code a lot, so I'd love to have it built in. Which brings up an issue: The reason it isn't built in is that the philosophy behind the datetime module is that it provides the building blocks with which to build more feature-full packages. Personally, I really wish it had a bit more built in, but what can we do? As for the numpy datetime types, we need to decide how much to build in. I think the kind of functionality described here is pretty basic, and should be included, but if we include everyone's idea of basic, it could get pretty bloated! > I would encourage you to take a look at the wiki > (http://scipy.org/scipy/scikits/wiki/TimeSeries) as you may find some surprises > in there that prove useful. So maybe we should have very little in the numpy datetime type, and have scikits.TimeSeries as a more feature-full package built on top of it. > Would many people be interested in seeing this kind > of string date parsing integrated in the native NumPy types? I think that more than one string format is a feature for a meta package. > the idea we > are incubating is to complement the ``datetime64`` with a 'resolution' > metainfo. The ``datetime64`` will still be based on an int64 type, but > the meaning of the 'ticks' would depend on a 'resolution' property. I like this! Would there be conversion between different resolutions available? I wonder what the syntax for that should be? > And > definitely, "offset" would be similar to "origin". So yes, we will try > to introduce both concepts. yup -- origin is critical! What resolution (and numerical format) do you use to express the origin? Even if your data is in days, you may want to specify the origin with more precision, so as not to have confusion about what "0 days" means in some higher resolution unit. Also, if you want picosecond resolution, then the origin needs to be picosecond resolution as well. Thanks for working on this -- I'm looking forward to using it! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From mdroe at stsci.edu Mon Jul 14 13:53:09 2008 From: mdroe at stsci.edu (Michael Droettboom) Date: Mon, 14 Jul 2008 13:53:09 -0400 Subject: [Numpy-discussion] mrecarray Message-ID: <487B9285.6070907@stsci.edu> I'm running into a couple of small problems with mrecarray. I'm not sure if they're bugs or a usage error.
First, the constructor throws an exception when the format string contains nested arrays (if that is the proper term) such as "(2,2)f8". This creates a three-element tuple in the dtype.descr list, whereas the mrecord.py code seems to assume each descr will always have 2 elements. ======== In [1]: from numpy.ma import mrecords In [2]: x = mrecords.mrecarray(1, formats="(2,2)f8") --------------------------------------------------------------------------- <type 'exceptions.ValueError'> Traceback (most recent call last) /wonkabar/data1/scraps/<ipython console> in <module>() /home/mdroe/usr/lib/python2.5/site-packages/numpy/ma/mrecords.py in __new__(cls, shape, dtype, buf, offset, strides, formats, names, titles, byteorder, aligned, mask, hard_mask, fill_value, keep_mask, copy, **options) 121 self = recarray.__new__(cls, shape, dtype=dtype, buf=buf, offset=offset, 122 strides=strides, formats=formats, --> 123 byteorder=byteorder, aligned=aligned,) 124 # 125 mdtype = [(k,'|b1') for (k,_) in self.dtype.descr] /home/mdroe/usr/lib/python2.5/site-packages/numpy/core/records.py in __new__(subtype, shape, dtype, buf, offset, strides, formats, names, titles, byteorder, aligned) 248 249 if buf is None: --> 250 self = ndarray.__new__(subtype, shape, (record, descr)) 251 else: 252 self = ndarray.__new__(subtype, shape, (record, descr), /home/mdroe/usr/lib/python2.5/site-packages/numpy/ma/mrecords.py in __array_finalize__(self, obj) 157 _fieldmask = getattr(obj, '_fieldmask', None) 158 if _fieldmask is None: --> 159 mdescr = [(n,'|b1') for (n,_) in self.dtype.descr] 160 objmask = getattr(obj, '_mask', nomask) 161 if objmask is nomask: <type 'exceptions.ValueError'>: too many values to unpack ======== In my own use case, I don't care if the individual elements of the nested array are maskable -- either the whole array being masked or not is good enough -- but perhaps that shortcoming is why this wasn't designed to work? I have attached a patch to mrecords.py that gets me past this and does seem to allow the nested array to be masked as a whole. Secondly, the "names" and "titles" kwargs seem to be ignored by the mrecarray constructor. ======== In [3]: x = mrecords.mrecarray(1, formats="f8", names="foo") In [4]: x[0]['foo'] = 42.0 --------------------------------------------------------------------------- <type 'exceptions.ValueError'> Traceback (most recent call last) /wonkabar/data1/scraps/<ipython console> in <module>() /home/mdroe/usr/lib/python2.5/site-packages/numpy/ma/mrecords.py in __setitem__(self, indx, value) 309 def __setitem__(self, indx, value): 310 "Sets the given record to value." --> 311 MaskedArray.__setitem__(self, indx, value) 312 if isinstance(indx, basestring): 313 self._mask[indx] = ma.getmaskarray(value) /home/mdroe/usr/lib/python2.5/site-packages/numpy/ma/core.py in __setitem__(self, indx, value) 1437 # raise IndexError, msg 1438 if isinstance(indx, basestring): --> 1439 ndarray.__setitem__(self._data, indx, value) 1440 ndarray.__setitem__(self._mask, indx, getmask(value)) 1441 return <type 'exceptions.ValueError'>: field named foo not found. ======== The included patch delegates these kwargs onto the underlying recarray. Cheers, Mike -- Michael Droettboom Science Software Branch Operations and Engineering Division Space Telescope Science Institute Operated by AURA for NASA -------------- next part -------------- An embedded and charset-unspecified text was scrubbed...
Name: mrecords.py.diff URL: From pgmdevlist at gmail.com Mon Jul 14 14:05:13 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 14 Jul 2008 14:05:13 -0400 Subject: [Numpy-discussion] mrecarray In-Reply-To: <487B9285.6070907@stsci.edu> References: <487B9285.6070907@stsci.edu> Message-ID: <200807141405.13680.pgmdevlist@gmail.com> On Monday 14 July 2008 13:53:09 Michael Droettboom wrote: > I'm running into a couple of small problems with mrecarray. I'm not > sure if they're bugs or a usage error. Bugs are my bet. I'll check that. The first one might be problematic, as it probably comes from ma.core. The second one is most likely due to the negligence of the author ;), thanks a lot for the patch and the feedback. From faltet at pytables.org Mon Jul 14 14:17:18 2008 From: faltet at pytables.org (Francesc Alted) Date: Mon, 14 Jul 2008 20:17:18 +0200 Subject: [Numpy-discussion] NumPy date/time types and the resolution concept In-Reply-To: <200807141311.22746.pgmdevlist@gmail.com> References: <200807141507.47484.faltet@pytables.org> <200807141850.21269.faltet@pytables.org> <200807141311.22746.pgmdevlist@gmail.com> Message-ID: <200807142017.18659.faltet@pytables.org> A Monday 14 July 2008, Pierre GM escrigué: > On Monday 14 July 2008 12:50:21 Francesc Alted wrote: > > > A very useful point that Matt Knox had coded is the possibility > > > to specify starting points for switching from one resolution to > > > another. For example, you can have a series with an 'ANN_MAR' > > > frequency, that corresponds to 1 point a year, the year starting > > > in April. When switching back to a monthly resolution, the points > > > from January to March of the first year will be masked. > > > > Ok. Anne was also suggesting that the origin of time would be > > configurable, but then, you are talking about *masking* values. > > Mmm, I don't think we should try to incorporate masking > > capabilities in the NumPy date/time types. > > Francesc, > In scikits.timeseries, we have 2 different objects: > * DateArray, which is basically an ndarray of integers with a given > 'frequency' attribute. > * TimeSeries, which is basically the combination of a MaskedArray > (the data part) and a DateArray (which keeps track of the date > corresponding to each data point). TimeSeries objects have the > resolution/origin of the companion DateArray, and when they're > converted from one resolution to another, some masking may occur. > > My understanding is that you intend to define an object similar to > DateArray. You want to define a new dtype (datetime64 or other); we > used yet another class instead, Date. A dtype would be easier to > manipulate, but as neither Matt nor I were particularly experienced > with that at the time, we followed the simpler approach of a class... Well, what we are after is precisely this: a new dtype. After integrating it in NumPy, I suppose that your DateArray would be similar to a NumPy array with a dtype ``datetime64`` (bar the conceptual differences between your 'frequency' behind DateArray and the 'resolution' behind the datetime64 dtype). > > > [N]timeunit > > > > where ``timeunit`` can take the values in: > > > > ['y', 'm', 'd', 'h', 'm', 's', 'ms', 'us', 'ns', 'fs'] > > > > so, for example, '14d' means a resolution of 14 days, or '10ms' > > means a resolution of 1 hundredth of a second. Sounds good to me. > > What do other people think? > > Sounds pretty cool and intuitive to use. However, writing the > conversion rules from one to another will be a lot of fun.
Take > weekly, for example: that's a period of 7 days, but when does it start? On a Monday? Then, 12/31/2007 was the start of the first week of 2008... OK, we can leave that problem for the moment... It would start when the origin tells it to start. It is important to note that our proposal will not force a '7d' (seven days) 'tick' to start on Monday, or a '1m' (one month) to start the 1st day of a calendar month, but rather where the user decides to set its origin. Cheers, -- Francesc Alted From pgmdevlist at gmail.com Mon Jul 14 14:35:20 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 14 Jul 2008 14:35:20 -0400 Subject: [Numpy-discussion] NumPy date/time types and the resolution concept In-Reply-To: <200807142017.18659.faltet@pytables.org> References: <200807141507.47484.faltet@pytables.org> <200807141311.22746.pgmdevlist@gmail.com> <200807142017.18659.faltet@pytables.org> Message-ID: <200807141435.21068.pgmdevlist@gmail.com> On Monday 14 July 2008 14:17:18 Francesc Alted wrote: > Well, what we are after is precisely this: a new dtype. After > integrating it in NumPy, I suppose that your DateArray would be similar > to a NumPy array with a dtype ``datetime64`` (bar the conceptual > differences between your 'frequency' behind DateArray and > the 'resolution' behind the datetime64 dtype). Well, you're losing me on this one: could you explain the difference between the two concepts? It might only be a problem of vocabulary... > It would start when the origin tells it to start. It is > important to note that our proposal will not force a '7d' (seven > days) 'tick' to start on Monday, or a '1m' (one month) to start the 1st > day of a calendar month, but rather where the user decides to set its > origin. OK, so we need 2 flags, one for the resolution, one for the origin. Because there won't be that many resolutions possible, an int8 should be sufficient. What do you have in mind for the origin? When using a resolution coarser than 1d (7d, 1m, 3m, 1y), an origin in days is OK. What about less than a day? From faltet at pytables.org Mon Jul 14 15:12:18 2008 From: faltet at pytables.org (Francesc Alted) Date: Mon, 14 Jul 2008 21:12:18 +0200 Subject: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy In-Reply-To: <487B910D.50405@noaa.gov> References: <200807111559.03986.falted@pytables.org> <487B910D.50405@noaa.gov> Message-ID: <200807142112.19226.faltet@pytables.org> A Monday 14 July 2008, Christopher Barker escrigué: > Matt Knox wrote: > > The DateArray class in the timeseries scikits can do part of what > > you want. Observe... > > > >>>> a.year > > > > array([2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, > > 2008, 2008, 2008, 2008, 2008]) > > > >>>> a.hour > > > > array([11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 0, 1]) > > > >>>> a.day > > > > array([12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 13, 13]) > > This is great for what I often need: to output data in a format with > columns of: > > year, month, day, hour, min, sec I see. However, the more I think about this, the more I see the need to split the date/time functionalities and duties into two parts: * the first one implementing a date/time dtype with the basic functionality for timestamping and/or time-interval measuring.
* the second part would be a specific array container of date/time types (which may perfectly well be a port of the DateArray of scikits.timeseries, based on the date/time type) where one can implement all of the functionality (like the one that you are proposing above) that is beyond a humble date/time dtype. Definitely, this two-layer approach is going to allow a more powerful and flexible design in the long term, IMO. > But I also often need to be able to convert a "TimeDelta" to a > particular unit, for example (using the lib datetime): > >>> td = datetime.datetime(2008, 7, 14, 12) - > >>> datetime.datetime(2008, > > 7, 13, 10) > > >>> td > > datetime.timedelta(1, 7200) > > so we have a timedelta of one day, 7200 seconds. > > I'd like: > >>> td.as_hours > > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > AttributeError: 'datetime.timedelta' object has no attribute > 'as_hours' > > which doesn't exist in the datetime module, so I do: > >>> hours = td.days*24 + td.seconds/3600. > >>> hours > > 26.0 > > I find myself writing this code a lot, so I'd love to have it built > in. Hmm, I don't know if having a converter for every time unit would be too much. I'd prefer the following: td.as_timeunit('hour') where you can specify the time unit as the parameter. Will take note of this. > > Which brings up an issue: > > The reason it isn't built in is that the philosophy behind the > datetime module is that it provides the building blocks with which to > build more feature-full packages. Personally, I really wish it had a > bit more built in, but what can we do? > > As for the numpy datetime types, we need to decide how much to build > in. I think the kind of functionality described here is pretty basic, > and should be included, but if we include everyone's idea of basic, > it could get pretty bloated! Completely agree. This is why I'm proposing the two-layer approach: have the basic date/time functionality implemented as a dtype (i.e. in C space), and put the other niceties into a sort of ``DateArray`` (perhaps in Python space). > > the idea we > > are incubating is to complement the ``datetime64`` with a > > 'resolution' metainfo. The ``datetime64`` will still be based on an > > int64 type, but the meaning of the 'ticks' would depend on a > > 'resolution' property. > > I like this! Would there be conversion between different resolutions > available? I wonder what the syntax for that should be? Well, what about the ".as_timeunit()" stated above for the date/time scalar and a similar one for the ``DateArray`` layer? However, be aware that, as we are proposing integer arithmetic for the date/time types (and not fixed-point or floating-point arithmetic) you *will* lose precision when changing resolution from a fine-grained time unit to another more coarse-grained (and inversely, you risk overflow when changing resolution from a coarse-grained to another more fine-grained unit), and this may not be what you want. > > And > > definitely, "offset" would be similar to "origin". So yes, we will > > try to introduce both concepts. > > yup -- origin is critical! > > What resolution (and numerical format) do you use to express the > origin? Even if your data is in days, you may want to specify the > origin with more precision, so as not to have confusion about what "0 > days" means in some higher resolution unit. Also, if you want > picosecond resolution, then the origin needs to be picosecond > resolution as well. Good point.
I'm afraid that we will only support the specification of the origin with a fixed resolution of microseconds, and between the year 1 and 9999 (mainly for ``datetime`` compatibility, but also to avoid the 'chicken and egg' effect that you noticed ;-). Cheers, -- Francesc Alted From pgmdevlist at gmail.com Mon Jul 14 15:29:20 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 14 Jul 2008 15:29:20 -0400 Subject: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy In-Reply-To: <200807142112.19226.faltet@pytables.org> References: <200807111559.03986.falted@pytables.org> <487B910D.50405@noaa.gov> <200807142112.19226.faltet@pytables.org> Message-ID: <200807141529.20467.pgmdevlist@gmail.com> On Monday 14 July 2008 15:12:18 Francesc Alted wrote: > I see. However, the more I think about this, the more I see the need to > split the date/time functionalities and duties into two parts: > > * the first one implementing a date/time dtype with the basic > functionality for timestamping and/or time-interval measuring. That would be our Date class. > * the second part would be a specific array container of date/time types > (which may perfectly well be a port of the DateArray of > scikits.timeseries, based on the date/time type) where one > can implement all of the functionality (like the one that you are > proposing above) that is beyond a humble date/time dtype. That would be our DateArray class indeed... Francesc, Chris, may I suggest you try TimeSeries if you haven't already? That way you could see what kind of features are missing and which ones should be improved with the new dtype? From pgmdevlist at gmail.com Mon Jul 14 18:06:29 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 14 Jul 2008 18:06:29 -0400 Subject: [Numpy-discussion] Doc on dtypes Message-ID: <200807141806.30592.pgmdevlist@gmail.com> All, Could anybody point me to some docs on dtypes?
Michael Droettboom's recent question made me realize that things were far more complex than I thought. For example, how can I find the shape and type of a field (without using dtype.descr)? Thanks a lot in advance P. From robert.kern at gmail.com Mon Jul 14 18:53:41 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 14 Jul 2008 17:53:41 -0500 Subject: [Numpy-discussion] Doc on dtypes In-Reply-To: <200807141806.30592.pgmdevlist@gmail.com> References: <200807141806.30592.pgmdevlist@gmail.com> Message-ID: <3d375d730807141553r1b3cfaa3u2e9469bcaeb23a82@mail.gmail.com> On Mon, Jul 14, 2008 at 17:06, Pierre GM wrote: > All, > Could anybody point me to some docs on dtypes? Michael Droettboom's recent > question made me realize that things were far more complex than I thought. > For example, how can I find the shape and type of a field (without using > dtype.descr)? dtype.fields is a dict-like object containing the same information, but accessible by field name. Chapter 7 of _The Guide to Numpy_ has more content, and the equivalent will be wending its way into the numpy documentation marathon sooner or later, but with the insight about dtype.fields, I think you can get pretty far on your own. If you have any other specific questions, feel free to ask. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pgmdevlist at gmail.com Mon Jul 14 19:24:53 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 14 Jul 2008 19:24:53 -0400 Subject: [Numpy-discussion] Doc on dtypes In-Reply-To: <3d375d730807141553r1b3cfaa3u2e9469bcaeb23a82@mail.gmail.com> References: <200807141806.30592.pgmdevlist@gmail.com> <3d375d730807141553r1b3cfaa3u2e9469bcaeb23a82@mail.gmail.com> Message-ID: <200807141924.54407.pgmdevlist@gmail.com> On Monday 14 July 2008 18:53:41 Robert Kern wrote: > dtype.fields is a dict-like object containing the same information, > but accessible by field name. But as it's a dictionary, I can't use iteritems() without risking having the wrong order of fields, right? Or do dictproxies behave differently? > Chapter 7 of _The Guide to Numpy_ has more content, Oh, of course... My version is just 2 years old; has there been much change to dtypes in the meantime? From robert.kern at gmail.com Mon Jul 14 19:39:39 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 14 Jul 2008 18:39:39 -0500 Subject: [Numpy-discussion] Doc on dtypes In-Reply-To: <200807141924.54407.pgmdevlist@gmail.com> References: <200807141806.30592.pgmdevlist@gmail.com> <3d375d730807141553r1b3cfaa3u2e9469bcaeb23a82@mail.gmail.com> <200807141924.54407.pgmdevlist@gmail.com> Message-ID: <3d375d730807141639tf8c00d2h9f0d4035ccb7955d@mail.gmail.com> On Mon, Jul 14, 2008 at 18:24, Pierre GM wrote: > On Monday 14 July 2008 18:53:41 Robert Kern wrote: >> dtype.fields is a dict-like object containing the same information, >> but accessible by field name. > > But as it's a dictionary, I can't use iteritems() without risking having the > wrong order of fields, right? Or do dictproxies behave differently? Right. If you want order, use dtype.descr, or sort on the last item in the tuple. We can probably reimplement the dictproxy to guarantee order. >> Chapter 7 of _The Guide to Numpy_ has more content, > Oh, of course... My version is just 2 years old; has there been much change > to dtypes in the meantime? No.
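For the ordering question, here is a minimal sketch of walking the fields in offset order via dtype.fields (the example dtype is arbitrary):

import numpy as np

dt = np.dtype([('x', np.float64), ('y', np.int32), ('z', 'S8')])

# dtype.fields maps name -> (field dtype, byte offset); sorting on the
# offset recovers the declaration order without touching dtype.descr.
for name, (fdtype, offset) in sorted(dt.fields.items(),
                                     key=lambda item: item[1][1]):
    print name, fdtype, offset
# prints:
# x float64 0
# y int32 8
# z |S8 12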
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From fperez.net at gmail.com Mon Jul 14 19:58:52 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 14 Jul 2008 16:58:52 -0700 Subject: [Numpy-discussion] Run np.test() twice, get message. In-Reply-To: <1d36917a0807132259l1e13da83l6833de7b4d7e1fb3@mail.gmail.com> References: <1d36917a0807132259l1e13da83l6833de7b4d7e1fb3@mail.gmail.com> Message-ID: On Sun, Jul 13, 2008 at 10:59 PM, Alan McIntyre wrote: > On Mon, Jul 14, 2008 at 1:31 AM, Charles R Harris > wrote: >> Any idea what this is: >> >> *** DocTestRunner.merge: '__main__' in both testers; summing outcomes. > > Hmm..that's coming from nose. I'll see what it's about tomorrow. It's actually coming from doctest. Hardcode import doctest doctest.master = None in the code that runs the tests. This resets a module-global that's the cause of that message. Cheers, f From charlesr.harris at gmail.com Mon Jul 14 20:05:27 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 14 Jul 2008 18:05:27 -0600 Subject: [Numpy-discussion] Release 1.1.1 Message-ID: All, The rc release of numpy-1.1.1 is due out next Sunday. I have gone through the commits made to the trunk since the 1.1.x branch to pull out backport candidates. If you find your name here could you make the backport or say why you think it inappropriate. David, I know that these are mostly build fixes and you have backported many of them, but I don't know the current build state of 1.1.x cdavid r5236 r5240 r5266 r5267 r5268 r5269 r5270 r5271 r5272 r5273 r5274 r5275 r5276 r5277 r5278 r5279 r5280 r5281 r5282 r5283 r5302 r5304 r5355 r5365 r5366 r5367 r5368 These are mine and I'll take care of them. charris r5259 r5312 r5322 r5324 r5392 r5394 r5399 r5406 r5407 dhuard r5254 fperez r5298 r5301 r5303 jarrod r5285 oliphant r5245 r5255 Pierre, I know you have been working diligently to get masked arrays up to speed and have made numerous fixes in the 1.1.x branch. All the tests pass for me. Is there more that needs to be done? pierregm r5242 r5244 r5248 r5249 r5251 r5253 r5256 r5260 r5263 r5264 r5284 r5292 r5314 r5329 r5332 ptvirtan r5261 rkern r5296 r5297 r5342 r5349 r5357 Stefan, these are mostly documentation related. IIRC, you planned to update the documentation in 1.1.1, which probably also needs ptvirtan's commit above. What is the current status of this project? stefan r5290 r5293 r5294 r5299 r5360 r5371 r5372 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From millman at berkeley.edu Mon Jul 14 20:13:40 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Mon, 14 Jul 2008 17:13:40 -0700 Subject: [Numpy-discussion] Release 1.1.1 In-Reply-To: References: Message-ID: On Mon, Jul 14, 2008 at 5:05 PM, Charles R Harris wrote: > The rc release of numpy-1.1.1 is due out next Sunday. I have gone through > the commits made to the trunk since the 1.1.x branch to pull out backport > candidates. If you find your name here could you make the backport or say > why you think it inappropriate. Thanks for putting this together! > jarrod > r5285 I was just testing subversion (I inserted a blank line). 
-- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From robert.kern at gmail.com Mon Jul 14 20:18:51 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 14 Jul 2008 19:18:51 -0500 Subject: [Numpy-discussion] Release 1.1.1 In-Reply-To: References: Message-ID: <3d375d730807141718t75ce5967p67bcf81c6fca85ab@mail.gmail.com> On Mon, Jul 14, 2008 at 19:05, Charles R Harris wrote: > All, > > The rc release of numpy-1.1.1 is due out next Sunday. I have gone through > the commits made to the trunk since the 1.1.x branch to pull out backport > candidates. If you find your name here could you make the backport or say > why you think it inappropriate. > rkern > r5296 > r5297 The latter is the backport of the former. > r5342 > r5349 > r5357 Sure, I'll take these. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pgmdevlist at gmail.com Mon Jul 14 20:21:50 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 14 Jul 2008 20:21:50 -0400 Subject: [Numpy-discussion] Release 1.1.1 In-Reply-To: References: Message-ID: <200807142021.50352.pgmdevlist@gmail.com> > Pierre, I know you have been working diligently to get masked arrays up to > speed and have made numerous fixes in the 1.1.x branch. All the tests pass > for me. Is there more that needs to be done? Charles, I did as much as I could to ensure compatibility with Python 2.3, but I can't test it myself (can't install Python 2.3 on my machine). It'd be great if somebody could check it works with that version, otherwise I'm all go (the recent problems with mrecords and exotic flexible types are for 1.2). From pgmdevlist at gmail.com Mon Jul 14 20:41:43 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 14 Jul 2008 20:41:43 -0400 Subject: [Numpy-discussion] Doc on dtypes In-Reply-To: <3d375d730807141639tf8c00d2h9f0d4035ccb7955d@mail.gmail.com> References: <200807141806.30592.pgmdevlist@gmail.com> <200807141924.54407.pgmdevlist@gmail.com> <3d375d730807141639tf8c00d2h9f0d4035ccb7955d@mail.gmail.com> Message-ID: <200807142041.43287.pgmdevlist@gmail.com> On Monday 14 July 2008 19:39:39 Robert Kern wrote: > Right. If you want order, use dtype.descr, or sort on the last item in > the tuple. We can probably reimplement the dictproxy to guarantee > order. Oh, with dtype.names and dtype.fields I can work.
The Guide mentions a key [-1] in dtype.fields that should store the names in order: that would be quite useful but it doesn't work (KeyError). Where do you see this mention? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pgmdevlist at gmail.com Mon Jul 14 22:15:46 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 14 Jul 2008 22:15:46 -0400 Subject: [Numpy-discussion] Doc on dtypes In-Reply-To: <3d375d730807141856h7bf36acck33b82fdeaba750b1@mail.gmail.com> References: <200807141806.30592.pgmdevlist@gmail.com> <200807142041.43287.pgmdevlist@gmail.com> <3d375d730807141856h7bf36acck33b82fdeaba750b1@mail.gmail.com> Message-ID: <200807142215.46090.pgmdevlist@gmail.com> On Monday 14 July 2008 21:56:47 Robert Kern wrote: > > Oh, with dtype.names and dtype.fields I can work. The Guide mentions a > > key [-1] in dtype.fields that should store the names in order: that would > > be quite useful but it doesn't work (KeyError). > > Where do you see this mention? Page 116 of my version (once again, 07/17/2006) """ An ordered (by offset) list of field names is also stored in the fields dictionary under the key -1. This can be used to walk through all of the named fields in offset order. Notice that the defined fields do not have to "cover" the record, but the itemsize of the container data-type object must always be at least as large as the itemsizes of the data-type objects in the defined fields. """ From robert.kern at gmail.com Mon Jul 14 22:27:25 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 14 Jul 2008 21:27:25 -0500 Subject: [Numpy-discussion] Doc on dtypes In-Reply-To: <200807142215.46090.pgmdevlist@gmail.com> References: <200807141806.30592.pgmdevlist@gmail.com> <200807142041.43287.pgmdevlist@gmail.com> <3d375d730807141856h7bf36acck33b82fdeaba750b1@mail.gmail.com> <200807142215.46090.pgmdevlist@gmail.com> Message-ID: <3d375d730807141927v544df722m10e7923051e5125b@mail.gmail.com> On Mon, Jul 14, 2008 at 21:15, Pierre GM wrote: > On Monday 14 July 2008 21:56:47 Robert Kern wrote: >> > Oh, with dtype.names and dtype.fields I can work. The Guide mentions a >> > key [-1] in dtype.fields that should store the names in order: that would >> > be quite useful but it doesn't work (KeyError). >> >> Where do you see this mention? > > Page 116 of my version (once again, 07/17/2006) > """ > An ordered (by offset) list of field names is also stored in the fields > dictionary under the key -1. This can be used to walk through all of the > named fields in offset order. Notice that the defined fields do not have > to "cover" the record, but the itemsize of the container data-type object > must always be at least as large as the itemsizes of the data-type objects in > the defined fields. > """ I have a slightly newer version of the book. The majority of this text appears verbatim under the description of dtype.names, so I assume dtype.fields[-1] got removed in favor of dtype.names. As well it should. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco From pgmdevlist at gmail.com Mon Jul 14 22:36:05 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 14 Jul 2008 22:36:05 -0400 Subject: [Numpy-discussion] Doc on dtypes In-Reply-To: <3d375d730807141927v544df722m10e7923051e5125b@mail.gmail.com> References: <200807141806.30592.pgmdevlist@gmail.com> <200807142215.46090.pgmdevlist@gmail.com> <3d375d730807141927v544df722m10e7923051e5125b@mail.gmail.com> Message-ID: <200807142236.05707.pgmdevlist@gmail.com> On Monday 14 July 2008 22:27:25 Robert Kern wrote: > I have a slightly newer version of the book. The majority of this text > appears verbatim under the description of dtype.names, so I assume > dtype.fields[-1] got removed in favor of dtype.names. As well it > should. OK then, I should be set. Thanks again for pointing me in the right direction. From alan.mcintyre at gmail.com Mon Jul 14 23:00:22 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Mon, 14 Jul 2008 23:00:22 -0400 Subject: [Numpy-discussion] Run np.test() twice, get message. In-Reply-To: References: <1d36917a0807132259l1e13da83l6833de7b4d7e1fb3@mail.gmail.com> Message-ID: <1d36917a0807142000r79f76968y4d7a35feb0ea8b93@mail.gmail.com> On Mon, Jul 14, 2008 at 7:58 PM, Fernando Perez wrote: > It's actually coming from doctest. Hardcode > > import doctest > doctest.master = None > > in the code that runs the tests. This resets a module-global that's > the cause of that message. Thanks, Fernando, I added that and it prevents the message. This change is checked in. From alan.mcintyre at gmail.com Tue Jul 15 00:16:39 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Tue, 15 Jul 2008 00:16:39 -0400 Subject: [Numpy-discussion] Release 1.1.1 In-Reply-To: <200807142021.50352.pgmdevlist@gmail.com> References: <200807142021.50352.pgmdevlist@gmail.com> Message-ID: <1d36917a0807142116p599a6210wb099a1fb43d2f9c0@mail.gmail.com> On Mon, Jul 14, 2008 at 8:21 PM, Pierre GM wrote: > I did as much as I could to ensure compatibility with Python 2.3, but I can't > test it myself (can't install Python 2.3 on my machine). It'd be great if > somebody could check it works with that version, otherwise I'm all go (the > recent problems with mrecords and exotic flexible types are for 1.2). For what it's worth, I ran the full test suite of NumPy 1.1.1 with Python 2.3.7 (both from svn) on a Linux machine (Gentoo 2.6.24, gcc 4.1.2) and everything passed. From gael.varoquaux at normalesup.org Tue Jul 15 00:50:16 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 15 Jul 2008 06:50:16 +0200 Subject: [Numpy-discussion] Schedule for 1.1.1 In-Reply-To: References: Message-ID: <20080715045016.GA22431@phare.normalesup.org> On Sun, Jul 13, 2008 at 01:49:18AM -0700, Jarrod Millman wrote: > The NumPy 1.1.1 release date (7/31/08) is rapidly approaching and we > need everyone's help. Chuck Harris has volunteered to take the lead > on coordinating this release. Does anybody have an idea what the status is on #844?
( http://scipy.org/scipy/numpy/ticket/844 ) Cheers, Gaël From charlesr.harris at gmail.com Tue Jul 15 01:01:41 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 14 Jul 2008 23:01:41 -0600 Subject: [Numpy-discussion] Schedule for 1.1.1 In-Reply-To: <20080715045016.GA22431@phare.normalesup.org> References: <20080715045016.GA22431@phare.normalesup.org> Message-ID: On Mon, Jul 14, 2008 at 10:50 PM, Gael Varoquaux < gael.varoquaux at normalesup.org> wrote: > On Sun, Jul 13, 2008 at 01:49:18AM -0700, Jarrod Millman wrote: > > The NumPy 1.1.1 release date (7/31/08) is rapidly approaching and we > > need everyone's help. Chuck Harris has volunteered to take the lead > > on coordinating this release. > > Anybody has an idea what the status is on #844? ( > http://scipy.org/scipy/numpy/ticket/844 ) > I suspect it is a blas problem, it doesn't show up here. David? Chuck From stefan at sun.ac.za Tue Jul 15 03:24:38 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 15 Jul 2008 09:24:38 +0200 Subject: [Numpy-discussion] Release 1.1.1 In-Reply-To: References: Message-ID: <9457e7c80807150024v652d18dew82111dfe75ea54d5@mail.gmail.com> 2008/7/15 Charles R Harris : > Stefan, these are mostly documentation related. IIRC, you planned to update > the documentation in 1.1.1, which probably also needs ptvirtan's commit > above. What is the current status of this project? The plan was to include the documentation as part of 1.2. RC1 should be released on 4 August, IIRC, and I'll have it merged by then. > r5290 > r5294 > r5299 > r5372 These are fixes to the documentation standard, and can easily be back-ported. > r5293 # Add `ma` to __all__ > r5360 # Piecewise bug-fix I shall also include these bug-fixes. > r5371 This introduces the numpy.doc framework, and can be postponed until 1.2. Regards Stéfan From michael at araneidae.co.uk Tue Jul 15 03:42:08 2008 From: michael at araneidae.co.uk (Michael Abbott) Date: Tue, 15 Jul 2008 07:42:08 +0000 (GMT) Subject: [Numpy-discussion] Ticket review: #843 Message-ID: <20080715073530.B81915@saturn.araneidae.co.uk> I'm reviewing my tickets (seems a good thing to do with a release imminent), and I'll post up each ticket that merits comment as a separate message. Ticket #843 has gone into trunk (commit 5361, oliphant) ... but your editor appears to be introducing hard tabs! Hard tab characters are fortunately relatively rare in numpy source, but my patch has gone in with tabs I didn't use. /me wanders off muttering darkly about the evils of tab characters... From pav at iki.fi Tue Jul 15 03:47:00 2008 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 15 Jul 2008 07:47:00 +0000 (UTC) Subject: [Numpy-discussion] Release 1.1.1 References: Message-ID: Mon, 14 Jul 2008 18:05:27 -0600, Charles R Harris wrote: > All, > > The rc release of numpy-1.1.1 is due out next Sunday. I have gone > through the commits made to the trunk since the 1.1.x branch to pull out > backport candidates. If you find your name here could you make the > backport or say why you think it inappropriate. > > David, I know that these are mostly build fixes and you have backported > many of them, but I don't know the current build state of 1.1.x [clip] > ptvirtan > r5261 This is needed when we start merging in changes from the Doc marathon, but not otherwise. So it's probably not crucial for 1.1.1.
Pauli From michael at araneidae.co.uk Tue Jul 15 03:48:38 2008 From: michael at araneidae.co.uk (Michael Abbott) Date: Tue, 15 Jul 2008 07:48:38 +0000 (GMT) Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType Message-ID: <20080715074217.R81915@saturn.araneidae.co.uk> Only half of my patch for this bug has gone into trunk, and without the rest of my patch there remains a leak. Furthermore, it remains necessary to perform an extra INCREF on typecode before calling PyArray_FromAny ... as otherwise there is the real possibility that typecode will have evaporated by the time it's passed to scalar_value later down in the routine. I attach the patch I believe remains necessary against this routine: --- numpy/core/src/scalartypes.inc.src (revision 5411) +++ numpy/core/src/scalartypes.inc.src (working copy) @@ -1925,19 +1925,30 @@ goto finish; } + Py_XINCREF(typecode); arr = PyArray_FromAny(obj, typecode, 0, 0, FORCECAST, NULL); - if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) return arr; + if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) { + Py_XDECREF(typecode); + return arr; + } robj = PyArray_Return((PyArrayObject *)arr); finish: - if ((robj==NULL) || (robj->ob_type == type)) return robj; + if ((robj==NULL) || (robj->ob_type == type)) { + Py_XDECREF(typecode); + return robj; + } /* Need to allocate new type and copy data-area over */ if (type->tp_itemsize) { itemsize = PyString_GET_SIZE(robj); } else itemsize = 0; obj = type->tp_alloc(type, itemsize); - if (obj == NULL) {Py_DECREF(robj); return NULL;} + if (obj == NULL) { + Py_XDECREF(typecode); + Py_DECREF(robj); + return NULL; + } if (typecode==NULL) typecode = PyArray_DescrFromType(PyArray_@TYPE@); dest = scalar_value(obj, typecode); I can see that there might be an argument that PyArray_FromAny has the semantics that it retains a reference to typecode unless it returns NULL ... but I don't want to go there. That would not be a good thing to rely on -- and even with those semantics the existing code still needs fixing. From michael at araneidae.co.uk Tue Jul 15 03:50:38 2008 From: michael at araneidae.co.uk (Michael Abbott) Date: Tue, 15 Jul 2008 07:50:38 +0000 (GMT) Subject: [Numpy-discussion] Ticket review #850: leak in _strings_richcompare Message-ID: <20080715074920.J81915@saturn.araneidae.co.uk> This one is easy, ought to go in. Fixes a (not particularly likely) memory leak. From michael at araneidae.co.uk Tue Jul 15 03:53:58 2008 From: michael at araneidae.co.uk (Michael Abbott) Date: Tue, 15 Jul 2008 07:53:58 +0000 (GMT) Subject: [Numpy-discussion] Ticket review #849: reference to deallocated object? Message-ID: <20080715075253.O81915@saturn.araneidae.co.uk> Tenuous but easy fix, and conformant to style elsewhere. From robert.kern at gmail.com Tue Jul 15 04:12:06 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 15 Jul 2008 03:12:06 -0500 Subject: [Numpy-discussion] Release 1.1.1 In-Reply-To: References: Message-ID: <3d375d730807150112r1b766aa9n1d77642aa82600e7@mail.gmail.com> On Mon, Jul 14, 2008 at 19:05, Charles R Harris wrote: > rkern > r5296 > r5297 > r5342 > r5349 > r5357 Done. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco From cournape at gmail.com Tue Jul 15 04:15:01 2008 From: cournape at gmail.com (David Cournapeau) Date: Tue, 15 Jul 2008 10:15:01 +0200 Subject: [Numpy-discussion] Release 1.1.1 In-Reply-To: References: Message-ID: <5b8d13220807150115g575cbb30x73e54f36130f5291@mail.gmail.com> On Tue, Jul 15, 2008 at 2:05 AM, Charles R Harris wrote: > All, > > The rc release of numpy-1.1.1 is due out next Sunday. I have gone through > the commits made to the trunk since the 1.1.x branch to pull out backport > candidates. If you find your name here could you make the backport or say > why you think it inappropriate. > > David, I know that these are mostly build fixes and you have backported many > of them, but I don't know the current build state of 1.1.x Those are only build fixes related to numscons. Most code does not touch at all the normal build code path, and I tested the ones which may affect the normal build. With those backports, numpy 1.1.x should build with the most recent numscons (0.8.2: the version is tested). cheers, David From faltet at pytables.org Tue Jul 15 06:49:03 2008 From: faltet at pytables.org (Francesc Alted) Date: Tue, 15 Jul 2008 12:49:03 +0200 Subject: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy In-Reply-To: <200807141529.20467.pgmdevlist@gmail.com> References: <200807111559.03986.falted@pytables.org> <200807142112.19226.faltet@pytables.org> <200807141529.20467.pgmdevlist@gmail.com> Message-ID: <200807151249.03487.faltet@pytables.org> A Monday 14 July 2008, Pierre GM escrigué: > On Monday 14 July 2008 15:12:18 Francesc Alted wrote: > > I see. However, the more I think about this, the more I see the > > need to split the date/time functionalities and duties in two > > parts: > > > > * the first one implementing a date/time dtype with the basic > > functionality for timestamping and/or time-interval measuring. > > That would be our Date class That's correct. Having this functionality in a native dtype will allow for much more flexibility, though. > > * the second part would be a specific array container of date/time > > types (which maybe perfectly a porting of the DateArray of the > > scikits.timeseries that would be based on the date/time type) where > > one can implement all of the functionality (like the one that you > > are proposing above) that escapes to a humble date/time dtype. > > That would be our DateArray class indeed... Yes. In fact, I'd like to see the ``DateArray`` using the new date/time dtypes as soon as possible after their eventual introduction in NumPy. > Francesc, Chris, may I suggest you to try TimeSeries if you didn't > already ? That way you could see what kind of features are missing > and which ones should be improved with the new dtype ? Will do, don't worry. The wheel has been invented too many times already ;-) Thanks a lot for being so persistent about the TimeSeries!
;-) -- Francesc Alted From faltet at pytables.org Tue Jul 15 07:30:09 2008 From: faltet at pytables.org (Francesc Alted) Date: Tue, 15 Jul 2008 13:30:09 +0200 Subject: [Numpy-discussion] NumPy date/time types and the resolution concept In-Reply-To: <200807141435.21068.pgmdevlist@gmail.com> References: <200807141507.47484.faltet@pytables.org> <200807142017.18659.faltet@pytables.org> <200807141435.21068.pgmdevlist@gmail.com> Message-ID: <200807151330.10394.faltet@pytables.org> A Monday 14 July 2008, Pierre GM escrigué: > On Monday 14 July 2008 14:17:18 Francesc Alted wrote: > > Well, what we are after is precisely this: a new dtype type. After > > integrating it in NumPy, I suppose that your DateArray would be > > similar than a NumPy array with a dtype ``datetime64`` (bar the > > conceptual differences between your 'frequency' behind DateArray > > and > > the 'resolution' behind the datetime64 dtype). > > Well, you're losing me on this one: could you explain the difference > between the two concepts ? It might only be a problem of > vocabulary... Maybe is only that. But by using the term 'frequency' I tend to think that you are expecting to have one entry (observation) in your array for each time 'tick' since time start. OTOH, the term 'resolution' doesn't have this implication, and only states the precision of the timestamp. I don't know whether my impression is true or not, but after reading about your TimeSeries package, I'm still thinking that this expectation of one observation per 'tick' was what driven you to choose the 'frequency' name. > > It would start when the origin tells that it should start. It is > > important to note that our proposal will not force a '7d' (seven > > days) 'tick' to start on monday, or a '1m' (one month) to start the > > 1st day of a calendar month, but rather where the user decides to > > set its origin. > > OK, so we need 2 flags, one for the resolution, one for the origin. > Because there won't be that many resolution possible, an int8 should > be sufficient. What do you have in mind for the origin ? When using a > resolution coarser than 1d (7d, 1m, 3m, 1a), an origin in day is OK. > What about less than a day ? Well, after reading the mails from Chris and Anne, I think the best is that the origin would be kept as an int64 with a resolution of microseconds (for compatibility with the ``datetime`` module, as I've said before). Cheers, -- Francesc Alted From faltet at pytables.org Tue Jul 15 07:40:54 2008 From: faltet at pytables.org (Francesc Alted) Date: Tue, 15 Jul 2008 13:40:54 +0200 Subject: [Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy In-Reply-To: <200807141529.20467.pgmdevlist@gmail.com> References: <200807111559.03986.falted@pytables.org> <200807142112.19226.faltet@pytables.org> <200807141529.20467.pgmdevlist@gmail.com> Message-ID: <200807151340.55084.faltet@pytables.org> A Monday 14 July 2008, Pierre GM escrigué: > Francesc, Chris, may I suggest you to try TimeSeries if you didn't > already ? That way you could see what kind of features are missing > and which ones should be improved with the new dtype ? I'm having a look at your package, and I see that you have included pieces of code from the mx.DateTime, and that this code is subjected to the egenix public license version 1.0.0. After reading it, it seems an OpenSource license (it is based on the CNRI Python license). Anybody knows if there could be a problem in integrating this code (or part of it) in NumPy itself?
Cheers, -- Francesc Alted From charlesr.harris at gmail.com Tue Jul 15 08:37:20 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Jul 2008 06:37:20 -0600 Subject: [Numpy-discussion] Release 1.1.1 In-Reply-To: <3d375d730807150112r1b766aa9n1d77642aa82600e7@mail.gmail.com> References: <3d375d730807150112r1b766aa9n1d77642aa82600e7@mail.gmail.com> Message-ID: On Tue, Jul 15, 2008 at 2:12 AM, Robert Kern wrote: > On Mon, Jul 14, 2008 at 19:05, Charles R Harris > wrote: > > rkern > > r5296 > > r5297 > > r5342 > > r5349 > > r5357 > > Done. > Thanks, Chuck From charlesr.harris at gmail.com Tue Jul 15 08:37:55 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Jul 2008 06:37:55 -0600 Subject: [Numpy-discussion] Release 1.1.1 In-Reply-To: <5b8d13220807150115g575cbb30x73e54f36130f5291@mail.gmail.com> References: <5b8d13220807150115g575cbb30x73e54f36130f5291@mail.gmail.com> Message-ID: On Tue, Jul 15, 2008 at 2:15 AM, David Cournapeau wrote: > On Tue, Jul 15, 2008 at 2:05 AM, Charles R Harris > wrote: > > All, > > > > The rc release of numpy-1.1.1 is due out next Sunday. I have gone through > > the commits made to the trunk since the 1.1.x branch to pull out backport > > candidates. If you find your name here could you make the backport or say > > why you think it inappropriate. > > > > David, I know that these are mostly build fixes and you have backported > many > > of them, but I don't know the current build state of 1.1.x > > Those are only build fixes related to numscons. Most code does not > touch at all the normal build code path, and I tested the ones which > may affect the normal build. > > With those backports, numpy 1.1.x should build with the most recent > numscons (0.8.2: the version is tested). > OK, I'll cross your stuff off the list. Chuck From charlesr.harris at gmail.com Tue Jul 15 08:39:18 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Jul 2008 06:39:18 -0600 Subject: [Numpy-discussion] Release 1.1.1 In-Reply-To: <200807142021.50352.pgmdevlist@gmail.com> References: <200807142021.50352.pgmdevlist@gmail.com> Message-ID: On Mon, Jul 14, 2008 at 6:21 PM, Pierre GM wrote: > > Pierre, I know you have been working diligently to get masked arrays up > to > > speed and have made numerous fixes in the 1.1.x branch. All the tests > pass > > for me. Is there more that needs to be done? > > Charles, > I did as much as I could to ensure compatibility with Python 2.3, but I > can't > test it myself (can't install Python 2.3 on my machine). It'd be great if > somebody could check it works with that version, otherwise I'm all go (the > recent pbs with mrecords and exotic flexible types are for 1.2). > _ > OK. With Alan's testing below I'll take your stuff off the list. Chuck
From charlesr.harris at gmail.com Tue Jul 15 08:39:57 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Jul 2008 06:39:57 -0600 Subject: [Numpy-discussion] Release 1.1.1 In-Reply-To: <1d36917a0807142116p599a6210wb099a1fb43d2f9c0@mail.gmail.com> References: <200807142021.50352.pgmdevlist@gmail.com> <1d36917a0807142116p599a6210wb099a1fb43d2f9c0@mail.gmail.com> Message-ID: On Mon, Jul 14, 2008 at 10:16 PM, Alan McIntyre wrote: > On Mon, Jul 14, 2008 at 8:21 PM, Pierre GM wrote: > > I did as much as I could to ensure compatibility with Python 2.3, but I > can't > > test it myself (can't install Python 2.3 on my machine). It'd be great if > > somebody could check it works with that version, otherwise I'm all go > (the > > recent pbs with mrecords and exotic flexible types are for 1.2). > > For what it's worth, I ran the full test suite of NumPy 1.1.1 with > Python 2.3.7 (both from svn) on a Linux machine (Gentoo 2.6.24, gcc > 4.1.2) and everything passed. > ___ > Thanks Chuck From charlesr.harris at gmail.com Tue Jul 15 08:42:17 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Jul 2008 06:42:17 -0600 Subject: [Numpy-discussion] Release 1.1.1 In-Reply-To: <9457e7c80807150024v652d18dew82111dfe75ea54d5@mail.gmail.com> References: <9457e7c80807150024v652d18dew82111dfe75ea54d5@mail.gmail.com> Message-ID: On Tue, Jul 15, 2008 at 1:24 AM, Stéfan van der Walt wrote: > 2008/7/15 Charles R Harris : > > Stefan, these are mostly documentation related. IIRC, you planned to > update > > the documentation in 1.1.1, which probably also needs ptvirtan's commit > > above. What is the current status of this project? > > The plan was to include the documentation as part of 1.2. RC1 should > be released on 4 August, IIRC, and I'll have it merged by then. > > > r5290 > > r5294 > > r5299 > > r5372 > > These are fixes to the documentation standard, and can easily be > back-ported. > > > r5293 # Add `ma` to __all__ > > r5360 # Piecewise bug-fix > > I shall also include these bug-fixes. > I see you have done that. > > r5371 > > This introduces the numpy.doc framework, and can be postponed until 1.2. > OK. Thanks, Chuck From charlesr.harris at gmail.com Tue Jul 15 08:46:41 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Jul 2008 06:46:41 -0600 Subject: [Numpy-discussion] Release 1.1.1 In-Reply-To: References: Message-ID: On Tue, Jul 15, 2008 at 1:47 AM, Pauli Virtanen wrote: > Mon, 14 Jul 2008 18:05:27 -0600, Charles R Harris wrote: > > All, > > > > The rc release of numpy-1.1.1 is due out next Sunday. I have gone > > through the commits made to the trunk since the 1.1.x branch to pull out > > backport candidates. If you find your name here could you make the > > backport or say why you think it inappropriate. > > > > David, I know that these are mostly build fixes and you have backported > > many of them, but I don't know the current build state of 1.1.x > [clip] > > ptvirtan > > r5261 > > This is needed when we start merging in changes from the Doc marathon, > but not otherwise. So it's probably not crucial for 1.1.1. > OK, Chuck
From bsouthey at gmail.com Tue Jul 15 09:33:03 2008 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 15 Jul 2008 08:33:03 -0500 Subject: [Numpy-discussion] Infinity definitions In-Reply-To: <47FE55DF.3040201@enthought.com> References: <47FE2460.6050107@gmail.com> <47FE55DF.3040201@enthought.com> Message-ID: <487CA70F.2000609@gmail.com> Hi, Following Travis's suggestion below, I would like to suggest that the following definitions be deprecated or removed in this forthcoming release: numpy.Inf numpy.Infinity numpy.infty numpy.PINF numpy.NAN numpy.NaN I am not sure about what would be best for numpy.NINF. Thanks Bruce Travis E. Oliphant wrote: > Bruce Southey wrote: > >> Hi, >> Since we are discussing namespace and standardization, I am curious in >> why there are multiple definitions for defining infinity in numpy when >> perhaps there should be two (one for positive infinity and one for >> negative infinity). I really do understand that other people have use of >> these definitions and that it is easier to leave them in than take them >> out. Also, it is minor reduction in namespace because I do know that >> much of the namespace is either defining variables (like different >> floats and complex numbers) or mathematical functions (like logs and >> trig functions). >> >> Currently we have: >> numpy.Inf >> numpy.Infinity >> numpy.inf >> numpy.infty >> numpy.NINF >> numpy.PINF >> >> Most of these are defined in numeric.py: 'Inf = inf = infty = Infinity = >> PINF' >> In the f2py/tests subdirectories, the files return_real.py and >> return_complex.py uses both 'inf','Infinity'. >> The only occurrence of NINF and PINF are in core/src/umathmodule.c but I >> don't see any other usage. >> There does not seem to be any use of 'infty'. >> >> > I think this is a product of bringing together a few definitions into > one and not forcing a standard. > > numpy.inf > numpy.nan > > should be used except for backward compatibility. > > -Travis > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From jdh2358 at gmail.com Tue Jul 15 09:59:00 2008 From: jdh2358 at gmail.com (John Hunter) Date: Tue, 15 Jul 2008 08:59:00 -0500 Subject: [Numpy-discussion] permissions on tests in numpy and scipy In-Reply-To: <3d375d730807141034xa79c581jfda42982e117811b@mail.gmail.com> References: <88e473830807141022h2ddc874ai4743e11421cc391@mail.gmail.com> <3d375d730807141034xa79c581jfda42982e117811b@mail.gmail.com> Message-ID: <88e473830807150659n2d4dc4c8vac1743357fbb2558@mail.gmail.com> On Mon, Jul 14, 2008 at 12:34 PM, Robert Kern wrote: > We're not doing anything special, here. When I install using "sudo > python install.py" on OS X, all of the permissions are 644. I think > the problem may be in your pipeline. With a little more testing, what I am finding is that when I do a fresh svn co at work (solaris x86) a lot of files (eg setup.py or the test*.py files) come down permissioned at 600 or 700. If I do the same checkout on a recent linux box, they come down as 644 or 755. I checked my umask and they are the same on both boxes.
So I am a bit stumped and it is clearly not a numpy problem, but I wanted to mention it here in case any unix guru has an idea (both of these are from clean svn checkouts) Solaris box (funky permissions): johnh@flag:~> svn --version svn, version 1.4.3 (r23084) compiled Jun 6 2007, 16:45:15 johnh@flag:~> uname -a SunOS flag 5.10 Generic_118855-15 i86pc i386 i86pc johnh@flag:~> umask 0002 johnh@flag:~> cd /export/home/johnh/tmp/numpy/ johnh@flag:numpy> ls -l setup.py -rwx------ 1 johnh research 3370 Jul 14 14:20 setup.py ############################################################## Linux box (expected permissions): jdhunter@bic128:~> svn --version svn, version 1.4.4 (r25188) compiled Sep 2 2007, 14:25:40 jdhunter@bic128:~> uname -a Linux bic128.bic.berkeley.edu 2.6.25.9-40.fc8 #1 SMP Fri Jun 27 16:05:49 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux jdhunter@bic128:numpy> umask 0002 jdhunter@bic128:~> cd /home/jdhunter/tmp/numpy/ jdhunter@bic128:numpy> ls -l setup.py -rwxrwxr-x 1 jdhunter jdhunter 3370 Jul 14 12:19 setup.py From charlesr.harris at gmail.com Tue Jul 15 10:57:57 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Jul 2008 08:57:57 -0600 Subject: [Numpy-discussion] Revised list of backport candidates for 1.1.1 Message-ID: After the first round of backports the following remain. charris r5259 r5312 r5322 r5324 r5392 r5394 r5399 r5406 r5407 dhuard r5254 fperez r5298 r5301 r5303 oliphant r5245 r5255 Chuck From charlesr.harris at gmail.com Tue Jul 15 10:59:24 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Jul 2008 08:59:24 -0600 Subject: [Numpy-discussion] Ticket review: #843 In-Reply-To: <20080715073530.B81915@saturn.araneidae.co.uk> References: <20080715073530.B81915@saturn.araneidae.co.uk> Message-ID: On Tue, Jul 15, 2008 at 1:42 AM, Michael Abbott wrote: > I'm reviewing my tickets (seems a good thing to do with a release > imminent), and I'll post up each ticket that merits comment as a separate > message. > > Ticket #843 has gone into trunk (commit 5361, oliphant) ... but your > editor appears to be introducing hard tabs! Hard tab characters are > fortunately relatively rare in numpy source, but my patch has gone in with > tabs I didn't use. > > /me wanders off muttering darkly about the evils of tab characters... > ___ Thanks for doing this. If Travis doesn't comment in the next day or two I'll look to making the changes. Chuck From michael at araneidae.co.uk Tue Jul 15 11:28:25 2008 From: michael at araneidae.co.uk (Michael Abbott) Date: Tue, 15 Jul 2008 15:28:25 +0000 (GMT) Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType In-Reply-To: <20080715074217.R81915@saturn.araneidae.co.uk> References: <20080715074217.R81915@saturn.araneidae.co.uk> Message-ID: <20080715150718.R97049@saturn.araneidae.co.uk> On Tue, 15 Jul 2008, Michael Abbott wrote: > Only half of my patch for this bug has gone into trunk, and without the > rest of my patch there remains a leak. I think I might need to explain a little more about the reason for this patch, because obviously the bug it fixes was missed the last time I posted on this bug.
So here is the missing part of the patch: > --- numpy/core/src/scalartypes.inc.src (revision 5411) > +++ numpy/core/src/scalartypes.inc.src (working copy) > @@ -1925,19 +1925,30 @@ > goto finish; > } > > + Py_XINCREF(typecode); > arr = PyArray_FromAny(obj, typecode, 0, 0, FORCECAST, NULL); > - if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) return arr; > + if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) { > + Py_XDECREF(typecode); > + return arr; > + } > robj = PyArray_Return((PyArrayObject *)arr); > > finish: > - if ((robj==NULL) || (robj->ob_type == type)) return robj; > + if ((robj==NULL) || (robj->ob_type == type)) { > + Py_XDECREF(typecode); > + return robj; > + } > /* Need to allocate new type and copy data-area over */ > if (type->tp_itemsize) { > itemsize = PyString_GET_SIZE(robj); > } > else itemsize = 0; > obj = type->tp_alloc(type, itemsize); > - if (obj == NULL) {Py_DECREF(robj); return NULL;} > + if (obj == NULL) { > + Py_XDECREF(typecode); > + Py_DECREF(robj); > + return NULL; > + } > if (typecode==NULL) > typecode = PyArray_DescrFromType(PyArray_@TYPE@); > dest = scalar_value(obj, typecode); On the face of it it might appear that all the DECREFs are cancelling out the first INCREF, but not so. Let's see two more lines of context: > src = scalar_value(robj, typecode); > Py_DECREF(typecode); Ahah. That DECREF balances the original PyArray_DescrFromType, or maybe the later call ... and of course this has to happen on *ALL* return paths. If we now take a closer look at the patch we can see that it's doing two separate things: 1. There's an extra Py_XINCREF to balance the ref count lost to PyArray_FromAny and ensure that typecode survives long enough; 2. Every early return path has an extra Py_XDECREF to balance the creation of typecode. I rest my case for this patch. From pgmdevlist at gmail.com Tue Jul 15 11:36:16 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 15 Jul 2008 11:36:16 -0400 Subject: [Numpy-discussion] NumPy date/time types and the resolution concept In-Reply-To: <200807151330.10394.faltet@pytables.org> References: <200807141507.47484.faltet@pytables.org> <200807141435.21068.pgmdevlist@gmail.com> <200807151330.10394.faltet@pytables.org> Message-ID: <200807151136.17061.pgmdevlist@gmail.com> On Tuesday 15 July 2008 07:30:09 Francesc Alted wrote: > Maybe is only that. But by using the term 'frequency' I tend to think > that you are expecting to have one entry (observation) in your array > for each time 'tick' since time start. OTOH, the term 'resolution' > doesn't have this implication, and only states the precision of the > timestamp. OK, now I get it. > I don't know whether my impression is true or not, but after reading > about your TimeSeries package, I'm still thinking that this expectation > of one observation per 'tick' was what driven you to choose > the 'frequency' name. Well, we do require a "one point per tick" for some operations, such as conversion from one frequency to another, but only for TimeSeries. A Date Array doesn't have to be regularly spaced.
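A compact illustration of the ownership rule behind Michael's patch: PyArray_FromAny() steals one reference to the descr it is given, so a caller that still needs the descr afterwards must take an extra reference first and release it on every exit path. The helper below is invented for illustration, a sketch rather than actual numpy source:

    #include <Python.h>
    #include <stdio.h>
    #include <numpy/arrayobject.h>

    /* Hypothetical helper: convert obj using descr, then consult descr
       again afterwards.  Because PyArray_FromAny steals a reference to
       descr, we take our own reference up front and drop it on every
       path once we are finished with it. */
    static PyObject *
    convert_keeping_descr(PyObject *obj, PyArray_Descr *descr)
    {
        PyObject *arr;

        Py_INCREF(descr);              /* extra ref: survives FromAny */
        arr = PyArray_FromAny(obj, descr, 0, 0, FORCECAST, NULL);
        if (arr == NULL) {
            Py_DECREF(descr);          /* balance the extra INCREF */
            return NULL;
        }
        printf("itemsize: %d\n", (int)descr->elsize);  /* still safe */
        Py_DECREF(descr);              /* release it on the way out */
        return arr;
    }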
From peridot.faceted at gmail.com Tue Jul 15 11:54:40 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Tue, 15 Jul 2008 11:54:40 -0400 Subject: [Numpy-discussion] NumPy date/time types and the resolution concept In-Reply-To: <200807151330.10394.faltet@pytables.org> References: <200807141507.47484.faltet@pytables.org> <200807142017.18659.faltet@pytables.org> <200807141435.21068.pgmdevlist@gmail.com> <200807151330.10394.faltet@pytables.org> Message-ID: 2008/7/15 Francesc Alted : > Maybe is only that. But by using the term 'frequency' I tend to think > that you are expecting to have one entry (observation) in your array > for each time 'tick' since time start. OTOH, the term 'resolution' > doesn't have this implication, and only states the precision of the > timestamp. > Well, after reading the mails from Chris and Anne, I think the best is > that the origin would be kept as an int64 with a resolution of > microseconds (for compatibility with the ``datetime`` module, as I've > said before). A couple of details worth pointing out: we don't need a zillion resolutions. One that's as good as the world time standards, and one that spans an adequate length of time should cover it. After all, the only reason for not using the highest available resolution is if you want to cover a larger range of times. So there is no real need for microseconds and milliseconds and seconds and days and weeks and... There is also no need for the origin to be kept with a resolution as high as microseconds; seconds would do just fine, since if necessary it can be interpreted as "exactly 7000 seconds after the epoch" even if you are using femtoseconds elsewhere. Anne From david.huard at gmail.com Tue Jul 15 12:26:32 2008 From: david.huard at gmail.com (David Huard) Date: Tue, 15 Jul 2008 12:26:32 -0400 Subject: [Numpy-discussion] Revised list of backport candidates for 1.1.1 In-Reply-To: References: Message-ID: <91cf711d0807150926j49efa824q6380c98f6506f70f@mail.gmail.com> The revision number for the backport of 5254 is 5419. David 2008/7/15 Charles R Harris : > After the first round of backports the following remain. > > charris > r5259 > r5312 > r5322 > r5324 > r5392 > r5394 > r5399 > r5406 > r5407 > > dhuard > r5254 > > fperez > r5298 > r5301 > r5303 > > oliphant > r5245 > r5255 > > Chuck > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From charlesr.harris at gmail.com Tue Jul 15 12:39:41 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Jul 2008 10:39:41 -0600 Subject: [Numpy-discussion] Revised list of backport candidates for 1.1.1 In-Reply-To: <91cf711d0807150926j49efa824q6380c98f6506f70f@mail.gmail.com> References: <91cf711d0807150926j49efa824q6380c98f6506f70f@mail.gmail.com> Message-ID: On Tue, Jul 15, 2008 at 10:26 AM, David Huard wrote: > The revision number for the backport of 5254 is 5419. > Great, thanks. Chuck
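As a footnote to Anne's point above about origins: a stdlib-only sketch of why an origin stored in whole seconds composes exactly with int64 ticks kept at a finer resolution. The names and numbers here are ours, purely for illustration:

    from datetime import datetime, timedelta

    EPOCH = datetime(1970, 1, 1)
    origin_s = 7000           # "exactly 7000 seconds after the epoch"
    ticks_us = 123456789      # int64 count of microseconds past the origin

    absolute = (EPOCH + timedelta(seconds=origin_s)
                      + timedelta(microseconds=ticks_us))
    print absolute            # 1970-01-01 01:58:43.456789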
From doutriaux1 at llnl.gov Tue Jul 15 13:10:23 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Tue, 15 Jul 2008 10:10:23 -0700 Subject: [Numpy-discussion] quick question about numpy deprecation warnings Message-ID: <487CD9FF.4000501@llnl.gov> Hi, I have a quick question and I hope somebody can answer me (I admit I should first really check the numpy doc). I have been porting old Numeric based C code to numpy/numpy.ma for the last couple weeks. Just as I thought I was done, this morning I updated the numpy trunk and I now get the following deprecation warnings DeprecationWarning: PyArray_FromDims: use PyArray_SimpleNew. DeprecationWarning: PyArray_FromDimsAndDataAndDescr: use PyArray_NewFromDescr. My quick question is: can I simply replace the function names or is the arg list different? Thanks, Charles. From charlesr.harris at gmail.com Tue Jul 15 14:06:40 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Jul 2008 12:06:40 -0600 Subject: [Numpy-discussion] quick question about numpy deprecation warnings In-Reply-To: <487CD9FF.4000501@llnl.gov> References: <487CD9FF.4000501@llnl.gov> Message-ID: On Tue, Jul 15, 2008 at 11:10 AM, Charles Doutriaux wrote: > Hi, I have a quick question and I hope somebody can answer me (I admit I > should first really check the numpy doc) > > I have been porting old Numeric based C code to numpy/numpy.ma for the > last couple weeks. > > Just as I thought I was done, this morning I updated the numpy trunk and I > now get the following deprecation warnings > > DeprecationWarning: PyArray_FromDims: use PyArray_SimpleNew. > DeprecationWarning: PyArray_FromDimsAndDataAndDescr: use > PyArray_NewFromDescr. > > My quick question is: > can I simply replace the function names or is the arg list different? > Not quite. They are actually macros. ndarrayobject.h:#define PyArray_SimpleNew(nd, dims, typenum) \ ndarrayobject.h- PyArray_New(&PyArray_Type, nd, dims, typenum, NULL, NULL, 0, 0, NULL) ndarrayobject.h- ndarrayobject.h:#define PyArray_SimpleNewFromData(nd, dims, typenum, data) \ ndarrayobject.h- PyArray_New(&PyArray_Type, nd, dims, typenum, NULL, \ ndarrayobject.h- data, 0, NPY_CARRAY, NULL) ndarrayobject.h- ndarrayobject.h:#define PyArray_SimpleNewFromDescr(nd, dims, descr) \ ndarrayobject.h- PyArray_NewFromDescr(&PyArray_Type, descr, nd, dims, \ ndarrayobject.h- NULL, NULL, 0, NULL) The main change is that dims needs to point to npy_intp instead of int so that 64 bit architectures can use large arrays. I'm not sure the recommended replacements are the best, I just copied them from comments in the source code. Travis could perhaps clarify things a bit for us. Chuck
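A minimal before/after sketch of the migration Chuck describes; the function itself is hypothetical, and the key change is that the dimensions array becomes npy_intp so that 64-bit platforms can address large arrays:

    #include <Python.h>
    #include <numpy/arrayobject.h>

    /* Old Numeric-style call, now deprecated:
           int dims[1] = {n};
           return PyArray_FromDims(1, dims, PyArray_DOUBLE);
       The replacement uses the macro quoted above: */
    static PyObject *
    make_vector(npy_intp n)
    {
        npy_intp dims[1];
        dims[0] = n;                   /* npy_intp, not int */
        return PyArray_SimpleNew(1, dims, NPY_DOUBLE);
    }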
From millman at berkeley.edu Tue Jul 15 15:01:21 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 15 Jul 2008 12:01:21 -0700 Subject: [Numpy-discussion] Infinity definitions In-Reply-To: <487CA70F.2000609@gmail.com> References: <47FE2460.6050107@gmail.com> <47FE55DF.3040201@enthought.com> <487CA70F.2000609@gmail.com> Message-ID: On Tue, Jul 15, 2008 at 6:33 AM, Bruce Southey wrote: > Following Travis's suggestion below, I would like to suggest that the > following definitions be deprecated or removed in this forthcoming release: > > numpy.Inf > numpy.Infinity > numpy.infty > numpy.PINF > numpy.NAN > numpy.NaN +1 on deprecating in the 1.2 release -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From Chris.Barker at noaa.gov Tue Jul 15 15:47:34 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 15 Jul 2008 12:47:34 -0700 Subject: [Numpy-discussion] Infinity definitions In-Reply-To: References: <47FE2460.6050107@gmail.com> <47FE55DF.3040201@enthought.com> <487CA70F.2000609@gmail.com> Message-ID: <487CFED6.3050809@noaa.gov> Jarrod Millman wrote: > On Tue, Jul 15, 2008 at 6:33 AM, Bruce Southey wrote: >> Following Travis's suggestion below, I would like to suggest that the >> following definitions be deprecated or removed in this forthcoming release: >> >> numpy.Inf >> numpy.Infinity >> numpy.infty >> numpy.PINF >> numpy.NAN >> numpy.NaN Just to be clear, is the idea to remove the duplicate names, which will leave us with: numpy.nan numpy.inf What about: numpy.NINF should that be: numpy.ninf And is there a need for: numpy.pinf (or is that just redundant with numpy.inf) anyway, +1 for removing redundancies... -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From robert.kern at gmail.com Tue Jul 15 16:23:47 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 15 Jul 2008 15:23:47 -0500 Subject: [Numpy-discussion] Ticket review: #843 In-Reply-To: References: <20080715073530.B81915@saturn.araneidae.co.uk> Message-ID: <3d375d730807151323y58c203eayc3ea0e2024da2fbd@mail.gmail.com> On Tue, Jul 15, 2008 at 09:59, Charles R Harris wrote: > > On Tue, Jul 15, 2008 at 1:42 AM, Michael Abbott > wrote: >> >> I'm reviewing my tickets (seems a good thing to do with a release >> imminent), and I'll post up each ticket that merits comment as a separate >> message. >> >> Ticket #843 has gone into trunk (commit 5361, oliphant) ... but your >> editor appears to be introducing hard tabs! Hard tab characters are >> fortunately relatively rare in numpy source, but my patch has gone in with >> tabs I didn't use. >> >> /me wanders off muttering darkly about the evils of tab characters... >> ___ > > Thanks for doing this. If Travis doesn't comment in the next day or two I'll > look to making the changes. He's on vacation, so do what you think is best. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco From bsouthey at gmail.com Tue Jul 15 17:23:20 2008 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 15 Jul 2008 16:23:20 -0500 Subject: [Numpy-discussion] Infinity definitions In-Reply-To: <487CFED6.3050809@noaa.gov> References: <47FE2460.6050107@gmail.com> <47FE55DF.3040201@enthought.com> <487CA70F.2000609@gmail.com> <487CFED6.3050809@noaa.gov> Message-ID: <487D1548.9000008@gmail.com> Christopher Barker wrote: > Jarrod Millman wrote: > >> On Tue, Jul 15, 2008 at 6:33 AM, Bruce Southey wrote: >> >>> Following Travis's suggestion below, I would like to suggest that the >>> following definitions be deprecated or removed in this forthcoming release: >>> >>> numpy.Inf >>> numpy.Infinity >>> numpy.infty >>> numpy.PINF >>> numpy.NAN >>> numpy.NaN >>> > > Just to be clear, is the idea to remove the duplicate names, which will > leave us with: > > numpy.nan > numpy.inf > > Yes, that is my goal. > What about: > > numpy.NINF > > should that be: > > numpy.ninf > And is there a need for: > > numpy.pinf > > (or is that just redundant with numpy.inf) > In numeric.py these are defined as equal: 'Inf = inf = infty = Infinity = PINF' So, yes, these are redundant with using only numpy.inf. > anyway, +1 for removing redundancies... > > -Chris > > > Bruce From rowen at cesmail.net Tue Jul 15 17:21:59 2008 From: rowen at cesmail.net (Russell E. Owen) Date: Tue, 15 Jul 2008 14:21:59 -0700 Subject: [Numpy-discussion] Recommendations for using numpy ma? Message-ID: I have some code that does this: # an extra array cast is used because "compressed" returns what *looks* like an array # but is actually something else (I'm not sure exactly what) unmaskedArr = numpy.array( numpy.core.ma.array( dataArr, mask = mask & self.stretchExcludeBits, dtype = float, ).compressed()) That was working fine in numpy 1.0.4 but I've just gotten a report that it fails in 1.1. So...is there a notation that is safer: compatible with the widest possible range of versions? If I replace "numpy.core.ma" with "numpy.ma" this seems to work in 1.0.4 (I'm not sure about 1.1). But I fear it might not work with older versions of numpy. This software is used by a wide range of users with a wide range of versions of numpy. -- Russell From pgmdevlist at gmail.com Tue Jul 15 17:55:31 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 15 Jul 2008 17:55:31 -0400 Subject: [Numpy-discussion] Recommendations for using numpy ma? In-Reply-To: References: Message-ID: <200807151755.32256.pgmdevlist@gmail.com> Russell, What used to be numpy.core.ma is now numpy.oldnumeric.ma, but this latter is no longer supported and will disappear soon as well. Just use numpy.ma If you really need support to ancient versions of numpy, just check the import try: import numpy.core.ma as ma except ImportError: import numpy as ma Then, you need to replace every mention of numpy.core.ma in your code by ma. Your example would then become: unmaskedArr = numpy.array( ma.array( ^^ dataArr, mask = mask & self.stretchExcludeBits, dtype = float, ).compressed()) On another note: what's the problem with 'compressed'? It should return a ndarray, why/how doesn't it work ?
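A self-contained sketch of the version-tolerant import (spelled with the modern numpy.ma tried first, which is where the follow-up messages below end up), with hypothetical data standing in for Russell's arrays:

    try:
        import numpy.ma as ma          # numpy >= 1.1
    except ImportError:
        import numpy.core.ma as ma     # older numpy

    import numpy

    dataArr = numpy.arange(10, dtype=float)
    mask = numpy.zeros(10, dtype=bool)
    mask[3] = True

    # Masked entries are dropped; on recent numpy the result is a
    # plain ndarray ready for ordinary array operations.
    unmaskedArr = numpy.array(ma.array(dataArr, mask=mask).compressed())
    print unmaskedArr.shape            # (9,)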
From mforbes at physics.ubc.ca Tue Jul 15 21:55:13 2008 From: mforbes at physics.ubc.ca (Michael McNeil Forbes) Date: Tue, 15 Jul 2008 18:55:13 -0700 Subject: [Numpy-discussion] Infinity definitions In-Reply-To: <487CA70F.2000609@gmail.com> References: <47FE2460.6050107@gmail.com> <47FE55DF.3040201@enthought.com> <487CA70F.2000609@gmail.com> Message-ID: <1DBAA5FC-4DA1-4956-9088-57BC3A7FE8D8@physics.ubc.ca> On 15 Jul 2008, at 6:33 AM, Bruce Southey wrote: > Hi, > Following Travis's suggestion below, I would like to suggest that the > following definitions be deprecated or removed in this forthcoming > release: > > numpy.Inf > numpy.Infinity > numpy.infty > numpy.PINF > numpy.NAN > numpy.NaN ... While this is being discussed, what about the "representation" of nan's and infinities produced by repr in various forms? It is particularly annoying when the repr of simple numeric types cannot be evaluated. These include: 'infj' == repr(complex(0,inf)) 'nanj' == repr(complex(0,nan)) 'infnanj' == repr(complex(inf,nan)) 'nannanj' == repr(complex(nan,nan)) It seems that perhaps infj and nanj should be defined symbols. I am not sure why a + does not get inserted before nan. In addition, the default infstr and nanstr printing options should also be changed to 'inf' and 'nan'. Michael. From mailanhilli at googlemail.com Tue Jul 15 22:59:18 2008 From: mailanhilli at googlemail.com (Matthias Hillenbrand) Date: Tue, 15 Jul 2008 21:59:18 -0500 Subject: [Numpy-discussion] Calculating roots with negative numbers Message-ID: <67b3a51f0807151959u4c100bccu31d56d5fb6c18d2c@mail.gmail.com> Hello, I want to calculate the root of a numpy array with negative numbers. Here is an example: x = linspace(-10,10,100) h = zeros(100) h[where(abs(x) < 2)] = sqrt( abs(x) -2 ) h[where(2 <= abs(x))] = 1j * sqrt( 2 - abs(x) ) Unfortunately I get the following error: Warning: invalid value encountered in sqrt and h contains some NaN-values How can I make a correct case differentiation in combination with roots that might contain negative values? Thank you very much, Matthias From robert.kern at gmail.com Tue Jul 15 23:25:34 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 15 Jul 2008 22:25:34 -0500 Subject: [Numpy-discussion] Calculating roots with negative numbers In-Reply-To: <67b3a51f0807151959u4c100bccu31d56d5fb6c18d2c@mail.gmail.com> References: <67b3a51f0807151959u4c100bccu31d56d5fb6c18d2c@mail.gmail.com> Message-ID: <3d375d730807152025n3877dd0fxea4fa6d2188d5376@mail.gmail.com> On Tue, Jul 15, 2008 at 21:59, Matthias Hillenbrand wrote: > Hello, > > I want to calculate the root of a numpy array with negative numbers. > Here is an example: > > x = linspace(-10,10,100) > h = zeros(100) > > h[where(abs(x) < 2)] = sqrt( abs(x) -2 ) > h[where(2 <= abs(x))] = 1j * sqrt( 2 - abs(x) ) > > Unfortunately I get the following error: Warning: invalid value > encountered in sqrt > and h contains some NaN-values This scheme would work if you masked the right-hand-sides, too.
In [14]: from numpy import * In [15]: x = linspace(-4, 4, 21) In [16]: h = empty(21, dtype=complex) In [17]: m = abs(x) > 2 In [18]: h[m] = sqrt(abs(x[m]) - 2) In [19]: h[~m] = 1j*sqrt(2 - abs(x[~m])) In [20]: h Out[20]: array([ 1.41421356+0.j , 1.26491106+0.j , 1.09544512+0.j , 0.89442719+0.j , 0.63245553+0.j , 0.00000000+0.j , 0.00000000+0.63245553j, 0.00000000+0.89442719j, 0.00000000+1.09544512j, 0.00000000+1.26491106j, 0.00000000+1.41421356j, 0.00000000+1.26491106j, 0.00000000+1.09544512j, 0.00000000+0.89442719j, 0.00000000+0.63245553j, 0.00000000+0.j , 0.63245553+0.j , 0.89442719+0.j , 1.09544512+0.j , 1.26491106+0.j , 1.41421356+0.j ]) > How can I make a correct case differentiation in combination with > roots that might contain negative values? The versions of functions in scimath will extend the domain to include negatives and return the complex result. You don't need to do any testing yourself. In [21]: from numpy.lib import scimath In [22]: scimath.sqrt(abs(x) - 2) Out[22]: array([ 1.41421356+0.j , 1.26491106+0.j , 1.09544512+0.j , 0.89442719+0.j , 0.63245553+0.j , 0.00000000+0.j , 0.00000000+0.63245553j, 0.00000000+0.89442719j, 0.00000000+1.09544512j, 0.00000000+1.26491106j, 0.00000000+1.41421356j, 0.00000000+1.26491106j, 0.00000000+1.09544512j, 0.00000000+0.89442719j, 0.00000000+0.63245553j, 0.00000000+0.j , 0.63245553+0.j , 0.89442719+0.j , 1.09544512+0.j , 1.26491106+0.j , 1.41421356+0.j ]) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From mailanhilli at googlemail.com Tue Jul 15 23:28:54 2008 From: mailanhilli at googlemail.com (Matthias Hillenbrand) Date: Tue, 15 Jul 2008 22:28:54 -0500 Subject: [Numpy-discussion] Calculating roots with negative numbers In-Reply-To: <3d375d730807152025n3877dd0fxea4fa6d2188d5376@mail.gmail.com> References: <67b3a51f0807151959u4c100bccu31d56d5fb6c18d2c@mail.gmail.com> <3d375d730807152025n3877dd0fxea4fa6d2188d5376@mail.gmail.com> Message-ID: <67b3a51f0807152028x3a691b36n2fe8fd5c5a0437db@mail.gmail.com> Thanks Robert, that is exactly what I was looking for. Matthias From rng7 at cornell.edu Tue Jul 15 23:39:53 2008 From: rng7 at cornell.edu (Ryan Gutenkunst) Date: Tue, 15 Jul 2008 23:39:53 -0400 Subject: [Numpy-discussion] It appears that f2py fails to pass --compiler information to distutils on Windows XP Message-ID: <943729F1-C9DB-4947-82B9-96C9497214C4@cornell.edu> Hi all, A project of mine uses f2py internally to dynamically build extension modules, and we've recently had problems with it on Windows XP using Python 2.5.2. I think I've isolated the problem to f2py passing the --compiler option to distutils. It appears not to be doing so. If I run "python c:\python25\Scripts\f2py.py -c --compiler=mingw32" I get the message: running build running scons No module named msvccompiler in numpy.distutils; trying from distutils error: Python was built with Visual Studio 2003; extensions must be built with a compiler that can generate compatible binaries. Visual Studio 2003 was not found on this system. If you have Cygwin installed, you can try compiling with MingW32, by passing "-c mingw32" to setup.py. It appears that the mingw32 information isn't getting passed to distutils. I do indeed have a good installation of mingw32, as I can successfully build a related project that uses both C and Fortran extension modules via "python setup.py build -c mingw32".
If I just run "python c:\python25\Scripts\f2py.py", the help screen ends with: Version: 2_5239 numpy Version: 1.1.0 Our Windows testing is very sporadic. I'm quite sure this worked a year ago, but I'm not sure when it broke. Does anyone have any suggestions? Is this an actual bug, or am I using something incorrectly? Thanks, Ryan -- Ryan Gutenkunst Biological Statistics and Computational Biology Cornell University http://pages.physics.cornell.edu/~rgutenkunst/ From charlesr.harris at gmail.com Tue Jul 15 23:41:14 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Jul 2008 21:41:14 -0600 Subject: [Numpy-discussion] Second revised list of backports for 1.1.1. Message-ID: After the second pass only a few remain. Fernando, if you don't get to these I'll do them tomorrow. fperez r5298 r5301 r5303 Thanks to everyone for your prompt response. Chuck From fperez.net at gmail.com Wed Jul 16 01:31:41 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 15 Jul 2008 22:31:41 -0700 Subject: [Numpy-discussion] Second revised list of backports for 1.1.1. In-Reply-To: References: Message-ID: Hi Chuck, On Tue, Jul 15, 2008 at 8:41 PM, Charles R Harris wrote: > After the second pass only a few remain. Fernando, if you don't get to these > I'll do them tomorrow. > > fperez > r5298 > r5301 > r5303 Sorry, I hadn't seen this. If you can do it that would be great: I'm at a workshop right now with only a couple of hours in the night for any coding, and trying to push an ipython release for scipy. But if you can't do it tomorrow let me know and I'll make a space for it later in the week. Cheers, f From zachary.pincus at yale.edu Wed Jul 16 09:16:12 2008 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Wed, 16 Jul 2008 09:16:12 -0400 Subject: [Numpy-discussion] Infinity definitions In-Reply-To: <1DBAA5FC-4DA1-4956-9088-57BC3A7FE8D8@physics.ubc.ca> References: <47FE2460.6050107@gmail.com> <47FE55DF.3040201@enthought.com> <487CA70F.2000609@gmail.com> <1DBAA5FC-4DA1-4956-9088-57BC3A7FE8D8@physics.ubc.ca> Message-ID: >> Hi, >> Following Travis's suggestion below, I would like to suggest that the >> following definitions be deprecated or removed in this forthcoming >> release: >> >> numpy.Inf >> numpy.Infinity >> numpy.infty >> numpy.PINF >> numpy.NAN >> numpy.NaN > ... > > While this is being discussed, what about the "representation" of > nan's and infinities produced by repr in various forms? It is > particularly annoying when the repr of simple numeric types cannot be > evaluated. These include: > > 'infj' == repr(complex(0,inf)) > 'nanj' == repr(complex(0,nan)) > 'infnanj' == repr(complex(inf,nan)) > 'nannanj' == repr(complex(nan,nan)) > > It seems that perhaps infj and nanj should be defined symbols. I am > not sure why a + does not get inserted before nan. > > In addition, the default infstr and nanstr printing options should > also be changed to 'inf' and 'nan'. Hello all, If I recall correctly, one reason for the plethora of infinity definitions (which had been mentioned previously on the list) was that the repr for some or all float/complex types was generated by code in the host OS, and not in numpy. As such, these reprs were different for different platforms. As there was a desire to ensure that reprs could always be evaluated, the various ways that inf and nan could be spit out by the host libs were all included. Has this been fixed now, so that repr(inf), (etc.)
looks identical on all platforms? Zach From stefan at sun.ac.za Wed Jul 16 09:38:21 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 16 Jul 2008 15:38:21 +0200 Subject: [Numpy-discussion] Histogram bin definition Message-ID: <9457e7c80807160638j6b925e0br2276051fbdbd157f@mail.gmail.com> Hi all, I am busy documenting `histogram`, and the definition of a "bin" eludes me. Here is the behaviour that troubles me: >>> np.histogram([1,2,1], bins=[0, 1, 2, 3], new=True) (array([0, 2, 1]), array([0, 1, 2, 3])) From this result, it seems as if a bin is defined as the half-open interval [left_edge, right_edge). Now, look what happens in the following case: >>> np.histogram([1,2,3], bins=[0,1,2,3], new=True) (array([0, 1, 2]), array([0, 1, 2, 3])) Here, the last bin is defined by the closed interval [left_edge, right_edge]! Is this a bug, or a design consideration? Regards Stéfan From faltet at pytables.org Wed Jul 16 12:23:11 2008 From: faltet at pytables.org (Francesc Alted) Date: Wed, 16 Jul 2008 18:23:11 +0200 Subject: [Numpy-discussion] NumPy date/time types and the resolution concept In-Reply-To: References: <200807141507.47484.faltet@pytables.org> <200807151330.10394.faltet@pytables.org> Message-ID: <200807161823.12148.faltet@pytables.org> A Tuesday 15 July 2008, Anne Archibald escrigué: > 2008/7/15 Francesc Alted : > > Maybe is only that. But by using the term 'frequency' I tend to > > think that you are expecting to have one entry (observation) in > > your array for each time 'tick' since time start. OTOH, the term > > 'resolution' doesn't have this implication, and only states the > > precision of the timestamp. > > > > Well, after reading the mails from Chris and Anne, I think the best > > is that the origin would be kept as an int64 with a resolution of > > microseconds (for compatibility with the ``datetime`` module, as > > I've said before). > > A couple of details worth pointing out: we don't need a zillion > resolutions. One that's as good as the world time standards, and one > that spans an adequate length of time should cover it. After all, the > only reason for not using the highest available resolution is if you > want to cover a larger range of times. So there is no real need for > microseconds and milliseconds and seconds and days and weeks and... Maybe you are right, but by providing many resolutions we are trying to cope with the needs of people that are using them a lot. In particular, we hope that the authors of the timeseries scikit can find in these new dtypes a fair replacement for their Date class (our proposal will not be as featured, but...). > There is also no need for the origin to be kept with a resolution as > high as microseconds; seconds would do just fine, since if necessary > it can be interpreted as "exactly 7000 seconds after the epoch" even > if you are using femtoseconds elsewhere. Good point. However, we finally managed to not include the ``origin`` metadata in our new proposal. Have a look at the second proposal that I'll be posting soon for details. Cheers, -- Francesc Alted From rowen at cesmail.net Wed Jul 16 12:28:40 2008 From: rowen at cesmail.net (Russell E. Owen) Date: Wed, 16 Jul 2008 09:28:40 -0700 Subject: [Numpy-discussion] Recommendations for using numpy ma?
References: <200807151755.32256.pgmdevlist@gmail.com> Message-ID: In article <200807151755.32256.pgmdevlist at gmail.com>, Pierre GM wrote: > Russell, > > What used to be numpy.core.ma is now numpy.oldnumeric.ma, but this latter is > no longer supported and will disappear soon as well. Just use numpy.ma > > If you really need support to ancient versions of numpy, just check the import > try: > import numpy.core.ma as ma > except ImportError: > import numpy as ma (I assume you mean the last line to be "import numpy.ma as ma"?) Thanks! I was afraid I would have to do that, but not having ready access to ancient versions of numpy I was hoping I was wrong and that numpy.ma would work for those as well. However, I plan to assume a modern numpy first, as in: try: import numpy.ma as ma except ImportError: import numpy.core.ma as ma > Then, you need to replace every mention of numpy.core.ma in your code by ma. > Your example would then become: > > unmaskedArr = numpy.array( > ma.array( > ^^ > dataArr, > mask = mask & self.stretchExcludeBits, > dtype = float, > ).compressed()) > > > > On another note: what's the problem with 'compressed'? It should return a > ndarray, why/how doesn't it work ? The problem is that the returned array does not support the "sort" method. Here's an example using numpy 1.0.4: import numpy z = numpy.zeros(10, dtype=float) m = numpy.zeros(10, dtype=bool) m[1] = 1 mzc = numpy.ma.array(z, mask=m).compressed() mzc.sort() the last statement fails with: Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/ma.py", line 2132, in not_implemented raise NotImplementedError, "not yet implemented for numpy.ma arrays" NotImplementedError: not yet implemented for numpy.ma arrays This seems like a bug to me. The returned object is reported by "repr" to be a normal numpy array; there is no obvious way to tell that it is anything else. Also I didn't see any reason for "compressed" to return anything except an ordinary array. Oh well. I reported this on the mailing list awhile ago when I first stumbled across it, but nobody seemed interested at the time. It wasn't clear to me whether it was a bug so I dropped it without reporting it formally (and I've still not reported it formally). -- Russell From pgmdevlist at gmail.com Wed Jul 16 12:37:54 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 16 Jul 2008 12:37:54 -0400 Subject: [Numpy-discussion] Recommendations for using numpy ma? In-Reply-To: References: <200807151755.32256.pgmdevlist@gmail.com> Message-ID: <200807161237.54881.pgmdevlist@gmail.com> On Wednesday 16 July 2008 12:28:40 Russell E. Owen wrote: > > If you really need support to ancient versions of numpy, just check the > > import try: > > import numpy.core.ma as ma > > except ImportError: > > import numpy as ma > > (I assume you mean the last line to be "import numpy.ma as ma"?) Indeed, sorry about that.
> import numpy
> z = numpy.zeros(10, dtype=float)
> m = numpy.zeros(10, dtype=bool)
> m[1] = 1
> mzc = numpy.ma.array(z, mask=m).compressed()
> mzc.sort()
>
> The last statement fails with:
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/ma.py", line 2132, in not_implemented
>     raise NotImplementedError, "not yet implemented for numpy.ma arrays"
> NotImplementedError: not yet implemented for numpy.ma arrays
>
> This seems like a bug to me.

Works on 1.1.x.

> The returned object is reported by "repr" to be a normal numpy array;
> there is no obvious way to tell that it is anything else.  Also I
> didn't see any reason for "compressed" to return anything except an
> ordinary array.  Oh well.

.compressed returns an array of the same type as the underlying .data

> I reported this on the mailing list a while ago when I first stumbled
> across it, but nobody seemed interested at the time.  It wasn't clear
> to me whether it was a bug, so I dropped it without reporting it
> formally (and I've still not reported it formally).

Try again w/ 1.1.x, but please, please do report bugs when you see
them.

From faltet at pytables.org  Wed Jul 16 12:44:36 2008
From: faltet at pytables.org (Francesc Alted)
Date: Wed, 16 Jul 2008 18:44:36 +0200
Subject: [Numpy-discussion] RFC: A (second) proposal for implementing some date/time types in NumPy
Message-ID: <200807161844.36953.faltet@pytables.org>

Hi,

After the tons of excellent feedback received on our first proposal
about the date/time types in NumPy, Ivan and I have had another
brainstorming session and ended up with a new proposal for your
consideration.  While this one does not incorporate each and every
suggestion that was made, we think it represents a fair balance
between capabilities and simplicity, and that it can be a solid and
efficient basis for building more date/time niceties on top of it
(read: a full-fledged ``DateTime`` array class).

Although the proposal is not complete, the essentials are there.  So,
please read on.  We will be glad to hear your opinions.

Thanks!

-- 
Francesc Alted

====================================================================
 A (second) proposal for implementing some date/time types in NumPy
====================================================================

:Author: Francesc Alted i Abad
:Contact: faltet at pytables.com
:Author: Ivan Vilata i Balaguer
:Contact: ivan at selidor.net
:Date: 2008-07-16

Executive summary
=================

A date/time mark is something very handy to have in many fields where
one has to deal with data sets.  While Python has several modules that
define a date/time type (like the integrated ``datetime`` [1]_ or
``mx.DateTime`` [2]_), NumPy lacks one.  In this document, we propose
the addition of a series of date/time types to fill this gap.  The
requirements for the proposed types are twofold: 1) they have to be
fast to operate with, and 2) they have to be as compatible as possible
with the existing ``datetime`` module that comes with Python.

Types proposed
==============

To start with, it is virtually impossible to come up with a single
date/time type that fills the needs of every use case.  So, after
pondering different possibilities, we have settled on *two* different
types, namely ``datetime64`` and ``timedelta64`` (these names are
preliminary and can be changed), that can have different resolutions
so as to cover different needs.
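In a nutshell, the intended usage pattern is something like the
following (a sketch only: the exact spelling of the dtype constructors
is specified below, and none of this is implemented yet)::

  import numpy
  stamps = numpy.zeros(3, dtype='datetime64[us]')  # absolute points in time
  deltas = numpy.ones(3, dtype='timedelta64[us]')  # relative durations
  later  = stamps + deltas  # absolute + relative yields absolute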
**Important note:** the resolution is conceived here as metadata that
*complements* a date/time dtype, *without changing the base type*.

A detailed description of the proposed types follows.

``datetime64``
--------------

It represents a time that is absolute (i.e. not relative).  It is
implemented internally as an ``int64`` type.  The internal epoch is
the POSIX epoch (see [3]_).

Resolution
~~~~~~~~~~

It accepts different resolutions, and for each of these it supports a
different time span.  The table below describes the supported
resolutions with their corresponding time spans.

+------+-------------+--------------------------+
| Code | Meaning     | Time span (years)        |
+======+=============+==========================+
| Y    | year        | [9.2e18 BC, 9.2e18 AC]   |
| Q    | quarter     | [3.0e18 BC, 3.0e18 AC]   |
| M    | month       | [7.6e17 BC, 7.6e17 AC]   |
| W    | week        | [1.7e17 BC, 1.7e17 AC]   |
| d    | day         | [2.5e16 BC, 2.5e16 AC]   |
| h    | hour        | [1.0e15 BC, 1.0e15 AC]   |
| m    | minute      | [1.7e13 BC, 1.7e13 AC]   |
| s    | second      | [2.9e9 BC, 2.9e9 AC]     |
| ms   | millisecond | [2.9e6 BC, 2.9e6 AC]     |
| us   | microsecond | [290301 BC, 294241 AC]   |
| ns   | nanosecond  | [1678 AC, 2262 AC]       |
+------+-------------+--------------------------+

Building a ``datetime64`` dtype
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The proposed way to specify the resolution in the dtype constructor is:

Using parameters in the constructor::

  dtype('datetime64', res="us")  # the default res. is microseconds

Using the long string notation::

  dtype('datetime64[us]')  # equivalent to dtype('datetime64')

Using the short string notation::

  dtype('T8[us]')  # equivalent to dtype('T8')

Compatibility issues
~~~~~~~~~~~~~~~~~~~~

This will be fully compatible with the ``datetime`` class of the
``datetime`` module of Python only when using a resolution of
microseconds.  For other resolutions, the conversion process will lose
precision or overflow as needed.

``timedelta64``
---------------

It represents a time that is relative (i.e. not absolute).  It is
implemented internally as an ``int64`` type.

Resolution
~~~~~~~~~~

It accepts different resolutions, and for each of these it supports a
different time span.  The table below describes the supported
resolutions with their corresponding time spans.

+------+-------------+------------------+
| Code | Meaning     | Time span        |
+======+=============+==================+
| W    | week        | +- 1.7e17 years  |
| D    | day         | +- 2.5e16 years  |
| h    | hour        | +- 1.0e15 years  |
| m    | minute      | +- 1.7e13 years  |
| s    | second      | +- 2.9e12 years  |
| ms   | millisecond | +- 2.9e9 years   |
| us   | microsecond | +- 2.9e6 years   |
| ns   | nanosecond  | +- 292 years     |
| ps   | picosecond  | +- 106 days      |
| fs   | femtosecond | +- 2.6 hours     |
| as   | attosecond  | +- 9.2 seconds   |
+------+-------------+------------------+

Building a ``timedelta64`` dtype
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The proposed way to specify the resolution in the dtype constructor is:

Using parameters in the constructor::

  dtype('timedelta64', res="us")  # the default res. is microseconds

Using the long string notation::

  dtype('timedelta64[us]')  # equivalent to dtype('timedelta64')

Using the short string notation::

  dtype('t8[us]')  # equivalent to dtype('t8')

Compatibility issues
~~~~~~~~~~~~~~~~~~~~

This will be fully compatible with the ``timedelta`` class of the
``datetime`` module of Python only when using a resolution of
microseconds.  For other resolutions, the conversion process will lose
precision or overflow as needed.

Example of use
==============

Here is an example of use of ``datetime64``::

  In [10]: t = numpy.zeros(5, dtype="datetime64[ms]")
  In [11]: t[0] = datetime.datetime.now()  # setter in action
  In [12]: t[0]
  Out[12]: '2008-07-16T13:39:25.315'  # representation in ISO 8601 format
  In [13]: print t
  [2008-07-16T13:39:25.315  1970-01-01T00:00:00.0
   1970-01-01T00:00:00.0  1970-01-01T00:00:00.0  1970-01-01T00:00:00.0]
  In [14]: t[0].item()  # getter in action
  Out[14]: datetime.datetime(2008, 7, 16, 13, 39, 25, 315000)
  In [15]: print t.dtype
  datetime64[ms]

And here is an example of use of ``timedelta64``::

  In [8]: t1 = numpy.zeros(5, dtype="datetime64[s]")
  In [9]: t2 = numpy.ones(5, dtype="datetime64[s]")
  In [10]: t = t2 - t1
  In [11]: t[0] = 24  # setter in action (setting to 24 seconds)
  In [12]: t[0]
  Out[12]: 24  # representation as an int64
  In [13]: print t
  [24 1 1 1 1]
  In [14]: t[0].item()  # getter in action
  Out[14]: datetime.timedelta(0, 24)
  In [15]: print t.dtype
  timedelta64[s]

Operating with date/time arrays
===============================

``datetime64`` vs ``datetime64``
--------------------------------

The only operation allowed between absolute dates is subtraction::

  In [10]: numpy.ones(5, "T8") - numpy.zeros(5, "T8")
  Out[10]: array([1, 1, 1, 1, 1], dtype=timedelta64[us])

But not other operations::

  In [11]: numpy.ones(5, "T8") + numpy.zeros(5, "T8")
  TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and 'numpy.ndarray'

``datetime64`` vs ``timedelta64``
---------------------------------

It will be possible to add and subtract relative times from absolute
dates::

  In [10]: numpy.zeros(5, "T8[Y]") + numpy.ones(5, "t8[Y]")
  Out[10]: array([1971, 1971, 1971, 1971, 1971], dtype=datetime64[Y])

  In [11]: numpy.ones(5, "T8[Y]") - 2 * numpy.ones(5, "t8[Y]")
  Out[11]: array([1969, 1969, 1969, 1969, 1969], dtype=datetime64[Y])

But not other operations::

  In [12]: numpy.ones(5, "T8[Y]") * numpy.ones(5, "t8[Y]")
  TypeError: unsupported operand type(s) for *: 'numpy.ndarray' and 'numpy.ndarray'

``timedelta64`` vs anything
---------------------------

Finally, it will be possible to operate with relative times as if they
were regular int64 dtypes *as long as* the result can be converted back
into a ``timedelta64``::

  In [10]: numpy.ones(5, 't8')
  Out[10]: array([1, 1, 1, 1, 1], dtype=timedelta64[us])

  In [11]: (numpy.ones(5, 't8[M]') + 2) ** 3
  Out[11]: array([27, 27, 27, 27, 27], dtype=timedelta64[M])

But::

  In [12]: numpy.ones(5, 't8') + 1j
  TypeError: the result cannot be converted into a ``timedelta64``

dtype/resolution conversions
============================

For changing the date/time dtype of an existing array, we propose to
use the ``.astype()`` method.  This will be mainly useful for changing
resolutions.
For example, for absolute dates::

  In [10]: t1 = numpy.zeros(5, dtype="datetime64[s]")
  In [11]: print t1
  [1970-01-01T00:00:00  1970-01-01T00:00:00  1970-01-01T00:00:00
   1970-01-01T00:00:00  1970-01-01T00:00:00]
  In [12]: print t1.astype('datetime64[d]')
  [1970-01-01  1970-01-01  1970-01-01  1970-01-01  1970-01-01]

For relative times::

  In [10]: t1 = numpy.ones(5, dtype="timedelta64[s]")
  In [11]: print t1
  [1 1 1 1 1]
  In [12]: print t1.astype('timedelta64[ms]')
  [1000 1000 1000 1000 1000]

Changing directly from/to relative to/from absolute dtypes will not be
supported::

  In [13]: numpy.zeros(5, dtype="datetime64[s]").astype('timedelta64')
  TypeError: data type cannot be converted to the desired type

Final considerations
====================

Why the ``origin`` metadata disappeared
---------------------------------------

During the discussion of the date/time dtypes on the NumPy list, the
idea of having an ``origin`` metadata that complemented the definition
of the absolute ``datetime64`` was initially found to be useful.

However, after thinking more about this, Ivan and I find that the
combination of an absolute ``datetime64`` with a relative
``timedelta64`` offers the same functionality while removing the need
for the additional ``origin`` metadata.  This is why we have removed
it from this proposal.

Resolution and dtype issues
---------------------------

The date/time dtype's resolution metadata cannot be used in general as
part of typical dtype usage.  For example, in::

  numpy.zeros(5, dtype=numpy.datetime64)

we have yet to find a sensible way to pass the resolution.  Perhaps the
following would work::

  numpy.zeros(5, dtype=numpy.datetime64(res='Y'))

but we are not sure whether this would collide with the spirit of the
NumPy dtypes.  At any rate, one can always do::

  numpy.zeros(5, dtype=numpy.dtype('datetime64', res='Y'))

BTW, prior to all of this, one should also elucidate whether::

  numpy.dtype('datetime64', res='Y')

or::

  numpy.dtype('datetime64[Y]')
  numpy.dtype('T8[Y]')

would be a consistent way to instantiate a dtype in NumPy.  We do
really think that this could be a good way, but we would need to hear
the opinion of the expert.  Travis?

.. [1] http://docs.python.org/lib/module-datetime.html
.. [2] http://www.egenix.com/products/python/mxBase/mxDateTime
.. [3] http://en.wikipedia.org/wiki/Unix_time

.. Local Variables:
.. mode: rst
.. coding: utf-8
.. fill-column: 72
.. End:

From charlesr.harris at gmail.com  Wed Jul 16 12:50:27 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 16 Jul 2008 10:50:27 -0600
Subject: [Numpy-discussion] Second revised list of backports for 1.1.1.
In-Reply-To: 
References: 
Message-ID: 

On Tue, Jul 15, 2008 at 11:31 PM, Fernando Perez wrote:

> Hi Chuck,
>
> On Tue, Jul 15, 2008 at 8:41 PM, Charles R Harris wrote:
> > After the second pass only a few remain.  Fernando, if you don't get
> > to these I'll do them tomorrow.
> >
> > fperez
> > r5298
> > r5301
> > r5303
>
> Sorry, I hadn't seen this.  If you can do it that would be great: I'm
> at a workshop right now with only a couple of hours in the night for
> any coding, and trying to push an ipython release for scipy.  But if
> you can't do it tomorrow let me know and I'll make a space for it
> later in the week.

OK. Done.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From david.huard at gmail.com  Wed Jul 16 14:14:56 2008
From: david.huard at gmail.com (David Huard)
Date: Wed, 16 Jul 2008 14:14:56 -0400
Subject: [Numpy-discussion] Histogram bin definition
In-Reply-To: <9457e7c80807160638j6b925e0br2276051fbdbd157f@mail.gmail.com>
References: <9457e7c80807160638j6b925e0br2276051fbdbd157f@mail.gmail.com>
Message-ID: <91cf711d0807161114x5a3b2422v1d9cd6d5105d11ed@mail.gmail.com>

Hi Stefan,

It's designed this way.  The main reason is that the default bin edges
are generated using linspace(a.min(), a.max(), bin) when bin is an
integer.  If we left the rightmost edge open, the histogram of a
100-item array would typically count only 99 items, because the
maximum value would fall outside the last bin.  I thought the least
surprising behavior was to make sure that all items are counted.

The other reason has to do with backward compatibility: I tried to
avoid breakage for the simplest use case, so `histogram(r, bins=10)`
yields the same thing as `histogram(r, bins=10, new=True)`.

We could avoid the open-ended edge by defining the edges as
linspace(a.min(), a.max()+delta, bin), but people would wonder why the
right edge is 3.000001 instead of 3.

Cheers,

David

2008/7/16 Stéfan van der Walt:

> Hi all,
>
> I am busy documenting `histogram`, and the definition of a "bin"
> eludes me.  Here is the behaviour that troubles me:
>
> >>> np.histogram([1,2,1], bins=[0, 1, 2, 3], new=True)
> (array([0, 2, 1]), array([0, 1, 2, 3]))
>
> From this result, it seems as if a bin is defined as the half-open
> interval [left_edge, right_edge).
>
> Now, look what happens in the following case:
>
> >>> np.histogram([1,2,3], bins=[0,1,2,3], new=True)
> (array([0, 1, 2]), array([0, 1, 2, 3]))
>
> Here, the last bin is defined by the closed interval [left_edge,
> right_edge]!
>
> Is this a bug, or a design consideration?
>
> Regards
> Stéfan
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From fperez.net at gmail.com  Wed Jul 16 14:58:15 2008
From: fperez.net at gmail.com (Fernando Perez)
Date: Wed, 16 Jul 2008 11:58:15 -0700
Subject: [Numpy-discussion] Second revised list of backports for 1.1.1.
In-Reply-To: 
References: 
Message-ID: 

On Wed, Jul 16, 2008 at 9:50 AM, Charles R Harris
> OK. Done.

Fantastic, many thanks.

Cheers,

f

From doutriaux1 at llnl.gov  Wed Jul 16 15:08:59 2008
From: doutriaux1 at llnl.gov (Charles Doutriaux)
Date: Wed, 16 Jul 2008 12:08:59 -0700
Subject: [Numpy-discussion] kinds
Message-ID: <487E474B.3010708@llnl.gov>

Hello,

A long long time ago, there used to be this module named "kinds".
It's totally outdated nowadays, but it had one nice functionality that
I was wondering how to reproduce.  It was:

maxexp = kinds.default_float_kind.MAX_10_EXP
minexp = kinds.default_float_kind.MIN_10_EXP

and a bunch of similar flags that would basically tell you the limits
on the machine you're running (or at least compiled on).

Any idea on how to reproduce this?

While we're at it, does anybody know of a way in Python to find out
how much memory is available on your system (similar to the free call
under Linux)?  I'm looking for something that works across platforms
(well, we can forget Windows for now, I could live w/o it).

Thanks,

C.
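For concreteness, here is roughly the replacement I'm hoping exists
(a sketch on my side: the decimal exponents are derived from
numpy.finfo's max/tiny rather than read off a ready-made attribute,
and I've only checked the numbers against IEEE doubles):

import numpy

fi = numpy.finfo(float)
# Rough analogues of kinds.default_float_kind.MAX_10_EXP and
# MIN_10_EXP; for IEEE doubles these come out as 308 and -307.
maxexp = int(numpy.floor(numpy.log10(fi.max)))
minexp = int(numpy.ceil(numpy.log10(fi.tiny)))
print maxexp, minexp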
From charlesr.harris at gmail.com  Wed Jul 16 15:13:48 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 16 Jul 2008 13:13:48 -0600
Subject: [Numpy-discussion] Ticket review: #843
In-Reply-To: <20080715073530.B81915@saturn.araneidae.co.uk>
References: <20080715073530.B81915@saturn.araneidae.co.uk>
Message-ID: 

On Tue, Jul 15, 2008 at 1:42 AM, Michael Abbott wrote:

> I'm reviewing my tickets (seems a good thing to do with a release
> imminent), and I'll post up each ticket that merits comment as a
> separate message.
>
> Ticket #843 has gone into trunk (commit 5361, oliphant) ... but your
> editor appears to be introducing hard tabs!  Hard tab characters are
> fortunately relatively rare in numpy source, but my patch has gone in
> with tabs I didn't use.

Heh, hard tabs have been removed.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pgmdevlist at gmail.com  Wed Jul 16 15:20:04 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Wed, 16 Jul 2008 15:20:04 -0400
Subject: [Numpy-discussion] kinds
In-Reply-To: <487E474B.3010708@llnl.gov>
References: <487E474B.3010708@llnl.gov>
Message-ID: <200807161520.04816.pgmdevlist@gmail.com>

On Wednesday 16 July 2008 15:08:59 Charles Doutriaux wrote:
> and a bunch of similar flags that would basically tell you the limits
> on the machine you're running (or at least compiled on).
>
> Any idea on how to reproduce this?

Charles, have you tried numpy.finfo?  That should give you information
for floating points.

From aisaac at american.edu  Wed Jul 16 15:33:11 2008
From: aisaac at american.edu (Alan G Isaac)
Date: Wed, 16 Jul 2008 15:33:11 -0400
Subject: [Numpy-discussion] Ticket review: #843
In-Reply-To: 
References: <20080715073530.B81915@saturn.araneidae.co.uk>
Message-ID: 

On Wed, 16 Jul 2008, Charles R Harris apparently wrote:
> Hard tab characters are fortunately relatively rare in
> numpy source

http://www.rizzoweb.com/java/tabs-vs-spaces.html

Cheers,
Alan Isaac

From michael at araneidae.co.uk  Wed Jul 16 15:33:59 2008
From: michael at araneidae.co.uk (Michael Abbott)
Date: Wed, 16 Jul 2008 19:33:59 +0000 (GMT)
Subject: [Numpy-discussion] Ticket review: #843
In-Reply-To: 
References: <20080715073530.B81915@saturn.araneidae.co.uk>
Message-ID: <20080716193031.D20294@saturn.araneidae.co.uk>

On Wed, 16 Jul 2008, Alan G Isaac wrote:
> On Wed, 16 Jul 2008, Charles R Harris apparently wrote:
Michael Abbott actually wrote:
> > Hard tab characters are fortunately relatively rare in
> > numpy source
> http://www.rizzoweb.com/java/tabs-vs-spaces.html

Ha ha ha!  I'm not going to rise to the bait.

Well, I'll just say: source code is a concrete expression, not an
abstraction.

I expect an argument on this topic could be unhealthy...

From charlesr.harris at gmail.com  Wed Jul 16 15:40:51 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 16 Jul 2008 13:40:51 -0600
Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType
In-Reply-To: <20080715150718.R97049@saturn.araneidae.co.uk>
References: <20080715074217.R81915@saturn.araneidae.co.uk> <20080715150718.R97049@saturn.araneidae.co.uk>
Message-ID: 

On Tue, Jul 15, 2008 at 9:28 AM, Michael Abbott wrote:

> On Tue, 15 Jul 2008, Michael Abbott wrote:
> > Only half of my patch for this bug has gone into trunk, and without
> > the rest of my patch there remains a leak.
>
> I think I might need to explain a little more about the reason for this
> patch, because obviously the bug it fixes was missed the last time I
> posted on this bug.
>
> So here is the missing part of the patch:
>
> > --- numpy/core/src/scalartypes.inc.src    (revision 5411)
> > +++ numpy/core/src/scalartypes.inc.src    (working copy)
> > @@ -1925,19 +1925,30 @@
> >          goto finish;
> >      }
> >
> > +    Py_XINCREF(typecode);
> >      arr = PyArray_FromAny(obj, typecode, 0, 0, FORCECAST, NULL);
> > -    if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) return arr;
> > +    if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) {
> > +        Py_XDECREF(typecode);
> > +        return arr;
> > +    }
> >      robj = PyArray_Return((PyArrayObject *)arr);
> >
> >  finish:
> > -    if ((robj==NULL) || (robj->ob_type == type)) return robj;
> > +    if ((robj==NULL) || (robj->ob_type == type)) {
> > +        Py_XDECREF(typecode);
> > +        return robj;
> > +    }
> >      /* Need to allocate new type and copy data-area over */
> >      if (type->tp_itemsize) {
> >          itemsize = PyString_GET_SIZE(robj);
> >      }
> >      else itemsize = 0;
> >      obj = type->tp_alloc(type, itemsize);
> > -    if (obj == NULL) {Py_DECREF(robj); return NULL;}
> > +    if (obj == NULL) {
> > +        Py_XDECREF(typecode);
> > +        Py_DECREF(robj);
> > +        return NULL;
> > +    }
> >      if (typecode==NULL)
> >          typecode = PyArray_DescrFromType(PyArray_@TYPE@);
> >      dest = scalar_value(obj, typecode);
>
> On the face of it, it might appear that all the DECREFs are cancelling
> out the first INCREF, but not so.  Let's see two more lines of context:
>
> >      src = scalar_value(robj, typecode);
> >      Py_DECREF(typecode);
>
> Ahah.  That DECREF balances the original PyArray_DescrFromType, or
> maybe the later call ... and of course this has to happen on *ALL*
> return paths.  If we now take a closer look at the patch, we can see
> that it's doing two separate things:
>
> 1. There's an extra Py_XINCREF to balance the ref count lost to
> PyArray_FromAny and ensure that typecode survives long enough;
>
> 2. Every early return path has an extra Py_XDECREF to balance the
> creation of typecode.
>
> I rest my case for this patch.

Yes, there does look to be a memory leak here.  Not to mention a
missing NULL check, since PyArray_Scalar not only doesn't swallow a
reference, it can't take a NULL value for desc.  But the whole function
is such a mess that I want to see if we can rewrite it to have a better
flow of logic.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From charlesr.harris at gmail.com Wed Jul 16 16:07:17 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 16 Jul 2008 14:07:17 -0600 Subject: [Numpy-discussion] Ticket review #849: reference to deallocated object? In-Reply-To: <20080715075253.O81915@saturn.araneidae.co.uk> References: <20080715075253.O81915@saturn.araneidae.co.uk> Message-ID: On Tue, Jul 15, 2008 at 1:53 AM, Michael Abbott wrote: > Tenuous but easy fix, and conformant to style elsewhere. > This one depends on whether there is some sort of threading going on that can interrupt in the middle of the call. Probably not, but the fix doesn't look disruptive and I'll put it in just to be safe. I actually think things should be set up so that the descr reference counts never go to zero, i.e., all the types should be singleton's set up during initialization and maintained throughout the existence of numpy. But it's not that way now with the way the chararray type is implemented. I wonder if we need a NULL check also? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From doutriaux1 at llnl.gov Wed Jul 16 16:25:54 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Wed, 16 Jul 2008 13:25:54 -0700 Subject: [Numpy-discussion] kinds In-Reply-To: <200807161520.04816.pgmdevlist@gmail.com> References: <487E474B.3010708@llnl.gov> <200807161520.04816.pgmdevlist@gmail.com> Message-ID: <487E5952.2050403@llnl.gov> Thx Pierre, That's exactly what i was looking for C. Pierre GM wrote: > On Wednesday 16 July 2008 15:08:59 Charles Doutriaux wrote: > > >> and a bunch of similar flags that would basically tell you the limits on >> the machine you're running (or at least compiled on) >> >> Any idea on how to reproduce this? >> > > Charels, have you tried numpy.finfo ? That should give you information for > floating points. > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From charlesr.harris at gmail.com Wed Jul 16 16:26:33 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 16 Jul 2008 14:26:33 -0600 Subject: [Numpy-discussion] Ticket review #850: leak in _strings_richcompare In-Reply-To: <20080715074920.J81915@saturn.araneidae.co.uk> References: <20080715074920.J81915@saturn.araneidae.co.uk> Message-ID: On Tue, Jul 15, 2008 at 1:50 AM, Michael Abbott wrote: > This one is easy, ought to go in. Fixes a (not particularly likely) > memory leak. > _______ Done and backported. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Wed Jul 16 16:48:11 2008 From: aisaac at american.edu (Alan G Isaac) Date: Wed, 16 Jul 2008 16:48:11 -0400 Subject: [Numpy-discussion] Ticket review: #843 In-Reply-To: References: <20080715073530.B81915@saturn.araneidae.co.uk><20080716193031.D20294@saturn.araneidae.co.uk> Message-ID: On Wed, 16 Jul 2008, Charles R Harris apparently wrote: > the python standard is four spaces It is only a recommendation: http://www.python.org/dev/peps/pep-0008/ (And a misguided one at that. ;-) ) Cheers, Alan Isaac From pav at iki.fi Wed Jul 16 17:05:58 2008 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 16 Jul 2008 21:05:58 +0000 (UTC) Subject: [Numpy-discussion] Ticket #837 Message-ID: http://scipy.org/scipy/numpy/ticket/837 Infinite loop in fromfile and fromstring with sep=' ' and malformed input. I committed a fix to trunk. Does this need a 1.1.1 backport? 
-- Pauli Virtanen From stefan at sun.ac.za Wed Jul 16 17:41:52 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 16 Jul 2008 23:41:52 +0200 Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType In-Reply-To: References: <20080715074217.R81915@saturn.araneidae.co.uk> <20080715150718.R97049@saturn.araneidae.co.uk> Message-ID: <9457e7c80807161441g572f97c5s588f517a613d4985@mail.gmail.com> 2008/7/16 Charles R Harris : > Yes, there does look to be a memory leak here. Not to mention a missing NULL > check since PyArray_Scalar not only doesn't swallow a reference, it can't > take a Null value for desc. But the whole function is such a mess I want to > see if we can rewrite it to have a better flow of logic Can we apply the patch in the meantime? (My) TODO lists tend to get very long... St?fan From charlesr.harris at gmail.com Wed Jul 16 17:42:00 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 16 Jul 2008 15:42:00 -0600 Subject: [Numpy-discussion] Ticket review: #843 In-Reply-To: References: <20080715073530.B81915@saturn.araneidae.co.uk> <20080716193031.D20294@saturn.araneidae.co.uk> Message-ID: On Wed, Jul 16, 2008 at 2:48 PM, Alan G Isaac wrote: > On Wed, 16 Jul 2008, Charles R Harris apparently wrote: > > the python standard is four spaces > > It is only a recommendation: > http://www.python.org/dev/peps/pep-0008/ > (And a misguided one at that. ;-) ) > I see your pep-0008 and raise you pep-3100 ;) - The C style guide will be updated to use 4-space indents, never tabs. This style should be used for all new files; existing files can be updated only if there is no hope to ever merge a particular file from the Python 2 HEAD. Within a file, the indentation style should be consistent. No other style guide changes are planned ATM. I'll bet you use Emacs too ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Jul 16 17:43:00 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 16 Jul 2008 15:43:00 -0600 Subject: [Numpy-discussion] Ticket #837 In-Reply-To: References: Message-ID: On Wed, Jul 16, 2008 at 3:05 PM, Pauli Virtanen wrote: > > http://scipy.org/scipy/numpy/ticket/837 > > Infinite loop in fromfile and fromstring with sep=' ' and malformed input. > > I committed a fix to trunk. Does this need a 1.1.1 backport? > Yes, I think so. TIA, Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From Jack.Cook at shell.com Wed Jul 16 17:45:37 2008 From: Jack.Cook at shell.com (Jack.Cook at shell.com) Date: Wed, 16 Jul 2008 16:45:37 -0500 Subject: [Numpy-discussion] Numpy Advanced Indexing Question Message-ID: Greetings, I have an I,J,K 3D volume of amplitude values at regularly sampled time intervals. I have an I,J 2D slice which contains a time (K) value at each I, J location. What I would like to do is extract a subvolume at a constant +/- K window around the slice. Is there an easy way to do this using advanced indexing or some other method? Thanks in advanced for your help. - Jack Kind Regards, Jack Cook Jack Cook -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From anthony.floyd at convergent.ca Wed Jul 16 17:47:38 2008 From: anthony.floyd at convergent.ca (Anthony Floyd) Date: Wed, 16 Jul 2008 14:47:38 -0700 Subject: [Numpy-discussion] Masked arrays and pickle/unpickle Message-ID: <7EFBEC7FA86C1141B59B59EEAEE3294F5A37BB@EMAIL2.exchange.electric.net> We have an application that has previously used masked arrays from Numpy 1.0.3. Part of saving files from that application involved pickling data types that contained these masked arrays. In the latest round of library updates, we've decided to move to the most recent version of matplotlib, which requires Numpy 1.1. Unfortunately, when we try to unpickle the data saved with Numpy 1.0.3 in the new code using Numpy 1.1.0, it chokes because it can't import numpy.core.ma for the masked arrays. A check of Numpy 1.1.0 shows that this is now numpy.ma.core. Does anyone have any advice on how we can unpickle the old data files and update the references to the new classes? Thanks, Anthony. -- Anthony Floyd, PhD Convergent Manufacturing Technologies Inc. 6190 Agronomy Rd, Suite 403 Vancouver BC V6T 1Z3 CANADA Email: Anthony.Floyd at convergent.ca | Tel: 604-822-9682 x102 WWW: http://www.convergent.ca | Fax: 604-822-9659 CMT is hiring: See http://www.convergent.ca for details From Chris.Barker at noaa.gov Wed Jul 16 17:51:45 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 16 Jul 2008 14:51:45 -0700 Subject: [Numpy-discussion] Ticket review: #843 In-Reply-To: References: <20080715073530.B81915@saturn.araneidae.co.uk> <20080716193031.D20294@saturn.araneidae.co.uk> Message-ID: <487E6D71.40903@noaa.gov> Alan G Isaac wrote: > It is only a recommendation: > http://www.python.org/dev/peps/pep-0008/ > (And a misguided one at that. ;-) ) Maybe so, but it is indeed more than a recommendation, it is a de-facto standard. If Python had never allowed mixed tabs and spaces, it might be OK to use tabs for some projects, and spaces for others, but that's water long since under the bridge. Indentation is syntax in python -- we do need to all do it the same way, and four spaces is the standard -- there simply isn't another reasonable option if you want o share code with anyone else. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From stefan at sun.ac.za Wed Jul 9 04:28:23 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 9 Jul 2008 10:28:23 +0200 Subject: [Numpy-discussion] Documentation: topical docs and reviewing our work Message-ID: <9457e7c80807090128u2194c8datc2ccef2be65d855c@mail.gmail.com> Hi all, A `numpy.doc` sub-module has been added, which contains documentation for topics such as indexing, broadcasting, array operations etc. These can be edited from the documentation wiki: http://sd-2116.dedibox.fr/pydocweb/doc/numpy.doc/ If you'd like to document a topic that is not there, let me know and I'll add it. Further, we have documented a large number of functions, and the list is growing by the day. If you go to the docstring summary page: http://sd-2116.dedibox.fr/pydocweb/doc/ the ones ready for review are marked in pink, right at the top. Please log in and leave comments on those. Your input would be much appreciated! 
Regards
Stéfan

From robert.kern at gmail.com  Wed Jul 16 17:55:58 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 16 Jul 2008 16:55:58 -0500
Subject: [Numpy-discussion] Numpy Advanced Indexing Question
In-Reply-To: 
References: 
Message-ID: <3d375d730807161455m76e73865i657b6b9865179a1f@mail.gmail.com>

On Wed, Jul 16, 2008 at 16:45, wrote:
> Greetings,
>
> I have an I,J,K 3D volume of amplitude values at regularly sampled
> time intervals.  I have an I,J 2D slice which contains a time (K)
> value at each I, J location.  What I would like to do is extract a
> subvolume at a constant +/- K window around the slice.  Is there an
> easy way to do this using advanced indexing or some other method?
> Thanks in advanced for your help.

cube[:,:,K-half_width:K+half_width]

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From doutriaux1 at llnl.gov  Wed Jul 16 17:58:17 2008
From: doutriaux1 at llnl.gov (Charles Doutriaux)
Date: Wed, 16 Jul 2008 14:58:17 -0700
Subject: [Numpy-discussion] svd
Message-ID: <487E6EF9.80308@llnl.gov>

Hello,

I'm using 1.1.0 and I have a bizarre thing happening.

It seems as if doing:

import numpy
SVD = numpy.linalg.svd

is different from doing:

import numpy.oldnumeric.linear_algebra
SVD = numpy.oldnumeric.linear_algebra.singular_value_decomposition

In the first case, passing an array of shape (204, 1484) returns
arrays of shapes:

svd: (204, 204) (204,) (1484, 1484)

In the second case I get (what I expected, actually):

svd: (204, 204) (204,) (204, 1484)

But looking at the code, it seems like
numpy.oldnumeric.linear_algebra.singular_value_decomposition is
basically numpy.linalg.svd.

Any idea on what's happening here?

Thx,

C.
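P.S.: Here is a minimal script that reproduces what I'm seeing (the
shapes in the comments are the ones reported above; the array contents
don't matter):

import numpy
import numpy.oldnumeric.linear_algebra as la

a = numpy.zeros((204, 1484), dtype=float)
u1, s1, v1 = numpy.linalg.svd(a)
u2, s2, v2 = la.singular_value_decomposition(a)
print u1.shape, s1.shape, v1.shape  # (204, 204) (204,) (1484, 1484)
print u2.shape, s2.shape, v2.shape  # (204, 204) (204,) (204, 1484)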
From charlesr.harris at gmail.com Wed Jul 16 18:09:02 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 16 Jul 2008 16:09:02 -0600 Subject: [Numpy-discussion] svd In-Reply-To: <487E6EF9.80308@llnl.gov> References: <487E6EF9.80308@llnl.gov> Message-ID: On Wed, Jul 16, 2008 at 3:58 PM, Charles Doutriaux wrote: > Hello, > > I'm using 1.1.0 and I have a bizarre thing happening > > it seems as if: > doing: > import numpy > SVD = numpy.linalg.svd > > if different as doing > import numpy.oldnumeric.linear_algebra > SVD = numpy.oldnumeric.linear_algebra.singular_value_decomposition > > In the first case passing an array (204,1484) retuns array of shape: > svd: (204, 204) (204,) (1484, 1484) > > in the second case I get (what i expected actually): > svd: (204, 204) (204,) (204, 1484) > > But looking at the code, it seems like > numpy.oldnumeric.linear_algebra.singular_value_decomposition > is basicalyy numpy.linalg.svd > > Any idea on what's happening here? > There is a full_matrices flag that determines if you get the full orthogonal matrices, or the the minimum size needed, i.e. In [12]: l,d,r = linalg.svd(x, full_matrices=0) In [13]: shape(r) Out[13]: (2, 4) In [14]: x = zeros((2,4)) In [15]: l,d,r = linalg.svd(x) In [16]: shape(r) Out[16]: (4, 4) In [17]: l,d,r = linalg.svd(x, full_matrices=0) In [18]: shape(r) Out[18]: (2, 4) Chuck > > Thx, > > C. > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Wed Jul 16 18:11:02 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 17 Jul 2008 00:11:02 +0200 Subject: [Numpy-discussion] Ticket review: #843 In-Reply-To: <487E6D71.40903@noaa.gov> References: <20080715073530.B81915@saturn.araneidae.co.uk> <20080716193031.D20294@saturn.araneidae.co.uk> <487E6D71.40903@noaa.gov> Message-ID: <9457e7c80807161511m1f1e933q691683d5f1f4a0ae@mail.gmail.com> 2008/7/16 Christopher Barker : > Indentation is syntax in python -- we do need to all do it the same way, > and four spaces is the standard -- there simply isn't another reasonable > option if you want o share code with anyone else. I agree. Let's just end this thread here. It simply can't lead to any useful discussion. St?fan From Jack.Cook at shell.com Wed Jul 16 18:12:47 2008 From: Jack.Cook at shell.com (Jack.Cook at shell.com) Date: Wed, 16 Jul 2008 17:12:47 -0500 Subject: [Numpy-discussion] Numpy Advanced Indexing Question In-Reply-To: <3d375d730807161455m76e73865i657b6b9865179a1f@mail.gmail.com> Message-ID: Robert, I can understand how this works if K is a constant time value but in my case K varies at each location in the two-dimensional slice. In other words, if I was doing this in a for loop I would do something like this for i in range(numI): for j in range(numJ): k = slice(i,j) trace = cube(i,j,k-half_width:k+half_width) # shove trace in sub volume What am I missing? - Jack -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org]On Behalf Of Robert Kern Sent: Wednesday, July 16, 2008 4:56 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Numpy Advanced Indexing Question On Wed, Jul 16, 2008 at 16:45, wrote: > Greetings, > > I have an I,J,K 3D volume of amplitude values at regularly sampled time > intervals. 
I have an I,J 2D slice which contains a time (K) value at each I, > J location. What I would like to do is extract a subvolume at a constant +/- > K window around the slice. Is there an easy way to do this using advanced > indexing or some other method? Thanks in advanced for your help. cube[:,:,K-half_width:K+half_width] -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion From aisaac at american.edu Wed Jul 16 18:27:05 2008 From: aisaac at american.edu (Alan G Isaac) Date: Wed, 16 Jul 2008 18:27:05 -0400 Subject: [Numpy-discussion] Ticket review: #843 In-Reply-To: <487E6D71.40903@noaa.gov> References: <20080715073530.B81915@saturn.araneidae.co.uk><20080716193031.D20294@saturn.araneidae.co.uk> <487E6D71.40903@noaa.gov> Message-ID: On Wed, 16 Jul 2008, Christopher Barker apparently wrote: > Indentation is syntax in python -- we do need to all do it > the same way, and four spaces is the standard -- there > simply isn't another reasonable option if you want o share > code with anyone else. Last comment (since this has already gone too long): There are large projects that accept the use of either convention. (E.g., Zope, if I recall correctly.) BUT projects set their own style guidelines, and I am NOT in any way proposing that NumPy change. (Not that it would possibly matter if I did so.) But just to be clear: the common arguments against the use of tabs are demonstrably false and illustrate either ignorance or use of an incapable editor. Nobody with a decent editor will ever have problems with code that consistently uses tabs for indentation---a choice that can easily be signalled by modelines when code is shared on a project that allows both conventions. Cheers, Alan From doutriaux1 at llnl.gov Wed Jul 16 18:24:17 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Wed, 16 Jul 2008 15:24:17 -0700 Subject: [Numpy-discussion] svd In-Reply-To: References: <487E6EF9.80308@llnl.gov> Message-ID: <487E7511.1070304@llnl.gov> doh... Thanks Charles... I guess I've been staring at this code for too long now... C. Charles R Harris wrote: > > > On Wed, Jul 16, 2008 at 3:58 PM, Charles Doutriaux > > wrote: > > Hello, > > I'm using 1.1.0 and I have a bizarre thing happening > > it seems as if: > doing: > import numpy > SVD = numpy.linalg.svd > > if different as doing > import numpy.oldnumeric.linear_algebra > SVD = numpy.oldnumeric.linear_algebra.singular_value_decomposition > > In the first case passing an array (204,1484) retuns array of shape: > svd: (204, 204) (204,) (1484, 1484) > > in the second case I get (what i expected actually): > svd: (204, 204) (204,) (204, 1484) > > But looking at the code, it seems like > numpy.oldnumeric.linear_algebra.singular_value_decomposition > is basicalyy numpy.linalg.svd > > Any idea on what's happening here? > > > There is a full_matrices flag that determines if you get the full > orthogonal matrices, or the the minimum size needed, i.e. 
> > In [12]: l,d,r = linalg.svd(x, full_matrices=0) > > In [13]: shape(r) > Out[13]: (2, 4) > > In [14]: x = zeros((2,4)) > > In [15]: l,d,r = linalg.svd(x) > > In [16]: shape(r) > Out[16]: (4, 4) > > In [17]: l,d,r = linalg.svd(x, full_matrices=0) > > In [18]: shape(r) > Out[18]: (2, 4) > > > Chuck > > > > > > > Thx, > > C. > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Wed Jul 16 19:08:12 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 16 Jul 2008 18:08:12 -0500 Subject: [Numpy-discussion] Numpy Advanced Indexing Question In-Reply-To: References: <3d375d730807161455m76e73865i657b6b9865179a1f@mail.gmail.com> Message-ID: <3d375d730807161608t6b7cac4m8af5f7ba3eaea648@mail.gmail.com> On Wed, Jul 16, 2008 at 17:12, wrote: > Robert, > > I can understand how this works if K is a constant time value but in my case K varies at each location in the two-dimensional slice. In other words, if I was doing this in a for loop I would do something like this > > for i in range(numI): > for j in range(numJ): > k = slice(i,j) > trace = cube(i,j,k-half_width:k+half_width) > # shove trace in sub volume > > What am I missing? Ah, okay. It's a bit tricky, though. Yes, you need to use fancy indexing. Since axis you want to be index fancifully is not the first one, you have to be more explicit than you might otherwise want. For example, it would be great if you could just use slices for the first two axes: cube[:,:,slice + numpy.arange(-half_width,half_width)] but the semantics of that are a bit different for reasons I can explain later, if you want. Instead, you have to have explicit fancy-index arrays for the first two axes. Further, the arrays for each axis need to be broadcastable to each other. Fancy indexing will iterate over these broadcasted arrays in parallel with each other to form the new array. The liberal application of numpy.newaxis will help us achieve that. So this is the complete recipe: In [29]: import numpy In [30]: ni, nj, nk = (10, 15, 20) # Make a fake data cube such that cube[i,j,k] == k for all i,j,k. In [31]: cube = numpy.empty((ni,nj,nk), dtype=int) In [32]: cube[:,:,:] = numpy.arange(nk)[numpy.newaxis,numpy.newaxis,:] # Pick out a random fake horizon in k. In [34]: kslice = numpy.random.randint(5, 15, size=(ni, nj)) In [35]: kslice Out[35]: array([[13, 14, 9, 12, 12, 11, 8, 14, 11, 13, 13, 13, 8, 11, 8], [ 7, 12, 12, 6, 10, 12, 9, 11, 13, 9, 14, 11, 5, 12, 12], [ 7, 5, 10, 9, 6, 5, 5, 14, 5, 6, 7, 10, 6, 10, 11], [ 6, 9, 11, 14, 7, 11, 10, 6, 6, 9, 9, 11, 5, 5, 14], [12, 8, 11, 6, 10, 8, 5, 9, 8, 10, 7, 5, 9, 9, 14], [ 9, 8, 10, 9, 10, 12, 10, 10, 6, 10, 11, 6, 8, 7, 7], [11, 12, 7, 13, 5, 5, 8, 14, 5, 14, 9, 10, 12, 7, 14], [ 7, 7, 7, 12, 10, 6, 13, 13, 11, 13, 8, 11, 13, 14, 14], [ 6, 13, 13, 10, 10, 14, 10, 8, 9, 14, 13, 12, 9, 9, 5], [13, 14, 10, 8, 11, 11, 10, 6, 12, 11, 12, 12, 13, 11, 7]]) In [36]: half_width = 3 # These two replace the empty slices for the first two axes. 
In [37]: idx_i = numpy.arange(ni)[:,numpy.newaxis,numpy.newaxis]

In [38]: idx_j = numpy.arange(nj)[numpy.newaxis,:,numpy.newaxis]

# This is the substantive part that actually picks out our window.
In [41]: idx_k = kslice[:,:,numpy.newaxis] + numpy.arange(-half_width,half_width+1)

In [42]: smallcube = cube[idx_i,idx_j,idx_k]

In [43]: smallcube.shape
Out[43]: (10, 15, 7)

# Now verify that our window is centered on kslice everywhere:
In [47]: smallcube[:,:,3]
Out[47]:
array([[13, 14,  9, 12, 12, 11,  8, 14, 11, 13, 13, 13,  8, 11,  8],
       [ 7, 12, 12,  6, 10, 12,  9, 11, 13,  9, 14, 11,  5, 12, 12],
       [ 7,  5, 10,  9,  6,  5,  5, 14,  5,  6,  7, 10,  6, 10, 11],
       [ 6,  9, 11, 14,  7, 11, 10,  6,  6,  9,  9, 11,  5,  5, 14],
       [12,  8, 11,  6, 10,  8,  5,  9,  8, 10,  7,  5,  9,  9, 14],
       [ 9,  8, 10,  9, 10, 12, 10, 10,  6, 10, 11,  6,  8,  7,  7],
       [11, 12,  7, 13,  5,  5,  8, 14,  5, 14,  9, 10, 12,  7, 14],
       [ 7,  7,  7, 12, 10,  6, 13, 13, 11, 13,  8, 11, 13, 14, 14],
       [ 6, 13, 13, 10, 10, 14, 10,  8,  9, 14, 13, 12,  9,  9,  5],
       [13, 14, 10,  8, 11, 11, 10,  6, 12, 11, 12, 12, 13, 11,  7]])

In [50]: (smallcube[:,:,3] == kslice).all()
Out[50]: True

Clear as mud? I can go into more detail if you like, particularly about how newaxis works. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From fperez.net at gmail.com Wed Jul 16 19:32:08 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 16 Jul 2008 16:32:08 -0700 Subject: [Numpy-discussion] Numpy Advanced Indexing Question In-Reply-To: <3d375d730807161608t6b7cac4m8af5f7ba3eaea648@mail.gmail.com> References: <3d375d730807161455m76e73865i657b6b9865179a1f@mail.gmail.com> <3d375d730807161608t6b7cac4m8af5f7ba3eaea648@mail.gmail.com> Message-ID: On Wed, Jul 16, 2008 at 4:08 PM, Robert Kern wrote: > Ah, okay. It's a bit tricky, though. Yes, you need to use fancy > indexing. Since the axis you want to index fancily is not the first > one, you have to be more explicit than you might otherwise want. For > example, it would be great if you could just use slices for the first > two axes: +1 for TOW (Tip of the Week). I'm sure some helpful soul will be kind enough to put this in one of the example/tutorial/cookbook pages. Cheers, f From mattknox.ca at gmail.com Wed Jul 16 19:48:33 2008 From: mattknox.ca at gmail.com (Matt Knox) Date: Wed, 16 Jul 2008 23:48:33 +0000 (UTC) Subject: [Numpy-discussion] NumPy date/time types and the resolution concept References: <200807141507.47484.faltet@pytables.org> <200807151330.10394.faltet@pytables.org> <200807161823.12148.faltet@pytables.org> Message-ID: > Maybe you are right, but by providing many resolutions we are trying to > cope with the needs of people that are using them a lot. In > particular, we are hoping that the authors of the timeseries scikit can > find in these new dtypes a fair replacement for their Date class (our > proposal will not be so featured, but...). I think a basic date/time dtype for numpy would be a nice addition for general usage. Now, as for the timeseries module using this dtype for most of the date-fu that goes on... that would be a bit more challenging. Unless all of the frequencies/resolutions currently supported in the timeseries scikit are supported by the new dtype, it is unlikely we would be able to replace our implementation.
In particular, business day frequency (Monday - Friday) is of central importance for working with financial time series (which was my motivation for the original prototype of the module). But using plain integers for the DateArray class actually seems to work pretty well, and I'm not sure a whole lot would be gained by using a date dtype. That being said, if someone creates a fork of the timeseries module using a new date dtype at its core and it works amazingly well, then I'd probably get on board. I just think that may be difficult to do with a general purpose date dtype suitable for inclusion in the numpy core. - Matt From fperez.net at gmail.com Wed Jul 16 21:42:33 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 16 Jul 2008 18:42:33 -0700 Subject: [Numpy-discussion] Monkeypatching vs nose plugin? Message-ID: Howdy, In working on the ipython testing machinery, I looked at the numpy nosetester.py file and found that it works by monkeypatching nose itself. I'm curious as to why this approach was taken rather than constructing a plugin object. In general, monkeypatching should be done as a last-resort trick, because it tends to be brittle and can cause bizarre problems for users who, after running numpy.test(), find that their normal nose-using code starts doing funny things. Any thoughts/insights? Cheers, f From robert.kern at gmail.com Wed Jul 16 22:00:48 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 16 Jul 2008 21:00:48 -0500 Subject: [Numpy-discussion] Monkeypatching vs nose plugin? In-Reply-To: References: Message-ID: <3d375d730807161900hce08560h6110a859302fb063@mail.gmail.com> On Wed, Jul 16, 2008 at 20:42, Fernando Perez wrote: > Howdy, > > In working on the ipython testing machinery, I looked at the numpy > nosetester.py file and found that it works by monkeypatching nose > itself. I'm curious as to why this approach was taken rather than > constructing a plugin object. In general, monkeypatching should be > done as a last-resort trick, because it tends to be brittle and can > cause bizarre problems for users who, after running numpy.test(), find > that their normal nose-using code starts doing funny things. > > Any thoughts/insights? Is there a way to do it programmatically without requiring numpy to be installed with setuptools? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From fperez.net at gmail.com Wed Jul 16 22:44:42 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 16 Jul 2008 19:44:42 -0700 Subject: [Numpy-discussion] Monkeypatching vs nose plugin? In-Reply-To: <3d375d730807161900hce08560h6110a859302fb063@mail.gmail.com> References: <3d375d730807161900hce08560h6110a859302fb063@mail.gmail.com> Message-ID: On Wed, Jul 16, 2008 at 7:00 PM, Robert Kern wrote: > Is there a way to do it programmatically without requiring numpy to be > installed with setuptools? I think so, though I'm not 100% certain because I haven't finished the ipython work. So far what I have for ip is all nose mods done as a nose plugin. Right now that plugin needs to be installed as a true plugin (i.e. via setuptools), but IPython does NOT need to be installed via setuptools. What I need to add is a way to run the testing via a python script (right now I use the command line, hence the requirement for the plugin to be really available to nose) that would correctly load and configure everything needed.
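In rough, untested sketch form, the script would boil down to something like this (the plugin class name is a placeholder for what I posted, and the exact nose keyword arguments still need double-checking):

import sys
import nose
from IPython.testing.plugin import IPythonDoctest  # placeholder name

def main():
    argv = ['nosetests', '--with-ipdoctest'] + sys.argv[1:]
    # IIRC, passing plugins= replaces nose's default plugin set, so the
    # built-in plugins may need to be listed here alongside ours.
    nose.core.TestProgram(argv=argv, plugins=[IPythonDoctest()])

if __name__ == '__main__':
    main()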
I think under this scenario, it should be possible to load this plugin from a private package (IPython.testing.plugin) instead of the nose namespace, but that's the part I have yet to confirm with an actual implementation. Cheers, f From alan.mcintyre at gmail.com Wed Jul 16 23:21:00 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 16 Jul 2008 23:21:00 -0400 Subject: [Numpy-discussion] Monkeypatching vs nose plugin? In-Reply-To: <3d375d730807161900hce08560h6110a859302fb063@mail.gmail.com> References: <3d375d730807161900hce08560h6110a859302fb063@mail.gmail.com> Message-ID: <1d36917a0807162021k79f59bbg4051803103d317fb@mail.gmail.com> On Wed, Jul 16, 2008 at 10:00 PM, Robert Kern wrote: > Is there a way to do it programatically without requiring numpy to be > installed with setuptools? There is; you have to pass a list of plugin instances to the constructor of TestProgram--all plugins that you might want to use, even the builtin ones. (As far as I know, that is.) The monkeypatching approach was the first one that I could make to work with the least amount of hassle, but it's definitely not the best way. I only had to monkeypatch a couple of things at first, but as I figured out what the test framework needed to do, it just got worse, so I was beginning to get uncomfortable with it myself. (Honest! :) Once the NumPy and SciPy test suites are mostly fixed up to work under the current rules, I'll go back and use a method that doesn't require monkeypatching. It shouldn't have any effect on the public interface or the tests themselves. Since we're discussing this sort of thing, there's something I've been meaning to ask anyway: do we really need to allow end users to pass in arbitrary extra arguments to nose (via the extra_argv in test())? This seems to lock us in to having a mostly unobstructed path from test() through to an uncustomized nose backend. From robert.kern at gmail.com Wed Jul 16 23:32:14 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 16 Jul 2008 22:32:14 -0500 Subject: [Numpy-discussion] Monkeypatching vs nose plugin? In-Reply-To: <1d36917a0807162021k79f59bbg4051803103d317fb@mail.gmail.com> References: <3d375d730807161900hce08560h6110a859302fb063@mail.gmail.com> <1d36917a0807162021k79f59bbg4051803103d317fb@mail.gmail.com> Message-ID: <3d375d730807162032n4c795d9ckf34011ce9d4e240d@mail.gmail.com> On Wed, Jul 16, 2008 at 22:21, Alan McIntyre wrote: > On Wed, Jul 16, 2008 at 10:00 PM, Robert Kern wrote: >> Is there a way to do it programatically without requiring numpy to be >> installed with setuptools? > > There is; you have to pass a list of plugin instances to the > constructor of TestProgram--all plugins that you might want to use, > even the builtin ones. (As far as I know, that is.) > > The monkeypatching approach was the first one that I could make to > work with the least amount of hassle, but it's definitely not the best > way. I only had to monkeypatch a couple of things at first, but as I > figured out what the test framework needed to do, it just got worse, > so I was beginning to get uncomfortable with it myself. (Honest! :) > Once the NumPy and SciPy test suites are mostly fixed up to work under > the current rules, I'll go back and use a method that doesn't require > monkeypatching. It shouldn't have any effect on the public interface > or the tests themselves. Sounds good. 
> Since we're discussing this sort of thing, there's something I've been > meaning to ask anyway: do we really need to allow end users to pass in > arbitrary extra arguments to nose (via the extra_argv in test())? > This seems to lock us in to having a mostly unobstructed path from > test() through to an uncustomized nose backend. At least with other projects, I occasionally want to do things like run with --pdb-failure or --detailed-errors, etc. What exactly is extra_argv blocking? My preference, actually, is for the nosetests command to be able to run our tests correctly if at all possible. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From alan.mcintyre at gmail.com Wed Jul 16 23:52:43 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 16 Jul 2008 23:52:43 -0400 Subject: [Numpy-discussion] Monkeypatching vs nose plugin? In-Reply-To: <3d375d730807162032n4c795d9ckf34011ce9d4e240d@mail.gmail.com> References: <3d375d730807161900hce08560h6110a859302fb063@mail.gmail.com> <1d36917a0807162021k79f59bbg4051803103d317fb@mail.gmail.com> <3d375d730807162032n4c795d9ckf34011ce9d4e240d@mail.gmail.com> Message-ID: <1d36917a0807162052t2f19f127pafb9377048d4e23b@mail.gmail.com> On Wed, Jul 16, 2008 at 11:32 PM, Robert Kern wrote: >> Since we're discussing this sort of thing, there's something I've been >> meaning to ask anyway: do we really need to allow end users to pass in >> arbitrary extra arguments to nose (via the extra_argv in test())? >> This seems to lock us in to having a mostly unobstructed path from >> test() through to an uncustomized nose backend. > > At least with other projects, I occasionally want to do things like > run with --pdb-failure or --detailed-errors, etc. What exactly is > extra_argv blocking? It's not blocking anything; it just feels wrong for some reason. Probably because I've been duck-punching nose and doctest to death to make them act the way I want, and I can't fit all the doctest/nose/unittest behavior in my head all at once to comfortably say that any of those other options will still work correctly. ;) It's probably just a pointless worry that will be moot after all the monkeypatching is removed, since the underlying test libraries will be in an unaltered state. > My preference, actually, is for the nosetests > command to be able to run our tests correctly if at all possible. The unit tests will run just fine via nosetests, but the doctests generally will not, because of the limited execution context NoseTester now enforces on them. From robert.kern at gmail.com Thu Jul 17 00:07:07 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 16 Jul 2008 23:07:07 -0500 Subject: [Numpy-discussion] Monkeypatching vs nose plugin? 
In-Reply-To: <1d36917a0807162052t2f19f127pafb9377048d4e23b@mail.gmail.com> References: <3d375d730807161900hce08560h6110a859302fb063@mail.gmail.com> <1d36917a0807162021k79f59bbg4051803103d317fb@mail.gmail.com> <3d375d730807162032n4c795d9ckf34011ce9d4e240d@mail.gmail.com> <1d36917a0807162052t2f19f127pafb9377048d4e23b@mail.gmail.com> Message-ID: <3d375d730807162107h7ab24fa1r334423c19fd864e8@mail.gmail.com> On Wed, Jul 16, 2008 at 22:52, Alan McIntyre wrote: > On Wed, Jul 16, 2008 at 11:32 PM, Robert Kern wrote: >>> Since we're discussing this sort of thing, there's something I've been >>> meaning to ask anyway: do we really need to allow end users to pass in >>> arbitrary extra arguments to nose (via the extra_argv in test())? >>> This seems to lock us in to having a mostly unobstructed path from >>> test() through to an uncustomized nose backend. >> >> At least with other projects, I occasionally want to do things like >> run with --pdb-failure or --detailed-errors, etc. What exactly is >> extra_argv blocking? > > It's not blocking anything; it just feels wrong for some reason. > Probably because I've been duck-punching nose and doctest to death to > make them act the way I want, and I can't fit all the > doctest/nose/unittest behavior in my head all at once to comfortably > say that any of those other options will still work correctly. ;) > > It's probably just a pointless worry that will be moot after all the > monkeypatching is removed, since the underlying test libraries will be > in an unaltered state. That's what I expect. >> My preference, actually, is for the nosetests >> command to be able to run our tests correctly if at all possible. > > The unit tests will run just fine via nosetests, but the doctests > generally will not, because of the limited execution context > NoseTester now enforces on them. Personally, I could live with that. I don't see the extra options as very useful for testing examples. However, I would prefer to leave the capability there until a concrete practical problem arises. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From fperez.net at gmail.com Thu Jul 17 00:14:33 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 16 Jul 2008 21:14:33 -0700 Subject: [Numpy-discussion] Monkeypatching vs nose plugin? In-Reply-To: <1d36917a0807162021k79f59bbg4051803103d317fb@mail.gmail.com> References: <3d375d730807161900hce08560h6110a859302fb063@mail.gmail.com> <1d36917a0807162021k79f59bbg4051803103d317fb@mail.gmail.com> Message-ID: On Wed, Jul 16, 2008 at 8:21 PM, Alan McIntyre wrote: > The monkeypatching approach was the first one that I could make to > work with the least amount of hassle, but it's definitely not the best > way. I only had to monkeypatch a couple of things at first, but as I > figured out what the test framework needed to do, it just got worse, > so I was beginning to get uncomfortable with it myself. (Honest! :) > Once the NumPy and SciPy test suites are mostly fixed up to work under > the current rules, I'll go back and use a method that doesn't require > monkeypatching. It shouldn't have any effect on the public interface > or the tests themselves. Great. As I mentioned, the code I sent a few days ago already has a non-monkeypatching plugin you can use as a starting point. 
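For reference, the skeleton of that plugin is roughly the following (a simplified sketch; the class and option names are just illustrative):

from nose.plugins import Plugin

class ExtensionDoctest(Plugin):
    # nose will expose this as --with-extensiondoctest
    name = 'extensiondoctest'

    def options(self, parser, env):
        Plugin.options(self, parser, env)

    def configure(self, options, config):
        Plugin.configure(self, options, config)

    def loadTestsFromModule(self, module):
        # yield customized DocTestCase instances from here, instead of
        # monkeypatching nose's own loader
        return

The point is that all the customization lives in overridable hooks, so nothing in nose itself gets touched.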
> Since we're discussing this sort of thing, there's something I've been > meaning to ask anyway: do we really need to allow end users to pass in > arbitrary extra arguments to nose (via the extra_argv in test())? > This seems to lock us in to having a mostly unobstructed path from > test() through to an uncustomized nose backend. As RK said, it's often handy to be able to run the tests from the plain command line to tweak nose's behavior. What I plan for ipython is to have a little script (installed to $PREFIX/bin) that loads all the necessary machinery but otherwise works like 'nosetests --necessary-options'. This will let us run the ipython tests from the command line (adding additional nose flags as required) but without users needing to install the plugin. I think it's perfectly doable, I just haven't finished it. If you don't like the script option for numpy, then users would need to have the plugin installed when working at the command line (though a python numpy.test() call wouldn't need that, since it can do the plugin loading itself). Cheers, f From alan.mcintyre at gmail.com Thu Jul 17 00:37:47 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Thu, 17 Jul 2008 00:37:47 -0400 Subject: [Numpy-discussion] Python version support for NumPy 1.2 Message-ID: <1d36917a0807162137g6de1f81aw1cb7773510c7a349@mail.gmail.com> Which versions of Python are to be officially supported by NumPy 1.2? I've been working against 2.5 and testing against 2.4 occasionally, but 2.3 still has some issues I need to address (or at least that was the case the last time I checked). From charlesr.harris at gmail.com Thu Jul 17 00:52:40 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 16 Jul 2008 22:52:40 -0600 Subject: [Numpy-discussion] Python version support for NumPy 1.2 In-Reply-To: <1d36917a0807162137g6de1f81aw1cb7773510c7a349@mail.gmail.com> References: <1d36917a0807162137g6de1f81aw1cb7773510c7a349@mail.gmail.com> Message-ID: On Wed, Jul 16, 2008 at 10:37 PM, Alan McIntyre wrote: > Which versions of Python are to be officially supported by NumPy 1.2? > I've been working against 2.5 and testing against 2.4 occasionally, > but 2.3 still has some issues I need to address (or at least that was > the case the last time I checked). You don't need to worry about 2.3. So 2.4 and 2.5. Python 2.6 is scheduled for release in October and might cause some problems. I have 2.6 installed and can maybe do some testing if you need it. Chuck From faltet at pytables.org Thu Jul 17 03:42:32 2008 From: faltet at pytables.org (Francesc Alted) Date: Thu, 17 Jul 2008 09:42:32 +0200 Subject: [Numpy-discussion] NumPy date/time types and the resolution concept In-Reply-To: <200807151136.17061.pgmdevlist@gmail.com> References: <200807141507.47484.faltet@pytables.org> <200807151330.10394.faltet@pytables.org> <200807151136.17061.pgmdevlist@gmail.com> Message-ID: <200807170942.32721.faltet@pytables.org> On Tuesday 15 July 2008, Pierre GM wrote: > On Tuesday 15 July 2008 07:30:09 Francesc Alted wrote: > > Maybe it is only that. But by using the term 'frequency' I tend to > > think that you are expecting to have one entry (observation) in > > your array for each time 'tick' since the start time. OTOH, the term > > 'resolution' doesn't have this implication, and only states the > > precision of the timestamp. > > OK, now I get it.
> > I don't know whether my impression is true or not, but after > reading about your TimeSeries package, I'm still thinking that this > expectation of one observation per 'tick' was what drove you to > choose the 'frequency' name. > Well, we do require a "one point per tick" for some operations, such > as conversion from one frequency to another, but only for TimeSeries. > A DateArray doesn't have to be regularly spaced. Ok, I see. So, it is just the 'frequency' keyword that was misleading me. Thanks for the clarification. Cheers, -- Francesc Alted From faltet at pytables.org Thu Jul 17 04:10:13 2008 From: faltet at pytables.org (Francesc Alted) Date: Thu, 17 Jul 2008 10:10:13 +0200 Subject: [Numpy-discussion] NumPy date/time types and the resolution concept In-Reply-To: References: <200807141507.47484.faltet@pytables.org> <200807161823.12148.faltet@pytables.org> Message-ID: <200807171010.13419.faltet@pytables.org> On Thursday 17 July 2008, Matt Knox wrote: > > Maybe you are right, but by providing many resolutions we are > > trying to cope with the needs of people that are using them a lot. > > In particular, we are hoping that the authors of the timeseries > > scikit can find in these new dtypes a fair replacement for their Date > > class (our proposal will not be so featured, but...). > > I think a basic date/time dtype for numpy would be a nice addition > for general usage. > > Now, as for the timeseries module using this dtype for most of the > date-fu that goes on... that would be a bit more challenging. Unless > all of the frequencies/resolutions currently supported in the > timeseries scikit are supported by the new dtype, it is unlikely we > would be able to replace our implementation. In particular, business > day frequency (Monday - Friday) is of central importance for working > with financial time series (which was my motivation for the original > prototype of the module). But using plain integers for the DateArray > class actually seems to work pretty well, and I'm not sure a whole lot > would be gained by using a date dtype. Yeah, the business week. We've pondered including this, but we are not sure about the difference between such a thing and a calendar week in terms of a time unit. I can certainly see its merits for the TimeSeries module, but I'm afraid that it would be nonsense in the context of a general date/time dtype. Now that I think about it, maybe we should revise our initial intention of adding a quarter too, because ISO 8601 does not offer a way to print it nicely. We could also opt to extend the ISO 8601 representation in order to allow the following sort of string representation:

In [35]: array([70, 72, 19], 'datetime64[Q]')
Out[35]: array([1988Q2, 1988Q4, 1975Q3], dtype="datetime64[Q]")

but I don't know if this would unnecessarily complicate things (apart from representing a departure from standards :-/). > That being said, if someone creates a fork of the timeseries module > using a new date dtype at its core and it works amazingly well, then > I'd probably get on board. I just think that may be difficult to do > with a general purpose date dtype suitable for inclusion in the numpy > core. Yeah, I understand your reasons. In fact, it is a pity that your requirements diverge in some key points from our proposal for the general dtypes.
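Coming back to the quarter representation for a moment: the extended string form could be derived along these lines. This is only a sketch, and the 1970Q1 origin is an assumption on my part, so the exact values need not match the illustrative output above:

def quarter_repr(q):
    # q: number of quarters elapsed since the epoch (assumed 1970Q1 here)
    year, quarter = divmod(int(q), 4)
    return "%dQ%d" % (1970 + year, quarter + 1)

print quarter_repr(70)   # gives '1987Q3' under this particular convention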
I have had a look at how you have integrated recarrays in your TimeSeries module, and I'm sure that by choosing a date/time dtype you would be able to reduce the complexity (and improve the efficiency) of your code quite a bit. Cheers, -- Francesc Alted From pav at iki.fi Thu Jul 17 04:14:25 2008 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 17 Jul 2008 08:14:25 +0000 (UTC) Subject: [Numpy-discussion] Ticket #837 References: Message-ID: Wed, 16 Jul 2008 15:43:00 -0600, Charles R Harris wrote: > On Wed, Jul 16, 2008 at 3:05 PM, Pauli Virtanen wrote: > > >> http://scipy.org/scipy/numpy/ticket/837 >> >> Infinite loop in fromfile and fromstring with sep=' ' and malformed >> input. >> >> I committed a fix to trunk. Does this need a 1.1.1 backport? >> >> > Yes, I think so. TIA, Done, r5444. Pauli From stefan at sun.ac.za Thu Jul 17 04:16:51 2008 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Thu, 17 Jul 2008 10:16:51 +0200 Subject: [Numpy-discussion] Numpy Advanced Indexing Question In-Reply-To: <3d375d730807161608t6b7cac4m8af5f7ba3eaea648@mail.gmail.com> References: <3d375d730807161455m76e73865i657b6b9865179a1f@mail.gmail.com> <3d375d730807161608t6b7cac4m8af5f7ba3eaea648@mail.gmail.com> Message-ID: <9457e7c80807170116i24a4eecay8b8ed6ee05d5d9ca@mail.gmail.com> Hi Robert 2008/7/17 Robert Kern : > In [42]: smallcube = cube[idx_i,idx_j,idx_k] Fantastic -- a good way to warm up the brain-circuit in the morning! Is there an easy-to-remember rule that predicts the output shape of the operation above? I'm trying to imagine how the output would change if I altered the dimensions of idx_i or idx_j, but it's hard. It looks like you can do all sorts of interesting things by manipulating the indices. For example, if I take In [137]: x = np.arange(12).reshape((3,4)) I can produce either In [138]: x[np.array([[0,1]]), np.array([[1, 2]])] Out[138]: array([[1, 6]]) or In [140]: x[np.array([[0],[1]]), np.array([[1], [2]])] Out[140]: array([[1], [6]]) and even In [141]: x[np.array([[0],[1]]), np.array([[1, 2]])] Out[141]: array([[1, 2], [5, 6]]) or its transpose In [143]: x[np.array([[0,1]]), np.array([[1], [2]])] Out[143]: array([[1, 5], [2, 6]]) Is it possible to separate the indexing in order to understand it better? My thinking was cube_i = cube[idx_i,:,:].squeeze() cube_j = cube_i[:,idx_j,:].squeeze() cube_k = cube_j[:,:,idx_k].squeeze() Not sure what would happen if the original array had singleton dimensions, though. Back to the original problem: In [127]: idx_i.shape Out[127]: (10, 1, 1) In [128]: idx_j.shape Out[128]: (1, 15, 1) In [129]: idx_k.shape Out[129]: (10, 15, 7) For the constant slice case, I guess idx_k could also have been (1, 1, 7)? The construction of the cube could probably be done using only cube.flat = np.arange(nk) Fernando is right: this is good food for thought and excellent cookbook material! Regards Stéfan From pav at iki.fi Thu Jul 17 04:19:32 2008 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 17 Jul 2008 08:19:32 +0000 (UTC) Subject: [Numpy-discussion] Buildbot failures since r5443 Message-ID: Hi, Since r5443 the Sparc buildbots show a "Bus error" in the test phase: http://buildbot.scipy.org/builders/Linux_SPARC_64_Debian/builds/102/steps/shell_2/logs/stdio while the one on FreeBSD-64 passes.
-- Pauli Virtanen From fperez.net at gmail.com Thu Jul 17 04:25:25 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 17 Jul 2008 01:25:25 -0700 Subject: [Numpy-discussion] Testing - heads up with #random Message-ID: Hi Alan, I was trying to reuse your #random checker for ipython but kept running into problems. Is it working for you in numpy in actual code? Because in the entire SVN tree I only see it mentioned here:

maqroll[numpy]> grin #random
./numpy/testing/nosetester.py:
   43 : if "#random" in want:
   67 : # "#random" directive to allow executing a command while ignoring its
  375 : # try the #random directive on the output line
  379 : #random: may vary on your system
maqroll[numpy]>

I'm asking because I suspect it is NOT working for numpy. The reason is some really nasty, silent exception trapping being done by nose. In nose's loadTestsFromModule, which you've overridden to include:

yield NumpyDocTestCase(test, optionflags=optionflags,
                       checker=NumpyDoctestOutputChecker())

it's likely that this line can raise an exception (at least it was doing so for me in ipython, because this class inherits from npd but tries to call __init__ from doctest.DocTestCase directly). Unfortunately, nose will silently swallow *any* exception there, simply ignoring your tests and not even telling you what happened. Very, very annoying. You can see if you have an exception by doing something like

try:
    dt = DocTestCase(test, optionflags=optionflags, checker=checker)
except:
    from IPython import ultraTB
    ultraTB.AutoFormattedTB()()
yield dt

to force traceback printing. Anyway, I mention this because I just wasted a good chunk of time fighting this one for ipython, where I need the #random functionality. It seems it's not used in numpy yet, but I imagine it soon will be, and I figured I'd save you some time. Cheers, f From stefan at sun.ac.za Thu Jul 17 04:31:20 2008 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Thu, 17 Jul 2008 10:31:20 +0200 Subject: [Numpy-discussion] Masked arrays and pickle/unpickle In-Reply-To: <7EFBEC7FA86C1141B59B59EEAEE3294F5A37BB@EMAIL2.exchange.electric.net> References: <7EFBEC7FA86C1141B59B59EEAEE3294F5A37BB@EMAIL2.exchange.electric.net> Message-ID: <9457e7c80807170131m76152773md4518f118f37dfd1@mail.gmail.com> Hi Anthony 2008/7/16 Anthony Floyd : > Unfortunately, when we try to unpickle the data saved with Numpy 1.0.3 > in the new code using Numpy 1.1.0, it chokes because it can't import > numpy.core.ma for the masked arrays. A check of Numpy 1.1.0 shows that > this is now numpy.ma.core. The maskedarray functionality has been rewritten, and is now `numpy.ma`. For the time being, the old package is still available as `numpy.oldnumeric.ma`. Regards Stéfan From robert.kern at gmail.com Thu Jul 17 04:39:46 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 17 Jul 2008 03:39:46 -0500 Subject: [Numpy-discussion] Numpy Advanced Indexing Question In-Reply-To: <9457e7c80807170116i24a4eecay8b8ed6ee05d5d9ca@mail.gmail.com> References: <3d375d730807161455m76e73865i657b6b9865179a1f@mail.gmail.com> <3d375d730807161608t6b7cac4m8af5f7ba3eaea648@mail.gmail.com> <9457e7c80807170116i24a4eecay8b8ed6ee05d5d9ca@mail.gmail.com> Message-ID: <3d375d730807170139g10491683u238ecb066cea5875@mail.gmail.com> On Thu, Jul 17, 2008 at 03:16, Stéfan van der Walt wrote: > Hi Robert > > 2008/7/17 Robert Kern : >> In [42]: smallcube = cube[idx_i,idx_j,idx_k] > > Fantastic -- a good way to warm up the brain-circuit in the morning!
> Is there an easy-to-remember rule that predicts the output shape of > the operation above? I'm trying to imaging how the output would > change if I altered the dimensions of idx_i or idx_j, but it's hard. Like I said, they all get broadcasted against each other. The final output is the shape of the broadcasted index arrays and takes values found by iterating in parallel over those broadcasted index arrays. > It looks like you can do all sorts of interesting things by > manipulation the indices. For example, if I take > > In [137]: x = np.arange(12).reshape((3,4)) > > I can produce either > > In [138]: x[np.array([[0,1]]), np.array([[1, 2]])] > Out[138]: array([[1, 6]]) > > or > > In [140]: x[np.array([[0],[1]]), np.array([[1], [2]])] > Out[140]: > array([[1], > [6]]) > > and even > > In [141]: x[np.array([[0],[1]]), np.array([[1, 2]])] > Out[141]: > array([[1, 2], > [5, 6]]) > > or its transpose > > In [143]: x[np.array([[0,1]]), np.array([[1], [2]])] > Out[143]: > array([[1, 5], > [2, 6]]) > > Is it possible to separate the indexing in order to understand it > better? My thinking was > > cube_i = cube[idx_i,:,:].squeeze() > cube_j = cube_i[:,idx_j,:].squeeze() > cube_k = cube_j[:,:,idx_k].squeeze() > > Not sure what would happen if the original array had single dimensions, though. You'd have a problem. So the way fancy indexing interacts with slices is a bit tricky, and this is why we couldn't use the nicer syntax of cube[:,:,idx_k]. All axes with fancy indices are collected together. Their index arrays are broadcasted and iterated over. *For each iterate*, all of the slices are collected, and those sliced axes are *added* to the output array. If you had used fancy indexing on all of the axes, then the iterate would be a scalar value pulled from the original array. If you mix fancy indexing and slices, the iterate is the *array* formed by the remaining slices. So if idx_k is shaped (ni,nj,3), for example, cube[:,:,idx_k] will have the shape (ni,nj,ni,nj,3). So smallcube[:,:,i,j,k]==cube[:,:,idx_k[i,j,k]]. Is that clear, or am I obfuscating the subject more? > Back to the original problem: > > In [127]: idx_i.shape > Out[127]: (10, 1, 1) > > In [128]: idx_j.shape > Out[128]: (1, 15, 1) > > In [129]: idx_k.shape > Out[129]: (10, 15, 7) > > For the constant slice case, I guess idx_k also have been (1, 1, 7)? > > The construction of the cube could probably be done using only > > cube.flat = np.arange(nk) Yes, but only due to a weird feature of assigning to .flat. If the RHS is too short, it gets repeated. Since the last axis is contiguous, repeating arange(nk) happily coincides with the desired result of cube[i,j] == arange(nk) for all i,j. This won't check the size, though. If I give it cube.flat=np.arange(nk+1), it will repeat that array just fine, although it doesn't line up. cube[:,:,:]=np.arange(nk), on the other hand broadcasts the RHS to the shape of cube, then does the assignment. If the RHS cannot be broadcasted to the right shape (in this case because it is not the same length as the final axis of the LHS), an error is raised. I find the reuse of the broadcasting concept to be more memorable, and robust over the (mostly) ad hoc use of plain repetition with .flat. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From robert.kern at gmail.com Thu Jul 17 04:51:55 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 17 Jul 2008 03:51:55 -0500 Subject: [Numpy-discussion] Buildbot failures since r5443 In-Reply-To: References: Message-ID: <3d375d730807170151n48ce2633ye065ae4a36d3a4a1@mail.gmail.com> On Thu, Jul 17, 2008 at 03:19, Pauli Virtanen wrote: > Hi, > > Since r5443 the Sparc buildbots show a "Bus error" in the test phase: > > http://buildbot.scipy.org/builders/Linux_SPARC_64_Debian/ > builds/102/steps/shell_2/logs/stdio > > while the one on FreeBSD-64 passes. In the test that's failing (test_filled_w_flexible_dtype), a structured array with a dtype of [('i',int), ('s','|S3'), ('f',float)] is created. I'm guessing that the final C double in that record is not getting aligned properly. On that architecture, I'm willing to bet that doubles need to be aligned on a 4-byte or 8-byte boundary. Can someone on that architecture try this for me? In [4]: from numpy import dtype In [5]: dtype([('i',int), ('s','|S3'), ('f',float)]).fields.items() Out[5]: [('i', (dtype('int32'), 0)), ('s', (dtype('|S3'), 4)), ('f', (dtype('float64'), 7))] -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From stefan at sun.ac.za Thu Jul 17 05:00:05 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 17 Jul 2008 11:00:05 +0200 Subject: [Numpy-discussion] Numpy Advanced Indexing Question In-Reply-To: <3d375d730807170139g10491683u238ecb066cea5875@mail.gmail.com> References: <3d375d730807161455m76e73865i657b6b9865179a1f@mail.gmail.com> <3d375d730807161608t6b7cac4m8af5f7ba3eaea648@mail.gmail.com> <9457e7c80807170116i24a4eecay8b8ed6ee05d5d9ca@mail.gmail.com> <3d375d730807170139g10491683u238ecb066cea5875@mail.gmail.com> Message-ID: <9457e7c80807170200h2e9f4127hc01301ae02569ce3@mail.gmail.com> 2008/7/17 Robert Kern : > So the way fancy indexing interacts with slices is a bit tricky, and > this is why we couldn't use the nicer syntax of cube[:,:,idx_k]. All > axes with fancy indices are collected together. Their index arrays are > broadcasted and iterated over. *For each iterate*, all of the slices > are collected, and those sliced axes are *added* to the output array. > If you had used fancy indexing on all of the axes, then the iterate > would be a scalar value pulled from the original array. If you mix > fancy indexing and slices, the iterate is the *array* formed by the > remaining slices. > > So if idx_k is shaped (ni,nj,3), for example, cube[:,:,idx_k] will > have the shape (ni,nj,ni,nj,3). So > smallcube[:,:,i,j,k]==cube[:,:,idx_k[i,j,k]]. > > Is that clear, or am I obfuscating the subject more? Crystal, thank you for taking the time to explain! This is such valuable information; we should consider adding a section to numpy.doc.indexing or wherever is more appropriate. >> For the constant slice case, I guess idx_k also have been (1, 1, 7)? >> >> The construction of the cube could probably be done using only >> >> cube.flat = np.arange(nk) [...] > If the RHS cannot be > broadcasted to the right shape (in this case because it is not the > same length as the final axis of the LHS), an error is raised. I find > the reuse of the broadcasting concept to be more memorable, and robust > over the (mostly) ad hoc use of plain repetition with .flat. 
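To make the padding concrete, you can ask the compiler for its own layout of the equivalent record via ctypes (a quick sketch; the exact figures are ABI-dependent):

import ctypes

class Rec(ctypes.Structure):
    # native alignment rules, i.e. what the C compiler would do
    _fields_ = [('i', ctypes.c_int),
                ('s', ctypes.c_char * 3),
                ('f', ctypes.c_double)]

print Rec.f.offset, ctypes.sizeof(Rec)   # typically 8 and 16, not 7 and 15

numpy's packed dtype puts the double at offset 7 instead, which is presumably exactly what the unaligned load is choking on.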
I've become used to exploiting the repeating property of .flat, and forgot its dangers. Thanks for the reminder! Cheers St?fan From robert.kern at gmail.com Thu Jul 17 05:13:50 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 17 Jul 2008 04:13:50 -0500 Subject: [Numpy-discussion] Buildbot failures since r5443 In-Reply-To: <3d375d730807170151n48ce2633ye065ae4a36d3a4a1@mail.gmail.com> References: <3d375d730807170151n48ce2633ye065ae4a36d3a4a1@mail.gmail.com> Message-ID: <3d375d730807170213s388a850fr6e9a6b37bbf67df3@mail.gmail.com> On Thu, Jul 17, 2008 at 03:51, Robert Kern wrote: > On Thu, Jul 17, 2008 at 03:19, Pauli Virtanen wrote: >> Hi, >> >> Since r5443 the Sparc buildbots show a "Bus error" in the test phase: >> >> http://buildbot.scipy.org/builders/Linux_SPARC_64_Debian/ >> builds/102/steps/shell_2/logs/stdio >> >> while the one on FreeBSD-64 passes. > > In the test that's failing (test_filled_w_flexible_dtype), a > structured array with a dtype of [('i',int), ('s','|S3'), ('f',float)] > is created. I'm guessing that the final C double in that record is not > getting aligned properly. On that architecture, I'm willing to bet > that doubles need to be aligned on a 4-byte or 8-byte boundary. I think this is the case. Changing the dtype to use |S8 fixes that test. I get another bus error where the same dtype is used. I've changed these over in r5445 and r5446. We'll see if the buildbots pass, but I suspect they will. I'm not sure where the real bug is, though. We'll need real access to such a machine to fix the problem, I suspect. Volunteers? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From drnlmuller+scipy at gmail.com Thu Jul 17 05:51:19 2008 From: drnlmuller+scipy at gmail.com (Neil Muller) Date: Thu, 17 Jul 2008 11:51:19 +0200 Subject: [Numpy-discussion] Buildbot failures since r5443 In-Reply-To: <3d375d730807170151n48ce2633ye065ae4a36d3a4a1@mail.gmail.com> References: <3d375d730807170151n48ce2633ye065ae4a36d3a4a1@mail.gmail.com> Message-ID: On Thu, Jul 17, 2008 at 10:51 AM, Robert Kern wrote: > On Thu, Jul 17, 2008 at 03:19, Pauli Virtanen wrote: >> Hi, >> >> Since r5443 the Sparc buildbots show a "Bus error" in the test phase: >> >> http://buildbot.scipy.org/builders/Linux_SPARC_64_Debian/ >> builds/102/steps/shell_2/logs/stdio >> >> while the one on FreeBSD-64 passes. > > In the test that's failing (test_filled_w_flexible_dtype), a > structured array with a dtype of [('i',int), ('s','|S3'), ('f',float)] > is created. I'm guessing that the final C double in that record is not > getting aligned properly. On that architecture, I'm willing to bet > that doubles need to be aligned on a 4-byte or 8-byte boundary. The Sparc ABI requires that doubles be aligned on a 4-byte boundary. However, gcc uses instructions which require 8-byte alignment of doubles on SPARC by default - there are a couple of flags which can be used to force 4-byte alignment, but that imposes a (usually significant) speed penalty. AFAIK, the Solaris compilers also require 8-byte alignment for doubles. 
> > In [4]: from numpy import dtype > > > > In [5]: dtype([('i',int), ('s','|S3'), ('f',float)]).fields.items() > > Out[5]: > > [('i', (dtype('int32'), 0)), > > ('s', (dtype('|S3'), 4)), > > ('f', (dtype('float64'), 7))] > > > >>> os.uname()[4] > 'sparc64' > >>> from numpy import dtype > >>> dtype([('i',int), ('s','|S3'), ('f',float)]).fields.items() > > [('i', (dtype('int32'), 0)), ('s', (dtype('|S3'), 4)), ('f', > (dtype('float64'), 7))] > I wonder what descr->alignment is for doubles on SPARC. Chuck From alan.mcintyre at gmail.com Thu Jul 17 08:50:28 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Thu, 17 Jul 2008 08:50:28 -0400 Subject: [Numpy-discussion] Testing - heads up with #random In-Reply-To: References: Message-ID: <1d36917a0807170550r6f8c08b8ka5d57f036f4ef10a@mail.gmail.com> On Thu, Jul 17, 2008 at 4:25 AM, Fernando Perez wrote: > I was trying to reuse your #random checker for ipython but kept > running into problems. Is it working for you in numpy in actual code? > Because in the entire SVN tree I only see it mentioned here: > > maqroll[numpy]> grin #random > ./numpy/testing/nosetester.py: > 43 : if "#random" in want: > 67 : # "#random" directive to allow executing a command while ignoring its > 375 : # try the #random directive on the output line > 379 : #random: may vary on your system > maqroll[numpy]> The second example is a doctest for the feature; for me it fails if #random is removed, and passes otherwise. > I'm asking because I suspect it is NOT working for numpy. The reason > is some really nasty, silent exception trapping being done by nose. > In nose's loadTestsFromModule, which you've overridden to include: Ah, thanks; I recall seeing a comment somewhere about nose swallowing exceptions in code under test, but I didn't know it would do things like that. > Unfortunately, nose will silently swallow *any* exception there, > simply ignoring your tests and not even telling you what happened. > Very, very annoying. You can see if you have an exception by doing > something like I added that to my local nosetester.py, but it didn't turn up any exceptions. I'll keep it in my working copy so I'm not as likely to miss some problem in the future. > Anyway, I mention this because I just wasted a good chunk of time > fighting this one for ipython, where I need the #random functionality. > It seems it's not used in numpy yet, but I imagine it soon will be, and > I figured I'd save you some time. Thanks :) From timmichelsen at gmx-topmail.de Thu Jul 17 09:15:15 2008 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Thu, 17 Jul 2008 13:15:15 +0000 (UTC) Subject: [Numpy-discussion] proposal: add a header and footer function to numpy.savetxt Message-ID: Hello, sometimes scripts and programs create a lot of data output. For the programmer, and also for others involved not in the scripting but in the evaluation of the output, it would be very nice if the output files could be prepended with a file header describing what is written in the columns below, and if a footer could be appended. A good example has been developed by the scipy.scikits.timeseries developers: http://scipy.org/scipy/scikits/wiki/TimeSeries#Parameters These formatting flags are a convenient way to save additional meta information, e.g. when it is important to state the physical units of the saved data. I would be happy if such a thing could be added to np.savetxt(). What is the current common way to save a header above the saved ascii array?
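The best workaround I have found so far is to open the file myself and write the header and footer around the savetxt call (a sketch of what I mean; as far as I know savetxt also accepts an already-open file object):

import numpy as np

data = np.random.rand(5, 3)            # placeholder data
fh = open('output.txt', 'w')
fh.write('# time [h]  irradiance [W/m^2]  temperature [degC]\n')
np.savetxt(fh, data, fmt='%12.6f')
fh.write('# end of data -- generated by my_script.py\n')   # footer
fh.close()

But proper header/footer keywords in np.savetxt() itself would make this much cleaner.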
> > > In [4]: from numpy import dtype > > > > In [5]: dtype([('i',int), ('s','|S3'), ('f',float)]).fields.items() > > Out[5]: > > [('i', (dtype('int32'), 0)), > > ('s', (dtype('|S3'), 4)), > > ('f', (dtype('float64'), 7))] > > > >>> os.uname()[4] > 'sparc64' > >>> from numpy import dtype > >>> dtype([('i',int), ('s','|S3'), ('f',float)]).fields.items() > > [('i', (dtype('int32'), 0)), ('s', (dtype('|S3'), 4)), ('f', > (dtype('float64'), 7))] > I wonder what descr->alignment is for doubles on SPARC. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.mcintyre at gmail.com Thu Jul 17 08:50:28 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Thu, 17 Jul 2008 08:50:28 -0400 Subject: [Numpy-discussion] Testing -heads up with #random In-Reply-To: References: Message-ID: <1d36917a0807170550r6f8c08b8ka5d57f036f4ef10a@mail.gmail.com> On Thu, Jul 17, 2008 at 4:25 AM, Fernando Perez wrote: > I was trying to reuse your #random checker for ipython but kept > running into problems. Is it working for you in numpy in actual code? > Because in the entire SVN tree I only see it mentioned here: > > maqroll[numpy]> grin #random > ./numpy/testing/nosetester.py: > 43 : if "#random" in want: > 67 : # "#random" directive to allow executing a command > while ignoring its > 375 : # try the #random directive on the output line > 379 : #random: may vary on your system > maqroll[numpy]> The second example is a doctest for the feature; for me it fails if #random is removed, and passes otherwise. > I'm asking because I suspect it is NOT working for numpy. The reason > is some really nasty, silent exception trapping being done by nose. > In nose's loadTestsFromModule, which you've overridden to include: Ah, thanks; I recall seeing a comment somewhere about nose swallowing exceptions in code under test, but I didn't know it would do things like that. > Unfortunately, nose will silently swallow *any* exception there, > simply ignoring your tests and not even telling you what happened. > Very, very annoying. You can see if you have an exception by doing > something like I added that to my local nosetester.py, but it didn't turn up any exceptions. I'll keep it in my working copy so I'm not as likely to miss some problem in the future. > Anyway, I mention this because I just wasted a good chunk of time > fighting this one for ipython, where I need the #random functionality. > It seems it's not used in numpy yet, but I imagine it will soon, and > I figured I'd save you some time. Thanks :) From timmichelsen at gmx-topmail.de Thu Jul 17 09:15:15 2008 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Thu, 17 Jul 2008 13:15:15 +0000 (UTC) Subject: [Numpy-discussion] proposal: add a header and footer function to numpy.savetxt Message-ID: Hello, sometime scripts and programs create a lot of data output. For the programmer and also others not involved in the scripting but in the evaluation of the output it would be very nice the output files could be prepended with a file header describing what is written in the columns below and to append a footer. A good example has been developed by the scipy.scikits.timeseries developers: http://scipy.org/scipy/scikits/wiki/TimeSeries#Parameters These formatting flags are a convenient way to save additional meta information. E. g. of it is important to state the physical units of the data saved. I would be happy if such a thing could be added to np.savetxt(). 
What is the current common way to save a header above the saved ascii array? Kind regaed, Timmie From stefan at sun.ac.za Thu Jul 17 09:35:36 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 17 Jul 2008 15:35:36 +0200 Subject: [Numpy-discussion] Documentation: glossary and refactoring Message-ID: <9457e7c80807170635m4cf98f84i5b19a20365be70cf@mail.gmail.com> Hi all, We recenty added a framework for topical documentation to NumPy, which is visible as `numpy.doc`. The content is here: http://sd-2116.dedibox.fr/pydocweb/doc/numpy.doc/ I think new users will find the glossary especially useful: http://sd-2116.dedibox.fr/pydocweb/doc/numpy.doc.reference.glossary/ For technical terms and explanation, we also have `numpy.doc.jargon` (which is still empty; but feel free to break the ice!). To those participating: if you find yourself explaining a common concept or term over and over, consider adding it to the glossary or jargon files instead. Which brings me to refactoring: as with code, we try to keep functionality documented in one place. Please *refer* to other docstrings, instead of copying their content. For example, `argsort` refers to `sort`, instead of re-explaining the different sorting methods. Contributors are now listed on the stats page: http://sd-2116.dedibox.fr/pydocweb/stats/ Again, thanks for all your help; you make a mammoth task look easy! Your docstrings will be included in the 1.2 NumPy release (the first release candidate which should appear around 4 August). Regards St?fan From anthony.floyd at convergent.ca Thu Jul 17 12:14:46 2008 From: anthony.floyd at convergent.ca (Anthony Floyd) Date: Thu, 17 Jul 2008 09:14:46 -0700 Subject: [Numpy-discussion] Masked arrays and pickle/unpickle References: <7EFBEC7FA86C1141B59B59EEAEE3294F5A37BB@EMAIL2.exchange.electric.net> <9457e7c80807170131m76152773md4518f118f37dfd1@mail.gmail.com> Message-ID: <7EFBEC7FA86C1141B59B59EEAEE3294F5A3974@EMAIL2.exchange.electric.net> Hi St?fan, > 2008/7/16 Anthony Floyd : > > Unfortunately, when we try to unpickle the data saved with > Numpy 1.0.3 > > in the new code using Numpy 1.1.0, it chokes because it can't import > > numpy.core.ma for the masked arrays. A check of Numpy > 1.1.0 shows that > > this is now numpy.ma.core. > > The maskedarray functionality has been rewritten, and is now > `numpy.ma`. For the time being, the old package is still available as > `numpy.oldnumeric.ma`. Yes, we're aware it's changed. The problem is that when pickle unpickles the data, it tries to assign the data back to its original class ... and the class type for masked arrays under 1.0.3 is numpy.core.ma.MaskedArray. This class type has changed in 1.1.0 to numpy.ma.core.MaskedArray. Since pickle can't find the old type, it fails to load the data. What I need to know is how I can trick pickle or Numpy to put the old class into the new class. The only thing we've come up with is to create our own numpy.core.ma.MaskedArray in 1.1.0 as a class that inherits numpy.ma.core.MaskedArray and doesn't make any changes to it. It's extremely surprising to find a significant API change like this in a stable package. Thanks, Anthony. -- Anthony Floyd, PhD Convergent Manufacturing Technologies Inc. 
6190 Agronomy Rd, Suite 403 Vancouver BC V6T 1Z3 CANADA Email: Anthony.Floyd at convergent.ca | Tel: 604-822-9682 x102 WWW: http://www.convergent.ca | Fax: 604-822-9659 CMT is hiring: See http://www.convergent.ca for details From stefan at sun.ac.za Thu Jul 17 12:54:10 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 17 Jul 2008 18:54:10 +0200 Subject: [Numpy-discussion] Masked arrays and pickle/unpickle In-Reply-To: <7EFBEC7FA86C1141B59B59EEAEE3294F5A3974@EMAIL2.exchange.electric.net> References: <7EFBEC7FA86C1141B59B59EEAEE3294F5A37BB@EMAIL2.exchange.electric.net> <9457e7c80807170131m76152773md4518f118f37dfd1@mail.gmail.com> <7EFBEC7FA86C1141B59B59EEAEE3294F5A3974@EMAIL2.exchange.electric.net> Message-ID: <9457e7c80807170954q49247e5cpd665db3c5d0b9a5c@mail.gmail.com> 2008/7/17 Anthony Floyd : > What I need to know is how I can trick pickle or Numpy to put the old class into the new class. If you have an example data-file, send it to me off-list and I'll figure out what to do. Maybe it is as simple as np.core.ma = np.oldnumeric.ma > It's extremely surprising to find a significant API change like this in a stable package. I don't know if renaming things in np.core counts as an API change. Pickling is notoriously unreliable for storing arrays, which is why Robert wrote `load` and `save`. I hope that Pierre can get around to implementing MaskedArray storage for 1.2. Otherwise, you can already save the array and mask separately. Regards St?fan From pgmdevlist at gmail.com Thu Jul 17 13:38:07 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 17 Jul 2008 13:38:07 -0400 Subject: [Numpy-discussion] Masked arrays and pickle/unpickle In-Reply-To: <9457e7c80807170954q49247e5cpd665db3c5d0b9a5c@mail.gmail.com> References: <7EFBEC7FA86C1141B59B59EEAEE3294F5A37BB@EMAIL2.exchange.electric.net> <7EFBEC7FA86C1141B59B59EEAEE3294F5A3974@EMAIL2.exchange.electric.net> <9457e7c80807170954q49247e5cpd665db3c5d0b9a5c@mail.gmail.com> Message-ID: <200807171338.07971.pgmdevlist@gmail.com> On Thursday 17 July 2008 12:54:10 St?fan van der Walt wrote: > I don't know if renaming things in np.core counts as an API change. > Pickling is notoriously unreliable for storing arrays, which is why > Robert wrote `load` and `save`. I hope that Pierre can get around to > implementing MaskedArray storage for 1.2. Wow, I'll see what I can do, but no promises. > Otherwise, you can already > save the array and mask separately. An other possibility is to store the MaskedArray as a record array, with one field for the data and one field for the mask. From anthony.floyd at convergent.ca Thu Jul 17 13:41:51 2008 From: anthony.floyd at convergent.ca (Anthony Floyd) Date: Thu, 17 Jul 2008 10:41:51 -0700 Subject: [Numpy-discussion] Masked arrays and pickle/unpickle References: <7EFBEC7FA86C1141B59B59EEAEE3294F5A37BB@EMAIL2.exchange.electric.net><9457e7c80807170131m76152773md4518f118f37dfd1@mail.gmail.com><7EFBEC7FA86C1141B59B59EEAEE3294F5A3974@EMAIL2.exchange.electric.net> <9457e7c80807170954q49247e5cpd665db3c5d0b9a5c@mail.gmail.com> Message-ID: <7EFBEC7FA86C1141B59B59EEAEE3294F5A3A12@EMAIL2.exchange.electric.net> > > What I need to know is how I can trick pickle or Numpy to > put the old class into the new class. > > If you have an example data-file, send it to me off-list and I'll > figure out what to do. Maybe it is as simple as > > np.core.ma = np.oldnumeric.ma Yes, pretty much. 
We've put ma.py into numpy.core where ma.py is nothing more than:

    import numpy.oldnumeric.ma as ma

    class MaskedArray(ma.MaskedArray):
        pass

It works, but becomes a bit of a headache because we now have to maintain our own numpy package so that all the developers get these three lines when they install numpy. Anyway, it lets us unpickle/unshelve the old data files with 1.1.0. The next step is to transition the old numpy.core.ma.MaskedArray classes on the fly to numpy.ma.core.MaskedArray classes so that we're not stuck when oldnumeric gets deprecated.

Thanks for the input,
Anthony.

From charlesr.harris at gmail.com  Thu Jul 17 14:00:28 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 17 Jul 2008 12:00:28 -0600
Subject: [Numpy-discussion] Documentation updates for 1.1.1
Message-ID:

Hi Stéfan,

I'm thinking it would be nice to backport as many documentation updates to 1.1.1 as possible. It looks like the following steps should do the trick.

1) Make ptvirtan's changes for ufunc documentation.
2) Copy add_newdocs.py
3) Copy fromnumeric.py

Does that look reasonable to you?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pav at iki.fi  Thu Jul 17 14:34:47 2008
From: pav at iki.fi (Pauli Virtanen)
Date: Thu, 17 Jul 2008 18:34:47 +0000 (UTC)
Subject: [Numpy-discussion] Numpy Trac malfunctions
Message-ID:

Hi,

Trac seems to malfunction again with permission problems. At http://projects.scipy.org/scipy/numpy/changeset/5447 there is

    Traceback (most recent call last):
      File "/usr/lib/python2.4/site-packages/trac/web/main.py", line 387, in dispatch_request
        dispatcher.dispatch(req)
      File "/usr/lib/python2.4/site-packages/trac/web/main.py", line 238, in dispatch
        resp = chosen_handler.process_request(req)
      File "/usr/lib/python2.4/site-packages/trac/versioncontrol/web_ui/changeset.py", line 188, in process_request
        prev = repos.get_node(new_path, new).get_previous()
      File "/usr/lib/python2.4/site-packages/trac/versioncontrol/cache.py", line 120, in get_node
        return self.repos.get_node(path, rev)
      File "/usr/lib/python2.4/site-packages/trac/versioncontrol/svn_fs.py", line 356, in get_node
        self.pool)
      File "/usr/lib/python2.4/site-packages/trac/versioncontrol/svn_fs.py", line 533, in __init__
        self.root = fs.revision_root(fs_ptr, rev, self.pool())
    SubversionException: ("Can't open file '/home/scipy/svn/numpy/db/revs/5447': Permission denied", 13)

-- Pauli Virtanen

From millman at berkeley.edu  Thu Jul 17 15:04:26 2008
From: millman at berkeley.edu (Jarrod Millman)
Date: Thu, 17 Jul 2008 12:04:26 -0700
Subject: [Numpy-discussion] Numpy Trac malfunctions
In-Reply-To:
References:
Message-ID:

On Thu, Jul 17, 2008 at 11:34 AM, Pauli Virtanen wrote:
> Trac seems to malfunction again with permission problems. At
> http://projects.scipy.org/scipy/numpy/changeset/5447 there is

Fixed.

-- Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/

From robert.kern at gmail.com  Thu Jul 17 15:07:20 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 17 Jul 2008 14:07:20 -0500
Subject: [Numpy-discussion] Numpy Trac malfunctions
In-Reply-To:
References:
Message-ID: <3d375d730807171207i580ba42cn55ca85bb76d49c02@mail.gmail.com>

On Thu, Jul 17, 2008 at 13:34, Pauli Virtanen wrote:
> Hi,
>
> Trac seems to malfunction again with permission problems. At
> http://projects.scipy.org/scipy/numpy/changeset/5447 there is
>
> Traceback (most recent call last):
> SubversionException: ("Can't open file '/home/scipy/svn/numpy/db/revs/5447': Permission denied", 13)

Are you still having problems? I do not see this currently.

-- Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From pav at iki.fi  Thu Jul 17 15:15:20 2008
From: pav at iki.fi (Pauli Virtanen)
Date: Thu, 17 Jul 2008 19:15:20 +0000 (UTC)
Subject: [Numpy-discussion] Documentation updates for 1.1.1
References:
Message-ID:

Thu, 17 Jul 2008 12:00:28 -0600, Charles R Harris wrote:
> Hi Stéfan,
>
> I'm thinking it would be nice to backport as many documentation updates
> to 1.1.1 as possible. It looks like the following steps should do the
> trick.
>
> 1) Make ptvirtan's changes for ufunc documentation.
> 2) Copy add_newdocs.py
> 3) Copy fromnumeric.py
>
> Does that look reasonable to you?

I'm not sure if 1) is needed for 1.1.1. It's only needed when we start putting back the improved ufunc documentation to SVN. The documentation has not yet made its way from the doc wiki to SVN trunk, so there's nothing to backport at the moment.

-- Pauli Virtanen

From matthieu.brucher at gmail.com  Thu Jul 17 16:04:29 2008
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Thu, 17 Jul 2008 22:04:29 +0200
Subject: [Numpy-discussion] Removing some warnings from numpy.i
In-Reply-To:
References:
Message-ID:

Hi,

I've enclosed a patch for numpy.i (against the trunk). Its goal is to use const char* instead of char* in some functions (pytype_string and typecode_string). The char* use raises some warnings in GCC 4.2.3 (and it is indeed not type safe IMHO).

Matthieu
--
French PhD student
Website : http://matthieu-brucher.developpez.com/
Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn : http://www.linkedin.com/in/matthieubrucher
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch
Type: application/octet-stream
Size: 1990 bytes
Desc: not available
URL:

From stefan at sun.ac.za  Thu Jul 17 16:29:48 2008
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Thu, 17 Jul 2008 22:29:48 +0200
Subject: [Numpy-discussion] Masked arrays and pickle/unpickle
In-Reply-To: <200807171338.07971.pgmdevlist@gmail.com>
References: <7EFBEC7FA86C1141B59B59EEAEE3294F5A37BB@EMAIL2.exchange.electric.net>
	<7EFBEC7FA86C1141B59B59EEAEE3294F5A3974@EMAIL2.exchange.electric.net>
	<9457e7c80807170954q49247e5cpd665db3c5d0b9a5c@mail.gmail.com>
	<200807171338.07971.pgmdevlist@gmail.com>
Message-ID: <9457e7c80807171329i15c10admc40cdbf20cf8fa20@mail.gmail.com>

Hi Pierre,

2008/7/17 Pierre GM :
>> Otherwise, you can already
>> save the array and mask separately.
>
> Another possibility is to store the MaskedArray as a record array, with one
> field for the data and one field for the mask.

What about the other parameters, such as fill value? Do we know its type beforehand? If we can come up with a robust way to convert a MaskedArray into (one or more) structured array(s), that would be perfect for storage purposes.
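[One rough sketch of such a conversion — the helper names are hypothetical, and the fill_value handling is exactly the open question raised here:]

    import numpy as np

    def masked_to_struct(m):
        # Pack the data and the mask into one structured array.
        out = np.empty(m.shape, dtype=[('data', m.dtype), ('mask', np.bool_)])
        out['data'] = m.filled()              # data with fill_value applied
        out['mask'] = np.ma.getmaskarray(m)   # always a full boolean mask
        return out

    def struct_to_masked(s, fill_value=None):
        return np.ma.array(s['data'], mask=s['mask'], fill_value=fill_value)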
Also, you wouldn't need to be volunteered to implement it :)

Further, could we rename numpy.ma.core to numpy.ma._core? I think we should make it clear that users should not import from core directly.

Cheers
Stéfan

From stefan at sun.ac.za  Thu Jul 17 16:34:38 2008
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Thu, 17 Jul 2008 22:34:38 +0200
Subject: [Numpy-discussion] Documentation updates for 1.1.1
In-Reply-To:
References:
Message-ID: <9457e7c80807171334y14aa31b1r8805519a5f350a5a@mail.gmail.com>

2008/7/17 Charles R Harris :
> I'm thinking it would be nice to backport as many documentation updates to
> 1.1.1 as possible. It looks like the following steps should do the trick.
>
> 1) Make ptvirtan's changes for ufunc documentation.
> 2) Copy add_newdocs.py
> 3) Copy fromnumeric.py
>
> Does that look reasonable to you?

I don't mind, but did we make changes to those files? As Pauli mentioned, we haven't yet merged back the edited docstrings. They haven't been reviewed, but are probably better than what we currently have; would you like me to do a merge?

Regards
Stéfan

From millman at berkeley.edu  Thu Jul 17 16:41:27 2008
From: millman at berkeley.edu (Jarrod Millman)
Date: Thu, 17 Jul 2008 13:41:27 -0700
Subject: [Numpy-discussion] Documentation updates for 1.1.1
In-Reply-To: <9457e7c80807171334y14aa31b1r8805519a5f350a5a@mail.gmail.com>
References: <9457e7c80807171334y14aa31b1r8805519a5f350a5a@mail.gmail.com>
Message-ID:

On Thu, Jul 17, 2008 at 1:34 PM, Stéfan van der Walt wrote:
> I don't mind, but did we make changes to those files? As Pauli
> mentioned, we haven't yet merged back the edited docstrings. They
> haven't been reviewed, but are probably better than what we currently
> have; would you like me to do a merge?

Personally, I have a slight preference for just focusing on 1.2 for the documentation work, but would be happy if some updates made it to 1.1.1. My main concern is that I don't want to divert your attention. I would just merge the reviewed portions. Also the release candidate is scheduled for Sunday, so you would need to make whatever merges you have in the next two or three days.

Thanks,

-- Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/

From anthony.floyd at convergent.ca  Thu Jul 17 16:47:33 2008
From: anthony.floyd at convergent.ca (Anthony Floyd)
Date: Thu, 17 Jul 2008 13:47:33 -0700
Subject: [Numpy-discussion] Masked arrays and pickle/unpickle
References: <7EFBEC7FA86C1141B59B59EEAEE3294F5A37BB@EMAIL2.exchange.electric.net>
	<7EFBEC7FA86C1141B59B59EEAEE3294F5A3974@EMAIL2.exchange.electric.net>
	<9457e7c80807170954q49247e5cpd665db3c5d0b9a5c@mail.gmail.com>
	<200807171338.07971.pgmdevlist@gmail.com>
	<9457e7c80807171329i15c10admc40cdbf20cf8fa20@mail.gmail.com>
Message-ID: <7EFBEC7FA86C1141B59B59EEAEE3294F5A3B02@EMAIL2.exchange.electric.net>

> Further, could we rename numpy.ma.core to numpy.ma._core? I think we
> should make it clear that users should not import from core directly.

Just to add a bit of noise here, it's not that we were importing directly from .core, it's that pickle was telling us that the actual class associated with the masked array was numpy.ma.core.MaskedArray (erm, well, numpy.core.ma.MaskedArray in the older version). Changing the location *again* will break it again, in the exact same way.

A>

From charlesr.harris at gmail.com  Thu Jul 17 17:13:50 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 17 Jul 2008 15:13:50 -0600
Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType
In-Reply-To: <20080715150718.R97049@saturn.araneidae.co.uk>
References: <20080715074217.R81915@saturn.araneidae.co.uk>
	<20080715150718.R97049@saturn.araneidae.co.uk>
Message-ID:

On Tue, Jul 15, 2008 at 9:28 AM, Michael Abbott wrote:

> On Tue, 15 Jul 2008, Michael Abbott wrote:
> > Only half of my patch for this bug has gone into trunk, and without the
> > rest of my patch there remains a leak.
>
> I think I might need to explain a little more about the reason for this
> patch, because obviously the bug it fixes was missed the last time I
> posted on this bug.
>
> So here is the missing part of the patch:
>
> > --- numpy/core/src/scalartypes.inc.src  (revision 5411)
> > +++ numpy/core/src/scalartypes.inc.src  (working copy)
> > @@ -1925,19 +1925,30 @@
> >              goto finish;
> >          }
> >
> > +    Py_XINCREF(typecode);
> >      arr = PyArray_FromAny(obj, typecode, 0, 0, FORCECAST, NULL);
> > -    if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) return arr;
> > +    if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) {
> > +        Py_XDECREF(typecode);
> > +        return arr;
> > +    }
> >      robj = PyArray_Return((PyArrayObject *)arr);
> >
> >  finish:
> > -    if ((robj==NULL) || (robj->ob_type == type)) return robj;
> > +    if ((robj==NULL) || (robj->ob_type == type)) {
> > +        Py_XDECREF(typecode);
> > +        return robj;
> > +    }
> >      /* Need to allocate new type and copy data-area over */
> >      if (type->tp_itemsize) {
> >          itemsize = PyString_GET_SIZE(robj);
> >      }
> >      else itemsize = 0;
> >      obj = type->tp_alloc(type, itemsize);
> > -    if (obj == NULL) {Py_DECREF(robj); return NULL;}
> > +    if (obj == NULL) {
> > +        Py_XDECREF(typecode);
> > +        Py_DECREF(robj);
> > +        return NULL;
> > +    }
> >      if (typecode==NULL)
> >          typecode = PyArray_DescrFromType(PyArray_@TYPE@);
> >      dest = scalar_value(obj, typecode);
>
> On the face of it it might appear that all the DECREFs are cancelling out
> the first INCREF, but not so. Let's see two more lines of context:
>
> >      src = scalar_value(robj, typecode);
> >      Py_DECREF(typecode);
>
> Ahah. That DECREF balances the original PyArray_DescrFromType, or maybe
> the later call ... and of course this has to happen on *ALL* return paths.
> If we now take a closer look at the patch we can see that it's doing two
> separate things:
>
> 1. There's an extra Py_XINCREF to balance the ref count lost to
> PyArray_FromAny and ensure that typecode survives long enough;
>
> 2. Every early return path has an extra Py_XDECREF to balance the creation
> of typecode.
>
> I rest my case for this patch.
> __

I still haven't convinced myself of this. By the time we hit finish, robj is NULL or holds a reference to typecode and the NULL case is taken care of up front. Later on, the reference to typecode might be decremented, perhaps leaving robj crippled, but in that case robj itself is marked for deletion upon exit. If the garbage collector can handle zero reference counts I think we are alright. I admit I haven't quite followed all the subroutines and macros, which descend into the hazy depths without the slightest bit of documentation, but at this point I'm inclined to leave things alone unless you have a test that shows a leak from this source.

Chuck

> _____________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com  Thu Jul 17 17:19:15 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 17 Jul 2008 15:19:15 -0600
Subject: [Numpy-discussion] Documentation updates for 1.1.1
In-Reply-To: <9457e7c80807171334y14aa31b1r8805519a5f350a5a@mail.gmail.com>
References: <9457e7c80807171334y14aa31b1r8805519a5f350a5a@mail.gmail.com>
Message-ID:

On Thu, Jul 17, 2008 at 2:34 PM, Stéfan van der Walt wrote:

> 2008/7/17 Charles R Harris :
> > I'm thinking it would be nice to backport as many documentation updates to
> > 1.1.1 as possible. It looks like the following steps should do the trick.
> >
> > 1) Make ptvirtan's changes for ufunc documentation.
> > 2) Copy add_newdocs.py
> > 3) Copy fromnumeric.py
> >
> > Does that look reasonable to you?
>
> I don't mind, but did we make changes to those files? As Pauli
> mentioned, we haven't yet merged back the edited docstrings. They
> haven't been reviewed, but are probably better than what we currently
> have; would you like me to do a merge?

My intent is to simply copy over files from the trunk. If you don't think things are ready yet, then let's wait and do a 1.1.2 after the documentation merge happens.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pav at iki.fi  Thu Jul 17 17:56:12 2008
From: pav at iki.fi (Pauli Virtanen)
Date: Thu, 17 Jul 2008 21:56:12 +0000 (UTC)
Subject: [Numpy-discussion] arccosh for complex numbers, goofy choice of branch
References: <80c99e790803170302p199926d2mee7c12e56207504a@mail.gmail.com>
Message-ID:

Mon, 17 Mar 2008 08:07:38 -0600, Charles R Harris wrote:
[clip]
> OK, that does it. I'm going to change its behavior.

The problem with bad arccosh branch cuts is still present:

    >>> import numpy as np
    >>> numpy.__version__
    '1.2.0.dev5436.e45a7627a39d'
    >>> np.arccosh(-1e-9 + 0.1j)
    (-0.099834078899207618-1.5707963277899337j)
    >>> np.arccosh(1e-9 + 0.1j)
    (0.099834078899207576+1.5707963257998594j)
    >>> np.arccosh(-1e-9 - 0.1j)
    (-0.099834078899207618+1.5707963277899337j)
    >>> np.arccosh(1e-9 - 0.1j)
    (0.099834078899207576-1.5707963257998594j)

Ticket #854. http://scipy.org/scipy/numpy/ticket/854

I'll write up some tests for all the functions with branch cuts to verify that the cuts and their continuity are correct. (Where "correct" bears some resemblance to "ISO C standard", I think...)

-- Pauli Virtanen

From pgmdevlist at gmail.com  Thu Jul 17 18:18:06 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Thu, 17 Jul 2008 18:18:06 -0400
Subject: [Numpy-discussion] Masked arrays and pickle/unpickle
In-Reply-To: <9457e7c80807171329i15c10admc40cdbf20cf8fa20@mail.gmail.com>
References: <7EFBEC7FA86C1141B59B59EEAEE3294F5A37BB@EMAIL2.exchange.electric.net>
	<200807171338.07971.pgmdevlist@gmail.com>
	<9457e7c80807171329i15c10admc40cdbf20cf8fa20@mail.gmail.com>
Message-ID: <200807171818.07334.pgmdevlist@gmail.com>

On Thursday 17 July 2008 16:29:48 Stéfan van der Walt wrote:
> > Another possibility is to store the MaskedArray as a record array, with
> > one field for the data and one field for the mask.
>
> What about the other parameters, such as fill value?

Dang, forgot about that. Having a dictionary of options would be cool, but we can't store it inside a regular ndarray.
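[A rough sketch of the save-the-pieces route, with hypothetical helper names; the metadata header mentioned next is where things like fill_value would have to go:]

    import numpy as np

    def save_masked(basename, m):
        # Store the parts of a MaskedArray as two plain .npy files.
        np.save(basename + '_data.npy', m.filled())
        np.save(basename + '_mask.npy', np.ma.getmaskarray(m))

    def load_masked(basename, fill_value=None):
        data = np.load(basename + '_data.npy')
        mask = np.load(basename + '_mask.npy')
        return np.ma.array(data, mask=mask, fill_value=fill_value)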
If we write to a file, we may want to write a header first that would store all the metadata we need.

> If we can come up with a robust way to convert a
> MaskedArray into (one or more) structured array(s), that would be
> perfect for storage purposes. Also, you wouldn't need to be
> volunteered to implement it :)

A few weeks ago, I played a bit with interfacing TimeSeries and pytables: the idea is to transform the series (basically a MaskedArray) into a record array, and add the parameters such as fill_value in the metadata section of the table. Works great; we may want to follow the same pattern. Moreover, hdf5 is portable.

> Further, could we rename numpy.ma.core to numpy.ma._core? I think we
> should make it clear that users should not import from core directly.

Anthony raised a very good point against that, and I agree. There's no need for that. Anthony, just making a symlink from numpy/oldnumeric/ma.py to numpy/core/ma.py works to unpickle your array. I agree it's still impractical...

From charlesr.harris at gmail.com  Thu Jul 17 18:25:06 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 17 Jul 2008 16:25:06 -0600
Subject: [Numpy-discussion] arccosh for complex numbers, goofy choice of branch
In-Reply-To:
References: <80c99e790803170302p199926d2mee7c12e56207504a@mail.gmail.com>
Message-ID:

On Thu, Jul 17, 2008 at 3:56 PM, Pauli Virtanen wrote:

> Mon, 17 Mar 2008 08:07:38 -0600, Charles R Harris wrote:
> [clip]
> > OK, that does it. I'm going to change its behavior.
>
> The problem with bad arccosh branch cuts is still present:
>
> >>> import numpy as np
> >>> numpy.__version__
> '1.2.0.dev5436.e45a7627a39d'
> >>> np.arccosh(-1e-9 + 0.1j)
> (-0.099834078899207618-1.5707963277899337j)
> >>> np.arccosh(1e-9 + 0.1j)
> (0.099834078899207576+1.5707963257998594j)
> >>> np.arccosh(-1e-9 - 0.1j)
> (-0.099834078899207618+1.5707963277899337j)
> >>> np.arccosh(1e-9 - 0.1j)
> (0.099834078899207576-1.5707963257998594j)
>
> Ticket #854. http://scipy.org/scipy/numpy/ticket/854
>
> I'll write up some tests for all the functions with branch cuts to verify
> that the cuts and their continuity are correct. (Where "correct" bears
> some resemblance to "ISO C standard", I think...)

Hmm. The problem here is that arccosh(x) = log(x + sqrt(x**2 - 1)): when the given numbers are plugged into x**2 - 1, one lies above the negative real axis and the other below, and the branch cut [-inf, 0] of sqrt introduces the discontinuity. Maybe sqrt(x - 1)*sqrt(x + 1) will fix that. I do think the branch cut should be part of the documentation of all the complex functions. I wonder what arccos does here? Ah, here is a reference. Note

    arccosh(z) = ln(z + sqrt(z - 1)*sqrt(z + 1)),

not ln(z + sqrt(z**2 - 1)). So I guess that is the fix.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From markperrymiller at gmail.com  Thu Jul 17 18:36:37 2008
From: markperrymiller at gmail.com (Mark Miller)
Date: Thu, 17 Jul 2008 15:36:37 -0700
Subject: [Numpy-discussion] Masked arrays and pickle/unpickle
In-Reply-To: <200807171818.07334.pgmdevlist@gmail.com>
References: <7EFBEC7FA86C1141B59B59EEAEE3294F5A37BB@EMAIL2.exchange.electric.net>
	<200807171338.07971.pgmdevlist@gmail.com>
	<9457e7c80807171329i15c10admc40cdbf20cf8fa20@mail.gmail.com>
	<200807171818.07334.pgmdevlist@gmail.com>
Message-ID:

On Thu, Jul 17, 2008 at 3:18 PM, Pierre GM wrote:
>
> Dang, forgot about that. Having a dictionary of options would be cool, but we
> can't store it inside a regular ndarray. If we write to a file, we may want
> to write a header first that would store all the metadata we need.
>

Not to derail the discussion, but I am a frequent user of Python's shelve function to archive large numpy arrays and associated sets of parameters into one very handy and accessible file. If numpy developers are discouraging use of this type of thing (shelve relies on pickle, is this correct?), then it would be super handy to be able to also include other data when saving arrays using numpy's intrinsic functions.

Just a thought.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From cburns at berkeley.edu  Fri Jul 18 01:33:00 2008
From: cburns at berkeley.edu (Christopher Burns)
Date: Thu, 17 Jul 2008 22:33:00 -0700
Subject: [Numpy-discussion] 1.1.0 OSX Installer Fails Under 10.5.3?
In-Reply-To:
References:
Message-ID: <764e38540807172233m6bce652bp40478564de10e265@mail.gmail.com>

I've been using bdist_mpkg to build the OSX Installer. I'd like to update the requirement documentation for the 1.1.1 release candidate to say "MacPython from python.org" instead of "System Python". bdist_mpkg specifies this; does anyone know how to override it?

Chris

On Tue, Jun 3, 2008 at 11:48 PM, J. Stark wrote:
> Chris,
>
> many thanks. Could I suggest that this information be featured
> prominently in the Read Me in the Installer, and perhaps also at
> http://www.scipy.org/Download where this is given as the official
> binary distribution for MacOSX. You might want to change the error
> message too, since I think that some people will interpret "System
> Python" to mean the default Python provided by the standard system
> install. Since this is 2.5.1 on Leopard, the error message could be
> confusing.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From amirnntp at gmail.com  Fri Jul 18 04:32:17 2008
From: amirnntp at gmail.com (Amir)
Date: Fri, 18 Jul 2008 01:32:17 -0700 (PDT)
Subject: [Numpy-discussion] interleaved indexing
Message-ID: <0d486f1f-b3eb-44ef-b6eb-54c6a3ac8529@w1g2000prk.googlegroups.com>

A very beginner question about indexing: let x be an array where n = len(x). I would like to create a view y of x such that:

    y[i] = x[i:i+m,...]  for each i and a fixed m << n

so I can do things like numpy.cov(y). With n large, allocating y is a problem for me. Currently, I either write for loops in cython or translate operations into correlate(), but am hoping there is an easier way, maybe using fancy indexing or broadcasting. Memory usage is secondary to speed, though.

Thanks.

From arnar.flatberg at gmail.com  Fri Jul 18 04:49:56 2008
From: arnar.flatberg at gmail.com (Arnar Flatberg)
Date: Fri, 18 Jul 2008 10:49:56 +0200
Subject: [Numpy-discussion] Type checking of arrays containing strings
Message-ID: <5d3194020807180149j6954e221g5d58f4bde43de443@mail.gmail.com>

Hi,

I need to check if my array (a) is of type `string`. That is, I don't know the number of characters beforehand, so I can't do a.dtype == '|S*' (* = (max) number of characters). Looking at my options, I see that either a.dtype.kind == 'S' or a.dtype.type == np.string_ might be ok. Are these any of the preferred ways, or is there some other way?

Thanks,
Arnar
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stefan at sun.ac.za  Fri Jul 18 05:34:46 2008
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Fri, 18 Jul 2008 11:34:46 +0200
Subject: [Numpy-discussion] interleaved indexing
In-Reply-To: <0d486f1f-b3eb-44ef-b6eb-54c6a3ac8529@w1g2000prk.googlegroups.com>
References: <0d486f1f-b3eb-44ef-b6eb-54c6a3ac8529@w1g2000prk.googlegroups.com>
Message-ID: <9457e7c80807180234s722e0984s3e159c7e72f1327e@mail.gmail.com>

Hi Amir

2008/7/18 Amir :
> A very beginner question about indexing: let x be an array where n =
> len(x). I would like to create a view y of x such that:
>
> y[i] = x[i:i+m,...] for each i and a fixed m << n
>
> so I can do things like numpy.cov(y). With n large, allocating y is a
> problem for me. Currently, I either write for loops in cython or
> translate operations into correlate(), but am hoping there is an easier
> way, maybe using fancy indexing or broadcasting. Memory usage is
> secondary to speed, though.

Robert Kern's recently added numpy.lib.stride_tricks should help:

In [84]: x = np.arange(100).reshape(10,-1)

In [85]: x
Out[85]:
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])

In [86]: x.strides
Out[86]: (40, 4)

In [87]: xx = np.lib.stride_tricks.as_strided(x, shape=(8, 3, 10), strides=(40, 40, 4))

In [88]: xx
Out[88]:
array([[[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24, 25, 26, 27, 28, 29]],

       [[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34, 35, 36, 37, 38, 39]],

       [[20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
        [40, 41, 42, 43, 44, 45, 46, 47, 48, 49]],

       [...]

Cheers
Stéfan

From stefan at sun.ac.za  Fri Jul 18 05:40:54 2008
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Fri, 18 Jul 2008 11:40:54 +0200
Subject: [Numpy-discussion] Type checking of arrays containing strings
In-Reply-To: <5d3194020807180149j6954e221g5d58f4bde43de443@mail.gmail.com>
References: <5d3194020807180149j6954e221g5d58f4bde43de443@mail.gmail.com>
Message-ID: <9457e7c80807180240v40d5b989o7a9db492ff4c32ae@mail.gmail.com>

2008/7/18 Arnar Flatberg :
> I need to check if my array (a) is of type `string`. That is, I don't know
> the number of characters beforehand, so I can't do a.dtype == '|S*' (* =
> (max) number of characters)
> Looking at my options, I see that either a.dtype.kind == 'S' or a.dtype.type ==
> np.string_ might be ok. Are these any of the preferred ways, or is there
> some other way?
Maybe

    np.issubdtype(x.dtype, str)

Stéfan

From arnar.flatberg at gmail.com  Fri Jul 18 06:38:29 2008
From: arnar.flatberg at gmail.com (Arnar Flatberg)
Date: Fri, 18 Jul 2008 12:38:29 +0200
Subject: [Numpy-discussion] Type checking of arrays containing strings
In-Reply-To: <9457e7c80807180240v40d5b989o7a9db492ff4c32ae@mail.gmail.com>
References: <5d3194020807180149j6954e221g5d58f4bde43de443@mail.gmail.com>
	<9457e7c80807180240v40d5b989o7a9db492ff4c32ae@mail.gmail.com>
Message-ID: <5d3194020807180338m5476d41dhe7384fe8aed2fc00@mail.gmail.com>

On Fri, Jul 18, 2008 at 11:40 AM, Stéfan van der Walt wrote:

> 2008/7/18 Arnar Flatberg :
> > I need to check if my array (a) is of type `string`. That is, I don't know
> > the number of characters beforehand, so I can't do a.dtype == '|S*' (* =
> > (max) number of characters)
> > Looking at my options, I see that either a.dtype.kind == 'S' or a.dtype.type ==
> > np.string_ might be ok. Are these any of the preferred ways, or is there
> > some other way?
>
> Maybe
>
> np.issubdtype(x.dtype, str)

Yes, silly of me. I didn't look at the documentation (source) and tried np.issubdtype(x, str) as my first try. That not working, I got lost. That said, I think some other parameter names than arg1, arg2 would be nice for an undocumented function.

Arnar
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From arnar.flatberg at gmail.com  Fri Jul 18 07:02:55 2008
From: arnar.flatberg at gmail.com (Arnar Flatberg)
Date: Fri, 18 Jul 2008 13:02:55 +0200
Subject: [Numpy-discussion] Type checking of arrays containing strings
In-Reply-To: <5d3194020807180338m5476d41dhe7384fe8aed2fc00@mail.gmail.com>
References: <5d3194020807180149j6954e221g5d58f4bde43de443@mail.gmail.com>
	<9457e7c80807180240v40d5b989o7a9db492ff4c32ae@mail.gmail.com>
	<5d3194020807180338m5476d41dhe7384fe8aed2fc00@mail.gmail.com>
Message-ID: <5d3194020807180402x36fa1a27n645eb2a90d5ecc44@mail.gmail.com>

On Fri, Jul 18, 2008 at 12:38 PM, Arnar Flatberg wrote:

> On Fri, Jul 18, 2008 at 11:40 AM, Stéfan van der Walt wrote:
>
>> 2008/7/18 Arnar Flatberg :
>> > I need to check if my array (a) is of type `string`. That is, I don't know
>> > the number of characters beforehand, so I can't do a.dtype == '|S*' (* =
>> > (max) number of characters)
>> > Looking at my options, I see that either a.dtype.kind == 'S' or a.dtype.type ==
>> > np.string_ might be ok. Are these any of the preferred ways, or is there
>> > some other way?
>>
>> Maybe
>>
>> np.issubdtype(x.dtype, str)
>
> Yes, silly of me. I didn't look at the documentation (source) and tried
> np.issubdtype(x, str) as my first try. That not working, I got lost. That
> said, I think some other parameter names than arg1, arg2 would be nice for
> an undocumented function.
>
> Arnar

... and instead of just complaining, I can do something about it :-) Could you add permissions to me for the documentation editor, username: ArnarFlatberg.

Thanks,
Arnar
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From michael at araneidae.co.uk  Fri Jul 18 07:15:05 2008
From: michael at araneidae.co.uk (Michael Abbott)
Date: Fri, 18 Jul 2008 11:15:05 +0000 (GMT)
Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType
In-Reply-To:
References: <20080715074217.R81915@saturn.araneidae.co.uk>
	<20080715150718.R97049@saturn.araneidae.co.uk>
Message-ID: <20080717212322.O34675@saturn.araneidae.co.uk>

I'm afraid this is going to be a long one, and I don't see any good way to cut down the quoted text either...

Charles, I'm going to plead with you to read what I've just written and think about it. I'm trying to make the case as clear as I can. I think the case is actually extremely simple: the existing @name@_arrtype_new code is broken.

On Thu, 17 Jul 2008, Charles R Harris wrote:
> On Tue, Jul 15, 2008 at 9:28 AM, Michael Abbott
> wrote:
> > > --- numpy/core/src/scalartypes.inc.src  (revision 5411)
> > > +++ numpy/core/src/scalartypes.inc.src  (working copy)
> > > @@ -1925,19 +1925,30 @@
> > >              goto finish;
> > >          }
> > >
> > > +    Py_XINCREF(typecode);
> > >      arr = PyArray_FromAny(obj, typecode, 0, 0, FORCECAST, NULL);
> > > -    if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) return arr;
> > > +    if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) {
> > > +        Py_XDECREF(typecode);
> > > +        return arr;
> > > +    }
> > >      robj = PyArray_Return((PyArrayObject *)arr);
> > >
> > >  finish:
> > > -    if ((robj==NULL) || (robj->ob_type == type)) return robj;
> > > +    if ((robj==NULL) || (robj->ob_type == type)) {
> > > +        Py_XDECREF(typecode);
> > > +        return robj;
> > > +    }
> > >      /* Need to allocate new type and copy data-area over */
> > >      if (type->tp_itemsize) {
> > >          itemsize = PyString_GET_SIZE(robj);
> > >      }
> > >      else itemsize = 0;
> > >      obj = type->tp_alloc(type, itemsize);
> > > -    if (obj == NULL) {Py_DECREF(robj); return NULL;}
> > > +    if (obj == NULL) {
> > > +        Py_XDECREF(typecode);
> > > +        Py_DECREF(robj);
> > > +        return NULL;
> > > +    }
> > >      if (typecode==NULL)
> > >          typecode = PyArray_DescrFromType(PyArray_@TYPE@);
> > >      dest = scalar_value(obj, typecode);
>
> > On the face of it it might appear that all the DECREFs are cancelling out
> > the first INCREF, but not so. Let's see two more lines of context:
>
> > >      src = scalar_value(robj, typecode);
> > >      Py_DECREF(typecode);
>
> > Ahah. That DECREF balances the original PyArray_DescrFromType, or maybe
> > the later call ... and of course this has to happen on *ALL* return paths.
> > If we now take a closer look at the patch we can see that it's doing two
> > separate things:
> >
> > 1. There's an extra Py_XINCREF to balance the ref count lost to
> > PyArray_FromAny and ensure that typecode survives long enough;
> >
> > 2. Every early return path has an extra Py_XDECREF to balance the creation
> > of typecode.
> >
> > I rest my case for this patch.
> > __
>
> I still haven't convinced myself of this.

Seriously? Please bear with me for a bit then.

Let me try and go back to basics. Reference counting discipline should be very simple (I did post up some links in my Wed, 9 Jul 2008 08:36:27 posting on this list). Let me try and capture this in the following principles:

1. Each routine is wholly responsible for the reference counts it creates. This responsibility can be fulfilled by:

1.a Decrementing the reference count.

1.b Returning the object to the caller (which accounts for exactly one reference count).

1.c Explicitly placing the object in another reference counted structure (similarly this accounts for another reference count, and of course creates some extra obligations which I won't discuss).

1.d Passing the object to a routine which indulges in "reference count theft". In some sense this can be thought of as an example of 1.c, and this is how PyList_SetItem and PyTuple_SetItem behave.

2. Reference counts are created by:

2.a Calling a routine which returns an object.

2.b Explicitly incrementing the reference count.

3. When the known reference count on an object reaches zero the object must be treated as lost (and any further reference to it will be undefined). In particular, a routine cannot make any assumptions about reference counts it has not created.

I need to make a bit of a digression on the subject of reference count theft. I am only aware of three routines that do this:

    PyList_SetItem, PyTuple_SetItem and PyArray_FromAny

The first two are effectively assigning the reference into a preexisting structure, so can be regarded as instances of principle 1.c. The last is a disaster area.

I know (by inspecting the code) that PyArray_FromAny may choose to erase its typecode argument (by decrementing the reference), but I can't tell (without digging deeper than I care to go) whether this only occurs on the NULL return branch. However, this is a bit of a red herring. If we recognise that PyArray_FromAny steals the reference count, the remaining analysis will go through.

A further note on 1.c: when an object is placed in another reference counted structure its reference count should normally be incremented -- case 1.c is really just an elision of this increment with the decrement under obligation 1.a. This normal case can be seen halfway through PyArray_Scalar.

Ok, preliminaries over: now to look at the code. Here is a caricature of the routine @name@_arrtype_new with the important features isolated:

     1      PyArray_Descr *typecode = NULL;
     2      if (...) goto finish;       // _WORK@work@
     3      typecode = PyArray_DescrFromType(...);
     4      if (...) { ...typecode...; goto finish; }
     5      arr = PyArray_FromAny(..., typecode, ...);
     6      if (...) return arr;
     7  finish:
     8      if (...) return robj;
     9      if (...) return NULL;
    10      if (typecode == NULL) typecode = PyArray_DescrFromType(...);
    11      ... scalar_value(..., typecode); ...
    12      Py_DECREF(typecode);

So let's go through this line by line.

1. We start with NULL typecode. Fine.
2. A hidden goto in a macro. Sweet. Let's have more code like this.
3. Here's the reference count we're responsible for.
4. If obj is NULL we use the typecode,
5. otherwise we pass it to PyArray_FromAny.
6. The first early return.
7. All paths (apart from 6) come together here.

So at this point let's take stock. typecode is in one of three states: NULL (path 2, or if creation failed), allocated with a single reference count (path 4), or lost (path 5). This is not good.

LET ME EMPHASISE THIS: the state of the code at the finish label is dangerous and simply broken.

The original state at the finish label is indeterminate: typecode has either been lost by passing it to PyArray_FromAny (in which case we're not allowed to touch it again), or else it has a reference count that we're still responsible for.

There seems to be a fantasy expressed in a comment in a recent update to this routine that PyArray_Scalar has stolen a reference, but fortunately a quick code inspection (of arrayobject.c) quickly refutes this frightening possibility.

So, the only way to fix the problem at (7) is to unify the two non-NULL cases. One answer is to add a DECREF at (4), but we see at (11) that we still need typecode at (7) -- so the only solution is to add an extra ADDREF just before (5). This then of course sadly means that we also need an extra DECREF at (6).
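[The imbalance Michael describes can also be observed from Python without a debug build — a sketch, assuming (as is the case for the builtin descriptors) that PyArray_DescrFromType hands back a singleton; an illustration, not code from the thread:]

    import sys
    import numpy

    d = numpy.dtype(numpy.float32)       # the singleton float32 descriptor
    before = sys.getrefcount(d)
    for i in range(100):
        numpy.float32()                  # each call should leak one descriptor
                                         # reference on an unpatched build
    print sys.getrefcount(d) - before    # expect roughly 100 if the leak is there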
PLEASE don't suggest moving the ADDREF until after (6) -- at this point typecode is lost and may have been destroyed, and relying on any possibility to the contrary is a recipe for continued screw-ups.

The rest is easy. Once we've established the invariant that typecode is either NULL or has a single reference count at (7), then the two early returns at (8) and (9) unfortunately need to be augmented with DECREFs. And we're done.

Responses to your original comments:

> By the time we hit finish, robj is NULL or holds a reference to typecode
> and the NULL case is taken care of up front.

robj has nothing to do with the lifetime management of typecode; the only issue is the early return. After the finish label typecode is either NULL (no problem) or else has a single reference count that needs to be accounted for.

> Later on, the reference to typecode might be decremented,

That *might* is at the heart of the problem. You can't be so cavalier about managing references.

> perhaps leaving robj crippled, but in that case robj itself is marked
> for deletion upon exit.

Please ignore robj in this discussion; it's beside the point.

> If the garbage collector can handle zero reference counts I think
> we are alright.

No, no, no. This is nothing to do with the garbage collector. If we screw up our reference counts here then the garbage collector isn't going to dig us out of the hole.

> I admit I haven't quite followed all the subroutines and
> macros, which descend into the hazy depths without the slightest bit of
> documentation, but at this point I'm inclined to leave things alone unless
> you have a test that shows a leak from this source.

Part of my point is that proper reference count discipline should not require any descent into subroutines (except for the very nasty case of reference theft, which I think is generally agreed to be a bad thing).

As for the test case, try this one (you'll need a debug build):

    import numpy
    import sys

    r = range(100)
    refs = sys.gettotalrefcount()
    for i in r:
        numpy.float32()
    print sys.gettotalrefcount() - refs

You should get one leak per float32() call.

I've just noticed, looking at the latest revision of the code, that somebody seems to be under the misapprehension that PyArray_Scalar steals reference counts. Fortunately this is not true. Maybe this is part of the confusion?

From stefan at sun.ac.za  Fri Jul 18 07:21:50 2008
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Fri, 18 Jul 2008 13:21:50 +0200
Subject: [Numpy-discussion] Type checking of arrays containing strings
In-Reply-To: <5d3194020807180402x36fa1a27n645eb2a90d5ecc44@mail.gmail.com>
References: <5d3194020807180149j6954e221g5d58f4bde43de443@mail.gmail.com>
	<9457e7c80807180240v40d5b989o7a9db492ff4c32ae@mail.gmail.com>
	<5d3194020807180338m5476d41dhe7384fe8aed2fc00@mail.gmail.com>
	<5d3194020807180402x36fa1a27n645eb2a90d5ecc44@mail.gmail.com>
Message-ID: <9457e7c80807180421q203141d9h304574b6c633aa3e@mail.gmail.com>

2008/7/18 Arnar Flatberg :
> ... and instead of just complaining, I can do something about it :-) Could
> you add permissions to me for the documentation editor, username:

That's the spirit! You are now added. But I'm worried: I had a bet with Joe that we won't get more than 30 people to sign up. Looks like I might have to concede defeat!

Cheers
Stéfan

From alan.mcintyre at gmail.com  Fri Jul 18 08:15:45 2008
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Fri, 18 Jul 2008 08:15:45 -0400
Subject: [Numpy-discussion] chararray __mod__ behavior
Message-ID: <1d36917a0807180515w5d2f967ch4ab6daee0dfc6362@mail.gmail.com>

This seems odd to me:

    >>> A=np.array([['%.3f','%d'],['%s','%r']]).view(np.chararray)
    >>> A % np.array([[1,2],[3,4]])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/opt/local/lib/python2.5/site-packages/numpy/core/defchararray.py", line 126, in __mod__
        newarr[:] = res
    ValueError: shape mismatch: objects cannot be broadcast to a single shape

Is this expected behavior? The % gets broadcast as I'd expect for 1D arrays, but more dimensions fail as above. Changing the offending line in defchararray.py to "newarr.flat = res" makes it behave properly.

From stefan at sun.ac.za  Fri Jul 18 08:32:06 2008
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Fri, 18 Jul 2008 14:32:06 +0200
Subject: [Numpy-discussion] chararray __mod__ behavior
In-Reply-To: <1d36917a0807180515w5d2f967ch4ab6daee0dfc6362@mail.gmail.com>
References: <1d36917a0807180515w5d2f967ch4ab6daee0dfc6362@mail.gmail.com>
Message-ID: <9457e7c80807180532i76719579qc6824979db167a7@mail.gmail.com>

2008/7/18 Alan McIntyre :
> This seems odd to me:
>
>>>> A=np.array([['%.3f','%d'],['%s','%r']]).view(np.chararray)
>>>> A % np.array([[1,2],[3,4]])
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/opt/local/lib/python2.5/site-packages/numpy/core/defchararray.py",
> line 126, in __mod__
>     newarr[:] = res
> ValueError: shape mismatch: objects cannot be broadcast to a single shape
>
> Is this expected behavior? The % gets broadcast as I'd expect for 1D
> arrays, but more dimensions fail as above. Changing the offending line
> in defchararray.py to "newarr.flat = res" makes it behave properly.

That looks like a bug to me. I would have expected at least one of the following to work:

    A % [[1, 2], [3, 4]]
    A % 1
    A % (1, 2, 3, 4)

and none of them do.

Stéfan

From ivan at selidor.net  Fri Jul 18 10:42:47 2008
From: ivan at selidor.net (Ivan Vilata i Balaguer)
Date: Fri, 18 Jul 2008 16:42:47 +0200
Subject: [Numpy-discussion] RFC: A (second) proposal for implementing some date/time types in NumPy
In-Reply-To: <200807161844.36953.faltet@pytables.org>
References: <200807161844.36953.faltet@pytables.org>
Message-ID: <20080718144247.GA5698@tardis.terramar.selidor.net>

Francesc Alted (on 2008-07-16 at 18:44:36 +0200) wrote::

> After tons of excellent feedback received for our first proposal about
> the date/time types in NumPy, Ivan and I have had another brainstorming
> session and ended up with a new proposal for your consideration.

After re-reading the proposal, Francesc and I found some points that needed small corrections and some clarifications or enhancements. Here is a new version of the proposal. The changes aren't fundamental:

* Reference to POSIX-like treatment of leap seconds.
* Notes on default resolutions.
* Meaning of the stored values.
* Usage examples for scalar constructor.
* Using an ISO 8601 string as a date value.
* Fixed str() and repr() representations.
* Note on operations with mixed resolutions.
* Other small corrections.

Thanks for the feedback!
----

====================================================================
 A (second) proposal for implementing some date/time types in NumPy
====================================================================

:Author: Francesc Alted i Abad
:Contact: faltet at pytables.com
:Author: Ivan Vilata i Balaguer
:Contact: ivan at selidor.net
:Date: 2008-07-18

Executive summary
=================

A date/time mark is something very handy to have in many fields where one has to deal with data sets. While Python has several modules that define a date/time type (like the integrated ``datetime`` [1]_ or ``mx.DateTime`` [2]_), NumPy lacks them. In this document, we are proposing the addition of a series of date/time types to fill this gap. The requirements for the proposed types are twofold: 1) they have to be fast to operate with and 2) they have to be as compatible as possible with the existing ``datetime`` module that comes with Python.

Types proposed
==============

To start with, it is virtually impossible to come up with a single date/time type that fills the needs of every use case. So, after pondering different possibilities, we have settled on *two* different types, namely ``datetime64`` and ``timedelta64`` (these names are preliminary and can be changed), that can have different resolutions so as to cover different needs.

.. Important:: the resolution is conceived here as metadata that *complements* a date/time dtype, *without changing the base type*. It provides information about the *meaning* of the stored numbers, not about their *structure*.

Now follows a detailed description of the proposed types.

``datetime64``
--------------

It represents a time that is absolute (i.e. not relative). It is implemented internally as an ``int64`` type. The internal epoch is the POSIX epoch (see [3]_). Like POSIX, the representation of a date doesn't take leap seconds into account.

Resolution
~~~~~~~~~~

It accepts different resolutions, each of them implying a different time span. The table below describes the resolutions supported with their corresponding time spans.

======== =============== ==========================
 Resolution               Time span (years)
------------------------ --------------------------
 Code     Meaning
======== =============== ==========================
   Y      year            [9.2e18 BC, 9.2e18 AC]
   Q      quarter         [3.0e18 BC, 3.0e18 AC]
   M      month           [7.6e17 BC, 7.6e17 AC]
   W      week            [1.7e17 BC, 1.7e17 AC]
   d      day             [2.5e16 BC, 2.5e16 AC]
   h      hour            [1.0e15 BC, 1.0e15 AC]
   m      minute          [1.7e13 BC, 1.7e13 AC]
   s      second          [ 2.9e9 BC,  2.9e9 AC]
   ms     millisecond     [ 2.9e6 BC,  2.9e6 AC]
   us     microsecond     [290301 BC, 294241 AC]
   ns     nanosecond      [  1678 AC,   2262 AC]
======== =============== ==========================

When a resolution is not provided, the default resolution of microseconds is used. The value of an absolute date is thus *an integer number of units of the chosen resolution* passed since the internal epoch.

Building a ``datetime64`` dtype
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The proposed way to specify the resolution in the dtype constructor is:

Using parameters in the constructor::

    dtype('datetime64', res="us")  # the default res. is microseconds

Using the long string notation::

    dtype('datetime64[us]')   # equivalent to dtype('datetime64')

Using the short string notation::

    dtype('T8[us]')   # equivalent to dtype('T8')

Compatibility issues
~~~~~~~~~~~~~~~~~~~~

This will be fully compatible with the ``datetime`` class of the ``datetime`` module of Python only when using a resolution of microseconds. For other resolutions, the conversion process will lose precision or will overflow as needed. The conversion from/to a ``datetime`` object doesn't take leap seconds into account.

``timedelta64``
---------------

It represents a time that is relative (i.e. not absolute). It is implemented internally as an ``int64`` type.

Resolution
~~~~~~~~~~

It accepts different resolutions, each of them implying a different time span. The table below describes the resolutions supported with their corresponding time spans.

======== =============== ==========================
 Resolution               Time span
------------------------ --------------------------
 Code     Meaning
======== =============== ==========================
   W      week            +- 1.7e17 years
   d      day             +- 2.5e16 years
   h      hour            +- 1.0e15 years
   m      minute          +- 1.7e13 years
   s      second          +- 2.9e12 years
   ms     millisecond     +- 2.9e9 years
   us     microsecond     +- 2.9e6 years
   ns     nanosecond      +- 292 years
   ps     picosecond      +- 106 days
   fs     femtosecond     +- 2.6 hours
   as     attosecond      +- 9.2 seconds
======== =============== ==========================

When a resolution is not provided, the default resolution of microseconds is used. The value of a time delta is thus *an integer number of units of the chosen resolution*.

Building a ``timedelta64`` dtype
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The proposed way to specify the resolution in the dtype constructor is:

Using parameters in the constructor::

    dtype('timedelta64', res="us")  # the default res. is microseconds

Using the long string notation::

    dtype('timedelta64[us]')   # equivalent to dtype('timedelta64')

Using the short string notation::

    dtype('t8[us]')   # equivalent to dtype('t8')

Compatibility issues
~~~~~~~~~~~~~~~~~~~~

This will be fully compatible with the ``timedelta`` class of the ``datetime`` module of Python only when using a resolution of microseconds. For other resolutions, the conversion process will lose precision or will overflow as needed.
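[To make the resolution bookkeeping concrete before the examples below — a plain-``int64`` sketch of the semantics the proposal describes, not proposal text itself:]

    import numpy as np

    # A "timedelta64[s]" is just an int64 count of seconds, so changing
    # resolution is a pure integer rescaling, matching the proposal's
    # .astype() examples further down.
    t_s = np.ones(5, dtype=np.int64)    # five deltas of one second each
    t_ms = t_s * 1000                   # -> counts in milliseconds
    t_m = t_s // 60                     # -> counts in minutes (truncates)
    print t_ms                          # [1000 1000 1000 1000 1000]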
Example of use
==============

Here is an example of use for the ``datetime64``::

    In [5]: numpy.datetime64(42)    # use default resolution of "us"
    Out[5]: datetime64(42, 'us')

    In [6]: print numpy.datetime64(42)   # use default resolution of "us"
    1970-01-01T00:00:00.000042    # representation in ISO 8601 format

    In [7]: print numpy.datetime64(367.7, 'D')   # decimal part is lost
    1971-01-02    # still ISO 8601 format

    In [8]: numpy.datetime64('2008-07-18T12:23:18', 'm')   # from ISO 8601
    Out[8]: datetime64(20273063, 'm')

    In [9]: print numpy.datetime64('2008-07-18T12:23:18', 'm')
    2008-07-18T12:23

    In [10]: t = numpy.zeros(5, dtype="datetime64[ms]")

    In [11]: t[0] = datetime.datetime.now()   # setter in action

    In [12]: print t
    [2008-07-16T13:39:25.315 1970-01-01T00:00:00.000
     1970-01-01T00:00:00.000 1970-01-01T00:00:00.000
     1970-01-01T00:00:00.000]

    In [13]: t[0].item()   # getter in action
    Out[13]: datetime.datetime(2008, 7, 16, 13, 39, 25, 315000)

    In [14]: print t.dtype
    dtype('datetime64[ms]')

And here is an example of use for the ``timedelta64``::

    In [5]: numpy.timedelta64(10)   # use default resolution of "us"
    Out[5]: timedelta64(10, 'us')

    In [6]: print numpy.timedelta64(10)   # use default resolution of "us"
    0:00:00.010

    In [7]: print numpy.timedelta64(3600.2, 'm')   # decimal part is lost
    2 days, 12:00

    In [8]: t1 = numpy.zeros(5, dtype="datetime64[ms]")

    In [9]: t2 = numpy.ones(5, dtype="datetime64[ms]")

    In [10]: t = t2 - t1

    In [11]: t[0] = datetime.timedelta(0, 24)   # setter in action

    In [12]: print t
    [0:00:24.000 0:00:01.000 0:00:01.000 0:00:01.000 0:00:01.000]

    In [13]: t[0].item()   # getter in action
    Out[13]: datetime.timedelta(0, 24)

    In [14]: print t.dtype
    dtype('timedelta64[s]')

Operating with date/time arrays
===============================

``datetime64`` vs ``datetime64``
--------------------------------

The only arithmetic operation allowed between absolute dates is subtraction::

    In [10]: numpy.ones(5, "T8") - numpy.zeros(5, "T8")
    Out[10]: array([1, 1, 1, 1, 1], dtype=timedelta64[us])

But not other operations::

    In [11]: numpy.ones(5, "T8") + numpy.zeros(5, "T8")
    TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and 'numpy.ndarray'

Comparisons between absolute dates are allowed.

``datetime64`` vs ``timedelta64``
---------------------------------

It will be possible to add and subtract relative times from absolute dates::

    In [10]: numpy.zeros(5, "T8[Y]") + numpy.ones(5, "t8[Y]")
    Out[10]: array([1971, 1971, 1971, 1971, 1971], dtype=datetime64[Y])

    In [11]: numpy.ones(5, "T8[Y]") - 2 * numpy.ones(5, "t8[Y]")
    Out[11]: array([1969, 1969, 1969, 1969, 1969], dtype=datetime64[Y])

But not other operations::

    In [12]: numpy.ones(5, "T8[Y]") * numpy.ones(5, "t8[Y]")
    TypeError: unsupported operand type(s) for *: 'numpy.ndarray' and 'numpy.ndarray'

``timedelta64`` vs anything
---------------------------

Finally, it will be possible to operate with relative times as if they were regular int64 dtypes *as long as* the result can be converted back into a ``timedelta64``::

    In [10]: numpy.ones(5, 't8')
    Out[10]: array([1, 1, 1, 1, 1], dtype=timedelta64[us])

    In [11]: (numpy.ones(5, 't8[M]') + 2) ** 3
    Out[11]: array([27, 27, 27, 27, 27], dtype=timedelta64[M])

But::

    In [12]: numpy.ones(5, 't8') + 1j
    TypeError: the result cannot be converted into a ``timedelta64``

dtype/resolution conversions
============================

For changing the date/time dtype of an existing array, we propose to use the ``.astype()`` method. This will be mainly useful for changing resolutions. For example, for absolute dates::

    In[10]: t1 = numpy.zeros(5, dtype="datetime64[s]")
    In[11]: print t1
    [1970-01-01T00:00:00 1970-01-01T00:00:00 1970-01-01T00:00:00
     1970-01-01T00:00:00 1970-01-01T00:00:00]
    In[12]: print t1.astype('datetime64[d]')
    [1970-01-01 1970-01-01 1970-01-01 1970-01-01 1970-01-01]

For relative times::

    In[10]: t1 = numpy.ones(5, dtype="timedelta64[s]")
    In[11]: print t1
    [1 1 1 1 1]
    In[12]: print t1.astype('timedelta64[ms]')
    [1000 1000 1000 1000 1000]

Changing directly from/to relative to/from absolute dtypes will not be supported::

    In[13]: numpy.zeros(5, dtype="datetime64[s]").astype('timedelta64')
    TypeError: data type cannot be converted to the desired type

Final considerations
====================

Why the ``origin`` metadata disappeared
---------------------------------------

During the discussion of the date/time dtypes in the NumPy list, the idea of having an ``origin`` metadata that complemented the definition of the absolute ``datetime64`` was initially found to be useful. However, after thinking more about this, we found that the combination of an absolute ``datetime64`` with a relative ``timedelta64`` offers the same functionality while removing the need for the additional ``origin`` metadata. This is why we have removed it from this proposal.

Operations with mixed resolutions
---------------------------------

Whenever an operation between two time values of the same dtype with the same resolution is accepted, the same operation with time values of different resolutions should be possible (e.g. adding a time delta in seconds and one in microseconds), resulting in an adequate resolution. The exact semantics of these operations are yet to be defined, though.

Resolution and dtype issues
---------------------------

The date/time dtype's resolution metadata cannot be used in general as part of typical dtype usage. For example, in::

    numpy.zeros(5, dtype=numpy.datetime64)

we have yet to find a sensible way to pass the resolution.
In this document, we are proposing the addition of a series of date/time types to fill this gap. The requirements for the proposed types are two-folded: 1) they have to be fast to operate with and 2) they have to be as compatible as possible with the existing ``datetime`` module that comes with Python. Types proposed ============== To start with, it is virtually impossible to come up with a single date/time type that fills the needs of every case of use. So, after pondering about different possibilities, we have stuck with *two* different types, namely ``datetime64`` and ``timedelta64`` (these names are preliminary and can be changed), that can have different resolutions so as to cover different needs. .. Important:: the resolution is conceived here as metadata that *complements* a date/time dtype, *without changing the base type*. It provides information about the *meaning* of the stored numbers, not about their *structure*. Now follows a detailed description of the proposed types. ``datetime64`` -------------- It represents a time that is absolute (i.e. not relative). It is implemented internally as an ``int64`` type. The internal epoch is the POSIX epoch (see [3]_). Like POSIX, the representation of a date doesn't take leap seconds into account. Resolution ~~~~~~~~~~ It accepts different resolutions, each of them implying a different time span. The table below describes the resolutions supported with their corresponding time spans. ======== =============== ========================== Resolution Time span (years) ------------------------ -------------------------- Code Meaning ======== =============== ========================== Y year [9.2e18 BC, 9.2e18 AC] Q quarter [3.0e18 BC, 3.0e18 AC] M month [7.6e17 BC, 7.6e17 AC] W week [1.7e17 BC, 1.7e17 AC] d day [2.5e16 BC, 2.5e16 AC] h hour [1.0e15 BC, 1.0e15 AC] m minute [1.7e13 BC, 1.7e13 AC] s second [ 2.9e9 BC, 2.9e9 AC] ms millisecond [ 2.9e6 BC, 2.9e6 AC] us microsecond [290301 BC, 294241 AC] ns nanosecond [ 1678 AC, 2262 AC] ======== =============== ========================== When a resolution is not provided, the default resolution of microseconds is used. The value of an absolute date is thus *an integer number of units of the chosen resolution* passed since the internal epoch. Building a ``datetime64`` dtype ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The proposed way to specify the resolution in the dtype constructor is: Using parameters in the constructor:: dtype('datetime64', res="us") # the default res. is microseconds Using the long string notation:: dtype('datetime64[us]') # equivalent to dtype('datetime64') Using the short string notation:: dtype('T8[us]') # equivalent to dtype('T8') Compatibility issues ~~~~~~~~~~~~~~~~~~~~ This will be fully compatible with the ``datetime`` class of the ``datetime`` module of Python only when using a resolution of microseconds. For other resolutions, the conversion process will loose precision or will overflow as needed. The conversion from/to a ``datetime`` object doesn't take leap seconds into account. ``timedelta64`` --------------- It represents a time that is relative (i.e. not absolute). It is implemented internally as an ``int64`` type. Resolution ~~~~~~~~~~ It accepts different resolutions, each of them implying a different time span. The table below describes the resolutions supported with their corresponding time spans. 
======== =============== ========================== Resolution Time span ------------------------ -------------------------- Code Meaning ======== =============== ========================== W week +- 1.7e17 years d day +- 2.5e16 years h hour +- 1.0e15 years m minute +- 1.7e13 years s second +- 2.9e12 years ms millisecond +- 2.9e9 years us microsecond +- 2.9e6 years ns nanosecond +- 292 years ps picosecond +- 106 days fs femtosecond +- 2.6 hours as attosecond +- 9.2 seconds ======== =============== ========================== When a resolution is not provided, the default resolution of microseconds is used. The value of a time delta is thus *an integer number of units of the chosen resolution*. Building a ``timedelta64`` dtype ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The proposed way to specify the resolution in the dtype constructor is: Using parameters in the constructor:: dtype('timedelta64', res="us") # the default res. is microseconds Using the long string notation:: dtype('timedelta64[us]') # equivalent to dtype('timedelta64') Using the short string notation:: dtype('t8[us]') # equivalent to dtype('t8') Compatibility issues ~~~~~~~~~~~~~~~~~~~~ This will be fully compatible with the ``timedelta`` class of the ``datetime`` module of Python only when using a resolution of microseconds. For other resolutions, the conversion process will loose precision or will overflow as needed. Example of use ============== Here it is an example of use for the ``datetime64``:: In [5]: numpy.datetime64(42) # use default resolution of "us" Out[5]: datetime64(42, 'us') In [6]: print numpy.datetime64(42) # use default resolution of "us" 1970-01-01T00:00:00.000042 # representation in ISO 8601 format In [7]: print numpy.datetime64(367.7, 'D') # decimal part is lost 1971-01-02 # still ISO 8601 format In [8]: numpy.datetime('2008-07-18T12:23:18', 'm') # from ISO 8601 Out[8]: datetime64(20273063, 'm') In [9]: print numpy.datetime('2008-07-18T12:23:18', 'm') Out[9]: 2008-07-18T12:23 In [10]: t = numpy.zeros(5, dtype="datetime64[ms]") In [11]: t[0] = datetime.datetime.now() # setter in action In [12]: print t [2008-07-16T13:39:25.315 1970-01-01T00:00:00.000 1970-01-01T00:00:00.000 1970-01-01T00:00:00.000 1970-01-01T00:00:00.000] In [13]: t[0].item() # getter in action Out[13]: datetime.datetime(2008, 7, 16, 13, 39, 25, 315000) In [14]: print t.dtype dtype('datetime64[ms]') And here it goes an example of use for the ``timedelta64``:: In [5]: numpy.timedelta64(10) # use default resolution of "us" Out[5]: timedelta64(10, 'us') In [6]: print numpy.timedelta64(10) # use default resolution of "us" 0:00:00.010 In [7]: print numpy.timedelta64(3600.2, 'm') # decimal part is lost 2 days, 12:00 In [8]: t1 = numpy.zeros(5, dtype="datetime64[ms]") In [9]: t2 = numpy.ones(5, dtype="datetime64[ms]") In [10]: t = t2 - t1 In [11]: t[0] = datetime.timedelta(0, 24) # setter in action In [12]: print t [0:00:24.000 0:00:01.000 0:00:01.000 0:00:01.000 0:00:01.000] In [13]: t[0].item() # getter in action Out[13]: datetime.timedelta(0, 24) In [14]: print t.dtype dtype('timedelta64[s]') Operating with date/time arrays =============================== ``datetime64`` vs ``datetime64`` -------------------------------- The only arithmetic operation allowed between absolute dates is the subtraction:: In [10]: numpy.ones(5, "T8") - numpy.zeros(5, "T8") Out[10]: array([1, 1, 1, 1, 1], dtype=timedelta64[us]) But not other operations:: In [11]: numpy.ones(5, "T8") + numpy.zeros(5, "T8") TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and 
Operating with date/time arrays
===============================

``datetime64`` vs ``datetime64``
--------------------------------

The only arithmetic operation allowed between absolute dates is
subtraction::

  In [10]: numpy.ones(5, "T8") - numpy.zeros(5, "T8")
  Out[10]: array([1, 1, 1, 1, 1], dtype=timedelta64[us])

But not other operations::

  In [11]: numpy.ones(5, "T8") + numpy.zeros(5, "T8")
  TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and
  'numpy.ndarray'

Comparisons between absolute dates are allowed.

``datetime64`` vs ``timedelta64``
---------------------------------

It will be possible to add and subtract relative times from absolute
dates::

  In [10]: numpy.zeros(5, "T8[Y]") + numpy.ones(5, "t8[Y]")
  Out[10]: array([1971, 1971, 1971, 1971, 1971], dtype=datetime64[Y])

  In [11]: numpy.ones(5, "T8[Y]") - 2 * numpy.ones(5, "t8[Y]")
  Out[11]: array([1969, 1969, 1969, 1969, 1969], dtype=datetime64[Y])

But not other operations::

  In [12]: numpy.ones(5, "T8[Y]") * numpy.ones(5, "t8[Y]")
  TypeError: unsupported operand type(s) for *: 'numpy.ndarray' and
  'numpy.ndarray'

``timedelta64`` vs anything
---------------------------

Finally, it will be possible to operate with relative times as if they
were regular int64 dtypes *as long as* the result can be converted back
into a ``timedelta64``::

  In [10]: numpy.ones(5, 't8')
  Out[10]: array([1, 1, 1, 1, 1], dtype=timedelta64[us])

  In [11]: (numpy.ones(5, 't8[M]') + 2) ** 3
  Out[11]: array([27, 27, 27, 27, 27], dtype=timedelta64[M])

But::

  In [12]: numpy.ones(5, 't8') + 1j
  TypeError: the result cannot be converted into a ``timedelta64``

dtype/resolution conversions
============================

For changing the date/time dtype of an existing array, we propose to use
the ``.astype()`` method.  This will be mainly useful for changing
resolutions.

For example, for absolute dates::

  In [10]: t1 = numpy.zeros(5, dtype="datetime64[s]")

  In [11]: print t1
  [1970-01-01T00:00:00  1970-01-01T00:00:00  1970-01-01T00:00:00
   1970-01-01T00:00:00  1970-01-01T00:00:00]

  In [12]: print t1.astype('datetime64[d]')
  [1970-01-01  1970-01-01  1970-01-01  1970-01-01  1970-01-01]

For relative times::

  In [10]: t1 = numpy.ones(5, dtype="timedelta64[s]")

  In [11]: print t1
  [1 1 1 1 1]

  In [12]: print t1.astype('timedelta64[ms]')
  [1000 1000 1000 1000 1000]

Converting directly between relative and absolute dtypes will not be
supported::

  In [13]: numpy.zeros(5, dtype="datetime64[s]").astype('timedelta64')
  TypeError: data type cannot be converted to the desired type
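These conversions amount to plain integer scalings of the underlying
``int64`` counts.  A sketch of the intended rule, using today's
``int64`` arrays as stand-ins (an illustration inferred from the
examples above; the proposed dtypes themselves do not exist yet)::

  import numpy

  # stand-ins for timedelta64[s] values
  sec = numpy.array([90, 90, 90], dtype=numpy.int64)
  print sec * 1000    # to [ms]: exact widening -> [90000 90000 90000]
  print sec // 60     # to [m]: floor division drops the remainder -> [1 1 1]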
Final considerations
====================

Why the ``origin`` metadata disappeared
---------------------------------------

During the discussion of the date/time dtypes in the NumPy list, the
idea of having an ``origin`` metadata item that complemented the
definition of the absolute ``datetime64`` was initially found to be
useful.

However, after thinking more about this, we found that the combination
of an absolute ``datetime64`` with a relative ``timedelta64`` offers the
same functionality while removing the need for the additional ``origin``
metadata.  This is why we have removed it from this proposal.

Operations with mixed resolutions
---------------------------------

Whenever an operation between two time values of the same dtype with the
same resolution is accepted, the same operation with time values of
different resolutions should be possible (e.g. adding a time delta in
seconds and one in microseconds), resulting in an adequate resolution.
The exact semantics of these operations are yet to be defined, though.

Resolution and dtype issues
---------------------------

The date/time dtype's resolution metadata cannot be used in general as
part of typical dtype usage.  For example, in::

  numpy.zeros(5, dtype=numpy.datetime64)

we have yet to find a sensible way to pass the resolution.  At any rate,
one can explicitly create a dtype::

  numpy.zeros(5, dtype=numpy.dtype('datetime64', res='Y'))

BTW, prior to all of this, one should also elucidate whether::

  numpy.dtype('datetime64', res='Y')

or::

  numpy.dtype('datetime64[Y]')
  numpy.dtype('T8[Y]')
  numpy.dtype('T[Y]')

would be a consistent way to instantiate a dtype in NumPy.  We really do
think that this could be a good way, but we would need to hear the
opinion of the expert.  Travis?

.. [1] http://docs.python.org/lib/module-datetime.html
.. [2] http://www.egenix.com/products/python/mxBase/mxDateTime
.. [3] http://en.wikipedia.org/wiki/Unix_time

.. Local Variables:
.. mode: rst
.. coding: utf-8
.. fill-column: 72
.. End:

From charlesr.harris at gmail.com  Fri Jul 18 14:03:57 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 18 Jul 2008 12:03:57 -0600
Subject: [Numpy-discussion] Ticket review: #848, leak in
	PyArray_DescrFromType
In-Reply-To: <20080717212322.O34675@saturn.araneidae.co.uk>
References: <20080715074217.R81915@saturn.araneidae.co.uk>
	<20080715150718.R97049@saturn.araneidae.co.uk>
	<20080717212322.O34675@saturn.araneidae.co.uk>
Message-ID:

On Fri, Jul 18, 2008 at 5:15 AM, Michael Abbott wrote:

> I'm afraid this is going to be a long one, and I don't see any good way
> to cut down the quoted text either...
>
> Charles, I'm going to plead with you to read what I've just written and
> think about it.  I'm trying to make the case as clear as I can.  I think
> the case is actually extremely simple: the existing @name@_arrtype_name
> code is broken.
>

Heh.  As the macro is undefined after the code is generated, it should
probably be moved into the code.  I would actually like to get rid of the
ifdef's (almost everywhere), but that is a later stage of cleanup.

> 3. Here's the reference count we're responsible for.

Yep.

> 4. If obj is NULL we use the typecode
> 5. otherwise we pass it to PyArray_FromAny.
> 6. The first early return
> 7. All paths (apart from 6) come together here.
>
> So at this point let's take stock.  typecode is in one of three states:
> NULL (path 2, or if creation failed), allocated with a single reference
> count (path 4), or lost (path 5).  This is not good.

It still has a single reference after 5 if PyArray_FromAny succeeded;
that reference is held by arr and transferred to robj.  If the transfer
fails, the reference to arr is decremented and NULL returned by
PyArray_Return.  When arr is garbage collected the reference to typecode
will be decremented.

> LET ME EMPHASISE THIS: the state of the code at the finish label is
> dangerous and simply broken.
>
> The original state at the finish label is indeterminate: typecode has
> either been lost by passing it to PyArray_FromAny (in which case we're
> not allowed to touch it again), or else it has a reference count that
> we're still responsible for.
>
> There seems to be a fantasy expressed in a comment in a recent update to
> this routine that PyArray_Scalar has stolen a reference, but fortunately
> a quick code inspection (of arrayobject.c) quickly refutes this
> frightening possibility.

No, no, PyArray_Scalar doesn't do anything to the reference count.  Where
did you see otherwise?

> So, the only way to fix the problem at (7) is to unify the two non-NULL
> cases.
> One answer is to add a DECREF at (4), but we see at (11) that we still
> need typecode at (7) -- so the only solution is to add an extra INCREF
> just before (5).  This then of course sadly means that we also need an
> extra DECREF at (6).
>
> PLEASE don't suggest moving the INCREF until after (6) -- at this point
> typecode is lost and may have been destroyed, and relying on any
> possibility to the contrary is a recipe for continued screw ups.
>
> The rest is easy.  Once we've established the invariant that typecode is
> either NULL or has a single reference count at (7) then the two early
> returns at (8) and (9) unfortunately need to be augmented with DECREFs.
>
> And we're done.
>
> Responses to your original comments:
>
> > By the time we hit finish, robj is NULL or holds a reference to
> > typecode and the NULL case is taken care of up front.
>
> robj has nothing to do with the lifetime management of typecode, the
> only issue is the early return.  After the finish label typecode is
> either NULL (no problem) or else has a single reference count that needs
> to be accounted for.
>
> > Later on, the reference to typecode might be decremented,
>
> That *might* is at the heart of the problem.  You can't be so cavalier
> about managing references.
>
> > perhaps leaving robj crippled, but in that case robj itself is marked
> > for deletion upon exit.
>
> Please ignore robj in this discussion, it's beside the point.
>
> > If the garbage collector can handle zero reference counts I think
> > we are alright.
>
> No, no, no.  This is nothing to do with the garbage collector.  If we
> screw up our reference counts here then the garbage collector isn't
> going to dig us out of the hole.

The garbage collector destroys the object and should decrement all the
references it holds.  If that is not the case then there are bigger
problems afoot.  The finalizer for the object should hold the knowledge
of what needs to be decremented.

> > I admit I haven't quite followed all the subroutines and macros, which
> > descend into the hazy depths without the slightest bit of
> > documentation, but at this point I'm inclined to leave things alone
> > unless you have a test that shows a leak from this source.
>
> Part of my point is that proper reference count discipline should not
> require any descent into subroutines (except for the very nasty case of
> reference theft, which I think is generally agreed to be a bad thing).

Agreed.  But that is not the code we are looking at.  My personal
schedule for this sort of cleanup/refactoring looks like this.

1) Format the code into readable C. (ongoing)
2) Document the functions so we know what they do.
3) Understand the code.
4) Fix up functions starting from the bottom layers.
5) Flatten the code -- the calls go too deep for my taste and make
   understanding difficult.  My attempts to generate a call graph have
   all run into problems.

At that point consider the reference counting model.  There aren't many
people working with the C code apart from myself and Travis (who wrote
the original), so I want to encourage you to keep looking at the code and
working with it.  However, we can't just start from scratch and we have
to be pretty conservative to keep from breaking things.
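To make the stealing convention concrete, the caller-side pattern under
discussion looks roughly like this (a sketch only, with hypothetical
obj/type_num variables -- not the actual numpy source):

    PyArray_Descr *typecode = PyArray_DescrFromType(type_num);
    PyObject *arr;

    if (typecode == NULL) {
        return NULL;
    }
    Py_INCREF(typecode);      /* extra reference we keep for ourselves */
    arr = PyArray_FromAny(obj, typecode, 0, 0, 0, NULL);
    /* PyArray_FromAny consumed one reference to typecode, success or not */
    if (arr == NULL) {
        Py_DECREF(typecode);  /* error path: drop the kept reference */
        return NULL;
    }
    /* typecode is still safe to use here; arr holds its own reference */
    Py_DECREF(typecode);      /* done with the extra reference */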
>
> As for the test case, try this one (you'll need a debug build):
>
> import numpy
> import sys
>
> r = range(100)
> refs = sys.gettotalrefcount()
> for i in r: numpy.float32()
> print sys.gettotalrefcount() - refs
>

Simpler test case:

import sys, gc
import numpy as np

def main() :
    t = np.dtype(np.float32)
    print sys.getrefcount(t)
    for i in range(100) :
        np.float32()
    gc.collect()
    print sys.getrefcount(t)

if __name__ == "__main__" :
    main()

Result

$[charris at f8 ~]$ python debug.py
5
105

So there is a leak.  The question is the proper fix.  I want to take a
closer look at PyArray_Return and also float32() and relations.

Chuck

From rowen at cesmail.net  Fri Jul 18 14:59:26 2008
From: rowen at cesmail.net (Russell E. Owen)
Date: Fri, 18 Jul 2008 11:59:26 -0700
Subject: [Numpy-discussion] 1.1.0 OSX Installer Fails Under 10.5.3?
References: <764e38540807172233m6bce652bp40478564de10e265@mail.gmail.com>
Message-ID:

In article
<764e38540807172233m6bce652bp40478564de10e265 at mail.gmail.com>,
"Christopher Burns" wrote:

> I've been using bdist_mpkg to build the OSX Installer.  I'd like to
> update the requirement documentation for the 1.1.1 release candidate to
> say "MacPython from python.org" instead of "System Python".  bdist_mpkg
> specifies this, does anyone know how to override it?

I suspect I am misunderstanding your question, but...

If you are asking how to make bdist_mpkg actually use MacPython, then
surely you simply have to have MacPython installed for that to happen?
That was certainly true for MacPython and bdist_mpkg on 10.4.x.

Or are you asking how to make the installer fail if the user's system is
missing MacPython?  That I don't know.  I usually rely on the .mpkg's
ReadMe and the user being intelligent enough to read it, but of course
that is a bit risky.

If you are asking how to modify the ReadMe file then that is trivial --
just look through the .mpkg package and you'll find it right away.  I
often replace the default ReadMe with my own when creating .mpkg
installers for others.

-- Russell

From charlesr.harris at gmail.com  Fri Jul 18 16:41:27 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 18 Jul 2008 14:41:27 -0600
Subject: [Numpy-discussion] Ticket review: #848, leak in
	PyArray_DescrFromType
In-Reply-To:
References: <20080715074217.R81915@saturn.araneidae.co.uk>
	<20080715150718.R97049@saturn.araneidae.co.uk>
	<20080717212322.O34675@saturn.araneidae.co.uk>
Message-ID:

On Fri, Jul 18, 2008 at 12:03 PM, Charles R Harris <
charlesr.harris at gmail.com> wrote:

>
> Simpler test case:
>
> import sys, gc
> import numpy as np
>
> def main() :
>     t = np.dtype(np.float32)
>     print sys.getrefcount(t)
>     for i in range(100) :
>         np.float32()
>     gc.collect()
>     print sys.getrefcount(t)
>
> if __name__ == "__main__" :
>     main()
>
> Result
>
> $[charris at f8 ~]$ python debug.py
> 5
> 105
>
> So there is a leak.  The question is the proper fix.  I want to take a
> closer look at PyArray_Return and also float32() and relations.
>

The reference leak seems specific to the float32 and complex64 types
called with default arguments.
In [1]: import sys, gc

In [2]: t = float32

In [3]: sys.getrefcount(dtype(t))
Out[3]: 4

In [4]: for i in range(10) : t();
   ...:

In [5]: sys.getrefcount(dtype(t))
Out[5]: 14

In [6]: for i in range(10) : t(0);
   ...:

In [7]: sys.getrefcount(dtype(t))
Out[7]: 14

In [8]: t = complex64

In [9]: sys.getrefcount(dtype(t))
Out[9]: 4

In [10]: for i in range(10) : t();
   ....:

In [11]: sys.getrefcount(dtype(t))
Out[11]: 14

In [12]: t = float64

In [13]: sys.getrefcount(dtype(t))
Out[13]: 19

In [14]: for i in range(10) : t();
   ....:

In [15]: sys.getrefcount(dtype(t))
Out[15]: 19

This shouldn't actually leak any memory as these types are singletons,
but it points up a logic flaw somewhere.

Chuck

From cburns at berkeley.edu  Fri Jul 18 17:17:40 2008
From: cburns at berkeley.edu (Christopher Burns)
Date: Fri, 18 Jul 2008 14:17:40 -0700
Subject: [Numpy-discussion] building a better OSX install for 1.1.1
Message-ID: <764e38540807181417y1f3dcfd1g16343e018db09eac@mail.gmail.com>

Sorry Russell, I was a bit brief before.

Installer.app checks for a system requirement when the user installs
numpy.  I build numpy using bdist_mpkg against the python.org version of
python (MacPython).  If a user tries to install numpy and they _do not_
have this version of python installed, Installer.app issues a warning:
"numpy requires System Python 2.5 to install."

The phrase "System Python" is misleading, it's reasonable to assume that
refers to the system version of python.  So I'd like to change it.

This string is stored in an Info.plist buried in the .mpkg that
bdist_mpkg builds.  I'd like to be able to override that string from the
command line, but there do not seem to be any options for changing the
requirements from the command line.

The hack solution is to modify the string in the Info.plist after the
package is built.  But I'm hoping there's a proper solution that I'm
missing.

Thanks!
Chris

On Fri, Jul 18, 2008 at 11:59 AM, Russell E. Owen
wrote:

> In article
> <764e38540807172233m6bce652bp40478564de10e265 at mail.gmail.com>,
> "Christopher Burns" wrote:
>
> > I've been using bdist_mpkg to build the OSX Installer.  I'd like to
> > update the requirement documentation for the 1.1.1 release candidate
> > to say "MacPython from python.org" instead of "System Python".
> > bdist_mpkg specifies this, does anyone know how to override it?
>
> I suspect I am misunderstanding your question, but...
>
> If you are asking how to make bdist_mpkg actually use MacPython, then
> surely you simply have to have MacPython installed for that to happen?
> That was certainly true for MacPython and bdist_mpkg on 10.4.x.
>
> Or are you asking how to make the installer fail if the user's system is
> missing MacPython?  That I don't know.  I usually rely on the .mpkg's
> ReadMe and the user being intelligent enough to read it, but of course
> that is a bit risky.
>
> If you are asking how to modify the ReadMe file then that is trivial --
> just look through the .mpkg package and you'll find it right away.  I
> often replace the default ReadMe with my own when creating .mpkg
> installers for others.
>
> -- Russell

--
Christopher Burns
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/

From fperez.net at gmail.com  Fri Jul 18 17:36:25 2008
From: fperez.net at gmail.com (Fernando Perez)
Date: Fri, 18 Jul 2008 14:36:25 -0700
Subject: [Numpy-discussion] f2py_options in setup.py
Message-ID:

Howdy,

setup.py files with calls like

    f2py_options = ['--fcompiler=gfortran'],

when building the extension object used to work.  But now I'm trying the
same thing with some f90 code, and when running the build it doesn't
seem to go through to f2py:

maqroll[felipe_fortran]> ./setup.py build
running build
running scons
customize UnixCCompiler
Found executable /usr/bin/gcc
customize GnuFCompiler
Found executable /usr/bin/g77
gnu: no Fortran 90 compiler found
gnu: no Fortran 90 compiler found
customize GnuFCompiler
gnu: no Fortran 90 compiler found
gnu: no Fortran 90 compiler found
customize UnixCCompiler
customize UnixCCompiler using scons
Found executable /usr/bin/g++

etc...

If I run f2py itself at the command line, I can give it the compiler
flag correctly.  But via setup.py it just doesn't seem to be working,
and I'm at a loss as to what I'm doing wrong.

Any help would be much appreciated.

Cheers,

f

ps - is there an updated f2py guide on the scipy site somewhere?  The
guide I found was the old one from

http://cens.ioc.ee/projects/f2py2e/

But I don't know if, after all the recent integration with numpy, all of
it remains valid.  It would be good to have f2py documented as part of
numpy, since it's now part of it.

From suchindra at gmail.com  Fri Jul 18 17:41:22 2008
From: suchindra at gmail.com (Suchindra Sandhu)
Date: Fri, 18 Jul 2008 17:41:22 -0400
Subject: [Numpy-discussion] integer array creation oddity
Message-ID:

Hi,

Can someone please explain to me this oddity?

In [1]: import numpy as n

In [8]: a = n.array((1,2,3), 'i')

In [9]: type(a[0])
Out[9]: <type 'numpy.int32'>

In [10]: type(a[0]) == n.int32
Out[10]: False

When I create an array with 'int', 'int32' etc. it works fine.

What is the type of 'i' and what is n.int0?

Thanks,
Suchindra

From charlesr.harris at gmail.com  Fri Jul 18 18:11:59 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 18 Jul 2008 16:11:59 -0600
Subject: [Numpy-discussion] Ticket review: #848, leak in
	PyArray_DescrFromType
In-Reply-To:
References: <20080715074217.R81915@saturn.araneidae.co.uk>
	<20080715150718.R97049@saturn.araneidae.co.uk>
	<20080717212322.O34675@saturn.araneidae.co.uk>
Message-ID:

On Fri, Jul 18, 2008 at 2:41 PM, Charles R Harris wrote:

There are actually two bugs here, which is confusing things.

Bug 1) Deleting a numpy scalar doesn't decrement the descr reference
count.

Bug 2) PyArray_Return decrements the reference to arr, which in turn
correctly decrements the descr on gc.

So calls that go through the default (obj == NULL) correctly leave
typecode with a reference, it just never gets decremented when the
return object is deleted.  On the other hand, if the function is called
with something like float32(0), then arr is allocated, stealing a
reference to typecode.
PyArray_Return then deletes arr (which decrements the typecode
reference), but that doesn't matter since typecode is a singleton.  In
this case there is no follow-on stack dump when the returned object is
deleted because the descr reference is not correctly decremented.  BTW,
both cases get returned in the first if statement after the finish
label, i.e., robj is returned.  Oy, what a mess.

Here is a short program to follow all the reference counts.

import sys, gc
import numpy as np

def main() :
    typecodes = "?bBhHiIlLqQfdgFDG"
    for typecode in typecodes :
        t = np.dtype(typecode)
        refcnt = sys.getrefcount(t)
        t.type()
        gc.collect()
        print typecode, t.name, sys.getrefcount(t) - refcnt

if __name__ == "__main__" :
    main()

Which gives the following output:

$[charris at f8 ~]$ python debug.py
? bool 0
b int8 1
B uint8 1
h int16 1
H uint16 1
i int32 0
I uint32 1
l int32 0
L uint32 1
q int64 1
Q uint64 1
f float32 1
d float64 0
g float96 1
F complex64 1
D complex128 1
G complex192 1

Note that the python types, which the macro handles, don't have the
reference leak problem.

Chuck

From rmay31 at gmail.com  Fri Jul 18 18:16:57 2008
From: rmay31 at gmail.com (Ryan May)
Date: Fri, 18 Jul 2008 18:16:57 -0400
Subject: [Numpy-discussion] numpy.loadtext() fails with dtype + usecols
Message-ID: <48811659.9020004@gmail.com>

Hi,

I was trying to use loadtxt() today to read in some text data, and I had
a problem when I specified a dtype that only contained as many elements
as in columns in usecols.  The example below shows the problem:

import numpy as np
import StringIO
data = '''STID RELH TAIR
JOE 70.1 25.3
BOB 60.5 27.9
'''
f = StringIO.StringIO(data)
names = ['stid', 'temp']
dtypes = ['S4', 'f8']
arr = np.loadtxt(f, usecols=(0,2), dtype=zip(names,dtypes), skiprows=1)

With current 1.1 (and SVN head), this yields:

IndexError                                Traceback (most recent call last)

/home/rmay/<ipython console> in <module>()

/usr/lib64/python2.5/site-packages/numpy/lib/io.pyc in loadtxt(fname,
dtype, comments, delimiter, converters, skiprows, usecols, unpack)
    309                         for j in xrange(len(vals))]
    310         if usecols is not None:
--> 311             row = [converterseq[j](vals[j]) for j in usecols]
    312         else:
    313             row = [converterseq[j](val) for j,val in
enumerate(vals)]

IndexError: list index out of range
------------------------------------------

I've added a patch that checks for usecols, and if present, correctly
creates the converters dictionary to map each specified column with the
converter for the corresponding field in the dtype.  With the attached
patch, this works fine:

>arr
array([('JOE', 25.300000000000001), ('BOB', 27.899999999999999)],
      dtype=[('stid', '|S4'), ('temp', '<f8')])

Comments?  Can I get this in for 1.1.1?

From charlesr.harris at gmail.com  Fri Jul 18 18:37:37 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 18 Jul 2008 16:37:37 -0600
Subject: [Numpy-discussion] building a better OSX install for 1.1.1
In-Reply-To: <764e38540807181417y1f3dcfd1g16343e018db09eac@mail.gmail.com>
References: <764e38540807181417y1f3dcfd1g16343e018db09eac@mail.gmail.com>
Message-ID:

On Fri, Jul 18, 2008 at 3:17 PM, Christopher Burns wrote:

> Sorry Russell, I was a bit brief before.
>
> Installer.app checks for a system requirement when the user installs
> numpy.  I build numpy using bdist_mpkg against the python.org version of
> python (MacPython).  If a user tries to install numpy and they _do not_
> have this version of python installed, Installer.app issues a warning:
> "numpy requires System Python 2.5 to install."
>
> The phrase "System Python" is misleading, it's reasonable to assume that
> refers to the system version of python.  So I'd like to change it.
>
> This string is stored in an Info.plist buried in the .mpkg that
> bdist_mpkg builds.  I'd like to be able to override that string from the
> command line, but there do not seem to be any options for changing the
> requirements from the command line.
>
> The hack solution is to modify the string in the Info.plist after the
> package is built.  But I'm hoping there's a proper solution that I'm
> missing.
>
> Thanks!
> Chris
>
> On Fri, Jul 18, 2008 at 11:59 AM, Russell E. Owen
> wrote:
>
>> In article
>> <764e38540807172233m6bce652bp40478564de10e265 at mail.gmail.com>,
>> "Christopher Burns" wrote:
>>
>> > I've been using bdist_mpkg to build the OSX Installer.  I'd like to
>> > update the requirement documentation for the 1.1.1 release candidate
>> > to say "MacPython from python.org" instead of "System Python".
>> > bdist_mpkg specifies this, does anyone know how to override it?
>>
>> I suspect I am misunderstanding your question, but...
>>
>> If you are asking how to make bdist_mpkg actually use MacPython, then
>> surely you simply have to have MacPython installed for that to happen?
>> That was certainly true for MacPython and bdist_mpkg on 10.4.x.
>>
>> Or are you asking how to make the installer fail if the user's system
>> is missing MacPython?  That I don't know.  I usually rely on the
>> .mpkg's ReadMe and the user being intelligent enough to read it, but of
>> course that is a bit risky.
>>
>> If you are asking how to modify the ReadMe file then that is trivial --
>> just look through the .mpkg package and you'll find it right away.  I
>> often replace the default ReadMe with my own when creating .mpkg
>> installers for others.
>>

Since 1.1.1rc1 is coming out this Sunday, I'd like to know who is
responsible for the OS X install improvements, if that is what they are.
I don't know squat about them myself and don't run OS X.

Chuck

From fperez.net at gmail.com  Fri Jul 18 18:39:55 2008
From: fperez.net at gmail.com (Fernando Perez)
Date: Fri, 18 Jul 2008 15:39:55 -0700
Subject: [Numpy-discussion] f2py - how to use .pyf files?
Message-ID:

Howdy,

again f2py...  Since I can't seem to figure out how to pass the
--fcompiler option to f2py via setup.py/distutils, I decided to just do
things for now via a plain makefile.  But I'm struggling here too.  The
problem is this: the call

    f2py -c --fcompiler=gfortran -m text2 Text2.f90

works perfectly, and at some point in the output I see

    customize Gnu95FCompiler
    Found executable /usr/bin/gfortran

and the result is a nice text2.so module.
But I'd like to clean up a bit the python interface to the fortran
routines, so I did the usual

    f2py -h text2.pyf Text2.f90

to create the .pyf, edited the pyf to adjust and 'pythonize' the
interface, and then when I try to build using this pyf, I get a crash
and the *same* gfortran option is now not recognized:

maqroll[felipe_fortran]> f2py -c --fcompiler=gfortran text2.pyf
Unknown vendor: "gfortran"
Traceback (most recent call last):
  File "/usr/bin/f2py", line 26, in <module>
    main()
  File "/home/fperez/usr/opt/lib/python2.5/site-packages/numpy/f2py/f2py2e.py",
line 560, in main
    run_compile()
  File "/home/fperez/usr/opt/lib/python2.5/site-packages/numpy/f2py/f2py2e.py",
line 536, in run_compile
    ext = Extension(**ext_args)
  File "/home/fperez/usr/opt/lib/python2.5/site-packages/numpy/distutils/extension.py",
line 45, in __init__
    export_symbols)
  File "/usr/lib/python2.5/distutils/extension.py", line 106, in __init__
    assert type(name) is StringType, "'name' must be a string"
AssertionError: 'name' must be a string

Note that it doesn't matter if I add Text2.f90 or not to the above call,
it still fails.

I could swear I'd done similar things in the past without any problems
(albeit with f77 sources), and the user guide

http://cens.ioc.ee/projects/f2py2e/usersguide/index.html#three-ways-to-wrap-getting-started

gives instructions very much along the lines of what I'm doing.  Are
these changes since the integration into numpy, regressions, or mistakes
in how I'm calling it?

Thanks for any help,

f

From charlesr.harris at gmail.com  Fri Jul 18 18:46:20 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 18 Jul 2008 16:46:20 -0600
Subject: [Numpy-discussion] numpy.loadtext() fails with dtype + usecols
In-Reply-To: <48811659.9020004@gmail.com>
References: <48811659.9020004@gmail.com>
Message-ID:

On Fri, Jul 18, 2008 at 4:16 PM, Ryan May wrote:

> Hi,
>
> I was trying to use loadtxt() today to read in some text data, and I had
> a problem when I specified a dtype that only contained as many elements
> as in columns in usecols.  The example below shows the problem:
>
> import numpy as np
> import StringIO
> data = '''STID RELH TAIR
> JOE 70.1 25.3
> BOB 60.5 27.9
> '''
> f = StringIO.StringIO(data)
> names = ['stid', 'temp']
> dtypes = ['S4', 'f8']
> arr = np.loadtxt(f, usecols=(0,2), dtype=zip(names,dtypes), skiprows=1)
>
> With current 1.1 (and SVN head), this yields:
>
> IndexError                             Traceback (most recent call last)
>
> /home/rmay/<ipython console> in <module>()
>
> /usr/lib64/python2.5/site-packages/numpy/lib/io.pyc in loadtxt(fname,
> dtype, comments, delimiter, converters, skiprows, usecols, unpack)
>     309                         for j in xrange(len(vals))]
>     310         if usecols is not None:
> --> 311             row = [converterseq[j](vals[j]) for j in usecols]
>     312         else:
>     313             row = [converterseq[j](val) for j,val in
> enumerate(vals)]
>
> IndexError: list index out of range
> ------------------------------------------
>
> I've added a patch that checks for usecols, and if present, correctly
> creates the converters dictionary to map each specified column with the
> converter for the corresponding field in the dtype.  With the attached
> patch, this works fine:
>
> >arr
> array([('JOE', 25.300000000000001), ('BOB', 27.899999999999999)],
>       dtype=[('stid', '|S4'), ('temp', '<f8')])
URL: From robert.kern at gmail.com Fri Jul 18 18:53:07 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 18 Jul 2008 17:53:07 -0500 Subject: [Numpy-discussion] f2py - how to use .pyf files? In-Reply-To: References: Message-ID: <3d375d730807181553t44d96a07g664e776fdebfe19c@mail.gmail.com> On Fri, Jul 18, 2008 at 17:39, Fernando Perez wrote: > Howdy, > > again f2py... Since I can't seem to figure out how to pass the > --fcompiler option to f2py via setup.py/distutils, I decided to just > do things for now via a plain makefile. But I'm struggling here too. > The problem is this: the call > > f2py -c --fcompiler=gfortran -m text2 Text2.f90 > > works perfectly, and at some point in the output I see > > customize Gnu95FCompiler > Found executable /usr/bin/gfortran > > and the result is a nice text2.so module. But I'd like to clean up a > bit the python interface to the fortran routines, so I did the usual > > f2py -h text2.pyf Text2.f90 You still need "-m text2", I believe. > to create the .pyf, edited the pyf to adjust and 'pythonize' the > interface, and then when I try to build using this pyf, I get a crash > and the *same* gfortran option is now not recognized: > > maqroll[felipe_fortran]> f2py -c --fcompiler=gfortran text2.pyf > Unknown vendor: "gfortran" It's --fcompiler=gnu95, not --fcompiler=gfortran -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From stefan at sun.ac.za Fri Jul 18 19:10:57 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 19 Jul 2008 01:10:57 +0200 Subject: [Numpy-discussion] integer array creation oddity In-Reply-To: References: Message-ID: <9457e7c80807181610j64c0e2a5s78f2b7a71996148e@mail.gmail.com> 2008/7/18 Suchindra Sandhu : > Can someone please explain to me this oddity? > > In [1]: import numpy as n > > In [8]: a = n.array((1,2,3), 'i') > > In [9]: type(a[0]) > Out[9]: There's more than one int32 type lying around. Rather, compare *dtypes*: In [19]: a.dtype == np.int32 Out[19]: True Regards St?fan From fperez.net at gmail.com Fri Jul 18 19:19:15 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Fri, 18 Jul 2008 16:19:15 -0700 Subject: [Numpy-discussion] f2py - how to use .pyf files? In-Reply-To: <3d375d730807181553t44d96a07g664e776fdebfe19c@mail.gmail.com> References: <3d375d730807181553t44d96a07g664e776fdebfe19c@mail.gmail.com> Message-ID: On Fri, Jul 18, 2008 at 3:53 PM, Robert Kern wrote: > You still need "-m text2", I believe. Right, thanks. But it still doesn't quite work. 
Consider a makefile with lib: seglib.so seglib.so: Text2.f90 f2py -c --fcompiler=gnu95 -m seglib Text2.f90 pyf: Text2.f90 f2py -h seglib.pyf -m seglib Text2.f90 --overwrite-signature lib2: Text2.f90 f2py -c --fcompiler=gnu95 seglib.pyf If I type make lib it works fine, but make pyf make lib2 bombs out with: gfortran:f90: /tmp/tmpNgmzmT/src.linux-i686-2.5/seglib-f2pywrappers2.f90 /tmp/tmpNgmzmT/src.linux-i686-2.5/seglib-f2pywrappers2.f90:7.41: use seg_functions, only : g_tilde2d 1 Fatal Error: Can't open module file 'seg_functions.mod' for reading at (1): No such file or directory /tmp/tmpNgmzmT/src.linux-i686-2.5/seglib-f2pywrappers2.f90:7.41: use seg_functions, only : g_tilde2d 1 Fatal Error: Can't open module file 'seg_functions.mod' for reading at (1): No such file or directory error: Command "/usr/bin/gfortran -Wall -fno-second-underscore -fPIC -O3 -funroll-loops -march=i686 -mmmx -msse2 -msse -msse3 -fomit-frame-pointer -malign-double -I/tmp/tmpNgmzmT/src.linux-i686-2.5 -I/home/fperez/usr/opt/lib/python2.5/site-packages/numpy/core/include -I/usr/include/python2.5 -c -c /tmp/tmpNgmzmT/src.linux-i686-2.5/seglib-f2pywrappers2.f90 -o /tmp/tmpNgmzmT/tmp/tmpNgmzmT/src.linux-i686-2.5/seglib-f2pywrappers2.o" failed with exit status 1 make: *** [lib2] Error 1 Is it obvious what I'm doing wrong? >> to create the .pyf, edited the pyf to adjust and 'pythonize' the >> interface, and then when I try to build using this pyf, I get a crash >> and the *same* gfortran option is now not recognized: >> >> maqroll[felipe_fortran]> f2py -c --fcompiler=gfortran text2.pyf >> Unknown vendor: "gfortran" > > It's --fcompiler=gnu95, not --fcompiler=gfortran The funny thing is that 'gfortran' does work for the plain call f2py -c --fcompiler=gfortran Text2.f90 just fine, but not for the other forms. So it's easy to be misled into thinking that it might actually be the correct call. Clever trick to trap the unwary :) Cheers, f From robert.kern at gmail.com Fri Jul 18 19:26:07 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 18 Jul 2008 18:26:07 -0500 Subject: [Numpy-discussion] f2py - how to use .pyf files? In-Reply-To: References: <3d375d730807181553t44d96a07g664e776fdebfe19c@mail.gmail.com> Message-ID: <3d375d730807181626k7a427929l5a742cd2d2d08e9a@mail.gmail.com> On Fri, Jul 18, 2008 at 18:19, Fernando Perez wrote: > On Fri, Jul 18, 2008 at 3:53 PM, Robert Kern wrote: > >> You still need "-m text2", I believe. > > Right, thanks. But it still doesn't quite work. Consider a makefile with > > lib: seglib.so > > seglib.so: Text2.f90 > f2py -c --fcompiler=gnu95 -m seglib Text2.f90 > > pyf: Text2.f90 > f2py -h seglib.pyf -m seglib Text2.f90 --overwrite-signature > > lib2: Text2.f90 > f2py -c --fcompiler=gnu95 seglib.pyf You still need to have Text2.f90 on the line. lib2: Text2.f90 f2py -c --fcompiler=gnu95 seglib.pyf Text2.f90 -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From fperez.net at gmail.com Fri Jul 18 19:35:21 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Fri, 18 Jul 2008 16:35:21 -0700 Subject: [Numpy-discussion] f2py - how to use .pyf files? 
In-Reply-To: <3d375d730807181626k7a427929l5a742cd2d2d08e9a@mail.gmail.com>
References: <3d375d730807181553t44d96a07g664e776fdebfe19c@mail.gmail.com>
	<3d375d730807181626k7a427929l5a742cd2d2d08e9a@mail.gmail.com>
Message-ID:

On Fri, Jul 18, 2008 at 4:26 PM, Robert Kern wrote:
> On Fri, Jul 18, 2008 at 18:19, Fernando Perez wrote:
>> On Fri, Jul 18, 2008 at 3:53 PM, Robert Kern wrote:
>>
>>> You still need "-m text2", I believe.
>>
>> Right, thanks.  But it still doesn't quite work.  Consider a makefile
>> with
>>
>> lib: seglib.so
>>
>> seglib.so: Text2.f90
>>        f2py -c --fcompiler=gnu95 -m seglib Text2.f90
>>
>> pyf: Text2.f90
>>        f2py -h seglib.pyf -m seglib Text2.f90 --overwrite-signature
>>
>> lib2: Text2.f90
>>        f2py -c --fcompiler=gnu95 seglib.pyf
>
> You still need to have Text2.f90 on the line.

Ahah!  I went on this:

    -h <filename>    Write signatures of the fortran routines to file
                     <filename> and exit. You can then edit <filename>
                     and use it
                     ***instead***
                     of <fortran files>.

[emphasis mine].  The instead there led me to think that I should NOT
list the fortran files.  Should that docstring be fixed, or am I just
misreading something?

And do you have any ideas on why the f2py_options in setup.py don't
correctly pass the --fcompiler flag down to f2py?  It does work if one
calls setup.py via

    ./setup.py config_fc --fcompiler=gnu95 build_ext --inplace

but it seems it would be good to be able to set all f2py options inside
the setup.py file (without resorting to sys.argv hacks).  Or does this
go against the grain of numpy.distutils?

Cheers,

f

From gael.varoquaux at normalesup.org  Fri Jul 18 19:40:33 2008
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Sat, 19 Jul 2008 01:40:33 +0200
Subject: [Numpy-discussion] [matplotlib-devel] Matplotlib and latest numpy
In-Reply-To: <20080718232551.GA2854@phare.normalesup.org>
References: <20080718232551.GA2854@phare.normalesup.org>
Message-ID: <20080718234033.GB2854@phare.normalesup.org>

On Sat, Jul 19, 2008 at 01:25:51AM +0200, Gael Varoquaux wrote:
> Am I wrong, or does matplotlib not build with current numpy svn?

OK, Fernando told me that matplotlib builds fine with latest numpy on
his box so I enquired a bit more.
The problem is that the build of matplotlib tries to include a header
file that is generated automatically during the build of numpy
(__multiarray_api.h).  If you install numpy using "python setup.py
install", this header file is included in the install.  However I used
"python setupegg.py develop" to install numpy.  As a result the header
file does not get put in the include directory.  I guess this is thus a
bug in the setupegg.py of numpy, but my knowledge of setuptools is way
too low to be able to fix that.

Cheers,

Gaël

From robert.kern at gmail.com  Fri Jul 18 19:44:21 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 18 Jul 2008 18:44:21 -0500
Subject: [Numpy-discussion] [matplotlib-devel] Matplotlib and latest numpy
In-Reply-To: <20080718234033.GB2854@phare.normalesup.org>
References: <20080718232551.GA2854@phare.normalesup.org>
	<20080718234033.GB2854@phare.normalesup.org>
Message-ID: <3d375d730807181644u20da0d36xead5c30f1d3dc506@mail.gmail.com>

On Fri, Jul 18, 2008 at 18:40, Gael Varoquaux wrote:
> OK, Fernando told me that matplotlib builds fine with latest numpy on
> his box so I enquired a bit more.  The problem is that the build of
> matplotlib tries to include a header file that is generated
> automatically during the build of numpy (__multiarray_api.h).  If you
> install numpy using "python setup.py install", this header file is
> included in the install.  However I used "python setupegg.py develop"
> to install numpy.  As a result the header file does not get put in the
> include directory.  I guess this is thus a bug in the setupegg.py of
> numpy, but my knowledge of setuptools is way too low to be able to fix
> that.

It's not a setuptools issue at all.  "build_ext --inplace" just doesn't
install header files.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From charlesr.harris at gmail.com  Fri Jul 18 19:46:21 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 18 Jul 2008 17:46:21 -0600
Subject: [Numpy-discussion] [matplotlib-devel] Matplotlib and latest numpy
In-Reply-To: <3d375d730807181644u20da0d36xead5c30f1d3dc506@mail.gmail.com>
References: <20080718232551.GA2854@phare.normalesup.org>
	<20080718234033.GB2854@phare.normalesup.org>
	<3d375d730807181644u20da0d36xead5c30f1d3dc506@mail.gmail.com>
Message-ID:

On Fri, Jul 18, 2008 at 5:44 PM, Robert Kern wrote:

> On Fri, Jul 18, 2008 at 18:40, Gael Varoquaux wrote:
> > On Sat, Jul 19, 2008 at 01:25:51AM +0200, Gael Varoquaux wrote:
> >> Am I wrong, or does matplotlib not build with current numpy svn?
> >
> > OK, Fernando told me that matplotlib builds fine with latest numpy on
> > his box so I enquired a bit more.  The problem is that the build of
> > matplotlib tries to include a header file that is generated
> > automatically during the build of numpy (__multiarray_api.h).  If you
> > install numpy using "python setup.py install", this header file is
> > included in the install.  However I used "python setupegg.py develop"
> > to install numpy.  As a result the header file does not get put in
> > the include directory.  I guess this is thus a bug in the setupegg.py
> > of numpy, but my knowledge of setuptools is way too low to be able to
> > fix that.
>
> It's not a setuptools issue at all.  "build_ext --inplace" just doesn't
> install header files.
>

So is there a fix for this problem?

Chuck

From robert.kern at gmail.com  Fri Jul 18 19:47:08 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 18 Jul 2008 18:47:08 -0500
Subject: [Numpy-discussion] f2py - how to use .pyf files?
In-Reply-To:
References: <3d375d730807181553t44d96a07g664e776fdebfe19c@mail.gmail.com>
	<3d375d730807181626k7a427929l5a742cd2d2d08e9a@mail.gmail.com>
Message-ID: <3d375d730807181647l57200e86pa7f679a981b9114c@mail.gmail.com>

On Fri, Jul 18, 2008 at 18:35, Fernando Perez wrote:
> On Fri, Jul 18, 2008 at 4:26 PM, Robert Kern wrote:
>> On Fri, Jul 18, 2008 at 18:19, Fernando Perez wrote:
>>> On Fri, Jul 18, 2008 at 3:53 PM, Robert Kern wrote:
>>>
>>>> You still need "-m text2", I believe.
>>>
>>> Right, thanks.  But it still doesn't quite work.  Consider a makefile
>>> with
>>>
>>> lib: seglib.so
>>>
>>> seglib.so: Text2.f90
>>>        f2py -c --fcompiler=gnu95 -m seglib Text2.f90
>>>
>>> pyf: Text2.f90
>>>        f2py -h seglib.pyf -m seglib Text2.f90 --overwrite-signature
>>>
>>> lib2: Text2.f90
>>>        f2py -c --fcompiler=gnu95 seglib.pyf
>>
>> You still need to have Text2.f90 on the line.
>
> Ahah!
I went on this:
>
>     -h <filename>    Write signatures of the fortran routines to file
>                      <filename> and exit. You can then edit <filename>
>                      and use it
>                      ***instead***
>                      of <fortran files>.
>
> [emphasis mine].  The instead there led me to think that I should NOT
> list the fortran files.  Should that docstring be fixed, or am I just
> misreading something?
>
> And do you have any ideas on why the f2py_options in setup.py don't
> correctly pass the --fcompiler flag down to f2py?  It does work if one
> calls setup.py via
>
> ./setup.py config_fc --fcompiler=gnu95 build_ext --inplace
>
> but it seems it would be good to be able to set all f2py options
> inside the setup.py file (without resorting to sys.argv hacks).  Or
> does this go against the grain of numpy.distutils?

For publicly distributed packages, it does go against the grain.  Never
hard-code the compiler!  Use a setup.cfg file, instead.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From gael.varoquaux at normalesup.org  Fri Jul 18 19:49:34 2008
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Sat, 19 Jul 2008 01:49:34 +0200
Subject: [Numpy-discussion] [matplotlib-devel] Matplotlib and latest numpy
In-Reply-To: <3d375d730807181644u20da0d36xead5c30f1d3dc506@mail.gmail.com>
References: <20080718232551.GA2854@phare.normalesup.org>
	<20080718234033.GB2854@phare.normalesup.org>
	<3d375d730807181644u20da0d36xead5c30f1d3dc506@mail.gmail.com>
Message-ID: <20080718234934.GC2854@phare.normalesup.org>

On Fri, Jul 18, 2008 at 06:44:21PM -0500, Robert Kern wrote:
> It's not a setuptools issue at all.  "build_ext --inplace" just doesn't
> install header files.

Does that mean that python setup.py develop should be banned for numpy?

If so I consider it a setuptools issue: one more caveat to learn about
setuptools, i.e. the fact that setuptools develop is not reliable and
can lead to interesting bugs without complaining at all.  I am just
frustrated at losing a significant amount of my time discovering the
behavior introduced by setuptools that keeps popping up all the time.

Gaël

From robert.kern at gmail.com  Fri Jul 18 19:53:47 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 18 Jul 2008 18:53:47 -0500
Subject: [Numpy-discussion] [matplotlib-devel] Matplotlib and latest numpy
In-Reply-To: <20080718234934.GC2854@phare.normalesup.org>
References: <20080718232551.GA2854@phare.normalesup.org>
	<20080718234033.GB2854@phare.normalesup.org>
	<3d375d730807181644u20da0d36xead5c30f1d3dc506@mail.gmail.com>
	<20080718234934.GC2854@phare.normalesup.org>
Message-ID: <3d375d730807181653m1e3f10c7q1027a45834f1cd32@mail.gmail.com>

On Fri, Jul 18, 2008 at 18:49, Gael Varoquaux wrote:
> On Fri, Jul 18, 2008 at 06:44:21PM -0500, Robert Kern wrote:
>> It's not a setuptools issue at all.  "build_ext --inplace" just doesn't
>> install header files.
>
> Does that mean that python setup.py develop should be banned for numpy?

No.

> If so I consider it a setuptools issue: one more caveat to learn about
> setuptools, i.e. the fact that setuptools develop is not reliable and
> can lead to interesting bugs without complaining at all.

IT'S NOT A SETUPTOOLS ISSUE.  If you had done a "python setup.py
build_ext --inplace" and then set your PYTHONPATH manually, you would
have the same problem.  Stop blaming setuptools for every little
problem.  Building inplace is not a setuptools feature.  It's a
distutils feature.
The fact that we have header files in the package is a numpy feature. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Fri Jul 18 20:02:14 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Jul 2008 18:02:14 -0600 Subject: [Numpy-discussion] [matplotlib-devel] Matplotlib and latest numpy In-Reply-To: <3d375d730807181653m1e3f10c7q1027a45834f1cd32@mail.gmail.com> References: <20080718232551.GA2854@phare.normalesup.org> <20080718234033.GB2854@phare.normalesup.org> <3d375d730807181644u20da0d36xead5c30f1d3dc506@mail.gmail.com> <20080718234934.GC2854@phare.normalesup.org> <3d375d730807181653m1e3f10c7q1027a45834f1cd32@mail.gmail.com> Message-ID: On Fri, Jul 18, 2008 at 5:53 PM, Robert Kern wrote: > On Fri, Jul 18, 2008 at 18:49, Gael Varoquaux > wrote: > > On Fri, Jul 18, 2008 at 06:44:21PM -0500, Robert Kern wrote: > >> It's not a setuptools issue at all. "build_ext --inplace" just doesn't > >> install header files. > > > > Does that mean that python setup.py develop should be banned for numpy? > > No. > > > If so I consider it a setuptools issue: one more caveat to learn about > > setuptools, ie the fact that setuptools develop is not reliable and can > > lead to interested bugs without conplaining at all. > > IT'S NOT A SETUPTOOLS ISSUE. If you had done a "python setup.py > build_ext --inplace" and then set your PYTHONPATH manually, you would > have the same problem. Stop blaming setuptools for every little > problem. Building inplace is not a setuptools feature. It's a > distutils feature. The fact that we have header files in the package > is a numpy feature. > So what was Gael doing wrong? Was it the develop on this line? python setupegg.py develop I'm asking because of the upcoming release. I never use setupegg.py and I don't know what is going on here. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Fri Jul 18 20:04:19 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Fri, 18 Jul 2008 17:04:19 -0700 Subject: [Numpy-discussion] f2py - how to use .pyf files? In-Reply-To: <3d375d730807181647l57200e86pa7f679a981b9114c@mail.gmail.com> References: <3d375d730807181553t44d96a07g664e776fdebfe19c@mail.gmail.com> <3d375d730807181626k7a427929l5a742cd2d2d08e9a@mail.gmail.com> <3d375d730807181647l57200e86pa7f679a981b9114c@mail.gmail.com> Message-ID: On Fri, Jul 18, 2008 at 4:47 PM, Robert Kern wrote: > For publicly distributed packages, it does go against the grain. Never > hard-code the compiler! Use a setup.cfg file, instead. Quite all right. But this was for in-house code where a group has agreed to all use the same compiler. It's basically a matter of wanting ./setup.py install to work without further flags. More generically, the way that f2py_options work is kind of confusing, since it doesn't just pass options to f2py :) In any case, I'm very grateful for all your help, but I get the sense that all of this distutils/f2py stuff would greatly benefit from clearer documentation. I imagine the doc team is already working on a pretty full plate, but if anyone tackles this particular issue, they'd be making a very useful contribution. 
Cheers, f From robert.kern at gmail.com Fri Jul 18 20:07:22 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 18 Jul 2008 19:07:22 -0500 Subject: [Numpy-discussion] [matplotlib-devel] Matplotlib and latest numpy In-Reply-To: References: <20080718232551.GA2854@phare.normalesup.org> <20080718234033.GB2854@phare.normalesup.org> <3d375d730807181644u20da0d36xead5c30f1d3dc506@mail.gmail.com> <20080718234934.GC2854@phare.normalesup.org> <3d375d730807181653m1e3f10c7q1027a45834f1cd32@mail.gmail.com> Message-ID: <3d375d730807181707w55d18416m105e4b045f7ebb46@mail.gmail.com> On Fri, Jul 18, 2008 at 19:02, Charles R Harris wrote: > > On Fri, Jul 18, 2008 at 5:53 PM, Robert Kern wrote: >> >> On Fri, Jul 18, 2008 at 18:49, Gael Varoquaux >> wrote: >> > On Fri, Jul 18, 2008 at 06:44:21PM -0500, Robert Kern wrote: >> >> It's not a setuptools issue at all. "build_ext --inplace" just doesn't >> >> install header files. >> > >> > Does that mean that python setup.py develop should be banned for numpy? >> >> No. >> >> > If so I consider it a setuptools issue: one more caveat to learn about >> > setuptools, ie the fact that setuptools develop is not reliable and can >> > lead to interested bugs without conplaining at all. >> >> IT'S NOT A SETUPTOOLS ISSUE. If you had done a "python setup.py >> build_ext --inplace" and then set your PYTHONPATH manually, you would >> have the same problem. Stop blaming setuptools for every little >> problem. Building inplace is not a setuptools feature. It's a >> distutils feature. The fact that we have header files in the package >> is a numpy feature. > > So what was Gael doing wrong? Was it the develop on this line? > > python setupegg.py develop > > I'm asking because of the upcoming release. I never use setupegg.py and I > don't know what is going on here. The code generation logic in numpy does not know anything about "build_ext --inplace" (which is a distutils feature that develop invokes). That's it. This problem has always existed for all versions of numpy whether you use setuptools or not. The logic in numpy/core/setup.py that places the generated files needs to be fixed if you want to fix this issue. I'm testing out a fix right now. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Fri Jul 18 20:15:38 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 18 Jul 2008 19:15:38 -0500 Subject: [Numpy-discussion] f2py - how to use .pyf files? In-Reply-To: References: <3d375d730807181553t44d96a07g664e776fdebfe19c@mail.gmail.com> <3d375d730807181626k7a427929l5a742cd2d2d08e9a@mail.gmail.com> <3d375d730807181647l57200e86pa7f679a981b9114c@mail.gmail.com> Message-ID: <3d375d730807181715k55a7b38cj6977c80dfbecfb2d@mail.gmail.com> On Fri, Jul 18, 2008 at 19:04, Fernando Perez wrote: > On Fri, Jul 18, 2008 at 4:47 PM, Robert Kern wrote: > >> For publicly distributed packages, it does go against the grain. Never >> hard-code the compiler! Use a setup.cfg file, instead. > > Quite all right. But this was for in-house code where a group has > agreed to all use the same compiler. It's basically a matter of > wanting > > ./setup.py install > > to work without further flags. Right. setup.cfg is still the way to go. > More generically, the way that > f2py_options work is kind of confusing, since it doesn't just pass > options to f2py :) Probably. What other options would you need, though? 
If everything is accessible from setup.cfg, I'd rather just remove the argument. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From fperez.net at gmail.com Fri Jul 18 20:20:58 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Fri, 18 Jul 2008 17:20:58 -0700 Subject: [Numpy-discussion] f2py - how to use .pyf files? In-Reply-To: <3d375d730807181715k55a7b38cj6977c80dfbecfb2d@mail.gmail.com> References: <3d375d730807181553t44d96a07g664e776fdebfe19c@mail.gmail.com> <3d375d730807181626k7a427929l5a742cd2d2d08e9a@mail.gmail.com> <3d375d730807181647l57200e86pa7f679a981b9114c@mail.gmail.com> <3d375d730807181715k55a7b38cj6977c80dfbecfb2d@mail.gmail.com> Message-ID: On Fri, Jul 18, 2008 at 5:15 PM, Robert Kern wrote: > Probably. What other options would you need, though? If everything is > accessible from setup.cfg, I'd rather just remove the argument. I'd be OK with that too, and simply telling people to ship a companion setup.cfg. It's just that the current situation is confusing and error-prone. One way to do it and all that... Cheers, f From robert.kern at gmail.com Fri Jul 18 21:14:00 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 18 Jul 2008 20:14:00 -0500 Subject: [Numpy-discussion] [matplotlib-devel] Matplotlib and latest numpy In-Reply-To: <3d375d730807181707w55d18416m105e4b045f7ebb46@mail.gmail.com> References: <20080718232551.GA2854@phare.normalesup.org> <20080718234033.GB2854@phare.normalesup.org> <3d375d730807181644u20da0d36xead5c30f1d3dc506@mail.gmail.com> <20080718234934.GC2854@phare.normalesup.org> <3d375d730807181653m1e3f10c7q1027a45834f1cd32@mail.gmail.com> <3d375d730807181707w55d18416m105e4b045f7ebb46@mail.gmail.com> Message-ID: <3d375d730807181814vc9f355n30b4b30a822bbf1e@mail.gmail.com> On Fri, Jul 18, 2008 at 19:07, Robert Kern wrote: > The code generation logic in numpy does not know anything about > "build_ext --inplace" (which is a distutils feature that develop > invokes). That's it. This problem has always existed for all versions > of numpy whether you use setuptools or not. The logic in > numpy/core/setup.py that places the generated files needs to be fixed > if you want to fix this issue. I'm testing out a fix right now. Fixed in r5452. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gael.varoquaux at normalesup.org Fri Jul 18 21:14:59 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 19 Jul 2008 03:14:59 +0200 Subject: [Numpy-discussion] [matplotlib-devel] Matplotlib and latest numpy In-Reply-To: <3d375d730807181814vc9f355n30b4b30a822bbf1e@mail.gmail.com> References: <20080718232551.GA2854@phare.normalesup.org> <20080718234033.GB2854@phare.normalesup.org> <3d375d730807181644u20da0d36xead5c30f1d3dc506@mail.gmail.com> <20080718234934.GC2854@phare.normalesup.org> <3d375d730807181653m1e3f10c7q1027a45834f1cd32@mail.gmail.com> <3d375d730807181707w55d18416m105e4b045f7ebb46@mail.gmail.com> <3d375d730807181814vc9f355n30b4b30a822bbf1e@mail.gmail.com> Message-ID: <20080719011459.GB5176@phare.normalesup.org> On Fri, Jul 18, 2008 at 08:14:00PM -0500, Robert Kern wrote: > > The code generation logic in numpy does not know anything about > > "build_ext --inplace" (which is a distutils feature that develop > > invokes). 
That's it.  This problem has always existed for all versions
> > of numpy whether you use setuptools or not.  The logic in
> > numpy/core/setup.py that places the generated files needs to be fixed
> > if you want to fix this issue.  I'm testing out a fix right now.
>
> Fixed in r5452.

Thanks a lot Rob, you rock.

Gaël

From charlesr.harris at gmail.com  Fri Jul 18 21:44:34 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 18 Jul 2008 19:44:34 -0600
Subject: [Numpy-discussion] [matplotlib-devel] Matplotlib and latest numpy
In-Reply-To: <3d375d730807181814vc9f355n30b4b30a822bbf1e@mail.gmail.com>
References: <20080718232551.GA2854@phare.normalesup.org>
	<20080718234033.GB2854@phare.normalesup.org>
	<3d375d730807181644u20da0d36xead5c30f1d3dc506@mail.gmail.com>
	<20080718234934.GC2854@phare.normalesup.org>
	<3d375d730807181653m1e3f10c7q1027a45834f1cd32@mail.gmail.com>
	<3d375d730807181707w55d18416m105e4b045f7ebb46@mail.gmail.com>
	<3d375d730807181814vc9f355n30b4b30a822bbf1e@mail.gmail.com>
Message-ID:

On Fri, Jul 18, 2008 at 7:14 PM, Robert Kern wrote:

> On Fri, Jul 18, 2008 at 19:07, Robert Kern wrote:
>
> > The code generation logic in numpy does not know anything about
> > "build_ext --inplace" (which is a distutils feature that develop
> > invokes).  That's it.  This problem has always existed for all
> > versions of numpy whether you use setuptools or not.  The logic in
> > numpy/core/setup.py that places the generated files needs to be fixed
> > if you want to fix this issue.  I'm testing out a fix right now.
>
> Fixed in r5452.

Great...Chuck

From fperez.net at gmail.com  Fri Jul 18 21:54:52 2008
From: fperez.net at gmail.com (Fernando Perez)
Date: Fri, 18 Jul 2008 18:54:52 -0700
Subject: [Numpy-discussion] f2py - how to use .pyf files?
In-Reply-To:
References: <3d375d730807181553t44d96a07g664e776fdebfe19c@mail.gmail.com>
	<3d375d730807181626k7a427929l5a742cd2d2d08e9a@mail.gmail.com>
	<3d375d730807181647l57200e86pa7f679a981b9114c@mail.gmail.com>
	<3d375d730807181715k55a7b38cj6977c80dfbecfb2d@mail.gmail.com>
Message-ID:

On Fri, Jul 18, 2008 at 5:20 PM, Fernando Perez wrote:
> On Fri, Jul 18, 2008 at 5:15 PM, Robert Kern wrote:
>
>> Probably.  What other options would you need, though?  If everything
>> is accessible from setup.cfg, I'd rather just remove the argument.

I just remembered code I had with things like:

    # Add '--debug-capi' for verbose debugging of low-level
    # calls
    #f2py_options = ['--debug-capi'],

that can be very useful when in debug hell.  Would this go through via
setup.cfg?

> I'd be OK with that too, and simply telling people to ship a companion
> setup.cfg.  It's just that the current situation is confusing and
> error-prone.  One way to do it and all that...

BTW, I would have thought this file

    maqroll[felipe_fortran]> cat setup.cfg
    [config_fc]
    fcompiler = gnu95

would do the trick.  But it doesn't seem to:

    maqroll[felipe_fortran]> ./setup.py build_ext --inplace
    ...
    gnu: no Fortran 90 compiler found
    customize GnuFCompiler using build_ext
    building 'seglib' extension
    ...
    error: f90 not supported by GnuFCompiler needed for Text2.f90

I'm sure I'm doing something blindingly wrong, but reading the distutils
docs

http://docs.python.org/inst/config-syntax.html

made me think that the way to override the config_fc option is precisely
the above.  Thanks for any help...  All this would be good to include in
the end in a little example we ship...

Cheers,

f
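For concreteness, a minimal sketch of the kind of shipped example being
asked for (the seglib/Text2.f90 names come from this thread; the layout
is just one possibility, and whether build_ext picks up [config_fc] is
exactly the open question above):

    # setup.py -- numpy.distutils builds f2py extensions from .pyf sources
    def configuration(parent_package='', top_path=None):
        from numpy.distutils.misc_util import Configuration
        config = Configuration(None, parent_package, top_path)
        # f2py is run automatically for the .pyf/.f90 sources
        config.add_extension('seglib', sources=['seglib.pyf', 'Text2.f90'])
        return config

    if __name__ == '__main__':
        from numpy.distutils.core import setup
        setup(configuration=configuration)

with the compiler choice kept out of setup.py, per the advice above:

    # setup.cfg
    [config_fc]
    fcompiler = gnu95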
Cheers, f From oliphant at enthought.com Fri Jul 18 23:07:24 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Fri, 18 Jul 2008 22:07:24 -0500 Subject: [Numpy-discussion] Another reference count leak: ticket #848 In-Reply-To: <20080709064836.Q65045@saturn.araneidae.co.uk> References: <20080708093048.G5947@saturn.araneidae.co.uk> <4873A341.1030402@enthought.com> <20080708210520.Y42061@saturn.araneidae.co.uk> <4873EE48.1090500@enthought.com> <20080709064836.Q65045@saturn.araneidae.co.uk> Message-ID: <48815A6C.9010300@enthought.com> > Looking at the uses of PyArray_FromAny I can see the motivation for this > design: core/include/numpy/ndarrayobject.h has a lot of calls which take a > value returned by PyArray_DescrFromType as argument. This has prompted me > to take a trawl through the code to see what else is going on, and I note > a couple more issues with patches below. > The core issue is that NumPy grew out of Numeric. In Numeric PyArray_Descr was just a C-structure, but in NumPy it is now a real Python object with reference counting. Trying to have a compatible C-API to the old one and making the transition without huge changes to the C-API is what led to the "stealing" strategy. I did not just out of the blue decide to do it that way. Yes, it is a bit of a pain, and yes, it isn't the way it is always done, but as you point out there are precedents, and so that's the direction I took. It is *way* too late to change that in any meaningful way. > In the patch below the problem being fixed is that the first call to > PyArray_FromAny can result in the erasure of dtype *before* Py_INCREF is > called. Perhaps you can argue that this only occurs when NULL is > returned... > Indeed I would argue that because the array object holds a reference to the typecode (data-type object). Only if the call returns NULL does the data-type object lose its reference count, and in fact that works out rather nicely and avoids a host of extra Py_DECREFs. > The next patch deals with an interestingly subtle memory leak in > _string_richcompare where if casting to a common type fails then a > reference count will be leaked. Actually this one has nothing to do with > PyArray_FromAny, but I spotted it in passing. > This is a good catch. Thanks! > I really don't think that this design of reference count handling in > PyArray_FromAny (and consequently PyArray_CheckFromAny) is a good idea. > Your point is well noted, but again given the provenance of the code, I still think it was the best choice. And, yes, it is too late to change it. > Not only is this not a good idea, it's not documented in the API > documentation (I'm referring to the "Guide to NumPy" book) Hmmm... Are you sure? I remember writing a bit about it in the paragraphs that describe the relevant API calls. But, you could be right. > I've been trying to find some documentation on stealing references. The > Python C API reference (http://docs.python.org/api/refcountDetails.html) > says > > Few functions steal references; the two notable exceptions are > PyList_SetItem() and PyTuple_SetItem() > > An interesting essay on reference counting is at > http://lists.blender.org/pipermail/bf-python/2005-September/003092.html > Believe me, I understand reference counting pretty well. Still, it can be tricky to do correctly and it is easy to forget corner cases and error-returns. I very much appreciate your careful analysis, but I did an analysis of my own when I wrote the code, and so I will be resistant to change things if I can't see the error.
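(Illustrative aside, not part of the original exchange: the ownership convention being debated here is visible from plain Python, because the constructed array ends up holding the descr reference, e.g.

    >>> import numpy as np
    >>> d = np.dtype(np.float32)
    >>> a = np.array([1, 2, 3], dtype=d)  # the C call "steals" a reference to d
    >>> a.dtype is d                      # ...which the new array now owns
    True

so a NULL return is the only case in which nothing is left holding the stolen reference.)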
I read something from Guido once that said something to the effect that nothing beats studying the code to get reference counting right. I think this is true. > In conclusion, I can't find much about the role of stealing in reference > count management, but it's such a source of surprise (and frankly doesn't > actually work out all that well in numpy) I strongly beg to differ. This sounds very naive to me. IMHO it has worked out extremely well in converting the PyArray_Descr C-structure into the data-type objects that adds so much power to NumPy. Yes, there are a few corner cases that you have done an excellent job in digging up, but they are "corner" cases that don't cause problems for 99.9% of the use-cases. -Travis From oliphant at enthought.com Fri Jul 18 23:15:19 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Fri, 18 Jul 2008 22:15:19 -0500 Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType In-Reply-To: <20080715074217.R81915@saturn.araneidae.co.uk> References: <20080715074217.R81915@saturn.araneidae.co.uk> Message-ID: <48815C47.1010305@enthought.com> Michael Abbott wrote: > Only half of my patch for this bug has gone into trunk, and without the > rest of my patch there remains a leak. > Thanks for your work Michael. I've been so grateful to have you and Chuck and others looking carefully at the code to fix its problems. In this particular case, I'm not sure I see how (the rest of) your patch fixes any remaining leak. We do seem to be having a disagreement about whether or not the reference to typecode can be pre-maturely destroyed, but this doesn't fit what I usually call a "memory leak." I think there may be some other cause for remaining leaks. > I can see that there might be an argument that PyArray_FromAny has the > semantics that it retains a reference to typecode unless it returns NULL > ... but I don't want to go there. That would not be a good thing to rely > on -- and even with those semantics the existing code still needs fixing. > Good, that is the argument I'm making. Why don't you want to "rely on it?" Thanks for all your help. -Travis From oliphant at enthought.com Fri Jul 18 23:30:07 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Fri, 18 Jul 2008 22:30:07 -0500 Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType In-Reply-To: References: <20080715074217.R81915@saturn.araneidae.co.uk> <20080715150718.R97049@saturn.araneidae.co.uk> <20080717212322.O34675@saturn.araneidae.co.uk> Message-ID: <48815FBF.4080404@enthought.com> Charles R Harris wrote: > > > The reference leak seems specific to the float32 and complex64 types > called with default arguments. > > In [1]: import sys, gc > > In [2]: t = float32 > > In [3]: sys.getrefcount(dtype(t)) > Out[3]: 4 > > In [4]: for i in range(10) : t(); > ...: > > In [5]: sys.getrefcount(dtype(t)) > Out[5]: 14 > > In [6]: for i in range(10) : t(0); > ...: > > In [7]: sys.getrefcount(dtype(t)) > Out[7]: 14 > > In [8]: t = complex64 > > In [9]: sys.getrefcount(dtype(t)) > Out[9]: 4 > > In [10]: for i in range(10) : t(); > ....: > > In [11]: sys.getrefcount(dtype(t)) > Out[11]: 14 > > In [12]: t = float64 > > In [13]: sys.getrefcount(dtype(t)) > Out[13]: 19 > > In [14]: for i in range(10) : t(); > ....: > > In [15]: sys.getrefcount(dtype(t)) > Out[15]: 19 > > This shouldn't actually leak any memory as these types are singletons, > but it points up a logic flaw somewhere. > That is correct. There is no memory leak, but we do need to get it right. 
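(For reference, a self-contained version of that check; an illustrative sketch only, not the debug.py attached later in this thread:

    import sys
    import numpy as np

    def refcount_growth(t, n=100):
        # Built-in descrs are singletons, so np.dtype(t) always returns
        # the same object and its refcount can be watched over time.
        d = np.dtype(t)
        before = sys.getrefcount(d)
        for _ in range(n):
            t()   # no-argument scalar construction, as in In [4] above
        return sys.getrefcount(d) - before

    for t in (np.float32, np.complex64, np.float64):
        print("%-10s %d" % (t.__name__, refcount_growth(t)))

Growth proportional to n for float32 and complex64, but not for float64, is exactly the logic flaw described above.)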
I appreciate the extra eyes on some of these intimate details. What can happen (after a lot of calls) is that the reference count can wrap around to 0 and then cause a funny dealloc (actually, the dealloc won't happen, but a warning will be printed). Fixing the reference counting issues has been the single biggest difficulty in converting PyArray_Descr from a C-structure to a regular Python object. It was a very large pain to begin with, and then there has been more code added since the original conversion some of which does not do reference counting correctly (mostly my fault). I've looked over ticket #848 quite a bit and would like to determine the true cause of the growing reference count which I don't believe is fixed by the rest of the patch that is submitted there. -Travis From charlesr.harris at gmail.com Fri Jul 18 23:31:46 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Jul 2008 21:31:46 -0600 Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType In-Reply-To: <48815C47.1010305@enthought.com> References: <20080715074217.R81915@saturn.araneidae.co.uk> <48815C47.1010305@enthought.com> Message-ID: On Fri, Jul 18, 2008 at 9:15 PM, Travis E. Oliphant wrote: > Michael Abbott wrote: > > Only half of my patch for this bug has gone into trunk, and without the > > rest of my patch there remains a leak. > > > Thanks for your work Michael. I've been so grateful to have you and > Chuck and others looking carefully at the code to fix its problems. > > In this particular case, I'm not sure I see how (the rest of) your patch > fixes any remaining leak. We do seem to be having a disagreement about > whether or not the reference to typecode can be pre-maturely destroyed, > but this doesn't fit what I usually call a "memory leak." I think > there may be some other cause for remaining leaks. Travis, There really is (at least) one reference counting error in PyArray_FromAny. In particular, the obj == NULL case leaves a reference to typecode, then exits through the first return after finish. In this case robj doesn't steal a reference to typecode and the result can be seen in the python program above or by printing out the typecode->ob_refcnt from the code itself. So that needs fixing. I would suggest a DECREF in that section and a direct return of robj. The next section before finish is also a bit odd. The direct return of an array works fine, but if that isn't the branch taken, then PyArray_Return decrements the refcnt of arr, which in turn decrements the refcnt of typecode. I don't know if the resulting scalar holds a reference to typecode, but in any case the situation there should also be clarified. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Fri Jul 18 23:34:16 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Fri, 18 Jul 2008 22:34:16 -0500 Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType In-Reply-To: References: <20080715074217.R81915@saturn.araneidae.co.uk> <48815C47.1010305@enthought.com> Message-ID: <488160B8.5040602@enthought.com> Charles R Harris wrote: > > > On Fri, Jul 18, 2008 at 9:15 PM, Travis E. Oliphant > > wrote: > > Michael Abbott wrote: > > Only half of my patch for this bug has gone into trunk, and > without the > > rest of my patch there remains a leak. > > > Thanks for your work Michael. I've been so grateful to have you and > Chuck and others looking carefully at the code to fix its problems. 
> > In this particular case, I'm not sure I see how (the rest of) your > patch > fixes any remaining leak. We do seem to be having a disagreement > about > whether or not the reference to typecode can be pre-maturely > destroyed, > but this doesn't fit what I usually call a "memory leak." I think > there may be some other cause for remaining leaks. > > > Travis, > > There really is (at least) one reference counting error in > PyArray_FromAny. In particular, the obj == NULL case leaves a > reference to typecode, then exits through the first return after > finish. In this case robj doesn't steal a reference to typecode and > the result can be seen in the python program above or by printing out > the typecode->ob_refcnt from the code itself. So that needs fixing. I > would suggest a DECREF in that section and a direct return of robj. > > The next section before finish is also a bit odd. The direct return of > an array works fine, but if that isn't the branch taken, then > PyArray_Return decrements the refcnt of arr, which in turn decrements > the refcnt of typecode. I don't know if the resulting scalar holds a > reference to typecode, but in any case the situation there should also > be clarified. Thank you. I will direct attention there and try to clear this up tonight. I know Michael is finding problems that do need fixing. -Travis From charlesr.harris at gmail.com Fri Jul 18 23:49:23 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Jul 2008 21:49:23 -0600 Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType In-Reply-To: <48815FBF.4080404@enthought.com> References: <20080715074217.R81915@saturn.araneidae.co.uk> <20080715150718.R97049@saturn.araneidae.co.uk> <20080717212322.O34675@saturn.araneidae.co.uk> <48815FBF.4080404@enthought.com> Message-ID: On Fri, Jul 18, 2008 at 9:30 PM, Travis E. Oliphant wrote: > Charles R Harris wrote: > > > > > > The reference leak seems specific to the float32 and complex64 types > > called with default arguments. > > > > In [1]: import sys, gc > > > > In [2]: t = float32 > > > > In [3]: sys.getrefcount(dtype(t)) > > Out[3]: 4 > > > > In [4]: for i in range(10) : t(); > > ...: > > > > In [5]: sys.getrefcount(dtype(t)) > > Out[5]: 14 > > > > In [6]: for i in range(10) : t(0); > > ...: > > > > In [7]: sys.getrefcount(dtype(t)) > > Out[7]: 14 > > > > In [8]: t = complex64 > > > > In [9]: sys.getrefcount(dtype(t)) > > Out[9]: 4 > > > > In [10]: for i in range(10) : t(); > > ....: > > > > In [11]: sys.getrefcount(dtype(t)) > > Out[11]: 14 > > > > In [12]: t = float64 > > > > In [13]: sys.getrefcount(dtype(t)) > > Out[13]: 19 > > > > In [14]: for i in range(10) : t(); > > ....: > > > > In [15]: sys.getrefcount(dtype(t)) > > Out[15]: 19 > > > > This shouldn't actually leak any memory as these types are singletons, > > but it points up a logic flaw somewhere. > > > That is correct. There is no memory leak, but we do need to get it > right. I appreciate the extra eyes on some of these intimate details. > What can happen (after a lot of calls) is that the reference count can > wrap around to 0 and then cause a funny dealloc (actually, the dealloc > won't happen, but a warning will be printed). > > Fixing the reference counting issues has been the single biggest > difficulty in converting PyArray_Descr from a C-structure to a regular > Python object. 
It was a very large pain to begin with, and then there > has been more code added since the original conversion some of which > does not do reference counting correctly (mostly my fault). > > I've looked over ticket #848 quite a bit and would like to determine the > true cause of the growing reference count which I don't believe is fixed > by the rest of the patch that is submitted there. > I've attached a test script. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: debug.py Type: text/x-python Size: 334 bytes Desc: not available URL: From fperez.net at gmail.com Sat Jul 19 00:00:41 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Fri, 18 Jul 2008 21:00:41 -0700 Subject: [Numpy-discussion] f2py - a recap Message-ID: Howdy, today's exercise with f2py left some lessons learned, mostly thanks to Robert's excellent help, for which I'm grateful. I'd like to recap here what we have, mostly to decide what changes (if any) should go into numpy to make the experience less "interesting" for future users: - Remove the f2py_options flag from numpy.distutils.extension.Extension? If so, do options like '--debug_capi' get correctly passed via setup.cfg? This flag is potentially very confusing, because only *some* f2py options get actually set this way, while others need to be set in calls to config_fc. - How to properly set the compiler options in a setup.py file? Robert suggested the setup.cfg file, but this doesn't get picked up unless config_fc is explicitly called by the user: ./setup.py config_fc etc... This is perhaps a distutils problem, but I don't know if we can solve it more cleanly. It seems to me that it should be possible to provide a setup.py file that can be used simply as ./setup.py install (with the necessary setup.cfg file sitting next to it). I'm thinking here of what we need to do when showing how 'easy' these tools are for scientists migrating from matlab, for example. Obscure, special purpose incantations tend to tarnish our message of ease :) - Should the 'instead' word be removed from the f2py docs regarding the use of .pyf sources? It appears to be a mistake, which threw at least me for a loop for a while. - Why does f2py in the source tree have *both* a doc/ and a docs/ directory? It's really confusing to see this. f2py happens to be a very important tool, not just because scipy couldn't build without it, but also to position python as a credible integration language for scientific work. So I hope we can make using it as easy and robust as is technically feasible. Cheers, f From oliphant at enthought.com Sat Jul 19 00:04:56 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Fri, 18 Jul 2008 23:04:56 -0500 Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType In-Reply-To: References: <20080715074217.R81915@saturn.araneidae.co.uk> <48815C47.1010305@enthought.com> Message-ID: <488167E8.30203@enthought.com> Charles R Harris wrote: > > > On Fri, Jul 18, 2008 at 9:15 PM, Travis E. Oliphant > > wrote: > > Michael Abbott wrote: > > Only half of my patch for this bug has gone into trunk, and > without the > > rest of my patch there remains a leak. > > > Thanks for your work Michael. I've been so grateful to have you and > Chuck and others looking carefully at the code to fix its problems. > > In this particular case, I'm not sure I see how (the rest of) your > patch > fixes any remaining leak. 
We do seem to be having a disagreement > about > whether or not the reference to typecode can be pre-maturely > destroyed, > but this doesn't fit what I usually call a "memory leak." I think > there may be some other cause for remaining leaks. > > > Travis, > > There really is (at least) one reference counting error in > PyArray_FromAny. In particular, the obj == NULL case leaves a > reference to typecode, then exits through the first return after > finish. In this case robj doesn't steal a reference to typecode and > the result can be seen in the python program above or by printing out > the typecode->ob_refcnt from the code itself. So that needs fixing. I > would suggest a DECREF in that section and a direct return of robj. agreed! I'll commit the change. > > The next section before finish is also a bit odd. The direct return of > an array works fine, but if that isn't the branch taken, then > PyArray_Return decrements the refcnt of arr, which in turn decrements > the refcnt of typecode. I don't know if the resulting scalar holds a > reference to typecode, but in any case the situation there should also > be clarified. This looks fine to me. At the PyArray_Return call, the typecode reference is held by the array. When it is decref'd the typecode is decref'd appropriately as well. The resulting scalar does *not* contain a reference to typecode. The scalar C-structure has no place to put it (it's just a PyObject_HEAD and the memory for the scalar value). Michael is correct that PyArray_Scalar does not change the reference count of typecode (as the comments above that function indicates). I tried to be careful to put comments near the functions that deal with PyArray_Descr objects to describe how they affect reference counting. I also thought I put that in my book. -Travis From oliphant at enthought.com Sat Jul 19 00:35:50 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Fri, 18 Jul 2008 23:35:50 -0500 Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType In-Reply-To: References: <20080715074217.R81915@saturn.araneidae.co.uk> <20080715150718.R97049@saturn.araneidae.co.uk> <20080717212322.O34675@saturn.araneidae.co.uk> <48815FBF.4080404@enthought.com> Message-ID: <48816F26.2010702@enthought.com> > I've attached a test script. Thank you! It looks like with that added DECREF, the reference count leak is gone. While it was a minor issue (it should be noted that reference counting errors on the built-in data-types won't cause issues), it is nice to clean these things up when we can. I agree that the arrtype_new function is hairy, and I apologize for that. The scalartypes.inc.src was written very quickly. I added a few more comments in the change to the function (and removed a hard-coded hackish multiply with one that takes into account the actual size of Py_UNICODE). -Travis From gael.varoquaux at normalesup.org Sat Jul 19 00:38:39 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 19 Jul 2008 06:38:39 +0200 Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType In-Reply-To: <48816F26.2010702@enthought.com> References: <20080715074217.R81915@saturn.araneidae.co.uk> <20080715150718.R97049@saturn.araneidae.co.uk> <20080717212322.O34675@saturn.araneidae.co.uk> <48815FBF.4080404@enthought.com> <48816F26.2010702@enthought.com> Message-ID: <20080719043839.GA10352@phare.normalesup.org> On Fri, Jul 18, 2008 at 11:35:50PM -0500, Travis E. Oliphant wrote: > > I've attached a test script. > Thank you! It looks like with that added DECREF, the reference count > leak is gone. While it was a minor issue (it should be noted that > reference counting errors on the built-in data-types won't cause > issues), it is nice to clean these things up when we can. Yes. I think it is worth thanking all of you who are currently putting a large effort on QA. This effort is very valuable to all of us, as having a robust underlying library on which you can unquestionably rely is priceless.
Gaël From charlesr.harris at gmail.com Sat Jul 19 00:50:43 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Jul 2008 22:50:43 -0600 Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType In-Reply-To: <488167E8.30203@enthought.com> References: <20080715074217.R81915@saturn.araneidae.co.uk> <48815C47.1010305@enthought.com> <488167E8.30203@enthought.com> Message-ID: On Fri, Jul 18, 2008 at 10:04 PM, Travis E. Oliphant wrote: > Charles R Harris wrote: > > > > > > On Fri, Jul 18, 2008 at 9:15 PM, Travis E. Oliphant > > > wrote: > > > > Michael Abbott wrote: > > > Only half of my patch for this bug has gone into trunk, and > > without the > > > rest of my patch there remains a leak. > > > > > Thanks for your work Michael. I've been so grateful to have you and > > Chuck and others looking carefully at the code to fix its problems. > > > > In this particular case, I'm not sure I see how (the rest of) your > > patch > > fixes any remaining leak. We do seem to be having a disagreement > > about > > whether or not the reference to typecode can be pre-maturely > > destroyed, > > but this doesn't fit what I usually call a "memory leak." I think > > there may be some other cause for remaining leaks. > > > > > > Travis, > > > > There really is (at least) one reference counting error in > > PyArray_FromAny. In particular, the obj == NULL case leaves a > > reference to typecode, then exits through the first return after > > finish. In this case robj doesn't steal a reference to typecode and > > the result can be seen in the python program above or by printing out > > the typecode->ob_refcnt from the code itself. So that needs fixing. I > > would suggest a DECREF in that section and a direct return of robj. > agreed! I'll commit the change. > > > > The next section before finish is also a bit odd. The direct return of > > an array works fine, but if that isn't the branch taken, then > > PyArray_Return decrements the refcnt of arr, which in turn decrements > > the refcnt of typecode. I don't know if the resulting scalar holds a > > reference to typecode, but in any case the situation there should also > > be clarified. > This looks fine to me. At the PyArray_Return call, the typecode > reference is held by the array. When it is decref'd the typecode is > decref'd appropriately as well. The resulting scalar does *not* > contain a reference to typecode. The scalar C-structure has no place to > put it (it's just a PyObject_HEAD and the memory for the scalar value). > I was thinking of just pulling the relevant part out of PyArray_Return and including it in the function, which would make what was going on quite explicit to anyone reading the code. Then maybe a direct return of robj as I think it is always going to be a scalar at that point. > Michael is correct that PyArray_Scalar does not change the reference > count of typecode (as the comments above that function indicates). I > tried to be careful to put comments near the functions that deal with > PyArray_Descr objects to describe how they affect reference counting. I > also thought I put that in my book. > Yep, it was a brain fart on my part. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Jul 19 01:36:55 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Jul 2008 23:36:55 -0600 Subject: [Numpy-discussion] ticket #842.
Message-ID: Just wanted to bring attention to ticket #842because I think the fix should be pretty easy. I added a comment: The printing inconsistency is a duplicate of #841and has been fixed. What remains odd is that the conjugate is a scalar not an array. In [2]: type(conjugate(array(8+7j))) Out[2]: In [3]: type((array(8+7j))) Out[3]: So I think all that needs to be done is fix the return type conjugate if we agree that it should be an array. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Sat Jul 19 07:38:42 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 19 Jul 2008 13:38:42 +0200 Subject: [Numpy-discussion] ticket #842. In-Reply-To: References: Message-ID: <9457e7c80807190438r53949a6r53939ae0e0122d98@mail.gmail.com> 2008/7/19 Charles R Harris : > In [2]: type(conjugate(array(8+7j))) > Out[2]: > > In [3]: type((array(8+7j))) > Out[3]: > > So I think all that needs to be done is fix the return type conjugate if we > agree that it should be an array. I think it should be an array. St?fan From pav at iki.fi Sat Jul 19 09:13:03 2008 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 19 Jul 2008 13:13:03 +0000 (UTC) Subject: [Numpy-discussion] Branch cuts, inf, nan, C99 compliance References: <80c99e790803170302p199926d2mee7c12e56207504a@mail.gmail.com> Message-ID: Hi all, Re: Ticket 854. I wrote tests for the branch cuts for all complex arc* functions in umathmodule. It turns out that all except arccosh were OK. The formula for arcsinh was written in a non-standard form with an unnecessary nc_neg, but this didn't affect the results. I also wrote tests for checking values of the functions at infs and nans. A patch for all of this is attached, with all currently non-passing tests marked as skipped. I'd like to commit this if there are no objections. Another thing I noticed is that the present implementations of the complex functions are naive, so they over- and underflow earlier than necessary: >>> np.arcsinh(1e8) 19.113827924512311 >>> np.arcsinh(1e8 + 0j) (inf-0j) >>> np.arcsinh(-1e8 + 0j) (-19.113827924512311-0j) This particular thing in arcsinh occurs because of loss of precision in intermediate stages. (In the version in my patch this loss of precision is still present.) It would be nice to polish these up. BTW, are there obstacles to using the C99 complex functions when they are available? This would avoid quite a bit of drudgework... As an alternative, we can probably steal better implementations from Python's recently polished cmathmodule.c *** Then comes a descent into pedantry: Numpy complex functions are not C99 compliant in handling of the signed zero, inf, and nan. I don't know whether we should comply with C99, but it would be nicer to have these handled in a controlled way rather than as a byproduct of the implementation chosen. 1) The branch cuts for sqrt and arc* don't respect the negative zero: >>> a = 1 + 0j >>> np.sqrt(-a) 1j >>> np.sqrt(-1 + 0j) 1j The branch cut of the logarithm however does: >>> np.log(-a) -3.1415926535897931j >>> np.log(-1 + 0j) 3.1415926535897931j All complex functions in the C99 standard respect the negative zero in their branch cuts. Do we want to follow? I don't know how to check what Octave and Matlab do regarding this, since I haven't figured out how to place a negative zero in complex numbers in these languages. But at least in practice these languages appear not to respect the sign of zero. 
> a = 1 + 0j > log(-a) ans = 0.000000000000000 + 3.141592653589793i > log(-1) ans = 0.000000000000000 + 3.141592653589793i 2) The numpy functions in general don't return C99 compliant results at inf or nan. I wrote up some tests for checking these. Do we want to fix these? -- Pauli Virtanen diff -r 357b6ce2a4bc numpy/core/src/umathmodule.c.src --- a/numpy/core/src/umathmodule.c.src Thu Jul 17 16:17:45 2008 +0300 +++ b/numpy/core/src/umathmodule.c.src Sat Jul 19 15:16:37 2008 +0300 @@ -825,15 +825,16 @@ c at typ@ t; nc_sum at c@(x, &nc_1 at c@, &t); + nc_sqrt at c@(&t, &t); nc_diff at c@(x, &nc_1 at c@, r); + nc_sqrt at c@(r, r); nc_prod at c@(&t, r, r); - nc_sqrt at c@(r, r); nc_sum at c@(x, r, r); nc_log at c@(r, r); return; /* return nc_log(nc_sum(x, - nc_sqrt(nc_prod(nc_sum(x,nc_1), nc_diff(x,nc_1))))); + nc_prod(nc_sqrt(nc_sum(x,nc_1)), nc_sqrt(nc_diff(x,nc_1))))); */ } @@ -863,12 +864,11 @@ nc_prod at c@(x, x, r); nc_sum at c@(&nc_1 at c@, r, r); nc_sqrt at c@(r, r); - nc_diff at c@(r, x, r); + nc_sum at c@(r, x, r); nc_log at c@(r, r); - nc_neg at c@(r, r); return; /* - return nc_neg(nc_log(nc_diff(nc_sqrt(nc_sum(nc_1,nc_prod(x,x))),x))); + return nc_log(nc_sum(nc_sqrt(nc_sum(nc_1,nc_prod(x,x))),x)); */ } diff -r 357b6ce2a4bc numpy/core/tests/test_umath.py --- a/numpy/core/tests/test_umath.py Thu Jul 17 16:17:45 2008 +0300 +++ b/numpy/core/tests/test_umath.py Sat Jul 19 15:16:37 2008 +0300 @@ -1,6 +1,8 @@ from numpy.testing import * import numpy.core.umath as ncu import numpy as np +import nose +from numpy import inf, nan, pi class TestDivision(TestCase): def test_division_int(self): @@ -34,7 +36,6 @@ ncu.sqrt(3+4j)]) assert_almost_equal(x**14, [-76443+16124j, 23161315+58317492j, 5583548873 + 2465133864j]) - class TestLog1p(TestCase): def test_log1p(self): @@ -179,10 +180,10 @@ assert_equal(np.choose(c, (a, 1)), np.array([1,1])) -class TestComplexFunctions(TestCase): +class TestComplexFunctions(object): funcs = [np.arcsin , np.arccos , np.arctan, np.arcsinh, np.arccosh, np.arctanh, np.sin , np.cos , np.tan , np.exp, - np.log , np.sqrt , np.log10] + np.log , np.sqrt , np.log10, np.log1p] def test_it(self): for f in self.funcs: @@ -204,6 +205,205 @@ assert_almost_equal(fcf, fcd, decimal=6, err_msg='fch-fcd %s'%f) assert_almost_equal(fcl, fcd, decimal=15, err_msg='fch-fcl %s'%f) + def test_branch_cuts(self): + # check branch cuts and continuity on them + yield _check_branch_cut, np.log, -0.5, 1j, 1, -1, True + yield _check_branch_cut, np.log10, -0.5, 1j, 1, -1, True + yield _check_branch_cut, np.log1p, -1.5, 1j, 1, -1, True + yield _check_branch_cut, np.sqrt, -0.5, 1j, 1, -1 + + yield _check_branch_cut, np.arcsin, [ -2, 2], [1j, -1j], 1, -1 + yield _check_branch_cut, np.arccos, [ -2, 2], [1j, -1j], 1, -1 + yield _check_branch_cut, np.arctan, [-2j, 2j], [1, -1 ], -1, 1 + + yield _check_branch_cut, np.arcsinh, [-2j, 2j], [-1, 1], -1, 1 + yield _check_branch_cut, np.arccosh, [ -1, 0.5], [1j, 1j], 1, -1 + yield _check_branch_cut, np.arctanh, [ -2, 2], [1j, -1j], 1, -1 + + # check against bogus branch cuts: assert continuity between quadrants + yield _check_branch_cut, np.arcsin, [-2j, 2j], [ 1, 1], 1, 1 + yield _check_branch_cut, np.arccos, [-2j, 2j], [ 1, 1], 1, 1 + yield _check_branch_cut, np.arctan, [ -2, 2], [1j, 1j], 1, 1 + + yield _check_branch_cut, np.arcsinh, [ -2, 2, 0], [1j, 1j, 1 ], 1, 1 + yield _check_branch_cut, np.arccosh, [-2j, 2j, 2], [1, 1, 1j], 1, 1 + yield _check_branch_cut, np.arctanh, [-2j, 2j, 0], [1, 1, 1j], 1, 1 + + def test_branch_cuts_failing(self): + # XXX: signed 
zeros are not OK for sqrt or for the arc* functions + yield _check_branch_cut, np.sqrt, -0.5, 1j, 1, -1, True + yield _check_branch_cut, np.arcsin, [ -2, 2], [1j, -1j], 1, -1, True + yield _check_branch_cut, np.arccos, [ -2, 2], [1j, -1j], 1, -1, True + yield _check_branch_cut, np.arctan, [-2j, 2j], [1, -1 ], -1, 1, True + yield _check_branch_cut, np.arcsinh, [-2j, 2j], [-1, 1], -1, 1, True + yield _check_branch_cut, np.arccosh, [ -1, 0.5], [1j, 1j], 1, -1, True + yield _check_branch_cut, np.arctanh, [ -2, 2], [1j, -1j], 1, -1, True + test_branch_cuts_failing = dec.skipknownfailure(test_branch_cuts_failing) + + def test_against_cmath(self): + import cmath, sys + + # cmath.asinh is broken in some versions of Python, see + # http://bugs.python.org/issue1381 + broken_cmath_asinh = False + if sys.version_info < (2,5,3): + broken_cmath_asinh = True + + points = [-2, 2j, 2, -2j, -1-1j, -1+1j, +1-1j, +1+1j] + name_map = {'arcsin': 'asin', 'arccos': 'acos', 'arctan': 'atan', + 'arcsinh': 'asinh', 'arccosh': 'acosh', 'arctanh': 'atanh'} + atol = 4*np.finfo(np.complex).eps + for func in self.funcs: + fname = func.__name__.split('.')[-1] + cname = name_map.get(fname, fname) + try: cfunc = getattr(cmath, cname) + except AttributeError: continue + for p in points: + a = complex(func(np.complex_(p))) + b = cfunc(p) + + if cname == 'asinh' and broken_cmath_asinh: + continue + + assert abs(a - b) < atol, "%s %s: %s; cmath: %s"%(fname,p,a,b) + +class TestC99(object): + """Check special functions at special points against the C99 standard""" + # NB: inherits from object instead of TestCase since using test generators + + # + # Non-conforming results are with XXX added to the exception field. + # + + def test_clog(self): + for p, v, e in [ + ((-0., 0.), (-inf, pi), 'divide'), + ((+0., 0.), (-inf, 0.), 'divide'), + ((1., inf), (inf, pi/2), ''), + ((1., nan), (nan, nan), ''), + ((-inf, 1.), (inf, pi), ''), + ((inf, 1.), (inf, 0.), ''), + ((-inf, inf), (inf, 3*pi/4), ''), + ((inf, inf), (inf, pi/4), ''), + ((inf, nan), (inf, nan), ''), + ((-inf, nan), (inf, nan), ''), + ((nan, 0.), (nan, nan), ''), + ((nan, 1.), (nan, nan), ''), + ((nan, inf), (inf, nan), ''), + ((+nan, nan), (nan, nan), ''), + ]: + yield self._check, np.log, p, v, e + + def test_csqrt(self): + for p, v, e in [ + ((-0., 0.), (0.,0.), 'XXX'), # now (-0., 0.) + ((0., 0.), (0.,0.), ''), + ((1., inf), (inf,inf), 'XXX invalid'), # now (inf, nan) + ((nan, inf), (inf,inf), 'XXX'), # now (nan, nan) + ((-inf, 1.), (0.,inf), ''), + ((inf, 1.), (inf,0.), ''), + ((-inf,nan), (nan, -inf), ''), # could also be +inf + ((inf, nan), (inf, nan), ''), + ((nan, 1.), (nan, nan), ''), + ((nan, nan), (nan, nan), ''), + ]: + yield self._check, np.sqrt, p, v, e + + def test_cacos(self): + for p, v, e in [ + ((0., 0.), (pi/2, -0.), 'XXX'), # now (-0., 0.) 
+ ((-0., 0.), (pi/2, -0.), ''), + ((0., nan), (pi/2, nan), 'XXX'), # now (nan, nan) + ((-0., nan), (pi/2, nan), 'XXX'), # now (nan, nan) + ((1., inf), (pi/2, -inf), 'XXX'), # now (nan, -inf) + ((1., nan), (nan, nan), ''), + ((-inf, 1.), (pi, -inf), 'XXX'), # now (nan, -inf) + ((inf, 1.), (0., -inf), 'XXX'), # now (nan, -inf) + ((-inf, inf), (3*pi/4, -inf), 'XXX'), # now (nan, nan) + ((inf, inf), (pi/4, -inf), 'XXX'), # now (nan, nan) + ((inf, nan), (nan, +-inf), 'XXX'), # now (nan, nan) + ((-inf, nan), (nan, +-inf), 'XXX'), # now: (nan, nan) + ((nan, 1.), (nan, nan), ''), + ((nan, inf), (nan, -inf), 'XXX'), # now: (nan, nan) + ((nan, nan), (nan, nan), ''), + ]: + yield self._check, np.arccos, p, v, e + + def test_cacosh(self): + for p, v, e in [ + ((0., 0), (0, pi/2), ''), + ((-0., 0), (0, pi/2), ''), + ((1., inf), (inf, pi/2), 'XXX'), # now: (nan, nan) + ((1., nan), (nan, nan), ''), + ((-inf, 1.), (inf, pi), 'XXX'), # now: (inf, nan) + ((inf, 1.), (inf, 0.), 'XXX'), # now: (inf, nan) + ((-inf, inf), (inf, 3*pi/4), 'XXX'), # now: (nan, nan) + ((inf, inf), (inf, pi/4), 'XXX'), # now: (nan, nan) + ((inf, nan), (inf, nan), 'XXX'), # now: (nan, nan) + ((-inf, nan), (inf, nan), 'XXX'), # now: (nan, nan) + ((nan, 1.), (nan, nan), ''), + ((nan, inf), (inf, nan), 'XXX'), # now: (nan, nan) + ((nan, nan), (nan, nan), '') + ]: + yield self._check, np.arccosh, p, v, e + + def test_casinh(self): + for p, v, e in [ + ((0., 0), (0, 0), ''), + ((1., inf), (inf, pi/2), 'XXX'), # now: (inf, nan) + ((1., nan), (nan, nan), ''), + ((inf, 1.), (inf, 0.), 'XXX'), # now: (inf, nan) + ((inf, inf), (inf, pi/4), 'XXX'), # now: (nan, nan) + ((inf, nan), (nan, nan), 'XXX'), # now: (nan, nan) + ((nan, 0.), (nan, 0.), 'XXX'), # now: (nan, nan) + ((nan, 1.), (nan, nan), ''), + ((nan, inf), (+-inf, nan), 'XXX'), # now: (nan, nan) + ((nan, nan), (nan, nan), ''), + ]: + yield self._check, np.arcsinh, p, v, e + + def test_catanh(self): + for p, v, e in [ + ((0., 0), (0, 0), ''), + ((0., nan), (0., nan), 'XXX'), # now: (nan, nan) + ((1., 0.), (inf, 0.), 'XXX divide'), # now: (nan, nan) + ((1., inf), (inf, 0.), 'XXX'), # now: (nan, nan) + ((1., nan), (nan, nan), ''), + ((inf, 1.), (0., pi/2), 'XXX'), # now: (nan, nan) + ((inf, inf), (0, pi/2), 'XXX'), # now: (nan, nan) + ((inf, nan), (0, nan), 'XXX'), # now: (nan, nan) + ((nan, 1.), (nan, nan), ''), + ((nan, inf), (+0, pi/2), 'XXX'), # now: (nan, nan) + ((nan, nan), (nan, nan), ''), + ]: + yield self._check, np.arctanh, p, v, e + + def _check(self, func, point, value, exc=''): + if 'XXX' in exc: + raise nose.SkipTest + if isinstance(point, tuple): point = complex(*point) + if isinstance(value, tuple): value = complex(*value) + v = dict(divide='ignore', invalid='ignore', + over='ignore', under='ignore') + old_err = np.seterr(**v) + try: + # check sign of zero, nan, etc. 
+ got = complex(func(point)) + got = "(%s, %s)" % (repr(got.real), repr(got.imag)) + expected = "(%s, %s)" % (repr(value.real), repr(value.imag)) + assert got == expected, (got, expected) + + # check exceptions + if exc in ('divide', 'invalid', 'over', 'under'): + v[exc] = 'raise' + np.seterr(**v) + assert_raises(FloatingPointError, func, point) + else: + for k in v.keys(): v[k] = 'raise' + np.seterr(**v) + func(point) + finally: + np.seterr(**old_err) class TestAttributes(TestCase): def test_attributes(self): @@ -216,6 +416,59 @@ assert_equal(add.nout, 1) assert_equal(add.identity, 0) +def _check_branch_cut(f, x0, dx, re_sign=1, im_sign=-1, sig_zero_ok=False, + dtype=np.complex): + """ + Check for a branch cut in a function. + + Assert that `x0` lies on a branch cut of function `f` and `f` is + continuous from the direction `dx`. + + Parameters + ---------- + f : func + Function to check + x0 : array-like + Point on branch cut + dx : array-like + Direction to check continuity in + re_sign, im_sign : {1, -1} + Change of sign of the real or imaginary part expected + sig_zero_ok : bool + Whether to check if the branch cut respects signed zero (if applicable) + dtype : dtype + Dtype to check (should be complex) + + """ + x0 = np.atleast_1d(x0).astype(dtype) + dx = np.atleast_1d(dx).astype(dtype) + + scale = np.finfo(dtype).eps * 1e3 + atol = 1e-4 + + y0 = f(x0) + yp = f(x0 + dx*scale*np.absolute(x0)/np.absolute(dx)) + ym = f(x0 - dx*scale*np.absolute(x0)/np.absolute(dx)) + + assert np.all(np.absolute(y0.real - yp.real) < atol), (y0, yp) + assert np.all(np.absolute(y0.imag - yp.imag) < atol), (y0, yp) + assert np.all(np.absolute(y0.real - ym.real*re_sign) < atol), (y0, ym) + assert np.all(np.absolute(y0.imag - ym.imag*im_sign) < atol), (y0, ym) + + if sig_zero_ok: + # check that signed zeros also work as a displacement + jr = (x0.real == 0) & (dx.real != 0) + ji = (x0.imag == 0) & (dx.imag != 0) + + x = -x0 + x.real[jr] = 0.*dx.real + x.imag[ji] = 0.*dx.imag + x = -x + ym = f(x) + ym = ym[jr | ji] + y0 = y0[jr | ji] + assert np.all(np.absolute(y0.real - ym.real*re_sign) < atol), (y0, ym) + assert np.all(np.absolute(y0.imag - ym.imag*im_sign) < atol), (y0, ym) if __name__ == "__main__": run_module_suite() From alan.mcintyre at gmail.com Sat Jul 19 10:33:24 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Sat, 19 Jul 2008 10:33:24 -0400 Subject: [Numpy-discussion] chararray __mod__ behavior In-Reply-To: <9457e7c80807180532i76719579qc6824979db167a7@mail.gmail.com> References: <1d36917a0807180515w5d2f967ch4ab6daee0dfc6362@mail.gmail.com> <9457e7c80807180532i76719579qc6824979db167a7@mail.gmail.com> Message-ID: <1d36917a0807190733le1cce21r79d7a2136e959282@mail.gmail.com> On Fri, Jul 18, 2008 at 8:32 AM, St?fan van der Walt wrote: > That looks like a bug to me. I would have expected at least one of > the following to work: > > A % [[1, 2], [3, 4]] > A % 1 > A % (1, 2, 3, 4) > > and none of them do. I wouldn't expect the last one to work, since the shapes are different. The first two work if I use .flat to assign the result. 
Since this actually changes the existing behavior of chararray, I figured it deserved a tracker item: http://scipy.org/scipy/numpy/ticket/856 From michael at araneidae.co.uk Sat Jul 19 10:57:51 2008 From: michael at araneidae.co.uk (Michael Abbott) Date: Sat, 19 Jul 2008 14:57:51 +0000 (GMT) Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType In-Reply-To: <48816F26.2010702@enthought.com> References: <20080715074217.R81915@saturn.araneidae.co.uk> <20080715150718.R97049@saturn.araneidae.co.uk> <20080717212322.O34675@saturn.araneidae.co.uk> <48815FBF.4080404@enthought.com> <48816F26.2010702@enthought.com> Message-ID: <20080719144215.D62170@saturn.araneidae.co.uk> On Fri, 18 Jul 2008, Travis E. Oliphant wrote: > It looks like with that added DECREF, the reference count leak is gone. I've looked at the latest head, and I agree that the problem is now solved. There is an important difference from my original solution: typecode is no longer reused after the finish label (instead it is always created anew). This makes all the difference in the world. I'm not actually convinced by the comment that's there now, which says /* typecode will be NULL */ but in truth it doesn't matter -- because of the correctly placed DECREF after the PyArray_Scalar calls the routine no longer owns typecode. If I can refer to my last message, I made the point that there wasn't a good invariant at the finish label -- we didn't know how many references to typecode we were responsible for at that point -- and I offered the solution to keep typecode. Instead you have chosen to recreate typecode, which I hadn't realised was just as good. This code is still horrible, but I don't think I want to try to understand it anymore. It'd be really nice (it'd make me feel a lot better) if you'd agree that my original patch was in fact correct. I'm not disputing the correctness of the current solution (except I think that typecode can end up being created twice, but who really cares?) but I've put a lot of effort into arguing my case, and the fact is my original patch was not wrong. Thank you. From millman at berkeley.edu Sat Jul 19 10:58:22 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Sat, 19 Jul 2008 07:58:22 -0700 Subject: [Numpy-discussion] building a better OSX install for 1.1.1 In-Reply-To: References: <764e38540807181417y1f3dcfd1g16343e018db09eac@mail.gmail.com> Message-ID: On Fri, Jul 18, 2008 at 3:37 PM, Charles R Harris wrote: > Since 1.1.1rc1 is coming out this Sunday, I'd like to know who is > responsible for the OS X install improvements, if that is what they are. I > don't know squat about them myself and don't run OS X. Chris Burns has been building the OS X installer for NumPy. He is looking into whether he can make a better Mac installer for the 1.1.1rc1 release, which is why he is asking about whether anyone knows how to fix this. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/
From charlesr.harris at gmail.com Sat Jul 19 13:05:19 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 19 Jul 2008 11:05:19 -0600 Subject: [Numpy-discussion] Branch cuts, inf, nan, C99 compliance In-Reply-To: References: <80c99e790803170302p199926d2mee7c12e56207504a@mail.gmail.com> Message-ID: On Sat, Jul 19, 2008 at 7:13 AM, Pauli Virtanen wrote: > Hi all, > > Re: Ticket 854. > > I wrote tests for the branch cuts for all complex arc* functions > in umathmodule. It turns out that all except arccosh were OK. > The formula for arcsinh was written in a non-standard form with > an unnecessary nc_neg, but this didn't affect the results. > I also wrote tests for checking values of the functions at infs and nans. > > A patch for all of this is attached, with all currently non-passing > tests marked as skipped. I'd like to commit this if there are no > objections. > > Another thing I noticed is that the present implementations of > the complex functions are naive, so they over- and underflow earlier > than necessary: > > >>> np.arcsinh(1e8) > 19.113827924512311 > >>> np.arcsinh(1e8 + 0j) > (inf-0j) > >>> np.arcsinh(-1e8 + 0j) > (-19.113827924512311-0j) > > This particular thing in arcsinh occurs because of loss of precision > in intermediate stages. (In the version in my patch this loss of precision > is still present.) > > It would be nice to polish these up. BTW, are there obstacles to using > the C99 complex functions when they are available? This would avoid > quite a bit of drudgework... As an alternative, we can probably steal > better implementations from Python's recently polished cmathmodule.c > The main problem is determining when they are available and if they cover all the needed precisions. Since we will need standalone implementations on some platforms anyway, I am inclined towards stealing from cmathmodule.c if it offers improved code for some of the functions.
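(Illustrative aside: the signed zero that the C99 branch-cut rules key on can be observed directly from Python -- comparison cannot distinguish the two zeros, but atan2 can:

    >>> x = -0.0
    >>> x == 0.0
    True
    >>> import math
    >>> math.atan2(0.0, x)     # pi for -0.0 ...
    3.1415926535897931
    >>> math.atan2(0.0, 0.0)   # ... and 0.0 for +0.0
    0.0

which is also a handy way to test whether a given platform or language actually preserves the sign.)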
> > 1) > > The branch cuts for sqrt and arc* don't respect the negative zero: > > >>> a = 1 + 0j > > >>> np.sqrt(-a) > > 1j > > >>> np.sqrt(-1 + 0j) > > 1j > > > > The branch cut of the logarithm however does: > > >>> np.log(-a) > > -3.1415926535897931j > > >>> np.log(-1 + 0j) > > 3.1415926535897931j > > > > All complex functions in the C99 standard respect the negative zero > > in their branch cuts. Do we want to follow? > Hmm. I think so, to the extent we can. This might lead to some unexpected results, but that is what happens for arguments near the branch cuts. Do it and document it. > > I don't know how to check what Octave and Matlab do regarding this, > since I haven't figured out how to place a negative zero in complex > numbers in these languages. But at least in practice these languages > appear not to respect the sign of zero. > > > a = 1 + 0j > > log(-a) ans = 0.000000000000000 + 3.141592653589793i > > log(-1) ans = 0.000000000000000 + 3.141592653589793i > > 2) > > The numpy functions in general don't return C99 compliant results > > at inf or nan. I wrote up some tests for checking these. > > Do we want to fix these? > I'd say yes. That way we can refer to the C99 standard to document numpy behavior. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Jul 19 13:24:25 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 19 Jul 2008 11:24:25 -0600 Subject: [Numpy-discussion] Branch cuts, inf, nan, C99 compliance In-Reply-To: References: <80c99e790803170302p199926d2mee7c12e56207504a@mail.gmail.com> Message-ID: On Sat, Jul 19, 2008 at 7:13 AM, Pauli Virtanen wrote: > Hi all, > > Re: Ticket 854. > I've backported the fixes to 1.1.x, so you had better commit these ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Jul 19 13:44:34 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 19 Jul 2008 11:44:34 -0600 Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType In-Reply-To: <20080719144215.D62170@saturn.araneidae.co.uk> References: <20080715074217.R81915@saturn.araneidae.co.uk> <20080715150718.R97049@saturn.araneidae.co.uk> <20080717212322.O34675@saturn.araneidae.co.uk> <48815FBF.4080404@enthought.com> <48816F26.2010702@enthought.com> <20080719144215.D62170@saturn.araneidae.co.uk> Message-ID: On Sat, Jul 19, 2008 at 8:57 AM, Michael Abbott wrote: > On Fri, 18 Jul 2008, Travis E. Oliphant wrote: > > It looks like with that added DECREF, the reference count leak is gone. > > I've looked at the latest head, and I agree that the problem is now > solved. > > There is an important difference from my original solution: typecode is no > longer reused after the finish label (instead it is always created anew). > This makes all the difference in the world. > > I'm not actually convinced by the comment that's there now, which says > /* typecode will be NULL */ > but in truth it doesn't matter -- because of the correctly placed DECREF > after the PyArray_Scalar calls the routine no longer owns typecode. > > If I can refer to my last message, I made the point that there wasn't a > good invariant at the finish label -- we didn't know how many references > to typecode we were responsible for at that point -- and I offered the > solution to keep typecode. Instead you have chosen to recreate typecode, > which I hadn't realised was just as good.
> > This code is still horrible, but I don't think I want to try to understand > it anymore. It'd be really nice (it'd make me feel a lot better) if you'd > agree that my original patch was in fact correct. I'm not disputing the > correctness of the current solution (except I think that typecode can end > up being created twice, but who really cares?) but I've put a lot of > effort into arguing my case, and the fact is my original patch was not > wrong. > Yep, the original patch looks good now. -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Sat Jul 19 14:05:33 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Sat, 19 Jul 2008 13:05:33 -0500 Subject: [Numpy-discussion] ticket #842. In-Reply-To: <9457e7c80807190438r53949a6r53939ae0e0122d98@mail.gmail.com> References: <9457e7c80807190438r53949a6r53939ae0e0122d98@mail.gmail.com> Message-ID: <48822CED.4060404@enthought.com> Stéfan van der Walt wrote: > 2008/7/19 Charles R Harris : > >> In [2]: type(conjugate(array(8+7j))) >> Out[2]: <type 'numpy.complex128'> >> >> In [3]: type((array(8+7j))) >> Out[3]: <type 'numpy.ndarray'> >> >> So I think all that needs to be done is fix the return type of conjugate if we >> agree that it should be an array. >> > > I think it should be an array. > +1/2 From oliphant at enthought.com Sat Jul 19 14:27:40 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Sat, 19 Jul 2008 13:27:40 -0500 Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType In-Reply-To: <20080719144215.D62170@saturn.araneidae.co.uk> References: <20080715074217.R81915@saturn.araneidae.co.uk> <20080715150718.R97049@saturn.araneidae.co.uk> <20080717212322.O34675@saturn.araneidae.co.uk> <48815FBF.4080404@enthought.com> <48816F26.2010702@enthought.com> <20080719144215.D62170@saturn.araneidae.co.uk> Message-ID: <4882321C.9030402@enthought.com> Michael Abbott wrote: > > I'm not actually convinced by the comment that's there now, which says > /* typecode will be NULL */ > but in truth it doesn't matter -- because of the correctly placed DECREF > after the PyArray_Scalar calls the routine no longer owns typecode. > I'm pretty sure that it's fine. > If I can refer to my last message, I made the point that there wasn't a > good invariant at the finish label -- we didn't know how many references > to typecode we were responsible for at that point -- and I offered the > solution to keep typecode. Instead you have chosen to recreate typecode, > which I hadn't realised was just as good. > I agree that this routine needs aesthetic improvement. I had hoped that someone would have improved the array scalars routines by now. I think a core issue is that to save a couple of lines of code, an inappropriate goto finish in the macro was used. This complicated the code more than the savings of a couple lines of code would justify. Really this code "grew" from a simple thing into a complicated thing as more "features" were added. This is a common issue that happens all over the place. > This code is still horrible, but I don't think I want to try to understand > it anymore. It'd be really nice (it'd make me feel a lot better) if you'd > agree that my original patch was in fact correct. I'm not disputing the > correctness of the current solution (except I think that typecode can end > up being created twice, but who really cares?) but I've put a lot of > effort into arguing my case, and the fact is my original patch was not > wrong. > > From what I saw, I'm still not quite sure.
Your description of reference counting was correct and it is clear you've studied the issue which is great, because there aren't that many people who understand reference counting on the C-level in Python anymore and it is still a useful skill. I'm hopeful that your description of reference counting will be something others can find and learn from. The reason I say I'm not sure is that I don't remember seeing a DECREF after the PyArray_Scalar in the obj = NULL part of the code in my looking at your patches. But, I could have just missed it. Regardless, that core piece was lost in my trying to figure out the other changes you were making to the code. From a "generic" reference counting point of view you did correctly emphasize the problem of having a reference count creation occur in an if-statement but a DECREF occur all the time in the finish: section of the code. It was really that statement: "the fantasy that PyArray_Scalar steals a reference" that tipped me off to what I consider one of the real problems to be. The fact that it was masked at the end of a long discussion about other reference counting and a "stealing" discussion that were not the core problem was distracting and ultimately not very helpful. I'm very impressed with your ability to follow these reference count issues. Especially given that you only started learning about the Python C-API a few months ago (if I remember correctly). I'm also very glad you are checking these corner cases which have not received the testing that they deserve. I hope we have not discouraged you too much from continuing to help. Your input is highly valued. -Travis From oliphant at enthought.com Sat Jul 19 14:29:45 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Sat, 19 Jul 2008 13:29:45 -0500 Subject: [Numpy-discussion] Branch cuts, inf, nan, C99 compliance In-Reply-To: References: <80c99e790803170302p199926d2mee7c12e56207504a@mail.gmail.com> Message-ID: <48823299.7060600@enthought.com> Pauli Virtanen wrote: > Hi all, > > Re: Ticket 854. > > I wrote tests for the branch cuts for all complex arc* functions > in umathmodule. It turns out that all except arccosh were OK. > The formula for arcsinh was written in a non-standard form with > an unnecessary nc_neg, but this didn't affect the results. > I also wrote tests for checking values of the functions at infs and nans. > Thanks for looking into these. These functions were contributed by Konrad Hinsen (IIRC) many years ago and I don't think they've really been reviewed since then. I'm all for using C99 when it is available and improving these functions with help from cmathmodule. IIRC, the cmathmodule was contributed by Konrad originally also. So +1 on C99 standardization. -Travis From charlesr.harris at gmail.com Sat Jul 19 14:46:14 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 19 Jul 2008 12:46:14 -0600 Subject: [Numpy-discussion] Buildbot errors on Windows_XP. Message-ID: Memmap problems. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From rmay31 at gmail.com  Sat Jul 19 16:10:42 2008
From: rmay31 at gmail.com (Ryan May)
Date: Sat, 19 Jul 2008 16:10:42 -0400
Subject: [Numpy-discussion] Masked array fill_value
Message-ID: <48824A42.6050803@gmail.com>

Hi,

I just noticed this and found it surprising:

In [8]: from numpy import ma

In [9]: a = ma.array([1,2,3,4],mask=[False,False,True,False],fill_value=0)

In [10]: a
Out[10]:
masked_array(data = [1 2 -- 4],
      mask = [False False  True False],
      fill_value=0)


In [11]: a[2]
Out[11]:
masked_array(data = --,
      mask = True,
      fill_value=1e+20)

In [12]: np.__version__
Out[12]: '1.1.0'

Is there a reason that the fill_value isn't inherited from the parent array?

Ryan

-- 
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma

From efiring at hawaii.edu  Sat Jul 19 17:21:04 2008
From: efiring at hawaii.edu (Eric Firing)
Date: Sat, 19 Jul 2008 11:21:04 -1000
Subject: [Numpy-discussion] Masked array fill_value
In-Reply-To: <48824A42.6050803@gmail.com>
References: <48824A42.6050803@gmail.com>
Message-ID: <48825AC0.3030404@hawaii.edu>

Ryan May wrote:
> Hi,
>
> I just noticed this and found it surprising:
>
> In [8]: from numpy import ma
>
> In [9]: a = ma.array([1,2,3,4],mask=[False,False,True,False],fill_value=0)
>
> In [10]: a
> Out[10]:
> masked_array(data = [1 2 -- 4],
>       mask = [False False  True False],
>       fill_value=0)
>
>
> In [11]: a[2]
> Out[11]:
> masked_array(data = --,
>       mask = True,
>       fill_value=1e+20)
>
> In [12]: np.__version__
> Out[12]: '1.1.0'
>
> Is there a reason that the fill_value isn't inherited from the parent array?

There was a thread about this a couple months ago, and Pierre GM
explained it.  I think the point was that indexing is giving you a new
masked scalar, which is therefore taking the default mask value of the
type.  I don't see it as a problem; you can always specify the fill
value explicitly when you need to.

Eric

>
> Ryan
>

From rmay31 at gmail.com  Sat Jul 19 18:41:22 2008
From: rmay31 at gmail.com (Ryan May)
Date: Sat, 19 Jul 2008 18:41:22 -0400
Subject: [Numpy-discussion] Masked array fill_value
In-Reply-To: <48825AC0.3030404@hawaii.edu>
References: <48824A42.6050803@gmail.com> <48825AC0.3030404@hawaii.edu>
Message-ID: <48826D92.8080806@gmail.com>

Eric Firing wrote:
> Ryan May wrote:
>> Hi,
>>
>> I just noticed this and found it surprising:
>>
>> In [8]: from numpy import ma
>>
>> In [9]: a = ma.array([1,2,3,4],mask=[False,False,True,False],fill_value=0)
>>
>> In [10]: a
>> Out[10]:
>> masked_array(data = [1 2 -- 4],
>>       mask = [False False  True False],
>>       fill_value=0)
>>
>>
>> In [11]: a[2]
>> Out[11]:
>> masked_array(data = --,
>>       mask = True,
>>       fill_value=1e+20)
>>
>> In [12]: np.__version__
>> Out[12]: '1.1.0'
>>
>> Is there a reason that the fill_value isn't inherited from the parent array?
>
> There was a thread about this a couple months ago, and Pierre GM
> explained it.  I think the point was that indexing is giving you a new
> masked scalar, which is therefore taking the default mask value of the
> type.  I don't see it as a problem; you can always specify the fill
> value explicitly when you need to.

I thought it sounded familiar.  You're right, it's not a big problem, it
just seemed unintuitive.  Thanks for the explanation.
Ryan

-- 
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma

From pgmdevlist at gmail.com  Sat Jul 19 19:20:44 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Sat, 19 Jul 2008 19:20:44 -0400
Subject: [Numpy-discussion] Masked array fill_value
In-Reply-To: <48826D92.8080806@gmail.com>
References: <48824A42.6050803@gmail.com> <48825AC0.3030404@hawaii.edu>
	<48826D92.8080806@gmail.com>
Message-ID: <200807191920.44545.pgmdevlist@gmail.com>

On Saturday 19 July 2008 18:41:22 Ryan May wrote:
> > There was a thread about this a couple months ago, and Pierre GM
> > explained it.  I think the point was that indexing is giving you a new
> > masked scalar, which is therefore taking the default mask value of the
> > type.  I don't see it as a problem; you can always specify the fill
> > value explicitly when you need to.

Actually, in that example, a[2] is THE masked scalar: that's a constant
initialized when you import numpy.ma, it doesn't depend on the type.

> I thought it sounded familiar.  You're right, it's not a big problem, it
> just seemed unintuitive.  Thanks for the explanation.

It is in a way, but it's needed for compatibility with older code. That way,
you can test whether a value is masked by using:
a[2] is masked
Yeah, you could also check whether the mask is not nomask and whether the
mask at this particular element is True, but it's a bit longer.

From charlesr.harris at gmail.com  Sat Jul 19 22:51:18 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 19 Jul 2008 20:51:18 -0600
Subject: [Numpy-discussion] Backport r5452?
Message-ID: 

Robert,

Is there any reason I shouldn't backport your build fixes?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com  Sat Jul 19 23:04:03 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 19 Jul 2008 22:04:03 -0500
Subject: [Numpy-discussion] Backport r5452?
In-Reply-To: 
References: 
Message-ID: <3d375d730807192004w23a6f9e5jd78453197890bbdc@mail.gmail.com>

On Sat, Jul 19, 2008 at 21:51, Charles R Harris wrote:
> Robert,
>
> Is there any reason I shouldn't backport your build fixes?

Go ahead.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From nwagner at iam.uni-stuttgart.de  Sun Jul 20 04:18:47 2008
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Sun, 20 Jul 2008 10:18:47 +0200
Subject: [Numpy-discussion] Branch cuts, inf, nan, C99 compliance
In-Reply-To: <48823299.7060600@enthought.com>
References: <80c99e790803170302p199926d2mee7c12e56207504a@mail.gmail.com>
	<48823299.7060600@enthought.com>
Message-ID: 

On Sat, 19 Jul 2008 13:29:45 -0500
 "Travis E. Oliphant" wrote:
> Pauli Virtanen wrote:
>> Hi all,
>>
>> Re: Ticket 854.
>>
>> I wrote tests for the branch cuts for all complex arc*
>>functions
>> in umathmodule. It turns out that all except arccosh
>>were OK.
>> The formula for arcsinh was written in a non-standard
>>form with
>> an unnecessary nc_neg, but this didn't affect the
>>results.
>> I also wrote tests for checking values of the functions
>>at infs and nans.
>>
>
> Thanks for looking into these.  These functions were
>contributed by
> Konrad Hinsen (IIRC) many years ago and I don't think
>they've really
> been reviewed since then.
>
> I'm all for using C99 when it is available and improving
>these functions
> with help from cmathmodule.
IIRC, the cmathmodule was >contributed by > Konrad originally also. > > So +1 on C99 standardization. > > -Travis > ====================================================================== ERROR: test_umath.TestC99.test_catanh(, (nan, nan), (nan, nan), '') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python2.4/site-packages/nose-0.10.3-py2.4.egg/nose/case.py", line 182, in runTest self.test(*self.arg) File "/usr/lib/python2.4/site-packages/numpy/core/tests/test_umath.py", line 405, in _check func(point) FloatingPointError: invalid value encountered in arctanh ====================================================================== ERROR: test_umath.TestC99.test_clog(, (1.0, nan), (nan, nan), '') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python2.4/site-packages/nose-0.10.3-py2.4.egg/nose/case.py", line 182, in runTest self.test(*self.arg) File "/usr/lib/python2.4/site-packages/numpy/core/tests/test_umath.py", line 405, in _check func(point) FloatingPointError: invalid value encountered in log ====================================================================== ERROR: test_umath.TestC99.test_clog(, (nan, 0.0), (nan, nan), '') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python2.4/site-packages/nose-0.10.3-py2.4.egg/nose/case.py", line 182, in runTest self.test(*self.arg) File "/usr/lib/python2.4/site-packages/numpy/core/tests/test_umath.py", line 405, in _check func(point) FloatingPointError: invalid value encountered in log ====================================================================== ERROR: test_umath.TestC99.test_clog(, (nan, 1.0), (nan, nan), '') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python2.4/site-packages/nose-0.10.3-py2.4.egg/nose/case.py", line 182, in runTest self.test(*self.arg) File "/usr/lib/python2.4/site-packages/numpy/core/tests/test_umath.py", line 405, in _check func(point) FloatingPointError: invalid value encountered in log ====================================================================== ERROR: test_umath.TestC99.test_clog(, (nan, nan), (nan, nan), '') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python2.4/site-packages/nose-0.10.3-py2.4.egg/nose/case.py", line 182, in runTest self.test(*self.arg) File "/usr/lib/python2.4/site-packages/numpy/core/tests/test_umath.py", line 405, in _check func(point) FloatingPointError: invalid value encountered in log ---------------------------------------------------------------------- Ran 2070 tests in 35.424s FAILED (SKIP=41, errors=5) Nils From charlesr.harris at gmail.com Sun Jul 20 04:34:48 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 20 Jul 2008 02:34:48 -0600 Subject: [Numpy-discussion] Branch cuts, inf, nan, C99 compliance In-Reply-To: References: <80c99e790803170302p199926d2mee7c12e56207504a@mail.gmail.com> <48823299.7060600@enthought.com> Message-ID: On Sun, Jul 20, 2008 at 2:18 AM, Nils Wagner wrote: > On Sat, 19 Jul 2008 13:29:45 -0500 > "Travis E. Oliphant" wrote: > > Pauli Virtanen wrote: > >> Hi all, > >> > >> Re: Ticket 854. > >> > >> I wrote tests for the branch cuts for all complex arc* > >>functions > >> in umathmodule. It turns out that all except arccosh > >>were OK. 
> >> The formula for arcsinh was written in a non-standard > >>form with > >> an unnecessary nc_neg, but this didn't affect the > >>results. > >> I also wrote tests for checking values of the functions > >>at infs and nans. > >> > > > > Thanks for looking into these. These functions were > >contributed by > > Konrad Hinsen (IIRC) many years ago and I don't think > >they've really > > been reviewed since then. > > > > I'm all for using C99 when it is available and improving > >these functions > > with help from cmathmodule. IIRC, the cmathmodule was > >contributed by > > Konrad originally also. > > > > So +1 on C99 standardization. > > > > -Travis > > > > ====================================================================== > ERROR: test_umath.TestC99.test_catanh(, > (nan, nan), (nan, nan), '') > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/usr/lib/python2.4/site-packages/nose-0.10.3-py2.4.egg/nose/case.py", > line 182, in runTest > self.test(*self.arg) > File > "/usr/lib/python2.4/site-packages/numpy/core/tests/test_umath.py", > line 405, in _check > func(point) > FloatingPointError: invalid value encountered in arctanh > What architecture and OS? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sun Jul 20 04:42:45 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 20 Jul 2008 03:42:45 -0500 Subject: [Numpy-discussion] Branch cuts, inf, nan, C99 compliance In-Reply-To: References: <80c99e790803170302p199926d2mee7c12e56207504a@mail.gmail.com> <48823299.7060600@enthought.com> Message-ID: <3d375d730807200142k5e5c12q81639712699cb0c6@mail.gmail.com> On Sun, Jul 20, 2008 at 03:34, Charles R Harris wrote: > What architecture and OS? I get the following on OS X 10.5.3 Intel Core 2 Duo: ====================================================================== FAIL: test_umath.TestC99.test_clog(, (-0.0, -0.0), (-inf, -0.0), 'divide') ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/nose-0.10.3-py2.5.egg/nose/case.py", line 182, in runTest self.test(*self.arg) File "/Users/rkern/svn/numpy/numpy/core/tests/test_umath.py", line 394, in _check assert got == expected, (got, expected) AssertionError: ('(-inf, 3.1415926535897931)', '(-inf, 0.0)') ====================================================================== FAIL: test_umath.TestC99.test_csqrt(, (-inf, 1.0), (-0.0, inf), '') ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/nose-0.10.3-py2.5.egg/nose/case.py", line 182, in runTest self.test(*self.arg) File "/Users/rkern/svn/numpy/numpy/core/tests/test_umath.py", line 394, in _check assert got == expected, (got, expected) AssertionError: ('(0.0, inf)', '(-0.0, inf)') ---------------------------------------------------------------------- Ran 1887 tests in 7.716s -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From nwagner at iam.uni-stuttgart.de Sun Jul 20 04:51:39 2008 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Sun, 20 Jul 2008 10:51:39 +0200 Subject: [Numpy-discussion] Branch cuts, inf, nan, C99 compliance In-Reply-To: References: <80c99e790803170302p199926d2mee7c12e56207504a@mail.gmail.com> <48823299.7060600@enthought.com> Message-ID: On Sun, 20 Jul 2008 02:34:48 -0600 "Charles R Harris" wrote: > On Sun, Jul 20, 2008 at 2:18 AM, Nils Wagner > > wrote: > >> On Sat, 19 Jul 2008 13:29:45 -0500 >> "Travis E. Oliphant" wrote: >> > Pauli Virtanen wrote: >> >> Hi all, >> >> >> >> Re: Ticket 854. >> >> >> >> I wrote tests for the branch cuts for all complex >>arc* >> >>functions >> >> in umathmodule. It turns out that all except arccosh >> >>were OK. >> >> The formula for arcsinh was written in a non-standard >> >>form with >> >> an unnecessary nc_neg, but this didn't affect the >> >>results. >> >> I also wrote tests for checking values of the >>functions >> >>at infs and nans. >> >> >> > >> > Thanks for looking into these. These functions were >> >contributed by >> > Konrad Hinsen (IIRC) many years ago and I don't think >> >they've really >> > been reviewed since then. >> > >> > I'm all for using C99 when it is available and >>improving >> >these functions >> > with help from cmathmodule. IIRC, the cmathmodule was >> >contributed by >> > Konrad originally also. >> > >> > So +1 on C99 standardization. >> > >> > -Travis >> > >> >> ====================================================================== >> ERROR: test_umath.TestC99.test_catanh(, >> (nan, nan), (nan, nan), '') >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File >> "/usr/lib/python2.4/site-packages/nose-0.10.3-py2.4.egg/nose/case.py", >> line 182, in runTest >> self.test(*self.arg) >> File >> "/usr/lib/python2.4/site-packages/numpy/core/tests/test_umath.py", >> line 405, in _check >> func(point) >> FloatingPointError: invalid value encountered in arctanh >> > > > > What architecture and OS? Linux linux 2.6.11.4-21.17-default #1 Fri Apr 6 08:42:34 UTC 2007 i686 athlon i386 GNU/Linux SuSe Linux 9.3 gcc --version gcc (GCC) 3.3.5 20050117 (prerelease) (SUSE Linux) Copyright (C) 2003 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. /usr/bin/python Python 2.4 (#1, Oct 13 2006, 17:13:31) [GCC 3.3.5 20050117 (prerelease) (SUSE Linux)] on linux2 Type "help", "copyright", "credits" or "license" for more information. Nils From michael at araneidae.co.uk Sun Jul 20 05:12:05 2008 From: michael at araneidae.co.uk (Michael Abbott) Date: Sun, 20 Jul 2008 09:12:05 +0000 (GMT) Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType In-Reply-To: <4882321C.9030402@enthought.com> References: <20080715074217.R81915@saturn.araneidae.co.uk> <20080715150718.R97049@saturn.araneidae.co.uk> <20080717212322.O34675@saturn.araneidae.co.uk> <48815FBF.4080404@enthought.com> <48816F26.2010702@enthought.com> <20080719144215.D62170@saturn.araneidae.co.uk> <4882321C.9030402@enthought.com> Message-ID: <20080720084401.A71237@saturn.araneidae.co.uk> Travis, thank you for your encouraging words. On Sat, 19 Jul 2008, Travis E. Oliphant wrote: > Really this code "grew" from a simple thing into a complicated thing as > more "features" were added. This is a common issue that happens all > over the place. Aye. 
> The reason I say I'm not sure is that I don't remember seeing a DECREF
> after the PyArray_Scalar in the obj = NULL part of the code in my
> looking at your patches.  But, I could have just missed it.
There wasn't -- instead, I was trying to retain the typecode (and paying
the price of releasing it on all the early returns!)

> From a "generic" reference counting point of view you did correctly
> emphasize the problem of having a reference count creation occur in an
> if-statement but a DECREF occur all the time in the finish: section of
> the code.
Yah -- I think the core idea I was trying to get over is that of an
"invariant" property at each point in the code to capture what needs to
be true for the code to be correct.

> It was really that statement: "the fantasy that PyArray_Scalar steals a
> reference" that tipped me off to what I consider one of the real
> problems to be.  The fact that it was masked at the end of a long
> discussion about other reference counting and a "stealing" discussion
> that were not the core problem was distracting and ultimately not very
> helpful.
That was the really hard bit for me.  To me the issue was actually very
obvious (though I didn't realise that typecode could be regenerated,
which simplifies things enormously), so the problem was trying to figure
out what you and Charles were not seeing.  I think that's why I ended up
throwing everything into the pot!

> I'm very impressed with your ability to follow these reference count
> issues.  Especially given that you only started learning about the
> Python C-API a few months ago (if I remember correctly).
Alas no.  I'm a bit of an old lag really, I did dabble with the Python C
API quite a few years ago (2001ish maybe?).  My roots are in computer
science and then assembler (graduated 1980) before Pascal (seriously) then
C, then C++ (which I now regard as a serious mistake) and finally shell
script plus Python, all largely on embedded applications.  I'd love the
opportunity to learn and use Haskell now.

> I'm also very glad you are checking these corner cases which have not
> received the testing that they deserve.  I hope we have not discouraged
> you too much from continuing to help.  Your input is highly valued.
Maybe I'll have a further play.  The memory leak issue was a direct
consequence of using numpy in an embedded application, and that issue's
closed now, but I ought to see if this painful code can be revisited.

I'm learning my way around git and have just used `git svn` to grab (and
update) the numpy repository.  I'm hugely impressed by it, though it is
very expensive to run the first time -- it fetches every single svn
revision!  Hopefully that didn't overload the web server...  This will
make working on patches much easier.

From stefan at sun.ac.za  Sun Jul 20 07:00:16 2008
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Sun, 20 Jul 2008 13:00:16 +0200
Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType
In-Reply-To: <20080720084401.A71237@saturn.araneidae.co.uk>
References: <20080715074217.R81915@saturn.araneidae.co.uk>
	<20080717212322.O34675@saturn.araneidae.co.uk>
	<48815FBF.4080404@enthought.com> <48816F26.2010702@enthought.com>
	<20080719144215.D62170@saturn.araneidae.co.uk>
	<4882321C.9030402@enthought.com>
	<20080720084401.A71237@saturn.araneidae.co.uk>
Message-ID: <9457e7c80807200400l1c092570i6c5d9296cec0d9ec@mail.gmail.com>

2008/7/20 Michael Abbott :
>> I'm very impressed with your ability to follow these reference count
>> issues.
Especially given that you only started learning about the
>> Python C-API a few months ago (if I remember correctly).
> Alas no. I'm a bit of an old lag really, I did dabble with the Python C
> API quite a few years ago (2001ish maybe?). My roots are in computer
> science and then assembler (graduated 1980) before Pascal (seriously) then
> C, then C++ (which I now regard as a serious mistake) and finally shell

It's scary how many of us were scarred for life by C++.

If you have ten minutes free sometime, would you please consider
writing up your reference counting explanation on the wiki (even if
you just copy and paste that part out of your email and annotate it)?
Having more eyes on the NumPy code is imperative; we need to teach
more people to understand how to find these sorts of problems.

> I'm learning my way around git and have just used `git svn` to grab (and
> update) the numpy repository.  I'm hugely impressed by it, though it is
> very expensive to run the first time -- it fetches every single svn
> revision!  Hopefully that didn't overload the web server...  This will
> make working on patches much easier.

I hope that we can move over to a distributed revision control system
sometime in the foreseeable future.  From what I've seen, its model
strongly encourages community interaction.

Cheers
Stéfan

From pav at iki.fi  Sun Jul 20 08:10:23 2008
From: pav at iki.fi (Pauli Virtanen)
Date: Sun, 20 Jul 2008 12:10:23 +0000 (UTC)
Subject: [Numpy-discussion] Branch cuts, inf, nan, C99 compliance
References: <80c99e790803170302p199926d2mee7c12e56207504a@mail.gmail.com>
	<48823299.7060600@enthought.com>
	
Message-ID: 

Hi,

Sorry,

Sun, 20 Jul 2008 10:18:47 +0200, Nils Wagner wrote:

> ERROR: test_umath.TestC99.test_catanh(, (nan, nan),
> (nan, nan), '')
> FloatingPointError: invalid value encountered in arctanh

Skipped now.

> ERROR: test_umath.TestC99.test_clog(, (1.0, nan), (nan,
> nan), '')
> FloatingPointError: invalid value encountered in log

Bug in tests, fixed.

> ERROR: test_umath.TestC99.test_clog(, (nan, 0.0), (nan,
> nan), '')
> FloatingPointError: invalid value encountered in log

Bug in tests, fixed.

> ERROR: test_umath.TestC99.test_clog(, (nan, 1.0), (nan,
> nan), '')
> FloatingPointError: invalid value encountered in log

Bug in tests, fixed.

> ERROR: test_umath.TestC99.test_clog(, (nan, nan), (nan,
> nan), '')
> FloatingPointError: invalid value encountered in log

Skipped.


Sun, 20 Jul 2008 03:42:45 -0500, Robert Kern wrote:
> FAIL: test_umath.TestC99.test_clog(, (-0.0, -0.0), (-inf,
-0.0), 'divide')

Interesting, there's no test like this in there, only one with positive
zeros. Where does this one come from?

> FAIL: test_umath.TestC99.test_csqrt(, (-inf, 1.0),
(-0.0, inf), '')

Fails on your platform, skipped.

-- 
Pauli Virtanen

From nwagner at iam.uni-stuttgart.de  Sun Jul 20 08:22:16 2008
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Sun, 20 Jul 2008 14:22:16 +0200
Subject: [Numpy-discussion] Branch cuts, inf, nan, C99 compliance
In-Reply-To: 
References: <80c99e790803170302p199926d2mee7c12e56207504a@mail.gmail.com>
	<48823299.7060600@enthought.com>
	
Message-ID: 

On Sun, 20 Jul 2008 12:10:23 +0000 (UTC)
 Pauli Virtanen wrote:
> Hi,
>
> Sorry,
>
> Sun, 20 Jul 2008 10:18:47 +0200, Nils Wagner wrote:
>
>> ERROR: test_umath.TestC99.test_catanh(,
>>(nan, nan),
>> (nan, nan), '')
>> FloatingPointError: invalid value encountered in arctanh
>
> Skipped now.
> >> ERROR: test_umath.TestC99.test_clog(, (1.0,
>>nan), (nan,
>> nan), '')
>> FloatingPointError: invalid value encountered in log
>
> Bug in tests, fixed.
>
>> ERROR: test_umath.TestC99.test_clog(, (nan,
>>0.0), (nan,
>> nan), '')
>> FloatingPointError: invalid value encountered in log
>
> Bug in tests, fixed.
>
>> ERROR: test_umath.TestC99.test_clog(, (nan,
>>1.0), (nan,
>> nan), '')
>> FloatingPointError: invalid value encountered in log
>
> Bug in tests, fixed.
>
>> ERROR: test_umath.TestC99.test_clog(, (nan,
>>nan), (nan,
>> nan), '')
>> FloatingPointError: invalid value encountered in log
>
> Skipped.
>
>
> Sun, 20 Jul 2008 03:42:45 -0500, Robert Kern wrote:
>> FAIL: test_umath.TestC99.test_clog(, (-0.0,
>>-0.0), (-inf,
> -0.0), 'divide')
>
> Interesting, there's no test like this in there, only
>one with positive
> zeros. Where does this one come from?
>
>> FAIL: test_umath.TestC99.test_csqrt(,
>>(-inf, 1.0),
> (-0.0, inf), '')
>
>Fails on your platform, skipped.
>
> --
> Pauli Virtanen
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

Ran 2069 tests in 30.300s

OK (SKIP=44)

Thank you.

Cheers,
          Nils

From michael at araneidae.co.uk  Sun Jul 20 08:27:51 2008
From: michael at araneidae.co.uk (Michael Abbott)
Date: Sun, 20 Jul 2008 12:27:51 +0000 (GMT)
Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType
In-Reply-To: <9457e7c80807200400l1c092570i6c5d9296cec0d9ec@mail.gmail.com>
References: <20080715074217.R81915@saturn.araneidae.co.uk>
	<20080717212322.O34675@saturn.araneidae.co.uk>
	<48815FBF.4080404@enthought.com> <48816F26.2010702@enthought.com>
	<20080719144215.D62170@saturn.araneidae.co.uk>
	<4882321C.9030402@enthought.com>
	<20080720084401.A71237@saturn.araneidae.co.uk>
	<9457e7c80807200400l1c092570i6c5d9296cec0d9ec@mail.gmail.com>
Message-ID: <20080720122136.P71881@saturn.araneidae.co.uk>

On Sun, 20 Jul 2008, Stéfan van der Walt wrote:
> 2008/7/20 Michael Abbott :
> > C, then C++ (which I now regard as a serious mistake) and finally shell
> It's scary how many of us were scarred for life by C++.
> What's really annoying for me is that my most recent big project
> (http://sourceforge.net/projects/libera-epics) was written in C++,
> entirely at my own choice. It seemed like a good idea at the time.

Others are scarred for life with C and are more than happy with C++...

Matthieu, who really hopes he will not ever write one more line of C code
-- 
French PhD student
Website : http://matthieu-brucher.developpez.com/
Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn : http://www.linkedin.com/in/matthieubrucher

From stefan at sun.ac.za  Sun Jul 20 12:00:49 2008
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Sun, 20 Jul 2008 18:00:49 +0200
Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType
In-Reply-To: <20080720122136.P71881@saturn.araneidae.co.uk>
References: <20080715074217.R81915@saturn.araneidae.co.uk>
	<48815FBF.4080404@enthought.com> <48816F26.2010702@enthought.com>
	<20080719144215.D62170@saturn.araneidae.co.uk>
	<4882321C.9030402@enthought.com>
	<20080720084401.A71237@saturn.araneidae.co.uk>
	<9457e7c80807200400l1c092570i6c5d9296cec0d9ec@mail.gmail.com>
	<20080720122136.P71881@saturn.araneidae.co.uk>
Message-ID: <9457e7c80807200900t52463fc5ue35de89388c586fa@mail.gmail.com>

2008/7/20 Michael Abbott :
> On Sun, 20 Jul 2008, Stéfan van der Walt wrote:
>> 2008/7/20 Michael Abbott :
>> > C, then C++ (which I now regard as a serious mistake) and finally shell
>> It's scary how many of us were scarred for life by C++.
>
> What's really annoying for me is that my most recent big project
> (http://sourceforge.net/projects/libera-epics) was written in C++,
> entirely at my own choice. It seemed like a good idea at the time.
>
>> If you have ten minutes free sometime, would you please consider
>> writing up your reference counting explanation on the wiki
>
> Good idea. Do you mean at http://scipy.org/scipy/numpy ? Somewhere under
> CodingStyleGuidelines?

Or here: http://projects.scipy.org/scipy/numpy/ under `guidelines`.

Thanks!
Stéfan

From charlesr.harris at gmail.com  Sun Jul 20 13:06:11 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 20 Jul 2008 11:06:11 -0600
Subject: [Numpy-discussion] Ticket review: #848, leak in PyArray_DescrFromType
In-Reply-To: <9457e7c80807200400l1c092570i6c5d9296cec0d9ec@mail.gmail.com>
References: <20080715074217.R81915@saturn.araneidae.co.uk>
	<48815FBF.4080404@enthought.com> <48816F26.2010702@enthought.com>
	<20080719144215.D62170@saturn.araneidae.co.uk>
	<4882321C.9030402@enthought.com>
	<20080720084401.A71237@saturn.araneidae.co.uk>
	<9457e7c80807200400l1c092570i6c5d9296cec0d9ec@mail.gmail.com>
Message-ID: 

On Sun, Jul 20, 2008 at 5:00 AM, Stéfan van der Walt  wrote:

> 2008/7/20 Michael Abbott :
> >> I'm very impressed with your ability to follow these reference count
> >> issues.  Especially given that you only started learning about the
> >> Python C-API a few months ago (if I remember correctly).
> > Alas no.  I'm a bit of an old lag really, I did dabble with the Python C
> > API quite a few years ago (2001ish maybe?).  My roots are in computer
> > science and then assembler (graduated 1980) before Pascal (seriously)
> then
> > C, then C++ (which I now regard as a serious mistake) and finally shell
>
> It's scary how many of us were scarred for life by C++.
>

I rather like C++, especially the templates and (BOOST) smart pointers.
One just has to avoid using the more exotic features, think ten or twenty
times before using inheritance, and be very suspicious of operator
overloading. And  is your friend. But if you need to ration memory
and worry about dynamic allocation, forget it. I wouldn't use it for
drivers.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From millman at berkeley.edu  Sun Jul 20 16:00:36 2008
From: millman at berkeley.edu (Jarrod Millman)
Date: Sun, 20 Jul 2008 13:00:36 -0700
Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight
Message-ID: 

Hello,

This is a reminder that 1.1.1rc1 will be tagged tonight.  Chuck is
planning to spend some time today fixing a few final bugs on the 1.1.x
branch.  If anyone else is planning to commit anything to the 1.1.x
branch today, please let me know immediately.  Obviously now is not
the time to commit anything to the branch that could break anything,
so please be extremely careful if you have to touch the branch.

Once the release is tagged, Chris and David will create binary
installers for both Windows and Mac.  Hopefully, this will give us an
opportunity to have much more widespread testing before releasing
1.1.1 final at the end of the month.

Thanks,

-- 
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/

From rmay31 at gmail.com  Sun Jul 20 16:14:53 2008
From: rmay31 at gmail.com (Ryan May)
Date: Sun, 20 Jul 2008 16:14:53 -0400
Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight
In-Reply-To: 
References: 
Message-ID: <48839CBD.30304@gmail.com>

Jarrod Millman wrote:
> Hello,
>
> This is a reminder that 1.1.1rc1 will be tagged tonight.  Chuck is
> planning to spend some time today fixing a few final bugs on the 1.1.x
> branch.  If anyone else is planning to commit anything to the 1.1.x
> branch today, please let me know immediately.  Obviously now is not
> the time to commit anything to the branch that could break anything,
> so please be extremely careful if you have to touch the branch.
>
> Once the release is tagged, Chris and David will create binary
> installers for both Windows and Mac.  Hopefully, this will give us an
> opportunity to have much more widespread testing before releasing
> 1.1.1 final at the end of the month.
>

Can I get anyone to look at this patch for loadtxt()?  I was trying to
use loadtxt() today to read in some text data, and I had a problem when I
specified a dtype that only contained as many elements as there are
columns in usecols.  The example below shows the problem:

import numpy as np
import StringIO

data = '''STID RELH TAIR
JOE 70.1 25.3
BOB 60.5 27.9
'''

f = StringIO.StringIO(data)

names = ['stid', 'temp']
dtypes = ['S4', 'f8']

arr = np.loadtxt(f, usecols=(0,2),dtype=zip(names,dtypes), skiprows=1)

With current 1.1 (and SVN head), this yields:

IndexError                                Traceback (most recent call last)

/home/rmay/ in ()

/usr/lib64/python2.5/site-packages/numpy/lib/io.pyc in loadtxt(fname,
dtype, comments, delimiter, converters, skiprows, usecols, unpack)
    309                 for j in xrange(len(vals))]
    310         if usecols is not None:
--> 311             row = [converterseq[j](vals[j]) for j in usecols]
    312         else:
    313             row = [converterseq[j](val) for j,val in enumerate(vals)]

IndexError: list index out of range
-----------------------------------------

I've added a patch that checks for usecols, and if present, correctly
creates the converters dictionary to map each specified column to the
converter for the corresponding field in the dtype.
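(The shape of the fix Ryan describes, as a rough sketch rather than the
actual patch -- the helper name and the fallback converters here are
illustrative assumptions:)

import numpy as np

def converters_for(dtype, usecols):
    # Key each *file* column index from usecols to a converter for the
    # matching dtype field, so a converter exists for every j in usecols.
    defaults = {'i': int, 'u': int, 'f': float}
    conv = {}
    for field, col in zip(dtype.names, usecols):
        kind = dtype.fields[field][0].kind
        conv[col] = defaults.get(kind, str)
    return conv

dt = np.dtype([('stid', 'S4'), ('temp', 'f8')])
print(converters_for(dt, usecols=(0, 2)))   # maps column 0 -> str, column 2 -> float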
With the attached patch, this works fine:

>arr
array([('JOE', 25.300000000000001), ('BOB', 27.899999999999999)],
      dtype=[('stid', '|S4'), ('temp', '<f8')])

From charlesr.harris at gmail.com  Sun Jul 20 16:52:45 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 20 Jul 2008 14:52:45 -0600
Subject: [Numpy-discussion] MSVC warnings.
Message-ID: 

The Windows_XP buildbot shows several warnings about npy_intp -> int
conversions. Two of them look OK, but probably explicit casts should be
made. The third is:

numpy\core\src\ufuncobject.c(2422) : warning C4244: '=' : conversion from
'npy_intp' to 'int', possible loss of data

Which looks possibly legitimate to me. Unfortunately, it looks like fixing
it involves a change in the loop type.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From charlesr.harris at gmail.com  Sun Jul 20 17:06:41 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 20 Jul 2008 15:06:41 -0600
Subject: [Numpy-discussion] Dealing with NAN.
Message-ID: 

The NAN macro isn't defined on all platforms and we have various
workarounds scattered here and there. Is it reasonable to determine a
suitable value of NAN for the various float types as part of the initial
setup and configuration and have the results appended to one of the include
files? This shouldn't be difficult for the gnu compilers, but might be a
hassle for MSVC and some of the others. It can be hardwired into MSVC if we
assume the Intel architecture.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From charlesr.harris at gmail.com  Sun Jul 20 17:34:22 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 20 Jul 2008 15:34:22 -0600
Subject: [Numpy-discussion] Windows_XP buildbot error.
Message-ID: 

The log file shows:

 File
"c:\numpy-buildbot\numpy\b11\install\Lib\site-packages\numpy\lib\tests\test_format.py",
line 429, in test_memmap_roundtrip
    fp = open(nfn, 'wb')
IOError: [Errno 2] No such file or directory:
'c:\\docume~1\\thomas\\locals~1\\temp\\tmp_yrykj\\normal.npy'

Is this some sort of permissions error or something specific to Thomas'
machine? I don't want this to show up in the 1.1.1 release and I'm
wondering if there is an easy fix besides disabling the test.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From charlesr.harris at gmail.com  Sun Jul 20 18:42:42 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 20 Jul 2008 16:42:42 -0600
Subject: [Numpy-discussion] Ticket #794 and can o' worms.
Message-ID: 

Hi All,

I "fixed" ticket #754, but it leads to a ton of problems. The original
discussion is here. The problems that arise come from conversion to
different types.

In [26]: a
Out[26]: array([ Inf, -Inf,  NaN,   0.,   3.,  -3.])

In [27]: sign(a).astype(int)
Out[27]:
array([          1,          -1, -2147483648,           0,           1,
                -1])

In [28]: sign(a).astype(bool)
Out[28]: array([ True,  True,  True, False,  True,  True], dtype=bool)

In [29]: sign(a)
Out[29]: array([  1.,  -1.,  NaN,   0.,   1.,  -1.])

In [30]: bool(NaN)
Out[30]: True

So there are problems with at minimum the following.

1) The way NaN is converted to bool. I think it should be False.
2) The way NaN is converted to int types. I think it should be 0.

These problems show up in failing tests. I'm reverting the fix for now,
but I wonder what we should do about these things.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From robert.kern at gmail.com Sun Jul 20 18:47:18 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 20 Jul 2008 17:47:18 -0500 Subject: [Numpy-discussion] Ticket #794 and can o' worms. In-Reply-To: References: Message-ID: <3d375d730807201547v46d709bflc3e6dac9b7d7a0a6@mail.gmail.com> On Sun, Jul 20, 2008 at 17:42, Charles R Harris wrote: > Hi All, > > I "fixed" ticket #754, but it leads to a ton of problems. The original > discussion is here. The problems that arise come from conversion to > different types. > > In [26]: a > Out[26]: array([ Inf, -Inf, NaN, 0., 3., -3.]) > > In [27]: sign(a).astype(int) > Out[27]: > array([ 1, -1, -2147483648, 0, 1, > -1]) > > In [28]: sign(a).astype(bool) > Out[28]: array([ True, True, True, False, True, True], dtype=bool) > > In [29]: sign(a) > Out[29]: array([ 1., -1., NaN, 0., 1., -1.]) > > In [30]: bool(NaN) > Out[30]: True > > So there are problems with at minimum the following. > > 1) The way NaN is converted to bool. I think it should be False. It's not really our choice. That's Python's bool(). For the things that are our choice (e.g. array([nan]).astype(bool)) I think we should stay consistent with Python. > 2) The way NaN is converted to int types. I think it should be 0. I agree. That's what int(nan) gives: >>> int(nan) 0L -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Sun Jul 20 19:10:16 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 20 Jul 2008 17:10:16 -0600 Subject: [Numpy-discussion] Testing: Failed examples don't raise errors on buildbot. Message-ID: Alan, Stefan Not raising errors seems ok for examples, but some of the unit tests are also implemented as doctests and the failures are hidden in the logs. I'm not sure what to do about this, but thought it worth pointing out. Also, it would be nice if skipped tests didn't generate large bits of printout, it makes it hard to find relevant failures. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.mcintyre at gmail.com Sun Jul 20 21:17:04 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Sun, 20 Jul 2008 21:17:04 -0400 Subject: [Numpy-discussion] Testing: Failed examples don't raise errors on buildbot. In-Reply-To: References: Message-ID: <1d36917a0807201817o4c8933g137cff2212226471@mail.gmail.com> On Sun, Jul 20, 2008 at 7:10 PM, Charles R Harris wrote: > Not raising errors seems ok for examples, but some of the unit tests are > also implemented as doctests and the failures are hidden in the logs. I'm > not sure what to do about this, but thought it worth pointing out. Also, it > would be nice if skipped tests didn't generate large bits of printout, it > makes it hard to find relevant failures. Yeah I was just looking at that; right off the top of my head I don't know why that doctest failure wouldn't bubble all the way up to become a unit test failure. Personally, I'm not a big fan of having a test like that in a docstring and then trying to run it as a unit test, but I'll see if I can fix it. :) The skipped test verbosity is annoying; I'll see if there's a way to make that a bit cleaner-looking for some low verbosity level. 
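(A minimal illustration of the failure mode Alan describes -- this assumes
the collector runs docstring examples through the standard doctest
machinery; doctest itself reports a failing example, but it only turns into
a failing unit test if the runner inspects the result:)

import doctest

def add_one(x):
    """
    >>> add_one(1)
    3
    """
    return x + 1

failed, attempted = doctest.testmod()
print(failed, attempted)   # 1 1 -- doctest sees the failing example
# The run only *fails* if the harness acts on `failed`, e.g.:
# assert failed == 0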
From charlesr.harris at gmail.com  Sun Jul 20 21:32:10 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 20 Jul 2008 19:32:10 -0600
Subject: [Numpy-discussion] Ticket #794 and can o' worms.
In-Reply-To: <3d375d730807201547v46d709bflc3e6dac9b7d7a0a6@mail.gmail.com>
References: 
	<3d375d730807201547v46d709bflc3e6dac9b7d7a0a6@mail.gmail.com>
Message-ID: 

On Sun, Jul 20, 2008 at 4:47 PM, Robert Kern wrote:

> On Sun, Jul 20, 2008 at 17:42, Charles R Harris
> wrote:
> > Hi All,
> >
> > I "fixed" ticket #754, but it leads to a ton of problems. The original
> > discussion is here. The problems that arise come from conversion to
> > different types.
> >
> > In [26]: a
> > Out[26]: array([ Inf, -Inf,  NaN,   0.,   3.,  -3.])
> >
> > In [27]: sign(a).astype(int)
> > Out[27]:
> > array([          1,          -1, -2147483648,           0,           1,
> > -1])
> >
> > In [28]: sign(a).astype(bool)
> > Out[28]: array([ True,  True,  True, False,  True,  True], dtype=bool)
> >
> > In [29]: sign(a)
> > Out[29]: array([  1.,  -1.,  NaN,   0.,   1.,  -1.])
> >
> > In [30]: bool(NaN)
> > Out[30]: True
> >
> > So there are problems with at minimum the following.
> >
> > 1) The way NaN is converted to bool. I think it should be False.
>
> It's not really our choice. That's Python's bool(). For the things
> that are our choice (e.g. array([nan]).astype(bool)) I think we should
> stay consistent with Python.
>
> > 2) The way NaN is converted to int types. I think it should be 0.
>
> I agree. That's what int(nan) gives:
>
> >>> int(nan)
> 0L
>

So we should shoot for:

nan -> bool : True
nan -> integer kind : 0
nan -> complex : NaN+0j
nan -> string kind : ?, currently it is any one of 'n', 'na', 'nan',
depending on string length.
nan -> object: float object nan.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com  Sun Jul 20 21:35:30 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Sun, 20 Jul 2008 20:35:30 -0500
Subject: [Numpy-discussion] Ticket #794 and can o' worms.
In-Reply-To: 
References: <3d375d730807201547v46d709bflc3e6dac9b7d7a0a6@mail.gmail.com>
	
Message-ID: <3d375d730807201835p5a87e772m15d03b13655678a1@mail.gmail.com>

On Sun, Jul 20, 2008 at 20:32, Charles R Harris wrote:
>
> On Sun, Jul 20, 2008 at 4:47 PM, Robert Kern wrote:
>>
>> On Sun, Jul 20, 2008 at 17:42, Charles R Harris
>> wrote:
>> > Hi All,
>> >
>> > I "fixed" ticket #754, but it leads to a ton of problems. The original
>> > discussion is here. The problems that arise come from conversion to
>> > different types.
>> >
>> > In [26]: a
>> > Out[26]: array([ Inf, -Inf,  NaN,   0.,   3.,  -3.])
>> >
>> > In [27]: sign(a).astype(int)
>> > Out[27]:
>> > array([          1,          -1, -2147483648,           0,           1,
>> > -1])
>> >
>> > In [28]: sign(a).astype(bool)
>> > Out[28]: array([ True,  True,  True, False,  True,  True], dtype=bool)
>> >
>> > In [29]: sign(a)
>> > Out[29]: array([  1.,  -1.,  NaN,   0.,   1.,  -1.])
>> >
>> > In [30]: bool(NaN)
>> > Out[30]: True
>> >
>> > So there are problems with at minimum the following.
>> >
>> > 1) The way NaN is converted to bool. I think it should be False.
>>
>> It's not really our choice. That's Python's bool(). For the things
>> that are our choice (e.g. array([nan]).astype(bool)) I think we should
>> stay consistent with Python.
>>
>> > 2) The way NaN is converted to int types. I think it should be 0.
>>
>> I agree.
That's what int(nan) gives:
>>
>> >>> int(nan)
>> 0L
>
> So we should shoot for:
>
> nan -> bool : True
> nan -> integer kind : 0
> nan -> complex : NaN+0j
> nan -> string kind : ?, currently it is any one of 'n', 'na', 'nan',
> depending on string length.
> nan -> object: float object nan.

Sounds right.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From tim.hochberg at ieee.org  Sun Jul 20 22:32:35 2008
From: tim.hochberg at ieee.org (Timothy Hochberg)
Date: Sun, 20 Jul 2008 19:32:35 -0700
Subject: [Numpy-discussion] Ticket #794 and can o' worms.
In-Reply-To: <3d375d730807201547v46d709bflc3e6dac9b7d7a0a6@mail.gmail.com>
References: 
	<3d375d730807201547v46d709bflc3e6dac9b7d7a0a6@mail.gmail.com>
Message-ID: 

On Sun, Jul 20, 2008 at 3:47 PM, Robert Kern  wrote:

> On Sun, Jul 20, 2008 at 17:42, Charles R Harris
> wrote:
> > Hi All,
> >
> > I "fixed" ticket #754, but it leads to a ton of problems. The original
> > discussion is here. The problems that arise come from conversion to
> > different types.
> >
> > In [26]: a
> > Out[26]: array([ Inf, -Inf,  NaN,   0.,   3.,  -3.])
> >
> > In [27]: sign(a).astype(int)
> > Out[27]:
> > array([          1,          -1, -2147483648,           0,           1,
> > -1])
> >
> > In [28]: sign(a).astype(bool)
> > Out[28]: array([ True,  True,  True, False,  True,  True], dtype=bool)
> >
> > In [29]: sign(a)
> > Out[29]: array([  1.,  -1.,  NaN,   0.,   1.,  -1.])
> >
> > In [30]: bool(NaN)
> > Out[30]: True
> >
> > So there are problems with at minimum the following.
> >
> > 1) The way NaN is converted to bool. I think it should be False.
>
> It's not really our choice. That's Python's bool(). For the things
> that are our choice (e.g. array([nan]).astype(bool)) I think we should
> stay consistent with Python.
>

I agree that this is a good goal. However, in the past, Python's treatment
of NaNs has been rather platform dependent and ad hoc. In this case, I
suspect that you are OK since the section "Truth Value Testing" in the
Python docs is pretty clear that any non-zero value of a numerical type is
True.

However...

>
> > 2) The way NaN is converted to int types. I think it should be 0.
>
> I agree. That's what int(nan) gives:
>
> >>> int(nan)
> 0L

This is GvR in
http://mail.python.org/pipermail/python-dev/2008-January/075865.html:

If long(nan) or int(nan) returns 0 on most platforms in 2.5, we should
fix them to always return 0 in 2.5 *and* 2.6. In 3.0 they should raise
ValueError.

This implies that in version 2.4 and earlier, the Python behaviour is
platform dependent. And that in 3.0 this is going to change to raise a
ValueError. Whether it's more important to match current behaviour (return
0) or future behaviour (raise ValueError), I'm not certain. I would lean
towards a ValueError since it's less long-term pain and it's IMO more
correct.

>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
> -- Umberto Eco
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>

-- 
. __
. |-\
.
. tim.hochberg at ieee.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From alan.mcintyre at gmail.com Sun Jul 20 22:47:35 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Sun, 20 Jul 2008 22:47:35 -0400 Subject: [Numpy-discussion] Testing: Failed examples don't raise errors on buildbot. In-Reply-To: <1d36917a0807201817o4c8933g137cff2212226471@mail.gmail.com> References: <1d36917a0807201817o4c8933g137cff2212226471@mail.gmail.com> Message-ID: <1d36917a0807201947g12485ca3vbe6dd01e50a0575a@mail.gmail.com> On Sun, Jul 20, 2008 at 9:17 PM, Alan McIntyre wrote: > The skipped test verbosity is annoying; I'll see if there's a way to > make that a bit cleaner-looking for some low verbosity level. The latest release version of nose from easy_install (0.10.3) doesn't generate that verbose output for skipped tests. Should we move up to requiring 0.10.3 for tests? From robert.kern at gmail.com Sun Jul 20 22:56:47 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 20 Jul 2008 21:56:47 -0500 Subject: [Numpy-discussion] Testing: Failed examples don't raise errors on buildbot. In-Reply-To: <1d36917a0807201947g12485ca3vbe6dd01e50a0575a@mail.gmail.com> References: <1d36917a0807201817o4c8933g137cff2212226471@mail.gmail.com> <1d36917a0807201947g12485ca3vbe6dd01e50a0575a@mail.gmail.com> Message-ID: <3d375d730807201956q6db390eet1416af550c579f7b@mail.gmail.com> On Sun, Jul 20, 2008 at 21:47, Alan McIntyre wrote: > On Sun, Jul 20, 2008 at 9:17 PM, Alan McIntyre wrote: >> The skipped test verbosity is annoying; I'll see if there's a way to >> make that a bit cleaner-looking for some low verbosity level. > > The latest release version of nose from easy_install (0.10.3) doesn't > generate that verbose output for skipped tests. Should we move up to > requiring 0.10.3 for tests? I don't think aesthetics are worth requiring a particular version. numpy doesn't need it; the users can decide whether they want it or not. We should try to have it installed on the buildbots, though, since we *are* the users in that case. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From alan.mcintyre at gmail.com Sun Jul 20 23:09:04 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Sun, 20 Jul 2008 23:09:04 -0400 Subject: [Numpy-discussion] Testing: Failed examples don't raise errors on buildbot. In-Reply-To: <3d375d730807201956q6db390eet1416af550c579f7b@mail.gmail.com> References: <1d36917a0807201817o4c8933g137cff2212226471@mail.gmail.com> <1d36917a0807201947g12485ca3vbe6dd01e50a0575a@mail.gmail.com> <3d375d730807201956q6db390eet1416af550c579f7b@mail.gmail.com> Message-ID: <1d36917a0807202009m57483527uae976aede10f2d36@mail.gmail.com> On Sun, Jul 20, 2008 at 10:56 PM, Robert Kern wrote: > I don't think aesthetics are worth requiring a particular version. > numpy doesn't need it; the users can decide whether they want it or > not. We should try to have it installed on the buildbots, though, > since we *are* the users in that case. Actually I was considering asking to move the minimum nose version up to 0.10.3 just because it's the current version before this aesthetic issue came up. There's about 30 bug fixes between 0.10.0 and 0.10.3, including one that fixed some situations in which exceptions were being hidden and one that makes the coverage reporting more accurate. It's not a big deal, though. 
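(The kind of minimum-version gate being discussed comes down to a few
lines -- a sketch only; whether numpy's test harness performs the check
exactly this way is an assumption:)

import nose

MINIMUM = (0, 10, 0)   # the floor being debated in this thread
installed = tuple(int(p) for p in nose.__version__.split('.')[:3])
if installed < MINIMUM:
    raise ImportError("nose >= %s is required to run the test suite, "
                      "found %s" % ('.'.join(map(str, MINIMUM)),
                                    nose.__version__))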
From gael.varoquaux at normalesup.org  Sun Jul 20 23:17:52 2008
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Mon, 21 Jul 2008 05:17:52 +0200
Subject: [Numpy-discussion] Testing: Failed examples don't raise errors
	on buildbot.
In-Reply-To: <1d36917a0807202009m57483527uae976aede10f2d36@mail.gmail.com>
References: <1d36917a0807201817o4c8933g137cff2212226471@mail.gmail.com>
	<1d36917a0807201947g12485ca3vbe6dd01e50a0575a@mail.gmail.com>
	<3d375d730807201956q6db390eet1416af550c579f7b@mail.gmail.com>
	<1d36917a0807202009m57483527uae976aede10f2d36@mail.gmail.com>
Message-ID: <20080721031752.GM31836@phare.normalesup.org>

On Sun, Jul 20, 2008 at 11:09:04PM -0400, Alan McIntyre wrote:
> Actually I was considering asking to move the minimum nose version up
> to 0.10.3 just because it's the current version before this aesthetic
> issue came up.  There's about 30 bug fixes between 0.10.0 and 0.10.3,
> including one that fixed some situations in which exceptions were
> being hidden and one that makes the coverage reporting more accurate.
> It's not a big deal, though.

There might be a case to move to 10.3, considering the large amount of
bug fixes, but in general I think it is a bad idea to require leading
edge packages. The reason being that you would like people to be able to
rely on packaged versions of the different tools to build and test a
package. By packaged versions, I mean versions in the repositories of the
main linux distributions, and macport and fink. Each time we require
something outside a repository, we lose testers.

Gaël

From alan.mcintyre at gmail.com  Sun Jul 20 23:19:57 2008
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Sun, 20 Jul 2008 23:19:57 -0400
Subject: [Numpy-discussion] Testing: Failed examples don't raise errors
	on buildbot.
In-Reply-To: <20080721031752.GM31836@phare.normalesup.org>
References: <1d36917a0807201817o4c8933g137cff2212226471@mail.gmail.com>
	<1d36917a0807201947g12485ca3vbe6dd01e50a0575a@mail.gmail.com>
	<3d375d730807201956q6db390eet1416af550c579f7b@mail.gmail.com>
	<1d36917a0807202009m57483527uae976aede10f2d36@mail.gmail.com>
	<20080721031752.GM31836@phare.normalesup.org>
Message-ID: <1d36917a0807202019x6372470dt4093dab0a5681c8f@mail.gmail.com>

On Sun, Jul 20, 2008 at 11:17 PM, Gael Varoquaux
wrote:
> There might be a case to move to 10.3, considering the large amount of
> bug fixes, but in general I think it is a bad idea to require leading
> edge packages. The reason being that you would like people to be able to
> rely on packaged versions of the different tools to build and test a
> package. By packaged versions, I mean versions in the repositories of the
> main linux distributions, and macport and fink. Each time we require
> something outside a repository, we lose testers.

Fair enough; does anybody have any idea which version of nose is
generally available from distributions like the ones you mentioned?

From gael.varoquaux at normalesup.org  Sun Jul 20 23:34:01 2008
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Mon, 21 Jul 2008 05:34:01 +0200
Subject: [Numpy-discussion] Testing: Failed examples don't raise errors
	on buildbot.
In-Reply-To: <1d36917a0807202019x6372470dt4093dab0a5681c8f@mail.gmail.com>
References: <1d36917a0807201817o4c8933g137cff2212226471@mail.gmail.com>
	<1d36917a0807201947g12485ca3vbe6dd01e50a0575a@mail.gmail.com>
	<3d375d730807201956q6db390eet1416af550c579f7b@mail.gmail.com>
	<1d36917a0807202009m57483527uae976aede10f2d36@mail.gmail.com>
	<20080721031752.GM31836@phare.normalesup.org>
	<1d36917a0807202019x6372470dt4093dab0a5681c8f@mail.gmail.com>
Message-ID: <20080721033401.GN31836@phare.normalesup.org>

On Sun, Jul 20, 2008 at 11:19:57PM -0400, Alan McIntyre wrote:
> On Sun, Jul 20, 2008 at 11:17 PM, Gael Varoquaux
> wrote:
> > There might be a case to move to 0.10.3, considering the large number of
> > bug fixes, but in general I think it is a bad idea to require leading
> > edge packages. The reason being that you would like people to be able to
> > rely on packaged versions of the different tools to build and test a
> > package. By packaged versions, I mean versions in the repositories of the
> > main linux distributions, and MacPorts and Fink. Each time we require
> > something outside a repository, we lose testers.

> Fair enough; does anybody have any idea which version of nose is
> generally available from distributions like the ones you mentioned?

Ubuntu hardy (current): 0.10.0 (http://packages.ubuntu.com)
Ubuntu intrepid (next): 0.10.3 (http://packages.ubuntu.com)
Debian unstable: 0.10.3 (http://packages.debian.org)
Fedora 8: 0.10.0 (https://admin.fedoraproject.org/pkgdb/)

For the rest I can't figure out how to get the information. I suspect we
can standardise on things around six months old. Debian unstable tracks
upstream closely, Ubuntu and Fedora have a release cycle of 6 months, I
don't know about SUSE, but I think it is similar, and MacPorts, Fink, and
Gentoo track upstream closely.

Gaël

From alan.mcintyre at gmail.com Sun Jul 20 23:38:45 2008
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Sun, 20 Jul 2008 23:38:45 -0400
Subject: [Numpy-discussion] Testing: Failed examples don't raise errors on buildbot.
In-Reply-To: <20080721033401.GN31836@phare.normalesup.org>
References: <1d36917a0807201817o4c8933g137cff2212226471@mail.gmail.com>
	<1d36917a0807201947g12485ca3vbe6dd01e50a0575a@mail.gmail.com>
	<3d375d730807201956q6db390eet1416af550c579f7b@mail.gmail.com>
	<1d36917a0807202009m57483527uae976aede10f2d36@mail.gmail.com>
	<20080721031752.GM31836@phare.normalesup.org>
	<1d36917a0807202019x6372470dt4093dab0a5681c8f@mail.gmail.com>
	<20080721033401.GN31836@phare.normalesup.org>
Message-ID: <1d36917a0807202038td76b537sca29ce824e753c9e@mail.gmail.com>

On Sun, Jul 20, 2008 at 11:34 PM, Gael Varoquaux
wrote:
> For the rest I can't figure out how to get the information. I suspect we
> can standardise on things around six months old. Debian unstable tracks
> upstream closely, Ubuntu and Fedora have a release cycle of 6 months, I
> don't know about SUSE, but I think it is similar, and MacPorts, Fink, and
> Gentoo track upstream closely.

It looks like MacPorts is at 0.10.1:

http://py-nose.darwinports.com/

So it looks like 0.10.0 should still be a safe bet for being generally
available.

From charlesr.harris at gmail.com Mon Jul 21 00:41:09 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 20 Jul 2008 22:41:09 -0600
Subject: [Numpy-discussion] Ticket #794 and can o' worms.
In-Reply-To: References: <3d375d730807201547v46d709bflc3e6dac9b7d7a0a6@mail.gmail.com> Message-ID: On Sun, Jul 20, 2008 at 8:32 PM, Timothy Hochberg wrote: > > > On Sun, Jul 20, 2008 at 3:47 PM, Robert Kern > wrote: > >> On Sun, Jul 20, 2008 at 17:42, Charles R Harris >> wrote: >> > Hi All, >> > >> > I "fixed" ticket #754, but it leads to a ton of problems. The original >> > discussion is here. The problems that arise come from conversion to >> > different types. >> > >> > In [26]: a >> > Out[26]: array([ Inf, -Inf, NaN, 0., 3., -3.]) >> > >> > In [27]: sign(a).astype(int) >> > Out[27]: >> > array([ 1, -1, -2147483648, 0, 1, >> > -1]) >> > >> > In [28]: sign(a).astype(bool) >> > Out[28]: array([ True, True, True, False, True, True], dtype=bool) >> > >> > In [29]: sign(a) >> > Out[29]: array([ 1., -1., NaN, 0., 1., -1.]) >> > >> > In [30]: bool(NaN) >> > Out[30]: True >> > >> > So there are problems with at minimum the following. >> > >> > 1) The way NaN is converted to bool. I think it should be False. >> >> It's not really our choice. That's Python's bool(). For the things >> that are our choice (e.g. array([nan]).astype(bool)) I think we should >> stay consistent with Python. >> > > > > I agree that this is a good goal. However, in the past, Python's treatment > of NaNs has been rather platform dependent and add hock. In this case, I > suspect that you are OK since the section "Truth Value Testing" in the > Python docs is pretty clear that any non-zero value of a numerical type is > True. > > However... > > >> >> > 2) The way NaN is converted to int types. I think it should be 0. >> >> I agree. That's what int(nan) gives: >> >> >>> int(nan) >> 0L > > > > This is GvR in > http://mail.python.org/pipermail/python-dev/2008-January/075865.html: > Well, now, that opens a whole other bag of toasted scorpions. It looks like long(inf) and int(inf) already raise OverflowError and > that should stay. > In [3]: (ones(2)*float(inf)).astype(int8) Out[3]: array([0, 0], dtype=int8) In [4]: (ones(2)*float(inf)).astype(int32) Out[4]: array([-2147483648, -2147483648]) In [5]: (ones(2)*float(inf)).astype(int64) Out[5]: array([-9223372036854775808, -9223372036854775808], dtype=int64) Hmmm, Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Jul 21 00:45:14 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 20 Jul 2008 22:45:14 -0600 Subject: [Numpy-discussion] Ticket #794 and can o' worms. In-Reply-To: References: <3d375d730807201547v46d709bflc3e6dac9b7d7a0a6@mail.gmail.com> Message-ID: On Sun, Jul 20, 2008 at 10:41 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Sun, Jul 20, 2008 at 8:32 PM, Timothy Hochberg > wrote: > >> >> >> On Sun, Jul 20, 2008 at 3:47 PM, Robert Kern >> wrote: >> >>> On Sun, Jul 20, 2008 at 17:42, Charles R Harris >>> wrote: >>> > Hi All, >>> > >>> > I "fixed" ticket #754, but it leads to a ton of problems. The original >>> > discussion is here. The problems that arise come from conversion to >>> > different types. 
>>> >
>>> > In [26]: a
>>> > Out[26]: array([ Inf, -Inf, NaN, 0., 3., -3.])
>>> >
>>> > In [27]: sign(a).astype(int)
>>> > Out[27]:
>>> > array([ 1, -1, -2147483648, 0, 1,
>>> > -1])
>>> >
>>> > In [28]: sign(a).astype(bool)
>>> > Out[28]: array([ True, True, True, False, True, True], dtype=bool)
>>> >
>>> > In [29]: sign(a)
>>> > Out[29]: array([ 1., -1., NaN, 0., 1., -1.])
>>> >
>>> > In [30]: bool(NaN)
>>> > Out[30]: True
>>> >
>>> > So there are problems with at minimum the following.
>>> >
>>> > 1) The way NaN is converted to bool. I think it should be False.
>>>
>>> It's not really our choice. That's Python's bool(). For the things
>>> that are our choice (e.g. array([nan]).astype(bool)) I think we should
>>> stay consistent with Python.
>>>
>>
>> I agree that this is a good goal. However, in the past, Python's treatment
>> of NaNs has been rather platform dependent and ad hoc. In this case, I
>> suspect that you are OK since the section "Truth Value Testing" in the
>> Python docs is pretty clear that any non-zero value of a numerical type is
>> True.
>>
>> However...
>>
>>> > 2) The way NaN is converted to int types. I think it should be 0.
>>>
>>> I agree. That's what int(nan) gives:
>>>
>>> >>> int(nan)
>>> 0L
>>
>> This is GvR in
>> http://mail.python.org/pipermail/python-dev/2008-January/075865.html:
>>
>
> Well, now, that opens a whole other bag of toasted scorpions.
>
> It looks like long(inf) and int(inf) already raise OverflowError and
>> that should stay.
>>
>
> In [3]: (ones(2)*float(inf)).astype(int8)
> Out[3]: array([0, 0], dtype=int8)
>
> In [4]: (ones(2)*float(inf)).astype(int32)
> Out[4]: array([-2147483648, -2147483648])
>
> In [5]: (ones(2)*float(inf)).astype(int64)
> Out[5]: array([-9223372036854775808, -9223372036854775808], dtype=int64)
>

Also,

In [8]: (ones(2)*float(inf)).astype(uint8)
Out[8]: array([0, 0], dtype=uint8)

In [9]: (ones(2)*float(inf)).astype(uint16)
Out[9]: array([0, 0], dtype=uint16)

In [10]: (ones(2)*float(inf)).astype(uint32)
Out[10]: array([0, 0], dtype=uint32)

In [11]: (ones(2)*float(inf)).astype(uint64)
Out[11]: array([0, 0], dtype=uint64)

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From suchindra at gmail.com Mon Jul 21 08:30:24 2008
From: suchindra at gmail.com (Suchindra Sandhu)
Date: Mon, 21 Jul 2008 08:30:24 -0400
Subject: [Numpy-discussion] integer array creation oddity
In-Reply-To: <9457e7c80807181610j64c0e2a5s78f2b7a71996148e@mail.gmail.com>
References: <9457e7c80807181610j64c0e2a5s78f2b7a71996148e@mail.gmail.com>
Message-ID: 

Hi Stéfan,

Is that the recommended way of checking the type of the array? Usually,
for type checking, I use the isinstance built-in in python, but I see
that will not work in this case. I must admit that I am a little
confused by this. Why is type different from dtype?

Thanks,
Suchindra

On Fri, Jul 18, 2008 at 7:10 PM, Stéfan van der Walt wrote:

> 2008/7/18 Suchindra Sandhu :
> > Can someone please explain to me this oddity?
> >
> > In [1]: import numpy as n
> >
> > In [8]: a = n.array((1,2,3), 'i')
> >
> > In [9]: type(a[0])
> > Out[9]: <type 'numpy.int32'>
>
> There's more than one int32 type lying around. Rather, compare *dtypes*:
>
> In [19]: a.dtype == np.int32
> Out[19]: True
>
> Regards
> Stéfan
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
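Since C leaves float-to-integer casts undefined for non-finite values, the mixed results above are expected, and portable code has to settle the convention itself before casting. A minimal sketch of that defensive pattern (one possible convention, not what the ticket settled on):

    import numpy as np

    a = np.array([np.inf, -np.inf, np.nan, 0., 3., -3.])

    s = np.sign(a)                      # as in In [29]: nan stays nan
    s = np.where(np.isnan(s), 0.0, s)   # choose a convention: nan -> 0
    out = s.astype(int)                 # only finite values reach the cast
    # out: array([ 1, -1,  0,  0,  1, -1])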
URL: From stefan at sun.ac.za Mon Jul 21 08:50:28 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 21 Jul 2008 14:50:28 +0200 Subject: [Numpy-discussion] Reference guide updated Message-ID: <9457e7c80807210550h1b8966f0k67deefe3ecbbe38c@mail.gmail.com> Hi all, A new copy of the reference guide is now available at http://mentat.za.net/numpy/refguide/ and http://mentat.za.net/numpy/refguide/NumPy.pdf I'd like to thank Pauli Virtanen, who put in a lot of effort to improve Sphinx interaction and document layout. The guide is not yet complete, but we are working on ironing out remaining problems. As always, we could benefit from more content, so please continue writing and reviewing! Regards St?fan From millman at berkeley.edu Mon Jul 21 12:33:27 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Mon, 21 Jul 2008 09:33:27 -0700 Subject: [Numpy-discussion] Reference guide updated In-Reply-To: <9457e7c80807210550h1b8966f0k67deefe3ecbbe38c@mail.gmail.com> References: <9457e7c80807210550h1b8966f0k67deefe3ecbbe38c@mail.gmail.com> Message-ID: On Mon, Jul 21, 2008 at 5:50 AM, St?fan van der Walt wrote: > A new copy of the reference guide is now available at > http://mentat.za.net/numpy/refguide/ That looks great. A big thanks to everyone who is contributing to the documentation. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From doutriaux1 at llnl.gov Mon Jul 21 14:07:12 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Mon, 21 Jul 2008 11:07:12 -0700 Subject: [Numpy-discussion] bug in ma ? Message-ID: <4884D050.5050201@llnl.gov> Hello, I think i found a bug in numpy.ma I tried it both with the trunk and the 1.1 version import numpy a= numpy.ma.arange(256) a.shape=(128,2) b=numpy.reshape(a,(64,2,2)) Traceback (most recent call last): File "quick_test_reshape.py", line 7, in b=numpy.reshape(a,(64,2,2)) File "/lgm/cdat/latest/lib/python2.5/site-packages/numpy/core/fromnumeric.py", line 116, in reshape return reshape(newshape, order=order) TypeError: reshape() got an unexpected keyword argument 'order' From pgmdevlist at gmail.com Mon Jul 21 15:12:37 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 21 Jul 2008 15:12:37 -0400 Subject: [Numpy-discussion] bug in ma ? In-Reply-To: <4884D050.5050201@llnl.gov> References: <4884D050.5050201@llnl.gov> Message-ID: <200807211512.39713.pgmdevlist@gmail.com> Charles, Thx for the report, should be fixed in r5492/5493 (I've been overoptimistic with r5490/5491)... From Chris.Barker at noaa.gov Mon Jul 21 15:23:55 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 21 Jul 2008 12:23:55 -0700 Subject: [Numpy-discussion] building a better OSX install for 1.1.1 In-Reply-To: <764e38540807181417y1f3dcfd1g16343e018db09eac@mail.gmail.com> References: <764e38540807181417y1f3dcfd1g16343e018db09eac@mail.gmail.com> Message-ID: <4884E24B.1020402@noaa.gov> Christopher Burns wrote: > install numpy and they _do not_ have this version of python installed, > Installer.app issues a warning: > "numpy requires System Python 2.5 to install." > > The phrase "System Python" is misleading, it's reasonable to assume that > refers to the system version of python. So I'd like to change it. > > This string is stored in an Info.plist buried in the .mpkg that > bdist_mpkg builds. 
I'd like to be able to override that string from the > command line, but there does not seem to be any options for changing the > requirements from the command line. I've poked into the bdist_mpkg code a bit, and I think this is where that message is generated: # plists.py ... name = u'%s Python %s' % (FRIENDLY_PREFIX.get(dprefix, dprefix), version) kw.setdefault('LabelKey', name) title = u'%s requires %s to install.' % (pkgname, name,) ... and here: FRIENDLY_PREFIX = { os.path.expanduser(u'~/Library/Frameworks') : u'User', u'/System/Library/Frameworks' : u'Apple', u'/Library/Frameworks' : u'System', u'/opt/local' : u'DarwinPorts', u'/usr/local' : u'Unix', u'/sw' : u'Fink', } So, it looks like they are calling "/System/Library/Frameworks" "Apple", and "/Library/Frameworks" "system", which we all seem to agree is misleading. So I'd say change that in the bdist_mpkg source, maybe to: u'/Library/Frameworks' : u'python.org', or even: u'/Library/Frameworks' : u'python.org Framework Build', This calls for a note to the pythonmac list -- someone there will hopefully have access to that source repository. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From stefan at sun.ac.za Mon Jul 21 17:37:42 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 21 Jul 2008 23:37:42 +0200 Subject: [Numpy-discussion] integer array creation oddity In-Reply-To: References: <9457e7c80807181610j64c0e2a5s78f2b7a71996148e@mail.gmail.com> Message-ID: <9457e7c80807211437j72102a33n58d7e7deefbc54c1@mail.gmail.com> 2008/7/21 Suchindra Sandhu : > Is that the recommended way of checking the type of the array? Ususally for > type checkin, I use the isinstance built-in in python, but I see that will > not work in this case. I must admit that I am a little confused by this. Why > is type different from dtype? Data-types contain additional information needed to lay out numerical types in memory, such as byte-order and bit-width. Each data-type has an associated Python type, which tells you the type of scalars in an array of that dtype. For example, here are two NumPy data-types that are not equal: In [6]: d1 = np.dtype(int).newbyteorder('>') In [7]: d2 = np.dtype(int).newbyteorder('<') In [8]: d1.type Out[8]: In [9]: d2.type Out[9]: In [10]: d1 == d2 Out[10]: False I don't know why there is more than one int32 type (I would guess it has something to do with the way types are detected upon build; maybe Robert or Travis could tell you more). Regards St?fan From charlesr.harris at gmail.com Mon Jul 21 18:25:56 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 21 Jul 2008 16:25:56 -0600 Subject: [Numpy-discussion] integer array creation oddity In-Reply-To: <9457e7c80807211437j72102a33n58d7e7deefbc54c1@mail.gmail.com> References: <9457e7c80807181610j64c0e2a5s78f2b7a71996148e@mail.gmail.com> <9457e7c80807211437j72102a33n58d7e7deefbc54c1@mail.gmail.com> Message-ID: On Mon, Jul 21, 2008 at 3:37 PM, St?fan van der Walt wrote: > 2008/7/21 Suchindra Sandhu : > > Is that the recommended way of checking the type of the array? Ususally > for > > type checkin, I use the isinstance built-in in python, but I see that > will > > not work in this case. I must admit that I am a little confused by this. > Why > > is type different from dtype? 
> > Data-types contain additional information needed to lay out numerical > types in memory, such as byte-order and bit-width. Each data-type has > an associated Python type, which tells you the type of scalars in an > array of that dtype. For example, here are two NumPy data-types that > are not equal: > > In [6]: d1 = np.dtype(int).newbyteorder('>') > In [7]: d2 = np.dtype(int).newbyteorder('<') > > In [8]: d1.type > Out[8]: > > In [9]: d2.type > Out[9]: > > In [10]: d1 == d2 > Out[10]: False > > I don't know why there is more than one int32 type (I would guess it > has something to do with the way types are detected upon build; maybe > Robert or Travis could tell you more). > They correspond to two C types of the same size, int and long. On 64 bit systems you should have two int64 types, long and longlong. In [1]: dtype('i').name Out[1]: 'int32' In [2]: dtype('l').name Out[2]: 'int32' Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.huard at gmail.com Mon Jul 21 20:53:24 2008 From: david.huard at gmail.com (David Huard) Date: Mon, 21 Jul 2008 20:53:24 -0400 Subject: [Numpy-discussion] numpy.loadtext() fails with dtype + usecols In-Reply-To: References: <48811659.9020004@gmail.com> Message-ID: <91cf711d0807211753k79a57818x1164b357a101aacf@mail.gmail.com> Looks good to me. I committed the patch to the trunk and added a regression test (r5495). David 2008/7/18 Charles R Harris : > > > On Fri, Jul 18, 2008 at 4:16 PM, Ryan May wrote: > >> Hi, >> >> I was trying to use loadtxt() today to read in some text data, and I had a >> problem when I specified a dtype that only contained as many elements as in >> columns in usecols. The example below shows the problem: >> >> import numpy as np >> import StringIO >> data = '''STID RELH TAIR >> JOE 70.1 25.3 >> BOB 60.5 27.9 >> ''' >> f = StringIO.StringIO(data) >> names = ['stid', 'temp'] >> dtypes = ['S4', 'f8'] >> arr = np.loadtxt(f, usecols=(0,2),dtype=zip(names,dtypes), skiprows=1) >> >> With current 1.1 (and SVN head), this yields: >> >> IndexError Traceback (most recent call >> last) >> >> /home/rmay/ in () >> >> /usr/lib64/python2.5/site-packages/numpy/lib/io.pyc in loadtxt(fname, >> dtype, comments, delimiter, converters, skiprows, usecols, unpack) >> 309 for j in xrange(len(vals))] >> 310 if usecols is not None: >> --> 311 row = [converterseq[j](vals[j]) for j in usecols] >> 312 else: >> 313 row = [converterseq[j](val) for j,val in >> enumerate(vals)] >> >> IndexError: list index out of range >> ------------------------------------------ >> >> I've added a patch that checks for usecols, and if present, correctly >> creates the converters dictionary to map each specified column with >> converter for the corresponding field in the dtype. With the attached patch, >> this works fine: >> >> >arr >> array([('JOE', 25.300000000000001), ('BOB', 27.899999999999999)], >> dtype=[('stid', '|S4'), ('temp', '> >> Comments? Can I get this in for 1.1.1? >> > > Can someone familiar with loadtxt comment on this patch? > > Chuck > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Mon Jul 21 20:59:10 2008 From: cournape at gmail.com (David Cournapeau) Date: Tue, 22 Jul 2008 02:59:10 +0200 Subject: [Numpy-discussion] Windows_XP buildbot error. 
In-Reply-To: References: Message-ID: <5b8d13220807211759v7eb68a44jd804ae578a95525f@mail.gmail.com> On Sun, Jul 20, 2008 at 11:34 PM, Charles R Harris wrote: > The log file shows: > > File > "c:\numpy-buildbot\numpy\b11\install\Lib\site-packages\numpy\lib\tests\test_format.py", > line 429, in test_memmap_roundtrip > fp = open(nfn, 'wb') > > IOError: [Errno 2] No such file or directory: > 'c:\\docume~1\\thomas\\locals~1\\temp\\tmp_yrykj\\normal.npy' > > Is this some sort of permissions error or something specific to Thomas' > machine? I don't want this to show up in the 1.1.1 release and I'm wondering > if there is an easy fix besides disabling the test. It is not specific to Thomas' machine, I had the same problems on all my machines running windows xp. I believe it is a problem related to path idiosyncraties/encoding, but I could not find a way to solve it (I did not try hard). cheers, David From david.huard at gmail.com Mon Jul 21 21:02:40 2008 From: david.huard at gmail.com (David Huard) Date: Mon, 21 Jul 2008 21:02:40 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <48839CBD.30304@gmail.com> References: <48839CBD.30304@gmail.com> Message-ID: <91cf711d0807211802x1afce69bo84554a909770ddb5@mail.gmail.com> Ryan, I committed your patch to the trunk and added a test for it from your failing example. Jarrod, though I'm also wary to touch the branch so late, the patch is minor and I don't see how it could break something that was not already broken. David 2008/7/20 Ryan May : > Jarrod Millman wrote: > >> Hello, >> >> This is a reminder that 1.1.1rc1 will be tagged tonight. Chuck is >> planning to spend some time today fixing a few final bugs on the 1.1.x >> branch. If anyone else is planning to commit anything to the 1.1.x >> branch today, please let me know immediately. Obviously now is not >> the time to commit anything to the branch that could break anything, >> so please be extremely careful if you have to touch the branch. >> >> Once the release is tagged, Chris and David will create binary >> installers for both Windows and Mac. Hopefully, this will give us an >> opportunity to have much more widespread testing before releasing >> 1.1.1 final at the end of the month. >> >> Can I get anyone to look at this patch for loadtext()? > > I was trying to use loadtxt() today to read in some text data, and I had > a problem when I specified a dtype that only contained as many elements > as in columns in usecols. The example below shows the problem: > > import numpy as np > import StringIO > data = '''STID RELH TAIR > JOE 70.1 25.3 > BOB 60.5 27.9 > ''' > f = StringIO.StringIO(data) > names = ['stid', 'temp'] > dtypes = ['S4', 'f8'] > arr = np.loadtxt(f, usecols=(0,2),dtype=zip(names,dtypes), skiprows=1) > > With current 1.1 (and SVN head), this yields: > > IndexError Traceback (most recent call last) > > /home/rmay/ in () > > /usr/lib64/python2.5/site-packages/numpy/lib/io.pyc in loadtxt(fname, > dtype, comments, delimiter, converters, skiprows, usecols, unpack) > 309 for j in xrange(len(vals))] > 310 if usecols is not None: > --> 311 row = [converterseq[j](vals[j]) for j in usecols] > 312 else: > 313 row = [converterseq[j](val) for j,val in > enumerate(vals)] > > IndexError: list index out of range > ----------------------------------------- > > I've added a patch that checks for usecols, and if present, correctly > creates the converters dictionary to map each specified column with > converter for the corresponding field in the dtype. 
With the attached > patch, this works fine: > > >arr > array([('JOE', 25.300000000000001), ('BOB', 27.899999999999999)], > dtype=[('stid', '|S4'), ('temp', ' > > Thanks, > Ryan > > -- > Ryan May > Graduate Research Assistant > School of Meteorology > University of Oklahoma > > -- > Ryan May > Graduate Research Assistant > School of Meteorology > University of Oklahoma > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rmay31 at gmail.com Mon Jul 21 21:12:00 2008 From: rmay31 at gmail.com (Ryan May) Date: Mon, 21 Jul 2008 21:12:00 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <91cf711d0807211802x1afce69bo84554a909770ddb5@mail.gmail.com> References: <48839CBD.30304@gmail.com> <91cf711d0807211802x1afce69bo84554a909770ddb5@mail.gmail.com> Message-ID: <488533E0.5030403@gmail.com> Thanks. I wouldn't have ordinarily pushed so much, but I wanted to hit the bugfix release. Ryan David Huard wrote: > Ryan, I committed your patch to the trunk and added a test for it from > your failing example. > > Jarrod, though I'm also wary to touch the branch so late, the patch is > minor and I don't see how it could break something that was not already > broken. > > David > > 2008/7/20 Ryan May >: > > Jarrod Millman wrote: > > Hello, > > This is a reminder that 1.1.1rc1 will be tagged tonight. Chuck is > planning to spend some time today fixing a few final bugs on the > 1.1.x > branch. If anyone else is planning to commit anything to the 1.1.x > branch today, please let me know immediately. Obviously now is not > the time to commit anything to the branch that could break anything, > so please be extremely careful if you have to touch the branch. > > Once the release is tagged, Chris and David will create binary > installers for both Windows and Mac. Hopefully, this will give > us an > opportunity to have much more widespread testing before releasing > 1.1.1 final at the end of the month. > > Can I get anyone to look at this patch for loadtext()? > > I was trying to use loadtxt() today to read in some text data, and I had > a problem when I specified a dtype that only contained as many elements > as in columns in usecols. The example below shows the problem: > > import numpy as np > import StringIO > data = '''STID RELH TAIR > JOE 70.1 25.3 > BOB 60.5 27.9 > ''' > f = StringIO.StringIO(data) > names = ['stid', 'temp'] > dtypes = ['S4', 'f8'] > arr = np.loadtxt(f, usecols=(0,2),dtype=zip(names,dtypes), skiprows=1) > > With current 1.1 (and SVN head), this yields: > > IndexError Traceback (most recent > call last) > > /home/rmay/ in () > > /usr/lib64/python2.5/site-packages/numpy/lib/io.pyc in loadtxt(fname, > dtype, comments, delimiter, converters, skiprows, usecols, unpack) > 309 for j in xrange(len(vals))] > 310 if usecols is not None: > --> 311 row = [converterseq[j](vals[j]) for j in usecols] > 312 else: > 313 row = [converterseq[j](val) for j,val in > enumerate(vals)] > > IndexError: list index out of range > ----------------------------------------- > > I've added a patch that checks for usecols, and if present, correctly > creates the converters dictionary to map each specified column with > converter for the corresponding field in the dtype. 
With the attached > patch, this works fine: > > >arr > array([('JOE', 25.300000000000001), ('BOB', 27.899999999999999)], > dtype=[('stid', '|S4'), ('temp', ' > > Thanks, > Ryan > > -- > Ryan May > Graduate Research Assistant > School of Meteorology > University of Oklahoma > > -- > Ryan May > Graduate Research Assistant > School of Meteorology > University of Oklahoma > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From charlesr.harris at gmail.com Mon Jul 21 21:59:46 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 21 Jul 2008 19:59:46 -0600 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <91cf711d0807211802x1afce69bo84554a909770ddb5@mail.gmail.com> References: <48839CBD.30304@gmail.com> <91cf711d0807211802x1afce69bo84554a909770ddb5@mail.gmail.com> Message-ID: On Mon, Jul 21, 2008 at 7:02 PM, David Huard wrote: > Ryan, I committed your patch to the trunk and added a test for it from your > failing example. > > Jarrod, though I'm also wary to touch the branch so late, the patch is > minor and I don't see how it could break something that was not already > broken. > I'm in favor of putting it in. Pierre has also made some fixes to masked arrays. I think that is about the end of it for 1.1.1. However, if anyone is running Python 2.3 it would be helpful if you could test the release candidate as the buildbots are all 2.4 or 2.5. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Jul 21 22:00:57 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 21 Jul 2008 20:00:57 -0600 Subject: [Numpy-discussion] Windows_XP buildbot error. In-Reply-To: <5b8d13220807211759v7eb68a44jd804ae578a95525f@mail.gmail.com> References: <5b8d13220807211759v7eb68a44jd804ae578a95525f@mail.gmail.com> Message-ID: On Mon, Jul 21, 2008 at 6:59 PM, David Cournapeau wrote: > On Sun, Jul 20, 2008 at 11:34 PM, Charles R Harris > wrote: > > The log file shows: > > > > File > > > "c:\numpy-buildbot\numpy\b11\install\Lib\site-packages\numpy\lib\tests\test_format.py", > > line 429, in test_memmap_roundtrip > > fp = open(nfn, 'wb') > > > > IOError: [Errno 2] No such file or directory: > > 'c:\\docume~1\\thomas\\locals~1\\temp\\tmp_yrykj\\normal.npy' > > > > Is this some sort of permissions error or something specific to Thomas' > > machine? I don't want this to show up in the 1.1.1 release and I'm > wondering > > if there is an easy fix besides disabling the test. > > It is not specific to Thomas' machine, I had the same problems on all > my machines running windows xp. I believe it is a problem related to > path idiosyncraties/encoding, but I could not find a way to solve it > (I did not try hard). > I think this is an ongoing problem with the temp files on windows. ISTR a thread about that some time back. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
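For anyone stuck on an unpatched release, the structured-dtype-plus-usecols combination in loadtxt can be sidestepped entirely by splitting the lines by hand. A minimal sketch using the same sample data as above (plain Python 2; nothing here depends on the patched loadtxt):

    import numpy as np
    import StringIO

    f = StringIO.StringIO('''STID RELH TAIR
    JOE 70.1 25.3
    BOB 60.5 27.9
    ''')

    f.readline()                          # skip the header row
    rows = [line.split() for line in f if line.strip()]
    arr = np.array([(r[0], float(r[2])) for r in rows],
                   dtype=[('stid', 'S4'), ('temp', 'f8')])
    # arr['temp'] -> array([ 25.3,  27.9])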
URL: 

From pgmdevlist at gmail.com Mon Jul 21 22:02:29 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Mon, 21 Jul 2008 22:02:29 -0400
Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight
In-Reply-To: References: <91cf711d0807211802x1afce69bo84554a909770ddb5@mail.gmail.com>
Message-ID: <200807212202.30977.pgmdevlist@gmail.com>

> I'm in favor of putting it in. Pierre has also made some fixes to masked
> arrays. I think that is about the end of it for 1.1.1. However, if anyone
> is running Python 2.3 it would be helpful if you could test the release
> candidate as the buildbots are all 2.4 or 2.5.

Oh yes. I don't have access to Python 2.3, so I haven't been able to check
whether one dictionary update in MaskedArray.reshape would work properly (on
1.1.x, it does on 1.2).

From stefan at sun.ac.za Mon Jul 21 22:40:06 2008
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Tue, 22 Jul 2008 04:40:06 +0200
Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight
In-Reply-To: <48839CBD.30304@gmail.com>
References: <48839CBD.30304@gmail.com>
Message-ID: <9457e7c80807211940v7c9561dci974caa958b4fed98@mail.gmail.com>

2008/7/20 Ryan May :
>>arr
> array([('JOE', 25.300000000000001), ('BOB', 27.899999999999999)],
> dtype=[('stid', '|S4'), ('temp', '<f8')])

The code in SVN still breaks for more complicated dtypes, such as:

np.dtype([('x', int), ('y', [('t', int), ('s', float)])])

Please find attached a patch which addresses the issue (it passes all
unit tests).

Cheers
Stéfan

From millman at berkeley.edu Tue Jul 22 00:41:42 2008
From: millman at berkeley.edu (Jarrod Millman)
Date: Mon, 21 Jul 2008 21:41:42 -0700
Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight
In-Reply-To: References: <48839CBD.30304@gmail.com> <91cf711d0807211802x1afce69bo84554a909770ddb5@mail.gmail.com>
Message-ID: 

On Mon, Jul 21, 2008 at 6:59 PM, Charles R Harris wrote:
> I'm in favor of putting it in. Pierre has also made some fixes to masked
> arrays. I think that is about the end of it for 1.1.1. However, if anyone is
> running Python 2.3 it would be helpful if you could test the release
> candidate as the buildbots are all 2.4 or 2.5.

Hey Chuck,

Let's commit the changes to the 1.1.x branch and I can tag a 1.1.1rc2
tomorrow night (in about 24 hours). Is that enough time for everyone
to get their final fixes in? I will ask Chris and David to wait until
I tag 1.1.1rc2 to create new binaries (assuming they haven't done so
already).

Thanks,
-- 
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/

From charlesr.harris at gmail.com Tue Jul 22 01:10:30 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 21 Jul 2008 23:10:30 -0600
Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight
In-Reply-To: <200807212202.30977.pgmdevlist@gmail.com>
References: <91cf711d0807211802x1afce69bo84554a909770ddb5@mail.gmail.com> <200807212202.30977.pgmdevlist@gmail.com>
Message-ID: 

On Mon, Jul 21, 2008 at 8:02 PM, Pierre GM wrote:
>
> > I'm in favor of putting it in. Pierre has also made some fixes to masked
> > arrays. I think that is about the end of it for 1.1.1. However, if
> anyone
> > is running Python 2.3 it would be helpful if you could test the release
> > candidate as the buildbots are all 2.4 or 2.5.
>
> Oh yes. I don't have access to Python 2.3, so I haven't been able to check
> whether one dictionary update in MaskedArray.reshape would work properly
> (on
> 1.1.x, it does on 1.2).
>

Hmm... 70 errors.
Pretty much all of them of this sort: ====================================================================== ERROR: Tests unmasked_edges ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.3/site-packages/numpy/ma/tests/test_extras.py", line 139, in test_edges assert_equal(notmasked_edges(a, None), [0,23]) File "/usr/local/lib/python2.3/site-packages/numpy/ma/testutils.py", line 107, in assert_equal if actual_dtype.char == "S" and desired_dtype.char == "S": NameError: global name 'actual_dtype' is not defined Pierre, I suggest you go to python.org and install python-2.3.7 instead of shooting blind. It's pretty easy if you're running linux, just be sure to end with make altinstall in case your distro has python installed in /usr/local. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Tue Jul 22 01:21:49 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 22 Jul 2008 01:21:49 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: <200807212202.30977.pgmdevlist@gmail.com> Message-ID: <200807220121.50118.pgmdevlist@gmail.com> On Tuesday 22 July 2008 01:10:30 Charles R Harris wrote: > Hmm... 70 errors. Pretty much all of them of this sort: > NameError: global name 'actual_dtype' is not defined OK, that's a problem with numpy.ma.testutils. r5496 should fix that > > Pierre, I suggest you go to python.org and install python-2.3.7 instead of > shooting blind. It's pretty easy if you're running linux, just be sure to > end with make altinstall in case your distro has python installed in > /usr/local. If you don't mind, I'd do that as a last resort. The problem wasn't a Python version, just a mistake on the version of numpy being installed. I tested 1.2.0 when I was dealing with 1.1.x. So, I'm back on a 1.1.x, hopefully not for long. From pgmdevlist at gmail.com Tue Jul 22 01:33:11 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 22 Jul 2008 01:33:11 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: <200807212202.30977.pgmdevlist@gmail.com> Message-ID: <200807220133.12014.pgmdevlist@gmail.com> > Pierre, I suggest you go to python.org and install python-2.3.7 instead of > shooting blind. It's pretty easy if you're running linux, just be sure to > end with make altinstall in case your distro has python installed in > /usr/local. I forgot to mention that my reluctance to install another Python version comes from the distro I run on: Gentoo is unforgiving with the slightest issue with Python, and I can't really afford the time to install Kubuntu on my only machine... r5496 looks OK. From charlesr.harris at gmail.com Tue Jul 22 02:06:35 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Jul 2008 00:06:35 -0600 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <200807220133.12014.pgmdevlist@gmail.com> References: <200807212202.30977.pgmdevlist@gmail.com> <200807220133.12014.pgmdevlist@gmail.com> Message-ID: On Mon, Jul 21, 2008 at 11:33 PM, Pierre GM wrote: > > > Pierre, I suggest you go to python.org and install python-2.3.7 instead > of > > shooting blind. It's pretty easy if you're running linux, just be sure to > > end with make altinstall in case your distro has python installed in > > /usr/local. 
> > I forgot to mention that my reluctance to install another Python version > comes > from the distro I run on: Gentoo is unforgiving with the slightest issue > with > Python, and I can't really afford the time to install Kubuntu on my only > machine... But I thought Gentoo was for uber geeks? And testing the cpu cooler. Anyway, the tests pass now, thanks. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Tue Jul 22 02:10:01 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 22 Jul 2008 02:10:01 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: <200807220133.12014.pgmdevlist@gmail.com> Message-ID: <200807220210.02329.pgmdevlist@gmail.com> On Tuesday 22 July 2008 02:06:35 Charles R Harris wrote: > But I thought Gentoo was for uber geeks? And testing the cpu cooler. Turns out that I'm not one. It's educational, though, but the older I get, the less compiling kernels and tweaking OS corresponds to my idea of fun. > Anyway, the tests pass now, thanks. You're quite welcome, and many apologies for the noise and waste of time. From charlesr.harris at gmail.com Tue Jul 22 02:27:38 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Jul 2008 00:27:38 -0600 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <200807220210.02329.pgmdevlist@gmail.com> References: <200807220133.12014.pgmdevlist@gmail.com> <200807220210.02329.pgmdevlist@gmail.com> Message-ID: On Tue, Jul 22, 2008 at 12:10 AM, Pierre GM wrote: > On Tuesday 22 July 2008 02:06:35 Charles R Harris wrote: > > But I thought Gentoo was for uber geeks? And testing the cpu cooler. > > Turns out that I'm not one. It's educational, though, but the older I get, > the > less compiling kernels and tweaking OS corresponds to my idea of fun. > > > Anyway, the tests pass now, thanks. > > You're quite welcome, and many apologies for the noise and waste of time. > Looks like you shouldn't use NumpyTestCase for the 1.2 test, however. /usr/lib/python2.5/unittest.py:507: DeprecationWarning: NumpyTestCase will be removed in the next release; please update your code to use nose or unittest return self.suiteClass(map(testCaseClass, testCaseNames)) /usr/lib/python2.5/site-packages/numpy/ma/tests/test_extras.py:330: DeprecationWarning: NumpyTestCase will be removed in the next release; please update your code to use nose or unittest NumpyTestCase.__init__(self, *args, **kwds) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Jul 22 02:34:46 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Jul 2008 00:34:46 -0600 Subject: [Numpy-discussion] Corner case complex log error. Message-ID: FAIL: test_umath.TestC99.test_clog(, (-0.0, -0.0), (-inf, -0.0), 'divide') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python2.5/site-packages/nose/case.py", line 203, in runTest self.test(*self.arg) File "/usr/lib/python2.5/site-packages/numpy/core/tests/test_umath.py", line 393, in _check assert got == expected, (got, expected) AssertionError: ('(-inf, 3.1415926535897931)', '(-inf, 0.0)') -------------- next part -------------- An HTML attachment was scrubbed... 
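For reference, the migration the warning above asks for is mechanical. A minimal sketch of a test class written against plain unittest (the class and test names here are made up), which nose collects directly with no NumpyTestCase base class:

    import unittest
    import numpy as np
    from numpy.testing import assert_array_equal

    class TestSign(unittest.TestCase):
        # plain unittest.TestCase: no NumpyTestCase needed
        def test_basic(self):
            assert_array_equal(np.sign([0.0, 3.0, -3.0]), [0.0, 1.0, -1.0])

    if __name__ == '__main__':
        unittest.main()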
URL: 

From pgmdevlist at gmail.com Tue Jul 22 02:39:18 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Tue, 22 Jul 2008 02:39:18 -0400
Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight
In-Reply-To: References: <200807220210.02329.pgmdevlist@gmail.com>
Message-ID: <200807220239.18965.pgmdevlist@gmail.com>

On Tuesday 22 July 2008 02:27:38 Charles R Harris wrote:
> Looks like you shouldn't use NumpyTestCase for the 1.2 test, however.

Looks like Alan updated it for me, the tests look OK on 1.2.

Interestingly, I segfault when running
python -c "import numpy; numpy.test()"
(NumPy version 1.2.0.dev5496)
(I can still run these numpy.ma tests the old-fashioned way:
for t in numpy/ma/tests/*.py; do python $t; done)

From charlesr.harris at gmail.com Tue Jul 22 02:53:02 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 22 Jul 2008 00:53:02 -0600
Subject: [Numpy-discussion] Trac up to its old permissions game.
Message-ID: 

Can't browse source, etc.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From charlesr.harris at gmail.com Tue Jul 22 02:58:25 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 22 Jul 2008 00:58:25 -0600
Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight
In-Reply-To: <200807220239.18965.pgmdevlist@gmail.com>
References: <200807220210.02329.pgmdevlist@gmail.com> <200807220239.18965.pgmdevlist@gmail.com>
Message-ID: 

On Tue, Jul 22, 2008 at 12:39 AM, Pierre GM wrote:

> On Tuesday 22 July 2008 02:27:38 Charles R Harris wrote:
> > Looks like you shouldn't use NumpyTestCase for the 1.2 test, however.
>
> Looks like Alan updated it for me, the tests look OK on 1.2.
>
> Interestingly, I segfault when running
> python -c "import numpy; numpy.test()"
> (NumPy version 1.2.0.dev5496)
>

I'm testing 1.2.0.dev5497 and still see the deprecation warnings; they are
up near the beginning. I don't get a segfault, either. I wonder what that
is about?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From charlesr.harris at gmail.com Tue Jul 22 03:16:27 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 22 Jul 2008 01:16:27 -0600
Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight
In-Reply-To: <9457e7c80807211940v7c9561dci974caa958b4fed98@mail.gmail.com>
References: <48839CBD.30304@gmail.com> <9457e7c80807211940v7c9561dci974caa958b4fed98@mail.gmail.com>
Message-ID: 

On Mon, Jul 21, 2008 at 8:40 PM, Stéfan van der Walt wrote:

> 2008/7/20 Ryan May :
> >>arr
> > array([('JOE', 25.300000000000001), ('BOB', 27.899999999999999)],
> > dtype=[('stid', '|S4'), ('temp', '<f8')])
>
> The code in SVN still breaks for more complicated dtypes, such as:
>
> np.dtype([('x', int), ('y', [('t', int), ('s', float)])])
>
> Please find attached a patch which addresses the issue (it passes all
> unit tests).
>

This bit is illegal syntax in Python 2.3:

    X.append(tuple(conv(val) for (conv, val) in zip(converterseq,
vals)))

So this isn't going to work for 1.1.1

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
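The offending line parses on 2.4 only because generator expressions arrived in that release; wrapping a list comprehension in tuple() is the 2.3-compatible spelling of the same thing. A self-contained sketch with stand-in converters (converterseq and vals mirror the names in the quoted snippet):

    converterseq = [str, float]        # stand-ins for loadtxt's converters
    vals = ['JOE', '25.3']

    X = []
    # list comprehension inside tuple(): valid on Python 2.3 and later
    X.append(tuple([conv(val) for (conv, val) in zip(converterseq, vals)]))
    # X == [('JOE', 25.3)]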
URL: From pgmdevlist at gmail.com Tue Jul 22 03:06:15 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 22 Jul 2008 03:06:15 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: <200807220239.18965.pgmdevlist@gmail.com> Message-ID: <200807220306.15684.pgmdevlist@gmail.com> On Tuesday 22 July 2008 02:58:25 Charles R Harris wrote: > I'm testing 1.2.0.dev5497 and still see the deprecation warnings, they are > up near the beginning. I'm afraid you have some outdated files lingering somewhere, there's no NumpyTestCase in the sources of numpy.ma > I don't get a segfault, either. I wonder what that is about? Looks like it's coming from linalg. Now, which one ? From charlesr.harris at gmail.com Tue Jul 22 03:23:40 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Jul 2008 01:23:40 -0600 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <200807220306.15684.pgmdevlist@gmail.com> References: <200807220239.18965.pgmdevlist@gmail.com> <200807220306.15684.pgmdevlist@gmail.com> Message-ID: On Tue, Jul 22, 2008 at 1:06 AM, Pierre GM wrote: > On Tuesday 22 July 2008 02:58:25 Charles R Harris wrote: > > I'm testing 1.2.0.dev5497 and still see the deprecation warnings, they > are > > up near the beginning. > > I'm afraid you have some outdated files lingering somewhere, there's no > NumpyTestCase in the sources of numpy.ma > > > I don't get a segfault, either. I wonder what that is about? > > Looks like it's coming from linalg. Now, which one ? This is a new thing, I take it. Looks like a good time to wait for morning ;) The deprecation warning went away when I deleted the numpy site-package and reinstalled. Grrr, I should know better by now. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Tue Jul 22 03:22:15 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 22 Jul 2008 03:22:15 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: <200807220306.15684.pgmdevlist@gmail.com> Message-ID: <200807220322.15231.pgmdevlist@gmail.com> On Tuesday 22 July 2008 03:23:40 Charles R Harris wrote: > > Looks like it's coming from linalg. Now, which one ? > > This is a new thing, I take it. Looks like a good time to wait for morning Could be on my side only with a botched dependence, but finding which one... > The deprecation warning went away when I deleted the numpy site-package and > reinstalled. Grrr, I should know better by now. Welcome to the club. If I had $1 each time I thought the same thing... From stefan at sun.ac.za Tue Jul 22 04:45:46 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 22 Jul 2008 10:45:46 +0200 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: <48839CBD.30304@gmail.com> <9457e7c80807211940v7c9561dci974caa958b4fed98@mail.gmail.com> Message-ID: <9457e7c80807220145x7dacd26dt3c33c2f83150c3c9@mail.gmail.com> 2008/7/22 Charles R Harris : > This bit is illegal syntax in Python 2.3 > > X.append(tuple(conv(val) for (conv, val) in zip(converterseq, > vals))) > > So this isn't going to work for 1.1.1 That's easy to fix. New patch attached. Cheers St?fan -------------- next part -------------- A non-text attachment was scrubbed... 
Name: loadtxt.patch
Type: application/octet-stream
Size: 5393 bytes
Desc: not available
URL: 

From nadavh at visionsense.com Tue Jul 22 09:03:47 2008
From: nadavh at visionsense.com (Nadav Horesh)
Date: Tue, 22 Jul 2008 16:03:47 +0300
Subject: [Numpy-discussion] Getting ready for the upcoming python versions
Message-ID: <1216731827.16371.17.camel@nadav.envision.co.il>

I am trying to prepare for the trouble-making Python 3. I successfully
installed Python 2.6b2 and then numpy 1.1.0. scipy failed to compile with
the following error message:

compile options: '-DNO_ATLAS_INFO=1 -DUSE_VENDOR_BLAS=1 -I/usr/local/lib/python2.6/site-packages/numpy/core/include -I/usr/local/include/python2.6 -c'
gcc: scipy/sparse/linalg/dsolve/_zsuperlumodule.c
In file included from scipy/sparse/linalg/dsolve/_superluobject.h:8,
                 from scipy/sparse/linalg/dsolve/_zsuperlumodule.c:32:
scipy/sparse/linalg/dsolve/SuperLU/SRC/scomplex.h:60: error: conflicting types for '_Py_c_abs'
/usr/local/include/python2.6/complexobject.h:30: error: previous declaration of '_Py_c_abs' was here
In file included from scipy/sparse/linalg/dsolve/_superluobject.h:8,
                 from scipy/sparse/linalg/dsolve/_zsuperlumodule.c:32:
scipy/sparse/linalg/dsolve/SuperLU/SRC/scomplex.h:60: error: conflicting types for '_Py_c_abs'
/usr/local/include/python2.6/complexobject.h:30: error: previous declaration of '_Py_c_abs' was here
error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DNO_ATLAS_INFO=1 -DUSE_VENDOR_BLAS=1 -I/usr/local/lib/python2.6/site-packages/numpy/core/include -I/usr/local/include/python2.6 -c scipy/sparse/linalg/dsolve/_zsuperlumodule.c -o build/temp.linux-x86_64-2.6/scipy/sparse/linalg/dsolve/_zsuperlumodule.o" failed with exit status 1

(Both scipy 0.6.0 and the latest from svn fail the same way.)

My system: Gentoo linux 64 on core2 duo, gcc 4.3.1.

My questions: Is there a migration plan for Python 3? Can a layman (with
some C knowledge) like me help? Any remarks about the error message above
would help.

  Nadav.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From charlesr.harris at gmail.com Tue Jul 22 10:58:52 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 22 Jul 2008 08:58:52 -0600
Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight
In-Reply-To: <9457e7c80807220145x7dacd26dt3c33c2f83150c3c9@mail.gmail.com>
References: <48839CBD.30304@gmail.com> <9457e7c80807211940v7c9561dci974caa958b4fed98@mail.gmail.com> <9457e7c80807220145x7dacd26dt3c33c2f83150c3c9@mail.gmail.com>
Message-ID: 

On Tue, Jul 22, 2008 at 2:45 AM, Stéfan van der Walt wrote:

> 2008/7/22 Charles R Harris :
> > This bit is illegal syntax in Python 2.3
> >
> >     X.append(tuple(conv(val) for (conv, val) in zip(converterseq,
> > vals)))
> >
> > So this isn't going to work for 1.1.1
>
> That's easy to fix. New patch attached.
>

Yep, but I wanted you to look it over and find all the problems ;) Could
you post a patch against current mainline svn, which already has your
previous patch applied? I'm also curious why we need tuples, are we using
these values as hashes someplace.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
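On Chuck's tuples question: hashing is not involved; the constraint, which Stéfan demonstrates just below, is that a structured array only accepts each record spelled as a tuple. A self-contained sketch of the constraint and the usual row-to-tuple conversion:

    import numpy as np

    dt = [('x', int), ('y', int)]
    rows = [[1, 2], [3, 4]]

    # np.array(rows, dtype=dt) raises TypeError: records must be tuples,
    # so convert each row first
    a = np.array([tuple(r) for r in rows], dtype=dt)
    # a['x'] -> array([1, 3]); a['y'] -> array([2, 4])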
URL: From stefan at sun.ac.za Tue Jul 22 11:18:25 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 22 Jul 2008 17:18:25 +0200 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: <48839CBD.30304@gmail.com> <9457e7c80807211940v7c9561dci974caa958b4fed98@mail.gmail.com> <9457e7c80807220145x7dacd26dt3c33c2f83150c3c9@mail.gmail.com> Message-ID: <9457e7c80807220818k6be65896r5cf9df891dec516@mail.gmail.com> 2008/7/22 Charles R Harris : >> That's easy to fix. New patch attached. > > Yep, but I wanted you to look it over and find all the problems ;) Ah, you have your managerial hat on! > Could you post a patch against current mainline svn, which already has your > previous patch applied? I'm also curious why we need tuples, are we using > these values as hashes someplace. Applied. The reason we need to use tuples is because In [3]: np.array([[1, 2], [3, 4]], dtype=[('x', int), ('y', int)]) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) /Users/stefan/ in () TypeError: expected a readable buffer object but In [4]: np.array([(1, 2), (3, 4)], dtype=[('x', int), ('y', int)]) Out[4]: array([(1, 2), (3, 4)], dtype=[('x', ' References: <48839CBD.30304@gmail.com> <9457e7c80807211940v7c9561dci974caa958b4fed98@mail.gmail.com> <9457e7c80807220145x7dacd26dt3c33c2f83150c3c9@mail.gmail.com> <9457e7c80807220818k6be65896r5cf9df891dec516@mail.gmail.com> Message-ID: <9457e7c80807220821h243c4161p60c2633f9bafe3d9@mail.gmail.com> 2008/7/22 St?fan van der Walt : >> Could you post a patch against current mainline svn, which already has your >> previous patch applied? I'm also curious why we need tuples, are we using >> these values as hashes someplace. > > Applied. The reason we need to use tuples is because Should these changes be back-ported to the 1.1.1 branch? St?fan From charlesr.harris at gmail.com Tue Jul 22 11:44:43 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Jul 2008 09:44:43 -0600 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <9457e7c80807220821h243c4161p60c2633f9bafe3d9@mail.gmail.com> References: <48839CBD.30304@gmail.com> <9457e7c80807211940v7c9561dci974caa958b4fed98@mail.gmail.com> <9457e7c80807220145x7dacd26dt3c33c2f83150c3c9@mail.gmail.com> <9457e7c80807220818k6be65896r5cf9df891dec516@mail.gmail.com> <9457e7c80807220821h243c4161p60c2633f9bafe3d9@mail.gmail.com> Message-ID: On Tue, Jul 22, 2008 at 9:21 AM, St?fan van der Walt wrote: > 2008/7/22 St?fan van der Walt : > >> Could you post a patch against current mainline svn, which already has > your > >> previous patch applied? I'm also curious why we need tuples, are we > using > >> these values as hashes someplace. > > > > Applied. The reason we need to use tuples is because > > Should these changes be back-ported to the 1.1.1 branch? > That's what I'm trying out now that I've got Python2.3 installed for testing. What I'd like to do is just copy the whole io.py file over. Let's see if Trac is working... OK, the only other change since 1.1.0 is using np.x in the doctests, which doesn't look like a big problem. I wonder what the status of that is in 1.1.x? Alan? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
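One way a rundocs() backport could supply the np.x context without editing every module is doctest's extraglobs hook. A hypothetical sketch (some_module is a placeholder for any module whose examples use np):

    import doctest
    import numpy

    import some_module   # placeholder: any module whose doctests use np.x

    # extraglobs (Python 2.4's doctest) injects extra names into every
    # example's namespace, so doctests can call np.array(...) without an
    # explicit "import numpy as np" line in each docstring
    doctest.testmod(some_module, extraglobs={'np': numpy})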
URL: From alan.mcintyre at gmail.com Tue Jul 22 11:58:33 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Tue, 22 Jul 2008 11:58:33 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: <48839CBD.30304@gmail.com> <9457e7c80807211940v7c9561dci974caa958b4fed98@mail.gmail.com> <9457e7c80807220145x7dacd26dt3c33c2f83150c3c9@mail.gmail.com> <9457e7c80807220818k6be65896r5cf9df891dec516@mail.gmail.com> <9457e7c80807220821h243c4161p60c2633f9bafe3d9@mail.gmail.com> Message-ID: <1d36917a0807220858m62d8369cg17d44651ec14de1b@mail.gmail.com> On Tue, Jul 22, 2008 at 11:44 AM, Charles R Harris wrote: > OK, the only other change since 1.1.0 is using > np.x in the doctests, which doesn't look like a big problem. I wonder what > the status of that is in 1.1.x? Alan? All the changes I made for that were in the trunk after 1.1.1 was tagged, so the 1.1.1 test behavior should be completely under the old framework's rules. I could probably backport the np.x context to rundocs() (off the top of my head I can't remember if there was any mechanism to run all the doctests in NumPy), but it might just be easier to add "import numpy as np" to the modules containing doctests that use np.x. I can help with that if need be, just let me know. From charlesr.harris at gmail.com Tue Jul 22 12:05:22 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Jul 2008 10:05:22 -0600 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: <48839CBD.30304@gmail.com> <9457e7c80807211940v7c9561dci974caa958b4fed98@mail.gmail.com> <9457e7c80807220145x7dacd26dt3c33c2f83150c3c9@mail.gmail.com> <9457e7c80807220818k6be65896r5cf9df891dec516@mail.gmail.com> <9457e7c80807220821h243c4161p60c2633f9bafe3d9@mail.gmail.com> Message-ID: On Tue, Jul 22, 2008 at 9:44 AM, Charles R Harris wrote: > > > On Tue, Jul 22, 2008 at 9:21 AM, St?fan van der Walt > wrote: > >> 2008/7/22 St?fan van der Walt : >> >> Could you post a patch against current mainline svn, which already has >> your >> >> previous patch applied? I'm also curious why we need tuples, are we >> using >> >> these values as hashes someplace. >> > >> > Applied. The reason we need to use tuples is because >> >> Should these changes be back-ported to the 1.1.1 branch? >> > > That's what I'm trying out now that I've got Python2.3 installed for > testing. What I'd like to do is just copy the whole io.py file over. Let's > see if Trac is working... OK, the only other change since 1.1.0 is using > np.x in the doctests, which doesn't look like a big problem. I wonder what > the status of that is in 1.1.x? Alan? > OK, io.py seems to work. However, there are two other problems showing up in 2.3: Failed importing numpy.f2py.lib.extgen: update() takes no keyword arguments Failed importing /usr/local/lib/python2.3/site-packages/numpy/ma/tests/test_mrecords.py: invalid syntax (mrecords.py, line 245) Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
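The extgen failure is a pure language-version issue: keyword arguments to dict.update() only appeared in Python 2.4. A minimal sketch of the incompatibility and two 2.3-safe spellings:

    d = {'a': 1}

    # On Python 2.3 this raises "TypeError: update() takes no keyword
    # arguments", which is exactly the extgen import failure quoted above:
    # d.update(b=2)

    # 2.3-safe spellings of the same thing:
    d.update({'b': 2})
    d['c'] = 3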
URL: From charlesr.harris at gmail.com Tue Jul 22 12:16:00 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Jul 2008 10:16:00 -0600 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <1d36917a0807220858m62d8369cg17d44651ec14de1b@mail.gmail.com> References: <48839CBD.30304@gmail.com> <9457e7c80807211940v7c9561dci974caa958b4fed98@mail.gmail.com> <9457e7c80807220145x7dacd26dt3c33c2f83150c3c9@mail.gmail.com> <9457e7c80807220818k6be65896r5cf9df891dec516@mail.gmail.com> <9457e7c80807220821h243c4161p60c2633f9bafe3d9@mail.gmail.com> <1d36917a0807220858m62d8369cg17d44651ec14de1b@mail.gmail.com> Message-ID: On Tue, Jul 22, 2008 at 9:58 AM, Alan McIntyre wrote: > On Tue, Jul 22, 2008 at 11:44 AM, Charles R Harris > wrote: > > OK, the only other change since 1.1.0 is using > > np.x in the doctests, which doesn't look like a big problem. I wonder > what > > the status of that is in 1.1.x? Alan? > > All the changes I made for that were in the trunk after 1.1.1 was > tagged, so the 1.1.1 test behavior should be completely under the old > framework's rules. I could probably backport the np.x context to > rundocs() (off the top of my head I can't remember if there was any > mechanism to run all the doctests in NumPy), but it might just be > easier to add "import numpy as np" to the modules containing doctests > that use np.x. I can help with that if need be, just let me know. If we do a 1.1.2 it might be handy for future backports if rundocs() has the np.x context. For the moment I'm tempted to just leave them as they don't look any worse than than they were, i.e., assuming from numpy import * Thanks, Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Tue Jul 22 12:23:57 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 22 Jul 2008 12:23:57 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: Message-ID: <200807221223.58634.pgmdevlist@gmail.com> On Tuesday 22 July 2008 12:05:22 Charles R Harris wrote: > Failed importing > /usr/local/lib/python2.3/site-packages/numpy/ma/tests/test_mrecords.py: > invalid syntax (mrecords.py, line 245) Charles, Can you import numpy.ma.mrecords ? And we're talking about the 1.1.x branch, right ? From charlesr.harris at gmail.com Tue Jul 22 12:48:20 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Jul 2008 10:48:20 -0600 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <200807221223.58634.pgmdevlist@gmail.com> References: <200807221223.58634.pgmdevlist@gmail.com> Message-ID: On Tue, Jul 22, 2008 at 10:23 AM, Pierre GM wrote: > On Tuesday 22 July 2008 12:05:22 Charles R Harris wrote: > > > Failed importing > > /usr/local/lib/python2.3/site-packages/numpy/ma/tests/test_mrecords.py: > > invalid syntax (mrecords.py, line 245) > > Charles, > Can you import numpy.ma.mrecords ? And we're talking about the 1.1.x > branch, > right ? > I fixed it, Pierre. You can't do (_[1] for _ in ddtype.descr) to get a tuple. Actually, I think this should fail in mainline also. Alan, does nose show import failures? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
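For context on what that expression walks over: a dtype's .descr attribute is already a list of (name, format) pairs, so on 2.3 a plain list comprehension does the job the generator expression was doing. A small sketch (the byte-order characters in the comment assume a little-endian machine):

    import numpy as np

    ddtype = np.dtype([('stid', 'S4'), ('temp', 'f8')])

    # .descr is a list of (name, format) tuples
    formats = [fmt for (name, fmt) in ddtype.descr]
    dt = zip(ddtype.names, formats)
    # dt == [('stid', '|S4'), ('temp', '<f8')]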
URL: From charlesr.harris at gmail.com Tue Jul 22 12:56:11 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Jul 2008 10:56:11 -0600 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: <200807221223.58634.pgmdevlist@gmail.com> Message-ID: On Tue, Jul 22, 2008 at 10:48 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Tue, Jul 22, 2008 at 10:23 AM, Pierre GM wrote: > >> On Tuesday 22 July 2008 12:05:22 Charles R Harris wrote: >> >> > Failed importing >> > /usr/local/lib/python2.3/site-packages/numpy/ma/tests/test_mrecords.py: >> > invalid syntax (mrecords.py, line 245) >> >> Charles, >> Can you import numpy.ma.mrecords ? And we're talking about the 1.1.x >> branch, >> right ? >> > > I fixed it, Pierre. You can't do > > (_[1] for _ in ddtype.descr) > > to get a tuple. Actually, I think this should fail in mainline also. Alan, > does nose show import failures? > Well, it produces a generator object in python2.5, which zip accepts. I don't know in which Python version this feature was added. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Tue Jul 22 12:52:13 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 22 Jul 2008 12:52:13 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: <200807221223.58634.pgmdevlist@gmail.com> Message-ID: <200807221252.14822.pgmdevlist@gmail.com> On Tuesday 22 July 2008 12:48:20 Charles R Harris wrote: > I fixed it, Pierre. You can't do > > (_[1] for _ in ddtype.descr) > to get a tuple. OK, thx for that. AAMOF, lines 243-245 should be: self._fill_value = np.array(tuple(fillval), dtype=[(_[0], _[1]) for _ in ddtype.descr]) I guess 2.3 choked on the generator instead of the list. > Actually, I think this should fail in mainline also. 1.2 ? Can't, not the same approach: as in 1.2, flexible types are supported by numpy.ma.core, the get_fill_value trick in numpy.ma.mrecords becomes moot. From pgmdevlist at gmail.com Tue Jul 22 12:53:31 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 22 Jul 2008 12:53:31 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: Message-ID: <200807221253.31677.pgmdevlist@gmail.com> On Tuesday 22 July 2008 12:56:11 Charles R Harris wrote: > Well, it produces a generator object in python2.5, which zip accepts. I > don't know in which Python version this feature was added. Likely 2.4, as it works on my machine (else I would have found about it at one point or another...) From alan.mcintyre at gmail.com Tue Jul 22 13:00:08 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Tue, 22 Jul 2008 13:00:08 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: <200807221223.58634.pgmdevlist@gmail.com> Message-ID: <1d36917a0807221000i7923f795k6d6a2faef4277e71@mail.gmail.com> On Tue, Jul 22, 2008 at 12:48 PM, Charles R Harris wrote: > Actually, I think this should fail in mainline also. Alan, > does nose show import failures? I've seen talk of it swallowing some exceptions, but I'm not sure of the specifics. Are you referring to "import numpy.ma.mrecords" under Python 2.3, NumPy trunk? Here's what I get: Python 2.3.7 (#1, Jul 14 2008, 22:34:29) [GCC 4.1.2 (Gentoo 4.1.2 p1.1)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy.ma.mrecords Traceback (most recent call last): File "", line 1, in ? 
File "/usr/local/lib/python2.3/site-packages/numpy/__init__.py", line 107, in ? import ma File "/usr/local/lib/python2.3/site-packages/numpy/ma/__init__.py", line 14, in ? import core File "/usr/local/lib/python2.3/site-packages/numpy/ma/core.py", line 113, in ? max_filler.update([(k, -np.inf) for k in [np.float32, np.float64]]) AttributeError: keys From alan.mcintyre at gmail.com Tue Jul 22 13:04:08 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Tue, 22 Jul 2008 13:04:08 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <1d36917a0807221000i7923f795k6d6a2faef4277e71@mail.gmail.com> References: <200807221223.58634.pgmdevlist@gmail.com> <1d36917a0807221000i7923f795k6d6a2faef4277e71@mail.gmail.com> Message-ID: <1d36917a0807221004v69bf9c0fob5e9a90bb0894caa@mail.gmail.com> On Tue, Jul 22, 2008 at 1:00 PM, Alan McIntyre wrote: > I've seen talk of it swallowing some exceptions, but I'm not sure of > the specifics. Are you referring to "import numpy.ma.mrecords" under > Python 2.3, NumPy trunk? When I run numpy/numpy/ma/tests/test_mrecords.py from the trunk with python2.3 (and this should be running under nose), the import fails and appears to be handled correctly. That's with nose 0.11.0 locally, though; I'll try it with an older version to see if it changes. From pgmdevlist at gmail.com Tue Jul 22 13:00:39 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 22 Jul 2008 13:00:39 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <1d36917a0807221000i7923f795k6d6a2faef4277e71@mail.gmail.com> References: <1d36917a0807221000i7923f795k6d6a2faef4277e71@mail.gmail.com> Message-ID: <200807221300.39742.pgmdevlist@gmail.com> On Tuesday 22 July 2008 13:00:08 Alan McIntyre wrote: > I've seen talk of it swallowing some exceptions, but I'm not sure of > the specifics. Are you referring to "import numpy.ma.mrecords" under > Python 2.3, NumPy trunk? Here's what I get: Hold on a second, what are we talking about ? If it's the trunk, it's NumPy 1.2, and Python2.3 shouldn't be supported, right ? If it's the 1.1.x branch, then I'm in trouble. Please don't tell me I'm in trouble. From charlesr.harris at gmail.com Tue Jul 22 13:06:52 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Jul 2008 11:06:52 -0600 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <200807221252.14822.pgmdevlist@gmail.com> References: <200807221223.58634.pgmdevlist@gmail.com> <200807221252.14822.pgmdevlist@gmail.com> Message-ID: On Tue, Jul 22, 2008 at 10:52 AM, Pierre GM wrote: > On Tuesday 22 July 2008 12:48:20 Charles R Harris wrote: > > I fixed it, Pierre. You can't do > > > > (_[1] for _ in ddtype.descr) > > > to get a tuple. > > OK, thx for that. AAMOF, lines 243-245 should be: > self._fill_value = np.array(tuple(fillval), > dtype=[(_[0], _[1]) > for _ in ddtype.descr]) > > I guess 2.3 choked on the generator instead of the list. > I just replaced () by []. The 1.1.x version is now dt = zip(ddtype.names, [s[1] for s in ddtype.descr]) Which should have the same effect, although not done as slickly. Chuck -------------- next part -------------- An HTML attachment was scrubbed...
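For readers following along, a sketch of why the one-character change matters: generator expressions only arrived in Python 2.4 (PEP 289), so under 2.3 the parenthesised form is a SyntaxError and the bracketed list comprehension is the portable spelling. The ddtype below is a hypothetical flexible dtype, not the mrecords original:

import numpy as np

ddtype = np.dtype([('a', float), ('b', int)])

# SyntaxError under Python 2.3; generator expressions need 2.4+:
#   dt = zip(ddtype.names, (s[1] for s in ddtype.descr))

# The spelling committed to 1.1.x works on 2.3 and later:
dt = zip(ddtype.names, [s[1] for s in ddtype.descr])
print dt   # e.g. [('a', '<f8'), ('b', '<i4')]; format strings vary by platform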
URL: From charlesr.harris at gmail.com Tue Jul 22 13:08:28 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Jul 2008 11:08:28 -0600 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <200807221300.39742.pgmdevlist@gmail.com> References: <1d36917a0807221000i7923f795k6d6a2faef4277e71@mail.gmail.com> <200807221300.39742.pgmdevlist@gmail.com> Message-ID: On Tue, Jul 22, 2008 at 11:00 AM, Pierre GM wrote: > On Tuesday 22 July 2008 13:00:08 Alan McIntyre wrote: > > I've seen talk of it swallowing some exceptions, but I'm not sure of > > the specifics. Are you referring to "import numpy.ma.mrecords" under > > Python 2.3, NumPy trunk? Here's what I get: > > Hold on a second, what are we taling about ? > If it's the trunk, it's Numpy1.2, and Python2.3 shouldn't be supported, > right ? > If it's the 1.1.x branch, then I'm in trouble. I fixed it in the 1.1.x branch, trunk looks fine. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.mcintyre at gmail.com Tue Jul 22 13:09:20 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Tue, 22 Jul 2008 13:09:20 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <200807221300.39742.pgmdevlist@gmail.com> References: <1d36917a0807221000i7923f795k6d6a2faef4277e71@mail.gmail.com> <200807221300.39742.pgmdevlist@gmail.com> Message-ID: <1d36917a0807221009r866f5cas290e409ec463b9b1@mail.gmail.com> On Tue, Jul 22, 2008 at 1:00 PM, Pierre GM wrote: > On Tuesday 22 July 2008 13:00:08 Alan McIntyre wrote: >> I've seen talk of it swallowing some exceptions, but I'm not sure of >> the specifics. Are you referring to "import numpy.ma.mrecords" under >> Python 2.3, NumPy trunk? Here's what I get: > > Hold on a second, what are we taling about ? > If it's the trunk, it's Numpy1.2, and Python2.3 shouldn't be supported, > right ? I've been told NumPy 1.2 doesn't have to support Python 2.3, so I hope that's right. :) > If it's the 1.1.x branch, then I'm in trouble. > Please don't tell me I'm in trouble. No, I was just trying NumPy trunk against Python 2.3 based on something Charles said; sorry for the confusion. From pgmdevlist at gmail.com Tue Jul 22 13:04:51 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 22 Jul 2008 13:04:51 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: <200807221252.14822.pgmdevlist@gmail.com> Message-ID: <200807221304.51240.pgmdevlist@gmail.com> On Tuesday 22 July 2008 13:06:52 Charles R Harris wrote: > I just replaced () by []. The 1.1.x version is now > > dt = zip(ddtype.names, [s[1] for s in ddtype.descr]) > > Which should have the same effect, although not done as slickly. Bah, that'll do for now. Many many thanks ! From millman at berkeley.edu Tue Jul 22 13:15:50 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 22 Jul 2008 10:15:50 -0700 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <1d36917a0807221009r866f5cas290e409ec463b9b1@mail.gmail.com> References: <1d36917a0807221000i7923f795k6d6a2faef4277e71@mail.gmail.com> <200807221300.39742.pgmdevlist@gmail.com> <1d36917a0807221009r866f5cas290e409ec463b9b1@mail.gmail.com> Message-ID: On Tue, Jul 22, 2008 at 10:09 AM, Alan McIntyre wrote: > I've been told NumPy 1.2 doesn't have to support Python 2.3, so I hope > that's right. :) Yes that's correct. 
NumPy 1.2 requires at least Python 2.4 -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From pgmdevlist at gmail.com Tue Jul 22 13:14:03 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 22 Jul 2008 13:14:03 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <1d36917a0807221009r866f5cas290e409ec463b9b1@mail.gmail.com> References: <200807221300.39742.pgmdevlist@gmail.com> <1d36917a0807221009r866f5cas290e409ec463b9b1@mail.gmail.com> Message-ID: <200807221314.03726.pgmdevlist@gmail.com> On Tuesday 22 July 2008 13:09:20 Alan McIntyre wrote: > I've been told NumPy 1.2 doesn't have to support Python 2.3, so I hope > that's right. :) I sure do hope so... > > If it's the 1.1.x branch, then I'm in trouble. > > Please don't tell me I'm in trouble. > > No, I was just trying NumPy trunk against Python 2.3 based on > something Charles said; sorry for the confusion. From charlesr.harris at gmail.com Tue Jul 22 13:41:04 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Jul 2008 11:41:04 -0600 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <200807221314.03726.pgmdevlist@gmail.com> References: <200807221300.39742.pgmdevlist@gmail.com> <1d36917a0807221009r866f5cas290e409ec463b9b1@mail.gmail.com> <200807221314.03726.pgmdevlist@gmail.com> Message-ID: So, the following oddities remain for Python2.3 : Failed importing numpy.f2py.lib.extgen: update() takes no keyword arguments ctypes is not available on this python: skipping the test (import error was: ctypes is not available.) No distutils available, skipping test. The missing ctypes shouldn't be a problem. I don't know what to make of the f2py import failure or if it matters. Nor do I understand what No distutils available implies. Anyone have ideas? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Jul 22 13:46:02 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 22 Jul 2008 12:46:02 -0500 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: <200807221300.39742.pgmdevlist@gmail.com> <1d36917a0807221009r866f5cas290e409ec463b9b1@mail.gmail.com> <200807221314.03726.pgmdevlist@gmail.com> Message-ID: <3d375d730807221046s4c1c8a6bgb5c65baf335a8151@mail.gmail.com> On Tue, Jul 22, 2008 at 12:41, Charles R Harris wrote: > > So, the following oddities remain for Python2.3 : > > Failed importing numpy.f2py.lib.extgen: update() takes no keyword arguments > > ctypes is not available on this python: skipping the test (import error was: > ctypes is not available.) > No distutils available, skipping test. > > > The missing ctypes shouldn't be a problem. I don't know what to make of the > f2py import failure or if it matters. Everything in numpy/f2py/lib/ is the unfinished G3 version of f2py, whose development has moved to Launchpad. It has been removed from the trunk, and errors in it can be ignored for 1.1.x. > Nor do I understand what No distutils > available implies. Anyone have ideas? It's another ctypes skipped test. numpy/tests/test_ctypes.py:TestLoadLibrary.check_basic2(). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From pgmdevlist at gmail.com Tue Jul 22 13:41:18 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 22 Jul 2008 13:41:18 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: <200807221314.03726.pgmdevlist@gmail.com> Message-ID: <200807221341.18871.pgmdevlist@gmail.com> On Tuesday 22 July 2008 13:41:04 Charles R Harris wrote: > The missing ctypes shouldn't be a problem. I don't know what to make of the > f2py import failure or if it matters. /extgen/py_support.py:284: parent_container_options.update(prefix='"\\n\\n:Parameters:\\n"\n" ') ./extgen/py_support.py:287: parent_container_options.update(prefix='"\\n\\n:Optional parameters: \\n"\n" ') ./extgen/py_support.py:290: parent_container_options.update(prefix='"\\n\\n:Extra parameters:\\n"\n" ') ./extgen/py_support.py:293: parent_container_options.update(prefix='"\\n\\n:Returns:\\n"\n" ', so, we can't use that for Python 2.3, we need to use a syntax like parent_container_options.update({'prefix':blahblahblah}) From charlesr.harris at gmail.com Tue Jul 22 13:57:16 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Jul 2008 11:57:16 -0600 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <3d375d730807221046s4c1c8a6bgb5c65baf335a8151@mail.gmail.com> References: <200807221300.39742.pgmdevlist@gmail.com> <1d36917a0807221009r866f5cas290e409ec463b9b1@mail.gmail.com> <200807221314.03726.pgmdevlist@gmail.com> <3d375d730807221046s4c1c8a6bgb5c65baf335a8151@mail.gmail.com> Message-ID: On Tue, Jul 22, 2008 at 11:46 AM, Robert Kern wrote: > On Tue, Jul 22, 2008 at 12:41, Charles R Harris > wrote: > > > > So, the following oddities remain for Python2.3 : > > > > Failed importing numpy.f2py.lib.extgen: update() takes no keyword > arguments > > > > ctypes is not available on this python: skipping the test (import error > was: > > ctypes is not available.) > > No distutils available, skipping test. > > > > > > The missing ctypes shouldn't be a problem. I don't know what to make of > the > > f2py import failure or if it matters. > > Everything in numpy/f2py/lib/ is the unfinished G3 version of f2py, > whose development has moved to Launchpad. It has been removed from the > trunk, and errors in it can be ignored for 1.1.x. > > > Nor do I understand what No distutils > > available implies. Anyone have ideas? > > It's another ctypes skipped test. > numpy/tests/test_ctypes.py:TestLoadLibrary.check_basic2(). > OK, thanks. Can I take it that we are good to go as far as the tests show? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
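A tiny sketch tying together the two Python 2.3 failures in this thread: 2.3's dict.update() accepts only another mapping object, so both the keyword form used in extgen and the list-of-pairs form from the earlier ma/core.py traceback need a 2.4+ interpreter. The option name and value here are illustrative only:

opts = {}
# opts.update(prefix='header')         # 2.3: TypeError: update() takes no keyword arguments
# opts.update([('prefix', 'header')])  # 2.3: AttributeError: keys (a list has no .keys())
opts.update({'prefix': 'header'})      # a plain dict works on 2.3 and later
print opts['prefix']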
URL: From robert.kern at gmail.com Tue Jul 22 14:03:15 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 22 Jul 2008 13:03:15 -0500 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: <200807221300.39742.pgmdevlist@gmail.com> <1d36917a0807221009r866f5cas290e409ec463b9b1@mail.gmail.com> <200807221314.03726.pgmdevlist@gmail.com> <3d375d730807221046s4c1c8a6bgb5c65baf335a8151@mail.gmail.com> Message-ID: <3d375d730807221103y35a0838epe4d88e0879d2cbb3@mail.gmail.com> On Tue, Jul 22, 2008 at 12:57, Charles R Harris wrote: > > On Tue, Jul 22, 2008 at 11:46 AM, Robert Kern wrote: >> >> On Tue, Jul 22, 2008 at 12:41, Charles R Harris >> wrote: >> > >> > So, the following oddities remain for Python2.3 : >> > >> > Failed importing numpy.f2py.lib.extgen: update() takes no keyword >> > arguments >> > >> > ctypes is not available on this python: skipping the test (import error >> > was: >> > ctypes is not available.) >> > No distutils available, skipping test. >> > >> > >> > The missing ctypes shouldn't be a problem. I don't know what to make of >> > the >> > f2py import failure or if it matters. >> >> Everything in numpy/f2py/lib/ is the unfinished G3 version of f2py, >> whose development has moved to Launchpad. It has been removed from the >> trunk, and errors in it can be ignored for 1.1.x. >> >> > Nor do I understand what No distutils >> > available implies. Anyone have ideas? >> >> It's another ctypes skipped test. >> numpy/tests/test_ctypes.py:TestLoadLibrary.check_basic2(). > > OK, thanks. Can I take it that we are good to go as far as the tests show? I think so. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Tue Jul 22 14:20:13 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Jul 2008 12:20:13 -0600 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <3d375d730807221103y35a0838epe4d88e0879d2cbb3@mail.gmail.com> References: <200807221300.39742.pgmdevlist@gmail.com> <1d36917a0807221009r866f5cas290e409ec463b9b1@mail.gmail.com> <200807221314.03726.pgmdevlist@gmail.com> <3d375d730807221046s4c1c8a6bgb5c65baf335a8151@mail.gmail.com> <3d375d730807221103y35a0838epe4d88e0879d2cbb3@mail.gmail.com> Message-ID: On Tue, Jul 22, 2008 at 12:03 PM, Robert Kern wrote: > On Tue, Jul 22, 2008 at 12:57, Charles R Harris > wrote: > > > > On Tue, Jul 22, 2008 at 11:46 AM, Robert Kern > wrote: > >> > >> On Tue, Jul 22, 2008 at 12:41, Charles R Harris > >> wrote: > >> > > >> > So, the following oddities remain for Python2.3 : > >> > > >> > Failed importing numpy.f2py.lib.extgen: update() takes no keyword > >> > arguments > >> > > >> > ctypes is not available on this python: skipping the test (import > error > >> > was: > >> > ctypes is not available.) > >> > No distutils available, skipping test. > >> > > >> > > >> > The missing ctypes shouldn't be a problem. I don't know what to make > of > >> > the > >> > f2py import failure or if it matters. > >> > >> Everything in numpy/f2py/lib/ is the unfinished G3 version of f2py, > >> whose development has moved to Launchpad. It has been removed from the > >> trunk, and errors in it can be ignored for 1.1.x. > >> > >> > Nor do I understand what No distutils > >> > available implies. Anyone have ideas? > >> > >> It's another ctypes skipped test. > >> numpy/tests/test_ctypes.py:TestLoadLibrary.check_basic2(). 
> > OK, thanks. Can I take it that we are good to go as far as the tests > show? > > I think so. > Although... would it be a problem to just remove the f2py stuff? That would > get rid of one confusing message. Probably not. See r5347 and r5348 for what I had to do on the trunk. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From millman at berkeley.edu Tue Jul 22 14:58:22 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 22 Jul 2008 11:58:22 -0700 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: <200807221300.39742.pgmdevlist@gmail.com> <1d36917a0807221009r866f5cas290e409ec463b9b1@mail.gmail.com> <200807221314.03726.pgmdevlist@gmail.com> <3d375d730807221046s4c1c8a6bgb5c65baf335a8151@mail.gmail.com> <3d375d730807221103y35a0838epe4d88e0879d2cbb3@mail.gmail.com> Message-ID: On Tue, Jul 22, 2008 at 11:20 AM, Charles R Harris wrote: > Although... would it be a problem to just remove the f2py stuff? That would > get rid of one confusing message. +1 -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From mdroe at stsci.edu Tue Jul 22 15:36:31 2008 From: mdroe at stsci.edu (Michael Droettboom) Date: Tue, 22 Jul 2008 15:36:31 -0400 Subject: [Numpy-discussion] Segfault in PyArray_Item_XDECREF when using recarray object references titles Message-ID: <488636BF.9050205@stsci.edu> I've run into a segfault that occurs in the array destructor with arrays containing object references with both names and titles. When a field contains both a name and a title, the fields dictionary contains two entries for that field. This means that the array item destructor (which iterates through the fields dictionary) will decref the pointed-to object twice. If the first decref causes the object to be deleted, the second decref has the potential to segfault. It seems the simplest patch is to set the object pointer to NULL after decref'ing, so the second decref will do nothing. However, perhaps there is a way to avoid decref'ing twice in the first place. I've attached a script that exercises the segfault, a gdb backtrace, and a patch. You may need to adjust the number of rows until it is high enough to create a segfault on your system. This is on: RHEL4 Python 2.5.2 Numpy SVN r5497 Cheers, Mike > gdb python GNU gdb Red Hat Linux (6.3.0.0-1.153.el4_6.2rh) Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db library "/lib/tls/libthread_db.so.1". (gdb) run segfault.py Starting program: /wonkabar/data1/usr/bin/python segfault.py [Thread debugging using libthread_db enabled] [New Thread -1208489312 (LWP 30028)] len(dtype) = 1, len(dtype.fields) = 2 {'name': (dtype('object'), 0, 'title'), 'title': (dtype('object'), 0, 'title')} Program received signal SIGSEGV, Segmentation fault. [Switching to Thread -1208489312 (LWP 30028)] 0x0097285e in PyArray_Item_XDECREF ( data=0xb7a3e780 "...", descr=0x9d4680) at numpy/core/src/arrayobject.c:198 198 Py_XDECREF(*temp); (gdb) bt #0 0x0097285e in PyArray_Item_XDECREF ( data=0xb7a3e780 "...", descr=0x9d4680) at numpy/core/src/arrayobject.c:198 #1 0x00991bc7 in PyArray_XDECREF (mp=0xb7ae4f0c) at numpy/core/src/arrayobject.c:211 #2 0x009a579b in array_dealloc (self=0xb7ae4f0c) at numpy/core/src/arrayobject.c:2089 #3 0x0809781f in subtype_dealloc (self=0xb7ae4f0c) at Objects/typeobject.c:709 #4 0x08082a02 in PyDict_SetItem (op=0xb7f56acc, key=0xb7ea7d80, value=0x81379c0) at Objects/dictobject.c:416 #5 0x08085a1e in _PyModule_Clear (m=0xb7f3e0ec) at Objects/moduleobject.c:136 #6 0x080d7138 in PyImport_Cleanup () at Python/import.c:439 #7 0x080e4343 in Py_Finalize () at Python/pythonrun.c:399 #8 0x08056633 in Py_Main (argc=1, argv=0xbff1ca24) at Modules/main.c:545 #9 0x08056323 in main (argc=2, argv=0xbff1ca24) at ./Modules/python.c:23 -- Michael Droettboom Science Software Branch Operations and Engineering Division Space Telescope Science Institute Operated by AURA for NASA -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: arrayobject.c.diff URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: segfault.py URL: From mdroe at stsci.edu Tue Jul 22 15:54:38 2008 From: mdroe at stsci.edu (Michael Droettboom) Date: Tue, 22 Jul 2008 15:54:38 -0400 Subject: [Numpy-discussion] Segfault in PyArray_Item_XDECREF when using recarray object references titles In-Reply-To: <488636BF.9050205@stsci.edu> References: <488636BF.9050205@stsci.edu> Message-ID: <48863AFE.2040109@stsci.edu> I also noticed that the inverse operation, PyArray_Item_INCREF has the potential to leak memory as it will doubly-increment each object in the array.
The solution there probably isn't quite as clean, since we can't just mark the pointer. It will have to somehow avoid incref'ing the objects twice when iterating through the fields dictionary. Cheers, Mike Michael Droettboom wrote: > I've run into a segfault that occurs in the array destructor with > arrays containing object references with both names and titles. > > When a field contains both and name and a title, the fields dictionary > contains two entries for that field. This means that the array item > destructor (which iterates through the fields dictionary) will decref > the pointed-to object twice. If the first decref causes the object to > be deleted, the second decref has the potential to segfault. > > It seems the simplest patch is to set the object pointer to NULL after > decref'ing, so the second decref will do nothing. However, perhaps > there is a way to avoid decref'ing twice in the first place. > > I've attached a script that exercises the segfault, a gdb backtrace, > and a patch. You may need to adjust the number of rows until it is > high enough to create a segfault on your system. > > This is on: > RHEL4 > Python 2.5.2 > Numpy SVN r5497 > > Cheers, > Mike > >> gdb python > GNU gdb Red Hat Linux (6.3.0.0-1.153.el4_6.2rh) > Copyright 2004 Free Software Foundation, Inc. > GDB is free software, covered by the GNU General Public License, and > you are > welcome to change it and/or distribute copies of it under certain > conditions. > Type "show copying" to see the conditions. > There is absolutely no warranty for GDB. Type "show warranty" for > details. > This GDB was configured as "i386-redhat-linux-gnu"...Using host > libthread_db library "/lib/tls/libthread_db.so.1". > > (gdb) run segfault.py > Starting program: /wonkabar/data1/usr/bin/python segfault.py > [Thread debugging using libthread_db enabled] > [New Thread -1208489312 (LWP 30028)] > len(dtype) = 1, len(dtype.fields) = 2 > {'name': (dtype('object'), 0, 'title'), 'title': (dtype('object'), 0, > 'title')} > > Program received signal SIGSEGV, Segmentation fault. 
> [Switching to Thread -1208489312 (LWP 30028)] > 0x0097285e in PyArray_Item_XDECREF ( > data=0xb7a3e780 "...", > descr=0x9d4680) at numpy/core/src/arrayobject.c:198 > 198 Py_XDECREF(*temp); > (gdb) bt > #0 0x0097285e in PyArray_Item_XDECREF ( > data=0xb7a3e780 "...", > descr=0x9d4680) at numpy/core/src/arrayobject.c:198 > #1 0x00991bc7 in PyArray_XDECREF (mp=0xb7ae4f0c) > at numpy/core/src/arrayobject.c:211 > #2 0x009a579b in array_dealloc (self=0xb7ae4f0c) > at numpy/core/src/arrayobject.c:2089 > #3 0x0809781f in subtype_dealloc (self=0xb7ae4f0c) at > Objects/typeobject.c:709 > #4 0x08082a02 in PyDict_SetItem (op=0xb7f56acc, key=0xb7ea7d80, > value=0x81379c0) at Objects/dictobject.c:416 > #5 0x08085a1e in _PyModule_Clear (m=0xb7f3e0ec) at > Objects/moduleobject.c:136 > #6 0x080d7138 in PyImport_Cleanup () at Python/import.c:439 > #7 0x080e4343 in Py_Finalize () at Python/pythonrun.c:399 > #8 0x08056633 in Py_Main (argc=1, argv=0xbff1ca24) at Modules/main.c:545 > #9 0x08056323 in main (argc=2, argv=0xbff1ca24) at ./Modules/python.c:23 > > > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion -- Michael Droettboom Science Software Branch Operations and Engineering Division Space Telescope Science Institute Operated by AURA for NASA From pav at iki.fi Tue Jul 22 15:57:00 2008 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 22 Jul 2008 19:57:00 +0000 (UTC) Subject: [Numpy-discussion] Corner case complex log error. References: Message-ID: Tue, 22 Jul 2008 00:34:46 -0600, Charles R Harris wrote: > > FAIL: test_umath.TestC99.test_clog(, (-0.0, -0.0), (-inf, > -0.0), 'divide') > AssertionError: ('(-inf, 3.1415926535897931)', '(-inf, 0.0)') The interesting thing is that there is no test like this in test_umath.py. The closest thing is the second test in test_clog, which is (+0., 0.), (-inf, 0.). Does this vanish if you comment it out?
-- Pauli Virtanen From charlesr.harris at gmail.com Tue Jul 22 16:07:01 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Jul 2008 14:07:01 -0600 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <3d375d730807221151x7ce6f33t23ec333685871bc@mail.gmail.com> References: <200807221300.39742.pgmdevlist@gmail.com> <1d36917a0807221009r866f5cas290e409ec463b9b1@mail.gmail.com> <200807221314.03726.pgmdevlist@gmail.com> <3d375d730807221046s4c1c8a6bgb5c65baf335a8151@mail.gmail.com> <3d375d730807221103y35a0838epe4d88e0879d2cbb3@mail.gmail.com> <3d375d730807221151x7ce6f33t23ec333685871bc@mail.gmail.com> Message-ID: On Tue, Jul 22, 2008 at 12:51 PM, Robert Kern wrote: > On Tue, Jul 22, 2008 at 13:20, Charles R Harris > wrote: > > > > Although... would it be a problem to just remove the f2py stuff? That > would > > get rid of one confusing message. > > Probably not. See r5347 and r5348 for what I had to do on the trunk. > OK, done. I think I'll replace the ctypes warnings with a simple "S" if no one objects. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Jul 22 16:13:09 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 22 Jul 2008 15:13:09 -0500 Subject: [Numpy-discussion] Corner case complex log error. In-Reply-To: References: Message-ID: <3d375d730807221313j42b71d63i4dc8dcc56087a66@mail.gmail.com> On Tue, Jul 22, 2008 at 14:57, Pauli Virtanen wrote: > Tue, 22 Jul 2008 00:34:46 -0600, Charles R Harris wrote: >> >> FAIL: test_umath.TestC99.test_clog(, (-0.0, -0.0), (-inf, >> -0.0), 'divide') >> AssertionError: ('(-inf, 3.1415926535897931)', '(-inf, 0.0)') > > The interesting thing is that there is no test like this > in test_umath.py. The closest thing is the second test in test_clog, > which is (+0., 0.), (-inf, 0.). Does this vanish if you comment it out? I think it's a weirdness in the way that Python can handle literals in the same statement. >>> (-0.0, 0.0) (-0.0, -0.0) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Tue Jul 22 16:17:52 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 22 Jul 2008 15:17:52 -0500 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: <1d36917a0807221009r866f5cas290e409ec463b9b1@mail.gmail.com> <200807221314.03726.pgmdevlist@gmail.com> <3d375d730807221046s4c1c8a6bgb5c65baf335a8151@mail.gmail.com> <3d375d730807221103y35a0838epe4d88e0879d2cbb3@mail.gmail.com> <3d375d730807221151x7ce6f33t23ec333685871bc@mail.gmail.com> Message-ID: <3d375d730807221317g5e2ff32ct392ac0baf80da540@mail.gmail.com> On Tue, Jul 22, 2008 at 15:07, Charles R Harris wrote: > > > On Tue, Jul 22, 2008 at 12:51 PM, Robert Kern wrote: >> >> On Tue, Jul 22, 2008 at 13:20, Charles R Harris >> wrote: >> > >> > Although... would it be a problem to just remove the f2py stuff? That >> > would >> > get rid of one confusing message. >> >> Probably not. See r5347 and r5348 for what I had to do on the trunk. > > OK, done. I think I'll replace the ctypes warnings with a simple "S" if no > one objects. Sure. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From pav at iki.fi Tue Jul 22 16:21:42 2008 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 22 Jul 2008 20:21:42 +0000 (UTC) Subject: [Numpy-discussion] Corner case complex log error. References: <3d375d730807221313j42b71d63i4dc8dcc56087a66@mail.gmail.com> Message-ID: Tue, 22 Jul 2008 15:13:09 -0500, Robert Kern wrote: > On Tue, Jul 22, 2008 at 14:57, Pauli Virtanen wrote: >> Tue, 22 Jul 2008 00:34:46 -0600, Charles R Harris wrote: >>> >>> FAIL: test_umath.TestC99.test_clog(, (-0.0, -0.0), (-inf, >>> -0.0), 'divide') >>> AssertionError: ('(-inf, 3.1415926535897931)', '(-inf, 0.0)') >> >> The interesting thing is that there is no test like this in >> test_umath.py. The closest thing is the second test in test_clog, which >> is (+0., 0.), (-inf, 0.). Does this vanish if you comment it out? > > I think it's a weirdness in the way that Python can handle literals in > the same statement. If so, it seems like a platform-dependent quirk: Python 2.5.2 (r252:60911, Apr 21 2008, 11:12:42) [GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2 >>> (-0.0, 0.0) (-0.0, 0.0) Anyway, I marked the probable culprit as skipped for now. -- Pauli Virtanen From robert.kern at gmail.com Tue Jul 22 16:43:03 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 22 Jul 2008 15:43:03 -0500 Subject: [Numpy-discussion] Corner case complex log error. In-Reply-To: References: <3d375d730807221313j42b71d63i4dc8dcc56087a66@mail.gmail.com> Message-ID: <3d375d730807221343u52291307s2500207e4d3edf8f@mail.gmail.com> On Tue, Jul 22, 2008 at 15:21, Pauli Virtanen wrote: > Tue, 22 Jul 2008 15:13:09 -0500, Robert Kern wrote: > >> On Tue, Jul 22, 2008 at 14:57, Pauli Virtanen wrote: >>> Tue, 22 Jul 2008 00:34:46 -0600, Charles R Harris wrote: >>>> >>>> FAIL: test_umath.TestC99.test_clog(, (-0.0, -0.0), (-inf, >>>> -0.0), 'divide') >>>> AssertionError: ('(-inf, 3.1415926535897931)', '(-inf, 0.0)') >>> >>> The interesting thing is that there is no test like this in >>> test_umath.py. The closest thing is the second test in test_clog, which >>> is (+0., 0.), (-inf, 0.). Does this vanish if you comment it out? >> >> I think it's a weirdness in the way that Python can handle literals in >> the same statement. > > If so, it seems like a platform-dependent quirk: > > Python 2.5.2 (r252:60911, Apr 21 2008, 11:12:42) > [GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2 > >>>> (-0.0, 0.0) > (-0.0, 0.0) > > Anyway, I marked the probable culprit as skipped for now. We can define negzero=-0.0 at the top of the file and always use that. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Tue Jul 22 16:49:23 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Jul 2008 14:49:23 -0600 Subject: [Numpy-discussion] Corner case complex log error. In-Reply-To: References: <3d375d730807221313j42b71d63i4dc8dcc56087a66@mail.gmail.com> Message-ID: On Tue, Jul 22, 2008 at 2:21 PM, Pauli Virtanen wrote: > Tue, 22 Jul 2008 15:13:09 -0500, Robert Kern wrote: > > > On Tue, Jul 22, 2008 at 14:57, Pauli Virtanen wrote: > >> Tue, 22 Jul 2008 00:34:46 -0600, Charles R Harris wrote: > >>> > >>> FAIL: test_umath.TestC99.test_clog(, (-0.0, -0.0), (-inf, > >>> -0.0), 'divide') > >>> AssertionError: ('(-inf, 3.1415926535897931)', '(-inf, 0.0)') > >> > >> The interesting thing is that there is no test like this in > >> test_umath.py. 
The closest thing is the second test in test_clog, which > >> is (+0., 0.), (-inf, 0.). Does this vanish if you comment it out? > > > > I think it's a weirdness in the way that Python can handle literals in > > the same statement. > > If so, it seems like a platform-dependent quirk: > > Python 2.5.2 (r252:60911, Apr 21 2008, 11:12:42) > [GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2 > > >>> (-0.0, 0.0) > (-0.0, 0.0) > > Anyway, I marked the probable culprit as skipped for now. > Or maybe a Python bug Python 2.5.1 (r251:54863, Jun 15 2008, 18:24:51) [GCC 4.3.0 20080428 (Red Hat 4.3.0-8)] on linux2 It doesn't show up in Python 2.3.7 either. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Tue Jul 22 16:48:25 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 22 Jul 2008 16:48:25 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: Message-ID: <200807221648.26021.pgmdevlist@gmail.com> All Could anybody working on the 1.1.x branch test r5507 ? I just backported a bugfix from 1.2, and I'd like to make sure that 1. it doesn't break anything (I can't see why it should, but), 2. I can close the ticket (#857) Thx a lot in advance ! From charlesr.harris at gmail.com Tue Jul 22 17:10:57 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Jul 2008 15:10:57 -0600 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <200807221648.26021.pgmdevlist@gmail.com> References: <200807221648.26021.pgmdevlist@gmail.com> Message-ID: On Tue, Jul 22, 2008 at 2:48 PM, Pierre GM wrote: > All > Could anybody working on the 1.1.x branch test r5507 ? I just backported a > bugfix from 1.2, and I'd like to make sure that 1. it doesn't break > anything > (I can't see why it should, but), 2. I can close the ticket (#857) > Thx a lot in advance ! > __ > Runs on both Python 2.5.1 and 2.3.7 here. I'll run the buildbots. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Jul 22 17:15:48 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Jul 2008 15:15:48 -0600 Subject: [Numpy-discussion] Removing some warnings from numpy.i In-Reply-To: References: Message-ID: On Thu, Jul 17, 2008 at 2:04 PM, Matthieu Brucher < matthieu.brucher at gmail.com> wrote: > Hi, > > I've enclosed a patch for numpy.i (against the trunk). Its goal is to > add const char* > instead of char* in some functions (pytype_string and > typecode_string). The char* use raises some warnings in GCC 4.2.3 (and > it is indeed not type safe IMHO). > > Matthieu > -- Hi Matthieu, Open a ticket so this doesn't get lost. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Tue Jul 22 17:04:06 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 22 Jul 2008 17:04:06 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: <200807221648.26021.pgmdevlist@gmail.com> Message-ID: <200807221704.06214.pgmdevlist@gmail.com> On Tuesday 22 July 2008 17:10:57 Charles R Harris wrote: > Runs on both Python 2.5.1 and 2.3.7 here. I'll run the buildbots. Thx a lot, I'll close the ticket.
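Returning to the corner-case complex log thread above, a minimal sketch of the negative-zero quirk: CPython interpreters of this vintage can share one constants-table slot between equal float literals in a single code object, and since 0.0 == -0.0 a later positive literal may silently come out negative. math.atan2 is sign-sensitive, so it shows what each zero really is:

import math

t = (-0.0, 0.0)
print t                        # affected builds print (-0.0, -0.0)
print [math.atan2(0., x) for x in t]
# an unaffected build prints [3.1415926535897931, 0.0]; an affected one
# prints pi twice, because the second literal also became negative zero

negzero = -0.0   # the workaround suggested above: define it once, reuse the name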
From charlesr.harris at gmail.com Tue Jul 22 17:23:29 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Jul 2008 15:23:29 -0600 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <200807221704.06214.pgmdevlist@gmail.com> References: <200807221648.26021.pgmdevlist@gmail.com> <200807221704.06214.pgmdevlist@gmail.com> Message-ID: On Tue, Jul 22, 2008 at 3:04 PM, Pierre GM wrote: > On Tuesday 22 July 2008 17:10:57 Charles R Harris wrote: > > > Runs on both Python 2.5.1 and 2.3.7 here. I'll run the buildbots. > > Thx a lot, I'll close the ticket. > Any more changes in the pipeline? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.mcintyre at gmail.com Tue Jul 22 17:24:27 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Tue, 22 Jul 2008 17:24:27 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <200807221648.26021.pgmdevlist@gmail.com> References: <200807221648.26021.pgmdevlist@gmail.com> Message-ID: <1d36917a0807221424y3a852ddblb9f0f0c4b3ef775c@mail.gmail.com> On Tue, Jul 22, 2008 at 4:48 PM, Pierre GM wrote: > Could anybody working on the 1.1.x branch test r5507 ? I just backported a > bugfix from 1.2, and I'd like to make sure that 1. it doesn't break anything > (I can't see why it should, but), 2. I can close the ticket (#857) > Thx a lot in advance ! 1.1.x runs with no errors against 2.3.7, 2.4.5, and 2.5.3 for me on Linux. From pgmdevlist at gmail.com Tue Jul 22 17:21:57 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 22 Jul 2008 17:21:57 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: References: <200807221704.06214.pgmdevlist@gmail.com> Message-ID: <200807221721.57190.pgmdevlist@gmail.com> > Any more changes in the pipeline? I hope not. I wasn't especially thrilled to get a ticket at the very last minute, but this one was definitely worth fixing ASAP... From pgmdevlist at gmail.com Tue Jul 22 17:22:53 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 22 Jul 2008 17:22:53 -0400 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <1d36917a0807221424y3a852ddblb9f0f0c4b3ef775c@mail.gmail.com> References: <200807221648.26021.pgmdevlist@gmail.com> <1d36917a0807221424y3a852ddblb9f0f0c4b3ef775c@mail.gmail.com> Message-ID: <200807221722.53672.pgmdevlist@gmail.com> On Tuesday 22 July 2008 17:24:27 Alan McIntyre wrote: > 1.1.x runs with no errors against 2.3.7, 2.4.5, and 2.5.3 for me on Linux. Thx Alan ! I didn't expect the fix to crash anything, but better safe than sorry. From charlesr.harris at gmail.com Tue Jul 22 17:44:01 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Jul 2008 15:44:01 -0600 Subject: [Numpy-discussion] 1.1.1rc1 to be tagged tonight In-Reply-To: <200807221722.53672.pgmdevlist@gmail.com> References: <200807221648.26021.pgmdevlist@gmail.com> <1d36917a0807221424y3a852ddblb9f0f0c4b3ef775c@mail.gmail.com> <200807221722.53672.pgmdevlist@gmail.com> Message-ID: On Tue, Jul 22, 2008 at 3:22 PM, Pierre GM wrote: > On Tuesday 22 July 2008 17:24:27 Alan McIntyre wrote: > > > 1.1.x runs with no errors against 2.3.7, 2.4.5, and 2.5.3 for me on > Linux. > > Thx Alan ! I didn't expect the fix to crash anything, but better safe than > sorry. > _____ > Apropos nothing much, but I can definitely recommend gcc 4.3. It's *much* faster. So much so that I wasn't even sure the build was working correctly ;) YMMV. 
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From neilcrighton at gmail.com Tue Jul 22 18:25:23 2008 From: neilcrighton at gmail.com (Neil Crighton) Date: Tue, 22 Jul 2008 22:25:23 +0000 (UTC) Subject: [Numpy-discussion] Reference guide updated Message-ID: > A new copy of the reference guide is now available at > http://mentat.za.net/numpy/refguide/ Is there a reason why there's so much vertical space between all of the text sections? I find the docstrings much easier to read in the editor: http://sd-2116.dedibox.fr/pydocweb/doc/numpy.core.fromnumeric.all/ than in the reference guide: http://mentat.za.net/numpy/refguide/routines.logic.xhtml Neil From pgmdevlist at gmail.com Tue Jul 22 18:43:02 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 22 Jul 2008 18:43:02 -0400 Subject: [Numpy-discussion] Reference guide updated In-Reply-To: References: Message-ID: <200807221843.03016.pgmdevlist@gmail.com> On Tuesday 22 July 2008 18:25:23 Neil Crighton wrote: > > A new copy of the reference guide is now available at > > http://mentat.za.net/numpy/refguide/ > > Is there a reason why there's so much vertical space between all of the > text sections? I find the docstrings much easier to read in the editor: Roughly: * In the editor, you have one function per page, vs several functions per page in the reference * in the editor, the blocks 'Parameters', 'Returns'... are considered as sections, while in the reference, they are Field lists (roughly). In the end, it's only a matter of taste, of course. But you raise an interesting point, we should provide some kind of options to choose between behaviors. Please note as well that the reference guide is a work in progress: you're more than welcome to join and work with us. From jh at physics.ucf.edu Tue Jul 22 18:59:19 2008 From: jh at physics.ucf.edu (Joe Harrington) Date: Tue, 22 Jul 2008 18:59:19 -0400 Subject: [Numpy-discussion] Schedule for 1.2.0 Message-ID: Hi Jarrod, I'm just catching up on my numpy lists and I caught this; sorry for the late reply! > Another issue that we should address is whether it is OK to postpone > the planned API changes to histogram and median. A couple of people > have mentioned to me that they would like to delay the API changes to > 1.3, which seems reasonable to me. If anyone would prefer that we > make the planned API changes for histogram and median in 1.2, please > speak now. I *strongly* want both these changes for 1.2, as I am sure do the many people teaching courses using numpy for the fall. It is hard to get students to understand why there are inconsistencies and irrationalities in software, and it's even worse when it's open-source, since somehow it's the lecturer's fault that he picked a package that isn't right in some major way. Worse, we're changing these behaviors like 6 months from now, so students will have to learn it wrong and code it wrong, and then their code may break on top of it. On behalf of this year's new students and their instructors, I ask you to keep these changes in the release as planned. Thanks, --jh-- Prof. Joseph Harrington Department of Physics MAP 414 4000 Central Florida Blvd.
University of Central Florida Orlando, FL 32816-2385 From timmichelsen at gmx-topmail.de Tue Jul 22 19:04:19 2008 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Wed, 23 Jul 2008 01:04:19 +0200 Subject: [Numpy-discussion] Reference guide updated In-Reply-To: <9457e7c80807210550h1b8966f0k67deefe3ecbbe38c@mail.gmail.com> References: <9457e7c80807210550h1b8966f0k67deefe3ecbbe38c@mail.gmail.com> Message-ID: > A new copy of the reference guide is now available at > > http://mentat.za.net/numpy/refguide/ Very nice. Will this be included in the main numpy distribution upon completion? Thanks and appreciation for efforts. Kind regards, Timmie From alan.mcintyre at gmail.com Wed Jul 23 09:22:50 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 23 Jul 2008 09:22:50 -0400 Subject: [Numpy-discussion] Benchmarking code Message-ID: <1d36917a0807230622l2e53b3d6h8e6d1af006884d20@mail.gmail.com> There's a function (_test_unique1d_speed) in numpy/lib/arraysetops.py that looks to me like it should be in a benchmark suite instead of in the library module. Would anyone mind if I moved it to numpy/lib/benchmarks? From stefan at sun.ac.za Wed Jul 23 09:23:09 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 23 Jul 2008 15:23:09 +0200 Subject: [Numpy-discussion] Representation of array subclasses Message-ID: <9457e7c80807230623s560545fdg442439b40d17b8af@mail.gmail.com> Hi all, I noticed that subclasses are not represented correctly: In [8]: np.chararray([1, 2, 3]) Out[8]: chararray([[['\x03', '', ''], ['\xc0', '\x03', '']]], dtype='|S1') Notice how the indentation of the second row is completely wrong. I tried to fix this in array_repr_builtin (arrayobject.c:4318), but the fix had no impact. `array_repr_builtin` is not called unless preceded by np.set_string_function(None, repr=1) Does anybody know where the right place is to fix the representation? Thanks St?fan From stefan at sun.ac.za Wed Jul 23 09:40:13 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 23 Jul 2008 15:40:13 +0200 Subject: [Numpy-discussion] Benchmarking code In-Reply-To: <1d36917a0807230622l2e53b3d6h8e6d1af006884d20@mail.gmail.com> References: <1d36917a0807230622l2e53b3d6h8e6d1af006884d20@mail.gmail.com> Message-ID: <9457e7c80807230640v4f546808r3302ef63b057696a@mail.gmail.com> 2008/7/23 Alan McIntyre : > There's a function (_test_unique1d_speed) in numpy/lib/arraysetops.py > that looks to me like it should be in a benchmark suite instead of in > the library module. Would anyone mind if I moved it to > numpy/lib/benchmarks? No, please go ahead. Cheers St?fan From david.huard at gmail.com Wed Jul 23 09:46:15 2008 From: david.huard at gmail.com (David Huard) Date: Wed, 23 Jul 2008 09:46:15 -0400 Subject: [Numpy-discussion] Schedule for 1.2.0 In-Reply-To: References: Message-ID: <91cf711d0807230646y52690010r90c84c521c85926@mail.gmail.com> I think we should stick to what has been agreed and announced months ago. It's called honouring our commitments and the project's image depends on it. If the inconvenience of these API changes is worth the trouble, a 1.1.2 release could be considered. My two cents. David 2008/7/22 Joe Harrington : > Hi Jarrod, > > I'm just catching up on my numpy lists and I caught this; sorry for > the late reply! > >> Another issue that we should address is whether it is OK to postpone >> the planned API changes to histogram and median. 
A couple of people >> have mentioned to me that they would like to delay the API changes to >> 1.3, which seems reasonable to me. If anyone would prefer that we >> make the planned API changes for histogram and median in 1.2, please >> speak now. > > I *strongly* want both these changes for 1.2, as I am sure do the many > people teaching courses using numpy for the fall. It is hard to get > students to understand why there are inconsistencies and > irrationalities in software, and it's even worse when it's > open-source, since somehow it's the lecturer's fault that he picked a > package that isn't right in some major way. Worse, we're changing > these behaviors like 6 months from now, so students will have to learn > it wrong and code it wrong, and then their code may break on top of > it. On behalf of this year's new students and their instructors, I > ask you to keep these changes in the release as planned. > > Thanks, > > --jh-- > Prof. Joseph Harrington > Department of Physics > MAP 414 > 4000 Central Florida Blvd. > University of Central Florida > Orlando, FL 32816-2385 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From alan.mcintyre at gmail.com Wed Jul 23 09:55:05 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 23 Jul 2008 09:55:05 -0400 Subject: [Numpy-discussion] Benchmarking code In-Reply-To: <9457e7c80807230640v4f546808r3302ef63b057696a@mail.gmail.com> References: <1d36917a0807230622l2e53b3d6h8e6d1af006884d20@mail.gmail.com> <9457e7c80807230640v4f546808r3302ef63b057696a@mail.gmail.com> Message-ID: <1d36917a0807230655s3407eb48s28ea0b690a79312b@mail.gmail.com> On Wed, Jul 23, 2008 at 9:40 AM, St?fan van der Walt wrote: > 2008/7/23 Alan McIntyre : >> There's a function (_test_unique1d_speed) in numpy/lib/arraysetops.py >> that looks to me like it should be in a benchmark suite instead of in >> the library module. Would anyone mind if I moved it to >> numpy/lib/benchmarks? > > No, please go ahead. Thanks, done. From stefan at sun.ac.za Wed Jul 23 09:57:29 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 23 Jul 2008 15:57:29 +0200 Subject: [Numpy-discussion] Documenting chararrays Message-ID: <9457e7c80807230657r6e2e198dp18accc42acb60e13@mail.gmail.com> Hi all, Should we document character arrays? Does anybody still use them? I think their behaviour can largely be duplicated by object arrays. 
They also seem to be broken: >>> x = np.array(['1', '2', '3', '4']).view(np.chararray) >>> x*3 chararray(['111', '222', '333', '444'], dtype='|S4') All good, but: >>> x = np.array(['12', '34', '56', '78']).view(np.chararray) >>> x * 3 chararray(['1212', '3434', '5656', '7878'], dtype='|S4') Whereas with object arrays: >>> np.array(['12', '34', '56', '78'], dtype=object) * 3 array([121212, 343434, 565656, 787878], dtype=object) Similarly: >>> x = np.array(['a','b','c','d']).view(np.chararray) >>> x.rjust(3) chararray([' a', ' b', ' c', ' d'], dtype='|S3') But then >>> x = np.array([['ab','cd'], ['ef','gh']]).view(np.chararray) >>> x.rjust(5) # BOOM Cheers Stéfan From alan.mcintyre at gmail.com Wed Jul 23 10:07:34 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 23 Jul 2008 10:07:34 -0400 Subject: [Numpy-discussion] Documenting chararrays In-Reply-To: <9457e7c80807230657r6e2e198dp18accc42acb60e13@mail.gmail.com> References: <9457e7c80807230657r6e2e198dp18accc42acb60e13@mail.gmail.com> Message-ID: <1d36917a0807230707r74e6ea6chf35d674b25a2cdf0@mail.gmail.com> On Wed, Jul 23, 2008 at 9:57 AM, Stéfan van der Walt wrote: > Hi all, > > Should we document character arrays? Does anybody still use them? > > I think their behaviour can largely be duplicated by object arrays. > They also seem to be broken: FWIW, I've got issues and patches for a couple of chararray problems: __mul__ problem: http://scipy.org/scipy/numpy/ticket/855 __mod__ problem: http://scipy.org/scipy/numpy/ticket/856 I was wondering myself when I looked at the class whether anybody was actually using it. ;) If it's to be replaced, we should probably remove it from the compiled extension wishlist on http://projects.scipy.org/scipy/numpy/wiki/ProjectIdeas. From chanley at stsci.edu Wed Jul 23 10:16:57 2008 From: chanley at stsci.edu (Christopher Hanley) Date: Wed, 23 Jul 2008 10:16:57 -0400 Subject: [Numpy-discussion] Documenting chararrays In-Reply-To: <9457e7c80807230657r6e2e198dp18accc42acb60e13@mail.gmail.com> References: <9457e7c80807230657r6e2e198dp18accc42acb60e13@mail.gmail.com> Message-ID: <48873D59.1030905@stsci.edu> Stéfan van der Walt wrote: > Hi all, > > Should we document character arrays? Does anybody still use them? > > I think their behaviour can largely be duplicated by object arrays. > They also seem to be broken: > >>>> x = np.array(['1', '2', '3', '4']).view(np.chararray) > >>>> x*3 > chararray(['111', '222', '333', '444'], > dtype='|S4') > > All good, but: > >>>> x = np.array(['12', '34', '56', '78']).view(np.chararray) > >>>> x * 3 > chararray(['1212', '3434', '5656', '7878'], > dtype='|S4') > > Whereas with object arrays: > >>>> np.array(['12', '34', '56', '78'], dtype=object) * 3 > array([121212, 343434, 565656, 787878], dtype=object) > > Similarly: > >>>> x = np.array(['a','b','c','d']).view(np.chararray) > >>>> x.rjust(3) > chararray([' a', ' b', ' c', ' d'], > dtype='|S3') > > But then > >>>> x = np.array([['ab','cd'], ['ef','gh']]).view(np.chararray) >>>> x.rjust(5) # BOOM > > Cheers > Stéfan > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion I know that chararrays are used by pyfits.
Chris -- Christopher Hanley Systems Software Engineer Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21218 (410) 338-4338 From suchindra at gmail.com Wed Jul 23 10:48:55 2008 From: suchindra at gmail.com (Suchindra Sandhu) Date: Wed, 23 Jul 2008 10:48:55 -0400 Subject: [Numpy-discussion] integer array creation oddity In-Reply-To: References: <9457e7c80807181610j64c0e2a5s78f2b7a71996148e@mail.gmail.com> <9457e7c80807211437j72102a33n58d7e7deefbc54c1@mail.gmail.com> Message-ID: Thanks Everyone. On Mon, Jul 21, 2008 at 6:25 PM, Charles R Harris wrote: > > > On Mon, Jul 21, 2008 at 3:37 PM, St?fan van der Walt > wrote: > >> 2008/7/21 Suchindra Sandhu : >> > Is that the recommended way of checking the type of the array? Ususally >> for >> > type checkin, I use the isinstance built-in in python, but I see that >> will >> > not work in this case. I must admit that I am a little confused by this. >> Why >> > is type different from dtype? >> >> Data-types contain additional information needed to lay out numerical >> types in memory, such as byte-order and bit-width. Each data-type has >> an associated Python type, which tells you the type of scalars in an >> array of that dtype. For example, here are two NumPy data-types that >> are not equal: >> >> In [6]: d1 = np.dtype(int).newbyteorder('>') >> In [7]: d2 = np.dtype(int).newbyteorder('<') >> >> In [8]: d1.type >> Out[8]: >> >> In [9]: d2.type >> Out[9]: >> >> In [10]: d1 == d2 >> Out[10]: False >> >> I don't know why there is more than one int32 type (I would guess it >> has something to do with the way types are detected upon build; maybe >> Robert or Travis could tell you more). >> > > They correspond to two C types of the same size, int and long. On 64 bit > systems you should have two int64 types, long and longlong. > > In [1]: dtype('i').name > Out[1]: 'int32' > > In [2]: dtype('l').name > Out[2]: 'int32' > > Chuck > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From millman at berkeley.edu Wed Jul 23 11:41:04 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 23 Jul 2008 08:41:04 -0700 Subject: [Numpy-discussion] Schedule for 1.2.0 In-Reply-To: <91cf711d0807230646y52690010r90c84c521c85926@mail.gmail.com> References: <91cf711d0807230646y52690010r90c84c521c85926@mail.gmail.com> Message-ID: On Wed, Jul 23, 2008 at 6:46 AM, David Huard wrote: > I think we should stick to what has been agreed and announced months ago. > It's called honouring our commitments and the project's image depends on it. > > If the inconvenience of these API changes is worth the trouble, a 1.1.2 release > could be considered. +1 I would also like to stick with the original plans and make the changes in 1.2.0. I also think that given the fact that 1.1.1 will include a large number of bugfixes from the trunk the concerns about making the API change aren't as pronounced. I also agree that if it becomes an issue that releasing 1.1.2 might be a reasonable response. If anyone still thinks that we should wait until 1.3 to make the changes, now is the time to speak up. Otherwise, let's plan to make the changes on the trunk by the end of the week. 
-- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From fperez.net at gmail.com Wed Jul 23 17:50:23 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 23 Jul 2008 14:50:23 -0700 Subject: [Numpy-discussion] f2py - a recap In-Reply-To: References: Message-ID: Hi all, I'm just reposting here to see if anyone with a stake in f2py has an opinion/advice on the points below. F2py feels very much in autopilot/drifting into the icebergs mode right now. Is that correct assessment? If there's any guidance on where to go, I can at least file tickets on these points, but I don't want to create unnecessary tickets on trac if others feel the current situation is satisfactory and it's just me who is confused. Cheers, f On Fri, Jul 18, 2008 at 9:00 PM, Fernando Perez wrote: > Howdy, > > today's exercise with f2py left some lessons learned, mostly thanks to > Robert's excellent help, for which I'm grateful. > > I'd like to recap here what we have, mostly to decide what changes (if > any) should go into numpy to make the experience less "interesting" > for future users: > > - Remove the f2py_options flag from > numpy.distutils.extension.Extension? If so, do options like > '--debug_capi' get correctly passed via setup.cfg? > > This flag is potentially very confusing, because only *some* f2py > options get actually set this way, while others need to be set in > calls to config_fc. > > - How to properly set the compiler options in a setup.py file? Robert > suggested the setup.cfg file, but this doesn't get picked up unless > config_fc is explicitly called by the user: > > ./setup.py config_fc etc... > > This is perhaps a distutils problem, but I don't know if we can > solve it more cleanly. It seems to me that it should be possible to > provide a setup.py file that can be used simply as > > ./setup.py install > > (with the necessary setup.cfg file sitting next to it). I'm thinking > here of what we need to do when showing how 'easy' these tools are > for scientists migrating from matlab, for example. Obscure, special > purpose incantations tend to tarnish our message of ease :) > > - Should the 'instead' word be removed from the f2py docs regarding > the use of .pyf sources? It appears to be a mistake, which threw at > least me for a loop for a while. > > - Why does f2py in the source tree have *both* a doc/ and a docs/ > directory? It's really confusing to see this. > > f2py happens to be a very important tool, not just because scipy > couldn't build without it, but also to position python as a credible > integration language for scientific work. So I hope we can make using > it as easy and robust as is technically feasible. > > Cheers, > > f > From fperez.net at gmail.com Wed Jul 23 17:56:09 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 23 Jul 2008 14:56:09 -0700 Subject: [Numpy-discussion] Instructions on building from source Message-ID: Howdy, I was just trying to explain to a new user how to build numpy from source on ubuntu and I realized that there's not much info on this front in the source tree. Scipy has a nice INSTALL.txt that even lists the names of the debian/ubuntu packages needed for the build (which can be a useful guide on other distros). Should we have a stripped-down copy of this doc somewhere in the top-level directory of numpy? 
It's also disconcerting for a new user that there's no visible
documentation directory at the top; perhaps listing the existence of
numpy/doc in the readme could help.

I know there's plenty of work being done right now on the docs, so
perhaps sprucing up the 'out of the tarball' experience for those
building from source could be done with a very reasonable amount of
effort.

Cheers,

f

From robert.kern at gmail.com  Wed Jul 23 18:02:02 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 23 Jul 2008 17:02:02 -0500
Subject: [Numpy-discussion] Instructions on building from source
In-Reply-To:
References:
Message-ID: <3d375d730807231502k6939d55amd4c954c87f8ad71b@mail.gmail.com>

On Wed, Jul 23, 2008 at 16:56, Fernando Perez wrote:
> Howdy,
>
> I was just trying to explain to a new user how to build numpy from
> source on ubuntu and I realized that there's not much info on this
> front in the source tree.  Scipy has a nice INSTALL.txt that even
> lists the names of the debian/ubuntu packages needed for the build
> (which can be a useful guide on other distros).  Should we have a
> stripped-down copy of this doc somewhere in the top-level directory of
> numpy?

Yes.

> It's also disconcerting for a new user that there's no visible
> documentation directory at the top, perhaps listing the existence of
> numpy/doc in the readme could help.

Yes. I'm not sure why they're in numpy/doc/ to begin with.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From strawman at astraw.com  Wed Jul 23 18:13:02 2008
From: strawman at astraw.com (Andrew Straw)
Date: Wed, 23 Jul 2008 15:13:02 -0700
Subject: [Numpy-discussion] Instructions on building from source
In-Reply-To: <3d375d730807231502k6939d55amd4c954c87f8ad71b@mail.gmail.com>
References: <3d375d730807231502k6939d55amd4c954c87f8ad71b@mail.gmail.com>
Message-ID: <4887ACEE.4050101@astraw.com>

Robert Kern wrote:
> On Wed, Jul 23, 2008 at 16:56, Fernando Perez wrote:
>> Howdy,
>>
>> I was just trying to explain to a new user how to build numpy from
>> source on ubuntu and I realized that there's not much info on this
>> front in the source tree.  Scipy has a nice INSTALL.txt that even
>> lists the names of the debian/ubuntu packages needed for the build
>> (which can be a useful guide on other distros).  Should we have a
>> stripped-down copy of this doc somewhere in the top-level directory of
>> numpy?
>

Just for reference, you can find the build dependencies of any Debian
source package by looking at its .dsc file. For numpy, that can be found
at http://packages.debian.org/sid/python-numpy

Currently (version 1.1.0, debian version 1:1.1.0-3), that list is:

Build-Depends: cdbs (>= 0.4.43), python-all-dev, python-all-dbg,
python-central (>= 0.6), gfortran (>= 4:4.2), libblas-dev [!arm !m68k],
liblapack-dev [!arm !m68k], debhelper (>= 5.0.38), patchutils,
python-docutils, libfftw3-dev

Build-Conflicts: atlas3-base-dev, libatlas-3dnow-dev, libatlas-base-dev,
libatlas-headers, libatlas-sse-dev, libatlas-sse2-dev

Some of that stuff (cdbs, debhelper, patchutils) is specific to the
Debian build process and wouldn't be necessary for simply compiling
numpy itself. And on a Debian (derivative) system, you can install those
with "apt-get build-dep python-numpy".
This will only install the build dependencies for the version of python-numpy which is listed in your apt sources.list, but 99% of the time, that should be sufficient. -Andrew From neilcrighton at gmail.com Wed Jul 23 18:16:57 2008 From: neilcrighton at gmail.com (Neil Crighton) Date: Wed, 23 Jul 2008 23:16:57 +0100 Subject: [Numpy-discussion] Reference guide updated Message-ID: <63751c30807231516q5152a4cqb437dbf29c40158a@mail.gmail.com> Ok, thanks. I meant the amount of vertical space between lines of text - for example, the gaps between parameter values and their description, or the large spacing between both lines of text and and the text boxes in the examples section. If other people agree it's a problem, I thought the spacing could be tweaked. It's not a problem that there's more than one function per page. I have been helping out with the docs (where I feel I'm qualified enough - I'm no numpy expert!). I think it will make numpy much easier to learn to have easily-accessible, comprehensive docstrings, and the documentation editor makes it very easy to contribute. I'm also learning a lot reading other people's docstrings :) > > It there a reason why there's so much vertical space between all of the > > text sections? I find the docstrings much easier to read in the editor: > > Roughly: > * In the editor, you have one function per page, vs several function per page > in the reference > * in the editor, the blocks 'Parameters', 'Returns'... are considered as > sections, while in the reference, they are Field lists (roughly). > > In the end, it's only a matter of taste, of course. But you raise an > interesting point, we should provide some kind of options to choose between > behaviors. > > Please note as well that the reference guide is a work in progress: you're > more than welcome to join and work with us. From stefan at sun.ac.za Wed Jul 23 18:18:41 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 24 Jul 2008 00:18:41 +0200 Subject: [Numpy-discussion] f2py - a recap In-Reply-To: References: Message-ID: <9457e7c80807231518u579279f2kf8d5510bcc843c2f@mail.gmail.com> 2008/7/23 Fernando Perez : > I'm just reposting here to see if anyone with a stake in f2py has an > opinion/advice on the points below. F2py feels very much in > autopilot/drifting into the icebergs mode right now. Is that correct > assessment? > > If there's any guidance on where to go, I can at least file tickets on > these points, but I don't want to create unnecessary tickets on trac > if others feel the current situation is satisfactory and it's just me > who is confused. As far as I understand, Pearu is busy developing g3 of f2py. Does that mean that f2py in NumPy is effectively unmaintained? I hope not. I agree (with your previous e-mail) that it would be good to have some documentation, so if you could give me some pointers on *what* to document (I haven't used f2py much), then I'll try my best to get around to it. 
Cheers
Stéfan

From robert.kern at gmail.com  Wed Jul 23 18:27:57 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 23 Jul 2008 17:27:57 -0500
Subject: [Numpy-discussion] f2py - a recap
In-Reply-To: <9457e7c80807231518u579279f2kf8d5510bcc843c2f@mail.gmail.com>
References: <9457e7c80807231518u579279f2kf8d5510bcc843c2f@mail.gmail.com>
Message-ID: <3d375d730807231527g275ab47bx44a07580077b0a57@mail.gmail.com>

On Wed, Jul 23, 2008 at 17:18, Stéfan van der Walt wrote:
> 2008/7/23 Fernando Perez :
>> I'm just reposting here to see if anyone with a stake in f2py has an
>> opinion/advice on the points below.  F2py feels very much in
>> autopilot/drifting into the icebergs mode right now.  Is that correct
>> assessment?
>>
>> If there's any guidance on where to go, I can at least file tickets on
>> these points, but I don't want to create unnecessary tickets on trac
>> if others feel the current situation is satisfactory and it's just me
>> who is confused.
>
> As far as I understand, Pearu is busy developing g3 of f2py.

I think he's been busy doing other things for the past few months, at least.

> Does
> that mean that f2py in NumPy is effectively unmaintained?  I hope not.

We'll try to fix bugs as they come up, but this is, as always, subject
to the vagaries of free time. It is very unlikely to grow new features.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From fperez.net at gmail.com  Wed Jul 23 18:46:17 2008
From: fperez.net at gmail.com (Fernando Perez)
Date: Wed, 23 Jul 2008 15:46:17 -0700
Subject: [Numpy-discussion] f2py - a recap
In-Reply-To: <9457e7c80807231518u579279f2kf8d5510bcc843c2f@mail.gmail.com>
References: <9457e7c80807231518u579279f2kf8d5510bcc843c2f@mail.gmail.com>
Message-ID:

Howdy,

On Wed, Jul 23, 2008 at 3:18 PM, Stéfan van der Walt wrote:
> 2008/7/23 Fernando Perez :

> I agree (with your previous e-mail) that it would be good to have some
> documentation, so if you could give me some pointers on *what* to
> document (I haven't used f2py much), then I'll try my best to get
> around to it.

Well, I think my 'recap' message earlier in this thread points to a
few issues that can probably be addressed quickly (the 'instead' error
in the help, the doc/docs dichotomy needs to be cleaned up so a single
documentation directory exists, etc).  I'm also attaching a set of
very old notes I wrote years ago on f2py that you are free to use in
any way you see fit.  I gave them a 2-minute rst treatment but didn't
edit them at all, so they may be somewhat outdated (I originally wrote
them in 2002 I think).

If Pearu has moved to greener pastures, f2py could certainly use an
adoptive parent.  It happens to be a really important piece of
infrastructure and for the most part it works fairly well.  I think
a little bit of cleanup/doc integration with the rest of numpy is
probably all that's needed, so it could be a good project for someone
to adopt that would potentially be low-demand yet quite useful.

Cheers,

f

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: f2py.txt URL: From fperez.net at gmail.com Wed Jul 23 19:10:30 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 23 Jul 2008 16:10:30 -0700 Subject: [Numpy-discussion] Instructions on building from source In-Reply-To: <3d375d730807231502k6939d55amd4c954c87f8ad71b@mail.gmail.com> References: <3d375d730807231502k6939d55amd4c954c87f8ad71b@mail.gmail.com> Message-ID: On Wed, Jul 23, 2008 at 3:02 PM, Robert Kern wrote: > On Wed, Jul 23, 2008 at 16:56, Fernando Perez wrote: >> (which can be a useful guide on other distros). Should we have a >> stripped-down copy of this doc somewhere in the top-level directory of >> numpy? > > Yes. > >> It's also disconcerting for a new user that there's no visible >> documentation directory at the top, perhaps listing the existence of >> numpy/doc in the readme could help. > > Yes. I'm not sure why they're in numpy/doc/ to begin with. OK, thanks for the feedback. I'm not about to touch doc stuff when the core team is deep in the middle of the doc marathon, but this may provide them useful guidance to clean things up a bit. cheers, f From fperez.net at gmail.com Wed Jul 23 19:11:36 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 23 Jul 2008 16:11:36 -0700 Subject: [Numpy-discussion] Instructions on building from source In-Reply-To: <4887ACEE.4050101@astraw.com> References: <3d375d730807231502k6939d55amd4c954c87f8ad71b@mail.gmail.com> <4887ACEE.4050101@astraw.com> Message-ID: On Wed, Jul 23, 2008 at 3:13 PM, Andrew Straw wrote: > And on a Debian (derivative) system, you can stall those with "apt-get > build-dep python-numpy". This will only install the build dependencies > for the version of python-numpy which is listed in your apt > sources.list, but 99% of the time, that should be sufficient. Very, very useful. Many thanks for this tip, Andrew! Cheers, f From stefan at sun.ac.za Wed Jul 23 21:10:46 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 24 Jul 2008 03:10:46 +0200 Subject: [Numpy-discussion] Reference guide updated In-Reply-To: <63751c30807231516q5152a4cqb437dbf29c40158a@mail.gmail.com> References: <63751c30807231516q5152a4cqb437dbf29c40158a@mail.gmail.com> Message-ID: <9457e7c80807231810n6f6c26aq3e8827d9c9f99db5@mail.gmail.com> 2008/7/24 Neil Crighton : > Ok, thanks. > > I meant the amount of vertical space between lines of text - for > example, the gaps between parameter values and their description, or > the large spacing between both lines of text and and the text boxes in > the examples section. If other people agree it's a problem, I thought > the spacing could be tweaked. It's not a problem that there's more > than one function per page. Feel free to play around with the stylesheet. Download the tarball of the docs here: http://mentat.za.net/numpy/refguide.tar.gz And edit default.css until it looks good. When done, please send me the changes! Cheers St?fan From david at ar.media.kyoto-u.ac.jp Wed Jul 23 23:59:23 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 24 Jul 2008 12:59:23 +0900 Subject: [Numpy-discussion] Instructions on building from source In-Reply-To: References: <3d375d730807231502k6939d55amd4c954c87f8ad71b@mail.gmail.com> <4887ACEE.4050101@astraw.com> Message-ID: <4887FE1B.5020307@ar.media.kyoto-u.ac.jp> Fernando Perez wrote: > > Very, very useful. Many thanks for this tip, Andrew! > Hi Fernando, I am still catching up things (have been on holidays for 2 weeks), but have you started the INSTALL.txt document ? 
cheers, David From fperez.net at gmail.com Thu Jul 24 00:20:02 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 23 Jul 2008 21:20:02 -0700 Subject: [Numpy-discussion] Instructions on building from source In-Reply-To: <4887FE1B.5020307@ar.media.kyoto-u.ac.jp> References: <3d375d730807231502k6939d55amd4c954c87f8ad71b@mail.gmail.com> <4887ACEE.4050101@astraw.com> <4887FE1B.5020307@ar.media.kyoto-u.ac.jp> Message-ID: On Wed, Jul 23, 2008 at 8:59 PM, David Cournapeau wrote: > Fernando Perez wrote: >> >> Very, very useful. Many thanks for this tip, Andrew! >> > > Hi Fernando, > > I am still catching up things (have been on holidays for 2 weeks), > but have you started the INSTALL.txt document ? No, and because I'm triple-swamped pushing an ipython release (while at a full-time workshop), I likely won't. I've been trying to report things I see as I go on numpy/scipy, but realistically my available time needs to go into ipython. So knock yourself out on that one, it will be much appreciated by myself and I'm sure others. Regards, f From efiring at hawaii.edu Thu Jul 24 03:15:58 2008 From: efiring at hawaii.edu (Eric Firing) Date: Wed, 23 Jul 2008 21:15:58 -1000 Subject: [Numpy-discussion] Instructions on building from source In-Reply-To: <4887ACEE.4050101@astraw.com> References: <3d375d730807231502k6939d55amd4c954c87f8ad71b@mail.gmail.com> <4887ACEE.4050101@astraw.com> Message-ID: <48882C2E.9050805@hawaii.edu> Andrew Straw wrote: > Just for reference, you can find the build dependencies of any Debian > source package by looking at its .dsc file. For numpy, that can be found > at http://packages.debian.org/sid/python-numpy > > Currently (version 1.1.0, debian version 1:1.1.0-3), that list is: > > Build-Depends: cdbs (>= 0.4.43), python-all-dev, python-all-dbg, > python-central (>= 0.6), gfortran (>= 4:4.2), libblas-dev [!arm !m68k], > liblapack-dev [!arm !m68k], debhelper (>= 5.0.38), patchutils, > python-docutils, libfftw3-dev > > Build-Conflicts: atlas3-base-dev, libatlas-3dnow-dev, libatlas-base-dev, > libatlas-headers, libatlas-sse-dev, libatlas-sse2-dev Do you know why atlas is not used, and is even listed as a conflict? I have libatlas-sse2 etc. installed on ubuntu hardy, and I routinely build numpy from source. Maybe the debian specification is for lowest-common-denominator hardware? Eric From robert.kern at gmail.com Thu Jul 24 03:24:07 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 24 Jul 2008 02:24:07 -0500 Subject: [Numpy-discussion] Instructions on building from source In-Reply-To: <48882C2E.9050805@hawaii.edu> References: <3d375d730807231502k6939d55amd4c954c87f8ad71b@mail.gmail.com> <4887ACEE.4050101@astraw.com> <48882C2E.9050805@hawaii.edu> Message-ID: <3d375d730807240024n5e6782abk2aaef228d4b9f4e1@mail.gmail.com> On Thu, Jul 24, 2008 at 02:15, Eric Firing wrote: > Andrew Straw wrote: > >> Just for reference, you can find the build dependencies of any Debian >> source package by looking at its .dsc file. 
>> For numpy, that can be found
>> at http://packages.debian.org/sid/python-numpy
>>
>> Currently (version 1.1.0, debian version 1:1.1.0-3), that list is:
>>
>> Build-Depends: cdbs (>= 0.4.43), python-all-dev, python-all-dbg,
>> python-central (>= 0.6), gfortran (>= 4:4.2), libblas-dev [!arm !m68k],
>> liblapack-dev [!arm !m68k], debhelper (>= 5.0.38), patchutils,
>> python-docutils, libfftw3-dev
>>
>> Build-Conflicts: atlas3-base-dev, libatlas-3dnow-dev, libatlas-base-dev,
>> libatlas-headers, libatlas-sse-dev, libatlas-sse2-dev
>
> Do you know why atlas is not used, and is even listed as a conflict?  I
> have libatlas-sse2 etc. installed on ubuntu hardy, and I routinely build
> numpy from source.  Maybe the debian specification is for
> lowest-common-denominator hardware?

Not quite LCD, but that's close to the truth. Basically, a binary
numpy package built against liblapack-dev will work with ATLAS when it
is installed. I suspect that one built against libatlas-base-dev may
not work without ATLAS installed. It's specific packaging for
Debian-and-spawn, not a general numpy requirement. Which is one reason
why looking at a distribution's build-deps is not very useful for
inferring details about the dependencies of the upstream packages.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From jones_gl at caltech.edu  Thu Jul 24 03:21:09 2008
From: jones_gl at caltech.edu (G)
Date: Thu, 24 Jul 2008 07:21:09 +0000 (UTC)
Subject: [Numpy-discussion] =?utf-8?q?import_numpy_fails_with_multiarray?=
	=?utf-8?q?=2Eso=3A_undefined_symbol=3A_PyUnicodeUCS2=5FFromUnicode?=
Message-ID:

Hello,
I have installed the svn version of numpy. I have deleted all previous
versions of, and files related to, numpy prior to installing. I also have
tried reinstalling python from source. Regardless, when I try import
numpy, I get the following:

Python 2.5.2 (r252:60911, Jul 23 2008, 23:54:29)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.5/site-packages/numpy/__init__.py", line 93, in <module>
    import add_newdocs
  File "/usr/local/lib/python2.5/site-packages/numpy/add_newdocs.py", line 9, in <module>
    from lib import add_newdoc
  File "/usr/local/lib/python2.5/site-packages/numpy/lib/__init__.py", line 4, in <module>
    from type_check import *
  File "/usr/local/lib/python2.5/site-packages/numpy/lib/type_check.py", line 8, in <module>
    import numpy.core.numeric as _nx
  File "/usr/local/lib/python2.5/site-packages/numpy/core/__init__.py", line 5, in <module>
    import multiarray
ImportError: /usr/local/lib/python2.5/site-packages/numpy/core/multiarray.so:
undefined symbol: PyUnicodeUCS2_FromUnicode

I also tried compiling python using ./configure --enable-unicode=ucs4
with no luck.
I had python and numpy all working well but then some file got corrupted
so I was forced to reinstall, and I have not yet been able to get things
working again after two days of attempts.
Thank you, G From robert.kern at gmail.com Thu Jul 24 03:30:30 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 24 Jul 2008 02:30:30 -0500 Subject: [Numpy-discussion] import numpy fails with multiarray.so: undefined symbol: PyUnicodeUCS2_FromUnicode In-Reply-To: References: Message-ID: <3d375d730807240030s14394fccp6edfc07dcf4e59aa@mail.gmail.com> On Thu, Jul 24, 2008 at 02:21, G wrote: > Hello, > I have installed the svn version of numpy. I have deleted all previous versions > of and files related to numpy prior to installing. I also have tried > reinstalling python from source. Regardless, when I try import numpy, I get the > following: Are you sure you are getting the same python executable when running the setup.py as you are when you build numpy? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pearu at cens.ioc.ee Thu Jul 24 03:31:50 2008 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Thu, 24 Jul 2008 10:31:50 +0300 (EEST) Subject: [Numpy-discussion] f2py - a recap In-Reply-To: References: <9457e7c80807231518u579279f2kf8d5510bcc843c2f@mail.gmail.com> Message-ID: <39349.62.65.216.185.1216884710.squirrel@cens.ioc.ee> Hi, Few months ago I joined a group of system biologist and I have been busy with new projects (mostly C++ based). So I haven't had a chance to work on f2py. However, I am still around to fix f2py bugs and maintain/support numpy.f2py (as long as current numpy maintainers allow it..) -- as a rule these tasks do not take much of my time. I have also rewritten f2py users guide for numpy.f2py and submitted a paper on f2py. I'll make them available when I get some more time.. Regards, still-kicking-yoursly, Pearu On Thu, July 24, 2008 1:46 am, Fernando Perez wrote: > Howdy, > > On Wed, Jul 23, 2008 at 3:18 PM, St?fan van der Walt > wrote: >> 2008/7/23 Fernando Perez : > >> I agree (with your previous e-mail) that it would be good to have some >> documentation, so if you could give me some pointers on *what* to >> document (I haven't used f2py much), then I'll try my best to get >> around to it. > > Well, I think my 'recap' message earlier in this thread points to a > few issues that can probably be addressed quickly (the 'instead' error > in the help, the doc/docs dichotomy needs to be cleaned up so a single > documentation directory exists, etc). I'm also attaching a set of > very old notes I wrote years ago on f2py that you are free to use in > any way you see fit. I gave them a 2-minute rst treatment but didn't > edit them at all, so they may be somewhat outdated (I originally wrote > them in 2002 I think). > > If Pearu has moved to greener pastures, f2py could certainly use an > adoptive parent. It happens to be a really important piece of > infrastructure and for the most part it works fairly well. I think > a litlte bit of cleanup/doc integration with the rest of numpy is > probably all that's needed, so it could be a good project for someone > to adopt that would potentially be low-demand yet quite useful. 
> > Cheers, > > f > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From jones_gl at caltech.edu Thu Jul 24 03:49:24 2008 From: jones_gl at caltech.edu (G) Date: Thu, 24 Jul 2008 07:49:24 +0000 (UTC) Subject: [Numpy-discussion] =?utf-8?q?import_numpy_fails_with_multiarray?= =?utf-8?q?=2Eso=3A=09undefined_symbol=3A_PyUnicodeUCS2=5FFromUnico?= =?utf-8?q?de?= References: <3d375d730807240030s14394fccp6edfc07dcf4e59aa@mail.gmail.com> Message-ID: Robert Kern gmail.com> writes: > > Are you sure you are getting the same python executable when running > the setup.py as you are when you build numpy? > I believe so: sudo which python /usr/local/bin/python which python /usr/local/bin/python which ipython /usr/local/bin/ipython head /usr/local/bin/ipython #!/usr/local/bin/python I managed to get numpy 1.1.0 to compile and work correctly, so perhaps I somehow have files left around in the svn directory from building with another version of python. I will have to try the svn version again later. Thanks for the suggestion, G From cournape at gmail.com Thu Jul 24 04:04:14 2008 From: cournape at gmail.com (David Cournapeau) Date: Thu, 24 Jul 2008 17:04:14 +0900 Subject: [Numpy-discussion] Schedule for 1.1.1 In-Reply-To: References: <20080715045016.GA22431@phare.normalesup.org> Message-ID: <5b8d13220807240104s4a539de2t1537aa991da4ba8f@mail.gmail.com> On Tue, Jul 15, 2008 at 2:01 PM, Charles R Harris wrote: > > > On Mon, Jul 14, 2008 at 10:50 PM, Gael Varoquaux > wrote: >> >> On Sun, Jul 13, 2008 at 01:49:18AM -0700, Jarrod Millman wrote: >> > The NumPy 1.1.1 release date (7/31/08) is rapidly approaching and we >> > need everyone's help. Chuck Harris has volunteered to take the lead >> > on coordinating this release. >> >> Anybody has an idea what the status is on #844? ( >> http://scipy.org/scipy/numpy/ticket/844 ) > > I suspect it is a blas problem, it doesn't show up here. David? I believe it is a bug in ATLAS: http://math-atlas.sourceforge.net/errata3.8.0.html#RMAAT Unfortunately, this means I have to rebuild atlas on windows, which will take time ... cheers, David From strawman at astraw.com Thu Jul 24 04:55:06 2008 From: strawman at astraw.com (Andrew Straw) Date: Thu, 24 Jul 2008 01:55:06 -0700 Subject: [Numpy-discussion] Instructions on building from source In-Reply-To: <48882C2E.9050805@hawaii.edu> References: <3d375d730807231502k6939d55amd4c954c87f8ad71b@mail.gmail.com> <4887ACEE.4050101@astraw.com> <48882C2E.9050805@hawaii.edu> Message-ID: <4888436A.4010004@astraw.com> Eric Firing wrote: > Andrew Straw wrote: > > >> Just for reference, you can find the build dependencies of any Debian >> source package by looking at its .dsc file. For numpy, that can be found >> at http://packages.debian.org/sid/python-numpy >> >> Currently (version 1.1.0, debian version 1:1.1.0-3), that list is: >> >> Build-Depends: cdbs (>= 0.4.43), python-all-dev, python-all-dbg, >> python-central (>= 0.6), gfortran (>= 4:4.2), libblas-dev [!arm !m68k], >> liblapack-dev [!arm !m68k], debhelper (>= 5.0.38), patchutils, >> python-docutils, libfftw3-dev >> >> Build-Conflicts: atlas3-base-dev, libatlas-3dnow-dev, libatlas-base-dev, >> libatlas-headers, libatlas-sse-dev, libatlas-sse2-dev >> > > Do you know why atlas is not used, and is even listed as a conflict? I > have libatlas-sse2 etc. installed on ubuntu hardy, and I routinely build > numpy from source. 
Maybe the debian specification is for > lowest-common-denominator hardware? The way it's supposed to work, as far as I understand it, is that atlas is not required at build time but when installed later automatically speeds up the blas routines. (Upon installation of libatlas3gf-sse2, libblas.so.3gf is pointed to /usr/lib/sse2/atlas/libblas.so.3gf from libblas.so.3gf => /usr/lib/libblas.so.3gf). I have not verified that any of this actually happens. So, please take this with a grain of salt. Especially since my answer differs from Robert's. -Andrew From david at ar.media.kyoto-u.ac.jp Thu Jul 24 04:45:31 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 24 Jul 2008 17:45:31 +0900 Subject: [Numpy-discussion] Instructions on building from source In-Reply-To: <4888436A.4010004@astraw.com> References: <3d375d730807231502k6939d55amd4c954c87f8ad71b@mail.gmail.com> <4887ACEE.4050101@astraw.com> <48882C2E.9050805@hawaii.edu> <4888436A.4010004@astraw.com> Message-ID: <4888412B.4090608@ar.media.kyoto-u.ac.jp> Andrew Straw wrote: > > The way it's supposed to work, as far as I understand it, is that atlas > is not required at build time but when installed later automatically > speeds up the blas routines. (Upon installation of libatlas3gf-sse2, > libblas.so.3gf is pointed to /usr/lib/sse2/atlas/libblas.so.3gf from > libblas.so.3gf => /usr/lib/libblas.so.3gf). I have not verified that any > of this actually happens. So, please take this with a grain of salt. > Especially since my answer differs from Robert's. > It only happens because debian put the CBLAS interface into libblas.*. Normally, it is not there, and numpy depends on cblas for _dotblas. The way it should work is to test for cblas instead of atlas (atlas always have cblas), but that requires work that nobody has done so far. And this reconcile your answer and Robert's one :) cheers, David From oliphant at enthought.com Thu Jul 24 08:06:19 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Thu, 24 Jul 2008 07:06:19 -0500 Subject: [Numpy-discussion] import numpy fails with multiarray.so: undefined symbol: PyUnicodeUCS2_FromUnicode In-Reply-To: References: Message-ID: <4888703B.2000805@enthought.com> G wrote: > Hello, > I have installed the svn version of numpy. I have deleted all previous versions > of and files related to numpy prior to installing. I also have tried > reinstalling python from source. Regardless, when I try import numpy, I get the > following: > > Python 2.5.2 (r252:60911, Jul 23 2008, 23:54:29) > [GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. 
> import numpy > Traceback (most recent call last): > File "", line 1, in > File "/usr/local/lib/python2.5/site-packages/numpy/__init__.py", line 93, in > > > import add_newdocs > File "/usr/local/lib/python2.5/site-packages/numpy/add_newdocs.py", line 9, > in > from lib import add_newdoc > File "/usr/local/lib/python2.5/site-packages/numpy/lib/__init__.py", line 4, > > in > from type_check import * > File "/usr/local/lib/python2.5/site-packages/numpy/lib/type_check.py", line > 8, in > import numpy.core.numeric as _nx > File "/usr/local/lib/python2.5/site-packages/numpy/core/__init__.py", > line 5, in > import multiarray > ImportError: /usr/local/lib/python2.5/site-packages/numpy/core/multiarray.so: > > undefined symbol: PyUnicodeUCS2_FromUnicode > > This symbol is defined by Python when you build Python with 16-bit unicode support (as opposed to 32-bit) unicode support. Somehow numpy is picking up the 16-bit headers but you probably compiled with ucs4. NumPy supports both UCS2 and UCS4 builds. This looks to me like a Python header installation problem. There are probably some incorrect Python headers being picked up during compilation of NumPy. Can you double check which Python headers are being used (look at the -I lines when NumPy is being built). -Travis From stefan at sun.ac.za Thu Jul 24 11:15:46 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 24 Jul 2008 17:15:46 +0200 Subject: [Numpy-discussion] Documenting `zipf` Message-ID: <9457e7c80807240815q2af9f4d5n41654cc0e539401@mail.gmail.com> Hi, Does anybody know how Zipf's law or how Zipfian distributions work, and how they relate to NumPy's `np.random.zipf`? I'm afraid I can't make head or tail of these results: In [106]: np.random.zipf(2, size=(10)) Out[106]: array([ 1, 1, 1, 29, 1, 1, 1, 1, 1, 2]) (8x1, 1x2, 1x29) In [107]: np.random.zipf(2, size=(10)) Out[107]: array([75, 1, 1, 3, 1, 1, 1, 1, 1, 4]) (7x1, 1x3, 1x4, 1x75) In [108]: np.random.zipf(2, size=(10)) Out[108]: array([ 6, 17, 2, 1, 1, 2, 1, 20, 1, 2]) (4x1, 3x2, 1x6, 1x17, 1x20) Thanks! St?fan From dalcinl at gmail.com Thu Jul 24 12:35:15 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 24 Jul 2008 13:35:15 -0300 Subject: [Numpy-discussion] import numpy fails with multiarray.so: undefined symbol: PyUnicodeUCS2_FromUnicode In-Reply-To: References: Message-ID: Did you build Python from sources in such a way that the Python library is a shared one? I mean, Do you have the file /usr/local/lib/libpython2.5.so ?? On Thu, Jul 24, 2008 at 4:21 AM, G wrote: > Hello, > I have installed the svn version of numpy. I have deleted all previous versions > of and files related to numpy prior to installing. I also have tried > reinstalling python from source. Regardless, when I try import numpy, I get the > following: > > Python 2.5.2 (r252:60911, Jul 23 2008, 23:54:29) > [GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. 
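Another quick thing to check is the unicode width of the interpreter
that is actually running: sys.maxunicode is 65535 on a UCS2 build and
1114111 on a UCS4 build. A tiny sketch:

    import sys

    # 65535  -> Python was built with UCS2 (16-bit) unicode
    # 1114111 -> Python was built with UCS4 (32-bit) unicode
    print(sys.maxunicode)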
> >>> import numpy
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/local/lib/python2.5/site-packages/numpy/__init__.py", line 93, in <module>
>     import add_newdocs
>   File "/usr/local/lib/python2.5/site-packages/numpy/add_newdocs.py", line 9, in <module>
>     from lib import add_newdoc
>   File "/usr/local/lib/python2.5/site-packages/numpy/lib/__init__.py", line 4, in <module>
>     from type_check import *
>   File "/usr/local/lib/python2.5/site-packages/numpy/lib/type_check.py", line 8, in <module>
>     import numpy.core.numeric as _nx
>   File "/usr/local/lib/python2.5/site-packages/numpy/core/__init__.py", line 5, in <module>
>     import multiarray
> ImportError: /usr/local/lib/python2.5/site-packages/numpy/core/multiarray.so:
> undefined symbol: PyUnicodeUCS2_FromUnicode
>
> I also tried compiling python using ./configure --enable-unicode=ucs4
> with no luck.
> I had python and numpy all working well but then some file got corrupted so I
> was forced to reinstall, and I have not yet been able to get things working
> again after two days of attempts.
>
> Thank you,
> G
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From millman at berkeley.edu  Thu Jul 24 14:03:46 2008
From: millman at berkeley.edu (Jarrod Millman)
Date: Thu, 24 Jul 2008 11:03:46 -0700
Subject: [Numpy-discussion] 1.1.1rc2 tagged
Message-ID:

Hello,

The 1.1.1rc2 is now available:
http://svn.scipy.org/svn/numpy/tags/1.1.1rc2

The source tarball is here:
http://cirl.berkeley.edu/numpy/numpy-1.1.1rc2.tar.gz

Here is the universal Mac binary:
http://cirl.berkeley.edu/numpy/numpy-1.1.1rc2-py2.5-macosx10.5.dmg

David Cournapeau will be creating a 1.1.1rc2 Windows binary in the next
few days. Please test this release ASAP and let us know if there are any
problems. If there are no show stoppers, this will likely become the
1.1.1 release.

Thanks,

--
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/

From robert.kern at gmail.com  Thu Jul 24 15:31:01 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 24 Jul 2008 14:31:01 -0500
Subject: [Numpy-discussion] Documenting `zipf`
In-Reply-To: <9457e7c80807240815q2af9f4d5n41654cc0e539401@mail.gmail.com>
References: <9457e7c80807240815q2af9f4d5n41654cc0e539401@mail.gmail.com>
Message-ID: <3d375d730807241231i60ec9c49qb8e3c9d90ca464ad@mail.gmail.com>

On Thu, Jul 24, 2008 at 10:15, Stéfan van der Walt wrote:
> Hi,
>
> Does anybody know how Zipf's law or how Zipfian distributions work,
> and how they relate
> to NumPy's `np.random.zipf`?  I'm afraid I can't make head or tail of
> these results:
>
> In [106]: np.random.zipf(2, size=(10))
> Out[106]: array([ 1, 1, 1, 29, 1, 1, 1, 1, 1, 2])
>
> (8x1, 1x2, 1x29)
>
> In [107]: np.random.zipf(2, size=(10))
> Out[107]: array([75, 1, 1, 3, 1, 1, 1, 1, 1, 4])
>
> (7x1, 1x3, 1x4, 1x75)
>
> In [108]: np.random.zipf(2, size=(10))
> Out[108]: array([ 6, 17, 2, 1, 1, 2, 1, 20, 1, 2])
>
> (4x1, 3x2, 1x6, 1x17, 1x20)

With only 10 samples apiece, it's hard to evaluate what's going on.
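One rough check is to draw a much larger sample and compare the
empirical frequency of each small value k against the Zipf pmf
1/(k**s * zeta(s, 1)) (a sketch only; the exact numbers vary from run
to run):

    import numpy as np
    from scipy.special import zeta

    s = 2.0
    draws = np.random.zipf(s, size=100000)
    for k in range(1, 6):
        # empirical frequency of k vs. the theoretical pmf
        print('%d %.4f %.4f' % (k, (draws == k).mean(),
                                1.0 / (k ** s * zeta(s, 1))))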
zipf(s) samples from a Zipfian distribution with N=inf, using the
terminology as in the Wikipedia article:

  http://en.wikipedia.org/wiki/Zipf%27s_law

It's a long-tailed distribution, so you would expect to see one or two
big numbers with s=2. For example, here is the survival function for
the distribution (sf(x) = 1-cdf(x)).

In [23]: from numpy import *

In [24]: def harmonic_number(s, k):
   ....:     x = 1.0 / arange(1,k+1) ** s
   ....:     return x.sum()
   ....:

In [25]: from scipy.special import zeta

In [26]: def sf(x,s):
   ....:     return 1.0 - harmonic_number(s, int(x)) / zeta(s,1)
   ....:

In [27]: sf(10, 2.0)
Out[27]: 0.057854194645034718

In [28]: sf(20, 2.0)
Out[28]: 0.029649105042033996

In [29]: sf(60, 2.0)
Out[29]: 0.010048153098031198

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From david at ar.media.kyoto-u.ac.jp  Fri Jul 25 06:28:02 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 25 Jul 2008 19:28:02 +0900
Subject: [Numpy-discussion] Putting blas/lapack/atlas code + tools somewhere in svn.scipy.org ?
Message-ID: <4889AAB2.8010705@ar.media.kyoto-u.ac.jp>

Hi,

I would like to know if it would be possible at all to put
blas/lapack/atlas code + various scripts to build them for windows in an
automated way somewhere in svn.scipy.org, a bit like what svn.python.org
does for external dependencies? The rationale is that to build atlas,
you need to build it on different machines for different archs, and
having a repeatable build by several people would be a big plus (I also
hope that someone would step in to support windows at some point :) ).
As python devs do for tk and co I think, I don't want to track the
changes in those external packages, just some releases to be in sync
together with build scripts: I don't think it should take too much space.

cheers,

David

From faltet at pytables.org  Fri Jul 25 07:09:33 2008
From: faltet at pytables.org (Francesc Alted)
Date: Fri, 25 Jul 2008 13:09:33 +0200
Subject: [Numpy-discussion] =?utf-8?q?RFC=3A_A_=28second=29_proposal_for_i?=
	=?utf-8?q?mplementing=09some_date/time_types_in_NumPy?=
In-Reply-To: <20080718144247.GA5698@tardis.terramar.selidor.net>
References: <200807161844.36953.faltet@pytables.org>
	<20080718144247.GA5698@tardis.terramar.selidor.net>
Message-ID: <200807251309.34418.faltet@pytables.org>

Hi,

Well, as there were no replies to our second proposal for the date/time
dtype, I assume that everybody agrees with it ;-)  At any rate, we would
like to proceed with the implementation phase very soon now.

However, it happens that Enthought is sponsoring this job and they
clearly stated that the implementation should cover the needs of as
many users as possible. In particular, we would like one of the
heaviest users of date/time objects, i.e. the TimeSeries authors, to be
comfortable with the new date/time dtypes, and especially to be able to
benefit from them. For this goal, we are proposing a decoupling of the
date/time use cases into two different groups:

1. A pure ``datetime`` dtype (absolute or relative) that would be
   useful for timestamping purposes in general (i.e. registering dates
   without a need that they be evenly spaced in time).

2. A class based on the ``frequency`` concept that would be useful for
   measurements that are done on a regular basis or in business
   applications.
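To make the split concrete with plain standard-library objects (purely an
illustration of the two use cases; none of the names below belong to the
proposed API, which does not exist yet):

    from datetime import datetime, timedelta

    # 1. Timestamping: arbitrary, unevenly spaced instants.
    events = [datetime(2008, 7, 25, 7, 9, 33),
              datetime(2008, 7, 25, 13, 42, 1)]

    # 2. Frequency-based measurements: the i-th value belongs to the
    #    regular period start + i * step.
    start, step = datetime(2008, 1, 1), timedelta(days=1)
    periods = [start + i * step for i in range(365)]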
With this, we are preventing the dtype implementation at the core of
NumPy from being too cluttered with the relatively complex needs of the
``frequency`` concept users, factoring it out to an external class
(``Date``, to follow the TimeSeries naming convention). More
importantly, this decoupling will also avoid mixing two concepts that,
although both are about time measurements, have quite different
meanings indeed.

Another important advantage of this distinction is that the
``datetime`` timestamp requires less meta-information to worry about
(basically, the 'resolution' property), while a ``frequency`` à la
TimeSeries will need additional meta-information, like the 'start' and
'end' of periods, as well as a more complex way to code frequencies
(there are many more time periods to be coded, as can be seen in
[1]_). This can be very important in allowing NumPy data based on the
``datetime`` dtype to be quickly saved and retrieved in databases like
ZODB (object database) or PyTables (HDF5-based database).

Our ultimate goal is that the ``Date`` and ``DateArray`` classes in the
TimeSeries would be rewritten in terms of the new date/time dtype so as
to take advantage of its features and also to get rid of duplicated
code. I honestly think that this can be a big advantage for TimeSeries
indeed (at the cost of taking some time for doing the migration).

Does that approach make sense for people?

.. [1] http://scipy.org/scipy/scikits/wiki/TimeSeries#Frequencies

--
Francesc Alted

From robert.kern at gmail.com  Fri Jul 25 09:05:03 2008
From: robert.kern at gmail.com (robert.kern at gmail.com)
Date: Fri, 25 Jul 2008 08:05:03 -0500
Subject: [Numpy-discussion] Putting blas/lapack/atlas code + tools somewhere in svn.scipy.org ?
In-Reply-To: <4889AAB2.8010705@ar.media.kyoto-u.ac.jp>
References: <4889AAB2.8010705@ar.media.kyoto-u.ac.jp>
Message-ID: <3d375d730807250605y70ba7b8dj2b05e813f5bcdec2@mail.gmail.com>

Sure. Make a directory called vendor/ next to trunk/.

On 2008-07-25, David Cournapeau wrote:
> Hi,
>
> I would like to know if it would be possible at all to put
> blas/lapack/atlas code + various scripts to build them for windows in an
> automated way somewhere in svn.scipy.org, a bit like what svn.python.org
> does for external dependencies? The rationale is that to build atlas,
> you need to build it on different machines for different archs, and
> having a repeatable build by several people would be a big plus (I also
> hope that someone would step in to support windows at some point :) ).
> As python devs do for tk and co I think, I don't want to track the
> changes in those external packages, just some releases to be in sync
> together with build scripts: I don't think it should take too much space.
>
> cheers,
>
> David
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco From doutriaux1 at llnl.gov Fri Jul 25 10:22:43 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Fri, 25 Jul 2008 14:22:43 -0000 Subject: [Numpy-discussion] [Cdat-discussion] Arrays containing NaNs In-Reply-To: References: <1E3A72A9-08C0-4906-AE1E-6842FDDEC5E7@atmos.colostate.edu> <47EBFB1E.5000703@llnl.gov> <4847F6D0.6020309@iri.columbia.edu> <4888D69A.3060009@iri.columbia.edu> <4888DA62.6080807@llnl.gov> <63232.86.143.71.246.1216934359.squirrel@webmail.uea.ac.uk> <4888F342.9050302@llnl.gov> <4888F77E.3080406@iri.columbia.edu> Message-ID: <6ECC.6050509@llnl.gov> Hi Stephane, This is a good suggestion, I'm ccing the numpy list on this. Because I'm wondering if it wouldn't be a better fit to do it directly at the numpy.ma level. I'm sure they already thought about this (and 'inf' values as well) and if they don't do it , there's probably some good reason we didn't think of yet. So before i go ahead and do it in MV2 I'd like to know the reason why it's not in numpy.ma, they are probably valid for MVs too. C. Stephane Raynaud wrote: > Hi, > > how about automatically (or at least optionally) masking all NaN > values when creating a MV array? > > On Thu, Jul 24, 2008 at 11:43 PM, Arthur M. Greene > > wrote: > > Yup, this works. Thanks! > > I guess it's time for me to dig deeper into numpy syntax and > functions, now that CDAT is using the numpy core for array > management... > > Best, > > Arthur > > > Charles Doutriaux wrote: > > Seems right to me, > > Except that the syntax might scare a bit the new users :) > > C. > > Andrew.Dawson at uea.ac.uk wrote: > > Hi, > > I'm not sure if what I am about to suggest is a good idea > or not, perhaps Charles will correct me if this is a bad > idea for any reason. > > Lets say you have a cdms variable called U with NaNs as > the missing > value. First we can replace the NaNs with 1e20: > > U.data[numpy.where(numpy.isnan(U.data))] = 1e20 > > And remember to set the missing value of the variable > appropriately: > > U.setMissing(1e20) > > I hope that helps, Andrew > > > > Hi Arthur, > > If i remember correctly the way i used to do it was: > a= MV2.greater(data,1.) b=MV2.less_equal(data,1) > c=MV2.logical_and(a,b) # Nan are the only one left > data=MV2.masked_where(c,data) > > BUT I believe numpy now has way to deal with nan I > believe it is numpy.nan_to_num But it replaces with 0 > so it may not be what you > want > > C. > > > Arthur M. Greene wrote: > > A typical netcdf file is opened, and the single > variable extracted: > > > fpr=cdms.open('prTS2p1_SEA_allmos.cdf') > pr0=fpr('prcp') type(pr0) > > > > Masked values (indicating ocean in this case) show > up here as NaNs. > > > pr0[0,-15:-5,0] > > prcp array([NaN NaN NaN NaN NaN NaN 0.37745094 > 0.3460784 0.21960783 0.19117641]) > > So far this is all consistent. A map of the first > time step shows the proper land-ocean boundaries, > reasonable-looking values, and so on. But there > doesn't seem to be any way to mask > this array, so, e.g., an 'xy' average can be > computed (it > comes out all nans). NaN is not equal to anything > -- even > itself -- so there does not seem to be any > condition, among the > MV.masked_xxx options, that can be applied as a > test. Also, it > does not seem possible to compute seasonal averages, > anomalies, etc. -- they also produce just NaNs. > > The workaround I've come up with -- for now -- is > to first generate a new array of identical shape, > filled with 1.0E+20. 
One test I've found that can > detect NaNs is numpy.isnan: > > > isnan(pr0[0,0,0]) > > True > > So it is _possible_ to tediously loop through > every value in the old array, testing with isnan, > then copying to the new array if the test fails. > Then the axes have to be reset... > > isnan does not accept array arguments, so one > cannot do, e.g., > > prmasked=MV.masked_where(isnan(pr0),pr0) > > The element-by-element conversion is quite slow. > (I'm still waiting for it to complete, in fact). > Any suggestions for dealing with NaN-infested data > objects? > > Thanks! > > AMG > > P.S. This is 5.0.0.beta, RHEL4. > > > *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~* > Arthur M. Greene, Ph.D. > The International Research Institute for Climate and Society > The Earth Institute, Columbia University, Lamont Campus > Monell Building, 61 Route 9W, Palisades, NY 10964-8000 USA > amg*at*iri-dot-columbia\dot\edu | http://iri.columbia.edu > *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~* > > > ------------------------------------------------------------------------- > This SF.Net email is sponsored by the Moblin Your Move Developer's > challenge > Build the coolest Linux based applications with Moblin SDK & win > great prizes > Grand prize is a trip for two to an Open Source event anywhere in > the world > http://moblin-contest.org/redirect.php?banner_id=100&url=/ > > _______________________________________________ > Cdat-discussion mailing list > Cdat-discussion at lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/cdat-discussion > > > > > -- > Stephane Raynaud > ------------------------------------------------------------------------ > > ------------------------------------------------------------------------- > This SF.Net email is sponsored by the Moblin Your Move Developer's challenge > Build the coolest Linux based applications with Moblin SDK & win great prizes > Grand prize is a trip for two to an Open Source event anywhere in the world > http:// moblin-contest.org/redirect.php?banner_id=100&url=/ > ------------------------------------------------------------------------ > > _______________________________________________ > Cdat-discussion mailing list > Cdat-discussion at lists.sourceforge.net > https:// lists.sourceforge.net/lists/listinfo/cdat-discussion > From david at ar.media.kyoto-u.ac.jp Fri Jul 25 10:29:06 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 25 Jul 2008 23:29:06 +0900 Subject: [Numpy-discussion] Putting blas/lapack/atlas code + tools somewhere in svn.scipy.org ? In-Reply-To: <3d375d730807250605y70ba7b8dj2b05e813f5bcdec2@mail.gmail.com> References: <4889AAB2.8010705@ar.media.kyoto-u.ac.jp> <3d375d730807250605y70ba7b8dj2b05e813f5bcdec2@mail.gmail.com> Message-ID: <4889E332.5090006@ar.media.kyoto-u.ac.jp> robert.kern at gmail.com wrote: > Sure. Make a directory called vendor/ next to trunk/. > Great, thanks. cheers, David From felix at physik3.uni-rostock.de Fri Jul 25 11:26:15 2008 From: felix at physik3.uni-rostock.de (Felix Richter) Date: Fri, 25 Jul 2008 17:26:15 +0200 Subject: [Numpy-discussion] FFT usage / consistency Message-ID: <200807251726.16143.felix@physik3.uni-rostock.de> Hi all, I found myself busy today trying to understand what went wrong in my FFT code. I wrote a minimal example/testing code to check the FFT output against an analytic result and also tried to reverse the transformation to get the original function back. 
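The skeleton of such a check, under the symmetric 1/sqrt(2*pi) FT
convention and assuming an even number of samples on a centered grid
(ifftshift moves the centered samples into the ordering np.fft.fft
expects, fftshift re-centers the output), might look like this sketch:

    import numpy as np

    N, dx = 1024, 0.1
    x = (np.arange(N) - N // 2) * dx     # centered sample grid
    f = 1.0 / (x ** 2 + 1.0)             # Lorentzian test function

    # Angular frequencies for the DFT bins, centered like x.
    w = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(N, d=dx))

    # Forward transform, approximating F(w) = sqrt(pi/2) * exp(-|w|).
    F = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(f))) * dx / np.sqrt(2 * np.pi)

    # Agreement is limited mainly by truncating the Lorentzian's slow
    # 1/x^2 tails; enlarging the x range tightens it.
    F_exact = np.sqrt(np.pi / 2) * np.exp(-np.abs(w))
    print(abs(F.real - F_exact).max())
    print(abs(F.imag).max())             # should be near zero

    # The inverse transform should reproduce f, up to the inverse scaling.
    f_back = np.fft.fftshift(np.fft.ifft(np.fft.ifftshift(F))) * np.sqrt(2 * np.pi) / dx
    print(abs(f_back.real - f).max())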
Most curiously, the results depend on whether I first do the fft and then the ifft or the other way round. For the test function, I use the Lorentz function 1/(x^2+1). The exact FT is exp(-|t|)*sqrt(pi/2), the IFT yields the same. 1) First FFT and then IFFT: The real part of FFT oscillates, the imaginary part is not zero, and the magnitudes do not match. All this should not be, but the IFFT reproduces the original function just fine. 2) First IFFT and then FFT: The IFFT is perfect, but the FFT does not reproduce the original function. Could someone please have a look and tell me what I am doing wrong? The code is attached, it also plots the results nicely. Maybe the (corrected) code could be used as an example in the documentation (correct use of (i)fftshift and (i)fftfreq is not trivial!) or as an additional test case. The existing test cases for numpy only seem to check that the fft function can be called, but they do not check consistency of results. I'm using NumPy versions 1.0.4 and 1.1.0 on Linux with fftpack_lite.so (even though fftw3 is installed and configured, but I'll probably ask for that later...) Thanks a lot, Felix -------------- next part -------------- A non-text attachment was scrubbed... Name: test_fft.py Type: application/x-python Size: 3334 bytes Desc: not available URL: From bsouthey at gmail.com Fri Jul 25 11:43:34 2008 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 25 Jul 2008 10:43:34 -0500 Subject: [Numpy-discussion] [Cdat-discussion] Arrays containing NaNs In-Reply-To: <6ECC.6050509@llnl.gov> References: <1E3A72A9-08C0-4906-AE1E-6842FDDEC5E7@atmos.colostate.edu> <47EBFB1E.5000703@llnl.gov> <4847F6D0.6020309@iri.columbia.edu> <4888D69A.3060009@iri.columbia.edu> <4888DA62.6080807@llnl.gov> <63232.86.143.71.246.1216934359.squirrel@webmail.uea.ac.uk> <4888F342.9050302@llnl.gov> <4888F77E.3080406@iri.columbia.edu> <6ECC.6050509@llnl.gov> Message-ID: <4889F4A6.7020303@gmail.com> Charles Doutriaux wrote: > Hi Stephane, > > This is a good suggestion, I'm ccing the numpy list on this. Because I'm > wondering if it wouldn't be a better fit to do it directly at the > numpy.ma level. > > I'm sure they already thought about this (and 'inf' values as well) and > if they don't do it , there's probably some good reason we didn't think > of yet. > So before i go ahead and do it in MV2 I'd like to know the reason why > it's not in numpy.ma, they are probably valid for MVs too. > > C. > > Stephane Raynaud wrote: > >> Hi, >> >> how about automatically (or at least optionally) masking all NaN >> values when creating a MV array? >> >> On Thu, Jul 24, 2008 at 11:43 PM, Arthur M. Greene >> > wrote: >> >> Yup, this works. Thanks! >> >> I guess it's time for me to dig deeper into numpy syntax and >> functions, now that CDAT is using the numpy core for array >> management... >> >> Best, >> >> Arthur >> >> >> Charles Doutriaux wrote: >> >> Seems right to me, >> >> Except that the syntax might scare a bit the new users :) >> >> C. >> >> Andrew.Dawson at uea.ac.uk wrote: >> >> Hi, >> >> I'm not sure if what I am about to suggest is a good idea >> or not, perhaps Charles will correct me if this is a bad >> idea for any reason. >> >> Lets say you have a cdms variable called U with NaNs as >> the missing >> value. 
First we can replace the NaNs with 1e20: >> >> U.data[numpy.where(numpy.isnan(U.data))] = 1e20 >> >> And remember to set the missing value of the variable >> appropriately: >> >> U.setMissing(1e20) >> >> I hope that helps, Andrew >> >> >> >> Hi Arthur, >> >> If i remember correctly the way i used to do it was: >> a= MV2.greater(data,1.) b=MV2.less_equal(data,1) >> c=MV2.logical_and(a,b) # Nan are the only one left >> data=MV2.masked_where(c,data) >> >> BUT I believe numpy now has way to deal with nan I >> believe it is numpy.nan_to_num But it replaces with 0 >> so it may not be what you >> want >> >> C. >> >> >> Arthur M. Greene wrote: >> >> A typical netcdf file is opened, and the single >> variable extracted: >> >> >> fpr=cdms.open('prTS2p1_SEA_allmos.cdf') >> pr0=fpr('prcp') type(pr0) >> >> >> >> Masked values (indicating ocean in this case) show >> up here as NaNs. >> >> >> pr0[0,-15:-5,0] >> >> prcp array([NaN NaN NaN NaN NaN NaN 0.37745094 >> 0.3460784 0.21960783 0.19117641]) >> >> So far this is all consistent. A map of the first >> time step shows the proper land-ocean boundaries, >> reasonable-looking values, and so on. But there >> doesn't seem to be any way to mask >> this array, so, e.g., an 'xy' average can be >> computed (it >> comes out all nans). NaN is not equal to anything >> -- even >> itself -- so there does not seem to be any >> condition, among the >> MV.masked_xxx options, that can be applied as a >> test. Also, it >> does not seem possible to compute seasonal averages, >> anomalies, etc. -- they also produce just NaNs. >> >> The workaround I've come up with -- for now -- is >> to first generate a new array of identical shape, >> filled with 1.0E+20. One test I've found that can >> detect NaNs is numpy.isnan: >> >> >> isnan(pr0[0,0,0]) >> >> True >> >> So it is _possible_ to tediously loop through >> every value in the old array, testing with isnan, >> then copying to the new array if the test fails. >> Then the axes have to be reset... >> >> isnan does not accept array arguments, so one >> cannot do, e.g., >> >> prmasked=MV.masked_where(isnan(pr0),pr0) >> >> The element-by-element conversion is quite slow. >> (I'm still waiting for it to complete, in fact). >> Any suggestions for dealing with NaN-infested data >> objects? >> >> Thanks! >> >> AMG >> >> P.S. This is 5.0.0.beta, RHEL4. >> >> >> *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~* >> Arthur M. Greene, Ph.D. 
>> The International Research Institute for Climate and Society
>> The Earth Institute, Columbia University, Lamont Campus
>> Monell Building, 61 Route 9W, Palisades, NY 10964-8000 USA
>> amg*at*iri-dot-columbia\dot\edu | http://iri.columbia.edu
>> *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*
>>
>> [...]
>>
>> --
>> Stephane Raynaud
>> [...]

Please look at the various NumPy functions that ignore NaN, like nansum().
See the NumPy example list
(http://www.scipy.org/Numpy_Example_List_With_Doc) for examples under nan
or the individual functions.

To get the mean you can do something like:

import numpy
x = numpy.array([2, numpy.nan, 1])
numpy.nansum(x)/(x.shape[0]-numpy.isnan(x).sum())
x_masked = numpy.ma.masked_where(numpy.isnan(x), x)
x_masked.mean()

The real advantage of masked arrays is that you have greater control over
the filtering, so you can also filter extreme values:

y = numpy.array([2, numpy.nan, 1, 1000])
y_masked = numpy.ma.masked_where(numpy.isnan(y), y)
y_masked = numpy.ma.masked_where(y_masked > 100, y_masked)
y_masked.mean()

Regards
Bruce

From doutriaux1 at llnl.gov  Fri Jul 25 11:51:01 2008
From: doutriaux1 at llnl.gov (Charles Doutriaux)
Date: Fri, 25 Jul 2008 15:51:01 -0000
Subject: [Numpy-discussion] [Cdat-discussion] Arrays containing NaNs
In-Reply-To: 
References: <1E3A72A9-08C0-4906-AE1E-6842FDDEC5E7@atmos.colostate.edu>
	<47EBFB1E.5000703@llnl.gov> <4847F6D0.6020309@iri.columbia.edu>
	<4888D69A.3060009@iri.columbia.edu> <4888DA62.6080807@llnl.gov>
	<63232.86.143.71.246.1216934359.squirrel@webmail.uea.ac.uk>
	<4888F342.9050302@llnl.gov> <4888F77E.3080406@iri.columbia.edu>
Message-ID: <837F.4030605@llnl.gov>

Hi All,

I'm sending a copy of this reply here because I think we could get some
good answers. Basically it was suggested to automatically mask NaN (and
Inf?) when creating ma arrays. I'm sure you already thought of this on
this list, and I was curious to know why you decided not to do it. Just so
I can relay it to our list (sending to both lists came back flagged as
spam...)

C.
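A sketch of what the suggested behaviour amounts to (masked_from_data is a
hypothetical helper for illustration only, not an existing MV2 or numpy.ma
function):

import numpy as np

def masked_from_data(data, mask=None):
    # Mask NaN/Inf at creation time, on top of any user-supplied mask.
    data = np.asarray(data, dtype=float)
    bad = ~np.isfinite(data)
    if mask is not None:
        bad = bad | np.asarray(mask, dtype=bool)
    return np.ma.array(data, mask=bad)

m = masked_from_data([1.0, np.nan, 3.0, np.inf])
print(m)         # [1.0 -- 3.0 --]
print(m.mean())  # 2.0, the invalid points are excluded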
Hi Stephane,

This is a good suggestion; I'm ccing the numpy list on this, because I'm
wondering if it wouldn't be a better fit to do it directly at the numpy.ma
level.

I'm sure they already thought about this (and 'inf' values as well), and
if they don't do it, there's probably some good reason we didn't think of
yet. So before I go ahead and do it in MV2 I'd like to know the reason why
it's not in numpy.ma; the reasons are probably valid for MVs too.

C.

Stephane Raynaud wrote:
> Hi,
>
> how about automatically (or at least optionally) masking all NaN
> values when creating a MV array?
> [...]
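One aside on the thread being forwarded: numpy.isnan is a ufunc and does
accept array arguments, so the element-by-element loop described earlier
is unnecessary with plain numpy arrays (the sample values echo the quoted
prcp data):

import numpy as np

pr = np.array([np.nan, np.nan, 0.37745094, 0.3460784, 0.21960783])
prmasked = np.ma.masked_where(np.isnan(pr), pr)  # vectorized, no loop
print(prmasked.mean())                           # 0.314..., NaNs excluded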
> > > *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~* > Arthur M. Greene, Ph.D. > The International Research Institute for Climate and Society > The Earth Institute, Columbia University, Lamont Campus > Monell Building, 61 Route 9W, Palisades, NY 10964-8000 USA > amg*at*iri-dot-columbia\dot\edu | http://iri.columbia.edu > *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~* > > > ------------------------------------------------------------------------- > This SF.Net email is sponsored by the Moblin Your Move Developer's > challenge > Build the coolest Linux based applications with Moblin SDK & win > great prizes > Grand prize is a trip for two to an Open Source event anywhere in > the world > http://moblin-contest.org/redirect.php?banner_id=100&url=/ > > _______________________________________________ > Cdat-discussion mailing list > Cdat-discussion at lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/cdat-discussion > > > > > -- > Stephane Raynaud > ------------------------------------------------------------------------ > > ------------------------------------------------------------------------- > This SF.Net email is sponsored by the Moblin Your Move Developer's challenge > Build the coolest Linux based applications with Moblin SDK & win great prizes > Grand prize is a trip for two to an Open Source event anywhere in the world > http:// moblin-contest.org/redirect.php?banner_id=100&url=/ > ------------------------------------------------------------------------ > > _______________________________________________ > Cdat-discussion mailing list > Cdat-discussion at lists.sourceforge.net > https:// lists.sourceforge.net/lists/listinfo/cdat-discussion > From doutriaux1 at llnl.gov Fri Jul 25 11:53:41 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Fri, 25 Jul 2008 15:53:41 -0000 Subject: [Numpy-discussion] [Cdat-discussion] Arrays containing NaNs In-Reply-To: <4889F4A6.7020303@gmail.com> References: <1E3A72A9-08C0-4906-AE1E-6842FDDEC5E7@atmos.colostate.edu> <47EBFB1E.5000703@llnl.gov> <4847F6D0.6020309@iri.columbia.edu> <4888D69A.3060009@iri.columbia.edu> <4888DA62.6080807@llnl.gov> <63232.86.143.71.246.1216934359.squirrel@webmail.uea.ac.uk> <4888F342.9050302@llnl.gov> <4888F77E.3080406@iri.columbia.edu> <6ECC.6050509@llnl.gov> <4889F4A6.7020303@gmail.com> Message-ID: <841E.8080109@llnl.gov> Hi Bruce, Thx for the reply, we're aware of this, basically the question was why not mask NaN automatically when creating a nump.ma array? C. Bruce Southey wrote: > Charles Doutriaux wrote: > >> Hi Stephane, >> >> This is a good suggestion, I'm ccing the numpy list on this. Because I'm >> wondering if it wouldn't be a better fit to do it directly at the >> numpy.ma level. >> >> I'm sure they already thought about this (and 'inf' values as well) and >> if they don't do it , there's probably some good reason we didn't think >> of yet. >> So before i go ahead and do it in MV2 I'd like to know the reason why >> it's not in numpy.ma, they are probably valid for MVs too. >> >> C. >> >> Stephane Raynaud wrote: >> >> >>> Hi, >>> >>> how about automatically (or at least optionally) masking all NaN >>> values when creating a MV array? >>> >>> On Thu, Jul 24, 2008 at 11:43 PM, Arthur M. Greene >>> > wrote: >>> >>> Yup, this works. Thanks! >>> >>> I guess it's time for me to dig deeper into numpy syntax and >>> functions, now that CDAT is using the numpy core for array >>> management... 
>>> >>> Best, >>> >>> Arthur >>> >>> >>> Charles Doutriaux wrote: >>> >>> Seems right to me, >>> >>> Except that the syntax might scare a bit the new users :) >>> >>> C. >>> >>> Andrew.Dawson at uea.ac.uk wrote: >>> >>> Hi, >>> >>> I'm not sure if what I am about to suggest is a good idea >>> or not, perhaps Charles will correct me if this is a bad >>> idea for any reason. >>> >>> Lets say you have a cdms variable called U with NaNs as >>> the missing >>> value. First we can replace the NaNs with 1e20: >>> >>> U.data[numpy.where(numpy.isnan(U.data))] = 1e20 >>> >>> And remember to set the missing value of the variable >>> appropriately: >>> >>> U.setMissing(1e20) >>> >>> I hope that helps, Andrew >>> >>> >>> >>> Hi Arthur, >>> >>> If i remember correctly the way i used to do it was: >>> a= MV2.greater(data,1.) b=MV2.less_equal(data,1) >>> c=MV2.logical_and(a,b) # Nan are the only one left >>> data=MV2.masked_where(c,data) >>> >>> BUT I believe numpy now has way to deal with nan I >>> believe it is numpy.nan_to_num But it replaces with 0 >>> so it may not be what you >>> want >>> >>> C. >>> >>> >>> Arthur M. Greene wrote: >>> >>> A typical netcdf file is opened, and the single >>> variable extracted: >>> >>> >>> fpr=cdms.open('prTS2p1_SEA_allmos.cdf') >>> pr0=fpr('prcp') type(pr0) >>> >>> >>> >>> Masked values (indicating ocean in this case) show >>> up here as NaNs. >>> >>> >>> pr0[0,-15:-5,0] >>> >>> prcp array([NaN NaN NaN NaN NaN NaN 0.37745094 >>> 0.3460784 0.21960783 0.19117641]) >>> >>> So far this is all consistent. A map of the first >>> time step shows the proper land-ocean boundaries, >>> reasonable-looking values, and so on. But there >>> doesn't seem to be any way to mask >>> this array, so, e.g., an 'xy' average can be >>> computed (it >>> comes out all nans). NaN is not equal to anything >>> -- even >>> itself -- so there does not seem to be any >>> condition, among the >>> MV.masked_xxx options, that can be applied as a >>> test. Also, it >>> does not seem possible to compute seasonal averages, >>> anomalies, etc. -- they also produce just NaNs. >>> >>> The workaround I've come up with -- for now -- is >>> to first generate a new array of identical shape, >>> filled with 1.0E+20. One test I've found that can >>> detect NaNs is numpy.isnan: >>> >>> >>> isnan(pr0[0,0,0]) >>> >>> True >>> >>> So it is _possible_ to tediously loop through >>> every value in the old array, testing with isnan, >>> then copying to the new array if the test fails. >>> Then the axes have to be reset... >>> >>> isnan does not accept array arguments, so one >>> cannot do, e.g., >>> >>> prmasked=MV.masked_where(isnan(pr0),pr0) >>> >>> The element-by-element conversion is quite slow. >>> (I'm still waiting for it to complete, in fact). >>> Any suggestions for dealing with NaN-infested data >>> objects? >>> >>> Thanks! >>> >>> AMG >>> >>> P.S. This is 5.0.0.beta, RHEL4. >>> >>> >>> *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~* >>> Arthur M. Greene, Ph.D. 
>>> The International Research Institute for Climate and Society >>> The Earth Institute, Columbia University, Lamont Campus >>> Monell Building, 61 Route 9W, Palisades, NY 10964-8000 USA >>> amg*at*iri-dot-columbia\dot\edu | http:// iri.columbia.edu >>> *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~* >>> >>> >>> ------------------------------------------------------------------------- >>> This SF.Net email is sponsored by the Moblin Your Move Developer's >>> challenge >>> Build the coolest Linux based applications with Moblin SDK & win >>> great prizes >>> Grand prize is a trip for two to an Open Source event anywhere in >>> the world >>> http:// moblin-contest.org/redirect.php?banner_id=100&url=/ >>> >>> _______________________________________________ >>> Cdat-discussion mailing list >>> Cdat-discussion at lists.sourceforge.net >>> >>> https:// lists.sourceforge.net/lists/listinfo/cdat-discussion >>> >>> >>> >>> >>> -- >>> Stephane Raynaud >>> ------------------------------------------------------------------------ >>> >>> ------------------------------------------------------------------------- >>> This SF.Net email is sponsored by the Moblin Your Move Developer's challenge >>> Build the coolest Linux based applications with Moblin SDK & win great prizes >>> Grand prize is a trip for two to an Open Source event anywhere in the world >>> http:// moblin-contest.org/redirect.php?banner_id=100&url=/ >>> ------------------------------------------------------------------------ >>> >>> _______________________________________________ >>> Cdat-discussion mailing list >>> Cdat-discussion at lists.sourceforge.net >>> https:// lists.sourceforge.net/lists/listinfo/cdat-discussion >>> >>> >>> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http:// projects.scipy.org/mailman/listinfo/numpy-discussion >> >> >> > Please look the various NumPy functions to ignore NaN like nansum(). See > the NumPy example list > (http:// www. scipy.org/Numpy_Example_List_With_Doc) for examples under > nan or individual functions. 
> > To get the mean you can do something like: > > import numpy > x = numpy.array([2, numpy.nan, 1]) > numpy.nansum(x)/(x.shape[0]-numpy.isnan(x).sum()) > x_masked = numpy.ma.masked_where(numpy.isnan(x) , x) > x_masked.mean() > > The real advantage of masked arrays is that you have greater control > over the filtering so you can also filter extreme values: > > y = numpy.array([2, numpy.nan, 1, 1000]) > y_masked =numpy.ma.masked_where(numpy.isnan(y) , y) > y_masked =numpy.ma.masked_where(y_masked > 100 , y_masked) > y_masked.mean() > > Regards > Bruce > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http:// projects.scipy.org/mailman/listinfo/numpy-discussion > > > From bsouthey at gmail.com Fri Jul 25 12:04:21 2008 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 25 Jul 2008 11:04:21 -0500 Subject: [Numpy-discussion] [Cdat-discussion] Arrays containing NaNs In-Reply-To: <841E.8080109@llnl.gov> References: <1E3A72A9-08C0-4906-AE1E-6842FDDEC5E7@atmos.colostate.edu> <47EBFB1E.5000703@llnl.gov> <4847F6D0.6020309@iri.columbia.edu> <4888D69A.3060009@iri.columbia.edu> <4888DA62.6080807@llnl.gov> <63232.86.143.71.246.1216934359.squirrel@webmail.uea.ac.uk> <4888F342.9050302@llnl.gov> <4888F77E.3080406@iri.columbia.edu> <6ECC.6050509@llnl.gov> <4889F4A6.7020303@gmail.com> <841E.8080109@llnl.gov> Message-ID: <4889F985.501@gmail.com> Charles Doutriaux wrote: > Hi Bruce, > > Thx for the reply, we're aware of this, basically the question was why > not mask NaN automatically when creating a nump.ma array? > > C. > > Bruce Southey wrote: > >> Charles Doutriaux wrote: >> >> >>> Hi Stephane, >>> >>> This is a good suggestion, I'm ccing the numpy list on this. Because I'm >>> wondering if it wouldn't be a better fit to do it directly at the >>> numpy.ma level. >>> >>> I'm sure they already thought about this (and 'inf' values as well) and >>> if they don't do it , there's probably some good reason we didn't think >>> of yet. >>> So before i go ahead and do it in MV2 I'd like to know the reason why >>> it's not in numpy.ma, they are probably valid for MVs too. >>> >>> C. >>> >>> Stephane Raynaud wrote: >>> >>> >>> >>>> Hi, >>>> >>>> how about automatically (or at least optionally) masking all NaN >>>> values when creating a MV array? >>>> >>>> On Thu, Jul 24, 2008 at 11:43 PM, Arthur M. Greene >>>> > wrote: >>>> >>>> Yup, this works. Thanks! >>>> >>>> I guess it's time for me to dig deeper into numpy syntax and >>>> functions, now that CDAT is using the numpy core for array >>>> management... >>>> >>>> Best, >>>> >>>> Arthur >>>> >>>> >>>> Charles Doutriaux wrote: >>>> >>>> Seems right to me, >>>> >>>> Except that the syntax might scare a bit the new users :) >>>> >>>> C. >>>> >>>> Andrew.Dawson at uea.ac.uk wrote: >>>> >>>> Hi, >>>> >>>> I'm not sure if what I am about to suggest is a good idea >>>> or not, perhaps Charles will correct me if this is a bad >>>> idea for any reason. >>>> >>>> Lets say you have a cdms variable called U with NaNs as >>>> the missing >>>> value. First we can replace the NaNs with 1e20: >>>> >>>> U.data[numpy.where(numpy.isnan(U.data))] = 1e20 >>>> >>>> And remember to set the missing value of the variable >>>> appropriately: >>>> >>>> U.setMissing(1e20) >>>> >>>> I hope that helps, Andrew >>>> >>>> >>>> >>>> Hi Arthur, >>>> >>>> If i remember correctly the way i used to do it was: >>>> a= MV2.greater(data,1.) 
b=MV2.less_equal(data,1) >>>> c=MV2.logical_and(a,b) # Nan are the only one left >>>> data=MV2.masked_where(c,data) >>>> >>>> BUT I believe numpy now has way to deal with nan I >>>> believe it is numpy.nan_to_num But it replaces with 0 >>>> so it may not be what you >>>> want >>>> >>>> C. >>>> >>>> >>>> Arthur M. Greene wrote: >>>> >>>> A typical netcdf file is opened, and the single >>>> variable extracted: >>>> >>>> >>>> fpr=cdms.open('prTS2p1_SEA_allmos.cdf') >>>> pr0=fpr('prcp') type(pr0) >>>> >>>> >>>> >>>> Masked values (indicating ocean in this case) show >>>> up here as NaNs. >>>> >>>> >>>> pr0[0,-15:-5,0] >>>> >>>> prcp array([NaN NaN NaN NaN NaN NaN 0.37745094 >>>> 0.3460784 0.21960783 0.19117641]) >>>> >>>> So far this is all consistent. A map of the first >>>> time step shows the proper land-ocean boundaries, >>>> reasonable-looking values, and so on. But there >>>> doesn't seem to be any way to mask >>>> this array, so, e.g., an 'xy' average can be >>>> computed (it >>>> comes out all nans). NaN is not equal to anything >>>> -- even >>>> itself -- so there does not seem to be any >>>> condition, among the >>>> MV.masked_xxx options, that can be applied as a >>>> test. Also, it >>>> does not seem possible to compute seasonal averages, >>>> anomalies, etc. -- they also produce just NaNs. >>>> >>>> The workaround I've come up with -- for now -- is >>>> to first generate a new array of identical shape, >>>> filled with 1.0E+20. One test I've found that can >>>> detect NaNs is numpy.isnan: >>>> >>>> >>>> isnan(pr0[0,0,0]) >>>> >>>> True >>>> >>>> So it is _possible_ to tediously loop through >>>> every value in the old array, testing with isnan, >>>> then copying to the new array if the test fails. >>>> Then the axes have to be reset... >>>> >>>> isnan does not accept array arguments, so one >>>> cannot do, e.g., >>>> >>>> prmasked=MV.masked_where(isnan(pr0),pr0) >>>> >>>> The element-by-element conversion is quite slow. >>>> (I'm still waiting for it to complete, in fact). >>>> Any suggestions for dealing with NaN-infested data >>>> objects? >>>> >>>> Thanks! >>>> >>>> AMG >>>> >>>> P.S. This is 5.0.0.beta, RHEL4. >>>> >>>> >>>> *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~* >>>> Arthur M. Greene, Ph.D. 
>>>> The International Research Institute for Climate and Society >>>> The Earth Institute, Columbia University, Lamont Campus >>>> Monell Building, 61 Route 9W, Palisades, NY 10964-8000 USA >>>> amg*at*iri-dot-columbia\dot\edu | http:// iri.columbia.edu >>>> *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~* >>>> >>>> >>>> ------------------------------------------------------------------------- >>>> This SF.Net email is sponsored by the Moblin Your Move Developer's >>>> challenge >>>> Build the coolest Linux based applications with Moblin SDK & win >>>> great prizes >>>> Grand prize is a trip for two to an Open Source event anywhere in >>>> the world >>>> http:// moblin-contest.org/redirect.php?banner_id=100&url=/ >>>> >>>> _______________________________________________ >>>> Cdat-discussion mailing list >>>> Cdat-discussion at lists.sourceforge.net >>>> >>>> https:// lists.sourceforge.net/lists/listinfo/cdat-discussion >>>> >>>> >>>> >>>> >>>> -- >>>> Stephane Raynaud >>>> ------------------------------------------------------------------------ >>>> >>>> ------------------------------------------------------------------------- >>>> This SF.Net email is sponsored by the Moblin Your Move Developer's challenge >>>> Build the coolest Linux based applications with Moblin SDK & win great prizes >>>> Grand prize is a trip for two to an Open Source event anywhere in the world >>>> http:// moblin-contest.org/redirect.php?banner_id=100&url=/ >>>> ------------------------------------------------------------------------ >>>> >>>> _______________________________________________ >>>> Cdat-discussion mailing list >>>> Cdat-discussion at lists.sourceforge.net >>>> https:// lists.sourceforge.net/lists/listinfo/cdat-discussion >>>> >>>> >>>> >>>> >>> _______________________________________________ >>> Numpy-discussion mailing list >>> Numpy-discussion at scipy.org >>> http:// projects.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> >>> >> Please look the various NumPy functions to ignore NaN like nansum(). See >> the NumPy example list >> (http:// www. scipy.org/Numpy_Example_List_With_Doc) for examples under >> nan or individual functions. >> >> To get the mean you can do something like: >> >> import numpy >> x = numpy.array([2, numpy.nan, 1]) >> numpy.nansum(x)/(x.shape[0]-numpy.isnan(x).sum()) >> x_masked = numpy.ma.masked_where(numpy.isnan(x) , x) >> x_masked.mean() >> >> The real advantage of masked arrays is that you have greater control >> over the filtering so you can also filter extreme values: >> >> y = numpy.array([2, numpy.nan, 1, 1000]) >> y_masked =numpy.ma.masked_where(numpy.isnan(y) , y) >> y_masked =numpy.ma.masked_where(y_masked > 100 , y_masked) >> y_masked.mean() >> >> Regards >> Bruce >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http:// projects.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > You mean like doing: import numpy y=numpy.ma.MaskedArray([ 2., numpy.nan, 1., 1000.], numpy.isnan(y)) ? 
Bruce

From doutriaux1 at llnl.gov  Fri Jul 25 13:09:56 2008
From: doutriaux1 at llnl.gov (Charles Doutriaux)
Date: Fri, 25 Jul 2008 10:09:56 -0700
Subject: [Numpy-discussion] [Cdat-discussion] Arrays containing NaNs
In-Reply-To: <4889F985.501@gmail.com>
References: <1E3A72A9-08C0-4906-AE1E-6842FDDEC5E7@atmos.colostate.edu>
	<47EBFB1E.5000703@llnl.gov> <4847F6D0.6020309@iri.columbia.edu>
	<4888D69A.3060009@iri.columbia.edu> <4888DA62.6080807@llnl.gov>
	<63232.86.143.71.246.1216934359.squirrel@webmail.uea.ac.uk>
	<4888F342.9050302@llnl.gov> <4888F77E.3080406@iri.columbia.edu>
	<6ECC.6050509@llnl.gov> <4889F4A6.7020303@gmail.com>
	<841E.8080109@llnl.gov> <4889F985.501@gmail.com>
Message-ID: <488A08E4.1040609@llnl.gov>

I mean not having to do it myself:

data is a numpy array with NaN in it
masked_data = numpy.ma.array(data)

returns a masked array with a mask where NaN were in data.

C.

Bruce Southey wrote:
> You mean like doing:
>
> import numpy
> y = numpy.array([ 2., numpy.nan, 1., 1000.])
> y_masked = numpy.ma.MaskedArray(y, numpy.isnan(y))
>
> ?
> [...]
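For the record, numpy.ma.array does not do this today; the mask has to be
supplied explicitly (a minimal check):

import numpy as np

data = np.array([1.0, np.nan, 3.0])
m = np.ma.array(data)
print(m.mask)    # False -- the NaN is not masked automatically
print(m.mean())  # nan, the NaN still participates

m2 = np.ma.array(data, mask=np.isnan(data))  # the requested behaviour
print(m2.mean()) # 2.0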
From tom.duck at dal.ca  Fri Jul 25 13:42:05 2008
From: tom.duck at dal.ca (Thomas J. Duck)
Date: Fri, 25 Jul 2008 14:42:05 -0300
Subject: [Numpy-discussion] 0d array value comparisons
References: 
Message-ID: <006536B0-2A25-47ED-A970-CDD4892DF849@dal.ca>

Hi,

There is some unexpected behaviour (to me) when 0-dimensional arrays
are compared with values. For example:

 >>> numpy.array([0]).squeeze() == 0
 True

 >>> numpy.array([None]).squeeze() == None
 False

 >>> numpy.array(['a']).squeeze() == 'a'
 array(True, dtype=bool)

Note that each test follows the same pattern, although the dtype for each
squeezed array is different. The first case result is what I expected, and
the second case result appears wrong. The return type for the third case
is inconsistent with those before, but is at least workable.

Are these the intended results?

Thanks,

Tom

--
Thomas J. Duck
Associate Professor, Department of Physics and Atmospheric Science,
Dalhousie University, Halifax, Nova Scotia, Canada, B3H 3J5.
Tel: (902)494-1456 | Fax: (902)494-5191 | Lab: (902)494-3813
Web: http://aolab.phys.dal.ca/

From efiring at hawaii.edu  Fri Jul 25 14:00:23 2008
From: efiring at hawaii.edu (Eric Firing)
Date: Fri, 25 Jul 2008 08:00:23 -1000
Subject: [Numpy-discussion] [Cdat-discussion] Arrays containing NaNs
In-Reply-To: <488A08E4.1040609@llnl.gov>
References: <1E3A72A9-08C0-4906-AE1E-6842FDDEC5E7@atmos.colostate.edu>
	<47EBFB1E.5000703@llnl.gov> <4847F6D0.6020309@iri.columbia.edu>
	<4888D69A.3060009@iri.columbia.edu> <4888DA62.6080807@llnl.gov>
	<63232.86.143.71.246.1216934359.squirrel@webmail.uea.ac.uk>
	<4888F342.9050302@llnl.gov> <4888F77E.3080406@iri.columbia.edu>
	<6ECC.6050509@llnl.gov> <4889F4A6.7020303@gmail.com>
	<841E.8080109@llnl.gov> <4889F985.501@gmail.com>
	<488A08E4.1040609@llnl.gov>
Message-ID: <488A14B7.6070200@hawaii.edu>

Charles Doutriaux wrote:
> I mean not having to do it myself:
> data is a numpy array with NaN in it
> masked_data = numpy.ma.array(data)
> returns a masked array with a mask where NaN were in data.

Checking for nans is an expensive operation, so it makes sense to make it
optional rather than impose the cost on all masked array creations. If
you want the same effect, you can do this:

masked_data = numpy.ma.masked_invalid(data)

Eric

> [...]
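A rough illustration of the cost Eric mentions (timings are
machine-dependent; the array size is an arbitrary choice):

import numpy as np
import timeit

setup = "import numpy as np; data = np.zeros(1000000)"
t_plain = timeit.Timer("np.ma.array(data)", setup).timeit(number=100)
t_scan = timeit.Timer("np.ma.masked_invalid(data)", setup).timeit(number=100)
# masked_invalid pays for an extra validity (isfinite) pass over every
# element, which is exactly the cost being kept optional.
print(t_plain)
print(t_scan)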
From pgmdevlist at gmail.com  Fri Jul 25 14:12:04 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 25 Jul 2008 14:12:04 -0400
Subject: [Numpy-discussion] [Cdat-discussion] Arrays containing NaNs
In-Reply-To: <837F.4030605@llnl.gov>
References: <1E3A72A9-08C0-4906-AE1E-6842FDDEC5E7@atmos.colostate.edu>
	<837F.4030605@llnl.gov>
Message-ID: <200807251412.05550.pgmdevlist@gmail.com>

Oh, I guess this one's for me...

On Friday 25 July 2008 Charles Doutriaux wrote:

> Basically it was suggested to automatically mask NaN (and Inf?) when
> creating ma arrays.
> I'm sure you already thought of this on this list and was curious to
> know why you decided not to do it.

Because it's always best to let the user decide what to do with his/her
data and not impose anything?

Masking a point doesn't necessarily mean that the point is invalid (in the
sense of NaNs/Infs), just that it doesn't satisfy some particular
condition. In that sense, masks act as selecting tools.

By forcing invalid data to be masked at the creation of an array, you run
the risk of tampering with the (potential) physical meaning of the mask
you have given as input, and/or missing the fact that some data are
actually invalid when you don't expect it.

Let's take an example:
I want to analyze sea surface temperatures at the world scale. The data
comes as a regular 2D ndarray, with NaNs for missing or invalid data. In a
first step, I create a masked array of this data, filtering out the land
masses by a predefined geographical mask. The remaining NaNs in the masked
array indicate areas where the sensor failed... It's important information
I would probably have missed by masking all the NaNs at first...

As Eric F. suggested, you can use numpy.ma.masked_invalid to create a
masked array with NaNs/Infs filtered out:

>>> import numpy as np, numpy.ma as ma
>>> x = np.array([1,2,None,4], dtype=float)
>>> x
array([  1.,   2.,  NaN,   4.])
>>> mx = ma.masked_invalid(x)
>>> mx
masked_array(data = [1.0 2.0 -- 4.0],
      mask = [False False  True False],
      fill_value=1e+20)

Note that the underlying data still has NaNs/Infs:

>>> mx._data
array([  1.,   2.,  NaN,   4.])

You can also use the ma.fix_invalid function: it creates a mask where the
data is not finite (NaNs/Infs), and sets the corresponding points to
fill_value.

>>> mx = ma.fix_invalid(x, fill_value=999)
>>> mx
masked_array(data = [1.0 2.0 -- 4.0],
      mask = [False False  True False],
      fill_value=1e+20)
>>> mx._data
array([   1.,    2.,  999.,    4.])

The advantage of the second approach is that you no longer have NaNs/Infs
in the underlying data, which speeds things up during computation. The
obvious disadvantage is that you no longer know where the data was
invalid...

From doutriaux1 at llnl.gov  Fri Jul 25 14:43:41 2008
From: doutriaux1 at llnl.gov (Charles Doutriaux)
Date: Fri, 25 Jul 2008 11:43:41 -0700
Subject: [Numpy-discussion] [Cdat-discussion] Arrays containing NaNs
In-Reply-To: <200807251412.05550.pgmdevlist@gmail.com>
References: <1E3A72A9-08C0-4906-AE1E-6842FDDEC5E7@atmos.colostate.edu>
	<837F.4030605@llnl.gov> <200807251412.05550.pgmdevlist@gmail.com>
Message-ID: <488A1EDD.4010301@llnl.gov>

Hi Pierre,

Thanks for the answer, I'm ccing cdat's discussion list.

It makes sense; that's also the way we develop things here: NEVER assume
what the user is going to do with the data, BUT give the user the
necessary tools to do what you're assuming he/she wants to do (as simply
as possible).

Thanks again for the answer.

C.

Pierre GM wrote:
> Oh, I guess this one's for me...
> [...]

From cburns at berkeley.edu  Fri Jul 25 14:48:13 2008
From: cburns at berkeley.edu (Christopher Burns)
Date: Fri, 25 Jul 2008 11:48:13 -0700
Subject: [Numpy-discussion] numpy-1.1.1rc2 Mac binary - Please Test.
Message-ID: <764e38540807251148i52adb46ahdd58f6b79e116cae@mail.gmail.com>

Reminder, please test the Mac installer for rc2 so we have time to fix any
bugs before the release next week.

Also, I committed my build script to trunk/tools/osxbuild.
bdist_mpkg 0.4.3 is required.

Thank you,
Chris

On Thu, Jul 24, 2008 at 11:03 AM, Jarrod Millman wrote:
> Hello,
>
> The 1.1.1rc2 is now available:
> http://svn.scipy.org/svn/numpy/tags/1.1.1rc2
>
> The source tarball is here:
> http://cirl.berkeley.edu/numpy/numpy-1.1.1rc2.tar.gz
>
> Here is the universal Mac binary:
> http://cirl.berkeley.edu/numpy/numpy-1.1.1rc2-py2.5-macosx10.5.dmg
>
> David Cournapeau will be creating a 1.1.1rc2 Windows binary in next few
> days.
>
> Please test this release ASAP and let us know if there are any
> problems. If there are no show stoppers, this will likely become the
> 1.1.1 release.
>
> Thanks,
>
> --
> Jarrod Millman
> Computational Infrastructure for Research Labs
> 10 Giannini Hall, UC Berkeley
> phone: 510.643.4014
> http://cirl.berkeley.edu/

From alan.mcintyre at gmail.com  Fri Jul 25 15:00:42 2008
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Fri, 25 Jul 2008 15:00:42 -0400
Subject: [Numpy-discussion] numpy-1.1.1rc2 Mac binary - Please Test.
In-Reply-To: <764e38540807251148i52adb46ahdd58f6b79e116cae@mail.gmail.com>
References: <764e38540807251148i52adb46ahdd58f6b79e116cae@mail.gmail.com>
Message-ID: <1d36917a0807251200ud6665cdy49b384829c66e2f@mail.gmail.com>

On Fri, Jul 25, 2008 at 2:48 PM, Christopher Burns wrote:
> Reminder, please test the Mac installer for rc2 so we have time to fix
> any bugs before the release next week.

I just tried it; it installs with no problems and tests run with no
failures.

From dfranci at seas.upenn.edu  Fri Jul 25 15:32:56 2008
From: dfranci at seas.upenn.edu (Frank Lagor)
Date: Fri, 25 Jul 2008 15:32:56 -0400
Subject: [Numpy-discussion] curious problem with SVD
Message-ID: <9fddf64a0807251232q26d14fe2vfffbdc850b37be3c@mail.gmail.com>

Perhaps I do not understand something properly, if so could someone please
explain the behavior I notice with numpy.linalg.svd when acting on arrays.
It gives the incorrect answer, but works fine with matrices. My numpy is 1.1.0. >>> R = n.array([[3.6,.35],[.35,1.8]]) >>> V,D,W = n.linalg.svd(R) >>> V*n.diag(D)*W.transpose() array([[ 3.5410365 , 0. ], [ 0. , 1.67537611]]) >>> R = n.matrix([[3.6,.35],[.35,1.8]]) >>> V,D,W = n.linalg.svd(R) >>> V*n.diag(D)*W.transpose() matrix([[ 3.6 , 0.35], [ 0.35, 1.8 ]]) Thanks in advance, Frank -- Frank Lagor Ph.D. Candidate Mechanical Engineering and Applied Mechanics University of Pennsylvania -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Fri Jul 25 15:36:28 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 25 Jul 2008 12:36:28 -0700 Subject: [Numpy-discussion] curious problem with SVD In-Reply-To: <9fddf64a0807251232q26d14fe2vfffbdc850b37be3c@mail.gmail.com> References: <9fddf64a0807251232q26d14fe2vfffbdc850b37be3c@mail.gmail.com> Message-ID: On Fri, Jul 25, 2008 at 12:32 PM, Frank Lagor wrote: > Perhaps I do not understand something properly, if so could someone please > explain the behavior I notice with numpy.linalg.svd when acting on arrays. > It gives the incorrect answer, but works fine with matrices. My numpy is > 1.1.0. > >>>> R = n.array([[3.6,.35],[.35,1.8]]) >>>> V,D,W = n.linalg.svd(R) >>>> V*n.diag(D)*W.transpose() > array([[ 3.5410365 , 0. ], > [ 0. , 1.67537611]]) >>>> R = n.matrix([[3.6,.35],[.35,1.8]]) >>>> V,D,W = n.linalg.svd(R) >>>> V*n.diag(D)*W.transpose() > matrix([[ 3.6 , 0.35], > [ 0.35, 1.8 ]]) '*' does element-by-element multiplication for arrays but matrix multiplication for matrices. From kwgoodman at gmail.com Fri Jul 25 15:39:23 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 25 Jul 2008 12:39:23 -0700 Subject: [Numpy-discussion] curious problem with SVD In-Reply-To: References: <9fddf64a0807251232q26d14fe2vfffbdc850b37be3c@mail.gmail.com> Message-ID: On Fri, Jul 25, 2008 at 12:36 PM, Keith Goodman wrote: > On Fri, Jul 25, 2008 at 12:32 PM, Frank Lagor wrote: >> Perhaps I do not understand something properly, if so could someone please >> explain the behavior I notice with numpy.linalg.svd when acting on arrays. >> It gives the incorrect answer, but works fine with matrices. My numpy is >> 1.1.0. >> >>>>> R = n.array([[3.6,.35],[.35,1.8]]) >>>>> V,D,W = n.linalg.svd(R) >>>>> V*n.diag(D)*W.transpose() >> array([[ 3.5410365 , 0. ], >> [ 0. , 1.67537611]]) >>>>> R = n.matrix([[3.6,.35],[.35,1.8]]) >>>>> V,D,W = n.linalg.svd(R) >>>>> V*n.diag(D)*W.transpose() >> matrix([[ 3.6 , 0.35], >> [ 0.35, 1.8 ]]) > > '*' does element-by-element multiplication for arrays but matrix > multiplication for matrices. As a check (for the array case): >> n.dot(V, n.dot(n.diag(D), W.transpose())) # That's hard to read! array([[ 3.6 , 0.35], [ 0.35, 1.8 ]]) From robert.kern at gmail.com Fri Jul 25 15:50:24 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 25 Jul 2008 14:50:24 -0500 Subject: [Numpy-discussion] curious problem with SVD In-Reply-To: <9fddf64a0807251232q26d14fe2vfffbdc850b37be3c@mail.gmail.com> References: <9fddf64a0807251232q26d14fe2vfffbdc850b37be3c@mail.gmail.com> Message-ID: <3d375d730807251250v768ac25cl6de880c0b48d622b@mail.gmail.com> On Fri, Jul 25, 2008 at 14:32, Frank Lagor wrote: > Perhaps I do not understand something properly, if so could someone please > explain the behavior I notice with numpy.linalg.svd when acting on arrays. > It gives the incorrect answer, but works fine with matrices. My numpy is > 1.1.0. 
From robert.kern at gmail.com  Fri Jul 25 15:50:24 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 25 Jul 2008 14:50:24 -0500
Subject: [Numpy-discussion] curious problem with SVD
In-Reply-To: <9fddf64a0807251232q26d14fe2vfffbdc850b37be3c@mail.gmail.com>
References: <9fddf64a0807251232q26d14fe2vfffbdc850b37be3c@mail.gmail.com>
Message-ID: <3d375d730807251250v768ac25cl6de880c0b48d622b@mail.gmail.com>

On Fri, Jul 25, 2008 at 14:32, Frank Lagor wrote:
> Perhaps I do not understand something properly, if so could someone please
> explain the behavior I notice with numpy.linalg.svd when acting on arrays.
> It gives the incorrect answer, but works fine with matrices.  My numpy is
> 1.1.0.
>
>>>> R = n.array([[3.6,.35],[.35,1.8]])
>>>> V,D,W = n.linalg.svd(R)
>>>> V*n.diag(D)*W.transpose()

For regular arrays, * is element-wise multiplication, not matrix
multiplication. For matrix objects, * is matrix multiplication.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From arnar.flatberg at gmail.com  Fri Jul 25 15:53:09 2008
From: arnar.flatberg at gmail.com (Arnar Flatberg)
Date: Fri, 25 Jul 2008 21:53:09 +0200
Subject: [Numpy-discussion] curious problem with SVD
In-Reply-To: 
References: <9fddf64a0807251232q26d14fe2vfffbdc850b37be3c@mail.gmail.com>
Message-ID: <5d3194020807251253q18768b1cgec941e31b2a9c00b@mail.gmail.com>

On Fri, Jul 25, 2008 at 9:39 PM, Keith Goodman wrote:

> On Fri, Jul 25, 2008 at 12:36 PM, Keith Goodman wrote:
> > On Fri, Jul 25, 2008 at 12:32 PM, Frank Lagor wrote:
> >> Perhaps I do not understand something properly, if so could someone
> >> please explain the behavior I notice with numpy.linalg.svd when acting
> >> on arrays. It gives the incorrect answer, but works fine with matrices.
> >
> > '*' does element-by-element multiplication for arrays but matrix
> > multiplication for matrices.
>
>>> n.dot(V, n.dot(n.diag(D), W.transpose())) # That's hard to read!

Just two small points:

1.) Broadcasting may be easier on the eye ... well, at least when you have
gotten used to it. Then the above is np.dot(V*D, W)

2.) Also, note that the right hand side eigenvectors in numpy's svd routine
are ordered by rows!
Yes, I know this is confusing as it is different from just about any other
linear algebra software out there, but the documentation is clear. It is
also a little inconsistent with eig and eigh; some more experienced user can
probably answer why it is like that?

Arnar

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
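For reference, both correct reconstructions in one short self-contained
session (added here for illustration; np.linalg.svd returns the right
singular vectors in the *rows* of W, so no transpose is needed, and R is
the matrix from Frank's post):

>>> import numpy as np
>>> R = np.array([[3.6, 0.35], [0.35, 1.8]])
>>> V, D, W = np.linalg.svd(R)
>>> # Explicit products: V * diag(D) * W recovers R.
>>> np.allclose(np.dot(V, np.dot(np.diag(D), W)), R)
True
>>> # Arnar's broadcasting shortcut: V*D scales the columns of V by the
>>> # singular values, which is the same as V.dot(diag(D)).
>>> np.allclose(np.dot(V * D, W), R)
True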
From dfranci at seas.upenn.edu  Fri Jul 25 16:01:32 2008
From: dfranci at seas.upenn.edu (Frank Lagor)
Date: Fri, 25 Jul 2008 16:01:32 -0400
Subject: [Numpy-discussion] Numpy-discussion Digest, Vol 22, Issue 109
In-Reply-To: 
References: 
Message-ID: <9fddf64a0807251301u6daf28fbq80ba64d6adfb31d3@mail.gmail.com>

Thanks so much for your help on the '*' confusion.  It makes sense now.

Thanks,
Frank

-- 
Frank Lagor
Ph.D. Candidate
Mechanical Engineering and Applied Mechanics
University of Pennsylvania
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From rpyle at post.harvard.edu  Fri Jul 25 16:13:43 2008
From: rpyle at post.harvard.edu (Robert Pyle)
Date: Fri, 25 Jul 2008 16:13:43 -0400
Subject: [Numpy-discussion] numpy-1.1.1rc2 Mac binary - Please Test.
In-Reply-To: <764e38540807251148i52adb46ahdd58f6b79e116cae@mail.gmail.com>
References: <764e38540807251148i52adb46ahdd58f6b79e116cae@mail.gmail.com>
Message-ID: <7B1E48D6-40D7-45C1-BDF0-CCD4685D6FE4@post.harvard.edu>

On Jul 25, 2008, at 2:48 PM, Christopher Burns wrote:

> Reminder, please test the Mac installer for rc2 so we have time to
> fix any bugs before the release next week.

Dual G5, 10.5.4, Python 2.5.2 (r252:60911, Feb 22 2008, 07:57:53)

installed as expected, passed all tests:

Ran 1300 tests in 3.438s

OK


MacBook Pro, Intel Core 2 Duo, 10.5.4, Python 2.5.2 (r252:60911, Feb
22 2008, 07:57:53)

installed as expected, failed one test:

FAIL: check_testUfuncRegression (numpy.core.tests.test_ma.test_ufuncs)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/
python2.5/site-packages/numpy/core/tests/test_ma.py", line 692, in
check_testUfuncRegression
    self.failUnless(eqmask(ur.mask, mr.mask))
AssertionError

----------------------------------------------------------------------
Ran 1336 tests in 3.104s

FAILED (failures=1)

From grs2103 at columbia.edu  Fri Jul 25 16:24:35 2008
From: grs2103 at columbia.edu (Gideon Simpson)
Date: Fri, 25 Jul 2008 16:24:35 -0400
Subject: [Numpy-discussion] exponentiation q.
Message-ID: <97F0E115-8761-4107-B3F9-FB8950B3FDAF@columbia.edu>

How does python (or numpy/scipy) do exponentiation?  If I do x**p,
where p is some positive integer, will it compute x*x*...*x (p times),
or will it use logarithms?

-gideon

From kwgoodman at gmail.com  Fri Jul 25 16:32:47 2008
From: kwgoodman at gmail.com (Keith Goodman)
Date: Fri, 25 Jul 2008 13:32:47 -0700
Subject: [Numpy-discussion] exponentiation q.
In-Reply-To: <97F0E115-8761-4107-B3F9-FB8950B3FDAF@columbia.edu>
References: <97F0E115-8761-4107-B3F9-FB8950B3FDAF@columbia.edu>
Message-ID: 

On Fri, Jul 25, 2008 at 1:24 PM, Gideon Simpson wrote:
> How does python (or numpy/scipy) do exponentiation?  If I do x**p,
> where p is some positive integer, will it compute x*x*...*x (p times),
> or will it use logarithms?

Here are some examples:

>> np.array([[1,2], [3,4]])**2

array([[ 1,  4],
       [ 9, 16]])

>> np.matrix([[1,2], [3,4]])**2

matrix([[ 7, 10],
        [15, 22]])

>> np.power(np.matrix([[1,2], [3,4]]), 2)

matrix([[ 1,  4],
        [ 9, 16]])

From kwgoodman at gmail.com  Fri Jul 25 16:33:34 2008
From: kwgoodman at gmail.com (Keith Goodman)
Date: Fri, 25 Jul 2008 13:33:34 -0700
Subject: [Numpy-discussion] exponentiation q.
In-Reply-To: 
References: <97F0E115-8761-4107-B3F9-FB8950B3FDAF@columbia.edu>
Message-ID: 

On Fri, Jul 25, 2008 at 1:32 PM, Keith Goodman wrote:
> On Fri, Jul 25, 2008 at 1:24 PM, Gideon Simpson wrote:
>> How does python (or numpy/scipy) do exponentiation?  If I do x**p,
>> where p is some positive integer, will it compute x*x*...*x (p times),
>> or will it use logarithms?
>
> Here are some examples:
>
>>> np.array([[1,2], [3,4]])**2
>
> array([[ 1,  4],
>        [ 9, 16]])
>
>>> np.matrix([[1,2], [3,4]])**2
>
> matrix([[ 7, 10],
>        [15, 22]])
>
>>> np.power(np.matrix([[1,2], [3,4]]), 2)
>
> matrix([[ 1,  4],
>        [ 9, 16]])

Sorry. I see now you're worried about under/overflow.

From pav at iki.fi  Fri Jul 25 16:39:34 2008
From: pav at iki.fi (Pauli Virtanen)
Date: Fri, 25 Jul 2008 20:39:34 +0000 (UTC)
Subject: [Numpy-discussion] exponentiation q.
References: <97F0E115-8761-4107-B3F9-FB8950B3FDAF@columbia.edu>
Message-ID: 

Fri, 25 Jul 2008 16:24:35 -0400, Gideon Simpson wrote:
> How does python (or numpy/scipy) do exponentiation?  If I do x**p, where
> p is some positive integer, will it compute x*x*...*x (p times), or will
> it use logarithms?

For floats it will call operating system's pow, which supposedly is
optimized.

For integer powers |n| < 100 of complex numbers it computes the power by
repeated squarings and multiplications using a binary decomposition, and
for larger |n| falls back to using a logarithm.

For integer powers of integers, it also appears to use repeated squarings
and multiplications according to a binary decomposition.

-- 
Pauli Virtanen
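The "repeated squarings and multiplications using a binary decomposition"
can be sketched in a few lines of pure Python -- an illustration only, not
the actual CPython or NumPy source:

def binary_pow(x, n):
    # x**n for integer n >= 0: square x once per bit of n, and multiply
    # the current square into the result whenever that bit is set.
    result = 1
    while n:
        if n & 1:
            result *= x
        x *= x
        n >>= 1
    return result

This needs only O(log n) multiplications instead of the n-1 of the naive
product, which is why no logarithms are needed for moderate integer
exponents.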
From pgmdevlist at gmail.com  Fri Jul 25 16:47:02 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 25 Jul 2008 16:47:02 -0400
Subject: [Numpy-discussion] RFC: A (second) proposal for implementing
	some date/time types in NumPy
In-Reply-To: <200807251309.34418.faltet@pytables.org>
References: <200807161844.36953.faltet@pytables.org>
	<20080718144247.GA5698@tardis.terramar.selidor.net>
	<200807251309.34418.faltet@pytables.org>
Message-ID: <200807251647.03102.pgmdevlist@gmail.com>

Francesc,
Could you clarify a couple of points?

[datetime64]
If I understand properly, your datetime64 would be time units from the POSIX
epoch (1970/01/01 00:00:00), right? So

+7d would be 1970/01/08 (7 days after the epoch)
-7W would be 1969/11/13 (7*7 days before the epoch)

With this approach, a series [1,2,3,7] at a resolution 'd' would correspond
to 1970/01/01, 1970/01/02, 1970/01/03 and 1970/01/07, right?
I'm all for that, **AS LONG AS we have a business day resolution** 'b', so
that +7b would be 1970/01/09.

[timedelta64]
I like your idea of a timedelta64 being relative, but in that case, why not
have the same resolutions as datetime64?

[scikits.timeseries]
We can currently perform the following operations in scikits.timeseries

>>>import scikits.timeseries as ts
>>>series = ts.date_array(['1970-01', '1970-02', '1970-09'], freq='M')
>>>series
DateArray([Jan-1970, Feb-1970, Sep-1970], freq='M')
>>>series.asfreq('A')
DateArray([1970, 1970, 1970], freq='A-DEC')
>>>series.asfreq('A-MAR')
DateArray([1970, 1970, 1971], freq='A-MAR')

"A-MAR" means that year YY ends on 03/31 and that year (YY+1) starts on
04/01. I use that a lot in my work, when I need to average daily data by
water years (a water year starts usually on 04/01 and ends the following
03/31).

How would I do that with datetime64 and timedelta64?

Apart from that, I'd be of course quite happy to help as much as I can.
P.

############################################

On Friday 25 July 2008 07:09:33 Francesc Alted wrote:
> Hi,
>
> Well, as there were no replies to our second proposal for the date/time
> dtype, I assume that everybody agrees with it ;-)  At any rate, we would
> like to proceed with the implementation phase very soon now.
>
> However, it happens that Enthought is sponsoring this job and they
> clearly stated that the implementation should cover the needs of as
> many users as possible.  So, most in particular, we would like that one
> of the heaviest users of date/time objects, i.e. the TimeSeries
> authors, would be comfortable with the new date/time dtypes, and
> especially that they can benefit from them.
>
> For this goal, we are proposing a decoupling of the date/time use cases
> in two different groups:
>
> 1. A pure ``datetime`` dtype (absolute or relative) that would be useful
> for timestamping purposes in general (i.e. registering dates without a
> need that they be evenly spaced in time).
>
> 2. A class based on the ``frequency`` concept that would be useful for
> measurements that are done on a regular basis or in business
> applications.
>
> With this, we are preventing the dtype implementation at the core of
> NumPy from being too cluttered with the relatively complex needs of the
> ``frequency`` concept users, factoring it out to an external class
> (``Date`` to follow the TimeSeries naming convention).  More
> importantly, this decoupling will also avoid the mix of those two
> concepts that, although they are about time measurements, have
> quite different meanings indeed.
>
> Another important advantage of this distinction is that the ``datetime``
> timestamp requires less meta-information to worry about (basically,
> the 'resolution' property), while a ``frequency`` à la TimeSeries will
> need more additional meta-information, like the 'start' and 'end' of
> periods, as well as a more complex way to code frequencies (there
> exist many more time-periods to be coded, as can be seen in [1]_).
> This can be utterly important to allow the NumPy data based on the
> ``datetime`` dtype to be quickly saved and retrieved on databases like
> ZODB (object database) or PyTables (HDF5-based database).
>
> Our ultimate goal is that the ``Date`` and ``DateArray`` classes in the
> TimeSeries would be rewritten in terms of the new date/time dtype so as
> to get advantage of its features but also for getting rid of duplicated
> code.  I honestly think that this can be a big advantage for TimeSeries
> indeed (at the cost of taking some time for doing the migration).
>
> Does that approach make sense for people?
>
> .. [1] http://scipy.org/scipy/scikits/wiki/TimeSeries#Frequencies
From charlesr.harris at gmail.com  Fri Jul 25 17:22:51 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 25 Jul 2008 15:22:51 -0600
Subject: [Numpy-discussion] numpy-1.1.1rc2 Mac binary - Please Test.
In-Reply-To: <7B1E48D6-40D7-45C1-BDF0-CCD4685D6FE4@post.harvard.edu>
References: <764e38540807251148i52adb46ahdd58f6b79e116cae@mail.gmail.com>
	<7B1E48D6-40D7-45C1-BDF0-CCD4685D6FE4@post.harvard.edu>
Message-ID: 

On Fri, Jul 25, 2008 at 2:13 PM, Robert Pyle wrote:

>
> On Jul 25, 2008, at 2:48 PM, Christopher Burns wrote:
>
> > Reminder, please test the Mac installer for rc2 so we have time to
> > fix any bugs before the release next week.
> > Dual G5, 10.5.4, Python 2.5.2 (r252:60911, Feb 22 2008, 07:57:53) > > installed as expected, passed all tests: > > Ran 1300 tests in 3.438s > > OK > > > > > MacBook Pro, Intel Core 2 Duo, 10.5.4, Python 2.5.2 (r252:60911, Feb > 22 2008, 07:57:53) > > installed as expected, failed one test: > > FAIL: check_testUfuncRegression (numpy.core.tests.test_ma.test_ufuncs) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ > python2.5/site-packages/numpy/core/tests/test_ma.py", line 692, in > check_testUfuncRegression > self.failUnless(eqmask(ur.mask, mr.mask)) > AssertionError > Curious. Would you be willing to download the tarball and help us debug this problem? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From arnar.flatberg at gmail.com Fri Jul 25 18:01:36 2008 From: arnar.flatberg at gmail.com (Arnar Flatberg) Date: Sat, 26 Jul 2008 00:01:36 +0200 Subject: [Numpy-discussion] Change of SVD output Message-ID: <5d3194020807251501i28d7b3d4m31fdfe08efee1996@mail.gmail.com> Hi In a recent thread there was an error in how a matrix is reconstructed from its SVD decomposition. I apologize if this is just an old and settled issue and I am just adding noise, but I got bitten by numpy's unfamiliar output myself a long time ago and I see others get confused as well. So what is the issue? Let the svd decomposition of X be USV: U, S, V = np.linalg.svd(X), Then U has X's left singular vectors in its columns, S contains the singular values, and V has X's right singular vectors in its *rows*. The reconstruction of X is (matrix notation) will be: U*diag(S)*V (not U*diag(S)*V.T). All other high-level software (Matlab, Scilab, Mathematica, R, etc), outputs the right singular vectors columnwise, that is V = [v1, v2, v3, ... vn], where vn is a column (eigenvector) in V, thus the reconstruction would be U*diag(S)*V.T. Also, as far as I know, most linear algebra textbooks operate with eigenvectors consistent as column vectors in explanations of the SVD. I think numpy's svd should do so too. I know lapack's dgesdd returns V.T (or conjugate), and specify that in its documentation so this is a true interface of the library, but I still think its wrong and its just too confusing for any beginner who usually has experience in other software, such as matlab, prior to numpy. Also, ,as I am typing here, I realize that changing the output would break lots of stuff, and pass silently through many tests as the shape of V is similar (if full_matrices=0). Oh well, I guess that proposal is off the table? Perhaps some *stronger* hints in the documentation are needed. Arnar PS: In the docs at http://www.scipy.org/NumPy_for_Matlab_Users , the svd equivalents have wrong notation, this is not helping :-). I didnt manage to change it, perhaps some other may be so kind? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Fri Jul 25 18:10:55 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 25 Jul 2008 17:10:55 -0500 Subject: [Numpy-discussion] Change of SVD output In-Reply-To: <5d3194020807251501i28d7b3d4m31fdfe08efee1996@mail.gmail.com> References: <5d3194020807251501i28d7b3d4m31fdfe08efee1996@mail.gmail.com> Message-ID: <3d375d730807251510n2489d9c6kf35c9313fb9d7373@mail.gmail.com> On Fri, Jul 25, 2008 at 17:01, Arnar Flatberg wrote: > Hi > > In a recent thread there was an error in how a matrix is reconstructed from > its SVD decomposition. I apologize if this is just an old and settled issue > and I am just adding noise, but I got bitten by numpy's unfamiliar output > myself a long time ago and I see others get confused as well. So what is the > issue? > > Let the svd decomposition of X be USV: > U, S, V = np.linalg.svd(X), > Then U has X's left singular vectors in its columns, S contains the singular > values, and V has X's right singular vectors in its *rows*. The > reconstruction of X is (matrix notation) will be: U*diag(S)*V (not > U*diag(S)*V.T). > All other high-level software (Matlab, Scilab, Mathematica, R, etc), outputs > the right singular vectors columnwise, that is V = [v1, v2, v3, ... vn], > where vn is a column (eigenvector) in V, thus the reconstruction would be > U*diag(S)*V.T. Also, as far as I know, most linear algebra textbooks operate > with eigenvectors consistent as column vectors in explanations of the SVD. I > think numpy's svd should do so too. > > I know lapack's dgesdd returns V.T (or conjugate), and specify that in its > documentation so this is a true interface of the library, but I still think > its wrong and its just too confusing for any beginner who usually has > experience in other software, such as matlab, prior to numpy. Also, ,as I am > typing here, I realize that changing the output would break lots of stuff, > and pass silently through many tests as the shape of V is similar (if > full_matrices=0). Oh well, I guess that proposal is off the table? Yes. > Perhaps > some *stronger* hints in the documentation are needed. Please do. http://sd-2116.dedibox.fr/pydocweb/wiki/Front%20Page/ > Arnar > > PS: > In the docs at http://www.scipy.org/NumPy_for_Matlab_Users , the svd > equivalents have wrong notation, this is not helping :-). I didnt manage to > change it, perhaps some other may be so kind? You will need to register an account. Make an account here: http://www.scipy.org/UserPreferences If you do have an account, and you are still having problems, it may be because of the account filtering I am using to try to control the spam problem. Try a different account name. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From cburns at berkeley.edu Fri Jul 25 19:19:52 2008 From: cburns at berkeley.edu (Christopher Burns) Date: Fri, 25 Jul 2008 16:19:52 -0700 Subject: [Numpy-discussion] numpy-1.1.1rc2 Mac binary - Please Test. In-Reply-To: <7B1E48D6-40D7-45C1-BDF0-CCD4685D6FE4@post.harvard.edu> References: <764e38540807251148i52adb46ahdd58f6b79e116cae@mail.gmail.com> <7B1E48D6-40D7-45C1-BDF0-CCD4685D6FE4@post.harvard.edu> Message-ID: <764e38540807251619s2adc7ffcu532a4141a43e3707@mail.gmail.com> Robert, numpy/core/tests/test_ma.py is an old file from a previous install. You need to remove the numpy directory and reinstall. 
Unfortunately the installer does not cleanup old installs. Chris On Fri, Jul 25, 2008 at 1:13 PM, Robert Pyle wrote: > > MacBook Pro, Intel Core 2 Duo, 10.5.4, Python 2.5.2 (r252:60911, Feb > 22 2008, 07:57:53) > > installed as expected, failed one test: > > FAIL: check_testUfuncRegression (numpy.core.tests.test_ma.test_ufuncs) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ > python2.5/site-packages/numpy/core/tests/test_ma.py", line 692, in > check_testUfuncRegression > self.failUnless(eqmask(ur.mask, mr.mask)) > AssertionError > > ---------------------------------------------------------------------- > Ran 1336 tests in 3.104s > > FAILED (failures=1) > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Jul 25 19:56:11 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 25 Jul 2008 17:56:11 -0600 Subject: [Numpy-discussion] numpy-1.1.1rc2 Mac binary - Please Test. In-Reply-To: <764e38540807251619s2adc7ffcu532a4141a43e3707@mail.gmail.com> References: <764e38540807251148i52adb46ahdd58f6b79e116cae@mail.gmail.com> <7B1E48D6-40D7-45C1-BDF0-CCD4685D6FE4@post.harvard.edu> <764e38540807251619s2adc7ffcu532a4141a43e3707@mail.gmail.com> Message-ID: On Fri, Jul 25, 2008 at 5:19 PM, Christopher Burns wrote: > Robert, > > numpy/core/tests/test_ma.py is an old file from a previous install. You > need to remove the numpy directory and reinstall. > Whew! Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Fri Jul 25 20:19:47 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 26 Jul 2008 02:19:47 +0200 Subject: [Numpy-discussion] 0d array value comparisons In-Reply-To: <006536B0-2A25-47ED-A970-CDD4892DF849@dal.ca> References: <006536B0-2A25-47ED-A970-CDD4892DF849@dal.ca> Message-ID: <9457e7c80807251719o6e247c23sb2dcf93683e2ea2d@mail.gmail.com> 2008/7/25 Thomas J. Duck : > Hi, > > There is some unexpected behaviour (to me) when 0-dimensional > arrays are compared with values. For example: > > >>> numpy.array([0]).squeeze() == 0 > True > > >>> numpy.array([None]).squeeze() == None > False > > >>> numpy.array(['a']).squeeze() == 'a' > array(True, dtype=bool) > > Note that each test follows the same pattern, although the dtype for > each squeezed array is different. The first case result is what I > expected, and the second case result appears wrong. The return type > for the third case is inconsistent with those before, but is at least > workable. > > Are these the intended results? I would think not. E.g., in the following case broadcasting should kick in: In [20]: np.array([None, None, None]) == None Out[20]: False In [21]: np.array([None, None, None]) == [None] Out[21]: array([ True, True, True], dtype=bool) Cheers St?fan From mattknox.ca at gmail.com Fri Jul 25 21:22:14 2008 From: mattknox.ca at gmail.com (Matt Knox) Date: Sat, 26 Jul 2008 01:22:14 +0000 (UTC) Subject: [Numpy-discussion] =?utf-8?q?RFC=3A_A_=28second=29_proposal_for_i?= =?utf-8?q?mplementing=09some_date/time_types_in_NumPy?= References: <200807161844.36953.faltet@pytables.org> <20080718144247.GA5698@tardis.terramar.selidor.net> <200807251309.34418.faltet@pytables.org> Message-ID: >> For this goal, we are proposing a decoupling of the date/time use cases >> in two different groups: >> >> 1. 
A pure ``datetime`` dtype (absolute or relative) that would be useful >> for timestamping purposes in general (i.e. registering dates without a >> need that they be evenly spaced in time). I agree with this split. A basic datetime data type would be useful to a lot of people that don't need fancier time series capabilities. I would recommend focusing on implementing this first as it will probably provide lots of useful learning experiences and examples for the more complicated task of a "frequency" aware date type later on. >> 2. A class based on the ``frequency`` concept that would be useful for >> measurements that are done on a regular basis or in business >> applications. >> ... >> Our ultimate goal is that the ``Date`` and ``DateArray`` classes in the >> TimeSeries would be rewritten in terms of the new date/time dtype so as >> to get advantage of its features but also for getting rid of duplicated >> code. I'm excited to hear such interest in time series work with python and numpy. I certainly support the goals and more collaboration and sharing of code is always a good thing. My biggest concern would be not losing existing functionality. A decent amount of work went into implementing all the different frequencies, and losing any of the currently supported frequencies could mean the difference between the dtype being very useful to someone, or not useful at all. Just thinking out loud here... but in terms of improving on the Date implementation in the timeseries module, it would be nice to have a more "plug in" kind of architecture for implementing different frequencies so that it could be extended more easily with custom frequencies by other users. There is no end to the list of possible frequencies that people might potentially use and the current timeseries implementation isn't as flexibile as it could be in that area. The automatic string parsing has been mentioned before, but it is a feature I am personally very fond of. I use it all the time, and I suspect a lot of people would like it very much if they used it. It's not suited for high performance code, but is fantastic for interactive and ad-hoc work. This is supported right in the "constructor" of the current Date class, along with conversion from datetime objects. I'd love to see such support built into the new date type, although I guess it could be added on easily enough with a factory function. Another extra feature (or hack depending on your point of view) in the timeseries Date class is the addition of a couple extra custom directives for string formatting. Specifically the %q and %Q directives for printing out Quarter information. Obviously these are non-standard directives, but when you are talking about dates with custom frequencies I think it sometimes make sense to have custom format directives. A plug in architecture that somehow lets you define new custom directives for various frequencies would also be really nice. Anyway, I'm very much in support of this initiative. I'm not sure I'll be able to help much on the initial implementation, but once you have a framework in place I may be able to pitch in with some of the details. Please keep us posted. 
- Matt From david at ar.media.kyoto-u.ac.jp Fri Jul 25 21:18:50 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 26 Jul 2008 10:18:50 +0900 Subject: [Numpy-discussion] 1.1.1rc2 tagged In-Reply-To: References: Message-ID: <488A7B7A.7020601@ar.media.kyoto-u.ac.jp> Jarrod Millman wrote: > Hello, > > The 1.1.1rc2 is now available: > http://svn.scipy.org/svn/numpy/tags/1.1.1rc2 > > The source tarball is here: > http://cirl.berkeley.edu/numpy/numpy-1.1.1rc2.tar.gz > > Here is the universal Mac binary: > http://cirl.berkeley.edu/numpy/numpy-1.1.1rc2-py2.5-macosx10.5.dmg > > David Cournapeau will be creating a 1.1.1rc2 Windows binary in next few days. > > Please test this release ASAP and let us know if there are any > problems. If there are no show stoppers, this will likely become the > 1.1.1 release. > > There are two valgrind warnings: http://scipy.org/scipy/numpy/ticket/859 http://scipy.org/scipy/numpy/ticket/863 I have not checked the 2nd one, but the first one does not look spurious. cheers, David From jdh2358 at gmail.com Fri Jul 25 21:36:17 2008 From: jdh2358 at gmail.com (John Hunter) Date: Fri, 25 Jul 2008 20:36:17 -0500 Subject: [Numpy-discussion] RFC: A (second) proposal for implementing some date/time types in NumPy In-Reply-To: References: <200807161844.36953.faltet@pytables.org> <20080718144247.GA5698@tardis.terramar.selidor.net> <200807251309.34418.faltet@pytables.org> Message-ID: <88e473830807251836o175e8922l6b38973262f7ae51@mail.gmail.com> On Fri, Jul 25, 2008 at 8:22 PM, Matt Knox wrote: > The automatic string parsing has been mentioned before, but it is a feature > I am personally very fond of. I use it all the time, and I suspect a lot of > people would like it very much if they used it. It's not suited for high > performance code, but is fantastic for interactive and ad-hoc work. This is > supported right in the "constructor" of the current Date class, along with > conversion from datetime objects. I'd love to see such support built into the > new date type, although I guess it could be added on easily enough with a > factory function. There is a module dateutil.parser which is released under the PSF license if there is interest in including something like this. Not sure if it is appropriate for numpy because of the speed implications, but its out there. mpl ships dateutil, so it is already available with all mpl installs. JDH From charlesr.harris at gmail.com Fri Jul 25 22:20:24 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 25 Jul 2008 20:20:24 -0600 Subject: [Numpy-discussion] 1.1.1rc2 tagged In-Reply-To: <488A7B7A.7020601@ar.media.kyoto-u.ac.jp> References: <488A7B7A.7020601@ar.media.kyoto-u.ac.jp> Message-ID: On Fri, Jul 25, 2008 at 7:18 PM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Jarrod Millman wrote: > > Hello, > > > > The 1.1.1rc2 is now available: > > http://svn.scipy.org/svn/numpy/tags/1.1.1rc2 > > > > The source tarball is here: > > http://cirl.berkeley.edu/numpy/numpy-1.1.1rc2.tar.gz > > > > Here is the universal Mac binary: > > http://cirl.berkeley.edu/numpy/numpy-1.1.1rc2-py2.5-macosx10.5.dmg > > > > David Cournapeau will be creating a 1.1.1rc2 Windows binary in next few > days. > > > > Please test this release ASAP and let us know if there are any > > problems. If there are no show stoppers, this will likely become the > > 1.1.1 release. 
> > > > > > There are two valgrind warnings: > > http://scipy.org/scipy/numpy/ticket/859 > http://scipy.org/scipy/numpy/ticket/863 > I assume these are in mainline. The question is, are they serious enough to warrant spending the weekend tracking down and fixing? Grump. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Jul 25 22:41:29 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 25 Jul 2008 21:41:29 -0500 Subject: [Numpy-discussion] 0d array value comparisons In-Reply-To: <9457e7c80807251719o6e247c23sb2dcf93683e2ea2d@mail.gmail.com> References: <006536B0-2A25-47ED-A970-CDD4892DF849@dal.ca> <9457e7c80807251719o6e247c23sb2dcf93683e2ea2d@mail.gmail.com> Message-ID: <3d375d730807251941y79525aa1s39565c76427accaf@mail.gmail.com> On Fri, Jul 25, 2008 at 19:19, St?fan van der Walt wrote: > 2008/7/25 Thomas J. Duck : >> Hi, >> >> There is some unexpected behaviour (to me) when 0-dimensional >> arrays are compared with values. For example: >> >> >>> numpy.array([0]).squeeze() == 0 >> True >> >> >>> numpy.array([None]).squeeze() == None >> False >> >> >>> numpy.array(['a']).squeeze() == 'a' >> array(True, dtype=bool) >> >> Note that each test follows the same pattern, although the dtype for >> each squeezed array is different. The first case result is what I >> expected, and the second case result appears wrong. The return type >> for the third case is inconsistent with those before, but is at least >> workable. >> >> Are these the intended results? > > I would think not. E.g., in the following case broadcasting should kick in: > > In [20]: np.array([None, None, None]) == None > Out[20]: False (some_ndarray == None) == False is actually an intentional special case. I forget the details of why, and I don't feel like reinstalling Numeric or numarray to see if it was backwards compatibility, but it's not a bug. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From rpyle at post.harvard.edu Sat Jul 26 00:31:27 2008 From: rpyle at post.harvard.edu (Robert Pyle) Date: Sat, 26 Jul 2008 00:31:27 -0400 Subject: [Numpy-discussion] numpy-1.1.1rc2 Mac binary - Please Test. In-Reply-To: References: <764e38540807251148i52adb46ahdd58f6b79e116cae@mail.gmail.com> <7B1E48D6-40D7-45C1-BDF0-CCD4685D6FE4@post.harvard.edu> Message-ID: On Jul 25, 2008, at 5:22 PM, Charles R Harris wrote: > > > On Fri, Jul 25, 2008 at 2:13 PM, Robert Pyle > wrote: > MacBook Pro, Intel Core 2 Duo, 10.5.4, Python 2.5.2 (r252:60911, Feb > 22 2008, 07:57:53) > > installed as expected, failed one test: > > FAIL: check_testUfuncRegression (numpy.core.tests.test_ma.test_ufuncs) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/Library/Frameworks/Python.framework/Versions/2.5/lib/ > python2.5/site-packages/numpy/core/tests/test_ma.py", line 692, in > check_testUfuncRegression > self.failUnless(eqmask(ur.mask, mr.mask)) > AssertionError > > Curious. Would you be willing to download the tarball and help us > debug this problem? > > Chuck Sure, but you'll have to tell me how to go about it. Maybe we should go off-list from here on. Bob From rpyle at post.harvard.edu Sat Jul 26 00:39:16 2008 From: rpyle at post.harvard.edu (Robert Pyle) Date: Sat, 26 Jul 2008 00:39:16 -0400 Subject: [Numpy-discussion] numpy-1.1.1rc2 Mac binary - Please Test. 
In-Reply-To: <764e38540807251619s2adc7ffcu532a4141a43e3707@mail.gmail.com> References: <764e38540807251148i52adb46ahdd58f6b79e116cae@mail.gmail.com> <7B1E48D6-40D7-45C1-BDF0-CCD4685D6FE4@post.harvard.edu> <764e38540807251619s2adc7ffcu532a4141a43e3707@mail.gmail.com> Message-ID: <27A51648-5070-4A66-897D-DA2FBB01EF5E@post.harvard.edu> On Jul 25, 2008, at 7:19 PM, Christopher Burns wrote: > Robert, > > numpy/core/tests/test_ma.py is an old file from a previous install. > You need to remove the numpy directory and reinstall. > > Unfortunately the installer does not cleanup old installs. Okay, all is well after all. 1300 tests, no errors. Bob From cburns at berkeley.edu Sat Jul 26 01:13:48 2008 From: cburns at berkeley.edu (Christopher Burns) Date: Fri, 25 Jul 2008 22:13:48 -0700 Subject: [Numpy-discussion] numpy-1.1.1rc2 Mac binary - Please Test. In-Reply-To: <27A51648-5070-4A66-897D-DA2FBB01EF5E@post.harvard.edu> References: <764e38540807251148i52adb46ahdd58f6b79e116cae@mail.gmail.com> <7B1E48D6-40D7-45C1-BDF0-CCD4685D6FE4@post.harvard.edu> <764e38540807251619s2adc7ffcu532a4141a43e3707@mail.gmail.com> <27A51648-5070-4A66-897D-DA2FBB01EF5E@post.harvard.edu> Message-ID: <764e38540807252213s3f468f6eieab79744cbc5124f@mail.gmail.com> Excellent! Thanks for testing Bob. On Fri, Jul 25, 2008 at 9:39 PM, Robert Pyle wrote: > > Okay, all is well after all. 1300 tests, no errors. > > Bob > > -- Christopher Burns Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Sat Jul 26 05:07:04 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 26 Jul 2008 18:07:04 +0900 Subject: [Numpy-discussion] Volunteers for future windows binaries packaging Message-ID: <488AE938.5090308@ar.media.kyoto-u.ac.jp> Hi there, I would like to call for a volunteer to maintain future releases of win32 binaries of numpy and scipy (after 1.2/1.1.1 and after scipy 0.7). Because I am leading toward the end of my PhD, I will have less time for numpy and other open source stuff, and windows related things are not the parts I enjoyed the most to say the least. So I would prefer someone to take up the job if possible. I spent some time cleaning my build scripts which automate most of the tasks (Building blas/lapack/atlas from scratch on cygwin, building the nsis installer from the numpy/scipy trunk), so it is not like there would be a lot of work. It would be mostly maintenance, making sure it works on the different versions we want to support, etc... thanks, David From david at ar.media.kyoto-u.ac.jp Sat Jul 26 05:10:24 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 26 Jul 2008 18:10:24 +0900 Subject: [Numpy-discussion] Volunteers for future windows binaries packaging In-Reply-To: <488AE938.5090308@ar.media.kyoto-u.ac.jp> References: <488AE938.5090308@ar.media.kyoto-u.ac.jp> Message-ID: <488AEA00.9040905@ar.media.kyoto-u.ac.jp> David Cournapeau wrote: > Hi there, > > I would like to call for a volunteer to maintain future releases of > win32 binaries of numpy and scipy (after 1.2/1.1.1 and after scipy 0.7). Just to be clear: I will build/maintain the binaries for 1.2, 1.1.1 and scipy 0.7. I am looking for someone else to take up the job after those planned releases. 
cheers,

David

From stefan at sun.ac.za  Sat Jul 26 05:56:49 2008
From: stefan at sun.ac.za (Stéfan van der Walt)
Date: Sat, 26 Jul 2008 11:56:49 +0200
Subject: [Numpy-discussion] FFT usage / consistency
In-Reply-To: <200807251726.16143.felix@physik3.uni-rostock.de>
References: <200807251726.16143.felix@physik3.uni-rostock.de>
Message-ID: <9457e7c80807260256x22a434denab239e46579af32a@mail.gmail.com>

Hi Felix

This doesn't look quite right:

    # Re-Transform to frequency domain
    fftdata = fftpack.fft(ifftdata)
    fftdata = fftpack.fftshift(ifftdata)

You probably want fftshift(fft(ifftdata))?

As an aside, you also don't need vectorise, since those functions are
all "vectorised" by themselves.

Cheers
Stéfan

2008/7/25 Felix Richter :
> Hi all,
>
> I found myself busy today trying to understand what went wrong in my FFT code.
> I wrote a minimal example/testing code to check the FFT output against an
> analytic result and also tried to reverse the transformation to get the
> original function back. Most curiously, the results depend on whether I first
> do the fft and then the ifft or the other way round.
>
> For the test function, I use the Lorentz function 1/(x^2+1). The exact FT is
> exp(-|t|)*sqrt(pi/2), the IFT yields the same.
>
> 1) First FFT and then IFFT: The real part of FFT oscillates, the imaginary
> part is not zero, and the magnitudes do not match. All this should not be,
> but the IFFT reproduces the original function just fine.
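A minimal round-trip in the suggested order -- a sketch only; the grid and
Lorentzian below are stand-ins for Felix's actual data, and scipy.fftpack
is assumed as in his code:

import numpy as np
from scipy import fftpack

x = np.linspace(-50, 50, 1024)
f = 1.0 / (x**2 + 1)                     # the Lorentz function from the post

F = fftpack.fftshift(fftpack.fft(f))     # shift applied to the *forward* transform
f2 = fftpack.ifft(fftpack.ifftshift(F))  # undo the shift before inverting

print np.allclose(f, f2.real)            # True: the round-trip recovers f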
From bblais at bryant.edu  Sat Jul 26 09:35:01 2008
From: bblais at bryant.edu (Brian Blais)
Date: Sat, 26 Jul 2008 09:35:01 -0400
Subject: [Numpy-discussion] indexing (compared to matlab)
Message-ID: 

Hello,

I wanted to do the following thing that I do in Matlab (on a bigger
problem), setting the values of a part of a matrix with indexing:

>> a=floor(rand(5,5)*10)  % make an example matrix to work on

a =

     2     4     7     9     8
     6     9     2     5     2
     6     3     5     1     8
     1     5     6     1     2
     1     2     8     2     9

>> ind=[2,4]

ind =

     2     4

>> a(ind,ind)=a(ind,ind)+100

a =

     2     4     7     9     8
     6   109     2   105     2
     6     3     5     1     8
     1   105     6   101     2
     1     2     8     2     9

===========================

In numpy, the same gives:

In [11]:a=floor(random.rand(5,5)*10)

In [14]:a
Out[14]:
array([[ 7., 7., 8., 1., 9.],
       [ 0., 4., 9., 0., 5.],
       [ 4., 3., 7., 8., 3.],
       [ 2., 0., 4., 2., 4.],
       [ 9., 5., 0., 9., 9.]])

In [15]:ind=[1,3]

In [20]:a[ind,ind]+=100

In [21]:a
Out[21]:
array([[ 7., 7., 8., 1., 9.],
       [ 0., 104., 9., 0., 5.],
       [ 4., 3., 7., 8., 3.],
       [ 2., 0., 4., 102., 4.],
       [ 9., 5., 0., 9., 9.]])

which only replaces 2 values, not all the values in the row,col
combinations of [1,1],[1,3],etc...[3,3] like matlab.  Is there a preferred
way to do this, which I think should be fairly common?

If I know that the indices are regular (like a slice) is there a way to do
this?


thanks,

Brian Blais

-- 
Brian Blais
bblais at bryant.edu
http://web.bryant.edu/~bblais
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From nadavh at visionsense.com  Sat Jul 26 09:55:14 2008
From: nadavh at visionsense.com (Nadav Horesh)
Date: Sat, 26 Jul 2008 16:55:14 +0300
Subject: [Numpy-discussion] indexing (compared to matlab)
References: 
Message-ID: <710F2847B0018641891D9A216027636029C1EC@ex3.envision.co.il>

I had a conversation about this issue in the mailing list several months
ago: in short, if the spacings are regular you can do what you want.

either:

  a[1:4:2,1:4:2] += 100

or:

  ind = slice(1,4,2)
  a[ind, ind] += 100

  Nadav

-----Original Message-----
From: numpy-discussion-bounces at scipy.org on behalf of Brian Blais
Sent: Sat 26-Jul-08 16:35
To: Discussion of Numerical Python
Subject: [Numpy-discussion] indexing (compared to matlab)

-------------- next part --------------
A non-text attachment was scrubbed...
Name: winmail.dat
Type: application/ms-tnef
Size: 3653 bytes
Desc: not available
URL: 

From aisaac at american.edu  Sat Jul 26 10:12:40 2008
From: aisaac at american.edu (Alan G Isaac)
Date: Sat, 26 Jul 2008 10:12:40 -0400
Subject: [Numpy-discussion] indexing (compared to matlab)
In-Reply-To: 
References: 
Message-ID: 

This is probably the most asked single question.
Use ``ix_``.  Example below.

Cheers,
Alan Isaac

>>> import numpy as np
>>> a=np.floor(np.random.rand(5,5)*10)
>>> ind=[1,3]
>>> a[np.ix_(ind,ind)]+=100
>>> a
array([[ 9., 1., 2., 8., 5.],
       [ 2., 102., 7., 109., 0.],
       [ 8., 0., 2., 2., 2.],
       [ 1., 103., 5., 101., 7.],
       [ 1., 4., 7., 2., 3.]])
>>>

From bblais at bryant.edu  Sat Jul 26 11:05:05 2008
From: bblais at bryant.edu (Brian Blais)
Date: Sat, 26 Jul 2008 11:05:05 -0400
Subject: [Numpy-discussion] indexing (compared to matlab)
In-Reply-To: 
References: 
Message-ID: <4FE21854-E95C-42A9-9A10-2EA545B50897@bryant.edu>

On Jul 26, 2008, at Jul 26:10:12 AM, Alan G Isaac wrote:

> This is probably the most asked single question.
> Use ``ix_``.  Example below.

cool.  this should definitely be in the Numpy for Matlab users page,
http://www.scipy.org/NumPy_for_Matlab_Users, right after the line:

    Matlab        Numpy            Notes
    a(1:3,5:9)    a[0:3][:,4:9]    rows one to three and columns five
                                   to nine of a

because this example gives a read-only submatrix.  I looked there first to
get an answer, and it wasn't forthcoming.

thanks for the tip!


bb

-- 
Brian Blais
bblais at bryant.edu
http://web.bryant.edu/~bblais
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From lpc at cmu.edu  Sat Jul 26 15:03:35 2008
From: lpc at cmu.edu (Luis Pedro Coelho)
Date: Sat, 26 Jul 2008 15:03:35 -0400
Subject: [Numpy-discussion] No Copy Reduce Operations
Message-ID: <200807261503.36015.lpc@cmu.edu>

Hello all,

Numpy arrays come with several reduce operations: sum(), std(), argmin(),
min(), ....

The traditional implementation of these suffers from two big problems: It
is slow and it often allocates intermediate memory. I have code that is
failing with OOM (out of memory) exceptions in calls to ndarray.std(). I
regularly handle arrays with 100 million entries (have a couple of million
objects * 20 features per object = 100 million doubles), so this is a real
problem for me.

This being open-source, I decided to solve the problem. My first idea was
to try to improve the numpy code. I failed to see how to do that while
supporting everything that numpy does (multiple types, for example), so I
started an implementation of reduce operations that uses C++ templates to
make code optimised into the types it actually uses, choosing the right
version to use at run time.

In the spirit of release-early/release-often, I attach the first version
of this code that runs.

BASIC IDEA
===============

ndarray.std does basically the following (the examples are in pseudo-code
even though the implementation happens to be in C):

def stddev(A):
    mu = A.mean()
    diff = (A - mu)
    maybe_conj = (diff if not complex(A) else diff.conjugate())
    diff *= maybe_conj
    return sqrt(diff.sum()/A.size)

with a lot of temporary arrays. My code does:

def stddev(A):
    mu = A.mean()  # No getting around this temporary
    std = 0
    for i in xrange(A.size):
        diff = (A[i] - mu)
        if complex(A):
            diff *= conjugate(diff)
        else:
            diff *= diff
        std += diff
    return sqrt(std/A.size)

Of course, my code does it while taking into account the geometry of the
array. It handles arrays with arbitrary strides. I do it while avoiding
copying the array at any time (while most of the existing code will
transpose/copy the array so that it is well behaved in memory).
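A runnable pure-Python rendering of the same no-copy idea, just to make the
memory behaviour concrete -- this is not the attached ncreduce code, and it
ignores the complex case for brevity:

import math

def nocopy_std(A):
    # Two passes over the data, but no temporary the size of A.
    N = len(A)
    mu = sum(A) / float(N)   # the one unavoidable reduction for the mean
    acc = 0.0
    for a in A:              # accumulate squared deviations in place
        d = a - mu
        acc += d * d
    return math.sqrt(acc / N)

This matches ndarray.std's default (population standard deviation, ddof=0)
while never allocating an array of differences.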
IMPLEMENTATION
===============

I have written some template infrastructure so that, if I wanted a very
fast entropy calculation on a normalised array, you could do:

template < ... >
void compute_entropy(BaseIterator data, BaseIterator past,
                     ResultsIterator results) {
    while (data != past) {
        if (*data) *results += *data * std::log(*data);
        ++data;
        ++results;
    }
}

and the machinery will instantiate this in several variations, deciding at
run-time which one to call. You just have to write a C interface function
like

PyObject*
fast_entropy(PyArrayObject *self, PyObject *args, PyObject *kwds)
{
    int axis=NPY_MAXDIMS;
    PyArray_Descr *dtype=NULL;
    PyArrayObject *out=NULL;
    static char *kwlist[] = {"array","axis", "dtype", "out", NULL};

    if (!PyArg_ParseTupleAndKeywords(args, kwds, "O|O&O&O&", kwlist,
                                     &self,
                                     PyArray_AxisConverter, &axis,
                                     PyArray_DescrConverter2, &dtype,
                                     PyArray_OutputConverter, &out)) {
        Py_XDECREF(dtype);
        return NULL;
    }
    int num = _get_type_num_double(self->descr, dtype);
    Py_XDECREF(dtype);
    /* This decides which function to call: */
    return compress_dispatch(self, out, axis, num, EmptyType());
}

For contiguous arrays, axis=None, this becomes

void compute_entropy(Base* data, Base* past, Results* result) {
    while (data != past) {
        if (*data) *result += *data * std::log(*data);
        ++data;
    }
}

which is probably as fast as it can be.

If the array is not contiguous, this becomes

void compute_entropy(numpy_iterator data, numpy_iterator past,
                     Results* result) {
    while (data != past) {
        if (*data) *result += *data * std::log(*data);
        ++data;
    }
}

where numpy_iterator is a type that knows how to iterate over numpy arrays
following strides.

If axis is not None, then the result will not be a single value, it will
be an array. The machinery will automatically create the array of the
right size and pass it to you so that the following gets instantiated:

void compute_entropy(numpy_iterator data, numpy_iterator past,
                     numpy_iterator results) {
    while (data != past) {
        if (*data) *results += *data * std::log(*data);
        ++data;
        ++results;
    }
}

The results parameter has the strides set up correctly to iterate over
results, including looping back when necessary, so that the code works as
it should.

Notice that the ++results operation seems to be dropping in and out. In
fact, I was oversimplifying above. There is no instantiation with
Results*, but with no_iteration, which behaves like Results* but with an
empty operator ++(). You never change your template implementation.

(The code above was for explanatory purposes, not an example of working
code. The interface I actually used takes more parameters which are not
very important for expository purposes. This allows you to, for example,
implement the ddof parameter in std().)
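The run-time choice itself can be pictured in Python, with hypothetical
names -- this shows only the shape of the dispatch, not the real
compress_dispatch, and only the layout axis of the specialisation:

import numpy as np

def _sum_contiguous(arr):
    # Stands in for the instantiation over plain pointers.
    total = 0.0
    for x in arr.flat:
        total += x
    return total

def _sum_strided(arr):
    # Stands in for the instantiation over the stride-following iterator.
    total = 0.0
    for idx in np.ndindex(*arr.shape):
        total += arr[idx]
    return total

def reduce_sum(arr):
    # One specialised function per (dtype, layout) combination is
    # generated at compile time; the right one is picked at call time.
    if arr.flags['C_CONTIGUOUS']:
        return _sum_contiguous(arr)
    return _sum_strided(arr)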
The machinery will automatically create the array of the right size and pass it to you, so that the following gets instantiated:

void compute_entropy(numpy_iterator data, numpy_iterator past, numpy_iterator results) {
    while (data != past) {
        if (*data) *results += *data * std::log(*data);
        ++data;
        ++results;
    }
}

The results parameter has the strides set up correctly to iterate over results, including looping back when necessary, so that the code works as it should.

Notice that the ++results operation seems to be dropping in and out. In fact, I was oversimplifying above. There is no instantiation with Result*, but with no_iteration, which behaves like Result* but with an empty operator ++(). You never change your template implementation.

(The code above was for explanatory purposes, not an example of working code. The interface I actually used takes more parameters which are not very important for expository purposes. This allows you to, for example, implement the ddof parameter in std().)

ADVANTAGES
===========

For most operations, my code is faster (it's hard to beat ndarray.sum(), but easy to beat ndarray.std()) than numpy on both an Intel 32-bit machine and an AMD 64-bit machine, both newer than one year (you can test it on your machine by running profile.py). For some specific operations, like summing along a non-last axis on a moderately large array, it is slower (I think that the copy&process tactic might be better in this case than the no-copy/one-pass operation I am using). In most cases I tested, it is faster. In particular, in the basic case (a well-behaved array), it is faster.

More important than speed (at least for me) is the fact that this does not allocate more memory than needed. This will not fail with OOM errors.

It's easy to implement specific optimisations. For example, replace a sum function for a specific case to call AMD's framewave SIMD library (which uses SIMD instructions):

void compute_sum(short* data, short* past, no_iteration result) {
    fwiSum_16s_C1R ( data, sizeof(short), past-data, &*result);
}

or compute the standard deviation of an array of booleans with a single pass (sigma = sqrt(p(1-p))):

void compute_std(bool* data, bool* past, no_iteration result) {
    size_type N = (past-data);
    size_type pos = 0;
    while (data != past) {
        if (*data) ++pos;
        ++data;
    }
    *result = std::sqrt(ResType(pos)/ResType(N)*(1-ResType(pos)/ResType(N)));
}

NOT IMPLEMENTED (YET)
======================

* Non-aligned arrays are not implemented, and out arrays have to be well behaved. My current idea is to compromise and make copies in these cases. I could also, trivially, write a thing that handled those cases without copying, but I don't think it's worth the cost in code bloat.

* argmax()/argmin(). This is a bit harder to implement than the rest, as it needs a bit more machinery. I think it's possible, but I haven't gotten around to it.

* ptp(). I hesitate whether to simply do it in Python:

def ncr_ptp(A,axis=None,dtype=None,out=None):
    res = ncr.max(A,axis=axis,dtype=dtype,out=out)
    mn = ncr.min(A,axis=axis,dtype=dtype)
    res -= mn
    return res

This does two passes over the data, but no extra copying. I could do the same using the Python array API, of course.

DISADVANTAGES
==============

It's C++. I don't see it as a technical disadvantage, but I understand some people might not like it. If this was used in the numpy code base, it could replace the current macro language (begin repeat // end repeat), so the total number of languages used would not increase.

It generates too many functions.
Disk is cheap, but do we really need a well-optimised version of std() for the case of complex inputs and boolean output? (What's a boolean standard deviation, anyway?) Maybe some tradeoff is necessary: optimise the defaults and make others possible.

BUGS
=====

* I am not correctly implementing the dtype parameter. I thought it controlled the type of the intermediate results, but if I do

import numpy
A=numpy.arange(10)
A.mean(dtype=numpy.int8)

I get

4.5

which surprises me!

Furthermore,

A.mean(dtype=numpy.int8).dtype

returns

dtype('float64')

In my implementation, mean(numpy.arange(10),dtype=numpy.int8) is 4. (This might be considered a numpy bug --- BTW, I am not running subversion for this comparison.)

* I also return only either python longs or python floats. This is a trivial change, I just don't know all the right function names to return numpy types.

* A couple of places in the code still have a FIXME on them.

FUTURE
=======

I consider this code proof-of-concept. It's cool, it demonstrates that the idea works, but it is rough and needs cleaning up. There might even be bugs in it (especially the C interface with Python is something I am not so familiar with)!

I see three possible paths for this:

(1) You agree that this is nice, it achieves good results and I should strive to integrate this into numpy proper, replacing the current implementation. I would start a branch in numpy svn and implement it there and finally merge into the main branch. I would be perfectly happy to relicense to BSD if this is the case. One could think of the array-to-array (A+B, A+=2, A != B,...) operations using a similar scheme. This would imply that part of the C API would be obsoleted: the whole functions array would stop making sense. I don't know how widespread its use is.

(2) I keep maintaining this as a separate package and whoever wants to use it, can use it. I will probably keep it GPL, for now. Of course, at some later point in time, one could integrate this into the main branch (again, in this case, I am happy to relicense).

(3) Someone decides to take this as a challenge and re-implements ndarray.std and the like so that it uses less memory and is faster, but does it still in C. I am not much of a C programmer, so I don't see how this could be done without really ugly reliance on the preprocessor, but maybe it could be done.

What do you think?

bye,
Luís Pedro Coelho
PhD Student in Computational Biology

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ncreduce-0.01.tar.gz
Type: application/x-tgz
Size: 5542 bytes
Desc: not available
URL: 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: profile.py Type: application/x-python Size: 4728 bytes Desc: not available URL: -------------- next part -------------- Values are fold improvements: >1 means ncreduce is faster, <1 means ncreduce is slower Columns are different array sizes: (4x4) (40x4) (4000x10) (10x4000) (4000x10000) Rows are different types of reduce operation: A.f() A.T.f() A.f(0) A.f(1) A.T.f(1) A[2].f() A[:,2].f() For SUM function [[ 3.8070922 3.74078624 3.18315868 3.21160784 2.43980108] [ 4.46617647 4.22613065 8.31738401 6.8606588 29.21819473] [ 1.9447132 1.56371814 0.96617167 3.01338318 7.0270556 ] [ 2.51067616 4.4812749 6.82775021 2.81078355 2.02944134] [ 1.91545392 1.62806468 0.87334081 2.54212652 0.80367664] [ 3.6557377 3.52655539 3.45639535 3.09458466 3.09918836] [ 3.20726783 2.93624557 1.0432128 3.20680628 0.96305311]] For MEAN function [[ 4.53997379 4.33751425 3.18691456 3.21325964 2.61187721] [ 5.09466667 4.86712486 8.09895611 6.66878963 29.56070697] [ 1.32862454 1.2232241 0.95682217 2.77428247 6.59373701] [ 1.38306992 2.07929176 5.6042247 2.67118025 2.01389827] [ 1.32216749 1.24022472 0.8609297 2.38601915 0.79781743] [ 4.26244953 4.21212121 4.28808864 3.22779923 3.19200338] [ 4.05635649 3.68317972 1.08496276 4.00253807 0.95483616]] For STD function [[ 16.37931034 13.23200859 8.515398 8.6650662 5.56770624] [ 17.59516616 13.92785794 11.48111037 10.94399828 16.72880933] [ 7.96362619 4.69929338 2.06533419 2.63442178 5.35333711] [ 12.23589744 12.1507772 8.82073033 8.74585517 6.89534047] [ 7.9735658 4.66921665 1.54624704 2.49588909 1.30503195] [ 17.3400936 17.64123377 17.10280374 5.49334051 4.91767663] [ 14.07474227 10.96748768 1.73907847 13.80691299 1.08550626]] For VAR function [[ 12.75496689 11.19303797 8.23840394 8.17942839 5.35515979] [ 14.72340426 11.67340426 11.47882782 11.10354571 16.79689987] [ 7.43418014 4.3738514 2.07330643 2.68918206 5.09448361] [ 11.42946429 13.17880795 9.98134987 8.36491556 6.79879305] [ 7.52822581 4.28857407 1.4730624 2.47868726 1.1964753 ] [ 14.93135725 14.64387917 14.28 5.2311021 4.79074167] [ 12.18278146 9.37636544 1.65813694 11.40481928 1.04396634]] For MAX function [[ 4.18282548 4.1835206 3.064325 3.09957884 1.89579732] [ 5.3807267 5.09299191 8.16320492 6.81480951 23.32648022] [ 2.22292994 1.53729977 0.82438672 1.95009491 6.05179257] [ 2.95664467 4.09146341 3.55795579 2.05772538 1.74708854] [ 2.35276753 1.66998507 0.86066282 1.75172327 0.80780005] [ 4.53484603 4.39833333 3.88140556 3.10858712 3.08567761] [ 4.31647399 3.50316857 1.21125174 3.87866109 0.93919081]] For MIN function [[ 5.12647059 4.29441624 3.03089161 3.05712249 1.89781197] [ 5.23816794 4.79036458 8.18441573 6.75257879 23.33387201] [ 2.20617111 1.57129757 0.86588154 2.0558663 6.3874678 ] [ 2.96841155 3.84391192 3.35497735 1.99594036 1.74332781] [ 2.25244073 1.69904857 0.87264242 1.82947976 0.80815663] [ 4.62801932 4.39569536 4.34477124 3.12006903 3.10487051] [ 3.97965116 3.51935081 1.16092389 4.00438596 0.93541891]] For ALL function [[ 2.79002467e-02 3.76257310e+00 9.28888889e+01 8.84859813e+01 8.42846792e+04] [ 4.99675325e+00 5.28160000e+00 1.31449298e+02 1.89387255e+02 5.50127171e+05] [ 2.30509745e+00 1.61616675e+00 3.98208860e-01 2.52603138e+00 2.41908320e+00] [ 2.82739726e+00 4.85934820e+00 3.36824142e+01 5.78263158e+01 1.46416294e+03] [ 1.95994914e+00 1.75441501e+00 4.47726973e-01 2.20169301e+00 9.43527633e-01] [ 4.18080000e+00 4.11092715e+00 4.09602649e+00 1.30212072e+01 2.61726974e+01] [ 3.84461538e+00 3.96411856e+00 1.29984496e+01 3.94307692e+00 5.45761689e+01]] For ANY function [[ 4.54672897e+00 
4.97651007e+00 9.48595944e+01 9.28051948e+01 8.83522646e+04] [ 5.02138158e+00 5.54304636e+00 1.40057190e+02 1.87542130e+02 5.78053539e+05] [ 2.42564910e+00 1.70693277e+00 4.13649990e-01 2.52723749e+00 2.52861281e+00] [ 2.97480620e+00 5.02946429e+00 3.39985881e+01 5.69674115e+01 1.50428346e+03] [ 2.29548872e+00 1.81257078e+00 4.52254796e-01 2.20091677e+00 9.46904281e-01] [ 4.29111842e+00 4.16582064e+00 4.13087248e+00 1.33412162e+01 2.89243697e+01] [ 3.91051805e+00 3.96214511e+00 1.41828299e+01 3.93322734e+00 6.09200000e+01]] python profile.py 10948.30s user 625.08s system 99% cpu 3:13:06.33 total -------------- next part -------------- Values are fold improvements: >1 means ncreduce is faster, <1 means ncreduce is slower Columns are different array sizes: (4x4) (40x4) (4000x10) (10x4000) (4000x10000) Rows are different types of reduce operation: A.f() A.T.f() A.f(0) A.f(1) A.T.f(1) A[2].f() A[:,2].f() For SUM function [[ 4.29501085 4.20192308 2.84235939 2.88138342 2.29455524] [ 5.25827815 5.16287879 6.16807616 6.3403272 14.58060969] [ 2.21042831 1.50277949 0.44599368 2.76584319 1.32321157] [ 3.03638645 5.85714286 17.28581874 2.82322888 2.37914623] [ 2.10303588 1.59593975 0.43697892 2.48825259 0.7626602 ] [ 4.18485523 4.01573034 4.12837838 3.14817601 3.04448643] [ 3.71543086 3.2927242 0.67116867 3.5546875 1.12240952]] For MEAN function [[ 5.30393996 5.08747856 3.07230251 3.05563231 2.32033496] [ 6.37074148 5.9165247 6.12403101 6.28776978 13.90174399] [ 1.21764063 1.07815771 0.44468226 2.46024723 1.34025926] [ 1.33383189 2.1530469 8.07927117 2.73193251 2.42263582] [ 1.19756428 1.09569715 0.44064895 2.26017632 0.74816577] [ 5.19438878 5.11608961 5.05210421 3.33246946 3.13600136] [ 4.45698925 4.14195584 0.73804391 4.40414508 1.16802168]] For STD function [[ 17.99813433 16.37692308 7.62219466 7.631936 7.28233251] [ 21.27087576 17.59594384 9.22584327 9.31559985 12.67600126] [ 8.86459803 5.83032235 1.86635194 2.37492871 1.82752484] [ 14.94068802 12.8956229 8.47783179 7.37456519 7.95656419] [ 9.27318841 5.22204908 0.93005222 2.35775864 1.21045558] [ 19.32653061 19.59915612 18.70140281 6.42620232 5.58600269] [ 15.66890756 12.18216561 1.66883246 14.60942249 2.57191279]] For VAR function [[ 16.59753593 14.16326531 7.90385245 7.05429577 7.07134674] [ 18.39917695 15.50394945 9.25239751 9.21568903 12.79580458] [ 8.60134128 5.53321364 1.84634827 2.51325626 1.83403096] [ 14.28641975 16.80358535 11.29057751 7.46716752 8.01444069] [ 8.5099926 4.92359729 0.91359572 2.5071937 1.26863978] [ 17.28387097 17.0867679 16.32217573 6.15619088 5.4340506 ] [ 13.72299652 10.93495935 1.58346303 12.90923825 1.56830078]] For MAX function [[ 5.20682303 4.9273743 2.65534406 2.64120879 1.972041 ] [ 5.89339019 5.66117216 5.76669586 5.91858423 12.36736688] [ 2.62611276 1.8565051 0.51823534 2.79180066 1.45855696] [ 3.61738003 5.61111111 4.89628282 2.00837979 1.78166991] [ 2.70691334 1.95046235 0.4998023 2.43290857 0.82916059] [ 5.16556291 5.08944954 5.13443396 3.00861101 2.80622541] [ 4.6120332 4.03327496 0.85167674 4.32677165 1.14838906]] For MIN function [[ 5.14893617 4.74499089 2.75640497 2.71304858 2.0904718 ] [ 6.03862661 5.56284153 5.65100334 5.751155 12.2645859 ] [ 2.72997033 1.83545918 0.51301053 2.79813543 1.49883452] [ 3.65349144 5.54545455 4.87626525 1.9751425 1.77834036] [ 2.62728146 1.93559097 0.4968886 2.43326653 0.84075323] [ 5.07658643 4.91314031 4.78118162 2.90317812 2.69502462] [ 4.36811024 3.71568627 0.82321831 4.01663586 1.15417859]] For ALL function [[ 3.01151344e-02 6.80778589e+00 1.61822384e+02 
1.61588808e+02 1.36934336e+05] [ 6.09692671e+00 7.12165450e+00 2.25415909e+02 3.05729167e+02 3.79104862e+05] [ 2.56551724e+00 1.85556995e+00 4.66932320e-01 3.53097573e+00 8.77598466e-01] [ 3.47936085e+00 8.33626098e+00 9.45616438e+01 1.55502577e+02 5.36139550e+03] [ 2.52413793e+00 1.99930314e+00 4.82473875e-01 3.33456449e+00 5.92418062e-01] [ 4.86363636e+00 4.91084337e+00 4.89156627e+00 1.95740741e+01 3.94606987e+01] [ 4.11328125e+00 4.32681018e+00 1.75753968e+01 4.15145228e+00 2.69920477e+01]] For ANY function [[ 4.93668122e+00 5.90997567e+00 1.68126551e+02 1.63245783e+02 1.42517746e+05] [ 6.18491484e+00 6.83490566e+00 2.25388764e+02 3.13054374e+02 3.77207517e+05] [ 2.62072435e+00 1.79820051e+00 4.81102465e-01 3.57245295e+00 8.86472009e-01] [ 3.57994580e+00 7.91734417e+00 7.90528971e+01 9.62805049e+01 5.50243880e+03] [ 2.59674134e+00 1.95604396e+00 4.90362418e-01 3.38196965e+00 5.94254604e-01] [ 4.94285714e+00 4.84671533e+00 4.83132530e+00 2.07274939e+01 4.46569343e+01] [ 4.04887984e+00 4.33544304e+00 1.83661088e+01 4.14978903e+00 2.76757322e+01]] 3199.12s user 552.47s system 99% cpu 1:02:38.99 total From henrik-web at phoboid.net Sat Jul 26 15:52:28 2008 From: henrik-web at phoboid.net (Henrik Ronellenfitsch) Date: Sat, 26 Jul 2008 21:52:28 +0200 Subject: [Numpy-discussion] 2D Hamming window Message-ID: <488B807C.6050507@phoboid.net> Hello! I'm looking for a 2D hamming window function. As far as I can see, numpy only supports 1D windows and googling didn't show a simple way of extending it in two dimensions. Thanks for your help, Henrik From cimrman3 at ntc.zcu.cz Sat Jul 26 17:35:24 2008 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Sat, 26 Jul 2008 23:35:24 +0200 Subject: [Numpy-discussion] [SciPy-user] unique, sort, sortrows In-Reply-To: <1217095611.7172.42.camel@localhost> References: <1217095611.7172.42.camel@localhost> Message-ID: <20080726233524.vcdq55buowgwgcso@webmail.zcu.cz> Hi David, I can comment on unique1d, as I am the culprit. I am cc'ing to numpy-discussion as this is a numpy function. Quoting "David M. Kaplan" : > 2) Is there a simple equivalent of sortrows(a) (i.e., sorting by entire > rows)? Similarly, is there a simple equivalent of the matlab Y = have you looked at lexsort? > 3) Is there an equivalent of [Y,I,J] = unique(X)? In this case, I is > the indices of the unique elements and J is an index from Y to X (i.e., > where the unique elements appear in X. I can get I with: > > I,Y = unique1d( X, return_index=True ) > > but J, which is very useful, is not available. I suppose I could do: > > J = array([]) > for i,y in enumerate(Y): > J[ nonzero( y == X ) ] = i > > But it seems like J is useful enough that there should be an easier way > (or perhaps it should be integrated into unique1d, at the risk of adding > more keyword arguments). So basically Y = X[I] and X = Y[J], right? I do not recall matlab that well to know for sure. It certainly could be done, I could look at it after I return from Euroscipy (i.e. after Monday). I would replace return_index argument by two arguments: return_direct (->I) and return_inverse (->J), ok? Does anyone propose better names? Actually most of the methods in arraysetops could return optionally some index arrays. Would anyone use it? (I do not need it personally :) cheers, r. 
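To make the Y = X[I], X = Y[J] relationship under discussion concrete, here is a minimal sketch in plain numpy (assuming only that unique1d returns the sorted unique values, as in the snippets earlier in this thread); because Y is sorted, searchsorted recovers the inverse index J without the element-by-element loop:

>>> import numpy as np
>>> X = np.array([3, 1, 2, 3, 1])
>>> Y = np.unique1d(X)         # sorted unique values of X
>>> J = Y.searchsorted(X)      # inverse index, so that X == Y[J]
>>> Y[J]
array([3, 1, 2, 3, 1])
>>> (Y[J] == X).all()
True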
From gruben at bigpond.net.au Sat Jul 26 20:48:33 2008
From: gruben at bigpond.net.au (Gary Ruben)
Date: Sun, 27 Jul 2008 01:48:33 +0100
Subject: [Numpy-discussion] 2D Hamming window
In-Reply-To: <488B807C.6050507@phoboid.net>
References: <488B807C.6050507@phoboid.net>
Message-ID: <488BC5E1.2030208@bigpond.net.au>

Henrik Ronellenfitsch wrote:
> Hello!
> I'm looking for a 2D hamming window function.
> As far as I can see, numpy only supports 1D windows
> and googling didn't show a simple way of extending it
> in two dimensions.
>
> Thanks for your help,
>
> Henrik

Hi Henrik,
I haven't looked at the "correct" way to do this, but I recently wanted to do the same thing and ended up with the following solution. This may well be mathematically incorrect, but in my application it wasn't important:

import numpy as np
import scipy.signal as ss

# read heightmap here - in my case it's a square numpy float array

# build 2d window
hm_len = heightmap.shape[0]
bw2d = np.outer(ss.hamming(hm_len), np.ones(hm_len))
bw2d = np.sqrt(bw2d * bw2d.T) # I don't know whether the sqrt is correct

# window the heightmap
heightmap *= bw2d

--
Gary R.

From fperez.net at gmail.com Sat Jul 26 21:30:59 2008
From: fperez.net at gmail.com (Fernando Perez)
Date: Sat, 26 Jul 2008 18:30:59 -0700
Subject: [Numpy-discussion] indexing (compared to matlab)
In-Reply-To: <4FE21854-E95C-42A9-9A10-2EA545B50897@bryant.edu>
References: <4FE21854-E95C-42A9-9A10-2EA545B50897@bryant.edu>
Message-ID: 

On Sat, Jul 26, 2008 at 8:05 AM, Brian Blais wrote:
> cool. this should definitely be in the Numpy for Matlab users
> page, http://www.scipy.org/NumPy_for_Matlab_Users, right after the line:
> Matlab Numpy Notes

By all means, please put it in, it's a wiki after all. One of the unwritten rules of open source projects: when others take the time to help you out and there's a publicly accessible place for documenting things, the nice thing to do is to pay back the good will of the more experienced users with a public record of this information. It will help you clarify things in your head as you write them up, it will get you more involved with the community, help others in the future and earn you good karma :)

Cheers,
f

From henrik-web at phoboid.net Sun Jul 27 05:10:37 2008
From: henrik-web at phoboid.net (Henrik Ronellenfitsch)
Date: Sun, 27 Jul 2008 11:10:37 +0200
Subject: [Numpy-discussion] 2D Hamming window
In-Reply-To: <488BC5E1.2030208@bigpond.net.au>
References: <488B807C.6050507@phoboid.net> <488BC5E1.2030208@bigpond.net.au>
Message-ID: <488C3B8D.6040806@phoboid.net>

Hi,

Gary Ruben wrote:
> import numpy as np
> import scipy.signal as ss
>
> # read heightmap here - in my case it's a square numpy float array
>
> # build 2d window
> hm_len = heightmap.shape[0]
> bw2d = np.outer(ss.hamming(hm_len), np.ones(hm_len))
> bw2d = np.sqrt(bw2d * bw2d.T) # I don't know whether the sqrt is correct
>
> # window the heightmap
> heightmap *= bw2d
>
> --
> Gary R.

Thanks very much for your solution, this is exactly what I needed! If I'm not mistaken, though, you can achieve the same result with

h = hamming(n)
ham2d = sqrt(outer(h,h))

which is a bit more compact.

Regards,
Henrik

From oliphant at enthought.com Sun Jul 27 06:00:43 2008
From: oliphant at enthought.com (Travis E. Oliphant)
Date: Sun, 27 Jul 2008 05:00:43 -0500
Subject: [Numpy-discussion] No Copy Reduce Operations
In-Reply-To: <200807261503.36015.lpc@cmu.edu>
References: <200807261503.36015.lpc@cmu.edu>
Message-ID: <488C474B.1050601@enthought.com>

Luis Pedro Coelho wrote:
> Hello all,
>
> Numpy arrays come with several reduce operations: sum(), std(), argmin(), min(), ....
>
> The traditional implementation of these suffers from two big problems: It is
> slow and it often allocates intermediate memory. I have code that is failing
> with OOM (out of memory) exceptions in calls to ndarray.std(). I regularly
> handle arrays with 100 million entries (have a couple of million objects * 20
> features per object = 100 million doubles), so this is a real problem for me.

Luis,

Thank you for your work and your enthusiasm. You are absolutely right that the default implementations are generic and therefore potentially slower and more memory-consuming. There is generally a basic tradeoff between generic code and fast/memory-conserving code.

The default implementations of std and var are much different than sum, min, argmin, etc. The main difference is that the latter are direct reduce methods on the ufuncs, while the former are generic extensions using "python written with the Python C-API."

Your approach using C++ templates is interesting, and I'm very glad for your explanation and your releasing of the code as open source. I'm not prepared to start using C++ in NumPy, however, so your code will have to serve as an example only.

One way to do this without using templates is to extend the dtype functions array with additional function pointers (std and var). This has been done several times in the past and it is probably advisable. In that case your code could very likely be used (using the C-compatible portions). I'm grateful you are willing to re-license that part of your code as BSD so that it can possibly be used in NumPy.

Thanks so much. It is exciting to see interest in the code, especially from students.

Best regards,

-Travis

P.S. Specific answers to some of your questions below.

> I am not correctly implementing the dtype parameter. I thought it controlled
> the type of the intermediate results, but if I do

The dtype parameter controls the type of the "reduction" but not the final result (which is always a float for the mean because of the division).

> I see three possible paths for this:
> (1) You agree that this is nice, it achieves good results and I should strive
> to integrate this into numpy proper, replacing the current implementation. I
> would start a branch in numpy svn and implement it there and finally merge
> into the main branch. I would be perfectly happy to relicense to BSD if this
> is the case.

The idea of your code is great, but the C++ implementation cannot be directly used.

> One could think of the array-to-array (A+B, A+=2, A != B,...) operations
> using a similar scheme.
> This would imply that part of the C API would be obsoleted: the whole
> functions array would stop making sense. I don't know how widespread it's
> used.

I don't know what you mean by the "functions array". If you are talking about the ufuncs, then yes, it is widespread.
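To make the point about the dtype argument concrete, here is a small interactive sketch (the mean outputs are the ones reported earlier in this thread): dtype sets the accumulator type used during the reduction itself, while mean's final division still produces a float64:

>>> import numpy
>>> A = numpy.arange(10)
>>> A.sum(dtype=numpy.int8)       # the reduction accumulates in int8
45
>>> A.mean(dtype=numpy.int8)      # ...but mean ends with a float division
4.5
>>> A.mean(dtype=numpy.int8).dtype
dtype('float64')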
From gruben at bigpond.net.au Sun Jul 27 06:32:07 2008
From: gruben at bigpond.net.au (Gary Ruben)
Date: Sun, 27 Jul 2008 11:32:07 +0100
Subject: [Numpy-discussion] 2D Hamming window
In-Reply-To: <488C3B8D.6040806@phoboid.net>
References: <488B807C.6050507@phoboid.net> <488BC5E1.2030208@bigpond.net.au> <488C3B8D.6040806@phoboid.net>
Message-ID: <488C4EA7.60800@bigpond.net.au>

Henrik Ronellenfitsch wrote:
> Thanks very much for your solution, this is exactly what I needed!
> If I'm not mistaken, though, you can achieve the same result with
>
> h = hamming(n)
> ham2d = sqrt(outer(h,h))
>
> which is a bit more compact.
>
> Regards,
> Henrik

Yes, that's nicer.

regards,
Gary

From david at ar.media.kyoto-u.ac.jp Mon Jul 28 01:26:59 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 28 Jul 2008 14:26:59 +0900
Subject: [Numpy-discussion] numpy 1.1.rc2: win32 binaries
Message-ID: <488D58A3.6070800@ar.media.kyoto-u.ac.jp>

Hi,

After some delay, here are the win32 binaries for numpy 1.1.1rc2:

http://www.ar.media.kyoto-u.ac.jp/members/david/numpy-1.1.1.dev5559-win32-superpack-python2.5.exe

Notes on those binaries:
- Based on Atlas 3.8.2 (the 1.1.0 was built against 3.8.0, which had a serious bug wrt dgemm, which is used for numpy.dot). It should solve #844 (problem with numpy.inner).
- It is not strictly based on 1.1.rc2, but on 1.1.x trunk. The only difference, though, is a small fix in MANIFEST.in which was broken wrt the sdist target, and the version of course.

If those work out, I will also prepare 2.4 binaries. I am sorry for the delay,

cheers,

David

From david at ar.media.kyoto-u.ac.jp Mon Jul 28 01:30:44 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 28 Jul 2008 14:30:44 +0900
Subject: [Numpy-discussion] numpy 1.1.rc2: win32 binaries
In-Reply-To: <488D58A3.6070800@ar.media.kyoto-u.ac.jp>
References: <488D58A3.6070800@ar.media.kyoto-u.ac.jp>
Message-ID: <488D5984.202@ar.media.kyoto-u.ac.jp>

David Cournapeau wrote:
> Hi,
>
> After some delay, here are the win32 binaries for numpy 1.1.1rc2:
>
> http://www.ar.media.kyoto-u.ac.jp/members/david/numpy-1.1.1.dev5559-win32-superpack-python2.5.exe
>

I managed to screw up the link:

http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy-1.1.1.dev5559-win32-superpack-python2.5.exe

cheers,

David

From charlesr.harris at gmail.com Mon Jul 28 03:41:42 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 28 Jul 2008 01:41:42 -0600
Subject: [Numpy-discussion] numpy 1.1.rc2: win32 binaries
In-Reply-To: <488D58A3.6070800@ar.media.kyoto-u.ac.jp>
References: <488D58A3.6070800@ar.media.kyoto-u.ac.jp>
Message-ID: 

On Sun, Jul 27, 2008 at 11:26 PM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote:

> Hi,
>
> After some delay, here are the win32 binaries for numpy 1.1.1rc2:
>
> http://www.ar.media.kyoto-u.ac.jp/members/david/numpy-1.1.1.dev5559-win32-superpack-python2.5.exe
>
> [rest of the announcement snipped]

Great. Now I just need to write up release notes ;)

...Chuck

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From felix at physik3.uni-rostock.de Mon Jul 28 04:25:36 2008
From: felix at physik3.uni-rostock.de (Felix Richter)
Date: Mon, 28 Jul 2008 10:25:36 +0200
Subject: [Numpy-discussion] FFT usage / consistency
Message-ID: <200807281025.36338.felix@physik3.uni-rostock.de>

Hi all,

Stefan, thank you very much for your quick answer. This was an obvious silly mistake. Now the first function does what it should.

Still, my colleagues and I can't make any sense of what happens in the second example. The re-transformed function is identical to the original one, but the Fourier transform doesn't have anything to do with what it should be mathematically. Could you or someone else please have a look at it?

Thanks again,
Felix

From David.Kaplan at ird.fr Mon Jul 28 08:55:22 2008
From: David.Kaplan at ird.fr (David M. Kaplan)
Date: Mon, 28 Jul 2008 14:55:22 +0200
Subject: [Numpy-discussion] [SciPy-user] unique, sort, sortrows
In-Reply-To: 1217162589.7128.36.camel@localhost
Message-ID: <1217249722.7230.36.camel@localhost>

Hi,

Well, as usual there are compromises in everything, and the mgrid/ogrid functionality is the way it currently is for some good reasons.

The first reason is that python appears to be fairly sloppy about how it passes indexing arguments to the __getitem__ method. It passes a tuple containing the arguments in all cases except when it has one argument, in which case it just passes that argument. This means that it is hard to tell a tuple argument from several non-tuple arguments. For example, the following two produce exactly the same call to __getitem__:

mgrid[1,2,3]
mgrid[(1,2,3)]

(__getitem__ receives a single tuple (1,2,3)), but different from:

mgrid[[1,2,3]]

(__getitem__ receives a single list = [1,2,3]). This seems like a bug to me, but is probably considered a feature by somebody. In any case, this is workable, but a bit annoying in that tuple arguments just aren't going to work well.

The second problem is that the current implementation is fairly efficient because it forces all arguments to the same type so as to avoid some unnecessary copies (I think). Once you allow non-slice arguments, this is hard to maintain.

That being said, attached is a replacement for index_tricks.py that should implement a reasonable set of features, while only very slightly altering performance. I have only touched nd_grid. I haven't fixed the documentation string yet, nor have I added tests to test_index_tricks.py, but will do that if the changes will be accepted into numpy.

With the new version, old stuff should work as usual, except that mgrid now returns a list of arrays instead of an array of arrays (note that this change will cause the current test_index_tricks.py to fail). With the new changes, you can now do:

mgrid[-2:5:10j,[4.5,6,7.1],12,['abc','def']]

The following will work as expected:

mgrid[:5,(1,2,3)]

But this will not:

mgrid[(1,2,3)] # same as mgrid[1,2,3], but different from mgrid[[1,2,3]]

Given these limitations, this seems like a fairly useful addition. If this looks usable, I will clean up and add tests if desired. If not, I recommend adding an ndgrid function to numpy that does the equivalent of matlab's [X,Y,Z,...] = ndgrid(x,y,z,...) and then making the current meshgrid just call that, changing the order of the first two arguments.

Cheers,
David

--
**********************************
David M. Kaplan
Charge de Recherche 1
Institut de Recherche pour le Developpement
Centre de Recherche Halieutique Mediterraneenne et Tropicale
av. Jean Monnet B.P.
171 34203 Sete cedex France
Phone: +33 (0)4 99 57 32 27
Fax: +33 (0)4 99 57 32 95
http://www.ur097.ird.fr/team/dkaplan/index.html
**********************************

-------------- next part --------------
A non-text attachment was scrubbed...
Name: index_tricks.py
Type: text/x-python
Size: 15089 bytes
Desc: not available
URL: 

From aisaac at american.edu Mon Jul 28 09:51:47 2008
From: aisaac at american.edu (Alan G Isaac)
Date: Mon, 28 Jul 2008 09:51:47 -0400
Subject: [Numpy-discussion] [SciPy-user] unique, sort, sortrows
In-Reply-To: <1217249722.7230.36.camel@localhost>
References: <1217249722.7230.36.camel@localhost>
Message-ID: <488DCEF3.3010702@american.edu>

David M. Kaplan wrote:
> python appears to be fairly sloppy about how it passes
> indexing arguments to the __getitem__ method.

I do not generally find the word 'sloppy' to be descriptive of Python.

> It passes a tuple containing the arguments in all cases
> except when it has one argument, in which case it just
> passes that argument.

Well, not quite. The bracket syntax is for passing a key (a single object) to __getitem__.

> For example, the following two produce exactly the same
> call to __getitem__ :
> mgrid[1,2,3]
> mgrid[(1,2,3)]

Well, yes. Note::

    >>> x = 1,2,3
    >>> type(x)
    <type 'tuple'>

In Python it is the commas, not the parentheses, that determine the tuple type. So perhaps the question you raise could be rephrased as "why does an ndarray (not Python) treat a list 'index' differently than a tuple 'index'?" I do not know the history of that decision, but it has been used to provide some additional functionality.

Cheers,
Alan Isaac

From bsouthey at gmail.com Mon Jul 28 10:31:49 2008
From: bsouthey at gmail.com (Bruce Southey)
Date: Mon, 28 Jul 2008 09:31:49 -0500
Subject: [Numpy-discussion] No Copy Reduce Operations
In-Reply-To: <200807261503.36015.lpc@cmu.edu>
References: <200807261503.36015.lpc@cmu.edu>
Message-ID: <488DD855.9000602@gmail.com>

Luis Pedro Coelho wrote:
> Hello all,
>
> Numpy arrays come with several reduce operations: sum(), std(), argmin(), min(), ....
>
> [the rest of the no-copy reduce operations proposal, quoted in full, is snipped here -- see the original posting above]
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

FYI, for computing variance (and hence std) you probably should be using Knuth's (or Welford's) one-pass approach (the on-line algorithm) to avoid recomputing the mean:
http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance

Also, for large arrays you may want to maximize the precision to avoid potential overflow.

Bruce

From faltet at pytables.org Mon Jul 28 12:17:41 2008
From: faltet at pytables.org (Francesc Alted)
Date: Mon, 28 Jul 2008 18:17:41 +0200
Subject: [Numpy-discussion] RFC: A (second) proposal for implementing some date/time types in NumPy
In-Reply-To: <200807251647.03102.pgmdevlist@gmail.com>
References: <200807161844.36953.faltet@pytables.org> <200807251309.34418.faltet@pytables.org> <200807251647.03102.pgmdevlist@gmail.com>
Message-ID: <200807281817.41987.faltet@pytables.org>

Hi Pierre,

A Friday 25 July 2008, Pierre GM escrigué:
> Francesc,
>
> Could you clarify a couple of points ?
>
> [datetime64]
> If I understand properly, your datetime64 would be time units from
> the POSIX epoch (1970/01/01 00:00:00), right ? So
>
> +7d would be 1970/01/08 (7 days after the epoch)
> -7W would be 1969/11/13 (7*7 days before the epoch)
>
> With this approach, a series [1,2,3,7] at a resolution 'd' would
> correspond to 1970/01/01, 1970/01/02, 1970/01/03 and 1970/01/07,
> right ?
>
> I'm all for that, **AS LONG AS we have a business day resolution**
> 'b', so that
> +7b would be 1970/01/09.
We have been analyzing the addition of a business day resolution into the mix, but this has the problem that such an entity cannot really be considered a 'resolution': the business day does not have a fixed time-span (two days of the week don't count, and that introduces non-regular behaviour in many situations). Having said that, it is apparent that the business day is a **strong requirement** on your side, and you know that we would like to keep you happy. So, to allow this to happen, we have concluded that a conceptual change in our second proposal is needed: instead of a 'resolution', we can introduce the 'time unit' concept. A 'time unit' can be considered as an extent of time that doesn't necessarily need to be fixed, but can change depending on the context of use. As the 'time unit' concept has this less restrictive meaning, we think that the user can be easily persuaded that a 'business day' fits this definition (which would be difficult/weird to explain with the 'resolution' concept). We have given this some thought, and it is certain that this will add a bit more complexity (not too much, really). So, yes, we are willing to rewrite the proposal with the new 'time unit' concept and include the 'business day' too. With this, we hope to better serve the needs of the TimeSeries authors and users.

Also, adding the 'time unit' concept (and its corresponding infrastructure) into the dtype opens the door to the adoption of other 'XXXX units' inside NumPy so that, for example, people could easily convert from, say, miles to kilometers like this:

lengths_in_miles_array.astype('length[Km]')

but well, that is another story.

> [timedelta64]
> I like your idea of a timedelta64 being relative, but in that case,
> why not have the same resolutions as datetime64 ?

At the beginning, our argument for keeping weeks as the coarsest unit for relative times was that the duration of months and years is not well defined (a month can last between 28 and 31 days, and a year 365 or 366 days) for a time that is meant to be *relative* (for example, the duration of a relative month is different if the reference time is June or July). However, after thinking more about this, we now think that a relative time of months or years does have a clear meaning: it makes a lot of sense to say "3 months after July 1998" or "5 months before August 2008", i.e. relative months make complete sense when used in combination with an absolute date.

One thing that will not be possible, though, is to change the time unit of a relative time expressed in, say, years into another time unit expressed in, say, days. This is because it is impossible to know how many days a relative year (i.e. one not bound to a given year) has. More generally, it will not be possible to perform 'time unit' conversions between units above and below a relative week (the week being the largest time unit with a definite number of seconds).

So, yes, we will be adding months and years to the relative times too.
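To illustrate these semantics with today's tools, here is a minimal sketch using the ``dateutil`` package's relativedelta (assumed here purely for illustration; the proposed dtype syntax is a separate matter):

>>> from datetime import date
>>> from dateutil.relativedelta import relativedelta
>>> date(1998, 7, 1) + relativedelta(months=3)    # "3 months after July 1998"
datetime.date(1998, 10, 1)
>>> # the same relative 3 months spans a different number of days
>>> # depending on the anchor, so months cannot be converted to days:
>>> (date(1998, 7, 1) + relativedelta(months=3)) - date(1998, 7, 1)
datetime.timedelta(92)
>>> (date(1999, 1, 1) + relativedelta(months=3)) - date(1999, 1, 1)
datetime.timedelta(90)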
> [scikits.timeseries]
> We can currently perform the following operations in
> scikits.timeseries
>
> >>>import scikits.timeseries as ts
> >>>series = ts.date_array(['1970-01', '1970-02', '1970-09'], freq='M')
> >>>series
> DateArray([Jan-1970, Feb-1970, Sep-1970],
>           freq='M')
> >>>series.asfreq('A')
> DateArray([1970, 1970, 1970],
>           freq='A-DEC')
> >>>series.asfreq('A-MAR')
> DateArray([1970, 1970, 1971],
>           freq='A-MAR')
>
> "A-MAR" means that year YY ends on 03/31 and that year (YY+1) starts
> on 04/01.
>
> I use that a lot in my work, when I need to average daily data by
> water years (a water year usually starts on 04/01 and ends the
> following 03/31).
>
> How would I do that with datetime64 and timedelta64 ?

Well, as we don't want an 'origin' to be part of our proposal, you won't be able to do exactly that with the proposed plain dtype. However, we think that by making rational use of smaller time units (i.e. with more resolution, in the old convention) and a combination of absolute and relative times, it is easy to cover this use case. To continue with your example, you will be able to do:

>>> series = numpy.array(['1970-01', '1970-02', '1970-09'], dtype='T[M]')
>>> series.astype('Y')
array([1970, 1970, 1970], dtype='T8[Y]')
>>> series2 = series + 9  # add 9 relative months (maps an Apr-Mar year onto the calendar year)
>>> series2.astype('Y')
array([1970, 1970, 1971], dtype='T8[Y]')

I hope you get the idea.

> Apart from that, I'd be of course quite happy to help as much as I
> can. P.

Well, I really hope that you will be OK with the modifications that we are planning for the new (third) proposal. Many thanks!

Francesc

> ############################################
>
> On Friday 25 July 2008 07:09:33 Francesc Alted wrote:
> > Hi,
> >
> > Well, as there were no replies to our second proposal for the
> > date/time dtype, I assume that everybody agrees with it ;-) At any
> > rate, we would like to proceed with the implementation phase very
> > soon now.
> >
> > However, it happens that Enthought is sponsoring this job and they
> > clearly stated that the implementation should cover the needs of as
> > many users as possible. So, in particular, we would like that one
> > of the heaviest users of date/time objects, i.e. the TimeSeries
> > authors, would be comfortable with the new date/time dtypes, and
> > especially that they can benefit from them.
> >
> > For this goal, we are proposing a decoupling of the date/time use
> > cases in two different groups:
> >
> > 1. A pure ``datetime`` dtype (absolute or relative) that would be
> > useful for timestamping purposes in general (i.e. registering dates
> > without a need that they be evenly spaced in time).
> >
> > 2. A class based on the ``frequency`` concept that would be useful
> > for measurements that are done on a regular basis or in business
> > applications.
> >
> > With this, we are preventing the dtype implementation at the core
> > of NumPy from being too cluttered with the relatively complex needs
> > of the ``frequency`` concept users, factoring it out to an external
> > class (``Date``, to follow the TimeSeries naming convention). More
> > importantly, this decoupling will also avoid mixing those two
> > concepts which, although both about time measurements, have quite
> > different meanings indeed.
> >
> > Another important advantage of this distinction is that the
> > ``datetime`` timestamp requires less meta-information to worry
> > about (basically, the 'resolution' property), while a ``frequency`` à
la TimeSeries will need more additional meta-information, like > > the 'start' and 'end' of periods, as well as a more complex way to > > code frequencies (there exists much more time-periods to be coded, > > as it can be seen in [1]_). This can be utterly important to allow > > the NumPy data based on the ``datetime`` dtype to be quickly saved > > and retrieved on databases like ZODB (object database) or PyTables > > (HDF5-based database). > > > > Our ultimate goal is that the ``Date`` and ``DateArray`` classes in > > the TimeSeries would be rewritten in terms of the new date/time > > dtype so as to get advantage of its features but also for getting > > rid of duplicated code. I honestly think that this can be a big > > advantage for TimeSeries indeed (at the cost of taking some time > > for doing the migration). > > > > Does that approach make sense for people? > > > > .. [1] http://scipy.org/scipy/scikits/wiki/TimeSeries#Frequencies > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion -- Francesc Alted From felix at physik3.uni-rostock.de Mon Jul 28 12:35:50 2008 From: felix at physik3.uni-rostock.de (Felix Richter) Date: Mon, 28 Jul 2008 18:35:50 +0200 Subject: [Numpy-discussion] FFT usage / consistency In-Reply-To: <200807281025.36338.felix@physik3.uni-rostock.de> References: <200807281025.36338.felix@physik3.uni-rostock.de> Message-ID: <200807281835.50556.felix@physik3.uni-rostock.de> I have to correct myself. The function test_fft1() still does not work, it just looked good in the plots, but at a closer look, Re(IFFT) is close to zero and is far from matching the exact IFT. So it seems FFT(IFFT(f)) == IFFT(FFT(f)) == 1 (if done right ;-), but I just cannot reproduce the exact (I)FT. Felix From nwagner at iam.uni-stuttgart.de Mon Jul 28 13:08:43 2008 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Mon, 28 Jul 2008 19:08:43 +0200 Subject: [Numpy-discussion] numpy 1.1.rc2: win32 binaries In-Reply-To: <488D58A3.6070800@ar.media.kyoto-u.ac.jp> References: <488D58A3.6070800@ar.media.kyoto-u.ac.jp> Message-ID: On Mon, 28 Jul 2008 14:26:59 +0900 David Cournapeau wrote: > Hi, > > After some delay, here are the win32 binaries for >numpy 1.1.1rc2: > > http://www.ar.media.kyoto-u.ac.jp/members/david/numpy-1.1.1.dev5559-win32-superpack-python2.5.exe > > > Notes on those binaries: > - Based on Atlas 3.8.2 (the 1.1.0 was built against >3.8.0, which had > a serious bug wrt dgemm, which is used for numpy.dot. It >should solve > #844 (problem with numpy.inner) > - It is not stricly based on 1.1.rc2, but on 1.1.x >trunk. The only > difference, though, is a small fix in MANIFEST.in which >was broken wrt > sdist target and the version of course. > > If those work out, I will also prepare 2.4 binaries. I >am sorry for the > delay, > > cheers, > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion David, Did you also try ATLAS3.9.1 ? 
Is it recommended to use the stable version (3.8.2)?

Nils

From faltet at pytables.org Mon Jul 28 13:16:25 2008
From: faltet at pytables.org (Francesc Alted)
Date: Mon, 28 Jul 2008 19:16:25 +0200
Subject: [Numpy-discussion] RFC: A (second) proposal for implementing some date/time types in NumPy
In-Reply-To: 
References: <200807161844.36953.faltet@pytables.org> <200807251309.34418.faltet@pytables.org>
Message-ID: <200807281916.25489.faltet@pytables.org>

A Saturday 26 July 2008, Matt Knox escrigué:
> >> For this goal, we are proposing a decoupling of the date/time use
> >> cases in two different groups:
> >>
> >> 1. A pure ``datetime`` dtype (absolute or relative) that would be
> >> useful for timestamping purposes in general (i.e. registering
> >> dates without a need that they be evenly spaced in time).
>
> I agree with this split. A basic datetime data type would be useful
> to a lot of people that don't need fancier time series capabilities.

Excellent, this is our thought too.

> I would recommend focusing on implementing this first as it will
> probably provide lots of useful learning experiences and examples for
> the more complicated task of a "frequency" aware date type later on.

Definitely. We plan to do exactly this.

> >> 2. A class based on the ``frequency`` concept that would be useful
> >> for measurements that are done on a regular basis or in business
> >> applications.
> >> ...
> >> Our ultimate goal is that the ``Date`` and ``DateArray`` classes
> >> in the TimeSeries would be rewritten in terms of the new date/time
> >> dtype so as to get advantage of its features but also for getting
> >> rid of duplicated code.
>
> I'm excited to hear such interest in time series work with python and
> numpy. I certainly support the goals, and more collaboration and
> sharing of code is always a good thing. My biggest concern would be
> not losing existing functionality. A decent amount of work went into
> implementing all the different frequencies, and losing any of the
> currently supported frequencies could mean the difference between the
> dtype being very useful to someone, or not useful at all.
>
> Just thinking out loud here... but in terms of improving on the Date
> implementation in the timeseries module, it would be nice to have a
> more "plug in" kind of architecture for implementing different
> frequencies, so that it could be extended more easily with custom
> frequencies by other users. There is no end to the list of possible
> frequencies that people might potentially use, and the current
> timeseries implementation isn't as flexible as it could be in that
> area.

We completely agree with the idea of the plug-in architecture for the ``Date`` class. Are you thinking of something concrete already?

> The automatic string parsing has been mentioned before, but it is a
> feature I am personally very fond of. I use it all the time, and I
> suspect a lot of people would like it very much if they used it. It's
> not suited for high performance code, but is fantastic for
> interactive and ad-hoc work. This is supported right in the
> "constructor" of the current Date class, along with conversion from
> datetime objects. I'd love to see such support built into the new
> date type, although I guess it could be added on easily enough with a
> factory function.

Well, what we are planning is to support only three kinds of assignments:

- From ``datetime.datetime`` (absolute time) or ``datetime.timedelta`` (relative time) objects.
- From integers or floating-point numbers (relative time).
- From ISO-8601 strings (absolute time).

The last input mode does imply a parser, but our intention is to directly support just the ISO standard. We think that if you want to specify other string formats it is better to rely on the ``datetime`` parsers or, as John Hunter suggests, the ``dateutil`` module. We believe that incorporating more parsers into the ``Date`` class may represent an unnecessary duplication of code.

> Another extra feature (or hack depending on your point of view) in
> the timeseries Date class is the addition of a couple extra custom
> directives for string formatting. Specifically the %q and %Q
> directives for printing out Quarter information. Obviously these are
> non-standard directives, but when you are talking about dates with
> custom frequencies I think it sometimes makes sense to have custom
> format directives. A plug-in architecture that somehow lets you
> define new custom directives for various frequencies would also be
> really nice.

Maybe you are right, yes. However, I'd consider using ``datetime`` or ``dateutil`` for this first. If there are use cases that are beyond the existing modules, then we can start thinking about this, but not before.

> Anyway, I'm very much in support of this initiative. I'm not sure
> I'll be able to help much on the initial implementation, but once you
> have a framework in place I may be able to pitch in with some of the
> details. Please keep us posted.

Yes, that's the idea. We plan to send a third proposal (tomorrow?) based on the latest suggestions by Pierre. Once we reach a consensus, we will start the implementation of the date/time dtype based on the final proposal (hopefully, the third one).

It would be great if, based on this, and before or during the implementation phase of the dtype, you could start thinking about the architecture of the new ``Date`` class (with all the added fanciness that you are proposing) so that we have time to include possible details that escaped from the final proposal for the date/time dtype.

Thanks a lot!

--
Francesc Alted

From charlesr.harris at gmail.com  Mon Jul 28 13:56:42 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 28 Jul 2008 11:56:42 -0600
Subject: [Numpy-discussion] numpy 1.1.rc2: win32 binaries
In-Reply-To: References: <488D58A3.6070800@ar.media.kyoto-u.ac.jp>
Message-ID: 

On Mon, Jul 28, 2008 at 11:08 AM, Nils Wagner wrote:

> On Mon, 28 Jul 2008 14:26:59 +0900
>   David Cournapeau wrote:
> > Hi,
> >
> > After some delay, here are the win32 binaries for numpy 1.1.1rc2:
> >
> > http://www.ar.media.kyoto-u.ac.jp/members/david/numpy-1.1.1.dev5559-win32-superpack-python2.5.exe
> >
> > Notes on those binaries:
> >    - Based on Atlas 3.8.2 (the 1.1.0 was built against 3.8.0, which had
> > a serious bug wrt dgemm, which is used for numpy.dot. It should solve
> > #844 (problem with numpy.inner))
> >    - It is not strictly based on 1.1.rc2, but on the 1.1.x trunk. The only
> > difference, though, is a small fix in MANIFEST.in, which was broken wrt
> > the sdist target, and the version, of course.
> >
> > If those work out, I will also prepare 2.4 binaries. I am sorry for the
> > delay,
> >
> > cheers,
> >
> > David
> > _______________________________________________
> > Numpy-discussion mailing list
> > Numpy-discussion at scipy.org
> > http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
> David,
>
> Did you also try ATLAS3.9.1 ?
> Is it recommended to use the stable version (3.8.2)?
>

I compiled ATLAS3.9.1 and the make time tests didn't run any faster. It's also very new, so it might be best to let it stew a bit to uncover any little oopsies.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From Chris.Barker at noaa.gov  Mon Jul 28 14:31:07 2008
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Mon, 28 Jul 2008 11:31:07 -0700
Subject: [Numpy-discussion] RFC: A (second) proposal for implementing some date/time types in NumPy
In-Reply-To: <200807161844.36953.faltet@pytables.org>
References: <200807161844.36953.faltet@pytables.org>
Message-ID: <488E106B.5030404@noaa.gov>

Hi,

Sorry for the very long delay in commenting on this. In short, it looks great, and thanks for your efforts. A couple of small comments:

> In [11]: t[0] = datetime.datetime.now()  # setter in action
>
> In [12]: t[0]
> Out[12]: '2008-07-16T13:39:25.315'  # representation in ISO 8601 format

I like that, but what about:

> In [8]: t1 = numpy.zeros(5, dtype="datetime64[s]")
> In [9]: t2 = numpy.ones(5, dtype="datetime64[s]")
>
> In [10]: t = t2 - t1
>
> In [11]: t[0] = 24  # setter in action (setting to 24 seconds)

Is there a way to set in any other units? (hours, days, etc.)

> In [12]: t[0]
> Out[12]: 24  # representation as an int64

why not a "pretty" representation of timedelta64 too? I'd like that better (at least for __str__; perhaps __repr__ should be the raw numbers).

how will operations between different types work?

> t1 = numpy.ones(5, dtype="timedelta64[s]")
> t2 = numpy.ones(5, dtype="timedelta64[ms]")

t1 + t2
>> ??????

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From jdh2358 at gmail.com  Mon Jul 28 14:56:52 2008
From: jdh2358 at gmail.com (John Hunter)
Date: Mon, 28 Jul 2008 13:56:52 -0500
Subject: [Numpy-discussion] strange seterr persistence between sessions
Message-ID: <88e473830807281156t54823a7fk4f7614435f46fc28@mail.gmail.com>

In trying to track down a bug in matplotlib, I have come across some very strange numpy behavior. Basically, whether or not I call np.seterr('raise') in a matplotlib demo affects the behavior of seterr in another (pure numpy) script, run in a separate process. Something about the numpy state is persisting between python sessions. This appears to be platform specific, because I have only been able to verify it on one platform (a quad-core 64-bit Xeon running Fedora) but not on another (Solaris x86).

Here are the gory details. Below is a cut-and-paste from a single xterm session, with some comments sprinkled in.

Some version info::

  ~> uname -a
  Linux bic128.bic.berkeley.edu 2.6.25.10-47.fc8 #1 SMP Mon Jul 7 18:31:41 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
  ~> python -V
  Python 2.5.1
  ~> python -c 'import numpy; print numpy.__version__'
  1.2.0.dev5564
  ~> python -c 'import matplotlib; print matplotlib.__version__'
  0.98.3rc1

With mpl svn, head over to the examples directory and grab the data file needed to show this bug::

  ~> cd mpl/examples/pylab_examples/
  pylab_examples> wget http://matplotlib.sourceforge.net/tmp/alpha.npy
  --11:22:08--  http://matplotlib.sourceforge.net/tmp/alpha.npy
             => `alpha.npy'
  Resolving matplotlib.sourceforge.net... 66.35.250.209
  Connecting to matplotlib.sourceforge.net|66.35.250.209|:80... connected.
  HTTP request sent, awaiting response...
200 OK Length: 688 [text/plain] 100%[===================================================================>] 688 --.--K/s 11:22:08 (111.19 MB/s) - `alpha.npy' saved [688/688] Run the geo_demo.py example. This has np.seterr set to "raise". It will issue a floating point error:: pylab_examples> head -5 geo_demo.py import numpy as np np.seterr("raise") from pylab import * pylab_examples> python geo_demo.py Traceback (most recent call last): File "/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/backends/backend_gtk.py", line 333, in expose_event self._render_figure(self._pixmap, w, h) File "/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/backends/backend_gtkagg.py", line 75, in _render_figure FigureCanvasAgg.draw(self) File "/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/backends/backend_agg.py", line 261, in draw self.figure.draw(self.renderer) File "/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/figure.py", line 759, in draw for a in self.axes: a.draw(renderer) File "/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/axes.py", line 1523, in draw a.draw(renderer) File "/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/axis.py", line 718, in draw tick.draw(renderer) File "/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/axis.py", line 186, in draw self.gridline.draw(renderer) File "/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/lines.py", line 423, in draw tpath, affine = self._transformed_path.get_transformed_path_and_affine() File "/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/transforms.py", line 2089, in get_transformed_path_and_affine self._transform.transform_path_non_affine(self._path) File "/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/transforms.py", line 1828, in transform_path_non_affine self._a.transform_path(path)) File "/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/transforms.py", line 1828, in transform_path_non_affine self._a.transform_path(path)) File "/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/transforms.py", line 1816, in transform_path self._a.transform_path(path)) File "/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/projections/geo.py", line 264, in transform_path return Path(self.transform(ipath.vertices), ipath.codes) File "/home/jdhunter/dev/lib64/python2.5/site-packages/matplotlib/projections/geo.py", line 249, in transform sinc_alpha = ma.sin(alpha) / alpha File "/home/jdhunter/dev/lib64/python2.5/site-packages/numpy/ma/core.py", line 1887, in __div__ return divide(self, other) File "/home/jdhunter/dev/lib64/python2.5/site-packages/numpy/ma/core.py", line 638, in __call__ t = narray(self.domain(d1, d2), copy=False) File "/home/jdhunter/dev/lib64/python2.5/site-packages/numpy/ma/core.py", line 413, in __call__ return umath.absolute(a) * self.tolerance >= umath.absolute(b) FloatingPointError: underflow encountered in multiply OK, now run the pure numpy test script in a separate python process. It also has np.seterr set to raise, and it raises the same error. 
Nothing too strange (yet):: pylab_examples> cat test.py import numpy as np np.seterr("raise") import numpy.ma as ma alpha = np.load('alpha.npy') alpham = ma.MaskedArray(alpha) sinc_alpha_ma = ma.sin(alpham) / alpham pylab_examples> python test.py Traceback (most recent call last): File "test.py", line 7, in sinc_alpha_ma = ma.sin(alpham) / alpham File "/home/jdhunter/dev/lib64/python2.5/site-packages/numpy/ma/core.py", line 1887, in __div__ return divide(self, other) File "/home/jdhunter/dev/lib64/python2.5/site-packages/numpy/ma/core.py", line 638, in __call__ t = narray(self.domain(d1, d2), copy=False) File "/home/jdhunter/dev/lib64/python2.5/site-packages/numpy/ma/core.py", line 413, in __call__ return umath.absolute(a) * self.tolerance >= umath.absolute(b) FloatingPointError: underflow encountered in multiply OK, in your editor, comment out the np.seterr line from the geo_demo.py script, and rerun it. The demo runs fine w/o error this time and a figure window pops up. Again, nothing surprising. pylab_examples> head -5 geo_demo.py import numpy as np #np.seterr("raise") from pylab import * pylab_examples> python geo_demo.py OK, now this is where it starts getting funky. Rerun the numpy test script, with no changes (seterr is still set to raise):: pylab_examples> cat test.py import numpy as np np.seterr("raise") import numpy.ma as ma alpha = np.load('alpha.npy') alpham = ma.MaskedArray(alpha) sinc_alpha_ma = ma.sin(alpham) / alpham pylab_examples> python test.py pylab_examples> This time it ran w/o errors (and will continue to do so on successive runs). Same script, same data, same error codes. I can repeat this many times: if I turn errors back on in geo_demo, the error is raised in subsequent runs of test.py. If I turn it back off in geo_demo.py, it is not raised in subsequent runs of test.py. I tried this on a solaris x86 box and did not see this behavior. In mostly unrelated news, I find the exception in the ma divide here a bit confusing, because using np sin and divide does not raise this error, and none of the values are masked. I would expect the divides for the unmasked portions to have the same floating point behavior. Eg, only the second divide raises in the example below:: import numpy as np np.seterr("raise") import numpy.ma as ma alpha = np.load('alpha.npy') alpham = ma.MaskedArray(alpha) sinc_alpha_ma = np.sin(alpham.data) / alpham.data sinc_alpha_ma = ma.sin(alpham) / alpham JDH From jj20047 at gmail.com Mon Jul 28 15:01:58 2008 From: jj20047 at gmail.com (jb) Date: Mon, 28 Jul 2008 12:01:58 -0700 Subject: [Numpy-discussion] Ashigabou Repository atlas vs yum blas/lapack Message-ID: Hello: I'm hoping someone can straighten me out. I have a 64 bit fedora 8 quad core machine and can install blas and lapack from the yum repository. With these, numpy installs fine and finds blas and lapack. I also tried removing the yum blas/lapack libs and installing atlas via the instructions given on the scipy site for Ashigabou Repository. Atlas built fine from source and after installing the new rpm there were two files in the lib64/atlas/sse2 folder: libblas.so.3.0 and liblapack.so.3.0. 
However, when I try to install numpy, it cannot find any blas, lapack, or atlas, even though my site.cfg file has:

[DEFAULT]
library_dirs = /usr/lib64:/usr/lib64/atlas/sse2

[blas_opt]
libraries = f77blas, cblas, atlas

[lapack_opt]
libraries = lapack, f77blas, cblas, atlas

[atlas]
library_dirs = /usr/lib64/atlas/sse2
atlas_libs = lapack, blas

Using LD_LIBRARY_PATH=/usr/lib64/atlas/sse2 before installing numpy does not make a difference.

My questions are: are the yum versions of lapack/blas just as good as the ones built from Ashigabou source, and if not, why would numpy not be able to find the Ashigabou blas and lapack files (even though it's looking in the right directory)?

Thanks.

From robert.kern at gmail.com  Mon Jul 28 15:02:56 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 28 Jul 2008 14:02:56 -0500
Subject: [Numpy-discussion] strange seterr persistence between sessions
In-Reply-To: <88e473830807281156t54823a7fk4f7614435f46fc28@mail.gmail.com>
References: <88e473830807281156t54823a7fk4f7614435f46fc28@mail.gmail.com>
Message-ID: <3d375d730807281202g18c8fb24k41de353034d12ba5@mail.gmail.com>

On Mon, Jul 28, 2008 at 13:56, John Hunter wrote:
> In trying to track down a bug in matplotlib, I have come across some
> very strange numpy behavior. Basically, whether or not I call
> np.seterr('raise') in a matplotlib demo affects the behavior of
> seterr in another (pure numpy) script, run in a separate process.
> Something about the numpy state is persisting between python sessions.
> This appears to be platform specific, because I have only been able
> to verify it on one platform (a quad-core 64-bit Xeon running Fedora) but
> not on another (Solaris x86).

Can you make a new, smaller self-contained example? I suspect stale .pyc files.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From jdh2358 at gmail.com  Mon Jul 28 15:30:54 2008
From: jdh2358 at gmail.com (John Hunter)
Date: Mon, 28 Jul 2008 14:30:54 -0500
Subject: [Numpy-discussion] strange seterr persistence between sessions
In-Reply-To: <3d375d730807281202g18c8fb24k41de353034d12ba5@mail.gmail.com>
References: <88e473830807281156t54823a7fk4f7614435f46fc28@mail.gmail.com> <3d375d730807281202g18c8fb24k41de353034d12ba5@mail.gmail.com>
Message-ID: <88e473830807281230n75f74667nb3e8ec2e0cf71f45@mail.gmail.com>

On Mon, Jul 28, 2008 at 2:02 PM, Robert Kern wrote:
> On Mon, Jul 28, 2008 at 13:56, John Hunter wrote:
>> In trying to track down a bug in matplotlib, I have come across some
>> very strange numpy behavior. Basically, whether or not I call
>> np.seterr('raise') in a matplotlib demo affects the behavior of
>> seterr in another (pure numpy) script, run in a separate process.
>> Something about the numpy state is persisting between python sessions.
>> This appears to be platform specific, because I have only been able
>> to verify it on one platform (a quad-core 64-bit Xeon running Fedora) but
>> not on another (Solaris x86).
>
> Can you make a new, smaller self-contained example? I suspect stale .pyc files.

I'm not sure exactly what you mean by self-contained (since the behavior requires at least two files). Do you mean trying to come up with two numpy-only example files, or one that does away with the npy file? Or both.... As for the stale files, I'm not sure what you are thinking, but these are clean builds and installs of numpy and mpl.
So if you'll give me a little more guidance in terms of what you are looking for in a self-contained example, I'll be happy to try and put it together. But I am not sure what it is about the loading of the geo_demo that is triggering the behavior (numpy extension code, large memory footprint, ??). I tried running a python snippet that would fill a lot of memory to see if that would clear the persistence (I was wondering if there is some kernel memory caching and some empty numpy memory that is not getting initialized properly and is thus picking up some memory from a prior session), but it did not.

JDH

From robert.kern at gmail.com  Mon Jul 28 15:35:35 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 28 Jul 2008 14:35:35 -0500
Subject: [Numpy-discussion] strange seterr persistence between sessions
In-Reply-To: <88e473830807281230n75f74667nb3e8ec2e0cf71f45@mail.gmail.com>
References: <88e473830807281156t54823a7fk4f7614435f46fc28@mail.gmail.com> <3d375d730807281202g18c8fb24k41de353034d12ba5@mail.gmail.com> <88e473830807281230n75f74667nb3e8ec2e0cf71f45@mail.gmail.com>
Message-ID: <3d375d730807281235n6e7a3291v13488a406c81d1eb@mail.gmail.com>

On Mon, Jul 28, 2008 at 14:30, John Hunter wrote:
> On Mon, Jul 28, 2008 at 2:02 PM, Robert Kern wrote:
>> On Mon, Jul 28, 2008 at 13:56, John Hunter wrote:
>>> In trying to track down a bug in matplotlib, I have come across some
>>> very strange numpy behavior. Basically, whether or not I call
>>> np.seterr('raise') in a matplotlib demo affects the behavior of
>>> seterr in another (pure numpy) script, run in a separate process.
>>> Something about the numpy state is persisting between python sessions.
>>> This appears to be platform specific, because I have only been able
>>> to verify it on one platform (a quad-core 64-bit Xeon running Fedora) but
>>> not on another (Solaris x86).
>>
>> Can you make a new, smaller self-contained example? I suspect stale .pyc files.
>
> I'm not sure exactly what you mean by self-contained (since the
> behavior requires at least two files). Do you mean trying to come up
> with two numpy-only example files, or one that does away with the npy
> file? Or both....

Both, if the behavior exhibits itself without the npy file.
If it only exhibits itself with an npy involved, then we have some more information about where the problem might be.

> As for the stale files, I'm not sure what you are
> thinking, but these are clean builds and installs of numpy and mpl.

And of the files you are editing and anything they might import?

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From jdh2358 at gmail.com  Mon Jul 28 15:56:18 2008
From: jdh2358 at gmail.com (John Hunter)
Date: Mon, 28 Jul 2008 14:56:18 -0500
Subject: [Numpy-discussion] strange seterr persistence between sessions
In-Reply-To: <3d375d730807281235n6e7a3291v13488a406c81d1eb@mail.gmail.com>
References: <88e473830807281156t54823a7fk4f7614435f46fc28@mail.gmail.com> <3d375d730807281202g18c8fb24k41de353034d12ba5@mail.gmail.com> <88e473830807281230n75f74667nb3e8ec2e0cf71f45@mail.gmail.com> <3d375d730807281235n6e7a3291v13488a406c81d1eb@mail.gmail.com>
Message-ID: <88e473830807281256r3ddd943fqcb035a6f21b57e85@mail.gmail.com>

On Mon, Jul 28, 2008 at 2:35 PM, Robert Kern wrote:
> Both, if the behavior exhibits itself without the npy file. If it only
> exhibits itself with an npy involved, then we have some more
> information about where the problem might be.

OK, I'll see what I can come up with. In the meantime, as I was trying to strip out the npy component and put the data directly into the file, I find it strange that I am getting a floating point error on this operation

import numpy as np
np.seterr("raise")
import numpy.ma as ma

x = 1.50375883
m = ma.MaskedArray([x])
sinc_alpha_ma = ma.sin(m) / m

---------------------------------------------------------------------------
FloatingPointError                        Traceback (most recent call last)

/home/jdhunter/ in ()

/home/jdhunter/dev/lib64/python2.5/site-packages/numpy/ma/core.pyc in __div__(self, other)
   1885     def __div__(self, other):
   1886         "Divide other into self, and return a new masked array."
-> 1887         return divide(self, other)
   1888     #
   1889     def __truediv__(self, other):

/home/jdhunter/dev/lib64/python2.5/site-packages/numpy/ma/core.pyc in __call__(self, a, b)
    636         d1 = getdata(a)
    637         d2 = get_data(b)
--> 638         t = narray(self.domain(d1, d2), copy=False)
    639         if t.any(None):
    640             mb = mask_or(mb, t)

/home/jdhunter/dev/lib64/python2.5/site-packages/numpy/ma/core.pyc in __call__(self, a, b)
    411         if self.tolerance is None:
    412             self.tolerance = np.finfo(float).tiny
--> 413         return umath.absolute(a) * self.tolerance >= umath.absolute(b)
    414 #............................
    415 class _DomainGreater:

FloatingPointError: underflow encountered in multiply

I am no floating point expert, but I don't see why a numerator of 0.99775383 and a denominator of 1.50375883 should be triggering an underflow error. It looks more like a bug in the ma core logic, since umath.absolute(a) * self.tolerance is more or less guaranteed to fail if np.seterr("raise") is set

JDH

From pav at iki.fi  Mon Jul 28 16:12:43 2008
From: pav at iki.fi (Pauli Virtanen)
Date: Mon, 28 Jul 2008 20:12:43 +0000 (UTC)
Subject: [Numpy-discussion] strange seterr persistence between sessions
References: <88e473830807281156t54823a7fk4f7614435f46fc28@mail.gmail.com>
Message-ID: 

Mon, 28 Jul 2008 13:56:52 -0500, John Hunter wrote:
> In trying to track down a bug in matplotlib, I have come across some
> very strange numpy behavior. Basically, whether or not I call
> np.seterr('raise') in a matplotlib demo affects the behavior of
> seterr in another (pure numpy) script, run in a separate process.
> Something about the numpy state is persisting between python sessions.
> This appears to be platform specific, because I have only been able
> to verify it on one platform (a quad-core 64-bit Xeon running Fedora) but
> not on another (Solaris x86).
[clip]

I don't see this on Python 2.5.2, Linux, AMD Athlon x86, numpy SVN r5542: test.py always raises the FloatingPointError. But on a related note:

On 1.2.0.dev5542, I always see the FloatingPointError in test.py. On 1.1.0, I don't see any FloatingPointErrors. Still, nothing depends on running geo_data.py.

Are your numpy versions the same on these platforms? Do you have two numpy versions installed? Could it be possible that somehow running the scripts switches between numpy versions? (Sounds very strange, and of course, this is easy to check for in test.py...)
--
Pauli Virtanen

From pgmdevlist at gmail.com  Mon Jul 28 16:10:43 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Mon, 28 Jul 2008 16:10:43 -0400
Subject: [Numpy-discussion] RFC: A (second) proposal for implementing some date/time types in NumPy
In-Reply-To: <200807281817.41987.faltet@pytables.org>
References: <200807161844.36953.faltet@pytables.org> <200807251647.03102.pgmdevlist@gmail.com> <200807281817.41987.faltet@pytables.org>
Message-ID: <200807281610.44286.pgmdevlist@gmail.com>

On Monday 28 July 2008 12:17:41 Francesc Alted wrote:
> So, for allowing this to happen, we have concluded that a
> conceptual change in our second proposal is needed: instead of
> a 'resolution', we can introduce the 'time unit' concept.

I'm all for that, thanks !

> One thing that will not be possible though, is
> to change the time unit of a relative time expressed in say, years,
> to another time unit expressed in say, days. This is because of the
> impossibility of knowing how many days a relative year has (i.e.
> one not bound to a given year).

OK, that makes sense for timedeltas. But would I still be able to add a timedelta['Y'] (in years) to a datetime['D'] (in days) and get the proper result ?

> More generally, it will not be possible
> to perform 'time unit' conversions between units above and below a
> relative week (because it is the largest time unit that has a definite
> number of seconds).

Could you rephrase that ? You're still talking about conversion for timedelta, not datetime, right ?

> >>>series.asfreq('A-MAR')
> Well, as we don't like an 'origin' to be part of our proposal, you
> won't be able to do exactly that with the proposed plain dtype.

That's what I was afraid of. Oh well, I'm sure we'll come up with a way...

Looking forward to reading the third version !

From fperez.net at gmail.com  Mon Jul 28 18:33:13 2008
From: fperez.net at gmail.com (Fernando Perez)
Date: Mon, 28 Jul 2008 15:33:13 -0700
Subject: [Numpy-discussion] Python tools at the annual SIAM meeting
Message-ID: 

Hi all,

for those interested, here's a brief report on the recent SIAM meeting where a number of Python-based tools for scientific computing (including ipython, numpy, scipy, sage, and more) were discussed:

http://fdoperez.blogspot.com/2008/07/python-tools-for-science-go-to-siam.html

The punch line is that we got selected for the annual highlights of the conference:

http://www.ams.org/ams/siam-2008.html#python

Thanks again to all who contributed talks and attended!

Cheers,

f

From eads at soe.ucsc.edu  Mon Jul 28 19:50:52 2008
From: eads at soe.ucsc.edu (Damian Eads)
Date: Mon, 28 Jul 2008 16:50:52 -0700 (PDT)
Subject: [Numpy-discussion] asarray issue with type codes
Message-ID: <49374.128.165.202.224.1217289052.squirrel@squirrelmail.soe.ucsc.edu>

Hi there,

I ran into a little problem in some type checking code for a C extension I'm writing. I construct X as a C-long array and then I cast it to a C-int array Y; however, the type code does not change. Yet when I construct the array from scratch as a C-int, I get the right type code (i.e. 5).

I assumed that when X gets cast to a C-int, no copying should occur but a new array view should be constructed with the C-int type code. What's wrong with this logic? Also note that casting from a C-long (type code 7) to a double to a C-int returns an array with the right type code, although a double copy occurs.

Damian

# Construct X as a C-long.
In [16]: X=numpy.zeros((10,10),dtype='l')

# Now cast X to a C-int.
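# (Aside, for anyone reproducing this: both dtypes print as int32 in the
# output below, so this is a box where C long and C int have the same
# width -- which, as it turns out, is exactly why the type code sticks.)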
In [17]: Y=numpy.asarray(X, dtype='i')

# Check X and Y's data type; they are the same.
In [18]: X.dtype
Out[18]: dtype('int32')

In [19]: Y.dtype
Out[19]: dtype('int32')

# Their type codes are the same.
In [20]: X.dtype.num
Out[20]: 7

In [21]: Y.dtype.num
Out[21]: 7

# Constructing with dtype='i' gives the right type code.
In [22]: Z=numpy.zeros((10,10),dtype='i')

In [23]: Z.dtype
Out[23]: dtype('int32')

In [24]: Z.dtype.num
Out[24]: 5

From Chris.Barker at noaa.gov  Mon Jul 28 20:17:44 2008
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Mon, 28 Jul 2008 17:17:44 -0700
Subject: [Numpy-discussion] fromfile and text files...
Message-ID: <488E61A8.2080408@noaa.gov>

Hi all,

I'd like to use fromfile() to read text files that look like:

23.4, 123.43
456.321, 9568.00
32.0, 134.4

so they are comma-separated values, but separated by newlines. I tried this code:

import numpy as np
file = "fromfiletest.txt"
a = np.fromfile(file, dtype=np.float, sep=",", count=6)
print a

and got:

6 items requested but only 2 read
[  23.4   123.43]

So, fromfile is looking for commas, and when there is a newline instead of a comma, it stops looking. This kind of kills the point of fromfile for any text files other than whitespace-delimited ones (I assume passing in " " for sep will get you any whitespace -- it does seem to work for spaces and newlines, anyway).
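A possible workaround for now is to slurp the whole file, normalize the separators, and parse in one shot with fromstring() -- just a sketch, using my test file from above (and if fromstring's text-mode parser chokes on the stray spaces, those would need stripping too):

import numpy as np

# read everything, turn newlines into commas, then parse in one go;
# strip() drops the trailing newline so we don't leave a dangling comma
s = open("fromfiletest.txt").read().strip().replace("\n", ",")
a = np.fromstring(s, dtype=np.float, sep=",")
print a

It costs an extra in-memory copy of the text, but there is no per-line Python loop.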
Yes, I know there is loadtxt() and any number of other solutions, but they are all processing lines in Python, and that is a lot slower when reading big files. I thought fromfile() did this, and in any case, it could -- it would be a simple and blazingly fast way to read many common text formats. I think it was inspired by some code I posted a couple years back that would work with virtually any separator. That code only did double arrays, but it was fast and easy to use for the easy cases. It used C fscanf to simply get "the next number", then it would skip along till it found another thing it could parse as a number.

Since C's fscanf already skips whitespace, couldn't sep be interpreted as "this character and whitespace"? Or maybe we could be able to pass in a list of separators?

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From david at ar.media.kyoto-u.ac.jp  Mon Jul 28 22:31:55 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 29 Jul 2008 11:31:55 +0900
Subject: [Numpy-discussion] numpy 1.1.rc2: win32 binaries
In-Reply-To: References: <488D58A3.6070800@ar.media.kyoto-u.ac.jp>
Message-ID: <488E811B.2000909@ar.media.kyoto-u.ac.jp>

Nils Wagner wrote:
>
> David,
>
> Did you also try ATLAS3.9.1 ?
> Is it recommended to use the stable version (3.8.2)?
>

I did not try atlas 3.9.1, but anyone can try. I personally do not want to package unstable versions, but the vendor directory in the numpy repository should make it relatively easy to try for yourself (once you add the atlas 3.9.1 sources).

cheers,

David

From david at ar.media.kyoto-u.ac.jp  Mon Jul 28 22:39:35 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 29 Jul 2008 11:39:35 +0900
Subject: [Numpy-discussion] Ashigabou Repository atlas vs yum blas/lapack
In-Reply-To: References: 
Message-ID: <488E82E7.6060408@ar.media.kyoto-u.ac.jp>

jb wrote:
> I also tried removing the yum blas/lapack libs and installing atlas
> via the instructions given on the scipy site for the Ashigabou Repository.
>

Did you install blas/lapack from the ashigabou repository as well ? When I developed those packages, FC packages for blas/lapack were unusable. But maybe with recent versions it is ok. I don't have very much time to spend on this, unfortunately (I called for people using Fedora to take care of this and push it to the official FC repositories, as I am not using FC myself, but nothing happened).

> Atlas built fine from source and after installing the new rpm there
> were two files in the lib64/atlas/sse2 folder: libblas.so.3.0 and
> liblapack.so.3.0. However, when I try to install numpy, it cannot
> find any blas, lapack, or atlas, even though my site.cfg file has:
>
> [DEFAULT]
> library_dirs = /usr/lib64:/usr/lib64/atlas/sse2
>
> [blas_opt]
> libraries = f77blas, cblas, atlas
>
> [lapack_opt]
> libraries = lapack, f77blas, cblas, atlas
>
> [atlas]
> library_dirs = /usr/lib64/atlas/sse2
> atlas_libs = lapack, blas
>

Could you paste the configuration log (where it says whether it finds the packages or not)? I believe that you put too much information; the following site.cfg should be enough:

[DEFAULT]
library_dirs = /usr/lib64/atlas/sse2

That should work, but I can't be sure without seeing the log. (Installing in sse2 is strange BTW; a quad-core certainly means you have more than SSE2. That's something else to fix.)

> Using LD_LIBRARY_PATH=/usr/lib64/atlas/sse2 before installing numpy
> does not make a difference.
>

LD_LIBRARY_PATH does not change how numpy looks for libraries. It only changes how the OS looks for libraries when you launch programs, so this is expected.

> My questions are: are the yum versions of lapack/blas just as good as
> the ones built from Ashigabou source, and if not, why would numpy not
> be able to find the Ashigabou blas and lapack files (even though it's
> looking in the right directory)?
>

In the old times (FC 5), the yum ones did not work. If they do now, I would say just use them. ATLAS is faster than blas/lapack, but it is more work, and it is useful mainly for large problems anyway.

cheers,

David

From david at ar.media.kyoto-u.ac.jp  Tue Jul 29 00:29:34 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 29 Jul 2008 13:29:34 +0900
Subject: [Numpy-discussion] Recent work for branch cuts / C99 complex maths: problems on mingw
Message-ID: <488E9CAE.9090602@ar.media.kyoto-u.ac.jp>

Hi,

I was away during the discussion on the updated complex functions using C99, and I've noticed it breaks some tests on windows (with mingw; I have not tried with Visual Studio, but it is likely to make things even worse given C support from MS compilers):

http://scipy.org/scipy/numpy/ticket/865

Were those changes backported to 1.1.x ?
If so, I would consider this as a release blocker,

cheers,

David

From charlesr.harris at gmail.com  Tue Jul 29 02:27:04 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 29 Jul 2008 00:27:04 -0600
Subject: [Numpy-discussion] Recent work for branch cuts / C99 complex maths: problems on mingw
In-Reply-To: <488E9CAE.9090602@ar.media.kyoto-u.ac.jp>
References: <488E9CAE.9090602@ar.media.kyoto-u.ac.jp>
Message-ID: 

On Mon, Jul 28, 2008 at 10:29 PM, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:

> Hi,
>
>     I was away during the discussion on the updated complex functions
> using C99, and I've noticed it breaks some tests on windows (with mingw;
> I have not tried with Visual Studio, but it is likely to make things
> even worse given C support from MS compilers):
>
> http://scipy.org/scipy/numpy/ticket/865
>
> Were those changes backported to 1.1.x ? If so, I would consider this as
> a release blocker,
>

The only changes to the computations were in acosh and asinh, which I think should work fine. The tests check branch cuts and corner cases among other things and are only in the trunk, so we aren't any worse off than we were; we just have more failing tests to track down. At least one previously failing test looked to be a Python bug, so finding the root causes here and putting together tests that work everywhere is going to be a project. Some of the problems could be in the windows library, probably sqrt and log. Other problems might be in notations for nans and infs. Anyway, it would be nice to make all these things platform-independent, but I don't think we should worry about that for 1.1.1.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From david at ar.media.kyoto-u.ac.jp  Tue Jul 29 02:17:01 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 29 Jul 2008 15:17:01 +0900
Subject: [Numpy-discussion] Recent work for branch cuts / C99 complex maths: problems on mingw
In-Reply-To: References: <488E9CAE.9090602@ar.media.kyoto-u.ac.jp>
Message-ID: <488EB5DD.3040803@ar.media.kyoto-u.ac.jp>

Charles R Harris wrote:
>
> On Mon, Jul 28, 2008 at 10:29 PM, David Cournapeau
> <david at ar.media.kyoto-u.ac.jp <mailto:david at ar.media.kyoto-u.ac.jp>>
> wrote:
>
>     Hi,
>
>     I was away during the discussion on the updated complex functions
>     using C99, and I've noticed it breaks some tests on windows (with
>     mingw; I have not tried with Visual Studio, but it is likely to make
>     things even worse given C support from MS compilers):
>
>     http://scipy.org/scipy/numpy/ticket/865
>
>     Were those changes backported to 1.1.x ? If so, I would consider
>     this as a release blocker,
>
>
> The only changes to the computations were in acosh and asinh, which I
> think should work fine. The tests check branch cuts and corner cases
> among other things and are only in the trunk, so we aren't any worse
> off than we were; we just have more failing tests to track down.

Ok. I thought there was more than just tests, but also C code modification. If not, it is certainly much less of a problem.

> At least one previously failing test looked to be a Python bug, so
> finding the root causes here and putting together tests that work
> everywhere is going to be a project. Some of the problems could be in
> the windows library, probably sqrt and log.

One thing is that mingw now uses an ancient gcc (3.4.5), and there were a lot of changes since then for C99 conformance.
We can just pretend the problem is not there for now, but for 2.6, we will have to handle this, because mingw with gcc 3.4 does not have a runtime compatible with MSVC 9, which is the one used by the python2.6 binary release. That will be a lot of fun :)

cheers,

David

From stefan at sun.ac.za  Tue Jul 29 02:54:52 2008
From: stefan at sun.ac.za (Stéfan van der Walt)
Date: Tue, 29 Jul 2008 08:54:52 +0200
Subject: [Numpy-discussion] asarray issue with type codes
In-Reply-To: <49374.128.165.202.224.1217289052.squirrel@squirrelmail.soe.ucsc.edu>
References: <49374.128.165.202.224.1217289052.squirrel@squirrelmail.soe.ucsc.edu>
Message-ID: <9457e7c80807282354j7569ba75h92f8e36f502fa1b1@mail.gmail.com>

Hi Damian

2008/7/29 Damian Eads:
> I ran into a little problem in some type checking code for a C extension
> I'm writing. I construct X as a C-long array and then I cast it to a C-int
> array Y; however, the type code does not change. Yet when I construct
> the array from scratch as a C-int, I get the right type code (i.e. 5).
>
> I assumed that when X gets cast to a C-int, no copying should occur but
> a new array view should be constructed with the C-int type code. What's
> wrong with this logic?

I would guess that, somewhere along the line, the two dtypes are compared to see whether anything needs to be done:

In [18]: np.dtype('i') == np.dtype('l')
Out[18]: True

Since int and c_long are the same on 32-bit platforms, it doesn't do anything. I agree; that looks like a bug. Unless someone else justifies this behaviour, please file a ticket so that we can fix it in time for 1.2.

For now, you could achieve what you want by doing

X.view(np.dtype('i'))

But, of course, that would break on a platform where the widths of int and c_long differ.

Cheers
Stéfan

From faltet at pytables.org  Tue Jul 29 04:48:35 2008
From: faltet at pytables.org (Francesc Alted)
Date: Tue, 29 Jul 2008 10:48:35 +0200
Subject: [Numpy-discussion] RFC: A (second) proposal for implementing some date/time types in NumPy
In-Reply-To: <488E106B.5030404@noaa.gov>
References: <200807161844.36953.faltet@pytables.org> <488E106B.5030404@noaa.gov>
Message-ID: <200807291048.35792.faltet@pytables.org>

A Monday 28 July 2008, Christopher Barker escrigué:
> Hi,
>
> Sorry for the very long delay in commenting on this.

Don't worry, we are still in time to receive more comments (but if there are people willing to contribute more comments, hurry up, please!).

> In short, it
> looks great, and thanks for your efforts.
>
> A couple of small comments:
>
> > In [11]: t[0] = datetime.datetime.now()  # setter in action
> >
> > In [12]: t[0]
> > Out[12]: '2008-07-16T13:39:25.315'  # representation in ISO 8601
> > format
>
> I like that, but what about:
>
> > In [8]: t1 = numpy.zeros(5, dtype="datetime64[s]")
> > In [9]: t2 = numpy.ones(5, dtype="datetime64[s]")
> >
> > In [10]: t = t2 - t1
> >
> > In [11]: t[0] = 24  # setter in action (setting to 24 seconds)
>
> Is there a way to set in any other units? (hours, days, etc.)

Yes. You will be able to use a scalar ``timedelta64``. For example, if t is an array with dtype = 'timedelta64[s]' (i.e. with a time unit of seconds), you will be able to do the following:

>>> t[0] = numpy.timedelta64(2, unit="[D]")

where you are adding 2 days to the 0-element of t. However, you won't be able to do the following:

>>> t[0] = numpy.timedelta64(2, unit="[M]")

because a month does not have a definite number of seconds.
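To make the intended rule concrete, here is a toy sketch in pure Python of the compatibility check we have in mind (the unit table and the function name are only illustrative; nothing like this is implemented yet):

# Seconds per time unit, where that is well defined.  'M' and 'Y' get
# None because their length in seconds depends on the calendar dates
# involved.
SECONDS_PER_UNIT = {
    'us': 1e-6, 'ms': 1e-3, 's': 1.0, 'm': 60.0,
    'h': 3600.0, 'D': 86400.0, 'W': 7 * 86400.0,
    'M': None, 'Y': None,
}

def units_compatible(unit1, unit2):
    """True if a relative time in unit1 can be recast in unit2."""
    s1 = SECONDS_PER_UNIT[unit1]
    s2 = SECONDS_PER_UNIT[unit2]
    return s1 is not None and s2 is not None

print units_compatible('D', 's')   # True: both have fixed lengths
print units_compatible('M', 's')   # False: a month is calendar-dependent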
Such an incompatible assignment will typically raise a ``TypeError`` exception, or perhaps a ``numpy.IncompatibleUnitError``, which would be more self-explanatory.

> > In [12]: t[0]
> > Out[12]: 24  # representation as an int64
>
> why not a "pretty" representation of timedelta64 too? I'd like that
> better (at least for __str__; perhaps __repr__ should be the raw
> numbers).

That could be an interesting feature. Here is what the ``datetime`` module does:

>>> delta = datetime.datetime(1980,2,1)-datetime.datetime(1970,1,1)
>>> delta.__str__()
'3683 days, 0:00:00'
>>> delta.__repr__()
'datetime.timedelta(3683)'

For the NumPy ``timedelta64`` with a time unit of days, it could be something like:

>>> delta_days.__str__()
'3683 days'
>>> delta_days.__repr__()
3683

while for a ``timedelta64`` with a time unit of microseconds it could be:

>>> delta_us.__str__()
'3683 days, 3:04:05.000064'
>>> delta_us.__repr__()
318222245000064

But I'm open to other suggestions, of course.

> how will operations between different types work?
>
> > t1 = numpy.ones(5, dtype="timedelta64[s]")
> > t2 = numpy.ones(5, dtype="timedelta64[ms]")
>
> t1 + t2
> >> ??????

Yeah. While the proposal stated that these operations should be possible, it is true that the casting rules were not established yet. After thinking a bit about this, we find that we should prioritize avoiding overflows rather than trying to keep the maximum precision. With this rule in mind, the outcome will always have the larger of the units in the operands. In your example, t1 + t2 will have '[s]' units. Would that make sense for most people?

Cheers,

--
Francesc Alted

From faltet at pytables.org  Tue Jul 29 05:37:38 2008
From: faltet at pytables.org (Francesc Alted)
Date: Tue, 29 Jul 2008 11:37:38 +0200
Subject: [Numpy-discussion] RFC: A (second) proposal for implementing some date/time types in NumPy
In-Reply-To: <200807281610.44286.pgmdevlist@gmail.com>
References: <200807161844.36953.faltet@pytables.org> <200807281817.41987.faltet@pytables.org> <200807281610.44286.pgmdevlist@gmail.com>
Message-ID: <200807291137.38513.faltet@pytables.org>

A Monday 28 July 2008, Pierre GM escrigué:
> On Monday 28 July 2008 12:17:41 Francesc Alted wrote:
> > So, for allowing this to happen, we have concluded that a
> > conceptual change in our second proposal is needed: instead of
> > a 'resolution', we can introduce the 'time unit' concept.
>
> I'm all for that, thanks !
>
> > One thing that will not be possible though, is
> > to change the time unit of a relative time expressed in say, years,
> > to another time unit expressed in say, days. This is because of the
> > impossibility of knowing how many days a relative year has (i.e.
> > one not bound to a given year).
>
> OK, that makes sense for timedeltas. But would I still be able to add
> a timedelta['Y'] (in years) to a datetime['D'] (in days) and get the
> proper result ?

Hmmm, good point. Well, provided that we plan to set the casting rules so that the time unit of the outcome will be the largest of the time units of the operands, and assuming approximate values for the number of days in a year (365.2425, i.e.
the average year length of the Gregorian calendar) and in a month (30.436875 = 365.2425/12), I think the following operations would be feasible:

>>> numpy.timedelta(20, unit='Y') + numpy.timedelta(365, unit='D')
20  # unit is Year
>>> numpy.timedelta(20, unit='Y') + numpy.timedelta(366, unit='D')
21  # unit is Year
>>> numpy.timedelta(43, unit='M') + numpy.timedelta(30, unit='D')
43  # unit is Month
>>> numpy.timedelta(43, unit='M') + numpy.timedelta(31, unit='D')
44  # unit is Month

Would that be ok for you?

> > More generally, it will not be possible
> > to perform 'time unit' conversions between units above and below a
> > relative week (because it is the largest time unit that has a
> > definite number of seconds).
>
> Could you rephrase that ? You're still talking about conversion for
> timedelta, not datetime, right ?

Yes. I was talking about the relative timedelta in that case. The initial idea was to forbid conversions among relative timedeltas with different units that imply assumptions about the number of days. But after pondering the example above at length, I now think it would be sensible to allow conversions from time units shorter than a week to those larger than a week (but not the inverse), truncating the outcome. For example, the following would be allowed:

>>> numpy.timedelta(43, unit='D').astype("t8[M]")
1  # One complete month
>>> numpy.timedelta(365, unit='D').astype("t8[Y]")
0  # Not a complete year

But this would not:

>>> numpy.timedelta(2, unit='M').astype("t8[d]")
raise ``IncompatibleUnitError``  # How many days do 2 months have?
>>> numpy.timedelta(1, unit='Y').astype("t8[d]")
raise ``IncompatibleUnitError``  # How many days does 1 year have?

This will add more complexity to the code, but the functionality looks sensible to my eyes. What do you think?
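In code, the allowed direction would behave roughly like the following pure Python sketch (the averages are the ones given above; the function name is only illustrative):

# Average days per coarse unit (Gregorian calendar averages).
AVG_DAYS = {'W': 7.0, 'M': 30.436875, 'Y': 365.2425}

def days_to_unit(ndays, unit):
    """Recast a relative number of days in a coarser unit, truncating."""
    return int(ndays / AVG_DAYS[unit])

print days_to_unit(43, 'M')    # 1  (one complete month)
print days_to_unit(365, 'Y')   # 0  (not a complete year)

The forbidden direction would simply raise, as in the examples above.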
> > >>>series.asfreq('A-MAR')
> > Well, as we don't like an 'origin' to be part of our proposal,
> > you won't be able to do exactly that with the proposed plain dtype.
>
> That's what I was afraid of. Oh well, I'm sure we'll come up with a
> way...
>
> Looking forward to reading the third version !

Well, as we are still discussing and changing things, we would like to wait a bit more until all the dust has settled. But we are looking forward to producing the third version of the proposal before the end of this week.

Cheers,

--
Francesc Alted

From pav at iki.fi  Tue Jul 29 07:21:08 2008
From: pav at iki.fi (Pauli Virtanen)
Date: Tue, 29 Jul 2008 11:21:08 +0000 (UTC)
Subject: [Numpy-discussion] Recent work for branch cuts / C99 complex maths: problems on mingw
References: <488E9CAE.9090602@ar.media.kyoto-u.ac.jp> <488EB5DD.3040803@ar.media.kyoto-u.ac.jp>
Message-ID: 

Tue, 29 Jul 2008 15:17:01 +0900, David Cournapeau wrote:
> Charles R Harris wrote:
>> On Mon, Jul 28, 2008 at 10:29 PM, David Cournapeau wrote:
>>
>>     Hi,
>>
>>     I was away during the discussion on the updated complex functions
>>     using C99, and I've noticed it breaks some tests on windows (with
>>     mingw; I have not tried with Visual Studio, but it is likely to make
>>     things even worse given C support from MS compilers):
>>
>>     http://scipy.org/scipy/numpy/ticket/865
>>
>>     Were those changes backported to 1.1.x ? If so, I would consider
>>     this as a release blocker,
>>
>> The only changes to the computations were in acosh and asinh, which I
>> think should work fine. The tests check branch cuts and corner cases
>> among other things and are only in the trunk, so we aren't any worse
>> off than we were; we just have more failing tests to track down.
>
> Ok. I thought there was more than just tests, but also C code
> modification. If not, it is certainly much less of a problem.

I'm not sure whether it makes sense to keep the C99 tests in SVN, even if marked as skipped, before the C code is fixed. Right now, it seems that we are far from C99 compliance with regard to corner-case inf-nan behavior. (The branch cuts are mostly OK, though, and I suspect that what is currently non-C99 could be fixed by making nc_sqrt handle negative zeros properly.)

Also, it appears that signaling and quiet NaNs (#IND, #QNAN) are printed differently on mingw32, so the comparisons should be reworked to treat all nans the same, or the functions should be consistent in which flavor they return. I'm not sure whether IEEE 754 or C99 says something about what kind of NaNs functions should return. But I guess in practice this is not so important; I doubt anyone uses these for anything.

--
Pauli Virtanen

From david at ar.media.kyoto-u.ac.jp  Tue Jul 29 07:16:36 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 29 Jul 2008 20:16:36 +0900
Subject: [Numpy-discussion] Recent work for branch cuts / C99 complex maths: problems on mingw
In-Reply-To: References: <488E9CAE.9090602@ar.media.kyoto-u.ac.jp> <488EB5DD.3040803@ar.media.kyoto-u.ac.jp>
Message-ID: <488EFC14.1090200@ar.media.kyoto-u.ac.jp>

Pauli Virtanen wrote:
>
> I'm not sure whether it makes sense to keep the C99 tests in SVN, even if
> marked as skipped, before the C code is fixed. Right now, it seems that
> we are far from C99 compliance with regard to corner-case inf-nan
> behavior. (The branch cuts are mostly OK, though, and I suspect that what
> is currently non-C99 could be fixed by making nc_sqrt handle negative
> zeros properly.)
>

Is there a clear explanation about C99 features related to complex math somewhere ? The problem with C99 is that few compilers implement it properly. None of the most used compilers implement it entirely, and some of them don't even try, like MS compilers; the Windows situation is the most problematic because the mingw32 compilers are old, and thus may not handle that many C99 features. There are also some shortcuts in the way we detect the math functions, which are not 100% reliable (because of some mingw problems, in particular: I have already mentioned this problem several times, I really ought to solve it at some point instead of speaking about it).

So what matters IMHO is the practical implications with the compilers/C runtime we use for numpy/scipy (gcc, visual studio and intel compilers should cover most of developers/users).

cheers,

David

From faltet at pytables.org  Tue Jul 29 07:57:09 2008
From: faltet at pytables.org (Francesc Alted)
Date: Tue, 29 Jul 2008 13:57:09 +0200
Subject: [Numpy-discussion] RFC: A (second) proposal for implementing some date/time types in NumPy
In-Reply-To: <200807291048.35792.faltet@pytables.org>
References: <200807161844.36953.faltet@pytables.org> <488E106B.5030404@noaa.gov> <200807291048.35792.faltet@pytables.org>
Message-ID: <200807291357.10156.faltet@pytables.org>

A Tuesday 29 July 2008, Francesc Alted escrigué:
[snip]
> > > In [12]: t[0]
> > > Out[12]: 24  # representation as an int64
> >
> > why not a "pretty" representation of timedelta64 too?
I'd like that
> > better (at least for __str__; perhaps __repr__ should be the raw
> > numbers).
>
> That could be an interesting feature. Here is what the ``datetime``
> module does:
>
> >>> delta = datetime.datetime(1980,2,1)-datetime.datetime(1970,1,1)
> >>> delta.__str__()
> '3683 days, 0:00:00'
> >>> delta.__repr__()
> 'datetime.timedelta(3683)'
>
> For the NumPy ``timedelta64`` with a time unit of days, it could be
> something like:
>
> >>> delta_days.__str__()
> '3683 days'
> >>> delta_days.__repr__()
> 3683
>
> while for a ``timedelta64`` with a time unit of microseconds it could
> be:
>
> >>> delta_us.__str__()
> '3683 days, 3:04:05.000064'
> >>> delta_us.__repr__()
> 318222245000064
>
> But I'm open to other suggestions, of course.

Sorry, but I've been a bit inconsistent here, as this is documented in the proposal already. Just to clarify things, here are the str/repr suggestions (just a bit more populated with examples) from the second version of the second proposal.

For absolute times:

In [5]: numpy.datetime64(42, 'us')
Out[5]: datetime64(42, 'us')

In [6]: print numpy.datetime64(42, 'us')
1970-01-01T00:00:00.000042  # representation in ISO 8601 format

In [7]: print numpy.datetime64(367.7, 'D')  # decimal part is lost
1971-01-02  # still ISO 8601 format

In [8]: numpy.datetime64('2008-07-18T12:23:18', 'm')  # from ISO 8601
Out[8]: datetime64(20273063, 'm')

In [9]: print numpy.datetime64('2008-07-18T12:23:18', 'm')
2008-07-18T12:23

In [10]: t = numpy.zeros(5, dtype="datetime64[D]")

In [11]: print t
[1970-01-01 1970-01-01 1970-01-01 1970-01-01 1970-01-01]

In [12]: repr(t)
Out[12]: array([0, 0, 0, 0, 0], dtype="datetime64[D]")

In [13]: print t[0]
1970-01-01

In [14]: t[0]
Out[14]: datetime64(0, unit='D')

In [15]: t[0].item()  # getter in action
Out[15]: datetime.datetime(1970, 1, 1, 0, 0)

For relative times:

In [5]: numpy.timedelta64(10, 'us')
Out[5]: timedelta64(10, 'us')

In [6]: print numpy.timedelta64(10, 'ms')
0:00:00.010

In [7]: print numpy.timedelta64(3600.2, 'm')  # decimal part is lost
2 days, 12:00

In [8]: t0 = numpy.zeros(5, dtype="datetime64[ms]")

In [9]: t1 = numpy.ones(5, dtype="datetime64[ms]")

In [10]: t = t1 - t0

In [11]: t[0] = datetime.timedelta(0, 24)  # setter in action

In [12]: print t
[0:00:24.000 0:00:00.001 0:00:00.001 0:00:00.001 0:00:00.001]

In [13]: repr(t)
Out[13]: array([24000, 1, 1, 1, 1], dtype="timedelta64[ms]")

In [14]: print t[0]
0:00:24.000

In [15]: t[0]
Out[15]: timedelta64(24000, unit='ms')

In [16]: t[0].item()  # getter in action
Out[16]: datetime.timedelta(0, 24)

Cheers,

--
Francesc Alted

From pav at iki.fi  Tue Jul 29 07:59:52 2008
From: pav at iki.fi (Pauli Virtanen)
Date: Tue, 29 Jul 2008 11:59:52 +0000 (UTC)
Subject: [Numpy-discussion] Recent work for branch cuts / C99 complex maths: problems on mingw
References: <488E9CAE.9090602@ar.media.kyoto-u.ac.jp> <488EB5DD.3040803@ar.media.kyoto-u.ac.jp> <488EFC14.1090200@ar.media.kyoto-u.ac.jp>
Message-ID: 

Tue, 29 Jul 2008 20:16:36 +0900, David Cournapeau wrote:
[clip]
> Is there a clear explanation about C99 features related to complex math
> somewhere ?

The C99 standard (or more precisely its draft, but this is probably mostly the same thing) can be found here:

    http://www.open-std.org/jtc1/sc22/wg14/www/standards

> The problem with C99 is that few compilers implement it
None of the most used compilers implement it entirely, and > some of them don't even try, like MS compilers; the windows situations > is the most problematic because the mingw32 compilers are old, and thus > may not handle than many C99 features. There are also some shortcuts in > the way we detect the math functions, which is not 100 % reliable > (because of some mingw problems, in particular: I have already mentioned > this problem several times, I really ought to solve it at some points > instead of speaking about it). > > So what matters IMHO is the practical implications with the compilers/C > runtime we use for numpy/scipy (gcc, visual studio and intel compilers > should cover most of developers/users). We implement the complex functions completely ourselves and use only the real C math functions. As we compose the complex operations from real ones, corner cases can and apparently go wrong even if the underlying compiler is C99 compliant. The new Python cmath module actually implements the corner cases using a lookup table. I wonder if we should follow... -- Pauli Virtanen From felix at physik3.uni-rostock.de Tue Jul 29 08:36:22 2008 From: felix at physik3.uni-rostock.de (Felix Richter) Date: Tue, 29 Jul 2008 14:36:22 +0200 Subject: [Numpy-discussion] FFT usage / consistency In-Reply-To: <200807281835.50556.felix@physik3.uni-rostock.de> References: <200807281025.36338.felix@physik3.uni-rostock.de> <200807281835.50556.felix@physik3.uni-rostock.de> Message-ID: <200807291436.23054.felix@physik3.uni-rostock.de> I learned a few things in the meantime: In my installation, NumPy uses fftpack_lite while SciPy uses FFTW3. There are more test cases in SciPy which all pass. So I am confirmed my problem is a pure usage problem. One thing I was confused about is the fact that even if I calculate the function over a certain interval, I cannot tell FFT which interval this is, it will instead assume [0...n]. So actually I did not transform a Lorentz function centered at zero but rather centered at 500. Unfortunately, this solves only half of my problem, because I still cannot reproduce the exact FT. I'll ask for that on the SciPy list, this now seems more appropriate. From stefan at sun.ac.za Tue Jul 29 08:45:31 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 29 Jul 2008 14:45:31 +0200 Subject: [Numpy-discussion] FFT usage / consistency In-Reply-To: <200807291436.23054.felix@physik3.uni-rostock.de> References: <200807281025.36338.felix@physik3.uni-rostock.de> <200807281835.50556.felix@physik3.uni-rostock.de> <200807291436.23054.felix@physik3.uni-rostock.de> Message-ID: <9457e7c80807290545h393ba29cka4d274c59e5ef309@mail.gmail.com> 2008/7/29 Felix Richter : > I learned a few things in the meantime: > > In my installation, NumPy uses fftpack_lite while SciPy uses FFTW3. There are > more test cases in SciPy which all pass. So I am confirmed my problem is a > pure usage problem. > One thing I was confused about is the fact that even if I calculate the > function over a certain interval, I cannot tell FFT which interval this is, > it will instead assume [0...n]. So actually I did not transform a Lorentz > function centered at zero but rather centered at 500. > Unfortunately, this solves only half of my problem, because I still cannot > reproduce the exact FT. I'll ask for that on the SciPy list, this now seems > more appropriate. Felix, Do your answers differ from the theory by a constant factor, or are they completely unrelated? 
Stéfan From meine at informatik.uni-hamburg.de Tue Jul 29 08:52:13 2008 From: meine at informatik.uni-hamburg.de (Hans Meine) Date: Tue, 29 Jul 2008 14:52:13 +0200 Subject: [Numpy-discussion] Operation over multiple axes? (Or: Partial flattening?) Message-ID: <200807291452.20263.meine@informatik.uni-hamburg.de> Hi, with a multidimensional array (say, 4-dimensional), I often want to project this onto one single dimension, i.e., let "dat" be a 4D array, I am interested in dat.sum(0).sum(0).sum(0) # equals dat.sum(2).sum(1).sum(0) However, creating intermediate results looks more expensive than necessary; I would actually like to say dat.sum((0,1,2)) One way to achieve this is partial flattening, which I did like this: dat.reshape((numpy.prod(dat.shape[:3]), dat.shape[3])).sum(0) Is there a more elegant way to do this? Ciao, / / .o. /--/ ..o / / ANS ooo -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 197 bytes Desc: This is a digitally signed message part. URL: From felix at physik3.uni-rostock.de Tue Jul 29 09:04:41 2008 From: felix at physik3.uni-rostock.de (Felix Richter) Date: Tue, 29 Jul 2008 15:04:41 +0200 Subject: [Numpy-discussion] FFT usage / consistency In-Reply-To: <9457e7c80807290545h393ba29cka4d274c59e5ef309@mail.gmail.com> References: <200807281025.36338.felix@physik3.uni-rostock.de> <200807291436.23054.felix@physik3.uni-rostock.de> <9457e7c80807290545h393ba29cka4d274c59e5ef309@mail.gmail.com> Message-ID: <200807291504.41609.felix@physik3.uni-rostock.de> > Do your answers differ from the theory by a constant factor, or are > they completely unrelated? No, it's more complicated. Below you'll find my most recent, more stripped-down code. - I don't know how to scale in a way that works for any n. - I don't know how to get the oscillations to match. I suppose it's a problem with the frequency scale, but usage of fftfreq() is straightforward... - I don't know why the imaginary part of the FFT behaves so differently from the real part. It should just be a matter of sin vs. cos. Is this voodoo? ;-) And I didn't find any example on the internet which tries just to reproduce an analytic FT with the FFT... Thanks for your help! # coding: UTF-8 """Test for FFT against analytic results""" from scipy import * from scipy import fftpack as fft import pylab def expdecay(t, dx, a): return exp(-a*abs(t))*exp(1j*dx*t) * sqrt(pi/2.0) def lorentz(x, dx, a): return a/((x-dx)**2+a**2) origfunc = lorentz exactfft = expdecay xrange, dxrange = linspace(0, 100, 2**12, retstep=True) n = len(xrange) # calculate original function over positive half of x-axis # this serves as input to fft, make sure datatype is complex ftdata = zeros(xrange.shape, complex128) ftdata += origfunc(xrange, 50, 1.0) # do FFT fftft = fft.fft(ftdata) # normalize # but how exactly?
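# One possibility, assuming the symmetric transform convention
# F(k) = 1/sqrt(2*pi) * Integral f(x)*exp(i*k*x) dx that the sqrt(pi/2)
# factor in expdecay() above suggests: fft() returns a plain,
# unnormalized sum, so an n-independent approximation of the continuous
# integral is obtained by multiplying by the sample spacing,
#
#     fftft *= dxrange / sqrt(2*pi)
#
# rather than by dividing by sqrt(n), which is the unitary convention
# for the *discrete* transform and still changes scale with the number
# of samples.  Note also that fft() uses the kernel exp(-i*k*x) while
# expdecay() carries exp(+1j*dx*t), so under this convention the
# imaginary part comes out with the opposite sign.  Both remarks are
# hedged suggestions for the question above, not a verified fix.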
fftft /= sqrt(n) # shift frequencies into human-readable order fftfts = fft.fftshift(fftft) # determine frequency axis fftscale = fft.fftfreq(n, dxrange) fftscale = fft.fftshift(fftscale) # calculate exact result of FT for comparison exactres = exactfft(fftscale, 50, 1.0) pylab.subplot(211) pylab.plot(xrange, ftdata.real, 'x', label='Re data') pylab.legend() pylab.subplot(212) pylab.plot(fftscale, fftfts.real, 'x', label='Re FFT(data)') pylab.plot(fftscale, fftfts.imag, '.', label='Im FFT(data)') pylab.plot(fftscale, exactres.real, label='exact Re FT') pylab.plot(fftscale, exactres.imag, label='exact Im FT') pylab.legend() pylab.show() pylab.close() From faltet at pytables.org Tue Jul 29 09:12:52 2008 From: faltet at pytables.org (Francesc Alted) Date: Tue, 29 Jul 2008 15:12:52 +0200 Subject: [Numpy-discussion] The date/time dtype and the casting issue Message-ID: <200807291512.53270.faltet@pytables.org> Hi, During the making of the date/time proposals and the subsequent discussions in this list, we have changed our point of view a couple of times about how the castings would work between different date/time types and the different time units (previously called resolutions). So I'd like to expose this issue in detail here, and give yet another new proposal about this, so as to gather feedback from the community before consolidating it in the final date/time proposal. Casting proposal for date/time types ==================================== The operations among the proposed date/time types can be divided into three groups: * Absolute time versus relative time * Absolute time versus absolute time * Relative time versus relative time Now, here are our considerations for each case: Absolute time versus relative time ---------------------------------- We think that in this case the absolute time should have priority for determining the time unit of the outcome. That would represent what people want to do most of the time. For example, this would allow one to do: >>> series = numpy.array(['1970-01-01', '1970-02-01', '1970-09-01'], dtype='datetime64[D]') >>> series2 = series + numpy.timedelta(1, 'Y') # Add 2 relative years >>> series2 array(['1972-01-01', '1972-02-01', '1972-09-01'], dtype='datetime64[D]') # the 'D'ay time unit has been chosen Absolute time versus absolute time ---------------------------------- When operating (basically, only subtraction will be allowed) on two absolute times with different time units, we are proposing that the outcome would be to raise an exception. This is because the ranges and timespans of the different time units can be very different, and it is not clear at all what time unit will be preferred by the user. For example, this should be allowed: >>> numpy.ones(3, dtype="T8[Y]") - numpy.zeros(3, dtype="T8[Y]") array([1, 1, 1], dtype="timedelta64[Y]") But the next should not: >>> numpy.ones(3, dtype="T8[Y]") - numpy.zeros(3, dtype="T8[ns]") raise numpy.IncompatibleUnitError # what unit to choose? Relative time versus relative time ---------------------------------- This case would be the same as the previous one (absolute vs absolute). Our proposal is to forbid this operation if the time units of the operands are different. For example, this should be allowed: >>> numpy.ones(3, dtype="t8[Y]") + 3*numpy.ones(3, dtype="t8[Y]") array([4, 4, 4], dtype="timedelta64[Y]") But the next should not: >>> numpy.ones(3, dtype="t8[Y]") + numpy.zeros(3, dtype="t8[fs]") raise numpy.IncompatibleUnitError # what unit to choose?
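To make the above rules a bit more concrete, the following is a minimal, purely illustrative Python sketch that emulates the proposed unit check on top of plain int64 arrays. The ``IncompatibleUnitError`` class and the ``add_relative`` helper are hypothetical stand-ins invented for this example; they only mirror the intended user-visible behaviour and are not the proposed implementation (which would live inside the dtype machinery).

import numpy

class IncompatibleUnitError(ValueError):
    """Hypothetical stand-in for the proposed numpy.IncompatibleUnitError."""

def add_relative(a, a_unit, b, b_unit):
    # Proposed rule: relative + relative is only defined when both
    # operands carry the same time unit; otherwise refuse to guess.
    if a_unit != b_unit:
        raise IncompatibleUnitError("what unit to choose? %r vs %r"
                                    % (a_unit, b_unit))
    return a + b, a_unit

# Same units: allowed, and the unit is preserved.
years = numpy.ones(3, dtype=numpy.int64)        # plays the role of t8[Y]
print add_relative(years, 'Y', 3 * years, 'Y')  # -> (array([4, 4, 4]), 'Y')

# Different units: rejected, as proposed above.
femtos = numpy.zeros(3, dtype=numpy.int64)      # plays the role of t8[fs]
try:
    add_relative(years, 'Y', femtos, 'fs')
except IncompatibleUnitError, exc:
    print "refused:", exc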
Introducing a time casting function ----------------------------------- As forbidding operations among absolute/absolute and relative/relative types can be unacceptable in many situations, we are proposing an explicit casting mechanism so that the user can state the desired time unit of the outcome. For this, a new NumPy function, called, say, ``numpy.change_unit()`` (this name is for the purposes of the discussion and can be changed) will be provided. The signature for the function will be: change_unit(time_object, new_unit, reference) where 'time_object' is the time object whose unit is to be changed, 'new_unit' is the desired new time unit, and 'reference' is an absolute date that will be used to allow the conversion of relative times in case of using time units with an uncertain number of smaller time units (relative years or months cannot be expressed in days). For example, that would allow one to do: >>> numpy.change_unit( numpy.array([1,2], 'T[Y]'), 'T[d]' ) array([365, 731], dtype="datetime64[d]") or: >>> ref = numpy.datetime64('1971', 'T[Y]') >>> numpy.change_unit( numpy.array([1,2], 't[Y]'), 't[d]', ref ) array([366, 365], dtype="timedelta64[d]") Note: we refused to use the ``.astype()`` method because of the additional 'time_reference' parameter that would sound strange for other typical uses of ``.astype()``. Opinions? -- Francesc Alted From stefan at sun.ac.za Tue Jul 29 09:32:07 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 29 Jul 2008 15:32:07 +0200 Subject: [Numpy-discussion] Operation over multiple axes? (Or: Partial flattening?) In-Reply-To: <200807291452.20263.meine@informatik.uni-hamburg.de> References: <200807291452.20263.meine@informatik.uni-hamburg.de> Message-ID: <9457e7c80807290632p126f6d75hda04b1d6b3cf571f@mail.gmail.com> 2008/7/29 Hans Meine : > with a multidimensional array (say, 4-dimensional), I often want to project > this onto one single dimension, i.e., let "dat" be a 4D array, I am > interested in > > dat.sum(0).sum(0).sum(0) # equals dat.sum(2).sum(1).sum(0) > > However, creating intermediate results looks more expensive than necessary; I > would actually like to say > > dat.sum((0,1,2)) > > One way to achieve this is partial flattening, which I did like this: > > dat.reshape((numpy.prod(dat.shape[:3]), dat.shape[3])).sum(0) > > Is there a more elegant way to do this? That looks like a good way to do it. You can clean it up ever so slightly: x.reshape([-1, x.shape[-1]]).sum(axis=0) Cheers Stéfan From chanley at stsci.edu Tue Jul 29 09:36:40 2008 From: chanley at stsci.edu (Christopher Hanley) Date: Tue, 29 Jul 2008 09:36:40 -0400 Subject: [Numpy-discussion] svn numpy selftests fail on Solaris Message-ID: <488F1CE8.40408@stsci.edu> This has apparently been occurring for a few days. My apologies, but I have been away on vacation. FAILED (failures=5) Running unit tests for numpy NumPy version 1.2.0.dev5565 NumPy is installed in /usr/ra/pyssg/2.5.1/numpy Python version 2.5.1 (r251:54863, Jun 4 2008, 15:48:19) [C] nose version 0.10.0 ctypes is not available on this python: skipping the test (import error was: ctypes is not available.) No distutils available, skipping test.
errors: failures: (Test(test_umath.TestC99.test_cacos(, (1.0, NaN), (NaN, NaN), 'invalid-optional')), 'Traceback (most recent call last):\n File "/usr/stsci/pyssgdev/2.5.1/nose/case.py", line 202, in runTest\n self.test(*self.arg)\n File "/usr/ra/pyssg/2.5.1/numpy/core/tests/test_umath.py", line 393, in _check\n assert got == expected, (got, expected)\nAssertionError: (\'(-NaN, -NaN)\', \'(NaN, NaN)\')\n') (Test(test_umath.TestC99.test_cacos(, (NaN, 1.0), (NaN, NaN), 'invalid-optional')), 'Traceback (most recent call last):\n File "/usr/stsci/pyssgdev/2.5.1/nose/case.py", line 202, in runTest\n self.test(*self.arg)\n File "/usr/ra/pyssg/2.5.1/numpy/core/tests/test_umath.py", line 393, in _check\n assert got == expected, (got, expected)\nAssertionError: (\'(-NaN, -NaN)\', \'(NaN, NaN)\')\n') (Test(test_umath.TestC99.test_cacosh(, (1.0, NaN), (NaN, NaN), 'invalid-optional')), 'Traceback (most recent call last):\n File "/usr/stsci/pyssgdev/2.5.1/nose/case.py", line 202, in runTest\n self.test(*self.arg)\n File "/usr/ra/pyssg/2.5.1/numpy/core/tests/test_umath.py", line 393, in _check\n assert got == expected, (got, expected)\nAssertionError: (\'(NaN, -NaN)\', \'(NaN, NaN)\')\n') (Test(test_umath.TestC99.test_casinh(, (NaN, 1.0), (NaN, NaN), 'invalid-optional')), 'Traceback (most recent call last):\n File "/usr/stsci/pyssgdev/2.5.1/nose/case.py", line 202, in runTest\n self.test(*self.arg)\n File "/usr/ra/pyssg/2.5.1/numpy/core/tests/test_umath.py", line 393, in _check\n assert got == expected, (got, expected)\nAssertionError: (\'(NaN, -NaN)\', \'(NaN, NaN)\')\n') (Test(test_umath.TestC99.test_clog(, (-0.0, -0.0), (-Infinity, 3.1415926535897931), 'divide')), 'Traceback (most recent call last):\n File "/usr/stsci/pyssgdev/2.5.1/nose/case.py", line 202, in runTest\n self.test(*self.arg)\n File "/usr/ra/pyssg/2.5.1/numpy/core/tests/test_umath.py", line 393, in _check\n assert got == expected, (got, expected)\nAssertionError: (\'(-Infinity, 0.0)\', \'(-Infinity, 3.1415926535897931)\')\n') -- Christopher Hanley Systems Software Engineer Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21218 (410) 338-4338 From charlesr.harris at gmail.com Tue Jul 29 10:13:23 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 29 Jul 2008 08:13:23 -0600 Subject: [Numpy-discussion] Recent work for branch cuts / C99 complex maths: problems on mingw In-Reply-To: References: <488E9CAE.9090602@ar.media.kyoto-u.ac.jp> <488EB5DD.3040803@ar.media.kyoto-u.ac.jp> <488EFC14.1090200@ar.media.kyoto-u.ac.jp> Message-ID: On Tue, Jul 29, 2008 at 5:59 AM, Pauli Virtanen wrote: > Tue, 29 Jul 2008 20:16:36 +0900, David Cournapeau wrote: > [clip] > > Is there a clear explanation about C99 features related to complex math > > somewhere ? The problem with C99 is that few compilers implement it > > The C99 standard (or more precisely its draft but this is probably mostly > the same thing) can be found here: > > http://www.open-std.org/jtc1/sc22/wg14/www/standards > > > properly. None of the most used compilers implement it entirely, and > > some of them don't even try, like MS compilers; the windows situations > > is the most problematic because the mingw32 compilers are old, and thus > > may not handle than many C99 features. 
There are also some shortcuts in > > the way we detect the math functions, which is not 100 % reliable > > (because of some mingw problems, in particular: I have already mentioned > > this problem several times, I really ought to solve it at some points > > instead of speaking about it). > > > > So what matters IMHO is the practical implications with the compilers/C > > runtime we use for numpy/scipy (gcc, visual studio and intel compilers > > should cover most of developers/users). > > We implement the complex functions completely ourselves and use only the > real C math functions. As we compose the complex operations from real > ones, corner cases can and apparently go wrong even if the underlying > compiler is C99 compliant. > > The new Python cmath module actually implements the corner cases using a > lookup table. I wonder if we should follow... We also need speed. I think we just say behaviour on the branch cuts is undefined, which is numerically true in any case, and try to get the nan's and infs sorted out. But only if the costs are reasonable. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Jul 29 10:16:07 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 29 Jul 2008 08:16:07 -0600 Subject: [Numpy-discussion] svn numpy selftests fail on Solaris In-Reply-To: <488F1CE8.40408@stsci.edu> References: <488F1CE8.40408@stsci.edu> Message-ID: On Tue, Jul 29, 2008 at 7:36 AM, Christopher Hanley wrote: > This has apparently been occurring for a few days. My apologizes but I > have been away on vacation. > > FAILED (failures=5) > Running unit tests for numpy > NumPy version 1.2.0.dev5565 > NumPy is installed in /usr/ra/pyssg/2.5.1/numpy > Python version 2.5.1 (r251:54863, Jun 4 2008, 15:48:19) [C] > nose version 0.10.0 > ctypes is not available on this python: skipping the test (import error > was: ctypes is not available.) > No distutils available, skipping test. 
> errors: > failures: > (Test(test_umath.TestC99.test_cacos(, (1.0, NaN), > (NaN, > NaN), 'invalid-optional')), 'Traceback (most recent call last):\n File > "/usr/stsci/pyssgdev/2.5.1/nose/case.py", line 202, in runTest\n > self.test(*self.arg)\n File > "/usr/ra/pyssg/2.5.1/numpy/core/tests/test_umath.py", line 393, in > _check\n assert got == expected, (got, expected)\nAssertionError: > (\'(-NaN, -NaN)\', \'(NaN, NaN)\')\n') > (Test(test_umath.TestC99.test_cacos(, (NaN, 1.0), > (NaN, > NaN), 'invalid-optional')), 'Traceback (most recent call last):\n File > "/usr/stsci/pyssgdev/2.5.1/nose/case.py", line 202, in runTest\n > self.test(*self.arg)\n File > "/usr/ra/pyssg/2.5.1/numpy/core/tests/test_umath.py", line 393, in > _check\n assert got == expected, (got, expected)\nAssertionError: > (\'(-NaN, -NaN)\', \'(NaN, NaN)\')\n') > (Test(test_umath.TestC99.test_cacosh(, (1.0, NaN), > (NaN, NaN), 'invalid-optional')), 'Traceback (most recent call last):\n > File "/usr/stsci/pyssgdev/2.5.1/nose/case.py", line 202, in runTest\n > self.test(*self.arg)\n File > "/usr/ra/pyssg/2.5.1/numpy/core/tests/test_umath.py", line 393, in > _check\n assert got == expected, (got, expected)\nAssertionError: > (\'(NaN, -NaN)\', \'(NaN, NaN)\')\n') > (Test(test_umath.TestC99.test_casinh(, (NaN, 1.0), > (NaN, NaN), 'invalid-optional')), 'Traceback (most recent call last):\n > File "/usr/stsci/pyssgdev/2.5.1/nose/case.py", line 202, in runTest\n > self.test(*self.arg)\n File > "/usr/ra/pyssg/2.5.1/numpy/core/tests/test_umath.py", line 393, in > _check\n assert got == expected, (got, expected)\nAssertionError: > (\'(NaN, -NaN)\', \'(NaN, NaN)\')\n') > (Test(test_umath.TestC99.test_clog(, (-0.0, -0.0), > (-Infinity, 3.1415926535897931), 'divide')), 'Traceback (most recent > call last):\n File "/usr/stsci/pyssgdev/2.5.1/nose/case.py", line 202, > in runTest\n self.test(*self.arg)\n File > "/usr/ra/pyssg/2.5.1/numpy/core/tests/test_umath.py", line 393, in > _check\n assert got == expected, (got, expected)\nAssertionError: > (\'(-Infinity, 0.0)\', \'(-Infinity, 3.1415926535897931)\')\n') > See the thread on recent work on branch cuts. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From meine at informatik.uni-hamburg.de Tue Jul 29 10:20:29 2008 From: meine at informatik.uni-hamburg.de (Hans Meine) Date: Tue, 29 Jul 2008 16:20:29 +0200 Subject: [Numpy-discussion] FFT usage / consistency In-Reply-To: <200807291504.41609.felix@physik3.uni-rostock.de> References: <200807281025.36338.felix@physik3.uni-rostock.de> <9457e7c80807290545h393ba29cka4d274c59e5ef309@mail.gmail.com> <200807291504.41609.felix@physik3.uni-rostock.de> Message-ID: <200807291620.29704.meine@informatik.uni-hamburg.de> Hi Felix, I quickly copy-pasted and ran your code; it looks to me like the results you calculated analytically oscillate too fast to be represented discretely. Did you try to transform different, simpler signals? (e.g. a Gaussian?) Ciao, / / .o. /--/ ..o / / ANS ooo -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 197 bytes Desc: This is a digitally signed message part. URL: From meine at informatik.uni-hamburg.de Tue Jul 29 10:24:27 2008 From: meine at informatik.uni-hamburg.de (Hans Meine) Date: Tue, 29 Jul 2008 16:24:27 +0200 Subject: [Numpy-discussion] Operation over multiple axes? (Or: Partial flattening?) 
In-Reply-To: <9457e7c80807290632p126f6d75hda04b1d6b3cf571f@mail.gmail.com> References: <200807291452.20263.meine@informatik.uni-hamburg.de> <9457e7c80807290632p126f6d75hda04b1d6b3cf571f@mail.gmail.com> Message-ID: <200807291624.27888.meine@informatik.uni-hamburg.de> On Dienstag 29 Juli 2008, St?fan van der Walt wrote: > > One way to achieve this is partial flattening, which I did like this: > > > > dat.reshape((numpy.prod(dat.shape[:3]), dat.shape[3])).sum(0) > > > > Is there a more elegant way to do this? > > That looks like a good way to do it. You can clean it up ever so slightly: > > x.reshape([-1, x.shape[-1]]).sum(axis=0) Thanks, that looks more elegant indeed. I am not sure if I've read about -1 in shapes before. I assume it represents "the automatically determined rest" and may only appear once? Should this be documented in the reshape docstring? Ciao, / / .o. /--/ ..o / / ANS ooo -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 197 bytes Desc: This is a digitally signed message part. URL: From pav at iki.fi Tue Jul 29 10:38:15 2008 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 29 Jul 2008 14:38:15 +0000 (UTC) Subject: [Numpy-discussion] Recent work for branch cuts / C99 complex maths: problems on mingw References: <488E9CAE.9090602@ar.media.kyoto-u.ac.jp> <488EB5DD.3040803@ar.media.kyoto-u.ac.jp> <488EFC14.1090200@ar.media.kyoto-u.ac.jp> Message-ID: Tue, 29 Jul 2008 08:13:23 -0600, Charles R Harris wrote: [clip] > We also need speed. I think we just say behaviour on the branch cuts is > undefined, which is numerically true in any case, and try to get the > nan's and infs sorted out. But only if the costs are reasonable. Well, the branch cut tests have succeeded on all platforms so far, which means the behavior is numerically well-defined. I doubt we can lose any speed by fixing sqrt in this respect. The inf-nan business is the one causing problems. Lookup tables might solve the problem, but they add a few branches to the code even if the arguments are finite. -- Pauli Virtanen From charlesr.harris at gmail.com Tue Jul 29 10:44:35 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 29 Jul 2008 08:44:35 -0600 Subject: [Numpy-discussion] Recent work for branch cuts / C99 complex maths: problems on mingw In-Reply-To: References: <488E9CAE.9090602@ar.media.kyoto-u.ac.jp> <488EB5DD.3040803@ar.media.kyoto-u.ac.jp> <488EFC14.1090200@ar.media.kyoto-u.ac.jp> Message-ID: On Tue, Jul 29, 2008 at 8:38 AM, Pauli Virtanen wrote: > Tue, 29 Jul 2008 08:13:23 -0600, Charles R Harris wrote: > [clip] > > We also need speed. I think we just say behaviour on the branch cuts is > > undefined, which is numerically true in any case, and try to get the > > nan's and infs sorted out. But only if the costs are reasonable. > > Well, the branch cut tests have succeeded on all platforms so far, which > means the behavior is numerically well-defined. I doubt we can lose any > speed by fixing sqrt in this respect. > Because of the discontinuity, roundoff error makes the behaviour undefined. It's just a fact of computational life. I suspect we mean slightly different things by "numerically" ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From felix at physik3.uni-rostock.de Tue Jul 29 10:56:08 2008 From: felix at physik3.uni-rostock.de (Felix Richter) Date: Tue, 29 Jul 2008 16:56:08 +0200 Subject: [Numpy-discussion] FFT usage / consistency In-Reply-To: <200807291620.29704.meine@informatik.uni-hamburg.de> References: <200807281025.36338.felix@physik3.uni-rostock.de> <200807291504.41609.felix@physik3.uni-rostock.de> <200807291620.29704.meine@informatik.uni-hamburg.de> Message-ID: <200807291656.08774.felix@physik3.uni-rostock.de> > I quickly copy-pasted and ran your code; it looks to me like the results > you calculated analytically oscillate too fast to be represented > discretely. Did you try to transform different, simpler signals? (e.g. a > Gaussian?) Yes, I ran into the same problem. Since the oscillation frequency is given by the point around which the function is centered, it would be good to have it centered around zero. The FFT assumes the x axis to be [0..n], so how should I do this? The functions I have to transform later won't be symmetrical, so the trick abs(fftdata) is not possible. Felix From lpc at cmu.edu Tue Jul 29 11:13:03 2008 From: lpc at cmu.edu (Luis Pedro Coelho) Date: Tue, 29 Jul 2008 11:13:03 -0400 Subject: [Numpy-discussion] No Copy Reduce Operations Message-ID: <200807291113.03278.lpc@cmu.edu> Travis E. Oliphant wrote: > Your approach using C++ templates is interesting, and I'm very glad for > your explanation and your releasing of the code as open source. I'm > not prepared to start using C++ in NumPy, however, so your code will > have to serve as an example only. I will keep this as a separate package. I will write back once I have put it up somewhere. If this is not going into numpy, then I will do things a little differently, namely, I will have a very simple python layer which cleans up the arguments before calling the C++ implementation. bye, Luis From faltet at pytables.org Tue Jul 29 12:17:01 2008 From: faltet at pytables.org (Francesc Alted) Date: Tue, 29 Jul 2008 18:17:01 +0200 Subject: [Numpy-discussion] The date/time dtype and the casting issue In-Reply-To: <200807291512.53270.faltet@pytables.org> References: <200807291512.53270.faltet@pytables.org> Message-ID: <200807291817.01370.faltet@pytables.org> Oops, after reviewing this document, I've discovered a couple of typos. A Tuesday 29 July 2008, Francesc Alted escrigué: [snip] > >>> series = numpy.array(['1970-01-01', '1970-02-01', '1970-09-01'], > dtype='datetime64[D]') > >>> series2 = series + numpy.timedelta(1, 'Y') # Add 2 years ^^^ the above line should read: >>> series2 = series + numpy.timedelta(2, 'Y') # Add 2 years
Message-ID: <488F43CA.5090707@jpl.nasa.gov> Hi In my site.cfg I have [DEFAULT] library_dirs = /home/ossetest/lib64:/home/ossetest/lib include_dirs = /home/ossetest/include [fftw] libraries = fftw3 but libfftw3.a isn't being accesed. ls -lu ~/lib/libfftw3.a -rw-r--r-- 1 ossetest ossetest 1572628 Jul 26 15:02 /home/ossetest/lib/libfftw3.a anybody know why? Mathew From david.huard at gmail.com Tue Jul 29 12:31:54 2008 From: david.huard at gmail.com (David Huard) Date: Tue, 29 Jul 2008 12:31:54 -0400 Subject: [Numpy-discussion] The date/time dtype and the casting issue In-Reply-To: <200807291512.53270.faltet@pytables.org> References: <200807291512.53270.faltet@pytables.org> Message-ID: <91cf711d0807290931l727ba61fl36d5afe4240554ff@mail.gmail.com> Hi, Silent casting is often a source of bugs and I appreciate the strict rules you want to enforce. However, I think there should be a simpler mechanism for operations between different types than creating a copy of a variable with the correct type. My suggestion is to have a dtype argument for methods such as add and subs: >>> numpy.ones(3, dtype="t8[Y]").add(numpy.zeros(3, dtype="t8[fs]"), dtype="t8[fs]") This way, `implicit` operations (+,-) enforce strict rules, and `explicit` operations (add, subs) let's you do want you want at your own risk. David On Tue, Jul 29, 2008 at 9:12 AM, Francesc Alted wrote: > Hi, > > During the making of the date/time proposals and the subsequent > discussions in this list, we have changed a couple of times our point > of view about the way how the castings would work between different > date/time types and the different time units (previously called > resolutions). So I'd like to expose this issue in detail here, and > give yet another new proposal about this, so as to gather feedback from > the community before consolidating it in the final date/time proposal. > > Casting proposal for date/time types > ==================================== > > The operations among the proposed date/time types can be divided in > three groups: > > * Absolute time versus relative time > > * Absolute time versus absolute time > > * Relative time versus relative time > > Now, here are our considerations for each case: > > Absolute time versus relative time > ---------------------------------- > > We think that in this case the absolute time should have priority for > determining the time unit of the outcome. That would represent what > the people wants to do most of the times. For example, this would > allow to do: > > >>> series = numpy.array(['1970-01-01', '1970-02-01', '1970-09-01'], > dtype='datetime64[D]') > >>> series2 = series + numpy.timedelta(1, 'Y') # Add 2 relative years > >>> series2 > array(['1972-01-01', '1972-02-01', '1972-09-01'], > dtype='datetime64[D]') # the 'D'ay time unit has been chosen > > Absolute time versus absolute time > ---------------------------------- > > When operating (basically, only the substraction will be allowed) two > absolute times with different unit times, we are proposing that the > outcome would be to raise an exception. This is because the ranges and > timespans of the different time units can be very different, and it is > not clear at all what time unit will be preferred for the user. For > example, this should be allowed: > > >>> numpy.ones(3, dtype="T8[Y]") - numpy.zeros(3, dtype="T8[Y]") > array([1, 1, 1], dtype="timedelta64[Y]") > > But the next should not: > > >>> numpy.ones(3, dtype="T8[Y]") - numpy.zeros(3, dtype="T8[ns]") > raise numpy.IncompatibleUnitError # what unit to choose? 
> > Relative time versus relative time > ---------------------------------- > > This case would be the same than the previous one (absolute vs > absolute). Our proposal is to forbid this operation if the time units > of the operands are different. For example, this should be allowed: > > >>> numpy.ones(3, dtype="t8[Y]") + 3*numpy.ones(3, dtype="t8[Y]") > array([4, 4, 4], dtype="timedelta64[Y]") > > But the next should not: > > >>> numpy.ones(3, dtype="t8[Y]") + numpy.zeros(3, dtype="t8[fs]") > raise numpy.IncompatibleUnitError # what unit to choose? > > Introducing a time casting function > ----------------------------------- > > As forbidding operations among absolute/absolute and relative/relative > types can be unacceptable in many situations, we are proposing an > explicit casting mechanism so that the user can inform about the > desired time unit of the outcome. For this, a new NumPy function, > called, say, ``numpy.change_unit()`` (this name is for the purposes of > the discussion and can be changed) will be provided. The signature for > the function will be: > > change_unit(time_object, new_unit, reference) > > where 'time_object' is the time object whose unit is to be > changed, 'new_unit' is the desired new time unit, and 'reference' is an > absolute date that will be used to allow the conversion of relative > times in case of using time units with an uncertain number of smaller > time units (relative years or months cannot be expressed in days). For > example, that would allow to do: > > >>> numpy.change_unit( numpy.array([1,2], 'T[Y]'), 'T[d]' ) > array([365, 731], dtype="datetime64[d]") > > or: > > >>> ref = numpy.datetime64('1971', 'T[Y]') > >>> numpy.change_unit( numpy.array([1,2], 't[Y]'), 't[d]', ref ) > array([366, 365], dtype="timedelta64[d]") > > Note: we refused to use the ``.astype()`` method because of the > additional 'time_reference' parameter that will sound strange for other > typical uses of ``.astype()``. > > Opinions? > > -- > Francesc Alted > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Tue Jul 29 12:38:19 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 29 Jul 2008 12:38:19 -0400 Subject: [Numpy-discussion] The date/time dtype and the casting issue In-Reply-To: <200807291512.53270.faltet@pytables.org> References: <200807291512.53270.faltet@pytables.org> Message-ID: <200807291238.20161.pgmdevlist@gmail.com> Francesc, > Absolute time versus relative time > ---------------------------------- > > We think that in this case the absolute time should have priority for > determining the time unit of the outcome. +1 > Absolute time versus absolute time > ---------------------------------- > > When operating (basically, only the substraction will be allowed) two > absolute times with different unit times, we are proposing that the > outcome would be to raise an exception. +1 (However, I don't think that np.zeros(3, dtype="T8[Y]") is the most useful example ;)) > Relative time versus relative time > ---------------------------------- > > This case would be the same than the previous one (absolute vs > absolute). Our proposal is to forbid this operation if the time units > of the operands are different. Mmh, less sure on this one. Can't we use a hierarchy of time units, and force to the lowest ? 
For example: >>>numpy.ones(3, dtype="t8[Y]") + 3*numpy.ones(3, dtype="t8[M]") >>>array([15,15,15], dtype="t8['M']") I agree that adding ns to years makes no sense, but ns to s ? min to hr or days ? In short: systematically raising an exception looks a bit too drastic. There are some simple unambiguous cases that sould be allowed (Y+M, Y+Q, M+Q, H+D...) > Introducing a time casting function > ----------------------------------- > change_unit(time_object, new_unit, reference) > > where 'time_object' is the time object whose unit is to be > changed, 'new_unit' is the desired new time unit, and 'reference' is an > absolute date that will be used to allow the conversion of relative > times in case of using time units with an uncertain number of smaller > time units (relative years or months cannot be expressed in days). reference default to the POSIX epoch, right ? So this function could be a first step towards our problem of frequency conversion... > Note: we refused to use the ``.astype()`` method because of the > additional 'time_reference' parameter that will sound strange for other > typical uses of ``.astype()``. A method would be really, really helpful, though... Back to a previous email: > >>> numpy.timedelta(20, unit='Y') + numpy.timedelta(365, unit='D') > 20 # unit is Year I would have expected days, or an exception (as there's an ambiguity in the length in days of a year) > >>> numpy.timedelta(20, unit='Y') + numpy.timedelta(366, unit='D') > 21 # unit is Year > >>> numpy.timedelta(43, unit='M') + numpy.timedelta(30, unit='D') > 43 # unit is Month > > >>> numpy.timedelta(43, unit='M') + numpy.timedelta(31, unit='D') > 44 # unit is Month > Would that be ok for you? Gah, I dunno. Adding relative values is always tricky... I understand the last statement as 43 months and 31 days, which could be 44 months if we're speaking in months, or 3 years, 7 months, and 31 days... From charlesr.harris at gmail.com Tue Jul 29 12:55:27 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 29 Jul 2008 10:55:27 -0600 Subject: [Numpy-discussion] FFT usage / consistency In-Reply-To: <200807291656.08774.felix@physik3.uni-rostock.de> References: <200807281025.36338.felix@physik3.uni-rostock.de> <200807291504.41609.felix@physik3.uni-rostock.de> <200807291620.29704.meine@informatik.uni-hamburg.de> <200807291656.08774.felix@physik3.uni-rostock.de> Message-ID: On Tue, Jul 29, 2008 at 8:56 AM, Felix Richter wrote: > > I quickly copy-pasted and ran your code; it looks to me like the results > > you calculated analytically oscillate too fast to be represented > > discretely. Did you try to transform different, simpler signals? (e.g. > a > > Gaussian?) > Yes, I run into the same problem. > > Since the oscillation frequency is given by the point around which the > function is centered, it would be good to have it centered around zero. > The FFT assumes the x axis to be [0..n], so how should I do this? > The functions I have to transform later won't be symmetrical, so the trick > abs(fftdata) is not possible. > You can apply a linear phase shift to the transformed data, i.e., multiply by something of the form exp(ixn), where x depends on where you want the center and n is the index of the transformed data point. This effectively rotates the original data. Or you can just rotate the data. If the data is not symmetric you are always going to have complex components. What exactly are you trying to do? I mean, what is the original problem that you are trying to solve by this method? 
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From tom.denniston at alum.dartmouth.org Tue Jul 29 13:21:39 2008 From: tom.denniston at alum.dartmouth.org (Tom Denniston) Date: Tue, 29 Jul 2008 12:21:39 -0500 Subject: [Numpy-discussion] The date/time dtype and the casting issue In-Reply-To: <91cf711d0807290931l727ba61fl36d5afe4240554ff@mail.gmail.com> References: <200807291512.53270.faltet@pytables.org> <91cf711d0807290931l727ba61fl36d5afe4240554ff@mail.gmail.com> Message-ID: Francesc, The datetime proposal is very impressive in its depth and thought. For me as well as many other people this would be a massive improvement to numpy and allow numpy to get a foothold in areas like econometrics where R/S is now dominant. I had one question regarding casting of strings: I think it would be ideal if things like the following worked: >>> series = numpy.array(['1970-02-01','1970-09-01'], dtype = 'datetime64[D]') >>> series == '1970-02-01' [True, False] I view this as similar to: >>> series = numpy.array([1,2,3], dtype=float) >>> series == 2 [False,True,False] 1. However it does numpy recognizes that an int is comparable with a float and does the float cast. I think you want the same behavior between strings that parse into dates and date arrays. Some might object that the relationship between string and date is more tenuous than float and int, which is true, but having used my own homespun date array numpy extension for over a year, I've found that the first thing I did was wrap it into an object that handles these string->date translations elegantly and that made it infinately more usable from an ipython session. 2. Even more important to me, however, is the issue of date parsing. The mx library does many things badly but it does do a great job of parsing dates of many formats. When you parse '1/1/95' or 1995-01-01' it knows that you mean 19950101 which is really nice. I believe the scipy timeseries code for parsing dates is based on it. I would highly suggest starting with that level of functionality. The one major issue with it is an uninterpretable date doesn't throw an error but becomes whatever date is right now. That is obviously unfavorable. 3. Finally my current implementation uses floats uses nan to represent an invalid date. When you assign an element of an date array to None it uses nan as the value. When you assign a real date it puts in the equivalent floating point value. I have found this to be hugely beneficial and just wanted to float the idea of reserving a value to indicate the floating point equivalent of nan. People might prefer masked arrays as a solution, but I just wanted to float the idea. Forgive me if any of this has already been covered. There has been a lot of volume on this subject and I've tried to read it all diligently but may have missed a point or two. --Tom From faltet at pytables.org Tue Jul 29 14:08:28 2008 From: faltet at pytables.org (Francesc Alted) Date: Tue, 29 Jul 2008 20:08:28 +0200 Subject: [Numpy-discussion] The date/time dtype and the casting issue In-Reply-To: <91cf711d0807290931l727ba61fl36d5afe4240554ff@mail.gmail.com> References: <200807291512.53270.faltet@pytables.org> <91cf711d0807290931l727ba61fl36d5afe4240554ff@mail.gmail.com> Message-ID: <200807292008.28992.faltet@pytables.org> A Tuesday 29 July 2008, David Huard escrigu?: > Hi, > > Silent casting is often a source of bugs and I appreciate the strict > rules you want to enforce. 
> However, I think there should be a simpler mechanism for operations > between different types > than creating a copy of a variable with the correct type. > > My suggestion is to have a dtype argument for methods such as add and subs: > >>> numpy.ones(3, dtype="t8[Y]").add(numpy.zeros(3, dtype="t8[fs]"), > > dtype="t8[fs]") > > This way, `implicit` operations (+,-) enforce strict rules, and > `explicit` operations (add, subs) let's > you do want you want at your own risk. Hmm, the idea of the ``.add()`` and ``.subtract()`` methods is tempting, but I not sure it is a good idea to add new methods to the ndarray object that are meant to be used with just the date/time dtype. I'm afraid that I'm -1 here. Cheers, -- Francesc Alted From pgmdevlist at gmail.com Tue Jul 29 14:16:27 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 29 Jul 2008 14:16:27 -0400 Subject: [Numpy-discussion] The date/time dtype and the casting issue In-Reply-To: <200807292008.28992.faltet@pytables.org> References: <200807291512.53270.faltet@pytables.org> <91cf711d0807290931l727ba61fl36d5afe4240554ff@mail.gmail.com> <200807292008.28992.faltet@pytables.org> Message-ID: <200807291416.28154.pgmdevlist@gmail.com> On Tuesday 29 July 2008 14:08:28 Francesc Alted wrote: > A Tuesday 29 July 2008, David Huard escrigu?: > Hmm, the idea of the ``.add()`` and ``.subtract()`` methods is tempting, > but I not sure it is a good idea to add new methods to the ndarray > object that are meant to be used with just the date/time dtype. > > I'm afraid that I'm -1 here. I fully agree with Francesc, .add and .subtract will be quite confusing. About inplace conversions, the right-end (other) is cast to the type of the left end (self) by default following the basic rule of casting when there's no ambiguity and raising an exception otherwise ? From ivan at selidor.net Tue Jul 29 14:38:13 2008 From: ivan at selidor.net (Ivan Vilata i Balaguer) Date: Tue, 29 Jul 2008 20:38:13 +0200 Subject: [Numpy-discussion] The date/time dtype and the casting issue In-Reply-To: <91cf711d0807290931l727ba61fl36d5afe4240554ff@mail.gmail.com> References: <200807291512.53270.faltet@pytables.org> <91cf711d0807290931l727ba61fl36d5afe4240554ff@mail.gmail.com> Message-ID: <20080729183813.GB5600@tardis.terramar.selidor.net> David Huard (el 2008-07-29 a les 12:31:54 -0400) va dir:: > Silent casting is often a source of bugs and I appreciate the strict > rules you want to enforce. However, I think there should be a simpler > mechanism for operations between different types than creating a copy > of a variable with the correct type. > > My suggestion is to have a dtype argument for methods such as add and subs: > > >>> numpy.ones(3, dtype="t8[Y]").add(numpy.zeros(3, dtype="t8[fs]"), > dtype="t8[fs]") > > This way, `implicit` operations (+,-) enforce strict rules, and > `explicit` operations (add, subs) let's you do want you want at your > own risk. Umm, that looks like a big change (or addition) to the NumPy interface. I think similar "include a dtype argument for method X" issues hava been discussed before in the list. However, given the big change of adding the new explicit operation methods I think your proposal falls beyond the scope of the project being discussed. However, since yours isn't necessarily a time-related proposal, you may ask what people think of it in a separate thread. :: Ivan Vilata i Balaguer @ Intellectual Monopoly hinders Innovation! 
@ http://www.selidor.net/ @ http://www.nosoftwarepatents.com/ @ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 307 bytes Desc: Digital signature URL: From faltet at pytables.org Tue Jul 29 14:47:34 2008 From: faltet at pytables.org (Francesc Alted) Date: Tue, 29 Jul 2008 20:47:34 +0200 Subject: [Numpy-discussion] The date/time dtype and the casting issue In-Reply-To: References: <200807291512.53270.faltet@pytables.org> <91cf711d0807290931l727ba61fl36d5afe4240554ff@mail.gmail.com> Message-ID: <200807292047.34813.faltet@pytables.org> A Tuesday 29 July 2008, Tom Denniston escrigu?: > Francesc, > > The datetime proposal is very impressive in its depth and thought. > For me as well as many other people this would be a massive > improvement to numpy and allow numpy to get a foothold in areas like > econometrics where R/S is now dominant. > > I had one question regarding casting of strings: > > I think it would be ideal if things like the following worked: > >>> series = numpy.array(['1970-02-01','1970-09-01'], dtype = > >>> 'datetime64[D]') series == '1970-02-01' > > [True, False] > > I view this as similar to: > >>> series = numpy.array([1,2,3], dtype=float) > >>> series == 2 > > [False,True,False] Good point. Well, I agree that adding the support for setting elements from strings, i.e.: >>> t = numpy.ones(3, 'T8[D]') >>> t[0] = '2001-01-01' should be supported. With this, and appyling the broadcasting rules, then the next: >>> t == '2001-01-01' [True, False, False] should work without problems. We will try to add this explicitely into the new proposal. > 1. However it does numpy recognizes that an int is comparable with a > float and does the float cast. I think you want the same behavior > between strings that parse into dates and date arrays. Some might > object that the relationship between string and date is more tenuous > than float and int, which is true, but having used my own homespun > date array numpy extension for over a year, I've found that the first > thing I did was wrap it into an object that handles these > string->date translations elegantly and that made it infinately more > usable from an ipython session. Well, you should not worry because of this. Hopefully, in the >>> t == '2001-01-01' comparison, the scalar part of the expression can be casted into a date array, and then the proper comparison will be performed. If this cannot be done for some reason that scapes me, one will always be able to do: >>> t == N.datetime64('2001-01-01', 'Y') [True, False, False] which is a bit more verbose, but much more clear too. > 2. Even more important to me, however, is the issue of date parsing. > The mx library does many things badly but it does do a great job of > parsing dates of many formats. When you parse '1/1/95' or > 1995-01-01' it knows that you mean 19950101 which is really nice. I > believe the scipy timeseries code for parsing dates is based on it. > I would highly suggest starting with that level of functionality. > The one major issue with it is an uninterpretable date doesn't throw > an error but becomes whatever date is right now. That is obviously > unfavorable. Hmmm. We would not like to clutter too much the NumPy core with too much date string parsing code. As it is said in the proposal, we only plan to support the parsing for the ISO 8601. That should be enough for most of purposes. 
However, I'm sure that parsing for other formats will be available in the ``Date`` class of the TimeSeries package. > 3. Finally my current implementation uses floats uses nan to > represent an invalid date. When you assign an element of an date > array to None it uses nan as the value. When you assign a real date > it puts in the equivalent floating point value. I have found this to > be hugely beneficial and just wanted to float the idea of reserving a > value to indicate the floating point equivalent of nan. People might > prefer masked arrays as a solution, but I just wanted to float the > idea. Hmm, that's another very valid point. In fact, Ivan and me had already foreseen the existence of a NaT (Not A Time), as the maximum negative integer (-2**63). However, as the underlying type of the proposed time type is an int64, the arithmetic operations with the time types will be done through integer arithmetic, and unfortunately, the majority of platforms out there perform this kind of arithmetic as two's-complement arithmetic. That means that there is not provision for handling NaT's in hardware: In [58]: numpy.int64(-2**63) Out[58]: -9223372036854775808 # this is a NaT In [59]: numpy.int64(-2**63)+1 Out[59]: -9223372036854775807 # no longer a NaT In [60]: numpy.int64(-2**63)-1 Out[60]: 9223372036854775807 # idem, and besides, positive! So, well, due to this limitation, I'm afraid that we will have to live without a proper handling of NaT times. Perhaps this would be the biggest limitation of choosing int64 as the base type of the date/time dtype (float64 is better in that regard, but has also its disadvantages, like the variable precision which is intrinsic to it). > Forgive me if any of this has already been covered. There has been a > lot of volume on this subject and I've tried to read it all > diligently but may have missed a point or two. Not at all. You've touched important issues. Thanks! -- Francesc Alted From ivan at selidor.net Tue Jul 29 14:59:19 2008 From: ivan at selidor.net (Ivan Vilata i Balaguer) Date: Tue, 29 Jul 2008 20:59:19 +0200 Subject: [Numpy-discussion] The date/time dtype and the casting issue In-Reply-To: References: <200807291512.53270.faltet@pytables.org> <91cf711d0807290931l727ba61fl36d5afe4240554ff@mail.gmail.com> Message-ID: <20080729185919.GC5600@tardis.terramar.selidor.net> Tom Denniston (el 2008-07-29 a les 12:21:39 -0500) va dir:: > [...] > I think it would be ideal if things like the following worked: > > >>> series = numpy.array(['1970-02-01','1970-09-01'], dtype = 'datetime64[D]') > >>> series == '1970-02-01' > [True, False] > > I view this as similar to: > > >>> series = numpy.array([1,2,3], dtype=float) > >>> series == 2 > [False,True,False] > > 1. However it does numpy recognizes that an int is comparable with a > float and does the float cast. I think you want the same behavior > between strings that parse into dates and date arrays. Some might > object that the relationship between string and date is more tenuous > than float and int, which is true, but having used my own homespun > date array numpy extension for over a year, I've found that the first > thing I did was wrap it into an object that handles these string->date > translations elegantly and that made it infinately more usable from an > ipython session. That may be feasible as long as there is a very clear rule for what time units you get given a string. 
For instance, '1970' could yield years and '1970-03-12T12:00' minutes, but then we don't have a way of creating a time in business days... However, it looks interesting. Any more people interested in this behaviour? > 2. Even more important to me, however, is the issue of date parsing. > The mx library does many things badly but it does do a great job of > parsing dates of many formats. When you parse '1/1/95' or 1995-01-01' > it knows that you mean 19950101 which is really nice. I believe the > scipy timeseries code for parsing dates is based on it. I would > highly suggest starting with that level of functionality. The one > major issue with it is an uninterpretable date doesn't throw an error > but becomes whatever date is right now. That is obviously > unfavorable. Umm, that may get quite complex. E.g. does '1/2/95' refer to February the 1st. or January the 2nd.? There are sooo many date formats and standards that maybe using an external parser code (like mx, TimeSeries or even datetime/strptime) for them would be preferable. I think the ISO 8601 is enough for a basic, well defined time string support. At least to start with. > 3. Finally my current implementation uses floats uses nan to represent > an invalid date. When you assign an element of an date array to None > it uses nan as the value. When you assign a real date it puts in the > equivalent floating point value. I have found this to be hugely > beneficial and just wanted to float the idea of reserving a value to > indicate the floating point equivalent of nan. People might prefer > masked arrays as a solution, but I just wanted to float the idea. > [...] Good news! Our next proposal includes a "Not a Time" value which came around due to the impossibility of converting some times into business days. Stay tuned. However I should point out that the NaT value isn't as powerful as the floating-point NaN, since the former is completely lacking of any sense to hardware, and patching that in all cases would make computations quite slower. Using floating point values doesn't look like an option anymore, since they don't have a fixed precision given a time unit. Cheers, :: Ivan Vilata i Balaguer @ Intellectual Monopoly hinders Innovation! @ http://www.selidor.net/ @ http://www.nosoftwarepatents.com/ @ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 307 bytes Desc: Digital signature URL: From robert.kern at gmail.com Tue Jul 29 15:04:10 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 29 Jul 2008 14:04:10 -0500 Subject: [Numpy-discussion] why isn't libfftw.a being accessed? In-Reply-To: <488F43CA.5090707@jpl.nasa.gov> References: <488F43CA.5090707@jpl.nasa.gov> Message-ID: <3d375d730807291204p588307b3t86a0447885a84c63@mail.gmail.com> On Tue, Jul 29, 2008 at 11:22, Mathew Yeates wrote: > Hi > In my site.cfg I have > > [DEFAULT] > library_dirs = /home/ossetest/lib64:/home/ossetest/lib > include_dirs = /home/ossetest/include > > [fftw] > libraries = fftw3 > > but libfftw3.a isn't being accesed. > ls -lu ~/lib/libfftw3.a > -rw-r--r-- 1 ossetest ossetest 1572628 Jul 26 15:02 > /home/ossetest/lib/libfftw3.a > > anybody know why? Can you show us the relevant part of the output from "python setup.py build"? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From robert.kern at gmail.com Tue Jul 29 15:08:49 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 29 Jul 2008 14:08:49 -0500 Subject: [Numpy-discussion] Operation over multiple axes? (Or: Partial flattening?) In-Reply-To: <200807291624.27888.meine@informatik.uni-hamburg.de> References: <200807291452.20263.meine@informatik.uni-hamburg.de> <9457e7c80807290632p126f6d75hda04b1d6b3cf571f@mail.gmail.com> <200807291624.27888.meine@informatik.uni-hamburg.de> Message-ID: <3d375d730807291208j42156dabia75f4ad93c7b69dd@mail.gmail.com> On Tue, Jul 29, 2008 at 09:24, Hans Meine wrote: > On Dienstag 29 Juli 2008, Stéfan van der Walt wrote: >> > One way to achieve this is partial flattening, which I did like this: >> > >> > dat.reshape((numpy.prod(dat.shape[:3]), dat.shape[3])).sum(0) >> > >> > Is there a more elegant way to do this? >> >> That looks like a good way to do it. You can clean it up ever so slightly: >> >> x.reshape([-1, x.shape[-1]]).sum(axis=0) > > Thanks, that looks more elegant indeed. I am not sure if I've read about -1 > in shapes before. I assume it represents "the automatically determined rest" > and may only appear once? Should this be documented in the reshape > docstring? Yes, yes, and yes. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ivan at selidor.net Tue Jul 29 15:14:13 2008 From: ivan at selidor.net (Ivan Vilata i Balaguer) Date: Tue, 29 Jul 2008 21:14:13 +0200 Subject: [Numpy-discussion] The date/time dtype and the casting issue In-Reply-To: <200807291238.20161.pgmdevlist@gmail.com> References: <200807291512.53270.faltet@pytables.org> <200807291238.20161.pgmdevlist@gmail.com> Message-ID: <20080729191413.GD5600@tardis.terramar.selidor.net> Pierre GM (el 2008-07-29 a les 12:38:19 -0400) va dir:: > > Relative time versus relative time > > ---------------------------------- > > > > This case would be the same than the previous one (absolute vs > > absolute). Our proposal is to forbid this operation if the time units > > of the operands are different. > > Mmh, less sure on this one. Can't we use a hierarchy of time units, and force > to the lowest ? > For example: > >>> numpy.ones(3, dtype="t8[Y]") + 3*numpy.ones(3, dtype="t8[M]") > array([15, 15, 15], dtype="t8[M]") > > I agree that adding ns to years makes no sense, but ns to s? min to > hr or days? In short: systematically raising an exception looks a > bit too drastic. There are some simple unambiguous cases that should be > allowed (Y+M, Y+Q, M+Q, H+D...) Do you mean using the most precise unit for operations with "near enough", different units? I see the point, but what makes me hesitate is giving the user the false impression that the most precise unit is *always* expected. I'd rather spare the user as many surprises as possible, by simplifying rules in favour of explicitness (but that may be debated).
> reference defaults to the POSIX epoch, right? > So this function could be a first step towards our problem of frequency > conversion... > > > Note: we refused to use the ``.astype()`` method because of the > > additional 'time_reference' parameter that would sound strange for other > > typical uses of ``.astype()``. > > A method would be really, really helpful, though... > [...] Yay, but what doesn't seem to fit for me is that the method would only make sense for time values. NumPy is pretty orthogonal in that every method and attribute applies to every type. However, if "units" were to be adopted by NumPy, the method would fit in well. In fact, we are thinking of adding a ``unit`` attribute to dtypes to support time units (being ``None`` for normal NumPy types). But full unit support in NumPy looks so far away that I'm not sure about adopting the method. Thanks for the insights. Cheers, :: Ivan Vilata i Balaguer @ Intellectual Monopoly hinders Innovation! @ http://www.selidor.net/ @ http://www.nosoftwarepatents.com/ @ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 307 bytes Desc: Digital signature URL: From jturner at gemini.edu Tue Jul 29 15:16:13 2008 From: jturner at gemini.edu (James Turner) Date: Tue, 29 Jul 2008 15:16:13 -0400 Subject: [Numpy-discussion] Core dump during numpy.test() Message-ID: <488F6C7D.7040100@gemini.edu> I have built NumPy 1.1.0 on RedHat Enterprise 3 (Linux 2.4.21 with gcc 3.2.3 and glibc 2.3.2) and Python 2.5.1. When I run numpy.test() I get a core dump, as follows. I haven't noticed any special errors during the build. Should I post the entire terminal output from "python setup.py install"? Maybe as an attachment? Let me know if I can provide any more info. Thanks a lot, James. --- [iraf at sbfirf01 DRSetupScripts]$ python Python 2.5.1 (r251:54863, Jul 28 2008, 19:08:11) [GCC 3.2.3 20030502 (Red Hat Linux 3.2.3-20)] on linux2 Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy >>> numpy.test() Numpy is installed in /astro/iraf/i686/gempylocal/lib/python2.5/site-packages/numpy Numpy version 1.1.0 Python version 2.5.1 (r251:54863, Jul 28 2008, 19:08:11) [GCC 3.2.3 20030502 (Red Hat Linux 3.2.3-20)] Found 2/2 tests for numpy.core.tests.test_ufunc Found 143/143 tests for numpy.core.tests.test_regression Found 63/63 tests for numpy.core.tests.test_unicode Found 7/7 tests for numpy.core.tests.test_scalarmath Found 3/3 tests for numpy.core.tests.test_errstate Found 16/16 tests for numpy.core.tests.test_umath Found 12/12 tests for numpy.core.tests.test_records Found 70/70 tests for numpy.core.tests.test_numeric Found 18/18 tests for numpy.core.tests.test_defmatrix Found 36/36 tests for numpy.core.tests.test_numerictypes Found 286/286 tests for numpy.core.tests.test_multiarray Found 3/3 tests for numpy.core.tests.test_memmap Found 4/4 tests for numpy.distutils.tests.test_fcompiler_gnu Found 5/5 tests for numpy.distutils.tests.test_misc_util Found 2/2 tests for numpy.fft.tests.test_fftpack Found 3/3 tests for numpy.fft.tests.test_helper Found 15/15 tests for numpy.lib.tests.test_twodim_base Found 1/1 tests for numpy.lib.tests.test_regression Found 4/4 tests for numpy.lib.tests.test_polynomial Found 43/43 tests for numpy.lib.tests.test_type_check Found 1/1 tests for numpy.lib.tests.test_financial Found 1/1 tests for numpy.lib.tests.test_machar Found 53/53 tests for numpy.lib.tests.test_function_base Found 6/6 tests for numpy.lib.tests.test_index_tricks Found 15/15 tests for numpy.lib.tests.test_io Found 10/10 tests for numpy.lib.tests.test_arraysetops Found 1/1 tests for numpy.lib.tests.test_ufunclike Found 5/5 tests for numpy.lib.tests.test_getlimits Found 24/24 tests for numpy.lib.tests.test__datasource Found 49/49 tests for numpy.lib.tests.test_shape_base Found 3/3 tests for numpy.linalg.tests.test_regression Found 89/89 tests for numpy.linalg.tests.test_linalg Found 36/36 tests for numpy.ma.tests.test_old_ma Found 94/94 tests for numpy.ma.tests.test_core Found 15/15 tests for numpy.ma.tests.test_extras Found 17/17 tests for numpy.ma.tests.test_mrecords Found 4/4 tests for numpy.ma.tests.test_subclassing Found 7/7 tests for numpy.tests.test_random Found 16/16 tests for numpy.testing.tests.test_utils Found 5/5 tests for numpy.tests.test_ctypeslib ..................................................................................................................................................................................................................................................................................................Floating exception (core dumped) From christoph.rackwitz at rwth-aachen.de Tue Jul 29 15:20:25 2008 From: christoph.rackwitz at rwth-aachen.de (Christoph Rackwitz) Date: Tue, 29 Jul 2008 21:20:25 +0200 Subject: [Numpy-discussion] Numpy 1.1.0 (+ PIL 1.1.6) crashes on large datasets Message-ID: Hi, I've managed to crash numpy+PIL when feeding it rather large images. Please see the URL for a test image, script, and gdb stack trace. This crashes on my box (Windows XP SP3) as well as on a linux box (the gdb trace I've been provided with) and a Mac. Windows reports the crash to happen in "multiarray.pyd"; the stack trace mentions the equivalent file. Unfortunately, I don't know how to fix this. Can I help somehow? 
-- Chris [1] http://cracki.ath.cx:10081/pub/numpy-pil-crash/ From robert.kern at gmail.com Tue Jul 29 15:38:18 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 29 Jul 2008 14:38:18 -0500 Subject: [Numpy-discussion] Core dump during numpy.test() In-Reply-To: <488F6C7D.7040100@gemini.edu> References: <488F6C7D.7040100@gemini.edu> Message-ID: <3d375d730807291238r404b77f0l1e8685cc52045567@mail.gmail.com> On Tue, Jul 29, 2008 at 14:16, James Turner wrote: > I have built NumPy 1.1.0 on RedHat Enterprise 3 (Linux 2.4.21 > with gcc 3.2.3 and glibc 2.3.2) and Python 2.5.1. When I run > numpy.test() I get a core dump, as follows. I haven't noticed > any special errors during the build. Should I post the entire > terminal output from "python setup.py install"? Maybe as an > attachment? Let me know if I can provide any more info. Can you do numpy.test(verbosity=2) ? That will print out the name of the test before running it, so we will know exactly which test caused the core dump. A gdb backtrace would also help. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From stefan at sun.ac.za Tue Jul 29 15:47:52 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 29 Jul 2008 21:47:52 +0200 Subject: [Numpy-discussion] Operation over multiple axes? (Or: Partial flattening?) In-Reply-To: <200807291624.27888.meine@informatik.uni-hamburg.de> References: <200807291452.20263.meine@informatik.uni-hamburg.de> <9457e7c80807290632p126f6d75hda04b1d6b3cf571f@mail.gmail.com> <200807291624.27888.meine@informatik.uni-hamburg.de> Message-ID: <9457e7c80807291247l215c3b2aoa32d7e3dc287189c@mail.gmail.com> 2008/7/29 Hans Meine : > On Dienstag 29 Juli 2008, St?fan van der Walt wrote: >> > One way to achieve this is partial flattening, which I did like this: >> > >> > dat.reshape((numpy.prod(dat.shape[:3]), dat.shape[3])).sum(0) >> > >> > Is there a more elegant way to do this? >> >> That looks like a good way to do it. You can clean it up ever so slightly: >> >> x.reshape([-1, x.shape[-1]]).sum(axis=0) > > Thanks, that looks more elegant indeed. I am not sure if I've read about -1 > in shapes before. I assume it represents "the automatically determined rest" > and may only appear once? Should this be documented in the reshape > docstring? That's correct, and yes -- it should! Would you like to document it yourself? If you register on http://sd-2116.dedibox.fr/pydocweb I'll give you editor's access. Regards St?fan From pgmdevlist at gmail.com Tue Jul 29 15:47:52 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 29 Jul 2008 15:47:52 -0400 Subject: [Numpy-discussion] The date/time dtype and the casting issue In-Reply-To: <20080729191413.GD5600@tardis.terramar.selidor.net> References: <200807291512.53270.faltet@pytables.org> <200807291238.20161.pgmdevlist@gmail.com> <20080729191413.GD5600@tardis.terramar.selidor.net> Message-ID: <200807291547.53370.pgmdevlist@gmail.com> On Tuesday 29 July 2008 15:14:13 Ivan Vilata i Balaguer wrote: > Pierre GM (el 2008-07-29 a les 12:38:19 -0400) va dir:: > > > Relative time versus relative time > > > ---------------------------------- > > > > > > This case would be the same than the previous one (absolute vs > > > absolute). Our proposal is to forbid this operation if the time units > > > of the operands are different. > > > > Mmh, less sure on this one. 
Can't we use a hierarchy of time units, and > > force to the lowest ? > > > > For example: > > >>>numpy.ones(3, dtype="t8[Y]") + 3*numpy.ones(3, dtype="t8[M]") > > >>>array([15,15,15], dtype="t8['M']") > > > > I agree that adding ns to years makes no sense, but ns to s ? min to > > hr or days ? In short: systematically raising an exception looks a > > bit too drastic. There are some simple unambiguous cases that should be > > allowed (Y+M, Y+Q, M+Q, H+D...) > > Do you mean using the most precise unit for operations with "near > enough", different units? I see the point, but what makes me doubt > about it is giving the user the false impression that the most precise > unit is *always* expected. I'd rather spare the user as many surprises > as possible, by simplifying rules in favour of explicitness (but that > may be debated). Let me rephrase: Adding different relative time units should be allowed when there's no ambiguity on the output: For example, a relative year timedelta is always 12 month timedeltas, or 4 quarter timedeltas. In that case, I should be able to do: >>>numpy.ones(3, dtype="t8[Y]") + 3*numpy.ones(3, dtype="t8[M]") array([15,15,15], dtype="t8['M']") >>>numpy.ones(3, dtype="t8[Y]") + 3*numpy.ones(3, dtype="t8[Q]") array([7,7,7], dtype="t8['Q']") Similarly: * an hour is always 3600s, so I could add relative s/ms/us/ns timedeltas to hour timedeltas, and get the result in s/ms/us/ns. * A day is always 24h, so I could add relative hours and days timedeltas and get an hour timedelta * A week is always 7d, so W+D -> D However: * We can't tell beforehand how many days are in any month, so adding relative days and months would raise an exception. * Same thing with weeks and months/quarters/years There'll be only a limited number of time units, therefore a limited number of potential combinations between time units. It'd be just a matter of listing which ones are allowed and which ones will raise an exception. > > > Note: we refused to use the ``.astype()`` method because of the > > > additional 'time_reference' parameter that will sound strange for other > > > typical uses of ``.astype()``. > > > > A method would be really, really helpful, though... > > [...] > > Yay, but what doesn't seem to fit for me is that the method would only > make sense for time values. Well, what about a .tounit(new_unit, reference=None) ? By default, the reference would be None and default to the POSIX epoch. We could also go for .totunit (for to time unit) > NumPy is pretty orthogonal in that every > method and attribute applies to every type. However, if "units" were to > be adopted by NumPy, the method would fit in well. In fact, we are > thinking of adding a ``unit`` attribute to dtypes to support time units > (being ``None`` for normal NumPy types). But full unit support in NumPy > looks so far away that I'm not sure about adopting the method. > > Thanks for the insights. Cheers, From myeates at jpl.nasa.gov Tue Jul 29 15:59:45 2008 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Tue, 29 Jul 2008 12:59:45 -0700 Subject: [Numpy-discussion] Core dump during numpy.test() In-Reply-To: <3d375d730807291238r404b77f0l1e8685cc52045567@mail.gmail.com> References: <488F6C7D.7040100@gemini.edu> <3d375d730807291238r404b77f0l1e8685cc52045567@mail.gmail.com> Message-ID: <488F76B1.8060200@jpl.nasa.gov> I'm getting this too Ticket #652 ...
ok Ticket 662.Segmentation fault Robert Kern wrote: > On Tue, Jul 29, 2008 at 14:16, James Turner wrote: > >> I have built NumPy 1.1.0 on RedHat Enterprise 3 (Linux 2.4.21 >> with gcc 3.2.3 and glibc 2.3.2) and Python 2.5.1. When I run >> numpy.test() I get a core dump, as follows. I haven't noticed >> any special errors during the build. Should I post the entire >> terminal output from "python setup.py install"? Maybe as an >> attachment? Let me know if I can provide any more info. >> > > Can you do > > numpy.test(verbosity=2) > > ? That will print out the name of the test before running it, so we > will know exactly which test caused the core dump. > > A gdb backtrace would also help. > > From charlesr.harris at gmail.com Tue Jul 29 16:00:23 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 29 Jul 2008 14:00:23 -0600 Subject: [Numpy-discussion] Core dump during numpy.test() In-Reply-To: <488F6C7D.7040100@gemini.edu> References: <488F6C7D.7040100@gemini.edu> Message-ID: On Tue, Jul 29, 2008 at 1:16 PM, James Turner wrote: > I have built NumPy 1.1.0 on RedHat Enterprise 3 (Linux 2.4.21 > with gcc 3.2.3 and glibc 2.3.2) and Python 2.5.1. When I run > numpy.test() I get a core dump, as follows. I haven't noticed > any special errors during the build. Should I post the entire > terminal output from "python setup.py install"? Maybe as an > attachment? Let me know if I can provide any more info. > > Thanks a lot, > > James. > > --- > > [iraf at sbfirf01 DRSetupScripts]$ python > Python 2.5.1 (r251:54863, Jul 28 2008, 19:08:11) > [GCC 3.2.3 20030502 (Red Hat Linux 3.2.3-20)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import numpy > >>> numpy.test() > Numpy is installed in > /astro/iraf/i686/gempylocal/lib/python2.5/site-packages/numpy > Numpy version 1.1.0 > Python version 2.5.1 (r251:54863, Jul 28 2008, 19:08:11) [GCC 3.2.3 > 20030502 > (Red Hat Linux 3.2.3-20)] > Found 2/2 tests for numpy.core.tests.test_ufunc > Found 143/143 tests for numpy.core.tests.test_regression > Found 63/63 tests for numpy.core.tests.test_unicode > Found 7/7 tests for numpy.core.tests.test_scalarmath > Found 3/3 tests for numpy.core.tests.test_errstate > Found 16/16 tests for numpy.core.tests.test_umath > Found 12/12 tests for numpy.core.tests.test_records > Found 70/70 tests for numpy.core.tests.test_numeric > Found 18/18 tests for numpy.core.tests.test_defmatrix > Found 36/36 tests for numpy.core.tests.test_numerictypes > Found 286/286 tests for numpy.core.tests.test_multiarray > Found 3/3 tests for numpy.core.tests.test_memmap > Found 4/4 tests for numpy.distutils.tests.test_fcompiler_gnu > Found 5/5 tests for numpy.distutils.tests.test_misc_util > Found 2/2 tests for numpy.fft.tests.test_fftpack > Found 3/3 tests for numpy.fft.tests.test_helper > Found 15/15 tests for numpy.lib.tests.test_twodim_base > Found 1/1 tests for numpy.lib.tests.test_regression > Found 4/4 tests for numpy.lib.tests.test_polynomial > Found 43/43 tests for numpy.lib.tests.test_type_check > Found 1/1 tests for numpy.lib.tests.test_financial > Found 1/1 tests for numpy.lib.tests.test_machar > Found 53/53 tests for numpy.lib.tests.test_function_base > Found 6/6 tests for numpy.lib.tests.test_index_tricks > Found 15/15 tests for numpy.lib.tests.test_io > Found 10/10 tests for numpy.lib.tests.test_arraysetops > Found 1/1 tests for numpy.lib.tests.test_ufunclike > Found 5/5 tests for numpy.lib.tests.test_getlimits > Found 24/24 tests for numpy.lib.tests.test__datasource > Found 
49/49 tests for numpy.lib.tests.test_shape_base > Found 3/3 tests for numpy.linalg.tests.test_regression > Found 89/89 tests for numpy.linalg.tests.test_linalg > Found 36/36 tests for numpy.ma.tests.test_old_ma > Found 94/94 tests for numpy.ma.tests.test_core > Found 15/15 tests for numpy.ma.tests.test_extras > Found 17/17 tests for numpy.ma.tests.test_mrecords > Found 4/4 tests for numpy.ma.tests.test_subclassing > Found 7/7 tests for numpy.tests.test_random > Found 16/16 tests for numpy.testing.tests.test_utils > Found 5/5 tests for numpy.tests.test_ctypeslib > > ..................................................................................................................................................................................................................................................................................................Floating > exception (core dumped) > Are you using ATLAS? If so, where did you get it and what cpu do you have? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jturner at gemini.edu Tue Jul 29 16:04:45 2008 From: jturner at gemini.edu (James Turner) Date: Tue, 29 Jul 2008 16:04:45 -0400 Subject: [Numpy-discussion] Core dump during numpy.test() References: 488F6C7D.7040100@gemini.edu Message-ID: <488F77DD.8040508@gemini.edu> Thanks, Robert. > Can you do > > numpy.test(verbosity=2) OK. Here is the line that fails: check_matvec (numpy.core.tests.test_numeric.TestDot)Floating exception (core dumped) > A gdb backtrace would also help. OK. I'm pretty ignorant about using debuggers, but I did "gdb python core.23696" and got the following. Does that help? Thanks, James. --- GNU gdb Red Hat Linux (5.3.90-0.20030710.40rh) Copyright 2003 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db library "/lib/tls/libthread_db.so.1". Core was generated by `python numpytest.py'. Program terminated with signal 8, Arithmetic exception. Reading symbols from /lib/tls/libpthread.so.0...done. Loaded symbols for /lib/tls/libpthread.so.0 Reading symbols from /lib/libdl.so.2...done. Loaded symbols for /lib/libdl.so.2 Reading symbols from /lib/libutil.so.1...done. Loaded symbols for /lib/libutil.so.1 Reading symbols from /lib/tls/libm.so.6...done. Loaded symbols for /lib/tls/libm.so.6 Reading symbols from /lib/tls/libc.so.6...done. Loaded symbols for /lib/tls/libc.so.6 Reading symbols from /lib/ld-linux.so.2...done. Loaded symbols for /lib/ld-linux.so.2 Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/site-packages/numpy/core/multiarray.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/site-packages/numpy/core/multiarray.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/site-packages/numpy/core/umath.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/site-packages/numpy/core/umath.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/strop.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/strop.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/site-packages/numpy/core/_sort.so...done. 
Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/site-packages/numpy/core/_sort.so ---Type to continue, or q to quit--- Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/site-packages/numpy/core/_dotblas.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/site-packages/numpy/core/_dotblas.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/cPickle.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/cPickle.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/cStringIO.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/cStringIO.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/parser.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/parser.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/_struct.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/_struct.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/operator.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/operator.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/itertools.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/itertools.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/collections.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/collections.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/mmap.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/mmap.soReading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/site-packages/numpy/core/scalarmath.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/site-packages/numpy/core/scalarmath.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/math.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/math.soReading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/site-packages/num---Type to continue, or q to quit--- py/lib/_compiled_base.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/site-packages/numpy/lib/_compiled_base.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/time.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/time.soReading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/binascii.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/binascii.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/_random.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/_random.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/fcntl.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/fcntl.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/_hashlib.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/_hashlib.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/_socket.so...done. 
Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/_socket.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/_ssl.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/_ssl.soReading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/_bisect.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/_bisect.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/bz2.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/bz2.so Reading symbols from /usr/lib/libbz2.so.1...done. Loaded symbols for /usr/lib/libbz2.so.1 Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/zlib.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/zlib.soReading symbols from /usr/lib/libz.so.1...done. Loaded symbols for /usr/lib/libz.so.1 ---Type to continue, or q to quit--- Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/_heapq.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/_heapq.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/site-packages/numpy/linalg/lapack_lite.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/site-packages/numpy/linalg/lapack_lite.so Reading symbols from /usr/lib/libg2c.so.0...done. Loaded symbols for /usr/lib/libg2c.so.0 Reading symbols from /lib/libgcc_s.so.1...done. Loaded symbols for /lib/libgcc_s.so.1 Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/site-packages/numpy/fft/fftpack_lite.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/site-packages/numpy/fft/fftpack_lite.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/site-packages/numpy/random/mtrand.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/site-packages/numpy/random/mtrand.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/_ctypes.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/_ctypes.so Reading symbols from /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/_curses.so...done. Loaded symbols for /astro/iraf/i686/gempylocal/lib/python2.5/lib-dynload/_curses.so Reading symbols from /usr/lib/libncursesw.so.5...done. Loaded symbols for /usr/lib/libncursesw.so.5 Reading symbols from /usr/lib/libgpm.so.1...done. Loaded symbols for /usr/lib/libgpm.so.1 Reading symbols from /usr/lib/libncurses.so.5...done. Loaded symbols for /usr/lib/libncurses.so.5 #0 0xb6ef587d in ATL_dgemvT_a1_x1_b0_y1_gemvT_1_3_16 () from /astro/iraf/i686/gempylocal/lib/python2.5/site-packages/numpy/core/_dotblas.so From zachary.pincus at yale.edu Tue Jul 29 16:05:37 2008 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Tue, 29 Jul 2008 16:05:37 -0400 Subject: [Numpy-discussion] Numpy 1.1.0 (+ PIL 1.1.6) crashes on large datasets In-Reply-To: References: Message-ID: > I've managed to crash numpy+PIL when feeding it rather large images. > Please see the URL for a test image, script, and gdb stack trace. This > crashes on my box (Windows XP SP3) as well as on a linux box (the gdb > trace I've been provided with) and a Mac. Windows reports the crash to > happen in "multiarray.pyd"; the stack trace mentions the equivalent > file. > Unfortunately, I don't know how to fix this. Can I help somehow? > > -- Chris > > [1] http://cracki.ath.cx:10081/pub/numpy-pil-crash/ Hmm... 
I've opened this file with my homebrew image-IO tools, and I cannot provoke a segfault. (These tools are derived from PIL, but with many bugs related to the array interface fixed. I had submitted patches to the PIL mailing list, which, as usual, languished.) I wonder if the issue is with how the PIL is providing the buffer interface to numpy? Can you get the crash if you get the array into numpy through the image's tostring (or whatever) method, and then use numpy.fromstring? Zach PS. This is with a recent SVN version of numpy, on OS X 10.5.4. From jturner at gemini.edu Tue Jul 29 16:16:38 2008 From: jturner at gemini.edu (James Turner) Date: Tue, 29 Jul 2008 16:16:38 -0400 Subject: [Numpy-discussion] Core dump during numpy.test() In-Reply-To: References: <488F6C7D.7040100@gemini.edu> Message-ID: <488F7AA6.4070004@gemini.edu> > Are you using ATLAS? If so, where did you get it and what cpu do you have? Yes. I have Atlas 3.8.2. I think I got it from http://math-atlas.sourceforge.net. I also included Lapack 3.1.1 from Netlib when building it from source. This worked on another machine. According to /proc/cpuinfo, I have a quad-processor (or core?) Intel Xeon. It is running the Linux 2.4 kernel (I needed to build a load of software including NumPy with an older glibc so it will run on older client machines). Maybe I shouldn't use ATLAS for a server installation, since it won't be tuned well? We're trying to keep things uniform across our sites though. Thanks! James. From myeates at jpl.nasa.gov Tue Jul 29 16:21:37 2008 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Tue, 29 Jul 2008 13:21:37 -0700 Subject: [Numpy-discussion] Core dump during numpy.test() In-Reply-To: <488F7AA6.4070004@gemini.edu> References: <488F6C7D.7040100@gemini.edu> <488F7AA6.4070004@gemini.edu> Message-ID: <488F7BD1.3020005@jpl.nasa.gov> my set up is similar. Same cpu's. Except I am using atlas 3.9.1 and gcc 4.2.4 James Turner wrote: >> Are you using ATLAS? If so, where did you get it and what cpu do you have? >> > > Yes. I have Atlas 3.8.2. I think I got it from > http://math-atlas.sourceforge.net. I also included Lapack 3.1.1 > from Netlib when building it from source. This worked on another > machine. > > According to /proc/cpuinfo, I have a quad-processor (or core?) > Intel Xeon. It is running the Linux 2.4 kernel (I needed to build > a load of software including NumPy with an older glibc so it will > run on older client machines). Maybe I shouldn't use ATLAS for a > server installation, since it won't be tuned well? We're trying > to keep things uniform across our sites though. > > Thanks! > > James. > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From myeates at jpl.nasa.gov Tue Jul 29 16:21:22 2008 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Tue, 29 Jul 2008 13:21:22 -0700 Subject: [Numpy-discussion] Core dump during numpy.test() In-Reply-To: <3d375d730807291238r404b77f0l1e8685cc52045567@mail.gmail.com> References: <488F6C7D.7040100@gemini.edu> <3d375d730807291238r404b77f0l1e8685cc52045567@mail.gmail.com> Message-ID: <488F7BC2.5020809@jpl.nasa.gov> I am using an ATLAS 64 bit lapack 3.9.1. 
My cpu (4 cpus) ------------------------------------------------- processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 23 model name : Intel(R) Xeon(R) CPU X5460 @ 3.16GHz stepping : 6 cpu MHz : 3158.790 cache size : 6144 KB physical id : 0 siblings : 4 core id : 0 cpu cores : 4 fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm bogomips : 6321.80 clflush size : 64 cache_alignment : 64 address sizes : 38 bits physical, 48 bits virtual power management: ----------------------------------------------------------- A system trace ends with futex(0xc600bb0, FUTEX_WAKE, 1) = 0 2655 write(2, ".", 1) = 1 2655 futex(0xc600bb0, FUTEX_WAKE, 1) = 0 2655 futex(0xc600bb0, FUTEX_WAKE, 1) = 0 2655 futex(0xc600bb0, FUTEX_WAKE, 1) = 0 2655 futex(0xc600bb0, FUTEX_WAKE, 1) = 0 2655 futex(0xc600bb0, FUTEX_WAKE, 1) = 0 2655 futex(0xc600bb0, FUTEX_WAKE, 1) = 0 2655 --- SIGSEGV (Segmentation fault) @ 0 (0) --- 2655 +++ killed by SIGSEGV +++ ---------------------------------------- I get no core file Robert Kern wrote: > On Tue, Jul 29, 2008 at 14:16, James Turner wrote: > >> I have built NumPy 1.1.0 on RedHat Enterprise 3 (Linux 2.4.21 >> with gcc 3.2.3 and glibc 2.3.2) and Python 2.5.1. When I run >> numpy.test() I get a core dump, as follows. I haven't noticed >> any special errors during the build. Should I post the entire >> terminal output from "python setup.py install"? Maybe as an >> attachment? Let me know if I can provide any more info. >> > > Can you do > > numpy.test(verbosity=2) > > ? That will print out the name of the test before running it, so we > will know exactly which test caused the core dump. > > A gdb backtrace would also help. > > From jturner at gemini.edu Tue Jul 29 16:48:20 2008 From: jturner at gemini.edu (James Turner) Date: Tue, 29 Jul 2008 16:48:20 -0400 Subject: [Numpy-discussion] Core dump during numpy.test() In-Reply-To: <488F7BD1.3020005@jpl.nasa.gov> References: <488F6C7D.7040100@gemini.edu> <488F7AA6.4070004@gemini.edu> <488F7BD1.3020005@jpl.nasa.gov> Message-ID: <488F8214.2070702@gemini.edu> Thanks everyone. I think I might try using the Netlib BLAS, since it's a server installation... but please let me know if you'd like me to troubleshoot this some more (the sooner the easier). James. From myeates at jpl.nasa.gov Tue Jul 29 16:54:06 2008 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Tue, 29 Jul 2008 13:54:06 -0700 Subject: [Numpy-discussion] Core dump during numpy.test() In-Reply-To: <488F8214.2070702@gemini.edu> References: <488F6C7D.7040100@gemini.edu> <488F7AA6.4070004@gemini.edu> <488F7BD1.3020005@jpl.nasa.gov> <488F8214.2070702@gemini.edu> Message-ID: <488F836E.3030007@jpl.nasa.gov> more info when /linalg.py(872)eigh() calls dsyevd I crash James Turner wrote: > Thanks everyone. I think I might try using the Netlib BLAS, since > it's a server installation... but please let me know if you'd like > me to troubleshoot this some more (the sooner the easier). > > James. 
> _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From charlesr.harris at gmail.com Tue Jul 29 17:26:18 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 29 Jul 2008 15:26:18 -0600 Subject: [Numpy-discussion] Core dump during numpy.test() In-Reply-To: <488F8214.2070702@gemini.edu> References: <488F6C7D.7040100@gemini.edu> <488F7AA6.4070004@gemini.edu> <488F7BD1.3020005@jpl.nasa.gov> <488F8214.2070702@gemini.edu> Message-ID: On Tue, Jul 29, 2008 at 2:48 PM, James Turner wrote: > Thanks everyone. I think I might try using the Netlib BLAS, since > it's a server installation... but please let me know if you'd like > me to troubleshoot this some more (the sooner the easier). > This smells like an ATLAS problem. You should send a note to Clint Whaley (the ATLAS guy). IIRC, ATLAS has some hand coded asm routines and it seems that support for these very new processors might be broken. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nmb at wartburg.edu Tue Jul 29 17:04:00 2008 From: nmb at wartburg.edu (Neil Martinsen-Burrell) Date: Tue, 29 Jul 2008 21:04:00 +0000 (UTC) Subject: [Numpy-discussion] FFT usage / consistency References: <200807281025.36338.felix@physik3.uni-rostock.de> <200807291436.23054.felix@physik3.uni-rostock.de> <9457e7c80807290545h393ba29cka4d274c59e5ef309@mail.gmail.com> <200807291504.41609.felix@physik3.uni-rostock.de> Message-ID: Felix Richter <felix at physik3.uni-rostock.de> writes: > > > Do your answers differ from the theory by a constant factor, or are > > they completely unrelated? > > No, it's more complicated. Below you'll find my most recent, more stripped > down code. > > - I don't know how to scale in a way that works for any n. > - I don't know how to get the oscillations to match. I suppose it's a problem > with the frequency scale, but usage of fftfreq() is straightforward... > - I don't know why the imaginary part of the FFT behaves so differently from the > real part. It should just be a matter of sin vs. cos. > > Is this voodoo? > > And I didn't find any example on the internet which tries just to reproduce an > analytic FT with the FFT... > > Thanks for your help! This is not voodoo, this is signal processing, which is itself harmonic analysis. Just because the Fast Fourier Transform is fast doesn't mean that this stuff is easy. You seem to be looking for a simple relationship between the Fourier Transform (an integral transform from L^2(R) -> L^2(R)) of a function f and the Discrete Fourier Transform (a linear transformation from R^n to R^n) of the vector of f sampled at regularly-spaced points. Such a simple relationship does not exist. That is why you found no such examples on the internet. The closest you might come is to study the surprisingly cogent explanation at http://en.wikipedia.org/wiki/Fourier_analysis of the differences between the various types of Fourier analysis. Remember that the DFT (as implemented by an FFT algorithm) is *not* an approximation to the Fourier transform, but rather a streamlined way of computing the coefficients of a Fourier series of a particular periodic function (that contains a finite number of Fourier modes). Rather than look for errors in the scaling factors or errors in your code, I think that you should try to expand your understanding of the (subtly) different types of Fourier representations. -Neil
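That caveat aside, there is one narrow case where the two do line up well enough to serve as a sanity check, and it may be the test Felix is after: a smooth pulse that decays to essentially zero inside the sampled window. There the DFT scaled as a Riemann sum approximates the Fourier integral; a sketch with made-up sampling parameters (the dt factor and the phase term accounting for where the first sample sits are the whole trick):

import numpy

# f(t) = exp(-pi t^2) has the known transform F(k) = exp(-pi k^2)
# under the convention F(k) = integral f(t) exp(-2j*pi*k*t) dt.
n, dt = 1024, 0.05
t = (numpy.arange(n) - n // 2) * dt        # samples centred on t = 0
f = numpy.exp(-numpy.pi * t ** 2)

k = numpy.fft.fftfreq(n, dt)
# Riemann sum of the integral: scale by dt, and correct the phase for
# the fact that the first sample sits at t[0] rather than at t = 0.
F = dt * numpy.exp(-2j * numpy.pi * k * t[0]) * numpy.fft.fft(f)

err = abs(F - numpy.exp(-numpy.pi * k ** 2)).max()
print err   # close to rounding error for this well-resolved pulse

The agreement here depends entirely on the Gaussian being effectively both time- and band-limited at this sampling; for functions with slow decay or sharp edges the discrepancy Neil describes shows up immediately.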
From myeates at jpl.nasa.gov Tue Jul 29 18:41:52 2008 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Tue, 29 Jul 2008 15:41:52 -0700 Subject: [Numpy-discussion] Core dump during numpy.test() In-Reply-To: References: <488F6C7D.7040100@gemini.edu> <488F7AA6.4070004@gemini.edu> <488F7BD1.3020005@jpl.nasa.gov> <488F8214.2070702@gemini.edu> Message-ID: <488F9CB0.5050607@jpl.nasa.gov> Charles R Harris wrote: > > > This smells like an ATLAS problem. I don't think so. I crash in a call to dsyevd, which is part of lapack but not atlas. Also, when I commented out the call to test_eigh_build I get zillions of errors like (look at the second one, warnings wasn't imported?)
> ====================================================================== > ERROR: check_single (numpy.linalg.tests.test_linalg.TestSVD) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/home/ossetest/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", > line 30, in check_single > self.do(a, b) > File > "/home/ossetest/lib/python2.5/site-packages/numpy/linalg/tests/test_linalg.py", > line 100, in do > u, s, vt = linalg.svd(a, 0) > File > "/home/ossetest/lib/python2.5/site-packages/numpy/linalg/linalg.py", > line 980, in svd > s = s.astype(_realType(result_t)) > ValueError: On entry to DLASD0 parameter number 9 had an illegal value > > ====================================================================== > ERROR: Tests polyfit > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/home/ossetest/lib/python2.5/site-packages/numpy/ma/tests/test_extras.py", > line 365, in test_polyfit > assert_almost_equal(polyfit(x,y,3),numpy.polyfit(x,y,3)) > File "/home/ossetest/lib/python2.5/site-packages/numpy/ma/extras.py", > line 882, in polyfit > warnings.warn("Polyfit may be poorly conditioned", np.RankWarning) > NameError: global name 'warnings' is not defined > > > >> You should seed a note to Clint Whaley (the ATLAS guy). IIRC, ATLAS >> has some hand coded asm routines and it seems that support for these >> very new processors might be broken. >> >> Chuck >> >> >> ------------------------------------------------------------------------ >> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From robert.kern at gmail.com Tue Jul 29 18:50:13 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 29 Jul 2008 17:50:13 -0500 Subject: [Numpy-discussion] Core dump during numpy.test() In-Reply-To: <488F9CB0.5050607@jpl.nasa.gov> References: <488F6C7D.7040100@gemini.edu> <488F7AA6.4070004@gemini.edu> <488F7BD1.3020005@jpl.nasa.gov> <488F8214.2070702@gemini.edu> <488F9CB0.5050607@jpl.nasa.gov> Message-ID: <3d375d730807291550t708189eayac345c59c17e3f0@mail.gmail.com> On Tue, Jul 29, 2008 at 17:41, Mathew Yeates wrote: > Charles R Harris wrote: >> >> >> This smells like an ATLAS problem. > I don't think so. I crash in a call to dsyevd which part of lapack but > not atlas. Also, when I commented out the call to test_eigh_build I get > zillions of errors like (look at the second one, warnings wasn't imported?) Fixed in SVN. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From myeates at jpl.nasa.gov Tue Jul 29 19:00:48 2008 From: myeates at jpl.nasa.gov (Mathew Yeates) Date: Tue, 29 Jul 2008 16:00:48 -0700 Subject: [Numpy-discussion] Core dump during numpy.test() In-Reply-To: <3d375d730807291550t708189eayac345c59c17e3f0@mail.gmail.com> References: <488F6C7D.7040100@gemini.edu> <488F7AA6.4070004@gemini.edu> <488F7BD1.3020005@jpl.nasa.gov> <488F8214.2070702@gemini.edu> <488F9CB0.5050607@jpl.nasa.gov> <3d375d730807291550t708189eayac345c59c17e3f0@mail.gmail.com> Message-ID: <488FA120.1000404@jpl.nasa.gov> What got fixed? Robert Kern wrote: > On Tue, Jul 29, 2008 at 17:41, Mathew Yeates wrote: > >> Charles R Harris wrote: >> >>> This smells like an ATLAS problem. >>> >> I don't think so. I crash in a call to dsyevd which part of lapack but >> not atlas. Also, when I commented out the call to test_eigh_build I get >> zillions of errors like (look at the second one, warnings wasn't imported?) >> > > Fixed in SVN. > > From robert.kern at gmail.com Tue Jul 29 19:02:48 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 29 Jul 2008 18:02:48 -0500 Subject: [Numpy-discussion] Core dump during numpy.test() In-Reply-To: <488FA120.1000404@jpl.nasa.gov> References: <488F6C7D.7040100@gemini.edu> <488F7AA6.4070004@gemini.edu> <488F7BD1.3020005@jpl.nasa.gov> <488F8214.2070702@gemini.edu> <488F9CB0.5050607@jpl.nasa.gov> <3d375d730807291550t708189eayac345c59c17e3f0@mail.gmail.com> <488FA120.1000404@jpl.nasa.gov> Message-ID: <3d375d730807291602n39086ca4q3a414185788e4a67@mail.gmail.com> On Tue, Jul 29, 2008 at 18:00, Mathew Yeates wrote: > What got fixed? >>>(look at the second one, warnings wasn't imported?) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jturner at gemini.edu Tue Jul 29 19:08:34 2008 From: jturner at gemini.edu (James Turner) Date: Tue, 29 Jul 2008 19:08:34 -0400 Subject: [Numpy-discussion] FFT usage / consistency In-Reply-To: References: <200807281025.36338.felix@physik3.uni-rostock.de> <200807291436.23054.felix@physik3.uni-rostock.de> <9457e7c80807290545h393ba29cka4d274c59e5ef309@mail.gmail.com> <200807291504.41609.felix@physik3.uni-rostock.de> Message-ID: <488FA2F2.4080803@gemini.edu> > Rather than look for errors in the scaling factors or errors in your code, I > think that you should try to expand your understanding of the (subtly) different > types of Fourier representations. I'd strongly recommend "The Fourier Transform and its Applications" by Bracewell, if that helps. James. From jturner at gemini.edu Tue Jul 29 19:43:05 2008 From: jturner at gemini.edu (James Turner) Date: Tue, 29 Jul 2008 19:43:05 -0400 Subject: [Numpy-discussion] Core dump during numpy.test() In-Reply-To: References: <488F6C7D.7040100@gemini.edu> <488F7AA6.4070004@gemini.edu> <488F7BD1.3020005@jpl.nasa.gov> <488F8214.2070702@gemini.edu> Message-ID: <488FAB09.9060708@gemini.edu> > This smells like an ATLAS problem. You should seed a note to Clint > Whaley (the ATLAS guy). IIRC, ATLAS has some hand coded asm routines and > it seems that support for these very new processors might be broken. I believe the machine is a couple of years old, though it's a fairly high-end workstation. Anyway, I have submitted an ATLAS support request so they're aware of it: https://sourceforge.net/tracker/index.php?func=detail&aid=2032011&group_id=23725&atid=379483 Cheers, James. 
From david at ar.media.kyoto-u.ac.jp Tue Jul 29 21:17:04 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 30 Jul 2008 10:17:04 +0900 Subject: [Numpy-discussion] Core dump during numpy.test() In-Reply-To: <488F7BD1.3020005@jpl.nasa.gov> References: <488F6C7D.7040100@gemini.edu> <488F7AA6.4070004@gemini.edu> <488F7BD1.3020005@jpl.nasa.gov> Message-ID: <488FC110.1090000@ar.media.kyoto-u.ac.jp> Mathew Yeates wrote: > my set up is similar. Same cpu's. Except I am using atlas 3.9.1 and gcc > 4.2.4 > ATLAS 3.9.1 is a development version, and is not supposed to be used for production use. Please use 3.8.2 if you want to build your own atlas, cheers, David From david at ar.media.kyoto-u.ac.jp Tue Jul 29 22:29:20 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 30 Jul 2008 11:29:20 +0900 Subject: [Numpy-discussion] Why are all fortran compilers looked for when --fcompiler=something is given ? Message-ID: <488FD200.3020606@ar.media.kyoto-u.ac.jp> Hi, While building numpy in wine, I got some errors in distutils when initializing the fortran compilers. I build numpy with: wine python setup.py build -c mingw32 --fcompiler=gnu95 And I got an error in load_all_fcompiler_classes when it tries to load the Compaq visual compiler. I don't think it has anything to do with wine, but rather that in python 2.6, running MSVCCompiler().initialize() can raise an IOError (python 2.6 uses a new method to lookd for msvc compilers, based on the existence of a bat file, which I don't have on my wine installation; I can check this on Windows, but that would be ackward since I would need to uninstall visual studio first...). I could catch the exception in the CompaqVisualCompiler, but I don't understand why this class is loaded at all. It also explains a warning I never quite understand before about "one should fix me in fcompiler/compaq.py" on windows whereas I have never used this compiler. Is this something we should "fix" ? Or just let alone to avoid breaking anything ? cheers, David From Anthony.Kong at macquarie.com Wed Jul 30 00:10:50 2008 From: Anthony.Kong at macquarie.com (Anthony Kong) Date: Wed, 30 Jul 2008 14:10:50 +1000 Subject: [Numpy-discussion] Example of numpy cov() not correct? Message-ID: Hi, all, I am trying out the example here (http://www.scipy.org/Numpy_Example_List_With_Doc#cov) >>> from numpy import * ... >>> T = array([1.3, 4.5, 2.8, 3.9]) >>> P = array([2.7, 8.7, 4.7, 8.2]) >>> cov(T,P) The answer is supposed to be 3.9541666666666657 The result I got is instead a cov matrix array([[ 1.97583333, 3.95416667], [ 3.95416667, 8.22916667]]) So, I just wanna confirm this particular example may be no longer correct. I am using python 2.4.3 with numpy 1.1.0 on MS win Cheers, Anthony NOTICE This e-mail and any attachments are confidential and may contain copyright material of Macquarie Group Limited or third parties. If you are not the intended recipient of this email you should not read, print, re-transmit, store or act in reliance on this e-mail or any attachments, and should destroy all copies of them. Macquarie Group Limited does not guarantee the integrity of any emails or any attached files. The views or opinions expressed are the author's own and may not reflect the views or opinions of Macquarie Group Limited. 
From felix at physik3.uni-rostock.de Wed Jul 30 03:55:17 2008 From: felix at physik3.uni-rostock.de (Felix Richter) Date: Wed, 30 Jul 2008 09:55:17 +0200 Subject: [Numpy-discussion] FFT usage / consistency In-Reply-To: <200807291504.41609.felix@physik3.uni-rostock.de> References: <200807281025.36338.felix@physik3.uni-rostock.de> <9457e7c80807290545h393ba29cka4d274c59e5ef309@mail.gmail.com> <200807291504.41609.felix@physik3.uni-rostock.de> Message-ID: <200807300955.17277.felix@physik3.uni-rostock.de> Thanks for all your comments. It's definitely time to read a good book now. My original problem is a convolution of two complex functions given as samples over quite different intervals with different n. The imaginary part of one of these functions is Lorentz-shaped. I thought it might be good to resample them in the frequency domain, then multiply and transform back. For the resampling I have to make sure the two resulting frequency axises are equivalent/physically meaningful. Also, of course, I wanted to understand what the NumPy/SciPy routines do and how to use them correctly. Now I'll try to just resample in the time domain, transform without looking at the result, then blindly multiply and transform back. This seems to work, but I'll have to find a different testcase so I can make sure the results are trustworthy. Felix From ivan at selidor.net Wed Jul 30 04:06:58 2008 From: ivan at selidor.net (Ivan Vilata i Balaguer) Date: Wed, 30 Jul 2008 10:06:58 +0200 Subject: [Numpy-discussion] The date/time dtype and the casting issue In-Reply-To: <200807291547.53370.pgmdevlist@gmail.com> References: <200807291512.53270.faltet@pytables.org> <200807291238.20161.pgmdevlist@gmail.com> <20080729191413.GD5600@tardis.terramar.selidor.net> <200807291547.53370.pgmdevlist@gmail.com> Message-ID: <20080730080658.GA8475@tardis.terramar.selidor.net> Pierre GM (el 2008-07-29 a les 15:47:52 -0400) va dir:: > On Tuesday 29 July 2008 15:14:13 Ivan Vilata i Balaguer wrote: > > Pierre GM (el 2008-07-29 a les 12:38:19 -0400) va dir:: > > > > Relative time versus relative time > > > > ---------------------------------- > > > > > > > > This case would be the same than the previous one (absolute vs > > > > absolute). Our proposal is to forbid this operation if the time units > > > > of the operands are different. > > > > > > Mmh, less sure on this one. Can't we use a hierarchy of time units, and > > > force to the lowest ? > > > > > > For example: > > > >>>numpy.ones(3, dtype="t8[Y]") + 3*numpy.ones(3, dtype="t8[M]") > > > >>>array([15,15,15], dtype="t8['M']") > > > > > > I agree that adding ns to years makes no sense, but ns to s ? min to > > > hr or days ? In short: systematically raising an exception looks a > > > bit too drastic. There are some simple unambiguous cases that sould be > > > allowed (Y+M, Y+Q, M+Q, H+D...) > > > > Do you mean using the most precise unit for operations with "near > > enough", different units? I see the point, but what makes me doubt > > about it is giving the user the false impression that the most precise > > unit is *always* expected. I'd rather spare the user as many surprises > > as possible, by simplifying rules in favour of explicitness (but that > > may be debated). > > Let me rephrase: > Adding different relative time units should be allowed when there's no > ambiguity on the output: > For example, a relative year timedelta is always 12 month timedeltas, or 4 > quarter timedeltas. 
In that case, I should be able to do: > > >>>numpy.ones(3, dtype="t8[Y]") + 3*numpy.ones(3, dtype="t8[M]") > array([15,15,15], dtype="t8['M']") > >>>numpy.ones(3, dtype="t8[Y]") + 3*numpy.ones(3, dtype="t8[Q]") > array([7,7,7], dtype="t8['Q']") > > Similarly: > * an hour is always 3600s, so I could add relative s/ms/us/ns timedeltas to > hour timedeltas, and get the result in s/ms/us/ns. > * A day is always 24h, so I could add relative hours and days timedeltas and > get an hour timedelta > * A week is always 7d, so W+D -> D > > However: > * We can't tell beforehand how much days are in any month, so adding relative > days and months would raise an exception. > * Same thing with weeks and months/quarters/years > > There'll be only a limited number of time units, therefore a limited number of > potential combinations between time units. It'd be just a matter of listing > which ones are allowed and which ones will raise an exception. That's "keep the precision" over "keep the range". At first Francesc and I opted for "keep the range" because that's what NumPy does, e.g. when operating an int64 with an uint64. Then, since we weren't sure about what the best choice would be for the majority of users, we decided upon letting (or forcing) the user to be explicit. However, the use of time units and integer values is precisely intended to "keep the precision", and overflow won't be so frequent given the correct time unit and the span of uint64, so you may be right in the end. :) > > > > Note: we refused to use the ``.astype()`` method because of the > > > > additional 'time_reference' parameter that will sound strange for other > > > > typical uses of ``.astype()``. > > > > > > A method would be really, really helpful, though... > > > [...] > > > > Yay, but what doesn't seem to fit for me is that the method would only > > have sense to time values. > > Well, what about a .tounit(new_unit, reference=None) ? > By default, the reference would be None and default to the POSIX epoch. > We could also go for .totunit (for to time unit) Yes, that'd be the signature for a method. The ``reference`` argument shouldn't be allowed for ``datetime64`` values (absolute times, no ambiguities) but it should be mandatory for ``timedelta64`` ones. Sorry, but I can't see the use of having a default reference, unless one wanted to work with Epoch-based deltas, which looks like an extremely particular case. Could you please show me a use case for having a reference defaulting to the POSIX epoch? Cheers, :: Ivan Vilata i Balaguer @ Intellectual Monopoly hinders Innovation! @ http://www.selidor.net/ @ http://www.nosoftwarepatents.com/ @ -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 307 bytes Desc: Digital signature URL: From faltet at pytables.org Wed Jul 30 06:35:32 2008 From: faltet at pytables.org (Francesc Alted) Date: Wed, 30 Jul 2008 12:35:32 +0200 Subject: [Numpy-discussion] The date/time dtype and the casting issue In-Reply-To: <20080730080658.GA8475@tardis.terramar.selidor.net> References: <200807291512.53270.faltet@pytables.org> <200807291547.53370.pgmdevlist@gmail.com> <20080730080658.GA8475@tardis.terramar.selidor.net> Message-ID: <200807301235.33045.faltet@pytables.org> A Wednesday 30 July 2008, Ivan Vilata i Balaguer escrigu?: > Pierre GM (el 2008-07-29 a les 15:47:52 -0400) va dir:: > > On Tuesday 29 July 2008 15:14:13 Ivan Vilata i Balaguer wrote: > > > Pierre GM (el 2008-07-29 a les 12:38:19 -0400) va dir:: > > > > > Relative time versus relative time > > > > > ---------------------------------- > > > > > > > > > > This case would be the same than the previous one (absolute > > > > > vs absolute). Our proposal is to forbid this operation if > > > > > the time units of the operands are different. > > > > > > > > Mmh, less sure on this one. Can't we use a hierarchy of time > > > > units, and force to the lowest ? > > > > > > > > For example: > > > > >>>numpy.ones(3, dtype="t8[Y]") + 3*numpy.ones(3, > > > > >>> dtype="t8[M]") array([15,15,15], dtype="t8['M']") > > > > > > > > I agree that adding ns to years makes no sense, but ns to s ? > > > > min to hr or days ? In short: systematically raising an > > > > exception looks a bit too drastic. There are some simple > > > > unambiguous cases that sould be allowed (Y+M, Y+Q, M+Q, H+D...) > > > > > > Do you mean using the most precise unit for operations with "near > > > enough", different units? I see the point, but what makes me > > > doubt about it is giving the user the false impression that the > > > most precise unit is *always* expected. I'd rather spare the > > > user as many surprises as possible, by simplifying rules in > > > favour of explicitness (but that may be debated). > > > > Let me rephrase: > > Adding different relative time units should be allowed when there's > > no ambiguity on the output: > > For example, a relative year timedelta is always 12 month > > timedeltas, or 4 > > > > quarter timedeltas. In that case, I should be able to do: > > >>>numpy.ones(3, dtype="t8[Y]") + 3*numpy.ones(3, dtype="t8[M]") > > > > array([15,15,15], dtype="t8['M']") > > > > >>>numpy.ones(3, dtype="t8[Y]") + 3*numpy.ones(3, dtype="t8[Q]") > > > > array([7,7,7], dtype="t8['Q']") > > > > Similarly: > > * an hour is always 3600s, so I could add relative s/ms/us/ns > > timedeltas to hour timedeltas, and get the result in s/ms/us/ns. > > * A day is always 24h, so I could add relative hours and days > > timedeltas and get an hour timedelta > > * A week is always 7d, so W+D -> D > > > > However: > > * We can't tell beforehand how much days are in any month, so > > adding relative days and months would raise an exception. > > * Same thing with weeks and months/quarters/years > > > > There'll be only a limited number of time units, therefore a > > limited number of potential combinations between time units. It'd > > be just a matter of listing which ones are allowed and which ones > > will raise an exception. > > That's "keep the precision" over "keep the range". At first Francesc > and I opted for "keep the range" because that's what NumPy does, e.g. > when operating an int64 with an uint64. 
Then, since we weren't sure > about what the best choice would be for the majority of users, we > decided upon letting (or forcing) the user to be explicit. However, > the use of time units and integer values is precisely intended to > "keep the precision", and overflow won't be so frequent given the > correct time unit and the span of uint64, so you may be right in the > end. :) Well, I do think that the "keep the precision" rule can be a quite sensible approach for this case, so I am in favor to it. Also, the Pierre suggestion of allowing automatic castings for all the time units except when the 'Y'ear and 'M'onth are involved makes a lot of sense too. I'll adopt these for the third version of the proposal then. > > > > > > Note: we refused to use the ``.astype()`` method because of > > > > > the additional 'time_reference' parameter that will sound > > > > > strange for other typical uses of ``.astype()``. > > > > > > > > A method would be really, really helpful, though... > > > > [...] > > > > > > Yay, but what doesn't seem to fit for me is that the method would > > > only have sense to time values. > > > > Well, what about a .tounit(new_unit, reference=None) ? > > By default, the reference would be None and default to the POSIX > > epoch. We could also go for .totunit (for to time unit) > > Yes, that'd be the signature for a method. The ``reference`` > argument shouldn't be allowed for ``datetime64`` values (absolute > times, no ambiguities) but it should be mandatory for ``timedelta64`` > ones. Sorry, but I can't see the use of having a default reference, > unless one wanted to work with Epoch-based deltas, which looks like > an extremely particular case. Could you please show me a use case > for having a reference defaulting to the POSIX epoch? Yeah, I agree with Ivan in that a default reference time makes little sense for general relative times. IMO, and provided that we will be allowing an implicit casting for most of time units for relative vs relative and in absolute vs relative, the use of forced casting will not be as frequent, and that a function would be enough. Having said that, I still see the merit of method for some situations, so I'll mention that in the third proposal as a possible improvement. -- Francesc Alted From thorstenkranz at googlemail.com Wed Jul 30 06:51:53 2008 From: thorstenkranz at googlemail.com (Thorsten Kranz) Date: Wed, 30 Jul 2008 12:51:53 +0200 Subject: [Numpy-discussion] Memmap and other read/write operations In-Reply-To: References: Message-ID: Hi there, I have a question concerning numpy.memmap. I'm working with a binary format, consisting of a header of certain size (1024 byte) in the beginning and a 2d-float32 array afterwards. I would like to open the array-part using a memmap-object using mm = n.memmap("test.dat",dtype=n.float32,offset=1024,shape=(2000,32),mode="r+") This works fine. My question is now, if I can in the mean time securely open the file for custom writing by using f = open("test.dat", "r+") or will there be problems? Is there another possibility to do custom writing to the header-part? Thanks, Thorsten -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Wed Jul 30 10:04:55 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 30 Jul 2008 07:04:55 -0700 Subject: [Numpy-discussion] Example of numpy cov() not correct? 
From kwgoodman at gmail.com  Wed Jul 30 10:04:55 2008
From: kwgoodman at gmail.com (Keith Goodman)
Date: Wed, 30 Jul 2008 07:04:55 -0700
Subject: [Numpy-discussion] Example of numpy cov() not correct?
In-Reply-To: 
References: 
Message-ID: 

On Tue, Jul 29, 2008 at 9:10 PM, Anthony Kong wrote:
> I am trying out the example here
> (http://www.scipy.org/Numpy_Example_List_With_Doc#cov)
>
>>>> from numpy import *
> ...
>>>> T = array([1.3, 4.5, 2.8, 3.9])
>>>> P = array([2.7, 8.7, 4.7, 8.2])
>>>> cov(T,P)
>
> The answer is supposed to be 3.9541666666666657
>
> The result I got is instead a cov matrix
> array([[ 1.97583333,  3.95416667],
>        [ 3.95416667,  8.22916667]])
> So, I just wanna confirm this particular example may be no longer
> correct.
>
> I am using python 2.4.3 with numpy 1.1.0 on MS win

It works for me (1.1 on GNU/Linux):

>> import numpy as np
>> T = np.array([1.3, 4.5, 2.8, 3.9])
>> P = np.array([2.7, 8.7, 4.7, 8.2])
>> np.cov(T,P)

array([[ 1.97583333,  3.95416667],
       [ 3.95416667,  8.22916667]])

From nadavh at visionsense.com  Wed Jul 30 11:49:10 2008
From: nadavh at visionsense.com (Nadav Horesh)
Date: Wed, 30 Jul 2008 18:49:10 +0300
Subject: [Numpy-discussion] Example of numpy cov() not correct?
References: 
Message-ID: <710F2847B0018641891D9A216027636029C1F1@ex3.envision.co.il>

If you read the cov function documentation you'll see that if a second
vector is given, it joins the two into one matrix and calculates the
covariance of that.  In your case, you are looking for the
off-diagonal elements.

  Nadav.

-----Original Message-----
From: numpy-discussion-bounces at scipy.org on behalf of Keith Goodman
Sent: Wed 30-Jul-08 17:04
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Example of numpy cov() not correct?

On Tue, Jul 29, 2008 at 9:10 PM, Anthony Kong wrote:
> I am trying out the example here
> (http://www.scipy.org/Numpy_Example_List_With_Doc#cov)
>
>>>> from numpy import *
> ...
>>>> T = array([1.3, 4.5, 2.8, 3.9])
>>>> P = array([2.7, 8.7, 4.7, 8.2])
>>>> cov(T,P)
>
> The answer is supposed to be 3.9541666666666657
>
> The result I got is instead a cov matrix
> array([[ 1.97583333,  3.95416667],
>        [ 3.95416667,  8.22916667]])
> So, I just wanna confirm this particular example may be no longer
> correct.
>
> I am using python 2.4.3 with numpy 1.1.0 on MS win

It works for me (1.1 on GNU/Linux):

>> import numpy as np
>> T = np.array([1.3, 4.5, 2.8, 3.9])
>> P = np.array([2.7, 8.7, 4.7, 8.2])
>> np.cov(T,P)

array([[ 1.97583333,  3.95416667],
       [ 3.95416667,  8.22916667]])

_______________________________________________
Numpy-discussion mailing list
Numpy-discussion at scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: winmail.dat
Type: application/ms-tnef
Size: 3587 bytes
Desc: not available
URL: 
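A short sketch to make the off-diagonal point concrete (the indexing
assumes the 2x2 layout that numpy.cov produces for two 1-d inputs):

    import numpy as np

    T = np.array([1.3, 4.5, 2.8, 3.9])
    P = np.array([2.7, 8.7, 4.7, 8.2])

    C = np.cov(T, P)   # 2x2 matrix: variances on the diagonal,
                       # cross-covariance off the diagonal
    print C[0, 1]      # 3.95416666... -- the scalar the old example showed
    print np.allclose(C[0, 1], C[1, 0])   # True: the matrix is symmetric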
From pgmdevlist at gmail.com  Wed Jul 30 12:11:10 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Wed, 30 Jul 2008 12:11:10 -0400
Subject: [Numpy-discussion] The date/time dtype and the casting issue
In-Reply-To: <200807301235.33045.faltet@pytables.org>
References: <200807291512.53270.faltet@pytables.org>
	<20080730080658.GA8475@tardis.terramar.selidor.net>
	<200807301235.33045.faltet@pytables.org>
Message-ID: <200807301211.11208.pgmdevlist@gmail.com>

On Wednesday 30 July 2008 06:35:32 Francesc Alted wrote:
> On Wednesday 30 July 2008, Ivan Vilata i Balaguer wrote:
> > Pierre GM (on 2008-07-29 at 15:47:52 -0400) wrote::
> > > On Tuesday 29 July 2008 15:14:13 Ivan Vilata i Balaguer wrote:
> > > > Pierre GM (on 2008-07-29 at 12:38:19 -0400) wrote::

[Pierre]
> > > Well, what about a .tounit(new_unit, reference=None)?
> > > By default, the reference would be None and default to the POSIX
> > > epoch.  We could also go for .totunit (for "to time unit").

[Ivan]
> > Yes, that'd be the signature for a method.  The ``reference``
> > argument shouldn't be allowed for ``datetime64`` values (absolute
> > times, no ambiguities) but it should be mandatory for
> > ``timedelta64`` ones.  Sorry, but I can't see the use of having a
> > default reference, unless one wanted to work with Epoch-based
> > deltas, which looks like an extremely particular case.  Could you
> > please show me a use case for having a reference defaulting to the
> > POSIX epoch?

[Francesc]
> Yeah, I agree with Ivan that a default reference time makes little
> sense for general relative times.  IMO, provided that we will be
> allowing implicit casting for most time units, both for relative vs
> relative and for absolute vs relative operations, forced casting
> will not be needed as frequently, and a function would be enough.
> Having said that, I still see the merit of a method for some
> situations, so I'll mention that in the third proposal as a possible
> improvement.

In my mind, .tounit(*args) should be available for both relative
(timedeltas) and absolute (datetime) times.  I agree that for relative
times, a default reference is meaningless.  However, for absolute
times, there's only one possible reference, the POSIX epoch, right?
Now, what format do you consider for this reference?
Moreover, could you give some more examples of interaction between
datetime and timedelta?

From faltet at pytables.org  Wed Jul 30 12:35:26 2008
From: faltet at pytables.org (Francesc Alted)
Date: Wed, 30 Jul 2008 18:35:26 +0200
Subject: [Numpy-discussion] The date/time dtype and the casting issue
In-Reply-To: <200807301211.11208.pgmdevlist@gmail.com>
References: <200807291512.53270.faltet@pytables.org>
	<200807301235.33045.faltet@pytables.org>
	<200807301211.11208.pgmdevlist@gmail.com>
Message-ID: <200807301835.27197.faltet@pytables.org>

On Wednesday 30 July 2008, Pierre GM wrote:
> On Wednesday 30 July 2008 06:35:32 Francesc Alted wrote:
> > On Wednesday 30 July 2008, Ivan Vilata i Balaguer wrote:
> > > Pierre GM (on 2008-07-29 at 15:47:52 -0400) wrote::
> > > > On Tuesday 29 July 2008 15:14:13 Ivan Vilata i Balaguer wrote:
> > > > > Pierre GM (on 2008-07-29 at 12:38:19 -0400) wrote::
>
> [Pierre]
> > > > Well, what about a .tounit(new_unit, reference=None)?
> > > > By default, the reference would be None and default to the
> > > > POSIX epoch.  We could also go for .totunit (for "to time
> > > > unit").
>
> [Ivan]
> > > Yes, that'd be the signature for a method.  The ``reference``
> > > argument shouldn't be allowed for ``datetime64`` values (absolute
> > > times, no ambiguities) but it should be mandatory for
> > > ``timedelta64`` ones.  Sorry, but I can't see the use of having a
> > > default reference, unless one wanted to work with Epoch-based
> > > deltas, which looks like an extremely particular case.  Could you
> > > please show me a use case for having a reference defaulting to
> > > the POSIX epoch?
>
> [Francesc]
> > Yeah, I agree with Ivan that a default reference time makes little
> > sense for general relative times.  IMO, provided that we will be
> > allowing implicit casting for most time units, both for relative
> > vs relative and for absolute vs relative operations, forced
> > casting will not be needed as frequently, and a function would be
> > enough.
> > Having said that, I still see the merit of a method for some
> > situations, so I'll mention that in the third proposal as a
> > possible improvement.
>
> In my mind, .tounit(*args) should be available for both relative
> (timedeltas) and absolute (datetime) times.

Well, what we are proposing is that the conversion time unit method
for absolute times would be '.astype()', because its semantics is
respected in this case.  The problem is with relative times, and only
with conversions between years or months and the rest of time units.
This is why I propose the adoption of just a humble function for these
cases.  Introducing a method (.tounit()) on the ndarray object that is
only useful for the date/time types seems a bit too much to my eyes
(but I can be wrong, indeed).

> I agree that for relative times, a default reference is meaningless.
> However, for absolute times, there's only one possible reference,
> the POSIX epoch, right?

That's correct.

> Now, what format do you consider for this reference?

Whatever can be converted into a datetime64 scalar.  Some examples:

ref = '2001-04-01'
ref = datetime.datetime(2001, 4, 1)

> Moreover, could you give some more examples of interaction between
> datetime and timedelta?
In the second proposal there are some examples of this interaction,
and I'm populating the third proposal with more examples right now.
Just wait a bit (maybe a couple of hours) to see the new proposal.

Cheers,

-- 
Francesc Alted

From pgmdevlist at gmail.com  Wed Jul 30 12:54:39 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Wed, 30 Jul 2008 12:54:39 -0400
Subject: [Numpy-discussion] The date/time dtype and the casting issue
In-Reply-To: <200807301835.27197.faltet@pytables.org>
References: <200807291512.53270.faltet@pytables.org>
	<200807301235.33045.faltet@pytables.org>
	<200807301211.11208.pgmdevlist@gmail.com>
	<200807301835.27197.faltet@pytables.org>
Message-ID: <777651ce0807300954s1a8b530ciaf53ce76e6d1996f@mail.gmail.com>

On Wed, Jul 30, 2008 at 12:35 PM, Francesc Alted wrote:
> On Wednesday 30 July 2008, Pierre GM wrote:
> > In my mind, .tounit(*args) should be available for both relative
> > (timedeltas) and absolute (datetime) times.
>
> Well, what we are proposing is that the conversion time unit method
> for absolute times would be '.astype()', because its semantics is
> respected in this case.

OK

> The problem is with relative times, and only with conversions
> between years or months and the rest of time units.  This is why I
> propose the adoption of just a humble function for these cases.

OK

> Introducing a method (.tounit()) on the ndarray object that is only
> useful for the date/time types seems a bit too much to my eyes (but
> I can be wrong, indeed).

Ohoh, I see...  I was still thinking in terms of subclassing ndarray
with a timedelta class, where such a method would have made sense.  In
fact, you're talking about the dtype.  Well, of course, in that case
it makes sense not to have an extra method.  We can always implement
it in Date/DateArray.

> > Now, what format do you consider for this reference ?
>
> Whatever can be converted into a datetime64 scalar.  Some examples:
>
> ref = '2001-04-01'
> ref = datetime.datetime(2001, 4, 1)

Er, should I see ref as having a 'day' unit or 'business day' unit in
that case?  I know that 'business days' spoil the game, but Matt
really needs them, so...

> > Moreover, could you give some more examples of interaction between
> > datetime and timedelta ?
>
> In the second proposal there are some examples of this interaction,
> and I'm populating the third proposal with more examples right now.
> Just wait a bit (maybe a couple of hours) to see the new proposal.

OK, with pleasure.  It's just that I have trouble understanding the
meaning of something like

t2 = numpy.ones(5, dtype="datetime64[s]")

That's five times one second after the epoch, right?  But in what
circumstances would you need t2?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From faltet at pytables.org  Wed Jul 30 13:16:25 2008
From: faltet at pytables.org (Francesc Alted)
Date: Wed, 30 Jul 2008 19:16:25 +0200
Subject: [Numpy-discussion] The date/time dtype and the casting issue
In-Reply-To: <777651ce0807300954s1a8b530ciaf53ce76e6d1996f@mail.gmail.com>
References: <200807291512.53270.faltet@pytables.org>
	<200807301835.27197.faltet@pytables.org>
	<777651ce0807300954s1a8b530ciaf53ce76e6d1996f@mail.gmail.com>
Message-ID: <200807301916.25824.faltet@pytables.org>

On Wednesday 30 July 2008, Pierre GM wrote:
> > > Now, what format do you consider for this reference ?
> >
> > Whatever can be converted into a datetime64 scalar.  Some examples:
> >
> > ref = '2001-04-01'
> > ref = datetime.datetime(2001, 4, 1)
>
> Er, should I see ref as having a 'day' unit or 'business day' unit
> in that case?  I know that 'business days' spoil the game, but Matt
> really needs them, so...

OK. I was wrong.  Of course you need to specify the resolution, so the
reference *should* be a NumPy scalar:

ref = numpy.datetime64('2001-04-01', unit="B")  # 'B'usiness days

> > > Moreover, could you give some more examples of interaction
> > > between datetime and timedelta ?
> >
> > In the second proposal there are some examples of this
> > interaction, and I'm populating the third proposal with more
> > examples right now.  Just wait a bit (maybe a couple of hours) to
> > see the new proposal.
>
> OK, with pleasure.  It's just that I have trouble understanding the
> meaning of something like
>
> t2 = numpy.ones(5, dtype="datetime64[s]")
>
> That's five times one second after the epoch, right?  But in what
> circumstances would you need t2?

I'm not sure I follow you.
This is just an example to produce an array of time objects quickly.
In general, you should also be able to produce the same result by
doing:

t2 = numpy.array(['1970-01-01T00:00:05', '1970-01-01T00:00:05',
                  '1970-01-01T00:00:05', '1970-01-01T00:00:05',
                  '1970-01-01T00:00:05'], dtype="datetime64[s]")

which is more visual, but has the drawback that it's just too long for
documenting purposes.  When you don't need the values for some
examples, conciseness is a virtue.

-- 
Francesc Alted

From pgmdevlist at gmail.com  Wed Jul 30 13:28:45 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Wed, 30 Jul 2008 13:28:45 -0400
Subject: [Numpy-discussion] The date/time dtype and the casting issue
In-Reply-To: <200807301916.25824.faltet@pytables.org>
References: <200807291512.53270.faltet@pytables.org>
	<777651ce0807300954s1a8b530ciaf53ce76e6d1996f@mail.gmail.com>
	<200807301916.25824.faltet@pytables.org>
Message-ID: <200807301328.46348.pgmdevlist@gmail.com>

On Wednesday 30 July 2008 13:16:25 Francesc Alted wrote:
> On Wednesday 30 July 2008, Pierre GM wrote:
> > It's just that I have trouble understanding the meaning of
> > something like
> > t2 = numpy.ones(5, dtype="datetime64[s]")
> >
> > That's five times one second after the epoch, right?  But in what
> > circumstances would you need t2?
>
> I'm not sure I follow you.  This is just an example to produce an
> array of time objects quickly.
> ...
> When you don't need the values for some examples, conciseness is a
> virtue.

I'd prefer something like np.arange(5, dtype=datetime64['s']), which
is both concise and still has a physical meaning I can wrap my mind
around.

Which brings me to another question:
datetime64 and timedelta64 are just dtypes, therefore they don't
impose any restriction (in terms of uniqueness of elements, ordering
of the elements...) on the underlying ndarray, right?

From tom.denniston at alum.dartmouth.org  Wed Jul 30 13:44:04 2008
From: tom.denniston at alum.dartmouth.org (Tom Denniston)
Date: Wed, 30 Jul 2008 12:44:04 -0500
Subject: [Numpy-discussion] The date/time dtype and the casting issue
In-Reply-To: <200807301916.25824.faltet@pytables.org>
References: <200807291512.53270.faltet@pytables.org>
	<200807301835.27197.faltet@pytables.org>
	<777651ce0807300954s1a8b530ciaf53ce76e6d1996f@mail.gmail.com>
	<200807301916.25824.faltet@pytables.org>
Message-ID: 

When people are referring to business days are you talking about
weekdays or are you saying weekday non-holidays?

On 7/30/08, Francesc Alted wrote:
> On Wednesday 30 July 2008, Pierre GM wrote:
> > > > Now, what format do you consider for this reference ?
> > >
> > > Whatever can be converted into a datetime64 scalar.  Some
> > > examples:
> > >
> > > ref = '2001-04-01'
> > > ref = datetime.datetime(2001, 4, 1)
> >
> > Er, should I see ref as having a 'day' unit or 'business day' unit
> > in that case?  I know that 'business days' spoil the game, but
> > Matt really needs them, so...
>
> OK. I was wrong.  Of course you need to specify the resolution, so
> the reference *should* be a NumPy scalar:
>
> ref = numpy.datetime64('2001-04-01', unit="B")  # 'B'usiness days
>
> > > > Moreover, could you give some more examples of interaction
> > > > between datetime and timedelta ?
> > >
> > > In the second proposal there are some examples of this
> > > interaction, and I'm populating the third proposal with more
> > > examples right now.  Just wait a bit (maybe a couple of hours)
> > > to see the new proposal.
> >
> > OK, with pleasure.  It's just that I have trouble understanding
> > the meaning of something like
> >
> > t2 = numpy.ones(5, dtype="datetime64[s]")
> >
> > That's five times one second after the epoch, right?  But in what
> > circumstances would you need t2?
>
> I'm not sure I follow you.  This is just an example to produce an
> array of time objects quickly.  In general, you should also be able
> to produce the same result by doing:
>
> t2 = numpy.array(['1970-01-01T00:00:05', '1970-01-01T00:00:05',
>                   '1970-01-01T00:00:05', '1970-01-01T00:00:05',
>                   '1970-01-01T00:00:05'], dtype="datetime64[s]")
>
> which is more visual, but has the drawback that it's just too long
> for documenting purposes.  When you don't need the values for some
> examples, conciseness is a virtue.
> --
> Francesc Alted
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

From faltet at pytables.org  Wed Jul 30 13:50:44 2008
From: faltet at pytables.org (Francesc Alted)
Date: Wed, 30 Jul 2008 19:50:44 +0200
Subject: [Numpy-discussion] The date/time dtype and the casting issue
In-Reply-To: <200807301328.46348.pgmdevlist@gmail.com>
References: <200807291512.53270.faltet@pytables.org>
	<200807301916.25824.faltet@pytables.org>
	<200807301328.46348.pgmdevlist@gmail.com>
Message-ID: <200807301950.44452.faltet@pytables.org>

On Wednesday 30 July 2008, Pierre GM wrote:
> Which brings me to another question:
> datetime64 and timedelta64 are just dtypes, therefore they don't
> impose any restriction (in terms of uniqueness of elements, ordering
> of the elements...) on the underlying ndarray, right?

That's right.  Perhaps this is the reason why you got mystified about
the numpy.ones(5, dtype="datetime64[s]") thing.

-- 
Francesc Alted

From faltet at pytables.org  Wed Jul 30 13:54:13 2008
From: faltet at pytables.org (Francesc Alted)
Date: Wed, 30 Jul 2008 19:54:13 +0200
Subject: [Numpy-discussion] The date/time dtype and the casting issue
In-Reply-To: 
References: <200807291512.53270.faltet@pytables.org>
	<200807301916.25824.faltet@pytables.org>
Message-ID: <200807301954.13802.faltet@pytables.org>

On Wednesday 30 July 2008, Tom Denniston wrote:
> When people are referring to business days are you talking about
> weekdays or are you saying weekday non-holidays?

Plain weekdays.  Taking into account holidays for the whole world
would certainly be much more complex than timezones, which are not
being considered in this proposal either.

-- 
Francesc Alted

From tom.denniston at alum.dartmouth.org  Wed Jul 30 14:12:45 2008
From: tom.denniston at alum.dartmouth.org (Tom Denniston)
Date: Wed, 30 Jul 2008 13:12:45 -0500
Subject: [Numpy-discussion] The date/time dtype and the casting issue
In-Reply-To: <200807301954.13802.faltet@pytables.org>
References: <200807291512.53270.faltet@pytables.org>
	<200807301916.25824.faltet@pytables.org>
	<200807301954.13802.faltet@pytables.org>
Message-ID: 

If it's really just weekdays why not call it that instead of using a
term like business days that (quite confusingly) suggests holidays are
handled properly?

Also, I view the timezone and holiday issues as totally separate.  I
would definitely NOT recommend basing holidays on a timezone because
holidays are totally unrelated to timezones.  Usually when you deal
with holidays, because they vary by application and country and change
over time, you provide a calendar as an outside input.  That would be
very useful if that were allowed, but might make the implementation
rather complex.

--Tom

On 7/30/08, Francesc Alted wrote:
> On Wednesday 30 July 2008, Tom Denniston wrote:
> > When people are referring to business days are you talking about
> > weekdays or are you saying weekday non-holidays?
>
> Plain weekdays.  Taking into account holidays for the whole world
> would certainly be much more complex than timezones, which are not
> being considered in this proposal either.
> --
> Francesc Alted
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

From faltet at pytables.org  Wed Jul 30 14:26:14 2008
From: faltet at pytables.org (Francesc Alted)
Date: Wed, 30 Jul 2008 20:26:14 +0200
Subject: [Numpy-discussion] A (third) proposal for implementing some
	date/time types in NumPy
Message-ID: <200807302026.14687.faltet@pytables.org>

Hi,

After several weeks of gathering and pondering very valuable feedback,
we are happy to release the third (and final?) version of the proposal
for the addition of date/time types to NumPy.

The bad news is that, due to a series of circumstances (apparently not
related to how this job was being done), Enthought is no longer
funding this project.  However, we decided to go ahead and publish
this in the hope that it could be useful in the future for someone
brave enough to be willing to do an implementation.  So, while I'll be
glad to answer doubts or questions about this proposal, it should be
clear that we (Ivan Vilata and myself) are not going to proceed with
the implementation phase (unless Enthought finds funding for
sponsoring this in the future).

Finally, I'd like to thank everybody who has been involved in the
making of this proposal.  Without their advice, suggestions and
criticism we would probably never have been able to realize all the
requirements and intricacies that a dtype needs in order to allow
date/time manipulation adequate for numerical purposes.

Cheers,

-- 
Francesc Alted

====================================================================
 A (third) proposal for implementing some date/time types in NumPy
====================================================================

:Author: Francesc Alted i Abad
:Contact: faltet at pytables.com
:Author: Ivan Vilata i Balaguer
:Contact: ivan at selidor.net
:Date: 2008-07-30


Executive summary
=================

A date/time mark is something very handy to have in many fields where
one has to deal with data sets.  While Python has several modules that
define a date/time type (like the integrated ``datetime`` [1]_ or
``mx.DateTime`` [2]_), NumPy lacks one.  In this document, we propose
the addition of a series of date/time types to fill this gap.  The
requirements for the proposed types are twofold: 1) they have to be
fast to operate with and 2) they have to be as compatible as possible
with the existing ``datetime`` module that comes with Python.


Types proposed
==============

To start with, it is virtually impossible to come up with a single
date/time type that fills the needs of every use case.  So, after
pondering different possibilities, we have settled on *two* different
types, namely ``datetime64`` and ``timedelta64`` (these names are
preliminary and can be changed), that can have different time units so
as to cover different needs.

.. Important:: the time unit is conceived here as metadata that
  *complements* a date/time dtype, *without changing the base type*.
  It provides information about the *meaning* of the stored numbers,
  not about their *structure*.

Now follows a detailed description of the proposed types.


``datetime64``
--------------

It represents a time that is absolute (i.e. not relative).  It is
implemented internally as an ``int64`` type.  The internal epoch is
the POSIX epoch (see [3]_).  Like POSIX, the representation of a date
doesn't take leap seconds into account.
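As an illustration of the representation (not part of the proposed
API): at one-second resolution the stored ``int64`` is simply the
count of seconds since 1970-01-01T00:00:00Z, which can be reproduced
with the standard library (``calendar.timegm`` keeps the computation
in UTC)::

    import calendar

    # Seconds elapsed since the POSIX epoch for 2008-07-30T17:31:00Z;
    # this is exactly the integer a datetime64[s] would store.
    value = calendar.timegm((2008, 7, 30, 17, 31, 0))
    print value   # 1217439060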
In time unit *conversions* and time *representations* (but not in
other time computations), the value -2**63 (0x8000000000000000) is
interpreted as an invalid or unknown date, *Not a Time* or *NaT*.  See
the section on time unit conversions for more information.

Time units
~~~~~~~~~~

It accepts different time units, each of them implying a different
time span.  The table below describes the time units supported with
their corresponding time spans.

======== ================ ==========================
 Time unit                 Time span (years)
------------------------- --------------------------
  Code       Meaning
======== ================ ==========================
   Y     year             [9.2e18 BC, 9.2e18 AC]
   M     month            [7.6e17 BC, 7.6e17 AC]
   W     week             [1.7e17 BC, 1.7e17 AC]
   B     business day     [3.5e16 BC, 3.5e16 AC]
   D     day              [2.5e16 BC, 2.5e16 AC]
   h     hour             [1.0e15 BC, 1.0e15 AC]
   m     minute           [1.7e13 BC, 1.7e13 AC]
   s     second           [2.9e11 BC, 2.9e11 AC]
   ms    millisecond      [2.9e8 BC, 2.9e8 AC]
   us    microsecond      [290301 BC, 294241 AC]
   ns    nanosecond       [1678 AC, 2262 AC]
======== ================ ==========================

The value of an absolute date is thus *an integer number of units of
the chosen time unit* passed since the internal epoch.  When working
with business days, Saturdays and Sundays are simply ignored from the
count (i.e. day 3 in business days is not Saturday 1970-01-03, but
Monday 1970-01-05).

Building a ``datetime64`` dtype
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The proposed ways to specify the time unit in the dtype constructor
are:

Using the long string notation::

    dtype('datetime64[us]')

Using the short string notation::

    dtype('T8[us]')

Note that a time unit should always be specified, as there is no
default.

Setting and getting values
~~~~~~~~~~~~~~~~~~~~~~~~~~

The objects with this dtype can be set in a series of ways::

    t = numpy.ones(3, dtype='T8[s]')
    t[0] = 1217439060  # assign to July 30th, 2008 at 17:31:00
    t[1] = datetime.datetime(2008, 7, 30, 17, 31, 01)  # with datetime module
    t[2] = '2008-07-30T17:31:02'  # with ISO 8601

And can be read in different ways too::

    str(t[0])  -->  2008-07-30T17:31:00
    repr(t[1])  -->  datetime64(1217439061, 's')
    str(t[0].item())  -->  2008-07-30 17:31:00  # datetime module object
    repr(t[0].item())  -->  datetime.datetime(2008, 7, 30, 17, 31)  # idem
    str(t)  -->  [2008-07-30T17:31:00  2008-07-30T17:31:01
                  2008-07-30T17:31:02]
    repr(t)  -->  array([1217439060, 1217439061, 1217439062],
                        dtype='datetime64[s]')

Comparisons
~~~~~~~~~~~

The comparisons will be supported too::

    numpy.array(['1980'], 'T8[Y]') == numpy.array(['1979'], 'T8[Y]')
    -->  [False]

or by applying broadcasting::

    numpy.array(['1979', '1980'], 'T8[Y]') == numpy.datetime64('1980', 'Y')
    -->  [False, True]

The next should work too::

    numpy.array(['1979', '1980'], 'T8[Y]') == '1980-01-01'
    -->  [False, True]

because the right hand expression can be broadcasted into an array of
2 elements of dtype 'T8[Y]'.

Compatibility issues
~~~~~~~~~~~~~~~~~~~~

This will be fully compatible with the ``datetime`` class of the
``datetime`` module of Python only when using a time unit of
microseconds.  For other time units, the conversion process will lose
precision or will overflow as needed.  The conversion from/to a
``datetime`` object doesn't take leap seconds into account.


``timedelta64``
---------------

It represents a time that is relative (i.e. not absolute).  It is
implemented internally as an ``int64`` type.
In time unit *conversions* and time *representations* (but not in
other time computations), the value -2**63 (0x8000000000000000) is
interpreted as an invalid or unknown time, *Not a Time* or *NaT*.  See
the section on time unit conversions for more information.

Time units
~~~~~~~~~~

It accepts different time units, each of them implying a different
time span.  The table below describes the time units supported with
their corresponding time spans.

======== ================ ==========================
 Time unit                 Time span
------------------------- --------------------------
  Code       Meaning
======== ================ ==========================
   Y     year             +- 9.2e18 years
   M     month            +- 7.6e17 years
   W     week             +- 1.7e17 years
   B     business day     +- 3.5e16 years
   D     day              +- 2.5e16 years
   h     hour             +- 1.0e15 years
   m     minute           +- 1.7e13 years
   s     second           +- 2.9e11 years
   ms    millisecond      +- 2.9e8 years
   us    microsecond      +- 2.9e5 years
   ns    nanosecond       +- 292 years
   ps    picosecond       +- 106 days
   fs    femtosecond      +- 2.6 hours
   as    attosecond       +- 9.2 seconds
======== ================ ==========================

The value of a time delta is thus *an integer number of units of the
chosen time unit*.

Building a ``timedelta64`` dtype
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The proposed ways to specify the time unit in the dtype constructor
are:

Using the long string notation::

    dtype('timedelta64[us]')

Using the short string notation::

    dtype('t8[us]')

Note that a time unit should always be specified, as there is no
default.

Setting and getting values
~~~~~~~~~~~~~~~~~~~~~~~~~~

The objects with this dtype can be set in a series of ways::

    t = numpy.ones(3, dtype='t8[ms]')
    t[0] = 12  # assign to 12 ms
    t[1] = datetime.timedelta(0, 0, 13000)  # 13 ms
    t[2] = '0:00:00.014'  # 14 ms

And can be read in different ways too::

    str(t[0])  -->  0:00:00.012
    repr(t[1])  -->  timedelta64(13, 'ms')
    str(t[0].item())  -->  0:00:00.012000  # datetime module object
    repr(t[0].item())  -->  datetime.timedelta(0, 0, 12000)  # idem
    str(t)  -->  [0:00:00.012  0:00:00.013  0:00:00.014]
    repr(t)  -->  array([12, 13, 14], dtype="timedelta64[ms]")

Comparisons
~~~~~~~~~~~

The comparisons will be supported too::

    numpy.array([12, 13, 14], 't8[ms]') == numpy.array([12, 13, 13], 't8[ms]')
    -->  [True, True, False]

or by applying broadcasting::

    numpy.array([12, 13, 14], 't8[ms]') == numpy.timedelta64(13, 'ms')
    -->  [False, True, False]

The next should work too::

    numpy.array([12, 13, 14], 't8[ms]') == '0:00:00.012'
    -->  [True, False, False]

because the right hand expression can be broadcasted into an array of
3 elements of dtype 't8[ms]'.

Compatibility issues
~~~~~~~~~~~~~~~~~~~~

This will be fully compatible with the ``timedelta`` class of the
``datetime`` module of Python only when using a time unit of
microseconds.  For other units, the conversion process will lose
precision or will overflow as needed.
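The spans in both tables follow directly from the ``int64`` range; a
quick sanity check (a sketch; the 365.25-day Julian year is an
assumption, and note that the tables truncate rather than round)::

    # Years covered by +-2**63 units of each sub-day size.
    SECONDS_PER_YEAR = 365.25 * 86400

    for code, unit_in_seconds in [('h', 3600), ('m', 60), ('s', 1),
                                  ('ms', 1e-3), ('us', 1e-6), ('ns', 1e-9)]:
        span_in_years = 2**63 * unit_in_seconds / SECONDS_PER_YEAR
        print '%-3s +- %.1e years' % (code, span_in_years)
    # h   +- 1.1e+15 years
    # m   +- 1.8e+13 years
    # s   +- 2.9e+11 years
    # ms  +- 2.9e+08 years
    # us  +- 2.9e+05 years
    # ns  +- 2.9e+02 years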
Examples of use
===============

Here is an example of use of ``datetime64``::

    In [5]: numpy.datetime64(42, 'us')
    Out[5]: datetime64(42, 'us')

    In [6]: print numpy.datetime64(42, 'us')
    1970-01-01T00:00:00.000042  # representation in ISO 8601 format

    In [7]: print numpy.datetime64(367.7, 'D')  # decimal part is lost
    1971-01-03  # still ISO 8601 format

    In [8]: numpy.datetime64('2008-07-18T12:23:18', 'm')  # from ISO 8601
    Out[8]: datetime64(20273063, 'm')

    In [9]: print numpy.datetime64('2008-07-18T12:23:18', 'm')
    2008-07-18T12:23

    In [10]: t = numpy.zeros(5, dtype="datetime64[ms]")

    In [11]: t[0] = datetime.datetime.now()  # setter in action

    In [12]: print t
    [2008-07-16T13:39:25.315  1970-01-01T00:00:00.000
     1970-01-01T00:00:00.000  1970-01-01T00:00:00.000
     1970-01-01T00:00:00.000]

    In [13]: repr(t)
    Out[13]: array([1216215565315, 0, 0, 0, 0], dtype="datetime64[ms]")

    In [14]: t[0].item()  # getter in action
    Out[14]: datetime.datetime(2008, 7, 16, 13, 39, 25, 315000)

    In [15]: print t.dtype
    dtype('datetime64[ms]')

And here is an example of use of ``timedelta64``::

    In [5]: numpy.timedelta64(10, 'us')
    Out[5]: timedelta64(10, 'us')

    In [6]: print numpy.timedelta64(10, 'us')
    0:00:00.000010

    In [7]: print numpy.timedelta64(3600.2, 'm')  # decimal part is lost
    2 days, 12:00

    In [8]: t1 = numpy.zeros(5, dtype="datetime64[ms]")

    In [9]: t2 = numpy.ones(5, dtype="datetime64[ms]")

    In [10]: t = t2 - t1

    In [11]: t[0] = datetime.timedelta(0, 24)  # setter in action

    In [12]: print t
    [0:00:24.000  0:00:00.001  0:00:00.001  0:00:00.001  0:00:00.001]

    In [13]: repr(t)
    Out[13]: array([24000, 1, 1, 1, 1], dtype="timedelta64[ms]")

    In [14]: t[0].item()  # getter in action
    Out[14]: datetime.timedelta(0, 24)

    In [15]: print t.dtype
    dtype('timedelta64[ms]')


Operating with date/time arrays
===============================

``datetime64`` vs ``datetime64``
--------------------------------

The only arithmetic operation allowed between absolute dates is the
subtraction::

    In [10]: numpy.ones(3, "T8[s]") - numpy.zeros(3, "T8[s]")
    Out[10]: array([1, 1, 1], dtype=timedelta64[s])

But not other operations::

    In [11]: numpy.ones(3, "T8[s]") + numpy.zeros(3, "T8[s]")
    TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and
    'numpy.ndarray'

Comparisons between absolute dates are allowed.

Casting rules
~~~~~~~~~~~~~

When operating (basically, only subtraction is allowed) on two
absolute times with different time units, the outcome is to raise an
exception, because the ranges and time spans of different time units
can be very different, and it is not clear at all which time unit the
user will prefer.  For example, this should be allowed::

    >>> numpy.ones(3, dtype="T8[Y]") - numpy.zeros(3, dtype="T8[Y]")
    array([1, 1, 1], dtype="timedelta64[Y]")

But the next should not::

    >>> numpy.ones(3, dtype="T8[Y]") - numpy.zeros(3, dtype="T8[ns]")
    raise numpy.IncompatibleUnitError  # what unit to choose?
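A sketch of how this rule could be modeled in plain Python
(``IncompatibleUnitError`` is the exception name used throughout this
proposal; the helper name is illustrative)::

    class IncompatibleUnitError(TypeError):
        """Two time units cannot be combined implicitly."""

    def subtract_absolute(a_value, a_unit, b_value, b_unit):
        # datetime64 - datetime64 -> timedelta64, but only when both
        # operands carry the same time unit; otherwise refuse to guess.
        if a_unit != b_unit:
            raise IncompatibleUnitError("what unit to choose?")
        return a_value - b_value, a_unit

    print subtract_absolute(1, 'Y', 0, 'Y')   # (1, 'Y'): a one-year delta
    subtract_absolute(1, 'Y', 0, 'ns')        # raises IncompatibleUnitError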
``datetime64`` vs ``timedelta64``
---------------------------------

It will be possible to add and subtract relative times from absolute
dates::

    In [10]: numpy.zeros(5, "T8[Y]") + numpy.ones(5, "t8[Y]")
    Out[10]: array([1971, 1971, 1971, 1971, 1971], dtype=datetime64[Y])

    In [11]: numpy.ones(5, "T8[Y]") - 2 * numpy.ones(5, "t8[Y]")
    Out[11]: array([1969, 1969, 1969, 1969, 1969], dtype=datetime64[Y])

But not other operations::

    In [12]: numpy.ones(5, "T8[Y]") * numpy.ones(5, "t8[Y]")
    TypeError: unsupported operand type(s) for *: 'numpy.ndarray' and
    'numpy.ndarray'

Casting rules
~~~~~~~~~~~~~

In this case the absolute time should have priority for determining
the time unit of the outcome.  That represents what people want to do
most of the time.  For example, this would allow one to do::

    >>> series = numpy.array(['1970-01-01', '1970-02-01', '1970-09-01'],
                             dtype='datetime64[D]')
    >>> series2 = series + numpy.timedelta64(2, 'Y')  # add 2 relative years
    >>> series2
    array(['1972-01-01', '1972-02-01', '1972-09-01'],
          dtype='datetime64[D]')  # the 'D'ay time unit has been chosen

``timedelta64`` vs ``timedelta64``
----------------------------------

Finally, it will be possible to operate with relative times as if they
were regular int64 dtypes *as long as* the result can be converted
back into a ``timedelta64``::

    In [10]: numpy.ones(3, 't8[us]')
    Out[10]: array([1, 1, 1], dtype="timedelta64[us]")

    In [11]: (numpy.ones(3, 't8[M]') + 2) ** 3
    Out[11]: array([27, 27, 27], dtype="timedelta64[M]")

But::

    In [12]: numpy.ones(5, 't8[s]') + 1j
    TypeError: the result cannot be converted into a ``timedelta64``

Casting rules
~~~~~~~~~~~~~

When combining two ``timedelta64`` dtypes with different time units,
the outcome will be the shorter of the two ("keep the precision"
rule).  For example::

    In [10]: numpy.ones(3, 't8[s]') + numpy.ones(3, 't8[m]')
    Out[10]: array([61, 61, 61], dtype="timedelta64[s]")

However, due to the impossibility of knowing the exact duration of a
relative year or a relative month, when one of these time units
appears in one of the operands, the operation will not be allowed::

    In [11]: numpy.ones(3, 't8[Y]') + numpy.ones(3, 't8[D]')
    raise numpy.IncompatibleUnitError  # how to convert relative years
                                       # to days?

In order to be able to perform the above operation, a new NumPy
function, called ``change_timeunit``, is proposed.  Its signature will
be::

    change_timeunit(time_object, new_unit, reference)

where 'time_object' is the time object whose unit is to be changed,
'new_unit' is the desired new time unit, and 'reference' is an
absolute date (a NumPy datetime64 scalar) that will be used to allow
the conversion of relative times when using time units with an
uncertain number of smaller time units (relative years or months
cannot be expressed in days).

With this, the above operation can be done as follows::

    In [10]: t_years = numpy.ones(3, 't8[Y]')

    In [11]: t_days = numpy.change_timeunit(t_years, 'D', '2001-01-01')

    In [12]: t_days + numpy.ones(3, 't8[D]')
    Out[12]: array([366, 366, 366], dtype="timedelta64[D]")
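The year-to-day conversion that ``change_timeunit`` performs can be
modeled with the ``datetime`` module (a sketch of the semantics, not
the proposed implementation; the function name is illustrative).  Note
how the reference date decides whether a relative year spans 365 or
366 days::

    import datetime

    def years_to_days(n_years, reference):
        # Days spanned by n_years relative years starting at the given
        # absolute reference date.
        shifted = reference.replace(year=reference.year + n_years)
        return (shifted - reference).days

    print years_to_days(1, datetime.date(2001, 1, 1))  # 365 (2001 not leap)
    print years_to_days(1, datetime.date(2004, 1, 1))  # 366 (2004 is leap)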
dtype vs time units conversions
===============================

For changing the date/time dtype of an existing array, we propose to
use the ``.astype()`` method.  This will be mainly useful for changing
time units.

For example, for absolute dates::

    In[10]: t1 = numpy.zeros(5, dtype="datetime64[s]")

    In[11]: print t1
    [1970-01-01T00:00:00  1970-01-01T00:00:00  1970-01-01T00:00:00
     1970-01-01T00:00:00  1970-01-01T00:00:00]

    In[12]: print t1.astype('datetime64[D]')
    [1970-01-01  1970-01-01  1970-01-01  1970-01-01  1970-01-01]

For relative times::

    In[10]: t1 = numpy.ones(5, dtype="timedelta64[s]")

    In[11]: print t1
    [1 1 1 1 1]

    In[12]: print t1.astype('timedelta64[ms]')
    [1000 1000 1000 1000 1000]

Changing directly from/to relative to/from absolute dtypes will not be
supported::

    In[13]: numpy.zeros(5, dtype="datetime64[s]").astype('timedelta64')
    TypeError: data type cannot be converted to the desired type

Business days have the peculiarity that they do not cover a continuous
line of time (they have gaps at weekends).  Thus, when converting from
any ordinary time to business days, it can happen that the original
time is not representable.  In that case, the result of the conversion
is *Not a Time* (*NaT*)::

    In[10]: t1 = numpy.arange(5, dtype="datetime64[D]")

    In[11]: print t1
    [1970-01-01  1970-01-02  1970-01-03  1970-01-04  1970-01-05]

    In[12]: t2 = t1.astype("datetime64[B]")

    In[13]: print t2  # 1970 begins on a Thursday
    [1970-01-01  1970-01-02  NaT  NaT  1970-01-05]

When converting back to ordinary days, NaT values are left untouched
(this happens in all time unit conversions)::

    In[14]: t3 = t2.astype("datetime64[D]")

    In[15]: print t3
    [1970-01-01  1970-01-02  NaT  NaT  1970-01-05]
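The weekend gaps can be modeled with ordinary weekday arithmetic; a
sketch (the function name is illustrative, and ``None`` stands in for
*NaT*)::

    import datetime

    EPOCH = datetime.date(1970, 1, 1)   # a Thursday

    def day_to_business_day(days_since_epoch):
        # Map a datetime64[D] value to the corresponding datetime64[B]
        # value, or to None (NaT) when the day falls on a weekend.
        date = EPOCH + datetime.timedelta(days=days_since_epoch)
        if date.weekday() >= 5:          # 5 = Saturday, 6 = Sunday
            return None
        # Walk from the epoch, counting only the weekdays in between.
        step = 1 if days_since_epoch >= 0 else -1
        count = 0
        d = EPOCH
        while d != date:
            d += datetime.timedelta(days=step)
            if d.weekday() < 5:
                count += step
        return count

    print [day_to_business_day(n) for n in range(5)]
    # [0, 1, None, None, 2], i.e. 1970-01-01, 1970-01-02, NaT, NaT,
    # 1970-01-05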
Final considerations
====================

Why the ``origin`` metadata disappeared
---------------------------------------

During the discussion of the date/time dtypes on the NumPy list, the
idea of having an ``origin`` metadata that complemented the definition
of the absolute ``datetime64`` was initially found to be useful.

However, after thinking more about this, we found that the combination
of an absolute ``datetime64`` with a relative ``timedelta64`` does
offer the same functionality while removing the need for the
additional ``origin`` metadata.  This is why we have removed it from
this proposal.

Operations with mixed time units
--------------------------------

Whenever an operation between two time values of the same dtype with
the same unit is accepted, the same operation with time values of
different units should be possible (e.g. adding a time delta in
seconds and one in microseconds), resulting in an adequate time unit.
The exact semantics of this kind of operation is defined in the
"Casting rules" subsections of the "Operating with date/time arrays"
section.

Due to the peculiarities of business days, it is most probable that
operations mixing business days with other time units will not be
allowed.

Why there is not a ``quarter`` time unit?
-----------------------------------------

This proposal tries to focus on the most commonly used set of time
units to operate with, and the ``quarter`` can be considered more of a
derived unit.  Besides, the use of a ``quarter`` normally requires
that it can start at whatever month of the year, and as we are not
including support for a time ``origin`` metadata, this is not a viable
venue here.  Finally, if we were to add the ``quarter``, then people
should expect to find a ``biweekly``, ``semester`` or ``biyearly``
just to put some examples of other derived units, and we find this a
bit too overwhelming for this proposal's purposes.


.. [1] http://docs.python.org/lib/module-datetime.html
.. [2] http://www.egenix.com/products/python/mxBase/mxDateTime
.. [3] http://en.wikipedia.org/wiki/Unix_time


.. Local Variables:
.. mode: rst
.. coding: utf-8
.. fill-column: 72
.. End:
From faltet at pytables.org  Wed Jul 30 14:30:46 2008
From: faltet at pytables.org (Francesc Alted)
Date: Wed, 30 Jul 2008 20:30:46 +0200
Subject: [Numpy-discussion] The date/time dtype and the casting issue
In-Reply-To: 
References: <200807291512.53270.faltet@pytables.org>
	<200807301954.13802.faltet@pytables.org>
Message-ID: <200807302030.46483.faltet@pytables.org>

On Wednesday 30 July 2008, Tom Denniston wrote:
> If it's really just weekdays why not call it that instead of using a
> term like business days that (quite confusingly) suggests holidays
> are handled properly?

Well, we adopted the name from the TimeSeries package.  Perhaps its
authors can answer this better than me.

> Also, I view the timezone and holiday issues as totally separate.  I
> would definitely NOT recommend basing holidays on a timezone because
> holidays are totally unrelated to timezones.  Usually when you deal
> with holidays, because they vary by application and country and
> change over time, you provide a calendar as an outside input.  That
> would be very useful if that were allowed, but might make the
> implementation rather complex.

Yeah, I agree that timezones and holidays are totally separate issues.
I only wanted to stress that the implementation of these things is
*complex*, and that this was the reason not to consider them.

-- 
Francesc Alted

From ivan at selidor.net  Wed Jul 30 14:38:23 2008
From: ivan at selidor.net (Ivan Vilata i Balaguer)
Date: Wed, 30 Jul 2008 20:38:23 +0200
Subject: [Numpy-discussion] The date/time dtype and the casting issue
In-Reply-To: 
References: <200807291512.53270.faltet@pytables.org>
	<200807301916.25824.faltet@pytables.org>
	<200807301954.13802.faltet@pytables.org>
Message-ID: <20080730183823.GA9720@tardis.terramar.selidor.net>

Tom Denniston (on 2008-07-30 at 13:12:45 -0500) wrote::

> If it's really just weekdays why not call it that instead of using a
> term like business days that (quite confusingly) suggests holidays
> are handled properly?

Yes, that may be a better term.  I guess we didn't choose that because
we aren't native English speakers, and because TimeSeries was already
using the other term.

> Also, I view the timezone and holiday issues as totally separate.  I
> would definitely NOT recommend basing holidays on a timezone because
> holidays are totally unrelated to timezones.  Usually when you deal
> with holidays, because they vary by application and country and
> change over time, you provide a calendar as an outside input.  That
> would be very useful if that were allowed, but might make the
> implementation rather complex.

I think that what Francesc was trying to say is that taking holidays
into account would be way too difficult to implement.  Timezones were
just an example of another (unrelated) feature we left out due to its
complexity.
> On 7/30/08, Francesc Alted wrote:
> > On Wednesday 30 July 2008, Tom Denniston wrote:
> > > When people are referring to business days are you talking about
> > > weekdays or are you saying weekday non-holidays?
> >
> > Plain weekdays.  Taking into account holidays for all the world round
> > would certainly be much more complex than timezones, which aren't
> > being considered in this proposal either.

::

	Ivan Vilata i Balaguer   @  Intellectual Monopoly hinders Innovation! @
	http://www.selidor.net/  @  http://www.nosoftwarepatents.com/ @

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 307 bytes
Desc: Digital signature
URL:
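A minimal sketch of the weekday-only ("business day") behaviour discussed
above, using only the standard library; to_weekday is a made-up helper,
not part of the proposal, and None merely stands in for *NaT*:

    from datetime import date, timedelta

    def to_weekday(d):
        """Return d if it falls on Mon-Fri, else None (a stand-in for NaT)."""
        return d if d.weekday() < 5 else None

    # 1970-01-01 was a Thursday, so the Saturday and Sunday (Jan 3 and 4)
    # map to None, mirroring the NaT results in the proposal's example.
    days = [date(1970, 1, 1) + timedelta(n) for n in range(5)]
    print [to_weekday(d) for d in days]

Holidays, as noted in the thread, would have to come from an external
calendar passed in by the user; the sketch deliberately ignores them.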
From tom.denniston at alum.dartmouth.org  Wed Jul 30 15:33:05 2008
From: tom.denniston at alum.dartmouth.org (Tom Denniston)
Date: Wed, 30 Jul 2008 14:33:05 -0500
Subject: [Numpy-discussion] The date/time dtype and the casting issue
In-Reply-To: <20080730183823.GA9720@tardis.terramar.selidor.net>
References: <200807291512.53270.faltet@pytables.org>
	<200807301916.25824.faltet@pytables.org>
	<200807301954.13802.faltet@pytables.org>
	<20080730183823.GA9720@tardis.terramar.selidor.net>
Message-ID:

Yes, this all makes a lot of sense.  I would propose changing the name
from business days to weekdays, though.  Does anyone object to that?

On 7/30/08, Ivan Vilata i Balaguer wrote:
> Tom Denniston (on 2008-07-30 at 13:12:45 -0500) said::
>
> > If it's really just weekdays why not call it that instead of using a
> > term like business days that (quite confusingly) suggests holidays are
> > handled properly?
>
> Yes, that may be a better term.  I guess we didn't choose that because
> we aren't native English speakers, and because TimeSeries was already
> using the other term.
>
> > Also, I view the timezone and holiday issues as totally separate.  I
> > would definitely NOT recommend basing holidays on a timezone because
> > holidays are totally unrelated to timezones.  Usually when you deal
> > with holidays, because they vary by application and country and change
> > over time, you provide a calendar as an outside input.  That would be
> > very useful if that were allowed, but might make the implementation
> > rather complex.
>
> I think that what Francesc was trying to say is that taking holidays
> into account would be way too difficult to implement.  Timezones
> were just an example of another (unrelated) feature we left out due to
> its complexity.
>
> > On 7/30/08, Francesc Alted wrote:
> > > On Wednesday 30 July 2008, Tom Denniston wrote:
> > > > When people are referring to business days are you talking about
> > > > weekdays or are you saying weekday non-holidays?
> > >
> > > Plain weekdays.  Taking into account holidays for all the world round
> > > would certainly be much more complex than timezones, which aren't
> > > being considered in this proposal either.
>
> ::
>
> 	Ivan Vilata i Balaguer   @  Intellectual Monopoly hinders Innovation! @
> 	http://www.selidor.net/  @  http://www.nosoftwarepatents.com/ @
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

From dalke at dalkescientific.com  Wed Jul 30 16:12:19 2008
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Wed, 30 Jul 2008 22:12:19 +0200
Subject: [Numpy-discussion] "import numpy" is slow
In-Reply-To: <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com>
References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com>
	<3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com>
	<846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com>
Message-ID: <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com>

On Jul 4, 2008, at 2:22 PM, Andrew Dalke wrote:
> [josiah:numpy/build/lib.macosx-10.3-fat-2.5] dalke% time python -c 'pass'
> 0.015u 0.042s 0:00.06 83.3%  0+0k 0+0io 0pf+0w
> [josiah:numpy/build/lib.macosx-10.3-fat-2.5] dalke% time python -c 'import numpy'
> 0.084u 0.231s 0:00.33 93.9%  0+0k 0+8io 0pf+0w
> [josiah:numpy/build/lib.macosx-10.3-fat-2.5] dalke%
>
> For one of my clients I wrote a tool to analyze import times.  I
> don't have it, but here's something similar I just now whipped up:

Based on those results I've been digging into the code trying to figure
out why numpy imports so many files, and at the same time I've been
trying to guess at the use case Robert Kern regards as typical when he
wrote:

   Your use case isn't so typical and so suffers on the import time
   end of the balance

and trying to figure out what code would break if those modules weren't
all eagerly imported and were instead written as most other Python
modules are written.

I have two thoughts for why mega-importing might be useful:

 - interactive users get to do tab completion and see everything
   (eg, "import numpy" means "numpy.fft.ifft" works, without
   having to do "import numpy.fft" manually)

 - class inspectors don't need to do directory checks to find possible
   modules  (This is a stretch, since every general purpose inspector
   I know of has to know how to frob the directories to find
   directories.)

Are these the reasons numpy imports everything, or are there other
reasons?

The first guess comes from the comment in numpy/__init__.py, "The
following sub-packages must be explicitly imported:", meaning, I take
it, that the other modules (core, lib, random, linalg, fft, testing) do
not need to be explicitly imported.

Is the numpy recommendation that people should do:

   import numpy
   numpy.fft.ifft(data)

?  If so, the documentation should be updated to say that "random",
"ma", "ctypeslib" and several other libraries are included in that
list.  Why is the last so important that it should be in the top-level
namespace?

In my opinion, this assistance is counter to standard practice in
effectively every other Python package.  I don't see the benefit.

You may ask if there are possible improvements.  There's no obvious
place taking up a bunch of time, but there are plenty of small places
which add up.  For example:

1) I wondered why 'cPickle' needed to be imported.  One of the places
it's used is numpy.lib.format, which is only imported by numpy.lib.io.
It's easy to defer the 'import format' to be inside the functions which
need it.  Note that io.py already defers the import of zipfile, so
function-local imports are not inappropriate.

'io' imports 'tempfile', needing 0.016 seconds.  This can be a deferred
cost only incurred by those who use io.savez, which already has some
function-local imports.  The reason for the high import costs?  Here's
what tempfile itself imports:

  tempfile: 0.016 (io)
    errno: 0.000 (tempfile)
    random: 0.010 (tempfile)
      binascii: 0.003 (random)
      _random: 0.003 (random)
    fcntl: 0.003 (tempfile)
    thread: 0.000 (tempfile)

(This is read as: 'tempfile' is imported by 'io' and takes 0.016 seconds
total, including all children, and the directly imported children of
'tempfile' are 'errno', 'random', 'fcntl' and 'thread'.  'random'
imports 'binascii' and '_random'.)

BTW, the load and save commands in io do an incorrect check:

  if isinstance(file, type("")):
      fid = _file(file, "rb")
  else:
      fid = file

Filenames can be unicode strings.  This test should either be

  isinstance(file, basestring)

or

  not hasattr(file, 'read')

2) What's the point of "add_newdocs"?  According to the top of the
module:

  # This is only meant to add docs to objects defined in
  # C-extension modules.
  # The purpose is to allow easier editing of the docstrings without
  # requiring a re-compile.

which implies this aids development, but not deployment.  The import
takes a minuscule 0.006 seconds of the 0.225 ("import lib" and its
subimports take 0.141 seconds) but seems to add no direct end-user
benefit.  Shouldn't this documentation be pushed into the C code, at
least for each release?

3) I see that numpy/core/numerictypes.py imports 'string', which takes
0.008 seconds.  I wondered why.  It's part of "english_lower",
"english_upper", and "english_capitalize", which are functions defined
in that module.  The implementation can't be improved, and using
string.translate is the right approach.  However,

3a) these functions have no leading underscore and have docstrings which
imply that this is part of the public API (although they are not
included in __all__).  Are they meant for general use?  Note that
english_capitalize is over-engineered for the use case in that file.
There are no empty type names, so the test "if s" is never false.

3b) there are only 33 types in that module, so a hand-written lookup
table mapping the name to the appropriate name/alias would work.  Yes,
it makes adding new types less than completely automatic, but that's
done rarely.

Getting rid of these functions, and thus getting rid of the import,
speeds numpy startup time by 3.5%.

4) numpy.testing takes 0.041 seconds to import.  The text I quoted above
says that it's a numpy requirement that 'testing' always be imported,
even though I'm hard pressed to figure out why that's important.
Assuming it is important, 0.020 seconds is spent importing 'difflib':

  difflib: 0.020 (utils)
    heapq: 0.016 (difflib)
      itertools: 0.003 (heapq)
      operator: 0.003 (heapq)
      bisect: 0.005 (heapq)
        _bisect: 0.003 (bisect)
      _heapq: 0.003 (heapq)

which is only used in numpy.testing.utils:assert_string.  That can be
deferred.

Similarly,

  numpytest: 0.012 (numpy.testing)
    glob: 0.005 (numpytest)
      fnmatch: 0.002 (glob)
    shlex: 0.006 (numpytest)
      collections: 0.003 (shlex)
    numpy.testing.utils: 0.000 (numpytest)

but notice that 'glob', while imported, is never used in 'numpytest',
and that 'shlex' can easily be a deferred import.  This saves (for the
common case) 0.01 seconds.
5) There are some additional savings in _datasource:

  _datasource: 0.016 (io)
    shutil: 0.003 (_datasource)
      stat: 0.000 (shutil)
    urlparse: 0.003 (_datasource)
    bz2: 0.003 (_datasource)
    gzip: 0.006 (_datasource)
      zlib: 0.003 (gzip)

This module provides the "Datasource" class, which is accessed through
"numpy.lib.io.Datasource".  Deferring the 'bz2' and 'gzip' imports until
needed saves 0.01 seconds.  This will require some modification to the
code, more than shifting the import statement.

These together add up to about 0.08 seconds, which is about 30% of the
'import numpy' cost.  I could probably get another 0.05 seconds if I dug
around more, but I can't without knowing what use case numpy is trying
to achieve.  Why are all those ancillary modules (testing, ctypeslib)
eagerly loaded when there seems no need for that feature?

				Andrew
				dalke at dalkescientific.com

From alan.mcintyre at gmail.com  Wed Jul 30 16:51:45 2008
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Wed, 30 Jul 2008 16:51:45 -0400
Subject: [Numpy-discussion] "import numpy" is slow
In-Reply-To: <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com>
References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com>
	<3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com>
	<846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com>
	<6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com>
Message-ID: <1d36917a0807301351x703ee800lf96fdc94944fdd9b@mail.gmail.com>

On Wed, Jul 30, 2008 at 4:12 PM, Andrew Dalke wrote:
> 4) numpy.testing takes 0.041 seconds to import.  The text I quoted
> above says that it's a numpy requirement that 'testing' always be
> imported, even though I'm hard pressed to figure out why that's
> important.

I suppose it's necessary for providing the test() and bench() functions
in subpackages, but that isn't a good reason to impose upon all users
the time required to set up numpy.testing.

> Assuming it is important, 0.020 seconds is spent importing 'difflib':
>
>   difflib: 0.020 (utils)
>     heapq: 0.016 (difflib)
>       itertools: 0.003 (heapq)
>       operator: 0.003 (heapq)
>       bisect: 0.005 (heapq)
>         _bisect: 0.003 (bisect)
>       _heapq: 0.003 (heapq)
>
> which is only used in numpy.testing.utils:assert_string.  That can
> be deferred.
>
> Similarly,
>
>   numpytest: 0.012 (numpy.testing)
>     glob: 0.005 (numpytest)
>       fnmatch: 0.002 (glob)
>     shlex: 0.006 (numpytest)
>       collections: 0.003 (shlex)
>     numpy.testing.utils: 0.000 (numpytest)
>
> but notice that 'glob', while imported, is never used in 'numpytest',
> and that 'shlex' can easily be a deferred import.  This saves (for
> the common case) 0.01 seconds.

Thanks for taking the time to find those; I just removed the unused
glob and delayed the import of shlex, difflib, and inspect in
numpy.testing.

From jturner at gemini.edu  Wed Jul 30 16:54:19 2008
From: jturner at gemini.edu (James Turner)
Date: Wed, 30 Jul 2008 16:54:19 -0400
Subject: [Numpy-discussion] Core dump during numpy.test()
In-Reply-To: <488F9DA3.1040203@jpl.nasa.gov>
References: <488F6C7D.7040100@gemini.edu> <488F7AA6.4070004@gemini.edu>
	<488F7BD1.3020005@jpl.nasa.gov> <488F8214.2070702@gemini.edu>
	<488F9CB0.5050607@jpl.nasa.gov> <488F9DA3.1040203@jpl.nasa.gov>
Message-ID: <4890D4FB.5000306@gemini.edu>

> oops.  It is ATLAS.  I was able to run with a non-optimized LAPACK.

Just to confirm, it also works for me when I use Netlib BLAS instead
of ATLAS.
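The fix Alan describes is the function-local ("deferred") import pattern
from Andrew's list.  A minimal sketch of the idea, with a made-up
function name (save_compressed is illustrative, not a NumPy API):

    def save_compressed(filename, data):
        # Deferred import: the cost of loading gzip (and its child zlib)
        # is paid only on the first call, so plain "import mypackage"
        # stays cheap for everyone who never writes compressed files.
        import gzip
        f = gzip.open(filename, 'wb')
        try:
            f.write(data)
        finally:
            f.close()

Repeated calls stay fast because, once a module is in sys.modules, the
import statement is little more than a dictionary lookup.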
From stefan at sun.ac.za  Wed Jul 30 16:59:32 2008
From: stefan at sun.ac.za (Stéfan van der Walt)
Date: Wed, 30 Jul 2008 22:59:32 +0200
Subject: [Numpy-discussion] "import numpy" is slow
In-Reply-To: <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com>
References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com>
	<3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com>
	<846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com>
	<6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com>
Message-ID: <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com>

2008/7/30 Andrew Dalke:
> Based on those results I've been digging into the code trying to
> figure out why numpy imports so many files, and at the same time I've
> been trying to guess at the use case Robert Kern regards as typical
> when he wrote:
>
>    Your use case isn't so typical and so suffers on the import
>    time end of the balance

I.e. most people don't start up NumPy all the time -- they import
NumPy, and then do some calculations, which typically take longer than
the import time.

> and trying to figure out what code would break if those modules
> weren't all eagerly imported and were instead written as most other
> Python modules are written.

For a benefit of 0.03s, I don't think it's worth it.

> I have two thoughts for why mega-importing might be useful:
>
>  - interactive users get to do tab completion and see everything
>    (eg, "import numpy" means "numpy.fft.ifft" works, without
>    having to do "import numpy.fft" manually)

Numpy has a very flat namespace, for better or worse, which implies
many imports.  This can't be easily changed without modifying the API.

> Is the numpy recommendation that people should do:
>
>    import numpy
>    numpy.fft.ifft(data)

That's the way many people use it.

> ?  If so, the documentation should be updated to say that "random",
> "ma", "ctypeslib" and several other libraries are included in that
> list.

Thanks for pointing that out, I'll edit the documentation wiki.

> Why is the last so important that it should be in the top-level
> namespace?

It's a single Python file -- does it make much of a difference?

> In my opinion, this assistance is counter to standard practice in
> effectively every other Python package.  I don't see the benefit.

How do you propose we change this?

> BTW, the load and save commands in io do an incorrect check:
>
>   if isinstance(file, type("")):
>       fid = _file(file, "rb")
>   else:
>       fid = file

Thanks, fixed.

[snip lots of suggestions]

> Getting rid of these functions, and thus getting rid of the import,
> speeds numpy startup time by 3.5%.

While I appreciate you taking the time to find these niggles, we are
short on developer time as it is.  Asking the developers to spend their
precious time on making a 3.5% improvement in startup time does not
make much sense.  If you provide a patch, on the other hand, it would
only take a matter of seconds to decide whether to apply or not.
You've already done most of the sleuth work.

> I could probably get another 0.05 seconds if I dug around more, but I
> can't without knowing what use case numpy is trying to achieve.  Why
> are all those ancillary modules (testing, ctypeslib) eagerly loaded
> when there seems no need for that feature?

Need is relative.  You need fast startup time, but most of our users
need quick access to whichever functions they want (and often use numpy
from an interactive terminal).  I agree that "testing" and "ctypeslib"
do not belong in that category, but they don't seem to do much harm
either.
Regards
Stéfan

From stefan at sun.ac.za  Wed Jul 30 17:01:59 2008
From: stefan at sun.ac.za (Stéfan van der Walt)
Date: Wed, 30 Jul 2008 23:01:59 +0200
Subject: [Numpy-discussion] Suppressing skipped tests
Message-ID: <9457e7c80807301401j692e2dd7h1fa5c9560d64497@mail.gmail.com>

Alan,

If others agree, could we suppress the output of skipped tests unless it
is specifically requested?  It clutters the output and makes it more
difficult to see which "real" tests fail.

Stéfan

From alan.mcintyre at gmail.com  Wed Jul 30 17:10:22 2008
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Wed, 30 Jul 2008 17:10:22 -0400
Subject: [Numpy-discussion] Suppressing skipped tests
In-Reply-To: <9457e7c80807301401j692e2dd7h1fa5c9560d64497@mail.gmail.com>
References: <9457e7c80807301401j692e2dd7h1fa5c9560d64497@mail.gmail.com>
Message-ID: <1d36917a0807301410w3251f6b5mcd110b42459fc6e3@mail.gmail.com>

On Wed, Jul 30, 2008 at 5:01 PM, Stéfan van der Walt wrote:
> If others agree, could we suppress the output of skipped tests unless
> it is specifically requested?  It clutters the output and makes it
> more difficult to see which "real" tests fail.

I'll see if there's some easy way to do that (I can't remember if I
looked for this before), but one way to make them go away is to upgrade
nose to 0.10.3 or later. ;)

From stefan at sun.ac.za  Wed Jul 30 17:33:27 2008
From: stefan at sun.ac.za (Stéfan van der Walt)
Date: Wed, 30 Jul 2008 23:33:27 +0200
Subject: [Numpy-discussion] Suppressing skipped tests
In-Reply-To: <1d36917a0807301410w3251f6b5mcd110b42459fc6e3@mail.gmail.com>
References: <9457e7c80807301401j692e2dd7h1fa5c9560d64497@mail.gmail.com>
	<1d36917a0807301410w3251f6b5mcd110b42459fc6e3@mail.gmail.com>
Message-ID: <9457e7c80807301433n538cd4ey1f6deb150e3fcb96@mail.gmail.com>

2008/7/30 Alan McIntyre:
> I'll see if there's some easy way to do that (I can't remember if I
> looked for this before), but one way to make them go away is to
> upgrade nose to 0.10.3 or later. ;)

Thanks, Alan -- that works for me.  For others who need to do the same,
the command is

  easy_install nose==0.10.3

Cheers
Stéfan

From harry.mangalam at uci.edu  Wed Jul 30 18:03:44 2008
From: harry.mangalam at uci.edu (Harry Mangalam)
Date: Wed, 30 Jul 2008 15:03:44 -0700
Subject: [Numpy-discussion] f2py: undefined symbol: zgemm_
Message-ID: <200807301503.44503.harry.mangalam@uci.edu>

Hi All,

Using: Python 2.5.2, f2py ver 2_4422, gfortran --version GNU Fortran
(GCC) 4.2.3 (Ubuntu 4.2.3-2ubuntu7), on an up-to-date Kubuntu 8.04
(Hardy) on a Thinkpad T60.

After building and using an f2py-generated lib for a while with this
command:

  f2py --opt="-O3" -c -m fd_rrt1d --fcompiler=gnu95 \
       --link-lapack_opt *.f

I had to regenerate it after some code changes (just commenting out
some debugging info).
Although the shared lib gets built correctly, when the calling Python
script is run, I now get this:

  ./1d.py
  Traceback (most recent call last):
    File "./1d.py", line 27, in <module>
      from fd_rrt1d import *
  ImportError: /home/hjm/shaka/1D-Mangalam-py/fd_rrt1d.so: undefined symbol: zgemm_

Sure enough, nm reports it as undefined:

  nm fd_rrt1d.so | tail
  000065b0 t string_from_pyobj
           U strlen@@GLIBC_2.0
           U strncpy@@GLIBC_2.0
  0001df40 b u.1294
  00013bee T umatrix1d_dms0_
  000128c3 T umatrix1d_dms1_
  00015458 T umatrix1d_ms0_
           U zgemm_
           U zgemv_
           U zgesv_

But why the recent change?  Google and the numpy list seem not to have
heard about this one...

-- 
Harry Mangalam - Research Computing, NACS, E2148, Engineering Gateway,
UC Irvine 92697  949 824-0084(o), 949 285-4487(c)
-- 
..Kick at the darkness til it bleeds daylight. (Lovers in a Dangerous
Time) - Bruce Cockburn

From fperez.net at gmail.com  Wed Jul 30 18:30:51 2008
From: fperez.net at gmail.com (Fernando Perez)
Date: Wed, 30 Jul 2008 15:30:51 -0700
Subject: [Numpy-discussion] labeling decorators
Message-ID:

Howdy (esp. Alan McIntyre):

I've been using numpy's decorators a lot, many thanks to Matthew B and
Alan for this code!  Here's a snippet to auto-generate labeling
decorators that might come in handy to avoid repetition in creating
decos like @slow & friends.  It's doctested as well as validating
things a bit; feel free to use it if you find it useful.

Cheers,

f

###

def make_label_dec(label, ds=None):
    """Factory function to create a decorator that applies one or more labels.

    :Parameters:
      label : string or sequence
        One or more labels that will be applied by the decorator to the
        functions it decorates.  Labels are attributes of the decorated
        function with their value set to True.

    :Keywords:
      ds : string
        An optional docstring for the resulting decorator.  If not given,
        a default docstring is auto-generated.

    :Returns:
      A decorator.

    :Examples:

    A simple labeling decorator:

    >>> slow = make_label_dec('slow')
    >>> print slow.__doc__
    Labels a test as 'slow'

    And one that uses multiple labels and a custom docstring:

    >>> rare = make_label_dec(['slow','hard'],
    ... "Mix labels 'slow' and 'hard' for rare tests.")
    >>> print rare.__doc__
    Mix labels 'slow' and 'hard' for rare tests.

    Now, let's test using this one:

    >>> @rare
    ... def f(): pass
    ...
    >>>
    >>> f.slow
    True
    >>> f.hard
    True
    """
    if isinstance(label, basestring):
        labels = [label]
    else:
        labels = label

    # Validate that the given label(s) are OK for use in setattr() by
    # doing a dry run on a dummy function.
    tmp = lambda : None
    for label in labels:
        setattr(tmp, label, True)

    # This is the actual decorator we'll return
    def decor(f):
        for label in labels:
            setattr(f, label, True)
        return f

    # Apply the user's docstring
    if ds is None:
        ds = "Labels a test as %r" % label
    decor.__doc__ = ds

    return decor

From dalke at dalkescientific.com  Wed Jul 30 20:07:37 2008
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Thu, 31 Jul 2008 02:07:37 +0200
Subject: [Numpy-discussion] "import numpy" is slow
In-Reply-To: <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com>
References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com>
	<3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com>
	<846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com>
	<6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com>
	<9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com>
Message-ID:

On Jul 30, 2008, at 10:59 PM, Stéfan van der Walt wrote:
> I.e. most people don't start up NumPy all the time -- they import
> NumPy, and then do some calculations, which typically take longer
> than the import time.

Is that interactively, or is that through programs?

> For a benefit of 0.03s, I don't think it's worth it.

The final number, with all the hundredths of a second added up, was
0.08 seconds, which was about 30% of the 'import numpy' cost.

> Numpy has a very flat namespace, for better or worse, which implies
> many imports.

I don't get the feeling that numpy is flat.  Python's stdlib is flat.
Numpy has many 2- and 3-level modules.

>> Is the numpy recommendation that people should do:
>>
>>    import numpy
>>    numpy.fft.ifft(data)
>
> That's the way many people use it.

The normal Python way is:

  from numpy import fft
  fft.ifft(data)

because in most packages, parent modules don't import all of their
children.  I acknowledge that existing numpy code will break with my
desired change, as in this example from the tutorial:

  import numpy
  import pylab

  # Build a vector of 10000 normal deviates with variance 0.5^2 and mean 2
  mu, sigma = 2, 0.5
  v = numpy.random.normal(mu, sigma, 10000)

and I am not saying to change this code.  Instead, I am asking for
limits on the eagerness, with a long-term goal of minimizing its use.

>> Why is [ctypeslib] so important that it should be in the top-level
>> namespace?
>
> It's a single Python file -- does it make much of a difference?

The file imports other files.  Here's the import chain:

  ctypeslib: 0.047 (numpy)
    ctypes: -1.000 (ctypeslib)
      _ctypes: 0.003 (ctypes)
      gestalt: -1.000 (ctypes)
  ma: 0.005 (numpy)
    extras: 0.001 (ma)
      numpy.lib.index_tricks: 0.000 (extras)
      numpy.lib.polynomial: 0.000 (extras)

(The "-1.000" indicates a bug in my instrumentation script, which I
worked around with a -1.0 value.)  Every numpy program, because it
eagerly imports 'ctypeslib' to make it accessible as a top-level
variable, ends up importing ctypes:

  >>> import time
  >>> if 1:
  ...     t1 = time.time()
  ...     import ctypes
  ...     t2 = time.time()
  ...
  >>> t2-t1
  0.032159090042114258

That's 10% of the import time.

>> In my opinion, this assistance is counter to standard practice in
>> effectively every other Python package.  I don't see the benefit.
>
> How do you propose we change this?

If I had my way, remove things like (in numpy/__init__.py)

  import linalg
  import fft
  import random
  import ctypeslib
  import ma

but leave the list of submodules in "__all__" so that "from numpy
import *" works.  Perhaps add a top-level function 'import_all()' which
mimics the current behavior, and have IPython know about it so
interactive users get it automatically.  Or something like that.

Yes, I know the numpy team won't change this behavior.  I want to know
what changes you all will consider.

Something more concrete: change the top-level definitions in 'numpy'
from

  from testing import Tester
  test = Tester().test
  bench = Tester().bench

to

  def test(label='fast', verbose=1, extra_argv=None, doctests=False,
           coverage=False, **kwargs):
      from testing import Tester
      return Tester().test(label, verbose, extra_argv, doctests,
                           coverage, **kwargs)

and do something similar for 'bench'.  Note that numpy currently
implements

  numpy.test          <-- this is a Tester().test bound method
  numpy.testing.test  <-- another Tester().test bound method

so there's some needless and distracting, but extremely minor,
duplication.

>> Getting rid of these functions, and thus getting rid of the import,
>> speeds numpy startup time by 3.5%.
>
> While I appreciate you taking the time to find these niggles, we are
> short on developer time as it is.  Asking the developers to spend
> their precious time on making a 3.5% improvement in startup time does
> not make much sense.  If you provide a patch, on the other hand, it
> would only take a matter of seconds to decide whether to apply or
> not.  You've already done most of the sleuth work.

I wrote that I don't know the reasons for why the design is as it is.
Are those functions ("english_upper", "english_lower",
"english_capitalize") expected as part of the public interface for the
module?  The lack of a "_" prefix and their verbose docstrings imply
that they are for general use.  In that case, they can't easily be
gotten rid of.  Yet it doesn't make sense for them to be part of
'numerictypes'.  Why would I submit a patch if there's no way those
definitions will disappear, for reasons I am not aware of?

I am not asking you all to make these changes.  I'm asking about how
much change is acceptable, what the restrictions are, and why they are
there.

I also haven't yet figured out how to get the regression tests to run,
and I'm not going to contribute patches without at least passing that
bare minimum.  BTW, how do I do that?  In the top level there's a
'test.sh' command, but when I run it I get:

  % mkdir tmp
  % bash test.sh
  Running from numpy source directory.
  Traceback (most recent call last):
    File "setupscons.py", line 56, in <module>
      raise DistutilsError('\n'.join(msg))
  distutils.errors.DistutilsError: You cannot build numpy with scons
  without the numscons package (Failure was: No module named numscons)
  test.sh: line 11: cd: /Users/dalke/cvses/numpy/tmp: No such file or directory

and when I run 'nosetests' in the top-level directory I get:

  ImportError: Error importing numpy: you should not try to import
  numpy from its source directory; please exit the numpy source tree,
  and relaunch your python interpreter from there.

I couldn't find (in a cursory search) instructions for running
self-tests or regression tests.

>> I could probably get another 0.05 seconds if I dug around more, but
>> I can't without knowing what use case numpy is trying to achieve.
>> Why are all those ancillary modules (testing, ctypeslib) eagerly
>> loaded when there seems no need for that feature?
>
> Need is relative.  You need fast startup time, but most of our users
> need quick access to whichever functions they want (and often use
> numpy from an interactive terminal).  I agree that "testing" and
> "ctypeslib" do not belong in that category, but they don't seem to do
> much harm either.

If there is no need for those features then I'll submit a patch to
remove them.  There is some need, and there are many ways to handle
that need.  The current solution in numpy is to import everything.
Again I ask, does *everything* (like 'testing' and 'ctypeslib') need to
be imported eagerly?  In your use case of user-driven exploratory
development the answer is no -- the users described above rarely desire
access to those packages because those packages are best used in
automated environments.  Eg, why write tests which are only used once?

				Andrew
				dalke at dalkescientific.com

From dalke at dalkescientific.com  Wed Jul 30 20:19:53 2008
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Thu, 31 Jul 2008 02:19:53 +0200
Subject: [Numpy-discussion] "import numpy" is slow
In-Reply-To: <1d36917a0807301351x703ee800lf96fdc94944fdd9b@mail.gmail.com>
References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com>
	<3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com>
	<846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com>
	<6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com>
	<1d36917a0807301351x703ee800lf96fdc94944fdd9b@mail.gmail.com>
Message-ID:

On Jul 30, 2008, at 10:51 PM, Alan McIntyre wrote:
> I suppose it's necessary for providing the test() and bench()
> functions in subpackages, but that isn't a good reason to impose
> upon all users the time required to set up numpy.testing.

I just posted this in my reply to Stéfan, but I'll say it again here.
numpy defines

  numpy.test
  numpy.bench

and

  numpy.testing.test

The two 'test's use the same implementation.  This is a likely unneeded
duplication and one should be removed.  The choice depends on whether
people think the name should be 'numpy.test' or 'numpy.testing.test'.

BTW, where's the on-line documentation for these functions?  They are
actually bound methods, and I wondered if the doc programs handle them
okay.  If they should be top-level functions then I would prefer they
be actual functions that hide the import.  In that case, replace

  from testing import Tester
  test = Tester().test

with

  def test(label='fast', verbose=1, extra_argv=None, doctests=False,
           coverage=False, **kwargs):
      from testing import Tester
      return Tester().test(label, verbose, extra_argv, doctests,
                           coverage, **kwargs)

or something similar.  This would keep the API unchanged (assuming
those are important in the top level) and reduce the number of imports.
Else I would keep/move them in 'numpy.testing' and require that anyone
who wants 'test' or 'bench' get them after a 'from numpy import
testing'.

> Thanks for taking the time to find those; I just removed the unused
> glob and delayed the import of shlex, difflib, and inspect in
> numpy.testing.

Thanks!

				Andrew
				dalke at dalkescientific.com

From dpeterson at enthought.com  Wed Jul 30 21:25:44 2008
From: dpeterson at enthought.com (Dave Peterson)
Date: Wed, 30 Jul 2008 20:25:44 -0500
Subject: [Numpy-discussion] [ANNOUNCE] Traits 3.0 has been released
Message-ID: <48911498.4050903@enthought.com>

Hello,

I am very pleased to announce that Traits 3.0 has just been released!

All Traits projects have been registered with PyPI (aka The Cheeseshop)
and each project's listing on PyPI currently includes a source tarball.
In the near future, we will also upload binary eggs for Windows and Mac
OS X platforms.  Installation of Traits 3.0 is now as simple as:

  easy_install Traits

The Traits projects include:

  http://pypi.python.org/pypi?:action=display&name=Traits&version=3.0.0
  http://pypi.python.org/pypi?:action=display&name=TraitsGUI&version=3.0.0
  http://pypi.python.org/pypi?:action=display&name=TraitsBackendQt&version=3.0.0
  http://pypi.python.org/pypi?:action=display&name=TraitsBackendWX&version=3.0.0

The Traits project is at the center of all Enthought Tool Suite
development and has changed the mental model used at Enthought for
programming in the already extremely efficient Python programming
language.  We encourage everyone to join us in enjoying the
productivity gains from using such a powerful approach.

The Traits project allows Python programmers to use a special kind of
type definition called a trait, which gives object attributes some
additional characteristics:

 * Initialization: A trait has a default value, which is automatically
   set as the initial value of an attribute before its first use in a
   program.
 * Validation: A trait attribute's type is explicitly declared.  The
   type is evident in the code, and only values that meet a
   programmer-specified set of criteria (i.e., the trait definition)
   can be assigned to that attribute.
 * Delegation: The value of a trait attribute can be contained either
   in the defining object or in another object delegated to by the
   trait.
 * Notification: Setting the value of a trait attribute can notify
   other parts of the program that the value has changed.
 * Visualization: User interfaces that allow a user to interactively
   modify the value of a trait attribute can be automatically
   constructed using the trait's definition.  (This feature requires
   that a supported GUI toolkit be installed.  If this feature is not
   used, the Traits project does not otherwise require GUI support.)

A class can freely mix trait-based attributes with normal Python
attributes, or can opt to allow the use of only a fixed or open set of
trait attributes within the class.  Trait attributes defined by a class
are automatically inherited by any subclass derived from the class.

-- Dave

From cournapeau at cslab.kecl.ntt.co.jp  Wed Jul 30 21:53:52 2008
From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau)
Date: Thu, 31 Jul 2008 10:53:52 +0900
Subject: [Numpy-discussion] "import numpy" is slow
In-Reply-To:
References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com>
	<3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com>
	<846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com>
	<6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com>
	<9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com>
Message-ID: <1217469232.31016.15.camel@bbc8>

On Thu, 2008-07-31 at 02:07 +0200, Andrew Dalke wrote:
> On Jul 30, 2008, at 10:59 PM, Stéfan van der Walt wrote:
> > I.e. most people don't start up NumPy all the time -- they import
> > NumPy, and then do some calculations, which typically take longer
> > than the import time.
>
> Is that interactively, or is that through programs?

Most people use it interactively, or for long-running programs.  Import
times only matter for interactive commands depending on numpy.

> and I am not saying to change this code.  Instead, I am asking for
> limits on the eagerness, with a long-term goal of minimizing its use.

For new API, this is never done, and it is a bug if it is.  In scipy,
typically, import scipy does not import the whole subpackages list.

> I also haven't yet figured out how to get the regression tests to
> run, and I'm not going to contribute patches without at least passing
> that bare minimum.  BTW, how do I do that?  In the top level there's
> a 'test.sh' command but when I run it I get:

Argh, this file should never have ended up here; that's entirely my
fault.  It was a merge from a (at the time) experimental branch.  I
can't remove it now because my company does not allow subversion
access, but I will fix this tonight.  Sorry for the confusion.

> and when I run 'nosetests' in the top-level directory I get:
>
>   ImportError: Error importing numpy: you should not try to import
>   numpy from its source directory; please exit the numpy source
>   tree, and relaunch your python interpreter from there.
>
> I couldn't find (in a cursory search) instructions for running
> self-tests or regression tests.

You are supposed to run the tests on an installed numpy, not in the
sources:

  import numpy
  numpy.test(verbose = 10)

You can't really run numpy without it being installed first (which is
what the message is about).

cheers,

David

From alan.mcintyre at gmail.com  Wed Jul 30 22:21:02 2008
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Wed, 30 Jul 2008 22:21:02 -0400
Subject: [Numpy-discussion] "import numpy" is slow
In-Reply-To:
References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com>
	<3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com>
	<846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com>
	<6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com>
	<1d36917a0807301351x703ee800lf96fdc94944fdd9b@mail.gmail.com>
Message-ID: <1d36917a0807301921m70107b84s3a64077f73818410@mail.gmail.com>

On Wed, Jul 30, 2008 at 8:19 PM, Andrew Dalke wrote:
> numpy defines
>
>   numpy.test
>   numpy.bench
>
> and
>
>   numpy.testing.test
>
> The two 'test's use the same implementation.  This is a likely
> unneeded duplication and one should be removed.  The choice depends on
> whether people think the name should be 'numpy.test' or
> 'numpy.testing.test'.

They actually do two different things; numpy.test() runs tests for all
of numpy, and numpy.testing.test() runs tests for numpy.testing only.
There are similar functions in numpy.lib, numpy.core, etc.

From mattknox.ca at gmail.com  Wed Jul 30 22:45:36 2008
From: mattknox.ca at gmail.com (Matt Knox)
Date: Thu, 31 Jul 2008 02:45:36 +0000 (UTC)
Subject: [Numpy-discussion] The date/time dtype and the casting issue
References: <200807291512.53270.faltet@pytables.org>
	<200807301916.25824.faltet@pytables.org>
	<200807301954.13802.faltet@pytables.org>
	<20080730183823.GA9720@tardis.terramar.selidor.net>
Message-ID:

>> >> If it's really just weekdays why not call it that instead of using
>> >> a term like business days that (quite confusingly) suggests
>> >> holidays are handled properly?
>>
>> Well, we were adopting the name from the TimeSeries package.  Perhaps
>> the authors can answer this better than me.

A lot of the inspiration for the original prototype of the timeseries
module came from FAME (http://www.sungard.com/Fame/).  The proprietary
FAME 4GL language does a lot of things well when it comes to time
series analysis, but is (not surprisingly) very lacking as a general
purpose programming language.  Python was the glue language I was using
at work, and naturally I wanted to do a lot of the stuff I could do in
FAME using Python instead.  Most of the frequencies in the timeseries
package are named the same as their FAME counterparts.  I'm not
especially attached to the name "business" instead of "weekday" for the
frequency; it is just what I was used to from FAME, so I went with it.
I won't lose any sleep if you decide to call it "weekday" instead.

While on the topic of FAME... being a financial analyst, I really am
quite fond of the multitude of quarterly frequencies we have in the
timeseries package (with different year end points) because they are
very useful when doing things like "calendarizing" earnings from
companies with different fiscal year ends.  These frequencies are
included in FAME, which makes sense since it targets financial users.
I know Pierre likes them too for working with different seasons.  I
think it would be ok to leave them out of an initial implementation,
but it might be worth keeping in mind during the design phase how the
dtype could be extended to incorporate such things.

>> As forbidding operations among absolute/absolute and
>> relative/relative types can be unacceptable in many situations, we
>> are proposing an explicit casting mechanism so that the user can
>> inform about the desired time unit of the outcome.  For this, a new
>> NumPy function, called, say, ``numpy.change_unit()`` (this name is
>> for the purposes of the discussion and can be changed) will be
>> provided.  The signature for the function will be:
>>
>>   change_unit(time_object, new_unit, reference)
>>
>> where 'time_object' is the time object whose unit is to be changed,
>> 'new_unit' is the desired new time unit, and 'reference' is an
>> absolute date that will be used to allow the conversion of relative
>> times in case of using time units with an uncertain number of smaller
>> time units (relative years or months cannot be expressed in days).
>> For example, that would allow to do:
>>
>>   >>> numpy.change_unit( numpy.array([1,2], 'T[Y]'), 'T[d]' )
>>   array([365, 731], dtype="timedelta64[d]")

If I understand you correctly, this is very close to the "asfreq"
method of the Date/DateArray/TimeSeries classes in the timeseries
module.  One key element missing here (from my point of view anyway) is
an equivalent of the 'relation' parameter in the asfreq method in the
timeseries module.  This is only used when converting from a lower
frequency to a higher frequency (e.g. annual to daily).  For example...

  >>> a = ts.Date(freq='Annual', year=2007)
  >>> a.asfreq('Daily', 'START')
  >>> a.asfreq('Daily', 'END')

This is another one of those things that I use all the time.  Now,
whether it belongs in the core dtype or in some extension module I'm
not sure... but it's an important feature in the timeseries module.

From dalke at dalkescientific.com  Thu Jul 31 01:56:14 2008
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Thu, 31 Jul 2008 07:56:14 +0200
Subject: [Numpy-discussion] "import numpy" is slow
In-Reply-To: <1d36917a0807301921m70107b84s3a64077f73818410@mail.gmail.com>
References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com>
	<3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com>
	<846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com>
	<6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com>
	<1d36917a0807301351x703ee800lf96fdc94944fdd9b@mail.gmail.com>
	<1d36917a0807301921m70107b84s3a64077f73818410@mail.gmail.com>
Message-ID:

On Jul 31, 2008, at 4:21 AM, Alan McIntyre wrote:
> They actually do two different things; numpy.test() runs tests for
> all of numpy, and numpy.testing.test() runs tests for numpy.testing
> only.  There are similar functions in numpy.lib, numpy.core, etc.

Really?  This is the code from numpy/__init__.py:

  from testing import Tester
  test = Tester().test
  bench = Tester().bench

This is the code from numpy/testing/__init__.py:

  test = Tester().test

... ahhh, here's the magic, from testing/nosetester.py:NoseTester:

        if package is None:
            f = sys._getframe(1)
            package = f.f_locals.get('__file__', None)
            assert package is not None
            package = os.path.dirname(package)

Why are 'test' and 'bench' part of the general API instead of
something only used during testing?

				Andrew
				dalke at dalkescientific.com

From dalke at dalkescientific.com  Thu Jul 31 02:12:05 2008
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Thu, 31 Jul 2008 08:12:05 +0200
Subject: [Numpy-discussion] "import numpy" is slow
In-Reply-To: <1217469232.31016.15.camel@bbc8>
References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com>
	<3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com>
	<846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com>
	<6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com>
	<9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com>
	<1217469232.31016.15.camel@bbc8>
Message-ID: <693BDC66-7E16-4862-B63E-355134531387@dalkescientific.com>

On Jul 31, 2008, at 3:53 AM, David Cournapeau wrote:
> You are supposed to run the tests on an installed numpy, not in the
> sources:
>
>   import numpy
>   numpy.test(verbose = 10)

Doesn't that make things more cumbersome to test?  That is, if I were
to make a change I would need to:

 - python setup.py build (to put the code into the build/* subdirectory)
 - cd the build directory, or switch to a terminal which was already
   there
 - manually do the import/test code you wrote, or write a two-line
   program for it

I would rather do 'nosetests' in the source tree, if at all feasible,
although that might only be possible for the Python source.

Hmm.  And it looks like testing/nosetester.py (which implements the
'test' function above) is meant to make it easier to run nose, except
my feeling is the extra level of wrapping makes things more
complicated.  The nosetests command line appears to be more flexible,
with support for, for example, dropping into the debugger on errors,
and resetting the coverage test files.

I'm speaking out of ignorance, btw.

Cheers,

				Andrew
				dalke at dalkescientific.com

From faltet at pytables.org  Thu Jul 31 02:15:45 2008
From: faltet at pytables.org (Francesc Alted)
Date: Thu, 31 Jul 2008 08:15:45 +0200
Subject: [Numpy-discussion] The date/time dtype and the casting issue
In-Reply-To:
References: <200807291512.53270.faltet@pytables.org>
Message-ID: <200807310815.45884.faltet@pytables.org>

On Thursday 31 July 2008, Matt Knox wrote:
> While on the topic of FAME... being a financial analyst, I really am
> quite fond of the multitude of quarterly frequencies we have in the
> timeseries package (with different year end points) because they are
> very useful when doing things like "calendarizing" earnings from
> companies with different fiscal year ends.  These frequencies are
> included in FAME, which makes sense since it targets financial users.
> I know Pierre likes them too for working with different seasons.  I
> think it would be ok to leave them out of an initial implementation,
> but it might be worth keeping in mind during the design phase how
> the dtype could be extended to incorporate such things.

Well, introducing a quarter should not be difficult.  We just wanted to
keep the set of supported time units to a minimum (the list is already
quite large).  We thought that the quarter fits better as a 'derived'
time unit, similarly to biweekly, semester or biyearly (to name just a
few examples).  However, if quarters are found to be much more
important than other derived time units, they can go into the proposal
too.

> >> As forbidding operations among absolute/absolute and
> >> relative/relative types can be unacceptable in many situations, we
> >> are proposing an explicit casting mechanism so that the user can
> >> inform about the desired time unit of the outcome.  For this, a
> >> new NumPy function, called, say, ``numpy.change_unit()`` (this
> >> name is for the purposes of the discussion and can be changed)
> >> will be provided.  The signature for the function will be:
> >>
> >>   change_unit(time_object, new_unit, reference)
> >>
> >> where 'time_object' is the time object whose unit is to be
> >> changed, 'new_unit' is the desired new time unit, and 'reference'
> >> is an absolute date that will be used to allow the conversion of
> >> relative times in case of using time units with an uncertain
> >> number of smaller time units (relative years or months cannot be
> >> expressed in days).  For example, that would allow to do:
> >>
> >>   >>> numpy.change_unit( numpy.array([1,2], 'T[Y]'), 'T[d]' )
> >>   array([365, 731], dtype="timedelta64[d]")
>
> If I understand you correctly, this is very close to the "asfreq"
> method of the Date/DateArray/TimeSeries classes in the timeseries
> module.  One key element missing here (from my point of view anyway)
> is an equivalent of the 'relation' parameter in the asfreq method in
> the timeseries module.  This is only used when converting from a
> lower frequency to a higher frequency (e.g. annual to daily).  For
> example...
>
>   >>> a = ts.Date(freq='Annual', year=2007)
>   >>> a.asfreq('Daily', 'START')
>   >>> a.asfreq('Daily', 'END')
>
> This is another one of those things that I use all the time.  Now,
> whether it belongs in the core dtype or in some extension module I'm
> not sure... but it's an important feature in the timeseries module.

I agree that such a 'relation' parameter in the proposed
'change_timeunit' could be handy in many situations.  It should be
applicable only to absolute times, though.  With this, the signature
for the function would be:

  change_timeunit(time_object, new_unit, relation, reference)

where 'relation' can only be used with absolute times and 'reference'
only with relative times.

Who knows, perhaps in the future one can find a way to implement such a
'change_timeunit' function as a method without disturbing the method
schema of the ndarray objects too much.

Cheers,

-- 
Francesc Alted

From cournapeau at cslab.kecl.ntt.co.jp  Thu Jul 31 02:41:15 2008
From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau)
Date: Thu, 31 Jul 2008 15:41:15 +0900
Subject: [Numpy-discussion] "import numpy" is slow
In-Reply-To: <693BDC66-7E16-4862-B63E-355134531387@dalkescientific.com>
References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com>
	<3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com>
	<846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com>
	<6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com>
	<9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com>
	<1217469232.31016.15.camel@bbc8>
	<693BDC66-7E16-4862-B63E-355134531387@dalkescientific.com>
Message-ID: <1217486475.412.10.camel@bbc8>

On Thu, 2008-07-31 at 08:12 +0200, Andrew Dalke wrote:
> On Jul 31, 2008, at 3:53 AM, David Cournapeau wrote:
> > You are supposed to run the tests on an installed numpy, not in the
> > sources:
> >
> >   import numpy
> >   numpy.test(verbose = 10)
>
> Doesn't that make things more cumbersome to test?  That is, if I were
> to make a change I would need to:
>  - python setup.py build (to put the code into the build/*
>    subdirectory)
>  - cd the build directory, or switch to a terminal which was already
>    there
>  - manually do the import/test code you wrote, or write a two-line
>    program for it

Yes.  Nothing that an easy makefile cannot solve, nonetheless (I am
sure I am not the only one with a makefile/script which automates the
above, to test a freshly svn-updated numpy in one command).  The
problem is that it is difficult to support running uninstalled
packages, in particular because of compiled code
(distutils/setuptools have a develop mode to make this possible,
though).  Distutils puts the built code in the build directory, and
the correct tree is built at install time.

> I would rather do 'nosetests' in the source tree, if at all feasible,
> although that might only be possible for the Python source.

Yes, but how do you do that?  You would do import scipy in an svn
checkout, and the C extensions would be the ones installed?  That
sounds like a nightmare from a reliability POV.

There was a related discussion (using scipy without installing it) on
the scipy ML, BTW:

http://projects.scipy.org/pipermail/scipy-user/2008-July/017678.html

cheers,

David

From robert.kern at gmail.com  Thu Jul 31 02:44:33 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 31 Jul 2008 01:44:33 -0500
Subject: [Numpy-discussion] "import numpy" is slow
In-Reply-To: <693BDC66-7E16-4862-B63E-355134531387@dalkescientific.com>
References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com>
	<3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com>
	<846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com>
	<6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com>
	<9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com>
	<1217469232.31016.15.camel@bbc8>
	<693BDC66-7E16-4862-B63E-355134531387@dalkescientific.com>
Message-ID: <3d375d730807302344j635c3a9aq5ba190232b571d3b@mail.gmail.com>

On Thu, Jul 31, 2008 at 01:12, Andrew Dalke wrote:
> On Jul 31, 2008, at 3:53 AM, David Cournapeau wrote:
>> You are supposed to run the tests on an installed numpy, not in the
>> sources:
>>
>>   import numpy
>>   numpy.test(verbose = 10)
>
> Doesn't that make things more cumbersome to test?  That is, if I were
> to make a change I would need to:
>  - python setup.py build (to put the code into the build/*
>    subdirectory)
>  - cd the build directory, or switch to a terminal which was already
>    there
>  - manually do the import/test code you wrote, or write a two-line
>    program for it

Developers can build_ext --inplace and frequently use nosetests
anyways.  numpy.test() is now primarily for users: those who are
trying to see if their installation worked (or who are gathering
requested information for the people on this list to help them
troubleshoot) need to test the installed numpy.

Note that we are *just* now transitioning to using nosetests for the
development version of numpy.  It used to be (through the 1.1.x
releases) that we had our own test collection code inside numpy.
numpy.test() was *necessary* in those releases.  By now, we have most
of the denizens here trained to do numpy.test() when testing their new
installations.  Maybe in 1.3, we'll remove it in favor of just having
people use the nosetests command.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From pav at iki.fi  Thu Jul 31 04:14:10 2008
From: pav at iki.fi (Pauli Virtanen)
Date: Thu, 31 Jul 2008 08:14:10 +0000 (UTC)
Subject: [Numpy-discussion] Example of numpy cov() not correct?
References: <710F2847B0018641891D9A216027636029C1F1@ex3.envision.co.il>
Message-ID:

Wed, 30 Jul 2008 18:49:10 +0300, Nadav Horesh wrote:
> If you read the cov function documentation you'll see that if a second
> vector is given, it joins the 2 into one matrix and calculates the
> covariance of it.  In your case, you are looking for the off-diagonal
> elements.

So the final answer to the OP's question is: Yes, the example on
http://www.scipy.org/Numpy_Example_List_With_Doc#cov is wrong; cov(T,P)
indeed returns a matrix.  And it would be nice if someone fixed this;
you can simply register a wiki account and fix the problem.

-- 
Pauli Virtanen

From dagss at student.matnat.uio.no  Thu Jul 31 05:06:13 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Thu, 31 Jul 2008 11:06:13 +0200
Subject: [Numpy-discussion] NumPy won't work with distutils loaded?
Message-ID: <48918085.70206@student.matnat.uio.no>

While developing a testcase for NumPy in Cython, I had the problem
demonstrated below.  Essentially this means that I couldn't run
testcases in our Cython testing framework (though I will work around
it).  Is this some known restriction in how NumPy can be used, is it
work in progress, a problem with my environment/Ubuntu, or should I
file a bug report?

This was on an up-to-date Ubuntu Hardy 64-bit i386 system (with distro
numpy, Python and distutils).

  Py> import numpy

is ok, however (note the uppercase UnixCCompiler in dir(...)):

  Py> import distutils.unixccompiler
  Py> dir(distutils.unixccompiler)
  ['CCompiler', 'CompileError', 'DistutilsExecError', 'LibError',
   'LinkError', 'NoneType', 'StringType', 'UnixCCompiler',
   '__builtins__', '__doc__', '__file__', '__name__', '__revision__',
   '_darwin_compiler_fixup', 'copy', 'gen_lib_options',
   'gen_preprocess_options', 'log', 'newer', 'os', 'sys', 'sysconfig']
  Py> import numpy
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/usr/lib/python2.5/site-packages/numpy/__init__.py", line 37, in <module>
      import testing
    File "/usr/lib/python2.5/site-packages/numpy/testing/__init__.py", line 3, in <module>
      from numpytest import *
    File "/usr/lib/python2.5/site-packages/numpy/testing/numpytest.py", line 19, in <module>
      from numpy.distutils.exec_command import splitcmdline
    File "/usr/lib/python2.5/site-packages/numpy/distutils/__init__.py", line 6, in <module>
      import ccompiler
    File "/usr/lib/python2.5/site-packages/numpy/distutils/ccompiler.py", line 393, in <module>
      setattr(getattr(_m, _cc+'compiler'), 'gen_lib_options',
  AttributeError: 'module' object has no attribute 'unixccompiler'

-- 
Dag Sverre

From dagss at student.matnat.uio.no  Thu Jul 31 05:19:46 2008
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Thu, 31 Jul 2008 11:19:46 +0200
Subject: [Numpy-discussion] Sorry!, please disregard (Re: NumPy won't
	work with distutils loaded?)
In-Reply-To: <48918085.70206@student.matnat.uio.no>
References: <48918085.70206@student.matnat.uio.no>
Message-ID: <489183B2.8090407@student.matnat.uio.no>

Dag Sverre Seljebotn wrote:
> While developing a testcase for NumPy in Cython, I had the problem
> demonstrated below.  Essentially this means that I couldn't run
> testcases in our Cython testing framework (though I will work around
> it).  Is this some known restriction in how NumPy can be used, is it
> work in progress, a problem with my environment/Ubuntu, or should I
> file a bug report?
>
> This was on an up-to-date Ubuntu Hardy 64-bit i386 system (with distro
> numpy, Python and distutils).

Now where *have* my manners gone, complaining about a distro numpy
revision...

Sorry all, I built the latest SVN and there appears to be no problem.
Although the source in question is still the same *shrug*.

-- 
Dag Sverre

From stefan at sun.ac.za  Thu Jul 31 05:42:02 2008
From: stefan at sun.ac.za (Stéfan van der Walt)
Date: Thu, 31 Jul 2008 11:42:02 +0200
Subject: [Numpy-discussion] "import numpy" is slow
In-Reply-To:
References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com>
	<3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com>
	<846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com>
	<6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com>
	<9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com>
Message-ID: <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com>

2008/7/31 Andrew Dalke:
>> Numpy has a very flat namespace, for better or worse, which implies
>> many imports.
>
> I don't get the feeling that numpy is flat.  Python's stdlib is flat.
> Numpy has many 2- and 3-level modules.

With 500+ functions in the root namespace, I'd call numpy flat.

> If I had my way, remove things like (in numpy/__init__.py)
>
>   import linalg
>   import fft
>   import random
>   import ctypeslib
>   import ma
>
> but leave the list of submodules in "__all__" so that "from numpy
> import *" works.  Perhaps add a top-level function 'import_all()'
> which mimics the current behavior, and have IPython know about it so
> interactive users get it automatically.  Or something like that.
>
> Yes, I know the numpy team won't change this behavior.  I want to
> know what changes you all will consider.

Maybe when we're convinced that there is a lot to be gained from making
such a change.  From my perspective, it doesn't look good:

  I) Major code breakage
 II) Confused users
III) More difficult function discovery for beginners

vs.

  I) Slight improvement in startup speed.

>>> Getting rid of these functions, and thus getting rid of the import,
>>> speeds numpy startup time by 3.5%.
>>
>> While I appreciate you taking the time to find these niggles, we are
>> short on developer time as it is.  Asking the developers to spend
>> their precious time on making a 3.5% improvement in startup time
>> does not make much sense.  If you provide a patch, on the other
>> hand, it would only take a matter of seconds to decide whether to
>> apply or not.  You've already done most of the sleuth work.
>
> I wrote that I don't know the reasons for why the design is as it is.
> Are those functions ("english_upper", "english_lower",
> "english_capitalize") expected as part of the public interface for
> the module?  The lack of a "_" prefix and their verbose docstrings
> imply that they are for general use.  In that case, they can't easily
> be gotten rid of.  Yet it doesn't make sense for them to be part of
> 'numerictypes'.

Anything underneath numpy.core that is not exposed as numpy.something
is not for public consumption.

Stéfan

From robert.kern at gmail.com  Thu Jul 31 06:03:20 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 31 Jul 2008 05:03:20 -0500
Subject: [Numpy-discussion] "import numpy" is slow
In-Reply-To: <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com>
References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com>
	<3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com>
	<846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com>
	<6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com>
	<9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com>
	<9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com>
Message-ID: <3d375d730807310303q54ef94f7m3ba74b3f47f6e5ea@mail.gmail.com>

On Thu, Jul 31, 2008 at 04:42, Stéfan van der Walt wrote:
> 2008/7/31 Andrew Dalke:
>> I wrote that I don't know the reasons for why the design is as it
>> is.  Are those functions ("english_upper", "english_lower",
>> "english_capitalize") expected as part of the public interface for
>> the module?  The lack of a "_" prefix and their verbose docstrings
>> imply that they are for general use.  In that case, they can't
>> easily be gotten rid of.  Yet it doesn't make sense for them to be
>> part of 'numerictypes'.
>
> Anything underneath numpy.core that is not exposed as numpy.something
> is not for public consumption.

That said, the reason those particular docstrings are verbose is that I
wanted people to know why those functions exist there (e.g. "This is an
internal utility function....").  But you still can't remove them,
since they are being used inside numerictypes.  That's why I labeled
them "internal utility functions" instead of leaving them with minimal
docstrings such that you would have to guess.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From dalke at dalkescientific.com  Thu Jul 31 06:36:46 2008
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Thu, 31 Jul 2008 12:36:46 +0200
Subject: [Numpy-discussion] "import numpy" is slow
In-Reply-To: <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com>
References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com>
	<3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com>
	<846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com>
	<6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com>
	<9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com>
	<9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com>
Message-ID:

On Jul 31, 2008, at 11:42 AM, Stéfan van der Walt wrote:
> Maybe when we're convinced that there is a lot to be gained from
> making such a change.  From my perspective, it doesn't look good:
>
>   I) Major code breakage
>  II) Confused users
> III) More difficult function discovery for beginners

I'm not asking for a change.  I fully realize this.  I happen to think
it's a mistake and that there are other ways to have addressed the
underlying requirement, but I know that's not going to change.  (For
example, follow the matplotlib approach, where there's a special
library designed to be imported in interactive use.  But I am *not*
proposing this change.)  I point out that this makes numpy different
from most other Python packages.
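A minimal sketch of the kind of deferred loading discussed in this
thread -- hypothetical only, not how numpy is implemented: a placeholder
module object that imports the real submodule on first attribute access,
so "import numpy" could stay cheap while "numpy.fft.ifft" keeps working.
The class name is made up.

    import sys
    import types

    class _DeferredModule(types.ModuleType):
        """Placeholder that imports the real submodule on first use."""

        def __init__(self, fullname):
            types.ModuleType.__init__(self, fullname)
            self._fullname = fullname

        def __getattr__(self, attr):
            # First touch: import the real module, cache it, delegate.
            module = __import__(self._fullname, {}, {}, ['*'])
            sys.modules[self._fullname] = module
            return getattr(module, attr)

    # Hypothetical use in a package's __init__.py:
    #   fft = _DeferredModule(__name__ + '.fft')
    # The subpackage's import cost would then be paid only on first use,
    # at the price of tab completion not seeing its contents until then.

The trade-off is exactly the one debated above: interactive discovery
and "see everything" behaviour versus startup cost.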
Asking them to spend their > precious time on making a 3.5% improvement in startup time does not > make much sense. If you provide a patch, on the other hand, it would > only take a matter of seconds to decide whether to apply or not. > You've already done most of the sleuth work. I wrote that I don't know the reasons for why the design was as it is. Are those functions ("english_upper", "english_lower", "english_capitalize") expected as part of the public interface for the module? The lack of a "_" prefix and their verbose docstrings implies that they are for general use. In that case, they can't easily be gotten rid of. Yet it doesn't make sense for them to be part of 'numerictypes'. Why would I submit a patch if there's no way those definitions will disappear, for reasons I am not aware of? I am not asking you all to make these changes. I'm asking about how much change is acceptable, what are the restrictions, and why are they there? I also haven't yet figured out how to get the regression tests to run, and I'm not going to contribute patches without at least passing that bare minimum. BTW, how do I do that? In the top-level there's a 'test.sh' command but when I run it I get: % mkdir tmp % bash test.sh Running from numpy source directory. Traceback (most recent call last): File "setupscons.py", line 56, in raise DistutilsError('\n'.join(msg)) distutils.errors.DistutilsError: You cannot build numpy with scons without the numscons package (Failure was: No module named numscons) test.sh: line 11: cd: /Users/dalke/cvses/numpy/tmp: No such file or directory and when I run 'nosetests' in the top-level directory I get: ImportError: Error importing numpy: you should not try to import numpy from its source directory; please exit the numpy source tree, and relaunch your python intepreter from there. I couldn't find (in a cursory search) instructions for running self- tests or regression tests. >> I could probably get another 0.05 seconds if I dug around more, but I >> can't without knowing what use case numpy is trying to achieve. Why >> are all those ancillary modules (testing, ctypeslib) eagerly loaded >> when there seems no need for that feature? > > Need is relative. You need fast startup time, but most of our users > need quick access to whichever functions they want (and often use from > an interactive terminal). I agree that "testing" and "ctypeslib" do > not belong in that category, but they don't seem to do much harm > either. If there is no need for those features then I'll submit a patch to remove them. There is some need, and there are many ways to handle that need. The current solution in numpy is to import everything. Again I ask, does *everything* (like 'testing' and 'ctypeslib') need to be imported eagerly? In your use case of user-driven exploratory development the answer is no - the users described above rarely desire access to those package because those packages are best used in automated environments. Eg, why write tests which are only used once? 
Andrew dalke at dalkescientific.com From dalke at dalkescientific.com Wed Jul 30 20:19:53 2008 From: dalke at dalkescientific.com (Andrew Dalke) Date: Thu, 31 Jul 2008 02:19:53 +0200 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <1d36917a0807301351x703ee800lf96fdc94944fdd9b@mail.gmail.com> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <1d36917a0807301351x703ee800lf96fdc94944fdd9b@mail.gmail.com> Message-ID: On Jul 30, 2008, at 10:51 PM, Alan McIntyre wrote: > I suppose it's necessary for providing the test() and bench() > functions in subpackages, but I that isn't a good reason to impose > upon all users the time required to set up numpy.testing. I just posted this in my reply to St?fan, but I'll say it again here. numpy defines numpy.test numpy.bench and numpy.testing.test The two 'test's use the same implementation. This is a likely unneeded duplication and one should be removed. The choice depends on if people think the name should be 'numpy.test' or 'numpy.testing.test'. BTW, where's the on-line documentation for these functions? They are actually bound methods, and I wondered if the doc programs handle them okay. If they should be top-level functions then I would prefer the be actual functions to hide an import. In that case, replace from testing import Tester test = Tester().test with def test(label='fast', verbose=1, extra_argv=None, doctests=False, coverage=False, **kwargs): from testing import Tester Tester.test(label, verbose, extra_argv, doctests, coverage, **kwargs) or something similar. This would keep the API unchanged (assuming those are important in the top-level) and reduce the number of imports. Else I would keep/move them in 'numpy.testing' and require that if someone wants to use 'test' or 'bench' then to get them after a 'from numpy import testing'. > Thanks for taking the time to find those; I just removed the unused > glob and delayed the import of shlex, difflib, and inspect in > numpy.testing. Thanks! Andrew dalke at dalkescientific.com From dpeterson at enthought.com Wed Jul 30 21:25:44 2008 From: dpeterson at enthought.com (Dave Peterson) Date: Wed, 30 Jul 2008 20:25:44 -0500 Subject: [Numpy-discussion] [ANNOUNCE] Traits 3.0 has been released Message-ID: <48911498.4050903@enthought.com> Hello, I am very pleased to announce that Traits 3.0 has just been released! All Traits projects have been registered with PyPi (aka The Cheeseshop) and each project's listing on PyPi currently includes a source tarball. In the near future, we will also upload binary eggs for Windows and Mac OS X platforms. Installation of Traits 3.0 is now as simple as: easy_install Traits The Traits projects include: http://pypi.python.org/pypi?:action=display&name=Traits&version=3.0.0 http://pypi.python.org/pypi?:action=display&name=TraitsGUI&version=3.0.0 http://pypi.python.org/pypi?:action=display&name=TraitsBackendQt&version=3.0.0 http://pypi.python.org/pypi?:action=display&name=TraitsBackendWX&version=3.0.0 The Traits project is at the center of all Enthought Tool Suite development and has changed the mental model used at Enthought for programming in the already extremely efficient Python programming language. We encourage everyone to join us in enjoying the productivity gains from using such a powerful approach. 
The Traits project allows Python programmers to use a special kind of type definition called a trait, which gives object attributes some additional characteristics: * Initialization: A trait has a default value, which is automatically set as the initial value of an attribute before its first use in a program. * Validation: A trait attribute's type is explicitly declared. The type is evident in the code, and only values that meet a programmer-specified set of criteria (i.e., the trait definition) can be assigned to that attribute. * Delegation: The value of a trait attribute can be contained either in the defining object or in another object delegated to by the trait. * Notification: Setting the value of a trait attribute can notify other parts of the program that the value has changed. * Visualization: User interfaces that allow a user to interactively modify the value of a trait attribute can be automatically constructed using the trait's definition. (This feature requires that a supported GUI toolkit be installed. If this feature is not used, the Traits project does not otherwise require GUI support.) A class can freely mix trait-based attributes with normal Python attributes, or can opt to allow the use of only a fixed or open set of trait attributes within the class. Trait attributes defined by a classs are automatically inherited by any subclass derived from the class. -- Dave From cournapeau at cslab.kecl.ntt.co.jp Wed Jul 30 21:53:52 2008 From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau) Date: Thu, 31 Jul 2008 10:53:52 +0900 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> Message-ID: <1217469232.31016.15.camel@bbc8> On Thu, 2008-07-31 at 02:07 +0200, Andrew Dalke wrote: > On Jul 30, 2008, at 10:59 PM, St?fan van der Walt wrote: > > I.e. most people don't start up NumPy all the time -- they import > > NumPy, and then do some calculations, which typically take longer than > > the import time. > > Is that interactively, or is that through programs? Most people use it interactively, or for long running programs. Import times only matters for interactive commands depending on numpy. > > and I am not saying to change this code. Instead, I am asking for > limits on the eagerness, with a long-term goal of minimizing its use. For new API, this is never done, and is a bug if it is. In scipy, typically, import scipy does not import the whole subpackages list. > I also haven't yet figured out how to get the regression tests to > run, and I'm not going to contribute patches without at least passing > that bare minimum. BTW, how do I do that? In the top-level there's > a 'test.sh' command but when I run it I get: Argh, this file should have never ended here, that's entirely my fault. It was a merge from a (at the time) experimental branch. I can't remove it now because my company does not allow subversion access, but I will fix this tonight. Sorry for the confusion. > > and when I run 'nosetests' in the top-level directory I get: > > ImportError: Error importing numpy: you should not try to import > numpy from > its source directory; please exit the numpy source tree, and > relaunch > your python intepreter from there. 
> > I couldn't find (in a cursory search) instructions for running self- > tests or regression tests. You are supposed to run the tests on an installed numpy, not in the sources: import numpy numpy.test(verbose = 10) You can't really use run numpy without it to be installed first (which is what the message is about). cheers, David From alan.mcintyre at gmail.com Wed Jul 30 22:21:02 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 30 Jul 2008 22:21:02 -0400 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <1d36917a0807301351x703ee800lf96fdc94944fdd9b@mail.gmail.com> Message-ID: <1d36917a0807301921m70107b84s3a64077f73818410@mail.gmail.com> On Wed, Jul 30, 2008 at 8:19 PM, Andrew Dalke wrote: > numpy defines > > numpy.test > numpy.bench > > and > > numpy.testing.test > > The two 'test's use the same implementation. This is a likely > unneeded duplication and one should be removed. The choice depends on > if people think the name should be 'numpy.test' or 'numpy.testing.test'. They actually do two different things; numpy.test() runs test for all of numpy, and numpy.testing.test() runs tests for numpy.testing only. There are similar functions in numpy.lib, numpy.core, etc. From mattknox.ca at gmail.com Wed Jul 30 22:45:36 2008 From: mattknox.ca at gmail.com (Matt Knox) Date: Thu, 31 Jul 2008 02:45:36 +0000 (UTC) Subject: [Numpy-discussion] The date/time dtype and the casting issue References: <200807291512.53270.faltet@pytables.org> <200807301916.25824.faltet@pytables.org> <200807301954.13802.faltet@pytables.org> <20080730183823.GA9720@tardis.terramar.selidor.net> Message-ID: >> >> If it's really just weekdays why not call it that instead of using a >> >> term like business days that (quite confusingly) suggests holidays >> >> are handled properly? >> >> Well, we were adopting the name from the TimeSeries package. Perhaps >> the authors can answer this better than me. A lot of the inspiration for the original prototype of the timeseries module came from FAME (http://www.sungard.com/Fame/). The proprietary FAME 4GL language does a lot of things well when it comes to time series analysis, but is (not surprisingly) very lacking as a general purpose programming language. Python was the glue language I was using at work, and naturally I wanted to do a lot of the stuff I could do in FAME using Python instead. Most of the frequencies in the timeseries package are named the same as their FAME counterparts. I'm not especially attached to the name "business" instead of "weekday" for the frequency, it is just what I was used to from FAME so I went with it. I won't lose any sleep if you decide to call it "weekday" instead. While on the topic of FAME... being a financial analyst, I really am quite fond of the multitude of quarterly frequencies we have in the timeseries package (with different year end points) because they are very useful when doing things like "calenderizing" earnings from companies with different fiscal year ends. These frequencies are included in FAME, which makes sense since it targets financial users. I know Pierre likes them too for working with different seasons. 
I think it would be ok to leave them out of an initial implementation, but it might be worth keeping in mind during the design phase about how the dtype could be extended to incorporate such things. >> As forbidding operations among absolute/absolute and relative/relative >> types can be unacceptable in many situations, we are proposing an >> explicit casting mechanism so that the user can inform about the >> desired time unit of the outcome. For this, a new NumPy function, >> called, say, ``numpy.change_unit()`` (this name is for the purposes of >> the discussion and can be changed) will be provided. The signature for >> the function will be: >> >> change_unit(time_object, new_unit, reference) >> >> where 'time_object' is the time object whose unit is to be >> changed, 'new_unit' is the desired new time unit, and 'reference' is an >> absolute date that will be used to allow the conversion of relative >> times in case of using time units with an uncertain number of smaller >> time units (relative years or months cannot be expressed in days). For >> example, that would allow to do: >> >> >>> numpy.change_unit( numpy.array([1,2], 'T[Y]'), 'T[d]' ) >> array([365, 731], dtype="datetime64[d]") If I understand you correctly, this is very close to the "asfreq" method of the Date/DateArray/TimeSeries classes in the timeseries module. One key element missing here (from my point of view anyway) is an equivalent of the 'relation' parameter in the asfreq method in the timeseries module. This is only used when converting from a lower frequency to a higher frequency (eg. annual to daily). For example... >>> a = ts.Date(freq='Annual', year=2007) >>> a.asfreq('Daily', 'START') >>> a.asfreq('Daily', 'END') This is another one of those things that I use all the time. Now whether it belongs in the core dtype, or some extension module I'm not sure... but it's an important feature in the timeseries module. From dalke at dalkescientific.com Thu Jul 31 01:56:14 2008 From: dalke at dalkescientific.com (Andrew Dalke) Date: Thu, 31 Jul 2008 07:56:14 +0200 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <1d36917a0807301921m70107b84s3a64077f73818410@mail.gmail.com> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <1d36917a0807301351x703ee800lf96fdc94944fdd9b@mail.gmail.com> <1d36917a0807301921m70107b84s3a64077f73818410@mail.gmail.com> Message-ID: On Jul 31, 2008, at 4:21 AM, Alan McIntyre wrote: > They actually do two different things; numpy.test() runs test for all > of numpy, and numpy.testing.test() runs tests for numpy.testing only. > There are similar functions in numpy.lib, numpy.core, etc. Really? This is the code from numpy/__init__.py: from testing import Tester test = Tester().test bench = Tester().bench This is the code from numpy/testing/__init__.py: test = Tester().test ... ahhh, here's the magic, from testing/nosetester.py:NoseTester if package is None: f = sys._getframe(1) package = f.f_locals.get('__file__', None) assert package is not None package = os.path.dirname(package) Why are 'test' and 'bench' part of the general API instead something only used during testing? 
Andrew dalke at dalkescientific.com From dalke at dalkescientific.com Thu Jul 31 02:12:05 2008 From: dalke at dalkescientific.com (Andrew Dalke) Date: Thu, 31 Jul 2008 08:12:05 +0200 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <1217469232.31016.15.camel@bbc8> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <1217469232.31016.15.camel@bbc8> Message-ID: <693BDC66-7E16-4862-B63E-355134531387@dalkescientific.com> On Jul 31, 2008, at 3:53 AM, David Cournapeau wrote: > You are supposed to run the tests on an installed numpy, not in the > sources: > > import numpy > numpy.test(verbose = 10) Doesn't that make things more cumbersome to test? That is, if I were to make a change I would need to: - python setup.py build (to put the code into the build/* subdirectory) - cd the build directory, or switch to a terminal which was already there - manually do the import/test code you wrote, or a write two-line program for it I would rather do 'nosetests' in the source tree, if at all feasible, although that might only be possible for the Python source. Hmm. And it looks like testing/nosetester.py (which implements the 'test' function above) is meant to make it easier to run nose, except my feeling is the extra level of wrapping makes things more complicated. The nosetest command-line appears to be more flexible, with support for, for examples, dropping into the debugger on errors, and reseting the coverage test files. I'm speaking out of ignorance, btw. Cheers, Andrew dalke at dalkescientific.com From faltet at pytables.org Thu Jul 31 02:15:45 2008 From: faltet at pytables.org (Francesc Alted) Date: Thu, 31 Jul 2008 08:15:45 +0200 Subject: [Numpy-discussion] The date/time dtype and the casting issue In-Reply-To: References: <200807291512.53270.faltet@pytables.org> Message-ID: <200807310815.45884.faltet@pytables.org> A Thursday 31 July 2008, Matt Knox escrigu?: > While on the topic of FAME... being a financial analyst, I really am > quite fond of the multitude of quarterly frequencies we have in the > timeseries package (with different year end points) because they are > very useful when doing things like "calenderizing" earnings from > companies with different fiscal year ends. These frequencies are > included in FAME, which makes sense since it targets financial users. > I know Pierre likes them too for working with different seasons. I > think it would be ok to leave them out of an initial implementation, > but it might be worth keeping in mind during the design phase about > how the dtype could be extended to incorporate such things. Well, introducing a quarter should not be difficult. We just wanted to keep the set of supported time units under a minimum (the list is already quite large). We thought that the quarter fits better as a 'derived' time unit, similarly as biweekly, semester or biyearly (to name just a few examples). However, if quarters are found to be much more important than other derived time units, they can go into the proposal too. > >> As forbidding operations among absolute/absolute and > >> relative/relative types can be unacceptable in many situations, we > >> are proposing an explicit casting mechanism so that the user can > >> inform about the desired time unit of the outcome. 
For this, a > >> new NumPy function, called, say, ``numpy.change_unit()`` (this > >> name is for the purposes of the discussion and can be changed) > >> will be provided. The signature for the function will be: > >> > >> change_unit(time_object, new_unit, reference) > >> > >> where 'time_object' is the time object whose unit is to be > >> changed, 'new_unit' is the desired new time unit, and 'reference' > >> is an absolute date that will be used to allow the conversion of > >> relative times in case of using time units with an uncertain > >> number of smaller time units (relative years or months cannot be > >> expressed in days). For > >> > >> example, that would allow to do: > >> >>> numpy.change_unit( numpy.array([1,2], 'T[Y]'), 'T[d]' ) > >> > >> array([365, 731], dtype="datetime64[d]") > > If I understand you correctly, this is very close to the "asfreq" > method of the Date/DateArray/TimeSeries classes in the timeseries > module. One key element missing here (from my point of view anyway) > is an equivalent of the 'relation' parameter in the asfreq method in > the timeseries module. This is only used when converting from a lower > frequency to a higher frequency (eg. annual to daily). For example... > > >>> a = ts.Date(freq='Annual', year=2007) > >>> a.asfreq('Daily', 'START') > > > > >>> a.asfreq('Daily', 'END') > > > > This is another one of those things that I use all the time. Now > whether it belongs in the core dtype, or some extension module I'm > not sure... but it's an important feature in the timeseries module. I agree that such a 'relation' parameter in the proposed 'change_timeunit' could be handy in many situations. It should be applicable only to absolute times though. With this, the signature for the function would be: change_timeunit(time_object, new_unit, relation, reference) where 'relation' only can be used with absolute times and 'reference' only with relative times. Who knows, perhaps in the future one can find a way to implement such a 'change_timeunit' function as methods without disturbing too much the method schema of the ndarray objects. Cheers, -- Francesc Alted From cournapeau at cslab.kecl.ntt.co.jp Thu Jul 31 02:41:15 2008 From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau) Date: Thu, 31 Jul 2008 15:41:15 +0900 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <693BDC66-7E16-4862-B63E-355134531387@dalkescientific.com> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <1217469232.31016.15.camel@bbc8> <693BDC66-7E16-4862-B63E-355134531387@dalkescientific.com> Message-ID: <1217486475.412.10.camel@bbc8> On Thu, 2008-07-31 at 08:12 +0200, Andrew Dalke wrote: > On Jul 31, 2008, at 3:53 AM, David Cournapeau wrote: > > You are supposed to run the tests on an installed numpy, not in the > > sources: > > > > import numpy > > numpy.test(verbose = 10) > > Doesn't that make things more cumbersome to test? That is, if I were > to make a change I would need to: > - python setup.py build (to put the code into the build/* > subdirectory) > - cd the build directory, or switch to a terminal which was > already there > - manually do the import/test code you wrote, or a write two-line > program for it Yes. 
Nothing that an easy make file cannot solve, nonetheless (I am sure I am not the only one with a makefile/script which automates the above, to test a new svn updated numpy in one command). The problem is that it is difficult to support running uninstalled packages, in particular because of compiled code (distutils/setuptools have a develop mode to make this possible, though). Distutils put the build code in build directory, and the correct tree is built at install time. > > I would rather do 'nosetests' in the source tree, if at all feasible, > although that might only be possible for the Python source. Yes but how do you do that ? You would do import scipy in an svn checkout, and the C extensions would be the ones installed ? That sounds like a nightmare from a reliability POV. There was a related discussion (using scipy wo installing it) on scipy ML, BTW: http://projects.scipy.org/pipermail/scipy-user/2008-July/017678.html cheers, David From robert.kern at gmail.com Thu Jul 31 02:44:33 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 31 Jul 2008 01:44:33 -0500 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <693BDC66-7E16-4862-B63E-355134531387@dalkescientific.com> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <1217469232.31016.15.camel@bbc8> <693BDC66-7E16-4862-B63E-355134531387@dalkescientific.com> Message-ID: <3d375d730807302344j635c3a9aq5ba190232b571d3b@mail.gmail.com> On Thu, Jul 31, 2008 at 01:12, Andrew Dalke wrote: > On Jul 31, 2008, at 3:53 AM, David Cournapeau wrote: >> You are supposed to run the tests on an installed numpy, not in the >> sources: >> >> import numpy >> numpy.test(verbose = 10) > > Doesn't that make things more cumbersome to test? That is, if I were > to make a change I would need to: > - python setup.py build (to put the code into the build/* > subdirectory) > - cd the build directory, or switch to a terminal which was > already there > - manually do the import/test code you wrote, or a write two-line > program for it Developers can build_ext --inplace and frequently use nosetests anyways. numpy.test() is now primarily for users who are trying to see if their installation worked (or gathering requested information for the people on this list to help them troubleshoot) need to test the installed numpy. Note that we are *just* now transitioning to using nosetests for the development version of numpy. It used to be (through the 1.1.x releases) that we had our own test collection code inside numpy. numpy.test() was *necessary* in those releases. By now, we have most of the denizens here trained to do numpy.test() when testing their new installations. Maybe in 1.3, we'll remove it in favor of just having people use the nosetests command. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pav at iki.fi Thu Jul 31 04:14:10 2008 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 31 Jul 2008 08:14:10 +0000 (UTC) Subject: [Numpy-discussion] Example of numpy cov() not correct? 
References: <710F2847B0018641891D9A216027636029C1F1@ex3.envision.co.il> Message-ID: Wed, 30 Jul 2008 18:49:10 +0300, Nadav Horesh wrote: > If you read the cov function documentation you'll see that if a second > vector is given, it joins the 2 into one matrix and calculate the > covariance of it. In your case, you are looking for the off-diagonal > elements. So the final answer to the OP's question is: Yes, the example on http://www.scipy.org/Numpy_Example_List_With_Doc#cov is wrong; cov(T,P) indeed returns a matrix. And it would be nice if someone fixed this, you can simply register a wiki account and fix the problem. -- Pauli Virtanen From dagss at student.matnat.uio.no Thu Jul 31 05:06:13 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 31 Jul 2008 11:06:13 +0200 Subject: [Numpy-discussion] NumPy won't work with distutils loaded? Message-ID: <48918085.70206@student.matnat.uio.no> While developing a testcase for NumPy in Cython, I had the problem demonstrated below. Essentially this means that I couldn't run testcases in our Cython testing framework (though I will work around it). Is this some known restriction in how NumPy can be used, is it work in progress, a problem with my environment/Ubuntu, or should I file a bug report? This was on an up-to-date Ubunty Hardy 64-bit i386 system (with distro numpy, Python and distutils). Py> import numpy is ok, however (note the uppercase UnixCCompiler in dir(...)): Py> import distutils.unixccompiler Py> dir(distutils.unixccompiler) ['CCompiler', 'CompileError', 'DistutilsExecError', 'LibError', 'LinkError', 'NoneType', 'StringType', 'UnixCCompiler', '__builtins__', '__doc__', '__file__', '__name__', '__revision__', '_darwin_compiler_fixup', 'copy', 'gen_lib_options', 'gen_preprocess_options', 'log', 'newer', 'os', 'sys', 'sysconfig'] Py> import numpy Traceback (most recent call last): File "", line 1, in File "/usr/lib/python2.5/site-packages/numpy/__init__.py", line 37, in import testing File "/usr/lib/python2.5/site-packages/numpy/testing/__init__.py", line 3, in from numpytest import * File "/usr/lib/python2.5/site-packages/numpy/testing/numpytest.py", line 19, in from numpy.distutils.exec_command import splitcmdline File "/usr/lib/python2.5/site-packages/numpy/distutils/__init__.py", line 6, in import ccompiler File "/usr/lib/python2.5/site-packages/numpy/distutils/ccompiler.py", line 393, in setattr(getattr(_m, _cc+'compiler'), 'gen_lib_options', AttributeError: 'module' object has no attribute 'unixccompiler' -- Dag Sverre From dagss at student.matnat.uio.no Thu Jul 31 05:19:46 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 31 Jul 2008 11:19:46 +0200 Subject: [Numpy-discussion] Sorry!, please disregard (Re: NumPy won't work with distutils loaded?) In-Reply-To: <48918085.70206@student.matnat.uio.no> References: <48918085.70206@student.matnat.uio.no> Message-ID: <489183B2.8090407@student.matnat.uio.no> Dag Sverre Seljebotn wrote: > While developing a testcase for NumPy in Cython, I had the problem > demonstrated below. Essentially this means that I couldn't run testcases > in our Cython testing framework (though I will work around it). Is this > some known restriction in how NumPy can be used, is it work in progress, > a problem with my environment/Ubuntu, or should I file a bug report? > > This was on an up-to-date Ubunty Hardy 64-bit i386 system (with distro > numpy, Python and distutils). Now where *has* my manners gone to, complaining about a distro numpy revision... 
Sorry all, built the latest SVN and there appears to be no problem. Although the source in question is still the same *shrug*. -- Dag Sverre From stefan at sun.ac.za Thu Jul 31 05:42:02 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 31 Jul 2008 11:42:02 +0200 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> Message-ID: <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com> 2008/7/31 Andrew Dalke : >> Numpy has a very flat namespace, for better or worse, which implies >> many imports. > > I don't get the feeling that numpy is flat. Python's stdlib is flat. > Numpy has many 2- and 3-level modules. With 500+ functions in the root namespace, I'd call numpy flat. > If I had my way, remove things like (in numpy/__init__.py) > > import linalg > import fft > import random > import ctypeslib > import ma > > but leave the list of submodules in "__all__" so that "from numpy > import *" works. Perhaps add a top-level function 'import_all()' > which mimics the current behavior, and have IPython know about it so > interactive users get it automatically. Or something like that. > > > Yes, I know the numpy team won't change this behavior. I want to > know when you all will consider changing. Maybe when we're convinced that there is a lot to be gained from making such a change. From my perspective, it doesn't look good: I) Major code breakage II) Confused users III) More difficult function discovery for beginners vs. I) Slight improvement in startup speed. >>> Getting rid of these functions, and thus getting rid of the import, >>> speeds numpy startup time by 3.5%. >> >> While I appreciate you taking the time to find these niggles, we >> are short on developer time as it is. Asking them to spend their >> precious time on making a 3.5% improvement in startup time does not >> make much sense. If you provide a patch, on the other hand, it would >> only take a matter of seconds to decide whether to apply or not. >> You've already done most of the sleuth work. > > I wrote that I don't know the reasons for why the design was as it > is. Are those functions ("english_upper", "english_lower", > "english_capitalize") expected as part of the public interface for > the module? The lack of a "_" prefix and their verbose docstrings > implies that they are for general use. In that case, they can't > easily be gotten rid of. Yet it doesn't make sense for them to be > part of 'numerictypes'. Anything underneath numpy.core that is not exposed as numpy.something is not for public consumption.
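(For the record, the "500+ functions" figure above is easy to check; a throwaway snippet, assuming nothing beyond a stock install:

import numpy

# count the names numpy exposes at the top level, skipping _private ones
public = [name for name in dir(numpy) if not name.startswith('_')]
print "%d public names in the numpy root namespace" % len(public)

which should print something in the region of 500 on a 1.1.x install.)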
Stéfan From robert.kern at gmail.com Thu Jul 31 06:03:20 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 31 Jul 2008 05:03:20 -0500 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com> Message-ID: <3d375d730807310303q54ef94f7m3ba74b3f47f6e5ea@mail.gmail.com> On Thu, Jul 31, 2008 at 04:42, Stéfan van der Walt wrote: > 2008/7/31 Andrew Dalke : >> I wrote that I don't know the reasons for why the design was as it >> is. Are those functions ("english_upper", "english_lower", >> "english_capitalize") expected as part of the public interface for >> the module? The lack of a "_" prefix and their verbose docstrings >> implies that they are for general use. In that case, they can't >> easily be gotten rid of. Yet it doesn't make sense for them to be >> part of 'numerictypes'. > > Anything underneath numpy.core that is not exposed as numpy.something > is not for public consumption. That said, the reason those particular docstrings are verbose is because I wanted people to know why those functions exist there (e.g. "This is an internal utility function...."). But you still can't remove them since they are being used inside numerictypes. That's why I labeled them "internal utility functions" instead of leaving them with minimal docstrings such that you would have to guess. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From dalke at dalkescientific.com Thu Jul 31 06:36:46 2008 From: dalke at dalkescientific.com (Andrew Dalke) Date: Thu, 31 Jul 2008 12:36:46 +0200 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com> Message-ID: On Jul 31, 2008, at 11:42 AM, Stéfan van der Walt wrote: > Maybe when we're convinced that there is a lot to be gained from > making such a change. From my perspective, it doesn't look good: > > I) Major code breakage > II) Confused users > III) More difficult function discovery for beginners I'm not asking for a change. I fully realize this. I happen to think it's a mistake and there are other ways to have addressed the underlying requirement, but I know that's not going to change. (For example, follow the matplotlib approach where there's a special library designed to be imported for interactive use. But I am *not* proposing this change.) I point out that this makes numpy different than most other Python packages.
Had this not been done then I) would not be a problem, II) is I think a wash, because people starting with numpy will still wonder why >>> import PIL >>> PIL.Image Traceback (most recent call last): File "", line 1, in AttributeError: 'module' object has no attribute 'Image' >>> import PIL.Image >>> PIL.Image >>> and >>> import xml >>> xml.etree Traceback (most recent call last): File "", line 1, in AttributeError: 'module' object has no attribute 'etree' >>> from xml import etree >>> xml.etree >>> occur. III) assumes there couldn't have been other solutions. And it assumes that the difficulties are large, which I haven't seen in my experience. > I) Slight improvement in startup speed. The user base for numpy might be .. 10,000 people? 100,000 people? Let's go with the latter, and assume that with command-line scripts, CGI scripts, and the other programs that people write in order to help do research means that numpy is started on average 10 times a day. 100,000 people * 10 times / day * 0.1 seconds per startup = almost 28 people-hours spent each day waiting for numpy to start. I'm willing to spend a few days to achieve that. Perhaps there's fewer people than I'm estimating. OTOH, perhaps there are more imports of numpy per day. An order of magnitude less time is still a couple of hours each day as the world waits to import all of the numpy libraries. If on average people import numpy 10 times a day and it could be made 0.1 seconds faster then that's 1 second per person per day. If it takes on average 5 minutes to learn to import the module directly and the onus is all on numpy, then after 1 year of use the efficiency has made up for it, and the benefits continue to grow. Slight improvements add up when multiplied by everyone. The goals of numpy when it started aren't going to be the same as when it's a mature, widely used and deployed package. > Andrew dalke at dalkescientific.com From dalke at dalkescientific.com Thu Jul 31 06:43:17 2008 From: dalke at dalkescientific.com (Andrew Dalke) Date: Thu, 31 Jul 2008 12:43:17 +0200 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <3d375d730807310303q54ef94f7m3ba74b3f47f6e5ea@mail.gmail.com> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com> <3d375d730807310303q54ef94f7m3ba74b3f47f6e5ea@mail.gmail.com> Message-ID: <0BD87DFD-6B55-44E3-90EA-C1F83301F091@dalkescientific.com> On Jul 31, 2008, at 12:03 PM, Robert Kern wrote: > That said, the reason those particular docstrings are verbose is > because I wanted people to know why those functions exist there (e.g. > "This is an internal utility function...."). Err, umm, you mean that first line of the second paragraph in the docstring? *blush* > But you still can't remove them since they are being used inside > numerictypes. That's why I labeled them "internal utility functions" > instead of leaving them with minimal docstrings such that you would > have to guess. My proposal is to replace that code with a table mapping the type name to the uppercase/lowercase/capitalized forms, thus eliminating the (small) amount of time needed to import string. It makes adding new types slightly more difficult. I know it's a tradeoff. 
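Concretely, the sketch I have in mind looks something like this (the type list is abbreviated and the entries are hypothetical; the point is only the shape of the change):

# Precomputed case forms keyed by type name, so numerictypes no longer
# needs the general-purpose string machinery at module load time.
_CASE_FORMS = {
    # name:     (lower,      upper,      capitalized)
    'bool':     ('bool',     'BOOL',     'Bool'),
    'int8':     ('int8',     'INT8',     'Int8'),
    'uint8':    ('uint8',    'UINT8',    'Uint8'),
    'float64':  ('float64',  'FLOAT64',  'Float64'),
    # ... one hand-written entry per supported type name
}

def english_lower(name):
    # KeyError on an unknown name is deliberate: every supported
    # type name must be listed in the table above.
    return _CASE_FORMS[name][0]

def english_upper(name):
    return _CASE_FORMS[name][1]

def english_capitalize(name):
    return _CASE_FORMS[name][2]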
In this case it's somewhat like the proverbial New Jersey approach vs. the MIT one. The code that's there is the right way to solve the problem in the general case, but solving the specific problem can be punted, and as a result the code is (slightly) faster. Other parts of the code are like that, which is why I pointed out so many examples. Startup performance has not been a numpy concern. It is a concern for me, and it has been (for other packages) a concern for some of my clients. Andrew dalke at dalkescientific.com From stefan at sun.ac.za Thu Jul 31 08:20:59 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 31 Jul 2008 14:20:59 +0200 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com> Message-ID: <9457e7c80807310520v19d270bby5601d507e6dfa89@mail.gmail.com> 2008/7/31 Andrew Dalke : > The user base for numpy might be .. 10,000 people? 100,000 people? > Let's go with the latter, and assume that with command-line scripts, > CGI scripts, and the other programs that people write in order to > help do research means that numpy is started on average 10 times a day. > > 100,000 people * 10 times / day * 0.1 seconds per startup > = almost 28 people-hours spent each day waiting for numpy to start. I don't buy that argument. No single person is agile enough to do anything useful in the half a second or so it takes to start up NumPy. No one is *waiting* for NumPy to start. Just by answering this e-mail I could have (and maybe should have) started NumPy three hundred and sixty times. I don't want to argue about this, though. Write the patches, file a ticket, and hopefully someone will deem them important enough to apply them. Stéfan From hanni.ali at gmail.com Thu Jul 31 08:31:49 2008 From: hanni.ali at gmail.com (Hanni Ali) Date: Thu, 31 Jul 2008 13:31:49 +0100 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <9457e7c80807310520v19d270bby5601d507e6dfa89@mail.gmail.com> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com> <9457e7c80807310520v19d270bby5601d507e6dfa89@mail.gmail.com> Message-ID: <789d27b10807310531s110f10d1n564c5ac8031e454c@mail.gmail.com> Hi All, I've been reading this discussion with interest. I would just like to highlight an alternate use of numpy to interactive use. We have a cluster of machines which process tasks on an individual basis where a master task may spawn 600 slave tasks to be processed. These tasks are spread across the cluster and processed as scripts in an individual Python thread. Although reducing the process time by 300 seconds for the master task is only about a 1.5% speedup (total time can be in excess of 24000s), we process a large number of these tasks in any given year and every little helps! Hanni 2008/7/31 Stéfan van der Walt > 2008/7/31 Andrew Dalke : > > The user base for numpy might be .. 10,000 people? 100,000 people?
> > Let's go with the latter, and assume that with command-line scripts, > > CGI scripts, and the other programs that people write in order to > > help do research means that numpy is started on average 10 times a day. > > > > 100,000 people * 10 times / day * 0.1 seconds per startup > > = almost 28 people-hours spent each day waiting for numpy to start. > > I don't buy that argument. No single person is agile enough to do > anything useful in the half a second or so it takes to start up NumPy. > No one is *waiting* for NumPy to start. Just by answering this > e-mail I could have (and maybe should have) started NumPy three > hundred and sixty times. > > I don't want to argue about this, though. Write the patches, file a > ticket, and hopefully someone will deem them important enough to apply > them. > > St?fan > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wnbell at gmail.com Thu Jul 31 08:46:20 2008 From: wnbell at gmail.com (Nathan Bell) Date: Thu, 31 Jul 2008 07:46:20 -0500 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <789d27b10807310531s110f10d1n564c5ac8031e454c@mail.gmail.com> References: <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com> <9457e7c80807310520v19d270bby5601d507e6dfa89@mail.gmail.com> <789d27b10807310531s110f10d1n564c5ac8031e454c@mail.gmail.com> Message-ID: On Thu, Jul 31, 2008 at 7:31 AM, Hanni Ali wrote: > > I would just to highlight an alternate use of numpy to interactive use. We > have a cluster of machines which process tasks on an individual basis where > a master tasks may spawn 600 slave tasks to be processed. These tasks are > spread across the cluster and processed as scripts in a individual python > thread. Although reducing the process time by 300 seconds for the master > task is only about a 1.5% speedup (total time can be i excess of 24000s). We > process large number of these tasks in any given year and every little > helps! > There are other components of NumPy/SciPy that are more worthy of optimization. Given that programmer time is a scarce resource, it's more sensible to direct our efforts towards making the other 98.5% of the computation faster. /law of diminishing returns -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From david at ar.media.kyoto-u.ac.jp Thu Jul 31 09:01:59 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 31 Jul 2008 22:01:59 +0900 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: References: <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com> <9457e7c80807310520v19d270bby5601d507e6dfa89@mail.gmail.com> <789d27b10807310531s110f10d1n564c5ac8031e454c@mail.gmail.com> Message-ID: <4891B7C7.8060002@ar.media.kyoto-u.ac.jp> Nathan Bell wrote: > > There are other components of NumPy/SciPy that are more worthy of > optimization. 
Given that programmer time is a scarce resource, it's > more sensible to direct our efforts towards making the other 98.5% of > the computation faster. > To be fair, when I took a look at the problem last month, it took a few of us (Robert and me, IIRC) a maximum of 2 man-hours altogether to halve numpy's import time on Linux, without altering the API at all. Maybe there are more things which can be done to get to a more 'flat' profile. cheers, David From wnbell at gmail.com Thu Jul 31 09:19:27 2008 From: wnbell at gmail.com (Nathan Bell) Date: Thu, 31 Jul 2008 08:19:27 -0500 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com> Message-ID: On Thu, Jul 31, 2008 at 5:36 AM, Andrew Dalke wrote: > > The user base for numpy might be .. 10,000 people? 100,000 people? > Let's go with the latter, and assume that with command-line scripts, > CGI scripts, and the other programs that people write in order to > help do research means that numpy is started on average 10 times a day. > > 100,000 people * 10 times / day * 0.1 seconds per startup > = almost 28 people-hours spent each day waiting for numpy to start. > > I'm willing to spend a few days to achieve that. > > > Perhaps there's fewer people than I'm estimating. OTOH, perhaps > there are more imports of numpy per day. An order of magnitude less > time is still a couple of hours each day as the world waits to import > all of the numpy libraries. > > If on average people import numpy 10 times a day and it could be made > 0.1 seconds faster then that's 1 second per person per day. If it > takes on average 5 minutes to learn to import the module directly and > the onus is all on numpy, then after 1 year of use the efficiency has > made up for it, and the benefits continue to grow. > Just think of the savings that could be achieved if all 2.1 million Walmart employees were outfitted with colostomy bags. 0.5 hours / day for bathroom breaks * 2,100,000 employees * 365 days/year * $7/hour = $2,682,750,000/year Granted, I'm probably not the first to run these numbers. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From aisaac at american.edu Thu Jul 31 09:33:00 2008 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 31 Jul 2008 09:33:00 -0400 Subject: [Numpy-discussion] The date/time dtype and the casting issue In-Reply-To: <200807310815.45884.faltet@pytables.org> References: <200807291512.53270.faltet@pytables.org> <200807310815.45884.faltet@pytables.org> Message-ID: > A Thursday 31 July 2008, Matt Knox escrigué: >> While on the topic of FAME... being a financial analyst, I really am >> quite fond of the multitude of quarterly frequencies we have in the >> timeseries package (with different year end points) because they are >> very useful when doing things like "calenderizing"
We thought that the quarter fits better as > a 'derived' time unit, similarly as biweekly, semester or biyearly (to > name just a few examples). However, if quarters are found to be much > more important than other derived time units, they can go into the > proposal too. Quarterly frequency is probably the most analyzed frequency in macroeconometrics. Widely used macroeconometrics packages (e.g., EViews) traditionally support only three explicit frequencies: annual, quarterly, and monthly. Cheers, Alan Isaac From gael.varoquaux at normalesup.org Thu Jul 31 10:10:44 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 31 Jul 2008 16:10:44 +0200 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <1217486475.412.10.camel@bbc8> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <1217469232.31016.15.camel@bbc8> <693BDC66-7E16-4862-B63E-355134531387@dalkescientific.com> <1217486475.412.10.camel@bbc8> Message-ID: <20080731141044.GA24491@phare.normalesup.org> On Thu, Jul 31, 2008 at 03:41:15PM +0900, David Cournapeau wrote: > Yes. Nothing that an easy make file cannot solve, nonetheless (I am sure > I am not the only one with a makefile/script which automates the above, > to test a new svn updated numpy in one command). That's why distutils have a test target. You can do "python setup.py test", and if you have setup you setup.py properly it should work (obviously it is easy to make this statement, and harder to get the thing working). Ga?l From gael.varoquaux at normalesup.org Thu Jul 31 10:14:21 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 31 Jul 2008 16:14:21 +0200 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <0BD87DFD-6B55-44E3-90EA-C1F83301F091@dalkescientific.com> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com> <3d375d730807310303q54ef94f7m3ba74b3f47f6e5ea@mail.gmail.com> <0BD87DFD-6B55-44E3-90EA-C1F83301F091@dalkescientific.com> Message-ID: <20080731141421.GB24491@phare.normalesup.org> On Thu, Jul 31, 2008 at 12:43:17PM +0200, Andrew Dalke wrote: > Startup performance has not been a numpy concern. It a concern for > me, and it has been (for other packages) a concern for some of my > clients. I am curious, if startup performance is a problem, I guess it is because you are running lots of little scripts where startup time is big compared to run time. Did you think of forking them from an already started process. I had this same problem (with libraries way slower than numpy to load) and used os.fork to a great success. Ga?l From kwgoodman at gmail.com Thu Jul 31 10:15:39 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 31 Jul 2008 07:15:39 -0700 Subject: [Numpy-discussion] Example of numpy cov() not correct? 
In-Reply-To: References: <710F2847B0018641891D9A216027636029C1F1@ex3.envision.co.il> Message-ID: On Thu, Jul 31, 2008 at 1:14 AM, Pauli Virtanen wrote: > Yes, the example on > > http://www.scipy.org/Numpy_Example_List_With_Doc#cov > > is wrong; cov(T,P) indeed returns a matrix. And it would be nice if > someone fixed this; you can simply register a wiki account and fix the > problem. Done. From alan.mcintyre at gmail.com Thu Jul 31 10:21:15 2008 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Thu, 31 Jul 2008 10:21:15 -0400 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <693BDC66-7E16-4862-B63E-355134531387@dalkescientific.com> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <1217469232.31016.15.camel@bbc8> <693BDC66-7E16-4862-B63E-355134531387@dalkescientific.com> Message-ID: <1d36917a0807310721p2be2bf40qd73c62e650fb1e56@mail.gmail.com> On Thu, Jul 31, 2008 at 2:12 AM, Andrew Dalke wrote: > Hmm. And it looks like testing/nosetester.py (which implements the > 'test' function above) is meant to make it easier to run nose, except > my feeling is the extra level of wrapping makes things more > complicated. The nosetests command-line appears to be more flexible, > with support for, for example, dropping into the debugger on errors, > and resetting the coverage test files. You can actually pass those sorts of options to nose through the extra_argv parameter in test(). That might be a little cumbersome, but (as far as I know) it's something I'm going to do so infrequently it's not a big deal. From david at ar.media.kyoto-u.ac.jp Thu Jul 31 10:05:33 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 31 Jul 2008 23:05:33 +0900 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <20080731141044.GA24491@phare.normalesup.org> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <1217469232.31016.15.camel@bbc8> <693BDC66-7E16-4862-B63E-355134531387@dalkescientific.com> <1217486475.412.10.camel@bbc8> <20080731141044.GA24491@phare.normalesup.org> Message-ID: <4891C6AD.8090607@ar.media.kyoto-u.ac.jp> Gael Varoquaux wrote: > > That's why distutils has a test target. You can do "python setup.py > test", and if you have set up your setup.py properly it should work > (obviously it is easy to make this statement, and harder to get the thing > working). > I have already seen some discussion about distutils like this, if you mean something like this: http://blog.ianbicking.org/pythons-makefile.html but I would take rake and make over this anytime. I just don't understand why something like rake does not exist in python, but well, let's not go there.
David From gael.varoquaux at normalesup.org Thu Jul 31 10:27:26 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 31 Jul 2008 16:27:26 +0200 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <4891C6AD.8090607@ar.media.kyoto-u.ac.jp> References: <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <1217469232.31016.15.camel@bbc8> <693BDC66-7E16-4862-B63E-355134531387@dalkescientific.com> <1217486475.412.10.camel@bbc8> <20080731141044.GA24491@phare.normalesup.org> <4891C6AD.8090607@ar.media.kyoto-u.ac.jp> Message-ID: <20080731142726.GC24491@phare.normalesup.org> On Thu, Jul 31, 2008 at 11:05:33PM +0900, David Cournapeau wrote: > Gael Varoquaux wrote: > > That's why distutils has a test target. You can do "python setup.py > > test", and if you have set up your setup.py properly it should work > > (obviously it is easy to make this statement, and harder to get the thing > > working). > I have already seen some discussion about distutils like this, if you > mean something like this: > http://blog.ianbicking.org/pythons-makefile.html > but I would take rake and make over this anytime. I just don't > understand why something like rake does not exist in python, but well, > let's not go there. Well, actually, in the enthought tools suite we use setuptools for packaging (I don't want to start a controversy, I am not advocating the use of setuptools, just stating a fact) and nose for testing, and getting "setup.py test" to work, including doing the build, running the tests, and downloading nose if it is not there, is a matter of adding these two lines to the setup.py: tests_require = [ 'nose >= 0.10.3', ], test_suite = 'nose.collector', Obviously, the build part has to be well-tuned for the machinery to work, but there is a lot of value here. Gaël From david at ar.media.kyoto-u.ac.jp Thu Jul 31 10:16:12 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 31 Jul 2008 23:16:12 +0900 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <20080731142726.GC24491@phare.normalesup.org> References: <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <1217469232.31016.15.camel@bbc8> <693BDC66-7E16-4862-B63E-355134531387@dalkescientific.com> <1217486475.412.10.camel@bbc8> <20080731141044.GA24491@phare.normalesup.org> <4891C6AD.8090607@ar.media.kyoto-u.ac.jp> <20080731142726.GC24491@phare.normalesup.org> Message-ID: <4891C92C.1030704@ar.media.kyoto-u.ac.jp> Gael Varoquaux wrote: > Obviously, the build part has to be well-tuned for the machinery to work, > but there is a lot of value here. > Ah yes, setuptools does have this. But this is specific to setuptools, bare distutils does not have this test command, right?
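I suppose one could always register such a command by hand, though; a minimal sketch, with a hypothetical 'mypackage' standing in for the real package:

# setup.py -- hand-rolled 'test' command for bare distutils
from distutils.core import setup, Command

class TestCommand(Command):
    """Run the package test suite via 'python setup.py test'."""
    description = "run the test suite"
    user_options = []

    def initialize_options(self):
        pass

    def finalize_options(self):
        pass

    def run(self):
        import unittest
        # load whatever 'mypackage.tests' defines and run it
        suite = unittest.defaultTestLoader.loadTestsFromName('mypackage.tests')
        unittest.TextTestRunner(verbosity=2).run(suite)

setup(name='mypackage',
      packages=['mypackage'],
      cmdclass={'test': TestCommand})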
cheers, David From bioinformed at gmail.com Thu Jul 31 10:34:04 2008 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Thu, 31 Jul 2008 10:34:04 -0400 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <20080731141421.GB24491@phare.normalesup.org> References: <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com> <3d375d730807310303q54ef94f7m3ba74b3f47f6e5ea@mail.gmail.com> <0BD87DFD-6B55-44E3-90EA-C1F83301F091@dalkescientific.com> <20080731141421.GB24491@phare.normalesup.org> Message-ID: <2e1434c10807310734ma0e9c51y7fe50502054ceed9@mail.gmail.com> On Thu, Jul 31, 2008 at 10:14 AM, Gael Varoquaux < gael.varoquaux at normalesup.org> wrote: > On Thu, Jul 31, 2008 at 12:43:17PM +0200, Andrew Dalke wrote: > > Startup performance has not been a numpy concern. It is a concern for > > me, and it has been (for other packages) a concern for some of my > > clients. > > I am curious: if startup performance is a problem, I guess it is because > you are running lots of little scripts where startup time is big compared > to run time. Did you think of forking them from an already started > process? I had this same problem (with libraries way slower than numpy to > load) and used os.fork with great success. > Start up time is an issue for me, but in a larger sense than just numpy. I do run many scripts, some that are ephemeral and some that take significant amounts of time. However, numpy is just one of many many libraries that I must import, so improvements, even minor ones, are appreciated. The moral of this discussion, for me, is that just because _you_ don't care about a particular aspect or feature, doesn't mean that others don't or shouldn't. Your workarounds may not be viable for me and vice-versa. So let's just go with the spirit of open source and encourage those motivated to contribute to do so, provided their suggestions are sensible and do not break code. -Kevin -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Thu Jul 31 10:34:21 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 31 Jul 2008 16:34:21 +0200 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <4891C92C.1030704@ar.media.kyoto-u.ac.jp> References: <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <1217469232.31016.15.camel@bbc8> <693BDC66-7E16-4862-B63E-355134531387@dalkescientific.com> <1217486475.412.10.camel@bbc8> <20080731141044.GA24491@phare.normalesup.org> <4891C6AD.8090607@ar.media.kyoto-u.ac.jp> <20080731142726.GC24491@phare.normalesup.org> <4891C92C.1030704@ar.media.kyoto-u.ac.jp> Message-ID: <20080731143421.GE24491@phare.normalesup.org> On Thu, Jul 31, 2008 at 11:16:12PM +0900, David Cournapeau wrote: > Gael Varoquaux wrote: > > Obviously, the build part has to be well-tuned for the machinery to work, > > but there is a lot of value here. > Ah yes, setuptools does have this. But this is specific to setuptools, > bare distutils does not have this test command, right? Dunno, sorry. The scale of my ignorance of distutils and related subjects would probably impress you :). Gaël, looking forward to your tutorial on scons.
From david at ar.media.kyoto-u.ac.jp Thu Jul 31 10:18:40 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 31 Jul 2008 23:18:40 +0900 Subject: [Numpy-discussion] distutils and inplace build: is numpy supposed to work ? Message-ID: <4891C9C0.3050202@ar.media.kyoto-u.ac.jp> Hi, I wanted to know if numpy was supposed to work when built in place through the -i option of distutils. The reason why I am asking is that I would like to support it in numscons, and I cannot make it work when using distutils. Importing numpy works in the source tree, but most tests fail because of some missing imports; I have lots of those: ====================================================================== ERROR: Check that matrix type is preserved. ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/media/src/dsp/numpy/trunk/numpy/linalg/tests/test_linalg.py", line 69, in test_matrix_a_and_b self.do(a, b) File "/usr/media/src/dsp/numpy/trunk/numpy/linalg/tests/test_linalg.py", line 99, in do assert_almost_equal(a, dot(multiply(u, s), vt)) File "/usr/media/src/dsp/numpy/trunk/numpy/linalg/tests/test_linalg.py", line 22, in assert_almost_equal old_assert_almost_equal(a, b, decimal=decimal, **kw) File "numpy/testing/utils.py", line 171, in assert_almost_equal from numpy.core import ndarray File "core/__init__.py", line 27, in __all__ += numeric.__all__ NameError: name 'numeric' is not defined Is this expected, or am I doing something wrong? cheers, David From gael.varoquaux at normalesup.org Thu Jul 31 10:36:38 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 31 Jul 2008 16:36:38 +0200 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <2e1434c10807310734ma0e9c51y7fe50502054ceed9@mail.gmail.com> References: <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com> <3d375d730807310303q54ef94f7m3ba74b3f47f6e5ea@mail.gmail.com> <0BD87DFD-6B55-44E3-90EA-C1F83301F091@dalkescientific.com> <20080731141421.GB24491@phare.normalesup.org> <2e1434c10807310734ma0e9c51y7fe50502054ceed9@mail.gmail.com> Message-ID: <20080731143638.GF24491@phare.normalesup.org> On Thu, Jul 31, 2008 at 10:34:04AM -0400, Kevin Jacobs wrote: > The moral of this discussion, for me, is that just because _you_ don't > care about a particular aspect or feature, doesn't mean that others don't > or shouldn't. Your workarounds may not be viable for me and vice-versa. > So let's just go with the spirit of open source and encourage those > motivated to contribute to do so, provided their suggestions are sensible > and do not break code. I fully agree here. And if people improve numpy's startup time without breaking or obfuscating stuff, I am very happy. I was just trying to help :). Yes, the value of open source is that different people improve the same tools to meet different goals, thus we should always keep an open ear to other people's requirements, especially if they come up with high-quality code.
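PS: to make the os.fork suggestion from my earlier mail concrete, a minimal POSIX-only sketch -- run_task is a hypothetical stand-in for the real per-task work:

import os
import numpy  # pay the import cost once, in the parent

def run_task(task_id):
    # hypothetical placeholder for the real computation
    print numpy.arange(task_id).sum()

for task_id in (10, 20, 30):
    if os.fork() == 0:      # child: inherits the already-imported numpy
        run_task(task_id)
        os._exit(0)         # skip normal interpreter teardown in the child
while True:
    try:
        os.wait()           # reap children until none remain
    except OSError:
        break

Each child starts with numpy already loaded, so the per-script import cost is paid exactly once.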
Gaël From bioinformed at gmail.com Thu Jul 31 10:38:42 2008 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Thu, 31 Jul 2008 10:38:42 -0400 Subject: [Numpy-discussion] [ANNOUNCE] Traits 3.0 has been released In-Reply-To: <48911498.4050903@enthought.com> References: <48911498.4050903@enthought.com> Message-ID: <2e1434c10807310738v2f8150a0ka371352dcdf753f5@mail.gmail.com> On Wed, Jul 30, 2008 at 9:25 PM, Dave Peterson wrote: > Hello, > > I am very pleased to announce that Traits 3.0 has just been released! > > All of the URLs on PyPi to Enthought seem to be broken (e.g., http://code.enthought.com/traits). Can you give an example showing how traits work? I'm mildly intrigued, but too lazy to dig beyond the first broken link. Thanks, -Kevin -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Thu Jul 31 10:46:09 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 31 Jul 2008 16:46:09 +0200 Subject: [Numpy-discussion] [ANNOUNCE] Traits 3.0 has been released In-Reply-To: <2e1434c10807310738v2f8150a0ka371352dcdf753f5@mail.gmail.com> References: <48911498.4050903@enthought.com> <2e1434c10807310738v2f8150a0ka371352dcdf753f5@mail.gmail.com> Message-ID: <20080731144609.GG24491@phare.normalesup.org> On Thu, Jul 31, 2008 at 10:38:42AM -0400, Kevin Jacobs wrote: > All of the URLs on PyPi to Enthought seem to be broken (e.g., > http://code.enthought.com/traits). Can you give an example showing how > traits work? I'm mildly intrigued, but too lazy to dig beyond the first > broken link. The proper URL is http://code.enthought.com/projects/traits/ . This has been reported and will be fixed ASAP. Gaël From Chris.Barker at noaa.gov Thu Jul 31 12:45:35 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 31 Jul 2008 09:45:35 -0700 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> Message-ID: <4891EC2F.9050501@noaa.gov> Andrew Dalke wrote: > If I had my way, remove things like (in numpy/__init__.py) > > import linalg > import fft > import random > import ctypeslib > import ma as a side benefit, this might help folks using py2exe, py2app and friends -- as it stands all those sub-modules need to be included in your app bundle regardless of whether they are used. I recall having to explicitly add them by hand, too, though that may have been a matplotlib.numerix issue. > but leave the list of submodules in "__all__" so that "from numpy > import *" works. Of course, no one should be doing that anyway.... ;-) And for what it's worth, I've found myself very frustrated by how long it takes to start up python and import numpy. I often do whip out the interpreter to do something fast, and I didn't used to have to wait for it. On my OS-X box (10.4.11, python2.5, numpy '1.1.1rc2'), it takes about 7 seconds to import numpy! -Chris -- Christopher Barker, Ph.D.
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From david at ar.media.kyoto-u.ac.jp Thu Jul 31 12:34:30 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 01 Aug 2008 01:34:30 +0900 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <4891EC2F.9050501@noaa.gov> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <4891EC2F.9050501@noaa.gov> Message-ID: <4891E996.5070000@ar.media.kyoto-u.ac.jp> Christopher Barker wrote: > On my OS-X box (10.4.11, python2.5, numpy '1.1.1rc2'), it takes about 7 > seconds to import numpy! > > Hot or cold ? If hot, there is something horribly wrong with your setup. On my macbook, it takes ~ 180 ms to run python -c "import numpy", and ~ 100 ms on linux (same machine). cheers, David From Chris.Barker at noaa.gov Thu Jul 31 12:55:31 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 31 Jul 2008 09:55:31 -0700 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <9457e7c80807310520v19d270bby5601d507e6dfa89@mail.gmail.com> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com> <9457e7c80807310520v19d270bby5601d507e6dfa89@mail.gmail.com> Message-ID: <4891EE83.2070901@noaa.gov> Stéfan van der Walt wrote: > No one is *waiting* for NumPy to start. I am, and probably 10 times a day, yes. And it's a major issue for CGI, though maybe no one's using that anymore anyway. > Just by answering this > e-mail I could have (and maybe should have) started NumPy three > hundred and sixty times. sure, but I like wasting my time on mailing lists.... -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Thu Jul 31 13:12:22 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 31 Jul 2008 10:12:22 -0700 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <4891E996.5070000@ar.media.kyoto-u.ac.jp> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <4891EC2F.9050501@noaa.gov> <4891E996.5070000@ar.media.kyoto-u.ac.jp> Message-ID: <4891F276.8070404@noaa.gov> David Cournapeau wrote: > Christopher Barker wrote: >> On my OS-X box (10.4.11, python2.5, numpy '1.1.1rc2'), it takes about 7 >> seconds to import numpy! > > Hot or cold ? If hot, there is something horribly wrong with your setup. hot -- it takes about 10 cold. I've been wondering about that.
time python -c "import numpy" real 0m8.383s user 0m0.320s sys 0m7.805s and similar results if run multiple times in a row. Any idea what could be wrong? I have no clue where to start, though I suppose a complete clean out and re-install of python comes to mind. oh, and this is a dual G5 PPC (which should have a faster disk than your Macbook) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From nwagner at iam.uni-stuttgart.de Thu Jul 31 13:15:59 2008 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Thu, 31 Jul 2008 19:15:59 +0200 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <4891F276.8070404@noaa.gov> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <4891EC2F.9050501@noaa.gov> <4891E996.5070000@ar.media.kyoto-u.ac.jp> <4891F276.8070404@noaa.gov> Message-ID: On Thu, 31 Jul 2008 10:12:22 -0700 Christopher Barker wrote: > David Cournapeau wrote: >> Christopher Barker wrote: >>> On my OS-X box (10.4.11, python2.5, numpy '1.1.1rc2'), >>>it takes about 7 >>> seconds to import numpy! >> >> Hot or cold ? If hot, there is something horribly wrong >>with your setup. > > hot -- it takes about 10 cold. > > I've been wondering about that. > > time python -c "import numpy" > > real 0m8.383s > user 0m0.320s > sys 0m7.805s > > and similar results if run multiple times in a row. > > Any idea what could be wrong? I have no clue where to >start, though I > suppose a complete clean out and re-install of python >comes to mind. > > oh, and this is a dual G5 PPC (which should have a >faster disk than your > Macbook) > > > -Chris > No idea, but for comparison time /usr/bin/python -c "import numpy" real 0m0.295s user 0m0.236s sys 0m0.050s nwagner at linux:~/svn/matplotlib> cat /proc/cpuinfo processor : 0 vendor_id : AuthenticAMD cpu family : 6 model : 10 model name : mobile AMD Athlon (tm) 2500+ stepping : 0 cpu MHz : 662.592 cache size : 512 KB fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse pni syscall mp mmxext 3dnowext 3dnow bogomips : 1316.57 Nils From david.huard at gmail.com Thu Jul 31 13:56:28 2008 From: david.huard at gmail.com (David Huard) Date: Thu, 31 Jul 2008 13:56:28 -0400 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <4891F276.8070404@noaa.gov> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <4891EC2F.9050501@noaa.gov> <4891E996.5070000@ar.media.kyoto-u.ac.jp> <4891F276.8070404@noaa.gov> Message-ID: <91cf711d0807311056i3c3d929bo24c59f283572b1a1@mail.gmail.com> On Thu, Jul 31, 2008 at 1:12 PM, Christopher Barker wrote: > David Cournapeau wrote: > > Christopher Barker wrote: > >> On my OS-X box (10.4.11, python2.5, numpy '1.1.1rc2'), it takes about 7 > >> seconds to import numpy! > > > > Hot or cold ? 
If hot, there is something horribly wrong with your setup. > > hot -- it takes about 10 cold. > > I've been wondering about that. > > time python -c "import numpy" > > real 0m8.383s > user 0m0.320s > sys 0m7.805s > > and similar results if run multiple times in a row. Is only 'import numpy' slow, or do other packages import slowly too? Are there remote directories in your pythonpath? Do you have old `eggs` in the site-packages directory that point to remote directories (installed with setuptools develop)? Try cleaning the site-packages directory. That did the trick for me once. David > Any idea what could be wrong? I have no clue where to start, though I > suppose a complete clean out and re-install of python comes to mind. > oh, and this is a dual G5 PPC (which should have a faster disk than your > Macbook) > > -Chris > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Jul 31 14:12:54 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 31 Jul 2008 12:12:54 -0600 Subject: [Numpy-discussion] Numpy 1.1.1 release notes. Message-ID: Hi All, I've attached draft release notes for Numpy 1.1.1. If you have anything to add or correct, let me know. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: np-1.1.1-release-notes.txt URL: From faltet at pytables.org Thu Jul 31 14:15:19 2008 From: faltet at pytables.org (Francesc Alted) Date: Thu, 31 Jul 2008 20:15:19 +0200 Subject: [Numpy-discussion] The date/time dtype and the casting issue In-Reply-To: References: <200807291512.53270.faltet@pytables.org> <200807310815.45884.faltet@pytables.org> Message-ID: <200807312015.19728.faltet@pytables.org> A Thursday 31 July 2008, Alan G Isaac escrigué: > > A Thursday 31 July 2008, Matt Knox escrigué: > >> While on the topic of FAME... being a financial analyst, I really > >> am quite fond of the multitude of quarterly frequencies we have in > >> the timeseries package (with different year end points) because > >> they are very useful when doing things like "calenderizing" > >> earnings from companies with different fiscal year ends. On Thu, 31 Jul 2008, Francesc Alted apparently wrote: > > Well, introducing a quarter should not be difficult. We just > > wanted to keep the set of supported time units under a minimum (the > > list is already quite large). We thought that the quarter fits > > better as a 'derived' time unit, similarly as biweekly, semester or > > biyearly (to name just a few examples). However, if quarters are > > found to be much more important than other derived time units, they > > can go into the proposal too. Quarterly frequency is probably the most analyzed frequency in macroeconometrics. Widely used macroeconometrics packages (e.g., EViews) traditionally support only three explicit frequencies: annual, quarterly, and monthly. Cheers, Alan Isaac I see.
However, I forgot to mention that another reason for not including the quarters is that they normally need the flexibility to be defined as starting in *any* month of the year. As we didn't want to provide ``origin`` metadata in the proposal (things got too complex already, as you can see in the third proposal that I sent to this list yesterday), the usefulness of such 'inflexible' quarters would be rather limited. So, in the end, I think it is best to avoid them for the dtype (and add support for them in the ``Date`` class). Cheers, -- Francesc Alted From charlesr.harris at gmail.com Thu Jul 31 14:16:09 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 31 Jul 2008 12:16:09 -0600 Subject: [Numpy-discussion] distutils and inplace build: is numpy supposed to work ? In-Reply-To: <4891C9C0.3050202@ar.media.kyoto-u.ac.jp> References: <4891C9C0.3050202@ar.media.kyoto-u.ac.jp> Message-ID: On Thu, Jul 31, 2008 at 8:18 AM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Hi, > > I wanted to know if numpy was supposed to work when built in place > through the -i option of distutils. The reason why I am asking is that I > would like to support it in numscons, and I cannot make it work when > using distutils. Importing numpy works in the source tree, but most > tests fail because of some missing imports; I have lots of those: > Robert made some fixes to support in place builds, so if it doesn't work, it's probably a bug. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Thu Jul 31 14:23:47 2008 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 31 Jul 2008 18:23:47 +0000 (UTC) Subject: [Numpy-discussion] Numpy 1.1.1 release notes. References: Message-ID: Thu, 31 Jul 2008 12:12:54 -0600, Charles R Harris wrote: > Hi All, > > I've attached draft release notes for Numpy 1.1.1. If you have anything > to add or correct, let me know. [clip] > Bug fixes #854, r5456? -- Pauli Virtanen From cournape at gmail.com Thu Jul 31 14:28:33 2008 From: cournape at gmail.com (David Cournapeau) Date: Fri, 1 Aug 2008 03:28:33 +0900 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <4891F276.8070404@noaa.gov> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <4891EC2F.9050501@noaa.gov> <4891E996.5070000@ar.media.kyoto-u.ac.jp> <4891F276.8070404@noaa.gov> Message-ID: <5b8d13220807311128w56e1996bue1ec304aca672b47@mail.gmail.com> > > hot -- it takes about 10 cold. > > I've been wondering about that. > > time python -c "import numpy" > > real 0m8.383s > user 0m0.320s > sys 0m7.805s > > and similar results if run multiple times in a row. What does python -c "import sys; print sys.path" say? > Any idea what could be wrong? I have no clue where to start, though I > suppose a complete clean out and re-install of python comes to mind. > > oh, and this is a dual G5 PPC (which should have a faster disk than your > Macbook) disk should not matter. If hot, everything should be in the IO buffer, opening a file is of the order of a few microseconds (that's certainly the order on Linux; the VM on Mac OS X is likely not as good, but still).
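Something like this, run twice so the second run is hot, separates the import itself from interpreter start-up and shows the path being scanned -- just a throwaway diagnostic sketch:

import sys, time

t0 = time.time()
import numpy
print "import numpy: %.3f s" % (time.time() - t0)
print "%d entries on sys.path:" % len(sys.path)
for p in sys.path:
    print " ", p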
cheers, David From charlesr.harris at gmail.com Thu Jul 31 14:32:57 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 31 Jul 2008 12:32:57 -0600 Subject: [Numpy-discussion] Numpy 1.1.1 release notes. In-Reply-To: References: Message-ID: On Thu, Jul 31, 2008 at 12:23 PM, Pauli Virtanen wrote: > Thu, 31 Jul 2008 12:12:54 -0600, Charles R Harris wrote: > > > Hi All, > > > > I've attached draft release notes for Numpy 1.1.1. If you have anything > > to add or correct, let me know. > [clip] > > Bug fixes > > #854, r5456? > Added... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Thu Jul 31 14:33:47 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 31 Jul 2008 20:33:47 +0200 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <4891F276.8070404@noaa.gov> References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <4891EC2F.9050501@noaa.gov> <4891E996.5070000@ar.media.kyoto-u.ac.jp> <4891F276.8070404@noaa.gov> Message-ID: <20080731183347.GM24491@phare.normalesup.org> On Thu, Jul 31, 2008 at 10:12:22AM -0700, Christopher Barker wrote: > I've been wondering about that. > time python -c "import numpy" > real 0m8.383s > user 0m0.320s > sys 0m7.805s I don't know what is wrong, but this is plain wrong, unless you are on a distant file system, or something unusual. On the box I am currently on, I get: python -c "import numpy" 0.10s user 0.03s system 101% cpu 0.122 total And this matches my overall experience. Gaël From pgmdevlist at gmail.com Thu Jul 31 14:36:28 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 31 Jul 2008 14:36:28 -0400 Subject: [Numpy-discussion] Numpy 1.1.1 release notes. In-Reply-To: References: Message-ID: <200807311436.28970.pgmdevlist@gmail.com> Chuck, Can you remove the entry "Pierre GM -- masked array, improved support for flexible dtypes." from "General Improvements"? The work was done for 1.2 and not completely backported, so that's not really a lot of improvements. It will be for 1.2, however: when is that one supposed to be released? From sransom at nrao.edu Thu Jul 31 15:30:44 2008 From: sransom at nrao.edu (Scott Ransom) Date: Thu, 31 Jul 2008 15:30:44 -0400 Subject: [Numpy-discussion] "import numpy" is slow In-Reply-To: <789d27b10807310531s110f10d1n564c5ac8031e454c@mail.gmail.com> References: <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com> <9457e7c80807310520v19d270bby5601d507e6dfa89@mail.gmail.com> <789d27b10807310531s110f10d1n564c5ac8031e454c@mail.gmail.com> Message-ID: <20080731193044.GC20380@ssh.cv.nrao.edu> On Thu, Jul 31, 2008 at 07:46:20AM -0500, Nathan Bell wrote: > On Thu, Jul 31, 2008 at 7:31 AM, Hanni Ali wrote: > > > > I would just like to highlight an alternate use of numpy to interactive use. We > > have a cluster of machines which process tasks on an individual basis where > > a master task may spawn 600 slave tasks to be processed. These tasks are > > spread across the cluster and processed as scripts in an individual Python > > thread.
> > Although reducing the process time by 300 seconds for the master
> > task is only about a 1.5% speedup (total time can be in excess of
> > 24000s), we process a large number of these tasks in any given year
> > and every little helps!
>
> There are other components of NumPy/SciPy that are more worthy of
> optimization. Given that programmer time is a scarce resource, it's
> more sensible to direct our efforts towards making the other 98.5% of
> the computation faster.

This is true in general, but I have a different use case for one of my programs that uses numpy on a cluster. Basically, the program gets called thousands of times per day and the runtime for each is only a second or two. In this case I am much more dominated by numpy's import time.

Scott

PS: Yes, I could change the way that the routine works so that it is called many fewer times; however, that would be very difficult (although not impossible). A "free" speedup due to a faster numpy import would be very nice.

-- 
Scott M. Ransom            Address:  NRAO
Phone:  (434) 296-0320               520 Edgemont Rd.
email:  sransom at nrao.edu           Charlottesville, VA 22903 USA
GPG Fingerprint: 06A9 9553 78BE 16DB 407B  FFCA 9BFA B6FF FFD3 2989

From robert.kern at gmail.com  Thu Jul 31 16:02:54 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 31 Jul 2008 15:02:54 -0500
Subject: [Numpy-discussion] "import numpy" is slow
In-Reply-To: <0BD87DFD-6B55-44E3-90EA-C1F83301F091@dalkescientific.com>
References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com> <3d375d730807310303q54ef94f7m3ba74b3f47f6e5ea@mail.gmail.com> <0BD87DFD-6B55-44E3-90EA-C1F83301F091@dalkescientific.com>
Message-ID: <3d375d730807311302p46138c29k1cae6a5aa8899360@mail.gmail.com>

On Thu, Jul 31, 2008 at 05:43, Andrew Dalke wrote:
> On Jul 31, 2008, at 12:03 PM, Robert Kern wrote:
>> But you still can't remove them since they are being used inside
>> numerictypes. That's why I labeled them "internal utility functions"
>> instead of leaving them with minimal docstrings such that you would
>> have to guess.
>
> My proposal is to replace that code with a table mapping
> the type name to the uppercase/lowercase/capitalized forms,
> thus eliminating the (small) amount of time needed to
> import string.
>
> It makes adding new types slightly more difficult.
>
> I know it's a tradeoff.

Probably not a bad one. Write up the patch, and then we'll see how much it affects the import time.

I would much rather that we discuss concrete changes like this rather than rehash the justifications of old decisions. Regardless of the merits of the old decisions (and I agreed with your position at the time), it's a pointless and irrelevant conversation. The decisions were made, and now we have a user base to whom we have promised not to break their code so egregiously again. The relevant conversation is what changes we can make now. Some general guidelines:

1) Everything exposed by "from numpy import *" still needs to work.
   a) The layout of everything under numpy.core is an implementation detail.
   b) _underscored functions and explicitly labeled internal functions can probably be modified.
   c) Ask about specific functions when in doubt.

2) The improvement in import times should be substantial. Feel free to bundle up the optimizations for consideration.

3) Moving imports from module-level down into the functions where they are used is generally okay if we get a reasonable win from it. The local imports should be commented, explaining that they are made local in order to improve the import times.

4) __import__ hacks are off the table.

5) Proxy objects ... I would really like to avoid proxy objects. They have caused fragility in the past.

6) I'm not a fan of having environment variables control the way numpy gets imported, but I'm willing to consider it. For example, I might go for having proxy objects for linalg et al. *only* if a particular environment variable were set. But there had better be a very large improvement in import times.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
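A minimal sketch of the function-local import pattern described in guideline (3); the module and function here are illustrative stand-ins, not numpy's actual code:

# Defer a costly module-level import into the one function that
# needs it. 'inspect' stands in for any module that is expensive
# to import and only rarely used.

def describe_caller():
    # Local import to improve startup time: the cost of importing
    # inspect is only paid when this function is actually called.
    import inspect
    filename, lineno = inspect.stack()[1][1:3]
    return "%s:%d" % (filename, lineno)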
From robert.kern at gmail.com  Thu Jul 31 16:19:11 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 31 Jul 2008 15:19:11 -0500
Subject: [Numpy-discussion] distutils and inplace build: is numpy supposed to work ?
In-Reply-To: <4891C9C0.3050202@ar.media.kyoto-u.ac.jp>
References: <4891C9C0.3050202@ar.media.kyoto-u.ac.jp>
Message-ID: <3d375d730807311319l7e81ed9dvb87715138ab47769@mail.gmail.com>

On Thu, Jul 31, 2008 at 09:18, David Cournapeau wrote:
> Hi,
>
> I wanted to know if numpy was supposed to work when built in place
> through the -i option of distutils. The reason I am asking is that I
> would like to support it in numscons, and I cannot make it work when
> using distutils. Importing numpy works in the source tree, but most
> tests fail because of some missing imports; I have lots of those:
>
> ======================================================================
> ERROR: Check that matrix type is preserved.
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/usr/media/src/dsp/numpy/trunk/numpy/linalg/tests/test_linalg.py", line 69, in test_matrix_a_and_b
>     self.do(a, b)
>   File "/usr/media/src/dsp/numpy/trunk/numpy/linalg/tests/test_linalg.py", line 99, in do
>     assert_almost_equal(a, dot(multiply(u, s), vt))
>   File "/usr/media/src/dsp/numpy/trunk/numpy/linalg/tests/test_linalg.py", line 22, in assert_almost_equal
>     old_assert_almost_equal(a, b, decimal=decimal, **kw)
>   File "numpy/testing/utils.py", line 171, in assert_almost_equal
>     from numpy.core import ndarray
>   File "core/__init__.py", line 27, in <module>
>     __all__ += numeric.__all__
> NameError: name 'numeric' is not defined
>
> Is this expected, or am I doing something wrong?

I have been running numpy built inplace for a long time now. As far as I can tell, this only shows up when running numpy.test() while in the numpy trunk checkout directory. I think it's an interaction with the way nose traverses packages to locate tests.

numpy/core/__init__.py is a bit odd; it does "from numeric import *" and expects "numeric" to then be in the namespace. I think this only happens when the import machinery knows that it's in a package. nose uses __import__() when scouting around the package, so it misses this.
For example,

[~]$ ls foo
__init__.py  bar.py
[~]$ cat foo/__init__.py
from bar import x
print dir()
print bar.x
[~]$ cat foo/bar.py
x = 1
[~]$ python -c "import foo"
['__builtins__', '__doc__', '__file__', '__name__', '__path__', 'bar', 'x']
1
[~]$ python -c "__import__('foo.__init__')"
['__builtins__', '__doc__', '__file__', '__name__', '__path__', 'bar', 'x']
1
['__builtins__', '__doc__', '__file__', '__name__', 'x']
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "foo/__init__.py", line 3, in <module>
    print bar.x
NameError: name 'bar' is not defined

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From cournape at gmail.com  Thu Jul 31 16:37:41 2008
From: cournape at gmail.com (David Cournapeau)
Date: Fri, 1 Aug 2008 05:37:41 +0900
Subject: [Numpy-discussion] "import numpy" is slow
In-Reply-To: <3d375d730807311302p46138c29k1cae6a5aa8899360@mail.gmail.com>
References: <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com> <3d375d730807310303q54ef94f7m3ba74b3f47f6e5ea@mail.gmail.com> <0BD87DFD-6B55-44E3-90EA-C1F83301F091@dalkescientific.com> <3d375d730807311302p46138c29k1cae6a5aa8899360@mail.gmail.com>
Message-ID: <5b8d13220807311337i300f2072u8264decf534bb5ec@mail.gmail.com>

On Fri, Aug 1, 2008 at 5:02 AM, Robert Kern wrote:
>
> 5) Proxy objects ... I would really like to avoid proxy objects. They
> have caused fragility in the past.

One recurrent problem with import-time optimization is that it is some work to improve it, but it takes one line to destroy it all. For example, the inspect import came back, and this alone is ~10-15% of my import time on Mac OS X with a recent SVN checkout (from ~180 to ~160). This would be the main advantage of lazy imports; but is it really worth the trouble, since it brings some complexity, as you mentioned last time we had this discussion? Maybe a simple test script to check for known costly imports would be enough (run from time to time?). Maybe ctypes can be loaded "on the fly", too. Those are the two obvious hotspots (~25% altogether).

> 6) I'm not a fan of having environment variables control the way numpy
> gets imported, but I'm willing to consider it. For example, I might go
> for having proxy objects for linalg et al. *only* if a particular
> environment variable were set. But there had better be a very large
> improvement in import times.

linalg does not seem to have a huge impact. It is typically much faster to load than ctypeslib or inspect.

cheers,

David
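A minimal sketch of the kind of check script David suggests; the blacklist below is only the hotspots named in this thread and would need to be kept up to date:

# Fail loudly if "import numpy" drags in known-costly modules.
import sys

COSTLY = ['inspect', 'ctypes', 'tempfile']

import numpy

culprits = [name for name in COSTLY if name in sys.modules]
if culprits:
    print "import numpy pulled in costly modules: %s" % ', '.join(culprits)
    sys.exit(1)
print "OK: no known costly imports"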
From dalke at dalkescientific.com  Thu Jul 31 16:44:38 2008
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Thu, 31 Jul 2008 22:44:38 +0200
Subject: [Numpy-discussion] patches to reduce import costs
Message-ID: <1F36976E-288A-4DC5-ADE5-1105B8D9B546@dalkescientific.com>

I don't see a place to submit patches. Is there a patch manager for numpy?

Here's a patch to defer importing 'tempfile' until needed. I previously mentioned one other place that didn't need tempfile. With this there is no 'import tempfile' during 'import numpy'.

This improves startup by about 7%.

--- numpy/lib/_datasource.py    (revision 5576)
+++ numpy/lib/_datasource.py    (working copy)
@@ -35,7 +35,6 @@
 __docformat__ = "restructuredtext en"
 
 import os
-import tempfile
 from shutil import rmtree
 from urlparse import urlparse
 
@@ -131,6 +130,7 @@
             self._destpath = os.path.abspath(destpath)
             self._istmpdest = False
         else:
+            import tempfile
             self._destpath = tempfile.mkdtemp()
             self._istmpdest = True

Andrew
dalke at dalkescientific.com

From charlesr.harris at gmail.com  Thu Jul 31 16:46:30 2008
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 31 Jul 2008 14:46:30 -0600
Subject: [Numpy-discussion] Numpy 1.1.1 release notes.
In-Reply-To: <200807311436.28970.pgmdevlist@gmail.com>
References: <200807311436.28970.pgmdevlist@gmail.com>
Message-ID: 

On Thu, Jul 31, 2008 at 12:36 PM, Pierre GM wrote:

> Chuck,
> Can you remove the entry
> "Pierre GM -- masked array, improved support for flexible dtypes."
> from "General Improvements" ?
> The work was done for 1.2 and not completely backported, so that's not really
> a lot of improvements. It will be for 1.2, however: when is this one supposed
> to be released?

OK...Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com  Thu Jul 31 16:48:54 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 31 Jul 2008 15:48:54 -0500
Subject: [Numpy-discussion] patches to reduce import costs
In-Reply-To: <1F36976E-288A-4DC5-ADE5-1105B8D9B546@dalkescientific.com>
References: <1F36976E-288A-4DC5-ADE5-1105B8D9B546@dalkescientific.com>
Message-ID: <3d375d730807311348n785641c6yc55b5bca4df67c08@mail.gmail.com>

On Thu, Jul 31, 2008 at 15:44, Andrew Dalke wrote:
> I don't see a place to submit patches. Is there a patch manager for
> numpy?

http://projects.scipy.org/scipy/numpy

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From cournape at gmail.com  Thu Jul 31 16:49:02 2008
From: cournape at gmail.com (David Cournapeau)
Date: Fri, 1 Aug 2008 05:49:02 +0900
Subject: [Numpy-discussion] patches to reduce import costs
In-Reply-To: <1F36976E-288A-4DC5-ADE5-1105B8D9B546@dalkescientific.com>
References: <1F36976E-288A-4DC5-ADE5-1105B8D9B546@dalkescientific.com>
Message-ID: <5b8d13220807311349o18421775y753f616b11090280@mail.gmail.com>

On Fri, Aug 1, 2008 at 5:44 AM, Andrew Dalke wrote:
> I don't see a place to submit patches. Is there a patch manager for
> numpy?

Attaching a patch to numpy trac is the way to go:

http://projects.scipy.org/scipy/numpy

thanks for the patch,

David

From millman at berkeley.edu  Thu Jul 31 17:08:47 2008
From: millman at berkeley.edu (Jarrod Millman)
Date: Thu, 31 Jul 2008 14:08:47 -0700
Subject: [Numpy-discussion] numpy 1.1.rc2: win32 binaries
In-Reply-To: <488D5984.202@ar.media.kyoto-u.ac.jp>
References: <488D58A3.6070800@ar.media.kyoto-u.ac.jp> <488D5984.202@ar.media.kyoto-u.ac.jp>
Message-ID: 

On Sun, Jul 27, 2008 at 10:30 PM, David Cournapeau wrote:
> http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy-1.1.1.dev5559-win32-superpack-python2.5.exe

I want to get the final 1.1.1 release out ASAP, but I need some feedback on the windows binaries. Could someone please try them out and let us know if you run into any problems.
In particular, could someone verify that this has been fixed:
http://projects.scipy.org/scipy/numpy/ticket/844

Thanks,

-- 
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/

From dalke at dalkescientific.com  Thu Jul 31 17:09:25 2008
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Thu, 31 Jul 2008 23:09:25 +0200
Subject: [Numpy-discussion] "import numpy" is slow
In-Reply-To: <3d375d730807311302p46138c29k1cae6a5aa8899360@mail.gmail.com>
References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com> <3d375d730807310303q54ef94f7m3ba74b3f47f6e5ea@mail.gmail.com> <0BD87DFD-6B55-44E3-90EA-C1F83301F091@dalkescientific.com> <3d375d730807311302p46138c29k1cae6a5aa8899360@mail.gmail.com>
Message-ID: <09EF1D94-B252-45F0-9E73-EBDC57E3993E@dalkescientific.com>

On Jul 31, 2008, at 10:02 PM, Robert Kern wrote:
> 1) Everything exposed by "from numpy import *" still needs to work.

Does that include numpy.Tester? I don't mean numpy.test() nor numpy.bench().

Does that include numpy.PackageLoader? I don't mean numpy.pkgload.

> 2) The improvement in import times should be substantial. Feel free to
> bundle up the optimizations for consideration.

Okay, I wasn't sure if a bundle or small independent patches were best. I tried using Trac to submit a small patch. It's a big hassle to do for a two-line patch.

> 5) Proxy objects ... I would really like to avoid proxy objects. They
> have caused fragility in the past.

Understood and agreed.

Andrew
dalke at dalkescientific.com

From bioinformed at gmail.com  Thu Jul 31 17:14:18 2008
From: bioinformed at gmail.com (Kevin Jacobs)
Date: Thu, 31 Jul 2008 17:14:18 -0400
Subject: [Numpy-discussion] patches to reduce import costs
In-Reply-To: <1F36976E-288A-4DC5-ADE5-1105B8D9B546@dalkescientific.com>
References: <1F36976E-288A-4DC5-ADE5-1105B8D9B546@dalkescientific.com>
Message-ID: <2e1434c10807311414i1b4c2c1dh49c8a4587d6cde51@mail.gmail.com>

For the benefit of the freeze packages, it would be great if you could add a file that lists all of the deferred imports that you run across. That way, we can add/update recipes more easily for py2app, py2exe, bbfreeze, etc.

Thanks,
-Kevin

On Thu, Jul 31, 2008 at 4:44 PM, Andrew Dalke wrote:

> I don't see a place to submit patches. Is there a patch manager for
> numpy?
>
> Here's a patch to defer importing 'tempfile' until needed. I
> previously mentioned one other place that didn't need tempfile.
> With this there is no 'import tempfile' during 'import numpy'.
>
> This improves startup by about 7%.
>
> --- numpy/lib/_datasource.py    (revision 5576)
> +++ numpy/lib/_datasource.py    (working copy)
> @@ -35,7 +35,6 @@
>  __docformat__ = "restructuredtext en"
>
>  import os
> -import tempfile
>  from shutil import rmtree
>  from urlparse import urlparse
>
> @@ -131,6 +130,7 @@
>              self._destpath = os.path.abspath(destpath)
>              self._istmpdest = False
>          else:
> +            import tempfile
>              self._destpath = tempfile.mkdtemp()
>              self._istmpdest = True
>
> Andrew
> dalke at dalkescientific.com
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com  Thu Jul 31 17:37:13 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 31 Jul 2008 16:37:13 -0500
Subject: [Numpy-discussion] "import numpy" is slow
In-Reply-To: <09EF1D94-B252-45F0-9E73-EBDC57E3993E@dalkescientific.com>
References: <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <9457e7c80807310242q75d4a943wbdd1d4e2c264b613@mail.gmail.com> <3d375d730807310303q54ef94f7m3ba74b3f47f6e5ea@mail.gmail.com> <0BD87DFD-6B55-44E3-90EA-C1F83301F091@dalkescientific.com> <3d375d730807311302p46138c29k1cae6a5aa8899360@mail.gmail.com> <09EF1D94-B252-45F0-9E73-EBDC57E3993E@dalkescientific.com>
Message-ID: <3d375d730807311437j5465c529n9088c3c4b957b26d@mail.gmail.com>

On Thu, Jul 31, 2008 at 16:09, Andrew Dalke wrote:
> On Jul 31, 2008, at 10:02 PM, Robert Kern wrote:
>> 1) Everything exposed by "from numpy import *" still needs to work.
>
> Does that include numpy.Tester? I don't mean numpy.test() nor
> numpy.bench().
>
> Does that include numpy.PackageLoader? I don't mean numpy.pkgload.

Probably not. I would consider those to be implementation details that got left in rather than a deliberate API exposure.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From dpeterson at enthought.com  Thu Jul 31 18:48:02 2008
From: dpeterson at enthought.com (Dave Peterson)
Date: Thu, 31 Jul 2008 17:48:02 -0500
Subject: [Numpy-discussion] [ANNOUNCE] ETS 3.0.0b1 released!
Message-ID: <48924122.3090501@enthought.com>

Hello,

I am very pleased to announce that ETS 3.0.0b1 has just been tagged and released!

All ETS sub-projects, including Traits, Chaco, Mayavi, and Envisage, have been tagged and released in one form or another (alpha, beta, final) as part of this effort. All ETS projects have now been registered with PyPI and source tarballs have been uploaded! As a result, for the first time ever, you can now easy_install all of ETS via a simple command like:

easy_install ETS

Or, alternatively, you can get only specific projects with commands like:

easy_install Mayavi

or

easy_install Chaco

We anticipate the final release of ETS 3.0, and all of its sub-projects at their current version number, in a matter of weeks. Please join us in helping make this final release the most robust release it can be by participating in this beta shakeout period.
Discussions about the status of ETS and its sub-projects happen on the enthought-dev mailing list, whose home page is at:
https://mail.enthought.com/mailman/listinfo/enthought-dev

Thank you to all who have worked so hard on making this release possible!

-- Dave

Enthought Tool Suite
---------------------------
The Enthought Tool Suite (ETS) is a collection of components developed by Enthought and open source participants, which we use every day to construct custom scientific applications. It includes a wide variety of components, including:

* an extensible application framework
* application building blocks
* 2-D and 3-D graphics libraries
* scientific and math libraries
* developer tools

The cornerstone on which these tools rest is the Traits package, which provides explicit type declarations in Python; its features include initialization, validation, delegation, notification, and visualization of typed attributes. More information is available for all the packages within ETS from the Enthought Tool Suite development home page at http://code.enthought.com/projects/tool-suite.php.

Testimonials
----------------
"I set out to rebuild an application in one week that had been developed over the last seven years (in C by generations of post-docs). Pyface and Traits were my cornerstones and I knew nothing about Pyface or Wx. It has been a hectic week. But here ... sits in front of me a nice application that does most of what it should. I think this has been a huge success. ... Thanks to the tools Enthought built, and thanks to the friendly support from people on the [enthought-dev] list, I have been able to build what I think is the best application so far. I have built similar applications (controlling cameras for imaging Bose-Einstein condensates) in C+MFC, Matlab, and C+LabWindows; each time it has taken me at least four times longer to get to a result I regard as inferior. So I just wanted to say a big "thank you". Thank you to Enthought for providing this great software open-source. Thank you to everybody on the list for your replies."
-- Gaël Varoquaux, Laboratoire Charles Fabry, Institut d'Optique, Palaiseau, France

"I'm currently writing a realtime data acquisition/display application -- I'm using the Enthought Tool Suite and Traits, and Chaco for display. IMHO, I think that in five years ETS/Traits will be the most commonly used framework for scientific applications."
-- Gary Pajer, Department of Chemistry, Biochemistry and Physics, Rider University, Lawrenceville NJ

From cburns at berkeley.edu  Thu Jul 31 19:08:43 2008
From: cburns at berkeley.edu (Christopher Burns)
Date: Thu, 31 Jul 2008 16:08:43 -0700
Subject: [Numpy-discussion] patches to reduce import costs
In-Reply-To: <2e1434c10807311414i1b4c2c1dh49c8a4587d6cde51@mail.gmail.com>
References: <1F36976E-288A-4DC5-ADE5-1105B8D9B546@dalkescientific.com> <2e1434c10807311414i1b4c2c1dh49c8a4587d6cde51@mail.gmail.com>
Message-ID: <764e38540807311608q2675f87bg3cd352825e2125d@mail.gmail.com>

Kevin,

Do you mean add a file on the Wiki or in the source tree somewhere?

Chris

On Thu, Jul 31, 2008 at 2:14 PM, Kevin Jacobs wrote:
> For the benefit of the freeze packages, it would be great if you could add a file
> that lists all of the deferred imports that you run across. That way, we
> can add/update recipes more easily for py2app, py2exe, bbfreeze, etc.
> Thanks,
> -Kevin

From stefan at sun.ac.za  Thu Jul 31 19:14:22 2008
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Fri, 1 Aug 2008 01:14:22 +0200
Subject: [Numpy-discussion] patches to reduce import costs
In-Reply-To: <1F36976E-288A-4DC5-ADE5-1105B8D9B546@dalkescientific.com>
References: <1F36976E-288A-4DC5-ADE5-1105B8D9B546@dalkescientific.com>
Message-ID: <9457e7c80807311614r7eee33fal2f7cee105c7d693f@mail.gmail.com>

2008/7/31 Andrew Dalke:
> I don't see a place to submit patches. Is there a patch manager for
> numpy?
>
> Here's a patch to defer importing 'tempfile' until needed. I
> previously mentioned one other place that didn't need tempfile. With
> this there is no 'import tempfile' during 'import numpy'
>
> This improves startup by about 7%

Thanks for the patch. Applied in r5588.

Cheers
Stéfan

From bioinformed at gmail.com  Thu Jul 31 19:14:44 2008
From: bioinformed at gmail.com (Kevin Jacobs)
Date: Thu, 31 Jul 2008 19:14:44 -0400
Subject: [Numpy-discussion] patches to reduce import costs
In-Reply-To: <764e38540807311608q2675f87bg3cd352825e2125d@mail.gmail.com>
References: <1F36976E-288A-4DC5-ADE5-1105B8D9B546@dalkescientific.com> <2e1434c10807311414i1b4c2c1dh49c8a4587d6cde51@mail.gmail.com> <764e38540807311608q2675f87bg3cd352825e2125d@mail.gmail.com>
Message-ID: <2e1434c10807311614o2771d111hd7a57ef10728c9ff@mail.gmail.com>

On Thu, Jul 31, 2008 at 7:08 PM, Christopher Burns wrote:
> Do you mean add a file on the Wiki or in the source tree somewhere?

Either or both -- so long as there is a convenient place to find them. I suppose a Wiki page would be most flexible, since it could be expanded to discuss deeper packaging issues.

Thanks,
-Kevin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
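A sketch of what such a recipe might look like on the py2exe side; 'myapp.py' is a placeholder, and the includes list would come from the file Kevin proposes (tempfile is the one deferred import known from this thread):

# Hypothetical py2exe setup: numpy now defers 'import tempfile', so
# a frozen app must force-include it, since py2exe's static analysis
# of module-level imports cannot see function-local imports.
from distutils.core import setup
import py2exe  # registers the py2exe distutils command

setup(
    console=['myapp.py'],  # placeholder application script
    options={'py2exe': {
        'includes': ['tempfile'],  # deferred imports, per the proposed list
    }},
)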
From gael.varoquaux at normalesup.org  Thu Jul 31 20:38:43 2008
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Fri, 1 Aug 2008 02:38:43 +0200
Subject: [Numpy-discussion] [ANNOUNCE] ETS 3.0.0b1 released!
In-Reply-To: <48924122.3090501@enthought.com>
References: <48924122.3090501@enthought.com>
Message-ID: <20080801003843.GA21892@phare.normalesup.org>

On Thu, Jul 31, 2008 at 05:48:02PM -0500, Dave Peterson wrote:
> easy_install Mayavi

Just a clarification (I just got caught by this): to install mayavi2, the application, and not just the 3D visualization library that can be used e.g. in ipython, you need to do:

easy_install "Mayavi[ui]"

i.e. the Mayavi project, with the optional ui component. In addition, you need to choose a toolkit. You can do this with the following command:

easy_install "Traits[ui,wx]"

Finally, under Linux, the above commands will by default install the packages to your /usr/lib/python2.x/site-packages. If, like me, you believe this is owned by the package manager and you would like to install to /usr/local/lib/python2.x/site-packages instead, just use the following switch:

easy_install --prefix=/usr/local "Mayavi[ui]"

Gaël

From robert.kern at gmail.com  Thu Jul 31 22:23:56 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 31 Jul 2008 21:23:56 -0500
Subject: [Numpy-discussion] "import numpy" is slow
In-Reply-To: <4891EC2F.9050501@noaa.gov>
References: <92AFDF8C-418F-4B6D-895C-8E7F0FD47D90@dalkescientific.com> <3d375d730807030006v2a58860eodb6022454c8e06d6@mail.gmail.com> <846D821E-BA9F-4043-AE39-D7D648013C30@dalkescientific.com> <6B6B9FA6-864D-4D31-99BB-E9AE7A93A6D3@dalkescientific.com> <9457e7c80807301359k599b9155ge296ae804485c94@mail.gmail.com> <4891EC2F.9050501@noaa.gov>
Message-ID: <3d375d730807311923h4130c1ebiddbca175ed7c0bd6@mail.gmail.com>

On Thu, Jul 31, 2008 at 11:45, Christopher Barker wrote:
> On my OS-X box (10.4.11, python2.5, numpy '1.1.1rc2'), it takes about 7
> seconds to import numpy!

Can you try running a Python process that just imports numpy under Shark.app?

http://developer.apple.com/tools/shark_optimize.html

This will help us see what's eating up the time at the C level, at least.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From bsouthey at gmail.com  Thu Jul 31 22:38:02 2008
From: bsouthey at gmail.com (Bruce Southey)
Date: Thu, 31 Jul 2008 21:38:02 -0500
Subject: [Numpy-discussion] numpy 1.1.rc2: win32 binaries
In-Reply-To: 
References: <488D58A3.6070800@ar.media.kyoto-u.ac.jp> <488D5984.202@ar.media.kyoto-u.ac.jp>
Message-ID: 

Hi,
The installation worked on my old Athlon XP running Windows XP, and 'numpy.test(level=1)' gave no errors.

I did not get an error for the code provided for ticket 844, so I presume this ticket is fixed: 'numpy.inner(F,F)' results in 'array([[ 0.]])'.

Also, the installer gives this information:

Author: Travis E. Oliphant, et.al.
Author_email: oliphant at ee.byu.edu
Description: NumPy: array processing for numbers, strings, records, and objects.
Maintainer: NumPy Developers
Maintainer_email: numpy-discussion at lists.sourceforge.net
Name: numpy
Url: http://numpy.scipy.org
Version: 1.1.1.dev5559

I think that at least the 'Author_email' and 'Maintainer_email' should be updated.

Thanks
Bruce

On Thu, Jul 31, 2008 at 4:08 PM, Jarrod Millman wrote:
> On Sun, Jul 27, 2008 at 10:30 PM, David Cournapeau wrote:
>> http://www.ar.media.kyoto-u.ac.jp/members/david/archives/numpy-1.1.1.dev5559-win32-superpack-python2.5.exe
>
> I want to get the final 1.1.1 release out ASAP, but I need some
> feedback on the windows binaries. Could someone please try them out
> and let us know if you run into any problems.
>
> In particular, could someone verify that this has been fixed:
> http://projects.scipy.org/scipy/numpy/ticket/844
>
> Thanks,
>
> --
> Jarrod Millman
> Computational Infrastructure for Research Labs
> 10 Giannini Hall, UC Berkeley
> phone: 510.643.4014
> http://cirl.berkeley.edu/
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>