From vs at it.uu.se Tue Dec 1 03:17:50 2009
From: vs at it.uu.se (Virgil Stokes)
Date: Tue, 01 Dec 2009 09:17:50 +0100
Subject: [Numpy-discussion] Numpy 1.4.0 rc1 released
In-Reply-To: <5b8d13220911301947m7fc20c83v59d84a035cb6e6ac@mail.gmail.com>
References: <5b8d13220911301947m7fc20c83v59d84a035cb6e6ac@mail.gmail.com>
Message-ID: <4B14D12E.5090300@it.uu.se>

David Cournapeau wrote:
> Hi,
>
> The first release candidate for 1.4.0 has been released. The sources,
> as well as mac and windows installers may be found here:
>
> https://sourceforge.net/projects/numpy/files/
>
> The main improvements compared to 1.3.0 are:
>
> * Faster import time
> * Extended array wrapping mechanism for ufuncs
> * New Neighborhood iterator (C-level only)
> * C99-like complex functions in npymath
>
> As well as more than 50 bug fixes. The detailed list of changes may be
> found on trac:
>
> http://projects.scipy.org/numpy/roadmap
>
> cheers,
>
> David
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
Thanks for your hard work David! :-)

--V. Stokes

From cournape at gmail.com Tue Dec 1 03:31:10 2009
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 1 Dec 2009 17:31:10 +0900
Subject: [Numpy-discussion] Python 3K merge
In-Reply-To:
References:
Message-ID: <5b8d13220912010031w3945441cj3caed84d661f51b5@mail.gmail.com>

On Tue, Dec 1, 2009 at 5:04 AM, Charles R Harris wrote:
> Hi Pauli,
>
> It looks like you're doing great stuff with the py3k transition. Do you and
> David have any sort of merge schedule in mind?

I have updated my py3k branch for numpy.distutils, and it is ready to merge:

http://github.com/cournape/numpy/tree/py3k_bootstrap_take3

I have not thoroughly tested it, but it can run on both 2.4 and 3.1 on Linux at least. The patch is much smaller than my previous attempts as well, so I would just push it to the trunk, and deal with the issues as they come.

cheers,

David

From seb.haase at gmail.com Tue Dec 1 04:00:29 2009
From: seb.haase at gmail.com (Sebastian Haase)
Date: Tue, 1 Dec 2009 10:00:29 +0100
Subject: [Numpy-discussion] Numpy 1.4.0 rc1 released
In-Reply-To: <4B14D12E.5090300@it.uu.se>
References: <5b8d13220911301947m7fc20c83v59d84a035cb6e6ac@mail.gmail.com> <4B14D12E.5090300@it.uu.se>
Message-ID:

On Tue, Dec 1, 2009 at 9:17 AM, Virgil Stokes wrote:
> David Cournapeau wrote:
>> Hi,
>>
>> The first release candidate for 1.4.0 has been released. The sources,
>> as well as mac and windows installers may be found here:
>>
>> https://sourceforge.net/projects/numpy/files/
>>
>> The main improvements compared to 1.3.0 are:
>>
>> * Faster import time
>> * Extended array wrapping mechanism for ufuncs
>> * New Neighborhood iterator (C-level only)
>> * C99-like complex functions in npymath
>>
>> As well as more than 50 bug fixes. The detailed list of changes may be
>> found on trac:
>>
>> http://projects.scipy.org/numpy/roadmap
>>
>> cheers,
>>
>> David
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
> Thanks for your hard work David! :-)
>
> --V. Stokes

I can only agree - great work !
Where can one find out about the
* New Neighborhood iterator (C-level only)
?
I would probably like to use it in some SWIGged code of mine - even though I have never before used numpy-C code...
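A rough mental model of the iterator, sketched in pure Python/NumPy rather than the C API described in the reply below. This is an illustration only: the function name is made up, and zero-padding is just one of the padding modes the real iterator supports.

import numpy as np

def zero_padded_neighborhood(arr, i, j, radius=1):
    # Collect the (2*radius+1) x (2*radius+1) window around arr[i, j],
    # substituting zeros outside the array bounds -- conceptually what
    # the C-level neighborhood iterator yields in its zero-padding mode.
    out = np.zeros((2 * radius + 1, 2 * radius + 1), dtype=arr.dtype)
    for di in range(-radius, radius + 1):
        for dj in range(-radius, radius + 1):
            ii, jj = i + di, j + dj
            if 0 <= ii < arr.shape[0] and 0 <= jj < arr.shape[1]:
                out[di + radius, dj + radius] = arr[ii, jj]
    return out

a = np.arange(9.0).reshape(3, 3)
print(zero_padded_neighborhood(a, 0, 0))  # corner: padded with zeros on two sides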
Thanks, Sebastian From cournape at gmail.com Tue Dec 1 04:46:16 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 1 Dec 2009 18:46:16 +0900 Subject: [Numpy-discussion] Numpy 1.4.0 rc1 released In-Reply-To: References: <5b8d13220911301947m7fc20c83v59d84a035cb6e6ac@mail.gmail.com> <4B14D12E.5090300@it.uu.se> Message-ID: <5b8d13220912010146h6f39a211l62be38e06a9c6385@mail.gmail.com> On Tue, Dec 1, 2009 at 6:00 PM, Sebastian Haase wrote: > > I can only agree - great work ! > Thanks. > Where can one find out about the > * New Neighborhood iterator (C-level only) > ? Here: http://docs.scipy.org/doc/numpy/reference/c-api.array.html#functions You can find some examples in the multiarray_tests.c in numpy/core (which test "stacked iterators"), as well as in scipy.signal (the nd-correlate function uses the neighborhood iterator). Note that optimizations such as used in VTK to separate the zones where boundaries handling is needed from the ones without is not implemented yet. cheers, David From dagss at student.matnat.uio.no Tue Dec 1 05:17:42 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 01 Dec 2009 11:17:42 +0100 Subject: [Numpy-discussion] convert strides/shape/offset into nd index? In-Reply-To: References: <6DE05582-D10B-4632-9C9C-6B07DAF321BE@yale.edu> <7f1eaee30911301904r4be694e6x85eb47319bebea67@mail.gmail.com> Message-ID: <4B14ED46.9090302@student.matnat.uio.no> Anne Archibald wrote: > 2009/11/30 James Bergstra : > >> Your question involves a few concepts: >> >> - an integer vector describing the position of an element >> >> - the logical shape (another int vector) >> >> - the physical strides (another int vector) >> >> Ignoring the case of negative offsets, a physical offset is the inner >> product of the physical strides with the position vector. >> >> In these terms, you are asking how to solve the inner-product equation >> for the position vector. There can be many possible solutions (like, >> if there is a stride of 1, then you can make that dimension account >> for the entire offset. This is often not the solution you want.). >> For valid ndarrays though, there is at most one solution though with >> the property that every position element is less than the shape. >> >> You will also need to take into account that for certain stride >> vectors, there is no way to get certain offsets. Imagine all the >> strides were even, and you needed to get at an odd offset... it would >> be impossible. It would even be impossible if there were a dimension >> with stride 1 but it had shape of 1 too. >> >> I can't think of an algorithm off the top of my head that would do >> this in a quick and elegant way. >> > > Not to be a downer, but this problem is technically NP-complete. The > so-called "knapsack problem" is to find a subset of a collection of > numbers that adds up to the specified number, and it is NP-complete. > Unfortunately, it is exactly what you need to do to find the indices > to a particular memory location in an array of shape (2,2,...,2). > > What that means in practice is that either you have to allow > potentially very slow algorithms (though you know that there will > never be more than 32 different values in the knapsack, which might or > might not be enough to keep things tractable) or use heuristic > algorithms that don't always work. 
There are probably fairly good > heuristics, particularly if the array elements are all at distinct > memory locations (arrays with overlapping elements can arise from > broadcasting and other slightly more arcane operations). > Not that this should be done, but getting a chance to discuss NP is always fun: I think this particular problem can be solved in O(d*n^2) or better, where n is the offset in question and d the number of dimensions of the array, by using dynamic programming on the buffer offset in question (so first try for offset 1, then 2, and so on up to n). Which doesn't contradict the fact that the problem is exponential (n is exponential in terms of the length of the input to the problem), but it is still not *too* bad in many cases, because the exponential term is always smaller than the size of the array. Dag Sverre From dagss at student.matnat.uio.no Tue Dec 1 05:34:39 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 01 Dec 2009 11:34:39 +0100 Subject: [Numpy-discussion] convert strides/shape/offset into nd index? In-Reply-To: <4B14ED46.9090302@student.matnat.uio.no> References: <6DE05582-D10B-4632-9C9C-6B07DAF321BE@yale.edu> <7f1eaee30911301904r4be694e6x85eb47319bebea67@mail.gmail.com> <4B14ED46.9090302@student.matnat.uio.no> Message-ID: <4B14F13F.8010306@student.matnat.uio.no> Dag Sverre Seljebotn wrote: > Anne Archibald wrote: > >> 2009/11/30 James Bergstra : >> >> >>> Your question involves a few concepts: >>> >>> - an integer vector describing the position of an element >>> >>> - the logical shape (another int vector) >>> >>> - the physical strides (another int vector) >>> >>> Ignoring the case of negative offsets, a physical offset is the inner >>> product of the physical strides with the position vector. >>> >>> In these terms, you are asking how to solve the inner-product equation >>> for the position vector. There can be many possible solutions (like, >>> if there is a stride of 1, then you can make that dimension account >>> for the entire offset. This is often not the solution you want.). >>> For valid ndarrays though, there is at most one solution though with >>> the property that every position element is less than the shape. >>> >>> You will also need to take into account that for certain stride >>> vectors, there is no way to get certain offsets. Imagine all the >>> strides were even, and you needed to get at an odd offset... it would >>> be impossible. It would even be impossible if there were a dimension >>> with stride 1 but it had shape of 1 too. >>> >>> I can't think of an algorithm off the top of my head that would do >>> this in a quick and elegant way. >>> >>> >> Not to be a downer, but this problem is technically NP-complete. The >> so-called "knapsack problem" is to find a subset of a collection of >> numbers that adds up to the specified number, and it is NP-complete. >> Unfortunately, it is exactly what you need to do to find the indices >> to a particular memory location in an array of shape (2,2,...,2). >> >> What that means in practice is that either you have to allow >> potentially very slow algorithms (though you know that there will >> never be more than 32 different values in the knapsack, which might or >> might not be enough to keep things tractable) or use heuristic >> algorithms that don't always work. 
There are probably fairly good >> heuristics, particularly if the array elements are all at distinct >> memory locations (arrays with overlapping elements can arise from >> broadcasting and other slightly more arcane operations). >> >> > Not that this should be done, but getting a chance to discuss NP is > always fun: > > I think this particular problem can be solved in O(d*n^2) or better, > Hmm, I guess that should be O(d*n). http://en.wikipedia.org/wiki/Knapsack_problem has the exact algorithm (though it needs some customization). Dag Sverre > where n is the offset in question and d the number of dimensions of the > array, by using dynamic programming on the buffer offset in question (so > first try for offset 1, then 2, and so on up to n). > > Which doesn't contradict the fact that the problem is exponential (n is > exponential in terms of the length of the input to the problem), but it > is still not *too* bad in many cases, because the exponential term is > always smaller than the size of the array. > > Dag Sverre > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From eg at fft.be Tue Dec 1 10:51:35 2009 From: eg at fft.be (Eloi Gaudry) Date: Tue, 01 Dec 2009 16:51:35 +0100 Subject: [Numpy-discussion] Is anyone knowledgeable about dll deployment on windows ? In-Reply-To: <5b8d13220911300408t4f5c9c9dm59eb4391c26c0149@mail.gmail.com> References: <4B0FDBAC.3030108@fft.be> <5b8d13220911270921l78b74d45s93a25f692a7c1c22@mail.gmail.com> <4B126603.7040203@fft.be> <4B12CAA3.50804@uci.edu> <5b8d13220911291612o7b57dff3xca4b04cdd5277fa2@mail.gmail.com> <4B138D6F.60907@fft.be> <4B138C65.5000401@ar.media.kyoto-u.ac.jp> <4B13B1FD.6080404@fft.be> <5b8d13220911300408t4f5c9c9dm59eb4391c26c0149@mail.gmail.com> Message-ID: <4B153B87.7000309@fft.be> Thanks for these references (that's a pity we currently can't find anything related to runtime libraries versioning on the msdn database). Eloi David Cournapeau wrote: > On Mon, Nov 30, 2009 at 8:52 PM, Eloi Gaudry wrote: > > >> Well, I wasn't aware of Microsoft willing to giving up the whole >> SxS/manifest thing. Is there any MSDN information available? >> > > I have seen this mentioned for the first time on the python-dev ML: > > http://aspn.activestate.com/ASPN/Mail/Message/python-dev/3764855 > > The mention of including the version in the dll file, if true, is > tragically comic. Maybe in 20 years windows will be able to have a > system which exists for more than a decade on conventional unix... The > link given by M.A Lemburg has changed since, though, as the > description is nowhere to be found in the link. I think I have read > that VS 2010 will never install the runtime in the SxS configuration, > but I of course cannot find this information anymore. Maybe it is not > true anymore, VS 2010 has not yet been released. 
> > You can also find useful manifest troubleshooting information there: > > http://blogs.msdn.com/junfeng/archive/2006/04/14/576314.aspx > > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Eloi Gaudry Free Field Technologies Axis Park Louvain-la-Neuve Rue Emile Francqui, 1 B-1435 Mont-Saint Guibert BELGIUM Company Phone: +32 10 487 959 Company Fax: +32 10 454 626 From eg at fft.be Tue Dec 1 10:55:17 2009 From: eg at fft.be (Eloi Gaudry) Date: Tue, 01 Dec 2009 16:55:17 +0100 Subject: [Numpy-discussion] Is anyone knowledgeable about dll deployment on windows ? In-Reply-To: <4B141289.10400@uci.edu> References: <4B0FDBAC.3030108@fft.be> <5b8d13220911270921l78b74d45s93a25f692a7c1c22@mail.gmail.com> <4B126603.7040203@fft.be> <4B12CAA3.50804@uci.edu> <4B138C3D.5030500@fft.be> <4B141289.10400@uci.edu> Message-ID: <4B153C65.9050100@fft.be> I've done so, thanks for pointing the discussion. In the meantime, I've just patched distutils/msvc9compiler.py so that it neither embed nor create a manifest assembly. This way, I'll be sure that the assembly information would be fetched from the main python (or python-based) binaries (i.e. pythonX.dll). That may be a very strong prerequisites in some cases, but never in my very particular case. Eloi Christoph Gohlke wrote: > The most popular/simple way to deal with the VC90.CRT dependency issue > is to have the user install the runtime redistributable on their system. > If you don't want to put that burden on the user, which I understand, > you have to make adjustments to the assembly manifests. This is not > unofficial or unsupported. It is a bug in Python that it embeds the > assemblyIdentity for VC90.CRT in all extensions build with > distutils/msvc9compiler.py. In fact, the *.pyd distributed with Python > 2.6.3+ don't have that problem. Maybe you can raise your concerns about > future compatibility at . > > Christoph > > On 11/30/2009 1:11 AM, Eloi Gaudry wrote: > >> Christoph, thanks for pointing this discussion. That's a perfect match. >> >> If the workaround provided offers a solution to the current >> redistribution issue, I'm wondering if it will still be the case when an >> update to the assembly check function will be activated/implemented >> (within Windows). >> The manifest edition (removing the "assemblyIdentity" tag) doesn't seem >> to be a popular/official/supported way of dealing with the whole runtime >> libraries issue. Don't you think ? >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Eloi Gaudry Free Field Technologies Axis Park Louvain-la-Neuve Rue Emile Francqui, 1 B-1435 Mont-Saint Guibert BELGIUM Company Phone: +32 10 487 959 Company Fax: +32 10 454 626 From zachary.pincus at yale.edu Tue Dec 1 11:06:41 2009 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Tue, 1 Dec 2009 11:06:41 -0500 Subject: [Numpy-discussion] convert strides/shape/offset into nd index? In-Reply-To: References: <6DE05582-D10B-4632-9C9C-6B07DAF321BE@yale.edu> <7f1eaee30911301904r4be694e6x85eb47319bebea67@mail.gmail.com> Message-ID: > Not to be a downer, but this problem is technically NP-complete. The > so-called "knapsack problem" is to find a subset of a collection of > numbers that adds up to the specified number, and it is NP-complete. 
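Dag Sverre's dynamic-programming suggestion upthread can be made concrete with a short sketch. This is illustrative code, not anything from the thread: it assumes non-negative strides, works in units of elements rather than bytes, and simply tables which offsets are reachable, dimension by dimension:

def offset_to_index(offset, shape, strides):
    # Find an index vector idx with sum(idx[k] * strides[k]) == offset
    # and 0 <= idx[k] < shape[k], by dynamic programming over the
    # reachable offsets -- polynomial in the offset value, as noted above.
    reachable = {0: ()}  # partial offset -> index tuple achieving it
    for n, s in zip(shape, strides):
        step = {}
        for off, idx in reachable.items():
            for i in range(n):
                o = off + i * s
                if o <= offset and o not in step:
                    step[o] = idx + (i,)
        reachable = step
    return reachable.get(offset)  # None if the offset is not addressable

print(offset_to_index(7, (2, 3, 4), (12, 4, 1)))  # -> (0, 1, 3)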
> Unfortunately, it is exactly what you need to do to find the indices
> to a particular memory location in an array of shape (2,2,...,2).

Ha ha, right -- that is the knapsack problem isn't it. Oh well... I'll just require fortran- or C-style strided arrays, for which case it is easy to unravel offsets into indices.

Thanks everyone!

Zach

From mathew.c.yeates at jpl.nasa.gov Tue Dec 1 14:32:03 2009
From: mathew.c.yeates at jpl.nasa.gov (Yeates, Mathew C (388D))
Date: Tue, 1 Dec 2009 11:32:03 -0800
Subject: [Numpy-discussion] a simple examplr showing numpy and matplotlib failing
Message-ID: <4D311220B29AEF40BCF6A04066590EE2013BC5A1AE7D@ALTPHYEMBEVSP10.RES.AD.JPL>

Click on "Hello World" twice and get a memory error. Comment out the ax.plot call and get no error.

import numpy
import sys
import gtk
from matplotlib.figure import Figure
from matplotlib.backends.backend_gtkagg import FigureCanvasGTKAgg as FigureCanvas

ax=None
fig=None
canvas=None

def doplot(widget,box1):
    global ax,fig,canvas
    data=numpy.zeros(shape=(3508,125,129))
    plot_data=data[0,0:,0]
    if canvas:
        box1.remove(canvas)
        canvas=None
    if ax:
        ax.cla()
        ax=None
    if fig: fig=None
    fig = Figure(figsize=(5,5), dpi=100)
    ax = fig.add_subplot(111)
    mif=numpy.arange(plot_data.shape[0])
    #if the next line is commented out, all is good
    ax.plot(plot_data,mif)
    canvas = FigureCanvas(fig)
    box1.pack_start(canvas, True, True, 0)
    canvas.show()

def delete_event(widget, event, data=None):
    return False

window = gtk.Window(gtk.WINDOW_TOPLEVEL)
window.connect("destroy", lambda x: gtk.main_quit())
box1 = gtk.HBox(False, 0)
window.add(box1)
button = gtk.Button("Hello World")
box1.pack_start(button, True, True, 0)
#window.add(box1)
button.show()
button.connect("clicked", doplot, box1)
box1.show()
window.set_default_size(500,400)
window.show()
gtk.main()

From mdroe at stsci.edu Tue Dec 1 14:58:45 2009
From: mdroe at stsci.edu (Michael Droettboom)
Date: Tue, 01 Dec 2009 14:58:45 -0500
Subject: [Numpy-discussion] a simple examplr showing numpy and matplotlib failing
In-Reply-To: <4D311220B29AEF40BCF6A04066590EE2013BC5A1AE7D@ALTPHYEMBEVSP10.RES.AD.JPL>
References: <4D311220B29AEF40BCF6A04066590EE2013BC5A1AE7D@ALTPHYEMBEVSP10.RES.AD.JPL>
Message-ID: <4B157575.8060402@stsci.edu>

Hmm... works for me. What platform, with how much physical and virtual RAM?

One thing you may want to try is to completely destroy the figure each time:

if fig:
    fig.clf()
    fig=None

Mike

Yeates, Mathew C (388D) wrote:
>
> Click on "Hello World" twice and get a memory error. Comment out the
> ax.plot call and get no error.
> > import numpy > > import sys > > import gtk > > from matplotlib.figure import Figure > > from matplotlib.backends.backend_gtkagg import FigureCanvasGTKAgg as > FigureCanvas > > ax=None > > fig=None > > canvas=None > > def doplot(widget,box1): > > global ax,fig,canvas > > data=numpy.zeros(shape=(3508,125,129)) > > plot_data=data[0,0:,0] > > if canvas: > > box1.remove(canvas) > > canvas=None > > if ax: > > ax.cla() > > ax=None > > if fig: fig=None > > fig = Figure(figsize=(5,5), dpi=100) > > ax = fig.add_subplot(111) > > mif=numpy.arange(plot_data.shape[0]) > > #if the next line is commented out, all is good > > ax.plot(plot_data,mif) > > canvas = FigureCanvas(fig) > > box1.pack_start(canvas, True, True, 0) > > canvas.show() > > def delete_event(widget, event, data=None): > > return False > > window = gtk.Window(gtk.WINDOW_TOPLEVEL) > > window.connect("destroy", lambda x: gtk.main_quit()) > > box1 = gtk.HBox(False, 0) > > window.add(box1) > > button = gtk.Button("Hello World") > > box1.pack_start(button, True, True, 0) > > #window.add(box1) > > button.show() > > button.connect("clicked", doplot, box1) > > box1.show() > > window.set_default_size(500,400) > > window.show() > > gtk.main() > > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Michael Droettboom Science Software Branch Operations and Engineering Division Space Telescope Science Institute Operated by AURA for NASA From santanu.chatter at gmail.com Tue Dec 1 15:15:24 2009 From: santanu.chatter at gmail.com (Santanu Chatterjee) Date: Tue, 1 Dec 2009 15:15:24 -0500 Subject: [Numpy-discussion] a simple examplr showing numpy and matplotlib failing In-Reply-To: <4B157575.8060402@stsci.edu> References: <4D311220B29AEF40BCF6A04066590EE2013BC5A1AE7D@ALTPHYEMBEVSP10.RES.AD.JPL> <4B157575.8060402@stsci.edu> Message-ID: Hi Mathew, I saw your email and I was curious about it. I tried your code and it does work for me without any problem. Santanu On Tue, Dec 1, 2009 at 2:58 PM, Michael Droettboom wrote: > Hmm... works for me. What platform, with how much physical and virtual RAM? > > One thing you may want to try is to completely destroy the figure each > time: > > if fig: > fig.clf() > fig=None > > Mike > > Yeates, Mathew C (388D) wrote: > > > > Click on ?Hello World? twice and get a memory error. Comment out the > > ax.plot call and get no error. 
> > > > import numpy > > > > import sys > > > > import gtk > > > > from matplotlib.figure import Figure > > > > from matplotlib.backends.backend_gtkagg import FigureCanvasGTKAgg as > > FigureCanvas > > > > ax=None > > > > fig=None > > > > canvas=None > > > > def doplot(widget,box1): > > > > global ax,fig,canvas > > > > data=numpy.zeros(shape=(3508,125,129)) > > > > plot_data=data[0,0:,0] > > > > if canvas: > > > > box1.remove(canvas) > > > > canvas=None > > > > if ax: > > > > ax.cla() > > > > ax=None > > > > if fig: fig=None > > > > fig = Figure(figsize=(5,5), dpi=100) > > > > ax = fig.add_subplot(111) > > > > mif=numpy.arange(plot_data.shape[0]) > > > > #if the next line is commented out, all is good > > > > ax.plot(plot_data,mif) > > > > canvas = FigureCanvas(fig) > > > > box1.pack_start(canvas, True, True, 0) > > > > canvas.show() > > > > def delete_event(widget, event, data=None): > > > > return False > > > > window = gtk.Window(gtk.WINDOW_TOPLEVEL) > > > > window.connect("destroy", lambda x: gtk.main_quit()) > > > > box1 = gtk.HBox(False, 0) > > > > window.add(box1) > > > > button = gtk.Button("Hello World") > > > > box1.pack_start(button, True, True, 0) > > > > #window.add(box1) > > > > button.show() > > > > button.connect("clicked", doplot, box1) > > > > box1.show() > > > > window.set_default_size(500,400) > > > > window.show() > > > > gtk.main() > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > -- > Michael Droettboom > Science Software Branch > Operations and Engineering Division > Space Telescope Science Institute > Operated by AURA for NASA > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From millman at berkeley.edu Tue Dec 1 19:05:26 2009 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 2 Dec 2009 01:05:26 +0100 Subject: [Numpy-discussion] Numpy 1.4.0 rc1 released In-Reply-To: <5b8d13220911301947m7fc20c83v59d84a035cb6e6ac@mail.gmail.com> References: <5b8d13220911301947m7fc20c83v59d84a035cb6e6ac@mail.gmail.com> Message-ID: On Tue, Dec 1, 2009 at 4:47 AM, David Cournapeau wrote: > The first release candidate for 1.4.0 has been released. Excellent! Thanks for all your effort, Jarrod From dwf at cs.toronto.edu Tue Dec 1 19:48:56 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Tue, 1 Dec 2009 19:48:56 -0500 Subject: [Numpy-discussion] Numpy 1.4.0 rc1 released In-Reply-To: <5b8d13220911301947m7fc20c83v59d84a035cb6e6ac@mail.gmail.com> References: <5b8d13220911301947m7fc20c83v59d84a035cb6e6ac@mail.gmail.com> Message-ID: <2F7A6A59-2439-4553-B1C7-F40FD0CDCCCA@cs.toronto.edu> On 30-Nov-09, at 10:47 PM, David Cournapeau wrote: > Hi, > > The first release candidate for 1.4.0 has been released. 
The sources, > as well as mac and windows installers may be found here: > > https://sourceforge.net/projects/numpy/files/ Hi David, All clear on my Intel Atom and Core i5 boxes, though problem on a PowerPC machine (I assume it's just more 'long double' weirdness that's platform-specific): ====================================================================== FAIL: test_umath.test_nextafterl ---------------------------------------------------------------------- Traceback (most recent call last): File "/Users/dwf/numpyrc/lib/python2.5/site-packages/nose-0.11.1- py2.5.egg/nose/case.py", line 183, in runTest self.test(*self.arg) File "/Users/dwf/numpyrc/lib/python2.5/site-packages/numpy/testing/ decorators.py", line 215, in knownfailer return f(*args, **kwargs) File "/Users/dwf/numpyrc/lib/python2.5/site-packages/numpy/core/ tests/test_umath.py", line 866, in test_nextafterl return _test_nextafter(np.longdouble) File "/Users/dwf/numpyrc/lib/python2.5/site-packages/numpy/core/ tests/test_umath.py", line 852, in _test_nextafter assert np.nextafter(one, two) - one == eps AssertionError ====================================================================== FAIL: test_umath.test_spacingl ---------------------------------------------------------------------- Traceback (most recent call last): File "/Users/dwf/numpyrc/lib/python2.5/site-packages/nose-0.11.1- py2.5.egg/nose/case.py", line 183, in runTest self.test(*self.arg) File "/Users/dwf/numpyrc/lib/python2.5/site-packages/numpy/testing/ decorators.py", line 215, in knownfailer return f(*args, **kwargs) File "/Users/dwf/numpyrc/lib/python2.5/site-packages/numpy/core/ tests/test_umath.py", line 886, in test_spacingl return _test_spacing(np.longdouble) File "/Users/dwf/numpyrc/lib/python2.5/site-packages/numpy/core/ tests/test_umath.py", line 873, in _test_spacing assert np.spacing(one) == eps AssertionError ---------------------------------------------------------------------- Ran 2484 tests in 12.445s FAILED (KNOWNFAIL=4, SKIP=1, failures=2) From mathew.c.yeates at jpl.nasa.gov Tue Dec 1 21:53:25 2009 From: mathew.c.yeates at jpl.nasa.gov (Yeates, Mathew C (388D)) Date: Tue, 1 Dec 2009 18:53:25 -0800 Subject: [Numpy-discussion] a simple examplr showing numpy and matplotlib failing In-Reply-To: References: <4D311220B29AEF40BCF6A04066590EE2013BC5A1AE7D@ALTPHYEMBEVSP10.RES.AD.JPL> <4B157575.8060402@stsci.edu> Message-ID: <4D311220B29AEF40BCF6A04066590EE2013BC59E78E0@ALTPHYEMBEVSP10.RES.AD.JPL> I found a workaround. If I replace > plot_data=data[0,0:,0] With > plot_data=numpy.copy(data[0,0:,0]) Everything is okay. I am on Windows XP 64 with 4 Gigs ram. (Note: the data array is greater than 4 Gigs since my datatype is float64. If I decrease the size so that the array is around 3 Gigs, all is good) Mathew ________________________________ From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Santanu Chatterjee Sent: Tuesday, December 01, 2009 12:15 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] a simple examplr showing numpy and matplotlib failing Hi Mathew, I saw your email and I was curious about it. I tried your code and it does work for me without any problem. Santanu On Tue, Dec 1, 2009 at 2:58 PM, Michael Droettboom > wrote: Hmm... works for me. What platform, with how much physical and virtual RAM? 
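Mathew's workaround above is consistent with how basic slicing works: data[0, 0:, 0] is a view, and a view keeps its whole base array alive, so the full 3-D array cannot be freed while the plot holds on to the 125-element slice; numpy.copy detaches it. A small sketch of the distinction (pure NumPy; nothing here is from the thread):

import numpy as np

data = np.zeros(shape=(3508, 125, 129))  # the large base array
view = data[0, 0:, 0]           # a view: keeps a reference to its base,
                                # so the whole block stays allocated
copy = np.copy(data[0, 0:, 0])  # an independent 125-element array

print(view.base is not None)    # True: the view pins the big array
print(copy.base is None)        # True: dropping data now frees the big block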
One thing you may want to try is to completely destroy the figure each time:

if fig:
    fig.clf()
    fig=None

Mike

Yeates, Mathew C (388D) wrote:
>
> Click on "Hello World" twice and get a memory error. Comment out the
> ax.plot call and get no error.
>
> import numpy
> import sys
> import gtk
> from matplotlib.figure import Figure
> from matplotlib.backends.backend_gtkagg import FigureCanvasGTKAgg as
> FigureCanvas
>
> ax=None
> fig=None
> canvas=None
>
> def doplot(widget,box1):
>     global ax,fig,canvas
>     data=numpy.zeros(shape=(3508,125,129))
>     plot_data=data[0,0:,0]
>     if canvas:
>         box1.remove(canvas)
>         canvas=None
>     if ax:
>         ax.cla()
>         ax=None
>     if fig: fig=None
>     fig = Figure(figsize=(5,5), dpi=100)
>     ax = fig.add_subplot(111)
>     mif=numpy.arange(plot_data.shape[0])
>     #if the next line is commented out, all is good
>     ax.plot(plot_data,mif)
>     canvas = FigureCanvas(fig)
>     box1.pack_start(canvas, True, True, 0)
>     canvas.show()
>
> def delete_event(widget, event, data=None):
>     return False
>
> window = gtk.Window(gtk.WINDOW_TOPLEVEL)
> window.connect("destroy", lambda x: gtk.main_quit())
> box1 = gtk.HBox(False, 0)
> window.add(box1)
> button = gtk.Button("Hello World")
> box1.pack_start(button, True, True, 0)
> #window.add(box1)
> button.show()
> button.connect("clicked", doplot, box1)
> box1.show()
> window.set_default_size(500,400)
> window.show()
> gtk.main()
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

--
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

From bsouthey at gmail.com Tue Dec 1 22:17:30 2009
From: bsouthey at gmail.com (Bruce Southey)
Date: Tue, 1 Dec 2009 21:17:30 -0600
Subject: [Numpy-discussion] Numpy 1.4.0 rc1 released
In-Reply-To: <5b8d13220911301947m7fc20c83v59d84a035cb6e6ac@mail.gmail.com>
References: <5b8d13220911301947m7fc20c83v59d84a035cb6e6ac@mail.gmail.com>
Message-ID:

On Mon, Nov 30, 2009 at 9:47 PM, David Cournapeau wrote:
> Hi,
>
> The first release candidate for 1.4.0 has been released. The sources,
> as well as mac and windows installers may be found here:
>
> https://sourceforge.net/projects/numpy/files/
>

I installed 32-bit Python 2.6.3 and 32-bit numpy on a Win7 Pro 64-bit system. I get the following failure:

Python 2.6.3 (r263rc1:75186, Oct 2 2009, 20:40:30) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.

****************************************************************
Personal firewall software may warn about the connection IDLE
makes to its subprocess using this computer's internal loopback
interface. This connection is not visible on any external
interface and no data is sent to or received from the Internet.
**************************************************************** IDLE 2.6.3 >>> import numpy as np >>> np.__version__ '1.4.0rc1' >>> np.test() Running unit tests for numpy NumPy version 1.4.0rc1 NumPy is installed in E:\Python26\lib\site-packages\numpy Python version 2.6.3 (r263rc1:75186, Oct 2 2009, 20:40:30) [MSC v.1500 32 bit (Intel)] nose version 0.11.1 ........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................K.......................................................K..K.............................K......................K.F..........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................K........K..................................................................................................................................................................................................................................................................................................................................................................................................................................................S.............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................. 
====================================================================== FAIL: test_special_values (test_umath_complex.TestClog) ---------------------------------------------------------------------- Traceback (most recent call last): File "E:\Python26\lib\site-packages\numpy\core\tests\test_umath_complex.py", line 179, in test_special_values assert_almost_equal(np.log(x), y) File "E:\Python26\lib\site-packages\numpy\testing\utils.py", line 437, in assert_almost_equal "DESIRED: %s\n" % (str(actual), str(desired))) AssertionError: Items are not equal: ACTUAL: [ NaN+2.35619449j] DESIRED: (inf+2.35619449019j) Bruce ---------------------------------------------------------------------- Ran 2336 tests in 23.571s FAILED (KNOWNFAIL=7, SKIP=1, failures=1) >>> From cournape at gmail.com Tue Dec 1 22:23:44 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 2 Dec 2009 12:23:44 +0900 Subject: [Numpy-discussion] Numpy 1.4.0 rc1 released In-Reply-To: References: <5b8d13220911301947m7fc20c83v59d84a035cb6e6ac@mail.gmail.com> Message-ID: <5b8d13220912011923r292bf6c8rc1b04249a885735e@mail.gmail.com> On Wed, Dec 2, 2009 at 12:17 PM, Bruce Southey wrote: > Traceback (most recent call last): > ?File "E:\Python26\lib\site-packages\numpy\core\tests\test_umath_complex.py", > line 179, in test_special_values > ? ?assert_almost_equal(np.log(x), y) > ?File "E:\Python26\lib\site-packages\numpy\testing\utils.py", line > 437, in assert_almost_equal > ? ?"DESIRED: %s\n" % (str(actual), str(desired))) > AssertionError: Items are not equal: > ACTUAL: [ NaN+2.35619449j] > DESIRED: (inf+2.35619449019j) That's a known failure on windows (which has not been marked as such, though). Unless you rely on C99 semantics for nan/inf for complex handling, it should not be a big problem, cheers, David From nadavh at visionsense.com Wed Dec 2 03:08:37 2009 From: nadavh at visionsense.com (Nadav Horesh) Date: Wed, 2 Dec 2009 10:08:37 +0200 Subject: [Numpy-discussion] Numpy 1.4.0 rc1 released References: <5b8d13220911301947m7fc20c83v59d84a035cb6e6ac@mail.gmail.com> Message-ID: <710F2847B0018641891D9A21602763605AD247@ex3.envision.co.il> I got the following errors with a clean installation of numpy (previous installations deleted): Running unit tests for numpy NumPy version 1.4.0rc1 NumPy is installed in /usr/lib64/python2.6/site-packages/numpy Python version 2.6.4 (r264:75706, Nov 5 2009, 20:27:15) [GCC 4.3.4] nose version 0.11.1 
..........................................................................................EEEEEEEEEEEEEEEEEEEEEEEEE.EE.............................................................SSSSSSSS................................................................................................................................................................................................................................................................................SSS.......................................................................................................................................................................................................................................K..............................................................F.........................K......................K................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................... ====================================================================== ERROR: test_basic (test_defmatrix.TestAlgebra) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 189, in test_basic mA = matrix(A) NameError: global name 'matrix' is not defined ====================================================================== ERROR: Check that 'not implemented' operations produce a failure. ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 225, in test_notimplemented A = matrix([[1., 2.], NameError: global name 'matrix' is not defined ====================================================================== ERROR: Test raising a matrix to an integer power works as expected. 
---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 212, in test_pow m = matrix("1. 2.; 3. 4.") NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_basic (test_defmatrix.TestCasting) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 167, in test_basic mA = matrix(A) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_basic (test_defmatrix.TestCtor) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 8, in test_basic mA = matrix(A) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_bmat_nondefault_str (test_defmatrix.TestCtor) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 43, in test_bmat_nondefault_str assert all(bmat("A,A;A,A") == Aresult) NameError: global name 'bmat' is not defined ====================================================================== ERROR: test_basic (test_defmatrix.TestIndexing) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 287, in test_basic x = asmatrix(zeros((3,2),float)) NameError: global name 'asmatrix' is not defined ====================================================================== ERROR: test_instance_methods (test_defmatrix.TestMatrixReturn) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 247, in test_instance_methods a = matrix([1.0], dtype='f8') NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_array_from_matrix_list (test_defmatrix.TestNewScalarIndexing) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 296, in setUp self.a = matrix([[1, 2],[3,4]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_array_to_list (test_defmatrix.TestNewScalarIndexing) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 296, in setUp self.a = matrix([[1, 2],[3,4]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_boolean_indexing (test_defmatrix.TestNewScalarIndexing) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 296, in setUp self.a = matrix([[1, 2],[3,4]]) NameError: 
global name 'matrix' is not defined ====================================================================== ERROR: test_dimesions (test_defmatrix.TestNewScalarIndexing) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 296, in setUp self.a = matrix([[1, 2],[3,4]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_fancy_indexing (test_defmatrix.TestNewScalarIndexing) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 296, in setUp self.a = matrix([[1, 2],[3,4]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_list_indexing (test_defmatrix.TestNewScalarIndexing) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 296, in setUp self.a = matrix([[1, 2],[3,4]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_matrix_element (test_defmatrix.TestNewScalarIndexing) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 296, in setUp self.a = matrix([[1, 2],[3,4]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_row_column_indexing (test_defmatrix.TestNewScalarIndexing) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 296, in setUp self.a = matrix([[1, 2],[3,4]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_scalar_indexing (test_defmatrix.TestNewScalarIndexing) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 296, in setUp self.a = matrix([[1, 2],[3,4]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_asmatrix (test_defmatrix.TestProperties) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 151, in test_asmatrix mA = asmatrix(A) NameError: global name 'asmatrix' is not defined ====================================================================== ERROR: test_basic (test_defmatrix.TestProperties) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 106, in test_basic mA = matrix(A) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_comparisons (test_defmatrix.TestProperties) ---------------------------------------------------------------------- Traceback 
(most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 127, in test_comparisons mA = matrix(A) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_max (test_defmatrix.TestProperties) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 78, in test_max x = matrix([[1,2,3],[4,5,6]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_min (test_defmatrix.TestProperties) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 84, in test_min x = matrix([[1,2,3],[4,5,6]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_noaxis (test_defmatrix.TestProperties) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 156, in test_noaxis A = matrix([[1,0],[0,1]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_pinv (test_defmatrix.TestProperties) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 119, in test_pinv x = matrix(arange(6).reshape(2,3)) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_prod (test_defmatrix.TestProperties) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 69, in test_prod x = matrix([[1,2,3],[4,5,6]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_repr (test_defmatrix.TestProperties) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 161, in test_repr A = matrix([[1,0],[0,1]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: Test whether matrix.sum(axis=1) preserves orientation. ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_defmatrix.py", line 56, in test_sum M = matrix([[1,2,0,0], NameError: global name 'matrix' is not defined ====================================================================== FAIL: Test bug in reduceat with structured arrays copied for speed. 
---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/nose-0.11.1-py2.6.egg/nose/case.py", line 183, in runTest self.test(*self.arg) File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_umath.py", line 952, in test_reduceat assert_array_almost_equal(h1, h2) File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line 765, in assert_array_almost_equal header='Arrays are not almost equal') File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line 609, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not almost equal (mismatch 100.0%) x: array([ 9.41165773e+09, 9.41165773e+09, 9.41165773e+09, 9.41165773e+09], dtype=float32) y: array([ 700., 800., 1000., 7500.], dtype=float32) ---------------------------------------------------------------------- Ran 2521 tests in 10.292s I am able to correct the error: NameError: global name 'matrix' is not defined but I wonder why I get it. System: gentoo linux on amd64, python2.6.4 Nadav -----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of Jarrod Millman Sent: Wed 02-Dec-09 02:05 To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Numpy 1.4.0 rc1 released On Tue, Dec 1, 2009 at 4:47 AM, David Cournapeau wrote: > The first release candidate for 1.4.0 has been released. ____________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From cheronetolivia at yahoo.com Wed Dec 2 04:04:31 2009 From: cheronetolivia at yahoo.com (Olivia Cheronet) Date: Wed, 2 Dec 2009 01:04:31 -0800 (PST) Subject: [Numpy-discussion] Import numpy fails on cygwin python Message-ID: <679405.12948.qm@web51002.mail.re2.yahoo.com> Hello! I have built numpy (updated to the trunk) for my cygwin (1.5.25) Python (2.5.2). However, testing fails when I try to import numpy in python (see output below). I have been searching around for a solution, but everything has failed so far... I would be grateful for any advice. Thank you, Olivia $ python Python 2.5.2 (r252:60911, Dec 2 2008, 09:26:14) [GCC 3.4.4 (cygming special, gdc 0.12, using dmd 0.125)] on cygwin Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy Traceback (most recent call last): File "", line 1, in File "/usr/lib/python2.5/site-packages/numpy/__init__.py", line 132, in import add_newdocs File "/usr/lib/python2.5/site-packages/numpy/add_newdocs.py", line 9, in from lib import add_newdoc File "/usr/lib/python2.5/site-packages/numpy/lib/__init__.py", line 13, in from polynomial import * File "/usr/lib/python2.5/site-packages/numpy/lib/polynomial.py", line 17, in < module> from numpy.linalg import eigvals, lstsq File "/usr/lib/python2.5/site-packages/numpy/linalg/__init__.py", line 47, in from linalg import * File "/usr/lib/python2.5/site-packages/numpy/linalg/linalg.py", line 22, in from numpy.linalg import lapack_lite ImportError: No such file or directory From david at ar.media.kyoto-u.ac.jp Wed Dec 2 03:49:13 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 02 Dec 2009 17:49:13 +0900 Subject: [Numpy-discussion] Numpy 1.4.0 rc1 released In-Reply-To: <710F2847B0018641891D9A21602763605AD247@ex3.envision.co.il> References: <5b8d13220911301947m7fc20c83v59d84a035cb6e6ac@mail.gmail.com> <710F2847B0018641891D9A21602763605AD247@ex3.envision.co.il> Message-ID: <4B162A09.9070106@ar.media.kyoto-u.ac.jp> Nadav Horesh wrote: > I got the following errors with a clean installation of numpy (previous installations deleted): > Actually, there are still some leftover: the file numpy/core/test_defmatrix.py does not exist in the tarball. cheers, David From david at ar.media.kyoto-u.ac.jp Wed Dec 2 03:56:01 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 02 Dec 2009 17:56:01 +0900 Subject: [Numpy-discussion] Import numpy fails on cygwin python In-Reply-To: <679405.12948.qm@web51002.mail.re2.yahoo.com> References: <679405.12948.qm@web51002.mail.re2.yahoo.com> Message-ID: <4B162BA1.2040106@ar.media.kyoto-u.ac.jp> Olivia Cheronet wrote: > Traceback (most recent call last): > File "", line 1, in > File "/usr/lib/python2.5/site-packages/numpy/__init__.py", line 132, in e> > import add_newdocs > File "/usr/lib/python2.5/site-packages/numpy/add_newdocs.py", line 9, in le> > from lib import add_newdoc > File "/usr/lib/python2.5/site-packages/numpy/lib/__init__.py", line 13, in dule> > from polynomial import * > File "/usr/lib/python2.5/site-packages/numpy/lib/polynomial.py", line 17, in < > module> > from numpy.linalg import eigvals, lstsq > File "/usr/lib/python2.5/site-packages/numpy/linalg/__init__.py", line 47, in > > from linalg import * > File "/usr/lib/python2.5/site-packages/numpy/linalg/linalg.py", line 22, in odule> > from numpy.linalg import lapack_lite > ImportError: No such file or directory Does the file /usr/lib/python2.5/site-packages/numpy/linalg/lapack_lite.so exist ? If so, what does: file /usr/lib/python2.5/site-packages/numpy/linalg/lapack_lite.so say ? 
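The same check can also be scripted from Python -- a hedged sketch, with the path taken from the traceback above (on cygwin the extension may also have been built as lapack_lite.dll, and a missing dependent BLAS/LAPACK library can raise the same ImportError even when the file exists):

import os

linalg_dir = "/usr/lib/python2.5/site-packages/numpy/linalg"
print(os.listdir(linalg_dir))  # what did the build actually install?
print(os.path.exists(os.path.join(linalg_dir, "lapack_lite.so")))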
cheers,

David

From renesd at gmail.com Wed Dec 2 05:53:47 2009
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Wed, 2 Dec 2009 11:53:47 +0100
Subject: [Numpy-discussion] numpy 1.3.0 eggs with python2.6 seem broken on osx, and linux
In-Reply-To: <5b8d13220911280556s475ed2e5hec24768e20602210@mail.gmail.com>
References: <64ddb72c0911280134m48d5fec5u40540f35c923318a@mail.gmail.com> <5b8d13220911280216v19806cd1wbb8cea4b09c3298f@mail.gmail.com> <64ddb72c0911280542n4521b822madd4b336e3c9b5f5@mail.gmail.com> <5b8d13220911280556s475ed2e5hec24768e20602210@mail.gmail.com>
Message-ID: <64ddb72c0912020253r54f98bd0x401bb45272bd0039@mail.gmail.com>

On Sat, Nov 28, 2009 at 2:56 PM, David Cournapeau wrote:
> On Sat, Nov 28, 2009 at 10:42 PM, René Dudfield wrote:
> >
> > yeah, I completely understand the unfortunate packaging situation (eg, some
> > of my packages do not work with this install method).
> >
> > Here is a simple package requiring numpy. It uses buildout
> > (http://www.buildout.org/). To help easily reproduce the problem, here are
> > the commands to reproduce.
>
> I have not tried it, but from your explanation, I would suspect that
> buildout does some monkeypatching which is not exactly the same as
> setuptools itself, and that would break numpy.distutils.
>
> The distutils architecture unfortunately requires to take into account
> and special case any distutils extensions, because of the "extension
> as command inheritance" approach. I know I prefer spending my time on
> new packaging solution, but I would take patches.
>
> David

Package/module semantics changed during 2.6/3.0. I think it might have something to do with that. Hopefully some of the py3 work will fix these problems too.

Actually, I think it might also be a buildout bug. It seems buildout uses the sys.path variable to add paths to the python interpreter, whereas the 'proper' way is to use the PYTHONPATH environment variable imho, since modifying sys.path does not affect subprocesses' paths. It seems numpy uses a subprocess in the install? Or maybe buildout/python2.6 uses a subprocess. Or, with 2.5 already having a numpy installed (on osx), perhaps it was using the old numpy.distutils already installed to do the install.

Anyway, those are my findings so far. Not sure if I'll look into it more.

cu,

From renesd at gmail.com Wed Dec 2 06:01:54 2009
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Wed, 2 Dec 2009 12:01:54 +0100
Subject: [Numpy-discussion] Numpy 1.4.0 rc1 released
In-Reply-To: <4B162A09.9070106@ar.media.kyoto-u.ac.jp>
References: <5b8d13220911301947m7fc20c83v59d84a035cb6e6ac@mail.gmail.com> <710F2847B0018641891D9A21602763605AD247@ex3.envision.co.il> <4B162A09.9070106@ar.media.kyoto-u.ac.jp>
Message-ID: <64ddb72c0912020301t6ee77383yee2fcc96deb62f65@mail.gmail.com>

On Wed, Dec 2, 2009 at 9:49 AM, David Cournapeau wrote:
> Nadav Horesh wrote:
>> I got the following errors with a clean installation of numpy (previous installations deleted):
>>
>
> Actually, there are still some leftover: the file
> numpy/core/test_defmatrix.py does not exist in the tarball.
>
> cheers,
>
> David
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

hi,

this is caused by distutils not having a mechanism for removing old files. See this bug: http://bugs.python.org/issue5342
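The kind of cleanup hack being discussed can be sketched as a custom install command. This is an illustrative sketch only -- the class, the hand-maintained file list, and the setup() hook are assumptions, not pygame's or numpy's actual code:

import os
from distutils.command.install import install

# Hand-tracked files from older releases that a fresh install should
# delete, e.g. the stale test module Nadav's test run tripped over.
OLD_FILES = [
    "numpy/core/tests/test_defmatrix.py",
    "numpy/core/tests/test_defmatrix.pyc",
]

class install_with_cleanup(install):
    def run(self):
        install.run(self)  # normal distutils install first
        for rel in OLD_FILES:
            path = os.path.join(self.install_lib, rel)
            if os.path.exists(path):
                os.remove(path)

# used via: setup(..., cmdclass={'install': install_with_cleanup})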
See this bug: http://bugs.python.org/issue5342

We put in some hacks into pygame distutils for removing old files (with msi installer, and for setup.py installer). However, I think the long term solution they are thinking of is to allow distutils to uninstall correctly. However, that will only work for new pythons, if it is implemented.

The pygame distutils mod has a list of old files to remove, which are then acted on by the msi code, and the setup.py install code.

I could submit a patch for numpy if you are interested? It would require a list of old files from previous versions of numpy (like numpy/core/test_defmatrix.py).

cheers,

From david at ar.media.kyoto-u.ac.jp Wed Dec 2 05:51:10 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Wed, 02 Dec 2009 19:51:10 +0900
Subject: [Numpy-discussion] Numpy 1.4.0 rc1 released
In-Reply-To: <64ddb72c0912020301t6ee77383yee2fcc96deb62f65@mail.gmail.com>
References: <5b8d13220911301947m7fc20c83v59d84a035cb6e6ac@mail.gmail.com> <710F2847B0018641891D9A21602763605AD247@ex3.envision.co.il> <4B162A09.9070106@ar.media.kyoto-u.ac.jp> <64ddb72c0912020301t6ee77383yee2fcc96deb62f65@mail.gmail.com>
Message-ID: <4B16469E.2010207@ar.media.kyoto-u.ac.jp>

René Dudfield wrote:
> We put in some hacks into pygame distutils for removing old files
> (with msi installer, and for setup.py installer). However, I think the
> long term solution they are thinking of is to allow distutils to
> uninstall correctly.

I think this is just wishful thinking from people who try to improve distutils - it is near impossible to implement this correctly with distutils, because of how install works (i.e. just dumping whatever happens to be in the build directory, hoping that the list of files is correctly handled - which is almost never the case in my experience).

That's one of the major flaws of distutils which I hope to fix with my project toydist, by being very explicit at each step. In particular, the install steps are described from a file generated at build time, and all binary installers and eggs will be based on this generated file.

> I could submit a patch for numpy if you are interested? It would
> require a list of old files from previous versions of numpy (like
> numpy/core/test_defmatrix.py).
>

Do you mean that the files to remove have to be tracked manually ? You can always submit a patch to trac, though,

cheers,

David

From renesd at gmail.com Wed Dec 2 07:01:42 2009
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Wed, 2 Dec 2009 13:01:42 +0100
Subject: [Numpy-discussion] Numpy 1.4.0 rc1 released
In-Reply-To: <4B16469E.2010207@ar.media.kyoto-u.ac.jp>
References: <5b8d13220911301947m7fc20c83v59d84a035cb6e6ac@mail.gmail.com> <710F2847B0018641891D9A21602763605AD247@ex3.envision.co.il> <4B162A09.9070106@ar.media.kyoto-u.ac.jp> <64ddb72c0912020301t6ee77383yee2fcc96deb62f65@mail.gmail.com> <4B16469E.2010207@ar.media.kyoto-u.ac.jp>
Message-ID: <64ddb72c0912020401y11b89702q2d11f96f73c4a60a@mail.gmail.com>

On Wed, Dec 2, 2009 at 11:51 AM, David Cournapeau wrote:
> René Dudfield wrote:
>> We put in some hacks into pygame distutils for removing old files
>> (with msi installer, and for setup.py installer). However, I think the
>> long term solution they are thinking of is to allow distutils to
>> uninstall correctly.
>
> I think this is just wishful thinking from people who try to improve
> distutils - it is near impossible to implement this correctly with
> distutils, because of how install works (i.e. just dumping whatever
> happens to be in the build directory, hoping that the list of files is
> correctly handled - which is almost never the case in my experience).
>

Ugly, but safe hacks to distutils trump breaking installs for me. There's another bug report about how moving the new install directory into place is better. Then you have very little chance of having a broken install, and you do not need to worry about overwriting files, or new files clashing with old files.

'distutils race condition': http://bugs.python.org/issue7412

> That's one of the major flaws of distutils which I hope to fix with my
> project toydist, by being very explicit at each step. In particular, the
> install steps are described from a file generated at build time, and all
> binary installers and eggs will be based on this generated file.
>

Agreed. Separate steps are needed.

>> I could submit a patch for numpy if you are interested? It would
>> require a list of old files from previous versions of numpy (like
>> numpy/core/test_defmatrix.py).
>>
>
> Do you mean that the files to remove have to be tracked manually ? You
> can always submit a patch to trac, though,
>

yeah, tracking files between releases manually. If there isn't already a list of files installed for each numpy release, that is. If there is a file list, then it would be simple to figure out which files have been removed from each release. There is a new PEP being worked on for making this file list in a specified place, I believe.

Actually, we have three sets of files... py_to_remove and ext_to_remove. Since oldfile.dll is different to oldfile.so etc. Also there are different .py files... eg, pyc, pyo and others. The third set is an explicit file, eg 'data/bla.png'

I'll attach a patch to Trac with just the 'numpy/core/test_defmatrix.py*' to be removed. Or you can view the pygame setup.py here: look for 'def remove_old_files', and also "msilib.add_data(self.db, "RemoveFile", records)".

http://www.seul.org/viewcvs/viewcvs.cgi/trunk/setup.py?rev=2653&root=PyGame&sortby=date&view=markup
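In sketch form, the hack amounts to keeping a hand-maintained list of stale files and deleting them at install time. The list and the path handling below are illustrative assumptions, not pygame's actual code:

    # Minimal sketch of a "remove stale files before install" hack.
    # PY_TO_REMOVE is a hypothetical hand-maintained list, as discussed
    # above; a real version would also handle extension modules.
    import os

    PY_TO_REMOVE = ["numpy/core/tests/test_defmatrix.py"]

    def remove_old_files(install_dir):
        for rel in PY_TO_REMOVE:
            base = os.path.join(install_dir, *rel.split("/"))
            # remove the .py file plus any compiled .pyc/.pyo leftovers
            for path in (base, base + "c", base + "o"):
                if os.path.exists(path):
                    os.remove(path)

cheers,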
From bsouthey at gmail.com Wed Dec 2 09:38:23 2009
From: bsouthey at gmail.com (Bruce Southey)
Date: Wed, 02 Dec 2009 08:38:23 -0600
Subject: [Numpy-discussion] Numpy 1.4.0 rc1 released
In-Reply-To: <4B162A09.9070106@ar.media.kyoto-u.ac.jp>
References: <5b8d13220911301947m7fc20c83v59d84a035cb6e6ac@mail.gmail.com> <710F2847B0018641891D9A21602763605AD247@ex3.envision.co.il> <4B162A09.9070106@ar.media.kyoto-u.ac.jp>
Message-ID: <4B167BDF.30909@gmail.com>

On 12/02/2009 02:49 AM, David Cournapeau wrote:
> Nadav Horesh wrote:
>> I got the following errors with a clean installation of numpy (previous installations deleted):
>>
> Actually, there are still some leftovers: the file
> numpy/core/test_defmatrix.py does not exist in the tarball.
>
> cheers,
>
> David
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

I see this issue as a (small) downside of the ability of nose to find and run any test. So, while it does not really solve the underlying problem, can nose be *easily* configured to ignore these defunct tests for the stable releases?

Bruce

From meine at informatik.uni-hamburg.de Wed Dec 2 12:42:37 2009
From: meine at informatik.uni-hamburg.de (Hans Meine)
Date: Wed, 2 Dec 2009 18:42:37 +0100
Subject: [Numpy-discussion] ANN: qimage2ndarray - converting between QImages and numpy.ndarrays
Message-ID: <200912021842.43806.meine@informatik.uni-hamburg.de>

Hi,

I have just uploaded a first release of qimage2ndarray, a tiny python extension for quickly converting between QImages and numpy.ndarrays (in both directions). These are very common tasks when programming e.g. scientific visualizations in Python using PyQt4 as the GUI library.

Similar code was found in Qwt and floating around on mailing lists, but qimage2ndarray has the following unique feature set:

* Supports conversion of scalar and RGB data, with arbitrary dtypes and memory layout, with and without alpha channels, into QImages (e.g. for display or saving using Qt).

* Using a tiny C++ extension, qimage2ndarray makes it possible to create ndarrays that are *views* into a given QImage's memory. This allows for very efficient data handling and makes it possible to modify Qt image data in-place (e.g. for brightness/gamma or alpha mask modifications).

* qimage2ndarray is stable and unit-tested:
  * proper reference counting even with views (ndarray.base points to the underlying QImage)
  * handles non-standard widths and respects QImage's 32-bit row alignment

* Masked arrays are also supported and are converted into QImages with transparent pixels.

* Supports value scaling / normalization to 0..255 for convenient display of arbitrary NumPy arrays.

The extension is open source, BSD-licensed, and available via PyPI or here:

http://kogs-www.informatik.uni-hamburg.de/~meine/software/qimage2ndarray/

I hope this is useful to many of you and look forward to your feedback,
  Hans

PS: Now that I am announcing this, I suddenly have the feeling that I should have talked with some lawyer (or Phil) about possible license issues because of PyQt. I really hope there will not turn out to be problems with this.

From robert.kern at gmail.com Wed Dec 2 12:49:27 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 2 Dec 2009 12:49:27 -0500
Subject: [Numpy-discussion] ANN: qimage2ndarray - converting between QImages and numpy.ndarrays
In-Reply-To: <200912021842.43806.meine@informatik.uni-hamburg.de>
References: <200912021842.43806.meine@informatik.uni-hamburg.de>
Message-ID: <3d375d730912020949k81dc1fejdadacec7af69a237@mail.gmail.com>

2009/12/2 Hans Meine :
> PS: Now that I am announcing this, I suddenly have the feeling that I should
> have talked with some lawyer (or Phil) about possible license issues because
> of PyQt. I really hope there will not turn out to be problems with this.

The PyQt license has an explicit provision that allows you to build BSD-licensed libraries and applications that use PyQt. The final application as a whole (i.e. when combined with PyQt) must be distributed according to the appropriate PyQt license, either GPL or Commercial, of course, but your BSD library is fine to include in that application. Look at the file GPL_EXCEPTION.TXT .

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco

From mathew.c.yeates at jpl.nasa.gov Wed Dec 2 12:55:48 2009
From: mathew.c.yeates at jpl.nasa.gov (Yeates, Mathew C (388D))
Date: Wed, 2 Dec 2009 09:55:48 -0800
Subject: [Numpy-discussion] a simple example showing numpy and matplotlib failing
In-Reply-To: <4D311220B29AEF40BCF6A04066590EE2013BC59E78E0@ALTPHYEMBEVSP10.RES.AD.JPL>
References: <4D311220B29AEF40BCF6A04066590EE2013BC5A1AE7D@ALTPHYEMBEVSP10.RES.AD.JPL> <4B157575.8060402@stsci.edu> <4D311220B29AEF40BCF6A04066590EE2013BC59E78E0@ALTPHYEMBEVSP10.RES.AD.JPL>
Message-ID: <4D311220B29AEF40BCF6A04066590EE2013BC59E792A@ALTPHYEMBEVSP10.RES.AD.JPL>

Anybody have any ideas what is going on here? Although I found a workaround, I'm concerned about memory leaks

________________________________

From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Yeates, Mathew C (388D)
Sent: Tuesday, December 01, 2009 6:53 PM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] a simple example showing numpy and matplotlib failing

I found a workaround. If I replace

> plot_data=data[0,0:,0]

With

> plot_data=numpy.copy(data[0,0:,0])

Everything is okay.

I am on Windows XP 64 with 4 Gigs ram. (Note: the data array is greater than 4 Gigs since my datatype is float64. If I decrease the size so that the array is around 3 Gigs, all is good)

Mathew

________________________________

From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Santanu Chatterjee
Sent: Tuesday, December 01, 2009 12:15 PM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] a simple example showing numpy and matplotlib failing

Hi Mathew,
I saw your email and I was curious about it. I tried your code and it does work for me without any problem.

Santanu

On Tue, Dec 1, 2009 at 2:58 PM, Michael Droettboom wrote:

Hmm... works for me. What platform, with how much physical and virtual RAM?

One thing you may want to try is to completely destroy the figure each time:

if fig:
    fig.clf()
    fig=None

Mike

Yeates, Mathew C (388D) wrote:
>
> Click on "Hello World" twice and get a memory error. Comment out the
> ax.plot call and get no error.
> > import numpy
> > import sys
> > import gtk
> > from matplotlib.figure import Figure
> > from matplotlib.backends.backend_gtkagg import FigureCanvasGTKAgg as
> > FigureCanvas
> >
> > ax=None
> > fig=None
> > canvas=None
> >
> > def doplot(widget,box1):
> >     global ax,fig,canvas
> >     data=numpy.zeros(shape=(3508,125,129))
> >     plot_data=data[0,0:,0]
> >     if canvas:
> >         box1.remove(canvas)
> >         canvas=None
> >     if ax:
> >         ax.cla()
> >         ax=None
> >     if fig: fig=None
> >     fig = Figure(figsize=(5,5), dpi=100)
> >     ax = fig.add_subplot(111)
> >     mif=numpy.arange(plot_data.shape[0])
> >     #if the next line is commented out, all is good
> >     ax.plot(plot_data,mif)
> >     canvas = FigureCanvas(fig)
> >     box1.pack_start(canvas, True, True, 0)
> >     canvas.show()
> >
> > def delete_event(widget, event, data=None):
> >     return False
> >
> > window = gtk.Window(gtk.WINDOW_TOPLEVEL)
> > window.connect("destroy", lambda x: gtk.main_quit())
> > box1 = gtk.HBox(False, 0)
> > window.add(box1)
> > button = gtk.Button("Hello World")
> > box1.pack_start(button, True, True, 0)
> > #window.add(box1)
> > button.show()
> > button.connect("clicked", doplot, box1)
> > box1.show()
> > window.set_default_size(500,400)
> > window.show()
> > gtk.main()
> >
> > ------------------------------------------------------------------------
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

-- 
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From cjw at ncf.ca Wed Dec 2 13:25:07 2009
From: cjw at ncf.ca (Colin J. Williams)
Date: Wed, 02 Dec 2009 13:25:07 -0500
Subject: [Numpy-discussion] non-standard standard deviation
In-Reply-To: <2d5132a50911291715k703a0b32t1deffe67cf8947e2@mail.gmail.com>
References: <26566843.post@talk.nabble.com> <4B131239.7080801@ncf.ca> <2d5132a50911291715k703a0b32t1deffe67cf8947e2@mail.gmail.com>
Message-ID: <4B16B103.6070600@ncf.ca>

On 29-Nov-09 20:15 PM, Robin wrote:
> On Mon, Nov 30, 2009 at 12:30 AM, Colin J. Williams wrote:
>
>> On 29-Nov-09 17:13 PM, Dr. Phillip M. Feldman wrote:
>>
>>> All of the statistical packages that I am currently using and have used in
>>> the past (Matlab, Minitab, R, S-plus) calculate standard deviation using the
>>> sqrt(1/(n-1)) normalization, which gives a result that is unbiased when
>>> sampling from a normally-distributed population. NumPy uses the sqrt(1/n)
>>> normalization. I'm currently using the following code to calculate standard
>>> deviations, but would much prefer if this could be fixed in NumPy itself:
>>>
>>> def mystd(x=numpy.array([]), axis=None):
>>>     """This function calculates the standard deviation of the input using the
>>>     definition of standard deviation that gives an unbiased result for samples
>>>     from a normally-distributed population."""
>>>
>>>     xd = x - x.mean(axis=axis)
>>>     return sqrt( (xd*xd).sum(axis=axis) / (numpy.size(x,axis=axis)-1.0) )
>>>
>>>
>> Anne Archibald has suggested a work-around. Perhaps ddof could be set,
>> by default, to 1 as other values are rarely required.
>> Where the distribution of a variate is not known a priori, then I
>> believe that it can be shown that the n-1 divisor provides the best
>> estimate of the variance.
>
> There have been previous discussions on this (but I can't find them
> now) and I believe the current default was chosen deliberately. I
> think it is the view of the numpy developers that the n divisor has
> more desirable properties in most cases than the traditional n-1 -
> see this paper by Travis Oliphant for details:
> http://hdl.handle.net/1877/438
>
> Cheers
>
> Robin
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

The conventional approach, based on the notion of expected values, is given here:
http://en.wikipedia.org/wiki/Variance#Distribution_of_the_sample_variance

I would suggest that numpy should stick with that until the approach advocated in http://hdl.handle.net/1877/438 is generally accepted.

Thomas Bayes introduced some nebulous ideas that might not be relevant for most cases when one is trying to find a confidence interval for a mean:
http://en.wikipedia.org/wiki/Thomas_bayes#Bayes.27_theorem

Colin W.

From Chris.Barker at noaa.gov Wed Dec 2 15:28:02 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Wed, 02 Dec 2009 12:28:02 -0800
Subject: [Numpy-discussion] Numpy 1.4.0 rc1 released
In-Reply-To: <4B167BDF.30909@gmail.com>
References: <5b8d13220911301947m7fc20c83v59d84a035cb6e6ac@mail.gmail.com> <710F2847B0018641891D9A21602763605AD247@ex3.envision.co.il> <4B162A09.9070106@ar.media.kyoto-u.ac.jp> <4B167BDF.30909@gmail.com>
Message-ID: <4B16CDD2.5060907@noaa.gov>

I downloaded rc1, and built it on my PPC OS-X 10.4 box, with Python 2.5.2 (from python.org). Then ran the tests. I got:

----------------------------------------------------------------------
Ran 2521 tests in 24.804s

FAILED (KNOWNFAIL=4, SKIP=1, errors=27, failures=2)

Many of them look like this:

ERROR: test_basic (test_defmatrix.TestAlgebra)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 189, in test_basic
    mA = matrix(A)
NameError: global name 'matrix' is not defined

Some sort of namespace issue? np.matrix does exist.

Then there is an issue I have no clue about:

======================================================================
FAIL: test_umath.test_nextafterl
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/nose-0.10.4-py2.5.egg/nose/case.py", line 182, in runTest
    self.test(*self.arg)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/testing/decorators.py", line 215, in knownfailer
    return f(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_umath.py", line 866, in test_nextafterl
    return _test_nextafter(np.longdouble)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_umath.py", line 852, in _test_nextafter
    assert np.nextafter(one, two) - one == eps
AssertionError

Is this something with eps and PPC (endian issue?), maybe?
-Chris Here is the whole run: ORRW-W-1275328-Barker:~/Downloads/Python2.5 cbarker$ python Python 2.5.2 (r252:60911, Feb 22 2008, 07:57:53) [GCC 4.0.1 (Apple Computer, Inc. build 5363)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> np.__version__ '1.4.0rc1' >>> np.test() Running unit tests for numpy NumPy version 1.4.0rc1 NumPy is installed in /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy Python version 2.5.2 (r252:60911, Feb 22 2008, 07:57:53) [GCC 4.0.1 (Apple Computer, Inc. build 5363)] nose version 0.10.4 ..........................................................................................EEEEEEEEEEEEEEEEEEEEEEEEE.EE...............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................K...K...................................................F..F.............................K......................K........................................................................................................................................................................................ ............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................S................................................................. ............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................. 
====================================================================== ERROR: test_basic (test_defmatrix.TestAlgebra) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 189, in test_basic mA = matrix(A) NameError: global name 'matrix' is not defined ====================================================================== ERROR: Check that 'not implemented' operations produce a failure. ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 225, in test_notimplemented A = matrix([[1., 2.], NameError: global name 'matrix' is not defined ====================================================================== ERROR: Test raising a matrix to an integer power works as expected. ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 212, in test_pow m = matrix("1. 2.; 3. 4.") NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_basic (test_defmatrix.TestCasting) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 167, in test_basic mA = matrix(A) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_basic (test_defmatrix.TestCtor) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 8, in test_basic mA = matrix(A) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_bmat_nondefault_str (test_defmatrix.TestCtor) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 43, in test_bmat_nondefault_str assert all(bmat("A,A;A,A") == Aresult) NameError: global name 'bmat' is not defined ====================================================================== ERROR: test_basic (test_defmatrix.TestIndexing) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 287, in test_basic x = asmatrix(zeros((3,2),float)) NameError: global name 'asmatrix' is not defined ====================================================================== ERROR: test_instance_methods (test_defmatrix.TestMatrixReturn) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 247, in test_instance_methods a = 
matrix([1.0], dtype='f8') NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_array_from_matrix_list (test_defmatrix.TestNewScalarIndexing) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 296, in setUp self.a = matrix([[1, 2],[3,4]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_array_to_list (test_defmatrix.TestNewScalarIndexing) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 296, in setUp self.a = matrix([[1, 2],[3,4]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_boolean_indexing (test_defmatrix.TestNewScalarIndexing) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 296, in setUp self.a = matrix([[1, 2],[3,4]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_dimesions (test_defmatrix.TestNewScalarIndexing) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 296, in setUp self.a = matrix([[1, 2],[3,4]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_fancy_indexing (test_defmatrix.TestNewScalarIndexing) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 296, in setUp self.a = matrix([[1, 2],[3,4]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_list_indexing (test_defmatrix.TestNewScalarIndexing) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 296, in setUp self.a = matrix([[1, 2],[3,4]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_matrix_element (test_defmatrix.TestNewScalarIndexing) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 296, in setUp self.a = matrix([[1, 2],[3,4]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_row_column_indexing (test_defmatrix.TestNewScalarIndexing) ---------------------------------------------------------------------- Traceback (most recent call 
last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 296, in setUp self.a = matrix([[1, 2],[3,4]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_scalar_indexing (test_defmatrix.TestNewScalarIndexing) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 296, in setUp self.a = matrix([[1, 2],[3,4]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_asmatrix (test_defmatrix.TestProperties) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 151, in test_asmatrix mA = asmatrix(A) NameError: global name 'asmatrix' is not defined ====================================================================== ERROR: test_basic (test_defmatrix.TestProperties) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 106, in test_basic mA = matrix(A) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_comparisons (test_defmatrix.TestProperties) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 127, in test_comparisons mA = matrix(A) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_max (test_defmatrix.TestProperties) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 78, in test_max x = matrix([[1,2,3],[4,5,6]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_min (test_defmatrix.TestProperties) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 84, in test_min x = matrix([[1,2,3],[4,5,6]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_noaxis (test_defmatrix.TestProperties) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 156, in test_noaxis A = matrix([[1,0],[0,1]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_pinv (test_defmatrix.TestProperties) ---------------------------------------------------------------------- Traceback (most 
recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 119, in test_pinv x = matrix(arange(6).reshape(2,3)) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_prod (test_defmatrix.TestProperties) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 69, in test_prod x = matrix([[1,2,3],[4,5,6]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: test_repr (test_defmatrix.TestProperties) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 161, in test_repr A = matrix([[1,0],[0,1]]) NameError: global name 'matrix' is not defined ====================================================================== ERROR: Test whether matrix.sum(axis=1) preserves orientation. ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", line 56, in test_sum M = matrix([[1,2,0,0], NameError: global name 'matrix' is not defined ====================================================================== FAIL: test_umath.test_nextafterl ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/nose-0.10.4-py2.5.egg/nose/case.py", line 182, in runTest self.test(*self.arg) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/testing/decorators.py", line 215, in knownfailer return f(*args, **kwargs) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_umath.py", line 866, in test_nextafterl return _test_nextafter(np.longdouble) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_umath.py", line 852, in _test_nextafter assert np.nextafter(one, two) - one == eps AssertionError ====================================================================== FAIL: test_umath.test_spacingl ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/nose-0.10.4-py2.5.egg/nose/case.py", line 182, in runTest self.test(*self.arg) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/testing/decorators.py", line 215, in knownfailer return f(*args, **kwargs) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_umath.py", line 886, in test_spacingl return _test_spacing(np.longdouble) File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_umath.py", line 873, in _test_spacing assert np.spacing(one) == eps AssertionError ---------------------------------------------------------------------- Ran 2521 tests in 24.804s FAILED (KNOWNFAIL=4, SKIP=1, errors=27, 
failures=2) >>> -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From robert.kern at gmail.com Wed Dec 2 15:29:26 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 2 Dec 2009 15:29:26 -0500 Subject: [Numpy-discussion] non-standard standard deviation In-Reply-To: <4B16B103.6070600@ncf.ca> References: <26566843.post@talk.nabble.com> <4B131239.7080801@ncf.ca> <2d5132a50911291715k703a0b32t1deffe67cf8947e2@mail.gmail.com> <4B16B103.6070600@ncf.ca> Message-ID: <3d375d730912021229g10d80116hb6341da8bb97bee3@mail.gmail.com> On Wed, Dec 2, 2009 at 13:25, Colin J. Williams wrote: > The conventional approach, based in the notion ?of Expected values is > given here: > http://en.wikipedia.org/wiki/Variance#Distribution_of_the_sample_variance > > I would suggest that numpy should stick with that until the approach > advocated in: http://hdl.handle.net/1877/438 > is generally accepted. We are not changing the behavior of numpy.std() or numpy.var() at this point in time. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Wed Dec 2 15:37:19 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 2 Dec 2009 13:37:19 -0700 Subject: [Numpy-discussion] Numpy 1.4.0 rc1 released In-Reply-To: <4B16CDD2.5060907@noaa.gov> References: <5b8d13220911301947m7fc20c83v59d84a035cb6e6ac@mail.gmail.com> <710F2847B0018641891D9A21602763605AD247@ex3.envision.co.il> <4B162A09.9070106@ar.media.kyoto-u.ac.jp> <4B167BDF.30909@gmail.com> <4B16CDD2.5060907@noaa.gov> Message-ID: On Wed, Dec 2, 2009 at 1:28 PM, Christopher Barker wrote: > I downloaded rc1, and built it on my PPC OS-X 10.4 box, with Python > 2.5.2 (from python.org). Then ran the tests. I got: > > ---------------------------------------------------------------------- > Ran 2521 tests in 24.804s > > FAILED (KNOWNFAIL=4, SKIP=1, errors=27, failures=2) > > > Many of them look like this: > > ERROR: test_basic (test_defmatrix.TestAlgebra) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > > "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_defmatrix.py", > line 189, in test_basic > mA = matrix(A) > NameError: global name 'matrix' is not defined > > > Some sort of namespace issue? np.matrix does exist. > > David says this is due to a stray old file (see earlier post), you need to clean out the previous numpy installation. 
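Concretely, that means deleting the installed numpy package directory before reinstalling; a minimal sketch, assuming a plain site-packages install rather than an egg:

    # Locate (and optionally remove) the installed numpy package directory.
    import os.path
    import shutil

    import numpy

    pkgdir = os.path.dirname(numpy.__file__)
    print pkgdir               # e.g. .../site-packages/numpy
    # shutil.rmtree(pkgdir)    # uncomment to actually remove it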
> Then there is an issue I have no clue about:
>
> ======================================================================
> FAIL: test_umath.test_nextafterl
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/nose-0.10.4-py2.5.egg/nose/case.py", line 182, in runTest
>     self.test(*self.arg)
>   File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/testing/decorators.py", line 215, in knownfailer
>     return f(*args, **kwargs)
>   File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_umath.py", line 866, in test_nextafterl
>     return _test_nextafter(np.longdouble)
>   File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_umath.py", line 852, in _test_nextafter
>     assert np.nextafter(one, two) - one == eps
> AssertionError
>
> Is this something with eps and PPC (endian issue?), maybe?
>

It is the odd long double of PPC, also reported earlier. It should probably be marked a known fail on PPC. If David or a developer had a PPC it might be worked out. OTOH, long double on PPC isn't any sort of IEEE so, strictly speaking, we don't support it.
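The failing assertion reduces to the following check; a minimal sketch of what the test exercises (on PPC's double-double long double format the identity need not hold):

    # On IEEE long doubles, the next representable value after 1.0
    # differs from 1.0 by exactly eps; PPC's double-double breaks this.
    import numpy as np

    one = np.longdouble(1)
    two = np.longdouble(2)
    eps = np.finfo(np.longdouble).eps
    print np.nextafter(one, two) - one == eps   # True on IEEE hardware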
Chuck

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jkington at wisc.edu Wed Dec 2 15:41:12 2009
From: jkington at wisc.edu (Joe Kington)
Date: Wed, 2 Dec 2009 14:41:12 -0600
Subject: [Numpy-discussion] a simple example showing numpy and matplotlib failing
In-Reply-To: <4D311220B29AEF40BCF6A04066590EE2013BC59E792A@ALTPHYEMBEVSP10.RES.AD.JPL>
References: <4D311220B29AEF40BCF6A04066590EE2013BC5A1AE7D@ALTPHYEMBEVSP10.RES.AD.JPL> <4B157575.8060402@stsci.edu> <4D311220B29AEF40BCF6A04066590EE2013BC59E78E0@ALTPHYEMBEVSP10.RES.AD.JPL> <4D311220B29AEF40BCF6A04066590EE2013BC59E792A@ALTPHYEMBEVSP10.RES.AD.JPL>
Message-ID:

I'm just guessing here, but have you tried completely destroying the figure each time, as Michael suggested? That should avoid the problem you're having, I think... At any rate, if you don't do a fig.clf(), I'm fairly sure matplotlib keeps a reference to the data around.

Hope that helps,
-Joe

On Wed, Dec 2, 2009 at 11:55 AM, Yeates, Mathew C (388D) < mathew.c.yeates at jpl.nasa.gov> wrote:

> Anybody have any ideas what is going on here? Although I found a
> workaround, I'm concerned about memory leaks
>
> ________________________________
>
> From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Yeates, Mathew C (388D)
> Sent: Tuesday, December 01, 2009 6:53 PM
> To: Discussion of Numerical Python
> Subject: Re: [Numpy-discussion] a simple example showing numpy and matplotlib failing
>
> I found a workaround. If I replace
>
> plot_data=data[0,0:,0]
>
> With
>
> plot_data=numpy.copy(data[0,0:,0])
>
> Everything is okay.
>
> I am on Windows XP 64 with 4 Gigs ram. (Note: the data array is greater
> than 4 Gigs since my datatype is float64. If I decrease the size so that the
> array is around 3 Gigs, all is good)
>
> Mathew
>
> ________________________________
>
> From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Santanu Chatterjee
> Sent: Tuesday, December 01, 2009 12:15 PM
> To: Discussion of Numerical Python
> Subject: Re: [Numpy-discussion] a simple example showing numpy and matplotlib failing
>
> Hi Mathew,
> I saw your email and I was curious about it. I tried your code and it
> does work for me without any problem.
>
> Santanu
>
> On Tue, Dec 1, 2009 at 2:58 PM, Michael Droettboom wrote:
>
> Hmm... works for me. What platform, with how much physical and virtual RAM?
>
> One thing you may want to try is to completely destroy the figure each
> time:
>
> if fig:
>     fig.clf()
>     fig=None
>
> Mike
>
> Yeates, Mathew C (388D) wrote:
> >
> > Click on "Hello World" twice and get a memory error. Comment out the
> > ax.plot call and get no error.
> >
> > import numpy
> > import sys
> > import gtk
> > from matplotlib.figure import Figure
> > from matplotlib.backends.backend_gtkagg import FigureCanvasGTKAgg as
> > FigureCanvas
> >
> > ax=None
> > fig=None
> > canvas=None
> >
> > def doplot(widget,box1):
> >     global ax,fig,canvas
> >     data=numpy.zeros(shape=(3508,125,129))
> >     plot_data=data[0,0:,0]
> >     if canvas:
> >         box1.remove(canvas)
> >         canvas=None
> >     if ax:
> >         ax.cla()
> >         ax=None
> >     if fig: fig=None
> >     fig = Figure(figsize=(5,5), dpi=100)
> >     ax = fig.add_subplot(111)
> >     mif=numpy.arange(plot_data.shape[0])
> >     #if the next line is commented out, all is good
> >     ax.plot(plot_data,mif)
> >     canvas = FigureCanvas(fig)
> >     box1.pack_start(canvas, True, True, 0)
> >     canvas.show()
> >
> > def delete_event(widget, event, data=None):
> >     return False
> >
> > window = gtk.Window(gtk.WINDOW_TOPLEVEL)
> > window.connect("destroy", lambda x: gtk.main_quit())
> > box1 = gtk.HBox(False, 0)
> > window.add(box1)
> > button = gtk.Button("Hello World")
> > box1.pack_start(button, True, True, 0)
> > #window.add(box1)
> > button.show()
> > button.connect("clicked", doplot, box1)
> > box1.show()
> > window.set_default_size(500,400)
> > window.show()
> > gtk.main()
> >
> > ------------------------------------------------------------------------
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
>
> --
> Michael Droettboom
> Science Software Branch
> Operations and Engineering Division
> Space Telescope Science Institute
> Operated by AURA for NASA
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
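The copy-based workaround quoted above works because a slice is a view that keeps the entire base array reachable, while numpy.copy gives an independent buffer. A minimal illustration, assuming numpy's usual view semantics for basic slicing:

    # Why the copy helps: a slice is a view and holds a reference to the
    # full base array; a copy is an independent (much smaller) buffer.
    import numpy as np

    data = np.zeros(shape=(3508, 125, 129))
    view = data[0, 0:, 0]            # view: keeps all of 'data' alive
    copy = np.copy(data[0, 0:, 0])   # independent 125-element array

    print view.base is data   # True
    print copy.base is None   # True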
From cheronetolivia at yahoo.com Wed Dec 2 15:41:42 2009
From: cheronetolivia at yahoo.com (Olivia Cheronet)
Date: Wed, 2 Dec 2009 12:41:42 -0800 (PST)
Subject: [Numpy-discussion] Import numpy fails on cygwin python
In-Reply-To: <4B162BA1.2040106@ar.media.kyoto-u.ac.jp>
References: <679405.12948.qm@web51002.mail.re2.yahoo.com> <4B162BA1.2040106@ar.media.kyoto-u.ac.jp>
Message-ID: <599716.6028.qm@web51003.mail.re2.yahoo.com>

----- Original Message ----
> From: David Cournapeau
>
> Does the file
> /usr/lib/python2.5/site-packages/numpy/linalg/lapack_lite.so exist ?
>
> cheers,
>
> David

Indeed, this file is not there. Where can I find it?

Thanks.

Olivia

From Chris.Barker at noaa.gov Wed Dec 2 18:08:20 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Wed, 02 Dec 2009 15:08:20 -0800
Subject: [Numpy-discussion] Numpy 1.4.0 rc1 released
In-Reply-To:
References: <5b8d13220911301947m7fc20c83v59d84a035cb6e6ac@mail.gmail.com> <710F2847B0018641891D9A21602763605AD247@ex3.envision.co.il> <4B162A09.9070106@ar.media.kyoto-u.ac.jp> <4B167BDF.30909@gmail.com> <4B16CDD2.5060907@noaa.gov>
Message-ID: <4B16F364.4060000@noaa.gov>

Charles R Harris wrote:
> David says this is due to a stray old file (see earlier post), you need
> to clean out the previous numpy installation.

Done, and yes, that was it. Which is weird, because I really thought I'd cleared it out the first time! I'm still having trouble figuring out how to test it without installing it, but it's not so hard to put the old one back if I want.

Now I get:

FAILED (KNOWNFAIL=4, SKIP=1, failures=2)

and the two failures are:

> It is the odd long double of PPC, also reported earlier.
> It should probably be marked a known fail on PPC. If David or a developer had a
> PPC it might be worked out.

How tricky is it? If someone tells me where to look, I might be able to poke at it, though I may be out of my depth.

> OTOH, long double on PPC isn't any
> sort of IEEE so, strictly speaking, we don't support it.

nor do I use them, so I guess I don't care -- setting it as KNOWNFAIL would be fine with me!

thanks, and thanks to everyone for another fabulous release!

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov

From sccolbert at gmail.com Wed Dec 2 18:52:48 2009
From: sccolbert at gmail.com (Chris Colbert)
Date: Thu, 3 Dec 2009 00:52:48 +0100
Subject: [Numpy-discussion] ANN: qimage2ndarray - converting between QImages and numpy.ndarrays
In-Reply-To: <200912021842.43806.meine@informatik.uni-hamburg.de>
References: <200912021842.43806.meine@informatik.uni-hamburg.de>
Message-ID: <7f014ea60912021552n629429d9kce98e1166508f945@mail.gmail.com>

Cool. Thanks!

I will take a look at this. We have some code in scikits.image that creates a QImage from the numpy data buffer for display. But I have only implemented it for RGB888 so far. So you may have saved me some time :)

Cheers!

Chris

2009/12/2 Hans Meine :
> Hi,
>
> I have just uploaded a first release of qimage2ndarray, a tiny python
> extension for quickly converting between QImages and numpy.ndarrays
> (in both directions). These are very common tasks when programming e.g.
> scientific visualizations in Python using PyQt4 as the GUI library.
>
> Similar code was found in Qwt and floating around on mailing lists,
> but qimage2ndarray has the following unique feature set:
>
> * Supports conversion of scalar and RGB data, with arbitrary dtypes
>   and memory layout, with and without alpha channels, into QImages
>   (e.g. for display or saving using Qt).
>
> * Using a tiny C++ extension, qimage2ndarray makes it possible to
>   create ndarrays that are *views* into a given QImage's memory.
>
>   This allows for very efficient data handling and makes it possible
>   to modify Qt image data in-place (e.g. for brightness/gamma or alpha
>   mask modifications).
>
> * qimage2ndarray is stable and unit-tested:
>
>   * proper reference counting even with views (ndarray.base points to
>     the underlying QImage)
>
>   * handles non-standard widths and respects QImage's 32-bit row
>     alignment
>
> * Masked arrays are also supported and are converted into QImages
>   with transparent pixels.
>
> * Supports value scaling / normalization to 0..255 for convenient
>   display of arbitrary NumPy arrays.
>
> The extension is open source, BSD-licensed, and available via PyPI or here:
>
> http://kogs-www.informatik.uni-hamburg.de/~meine/software/qimage2ndarray/
>
> I hope this is useful to many of you and look forward to your feedback,
>   Hans
>
> PS: Now that I am announcing this, I suddenly have the feeling that I should
> have talked with some lawyer (or Phil) about possible license issues because
> of PyQt. I really hope there will not turn out to be problems with this.
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From hgchong at berkeley.edu Wed Dec 2 18:55:00 2009
From: hgchong at berkeley.edu (Howard Chong)
Date: Wed, 2 Dec 2009 15:55:00 -0800
Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array
Message-ID: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com>

I will need to find the N largest numbers and corresponding indexes in a 1-D array. If N==1, I can easily do:

def myFindMaxC(myList):
    """implement finding maximum value using numpy.array()"""
    myA=np.array(myList)
    maxIndex=myA.argmax()
    maxVal=myA[maxIndex]
    return [maxIndex, maxVal]

For me, I'm likely going to be running this with N==7. So, I think I have to iterate over the array. Doing it with non-numpy procedures is **quite slow**. Here, I run it with N==1 without using any numpy procedures.

def myFindMaxA(myList):
    """implement finding maximum value with for loop iteration"""
    maxIndex=0
    maxVal=myList[0]
    for index, item in enumerate(myList):
        if item[0]>maxVal:
            maxVal=item[0]
            maxIndex=index
    return [maxIndex, maxVal]

My question is: how can I make the latter version run faster? I think the answer is that I have to do the iteration in C.

If that's the case, can anyone point me to where np.array.argmax() is implemented so I can write np.array.argmaxN() to extend it to the N largest values?

Thanks!

-- 
Howard Chong
Dept. of Agricultural and Resource Economics and Energy Institute @ Haas Business School
UC Berkeley
hgchong at berkeley.edu
Cell: 510-333-0539

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
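For small N there is also a pure-stdlib route that avoids sorting the whole array; a minimal sketch using heapq (whether it beats a full numpy argsort depends on N and the array size):

    # Top-N values plus their indexes without a full sort, using a heap.
    import heapq

    def find_max_n(values, n):
        # nlargest keeps only n candidates at a time; key picks the value
        return heapq.nlargest(n, enumerate(values), key=lambda pair: pair[1])

    print find_max_n([10, 40, 30, 50, 20], 2)   # [(3, 50), (1, 40)]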
From pgmdevlist at gmail.com Wed Dec 2 19:10:32 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Wed, 2 Dec 2009 19:10:32 -0500
Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array
In-Reply-To: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com>
References: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com>
Message-ID:

On Dec 2, 2009, at 6:55 PM, Howard Chong wrote:
>
> My question is: how can I make the latter version run faster? I think the answer is that I have to do the iteration in C.
>
> If that's the case, can anyone point me to where np.array.argmax() is implemented so I can write np.array.argmaxN() to extend it to the N largest values?

What about using .argsort and taking the last N values ?

>>> x=np.array([10,40,30,50,20])
>>> i=x.argsort()[-2:]
>>> (i,x[i])
(array([1, 3]), array([40, 50]))

From dwf at cs.toronto.edu Wed Dec 2 19:15:25 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Wed, 2 Dec 2009 19:15:25 -0500
Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array
In-Reply-To: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com>
References: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com>
Message-ID: <42CBDE9D-0225-4BB0-A2FF-2430105B2702@cs.toronto.edu>

On 2-Dec-09, at 6:55 PM, Howard Chong wrote:

> def myFindMaxA(myList):
>     """implement finding maximum value with for loop iteration"""
>     maxIndex=0
>     maxVal=myList[0]
>     for index, item in enumerate(myList):
>         if item[0]>maxVal:
>             maxVal=item[0]
>             maxIndex=index
>     return [maxIndex, maxVal]
>
> My question is: how can I make the latter version run faster? I
> think the
> answer is that I have to do the iteration in C.

def find_biggest_n(myarray, n):
    ind = np.argsort(myarray)
    return ind[-n:], myarray[ind[-n:]]

David

From ndbecker2 at gmail.com Wed Dec 2 20:09:17 2009
From: ndbecker2 at gmail.com (Neal Becker)
Date: Wed, 02 Dec 2009 20:09:17 -0500
Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array
References: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com> <42CBDE9D-0225-4BB0-A2FF-2430105B2702@cs.toronto.edu>
Message-ID:

David Warde-Farley wrote:

> On 2-Dec-09, at 6:55 PM, Howard Chong wrote:
>
>> def myFindMaxA(myList):
>>     """implement finding maximum value with for loop iteration"""
>>     maxIndex=0
>>     maxVal=myList[0]
>>     for index, item in enumerate(myList):
>>         if item[0]>maxVal:
>>             maxVal=item[0]
>>             maxIndex=index
>>     return [maxIndex, maxVal]
>>
>> My question is: how can I make the latter version run faster? I
>> think the
>> answer is that I have to do the iteration in C.
>
> def find_biggest_n(myarray, n):
>     ind = np.argsort(myarray)
>     return ind[-n:], myarray[ind[-n:]]
>
> David

Not bad, although I wonder whether a partial sort could be faster.

From kwgoodman at gmail.com Wed Dec 2 20:23:35 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Wed, 2 Dec 2009 17:23:35 -0800
Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array
In-Reply-To:
References: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com> <42CBDE9D-0225-4BB0-A2FF-2430105B2702@cs.toronto.edu>
Message-ID:

On Wed, Dec 2, 2009 at 5:09 PM, Neal Becker wrote:
> David Warde-Farley wrote:
>
>> On 2-Dec-09, at 6:55 PM, Howard Chong wrote:
>>
>>> def myFindMaxA(myList):
>>>     """implement finding maximum value with for loop iteration"""
>>>     maxIndex=0
>>>     maxVal=myList[0]
>>>     for index, item in enumerate(myList):
>>>         if item[0]>maxVal:
>>>             maxVal=item[0]
>>>             maxIndex=index
>>>     return [maxIndex, maxVal]
>>>
>>> My question is: how can I make the latter version run faster? I
>>> think the
>>> answer is that I have to do the iteration in C.
>>
>> def find_biggest_n(myarray, n):
>>     ind = np.argsort(myarray)
>>     return ind[-n:], myarray[ind[-n:]]
>>
>> David
> Not bad, although I wonder whether a partial sort could be faster.

I'm doing a lot of sorting right now. I only need to sort the lowest 30% of values in a 1d array (about 250k elements), the rest I don't need to sort. How do I do a partial sort?

From dwf at cs.toronto.edu Wed Dec 2 20:30:45 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Wed, 2 Dec 2009 20:30:45 -0500
Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array
In-Reply-To:
References: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com> <42CBDE9D-0225-4BB0-A2FF-2430105B2702@cs.toronto.edu>
Message-ID: <78943DA3-E93F-4389-B4F6-A4960C9BFE57@cs.toronto.edu>

On 2-Dec-09, at 8:09 PM, Neal Becker wrote:

> Not bad, although I wonder whether a partial sort could be faster.

Probably (if the array is large) but depending on n, not if it's in Python. Ideal problem for Cython, though.

David

From peridot.faceted at gmail.com Wed Dec 2 20:27:08 2009
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Wed, 2 Dec 2009 20:27:08 -0500
Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array
In-Reply-To:
References: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com> <42CBDE9D-0225-4BB0-A2FF-2430105B2702@cs.toronto.edu>
Message-ID:

2009/12/2 Keith Goodman :
> On Wed, Dec 2, 2009 at 5:09 PM, Neal Becker wrote:
>> David Warde-Farley wrote:
>>
>>> On 2-Dec-09, at 6:55 PM, Howard Chong wrote:
>>>
>>>> def myFindMaxA(myList):
>>>>     """implement finding maximum value with for loop iteration"""
>>>>     maxIndex=0
>>>>     maxVal=myList[0]
>>>>     for index, item in enumerate(myList):
>>>>         if item[0]>maxVal:
>>>>             maxVal=item[0]
>>>>             maxIndex=index
>>>>     return [maxIndex, maxVal]
>>>>
>>>> My question is: how can I make the latter version run faster? I
>>>> think the
>>>> answer is that I have to do the iteration in C.
>>>
>>> def find_biggest_n(myarray, n):
>>>     ind = np.argsort(myarray)
>>>     return ind[-n:], myarray[ind[-n:]]
>>>
>>> David
>> Not bad, although I wonder whether a partial sort could be faster.
>
> I'm doing a lot of sorting right now. I only need to sort the lowest
> 30% of values in a 1d array (about 250k elements), the rest I don't
> need to sort. How do I do a partial sort?

Algorithmically, if you're doing a quicksort, you just don't sort one side of a partition when it's outside the range you want sorted (which could even be data-dependent, I suppose, as well as number-of-items-dependent). This is useful for sorting out only the extreme elements, as well as finding quantiles (and those elements above/below them). Unfortunately I'm not aware of any implementation, useful though it might be.
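In pure Python the idea looks roughly like this; an illustrative sketch only (a real implementation would partition in place, in C):

    # Partial quicksort sketch: recurse only into partitions that overlap
    # the rank range [lo_rank, hi_rank); everything else stays unsorted.
    def partial_sort(a, lo_rank, hi_rank):
        def rec(lo, hi):
            if hi - lo <= 1:
                return
            pivot = a[(lo + hi) // 2]
            left = [x for x in a[lo:hi] if x < pivot]
            mid = [x for x in a[lo:hi] if x == pivot]
            right = [x for x in a[lo:hi] if x > pivot]
            a[lo:hi] = left + mid + right
            cut = lo + len(left)
            if lo_rank < cut:               # left side overlaps wanted ranks
                rec(lo, cut)
            if hi_rank > cut + len(mid):    # right side overlaps wanted ranks
                rec(cut + len(mid), hi)
        rec(0, len(a))
        return a

    data = [9, 1, 8, 2, 7, 3, 6, 4, 5, 0]
    print partial_sort(data, 0, 3)   # only the 3 smallest end up sorted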
Anne From ndbecker2 at gmail.com Wed Dec 2 20:27:53 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 02 Dec 2009 20:27:53 -0500 Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array References: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com> <42CBDE9D-0225-4BB0-A2FF-2430105B2702@cs.toronto.edu> Message-ID: Keith Goodman wrote: ... >> Not bad, although I wonder whether a partial sort could be faster. > > I'm doing a lot of sorting right now. I only need to sort the lowest > 30% of values in a 1d array (about 250k elements), the rest I don't > need to sort. How do I do a partial sort? I only know of it because of the standard c++ library std::partial_sort That should give references (maybe heap-based?) From charlesr.harris at gmail.com Wed Dec 2 20:29:43 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 2 Dec 2009 18:29:43 -0700 Subject: [Numpy-discussion] SPARC Buildbot bus errors after recent commits Message-ID: Hi Travis, I think this is yours ;) -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Wed Dec 2 20:32:06 2009 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 2 Dec 2009 20:32:06 -0500 Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array In-Reply-To: <78943DA3-E93F-4389-B4F6-A4960C9BFE57@cs.toronto.edu> References: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com> <42CBDE9D-0225-4BB0-A2FF-2430105B2702@cs.toronto.edu> <78943DA3-E93F-4389-B4F6-A4960C9BFE57@cs.toronto.edu> Message-ID: 2009/12/2 David Warde-Farley : > On 2-Dec-09, at 8:09 PM, Neal Becker wrote: > >> Not bad, although I wonder whether a partial sort could be faster. > > Probably (if the array is large) but depending on n, not if it's in > Python. Ideal problem for Cython, though. How is Cython support for generic types these days? One wouldn't want to have to write separate versions for each dtype... Anne > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From kwgoodman at gmail.com Wed Dec 2 20:39:44 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 2 Dec 2009 17:39:44 -0800 Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array In-Reply-To: References: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com> <42CBDE9D-0225-4BB0-A2FF-2430105B2702@cs.toronto.edu> Message-ID: On Wed, Dec 2, 2009 at 5:27 PM, Anne Archibald wrote: > 2009/12/2 Keith Goodman : >> On Wed, Dec 2, 2009 at 5:09 PM, Neal Becker wrote: >>> David Warde-Farley wrote: >>> >>>> On 2-Dec-09, at 6:55 PM, Howard Chong wrote: >>>> >>>>> def myFindMaxA(myList): >>>>> ? ?"""implement finding maximum value with for loop iteration""" >>>>> ? ?maxIndex=0 >>>>> ? ?maxVal=myList[0] >>>>> ? ?for index, item in enumerate(myList): >>>>> ? ? ? ?if item[0]>maxVal: >>>>> ? ? ? ? ? ?maxVal=item[0] >>>>> ? ? ? ? ? ?maxIndex=index >>>>> ? ?return [maxIndex, maxVal] >>>>> >>>>> >>>>> >>>>> My question is: how can I make the latter version run faster? I >>>>> think the >>>>> answer is that I have to do the iteration in C. >>>> >>>> >>>> def find_biggest_n(myarray, n): >>>> ind = np.argsort(myarray) >>>> return ind[-n:], myarray[ind[-n:]] >>>> >>>> David >>> Not bad, although I wonder whether a partial sort could be faster. >> >> I'm doing a lot of sorting right now. 
I only need to sort the lowest >> 30% of values in a 1d array (about 250k elements), the rest I don't >> need to sort. How do I do a partial sort? > > Algorithmically, if you're doing a quicksort, you just don't sort one > side of a partition when it's outside the range you want sorted (which > could even be data-dependent, I suppose, as well as > number-of-items-dependent). This is useful for sorting out only the > extreme elements, as well as finding quantiles (and those elements > above/below them). Unfortunately I'm not aware of any implementation, > useful though it might be. Oh, I thought he meant there was a numpy function for partial sorting. What kind of speed up for my problem (sort lower 30% of a 1d array with 250k elements) could I expect if I paid someone to write it in cython? Twice as fast? I'm actually doing x.argsort(), if that matters. From charlesr.harris at gmail.com Wed Dec 2 20:44:16 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 2 Dec 2009 18:44:16 -0700 Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array In-Reply-To: References: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com> <42CBDE9D-0225-4BB0-A2FF-2430105B2702@cs.toronto.edu> <78943DA3-E93F-4389-B4F6-A4960C9BFE57@cs.toronto.edu> Message-ID: On Wed, Dec 2, 2009 at 6:32 PM, Anne Archibald wrote: > 2009/12/2 David Warde-Farley : > > On 2-Dec-09, at 8:09 PM, Neal Becker wrote: > > > >> Not bad, although I wonder whether a partial sort could be faster. > > > > Probably (if the array is large) but depending on n, not if it's in > > Python. Ideal problem for Cython, though. > > How is Cython support for generic types these days? One wouldn't want > to have to write separate versions for each dtype... > > It could be made part of the _sortmodule without much trouble, but if you are looking at the lower 30% I doubt it would do much more that buy you a factor of 2x. Would that matter? folks mostly think of using partial sorts for finding medians and such where it is essentially linear time. I've thought of implementing it for the next release. Re heapsort, heapsort is already the slowest sort by about a factor of 2, and that is about the best case savings you could get using it for finding some small percentage of the smallest values. Half the time is spent building the heap. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Wed Dec 2 20:52:02 2009 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 2 Dec 2009 20:52:02 -0500 Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array In-Reply-To: References: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com> <42CBDE9D-0225-4BB0-A2FF-2430105B2702@cs.toronto.edu> Message-ID: 2009/12/2 Keith Goodman : > On Wed, Dec 2, 2009 at 5:27 PM, Anne Archibald > wrote: >> 2009/12/2 Keith Goodman : >>> On Wed, Dec 2, 2009 at 5:09 PM, Neal Becker wrote: >>>> David Warde-Farley wrote: >>>> >>>>> On 2-Dec-09, at 6:55 PM, Howard Chong wrote: >>>>> >>>>>> def myFindMaxA(myList): >>>>>> ? ?"""implement finding maximum value with for loop iteration""" >>>>>> ? ?maxIndex=0 >>>>>> ? ?maxVal=myList[0] >>>>>> ? ?for index, item in enumerate(myList): >>>>>> ? ? ? ?if item[0]>maxVal: >>>>>> ? ? ? ? ? ?maxVal=item[0] >>>>>> ? ? ? ? ? ?maxIndex=index >>>>>> ? ?return [maxIndex, maxVal] >>>>>> >>>>>> >>>>>> >>>>>> My question is: how can I make the latter version run faster? 
I >>>>>> think the >>>>>> answer is that I have to do the iteration in C. >>>>> >>>>> >>>>> def find_biggest_n(myarray, n): >>>>> ind = np.argsort(myarray) >>>>> return ind[-n:], myarray[ind[-n:]] >>>>> >>>>> David >>>> Not bad, although I wonder whether a partial sort could be faster. >>> >>> I'm doing a lot of sorting right now. I only need to sort the lowest >>> 30% of values in a 1d array (about 250k elements), the rest I don't >>> need to sort. How do I do a partial sort? >> >> Algorithmically, if you're doing a quicksort, you just don't sort one >> side of a partition when it's outside the range you want sorted (which >> could even be data-dependent, I suppose, as well as >> number-of-items-dependent). This is useful for sorting out only the >> extreme elements, as well as finding quantiles (and those elements >> above/below them). Unfortunately I'm not aware of any implementation, >> useful though it might be. > > Oh, I thought he meant there was a numpy function for partial sorting. > > What kind of speed up for my problem (sort lower 30% of a 1d array > with 250k elements) could I expect if I paid someone to write it in > cython? Twice as fast? I'm actually doing x.argsort(), if that > matters. Hard to say exactly, but I'd say at best a factor of two. You can get a rough best-case value by doing an argmin followed by an argsort of the first 30%. In reality things will be a bit worse because you won't immediately be able to discard the whole rest of the array, and because your argsort will be accessing data spread over a larger swath of memory (so worse cache performance). But if this doesn't provide a useful speedup, it's very unlikely partial sorting will be of any use to you. Medians and other quantiles, or partitioning tasks, are where it really shines. Anne From kwgoodman at gmail.com Wed Dec 2 21:05:02 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 2 Dec 2009 18:05:02 -0800 Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array In-Reply-To: References: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com> <42CBDE9D-0225-4BB0-A2FF-2430105B2702@cs.toronto.edu> Message-ID: On Wed, Dec 2, 2009 at 5:52 PM, Anne Archibald wrote: > 2009/12/2 Keith Goodman : >> On Wed, Dec 2, 2009 at 5:27 PM, Anne Archibald >> wrote: >>> 2009/12/2 Keith Goodman : >>>> On Wed, Dec 2, 2009 at 5:09 PM, Neal Becker wrote: >>>>> David Warde-Farley wrote: >>>>> >>>>>> On 2-Dec-09, at 6:55 PM, Howard Chong wrote: >>>>>> >>>>>>> def myFindMaxA(myList): >>>>>>> ? ?"""implement finding maximum value with for loop iteration""" >>>>>>> ? ?maxIndex=0 >>>>>>> ? ?maxVal=myList[0] >>>>>>> ? ?for index, item in enumerate(myList): >>>>>>> ? ? ? ?if item[0]>maxVal: >>>>>>> ? ? ? ? ? ?maxVal=item[0] >>>>>>> ? ? ? ? ? ?maxIndex=index >>>>>>> ? ?return [maxIndex, maxVal] >>>>>>> >>>>>>> >>>>>>> >>>>>>> My question is: how can I make the latter version run faster? I >>>>>>> think the >>>>>>> answer is that I have to do the iteration in C. >>>>>> >>>>>> >>>>>> def find_biggest_n(myarray, n): >>>>>> ind = np.argsort(myarray) >>>>>> return ind[-n:], myarray[ind[-n:]] >>>>>> >>>>>> David >>>>> Not bad, although I wonder whether a partial sort could be faster. >>>> >>>> I'm doing a lot of sorting right now. I only need to sort the lowest >>>> 30% of values in a 1d array (about 250k elements), the rest I don't >>>> need to sort. How do I do a partial sort? 
>>> >>> Algorithmically, if you're doing a quicksort, you just don't sort one >>> side of a partition when it's outside the range you want sorted (which >>> could even be data-dependent, I suppose, as well as >>> number-of-items-dependent). This is useful for sorting out only the >>> extreme elements, as well as finding quantiles (and those elements >>> above/below them). Unfortunately I'm not aware of any implementation, >>> useful though it might be. >> >> Oh, I thought he meant there was a numpy function for partial sorting. >> >> What kind of speed up for my problem (sort lower 30% of a 1d array >> with 250k elements) could I expect if I paid someone to write it in >> cython? Twice as fast? I'm actually doing x.argsort(), if that >> matters. > > Hard to say exactly, but I'd say at best a factor of two. You can get > a rough best-case value by doing an argmin followed by an argsort of > the first 30%. > > In reality things will be a bit worse because you won't immediately be > able to discard the whole rest of the array, and because your argsort > will be accessing data spread over a larger swath of memory (so worse > cache performance). But if this doesn't provide a useful speedup, it's > very unlikely partial sorting will be of any use to you. > > Medians and other quantiles, or partitioning tasks, are where it really shines. Looks delicious: >> y = np.random.rand(250000) >> timeit y.argsort() 10 loops, best of 3: 33 ms per loop >> y3 = np.random.rand(int(250000/3.0)) >> timeit y.argmin(); y3.argsort() 100 loops, best of 3: 10.2 ms per loop My actual problem is this: idx = d.argsort() yc = y[idx].cumsum() yc = yc[ntops1] yc /= ntops where d is a 1d array of length 250k, ntops is a 30 element 1d array, and ntop1 = ntops + 1. So I really only need to know the exact order of 30 elements but I need the right elements, in any order, between those 30 elements. So it looks like there is a lot to be gained from a cython implementation of my specific problem. Sorry for trying to hijack the thread. From dwf at cs.toronto.edu Wed Dec 2 21:19:25 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 2 Dec 2009 21:19:25 -0500 Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array In-Reply-To: References: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com> <42CBDE9D-0225-4BB0-A2FF-2430105B2702@cs.toronto.edu> <78943DA3-E93F-4389-B4F6-A4960C9BFE57@cs.toronto.edu> Message-ID: <7E4532C8-FD1D-4A56-915A-8D6E96AFF495@cs.toronto.edu> On 2-Dec-09, at 8:32 PM, Anne Archibald wrote: > 2009/12/2 David Warde-Farley : >> On 2-Dec-09, at 8:09 PM, Neal Becker wrote: >> >>> Not bad, although I wonder whether a partial sort could be faster. >> >> Probably (if the array is large) but depending on n, not if it's in >> Python. Ideal problem for Cython, though. > > How is Cython support for generic types these days? One wouldn't want > to have to write separate versions for each dtype... Much the same, unfortunately. Dag suggested generating Cython code with a Cheetah template the last time I brought it up. 
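One illustrative sketch of that approach (all names below are made up,
and the numpy.pxd details are from memory): generate one specialized
Cython function per dtype from an ordinary string template, then
compile the generated .pyx as usual:

# generate_pyx.py -- hypothetical generator, run before cythonizing
template = '''
def asum_%(name)s(np.ndarray[np.%(name)s_t, ndim=1] a):
    cdef Py_ssize_t i
    cdef np.%(name)s_t s = 0
    for i in range(a.shape[0]):
        s += a[i]
    return s
'''

with open('specialized.pyx', 'w') as f:
    f.write('cimport numpy as np\nnp.import_array()\n')
    for name in ('float64', 'float32', 'int64', 'int32'):
        f.write(template % {'name': name})

The body is identical for every dtype; only the type names change,
which is exactly the part Cython cannot yet abstract over.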
David From kwgoodman at gmail.com Wed Dec 2 21:48:14 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 2 Dec 2009 18:48:14 -0800 Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array In-Reply-To: References: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com> <42CBDE9D-0225-4BB0-A2FF-2430105B2702@cs.toronto.edu> <78943DA3-E93F-4389-B4F6-A4960C9BFE57@cs.toronto.edu> Message-ID: On Wed, Dec 2, 2009 at 5:44 PM, Charles R Harris wrote: > > > On Wed, Dec 2, 2009 at 6:32 PM, Anne Archibald > wrote: >> >> 2009/12/2 David Warde-Farley : >> > On 2-Dec-09, at 8:09 PM, Neal Becker wrote: >> > >> >> Not bad, although I wonder whether a partial sort could be faster. >> > >> > Probably (if the array is large) but depending on n, not if it's in >> > Python. Ideal problem for Cython, though. >> >> How is Cython support for generic types these days? One wouldn't want >> to have to write separate versions for each dtype... >> > > It could be made part of the _sortmodule without much trouble, but if you > are looking at the lower 30% I doubt it would do much more that buy you a > factor of 2x. Would that matter? That sounds fantastic! From ndbecker2 at gmail.com Wed Dec 2 22:12:31 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 02 Dec 2009 22:12:31 -0500 Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array References: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com> <42CBDE9D-0225-4BB0-A2FF-2430105B2702@cs.toronto.edu> Message-ID: Keith Goodman wrote: ... > Oh, I thought he meant there was a numpy function for partial sorting. > Actually, I do use this myself. My code is a boost::python wrapper or the std::partial_sum using pyublas. Here's the main pieces: template inline out_t partial_sum (in_t const& in) { out_t out (boost::size (in)); std::partial_sum (boost::begin (in), boost::end (in), boost::begin (out)); return out; } ... def ("partial_sum", &partial_sum,pyublas::numpy_strided_vector >); From ndbecker2 at gmail.com Wed Dec 2 22:15:18 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 02 Dec 2009 22:15:18 -0500 Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array References: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com> <42CBDE9D-0225-4BB0-A2FF-2430105B2702@cs.toronto.edu> Message-ID: Neal Becker wrote: > Keith Goodman wrote: > ... >> Oh, I thought he meant there was a numpy function for partial sorting. >> > Actually, I do use this myself. My code is a boost::python wrapper or > the std::partial_sum using pyublas. Here's the main pieces: > > template > inline out_t partial_sum (in_t const& in) { > out_t out (boost::size (in)); > std::partial_sum (boost::begin (in), boost::end (in), boost::begin > (out)); > return out; > } > ... > def ("partial_sum", > &partial_sum,pyublas::numpy_strided_vector >>); Oops, sorry, that's the wrong one (that was partial_sum, not partial_sort). I don't have a wrapper for that one, but it would probably be easy enough to do with the same tools as above. 
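For anyone confused by the STL names: std::partial_sum, despite the
name, is the running cumulative sum, i.e. the same operation as
numpy's cumsum, while std::partial_sort is the genuinely different
operation discussed above. A quick illustrative check:

>>> import numpy as np
>>> np.cumsum([1, 2, 3, 4])
array([ 1,  3,  6, 10])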
From kwgoodman at gmail.com Wed Dec 2 22:23:47 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 2 Dec 2009 19:23:47 -0800 Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array In-Reply-To: References: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com> <42CBDE9D-0225-4BB0-A2FF-2430105B2702@cs.toronto.edu> Message-ID: On Wed, Dec 2, 2009 at 7:15 PM, Neal Becker wrote: > Neal Becker wrote: > >> Keith Goodman wrote: >> ... >>> Oh, I thought he meant there was a numpy function for partial > sorting. >>> >> Actually, I do use this myself. ?My code is a boost::python wrapper > or >> the std::partial_sum using pyublas. ?Here's the main pieces: >> >> template >> inline out_t partial_sum (in_t const& in) { >> ? out_t out (boost::size (in)); >> ? std::partial_sum (boost::begin (in), boost::end (in), boost::begin >> (out)); >> ? return out; >> } >> ... >> ? def ("partial_sum", >> > &partial_sum,pyublas::numpy_strided_vector >>>); > > Oops, sorry, that's the wrong one (that was partial_sum, not > partial_sort). ?I don't have a wrapper for that one, but it would > probably be easy enough to do with the same tools as above. Is a partial sum a cumsum? How does the speed of your code above compare to numpy's cumsum? >> y = np.random.rand(250000) >> timeit y.cumsum() 1000 loops, best of 3: 1.05 ms per loop From ndbecker2 at gmail.com Wed Dec 2 22:26:29 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 02 Dec 2009 22:26:29 -0500 Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array References: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com> <42CBDE9D-0225-4BB0-A2FF-2430105B2702@cs.toronto.edu> Message-ID: Neal Becker wrote: > Keith Goodman wrote: > ... >> Oh, I thought he meant there was a numpy function for partial sorting. >> Try this one: template inline void partial_sort (in_t in, int n_el) { std::partial_sort (boost::begin (in), boost::begin(in) + n_el, boost::end (in)); } ... def ("partial_sort", &partial_sort >); def ("partial_sort", &partial_sort >); def ("partial_sort", &partial_sort >); --------- import pyublas import numpy as np u = np.arange (20)[::-1] from numpy_fncs import partial_sort partial_sort (u, 4) In [2]: u Out[2]: array([ 0, 1, 2, 3, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4]) From ndbecker2 at gmail.com Wed Dec 2 22:31:46 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 02 Dec 2009 22:31:46 -0500 Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array References: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com> <42CBDE9D-0225-4BB0-A2FF-2430105B2702@cs.toronto.edu> Message-ID: Keith Goodman wrote: > On Wed, Dec 2, 2009 at 7:15 PM, Neal Becker wrote: >> Neal Becker wrote: >> >>> Keith Goodman wrote: >>> ... >>>> Oh, I thought he meant there was a numpy function for partial >> sorting. >>>> >>> Actually, I do use this myself. My code is a boost::python wrapper >> or >>> the std::partial_sum using pyublas. Here's the main pieces: >>> >>> template >>> inline out_t partial_sum (in_t const& in) { >>> out_t out (boost::size (in)); >>> std::partial_sum (boost::begin (in), boost::end (in), boost::begin >>> (out)); >>> return out; >>> } >>> ... >>> def ("partial_sum", >>> >> &partial_sum,pyublas::numpy_strided_vector >>>>); >> >> Oops, sorry, that's the wrong one (that was partial_sum, not >> partial_sort). 
I don't have a wrapper for that one, but it would
>> probably be easy enough to do with the same tools as above.
>
> Is a partial sum a cumsum? How does the speed of your code above
> compare to numpy's cumsum?
>
>>> y = np.random.rand(250000)
>>> timeit y.cumsum()
> 1000 loops, best of 3: 1.05 ms per loop

timeit y.cumsum()
1000 loops, best of 3: 1.08 ms per loop
from numpy_fncs import partial_sum
: timeit partial_sum(y)
1000 loops, best of 3: 554 us per loop

From kwgoodman at gmail.com Wed Dec 2 22:42:33 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Wed, 2 Dec 2009 19:42:33 -0800
Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array
In-Reply-To: 
References: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com>
	<42CBDE9D-0225-4BB0-A2FF-2430105B2702@cs.toronto.edu>
Message-ID: 

On Wed, Dec 2, 2009 at 7:31 PM, Neal Becker wrote:
> Keith Goodman wrote:
>
>> On Wed, Dec 2, 2009 at 7:15 PM, Neal Becker wrote:
>>> Neal Becker wrote:
>>>
>>>> Keith Goodman wrote:
>>>> ...
>>>>> Oh, I thought he meant there was a numpy function for partial
>>>>> sorting.
>>>>>
>>>> Actually, I do use this myself. My code is a boost::python wrapper
>>>> or the std::partial_sum using pyublas. Here's the main pieces:
>>>>
>>>> template
>>>> inline out_t partial_sum (in_t const& in) {
>>>>   out_t out (boost::size (in));
>>>>   std::partial_sum (boost::begin (in), boost::end (in), boost::begin
>>>>   (out));
>>>>   return out;
>>>> }
>>>> ...
>>>> def ("partial_sum",
>>>> &partial_sum,pyublas::numpy_strided_vector >);
>>>
>>> Oops, sorry, that's the wrong one (that was partial_sum, not
>>> partial_sort). I don't have a wrapper for that one, but it would
>>> probably be easy enough to do with the same tools as above.
>>
>> Is a partial sum a cumsum? How does the speed of your code above
>> compare to numpy's cumsum?
>>
>>>> y = np.random.rand(250000)
>>>> timeit y.cumsum()
>> 1000 loops, best of 3: 1.05 ms per loop
> timeit y.cumsum()
> 1000 loops, best of 3: 1.08 ms per loop
> from numpy_fncs import partial_sum
> : timeit partial_sum(y)
> 1000 loops, best of 3: 554 us per loop

Nice.

From yogeshkarpate at gmail.com Thu Dec 3 00:35:07 2009
From: yogeshkarpate at gmail.com (yogesh karpate)
Date: Thu, 3 Dec 2009 11:05:07 +0530
Subject: [Numpy-discussion] non-standard standard deviation
In-Reply-To: <3d375d730912021229g10d80116hb6341da8bb97bee3@mail.gmail.com>
References: <26566843.post@talk.nabble.com> <4B131239.7080801@ncf.ca>
	<2d5132a50911291715k703a0b32t1deffe67cf8947e2@mail.gmail.com>
	<4B16B103.6070600@ncf.ca>
	<3d375d730912021229g10d80116hb6341da8bb97bee3@mail.gmail.com>
Message-ID: <703777c60912022135v3074ee03mb384359cb6b32622@mail.gmail.com>

The thing is that the normalization by (n-1) is done for sample sizes
above about 20 or 23 (I am not sure of the exact number, but it is not
greater than 25); below that, we use normalization by n.
Regards
~ymk

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From newptcai at gmail.com Thu Dec 3 00:40:30 2009
From: newptcai at gmail.com (Peter Cai)
Date: Thu, 3 Dec 2009 13:40:30 +0800
Subject: [Numpy-discussion] How to solve homogeneous linear equations with NumPy?
Message-ID: 

How to solve homogeneous linear equations with NumPy?

If I have homogeneous linear equations like this

array([[-0.75,  0.25,  0.25,  0.25],
       [ 1.  , -1.  ,  0.  ,  0.  ],
       [ 1.  ,  0.  , -1.  ,  0.  ],
       [ 1.  ,  0.  ,  0.  , -1.  ]])

And I want to get a non-zero solution for it. How can it be done with
NumPy?
linalg.solve only works on A * x = b where b does not contain only 0.

--
look to the things around you, the immediate world around you, if you
are alive, it will mean something to you
-- Paul Strand

From charlesr.harris at gmail.com Thu Dec 3 01:04:05 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 2 Dec 2009 23:04:05 -0700
Subject: [Numpy-discussion] How to solve homogeneous linear equations with NumPy?
In-Reply-To: 
References: 
Message-ID: 

On Wed, Dec 2, 2009 at 10:40 PM, Peter Cai wrote:

> How to solve homogeneous linear equations with NumPy?
>
> If I have homogeneous linear equations like this
>
> array([[-0.75,  0.25,  0.25,  0.25],
>        [ 1.  , -1.  ,  0.  ,  0.  ],
>        [ 1.  ,  0.  , -1.  ,  0.  ],
>        [ 1.  ,  0.  ,  0.  , -1.  ]])
>
> And I want to get a non-zero solution for it. How can it be done with
> NumPy?
>
> linalg.solve only works on A * x = b where b does not contain only 0.

One way is to use the singular value decomposition

In [16]: a = array([[-0.75,  0.25,  0.25,  0.25],
                    [ 1.  , -1.  ,  0.  ,  0.  ],
                    [ 1.  ,  0.  , -1.  ,  0.  ],
                    [ 1.  ,  0.  ,  0.  , -1.  ]])

In [20]: l,v,r = svd(a)

In [21]: v
Out[21]:
array([  2.17944947e+00,   1.00000000e+00,   1.00000000e+00,
         1.11022302e-16])

In [22]: dot(a,r[-1])
Out[22]:
array([ -6.93889390e-17,   5.55111512e-17,   1.11022302e-16,
         1.11022302e-16])

Chuck

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From nadavh at visionsense.com Thu Dec 3 01:34:34 2009 From: nadavh at visionsense.com (Nadav Horesh) Date: Thu, 3 Dec 2009 08:34:34 +0200 Subject: [Numpy-discussion] Find the N maximum values and correspondingindexes in an array References: <5861ec420912021555j1caf295by5ef3c0eb8084bf3a@mail.gmail.com><42CBDE9D-0225-4BB0-A2FF-2430105B2702@cs.toronto.edu> Message-ID: <710F2847B0018641891D9A21602763605AD248@ex3.envision.co.il> Is it relevant to scipy.stats.scoreatpercentile? There is a comment is that function: # TODO: this should be a simple wrapper around a well-written quantile # function. GNU R provides 9 quantile algorithms (!), with differing # behaviour at, for example, discontinuities. Nadav. -----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of Neal Becker Sent: Thu 03-Dec-09 05:26 To: numpy-discussion at scipy.org Subject: Re: [Numpy-discussion] Find the N maximum values and correspondingindexes in an array Neal Becker wrote: > Keith Goodman wrote: > ... >> Oh, I thought he meant there was a numpy function for partial sorting. >> Try this one: template inline void partial_sort (in_t in, int n_el) { std::partial_sort (boost::begin (in), boost::begin(in) + n_el, boost::end (in)); } ... def ("partial_sort", &partial_sort >); def ("partial_sort", &partial_sort >); def ("partial_sort", &partial_sort >); --------- import pyublas import numpy as np u = np.arange (20)[::-1] from numpy_fncs import partial_sort partial_sort (u, 4) In [2]: u Out[2]: array([ 0, 1, 2, 3, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4]) _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 3850 bytes Desc: not available URL: From cournape at gmail.com Thu Dec 3 03:31:20 2009 From: cournape at gmail.com (David Cournapeau) Date: Thu, 3 Dec 2009 17:31:20 +0900 Subject: [Numpy-discussion] Import numpy fails on cygwin python In-Reply-To: <599716.6028.qm@web51003.mail.re2.yahoo.com> References: <679405.12948.qm@web51002.mail.re2.yahoo.com> <4B162BA1.2040106@ar.media.kyoto-u.ac.jp> <599716.6028.qm@web51003.mail.re2.yahoo.com> Message-ID: <5b8d13220912030031l46e8f011je0a1e1844a1b19d5@mail.gmail.com> On Thu, Dec 3, 2009 at 5:41 AM, Olivia Cheronet wrote: > ----- Original Message ---- >> From: David Cournapeau >> >> Does the file >> /usr/lib/python2.5/site-packages/numpy/linalg/lapack_lite.so exist ? >> >> cheers, >> >> David > > > Indeed, this file is not there. Where can I find it? It should have been installed - I would be really surprised if the install succeeded without installing this file. Can you post the full build log when you install numpy. Something like: python setup.py install &> build.log After having removed the build directory (rm -rf build). cheers, David From pav+sp at iki.fi Thu Dec 3 04:36:09 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Thu, 3 Dec 2009 09:36:09 +0000 (UTC) Subject: [Numpy-discussion] Bytes vs. 
Unicode in Python3 References: <1259276898.8494.18.camel@idol> <200911271633.47281.faltet@pytables.org> <1259336464.4110.825.camel@talisman> <200911271704.57927.faltet@pytables.org> <4B10508E.4040200@student.matnat.uio.no> Message-ID: Fri, 27 Nov 2009 23:19:58 +0100, Dag Sverre Seljebotn wrote: [clip] > One thing to keep in mind here is that PEP 3118 actually defines a > standard dtype format string, which is (mostly) incompatible with > NumPy's. It should probably be supported as well when PEP 3118 is > implemented. PEP 3118 is for the most part implemented in my Py3K branch now -- it was not actually much work, as I could steal most of the format string converter from numpy.pxd. Some questions: How hard do we want to try supplying a buffer? Eg. if the consumer does not specify strided but specifies suboffsets, should we try to compute suitable suboffsets? Should we try making contiguous copies of the data (I guess this would break buffer semantics?)? > Just something to keep in the back of ones mind when discussing this. > For instance one could, instead of inventing something new, adopt the > characters PEP 3118 uses (if there isn't a conflict): > > - b: Raw byte > - c: ucs-1 encoding (latin 1, one byte) > - u: ucs-2 encoding, two bytes > - w: ucs-4 encoding, four bytes The 'b' character is already taken so we can't easily use that. 'y' would be free for bYtes, however. > Long-term I hope the NumPy-specific format string will be deprecated, so > that repr print out the PEP 3118 format string etc. But, I'm aware that > API breakage shouldn't happen when porting to Python 3. Agreed. A global switch could in principle be added for this, maybe -- the type codes are for the most part stored in a dict in numerictypes.py and could probably be easily replaced runtime. -- Pauli Virtanen From pav+sp at iki.fi Thu Dec 3 04:39:30 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Thu, 3 Dec 2009 09:39:30 +0000 (UTC) Subject: [Numpy-discussion] Python 3K merge References: <5b8d13220912010031w3945441cj3caed84d661f51b5@mail.gmail.com> Message-ID: Tue, 01 Dec 2009 17:31:10 +0900, David Cournapeau wrote: > On Tue, Dec 1, 2009 at 5:04 AM, Charles R Harris > wrote: >> It looks like you doing great stuff with the py3k transition. Do you >> and David have any sort of merge schedule in mind? > > I have updated my py3k branch for numpy.distutils, and it is ready to > merge: > > http://github.com/cournape/numpy/tree/py3k_bootstrap_take3 > > I have not thoroughly tested it, but it can run on both 2.4 and 3.1 on > Linux at least. The patch is much smaller than my previous attempts as > well, so I would just push it to the trunk, and deal with the issues as > they come. I think I should rebase my branch on this, or vice versa, to avoid further duplicated work. I think most of my changes would be ready for SVN, after rebasing and regrouping via rebase -i -- they do not affect behavior on Py2, and for the most part the changes required in C code are quite obvious. The largest changes are probably related to the 'S' data type. In other news, we cannot support Py2 pickles in Py3 -- this is because Py2 str is unpickled as Py3 str, resulting to encoding failures even before the data is passed on to Numpy. But in any case, Py3 support for Numpy 1.5.0 seems a completely realistic plan. 
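One mitigation worth noting here (it comes up again below): Py3's
pickle.load accepts an encoding argument, and encoding='latin1' maps
each Py2 str byte to the same code point, so binary payloads can be
recovered. A sketch -- the file name is made up:

# Python 3, reading a pickle written by Python 2
import pickle

with open('py2_data.pkl', 'rb') as f:
    data = pickle.load(f, encoding='latin1')  # byte i -> code point i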
-- Pauli Virtanen From david at ar.media.kyoto-u.ac.jp Thu Dec 3 04:23:28 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 03 Dec 2009 18:23:28 +0900 Subject: [Numpy-discussion] Python 3K merge In-Reply-To: References: <5b8d13220912010031w3945441cj3caed84d661f51b5@mail.gmail.com> Message-ID: <4B178390.7050006@ar.media.kyoto-u.ac.jp> Pauli Virtanen wrote: > > I think I should rebase my branch on this, or vice versa, to avoid > further duplicated work. > I think I will just commit my branch to the trunk once ASAP - I expect more breakage from my code than yours, and the sooner the better for distutils-related changes. cheers, David From pav+sp at iki.fi Thu Dec 3 05:01:36 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Thu, 3 Dec 2009 10:01:36 +0000 (UTC) Subject: [Numpy-discussion] Python 3K merge References: <5b8d13220912010031w3945441cj3caed84d661f51b5@mail.gmail.com> <4B178390.7050006@ar.media.kyoto-u.ac.jp> Message-ID: Thu, 03 Dec 2009 18:23:28 +0900, David Cournapeau wrote: > Pauli Virtanen wrote: >> I think I should rebase my branch on this, or vice versa, to avoid >> further duplicated work. > > I think I will just commit my branch to the trunk once ASAP - I expect > more breakage from my code than yours, and the sooner the better for > distutils-related changes. Ok, I'll follow that up with the more innocuous changesets: 1) Stuff needed to make Numpy C modules to build on Py3K 2) The evil 2to3 autoconversion hack, which we may want to get rid of in the long run. 3) PEP 3118 4) Obvious PyBytes vs. PyUnicode changes I'll try to get this done ASAP, too, after the distutils stuff is in. -- Pauli Virtanen From renesd at gmail.com Thu Dec 3 07:04:59 2009 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Thu, 3 Dec 2009 13:04:59 +0100 Subject: [Numpy-discussion] Python 3K merge In-Reply-To: References: <5b8d13220912010031w3945441cj3caed84d661f51b5@mail.gmail.com> Message-ID: <64ddb72c0912030404kd647b1i32d80ec1c1691963@mail.gmail.com> On Thu, Dec 3, 2009 at 10:39 AM, Pauli Virtanen > wrote: > Tue, 01 Dec 2009 17:31:10 +0900, David Cournapeau wrote: > > On Tue, Dec 1, 2009 at 5:04 AM, Charles R Harris > > wrote: > >> It looks like you doing great stuff with the py3k transition. Do you > >> and David have any sort of merge schedule in mind? > > > > I have updated my py3k branch for numpy.distutils, and it is ready to > > merge: > > > > http://github.com/cournape/numpy/tree/py3k_bootstrap_take3 > > > > I have not thoroughly tested it, but it can run on both 2.4 and 3.1 on > > Linux at least. The patch is much smaller than my previous attempts as > > well, so I would just push it to the trunk, and deal with the issues as > > they come. > > I think I should rebase my branch on this, or vice versa, to avoid > further duplicated work. > > I think most of my changes would be ready for SVN, after rebasing and > regrouping via rebase -i -- they do not affect behavior on Py2, and for > the most part the changes required in C code are quite obvious. > > The largest changes are probably related to the 'S' data type. > > In other news, we cannot support Py2 pickles in Py3 -- this is because > Py2 str is unpickled as Py3 str, resulting to encoding failures even > before the data is passed on to Numpy. > > Is this just for the type codes? Or is there other string data that needs to be pickle loaded? If it is just for the type codes, they are all within the ansi character set and unpickle fine without errors. I'm guessing numpy uses strings to pickle arrays? 
Note that the pickle module is extensible, so we might be able to get
it to special-case things. You can subclass Unpickler to make
extensions... and there are other techniques. Or it's even possible to
submit patches to python if we have a need for something it doesn't
support.

It is even possible to change the pickle code for py2, so that
py3-compatible pickles are saved. In this case it would just require
people to load, and resave, their pickles with the latest numpy
version.

Using the python array module to store data might be the way to go
(rather than strings), since that is available in both py2 and py3.

The pickling/unpickling situation should be marked as a todo and
documented anyway, as we should start a numpy-specific 'porting your
code to py3k' document.

A set of pickles saved from python2 would be useful for testing.
Forwards compatibility is also a useful thing to test: that is, py3.1
pickles saved to be loaded with python2 numpy.

cheers!

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pav at iki.fi Thu Dec 3 07:30:24 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Thu, 03 Dec 2009 14:30:24 +0200
Subject: [Numpy-discussion] Python 3K merge
In-Reply-To: <64ddb72c0912030404kd647b1i32d80ec1c1691963@mail.gmail.com>
References: <5b8d13220912010031w3945441cj3caed84d661f51b5@mail.gmail.com>
	<64ddb72c0912030404kd647b1i32d80ec1c1691963@mail.gmail.com>
Message-ID: <1259843424.7680.17.camel@talisman>

On Thu, 2009-12-03 at 13:04 +0100, René Dudfield wrote:
[clip]
> > In other news, we cannot support Py2 pickles in Py3 -- this is
> > because Py2 str is unpickled as Py3 str, resulting to encoding
> > failures even before the data is passed on to Numpy.
>
> Is this just for the type codes? Or is there other string data that
> needs to be pickle loaded? If it is just for the type codes, they are
> all within the ansi character set and unpickle fine without errors.
> I'm guessing numpy uses strings to pickle arrays?

The array data is put in a string in __reduce__. The dtype is IIRC
mostly stored using integers, though endianness is stored with a
character.

Actually, now that I look more closely, Py3 pickle.load takes an
'encoding' argument, which will perhaps help here. We should probably
just instruct users to pass 'latin1' there in Py3 if they want
backwards compatibility. The Numpy __reduce__ and __setstate__ C code
must then just be checked for compatibility.

[clip]
> Using the python array module to store data might be the way to
> go (rather than strings), since that is available in both py2 and py3.

The array module has the same problem as Numpy, so using it will not
help:

$ python
Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
>>> import array, pickle
>>> c = array.array('b', '123öä')
>>> c
array('b', [49, 50, 51, -61, -74, -61, -92])
>>> f = open('foo.pck', 'w'); pickle.dump(c, f); f.close()

$ python3
Python 3.0.1+ (r301:69556, Apr 15 2009, 15:59:22)
>>> import pickle
>>> f = open('foo.pck', 'rb')
>>> pickle.load(f)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.0/pickle.py", line 1335, in load
    return Unpickler(file, encoding=encoding, errors=errors).load()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3:
ordinal not in range(128)

The 'encoding' argument does not actually help array module, but that
may be just because of some incompatible __setstate__ stuff in 'array'.

[clip]
> Forwards compatibility is also a useful thing to test. That is py3.1 > pickles saved to be loaded with python2 numpy. In Py3 it would be very convenient to __getstate__ the array data in Bytes (e.g. space savings!), which will be forward incompatible, unless the Py2 side has a custom unpickler. -- Pauli Virtanen From cjw at ncf.ca Thu Dec 3 07:49:57 2009 From: cjw at ncf.ca (Colin J. Williams) Date: Thu, 03 Dec 2009 07:49:57 -0500 Subject: [Numpy-discussion] non-standard standard deviation In-Reply-To: <703777c60912022135v3074ee03mb384359cb6b32622@mail.gmail.com> References: <26566843.post@talk.nabble.com> <4B131239.7080801@ncf.ca> <2d5132a50911291715k703a0b32t1deffe67cf8947e2@mail.gmail.com> <4B16B103.6070600@ncf.ca> <3d375d730912021229g10d80116hb6341da8bb97bee3@mail.gmail.com> <703777c60912022135v3074ee03mb384359cb6b32622@mail.gmail.com> Message-ID: <4B17B3F5.60409@ncf.ca> Yogesh, Could you explain the rationale for this choice please? Colin W. On 03-Dec-09 00:35 AM, yogesh karpate wrote: > The thing is that the normalization by (n-1) is done for the no. of > samples >20 or23(Not sure about this no. but sure about the thing that > this no isnt greater than 25) and below that we use normalization by n. > Regards > ~ymk > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From dagss at student.matnat.uio.no Thu Dec 3 08:03:13 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 03 Dec 2009 14:03:13 +0100 Subject: [Numpy-discussion] Bytes vs. Unicode in Python3 In-Reply-To: References: <1259276898.8494.18.camel@idol> <200911271633.47281.faltet@pytables.org> <1259336464.4110.825.camel@talisman> <200911271704.57927.faltet@pytables.org> <4B10508E.4040200@student.matnat.uio.no> Message-ID: <4B17B711.5050001@student.matnat.uio.no> Pauli Virtanen wrote: > Fri, 27 Nov 2009 23:19:58 +0100, Dag Sverre Seljebotn wrote: > [clip] > >> One thing to keep in mind here is that PEP 3118 actually defines a >> standard dtype format string, which is (mostly) incompatible with >> NumPy's. It should probably be supported as well when PEP 3118 is >> implemented. >> > > PEP 3118 is for the most part implemented in my Py3K branch now -- it was > not actually much work, as I could steal most of the format string > converter from numpy.pxd. > Great! Are you storing the format string in the dtype types as well? (So that no release is needed and acquisitions are cheap...) As far as numpy.pxd goes -- well, for the simplest dtypes. > Some questions: > > How hard do we want to try supplying a buffer? Eg. if the consumer does > not specify strided but specifies suboffsets, should we try to compute > suitable suboffsets? Should we try making contiguous copies of the data > (I guess this would break buffer semantics?)? > Actually per the PEP, suboffsets imply strided: #define PyBUF_INDIRECT (0x0100 | PyBUF_STRIDES) :-) So there's no real way for a consumer to specify only suboffsets, 0x0100 is not a possible flag I think. Suboffsets can't really work without the strides anyway IIUC, and in the case of NumPy the field can always be left at 0. IMO one should very much stay clear of making contiguous copies, especially considering the existance of PyBuffer_ToContiguous, which makes it trivial for client code to get a pointer to a contiguous buffer anyway. The intention of the PEP seems to be to export the buffer in as raw form as possible. 
Do keep in mind that IS_C_CONTIGUOUS and IS_F_CONTIGUOUS go be too conservative with NumPy arrays. If a contiguous buffer is requested, then looping through the strides and checking that the strides are monotonically decreasing/increasing could eventually save copying in some cases. I think that could be worth it -- I actually have my own code for IS_F_CONTIGUOUS rather than relying on the flags personally because of this issue, so it does come up in practice. Dag Sverre From dagss at student.matnat.uio.no Thu Dec 3 08:05:51 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 03 Dec 2009 14:05:51 +0100 Subject: [Numpy-discussion] Bytes vs. Unicode in Python3 In-Reply-To: <4B17B711.5050001@student.matnat.uio.no> References: <1259276898.8494.18.camel@idol> <200911271633.47281.faltet@pytables.org> <1259336464.4110.825.camel@talisman> <200911271704.57927.faltet@pytables.org> <4B10508E.4040200@student.matnat.uio.no> <4B17B711.5050001@student.matnat.uio.no> Message-ID: <4B17B7AF.6040301@student.matnat.uio.no> Dag Sverre Seljebotn wrote: > Pauli Virtanen wrote: > >> Fri, 27 Nov 2009 23:19:58 +0100, Dag Sverre Seljebotn wrote: >> [clip] >> >> >>> One thing to keep in mind here is that PEP 3118 actually defines a >>> standard dtype format string, which is (mostly) incompatible with >>> NumPy's. It should probably be supported as well when PEP 3118 is >>> implemented. >>> >>> >> PEP 3118 is for the most part implemented in my Py3K branch now -- it was >> not actually much work, as I could steal most of the format string >> converter from numpy.pxd. >> >> > Great! Are you storing the format string in the dtype types as well? (So > that no release is needed and acquisitions are cheap...) > > As far as numpy.pxd goes -- well, for the simplest dtypes. > >> Some questions: >> >> How hard do we want to try supplying a buffer? Eg. if the consumer does >> not specify strided but specifies suboffsets, should we try to compute >> suitable suboffsets? Should we try making contiguous copies of the data >> (I guess this would break buffer semantics?)? >> >> > Actually per the PEP, suboffsets imply strided: > > #define PyBUF_INDIRECT (0x0100 | PyBUF_STRIDES) > > :-) So there's no real way for a consumer to specify only suboffsets, > 0x0100 is not a possible flag I think. Suboffsets can't really work > without the strides anyway IIUC, and in the case of NumPy the field can > always be left at 0. > That is, NULL! > IMO one should very much stay clear of making contiguous copies, > especially considering the existance of PyBuffer_ToContiguous, which > makes it trivial for client code to get a pointer to a contiguous buffer > anyway. The intention of the PEP seems to be to export the buffer in as > raw form as possible. > > Do keep in mind that IS_C_CONTIGUOUS and IS_F_CONTIGUOUS go be too > conservative with NumPy arrays. If a contiguous buffer is requested, > then looping through the strides and checking that the strides are > monotonically decreasing/increasing could eventually save copying in > some cases. I think that could be worth it -- I actually have my own > And, of course, that the innermost stride is 1. > code for IS_F_CONTIGUOUS rather than relying on the flags personally > because of this issue, so it does come up in practice. 
> > Dag Sverre > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From dagss at student.matnat.uio.no Thu Dec 3 08:10:27 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 03 Dec 2009 14:10:27 +0100 Subject: [Numpy-discussion] Bytes vs. Unicode in Python3 In-Reply-To: <4B17B7AF.6040301@student.matnat.uio.no> References: <1259276898.8494.18.camel@idol> <200911271633.47281.faltet@pytables.org> <1259336464.4110.825.camel@talisman> <200911271704.57927.faltet@pytables.org> <4B10508E.4040200@student.matnat.uio.no> <4B17B711.5050001@student.matnat.uio.no> <4B17B7AF.6040301@student.matnat.uio.no> Message-ID: <4B17B8C3.1060207@student.matnat.uio.no> Dag Sverre Seljebotn wrote: > Dag Sverre Seljebotn wrote: > >> Pauli Virtanen wrote: >> >> >>> Fri, 27 Nov 2009 23:19:58 +0100, Dag Sverre Seljebotn wrote: >>> [clip] >>> >>> >>> >>>> One thing to keep in mind here is that PEP 3118 actually defines a >>>> standard dtype format string, which is (mostly) incompatible with >>>> NumPy's. It should probably be supported as well when PEP 3118 is >>>> implemented. >>>> >>>> >>>> >>> PEP 3118 is for the most part implemented in my Py3K branch now -- it was >>> not actually much work, as I could steal most of the format string >>> converter from numpy.pxd. >>> >>> >>> >> Great! Are you storing the format string in the dtype types as well? (So >> that no release is needed and acquisitions are cheap...) >> >> As far as numpy.pxd goes -- well, for the simplest dtypes. >> >> >>> Some questions: >>> >>> How hard do we want to try supplying a buffer? Eg. if the consumer does >>> not specify strided but specifies suboffsets, should we try to compute >>> suitable suboffsets? Should we try making contiguous copies of the data >>> (I guess this would break buffer semantics?)? >>> >>> >>> >> Actually per the PEP, suboffsets imply strided: >> >> #define PyBUF_INDIRECT (0x0100 | PyBUF_STRIDES) >> >> :-) So there's no real way for a consumer to specify only suboffsets, >> 0x0100 is not a possible flag I think. Suboffsets can't really work >> without the strides anyway IIUC, and in the case of NumPy the field can >> always be left at 0. >> >> > That is, NULL! > >> IMO one should very much stay clear of making contiguous copies, >> especially considering the existance of PyBuffer_ToContiguous, which >> makes it trivial for client code to get a pointer to a contiguous buffer >> anyway. The intention of the PEP seems to be to export the buffer in as >> raw form as possible. >> >> Do keep in mind that IS_C_CONTIGUOUS and IS_F_CONTIGUOUS go be too >> conservative with NumPy arrays. If a contiguous buffer is requested, >> then looping through the strides and checking that the strides are >> monotonically decreasing/increasing could eventually save copying in >> some cases. I think that could be worth it -- I actually have my own >> >> > And, of course, that the innermost stride is 1. > Aargh. Some day I'll find/implement a 10 minute send delay for my email program, so I'll catch my errors before the emails go out... Anyway, this is not sufficient, one must also check correspondance with shape, of course. Dag Sverre >> code for IS_F_CONTIGUOUS rather than relying on the flags personally >> because of this issue, so it does come up in practice. 
>> >> Dag Sverre >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pav+sp at iki.fi Thu Dec 3 08:29:50 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Thu, 3 Dec 2009 13:29:50 +0000 (UTC) Subject: [Numpy-discussion] Bytes vs. Unicode in Python3 References: <1259276898.8494.18.camel@idol> <200911271633.47281.faltet@pytables.org> <1259336464.4110.825.camel@talisman> <200911271704.57927.faltet@pytables.org> <4B10508E.4040200@student.matnat.uio.no> <4B17B711.5050001@student.matnat.uio.no> Message-ID: Thu, 03 Dec 2009 14:03:13 +0100, Dag Sverre Seljebotn wrote: [clip] > Great! Are you storing the format string in the dtype types as well? (So > that no release is needed and acquisitions are cheap...) I regenerate it on each buffer acquisition. It's simple low-level C code, and I suspect it will always be fast enough. Of course, we could *cache* the result in the dtype. (If dtypes are immutable, which I don't remember right now.) Do you have a case in mind where the speed of format string generation would be a bottleneck? >> Some questions: >> >> How hard do we want to try supplying a buffer? Eg. if the consumer does >> not specify strided but specifies suboffsets, should we try to compute >> suitable suboffsets? Should we try making contiguous copies of the data >> (I guess this would break buffer semantics?)? >> > Actually per the PEP, suboffsets imply strided: > > #define PyBUF_INDIRECT (0x0100 | PyBUF_STRIDES) > > :-) So there's no real way for a consumer to specify only suboffsets, > 0x0100 is not a possible flag I think. Suboffsets can't really work > without the strides anyway IIUC, and in the case of NumPy the field can > always be left at 0. Ok, great! > IMO one should very much stay clear of making contiguous copies, > especially considering the existance of PyBuffer_ToContiguous, which > makes it trivial for client code to get a pointer to a contiguous buffer > anyway. The intention of the PEP seems to be to export the buffer in as > raw form as possible. This is what I thought, too. > Do keep in mind that IS_C_CONTIGUOUS and IS_F_CONTIGUOUS go be too > conservative with NumPy arrays. If a contiguous buffer is requested, > then looping through the strides and checking that the strides are > monotonically decreasing/increasing could eventually save copying in > some cases. I think that could be worth it -- I actually have my own > code for IS_F_CONTIGUOUS rather than relying on the flags personally > because of this issue, so it does come up in practice. Are you sure? Assume monotonically increasing or decreasing strides with inner stride of itemsize. Now, if the strides are not C or F-contiguous, doesn't this imply that part of the data in the memory block is *not* pointed to by a set of indices? [For example, strides = {itemsize, 3*itemsize}; dims = {2, 2}. Now, there is unused memory between items (1,0) and (0,1).] This probably boils down to what exactly was meant in the PEP and Python docs by "contiguous". I'd believe it was meant to be the same as in Numpy -- that you can send the array data e.g. to Fortran as-is. If so, there should not be gaps in the data, if the client explicitly requested that the buffer be contiguous. 
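The bracketed example above is easy to check numerically with
as_strided from numpy.lib.stride_tricks (illustrative; assumes an
8-byte itemsize):

>>> import numpy as np
>>> from numpy.lib.stride_tricks import as_strided
>>> base = np.arange(6, dtype=np.int64)
>>> a = as_strided(base, shape=(2, 2), strides=(8, 24))
>>> a
array([[0, 3],
       [1, 4]])

The strides increase monotonically and the inner stride equals the
itemsize, yet base[2] and base[5] are reachable from no index of a, so
the memory block has gaps and cannot be handed to Fortran as-is.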
Maybe you meant that the Numpy array flags (which the macros check) are not always up-to-date wrt. the stride information? -- Pauli Virtanen From dagss at student.matnat.uio.no Thu Dec 3 08:56:16 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 03 Dec 2009 14:56:16 +0100 Subject: [Numpy-discussion] Bytes vs. Unicode in Python3 In-Reply-To: References: <1259276898.8494.18.camel@idol> <200911271633.47281.faltet@pytables.org> <1259336464.4110.825.camel@talisman> <200911271704.57927.faltet@pytables.org> <4B10508E.4040200@student.matnat.uio.no> <4B17B711.5050001@student.matnat.uio.no> Message-ID: <4B17C380.4090501@student.matnat.uio.no> Pauli Virtanen wrote: > Thu, 03 Dec 2009 14:03:13 +0100, Dag Sverre Seljebotn wrote: > [clip] > >> Great! Are you storing the format string in the dtype types as well? (So >> that no release is needed and acquisitions are cheap...) >> > > I regenerate it on each buffer acquisition. It's simple low-level C code, > and I suspect it will always be fast enough. Of course, we could *cache* > the result in the dtype. (If dtypes are immutable, which I don't remember > right now.) > We discussed this at SciPy 09 -- basically, they are not necesarrily immutable in implementation, but anywhere they are not that is a bug and no code should depend on their mutability, so we are free to assume so. > Do you have a case in mind where the speed of format string generation > would be a bottleneck? > Going all the way down to user code; no. Well, contrived: You have a Python list of NumPy arrays and want to sum over the first element of each, acquiring the buffer by PEP 3118 (which is easy through Cython). In that case I can see all the memory allocation that must go on for each element for the format-string as a bottle-neck. But mostly it's from cleanliness of implementation, like the fact that you don't know up-front how long the string need to be for nested dtypes. Obviously, what you have done is much better than nothing, and probably sufficient for nearly all purposes, so I should stop complaining. > >> Do keep in mind that IS_C_CONTIGUOUS and IS_F_CONTIGUOUS go be too >> conservative with NumPy arrays. If a contiguous buffer is requested, >> then looping through the strides and checking that the strides are >> monotonically decreasing/increasing could eventually save copying in >> some cases. I think that could be worth it -- I actually have my own >> code for IS_F_CONTIGUOUS rather than relying on the flags personally >> because of this issue, so it does come up in practice. >> > > Are you sure? > > Assume monotonically increasing or decreasing strides with inner stride > of itemsize. Now, if the strides are not C or F-contiguous, doesn't this > imply that part of the data in the memory block is *not* pointed to by a > set of indices? [For example, strides = {itemsize, 3*itemsize}; dims = > {2, 2}. Now, there is unused memory between items (1,0) and (0,1).] > > This probably boils down to what exactly was meant in the PEP and Python > docs by "contiguous". I'd believe it was meant to be the same as in Numpy > -- that you can send the array data e.g. to Fortran as-is. If so, there > should not be gaps in the data, if the client explicitly requested that > the buffer be contiguous. > > Maybe you meant that the Numpy array flags (which the macros check) are > not always up-to-date wrt. the stride information? > Yep, this is what I meant, and the rest is wrong. 
But now that I think about it, the case that bit me is In [14]: np.arange(10)[None, None, :].flags.c_contiguous Out[14]: False I suppose this particular case could be fixed properly with little cost (if it isn't already). It is probably cleaner to just rely on the flags for PEP 3118, less confusion etc. Sorry for the distraction. Dag Sverre From lou_boog2000 at yahoo.com Thu Dec 3 09:24:41 2009 From: lou_boog2000 at yahoo.com (Lou Pecora) Date: Thu, 3 Dec 2009 06:24:41 -0800 (PST) Subject: [Numpy-discussion] How to solve homogeneous linear equations with NumPy? In-Reply-To: References: Message-ID: <456193.8364.qm@web34405.mail.mud.yahoo.com> From: Peter Cai To: Discussion of Numerical Python Sent: Thu, December 3, 2009 1:13:40 AM Subject: Re: [Numpy-discussion] How to solve homogeneous linear equations with NumPy? Thanks a lot. But my knowledge of linear equations is limited, so can you explain which result in your code represents the solution set? BTW: since [1, 1, 1, 1] is an obvious non-trivial solution, can you show that your method recovers it? As usual, Google is your friend. Also check on Wikipedia, Scholarpedia, and http://mathworld.wolfram.com/. If you are serious about getting a solution, then it is worth spending some time learning about linear systems. -- Lou Pecora, my views are my own.
> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- look to the things around you,the immediate world around you, if you are alive,it will mean something to you ??Paul Strand From david at ar.media.kyoto-u.ac.jp Thu Dec 3 10:33:44 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 04 Dec 2009 00:33:44 +0900 Subject: [Numpy-discussion] Python 3K merge In-Reply-To: References: <5b8d13220912010031w3945441cj3caed84d661f51b5@mail.gmail.com> <4B178390.7050006@ar.media.kyoto-u.ac.jp> Message-ID: <4B17DA58.4040100@ar.media.kyoto-u.ac.jp> Pauli Virtanen wrote: > Thu, 03 Dec 2009 18:23:28 +0900, David Cournapeau wrote: > >> Pauli Virtanen wrote: >> >>> I think I should rebase my branch on this, or vice versa, to avoid >>> further duplicated work. >>> >> I think I will just commit my branch to the trunk once ASAP - I expect >> more breakage from my code than yours, and the sooner the better for >> distutils-related changes. >> > > Ok, I'll follow that up with the more innocuous changesets Ok, the patch is being commited to the trunk right now. I have not thoroughly tested it, but at least python 2.4/2.6/3.1 all build (up to the first build failure for 3.1 of course), without the need to apply 2to3 to numpy.distutils. Also, do not forget the NPY_SEPARATE_COMPILATION option - it makes an appreciable difference when compiling partial build :) cheers, David From charlesr.harris at gmail.com Thu Dec 3 12:17:42 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 3 Dec 2009 10:17:42 -0700 Subject: [Numpy-discussion] How to solve homogeneous linear equations with NumPy? In-Reply-To: References: <456193.8364.qm@web34405.mail.mud.yahoo.com> Message-ID: On Thu, Dec 3, 2009 at 7:59 AM, Peter Cai wrote: > Thanks, I've read some explanations on wikipedia and finally found out > how to solve homogeneous equations by singular value decomposition. > > Note that the numpy svd doesn't quite conform to what you will see in those sources and the documentation is confusing. Numpy returns u,s,v and a = u*diag(s)*v, whereas the decomposition is normally written as u*diag(s)*v^T, i.e., the numpy v is the transpose (Hermitean conjugate) of the conventional v. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgchong at berkeley.edu Thu Dec 3 12:24:44 2009 From: hgchong at berkeley.edu (Howard Chong) Date: Thu, 3 Dec 2009 09:24:44 -0800 Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array Message-ID: <5861ec420912030924s3672fdcai49e47f5e55160eeb@mail.gmail.com> Thanks for all the help and suggestions. I think the partial sort is exactly what I need. I thought of doing it as a full sort with argsort(), but that would be much slower if I just need a small number (maybe 7) from a large array, potentially thousands or a million repeated many times. In case you are wondering, I am doing this to find the 7 nearest neighbors in a GIS. For a list of zipcodes in America, find the 7 nearest weather stations. > Message: 8 > Date: Wed, 02 Dec 2009 22:26:29 -0500 > From: Neal Becker > Subject: Re: [Numpy-discussion] Find the N maximum values and > corresponding indexes in an array > To: numpy-discussion at scipy.org > Message-ID: > Content-Type: text/plain; charset="ISO-8859-1" > > Neal Becker wrote: > > > Keith Goodman wrote: > > ... 
> >> Oh, I thought he meant there was a numpy function for partial > sorting. > >> > Try this one: > > template > inline void partial_sort (in_t in, int n_el) { > std::partial_sort (boost::begin (in), boost::begin(in) + n_el, > boost::end (in)); > } > ... > def ("partial_sort", > &partial_sort strided_vector >); > def ("partial_sort", > &partial_sort >); > def ("partial_sort", > &partial_sort >); > --------- > import pyublas > import numpy as np > u = np.arange (20)[::-1] > from numpy_fncs import partial_sort > partial_sort (u, 4) > In [2]: u > Out[2]: > array([ 0, 1, 2, 3, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, > 7, > 6, 5, 4]) > > > -- > Howard Chong > Dept. of Agricultural and Resource Economics and Energy Institute @ Haas > Business School > UC Berkeley > hgchong at berkeley.edu > Cell: 510-333-0539 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Dec 3 12:34:18 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 3 Dec 2009 10:34:18 -0700 Subject: [Numpy-discussion] Find the N maximum values and corresponding indexes in an array In-Reply-To: <5861ec420912030924s3672fdcai49e47f5e55160eeb@mail.gmail.com> References: <5861ec420912030924s3672fdcai49e47f5e55160eeb@mail.gmail.com> Message-ID: On Thu, Dec 3, 2009 at 10:24 AM, Howard Chong wrote: > Thanks for all the help and suggestions. I think the partial sort is > exactly what I need. > > I thought of doing it as a full sort with argsort(), but that would be much > slower if I just need a small number (maybe 7) from a large array, > potentially thousands or a million repeated many times. > > In case you are wondering, I am doing this to find the 7 nearest neighbors > in a GIS. For a list of zipcodes in America, find the 7 nearest weather > stations. > > Have you looked at the kdtree in scipy? It is aimed precisely at this sort of problem. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Thu Dec 3 16:07:02 2009 From: oliphant at enthought.com (Travis Oliphant) Date: Thu, 3 Dec 2009 15:07:02 -0600 Subject: [Numpy-discussion] Three bugs fixed Message-ID: Thanks to the reporters of tickets #1108, #1197 (similar to #1279), and #1222. Pointing out these problems allowed me to find and squash two subtle memory leaks and one just plain stupid bug lurking in reduce-at (when using the buffered internal loop). I think I fixed the problems. I added a test for #1108, but I don't have good tests for the memory leak fixes. I'm not sure how to write a good test for those. Even though it may take several months to get to the tickets with my limited time, please do continue to report problems. I'm hoping to have more time for NumPy next year. Best regards, -Travis -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Dec 3 16:47:05 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 3 Dec 2009 14:47:05 -0700 Subject: [Numpy-discussion] Three bugs fixed In-Reply-To: References: Message-ID: On Thu, Dec 3, 2009 at 2:07 PM, Travis Oliphant wrote: > > Thanks to the reporters of tickets #1108, #1197 (similar to #1279), and > #1222. > > Pointing out these problems allowed me to find and squash two subtle memory > leaks and one just plain stupid bug lurking in reduce-at (when using the > buffered internal loop). > > I think I fixed the problems. 
I added a test for #1108, but I don't have > good tests for the memory leak fixes. I'm not sure how to write a good > test for those. > > Even though it may take several months to get to the tickets with my > limited time, please do continue to report problems. I'm hoping to have > more time for NumPy next year. > > Good job! Nice to have the reduceat bug tracked down. Now for some nits ;) 1) Could you format your multiline comments like: /* * blah */ You should be able to set up your editor to do that automatically. 2) You don't need a continuation here - if (loop->meth == ZERO_EL_REDUCELOOP) { + if ((loop->meth == ZERO_EL_REDUCELOOP) || \ + ((operation == UFUNC_REDUCEAT) && (loop->meth == BUFFER_UFUNCLOOP))) { idarr = _getidentity(self, otype, str); if (idarr == NULL) { 3) Commit to the trunk first, then backport to the release candidate after checking the buildbot. The SPARC machines are still giving bus errors after the reduceat fix and this bug is now part of the release candidate. 4) What is this? -#include "npy_config.h" +#include "npy_config.h" c Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Thu Dec 3 17:13:17 2009 From: oliphant at enthought.com (Travis Oliphant) Date: Thu, 3 Dec 2009 16:13:17 -0600 Subject: [Numpy-discussion] Three bugs fixed In-Reply-To: References: Message-ID: On Dec 3, 2009, at 3:47 PM, Charles R Harris wrote: > > > On Thu, Dec 3, 2009 at 2:07 PM, Travis Oliphant > wrote: > > Thanks to the reporters of tickets #1108, #1197 (similar to #1279), > and #1222. > > Pointing out these problems allowed me to find and squash two subtle > memory leaks and one just plain stupid bug lurking in reduce-at > (when using the buffered internal loop). > > I think I fixed the problems. I added a test for #1108, but I > don't have good tests for the memory leak fixes. I'm not sure how > to write a good test for those. > > Even though it may take several months to get to the tickets with my > limited time, please do continue to report problems. I'm hoping to > have more time for NumPy next year. > > > Good job! Nice to have the reduceat bug tracked down. Now for some > nits ;) Thanks for keeping the code clean. O.K. I can commit to the trunk first. How do I see what the build-bot machines are giving errors on. I'd like to try and fix the bus errors if possible while I'm thinking about it? -Travis -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Dec 3 17:20:43 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 3 Dec 2009 15:20:43 -0700 Subject: [Numpy-discussion] Three bugs fixed In-Reply-To: References: Message-ID: On Thu, Dec 3, 2009 at 3:13 PM, Travis Oliphant wrote: > > On Dec 3, 2009, at 3:47 PM, Charles R Harris wrote: > > > > On Thu, Dec 3, 2009 at 2:07 PM, Travis Oliphant wrote: > >> >> Thanks to the reporters of tickets #1108, #1197 (similar to #1279), and >> #1222. >> >> Pointing out these problems allowed me to find and squash two subtle >> memory leaks and one just plain stupid bug lurking in reduce-at (when using >> the buffered internal loop). >> >> I think I fixed the problems. I added a test for #1108, but I don't have >> good tests for the memory leak fixes. I'm not sure how to write a good >> test for those. >> >> Even though it may take several months to get to the tickets with my >> limited time, please do continue to report problems. I'm hoping to have >> more time for NumPy next year. >> >> > Good job! 
Nice to have the reduceat bug tracked down. Now for some nits ;) > > > Thanks for keeping the code clean. > > O.K. I can commit to the trunk first. > > How do I see what the build-bot machines are giving errors on. I'd like to > try and fix the bus errors if possible while I'm thinking about it? > > I like the waterfall view http://buildbot.scipy.org/waterfall?show_events=false, it doesn't automatically update though, so you need to refresh it now and then. The root page is at http://buildbot.scipy.org/. Mind, the SPARC machines take 10-15 minutes to complete on a good day. If you hit the stdio link on the waterfall display for a machine you can view the test output. I expect the bus error is from the starting index fix. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Dec 3 17:26:26 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 3 Dec 2009 15:26:26 -0700 Subject: [Numpy-discussion] Three bugs fixed In-Reply-To: References: Message-ID: On Thu, Dec 3, 2009 at 3:20 PM, Charles R Harris wrote: > > > On Thu, Dec 3, 2009 at 3:13 PM, Travis Oliphant wrote: > >> >> On Dec 3, 2009, at 3:47 PM, Charles R Harris wrote: >> >> >> >> On Thu, Dec 3, 2009 at 2:07 PM, Travis Oliphant wrote: >> >>> >>> Thanks to the reporters of tickets #1108, #1197 (similar to #1279), and >>> #1222. >>> >>> Pointing out these problems allowed me to find and squash two subtle >>> memory leaks and one just plain stupid bug lurking in reduce-at (when using >>> the buffered internal loop). >>> >>> I think I fixed the problems. I added a test for #1108, but I don't >>> have good tests for the memory leak fixes. I'm not sure how to write a >>> good test for those. >>> >>> Even though it may take several months to get to the tickets with my >>> limited time, please do continue to report problems. I'm hoping to have >>> more time for NumPy next year. >>> >>> >> Good job! Nice to have the reduceat bug tracked down. Now for some nits ;) >> >> >> Thanks for keeping the code clean. >> >> O.K. I can commit to the trunk first. >> >> How do I see what the build-bot machines are giving errors on. I'd like >> to try and fix the bus errors if possible while I'm thinking about it? >> >> > I like the waterfall view > http://buildbot.scipy.org/waterfall?show_events=false, it doesn't > automatically update though, so you need to refresh it now and then. The > root page is at http://buildbot.scipy.org/. Mind, the SPARC machines take > 10-15 minutes to complete on a good day. If you hit the stdio link on the > waterfall display for a machine you can view the test output. > > I expect the bus error is from the starting index fix. > > Maybe the buffer needs alignment? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Thu Dec 3 18:25:46 2009 From: oliphant at enthought.com (Travis Oliphant) Date: Thu, 3 Dec 2009 17:25:46 -0600 Subject: [Numpy-discussion] Three bugs fixed In-Reply-To: References: Message-ID: <08A24B53-57CF-41DB-80EE-1308BC87073C@enthought.com> On Dec 3, 2009, at 4:26 PM, Charles R Harris wrote: > > > > I like the waterfall view http://buildbot.scipy.org/waterfall?show_events=false > , it doesn't automatically update though, so you need to refresh it > now and then. The root page is at http://buildbot.scipy.org/. Mind, > the SPARC machines take 10-15 minutes to complete on a good day. 
If > you hit the stdio link on the waterfall display for a machine you > can view the test output. > > I expect the bus error is from the starting index fix. > I'm not sure which fix that was. The bus errors started when I added the test for #1299. I also fixed the test with the reduceat bug in that same checkin, but the SPARC build machines are reporting a bus error in testing #1299. I'm not sure where the bus error is occurring in that code exactly (i.e. the test wasn't being run before my fix so it could have exposed a bus error earlier --- obviously I need to play better with the build- bots than I am doing). A likely candidate though is the PyArray_Item_INCREF call (though it uses the new copy-object-before-refcount-changes semantics). -Travis -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Dec 3 18:48:35 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 3 Dec 2009 16:48:35 -0700 Subject: [Numpy-discussion] Three bugs fixed In-Reply-To: <08A24B53-57CF-41DB-80EE-1308BC87073C@enthought.com> References: <08A24B53-57CF-41DB-80EE-1308BC87073C@enthought.com> Message-ID: On Thu, Dec 3, 2009 at 4:25 PM, Travis Oliphant wrote: > > On Dec 3, 2009, at 4:26 PM, Charles R Harris wrote: > > > > >> I like the waterfall view >> http://buildbot.scipy.org/waterfall?show_events=false, it doesn't >> automatically update though, so you need to refresh it now and then. The >> root page is at http://buildbot.scipy.org/. Mind, the SPARC machines take >> 10-15 minutes to complete on a good day. If you hit the stdio link on the >> waterfall display for a machine you can view the test output. >> >> I expect the bus error is from the starting index fix. >> >> > > I'm not sure which fix that was. The bus errors started when I added the > test for #1299. I also fixed the test with the reduceat bug in that same > checkin, but the SPARC build machines are reporting a bus error in testing > #1299. > > I'm not sure where the bus error is occurring in that code exactly (i.e. > the test wasn't being run before my fix so it could have exposed a bus error > earlier --- obviously I need to play better with the build-bots than I am > doing). > > The error could certainly been lurking all these years, I don't think reduceat gets used much. > A likely candidate though is the PyArray_Item_INCREF call (though it uses > the new copy-object-before-refcount-changes semantics). > > Would that cause a bus error? That looks like an alignment issue. There was another buffer alignment issue a while back that I fixed, I'll try to track it down. Maybe we can get Michael Droettboom to help here, he has access to SPARC. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Thu Dec 3 19:04:06 2009 From: oliphant at enthought.com (Travis Oliphant) Date: Thu, 3 Dec 2009 18:04:06 -0600 Subject: [Numpy-discussion] Can someone backport r7870 to numpy 1.4 branch? Message-ID: I just checked in another fix (for ticket #1254), but I don't have time to backport the fix to the 1.4 branch. It would be great if someone could do that. Thanks, -Travis From charlesr.harris at gmail.com Thu Dec 3 21:01:49 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 3 Dec 2009 19:01:49 -0700 Subject: [Numpy-discussion] Can someone backport r7870 to numpy 1.4 branch? 
In-Reply-To: References: Message-ID: On Thu, Dec 3, 2009 at 5:04 PM, Travis Oliphant wrote: > > I just checked in another fix (for ticket #1254), but I don't have > time to backport the fix to the 1.4 branch. It would be great if > someone could do that. > > Done...Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Fri Dec 4 04:09:44 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Fri, 4 Dec 2009 01:09:44 -0800 Subject: [Numpy-discussion] How to solve homogeneous linear equations with NumPy? In-Reply-To: References: <456193.8364.qm@web34405.mail.mud.yahoo.com> Message-ID: <45d1ab480912040109v3884c32bu1c1d11d865c4edae@mail.gmail.com> On Thu, Dec 3, 2009 at 9:17 AM, Charles R Harris wrote: > > > On Thu, Dec 3, 2009 at 7:59 AM, Peter Cai wrote: > >> Thanks, I've read some explanations on wikipedia and finally found out >> how to solve homogeneous equations by singular value decomposition. >> >> > Note that the numpy svd doesn't quite conform to what you will see in those > sources and the documentation is confusing. Numpy returns > u,s,v and a = u*diag(s)*v, whereas the decomposition is normally written as > u*diag(s)*v^T, i.e., the numpy v is the transpose (Hermitean conjugate) of > the conventional v. > > Chuck > It's quite clear to me (at least in the version of the doc in the Wiki) that what is returned in the third "slot" is the "Hermitean of v", i.e., the third factor in the decomposition the way it is "normally written"; how would you suggest it be made clearer? DG -------------- next part -------------- An HTML attachment was scrubbed... URL: From sccolbert at gmail.com Fri Dec 4 05:19:27 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Fri, 4 Dec 2009 11:19:27 +0100 Subject: [Numpy-discussion] non-standard standard deviation In-Reply-To: <4B17B3F5.60409@ncf.ca> References: <26566843.post@talk.nabble.com> <4B131239.7080801@ncf.ca> <2d5132a50911291715k703a0b32t1deffe67cf8947e2@mail.gmail.com> <4B16B103.6070600@ncf.ca> <3d375d730912021229g10d80116hb6341da8bb97bee3@mail.gmail.com> <703777c60912022135v3074ee03mb384359cb6b32622@mail.gmail.com> <4B17B3F5.60409@ncf.ca> Message-ID: <7f014ea60912040219u2b46252bib0e47b26d6b10d67@mail.gmail.com> Why cant the divisor constant just be made an optional kwarg that defaults to zero? It wont break any existing code, and will let everybody that wants the other behavior, to have it. On Thu, Dec 3, 2009 at 1:49 PM, Colin J. Williams wrote: > Yogesh, > > Could you explain the rationale for this choice please? > > Colin W. > > On 03-Dec-09 00:35 AM, yogesh karpate wrote: >> The thing is that the normalization by (n-1) is done for the no. of >> samples >20 or23(Not sure about this no. but sure about the thing that >> this no isnt greater than 25) and below that we use normalization by n. 
>> Regards >> ~ymk >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pav at iki.fi Fri Dec 4 05:21:47 2009 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 04 Dec 2009 12:21:47 +0200 Subject: [Numpy-discussion] non-standard standard deviation In-Reply-To: <7f014ea60912040219u2b46252bib0e47b26d6b10d67@mail.gmail.com> References: <26566843.post@talk.nabble.com> <4B131239.7080801@ncf.ca> <2d5132a50911291715k703a0b32t1deffe67cf8947e2@mail.gmail.com> <4B16B103.6070600@ncf.ca> <3d375d730912021229g10d80116hb6341da8bb97bee3@mail.gmail.com> <703777c60912022135v3074ee03mb384359cb6b32622@mail.gmail.com> <4B17B3F5.60409@ncf.ca> <7f014ea60912040219u2b46252bib0e47b26d6b10d67@mail.gmail.com> Message-ID: <1259922107.2944.0.camel@talisman> Fri, 2009-12-04 at 11:19 +0100, Chris Colbert wrote: > Why can't the divisor constant just be made an optional kwarg that > defaults to zero? It already is an optional kwarg that defaults to zero. Cheers, -- Pauli Virtanen From pav+sp at iki.fi Fri Dec 4 05:24:34 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Fri, 4 Dec 2009 10:24:34 +0000 (UTC) Subject: [Numpy-discussion] non-standard standard deviation References: <26566843.post@talk.nabble.com> <4B131239.7080801@ncf.ca> <2d5132a50911291715k703a0b32t1deffe67cf8947e2@mail.gmail.com> <4B16B103.6070600@ncf.ca> <3d375d730912021229g10d80116hb6341da8bb97bee3@mail.gmail.com> <703777c60912022135v3074ee03mb384359cb6b32622@mail.gmail.com> Message-ID: Thu, 03 Dec 2009 11:05:07 +0530, yogesh karpate wrote: > The thing is that the normalization by (n-1) is done for the no. of > samples > 20 or 23 (not sure about this number, but sure that it is not > greater than 25) and below that we use normalization by n. Regards > ~ymk > The thing is that the normalization by (n-1) is done for the no. of > samples > 20 or 23 (not sure about this number, but sure that it is > not greater than 25) and below that we use normalization > by n. Just to clarify: Numpy (of course) does not change the divisor depending on `n` -- Yogesh's post concerns probably some code of his own. -- Pauli Virtanen From yogeshkarpate at gmail.com Fri Dec 4 07:18:45 2009 From: yogeshkarpate at gmail.com (yogesh karpate) Date: Fri, 4 Dec 2009 17:48:45 +0530 Subject: [Numpy-discussion] non-standard standard deviation In-Reply-To: References: <26566843.post@talk.nabble.com> <4B131239.7080801@ncf.ca> <2d5132a50911291715k703a0b32t1deffe67cf8947e2@mail.gmail.com> <4B16B103.6070600@ncf.ca> <3d375d730912021229g10d80116hb6341da8bb97bee3@mail.gmail.com> <703777c60912022135v3074ee03mb384359cb6b32622@mail.gmail.com> Message-ID: <703777c60912040418n3586bc15j6435b1a35a77bba1@mail.gmail.com> @ Pauli and @ Colin: Sorry for the late reply. I was busy with some other assignments. # As far as normalization by (n) is concerned, it is a common assumption that the population is normally distributed and the population size is large enough to fit the normal distribution. But this standard deviation, when applied to a small population, tends to be too low, which is why it is called biased. # The correction known as Bessel's correction exists for the small-sample standard deviation, i.e. normalization by (n-1).
# In "electrical-and-electronic-measurements-and-instrumentation" by A.K. Sawhney . In 1st chapter of the book "Fundamentals of Meausrements " . Its shown that for N=16 the std. deviation normalization was (n-1)=15 # While I was learning statistics in my course Instructor would advise to take n=20 for normalization by (n-1) # Probability and statistics by Schuam Series is good reading. Regards ~ymk -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at pytables.org Fri Dec 4 07:23:59 2009 From: faltet at pytables.org (Francesc Alted) Date: Fri, 4 Dec 2009 13:23:59 +0100 Subject: [Numpy-discussion] Bytes vs. Unicode in Python3 In-Reply-To: <4B17C380.4090501@student.matnat.uio.no> References: <1259276898.8494.18.camel@idol> <4B17C380.4090501@student.matnat.uio.no> Message-ID: <200912041323.59338.faltet@pytables.org> A Thursday 03 December 2009 14:56:16 Dag Sverre Seljebotn escrigu?: > Pauli Virtanen wrote: > > Thu, 03 Dec 2009 14:03:13 +0100, Dag Sverre Seljebotn wrote: > > [clip] > > > >> Great! Are you storing the format string in the dtype types as well? (So > >> that no release is needed and acquisitions are cheap...) > > > > I regenerate it on each buffer acquisition. It's simple low-level C code, > > and I suspect it will always be fast enough. Of course, we could *cache* > > the result in the dtype. (If dtypes are immutable, which I don't remember > > right now.) > > We discussed this at SciPy 09 -- basically, they are not necesarrily > immutable in implementation, but anywhere they are not that is a bug and > no code should depend on their mutability, so we are free to assume so. Mmh, the only case that I'm aware about dtype *mutability* is changing the names of compound types: In [19]: t = np.dtype("i4,f4") In [20]: t Out[20]: dtype([('f0', ' What do people think of applying patch #1085. This patch makes a copy of inputs when the input and output views overlap in ways in which one computation will change later computations. i.e. what is the output of? x = ones((10,3)) x += x[1] I think that copying in such instances is a good idea and it looks like the patch has received careful review. The only downsides I can see are: 1) Increased overhead of the check --- this looks very small 2) Subtle change in behavior --- it's hard to imagine that somebody would be relying on the specific behavior of overlapping views writing into the same output (I know it's something we warn against in teaching) -- but switching from it is a change. Anything else? -Travis -------------- next part -------------- An HTML attachment was scrubbed... URL: From mdroe at stsci.edu Fri Dec 4 09:28:13 2009 From: mdroe at stsci.edu (Michael Droettboom) Date: Fri, 04 Dec 2009 09:28:13 -0500 Subject: [Numpy-discussion] Three bugs fixed In-Reply-To: References: <08A24B53-57CF-41DB-80EE-1308BC87073C@enthought.com> Message-ID: <4B191C7D.3020608@stsci.edu> Charles R Harris wrote: > > > On Thu, Dec 3, 2009 at 4:25 PM, Travis Oliphant > > wrote: > > > I'm not sure which fix that was. The bus errors started when I > added the test for #1299. I also fixed the test with the reduceat > bug in that same checkin, but the SPARC build machines are > reporting a bus error in testing #1299. > > I'm not sure where the bus error is occurring in that code exactly > (i.e. the test wasn't being run before my fix so it could have > exposed a bus error earlier --- obviously I need to play better > with the build-bots than I am doing). 
> > > The error could certainly been lurking all these years, I don't think > reduceat gets used much. > > > A likely candidate though is the PyArray_Item_INCREF call (though > it uses the new copy-object-before-refcount-changes semantics). > > > Would that cause a bus error? That looks like an alignment issue. > There was another buffer alignment issue a while back that I fixed, > I'll try to track it down. Maybe we can get Michael Droettboom to help > here, he has access to SPARC. Sure. I have some time to look into it this morning. Mike -- Michael Droettboom Science Software Branch Operations and Engineering Division Space Telescope Science Institute Operated by AURA for NASA From cournape at gmail.com Fri Dec 4 10:19:51 2009 From: cournape at gmail.com (David Cournapeau) Date: Sat, 5 Dec 2009 00:19:51 +0900 Subject: [Numpy-discussion] Import numpy fails on cygwin python In-Reply-To: <599716.6028.qm@web51003.mail.re2.yahoo.com> References: <679405.12948.qm@web51002.mail.re2.yahoo.com> <4B162BA1.2040106@ar.media.kyoto-u.ac.jp> <599716.6028.qm@web51003.mail.re2.yahoo.com> Message-ID: <5b8d13220912040719kcf29875g193534cdfdf6f0d8@mail.gmail.com> On Thu, Dec 3, 2009 at 5:41 AM, Olivia Cheronet wrote: > ----- Original Message ---- >> From: David Cournapeau >> >> Does the file >> /usr/lib/python2.5/site-packages/numpy/linalg/lapack_lite.so exist ? >> >> cheers, >> >> David > > > Indeed, this file is not there. Where can I find it? My mistake, cygwin uses the same extension as windows, that is .dll and not .so, so I would need the output of ldd lapack_lite.dll as well as the output of nm lapack_lite.dll David From mdroe at stsci.edu Fri Dec 4 10:45:53 2009 From: mdroe at stsci.edu (Michael Droettboom) Date: Fri, 04 Dec 2009 10:45:53 -0500 Subject: [Numpy-discussion] Three bugs fixed In-Reply-To: <4B191C7D.3020608@stsci.edu> References: <08A24B53-57CF-41DB-80EE-1308BC87073C@enthought.com> <4B191C7D.3020608@stsci.edu> Message-ID: <4B192EB1.6060402@stsci.edu> Unfortunately, I can't reproduce it on Solaris/SPARC (SVN r7878, trunk). Could it be Linux/SPARC-specific? (We don't have a Linux/SPARC machine lying around, I don't think). Mike Michael Droettboom wrote: > Charles R Harris wrote: > >> On Thu, Dec 3, 2009 at 4:25 PM, Travis Oliphant >> > wrote: >> >> >> I'm not sure which fix that was. The bus errors started when I >> added the test for #1299. I also fixed the test with the reduceat >> bug in that same checkin, but the SPARC build machines are >> reporting a bus error in testing #1299. >> >> I'm not sure where the bus error is occurring in that code exactly >> (i.e. the test wasn't being run before my fix so it could have >> exposed a bus error earlier --- obviously I need to play better >> with the build-bots than I am doing). >> >> >> The error could certainly been lurking all these years, I don't think >> reduceat gets used much. >> >> >> A likely candidate though is the PyArray_Item_INCREF call (though >> it uses the new copy-object-before-refcount-changes semantics). >> >> >> Would that cause a bus error? That looks like an alignment issue. >> There was another buffer alignment issue a while back that I fixed, >> I'll try to track it down. Maybe we can get Michael Droettboom to help >> here, he has access to SPARC. >> > Sure. I have some time to look into it this morning. 
> > Mike > > -- Michael Droettboom Science Software Branch Operations and Engineering Division Space Telescope Science Institute Operated by AURA for NASA From charlesr.harris at gmail.com Fri Dec 4 10:52:52 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 4 Dec 2009 08:52:52 -0700 Subject: [Numpy-discussion] How to solve homogeneous linear equations with NumPy? In-Reply-To: <45d1ab480912040109v3884c32bu1c1d11d865c4edae@mail.gmail.com> References: <456193.8364.qm@web34405.mail.mud.yahoo.com> <45d1ab480912040109v3884c32bu1c1d11d865c4edae@mail.gmail.com> Message-ID: On Fri, Dec 4, 2009 at 2:09 AM, David Goldsmith wrote: > On Thu, Dec 3, 2009 at 9:17 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Thu, Dec 3, 2009 at 7:59 AM, Peter Cai wrote: >> >>> Thanks, I've read some explanations on wikipedia and finally found out >>> how to solve homogeneous equations by singular value decomposition. >>> >>> >> Note that the numpy svd doesn't quite conform to what you will see in >> those sources and the documentation is confusing. Numpy returns >> u,s,v and a = u*diag(s)*v, whereas the decomposition is normally written >> as u*diag(s)*v^T, i.e., the numpy v is the transpose (Hermitean conjugate) >> of the conventional v. >> >> Chuck >> > > It's quite clear to me (at least in the version of the doc in the Wiki) > that what is returned in the third "slot" is the "Hermitean of v", i.e., the > third factor in the decomposition the way it is "normally written"; how > would you suggest it be made clearer? > > Leave off the Hermitean bit since it is irrelevant to our decomposition, show a = u*diag(s)*v, and make a note explaining the usual convention. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Dec 4 10:54:51 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 4 Dec 2009 08:54:51 -0700 Subject: [Numpy-discussion] Three bugs fixed In-Reply-To: <4B192EB1.6060402@stsci.edu> References: <08A24B53-57CF-41DB-80EE-1308BC87073C@enthought.com> <4B191C7D.3020608@stsci.edu> <4B192EB1.6060402@stsci.edu> Message-ID: On Fri, Dec 4, 2009 at 8:45 AM, Michael Droettboom wrote: > Unfortunately, I can't reproduce it on Solaris/SPARC (SVN r7878, > trunk). Could it be Linux/SPARC-specific? (We don't have a Linux/SPARC > machine lying around, I don't think). > > Hmm, maybe it is compiler specific. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Fri Dec 4 10:54:51 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 04 Dec 2009 09:54:51 -0600 Subject: [Numpy-discussion] non-standard standard deviation In-Reply-To: <703777c60912040418n3586bc15j6435b1a35a77bba1@mail.gmail.com> References: <26566843.post@talk.nabble.com> <4B131239.7080801@ncf.ca> <2d5132a50911291715k703a0b32t1deffe67cf8947e2@mail.gmail.com> <4B16B103.6070600@ncf.ca> <3d375d730912021229g10d80116hb6341da8bb97bee3@mail.gmail.com> <703777c60912022135v3074ee03mb384359cb6b32622@mail.gmail.com> <703777c60912040418n3586bc15j6435b1a35a77bba1@mail.gmail.com> Message-ID: <4B1930CB.7090708@gmail.com> On 12/04/2009 06:18 AM, yogesh karpate wrote: > @ Pauli and @ Colin: > Sorry for the late reply. I was busy > in some other assignments. > # As far as normalization by(n) is concerned then its common > assumption that the population is normally distributed and population > size is fairly large enough to fit the normal distribution. 
But this > standard deviation, when applied to a small population, tends to be > too low, which is why it is called biased. > # The correction known as Bessel's correction exists for the small-sample > standard deviation, i.e. normalization by (n-1). > # In "Electrical and Electronic Measurements and Instrumentation" by > A. K. Sawhney, in the 1st chapter, "Fundamentals of > Measurements", it is shown that for N=16 the std. deviation > normalization was (n-1)=15. > # While I was learning statistics in my course, the instructor would advise > taking n=20 for normalization by (n-1). > # "Probability and Statistics" from the Schaum's Outline Series is good reading. > Regards > ~ymk > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Hi, Basically, all that I see with these arbitrary values is that you are relying on the 'central limit theorem' (http://en.wikipedia.org/wiki/Central_limit_theorem). Really, the issue in using these values is how much statistical bias you will tolerate, especially in how the estimate gets used, because uses of the variance (such as in statistical tests) tend to be more influenced by bias than the estimate of the variance itself. (Of course, many features rely on asymptotic properties, so bias concerns are less apparent at large sample sizes.) Obviously the default reflects the developers' background and requirements. There are multiple valid variance estimators in statistics with different denominators, like N (maximum likelihood estimator), N-1 (restricted maximum likelihood estimator and certain Bayesian estimators) and Stein's (http://en.wikipedia.org/wiki/James%E2%80%93Stein_estimator). So the current default behavior is valid and documented. Consequently you cannot just have one option or different functions (like certain programs), and Numpy's implementation actually allows you to do all of these in a single function. So I also see no reason to change, even if I have to add the ddof=1 argument; after all, 'Explicit is better than implicit' :-). Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Dec 4 11:09:37 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 4 Dec 2009 09:09:37 -0700 Subject: [Numpy-discussion] Applying Patch #1085 In-Reply-To: <5C367222-2FDF-4B1D-8DD4-4987E0EB10CA@enthought.com> References: <5C367222-2FDF-4B1D-8DD4-4987E0EB10CA@enthought.com> Message-ID: On Fri, Dec 4, 2009 at 6:29 AM, Travis Oliphant wrote: > > What do people think of applying patch #1085. This patch makes a copy of > inputs when the input and output views overlap in ways in which one > computation will change later computations. > > I'd rename the function views_share_base_data, make the loop to find the base an inplace function, and break the long lines in the test. It looks like the routine doesn't try to determine if the views actually overlap, just if they might potentially share data. Is that correct? That seems safe, and if the time isn't much it might be a nice safety catch. But it shouldn't go into 1.4. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Fri Dec 4 11:12:09 2009 From: cournape at gmail.com (David Cournapeau) Date: Sat, 5 Dec 2009 01:12:09 +0900 Subject: [Numpy-discussion] Bytes vs.
Unicode in Python3 In-Reply-To: <200912041323.59338.faltet@pytables.org> References: <1259276898.8494.18.camel@idol> <4B17C380.4090501@student.matnat.uio.no> <200912041323.59338.faltet@pytables.org> Message-ID: <5b8d13220912040812k15e66dabi8eb92b7a13ec13df@mail.gmail.com> On Fri, Dec 4, 2009 at 9:23 PM, Francesc Alted wrote: > On Thursday 03 December 2009 14:56:16 Dag Sverre Seljebotn wrote: >> Pauli Virtanen wrote: >> > Thu, 03 Dec 2009 14:03:13 +0100, Dag Sverre Seljebotn wrote: >> > [clip] >> > >> >> Great! Are you storing the format string in the dtype types as well? (So >> >> that no release is needed and acquisitions are cheap...) >> > >> > I regenerate it on each buffer acquisition. It's simple low-level C code, >> > and I suspect it will always be fast enough. Of course, we could *cache* >> > the result in the dtype. (If dtypes are immutable, which I don't remember >> > right now.) >> >> We discussed this at SciPy 09 -- basically, they are not necessarily >> immutable in implementation, but anywhere they are not that is a bug and >> no code should depend on their mutability, so we are free to assume so. > > Mmh, the only case of dtype *mutability* that I'm aware of is changing the > names of compound types: > > In [19]: t = np.dtype("i4,f4") > > In [20]: t > Out[20]: dtype([('f0', '<i4'), ('f1', '<f4')]) > > In [21]: hash(t) > Out[21]: -9041335829180134223 > > In [22]: t.names = ('one', 'other') > > In [23]: t > Out[23]: dtype([('one', '<i4'), ('other', '<f4')]) > > In [24]: hash(t) > Out[24]: 8637734220020415106 > > Perhaps this should be marked as a bug? I'm not sure about that, because the > above seems quite useful. Hm, that's strange - I get the same hash in both cases, but I thought I took names into account when I implemented the hashing protocol for dtype. Which version of numpy on which OS are you seeing this? David From cheronetolivia at yahoo.com Fri Dec 4 11:26:38 2009 From: cheronetolivia at yahoo.com (Olivia Cheronet) Date: Fri, 4 Dec 2009 08:26:38 -0800 (PST) Subject: [Numpy-discussion] Import numpy fails on cygwin python In-Reply-To: <5b8d13220912040719kcf29875g193534cdfdf6f0d8@mail.gmail.com> References: <679405.12948.qm@web51002.mail.re2.yahoo.com> <4B162BA1.2040106@ar.media.kyoto-u.ac.jp> <599716.6028.qm@web51003.mail.re2.yahoo.com> <5b8d13220912040719kcf29875g193534cdfdf6f0d8@mail.gmail.com> Message-ID: <237904.68070.qm@web51002.mail.re2.yahoo.com> > From: David Cournapeau > >> Does the file > >> /usr/lib/python2.5/site-packages/numpy/linalg/lapack_lite.so exist ? > >> > >> cheers, > >> > >> David > > > > > > Indeed, this file is not there. Where can I find it? > > My mistake, cygwin uses the same extension as windows, that is .dll > and not .so, so I would need the output of ldd lapack_lite.dll as well > as the output of nm lapack_lite.dll Here are the outputs of cygcheck (as advised in http://cygwin.com/ml/cygwin/2009-10/msg00004.html) and nm. I have found cyglapack.dll in lib/lapack/cygblas.dll.....
Thanks, Olivia $ cygcheck /lib/python2.5/site-packages/numpy/linalg/lapack_lite.dll C:\cygwin\lib/python2.5/site-packages/numpy/linalg/lapack_lite.dll C:\cygwin\bin\cygwin1.dll C:\WINDOWS\system32\ADVAPI32.DLL C:\WINDOWS\system32\KERNEL32.dll C:\WINDOWS\system32\ntdll.dll C:\WINDOWS\system32\RPCRT4.dll C:\WINDOWS\system32\Secur32.dll Error: could not find cyglapack.dll C:\cygwin\bin\libpython2.5.dll $ nm /lib/python2.5/site-packages/numpy/linalg/lapack_lite.dll 10008000 b .bss 10008120 b .bss 10008020 b .bss 10008120 b .bss 10008060 b .bss 10008060 b .bss 10008120 b .bss 10008120 b .bss 10008120 b .bss 10008110 b .bss 10008120 b .bss 10008120 b .bss 10008020 b .bss 10008120 b .bss 10008120 b .bss 10006160 d .data 10006160 d .data 10006160 d .data 10006160 d .data 10006160 d .data 10006000 d .data 10006160 d .data 10006160 d .data 10006160 d .data 10006160 d .data 10006160 d .data 10006160 d .data 10006160 d .data 10006160 d .data 10006160 d .data 1000a03c i .idata$2 1000a000 i .idata$2 1000a028 i .idata$2 1000a014 i .idata$2 1000a0cc i .idata$4 1000a078 i .idata$4 1000a110 i .idata$4 1000a0c4 i .idata$4 1000a08c i .idata$4 1000a0c8 i .idata$4 1000a100 i .idata$4 1000a0e0 i .idata$4 1000a0bc i .idata$4 1000a0e8 i .idata$4 1000a10c i .idata$4 1000a0b8 i .idata$4 1000a120 i .idata$4 1000a0ac i .idata$4 1000a114 i .idata$4 1000a090 i .idata$4 1000a09c i .idata$4 1000a12c i .idata$4 1000a0b0 i .idata$4 1000a070 i .idata$4 1000a0f8 i .idata$4 1000a11c i .idata$4 1000a0a8 i .idata$4 1000a06c i .idata$4 1000a064 i .idata$4 1000a0a0 i .idata$4 1000a0f0 i .idata$4 1000a068 i .idata$4 1000a0a4 i .idata$4 1000a108 i .idata$4 1000a080 i .idata$4 1000a088 i .idata$4 1000a098 i .idata$4 1000a0d0 i .idata$4 1000a104 i .idata$4 1000a130 i .idata$4 1000a0b4 i .idata$4 1000a084 i .idata$4 1000a07c i .idata$4 1000a12c i .idata$4 1000a0f4 i .idata$4 1000a094 i .idata$4 1000a0ec i .idata$4 1000a0d8 i .idata$4 1000a124 i .idata$4 1000a128 i .idata$4 1000a118 i .idata$4 1000a0c0 i .idata$4 1000a0fc i .idata$4 1000a0e4 i .idata$4 1000a0d4 i .idata$4 1000a074 i .idata$4 1000a0dc i .idata$4 1000a1cc i .idata$5 1000a1c4 i .idata$5 1000a148 i .idata$5 1000a1b8 i .idata$5 1000a1e8 i .idata$5 1000a1fc i .idata$5 1000a1e0 i .idata$5 1000a1dc i .idata$5 1000a140 i .idata$5 1000a158 i .idata$5 1000a1f8 i .idata$5 1000a1a8 i .idata$5 1000a190 i .idata$5 1000a1ac i .idata$5 1000a1a4 i .idata$5 1000a19c i .idata$5 1000a144 i .idata$5 1000a15c i .idata$5 1000a194 i .idata$5 1000a198 i .idata$5 1000a18c i .idata$5 1000a1b0 i .idata$5 1000a188 i .idata$5 1000a17c i .idata$5 1000a16c i .idata$5 1000a160 i .idata$5 1000a180 i .idata$5 1000a178 i .idata$5 1000a170 i .idata$5 1000a134 i .idata$5 1000a174 i .idata$5 1000a168 i .idata$5 1000a150 i .idata$5 1000a1a0 i .idata$5 1000a184 i .idata$5 1000a14c i .idata$5 1000a164 i .idata$5 1000a200 i .idata$5 1000a1bc i .idata$5 1000a1d4 i .idata$5 1000a154 i .idata$5 1000a1d8 i .idata$5 1000a1c0 i .idata$5 1000a1fc i .idata$5 1000a1c8 i .idata$5 1000a13c i .idata$5 1000a1e4 i .idata$5 1000a1d0 i .idata$5 1000a1f4 i .idata$5 1000a1f0 i .idata$5 1000a138 i .idata$5 1000a1b4 i .idata$5 1000a1ec i .idata$5 1000a214 i .idata$6 1000a370 i .idata$6 1000a2a8 i .idata$6 1000a39c i .idata$6 1000a2c0 i .idata$6 1000a440 i .idata$6 1000a488 i .idata$6 1000a2d8 i .idata$6 1000a3c4 i .idata$6 1000a46c i .idata$6 1000a29c i .idata$6 1000a35c i .idata$6 1000a2cc i .idata$6 1000a458 i .idata$6 1000a2f0 i .idata$6 1000a2fc i .idata$6 1000a248 i .idata$6 1000a320 i .idata$6 1000a3ec i .idata$6 1000a314 i 
.idata$6 1000a32c i .idata$6 1000a344 i .idata$6 1000a234 i .idata$6 1000a3dc i .idata$6 1000a204 i .idata$6 1000a308 i .idata$6 1000a4c0 i .idata$6 1000a350 i .idata$6 1000a4ac i .idata$6 1000a26c i .idata$6 1000a278 i .idata$6 1000a220 i .idata$6 1000a284 i .idata$6 1000a388 i .idata$6 1000a2e4 i .idata$6 1000a258 i .idata$6 1000a3b4 i .idata$6 1000a338 i .idata$6 1000a400 i .idata$6 1000a42c i .idata$6 1000a290 i .idata$6 1000a49c i .idata$6 1000a2b4 i .idata$6 1000a414 i .idata$6 1000a260 i .idata$6 1000a504 i .idata$7 1000a548 i .idata$7 1000a530 i .idata$7 1000a544 i .idata$7 1000a53c i .idata$7 1000a534 i .idata$7 1000a538 i .idata$7 1000a52c i .idata$7 1000a528 i .idata$7 1000a51c i .idata$7 1000a50c i .idata$7 1000a520 i .idata$7 1000a518 i .idata$7 1000a510 i .idata$7 1000a514 i .idata$7 1000a508 i .idata$7 1000a540 i .idata$7 1000a524 i .idata$7 1000a564 i .idata$7 1000a57c i .idata$7 1000a580 i .idata$7 1000a568 i .idata$7 1000a570 i .idata$7 1000a58c i .idata$7 1000a578 i .idata$7 1000a574 i .idata$7 1000a59c i .idata$7 1000a598 i .idata$7 1000a55c i .idata$7 1000a594 i .idata$7 1000a56c i .idata$7 1000a560 i .idata$7 1000a590 i .idata$7 1000a588 i .idata$7 1000a584 i .idata$7 1000a4f4 i .idata$7 1000a4dc i .idata$7 1000a4e4 i .idata$7 1000a4d4 i .idata$7 1000a4d8 i .idata$7 1000a4f0 i .idata$7 1000a4e8 i .idata$7 1000a4ec i .idata$7 1000a4e0 i .idata$7 1000a5b4 i .idata$7 1000a5b8 i .idata$7 1000a4f8 i .idata$7 1000a54c i .idata$7 1000a5a0 i .idata$7 10007000 r .rdata 10007598 r .rdata 10013d34 N .stab 1000c000 N .stab 10005370 t .text 10005368 t .text 10005658 t .text 10005360 t .text 10005358 t .text 10005350 t .text 10005668 t .text 100054b8 t .text 10005348 t .text 10005340 t .text 10005338 t .text 100054a8 t .text 10005668 t .text 10005330 t .text 10005328 t .text 100056c8 t .text 10005298 t .text 100054c8 t .text 10005378 t .text 10005508 t .text 10005748 t .text 10001000 t .text 100052a0 t .text 10005760 t .text 10005290 t .text 10005758 t .text 10005288 t .text 10005280 t .text 100051f0 t .text 10005320 t .text 10005318 t .text 10005310 t .text 100056d8 t .text 10005310 t .text 10005310 t .text 10005310 t .text 100056e8 t .text 10005308 t .text 10005300 t .text 100056f8 t .text 100052f8 t .text 100052f0 t .text 100052e8 t .text 10005760 t .text 10005708 t .text 100052e0 t .text 100052d8 t .text 10005378 t .text 10005718 t .text 100052d0 t .text 100052c8 t .text 10005498 t .text 10005728 t .text 100052c0 t .text 100052b8 t .text 10005760 t .text 10005738 t .text 100052b0 t .text 100052a8 t .text 100054a8 T _DllMain at 12 10005758 T _GetModuleHandleA at 4 10008010 b _LapackError 10005348 T _PyArg_ParseTuple 10008000 b _PyArray_API 10005360 T _PyCObject_AsVoidPtr 10005310 T _PyDict_SetItemString 10005358 T _PyErr_Format 10005318 T _PyErr_NewException 10005330 T _PyErr_Print 10005328 T _PyErr_SetString 10005370 T _PyImport_ImportModule 10005320 T _PyModule_GetDict 10005368 T _PyObject_GetAttrString 10005350 T _PyType_IsSubtype 10005340 T _Py_BuildValue 10005338 T _Py_InitModule4 10005760 T __CTOR_LIST__ 10005768 T __DTOR_LIST__ 100075d4 R __RUNTIME_PSEUDO_RELOC_LIST_END__ 100075d4 R __RUNTIME_PSEUDO_RELOC_LIST__ 10005760 T ___CTOR_LIST__ 10005768 T ___DTOR_LIST__ 10000000 A ___ImageBase 100075d4 R ___RUNTIME_PSEUDO_RELOC_LIST_END__ 100075d4 R ___RUNTIME_PSEUDO_RELOC_LIST__ U ___crt_xc_end__ U ___crt_xc_start__ U ___crt_xi_end__ U ___crt_xi_start__ U ___crt_xl_start__ U ___crt_xp_end__ U ___crt_xp_start__ U ___crt_xt_end__ U ___crt_xt_start__ 10005378 t ___dllMain U 
___tls_end__ U ___tls_start__ 10008140 B __bss_end__ 10008000 B __bss_start__ 10005508 T __cygwin_crt0_common at 8 100053a8 T __cygwin_dll_entry at 12 10005488 T __cygwin_noncygwin_dll_entry at 12 10006160 D __data_end__ 10006000 D __data_start__ 00000000 A __dll__ U __end__ 00000200 A __file_alignment__ 10008110 B __fmode 1000a014 I __head_cyglapack_dll 1000a000 I __head_cygwin1_dll 1000a03c I __head_libkernel32_a 1000a028 I __head_libpython2_5_dll 10000000 A __image_base__ 1000a1fc I __imp__GetModuleHandleA at 4 1000a1b4 I __imp__PyArg_ParseTuple 1000a1b8 I __imp__PyCObject_AsVoidPtr 1000a1bc I __imp__PyCObject_Type 1000a1c0 I __imp__PyDict_SetItemString 1000a1c4 I __imp__PyErr_Format 1000a1c8 I __imp__PyErr_NewException 1000a1cc I __imp__PyErr_Print 1000a1d0 I __imp__PyErr_SetString 1000a1d4 I __imp__PyExc_ImportError 1000a1d8 I __imp__PyExc_RuntimeError 1000a1dc I __imp__PyExc_ValueError 1000a1e0 I __imp__PyImport_ImportModule 1000a1e4 I __imp__PyModule_GetDict 1000a1e8 I __imp__PyObject_GetAttrString 1000a1ec I __imp__PyType_IsSubtype 1000a1f0 I __imp__Py_BuildValue 1000a1f4 I __imp__Py_InitModule4 1000a138 I __imp___impure_ptr 1000a13c I __imp__calloc 1000a140 I __imp__cygwin_detach_dll 1000a144 I __imp__cygwin_internal 1000a164 I __imp__dgeev_ 1000a168 I __imp__dgelsd_ 1000a16c I __imp__dgeqrf_ 1000a170 I __imp__dgesdd_ 1000a174 I __imp__dgesv_ 1000a178 I __imp__dgetrf_ 1000a148 I __imp__dll_dllcrt0 1000a17c I __imp__dorgqr_ 1000a180 I __imp__dpotrf_ 1000a184 I __imp__dsyevd_ 1000a14c I __imp__free 1000a150 I __imp__malloc 1000a154 I __imp__realloc 1000a158 I __imp__snprintf 1000a188 I __imp__zgeev_ 1000a18c I __imp__zgelsd_ 1000a190 I __imp__zgeqrf_ 1000a194 I __imp__zgesdd_ 1000a198 I __imp__zgesv_ 1000a19c I __imp__zgetrf_ 1000a1a0 I __imp__zheevd_ 1000a1a4 I __imp__zpotrf_ 1000a1a8 I __imp__zungqr_ 1000a5b8 I __libkernel32_a_iname 00000000 A __loader_flags__ 00000001 A __major_image_version__ 00000004 A __major_os_version__ 00000004 A __major_subsystem_version__ 00000000 A __minor_image_version__ 00000000 A __minor_os_version__ 00000000 A __minor_subsystem_version__ 1000a388 I __nm__PyCObject_Type 1000a400 I __nm__PyExc_ImportError 1000a414 I __nm__PyExc_RuntimeError 1000a42c I __nm__PyExc_ValueError 1000a204 I __nm___impure_ptr 10005698 T __pei386_runtime_relocator 00001000 A __section_alignment__ 00001000 A __size_of_heap_commit__ 00100000 A __size_of_heap_reserve__ 00001000 A __size_of_stack_commit__ 00200000 A __size_of_stack_reserve__ 00000003 A __subsystem__ 100056c8 T _calloc 1000a54c I _cyglapack_dll_iname 1000a4f8 I _cygwin1_dll_iname 100054c8 T _cygwin_attach_dll 100054b8 T _cygwin_detach_dll 10005748 T _cygwin_internal 10005738 T _cygwin_premain0 10005728 T _cygwin_premain1 10005718 T _cygwin_premain2 10005708 T _cygwin_premain3 10005308 T _dgeev_ 100052f0 T _dgelsd_ 100052c8 T _dgeqrf_ 100052e0 T _dgesdd_ 100052e8 T _dgesv_ 100052d8 T _dgetrf_ 10005658 T _dll_dllcrt0 10008050 b _dll_index 10005668 T _do_pseudo_reloc 100052c0 T _dorgqr_ 100052d0 T _dpotrf_ 10005300 T _dsyevd_ 10008114 B _environ 100056e8 T _free 10005000 T _initlapack_lite 10001000 t _lapack_lite_dgeev 10001d40 t _lapack_lite_dgelsd 10002e20 t _lapack_lite_dgeqrf 100024a0 t _lapack_lite_dgesdd 100021c0 t _lapack_lite_dgesv 10002a70 t _lapack_lite_dgetrf 10003100 t _lapack_lite_dorgqr 10002c90 t _lapack_lite_dpotrf 10001520 t _lapack_lite_dsyevd 10006000 d _lapack_lite_module_documentation 10006020 d _lapack_lite_module_methods 100033c0 t _lapack_lite_zgeev 100038d0 t _lapack_lite_zgelsd 10004a60 t 
_lapack_lite_zgeqrf 100040e0 t _lapack_lite_zgesdd 10003e00 t _lapack_lite_zgesv 100046b0 t _lapack_lite_zgetrf 100018e0 t _lapack_lite_zheevd 100048d0 t _lapack_lite_zpotrf 10004d40 t _lapack_lite_zungqr 1000a5a0 I _libpython2_5_dll_iname 100056f8 T _malloc 100056d8 T _realloc 10005498 T _snprintf 10008020 b _storedHandle 10008040 b _storedPtr 10008030 b _storedReason 100051f0 T _xerbla_ 100052b8 T _zgeev_ 100052b0 T _zgelsd_ 10005288 T _zgeqrf_ 100052a0 T _zgesdd_ 100052a8 T _zgesv_ 10005298 T _zgetrf_ 100052f8 T _zheevd_ 10005290 T _zpotrf_ 10005280 T _zungqr_ 1000a1fc i fthunk 1000a12c i hname 10008060 b u.0 From bsouthey at gmail.com Fri Dec 4 11:31:56 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 04 Dec 2009 10:31:56 -0600 Subject: [Numpy-discussion] Bytes vs. Unicode in Python3 In-Reply-To: <5b8d13220912040812k15e66dabi8eb92b7a13ec13df@mail.gmail.com> References: <1259276898.8494.18.camel@idol> <4B17C380.4090501@student.matnat.uio.no> <200912041323.59338.faltet@pytables.org> <5b8d13220912040812k15e66dabi8eb92b7a13ec13df@mail.gmail.com> Message-ID: <4B19397C.5020300@gmail.com> On 12/04/2009 10:12 AM, David Cournapeau wrote: > On Fri, Dec 4, 2009 at 9:23 PM, Francesc Alted wrote: > >> A Thursday 03 December 2009 14:56:16 Dag Sverre Seljebotn escrigu?: >> >>> Pauli Virtanen wrote: >>> >>>> Thu, 03 Dec 2009 14:03:13 +0100, Dag Sverre Seljebotn wrote: >>>> [clip] >>>> >>>> >>>>> Great! Are you storing the format string in the dtype types as well? (So >>>>> that no release is needed and acquisitions are cheap...) >>>>> >>>> I regenerate it on each buffer acquisition. It's simple low-level C code, >>>> and I suspect it will always be fast enough. Of course, we could *cache* >>>> the result in the dtype. (If dtypes are immutable, which I don't remember >>>> right now.) >>>> >>> We discussed this at SciPy 09 -- basically, they are not necesarrily >>> immutable in implementation, but anywhere they are not that is a bug and >>> no code should depend on their mutability, so we are free to assume so. >>> >> Mmh, the only case that I'm aware about dtype *mutability* is changing the >> names of compound types: >> >> In [19]: t = np.dtype("i4,f4") >> >> In [20]: t >> Out[20]: dtype([('f0', '> >> In [21]: hash(t) >> Out[21]: -9041335829180134223 >> >> In [22]: t.names = ('one', 'other') >> >> In [23]: t >> Out[23]: dtype([('one', '> >> In [24]: hash(t) >> Out[24]: 8637734220020415106 >> >> Perhaps this should be marked as a bug? I'm not sure about that, because the >> above seems quite useful. >> > Hm, that's strange - I get the same hash in both cases, but I thought > I took into account names when I implemented the hashing protocol for > dtype. Which version of numpy on which os are you seeing this ? > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Hi, On the same linux 64-bit Fedora 11, I get the same hash with Python2.4 and numpy 1.3 but different hashes for Python2.6 and numpy 1.4. Bruce Python 2.6 (r26:66714, Jun 8 2009, 16:07:29) [GCC 4.4.0 20090506 (Red Hat 4.4.0-4)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import numpy as np
>>> np.__version__
'1.4.0.dev7750'
>>> t = np.dtype("i4,f4")
>>> t
dtype([('f0', '<i4'), ('f1', '<f4')])
>>> hash(t)
-9041335829180134223
>>> t.names = ('one', 'other')
>>> t
dtype([('one', '<i4'), ('other', '<f4')])
>>> hash(t)
8637734220020415106

Python 2.4.5 (#1, Oct  6 2008, 09:54:35)
[GCC 4.3.2 20080917 (Red Hat 4.3.2-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.__version__
'1.3.0.dev6653'
>>> t = np.dtype("i4,f4")
>>> hash(t)
140053539914640
>>> t.names = ('one', 'other')
>>> hash(t)
140053539914640

From cournape at gmail.com  Fri Dec  4 11:57:14 2009
From: cournape at gmail.com (David Cournapeau)
Date: Sat, 5 Dec 2009 01:57:14 +0900
Subject: [Numpy-discussion] Bytes vs. Unicode in Python3
In-Reply-To: <4B19397C.5020300@gmail.com>
References: <1259276898.8494.18.camel@idol> <4B17C380.4090501@student.matnat.uio.no> <200912041323.59338.faltet@pytables.org> <5b8d13220912040812k15e66dabi8eb92b7a13ec13df@mail.gmail.com> <4B19397C.5020300@gmail.com>
Message-ID: <5b8d13220912040857l3f2b9444n35eeffa2b19bb18@mail.gmail.com>

On Sat, Dec 5, 2009 at 1:31 AM, Bruce Southey wrote:
> On 12/04/2009 10:12 AM, David Cournapeau wrote:
>> On Fri, Dec 4, 2009 at 9:23 PM, Francesc Alted wrote:
>>> [clip]
>>> Perhaps this should be marked as a bug?  I'm not sure about that, because the
>>> above seems quite useful.
>> Hm, that's strange - I get the same hash in both cases, but I thought
>> I took into account names when I implemented the hashing protocol for
>> dtype. Which version of numpy on which os are you seeing this ?
>
> Hi,
> On the same linux 64-bit Fedora 11, I get the same hash with Python2.4
> and numpy 1.3 but different hashes for Python2.6 and numpy 1.4.

Could you check the behavior of 1.4.0 on 2.4 ? The code doing hashing
for dtypes has not changed since 1.3.0, so normally only the python
should have an influence.

David
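For readers following this thread, a minimal sketch (plain numpy; the
printed result is version-dependent, exactly as the transcripts above show)
of why a dtype whose hash changes under mutation is dangerous as a dict key:

    import numpy as np

    t = np.dtype("i4,f4")
    cache = {t: "expensive result"}   # dtype used as a dict key

    t.names = ('one', 'other')        # in-place mutation of the field names

    # On builds where renaming changes hash(t) (e.g. the Python 2.6 /
    # numpy 1.4 transcript above), the key was filed under the old hash
    # and can no longer be looked up:
    print(t in cache)                 # False there; True where hash is unchanged
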
From faltet at pytables.org  Fri Dec  4 12:03:05 2009
From: faltet at pytables.org (Francesc Alted)
Date: Fri, 4 Dec 2009 18:03:05 +0100
Subject: [Numpy-discussion] Bytes vs. Unicode in Python3
In-Reply-To: <5b8d13220912040812k15e66dabi8eb92b7a13ec13df@mail.gmail.com>
References: <1259276898.8494.18.camel@idol> <200912041323.59338.faltet@pytables.org> <5b8d13220912040812k15e66dabi8eb92b7a13ec13df@mail.gmail.com>
Message-ID: <200912041803.05380.faltet@pytables.org>

On Friday 04 December 2009 17:12:09 David Cournapeau wrote:
> > Mmh, the only case that I'm aware about dtype *mutability* is changing
> > the names of compound types:
> >
> > In [19]: t = np.dtype("i4,f4")
> >
> > In [20]: t
> > Out[20]: dtype([('f0', '<i4'), ('f1', '<f4')])
> >
> > In [21]: hash(t)
> > Out[21]: -9041335829180134223
> >
> > In [22]: t.names = ('one', 'other')
> >
> > In [23]: t
> > Out[23]: dtype([('one', '<i4'), ('other', '<f4')])
> >
> > In [24]: hash(t)
> > Out[24]: 8637734220020415106
> >
> > Perhaps this should be marked as a bug? I'm not sure about that, because
> > the above seems quite useful.
>
> Hm, that's strange - I get the same hash in both cases, but I thought
> I took into account names when I implemented the hashing protocol for
> dtype. Which version of numpy on which os are you seeing this ?

numpy: 1.4.0.dev7072
python: 2.6.1

-- 
Francesc Alted

From cournape at gmail.com  Fri Dec  4 12:04:53 2009
From: cournape at gmail.com (David Cournapeau)
Date: Sat, 5 Dec 2009 02:04:53 +0900
Subject: [Numpy-discussion] Bytes vs. Unicode in Python3
In-Reply-To: <5b8d13220912040857l3f2b9444n35eeffa2b19bb18@mail.gmail.com>
References: <1259276898.8494.18.camel@idol> <4B17C380.4090501@student.matnat.uio.no> <200912041323.59338.faltet@pytables.org> <5b8d13220912040812k15e66dabi8eb92b7a13ec13df@mail.gmail.com> <4B19397C.5020300@gmail.com> <5b8d13220912040857l3f2b9444n35eeffa2b19bb18@mail.gmail.com>
Message-ID: <5b8d13220912040904w5c889510o1d4875dced18743b@mail.gmail.com>

On Sat, Dec 5, 2009 at 1:57 AM, David Cournapeau wrote:
> On Sat, Dec 5, 2009 at 1:31 AM, Bruce Southey wrote:
>> On 12/04/2009 10:12 AM, David Cournapeau wrote:
>>> On Fri, Dec 4, 2009 at 9:23 PM, Francesc Alted wrote:
>>>> [clip]
>>>> Perhaps this should be marked as a bug?  I'm not sure about that, because the
>>>> above seems quite useful.
>>>
>>> Hm, that's strange - I get the same hash in both cases, but I thought
>>> I took into account names when I implemented the hashing protocol for
>>> dtype.
>>> Which version of numpy on which os are you seeing this ?
>>>
>>> David
>>
>> Hi,
>> On the same linux 64-bit Fedora 11, I get the same hash with Python2.4
>> and numpy 1.3 but different hashes for Python2.6 and numpy 1.4.
>
> Could you check the behavior of 1.4.0 on 2.4 ? The code doing hashing
> for dtypes has not changed since 1.3.0, so normally only the python
> should have an influence.

When I say should, it should be understood as this is the only reason
why I think it could be different - the behavior should certainly not
depend on the python version.

David

From Chris.Barker at noaa.gov  Fri Dec  4 12:12:23 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Fri, 04 Dec 2009 09:12:23 -0800
Subject: [Numpy-discussion] non-standard standard deviation
In-Reply-To: <703777c60912040418n3586bc15j6435b1a35a77bba1@mail.gmail.com>
References: <26566843.post@talk.nabble.com> <4B131239.7080801@ncf.ca> <2d5132a50911291715k703a0b32t1deffe67cf8947e2@mail.gmail.com> <4B16B103.6070600@ncf.ca> <3d375d730912021229g10d80116hb6341da8bb97bee3@mail.gmail.com> <703777c60912022135v3074ee03mb384359cb6b32622@mail.gmail.com> <703777c60912040418n3586bc15j6435b1a35a77bba1@mail.gmail.com>
Message-ID: <4B1942F7.4030805@noaa.gov>

This is getting OT, as I'm not making any comment on numpy's
implementation, but...

yogesh karpate wrote:
> # As far as normalization by (n) is concerned, the common assumption
> is that the population is normally distributed and the population size
> is large enough to fit the normal distribution. But this standard
> deviation, when applied to a small population, tends to be too low,
> therefore it is called biased.

OK.

> # The correction known as the Bessel correction is there for small
> sample size std. deviation, i.e. normalization by (n-1).

but why only small size -- the "beauty" of the approach is that the
"-1" makes less and less difference the larger n gets.

> # It's shown that for N=16 the std. deviation normalization was (n-1)=15.
> # While I was learning statistics in my course the instructor would
> advise us to take n=20 for normalization by (n-1).

Which introduces a discontinuity -- I never liked discontinuities -- why
bother? For large n, it makes no practical difference; for small n you
want the -1 -- why arbitrarily decide what "small" is?

From an engineering/applied science point of view, I take the view
expressed in the Wikipedia page on Unbiased estimation of standard
deviation: "...the task has little relevance to applications of
statistics..."

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From bsouthey at gmail.com  Fri Dec  4 12:12:36 2009
From: bsouthey at gmail.com (Bruce Southey)
Date: Fri, 04 Dec 2009 11:12:36 -0600
Subject: [Numpy-discussion] Bytes vs.
Unicode in Python3 In-Reply-To: <5b8d13220912040857l3f2b9444n35eeffa2b19bb18@mail.gmail.com> References: <1259276898.8494.18.camel@idol> <4B17C380.4090501@student.matnat.uio.no> <200912041323.59338.faltet@pytables.org> <5b8d13220912040812k15e66dabi8eb92b7a13ec13df@mail.gmail.com> <4B19397C.5020300@gmail.com> <5b8d13220912040857l3f2b9444n35eeffa2b19bb18@mail.gmail.com> Message-ID: <4B194304.4020708@gmail.com> On 12/04/2009 10:57 AM, David Cournapeau wrote: > On Sat, Dec 5, 2009 at 1:31 AM, Bruce Southey wrote: > >> On 12/04/2009 10:12 AM, David Cournapeau wrote: >> >>> On Fri, Dec 4, 2009 at 9:23 PM, Francesc Alted wrote: >>> >>> >>>> A Thursday 03 December 2009 14:56:16 Dag Sverre Seljebotn escrigu?: >>>> >>>> >>>>> Pauli Virtanen wrote: >>>>> >>>>> >>>>>> Thu, 03 Dec 2009 14:03:13 +0100, Dag Sverre Seljebotn wrote: >>>>>> [clip] >>>>>> >>>>>> >>>>>> >>>>>>> Great! Are you storing the format string in the dtype types as well? (So >>>>>>> that no release is needed and acquisitions are cheap...) >>>>>>> >>>>>>> >>>>>> I regenerate it on each buffer acquisition. It's simple low-level C code, >>>>>> and I suspect it will always be fast enough. Of course, we could *cache* >>>>>> the result in the dtype. (If dtypes are immutable, which I don't remember >>>>>> right now.) >>>>>> >>>>>> >>>>> We discussed this at SciPy 09 -- basically, they are not necesarrily >>>>> immutable in implementation, but anywhere they are not that is a bug and >>>>> no code should depend on their mutability, so we are free to assume so. >>>>> >>>>> >>>> Mmh, the only case that I'm aware about dtype *mutability* is changing the >>>> names of compound types: >>>> >>>> In [19]: t = np.dtype("i4,f4") >>>> >>>> In [20]: t >>>> Out[20]: dtype([('f0', '>>> >>>> In [21]: hash(t) >>>> Out[21]: -9041335829180134223 >>>> >>>> In [22]: t.names = ('one', 'other') >>>> >>>> In [23]: t >>>> Out[23]: dtype([('one', '>>> >>>> In [24]: hash(t) >>>> Out[24]: 8637734220020415106 >>>> >>>> Perhaps this should be marked as a bug? I'm not sure about that, because the >>>> above seems quite useful. >>>> >>>> >>> Hm, that's strange - I get the same hash in both cases, but I thought >>> I took into account names when I implemented the hashing protocol for >>> dtype. Which version of numpy on which os are you seeing this ? >>> >>> David >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> Hi, >> On the same linux 64-bit Fedora 11, I get the same hash with Python2.4 >> and numpy 1.3 but different hashes for Python2.6 and numpy 1.4. >> > Could you check the behavior of 1.4.0 on 2.4 ? The code doing hashing > for dtypes has not changed since 1.3.0, so normally only the python > should have an influence. > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > These are different with Python 2.4 and numpy 1.4. Curiously I got different hash values with Python 2.5 and numpy 1.3. (For what it is worth, I get the same hash values with Python 2.3 with numpy 1.1.1). Bruce Python 2.5.2 (r252:60911, Nov 18 2008, 09:20:42) [GCC 4.3.2 20081105 (Red Hat 4.3.2-7)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import numpy as np >>> np.__version__ '1.3.0' >>> t = np.dtype("i4,f4") >>> hash(t) -9041335829180134223 >>> t.names = ('one', 'other') >>> hash(t) 8637734220020415106 [bsouthey at starling python]$ /usr/local/bin/python2.4 Python 2.4.5 (#1, Oct 6 2008, 09:54:35) [GCC 4.3.2 20080917 (Red Hat 4.3.2-4)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> np.__version__ '1.4.0rc1' >>> t = np.dtype("i4,f4") >>> hash(t) -9041335829180134223 >>> t.names = ('one', 'other') >>> hash(t) 8637734220020415106 Python 2.3.7 (#1, Oct 6 2008, 09:55:54) [GCC 4.3.2 20080917 (Red Hat 4.3.2-4)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> np.__version__ '1.1.1' >>> t = np.dtype("i4,f4") >>> hash(t) 140552637936672 >>> t.names = ('one', 'other') >>> hash(t) 140552637936672 From oliphant at enthought.com Sat Dec 5 02:12:44 2009 From: oliphant at enthought.com (Travis Oliphant) Date: Sat, 5 Dec 2009 01:12:44 -0600 Subject: [Numpy-discussion] Applying Patch #1085 In-Reply-To: References: <5C367222-2FDF-4B1D-8DD4-4987E0EB10CA@enthought.com> Message-ID: <75E2A16E-83FF-4467-8543-8AA1060DB27B@enthought.com> On Dec 4, 2009, at 10:09 AM, Charles R Harris wrote: > > > On Fri, Dec 4, 2009 at 6:29 AM, Travis Oliphant > wrote: > > What do people think of applying patch #1085. This patch makes a > copy of inputs when the input and output views overlap in ways in > which one computation will change later computations. > > > I'd rename the function views_share_base_data, make the loop to find > the base an inplace function, and break the long lines in the test. > > It looks like the routine doesn't try to determine if the views > actually overlap, just if they might potentially share data. Is that > correct? That seems safe and if the time isn't much it might be a > nice safety catch. But it shouldn't go into 1.4. It sounds like with a little clean-up that it can be applied to the trunk. I agree that it shouldn't go into 1.4. -Travis -------------- next part -------------- An HTML attachment was scrubbed... URL: From dagss at student.matnat.uio.no Sat Dec 5 05:16:55 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 05 Dec 2009 11:16:55 +0100 Subject: [Numpy-discussion] Bytes vs. Unicode in Python3 In-Reply-To: <200912041323.59338.faltet@pytables.org> References: <1259276898.8494.18.camel@idol> <4B17C380.4090501@student.matnat.uio.no> <200912041323.59338.faltet@pytables.org> Message-ID: <4B1A3317.5060406@student.matnat.uio.no> Francesc Alted wrote: > A Thursday 03 December 2009 14:56:16 Dag Sverre Seljebotn escrigu?: >> Pauli Virtanen wrote: >>> Thu, 03 Dec 2009 14:03:13 +0100, Dag Sverre Seljebotn wrote: >>> [clip] >>> >>>> Great! Are you storing the format string in the dtype types as well? (So >>>> that no release is needed and acquisitions are cheap...) >>> I regenerate it on each buffer acquisition. It's simple low-level C code, >>> and I suspect it will always be fast enough. Of course, we could *cache* >>> the result in the dtype. (If dtypes are immutable, which I don't remember >>> right now.) >> We discussed this at SciPy 09 -- basically, they are not necesarrily >> immutable in implementation, but anywhere they are not that is a bug and >> no code should depend on their mutability, so we are free to assume so. 
> > Mmh, the only case that I'm aware about dtype *mutability* is changing the > names of compound types: > > In [19]: t = np.dtype("i4,f4") > > In [20]: t > Out[20]: dtype([('f0', ' > In [21]: hash(t) > Out[21]: -9041335829180134223 > > In [22]: t.names = ('one', 'other') > > In [23]: t > Out[23]: dtype([('one', ' > In [24]: hash(t) > Out[24]: 8637734220020415106 > > Perhaps this should be marked as a bug? I'm not sure about that, because the > above seems quite useful. Well, I for one don't like this, but that's just an opinion. I think it is unwise to leave object which supports hash() mutable, because it's too easy to make hard to find bugs (sticking a dtype as a key in a dict is rather useful in many situations). There's a certain tradition in Python for leaving types immutable if possible, and dtype certainly feels like it. Anyway, the buffer PEP can be supported simply by updating the buffer format string on the "names" setter, so it's an orthogonal issue. BTW note that the buffer PEP provides for supplying names of fields: T{ i:one: f:other: } (or similar). NumPy should probably do so at one point in the future; the Cython implementation doesn't because Cython doesn't use this information. -- Dag Sverre From tpk at kraussfamily.org Sat Dec 5 09:20:30 2009 From: tpk at kraussfamily.org (Tom K.) Date: Sat, 5 Dec 2009 06:20:30 -0800 (PST) Subject: [Numpy-discussion] ANN: upfirdn 0.1.0 Message-ID: <26656025.post@talk.nabble.com> (also posted on scipy-user) ANNOUNCEMENT I am pleased to announce the initial release of "upfirdn." This package provides an efficient polyphase FIR resampler object (SWIG-ed C++) and some python wrappers. https://opensource.motorola.com/sf/projects/upfirdn MOTIVATION As a long time user of MATLAB and the Signal Processing Toolbox, I have missed an "upfirdn" analogue in numpy / scipy since I switched over to python a couple years ago. I've been looking for a way to contribute to the wider numpy / scipy community since I love these tools and appreciate all the efforts of those who have developed them. Since we have the polyphase resampling functionality within Motorola and I use it for my work, I thought it best to go the "official route" and get approval from my company to publish it under a BSD compatible license. NOTES TO USERS AND REGARDING INSTALL It is my hope that others find this functionality useful. Suggestions for improvements or bug reports are welcome. As installation is an area that I am very green, I recommend only installing it locally for now, e.g. python setup.py build_ext -i or even just make I'd like to spend some time learning more about distutils and release a version with a more user-friendly install. -- View this message in context: http://old.nabble.com/ANN%3A-upfirdn-0.1.0-tp26656025p26656025.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From cournape at gmail.com Sat Dec 5 10:21:58 2009 From: cournape at gmail.com (David Cournapeau) Date: Sun, 6 Dec 2009 00:21:58 +0900 Subject: [Numpy-discussion] Bytes vs. Unicode in Python3 In-Reply-To: <4B1A3317.5060406@student.matnat.uio.no> References: <1259276898.8494.18.camel@idol> <4B17C380.4090501@student.matnat.uio.no> <200912041323.59338.faltet@pytables.org> <4B1A3317.5060406@student.matnat.uio.no> Message-ID: <5b8d13220912050721i5ae932cfndfc341a730ec2612@mail.gmail.com> On Sat, Dec 5, 2009 at 7:16 PM, Dag Sverre Seljebotn wrote: >> Perhaps this should be marked as a bug? ?I'm not sure about that, because the >> above seems quite useful. 
> > Well, I for one don't like this, but that's just an opinion. I think it > is unwise to leave object which supports hash() mutable, because it's > too easy to make hard to find bugs (sticking a dtype as a key in a dict > is rather useful in many situations). There's a certain tradition in > Python for leaving types immutable if possible, and dtype certainly > feels like it. I agree the behavior is a bit surprising, but I don't know if code relies on "compound" dtype names to be immutable out there. Also, the fact that names attribute is a tuple and not a list also suggests that the intent is to be immutable. I am more worried about the variations between python versions ATM, though, I have no idea where it is coming from. David From charlesr.harris at gmail.com Sat Dec 5 10:33:54 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 5 Dec 2009 08:33:54 -0700 Subject: [Numpy-discussion] Applying Patch #1085 In-Reply-To: <75E2A16E-83FF-4467-8543-8AA1060DB27B@enthought.com> References: <5C367222-2FDF-4B1D-8DD4-4987E0EB10CA@enthought.com> <75E2A16E-83FF-4467-8543-8AA1060DB27B@enthought.com> Message-ID: On Sat, Dec 5, 2009 at 12:12 AM, Travis Oliphant wrote: > > On Dec 4, 2009, at 10:09 AM, Charles R Harris wrote: > > > > On Fri, Dec 4, 2009 at 6:29 AM, Travis Oliphant wrote: > >> >> What do people think of applying patch #1085. This patch makes a copy of >> inputs when the input and output views overlap in ways in which one >> computation will change later computations. >> >> > I'd rename the function views_share_base_data, make the loop to find the > base an inplace function, and break the long lines in the test. > > It looks like the routine doesn't try to determine if the views actually > overlap, just if they might potentially share data. Is that correct? That > seems safe and if the time isn't much it might be a nice safety catch. But > it shouldn't go into 1.4. > > > It sounds like with a little clean-up that it can be applied to the trunk. > I agree that it shouldn't go into 1.4. > > Agreed, it looks to add a bit of safety. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From dyamins at gmail.com Sat Dec 5 12:03:50 2009 From: dyamins at gmail.com (Dan Yamins) Date: Sat, 5 Dec 2009 12:03:50 -0500 Subject: [Numpy-discussion] Numpy/Scipy for EC2 In-Reply-To: <45d1ab480912042232v479bd00aqcd83ea6ba03692b7@mail.gmail.com> References: <15e4667e0910282129t4ad7f5eble1de56c91e6cff25@mail.gmail.com> <3d375d730910290841u78343cd2v4715cd40001bb09a@mail.gmail.com> <15e4667e0910290845te2f9217gf03efa635ff82bd4@mail.gmail.com> <15e4667e0911191745y6e2b3e20pceaddd05aa3c0554@mail.gmail.com> <45d1ab480911200047o2a2cb230i4cf75ce8211a8c58@mail.gmail.com> <45d1ab480912042232v479bd00aqcd83ea6ba03692b7@mail.gmail.com> Message-ID: <15e4667e0912050903g4aecac4at3617f525fb417fb4@mail.gmail.com> On Sat, Dec 5, 2009 at 1:32 AM, David Goldsmith wrote: > Dan- > > I almost hate to ask - after what you've already provided, which is > substantial: did you ever try out what alestic had to offer? If not, I may > be that guinea pig. ;-) > > David -- I'm posting my response to your question to the group as well ... I never tried out the alestic AMIs *directly* ... that's because at least as far as I could tell (I might be wrong about this), they didn't have blas, lapack, scipy, etc prebuilt on them. 
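(As an aside on what StarCluster is automating, described further below: a
hedged sketch of starting a single instance from one of these AMIs by hand
with boto. The image id and key name are placeholders, not real
Alestic/StarCluster ids:)

    import boto

    # boto reads AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY from the environment
    conn = boto.connect_ec2()
    res = conn.run_instances('ami-XXXXXXXX',       # placeholder AMI id
                             instance_type='m1.small',
                             key_name='mykey')     # placeholder keypair name
    print(res.instances[0].state)                  # 'pending' while booting
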
But the whole StarCluster thing *is* basically Alestic fundamentally, since the AMIs that StarCluster provides were themselves built directly from Alestic's 32-bit and 64-bit Ubuntu jaunty AMIs ... with comparatively small but crucial modifications (building optimized blas, scipy, the Sun Grid Engine, NFS &c). From a "systems" point of view should basically have the same stability properties. The real strength of starcluster is that, in addition to those small but crucial additions to the Alestic AMIs, Starcluster also provides a simple command for making startup/admin/termination of a multi-node cluster really easy. With one invocation, it will interface with EC2 commands to start up the relevant machine instances, mount a shared NFS drive from an amazon EBS volume, start up the grid engine, etc... so that all the headaches of setting up and breaking down the pieces of a multi-node shared-memory cluster are completely handled for you. Is there a particular reason you had in mind for using Alestic directly? I'd be interested to hear your use case ... Dan Btw -- I got an email from Justin Riley, the developer of starcluster -- he seems to be hard at work on the next release (which will have hadoop, couchdb, and some important improvements to the management of multiple clusters. still no word on dynamic load balancing tho) -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjw at ncf.ca Sat Dec 5 14:14:56 2009 From: cjw at ncf.ca (Colin J. Williams) Date: Sat, 05 Dec 2009 14:14:56 -0500 Subject: [Numpy-discussion] non-standard standard deviation In-Reply-To: <1259922107.2944.0.camel@talisman> References: <26566843.post@talk.nabble.com> <4B131239.7080801@ncf.ca> <2d5132a50911291715k703a0b32t1deffe67cf8947e2@mail.gmail.com> <4B16B103.6070600@ncf.ca> <3d375d730912021229g10d80116hb6341da8bb97bee3@mail.gmail.com> <703777c60912022135v3074ee03mb384359cb6b32622@mail.gmail.com> <4B17B3F5.60409@ncf.ca> <7f014ea60912040219u2b46252bib0e47b26d6b10d67@mail.gmail.com> <1259922107.2944.0.camel@talisman> Message-ID: <4B1AB130.5070002@ncf.ca> On 04-Dec-09 05:21 AM, Pauli Virtanen wrote: > pe, 2009-12-04 kello 11:19 +0100, Chris Colbert kirjoitti: > >> Why cant the divisor constant just be made an optional kwarg that >> defaults to zero? >> > It already is an optional kwarg that defaults to zero. > > Cheers, > I suggested that 1 (one) would be a better default but Robert Kern told us that it won't happen. Colin W. From cjw at ncf.ca Sat Dec 5 14:28:12 2009 From: cjw at ncf.ca (Colin J. Williams) Date: Sat, 05 Dec 2009 14:28:12 -0500 Subject: [Numpy-discussion] non-standard standard deviation In-Reply-To: <703777c60912040418n3586bc15j6435b1a35a77bba1@mail.gmail.com> References: <26566843.post@talk.nabble.com> <4B131239.7080801@ncf.ca> <2d5132a50911291715k703a0b32t1deffe67cf8947e2@mail.gmail.com> <4B16B103.6070600@ncf.ca> <3d375d730912021229g10d80116hb6341da8bb97bee3@mail.gmail.com> <703777c60912022135v3074ee03mb384359cb6b32622@mail.gmail.com> <703777c60912040418n3586bc15j6435b1a35a77bba1@mail.gmail.com> Message-ID: <4B1AB44C.9040804@ncf.ca> On 04-Dec-09 07:18 AM, yogesh karpate wrote: > @ Pauli and @ Colin: > Sorry for the late reply. I was busy > in some other assignments. > # As far as normalization by(n) is concerned then its common > assumption that the population is normally distributed and population > size is fairly large enough to fit the normal distribution. 
But this > standard deviation, when applied to a small population, tends to be > too low therefore it is called as biased. > # The correction known as bessel correction is there for small sample > size std. deviation. i.e. normalization by (n-1). > # In "electrical-and-electronic-measurements-and-instrumentation" by > A.K. Sawhney . In 1st chapter of the book "Fundamentals of > Meausrements " . Its shown that for N=16 the std. deviation > normalization was (n-1)=15 > # While I was learning statistics in my course Instructor would advise > to take n=20 for normalization by (n-1) > # Probability and statistics by Schuam Series is good reading. > Regards > ~ymk > > > > Yogesh, Thanks for the Bessel name, I hadn't come across that before. The Wikipedia reference for the Bessel Correction uses a divisor of n-1: http://en.wikipedia.org/wiki/Bessel%27s_correction Perhaps the simplification for larger n comes from the fact that for large n, 1/n => 1/(n-1). I would suggest C. E. Weatherburn - Mathematical Statistics, but I doubt whether it is still widely available. Colin W. From d.l.goldsmith at gmail.com Sat Dec 5 17:06:50 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Sat, 5 Dec 2009 14:06:50 -0800 Subject: [Numpy-discussion] Numpy/Scipy for EC2 In-Reply-To: <15e4667e0912050903g4aecac4at3617f525fb417fb4@mail.gmail.com> References: <15e4667e0910282129t4ad7f5eble1de56c91e6cff25@mail.gmail.com> <3d375d730910290841u78343cd2v4715cd40001bb09a@mail.gmail.com> <15e4667e0910290845te2f9217gf03efa635ff82bd4@mail.gmail.com> <15e4667e0911191745y6e2b3e20pceaddd05aa3c0554@mail.gmail.com> <45d1ab480911200047o2a2cb230i4cf75ce8211a8c58@mail.gmail.com> <45d1ab480912042232v479bd00aqcd83ea6ba03692b7@mail.gmail.com> <15e4667e0912050903g4aecac4at3617f525fb417fb4@mail.gmail.com> Message-ID: <45d1ab480912051406x2c866ffcl7b10140cfb0c3e87@mail.gmail.com> On Sat, Dec 5, 2009 at 9:03 AM, Dan Yamins wrote: > > > On Sat, Dec 5, 2009 at 1:32 AM, David Goldsmith wrote: > >> Dan- >> >> I almost hate to ask - after what you've already provided, which is >> substantial: did you ever try out what alestic had to offer? If not, I may >> be that guinea pig. ;-) >> >> > David -- I'm posting my response to your question to the group as well ... > > > I never tried out the alestic AMIs *directly* ... that's because at least > as far as I could tell (I might be wrong about this), they didn't have blas, > lapack, scipy, etc prebuilt on them. > > But the whole StarCluster thing *is* basically Alestic fundamentally, > since the AMIs that StarCluster provides were themselves built directly from > Alestic's 32-bit and 64-bit Ubuntu jaunty AMIs ... with comparatively small > but crucial modifications (building optimized blas, scipy, the Sun Grid > Engine, NFS &c). From a "systems" point of view should basically have the > same stability properties. > > The real strength of starcluster is that, in addition to those small but > crucial additions to the Alestic AMIs, Starcluster also provides a simple > command for making startup/admin/termination of a multi-node cluster really > easy. With one invocation, it will interface with EC2 commands to start up > the relevant machine instances, mount a shared NFS drive from an amazon EBS > volume, start up the grid engine, etc... so that all the headaches of > setting up and breaking down the pieces of a multi-node shared-memory > cluster are completely handled for you. > > Is there a particular reason you had in mind for using Alestic directly? > I'd be interested to hear your use case ... 
>
Thanks, Dan. Nope, no particular reason other than "coverage within the
community": now that you've explained the relationship between the two (if,
between the initial description of the Alestic option and your thorough
description of StarCluster, it was supposed to be clear that the latter was
just an enhanced version of the former, it flew by me while I was paying
attention to technical details), I'm perfectly happy to follow the trail
you've already blazed.

Thanks again,
DG

> Dan
>
> Btw -- I got an email from Justin Riley, the developer of StarCluster -- he
> seems to be hard at work on the next release (which will have hadoop,
> couchdb, and some important improvements to the management of multiple
> clusters; still no word on dynamic load balancing, though).
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sturla at molden.no  Sat Dec  5 18:52:17 2009
From: sturla at molden.no (Sturla Molden)
Date: Sun, 06 Dec 2009 00:52:17 +0100
Subject: [Numpy-discussion] non-standard standard deviation
In-Reply-To: <4B1AB130.5070002@ncf.ca>
References: <26566843.post@talk.nabble.com> <4B131239.7080801@ncf.ca> <2d5132a50911291715k703a0b32t1deffe67cf8947e2@mail.gmail.com> <4B16B103.6070600@ncf.ca> <3d375d730912021229g10d80116hb6341da8bb97bee3@mail.gmail.com> <703777c60912022135v3074ee03mb384359cb6b32622@mail.gmail.com> <4B17B3F5.60409@ncf.ca> <7f014ea60912040219u2b46252bib0e47b26d6b10d67@mail.gmail.com> <1259922107.2944.0.camel@talisman> <4B1AB130.5070002@ncf.ca>
Message-ID: <4B1AF231.6030508@molden.no>

Colin J. Williams wrote:
>
> I suggested that 1 (one) would be a better default but Robert Kern told
> us that it won't happen.
>

I don't even see the need for this keyword argument, as you can always
multiply the variance by n/(n-1) to get what you want. Also,
normalization by n gives the ML estimate (yes it has a bias, but it is
better anyway). It is a common novice mistake to use 1/(n-1) as
normalization, probably due to poor advice in introductory statistics
textbooks. It also seems that frequentists are more scared about this
"bias" boogey monster than Bayesians. It may actually help beginners to
avoid this mistake if numpy's implementation prompts them to ask why
the normalization is 1/n.

If numpy is to change the implementation of std, var, and cov, I
suggest using the two-pass algorithm to reduce rounding error. (I can
provide C code.) This is much more important than changing the
normalization to a bias-free but otherwise inferior value.

Sturla
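As a reference point for the two-pass algorithm Sturla mentions (he offers
C code; this is only an illustrative Python sketch, with the contested n
vs. n-1 normalization exposed as a ddof argument):

    import numpy as np

    def two_pass_var(x, ddof=0):
        # Two-pass variance: compute the mean first, then average the
        # squared deviations.  This avoids the cancellation error of the
        # naive one-pass formula E[x^2] - E[x]^2.
        x = np.asarray(x, dtype=float)
        n = x.size
        mean = x.sum() / n           # first pass
        dev = x - mean               # second pass
        return (dev * dev).sum() / (n - ddof)

    # The naive formula loses precision when the mean dwarfs the spread:
    x = 1e8 + np.random.randn(1000)
    print(two_pass_var(x))                 # close to 1.0
    print((x**2).mean() - x.mean()**2)     # inaccurate, can even go negative
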
From pav at iki.fi  Sat Dec  5 19:41:06 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Sun, 06 Dec 2009 02:41:06 +0200
Subject: [Numpy-discussion] Py3 merge
Message-ID: <1260060065.5576.80.camel@idol>

Hi,

I'd like to commit my Py3 Numpy branch to SVN trunk soon:

	http://github.com/pv/numpy-work/commits/py3k

For an overview, check the notes:

	http://github.com/pv/numpy-work/blob/py3k/doc/Py3K.txt

None of the changes should affect behavior on Py2. The test status
currently is:

Python 3.1:
	Ran 1964 tests in 10.294s
	FAILED (KNOWNFAIL=5, errors=435, failures=74)

Python 2.4:
	Ran 2480 tests in 18.209s
	OK (KNOWNFAIL=4, SKIP=2)

Python 2.5:
	Ran 2483 tests in 18.552s
	OK (KNOWNFAIL=4)

Python 2.6:
	Ran 2484 tests in 20.359s
	OK (KNOWNFAIL=5)

The next TODOs are:

- Map 'S' scalartype to 'U' on Py3, and add a new scalartype for bytes
- Finish the rest of the PyString transition
- Fix I/O
- Hammer out causes for test failures

More TODOs are listed in Py3K.txt.

	Pauli

From dwf at cs.toronto.edu  Sat Dec  5 20:27:48 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Sat, 5 Dec 2009 20:27:48 -0500
Subject: [Numpy-discussion] another numpy/ATLAS problem
In-Reply-To: <1CF7EB5A-F17B-4E2C-BD19-35AED576DEC8@cs.toronto.edu>
References: <662818A6-E8EE-4392-9BED-D281D07D1DFC@cs.toronto.edu> <1CF7EB5A-F17B-4E2C-BD19-35AED576DEC8@cs.toronto.edu>
Message-ID: <204B9FD5-165F-4EDA-9BAF-10AB8181B841@cs.toronto.edu>

On 29-Nov-09, at 4:46 PM, David Warde-Farley wrote:
> On 27-Nov-09, at 6:29 PM, Charles R Harris wrote:
>> 3.9.12 segfaulted on me while running, so I haven't bothered with
>> versions after that. Why not try the stable version 3.8.3?
>
> Just to follow up, I went back to 3.9.11 and numpy works without
> incident, using the exact same ATLAS build procedure; I guess it
> wasn't something I was doing wrong after all.
>
> I've filed it on the ATLAS tracker, so hopefully it'll be addressed
> sooner or later.

Below is the response I received from Clint Whaley re: dlamc3_ and
NumPy. It seems that NumPy's lapack_lite is making some assumptions
about routines which aren't actually there in recent (3.9.16 and on)
versions of ATLAS, as it provides a speedier version of DLAMCH.

I've briefly been trying to figure out how to work around this, but
I'm a bit confused. I was under the (mistaken?) impression that the
f2c'd code in dlapack_lite.c, zlapack_lite.c and dlamch.c was only
used in the absence of an external BLAS/LAPACK, so that NumPy can fall
back on its own; yet these are the only places dlamc3_ is referenced.

The ATLAS ticket is at http://tinyurl.com/y8dv5w8 if you wish to
respond directly as well.

--------------------------------------------------------------------------------------------

David,

Well, dlamc3 is an internal routine called by dlamch in LAPACK in
order to keep register assignment from screwing up their internal
tests. It is not part of the official LAPACK API AFAIK.

It looks like some part of NumPy is calling this routine, expecting
that DLAMCH will provide it. However, from 3.9.16 on, ATLAS provides a
native LAMCH implementation, that reads in a generated file for
improved speed. This LAMCH does no computation at all, and so
obviously does not provide the internal routine DLAMC3.

Can you ask the NumPy people to comment on why they are calling this
routine?

Thanks,
Clint

From pav at iki.fi  Sat Dec  5 21:04:57 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Sun, 06 Dec 2009 04:04:57 +0200
Subject: [Numpy-discussion] another numpy/ATLAS problem
In-Reply-To: <204B9FD5-165F-4EDA-9BAF-10AB8181B841@cs.toronto.edu>
References: <662818A6-E8EE-4392-9BED-D281D07D1DFC@cs.toronto.edu> <1CF7EB5A-F17B-4E2C-BD19-35AED576DEC8@cs.toronto.edu> <204B9FD5-165F-4EDA-9BAF-10AB8181B841@cs.toronto.edu>
Message-ID: <1260065097.32703.11.camel@idol>

On Sat, 2009-12-05 at 20:27 -0500, David Warde-Farley wrote:
[clip]
> I've briefly been trying to figure out how to work around this, but
> I'm a bit confused. I was under the (mistaken?)
impression that the > f2c'd code in dlapack_lite.c, zlapack_lite.c and dlamch.c was only > used in the absence of an external BLAS/LAPACK, so that NumPy can fall > back on its own; yet these are the only places dlamc3_ is referenced. I think your impression is not mistaken, the f2c'd stuff really is intended to be included only when a real lapack is not available. And lamc3_ is not called in lapack_litemodule.c... Even more curiously, dlamc3_ is actually included in dlamch.c, so if that file is included in the build, one would expect the routine be available... Perhaps there is some distutils mystery going on, maybe the depends list becomes reordered or something, messing up the list of files included in the build. In fact, I don't understand why linalg/setup.py uses a callback to construct the source file list -- it would be simpler to construct the source file list before passing it on to config.add_extension. Can you try to change linalg/setup.py so that it *only* includes lapack_litemodule.c in the build? -- Pauli Virtanen From charlesr.harris at gmail.com Sat Dec 5 21:18:36 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 5 Dec 2009 19:18:36 -0700 Subject: [Numpy-discussion] Py3 merge In-Reply-To: <1260060065.5576.80.camel@idol> References: <1260060065.5576.80.camel@idol> Message-ID: On Sat, Dec 5, 2009 at 5:41 PM, Pauli Virtanen wrote: > Hi, > > I'd like to commit my Py3 Numpy branch to SVN trunk soon: > > http://github.com/pv/numpy-work/commits/py3k > > For an overview, check the notes: > > http://github.com/pv/numpy-work/blob/py3k/doc/Py3K.txt > > That's a nice set of notes, very informative. > None of the changes should affect behavior on Py2. The test status > currently is: > > Python 3.1: > Ran 1964 tests in 10.294s > FAILED (KNOWNFAIL=5, errors=435, failures=74) > > Heh, at least we are far enough along to get errors and failures. > Python 2.4: > Ran 2480 tests in 18.209s > OK (KNOWNFAIL=4, SKIP=2) > > Python 2.5: > Ran 2483 tests in 18.552s > OK (KNOWNFAIL=4) > > Python 2.6: > Ran 2484 tests in 20.359s > OK (KNOWNFAIL=5) > > The next TODOs are: > > - Map 'S' scalartype to 'U' on Py3, and add a new scalartype for bytes > - Finish the rest of the PyString transition > - Fix I/O > - Hammer out causes for test failures > > We need character arrays for the astro people. I assume these will be byte arrays. Maybe Michael will weigh in here. > More TODOs are listed in Py3K.txt. > > Work, work, work. Great start. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Dec 5 21:30:43 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 5 Dec 2009 19:30:43 -0700 Subject: [Numpy-discussion] Py3 merge In-Reply-To: References: <1260060065.5576.80.camel@idol> Message-ID: On Sat, Dec 5, 2009 at 7:18 PM, Charles R Harris wrote: > > > On Sat, Dec 5, 2009 at 5:41 PM, Pauli Virtanen wrote: > >> Hi, >> >> I'd like to commit my Py3 Numpy branch to SVN trunk soon: >> >> http://github.com/pv/numpy-work/commits/py3k >> >> For an overview, check the notes: >> >> http://github.com/pv/numpy-work/blob/py3k/doc/Py3K.txt >> >> > That's a nice set of notes, very informative. > > >> None of the changes should affect behavior on Py2. The test status >> currently is: >> >> Python 3.1: >> Ran 1964 tests in 10.294s >> FAILED (KNOWNFAIL=5, errors=435, failures=74) >> >> > Heh, at least we are far enough along to get errors and failures. 
> > >> Python 2.4: >> Ran 2480 tests in 18.209s >> OK (KNOWNFAIL=4, SKIP=2) >> >> Python 2.5: >> Ran 2483 tests in 18.552s >> OK (KNOWNFAIL=4) >> >> Python 2.6: >> Ran 2484 tests in 20.359s >> OK (KNOWNFAIL=5) >> >> The next TODOs are: >> >> - Map 'S' scalartype to 'U' on Py3, and add a new scalartype for bytes >> - Finish the rest of the PyString transition >> - Fix I/O >> - Hammer out causes for test failures >> >> > We need character arrays for the astro people. I assume these will be byte > arrays. Maybe Michael will weigh in here. > > >> More TODOs are listed in Py3K.txt. >> >> > Work, work, work. Great start. > > PS, I wonder if it would be useful to keep the detailed info on the new type structures in a supplementary file. They might be useful to someone else as I had to dig through the Py3k sources to get them. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Sat Dec 5 22:54:56 2009 From: cournape at gmail.com (David Cournapeau) Date: Sun, 6 Dec 2009 12:54:56 +0900 Subject: [Numpy-discussion] Py3 merge In-Reply-To: <1260060065.5576.80.camel@idol> References: <1260060065.5576.80.camel@idol> Message-ID: <5b8d13220912051954u6706e991hf41612a51b45ce48@mail.gmail.com> On Sun, Dec 6, 2009 at 9:41 AM, Pauli Virtanen wrote: > Hi, > > I'd like to commit my Py3 Numpy branch to SVN trunk soon: > > ? ? ? ?http://github.com/pv/numpy-work/commits/py3k Awesome - I think we should merge this ASAP. In particular, I would like to start fixing platforms-specific issues. Concerning nose, will there be any version which works on both py2 and py3 ? David From aisaac at american.edu Sat Dec 5 23:27:54 2009 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 05 Dec 2009 23:27:54 -0500 Subject: [Numpy-discussion] How to solve homogeneous linear equations with NumPy? In-Reply-To: References: Message-ID: <4B1B32CA.4090206@american.edu> On 12/3/2009 12:40 AM, Peter Cai wrote: > If I have homogeneous linear equations like this > > array([[-0.75, 0.25, 0.25, 0.25], > [ 1. , -1. , 0. , 0. ], > [ 1. , 0. , -1. , 0. ], > [ 1. , 0. , 0. , -1. ]]) > > And I want to get a non-zero solution for it. How can it be done with NumPy? > > linalg.solve only works on A * x = b where b does not contains only 0. Just to be clear: linalg.solve does not have any problem with b=0. It does have a problem with a singular coefficient matrix, like the one above. But I suppose Chuck answered your real question. fwiw, Alan Isaac From faltet at pytables.org Sun Dec 6 05:47:23 2009 From: faltet at pytables.org (Francesc Alted) Date: Sun, 6 Dec 2009 11:47:23 +0100 Subject: [Numpy-discussion] Bytes vs. Unicode in Python3 In-Reply-To: <4B1A3317.5060406@student.matnat.uio.no> References: <1259276898.8494.18.camel@idol> <200912041323.59338.faltet@pytables.org> <4B1A3317.5060406@student.matnat.uio.no> Message-ID: <200912061147.23728.faltet@pytables.org> A Saturday 05 December 2009 11:16:55 Dag Sverre Seljebotn escrigu?: > > Mmh, the only case that I'm aware about dtype *mutability* is changing > > the names of compound types: > > > > In [19]: t = np.dtype("i4,f4") > > > > In [20]: t > > Out[20]: dtype([('f0', ' > > > In [21]: hash(t) > > Out[21]: -9041335829180134223 > > > > In [22]: t.names = ('one', 'other') > > > > In [23]: t > > Out[23]: dtype([('one', ' > > > In [24]: hash(t) > > Out[24]: 8637734220020415106 > > > > Perhaps this should be marked as a bug? I'm not sure about that, because > > the above seems quite useful. 
> > Well, I for one don't like this, but that's just an opinion. I think it > is unwise to leave object which supports hash() mutable, because it's > too easy to make hard to find bugs (sticking a dtype as a key in a dict > is rather useful in many situations). There's a certain tradition in > Python for leaving types immutable if possible, and dtype certainly > feels like it. Yes, I think you are right and force dtype to be immutable would be the best. As a bonus, an immutable dtype would render this ticket: http://projects.scipy.org/numpy/ticket/1127 without effect. -- Francesc Alted From gael.varoquaux at normalesup.org Sun Dec 6 06:13:01 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 6 Dec 2009 12:13:01 +0100 Subject: [Numpy-discussion] Chararray depreciated? Message-ID: <20091206111300.GC897@phare.normalesup.org> http://docs.scipy.org/doc/numpy/reference/generated/numpy.chararray.html says that chararray are depreciated. I think I saw a discussion on the mailing list that hinted otherwise. Which one is true? Should I correct the docs? Cheers, Ga?l From ralf.gommers at googlemail.com Sun Dec 6 07:11:24 2009 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 6 Dec 2009 13:11:24 +0100 Subject: [Numpy-discussion] Chararray depreciated? In-Reply-To: <20091206111300.GC897@phare.normalesup.org> References: <20091206111300.GC897@phare.normalesup.org> Message-ID: On Sun, Dec 6, 2009 at 12:13 PM, Gael Varoquaux < gael.varoquaux at normalesup.org> wrote: > > http://docs.scipy.org/doc/numpy/reference/generated/numpy.chararray.html > says that chararray are depreciated. I think I saw a discussion on the > mailing list that hinted otherwise. Which one is true? Should I correct > the docs? You're right, after Mike's fixes that note should have been changed. I thought Mike had also proposed an alternative text, but I can't find it right now. So feel free to change it. Also keep in mind the following (from another of Mike's emails): "All vectorized string operations are now available as regular functions in the numpy.char namespace. Usage of the chararray view class is only recommended for numarray backward compatibility." Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From eadrogue at gmx.net Sun Dec 6 07:26:24 2009 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Sun, 6 Dec 2009 13:26:24 +0100 Subject: [Numpy-discussion] histogram for discrete data Message-ID: <20091206122623.GA8262@doriath.local> Hi, A few weeks ago there was a discussion about a histogram_discrete() function --sorry for starting a new thread but I have lost the mails. Somebody pointed out that bincount() already can be used to histogram discrete data (except that it doesn't work with negative values). I have just discovered a function in scipy.stats called itemfreq() that does handle negative values. In [17]: scipy.stats.itemfreq([-1,-1,0,5]) Out[17]: array([[-1., 2.], [ 0., 1.], [ 5., 1.]]) Bye. From josef.pktd at gmail.com Sun Dec 6 07:52:55 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 6 Dec 2009 07:52:55 -0500 Subject: [Numpy-discussion] histogram for discrete data In-Reply-To: <20091206122623.GA8262@doriath.local> References: <20091206122623.GA8262@doriath.local> Message-ID: <1cd32cbb0912060452x2967c36mdb2e28866c75ba6a@mail.gmail.com> 2009/12/6 Ernest Adrogu? 
From josef.pktd at gmail.com  Sun Dec  6 07:52:55 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sun, 6 Dec 2009 07:52:55 -0500
Subject: [Numpy-discussion] histogram for discrete data
In-Reply-To: <20091206122623.GA8262@doriath.local>
References: <20091206122623.GA8262@doriath.local>
Message-ID: <1cd32cbb0912060452x2967c36mdb2e28866c75ba6a@mail.gmail.com>

2009/12/6 Ernest Adrogué:
> Hi,
>
> A few weeks ago there was a discussion about a
> histogram_discrete() function -- sorry for starting a new
> thread but I have lost the mails.
>
> Somebody pointed out that bincount() can already be used
> to histogram discrete data (except that it doesn't work
> with negative values).
>
> I have just discovered a function in scipy.stats called
> itemfreq() that does handle negative values.
>
> In [17]: scipy.stats.itemfreq([-1,-1,0,5])
> Out[17]:
> array([[-1.,  2.],
>        [ 0.,  1.],
>        [ 5.,  1.]])
>
> Bye.

bincount is a fast C function; stats.itemfreq uses a slow Python loop,
so the latter should be very slow for large arrays. stats.itemfreq also
works on floats, but not on strings (which is presumably a bug).

Josef

From josef.pktd at gmail.com  Sun Dec  6 07:57:59 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sun, 6 Dec 2009 07:57:59 -0500
Subject: [Numpy-discussion] np.equal
Message-ID: <1cd32cbb0912060457u68e43565xfdd982a043a3e8fd@mail.gmail.com>

What's the difference in the implementation between np.equal and == ?
np.equal returns NotImplemented for strings, while == works:

>>> aa
array(['a', 'b', 'a', 'aa', 'a'], dtype='|S2')
>>> aa == 'a'
array([ True, False,  True, False,  True], dtype=bool)
>>> np.equal(aa,'a')
NotImplemented
>>> np.equal(np.arange(5),1)
array([False,  True, False, False, False], dtype=bool)
>>> np.equal(np.arange(5),'a')
NotImplemented
>>> np.arange(5) == 'a'
False

Josef

From dsdale24 at gmail.com  Sun Dec  6 07:59:19 2009
From: dsdale24 at gmail.com (Darren Dale)
Date: Sun, 6 Dec 2009 07:59:19 -0500
Subject: [Numpy-discussion] Py3 merge
In-Reply-To: <5b8d13220912051954u6706e991hf41612a51b45ce48@mail.gmail.com>
References: <1260060065.5576.80.camel@idol> <5b8d13220912051954u6706e991hf41612a51b45ce48@mail.gmail.com>
Message-ID:

On Sat, Dec 5, 2009 at 10:54 PM, David Cournapeau wrote:
> On Sun, Dec 6, 2009 at 9:41 AM, Pauli Virtanen wrote:
>> Hi,
>>
>> I'd like to commit my Py3 Numpy branch to SVN trunk soon:
>>
>>        http://github.com/pv/numpy-work/commits/py3k
>
> Awesome - I think we should merge this ASAP. In particular, I would
> like to start fixing platform-specific issues.
>
> Concerning nose, will there be any version which works on both py2 and py3 ?

There is a development branch for python-3 here:

svn checkout http://python-nose.googlecode.com/svn/branches/py3k

Darren

From pav at iki.fi  Sun Dec  6 08:04:31 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Sun, 06 Dec 2009 15:04:31 +0200
Subject: [Numpy-discussion] Py3 merge
In-Reply-To: <5b8d13220912051954u6706e991hf41612a51b45ce48@mail.gmail.com>
References: <1260060065.5576.80.camel@idol> <5b8d13220912051954u6706e991hf41612a51b45ce48@mail.gmail.com>
Message-ID: <1260104670.4862.71.camel@idol>

On Sun, 2009-12-06 at 12:54 +0900, David Cournapeau wrote:
> On Sun, Dec 6, 2009 at 9:41 AM, Pauli Virtanen wrote:
>> Hi,
>>
>> I'd like to commit my Py3 Numpy branch to SVN trunk soon:
>>
>>        http://github.com/pv/numpy-work/commits/py3k
>
> Awesome - I think we should merge this ASAP. In particular, I would
> like to start fixing platform-specific issues.

Ok, the whole shebang is now in.

> Concerning nose, will there be any version which works on both py2 and py3 ?

No idea. (Though there's a separate Py3 branch.)
Pauli From gael.varoquaux at normalesup.org Sun Dec 6 08:10:10 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 6 Dec 2009 14:10:10 +0100 Subject: [Numpy-discussion] Chararray depreciated? In-Reply-To: References: <20091206111300.GC897@phare.normalesup.org> Message-ID: <20091206131010.GD897@phare.normalesup.org> On Sun, Dec 06, 2009 at 01:11:24PM +0100, Ralf Gommers wrote: > [2]http://docs.scipy.org/doc/numpy/reference/generated/numpy.chararray.html > > says that chararray are depreciated. I think I saw a discussion on the > > mailing list that hinted otherwise. Which one is true? Should I correct > > the docs? > You're right, after Mike's fixes that note should have been changed. I > thought Mike had also proposed an alternative text, but I can't find it > right now. So feel free to change it. > Also keep in mind the following (from another of Mike's emails): > "All vectorized string operations are now available as regular functions > in the numpy.char namespace. Excellent. I tweeked a bit the text to make it clearer: http://docs.scipy.org/numpy/docs/numpy.core.defchararray.chararray/ I made it clear that the replacement are present only starting numpy 1.4. I'd love a review and an 'OK to apply'. By the way, this switch is a clear improvement over chararrays, IMHO. Congratulations to the pack and Mike for identifying usecases and giving a good implementation to answer them. Ga?l From gael.varoquaux at normalesup.org Sun Dec 6 08:53:58 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 6 Dec 2009 14:53:58 +0100 Subject: [Numpy-discussion] Some incompatibilities in numpy trunk Message-ID: <20091206135358.GE897@phare.normalesup.org> I have a lot of code that has stopped working with my latest SVN pull to numpy. * Some compiled code yields an error looking like (from memory): "incorrect type 'numpy.ndarray'" Rebuilding it is sufficient. * I had some code doing: hashlib.md5(x).hexdigest() where x is a numpy array. I had to replace it by: hashlib.md5(np.getbuffer(x)).hexdigest() * Finally, I had to following failure: /home/varoquau/dev/enthought/ets/Mayavi_3.1.0/enthought/tvtk/array_handler.pyc in array2vtk(num_array, vtk_array) --> 298 result_array.SetVoidArray(z_flat, len(z_flat), 1) TypeError: argument 1 must be string or read-only buffer, not numpy.ndarray I can solve the problem using: result_array.SetVoidArray(numpy.getbuffer(z_flat), len(z_flat), 1) However, I am wondering: is this some incompatibility that has been introduced by mistake? I find it a bit strange that a '.x' release induces so much breakage, and I am afraid that it won't be popular amongst our users. Cheers, Ga?l From pav at iki.fi Sun Dec 6 09:07:16 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 06 Dec 2009 16:07:16 +0200 Subject: [Numpy-discussion] Some incompatibilities in numpy trunk In-Reply-To: <20091206135358.GE897@phare.normalesup.org> References: <20091206135358.GE897@phare.normalesup.org> Message-ID: <1260108435.4862.76.camel@idol> su, 2009-12-06 kello 14:53 +0100, Gael Varoquaux kirjoitti: > I have a lot of code that has stopped working with my latest SVN pull to > numpy. Which SVN revision? Before or after the Py3K commits? Note that the trunk is currently aiming at 1.5.x, code for 1.4.x is in a branch. > * Some compiled code yields an error looking like (from memory): > > "incorrect type 'numpy.ndarray'" > > Rebuilding it is sufficient. > > * I had some code doing: > > hashlib.md5(x).hexdigest() > > where x is a numpy array. 
I had to replace it by: > > hashlib.md5(np.getbuffer(x)).hexdigest() Probably due to the PEP 3118 buffer interface implementation? I'll try to see what's wrong here. > * Finally, I had to following failure: > > /home/varoquau/dev/enthought/ets/Mayavi_3.1.0/enthought/tvtk/array_handler.pyc > in array2vtk(num_array, vtk_array) > --> 298 result_array.SetVoidArray(z_flat, len(z_flat), 1) > > TypeError: argument 1 must be string or read-only buffer, not > numpy.ndarray > > I can solve the problem using: > > result_array.SetVoidArray(numpy.getbuffer(z_flat), len(z_flat), 1) > > However, I am wondering: is this some incompatibility that has been > introduced by mistake? Again, sounds like the buffer interface. > I find it a bit strange that a '.x' release > induces so much breakage, and I am afraid that it won't be popular > amongst our users. Well, there's still a lot of time to fix these issues before 1.5.0 is out. Just file bug tickets for each one :) -- Pauli Virtanen From gael.varoquaux at normalesup.org Sun Dec 6 09:11:16 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 6 Dec 2009 15:11:16 +0100 Subject: [Numpy-discussion] Some incompatibilities in numpy trunk In-Reply-To: <1260108435.4862.76.camel@idol> References: <20091206135358.GE897@phare.normalesup.org> <1260108435.4862.76.camel@idol> Message-ID: <20091206141116.GF897@phare.normalesup.org> On Sun, Dec 06, 2009 at 04:07:16PM +0200, Pauli Virtanen wrote: > su, 2009-12-06 kello 14:53 +0100, Gael Varoquaux kirjoitti: > > I have a lot of code that has stopped working with my latest SVN pull to > > numpy. > Which SVN revision? Before or after the Py3K commits? > Note that the trunk is currently aiming at 1.5.x, code for 1.4.x is in a > branch. Trunk. I had indeed forgotten that we where in 1.5.x now. > > I find it a bit strange that a '.x' release induces so much breakage, > > and I am afraid that it won't be popular amongst our users. > Well, there's still a lot of time to fix these issues before 1.5.0 is > out. Just file bug tickets for each one :) OK, cool. Glad to see that we are on the same page. I will file tickets. Ga?l From pav at iki.fi Sun Dec 6 09:14:55 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 06 Dec 2009 16:14:55 +0200 Subject: [Numpy-discussion] Some incompatibilities in numpy trunk In-Reply-To: <20091206141116.GF897@phare.normalesup.org> References: <20091206135358.GE897@phare.normalesup.org> <1260108435.4862.76.camel@idol> <20091206141116.GF897@phare.normalesup.org> Message-ID: <1260108895.4862.78.camel@idol> su, 2009-12-06 kello 15:11 +0100, Gael Varoquaux kirjoitti: > On Sun, Dec 06, 2009 at 04:07:16PM +0200, Pauli Virtanen wrote: [clip] > > > I find it a bit strange that a '.x' release induces so much breakage, > > > and I am afraid that it won't be popular amongst our users. > > > Well, there's still a lot of time to fix these issues before 1.5.0 is > > out. Just file bug tickets for each one :) > > OK, cool. Glad to see that we are on the same page. I will file tickets. Great, thanks! The bugs you see, btw, point out holes in our test suite... 
Pauli

From gael.varoquaux at normalesup.org  Sun Dec  6 09:26:12 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Sun, 6 Dec 2009 15:26:12 +0100
Subject: [Numpy-discussion] Some incompatibilities in numpy trunk
In-Reply-To: <1260108895.4862.78.camel@idol>
References: <20091206135358.GE897@phare.normalesup.org>
	<1260108435.4862.76.camel@idol>
	<20091206141116.GF897@phare.normalesup.org>
	<1260108895.4862.78.camel@idol>
Message-ID: <20091206142612.GG897@phare.normalesup.org>

On Sun, Dec 06, 2009 at 04:14:55PM +0200, Pauli Virtanen wrote:
> su, 2009-12-06 kello 15:11 +0100, Gael Varoquaux kirjoitti:
> > On Sun, Dec 06, 2009 at 04:07:16PM +0200, Pauli Virtanen wrote:
> [clip]
> > > > I find it a bit strange that a '.x' release induces so much breakage,
> > > > and I am afraid that it won't be popular amongst our users.

> > > Well, there's still a lot of time to fix these issues before 1.5.0 is
> > > out. Just file bug tickets for each one :)

> > OK, cool. Glad to see that we are on the same page. I will file tickets.

http://projects.scipy.org/numpy/ticket/1312

> The bugs you see, btw, point out holes in our test suite...

Well, the hashlib one is easy to add as a test. The other one is harder.

Cheers,

Gaël

From ralf.gommers at googlemail.com  Sun Dec  6 09:36:37 2009
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sun, 6 Dec 2009 15:36:37 +0100
Subject: [Numpy-discussion] Chararray depreciated?
In-Reply-To: <20091206131010.GD897@phare.normalesup.org>
References: <20091206111300.GC897@phare.normalesup.org>
	<20091206131010.GD897@phare.normalesup.org>
Message-ID: 

On Sun, Dec 6, 2009 at 2:10 PM, Gael Varoquaux <
gael.varoquaux at normalesup.org> wrote:

> Excellent. I tweaked the text a bit to make it clearer:
> http://docs.scipy.org/numpy/docs/numpy.core.defchararray.chararray/
>
> I made it clear that the replacements are present only starting from
> numpy 1.4. I'd love a review and an 'OK to apply'.

Looks good. I copied the changes to the notes in defchararray and
arrays.classes.rst, and toggled OK to apply.

Cheers,
Ralf

From xavier.gnata at gmail.com  Sun Dec  6 09:37:48 2009
From: xavier.gnata at gmail.com (Xavier Gnata)
Date: Sun, 06 Dec 2009 15:37:48 +0100
Subject: [Numpy-discussion] Help to port numpy to python3?
Message-ID: <4B1BC1BC.6080606@gmail.com>

Hi,

Is there a way to help to port numpy to python3? I don't think I have
time to rewrite some code, but I can test whatever has to be tested. Is
there an official web page showing the status of this port?

Same question for scipy?

It is already nice to see that the latest numpy version is compatible
with python2.6 :)

Xavier

From pav at iki.fi  Sun Dec  6 09:52:30 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Sun, 06 Dec 2009 16:52:30 +0200
Subject: [Numpy-discussion] Help to port numpy to python3?
In-Reply-To: <4B1BC1BC.6080606@gmail.com>
References: <4B1BC1BC.6080606@gmail.com>
Message-ID: <1260111149.4862.84.camel@idol>

su, 2009-12-06 kello 15:37 +0100, Xavier Gnata kirjoitti:
> Is there a way to help to port numpy to python3?

If you want to write some code, check
http://projects.scipy.org/numpy/browser/trunk/doc/Py3K.txt

> I don't think I have time to rewrite some code but I can test whatever
> has to be tested. Is there an official web page showing the status of
> this port?

Otherwise, you can help by:

1) Build Numpy SVN on Python 2.6

   Run all kinds of software that use Numpy, and see if there are new
   bugs as compared to Numpy 1.4.0 or 1.3.0.
   The Py3 transition involves a large number of changes in the C code,
   and it's easy to miss some subtle issues.

2) Figure out how to test the PEP 3118 buffer interface on Python 2.6
   and Python 3.1

   Write unit tests for it.

The Py3K.txt is pretty much the status report as we have it now.

> Same question for scipy?

Work on Scipy can begin only after most of Numpy works on Py3.

-- 
Pauli Virtanen

From cjw at ncf.ca  Sun Dec  6 11:01:13 2009
From: cjw at ncf.ca (Colin J. Williams)
Date: Sun, 06 Dec 2009 11:01:13 -0500
Subject: [Numpy-discussion] non-standard standard deviation
In-Reply-To: <4B1930CB.7090708@gmail.com>
References: <26566843.post@talk.nabble.com> <4B131239.7080801@ncf.ca>
	<2d5132a50911291715k703a0b32t1deffe67cf8947e2@mail.gmail.com>
	<4B16B103.6070600@ncf.ca>
	<3d375d730912021229g10d80116hb6341da8bb97bee3@mail.gmail.com>
	<703777c60912022135v3074ee03mb384359cb6b32622@mail.gmail.com>
	<703777c60912040418n3586bc15j6435b1a35a77bba1@mail.gmail.com>
	<4B1930CB.7090708@gmail.com>
Message-ID: <4B1BD549.30508@ncf.ca>

On 04-Dec-09 10:54 AM, Bruce Southey wrote:
> On 12/04/2009 06:18 AM, yogesh karpate wrote:
>> @ Pauli and @ Colin:
>>     Sorry for the late reply. I was busy with some other assignments.
>> # As far as normalization by (n) is concerned, it's a common
>> assumption that the population is normally distributed and that the
>> population size is large enough to fit the normal distribution. But
>> this standard deviation, when applied to a small population, tends to
>> be too low; therefore it is called biased.
>> # The correction known as the Bessel correction is there for the
>> small-sample std. deviation, i.e. normalization by (n-1).
>> # In "electrical-and-electronic-measurements-and-instrumentation" by
>> A.K. Sawhney, in the 1st chapter of the book, "Fundamentals of
>> Measurements", it's shown that for N=16 the std. deviation
>> normalization was (n-1)=15
>> # While I was learning statistics in my course, the instructor would
>> advise taking n=20 for normalization by (n-1)
>> # Probability and Statistics from the Schaum Series is good reading.
>> Regards
>> ~ymk
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
> Hi,
> Basically, all that I see with these arbitrary values is that you are
> relying on the 'central limit theorem'
> (http://en.wikipedia.org/wiki/Central_limit_theorem). Really the issue
> in using these values is how much statistical bias you will tolerate,
> especially in its impact on the usage of that estimate, because uses of
> the variance (such as in statistical tests) tend to be more influenced
> by bias than the estimate of the variance itself. (Of course, many
> features rely on asymptotic properties, so bias concerns are less
> apparent in large sample sizes.)
>
> Obviously the default relies on the developer's background and
> requirements. There are multiple valid variance estimators in
> statistics, with different denominators like N (maximum likelihood
> estimator), N-1 (restricted maximum likelihood estimator and certain
> Bayesian estimators) and Stein's
> (http://en.wikipedia.org/wiki/James%E2%80%93Stein_estimator). So the
> current default behavior is valid and documented. Consequently you
> cannot just have one option or different functions (like certain
> programs), and Numpy's implementation actually allows you to do all
> these in a single function. So I also see no reason to change, even if
> I have to add the ddof=1 argument; after all, 'Explicit is better than
> implicit' :-).
>
> Bruce

Bruce,

I suggest that the Central Limit Theorem is tied in with the Law of
Large Numbers.

When one has a smallish sample size, what gives the best estimate of the
variance? The Bessel correction provides a rationale, based on
expectations: http://en.wikipedia.org/wiki/Bessel%27s_correction

It is difficult to understand the proof of Stein:
http://en.wikipedia.org/wiki/Proof_of_Stein%27s_example
The symbols used are not clearly stated. He seems interested in a
decision rule for the calculation of the mean of a sample, and claims
that his approach is better than the traditional Least Squares approach.

In most cases, the interest is likely to be in the variance, with a view
to establishing a confidence interval.

In the widely used Analysis of Variance (ANOVA), the degrees of freedom
are reduced for each mean estimated; see
http://www.mnstate.edu/wasson/ed602lesson13.htm for the example below:

Analysis of Variance Table

    Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F Ratio   p
    Between Groups        25.20            2                    12.60         5.178     <.05
    Within Groups         29.20            12                   2.43
    Total                 54.40            14

There is a sample of 15 observations, which is divided into three
groups, depending on the number of hours of therapy. Thus, the Total
degrees of freedom are 15-1 = 14, the Between Groups 3-1 = 2 and the
Residual is 14 - 2 = 12.

Colin W.

From josef.pktd at gmail.com  Sun Dec  6 11:21:09 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sun, 6 Dec 2009 11:21:09 -0500
Subject: [Numpy-discussion] non-standard standard deviation
In-Reply-To: <4B1BD549.30508@ncf.ca>
References: <26566843.post@talk.nabble.com> <4B131239.7080801@ncf.ca>
	<2d5132a50911291715k703a0b32t1deffe67cf8947e2@mail.gmail.com>
	<4B16B103.6070600@ncf.ca>
	<3d375d730912021229g10d80116hb6341da8bb97bee3@mail.gmail.com>
	<703777c60912022135v3074ee03mb384359cb6b32622@mail.gmail.com>
	<703777c60912040418n3586bc15j6435b1a35a77bba1@mail.gmail.com>
	<4B1930CB.7090708@gmail.com> <4B1BD549.30508@ncf.ca>
Message-ID: <1cd32cbb0912060821h6fdd6823h20c4c7511131503f@mail.gmail.com>

On Sun, Dec 6, 2009 at 11:01 AM, Colin J. Williams wrote:
>
> On 04-Dec-09 10:54 AM, Bruce Southey wrote:
>> On 12/04/2009 06:18 AM, yogesh karpate wrote:
>>> @ Pauli and @ Colin:
>>>     Sorry for the late reply. I was busy with some other assignments.
>>> # As far as normalization by (n) is concerned, it's a common
>>> assumption that the population is normally distributed and that the
>>> population size is large enough to fit the normal distribution. But
>>> this standard deviation, when applied to a small population, tends to
>>> be too low; therefore it is called biased.
>>> # The correction known as the Bessel correction is there for the
>>> small-sample std. deviation, i.e. normalization by (n-1).
>>> # In "electrical-and-electronic-measurements-and-instrumentation" by
>>> A.K. Sawhney, in the 1st chapter of the book, "Fundamentals of
>>> Measurements", it's shown that for N=16 the std. deviation
>>> normalization was (n-1)=15
>>> # While I was learning statistics in my course, the instructor would
>>> advise taking n=20 for normalization by (n-1)
>>> # Probability and Statistics from the Schaum Series is good reading.
>>> Regards >>> ~ymk >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> Hi, >> Basically, all that I see with these arbitrary values is that you are >> relying on the 'central limit theorem' >> (http://en.wikipedia.org/wiki/Central_limit_theorem). ?Really the >> issue in using these values is how much statistical bias will you >> tolerate especially in the impact on usage of that estimate because >> the usage of variance (such as in statistical tests) tend to be more >> influenced by bias than the estimate of variance. (Of course, many >> features rely on asymptotic properties so bias concerns are less >> apparent in large sample sizes.) >> >> Obviously the default relies on the developers background and >> requirements. There are multiple valid variance estimators in >> statistics with different denominators like N (maximum likelihood >> estimator), N-1 (restricted maximum likelihood estimator and certain >> Bayesian estimators) and Stein's >> (http://en.wikipedia.org/wiki/James%E2%80%93Stein_estimator). So >> thecurrent default behavior is a valid and documented. Consequently >> you can not just have one option or different functions (like certain >> programs) and Numpy's implementation actually allows you do all these >> in a single function. So I also see no reason change even if I have to >> add the ddof=1 argument, after all 'Explicit is better than implicit' :-). >> >> Bruce > Bruce, > > I suggest that the Central Limit Theorem is tied in with the Law of > Large Numbers. > > When one has a smallish sample size, what give the best estimate of the > variance? ?The Bessel Correction provides a rationale, based on > expectations: (http://en.wikipedia.org/wiki/Bessel%27s_correction). > > It is difficult to understand the proof of Stein: > http://en.wikipedia.org/wiki/Proof_of_Stein%27s_example > > The symbols used are not clearly stated. ?He seems interested in a > decision rule for the calculation of the mean of a sample and claims > that his approach is better than the traditional Least Squares approach. > > In most cases, the interest is likely to be in the variance, with a view > to establishing a confidence interval. What's the best estimate? That's the main question Estimators differ in their (sample or posterior) distribution, especially bias and variance. Stein estimator dominates OLS in the mean squared error, so although it is biased, the variance of the estimator is smaller than OLS so that MSE (bias plus variance) is also smaller for Stein estimator than for OLS. Depending on the application there could be many possible loss functions, including asymmetric, eg. if its more costly to over than to under estimate. The following was a good book for this, that I read a long time ago: Statistical decision theory and Bayesian analysis By James O. Berger http://books.google.ca/books?id=oY_x7dE15_AC&pg=PP1&lpg=PP1&dq=berger+decision&source=bl&ots=wzL3ocu5_9&sig=lGm5VevPtnFW570mgeqJklASalU&hl=en&ei=P9cbS5CSCIqllAf-0f3xCQ&sa=X&oi=book_result&ct=result&resnum=4&ved=0CBcQ6AEwAw#v=onepage&q=&f=false > > In the widely used Analysis of Variance (ANOVA), the degrees of freedom > are reduced for each mean estimated, see: > http://www.mnstate.edu/wasson/ed602lesson13.htm for the example below: > > *Analysis of Variance Table* ** Source of > Variation ? ? ? Sum of > Squares ? ? ? ? Degrees of > Freedom ? ? ? ? Mean > Square ?F Ratio ? ? ? ? 
p > Between Groups ?25.20 ? 2 ? ? ? 12.60 ? 5.178 ? <.05 > Within Groups ? 29.20 ? 12 ? ? ?2.43 > > Total ? 54.40 ? 14 > > > There is a sample of 15 observations, which is divided into three > groups, depending on the number of hours of therapy. > Thus, the Total degrees of freedom are 15-1 = 14, ?the Between Groups > 3-1 = 2 and the Residual is 14 - 2 = 12. Statistical tests are the only area where I really pay attention to the degrees of freedom, since the test statistic is derived under specific assumptions. But there are also many cases, where different statisticians argue in favor of different dof corrections, and it is not always clear in which cases one or another is the "best". Josef > > Colin W. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Sun Dec 6 11:41:19 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 6 Dec 2009 09:41:19 -0700 Subject: [Numpy-discussion] non-standard standard deviation In-Reply-To: <1cd32cbb0912060821h6fdd6823h20c4c7511131503f@mail.gmail.com> References: <26566843.post@talk.nabble.com> <2d5132a50911291715k703a0b32t1deffe67cf8947e2@mail.gmail.com> <4B16B103.6070600@ncf.ca> <3d375d730912021229g10d80116hb6341da8bb97bee3@mail.gmail.com> <703777c60912022135v3074ee03mb384359cb6b32622@mail.gmail.com> <703777c60912040418n3586bc15j6435b1a35a77bba1@mail.gmail.com> <4B1930CB.7090708@gmail.com> <4B1BD549.30508@ncf.ca> <1cd32cbb0912060821h6fdd6823h20c4c7511131503f@mail.gmail.com> Message-ID: On Sun, Dec 6, 2009 at 9:21 AM, wrote: > On Sun, Dec 6, 2009 at 11:01 AM, Colin J. Williams wrote: > > > > What's the best estimate? That's the main question > > Estimators differ in their (sample or posterior) distribution, > especially bias and variance. > Stein estimator dominates OLS in the mean squared error, so although > it is biased, the variance of the estimator is smaller than OLS so that > MSE (bias plus variance) is also smaller for Stein estimator than for OLS. > Depending on the application there could be many possible loss functions, > including asymmetric, eg. if its more costly to over than to under > estimate. > > The following was a good book for this, that I read a long time ago: > Statistical decision theory and Bayesian analysis By James O. Berger > > > http://books.google.ca/books?id=oY_x7dE15_AC&pg=PP1&lpg=PP1&dq=berger+decision&source=bl&ots=wzL3ocu5_9&sig=lGm5VevPtnFW570mgeqJklASalU&hl=en&ei=P9cbS5CSCIqllAf-0f3xCQ&sa=X&oi=book_result&ct=result&resnum=4&ved=0CBcQ6AEwAw#v=onepage&q=&f=false > > At last, an explanation I can understand. Thanks Josef. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Sun Dec 6 12:27:25 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Sun, 6 Dec 2009 09:27:25 -0800 Subject: [Numpy-discussion] np.equal In-Reply-To: <1cd32cbb0912060457u68e43565xfdd982a043a3e8fd@mail.gmail.com> References: <1cd32cbb0912060457u68e43565xfdd982a043a3e8fd@mail.gmail.com> Message-ID: On Sun, Dec 6, 2009 at 4:57 AM, wrote: > what's the difference in the implementation between np.equal and == ? > np.equal raises NotImplemented for strings, while == works. > >>>> aa > array(['a', 'b', 'a', 'aa', 'a'], > ? ? 
?dtype='|S2') > >>>> aa == 'a' > array([ True, False, ?True, False, ?True], dtype=bool) >>>> np.equal(aa,'a') > NotImplemented > > >>>> np.equal(np.arange(5),1) > array([False, ?True, False, False, False], dtype=bool) >>>> np.equal(np.arange(5),'a') > NotImplemented >>>> np.arange(5) == 'a' > False Seems like none of the ufuncs can handle strings: >> np.log('a') NotImplemented >> np.exp('a') NotImplemented >> np.add('a', 'b') NotImplemented >> np.negative('a') NotImplemented >> np.sin('a') NotImplemented From sturla at molden.no Sun Dec 6 12:36:13 2009 From: sturla at molden.no (Sturla Molden) Date: Sun, 06 Dec 2009 18:36:13 +0100 Subject: [Numpy-discussion] non-standard standard deviation In-Reply-To: <4B1BD549.30508@ncf.ca> References: <26566843.post@talk.nabble.com> <4B131239.7080801@ncf.ca> <2d5132a50911291715k703a0b32t1deffe67cf8947e2@mail.gmail.com> <4B16B103.6070600@ncf.ca> <3d375d730912021229g10d80116hb6341da8bb97bee3@mail.gmail.com> <703777c60912022135v3074ee03mb384359cb6b32622@mail.gmail.com> <703777c60912040418n3586bc15j6435b1a35a77bba1@mail.gmail.com> <4B1930CB.7090708@gmail.com> <4B1BD549.30508@ncf.ca> Message-ID: <4B1BEB8D.1050107@molden.no> Colin J. Williams skrev: > When one has a smallish sample size, what give the best estimate of the > variance? What do you mean by "best estimate"? Unbiased? Smallest standard error? > In the widely used Analysis of Variance (ANOVA), the degrees of freedom > are reduced for each mean estimated, That is for statistical tests, not to compute estimators. From d.l.goldsmith at gmail.com Sun Dec 6 14:05:22 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Sun, 6 Dec 2009 11:05:22 -0800 Subject: [Numpy-discussion] Chararray depreciated? In-Reply-To: References: <20091206111300.GC897@phare.normalesup.org> <20091206131010.GD897@phare.normalesup.org> Message-ID: <45d1ab480912061105s39ebab81jf39a73bdd881eb26@mail.gmail.com> On Sun, Dec 6, 2009 at 6:36 AM, Ralf Gommers wrote: > > > On Sun, Dec 6, 2009 at 2:10 PM, Gael Varoquaux < > gael.varoquaux at normalesup.org> wrote: > >> >> Excellent. I tweeked a bit the text to make it clearer: >> http://docs.scipy.org/numpy/docs/numpy.core.defchararray.chararray/ >> >> I made it clear that the replacement are present only starting numpy 1.4. >> I'd love a review and an 'OK to apply'. >> >> Looks good, I copied the changes to the notes in defchararray and > arrays.classes.rst, and toggled OK to apply. > > Cheers, > Ralf > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > Thanks, guys! DG -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sun Dec 6 14:12:52 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 6 Dec 2009 13:12:52 -0600 Subject: [Numpy-discussion] Some incompatibilities in numpy trunk In-Reply-To: <20091206135358.GE897@phare.normalesup.org> References: <20091206135358.GE897@phare.normalesup.org> Message-ID: <3d375d730912061112m1b1317cbqa02e068813ff37b8@mail.gmail.com> On Sun, Dec 6, 2009 at 07:53, Gael Varoquaux wrote: > I have a lot of code that has stopped working with my latest SVN pull to > numpy. > > * Some compiled code yields an error looking like (from memory): > > ? ?"incorrect type 'numpy.ndarray'" > > Rebuilding it is sufficient. Is this Cython or Pyrex code? 
Unfortunately Pyrex checks the size of types exactly, such that even if
you extend the type in a backwards-compatible way, it will raise that
exception. This behavior has been inherited by Cython. I have asked for
this feature to be removed, or at least turned into a >= check, but it
got no traction.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From gael.varoquaux at normalesup.org  Sun Dec  6 14:33:10 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Sun, 6 Dec 2009 20:33:10 +0100
Subject: [Numpy-discussion] Some incompatibilities in numpy trunk
In-Reply-To: <3d375d730912061112m1b1317cbqa02e068813ff37b8@mail.gmail.com>
References: <20091206135358.GE897@phare.normalesup.org>
	<3d375d730912061112m1b1317cbqa02e068813ff37b8@mail.gmail.com>
Message-ID: <20091206193310.GA15307@phare.normalesup.org>

On Sun, Dec 06, 2009 at 01:12:52PM -0600, Robert Kern wrote:
> Is this Cython or Pyrex code?

It is.

> Unfortunately Pyrex checks the size of types exactly, such that even if
> you extend the type in a backwards-compatible way, it will raise that
> exception.

OK, that makes sense. Thanks for the explanation.

> This behavior has been inherited by Cython. I have asked for
> this feature to be removed, or at least turned into a >= check, but it
> got no traction.

Well, maybe when all the Cython deployments break because of the numpy
change, it will get more traction.

Gaël

From jsseabold at gmail.com  Sun Dec  6 16:16:08 2009
From: jsseabold at gmail.com (Skipper Seabold)
Date: Sun, 6 Dec 2009 16:16:08 -0500
Subject: [Numpy-discussion] Zero Division not handled correctly?
Message-ID: 

I believe this is known, but I am surprised that division by "integer"
zero results in the following.

In [1]: import numpy as np

In [2]: np.__version__
Out[2]: '1.4.0.dev7539'

In [3]: 0**-1  # or 0**-1/-1
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)

/home/skipper/school/Data/ascii/numpy/ in ()

ZeroDivisionError: 0.0 cannot be raised to a negative power

In [4]: np.array([0.])**-1
Out[4]: array([ Inf])

In [5]: np.array([0.])**-1/-1
Out[5]: array([-Inf])

In [6]: np.array([0])**-1.
Out[6]: array([ Inf])

In [7]: np.array([0])**-1./-1
Out[7]: array([-Inf])

In [8]: np.array([0])**-1
Out[8]: array([-9223372036854775808])

In [9]: np.array([0])**-1/-1
Floating point exception

This last command crashes the interpreter.

There have been some threads about similar issues over the years, but
I'm wondering if this is still intended/known or if this should raise
an exception or return inf or -inf. I expected a -inf, though maybe
this is incorrect on my part.

Skipper

From d.l.goldsmith at gmail.com  Sun Dec  6 17:39:55 2009
From: d.l.goldsmith at gmail.com (David Goldsmith)
Date: Sun, 6 Dec 2009 14:39:55 -0800
Subject: [Numpy-discussion] Zero Division not handled correctly?
In-Reply-To: 
References: 
Message-ID: <45d1ab480912061439r4c872de5m71aa5bd48d07c6c0@mail.gmail.com>

On Sun, Dec 6, 2009 at 1:16 PM, Skipper Seabold wrote:
> In [9]: np.array([0])**-1/-1
> Floating point exception
>
> This last command crashes the interpreter.

It crashes mine also, and IMO, anything that crashes the interpreter
should be considered a bug - can you file a bug report, please? Thanks!
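In the meantime, a defensive sketch for the float cases (it only covers
what numpy's error-state machinery sees, so it does not help with the
integer crash itself):

    import numpy as np

    # 'raise' turns the silently produced inf into a FloatingPointError.
    old_state = np.seterr(divide='raise')
    try:
        x = np.array([1., 2.]) / 0.
    except FloatingPointError as err:
        print("trapped:", err)
    finally:
        np.seterr(**old_state)  # restore the previous error state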
DG

> There have been some threads about similar issues over the years, but
> I'm wondering if this is still intended/known or if this should raise
> an exception or return inf or -inf. I expected a -inf, though maybe
> this is incorrect on my part.
>
> Skipper
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From dwf at cs.toronto.edu  Sun Dec  6 20:54:46 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Sun, 6 Dec 2009 20:54:46 -0500
Subject: [Numpy-discussion] another numpy/ATLAS problem
In-Reply-To: <1260065097.32703.11.camel@idol>
References: <662818A6-E8EE-4392-9BED-D281D07D1DFC@cs.toronto.edu>
	<1CF7EB5A-F17B-4E2C-BD19-35AED576DEC8@cs.toronto.edu>
	<204B9FD5-165F-4EDA-9BAF-10AB8181B841@cs.toronto.edu>
	<1260065097.32703.11.camel@idol>
Message-ID: <49917878-57AB-4CA2-B983-F32A6FCC6517@cs.toronto.edu>

(We hashed this out on IRC, but replying here for the sake of recording
it.)

On 5-Dec-09, at 9:04 PM, Pauli Virtanen wrote:
> Can you try to change linalg/setup.py so that it *only* includes
> lapack_litemodule.c in the build?

Yup; it turns out it wasn't NumPy's lapack_lite calling dlamc3_ but
rather other routines in LAPACK. It was accepted as a bug and it should
be fixed in future 3.9.x's of ATLAS.

David

From bsouthey at gmail.com  Sun Dec  6 21:12:58 2009
From: bsouthey at gmail.com (Bruce Southey)
Date: Sun, 6 Dec 2009 20:12:58 -0600
Subject: [Numpy-discussion] non-standard standard deviation
In-Reply-To: <4B1BEB8D.1050107@molden.no>
References: <26566843.post@talk.nabble.com>
	<2d5132a50911291715k703a0b32t1deffe67cf8947e2@mail.gmail.com>
	<4B16B103.6070600@ncf.ca>
	<3d375d730912021229g10d80116hb6341da8bb97bee3@mail.gmail.com>
	<703777c60912022135v3074ee03mb384359cb6b32622@mail.gmail.com>
	<703777c60912040418n3586bc15j6435b1a35a77bba1@mail.gmail.com>
	<4B1930CB.7090708@gmail.com> <4B1BD549.30508@ncf.ca>
	<4B1BEB8D.1050107@molden.no>
Message-ID: 

On Sun, Dec 6, 2009 at 11:36 AM, Sturla Molden wrote:
> Colin J. Williams skrev:
>> When one has a smallish sample size, what gives the best estimate of
>> the variance?
> What do you mean by "best estimate"?
>
> Unbiased? Smallest standard error?
>
>> In the widely used Analysis of Variance (ANOVA), the degrees of freedom
>> are reduced for each mean estimated,
> That is for statistical tests, not to compute estimators.
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

Ignoring the estimation method, there is no correct answer unless you
impose various conditions like a minimum-variance unbiased estimator
(http://en.wikipedia.org/wiki/Minimum_variance_unbiased), where usually
N-1 wins. Anyhow, this is way off topic since it is totally in the realm
of math stats.

The law of large numbers
(http://en.wikipedia.org/wiki/Law_of_large_numbers) just addresses the
average, not the variance, so it is not directly applicable.

Bruce

From david at ar.media.kyoto-u.ac.jp  Mon Dec  7 01:18:52 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 07 Dec 2009 15:18:52 +0900
Subject: [Numpy-discussion] Zero Division not handled correctly?
In-Reply-To: 
References: 
Message-ID: <4B1C9E4C.10300@ar.media.kyoto-u.ac.jp>

Skipper Seabold wrote:
> I believe this is known, but I am surprised that division by "integer"
> zero results in the following.
>
> In [1]: import numpy as np
>
> In [2]: np.__version__
> Out[2]: '1.4.0.dev7539'
>
> In [3]: 0**-1  # or 0**-1/-1
> ---------------------------------------------------------------------------
> ZeroDivisionError                         Traceback (most recent call last)
>
> /home/skipper/school/Data/ascii/numpy/ in ()
>
> ZeroDivisionError: 0.0 cannot be raised to a negative power
>
> In [4]: np.array([0.])**-1
> Out[4]: array([ Inf])
>
> In [5]: np.array([0.])**-1/-1
> Out[5]: array([-Inf])
>
> In [6]: np.array([0])**-1.
> Out[6]: array([ Inf])
>
> In [7]: np.array([0])**-1./-1
> Out[7]: array([-Inf])
>
> In [8]: np.array([0])**-1
> Out[8]: array([-9223372036854775808])
>
> In [9]: np.array([0])**-1/-1
> Floating point exception

This last one is sort of interesting - np.array([0])**-1 returns the
smallest long, and on two's-complement machines this means that its
opposite is not representable. IOW, it is not a divide by zero, but a
division overflow, which also generates a SIGFPE on x86. I think the
crash is the same as the one in this simple C program:

    #include <stdio.h>
    #include <limits.h>

    int main(void)
    {
        /* -LONG_MIN is not representable on a two's-complement
           machine, so this division overflows and raises SIGFPE
           on x86. */
        long a = LONG_MIN;
        long b = -1;
        printf("%ld\n", a);
        a /= b;
        printf("%ld\n", a);
        return 0;
    }

I am not sure about how to fix this: one simple way would be to detect
this case in the LONG_divide ufunc (and the other concerned signed
integer types). Another, maybe better solution is to handle the signal,
but that's maybe much more work (I still don't know well how signals
interact with the Python interpreter).

cheers,

David

From fperez.net at gmail.com  Mon Dec  7 02:01:48 2009
From: fperez.net at gmail.com (Fernando Perez)
Date: Sun, 6 Dec 2009 23:01:48 -0800
Subject: [Numpy-discussion] np.equal
In-Reply-To: <1cd32cbb0912060457u68e43565xfdd982a043a3e8fd@mail.gmail.com>
References: <1cd32cbb0912060457u68e43565xfdd982a043a3e8fd@mail.gmail.com>
Message-ID: 

2009/12/6 josef.pktd :
>>>> np.equal(np.arange(5),'a')
> NotImplemented

Why is NotImplemented a *return* value? Normally NotImplementedError is
a raised exception, but if it's not implemented, it shouldn't be
returned as a value.
For one thing, it leads to absurdities like the following being possible:

In [6]: if np.equal(np.random.rand(5),'a'):
   ...:     print("Array equal to 'a'")
   ...:
   ...:
Array equal to 'a'

In [7]: if np.equal(np.random.rand(5),'a'): print("Array equal to 'a'")
   ...:
   ...:
Array equal to 'a'

In practice, it's as if np.equal() for not-implemented cases always
returns True (since bool(NotImplemented) == True).

Cheers,

f

From nadavh at visionsense.com  Mon Dec  7 03:36:47 2009
From: nadavh at visionsense.com (Nadav Horesh)
Date: Mon, 7 Dec 2009 10:36:47 +0200
Subject: [Numpy-discussion] Test of numpy/py3k
Message-ID: <710F2847B0018641891D9A21602763605AD252@ex3.envision.co.il>

I would like to test and prepare for migration of numpy/python3.1 on a
linux box. Can anyone provide an installation recipe (and maybe some
more tips)?

  Nadav.

From david at ar.media.kyoto-u.ac.jp  Mon Dec  7 04:02:47 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 07 Dec 2009 18:02:47 +0900
Subject: [Numpy-discussion] SIGFPE handling in ufunc
Message-ID: <4B1CC4B7.9020407@ar.media.kyoto-u.ac.jp>

Hi,

While investigating why np.array([0]) ** -1/-1 crashes, I wondered how
signal handling is supposed to work in ufuncs when a SIGFPE is raised.
In particular, the following is puzzling:

    import numpy as np
    x = np.array([1, 2, 3, 4]) / 0  # x is now array([0, 0, 0, 0])

Note that no warning is raised - this seems to be a regression, since I
came across e.g. this ticket:
http://projects.scipy.org/numpy/ticket/541.

cheers,

David

From pav at iki.fi  Mon Dec  7 04:33:18 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Mon, 07 Dec 2009 11:33:18 +0200
Subject: [Numpy-discussion] Test of numpy/py3k
In-Reply-To: <710F2847B0018641891D9A21602763605AD252@ex3.envision.co.il>
References: <710F2847B0018641891D9A21602763605AD252@ex3.envision.co.il>
Message-ID: <1260178398.2704.3.camel@talisman>

Hi,

ma, 2009-12-07 kello 10:36 +0200, Nadav Horesh kirjoitti:
> I would like to test and prepare for migration of numpy/python3.1 on a linux box.
> Can anyone provide an installation recipe (and maybe some more tips)?

It should already "work out of the box":

1) Install Python 3

2) Install Nose's py3k branch from SVN, cf. doc/Py3K.txt in Numpy's SVN.

3) Run "python3 setup.py install --user" in a dir containing Numpy's SVN
   trunk

Of course, since many things are not finished yet for Py3, it's probable
you'll run into problems when trying to run software using Numpy on Py3.
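A quick smoke test after step 3, to check that the build imports and
passes its own test suite (a sketch; np.test() needs the nose from
step 2):

    import numpy as np

    print(np.__version__)      # should report the SVN trunk version

    # A couple of sanity checks before running real software on it.
    a = np.arange(10.0)
    assert a.sum() == 45.0
    assert a[::-1][0] == 9.0 and a.dtype == np.dtype('float64')

    np.test()                  # full unit-test suite, requires nose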
-- 
Pauli Virtanen

From david at ar.media.kyoto-u.ac.jp  Mon Dec  7 06:21:37 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 07 Dec 2009 20:21:37 +0900
Subject: [Numpy-discussion] SIGFPE handling in ufunc
In-Reply-To: <4B1CC4B7.9020407@ar.media.kyoto-u.ac.jp>
References: <4B1CC4B7.9020407@ar.media.kyoto-u.ac.jp>
Message-ID: <4B1CE541.9060905@ar.media.kyoto-u.ac.jp>

David Cournapeau wrote:
> Note that no warning is raised - this seems to be a regression, since I
> came across e.g. this ticket:
> http://projects.scipy.org/numpy/ticket/541.

Ok, this is broken since 1.1.0 (r4778 to be exact), and the reason is
the MaskedArray support - there are some global seterr calls without
the old state being restored. Looks easy to fix,

cheers,

David

From nadavh at visionsense.com  Mon Dec  7 10:00:42 2009
From: nadavh at visionsense.com (Nadav Horesh)
Date: Mon, 07 Dec 2009 17:00:42 +0200
Subject: [Numpy-discussion] Test of numpy/py3k
In-Reply-To: <1260178398.2704.3.camel@talisman>
References: <710F2847B0018641891D9A21602763605AD252@ex3.envision.co.il>
	<1260178398.2704.3.camel@talisman>
Message-ID: <1260198042.29935.0.camel@nadav.envision.co.il>

Thanks, it works.

  Nadav

On Mon, 2009-12-07 at 11:33 +0200, Pauli Virtanen wrote:
> Hi,
>
> ma, 2009-12-07 kello 10:36 +0200, Nadav Horesh kirjoitti:
> > I would like to test and prepare for migration of numpy/python3.1 on a linux box.
> > Can anyone provide an installation recipe (and maybe some more tips)?
>
> It should already "work out of the box":
>
> 1) Install Python 3
>
> 2) Install Nose's py3k branch from SVN, cf. doc/Py3K.txt in Numpy's SVN.
>
> 3) Run "python3 setup.py install --user" in a dir containing Numpy's SVN
>    trunk
>
> Of course, since many things are not finished yet for Py3, it's probable
> you'll run into problems when trying to run software using Numpy on Py3.

From dagss at student.matnat.uio.no  Mon Dec  7 07:23:59 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Mon, 07 Dec 2009 13:23:59 +0100
Subject: [Numpy-discussion] Help to port numpy to python3?
In-Reply-To: <1260111149.4862.84.camel@idol>
References: <4B1BC1BC.6080606@gmail.com> <1260111149.4862.84.camel@idol>
Message-ID: <4B1CF3DF.3080808@student.matnat.uio.no>

Pauli Virtanen wrote:
> 2) Figure out how to test the PEP 3118 buffer interface on Python 2.6
>    and Python 3.1
>
>    Write unit tests for it.

On that note, I gave that a very small start in August. The two files
are here -- one is a Cython file wrapping the part of the PyBuffer API
that one needs, and then there's a small pure-Python test case which can
be extended to other cases. In case anybody is interested:

http://heim.ifi.uio.no/dagss/numpy_pep3118/

Dag Sverre

From mdroe at stsci.edu  Mon Dec  7 08:49:11 2009
From: mdroe at stsci.edu (Michael Droettboom)
Date: Mon, 07 Dec 2009 08:49:11 -0500
Subject: [Numpy-discussion] Chararray depreciated?
In-Reply-To: <45d1ab480912061105s39ebab81jf39a73bdd881eb26@mail.gmail.com>
References: <20091206111300.GC897@phare.normalesup.org>
	<20091206131010.GD897@phare.normalesup.org>
	<45d1ab480912061105s39ebab81jf39a73bdd881eb26@mail.gmail.com>
Message-ID: <4B1D07D7.6030108@stsci.edu>

If this:

http://docs.scipy.org/numpy/docs/numpy.core.defchararray.chararray/

is the most recent revision, then it looks good to me. It does seem to
incorporate my new text -- in case there was any confusion about that.
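For anyone landing on that page, the gist of the change, sketched for
numpy >= 1.4 (where the numpy.char namespace is present):

    import numpy as np

    a = np.array(['foo', 'bar', 'baz'])

    # New style: vectorized string operations as free functions.
    print(np.char.upper(a))       # ['FOO' 'BAR' 'BAZ']
    print(np.char.add(a, '!'))    # ['foo!' 'bar!' 'baz!']

    # Old style: the same operations as chararray methods.
    c = a.view(np.chararray)
    print(c.upper())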
Mike David Goldsmith wrote: > On Sun, Dec 6, 2009 at 6:36 AM, Ralf Gommers > > wrote: > > > > On Sun, Dec 6, 2009 at 2:10 PM, Gael Varoquaux > > wrote: > > > Excellent. I tweeked a bit the text to make it clearer: > http://docs.scipy.org/numpy/docs/numpy.core.defchararray.chararray/ > > I made it clear that the replacement are present only starting > numpy 1.4. > I'd love a review and an 'OK to apply'. > > Looks good, I copied the changes to the notes in defchararray and > arrays.classes.rst, and toggled OK to apply. > > Cheers, > Ralf > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > Thanks, guys! > > DG > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Michael Droettboom Science Software Branch Operations and Engineering Division Space Telescope Science Institute Operated by AURA for NASA From mdroe at stsci.edu Mon Dec 7 09:06:00 2009 From: mdroe at stsci.edu (Michael Droettboom) Date: Mon, 07 Dec 2009 09:06:00 -0500 Subject: [Numpy-discussion] np.equal In-Reply-To: <1260178572.2704.5.camel@talisman> References: <1cd32cbb0912060457u68e43565xfdd982a043a3e8fd@mail.gmail.com> <1260178572.2704.5.camel@talisman> Message-ID: <4B1D0BC8.4080402@stsci.edu> There was some discussion on this recently. "==" actually has special fall-back code for strings, whereas "equal" is just a "raw" ufunc, and no ufunc's support strings. http://www.mail-archive.com/numpy-discussion at scipy.org/msg21408.html Of course bool(NotImplemented) == True is sort of a separate issue that can be dealt with independently of whether ufuncs grow string support of not. Mike Pauli Virtanen wrote: > su, 2009-12-06 kello 23:01 -0800, Fernando Perez kirjoitti: > >> 2009/12/6 josef.pktd : >> >>>>>> np.equal(np.arange(5),'a') >>>>>> >>> NotImplemented >>> >> Why is NotImplemented a *return* value? Normally NotImplementedError >> is a raised exception, but if it's not implemented, it shouldn't be >> returned as a value. >> > > Maybe it is so because some code paths are shared with the == operation? > > But in any case, NotImplemented should never be returned to the user -- > I believe it's only meant to be an internal value used in determining > the proper comparison operation between values. > > A bug ticket should be filed for this, I believe. > > -- Michael Droettboom Science Software Branch Operations and Engineering Division Space Telescope Science Institute Operated by AURA for NASA From mdroe at stsci.edu Mon Dec 7 09:12:11 2009 From: mdroe at stsci.edu (Michael Droettboom) Date: Mon, 07 Dec 2009 09:12:11 -0500 Subject: [Numpy-discussion] Py3 merge In-Reply-To: References: <1260060065.5576.80.camel@idol> Message-ID: <4B1D0D3B.2060202@stsci.edu> Charles R Harris wrote: > > > > We need character arrays for the astro people. I assume these will be > byte arrays. Maybe Michael will weigh in here. I can't find in the thread where removing byte arrays (meaning arrays of fixed-length non-unicode strings) was suggested -- though changing the dtype specifier for them was. That is 'S' would change to 'B' in python3 (with some deprecation period for 'S'), and 'U' would remain 'U'. That seems acceptable to me, as long as we have some way to have fixed-length 8-bit strings. 
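For concreteness, the kind of usage that needs a fixed-width spelling,
sketched with today's 'S' code ('B8' below is the hypothetical Py3-era
equivalent, not an existing dtype):

    import numpy as np

    # A fixed-layout binary record: an 8-byte string field plus a
    # float64. Under the proposal, 'S8' would be spelled 'B8'.
    rec = np.dtype([('name', 'S8'), ('value', '<f8')])
    assert rec.itemsize == 16     # field widths must not drift

    data = np.zeros(2, dtype=rec)
    data['name'] = ['alpha', 'beta']
    data['value'] = [1.5, 2.5]
    data.tofile('records.bin')    # round-trips through a binary file
    print(np.fromfile('records.bin', dtype=rec))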
Hopefully all the new chararray unit tests will help with this
transition.

Mike

-- 
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA

From mdroe at stsci.edu  Mon Dec  7 09:17:08 2009
From: mdroe at stsci.edu (Michael Droettboom)
Date: Mon, 07 Dec 2009 09:17:08 -0500
Subject: [Numpy-discussion] Chararray depreciated?
In-Reply-To: <4B1D07D7.6030108@stsci.edu>
References: <20091206111300.GC897@phare.normalesup.org>
	<20091206131010.GD897@phare.normalesup.org>
	<45d1ab480912061105s39ebab81jf39a73bdd881eb26@mail.gmail.com>
	<4B1D07D7.6030108@stsci.edu>
Message-ID: <4B1D0E64.1050606@stsci.edu>

Also noticed that

http://docs.scipy.org/numpy/docs/numpy.core.defchararray.chararray/

lists a number of methods that aren't really applicable. For example
"sum". This is a side effect, of course, of chararray inheriting from
ndarray, and the documentation isn't "wrong" -- chararray has a "sum"
method, it's just that it doesn't make sense, and you get:

    TypeError: cannot perform reduce with flexible type

when trying to run it.

I suspect fixing this at the implementation level may be too
difficult/dangerous, but for documentation purposes, is there a way to
explicitly list the methods for the chararray class?

Mike

Michael Droettboom wrote:
> If this:
>
> http://docs.scipy.org/numpy/docs/numpy.core.defchararray.chararray/
>
> is the most recent revision, then it looks good to me. It does seem to
> incorporate my new text -- in case there was any confusion about that.
>
> Mike
>
> David Goldsmith wrote:
>> On Sun, Dec 6, 2009 at 6:36 AM, Ralf Gommers wrote:
>>
>>     On Sun, Dec 6, 2009 at 2:10 PM, Gael Varoquaux wrote:
>>
>>         Excellent. I tweaked the text a bit to make it clearer:
>>         http://docs.scipy.org/numpy/docs/numpy.core.defchararray.chararray/
>>
>>         I made it clear that the replacements are present only
>>         starting from numpy 1.4. I'd love a review and an 'OK to
>>         apply'.
>>
>>     Looks good. I copied the changes to the notes in defchararray and
>>     arrays.classes.rst, and toggled OK to apply.
>>
>>     Cheers,
>>     Ralf
>>
>> Thanks, guys!
>>
>> DG

-- 
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA

From pav at iki.fi  Mon Dec  7 09:24:05 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Mon, 07 Dec 2009 16:24:05 +0200
Subject: [Numpy-discussion] Chararray depreciated?
In-Reply-To: <4B1D0E64.1050606@stsci.edu>
References: <20091206111300.GC897@phare.normalesup.org>
	<20091206131010.GD897@phare.normalesup.org>
	<45d1ab480912061105s39ebab81jf39a73bdd881eb26@mail.gmail.com>
	<4B1D07D7.6030108@stsci.edu> <4B1D0E64.1050606@stsci.edu>
Message-ID: <1260195845.2704.15.camel@talisman>

ma, 2009-12-07 kello 09:17 -0500, Michael Droettboom kirjoitti:
> Also noticed that
>
> http://docs.scipy.org/numpy/docs/numpy.core.defchararray.chararray/
[clip]
> I suspect fixing this at the implementation level may be too
> difficult/dangerous, but for documentation purposes, is there a way to
> explicitly list the methods for the chararray class?

Yes, by adding a Methods section to the class docstring:

    Methods
    -------
    rstrip
    lstrip
    ...

	Pauli

From pav at iki.fi  Mon Dec  7 09:22:42 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Mon, 07 Dec 2009 16:22:42 +0200
Subject: [Numpy-discussion] Py3 merge
In-Reply-To: <4B1D0D3B.2060202@stsci.edu>
References: <1260060065.5576.80.camel@idol> <4B1D0D3B.2060202@stsci.edu>
Message-ID: <1260195762.2704.13.camel@talisman>

ma, 2009-12-07 kello 09:12 -0500, Michael Droettboom kirjoitti:
> > We need character arrays for the astro people. I assume these will be
> > byte arrays. Maybe Michael will weigh in here.
>
> I can't find in the thread where removing byte arrays (meaning arrays of
> fixed-length non-unicode strings) was suggested -- though changing the
> dtype specifier for them was. That is 'S' would change to 'B' in
> python3 (with some deprecation period for 'S'), and 'U' would remain
> 'U'. That seems acceptable to me, as long as we have some way to have
> fixed-length 8-bit strings. Hopefully all the new chararray unit tests
> will help with this transition.

Removal was suggested, with the motivation that people should just use
byte arrays instead. I think we're not going to remove it at the moment,
though.

The character 'B' is already taken by unsigned bytes -- I wonder if it's
easy to support 'B123' and plain 'B' at the same time, or whether we
have to pick a different letter for "byte strings". 'y' would be free...

The chararray unit tests are all presently failing, so they are
definitely useful :)

	Pauli

From mdroe at stsci.edu  Mon Dec  7 09:50:20 2009
From: mdroe at stsci.edu (Michael Droettboom)
Date: Mon, 07 Dec 2009 09:50:20 -0500
Subject: [Numpy-discussion] Py3 merge
In-Reply-To: <1260195762.2704.13.camel@talisman>
References: <1260060065.5576.80.camel@idol> <4B1D0D3B.2060202@stsci.edu>
	<1260195762.2704.13.camel@talisman>
Message-ID: <4B1D162C.5020607@stsci.edu>

Pauli Virtanen wrote:
> ma, 2009-12-07 kello 09:12 -0500, Michael Droettboom kirjoitti:
>
>>> We need character arrays for the astro people. I assume these will be
>>> byte arrays. Maybe Michael will weigh in here.
>>>
>> I can't find in the thread where removing byte arrays (meaning arrays of
>> fixed-length non-unicode strings) was suggested -- though changing the
>> dtype specifier for them was. That is 'S' would change to 'B' in
>> python3 (with some deprecation period for 'S'), and 'U' would remain
>> 'U'. That seems acceptable to me, as long as we have some way to have
>> fixed-length 8-bit strings. Hopefully all the new chararray unit tests
>> will help with this transition.
>>
>
> Removal was suggested, with the motivation that people should just use
> byte arrays instead. I think we're not going to remove it at the moment,
> though.
>

Maybe I'm missing something, but those don't seem like the same thing.
The byte type is fundamentally numeric, whereas byte strings are
lexicographic. They construct, repr and sort differently, and many
numerical operations don't make sense on strings. It doesn't seem like
(at present) byte arrays are a reasonable substitute for string arrays.

> The character 'B' is already taken by unsigned bytes -- I wonder if
> it's easy to support 'B123' and plain 'B' at the same time, or whether
> we have to pick a different letter for "byte strings". 'y' would be
> free...

It seems to me the motivation to change the 'S' dtype to something else
is to make things clearer with respect to the new conventions of Python
3 (where str -> bytes, and unicode -> str). In that sense, I'm not sure
there's any advantage going from "S" to "y" (particularly without doing
"U" to "S"), whereas there's a strong backward-compatibility advantage
to keeping it as "S", though admittedly it's confusing to someone who
doesn't know the pre-Python 3 history.

I'm not sure your suggestion of making 'B' and 'B123' both work is a
good one, because of the semantic differences between numbers and
strings. Would np.array(['a', 'b']) have a repr of [97, 98] or ['a',
'b']? Sorting them would also not necessarily do the right thing.

> The chararray unit tests are all presently failing, so they are
> definitely useful :)

Glad to help :)

Mike

-- 
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA

From cournape at gmail.com  Mon Dec  7 10:24:02 2009
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 8 Dec 2009 00:24:02 +0900
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
Message-ID: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com>

Hi,

There are a few issues which have been found on numpy 1.4.0, which worry
me:

# 1317: segfaults for integer division overflow
# 1318: all FPU exceptions ignored by default

#1318 worries me the most: I think it is a pretty serious regression,
since things like this go unnoticed:

x = np.array([1, 2, 3, 4]) / 0  # x is an array of 0, no warning printed

But fixing it reveals quite a few problems while running the test suite
(invalid value warnings appear 518 times), and I think it is a bit late
to fix those. What do people think ?

cheers,

David

From pav at iki.fi  Mon Dec  7 10:32:50 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Mon, 07 Dec 2009 17:32:50 +0200
Subject: [Numpy-discussion] Py3 merge
In-Reply-To: <4B1D162C.5020607@stsci.edu>
References: <1260060065.5576.80.camel@idol> <4B1D0D3B.2060202@stsci.edu>
	<1260195762.2704.13.camel@talisman> <4B1D162C.5020607@stsci.edu>
Message-ID: <1260199970.2704.84.camel@talisman>

ma, 2009-12-07 kello 09:50 -0500, Michael Droettboom kirjoitti:
> Pauli Virtanen wrote:
[clip]
> > The character 'B' is already taken by unsigned bytes -- I wonder if
> > it's easy to support 'B123' and plain 'B' at the same time, or whether
> > we have to pick a different letter for "byte strings". 'y' would be
> > free...
>
> It seems to me the motivation to change the 'S' dtype to something else
> is to make things clearer with respect to the new conventions of Python
> 3 (where str -> bytes, and unicode -> str). In that sense, I'm not sure
> there's any advantage going from "S" to "y" (particularly without doing
> "U" to "S"), whereas there's a strong backward-compatibility advantage
> to keeping it as "S", though admittedly it's confusing to someone who
> doesn't know the pre-Python 3 history.

I think a better plan is to deprecate "U" instead of "S".
Also, I'm not completely convinced that staying with "S" == bytes has a strong backward-compatibility advantage: array(['foo']).dtype == 'U' and this will break code in several places. Also, for instance, array(['foo', 'bar'], dtype='S3') will result to encoding errors. We probably don't want to start implicitly casting Unicode to bytes, since Py3 does not do that either. The only places where the dtype characters are used, AFAIK, is in repr and in the dtype kwarg -- they are not used in pickles etc. One can actually argue that changing 'U' to 'S' is more backward-compatible: array(['foo', 'bar'], dtype='S3') would still be valid code. Of course, the semantics change, but this anyway occurs also on the Python side when moving to Py3. The simplest way to get more insight would be to try to convert some string-using Py2 code to work on Py3. > I'm not sure your suggestion of making 'B' and 'B123' both work seems > like a good one because of the semantic differences between numbers and > strings. Would np.array(['a', 'b']) have a repr of [97, 98] or ['a', > 'b']? Sorting them would also not necessarily do the right thing. I think the point would be that 'B' and 'B1' would be treated as completely separate dtypes with different typenums -- they'd look similar only in the dtype character API (which is not so large) but not internally. np.array([b'a', b'b']).dtype would be 'B1'. Might be a bit confusing, though. Pauli From charlesr.harris at gmail.com Mon Dec 7 11:43:12 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 7 Dec 2009 09:43:12 -0700 Subject: [Numpy-discussion] np.equal In-Reply-To: References: <1cd32cbb0912060457u68e43565xfdd982a043a3e8fd@mail.gmail.com> Message-ID: On Mon, Dec 7, 2009 at 12:01 AM, Fernando Perez wrote: > 2009/12/6 josef.pktd : > >>>> np.equal(np.arange(5),'a') > > NotImplemented > > Why is NotImplemented a *return* value? Normally NotImplementedError > NotImplemented and NotImplementedError are different things. The first is a message to the interpreter, the second an error. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Dec 7 11:48:33 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 7 Dec 2009 09:48:33 -0700 Subject: [Numpy-discussion] Release blockers for 1.4.0 ? In-Reply-To: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> Message-ID: On Mon, Dec 7, 2009 at 8:24 AM, David Cournapeau wrote: > Hi, > > There are a few issues which have been found on numpy 1.4.0, which worry > me: > > # 1317: segfaults for integer division overflow > # 1318: all FPU exceptions ignored by default > > #1318 worries me the most: I think it is a pretty serious regression, > since things like this go unnoticed: > > x = np.array([1, 2, 3, 4]) / 0 # x is an array of 0, no warning printed > > Hasn't that always been the case? Unless we have a way to raise exceptions from ufuncs I don't know what else we can do. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Mon Dec 7 11:53:24 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 7 Dec 2009 11:53:24 -0500 Subject: [Numpy-discussion] Release blockers for 1.4.0 ? 
In-Reply-To: References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> Message-ID: On Mon, Dec 7, 2009 at 11:48 AM, Charles R Harris wrote: > > > On Mon, Dec 7, 2009 at 8:24 AM, David Cournapeau wrote: >> >> Hi, >> >> There are a few issues which have been found on numpy 1.4.0, which worry >> me: >> >> # 1317: segfaults for integer division overflow >> # 1318: all FPU exceptions ignored by default >> >> #1318 worries me the most: I think it is a pretty serious regression, >> since things like this go unnoticed: >> >> x = np.array([1, 2, 3, 4]) / 0 # x is an array of 0, no warning printed >> > > Hasn't that always been the case? Unless we have a way to raise exceptions > from ufuncs I don't know what else we can do. > http://thread.gmane.org/gmane.comp.python.numeric.general/8214/ Skipper From mdroe at stsci.edu Mon Dec 7 11:54:11 2009 From: mdroe at stsci.edu (Michael Droettboom) Date: Mon, 07 Dec 2009 11:54:11 -0500 Subject: [Numpy-discussion] Py3 merge In-Reply-To: <1260199970.2704.84.camel@talisman> References: <1260060065.5576.80.camel@idol> <4B1D0D3B.2060202@stsci.edu> <1260195762.2704.13.camel@talisman> <4B1D162C.5020607@stsci.edu> <1260199970.2704.84.camel@talisman> Message-ID: <4B1D3333.9010609@stsci.edu> Pauli Virtanen wrote: > ma, 2009-12-07 kello 09:50 -0500, Michael Droettboom kirjoitti: > >> Pauli Virtanen wrote: >> > [clip] > >>> The character 'B' is already by unsigned bytes -- I wonder if it's easy >>> to support 'B123' and plain 'B' at the same time, or whether we have to >>> pick a different letter for "byte strings". 'y' would be free... >>> >> It seems to me the motivation to change the 'S' dtype to something else >> is to make things clearer with respect to the new conventions of Python >> 3. (Where str -> bytes, and unicode -> str). In that sense, I'm not >> sure there's any advantage going from "S" to "y" (particularly without >> doing "U" to "S"), whereas there's a strong backward-compatibility >> advantage to keep it as "S", though admittedly it's confusing to someone >> who doesn't know the pre Python 3 history. >> > > I think a better plan is to deprecate "U" instead of "S". > > Also, I'm not completely convinced that staying with "S" == bytes has a > strong backward-compatibility advantage: > > array(['foo']).dtype == 'U' > > and this will break code in several places. Also, for instance, > > array(['foo', 'bar'], dtype='S3') > > will result to encoding errors. We probably don't want to start > implicitly casting Unicode to bytes, since Py3 does not do that either. > The only places where the dtype characters are used, AFAIK, is in repr > and in the dtype kwarg -- they are not used in pickles etc. > > One can actually argue that changing 'U' to 'S' is more > backward-compatible: > > array(['foo', 'bar'], dtype='S3') > > would still be valid code. Of course, the semantics change, but this > anyway occurs also on the Python side when moving to Py3. > > The simplest way to get more insight would be to try to convert some > string-using Py2 code to work on Py3. > Ok -- I think I can see that argument. Our use case is to define structured arrays to read and write binary files, which means we will have to change our dtypes from 'S8' to 'B8' in this case, or risk having the fields be the wrong size. It's very rare for our code to create arrays using string literals, so this problem hadn't occurred to me. I think 'U' will have to change to 'S', and users defining structured arrays will just have to make this change. 
>> I'm not sure your suggestion of making 'B' and 'B123' both work seems
>> like a good one because of the semantic differences between numbers and
>> strings. Would np.array(['a', 'b']) have a repr of [97, 98] or ['a',
>> 'b']? Sorting them would also not necessarily do the right thing.
>
> I think the point would be that 'B' and 'B1' would be treated as
> completely separate dtypes with different typenums -- they'd look
> similar only in the dtype character API (which is not so large) but not
> internally. np.array([b'a', b'b']).dtype would be 'B1'. Might be a bit
> confusing, though.
>

I see. I didn't quite understand what you were suggesting. I suppose that's not a bad compromise. Would the "kind" attribute be different between bytes and byte strings? I worry about code that does something like:

	if x.dtype.kind == 'B': ...

...which is not great usage, (issubclass(x.dtype.type, np.byte) would be better) but one sees it in user code in the wild (and even in Numpy itself) now and then.

Mike

--
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA

From oliphant at enthought.com Mon Dec 7 12:14:30 2009
From: oliphant at enthought.com (Travis Oliphant)
Date: Mon, 7 Dec 2009 11:14:30 -0600
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To:
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com>
Message-ID: <1BF4EF95-AE46-468E-B4CF-55B1FA7BBDE7@enthought.com>

--
(mobile phone of)
Travis Oliphant
Enthought, Inc.
1-512-536-1057
http://www.enthought.com

On Dec 7, 2009, at 10:48 AM, Charles R Harris wrote:

> On Mon, Dec 7, 2009 at 8:24 AM, David Cournapeau wrote:
> Hi,
>
> There are a few issues which have been found on numpy 1.4.0, which worry me:
>
> # 1317: segfaults for integer division overflow
> # 1318: all FPU exceptions ignored by default
>
> #1318 worries me the most: I think it is a pretty serious regression, since things like this go unnoticed:
>
> x = np.array([1, 2, 3, 4]) / 0 # x is an array of 0, no warning printed
>

This is the default behavior when error handling is set to ignore. I don't know when that was changed. I thought some errors were supposed to be printed.

> Hasn't that always been the case? Unless we have a way to raise exceptions from ufuncs I don't know what else we can do.
>
> Chuck
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
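The state Travis refers to can be inspected with np.geterr(); on an affected 1.4.0rc install the result would look something like this (illustrative, not a captured session):

    >>> import numpy as np
    >>> np.geterr()
    {'over': 'ignore', 'divide': 'ignore', 'invalid': 'ignore', 'under': 'ignore'}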
From cournape at gmail.com Mon Dec 7 12:31:37 2009
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 8 Dec 2009 02:31:37 +0900
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To:
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com>
Message-ID: <5b8d13220912070931x3e1f20b3tf56321687dc58086@mail.gmail.com>

On Tue, Dec 8, 2009 at 1:48 AM, Charles R Harris wrote:
> On Mon, Dec 7, 2009 at 8:24 AM, David Cournapeau wrote:
>> Hi,
>>
>> There are a few issues which have been found on numpy 1.4.0, which worry me:
>>
>> # 1317: segfaults for integer division overflow
>> # 1318: all FPU exceptions ignored by default
>>
>> #1318 worries me the most: I think it is a pretty serious regression, since things like this go unnoticed:
>>
>> x = np.array([1, 2, 3, 4]) / 0 # x is an array of 0, no warning printed
>>
>
> Hasn't that always been the case? Unless we have a way to raise exceptions from ufuncs I don't know what else we can do.

No, it is a consequence of errors being set to ignored in numpy.ma:

http://projects.scipy.org/gitweb?p=numpy;a=blob;f=numpy/ma/core.py;h=f28a5738efa6fb6c4cbf0b3479243b0d7286ae32;hb=master#l107

So the fix is easy - but then it shows many (> 500) invalid values, etc... related to wrong fpu handling (most of them are limited to the new polynomial code, though).

David

From charlesr.harris at gmail.com Mon Dec 7 13:16:32 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 7 Dec 2009 11:16:32 -0700
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To: <5b8d13220912070931x3e1f20b3tf56321687dc58086@mail.gmail.com>
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <5b8d13220912070931x3e1f20b3tf56321687dc58086@mail.gmail.com>
Message-ID:

On Mon, Dec 7, 2009 at 10:31 AM, David Cournapeau wrote:
> On Tue, Dec 8, 2009 at 1:48 AM, Charles R Harris wrote:
>> On Mon, Dec 7, 2009 at 8:24 AM, David Cournapeau wrote:
>>> Hi,
>>>
>>> There are a few issues which have been found on numpy 1.4.0, which worry me:
>>>
>>> # 1317: segfaults for integer division overflow
>>> # 1318: all FPU exceptions ignored by default
>>>
>>> #1318 worries me the most: I think it is a pretty serious regression, since things like this go unnoticed:
>>>
>>> x = np.array([1, 2, 3, 4]) / 0 # x is an array of 0, no warning printed
>>>
>>
>> Hasn't that always been the case? Unless we have a way to raise exceptions from ufuncs I don't know what else we can do.
>
> No, it is a consequence of errors being set to ignored in numpy.ma:
>
> http://projects.scipy.org/gitweb?p=numpy;a=blob;f=numpy/ma/core.py;h=f28a5738efa6fb6c4cbf0b3479243b0d7286ae32;hb=master#l107
>
> So the fix is easy - but then it shows many (> 500) invalid values, etc... related to wrong fpu handling (most of them are limited to the new polynomial code, though).
>

Umm, no. Just four, and easily fixed as I explicitly relied on the behaviour. After the fix and seterr(all='raise'):

FAILED (KNOWNFAIL=4, SKIP=11, errors=47, failures=1)

No errors in polynomial, however.

Chuck
From charlesr.harris at gmail.com Mon Dec 7 13:24:53 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 7 Dec 2009 11:24:53 -0700
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To:
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <5b8d13220912070931x3e1f20b3tf56321687dc58086@mail.gmail.com>
Message-ID:

On Mon, Dec 7, 2009 at 11:16 AM, Charles R Harris wrote:
> On Mon, Dec 7, 2009 at 10:31 AM, David Cournapeau wrote:
>> On Tue, Dec 8, 2009 at 1:48 AM, Charles R Harris wrote:
>>> On Mon, Dec 7, 2009 at 8:24 AM, David Cournapeau wrote:
>>>> Hi,
>>>>
>>>> There are a few issues which have been found on numpy 1.4.0, which worry me:
>>>>
>>>> # 1317: segfaults for integer division overflow
>>>> # 1318: all FPU exceptions ignored by default
>>>>
>>>> #1318 worries me the most: I think it is a pretty serious regression, since things like this go unnoticed:
>>>>
>>>> x = np.array([1, 2, 3, 4]) / 0 # x is an array of 0, no warning printed
>>>>
>>> Hasn't that always been the case? Unless we have a way to raise exceptions from ufuncs I don't know what else we can do.
>>
>> No, it is a consequence of errors being set to ignored in numpy.ma:
>>
>> http://projects.scipy.org/gitweb?p=numpy;a=blob;f=numpy/ma/core.py;h=f28a5738efa6fb6c4cbf0b3479243b0d7286ae32;hb=master#l107
>>
>> So the fix is easy - but then it shows many (> 500) invalid values, etc... related to wrong fpu handling (most of them are limited to the new polynomial code, though).
>
> Umm, no. Just four, and easily fixed as I explicitly relied on the behaviour. After the fix and seterr(all='raise'):
>

To be specific, it was a true divide and I relied on nan being returned. I expect many of the remaining failures are of the same sort.

Chuck

From mdroe at stsci.edu Mon Dec 7 13:33:50 2009
From: mdroe at stsci.edu (Michael Droettboom)
Date: Mon, 07 Dec 2009 13:33:50 -0500
Subject: [Numpy-discussion] 1.4.0rc: Failing regression test on Solaris/SPARC
Message-ID: <4B1D4A8E.9060204@stsci.edu>

We've got 2 failing regression tests left on Solaris/SPARC, with the same apparent root cause (a difference in out-of-domain handling in asin and friends). Would anyone (particularly someone familiar with masked arrays) be able to look at #1173 to confirm that my suggested fix is correct?

http://projects.scipy.org/numpy/ticket/1173
http://projects.scipy.org/numpy/ticket/1174

Mike

--
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA

From josef.pktd at gmail.com Mon Dec 7 13:41:26 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 7 Dec 2009 13:41:26 -0500
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To:
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <5b8d13220912070931x3e1f20b3tf56321687dc58086@mail.gmail.com>
Message-ID: <1cd32cbb0912071041n1a8c7f05r8960f34854bc11f6@mail.gmail.com>

On Mon, Dec 7, 2009 at 1:24 PM, Charles R Harris wrote:
> On Mon, Dec 7, 2009 at 11:16 AM, Charles R Harris wrote:
>> On Mon, Dec 7, 2009 at 10:31 AM, David Cournapeau wrote:
>>> On Tue, Dec 8, 2009 at 1:48 AM, Charles R Harris wrote:
>>>> On Mon, Dec 7, 2009 at 8:24 AM, David Cournapeau wrote:
>>>>> Hi,
>>>>>
>>>>> There are a few issues which have been found on numpy 1.4.0, which worry me:
>>>>>
>>>>> # 1317: segfaults for integer division overflow
>>>>> # 1318: all FPU exceptions ignored by default
>>>>>
>>>>> #1318 worries me the most: I think it is a pretty serious regression, since things like this go unnoticed:
>>>>>
>>>>> x = np.array([1, 2, 3, 4]) / 0 # x is an array of 0, no warning printed
>>>>>
>>>> Hasn't that always been the case? Unless we have a way to raise exceptions from ufuncs I don't know what else we can do.
>>>
>>> No, it is a consequence of errors being set to ignored in numpy.ma:
>>>
>>> http://projects.scipy.org/gitweb?p=numpy;a=blob;f=numpy/ma/core.py;h=f28a5738efa6fb6c4cbf0b3479243b0d7286ae32;hb=master#l107
>>>
>>> So the fix is easy - but then it shows many (> 500) invalid values, etc... related to wrong fpu handling (most of them are limited to the new polynomial code, though).
>>>
>> Umm, no. Just four, and easily fixed as I explicitly relied on the behaviour. After the fix and seterr(all='raise'):
>>
> To be specific, it was a true divide and I relied on nan being returned. I expect many of the remaining failures are of the same sort.
if seterr raise also raises when the calculations are done with floating point, then it's not really useful. I think there is a lot of code in scipy.stats that relies on returning nan for 0/0 and inf for x/0, x != 0. Instead of partial answers, we would get exceptions in lots of cases; another common example is np.corrcoef.

I think this is more a problem with the silent casting of nan and inf to 0 for integers (which I have disliked for a long time), not a problem with floating point operations.

>>> np.seterr(all='raise')
{'over': 'warn', 'divide': 'warn', 'invalid': 'warn', 'under': 'warn'}
>>> np.arange(3).astype(int)/0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FloatingPointError: divide by zero encountered in divide
>>> np.arange(3).astype(float)/0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FloatingPointError: divide by zero encountered in divide
>>>
>>> np.corrcoef(np.ones(3),np.arange(3))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Programs\Python25\Lib\site-packages\numpy\lib\function_base.py", line 1981, in corrcoef
    return c/sqrt(multiply.outer(d,d))
FloatingPointError: invalid value encountered in divide

>>> np.seterr(all='ignore')
{'over': 'raise', 'divide': 'raise', 'invalid': 'raise', 'under': 'raise'}
>>> np.corrcoef(np.ones(3),np.arange(3))
array([[ NaN,  NaN],
       [ NaN,   1.]])

> Chuck
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From nwagner at iam.uni-stuttgart.de Mon Dec 7 16:36:40 2009
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Mon, 07 Dec 2009 22:36:40 +0100
Subject: [Numpy-discussion] numpy.test() failures
Message-ID:

>>> numpy.__version__
'1.5.0.dev7980'

FAIL: test_buffer_hashlib (test_regression.TestRegression)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/nwagner/local/lib64/python2.6/site-packages/numpy/core/tests/test_regression.py", line 1246, in test_buffer_hashlib
    assert_equal(md5(x).hexdigest(), '2a1dd1e1e59d0a384c26951e316cd7e6')
  File "/home/nwagner/local/lib64/python2.6/site-packages/numpy/testing/utils.py", line 307, in assert_equal
    raise AssertionError(msg)
AssertionError:
Items are not equal:
 ACTUAL: 'aa341a15f5ade44faafbe190f98c2587'
 DESIRED: '2a1dd1e1e59d0a384c26951e316cd7e6'

----------------------------------------------------------------------
Ran 2485 tests in 14.328s

FAILED (KNOWNFAIL=5, failures=1)

From pav at iki.fi Mon Dec 7 17:37:41 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Tue, 08 Dec 2009 00:37:41 +0200
Subject: [Numpy-discussion] numpy.test() failures
In-Reply-To:
References:
Message-ID: <1260225460.9834.0.camel@idol>

ma, 2009-12-07 kello 22:36 +0100, Nils Wagner kirjoitti:
> >>> numpy.__version__
> '1.5.0.dev7980'
>
> FAIL: test_buffer_hashlib (test_regression.TestRegression)

Thx, fixed. I forgot about defaulting to int64.

	Pauli

From washakie at gmail.com Mon Dec 7 19:00:11 2009
From: washakie at gmail.com (John [H2O])
Date: Mon, 7 Dec 2009 16:00:11 -0800 (PST)
Subject: [Numpy-discussion] recfunctions help with concatenating (vstack, hstack, etc.)
Message-ID: <26686543.post@talk.nabble.com>

Hello (Pierre?),

I'm trying to work more with structured arrays, which at times seems great, and at others (due to my lack of familiarity) very frustrating.
Anyway, right now I'm writing a bit of code to read a series of files with x,y,z data. I'm creating record arrays for each file I read. Once I have them all read, I just want to load them into one big array. But it doesn't seem as straightforward as concatenation:

In [102]: type(D); D.shape; D.dtype.names
Out[102]: <class 'numpy.core.records.recarray'>
Out[102]: (3025,)
Out[102]: ('datetime', 'lon', 'lat', 'elv', 'co')

In [103]: type(D2); D2.shape; D2.dtype.names
Out[103]: <class 'numpy.core.records.recarray'>
Out[103]: (3445,)
Out[103]: ('datetime', 'lon', 'lat', 'elv', 'co')

In [104]: C = rf.stack_arrays((D,D2))

In [105]: type(C); C.shape; C.dtype.names
Out[105]: <class 'numpy.ma.core.MaskedArray'>
Out[105]: (6470,)
Out[105]: ('datetime', 'lon', 'lat', 'elv', 'co')

In [106]: C.datetime
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)

/xnilu_wrk/flex_wrk/jfb/RESEARCH_ARCTIC/Arctic_CO/<ipython console> in <module>()

AttributeError: 'MaskedArray' object has no attribute 'datetime'

In [107]: C[0]
Out[107]: (datetime.datetime(2008, 6, 29, 14, 50, tzinfo=), 248.83900164808117, 53.949661856137709, -0.31834712473582627, 112.91218844292422)

In [108]:

So it seems I end up with a masked array of the correct length, but it is an array of tuples and no longer a record array. Am I missing a step?

Thanks,
john

--
View this message in context: http://old.nabble.com/recfunctions-help-with-concatenating-%28vstack%2C-hstack%2C-etc.%29-tp26686543p26686543.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.

From robert.kern at gmail.com Mon Dec 7 19:08:52 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 7 Dec 2009 18:08:52 -0600
Subject: [Numpy-discussion] recfunctions help with concatenating (vstack, hstack, etc.)
In-Reply-To: <26686543.post@talk.nabble.com>
References: <26686543.post@talk.nabble.com>
Message-ID: <3d375d730912071608m434105i2573242286bda611@mail.gmail.com>

On Mon, Dec 7, 2009 at 18:00, John [H2O] wrote:
>
> Hello (Pierre?),
>
> I'm trying to work more with structured arrays, which at times seems great, and at others (due to my lack of familiarity) very frustrating.
>
> Anyway, right now I'm writing a bit of code to read a series of files with x,y,z data. I'm creating record arrays for each file I read. Once I have them all read, I just want to load them into one big array. But it doesn't seem as straightforward as concatenation:
>
> In [102]: type(D); D.shape; D.dtype.names
> Out[102]: <class 'numpy.core.records.recarray'>
> Out[102]: (3025,)
> Out[102]: ('datetime', 'lon', 'lat', 'elv', 'co')
>
> In [103]: type(D2); D2.shape; D2.dtype.names
> Out[103]: <class 'numpy.core.records.recarray'>
> Out[103]: (3445,)
> Out[103]: ('datetime', 'lon', 'lat', 'elv', 'co')
>
> In [104]: C = rf.stack_arrays((D,D2))
>
> In [105]: type(C); C.shape; C.dtype.names
> Out[105]: <class 'numpy.ma.core.MaskedArray'>
> Out[105]: (6470,)
> Out[105]: ('datetime', 'lon', 'lat', 'elv', 'co')
>
> In [106]: C.datetime
> ---------------------------------------------------------------------------
> AttributeError                            Traceback (most recent call last)
>
> /xnilu_wrk/flex_wrk/jfb/RESEARCH_ARCTIC/Arctic_CO/<ipython console> in <module>()
>
> AttributeError: 'MaskedArray' object has no attribute 'datetime'
>
> In [107]: C[0]
> Out[107]: (datetime.datetime(2008, 6, 29, 14, 50, tzinfo=), 248.83900164808117, 53.949661856137709, -0.31834712473582627, 112.91218844292422)
>
> In [108]:
>
> So it seems I end up with a masked array of the correct length, but it is an array of tuples and no longer a record array. Am I missing a step?

It's still a structured array (albeit a masked one); however, it is no longer a recarray.
recarray is just a convenience class that provides attribute access. Use C['datetime'] to access the datetime column. Out[107] is not a tuple but a record scalar.

Look at the docstring of numpy.lib.recfunctions.stack_arrays() to see the asrecarray option which will make the output a recarray (or the MaskedRecords subclass if you retain the default usemask=True option).

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From jsseabold at gmail.com Mon Dec 7 19:12:09 2009
From: jsseabold at gmail.com (Skipper Seabold)
Date: Mon, 7 Dec 2009 19:12:09 -0500
Subject: [Numpy-discussion] recfunctions help with concatenating (vstack, hstack, etc.)
In-Reply-To: <26686543.post@talk.nabble.com>
References: <26686543.post@talk.nabble.com>
Message-ID:

On Mon, Dec 7, 2009 at 7:00 PM, John [H2O] wrote:
>
> Hello (Pierre?),
>
> I'm trying to work more with structured arrays, which at times seems great, and at others (due to my lack of familiarity) very frustrating.
>

Sounds familiar...

> Anyway, right now I'm writing a bit of code to read a series of files with x,y,z data. I'm creating record arrays for each file I read. Once I have them all read, I just want to load them into one big array. But it doesn't seem as straightforward as concatenation:
>

Hmm, I just went through a lot of data cleaning in a similar manner. I used structured arrays rather than record arrays though, as I believe the latter are faster to pull from.

> In [102]: type(D); D.shape; D.dtype.names
> Out[102]: <class 'numpy.core.records.recarray'>
> Out[102]: (3025,)
> Out[102]: ('datetime', 'lon', 'lat', 'elv', 'co')
>
> In [103]: type(D2); D2.shape; D2.dtype.names
> Out[103]: <class 'numpy.core.records.recarray'>
> Out[103]: (3445,)
> Out[103]: ('datetime', 'lon', 'lat', 'elv', 'co')
>
> In [104]: C = rf.stack_arrays((D,D2))
>
> In [105]: type(C); C.shape; C.dtype.names
> Out[105]: <class 'numpy.ma.core.MaskedArray'>
> Out[105]: (6470,)
> Out[105]: ('datetime', 'lon', 'lat', 'elv', 'co')
>
> In [106]: C.datetime
> ---------------------------------------------------------------------------
> AttributeError                            Traceback (most recent call last)
>
> /xnilu_wrk/flex_wrk/jfb/RESEARCH_ARCTIC/Arctic_CO/<ipython console> in <module>()
>
> AttributeError: 'MaskedArray' object has no attribute 'datetime'
>
> In [107]: C[0]
> Out[107]: (datetime.datetime(2008, 6, 29, 14, 50, tzinfo=), 248.83900164808117, 53.949661856137709, -0.31834712473582627, 112.91218844292422)
>
> In [108]:
>
> So it seems I end up with a masked array of the correct length, but it is an array of tuples and no longer a record array. Am I missing a step?
>

Well I think all record arrays are essentially arrays of tuples (or structs) if I understand correctly. You can try passing usemask=False and asrecarray=True to stack_arrays. I had to play with these when using the recfunctions at times. I'm not sure the defaults are consistent, though I didn't really check.

Incidentally, is there a reason that the recfunctions aren't imported into the np.lib namespace?

Skipper
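For reference, the two access styles side by side -- a sketch assuming John's D and D2 from above:

    from numpy.lib import recfunctions as rf
    C = rf.stack_arrays((D, D2), usemask=False, asrecarray=True)
    C['datetime']   # field access works on any structured array
    C.datetime      # attribute access requires the recarray flavor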
From pgmdevlist at gmail.com Mon Dec 7 19:47:16 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Mon, 7 Dec 2009 19:47:16 -0500
Subject: [Numpy-discussion] recfunctions help with concatenating (vstack, hstack, etc.)
In-Reply-To:
References: <26686543.post@talk.nabble.com>
Message-ID: <24B10E4D-379B-4D30-9E1E-C10004D01BB6@gmail.com>

On Dec 7, 2009, at 7:12 PM, Skipper Seabold wrote:
> On Mon, Dec 7, 2009 at 7:00 PM, John [H2O] wrote:
>>
>> Hello (Pierre?),
>>
>> I'm trying to work more with structured arrays, which at times seems great, and at others (due to my lack of familiarity) very frustrating.

Like Robert said, you still have a structured array (named fields), but not a recarray (where fields can be accessed as attributes). Use a simple .view(np.recarray) or .view(np.ma.MaskedRecords) to get the field-as-attributes behavior, if you really need it.

>
> You can try passing usemask=False and asrecarray=True to stack_arrays. I had to play with these when using the recfunctions at times. I'm not sure the defaults are consistent, though I didn't really check.
>
> Incidentally, is there a reason that the recfunctions aren't imported into the np.lib namespace?

Because I'm still unsure whether they're completely baked. You mentioned that the defaults may not be consistent, that's one of the aspects we need to investigate further, along with testing, proper docs...

From jsseabold at gmail.com Mon Dec 7 20:39:46 2009
From: jsseabold at gmail.com (Skipper Seabold)
Date: Mon, 7 Dec 2009 20:39:46 -0500
Subject: [Numpy-discussion] recfunctions help with concatenating (vstack, hstack, etc.)
In-Reply-To: <24B10E4D-379B-4D30-9E1E-C10004D01BB6@gmail.com>
References: <26686543.post@talk.nabble.com> <24B10E4D-379B-4D30-9E1E-C10004D01BB6@gmail.com>
Message-ID:

On Mon, Dec 7, 2009 at 7:47 PM, Pierre GM wrote:
> On Dec 7, 2009, at 7:12 PM, Skipper Seabold wrote:
>> On Mon, Dec 7, 2009 at 7:00 PM, John [H2O] wrote:
>>>
>>> Hello (Pierre?),
>>>
>>> I'm trying to work more with structured arrays, which at times seems great, and at others (due to my lack of familiarity) very frustrating.
>
> Like Robert said, you still have a structured array (named fields), but not a recarray (where fields can be accessed as attributes). Use a simple .view(np.recarray) or .view(np.ma.MaskedRecords) to get the field-as-attributes behavior, if you really need it.
>>
>> You can try passing usemask=False and asrecarray=True to stack_arrays. I had to play with these when using the recfunctions at times. I'm not sure the defaults are consistent, though I didn't really check.
>>
>> Incidentally, is there a reason that the recfunctions aren't imported into the np.lib namespace?
>
> Because I'm still unsure whether they're completely baked. You mentioned that the defaults may not be consistent, that's one of the aspects we need to investigate further, along with testing, proper docs...

Noted. FWIW, I find the docs to be pretty good (maybe some of the examples could be easier to read?), at least for the functions I make a lot of use of, which is most of them now that I look again. I haven't had to read the source once, which is nice ;)
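The .view trick Pierre mentions, as a tiny self-contained sketch (toy dtype, not John's data):

    import numpy as np
    a = np.zeros(3, dtype=[('lon', float), ('lat', float)])
    r = a.view(np.recarray)   # no copy: same buffer, field-as-attribute access
    print r.lon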
From pfeldman at verizon.net Mon Dec 7 23:13:28 2009
From: pfeldman at verizon.net (Dr. Phillip M. Feldman)
Date: Mon, 7 Dec 2009 20:13:28 -0800 (PST)
Subject: [Numpy-discussion] Assigning complex values to a real array
In-Reply-To: <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com>
References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070130j5672a114m159a7b922931b205@mail.gmail.com> <3d375d730903070135u28fb4085x86d0d6139d2dd28f@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com>
Message-ID: <26688453.post@talk.nabble.com>

Robert Kern-2 wrote:
>
> Downcasting data is a necessary operation sometimes. We explicitly
> made a choice a long time ago to allow this.
>
> Robert Kern
>

This might be the time to recognize that that was a bad choice and reverse it. It is not clear to me why downcasting from complex to real sometimes raises an exception, and sometimes doesn't. Here are two examples where the behavior of NumPy is inexplicably different:

Example #1:

IPython 0.10 [on Py 2.5.4]
[~]|1> z= zeros(3)
[~]|2> z[0]= 1+1J

TypeError: can't convert complex to float; use abs(z)

Example #2:

### START OF CODE ###
from numpy import *
q = ones(2,dtype=complex)*(1 + 1J)
r = zeros(2,dtype=float)
r[:] = q
print 'q = ',q
print 'r = ',r
### END OF CODE ###

[~]|9> run err
q =  [ 1.+1.j  1.+1.j]
r =  [ 1.  1.]
[~]|10>

In example #1, when we attempt to assign a complex value to a single element of a float array, an exception is triggered. In example #2, when we assign the entire array (actually, a slice), the imaginary part is silently discarded and we get a wrong result.

At a minimum, this inconsistency needs to be cleared up. My preference would be that the programmer should have to explicitly downcast from complex to float, and that if he/she fails to do this, that an exception be triggered.

Dr. Phillip M. Feldman

--
View this message in context: http://old.nabble.com/Assigning-complex-values-to-a-real-array-tp22383353p26688453.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.

From david at ar.media.kyoto-u.ac.jp Mon Dec 7 23:15:30 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 08 Dec 2009 13:15:30 +0900
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To:
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <5b8d13220912070931x3e1f20b3tf56321687dc58086@mail.gmail.com>
Message-ID: <4B1DD2E2.8090007@ar.media.kyoto-u.ac.jp>

Charles R Harris wrote:
> On Mon, Dec 7, 2009 at 10:31 AM, David Cournapeau wrote:
>> On Tue, Dec 8, 2009 at 1:48 AM, Charles R Harris wrote:
>>> On Mon, Dec 7, 2009 at 8:24 AM, David Cournapeau wrote:
>>>> Hi,
>>>>
>>>> There are a few issues which have been found on numpy 1.4.0, which worry me:
>>>>
>>>> # 1317: segfaults for integer division overflow
>>>> # 1318: all FPU exceptions ignored by default
>>>>
>>>> #1318 worries me the most: I think it is a pretty serious regression, since things like this go unnoticed:
>>>>
>>>> x = np.array([1, 2, 3, 4]) / 0 # x is an array of 0, no warning printed
>>>>
>>> Hasn't that always been the case? Unless we have a way to raise exceptions from ufuncs I don't know what else we can do.
>>
>> No, it is a consequence of errors being set to ignored in numpy.ma:
>>
>> http://projects.scipy.org/gitweb?p=numpy;a=blob;f=numpy/ma/core.py;h=f28a5738efa6fb6c4cbf0b3479243b0d7286ae32;hb=master#l107
>>
>> So the fix is easy - but then it shows many (> 500) invalid values, etc... related to wrong fpu handling (most of them are limited to the new polynomial code, though).
>
> Umm, no. Just four, and easily fixed as I explicitly relied on the behaviour.

Yup, my last sentence was not really clear: I just meant that most of the warnings were printed during polynomial tests.

> After the fix and seterr(all='raise'):
>
> FAILED (KNOWNFAIL=4, SKIP=11, errors=47, failures=1)

47 errors is quite a bit. So what about the following for 1.4.0:
 - we just remove the seterr set to ignore in numpy.ma
 - we deal with the special case of MIN_INT/-1 to avoid the SIGFPE crash
 - we deal with the spurious warnings during tests in 1.5.0 ?

David

From david at ar.media.kyoto-u.ac.jp Mon Dec 7 23:22:48 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 08 Dec 2009 13:22:48 +0900
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To: <1cd32cbb0912071041n1a8c7f05r8960f34854bc11f6@mail.gmail.com>
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <5b8d13220912070931x3e1f20b3tf56321687dc58086@mail.gmail.com> <1cd32cbb0912071041n1a8c7f05r8960f34854bc11f6@mail.gmail.com>
Message-ID: <4B1DD498.9010709@ar.media.kyoto-u.ac.jp>

josef.pktd at gmail.com wrote:
> On Mon, Dec 7, 2009 at 1:24 PM, Charles R Harris wrote:
>> On Mon, Dec 7, 2009 at 11:16 AM, Charles R Harris wrote:
>>> On Mon, Dec 7, 2009 at 10:31 AM, David Cournapeau wrote:
>>>> So the fix is easy - but then it shows many (> 500) invalid values, etc... related to wrong fpu handling (most of them are limited to the new polynomial code, though).
>>>>
>>> Umm, no. Just four, and easily fixed as I explicitly relied on the behaviour. After the fix and seterr(all='raise'):
>>>
>> To be specific, it was a true divide and I relied on nan being returned. I expect many of the remaining failures are of the same sort.
>>
> if seterr raise also raises when the calculations are done with floating point, then it's not really useful.

I think it is out of the question to set the default to raise - at least that's not what I suggest. The default up to 1.0.4 was warning, and it was unintentionally set to ignored starting at 1.1.0.

> I think this is more a problem with the silent casting of nan and inf to 0 for integers (which I have disliked for a long time), not a problem with floating point operations.
>

I think it depends on the use-cases: I can see why in stats it may be useful to set them to ignore, but for linear algebra, for example, nan is almost always a bug in the code somewhere.

Note also that the default can be overridden temporarily - we should actually have a context manager so that it becomes easy to use it safely if python >= 2.6 is an option.

cheers,

David
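To make the regression mechanism explicit, a toy illustration of what a module-level seterr call does (hypothetical module, not the actual numpy.ma source):

    # badmodule.py (hypothetical)
    import numpy as np
    np.seterr(all='ignore')   # executed at import time: global side effect

    # elsewhere:
    # import badmodule        # merely importing silences FPU warnings everywhere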
From josef.pktd at gmail.com Mon Dec 7 23:59:33 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 7 Dec 2009 23:59:33 -0500
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To: <4B1DD498.9010709@ar.media.kyoto-u.ac.jp>
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <5b8d13220912070931x3e1f20b3tf56321687dc58086@mail.gmail.com> <1cd32cbb0912071041n1a8c7f05r8960f34854bc11f6@mail.gmail.com> <4B1DD498.9010709@ar.media.kyoto-u.ac.jp>
Message-ID: <1cd32cbb0912072059u6431c49by71d8d4f99c71d4a4@mail.gmail.com>

On Mon, Dec 7, 2009 at 11:22 PM, David Cournapeau wrote:
> josef.pktd at gmail.com wrote:
>> if seterr raise also raises when the calculations are done with floating point, then it's not really useful.
>
> I think it is out of the question to set the default to raise - at least that's not what I suggest. The default up to 1.0.4 was warning, and it was unintentionally set to ignored starting at 1.1.0.

warning is no problem, but I haven't figured out what the pattern is for repeated warnings.

When I do the same zero division several times, I only get the warning the first time, after that no warning is printed anymore. I don't know about the scope of the non-printing. If it is globally, so that only the first warning is printed, then it won't really help in detecting errors. ?

Josef

>>> import numpy as np
>>> np.seterr('warn')
{'over': 'ignore', 'divide': 'ignore', 'invalid': 'ignore', 'under': 'ignore'}
>>> np.arange(3)/0
__main__:1: RuntimeWarning: divide by zero encountered in divide
array([0, 0, 0])
>>> np.arange(3)/0
array([0, 0, 0])

>> I think this is more a problem with the silent casting of nan and inf to 0 for integers (which I have disliked for a long time), not a problem with floating point operations.
>>
> I think it depends on the use-cases: I can see why in stats it may be useful to set them to ignore, but for linear algebra, for example, nan is almost always a bug in the code somewhere.
>
> Note also that the default can be overridden temporarily - we should actually have a context manager so that it becomes easy to use it safely if python >= 2.6 is an option.
>
> cheers,
>
> David
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
From david at ar.media.kyoto-u.ac.jp Mon Dec 7 23:54:39 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 08 Dec 2009 13:54:39 +0900
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To: <1cd32cbb0912072059u6431c49by71d8d4f99c71d4a4@mail.gmail.com>
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <5b8d13220912070931x3e1f20b3tf56321687dc58086@mail.gmail.com> <1cd32cbb0912071041n1a8c7f05r8960f34854bc11f6@mail.gmail.com> <4B1DD498.9010709@ar.media.kyoto-u.ac.jp> <1cd32cbb0912072059u6431c49by71d8d4f99c71d4a4@mail.gmail.com>
Message-ID: <4B1DDC0F.9080003@ar.media.kyoto-u.ac.jp>

josef.pktd at gmail.com wrote:
>
> warning is no problem, but I haven't figured out what the pattern is for repeated warnings.
>

By default, warnings are only 'raised' once. You need to use the standard warnings module to control the behavior (always warn, raise, etc...). Look for the simplefilter function in the Python documentation for the details.

Now, it is not that simple because the divide by zero is not a python warning, but a simple printf on stderr if I understand correctly what Pauli said.
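For reference, the warnings-module control David points to (standard library, available on the Python versions discussed here):

    import warnings
    warnings.simplefilter('always')    # report every occurrence, not just the first
    # warnings.simplefilter('error')   # or escalate warnings to exceptions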
> When I do the same zero division several times, I only get the warning the first time, after that no warning is printed anymore. I don't know about the scope of the non-printing. If it is globally, so that only the first warning is printed, then it won't really help in detecting errors. ?

Both FPU exception handling and warning handling can be set up locally - it is just a bit tricky to do because it should be considered as a resource to be freed, that is it should always be restored to the previous state no matter what. That's what context managers (with the with keyword) are for in python >= 2.6, and you should use a try/finally block otherwise.

cheers,

David

From josef.pktd at gmail.com Tue Dec 8 00:47:33 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 8 Dec 2009 00:47:33 -0500
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To: <4B1DDC0F.9080003@ar.media.kyoto-u.ac.jp>
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <5b8d13220912070931x3e1f20b3tf56321687dc58086@mail.gmail.com> <1cd32cbb0912071041n1a8c7f05r8960f34854bc11f6@mail.gmail.com> <4B1DD498.9010709@ar.media.kyoto-u.ac.jp> <1cd32cbb0912072059u6431c49by71d8d4f99c71d4a4@mail.gmail.com> <4B1DDC0F.9080003@ar.media.kyoto-u.ac.jp>
Message-ID: <1cd32cbb0912072147u2230765dlfb77a26a0e7a6b12@mail.gmail.com>

On Mon, Dec 7, 2009 at 11:54 PM, David Cournapeau wrote:
> josef.pktd at gmail.com wrote:
>>
>> warning is no problem, but I haven't figured out what the pattern is for repeated warnings.
>>
> By default, warnings are only 'raised' once. You need to use the standard warnings module to control the behavior (always warn, raise, etc...). Look for the simplefilter function in the Python documentation for the details.
>
> Now, it is not that simple because the divide by zero is not a python warning, but a simple printf on stderr if I understand correctly what Pauli said.
>
>> When I do the same zero division several times, I only get the warning the first time, after that no warning is printed anymore. I don't know about the scope of the non-printing. If it is globally, so that only the first warning is printed, then it won't really help in detecting errors. ?
>
> Both FPU exception handling and warning handling can be set up locally - it is just a bit tricky to do because it should be considered as a resource to be freed, that is it should always be restored to the previous state no matter what. That's what context managers (with the with keyword) are for in python >= 2.6, and you should use a try/finally block otherwise.

Thanks, sounds clear now. I guess we can expect some user questions if suddenly the number of warnings increases again.

contextlib was introduced in python 2.5

http://docs.scipy.org/numpy/docs/numpy-docs/user/misc.rst/ mentions that the default in numpy is warn. Maybe adding some basic information about using Python's warnings module to control the warnings could be added here to the docs (and in notes of seterr).

I will remember to reset it, if I ever need to change it temporarily

>>> seterrold = np.seterr()
>>> np.seterr(**seterrold)

Cheers,

Josef

> cheers,
>
> David
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
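The reset idiom josef mentions, wrapped in the try/finally pattern David advises -- a minimal sketch (the computation in the middle is a placeholder):

    import numpy as np
    old = np.seterr(all='raise')   # seterr returns the previous settings
    try:
        pass                       # ... code that should trap FPU problems ...
    finally:
        np.seterr(**old)           # always restored, even on error or Ctrl-C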
From david at ar.media.kyoto-u.ac.jp Tue Dec 8 01:44:38 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 08 Dec 2009 15:44:38 +0900
Subject: [Numpy-discussion] Py3k: including npy_config.h outside numpy.core
Message-ID: <4B1DF5D6.8090504@ar.media.kyoto-u.ac.jp>

Hi,

I have noticed that now some code outside numpy/core uses npy_config.h. As suggested by its location (numpy/core/src/private), it is specific to numpy/core, and should not be used outside. What's the rationale to use npy_config.h outside numpy/core ?

cheers,

David

From dwf at cs.toronto.edu Tue Dec 8 02:32:20 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Tue, 8 Dec 2009 02:32:20 -0500
Subject: [Numpy-discussion] Assigning complex values to a real array
In-Reply-To: <26688453.post@talk.nabble.com>
References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070130j5672a114m159a7b922931b205@mail.gmail.com> <3d375d730903070135u28fb4085x86d0d6139d2dd28f@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com> <26688453.post@talk.nabble.com>
Message-ID:

On 7-Dec-09, at 11:13 PM, Dr. Phillip M. Feldman wrote:

> Example #1:
> IPython 0.10 [on Py 2.5.4]
> [~]|1> z= zeros(3)
> [~]|2> z[0]= 1+1J
>
> TypeError: can't convert complex to float; use abs(z)

The problem is that you're using Python's built-in complex type, and it responds to type coercion differently than NumPy types do. Calling float() on a Python complex will raise the exception. Calling float() on (for example) a numpy.complex64 will not. Notice what happens here:

In [14]: z = zeros(3)

In [15]: z[0] = complex64(1+1j)

In [16]: z[0]
Out[16]: 1.0

> Example #2:
>
> ### START OF CODE ###
> from numpy import *
> q = ones(2,dtype=complex)*(1 + 1J)
> r = zeros(2,dtype=float)
> r[:] = q
> print 'q = ',q
> print 'r = ',r
> ### END OF CODE ###

Here, both operands are NumPy arrays. NumPy is in complete control of the situation, and it's well documented what it will do.

I do agree that the behaviour in example #1 is mildly inconsistent, but such is the way with NumPy vs. Python scalars. They are mostly transparently intermingled, except when they're not.

> At a minimum, this inconsistency needs to be cleared up. My preference would be that the programmer should have to explicitly downcast from complex to float, and that if he/she fails to do this, that an exception be triggered.

That would most likely break a *lot* of deployed code that depends on the implicit downcast behaviour. A less harmful solution (if a solution is warranted, which is for the Council of the Elders to decide) would be to treat the Python complex type as a special case, so that the .real attribute is accessed instead of trying to cast to float.

David
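The explicit spelling is already available at the call site today -- a sketch reusing the arrays from the examples above:

    import numpy as np
    q = np.ones(2, dtype=complex) * (1 + 1j)
    r = np.zeros(2)
    r[:] = q.real   # explicit: discard the imaginary part on purpose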
From pgmdevlist at gmail.com Tue Dec 8 02:34:06 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Tue, 8 Dec 2009 02:34:06 -0500
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To: <4B1DD2E2.8090007@ar.media.kyoto-u.ac.jp>
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <5b8d13220912070931x3e1f20b3tf56321687dc58086@mail.gmail.com> <4B1DD2E2.8090007@ar.media.kyoto-u.ac.jp>
Message-ID: <6A3E7210-AC39-400A-9F03-F033A95D6F28@gmail.com>

On Dec 7, 2009, at 11:15 PM, David Cournapeau wrote:
> Charles R Harris wrote:
>> No, it is a consequence of errors being set to ignored in numpy.ma:

Oopsie...

A bit of background first;

In the first implementations of numpy.core.ma, the approach was to get rid of the data that could cause problems beforehand by replacing them with safe values. Turned out that finding these data is not always obvious (cf the problem of defining a domain when dealing with exponents that we discussed a while back on this list), and that all in all, it is faster to compute first and deal with the problems afterwards. Of course, the user doesn't need the warnings if something goes wrong, so disabling them globally looked like a way to go. I thought the disabling would be only for numpy.ma, though.

Anyhow, running numpy.ma.test() on 1.5.x, I get 36 warnings by getting rid of the np.seterr(all='ignore') in numpy.ma.core. I can go down to 2 by saving the seterr state before the computations in _Masked/DomainUnary/BinaryOperation and restoring it after the computation. I'm gonna try to find where the 2 missing warnings come from.

The 281 tests + 36 warnings take 4.087s to run, the 2 warning version 5.95s (but I didn't try to go too much into timing details...)

So, what do you want me to do guys ? Commit the fixes to the trunk ? Backporting them to 1.4.x ?

From david at ar.media.kyoto-u.ac.jp Tue Dec 8 02:31:21 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 08 Dec 2009 16:31:21 +0900
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To: <6A3E7210-AC39-400A-9F03-F033A95D6F28@gmail.com>
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <5b8d13220912070931x3e1f20b3tf56321687dc58086@mail.gmail.com> <4B1DD2E2.8090007@ar.media.kyoto-u.ac.jp> <6A3E7210-AC39-400A-9F03-F033A95D6F28@gmail.com>
Message-ID: <4B1E00C9.9080502@ar.media.kyoto-u.ac.jp>

Pierre GM wrote:
> A bit of background first;
> In the first implementations of numpy.core.ma, the approach was to get rid of the data that could cause problems beforehand by replacing them with safe values. Turned out that finding these data is not always obvious (cf the problem of defining a domain when dealing with exponents that we discussed a while back on this list), and that all in all, it is faster to compute first and deal with the problems afterwards. Of course, the user doesn't need the warnings if something goes wrong, so disabling them globally looked like a way to go. I thought the disabling would be only for numpy.ma, though.

I don't think there is an easy (or any) way to disable this at module level. Please be sure to always do so in a try/finally (like you would handle a file object to guarantee it is always closed, for example), because otherwise, test failures and SIGINT (ctrl+C) will pollute the user environment.

>
> Anyhow, running numpy.ma.test() on 1.5.x, I get 36 warnings by getting rid of the np.seterr(all='ignore') in numpy.ma.core. I can go down to 2 by saving the seterr state before the computations in _Masked/DomainUnary/BinaryOperation and restoring it after the computation. I'm gonna try to find where the 2 missing warnings come from.
> The 281 tests + 36 warnings take 4.087s to run, the 2 warning version 5.95s (but I didn't try to go too much into timing details...)

Setting/unsetting the FPU state definitely has a cost. I don't know how significant it would be for your precise case, though: is the cost because of setting/unsetting the state in the test themselves or ? We may be able to improve the situation later on once we have better numbers.

A proper solution to this FPU exception may require hard work (because of the inherent asynchronous nature of signals, because signals behave very differently on different platforms, and because I don't think we can afford spending too many cycles on it).

>
> So, what do you want me to do guys ? Commit the fixes to the trunk ? Backporting them to 1.4.x ?

I have already committed the removal of the global np.seterr in the trunk. I feel like backporting this one to 1.4.x is a good idea (because it could be considered as a regression), but maybe someone has a strong case against it.

cheers,

David

From cournape at gmail.com Tue Dec 8 03:04:18 2009
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 8 Dec 2009 17:04:18 +0900
Subject: [Numpy-discussion] Zero Division not handled correctly?
In-Reply-To:
References:
Message-ID: <5b8d13220912080004sfa8e16fw9da91ce6ca2387f6@mail.gmail.com>

On Mon, Dec 7, 2009 at 6:16 AM, Skipper Seabold wrote:
> I believe this is known, but I am surprised that division by "integer" zero results in the following.
>
> In [1]: import numpy as np
>
> In [2]: np.__version__
> Out[2]: '1.4.0.dev7539'
>
> In [3]: 0**-1 # or 0**-1/-1
> ---------------------------------------------------------------------------
> ZeroDivisionError                         Traceback (most recent call last)
>
> /home/skipper/school/Data/ascii/numpy/<ipython console> in <module>()
>
> ZeroDivisionError: 0.0 cannot be raised to a negative power
>
> In [4]: np.array([0.])**-1
> Out[4]: array([ Inf])
>
> In [5]: np.array([0.])**-1/-1
> Out[5]: array([-Inf])
>
> In [6]: np.array([0])**-1.
> Out[6]: array([ Inf])
>
> In [7]: np.array([0])**-1./-1
> Out[7]: array([-Inf])
>
> In [8]: np.array([0])**-1
> Out[8]: array([-9223372036854775808])
>
> In [9]: np.array([0])**-1/-1
> Floating point exception
>
> This last command crashes the interpreter.
>
> There have been some threads about similar issues over the years, but I'm wondering if this is still intended/known or if this should raise an exception or return inf or -inf.
> I expected a -inf, though maybe this is incorrect on my part.

The crash is fixed, but you get some potentially spurious warning, still. A thorough solution will require some more work.

David

From gael.varoquaux at normalesup.org Tue Dec 8 03:25:22 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Tue, 8 Dec 2009 09:25:22 +0100
Subject: [Numpy-discussion] Build error
Message-ID: <20091208082522.GC1111@phare.normalesup.org>

I just did an SVN up, and I am getting a build error:

$ python setup.py build_ext --inplace
...
building extension "numpy.linalg.lapack_lite" sources
adding 'numpy/linalg/lapack_litemodule.c' to sources.
adding 'numpy/linalg/python_xerbla.c' to sources.
building extension "numpy.random.mtrand" sources
error: _numpyconfig.h not found in numpy include dirs ['numpy/core/include', 'numpy/core/include/numpy']

It's probably something minor, and I will figure it out, but I wanted to make sure the devs were aware of the potential hitch.

Gaël

From gael.varoquaux at normalesup.org Tue Dec 8 03:26:58 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Tue, 8 Dec 2009 09:26:58 +0100
Subject: [Numpy-discussion] Build error
In-Reply-To: <20091208082522.GC1111@phare.normalesup.org>
References: <20091208082522.GC1111@phare.normalesup.org>
Message-ID: <20091208082658.GD1111@phare.normalesup.org>

On Tue, Dec 08, 2009 at 09:25:22AM +0100, Gael Varoquaux wrote:
> I just did an SVN up, and I am getting a build error:

Please ignore this. Brain fart. I should refrain from doing anything before coffee.

Gaël

From pgmdevlist at gmail.com Tue Dec 8 03:33:31 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Tue, 8 Dec 2009 03:33:31 -0500
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To: <4B1E00C9.9080502@ar.media.kyoto-u.ac.jp>
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <5b8d13220912070931x3e1f20b3tf56321687dc58086@mail.gmail.com> <4B1DD2E2.8090007@ar.media.kyoto-u.ac.jp> <6A3E7210-AC39-400A-9F03-F033A95D6F28@gmail.com> <4B1E00C9.9080502@ar.media.kyoto-u.ac.jp>
Message-ID: <2CC3201F-C340-495F-8B5A-0656A32FF2B7@gmail.com>

On Dec 8, 2009, at 2:31 AM, David Cournapeau wrote:
> Pierre GM wrote:
>> A bit of background first;
>> In the first implementations of numpy.core.ma, the approach was to get rid of the data that could cause problems beforehand by replacing them with safe values. Turned out that finding these data is not always obvious (cf the problem of defining a domain when dealing with exponents that we discussed a while back on this list), and that all in all, it is faster to compute first and deal with the problems afterwards. Of course, the user doesn't need the warnings if something goes wrong, so disabling them globally looked like a way to go. I thought the disabling would be only for numpy.ma, though.
>
> I don't think there is an easy (or any) way to disable this at module level. Please be sure to always do so in a try/finally (like you would handle a file object to guarantee it is always closed, for example), because otherwise, test failures and SIGINT (ctrl+C) will pollute the user environment.

Will try. We're still supporting 2.3, right ?

> Setting/unsetting the FPU state definitely has a cost. I don't know how significant it would be for your precise case, though: is the cost because of setting/unsetting the state in the test themselves or ? We may be able to improve the situation later on once we have better numbers.

I can't tell you, unfortunately...
And I'm afraid I won't have too much time to write some benchmarks.

> I have already committed the removal of the global np.seterr in the trunk. I feel like backporting this one to 1.4.x is a good idea (because it could be considered as a regression), but maybe someone has a strong case against it.

Well, there's another cosmetic issue that nags me: when using np functions instead of their ma equivalents on masked arrays, a warning might be raised. I could probably find a workaround with __array_prepare__, but it may take some time (and it probably won't be very pretty). So couldn't we keep things the way they are for 1.4.0, and get the fixes for 1.5 only ?

From d.l.goldsmith at gmail.com Tue Dec 8 03:51:11 2009
From: d.l.goldsmith at gmail.com (David Goldsmith)
Date: Tue, 8 Dec 2009 00:51:11 -0800
Subject: [Numpy-discussion] Zero Division not handled correctly?
In-Reply-To: <5b8d13220912080004sfa8e16fw9da91ce6ca2387f6@mail.gmail.com>
References: <5b8d13220912080004sfa8e16fw9da91ce6ca2387f6@mail.gmail.com>
Message-ID: <45d1ab480912080051i19c3ab63xa6f3892881aa8fad@mail.gmail.com>

Thanks, Dave!

DG

On Tue, Dec 8, 2009 at 12:04 AM, David Cournapeau wrote:

> On Mon, Dec 7, 2009 at 6:16 AM, Skipper Seabold wrote:
> > I believe this is known, but I am surprised that division by "integer" zero results in the following.
> >
> > In [1]: import numpy as np
> >
> > In [2]: np.__version__
> > Out[2]: '1.4.0.dev7539'
> >
> > In [3]: 0**-1 # or 0**-1/-1
> > ---------------------------------------------------------------------------
> > ZeroDivisionError                         Traceback (most recent call last)
> >
> > /home/skipper/school/Data/ascii/numpy/<ipython console> in <module>()
> >
> > ZeroDivisionError: 0.0 cannot be raised to a negative power
> >
> > In [4]: np.array([0.])**-1
> > Out[4]: array([ Inf])
> >
> > In [5]: np.array([0.])**-1/-1
> > Out[5]: array([-Inf])
> >
> > In [6]: np.array([0])**-1.
> > Out[6]: array([ Inf])
> >
> > In [7]: np.array([0])**-1./-1
> > Out[7]: array([-Inf])
> >
> > In [8]: np.array([0])**-1
> > Out[8]: array([-9223372036854775808])
> >
> > In [9]: np.array([0])**-1/-1
> > Floating point exception
> >
> > This last command crashes the interpreter.
> >
> > There have been some threads about similar issues over the years, but I'm wondering if this is still intended/known or if this should raise an exception or return inf or -inf. I expected a -inf, though maybe this is incorrect on my part.
>
> The crash is fixed, but you get some potentially spurious warning, still. A thorough solution will require some more work.
>
> David
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> > It seems to me the motivation to change the 'S' dtype to something else > is to make things clearer with respect to the new conventions of Python > 3. (Where str -> bytes, and unicode -> str). In that sense, I'm not > sure there's any advantage going from "S" to "y" (particularly without > doing "U" to "S"), whereas there's a strong backward-compatibility > advantage to keep it as "S", though admittedly it's confusing to someone > who doesn't know the pre Python 3 history. I think a better plan is to deprecate "U" instead of "S". Also, I'm not completely convinced that staying with "S" == bytes has a strong backward-compatibility advantage: array(['foo']).dtype == 'U' and this will break code in several places. Also, for instance, array(['foo', 'bar'], dtype='S3') will result to encoding errors. We probably don't want to start implicitly casting Unicode to bytes, since Py3 does not do that either. The only places where the dtype characters are used, AFAIK, is in repr and in the dtype kwarg -- they are not used in pickles etc. One can actually argue that changing 'U' to 'S' is more backward-compatible: array(['foo', 'bar'], dtype='S3') would still be valid code. Of course, the semantics change, but this anyway occurs also on the Python side when moving to Py3. The simplest way to get more insight would be to try to convert some string-using Py2 code to work on Py3. > I'm not sure your suggestion of making 'B' and 'B123' both work seems > like a good one because of the semantic differences between numbers and > strings. Would np.array(['a', 'b']) have a repr of [97, 98] or ['a', > 'b']? Sorting them would also not necessarily do the right thing. I think the point would be that 'B' and 'B1' would be treated as completely separate dtypes with different typenums -- they'd look similar only in the dtype character API (which is not so large) but not internally. np.array([b'a', b'b']).dtype would be 'B1'. Might be a bit confusing, though. Pauli From faltet at pytables.org Tue Dec 8 06:03:41 2009 From: faltet at pytables.org (Francesc Alted) Date: Tue, 8 Dec 2009 12:03:41 +0100 Subject: [Numpy-discussion] Py3 merge In-Reply-To: <1260199970.2704.84.camel@talisman> References: <1260060065.5576.80.camel@idol> <4B1D162C.5020607@stsci.edu> <1260199970.2704.84.camel@talisman> Message-ID: <200912081203.41342.faltet@pytables.org> A Monday 07 December 2009 16:32:50 Pauli Virtanen escrigu?: > ma, 2009-12-07 kello 09:50 -0500, Michael Droettboom kirjoitti: > > Pauli Virtanen wrote: > > [clip] > > > > The character 'B' is already by unsigned bytes -- I wonder if it's easy > > > to support 'B123' and plain 'B' at the same time, or whether we have to > > > pick a different letter for "byte strings". 'y' would be free... > > > > It seems to me the motivation to change the 'S' dtype to something else > > is to make things clearer with respect to the new conventions of Python > > 3. (Where str -> bytes, and unicode -> str). In that sense, I'm not > > sure there's any advantage going from "S" to "y" (particularly without > > doing "U" to "S"), whereas there's a strong backward-compatibility > > advantage to keep it as "S", though admittedly it's confusing to someone > > who doesn't know the pre Python 3 history. > > I think a better plan is to deprecate "U" instead of "S". > > Also, I'm not completely convinced that staying with "S" == bytes has a > strong backward-compatibility advantage: > > array(['foo']).dtype == 'U' > > and this will break code in several places. 
That's true, but at least this can be attributed to a poor programming practice. The same happens with: array([1]).dtype == 'int32' # in 32-bit systems array([1]).dtype == 'int64' # in 64-bit systems and my impression is that int32/int64 duality for int default would hit much more NumPy people than the "U"/"S" for string defaults. > Also, for instance, > > array(['foo', 'bar'], dtype='S3') > > will result to encoding errors. I don't think so. All existing code using the above idiom is using plain 7- bit ascii character set with almost all certainty, so we should not expect encoding errors here. > We probably don't want to start > implicitly casting Unicode to bytes, since Py3 does not do that either. I agree. > The only places where the dtype characters are used, AFAIK, is in repr > and in the dtype kwarg -- they are not used in pickles etc. > > One can actually argue that changing 'U' to 'S' is more > backward-compatible: > > array(['foo', 'bar'], dtype='S3') > > would still be valid code. Of course, the semantics change, but this > anyway occurs also on the Python side when moving to Py3. Mmh, as more I see this, the more I think that we can safely keep 'S' for bytes and 'U' for unicode. The only glitch would be: array(['foo']).dtype == 'U' but again, I don't think this is going to break a lot of code. > > I'm not sure your suggestion of making 'B' and 'B123' both work seems > > like a good one because of the semantic differences between numbers and > > strings. Would np.array(['a', 'b']) have a repr of [97, 98] or ['a', > > 'b']? Sorting them would also not necessarily do the right thing. > > I think the point would be that 'B' and 'B1' would be treated as > completely separate dtypes with different typenums -- they'd look > similar only in the dtype character API (which is not so large) but not > internally. np.array([b'a', b'b']).dtype would be 'B1'. Might be a bit > confusing, though. Yeah. Making 'B' and 'B1' so different types sounds very confusing, IMHO. -- Francesc Alted From cournape at gmail.com Tue Dec 8 06:13:26 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 8 Dec 2009 20:13:26 +0900 Subject: [Numpy-discussion] Py3 merge In-Reply-To: <200912081203.41342.faltet@pytables.org> References: <1260060065.5576.80.camel@idol> <4B1D162C.5020607@stsci.edu> <1260199970.2704.84.camel@talisman> <200912081203.41342.faltet@pytables.org> Message-ID: <5b8d13220912080313o184ee3dbs47acc1782b6e99ac@mail.gmail.com> On Tue, Dec 8, 2009 at 8:03 PM, Francesc Alted wrote: > > That's true, but at least this can be attributed to a poor programming > practice. ?The same happens with: > > array([1]).dtype == 'int32' ?# in 32-bit systems > array([1]).dtype == 'int64' ?# in 64-bit systems > > and my impression is that int32/int64 duality for int default would hit much > more NumPy people than the "U"/"S" for string defaults. I have not followed this discussion much so far, but I was wondering whether it would make sense to write our own fixer for 2to3 to handle some of those issues ? Encoding issues cannot be fixed in 2to3, but things like changing dtype name should be doable, no ? I have not looked at the 2to3 code, so I don't know how much work this would be. cheers, David From david at ar.media.kyoto-u.ac.jp Tue Dec 8 05:52:34 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 08 Dec 2009 19:52:34 +0900 Subject: [Numpy-discussion] Release blockers for 1.4.0 ? 
In-Reply-To: <2CC3201F-C340-495F-8B5A-0656A32FF2B7@gmail.com> References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <5b8d13220912070931x3e1f20b3tf56321687dc58086@mail.gmail.com> <4B1DD2E2.8090007@ar.media.kyoto-u.ac.jp> <6A3E7210-AC39-400A-9F03-F033A95D6F28@gmail.com> <4B1E00C9.9080502@ar.media.kyoto-u.ac.jp> <2CC3201F-C340-495F-8B5A-0656A32FF2B7@gmail.com> Message-ID: <4B1E2FF2.4050904@ar.media.kyoto-u.ac.jp> Pierre GM wrote: > > Will try. We're still supporting 2.3, right ? > We stopped supporting 2.3 starting at 1.3 I think. We require 2.4, cheers, David From rmay31 at gmail.com Tue Dec 8 09:23:24 2009 From: rmay31 at gmail.com (Ryan May) Date: Tue, 8 Dec 2009 08:23:24 -0600 Subject: [Numpy-discussion] Assigning complex values to a real array In-Reply-To: References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070130j5672a114m159a7b922931b205@mail.gmail.com> <3d375d730903070135u28fb4085x86d0d6139d2dd28f@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com> <26688453.post@talk.nabble.com> Message-ID: >> At a minimum, this inconsistency needs to be cleared up. ?My >> preference >> would be that the programmer should have to explicitly downcast from >> complex to float, and that if he/she fails to do this, that an >> exception be >> triggered. > > That would most likely break a *lot* of deployed code that depends on > the implicit downcast behaviour. A less harmful solution (if a > solution is warranted, which is for the Council of the Elders to > decide) would be to treat the Python complex type as a special case, > so that the .real attribute is accessed instead of trying to cast to > float. Except that the exception raised on downcast is the behavior we really want. We don't need python complex types introducing subtle bugs as well. I understand why we have the silent downcast from complex to float, but I consider it a wart, not a feature. I've lost hours tracking down bugs where I'm putting complex data from some routine into a new array (without specifying a dtype) ends up with the complex downcast silently to float64. The only reason you even notice it is because at the end you have incorrect answers. I know to look for it now, but for inexperienced users, it's a pain. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From charlesr.harris at gmail.com Tue Dec 8 09:36:02 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 8 Dec 2009 07:36:02 -0700 Subject: [Numpy-discussion] Release blockers for 1.4.0 ? In-Reply-To: <4B1E00C9.9080502@ar.media.kyoto-u.ac.jp> References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <5b8d13220912070931x3e1f20b3tf56321687dc58086@mail.gmail.com> <4B1DD2E2.8090007@ar.media.kyoto-u.ac.jp> <6A3E7210-AC39-400A-9F03-F033A95D6F28@gmail.com> <4B1E00C9.9080502@ar.media.kyoto-u.ac.jp> Message-ID: On Tue, Dec 8, 2009 at 12:31 AM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Pierre GM wrote: > > A bit of background first; > > In the first implementations of numpy.core.ma, the approach was to get > rid of the data that could cause problem beforehand by replacing them with > safe values. Turned out that finding these data is not always obvious (cf > the problem of defining a domain when dealing with exponents that we > discussed a while back on this list), and that all in all, it is faster to > compute first and deal with the problems afterwards. 
Of course, the user > doesn't need the warnings if something goes wrong, so disabling them > globally looked like a way to go. I thought the disabling would be only for > numpy.ma, though > > I don't think there is an easy (or any) way to disable this at module > level. Please be sure to always do so in a try/finally (like you would > handle a file object to guarantee it is always closed, for example), > because otherwise, test failures and SIGINT (ctrl+C) will pollute the > user environment. > > > > > Anyhow, running numpy.ma.test() on 1.5.x, I get 36 warnings by getting > rid of the np.seterr(all='ignore') in numpy.ma.core. I can go down to 2 by > saving the seterr state before the computations in > _Masked/DomainUnary/BinaryOperation and restoring it after the computation. > I gonna try to find where the 2 missing warnings come from. > > The 281 tests + 36 warnings take 4.087s to run, the 2 warning version > 5.95s (but I didn't try to go to much into timing details...) > > Setting/unsetting the FPU state definitely has a cost. I don't know how > significant it would be for your precise case, though: is the cost > because of setting/unsetting the state in the test themselves or ? We > may be able to improve the situation later on once we have better numbers. > > A proper solution to this FPU exception may require hard work (because > of the inherent asynchronous nature of signals, because signals behave > very differently on different platforms, and because I don't think we > can afford spending too many cycles on it). > > > > > So, what do you want me to do guys ? Commit the fixes to the trunk ? > Backporting them to 1.4.x ? > > I have already committed the removal of the global np.seterr in the > trunk. I feel like backporting this one to 1.4.x is a good idea (because > it could be considered as a regression), but maybe someone has a strong > case against it. > > At this point it isn't a regression, it is a tradition. I think it best to leave the fix out of 1.4 and make the change for 1.5 because it is likely to break user code. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Tue Dec 8 10:10:51 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 08 Dec 2009 09:10:51 -0600 Subject: [Numpy-discussion] Release blockers for 1.4.0 ? In-Reply-To: References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <5b8d13220912070931x3e1f20b3tf56321687dc58086@mail.gmail.com> <4B1DD2E2.8090007@ar.media.kyoto-u.ac.jp> <6A3E7210-AC39-400A-9F03-F033A95D6F28@gmail.com> <4B1E00C9.9080502@ar.media.kyoto-u.ac.jp> Message-ID: <4B1E6C7B.7060807@gmail.com> On 12/08/2009 08:36 AM, Charles R Harris wrote: > > > On Tue, Dec 8, 2009 at 12:31 AM, David Cournapeau > > > wrote: > > Pierre GM wrote: > > A bit of background first; > > In the first implementations of numpy.core.ma > , the approach was to get rid of the data > that could cause problem beforehand by replacing them with safe > values. Turned out that finding these data is not always obvious > (cf the problem of defining a domain when dealing with exponents > that we discussed a while back on this list), and that all in all, > it is faster to compute first and deal with the problems > afterwards. Of course, the user doesn't need the warnings if > something goes wrong, so disabling them globally looked like a way > to go. I thought the disabling would be only for numpy.ma > , though > > I don't think there is an easy (or any) way to disable this at module > level. 
Please be sure to always do so in a try/finally (like you would > handle a file object to guarantee it is always closed, for example), > because otherwise, test failures and SIGINT (ctrl+C) will pollute the > user environment. > > > > > Anyhow, running numpy.ma.test() on 1.5.x, I get 36 warnings by > getting rid of the np.seterr(all='ignore') in numpy.ma.core. I can > go down to 2 by saving the seterr state before the computations in > _Masked/DomainUnary/BinaryOperation and restoring it after the > computation. I gonna try to find where the 2 missing warnings come > from. > > The 281 tests + 36 warnings take 4.087s to run, the 2 warning > version 5.95s (but I didn't try to go to much into timing details...) > > Setting/unsetting the FPU state definitely has a cost. I don't > know how > significant it would be for your precise case, though: is the cost > because of setting/unsetting the state in the test themselves or ? We > may be able to improve the situation later on once we have better > numbers. > > A proper solution to this FPU exception may require hard work (because > of the inherent asynchronous nature of signals, because signals behave > very differently on different platforms, and because I don't think we > can afford spending too many cycles on it). > > > > > So, what do you want me to do guys ? Commit the fixes to the > trunk ? Backporting them to 1.4.x ? > > I have already committed the removal of the global np.seterr in the > trunk. I feel like backporting this one to 1.4.x is a good idea > (because > it could be considered as a regression), but maybe someone has a > strong > case against it. > > > At this point it isn't a regression, it is a tradition. I think it > best to leave the fix out of 1.4 and make the change for 1.5 because > it is likely to break user code. > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > I understand the reason for the masked arrays behavior but changing the seterr default will be a problem until ma is changed. With Python 2.6 and numpy '1.4.0.dev7750', the current default works but fails when changing the seterr default. >>> a = np.ma.masked_array([-1, 0, 1, 2, 3], mask=[0, 0, 0, 0, 1]) >>> np.sqrt(a) masked_array(data = [-- 0.0 1.0 1.41421356237 --], mask = [ True False False False True], fill_value = 999999) >>> np.seterr(all='raise') {'over': 'ignore', 'divide': 'ignore', 'invalid': 'ignore', 'under': 'ignore'} >>> np.sqrt(a) Traceback (most recent call last): File "", line 1, in FloatingPointError: invalid value encountered in sqrt Furthermore, with np.seterr(all='raise') scipy.test() stops with an numpy error (output below) and scipy also appears to have some test errors. Should I submit a bug report for that error? Also, with seterr(all='raise') there are 25 test errors in numpy 1.3.0 and 25 test errors in numpy 1.4.0.dev7750 (only 10 are the same). So these tests would need to be resolved. So I agree with Chuck that we need to hold off the change in default until after 1.4 because there may be other code that has similar behavior to ma and scipy. Also we need both numpy and scipy to pass all the tests with this default. 
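A minimal sketch of the try/finally idiom under discussion (the wrapper below is hypothetical, not the actual numpy.ma code; np.seterr returns the previous settings dict, which keeps the restore to one line):

import numpy as np

def apply_with_ignored_errors(func, *args):
    # Hypothetical helper: silence FPU error reporting around one call.
    old_settings = np.seterr(all='ignore')   # np.seterr returns the old settings
    try:
        return func(*args)                   # may generate invalid/divide conditions
    finally:
        np.seterr(**old_settings)            # restored even on exceptions or Ctrl+C

On Python 2.5 and later the same thing can be spelled with the np.errstate context manager, but the explicit try/finally also works on the 2.4 interpreters numpy still supports.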
From cournape at gmail.com Tue Dec 8 10:21:59 2009
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 9 Dec 2009 00:21:59 +0900
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To: <4B1E6C7B.7060807@gmail.com>
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <5b8d13220912070931x3e1f20b3tf56321687dc58086@mail.gmail.com> <4B1DD2E2.8090007@ar.media.kyoto-u.ac.jp> <6A3E7210-AC39-400A-9F03-F033A95D6F28@gmail.com> <4B1E00C9.9080502@ar.media.kyoto-u.ac.jp> <4B1E6C7B.7060807@gmail.com>
Message-ID: <5b8d13220912080721q34b81382k2e53d1b605496c4@mail.gmail.com>

On Wed, Dec 9, 2009 at 12:10 AM, Bruce Southey wrote:
>
> I understand the reason for the masked arrays behavior but changing the seterr default will be a problem until ma is changed. With Python 2.6 and numpy '1.4.0.dev7750', the current default works but fails when changing the seterr default.

The default was warn, and not raise. Raise as a default does not make much sense.

David

From josef.pktd at gmail.com Tue Dec 8 10:49:05 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 8 Dec 2009 10:49:05 -0500
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To: <5b8d13220912080721q34b81382k2e53d1b605496c4@mail.gmail.com>
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <5b8d13220912070931x3e1f20b3tf56321687dc58086@mail.gmail.com> <4B1DD2E2.8090007@ar.media.kyoto-u.ac.jp> <6A3E7210-AC39-400A-9F03-F033A95D6F28@gmail.com> <4B1E00C9.9080502@ar.media.kyoto-u.ac.jp> <4B1E6C7B.7060807@gmail.com> <5b8d13220912080721q34b81382k2e53d1b605496c4@mail.gmail.com>
Message-ID: <1cd32cbb0912080749i68c4682fg756b2789cdd960a5@mail.gmail.com>

On Tue, Dec 8, 2009 at 10:21 AM, David Cournapeau wrote:
> On Wed, Dec 9, 2009 at 12:10 AM, Bruce Southey wrote:
> >
> > I understand the reason for the masked arrays behavior but changing the seterr default will be a problem until ma is changed. With Python 2.6 and numpy '1.4.0.dev7750', the current default works but fails when changing the seterr default.
>
> The default was warn, and not raise. Raise as a default does not make much sense.

There are no problems with 'warn': running scipy.stats.test() just prints a lot of underflow, zero-division, ... warnings, but ends with zero failures, zero errors. It only adds a bit of noise to the test output.

Josef

> David
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From pav at iki.fi Tue Dec 8 10:57:09 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Tue, 08 Dec 2009 17:57:09 +0200
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To: <5b8d13220912080721q34b81382k2e53d1b605496c4@mail.gmail.com>
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <5b8d13220912070931x3e1f20b3tf56321687dc58086@mail.gmail.com> <4B1DD2E2.8090007@ar.media.kyoto-u.ac.jp> <6A3E7210-AC39-400A-9F03-F033A95D6F28@gmail.com> <4B1E00C9.9080502@ar.media.kyoto-u.ac.jp> <4B1E6C7B.7060807@gmail.com> <5b8d13220912080721q34b81382k2e53d1b605496c4@mail.gmail.com>
Message-ID: <1260287829.18562.50.camel@talisman>

On Wed, 2009-12-09 at 00:21 +0900, David Cournapeau wrote:
> On Wed, Dec 9, 2009 at 12:10 AM, Bruce Southey wrote:
> >
> > I understand the reason for the masked arrays behavior but changing the seterr default will be a problem until ma is changed. With Python 2.6 and numpy '1.4.0.dev7750', the current default works but fails when changing the seterr default.
>
> The default was warn, and not raise. Raise as a default does not make much sense.

There are drawbacks with the all='print' default: It prints to C stdio stderr, which is unsanitary. Perhaps it should print to Python's sys.stderr instead?

Also, some code that worked OK before would now start to spit out extra warnings, which is not so nice.

-- 
Pauli Virtanen

From cournape at gmail.com Tue Dec 8 11:12:41 2009
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 9 Dec 2009 01:12:41 +0900
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To: <1260287829.18562.50.camel@talisman>
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <5b8d13220912070931x3e1f20b3tf56321687dc58086@mail.gmail.com> <4B1DD2E2.8090007@ar.media.kyoto-u.ac.jp> <6A3E7210-AC39-400A-9F03-F033A95D6F28@gmail.com> <4B1E00C9.9080502@ar.media.kyoto-u.ac.jp> <4B1E6C7B.7060807@gmail.com> <5b8d13220912080721q34b81382k2e53d1b605496c4@mail.gmail.com> <1260287829.18562.50.camel@talisman>
Message-ID: <5b8d13220912080812l75c150b6u6024513730ae013b@mail.gmail.com>

On Wed, Dec 9, 2009 at 12:57 AM, Pauli Virtanen wrote:
>
> Also, some code that worked OK before would now start to spit out extra warnings, which is not so nice.

Hm, there are several things mixed up in this discussion, I feel like we are not talking about exactly the same thing:
 - I am talking about setting the default back as before, which was warn and not print AFAIK. This means that things like np.log(0) will "raise" a proper warning, which can be filtered globally if wanted. Same for np.array([1]) / 0, etc... no stderr is involved AFAICS for simple examples
 - Because warns are involved, they will only appear once per exception type and origin

Once this is understood, if you still think it should not be the default for 1.4.0, I will not cherry-pick it for 1.4.x.

David

From robert.kern at gmail.com Tue Dec 8 11:21:46 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 8 Dec 2009 10:21:46 -0600
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To: <5b8d13220912080812l75c150b6u6024513730ae013b@mail.gmail.com>
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <4B1DD2E2.8090007@ar.media.kyoto-u.ac.jp> <6A3E7210-AC39-400A-9F03-F033A95D6F28@gmail.com> <4B1E00C9.9080502@ar.media.kyoto-u.ac.jp> <4B1E6C7B.7060807@gmail.com> <5b8d13220912080721q34b81382k2e53d1b605496c4@mail.gmail.com> <1260287829.18562.50.camel@talisman> <5b8d13220912080812l75c150b6u6024513730ae013b@mail.gmail.com>
Message-ID: <3d375d730912080821m6ea32dc2w62cfa1b58808f939@mail.gmail.com>

On Tue, Dec 8, 2009 at 10:12, David Cournapeau wrote:
> On Wed, Dec 9, 2009 at 12:57 AM, Pauli Virtanen wrote:
>
>> Also, some code that worked OK before would now start to spit out extra warnings, which is not so nice.
>
> Hm, there are several things mixed up in this discussion, I feel like we are not talking about exactly the same thing:
>  - I am talking about setting the default back as before, which was warn and not print AFAIK. This means that things like np.log(0) will "raise" a proper warning, which can be filtered globally if wanted. Same for np.array([1]) / 0, etc... no stderr is involved AFAICS for simple examples
>  - Because warns are involved, they will only appear once per exception type and origin

The default has always been "print", not "warn" (except for underflow, which was "ignore"). However, "warn" is better for the reasons you state (except for underflow, which should remain "ignore").

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From pav+sp at iki.fi Tue Dec 8 12:02:42 2009
From: pav+sp at iki.fi (Pauli Virtanen)
Date: Tue, 8 Dec 2009 17:02:42 +0000 (UTC)
Subject: [Numpy-discussion] Cython issues w/ 1.4.0
References: <20091206135358.GE897@phare.normalesup.org>
Message-ID:

Sun, 06 Dec 2009 14:53:58 +0100, Gael Varoquaux wrote:
> I have a lot of code that has stopped working with my latest SVN pull to numpy.
>
> * Some compiled code yields an error looking like (from memory):
>
>     "incorrect type 'numpy.ndarray'"

This, by the way, also affects the 1.4.x branch. Because of the datetime branch merge, a new field was added to ArrayDescr -- and this breaks previously compiled Cython modules. I guess this should be mentioned in the release notes.

We'll probably be doing it again in 1.5.0...

-- 
Pauli Virtanen

From dsdale24 at gmail.com Tue Dec 8 12:12:43 2009
From: dsdale24 at gmail.com (Darren Dale)
Date: Tue, 8 Dec 2009 12:12:43 -0500
Subject: [Numpy-discussion] Cython issues w/ 1.4.0
In-Reply-To:
References: <20091206135358.GE897@phare.normalesup.org>
Message-ID:

On Tue, Dec 8, 2009 at 12:02 PM, Pauli Virtanen wrote:
> Sun, 06 Dec 2009 14:53:58 +0100, Gael Varoquaux wrote:
>> I have a lot of code that has stopped working with my latest SVN pull to numpy.
>>
>> * Some compiled code yields an error looking like (from memory):
>>
>>     "incorrect type 'numpy.ndarray'"
>
> This, by the way, also affects the 1.4.x branch. Because of the datetime branch merge, a new field was added to ArrayDescr -- and this breaks previously compiled Cython modules.

Will the datetime changes affect other previously-compiled python modules, like PyQwt?

From pav at iki.fi Tue Dec 8 12:16:06 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Tue, 08 Dec 2009 19:16:06 +0200
Subject: [Numpy-discussion] Cython issues w/ 1.4.0
In-Reply-To:
References: <20091206135358.GE897@phare.normalesup.org>
Message-ID: <1260292566.18562.51.camel@talisman>

On Tue, 2009-12-08 at 12:12 -0500, Darren Dale wrote:
> On Tue, Dec 8, 2009 at 12:02 PM, Pauli Virtanen wrote:
> > This, by the way, also affects the 1.4.x branch. Because of the datetime branch merge, a new field was added to ArrayDescr -- and this breaks previously compiled Cython modules.
>
> Will the datetime changes affect other previously-compiled python modules, like PyQwt?

For those modules that explicitly check the sizeof of various C structures. For other modules, I'd expect there to be no consequences, as the new fields were added to the end of the struct.

-- 
Pauli Virtanen
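A rough Python paraphrase of the kind of size check involved (Cython does this in generated C code against tp_basicsize, and the exact message wording varies; the expected size value below is made up):

import numpy as np

# Size of PyArray_Descr that the extension was built against (value made up):
expected_size = 88
if np.dtype.__basicsize__ != expected_size:   # __basicsize__ exposes tp_basicsize
    raise ValueError("numpy.dtype has the wrong size, try recompiling")

Adding a field to ArrayDescr changes tp_basicsize, so a module built against 1.3.x fails such a check under 1.4.0 even if nothing it actually uses moved in the struct.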
From cournape at gmail.com Tue Dec 8 12:28:46 2009
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 9 Dec 2009 02:28:46 +0900
Subject: [Numpy-discussion] Cython issues w/ 1.4.0
In-Reply-To:
References: <20091206135358.GE897@phare.normalesup.org>
Message-ID: <5b8d13220912080928u1c0c88b4qac6689e5534ff5ec@mail.gmail.com>

On Wed, Dec 9, 2009 at 2:02 AM, Pauli Virtanen wrote:
> Sun, 06 Dec 2009 14:53:58 +0100, Gael Varoquaux wrote:
>> I have a lot of code that has stopped working with my latest SVN pull to numpy.
>>
>> * Some compiled code yields an error looking like (from memory):
>>
>>     "incorrect type 'numpy.ndarray'"
>
> This, by the way, also affects the 1.4.x branch. Because of the datetime branch merge, a new field was added to ArrayDescr -- and this breaks previously compiled Cython modules.

It seems that it is partly a cython problem. If py3k can be done for numpy 1.5, I wonder if we should focus on making an incompatible numpy 1.6 (or 2.0 :) ), with an emphasis on making the C api more robust about those changes, using opaque pointers, functions, etc... Basically, implementing something like PEP 384, but for numpy.

As numpy becomes more and more used as a basis for so much software, I feel like the current situation is hurting numpy users quite badly. Maybe I am overestimating the problem, though ?

cheers,

David

From pav at iki.fi Tue Dec 8 12:37:10 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Tue, 08 Dec 2009 19:37:10 +0200
Subject: [Numpy-discussion] Cython issues w/ 1.4.0
In-Reply-To: <5b8d13220912080928u1c0c88b4qac6689e5534ff5ec@mail.gmail.com>
References: <20091206135358.GE897@phare.normalesup.org> <5b8d13220912080928u1c0c88b4qac6689e5534ff5ec@mail.gmail.com>
Message-ID: <1260293830.18562.55.camel@talisman>

On Wed, 2009-12-09 at 02:28 +0900, David Cournapeau wrote:
[clip]
> It seems that it is partly a cython problem. If py3k can be done for numpy 1.5, I wonder if we should focus on making an incompatible numpy 1.6 (or 2.0 :) ), with an emphasis on making the C api more robust about those changes, using opaque pointers, functions, etc... Basically, implementing something like PEP 384, but for numpy.
>
> As numpy becomes more and more used as a basis for so much software, I feel like the current situation is hurting numpy users quite badly. Maybe I am overestimating the problem, though ?

If we add an unused

	void *private

both to ArrayDescr and ArrayObject for 1.4.x, we can stuff private data there and don't need to break the ABI again for 1.5 just because of possible changes to implementation details. (And if it turns out we can do without them in 1.5.x, then we have some leeway for future changes.)

I think this would be a harmless addition, although we are already in the rc phase. Thoughts?

	Pauli

From gael.varoquaux at normalesup.org Tue Dec 8 12:37:36 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Tue, 8 Dec 2009 18:37:36 +0100
Subject: [Numpy-discussion] Cython issues w/ 1.4.0
In-Reply-To: <5b8d13220912080928u1c0c88b4qac6689e5534ff5ec@mail.gmail.com>
References: <20091206135358.GE897@phare.normalesup.org> <5b8d13220912080928u1c0c88b4qac6689e5534ff5ec@mail.gmail.com>
Message-ID: <20091208173736.GE16809@phare.normalesup.org>

On Wed, Dec 09, 2009 at 02:28:46AM +0900, David Cournapeau wrote:
> As numpy becomes more and more used as a basis for so much software, I feel like the current situation is hurting numpy users quite badly. Maybe I am overestimating the problem, though ?

I think you are right. It is going to hurt us pretty badly, in my institute, where we have several Pythons deployed, with several numpy versions (the system one, the NFS one, and often some locally-built packages), and users not fully aware of the situation.

In addition, it is going to make deploying software harder, and lead to impossible situations, where the binaries for module A work with one version of numpy, and the binaries for module B work with another. And recompiling is not a good option for end users.

Of course, I am not knowledgeable enough to say technically what the way out, or the best compromise, is. I am just saying that the current situation will hurt.

My 2 cents,

Gaël

From robert.kern at gmail.com Tue Dec 8 12:40:56 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 8 Dec 2009 11:40:56 -0600
Subject: [Numpy-discussion] Cython issues w/ 1.4.0
In-Reply-To: <5b8d13220912080928u1c0c88b4qac6689e5534ff5ec@mail.gmail.com>
References: <20091206135358.GE897@phare.normalesup.org> <5b8d13220912080928u1c0c88b4qac6689e5534ff5ec@mail.gmail.com>
Message-ID: <3d375d730912080940g217eb744vbcee68f8a8a3a1c8@mail.gmail.com>

On Tue, Dec 8, 2009 at 11:28, David Cournapeau wrote:
> On Wed, Dec 9, 2009 at 2:02 AM, Pauli Virtanen wrote:
> > This, by the way, also affects the 1.4.x branch. Because of the datetime branch merge, a new field was added to ArrayDescr -- and this breaks previously compiled Cython modules.
>
> It seems that it is partly a cython problem.

It's *entirely* a Cython problem.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From cournape at gmail.com Tue Dec 8 12:41:33 2009
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 9 Dec 2009 02:41:33 +0900
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To: <3d375d730912080821m6ea32dc2w62cfa1b58808f939@mail.gmail.com>
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <4B1DD2E2.8090007@ar.media.kyoto-u.ac.jp> <6A3E7210-AC39-400A-9F03-F033A95D6F28@gmail.com> <4B1E00C9.9080502@ar.media.kyoto-u.ac.jp> <4B1E6C7B.7060807@gmail.com> <5b8d13220912080721q34b81382k2e53d1b605496c4@mail.gmail.com> <1260287829.18562.50.camel@talisman> <5b8d13220912080812l75c150b6u6024513730ae013b@mail.gmail.com> <3d375d730912080821m6ea32dc2w62cfa1b58808f939@mail.gmail.com>
Message-ID: <5b8d13220912080941r3d028451ka3b0b38dd392b203@mail.gmail.com>

On Wed, Dec 9, 2009 at 1:21 AM, Robert Kern wrote:

> The default has always been "print", not "warn" (except for underflow, which was "ignore").

Ah, ok, that explains part of the misunderstanding, sorry for the confusion.

> However, "warn" is better for the reasons you state (except for underflow, which should remain "ignore").

Do you have an opinion on whether we should keep the current behavior vs replacing it with warn for everything but underflow for 1.4.x ?

David

From cournape at gmail.com Tue Dec 8 12:47:58 2009
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 9 Dec 2009 02:47:58 +0900
Subject: [Numpy-discussion] Cython issues w/ 1.4.0
In-Reply-To: <1260293830.18562.55.camel@talisman>
References: <20091206135358.GE897@phare.normalesup.org> <5b8d13220912080928u1c0c88b4qac6689e5534ff5ec@mail.gmail.com> <1260293830.18562.55.camel@talisman>
Message-ID: <5b8d13220912080947r56630b24vd945e7824a10dfe9@mail.gmail.com>

On Wed, Dec 9, 2009 at 2:37 AM, Pauli Virtanen wrote:
> On Wed, 2009-12-09 at 02:28 +0900, David Cournapeau wrote:
> [clip]
>> It seems that it is partly a cython problem. If py3k can be done for numpy 1.5, I wonder if we should focus on making an incompatible numpy 1.6 (or 2.0 :) ), with an emphasis on making the C api more robust about those changes, using opaque pointers, functions, etc... Basically, implementing something like PEP 384, but for numpy.
>>
>> As numpy becomes more and more used as a basis for so much software, I feel like the current situation is hurting numpy users quite badly. Maybe I am overestimating the problem, though ?
>
> If we add an unused
>
> 	void *private
>
> both to ArrayDescr and ArrayObject for 1.4.x, we can stuff private data there and don't need to break the ABI again for 1.5 just because of possible changes to implementation details. (And if it turns out we can do without them in 1.5.x, then we have some leeway for future changes.)

What I had in mind was more thorough: all the structs become as opaque as possible, we remove most macros to replace them with accessors (or mark the macro as "unsafe" as far as ABI goes). This may be unrealistic, at least in some cases, for speed reasons, though.

Of course, this does not prevent applying your suggested change - I don't understand why you want to add it to 1.4.0, though. 1.4.0 does not break the ABI compared to 1.3.0. Or is it "just" to avoid having the cython issue reappear for 1.5.0 ?

cheers,

David

From robert.kern at gmail.com Tue Dec 8 12:54:14 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 8 Dec 2009 11:54:14 -0600
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To: <5b8d13220912080941r3d028451ka3b0b38dd392b203@mail.gmail.com>
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <6A3E7210-AC39-400A-9F03-F033A95D6F28@gmail.com> <4B1E00C9.9080502@ar.media.kyoto-u.ac.jp> <4B1E6C7B.7060807@gmail.com> <5b8d13220912080721q34b81382k2e53d1b605496c4@mail.gmail.com> <1260287829.18562.50.camel@talisman> <5b8d13220912080812l75c150b6u6024513730ae013b@mail.gmail.com> <3d375d730912080821m6ea32dc2w62cfa1b58808f939@mail.gmail.com> <5b8d13220912080941r3d028451ka3b0b38dd392b203@mail.gmail.com>
Message-ID: <3d375d730912080954m5758cf00l14666a81848f1b78@mail.gmail.com>

On Tue, Dec 8, 2009 at 11:41, David Cournapeau wrote:
> On Wed, Dec 9, 2009 at 1:21 AM, Robert Kern wrote:
>
>> The default has always been "print", not "warn" (except for underflow, which was "ignore").
>
> Ah, ok, that explains part of the misunderstanding, sorry for the confusion.
>
>> However, "warn" is better for the reasons you state (except for underflow, which should remain "ignore").
>
> Do you have an opinion on whether we should keep the current behavior vs replacing it with warn for everything but underflow for 1.4.x ?

As far as I can tell, the faulty global seterr() has been in place since 1.1.0, so fixing it at all should be considered a feature change. It's not likely to actually *break* things except for doctests and documentation. I think I fall in with Chuck in suggesting that it should be changed in 1.5.0. I would add that it would be okay to use the preferable "warn" option instead of "print" at that time since it really isn't a "fix" anymore, just a new feature.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From jorgen.stenarson at bostream.nu Tue Dec 8 12:54:37 2009
From: jorgen.stenarson at bostream.nu (Jörgen Stenarson)
Date: Tue, 08 Dec 2009 18:54:37 +0100
Subject: [Numpy-discussion] Complex zerodivision/negative powers not handled correctly
Message-ID: <4B1E92DD.4030404@bostream.nu>

Hi,

I have observed a problem with complex zero division and negative powers. With a complex zero the result is either zero or NaN NaNj; the first one is clearly wrong, and for the other I don't know what is most reasonable: some kind of inf, or a NaN. This problem has been reported in the tracker as #1271.

In [1]: import numpy

In [2]: numpy.__version__
Out[2]: '1.4.0rc1'

In [3]: from numpy import *

In [4]: array([-0., 0])**-1
Out[4]: array([-Inf,  Inf])

In [5]: array([-0., 0])**-2
Out[5]: array([ Inf,  Inf])

In [6]: array([-0.-0j, -0.+0j, 0-0j, 0+0j])**-2
Out[6]: array([ 0.+0.j,  0.+0.j,  0.+0.j,  0.+0.j])

In [7]: array([-0.-0j, -0.+0j, 0-0j, 0+0j])**-1
Out[7]: array([ NaN NaNj,  NaN NaNj,  NaN NaNj,  NaN NaNj])

In [8]: 1/array([-0.-0j, -0.+0j, 0-0j, 0+0j])
Out[8]: array([ NaN NaNj,  NaN NaNj,  NaN NaNj,  NaN NaNj])

From pav at iki.fi Tue Dec 8 13:08:11 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Tue, 08 Dec 2009 20:08:11 +0200
Subject: [Numpy-discussion] Cython issues w/ 1.4.0
In-Reply-To: <5b8d13220912080947r56630b24vd945e7824a10dfe9@mail.gmail.com>
References: <20091206135358.GE897@phare.normalesup.org> <5b8d13220912080928u1c0c88b4qac6689e5534ff5ec@mail.gmail.com> <1260293830.18562.55.camel@talisman> <5b8d13220912080947r56630b24vd945e7824a10dfe9@mail.gmail.com>
Message-ID: <1260295691.18562.66.camel@talisman>

On Wed, 2009-12-09 at 02:47 +0900, David Cournapeau wrote:
[clip]
> Of course, this does not prevent applying your suggested change - I don't understand why you want to add it to 1.4.0, though. 1.4.0 does not break the ABI compared to 1.3.0. Or is it "just" to avoid having the cython issue reappear for 1.5.0 ?

Yes, it's to avoid having to deal with the Cython issue again in 1.5.0.

Although it's not strictly speaking an ABI break, it seems this is a bit of a nuisance for some people, so if we can work around it cheaply, I think we should do it.

We should maybe convince the Cython people to disable this check, at least for Numpy.

-- 
Pauli Virtanen

From robert.kern at gmail.com Tue Dec 8 13:14:48 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 8 Dec 2009 12:14:48 -0600
Subject: [Numpy-discussion] Cython issues w/ 1.4.0
In-Reply-To: <1260295691.18562.66.camel@talisman>
References: <20091206135358.GE897@phare.normalesup.org> <5b8d13220912080928u1c0c88b4qac6689e5534ff5ec@mail.gmail.com> <1260293830.18562.55.camel@talisman> <5b8d13220912080947r56630b24vd945e7824a10dfe9@mail.gmail.com> <1260295691.18562.66.camel@talisman>
Message-ID: <3d375d730912081014l29092efarc17df555912d3f4f@mail.gmail.com>

On Tue, Dec 8, 2009 at 12:08, Pauli Virtanen wrote:
> On Wed, 2009-12-09 at 02:47 +0900, David Cournapeau wrote:
> [clip]
>> Of course, this does not prevent applying your suggested change - I don't understand why you want to add it to 1.4.0, though. 1.4.0 does not break the ABI compared to 1.3.0. Or is it "just" to avoid having the cython issue reappear for 1.5.0 ?
>
> Yes, it's to avoid having to deal with the Cython issue again in 1.5.0.

Do we have any features on deck that would add a struct member? I think it's pretty rare for us to do so, as it should be.

> Although it's not strictly speaking an ABI break, it seems this is a bit of a nuisance for some people, so if we can work around it cheaply, I think we should do it.

Breaking compatibility via a major reorganization of our structs is not cheap!

> We should maybe convince the Cython people to disable this check, at least for Numpy.

They appear to be. See the latest messages in the thread "Checking extension type sizes".

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From pfeldman at verizon.net Tue Dec 8 13:17:20 2009
From: pfeldman at verizon.net (Dr. Phillip M. Feldman)
Date: Tue, 8 Dec 2009 10:17:20 -0800 (PST)
Subject: [Numpy-discussion] Assigning complex values to a real array
In-Reply-To:
References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070130j5672a114m159a7b922931b205@mail.gmail.com> <3d375d730903070135u28fb4085x86d0d6139d2dd28f@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com> <26688453.post@talk.nabble.com>
Message-ID: <26698253.post@talk.nabble.com>

David Warde-Farley-2 wrote:
>
> A less harmful solution (if a solution is warranted, which is for the Council of the Elders to decide) would be to treat the Python complex type as a special case, so that the .real attribute is accessed instead of trying to cast to float.
>

There are two even less harmful solutions: (1) Raise an exception. (2) Provide the user with a top-level flag to control whether the attempt to downcast a NumPy complex to a float should be handled by raising an exception, by throwing away the imaginary part, or by taking the magnitude.

P.S. As things stand now, I do not regard NumPy as a reliable platform for scientific computing.

-- 
View this message in context: http://old.nabble.com/Assigning-complex-values-to-a-real-array-tp22383353p26698253.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.

From Chris.Barker at noaa.gov Tue Dec 8 13:34:12 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Tue, 08 Dec 2009 10:34:12 -0800
Subject: [Numpy-discussion] What protocol to use now?
Message-ID: <4B1E9C24.8030704@noaa.gov>

Hi folks,

There was just a question on the wxPython list about how to optimize some drawing of data in numpy arrays. Currently, wxPython uses PySequenceGetItem to iterate through an array, so you can imagine there is a fair bit of overhead in that.

But what to use?

We don't want to require numpy, so using the numpy API directly is out.

Using the buffer interface makes it too hard to catch user errors.

The array interface was made for this sort of thing, but is deprecated:

http://docs.scipy.org/doc/numpy/reference/arrays.interface.html

Is the new PEP 3118 protocol now (as of version 1.4) supported by numpy, at least for export? At the moment, a one-way street is OK for this application.

thanks,

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From pav at iki.fi Tue Dec 8 13:38:18 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Tue, 08 Dec 2009 20:38:18 +0200
Subject: [Numpy-discussion] Cython issues w/ 1.4.0
In-Reply-To: <3d375d730912081014l29092efarc17df555912d3f4f@mail.gmail.com>
References: <20091206135358.GE897@phare.normalesup.org> <5b8d13220912080928u1c0c88b4qac6689e5534ff5ec@mail.gmail.com> <1260293830.18562.55.camel@talisman> <5b8d13220912080947r56630b24vd945e7824a10dfe9@mail.gmail.com> <1260295691.18562.66.camel@talisman> <3d375d730912081014l29092efarc17df555912d3f4f@mail.gmail.com>
Message-ID: <1260297498.18562.89.camel@talisman>

On Tue, 2009-12-08 at 12:14 -0600, Robert Kern wrote:
> On Tue, Dec 8, 2009 at 12:08, Pauli Virtanen wrote:
> > Yes, it's to avoid having to deal with the Cython issue again in 1.5.0.
>
> Do we have any features on deck that would add a struct member? I think it's pretty rare for us to do so, as it should be.

If we want to support PEP 3118 on Py2.6, then new fields would be useful:

- Python 2.6 currently has issues with PyArg_ParseTuple("s#", ...): defining a bf_releasebuffer breaks that particular feature.

  Consequently, if we want backwards compatibility, we cannot keep track of allocated memory using the Py_buffer structure, so something else is needed.

  We can probably get this fixed in future Python 2.6/2.7 releases, making it prefer the old buffer interface. The issue is also most likely unfixable on Py3, since "s#" has semantics that are not really compatible with the new buffer interface.

- We need to cache the buffer protocol format string somewhere, if we do not want to regenerate it on each buffer acquisition. The natural place for this would be in the ArrayDescr.

The alternative at the moment is to #ifdef the PEP 3118 implementation out on Py2, at least until the issues on 2.6 are fixed. This would actually be a cleaner alternative -- working around bugs like this is icky.

> > Although it's not strictly speaking an ABI break, it seems this is a bit of a nuisance for some people, so if we can work around it cheaply, I think we should do it.
>
> Breaking compatibility via a major reorganization of our structs is not cheap!

We already break the compatibility in 1.4.0 because of the datetime "metadata" field added in the ArrayDescr. Adding an additional field reserved for future use at the same time should not cause additional breakage. They could come in useful later on, even if it turns out we don't immediately need them.

(Or maybe you commented on David's suggestion...)

-- 
Pauli Virtanen

From robert.kern at gmail.com Tue Dec 8 15:13:17 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 8 Dec 2009 14:13:17 -0600
Subject: [Numpy-discussion] Cython issues w/ 1.4.0
In-Reply-To: <1260297498.18562.89.camel@talisman>
References: <20091206135358.GE897@phare.normalesup.org> <5b8d13220912080928u1c0c88b4qac6689e5534ff5ec@mail.gmail.com> <1260293830.18562.55.camel@talisman> <5b8d13220912080947r56630b24vd945e7824a10dfe9@mail.gmail.com> <1260295691.18562.66.camel@talisman> <3d375d730912081014l29092efarc17df555912d3f4f@mail.gmail.com> <1260297498.18562.89.camel@talisman>
Message-ID: <3d375d730912081213h362ece9bk2d4104fa1234d4e8@mail.gmail.com>

On Tue, Dec 8, 2009 at 12:38, Pauli Virtanen wrote:
> On Tue, 2009-12-08 at 12:14 -0600, Robert Kern wrote:
>> Do we have any features on deck that would add a struct member? I think it's pretty rare for us to do so, as it should be.
>
> If we want to support PEP 3118 on Py2.6, then new fields would be useful:
>
> - Python 2.6 currently has issues with PyArg_ParseTuple("s#", ...): defining a bf_releasebuffer breaks that particular feature.
>
>   Consequently, if we want backwards compatibility, we cannot keep track of allocated memory using the Py_buffer structure, so something else is needed.
>
>   We can probably get this fixed in future Python 2.6/2.7 releases, making it prefer the old buffer interface. The issue is also most likely unfixable on Py3, since "s#" has semantics that are not really compatible with the new buffer interface.
>
> - We need to cache the buffer protocol format string somewhere, if we do not want to regenerate it on each buffer acquisition.

My suspicion is that YAGNI. I would wait until it is actually in use and we see whether it takes up a significant amount of time in actual code.

>> Breaking compatibility via a major reorganization of our structs is not cheap!
>
> We already break the compatibility in 1.4.0 because of the datetime "metadata" field added in the ArrayDescr. Adding an additional field reserved for future use at the same time should not cause additional breakage. They could come in useful later on, even if it turns out we don't immediately need them.
>
> (Or maybe you commented on David's suggestion...)

Yes, I was thinking of that. I think I might be okay with putting in the placeholder pointer in 1.4.0 in order to reserve the slot for a well-thought-out implementation in 1.5.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From robert.kern at gmail.com Tue Dec 8 15:23:36 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 8 Dec 2009 14:23:36 -0600
Subject: [Numpy-discussion] What protocol to use now?
In-Reply-To: <4B1E9C24.8030704@noaa.gov>
References: <4B1E9C24.8030704@noaa.gov>
Message-ID: <3d375d730912081223q22380a58u6941ee3593ee8caf@mail.gmail.com>

On Tue, Dec 8, 2009 at 12:34, Christopher Barker wrote:
> Hi folks,
>
> There was just a question on the wxPython list about how to optimize some drawing of data in numpy arrays. Currently, wxPython uses PySequenceGetItem to iterate through an array, so you can imagine there is a fair bit of overhead in that.
>
> But what to use?
>
> We don't want to require numpy, so using the numpy API directly is out.
>
> Using the buffer interface makes it too hard to catch user errors.
>
> The array interface was made for this sort of thing, but is deprecated:
>
> http://docs.scipy.org/doc/numpy/reference/arrays.interface.html
>
> Is the new PEP 3118 protocol now (as of version 1.4) supported by numpy, at least for export? At the moment, a one-way street is OK for this application.

I think the wording is overly strong. I don't think that we actually decided to deprecate the interface. PEP 3118 is not yet implemented by numpy, and the PEP 3118 API won't be available to Pythons <2.6 (Cython's workarounds notwithstanding). Pauli, did we discuss this before you wrote that warning and I'm just not remembering it?

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
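For reference, the producer side of the array interface in question looks like this (a small illustrative session; the typestr shown assumes a little-endian machine):

>>> import numpy as np
>>> a = np.arange(6.).reshape(2, 3)
>>> ai = a.__array_interface__   # a plain dict, version 3 of the protocol
>>> ai['shape'], ai['typestr']
((2, 3), '<f8')
>>> ptr, readonly = ai['data']   # raw data address plus a read-only flag

A consumer like wxPython can read this dict without ever importing numpy, which is what makes the protocol attractive for the use case Chris describes.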
-- Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From washakie at gmail.com Tue Dec 8 15:42:58 2009
From: washakie at gmail.com (John [H2O])
Date: Tue, 8 Dec 2009 12:42:58 -0800 (PST)
Subject: [Numpy-discussion] more recfunctions, structured array help
Message-ID: <26700380.post@talk.nabble.com>

I see record arrays don't have a masked_where method. How can I achieve the following for a record array:

cd.masked_where(cd.co == -9999.)

Or something like this. Thanks!

-- View this message in context: http://old.nabble.com/more-recfunctions%2C-structured-array-help-tp26700380p26700380.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.

From dagss at student.matnat.uio.no Tue Dec 8 15:52:16 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Tue, 08 Dec 2009 21:52:16 +0100
Subject: [Numpy-discussion] Cython issues w/ 1.4.0
In-Reply-To: <3d375d730912081213h362ece9bk2d4104fa1234d4e8@mail.gmail.com>
References: <20091206135358.GE897@phare.normalesup.org> <5b8d13220912080928u1c0c88b4qac6689e5534ff5ec@mail.gmail.com> <1260293830.18562.55.camel@talisman> <5b8d13220912080947r56630b24vd945e7824a10dfe9@mail.gmail.com> <1260295691.18562.66.camel@talisman> <3d375d730912081014l29092efarc17df555912d3f4f@mail.gmail.com> <1260297498.18562.89.camel@talisman> <3d375d730912081213h362ece9bk2d4104fa1234d4e8@mail.gmail.com>
Message-ID: <4B1EBC80.8020702@student.matnat.uio.no>

Robert Kern wrote:
> On Tue, Dec 8, 2009 at 12:38, Pauli Virtanen wrote:
>> ti, 2009-12-08 kello 12:14 -0600, Robert Kern kirjoitti:
>>> On Tue, Dec 8, 2009 at 12:08, Pauli Virtanen wrote:
>>>> ke, 2009-12-09 kello 02:47 +0900, David Cournapeau kirjoitti:
>>>> [clip]
>>>>> Of course, this does not prevent from applying your suggested change - I don't understand why you want to add it to 1.4.0, though. 1.4.0 does not break the ABI compared to 1.3.0. Or is it "just" to avoid the cython issue to reappear for 1.5.0 ?
>>>> Yes, it's to avoid having to deal with the Cython issue again in 1.5.0.
>>> Do we have any features on deck that would add a struct member? I think it's pretty rare for us to do so, as it should be.
>> If we want to support PEP 3118 on Py2.6, then new fields would be useful:
>>
>> - Python 2.6 currently has issues with PyArg_ParseTuple("s#", ...): defining a bf_releasebuffer breaks that particular feature.
>>
>>   Consequently, if we want backwards compatibility, we cannot keep track of allocated memory using the Py_buffer structure, so something else is needed.
>>
>>   We can probably get this fixed in future Python 2.6/2.7 releases, making it prefer the old buffer interface. The issue is also most likely unfixable on Py3, since "s#" has semantics that are not really compatible with the new buffer interface.

How about this:

- Cache/store the format string in a bytes object in a global WeakRefKeyDict (?), keyed by dtype
- The array holds a ref to the dtype, and the Py_buffer holds a ref to the array (through the obj field).

Alternatively, create a new Python object and stick it in the "obj" in the Py_buffer. I don't think obj has to point to the actual object the buffer was acquired from, as long as it keeps alive a reference to it somehow (though I didn't find any docs for the obj field, it was added as an afterthought by the implementors after the PEP...).
But the only advantage is not using weak references (if that is a problem), and it is probably slower and doesn't cache the string.

>> - We need to cache the buffer protocol format string somewhere, if we do not want to regenerate it on each buffer acquisition.
>
> My suspicion is that YAGNI. I would wait until it is actually in use and we see whether it takes up a significant amount of time in actual code.

The slight problem with that is that if somebody discovers that this is a bottleneck in the code, the turnaround time for waiting for a new NumPy release could be quite a while. Not that I think it will ever be a problem.

-- Dag Sverre

From pivanov314 at gmail.com Tue Dec 8 15:57:09 2009
From: pivanov314 at gmail.com (Paul Ivanov)
Date: Tue, 8 Dec 2009 12:57:09 -0800
Subject: [Numpy-discussion] doctest improvements patch (and possible regressions)
In-Reply-To:
References: <4B1EB148.1010607@gmail.com>
Message-ID:

Hi Numpy-devs, I'm a long time listener, first time caller. I grabbed 1.4.0rc1 and was happy that all the tests passed. But then I tried:

>>> import numpy as np
>>> np.test(doctests=True)
...
Ran 1696 tests in 22.027s
FAILED (failures=113, errors=24)

I looked at some of the failures, and they looked like trivial typos. So I said to myself: "Self, wouldn't it be cool if all the doctests worked?" Well, I didn't quite get there spelunking and snorkeling in the source code a few evenings during the past week, but I got close. With the attached patch (which patches the 1.4.0rc1 tarball), I now get:

>>> import numpy as np
>>> np.test(doctests=True)
...
Ran 1696 tests in 20.937s
FAILED (failures=33, errors=25)

I marked up suspicious differences with XXX, since I don't know if they're significant. In particular:
- shortening a defchararray by strip does not change its dtype to a shorter one (apparently it used to?)
- the docstring for seterr says that np.seterr() should reset all errors to defaults, but clearly doesn't do that
- there's a regression in recfunctions which may be related to #1299 and may have been fixed
- recfunctions.find_duplicates ignoremask flag has no effect.

There are a few other things, but they're minor (e.g. I added a note about how missing values are filled with usemask=False in recfunctions.merge_arrays). I think the only code I added was to testing/noseclasses.py. There, if a test fails, I give a few more chances to pass by normalizing the endianness of both desired and actual output, as well as default int size for 32 and 64 bit machines. This is done just using replace() on the strings. Everything else is docstring stuff, so I was hoping to sneak this into 1.4.0, since it would make it that much more polished. Does that sound crazy?

best, Paul Ivanov

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: better-doctests.patch.gz
Type: application/x-gzip
Size: 16134 bytes
Desc: not available
URL:

From robert.kern at gmail.com Tue Dec 8 16:06:03 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 8 Dec 2009 15:06:03 -0600
Subject: [Numpy-discussion] Cython issues w/ 1.4.0
In-Reply-To: <4B1EBC80.8020702@student.matnat.uio.no>
References: <20091206135358.GE897@phare.normalesup.org> <5b8d13220912080928u1c0c88b4qac6689e5534ff5ec@mail.gmail.com> <1260293830.18562.55.camel@talisman> <5b8d13220912080947r56630b24vd945e7824a10dfe9@mail.gmail.com> <1260295691.18562.66.camel@talisman> <3d375d730912081014l29092efarc17df555912d3f4f@mail.gmail.com> <1260297498.18562.89.camel@talisman> <3d375d730912081213h362ece9bk2d4104fa1234d4e8@mail.gmail.com> <4B1EBC80.8020702@student.matnat.uio.no>
Message-ID: <3d375d730912081306o43862d26k9377e70a9e2fe973@mail.gmail.com>

On Tue, Dec 8, 2009 at 14:52, Dag Sverre Seljebotn wrote:
> Robert Kern wrote:
>> On Tue, Dec 8, 2009 at 12:38, Pauli Virtanen wrote:
>>> - We need to cache the buffer protocol format string somewhere, if we do not want to regenerate it on each buffer acquisition.
>>
>> My suspicion is that YAGNI. I would wait until it is actually in use and we see whether it takes up a significant amount of time in actual code.
>
> The slight problem with that is that if somebody discovers that this is a bottleneck in the code, the turnaround time for waiting for a new NumPy release could be quite a while. Not that I think it will ever be a problem.

That's true of anything we might do. I'm just skeptical that regeneration takes so much time that it will significantly affect real applications. How often are buffers going to be converted, really? Particularly since one of the points of this interface is to get the buffer once and read/write into it many times and avoid copying anything. Adding a struct member or even using the envisioned dynamic slots is pretty costly, and is not something that we should do for a cache until there is some profiling done on real applications. Premature optimization, root of all evil, and all that.

-- Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From pgmdevlist at gmail.com Tue Dec 8 16:25:07 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Tue, 8 Dec 2009 16:25:07 -0500
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To: <3d375d730912080954m5758cf00l14666a81848f1b78@mail.gmail.com>
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <6A3E7210-AC39-400A-9F03-F033A95D6F28@gmail.com> <4B1E00C9.9080502@ar.media.kyoto-u.ac.jp> <4B1E6C7B.7060807@gmail.com> <5b8d13220912080721q34b81382k2e53d1b605496c4@mail.gmail.com> <1260287829.18562.50.camel@talisman> <5b8d13220912080812l75c150b6u6024513730ae013b@mail.gmail.com> <3d375d730912080821m6ea32dc2w62cfa1b58808f939@mail.gmail.com> <5b8d13220912080941r3d028451ka3b0b38dd392b203@mail.gmail.com> <3d375d730912080954m5758cf00l14666a81848f1b78@mail.gmail.com>
Message-ID: <13D4BCC7-C9CB-408F-956A-D4A2C450C30A@gmail.com>

On Dec 8, 2009, at 12:54 PM, Robert Kern wrote:
>
> As far as I can tell, the faulty global seterr() has been in place since 1.1.0, so fixing it at all should be considered a feature change. It's not likely to actually *break* things except for doctests and documentation. I think I fall in with Chuck in suggesting that it should be changed in 1.5.0.

OK.
I'll work on fixing the remaining issues when an np function is applied on a masked array.

FYI, most of the warnings can be fixed in _MaskedUnaryOperation and consorts with:

        err_status_ini = np.geterr()
        np.seterr(divide='ignore', invalid='ignore')
        result = self.f(da, db, *args, **kwargs)
        np.seterr(**err_status_ini)

Is this kind of fix acceptable ?

From pav at iki.fi Tue Dec 8 16:25:50 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Tue, 08 Dec 2009 23:25:50 +0200
Subject: [Numpy-discussion] Cython issues w/ 1.4.0
In-Reply-To: <4B1EBC80.8020702@student.matnat.uio.no>
References: <20091206135358.GE897@phare.normalesup.org> <5b8d13220912080928u1c0c88b4qac6689e5534ff5ec@mail.gmail.com> <1260293830.18562.55.camel@talisman> <5b8d13220912080947r56630b24vd945e7824a10dfe9@mail.gmail.com> <1260295691.18562.66.camel@talisman> <3d375d730912081014l29092efarc17df555912d3f4f@mail.gmail.com> <1260297498.18562.89.camel@talisman> <3d375d730912081213h362ece9bk2d4104fa1234d4e8@mail.gmail.com> <4B1EBC80.8020702@student.matnat.uio.no>
Message-ID: <1260307550.17559.11.camel@idol>

ti, 2009-12-08 kello 21:52 +0100, Dag Sverre Seljebotn kirjoitti:
[clip]
> How about this:
> - Cache/store the format string in a bytes object in a global WeakRefKeyDict (?), keyed by dtype
> - The array holds a ref to the dtype, and the Py_buffer holds a ref to the array (through the obj field).

Yep, storage in a static variable is the second alternative. We can even handle allocation and deallocation manually. I think I'll make it this way then.

> Alternatively, create a new Python object and stick it in the "obj" in the Py_buffer. I don't think obj has to point to the actual object the buffer was acquired from, as long as it keeps alive a reference to it somehow (though I didn't find any docs for the obj field, it was added as an afterthought by the implementors after the PEP...).

The current implementation of MemoryView in Python assumes that you can call PyObject_GetBuffer(view->obj). But this can be changed; IIRC, the memoryview also has a ->base member containing the same info.

> >> - We need to cache the buffer protocol format string somewhere, if we do not want to regenerate it on each buffer acquisition.
> >
> > My suspicion is that YAGNI. I would wait until it is actually in use and we see whether it takes up a significant amount of time in actual code.

Sure, it's likely that it won't be a real problem performance-wise, as it's simple C code. The point is that the format string needs to be stored somewhere for later deallocation, and to work around bugs in Python, we cannot put it in Py_buffer where it would naturally belong.

But anyway, it may really be best to not pollute object structs because of a need for workarounds -- I suppose if I submit patches to Python soon, they may make it in releases before Numpy 1.5.0 rolls out. For backward compatibility, we'll just make do with static variables.

Ok, the reserved-for-future pointers in structs may not then be needed after all, at least for this purpose.

-- Pauli Virtanen

From sienkiew at stsci.edu Tue Dec 8 16:36:33 2009
From: sienkiew at stsci.edu (Mark Sienkiewicz)
Date: Tue, 08 Dec 2009 16:36:33 -0500
Subject: [Numpy-discussion] numpy distutils breaks scipy install on mac
Message-ID: <4B1EC6E1.7040004@stsci.edu>

When I compile scipy on a mac, the build fails with:

...
gfortran:f77: scipy/fftpack/src/dfftpack/dcosqb.f
f951: error: unrecognized command line option "-arch"
f951: error: unrecognized command line option "-arch"
f951: error: unrecognized command line option "-arch"
f951: error: unrecognized command line option "-arch"
error: Command "/sw/bin/gfortran -Wall -ffixed-form -fno-second-underscore -arch i686 -arch x86_64 -fPIC -O3 -funroll-loops -I/usr/stsci/pyssgdev/2.5.4/numpy/core/include -c -c scipy/fftpack/src/dfftpack/dcosqb.f -o build/temp.macosx-10.3-i386-2.5/scipy/fftpack/src/dfftpack/dcosqb.o" failed with exit status 1

I have

% gfortran --version
GNU Fortran (GCC) 4.3.0
Copyright (C) 2008 Free Software Foundation, Inc.

% which gfortran
/sw/bin/gfortran

(This /sw/bin apparently means it was installed by "fink". My IT department did this. This is not the recommended compiler from AT&T, but it seems a likely configuration to encounter in the wild, and I didn't expect a problem.)

I traced the problem to numpy/distutils/fcompiler/gnu.py in the class Gnu94FCompiler. The function _universal_flags() tries to detect which processor types are recognized by the compiler, presumably in an attempt to make a Macintosh universal binary. It adds "-arch whatever" for each architecture that it thinks it detected. Since gfortran does not recognize "-arch", the compile fails.

(Presumably, some other version of gfortran does accept -arch, or this code wouldn't be here, right?)

The function _can_target() attempts to recognize what architectures the compiler is capable of by passing in -arch parameters with various known values, but gfortran does not properly indicate a problem in a way that _can_target() can detect:

% gfortran -arch i386 hello_world.f
f951: error: unrecognized command line option "-arch"
% gfortran -arch i386 -v
Using built-in specs.
Target: i686-apple-darwin9
Configured with: ../gcc-4.3.0/configure --prefix=/sw --prefix=/sw/lib/gcc4.3 --mandir=/sw/share/man --infodir=/sw/share/info --enable-languages=c,c++,fortran,objc,java --with-arch=nocona --with-tune=generic --build=i686-apple-darwin9 --with-gmp=/sw --with-libiconv-prefix=/sw --with-system-zlib --x-includes=/usr/X11R6/include --x-libraries=/usr/X11R6/lib --disable-libjava-multilib
Thread model: posix
gcc version 4.3.0 (GCC)
% echo $status
0
%

That is, when you say "-v", it gives no indication that it doesn't understand the -arch flag. I didn't ask for a universal binary and I don't need one, so I'm surprised that it is trying to make one for me. I think the correct solution is that _universal_flags() should not add -arch flags unless the user specifically requests one. Unfortunately, I can't write a patch, because I don't have the time it would take to reverse engineer distutils well enough to know how to do it.

As is usual when a setup.py auto-detects the wrong compiler flags, the easiest solution is to create a shell script that looks like the compiler, but adds/removes flags as necessary:

% cat /eng/ssb/auto/prog/binhacks/scipy.osx/gfortran
#!/bin/sh
# Drop each "-arch <value>" pair before calling the real gfortran.
# (A skip flag is used instead of "shift", because "shift" does not
# affect the already-expanded word list of a "for x in $*" loop.)
args=""
skip=0
for x in "$@"
do
    if [ $skip -eq 1 ]; then
        skip=0
        continue
    fi
    case "$x" in
        -arch) skip=1 ;;
        *) args="$args $x" ;;
    esac
done
/sw/bin/gfortran $args

Mark S.

From robert.kern at gmail.com Tue Dec 8 16:36:41 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 8 Dec 2009 15:36:41 -0600
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To: <13D4BCC7-C9CB-408F-956A-D4A2C450C30A@gmail.com>
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <4B1E6C7B.7060807@gmail.com> <5b8d13220912080721q34b81382k2e53d1b605496c4@mail.gmail.com> <1260287829.18562.50.camel@talisman> <5b8d13220912080812l75c150b6u6024513730ae013b@mail.gmail.com> <3d375d730912080821m6ea32dc2w62cfa1b58808f939@mail.gmail.com> <5b8d13220912080941r3d028451ka3b0b38dd392b203@mail.gmail.com> <3d375d730912080954m5758cf00l14666a81848f1b78@mail.gmail.com> <13D4BCC7-C9CB-408F-956A-D4A2C450C30A@gmail.com>
Message-ID: <3d375d730912081336l24dc9e7ao88469e7777a14f9d@mail.gmail.com>

On Tue, Dec 8, 2009 at 15:25, Pierre GM wrote:
> On Dec 8, 2009, at 12:54 PM, Robert Kern wrote:
>>
>> As far as I can tell, the faulty global seterr() has been in place since 1.1.0, so fixing it at all should be considered a feature change. It's not likely to actually *break* things except for doctests and documentation. I think I fall in with Chuck in suggesting that it should be changed in 1.5.0.
>
> OK. I'll work on fixing the remaining issues when an np function is applied on a masked array.
>
> FYI, most of the warnings can be fixed in _MaskedUnaryOperation and consorts with:
>
>        err_status_ini = np.geterr()
>        np.seterr(divide='ignore', invalid='ignore')
>        result = self.f(da, db, *args, **kwargs)
>        np.seterr(**err_status_ini)
>
> Is this kind of fix acceptable ?

olderr = np.seterr(divide='ignore', invalid='ignore')
try:
    result = self.f(da, db, *args, **kwargs)
finally:
    np.seterr(**olderr)

-- Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From pgmdevlist at gmail.com Tue Dec 8 16:38:43 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Tue, 8 Dec 2009 16:38:43 -0500
Subject: [Numpy-discussion] more recfunctions, structured array help
In-Reply-To: <26700380.post@talk.nabble.com>
References: <26700380.post@talk.nabble.com>
Message-ID: <1272CAB6-D715-488D-842F-6D7F5B2DA6B0@gmail.com>

On Dec 8, 2009, at 3:42 PM, John [H2O] wrote:
> I see record arrays don't have a masked_where method. How can I achieve the following for a record array:
>
> cd.masked_where(cd.co == -9999.)
>
> Or something like this.

masked_where is a function that requires 2 arguments. If you try to mask a whole record, you can try something like

>>> x = ma.array([('a',1),('b',2)],dtype=[('','|S1'),('',float)])
>>> x[x['f0']=='a'] = ma.masked

For an individual field, try something like

>>> x['f1'][x['f1']=='b'] = ma.masked

Otherwise, ma.masked_where doesn't work with structured arrays (yet; that's a bug I just found out).

From Chris.Barker at noaa.gov Tue Dec 8 16:44:13 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Tue, 08 Dec 2009 13:44:13 -0800
Subject: [Numpy-discussion] What protocol to use now?
In-Reply-To: <3d375d730912081223q22380a58u6941ee3593ee8caf@mail.gmail.com>
References: <4B1E9C24.8030704@noaa.gov> <3d375d730912081223q22380a58u6941ee3593ee8caf@mail.gmail.com>
Message-ID: <4B1EC8AD.8080902@noaa.gov>

Robert Kern wrote:
>> The array interface was made for this sort of thing, but is deprecated:
>>
>> http://docs.scipy.org/doc/numpy/reference/arrays.interface.html

> I think the wording is overly strong. I don't think that we actually decided to deprecate the interface. PEP 3118 is not yet implemented by numpy.
That settles it then -- the array interface is the only option if you want to do any type checking.

I'm a bit surprised that PEP 3118 hasn't been implemented yet in numpy -- after all, it was designed very much with numpy in mind. Oh well, I'm not writing the code.

thanks, -Chris

-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov

From pgmdevlist at gmail.com Tue Dec 8 16:47:39 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Tue, 8 Dec 2009 16:47:39 -0500
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To: <3d375d730912081336l24dc9e7ao88469e7777a14f9d@mail.gmail.com>
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <4B1E6C7B.7060807@gmail.com> <5b8d13220912080721q34b81382k2e53d1b605496c4@mail.gmail.com> <1260287829.18562.50.camel@talisman> <5b8d13220912080812l75c150b6u6024513730ae013b@mail.gmail.com> <3d375d730912080821m6ea32dc2w62cfa1b58808f939@mail.gmail.com> <5b8d13220912080941r3d028451ka3b0b38dd392b203@mail.gmail.com> <3d375d730912080954m5758cf00l14666a81848f1b78@mail.gmail.com> <13D4BCC7-C9CB-408F-956A-D4A2C450C30A@gmail.com> <3d375d730912081336l24dc9e7ao88469e7777a14f9d@mail.gmail.com>
Message-ID: <6F39BD24-BB5E-4D69-B399-8105EF88A5C2@gmail.com>

On Dec 8, 2009, at 4:36 PM, Robert Kern wrote:
>>
>>        err_status_ini = np.geterr()
>>        np.seterr(divide='ignore', invalid='ignore')
>>        result = self.f(da, db, *args, **kwargs)
>>        np.seterr(**err_status_ini)
>>
>> Is this kind of fix acceptable ?
>
> olderr = np.seterr(divide='ignore', invalid='ignore')
> try:
>     result = self.f(da, db, *args, **kwargs)
> finally:
>     np.seterr(**olderr)

Neat ! I didn't know about np.seterr returning the old settings. Thanks a million Robert.

From robert.kern at gmail.com Tue Dec 8 16:47:33 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 8 Dec 2009 15:47:33 -0600
Subject: [Numpy-discussion] numpy distutils breaks scipy install on mac
In-Reply-To: <4B1EC6E1.7040004@stsci.edu>
References: <4B1EC6E1.7040004@stsci.edu>
Message-ID: <3d375d730912081347l787db5b4q6d10db351d01ec4e@mail.gmail.com>

On Tue, Dec 8, 2009 at 15:36, Mark Sienkiewicz wrote:
> When I compile scipy on a mac, the build fails with:
>
> ...
> gfortran:f77: scipy/fftpack/src/dfftpack/dcosqb.f
> f951: error: unrecognized command line option "-arch"
> f951: error: unrecognized command line option "-arch"
> f951: error: unrecognized command line option "-arch"
> f951: error: unrecognized command line option "-arch"
> error: Command "/sw/bin/gfortran -Wall -ffixed-form -fno-second-underscore -arch i686 -arch x86_64 -fPIC -O3 -funroll-loops -I/usr/stsci/pyssgdev/2.5.4/numpy/core/include -c -c scipy/fftpack/src/dfftpack/dcosqb.f -o build/temp.macosx-10.3-i386-2.5/scipy/fftpack/src/dfftpack/dcosqb.o" failed with exit status 1
>
> I have
>
> % gfortran --version
> GNU Fortran (GCC) 4.3.0
> Copyright (C) 2008 Free Software Foundation, Inc.
>
> % which gfortran
> /sw/bin/gfortran
>
> (This /sw/bin apparently means it was installed by "fink". My IT department did this. This is not the recommended compiler from AT&T, but it seems a likely configuration to encounter in the wild, and I didn't expect a problem.)
>
> I traced the problem to numpy/distutils/fcompiler/gnu.py in the class Gnu94FCompiler.
> The function _universal_flags() tries to detect which processor types are recognized by the compiler, presumably in an attempt to make a Macintosh universal binary. It adds "-arch whatever" for each architecture that it thinks it detected. Since gfortran does not recognize "-arch", the compile fails.
>
> (Presumably, some other version of gfortran does accept -arch, or this code wouldn't be here, right?)

Right. The -arch flag was added by Apple to GCC and their patch really should be applied to all builds of GCC compilers for the Mac. It is deeply disappointing that Fink ignored this. The only Mac gfortran build that I can recommend is here:

http://r.research.att.com/tools/

_can_target() should be fixed to be more accurate, though, so if you find a patch that works for you, please let us know.

-- Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From robert.kern at gmail.com Tue Dec 8 16:48:54 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 8 Dec 2009 15:48:54 -0600
Subject: [Numpy-discussion] What protocol to use now?
In-Reply-To: <4B1EC8AD.8080902@noaa.gov>
References: <4B1E9C24.8030704@noaa.gov> <3d375d730912081223q22380a58u6941ee3593ee8caf@mail.gmail.com> <4B1EC8AD.8080902@noaa.gov>
Message-ID: <3d375d730912081348s49023587lb5437a4970e7da32@mail.gmail.com>

On Tue, Dec 8, 2009 at 15:44, Christopher Barker wrote:
> Robert Kern wrote:
>>> The array interface was made for this sort of thing, but is deprecated:
>>>
>>> http://docs.scipy.org/doc/numpy/reference/arrays.interface.html
>
>> I think the wording is overly strong. I don't think that we actually decided to deprecate the interface. PEP 3118 is not yet implemented by numpy.
>
> That settles it then -- the array interface is the only option if you want to do any type checking.
>
> I'm a bit surprised that PEP 3118 hasn't been implemented yet in numpy -- after all, it was designed very much with numpy in mind.

Travis's time commitments very suddenly changed late in the PEP's life.

-- Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From washakie at gmail.com Tue Dec 8 16:53:52 2009
From: washakie at gmail.com (John [H2O])
Date: Tue, 8 Dec 2009 13:53:52 -0800 (PST)
Subject: [Numpy-discussion] more recfunctions, structured array help
In-Reply-To: <1272CAB6-D715-488D-842F-6D7F5B2DA6B0@gmail.com>
References: <26700380.post@talk.nabble.com> <1272CAB6-D715-488D-842F-6D7F5B2DA6B0@gmail.com>
Message-ID: <26701484.post@talk.nabble.com>

This is what I get:

In [74]: type(cd)
Out[74]:

In [75]: type(cd.co)
Out[75]:

In [76]: cd[cd['co']==-9999.] = np.ma.masked
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

/home/jfb/Research/arctic_co/co_plot.py in ()
----> 1
      2
      3
      4
      5

ValueError: tried to set void-array with object members using buffer.

-- View this message in context: http://old.nabble.com/more-recfunctions%2C-structured-array-help-tp26700380p26701484.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.
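A minimal, self-contained sketch of the pattern that resolves this (it is the same approach Pierre describes further down-thread; the sample values here are invented):

    import numpy as np
    import numpy.ma as ma
    from numpy.ma import mrecords

    # a small structured array standing in for cd
    cd = np.array([(20.3, -9999.), (20.4, 0.5)],
                  dtype=[('lon', float), ('co', float)])

    # view it as a masked-records array *before* assigning ma.masked
    mcd = cd.view(mrecords.mrecarray)
    mcd[mcd['co'] == -9999.] = ma.masked

Assigning ma.masked to a plain recarray, as in the traceback above, fails because a recarray carries no mask to update.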
From pgmdevlist at gmail.com Tue Dec 8 17:23:23 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Tue, 8 Dec 2009 17:23:23 -0500
Subject: [Numpy-discussion] more recfunctions, structured array help
In-Reply-To: <26701484.post@talk.nabble.com>
References: <26700380.post@talk.nabble.com> <1272CAB6-D715-488D-842F-6D7F5B2DA6B0@gmail.com> <26701484.post@talk.nabble.com>
Message-ID:

On Dec 8, 2009, at 4:53 PM, John [H2O] wrote:
> This is what I get:
>
> In [74]: type(cd)
> Out[74]:
>
> In [75]: type(cd.co)
> Out[75]:
>
> In [76]: cd[cd['co']==-9999.] = np.ma.masked
> ---------------------------------------------------------------------------
> ValueError                                Traceback (most recent call last)
>
> /home/jfb/Research/arctic_co/co_plot.py in ()
> ----> 1
>       2
>       3
>       4
>       5
>
> ValueError: tried to set void-array with object members using buffer.

John,
* Could you post a self-contained example next time ?
* cd should be a MaskedRecords or MaskedArray before you can mask it

From pav at iki.fi Tue Dec 8 17:29:31 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Wed, 09 Dec 2009 00:29:31 +0200
Subject: [Numpy-discussion] What protocol to use now?
In-Reply-To: <3d375d730912081223q22380a58u6941ee3593ee8caf@mail.gmail.com>
References: <4B1E9C24.8030704@noaa.gov> <3d375d730912081223q22380a58u6941ee3593ee8caf@mail.gmail.com>
Message-ID: <1260311371.17559.28.camel@idol>

ti, 2009-12-08 kello 14:23 -0600, Robert Kern kirjoitti:
[clip]
> I think the wording is overly strong. I don't think that we actually decided to deprecate the interface. PEP 3118 is not yet implemented by numpy, and the PEP 3118 API won't be available to Python's <2.6 (Cython's workarounds notwithstanding).
>
> Pauli, did we discuss this before you wrote that warning and I'm just not remembering it?

I think this came about as a result of some discussion. This, I believe:

http://thread.gmane.org/gmane.comp.python.numeric.general/27413

Yes, the warning is strongly worded -- especially as the support for PEP 3118 will not arrive before Numpy 1.5.0, and I don't see any reason why we would be removing support for the array interface. Perhaps Py3 is a different ball game, but even there, there is no real reason to remove the support, so I don't think we should do it.

-- Pauli Virtanen

From washakie at gmail.com Tue Dec 8 17:32:52 2009
From: washakie at gmail.com (John [H2O])
Date: Tue, 8 Dec 2009 14:32:52 -0800 (PST)
Subject: [Numpy-discussion] more recfunctions, structured array help
In-Reply-To: <1272CAB6-D715-488D-842F-6D7F5B2DA6B0@gmail.com>
References: <26700380.post@talk.nabble.com> <1272CAB6-D715-488D-842F-6D7F5B2DA6B0@gmail.com>
Message-ID: <26702019.post@talk.nabble.com>

Pierre GM-2 wrote:
>
> masked_where is a function that requires 2 arguments.
> If you try to mask a whole record, you can try something like
>>>> x = ma.array([('a',1),('b',2)],dtype=[('','|S1'),('',float)])
>>>> x[x['f0']=='a'] = ma.masked
> For an individual field, try something like
>>>> x['f1'][x['f1']=='b'] = ma.masked
>

Just some more detail, here's what I'm working on:

def mk_COarray(rD,datetimevec):
    """ rD is a previous record array, but I add the datetime vector """
    codata = np.column_stack((np.array(datetimevec),rD.lon,rD.lat,rD.elv,rD.co))
    codata = np.ma.array(codata)
    codata_masked = np.ma.masked_where(codata==-9999.,codata)
    codata = np.rec.fromrecords(codata_masked,names='datetime,lon,lat,elv,co')
    return codata, codata_masked

Plotting the arrays out of this:

In [128]: cd,cdm = mk_COarray(rD,datetimevec)
In [129]: plt.plot(cd.datetime,cd.co,label='raw'); plt.plot(cdm[:,0],cdm[:,4],label='masked')

I get the following image, where you can see that the codata which is created from the codata_masked seems to not be masked????

http://old.nabble.com/file/p26702019/example.png

-- View this message in context: http://old.nabble.com/more-recfunctions%2C-structured-array-help-tp26700380p26702019.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.

From pgmdevlist at gmail.com Tue Dec 8 17:50:26 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Tue, 8 Dec 2009 17:50:26 -0500
Subject: [Numpy-discussion] more recfunctions, structured array help
In-Reply-To: <26702019.post@talk.nabble.com>
References: <26700380.post@talk.nabble.com> <1272CAB6-D715-488D-842F-6D7F5B2DA6B0@gmail.com> <26702019.post@talk.nabble.com>
Message-ID: <39B11D6B-FFD9-4D00-A098-35BBE57852DB@gmail.com>

On Dec 8, 2009, at 5:32 PM, John [H2O] wrote:
> Pierre GM-2 wrote:
>> masked_where is a function that requires 2 arguments.
>> If you try to mask a whole record, you can try something like
>>>>> x = ma.array([('a',1),('b',2)],dtype=[('','|S1'),('',float)])
>>>>> x[x['f0']=='a'] = ma.masked
>> For an individual field, try something like
>>>>> x['f1'][x['f1']=='b'] = ma.masked
>
> Just some more detail, here's what I'm working on:

Did you check scikits.timeseries ? Might be a solution if you have data indexed in time.

> def mk_COarray(rD,datetimevec):
>     """ rD is a previous record array, but I add the datetime vector """
>     codata = np.column_stack((np.array(datetimevec),rD.lon,rD.lat,rD.elv,rD.co))
>     codata = np.ma.array(codata)
>     codata_masked = np.ma.masked_where(codata==-9999.,codata)
>     codata = np.rec.fromrecords(codata_masked,names='datetime,lon,lat,elv,co')
>     return codata, codata_masked

OK, I'm gonna have to guess again: codata is a regular ndarray, not structured ? Then you don't have to transform it into a masked array:

codata = ...
codata_masked = np.ma.masked_values(codata,-9999.)

Then you transform codata into a np.recarray... But why not transform codata_masked ?

It is hard to help you, because I don't know the actual structure you use. Once again, please give a self-contained example. The first two entries of codata would be enough.

> Plotting the arrays out of this:
> In [128]: cd,cdm = mk_COarray(rD,datetimevec)
> In [129]: plt.plot(cd.datetime,cd.co,label='raw'); plt.plot(cdm[:,0],cdm[:,4],label='masked')
>
> I get the following image, where you can see that the codata which is created from the codata_masked seems to not be masked????

Er... You can check whether codata_masked is masked by checking if some entries of its mask are True (codata_masked.mask.any()).
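For instance, with a self-contained toy array (made-up values, not your data):

    >>> import numpy.ma as ma
    >>> codata = ma.masked_values([1.0, -9999., 2.0], -9999.)
    >>> codata.mask.any()
    True
    >>> codata.mask
    array([False,  True, False], dtype=bool)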
Given your graph, I'd say yes, codata_masked is actually masked: see how you have a gap in your green curve when the blue one plummets into negative? That's likely where your elevation was -9999., I'd say.

From Chris.Barker at noaa.gov Tue Dec 8 18:41:15 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Tue, 08 Dec 2009 15:41:15 -0800
Subject: [Numpy-discussion] What protocol to use now?
In-Reply-To: <1260311371.17559.28.camel@idol>
References: <4B1E9C24.8030704@noaa.gov> <3d375d730912081223q22380a58u6941ee3593ee8caf@mail.gmail.com> <1260311371.17559.28.camel@idol>
Message-ID: <4B1EE41B.3080308@noaa.gov>

Pauli Virtanen wrote:
>> I think the wording is overly strong.

not just too strong, but actually wrong -- you can't target PEP 3118 -- numpy doesn't support it at all yet!

Current wording:

"""
Warning

This page describes the old, deprecated array interface. Everything still works as described as of numpy 1.2 and on into the foreseeable future, but new development should target PEP 3118 -- The Revised Buffer Protocol. PEP 3118 was incorporated into Python 2.6 and 3.0
...
"""

My suggested new wording:

"""
Warning

This page describes the current array interface. Everything still works as described as of numpy 1.4 and on into the foreseeable future. However, future versions of numpy will target PEP 3118 -- The Revised Buffer Protocol. PEP 3118 was incorporated into Python 2.6 and 3.0, and we hope to incorporate it into numpy 1.5
...
"""

-Chris

-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov

From cournape at gmail.com Tue Dec 8 18:56:51 2009
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 9 Dec 2009 08:56:51 +0900
Subject: [Numpy-discussion] numpy distutils breaks scipy install on mac
In-Reply-To: <3d375d730912081347l787db5b4q6d10db351d01ec4e@mail.gmail.com>
References: <4B1EC6E1.7040004@stsci.edu> <3d375d730912081347l787db5b4q6d10db351d01ec4e@mail.gmail.com>
Message-ID: <5b8d13220912081556o3e0c6d06j223fa74a1fbcee59@mail.gmail.com>

On Wed, Dec 9, 2009 at 6:47 AM, Robert Kern wrote:
>
> Right. The -arch flag was added by Apple to GCC and their patch really should be applied to all builds of GCC compilers for the Mac. It is deeply disappointing that Fink ignored this. The only Mac gfortran build that I can recommend is here:
>
> http://r.research.att.com/tools/
>
> _can_target() should be fixed to be more accurate, though, so if you find a patch that works for you, please let us know.

Damn, I thought I fixed all remaining issues by testing with a custom-built gfortran, but it seems that every gfortran variation likes to behave differently... I will install fink and fix _can_target accordingly.

David

From robert.kern at gmail.com Tue Dec 8 18:58:01 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 8 Dec 2009 17:58:01 -0600
Subject: [Numpy-discussion] What protocol to use now?
In-Reply-To: <4B1EE41B.3080308@noaa.gov>
References: <4B1E9C24.8030704@noaa.gov> <3d375d730912081223q22380a58u6941ee3593ee8caf@mail.gmail.com> <1260311371.17559.28.camel@idol> <4B1EE41B.3080308@noaa.gov>
Message-ID: <3d375d730912081558x22c1398rddab833e0002a4cd@mail.gmail.com>

On Tue, Dec 8, 2009 at 17:41, Christopher Barker wrote:
> Pauli Virtanen wrote:
>>> I think the wording is overly strong.
>
> not just too strong, but actually wrong -- you can't target PEP 3118 -- numpy doesn't support it at all yet!
> Current wording:
>
> """
> Warning
>
> This page describes the old, deprecated array interface. Everything still works as described as of numpy 1.2 and on into the foreseeable future, but new development should target PEP 3118 -- The Revised Buffer Protocol. PEP 3118 was incorporated into Python 2.6 and 3.0
> ...
> """
>
> My suggested new wording:
>
> """
> Warning
>
> This page describes the current array interface. Everything still works as described as of numpy 1.4 and on into the foreseeable future. However, future versions of numpy will target PEP 3118 -- The Revised Buffer Protocol. PEP 3118 was incorporated into Python 2.6 and 3.0, and we hope to incorporate it into numpy 1.5
> ...
> """

The Cython information is still nice. Also, it should not be a warning, just a note, since there is no impending deprecation to warn about.

-- Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From Chris.Barker at noaa.gov Tue Dec 8 19:09:00 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Tue, 08 Dec 2009 16:09:00 -0800
Subject: [Numpy-discussion] What protocol to use now?
In-Reply-To: <3d375d730912081558x22c1398rddab833e0002a4cd@mail.gmail.com>
References: <4B1E9C24.8030704@noaa.gov> <3d375d730912081223q22380a58u6941ee3593ee8caf@mail.gmail.com> <1260311371.17559.28.camel@idol> <4B1EE41B.3080308@noaa.gov> <3d375d730912081558x22c1398rddab833e0002a4cd@mail.gmail.com>
Message-ID: <4B1EEA9C.5000006@noaa.gov>

Robert Kern wrote:
> On Tue, Dec 8, 2009 at 17:41, Christopher Barker wrote:
>> My suggested new wording:
>>
>> """
>> Warning
>>
>> This page describes the current array interface. Everything still works as described as of numpy 1.4 and on into the foreseeable future. However, future versions of numpy will target PEP 3118 -- The Revised Buffer Protocol. PEP 3118 was incorporated into Python 2.6 and 3.0, and we hope to incorporate it into numpy 1.5
>> ...
>>
>> """
>
> The Cython information is still nice.

sure -- that's the "..." there ;-)

> Also, it should not be a warning, just a note, since there is no impending deprecation to warn about.

good point.

-Chris

-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov

From washakie at gmail.com Tue Dec 8 19:11:18 2009
From: washakie at gmail.com (John [H2O])
Date: Tue, 8 Dec 2009 16:11:18 -0800 (PST)
Subject: [Numpy-discussion] more recfunctions, structured array help
In-Reply-To: <39B11D6B-FFD9-4D00-A098-35BBE57852DB@gmail.com>
References: <26700380.post@talk.nabble.com> <1272CAB6-D715-488D-842F-6D7F5B2DA6B0@gmail.com> <26702019.post@talk.nabble.com> <39B11D6B-FFD9-4D00-A098-35BBE57852DB@gmail.com>
Message-ID: <26703152.post@talk.nabble.com>

Pierre GM-2 wrote:
>
> Did you check scikits.timeseries ? Might be a solution if you have data indexed in time
>
>> np.rec.fromrecords(codata_masked,names='datetime,lon,lat,elv,co')
>> return codata, codata_masked
>
> OK, I'm gonna have to guess again: codata is a regular ndarray, not structured ? Then you don't have to transform it into a masked array
> codata = ...
> codata_masked = np.ma.masked_values(codata,-9999.)
>
> Then you transform codata into a np.recarray... But why not transform codata_masked ?
> It is hard to help you, because I don't know the actual structure you use. Once again, please give a self-contained example. The first two entries of codata would be enough.
>
> Er... You can check whether codata_masked is masked by checking if some entries of its mask are True (codata_masked.mask.any()).
> Given your graph, I'd say yes, codata_masked is actually masked: see how you have a gap in your green curve when the blue one plummets into negative? That's likely where your elevation was -9999., I'd say.
> _______________________________________________

My apologies for adding confusion. In answer to your first question: yes, at one point I tried playing with scikits.timeseries... there were some issues at the time that prevented me from working with it, maybe I should revisit. But on to this problem...

First off, let me say, my biggest learning curve with numpy has been dealing with different types of data structures, and I find that I always am starting with something new. For instance, I am now somewhat comfortable with arrays, but not yet masked arrays, and then it seems record arrays are the most 'advanced' and quite practical, but it seems the handling of them is quite different from standard arrays. So I'm learning!

As best I can I'll provide a full example. Here's what I have:

def mk_COarray(rD,dtvector,mask=None):
    codata = np.column_stack((np.array(dtvector),rD.lon,rD.lat,rD.elv,rD.co))
    print type(codata)

    if mask is not None:
        codata_masked = np.ma.masked_where(codata==mask,codata,copy=False)
        # Create record array from codata_masked
    else:
        codata_masked = codata
    codata = np.rec.fromrecords(codata_masked,names='datetime,lon,lat,elv,co')
    # Note the above is just for debugging, and below I return masked and unmasked arrays
    return codata, codata_masked

In [162]: codata,codata_masked = mk_COarray(rD,dtvec,mask=-9999.)
In [163]: type(codata); type(codata_masked)
Out[163]:
Out[163]:
In [164]: codata[0]
Out[164]: (datetime.datetime(2008, 4, 6, 11, 38, 37, 760000), 20.327100000000002, 67.8215, 442.60000000000002, -9999.0)

In [165]: codata_masked[0]
Out[165]:
masked_array(data = [2008-04-06 11:38:37.760000 20.3271 67.8215 442.6 --],
             mask = [False False False False True],
       fill_value = ?)

So then, the plot above will be the same. codata is the blue line (codata_masked converted into a rec array), whereas for debugging, I also return codata_masked (and it is plotted green).

In my prior post I used the variables cd and cdm which refer to codata and codata_masked.

I know this isn't terribly clear, but hopefully enough so to let me know how to create a masked record array ;)

-john

-- View this message in context: http://old.nabble.com/more-recfunctions%2C-structured-array-help-tp26700380p26703152.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.

From washakie at gmail.com Tue Dec 8 19:27:56 2009
From: washakie at gmail.com (John [H2O])
Date: Tue, 8 Dec 2009 16:27:56 -0800 (PST)
Subject: [Numpy-discussion] more recfunctions, structured array help
In-Reply-To: <26703152.post@talk.nabble.com>
References: <26700380.post@talk.nabble.com> <1272CAB6-D715-488D-842F-6D7F5B2DA6B0@gmail.com> <26702019.post@talk.nabble.com> <39B11D6B-FFD9-4D00-A098-35BBE57852DB@gmail.com> <26703152.post@talk.nabble.com>
Message-ID: <26703314.post@talk.nabble.com>

Maybe I should add, I'm looking at this thread:
http://old.nabble.com/masked-record-arrays-td26237612.html

And, I guess I'm in the same situation as the OP there.
It's not clear to me, but as best I can tell I am working with structured arrays (that's what np.rec.fromrecords creates, no?).

Anyway, perhaps the simplest thing someone could do to help is to show how to create a masked structured array.

Thanks!

-- View this message in context: http://old.nabble.com/more-recfunctions%2C-structured-array-help-tp26700380p26703314.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.

From jakevdp at gmail.com Tue Dec 8 19:29:47 2009
From: jakevdp at gmail.com (Jake VanderPlas)
Date: Tue, 8 Dec 2009 16:29:47 -0800
Subject: [Numpy-discussion] Flattening an array
Message-ID: <58df6dc20912081629t5960d58dub66c4a5ae901e426@mail.gmail.com>

Hello,
I have a function -- call it f() -- which takes a length-N 1D numpy array as an argument, and returns a length-N 1D array. I want to pass it the data in an N-D array, and obtain the N-D array of the result. I've thought about wrapping it as such:

#python code:
from my_module import f  # takes a 1D array, raises an exception otherwise

def f_wrap(A):
    A_1D = A.ravel()
    B = f(A_1D)
    return B.reshape(A.shape)
#end code

I expect A to be contiguous in memory, but I don't know if it will be C_CONTIGUOUS or F_CONTIGUOUS. Is there a way to implement this such that 1) the data in the arrays A_1D and B are not copied (memory issues) and 2) the function f is only called once (speed issues)? The above implementation appears to copy data if A is fortran-ordered.

Thanks for the help
-Jake

From alan at ajackson.org Tue Dec 8 19:47:02 2009
From: alan at ajackson.org (alan at ajackson.org)
Date: Tue, 8 Dec 2009 18:47:02 -0600
Subject: [Numpy-discussion] Can't set an element of a subset of an array, , ,
Message-ID: <20091208184702.006235ef@ajackson.org>

Okay, I'm stuck. Why doesn't this work?
In [226]: mask Out[226]: array([False, False, False, ..., False, False, False], dtype=bool) In [227]: mask[data['horizon']==i] Out[227]: array([ True, True, False, False, True, False, False, False, False, True, False, False, False, True, False, False, False, False, False, True, False, False, False, False, False, False, True, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, True, False, True, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, True, True, False, False, False, True, False, True, False, False, True, True, False, False, False, False, False, False, False, False, False, True, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, True, False, False, False, False, False, True, False, False, False, True, False, True, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, True, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, True, False, False, True, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False], dtype=bool) In [228]: mask[data['horizon']==i][2] Out[228]: False In [229]: mask[data['horizon']==i][2] = True In [230]: mask[data['horizon']==i][2] Out[230]: False -- ----------------------------------------------------------------------- | Alan K. Jackson | To see a World in a Grain of Sand | | alan at ajackson.org | And a Heaven in a Wild Flower, | | www.ajackson.org | Hold Infinity in the palm of your hand | | Houston, Texas | And Eternity in an hour. - Blake | ----------------------------------------------------------------------- From robert.kern at gmail.com Tue Dec 8 19:52:26 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 8 Dec 2009 18:52:26 -0600 Subject: [Numpy-discussion] Can't set an element of a subset of an array, , , In-Reply-To: <20091208184702.006235ef@ajackson.org> References: <20091208184702.006235ef@ajackson.org> Message-ID: <3d375d730912081652y100e9206ve5b7f4bb0fd1d9ec@mail.gmail.com> 2009/12/8 : > Okay, I'm stuck. Why doesn't this work? > > In [226]: mask > Out[226]: array([False, False, False, ..., False, False, False], dtype=bool) > In [229]: mask[data['horizon']==i][2] = True mask[data['horizon']==i] creates a copy. mask[data['horizon']==i][2] assigns to the copy, which then gets thrown away. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco

From alan at ajackson.org Tue Dec 8 19:59:57 2009
From: alan at ajackson.org (alan at ajackson.org)
Date: Tue, 8 Dec 2009 18:59:57 -0600
Subject: [Numpy-discussion] Can't set an element of a subset of an array, , ,
In-Reply-To: <3d375d730912081652y100e9206ve5b7f4bb0fd1d9ec@mail.gmail.com>
References: <20091208184702.006235ef@ajackson.org> <3d375d730912081652y100e9206ve5b7f4bb0fd1d9ec@mail.gmail.com>
Message-ID: <20091208185957.278ff584@ajackson.org>

>2009/12/8 :
>> Okay, I'm stuck. Why doesn't this work?
>>
>> In [226]: mask
>> Out[226]: array([False, False, False, ..., False, False, False], dtype=bool)
>> In [229]: mask[data['horizon']==i][2] = True
>
>mask[data['horizon']==i] creates a copy. mask[data['horizon']==i][2]
>assigns to the copy, which then gets thrown away.
>

Bummer. That was such a nice way to reach inside the data structure.

--
-----------------------------------------------------------------------
| Alan K. Jackson            | To see a World in a Grain of Sand      |
| alan at ajackson.org          | And a Heaven in a Wild Flower,         |
| www.ajackson.org           | Hold Infinity in the palm of your hand |
| Houston, Texas             | And Eternity in an hour. - Blake       |
-----------------------------------------------------------------------

From eadrogue at gmx.net Tue Dec 8 20:32:25 2009
From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=)
Date: Wed, 9 Dec 2009 02:32:25 +0100
Subject: [Numpy-discussion] slices of structured arrays?
Message-ID: <20091209013225.GA15586@doriath.local>

Hello,

Here's a structured array with fields 'a','b' and 'c':

s=[(i,int) for i in 'abc']
t=np.zeros(1,s)

It has the form: array([(0, 0, 0)])

I was wondering if such an array can be accessed like an ordinary array (e.g., with a slice) in order to set multiple values at once.

t[0] does not access the first element of the array, instead it returns the whole array.

In [329]: t[0]
Out[329]: (1, 0, 0)

But this array is type np.void and does not support slices. t[0][0] returns the first element, but t[0][:2] fails with IndexError: invalid index.

Any suggestion?

Ernest

From pgmdevlist at gmail.com Tue Dec 8 21:07:30 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Tue, 8 Dec 2009 21:07:30 -0500
Subject: [Numpy-discussion] more recfunctions, structured array help
In-Reply-To: <26703314.post@talk.nabble.com>
References: <26700380.post@talk.nabble.com> <1272CAB6-D715-488D-842F-6D7F5B2DA6B0@gmail.com> <26702019.post@talk.nabble.com> <39B11D6B-FFD9-4D00-A098-35BBE57852DB@gmail.com> <26703152.post@talk.nabble.com> <26703314.post@talk.nabble.com>
Message-ID: <168093A5-141B-4F23-B607-51B8BE6BBD3C@gmail.com>

On Dec 8, 2009, at 7:27 PM, John [H2O] wrote:
> Maybe I should add, I'm looking at this thread:
> http://old.nabble.com/masked-record-arrays-td26237612.html
>
> And, I guess I'm in the same situation as the OP there. It's not clear to me, but as best I can tell I am working with structured arrays (that's what np.rec.fromrecords creates, no?).
>
> Anyway, perhaps the simplest thing someone could do to help is to show how to create a masked structured array.
>
> Thanks!

(Note to self: one of us all's gonna have to write some doc about that...)

A structured array is an ndarray with named fields. Like a standard ndarray, each item has a given size defined by the dtype. Unlike a standard ndarray, each item is composed of different sub-items whose types don't have to be homogeneous. Each item is a special numpy scalar called a numpy.void.
For example:

>>> x = np.array([('a',1),('b',2)],dtype=[('F0','|S1'),('F1',float)])

The first item, x[0], is composed of two fields, 'F0' and 'F1'. The first field is a single character, the second a float. Fields can be accessed for each item (like x[0]['F0']) or globally (like x['F0']). Note that this syntax is analogous to getting an item.

A recarray is just a structured ndarray with some overwritten methods, where the fields can also be accessed as attributes. Because it uses some overwritten __getattr__ and __setattr__, they tend not to be as efficient as standard structured ndarrays, but that's the price for convenience. To create a recarray, you can use the construction functions in np.records, or simply take a view of your structured array as a np.recarray. So, when you use np.rec.fromrecords, you get a recarray, which is a subclass of structured arrays. Each item of a np.recarray is a special object (np.record), which is a regular np.void that allows attribute-like access to fields.

Masked arrays are ndarrays that have a special mask attribute. Since 1.3, masked arrays support flexible dtypes (aka structured dtypes), and you can mask individual fields. If

>>> x = ma.array([('a',1), ('b',2)], dtype=[('F0','|S1'),('F1',float)])
>>> x['F0'][0] = ma.masked
>>> x
masked_array(data = [(--, 1.0) ('b', 2.0)],
             mask = [(True, False) (False, False)],
       fill_value = ('N', 1e+20),
            dtype = [('F0', '|S1'), ('F1', '<f8')])

From pgmdevlist at gmail.com (Pierre GM)
Subject: [Numpy-discussion] more recfunctions, structured array help
In-Reply-To: <26703152.post@talk.nabble.com>
References: <26700380.post@talk.nabble.com> <1272CAB6-D715-488D-842F-6D7F5B2DA6B0@gmail.com> <26702019.post@talk.nabble.com> <39B11D6B-FFD9-4D00-A098-35BBE57852DB@gmail.com> <26703152.post@talk.nabble.com>
Message-ID:

On Dec 8, 2009, at 7:11 PM, John [H2O] wrote:
> My apologies for adding confusion. In answer to your first question: yes, at one point I tried playing with scikits.timeseries... there were some issues at the time that prevented me from working with it, maybe I should revisit.

What kind of issues ?

> As best I can I'll provide a full example.

Except that you forgot to give a sample of the input arrays ;)

> Here's what I have:
>
> def mk_COarray(rD,dtvector,mask=None):
>     codata = np.column_stack((np.array(dtvector),rD.lon,rD.lat,rD.elv,rD.co))
>     print type(codata)
>
>     if mask is not None:
>         codata_masked = np.ma.masked_where(codata==mask,codata,copy=False)
>         # Create record array from codata_masked
>     else:
>         codata_masked = codata
>     codata = np.rec.fromrecords(codata_masked,names='datetime,lon,lat,elv,co')
>     # Note the above is just for debugging, and below I return masked and unmasked arrays
>     return codata, codata_masked
>
> In [162]: codata,codata_masked = mk_COarray(rD,dtvec,mask=-9999.)
> In [163]: type(codata); type(codata_masked)
> Out[163]:
> Out[163]:
> In [164]: codata[0]
> Out[164]: (datetime.datetime(2008, 4, 6, 11, 38, 37, 760000), 20.327100000000002, 67.8215, 442.60000000000002, -9999.0)
>
> In [165]: codata_masked[0]
> Out[165]:
> masked_array(data = [2008-04-06 11:38:37.760000 20.3271 67.8215 442.6 --],
>              mask = [False False False False True],
>        fill_value = ?)
>
> So then, the plot above will be the same. codata is the blue line (codata_masked converted into a rec array), whereas for debugging, I also return codata_masked (and it is plotted green).
>
> In my prior post I used the variables cd and cdm which refer to codata and codata_masked.
> I know this isn't terribly clear, but hopefully enough so to let me know how to create a masked record array ;)

Your structured ndarray:

>>> x = np.array([('aaa',1,2,30),('bbb',2,4,40)], dtype=[('f0',np.object),('f1',int),('f2',int),('f3',float)])

Make a MaskedRecords:

>>> x = x.view(np.ma.mrecords.mrecarray)

Mask whole records where field 'f3' > 30:

>>> x[x['f3'] > 30] = ma.masked

Mask the 'f1' field where it is equal to 1:

>>> x['f1'][x['f1']==1] = ma.masked

From washakie at gmail.com Tue Dec 8 21:32:44 2009
From: washakie at gmail.com (John [H2O])
Date: Tue, 8 Dec 2009 18:32:44 -0800 (PST)
Subject: [Numpy-discussion] more recfunctions, structured array help
In-Reply-To: <168093A5-141B-4F23-B607-51B8BE6BBD3C@gmail.com>
References: <26700380.post@talk.nabble.com> <1272CAB6-D715-488D-842F-6D7F5B2DA6B0@gmail.com> <26702019.post@talk.nabble.com> <39B11D6B-FFD9-4D00-A098-35BBE57852DB@gmail.com> <26703152.post@talk.nabble.com> <26703314.post@talk.nabble.com> <168093A5-141B-4F23-B607-51B8BE6BBD3C@gmail.com>
Message-ID: <26704233.post@talk.nabble.com>

Pierre GM-2 wrote:
>
> On Dec 8, 2009, at 7:27 PM, John [H2O] wrote:
>> Maybe I should add, I'm looking at this thread:
>> http://old.nabble.com/masked-record-arrays-td26237612.html
>>
>> And, I guess I'm in the same situation as the OP there. It's not clear to me, but as best I can tell I am working with structured arrays (that's what np.rec.fromrecords creates, no?).
>>
>> Anyway, perhaps the simplest thing someone could do to help is to show how to create a masked structured array.
>>
>> Thanks!
>
> (Note to self: one of us all's gonna have to write some doc about that...)

Pierre, Do you have access to the docs? For now, this is indeed very helpful. Thanks for the description. I would recommend just adding this at least as a note to the page:

http://docs.scipy.org/doc/numpy/user/basics.rec.html

Just a thought.... now I'm going to experiment and see if I can figure out how to pick and choose one data structure to work with! ;)

-- View this message in context: http://old.nabble.com/more-recfunctions%2C-structured-array-help-tp26700380p26704233.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.

From dwf at cs.toronto.edu Tue Dec 8 22:08:33 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Tue, 8 Dec 2009 22:08:33 -0500
Subject: [Numpy-discussion] Assigning complex values to a real array
In-Reply-To: <26698253.post@talk.nabble.com>
References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070130j5672a114m159a7b922931b205@mail.gmail.com> <3d375d730903070135u28fb4085x86d0d6139d2dd28f@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com> <26688453.post@talk.nabble.com> <26698253.post@talk.nabble.com>
Message-ID: <20091209030833.GA1170@rodimus>

On Tue, Dec 08, 2009 at 10:17:20AM -0800, Dr. Phillip M. Feldman wrote:
>
> David Warde-Farley-2 wrote:
> >
> > A less harmful solution (if a solution is warranted, which is for the Council of the Elders to decide) would be to treat the Python complex type as a special case, so that the .real attribute is accessed instead of trying to cast to float.
> >
>
> There are two even less harmful solutions: (1) Raise an exception.

This is not less harmful, since as I mentioned there is likely a lot of deployed code that is not expecting such exceptions.
If such a change were to take place it would have to take place over several versions, where warnings are issued for a while (probably at least one stable release) before the feature is removed. Esoteric handling of ambiguous assignments may not speed adoption of NumPy, but monumental shifts in basic behaviour without any warning will win us even fewer friends. The best thing to do is probably to file an enhancement ticket on the bugtracker so that the issue doesn't get lost/forgotten.

> (2) Provide the user with a top-level flag to control whether the attempt to downcast a NumPy complex to a float should be handled by raising an exception, by throwing away the imaginary part, or by taking the magnitude.

I'm not so sure that introducing more global state is looked fondly upon, but it'd be worth including this proposal in the ticket.

> P.S. As things stand now, I do not regard NumPy as a reliable platform for scientific computing.

One man's bug is another's feature, I guess. I rarely use complex numbers and when I do I simply avoid this situation.

David

From cycomanic at gmail.com  Tue Dec  8 22:37:45 2009
From: cycomanic at gmail.com (Jochen Schroeder)
Date: Wed, 9 Dec 2009 14:37:45 +1100
Subject: [Numpy-discussion] Assigning complex values to a real array
In-Reply-To: 
References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070130j5672a114m159a7b922931b205@mail.gmail.com> <3d375d730903070135u28fb4085x86d0d6139d2dd28f@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com> <26688453.post@talk.nabble.com>
Message-ID: <20091209033743.GA2172@cudos0803>

On 12/08/09 02:32, David Warde-Farley wrote:
> On 7-Dec-09, at 11:13 PM, Dr. Phillip M. Feldman wrote:
>
> > Example #1:
> > IPython 0.10 [on Py 2.5.4]
> > [~]|1> z = zeros(3)
> > [~]|2> z[0] = 1+1J
> >
> > TypeError: can't convert complex to float; use abs(z)
>
> The problem is that you're using Python's built-in complex type, and it responds to type coercion differently than NumPy types do. Calling float() on a Python complex will raise the exception. Calling float() on (for example) a numpy.complex64 will not. Notice what happens here:
>
> In [14]: z = zeros(3)
>
> In [15]: z[0] = complex64(1+1j)
>
> In [16]: z[0]
> Out[16]: 1.0
>
> > Example #2:
> >
> > ### START OF CODE ###
> > from numpy import *
> > q = ones(2,dtype=complex)*(1 + 1J)
> > r = zeros(2,dtype=float)
> > r[:] = q
> > print 'q = ',q
> > print 'r = ',r
> > ### END OF CODE ###
>
> Here, both operands are NumPy arrays. NumPy is in complete control of the situation, and it's well documented what it will do.
>
> I do agree that the behaviour in example #1 is mildly inconsistent, but such is the way with NumPy vs. Python scalars. They are mostly transparently intermingled, except when they're not.
>
> > At a minimum, this inconsistency needs to be cleared up. My preference would be that the programmer should have to explicitly downcast from complex to float, and that if he/she fails to do this, that an exception be triggered.
>
> That would most likely break a *lot* of deployed code that depends on the implicit downcast behaviour. A less harmful solution (if a solution is warranted, which is for the Council of the Elders to decide) would be to treat the Python complex type as a special case, so that the .real attribute is accessed instead of trying to cast to float.
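A rough sketch of the special casing being proposed (hypothetical helper, not the actual NumPy code path):

    def coerce_to_float(value):
        # hypothetical: take .real for a Python complex instead of calling
        # float(), which raises "can't convert complex to float"
        if isinstance(value, complex):
            return value.real
        return float(value)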
I'm not sure how much code is actually relying on the implicit downcast, but I'd argue that it's bad programming anyways. It is really difficult to spot if you're reviewing someone else's code. As others mentioned it's also a bitch to track down a bug that has been accidentally introduced by this behaviour.

Jochen

From josef.pktd at gmail.com  Wed Dec  9 00:45:32 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 9 Dec 2009 00:45:32 -0500
Subject: [Numpy-discussion] slices of structured arrays?
In-Reply-To: <20091209013225.GA15586@doriath.local>
References: <20091209013225.GA15586@doriath.local>
Message-ID: <1cd32cbb0912082145k49064cb7o8c9d97cf9ac5dcc0@mail.gmail.com>

2009/12/8 Ernest Adrogué:
> Hello,
>
> Here's a structured array with fields 'a','b' and 'c':
>
> s = [(i,int) for i in 'abc']
> t = np.zeros(1,s)
>
> It has the form: array([(0, 0, 0)])
>
> I was wondering if such an array can be accessed like an ordinary array (e.g., with a slice) in order to set multiple values at once.
>
> t[0] does not access the first element of the array, instead it returns the whole array.
>
> In [329]: t[0]
> Out[329]: (1, 0, 0)
>
> But this array is type np.void and does not support slices. t[0][0] returns the first element, but t[0][:2] fails with IndexError: invalid index.
>
> Any suggestion?

as long as all numbers are of the same type, you can create a view that behaves (mostly) like a regular array

>>> t0 = np.arange(12).reshape(-1,3)
>>> t0
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])
>>> t0.dtype = s
>>> t0
array([[(0, 1, 2)],
       [(3, 4, 5)],
       [(6, 7, 8)],
       [(9, 10, 11)]],
      dtype=[('a', '<i4'), ('b', '<i4'), ('c', '<i4')])
>>> t0.view(int)
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])
>>> t0.view(int)[3]
array([ 9, 10, 11])
>>> t0.view(int)[3,1:]
array([10, 11])

structured arrays treat all parts of the dtype as a single array element, your t[0] returns the first row/element corresponding to s

>>> t0.shape
(4, 1)
>>> t1 = np.arange(12)
>>> t1.dtype = s
>>> t1
array([(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11)],
      dtype=[('a', '<i4'), ('b', '<i4'), ('c', '<i4')])
>>> t1.shape
(4,)
>>> t1.view(int)
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
>>> t1.view(int).reshape(-1,3)
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])
>>> t1.view(int).reshape(-1,3)[2,2:]
array([8])
>>> t1.view(int).reshape(-1,3)[2,1:]
array([7, 8])

as long as there is no indexing that makes a copy, you can still change the original array by changing the view

>>> t1.view(int).reshape(-1,3)[2,1:] = 0
>>> t1
array([(0, 1, 2), (3, 4, 5), (6, 0, 0), (9, 10, 11)],
      dtype=[('a', '<i4'), ('b', '<i4'), ('c', '<i4')])

> Ernest

From josef.pktd at gmail.com  Wed Dec  9 00:55:06 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 9 Dec 2009 00:55:06 -0500
Subject: [Numpy-discussion] Flattening an array
In-Reply-To: <58df6dc20912081629t5960d58dub66c4a5ae901e426@mail.gmail.com>
References: <58df6dc20912081629t5960d58dub66c4a5ae901e426@mail.gmail.com>
Message-ID: <1cd32cbb0912082155r2a19b11ald91d228b54132bb6@mail.gmail.com>

On Tue, Dec 8, 2009 at 7:29 PM, Jake VanderPlas wrote:
> Hello,
> I have a function -- call it f() -- which takes a length-N 1D numpy array as an argument, and returns a length-N 1D array.
> I want to pass it the data in an N-D array, and obtain the N-D array of the result.
> I've thought about wrapping it as such:
>
> # python code:
> from my_module import f
> # takes a 1D array, raises an exception otherwise
> def f_wrap(A):
>     A_1D = A.ravel()
>     B = f(A_1D)
>     return B.reshape(A.shape)
> # end code
>
> I expect A to be contiguous in memory, but I don't know if it will be C_CONTIGUOUS or F_CONTIGUOUS. Is there a way to implement this such that
>  1) the data in the arrays A and B_1D are not copied (memory issues)
>  2) the function f is only called once (speed issues)?
> The above implementation appears to copy data if A is fortran-ordered.

maybe one way is to check the flags, and conditional on C or F, use the corresponding order in numpy.ravel(a, order='C') ?

if A.flags.c_contiguous:
    ...
elif A.flags.f_contiguous:
    ...

not tried, and I don't know what the right condition for `is_c_contiguous` is

Josef

> Thanks for the help
>   -Jake

From pfeldman at verizon.net  Wed Dec  9 01:08:52 2009
From: pfeldman at verizon.net (Dr. Phillip M. Feldman)
Date: Tue, 8 Dec 2009 22:08:52 -0800 (PST)
Subject: [Numpy-discussion] Assigning complex values to a real array
In-Reply-To: <9457e7c80903070629t282fc492u55aba87a2ed8b8d3@mail.gmail.com>
References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070130j5672a114m159a7b922931b205@mail.gmail.com> <3d375d730903070135u28fb4085x86d0d6139d2dd28f@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com> <9457e7c80903070629t282fc492u55aba87a2ed8b8d3@mail.gmail.com>
Message-ID: <26705632.post@talk.nabble.com>

Stéfan van der Walt wrote:
>
> Would it be possible to, optionally, throw an exception?
>
> S.

I'm certain that it is possible. And, I believe that if this option is selected via a Python flag, the run-time performance implications should be nil. I wonder if there is some way of taking a vote to see how many people would like such an option.

From pfeldman at verizon.net  Wed Dec  9 01:26:16 2009
From: pfeldman at verizon.net (Dr. Phillip M. Feldman)
Date: Tue, 8 Dec 2009 22:26:16 -0800 (PST)
Subject: [Numpy-discussion] Assigning complex values to a real array
In-Reply-To: 
References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070130j5672a114m159a7b922931b205@mail.gmail.com> <3d375d730903070135u28fb4085x86d0d6139d2dd28f@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com>
Message-ID: <26705737.post@talk.nabble.com>

Darren Dale wrote:
>
> On Sat, Mar 7, 2009 at 5:18 AM, Robert Kern wrote:
>> On Sat, Mar 7, 2009 at 04:10, Stéfan van der Walt wrote:
>>> 2009/3/7 Robert Kern:
>>>> In [5]: z = zeros(3, int)
>>>>
>>>> In [6]: z[1] = 1.5
>>>>
>>>> In [7]: z
>>>> Out[7]: array([0, 1, 0])
>>>
>>> Blind moment, sorry. So, what is your take -- should this kind of thing pass silently?
>>
>> Downcasting data is a necessary operation sometimes. We explicitly made a choice a long time ago to allow this.
> In that case, do you know why this raises an exception:
>
> np.int64(10+20j)
>
> Darren

I think that you have a good point, Darren, and that Robert is oversimplifying the situation. NumPy and Python are somewhat out of step. The NumPy approach is stricter and more likely to catch errors than Python. Python tends to be somewhat laissez-faire about numerical errors and the correctness of results.

Unfortunately, NumPy seems to be a sort of step-child of Python, tolerated, but not fully accepted. There are a number of people who continue to use Matlab, despite all of its deficiencies, because it can at least be counted on to produce correct answers most of the time.

Dr. Phillip M. Feldman

From peridot.faceted at gmail.com  Wed Dec  9 02:41:55 2009
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Wed, 9 Dec 2009 02:41:55 -0500
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To: <3d375d730912081336l24dc9e7ao88469e7777a14f9d@mail.gmail.com>
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <4B1E6C7B.7060807@gmail.com> <5b8d13220912080721q34b81382k2e53d1b605496c4@mail.gmail.com> <1260287829.18562.50.camel@talisman> <5b8d13220912080812l75c150b6u6024513730ae013b@mail.gmail.com> <3d375d730912080821m6ea32dc2w62cfa1b58808f939@mail.gmail.com> <5b8d13220912080941r3d028451ka3b0b38dd392b203@mail.gmail.com> <3d375d730912080954m5758cf00l14666a81848f1b78@mail.gmail.com> <13D4BCC7-C9CB-408F-956A-D4A2C450C30A@gmail.com> <3d375d730912081336l24dc9e7ao88469e7777a14f9d@mail.gmail.com>
Message-ID: 

2009/12/8 Robert Kern:
> On Tue, Dec 8, 2009 at 15:25, Pierre GM wrote:
>> On Dec 8, 2009, at 12:54 PM, Robert Kern wrote:
>>>
>>> As far as I can tell, the faulty global seterr() has been in place since 1.1.0, so fixing it at all should be considered a feature change. It's not likely to actually *break* things except for doctests and documentation. I think I fall in with Chuck in suggesting that it should be changed in 1.5.0.
>>
>> OK. I'll work on fixing the remaining issues when a np function is applied on a masked array.
>>
>> FYI. most of the warnings can be fixed in _MaskedUnaryOperation and consorts with:
>>
>>        err_status_ini = np.geterr()
>>        np.seterr(divide='ignore', invalid='ignore')
>>        result = self.f(da, db, *args, **kwargs)
>>        np.seterr(**err_status_ini)
>>
>> Is this kind of fix acceptable ?
>
>  olderr = np.seterr(divide='ignore', invalid='ignore')
>  try:
>    result = self.f(da, db, *args, **kwargs)
>  finally:
>    np.seterr(**olderr)

Doesn't this risk a ctrl-C after the seterr but before the try? How about:

olderr = np.geterr()
try:
    np.seterr(....)
    do_whatever()
finally:
    np.seterr(**olderr)

(I guess this would be why context managers were invented...)
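For reference, np.errstate already wraps this geterr/try/finally dance in a context manager, if I remember right (the with statement needs Python >= 2.5); a minimal sketch:

from __future__ import with_statement  # only needed on Python 2.5
import numpy as np

# the old error state is saved on entry and restored on exit,
# even if the body raises
with np.errstate(divide='ignore', invalid='ignore'):
    result = np.array([1.0, 2.0]) / np.array([0.0, 2.0])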
Anne

From pav at iki.fi  Wed Dec  9 04:04:54 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Wed, 09 Dec 2009 11:04:54 +0200
Subject: [Numpy-discussion] Assigning complex values to a real array
In-Reply-To: <26705737.post@talk.nabble.com>
References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070130j5672a114m159a7b922931b205@mail.gmail.com> <3d375d730903070135u28fb4085x86d0d6139d2dd28f@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com> <26705737.post@talk.nabble.com>
Message-ID: <1260349494.18562.120.camel@talisman>

On Tue, 2009-12-08 at 22:26 -0800, Dr. Phillip M. Feldman wrote:
> Darren Dale wrote:
> > On Sat, Mar 7, 2009 at 5:18 AM, Robert Kern wrote:
> >> On Sat, Mar 7, 2009 at 04:10, Stéfan van der Walt wrote:
> >> > 2009/3/7 Robert Kern:
> >> >> In [5]: z = zeros(3, int)
> >> >>
> >> >> In [6]: z[1] = 1.5
> >> >>
> >> >> In [7]: z
> >> >> Out[7]: array([0, 1, 0])
> >> >
> >> > Blind moment, sorry. So, what is your take -- should this kind of thing pass silently?
> >>
> >> Downcasting data is a necessary operation sometimes. We explicitly made a choice a long time ago to allow this.

I'd think that downcasting is different from dropping the imaginary part. Also, I doubt a bit that there is a large body of correct code relying on the implicit behavior. This kind of assertion should of course be checked experimentally -- make the complex downcast an error, and check a few prominent software packages.

An alternative to an exception would be to make complex numbers with nonzero imaginary parts cast to *nan*. This would, however, likely lead to errors difficult to track.

Another alternative would be to raise an error only if the imaginary part is non-zero. This requires some additional checking in some places where no checking is usually made.

At least I tend to use .real or real() to explicitly take the real part. In interactive use, it occasionally is convenient to have the real part taken "automatically", but sometimes this leads to problems inside Matplotlib.

Nevertheless, I can't really regard dropping the imaginary part a significant issue. I've sometimes bumped into problems because of it, and it would have been nice to catch them earlier, though. (As an example, scipy.interpolate.interp1d some time ago silently dropped the imaginary part -- not nice.)

-- 
Pauli Virtanen

From dagss at student.matnat.uio.no  Wed Dec  9 04:18:57 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Wed, 09 Dec 2009 10:18:57 +0100
Subject: [Numpy-discussion] Assigning complex values to a real array
In-Reply-To: <1260349494.18562.120.camel@talisman>
References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070130j5672a114m159a7b922931b205@mail.gmail.com> <3d375d730903070135u28fb4085x86d0d6139d2dd28f@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com> <26705737.post@talk.nabble.com> <1260349494.18562.120.camel@talisman>
Message-ID: <4B1F6B81.5010000@student.matnat.uio.no>

Pauli Virtanen wrote:
> On Tue, 2009-12-08 at 22:26 -0800, Dr. Phillip M.
> Feldman wrote:
>
>> Darren Dale wrote:
>>
>>> On Sat, Mar 7, 2009 at 5:18 AM, Robert Kern wrote:
>>>
>>>> On Sat, Mar 7, 2009 at 04:10, Stéfan van der Walt wrote:
>>>>
>>>>> 2009/3/7 Robert Kern:
>>>>>
>>>>>> In [5]: z = zeros(3, int)
>>>>>>
>>>>>> In [6]: z[1] = 1.5
>>>>>>
>>>>>> In [7]: z
>>>>>> Out[7]: array([0, 1, 0])
>>>>>>
>>>>> Blind moment, sorry. So, what is your take -- should this kind of thing pass silently?
>>>>>
>>>> Downcasting data is a necessary operation sometimes. We explicitly made a choice a long time ago to allow this.
>
> I'd think that downcasting is different from dropping the imaginary part. Also, I doubt a bit that there is a large body of correct code relying on the implicit behavior. This kind of assertion should of course be checked experimentally -- make the complex downcast an error, and check a few prominent software packages.
>
> An alternative to an exception would be to make complex numbers with nonzero imaginary parts cast to *nan*. This would, however, likely lead to errors difficult to track.
>
> Another alternative would be to raise an error only if the imaginary part is non-zero. This requires some additional checking in some places where no checking is usually made.
>
> At least I tend to use .real or real() to explicitly take the real part. In interactive use, it occasionally is convenient to have the real part taken "automatically", but sometimes this leads to problems inside Matplotlib.
>
> Nevertheless, I can't really regard dropping the imaginary part a significant issue. I've sometimes bumped into problems because of it, and it would have been nice to catch them earlier, though. (As an example, scipy.interpolate.interp1d some time ago silently dropped the imaginary part -- not nice.)

FWIW, +1. And if nothing is done, there should at least be big fat red warnings prominently in the documentation. (Knowing that imaginary parts can in some situations be dropped without warning makes me rather uneasy... at least now I know to look out for it and double check the dtypes on the lhs.)

Dag Sverre

From dwf at cs.toronto.edu  Wed Dec  9 04:51:52 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Wed, 9 Dec 2009 04:51:52 -0500
Subject: [Numpy-discussion] Assigning complex values to a real array
In-Reply-To: <26705737.post@talk.nabble.com>
References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070130j5672a114m159a7b922931b205@mail.gmail.com> <3d375d730903070135u28fb4085x86d0d6139d2dd28f@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com> <26705737.post@talk.nabble.com>
Message-ID: <521ACC3D-30AD-4ACB-830B-2C87A03F186A@cs.toronto.edu>

On 9-Dec-09, at 1:26 AM, Dr. Phillip M. Feldman wrote:

> NumPy and Python are somewhat out of step. The NumPy approach is stricter and more likely to catch errors than Python. Python tends to be somewhat laissez-faire about numerical errors and the correctness of results.

The bug/quirk you've been complaining about seems to suggest just the opposite. Darren's error is being raised because it is a Python complex being coerced. np.int64(np.complex64(10+20j)) raises no such error.

> Unfortunately, NumPy seems to be a sort of step-child of Python, tolerated, but not fully accepted.
> There are a number of people who continue to use Matlab, despite all of its deficiencies, because it can at least be counted on to produce correct answers most of the time.

Except that you could never fully verify that it produces correct results, even if that was your desire.

There are legitimate reasons for wanting to use Matlab (e.g. familiarity, because collaborators do, and for certain things it's still faster than the alternatives) but correctness of results isn't one of them. That said, people routinely let price tags influence their perceptions of worth.

David

From faltet at pytables.org  Wed Dec  9 05:06:36 2009
From: faltet at pytables.org (Francesc Alted)
Date: Wed, 9 Dec 2009 11:06:36 +0100
Subject: [Numpy-discussion] Bytes vs. Unicode in Python3
In-Reply-To: <200912061147.23728.faltet@pytables.org>
References: <1259276898.8494.18.camel@idol> <4B1A3317.5060406@student.matnat.uio.no> <200912061147.23728.faltet@pytables.org>
Message-ID: <200912091106.36462.faltet@pytables.org>

On Sunday 06 December 2009 11:47:23, Francesc Alted wrote:
> On Saturday 05 December 2009 11:16:55, Dag Sverre Seljebotn wrote:
> > > In [19]: t = np.dtype("i4,f4")
> > >
> > > In [20]: t
> > > Out[20]: dtype([('f0', '<i4'), ('f1', '<f4')])
> > >
> > > In [21]: hash(t)
> > > Out[21]: -9041335829180134223
> > >
> > > In [22]: t.names = ('one', 'other')
> > >
> > > In [23]: t
> > > Out[23]: dtype([('one', '<i4'), ('other', '<f4')])
> > >
> > > In [24]: hash(t)
> > > Out[24]: 8637734220020415106
> > >
> > > Perhaps this should be marked as a bug? I'm not sure about that, because the above seems quite useful.
> >
> > Well, I for one don't like this, but that's just an opinion. I think it is unwise to leave an object which supports hash() mutable, because it's too easy to make hard-to-find bugs (sticking a dtype as a key in a dict is rather useful in many situations). There's a certain tradition in Python for leaving types immutable if possible, and dtype certainly feels like it.
>
> Yes, I think you are right, and forcing dtype to be immutable would be best.

I've filed a ticket so that we don't lose track of this:

http://projects.scipy.org/numpy/ticket/1321

-- 
Francesc Alted

From eadrogue at gmx.net  Wed Dec  9 08:41:52 2009
From: eadrogue at gmx.net (Ernest Adrogué)
Date: Wed, 9 Dec 2009 14:41:52 +0100
Subject: [Numpy-discussion] slices of structured arrays?
In-Reply-To: <1cd32cbb0912082145k49064cb7o8c9d97cf9ac5dcc0@mail.gmail.com>
References: <20091209013225.GA15586@doriath.local> <1cd32cbb0912082145k49064cb7o8c9d97cf9ac5dcc0@mail.gmail.com>
Message-ID: <20091209134152.GA16268@doriath.local>

9/12/09 @ 00:45 (-0500), thus spake josef.pktd at gmail.com:
> as long as all numbers are of the same type, you can create a view that behaves (mostly) like a regular array
> [...]

Thanks Josef. Great explanation. It's all clear now.

Ernest

From mdroe at stsci.edu  Wed Dec  9 09:04:05 2009
From: mdroe at stsci.edu (Michael Droettboom)
Date: Wed, 09 Dec 2009 09:04:05 -0500
Subject: [Numpy-discussion] doctest improvements patch (and possible regressions)
In-Reply-To: 
References: <4B1EB148.1010607@gmail.com>
Message-ID: <4B1FAE55.9070605@stsci.edu>

Paul Ivanov wrote:
> I marked up suspicious differences with XXX, since I don't know if they're significant. In particular:
> - shortening a defchararray by strip does not change its dtype to a shorter one (apparently it used to?)

Yes. The new behavior is to return a string array with the same itemsize as the input array.
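E.g., a quick illustration of the new behaviour (my own example, not from the patch):

>>> import numpy as np
>>> a = np.array(['ab  ', 'c   '])
>>> np.char.strip(a)
array(['ab', 'c'], dtype='|S4')

The trailing whitespace is stripped, but the result keeps the 4-character itemsize of the input instead of shrinking to '|S2'.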
That's primarily just the result of the new implementation rather than a thought-out change, though.

Sorry, just commenting on the parts I feel competent in :) But I think this is a great improvement. It would be nice to start doing doctests as a matter of course to keep the docs accurate.

Mike

-- 
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA

From rmay31 at gmail.com  Wed Dec  9 09:46:41 2009
From: rmay31 at gmail.com (Ryan May)
Date: Wed, 9 Dec 2009 08:46:41 -0600
Subject: [Numpy-discussion] Assigning complex values to a real array
In-Reply-To: <521ACC3D-30AD-4ACB-830B-2C87A03F186A@cs.toronto.edu>
References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070130j5672a114m159a7b922931b205@mail.gmail.com> <3d375d730903070135u28fb4085x86d0d6139d2dd28f@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com> <26705737.post@talk.nabble.com> <521ACC3D-30AD-4ACB-830B-2C87A03F186A@cs.toronto.edu>
Message-ID: 

On Wed, Dec 9, 2009 at 3:51 AM, David Warde-Farley wrote:
> On 9-Dec-09, at 1:26 AM, Dr. Phillip M. Feldman wrote:
>> Unfortunately, NumPy seems to be a sort of step-child of Python, tolerated, but not fully accepted. There are a number of people who continue to use Matlab, despite all of its deficiencies, because it can at least be counted on to produce correct answers most of the time.
>
> Except that you could never fully verify that it produces correct results, even if that was your desire.
>
> There are legitimate reasons for wanting to use Matlab (e.g. familiarity, because collaborators do, and for certain things it's still faster than the alternatives) but correctness of results isn't one of them. That said, people routinely let price tags influence their perceptions of worth.

While I'm not going to argue in favor of Matlab, and think its benefits are being over-stated, let's call a spade a spade. Silent downcasting of complex types to float is a *wart*. It's not sensible behavior, it's an implementation detail that smacks new users in the face. It's completely insensible to consider converting from complex to float in the same vein as a simple loss of precision from 64-bit to 32-bit. The following doesn't work:

a = np.array(['bob', 'sarah'])
b = np.arange(2.)
b[:] = a
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

/home/rmay/<ipython console> in <module>()

ValueError: invalid literal for float(): bob

Why doesn't that silently downcast the strings to 0.0 or something silly? Because that would be *stupid*. So why doesn't trying to stuff 3+4j into the array get the same error, because 3+4j is definitely not a float value either.

Ryan

-- 
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma

From rsalvador.wk at gmail.com  Wed Dec  9 09:56:24 2009
From: rsalvador.wk at gmail.com (Ruben Salvador)
Date: Wed, 9 Dec 2009 15:56:24 +0100
Subject: [Numpy-discussion] Fixed-point arithmetic... any solution yet?
Message-ID: <4fe028e30912090656t5b82b5cfif865d60d07fe23ec@mail.gmail.com>

Hello everybody.

I've seen this question arise sometimes on the list, but don't know if something has "happened" yet or not. I mean, is any solution feasible to use more or less right out of the box?
I'm just a hardware engineer, so it would be difficult for me to create my own class for this, since my knowledge of python/numpy is very limited, and I just don't have the time/knowledge to be more than a simple user of the language, not a developer.

I have just come across this: http://www.dilloneng.com/documents/downloads/demodel/ but haven't used it yet. I'll give it a try and see how it works and come back to the list to report somehow. But, are there any "official" plans for this within the numpy developers? Is there any code around that may be used? I just need to test my code with fixed point arithmetic (I'm modelling hardware....)

Thanks for the good work to all the Python/Numpy developers (and all the projects related, matplotlib and so on....) and for the possibility of freeing from matlab!!! I'm determined to do research with as many free software design tools as possible.... though this fixed-point arithmetic issue is still a chain!

Regards!

From josef.pktd at gmail.com  Wed Dec  9 10:20:50 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 9 Dec 2009 10:20:50 -0500
Subject: [Numpy-discussion] Assigning complex values to a real array
In-Reply-To: 
References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070130j5672a114m159a7b922931b205@mail.gmail.com> <3d375d730903070135u28fb4085x86d0d6139d2dd28f@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com> <26705737.post@talk.nabble.com> <521ACC3D-30AD-4ACB-830B-2C87A03F186A@cs.toronto.edu>
Message-ID: <1cd32cbb0912090720r5638577l36e636e66d8fc4a7@mail.gmail.com>

On Wed, Dec 9, 2009 at 9:46 AM, Ryan May wrote:
> On Wed, Dec 9, 2009 at 3:51 AM, David Warde-Farley wrote:
>> On 9-Dec-09, at 1:26 AM, Dr. Phillip M. Feldman wrote:
>>> Unfortunately, NumPy seems to be a sort of step-child of Python, tolerated, but not fully accepted. There are a number of people who continue to use Matlab, despite all of its deficiencies, because it can at least be counted on to produce correct answers most of the time.
>>
>> Except that you could never fully verify that it produces correct results, even if that was your desire.
>>
>> There are legitimate reasons for wanting to use Matlab (e.g. familiarity, because collaborators do, and for certain things it's still faster than the alternatives) but correctness of results isn't one of them. That said, people routinely let price tags influence their perceptions of worth.
>
> While I'm not going to argue in favor of Matlab, and think its benefits are being over-stated, let's call a spade a spade. Silent downcasting of complex types to float is a *wart*. It's not sensible behavior, it's an implementation detail that smacks new users in the face. It's completely insensible to consider converting from complex to float in the same vein as a simple loss of precision from 64-bit to 32-bit. The following doesn't work:
>
> a = np.array(['bob', 'sarah'])
> b = np.arange(2.)
> b[:] = a
> ---------------------------------------------------------------------------
> ValueError                                Traceback (most recent call last)
>
> /home/rmay/<ipython console> in <module>()
>
> ValueError: invalid literal for float(): bob
>
> Why doesn't that silently downcast the strings to 0.0 or something silly? Because that would be *stupid*.
> So why doesn't trying to stuff 3+4j into the array get the same error, because 3+4j is definitely not a float value either.

Real numbers are a special case of complex, so I think the integer/float analogy is better. Numpy requires quite a bit more learning than programs like matlab and gauss with a more rigid type structure. And numpy has quite a few issues with "Is this a bug or a feature". numpy downcasting looks pretty consistent (for most parts) and it's just one more thing to keep in mind, like integer division and integer powers.

Instead of requiring numpy to emit hundreds of warnings, I think it's better to properly unit test the code. For example, inspection and a test case showed pretty quickly that the way I tried to use scipy.integrate.quad with complex numbers didn't return the correct complex answer but only the correct real part.

Compared to some questionable behavior with views and rearranging the axes with fancy indexing, I think the casting problem is easy to keep track of.

Maybe we should start to collect these warts for a numpy 3000.

Josef

> Ryan
>
> --
> Ryan May
> Graduate Research Assistant
> School of Meteorology
> University of Oklahoma

From charlesr.harris at gmail.com  Wed Dec  9 10:50:41 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 9 Dec 2009 08:50:41 -0700
Subject: [Numpy-discussion] Assigning complex values to a real array
In-Reply-To: <1260349494.18562.120.camel@talisman>
References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070130j5672a114m159a7b922931b205@mail.gmail.com> <3d375d730903070135u28fb4085x86d0d6139d2dd28f@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com> <26705737.post@talk.nabble.com> <1260349494.18562.120.camel@talisman>
Message-ID: 

On Wed, Dec 9, 2009 at 2:04 AM, Pauli Virtanen wrote:

> On Tue, 2009-12-08 at 22:26 -0800, Dr. Phillip M. Feldman wrote:
> > Darren Dale wrote:
> > > On Sat, Mar 7, 2009 at 5:18 AM, Robert Kern wrote:
> > >> On Sat, Mar 7, 2009 at 04:10, Stéfan van der Walt wrote:
> > >> > 2009/3/7 Robert Kern:
> > >> >> In [5]: z = zeros(3, int)
> > >> >>
> > >> >> In [6]: z[1] = 1.5
> > >> >>
> > >> >> In [7]: z
> > >> >> Out[7]: array([0, 1, 0])
> > >> >
> > >> > Blind moment, sorry. So, what is your take -- should this kind of thing pass silently?
> > >>
> > >> Downcasting data is a necessary operation sometimes. We explicitly made a choice a long time ago to allow this.
>
> I'd think that downcasting is different from dropping the imaginary part. Also, I doubt a bit that there is a large body of correct code relying on the implicit behavior. This kind of assertion should of course be checked experimentally -- make the complex downcast an error, and check a few prominent software packages.
>
> An alternative to an exception would be to make complex numbers with nonzero imaginary parts cast to *nan*. This would, however, likely lead to errors difficult to track.
>
> Another alternative would be to raise an error only if the imaginary part is non-zero. This requires some additional checking in some places where no checking is usually made.
>
> At least I tend to use .real or real() to explicitly take the real part.
> In interactive use, it occasionally is convenient to have the real part taken "automatically", but sometimes this leads to problems inside Matplotlib.
>
> Nevertheless, I can't really regard dropping the imaginary part a significant issue. I've sometimes bumped into problems because of it, and it would have been nice to catch them earlier, though. (As an example, scipy.interpolate.interp1d some time ago silently dropped the imaginary part -- not nice.)

It looks like a lot of folks have written or met buggy code at one point or another because of this behaviour. And finding the problem is a hassle because it doesn't stand out. I think that makes it a candidate for a warning, simply because we want to help people write correct code, and that is a significant issue. So +1 for raising a warning in this case.

I feel the same about silently casting floats to integers, although that doesn't feel quite as strange because one at least expects the result to be close to the original. I think the boundaries are

unsigned discrete <- signed discrete <- real line <- complex plane

The different kinds all have different domains, and crossing the boundaries between kinds should only happen if it is the clear intent of the programmer. Because python types don't match up 1-1 with numpy types this can be tricky to enforce, but I think it is worthwhile to keep in mind.

Chuck

From lciti at essex.ac.uk  Wed Dec  9 11:21:25 2009
From: lciti at essex.ac.uk (Citi, Luca)
Date: Wed, 9 Dec 2009 16:21:25 +0000
Subject: [Numpy-discussion] Applying Patch #1085
In-Reply-To: 
References: <5C367222-2FDF-4B1D-8DD4-4987E0EB10CA@enthought.com> <75E2A16E-83FF-4467-8543-8AA1060DB27B@enthought.com>
Message-ID: <271BED32E925E646A1333A56D9C6AFCB58F25C70A1@MBOX0.essex.ac.uk>

Hello!

> What do people think of applying patch #1085.

Fine with me.

> I'd rename the function ...

Let me know if you want me to make these changes, or feel free to make them.

> It looks like the routine doesn't try to determine if the views actually overlap, just if they might potentially share data. Is that correct? That seems safe and if the time isn't much it might be a nice safety catch.

The function compares the ultimate base of every output with those of the inputs, and if an output and an input have the same base (or either one is the base of the other), the input is copied to a temporary object before the operation (unless it is the easy case of same dimensions and strides, strides all positive, and the output pointer less than the input one). The two views might not overlap (such as z[1::2] = z[0::2] + 1) but the routine is not smart enough to understand that and makes a copy anyway. This should be a conservative approach: it should be safe; at most it might cause copying some array unnecessarily.

Let me know if you can think of a better approach able to save some unnecessary copies but light enough (as it is applied nin x nout times).

Best,
Luca

From robert.kern at gmail.com  Wed Dec  9 11:29:45 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 9 Dec 2009 10:29:45 -0600
Subject: [Numpy-discussion] Release blockers for 1.4.0 ?
In-Reply-To: 
References: <5b8d13220912070724p51be8fedq3268972a0cd1c4ab@mail.gmail.com> <5b8d13220912080721q34b81382k2e53d1b605496c4@mail.gmail.com> <1260287829.18562.50.camel@talisman> <5b8d13220912080812l75c150b6u6024513730ae013b@mail.gmail.com> <3d375d730912080821m6ea32dc2w62cfa1b58808f939@mail.gmail.com> <5b8d13220912080941r3d028451ka3b0b38dd392b203@mail.gmail.com> <3d375d730912080954m5758cf00l14666a81848f1b78@mail.gmail.com> <13D4BCC7-C9CB-408F-956A-D4A2C450C30A@gmail.com> <3d375d730912081336l24dc9e7ao88469e7777a14f9d@mail.gmail.com>
Message-ID: <3d375d730912090829h459da3efh7b30f4bb01c5696c@mail.gmail.com>

On Wed, Dec 9, 2009 at 01:41, Anne Archibald wrote:
> 2009/12/8 Robert Kern:
>>  olderr = np.seterr(divide='ignore', invalid='ignore')
>>  try:
>>    result = self.f(da, db, *args, **kwargs)
>>  finally:
>>    np.seterr(**olderr)
>
> Doesn't this risk a ctrl-C after the seterr but before the try? How about:
>
> olderr = np.geterr()
> try:
>     np.seterr(....)
>     do_whatever()
> finally:
>     np.seterr(**olderr)

Fair point.

> (I guess this would be why context managers were invented...)

Oh yes.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
 -- Umberto Eco

From david.kirkby at onetel.net  Wed Dec  9 14:13:21 2009
From: david.kirkby at onetel.net (David Kirkby)
Date: Wed, 9 Dec 2009 19:13:21 +0000
Subject: [Numpy-discussion] Numpy not recognising PA-RISC CPU on HP-UX workstation
Message-ID: <286f7bad0912091113n324a21b1n4c7f10ee1dff8a05@mail.gmail.com>

See here for a fuller error message,

http://trac.sagemath.org/sage_trac/ticket/7166

but basically I see:

gcc: build/src.hp-ux-B.11.11-9000-785-2.6/numpy/core/src/_sortmodule.c
In file included from numpy/core/include/numpy/npy_endian.h:22,
                 from numpy/core/include/numpy/ndarrayobject.h:26,
                 from numpy/core/include/numpy/noprefix.h:7,
                 from numpy/core/src/_sortmodule.c.src:29:
numpy/core/include/numpy/npy_cpu.h:49:6: error: #error Unknown CPU, please report this to numpy maintainers with information about your platform (OS, CPU and compiler)
In file included from numpy/core/include/numpy/ndarrayobject.h:26,

The computer is a HP C3600 workstation.
The CPU is a 64-bit 552 MHz PA-RISC device.
The OS is HP-UX 11i, also known as HP-UX 11.11.

I'm not sure of the cache sizes on this, though I expect data can be found on the web.

If a numpy maintainer wants access to the HP-UX machine, let me know your preferred login name by email, and I'll create you an account.

Dave

From charlesr.harris at gmail.com  Wed Dec  9 14:21:43 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 9 Dec 2009 12:21:43 -0700
Subject: [Numpy-discussion] Numpy not recognising PA-RISC CPU on HP-UX workstation
In-Reply-To: <286f7bad0912091113n324a21b1n4c7f10ee1dff8a05@mail.gmail.com>
References: <286f7bad0912091113n324a21b1n4c7f10ee1dff8a05@mail.gmail.com>
Message-ID: 

On Wed, Dec 9, 2009 at 12:13 PM, David Kirkby wrote:

> See here for a fuller error message,
>
> http://trac.sagemath.org/sage_trac/ticket/7166
>
> but basically I see:
>
> gcc: build/src.hp-ux-B.11.11-9000-785-2.6/numpy/core/src/_sortmodule.c
> In file included from numpy/core/include/numpy/npy_endian.h:22,
>                  from numpy/core/include/numpy/ndarrayobject.h:26,
>                  from numpy/core/include/numpy/noprefix.h:7,
>                  from numpy/core/src/_sortmodule.c.src:29:
> numpy/core/include/numpy/npy_cpu.h:49:6: error: #error Unknown CPU, please report this to numpy maintainers with information about your platform (OS, CPU and compiler)
> In file included from numpy/core/include/numpy/ndarrayobject.h:26,
>
> The computer is a HP C3600 workstation.
> The CPU is a 64-bit 552 MHz PA-RISC device.
> The OS is HP-UX 11i, also known as HP-UX 11.11.
>
> I'm not sure of the cache sizes on this, though I expect data can be found on the web.
>
> If a numpy maintainer wants access to the HP-UX machine, let me know your preferred login name by email, and I'll create you an account.

I believe this has been fixed. Can you try the release candidate?

Chuck

From dagss at student.matnat.uio.no  Wed Dec  9 14:29:38 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Wed, 09 Dec 2009 20:29:38 +0100
Subject: [Numpy-discussion] Fixed-point arithmetic... any solution yet?
In-Reply-To: <4fe028e30912090656t5b82b5cfif865d60d07fe23ec@mail.gmail.com>
References: <4fe028e30912090656t5b82b5cfif865d60d07fe23ec@mail.gmail.com>
Message-ID: <4B1FFAA2.3040905@student.matnat.uio.no>

Ruben Salvador wrote:
> Hello everybody.
>
> I've seen this question arise sometimes on the list, but don't know if something has "happened" yet or not. I mean, is any solution feasible to use more or less right out of the box?
>
> I'm just a hardware engineer, so it would be difficult for me to create my own class for this, since my knowledge of python/numpy is very limited, and I just don't have the time/knowledge to be more than a simple user of the language, not a developer.
>
> I have just come across this: http://www.dilloneng.com/documents/downloads/demodel/ but haven't used it yet. I'll give it a try and see how it works and come back to the list to report somehow. But, are there any "official" plans for this within the numpy developers? Is there any code around that may be used? I just need to test my code with fixed point arithmetic (I'm modelling hardware....)
>
> Thanks for the good work to all the Python/Numpy developers (and all the projects related, matplotlib and so on....) and for the possibility of freeing from matlab!!! I'm determined to do research with as many free software design tools as possible.... though this fixed-point arithmetic issue is still a chain!

I haven't heard of anything, but here's what I'd do (see the sketch after this list):

- Use np.int64
- Multiply all inputs to my code by 10^6
- Divide all output from my code by 10^6
- If you need to debug-print an array, simply define something like:

FIXED_POINT_FACTOR = 10**6

def printarr(x):
    print x.astype(np.float) / FIXED_POINT_FACTOR

Or am I missing something?
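E.g., a minimal sketch of that scheme (helper names made up, untested):

import numpy as np

FIXED_POINT_FACTOR = 10**6

def to_fixed(x):
    # scale floats into the int64 fixed-point representation
    return np.round(np.asarray(x) * FIXED_POINT_FACTOR).astype(np.int64)

def fixed_mul(a, b):
    # the product of two scaled values carries FIXED_POINT_FACTOR**2,
    # so divide one factor back out
    return a * b / FIXED_POINT_FACTOR

a = to_fixed([1.5, 2.25])
b = to_fixed([2.0, 4.0])
printarr(fixed_mul(a, b))   # prints [ 3.  9.], using printarr from above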
-- 
Dag Sverre

From ndbecker2 at gmail.com  Wed Dec  9 14:26:21 2009
From: ndbecker2 at gmail.com (Neal Becker)
Date: Wed, 09 Dec 2009 14:26:21 -0500
Subject: [Numpy-discussion] Fixed-point arithmetic... any solution yet?
References: <4fe028e30912090656t5b82b5cfif865d60d07fe23ec@mail.gmail.com>
Message-ID: 

Ruben Salvador wrote:

> Hello everybody.
>
> I've seen this question arise sometimes on the list, but don't know if something has "happened" yet or not. I mean, is any solution feasible to use more or less right out of the box?
>
> I'm just a hardware engineer, so it would be difficult for me to create my own class for this, since my knowledge of python/numpy is very limited, and I just don't have the time/knowledge to be more than a simple user of the language, not a developer.
>
> I have just come across this: http://www.dilloneng.com/documents/downloads/demodel/ but haven't used it yet. I'll give it a try and see how it works and come back to the list to report somehow. But, are there any "official" plans for this within the numpy developers? Is there any code around that may be used? I just need to test my code with fixed point arithmetic (I'm modelling hardware....)
>
> Thanks for the good work to all the Python/Numpy developers (and all the projects related, matplotlib and so on....) and for the possibility of freeing from matlab!!! I'm determined to do research with as many free software design tools as possible.... though this fixed-point arithmetic issue is still a chain!
>
> Regards!

I've done some experiments with adding a fixed-point type to numpy, but in the end abandoned the effort. For now, I use integer arrays to store the data, and then just keep variables for the #bits and the position of the binary point. For actual signal processing, I use c++ code. I have a class that is based on boost::constrained_value (unreleased) that gives me the behavior I want from fixed-point scalars.

From benjamin at kerns.de  Wed Dec  9 14:29:16 2009
From: benjamin at kerns.de (Benjamin Kern)
Date: Wed, 9 Dec 2009 20:29:16 +0100
Subject: [Numpy-discussion] General Array -> Into Index Array + Value Array of Nonzero Elements
Message-ID: 

Hello everyone,

at the moment I'd like to create a numpy interface to a library for numerical optimization. This library uses a special format to include symmetric matrices. Basically, if you have

A = np.array([ [1.0, 0.0, 2.0],
               [0.0, 3.0, 0.0],
               [2.0, 0.0, 5.0] ])

you would have to create 2 arrays which specify the positions as well as the non-zero elements of the lower triangular part of the matrix. So in this example you would need the following arrays to specify the matrix completely:

A_pos = np.array([0, 2, 3, 5], dtype=int)
A_val = np.array([1.0, 3.0, 2.0, 5.0])

So now to my question: is there a clever way to extract these two arrays A_pos and A_val from an arbitrary A (where A.ndim == 2)? Another question would be if there is the possibility to do something similar if you are using sparse matrices (from scipy.sparse).

Best
Benjamin

From charlesr.harris at gmail.com  Wed Dec  9 14:47:31 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 9 Dec 2009 12:47:31 -0700
Subject: [Numpy-discussion] git gui
Message-ID: 

There is a new git gui out. Some windows users might like to give it a try and report back.

Chuck
From kwgoodman at gmail.com  Wed Dec  9 15:02:16 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Wed, 9 Dec 2009 12:02:16 -0800
Subject: [Numpy-discussion] General Array -> Into Index Array + Value Array of Nonzero Elements
In-Reply-To: 
References: 
Message-ID: 

On Wed, Dec 9, 2009 at 11:29 AM, Benjamin Kern wrote:
> Hello everyone,
>
> at the moment I'd like to create a numpy interface to a library for numerical optimization. This library uses a special format to include symmetric matrices. Basically, if you have
>
> A = np.array([ [1.0, 0.0, 2.0],
>                [0.0, 3.0, 0.0],
>                [2.0, 0.0, 5.0] ])
>
> you would have to create 2 arrays which specify the positions as well as the non-zero elements of the lower triangular part of the matrix. So in this example you would need the following arrays to specify the matrix completely:
>
> A_pos = np.array([0, 2, 3, 5], dtype=int)
> A_val = np.array([1.0, 3.0, 2.0, 5.0])
>
> So now to my question: is there a clever way to extract these two arrays A_pos and A_val from an arbitrary A (where A.ndim == 2)? Another question would be if there is the possibility to do something similar if you are using sparse matrices (from scipy.sparse).
>
> Best
> Benjamin

I don't know of a clever way. But it is always possible to hack something together:

>> x = np.tri(3, 3, k=0)
>> x[x==0] = np.nan
>> y = (A*x).reshape(-1)
>> y = y[np.isfinite(y)]
>> A_pos = np.where(y != 0)[0]
>> A_val = y[A_pos]

From sienkiew at stsci.edu  Wed Dec  9 15:03:07 2009
From: sienkiew at stsci.edu (Mark Sienkiewicz)
Date: Wed, 09 Dec 2009 15:03:07 -0500
Subject: [Numpy-discussion] numpy distutils breaks scipy install on mac
In-Reply-To: <3d375d730912081347l787db5b4q6d10db351d01ec4e@mail.gmail.com>
References: <4B1EC6E1.7040004@stsci.edu> <3d375d730912081347l787db5b4q6d10db351d01ec4e@mail.gmail.com>
Message-ID: <4B20027B.6090104@stsci.edu>

Robert Kern wrote:
> On Tue, Dec 8, 2009 at 15:36, Mark Sienkiewicz wrote:
>
>> ( Presumably, some other version of gfortran does accept -arch, or this code wouldn't be here, right? )
>
> Right. The -arch flag was added by Apple to GCC and their patch really should be applied to all builds of GCC compilers for the Mac. It is deeply disappointing that Fink ignored this.

So, you're saying that an un-patched GCC doesn't know -arch? In that case, isn't it a mistake to see "gfortran" on the path, and then assume that you can say "-arch" to it?

> The only Mac gfortran build that I can recommend is here:
>
> http://r.research.att.com/tools/

I saw the "should" note about that in the installation instructions. If I were doing this for personal use, I would have just installed that compiler and been done with it. Unfortunately, I am supporting many users who are already going to have the fink gfortran installed, and therefore I have to build scipy to use those libraries. It would be a tremendous amount of work just to convince my IT department to uninstall the fink gfortran and install the AT&T gfortran on every mac at the institute.

> _can_target() should be fixed to be more accurate, though, so if you find a patch that works for you, please let us know.

Here is an idea: I have a Mac Tiger machine that I believe has the AT&T gfortran installed. It can make universal binaries. It says:

% gfortran -arch bananapc6000 -v
gfortran: Invalid arch name : bananapc6000
%

But the Mac Leopard machine with the Fink:

% gfortran -arch bananapc6000 -v
Using built-in specs.
Target: i686-apple-darwin9
Configured with: ../gcc-4.3.0/configure --prefix=/sw --prefix=/sw/lib/gcc4.3 --mandir=/sw/share/man --infodir=/sw/share/info --enable-languages=c,c++,fortran,objc,java --with-arch=nocona --with-tune=generic --build=i686-apple-darwin9 --with-gmp=/sw --with-libiconv-prefix=/sw --with-system-zlib --x-includes=/usr/X11R6/include --x-libraries=/usr/X11R6/lib --disable-libjava-multilib
Thread model: posix
gcc version 4.3.0 (GCC)
%

So, if you ask gfortran to use an obviously bogus architecture and it objects with the message "Invalid arch name", then it knows what -arch means. If it says anything else, then it doesn't. It should work until somebody ports gfortran to make bananapc6000 binaries. :)

I think this is an ugly hack, but that's how it is when you use auto-detection. (I wish there were some way that distutils could autodetect everything, write it into a file, let me edit that file, then next time I run setup.py it would use the values in the file, but I expect that would require a near complete re-write of distutils, and I just don't have time.)

Mark

From robert.kern at gmail.com  Wed Dec  9 15:08:05 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 9 Dec 2009 14:08:05 -0600
Subject: [Numpy-discussion] numpy distutils breaks scipy install on mac
In-Reply-To: <4B20027B.6090104@stsci.edu>
References: <4B1EC6E1.7040004@stsci.edu> <3d375d730912081347l787db5b4q6d10db351d01ec4e@mail.gmail.com> <4B20027B.6090104@stsci.edu>
Message-ID: <3d375d730912091208q4571d300w9e02c8e93e41a267@mail.gmail.com>

On Wed, Dec 9, 2009 at 14:03, Mark Sienkiewicz wrote:
> Robert Kern wrote:
>> On Tue, Dec 8, 2009 at 15:36, Mark Sienkiewicz wrote:
>>
>>> ( Presumably, some other version of gfortran does accept -arch, or this code wouldn't be here, right? )
>>
>> Right. The -arch flag was added by Apple to GCC and their patch really should be applied to all builds of GCC compilers for the Mac. It is deeply disappointing that Fink ignored this.
>
> So, you're saying that an un-patched GCC doesn't know -arch?

Yup!

> In that case, isn't it a mistake to see "gfortran" on the path, and then assume that you can say "-arch" to it?

We're not. You've found a bug in our test for whether or not gfortran supports the flag.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
 -- Umberto Eco

From kwgoodman at gmail.com  Wed Dec  9 15:20:06 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Wed, 9 Dec 2009 12:20:06 -0800
Subject: [Numpy-discussion] General Array -> Into Index Array + Value Array of Nonzero Elements
In-Reply-To: 
References: 
Message-ID: 

On Wed, Dec 9, 2009 at 12:02 PM, Keith Goodman wrote:
> On Wed, Dec 9, 2009 at 11:29 AM, Benjamin Kern wrote:
>> Hello everyone,
>>
>> at the moment I'd like to create a numpy interface to a library for numerical optimization. This library uses a special format to include symmetric matrices. Basically, if you have
>>
>> A = np.array([ [1.0, 0.0, 2.0],
>>                [0.0, 3.0, 0.0],
>>                [2.0, 0.0, 5.0] ])
>>
>> you would have to create 2 arrays which specify the positions as well as the non-zero elements of the lower triangular part of the matrix. So in this example you would need the following arrays to specify the matrix completely:
>>
>> A_pos = np.array([0, 2, 3, 5], dtype=int)
>> A_val = np.array([1.0, 3.0, 2.0, 5.0])
>>
>> So now to my question:
>> is there a clever way to extract these two arrays A_pos and A_val from an arbitrary A (where A.ndim == 2)?
>> Another question would be if there is the possibility to do something similar if you are using sparse matrices (from scipy.sparse).
>>
>> Best
>> Benjamin
>
> I don't know of a clever way. But it is always possible to hack something together:
>
>>> x = np.tri(3, 3, k=0)
>>> x[x==0] = np.nan
>>> y = (A*x).reshape(-1)
>>> y = y[np.isfinite(y)]
>>> A_pos = np.where(y != 0)[0]
>>> A_val = y[A_pos]

This is a little less hackish:

>> x = np.tri(3, 3)
>> idx = x.flat == 1
>> y = A.flat[idx]
>> A_pos = np.where(y != 0)[0]
>> A_val = y[A_pos]

From gael.varoquaux at normalesup.org  Wed Dec  9 16:31:46 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Wed, 9 Dec 2009 22:31:46 +0100
Subject: [Numpy-discussion] Impossibility to build nipy on recent numpy?
Message-ID: <20091209213146.GA5468@phare.normalesup.org>

In the nipy project, we have cython-generated C files checked in. With the latest numpy I get the following run-time failure:

"/home/varoquau/dev/nipy-trunk/nipy/neurospin/register/iconic_matcher.py", line 6, in <module>
    from routines import _joint_histogram, _similarity, similarity_measures
  File "numpy.pxi", line 74, in nipy.neurospin.register.routines (nipy/neurospin/register/routines.c:6042)
ValueError: numpy.dtype does not appear to be the correct type object

Where routines.c is a cython-generated file. I suspect that this is due to the recent change in the size of the struct representing numpy arrays. With other projects, it was sufficient to recompile the project to avoid this problem. With nipy it is not. Could that be because the cython-generated C file encodes the size of the numpy array struct, and has been compiled with a different numpy?

If this is the case, I think that it means that we need to all use the same version of numpy (1.3).

Does anybody know if the cython folks are going to work around that anytime soon (I am not on the cython mailing list, so I am not asking there)?

Nipy folks, does that mean that we should change our policy, and build the cython-generated c files at compile time, thus requiring cython as a build dependency?

Any comments appreciated.

Gaël

From matthew.brett at gmail.com  Wed Dec  9 16:36:54 2009
From: matthew.brett at gmail.com (Matthew Brett)
Date: Wed, 9 Dec 2009 16:36:54 -0500
Subject: [Numpy-discussion] Impossibility to build nipy on recent numpy?
In-Reply-To: <20091209213146.GA5468@phare.normalesup.org>
References: <20091209213146.GA5468@phare.normalesup.org>
Message-ID: <1e2af89e0912091336j31d1870fpf5679790e9956f1f@mail.gmail.com>

Hi,

On Wed, Dec 9, 2009 at 4:31 PM, Gael Varoquaux wrote:
> In the nipy project, we have cython-generated C files checked in. With the latest numpy I get the following run-time failure:
>
> "/home/varoquau/dev/nipy-trunk/nipy/neurospin/register/iconic_matcher.py", line 6, in <module>
>     from routines import _joint_histogram, _similarity, similarity_measures
>   File "numpy.pxi", line 74, in nipy.neurospin.register.routines (nipy/neurospin/register/routines.c:6042)
> ValueError: numpy.dtype does not appear to be the correct type object

Er - I've had this error in the past, but as far as I remember it is always because I am somehow picking up a fragment of the wrong numpy C-API. Usual steps (delete site-packages/numpy, maybe site-packages/scipy, the numpy and scipy build/ directories, rebuild and install, then remove site-packages/nipy and the nipy build/ directory if necessary, or all traces of an inplace build), then try again?
See you, Matthew From gael.varoquaux at normalesup.org Wed Dec 9 16:40:02 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 9 Dec 2009 22:40:02 +0100 Subject: [Numpy-discussion] Impossibility to build nipy on recent numpy? In-Reply-To: <1e2af89e0912091336j31d1870fpf5679790e9956f1f@mail.gmail.com> References: <20091209213146.GA5468@phare.normalesup.org> <1e2af89e0912091336j31d1870fpf5679790e9956f1f@mail.gmail.com> Message-ID: <20091209214002.GB32739@phare.normalesup.org> On Wed, Dec 09, 2009 at 04:36:54PM -0500, Matthew Brett wrote: > > "/home/varoquau/dev/nipy-trunk/nipy/neurospin/register/iconic_matcher.py", > > line 6, in > > ? ?from routines import _joint_histogram, _similarity, > > similarity_measures > > ?File "numpy.pxi", line 74, in nipy.neurospin.register.routines > > (nipy/neurospin/register/routines.c:6042) > > ValueError: numpy.dtype does not appear to be the correct type object > Er - I've had this error in the past, but as far as I remember it is > always because I am somehow picking up a fragment of the wrong numpy > C-API. Usual steps (delete site-packages/numpy maybe > site-packages/scipy, /build /build, rebuild and > install, then remove site-packages/nipy /build if necessary, > or all traces of inplace build, then try again? I did that quite a few times. Maybe I am forgetting to clean up something. I do the following for numpy/scipy/nipy (using zsh extended glob): rm -rf build **/*.o **/*.so **/*.a python setup.py build_ext --inplace Yes, nipy has a '.a'. Ga?l From robert.kern at gmail.com Wed Dec 9 16:39:56 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 9 Dec 2009 15:39:56 -0600 Subject: [Numpy-discussion] Impossibility to build nipy on recent numpy? In-Reply-To: <20091209213146.GA5468@phare.normalesup.org> References: <20091209213146.GA5468@phare.normalesup.org> Message-ID: <3d375d730912091339g6ee06e8bs9322ac42ce1f6c47@mail.gmail.com> On Wed, Dec 9, 2009 at 15:31, Gael Varoquaux wrote: > In the nipy project, we have cython-generated C files checked in. > With the latest numpy I get the following run-time failure: > > "/home/varoquau/dev/nipy-trunk/nipy/neurospin/register/iconic_matcher.py", > line 6, in > ? ?from routines import _joint_histogram, _similarity, > similarity_measures > ?File "numpy.pxi", line 74, in nipy.neurospin.register.routines > (nipy/neurospin/register/routines.c:6042) > ValueError: numpy.dtype does not appear to be the correct type object > > Where routines.c is a cython-generated file. I suspect that this is due > to the recent change in the size of the struct representing numpy arrays. > With other projects, it was sufficient to recompile the project to avoid > this problem. With nipy it is not. Could that be because the > cython-generated C file encodes the size of the numpy array struct, and > has been compiled with a different numpy? I wouldn't think so (Cython itself doesn't know the size; it should rely on using sizeof() in the C code), but it's your file. You tell us. > If this is the case, I think that it means that we need to all use the > same version of numpy (1.3). > > Does anybody know if the cython folks are going to work around that > anytime soon (I am not on the cython mailing list, so I am not asking > there)? Yes. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From benjamin at kerns.de Wed Dec 9 16:36:34 2009 From: benjamin at kerns.de (Benjamin Kern) Date: Wed, 9 Dec 2009 22:36:34 +0100 Subject: [Numpy-discussion] General Array -> Into Index Array + Value Array of Nonzero Elements In-Reply-To: <20091209213146.GA5468@phare.normalesup.org> References: <20091209213146.GA5468@phare.normalesup.org> Message-ID: <0F007FAF-C42C-41EA-869C-90DBF47069CB@kerns.de> Thanks for the quick answer. I think this will help me further for the moment. Best Benjamin From gael.varoquaux at normalesup.org Wed Dec 9 16:42:11 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 9 Dec 2009 22:42:11 +0100 Subject: [Numpy-discussion] Impossibility to build nipy on recent numpy? In-Reply-To: <3d375d730912091339g6ee06e8bs9322ac42ce1f6c47@mail.gmail.com> References: <20091209213146.GA5468@phare.normalesup.org> <3d375d730912091339g6ee06e8bs9322ac42ce1f6c47@mail.gmail.com> Message-ID: <20091209214211.GC32739@phare.normalesup.org> On Wed, Dec 09, 2009 at 03:39:56PM -0600, Robert Kern wrote: > > Where routines.c is a cython-generated file. I suspect that this is due > > to the recent change in the size of the struct representing numpy arrays. > > With other projects, it was sufficient to recompile the project to avoid > > this problem. With nipy it is not. Could that be because the > > cython-generated C file encodes the size of the numpy array struct, and > > has been compiled with a different numpy? > I wouldn't think so (Cython itself doesn't know the size; it should > rely on using sizeof() in the C code), but it's your file. You tell > us. OK, I thought it could be that way. So I must be doing something stupid. I'll look closer. Thanks, Ga?l From robert.kern at gmail.com Wed Dec 9 16:52:19 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 9 Dec 2009 15:52:19 -0600 Subject: [Numpy-discussion] Impossibility to build nipy on recent numpy? In-Reply-To: <20091209214002.GB32739@phare.normalesup.org> References: <20091209213146.GA5468@phare.normalesup.org> <1e2af89e0912091336j31d1870fpf5679790e9956f1f@mail.gmail.com> <20091209214002.GB32739@phare.normalesup.org> Message-ID: <3d375d730912091352r535ce9a8s6b6d1406ef2aec13@mail.gmail.com> On Wed, Dec 9, 2009 at 15:40, Gael Varoquaux wrote: > On Wed, Dec 09, 2009 at 04:36:54PM -0500, Matthew Brett wrote: >> > "/home/varoquau/dev/nipy-trunk/nipy/neurospin/register/iconic_matcher.py", >> > line 6, in >> > ? ?from routines import _joint_histogram, _similarity, >> > similarity_measures >> > ?File "numpy.pxi", line 74, in nipy.neurospin.register.routines >> > (nipy/neurospin/register/routines.c:6042) >> > ValueError: numpy.dtype does not appear to be the correct type object > >> Er - I've had this error in the past, but as far as I remember it is >> always because I am somehow picking up a fragment of the wrong numpy >> C-API. ?Usual steps (delete site-packages/numpy maybe >> site-packages/scipy, /build /build, rebuild and >> install, then remove site-packages/nipy /build if necessary, >> or all traces of inplace build, then try again? > > I did that quite a few times. Maybe I am forgetting to clean up > something. I do the following for numpy/scipy/nipy (using zsh extended > glob): > > rm -rf build **/*.o **/*.so **/*.a > python setup.py build_ext --inplace If you are building numpy inplace, you will also have to delete the inplace headers. 
nuke:
        rm -rf build
        rm -f numpy/core/include/numpy/__multiarray_api.c numpy/core/include/numpy/__multiarray_api.h numpy/core/include/numpy/__ufunc_api.c numpy/core/include/numpy/__ufunc_api.h numpy/core/include/numpy/__umath_generated.c
        rm -f `find . -name "*.so"`

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco

From pfeldman at verizon.net  Wed Dec  9 20:51:03 2009
From: pfeldman at verizon.net (Dr. Phillip M. Feldman)
Date: Wed, 9 Dec 2009 17:51:03 -0800 (PST)
Subject: [Numpy-discussion] Assigning complex values to a real array
In-Reply-To: <1260349494.18562.120.camel@talisman>
References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070130j5672a114m159a7b922931b205@mail.gmail.com> <3d375d730903070135u28fb4085x86d0d6139d2dd28f@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com> <26705737.post@talk.nabble.com> <1260349494.18562.120.camel@talisman>
Message-ID: <26720961.post@talk.nabble.com>

Pauli Virtanen-3 wrote:
>
> I'd think that downcasting is different from dropping the imaginary part.
>

There are many ways (in fact, an unlimited number) to downcast from
complex to real. Here are three possibilities:

- Take the real part.
- Take the magnitude (the square root of the sum of the squares of the
  real and imaginary parts).
- Assign a NaN.

--
View this message in context: http://old.nabble.com/Assigning-complex-values-to-a-real-array-tp22383353p26720961.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.

From pfeldman at verizon.net  Wed Dec  9 20:54:07 2009
From: pfeldman at verizon.net (Dr. Phillip M. Feldman)
Date: Wed, 9 Dec 2009 17:54:07 -0800 (PST)
Subject: [Numpy-discussion] Assigning complex values to a real array
In-Reply-To: <1260349494.18562.120.camel@talisman>
References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070130j5672a114m159a7b922931b205@mail.gmail.com> <3d375d730903070135u28fb4085x86d0d6139d2dd28f@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com> <26705737.post@talk.nabble.com> <1260349494.18562.120.camel@talisman>
Message-ID: <26720987.post@talk.nabble.com>

Pauli Virtanen-3 wrote:
>
> Nevertheless, I can't really regard dropping the imaginary part a
> significant issue.
>

I am amazed that anyone could say this. For anyone who works with
Fourier transforms, or with electrical circuits, or with electromagnetic
waves, dropping the imaginary part is a huge issue because we get
answers that are totally wrong.

When I recently tried to validate a code, the answers were wrong, and it
took two full days to track down the cause. I am now forced to reconsider
carefully whether Python/NumPy is a suitable platform for serious
scientific computing.

--
View this message in context: http://old.nabble.com/Assigning-complex-values-to-a-real-array-tp22383353p26720987.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.
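To make the failure mode under discussion concrete, here is a minimal
sketch of the silent down-cast and of the three explicit alternatives
listed above. The semantics shown are the 1.3-era behaviour described in
this thread; a warning for this case is discussed further down.

    import numpy as np

    a = np.zeros(3)                      # float64 destination
    z = np.array([1 + 2j, 3 - 4j, 5j])   # complex128 source
    a[:] = z   # imaginary parts silently discarded; a is now [1., 3., 0.]

    # Explicit down-casts instead of the silent one:
    real_part = z.real                   # take the real part
    magnitude = np.abs(z)                # take the magnitude
    with_nans = np.where(z.imag != 0, np.nan, z.real)  # NaN wherever data would be lost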
From sransom at nrao.edu Wed Dec 9 21:10:12 2009 From: sransom at nrao.edu (Scott Ransom) Date: Wed, 9 Dec 2009 21:10:12 -0500 Subject: [Numpy-discussion] Assigning complex values to a real array In-Reply-To: <26720987.post@talk.nabble.com> References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070130j5672a114m159a7b922931b205@mail.gmail.com> <3d375d730903070135u28fb4085x86d0d6139d2dd28f@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com> <26705737.post@talk.nabble.com> <1260349494.18562.120.camel@talisman> <26720987.post@talk.nabble.com> Message-ID: <20091210021012.GA3768@ssh.cv.nrao.edu> On Wed, Dec 09, 2009 at 05:54:07PM -0800, Dr. Phillip M. Feldman wrote: > > Pauli Virtanen-3 wrote: > > > > Nevertheless, I can't really regard dropping the imaginary part a > > significant issue. > > > > I am amazed that anyone could say this. For anyone who works with Fourier > transforms, or with electrical circuits, or with electromagnetic waves, > dropping the imaginary part is a huge issue because we get answers that are > totally wrong. > > When I recently tried to validate a code, the answers were wrong, and it > took two full days to track down the cause. I am now forced to reconsider > carefully whether Python/NumPy is a suitable platform for serious scientific > computing. You've now said this a couple times. And it is fine if that is your opinion. However, I think it is incorrect. I've been using numeric/numarray/numpy for about 12 years as my main scientific computing platform. And I do extensive work with Fourier Transforms and other complex numbers. I have not once run into this issue and in fact, my use of numpy has improved my scientific productivity dramatically. Most of the casting rules were set a very long time ago and are there for very good reasons. While it is certainly possibly that there could be bugs in corner cases of some of them, or that those rules surprise some people due to their familiarity with other behaviours, but that does not change the fact that most of them are in place in numpy for good reasons. Scott -- Scott M. Ransom Address: NRAO Phone: (434) 296-0320 520 Edgemont Rd. email: sransom at nrao.edu Charlottesville, VA 22903 USA GPG Fingerprint: 06A9 9553 78BE 16DB 407B FFCA 9BFA B6FF FFD3 2989 From peridot.faceted at gmail.com Wed Dec 9 22:10:06 2009 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 9 Dec 2009 22:10:06 -0500 Subject: [Numpy-discussion] Assigning complex values to a real array In-Reply-To: <26720987.post@talk.nabble.com> References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070130j5672a114m159a7b922931b205@mail.gmail.com> <3d375d730903070135u28fb4085x86d0d6139d2dd28f@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com> <26705737.post@talk.nabble.com> <1260349494.18562.120.camel@talisman> <26720987.post@talk.nabble.com> Message-ID: 2009/12/9 Dr. Phillip M. Feldman : > > > Pauli Virtanen-3 wrote: >> >> Nevertheless, I can't really regard dropping the imaginary part a >> significant issue. >> > > I am amazed that anyone could say this. ?For anyone who works with Fourier > transforms, or with electrical circuits, or with electromagnetic waves, > dropping the imaginary part is a huge issue because we get answers that are > totally wrong. I agree that dropping the imaginary part is a wart. 
But it is one that is not very hard to learn to live with. I say this as someone who has been burned by it while using Fourier analysis to work with astronomical data. > When I recently tried to validate a code, the answers were wrong, and it > took two full days to track down the cause. ?I am now forced to reconsider > carefully whether Python/NumPy is a suitable platform for serious scientific > computing. While I find the current numpy complex->real conversion annoying, I have to say, this kind of rhetoric does not benefit your cause. It sounds childish and manipulative, and makes even people who agree in principle want to tell you to go ahead and use MATLAB and stop pestering us. We are not here to sell you on numpy; if you hate it, don't use it. We are here because *we* use it, warts and all, and we want to discuss interesting topics related to numpy. That you would have implemented it differently is not very interesting if you are not even willing to understand why it is the way it is and what a change would cost, let alone propose a workable way to improve. Anne From charlesr.harris at gmail.com Thu Dec 10 00:59:22 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 9 Dec 2009 22:59:22 -0700 Subject: [Numpy-discussion] Flattening an array In-Reply-To: <58df6dc20912081629t5960d58dub66c4a5ae901e426@mail.gmail.com> References: <58df6dc20912081629t5960d58dub66c4a5ae901e426@mail.gmail.com> Message-ID: On Tue, Dec 8, 2009 at 5:29 PM, Jake VanderPlas wrote: > Hello, > I have a function -- call it f() -- which takes a length-N 1D numpy > array as an argument, and returns a length-N 1D array. > I want to pass it the data in an N-D array, and obtain the N-D array > of the result. > I've thought about wrapping it as such: > > #python code: > from my_module import f # takes a 1D array, raises an exception otherwise > def f_wrap(A): > A_1D = A.ravel() > B = f(A_1D) > return B.reshape(A.shape) > #end code > > If the function treats both types of input the same and the input arrays are genuinely C/F contiguous, then you can just reshape them A_1D = A.reshape(-1, order='C') # c order A_1D = A.reshape(-1, order='F') # fortran order Warning: if they aren't contiguous of the proper sort, copies will be made. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From dagss at student.matnat.uio.no Thu Dec 10 03:54:32 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 10 Dec 2009 09:54:32 +0100 Subject: [Numpy-discussion] Assigning complex values to a real array In-Reply-To: References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070130j5672a114m159a7b922931b205@mail.gmail.com> <3d375d730903070135u28fb4085x86d0d6139d2dd28f@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com> <26705737.post@talk.nabble.com> <1260349494.18562.120.camel@talisman> <26720987.post@talk.nabble.com> Message-ID: <4B20B748.2050707@student.matnat.uio.no> Anne Archibald wrote: > 2009/12/9 Dr. Phillip M. Feldman : > > >> When I recently tried to validate a code, the answers were wrong, and it >> took two full days to track down the cause. I am now forced to reconsider >> carefully whether Python/NumPy is a suitable platform for serious scientific >> computing. >> > > While I find the current numpy complex->real conversion annoying, I > have to say, this kind of rhetoric does not benefit your cause. 
It > sounds childish and manipulative, and makes even people who agree in > principle want to tell you to go ahead and use MATLAB and stop > pestering us. We are not here to sell you on numpy; if you hate it, > don't use it. We are here because *we* use it, warts and all, and we > want to discuss interesting topics related to numpy. That you would > have implemented it differently is not very interesting if you are not > even willing to understand why it is the way it is and what a change > would cost, let alone propose a workable way to improve. > At this point I want to remind us about Charles Harris' very workable proposal: Raise a warning. That should both keep backward compatability and prevent people from wasting days. (Hopefully, we can avoid wasting days discussing this issue too :-) ). Dag Sverre From rsalvador.wk at gmail.com Thu Dec 10 05:22:18 2009 From: rsalvador.wk at gmail.com (Ruben Salvador) Date: Thu, 10 Dec 2009 11:22:18 +0100 Subject: [Numpy-discussion] Fixed-point arithemetic...any solution yet? In-Reply-To: References: <4fe028e30912090656t5b82b5cfif865d60d07fe23ec@mail.gmail.com> Message-ID: <4fe028e30912100222t283ddfabo2b05fb7f814bf813@mail.gmail.com> On Wed, Dec 9, 2009 at 8:26 PM, Neal Becker wrote: > Ruben Salvador wrote: > > > Hello everybody. > > > > I've seen this question arise sometimes on the list, but don't know if > > something has "happened" yet or not. I mean, any solution feasible to use > > more or less right out of the box? > > > > I'm just a hardware engineer, so it would be difficult for me to create > my > > own class for this, since my knowledge of python/numpy is very limited, > > and, just don't have the time/knowledge to be more than a simple user of > > the language, not a developer. > > > > I have just come across this: > > http://www.dilloneng.com/documents/downloads/demodel/ but haven't used > it > > yet. I'll give it a try and see how it works and come back to the list to > > report somehow. But, is there any "official" plans for this within the > > numpy developers? Is there any code around that may be used? I just need > > to test my code with fixed point arithmetic (I'm modelling hardware....) > > > > Thanks for the good work to all the Python/Numpy developers (and all the > > projects related, matplotlib and so on....) and for the possiblity of > > freeing from matlab!!! I'm determined to do research with as many free > > software design tools as possible....though this fixed-point arithmetic > > issue is still a chain! > > > > Regards! > > I've done some experiments with adding a fixed-pt type to numpy, but in the > end abandoned the effort. For now, I use integer arrays to store the data, > and then just keep variables for the #bits and position of the binary > point. > > For actual signal processing, I use c++ code. I have a class that is based > on boost::constrained_value (unreleased) that gives me the behavior I want > from fixed point scalars. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Well...I think I may also try this way. This FIXED_POINT_FACTOR scaling is what is actually done implicitly in hardware to align bit vectors. And if I over-dimension the bit length, I "won't need to take care" of the number of bits after arithmetic operations... I'll try and see...but, if anybody has a quicker solution....I'm actually in a hurry :S I had a look at the code I mentioned in my first email. 
It does the trick someway, but from my point of view, needs some more tweaking to be usable in a wider context. It only supports some operations and I just guess it will fail in many numpy.array routines, if data is not cast previously (maybe not since the actual numerical value is floating point, and the fixed point is an internal representation of the class)...will try and report back.... Anyway, don't you people think we should boost this fixed-point issue in numpy? We should make some kind of roadmap for the implementation, I think it's a *MUST*. -- Rub?n Salvador PhD student @ Centro de Electr?nica Industrial (CEI) http://www.cei.upm.es Blog: http://aesatcei.wordpress.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav+sp at iki.fi Thu Dec 10 05:24:52 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Thu, 10 Dec 2009 10:24:52 +0000 (UTC) Subject: [Numpy-discussion] Assigning complex values to a real array References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070130j5672a114m159a7b922931b205@mail.gmail.com> <3d375d730903070135u28fb4085x86d0d6139d2dd28f@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com> <26705737.post@talk.nabble.com> <1260349494.18562.120.camel@talisman> <26720987.post@talk.nabble.com> <4B20B748.2050707@student.matnat.uio.no> Message-ID: Thu, 10 Dec 2009 09:54:32 +0100, Dag Sverre Seljebotn wrote: [clip] > At this point I want to remind us about Charles Harris' very workable > proposal: Raise a warning. That should both keep backward compatability > and prevent people from wasting days. (Hopefully, we can avoid wasting > days discussing this issue too :-) ). Yes. We can even make it a PendingDeprecationWarning, when we become convinced that this feature should go away in entirety. -- Pauli Virtanen From gael.varoquaux at normalesup.org Thu Dec 10 06:43:53 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 10 Dec 2009 12:43:53 +0100 Subject: [Numpy-discussion] [Nipy-devel] Impossibility to build nipy on recent numpy? In-Reply-To: <3f6ca94e0912100159o282cf1d4na8bef423b8ad69cc@mail.gmail.com> References: <20091209213146.GA5468@phare.normalesup.org> <1e2af89e0912091336j31d1870fpf5679790e9956f1f@mail.gmail.com> <20091209214002.GB32739@phare.normalesup.org> <3f6ca94e0912100159o282cf1d4na8bef423b8ad69cc@mail.gmail.com> Message-ID: <20091210114353.GA1393@phare.normalesup.org> On Thu, Dec 10, 2009 at 10:59:30AM +0100, Alexis Roche wrote: > Apparently you are trying to link nipy with an in-place build of > numpy, is that correct? Are you confident that the nipy setup does not > mix bits of your local numpy with another site-package install? My > question might be naive: I never build anything in-place (and never > had this bug, although it might be unrelated). It does seem like it might be a mess related to having two numpy installed: the local one, and the system one. It seems that during compilation of nipy, numpy.distutils picks up the wrong header, that is the system ones, and not those of my local instal of numpy (I tried with a 'develop' and with an 'install'). 
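A quick sanity check for this kind of header mix-up, before looking at
the compile line in detail, is to ask the interpreter for the include
directory of the numpy it actually imports; numpy.get_include() is the
standard helper for this, and the path in the comment is only what one
would expect for the local install described above:

    import numpy
    print numpy.get_include()
    # expect something like
    # /home/varoquau/usr/lib/python2.6/site-packages/numpy/core/include
    # here; if -I/usr/include comes before this directory on the compile
    # line, the system numpy headers win and an ABI mismatch follows.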
When I compile nipy, the compile options are: compile options: '-DATLAS_INFO="\"3.6.0\"" -I/usr/include -I/home/varoquau/usr/lib/python2.6/site-packages/numpy/core/include -I/home/varoquau/dev/nipy-neurospin/libcstat/fff -I/home/varoquau/dev/nipy-neurospin/libcstat/randomkit -I/home/varoquau/dev/nipy-neurospin/libcstat/wrapper -I/home/varoquau/usr/lib/python2.6/site-packages/numpy/core/include -I/usr/include -I/usr/include/python2.6 -c' In /usr/include, there is (under ubuntu Karmic) a numpy directory, with the numpy headers, corresponding to the system numpy. I don't want to remove these headers, as that would mean removing the python-numpy package on which many other packages depend. Does anybody have a suggestion on how to cleanly solve the problem? I fear I won't be the only one to have it. Ga?l From gael.varoquaux at normalesup.org Thu Dec 10 08:29:47 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 10 Dec 2009 14:29:47 +0100 Subject: [Numpy-discussion] [Nipy-devel] Impossibility to build nipy on recent numpy? In-Reply-To: <5b8d13220912100359t6ec43f89y7fcb312aaed44de6@mail.gmail.com> References: <20091209213146.GA5468@phare.normalesup.org> <1e2af89e0912091336j31d1870fpf5679790e9956f1f@mail.gmail.com> <20091209214002.GB32739@phare.normalesup.org> <3f6ca94e0912100159o282cf1d4na8bef423b8ad69cc@mail.gmail.com> <20091210114353.GA1393@phare.normalesup.org> <5b8d13220912100359t6ec43f89y7fcb312aaed44de6@mail.gmail.com> Message-ID: <20091210132947.GC1202@phare.normalesup.org> On Thu, Dec 10, 2009 at 05:29:07PM +0530, David Cournapeau wrote: > On Thu, Dec 10, 2009 at 5:13 PM, Gael Varoquaux > wrote: > > Does anybody have a suggestion on how to cleanly solve the problem? I > > fear I won't be the only one to have it. > The clean solution is to tell the deb package maintainer to fix the > packaging, and not to put numpy in /usr/include (at least, it should > be in /usr/include/numpy-version, like python). OK, so your point is that numpy headers should be retrieved via numpy.distutils only, right? We can probably make this argument, but it is going to make the life of those who use other build chains hard. Ga?l From pav at iki.fi Thu Dec 10 08:44:34 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 10 Dec 2009 15:44:34 +0200 Subject: [Numpy-discussion] [Nipy-devel] Impossibility to build nipy on recent numpy? In-Reply-To: <20091210132947.GC1202@phare.normalesup.org> References: <20091209213146.GA5468@phare.normalesup.org> <1e2af89e0912091336j31d1870fpf5679790e9956f1f@mail.gmail.com> <20091209214002.GB32739@phare.normalesup.org> <3f6ca94e0912100159o282cf1d4na8bef423b8ad69cc@mail.gmail.com> <20091210114353.GA1393@phare.normalesup.org> <5b8d13220912100359t6ec43f89y7fcb312aaed44de6@mail.gmail.com> <20091210132947.GC1202@phare.normalesup.org> Message-ID: <1260452674.18562.159.camel@talisman> to, 2009-12-10 kello 14:29 +0100, Gael Varoquaux kirjoitti: > On Thu, Dec 10, 2009 at 05:29:07PM +0530, David Cournapeau wrote: > > On Thu, Dec 10, 2009 at 5:13 PM, Gael Varoquaux > > wrote: > > > > > Does anybody have a suggestion on how to cleanly solve the problem? I > > > fear I won't be the only one to have it. > > > The clean solution is to tell the deb package maintainer to fix the > > packaging, and not to put numpy in /usr/include (at least, it should > > be in /usr/include/numpy-version, like python). > > OK, so your point is that numpy headers should be retrieved via > numpy.distutils only, right? 
We can probably make this argument, but it > is going to make the life of those who use other build chains hard. Those who need alternative build chains can locate the path to the correct numpy headers via python -c 'import numpy, os; print(os.path.dirname(numpy.__file__))+"/core/include"' which is simple to pop in e.g. in a Makefile. I think this is a price worth paying for supporting multiple versions, and is in any case required on non-Ubuntu platforms. -- Pauli Virtanen From gael.varoquaux at normalesup.org Thu Dec 10 08:58:49 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 10 Dec 2009 14:58:49 +0100 Subject: [Numpy-discussion] [Nipy-devel] Impossibility to build nipy on recent numpy? In-Reply-To: <1260452674.18562.159.camel@talisman> References: <20091209213146.GA5468@phare.normalesup.org> <1e2af89e0912091336j31d1870fpf5679790e9956f1f@mail.gmail.com> <20091209214002.GB32739@phare.normalesup.org> <3f6ca94e0912100159o282cf1d4na8bef423b8ad69cc@mail.gmail.com> <20091210114353.GA1393@phare.normalesup.org> <5b8d13220912100359t6ec43f89y7fcb312aaed44de6@mail.gmail.com> <20091210132947.GC1202@phare.normalesup.org> <1260452674.18562.159.camel@talisman> Message-ID: <20091210135849.GA31430@phare.normalesup.org> On Thu, Dec 10, 2009 at 03:44:34PM +0200, Pauli Virtanen wrote: > Those who need alternative build chains can locate the path to the > correct numpy headers via > python -c 'import numpy, os; print(os.path.dirname(numpy.__file__))+"/core/include"' > which is simple to pop in e.g. in a Makefile. > I think this is a price worth paying for supporting multiple versions, > and is in any case required on non-Ubuntu platforms. OK, so we need to bug report to ubuntu. Anybody feels like doing it, or do I need to go ahead :). Ga?l From charlesr.harris at gmail.com Thu Dec 10 10:07:53 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 10 Dec 2009 08:07:53 -0700 Subject: [Numpy-discussion] Fixed-point arithemetic...any solution yet? In-Reply-To: <4fe028e30912100222t283ddfabo2b05fb7f814bf813@mail.gmail.com> References: <4fe028e30912090656t5b82b5cfif865d60d07fe23ec@mail.gmail.com> <4fe028e30912100222t283ddfabo2b05fb7f814bf813@mail.gmail.com> Message-ID: On Thu, Dec 10, 2009 at 3:22 AM, Ruben Salvador wrote: > > On Wed, Dec 9, 2009 at 8:26 PM, Neal Becker wrote: > >> Ruben Salvador wrote: >> >> > Hello everybody. >> > >> > I've seen this question arise sometimes on the list, but don't know if >> > something has "happened" yet or not. I mean, any solution feasible to >> use >> > more or less right out of the box? >> > >> > I'm just a hardware engineer, so it would be difficult for me to create >> my >> > own class for this, since my knowledge of python/numpy is very limited, >> > and, just don't have the time/knowledge to be more than a simple user of >> > the language, not a developer. >> > >> > I have just come across this: >> > http://www.dilloneng.com/documents/downloads/demodel/ but haven't used >> it >> > yet. I'll give it a try and see how it works and come back to the list >> to >> > report somehow. But, is there any "official" plans for this within the >> > numpy developers? Is there any code around that may be used? I just need >> > to test my code with fixed point arithmetic (I'm modelling hardware....) >> > >> > Thanks for the good work to all the Python/Numpy developers (and all the >> > projects related, matplotlib and so on....) and for the possiblity of >> > freeing from matlab!!! 
I'm determined to do research with as many free >> > software design tools as possible....though this fixed-point arithmetic >> > issue is still a chain! >> > >> > Regards! >> >> I've done some experiments with adding a fixed-pt type to numpy, but in >> the >> end abandoned the effort. For now, I use integer arrays to store the >> data, >> and then just keep variables for the #bits and position of the binary >> point. >> >> For actual signal processing, I use c++ code. I have a class that is >> based >> on boost::constrained_value (unreleased) that gives me the behavior I want >> from fixed point scalars. >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > Well...I think I may also try this way. This FIXED_POINT_FACTOR scaling is > what is actually done implicitly in hardware to align bit vectors. And if I > over-dimension the bit length, I "won't need to take care" of the number of > bits after arithmetic operations... > > I'll try and see...but, if anybody has a quicker solution....I'm actually > in a hurry :S > > I had a look at the code I mentioned in my first email. It does the trick > someway, but from my point of view, needs some more tweaking to be usable in > a wider context. It only supports some operations and I just guess it will > fail in many numpy.array routines, if data is not cast previously (maybe not > since the actual numerical value is floating point, and the fixed point is > an internal representation of the class)...will try and report back.... > > Anyway, don't you people think we should boost this fixed-point issue in > numpy? We should make some kind of roadmap for the implementation, I think > it's a *MUST*. > > There is certainly a whole class of engineering problems for which it would be very useful. But things in numpy/scipy tend to get done when someone scratches their itch and none of the current developers seem to have this particular itch. Now, if someone comes along with a nice implementation, voila, they become a developer and the job gets done. Which is to say, no one is keeping the gate, contributions are welcome. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.kirkby at onetel.net Thu Dec 10 10:36:30 2009 From: david.kirkby at onetel.net (Dr. David Kirkby) Date: Thu, 10 Dec 2009 15:36:30 +0000 Subject: [Numpy-discussion] Numpy not reconising PA-RISC CPU on HP-UX workstation. In-Reply-To: References: <286f7bad0912091113n324a21b1n4c7f10ee1dff8a05@mail.gmail.com> Message-ID: <4B21157E.8000708@onetel.net> Charles R Harris wrote: > > > On Wed, Dec 9, 2009 at 12:13 PM, David Kirkby > wrote: > > See here for a fuller errror message, > > http://trac.sagemath.org/sage_trac/ticket/7166 > > but basically I see: > > gcc: build/src.hp-ux-B.11.11-9000-785-2.6/numpy/core/src/_sortmodule.c > In file included from numpy/core/include/numpy/npy_endian.h:22, > from numpy/core/include/numpy/ndarrayobject.h:26, > from numpy/core/include/numpy/noprefix.h:7, > from numpy/core/src/_sortmodule.c.src:29: > numpy/core/include/numpy/npy_cpu.h:49:6: error: #error Unknown CPU, > please report this to numpy maintainers with information about your > platform (OS, CPU and compiler) > In file included from numpy/core/include/numpy/ndarrayobject.h:26, > > The computer is a HP C3600 workstation > The CPU is a 64-bit 552 MHz PA-RISC device. > The OS is HP-UX 11i also known as HP-UX 11.11. 
> I'm not sure of the cache sizes on this, though I expect data can be
> found on the web.
>
> If a numpy maintainer wants access to the HP-UX machine, let me know
> your preferred login name by email, and I'll create you an account.
>
> I believe this has been fixed. Can you try the release candidate?
>
> Chuck

Hi,
Unfortunately I'm not in a position to verify if this is fixed, but I
will do later and let you know. But it might not be for a few
weeks/months.
> I was building numpy as part of Sage
>
> http://www.sagemath.org/
>
> which is not ported to HP-UX. In fact, there is no official aim to port
> Sage to HP-UX, though personally I'd like to see a port. My main
> motivation is not that I want to use it on HP-UX, but mainly that trying
> software on other platforms often reveals bugs that affect all platforms.
>
> What I did was just *quickly* check what bits of Sage built on HP-UX and
> what bits do not. One bit which does not build is python. So the numpy
> release candidate I downloaded detects the fact python does not exist,
> and so will not start to build.
>
> In contrast, the previous version of numpy started to compile the source
> files without checking python existed. Hence I found this problem with
> the unrecognised CPU.
>
> The HP-UX port of Sage is not high priority, but every now and again I
> spend a few hours on it, though devoting most of my efforts to an
> improved port to Solaris.
>
> If a numpy developer wants to try to build the sources, without python
> present, and check this, let me know and I'll create you an account. But
> until python is building in Sage, I do not wish to spend any more time
> on the numpy/HP-UX issue.

OK. Looking back at the patch that fixed this, it was submitted by Gentoo
Linux, so it may not be the right thing for HP-UX. Keep us informed.

Chuck

From robert.kern at gmail.com  Thu Dec 10 11:17:43 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 10 Dec 2009 10:17:43 -0600
Subject: [Numpy-discussion] [Nipy-devel] Impossibility to build nipy on recent numpy?
In-Reply-To: <20091210135849.GA31430@phare.normalesup.org>
References: <20091209213146.GA5468@phare.normalesup.org> <1e2af89e0912091336j31d1870fpf5679790e9956f1f@mail.gmail.com> <20091209214002.GB32739@phare.normalesup.org> <3f6ca94e0912100159o282cf1d4na8bef423b8ad69cc@mail.gmail.com> <20091210114353.GA1393@phare.normalesup.org> <5b8d13220912100359t6ec43f89y7fcb312aaed44de6@mail.gmail.com> <20091210132947.GC1202@phare.normalesup.org> <1260452674.18562.159.camel@talisman> <20091210135849.GA31430@phare.normalesup.org>
Message-ID: <3d375d730912100817n76b626dax6ae1d6026cb755f0@mail.gmail.com>

On Thu, Dec 10, 2009 at 07:58, Gael Varoquaux wrote:
> On Thu, Dec 10, 2009 at 03:44:34PM +0200, Pauli Virtanen wrote:
>> Those who need alternative build chains can locate the path to the
>> correct numpy headers via
>>
>>         python -c 'import numpy, os; print(os.path.dirname(numpy.__file__))+"/core/include"'
>>
>> which is simple to pop in e.g. in a Makefile.
>>
>> I think this is a price worth paying for supporting multiple versions,
>> and is in any case required on non-Ubuntu platforms.
>
> OK, so we need to bug report to ubuntu. Anybody feels like doing it, or
> do I need to go ahead :).

It's your problem. :-)

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco From robert.kern at gmail.com Thu Dec 10 11:18:53 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 10 Dec 2009 10:18:53 -0600 Subject: [Numpy-discussion] Assigning complex values to a real array In-Reply-To: References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com> <26705737.post@talk.nabble.com> <1260349494.18562.120.camel@talisman> <26720987.post@talk.nabble.com> <4B20B748.2050707@student.matnat.uio.no> Message-ID: <3d375d730912100818r32b1cfc9ted71cc018fbc4536@mail.gmail.com> On Thu, Dec 10, 2009 at 04:24, Pauli Virtanen wrote: > Thu, 10 Dec 2009 09:54:32 +0100, Dag Sverre Seljebotn wrote: > [clip] >> At this point I want to remind us about Charles Harris' very workable >> proposal: Raise a warning. That should both keep backward compatability >> and prevent people from wasting days. (Hopefully, we can avoid wasting >> days discussing this issue too :-) ). > > Yes. We can even make it a PendingDeprecationWarning, when we become > convinced that this feature should go away in entirety. PendingDeprecationWarnings are off by default, I think. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gael.varoquaux at normalesup.org Thu Dec 10 11:19:24 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 10 Dec 2009 17:19:24 +0100 Subject: [Numpy-discussion] [Nipy-devel] Impossibility to build nipy on recent numpy? In-Reply-To: <3d375d730912100817n76b626dax6ae1d6026cb755f0@mail.gmail.com> References: <20091209213146.GA5468@phare.normalesup.org> <1e2af89e0912091336j31d1870fpf5679790e9956f1f@mail.gmail.com> <20091209214002.GB32739@phare.normalesup.org> <3f6ca94e0912100159o282cf1d4na8bef423b8ad69cc@mail.gmail.com> <20091210114353.GA1393@phare.normalesup.org> <5b8d13220912100359t6ec43f89y7fcb312aaed44de6@mail.gmail.com> <20091210132947.GC1202@phare.normalesup.org> <1260452674.18562.159.camel@talisman> <20091210135849.GA31430@phare.normalesup.org> <3d375d730912100817n76b626dax6ae1d6026cb755f0@mail.gmail.com> Message-ID: <20091210161924.GB25773@phare.normalesup.org> On Thu, Dec 10, 2009 at 10:17:43AM -0600, Robert Kern wrote: > > OK, so we need to bug report to ubuntu. Anybody feels like doing it, or > > do I need to go ahead :). > It's your problem. :-) That's kinda what I thought. I was just try to dump work on someone else :). I'll do it. Ga?l From rmay31 at gmail.com Thu Dec 10 12:02:04 2009 From: rmay31 at gmail.com (Ryan May) Date: Thu, 10 Dec 2009 11:02:04 -0600 Subject: [Numpy-discussion] Assigning complex values to a real array In-Reply-To: <4B20B748.2050707@student.matnat.uio.no> References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <3d375d730903070135u28fb4085x86d0d6139d2dd28f@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com> <26705737.post@talk.nabble.com> <1260349494.18562.120.camel@talisman> <26720987.post@talk.nabble.com> <4B20B748.2050707@student.matnat.uio.no> Message-ID: On Thu, Dec 10, 2009 at 2:54 AM, Dag Sverre Seljebotn wrote: > Anne Archibald wrote: >> 2009/12/9 Dr. Phillip M. Feldman : >> >> >>> When I recently tried to validate a code, the answers were wrong, and it >>> took two full days to track down the cause. 
?I am now forced to reconsider >>> carefully whether Python/NumPy is a suitable platform for serious scientific >>> computing. >>> >> >> While I find the current numpy complex->real conversion annoying, I >> have to say, this kind of rhetoric does not benefit your cause. It >> sounds childish and manipulative, and makes even people who agree in >> principle want to tell you to go ahead and use MATLAB and stop >> pestering us. We are not here to sell you on numpy; if you hate it, >> don't use it. We are here because *we* use it, warts and all, and we >> want to discuss interesting topics related to numpy. That you would >> have implemented it differently is not very interesting if you are not >> even willing to understand why it is the way it is and what a change >> would cost, let alone propose a workable way to improve. >> > At this point I want to remind us about Charles Harris' very workable > proposal: Raise a warning. That should both keep backward compatability > and prevent people from wasting days. (Hopefully, we can avoid wasting > days discussing this issue too :-) ). +1 Completely agree. And to be clear, I realize the need not to break anything relying on this behavior. I just don't want people passing this off as a non-issue/'not a big deal'. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From Norbert.Nemec.list at gmx.de Thu Dec 10 13:36:26 2009 From: Norbert.Nemec.list at gmx.de (Norbert Nemec) Date: Thu, 10 Dec 2009 18:36:26 +0000 Subject: [Numpy-discussion] Fixed-point arithemetic...any solution yet? In-Reply-To: <4B1FFAA2.3040905@student.matnat.uio.no> References: <4fe028e30912090656t5b82b5cfif865d60d07fe23ec@mail.gmail.com> <4B1FFAA2.3040905@student.matnat.uio.no> Message-ID: <4B213FAA.4050308@gmx.de> Dag Sverre Seljebotn wrote: > I haven't heard of anything, but here's what I'd do: > - Use np.int64 > - Multiply all inputs to my code with 10^6 > - Divide all output from my code with 10^6 > - If you need to debug-print and array, simply define something like > > FIXED_POINT_FACTOR = 10**6 > > def printarr(x): > print x.astype(np.float) / FIXED_POINT_FACTOR > > Or am I missing something? > Indeed, you are missing that internal multiplications have to take into account this factor as well. To prevent loss of precision, you would need int128 results and shift those correctly after the multiplication. From ndbecker2 at gmail.com Thu Dec 10 14:02:22 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 10 Dec 2009 14:02:22 -0500 Subject: [Numpy-discussion] c++ code Message-ID: Here is my c++ code if you want to play with it. You'll need to find constrained_value for boost (or I can provide it as a patch if you request). As for semantics, different people actually want/expect different behavior. I chose the 'pythonic' approach. If you do fpa @ fpb, where fpa,b are fixed point numbers of same/different sizes and @ is a binary op, my library will _not_ promote the output. Explicit is better. If you want a(32 bit) x b(32 bit) -> c(64 bit), you promote the operands explicitly first. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: run_time_fixed_pt.hpp
Type: text/x-c++hdr
Size: 7592 bytes
Desc: not available
URL:

From pav at iki.fi  Thu Dec 10 14:34:18 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Thu, 10 Dec 2009 21:34:18 +0200
Subject: [Numpy-discussion] Assigning complex values to a real array
In-Reply-To: <3d375d730912100818r32b1cfc9ted71cc018fbc4536@mail.gmail.com>
References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <9457e7c80903070210o5704963exa5b5596970a22d8a@mail.gmail.com> <3d375d730903070218v6fd8e607kd0da6da9afefaa9d@mail.gmail.com> <26705737.post@talk.nabble.com> <1260349494.18562.120.camel@talisman> <26720987.post@talk.nabble.com> <4B20B748.2050707@student.matnat.uio.no> <3d375d730912100818r32b1cfc9ted71cc018fbc4536@mail.gmail.com>
Message-ID: <1260473657.8996.21.camel@idol>

to, 2009-12-10 kello 10:18 -0600, Robert Kern kirjoitti:
> On Thu, Dec 10, 2009 at 04:24, Pauli Virtanen wrote:
> > Thu, 10 Dec 2009 09:54:32 +0100, Dag Sverre Seljebotn wrote:
> > [clip]
> >> At this point I want to remind us about Charles Harris' very workable
> >> proposal: Raise a warning. That should both keep backward compatability
> >> and prevent people from wasting days. (Hopefully, we can avoid wasting
> >> days discussing this issue too :-) ).
> >
> > Yes. We can even make it a PendingDeprecationWarning, when we become
> > convinced that this feature should go away in entirety.
>
> PendingDeprecationWarnings are off by default, I think.

So it seems.

r7993 makes Numpy raise a custom ComplexWarning when these things
occur. It's easy for the user to change that into an error or silence
it.

--
Pauli Virtanen

From charlesr.harris at gmail.com  Thu Dec 10 14:44:41 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 10 Dec 2009 12:44:41 -0700
Subject: [Numpy-discussion] Assigning complex values to a real array
In-Reply-To: <1260473657.8996.21.camel@idol>
References: <9457e7c80903061718i1dbbcdc7td1c9dfb82e0b387@mail.gmail.com> <26705737.post@talk.nabble.com> <1260349494.18562.120.camel@talisman> <26720987.post@talk.nabble.com> <4B20B748.2050707@student.matnat.uio.no> <3d375d730912100818r32b1cfc9ted71cc018fbc4536@mail.gmail.com> <1260473657.8996.21.camel@idol>
Message-ID:

Hi Pauli,

On Thu, Dec 10, 2009 at 12:34 PM, Pauli Virtanen wrote:
> to, 2009-12-10 kello 10:18 -0600, Robert Kern kirjoitti:
> > On Thu, Dec 10, 2009 at 04:24, Pauli Virtanen wrote:
> > > Thu, 10 Dec 2009 09:54:32 +0100, Dag Sverre Seljebotn wrote:
> > > [clip]
> > >> At this point I want to remind us about Charles Harris' very workable
> > >> proposal: Raise a warning. That should both keep backward
> > >> compatability and prevent people from wasting days. (Hopefully, we
> > >> can avoid wasting days discussing this issue too :-) ).
> > >
> > > Yes. We can even make it a PendingDeprecationWarning, when we become
> > > convinced that this feature should go away in entirety.
> >
> > PendingDeprecationWarnings are off by default, I think.
>
> So it seems.
>
> r7993 makes Numpy raise a custom ComplexWarning when these things
> occur. It's easy for the user to change that into an error or silence
> it.
>

You need a small fixup to support python-2.4. For instance, the
deprecation warning is implemented as:

#if PY_VERSION_HEX >= 0x02050000
#define DEPRECATE(msg) PyErr_WarnEx(PyExc_DeprecationWarning,msg,1)
#else
#define DEPRECATE(msg) PyErr_Warn(PyExc_DeprecationWarning,msg)
#endif

Thanks for settling this discussion ;)

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From david.huard at gmail.com Thu Dec 10 16:04:14 2009 From: david.huard at gmail.com (David Huard) Date: Thu, 10 Dec 2009 16:04:14 -0500 Subject: [Numpy-discussion] Histogram - removing the "new" keyword for 1.4 Message-ID: <91cf711d0912101304x7eadc243oe4efef29b0b7b956@mail.gmail.com> Hi all, A long time ago, it was decided to change the default behaviour of the histogram function. The new behaviour has been the default in 1.3 and usage of the old behaviour has raised a warning. According to the timeline discussed at the time, version 1.4 would be the time to remove the old stuff entirely. I was wondering if this was satisfactory for all, or if someone still depends on "new=False" ? I don't think there is any harm in keeping it around until 1.5, we'd just have to update the docstring to reflect this. Of course, following the original plan would be better. I'm sorry to bring this so late in the release cycle. Cheers, David Huard -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Thu Dec 10 17:28:52 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 10 Dec 2009 14:28:52 -0800 Subject: [Numpy-discussion] [Nipy-devel] Impossibility to build nipy on recent numpy? In-Reply-To: <5b8d13220912100359t6ec43f89y7fcb312aaed44de6@mail.gmail.com> References: <20091209213146.GA5468@phare.normalesup.org> <1e2af89e0912091336j31d1870fpf5679790e9956f1f@mail.gmail.com> <20091209214002.GB32739@phare.normalesup.org> <3f6ca94e0912100159o282cf1d4na8bef423b8ad69cc@mail.gmail.com> <20091210114353.GA1393@phare.normalesup.org> <5b8d13220912100359t6ec43f89y7fcb312aaed44de6@mail.gmail.com> Message-ID: On Thu, Dec 10, 2009 at 3:59 AM, David Cournapeau wrote: > You could also try playing with the order of the -Ipath to make sure > the right one us picked up before /usr/include, but I am not sure it > is even possible, as /usr/include may always be the first one gcc > looks in. Gael, orthogonal to reporting this to Ubuntu, I think you can do what David suggests by configuring your variables correctly. I have a pretty strictly manipulated set of valid installation $PREFIX location, all of which I can use for ./configure --prefix=$PREFIX or ./setup.py install --prefix=$PREFIX and that's because for each one of these, I correctly configure ALL of these: PATH: binary execution LD_LIBRARY_PATH: dynamic linker search path LIBRARY_PATH: static linking by gcc (like -L) CPATH: generic include path for gcc (like -I), used for all languages C_INCLUDE_PATH: C-specific include path, after CPATH CPLUS_INCLUDE_PATH: C++-specific include path, after CPATH PYTHONPATH: search path for python packages I have some bash code to do this automatically, I can send it your way if you want. This doesn't out of the box work for in-place installs, because it creates pythonpath with pythonX.Y/site-packages, so it would not cover your in-place headers. But you could easily use the utilities in there to configure some of the *PATH variables with the location of your numpy source tree, so the in-place installs work as you expect them. I haven't tested it, but I think it should work. Cheers, f From gael.varoquaux at normalesup.org Thu Dec 10 17:40:28 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 10 Dec 2009 23:40:28 +0100 Subject: [Numpy-discussion] [Nipy-devel] Impossibility to build nipy on recent numpy? 
In-Reply-To: References: <20091209213146.GA5468@phare.normalesup.org> <1e2af89e0912091336j31d1870fpf5679790e9956f1f@mail.gmail.com> <20091209214002.GB32739@phare.normalesup.org> <3f6ca94e0912100159o282cf1d4na8bef423b8ad69cc@mail.gmail.com> <20091210114353.GA1393@phare.normalesup.org> <5b8d13220912100359t6ec43f89y7fcb312aaed44de6@mail.gmail.com> Message-ID: <20091210224028.GA25147@phare.normalesup.org> On Thu, Dec 10, 2009 at 02:28:52PM -0800, Fernando Perez wrote: > and that's because for each one of these, I correctly configure ALL of these: > PATH: binary execution > LD_LIBRARY_PATH: dynamic linker search path > LIBRARY_PATH: static linking by gcc (like -L) > CPATH: generic include path for gcc (like -I), used for all languages > C_INCLUDE_PATH: C-specific include path, after CPATH > CPLUS_INCLUDE_PATH: C++-specific include path, after CPATH > PYTHONPATH: search path for python packages I have an aversion to changing LD_LIBRARY_PATH, LIBRARY_PATH, C_INCLUDE_PATH, CPATH, and CPLUS_INCLUDE_PATH, because I believe it leads to non reproducibles builds or run-time. More importantly, having to pull these tricks often reveals non-compatible run-times, and thus impossibility to share code, eg to use system code in localy-installed packages[*]. Now, we are indeed in such a situation (because we have binary incompatibility between numpy 1.3 and numpy 1.4), and there is nothing I can do to avoid it. It just makes me very unhappy. I am confortable enough with a Linux build environment to pull tricks to get stuff to build and run most of the time. I do however consider it wrong to have to do so. I don't want to use these tricks systematicaly. I'd much rather trip on the incompatibilities, and report them as bugs or annoyances. Thanks for offering, though. Ga?l [*] To give examples, I don't consider that I should have to build atlas to use scipy svn. I do it quite often because I get better performance, but it should not be necessary. Similarly, I want to be able to install Mayavi svn and use the system VTK, and the system VTK Python bindings. Matplotlib should be able to use pygtk from the system... From fperez.net at gmail.com Thu Dec 10 17:53:41 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 10 Dec 2009 14:53:41 -0800 Subject: [Numpy-discussion] [Nipy-devel] Impossibility to build nipy on recent numpy? In-Reply-To: <20091210224028.GA25147@phare.normalesup.org> References: <20091209213146.GA5468@phare.normalesup.org> <1e2af89e0912091336j31d1870fpf5679790e9956f1f@mail.gmail.com> <20091209214002.GB32739@phare.normalesup.org> <3f6ca94e0912100159o282cf1d4na8bef423b8ad69cc@mail.gmail.com> <20091210114353.GA1393@phare.normalesup.org> <5b8d13220912100359t6ec43f89y7fcb312aaed44de6@mail.gmail.com> <20091210224028.GA25147@phare.normalesup.org> Message-ID: On Thu, Dec 10, 2009 at 2:40 PM, Gael Varoquaux wrote: > I have an aversion to changing LD_LIBRARY_PATH, LIBRARY_PATH, > C_INCLUDE_PATH, CPATH, and CPLUS_INCLUDE_PATH, because I believe it leads > to non reproducibles builds or run-time. > > More importantly, having to pull these tricks often reveals > non-compatible run-times, and thus impossibility to share code, eg to use > system code in localy-installed packages[*]. Having local prefixes that are fully defined (i.e. where all *path variables are valid) makes it easy to do local installs under $HOME of just about anything, something that I need often when I'm running in environments where I don't have root access. 
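For anyone who wants to poke at this, here is a self-contained sketch
along the lines Jasper describes; the shapes are invented, since the
attached test file did not survive the archive:

    import numpy as np
    import timeit

    rng = np.random.RandomState(0)
    Xi = rng.standard_normal((1000, 50))
    Xi2 = np.dot(Xi, Xi.T)   # precomputed: dot(Xi, w + Xi[mu]) == dot(Xi, w) + Xi2[mu]
    w = rng.standard_normal(50)
    E = np.dot(Xi, w)

    def step_dot(mu=0):
        w[:] = w + Xi[mu, :]   # update the weights in place
        return np.dot(Xi, w)   # recompute the full matrix-vector product

    def step_row(mu=0):
        w[:] = w + Xi[mu, :]
        E[:] = E + Xi2[mu, :]  # add one precomputed row instead
        return E

    print timeit.timeit(step_dot, number=10000)
    print timeit.timeit(step_row, number=10000)

With arrays this small, per-call overhead rather than arithmetic can
easily dominate the timings, so the ranking will depend on the sizes
chosen.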
That's really why I settled on this practice, and so far it's served me well. Cheers, f

From sccolbert at gmail.com Thu Dec 10 19:09:23 2009 From: sccolbert at gmail.com (Chris Colbert) Date: Fri, 11 Dec 2009 01:09:23 +0100 Subject: [Numpy-discussion] Fixed-point arithmetic...any solution yet? In-Reply-To: <4B213FAA.4050308@gmx.de> References: <4fe028e30912090656t5b82b5cfif865d60d07fe23ec@mail.gmail.com> <4B1FFAA2.3040905@student.matnat.uio.no> <4B213FAA.4050308@gmx.de> Message-ID: <7f014ea60912101609me738fco55a66da93a566ee9@mail.gmail.com>

not to mention that that idea probably isn't going to work if his problem is non-linear ;)

On Thu, Dec 10, 2009 at 7:36 PM, Norbert Nemec wrote: > > > Dag Sverre Seljebotn wrote: >> I haven't heard of anything, but here's what I'd do: >> - Use np.int64 >> - Multiply all inputs to my code with 10^6 >> - Divide all output from my code with 10^6 >> - If you need to debug-print an array, simply define something like >> >> FIXED_POINT_FACTOR = 10**6 >> >> def printarr(x): >> print x.astype(np.float) / FIXED_POINT_FACTOR >> >> Or am I missing something? >> > Indeed, you are missing that internal multiplications have to take into > account this factor as well. To prevent loss of precision, you would > need int128 results and shift those correctly after the multiplication. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion >

From fperez.net at gmail.com Thu Dec 10 21:27:56 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 10 Dec 2009 18:27:56 -0800 Subject: [Numpy-discussion] doctest improvements patch (and possible regressions) In-Reply-To: <4B1FAE55.9070605@stsci.edu> References: <4B1EB148.1010607@gmail.com> <4B1FAE55.9070605@stsci.edu> Message-ID:

On Wed, Dec 9, 2009 at 6:04 AM, Michael Droettboom wrote: > Sorry, just commenting on the parts I feel competent in :) But I think > this is a great improvement. It would be nice to start doing doctests > as a matter of course to keep the docs accurate. >

Indeed. From the sidelines, thanks a lot to Paul for this work! Speaking of testing, here's a bit of code that I've had for a while pending for ipython, but which I think could be very useful for numpy too. Feel free to rip only the parts you need... The point of this code is to get a few nose-style features that produce normal unittests. In particular, it introduces support for single functions to be tests (something nose does but unittest doesn't), and most importantly, parametric tests that can be debugged meaningfully. A simple example: def is_smaller(i,j): assert i < j

From cgohlke at uci.edu Thu Dec 10 23:37:54 2009 From: cgohlke at uci.edu (Christoph Gohlke) Date: Thu, 10 Dec 2009 20:37:54 -0800 Subject: [Numpy-discussion] Test reports of numpy 1.4.0.dev built with Visual Studio Message-ID: <4B21CCA2.1060508@uci.edu>

Hello, I built Windows binaries of numpy-1.4.0rc2.dev7996 for Python 2.4, 2.5, and 2.6 (32 and 64-bit) using Microsoft Visual Studio 2003 and 2008. Looks good: the 'setup.py bdist_wininst' builds were all successful. The output of the 'python.exe -c "import numpy;numpy.test()"' runs are attached. The Python 2.6 VS2008 builds pass all tests (except known failures). The Python 2.4 and 2.5 VS2003 builds fail 8 tests, mostly due to NaN/Inf/0 and formatting mismatches.
I also compiled and test-ran some Python extension packages (matplotlib, PIL, PyMOL, etc) against these numpy builds and did not notice any failures due to the numpy upgrade. Test platform: VS2008 9.0.30729.1 SP; VS2003 7.1.3088; Windows 7 Pro 64-bit; Core2Quad CPU; No external ATLAS/MKL/BLAS/LAPACK/FFTW libraries used; Python versions are noted in the attachments. Best, Christoph Gohlke Laboratory for Fluorescence Dynamics University of California, Irvine http://www.lfd.uci.edu/~gohlke/

-------------- next part --------------
Four test-log attachments were scrubbed: numpy-1.4.win32-py2.5.txt, numpy-1.4.win32-py2.4.txt, numpy-1.4.win32-py2.6.txt, numpy-1.4.win-amd64-py2.6.txt

From Nicolas.Rougier at loria.fr Fri Dec 11 03:50:10 2009 From: Nicolas.Rougier at loria.fr (Nicolas Rougier) Date: Fri, 11 Dec 2009 09:50:10 +0100 Subject: [Numpy-discussion] nan_to_num and bool arrays Message-ID: <1260521410.4517.8.camel@sulfur>

Hello, Using both numpy 1.3.0 and 1.4.0rc1 I got the following exception using nan_to_num on a bool array, is that the expected behavior ?

>>> import numpy
>>> Z = numpy.zeros((3,3),dtype=bool)
>>> numpy.nan_to_num(Z)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.6/dist-packages/numpy/lib/type_check.py", line 374, in nan_to_num
maxf, minf = _getmaxmin(y.dtype.type)
File "/usr/lib/python2.6/dist-packages/numpy/lib/type_check.py", line 307, in _getmaxmin
f = getlimits.finfo(t)
File "/usr/lib/python2.6/dist-packages/numpy/core/getlimits.py", line 103, in __new__
raise ValueError, "data type %r not inexact" % (dtype)
ValueError: data type not inexact

Nicolas

From th.v.d.gronde at hccnet.nl Fri Dec 11 06:49:10 2009 From: th.v.d.gronde at hccnet.nl (Jasper van de Gronde) Date: Fri, 11 Dec 2009 12:49:10 +0100 Subject: [Numpy-discussion] Slicing slower than matrix multiplication? Message-ID: <4B2231B6.6000709@hccnet.nl>

(Resending without attachment as I don't think my previous message arrived.) I just started using numpy and am very, very pleased with the functionality and cleanness so far. However, I tried what I thought would be a simple optimization and found that the opposite was true. Specifically, I had a loop where something like this was done:

w += Xi[mu,:]
E = np.dot(Xi,w)

Instead of repeatedly doing the matrix product I thought I'd do the matrix product just once, before the loop, compute the product np.dot(Xi,Xi.T) and then do:

w += Xi[mu,:]
E += Xi2[mu,:]

Seems like a clear winner, instead of doing a matrix multiplication it simply has to sum two vectors (in-place). However, it turned out to be 1.5 times SLOWER... I've attached a test file which shows the problem. It also tries adding columns instead of rows (in case the memory layout is playing tricks), but this seems to make no difference. This is the output I got:

Dot product: 5.188786
Add a row: 8.032767
Add a column: 8.070953

Any ideas on why adding a row (or column) of a matrix is slower than computing a matrix product with a similarly sized matrix... (Xi has less columns than Xi2, but just as many rows.)
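Before the replies below, it may help to spell out why the shortcut in this post is algebraically valid: if E = np.dot(Xi, w), then adding row mu of Xi to w changes E by np.dot(Xi, Xi[mu,:]), which is exactly row mu of the symmetric matrix Xi2 = np.dot(Xi, Xi.T). A small self-contained check of this (P, N and mu here are arbitrary illustrative choices, not the poster's exact setup):

import numpy as np

P, N, mu = 60, 20, 7
Xi = np.random.standard_normal((P, N))
w = np.random.standard_normal(N)
Xi2 = np.dot(Xi, Xi.T)

E = np.dot(Xi, w)    # full matrix-vector product
w += Xi[mu, :]       # the update from the post
E += Xi2[mu, :]      # incremental update: row mu of Xi*Xi.T
print np.allclose(E, np.dot(Xi, w))   # True: both routes agree

So the two code paths really do compute the same E; the question in the thread is purely about speed.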
From dagss at student.matnat.uio.no Fri Dec 11 07:01:18 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 11 Dec 2009 13:01:18 +0100 Subject: [Numpy-discussion] Slicing slower than matrix multiplication? In-Reply-To: <4B2231B6.6000709@hccnet.nl> References: <4B2231B6.6000709@hccnet.nl> Message-ID: <4B22348E.8060502@student.matnat.uio.no> Jasper van de Gronde wrote: > (Resending without attachment as I don't think my previous message arrived.) > > I just started using numpy and am very, very pleased with the > functionality and cleanness so far. However, I tried what I though would > be a simple optimization and found that the opposite was true. > Specifically, I had a loop where something like this was done: > > w += Xi[mu,:] > E = np.dot(Xi,w) > > Instead of repeatedly doing the matrix product I thought I'd do the > matrix product just once, before the loop, compute the product > np.dot(Xi,Xi.T) and then do: > > w += Xi[mu,:] > E += Xi2[mu,:] > > Seems like a clear winner, instead of doing a matrix multiplication it > simply has to sum two vectors (in-place). However, it turned out to be > 1.5 times SLOWER... > > I've attached a test file which shows the problem. It also tries adding > columns instead of rows (in case the memory layout is playing tricks), > but this seems to make no difference. This is the output I got: > > Dot product: 5.188786 > Add a row: 8.032767 > Add a column: 8.070953 > > Any ideas on why adding a row (or column) of a matrix is slower than > computing a matrix product with a similarly sized matrix... (Xi has less > columns than Xi2, but just as many rows.) > I think we need some numbers to put this into context -- how big are the vectors/matrices? How many iterations was the loop run? If the vectors are small and the loop is run many times, how fast the operation "ought" to be is irrelevant as it would drown in Python overhead. Dag Sverre From th.v.d.gronde at hccnet.nl Fri Dec 11 08:55:57 2009 From: th.v.d.gronde at hccnet.nl (Jasper van de Gronde) Date: Fri, 11 Dec 2009 14:55:57 +0100 Subject: [Numpy-discussion] Slicing slower than matrix multiplication? In-Reply-To: <4B22348E.8060502@student.matnat.uio.no> References: <4B2231B6.6000709@hccnet.nl> <4B22348E.8060502@student.matnat.uio.no> Message-ID: <4B224F6D.6000606@hccnet.nl> Dag Sverre Seljebotn wrote: > Jasper van de Gronde wrote: >> I've attached a test file which shows the problem. It also tries adding >> columns instead of rows (in case the memory layout is playing tricks), >> but this seems to make no difference. This is the output I got: >> >> Dot product: 5.188786 >> Add a row: 8.032767 >> Add a column: 8.070953 >> >> Any ideas on why adding a row (or column) of a matrix is slower than >> computing a matrix product with a similarly sized matrix... (Xi has less >> columns than Xi2, but just as many rows.) >> > I think we need some numbers to put this into context -- how big are the > vectors/matrices? How many iterations was the loop run? If the vectors > are small and the loop is run many times, how fast the operation "ought" > to be is irrelevant as it would drown in Python overhead. Originally I had attached a Python file demonstrating the problem, but apparently this wasn't accepted by the list. In any case, the matrices and vectors weren't too big (60x20), so I tried making them bigger and indeed the "fast" version was now considerably faster. But still, this seems like a very odd difference. 
I know Python is an interpreted language and has a lot of overhead, but still, selecting a row/column shouldn't be THAT slow, should it? To be clear, this is the code I used for testing: -------------------------------------------------------------------- import timeit setupCode = """ import numpy as np P = 60 N = 20 Xi = np.random.standard_normal((P,N)) w = np.random.standard_normal((N)) Xi2 = np.dot(Xi,Xi.T) E = np.dot(Xi,w) """ N = 10000 dotProduct = timeit.Timer('E = np.dot(Xi,w)',setupCode) additionRow = timeit.Timer('E += Xi2[P/2,:]',setupCode) additionCol = timeit.Timer('E += Xi2[:,P/2]',setupCode) print "Dot product: %f" % dotProduct.timeit(N) print "Add a row: %f" % additionRow.timeit(N) print "Add a column: %f" % additionCol.timeit(N) -------------------------------------------------------------------- From dagss at student.matnat.uio.no Fri Dec 11 10:44:29 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Fri, 11 Dec 2009 16:44:29 +0100 Subject: [Numpy-discussion] Slicing slower than matrix multiplication? In-Reply-To: <4B224F6D.6000606@hccnet.nl> References: <4B2231B6.6000709@hccnet.nl> <4B22348E.8060502@student.matnat.uio.no> <4B224F6D.6000606@hccnet.nl> Message-ID: <4B2268DD.9070603@student.matnat.uio.no> Jasper van de Gronde wrote: > Dag Sverre Seljebotn wrote: > >> Jasper van de Gronde wrote: >> >>> I've attached a test file which shows the problem. It also tries adding >>> columns instead of rows (in case the memory layout is playing tricks), >>> but this seems to make no difference. This is the output I got: >>> >>> Dot product: 5.188786 >>> Add a row: 8.032767 >>> Add a column: 8.070953 >>> >>> Any ideas on why adding a row (or column) of a matrix is slower than >>> computing a matrix product with a similarly sized matrix... (Xi has less >>> columns than Xi2, but just as many rows.) >>> >>> >> I think we need some numbers to put this into context -- how big are the >> vectors/matrices? How many iterations was the loop run? If the vectors >> are small and the loop is run many times, how fast the operation "ought" >> to be is irrelevant as it would drown in Python overhead. >> > > Originally I had attached a Python file demonstrating the problem, but > apparently this wasn't accepted by the list. In any case, the matrices > and vectors weren't too big (60x20), so I tried making them bigger and > indeed the "fast" version was now considerably faster. > 60x20 is "nothing", so a full matrix multiplication or a single matrix-vector probably takes the same time (that is, the difference between them in itself is likely smaller than the error you make during measuring). In this context, the benchmarks will be completely dominated by the number of Python calls you make (each, especially taking the slice, means allocating Python objects, calling a bunch of functions in C, etc. etc). So it's not that strange, taking a slice isn't free, some Python objects must be created etc. etc. I think the lesson mostly should be that with so little data, benchmarking becomes a very difficult art. Dag Sverre > But still, this seems like a very odd difference. I know Python is an > interpreted language and has a lot of overhead, but still, selecting a > row/column shouldn't be THAT slow, should it? 
To be clear, this is the > code I used for testing: > -------------------------------------------------------------------- > import timeit > > setupCode = """ > import numpy as np > > P = 60 > N = 20 > > Xi = np.random.standard_normal((P,N)) > w = np.random.standard_normal((N)) > Xi2 = np.dot(Xi,Xi.T) > E = np.dot(Xi,w) > """ > > N = 10000 > > dotProduct = timeit.Timer('E = np.dot(Xi,w)',setupCode) > additionRow = timeit.Timer('E += Xi2[P/2,:]',setupCode) > additionCol = timeit.Timer('E += Xi2[:,P/2]',setupCode) > print "Dot product: %f" % dotProduct.timeit(N) > print "Add a row: %f" % additionRow.timeit(N) > print "Add a column: %f" % additionCol.timeit(N) > -------------------------------------------------------------------- > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From faltet at pytables.org Fri Dec 11 11:03:21 2009 From: faltet at pytables.org (Francesc Alted) Date: Fri, 11 Dec 2009 17:03:21 +0100 Subject: [Numpy-discussion] Slicing slower than matrix multiplication? In-Reply-To: <4B2268DD.9070603@student.matnat.uio.no> References: <4B2231B6.6000709@hccnet.nl> <4B224F6D.6000606@hccnet.nl> <4B2268DD.9070603@student.matnat.uio.no> Message-ID: <200912111703.21261.faltet@pytables.org> A Friday 11 December 2009 16:44:29 Dag Sverre Seljebotn escrigu?: > Jasper van de Gronde wrote: > > Dag Sverre Seljebotn wrote: > >> Jasper van de Gronde wrote: > >>> I've attached a test file which shows the problem. It also tries adding > >>> columns instead of rows (in case the memory layout is playing tricks), > >>> but this seems to make no difference. This is the output I got: > >>> > >>> Dot product: 5.188786 > >>> Add a row: 8.032767 > >>> Add a column: 8.070953 > >>> > >>> Any ideas on why adding a row (or column) of a matrix is slower than > >>> computing a matrix product with a similarly sized matrix... (Xi has > >>> less columns than Xi2, but just as many rows.) > >> > >> I think we need some numbers to put this into context -- how big are the > >> vectors/matrices? How many iterations was the loop run? If the vectors > >> are small and the loop is run many times, how fast the operation "ought" > >> to be is irrelevant as it would drown in Python overhead. > > > > Originally I had attached a Python file demonstrating the problem, but > > apparently this wasn't accepted by the list. In any case, the matrices > > and vectors weren't too big (60x20), so I tried making them bigger and > > indeed the "fast" version was now considerably faster. > > 60x20 is "nothing", so a full matrix multiplication or a single > matrix-vector probably takes the same time (that is, the difference > between them in itself is likely smaller than the error you make during > measuring). > > In this context, the benchmarks will be completely dominated by the > number of Python calls you make (each, especially taking the slice, > means allocating Python objects, calling a bunch of functions in C, etc. > etc). So it's not that strange, taking a slice isn't free, some Python > objects must be created etc. etc. 
Yeah, I think taking slices here is taking quite a lot of time:

In [58]: timeit E + Xi2[P/2,:]
100000 loops, best of 3: 3.95 µs per loop

In [59]: timeit E + Xi2[P/2]
100000 loops, best of 3: 2.17 µs per loop

don't know why the additional ',:' in the slice is taking so much time, but my guess is that passing & analyzing the second argument (slice(None,None,None)) could be responsible for the slowdown (but that seems like too much time for that). Mmh, perhaps it would be worth studying this more carefully so that an optimization could be done in NumPy.

> I think the lesson mostly should be that with so little data, > benchmarking becomes a very difficult art.

Well, I think it is not difficult, it is just that you are perhaps benchmarking Python/NumPy machinery instead ;-) I'm curious whether Matlab can do slicing much faster than NumPy. Jasper?

-- Francesc Alted

From gael.varoquaux at normalesup.org Fri Dec 11 11:07:32 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 11 Dec 2009 17:07:32 +0100 Subject: [Numpy-discussion] [Nipy-devel] Impossibility to build nipy on recent numpy? In-Reply-To: <20091210161924.GB25773@phare.normalesup.org> References: <1e2af89e0912091336j31d1870fpf5679790e9956f1f@mail.gmail.com> <20091209214002.GB32739@phare.normalesup.org> <3f6ca94e0912100159o282cf1d4na8bef423b8ad69cc@mail.gmail.com> <20091210114353.GA1393@phare.normalesup.org> <5b8d13220912100359t6ec43f89y7fcb312aaed44de6@mail.gmail.com> <20091210132947.GC1202@phare.normalesup.org> <1260452674.18562.159.camel@talisman> <20091210135849.GA31430@phare.normalesup.org> <3d375d730912100817n76b626dax6ae1d6026cb755f0@mail.gmail.com> <20091210161924.GB25773@phare.normalesup.org> Message-ID: <20091211160732.GB22114@phare.normalesup.org>

On Thu, Dec 10, 2009 at 05:19:24PM +0100, Gael Varoquaux wrote: > On Thu, Dec 10, 2009 at 10:17:43AM -0600, Robert Kern wrote: > > > OK, so we need to file a bug report with Ubuntu. Anybody feel like doing it, or > > > do I need to go ahead :). > > It's your problem. :-) > That's kinda what I thought. I was just trying to dump work on someone else > :). I'll do it.

OK, done: https://bugs.launchpad.net/ubuntu/+source/python-numpy/+bug/495537

From kwgoodman at gmail.com Fri Dec 11 11:21:22 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 11 Dec 2009 08:21:22 -0800 Subject: [Numpy-discussion] nan_to_num and bool arrays In-Reply-To: <1260521410.4517.8.camel@sulfur> References: <1260521410.4517.8.camel@sulfur> Message-ID:

On Fri, Dec 11, 2009 at 12:50 AM, Nicolas Rougier wrote: > > Hello, > > Using both numpy 1.3.0 and 1.4.0rc1 I got the following exception using > nan_to_num on a bool array, is that the expected behavior ? > > >>>> import numpy >>>> Z = numpy.zeros((3,3),dtype=bool) >>>> numpy.nan_to_num(Z) > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > File "/usr/lib/python2.6/dist-packages/numpy/lib/type_check.py", line > 374, in nan_to_num > maxf, minf = _getmaxmin(y.dtype.type) > File "/usr/lib/python2.6/dist-packages/numpy/lib/type_check.py", line > 307, in _getmaxmin > f = getlimits.finfo(t) > File "/usr/lib/python2.6/dist-packages/numpy/core/getlimits.py", line > 103, in __new__ > raise ValueError, "data type %r not inexact" % (dtype) > ValueError: data type not inexact

I guess a check for bool could be added at the top of nan_to_num. If the input x is a bool then nan_to_num would just return x unchanged. Or perhaps maxf, minf = _getmaxmin(y.dtype.type) could return False, True. Best bet is probably to file a ticket. And then pray.
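The guard Keith sketches can be written out explicitly. The following is a minimal illustration of the idea only (nan_to_num_safe is a made-up name, not a NumPy function, and this is not the patch that eventually went into NumPy):

import numpy as np

def nan_to_num_safe(x):
    # nan and inf only exist for inexact (floating/complex) dtypes,
    # so anything else can be passed through untouched.
    y = np.asarray(x)
    if not issubclass(y.dtype.type, np.inexact):
        return y
    return np.nan_to_num(y)

print nan_to_num_safe(np.zeros((3, 3), dtype=bool))  # no exception
print nan_to_num_safe(np.array([np.nan, np.inf]))    # [0., huge finite number]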
From bsouthey at gmail.com Fri Dec 11 11:36:54 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 11 Dec 2009 10:36:54 -0600 Subject: [Numpy-discussion] Slicing slower than matrix multiplication? In-Reply-To: <200912111703.21261.faltet@pytables.org> References: <4B2231B6.6000709@hccnet.nl> <4B224F6D.6000606@hccnet.nl> <4B2268DD.9070603@student.matnat.uio.no> <200912111703.21261.faltet@pytables.org> Message-ID: <4B227526.10707@gmail.com>

On 12/11/2009 10:03 AM, Francesc Alted wrote: > A Friday 11 December 2009 16:44:29 Dag Sverre Seljebotn escrigué: >> Jasper van de Gronde wrote: >>> Dag Sverre Seljebotn wrote: >>>> Jasper van de Gronde wrote: >>>>> I've attached a test file which shows the problem. It also tries adding >>>>> columns instead of rows (in case the memory layout is playing tricks), >>>>> but this seems to make no difference. This is the output I got: >>>>> >>>>> Dot product: 5.188786 >>>>> Add a row: 8.032767 >>>>> Add a column: 8.070953 >>>>> >>>>> Any ideas on why adding a row (or column) of a matrix is slower than >>>>> computing a matrix product with a similarly sized matrix... (Xi has less >>>>> columns than Xi2, but just as many rows.) >>>> I think we need some numbers to put this into context -- how big are the >>>> vectors/matrices? How many iterations was the loop run? If the vectors >>>> are small and the loop is run many times, how fast the operation "ought" >>>> to be is irrelevant as it would drown in Python overhead. >>> Originally I had attached a Python file demonstrating the problem, but >>> apparently this wasn't accepted by the list. In any case, the matrices >>> and vectors weren't too big (60x20), so I tried making them bigger and >>> indeed the "fast" version was now considerably faster. >> 60x20 is "nothing", so a full matrix multiplication or a single >> matrix-vector probably takes the same time (that is, the difference >> between them in itself is likely smaller than the error you make during >> measuring). >> >> In this context, the benchmarks will be completely dominated by the >> number of Python calls you make (each, especially taking the slice, >> means allocating Python objects, calling a bunch of functions in C, etc. >> etc). So it's not that strange, taking a slice isn't free, some Python >> objects must be created etc. etc. > Yeah, I think taking slices here is taking quite a lot of time: > > In [58]: timeit E + Xi2[P/2,:] > 100000 loops, best of 3: 3.95 µs per loop > > In [59]: timeit E + Xi2[P/2] > 100000 loops, best of 3: 2.17 µs per loop > > don't know why the additional ',:' in the slice is taking so much time, but my > guess is that passing & analyzing the second argument (slice(None,None,None)) > could be responsible for the slowdown (but that seems like too much time for that). > Mmh, perhaps it would be worth studying this more carefully so that an > optimization could be done in NumPy. > >> I think the lesson mostly should be that with so little data, >> benchmarking becomes a very difficult art. > Well, I think it is not difficult, it is just that you are perhaps > benchmarking Python/NumPy machinery instead ;-) I'm curious whether Matlab > can do slicing much faster than NumPy. Jasper?

What are you actually trying to test here? I do not see any equivalence in the operations or output here. -With your slices you need two dot products but ultimately you are only using one for your dot product. -There are addition operations on the slices that are not present in the dot product.
-The final E arrays are not the same for all three operations. Having said that, the more you can vectorize your function, the more efficient it will likely be especially with Atlas etc. Bruce

From Nicolas.Rougier at loria.fr Fri Dec 11 12:18:20 2009 From: Nicolas.Rougier at loria.fr (Nicolas Rougier) Date: Fri, 11 Dec 2009 18:18:20 +0100 Subject: [Numpy-discussion] nan_to_num and bool arrays In-Reply-To: References: <1260521410.4517.8.camel@sulfur> Message-ID: <8805FA35-DF06-48F0-A7CA-9D092E97C697@loria.fr>

I've created a ticket (#1327). Nicolas

On Dec 11, 2009, at 17:21 , Keith Goodman wrote: > On Fri, Dec 11, 2009 at 12:50 AM, Nicolas Rougier > wrote: >> >> Hello, >> >> Using both numpy 1.3.0 and 1.4.0rc1 I got the following exception using >> nan_to_num on a bool array, is that the expected behavior ? >> >> >>>>> import numpy >>>>> Z = numpy.zeros((3,3),dtype=bool) >>>>> numpy.nan_to_num(Z) >> Traceback (most recent call last): >> File "<stdin>", line 1, in <module> >> File "/usr/lib/python2.6/dist-packages/numpy/lib/type_check.py", line >> 374, in nan_to_num >> maxf, minf = _getmaxmin(y.dtype.type) >> File "/usr/lib/python2.6/dist-packages/numpy/lib/type_check.py", line >> 307, in _getmaxmin >> f = getlimits.finfo(t) >> File "/usr/lib/python2.6/dist-packages/numpy/core/getlimits.py", line >> 103, in __new__ >> raise ValueError, "data type %r not inexact" % (dtype) >> ValueError: data type not inexact > > I guess a check for bool could be added at the top of nan_to_num. If > the input x is a bool then nan_to_num would just return x unchanged. > Or perhaps > > maxf, minf = _getmaxmin(y.dtype.type) > > could return False, True. > > Best bet is probably to file a ticket. And then pray. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion

From faltet at pytables.org Fri Dec 11 12:28:44 2009 From: faltet at pytables.org (Francesc Alted) Date: Fri, 11 Dec 2009 18:28:44 +0100 Subject: [Numpy-discussion] Slicing slower than matrix multiplication? In-Reply-To: <4B227526.10707@gmail.com> References: <4B2231B6.6000709@hccnet.nl> <200912111703.21261.faltet@pytables.org> <4B227526.10707@gmail.com> Message-ID: <200912111828.44165.faltet@pytables.org>

A Friday 11 December 2009 17:36:54 Bruce Southey escrigué: > On 12/11/2009 10:03 AM, Francesc Alted wrote: > > A Friday 11 December 2009 16:44:29 Dag Sverre Seljebotn escrigué: > >> Jasper van de Gronde wrote: > >>> Dag Sverre Seljebotn wrote: > >>>> Jasper van de Gronde wrote: > >>>>> I've attached a test file which shows the problem. It also tries > >>>>> adding columns instead of rows (in case the memory layout is playing > >>>>> tricks), but this seems to make no difference. This is the output I > >>>>> got: > >>>>> > >>>>> Dot product: 5.188786 > >>>>> Add a row: 8.032767 > >>>>> Add a column: 8.070953 > >>>>> > >>>>> Any ideas on why adding a row (or column) of a matrix is slower than > >>>>> computing a matrix product with a similarly sized matrix... (Xi has > >>>>> less columns than Xi2, but just as many rows.) > >>>> > >>>> I think we need some numbers to put this into context -- how big are > >>>> the vectors/matrices? How many iterations was the loop run? If the > >>>> vectors are small and the loop is run many times, how fast the > >>>> operation "ought" to be is irrelevant as it would drown in Python > >>>> overhead.
> >>> Originally I had attached a Python file demonstrating the problem, but > >>> apparently this wasn't accepted by the list. In any case, the matrices > >>> and vectors weren't too big (60x20), so I tried making them bigger and > >>> indeed the "fast" version was now considerably faster. > >> > >> 60x20 is "nothing", so a full matrix multiplication or a single > >> matrix-vector probably takes the same time (that is, the difference > >> between them in itself is likely smaller than the error you make during > >> measuring). > >> > >> In this context, the benchmarks will be completely dominated by the > >> number of Python calls you make (each, especially taking the slice, > >> means allocating Python objects, calling a bunch of functions in C, etc. > >> etc). So it's not that strange, taking a slice isn't free, some Python > >> objects must be created etc. etc. > > > > Yeah, I think taking slices here is taking quite a lot of time: > > > > In [58]: timeit E + Xi2[P/2,:] > > 100000 loops, best of 3: 3.95 µs per loop > > > > In [59]: timeit E + Xi2[P/2] > > 100000 loops, best of 3: 2.17 µs per loop > > > > don't know why the additional ',:' in the slice is taking so much time, but my > > guess is that passing & analyzing the second argument (slice(None,None,None)) > > could be responsible for the slowdown (but that seems like too much time for that). > > Mmh, perhaps it would be worth studying this more carefully so that an > > optimization could be done in NumPy. > > > >> I think the lesson mostly should be that with so little data, > >> benchmarking becomes a very difficult art. > > > > Well, I think it is not difficult, it is just that you are perhaps > > benchmarking Python/NumPy machinery instead ;-) I'm curious whether Matlab > > can do slicing much faster than NumPy. Jasper?

> What are you actually trying to test here?

Good question. I don't know for sure :-)

> I do not see any equivalence in the operations or output here. > -With your slices you need two dot products but ultimately you are only > using one for your dot product. > -There are addition operations on the slices that are not present in the > dot product. > -The final E arrays are not the same for all three operations.

I don't understand the ultimate goal of the OP either, but what caught my attention was that:

In [74]: timeit Xi2[P/2]
1000000 loops, best of 3: 278 ns per loop

In [75]: timeit Xi2[P/2,:]
1000000 loops, best of 3: 1.04 µs per loop

i.e. adding an additional parameter (the ':') to the slice makes it run almost 4x slower. And with this, the problem exposed by the OP is *partially* explained, i.e.:

In [77]: timeit np.dot(Xi,w)
100000 loops, best of 3: 2.91 µs per loop

In [78]: timeit E + Xi2[P/2]
100000 loops, best of 3: 2.05 µs per loop

In [79]: timeit E + Xi2[P/2,:]
100000 loops, best of 3: 3.81 µs per loop

But again, don't ask me whether the results are okay or not. I'm playing here the role of a pure computational scientist on a very concrete problem ;-)

> Having said that, the more you can vectorize your function, the more > efficient it will likely be especially with Atlas etc.

Except if your arrays are small enough, which is the underlying issue here IMO.

-- Francesc Alted
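Francesc's point about small arrays can be made concrete by scaling the array up while keeping the operation fixed: the per-call Python overhead stays roughly constant, so it dominates only at small sizes. A rough sketch (the sizes and the repeat count below are arbitrary choices, not figures from the thread):

import timeit

for n in (20, 200, 2000):
    setup = ("import numpy as np\n"
             "Xi2 = np.random.standard_normal((%d, %d))\n"
             "E = np.zeros(%d)" % (n, n, n))
    t = timeit.Timer("E + Xi2[%d, :]" % (n // 2), setup).timeit(10000)
    print "n = %4d: %f seconds for 10000 slice-and-add calls" % (n, t)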
From amenity at enthought.com Fri Dec 11 12:52:23 2009 From: amenity at enthought.com (Amenity Applewhite) Date: Fri, 11 Dec 2009 11:52:23 -0600 Subject: [Numpy-discussion] December Webinar: SciPy India with Travis Oliphant Message-ID: <0F07782C-2129-4FD0-BC91-C45CAEC10501@enthought.com>

Next Friday Enthought will be hosting our monthly Scientific Computing with Python Webinar: Summary of SciPy India Friday December 18 1pm CST/ 7pm UTC Register at GoToMeeting Enthought President Travis Oliphant is currently in Kerala, India as the keynote speaker at SciPy India 2009. Due to a training engagement, Travis missed SciPy for the first time this summer, so he's excited for this additional opportunity to meet and collaborate with the scientific Python community. Speakers at the event include Jarrod Millman, David Cournapeau, Christopher Burns, Prabhu Ramachandran, and Asokan Pichai - a great group. We're looking forward to hearing Travis' review of the proceedings. Hope to see you there! Enthought Media

From bsouthey at gmail.com Fri Dec 11 14:11:41 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 11 Dec 2009 13:11:41 -0600 Subject: [Numpy-discussion] nan_to_num and bool arrays In-Reply-To: References: <1260521410.4517.8.camel@sulfur> Message-ID: <4B22996D.6040201@gmail.com>

On 12/11/2009 10:21 AM, Keith Goodman wrote: > On Fri, Dec 11, 2009 at 12:50 AM, Nicolas Rougier > wrote: > >> Hello, >> >> Using both numpy 1.3.0 and 1.4.0rc1 I got the following exception using >> nan_to_num on a bool array, is that the expected behavior ? >> >> >>>>> import numpy >>>>> Z = numpy.zeros((3,3),dtype=bool) >>>>> numpy.nan_to_num(Z) >> Traceback (most recent call last): >> File "<stdin>", line 1, in <module> >> File "/usr/lib/python2.6/dist-packages/numpy/lib/type_check.py", line >> 374, in nan_to_num >> maxf, minf = _getmaxmin(y.dtype.type) >> File "/usr/lib/python2.6/dist-packages/numpy/lib/type_check.py", line >> 307, in _getmaxmin >> f = getlimits.finfo(t) >> File "/usr/lib/python2.6/dist-packages/numpy/core/getlimits.py", line >> 103, in __new__ >> raise ValueError, "data type %r not inexact" % (dtype) >> ValueError: data type not inexact > I guess a check for bool could be added at the top of nan_to_num. If > the input x is a bool then nan_to_num would just return x unchanged. > Or perhaps > > maxf, minf = _getmaxmin(y.dtype.type) > > could return False, True. > > Best bet is probably to file a ticket. And then pray. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion >

As documented, nan_to_num returns a float so it does not return the input unchanged. That is, the output of np.nan_to_num(np.zeros((3,3))) is a float array, not an int array. This is also why np.finfo() fails, because it is not given a float (that is, it also gives the same output if the argument to np.finfo() is an int rather than a boolean type). I am curious why you expect this conversion to work given how Python defines boolean types (http://docs.python.org/library/stdtypes.html#boolean-values). It is ambiguous to convert from boolean to float since anything that is not zero is 'True', and NaN is not zero:

>>> bool(np.PINF)
True
>>> bool(np.NINF)
True
>>> bool(np.NaN)
True
>>> bool(np.PZERO)
False
>>> bool(np.NZERO)
False

So what behavior do you expect to see?

Bruce
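The asymmetry between the two conversion directions discussed here is easy to see directly (the arrays below are arbitrary examples): casting floats to bool collapses nan and both infinities onto True, while casting bools to float is exact.

import numpy as np

f = np.array([0.0, 1.5, np.nan, np.inf, -np.inf])
print f.astype(bool)         # [False  True  True  True  True]

b = np.array([True, False])
print b.astype(np.float64)   # [ 1.  0.]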
From robert.kern at gmail.com Fri Dec 11 14:33:07 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 11 Dec 2009 13:33:07 -0600 Subject: [Numpy-discussion] nan_to_num and bool arrays In-Reply-To: <4B22996D.6040201@gmail.com> References: <1260521410.4517.8.camel@sulfur> <4B22996D.6040201@gmail.com> Message-ID: <3d375d730912111133g6cccf79dy61dc01d16ded52c9@mail.gmail.com>

On Fri, Dec 11, 2009 at 13:11, Bruce Southey wrote: > As documented, nan_to_num returns a float so it does not return the > input unchanged.

I think that is describing the current behavior rather than documenting the intent of the function. Given the high level purpose of the function, to "[r]eplace nan with zero and inf with finite numbers," I think it is fairly reasonable to implement it as a no-op for integers and related dtypes. There are no nans or infs for those dtypes so the input can be passed back unchanged.

> That is, the output of np.nan_to_num(np.zeros((3,3))) is a float array, > not an int array. This is also why np.finfo() fails, because it is not > given a float (that is, it also gives the same output if the argument to > np.finfo() is an int rather than a boolean type). > > I am curious why you expect this conversion to work given how Python > defines boolean types > (http://docs.python.org/library/stdtypes.html#boolean-values). > > It is ambiguous to convert from boolean to float since anything that is > not zero is 'True', and NaN is not zero: > >>> bool(np.PINF) > True > >>> bool(np.NINF) > True > >>> bool(np.NaN) > True > >>> bool(np.PZERO) > False > >>> bool(np.NZERO) > False

No, that's the other way around, converting floats to bools. Converting bools to floats is trivial: True->1.0, False->0.0.

-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From bsouthey at gmail.com Fri Dec 11 15:08:11 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 11 Dec 2009 14:08:11 -0600 Subject: [Numpy-discussion] nan_to_num and bool arrays In-Reply-To: <3d375d730912111133g6cccf79dy61dc01d16ded52c9@mail.gmail.com> References: <1260521410.4517.8.camel@sulfur> <4B22996D.6040201@gmail.com> <3d375d730912111133g6cccf79dy61dc01d16ded52c9@mail.gmail.com> Message-ID: <4B22A6AB.9000106@gmail.com>

On 12/11/2009 01:33 PM, Robert Kern wrote: > On Fri, Dec 11, 2009 at 13:11, Bruce Southey wrote: > >> As documented, nan_to_num returns a float so it does not return the >> input unchanged. >>

Sorry for my mistake: Given an int input, np.nan_to_num returns an int dtype

>>> np.nan_to_num(np.zeros((3,3), dtype=np.int)).dtype
dtype('int64')

> I think that is describing the current behavior rather than > documenting the intent of the function. Given the high level purpose > of the function, to "[r]eplace nan with zero and inf with finite > numbers," I think it is fairly reasonable to implement it as a no-op > for integers and related dtypes. There are no nans or infs for those > dtypes so the input can be passed back unchanged. >

So I agree that it should leave the input untouched when a non-float dtype is used for some array-like input.
Bruce From kwgoodman at gmail.com Fri Dec 11 15:41:02 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 11 Dec 2009 12:41:02 -0800 Subject: [Numpy-discussion] nan_to_num and bool arrays In-Reply-To: <4B22A6AB.9000106@gmail.com> References: <1260521410.4517.8.camel@sulfur> <4B22996D.6040201@gmail.com> <3d375d730912111133g6cccf79dy61dc01d16ded52c9@mail.gmail.com> <4B22A6AB.9000106@gmail.com> Message-ID: On Fri, Dec 11, 2009 at 12:08 PM, Bruce Southey wrote: > On 12/11/2009 01:33 PM, Robert Kern wrote: >> On Fri, Dec 11, 2009 at 13:11, Bruce Southey ?wrote: >> >> >>> As documented, nan_to_num returns a float so it does not return the >>> input unchanged. >>> > Sorry for my mistake: > Given an int input, np.nan_to_num returns an int dtype > ?>>> np.nan_to_num(np.zeros((3,3), dtype=np.int)).dtype > dtype('int64') > >> I think that is describing the current behavior rather than >> documenting the intent of the function. Given the high level purpose >> of the function, to "[r]eplace nan with zero and inf with finite >> numbers," I think it is fairly reasonable to implement it as a no-op >> for integers and related dtypes. There are no nans or infs for those >> dtypes so the input can be passed back unchanged. >> > > So I agree that it should leave the input untouched when a non-float > dtype is used for some array-like input. Would only one line need to be changed? Would changing if not issubclass(t, _nx.integer): to if not issubclass(t, _nx.integer) and not issubclass(t, _nx.bool_): do the trick? Here's nan_to_num for reference: def nan_to_num(x): try: t = x.dtype.type except AttributeError: t = obj2sctype(type(x)) if issubclass(t, _nx.complexfloating): return nan_to_num(x.real) + 1j * nan_to_num(x.imag) else: try: y = x.copy() except AttributeError: y = array(x) if not issubclass(t, _nx.integer): if not y.shape: y = array([x]) scalar = True else: scalar = False are_inf = isposinf(y) are_neg_inf = isneginf(y) are_nan = isnan(y) maxf, minf = _getmaxmin(y.dtype.type) y[are_nan] = 0 y[are_inf] = maxf y[are_neg_inf] = minf if scalar: y = y[0] return y From robert.kern at gmail.com Fri Dec 11 16:14:19 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 11 Dec 2009 15:14:19 -0600 Subject: [Numpy-discussion] nan_to_num and bool arrays In-Reply-To: References: <1260521410.4517.8.camel@sulfur> <4B22996D.6040201@gmail.com> <3d375d730912111133g6cccf79dy61dc01d16ded52c9@mail.gmail.com> <4B22A6AB.9000106@gmail.com> Message-ID: <3d375d730912111314r634a2c13s695af392f4258f88@mail.gmail.com> On Fri, Dec 11, 2009 at 14:41, Keith Goodman wrote: > On Fri, Dec 11, 2009 at 12:08 PM, Bruce Southey wrote: >> So I agree that it should leave the input untouched when a non-float >> dtype is used for some array-like input. > > Would only one line need to be changed? Would changing > > if not issubclass(t, _nx.integer): > > to > > if not issubclass(t, _nx.integer) and not issubclass(t, _nx.bool_): > > do the trick? That still leaves strings, voids, and objects. I recommend: if issubclass(t, _nx.inexact): Arguably, one should handle nan float objects in object arrays and float columns in structured arrays, but the current code does not handle either of those anyways. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From kwgoodman at gmail.com Fri Dec 11 17:09:48 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 11 Dec 2009 14:09:48 -0800 Subject: [Numpy-discussion] nan_to_num and bool arrays In-Reply-To: <3d375d730912111314r634a2c13s695af392f4258f88@mail.gmail.com> References: <1260521410.4517.8.camel@sulfur> <4B22996D.6040201@gmail.com> <3d375d730912111133g6cccf79dy61dc01d16ded52c9@mail.gmail.com> <4B22A6AB.9000106@gmail.com> <3d375d730912111314r634a2c13s695af392f4258f88@mail.gmail.com> Message-ID: On Fri, Dec 11, 2009 at 1:14 PM, Robert Kern wrote: > On Fri, Dec 11, 2009 at 14:41, Keith Goodman wrote: >> On Fri, Dec 11, 2009 at 12:08 PM, Bruce Southey wrote: > >>> So I agree that it should leave the input untouched when a non-float >>> dtype is used for some array-like input. >> >> Would only one line need to be changed? Would changing >> >> if not issubclass(t, _nx.integer): >> >> to >> >> if not issubclass(t, _nx.integer) and not issubclass(t, _nx.bool_): >> >> do the trick? > > That still leaves strings, voids, and objects. I recommend: > > ?if issubclass(t, _nx.inexact): > > Arguably, one should handle nan float objects in object arrays and > float columns in structured arrays, but the current code does not > handle either of those anyways. Without your change both >> np.nan_to_num(np.array([True, False])) >> np.nan_to_num([1]) raise exceptions. With your change: >> np.nan_to_num(np.array([True, False])) array([ True, False], dtype=bool) >> np.nan_to_num([1]) array([1]) On a separate note, this seems a little awkward: >> np.nan_to_num(1.0) 1.0 >> np.nan_to_num(1) array(1) >> x = np.ones(1, dtype=np.int) >> np.nan_to_num(x[0]) 1 From robert.kern at gmail.com Fri Dec 11 17:22:20 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 11 Dec 2009 16:22:20 -0600 Subject: [Numpy-discussion] nan_to_num and bool arrays In-Reply-To: References: <1260521410.4517.8.camel@sulfur> <4B22996D.6040201@gmail.com> <3d375d730912111133g6cccf79dy61dc01d16ded52c9@mail.gmail.com> <4B22A6AB.9000106@gmail.com> <3d375d730912111314r634a2c13s695af392f4258f88@mail.gmail.com> Message-ID: <3d375d730912111422t5fb2e7e2p594ee9df73ced538@mail.gmail.com> On Fri, Dec 11, 2009 at 16:09, Keith Goodman wrote: > On Fri, Dec 11, 2009 at 1:14 PM, Robert Kern wrote: >> On Fri, Dec 11, 2009 at 14:41, Keith Goodman wrote: >>> On Fri, Dec 11, 2009 at 12:08 PM, Bruce Southey wrote: >> >>>> So I agree that it should leave the input untouched when a non-float >>>> dtype is used for some array-like input. >>> >>> Would only one line need to be changed? Would changing >>> >>> if not issubclass(t, _nx.integer): >>> >>> to >>> >>> if not issubclass(t, _nx.integer) and not issubclass(t, _nx.bool_): >>> >>> do the trick? >> >> That still leaves strings, voids, and objects. I recommend: >> >> ?if issubclass(t, _nx.inexact): >> >> Arguably, one should handle nan float objects in object arrays and >> float columns in structured arrays, but the current code does not >> handle either of those anyways. > > Without your change both > >>> np.nan_to_num(np.array([True, False])) >>> np.nan_to_num([1]) > > raise exceptions. With your change: > >>> np.nan_to_num(np.array([True, False])) > ? array([ True, False], dtype=bool) >>> np.nan_to_num([1]) > ? array([1]) I think this is correct, though the latter one happens by accident. Lists don't have a .dtype attribute so obj2sctype(type([1])) is checked and happens to be object_. The latter line is intended to handle scalars, not sequences. 
I think that sequences should be coerced to arrays for output and this check should be more explicit about what it handles. [1.0] will have a problem if you don't. > On a separate note, this seems a little awkward: > >>> np.nan_to_num(1.0) > ? 1.0 >>> np.nan_to_num(1) > ? array(1) >>> x = np.ones(1, dtype=np.int) >>> np.nan_to_num(x[0]) > ? 1 Worth fixing. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From kwgoodman at gmail.com Fri Dec 11 18:44:16 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 11 Dec 2009 15:44:16 -0800 Subject: [Numpy-discussion] nan_to_num and bool arrays In-Reply-To: <3d375d730912111422t5fb2e7e2p594ee9df73ced538@mail.gmail.com> References: <1260521410.4517.8.camel@sulfur> <4B22996D.6040201@gmail.com> <3d375d730912111133g6cccf79dy61dc01d16ded52c9@mail.gmail.com> <4B22A6AB.9000106@gmail.com> <3d375d730912111314r634a2c13s695af392f4258f88@mail.gmail.com> <3d375d730912111422t5fb2e7e2p594ee9df73ced538@mail.gmail.com> Message-ID: On Fri, Dec 11, 2009 at 2:22 PM, Robert Kern wrote: > On Fri, Dec 11, 2009 at 16:09, Keith Goodman wrote: >> On Fri, Dec 11, 2009 at 1:14 PM, Robert Kern wrote: >>> On Fri, Dec 11, 2009 at 14:41, Keith Goodman wrote: >>>> On Fri, Dec 11, 2009 at 12:08 PM, Bruce Southey wrote: >>> >>>>> So I agree that it should leave the input untouched when a non-float >>>>> dtype is used for some array-like input. >>>> >>>> Would only one line need to be changed? Would changing >>>> >>>> if not issubclass(t, _nx.integer): >>>> >>>> to >>>> >>>> if not issubclass(t, _nx.integer) and not issubclass(t, _nx.bool_): >>>> >>>> do the trick? >>> >>> That still leaves strings, voids, and objects. I recommend: >>> >>> ?if issubclass(t, _nx.inexact): >>> >>> Arguably, one should handle nan float objects in object arrays and >>> float columns in structured arrays, but the current code does not >>> handle either of those anyways. >> >> Without your change both >> >>>> np.nan_to_num(np.array([True, False])) >>>> np.nan_to_num([1]) >> >> raise exceptions. With your change: >> >>>> np.nan_to_num(np.array([True, False])) >> ? array([ True, False], dtype=bool) >>>> np.nan_to_num([1]) >> ? array([1]) > > I think this is correct, though the latter one happens by accident. > Lists don't have a .dtype attribute so obj2sctype(type([1])) is > checked and happens to be object_. The latter line is intended to > handle scalars, not sequences. I think that sequences should be > coerced to arrays for output and this check should be more explicit > about what it handles. [1.0] will have a problem if you don't. That makes sense. But I'm not smart enough to implement it. >> On a separate note, this seems a little awkward: >> >>>> np.nan_to_num(1.0) >> ? 1.0 >>>> np.nan_to_num(1) >> ? array(1) >>>> x = np.ones(1, dtype=np.int) >>>> np.nan_to_num(x[0]) >> ? 1 > > Worth fixing. Would this work? 
def nan_to_num(x): try: t = x.dtype.type except AttributeError: t = obj2sctype(type(x)) if issubclass(t, _nx.complexfloating): return nan_to_num(x.real) + 1j * nan_to_num(x.imag) else: try: y = x.copy() except AttributeError: y = array(x) if not y.shape: y = array([x]) scalar = True else: scalar = False if issubclass(t, _nx.inexact): are_inf = isposinf(y) are_neg_inf = isneginf(y) are_nan = isnan(y) maxf, minf = _getmaxmin(y.dtype.type) y[are_nan] = 0 y[are_inf] = maxf y[are_neg_inf] = minf if scalar: y = y[0] return y Instead of >> nan_to_num(1.0) 1.0 >> nan_to_num(1) array(1) >> nan_to_num(np.array(1.0)) 1.0 >> nan_to_num(np.array(1)) array(1) it gives >> nan_to_num(1.0) 1.0 >> nan_to_num(1) 1 >> nan_to_num(np.array(1.0)) 1.0 >> nan_to_num(np.array(1)) 1 I guess a lot of unit tests need to be written before nan_to_num can be fixed. But for now, your bool fix is an improvement. From kwgoodman at gmail.com Fri Dec 11 19:03:55 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 11 Dec 2009 16:03:55 -0800 Subject: [Numpy-discussion] nan_to_num and bool arrays In-Reply-To: References: <1260521410.4517.8.camel@sulfur> <4B22996D.6040201@gmail.com> <3d375d730912111133g6cccf79dy61dc01d16ded52c9@mail.gmail.com> <4B22A6AB.9000106@gmail.com> <3d375d730912111314r634a2c13s695af392f4258f88@mail.gmail.com> <3d375d730912111422t5fb2e7e2p594ee9df73ced538@mail.gmail.com> Message-ID: On Fri, Dec 11, 2009 at 3:44 PM, Keith Goodman wrote: > On Fri, Dec 11, 2009 at 2:22 PM, Robert Kern wrote: >> On Fri, Dec 11, 2009 at 16:09, Keith Goodman wrote: >>> On Fri, Dec 11, 2009 at 1:14 PM, Robert Kern wrote: >>>> On Fri, Dec 11, 2009 at 14:41, Keith Goodman wrote: >>>>> On Fri, Dec 11, 2009 at 12:08 PM, Bruce Southey wrote: >>>> >>>>>> So I agree that it should leave the input untouched when a non-float >>>>>> dtype is used for some array-like input. >>>>> >>>>> Would only one line need to be changed? Would changing >>>>> >>>>> if not issubclass(t, _nx.integer): >>>>> >>>>> to >>>>> >>>>> if not issubclass(t, _nx.integer) and not issubclass(t, _nx.bool_): >>>>> >>>>> do the trick? >>>> >>>> That still leaves strings, voids, and objects. I recommend: >>>> >>>> ?if issubclass(t, _nx.inexact): >>>> >>>> Arguably, one should handle nan float objects in object arrays and >>>> float columns in structured arrays, but the current code does not >>>> handle either of those anyways. >>> >>> Without your change both >>> >>>>> np.nan_to_num(np.array([True, False])) >>>>> np.nan_to_num([1]) >>> >>> raise exceptions. With your change: >>> >>>>> np.nan_to_num(np.array([True, False])) >>> ? array([ True, False], dtype=bool) >>>>> np.nan_to_num([1]) >>> ? array([1]) >> >> I think this is correct, though the latter one happens by accident. >> Lists don't have a .dtype attribute so obj2sctype(type([1])) is >> checked and happens to be object_. The latter line is intended to >> handle scalars, not sequences. I think that sequences should be >> coerced to arrays for output and this check should be more explicit >> about what it handles. [1.0] will have a problem if you don't. > > That makes sense. But I'm not smart enough to implement it. > >>> On a separate note, this seems a little awkward: >>> >>>>> np.nan_to_num(1.0) >>> ? 1.0 >>>>> np.nan_to_num(1) >>> ? array(1) >>>>> x = np.ones(1, dtype=np.int) >>>>> np.nan_to_num(x[0]) >>> ? 1 >> >> Worth fixing. > > Would this work? > > def nan_to_num(x): > ? ?try: > ? ? ? ?t = x.dtype.type > ? ?except AttributeError: > ? ? ? ?t = obj2sctype(type(x)) > ? 
?if issubclass(t, _nx.complexfloating): > ? ? ? ?return nan_to_num(x.real) + 1j * nan_to_num(x.imag) > ? ?else: > ? ? ? ?try: > ? ? ? ? ? ?y = x.copy() > ? ? ? ?except AttributeError: > ? ? ? ? ? ?y = array(x) > ? ?if not y.shape: > ? ? ? ?y = array([x]) > ? ? ? ?scalar = True > ? ?else: > ? ? ? ?scalar = False > ? ?if issubclass(t, _nx.inexact): > ? ? ? ?are_inf = isposinf(y) > ? ? ? ?are_neg_inf = isneginf(y) > ? ? ? ?are_nan = isnan(y) > ? ? ? ?maxf, minf = _getmaxmin(y.dtype.type) > ? ? ? ?y[are_nan] = 0 > ? ? ? ?y[are_inf] = maxf > ? ? ? ?y[are_neg_inf] = minf > ? ?if scalar: > ? ? ? ?y = y[0] > ? ?return y > > Instead of > >>> nan_to_num(1.0) > ? 1.0 >>> nan_to_num(1) > ? array(1) >>> nan_to_num(np.array(1.0)) > ? 1.0 >>> nan_to_num(np.array(1)) > ? array(1) > > it gives > >>> nan_to_num(1.0) > ? 1.0 >>> nan_to_num(1) > ? 1 >>> nan_to_num(np.array(1.0)) > ? 1.0 >>> nan_to_num(np.array(1)) > ? 1 > > I guess a lot of unit tests need to be written before nan_to_num can > be fixed. But for now, your bool fix is an improvement. Ack! The "if issubclass(t, _nx.inexact)" fix doesn't work. It solves the bool problem but it introduces its own problem since numpy.object_ is not a subclass of inexact: >> nan_to_num([np.inf]) array([ Inf]) Yeah, way too many special cases here to do this without full unit test coverage. From robert.kern at gmail.com Fri Dec 11 19:06:01 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 11 Dec 2009 18:06:01 -0600 Subject: [Numpy-discussion] nan_to_num and bool arrays In-Reply-To: References: <1260521410.4517.8.camel@sulfur> <4B22996D.6040201@gmail.com> <3d375d730912111133g6cccf79dy61dc01d16ded52c9@mail.gmail.com> <4B22A6AB.9000106@gmail.com> <3d375d730912111314r634a2c13s695af392f4258f88@mail.gmail.com> <3d375d730912111422t5fb2e7e2p594ee9df73ced538@mail.gmail.com> Message-ID: <3d375d730912111606k68b19f68i489c600ac8668092@mail.gmail.com> On Fri, Dec 11, 2009 at 17:44, Keith Goodman wrote: > On Fri, Dec 11, 2009 at 2:22 PM, Robert Kern wrote: >> On Fri, Dec 11, 2009 at 16:09, Keith Goodman wrote: >>> On Fri, Dec 11, 2009 at 1:14 PM, Robert Kern wrote: >>>> On Fri, Dec 11, 2009 at 14:41, Keith Goodman wrote: >>>>> On Fri, Dec 11, 2009 at 12:08 PM, Bruce Southey wrote: >>>> >>>>>> So I agree that it should leave the input untouched when a non-float >>>>>> dtype is used for some array-like input. >>>>> >>>>> Would only one line need to be changed? Would changing >>>>> >>>>> if not issubclass(t, _nx.integer): >>>>> >>>>> to >>>>> >>>>> if not issubclass(t, _nx.integer) and not issubclass(t, _nx.bool_): >>>>> >>>>> do the trick? >>>> >>>> That still leaves strings, voids, and objects. I recommend: >>>> >>>> ?if issubclass(t, _nx.inexact): >>>> >>>> Arguably, one should handle nan float objects in object arrays and >>>> float columns in structured arrays, but the current code does not >>>> handle either of those anyways. >>> >>> Without your change both >>> >>>>> np.nan_to_num(np.array([True, False])) >>>>> np.nan_to_num([1]) >>> >>> raise exceptions. With your change: >>> >>>>> np.nan_to_num(np.array([True, False])) >>> ? array([ True, False], dtype=bool) >>>>> np.nan_to_num([1]) >>> ? array([1]) >> >> I think this is correct, though the latter one happens by accident. >> Lists don't have a .dtype attribute so obj2sctype(type([1])) is >> checked and happens to be object_. The latter line is intended to >> handle scalars, not sequences. 
I think that sequences should be >> coerced to arrays for output and this check should be more explicit >> about what it handles. [1.0] will have a problem if you don't. > > That makes sense. But I'm not smart enough to implement it. Something like the following at the top should help distinguish the various cases.: is_scalar = False if not isinstance(x, _nx.ndarray): x = np.asarray(x) if x.shape == (): # Must return this as a scalar later. is_scalar = True old_shape = x.shape if x.shape == (): # We need element access. x.shape = (1,) t = x.dtype.type This should allow one to pass in [np.inf] and have it correctly get interpreted as a float array rather than an object scalar. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Fri Dec 11 19:07:07 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 11 Dec 2009 18:07:07 -0600 Subject: [Numpy-discussion] nan_to_num and bool arrays In-Reply-To: References: <1260521410.4517.8.camel@sulfur> <4B22996D.6040201@gmail.com> <3d375d730912111133g6cccf79dy61dc01d16ded52c9@mail.gmail.com> <4B22A6AB.9000106@gmail.com> <3d375d730912111314r634a2c13s695af392f4258f88@mail.gmail.com> <3d375d730912111422t5fb2e7e2p594ee9df73ced538@mail.gmail.com> Message-ID: <3d375d730912111607q2c1bff60s124fde268e00714a@mail.gmail.com> On Fri, Dec 11, 2009 at 18:03, Keith Goodman wrote: > Ack! The "if issubclass(t, _nx.inexact)" fix doesn't work. It solves > the bool problem but it introduces its own problem since numpy.object_ > is not a subclass of inexact: > >>> nan_to_num([np.inf]) > ? array([ Inf]) Right. This is the problem I was referring to: "I think that sequences should be coerced to arrays for output and this check should be more explicit about what it handles. [1.0] will have a problem if you don't." -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From kwgoodman at gmail.com Fri Dec 11 19:38:24 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 11 Dec 2009 16:38:24 -0800 Subject: [Numpy-discussion] nan_to_num and bool arrays In-Reply-To: <3d375d730912111606k68b19f68i489c600ac8668092@mail.gmail.com> References: <1260521410.4517.8.camel@sulfur> <4B22996D.6040201@gmail.com> <3d375d730912111133g6cccf79dy61dc01d16ded52c9@mail.gmail.com> <4B22A6AB.9000106@gmail.com> <3d375d730912111314r634a2c13s695af392f4258f88@mail.gmail.com> <3d375d730912111422t5fb2e7e2p594ee9df73ced538@mail.gmail.com> <3d375d730912111606k68b19f68i489c600ac8668092@mail.gmail.com> Message-ID: On Fri, Dec 11, 2009 at 4:06 PM, Robert Kern wrote: > On Fri, Dec 11, 2009 at 17:44, Keith Goodman wrote: >> On Fri, Dec 11, 2009 at 2:22 PM, Robert Kern wrote: >>> On Fri, Dec 11, 2009 at 16:09, Keith Goodman wrote: >>>> On Fri, Dec 11, 2009 at 1:14 PM, Robert Kern wrote: >>>>> On Fri, Dec 11, 2009 at 14:41, Keith Goodman wrote: >>>>>> On Fri, Dec 11, 2009 at 12:08 PM, Bruce Southey wrote: >>>>> >>>>>>> So I agree that it should leave the input untouched when a non-float >>>>>>> dtype is used for some array-like input. >>>>>> >>>>>> Would only one line need to be changed? Would changing >>>>>> >>>>>> if not issubclass(t, _nx.integer): >>>>>> >>>>>> to >>>>>> >>>>>> if not issubclass(t, _nx.integer) and not issubclass(t, _nx.bool_): >>>>>> >>>>>> do the trick? 
From kwgoodman at gmail.com Fri Dec 11 19:38:24 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Fri, 11 Dec 2009 16:38:24 -0800
Subject: [Numpy-discussion] nan_to_num and bool arrays
In-Reply-To: <3d375d730912111606k68b19f68i489c600ac8668092@mail.gmail.com>
References: <1260521410.4517.8.camel@sulfur> <4B22996D.6040201@gmail.com>
	<3d375d730912111133g6cccf79dy61dc01d16ded52c9@mail.gmail.com>
	<4B22A6AB.9000106@gmail.com>
	<3d375d730912111314r634a2c13s695af392f4258f88@mail.gmail.com>
	<3d375d730912111422t5fb2e7e2p594ee9df73ced538@mail.gmail.com>
	<3d375d730912111606k68b19f68i489c600ac8668092@mail.gmail.com>
Message-ID:

On Fri, Dec 11, 2009 at 4:06 PM, Robert Kern wrote:
> On Fri, Dec 11, 2009 at 17:44, Keith Goodman wrote:
>> On Fri, Dec 11, 2009 at 2:22 PM, Robert Kern wrote:
>>> On Fri, Dec 11, 2009 at 16:09, Keith Goodman wrote:
>>>> On Fri, Dec 11, 2009 at 1:14 PM, Robert Kern wrote:
>>>>> On Fri, Dec 11, 2009 at 14:41, Keith Goodman wrote:
>>>>>> On Fri, Dec 11, 2009 at 12:08 PM, Bruce Southey wrote:
>>>>>
>>>>>>> So I agree that it should leave the input untouched when a non-float
>>>>>>> dtype is used for some array-like input.
>>>>>>
>>>>>> Would only one line need to be changed? Would changing
>>>>>>
>>>>>> if not issubclass(t, _nx.integer):
>>>>>>
>>>>>> to
>>>>>>
>>>>>> if not issubclass(t, _nx.integer) and not issubclass(t, _nx.bool_):
>>>>>>
>>>>>> do the trick?
>>>>>
>>>>> That still leaves strings, voids, and objects. I recommend:
>>>>>
>>>>>  if issubclass(t, _nx.inexact):
>>>>>
>>>>> Arguably, one should handle nan float objects in object arrays and
>>>>> float columns in structured arrays, but the current code does not
>>>>> handle either of those anyways.
>>>>
>>>> Without your change both
>>>>
>>>>>> np.nan_to_num(np.array([True, False]))
>>>>>> np.nan_to_num([1])
>>>>
>>>> raise exceptions. With your change:
>>>>
>>>>>> np.nan_to_num(np.array([True, False]))
>>>>    array([ True, False], dtype=bool)
>>>>>> np.nan_to_num([1])
>>>>    array([1])
>>>
>>> I think this is correct, though the latter one happens by accident.
>>> Lists don't have a .dtype attribute so obj2sctype(type([1])) is
>>> checked and happens to be object_. The latter line is intended to
>>> handle scalars, not sequences. I think that sequences should be
>>> coerced to arrays for output and this check should be more explicit
>>> about what it handles. [1.0] will have a problem if you don't.
>>
>> That makes sense. But I'm not smart enough to implement it.
>
> Something like the following at the top should help distinguish the
> various cases:
>
> is_scalar = False
> if not isinstance(x, _nx.ndarray):
>     x = np.asarray(x)
>     if x.shape == ():
>         # Must return this as a scalar later.
>         is_scalar = True
> old_shape = x.shape
> if x.shape == ():
>     # We need element access.
>     x.shape = (1,)
> t = x.dtype.type
>
> This should allow one to pass in [np.inf] and have it correctly get
> interpreted as a float array rather than an object scalar.

That seems to work. To avoid changing the input

>> x = np.array(1)
>> x.shape
   ()
>> y = nan_to_num(x)
>> x.shape
   (1,)

I moved y = x.copy() further up and switched x's to y's. Here's what
it looks like:

def nan_to_num(x):
    is_scalar = False
    if not isinstance(x, _nx.ndarray):
        x = asarray(x)
        if x.shape == ():
            # Must return this as a scalar later.
            is_scalar = True
    y = x.copy()
    old_shape = y.shape
    if y.shape == ():
        # We need element access.
        y.shape = (1,)
    t = y.dtype.type
    if issubclass(t, _nx.complexfloating):
        return nan_to_num(y.real) + 1j * nan_to_num(y.imag)
    if issubclass(t, _nx.inexact):
        are_inf = isposinf(y)
        are_neg_inf = isneginf(y)
        are_nan = isnan(y)
        maxf, minf = _getmaxmin(y.dtype.type)
        y[are_nan] = 0
        y[are_inf] = maxf
        y[are_neg_inf] = minf
    if is_scalar:
        y = y[0]
    else:
        y.shape = old_shape
    return y

From robert.kern at gmail.com Fri Dec 11 21:38:12 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 11 Dec 2009 20:38:12 -0600
Subject: [Numpy-discussion] nan_to_num and bool arrays
In-Reply-To:
References: <1260521410.4517.8.camel@sulfur>
	<3d375d730912111133g6cccf79dy61dc01d16ded52c9@mail.gmail.com>
	<4B22A6AB.9000106@gmail.com>
	<3d375d730912111314r634a2c13s695af392f4258f88@mail.gmail.com>
	<3d375d730912111422t5fb2e7e2p594ee9df73ced538@mail.gmail.com>
	<3d375d730912111606k68b19f68i489c600ac8668092@mail.gmail.com>
Message-ID: <3d375d730912111838x2e9c5dd5y7f5fefe93c655097@mail.gmail.com>

On Fri, Dec 11, 2009 at 18:38, Keith Goodman wrote:

> That seems to work. To avoid changing the input
>
>>> x = np.array(1)
>>> x.shape
>    ()
>>> y = nan_to_num(x)
>>> x.shape
>    (1,)
>
> I moved y = x.copy() further up and switched x's to y's. Here's what
> it looks like:
>
> def nan_to_num(x):
>     is_scalar = False
>     if not isinstance(x, _nx.ndarray):
>         x = asarray(x)
>         if x.shape == ():
>             # Must return this as a scalar later.
>             is_scalar = True
>     y = x.copy()
>     old_shape = y.shape
>     if y.shape == ():
>         # We need element access.
>         y.shape = (1,)
>     t = y.dtype.type
>     if issubclass(t, _nx.complexfloating):
>         return nan_to_num(y.real) + 1j * nan_to_num(y.imag)

Almost! You need to handle the shape restoration in this branch, too.

In [9]: nan_to_num(array(1+1j))
Out[9]: array([ 1.+1.j])

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From pfeldman at verizon.net Fri Dec 11 22:13:31 2009
From: pfeldman at verizon.net (Dr. Phillip M. Feldman)
Date: Fri, 11 Dec 2009 19:13:31 -0800 (PST)
Subject: [Numpy-discussion] non-standard standard deviation
In-Reply-To:
References: <26566808.post@talk.nabble.com>
Message-ID: <26753999.post@talk.nabble.com>

Anne Archibald wrote:
>
> 2009/11/29 Dr. Phillip M. Feldman:
>
>> All of the statistical packages that I am currently using and have used in
>> the past (Matlab, Minitab, R, S-plus) calculate standard deviation using the
>> sqrt(1/(n-1)) normalization, which gives a result that is unbiased when
>> sampling from a normally-distributed population. NumPy uses the sqrt(1/n)
>> normalization. I'm currently using the following code to calculate standard
>> deviations, but would much prefer if this could be fixed in NumPy itself:
>
> This issue was the subject of lengthy discussions on the mailing list,
> the upshot of which is that in current versions of scipy, std and var
> take an optional argument "ddof", into which you can supply 1 to get
> the normalization you want.
>
> Anne
>

You are right that I can get the result that I want by setting ddof.
Thanks!

I still feel that the default value for ddof should be 1 rather than 0;
new users are unlikely to read the documentation for a command like std,
because it is reasonable to expect standard behavior across all
statistical packages.

Phillip
-- 
View this message in context: http://old.nabble.com/non-standard-standard-deviation-tp26566808p26753999.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.

From cournape at gmail.com Fri Dec 11 23:14:26 2009
From: cournape at gmail.com (David Cournapeau)
Date: Sat, 12 Dec 2009 09:44:26 +0530
Subject: [Numpy-discussion] Slicing slower than matrix multiplication?
In-Reply-To: <4B227526.10707@gmail.com>
References: <4B2231B6.6000709@hccnet.nl> <4B224F6D.6000606@hccnet.nl>
	<4B2268DD.9070603@student.matnat.uio.no>
	<200912111703.21261.faltet@pytables.org> <4B227526.10707@gmail.com>
Message-ID: <5b8d13220912112014h76154699v5b5da5aa1ee43999@mail.gmail.com>

On Fri, Dec 11, 2009 at 10:06 PM, Bruce Southey wrote:
>
> Having said that, the more you can vectorize your function, the more
> efficient it will likely be especially with Atlas etc.

One thing to note is that dot uses an optimized ATLAS if available, which
makes it considerably faster than equivalent operations you would do using
purely numpy. I doubt that's the reason here, since the arrays are small,
but that's something to keep in mind when performance matters: use dot
wherever possible, it is generally faster than prod/sum.

cheers,

David
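A tiny sketch of the trade-off David describes, computing row sums both as
a ufunc reduction and as a BLAS-backed dot, then normalizing without
building a diagonal matrix (array names and sizes here are illustrative):

import numpy as np

A = np.random.rand(600, 2000)      # toy size

# row sums two ways: ufunc reduction vs. a dot against a vector of ones,
# which can route through an optimized BLAS such as ATLAS
s1 = A.sum(axis=1)
s2 = np.dot(A, np.ones(A.shape[1]))
assert np.allclose(s1, s2)

# normalize the rows in one vectorized step
B = A / s2[:, np.newaxis]
assert np.allclose(B.sum(axis=1), np.ones(A.shape[0]))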
From hoytak at cs.ubc.ca Fri Dec 11 23:26:29 2009
From: hoytak at cs.ubc.ca (Hoyt Koepke)
Date: Fri, 11 Dec 2009 20:26:29 -0800
Subject: [Numpy-discussion] Slicing slower than matrix multiplication?
In-Reply-To: <5b8d13220912112014h76154699v5b5da5aa1ee43999@mail.gmail.com>
References: <4B2231B6.6000709@hccnet.nl> <4B224F6D.6000606@hccnet.nl>
	<4B2268DD.9070603@student.matnat.uio.no>
	<200912111703.21261.faltet@pytables.org> <4B227526.10707@gmail.com>
	<5b8d13220912112014h76154699v5b5da5aa1ee43999@mail.gmail.com>
Message-ID: <4db580fd0912112026h708d2d64n928e733f51005d5f@mail.gmail.com>

> One thing to note is that dot uses an optimized ATLAS if available, which
> makes it considerably faster than equivalent operations you would do using
> purely numpy. I doubt that's the reason here, since the arrays are small,
> but that's something to keep in mind when performance matters: use dot
> wherever possible, it is generally faster than prod/sum.

This is quite true; I once had a very large matrix (600 x 200,000) that
I needed to normalize. Using .sum() and /= took about 30 minutes. When I
switched to using dot() to do the same operation (matrix multiplication
with a vector of 1's, then turning that into a diagonal matrix and using
dot() again to normalize it), it dropped the computation time down to
about 2 minutes. Most of the gain was likely due to ATLAS using all the
cores and numpy only using 1, but I was still impressed.

--Hoyt

++++++++++++++++++++++++++++++++++++++++++++++++
+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoytak at gmail.com
++++++++++++++++++++++++++++++++++++++++++

From kwgoodman at gmail.com Fri Dec 11 23:27:50 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Fri, 11 Dec 2009 20:27:50 -0800
Subject: [Numpy-discussion] nan_to_num and bool arrays
In-Reply-To: <3d375d730912111838x2e9c5dd5y7f5fefe93c655097@mail.gmail.com>
References: <1260521410.4517.8.camel@sulfur> <4B22A6AB.9000106@gmail.com>
	<3d375d730912111314r634a2c13s695af392f4258f88@mail.gmail.com>
	<3d375d730912111422t5fb2e7e2p594ee9df73ced538@mail.gmail.com>
	<3d375d730912111606k68b19f68i489c600ac8668092@mail.gmail.com>
	<3d375d730912111838x2e9c5dd5y7f5fefe93c655097@mail.gmail.com>
Message-ID:

On Fri, Dec 11, 2009 at 6:38 PM, Robert Kern wrote:
> On Fri, Dec 11, 2009 at 18:38, Keith Goodman wrote:
>
>> That seems to work. To avoid changing the input
>>
>>>> x = np.array(1)
>>>> x.shape
>>    ()
>>>> y = nan_to_num(x)
>>>> x.shape
>>    (1,)
>>
>> I moved y = x.copy() further up and switched x's to y's. Here's what
>> it looks like:
>>
>> def nan_to_num(x):
>>     is_scalar = False
>>     if not isinstance(x, _nx.ndarray):
>>         x = asarray(x)
>>         if x.shape == ():
>>             # Must return this as a scalar later.
>>             is_scalar = True
>>     y = x.copy()
>>     old_shape = y.shape
>>     if y.shape == ():
>>         # We need element access.
>>         y.shape = (1,)
>>     t = y.dtype.type
>>     if issubclass(t, _nx.complexfloating):
>>         return nan_to_num(y.real) + 1j * nan_to_num(y.imag)
>
> Almost! You need to handle the shape restoration in this branch, too.
>
> In [9]: nan_to_num(array(1+1j))
> Out[9]: array([ 1.+1.j])

Taking care of my imaginary bug has the nice side effect of leaving us
with only one return statement. I changed

return nan_to_num(y.real) + 1j * nan_to_num(y.imag)

to

y = nan_to_num(y.real) + 1j * nan_to_num(y.imag)

And changed the if on the next line to elif.

def nan_to_num(x):
    is_scalar = False
    if not isinstance(x, _nx.ndarray):
        x = asarray(x)
        if x.shape == ():
            # Must return this as a scalar later.
            is_scalar = True
    y = x.copy()
    old_shape = y.shape
    if y.shape == ():
        # We need element access.
        y.shape = (1,)
    t = y.dtype.type
    if issubclass(t, _nx.complexfloating):
        y = nan_to_num(y.real) + 1j * nan_to_num(y.imag)
    elif issubclass(t, _nx.inexact):
        are_inf = isposinf(y)
        are_neg_inf = isneginf(y)
        are_nan = isnan(y)
        maxf, minf = _getmaxmin(y.dtype.type)
        y[are_nan] = 0
        y[are_inf] = maxf
        y[are_neg_inf] = minf
    if is_scalar:
        y = y[0]
    else:
        y.shape = old_shape
    return y

From th.v.d.gronde at hccnet.nl Sat Dec 12 06:59:16 2009
From: th.v.d.gronde at hccnet.nl (Jasper van de Gronde)
Date: Sat, 12 Dec 2009 12:59:16 +0100
Subject: [Numpy-discussion] Slicing slower than matrix multiplication?
In-Reply-To: <200912111703.21261.faltet@pytables.org>
References: <4B2231B6.6000709@hccnet.nl> <4B224F6D.6000606@hccnet.nl>
	<4B2268DD.9070603@student.matnat.uio.no>
	<200912111703.21261.faltet@pytables.org>
Message-ID: <4B238594.1000305@hccnet.nl>

Francesc Alted wrote:
> ...
> Yeah, I think taking slices here is taking quite a lot of time:
>
> In [58]: timeit E + Xi2[P/2,:]
> 100000 loops, best of 3: 3.95 µs per loop
>
> In [59]: timeit E + Xi2[P/2]
> 100000 loops, best of 3: 2.17 µs per loop
>
> don't know why the additional ',:' in the slice is taking so much time, but my
> guess is that passing & analyzing the second argument (slice(None,None,None))
> could be responsible for the slowdown (but that is taking too much time).
> Mmh, perhaps it would be worth to study this more carefully so that an
> optimization could be done in NumPy.

This is indeed interesting! And very nice that this actually works the
way you'd expect it to. I guess I've just worked too long with Matlab :)

>> I think the lesson mostly should be that with so little data,
>> benchmarking becomes a very difficult art.
>
> Well, I think it is not difficult, it is just that you are perhaps
> benchmarking Python/NumPy machinery instead ;-) I'm curious whether Matlab
> can do slicing much faster than NumPy. Jasper?

I had a look, these are the timings for Python for 60x20:
  Dot product: 0.051165 (5.116467e-06 per iter)
  Add a row: 0.092849 (9.284860e-06 per iter)
  Add a column: 0.082523 (8.252348e-06 per iter)
For Matlab 60x20:
  Dot product: 0.029927 (2.992664e-006 per iter)
  Add a row: 0.019664 (1.966444e-006 per iter)
  Add a column: 0.008384 (8.384376e-007 per iter)
For Python 600x200:
  Dot product: 1.917235 (1.917235e-04 per iter)
  Add a row: 0.113243 (1.132425e-05 per iter)
  Add a column: 0.162740 (1.627397e-05 per iter)
For Matlab 600x200:
  Dot product: 1.282778 (1.282778e-004 per iter)
  Add a row: 0.107252 (1.072525e-005 per iter)
  Add a column: 0.021325 (2.132527e-006 per iter)

If I fit a line through these two data points (60 and 600 rows), I get
the following equations:
  Python, AR: 3.8e-5 * n + 0.091
  Matlab, AC: 2.4e-5 * n + 0.0069
This would suggest that Matlab performs the vector addition about 1.6
times faster and has a 13 times smaller constant cost!

As for the questions about what I'm trying to compute, these tests are
minimized as much as possible to show the bottleneck I encountered; they
are part of a larger loop where it does make sense. In essence I'm
iteratively adjusting w and E has to keep up (because that's what is
used to determine the next change). Instead of recomputing E all the
time based on E = Xi*w, a little linear algebra shows that the vector
addition is sufficient.
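A minimal sketch of the bookkeeping Jasper describes, checking that the
cheap vector addition keeps E in sync with a full dot(Xi, w) recomputation
(array names follow the thread; sizes are illustrative):

import numpy as np

P, N = 60, 20
Xi = np.random.standard_normal((P, N))
w = np.random.standard_normal(N)

E = np.dot(Xi, w)          # E tracks Xi . w
Xi2 = np.dot(Xi, Xi.T)     # P x P Gram matrix, computed once

mu = E.argmin()            # row chosen for the update
w += Xi[mu] / N            # adjust the weights...
E += Xi2[mu] / N           # ...and update E in O(P) instead of O(P*N)

assert np.allclose(E, np.dot(Xi, w))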
From eadrogue at gmx.net Sat Dec 12 07:00:29 2009
From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=)
Date: Sat, 12 Dec 2009 13:00:29 +0100
Subject: [Numpy-discussion] indices of values contained in a list
Message-ID: <20091212120029.GA24072@doriath.local>

Hi,

Suppose I have a flat array, and I want to know the
indices corresponding to values contained in a list
of arbitrary length.

Intuitively I would have done:

a = np.array([1,2,3,4])
np.nonzero(a in (0,2,4))

However the "in" operator doesn't work element-wise,
instead it compares the whole array with each member
of the list.

I have found that this does the trick:

b = (0,2,4)
reduce(np.logical_or, [a == i for i in b])

then pass the result to np.nonzero to get the indices,
but, is there a numpy function that can handle this
situation?

Cheers.
Ernest

From gruben at bigpond.net.au Sat Dec 12 07:51:42 2009
From: gruben at bigpond.net.au (Gary Ruben)
Date: Sat, 12 Dec 2009 23:51:42 +1100
Subject: [Numpy-discussion] indices of values contained in a list
In-Reply-To: <20091212120029.GA24072@doriath.local>
References: <20091212120029.GA24072@doriath.local>
Message-ID: <4B2391DE.3060806@bigpond.net.au>

np.setmember1d(a,b) does the same as your

reduce(np.logical_or, [a == i for i in b])

but it's actually slower on my machine!

Gary R.

Ernest Adrogué wrote:
> Hi,
>
> Suppose I have a flat array, and I want to know the
> indices corresponding to values contained in a list
> of arbitrary length.
>
> Intuitively I would have done:
>
> a = np.array([1,2,3,4])
> np.nonzero(a in (0,2,4))
>
> However the "in" operator doesn't work element-wise,
> instead it compares the whole array with each member
> of the list.
>
> I have found that this does the trick:
>
> b = (0,2,4)
> reduce(np.logical_or, [a == i for i in b])
>
> then pass the result to np.nonzero to get the indices,
> but, is there a numpy function that can handle this
> situation?
>
> Cheers.
> Ernest

From bsouthey at gmail.com Sat Dec 12 10:17:05 2009
From: bsouthey at gmail.com (Bruce Southey)
Date: Sat, 12 Dec 2009 09:17:05 -0600
Subject: [Numpy-discussion] Slicing slower than matrix multiplication?
In-Reply-To: <4B238594.1000305@hccnet.nl>
References: <4B2231B6.6000709@hccnet.nl> <4B224F6D.6000606@hccnet.nl>
	<4B2268DD.9070603@student.matnat.uio.no>
	<200912111703.21261.faltet@pytables.org> <4B238594.1000305@hccnet.nl>
Message-ID:

On Sat, Dec 12, 2009 at 5:59 AM, Jasper van de Gronde wrote:
> Francesc Alted wrote:
>> ...
>> Yeah, I think taking slices here is taking quite a lot of time:
>>
>> In [58]: timeit E + Xi2[P/2,:]
>> 100000 loops, best of 3: 3.95 µs per loop
>>
>> In [59]: timeit E + Xi2[P/2]
>> 100000 loops, best of 3: 2.17 µs per loop
>>
>> don't know why the additional ',:' in the slice is taking so much time, but my
>> guess is that passing & analyzing the second argument (slice(None,None,None))
>> could be responsible for the slowdown (but that is taking too much time).
>> Mmh, perhaps it would be worth to study this more carefully so that an
>> optimization could be done in NumPy.
>
> This is indeed interesting! And very nice that this actually works the
> way you'd expect it to. I guess I've just worked too long with Matlab :)
>
>>> I think the lesson mostly should be that with so little data,
>>> benchmarking becomes a very difficult art.
>>
>> Well, I think it is not difficult, it is just that you are perhaps
>> benchmarking Python/NumPy machinery instead ;-) I'm curious whether Matlab
>> can do slicing much faster than NumPy. Jasper?
>
> I had a look, these are the timings for Python for 60x20:
>   Dot product: 0.051165 (5.116467e-06 per iter)
>   Add a row: 0.092849 (9.284860e-06 per iter)
>   Add a column: 0.082523 (8.252348e-06 per iter)
> For Matlab 60x20:
>   Dot product: 0.029927 (2.992664e-006 per iter)
>   Add a row: 0.019664 (1.966444e-006 per iter)
>   Add a column: 0.008384 (8.384376e-007 per iter)
> For Python 600x200:
>   Dot product: 1.917235 (1.917235e-04 per iter)
>   Add a row: 0.113243 (1.132425e-05 per iter)
>   Add a column: 0.162740 (1.627397e-05 per iter)
> For Matlab 600x200:
>   Dot product: 1.282778 (1.282778e-004 per iter)
>   Add a row: 0.107252 (1.072525e-005 per iter)
>   Add a column: 0.021325 (2.132527e-006 per iter)
>
> If I fit a line through these two data points (60 and 600 rows), I get
> the following equations:
>   Python, AR: 3.8e-5 * n + 0.091
>   Matlab, AC: 2.4e-5 * n + 0.0069
> This would suggest that Matlab performs the vector addition about 1.6
> times faster and has a 13 times smaller constant cost!
>
> As for the questions about what I'm trying to compute, these tests are
> minimized as much as possible to show the bottleneck I encountered; they
> are part of a larger loop where it does make sense. In essence I'm
> iteratively adjusting w and E has to keep up (because that's what is
> used to determine the next change). Instead of recomputing E all the
> time based on E = Xi*w, a little linear algebra shows that the vector
> addition is sufficient.

Well, I think the difference between Matlab and numpy has been well
discussed elsewhere (http://www.scipy.org/NumPy_for_Matlab_Users),
especially the C/Fortran order difference. Clearly you are not using the
same level of optimized libraries because of the differences shown.
Unfortunately your code does not distinguish between the dot products
and the slicing, so the slower dot product rules your times. Really you
need to compare your slicing alone, without any dot product or inplace
addition.

Really I would suggest asking the list for the real problem because it
is often amazing what solutions have been given.

Bruce

From kwgoodman at gmail.com Sat Dec 12 11:16:42 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Sat, 12 Dec 2009 08:16:42 -0800
Subject: [Numpy-discussion] indices of values contained in a list
In-Reply-To: <20091212120029.GA24072@doriath.local>
References: <20091212120029.GA24072@doriath.local>
Message-ID:

2009/12/12 Ernest Adrogué:
> Hi,
>
> Suppose I have a flat array, and I want to know the
> indices corresponding to values contained in a list
> of arbitrary length.
>
> Intuitively I would have done:
>
> a = np.array([1,2,3,4])
> np.nonzero(a in (0,2,4))
>
> However the "in" operator doesn't work element-wise,
> instead it compares the whole array with each member
> of the list.
>
> I have found that this does the trick:
>
> b = (0,2,4)
> reduce(np.logical_or, [a == i for i in b])
>
> then pass the result to np.nonzero to get the indices,
> but, is there a numpy function that can handle this
> situation?
If a and b are as short as in your example, which I doubt, here's a faster way:

>> timeit np.nonzero(reduce(np.logical_or, [a == i for i in b]))
100000 loops, best of 3: 14 µs per loop
>> timeit [i for i, z in enumerate(a) if z in b]
100000 loops, best of 3: 3.43 µs per loop

Looping over a instead of b is faster if len(a) is much less than len(b):

>> a = np.random.randint(0,100,10000)
>> b = tuple(set(a[:50].tolist()))
>> len(b)
41
>> timeit np.nonzero(reduce(np.logical_or, [a == i for i in b]))
100 loops, best of 3: 2.65 ms per loop
>> timeit [i for i, z in enumerate(a) if z in b]
10 loops, best of 3: 37.7 ms per loop
>> b, a = a, b
>> timeit np.nonzero(reduce(np.logical_or, [a == i for i in b]))
10 loops, best of 3: 165 ms per loop
>> timeit [i for i, z in enumerate(a) if z in b]
1000 loops, best of 3: 597 µs per loop

From tjhnson at gmail.com Sat Dec 12 16:55:17 2009
From: tjhnson at gmail.com (T J)
Date: Sat, 12 Dec 2009 13:55:17 -0800
Subject: [Numpy-discussion] Repeated dot products
Message-ID:

Hi,

Suppose I have an array of shape: (n, k, k). In this case, I have n
k-by-k matrices. My goal is to compute the product of a (potentially
large) user-specified selection (with replacement) of these matrices.
For example,

   x = [0,1,2,1,3,3,2,1,3,2,1,5,3,2,3,5,2,5,3,2,1,3,5,6]

says that I want to take the 0th matrix and dot it with the 1st matrix
and dot that product with the 2nd matrix and dot that product with the
1st matrix again and so on...

Essentially, I am looking for efficient ways of doing this. It seems
like what I *need* is for dot to be a ufunc with a reduce() operator.
Then I would construct an array of the matrices, as specified by the
input x. For now, I am using a python loop and this is unbearable.

>>> prod = np.eye(k)
>>> for i in x:
...     prod = dot(prod, matrices[i])
...

Is there a better way?

From josef.pktd at gmail.com Sat Dec 12 17:17:03 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sat, 12 Dec 2009 17:17:03 -0500
Subject: [Numpy-discussion] Repeated dot products
In-Reply-To:
References:
Message-ID: <1cd32cbb0912121417v4d6e4605o624a54d23207438f@mail.gmail.com>

On Sat, Dec 12, 2009 at 4:55 PM, T J wrote:
> Hi,
>
> Suppose I have an array of shape: (n, k, k). In this case, I have n
> k-by-k matrices. My goal is to compute the product of a (potentially
> large) user-specified selection (with replacement) of these matrices.
> For example,
>
>    x = [0,1,2,1,3,3,2,1,3,2,1,5,3,2,3,5,2,5,3,2,1,3,5,6]
>
> says that I want to take the 0th matrix and dot it with the 1st matrix
> and dot that product with the 2nd matrix and dot that product with the
> 1st matrix again and so on...
>
> Essentially, I am looking for efficient ways of doing this. It seems
> like what I *need* is for dot to be a ufunc with a reduce() operator.
> Then I would construct an array of the matrices, as specified by the
> input x. For now, I am using a python loop and this is unbearable.
>
>>>> prod = np.eye(k)
>>>> for i in x:
> ...     prod = dot(prod, matrices[i])
> ...
>
> Is there a better way?
I don't know about numpy, but I was using python reduce for similar:

>>> d = np.eye(2)
>>> li = [d,d,d,d,d,d,d,d]
>>> reduce(np.dot,li)
array([[ 1.,  0.],
       [ 0.,  1.]])
>>> lii = [1,2,1,2,3]
>>> reduce(lambda x,i : np.dot(x,li[i]), lii, np.eye(2))
array([[ 1.,  0.],
       [ 0.,  1.]])

quickly written from memory, not sure it's always correct (note the
identity seed in the second reduce, so the first index is not mistaken
for a matrix), and no idea about speed compared to a loop

Josef

From dlaxalde at gmail.com Sat Dec 12 17:27:14 2009
From: dlaxalde at gmail.com (Denis Laxalde)
Date: Sat, 12 Dec 2009 17:27:14 -0500
Subject: [Numpy-discussion] Repeated dot products
In-Reply-To:
References:
Message-ID: <20091212222714.GC14677@nanuk>

On Saturday, December 12, 2009 at 01:55 -0800, T J wrote:
> Is there a better way?

You may have a look at http://scipy.org/Cookbook/MultiDot
Several alternatives are proposed.

Cheers,

-- 
Denis

From peridot.faceted at gmail.com Sat Dec 12 17:30:46 2009
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Sat, 12 Dec 2009 17:30:46 -0500
Subject: [Numpy-discussion] Repeated dot products
In-Reply-To:
References:
Message-ID:

2009/12/12 T J:
> Hi,
>
> Suppose I have an array of shape: (n, k, k). In this case, I have n
> k-by-k matrices. My goal is to compute the product of a (potentially
> large) user-specified selection (with replacement) of these matrices.
> For example,
>
>    x = [0,1,2,1,3,3,2,1,3,2,1,5,3,2,3,5,2,5,3,2,1,3,5,6]
>
> says that I want to take the 0th matrix and dot it with the 1st matrix
> and dot that product with the 2nd matrix and dot that product with the
> 1st matrix again and so on...
>
> Essentially, I am looking for efficient ways of doing this. It seems
> like what I *need* is for dot to be a ufunc with a reduce() operator.
> Then I would construct an array of the matrices, as specified by the
> input x. For now, I am using a python loop and this is unbearable.
>
>>>> prod = np.eye(k)
>>>> for i in x:
> ...     prod = dot(prod, matrices[i])
> ...
>
> Is there a better way?

You are right that numpy/scipy should have a matrix product ufunc. In
fact it should have ufunc versions of all the linear algebra tools. It
even has the infrastructure (generalized ufuncs) to support such a
thing in a very general way. Sadly this infrastructure is basically
unused.

Your best workaround depends on the sizes of the various objects
involved. If k is large enough, then a k by k by k matrix multiply will
be slow enough that it doesn't really matter what else you do. If x is
much longer than n, you will want to avoid constructing a len(x) by k
by k array. If n is really small and x really large, you might even win
by some massively clever Knuthian operation that found repeated
substrings in x and evaluated each corresponding matrix product only
once.

If on the other hand len(x) is much shorter than n, you could perhaps
benefit from forming an intermediate len(x) by k by k array. This array
could then be used in a vectorized reduce operation, if we had one.
Normally, one can work around the lack of a vectorized matrix
multiplication by forming an (n*k) by k matrix and multiplying it, but
I don't think this will help any with reduce. If k is small, you can
multiply two k by k matrices by producing the k by k by k elementwise
product and then adding along the middle axis, but I don't think this
will work well with reduction either.

So I have just two practical solutions, really: (a) use python reduce()
and the matrix product, possibly taking advantage of the lack of need
to produce an intermediate len(x) by k by k matrix, and live with
python in that part of the loop, or (b) use the fact that in recent
versions of numpy there's a quasi-undocumented ufuncized matrix
multiply built as a test of the generalized ufunc mechanism. I have no
idea whether it supports reduce() (and if it doesn't, adding it may be
an unpleasant experience).

Good luck,
Anne

P.S. if you come up with a good implementation of the repeated
substring approach I'd love to hear it! -A
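Putting the reduce() approach together for the original indexed problem, a
small self-contained check (toy sizes assumed) that it matches the plain
python loop from the first post:

import numpy as np

k, n = 4, 6
matrices = np.random.standard_normal((n, k, k))
x = [0, 1, 2, 1, 3, 3, 2, 1, 5, 3]

# reduce() over the selected matrices, seeded with the identity
prod = reduce(np.dot, (matrices[i] for i in x), np.eye(k))

# the explicit loop from the original post
check = np.eye(k)
for i in x:
    check = np.dot(check, matrices[i])

assert np.allclose(prod, check)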
From fonnesbeck at gmail.com Sat Dec 12 17:57:33 2009
From: fonnesbeck at gmail.com (Chris)
Date: Sat, 12 Dec 2009 22:57:33 +0000 (UTC)
Subject: [Numpy-discussion] Import error in builds of 7726
References: <4AFA285A.6000105@ar.media.kyoto-u.ac.jp>
	<5b8d13220911120516j521e70f8jfac325ef5e157e33@mail.gmail.com>
	<5b8d13220911121950p4c52e60ej93040405e7296b40@mail.gmail.com>
	<5b8d13220911281744ib60ab05j9fefdd5c7749439@mail.gmail.com>
Message-ID:

Here is a log from a build of svn rev 7996 with no LDFLAGS specified, as
recommended by Robert. The result is the same, however.

http://files.me.com/fonnesbeck/y7e9v2

cf

From totalbull at mac.com Sat Dec 12 20:08:22 2009
From: totalbull at mac.com (THOMAS BROWNE)
Date: Sun, 13 Dec 2009 01:08:22 +0000
Subject: [Numpy-discussion] Question on timeseries, for financial application
Message-ID: <15DD68D1-B0E3-43BF-BC8A-784070842049@mac.com>

Hello all,

Quite new to numpy / timeseries module, please forgive the elementary question.

I wish to do a bunch of multivariate analysis on 1000 different financial markets series, each holding about 1800 data points (5 years of daily data).

What's the best way to put this into a TimeSeries object? Should I use a structured data type (in which case I can reference each series by name), or should I put it into one big numpy array object (in which case I guess I'll have to keep track of the series name in an internal structure)? What are the advantages and disadvantages of each?

Ideally I'd have liked the ease and simplicity of being able to reference each series by name, while maintaining the fast speed and clean structure of one big numpy array. Any way of getting both?

Once I have a multivariate TimeSeries, how do I add another series to it?

Thanks for the help.

Thomas.

From josef.pktd at gmail.com Sat Dec 12 22:03:09 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sat, 12 Dec 2009 22:03:09 -0500
Subject: [Numpy-discussion] Question on timeseries, for financial application
In-Reply-To: <15DD68D1-B0E3-43BF-BC8A-784070842049@mac.com>
References: <15DD68D1-B0E3-43BF-BC8A-784070842049@mac.com>
Message-ID: <1cd32cbb0912121903g389e0d56o4f2e570db0e5fd@mail.gmail.com>

On Sat, Dec 12, 2009 at 8:08 PM, THOMAS BROWNE wrote:
> Hello all,
>
> Quite new to numpy / timeseries module, please forgive the elementary question.
>
> I wish to do a bunch of multivariate analysis on 1000 different financial markets series, each holding about 1800 data points (5 years of daily data).
>
> What's the best way to put this into a TimeSeries object? Should I use a structured data type (in which case I can reference each series by name), or should I put it into one big numpy array object (in which case I guess I'll have to keep track of the series name in an internal structure)? What are the advantages and disadvantages of each?
>
> Ideally I'd have liked the ease and simplicity of being able to reference each series by name, while maintaining the fast speed and clean structure of one big numpy array. Any way of getting both?
>
> Once I have a multivariate TimeSeries, how do I add another series to it?

I'm not sure if your TimeSeries object refers to the scikits or to
writing your own application. In the latter case, I would recommend
looking at the following for data handling; the first two are written
or co-written with finance applications in mind, the last is a nice
package for working with structured arrays, but I don't know how
extensive its time handling is.

http://code.google.com/p/pandas/
general (2d) panel data, arbitrary axis labels possible, integrates
scikits.statsmodels

http://scikits.appspot.com/timeseries
based on masked arrays, extensive time handling

http://pypi.python.org/pypi/tabular

If you need to do more serious data work and don't have a special
requirement on the data structure, I think it would be better to use
(and contribute adjustments/extensions/testing to) one of the existing
data structures than to write your own. All 3 are BSD or MIT licensed.

Josef

>
> Thanks for the help.
>
> Thomas.

From ferrell at diablotech.com Sun Dec 13 00:11:32 2009
From: ferrell at diablotech.com (Robert Ferrell)
Date: Sat, 12 Dec 2009 22:11:32 -0700
Subject: [Numpy-discussion] Question on timeseries, for financial application
In-Reply-To: <15DD68D1-B0E3-43BF-BC8A-784070842049@mac.com>
References: <15DD68D1-B0E3-43BF-BC8A-784070842049@mac.com>
Message-ID: <5A037C98-1485-4AFB-9133-CB740E298824@diablotech.com>

Have you considered creating a TimeSeries for each data series, and
then putting them all together in a dict, keyed by symbol?

One disadvantage of one big monster numpy array for all the series is
that not all series may have a full set of 1800 data points. So the
array isn't really nicely rectangular.

Not sure exactly what kind of analysis you want to do, but grabbing a
series from a dict is quite fast.

-r

On Dec 12, 2009, at 6:08 PM, THOMAS BROWNE wrote:

> Hello all,
>
> Quite new to numpy / timeseries module, please forgive the
> elementary question.
>
> I wish to do a bunch of multivariate analysis on 1000
> different financial markets series, each holding about 1800 data
> points (5 years of daily data).
>
> What's the best way to put this into a TimeSeries object? Should I
> use a structured data type (in which case I can reference each
> series by name), or should I put it into one big numpy array object
> (in which case I guess I'll have to keep track of the series name in
> an internal structure)? What are the advantages and disadvantages of
> each?
>
> Ideally I'd have liked the ease and simplicity of being able to
> reference each series by name, while maintaining the fast speed and
> clean structure of one big numpy array. Any way of getting both?
>
> Once I have a multivariate TimeSeries, how do I add another series
> to it?
>
> Thanks for the help.
>
> Thomas.
From cournape at gmail.com Sun Dec 13 01:37:28 2009
From: cournape at gmail.com (David Cournapeau)
Date: Sun, 13 Dec 2009 12:07:28 +0530
Subject: [Numpy-discussion] Import error in builds of 7726
In-Reply-To:
References: <5b8d13220911121950p4c52e60ej93040405e7296b40@mail.gmail.com>
	<5b8d13220911281744ib60ab05j9fefdd5c7749439@mail.gmail.com>
Message-ID: <5b8d13220912122237p5da315e0yb7cc83f0e8edbbe1@mail.gmail.com>

On Sun, Dec 13, 2009 at 4:27 AM, Chris wrote:
> Here is a log from a build of svn rev 7996 with no LDFLAGS specified, as
> recommended by Robert. The result is the same, however.
>
> http://files.me.com/fonnesbeck/y7e9v2

I don't see any build error on this log ?

David

From robert.kern at gmail.com Sun Dec 13 02:25:12 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Sun, 13 Dec 2009 01:25:12 -0600
Subject: [Numpy-discussion] Import error in builds of 7726
In-Reply-To: <5b8d13220912122237p5da315e0yb7cc83f0e8edbbe1@mail.gmail.com>
References: <5b8d13220911121950p4c52e60ej93040405e7296b40@mail.gmail.com>
	<5b8d13220911281744ib60ab05j9fefdd5c7749439@mail.gmail.com>
	<5b8d13220912122237p5da315e0yb7cc83f0e8edbbe1@mail.gmail.com>
Message-ID: <3d375d730912122325w279f2a12uba9dd22f58cbbe3f@mail.gmail.com>

On Sun, Dec 13, 2009 at 00:37, David Cournapeau wrote:
> On Sun, Dec 13, 2009 at 4:27 AM, Chris wrote:
>> Here is a log from a build of svn rev 7996 with no LDFLAGS specified, as
>> recommended by Robert. The result is the same, however.
>>
>> http://files.me.com/fonnesbeck/y7e9v2
>
> I don't see any build error on this log ?

See earlier in the thread. The error occurs at runtime:

import umath
ImportError: dlopen(/Library/Python/2.6/site-packages/numpy-1.4.0.dev7726-py2.6-macosx-10.6-universal.egg/numpy/core/umath.so, 2): Symbol not found: _npy_cexp
  Referenced from: /Library/Python/2.6/site-packages/numpy-1.4.0.dev7726-py2.6-macosx-10.6-universal.egg/numpy/core/umath.so
  Expected in: flat namespace
  in /Library/Python/2.6/site-packages/numpy-1.4.0.dev7726-py2.6-macosx-10.6-universal.egg/numpy/core/umath.so

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From pivanov314 at gmail.com Sun Dec 13 02:27:50 2009
From: pivanov314 at gmail.com (Paul Ivanov)
Date: Sat, 12 Dec 2009 23:27:50 -0800
Subject: [Numpy-discussion] doctest improvements patch (and possible
	regressions)
In-Reply-To: <4B1FAE55.9070605@stsci.edu>
References: <4B1EB148.1010607@gmail.com> <4B1FAE55.9070605@stsci.edu>
Message-ID: <4B249776.3010107@gmail.com>

So far, no one has voiced objections, so should I go ahead and check
this in?

btw, thanks Mike. What about this one:

>>> (np.char.lstrip(c, ' ') == np.char.lstrip(c, '')).all()
... # XXX: is this a regression? this line now returns False
... # np.char.lstrip(c,'') does not modify c at all.
True

-pi

best,
Paul Ivanov

Michael Droettboom, on 2009-12-09 06:04, wrote:
> Paul Ivanov wrote:
>> I marked up suspicious differences with XXX, since I don't know if
>> they're significant. In particular:
>> - shortening a defchararray by strip does not change its dtype to a
>> shorter one (apparently it used to?)
> Yes. The new behavior is to return a string array with the same
> itemsize as the input array.
> That's primarily just the result of the
> new implementation rather than a thought out change, though.
>
> Sorry, just commenting on the parts I feel competent in :) But I think
> this is a great improvement. It would be nice to start doing doctests
> as a matter of course to keep the docs accurate.
>
> Mike

From pgmdevlist at gmail.com Sun Dec 13 03:31:06 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Sun, 13 Dec 2009 03:31:06 -0500
Subject: [Numpy-discussion] Question on timeseries, for financial application
In-Reply-To: <5A037C98-1485-4AFB-9133-CB740E298824@diablotech.com>
References: <15DD68D1-B0E3-43BF-BC8A-784070842049@mac.com>
	<5A037C98-1485-4AFB-9133-CB740E298824@diablotech.com>
Message-ID: <5E7E7C2B-3AEF-4AC1-829F-BB39914C1571@gmail.com>

On Dec 13, 2009, at 12:11 AM, Robert Ferrell wrote:
> Have you considered creating a TimeSeries for each data series, and
> then putting them all together in a dict, keyed by symbol?

That's an idea

> One disadvantage of one big monster numpy array for all the series is
> that not all series may have a full set of 1800 data points. So the
> array isn't really nicely rectangular.

Bah, there's adjust_endpoints to take care of that.

> Not sure exactly what kind of analysis you want to do, but grabbing a
> series from a dict is quite fast.

Thomas, as Robert F. pointed out, everything depends on the kind of analysis you want. If you want to normalize your series, having all of them in a big array is the best (plain array, not structured, so that you can apply .mean and .std directly without having to loop on the series). If you need to apply the same function over all the series, here again having a big ndarray is easiest. Give us an example of what you wanna do.

From cournape at gmail.com Sun Dec 13 03:56:23 2009
From: cournape at gmail.com (David Cournapeau)
Date: Sun, 13 Dec 2009 14:26:23 +0530
Subject: [Numpy-discussion] Import error in builds of 7726
In-Reply-To: <3d375d730912122325w279f2a12uba9dd22f58cbbe3f@mail.gmail.com>
References: <5b8d13220911281744ib60ab05j9fefdd5c7749439@mail.gmail.com>
	<5b8d13220912122237p5da315e0yb7cc83f0e8edbbe1@mail.gmail.com>
	<3d375d730912122325w279f2a12uba9dd22f58cbbe3f@mail.gmail.com>
Message-ID: <5b8d13220912130056w170b9e22g8e50e7e726f60596@mail.gmail.com>

On Sun, Dec 13, 2009 at 12:55 PM, Robert Kern wrote:
>> I don't see any build error on this log ?
>
> See earlier in the thread. The error occurs at runtime:

Right. Chris, could you show the output from nm on umath.so, to check
what symbols are missing. Maybe seeing the whole list would bring
something.

David

From th.v.d.gronde at hccnet.nl Sun Dec 13 06:13:11 2009
From: th.v.d.gronde at hccnet.nl (Jasper van de Gronde)
Date: Sun, 13 Dec 2009 12:13:11 +0100
Subject: [Numpy-discussion] Slicing slower than matrix multiplication?
In-Reply-To:
References: <4B2231B6.6000709@hccnet.nl> <4B224F6D.6000606@hccnet.nl>
	<4B2268DD.9070603@student.matnat.uio.no>
	<200912111703.21261.faltet@pytables.org> <4B238594.1000305@hccnet.nl>
Message-ID: <4B24CC47.1060601@hccnet.nl>

Bruce Southey wrote:
> Really I would suggest asking the list for the real problem because it
> is often amazing what solutions have been given.
So far this is the fastest code I've got:

------------------------------------------------------------------------
import numpy as np

nmax = 100

def minover(Xi,S):
    P,N = Xi.shape
    SXi = Xi.copy()
    for i in xrange(0,P):
        SXi[i] *= S[i]
    SXi2 = np.dot(SXi,SXi.T)
    SXiSXi2divN = np.concatenate((SXi,SXi2),axis=1)/N
    w = np.random.standard_normal((N))
    E = np.dot(SXi,w)
    wE = np.concatenate((w,E))
    for s in xrange(0,nmax*P):
        mu = wE[N:].argmin()
        wE += SXiSXi2divN[mu]
        # E' = dot(SXi,w')
        #    = dot(SXi,w + SXi[mu,:]/N)
        #    = dot(SXi,w) + dot(SXi,SXi[mu,:])/N
        #    = E + dot(SXi,SXi.T)[:,mu]/N
        #    = E + dot(SXi,SXi.T)[mu,:]/N
    return wE[:N]
------------------------------------------------------------------------

I am particularly interested in cleaning up the initialization part,
but any suggestions for improving the overall performance are of course
appreciated.

From josef.pktd at gmail.com Sun Dec 13 09:07:23 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sun, 13 Dec 2009 09:07:23 -0500
Subject: [Numpy-discussion] Question on timeseries, for financial application
In-Reply-To: <5E7E7C2B-3AEF-4AC1-829F-BB39914C1571@gmail.com>
References: <15DD68D1-B0E3-43BF-BC8A-784070842049@mac.com>
	<5A037C98-1485-4AFB-9133-CB740E298824@diablotech.com>
	<5E7E7C2B-3AEF-4AC1-829F-BB39914C1571@gmail.com>
Message-ID: <1cd32cbb0912130607k5e88382dkf58c1eb5fca291df@mail.gmail.com>

On Sun, Dec 13, 2009 at 3:31 AM, Pierre GM wrote:
> On Dec 13, 2009, at 12:11 AM, Robert Ferrell wrote:
>> Have you considered creating a TimeSeries for each data series, and
>> then putting them all together in a dict, keyed by symbol?
>
> That's an idea

As far as I understand, that's what pandas.DataFrame does.
pandas.DataMatrix used 2d array to store data

>
>> One disadvantage of one big monster numpy array for all the series is
>> that not all series may have a full set of 1800 data points. So the
>> array isn't really nicely rectangular.
>
> Bah, there's adjust_endpoints to take care of that.
>
>>
>> Not sure exactly what kind of analysis you want to do, but grabbing a
>> series from a dict is quite fast.
>
> Thomas, as Robert F. pointed out, everything depends on the kind of analysis you want. If you want to normalize your series, having all of them in a big array is the best (plain array, not structured, so that you can apply .mean and .std directly without having to loop on the series). If you need to apply the same function over all the series, here again having a big ndarray is easiest. Give us an example of what you wanna do.

Or a structured array with homogeneous type that allows fast creation
of views for data analysis.

Josef
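For the storage question itself, a tiny sketch combining both suggestions
in this thread: one plain 2-d array for fast whole-panel math, plus a
name-to-column dict for by-symbol access (the names and sizes are made up):

import numpy as np

names = ['SYM%04d' % i for i in range(1000)]
data = np.random.rand(1800, 1000)     # 1800 observations x 1000 series
col = dict((name, j) for j, name in enumerate(names))

# whole-panel operations stay vectorized, as Pierre suggests
z = (data - data.mean(axis=0)) / data.std(axis=0)

# ...while a single series is still reachable by name
s = data[:, col['SYM0042']]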
From ferrell at diablotech.com Sun Dec 13 09:14:23 2009
From: ferrell at diablotech.com (Robert Ferrell)
Date: Sun, 13 Dec 2009 07:14:23 -0700
Subject: [Numpy-discussion] Question on timeseries, for financial application
In-Reply-To: <5E7E7C2B-3AEF-4AC1-829F-BB39914C1571@gmail.com>
References: <15DD68D1-B0E3-43BF-BC8A-784070842049@mac.com>
	<5A037C98-1485-4AFB-9133-CB740E298824@diablotech.com>
	<5E7E7C2B-3AEF-4AC1-829F-BB39914C1571@gmail.com>
Message-ID: <18692363-ECAC-434A-9F2C-5EF3F29D6212@diablotech.com>

On Dec 13, 2009, at 1:31 AM, Pierre GM wrote:
> On Dec 13, 2009, at 12:11 AM, Robert Ferrell wrote:
>> Have you considered creating a TimeSeries for each data series, and
>> then putting them all together in a dict, keyed by symbol?
>
> That's an idea
>
>> One disadvantage of one big monster numpy array for all the series is
>> that not all series may have a full set of 1800 data points. So the
>> array isn't really nicely rectangular.
>
> Bah, there's adjust_endpoints to take care of that.

Maybe this will work for the OP. In my work, if a series is missing
data the desirable thing is to use the data I have. I don't want to
truncate existing series to fit the short ones, nor pad to fit the
long ones. Really depends on the analysis the OP is trying to do.

From ferrell at diablotech.com Sun Dec 13 09:27:19 2009
From: ferrell at diablotech.com (Robert Ferrell)
Date: Sun, 13 Dec 2009 07:27:19 -0700
Subject: [Numpy-discussion] Question on timeseries, for financial application
In-Reply-To: <1cd32cbb0912130607k5e88382dkf58c1eb5fca291df@mail.gmail.com>
References: <15DD68D1-B0E3-43BF-BC8A-784070842049@mac.com>
	<5A037C98-1485-4AFB-9133-CB740E298824@diablotech.com>
	<5E7E7C2B-3AEF-4AC1-829F-BB39914C1571@gmail.com>
	<1cd32cbb0912130607k5e88382dkf58c1eb5fca291df@mail.gmail.com>
Message-ID: <8F64E7C2-19BF-4ECE-A472-42A0455DE4B4@diablotech.com>

On Dec 13, 2009, at 7:07 AM, josef.pktd at gmail.com wrote:

> On Sun, Dec 13, 2009 at 3:31 AM, Pierre GM wrote:
>> On Dec 13, 2009, at 12:11 AM, Robert Ferrell wrote:
>>> Have you considered creating a TimeSeries for each data series, and
>>> then putting them all together in a dict, keyed by symbol?
>>
>> That's an idea
>
> As far as I understand, that's what pandas.DataFrame does.
> pandas.DataMatrix used 2d array to store data
>
>>
>>> One disadvantage of one big monster numpy array for all the series is
>>> that not all series may have a full set of 1800 data points. So the
>>> array isn't really nicely rectangular.
>>
>> Bah, there's adjust_endpoints to take care of that.
>>
>>>
>>> Not sure exactly what kind of analysis you want to do, but grabbing a
>>> series from a dict is quite fast.
>>
>> Thomas, as Robert F. pointed out, everything depends on the kind of
>> analysis you want. If you want to normalize your series, having all
>> of them in a big array is the best (plain array, not structured, so
>> that you can apply .mean and .std directly without having to loop
>> on the series). If you need to apply the same function over all the
>> series, here again having a big ndarray is easiest. Give us an
>> example of what you wanna do.
>
> Or a structured array with homogeneous type that allows fast creation
> of views for data analysis.

These kinds of financial series don't have that much data (speaking
from the early 21st century point of view). The OP says 1000 series,
1800 observations per series. Maybe 5 data items per observation, 4
bytes each. That's well under 50MB. I've found it satisfactory to
keep the data someplace that's handy to get at, and easy to use. When
I want to do analysis I pull it into whatever format is best for that
analysis. Depending on the needs, it may not be necessary to try to
arrange the data so you can get a view for analysis - the time for a
copy can be negligible if the analysis takes a while.

-r

From robince at gmail.com Sun Dec 13 10:18:03 2009
From: robince at gmail.com (Robin)
Date: Sun, 13 Dec 2009 15:18:03 +0000
Subject: [Numpy-discussion] ma feature request (log2)
Message-ID: <2d5132a50912130718v7ae7301bta497ca921e64d6bc@mail.gmail.com>

Hi,

Could we have a ma aware numpy.ma.log2 please, similar to np.ma.log
and np.ma.log10?
I think it should be as simple as the patch below, but perhaps I've
missed something:

Thanks,

Robin

--- core.py.orig	2009-12-13 15:14:14.000000000 +0000
+++ core.py	2009-12-13 15:14:53.000000000 +0000
@@ -66,7 +66,7 @@
            'identity', 'ids', 'indices', 'inner', 'innerproduct', 'isMA',
            'isMaskedArray', 'is_mask', 'is_masked', 'isarray', 'left_shift',
            'less', 'less_equal', 'load', 'loads', 'log', 'log10',
-           'logical_and', 'logical_not', 'logical_or', 'logical_xor',
+           'log2', 'logical_and', 'logical_not', 'logical_or', 'logical_xor',
            'make_mask', 'make_mask_descr', 'make_mask_none', 'mask_or',
            'masked', 'masked_array', 'masked_equal', 'masked_greater',
            'masked_greater_equal', 'masked_inside', 'masked_invalid',
@@ -1124,6 +1124,8 @@
                               _DomainGreater(0.0))
 log10 = _MaskedUnaryOperation(umath.log10, 1.0,
                               _DomainGreater(0.0))
+log2 = _MaskedUnaryOperation(umath.log2, 1.0,
+                             _DomainGreater(0.0))
 tan = _MaskedUnaryOperation(umath.tan, 0.0,
                             _DomainTan(1e-35))
 arcsin = _MaskedUnaryOperation(umath.arcsin, 0.0,

From kirzhanov at gmail.com Sun Dec 13 11:00:51 2009
From: kirzhanov at gmail.com (iason)
Date: Sun, 13 Dec 2009 08:00:51 -0800 (PST)
Subject: [Numpy-discussion] A discrepancy between NumPy documentation and
	recomendations for beginers
Message-ID: <26765626.post@talk.nabble.com>

Here http://www.scipy.org/NumPy_for_Matlab_Users a recommendation to use
scipy.integrate.ode(...) with parameters "method='bdf', order=15" instead
of the ode15s function (from Matlab) is given. But from the documentation
for scipy.integrate.ode(...) one can find out that the accuracy order
("order") of the BDF method ("method='bdf'") is no greater than 5. What is
the correct value of the highest order of the BDF method? I think the
answer to this question is necessary for practical use of
scipy.integrate.ode(...).

I tried to migrate my numerical research from Matlab to NumPy and first
tried the scipy.integrate.ode(...) function because it supports single-step
calculation. I used the default parameters listed in this guide
http://www.scipy.org/NumPy_for_Matlab_Users and found the calculation
process extremely slow. It was unusable because it selected too small an
integration step, and I could complete only a few percent of my numerical
task (a stiff ODE integration) in an admissible time (tens of minutes).
Then I switched my program to use the simpler scipy.integrate.odeint
instead. The scipy.integrate.odeint function gave me an excellent solution
in several seconds. But it does not support single-step integration and
thus is not a good replacement for the ode15s function (Matlab).

What is the difference between scipy.integrate.ode and
scipy.integrate.odeint? How do I select correct parameters for the
scipy.integrate.ode function? After all, it seems to me that there might be
a bug in the scipy.integrate.ode implementation...
-- 
View this message in context: http://old.nabble.com/A-discrepancy-between-NumPy-documentation-and-recomendations-for-beginers-tp26765626p26765626.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.
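On the ode15s question above, a minimal sketch of driving
scipy.integrate.ode step by step with the BDF method, keeping order within
the documented limit of 5; the test problem, tolerances, and step sizes are
illustrative, not a claimed fix for the wiki page:

import numpy as np
from scipy.integrate import ode

def f(t, y, mu):
    # van der Pol oscillator, a standard stiff test problem for large mu
    return [y[1], mu * (1.0 - y[0] ** 2) * y[1] - y[0]]

r = ode(f).set_integrator('vode', method='bdf', order=5,
                          rtol=1e-6, atol=1e-8, nsteps=10000)
r.set_f_params(1000.0)
r.set_initial_value([2.0, 0.0], 0.0)

ts, ys = [r.t], [np.array(r.y)]
while r.successful() and r.t < 3000.0:
    r.integrate(r.t + 10.0)    # advance in chunks, ode15s-style
    ts.append(r.t)
    ys.append(np.array(r.y))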
From wesmckinn at gmail.com Sun Dec 13 14:59:15 2009
From: wesmckinn at gmail.com (Wes McKinney)
Date: Sun, 13 Dec 2009 14:59:15 -0500
Subject: [Numpy-discussion] Question on timeseries, for financial application
In-Reply-To: <8F64E7C2-19BF-4ECE-A472-42A0455DE4B4@diablotech.com>
References: <15DD68D1-B0E3-43BF-BC8A-784070842049@mac.com>
	<5A037C98-1485-4AFB-9133-CB740E298824@diablotech.com>
	<5E7E7C2B-3AEF-4AC1-829F-BB39914C1571@gmail.com>
	<1cd32cbb0912130607k5e88382dkf58c1eb5fca291df@mail.gmail.com>
	<8F64E7C2-19BF-4ECE-A472-42A0455DE4B4@diablotech.com>
Message-ID: <6c476c8a0912131159x3f2c47f6lecec0e1d708aef46@mail.gmail.com>

On Sun, Dec 13, 2009 at 9:27 AM, Robert Ferrell wrote:
>
> On Dec 13, 2009, at 7:07 AM, josef.pktd at gmail.com wrote:
>
>> On Sun, Dec 13, 2009 at 3:31 AM, Pierre GM wrote:
>>> On Dec 13, 2009, at 12:11 AM, Robert Ferrell wrote:
>>>> Have you considered creating a TimeSeries for each data series, and
>>>> then putting them all together in a dict, keyed by symbol?
>>>
>>> That's an idea
>>
>> As far as I understand, that's what pandas.DataFrame does.
>> pandas.DataMatrix used 2d array to store data
>>
>>>
>>>> One disadvantage of one big monster numpy array for all the series
>>>> is that not all series may have a full set of 1800 data points. So the
>>>> array isn't really nicely rectangular.
>>>
>>> Bah, there's adjust_endpoints to take care of that.
>>>
>>>>
>>>> Not sure exactly what kind of analysis you want to do, but
>>>> grabbing a series from a dict is quite fast.
>>>
>>> Thomas, as Robert F. pointed out, everything depends on the kind of
>>> analysis you want. If you want to normalize your series, having all
>>> of them in a big array is the best (plain array, not structured, so
>>> that you can apply .mean and .std directly without having to loop
>>> on the series). If you need to apply the same function over all the
>>> series, here again having a big ndarray is easiest. Give us an
>>> example of what you wanna do.
>>
>> Or a structured array with homogeneous type that allows fast creation
>> of views for data analysis.
>
> These kinds of financial series don't have that much data (speaking
> from the early 21st century point of view). The OP says 1000 series,
> 1800 observations per series. Maybe 5 data items per observation, 4
> bytes each. That's well under 50MB. I've found it satisfactory to
> keep the data someplace that's handy to get at, and easy to use. When
> I want to do analysis I pull it into whatever format is best for that
> analysis. Depending on the needs, it may not be necessary to try to
> arrange the data so you can get a view for analysis - the time for a
> copy can be negligible if the analysis takes a while.
>
> -r

As Josef mentioned, the pandas library is designed for the problem
we're discussing -- i.e. working with collections of time series or
cross-sectional data. The basic DataFrame object accepts a dict of
pandas.Series objects (or a dict of equal-length ndarrays and an array
of labels / dates) and provides slicing, reindexing, aggregation, and
other conveniences. I have not made an official release of the library
yet but it is quite robust and suitable for general use (I use it
actively in proprietary applications). HTML documentation is also not
available yet, but the docstrings are reasonably good. I hope to make
an official release by the end of the year with documentation, etc.

From fonnesbeck at gmail.com Sun Dec 13 16:33:05 2009
From: fonnesbeck at gmail.com (Chris)
Date: Sun, 13 Dec 2009 21:33:05 +0000 (UTC)
Subject: [Numpy-discussion] Import error in builds of 7726
References: <5b8d13220911281744ib60ab05j9fefdd5c7749439@mail.gmail.com>
	<5b8d13220912122237p5da315e0yb7cc83f0e8edbbe1@mail.gmail.com>
	<3d375d730912122325w279f2a12uba9dd22f58cbbe3f@mail.gmail.com>
	<5b8d13220912130056w170b9e22g8e50e7e726f60596@mail.gmail.com>
Message-ID:

David Cournapeau <cournape at gmail.com> writes:

> could you show the output from nm on umath.so, to check what
> symbols are missing. Maybe seeing the whole list would bring
> something.

Here it is:

http://files.me.com/fonnesbeck/6ezhy5

The symbol in question is in there, but I see that it does not have a
value.

From thomas.robitaille at gmail.com Sun Dec 13 18:18:18 2009
From: thomas.robitaille at gmail.com (Thomas Robitaille)
Date: Sun, 13 Dec 2009 18:18:18 -0500
Subject: [Numpy-discussion] Problem with set_fill_value for masked
	structured array
Message-ID: <7C8C88B0-7F76-4424-8275-3BF31F845AB2@gmail.com>

Hi,

The following code doesn't seem to work:

import numpy.ma as ma

t = ma.array(zip([1,2,3],[4,5,6]),dtype=[('a',int),('b',int)])
print repr(t['a'])
t['a'].set_fill_value(10)
print repr(t['a'])

As the output is

masked_array(data = [1 2 3],
             mask = [False False False],
       fill_value = 999999)

masked_array(data = [1 2 3],
             mask = [False False False],
       fill_value = 999999)

(and no exception is raised)

Am I doing something wrong?

Thanks in advance for any help,

Thomas

From gael.varoquaux at normalesup.org Sun Dec 13 18:55:50 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Mon, 14 Dec 2009 00:55:50 +0100
Subject: [Numpy-discussion] [Ann] EuroScipy 2010
Message-ID: <20091213235550.GB27356@phare.normalesup.org>

==========================
Announcing EuroScipy 2010
==========================

---------------------------------------------------
The 3rd European meeting on Python in Science
---------------------------------------------------

**Paris, Ecole Normale Supérieure, July 8-11 2010**

We are happy to announce the 3rd EuroScipy meeting, in Paris, July 2010.

The EuroSciPy meeting is a cross-disciplinary gathering focused on the
use and development of the Python language in scientific research. This
event strives to bring together both users and developers of scientific
tools, as well as academic research and state of the art industry.

Important dates
==================

======================================  ===================================
**Registration opens**                  Sunday March 29
**Paper submission deadline**           Sunday May 9
**Program announced**                   Sunday May 22
**Tutorials tracks**                    Thursday July 8 - Friday July 9
**Conference track**                    Saturday July 10 - Sunday July 11
======================================  ===================================

Tutorial
=========

There will be two tutorial tracks at the conference, an introductory one,
to bring up to speed with the Python language as a scientific tool, and
an advanced track, during which experts of the field will lecture on
specific advanced topics such as advanced use of numpy, scientific
visualization, software engineering...

Main conference topics
========================

We will be soliciting talks on the following topics:

- Presentations of scientific tools and libraries using the Python
  language, including but not limited to:

  - Vector and array manipulation
  - Parallel computing
  - Scientific visualization
  - Scientific data flow and persistence
  - Algorithms implemented or exposed in Python
  - Web applications and portals for science and engineering

- Reports on the use of Python in scientific achievements or ongoing
  projects.

- General-purpose Python tools that can be of special interest to the
  scientific community.

Keynote Speaker: Hans Petter Langtangen
==========================================

We are excited to welcome Hans Petter Langtangen as our keynote speaker.

- Director of scientific computing and bio-medical research at Simula
  labs, Oslo
- Author of the famous book Python scripting for computational science
  http://www.springer.com/math/cse/book/978-3-540-73915-9

-- 
Gaël Varoquaux, conference co-chair
Nicolas Chauvat, conference co-chair

Program committee
.................

Romain Brette (ENS Paris, DEC)
Mike Müller (Python Academy)
Christophe Pradal (CIRAD/INRIA, DigiPlantes team)
Pierre Raybault (CEA, DAM)
Jarrod Millman (UC Berkeley, Helen Wills NeuroScience institute)

From fonnesbeck at gmail.com Sun Dec 13 18:59:50 2009
From: fonnesbeck at gmail.com (Chris)
Date: Sun, 13 Dec 2009 23:59:50 +0000 (UTC)
Subject: [Numpy-discussion] Import error in builds of 7726
References: <5b8d13220911281744ib60ab05j9fefdd5c7749439@mail.gmail.com>
	<5b8d13220912122237p5da315e0yb7cc83f0e8edbbe1@mail.gmail.com>
	<3d375d730912122325w279f2a12uba9dd22f58cbbe3f@mail.gmail.com>
	<5b8d13220912130056w170b9e22g8e50e7e726f60596@mail.gmail.com>
Message-ID:

Chris <fonnesbeck at gmail.com> writes:

> Here it is:
>
> http://files.me.com/fonnesbeck/6ezhy5
>

Sorry, that link should be:

http://files.me.com/fonnesbeck/qv8o59

From eadrogue at gmx.net Sun Dec 13 20:54:54 2009
From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=)
Date: Mon, 14 Dec 2009 02:54:54 +0100
Subject: [Numpy-discussion] structured array from ordinary array
Message-ID: <20091214015454.GA27887@doriath.local>

Hi,

How does one generate a structured array from a normal
array?

I use the 'view' method, but this way I get a superfluous
dimension that I do not want. Example:

In [619]: a = np.array([[1,2,3],[1,2,3],[1,2,4]],int)

In [620]: struct = np.dtype([('a',int),('b',int),('c',int)])

In [621]: a.view(struct)
Out[621]:
array([[(1, 2, 3)],
       [(1, 2, 3)],
       [(1, 2, 4)]],
      dtype=[('a', '<i4'), ('b', '<i4'), ('c', '<i4')])

I'd like the last array to have shape (3,).
What am I doing wrong??

From eadrogue at gmx.net Sun Dec 13 21:00:20 2009
From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=)
Date: Mon, 14 Dec 2009 03:00:20 +0100
Subject: [Numpy-discussion] indices of values contained in a list
References: <20091212120029.GA24072@doriath.local>
Message-ID: <20091214020020.GB27887@doriath.local>

12/12/09 @ 08:16 (-0800), thus spake Keith Goodman:
> If a and b are as short as in your example, which I doubt, here's a faster way:
>
>>> timeit np.nonzero(reduce(np.logical_or, [a == i for i in b]))
> 100000 loops, best of 3: 14 µs per loop
>>> timeit [i for i, z in enumerate(a) if z in b]
> 100000 loops, best of 3: 3.43 µs per loop
>
> Looping over a instead of b is faster if len(a) is much less than len(b):
>
>>> a = np.random.randint(0,100,10000)
>>> b = tuple(set(a[:50].tolist()))
>>> len(b)
> 41
>>> timeit np.nonzero(reduce(np.logical_or, [a == i for i in b]))
> 100 loops, best of 3: 2.65 ms per loop
>>> timeit [i for i, z in enumerate(a) if z in b]
> 10 loops, best of 3: 37.7 ms per loop

Nice. Well speed was not critical in this case, sometimes it's good
to know alternative ways of doing things. Thanks for your suggestions.

Ernest

From pgmdevlist at gmail.com Sun Dec 13 21:21:50 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Sun, 13 Dec 2009 21:21:50 -0500
Subject: [Numpy-discussion] Problem with set_fill_value for masked
	structured array
In-Reply-To: <7C8C88B0-7F76-4424-8275-3BF31F845AB2@gmail.com>
References: <7C8C88B0-7F76-4424-8275-3BF31F845AB2@gmail.com>
Message-ID: <3A649CD2-E0EF-45B8-8C30-7293FCFE0417@gmail.com>

On Dec 13, 2009, at 6:18 PM, Thomas Robitaille wrote:
> Hi,
>
> The following code doesn't seem to work:
>
> import numpy.ma as ma
>
> t = ma.array(zip([1,2,3],[4,5,6]),dtype=[('a',int),('b',int)])
> print repr(t['a'])
> t['a'].set_fill_value(10)
> print repr(t['a'])
>
> As the output is
>
> masked_array(data = [1 2 3],
>              mask = [False False False],
>        fill_value = 999999)
>
> masked_array(data = [1 2 3],
>              mask = [False False False],
>        fill_value = 999999)
>
> (and no exception is raised)
>
> Am I doing something wrong?

Well, that's a problem indeed, and I'd put that as a bug. However, you
can use that syntax instead:

>>> t.fill_value['a']=10

or set all the fields at once:

>>> t.fill_value=(10,999999)

I'm gonna try to see what I can do, but don't expect it in 1.4.x

From pgmdevlist at gmail.com Sun Dec 13 21:24:17 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Sun, 13 Dec 2009 21:24:17 -0500
Subject: [Numpy-discussion] structured array from ordinary array
In-Reply-To: <20091214015454.GA27887@doriath.local>
References: <20091214015454.GA27887@doriath.local>
Message-ID: <8C6E344D-CDCD-4A53-ABB2-CA5DD5BFC979@gmail.com>

On Dec 13, 2009, at 8:54 PM, Ernest Adrogué wrote:
> Hi,
>
> How does one generate a structured array from a normal
> array?
>
> I use the 'view' method, but this way I get a superfluous
> dimension that I do not want. Example:
>
> In [619]: a = np.array([[1,2,3],[1,2,3],[1,2,4]],int)
>
> In [620]: struct = np.dtype([('a',int),('b',int),('c',int)])
>
> In [621]: a.view(struct)
...
> I'd like the last array to have shape (3,).
> What am I doing wrong??

Put a .squeeze() after your view()

From cournape at gmail.com Sun Dec 13 23:26:23 2009
From: cournape at gmail.com (David Cournapeau)
Date: Mon, 14 Dec 2009 09:56:23 +0530
Subject: [Numpy-discussion] Import error in builds of 7726
In-Reply-To:
References: <5b8d13220912122237p5da315e0yb7cc83f0e8edbbe1@mail.gmail.com>
	<3d375d730912122325w279f2a12uba9dd22f58cbbe3f@mail.gmail.com>
	<5b8d13220912130056w170b9e22g8e50e7e726f60596@mail.gmail.com>
Message-ID: <5b8d13220912132026h55259b07t67254bd7c8f17806@mail.gmail.com>

On Mon, Dec 14, 2009 at 5:29 AM, Chris wrote:
> Chris <fonnesbeck at gmail.com> writes:
>
>> Here it is:
>>
>> http://files.me.com/fonnesbeck/6ezhy5
>>
>
> Sorry, that link should be:
>
> http://files.me.com/fonnesbeck/qv8o59

Ok, so the undefined functions all indicate that the most recently
implemented ones are not included. I really cannot see any other
explanation than a discrepancy between the source tree, build tree and
installation. Sometimes svn screws things up when switching between
branches, in my experience, so that's something to check for as well.

Could you give us the generated config.h (somewhere in
build/src.*/numpy/core/), just in case ?

David

From yogeshkarpate at gmail.com Mon Dec 14 02:31:26 2009
From: yogeshkarpate at gmail.com (yogesh karpate)
Date: Mon, 14 Dec 2009 13:01:26 +0530
Subject: [Numpy-discussion] Does Numpy support CGI-scripting?
Message-ID: <703777c60912132331w2afd1f2dt66aa02de0bab802c@mail.gmail.com>

Does Numpy support CGI scripting? Do scipy and matplotlib support it as well?
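NumPy needs no special CGI support: it is an ordinary library and imports
fine inside a CGI script, as do scipy and matplotlib, provided matplotlib
is given a non-interactive backend on a display-less server. A minimal
sketch (the plotted data is illustrative):

#!/usr/bin/env python
import sys
import numpy as np
import matplotlib
matplotlib.use('Agg')             # select the backend before importing pyplot
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 200)
plt.plot(x, np.sin(x))

# emit a CGI header, then stream the rendered figure back to the client
sys.stdout.write("Content-Type: image/png\r\n\r\n")
plt.savefig(sys.stdout, format='png')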
I hope to make an official release by the end of the year with documentation, etc. From fonnesbeck at gmail.com Sun Dec 13 16:33:05 2009 From: fonnesbeck at gmail.com (Chris) Date: Sun, 13 Dec 2009 21:33:05 +0000 (UTC) Subject: [Numpy-discussion] Import error in builds of 7726 References: <5b8d13220911281744ib60ab05j9fefdd5c7749439@mail.gmail.com> <5b8d13220912122237p5da315e0yb7cc83f0e8edbbe1@mail.gmail.com> <3d375d730912122325w279f2a12uba9dd22f58cbbe3f@mail.gmail.com> <5b8d13220912130056w170b9e22g8e50e7e726f60596@mail.gmail.com> Message-ID: David Cournapeau gmail.com> writes: > could you show the output from nm on umath.so, to check what > symbols are missing. Maybe seeing the whole list would bring > something. Here it is: http://files.me.com/fonnesbeck/6ezhy5 The symbol in question is in there, but I see that it does not have a value. From thomas.robitaille at gmail.com Sun Dec 13 18:18:18 2009 From: thomas.robitaille at gmail.com (Thomas Robitaille) Date: Sun, 13 Dec 2009 18:18:18 -0500 Subject: [Numpy-discussion] Problem with set_fill_value for masked structured array Message-ID: <7C8C88B0-7F76-4424-8275-3BF31F845AB2@gmail.com> Hi, The following code doesn't seem to work: import numpy.ma as ma t = ma.array(zip([1,2,3],[4,5,6]),dtype=[('a',int),('b',int)]) print repr(t['a']) t['a'].set_fill_value(10) print repr(t['a']) As the output is masked_array(data = [1 2 3], mask = [False False False], fill_value = 999999) masked_array(data = [1 2 3], mask = [False False False], fill_value = 999999) (and no exception is raised) Am I doing something wrong? Thanks in advance for any help, Thomas From gael.varoquaux at normalesup.org Sun Dec 13 18:55:50 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 14 Dec 2009 00:55:50 +0100 Subject: [Numpy-discussion] [Ann] EuroScipy 2010 Message-ID: <20091213235550.GB27356@phare.normalesup.org> ========================== Announcing EuroScipy 2010 ========================== --------------------------------------------------- The 3rd European meeting on Python in Science --------------------------------------------------- **Paris, Ecole Normale Sup?rieure, July 8-11 2010** We are happy to announce the 3rd EuroScipy meeting, in Paris, July 2010. The EuroSciPy meeting is a cross-disciplinary gathering focused on the use and development of the Python language in scientific research. This event strives to bring together both users and developers of scientific tools, as well as academic research and state of the art industry. Important dates ================== ====================================== =================================== **Registration opens** Sunday March 29 **Paper submission deadline** Sunday May 9 **Program announced** Sunday May 22 **Tutorials tracks** Thursday July 8 - Friday July 9 **Conference track** Saturday July 10 - Sunday July 11 ====================================== =================================== Tutorial ========= There will be two tutorial tracks at the conference, an introductory one, to bring up to speed with the Python language as a scientific tool, and an advanced track, during which experts of the field will lecture on specific advanced topics such as advanced use of numpy, scientific visualization, software engineering... 
Main conference topics
========================

We will be soliciting talks on the following topics:

- Presentations of scientific tools and libraries using the Python
  language, including but not limited to:

  - Vector and array manipulation
  - Parallel computing
  - Scientific visualization
  - Scientific data flow and persistence
  - Algorithms implemented or exposed in Python
  - Web applications and portals for science and engineering

- Reports on the use of Python in scientific achievements or ongoing
  projects.

- General-purpose Python tools that can be of special interest to the
  scientific community.

Keynote Speaker: Hans Petter Langtangen
==========================================

We are excited to welcome Hans Petter Langtangen as our keynote speaker.

- Director of scientific computing and bio-medical research at Simula
  labs, Oslo
- Author of the famous book Python scripting for computational science
  http://www.springer.com/math/cse/book/978-3-540-73915-9

--
Gaël Varoquaux, conference co-chair
Nicolas Chauvat, conference co-chair

Program committee
.................
Romain Brette (ENS Paris, DEC)
Mike Müller (Python Academy)
Christophe Pradal (CIRAD/INRIA, DigiPlantes team)
Pierre Raybault (CEA, DAM)
Jarrod Millman (UC Berkeley, Helen Wills NeuroScience institute)

From fonnesbeck at gmail.com Sun Dec 13 18:59:50 2009
From: fonnesbeck at gmail.com (Chris)
Date: Sun, 13 Dec 2009 23:59:50 +0000 (UTC)
Subject: [Numpy-discussion] Import error in builds of 7726
References: <5b8d13220911281744ib60ab05j9fefdd5c7749439@mail.gmail.com> <5b8d13220912122237p5da315e0yb7cc83f0e8edbbe1@mail.gmail.com> <3d375d730912122325w279f2a12uba9dd22f58cbbe3f@mail.gmail.com> <5b8d13220912130056w170b9e22g8e50e7e726f60596@mail.gmail.com>
Message-ID: 

Chris gmail.com> writes:

> Here it is:
>
> http://files.me.com/fonnesbeck/6ezhy5

Sorry, that link should be:

http://files.me.com/fonnesbeck/qv8o59

From eadrogue at gmx.net Sun Dec 13 20:54:54 2009
From: eadrogue at gmx.net (Ernest Adrogué)
Date: Mon, 14 Dec 2009 02:54:54 +0100
Subject: [Numpy-discussion] structured array from ordinary array
Message-ID: <20091214015454.GA27887@doriath.local>

Hi,

How does one generate a structured array from a normal array?

I use the 'view' method, but this way I get a superfluous dimension
that I do not want. Example:

In [619]: a = np.array([[1,2,3],[1,2,3],[1,2,4]],int)

In [620]: struct = np.dtype([('a',int),('b',int),('c',int)])

In [621]: a.view(struct)
Out[621]:
array([[(1, 2, 3)],
       [(1, 2, 3)],
       [(1, 2, 4)]],
      dtype=[('a', '<i4'), ('b', '<i4'), ('c', '<i4')])

I'd like the last array to have shape (3,).
What am I doing wrong??

References: <20091212120029.GA24072@doriath.local>
Message-ID: <20091214020020.GB27887@doriath.local>

12/12/09 @ 08:16 (-0800), thus spake Keith Goodman:
> If a and b are as short as in your example, which I doubt, here's a faster way:
>
> >> timeit np.nonzero(reduce(np.logical_or, [a == i for i in b]))
> 100000 loops, best of 3: 14 µs per loop
> >> timeit [i for i, z in enumerate(a) if z in b]
> 100000 loops, best of 3: 3.43 µs per loop
>
> Looping over a instead of b is faster if len(a) is much less than len(b):
>
> >> a = np.random.randint(0,100,10000)
> >> b = tuple(set(a[:50].tolist()))
> >> len(b)
> 41
> >> timeit np.nonzero(reduce(np.logical_or, [a == i for i in b]))
> 100 loops, best of 3: 2.65 ms per loop
> >> timeit [i for i, z in enumerate(a) if z in b]
> 10 loops, best of 3: 37.7 ms per loop

Nice. Well, speed was not critical in this case; sometimes it is good
to know alternative ways of doing things.

Thanks for your suggestions.
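A further alternative for this membership test, assuming a numpy recent
enough to ship in1d (it was added around the 1.4 series, so treat that as
an assumption): it matches all of b against a in one vectorized call
instead of building one boolean temporary per element of b.

    import numpy as np

    a = np.random.randint(0, 100, 10000)
    b = np.array([2, 3, 5, 7, 11])

    # indices of the elements of a that also occur in b; gives the same
    # positions as np.nonzero(reduce(np.logical_or, [a == i for i in b]))
    idx = np.nonzero(np.in1d(a, b))[0]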
Ernest From pgmdevlist at gmail.com Sun Dec 13 21:21:50 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Sun, 13 Dec 2009 21:21:50 -0500 Subject: [Numpy-discussion] Problem with set_fill_value for masked structured array In-Reply-To: <7C8C88B0-7F76-4424-8275-3BF31F845AB2@gmail.com> References: <7C8C88B0-7F76-4424-8275-3BF31F845AB2@gmail.com> Message-ID: <3A649CD2-E0EF-45B8-8C30-7293FCFE0417@gmail.com> On Dec 13, 2009, at 6:18 PM, Thomas Robitaille wrote: > Hi, > > The following code doesn't seem to work: > > import numpy.ma as ma > > t = ma.array(zip([1,2,3],[4,5,6]),dtype=[('a',int),('b',int)]) > print repr(t['a']) > t['a'].set_fill_value(10) > print repr(t['a']) > > As the output is > > masked_array(data = [1 2 3], > mask = [False False False], > fill_value = 999999) > > masked_array(data = [1 2 3], > mask = [False False False], > fill_value = 999999) > > (and no exception is raised) > > Am I doing something wrong? Well, that's a problem indeed, and I'd put that as a bug. However, you can use that syntax instead: >>> t.fill_value['a']=10 or set all the fields at once: >>>t.fill_value=(10,999999) I gonna try to see what I can do, but don't expect it in 1.4.x From pgmdevlist at gmail.com Sun Dec 13 21:24:17 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Sun, 13 Dec 2009 21:24:17 -0500 Subject: [Numpy-discussion] structured array from ordinary array In-Reply-To: <20091214015454.GA27887@doriath.local> References: <20091214015454.GA27887@doriath.local> Message-ID: <8C6E344D-CDCD-4A53-ABB2-CA5DD5BFC979@gmail.com> On Dec 13, 2009, at 8:54 PM, Ernest Adrogu? wrote: > Hi, > > How does one generate a structured array from a normal > array? > > I use the 'view' method, but this way I get a superfluous > dimension that I do not want. Example: > > In [619]: a = np.array([[1,2,3],[1,2,3],[1,2,4]],int) > > In [620]: struct = np.dtype([('a',int),('b',int),('c',int)]) > > In [621]: a.view(struct) ... > I'd like the last array to have shape (3,). > What am I doing wrong?? Put a .squeeze() after your view() From cournape at gmail.com Sun Dec 13 23:26:23 2009 From: cournape at gmail.com (David Cournapeau) Date: Mon, 14 Dec 2009 09:56:23 +0530 Subject: [Numpy-discussion] Import error in builds of 7726 In-Reply-To: References: <5b8d13220912122237p5da315e0yb7cc83f0e8edbbe1@mail.gmail.com> <3d375d730912122325w279f2a12uba9dd22f58cbbe3f@mail.gmail.com> <5b8d13220912130056w170b9e22g8e50e7e726f60596@mail.gmail.com> Message-ID: <5b8d13220912132026h55259b07t67254bd7c8f17806@mail.gmail.com> On Mon, Dec 14, 2009 at 5:29 AM, Chris wrote: > Chris gmail.com> writes: > >> Here it is: >> >> http://files.me.com/fonnesbeck/6ezhy5 >> > > Sorry, that link should be: > > http://files.me.com/fonnesbeck/qv8o59 Ok, so the undefined functions all indicate that the most recently implemented ones are not included. I really cannot see any other explanation that having a discrepancy between the source tree, build tree and installation. Sometimes, svn screw things up when switching between branches in my experience, so that's something to check for as well. Could you give us the generated config.h (somewhere in build/src.*/numpy/core/), just in case ? David From yogeshkarpate at gmail.com Mon Dec 14 02:31:26 2009 From: yogeshkarpate at gmail.com (yogesh karpate) Date: Mon, 14 Dec 2009 13:01:26 +0530 Subject: [Numpy-discussion] Does Numpy support CGI-scripting?` Message-ID: <703777c60912132331w2afd1f2dt66aa02de0bab802c@mail.gmail.com> Does Numpy Support CGI scripting? DO scipy and matplotlib also support? 
Regards ~ymk -------------- next part -------------- An HTML attachment was scrubbed... URL: From dwf at cs.toronto.edu Mon Dec 14 03:20:20 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Mon, 14 Dec 2009 03:20:20 -0500 Subject: [Numpy-discussion] Does Numpy support CGI-scripting?` In-Reply-To: <703777c60912132331w2afd1f2dt66aa02de0bab802c@mail.gmail.com> References: <703777c60912132331w2afd1f2dt66aa02de0bab802c@mail.gmail.com> Message-ID: <47AAC2F3-6BDB-4805-B57E-C295A5676795@cs.toronto.edu> On 14-Dec-09, at 2:31 AM, yogesh karpate wrote: > Does Numpy Support CGI scripting? DO scipy and matplotlib also > support? I'm not sure what you're asking exactly. If the question is "can you create CGI scripts that use NumPy/SciPy/ matplotlib" then the answer is yes. You just need to look up how to create CGI scripts in Python, and then import the relevant modules from your script. Provided that numpy/scipy/matplotlib are installed on the machine executing the CGI script, it should work just fine. There is an entry in the matplotlib FAQ that is relevant: http://tinyurl.com/ya6wule David From fonnesbeck at gmail.com Mon Dec 14 04:39:12 2009 From: fonnesbeck at gmail.com (Chris) Date: Mon, 14 Dec 2009 09:39:12 +0000 (UTC) Subject: [Numpy-discussion] Import error in builds of 7726 References: <5b8d13220912122237p5da315e0yb7cc83f0e8edbbe1@mail.gmail.com> <3d375d730912122325w279f2a12uba9dd22f58cbbe3f@mail.gmail.com> <5b8d13220912130056w170b9e22g8e50e7e726f60596@mail.gmail.com> <5b8d13220912132026h55259b07t67254bd7c8f17806@mail.gmail.com> Message-ID: David Cournapeau gmail.com> writes: > > Could you give us the generated config.h (somewhere in > build/src.*/numpy/core/), just in case ? > Here it is: http://files.me.com/fonnesbeck/d9eyxi Thanks again. cf From bsouthey at gmail.com Mon Dec 14 10:01:32 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 14 Dec 2009 09:01:32 -0600 Subject: [Numpy-discussion] Slicing slower than matrix multiplication? In-Reply-To: <4B24CC47.1060601@hccnet.nl> References: <4B2231B6.6000709@hccnet.nl> <4B224F6D.6000606@hccnet.nl> <4B2268DD.9070603@student.matnat.uio.no> <200912111703.21261.faltet@pytables.org> <4B238594.1000305@hccnet.nl> <4B24CC47.1060601@hccnet.nl> Message-ID: <4B26534C.1070405@gmail.com> On 12/13/2009 05:13 AM, Jasper van de Gronde wrote: > Bruce Southey wrote: > >> Really I would suggest asking the list for the real problem because it >> is often amazing what solutions have been given. >> > So far this is the fastest code I've got: > ------------------------------------------------------------------------ > import numpy as np > > nmax = 100 > > def minover(Xi,S): > P,N = Xi.shape > SXi = Xi.copy() > for i in xrange(0,P): > SXi[i] *= S[i] > SXi2 = np.dot(SXi,SXi.T) > SXiSXi2divN = np.concatenate((SXi,SXi2),axis=1)/N > w = np.random.standard_normal((N)) > E = np.dot(SXi,w) > wE = np.concatenate((w,E)) > for s in xrange(0,nmax*P): > mu = wE[N:].argmin() > wE += SXiSXi2divN[mu] > # E' = dot(SXi,w') > # = dot(SXi,w + SXi[mu,:]/N) > # = dot(SXi,w) + dot(SXi,SXi[mu,:])/N > # = E + dot(SXi,SXi.T)[:,mu]/N > # = E + dot(SXi,SXi.T)[mu,:]/N > return wE[:N] > ------------------------------------------------------------------------ > > I am particularly interested in cleaning up the initialization part, but > any suggestions for improving the overall performance are of course > appreciated. > > What is Xi and S? I think that your SXi is just: SXi=Xi*S But really I do not understand what you are actually trying to do. 
As previously indicated, some times simplifying an algorithm can make it computationally slower. Bruce From faltet at pytables.org Mon Dec 14 11:09:13 2009 From: faltet at pytables.org (Francesc Alted) Date: Mon, 14 Dec 2009 17:09:13 +0100 Subject: [Numpy-discussion] Slicing slower than matrix multiplication? In-Reply-To: <4B238594.1000305@hccnet.nl> References: <4B2231B6.6000709@hccnet.nl> <200912111703.21261.faltet@pytables.org> <4B238594.1000305@hccnet.nl> Message-ID: <200912141709.13560.faltet@pytables.org> A Saturday 12 December 2009 12:59:16 Jasper van de Gronde escrigu?: > Francesc Alted wrote: > > ... > > Yeah, I think taking slices here is taking quite a lot of time: > > > > In [58]: timeit E + Xi2[P/2,:] > > 100000 loops, best of 3: 3.95 ?s per loop > > > > In [59]: timeit E + Xi2[P/2] > > 100000 loops, best of 3: 2.17 ?s per loop > > > > don't know why the additional ',:' in the slice is taking so much time, > > but my guess is that passing & analyzing the second argument > > (slice(None,None,None)) could be the responsible for the slowdown (but > > that is taking too much time). Mmh, perhaps it would be worth to study > > this more carefully so that an optimization could be done in NumPy. > > This is indeed interesting! And very nice that this actually works the > way you'd expect it to. I guess I've just worked too long with Matlab :) > > >> I think the lesson mostly should be that with so little data, > >> benchmarking becomes a very difficult art. > > > > Well, I think it is not difficult, it is just that you are perhaps > > benchmarking Python/NumPy machinery instead ;-) I'm curious whether > > Matlab can do slicing much more faster than NumPy. Jasper? > > I had a look, these are the timings for Python for 60x20: > Dot product: 0.051165 (5.116467e-06 per iter) > Add a row: 0.092849 (9.284860e-06 per iter) > Add a column: 0.082523 (8.252348e-06 per iter) > For Matlab 60x20: > Dot product: 0.029927 (2.992664e-006 per iter) > Add a row: 0.019664 (1.966444e-006 per iter) > Add a column: 0.008384 (8.384376e-007 per iter) > For Python 600x200: > Dot product: 1.917235 (1.917235e-04 per iter) > Add a row: 0.113243 (1.132425e-05 per iter) > Add a column: 0.162740 (1.627397e-05 per iter) > For Matlab 600x200: > Dot product: 1.282778 (1.282778e-004 per iter) > Add a row: 0.107252 (1.072525e-005 per iter) > Add a column: 0.021325 (2.132527e-006 per iter) > > If I fit a line through these two data points (60 and 600 rows), I get > the following equations: > Python, AR: 3.8e-5 * n + 0.091 > Matlab, AC: 2.4e-5 * n + 0.0069 > This would suggest that Matlab performs the vector addition about 1.6 > times faster and has a 13 times smaller constant cost! The things seems to be worst than 1.6x times slower for numpy, as matlab orders arrays by column, while numpy order is by row. So, if we want to compare pears with pears: For Python 600x200: Add a row: 0.113243 (1.132425e-05 per iter) For Matlab 600x200: Add a column: 0.021325 (2.132527e-006 per iter) which makes numpy 5x slower than matlab. Hmm, I definitely think that numpy could do better here :-/ However, caveat emptor, when you do timings, you normally put your code snippets in loops, and after the first iteration, the dataset (if small enough, as in your examples above) lives in CPU caches. But this is not *usually* the case because you have to transmit your data to CPU first. This transmission process is normally the main bottleneck when doing BLAS-1 level operations (i.e. vector-vector). 
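One rough way to see the cache effect described above (a sketch only; the
array sizes and the resulting per-element times are machine-dependent
assumptions):

    import timeit

    # per-element cost of a simple vector op for an array that fits in
    # cache versus one that does not; the per-element time jumps once the
    # working set falls out of cache and memory bandwidth takes over
    for n in (1000, 10000000):
        t = timeit.Timer("a + a", "import numpy as np; a = np.ones(%d)" % n)
        reps = 100
        print "n=%8d: %.2e s per element" % (n, t.timeit(reps) / reps / n)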
This is to say that, in real-life calculations your numpy code will work almost as fast as matlab. So, my adivce is: don't be too worried about small dataset speed in small loops, and concentrate your optimization efforts in making your *real* code faster. -- Francesc Alted From faltet at pytables.org Mon Dec 14 11:22:40 2009 From: faltet at pytables.org (Francesc Alted) Date: Mon, 14 Dec 2009 17:22:40 +0100 Subject: [Numpy-discussion] Slicing slower than matrix multiplication? In-Reply-To: <200912141709.13560.faltet@pytables.org> References: <4B2231B6.6000709@hccnet.nl> <4B238594.1000305@hccnet.nl> <200912141709.13560.faltet@pytables.org> Message-ID: <200912141722.40354.faltet@pytables.org> A Monday 14 December 2009 17:09:13 Francesc Alted escrigu?: > The things seems to be worst than 1.6x times slower for numpy, as matlab > orders arrays by column, while numpy order is by row. So, if we want to > compare pears with pears: > > For Python 600x200: > Add a row: 0.113243 (1.132425e-05 per iter) > For Matlab 600x200: > Add a column: 0.021325 (2.132527e-006 per iter) Mmh, I've repeated this benchmark on my machine and got: In [59]: timeit E + Xi2[P/2] 100000 loops, best of 3: 2.8 ?s per loop that is, very similar to matlab's 2.1 ?s and quite far from the 11 ?s you are getting for numpy in your machine... I'm using a Core2 @ 3 GHz. -- Francesc Alted From th.v.d.gronde at hccnet.nl Mon Dec 14 12:20:32 2009 From: th.v.d.gronde at hccnet.nl (Jasper van de Gronde) Date: Mon, 14 Dec 2009 18:20:32 +0100 Subject: [Numpy-discussion] Slicing slower than matrix multiplication? In-Reply-To: <200912141722.40354.faltet@pytables.org> References: <4B2231B6.6000709@hccnet.nl> <4B238594.1000305@hccnet.nl> <200912141709.13560.faltet@pytables.org> <200912141722.40354.faltet@pytables.org> Message-ID: <4B2673E0.3050109@hccnet.nl> Francesc Alted wrote: > A Monday 14 December 2009 17:09:13 Francesc Alted escrigu?: >> The things seems to be worst than 1.6x times slower for numpy, as matlab >> orders arrays by column, while numpy order is by row. So, if we want to >> compare pears with pears: >> >> For Python 600x200: >> Add a row: 0.113243 (1.132425e-05 per iter) >> For Matlab 600x200: >> Add a column: 0.021325 (2.132527e-006 per iter) > > Mmh, I've repeated this benchmark on my machine and got: > > In [59]: timeit E + Xi2[P/2] > 100000 loops, best of 3: 2.8 ?s per loop > > that is, very similar to matlab's 2.1 ?s and quite far from the 11 ?s you are > getting for numpy in your machine... I'm using a Core2 @ 3 GHz. I'm using Python 2.6 and numpy 1.4.0rc1 on a Core2 @ 1.33 GHz (notebook). I'll have a look later to see if upgrading Python to 2.6.4 makes a difference. From th.v.d.gronde at hccnet.nl Mon Dec 14 12:27:08 2009 From: th.v.d.gronde at hccnet.nl (Jasper van de Gronde) Date: Mon, 14 Dec 2009 18:27:08 +0100 Subject: [Numpy-discussion] Slicing slower than matrix multiplication? 
In-Reply-To: <4B26534C.1070405@gmail.com> References: <4B2231B6.6000709@hccnet.nl> <4B224F6D.6000606@hccnet.nl> <4B2268DD.9070603@student.matnat.uio.no> <200912111703.21261.faltet@pytables.org> <4B238594.1000305@hccnet.nl> <4B24CC47.1060601@hccnet.nl> <4B26534C.1070405@gmail.com> Message-ID: <4B26756C.8070301@hccnet.nl> Bruce Southey wrote: >> So far this is the fastest code I've got: >> ------------------------------------------------------------------------ >> import numpy as np >> >> nmax = 100 >> >> def minover(Xi,S): >> P,N = Xi.shape >> SXi = Xi.copy() >> for i in xrange(0,P): >> SXi[i] *= S[i] >> SXi2 = np.dot(SXi,SXi.T) >> SXiSXi2divN = np.concatenate((SXi,SXi2),axis=1)/N >> w = np.random.standard_normal((N)) >> E = np.dot(SXi,w) >> wE = np.concatenate((w,E)) >> for s in xrange(0,nmax*P): >> mu = wE[N:].argmin() >> wE += SXiSXi2divN[mu] >> # E' = dot(SXi,w') >> # = dot(SXi,w + SXi[mu,:]/N) >> # = dot(SXi,w) + dot(SXi,SXi[mu,:])/N >> # = E + dot(SXi,SXi.T)[:,mu]/N >> # = E + dot(SXi,SXi.T)[mu,:]/N >> return wE[:N] >> ------------------------------------------------------------------------ >> >> I am particularly interested in cleaning up the initialization part, but >> any suggestions for improving the overall performance are of course >> appreciated. >> >> > What is Xi and S? > I think that your SXi is just: > SXi=Xi*S Sort of, it's actually (Xi.T*S).T, now that I think of it... I'll see if that is any faster. And if there is a neater way of doing it I'd love to hear about it. > But really I do not understand what you are actually trying to do. As > previously indicated, some times simplifying an algorithm can make it > computationally slower. It was hardly simplified, this was the original function body: P,N = Xi.shape SXi = Xi.copy() for i in xrange(0,P): SXi[i] *= S[i] w = np.random.standard_normal((N)) for s in xrange(0,nmax*P): E = np.dot(SXi,w) mu = E.argmin() w += SXi[mu]/N return w As you can see it's basically some basic linear algebra (which reduces the time complexity from about O(n^3) to O(n^2)), plus some less nice tweaks to avoid the high Python overhead. From faltet at pytables.org Mon Dec 14 12:51:58 2009 From: faltet at pytables.org (Francesc Alted) Date: Mon, 14 Dec 2009 18:51:58 +0100 Subject: [Numpy-discussion] Slicing slower than matrix multiplication? In-Reply-To: <4B2673E0.3050109@hccnet.nl> References: <4B2231B6.6000709@hccnet.nl> <200912141722.40354.faltet@pytables.org> <4B2673E0.3050109@hccnet.nl> Message-ID: <200912141851.58960.faltet@pytables.org> A Monday 14 December 2009 18:20:32 Jasper van de Gronde escrigu?: > Francesc Alted wrote: > > A Monday 14 December 2009 17:09:13 Francesc Alted escrigu?: > >> The things seems to be worst than 1.6x times slower for numpy, as matlab > >> orders arrays by column, while numpy order is by row. So, if we want to > >> compare pears with pears: > >> > >> For Python 600x200: > >> Add a row: 0.113243 (1.132425e-05 per iter) > >> For Matlab 600x200: > >> Add a column: 0.021325 (2.132527e-006 per iter) > > > > Mmh, I've repeated this benchmark on my machine and got: > > > > In [59]: timeit E + Xi2[P/2] > > 100000 loops, best of 3: 2.8 ?s per loop > > > > that is, very similar to matlab's 2.1 ?s and quite far from the 11 ?s you > > are getting for numpy in your machine... I'm using a Core2 @ 3 GHz. > > I'm using Python 2.6 and numpy 1.4.0rc1 on a Core2 @ 1.33 GHz > (notebook). I'll have a look later to see if upgrading Python to 2.6.4 > makes a difference. I don't think so. 
Your machine is slow for nowadays standards, so the 5x slowness should be due to python/numpy overhead, but unfortunately nothing that could be solved magically by using a newer python/numpy version. -- Francesc Alted From josef.pktd at gmail.com Mon Dec 14 13:26:56 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 14 Dec 2009 13:26:56 -0500 Subject: [Numpy-discussion] Slicing slower than matrix multiplication? In-Reply-To: <200912141851.58960.faltet@pytables.org> References: <4B2231B6.6000709@hccnet.nl> <200912141722.40354.faltet@pytables.org> <4B2673E0.3050109@hccnet.nl> <200912141851.58960.faltet@pytables.org> Message-ID: <1cd32cbb0912141026i678450b1xdcf79cb00df46e27@mail.gmail.com> On Mon, Dec 14, 2009 at 12:51 PM, Francesc Alted wrote: > A Monday 14 December 2009 18:20:32 Jasper van de Gronde escrigu?: >> Francesc Alted wrote: >> > A Monday 14 December 2009 17:09:13 Francesc Alted escrigu?: >> >> The things seems to be worst than 1.6x times slower for numpy, as matlab >> >> orders arrays by column, while numpy order is by row. ?So, if we want to >> >> compare pears with pears: >> >> >> >> For Python 600x200: >> >> ? ?Add a row: 0.113243 (1.132425e-05 per iter) >> >> For Matlab 600x200: >> >> ? ?Add a column: 0.021325 (2.132527e-006 per iter) >> > >> > Mmh, I've repeated this benchmark on my machine and got: >> > >> > In [59]: timeit E + Xi2[P/2] >> > 100000 loops, best of 3: 2.8 ?s per loop >> > >> > that is, very similar to matlab's 2.1 ?s and quite far from the 11 ?s you >> > are getting for numpy in your machine... ?I'm using a Core2 @ 3 GHz. >> >> I'm using Python 2.6 and numpy 1.4.0rc1 on a Core2 @ 1.33 GHz >> (notebook). I'll have a look later to see if upgrading Python to 2.6.4 >> makes a difference. > > I don't think so. ?Your machine is slow for nowadays standards, so the 5x > slowness should be due to python/numpy overhead, but unfortunately nothing > that could be solved magically by using a newer python/numpy version. dot is slow on single cpu, older notebook with older atlas and low in memory, (dot cannot multi-process). 
it looks like adding a row is almost only overhead for 600x200 >>> print "Dot product: %f" % dotProduct.timeit(N) Dot product: 3.124008 >>> print "Add a row: %f" % additionRow.timeit(N) Add a row: 0.080612 >>> print "Add a column: %f" % additionCol.timeit(N) Add a column: 0.113229 for 60x20 >>> print "Dot product: %f" % dotProduct.timeit(N) Dot product: 0.070933 >>> print "Add a row: %f" % additionRow.timeit(N) Add a row: 0.058492 >>> print "Add a column: %f" % additionCol.timeit(N) Add a column: 0.061401 600x2000 (dot may induce swapping to disc) >>> print "Dot product: %f" % dotProduct.timeit(N) Dot product: 43.114585 >>> print "Add a row: %f" % additionRow.timeit(N) Add a row: 0.085261 >>> print "Add a column: %f" % additionCol.timeit(N) Add a column: 0.122754 >>> print "Dot product: %f" % dotProduct.timeit(N) Dot product: 35.232084 Josef > -- > Francesc Alted > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From jdh2358 at gmail.com Mon Dec 14 13:53:00 2009 From: jdh2358 at gmail.com (John Hunter) Date: Mon, 14 Dec 2009 12:53:00 -0600 Subject: [Numpy-discussion] ANN: job opening at Tradelink Message-ID: <88e473830912141053w15499579i28aeed1cb973fb6c@mail.gmail.com> We are looking to hire a quantitative researcher to help research and develop trading ideas, and to develop and support infrastructure to put these trading strategies into production. We are looking for someone who is bright and curious with a quantitative background and a strong interest in writing good code and building systems that work. Experience with probability, statistics and time series is required, and experience working with real world data is a definite plus. We do not require a financial background, but are looking for someone with an enthusiasm to dive into this industry and learn a lot. We do most of our data modeling and production software in python and R. We have a lot of ideas to test and hopefully put into production, and you'll be working with a fast paced and friendly small team of traders, programmers and quantitative researchers. Applying: Please submit a resume and cover letter to qsjobs at trdlnk.com. In your cover letter, please address how your background, experience and skills will fit into the position described above. We are looking for a full-time, on-site candidate only. About Us: TradeLink Holdings LLC is a diversified alternative investment, trading and software firm. Headquartered in Chicago, TradeLink Holdings LLC includes a number of closely related entities. Since its organization in 1979, TradeLink has been actively engaged in the securities, futures, options, and commodities trading industries. Engaged in the option arbitrage business since 1983, TradeLink has a floor trading and/or electronic trading interface in commodity options, financial futures and options, and currency futures and options at all major U.S. exchanges. TradeLink is involved in various market-making programs in many different exchanges around the world, including over-the-counter derivatives markets. http://www.tradelinkllc.com From charlesr.harris at gmail.com Mon Dec 14 15:46:17 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 14 Dec 2009 13:46:17 -0700 Subject: [Numpy-discussion] Slicing slower than matrix multiplication? 
In-Reply-To: <4B26756C.8070301@hccnet.nl> References: <4B2231B6.6000709@hccnet.nl> <4B224F6D.6000606@hccnet.nl> <4B2268DD.9070603@student.matnat.uio.no> <200912111703.21261.faltet@pytables.org> <4B238594.1000305@hccnet.nl> <4B24CC47.1060601@hccnet.nl> <4B26534C.1070405@gmail.com> <4B26756C.8070301@hccnet.nl> Message-ID: On Mon, Dec 14, 2009 at 10:27 AM, Jasper van de Gronde < th.v.d.gronde at hccnet.nl> wrote: > Bruce Southey wrote: > >> So far this is the fastest code I've got: > >> ------------------------------------------------------------------------ > >> import numpy as np > >> > >> nmax = 100 > >> > >> def minover(Xi,S): > >> P,N = Xi.shape > >> SXi = Xi.copy() > >> for i in xrange(0,P): > >> SXi[i] *= S[i] > >> SXi2 = np.dot(SXi,SXi.T) > >> SXiSXi2divN = np.concatenate((SXi,SXi2),axis=1)/N > >> w = np.random.standard_normal((N)) > >> E = np.dot(SXi,w) > >> wE = np.concatenate((w,E)) > >> for s in xrange(0,nmax*P): > >> mu = wE[N:].argmin() > >> wE += SXiSXi2divN[mu] > >> # E' = dot(SXi,w') > >> # = dot(SXi,w + SXi[mu,:]/N) > >> # = dot(SXi,w) + dot(SXi,SXi[mu,:])/N > >> # = E + dot(SXi,SXi.T)[:,mu]/N > >> # = E + dot(SXi,SXi.T)[mu,:]/N > >> return wE[:N] > >> ------------------------------------------------------------------------ > >> > >> I am particularly interested in cleaning up the initialization part, but > >> any suggestions for improving the overall performance are of course > >> appreciated. > >> > >> > > What is Xi and S? > > I think that your SXi is just: > > SXi=Xi*S > > Sort of, it's actually (Xi.T*S).T, now that I think of it... I'll see if > that is any faster. And if there is a neater way of doing it I'd love to > hear about it. > > Xi*S[:,newaxis] Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.robitaille at gmail.com Mon Dec 14 16:28:12 2009 From: thomas.robitaille at gmail.com (Thomas Robitaille) Date: Mon, 14 Dec 2009 13:28:12 -0800 (PST) Subject: [Numpy-discussion] Problem with set_fill_value for masked structured array In-Reply-To: <3A649CD2-E0EF-45B8-8C30-7293FCFE0417@gmail.com> References: <7C8C88B0-7F76-4424-8275-3BF31F845AB2@gmail.com> <3A649CD2-E0EF-45B8-8C30-7293FCFE0417@gmail.com> Message-ID: <26780052.post@talk.nabble.com> Pierre GM-2 wrote: > > Well, that's a problem indeed, and I'd put that as a bug. > However, you can use that syntax instead: >>>> t.fill_value['a']=10 > or set all the fields at once: >>>>t.fill_value=(10,999999) > Thanks for your reply - should I submit a bug report on the numpy trac site? Thomas -- View this message in context: http://old.nabble.com/Problem-with-set_fill_value-for-masked-structured-array-tp26770843p26780052.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From pgmdevlist at gmail.com Mon Dec 14 19:14:52 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 14 Dec 2009 19:14:52 -0500 Subject: [Numpy-discussion] Problem with set_fill_value for masked structured array In-Reply-To: <26780052.post@talk.nabble.com> References: <7C8C88B0-7F76-4424-8275-3BF31F845AB2@gmail.com> <3A649CD2-E0EF-45B8-8C30-7293FCFE0417@gmail.com> <26780052.post@talk.nabble.com> Message-ID: <6066A08B-2707-44D5-970A-A6FAEC040914@gmail.com> On Dec 14, 2009, at 4:28 PM, Thomas Robitaille wrote: > Pierre GM-2 wrote: >> >> Well, that's a problem indeed, and I'd put that as a bug. 
>> However, you can use that syntax instead:
>>>>> t.fill_value['a']=10
>> or set all the fields at once:
>>>>> t.fill_value=(10,999999)
>>
>
> Thanks for your reply - should I submit a bug report on the numpy trac site?

Always best to do so... Thx in advance !!!

From pfeldman at verizon.net Mon Dec 14 23:30:08 2009
From: pfeldman at verizon.net (Dr. Phillip M. Feldman)
Date: Mon, 14 Dec 2009 20:30:08 -0800 (PST)
Subject: [Numpy-discussion] no ordinary Bessel functions?
Message-ID: <26789343.post@talk.nabble.com>

When I issue the command

np.lookfor('bessel')

I get the following:

Search results for 'bessel'
---------------------------
numpy.i0
    Modified Bessel function of the first kind, order 0.
numpy.kaiser
    Return the Kaiser window.
numpy.random.vonmises
    Draw samples from a von Mises distribution.

I assume that there is an ordinary (unmodified) Bessel function in NumPy,
but have not been able to figure out how to access it. Also, I need to
operate sometimes on scalars, and sometimes on arrays. For operations on
scalars, are the NumPy Bessel functions significantly slower than the
SciPy Bessel functions?

--
View this message in context: http://old.nabble.com/no-ordinary-Bessel-functions--tp26789343p26789343.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.

From robert.kern at gmail.com Mon Dec 14 23:39:36 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 14 Dec 2009 22:39:36 -0600
Subject: [Numpy-discussion] no ordinary Bessel functions?
In-Reply-To: <26789343.post@talk.nabble.com>
References: <26789343.post@talk.nabble.com>
Message-ID: <3d375d730912142039m1712368aif6854a5305475382@mail.gmail.com>

On Mon, Dec 14, 2009 at 22:30, Dr. Phillip M. Feldman wrote:
>
> When I issue the command
>
> np.lookfor('bessel')
>
> I get the following:
>
> Search results for 'bessel'
> ---------------------------
> numpy.i0
>     Modified Bessel function of the first kind, order 0.
> numpy.kaiser
>     Return the Kaiser window.
> numpy.random.vonmises
>     Draw samples from a von Mises distribution.
>
> I assume that there is an ordinary (unmodified) Bessel function in NumPy,

Nope. i0() is only in numpy to support the kaiser() window. Our policy
on special functions is to include those which are exposed by C99 with
a few exceptions for those that are necessary to support other
functions in numpy. scipy.special is the place to go for general
special function needs.

> but have not been able to figure out how to access it. Also, I need to
> operate sometimes on scalars, and sometimes on arrays. For operations on
> scalars, are the NumPy Bessel functions significantly slower than the SciPy
> Bessel functions?

I recommend using the %timeit magic in IPython to test such things:

In [1]: from scipy import special

In [2]: %timeit numpy.i0(1.0)
1000 loops, best of 3: 921 µs per loop

In [3]: %timeit special.i0(1.0)
100000 loops, best of 3: 5.6 µs per loop

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco From fperez.net at gmail.com Tue Dec 15 03:02:01 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 15 Dec 2009 00:02:01 -0800 Subject: [Numpy-discussion] doctest improvements patch (and possible regressions) In-Reply-To: <4B249776.3010107@gmail.com> References: <4B1EB148.1010607@gmail.com> <4B1FAE55.9070605@stsci.edu> <4B249776.3010107@gmail.com> Message-ID: On Sat, Dec 12, 2009 at 11:27 PM, Paul Ivanov wrote: > So far, no one has voiced objections, so should I go ahead and check > this in? > +1 from me, at least. I don't see how there could be a downside to fixing a ton of tests :) Cheers, f From nadavh at visionsense.com Tue Dec 15 04:26:55 2009 From: nadavh at visionsense.com (Nadav Horesh) Date: Tue, 15 Dec 2009 11:26:55 +0200 Subject: [Numpy-discussion] small doc error in numpy.random.randn Message-ID: <710F2847B0018641891D9A21602763605AD265@ex3.envision.co.il> The 2nd line of the doc string randn([d1, ..., dn]) should be randn(d1, ..., dn) Nadav From d.l.goldsmith at gmail.com Tue Dec 15 04:36:37 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 15 Dec 2009 01:36:37 -0800 Subject: [Numpy-discussion] small doc error in numpy.random.randn In-Reply-To: <710F2847B0018641891D9A21602763605AD265@ex3.envision.co.il> References: <710F2847B0018641891D9A21602763605AD265@ex3.envision.co.il> Message-ID: <45d1ab480912150136u55f9369cr457fdd089d6a9bd0@mail.gmail.com> Indeed it should, thanks! DG On Tue, Dec 15, 2009 at 1:26 AM, Nadav Horesh wrote: > > The 2nd line of the doc string > > ? ?randn([d1, ..., dn]) > > should be > ? ?randn(d1, ..., dn) > > ?Nadav > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pav+sp at iki.fi Tue Dec 15 04:36:29 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Tue, 15 Dec 2009 09:36:29 +0000 (UTC) Subject: [Numpy-discussion] Slicing slower than matrix multiplication? References: <4B2231B6.6000709@hccnet.nl> <200912111703.21261.faltet@pytables.org> <4B238594.1000305@hccnet.nl> <200912141709.13560.faltet@pytables.org> Message-ID: Mon, 14 Dec 2009 17:09:13 +0100, Francesc Alted wrote: [clip] > which makes numpy 5x slower than matlab. Hmm, I definitely think that > numpy could do better here :-/ It could be useful to track down what exactly is slow, by profiling the actual C code. Unfortunately, profiling shared libraries is somewhat difficult. Some tools that I've seen to work (on Linux): - Valgrind (+ KCacheGrind) Together with its cache profiler, this can give useful information on what is the slow part, and on which lines most of the time is spent. - Oprofile Nice sample-based profiler, but requires root. - Qprof (32-bit only) Good for quick sample-based profiling on function level. Easy to use. - Sprof "The" way to profile dynamically linked libraries on Linux. Function-level, and slightly obscure to use. 
So if someone wants to spend time on this, those are the tools I'd recommend :) -- Pauli Virtanen From d.l.goldsmith at gmail.com Tue Dec 15 04:42:55 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 15 Dec 2009 01:42:55 -0800 Subject: [Numpy-discussion] small doc error in numpy.random.randn In-Reply-To: <45d1ab480912150136u55f9369cr457fdd089d6a9bd0@mail.gmail.com> References: <710F2847B0018641891D9A21602763605AD265@ex3.envision.co.il> <45d1ab480912150136u55f9369cr457fdd089d6a9bd0@mail.gmail.com> Message-ID: <45d1ab480912150142y596361cj8ce89881e485dbd8@mail.gmail.com> Fixed in the Wiki. DG On Tue, Dec 15, 2009 at 1:36 AM, David Goldsmith wrote: > Indeed it should, thanks! > > DG > > On Tue, Dec 15, 2009 at 1:26 AM, Nadav Horesh wrote: >> >> The 2nd line of the doc string >> >> ? ?randn([d1, ..., dn]) >> >> should be >> ? ?randn(d1, ..., dn) >> >> ?Nadav >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > From th.v.d.gronde at hccnet.nl Tue Dec 15 05:48:54 2009 From: th.v.d.gronde at hccnet.nl (Jasper van de Gronde) Date: Tue, 15 Dec 2009 11:48:54 +0100 Subject: [Numpy-discussion] Slicing slower than matrix multiplication? In-Reply-To: References: <4B2231B6.6000709@hccnet.nl> <4B224F6D.6000606@hccnet.nl> <4B2268DD.9070603@student.matnat.uio.no> <200912111703.21261.faltet@pytables.org> <4B238594.1000305@hccnet.nl> <4B24CC47.1060601@hccnet.nl> <4B26534C.1070405@gmail.com> <4B26756C.8070301@hccnet.nl> Message-ID: <4B276996.20504@hccnet.nl> Charles R Harris wrote: > Sort of, it's actually (Xi.T*S).T, now that I think of it... I'll see if > that is any faster. And if there is a neater way of doing it I'd love to > hear about it. > > Xi*S[:,newaxis] Thanks! (Obviously doesn't matter much in terms of performance, as it's only the initialization, but it definitely does make for nicer code.) From bsouthey at gmail.com Tue Dec 15 10:30:08 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 15 Dec 2009 09:30:08 -0600 Subject: [Numpy-discussion] test_multiarray.TestIO.test_ascii segmentation fault with Python2.7 Message-ID: <4B27AB80.4070903@gmail.com> Hi, After installing Python2.7, a patched nose (http://bitbucket.org/kumar303/nose-2_7_fixes/ because unittest._TextTestResult has been removed) and numpy '1.5.0.dev8011', numpy.test crashes with a segmentation fault with the test for: test_multiarray.TestIO.test_ascii If I understand the test correctly: $ python Python 2.7a1 (r27a1:76674, Dec 14 2009, 13:46:01) [GCC 4.4.1 20090725 (Red Hat 4.4.1-2)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> from numpy.compat import asbytes, getexception >>> np.fromstring(asbytes('1 , 2 , 3 , 4'),sep=',') Segmentation fault This code works under Python2.6 and numpy '1.5.0.dev8011'. Bruce From peridot.faceted at gmail.com Tue Dec 15 10:32:55 2009 From: peridot.faceted at gmail.com (Anne Archibald) Date: Tue, 15 Dec 2009 10:32:55 -0500 Subject: [Numpy-discussion] no ordinary Bessel functions? In-Reply-To: <26789343.post@talk.nabble.com> References: <26789343.post@talk.nabble.com> Message-ID: 2009/12/14 Dr. Phillip M. Feldman : > > When I issue the command > > np.lookfor('bessel') > > I get the following: > > Search results for 'bessel' > --------------------------- > numpy.i0 > ? ?Modified Bessel function of the first kind, order 0. > numpy.kaiser > ? ?Return the Kaiser window. > numpy.random.vonmises > ? 
?Draw samples from a von Mises distribution. > > I assume that there is an ordinary (unmodified) Bessel function in NumPy, > but have not been able to figure out how to access it. Also, I need to > operate sometimes on scalars, and sometimes on arrays. For operations on > scalars, are the NumPy Bessel functions significantly slower than the SciPy > Bessel functions? I am afraid this is one of the historical warts in numpy. The proper place for special functions is in scipy.special, which does indeed have various flavours of Bessel functions. Anne From pav+sp at iki.fi Tue Dec 15 11:07:47 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Tue, 15 Dec 2009 16:07:47 +0000 (UTC) Subject: [Numpy-discussion] test_multiarray.TestIO.test_ascii segmentation fault with Python2.7 References: <4B27AB80.4070903@gmail.com> Message-ID: Hi, Tue, 15 Dec 2009 09:30:08 -0600, Bruce Southey wrote: > After installing Python2.7, a patched nose > (http://bitbucket.org/kumar303/nose-2_7_fixes/ because > unittest._TextTestResult has been removed) and numpy '1.5.0.dev8011', > numpy.test crashes with a segmentation fault with the test for: > test_multiarray.TestIO.test_ascii > > If I understand the test correctly: > $ python > Python 2.7a1 (r27a1:76674, Dec 14 2009, 13:46:01) [GCC 4.4.1 20090725 > (Red Hat 4.4.1-2)] on linux2 Type "help", "copyright", "credits" or > "license" for more information. > >>> import numpy as np > >>> from numpy.compat import asbytes, getexception > >>> np.fromstring(asbytes('1 , 2 , 3 , 4'),sep=',') > Segmentation fault Please run it under gdb to obtain a backtrace. $ gdb --args python ... (gdb) run ... (gdb) bt ... > This code works under Python2.6 and numpy '1.5.0.dev8011'. Please also test the 1.4.x branch http://svn.scipy.org/svn/numpy/branches/1.4.x Does it fail too on Python 2.7? There are very few code changes since 1.4.x on the path that the test exercises. -- Pauli Virtanen From bsouthey at gmail.com Tue Dec 15 11:36:03 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 15 Dec 2009 10:36:03 -0600 Subject: [Numpy-discussion] test_multiarray.TestIO.test_ascii segmentation fault with Python2.7 In-Reply-To: References: <4B27AB80.4070903@gmail.com> Message-ID: <4B27BAF3.2040306@gmail.com> On 12/15/2009 10:07 AM, Pauli Virtanen wrote: > Hi, > > Tue, 15 Dec 2009 09:30:08 -0600, Bruce Southey wrote: > >> After installing Python2.7, a patched nose >> (http://bitbucket.org/kumar303/nose-2_7_fixes/ because >> unittest._TextTestResult has been removed) and numpy '1.5.0.dev8011', >> numpy.test crashes with a segmentation fault with the test for: >> test_multiarray.TestIO.test_ascii >> >> If I understand the test correctly: >> $ python >> Python 2.7a1 (r27a1:76674, Dec 14 2009, 13:46:01) [GCC 4.4.1 20090725 >> (Red Hat 4.4.1-2)] on linux2 Type "help", "copyright", "credits" or >> "license" for more information. >> >>> import numpy as np >> >>> from numpy.compat import asbytes, getexception >> >>> np.fromstring(asbytes('1 , 2 , 3 , 4'),sep=',') >> Segmentation fault >> > Please run it under gdb to obtain a backtrace. > > $ gdb --args python > ... > (gdb) run > ... > (gdb) bt > ... > > Thanks for that! It also made remember that numpy is being built with Atlas 3.8.3. $ gdb --args python GNU gdb (GDB) Fedora (6.8.50.20090302-39.fc11) Copyright (C) 2009 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. 
Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-redhat-linux-gnu". For bug reporting instructions, please see: ... (gdb) run Starting program: /usr/local/bin/python [Thread debugging using libthread_db enabled] Python 2.7a1 (r27a1:76674, Dec 14 2009, 13:46:01) [GCC 4.4.1 20090725 (Red Hat 4.4.1-2)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> from numpy.compat import asbytes, getexception >>> np.fromstring(asbytes('1 , 2 , 3 , 4'),sep=',') Program received signal SIGSEGV, Segmentation fault. setup_context (registry=, module=, lineno=, filename=, stack_level=) at Python/_warnings.c:449 449 PyFrameObject *f = PyThreadState_GET()->frame; (gdb) bt #0 setup_context (registry=, module=, lineno=, filename=, stack_level=) at Python/_warnings.c:449 #1 do_warn (registry=, module=, lineno=, filename=, stack_level=) at Python/_warnings.c:593 #2 0x0000000000493c81 in PyErr_WarnEx (category=0x760720, text=, stack_level=1) at Python/_warnings.c:719 #3 0x00000000004c8e94 in PyOS_ascii_strtod (nptr=0x7ffff7f08914 "1 , 2 , 3 , 4", endptr=0x7fffffffdb28) at Python/pystrtod.c:282 #4 0x00007ffff2954151 in NumPyOS_ascii_strtod (s=0x7ffff7f08914 "1 , 2 , 3 , 4", endptr=0x7fffffffdb28) at numpy/core/src/multiarray/numpyos.c:527 #5 0x00007ffff29541cc in DOUBLE_fromstr (str=0x7ffff7ef14b0 "\1", ip=0xac4d60, endptr=0x0, __NPY_UNUSED_TAGGEDignore=0x6920656c62756f64) at numpy/core/src/multiarray/arraytypes.c.src:1575 #6 0x00007ffff29375a7 in fromstr_next_element (s=0x7fffffffdb28, dptr=0x760720, dtype=, end=0x7ffff7f08921 "") at numpy/core/src/multiarray/ctors.c:35 #7 0x00007ffff296153a in array_from_text (dtype=0x7ffff2bbd440, num=, sep=, nread=0x7fffffffdbb8, stream=0x7ffff7f08914, next=, skip_sep=0x7ffff294cb70 , stream_data=0x7ffff7f08921) at numpy/core/src/multiarray/ctors.c:2950 #8 0x00007ffff2961751 in PyArray_FromString (data=0x7ffff7f08914 "1 , 2 , 3 , 4", slen=, dtype=0x7ffff2bbd440, num=-1, sep=0x0) at numpy/core/src/multiarray/ctors.c:3264 #9 0x00007ffff29618cf in array_fromstring (__NPY_UNUSED_TAGGEDignored=, args=, keywds=) at numpy/core/src/multiarray/multiarraymodule.c:1707 #10 0x00000000004a03c0 in do_call (nk=, na=-1, pp_stack=, func=) at Python/ceval.c:4194 #11 call_function (nk=, na=-1, pp_stack=, func=) at Python/ceval.c:4002 #12 PyEval_EvalFrameEx (nk=, na=-1, pp_stack=, func=) at Python/ceval.c:2618 #13 0x00000000004a12a6 in PyEval_EvalCodeEx (co=0x7ffff7ef6210, globals=, locals=, args=0x0, argcount=1702130542, kws=, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3209 #14 0x00000000004a1372 in PyEval_EvalCode (co=0x7ffff7ef14b0, globals=0x760720, locals=0x0) at Python/ceval.c:666 #15 0x00000000004c0047 in run_mod (arena=, flags=, locals=, globals=, filename=, mod=) at Python/pythonrun.c:1342 #16 PyRun_InteractiveOneFlags (arena=, flags=, locals=, globals=, filename=, mod=) at Python/pythonrun.c:847 #17 0x00000000004c024e in PyRun_InteractiveLoopFlags (fp=0x34ecf686a0, filename=0x5161a8 "", flags=0x7fffffffdff0) at Python/pythonrun.c:767 #18 0x00000000004c094b in PyRun_AnyFileExFlags (fp=0x34ecf686a0, filename=0x5161a8 "", closeit=0, flags=0x7fffffffdff0) at Python/pythonrun.c:736 #19 0x0000000000414334 in Py_Main (argc=0, argv=) at Modules/main.c:576 #20 0x00000034ecc1ea2d in __libc_start_main (main=, argc=, ubp_av=, init=, fini=, rtld_fini=, stack_end=0x7fffffffe108) at libc-start.c:220 #21 0x0000000000413619 in _start () Bruce From bsouthey at gmail.com Tue Dec 15 11:46:26 
2009 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 15 Dec 2009 10:46:26 -0600 Subject: [Numpy-discussion] test_multiarray.TestIO.test_ascii segmentation fault with Python2.7 In-Reply-To: References: <4B27AB80.4070903@gmail.com> Message-ID: <4B27BD62.6020800@gmail.com> On 12/15/2009 10:07 AM, Pauli Virtanen wrote: [snip] > Please also test the 1.4.x branch > > http://svn.scipy.org/svn/numpy/branches/1.4.x > > Does it fail too on Python 2.7? There are very few code changes since > 1.4.x on the path that the test exercises. > > This took a little time find to the test because asbytes appears to be new in SVN. Anyhow, numpy 1.4.0rc1 works with Python 2.6 but not Python 2.7: $ python2.6 Python 2.6.1 (r261:532, May 7 2009, 11:38:00) [GCC 4.3.2 20081105 (Red Hat 4.3.2-7)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> np.__version__ '1.4.0rc1' >>> np.fromstring('1,2,3,4',sep=',') array([ 1., 2., 3., 4.]) >>> $ python2.7 Python 2.7a1 (r27a1:76674, Dec 14 2009, 13:46:01) [GCC 4.4.1 20090725 (Red Hat 4.4.1-2)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> np.__version__ '1.4.0rc1' >>> np.fromstring('1,2,3,4',sep=',') Segmentation fault $ gdb --args python GNU gdb (GDB) Fedora (6.8.50.20090302-39.fc11) Copyright (C) 2009 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-redhat-linux-gnu". For bug reporting instructions, please see: ... (gdb) run Starting program: /usr/local/bin/python [Thread debugging using libthread_db enabled] Python 2.7a1 (r27a1:76674, Dec 14 2009, 13:46:01) [GCC 4.4.1 20090725 (Red Hat 4.4.1-2)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> np.__version__ '1.4.0rc1' >>> from numpy.compat import asbytes, getexception Traceback (most recent call last): File "", line 1, in ImportError: cannot import name asbytes >>> s='1,2,3,4' >>> np.fromstring('1,2,3,4',sep=',') Program received signal SIGSEGV, Segmentation fault. 
setup_context (registry=, module=, lineno=, filename=, stack_level=) at Python/_warnings.c:449 449 PyFrameObject *f = PyThreadState_GET()->frame; (gdb) bt #0 setup_context (registry=, module=, lineno=, filename=, stack_level=) at Python/_warnings.c:449 #1 do_warn (registry=, module=, lineno=, filename=, stack_level=) at Python/_warnings.c:593 #2 0x0000000000493c81 in PyErr_WarnEx (category=0x760720, text=, stack_level=1) at Python/_warnings.c:719 #3 0x00000000004c8e94 in PyOS_ascii_strtod (nptr=0x7ffff7f16624 "1,2,3,4", endptr=0x7fffffffdb28) at Python/pystrtod.c:282 #4 0x00007ffff2955791 in NumPyOS_ascii_strtod (s=0x7ffff7f16624 "1,2,3,4", endptr=0x7fffffffdb28) at numpy/core/src/multiarray/numpyos.c:518 #5 0x00007ffff295580c in DOUBLE_fromstr (str=0x7ffff7ef14b0 "\1", ip=0xc42de0, endptr=0x0, __NPY_UNUSED_TAGGEDignore=0x6920656c62756f64) at numpy/core/src/multiarray/arraytypes.c.src:1549 #6 0x00007ffff2939397 in fromstr_next_element (s=0x7fffffffdb28, dptr=0x760720, dtype=, end=0x7ffff7f1662b "") at numpy/core/src/multiarray/ctors.c:33 #7 0x00007ffff296291a in array_from_text (dtype=0x7ffff2bbd3e0, num=, sep=, nread=0x7fffffffdbb8, stream=0x7ffff7f16624, next=, skip_sep=0x7ffff294d360 , stream_data=0x7ffff7f1662b) at numpy/core/src/multiarray/ctors.c:2927 #8 0x00007ffff2962b31 in PyArray_FromString (data=0x7ffff7f16624 "1,2,3,4", slen=, dtype=0x7ffff2bbd3e0, num=-1, sep=0x0) at numpy/core/src/multiarray/ctors.c:3234 #9 0x00007ffff2962caf in array_fromstring (__NPY_UNUSED_TAGGEDignored=, args=, keywds=) at numpy/core/src/multiarray/multiarraymodule.c:1684 #10 0x00000000004a03c0 in do_call (nk=, na=-1, pp_stack=, func=) at Python/ceval.c:4194 #11 call_function (nk=, na=-1, pp_stack=, func=) at Python/ceval.c:4002 #12 PyEval_EvalFrameEx (nk=, na=-1, pp_stack=, func=) at Python/ceval.c:2618 #13 0x00000000004a12a6 in PyEval_EvalCodeEx (co=0x7ffff7ef3738, globals=, locals=, args=0x0, argcount=1702130542, kws=, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3209 #14 0x00000000004a1372 in PyEval_EvalCode (co=0x7ffff7ef14b0, globals=0x760720, locals=0x0) at Python/ceval.c:666 #15 0x00000000004c0047 in run_mod (arena=, flags=, locals=, globals=, filename=, mod=) at Python/pythonrun.c:1342 #16 PyRun_InteractiveOneFlags (arena=, flags=, locals=, globals=, filename=, mod=) at Python/pythonrun.c:847 #17 0x00000000004c024e in PyRun_InteractiveLoopFlags (fp=0x34ecf686a0, filename=0x5161a8 "", flags=0x7fffffffdff0) at Python/pythonrun.c:767 #18 0x00000000004c094b in PyRun_AnyFileExFlags (fp=0x34ecf686a0, filename=0x5161a8 "", closeit=0, flags=0x7fffffffdff0) at Python/pythonrun.c:736 #19 0x0000000000414334 in Py_Main (argc=0, argv=) at Modules/main.c:576 #20 0x00000034ecc1ea2d in __libc_start_main (main=, argc=, ubp_av=, init=, fini=, rtld_fini=, stack_end=0x7fffffffe108) at libc-start.c:220 #21 0x0000000000413619 in _start () From pav+sp at iki.fi Tue Dec 15 11:51:39 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Tue, 15 Dec 2009 16:51:39 +0000 (UTC) Subject: [Numpy-discussion] test_multiarray.TestIO.test_ascii segmentation fault with Python2.7 References: <4B27AB80.4070903@gmail.com> <4B27BAF3.2040306@gmail.com> Message-ID: Tue, 15 Dec 2009 10:36:03 -0600, Bruce Southey wrote: [clip] > Program received signal SIGSEGV, Segmentation fault. 
setup_context > (registry=, module=, > lineno=, filename=, > stack_level=) > at Python/_warnings.c:449 > 449 PyFrameObject *f = PyThreadState_GET()->frame; (gdb) bt > #0 setup_context (registry=, module= optimized out>, lineno=, filename= out>, stack_level=) > at Python/_warnings.c:449 > #1 do_warn (registry=, module= out>, lineno=, filename=, > stack_level=) > at Python/_warnings.c:593 > #2 0x0000000000493c81 in PyErr_WarnEx (category=0x760720, text= optimized out>, stack_level=1) at Python/_warnings.c:719 #3 > 0x00000000004c8e94 in PyOS_ascii_strtod (nptr=0x7ffff7f08914 "1 , 2 , 3 > , 4", endptr=0x7fffffffdb28) at Python/pystrtod.c:282 #4 > 0x00007ffff2954151 in NumPyOS_ascii_strtod (s=0x7ffff7f08914 "1 , 2 , 3 > , 4", endptr=0x7fffffffdb28) at numpy/core/src/multiarray/numpyos.c:527 Looks like it's trying to raise a deprecation warning after PyArray_FromString has released GIL. So that was the reason why it caused a segfault also in 3.1. PyOS_ascii_strtod was deprecated in 2.7 and in 3.1. Apparently, we now *must* do something like #if PY_VERSION_HEX >= 0x02060000 return PyOS_string_to_double(s, endptr, NULL); #else return PyOS_ascii_strtod(s, endptr); #endif everywhere the function is used. It also seems that this needs to be backported to Numpy 1.4.x... (Note to self: this is also the origin of the crashes in scipy/ lambertw... GIL must be reacquired before raising any warnings.) -- Pauli Virtanen From charlesr.harris at gmail.com Tue Dec 15 11:59:39 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Dec 2009 09:59:39 -0700 Subject: [Numpy-discussion] test_multiarray.TestIO.test_ascii segmentation fault with Python2.7 In-Reply-To: References: <4B27AB80.4070903@gmail.com> <4B27BAF3.2040306@gmail.com> Message-ID: On Tue, Dec 15, 2009 at 9:51 AM, Pauli Virtanen > wrote: > Tue, 15 Dec 2009 10:36:03 -0600, Bruce Southey wrote: > [clip] > > Program received signal SIGSEGV, Segmentation fault. setup_context > > (registry=, module=, > > lineno=, filename=, > > stack_level=) > > at Python/_warnings.c:449 > > 449 PyFrameObject *f = PyThreadState_GET()->frame; (gdb) bt > > #0 setup_context (registry=, module= > optimized out>, lineno=, filename= > out>, stack_level=) > > at Python/_warnings.c:449 > > #1 do_warn (registry=, module= > out>, lineno=, filename=, > > stack_level=) > > at Python/_warnings.c:593 > > #2 0x0000000000493c81 in PyErr_WarnEx (category=0x760720, text= > optimized out>, stack_level=1) at Python/_warnings.c:719 #3 > > 0x00000000004c8e94 in PyOS_ascii_strtod (nptr=0x7ffff7f08914 "1 , 2 , 3 > > , 4", endptr=0x7fffffffdb28) at Python/pystrtod.c:282 #4 > > 0x00007ffff2954151 in NumPyOS_ascii_strtod (s=0x7ffff7f08914 "1 , 2 , 3 > > , 4", endptr=0x7fffffffdb28) at numpy/core/src/multiarray/numpyos.c:527 > > Looks like it's trying to raise a deprecation warning after > PyArray_FromString has released GIL. So that was the reason why it caused > a segfault also in 3.1. > > PyOS_ascii_strtod was deprecated in 2.7 and in 3.1. Apparently, we now > *must* do something like > > #if PY_VERSION_HEX >= 0x02060000 > return PyOS_string_to_double(s, endptr, NULL); > #else > return PyOS_ascii_strtod(s, endptr); > #endif > > everywhere the function is used. > > It also seems that this needs to be backported to Numpy 1.4.x... > > (Note to self: this is also the origin of the crashes in scipy/ > lambertw... GIL must be reacquired before raising any warnings.) > > Would it be appropriate to put macros for all these in config.h or some other common spot? 
Having all the python version dependencies in one spot might make it
easier to keep current. I've been thinking of moving the numpy
deprecation macro for that reason.

Chuck

From matthew.brett at gmail.com Tue Dec 15 12:01:53 2009
From: matthew.brett at gmail.com (Matthew Brett)
Date: Tue, 15 Dec 2009 12:01:53 -0500
Subject: [Numpy-discussion] Proposal for matrix_rank function in numpy
Message-ID: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com>

Hi,

Following on from the occasional discussion on the list, can I propose
a small matrix_rank function for inclusion in numpy/linalg?

I suggest it because it seems rather a basic need for linear algebra,
and it's very small and simple...

I've appended an implementation with some doctests in the hope that it
will be acceptable,

Robert - I hope you don't mind me quoting you in the notes.

Thanks a lot,

Matthew

def matrix_rank(M, tol=None):
    ''' Return rank of matrix using SVD method

    Rank of the array is the number of SVD singular values of the
    array that are greater than `tol`.

    Parameters
    ----------
    M : array-like
        array of <=2 dimensions
    tol : {None, float}
         threshold below which SVD values are considered zero. If `tol`
         is None, and `S` is an array with singular values for `M`, and
         `eps` is the epsilon value for datatype of `S`, then `tol` is
         set to ``S.max() * eps``.

    Examples
    --------
    >>> matrix_rank(np.eye(4)) # Full rank matrix
    4
    >>> matrix_rank(np.c_[np.eye(4),np.eye(4)]) # Rank deficient matrix
    4
    >>> matrix_rank(np.zeros((4,4))) # All zeros - zero rank
    0
    >>> matrix_rank(np.ones((4,))) # 1 dimension - rank 1 unless all 0
    1
    >>> matrix_rank(np.zeros((4,)))
    0
    >>> matrix_rank([1]) # accepts array-like
    1

    Notes
    -----
    Golub and van Loan define "numerical rank deficiency" as using
    tol=eps*S[0] (note that S[0] is the maximum singular value and thus
    the 2-norm of the matrix). There really is not one definition, much
    like there isn't a single definition of the norm of a matrix. For
    example, if your data come from uncertain measurements with
    uncertainties greater than floating point epsilon, choosing a
    tolerance of about the uncertainty is probably a better idea (the
    tolerance may be absolute if the uncertainties are absolute rather
    than relative, even). When floating point roundoff is your concern,
    then "numerical rank deficiency" is a better concept, but exactly
    what the relevant measure of the tolerance is depends on the
    operations you intend to do with your matrix. [RK, numpy mailing
    list]

    References
    ----------
    Matrix Computations by Golub and van Loan
    '''
    M = np.asarray(M)
    if M.ndim > 2:
        raise TypeError('array should have 2 or fewer dimensions')
    if M.ndim < 2:
        return int(not np.all(M==0))
    S = npl.svd(M, compute_uv=False)
    if tol is None:
        tol = S.max() * np.finfo(S.dtype).eps
    return np.sum(S > tol)
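As an editorial illustration of the `tol` argument above on a nearly
rank-deficient matrix (the 1e-12 perturbation and the 1e-6 tolerance are
arbitrary example values, computed with the same SVD recipe the function
uses):

    import numpy as np

    A = np.eye(4)
    A[3, 3] = 1e-12   # one direction is almost, but not exactly, degenerate
    S = np.linalg.svd(A, compute_uv=False)

    # default tolerance S.max()*eps keeps the tiny singular value: rank 4
    print np.sum(S > S.max() * np.finfo(S.dtype).eps)
    # a measurement-scale tolerance discards it: rank 3
    print np.sum(S > 1e-6)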
From pav+sp at iki.fi Tue Dec 15 12:09:38 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Tue, 15 Dec 2009 17:09:38 +0000 (UTC) Subject: [Numpy-discussion] test_multiarray.TestIO.test_ascii segmentation fault with Python2.7 References: <4B27AB80.4070903@gmail.com> <4B27BAF3.2040306@gmail.com> Message-ID: Tue, 15 Dec 2009 09:59:39 -0700, Charles R Harris wrote:

> Would it be appropriate to put macros for all these in config.h or some
> other common spot? Having all the python version dependencies in one
> spot might make it easier to keep current. I've been thinking of moving
> the numpy deprecation macro for that reason.

Actually, we already have a separate NumPyOS_ascii_strtod function that should be used instead of PyOS_ascii_strtod (which, historically, has not satisfied our requirements). So I believe PyOS_ascii_strtod is used in only a single location in Numpy. -- Pauli Virtanen

From josef.pktd at gmail.com Tue Dec 15 12:12:37 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 15 Dec 2009 12:12:37 -0500 Subject: [Numpy-discussion] Proposal for matrix_rank function in numpy In-Reply-To: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com> References: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com> Message-ID: <1cd32cbb0912150912u1a9fc5a3xec52a7684bf4c73@mail.gmail.com> On Tue, Dec 15, 2009 at 12:01 PM, Matthew Brett wrote:

> Hi,
>
> Following on from the occasional discussion on the list, can I propose
> a small matrix_rank function for inclusion in numpy/linalg?
>
> I suggest it because it seems rather a basic need for linear algebra,
> and it's very small and simple...
>
> I've appended an implementation with some doctests in the hope that it
> will be acceptable,
>
> Robert - I hope you don't mind me quoting you in the notes.
>
> Thanks a lot,
>
> Matthew
>
> [clip: matrix_rank implementation quoted in full above]

This was missing from numpy compared to matlab and gauss. If we put it in linalg next to np.linalg.cond, then we could shorten the name to `rank`, since the meaning of np.linalg.rank should be pretty obvious. Josef

From mdroe at stsci.edu Tue Dec 15 13:20:11 2009 From: mdroe at stsci.edu (Michael Droettboom) Date: Tue, 15 Dec 2009 13:20:11 -0500 Subject: [Numpy-discussion] fromfile can segfault if data is corrupted Message-ID: <4B27D35B.80406@stsci.edu> I just discovered a bug in fromfile where it can segfault if the file data is corrupted in such a way that the array size is insanely large. (It was a byte-swapping problem in my own code, but it would be preferable to get an exception rather than a crash).
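A minimal sketch of the failure mode -- the file contents and the huge count (standing in for a byte-swapped size field) are invented, and the exact exception type depends on the fix:

    import numpy as np
    import tempfile

    with tempfile.TemporaryFile() as f:
        f.write(np.arange(4.0).tostring())
        f.seek(0)
        try:
            np.fromfile(f, dtype=np.float64, count=2**60)
        except (ValueError, MemoryError) as err:
            # with the patch, the "array too large" error propagates
            # here instead of a NULL pointer being dereferenced
            print("raised cleanly: %s" % err)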
It's a simple fix to propagate the "array too large" exception before trying to dereference the NULL array pointer (ret) in PyArray_FromFile (see attached patch). But my question is: is this an appropriate fix for 1.4 (it seems pretty straightforward), or should I make this change only to the trunk? Mike -- Michael Droettboom Science Software Branch Operations and Engineering Division Space Telescope Science Institute Operated by AURA for NASA -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: fromfile_segfault.patch

From charlesr.harris at gmail.com Tue Dec 15 13:28:43 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Dec 2009 11:28:43 -0700 Subject: [Numpy-discussion] fromfile can segfault if data is corrupted In-Reply-To: <4B27D35B.80406@stsci.edu> References: <4B27D35B.80406@stsci.edu> Message-ID: On Tue, Dec 15, 2009 at 11:20 AM, Michael Droettboom wrote:

> [clip]
> But my question is: is this an appropriate fix for 1.4 (it seems pretty
> straightforward), or should I make this change only to the trunk?

David can weigh in here, but I think you should backport it. It's a bugfix, small, and there is going to be another rc. On the other hand, Travis should stop backporting new functionality. Chuck

From bsouthey at gmail.com Tue Dec 15 13:39:11 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 15 Dec 2009 12:39:11 -0600 Subject: [Numpy-discussion] Proposal for matrix_rank function in numpy In-Reply-To: <1cd32cbb0912150912u1a9fc5a3xec52a7684bf4c73@mail.gmail.com> References: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com> <1cd32cbb0912150912u1a9fc5a3xec52a7684bf4c73@mail.gmail.com> Message-ID: <4B27D7CF.3060001@gmail.com> On 12/15/2009 11:12 AM, josef.pktd at gmail.com wrote:

> On Tue, Dec 15, 2009 at 12:01 PM, Matthew Brett wrote:
>> [clip: matrix_rank proposal quoted in full above]
> This was missing from numpy compared to matlab and gauss.
>
> If we put it in linalg next to np.linalg.cond, then we could shorten
> the name to `rank`, since the meaning of np.linalg.rank should be
> pretty obvious.
>
> Josef

+1 for the function, but we cannot shorten the name because of the existing numpy.rank() function. Bruce

From matthew.brett at gmail.com Tue Dec 15 13:45:07 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 15 Dec 2009 13:45:07 -0500 Subject: [Numpy-discussion] Proposal for matrix_rank function in numpy In-Reply-To: <4B27D7CF.3060001@gmail.com> References: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com> <1cd32cbb0912150912u1a9fc5a3xec52a7684bf4c73@mail.gmail.com> <4B27D7CF.3060001@gmail.com> Message-ID: <1e2af89e0912151045q5294456l84a61b6c90a70a33@mail.gmail.com> Hi,

> +1 for the function, but we cannot shorten the name because of the
> existing numpy.rank() function.

I don't feel strongly about the name, but I imagine you could do

    from numpy.linalg import rank as matrix_rank

if you weren't using the numpy.linalg namespace already... Best, Matthew

From aisaac at american.edu Tue Dec 15 13:47:50 2009 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 15 Dec 2009 13:47:50 -0500 Subject: [Numpy-discussion] Proposal for matrix_rank function in numpy In-Reply-To: <4B27D7CF.3060001@gmail.com> References: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com> <1cd32cbb0912150912u1a9fc5a3xec52a7684bf4c73@mail.gmail.com> <4B27D7CF.3060001@gmail.com> Message-ID: <4B27D9D6.7050807@american.edu> On 12/15/2009 1:39 PM, Bruce Southey wrote:

> +1 for the function, but we cannot shorten the name because of the
> existing numpy.rank() function.
1. Is it a rule that there cannot be a name duplication in this different namespace?

2. Is there a commitment to keeping both np.rank and np.ndim? (I.e., can np.rank never be deprecated?)

If the answers are both 'yes', then perhaps linalg.rank2d is a possible shorter name. Alan Isaac

From bsouthey at gmail.com Tue Dec 15 14:24:43 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 15 Dec 2009 13:24:43 -0600 Subject: [Numpy-discussion] Proposal for matrix_rank function in numpy In-Reply-To: <4B27D9D6.7050807@american.edu> References: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com> <1cd32cbb0912150912u1a9fc5a3xec52a7684bf4c73@mail.gmail.com> <4B27D7CF.3060001@gmail.com> <4B27D9D6.7050807@american.edu> Message-ID: <4B27E27B.7000806@gmail.com> On 12/15/2009 12:47 PM, Alan G Isaac wrote:

> 1. Is it a rule that there cannot be a name duplication
> in this different namespace?

In my view this is still the same numpy namespace. An example of the potential problems is just using an incorrect import statement somewhere:

    from numpy import rank

instead of

    from numpy.linalg import rank

For a package you control, you should really prevent this type of user mistake.

> 2. Is there a commitment to keeping both np.rank and np.ndim?
> (I.e., can np.rank never be deprecated?)

I do not see that as practical, because of the number of releases it takes to actually remove a function. Also, the current rank function has existed for a very long time in Numerical Python (it is present in Numeric), so it could be confusing for a user to think that the function has just been moved rather than being a different function.

> If the answers are both 'yes',
> then perhaps linalg.rank2d is a possible shorter name.

Actually, I do interpret rank in terms of the linear algebra definition, but obviously other people have other meanings. Bruce

From robert.kern at gmail.com Tue Dec 15 14:45:55 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 15 Dec 2009 13:45:55 -0600 Subject: [Numpy-discussion] Proposal for matrix_rank function in numpy In-Reply-To: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com> References: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com> Message-ID: <3d375d730912151145g2023978ev294215a99362de87@mail.gmail.com> On Tue, Dec 15, 2009 at 11:01, Matthew Brett wrote:

> [clip]
> I've appended an implementation with some doctests in the hope that it
> will be acceptable,

I think you need a real example of a nontrivial numerically rank-deficient matrix. Note that c_[eye(4), eye(4)] is actually a full-rank matrix. A matrix is full rank if its numerical rank is equal to min(rows, cols), not max(rows, cols). Taking I=eye(4); I[-1,-1] = 0.0 should be a sufficient example.

> Robert - I hope you don't mind me quoting you in the notes.

I certainly don't. However, you do not need to cite me; I'm in the authors list already. On the other hand, you probably shouldn't copy-and-paste anything I write on the mailing list to use in a docstring.
On the mailing list, I am answering a particular question and use a different voice than is appropriate for a docstring. Also, a full citation of Golub and Van Loan would be appropriate:

.. [1] G. H. Golub and C. F. Van Loan, _Matrix Computations_.
   Baltimore: Johns Hopkins University Press, 1996.

-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From matthew.brett at gmail.com Tue Dec 15 15:16:25 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 15 Dec 2009 15:16:25 -0500 Subject: [Numpy-discussion] Proposal for matrix_rank function in numpy In-Reply-To: <3d375d730912151145g2023978ev294215a99362de87@mail.gmail.com> References: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com> <3d375d730912151145g2023978ev294215a99362de87@mail.gmail.com> Message-ID: <1e2af89e0912151216m564a2bf3ha098085daf2d50a2@mail.gmail.com> Hi, Is it reasonable to summarize that, to avoid confusion, we keep 'matrix_rank' as the name? I've edited as Robert suggested, attempting to adopt a more suitable tone in the docstring... Thanks a lot, Matthew

def matrix_rank(M, tol=None):
    ''' Return rank of matrix using SVD method

    Rank of the array is the number of SVD singular values of the
    array that are greater than `tol`.

    Parameters
    ----------
    M : array-like
        array of <=2 dimensions
    tol : {None, float}
        threshold below which SVD values are considered zero. If `tol`
        is None, and `S` is an array with singular values for `M`, and
        `eps` is the epsilon value for datatype of `S`, then `tol` set
        to ``S.max() * eps``.

    Examples
    --------
    >>> matrix_rank(np.eye(4)) # Full rank matrix
    4
    >>> I=np.eye(4); I[-1,-1] = 0. # rank deficient matrix
    >>> matrix_rank(I)
    3
    >>> matrix_rank(np.zeros((4,4))) # All zeros - zero rank
    0
    >>> matrix_rank(np.ones((4,))) # 1 dimension - rank 1 unless all 0
    1
    >>> matrix_rank(np.zeros((4,)))
    0
    >>> matrix_rank([1]) # accepts array-like
    1

    Notes
    -----
    Golub and van Loan [1]_ define "numerical rank deficiency" as using
    tol=eps*S[0] (where S[0] is the maximum singular value and thus the
    2-norm of the matrix). This is one definition of rank deficiency,
    and the one we use here. When floating point roundoff is the main
    concern, then "numerical rank deficiency" is a reasonable choice. In
    some cases you may prefer other definitions. The most useful measure
    of the tolerance depends on the operations you intend to use on your
    matrix. For example, if your data come from uncertain measurements
    with uncertainties greater than floating point epsilon, choosing a
    tolerance near that uncertainty may be preferable. The tolerance may
    be absolute if the uncertainties are absolute rather than relative.

    References
    ----------
    .. [1] G. H. Golub and C. F. Van Loan, _Matrix Computations_.
       Baltimore: Johns Hopkins University Press, 1996.
    '''
    M = np.asarray(M)
    if M.ndim > 2:
        raise TypeError('array should have 2 or fewer dimensions')
    if M.ndim < 2:
        return int(not np.all(M==0))
    S = npl.svd(M, compute_uv=False)
    if tol is None:
        tol = S.max() * np.finfo(S.dtype).eps
    return np.sum(S > tol)

From fonnesbeck at gmail.com Tue Dec 15 19:22:42 2009 From: fonnesbeck at gmail.com (Chris) Date: Wed, 16 Dec 2009 00:22:42 +0000 (UTC) Subject: [Numpy-discussion] Import error in builds of 7726 References: <5b8d13220912122237p5da315e0yb7cc83f0e8edbbe1@mail.gmail.com> <3d375d730912122325w279f2a12uba9dd22f58cbbe3f@mail.gmail.com> <5b8d13220912130056w170b9e22g8e50e7e726f60596@mail.gmail.com> <5b8d13220912132026h55259b07t67254bd7c8f17806@mail.gmail.com> Message-ID: David Cournapeau <cournape at gmail.com> writes:

> Ok, so the undefined functions all indicate that the most recently
> implemented ones are not included. I really cannot see any other
> explanation than having a discrepancy between the source tree, build
> tree and installation. Sometimes svn screws things up when switching
> between branches in my experience, so that's something to check for as
> well.

By the way, I tried building 1.4rc1 and the same thing happens.

From fonnesbeck at gmail.com Tue Dec 15 19:38:13 2009 From: fonnesbeck at gmail.com (Chris) Date: Wed, 16 Dec 2009 00:38:13 +0000 (UTC) Subject: [Numpy-discussion] Import error in builds of 7726 References: <5b8d13220912122237p5da315e0yb7cc83f0e8edbbe1@mail.gmail.com> <3d375d730912122325w279f2a12uba9dd22f58cbbe3f@mail.gmail.com> <5b8d13220912130056w170b9e22g8e50e7e726f60596@mail.gmail.com> <5b8d13220912132026h55259b07t67254bd7c8f17806@mail.gmail.com> Message-ID: Chris <fonnesbeck at gmail.com> writes:

> By the way, I tried building 1.4rc1 and the same thing happens.

... however, I was able to get a usable build from r7542. Not sure how much more recent I can go before failures occur -- somewhere between 7543 and 7726.

From fonnesbeck at gmail.com Tue Dec 15 21:09:45 2009 From: fonnesbeck at gmail.com (Chris) Date: Wed, 16 Dec 2009 02:09:45 +0000 (UTC) Subject: [Numpy-discussion] Failure building scipy.special.lambertw Message-ID: Building a current checkout of scipy on OSX 10.6 fails when trying to compile scipy.special.lambertw, giving the message: Warning: No configuration returned, assuming unavailable. The full failure is here: http://img.skitch.com/20091216-d4b8ueqh27g4fqwebu3e3wgfkq.jpg

From pav+sp at iki.fi Wed Dec 16 03:50:31 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Wed, 16 Dec 2009 08:50:31 +0000 (UTC) Subject: [Numpy-discussion] Failure building scipy.special.lambertw References: Message-ID: Wed, 16 Dec 2009 02:09:45 +0000, Chris wrote:

> Building a current checkout of scipy on OSX 10.6 fails when trying to
> compile scipy.special.lambertw, giving the message:
>
> Warning: No configuration returned, assuming unavailable.
>
> The full failure is here:
>
> http://img.skitch.com/20091216-d4b8ueqh27g4fqwebu3e3wgfkq.jpg

npy_cabs et al. are defined in npy_math.h, and it seems you have an old version of that lying somewhere. Do you have an old version of Numpy installed at /Library/Python/2.6/site-packages/numpy?

From aisaac at american.edu Wed Dec 16 10:48:55 2009 From: aisaac at american.edu (Alan G Isaac) Date: Wed, 16 Dec 2009 10:48:55 -0500 Subject: [Numpy-discussion] fsum Message-ID: <4B290167.1040004@american.edu> Does NumPy have an equivalent to Python's math.fsum?
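(For context, a small sketch of the behavior difference I have in mind, with invented values:

    import math
    import numpy as np

    a = [1e16, 1.0, -1e16] * 10
    print(math.fsum(a))  # 10.0 exactly -- compensated summation
    print(np.sum(a))     # 0.0 here -- the 1.0s are lost to rounding

)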
Thanks, Alan Isaac

From denis-bz-py at t-online.de Wed Dec 16 12:15:18 2009 From: denis-bz-py at t-online.de (denis) Date: Wed, 16 Dec 2009 18:15:18 +0100 Subject: [Numpy-discussion] Slicing slower than matrix multiplication? In-Reply-To: <4B2231B6.6000709@hccnet.nl> References: <4B2231B6.6000709@hccnet.nl> Message-ID: A general question: is there a collection of numpy code snippets as transformed by experts, short of googling site:mail.scipy.org/pipermail/numpy-discussion or the like? And a subquestion: does anyone have a list of algebraic identities for .T, vstack etc.? For a real example, to transform

    dots = np.vstack([ dot( x[j:j+4] .T, imatrix ) .T for j in ...])

cheers -- denis

From jsseabold at gmail.com Wed Dec 16 13:56:08 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Wed, 16 Dec 2009 13:56:08 -0500 Subject: [Numpy-discussion] Proposal for matrix_rank function in numpy In-Reply-To: <1e2af89e0912151216m564a2bf3ha098085daf2d50a2@mail.gmail.com> References: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com> <3d375d730912151145g2023978ev294215a99362de87@mail.gmail.com> <1e2af89e0912151216m564a2bf3ha098085daf2d50a2@mail.gmail.com> Message-ID: On Tue, Dec 15, 2009 at 3:16 PM, Matthew Brett wrote:

> Hi,
>
> Is it reasonable to summarize that, to avoid confusion, we keep
> 'matrix_rank' as the name?
>
> I've edited as Robert suggested, attempting to adopt a more suitable
> tone in the docstring...
>
> Thanks a lot,
>
> Matthew

What comes next when someone offers up a useful function like this? We are using an earlier version of it in statsmodels and wouldn't mind seeing this in numpy. Presumably the doctests should be turned into actual tests (noting Robert's comment) to make it more likely that it gets in and an enhancement ticket should be filed? Skipper

From matthew.brett at gmail.com Wed Dec 16 14:13:08 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 16 Dec 2009 14:13:08 -0500 Subject: [Numpy-discussion] Proposal for matrix_rank function in numpy In-Reply-To: References: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com> <3d375d730912151145g2023978ev294215a99362de87@mail.gmail.com> <1e2af89e0912151216m564a2bf3ha098085daf2d50a2@mail.gmail.com> Message-ID: <1e2af89e0912161113s6bcd8dbu560ab9df950d362d@mail.gmail.com> Hi,

>> Is it reasonable to summarize that, to avoid confusion, we keep
>> 'matrix_rank' as the name?
>>
>> I've edited as Robert suggested, attempting to adopt a more suitable
>> tone in the docstring...

> What comes next when someone offers up a useful function like this?
> We are using an earlier version of it in statsmodels and wouldn't mind
> seeing this in numpy. Presumably the doctests should be turned into
> actual tests (noting Robert's comment) to make it more likely that it
> gets in and an enhancement ticket should be filed?

I'm happy to write the doctests as tests. My feeling is there is no objection to this function at the moment, so it would be reasonable, unless I hear otherwise, to commit to SVN.
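For concreteness, a sketch of how those doctests might translate into test functions -- the import path is an assumption, to be adjusted to wherever the function lands:

    import numpy as np
    from numpy.testing import assert_equal
    from numpy.linalg import matrix_rank  # assumed final location

    def test_matrix_rank():
        assert_equal(matrix_rank(np.eye(4)), 4)         # full rank
        I = np.eye(4)
        I[-1, -1] = 0.
        assert_equal(matrix_rank(I), 3)                 # rank deficient
        assert_equal(matrix_rank(np.zeros((4, 4))), 0)  # all zeros
        assert_equal(matrix_rank(np.ones((4,))), 1)     # 1-d, not all zero
        assert_equal(matrix_rank(np.zeros((4,))), 0)
        assert_equal(matrix_rank([1]), 1)               # accepts array-like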
Best, Matthew

From jsseabold at gmail.com Wed Dec 16 14:15:06 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Wed, 16 Dec 2009 14:15:06 -0500 Subject: [Numpy-discussion] Proposal for matrix_rank function in numpy In-Reply-To: <1e2af89e0912161113s6bcd8dbu560ab9df950d362d@mail.gmail.com> References: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com> <3d375d730912151145g2023978ev294215a99362de87@mail.gmail.com> <1e2af89e0912151216m564a2bf3ha098085daf2d50a2@mail.gmail.com> <1e2af89e0912161113s6bcd8dbu560ab9df950d362d@mail.gmail.com> Message-ID: On Wed, Dec 16, 2009 at 2:13 PM, Matthew Brett wrote:

> [clip]
> I'm happy to write the doctests as tests. My feeling is there is no
> objection to this function at the moment, so it would be reasonable,
> unless I hear otherwise, to commit to SVN.

Sounds good. I didn't know you had commit privileges, and I didn't want to see this get lost in the shuffle. Skipper

From gael.varoquaux at normalesup.org Wed Dec 16 14:16:12 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 16 Dec 2009 20:16:12 +0100 Subject: [Numpy-discussion] Proposal for matrix_rank function in numpy In-Reply-To: <1e2af89e0912161113s6bcd8dbu560ab9df950d362d@mail.gmail.com> References: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com> <3d375d730912151145g2023978ev294215a99362de87@mail.gmail.com> <1e2af89e0912151216m564a2bf3ha098085daf2d50a2@mail.gmail.com> <1e2af89e0912161113s6bcd8dbu560ab9df950d362d@mail.gmail.com> Message-ID: <20091216191612.GA18266@phare.normalesup.org> On Wed, Dec 16, 2009 at 02:13:08PM -0500, Matthew Brett wrote:

> I'm happy to write the doctests as tests. My feeling is there is no
> objection to this function at the moment, so it would be reasonable,
> unless I hear otherwise, to commit to SVN.

I have one small comment: I am really happy to see you working on this function. It will be very useful, and it's great to have it in a generally-available package. However, I wonder whether it belongs to numpy.linalg or scipy.linalg. I was under the impression that we should direct users who have linalg problems to scipy, as it can do much more. My 2 cents, Gaël

From matthew.brett at gmail.com Wed Dec 16 14:22:31 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 16 Dec 2009 14:22:31 -0500 Subject: [Numpy-discussion] Proposal for matrix_rank function in numpy In-Reply-To: <20091216191612.GA18266@phare.normalesup.org> References: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com> <3d375d730912151145g2023978ev294215a99362de87@mail.gmail.com> <1e2af89e0912151216m564a2bf3ha098085daf2d50a2@mail.gmail.com> <1e2af89e0912161113s6bcd8dbu560ab9df950d362d@mail.gmail.com> <20091216191612.GA18266@phare.normalesup.org> Message-ID: <1e2af89e0912161122n368b3a2et282f6d780f6962d@mail.gmail.com> Hi, On Wed, Dec 16, 2009 at 2:16 PM, Gael Varoquaux wrote:

> On Wed, Dec 16, 2009 at 02:13:08PM -0500, Matthew Brett wrote:
>> I'm happy to write the doctests as tests.
>> My feeling is there is no objection to this function at the moment,
>> so it would be reasonable, unless I hear otherwise, to commit to SVN.
>
> I have one small comment: I am really happy to see you working on this
> function. It will be very useful, and it's great to have it in a
> generally-available package. However, I wonder whether it belongs to
> numpy.linalg or scipy.linalg. I was under the impression that we should
> direct users who have linalg problems to scipy, as it can do much more.
>
> My 2 cents, Gaël

It's another option, and one I thought of, but in this case, the use is so ubiquitous in linear algebra, and the function so small, that it would seem a pity to require installing scipy to get it. See you, Matthew

From dwf at cs.toronto.edu Wed Dec 16 17:21:35 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 16 Dec 2009 17:21:35 -0500 Subject: [Numpy-discussion] tracking additions to NumPy across versions Message-ID: <0488DE71-663B-4691-95BA-6364759E9B0A@cs.toronto.edu> Hi all, Is there currently anything in the docstring standard about tracking when functions get added to NumPy? The recent discussion of matrix_rank got me thinking about this. Regards, David

From dwf at cs.toronto.edu Wed Dec 16 17:31:53 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 16 Dec 2009 17:31:53 -0500 Subject: [Numpy-discussion] tracking additions to NumPy across versions In-Reply-To: <0488DE71-663B-4691-95BA-6364759E9B0A@cs.toronto.edu> References: <0488DE71-663B-4691-95BA-6364759E9B0A@cs.toronto.edu> Message-ID: <8A071CA6-F208-4CC8-9BAE-CFC27ECA715B@cs.toronto.edu> On 16-Dec-09, at 5:21 PM, David Warde-Farley wrote:

> Hi all,
>
> Is there currently anything in the docstring standard about tracking
> when functions get added to NumPy? The recent discussion of
> matrix_rank got me thinking about this.

Once again, Google answered for me a few minutes after I asked. I see we have ``.. versionadded::``. http://mail.scipy.org/pipermail/numpy-discussion/2009-July/044043.html Sorry for the noise. David

From ndbecker2 at gmail.com Thu Dec 17 06:36:03 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 17 Dec 2009 06:36:03 -0500 Subject: [Numpy-discussion] segfault in vdot Message-ID: http://projects.scipy.org/numpy/ticket/1335

From pgmdevlist at gmail.com Thu Dec 17 09:16:29 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 17 Dec 2009 09:16:29 -0500 Subject: [Numpy-discussion] np.void from 0d array + subclassing Message-ID: <2E60EE61-F9C1-4074-A207-BF2CBDCCCF4A@gmail.com> All,
* What is the most efficient way to get a np.void object from a 0d structured ndarray?
* Is there any way to subclass np.void?
Thanks a lot in advance! P.

From faltet at pytables.org Thu Dec 17 10:16:25 2009 From: faltet at pytables.org (Francesc Alted) Date: Thu, 17 Dec 2009 16:16:25 +0100 Subject: [Numpy-discussion] np.void from 0d array + subclassing In-Reply-To: <2E60EE61-F9C1-4074-A207-BF2CBDCCCF4A@gmail.com> References: <2E60EE61-F9C1-4074-A207-BF2CBDCCCF4A@gmail.com> Message-ID: <200912171616.25300.faltet@pytables.org> A Thursday 17 December 2009 15:16:29 Pierre GM escrigué:

> All,
> * What is the most efficient way to get a np.void object from a 0d
> structured ndarray?

I normally use the `PyArray_GETITEM` C macro for general n-d structured arrays. I suppose that this will work with 0-d arrays too.
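A rough Python-level cross-check, for reference -- the dtype below is invented; note that on a structured array .item() hands back a plain tuple rather than the np.void itself, which is part of what makes the question tricky:

    import numpy as np
    a = np.array((1, 2.5), dtype=[('x', int), ('y', float)])  # 0-d structured
    print(type(a.item()))  # tuple, not numpy.void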
-- Francesc Alted

From d.l.goldsmith at gmail.com Thu Dec 17 13:36:50 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Thu, 17 Dec 2009 10:36:50 -0800 Subject: [Numpy-discussion] np.void from 0d array + subclassing In-Reply-To: <2E60EE61-F9C1-4074-A207-BF2CBDCCCF4A@gmail.com> References: <2E60EE61-F9C1-4074-A207-BF2CBDCCCF4A@gmail.com> Message-ID: <45d1ab480912171036ka3204f0v5f6c450ae088d94c@mail.gmail.com> On Thu, Dec 17, 2009 at 6:16 AM, Pierre GM wrote:

> All,
> * What is the most efficient way to get a np.void object from a 0d structured ndarray?
> * Is there any way to subclass np.void?

The standard way (more or less) works for me:

>>> class myvoidclass(np.void):
...     pass
...
>>> foo = myvoidclass()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: function takes exactly 1 argument (0 given)
>>> foo = myvoidclass(1)
>>> dir(foo)
['T', '__abs__', '__add__', '__and__', '__array__', '__array_interface__',
'__array_priority__', '__array_struct__', '__array_wrap__', '__class__',
'__copy__', '__deepcopy__', '__delattr__', '__delitem__', '__dict__',
'__div__', '__divmod__', '__doc__', '__eq__', '__float__', '__floordiv__',
'__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__',
'__hex__', '__init__', '__int__', '__invert__', '__le__', '__len__',
'__long__', '__lshift__', '__lt__', '__mod__', '__module__', '__mul__',
'__ne__', '__neg__', '__new__', '__nonzero__', '__oct__', '__or__',
'__pos__', '__pow__', '__radd__', '__rand__', '__rdiv__', '__rdivmod__',
'__reduce__', '__reduce_ex__', '__repr__', '__rfloordiv__', '__rlshift__',
'__rmod__', '__rmul__', '__ror__', '__rpow__', '__rrshift__', '__rshift__',
'__rsub__', '__rtruediv__', '__rxor__', '__setattr__', '__setitem__',
'__setstate__', '__str__', '__sub__', '__truediv__', '__weakref__',
'__xor__', 'all', 'any', 'argmax', 'argmin', 'argsort', 'astype', 'base',
'byteswap', 'choose', 'clip', 'compress', 'conj', 'conjugate', 'copy',
'cumprod', 'cumsum', 'data', 'diagonal', 'dtype', 'dump', 'dumps', 'fill',
'flags', 'flat', 'flatten', 'getfield', 'imag', 'item', 'itemset',
'itemsize', 'max', 'mean', 'min', 'nbytes', 'ndim', 'newbyteorder',
'nonzero', 'prod', 'ptp', 'put', 'ravel', 'real', 'repeat', 'reshape',
'resize', 'round', 'searchsorted', 'setfield', 'setflags', 'shape', 'size',
'sort', 'squeeze', 'std', 'strides', 'sum', 'swapaxes', 'take', 'tofile',
'tolist', 'tostring', 'trace', 'transpose', 'var', 'view']

Is there something more specific you want to do? DG

> Thanks a lot in advance!
> P.

From pgmdevlist at gmail.com Thu Dec 17 17:11:03 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 17 Dec 2009 17:11:03 -0500 Subject: [Numpy-discussion] np.void from 0d array + subclassing In-Reply-To: <200912171616.25300.faltet@pytables.org> References: <2E60EE61-F9C1-4074-A207-BF2CBDCCCF4A@gmail.com> <200912171616.25300.faltet@pytables.org> Message-ID: On Dec 17, 2009, at 10:16 AM, Francesc Alted wrote:

> A Thursday 17 December 2009 15:16:29 Pierre GM escrigué:
>> All,
>> * What is the most efficient way to get a np.void object from a 0d
>> structured ndarray?
>
> I normally use the `PyArray_GETITEM` C macro for general n-d structured
> arrays. I suppose that this will work with 0-d arrays too.

Francesc, you're overestimating my knowledge of C... Can we stick to the Python implementation?
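(To make the setting concrete, a small sketch with an invented field layout:

    import numpy as np
    a = np.array((1, 2.5), dtype=[('x', int), ('y', float)])
    print(a.shape)                # () -- no axes, so a[0] raises IndexError
    print(type(a.reshape(1)[0]))  # numpy.void, via the workaround below

)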
Here's the catch: IIUC, each individual element of a nD structured array is a void, provided the element can be accessed, i.e. that n>0. A 0D array cannot be indexed, so I don't know how to capture the object below. The sad trick I found was to do a .reshape(1)[0], but that looks really overkill...

> The standard way (more or less) works for me:
>
> >>> class myvoidclass(np.void):
> ...     pass
> ...

David, what do you do w/ the __new__ of myvoidclass? Just an empty class doesn't help me much, 'm'fraid.

From d.l.goldsmith at gmail.com Thu Dec 17 18:27:16 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Thu, 17 Dec 2009 15:27:16 -0800 Subject: [Numpy-discussion] np.void from 0d array + subclassing In-Reply-To: References: <2E60EE61-F9C1-4074-A207-BF2CBDCCCF4A@gmail.com> <200912171616.25300.faltet@pytables.org> Message-ID: <45d1ab480912171527k525a1022v7a032a26d0b371ac@mail.gmail.com> On Thu, Dec 17, 2009 at 2:11 PM, Pierre GM wrote:

> Francesc, you're overestimating my knowledge of C... Can we stick to the Python implementation?
> Here's the catch: IIUC, each individual element of a nD structured array is a void, provided
> the element can be accessed, i.e. that n>0. A 0D array cannot be indexed, so I don't know how to

Unless someone can explain why it isn't, this sounds like an API inconsistency, which in turn I would characterize as a bug. But others may disagree and/or explain it away...

> capture the object below. The sad trick I found was to do a .reshape(1)[0], but that looks
> really overkill...
> [clip]
> David, what do you do w/ the __new__ of myvoidclass? Just an empty class doesn't help me much,
> 'm'fraid.

Presumably, whatever you want (i.e., override it, calling the base class constructor inside your __new__ if/when needed) - I've never done this, so I have no reason to believe it would/should behave any differently than any other Python subclass; your question merely provoked me to check to see if the normal subclassing syntax did not work for some reason, and since I found that it did, I thought I'd post that result as a "data point". Now, if you're generally unfamiliar (but it doesn't sound like you are) with what to do with a subclass' __new__, I'm sure someone else can more easily point you to a reference for that issue. Is there some reason you believe you have to override __new__ differently in your use-case? DG

From robert.kern at gmail.com Thu Dec 17 18:35:26 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 17 Dec 2009 17:35:26 -0600 Subject: [Numpy-discussion] np.void from 0d array + subclassing In-Reply-To: References: <2E60EE61-F9C1-4074-A207-BF2CBDCCCF4A@gmail.com> <200912171616.25300.faltet@pytables.org> Message-ID: <3d375d730912171535qd23fe5pa63b178827349f00@mail.gmail.com> On Thu, Dec 17, 2009 at 16:11, Pierre GM wrote:

> On Dec 17, 2009, at 10:16 AM, Francesc Alted wrote:
>> A Thursday 17 December 2009 15:16:29 Pierre GM escrigué:
>>> All,
>>> * What is the most efficient way to get a np.void object from a 0d
>>> structured ndarray?
>>
>> I normally use the `PyArray_GETITEM` C macro for general n-d structured
>> arrays. I suppose that this will work with 0-d arrays too.
>
> Francesc, you're overestimating my knowledge of C... Can we stick to the Python implementation?
> Here's the catch: IIUC, each individual element of a nD structured array is a void, provided
> the element can be accessed, i.e. that n>0. A 0D array cannot be indexed, so I don't know how
> to capture the object below. The sad trick I found was to do a .reshape(1)[0], but that looks
> really overkill...

a[()]

-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From pgmdevlist at gmail.com Thu Dec 17 18:41:37 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 17 Dec 2009 18:41:37 -0500 Subject: [Numpy-discussion] np.void from 0d array + subclassing In-Reply-To: <3d375d730912171535qd23fe5pa63b178827349f00@mail.gmail.com> References: <2E60EE61-F9C1-4074-A207-BF2CBDCCCF4A@gmail.com> <200912171616.25300.faltet@pytables.org> <3d375d730912171535qd23fe5pa63b178827349f00@mail.gmail.com> Message-ID: <553421DB-144F-4E0D-86EB-0EE7BE54D8C9@gmail.com> On Dec 17, 2009, at 6:35 PM, Robert Kern wrote:

> On Thu, Dec 17, 2009 at 16:11, Pierre GM wrote:
>> [clip: 0d structured array question quoted above]
>
> a[()]

Well, that's slick and really neat!!! Typically Robert K's... Thanks a lot!!! And would you have anything as cool for the reverse operation (from np.void to 0d array)?

From robert.kern at gmail.com Thu Dec 17 18:46:07 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 17 Dec 2009 17:46:07 -0600 Subject: [Numpy-discussion] np.void from 0d array + subclassing In-Reply-To: <553421DB-144F-4E0D-86EB-0EE7BE54D8C9@gmail.com> References: <2E60EE61-F9C1-4074-A207-BF2CBDCCCF4A@gmail.com> <200912171616.25300.faltet@pytables.org> <3d375d730912171535qd23fe5pa63b178827349f00@mail.gmail.com> <553421DB-144F-4E0D-86EB-0EE7BE54D8C9@gmail.com> Message-ID: <3d375d730912171546t7f4e82a7gd1d7ec601d44049@mail.gmail.com> On Thu, Dec 17, 2009 at 17:41, Pierre GM wrote:

> Well, that's slick and really neat!!! Typically Robert K's... Thanks a lot!!!
> And would you have anything as cool for the reverse operation (from np.void to 0d array)?

a = np.array(v)

-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco From d.l.goldsmith at gmail.com Thu Dec 17 19:03:28 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Thu, 17 Dec 2009 16:03:28 -0800 Subject: [Numpy-discussion] np.void from 0d array + subclassing In-Reply-To: <3d375d730912171546t7f4e82a7gd1d7ec601d44049@mail.gmail.com> References: <2E60EE61-F9C1-4074-A207-BF2CBDCCCF4A@gmail.com> <200912171616.25300.faltet@pytables.org> <3d375d730912171535qd23fe5pa63b178827349f00@mail.gmail.com> <553421DB-144F-4E0D-86EB-0EE7BE54D8C9@gmail.com> <3d375d730912171546t7f4e82a7gd1d7ec601d44049@mail.gmail.com> Message-ID: <45d1ab480912171603i35cf2e92g4afe0957cc6d1b01@mail.gmail.com> On Thu, Dec 17, 2009 at 3:46 PM, Robert Kern wrote: > On Thu, Dec 17, 2009 at 17:41, Pierre GM wrote: >> Well, that's slick and really neat !!! Typically Robert K's... Thanks a lot !!! Yeah, sometimes I think the only reason we have a mailing list is so that we're not all just emailing Robert all the time... ;-) DG From sierra_mtnview at sbcglobal.net Fri Dec 18 00:39:36 2009 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Thu, 17 Dec 2009 21:39:36 -0800 Subject: [Numpy-discussion] [AstroPy] Rotating and Transforming Vectors--Flight Path of a Celestial Body In-Reply-To: References: <4B2ADC00.307@sbcglobal.net> Message-ID: <4B2B1598.2060900@sbcglobal.net> It's starting to come back to me. I found a few old graphics books that get into transformation matrices and such. Yes, earth centered. I ground out some code with geometry and trig that at least gets the first point on the path right. I think I can probably apply a rotation around the x-axis several times with a 1/2 degree rotation for each step towards the north to get the job done. I'm still fairly fresh with Python, and found a little bit of info in a Tentative numpy tutorial. I found this on getting started with matrices: from numpy import matrix Apparently matrix is a class in numpy, and there are many others, linalg I think is one. How does one find all the classes, and are there rules that keep them apart. It was tempting to add import numpy in addition to the above, but how is what is in numpy different that the classes? Is numpy solely class driven? That is, if one does a from as above to get all the classes, does it give all the capabilities that just using import numpy does? Anne Archibald wrote: > 2009/12/17 Wayne Watson : > >> I'm just getting used to the math and numpy library, and have begun >> working on a problem of the following sort. >> >> Two observing stations are equidistant, 1/2 degree, on either side of a >> line of longitude of 90 deg west, and both are at 45 deg latitude. Given >> the earth's circumference as 25020 miles, a meteor is 60 miles above the >> point between the two sites. That is, if you were standing at long >> 90deg and lat 45 deg, the meteor would be that high above you. 70 miles >> along the long line is 1 degree, so the stations are 70 miles apart. I >> want to know the az and el of the meteor from each station. With some >> geometry and trig, I've managed to get that first point; however, I can >> see moving the meteor say, 1/2 deg, along its circular path towards the >> north pole is going to require more pen and pencil work to get the az/el >> for it. >> >> Long ago in a faraway time, I used to do this stuff. It should be easy >> to rotate the vector to the first point 1/2 deg northward, and find the >> vector there, then compute the new az and el from each station. Maybe. >> I'm just beginning to look at the matrix and vector facilities in numpy. 
>> Maybe someone can comment on how this should be done, and steer me >> towards what I need to know in numpy. >> > > You may find that the problem grows drastically easier if you work as > much as possible in so-called earth-centered earth-fixed coordinates > (sometimes called XYZ) coordinates. These are a rectilinear coordinate > system that rotates with the earth, with the Z axis through the north > pole and the X axis through the equator at the Greenwich meridian. > It's kind of horrible for getting altitudes, since the Earth is sort > of pear-shaped, but it makes the 3D geometry much simpler. > > If you don't go this route, I'd recommend picking one station and > defining a rectilinear coordinate system based on its north and > vertical vectors. The north and vertical vectors of the other station > will be at somewhat funny angles (unless you can get away with > treating the Earth as flat between the two), but whatever rectilinear > coordinates you choose, a dot product lets you calculate vector > lengths and angles between them, and a cross product lets you build > vectors orthogonal to a given pair. So, for example, if your station > has north vector N and up vector U, you can get its east vector as > E=cross(N,U) (then normalize it); if you want to convert an absolute > north to a local north (i.e. one that is horizontal) you can do > N=cross(E,U) (and normalize it). Then you can get the azimuth of a > vector V using dot(N,V)/sqrt(dot(V,V)) and dot(E,V)/sqrt(dot(V,V)). > > > Anne > > -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "... humans'innate skills with numbers isn't much better than that of rats and dolphins." -- Stanislas Dehaene, neurosurgeon Web Page: From peridot.faceted at gmail.com Fri Dec 18 01:14:18 2009 From: peridot.faceted at gmail.com (Anne Archibald) Date: Fri, 18 Dec 2009 01:14:18 -0500 Subject: [Numpy-discussion] [AstroPy] Rotating and Transforming Vectors--Flight Path of a Celestial Body In-Reply-To: <4B2B1598.2060900@sbcglobal.net> References: <4B2ADC00.307@sbcglobal.net> <4B2B1598.2060900@sbcglobal.net> Message-ID: 2009/12/18 Wayne Watson : > It's starting to come back to me. I found a few old graphics books that > get into transformation matrices and such. Yes, earth centered. I ground > out some code with geometry and trig that at least gets the first point > on the path right. I think I can probably apply a rotation around the > x-axis several times with a 1/2 degree rotation for each step towards > the north to get the job done. > > I'm still fairly fresh with Python, and found a little bit of info in a > Tentative numpy tutorial. I found this on getting started with matrices: > > from numpy import matrix > > Apparently matrix is a class in numpy, and there are many others, linalg > I think is one. How > does one find all the classes, and are there rules that keep them apart. > It was tempting to add > import numpy in addition to the above, but how is what is in numpy > different that the classes? > Is numpy solely class driven? That is, if one does a from as above to > get all the classes, does > it give all the capabilities that just using import numpy does? Many things in python are classes; a class is a way of attaching relevant functions to a collection of data (more precisely, a class is a *type*, defining the interpretation of data; usually they also carry a collection of functions to operate on that data). 
So the central feature of numpy is a class, ndarray, that represents a collection of values of homogeneous type. This may be one, two, or many-dimensional, and there are various operations, including linear algebra, on them available in numpy. The matrix class is a special kind of ndarray in which a number of modifications have been made. In particular, the * operator no longer does element-wise operations, it does matrix multiplication. There are also various rules designed to ensure that matrix objects are always two-dimensional. I avoid matrix objects like the plague, but some people find them useful. numpy.linalg is an entirely different beast. It is a *module*, a collection of functions (and potentially objects and classes). It is like sys or os: you import it and the functions, objects and classes it contains become available. This is a basic feature of python. What is unusual (but not unique) is that rather than having to explicitly import it like: import numpy import numpy.linalg numpy.linalg.svd(numpy.ones((3,2))) numpy automatically imports it for you, every time. This is done for historical reasons and won't change, but is a wart. For your purposes, I recommend simply using numpy arrays - three-element arrays for vectors, three-by-three for matrices - and using the linear algebra functions numpy provides to act on them. For example, dot does matrix-matrix, matrix-vector, and vector-vector multiplication. Anne P.S. you can usually write out a rotation explicitly, e.g. as [[cos(t), sin(t), 0], [-sin(t), cos(t), 0], [0, 0, 1]] but if you want a more general one I believe there's a clever way to make it using two reflections. -A From denis-bz-py at t-online.de Fri Dec 18 05:49:05 2009 From: denis-bz-py at t-online.de (denis) Date: Fri, 18 Dec 2009 11:49:05 +0100 Subject: [Numpy-discussion] [AstroPy] Rotating and Transforming Vectors--Flight Path of a Celestial Body In-Reply-To: References: <4B2ADC00.307@sbcglobal.net> <4B2B1598.2060900@sbcglobal.net> Message-ID: Fyinfo, http://code.google.com/p/geometry-simple has classes "Point","Line","Plane","Movement", with methods points moved distance_to angle_to midpoint_to ... It's not all you want (~ 350 lines, uses straight numpy, would benefit from an expert eye (dot = inner ? math. ?)) but has a clean api. @Anne, excellent description; can you find time to work up your notes to a say 5-page intro ? Would sell like hotcakes cheers -- denis From bsouthey at gmail.com Fri Dec 18 12:15:34 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 18 Dec 2009 11:15:34 -0600 Subject: [Numpy-discussion] test_multiarray.TestIO.test_ascii segmentation fault with Python2.7 In-Reply-To: References: <4B27AB80.4070903@gmail.com> <4B27BAF3.2040306@gmail.com> Message-ID: <4B2BB8B6.3040003@gmail.com> On 12/15/2009 10:59 AM, Charles R Harris wrote: > > > On Tue, Dec 15, 2009 at 9:51 AM, Pauli Virtanen > wrote: > > Tue, 15 Dec 2009 10:36:03 -0600, Bruce Southey wrote: > [clip] > > Program received signal SIGSEGV, Segmentation fault. 
> > [clip: gdb backtrace quoted earlier in the thread]
> >
> > Looks like it's trying to raise a deprecation warning after
> > PyArray_FromString has released the GIL. So that was the reason why it
> > caused a segfault also in 3.1.
> >
> > PyOS_ascii_strtod was deprecated in 2.7 and in 3.1. Apparently, we now
> > *must* do something like
> >
> > #if PY_VERSION_HEX >= 0x02060000
> >     return PyOS_string_to_double(s, endptr, NULL);
> > #else
> >     return PyOS_ascii_strtod(s, endptr);
> > #endif
> >
> > everywhere the function is used.
> >
> > It also seems that this needs to be backported to Numpy 1.4.x...
> >
> > (Note to self: this is also the origin of the crashes in scipy/
> > lambertw... the GIL must be reacquired before raising any warnings.)
>
> Would it be appropriate to put macros for all these in config.h or
> some other common spot? Having all the python version dependencies in
> one spot might make it easier to keep current. I've been thinking of
> moving the numpy deprecation macro for that reason.
>
> Chuck

There are two places that appear to use PyOS_ascii_strtod in numpy:

1) core/SConscript - 'Define the function PyOS_ascii_strod if not available'. There is no definition for Python 3K and I do not know the purpose of this function.

2) core/src/multiarray/numpyos.c - defines NumPyOS_ascii_strtod: here the usage is already within an '#if defined(NPY_PY3K)' for Python 3K.

In the numpyos.c with numpy '1.5.0.dev8019' I replaced the NPY_PY3K with:

#if PY_VERSION_HEX >= 0x02070000

In Python 2.6 all the tests passed. But for Python 2.7 I got the following errors.

======================================================================
ERROR: test_buffer_hashlib (test_regression.TestRegression)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/numpy/core/tests/test_regression.py", line 1255, in test_buffer_hashlib
    assert_equal(md5(x).hexdigest(), '2a1dd1e1e59d0a384c26951e316cd7e6')
TypeError: object supporting the buffer API required

======================================================================
FAIL: Check formatting of complex types.
---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.7/site-packages/nose/case.py", line 183, in runTest self.test(*self.arg) File "/usr/local/lib/python2.7/site-packages/numpy/core/tests/test_print.py", line 61, in check_complex_type err_msg='Failed str formatting for type %s' % tp) File "/usr/local/lib/python2.7/site-packages/numpy/testing/utils.py", line 305, in assert_equal raise AssertionError(msg) AssertionError: Items are not equal: Failed str formatting for type ACTUAL: '-1j' DESIRED: '(-0-1j)' ====================================================================== FAIL: Check formatting of complex types. ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.7/site-packages/nose/case.py", line 183, in runTest self.test(*self.arg) File "/usr/local/lib/python2.7/site-packages/numpy/core/tests/test_print.py", line 61, in check_complex_type err_msg='Failed str formatting for type %s' % tp) File "/usr/local/lib/python2.7/site-packages/numpy/testing/utils.py", line 305, in assert_equal raise AssertionError(msg) AssertionError: Items are not equal: Failed str formatting for type ACTUAL: '-1j' DESIRED: '(-0-1j)' ====================================================================== FAIL: Check formatting of complex types. ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.7/site-packages/nose/case.py", line 183, in runTest self.test(*self.arg) File "/usr/local/lib/python2.7/site-packages/numpy/core/tests/test_print.py", line 61, in check_complex_type err_msg='Failed str formatting for type %s' % tp) File "/usr/local/lib/python2.7/site-packages/numpy/testing/utils.py", line 305, in assert_equal raise AssertionError(msg) AssertionError: Items are not equal: Failed str formatting for type ACTUAL: '-1j' DESIRED: '(-0-1j)' ---------------------------------------------------------------------- Ran 2494 tests in 6.932s FAILED (KNOWNFAIL=5, errors=1, failures=3) These last ones may be due to: "Another format()-related change: the default precision used for floating-point and complex numbers was changed from 6 decimal places to 12, which matches the precision used by str(). (Changed by Eric Smith; issue 5920.)" http://docs.python.org/dev/whatsnew/2.7.html Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Fri Dec 18 12:46:51 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 18 Dec 2009 09:46:51 -0800 Subject: [Numpy-discussion] Segmentation fault with argsort Message-ID: I am using the numpy 1.3 binary from Ubuntu 9.10. Is this already known, fixed, reproducible? >> np.array(121).argsort(0).argsort(0) Segmentation fault The expected result: AttributeError: 'np.int64' object has no attribute 'argsort' From robert.kern at gmail.com Fri Dec 18 12:52:50 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 18 Dec 2009 11:52:50 -0600 Subject: [Numpy-discussion] Segmentation fault with argsort In-Reply-To: References: Message-ID: <3d375d730912180952w3b68dc51o8e5484c3ff345e7a@mail.gmail.com> On Fri, Dec 18, 2009 at 11:46, Keith Goodman wrote: > I am using the numpy 1.3 binary from Ubuntu 9.10. Is this already > known, fixed, reproducible? 
> >>> np.array(121).argsort(0).argsort(0) > Segmentation fault > > The expected result: > > AttributeError: 'np.int64' object has no attribute 'argsort' Why would you expect that? On OS X with an SVN checkout ~1.4: In [1]: np.array(121).argsort(0).argsort(0) Out[1]: 0 In [6]: np.int64.argsort Out[6]: -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jsseabold at gmail.com Fri Dec 18 12:57:00 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Fri, 18 Dec 2009 12:57:00 -0500 Subject: [Numpy-discussion] Segmentation fault with argsort In-Reply-To: <3d375d730912180952w3b68dc51o8e5484c3ff345e7a@mail.gmail.com> References: <3d375d730912180952w3b68dc51o8e5484c3ff345e7a@mail.gmail.com> Message-ID: On Fri, Dec 18, 2009 at 12:52 PM, Robert Kern wrote: > On Fri, Dec 18, 2009 at 11:46, Keith Goodman wrote: >> I am using the numpy 1.3 binary from Ubuntu 9.10. Is this already >> known, fixed, reproducible? >> >>>> np.array(121).argsort(0).argsort(0) >> Segmentation fault >> >> The expected result: >> >> AttributeError: 'np.int64' object has no attribute 'argsort' > > Why would you expect that? On OS X with an SVN checkout ~1.4: > > In [1]: np.array(121).argsort(0).argsort(0) > Out[1]: 0 > > In [6]: np.int64.argsort > Out[6]: > Kubuntu 9.10 In [1]: import numpy as np In [2]: np.__version__ Out[2]: '1.4.0.dev7539' In [3]: np.array(121).argsort(0).argsort(0) Segmentation fault -Skipper From robert.kern at gmail.com Fri Dec 18 13:00:12 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 18 Dec 2009 12:00:12 -0600 Subject: [Numpy-discussion] Segmentation fault with argsort In-Reply-To: References: <3d375d730912180952w3b68dc51o8e5484c3ff345e7a@mail.gmail.com> Message-ID: <3d375d730912181000q7c363a72t1e6c5091214aa655@mail.gmail.com> On Fri, Dec 18, 2009 at 11:57, Skipper Seabold wrote: > On Fri, Dec 18, 2009 at 12:52 PM, Robert Kern wrote: >> On Fri, Dec 18, 2009 at 11:46, Keith Goodman wrote: >>> I am using the numpy 1.3 binary from Ubuntu 9.10. Is this already >>> known, fixed, reproducible? >>> >>>>> np.array(121).argsort(0).argsort(0) >>> Segmentation fault >>> >>> The expected result: >>> >>> AttributeError: 'np.int64' object has no attribute 'argsort' >> >> Why would you expect that? On OS X with an SVN checkout ~1.4: >> >> In [1]: np.array(121).argsort(0).argsort(0) >> Out[1]: 0 >> >> In [6]: np.int64.argsort >> Out[6]: >> > > Kubuntu 9.10 > > In [1]: import numpy as np > > In [2]: np.__version__ > Out[2]: '1.4.0.dev7539' > > In [3]: np.array(121).argsort(0).argsort(0) > Segmentation fault Can you give us a gdb backtrace? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From kwgoodman at gmail.com Fri Dec 18 13:01:19 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 18 Dec 2009 10:01:19 -0800 Subject: [Numpy-discussion] Segmentation fault with argsort In-Reply-To: <3d375d730912180952w3b68dc51o8e5484c3ff345e7a@mail.gmail.com> References: <3d375d730912180952w3b68dc51o8e5484c3ff345e7a@mail.gmail.com> Message-ID: On Fri, Dec 18, 2009 at 9:52 AM, Robert Kern wrote: > On Fri, Dec 18, 2009 at 11:46, Keith Goodman wrote: >> I am using the numpy 1.3 binary from Ubuntu 9.10. Is this already >> known, fixed, reproducible? 
>> >>>> np.array(121).argsort(0).argsort(0) >> Segmentation fault >> >> The expected result: >> >> AttributeError: 'np.int64' object has no attribute 'argsort' > > Why would you expect that? On OS X with an SVN checkout ~1.4: > > In [1]: np.array(121).argsort(0).argsort(0) > Out[1]: 0 > > In [6]: np.int64.argsort > Out[6]: Oh, I didn't realize numpy scalars had all of the methods of arrays. From charlesr.harris at gmail.com Fri Dec 18 13:01:55 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Dec 2009 11:01:55 -0700 Subject: [Numpy-discussion] Segmentation fault with argsort In-Reply-To: References: <3d375d730912180952w3b68dc51o8e5484c3ff345e7a@mail.gmail.com> Message-ID: On Fri, Dec 18, 2009 at 10:57 AM, Skipper Seabold wrote: > On Fri, Dec 18, 2009 at 12:52 PM, Robert Kern > wrote: > > On Fri, Dec 18, 2009 at 11:46, Keith Goodman > wrote: > >> I am using the numpy 1.3 binary from Ubuntu 9.10. Is this already > >> known, fixed, reproducible? > >> > >>>> np.array(121).argsort(0).argsort(0) > >> Segmentation fault > >> > >> The expected result: > >> > >> AttributeError: 'np.int64' object has no attribute 'argsort' > > > > Why would you expect that? On OS X with an SVN checkout ~1.4: > > > > In [1]: np.array(121).argsort(0).argsort(0) > > Out[1]: 0 > > > > In [6]: np.int64.argsort > > Out[6]: > > > > Kubuntu 9.10 > > In [1]: import numpy as np > > In [2]: np.__version__ > Out[2]: '1.4.0.dev7539' > > In [3]: np.array(121).argsort(0).argsort(0) > Segmentation fault > > I also see that here on ubuntu 9.10, 64 bits. ISTR recall another such issue on ubuntu, which makes me think that there might be a compiler problem. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Dec 18 13:06:30 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 18 Dec 2009 13:06:30 -0500 Subject: [Numpy-discussion] Segmentation fault with argsort In-Reply-To: References: <3d375d730912180952w3b68dc51o8e5484c3ff345e7a@mail.gmail.com> Message-ID: <1cd32cbb0912181006o566da029q5f9451ed9b922fbd@mail.gmail.com> On Fri, Dec 18, 2009 at 1:01 PM, Charles R Harris wrote: > > > On Fri, Dec 18, 2009 at 10:57 AM, Skipper Seabold > wrote: >> >> On Fri, Dec 18, 2009 at 12:52 PM, Robert Kern >> wrote: >> > On Fri, Dec 18, 2009 at 11:46, Keith Goodman >> > wrote: >> >> I am using the numpy 1.3 binary from Ubuntu 9.10. Is this already >> >> known, fixed, reproducible? >> >> >> >>>> np.array(121).argsort(0).argsort(0) >> >> Segmentation fault >> >> >> >> The expected result: >> >> >> >> AttributeError: 'np.int64' object has no attribute 'argsort' >> > >> > Why would you expect that? On OS X with an SVN checkout ~1.4: >> > >> > In [1]: np.array(121).argsort(0).argsort(0) >> > Out[1]: 0 >> > >> > In [6]: np.int64.argsort >> > Out[6]: >> > >> >> Kubuntu 9.10 >> >> In [1]: import numpy as np >> >> In [2]: np.__version__ >> Out[2]: '1.4.0.dev7539' >> >> In [3]: np.array(121).argsort(0).argsort(0) >> Segmentation fault Segmentation fault same here WindowsXP 32, numpy 1.4.0rc1, python 2.5.2 Josef > > I also see that here on ubuntu 9.10, 64 bits. ISTR recall another such issue > on ubuntu, which makes me think that there might be a compiler problem. 
> > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From jsseabold at gmail.com Fri Dec 18 13:07:36 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Fri, 18 Dec 2009 13:07:36 -0500 Subject: [Numpy-discussion] Segmentation fault with argsort In-Reply-To: <3d375d730912181000q7c363a72t1e6c5091214aa655@mail.gmail.com> References: <3d375d730912180952w3b68dc51o8e5484c3ff345e7a@mail.gmail.com> <3d375d730912181000q7c363a72t1e6c5091214aa655@mail.gmail.com> Message-ID: On Fri, Dec 18, 2009 at 1:00 PM, Robert Kern wrote: > > Can you give us a gdb backtrace? > No idea what I'm doing, but I figure I should learn a bit... Does this look right? skipper at linux-desktop:~$ gdb python GNU gdb (GDB) 7.0-ubuntu Copyright (C) 2009 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu". For bug reporting instructions, please see: ... Reading symbols from /usr/bin/python...Reading symbols from /usr/lib/debug/usr/bin/python2.6...done. (no debugging symbols found)...done. (gdb) run argsort_seg.py Starting program: /usr/bin/python argsort_seg.py [Thread debugging using libthread_db enabled] Program received signal SIGSEGV, Segmentation fault. 0x00000000004b499a in _PyArg_ParseTupleAndKeywords_SizeT (args=0x7ffff7f54ad0, keywords=0x7ffff6d89eb0, format=0x7ffff6d9308d "|O&O&O", kwlist=0x7ffff6faa5a0) at ../Python/getargs.c:1409 1409 ../Python/getargs.c: No such file or directory. in ../Python/getargs.c (gdb) From jsseabold at gmail.com Fri Dec 18 13:10:41 2009 From: jsseabold at gmail.com (Skipper Seabold) Date: Fri, 18 Dec 2009 13:10:41 -0500 Subject: [Numpy-discussion] Segmentation fault with argsort In-Reply-To: References: <3d375d730912180952w3b68dc51o8e5484c3ff345e7a@mail.gmail.com> <3d375d730912181000q7c363a72t1e6c5091214aa655@mail.gmail.com> Message-ID: On Fri, Dec 18, 2009 at 1:07 PM, Skipper Seabold wrote: > On Fri, Dec 18, 2009 at 1:00 PM, Robert Kern wrote: >> >> Can you give us a gdb backtrace? >> > > No idea what I'm doing, but I figure I should learn a bit... ?Does > this look right? > > skipper at linux-desktop:~$ gdb python > GNU gdb (GDB) 7.0-ubuntu > Copyright (C) 2009 Free Software Foundation, Inc. > License GPLv3+: GNU GPL version 3 or later > This is free software: you are free to change and redistribute it. > There is NO WARRANTY, to the extent permitted by law. ?Type "show copying" > and "show warranty" for details. > This GDB was configured as "x86_64-linux-gnu". > For bug reporting instructions, please see: > ... > Reading symbols from /usr/bin/python...Reading symbols from > /usr/lib/debug/usr/bin/python2.6...done. > (no debugging symbols found)...done. > (gdb) run argsort_seg.py > Starting program: /usr/bin/python argsort_seg.py > [Thread debugging using libthread_db enabled] > > Program received signal SIGSEGV, Segmentation fault. > 0x00000000004b499a in _PyArg_ParseTupleAndKeywords_SizeT > (args=0x7ffff7f54ad0, keywords=0x7ffff6d89eb0, > ? ?format=0x7ffff6d9308d "|O&O&O", kwlist=0x7ffff6faa5a0) at > ../Python/getargs.c:1409 > 1409 ? ?../Python/getargs.c: No such file or directory. > ? ? ? ?in ../Python/getargs.c > (gdb) > And .... 
(gdb) backtrace #0 0x00000000004b499a in _PyArg_ParseTupleAndKeywords_SizeT (args=0x7ffff7f54ad0, keywords=0x7ffff6d89eb0, format=0x7ffff6d9308d "|O&O&O", kwlist=0x7ffff6faa5a0) at ../Python/getargs.c:1409 #1 0x00007ffff6d7a08a in array_argsort (self=0xaf9af0, args=0x7ffff7f54ad0, kwds=0x7ffff6d89eb0) at numpy/core/src/multiarray/methods.c:1063 #2 0x000000000041d6e7 in PyObject_Call (func=0x7ffff46cd950, arg=0x7ffff6d89eb0, kw=0x7ffff6d9308d) at ../Objects/abstract.c:2492 #3 0x00007ffff6d89952 in gentype_generic_method (self=, args=0x7ffff7f54ad0, kwds=0x7ffff6d89eb0, str=) at numpy/core/src/multiarray/scalartypes.c.src:201 #4 0x00000000004a290d in call_function (f=0x90ed90, throwflag=) at ../Python/ceval.c:3706 #5 PyEval_EvalFrameEx (f=0x90ed90, throwflag=) at ../Python/ceval.c:2389 #6 0x00000000004a40e0 in PyEval_EvalCodeEx (co=0x7ffff7ef0eb8, globals=, locals=, args=0x0, argcount=, kws=, kwcount=0, defs=0x0, defcount=0, closure=0x0) at ../Python/ceval.c:2968 #7 0x00000000004a41b2 in PyEval_EvalCode (co=0x7ffff7f54ad0, globals=0x7ffff6d89eb0, locals=0x7ffff6d9308d) at ../Python/ceval.c:522 #8 0x00000000004c33a0 in run_mod (fp=0x90e230, filename=, start=, globals=, locals=0x8b9270, closeit=1, flags=0x7fffffffe130) at ../Python/pythonrun.c:1335 #9 PyRun_FileExFlags (fp=0x90e230, filename=, start=, globals=, locals=0x8b9270, closeit=1, flags=0x7fffffffe130) at ../Python/pythonrun.c:1321 #10 0x00000000004c3564 in PyRun_SimpleFileExFlags (fp=, filename=0x7fffffffe542 "argsort_seg.py", closeit=1, flags=0x7fffffffe130) at ../Python/pythonrun.c:931 #11 0x0000000000418ab7 in Py_Main (argc=-135384960, argv=) at ../Modules/main.c:599 ---Type to continue, or q to quit--- #12 0x00007ffff6fd0abd in __libc_start_main (main=, argc=, ubp_av=, init=, fini=, rtld_fini=, stack_end=0x7fffffffe248) at libc-start.c:220 #13 0x0000000000417ca9 in _start () at ../sysdeps/x86_64/elf/start.S:113 From charlesr.harris at gmail.com Fri Dec 18 13:32:55 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Dec 2009 11:32:55 -0700 Subject: [Numpy-discussion] Segmentation fault with argsort In-Reply-To: References: Message-ID: On Fri, Dec 18, 2009 at 10:46 AM, Keith Goodman wrote: > I am using the numpy 1.3 binary from Ubuntu 9.10. Is this already > known, fixed, reproducible? > > >> np.array(121).argsort(0).argsort(0) > Segmentation fault > > The expected result: > > AttributeError: 'np.int64' object has no attribute 'argsort' > ___ On an old install of fedora 11 with '1.4.0.dev' I get In [2]: array(121).argsort(0).argsort(0) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) /home/charris/ in () TypeError: function takes at most 3 arguments (175832141 given) In [3]: np.array(121).argsort(0).argsort(0) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) /home/charris/ in () TypeError: function takes at most 3 arguments (175832045 given) Which looks suspicious ;) Now to update that numpy install... and I get the same error. This is python 2.6, but I don't know what minor version it is. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
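For readers following the backtrace above: a minimal Python-level sketch, not from the original thread, of what triggers the crash and one way to sidestep it. The behavior notes assume the unpatched 1.3/1.4-era builds discussed here, and np.atleast_1d is just an editorial workaround, not the fix that went into NumPy.

import numpy as np

a = np.array(121)      # 0-d array, as in Keith's report
# On the affected builds, a.argsort(0) returns a numpy scalar rather than
# an array, and calling .argsort(0) on that scalar dispatches through the
# gentype method table shown above. Because the entry was registered as
# METH_VARARGS while keyword arguments were passed along, the argument
# parser read garbage and segfaulted.

# A defensive workaround is to avoid the scalar path entirely:
safe = np.atleast_1d(a).argsort(0).argsort(0)
print safe             # -> [0]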
From cgohlke at uci.edu Fri Dec 18 13:48:36 2009
From: cgohlke at uci.edu (Christoph Gohlke)
Date: Fri, 18 Dec 2009 10:48:36 -0800
Subject: [Numpy-discussion] Segmentation fault with argsort
In-Reply-To: References:
Message-ID: <4B2BCE84.3080701@uci.edu>

On 12/18/2009 9:46 AM, Keith Goodman wrote:
> I am using the numpy 1.3 binary from Ubuntu 9.10. Is this already
> known, fixed, reproducible?
>
>>> np.array(121).argsort(0).argsort(0)
> Segmentation fault
>
> The expected result:
>
> AttributeError: 'np.int64' object has no attribute 'argsort'

On Windows 7 with Python 2.6.4 and numpy 1.4 built with vc2008 from svn
source: the 64-bit version works, the 32-bit version throws a SystemError:

C:\>python26-x64
Python 2.6.4 (r264:75708, Oct 26 2009, 07:36:50) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.__version__
'1.4.0rc2.dev8016'
>>> numpy.array(121).argsort(0).argsort(0)
0

C:\>python26
Python 2.6.4 (r264:75708, Oct 26 2009, 08:23:19) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.__version__
'1.4.0rc2.dev8016'
>>> numpy.array(121).argsort(0).argsort(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: ..\Python\getargs.c:1413: bad argument to internal function

-- Christoph

From charlesr.harris at gmail.com Fri Dec 18 16:02:39 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 18 Dec 2009 14:02:39 -0700
Subject: [Numpy-discussion] Segmentation fault with argsort
In-Reply-To: References:
Message-ID:

On Fri, Dec 18, 2009 at 10:46 AM, Keith Goodman wrote:
> I am using the numpy 1.3 binary from Ubuntu 9.10. Is this already
> known, fixed, reproducible?
>
> >> np.array(121).argsort(0).argsort(0)
> Segmentation fault
>

The immediate problem is in scalartypes.c.src in these lines

    {"sort",
        (PyCFunction)gentype_sort,
        METH_VARARGS, NULL},
    {"argsort",
        (PyCFunction)gentype_argsort,
        METH_VARARGS, NULL},

But the methods array_{argsort, sort} take keywords and are called that
way. The following version fixes things, but I am not sure it is the
correct fix, the gentype_* functions may need to be fixed instead.

    {"sort",
        (PyCFunction)gentype_sort,
        METH_VARARGS|METH_KEYWORDS, NULL},
    {"argsort",
        (PyCFunction)gentype_argsort,
        METH_VARARGS|METH_KEYWORDS, NULL},

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...

From kwgoodman at gmail.com Fri Dec 18 16:13:48 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Fri, 18 Dec 2009 13:13:48 -0800
Subject: [Numpy-discussion] Segmentation fault with argsort
In-Reply-To: References:
Message-ID:

On Fri, Dec 18, 2009 at 1:02 PM, Charles R Harris wrote:
> On Fri, Dec 18, 2009 at 10:46 AM, Keith Goodman wrote:
>>
>> I am using the numpy 1.3 binary from Ubuntu 9.10. Is this already
>> known, fixed, reproducible?
>>
>> >> np.array(121).argsort(0).argsort(0)
>> Segmentation fault
>>
>
> The immediate problem is in scalartypes.c.src in these lines
>
>     {"sort",
>         (PyCFunction)gentype_sort,
>         METH_VARARGS, NULL},
>     {"argsort",
>         (PyCFunction)gentype_argsort,
>         METH_VARARGS, NULL},
>
> But the methods array_{argsort, sort} take keywords and are called that way.
> The following version fixes things, but I am not sure it is the correct fix,
> the gentype_* functions may need to be fixed instead.
>
>     {"sort",
>         (PyCFunction)gentype_sort,
>         METH_VARARGS|METH_KEYWORDS, NULL},
>     {"argsort",
>         (PyCFunction)gentype_argsort,
>         METH_VARARGS|METH_KEYWORDS, NULL},

Not sure I should have, but I created a ticket:
http://projects.scipy.org/numpy/ticket/1339

From charlesr.harris at gmail.com Fri Dec 18 16:28:08 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 18 Dec 2009 14:28:08 -0700
Subject: [Numpy-discussion] Segmentation fault with argsort
In-Reply-To: References:
Message-ID:

On Fri, Dec 18, 2009 at 2:02 PM, Charles R Harris wrote:
> On Fri, Dec 18, 2009 at 10:46 AM, Keith Goodman wrote:
>
>> I am using the numpy 1.3 binary from Ubuntu 9.10. Is this already
>> known, fixed, reproducible?
>>
>> >> np.array(121).argsort(0).argsort(0)
>> Segmentation fault
>>
>
> The immediate problem is in scalartypes.c.src in these lines
>
>     {"sort",
>         (PyCFunction)gentype_sort,
>         METH_VARARGS, NULL},
>     {"argsort",
>         (PyCFunction)gentype_argsort,
>         METH_VARARGS, NULL},
>
> But the methods array_{argsort, sort} take keywords and are called that
> way. The following version fixes things, but I am not sure it is the correct
> fix, the gentype_* functions may need to be fixed instead.
>
>     {"sort",
>         (PyCFunction)gentype_sort,
>         METH_VARARGS|METH_KEYWORDS, NULL},
>     {"argsort",
>         (PyCFunction)gentype_argsort,
>         METH_VARARGS|METH_KEYWORDS, NULL},

Changing the generated function gentype_argsort to pass NULL for the
keyword argument also fixes the problem. This may be the correct fix, but
in terms of avoiding special cases it may be better to have it take the
normal run of keywords. Note that view doesn't take keywords so a dtype
can't be passed, making it kind of useless. I think the whole method
section for types could use an audit.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...

From sierra_mtnview at sbcglobal.net Fri Dec 18 16:51:33 2009
From: sierra_mtnview at sbcglobal.net (Wayne Watson)
Date: Fri, 18 Dec 2009 13:51:33 -0800
Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays?
Message-ID: <4B2BF965.5010107@sbcglobal.net>

Is it possible to calculate a dot product in numpy by either notation
(a ^ b, where ^ is a possible notation) or by calling a dot function
(dot(a,b))? I'm trying to use a column matrix for both "vectors".
Perhaps, I need to somehow change them to arrays?

--
Wayne Watson (Watson Adventures, Prop., Nevada City, CA)
(121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time)
Obz Site: 39° 15' 7" N, 121° 2' 32" W, 2700 feet

"... humans' innate skills with numbers isn't much
better than that of rats and dolphins."
-- Stanislas Dehaene, neurosurgeon

Web Page:

From kwgoodman at gmail.com Fri Dec 18 16:57:48 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Fri, 18 Dec 2009 13:57:48 -0800
Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays?
In-Reply-To: <4B2BF965.5010107@sbcglobal.net>
References: <4B2BF965.5010107@sbcglobal.net>
Message-ID:

On Fri, Dec 18, 2009 at 1:51 PM, Wayne Watson wrote:
> Is it possible to calculate a dot product in numpy by either notation
> (a ^ b, where ^ is a possible notation) or by calling a dot function
> (dot(a,b))? I'm trying to use a column matrix for both "vectors".
> Perhaps, I need to somehow change them to arrays?

Does this do what you want?
>> x matrix([[1], [2], [3]]) >> x.T * x matrix([[14]]) >> np.dot(x.T,x) matrix([[14]]) From dwf at cs.toronto.edu Fri Dec 18 17:17:49 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Fri, 18 Dec 2009 17:17:49 -0500 Subject: [Numpy-discussion] Proposal for matrix_rank function in numpy In-Reply-To: <20091216191612.GA18266@phare.normalesup.org> References: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com> <3d375d730912151145g2023978ev294215a99362de87@mail.gmail.com> <1e2af89e0912151216m564a2bf3ha098085daf2d50a2@mail.gmail.com> <1e2af89e0912161113s6bcd8dbu560ab9df950d362d@mail.gmail.com> <20091216191612.GA18266@phare.normalesup.org> Message-ID: <3438E08E-C453-422F-9D6B-776CA413829D@cs.toronto.edu> Hi Gael, On 16-Dec-09, at 2:16 PM, Gael Varoquaux wrote: > I was under the impression that we should > direct users who have linalg problems to scipy, as it can do much > more. I agree about pushing users in that direction, but I think that's mostly a consequence of all the wrapped Fortran routines that already exist there. If this thing can be implemented in a short Python function I don't see the harm in having it in NumPy. David From sierra_mtnview at sbcglobal.net Fri Dec 18 17:51:25 2009 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Fri, 18 Dec 2009 14:51:25 -0800 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: References: <4B2BF965.5010107@sbcglobal.net> Message-ID: <4B2C076D.3050906@sbcglobal.net> That should do it. Thanks. How do I get the scalar result by itself? Keith Goodman wrote: > On Fri, Dec 18, 2009 at 1:51 PM, Wayne Watson > wrote: > >> Is it possible to calculate a dot product in numpy by either notation >> (a ^ b, where ^ is a possible notation) or calling a dot function >> (dot(a,b)? I'm trying to use a column matrix for both "vectors". >> Perhaps, I need to somehow change them to arrays? >> > > Does this do what you want? > > >>> x >>> > matrix([[1], > [2], > [3]]) > > >>> x.T * x >>> > matrix([[14]]) > >>> np.dot(x.T,x) >>> > matrix([[14]]) > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "... humans'innate skills with numbers isn't much better than that of rats and dolphins." -- Stanislas Dehaene, neurosurgeon Web Page: From kwgoodman at gmail.com Fri Dec 18 17:54:49 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 18 Dec 2009 14:54:49 -0800 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: <4B2C076D.3050906@sbcglobal.net> References: <4B2BF965.5010107@sbcglobal.net> <4B2C076D.3050906@sbcglobal.net> Message-ID: On Fri, Dec 18, 2009 at 2:51 PM, Wayne Watson wrote: > That should do it. Thanks. How do I get the scalar result by itself? 
>> np.dot(x.T,x)[0,0] 14 or >> x = np.array([1,2,3]) >> np.dot(x,x) 14 From oliphant at enthought.com Fri Dec 18 17:58:35 2009 From: oliphant at enthought.com (Travis Oliphant) Date: Fri, 18 Dec 2009 16:58:35 -0600 Subject: [Numpy-discussion] fromfile can segfault if data is corrupted In-Reply-To: References: <4B27D35B.80406@stsci.edu> Message-ID: <4938389B-74E4-4CE5-B95F-2599DA3BF1BD@enthought.com> On Dec 15, 2009, at 12:28 PM, Charles R Harris wrote: > > > On Tue, Dec 15, 2009 at 11:20 AM, Michael Droettboom > wrote: > I just discovered a bug in fromfile where it can segfault if the > file data is corrupted in such a way that the array size is insanely > large. (It was a byte-swapping problem in my own code, but it would > be preferable to get an exception rather than a crash). > > It's a simple fix to propagate the "array too large" exception > before trying to dereference the NULL array pointer (ret) in > PyArray_FromFile (see attached patch). But my question is: is this > an appropriate fix for 1.4 (it seems pretty straightforward), or > should I only make this to the trunk? > > > David can weigh in here, but I think you should backport it. It's a > bugfix, small, and there is going to be another rc. > > On the other hand, Travis should stop backporting new functionality. And Chuck should stop making unrelated jabs.... I spoke with David C about making the change at SciPy India. It doesn't break any code and makes the datetime stuff in 1.4 more usable. In my mind datetime improvements are fair game for 1.4.0 until the release comes out. Or is there something else you are upset about and want to bring up on a public forum ? -Travis -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Dec 18 18:04:55 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Dec 2009 16:04:55 -0700 Subject: [Numpy-discussion] fromfile can segfault if data is corrupted In-Reply-To: <4938389B-74E4-4CE5-B95F-2599DA3BF1BD@enthought.com> References: <4B27D35B.80406@stsci.edu> <4938389B-74E4-4CE5-B95F-2599DA3BF1BD@enthought.com> Message-ID: On Fri, Dec 18, 2009 at 3:58 PM, Travis Oliphant wrote: > > On Dec 15, 2009, at 12:28 PM, Charles R Harris wrote: > > > > On Tue, Dec 15, 2009 at 11:20 AM, Michael Droettboom wrote: > >> I just discovered a bug in fromfile where it can segfault if the file data >> is corrupted in such a way that the array size is insanely large. (It was a >> byte-swapping problem in my own code, but it would be preferable to get an >> exception rather than a crash). >> >> It's a simple fix to propagate the "array too large" exception before >> trying to dereference the NULL array pointer (ret) in PyArray_FromFile (see >> attached patch). But my question is: is this an appropriate fix for 1.4 (it >> seems pretty straightforward), or should I only make this to the trunk? >> >> > David can weigh in here, but I think you should backport it. It's a bugfix, > small, and there is going to be another rc. > > On the other hand, Travis should stop backporting new functionality. > > > > And Chuck should stop making unrelated jabs.... > > I spoke with David C about making the change at SciPy India. It doesn't > break any code and makes the datetime stuff in 1.4 more usable. In my mind > datetime improvements are fair game for 1.4.0 until the release comes out. > > > Or is there something else you are upset about and want to bring up on a > public forum ? > > Yes, backporting new code to a release candidate. 
If David signed off on it, then you should mention that in the commit comment. Chuck > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sierra_mtnview at sbcglobal.net Fri Dec 18 18:22:26 2009 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Fri, 18 Dec 2009 15:22:26 -0800 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: References: <4B2BF965.5010107@sbcglobal.net> <4B2C076D.3050906@sbcglobal.net> Message-ID: <4B2C0EB2.5050504@sbcglobal.net> Very good. Is there a scalar product in numpy? Keith Goodman wrote: > On Fri, Dec 18, 2009 at 2:51 PM, Wayne Watson > wrote: > >> That should do it. Thanks. How do I get the scalar result by itself? >> > > >>> np.dot(x.T,x)[0,0] >>> > 14 > > or > > >>> x = np.array([1,2,3]) >>> np.dot(x,x) >>> > 14 > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "... humans'innate skills with numbers isn't much better than that of rats and dolphins." -- Stanislas Dehaene, neurosurgeon Web Page: From kwgoodman at gmail.com Fri Dec 18 18:31:24 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 18 Dec 2009 15:31:24 -0800 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: <4B2C0EB2.5050504@sbcglobal.net> References: <4B2BF965.5010107@sbcglobal.net> <4B2C076D.3050906@sbcglobal.net> <4B2C0EB2.5050504@sbcglobal.net> Message-ID: On Fri, Dec 18, 2009 at 3:22 PM, Wayne Watson wrote: > Is there a scalar product in numpy? Isn't that the same thing as a dot product? np.dot doesn't do what you want? From sierra_mtnview at sbcglobal.net Fri Dec 18 18:40:56 2009 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Fri, 18 Dec 2009 15:40:56 -0800 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: References: <4B2BF965.5010107@sbcglobal.net> <4B2C076D.3050906@sbcglobal.net> <4B2C0EB2.5050504@sbcglobal.net> Message-ID: <4B2C1308.900@sbcglobal.net> Well, they aren't quite the same. If a is the length of A, and b is the length of B, then a*b = A dot B* cos (theta). I'm still not familiar enough with numpy or math to know if there's some function that will produce a from A. It's easy enough to do, a = A(0)**2 + ..., but I would like to think it's a common enough need that there would be something available like sumsq(). Keith Goodman wrote: > On Fri, Dec 18, 2009 at 3:22 PM, Wayne Watson > wrote: > >> Is there a scalar product in numpy? >> > > Isn't that the same thing as a dot product? np.dot doesn't do what you want? > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "... humans'innate skills with numbers isn't much better than that of rats and dolphins." -- Stanislas Dehaene, neurosurgeon Web Page: From d.l.goldsmith at gmail.com Fri Dec 18 18:48:22 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Fri, 18 Dec 2009 15:48:22 -0800 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? 
In-Reply-To: <4B2C1308.900@sbcglobal.net> References: <4B2BF965.5010107@sbcglobal.net> <4B2C076D.3050906@sbcglobal.net> <4B2C0EB2.5050504@sbcglobal.net> <4B2C1308.900@sbcglobal.net> Message-ID: <45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com> On Fri, Dec 18, 2009 at 3:40 PM, Wayne Watson wrote: > Well, they aren't quite the same. If a is the length of A, and b is the > length of B, then a*b = A dot B* cos (theta). ?I'm still not familiar > enough with numpy or math to know if there's some function that will > produce a from A. It's easy enough to do, a = A(0)**2 + ..., but I would > like to think it's a common enough need that there would be something > available like sumsq(). In your usage, dot product and scalar product are synonymous: a = sqrt(A dot A) There are some contexts in which "scalar" product and "dot" product don't mean exactly the same thing (e.g., tensors, where "dot" is typically synonymous w/ "inner," which, in the general case, does not result in a scalar, or a multiplication-like functional where a function is mapped to a scalar, in which context we typically - but not uniformly - do not describe the product as a dot product) but unless you're working in one of those advanced contexts, scalar and dot are typically used interchangeably. In particular, IIUC, in NumPy, unless your using it to calculate a tensor product that doesn't result in a scalar, dot and scalar product are synonymous. DG > > > Keith Goodman wrote: >> On Fri, Dec 18, 2009 at 3:22 PM, Wayne Watson >> wrote: >> >>> Is there a scalar product in numpy? >>> >> >> Isn't that the same thing as a dot product? np.dot doesn't do what you want? >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > -- > ? ? ? ? ? Wayne Watson (Watson Adventures, Prop., Nevada City, CA) > > ? ? ? ? ? ? (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) > ? ? ? ? ? ? ?Obz Site: ?39? 15' 7" N, 121? 2' 32" W, 2700 feet > > ? ? ? ? ? ? "... humans'innate skills with numbers isn't much > ? ? ? ? ? ? ?better than that of rats and dolphins." > ? ? ? ? ? ? ? ? ? ? ? -- Stanislas Dehaene, neurosurgeon > > ? ? ? ? ? ? ? ? ? ?Web Page: > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sierra_mtnview at sbcglobal.net Fri Dec 18 19:12:38 2009 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Fri, 18 Dec 2009 16:12:38 -0800 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: <45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com> References: <4B2BF965.5010107@sbcglobal.net> <4B2C076D.3050906@sbcglobal.net> <4B2C0EB2.5050504@sbcglobal.net> <4B2C1308.900@sbcglobal.net> <45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com> Message-ID: <4B2C1A76.3000202@sbcglobal.net> Not quite. The point of the scalar product is to produce theta. My intended use is that found in calculus. Nevertheless, my question is how to produce the result in some set of functions that are close to minimal. I could finish this off by using the common definition found in a calculus book (sum of squares, loop, etc.), but, from where I stand--just getting into numpy, this is about discovering more about numpy, and math. However, it's not just an example. I''m working on a task in celestial computations that has a definite goal. 
The dot product is very useful to it, since the work is very oriented towards vectors and matrices. Surprisingly it doesn't seem to be available in numpy's bag of tricks. David Goldsmith wrote: > On Fri, Dec 18, 2009 at 3:40 PM, Wayne Watson > wrote: > >> Well, they aren't quite the same. If a is the length of A, and b is the >> length of B, then a*b = A dot B* cos (theta). I'm still not familiar >> enough with numpy or math to know if there's some function that will >> produce a from A. It's easy enough to do, a = A(0)**2 + ..., but I would >> like to think it's a common enough need that there would be something >> available like sumsq(). >> > > In your usage, dot product and scalar product are synonymous: > > a = sqrt(A dot A) > > There are some contexts in which "scalar" product and "dot" product > don't mean exactly the same thing (e.g., tensors, where "dot" is > typically synonymous w/ "inner," which, in the general case, does not > result in a scalar, or a multiplication-like functional where a > function is mapped to a scalar, in which context we typically - but > not uniformly - do not describe the product as a dot product) but > unless you're working in one of those advanced contexts, scalar and > dot are typically used interchangeably. In particular, IIUC, in > NumPy, unless your using it to calculate a tensor product that doesn't > result in a scalar, dot and scalar product are synonymous. > > DG > >> Keith Goodman wrote: >> >>> On Fri, Dec 18, 2009 at 3:22 PM, Wayne Watson >>> wrote: >>> >>> >>>> Is there a scalar product in numpy? >>>> >>>> >>> Isn't that the same thing as a dot product? np.dot doesn't do what you want? >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> >> -- >> Wayne Watson (Watson Adventures, Prop., Nevada City, CA) >> >> (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) >> Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet >> >> "... humans'innate skills with numbers isn't much >> better than that of rats and dolphins." >> -- Stanislas Dehaene, neurosurgeon >> >> Web Page: >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "... humans'innate skills with numbers isn't much better than that of rats and dolphins." -- Stanislas Dehaene, neurosurgeon Web Page: From aisaac at american.edu Fri Dec 18 20:31:51 2009 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 18 Dec 2009 20:31:51 -0500 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: References: <4B2BF965.5010107@sbcglobal.net> <4B2C076D.3050906@sbcglobal.net> Message-ID: <4B2C2D07.8000607@american.edu> On 12/18/2009 5:54 PM, Keith Goodman wrote: > On Fri, Dec 18, 2009 at 2:51 PM, Wayne Watson > wrote: >> That should do it. Thanks. How do I get the scalar result by itself? 
> >>> np.dot(x.T,x)[0,0] > 14 > > or > >>> x = np.array([1,2,3]) >>> np.dot(x,x) > 14 or np.dot(x.flat,x.flat) fwiw, Alan Isaac From aisaac at american.edu Fri Dec 18 20:46:28 2009 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 18 Dec 2009 20:46:28 -0500 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: <4B2C1A76.3000202@sbcglobal.net> References: <4B2BF965.5010107@sbcglobal.net> <4B2C076D.3050906@sbcglobal.net> <4B2C0EB2.5050504@sbcglobal.net> <4B2C1308.900@sbcglobal.net> <45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com> <4B2C1A76.3000202@sbcglobal.net> Message-ID: <4B2C3074.80009@american.edu> On 12/18/2009 7:12 PM, Wayne Watson wrote: > The point of the scalar product is to produce theta. As David said, that is just NumPy's `dot`. >>> a = np.array([0,2]) >>> b = np.array([5,0]) >>> theta = np.arccos(np.dot(a,b)/np.sqrt(np.dot(a,a)*np.dot(b,b))) >>> theta 1.5707963267948966 >>> theta/np.pi 0.5 hth, Alan Isaac From fperez.net at gmail.com Fri Dec 18 22:21:47 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Fri, 18 Dec 2009 19:21:47 -0800 Subject: [Numpy-discussion] Proposal for matrix_rank function in numpy In-Reply-To: References: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com> <3d375d730912151145g2023978ev294215a99362de87@mail.gmail.com> <1e2af89e0912151216m564a2bf3ha098085daf2d50a2@mail.gmail.com> Message-ID: On Wed, Dec 16, 2009 at 10:56 AM, Skipper Seabold wrote: > Presumably the doctests should be turned into > actual tests (noting Robert's comment) to make it more likely that it > gets in Just curious: is there a policy against pure doctests in numpy? I've always found that doctests and 'real tests' serve complementary purposes and are both useful: - small, clear tests that make for *illustrative* examples for the end user should be left in the docstring, and picked up by the test suite as normal doctests. - tests with a lot more logic that get cumbersome to write as doctests can go into 'normal' tests into the test suite. - There's also a valid third category: for cases where it's convenient to write the test interactively but one doesn't want the example in the main docstring, putting a function in the test suite that simply has the doctest as a docstring works (it may require a little decorator, I don't recall). I'm just wondering if there's a policy of requiring that all tests become non-doctests... Cheers, f From charlesr.harris at gmail.com Fri Dec 18 22:46:26 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Dec 2009 20:46:26 -0700 Subject: [Numpy-discussion] Proposal for matrix_rank function in numpy In-Reply-To: References: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com> <3d375d730912151145g2023978ev294215a99362de87@mail.gmail.com> <1e2af89e0912151216m564a2bf3ha098085daf2d50a2@mail.gmail.com> Message-ID: On Fri, Dec 18, 2009 at 8:21 PM, Fernando Perez wrote: > On Wed, Dec 16, 2009 at 10:56 AM, Skipper Seabold > wrote: > > Presumably the doctests should be turned into > > actual tests (noting Robert's comment) to make it more likely that it > > gets in > > Just curious: is there a policy against pure doctests in numpy? I've > always found that doctests and 'real tests' serve complementary > purposes and are both useful: > > - small, clear tests that make for *illustrative* examples for the end > user should be left in the docstring, and picked up by the test suite > as normal doctests. 
>
> - tests with a lot more logic that get cumbersome to write as doctests
> can go into 'normal' tests into the test suite.
>
> - There's also a valid third category: for cases where it's convenient
> to write the test interactively but one doesn't want the example in
> the main docstring, putting a function in the test suite that simply
> has the doctest as a docstring works (it may require a little
> decorator, I don't recall).
>
> I'm just wondering if there's a policy of requiring that all tests
> become non-doctests...
>

I don't think there is a policy, but there is a growing tradition to put
"serious" tests in a test suite and avoid making them doctests. Functional
tests need to be easy to read, maintain, and document, and doctests don't
really fit that description. Your policy for examples in docstrings looks
like a good one, though.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...

From josef.pktd at gmail.com Fri Dec 18 22:47:26 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 18 Dec 2009 22:47:26 -0500
Subject: [Numpy-discussion] Proposal for matrix_rank function in numpy
In-Reply-To: References: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com> <3d375d730912151145g2023978ev294215a99362de87@mail.gmail.com> <1e2af89e0912151216m564a2bf3ha098085daf2d50a2@mail.gmail.com>
Message-ID: <1cd32cbb0912181947m75bb0bdex9ef5a38a9282ac87@mail.gmail.com>

On Fri, Dec 18, 2009 at 10:21 PM, Fernando Perez wrote:
> On Wed, Dec 16, 2009 at 10:56 AM, Skipper Seabold wrote:
>> Presumably the doctests should be turned into
>> actual tests (noting Robert's comment) to make it more likely that it
>> gets in
>
> Just curious: is there a policy against pure doctests in numpy? I've
> always found that doctests and 'real tests' serve complementary
> purposes and are both useful:
>
> - small, clear tests that make for *illustrative* examples for the end
> user should be left in the docstring, and picked up by the test suite
> as normal doctests.
>
> - tests with a lot more logic that get cumbersome to write as doctests
> can go into 'normal' tests into the test suite.
>
> - There's also a valid third category: for cases where it's convenient
> to write the test interactively but one doesn't want the example in
> the main docstring, putting a function in the test suite that simply
> has the doctest as a docstring works (it may require a little
> decorator, I don't recall).
>
> I'm just wondering if there's a policy of requiring that all tests
> become non-doctests...

Doctests have cross-platform rendering/printing/formatting problems:
1e-01 versus 1e-001, for example (there was a test failure recently with,
I think, cheby). Nans also render differently, even scalar nan versus
nans in arrays. I don't know about differences across versions of python,
numpy, ..., but doctests are very fragile for numbers because they also
test the formatting. Also, the precision is not as easy to control on a
test-by-test basis as it is with assert_almost_equal or similar, which is
also easier to adjust when the implementation changes (e.g. in stats).

I think doctests are faster to write but more work to maintain. But I
don't know of any "official" policy;
http://projects.scipy.org/numpy/wiki/TestingGuidelines#doctests explains
how to do them but makes no recommendation about whether to use them.

Josef

> Cheers,
>
> f
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
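As an aside for readers, a small sketch, not from the thread, of the formatting fragility josef describes. np.set_printoptions is the real NumPy call; the printed text in the comments is only indicative, since that text is exactly what varies across platforms and versions.

import numpy as np

x = np.array([1 / 3.0, 1e-4])
print repr(x)      # exact text depends on platform and numpy version

# Pinning print options makes doctest output more repeatable...
np.set_printoptions(precision=4, suppress=True)
print repr(x)      # e.g. array([ 0.3333,  0.0001])

# ...but scalar floats, nan rendering, and exponent widths
# (1e-01 vs 1e-001 on some Windows builds) can still differ,
# which is why numeric doctests remain fragile.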
From robert.kern at gmail.com Fri Dec 18 23:10:53 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 18 Dec 2009 22:10:53 -0600
Subject: [Numpy-discussion] Proposal for matrix_rank function in numpy
In-Reply-To: References: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com> <3d375d730912151145g2023978ev294215a99362de87@mail.gmail.com> <1e2af89e0912151216m564a2bf3ha098085daf2d50a2@mail.gmail.com>
Message-ID: <3d375d730912182010l77188933x5cdd0585bd9770a4@mail.gmail.com>

On Fri, Dec 18, 2009 at 21:21, Fernando Perez wrote:
> On Wed, Dec 16, 2009 at 10:56 AM, Skipper Seabold wrote:
>> Presumably the doctests should be turned into
>> actual tests (noting Robert's comment) to make it more likely that it
>> gets in
>
> Just curious: is there a policy against pure doctests in numpy? I've
> always found that doctests and 'real tests' serve complementary
> purposes and are both useful:
>
> - small, clear tests that make for *illustrative* examples for the end
> user should be left in the docstring, and picked up by the test suite
> as normal doctests.
>
> - tests with a lot more logic that get cumbersome to write as doctests
> can go into 'normal' tests into the test suite.
>
> - There's also a valid third category: for cases where it's convenient
> to write the test interactively but one doesn't want the example in
> the main docstring, putting a function in the test suite that simply
> has the doctest as a docstring works (it may require a little
> decorator, I don't recall).
>
> I'm just wondering if there's a policy of requiring that all tests
> become non-doctests...

My policy and rationale, which I believe is reflected in the docstring
standard, is that examples in the docstrings should put pedagogical
concerns above all others. In my experience, a properly robust doctest
sacrifices the readability, clarity, and terseness of a good example.
Thus, not all examples run as doctests, so docstrings are not added to
numpy's test suite. For example, for floating point functions, one
should use allclose(), etc. to test the results against the gold
standard. However, using that in the example makes it less clear what is
going on. Compare:

In [2]: np.sin(np.linspace(0, 2*np.pi, 10))
Out[2]:
array([  0.00000000e+00,   6.42787610e-01,   9.84807753e-01,
         8.66025404e-01,   3.42020143e-01,  -3.42020143e-01,
        -8.66025404e-01,  -9.84807753e-01,  -6.42787610e-01,
        -2.44929360e-16])

In [4]: np.allclose(np.sin(np.linspace(0, 2*np.pi, 10)),
   ...:             array([0.0, .642787610, .984807753, .866025404,
   ...:                    .342020143, -.342020143, -.866025404,
   ...:                    -.984807753, -.642787610, 0.0]))
Out[4]: True

I certainly don't mind properly written doctests outside of the real
docstrings (e.g. in particular files under test/ that just contain
doctests); they're a handy way to write certain tests although they have
well known and annoying limits for numerical work. I just don't want
documentation examples to be doctests.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco From fperez.net at gmail.com Fri Dec 18 23:22:15 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Fri, 18 Dec 2009 20:22:15 -0800 Subject: [Numpy-discussion] Proposal for matrix_rank function in numpy In-Reply-To: <3d375d730912182010l77188933x5cdd0585bd9770a4@mail.gmail.com> References: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com> <3d375d730912151145g2023978ev294215a99362de87@mail.gmail.com> <1e2af89e0912151216m564a2bf3ha098085daf2d50a2@mail.gmail.com> <3d375d730912182010l77188933x5cdd0585bd9770a4@mail.gmail.com> Message-ID: On Fri, Dec 18, 2009 at 8:10 PM, Robert Kern wrote: > My policy and rationale, which I believe is reflected in the docstring > standard, is that examples in the docstrings should put pedagogical > concerns above all others. In my experience, a properly robust doctest > sacrifices the readability, clarity, and terseness of a good example. > Thus, not all examples run as doctests, so docstrings are not added to > numpy's test suite. For example, for floating point functions, one > should use allclose(), etc. to test the results against the gold > standard. I think we mostly agree, up to a point: I also emphasized pedagogical value, but I think it would be better in the long run if we could also run all examples as part of the test suite. This would act as a protection against examples going stale due to changes in the underlying code. A false example is worse than no example at all. I wonder if this couldn't be addressed by simply having a suitable set of printing options wrapped up in a utility that doctests could all use (4 digits only, setting certain flags, linewidth, etc). This could help with most of the problems with robustness and maintenance, while allowing us to ensure that we can always guarantee that examples actually do work. Just a thought. Cheers, f From sierra_mtnview at sbcglobal.net Sat Dec 19 00:20:59 2009 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Fri, 18 Dec 2009 21:20:59 -0800 Subject: [Numpy-discussion] objects are not aligned. Matrix and Array Message-ID: <4B2C62BB.60607@sbcglobal.net> This program gives me the message following it: ================Program========== import numpy as np from numpy import matrix import math def sinD(D): # given in degrees, convert to radians return math.sin(math.radians(D)) def cosD(D): return math.cos(math.radians(D)) r = math.sqrt(2*2+5*5) print r m1 = matrix([[2], [5]]) print "m1: ", m1 theta = 5.0 # degrees #CW 2x2 clockwise matrix rotCW = matrix([ [cosD(theta), sinD(theta)], [-sinD(theta), cosD(theta)] ]) print rotCW m2 = rotCW*m1 print "m2: ", m2 print "aaaaaaaa: ", type(m1), type(m2) m1=np.array(m1) m2=np.array(m2) print "zzzzzzzz: ", type(m1), type(m2) print"dot, ..." dotres = np.dot(m1,m2) print "dotres", dotres ==============end========== ==========Output msgs======== 5.38516480713 m1: [[2] [5]] [[ 0.9961947 0.08715574] [-0.08715574 0.9961947 ]] m2: [[ 2.42816811] [ 4.806662 ]] aaaaaaaa: zzzzzzzz: dot, ... Traceback (most recent call last): File "C:/Sponsor_Meteors/Sentinel_Development/Development_Sentuser+Utilities/Playground/junk.py", line 30, in dotres = np.dot(m1,m2) ValueError: objects are not aligned ================end msgs=========== Why the msg? The types look alike and each array/matrix contains two elements.. -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "... 
humans'innate skills with numbers isn't much better than that of rats and dolphins." -- Stanislas Dehaene, neurosurgeon Web Page: From sierra_mtnview at sbcglobal.net Sat Dec 19 00:22:02 2009 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Fri, 18 Dec 2009 21:22:02 -0800 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: <4B2C3074.80009@american.edu> References: <4B2BF965.5010107@sbcglobal.net> <4B2C076D.3050906@sbcglobal.net> <4B2C0EB2.5050504@sbcglobal.net> <4B2C1308.900@sbcglobal.net> <45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com> <4B2C1A76.3000202@sbcglobal.net> <4B2C3074.80009@american.edu> Message-ID: <4B2C62FA.7010909@sbcglobal.net> Nicely done. Alan G Isaac wrote: > On 12/18/2009 7:12 PM, Wayne Watson wrote: > >> The point of the scalar product is to produce theta. >> > > As David said, that is just NumPy's `dot`. > > >>>> a = np.array([0,2]) >>>> b = np.array([5,0]) >>>> theta = np.arccos(np.dot(a,b)/np.sqrt(np.dot(a,a)*np.dot(b,b))) >>>> theta >>>> > 1.5707963267948966 > >>>> theta/np.pi >>>> > 0.5 > > hth, > Alan Isaac > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "... humans'innate skills with numbers isn't much better than that of rats and dolphins." -- Stanislas Dehaene, neurosurgeon Web Page: From sierra_mtnview at sbcglobal.net Sat Dec 19 00:29:17 2009 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Fri, 18 Dec 2009 21:29:17 -0800 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: <4B2C62FA.7010909@sbcglobal.net> References: <4B2BF965.5010107@sbcglobal.net> <4B2C076D.3050906@sbcglobal.net> <4B2C0EB2.5050504@sbcglobal.net> <4B2C1308.900@sbcglobal.net> <45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com> <4B2C1A76.3000202@sbcglobal.net> <4B2C3074.80009@american.edu> <4B2C62FA.7010909@sbcglobal.net> Message-ID: <4B2C64AD.6000306@sbcglobal.net> I'll amend that. I should have said, "Dot's all folks." -- Bugs Bunny -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "... humans'innate skills with numbers isn't much better than that of rats and dolphins." -- Stanislas Dehaene, neurosurgeon Web Page: From hgchong at berkeley.edu Sat Dec 19 00:42:12 2009 From: hgchong at berkeley.edu (Howard Chong) Date: Fri, 18 Dec 2009 21:42:12 -0800 Subject: [Numpy-discussion] ValueError for numpy when importing KDTree from scipy.spatial Message-ID: <5861ec420912182142g295fb883r2cf6745c854bed73@mail.gmail.com> I'm getting an odd behavior when I try to load KDTree. In the interactive interpreter: the first time I try to load it, it gives me an error. Second time works fine. When trying to run from command line, same error. To reproduce, just type the 4 commands below. If people can't reproduce it, I might do a re-install. Here's my session. *** Python 2.6.4 (r264:75708, Oct 26 2009, 08:23:19) [MSC v.1500 32 bit (Intel)] on win32. 
*** >>> from scipy.spatial import KDTree Traceback (most recent call last): File "", line 1, in File "C:\Python26\lib\site-packages\scipy\spatial\__init__.py", line 7, in from ckdtree import * File "numpy.pxd", line 30, in scipy.spatial.ckdtree (scipy\spatial\ckdtree.c:6087) ValueError: numpy.dtype does not appear to be the correct type object >>> KDTree Traceback (most recent call last): File "", line 1, in NameError: name 'KDTree' is not defined >>> from scipy.spatial import KDTree >>> KDTree >>> -- Howard Chong Dept. of Agricultural and Resource Economics and Energy Institute @ Haas Business School UC Berkeley -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Dec 19 00:43:41 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Dec 2009 22:43:41 -0700 Subject: [Numpy-discussion] objects are not aligned. Matrix and Array In-Reply-To: <4B2C62BB.60607@sbcglobal.net> References: <4B2C62BB.60607@sbcglobal.net> Message-ID: On Fri, Dec 18, 2009 at 10:20 PM, Wayne Watson wrote: > This program gives me the message following it: > ================Program========== > import numpy as np > from numpy import matrix > import math > > You don't want math. > def sinD(D): # given in degrees, convert to radians > return math.sin(math.radians(D)) > def cosD(D): > return math.cos(math.radians(D)) > > def sinD(D): return np.sin(np.deg2rad(D)) > r = math.sqrt(2*2+5*5) > np.hypot(2, 5) > print r > m1 = matrix([[2], [5]]) > print "m1: ", m1 > > theta = 5.0 # degrees > #CW 2x2 clockwise matrix > rotCW = matrix([ [cosD(theta), sinD(theta)], [-sinD(theta), cosD(theta)] ]) > > print rotCW > > m2 = rotCW*m1 > print "m2: ", m2 > print "aaaaaaaa: ", type(m1), type(m2) > m1=np.array(m1) > m2=np.array(m2) > > print "zzzzzzzz: ", type(m1), type(m2) > > print"dot, ..." > dotres = np.dot(m1,m2) > Try np.dot(m2, m1), m1 is a column matrix. > print "dotres", dotres > ==============end========== > > ==========Output msgs======== > 5.38516480713 > m1: [[2] > [5]] > [[ 0.9961947 0.08715574] > [-0.08715574 0.9961947 ]] > m2: [[ 2.42816811] > [ 4.806662 ]] > aaaaaaaa: 'numpy.core.defmatrix.matrix'> > zzzzzzzz: > dot, ... > > Traceback (most recent call last): > File > > "C:/Sponsor_Meteors/Sentinel_Development/Development_Sentuser+Utilities/Playground/junk.py", > line 30, in > dotres = np.dot(m1,m2) > ValueError: objects are not aligned > ================end msgs=========== > Why the msg? The types look alike and each array/matrix contains two > elements.. > > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat Dec 19 01:02:40 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 19 Dec 2009 01:02:40 -0500 Subject: [Numpy-discussion] ValueError for numpy when importing KDTree from scipy.spatial In-Reply-To: <5861ec420912182142g295fb883r2cf6745c854bed73@mail.gmail.com> References: <5861ec420912182142g295fb883r2cf6745c854bed73@mail.gmail.com> Message-ID: <1cd32cbb0912182202g75ff38bbo44e3e4246e461c95@mail.gmail.com> On Sat, Dec 19, 2009 at 12:42 AM, Howard Chong wrote: > I'm getting an odd behavior when I try to load KDTree. In the interactive > interpreter: the first time I try to load it, it gives me an error. Second > time works fine. When trying to run from command line, same error. > To reproduce, just type the 4 commands below. If people can't reproduce it, > I might do a re-install. > > Here's my session. 
> *** Python 2.6.4 (r264:75708, Oct 26 2009, 08:23:19) [MSC v.1500 32 bit
> (Intel)] on win32. ***
> >>> from scipy.spatial import KDTree
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "C:\Python26\lib\site-packages\scipy\spatial\__init__.py", line 7, in <module>
>     from ckdtree import *
>   File "numpy.pxd", line 30, in scipy.spatial.ckdtree (scipy\spatial\ckdtree.c:6087)
> ValueError: numpy.dtype does not appear to be the correct type object
> >>> KDTree
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> NameError: name 'KDTree' is not defined
> >>> from scipy.spatial import KDTree
> >>> KDTree
> >>>

No problem here. What are your numpy/scipy versions? Do you have scipy compiled against the numpy that you are using? If you use numpy 1.4.0rc1 with scipy 0.7.1, then it might be the cython incompatibility problem.

Josef

> --
> Howard Chong
> Dept. of Agricultural and Resource Economics and Energy Institute @ Haas
> Business School
> UC Berkeley
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From sierra_mtnview at sbcglobal.net  Sat Dec 19 01:20:21 2009
From: sierra_mtnview at sbcglobal.net (Wayne Watson)
Date: Fri, 18 Dec 2009 22:20:21 -0800
Subject: [Numpy-discussion] objects are not aligned. Matrix and Array
In-Reply-To:
References: <4B2C62BB.60607@sbcglobal.net>
Message-ID: <4B2C70A5.9010207@sbcglobal.net>

Is math automatic (built-in)? Same result with np.dot(m2, m1). Ah, this works. dotres = np.dot(m2.T, m1). It looks to me like the shapes are the same, so maybe dot() requires one as a column vector and one as a row. Thanks.

Charles R Harris wrote:
> On Fri, Dec 18, 2009 at 10:20 PM, Wayne Watson
> <sierra_mtnview at sbcglobal.net> wrote:
>
> This program gives me the message following it:
> ================Program==========
> import numpy as np
> from numpy import matrix
> import math
>
> You don't want math.
>
> def sinD(D): # given in degrees, convert to radians
>     return math.sin(math.radians(D))
> def cosD(D):
>     return math.cos(math.radians(D))
>
> def sinD(D):
>     return np.sin(np.deg2rad(D))
>
> r = math.sqrt(2*2+5*5)
>
> np.hypot(2, 5)
>
> print r
> m1 = matrix([[2], [5]])
> print "m1: ", m1
>
> theta = 5.0 # degrees
> #CW 2x2 clockwise matrix
> rotCW = matrix([ [cosD(theta), sinD(theta)], [-sinD(theta),
> cosD(theta)] ])
>
> print rotCW
>
> m2 = rotCW*m1
> print "m2: ", m2
> print "aaaaaaaa: ", type(m1), type(m2)
> m1=np.array(m1)
> m2=np.array(m2)
>
> print "zzzzzzzz: ", type(m1), type(m2)
>
> print"dot, ..."
> dotres = np.dot(m1,m2)
>
> Try np.dot(m2, m1), m1 is a column matrix.
>
> print "dotres", dotres
> ==============end==========
>
> ==========Output msgs========
> 5.38516480713
> m1:  [[2]
>  [5]]
> [[ 0.9961947   0.08715574]
>  [-0.08715574  0.9961947 ]]
> m2:  [[ 2.42816811]
>  [ 4.806662  ]]
> aaaaaaaa:  <class 'numpy.core.defmatrix.matrix'> <class 'numpy.core.defmatrix.matrix'>
> zzzzzzzz:  <type 'numpy.ndarray'> <type 'numpy.ndarray'>
> dot, ...
>
> Traceback (most recent call last):
>   File
> "C:/Sponsor_Meteors/Sentinel_Development/Development_Sentuser+Utilities/Playground/junk.py",
> line 30, in <module>
>     dotres = np.dot(m1,m2)
> ValueError: objects are not aligned
> ================end msgs===========
> Why the msg? The types look alike and each array/matrix contains two
> elements..
> > > Chuck > > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "... humans'innate skills with numbers isn't much better than that of rats and dolphins." -- Stanislas Dehaene, neurosurgeon Web Page: From charlesr.harris at gmail.com Sat Dec 19 01:43:02 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Dec 2009 23:43:02 -0700 Subject: [Numpy-discussion] ValueError for numpy when importing KDTree from scipy.spatial In-Reply-To: <5861ec420912182142g295fb883r2cf6745c854bed73@mail.gmail.com> References: <5861ec420912182142g295fb883r2cf6745c854bed73@mail.gmail.com> Message-ID: On Fri, Dec 18, 2009 at 10:42 PM, Howard Chong wrote: > I'm getting an odd behavior when I try to load KDTree. In the interactive > interpreter: the first time I try to load it, it gives me an error. Second > time works fine. When trying to run from command line, same error. > > To reproduce, just type the 4 commands below. If people can't reproduce it, > I might do a re-install. > > Here's my session. > > *** Python 2.6.4 (r264:75708, Oct 26 2009, 08:23:19) [MSC v.1500 32 bit > (Intel)] on win32. *** > >>> from scipy.spatial import KDTree > Traceback (most recent call last): > File "", line 1, in > File "C:\Python26\lib\site-packages\scipy\spatial\__init__.py", line 7, > in > from ckdtree import * > File "numpy.pxd", line 30, in scipy.spatial.ckdtree > (scipy\spatial\ckdtree.c:6087) > ValueError: numpy.dtype does not appear to be the correct type object > >>> KDTree > Traceback (most recent call last): > File "", line 1, in > NameError: name 'KDTree' is not defined > >>> from scipy.spatial import KDTree > >>> KDTree > > >>> > > Delete the old numpy install and reinstall. I ran into the same problem. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Sat Dec 19 01:44:44 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Fri, 18 Dec 2009 22:44:44 -0800 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: <4B2C64AD.6000306@sbcglobal.net> References: <4B2BF965.5010107@sbcglobal.net> <4B2C0EB2.5050504@sbcglobal.net> <4B2C1308.900@sbcglobal.net> <45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com> <4B2C1A76.3000202@sbcglobal.net> <4B2C3074.80009@american.edu> <4B2C62FA.7010909@sbcglobal.net> <4B2C64AD.6000306@sbcglobal.net> Message-ID: <45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com> np.dot(x.flat, x.flat) _is exactly_ "sum of squares"(x.flat). Your math education appears to have drawn a distinction between "dot product" and "scalar product," that, when one is talking about Euclidean vectors, just isn't there: in that context, they are one and the same thing. DG On Fri, Dec 18, 2009 at 9:29 PM, Wayne Watson wrote: > I'll amend that. I should have said, "Dot's all folks." -- Bugs Bunny > > -- > ? ? ? ? ? Wayne Watson (Watson Adventures, Prop., Nevada City, CA) > > ? ? ? ? ? ? (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) > ? ? ? ? ? ? ?Obz Site: ?39? 15' 7" N, 121? 2' 32" W, 2700 feet > > ? ? ? ? ? ? "... humans'innate skills with numbers isn't much > ? ? ? ? ? ? ?better than that of rats and dolphins." > ? ? ? 
? ? ? ? ? ? ? ? -- Stanislas Dehaene, neurosurgeon > > ? ? ? ? ? ? ? ? ? ?Web Page: > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Sat Dec 19 01:54:01 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Dec 2009 23:54:01 -0700 Subject: [Numpy-discussion] [AstroPy] Rotating and Transforming Vectors--Flight Path of a Celestial Body In-Reply-To: References: <4B2ADC00.307@sbcglobal.net> <4B2B1598.2060900@sbcglobal.net> Message-ID: On Thu, Dec 17, 2009 at 11:14 PM, Anne Archibald wrote: > 2009/12/18 Wayne Watson : > > It's starting to come back to me. I found a few old graphics books that > > get into transformation matrices and such. Yes, earth centered. I ground > > out some code with geometry and trig that at least gets the first point > > on the path right. I think I can probably apply a rotation around the > > x-axis several times with a 1/2 degree rotation for each step towards > > the north to get the job done. > > > > I'm still fairly fresh with Python, and found a little bit of info in a > > Tentative numpy tutorial. I found this on getting started with matrices: > > > > from numpy import matrix > > > > Apparently matrix is a class in numpy, and there are many others, linalg > > I think is one. How > > does one find all the classes, and are there rules that keep them apart. > > It was tempting to add > > import numpy in addition to the above, but how is what is in numpy > > different that the classes? > > Is numpy solely class driven? That is, if one does a from as above to > > get all the classes, does > > it give all the capabilities that just using import numpy does? > > Many things in python are classes; a class is a way of attaching > relevant functions to a collection of data (more precisely, a class is > a *type*, defining the interpretation of data; usually they also carry > a collection of functions to operate on that data). So the central > feature of numpy is a class, ndarray, that represents a collection of > values of homogeneous type. This may be one, two, or many-dimensional, > and there are various operations, including linear algebra, on them > available in numpy. > > The matrix class is a special kind of ndarray in which a number of > modifications have been made. In particular, the * operator no longer > does element-wise operations, it does matrix multiplication. There are > also various rules designed to ensure that matrix objects are always > two-dimensional. I avoid matrix objects like the plague, but some > people find them useful. > > numpy.linalg is an entirely different beast. It is a *module*, a > collection of functions (and potentially objects and classes). It is > like sys or os: you import it and the functions, objects and classes > it contains become available. This is a basic feature of python. What > is unusual (but not unique) is that rather than having to explicitly > import it like: > > import numpy > import numpy.linalg > > numpy.linalg.svd(numpy.ones((3,2))) > > numpy automatically imports it for you, every time. This is done for > historical reasons and won't change, but is a wart. > > For your purposes, I recommend simply using numpy arrays - > three-element arrays for vectors, three-by-three for matrices - and > using the linear algebra functions numpy provides to act on them. 
For
> example, dot does matrix-matrix, matrix-vector, and vector-vector
> multiplication.
>
> Anne
>
> P.S. you can usually write out a rotation explicitly, e.g. as
> [[cos(t), sin(t), 0],
> [-sin(t), cos(t), 0],
> [0, 0, 1]]
> but if you want a more general one I believe there's a clever way to
> make it using two reflections. -A

Two reflections in two (hyper)planes gives a rotation of twice the angle between their normals and leaves their intersection invariant. It's hard to visualize in more than three dimensions ;) It's also a good trick for rotating vector values in place using only exchanges.

Chuck
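A minimal sketch of the two-reflections trick Chuck describes, in case it helps to see it numerically. The helper name `reflection` is my own, not a numpy function; the construction (a Householder reflection I - 2*n*n^T for a unit normal n) is standard, and composing two of them rotates by twice the angle between the normals:

import numpy as np

def reflection(n):
    # Householder reflection through the (hyper)plane with unit normal n.
    n = np.asarray(n, dtype=float)
    n = n / np.sqrt(np.dot(n, n))
    return np.eye(len(n)) - 2.0 * np.outer(n, n)

half = np.radians(2.5)   # normals 2.5 degrees apart
R = np.dot(reflection([np.cos(half), np.sin(half)]),
           reflection([1.0, 0.0]))
t = np.radians(5.0)      # expect a 5 degree counterclockwise rotation
print np.allclose(R, [[np.cos(t), -np.sin(t)],
                      [np.sin(t),  np.cos(t)]])   # -> True

Each reflection reverses orientation, so the product of two restores it, and the intersection of the two planes (here just the origin) stays fixed, exactly as Chuck says.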
From sccolbert at gmail.com  Sat Dec 19 06:53:56 2009
From: sccolbert at gmail.com (Chris Colbert)
Date: Sat, 19 Dec 2009 12:53:56 +0100
Subject: [Numpy-discussion] objects are not aligned. Matrix and Array
In-Reply-To:
References: <4B2C62BB.60607@sbcglobal.net>
Message-ID: <7f014ea60912190353l39e1532bxf811f20b10787848@mail.gmail.com>

On Sat, Dec 19, 2009 at 6:43 AM, Charles R Harris wrote:
> On Fri, Dec 18, 2009 at 10:20 PM, Wayne Watson
> <sierra_mtnview at sbcglobal.net> wrote:
>
>> This program gives me the message following it:
>> ================Program==========
>> import numpy as np
>> from numpy import matrix
>> import math
>
> You don't want math.

Why do you say that? The builtins are MUCH faster than numpy for single values:

In [1]: import math

In [2]: import numpy as np

In [3]: %timeit np.sin(1.57)
100000 loops, best of 3: 2.41 us per loop

In [4]: %timeit math.sin(1.57)
10000000 loops, best of 3: 165 ns per loop

In [6]: %timeit np.array([np.sin(1.57)])
100000 loops, best of 3: 11.5 us per loop

In [7]: %timeit np.array([math.sin(1.57)])
100000 loops, best of 3: 7.01 us per loop

From sierra_mtnview at sbcglobal.net  Sat Dec 19 07:51:25 2009
From: sierra_mtnview at sbcglobal.net (Wayne Watson)
Date: Sat, 19 Dec 2009 04:51:25 -0800
Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays?
In-Reply-To: <45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com>
References: <4B2BF965.5010107@sbcglobal.net> <4B2C0EB2.5050504@sbcglobal.net>
	<4B2C1308.900@sbcglobal.net>
	<45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com>
	<4B2C1A76.3000202@sbcglobal.net> <4B2C3074.80009@american.edu>
	<4B2C62FA.7010909@sbcglobal.net> <4B2C64AD.6000306@sbcglobal.net>
	<45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com>
Message-ID: <4B2CCC4D.1070509@sbcglobal.net>

I'm trying to compute the angle between two vectors in three dimensional space. For that, I need to use the "scalar (dot) product", according to a calculus book (quoting the book) I'm holding in my hands right now. I've used dot() successfully to produce the necessary angle. My program works just fine.

In the case of the dot() function, one must use np.dot(x.T,x), where x is 1x3.

I'm not quite sure what your point is about dot()* unless you are thinking in some non-Euclidean fashion. One can form np.dot(a,b) with a and b arrays of 3x4 and 4x2 shape to arrive at a 3x2 array. That's definitely not a scalar. Is there a need for this sort of calculation in non-Euclidean geometry, which I have never dealt with?

*Maybe it's about something else related to it.

David Goldsmith wrote:
> np.dot(x.flat, x.flat) _is exactly_ "sum of squares"(x.flat). Your
> math education appears to have drawn a distinction between "dot
> product" and "scalar product," that, when one is talking about
> Euclidean vectors, just isn't there: in that context, they are one and
> the same thing.
>
> DG
>
> On Fri, Dec 18, 2009 at 9:29 PM, Wayne Watson wrote:
>
>> I'll amend that. I should have said, "Dot's all folks." -- Bugs Bunny
>>
>> --
>>           Wayne Watson (Watson Adventures, Prop., Nevada City, CA)
>>             (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time)
>>              Obz Site: 39° 15' 7" N, 121° 2' 32" W, 2700 feet
>>
>>             "... humans' innate skills with numbers isn't much
>>              better than that of rats and dolphins."
>>                       -- Stanislas Dehaene, neurosurgeon
>>
>>                    Web Page:
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

-- 
          Wayne Watson (Watson Adventures, Prop., Nevada City, CA)
            (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time)
             Obz Site: 39° 15' 7" N, 121° 2' 32" W, 2700 feet

            "... humans' innate skills with numbers isn't much
             better than that of rats and dolphins."
                      -- Stanislas Dehaene, neurosurgeon

                   Web Page:

From dagss at student.matnat.uio.no  Sat Dec 19 08:19:50 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Sat, 19 Dec 2009 14:19:50 +0100
Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays?
In-Reply-To: <4B2CCC4D.1070509@sbcglobal.net>
References: <4B2BF965.5010107@sbcglobal.net> <4B2C0EB2.5050504@sbcglobal.net>
	<4B2C1308.900@sbcglobal.net>
	<45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com>
	<4B2C1A76.3000202@sbcglobal.net> <4B2C3074.80009@american.edu>
	<4B2C62FA.7010909@sbcglobal.net> <4B2C64AD.6000306@sbcglobal.net>
	<45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com>
	<4B2CCC4D.1070509@sbcglobal.net>
Message-ID: <4B2CD2F6.7070406@student.matnat.uio.no>

Wayne Watson wrote:
> I'm trying to compute the angle between two vectors in three dimensional
> space. For that, I need to use the "scalar (dot) product", according to
> a calculus book (quoting the book) I'm holding in my hands right now.
> I've used dot() successfully to produce the necessary angle. My program
> works just fine.
>
> In the case of the dot() function, one must use np.dot(x.T,x), where x is
> 1x3.
>
> I'm not quite sure what your point is about dot()* unless you are
> thinking in some non-Euclidean fashion. One can form np.dot(a,b) with a
> and b arrays of 3x4 and 4x2 shape to arrive at a 3x2 array. That's
> definitely not a scalar. Is there a need for this sort of calculation in
> non-Euclidean geometry, which I have never dealt with?

There's a difference between 1D and 2D arrays that's important here. For a 1D array, np.dot(x.T, x) == np.dot(x, x), since there's only one dimension.

NumPy is all about arrays, not matrices and vectors.

Dag Sverre

> *Maybe it's about something else related to it.
>
> David Goldsmith wrote:
>> np.dot(x.flat, x.flat) _is exactly_ "sum of squares"(x.flat). Your
>> math education appears to have drawn a distinction between "dot
>> product" and "scalar product," that, when one is talking about
>> Euclidean vectors, just isn't there: in that context, they are one and
>> the same thing.
>>
>> DG
>>
>> On Fri, Dec 18, 2009 at 9:29 PM, Wayne Watson wrote:
>>
>>> I'll amend that.
I should have said, "Dot's all folks." -- Bugs Bunny >>> >>> -- >>> Wayne Watson (Watson Adventures, Prop., Nevada City, CA) >>> >>> (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) >>> Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet >>> >>> "... humans'innate skills with numbers isn't much >>> better than that of rats and dolphins." >>> -- Stanislas Dehaene, neurosurgeon >>> >>> Web Page: >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -- Dag Sverre From charlesr.harris at gmail.com Sat Dec 19 10:13:12 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 19 Dec 2009 08:13:12 -0700 Subject: [Numpy-discussion] objects are not aligned. Matrix and Array In-Reply-To: <7f014ea60912190353l39e1532bxf811f20b10787848@mail.gmail.com> References: <4B2C62BB.60607@sbcglobal.net> <7f014ea60912190353l39e1532bxf811f20b10787848@mail.gmail.com> Message-ID: On Sat, Dec 19, 2009 at 4:53 AM, Chris Colbert wrote: > > > On Sat, Dec 19, 2009 at 6:43 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Fri, Dec 18, 2009 at 10:20 PM, Wayne Watson < >> sierra_mtnview at sbcglobal.net> wrote: >> >>> This program gives me the message following it: >>> ================Program========== >>> import numpy as np >>> from numpy import matrix >>> import math >>> >>> >> You don't want math. >> > > > Why do you say that? The builtins are MUCH faster than numpy for single > values: > > In [1]: import math > > In [2]: import numpy as np > > In [3]: %timeit np.sin(1.57) > 100000 loops, best of 3: 2.41 us per loop > > In [4]: %timeit math.sin(1.57) > 10000000 loops, best of 3: 165 ns per loop > > In [6]: %timeit np.array([np.sin(1.57)]) > 100000 loops, best of 3: 11.5 us per loop > > In [7]: %timeit np.array([math.sin(1.57)]) > 100000 loops, best of 3: 7.01 us per loop > > > Fair point. I was thinking of the vector case down the road. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sierra_mtnview at sbcglobal.net Sat Dec 19 11:45:16 2009 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Sat, 19 Dec 2009 08:45:16 -0800 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: <4B2CD2F6.7070406@student.matnat.uio.no> References: <4B2BF965.5010107@sbcglobal.net> <4B2C0EB2.5050504@sbcglobal.net> <4B2C1308.900@sbcglobal.net> <45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com> <4B2C1A76.3000202@sbcglobal.net> <4B2C3074.80009@american.edu> <4B2C62FA.7010909@sbcglobal.net> <4B2C64AD.6000306@sbcglobal.net> <45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com> <4B2CCC4D.1070509@sbcglobal.net> <4B2CD2F6.7070406@student.matnat.uio.no> Message-ID: <4B2D031C.2090408@sbcglobal.net> Dag Sverre Seljebotn wrote: > Wayne Watson wrote: > >> I'm trying to compute the angle between two vectors in three dimensional >> space. For that, I need to use the "scalar (dot) product" , according to >> a calculus book (quoting the book) I'm holding in my hands right now. >> I've used dot() successfully to produce the necessary angle. My program >> works just fine. >> >> In the case of the dot(function), one must use np.dev(x.T,x), where x is >> 1x3. 
>> >> I'm not quite sure what your point is about dot()* unless you are >> thinking in some non-Euclidean fashion. One can form np.dot(a,b) with a >> and b arrays of 3x4 and 4x2 shape to arrive at a 3x2 array. That's >> definitely not a scalar. Is there a need for this sort of calculation in >> non-Euclidean geometry, which I have never dealt with? >> > > There's a difference between 1D and 2D arrays that's important here. For > a 1D array, np.dot(x.T, x) == np.dot(x, x), since there's only one > dimension. > A 4x1, 1x7, and 1x5 would be examples of a 1D array or matrix, right? Are you saying that instead of using a rotational matrix like theta = 5.0 # degrees m1 = matrix([[2] ,[5]]) rotCW = matrix([ [cosD(theta), sinD(theta)], [-sinD(theta), cosD(theta)] ]) m2= rotCW*m1 m1=np.array(m1) m2=np.array(m2) that I should use a 2-D array for rotCW? So why does numpy have a matrix class? Is the class only used when working with matplotlib? To get the scalar value (sum of squares) I had to use a transpose, T, on one argument. > NumPy is all about arrays, not matrices and vectors. > > Dag Sverre > > >> *Maybe it's about something else related to it. >> >> >> David Goldsmith wrote: >> >>> np.dot(x.flat, x.flat) _is exactly_ "sum of squares"(x.flat). Your >>> math education appears to have drawn a distinction between "dot >>> product" and "scalar product," that, when one is talking about >>> Euclidean vectors, just isn't there: in that context, they are one and >>> the same thing. >>> >>> DG >>> >>> On Fri, Dec 18, 2009 at 9:29 PM, Wayne Watson >>> wrote: >>> >>> >>>> I'll amend that. I should have said, "Dot's all folks." -- Bugs Bunny >>>> >>>> -- >>>> Wayne Watson (Watson Adventures, Prop., Nevada City, CA) >>>> >>>> (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) >>>> Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet >>>> >>>> "... humans'innate skills with numbers isn't much >>>> better than that of rats and dolphins." >>>> -- Stanislas Dehaene, neurosurgeon >>>> >>>> Web Page: >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> > > > -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "... humans'innate skills with numbers isn't much better than that of rats and dolphins." -- Stanislas Dehaene, neurosurgeon Web Page: From aisaac at american.edu Sat Dec 19 12:17:17 2009 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 19 Dec 2009 12:17:17 -0500 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? 
In-Reply-To: <4B2D031C.2090408@sbcglobal.net>
References: <4B2BF965.5010107@sbcglobal.net> <4B2C0EB2.5050504@sbcglobal.net>
	<4B2C1308.900@sbcglobal.net>
	<45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com>
	<4B2C1A76.3000202@sbcglobal.net> <4B2C3074.80009@american.edu>
	<4B2C62FA.7010909@sbcglobal.net> <4B2C64AD.6000306@sbcglobal.net>
	<45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com>
	<4B2CCC4D.1070509@sbcglobal.net> <4B2CD2F6.7070406@student.matnat.uio.no>
	<4B2D031C.2090408@sbcglobal.net>
Message-ID: <4B2D0A9D.20200@american.edu>

On 12/19/2009 11:45 AM, Wayne Watson wrote:
> A 4x1, 1x7, and 1x5 would be examples of a 1D array or matrix, right?
>
> Are you saying that instead of using a rotational matrix ...
> that I should use a 2-D array for rotCW? So why does numpy have a matrix
> class? Is the class only used when working with matplotlib?
>
> To get the scalar value (sum of squares) I had to use a transpose, T, on
> one argument.

At this point, you have raised some long standing issues. There are a couple standard replies people give to some of them. E.g.,

1. don't use matrices, OR
2. don't mix the use of matrices and arrays

Matrices are *always* 2d (e.g., a "row vector" or a "column vector" is 2d). So in fact you should find it quite natural that that transpose was needed. Matrices change * to matrix multiplication and ** to matrix exponentiation. I find this very convenient, especially in a teaching setting, so I use NumPy matrices all the time. Many on this list avoid them completely.

Again, if you want a *scalar* as the product of vectors for which you created matrix objects (e.g., a and b), you can just use flat:
np.dot(a.flat,b.flat)

hth,
Alan Isaac
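A short session illustrating Alan's point. The outputs shown are what I would expect from a numpy 1.x interpreter, so the exact formatting is an assumption:

>>> import numpy as np
>>> a = np.matrix([[2], [5]])
>>> b = np.matrix([[3], [1]])
>>> np.dot(a.T, b)          # still a 1x1 matrix, not a scalar
matrix([[11]])
>>> np.dot(a.flat, b.flat)  # a plain scalar
11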
From charlesr.harris at gmail.com  Sat Dec 19 12:22:25 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 19 Dec 2009 10:22:25 -0700
Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays?
In-Reply-To: <4B2D031C.2090408@sbcglobal.net>
References: <4B2BF965.5010107@sbcglobal.net>
	<45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com>
	<4B2C1A76.3000202@sbcglobal.net> <4B2C3074.80009@american.edu>
	<4B2C62FA.7010909@sbcglobal.net> <4B2C64AD.6000306@sbcglobal.net>
	<45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com>
	<4B2CCC4D.1070509@sbcglobal.net> <4B2CD2F6.7070406@student.matnat.uio.no>
	<4B2D031C.2090408@sbcglobal.net>
Message-ID:

On Sat, Dec 19, 2009 at 9:45 AM, Wayne Watson wrote:
> Dag Sverre Seljebotn wrote:
>> Wayne Watson wrote:
>>> I'm trying to compute the angle between two vectors in three dimensional
>>> space. For that, I need to use the "scalar (dot) product", according to
>>> a calculus book (quoting the book) I'm holding in my hands right now.
>>> I've used dot() successfully to produce the necessary angle. My program
>>> works just fine.
>>>
>>> In the case of the dot() function, one must use np.dot(x.T,x), where x is
>>> 1x3.
>>>
>>> I'm not quite sure what your point is about dot()* unless you are
>>> thinking in some non-Euclidean fashion. One can form np.dot(a,b) with a
>>> and b arrays of 3x4 and 4x2 shape to arrive at a 3x2 array. That's
>>> definitely not a scalar. Is there a need for this sort of calculation in
>>> non-Euclidean geometry, which I have never dealt with?
>>
>> There's a difference between 1D and 2D arrays that's important here. For
>> a 1D array, np.dot(x.T, x) == np.dot(x, x), since there's only one
>> dimension.
>
> A 4x1, 1x7, and 1x5 would be examples of a 1D array or matrix, right?

No, they are all 2D. All matrices are 2D. An array is 1D if it doesn't have a second dimension, which might be confusing if you have only seen vectors represented as arrays. To see the number of dimensions in a numpy array, use shape:

In [1]: array([[1,2],[3,4]])
Out[1]:
array([[1, 2],
       [3, 4]])

In [2]: array([[1,2],[3,4]]).shape
Out[2]: (2, 2)

In [3]: array([1,2, 3,4])
Out[3]: array([1, 2, 3, 4])

In [4]: array([1,2, 3,4]).shape
Out[4]: (4,)

> Are you saying that instead of using a rotational matrix like
> theta = 5.0 # degrees
> m1 = matrix([[2] ,[5]])
> rotCW = matrix([ [cosD(theta), sinD(theta)], [-sinD(theta),
> cosD(theta)] ])
> m2= rotCW*m1
> m1=np.array(m1)
> m2=np.array(m2)
> that I should use a 2-D array for rotCW? So why does numpy have a matrix
> class? Is the class only used when working with matplotlib?

Numpy has a matrix class because python lacks operators, so where * normally means element-wise multiplication the matrix class uses it for matrix multiplication, which is different. Having a short form for matrix multiplication is sometimes a convenience and also more familiar for folks coming to numpy from matlab.

> To get the scalar value (sum of squares) I had to use a transpose, T, on
> one argument.

That is if the argument is 2D. It's not strictly speaking a scalar product, but we won't go into that here ;)

Chuck

From sierra_mtnview at sbcglobal.net  Sat Dec 19 12:38:58 2009
From: sierra_mtnview at sbcglobal.net (Wayne Watson)
Date: Sat, 19 Dec 2009 09:38:58 -0800
Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays?
In-Reply-To: <4B2D0A9D.20200@american.edu>
References: <4B2BF965.5010107@sbcglobal.net> <4B2C0EB2.5050504@sbcglobal.net>
	<4B2C1308.900@sbcglobal.net>
	<45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com>
	<4B2C1A76.3000202@sbcglobal.net> <4B2C3074.80009@american.edu>
	<4B2C62FA.7010909@sbcglobal.net> <4B2C64AD.6000306@sbcglobal.net>
	<45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com>
	<4B2CCC4D.1070509@sbcglobal.net> <4B2CD2F6.7070406@student.matnat.uio.no>
	<4B2D031C.2090408@sbcglobal.net> <4B2D0A9D.20200@american.edu>
Message-ID: <4B2D0FB2.6080905@sbcglobal.net>

Yes, flat sounds useful here. However, numpy isn't bending over backwards to tie in conventional mathematical language into it. I don't recall flat in any calculus books. :-) Maybe I've been away so long from it, that it is a common math concept? Although I doubt that.

Alan G Isaac wrote:
> On 12/19/2009 11:45 AM, Wayne Watson wrote:
>> A 4x1, 1x7, and 1x5 would be examples of a 1D array or matrix, right?
>>
>> Are you saying that instead of using a rotational matrix ...
>> that I should use a 2-D array for rotCW? So why does numpy have a matrix
>> class? Is the class only used when working with matplotlib?
>>
>> To get the scalar value (sum of squares) I had to use a transpose, T, on
>> one argument.
>
> At this point, you have raised some long standing issues.
> There are a couple standard replies people give to some of them.
> E.g.,
>
> 1. don't use matrices, OR
> 2. don't mix the use of matrices and arrays
>
> Matrices are *always* 2d (e.g., a "row vector" or a "column vector" is 2d).
> So in fact you should find it quite natural that that transpose was needed.
> Matrices change * to matrix multiplication and ** to matrix exponentiation.
> I find this very convenient, especially in a teaching setting, so I use
> NumPy matrices all the time.
Many on this list avoid them completely. > > Again, if you want a *scalar* as the product of vectors for which you > created matrix objects (e.g., a and b), you can just use flat: > np.dot(a.flat,b.flat) > > hth, > Alan Isaac > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "... humans'innate skills with numbers isn't much better than that of rats and dolphins." -- Stanislas Dehaene, neurosurgeon Web Page: From charlesr.harris at gmail.com Sat Dec 19 12:42:39 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 19 Dec 2009 10:42:39 -0700 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: <4B2D0FB2.6080905@sbcglobal.net> References: <4B2BF965.5010107@sbcglobal.net> <4B2C3074.80009@american.edu> <4B2C62FA.7010909@sbcglobal.net> <4B2C64AD.6000306@sbcglobal.net> <45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com> <4B2CCC4D.1070509@sbcglobal.net> <4B2CD2F6.7070406@student.matnat.uio.no> <4B2D031C.2090408@sbcglobal.net> <4B2D0A9D.20200@american.edu> <4B2D0FB2.6080905@sbcglobal.net> Message-ID: On Sat, Dec 19, 2009 at 10:38 AM, Wayne Watson wrote: > Yes, flat sounds useful here. However, numpy isn't bending over > backwards to tie in conventional mathematical language into it. > I don't recall flat in any calculus books. :-) Maybe I've been away so > long from it, that it is a common math concept? Although I doubt that. > > Flat is a programming concept. Programming and mathematics have some overlap, but they aren't the same by any means. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sierra_mtnview at sbcglobal.net Sat Dec 19 12:45:02 2009 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Sat, 19 Dec 2009 09:45:02 -0800 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: References: <4B2BF965.5010107@sbcglobal.net> <45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com> <4B2C1A76.3000202@sbcglobal.net> <4B2C3074.80009@american.edu> <4B2C62FA.7010909@sbcglobal.net> <4B2C64AD.6000306@sbcglobal.net> <45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com> <4B2CCC4D.1070509@sbcglobal.net> <4B2CD2F6.7070406@student.matnat.uio.no> <4B2D031C.2090408@sbcglobal.net> Message-ID: <4B2D111E.6060809@sbcglobal.net> OK, so what's your recommendation on the code I wrote? Use shape 0xN? Will that eliminate the need for T? I'll go back to Tenative Python, and re-read dimension, shape and the like. Charles R Harris wrote: > > > On Sat, Dec 19, 2009 at 9:45 AM, Wayne Watson > > > wrote: > > > > Dag Sverre Seljebotn wrote: > > Wayne Watson wrote: > > > >> I'm trying to compute the angle between two vectors in three > dimensional > >> space. For that, I need to use the "scalar (dot) product" , > according to > >> a calculus book (quoting the book) I'm holding in my hands > right now. > >> I've used dot() successfully to produce the necessary angle. My > program > >> works just fine. > >> > >> In the case of the dot(function), one must use np.dev(x.T,x), > where x is > >> 1x3. > >> > >> I'm not quite sure what your point is about dot()* unless you are > >> thinking in some non-Euclidean fashion. 
One can form > np.dot(a,b) with a > >> and b arrays of 3x4 and 4x2 shape to arrive at a 3x2 array. That's > >> definitely not a scalar. Is there a need for this sort of > calculation in > >> non-Euclidean geometry, which I have never dealt with? > >> > > > > There's a difference between 1D and 2D arrays that's important > here. For > > a 1D array, np.dot(x.T, x) == np.dot(x, x), since there's only one > > dimension. > > > A 4x1, 1x7, and 1x5 would be examples of a 1D array or matrix, right? > > > No, they are all 2D. All matrices are 2D. An array is 1D if it doesn't > have a second dimension, which might be confusing if you have only > seen vectors represented as arrays. To see the number of dimensions in > a numpy array, use shape: > > In [1]: array([[1,2],[3,4]]) > Out[1]: > array([[1, 2], > [3, 4]]) > > In [2]: array([[1,2],[3,4]]).shape > Out[2]: (2, 2) > > In [3]: array([1,2, 3,4]) > Out[3]: array([1, 2, 3, 4]) > > In [4]: array([1,2, 3,4]).shape > Out[4]: (4,) > > > Are you saying that instead of using a rotational matrix like > theta = 5.0 # degrees > m1 = matrix([[2] ,[5]]) > rotCW = matrix([ [cosD(theta), sinD(theta)], [-sinD(theta), > cosD(theta)] ]) > m2= rotCW*m1 > m1=np.array(m1) > m2=np.array(m2) > that I should use a 2-D array for rotCW? So why does numpy have a > matrix > class? Is the class only used when working with matplotlib? > > > Numpy has a matrix class because python lacks operators, so where * > normally means element-wise multiplication the matrix class uses it > for matrix multiplication, which is different. Having a short form for > matrix multiplication is sometimes a convenience and also more > familiar for folks coming to numpy from matlab. > > > To get the scalar value (sum of squares) I had to use a transpose, > T, on > one argument. > > > That is if the argument is 2D. It's not strictly speaking a scalar > product, but we won't go into that here ;) > > > > Chuck > > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "... humans'innate skills with numbers isn't much better than that of rats and dolphins." -- Stanislas Dehaene, neurosurgeon Web Page: From sierra_mtnview at sbcglobal.net Sat Dec 19 12:46:57 2009 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Sat, 19 Dec 2009 09:46:57 -0800 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: References: <4B2BF965.5010107@sbcglobal.net> <4B2C3074.80009@american.edu> <4B2C62FA.7010909@sbcglobal.net> <4B2C64AD.6000306@sbcglobal.net> <45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com> <4B2CCC4D.1070509@sbcglobal.net> <4B2CD2F6.7070406@student.matnat.uio.no> <4B2D031C.2090408@sbcglobal.net> <4B2D0A9D.20200@american.edu> <4B2D0FB2.6080905@sbcglobal.net> Message-ID: <4B2D1191.4050803@sbcglobal.net> That's for sure! :-) Charles R Harris wrote: > > > On Sat, Dec 19, 2009 at 10:38 AM, Wayne Watson > > > wrote: > > Yes, flat sounds useful here. However, numpy isn't bending over > backwards to tie in conventional mathematical language into it. > I don't recall flat in any calculus books. :-) Maybe I've been away so > long from it, that it is a common math concept? Although I doubt that. 
> > > Flat is a programming concept. Programming and mathematics have some > overlap, but they aren't the same by any means. > > Chuck > > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "... humans'innate skills with numbers isn't much better than that of rats and dolphins." -- Stanislas Dehaene, neurosurgeon Web Page: From Chris.Barker at noaa.gov Sat Dec 19 13:18:09 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Sat, 19 Dec 2009 10:18:09 -0800 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: <4B2D0FB2.6080905@sbcglobal.net> References: <4B2BF965.5010107@sbcglobal.net> <4B2C0EB2.5050504@sbcglobal.net> <4B2C1308.900@sbcglobal.net> <45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com> <4B2C1A76.3000202@sbcglobal.net> <4B2C3074.80009@american.edu> <4B2C62FA.7010909@sbcglobal.net> <4B2C64AD.6000306@sbcglobal.net> <45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com> <4B2CCC4D.1070509@sbcglobal.net> <4B2CD2F6.7070406@student.matnat.uio.no> <4B2D031C.2090408@sbcglobal.net> <4B2D0A9D.20200@american.edu> <4B2D0FB2.6080905@sbcglobal.net> Message-ID: <4B2D18E1.20905@noaa.gov> Wayne Watson wrote: > Yes, flat sounds useful here. However, numpy isn't bending over > backwards to tie in conventional mathematical language into it. exactly -- it isn't bending over at all! (well a little -- see below). numpy was designed for general purpose computational needs, not any one branch of math. nd-arrays are very useful for lots of things. In contrast, Matlab, for instance, was originally designed to be an easy front-end to linear algebra package. Personally, when I used Matlab, I found that very awkward -- I was usually writing 100s of lines of code that had nothing to do with linear algebra, for every few lines that actually did matrix math. So I much prefer numpy's way -- the linear algebra lines of code are longer an more awkward, but the rest is much better. The Matrix class is the exception to this: is was written to provide a natural way to express linear algebra. However, things get a bit tricky when you mix matrices and arrays, and even when sticking with matrices there are confusions and limitations -- how do you express a row vs a column vector? what do you get when you iterate over a matrix? etc. There has been a bunch of discussion about these issues, a lot of good ideas, a little bit of consensus about how to improve it, but no one with the skill to do it has enough motivation to do it. As for your problem, I think a 3-d euclidean vector is well expressed as a (3,) shape array, and then you don't need flat, etc. In [6]: v1 = np.array((1,2,3), dtype=np.float) In [7]: v2 = np.array((3,1,2), dtype=np.float) In [8]: np.dot(v1,v2) Out[8]: 11.0 -Chris > I don't recall flat in any calculus books. :-) Maybe I've been away so > long from it, that it is a common math concept? Although I doubt that. > > > Alan G Isaac wrote: >> On 12/19/2009 11:45 AM, Wayne Watson wrote: >> >>> A 4x1, 1x7, and 1x5 would be examples of a 1D array or matrix, right? >>> >>> Are you saying that instead of using a rotational matrix ... >>> that I should use a 2-D array for rotCW? 
So why does numpy have a matrix >>> class? Is the class only used when working with matplotlib? >>> >>> To get the scalar value (sum of squares) I had to use a transpose, T, on >>> one argument. >>> >> >> At this point, you have raised some long standing issues. >> There are a couple standard replies people give to some of them. >> E.g., >> >> 1. don't use matrices, OR >> 2. don't mix the use of matrices and arrays >> >> Matrices are *always* 2d (e.g., a "row vector" or a "column vector" is 2d). >> So in fact you should find it quite natural that that transpose was needed. >> Matrices change * to matrix multiplication and ** to matrix exponentiation. >> I find this very convenient, especially in a teaching setting, so I use >> NumPy matrices all the time. Many on this list avoid them completely. >> >> Again, if you want a *scalar* as the product of vectors for which you >> created matrix objects (e.g., a and b), you can just use flat: >> np.dot(a.flat,b.flat) >> >> hth, >> Alan Isaac >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From sierra_mtnview at sbcglobal.net Sat Dec 19 13:50:33 2009 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Sat, 19 Dec 2009 10:50:33 -0800 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: <4B2D18E1.20905@noaa.gov> References: <4B2BF965.5010107@sbcglobal.net> <4B2C0EB2.5050504@sbcglobal.net> <4B2C1308.900@sbcglobal.net> <45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com> <4B2C1A76.3000202@sbcglobal.net> <4B2C3074.80009@american.edu> <4B2C62FA.7010909@sbcglobal.net> <4B2C64AD.6000306@sbcglobal.net> <45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com> <4B2CCC4D.1070509@sbcglobal.net> <4B2CD2F6.7070406@student.matnat.uio.no> <4B2D031C.2090408@sbcglobal.net> <4B2D0A9D.20200@american.edu> <4B2D0FB2.6080905@sbcglobal.net> <4B2D18E1.20905@noaa.gov> Message-ID: <4B2D2079.6040008@sbcglobal.net> I guess I'll become accustomed to it over time. I have some interesting things to do for which I will need the facilities of numpy. I realized where I got into trouble with some of this. I was not differentiating between the dimensionality of space and that of a matrix or array. I haven't had to crank out math and computer work for quite awhile. Further, I've been doing a lot of reading on the Big Bang, and the dimensionality of space. I'm presently strongly biased towards thinking about space. For example, when I say 2D, I'm thinking of plane geometry space, and 3D as the world we live in. Thanks to all on this thread. Christopher Barker wrote: > Wayne Watson wrote: > >> Yes, flat sounds useful here. However, numpy isn't bending over >> backwards to tie in conventional mathematical language into it. >> > > exactly -- it isn't bending over at all! (well a little -- see below). > numpy was designed for general purpose computational needs, not any one > branch of math. nd-arrays are very useful for lots of things. In > contrast, Matlab, for instance, was originally designed to be an easy > front-end to linear algebra package. 
Personally, when I used Matlab, I > found that very awkward -- I was usually writing 100s of lines of code > that had nothing to do with linear algebra, for every few lines that > actually did matrix math. So I much prefer numpy's way -- the linear > algebra lines of code are longer an more awkward, but the rest is much > better. > > The Matrix class is the exception to this: is was written to provide a > natural way to express linear algebra. However, things get a bit tricky > when you mix matrices and arrays, and even when sticking with matrices > there are confusions and limitations -- how do you express a row vs a > column vector? what do you get when you iterate over a matrix? etc. > > There has been a bunch of discussion about these issues, a lot of good > ideas, a little bit of consensus about how to improve it, but no one > with the skill to do it has enough motivation to do it. > > As for your problem, I think a 3-d euclidean vector is well expressed as > a (3,) shape array, and then you don't need flat, etc. > > In [6]: v1 = np.array((1,2,3), dtype=np.float) > > In [7]: v2 = np.array((3,1,2), dtype=np.float) > > In [8]: np.dot(v1,v2) > Out[8]: 11.0 > > -Chris > > > > > > > > >> I don't recall flat in any calculus books. :-) Maybe I've been away so >> long from it, that it is a common math concept? Although I doubt that. >> >> >> Alan G Isaac wrote: >> >>> On 12/19/2009 11:45 AM, Wayne Watson wrote: >>> >>> >>>> A 4x1, 1x7, and 1x5 would be examples of a 1D array or matrix, right? >>>> >>>> Are you saying that instead of using a rotational matrix ... >>>> that I should use a 2-D array for rotCW? So why does numpy have a matrix >>>> class? Is the class only used when working with matplotlib? >>>> >>>> To get the scalar value (sum of squares) I had to use a transpose, T, on >>>> one argument. >>>> >>>> >>> At this point, you have raised some long standing issues. >>> There are a couple standard replies people give to some of them. >>> E.g., >>> >>> 1. don't use matrices, OR >>> 2. don't mix the use of matrices and arrays >>> >>> Matrices are *always* 2d (e.g., a "row vector" or a "column vector" is 2d). >>> So in fact you should find it quite natural that that transpose was needed. >>> Matrices change * to matrix multiplication and ** to matrix exponentiation. >>> I find this very convenient, especially in a teaching setting, so I use >>> NumPy matrices all the time. Many on this list avoid them completely. >>> >>> Again, if you want a *scalar* as the product of vectors for which you >>> created matrix objects (e.g., a and b), you can just use flat: >>> np.dot(a.flat,b.flat) >>> >>> hth, >>> Alan Isaac >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> > > > -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "... humans'innate skills with numbers isn't much better than that of rats and dolphins." -- Stanislas Dehaene, neurosurgeon Web Page: From charlesr.harris at gmail.com Sat Dec 19 14:16:53 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 19 Dec 2009 12:16:53 -0700 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? 
In-Reply-To: <4B2D2079.6040008@sbcglobal.net> References: <4B2BF965.5010107@sbcglobal.net> <4B2C64AD.6000306@sbcglobal.net> <45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com> <4B2CCC4D.1070509@sbcglobal.net> <4B2CD2F6.7070406@student.matnat.uio.no> <4B2D031C.2090408@sbcglobal.net> <4B2D0A9D.20200@american.edu> <4B2D0FB2.6080905@sbcglobal.net> <4B2D18E1.20905@noaa.gov> <4B2D2079.6040008@sbcglobal.net> Message-ID: On Sat, Dec 19, 2009 at 11:50 AM, Wayne Watson wrote: > I guess I'll become accustomed to it over time. I have some interesting > things to do for which I will need the facilities of numpy. > > I realized where I got into trouble with some of this. I was not > differentiating between the dimensionality of space and that of a matrix > or array. I haven't had to crank out math and computer work for quite > awhile. Further, I've been doing a lot of reading on the Big Bang, and > the dimensionality of space. I'm presently strongly biased towards > thinking about space. For example, when I say 2D, I'm thinking of plane > geometry space, and 3D as the world we live in. > > Thanks to all on this thread. > > Ah, you got confused between number of elements (spatial dimension) vs number of indices (programming dimensions). Programming dimensions are more like the dimensions of a box or container, i.e., width x height (2 dimensional array) or width x height x depth (three dimensional array). You can put stuff in a container and the dimensions tell you how it is arranged inside. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From dagss at student.matnat.uio.no Sat Dec 19 14:52:17 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 19 Dec 2009 20:52:17 +0100 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: <4B2D18E1.20905@noaa.gov> References: <4B2BF965.5010107@sbcglobal.net> <4B2C0EB2.5050504@sbcglobal.net> <4B2C1308.900@sbcglobal.net> <45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com> <4B2C1A76.3000202@sbcglobal.net> <4B2C3074.80009@american.edu> <4B2C62FA.7010909@sbcglobal.net> <4B2C64AD.6000306@sbcglobal.net> <45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com> <4B2CCC4D.1070509@sbcglobal.net> <4B2CD2F6.7070406@student.matnat.uio.no> <4B2D031C.2090408@sbcglobal.net> <4B2D0A9D.20200@american.edu> <4B2D0FB2.6080905@sbcglobal.net> <4B2D18E1.20905@noaa.gov> Message-ID: <4B2D2EF1.2020905@student.matnat.uio.no> Christopher Barker wrote: > Wayne Watson wrote: > >> Yes, flat sounds useful here. However, numpy isn't bending over >> backwards to tie in conventional mathematical language into it. >> > > exactly -- it isn't bending over at all! (well a little -- see below). > numpy was designed for general purpose computational needs, not any one > branch of math. nd-arrays are very useful for lots of things. In > contrast, Matlab, for instance, was originally designed to be an easy > front-end to linear algebra package. Personally, when I used Matlab, I > found that very awkward -- I was usually writing 100s of lines of code > that had nothing to do with linear algebra, for every few lines that > actually did matrix math. So I much prefer numpy's way -- the linear > algebra lines of code are longer an more awkward, but the rest is much > better. > > The Matrix class is the exception to this: is was written to provide a > natural way to express linear algebra. 
However, things get a bit tricky > when you mix matrices and arrays, and even when sticking with matrices > there are confusions and limitations -- how do you express a row vs a > column vector? what do you get when you iterate over a matrix? etc. > > There has been a bunch of discussion about these issues, a lot of good > ideas, a little bit of consensus about how to improve it, but no one > with the skill to do it has enough motivation to do it. > I recently got motivated to get better linear algebra for Python; and startet submitting and writing on patches for Sage instead (which of course uses NumPy underneath). Sage has a strong concept of matrices and vectors, but not much numerical support, mainly exact or multi-precision arithmetic. So perhaps there will be more progress there; I'm not sure yet how far it will get or if anybody will join me in doing it... To me that seems like the ideal way to split up code -- let NumPy/SciPy deal with the array-oriented world and Sage the closer-to-mathematics notation. I never liked the NumPy matrix class. I think this is mainly because my matrices are often, but not always, diagonal, which doesn't fit at all into NumPy's way of thinking about these things. (Also a 2D or 3D array could easily be a "vector", like if you want to linearily transform the values of the pixels in an image. So I think any Python linear algebra package has to attack things in a totally different way from numpy.matrix). Dag Sverre From d.l.goldsmith at gmail.com Sat Dec 19 17:35:04 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Sat, 19 Dec 2009 14:35:04 -0800 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: <4B2D2EF1.2020905@student.matnat.uio.no> References: <4B2BF965.5010107@sbcglobal.net> <4B2C64AD.6000306@sbcglobal.net> <45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com> <4B2CCC4D.1070509@sbcglobal.net> <4B2CD2F6.7070406@student.matnat.uio.no> <4B2D031C.2090408@sbcglobal.net> <4B2D0A9D.20200@american.edu> <4B2D0FB2.6080905@sbcglobal.net> <4B2D18E1.20905@noaa.gov> <4B2D2EF1.2020905@student.matnat.uio.no> Message-ID: <45d1ab480912191435i33cc98d3x5994dcc1dccdc088@mail.gmail.com> I think the "bottom line" is: _only_ use the matrix class if _all_ you're doing is matrix algebra - which, as Chris Barker said, is (likely) the exception, not the rule, for most numpy users. I feel confident in saying this (that is, _only_ ... _all_) because if you feel you really must have a matrix (which I think should never really be the case: all the operations of matrix algebra can be done w/ arrays, it's just that some look a little more elegant if the operands are matrices) you can always cast a two-d array (or a one-d array, but then you have to be careful about whether you're casting to a row vector or a column vector) to a matrix - A = np.matrix(np.array(a)) - "on the fly," so to speak. That said, I'll be the first to acknowledge that those coming to array programming after having come up through a pure math curriculum - where "array" is essentially synonymous with "matrix," tensors rarely being written out in all their gorey component glory - are confronted with a perhaps surprising adjustment. 
Since no one has yet provided an explicit example of, IMO, the most fundamental difference between a 2-D numpy array and a numpy matrix, observe: >>> a = np.array([[1, 2], [3, 4]]) >>> a array([[1, 2], [3, 4]]) >>> A = np.matrix(a) >>> A matrix([[1, 2], [3, 4]]) >>> a*a # multiplication is performed "element by element" array([[ 1, 4], [ 9, 16]]) >>> A*A # standard matrix multiplication is performed matrix([[ 7, 10], [15, 22]]) In other words, the most fundamental difference (not the only difference, but the one which pretty much characterizes all the others) is the way the multiplication operator is overloaded: array multiplication is "element by element," whereas matrix multiplication is, well, matrix multiplication; oh, and the fact that type is preserved, i.e., the type of an array times an array is an array, the type of a matrix times a matrix is a matrix. (But be careful: >>> A*a matrix([[ 7, 10], [15, 22]]) >>> a*A matrix([[ 7, 10], [15, 22]])) i.e., multiplication of a matrix by an array is allowed, and regardless of order, the array operand is cast to a matrix, resulting in matrix multiplication and a matrix-type result.) HTH, DG From aisaac at american.edu Sun Dec 20 20:58:05 2009 From: aisaac at american.edu (Alan G Isaac) Date: Sun, 20 Dec 2009 20:58:05 -0500 Subject: [Numpy-discussion] indexing question Message-ID: <4B2ED62D.5030204@american.edu> Why is s3 F_CONTIGUOUS, and perhaps equivalently, why is its C_CONTIGUOUS data in s3.base (below)? Thanks, Alan Isaac >>> a3 array([[ 0, 1, 2, 3, 4, 5], [ 6, 7, 8, 9, 10, 11]]) >>> a3.flags C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False >>> ind array([3, 1, 2, 4, 5, 0]) >>> s3 = a3[:,ind] >>> s3.flags C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False >>> s3.base array([[ 3, 9], [ 1, 7], [ 2, 8], [ 4, 10], [ 5, 11], [ 0, 6]]) >>> s3 array([[ 3, 1, 2, 4, 5, 0], [ 9, 7, 8, 10, 11, 6]]) >>> From sierra_mtnview at sbcglobal.net Sun Dec 20 21:37:05 2009 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Sun, 20 Dec 2009 18:37:05 -0800 Subject: [Numpy-discussion] help(numpy.dot) Hmmm. Message-ID: <4B2EDF51.8040001@sbcglobal.net> I've just become acquainted with the help command in WinXP IDLE. help(numyp.sin) works fine. What's going on with dot? >>> help(numpy.core.multiarray.dot) Help on built-in function dot in module numpy.core.multiarray: dot(...) Is there help for dot? -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "... humans'innate skills with numbers isn't much better than that of rats and dolphins." -- Stanislas Dehaene, neurosurgeon Web Page: From george.dahl at gmail.com Sun Dec 20 21:38:51 2009 From: george.dahl at gmail.com (George Dahl) Date: Sun, 20 Dec 2009 21:38:51 -0500 Subject: [Numpy-discussion] nicest way to apply an arbitrary sequence of row deltas to an array In-Reply-To: <6b2e0e10912201825s65e3a766k2bd89a1ba8bdbd0b@mail.gmail.com> References: <6b2e0e10912201825s65e3a766k2bd89a1ba8bdbd0b@mail.gmail.com> Message-ID: <6b2e0e10912201838g3138ad6fl48d730bfb9a5e15b@mail.gmail.com> Hi everyone, I was wondering if anyone had insight on the best way to solve the following problem. Suppose I have a numpy array called U. 
U has shape (N,M) Suppose further that I have another array called dU and that dU has shape (P,M) and that P has no particular relationship to N, it could be larger or smaller. Additionally, I have a one dimensional array called idx and idx has shape (P,). Furthermore idx holds integers that are all valid indices into the rows of U. idx almost certainly contains some duplicates. I want, for k in range(P): U[idx[k],:] += dU[k,:] Is there a nice vectorized and efficient way to do this without making the obvious python for loop I included above? For what I am doing, P is usually quite large. I am most interested in a clever use of numpy/scipy commands and Python and not Cython. Thanks in advance for any suggestions. - George From charlesr.harris at gmail.com Sun Dec 20 21:52:18 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 20 Dec 2009 19:52:18 -0700 Subject: [Numpy-discussion] help(numpy.dot) Hmmm. In-Reply-To: <4B2EDF51.8040001@sbcglobal.net> References: <4B2EDF51.8040001@sbcglobal.net> Message-ID: On Sun, Dec 20, 2009 at 7:37 PM, Wayne Watson wrote: > I've just become acquainted with the help command in WinXP IDLE. > help(numyp.sin) works fine. What's going on with dot? > > >>> help(numpy.core.multiarray.dot) > Help on built-in function dot in module numpy.core.multiarray: > > dot(...) > > Is there help for dot? > > Yes, but you may be using an old version of numpy. What does numpy.__version__ say? You can also find documentation on the scipy.orgsite. Here is part of the current help: Help on built-in function dot in module numpy.core._dotblas: dot(...) dot(a, b) Dot product of two arrays. For 2-D arrays it is equivalent to matrix multiplication, and for 1-D arrays to inner product of vectors (without complex conjugation). For N dimensions it is a sum product over the last axis of `a` and the second-to-last of `b`:: dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m]) Parameters ---------- a : array_like First argument. b : array_like Second argument. Returns ------- output : ndarray Returns the dot product of `a` and `b`. If `a` and `b` are both scalars or both 1-D arrays then a scalar is returned; otherwise an array is returned. Raises ------ ValueError If the last dimension of `a` is not the same size as the second-to-last dimension of `b`. See Also -------- vdot : Complex-conjugating dot product. tensordot : Sum products over arbitrary axes. Examples -------- ... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sun Dec 20 23:35:29 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 20 Dec 2009 23:35:29 -0500 Subject: [Numpy-discussion] indexing question In-Reply-To: <4B2ED62D.5030204@american.edu> References: <4B2ED62D.5030204@american.edu> Message-ID: <1cd32cbb0912202035v1ceb2688ie96ff2d16014f829@mail.gmail.com> On Sun, Dec 20, 2009 at 8:58 PM, Alan G Isaac wrote: > Why is s3 F_CONTIGUOUS, and perhaps equivalently, > why is its C_CONTIGUOUS data in s3.base (below)? > Thanks, > Alan Isaac > >>>> a3 > array([[ 0, ?1, ?2, ?3, ?4, ?5], > ? ? ? ?[ 6, ?7, ?8, ?9, 10, 11]]) >>>> a3.flags > ? C_CONTIGUOUS : True > ? F_CONTIGUOUS : False > ? OWNDATA : True > ? WRITEABLE : True > ? ALIGNED : True > ? UPDATEIFCOPY : False >>>> ind > array([3, 1, 2, 4, 5, 0]) >>>> s3 = a3[:,ind] >>>> s3.flags > ? C_CONTIGUOUS : False > ? F_CONTIGUOUS : True > ? OWNDATA : False > ? WRITEABLE : True > ? ALIGNED : True > ? UPDATEIFCOPY : False >>>> s3.base > array([[ 3, ?9], > ? ? ? ?[ 1, ?7], > ? ? ? ?[ 2, ?8], > ? 
? ? ?[ 4, 10], > ? ? ? ?[ 5, 11], > ? ? ? ?[ 0, ?6]]) >>>> s3 > array([[ 3, ?1, ?2, ?4, ?5, ?0], > ? ? ? ?[ 9, ?7, ?8, 10, 11, ?6]]) Maybe another consequence of the different internal treatment of fancy and non-fancy slicing. I would infer from a comment by Travis in response to the question about the change in axis for 3d arrays with mixed fancy and non-fancy slicing. >>> ind = np.array([3, 1, 2, 4, 5, 0]) >>> s3 = a[:,ind] >>> s3.flags C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False >>> s4 = a[np.arange(2)[:,None],ind] >>> s4.flags C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False Josef >>>> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sierra_mtnview at sbcglobal.net Sun Dec 20 23:44:08 2009 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Sun, 20 Dec 2009 20:44:08 -0800 Subject: [Numpy-discussion] help(numpy.dot) Hmmm. In-Reply-To: References: <4B2EDF51.8040001@sbcglobal.net> Message-ID: <4B2EFD18.3090109@sbcglobal.net> 1.2.0. Did you find the description in the reference manual? Charles R Harris wrote: > > > On Sun, Dec 20, 2009 at 7:37 PM, Wayne Watson > > > wrote: > > I've just become acquainted with the help command in WinXP IDLE. > help(numyp.sin) works fine. What's going on with dot? > > >>> help(numpy.core.multiarray.dot) > Help on built-in function dot in module numpy.core.multiarray: > > dot(...) > > Is there help for dot? > > > Yes, but you may be using an old version of numpy. What does > numpy.__version__ say? You can also find documentation on the > scipy.org site. Here is part of the current help: > > Help on built-in function dot in module numpy.core._dotblas: > > dot(...) > dot(a, b) > > Dot product of two arrays. > > For 2-D arrays it is equivalent to matrix multiplication, and for 1-D > arrays to inner product of vectors (without complex conjugation). For > N dimensions it is a sum product over the last axis of `a` and > the second-to-last of `b`:: > > dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m]) > > Parameters > ---------- > a : array_like > First argument. > b : array_like > Second argument. > > Returns > ------- > output : ndarray > Returns the dot product of `a` and `b`. If `a` and `b` are both > scalars or both 1-D arrays then a scalar is returned; otherwise > an array is returned. > > Raises > ------ > ValueError > If the last dimension of `a` is not the same size as > the second-to-last dimension of `b`. > > See Also > -------- > vdot : Complex-conjugating dot product. > tensordot : Sum products over arbitrary axes. > > Examples > -------- > ... > > Chuck > > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "... humans'innate skills with numbers isn't much better than that of rats and dolphins." -- Stanislas Dehaene, neurosurgeon Web Page: From robert.kern at gmail.com Mon Dec 21 01:10:53 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 21 Dec 2009 00:10:53 -0600 Subject: [Numpy-discussion] help(numpy.dot) Hmmm. 
In-Reply-To: <4B2EFD18.3090109@sbcglobal.net> References: <4B2EDF51.8040001@sbcglobal.net> <4B2EFD18.3090109@sbcglobal.net> Message-ID: <3d375d730912202210i35ce4f8ar754ec1cb987898e@mail.gmail.com> On Sun, Dec 20, 2009 at 22:44, Wayne Watson wrote: > 1.2.0. Did you find the description in the reference manual? No, he found it using help(numpy.dot) using a more recent version of numpy. I highly recommend upgrading. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sierra_mtnview at sbcglobal.net Mon Dec 21 01:21:47 2009 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Sun, 20 Dec 2009 22:21:47 -0800 Subject: [Numpy-discussion] help(numpy.dot) Hmmm. In-Reply-To: <3d375d730912202210i35ce4f8ar754ec1cb987898e@mail.gmail.com> References: <4B2EDF51.8040001@sbcglobal.net> <4B2EFD18.3090109@sbcglobal.net> <3d375d730912202210i35ce4f8ar754ec1cb987898e@mail.gmail.com> Message-ID: <4B2F13FB.6060303@sbcglobal.net> Unfortunately, I'm in something of a bind with version. Although, I wonder if I can operate two versions of Python on the same Win XP? Whoops, I read that wrong. Yes, I think I can upgrade numpy without much difficulty. I am stuck with holding on the current version of Python. Robert Kern wrote: > On Sun, Dec 20, 2009 at 22:44, Wayne Watson > wrote: > >> 1.2.0. Did you find the description in the reference manual? >> > > No, he found it using help(numpy.dot) using a more recent version of > numpy. I highly recommend upgrading. > > -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "... humans'innate skills with numbers isn't much better than that of rats and dolphins." -- Stanislas Dehaene, neurosurgeon Web Page: From neilcrighton at gmail.com Mon Dec 21 04:35:08 2009 From: neilcrighton at gmail.com (Neil) Date: Mon, 21 Dec 2009 09:35:08 +0000 (UTC) Subject: [Numpy-discussion] =?utf-8?q?nicest_way_to_apply_an_arbitrary_seq?= =?utf-8?q?uence_of_row=09deltas_to_an_array?= References: <6b2e0e10912201825s65e3a766k2bd89a1ba8bdbd0b@mail.gmail.com> <6b2e0e10912201838g3138ad6fl48d730bfb9a5e15b@mail.gmail.com> Message-ID: George Dahl gmail.com> writes: > > Hi everyone, > I was wondering if anyone had insight on the best way to solve the > following problem. > > Suppose I have a numpy array called U. > U has shape (N,M) > Suppose further that I have another array called dU and that > dU has shape (P,M) and that P has no particular relationship to N, it > could be larger or smaller. > Additionally, I have a one dimensional array called idx and > idx has shape (P,). Furthermore idx holds integers that are all valid > indices into the rows of U. idx almost certainly contains some > duplicates. > > I want, > > for k in range(P): > U[idx[k],:] += dU[k,:] > > Is there a nice vectorized and efficient way to do this without making > the obvious python for loop I included above? For what I am doing, P > is usually quite large. I am most interested in a clever use of > numpy/scipy commands and Python and not Cython. > I'm also interested to see if there are any answers to this; I came across a similar problem recently. It would have been convenient to do something like U[idx] += dU, but this didn't work because there were repeated indices in idx. 
Here's a short example that shows the problem: In [1]: U = np.array([1., 2., 3., 4.]) In [2]: dU = np.array([0.1, 0.1, 0.1, 0.1]) In [3]: idx = np.array([0, 1, 2, 0]) In [4]: U Out[4]: array([ 1., 2., 3., 4.]) In [5]: U[idx] Out[5]: array([ 1., 2., 3., 1.]) In [6]: U[idx] += dU In [7]: U Out[7]: array([ 1.1, 2.1, 3.1, 4. ]) Ideally U would end up as array([ 1.2, 2.1, 3.1, 4. ]) Neil From pav+sp at iki.fi Mon Dec 21 06:35:45 2009 From: pav+sp at iki.fi (Pauli Virtanen) Date: Mon, 21 Dec 2009 11:35:45 +0000 (UTC) Subject: [Numpy-discussion] nicest way to apply an arbitrary sequence of row deltas to an array References: <6b2e0e10912201825s65e3a766k2bd89a1ba8bdbd0b@mail.gmail.com> <6b2e0e10912201838g3138ad6fl48d730bfb9a5e15b@mail.gmail.com> Message-ID: Mon, 21 Dec 2009 09:35:08 +0000, Neil wrote: [clip] > I'm also interested to see if there are any answers to this; I came > across a similar problem recently. It would have been convenient to do > something like U[idx] += dU, but this didn't work because there were > repeated indices in idx. Here's a short example that shows the problem: > > In [1]: U = np.array([1., 2., 3., 4.]) > In [2]: dU = np.array([0.1, 0.1, 0.1, 0.1]) > In [3]: idx = np.array([0, 1, 2, 0]) > In [4]: U > Out[4]: array([ 1., 2., 3., 4.]) > In [5]: U[idx] > Out[5]: array([ 1., 2., 3., 1.]) > In [6]: U[idx] += dU > In [7]: U > Out[7]: array([ 1.1, 2.1, 3.1, 4. ]) > > Ideally U would end up as array([ 1.2, 2.1, 3.1, 4. ]) One solution could be to use bincount: d = np.bincount(idx, dU) U[:len(d)] += d Also, bincount works only with scalar weights, so this is not a fully vectorized solution. A dedicated function could be nice here. -- Pauli Virtanen From josef.pktd at gmail.com Mon Dec 21 07:18:39 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 21 Dec 2009 07:18:39 -0500 Subject: [Numpy-discussion] nicest way to apply an arbitrary sequence of row deltas to an array In-Reply-To: References: <6b2e0e10912201825s65e3a766k2bd89a1ba8bdbd0b@mail.gmail.com> <6b2e0e10912201838g3138ad6fl48d730bfb9a5e15b@mail.gmail.com> Message-ID: <1cd32cbb0912210418m1f7109f8j38265a7b158d0285@mail.gmail.com> On Mon, Dec 21, 2009 at 6:35 AM, Pauli Virtanen wrote: > Mon, 21 Dec 2009 09:35:08 +0000, Neil wrote: > [clip] >> I'm also interested to see if there are any answers to this; I came >> across a similar problem recently. It would have been convenient to do >> something like U[idx] += dU, but this didn't work because there were >> repeated indices in idx. Here's a short example that shows the problem: >> >> In [1]: U = np.array([1., 2., 3., 4.]) >> In [2]: dU = np.array([0.1, 0.1, 0.1, 0.1]) >> In [3]: idx = np.array([0, 1, 2, 0]) >> In [4]: U >> Out[4]: array([ 1., ?2., ?3., ?4.]) >> In [5]: U[idx] >> Out[5]: array([ 1., ?2., ?3., ?1.]) >> In [6]: U[idx] += dU >> In [7]: U >> Out[7]: array([ 1.1, ?2.1, ?3.1, ?4. ]) >> >> Ideally U would end up as array([ 1.2, ?2.1, ?3.1, ?4. ]) > > One solution could be to use bincount: > > ? ? ? ?d = np.bincount(idx, dU) > ? ? ? ?U[:len(d)] += d > > Also, bincount works only with scalar weights, so this is not a fully > vectorized solution. > > A dedicated function could be nice here. Or what would be *very* useful in many applications, is to extend bincount to take nd weights. 
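In the meantime, for the (P, M) shaped dU of the original question, a minimal sketch of the per-column workaround (illustrative only; the helper name is made up and I haven't timed it):

import numpy as np

def apply_row_deltas(U, idx, dU):
    # accumulate dU[k] into row idx[k] of U, duplicates included;
    # bincount is called once per column because its weights must be 1-d
    for j in range(dU.shape[1]):
        d = np.bincount(idx, weights=dU[:, j])
        U[:len(d), j] += d
    return U

This trades the loop over P for a loop over M, which is usually much shorter.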
Josef > > -- > Pauli Virtanen > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sierra_mtnview at sbcglobal.net Mon Dec 21 12:40:36 2009 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Mon, 21 Dec 2009 09:40:36 -0800 Subject: [Numpy-discussion] cos -- NameError: global name 'cos' is not defined Message-ID: <4B2FB314.9080508@sbcglobal.net> In this code, ===========start import math import numpy as np from numpy import matrix def sinD(D): # given in degrees, convert to radians return sin(radians(D)) def cosD(D): return cos(radians(D)) <<-------------- def acosD(D): acos(radians(D)) return=====end the << line produces, "NameError: global name 'cos' is not defined", but the sin() above it does not? They are both built-in functions. -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "... humans'innate skills with numbers isn't much better than that of rats and dolphins." -- Stanislas Dehaene, neurosurgeon Web Page: From robert.kern at gmail.com Mon Dec 21 12:44:12 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 21 Dec 2009 11:44:12 -0600 Subject: [Numpy-discussion] cos -- NameError: global name 'cos' is not defined In-Reply-To: <4B2FB314.9080508@sbcglobal.net> References: <4B2FB314.9080508@sbcglobal.net> Message-ID: <3d375d730912210944r5f2bb3c7l9ad9d9b9e6ddd9be@mail.gmail.com> On Mon, Dec 21, 2009 at 11:40, Wayne Watson wrote: > In this code, > ===========start > import math > import numpy as np > from numpy import matrix > def sinD(D): # given in degrees, convert to radians > ? ?return sin(radians(D)) > def cosD(D): > ? ?return cos(radians(D)) ? <<-------------- > def acosD(D): > ? ?acos(radians(D)) > ? ?return=====end > the << line produces, "NameError: global name 'cos' is not defined", but > the sin() above it does not? They are both built-in functions. No, they aren't. They are in the math module. You want math.cos(). The same goes for radians() and acos() and sin(). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From kwgoodman at gmail.com Mon Dec 21 12:44:42 2009 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 21 Dec 2009 09:44:42 -0800 Subject: [Numpy-discussion] cos -- NameError: global name 'cos' is not defined In-Reply-To: <4B2FB314.9080508@sbcglobal.net> References: <4B2FB314.9080508@sbcglobal.net> Message-ID: On Mon, Dec 21, 2009 at 9:40 AM, Wayne Watson wrote: > In this code, > ===========start > import math > import numpy as np > from numpy import matrix > def sinD(D): # given in degrees, convert to radians > ? ?return sin(radians(D)) > def cosD(D): > ? ?return cos(radians(D)) ? <<-------------- > def acosD(D): > ? ?acos(radians(D)) > ? ?return=====end > the << line produces, "NameError: global name 'cos' is not defined", but > the sin() above it does not? They are both built-in functions. >> sin(10) NameError: name 'sin' is not defined Oh, right, there is no built-in sin function. 
I need to import it: >> import numpy as np >> import math >> >> math.sin(1) 0.8414709848078965 >> np.sin(1) 0.8414709848078965 or >> from numpy import sin >> sin(1) 0.8414709848078965 From Chris.Barker at noaa.gov Mon Dec 21 12:57:42 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 21 Dec 2009 09:57:42 -0800 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: <4B2D2EF1.2020905@student.matnat.uio.no> References: <4B2BF965.5010107@sbcglobal.net> <4B2C0EB2.5050504@sbcglobal.net> <4B2C1308.900@sbcglobal.net> <45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com> <4B2C1A76.3000202@sbcglobal.net> <4B2C3074.80009@american.edu> <4B2C62FA.7010909@sbcglobal.net> <4B2C64AD.6000306@sbcglobal.net> <45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com> <4B2CCC4D.1070509@sbcglobal.net> <4B2CD2F6.7070406@student.matnat.uio.no> <4B2D031C.2090408@sbcglobal.net> <4B2D0A9D.20200@american.edu> <4B2D0FB2.6080905@sbcglobal.net> <4B2D18E1.20905@noaa.gov> <4B2D2EF1.2020905@student.matnat.uio.no> Message-ID: <4B2FB716.4010607@noaa.gov> Dag Sverre Seljebotn wrote: > I recently got motivated to get better linear algebra for Python; wonderful! > To me that seems like the ideal way to split up code -- let NumPy/SciPy > deal with the array-oriented world and Sage the closer-to-mathematics > notation. well, maybe -- but there is a lot of call for pure-computational linear algebra. I do hope you'll consider building the computational portion of it in a way that might be included in numpy or scipy by itself in the future. Have you read this lengthy thread? and these summary wikipages: http://scipy.org/NewMatrixSpec http://www.scipy.org/MatrixIndexing Though it sounds a bit like you are going your own way with it anyway. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From gokhansever at gmail.com Mon Dec 21 13:22:30 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Mon, 21 Dec 2009 12:22:30 -0600 Subject: [Numpy-discussion] Another numpy svn installation error Message-ID: <49d6b3500912211022g52089f0bnab7e4a58a90fe878@mail.gmail.com> Hello, Here are the steps that I went through to install the numpy from the svn-repo: svn co http://svn.scipy.org/svn/numpy/trunk numpy Be "su" and type: python setupegg.py develop Successful installation so far, but import fails with the given error: [gsever at ccn Desktop]$ python Python 2.6 (r26:66714, Jun 8 2009, 16:07:26) [GCC 4.4.0 20090506 (Red Hat 4.4.0-4)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import numpy Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/home/gsever/Desktop/python-repo/numpy/numpy/__init__.py", line 132, in <module> import add_newdocs File "/home/gsever/Desktop/python-repo/numpy/numpy/add_newdocs.py", line 9, in <module> from numpy.lib import add_newdoc File "/home/gsever/Desktop/python-repo/numpy/numpy/lib/__init__.py", line 4, in <module> from type_check import * File "/home/gsever/Desktop/python-repo/numpy/numpy/lib/type_check.py", line 8, in <module> import numpy.core.numeric as _nx File "/home/gsever/Desktop/python-repo/numpy/numpy/core/__init__.py", line 6, in <module> import umath ImportError: /home/gsever/Desktop/python-repo/numpy/numpy/core/umath.so: undefined symbol: npy_spacing My platform: Linux-2.6.29.6-217.2.3.fc11.i686.PAE-i686-with-fedora-11-Leonidas Any ideas what might be wrong here? -- Gökhan From d.l.goldsmith at gmail.com Mon Dec 21 13:52:41 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Mon, 21 Dec 2009 10:52:41 -0800 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: <4B2FB716.4010607@noaa.gov> References: <4B2BF965.5010107@sbcglobal.net> <45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com> <4B2CCC4D.1070509@sbcglobal.net> <4B2CD2F6.7070406@student.matnat.uio.no> <4B2D031C.2090408@sbcglobal.net> <4B2D0A9D.20200@american.edu> <4B2D0FB2.6080905@sbcglobal.net> <4B2D18E1.20905@noaa.gov> <4B2D2EF1.2020905@student.matnat.uio.no> <4B2FB716.4010607@noaa.gov> Message-ID: <45d1ab480912211052k747935fdhc7ad4fb517e20b42@mail.gmail.com> On Mon, Dec 21, 2009 at 9:57 AM, Christopher Barker wrote: > Dag Sverre Seljebotn wrote: >> I recently got motivated to get better linear algebra for Python; > > wonderful! > >> To me that seems like the ideal way to split up code -- let NumPy/SciPy >> deal with the array-oriented world and Sage the closer-to-mathematics >> notation. > > well, maybe -- but there is a lot of call for pure-computational linear > algebra. I do hope you'll consider building the computational portion of > it in a way that might be included in numpy or scipy by itself in the > future. My personal opinion is that the LA status quo is acceptably good: there's maybe a bit of an adjustment to make for newbies, but I don't see it as a very big one, and this list strikes me as very efficient at getting people over little bumps (e.g., someone emails in: "how do you matrix-multiply two arrays?" and within minutes (:-)) Robert or Charles replies with "np.dot: np.dot([[1,2],[3,4]],[[1,2],[3,4]]) = array([[7,10],[15,22]])"). Certainly any significant changes to the base should need to run the gauntlet of an NEP process. DG From george.dahl at gmail.com Mon Dec 21 13:55:14 2009 From: george.dahl at gmail.com (George Dahl) Date: Mon, 21 Dec 2009 13:55:14 -0500 Subject: [Numpy-discussion] nicest way to apply an arbitrary sequence of row deltas to an array In-Reply-To: References: <6b2e0e10912201825s65e3a766k2bd89a1ba8bdbd0b@mail.gmail.com> <6b2e0e10912201838g3138ad6fl48d730bfb9a5e15b@mail.gmail.com> Message-ID: <6b2e0e10912211055m67cb526cm7fe3bec8fd126ffe@mail.gmail.com> So with bincount I can exchange a loop over P for a loop over M? I guess for me that is still really helpful. Thanks! - George On Mon, Dec 21, 2009 at 6:35 AM, Pauli Virtanen wrote: > Mon, 21 Dec 2009 09:35:08 +0000, Neil wrote: > [clip] >> I'm also interested to see if there are any answers to this; I came >> across a similar problem recently.
It would have been convenient to do >> something like U[idx] += dU, but this didn't work because there were >> repeated indices in idx. Here's a short example that shows the problem: >> >> In [1]: U = np.array([1., 2., 3., 4.]) >> In [2]: dU = np.array([0.1, 0.1, 0.1, 0.1]) >> In [3]: idx = np.array([0, 1, 2, 0]) >> In [4]: U >> Out[4]: array([ 1., ?2., ?3., ?4.]) >> In [5]: U[idx] >> Out[5]: array([ 1., ?2., ?3., ?1.]) >> In [6]: U[idx] += dU >> In [7]: U >> Out[7]: array([ 1.1, ?2.1, ?3.1, ?4. ]) >> >> Ideally U would end up as array([ 1.2, ?2.1, ?3.1, ?4. ]) > > One solution could be to use bincount: > > ? ? ? ?d = np.bincount(idx, dU) > ? ? ? ?U[:len(d)] += d > > Also, bincount works only with scalar weights, so this is not a fully > vectorized solution. > > A dedicated function could be nice here. > > -- > Pauli Virtanen > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sierra_mtnview at sbcglobal.net Mon Dec 21 14:44:44 2009 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Mon, 21 Dec 2009 11:44:44 -0800 Subject: [Numpy-discussion] cos -- NameError: global name 'cos' is not defined In-Reply-To: References: <4B2FB314.9080508@sbcglobal.net> Message-ID: <4B2FD02C.4040608@sbcglobal.net> Yes, one can get both sin and cos via the interactive shell, if math is imported as you have done. However, I thought math itself always present to a program module? In my program, sin exists but not cos, so one is forced to use math.cos(). Why one but not the other? Keith Goodman wrote: > On Mon, Dec 21, 2009 at 9:40 AM, Wayne Watson > wrote: > >> In this code, >> ===========start >> import math >> import numpy as np >> from numpy import matrix >> def sinD(D): # given in degrees, convert to radians >> return sin(radians(D)) >> def cosD(D): >> return cos(radians(D)) <<-------------- >> def acosD(D): >> acos(radians(D)) >> return=====end >> the << line produces, "NameError: global name 'cos' is not defined", but >> the sin() above it does not? They are both built-in functions. >> > > >>> sin(10) >>> > NameError: name 'sin' is not defined > > Oh, right, there is no built-in sin function. I need to import it: > > >>> import numpy as np >>> import math >>> >>> math.sin(1) >>> > 0.8414709848078965 > >>> np.sin(1) >>> > 0.8414709848078965 > > or > > >>> from numpy import sin >>> sin(1) >>> > 0.8414709848078965 > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "... humans'innate skills with numbers isn't much better than that of rats and dolphins." -- Stanislas Dehaene, neurosurgeon Web Page: From robert.kern at gmail.com Mon Dec 21 14:46:51 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 21 Dec 2009 13:46:51 -0600 Subject: [Numpy-discussion] cos -- NameError: global name 'cos' is not defined In-Reply-To: <4B2FD02C.4040608@sbcglobal.net> References: <4B2FB314.9080508@sbcglobal.net> <4B2FD02C.4040608@sbcglobal.net> Message-ID: <3d375d730912211146g739794fxe61aafeccb574c19@mail.gmail.com> On Mon, Dec 21, 2009 at 13:44, Wayne Watson wrote: > Yes, one can get both sin and cos via the interactive shell, if math is > imported as you have done. 
> However, I thought math itself always present to a program module? No. > In > my program, sin exists but not cos, so one is forced to use math.cos(). > Why one but not the other? Presumably you have imported it somewhere. Please show your program, and we may be able to point it out to you. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From fonnesbeck at gmail.com Mon Dec 21 15:29:01 2009 From: fonnesbeck at gmail.com (Chris) Date: Mon, 21 Dec 2009 20:29:01 +0000 (UTC) Subject: [Numpy-discussion] Import error in builds of 7726 References: <5b8d13220912122237p5da315e0yb7cc83f0e8edbbe1@mail.gmail.com> <3d375d730912122325w279f2a12uba9dd22f58cbbe3f@mail.gmail.com> <5b8d13220912130056w170b9e22g8e50e7e726f60596@mail.gmail.com> <5b8d13220912132026h55259b07t67254bd7c8f17806@mail.gmail.com> Message-ID: David Cournapeau <cournape at gmail.com> writes: > > Ok, so the undefined functions all indicate that the most recently > implemented ones are not included. I really cannot see any other > explanation than having a discrepancy between the source tree, build > tree and installation. Sometimes, svn screws things up when switching > between branches in my experience, so that's something to check for as > well. > > Could you give us the generated config.h (somewhere in > build/src.*/numpy/core/), just in case ? > Am I to assume, then, that there is no fix for this issue at this stage? I am still unable to build a working version of the software (I tried again today with a fresh checkout). Thanks, cf From charlesr.harris at gmail.com Mon Dec 21 15:32:41 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 21 Dec 2009 13:32:41 -0700 Subject: [Numpy-discussion] Import error in builds of 7726 In-Reply-To: References: <5b8d13220912122237p5da315e0yb7cc83f0e8edbbe1@mail.gmail.com> <3d375d730912122325w279f2a12uba9dd22f58cbbe3f@mail.gmail.com> <5b8d13220912130056w170b9e22g8e50e7e726f60596@mail.gmail.com> <5b8d13220912132026h55259b07t67254bd7c8f17806@mail.gmail.com> Message-ID: On Mon, Dec 21, 2009 at 1:29 PM, Chris wrote: > David Cournapeau <cournape at gmail.com> writes: > > > > > Ok, so the undefined functions all indicate that the most recently > > implemented ones are not included. I really cannot see any other > > explanation than having a discrepancy between the source tree, build > > tree and installation. Sometimes, svn screws things up when switching > > between branches in my experience, so that's something to check for as > > well. > > > > Could you give us the generated config.h (somewhere in > > build/src.*/numpy/core/), just in case ? > > > > Am I to assume, then, that there is no fix for this issue at this stage? > I am still unable to build a working version of the software (I tried again > today with a fresh checkout). > > Well, we don't know what the issue is, except that you are having problems. So we need to work through things, and if you could supply the info David asks for, that would help. Chuck From dagss at student.matnat.uio.no Mon Dec 21 16:31:04 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Mon, 21 Dec 2009 22:31:04 +0100 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays?
In-Reply-To: <4B2FB716.4010607@noaa.gov> References: <4B2BF965.5010107@sbcglobal.net> <4B2C0EB2.5050504@sbcglobal.net> <4B2C1308.900@sbcglobal.net> <45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com> <4B2C1A76.3000202@sbcglobal.net> <4B2C3074.80009@american.edu> <4B2C62FA.7010909@sbcglobal.net> <4B2C64AD.6000306@sbcglobal.net> <45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com> <4B2CCC4D.1070509@sbcglobal.net> <4B2CD2F6.7070406@student.matnat.uio.no> <4B2D031C.2090408@sbcglobal.net> <4B2D0A9D.20200@american.edu> <4B2D0FB2.6080905@sbcglobal.net> <4B2D18E1.20905@noaa.gov> <4B2D2EF1.2020905@student.matnat.uio.no> <4B2FB716.4010607@noaa.gov> Message-ID: <4B2FE918.10206@student.matnat.uio.no> Christopher Barker wrote: > Dag Sverre Seljebotn wrote: >> I recently got motivated to get better linear algebra for Python; > > wonderful! > >> To me that seems like the ideal way to split up code -- let NumPy/SciPy >> deal with the array-oriented world and Sage the closer-to-mathematics >> notation. > > well, maybe -- but there is a lot of call for pure-computational linear > algebra. I do hope you'll consider building the computational portion of > it in a way that might be included in numpy or scipy by itself in the > future. > > Have you read this lengthy thread? > > > > and these summary wikipages: > > http://scipy.org/NewMatrixSpec > http://www.scipy.org/MatrixIndexing > > > Though it sounds a bit like you are going your own way with it anyway. Yes, I'm going my own way with it -- the SciPy matrix discussion tends to focus on cosmetics IMO, and I just tend to fundamentally disagree with the direction these discussions take on the SciPy/NumPy lists. What I'm after is not just some cosmetics for avoiding a call to dot. I'm after something which will allow me to structure my programs better -- something which e.g. allows my sampling routines to not care (by default, rather than as a workaround) about whether the specified covariance matrix is sparse or dense when trying to Cholesky decompose it, or something which allows one to set the best iterative solver to use for a given matrix at an outer level in the program, but do the actual solving somewhere else, without all the boilerplate and all the variable passing and callbacks. -- Dag Sverre From dagss at student.matnat.uio.no Mon Dec 21 16:33:39 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Mon, 21 Dec 2009 22:33:39 +0100 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: <4B2FB716.4010607@noaa.gov> References: <4B2BF965.5010107@sbcglobal.net> <4B2C0EB2.5050504@sbcglobal.net> <4B2C1308.900@sbcglobal.net> <45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com> <4B2C1A76.3000202@sbcglobal.net> <4B2C3074.80009@american.edu> <4B2C62FA.7010909@sbcglobal.net> <4B2C64AD.6000306@sbcglobal.net> <45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com> <4B2CCC4D.1070509@sbcglobal.net> <4B2CD2F6.7070406@student.matnat.uio.no> <4B2D031C.2090408@sbcglobal.net> <4B2D0A9D.20200@american.edu> <4B2D0FB2.6080905@sbcglobal.net> <4B2D18E1.20905@noaa.gov> <4B2D2EF1.2020905@student.matnat.uio.no> <4B2FB716.4010607@noaa.gov> Message-ID: <4B2FE9B3.3000603@student.matnat.uio.no> Christopher Barker wrote: > Dag Sverre Seljebotn wrote: >> I recently got motivated to get better linear algebra for Python; > > wonderful! > >> To me that seems like the ideal way to split up code -- let NumPy/SciPy >> deal with the array-oriented world and Sage the closer-to-mathematics >> notation. 
> > well, maybe -- but there is a lot of call for pure-computational linear > algebra. I do hope you'll consider building the computational portion of > it in a way that might be included in numpy or scipy by itself in the > future. This is readily done -- there is no computational portion except for what is in NumPy/Scipy or scikits, and I intend for it to remain that way. It's just another interface, really. (What kind of computations were you thinking about?) -- Dag Sverre From d.l.goldsmith at gmail.com Mon Dec 21 17:25:23 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Mon, 21 Dec 2009 14:25:23 -0800 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: <4B2FE918.10206@student.matnat.uio.no> References: <4B2BF965.5010107@sbcglobal.net> <4B2CCC4D.1070509@sbcglobal.net> <4B2CD2F6.7070406@student.matnat.uio.no> <4B2D031C.2090408@sbcglobal.net> <4B2D0A9D.20200@american.edu> <4B2D0FB2.6080905@sbcglobal.net> <4B2D18E1.20905@noaa.gov> <4B2D2EF1.2020905@student.matnat.uio.no> <4B2FB716.4010607@noaa.gov> <4B2FE918.10206@student.matnat.uio.no> Message-ID: <45d1ab480912211425n2873c2b4g47a63c61bf6c1a3b@mail.gmail.com> On Mon, Dec 21, 2009 at 1:31 PM, Dag Sverre Seljebotn wrote: > > Yes, I'm going my own way with it -- the SciPy matrix discussion tends > to focus on cosmetics IMO, and I just tend to fundamentally disagree > with the direction these discussions take on the SciPy/NumPy lists. > What I'm after is not just some cosmetics for avoiding a call to dot. > > I'm after something which will allow me to structure my programs better > -- something which e.g. allows my sampling routines to not care (by > default, rather than as a workaround) about whether the specified > covariance matrix is sparse or dense when trying to Cholesky decompose > it, or something which allows one to set the best iterative solver to > use for a given matrix at an outer level in the program, but do the > actual solving somewhere else, without all the boilerplate and all the > variable passing and callbacks. > > -- > Dag Sverre OK, it sounds like these sorts of things might be "universally" useful! :-) Keep us apprised, please. DG From cournape at gmail.com Mon Dec 21 17:29:01 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 22 Dec 2009 07:29:01 +0900 Subject: [Numpy-discussion] Import error in builds of 7726 In-Reply-To: References: <5b8d13220912122237p5da315e0yb7cc83f0e8edbbe1@mail.gmail.com> <3d375d730912122325w279f2a12uba9dd22f58cbbe3f@mail.gmail.com> <5b8d13220912130056w170b9e22g8e50e7e726f60596@mail.gmail.com> <5b8d13220912132026h55259b07t67254bd7c8f17806@mail.gmail.com> Message-ID: <5b8d13220912211429l61e1177ep929f5f6771cc15a5@mail.gmail.com> On Tue, Dec 22, 2009 at 5:29 AM, Chris wrote: > David Cournapeau <cournape at gmail.com> writes: > >> >> Ok, so the undefined functions all indicate that the most recently >> implemented ones are not included. I really cannot see any other >> explanation than having a discrepancy between the source tree, build >> tree and installation. Sometimes, svn screws things up when switching >> between branches in my experience, so that's something to check for as >> well. >> >> Could you give us the generated config.h (somewhere in >> build/src.*/numpy/core/), just in case ? >> > > Am I to assume, then, that there is no fix for this issue at this stage? It is more that I cannot reproduce the issue, and without being able to do so, I don't see much chance to solve the issue at hand.
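One quick sanity check in the meantime (just a guess that a stale build or install is being picked up) is to verify what actually gets imported:

import numpy
print numpy.__version__  # should match the svn revision you just built
print numpy.__file__     # should point at the fresh install, not an older copy

If __file__ points somewhere unexpected, removing the old install and the build directory before rebuilding would be the first thing to try.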
David From sierra_mtnview at sbcglobal.net Mon Dec 21 18:11:41 2009 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Mon, 21 Dec 2009 15:11:41 -0800 Subject: [Numpy-discussion] cos -- NameError: global name 'cos' is not defined In-Reply-To: <3d375d730912210944r5f2bb3c7l9ad9d9b9e6ddd9be@mail.gmail.com> References: <4B2FB314.9080508@sbcglobal.net> <3d375d730912210944r5f2bb3c7l9ad9d9b9e6ddd9be@mail.gmail.com> Message-ID: <4B3000AD.6060704@sbcglobal.net> Yes, thanks. That's the what I finally changed to. This originated up a thread or so when I displayed the highly populated code with math. Some said I didn't need it, so I thought I'd give it a go. I just started plugging away again with IDLE and am pretty convinced that IDLE is something of an enemy. I started afresh loading this into the editor: import math print "hello, math world." print math.cos(0.5) print math.sin(0.8) Run works fine. No errors. Now I do: >>> dir() ['__builtins__', '__doc__', '__file__', '__name__', 'idlelib', 'math'] >>> OK, swell. Now I import via the script window >>> import numpy as np >>> dir() ['__builtins__', '__doc__', '__file__', '__name__', 'idlelib', 'math', 'np'] I think I'm adding to the namespace both the program the script sees., because adding this ref to np in the program works fine. import math print "hello, math world." print math.cos(0.5) print math.sin(0.8) print np.sin(2.2) I've been assuming that IDLE clears the namespace. It's quite possible that I get anomalous results as I move between Run the program via the editor, and fiddling in script land. I would like to think that IDLE has some way to clear the namespace before it runs the program. If not, yikes! Robert Kern wrote: > On Mon, Dec 21, 2009 at 11:40, Wayne Watson > wrote: > >> In this code, >> ===========start >> import math >> import numpy as np >> from numpy import matrix >> def sinD(D): # given in degrees, convert to radians >> return sin(radians(D)) >> def cosD(D): >> return cos(radians(D)) <<-------------- >> def acosD(D): >> acos(radians(D)) >> return=====end >> the << line produces, "NameError: global name 'cos' is not defined", but >> the sin() above it does not? They are both built-in functions. >> > > No, they aren't. They are in the math module. You want math.cos(). The > same goes for radians() and acos() and sin(). > > -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) Obz Site: 39? 15' 7" N, 121? 2' 32" W, 2700 feet "... humans'innate skills with numbers isn't much better than that of rats and dolphins." -- Stanislas Dehaene, neurosurgeon Web Page: From Chris.Barker at noaa.gov Mon Dec 21 18:30:31 2009 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 21 Dec 2009 15:30:31 -0800 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? 
In-Reply-To: <4B2FE9B3.3000603@student.matnat.uio.no> References: <4B2BF965.5010107@sbcglobal.net> <4B2C0EB2.5050504@sbcglobal.net> <4B2C1308.900@sbcglobal.net> <45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com> <4B2C1A76.3000202@sbcglobal.net> <4B2C3074.80009@american.edu> <4B2C62FA.7010909@sbcglobal.net> <4B2C64AD.6000306@sbcglobal.net> <45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com> <4B2CCC4D.1070509@sbcglobal.net> <4B2CD2F6.7070406@student.matnat.uio.no> <4B2D031C.2090408@sbcglobal.net> <4B2D0A9D.20200@american.edu> <4B2D0FB2.6080905@sbcglobal.net> <4B2D18E1.20905@noaa.gov> <4B2D2EF1.2020905@student.matnat.uio.no> <4B2FB716.4010607@noaa.gov> <4B2FE9B3.3000603@student.matnat.uio.no> Message-ID: <4B300517.5070303@noaa.gov> Dag Sverre Seljebotn wrote: > This is readily done -- there is no computational portion except for > what is in NumPy/Scipy or scikits, and I intend for it to remain that > way. It's just another interface, really. > > (What kind of computations were you thinking about?) Nothing in particular -- just computational as opposed to symbolic manipulation. It sounds like you've got some good ideas. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From josef.pktd at gmail.com Mon Dec 21 18:31:18 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 21 Dec 2009 18:31:18 -0500 Subject: [Numpy-discussion] cos -- NameError: global name 'cos' is not defined In-Reply-To: <4B3000AD.6060704@sbcglobal.net> References: <4B2FB314.9080508@sbcglobal.net> <3d375d730912210944r5f2bb3c7l9ad9d9b9e6ddd9be@mail.gmail.com> <4B3000AD.6060704@sbcglobal.net> Message-ID: <1cd32cbb0912211531mcad1748j381279200d97a051@mail.gmail.com> On Mon, Dec 21, 2009 at 6:11 PM, Wayne Watson wrote: > Yes, thanks. That's the what I finally changed to. This originated up a > thread or so when I displayed the highly populated code with math. Some > said I didn't need it, so I thought I'd give it a go. > > I just started plugging away again with IDLE and am pretty convinced > that IDLE is something of an enemy. I started afresh loading this into > the editor: > > ? ?import math > ? ?print "hello, math world." > ? ?print math.cos(0.5) > ? ?print math.sin(0.8) > > > Run works fine. No errors. Now I do: > ?>>> dir() > ['__builtins__', '__doc__', '__file__', '__name__', 'idlelib', 'math'] > ?>>> > OK, swell. Now I import via the script window > ?>>> import numpy as np > ?>>> dir() > ['__builtins__', '__doc__', '__file__', '__name__', 'idlelib', 'math', 'np'] > > I think I'm adding to the namespace both the program the script sees., > because adding this ref to np in the program works fine. > > ? ?import math > ? ?print "hello, math world." > ? ?print math.cos(0.5) > ? ?print math.sin(0.8) > ? ?print np.sin(2.2) > > I've been assuming that IDLE clears the namespace. ?It's quite possible > that I get anomalous results as I move between Run the program via the > editor, and fiddling in script land. I would like to think that IDLE has > some way to clear the namespace before it runs the program. If not, yikes! idle has two modes depending on whether it is started with or without -n option on the command line. I usually pick this option depending on what I am doing. 
From the IDLE help: Running without a subprocess: If IDLE is started with the -n command line switch it will run in a single process and will not create the subprocess which runs the RPC Python execution server. This can be useful if Python cannot create the subprocess or the RPC socket interface on your platform. However, in this mode user code is not isolated from IDLE itself. Also, the environment is not restarted when Run/Run Module (F5) is selected. If your code has been modified, you must reload() the affected modules and re-import any specific items (e.g. from foo import baz) if the changes are to take effect. For these reasons, it is preferable to run IDLE with the default subprocess if at all possible. Josef > > > > Robert Kern wrote: >> On Mon, Dec 21, 2009 at 11:40, Wayne Watson >> wrote: >> >>> In this code, >>> ===========start >>> import math >>> import numpy as np >>> from numpy import matrix >>> def sinD(D): # given in degrees, convert to radians >>> ? ?return sin(radians(D)) >>> def cosD(D): >>> ? ?return cos(radians(D)) ? <<-------------- >>> def acosD(D): >>> ? ?acos(radians(D)) >>> ? ?return=====end >>> the << line produces, "NameError: global name 'cos' is not defined", but >>> the sin() above it does not? They are both built-in functions. >>> >> >> No, they aren't. They are in the math module. You want math.cos(). The >> same goes for radians() and acos() and sin(). >> >> > > -- > ? ? ? ? ? Wayne Watson (Watson Adventures, Prop., Nevada City, CA) > > ? ? ? ? ? ? (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time) > ? ? ? ? ? ? ?Obz Site: ?39? 15' 7" N, 121? 2' 32" W, 2700 feet > > ? ? ? ? ? ? "... humans'innate skills with numbers isn't much > ? ? ? ? ? ? ?better than that of rats and dolphins." > ? ? ? ? ? ? ? ? ? ? ? -- Stanislas Dehaene, neurosurgeon > > ? ? ? ? ? ? ? ? ? ?Web Page: > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sierra_mtnview at sbcglobal.net Mon Dec 21 19:51:54 2009 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Mon, 21 Dec 2009 16:51:54 -0800 Subject: [Numpy-discussion] cos -- NameError: global name 'cos' is not defined In-Reply-To: <1cd32cbb0912211531mcad1748j381279200d97a051@mail.gmail.com> References: <4B2FB314.9080508@sbcglobal.net> <3d375d730912210944r5f2bb3c7l9ad9d9b9e6ddd9be@mail.gmail.com> <4B3000AD.6060704@sbcglobal.net> <1cd32cbb0912211531mcad1748j381279200d97a051@mail.gmail.com> Message-ID: <4B30182A.4020902@sbcglobal.net> josef.pktd at gmail.com wrote: > On Mon, Dec 21, 2009 at 6:11 PM, Wayne Watson > wrote: > ... >> ) >> print np.sin(2.2) >> >> I've been assuming that IDLE clears the namespace. It's quite possible >> that I get anomalous results as I move between Run the program via the >> editor, and fiddling in script land. I would like to think that IDLE has >> some way to clear the namespace before it runs the program. If not, yikes! >> > > idle has two modes depending on whether it is started with or without > -n option on the command line. I usually pick this option depending on > what I am doing. From the IDLE help: > > Running without a subprocess: > > If IDLE is started with the -n command line switch it will run in a > single process and will not create the subprocess which runs the RPC > Python execution server. This can be useful if Python cannot create > the subprocess or the RPC socket interface on your platform. 
However, > in this mode user code is not isolated from IDLE itself. Also, the > environment is not restarted when Run/Run Module (F5) is selected. If > your code has been modified, you must reload() the affected modules and > re-import any specific items (e.g. from foo import baz) if the changes > are to take effect. For these reasons, it is preferable to run IDLE > with the default subprocess if at all possible. > > Josef > > I'm running under Win XP. If there are command line options, I'm not aware of them. I tried the reload in the script window, but got nowhere with it. Is it usable in the program itself? Ah, I'm looking in my copy of Core Python by Chun, and it gives some details on it. The IDLE Help itself does not mention reload(). Maybe I need to use a Windows command line and get away from IDLE. I started into IPython, but slipped back to IDLE. I think it may need another chance. From sierra_mtnview at sbcglobal.net Mon Dec 21 20:25:54 2009 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Mon, 21 Dec 2009 17:25:54 -0800 Subject: Re: [Numpy-discussion] cos -- NameError: global name 'cos' is not defined In-Reply-To: <4B30182A.4020902@sbcglobal.net> References: <4B2FB314.9080508@sbcglobal.net> <3d375d730912210944r5f2bb3c7l9ad9d9b9e6ddd9be@mail.gmail.com> <4B3000AD.6060704@sbcglobal.net> <1cd32cbb0912211531mcad1748j381279200d97a051@mail.gmail.com> <4B30182A.4020902@sbcglobal.net> Message-ID: <4B302022.2020707@sbcglobal.net> I may have inadvertently made a slip between using script versus shell. What I'm getting at is that the namespace is the same for both the editor window and shell window. I find that a little bizarre. I would have expected each Run from the editor to clear all modules, and only load those shown in the editor. From josef.pktd at gmail.com Mon Dec 21 23:21:07 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 21 Dec 2009 23:21:07 -0500 Subject: Re: [Numpy-discussion] cos -- NameError: global name 'cos' is not defined In-Reply-To: <4B302022.2020707@sbcglobal.net> References: <4B2FB314.9080508@sbcglobal.net> <3d375d730912210944r5f2bb3c7l9ad9d9b9e6ddd9be@mail.gmail.com> <4B3000AD.6060704@sbcglobal.net> <1cd32cbb0912211531mcad1748j381279200d97a051@mail.gmail.com> <4B30182A.4020902@sbcglobal.net> <4B302022.2020707@sbcglobal.net> Message-ID: <1cd32cbb0912212021i1854ba43wa1b4c71421d81746@mail.gmail.com> On Mon, Dec 21, 2009 at 8:25 PM, Wayne Watson wrote: > I may have inadvertently made a slip between using script versus shell. > What I'm getting at is that the namespace is the same for both the > editor window and shell window. I find that a little bizarre. I would > have expected each Run from the editor to clear all modules, and only > load those shown in the editor. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion I'm not really sure what you mean by "What I'm getting at is that the namespace is the same for both the editor window and shell window" I'm also on Windows XP. I'm using the no-subprocess (-n) option of IDLE in the following way: starting on the command line with -n, as in >"C:\Programs\Python25\pythonw.exe" "C:\Programs\Python25\Lib\idlelib\idle.pyw" -n which I think is the default in the program's shortcut, only uses one process for the interpreter shell and IDLE, and a restart is not possible.
When I start IDLE with >"C:\Programs\Python25\pythonw.exe" "C:\Programs\Python25\Lib\idlelib\idle.pyw" then the interpreter is run in a separate process. Whenever I hit F5 then it restarts the process with just the script in the editor. IDLE in this case also has an option to "Restart Shell" under the Shell menu. In this case there are no spill-overs from one run to the next, and you only get what is in the script. To make my life easier, I associated two (actually I have 4 - same for python 2.4 and 2.5) different ways of starting a .py file when I right click a file in Windows Explorer. I changed/added the file association in File Types in the Tools/Folder Options menu of Windows Explorer, corresponding to the two options "C:\Programs\Python25\pythonw.exe" "C:\Programs\Python25\Lib\idlelib\idle.pyw" -n -e "%1" "C:\Programs\Python25\pythonw.exe" "C:\Programs\Python25\Lib\idlelib\idle.pyw" -e "%1" When I'm just experimenting with a script, I choose the option with -n to avoid long import times. When I need to make sure that all dependencies/imports are reloaded, I start without -n. However, it is only possible to have one interpreter with -n open at one time (and right now I have 9 separate Python IDLEs running, one of them without -n) But actually, I'm using Spyder now most of the time, which lets you choose at runtime whether to run a script in a separate process (external shell), in the same process as Spyder and the interactive interpreter (internal shell), or to execute just a few selected lines (like F9 in Matlab, I think) in the interpreter (also in internal shell). Except for a few editor quirks, Spyder works very well. Note: all editors when they run shell and editor in the same process have some background noise (or magic). But with separate subprocesses or external shells that allow restart, you lose some of the interactivity. Josef From sierra_mtnview at sbcglobal.net Tue Dec 22 00:43:30 2009 From: sierra_mtnview at sbcglobal.net (Wayne Watson) Date: Mon, 21 Dec 2009 21:43:30 -0800 Subject: Re: [Numpy-discussion] cos -- NameError: global name 'cos' is not defined In-Reply-To: <3d375d730912211146g739794fxe61aafeccb574c19@mail.gmail.com> References: <4B2FB314.9080508@sbcglobal.net> <4B2FD02C.4040608@sbcglobal.net> <3d375d730912211146g739794fxe61aafeccb574c19@mail.gmail.com> Message-ID: <4B305C82.2060503@sbcglobal.net> Thanks, but I think I've got this under control now, and am moving on. Robert Kern wrote: > On Mon, Dec 21, 2009 at 13:44, Wayne Watson > wrote: > >> Yes, one can get both sin and cos via the interactive shell, if math is >> imported as you have done. >> However, I thought math itself always present to a program module? >> > > No. > > >> In >> my program, sin exists but not cos, so one is forced to use math.cos(). >> Why one but not the other? >> > > Presumably you have imported it somewhere. Please show your program, > and we may be able to point it out to you. > > -- Wayne Watson (Watson Adventures, Prop., Nevada City, CA) (121.015 Deg. W, 39.262 Deg. N) (GMT-8 hr std. time) Obz Site: 39° 15' 7" N, 121° 2' 32" W, 2700 feet "... humans' innate skills with numbers isn't much better than that of rats and dolphins." -- Stanislas Dehaene, neurosurgeon Web Page: From dagss at student.matnat.uio.no Tue Dec 22 04:06:50 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 22 Dec 2009 10:06:50 +0100 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays?
In-Reply-To: <4B300517.5070303@noaa.gov> References: <4B2BF965.5010107@sbcglobal.net> <4B2C0EB2.5050504@sbcglobal.net> <4B2C1308.900@sbcglobal.net> <45d1ab480912181548i48c8be1ci90ed7d6aebbd5f0e@mail.gmail.com> <4B2C1A76.3000202@sbcglobal.net> <4B2C3074.80009@american.edu> <4B2C62FA.7010909@sbcglobal.net> <4B2C64AD.6000306@sbcglobal.net> <45d1ab480912182244r4466c85bo53cc5719d7085bcb@mail.gmail.com> <4B2CCC4D.1070509@sbcglobal.net> <4B2CD2F6.7070406@student.matnat.uio.no> <4B2D031C.2090408@sbcglobal.net> <4B2D0A9D.20200@american.edu> <4B2D0FB2.6080905@sbcglobal.net> <4B2D18E1.20905@noaa.gov> <4B2D2EF1.2020905@student.matnat.uio.no> <4B2FB716.4010607@noaa.gov> <4B2FE9B3.3000603@student.matnat.uio.no> <4B300517.5070303@noaa.gov> Message-ID: <4B308C2A.2090308@student.matnat.uio.no> Christopher Barker wrote: > Dag Sverre Seljebotn wrote: >> This is readily done -- there is no computational portion except for >> what is in NumPy/Scipy or scikits, and I intend for it to remain that >> way. It's just another interface, really. >> >> (What kind of computations were you thinking about?) > > Nothing in particular -- just computational as opposed to symbolic > manipulation. OK. As a digression, I think it is easy to get the wrong impression that Sage is for "symbolics" vs. "computations". The reality is that symbolics have been one of the *weaker* aspects of Sage (though steadily improving) -- the strong aspect is computations, but with elements that NumPy doesn't handle efficiently: arbitrary-size integers and rationals, polynomials (or vectors of their coefficients if you wish -- just numbers, not symbols), and so on. So the Sage design is very much about computation, it is just that standard floating point hasn't got all that much attention. -- Dag Sverre From faltet at pytables.org Tue Dec 22 06:59:50 2009 From: faltet at pytables.org (Francesc Alted) Date: Tue, 22 Dec 2009 12:59:50 +0100 Subject: [Numpy-discussion] ANN: PyTables 2.2b2 released Message-ID: <200912221259.50361.faltet@pytables.org> =========================== Announcing PyTables 2.2b2 =========================== PyTables is a library for managing hierarchical datasets, designed to efficiently cope with extremely large amounts of data, with support for full 64-bit file addressing. PyTables runs on top of the HDF5 library and the NumPy package to achieve maximum throughput and convenient use. This is the second beta version of the 2.2 release. The main addition is support for links. All HDF5 kinds of links are supported: hard, soft and external. Hard and soft links are similar to hard and symbolic links in regular UNIX filesystems, while external links are more like mounting external filesystems (in this case, HDF5 files) on top of existing ones. This allows for a considerable degree of flexibility when defining your object tree. See the new tutorial at: http://www.pytables.org/docs/manual-2.2b2/ch03.html#LinksTutorial Also, some other new features (like complete control of the HDF5 chunk cache parameters and native compound types in attributes), bug fixes and a couple of (small) API changes happened.
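To give a quick taste of the new links API (a rough sketch from memory -- treat the exact method names as assumptions and check the tutorial linked above):

>>> import tables
>>> f = tables.openFile('links.h5', 'w')
>>> g = f.createGroup('/', 'g')
>>> a = f.createArray(g, 'a', [1, 2, 3])
>>> f.createSoftLink('/', 'alias', '/g/a')             # like a UNIX symlink
>>> f.createExternalLink('/', 'ext', 'other.h5:/g/a')  # "mounts" another file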
In case you want to know more in detail what has changed in this version, have a look at: http://www.pytables.org/moin/ReleaseNotes/Release_2.2b2 You can download a source package with generated PDF and HTML docs, as well as binaries for Windows, from: http://www.pytables.org/download/preliminary For an on-line version of the manual, visit: http://www.pytables.org/docs/manual-2.2b2 Resources ========= About PyTables: http://www.pytables.org About the HDF5 library: http://hdfgroup.org/HDF5/ About NumPy: http://numpy.scipy.org/ Acknowledgments =============== Thanks to many users who provided feature improvements, patches, bug reports, support and suggestions. See the ``THANKS`` file in the distribution package for an (incomplete) list of contributors. Most especially, a lot of kudos go to the HDF5 and NumPy (and numarray!) makers. Without them, PyTables simply would not exist. Share your experience ===================== Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. ---- **Enjoy data!** -- The PyTables Team -- Francesc Alted From cournape at gmail.com Tue Dec 22 10:05:17 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 23 Dec 2009 00:05:17 +0900 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 Message-ID: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> Hi, I have just released the 2nd release candidate for numpy 1.4.0, which fixes a few critical bugs found since RC1. Tarballs and binary installers for numpy/scipy may be found on https://sourceforge.net/projects/numpy. cheers, David From bsouthey at gmail.com Tue Dec 22 10:50:22 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 22 Dec 2009 09:50:22 -0600 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> Message-ID: <4B30EABE.9060407@gmail.com> On 12/22/2009 09:05 AM, David Cournapeau wrote: > Hi, > > I have just released the 2nd release candidate for numpy 1.4.0, which > fixes a few critical bugs found since RC1. Tarballs and binary > installers for numpy/scipy may be found on > https://sourceforge.net/projects/numpy. > > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Hi, This still crashes Python 2.7 with test_multiarray.TestIO.test_ascii. The file numpy/core/src/multiarray/numpyos.c needs a change as per this thread: "test_multiarray.TestIO.test_ascii segmentation fault with Python2.7" http://mail.scipy.org/pipermail/numpy-discussion/2009-December/047481.html The segmentation fault is avoided with this patch, derived from the current numpy-1.4RC2 version and not the SVN version. Bruce -------------- next part -------------- An embedded and charset-unspecified text was scrubbed...
Name: numpyos.patch URL: From matthew.brett at gmail.com Tue Dec 22 10:51:21 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 22 Dec 2009 10:51:21 -0500 Subject: [Numpy-discussion] Proposal for matrix_rank function in numpy In-Reply-To: <1e2af89e0912161113s6bcd8dbu560ab9df950d362d@mail.gmail.com> References: <1e2af89e0912150901m591b1999n1fc02fc412cf7fd6@mail.gmail.com> <3d375d730912151145g2023978ev294215a99362de87@mail.gmail.com> <1e2af89e0912151216m564a2bf3ha098085daf2d50a2@mail.gmail.com> <1e2af89e0912161113s6bcd8dbu560ab9df950d362d@mail.gmail.com> Message-ID: <1e2af89e0912220751i7b813e44o77761c87d51d930@mail.gmail.com> Hi, > I'm happy to write the doctests as tests. My feeling is there is no > objection to this function at the moment, so it would be reasonable, > unless I hear otherwise, to commit to SVN. Committed - with tests in test_linalg.py - in revision 8029 Cheers, Matthew From gokhansever at gmail.com Tue Dec 22 11:41:25 2009 From: gokhansever at gmail.com (Gökhan Sever) Date: Tue, 22 Dec 2009 10:41:25 -0600 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> Message-ID: <49d6b3500912220841q38c03fc2t9b08de1510f1e827@mail.gmail.com> On Tue, Dec 22, 2009 at 9:05 AM, David Cournapeau wrote: > Hi, > > I have just released the 2nd release candidate for numpy 1.4.0, which > fixes a few critical bugs found since RC1. Tarballs and binary > installers for numpy/scipy may be found on > https://sourceforge.net/projects/numpy. > > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > This release results in the same import error on my system that I posted on http://old.nabble.com/Another-numpy-svn-installation-error-td26878029.html

[gsever at ccn Desktop]$ python
Python 2.6 (r26:66714, Jun 8 2009, 16:07:26)
[GCC 4.4.0 20090506 (Red Hat 4.4.0-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/gsever/Desktop/python-repo/numpy/numpy/__init__.py", line 132, in <module>
    import add_newdocs
  File "/home/gsever/Desktop/python-repo/numpy/numpy/add_newdocs.py", line 9, in <module>
    from lib import add_newdoc
  File "/home/gsever/Desktop/python-repo/numpy/numpy/lib/__init__.py", line 4, in <module>
    from type_check import *
  File "/home/gsever/Desktop/python-repo/numpy/numpy/lib/type_check.py", line 8, in <module>
    import numpy.core.numeric as _nx
  File "/home/gsever/Desktop/python-repo/numpy/numpy/core/__init__.py", line 6, in <module>
    import umath
ImportError: /home/gsever/Desktop/python-repo/numpy/numpy/core/umath.so: undefined symbol: npy_spacing

Is there any remedy for this error? -- Gökhan
From charlesr.harris at gmail.com Tue Dec 22 12:16:56 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Dec 2009 10:16:56 -0700 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: <4B30EABE.9060407@gmail.com> References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <4B30EABE.9060407@gmail.com> Message-ID: On Tue, Dec 22, 2009 at 8:50 AM, Bruce Southey wrote: > On 12/22/2009 09:05 AM, David Cournapeau wrote: > >> Hi, >> >> I have just released the 2nd release candidate for numpy 1.4.0, which >> fixes a few critical bugs found since RC1. Tarballs and binary >> installers for numpy/scipy may be found on >> https://sourceforge.net/projects/numpy. >> >> cheers, >> >> David >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > Hi, > This still crashes Python 2.7 with test_multiarray.TestIO.test_ascii. > > The file numpy/core/src/multiarray/numpyos.c needs a change as per this > thread: > "test_multiarray.TestIO.test_ascii segmentation fault with Python2.7" > http://mail.scipy.org/pipermail/numpy-discussion/2009-December/047481.html > > The segmentation fault is avoided with this patch, derived from the current > numpy-1.4RC2 version and not the SVN version. > The patch looks ok, but the functions handle errors differently and I wonder if that has been completely audited. Chuck From pav at iki.fi Tue Dec 22 14:40:33 2009 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 22 Dec 2009 21:40:33 +0200 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <4B30EABE.9060407@gmail.com> Message-ID: <1261510832.5262.7.camel@idol> On Tue, 2009-12-22 at 10:16 -0700, Charles R Harris wrote: [clip: PyOS_ascii_strtod -> PyOS_string_to_double] > The patch looks ok, but the functions handle errors differently and I > wonder if that has been completely audited. It can actually still crash for the same reason: the PyOS_string_to_double docs say: """If no initial segment of the string is the valid representation of a floating-point number, set *endptr to point to the beginning of the string, raise ValueError, and return -1.0""" Indeed,

$ gdb --args python3 -c "import numpy as np; np.fromstring('1,,', sep=',')"
(gdb) run
Program received signal SIGSEGV, Segmentation fault.
PyErr_SetObject (exception=0x8291740, value=0xb7e926a0)
    at ../Python/errors.c:67
67      ../Python/errors.c: Tiedostoa tai hakemistoa ei ole.
        in ../Python/errors.c
(gdb) bt
#0  PyErr_SetObject (exception=0x8291740, value=0xb7e926a0)
    at ../Python/errors.c:67
#1  0x080e8d5a in PyErr_Format (exception=0x8291740,
    format=0x81a0998 "could not convert string to float: %.200s")
    at ../Python/errors.c:638
#2  0x080fb5fe in PyOS_string_to_double (s=0xb7ca2ae2 ",", endptr=0xbfffd130,
    overflow_exception=0x0) at ../Python/pystrtod.c:354
#3  0x004a9bfc in NumPyOS_ascii_strtod (s=0xb7ca2ae2 ",", endptr=0xbfffd130)
    at numpy/core/src/multiarray/numpyos.c:525

I suppose raising an exception requires ownership of the GIL.
So either we implement ASCII number parsing ourselves from scratch (or steal it from somewhere), or surround the call with appropriate GIL-acquiring wrappers plus if (PyErr_Occurred()) PyErr_Clear(); How malformed input is handled is currently not so well specified anyway, so this part requires further fixes. -- Pauli Virtanen From charlesr.harris at gmail.com Tue Dec 22 16:32:29 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Dec 2009 14:32:29 -0700 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: <1261510832.5262.7.camel@idol> References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <4B30EABE.9060407@gmail.com> <1261510832.5262.7.camel@idol> Message-ID: On Tue, Dec 22, 2009 at 12:40 PM, Pauli Virtanen wrote: > On Tue, 2009-12-22 at 10:16 -0700, Charles R Harris wrote: > [clip: PyOS_ascii_strtod -> PyOS_string_to_double] > > The patch looks ok, but the functions handle errors differently and I > > wonder if that has been completely audited. > > It can actually still crash for the same reason: the PyOS_string_to_double > docs say: > > """If no initial segment of the string is the valid representation of a > floating-point number, set *endptr to point to the beginning of the > string, raise ValueError, and return -1.0""" > > Indeed, > > $ gdb --args python3 -c "import numpy as np; np.fromstring('1,,', sep=',')" > (gdb) run > Program received signal SIGSEGV, Segmentation fault. > PyErr_SetObject (exception=0x8291740, value=0xb7e926a0) > at ../Python/errors.c:67 > 67 ../Python/errors.c: Tiedostoa tai hakemistoa ei ole. > in ../Python/errors.c > (gdb) bt > #0 PyErr_SetObject (exception=0x8291740, value=0xb7e926a0) > at ../Python/errors.c:67 > #1 0x080e8d5a in PyErr_Format (exception=0x8291740, > format=0x81a0998 "could not convert string to float: %.200s") > at ../Python/errors.c:638 > #2 0x080fb5fe in PyOS_string_to_double (s=0xb7ca2ae2 ",", endptr=0xbfffd130, > overflow_exception=0x0) at ../Python/pystrtod.c:354 > #3 0x004a9bfc in NumPyOS_ascii_strtod (s=0xb7ca2ae2 ",", endptr=0xbfffd130) > at numpy/core/src/multiarray/numpyos.c:525 > > I suppose raising an exception requires ownership of the GIL. So either we > implement ASCII number parsing ourselves from scratch (or steal it from > somewhere), or surround the call with appropriate GIL-acquiring wrappers > plus if (PyErr_Occurred()) PyErr_Clear(); Could you expand a bit on this? There are several places where PyErr_Occurred is called and I am wondering if there is a problem. In fact, I moved one such check and a segfault went away, which made me suspicious... Chuck From pav at iki.fi Tue Dec 22 16:42:40 2009 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 22 Dec 2009 23:42:40 +0200 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <4B30EABE.9060407@gmail.com> <1261510832.5262.7.camel@idol> Message-ID: <1261518159.8721.0.camel@idol> On Tue, 2009-12-22 at 14:32 -0700, Charles R Harris wrote: [clip] > Could you expand a bit on this? There are several places where > PyErr_Occurred is called and I am wondering if there is a problem. In > fact, I moved one such check and a segfault went away, which made me > suspicious... I think here the point is that since PyOS_ascii_strtod used to fail silently, its replacement should too -- so we'd need to clear any raised error before continuing.
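Roughly like this, perhaps (an untested sketch; silent_strtod is a made-up name, and the exact GIL handling would need checking):

/* keep the old fail-silently behavior of PyOS_ascii_strtod */
static double
silent_strtod(const char *s, char **endptr)
{
    double result;
    PyGILState_STATE st = PyGILState_Ensure();  /* get a thread state back */
    result = PyOS_string_to_double(s, endptr, NULL);
    if (PyErr_Occurred()) {
        PyErr_Clear();  /* swallow the ValueError on malformed input */
    }
    PyGILState_Release(st);
    return result;
}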
Pauli From charlesr.harris at gmail.com Tue Dec 22 17:28:37 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 22 Dec 2009 15:28:37 -0700 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: <1261518159.8721.0.camel@idol> References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <4B30EABE.9060407@gmail.com> <1261510832.5262.7.camel@idol> <1261518159.8721.0.camel@idol> Message-ID: On Tue, Dec 22, 2009 at 2:42 PM, Pauli Virtanen wrote: > On Tue, 2009-12-22 at 14:32 -0700, Charles R Harris wrote: > [clip] > > Could you expand a bit on this? There are several places where > > PyErr_Occurred is called and I am wondering if there is a problem. In > > fact, I moved one such check and a segfault went away, which made me > > suspicious... > > I think here the point is that since PyOS_ascii_strtod used to fail > silently, its replacement should too -- so we'd need to clear any > raised error before continuing. But what about the GIL? That's what I'm curious about. Do we need to hold the GIL to check and clear an error? If so, there are other places where this will matter. I was under the impression that each thread had its own error stack. But I don't know much about the GIL. Chuck From pav at iki.fi Tue Dec 22 18:09:32 2009 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 23 Dec 2009 01:09:32 +0200 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <4B30EABE.9060407@gmail.com> <1261510832.5262.7.camel@idol> <1261518159.8721.0.camel@idol> Message-ID: <1261523372.8721.16.camel@idol> On Tue, 2009-12-22 at 15:28 -0700, Charles R Harris wrote: [clip] > But what about the GIL? That's what I'm curious about. Do we need to > hold the GIL to check and clear an error? If so, there are other > places where this will matter. I was under the impression that each > thread had its own error stack. But I don't know much about the GIL. The issue seems to be that Py_BEGIN_ALLOW_THREADS / NPY_BEGIN_ALLOW_THREADS calls Python/ceval.c:PyEval_SaveThread(), which calls Python/pystate.c:PyThreadState_Swap, which sets the current thread state (Python/pystate.c:_PyThreadState_Current) to NULL. I'm not 100% sure if this is the same thing as releasing the GIL -- the GIL is probably a subset of this. But the exception information lives in the thread state -> NULL pointer dereference in PyErr_* -> BOOM. And yes,

PyObject *
PyErr_Occurred(void)
{
    PyThreadState *tstate = PyThreadState_GET();
    return tstate->curexc_type;
}

which probably means it shouldn't be called between ALLOW_THREADS. It needs to be wrapped between NPY_ALLOW_C_API & NPY_DISABLE_C_API, which call PyGILState_Ensure, which resurrects the thread state from some global dictionary or something. Pauli From d.l.goldsmith at gmail.com Tue Dec 22 21:48:41 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 22 Dec 2009 18:48:41 -0800 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays?
In-Reply-To: <4B308C2A.2090308@student.matnat.uio.no> References: <4B2BF965.5010107@sbcglobal.net> <4B2D031C.2090408@sbcglobal.net> <4B2D0A9D.20200@american.edu> <4B2D0FB2.6080905@sbcglobal.net> <4B2D18E1.20905@noaa.gov> <4B2D2EF1.2020905@student.matnat.uio.no> <4B2FB716.4010607@noaa.gov> <4B2FE9B3.3000603@student.matnat.uio.no> <4B300517.5070303@noaa.gov> <4B308C2A.2090308@student.matnat.uio.no> Message-ID: <45d1ab480912221848x493e62c5va5daa86e07efe559@mail.gmail.com> On Tue, Dec 22, 2009 at 1:06 AM, Dag Sverre Seljebotn wrote: > > OK. As a digression, I think it is easy to get the wrong impression that > Sage is for "symbolics" vs. "computations". The reality is that > symbolics have been one of the *weaker* aspects of Sage (though > steadily improving) -- the strong aspect is computations, but with > elements that NumPy doesn't handle efficiently: arbitrary-size integers > and rationals, polynomials (or vectors of their coefficients if you wish > -- just numbers, not symbols), and so on. > > So the Sage design is very much about computation, it is just that > standard floating point hasn't got all that much attention. > Good to know, Dag, thanks for the "digression." :-) DG From peridot.faceted at gmail.com Tue Dec 22 22:13:42 2009 From: peridot.faceted at gmail.com (Anne Archibald) Date: Tue, 22 Dec 2009 23:13:42 -0400 Subject: [Numpy-discussion] dot function or dot notation, matrices, arrays? In-Reply-To: <45d1ab480912211052k747935fdhc7ad4fb517e20b42@mail.gmail.com> References: <4B2BF965.5010107@sbcglobal.net> <4B2CCC4D.1070509@sbcglobal.net> <4B2CD2F6.7070406@student.matnat.uio.no> <4B2D031C.2090408@sbcglobal.net> <4B2D0A9D.20200@american.edu> <4B2D0FB2.6080905@sbcglobal.net> <4B2D18E1.20905@noaa.gov> <4B2D2EF1.2020905@student.matnat.uio.no> <4B2FB716.4010607@noaa.gov> <45d1ab480912211052k747935fdhc7ad4fb517e20b42@mail.gmail.com> Message-ID: 2009/12/21 David Goldsmith : > On Mon, Dec 21, 2009 at 9:57 AM, Christopher Barker > wrote: >> Dag Sverre Seljebotn wrote: >>> I recently got motivated to get better linear algebra for Python; >> >> wonderful! >> >>> To me that seems like the ideal way to split up code -- let NumPy/SciPy >>> deal with the array-oriented world and Sage the closer-to-mathematics >>> notation. >> >> well, maybe -- but there is a lot of call for pure-computational linear >> algebra. I do hope you'll consider building the computational portion of >> it in a way that might be included in numpy or scipy by itself in the >> future. > > My personal opinion is that the LA status quo is acceptably good: > there's maybe a bit of an adjustment to make for newbies, but I don't > see it as a very big one, and this list strikes me as very efficient > at getting people over little bumps (e.g., someone emails in: "how do > you matrix-multiply two arrays?" and within minutes (:-)) Robert or > Charles replies with "np.dot: np.dot([[1,2],[3,4]],[[1,2],[3,4]]) = > array([[7,10],[15,22]])"). Certainly any significant changes to the > base should need to run the gauntlet of an NEP process. I think we have one major lacuna: vectorized linear algebra. If I have to solve a whole whack of four-dimensional linear systems, right now I need to either write a python loop and use linear algebra on them one by one, or implement my own linear algebra. It's a frustrating lacuna, because all the machinery is there: generalized ufuncs and LAPACK wrappers. Somebody just needs to glue them together.
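(Concretely, the sort of loop I mean -- a toy sketch:

>>> import numpy as np
>>> A = np.random.randn(1000, 4, 4)   # a whole whack of 4x4 systems
>>> b = np.random.randn(1000, 4)
>>> x = np.array([np.linalg.solve(A[i], b[i]) for i in range(len(A))])

where a single vectorized call ought to do the whole batch.)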
I've even tried making a start on it, but numpy's ufunc machinery and generic type system is just too much of a pain for me to make any progress as is. I think if someone wanted to start building a low-level generalized ufunc library interface to LAPACK, that would be a big improvement in numpy/scipy's linear algebra. Pretty much everything else strikes me as a question of notation. (Not to trivialize it: good notation makes a tremendous difference.) Anne From cournape at gmail.com Tue Dec 22 22:53:17 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 23 Dec 2009 12:53:17 +0900 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: <4B30EABE.9060407@gmail.com> References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <4B30EABE.9060407@gmail.com> Message-ID: <5b8d13220912221953m759cd870h68f37998a691c6ea@mail.gmail.com> On Wed, Dec 23, 2009 at 12:50 AM, Bruce Southey wrote: > This still crashes Python 2.7 with test_multiarray.TestIO.test_ascii. Could you file a ticket next time? I could not follow the discussion closely the last week or so, and although I saw the crash, I missed that it was already discussed. thanks, David From cournape at gmail.com Tue Dec 22 22:55:35 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 23 Dec 2009 12:55:35 +0900 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: <1261510832.5262.7.camel@idol> References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <4B30EABE.9060407@gmail.com> <1261510832.5262.7.camel@idol> Message-ID: <5b8d13220912221955i67b1d1f5id0169ac00f396e7a@mail.gmail.com> On Wed, Dec 23, 2009 at 4:40 AM, Pauli Virtanen wrote: > I suppose raising an exception requires ownership of the GIL. I am curious: how did you know it was related to the GIL? When I tried debugging the issue, I could not tell whether this was a problem with python 2.7 or with numpy, and did not suspect the GIL at all. David From d.l.goldsmith at gmail.com Wed Dec 23 01:02:49 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 22 Dec 2009 22:02:49 -0800 Subject: [Numpy-discussion] LA improvements (was: dot function or dot notation, matrices, arrays?) Message-ID: <45d1ab480912222202i2b62e02ay17256fd7e1d17650@mail.gmail.com> Starting a new thread for this. On Tue, Dec 22, 2009 at 7:13 PM, Anne Archibald wrote: > I think we have one major lacuna: vectorized linear algebra. If I have > to solve a whole whack of four-dimensional linear systems, right now I > need to either write a python loop and use linear algebra on them one > by one, or implement my own linear algebra. It's a frustrating lacuna, > because all the machinery is there: generalized ufuncs and LAPACK > wrappers. Somebody just needs to glue them together. I've even tried > making a start on it, but numpy's ufunc machinery and generic type > system is just too much of a pain for me to make any progress as is. Please be more specific: what (which aspects) have been "too much of a pain"? (I ask out of ignorance, not out of challenging your opinion/experience.) > I think if someone wanted to start building a low-level Again, please be more specific: what do you mean by this? (I know generally what is meant by "low level," but I'd like you to spell out a little more fully what you mean by this in this context.) > generalized > ufunc library interface to LAPACK, that would be a big improvement in > numpy/scipy's linear algebra. Pretty much everything else strikes me > as a question of notation.
(Not to trivialize it: good notation makes > a tremendous difference.) Thanks, Anne. DG > > Anne > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From reckoner at gmail.com Wed Dec 23 09:12:39 2009 From: reckoner at gmail.com (reckoner) Date: Wed, 23 Dec 2009 06:12:39 -0800 Subject: [Numpy-discussion] Matlab's griddata3 for numpy? Message-ID: <4B322557.7010203@gmail.com> Hi, I realize that there is a griddata for numpy via matplotlib, but is there a griddata3 (same as griddata, but for higher dimensions)? Any help appreciated. From bsouthey at gmail.com Wed Dec 23 09:54:53 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 23 Dec 2009 08:54:53 -0600 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: <5b8d13220912221953m759cd870h68f37998a691c6ea@mail.gmail.com> References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <4B30EABE.9060407@gmail.com> <5b8d13220912221953m759cd870h68f37998a691c6ea@mail.gmail.com> Message-ID: <4B322F3D.5050207@gmail.com> On 12/22/2009 09:53 PM, David Cournapeau wrote: > On Wed, Dec 23, 2009 at 12:50 AM, Bruce Southey wrote: > >> This still crashes Python 2.7 with test_multiarray.TestIO.test_ascii. >> > Could you file a ticket next time? I could not follow the > discussion closely the last week or so, and although I saw the crash, I missed > that it was already discussed. > > thanks, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Sorry, Ticket 1345: http://projects.scipy.org/numpy/ticket/1345 I added patches for the 1.4 rc2 version and a patch for the SVN version. I only tested the 1.4 branch on Python 2.7 after you announced it because I follow the SVN. It was also somewhat confusing because a fix was in the SVN version, except that it needed to include Python 2.7. (This was due to the Python 3 support that was added since the 1.4 branch.) Some of the Python 3.1 features have been backported to Python 2.7 which will help with some of the porting to Python 3. For that reason, I would suggest that the release notes indicate that Python 2.7 support is experimental - especially as Python 2.7 has only had one alpha release and the expected final release is 2010-06-26. Bruce From peridot.faceted at gmail.com Wed Dec 23 10:34:29 2009 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 23 Dec 2009 11:34:29 -0400 Subject: [Numpy-discussion] LA improvements (was: dot function or dot notation, matrices, arrays?) In-Reply-To: <45d1ab480912222202i2b62e02ay17256fd7e1d17650@mail.gmail.com> References: <45d1ab480912222202i2b62e02ay17256fd7e1d17650@mail.gmail.com> Message-ID: 2009/12/23 David Goldsmith : > Starting a new thread for this. > > On Tue, Dec 22, 2009 at 7:13 PM, Anne Archibald > wrote: > >> I think we have one major lacuna: vectorized linear algebra. If I have >> to solve a whole whack of four-dimensional linear systems, right now I >> need to either write a python loop and use linear algebra on them one >> by one, or implement my own linear algebra. It's a frustrating lacuna, >> because all the machinery is there: generalized ufuncs and LAPACK >> wrappers. Somebody just needs to glue them together.
I've even tried >> making a start on it, but numpy's ufunc machinery and generic type >> system is just too much of a pain for me to make any progress as is. > > Please be more specific: what (which aspects) have been "too much of a > pain"? (I ask out of ignorance, not out of challenging your > opinion/experience.) It's been a little while since I took a really close look at it, but I'll try to describe the problems I had. Chiefly I had problems with documentation - the only way I could figure out how to build additional gufuncs was monkey-see-monkey-do, just copying an existing one in an existing file and hoping the build system figured it out. It was also not at all clear how to, say, link to LAPACK, let alone decide based on input types which arguments to promote and how to call out to LAPACK. I'm not saying this is impossible, just that it was enough frustrating no-progress to defeat my initial "hey, I could do that" impulse. >> I think if someone wanted to start building a low-level > > Again, please be more specific: what do you mean by this? (I know > generally what is meant by "low level," but I'd like you to spell out > a little more fully what you mean by this in this context.) Sure. Let me first say that all this is kind of beside the point - the hard part is not designing an API, so it's a bit silly to dream up an API without implementing anything. I had pictured two interfaces to the vectorized linear algebra code. The first would simply provide more-or-less direct access to vectorized versions of the linear algebra functions we have now, with no dimension inference. Thus inv, pinv, svd, lu factor, lu solve, et cetera - but not dot. Dot would have to be split up into vector-vector, vector-matrix, matrix-vector, and matrix-matrix products, since one can no longer use the dimensionality of the inputs to figure out what is wanted. The key idea would be that the "linear algebra dimensions" would always be the last one(s); this is fairly easy to arrange with rollaxis when it isn't already true, would tend to reduce copying on input to LAPACK, and is what the gufunc API wants. This is mostly what I meant by low-level. (A second generation would do things like combine many vector-vector products into a single LAPACK matrix-vector product.) The higher-level API I was imagining - remember, vaporware here - had a Matrix and a Vector class, each holding an arbitrarily-dimensioned array of the relevant object. The point of this is to avoid having to constantly specify whether you want a matrix-vector or matrix-matrix product; it also tidily avoids the always-two-dimensional nuisance of the current matrix API. Anne From nadavh at visionsense.com Wed Dec 23 11:23:32 2009 From: nadavh at visionsense.com (Nadav Horesh) Date: Wed, 23 Dec 2009 18:23:32 +0200 Subject: [Numpy-discussion] Matlab's griddata3 for numpy? References: <4B322557.7010203@gmail.com> Message-ID: <710F2847B0018641891D9A21602763605AD278@ex3.envision.co.il> You probably have to use the generic interpolation functions from the scipy.interpolate module: scipy.interpolate.splprep, scipy.interpolate.splev, etc. It could be cumbersome but doable. Nadav -----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of reckoner Sent: Wed 23-Dec-09 16:12 To: numpy-discussion at scipy.org Subject: [Numpy-discussion] Matlab's griddata3 for numpy? Hi, I realize that there is a griddata for numpy via matplotlib, but is there a griddata3 (same as griddata, but for higher dimensions)? Any help appreciated.
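Another option that may fit, if radial basis functions suit the data (a sketch -- I believe Rbf is available in scipy.interpolate as of 0.7, but treat that as an assumption and check your version):

>>> import numpy as np
>>> from scipy.interpolate import Rbf
>>> x, y, z = np.random.rand(3, 50)   # scattered 3-d sample points
>>> v = x**2 + y - z                  # values at those points
>>> rbf = Rbf(x, y, z, v)             # build the interpolant
>>> rbf(0.5, 0.5, 0.5)                # evaluate it anywhere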
From dwf at cs.toronto.edu Wed Dec 23 13:30:46 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 23 Dec 2009 13:30:46 -0500 Subject: [Numpy-discussion] LA improvements (was: dot function or dot notation, matrices, arrays?) In-Reply-To: References: <45d1ab480912222202i2b62e02ay17256fd7e1d17650@mail.gmail.com> Message-ID: On 23-Dec-09, at 10:34 AM, Anne Archibald wrote: > It's been a little while since I took a really close look at it, but > I'll try to describe the problems I had. Chiefly I had problems with > documentation - the only way I could figure out how to build > additional gufuncs was monkey-see-monkey-do, just copying an existing > one in an existing file and hoping the build system figured it out. It > was also not at all clear how to, say, link to LAPACK, let alone > decide based on input types which arguments to promote and how to call > out to LAPACK. I tried to create a new generalized ufunc (a logsumexp to go with logaddexp, so as to avoid all the needless exp's and log's that would be incurred by logaddexp.reduce) and had exactly the same problem. I did get it to build but it was misbehaving (returning an array of the same size as the input) and I couldn't figure out quite why.
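For reference, the registration call I was cargo-culting is PyUFunc_FromFuncAndDataAndSignature, with a reduction-style signature -- roughly like this sketch (the argument values here are illustrative, not my actual code):

/* a gufunc mapping a vector to a scalar: signature "(i)->()" */
gufunc = PyUFunc_FromFuncAndDataAndSignature(
             funcs, NULL, types, 1, 1, 1, PyUFunc_None,
             "logsumexp", "log(sum(exp(x))) along the last axis", 0,
             "(i)->()");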
From d.l.goldsmith at gmail.com Wed Dec 23 14:19:02 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Wed, 23 Dec 2009 11:19:02 -0800 Subject: [Numpy-discussion] LA improvements (was: dot function or dot notation, matrices, arrays?) In-Reply-To: References: <45d1ab480912222202i2b62e02ay17256fd7e1d17650@mail.gmail.com> Message-ID: <45d1ab480912231119x5c5ecd77wda5b83a4f46dbb25@mail.gmail.com> On Wed, Dec 23, 2009 at 10:30 AM, David Warde-Farley wrote: > I agree that the documentation is lacking, but I think it's (rightly) a low > priority in the midst of the release candidate. Thanks Anne (and Dave): it may seem to you to be "a bit silly to dream up an API without implementing anything," but I think it's useful to get these things "on the record" so to speak, and as a person charged with being especially concerned w/ the doc, it's particularly important for me to hear when its specific deficiencies are productivity blockers... >> The key idea would be that the "linear >> algebra dimensions" would always be the last one(s); this is fairly >> easy to arrange with rollaxis when it isn't already true, would tend >> to reduce copying on input to LAPACK, and is what the gufunc API >> wants. > Would it actually reduce copying if you were using default C-ordered > arrays? Maybe I'm mistaken but I thought one almost always had to copy > in order to translate C to Fortran order except for a few functions > that can take row-ordered stuff. > > Otherwise, +1 all the way. ...and of course, discussing these things here begins a dialog that can be the beginning of getting these improvements made - not necessarily by you... :-) Thanks again, for humoring me DG From dwf at cs.toronto.edu Wed Dec 23 17:26:17 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 23 Dec 2009 17:26:17 -0500 Subject: [Numpy-discussion] LA improvements (was: dot function or dot notation, matrices, arrays?) In-Reply-To: <45d1ab480912231119x5c5ecd77wda5b83a4f46dbb25@mail.gmail.com> References: <45d1ab480912222202i2b62e02ay17256fd7e1d17650@mail.gmail.com> <45d1ab480912231119x5c5ecd77wda5b83a4f46dbb25@mail.gmail.com> Message-ID: On 23-Dec-09, at 2:19 PM, David Goldsmith wrote: > Thanks Anne (and Dave): it may seem to you to be "a bit silly to dream > up an API without implementing anything," but I think it's useful to > get these things "on the record" so to speak, and as a person charged > with being especially concerned w/ the doc, it's particularly > important for me to hear when its specific deficiencies are > productivity blockers...
In fact, there are gufuncs in the tests that are quite instructive and would form the basis of good documentation, though not enough of them to give a complete picture of what the generalized ufunc architecture can do (I remember looking for an example of a particular supported pattern and coming up short, though I can't for the life of me remember which). The existing documentation, plus source code from the umath_tests module marked up descriptively (what all the parameters do, especially the ones which currently receive magic numbers) would probably be the way to go down the road. David From fwereade at googlemail.com Wed Dec 23 17:43:19 2009 From: fwereade at googlemail.com (William Reade) Date: Wed, 23 Dec 2009 23:43:19 +0100 Subject: [Numpy-discussion] Ironclad v2.6.0rc1 released In-Reply-To: References: Message-ID: Hi all I'm very happy to announce the latest release (candidate) of Ironclad, the 120-proof home-brewed CPython compatibility layer, now available for IronPython 2.6! No longer need .NET pythonistas toil thanklessly without the benefits of bz2, csv, numpy and scipy: with a simple 'import ironclad', (most parts of) the above packages -- and many more -- will transparently Just Work. For reference: over 1500 tests pass in numpy 1.3.0; over 1900 in 1.4.0RC1; and over 2300 in scipy 0.7.1. Get the package from: http://code.google.com/p/ironclad/ ...and get support from: http://groups.google.com/group/c-extensions-for-ironpython ...or just ask me directly. I'm very keen to hear your experiences, both positive and negative; I haven't been able to test it on as many machines as I have in the past, so your feedback is especially important this time round*. Cheers William * I'd be especially grateful if someone with a newish multicore machine would run the numpy and scipy test scripts (included in the source distribution) a few times to check for consistent results and absence of weird crashes; if someone volunteers, I'll help however I can. From d.l.goldsmith at gmail.com Wed Dec 23 20:30:16 2009 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Wed, 23 Dec 2009 17:30:16 -0800 Subject: [Numpy-discussion] LA improvements (was: dot function or dot notation, matrices, arrays?) In-Reply-To: References: <45d1ab480912222202i2b62e02ay17256fd7e1d17650@mail.gmail.com> <45d1ab480912231119x5c5ecd77wda5b83a4f46dbb25@mail.gmail.com> Message-ID: <45d1ab480912231730r4af1ca16x153ce16937da9ef@mail.gmail.com> On Wed, Dec 23, 2009 at 2:26 PM, David Warde-Farley wrote: > > On 23-Dec-09, at 2:19 PM, David Goldsmith wrote: > >> Thanks Anne (and Dave): it may seem to you to be "a bit silly to dream >> up an API without implementing anything," but I think it's useful to >> get these things "on the record" so to speak, and as a person charged >> with being especially concerned w/ the doc, it's particularly >> important for me to hear when its specific deficiencies are >> productivity blockers... > > In fact, there are gufuncs in the tests that are quite instructive and > would form the basis of good documentation, though not enough of them > to give a complete picture of what the generalized ufunc architecture > can do (I remember looking for an example of a particular supported > pattern and coming up short,
> > The existing documentation, plus source code from the umath_tests > module marked up descriptively (what all the parameters do, especially > the ones which currently receive magic numbers) would probably be the > way to go down the road. > > David Perfect, David! Thanks... DG From dwf at cs.toronto.edu Wed Dec 23 21:59:25 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 23 Dec 2009 21:59:25 -0500 Subject: [Numpy-discussion] LA improvements (was: dot function or dot notation, matrices, arrays?) In-Reply-To: <45d1ab480912231730r4af1ca16x153ce16937da9ef@mail.gmail.com> References: <45d1ab480912222202i2b62e02ay17256fd7e1d17650@mail.gmail.com> <45d1ab480912231119x5c5ecd77wda5b83a4f46dbb25@mail.gmail.com> <45d1ab480912231730r4af1ca16x153ce16937da9ef@mail.gmail.com> Message-ID: <20091224025924.GA14247@rodimus> On Wed, Dec 23, 2009 at 05:30:16PM -0800, David Goldsmith wrote: > On Wed, Dec 23, 2009 at 2:26 PM, David Warde-Farley wrote: > > > > On 23-Dec-09, at 2:19 PM, David Goldsmith wrote: > > > >> Thanks Anne (and Dave): it may seem to you to be "a bit silly to dream > >> up an API without implementing anything," but I think it's useful to > >> get these things "on the record" so to speak, and as a person charged > >> with being especially concerned w/ the doc, it's particularly > >> important for me to hear when its specific deficiencies are > >> productivity blockers... > > > > In fact, there are gufuncs in the tests that are quite instructive and > > would form the basis of good documentation, though not enough of them > > to give a complete picture of what the generalized ufunc architecture > > can do (I remember looking for an example of a particular supported > > pattern and coming up short, > > If you came up short, how/why are you certain that the existing arch > would support it? The existing documentation made the capabilities of generalized ufuncs pretty clear, however not much is demonstrated in terms of the appropriate C API (or code generator) constructs. David From matthew.brett at gmail.com Thu Dec 24 12:50:39 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 24 Dec 2009 12:50:39 -0500 Subject: [Numpy-discussion] Error for numpy.test() with doctest environment configs Message-ID: <1e2af89e0912240950t6fd9e42fr1c705f1d676246e7@mail.gmail.com> Hi, Because I like doctests, I have the following set in my .noserc file: with-doctest=1 This setting breaks numpy.test() like this: In [2]: numpy.test() Running unit tests for numpy NumPy version 1.5.0.dev8029 NumPy is installed in /Users/mb312/usr/local/lib/python2.6/site-packages/numpy Python version 2.6.4 (r264:75706, Dec 22 2009, 14:55:30) [GCC 4.2.1 (Apple Inc. build 5646) (dot 1)] nose version 0.11.1 ------------------------------------------------------------ Traceback (most recent call last): ?File "", line 1, in ?File "/Users/mb312/usr/local/lib/python2.6/site-packages/numpy/testing/nosetester.py", line 335, in test ? ?t = NumpyTestProgram(argv=argv, exit=False, plugins=plugins) ?File "/Users/mb312/usr/local/lib/python2.6/site-packages/nose-0.11.1-py2.6.egg/nose/core.py", line 113, in __init__ ? ?argv=argv, testRunner=testRunner, testLoader=testLoader) ?File "/Users/mb312/usr/local/lib/python2.6/unittest.py", line 816, in __init__ ? ?self.parseArgs(argv) ?File "/Users/mb312/usr/local/lib/python2.6/site-packages/nose-0.11.1-py2.6.egg/nose/core.py", line 130, in parseArgs ? 
?self.config.configure(argv, doc=self.usage()) ?File "/Users/mb312/usr/local/lib/python2.6/site-packages/nose-0.11.1-py2.6.egg/nose/config.py", line 249, in configure ? ?options, args = self._parseArgs(argv, cfg_files) ?File "/Users/mb312/usr/local/lib/python2.6/site-packages/nose-0.11.1-py2.6.egg/nose/config.py", line 237, in _parseArgs ? ?return parser.parseArgsAndConfigFiles(argv[1:], cfg_files) ?File "/Users/mb312/usr/local/lib/python2.6/site-packages/nose-0.11.1-py2.6.egg/nose/config.py", line 132, in parseArgsAndConfigFiles ? ?self._applyConfigurationToValues(self._parser, config, values) ?File "/Users/mb312/usr/local/lib/python2.6/site-packages/nose-0.11.1-py2.6.egg/nose/config.py", line 118, in _applyConfigurationToValues ? ?name=name, filename=filename) ?File "/Users/mb312/usr/local/lib/python2.6/site-packages/nose-0.11.1-py2.6.egg/nose/config.py", line 234, in warn_sometimes ? ?raise ConfigError(msg) ConfigError: Error reading config file '/Users/mb312/.noserc': no such option 'with-doctest' The reason for this is that, in the numpy testing machinery (nostester.py, around line 249, the 'doctest' plugin is pulled out of the plugins list, because we prefer our own numpy doctest tester. Accordingly, when we initialize our noseclasses.NumpyTestProgram, the 'doctest' plugin is not present, and therefore cannot parse any of its configuration, and hence the error. An obvious idea is to allow the testing machinery to parse all the configs, including the doctest configs, then throw away the native (non-numpy) doctest plugin before we get to collecting and running the tests. I've attached a patch that does this; it's a little bit magic because of the class structure of nose, but I hope it makes sense. I'd be very grateful for a review, Thanks a lot, Matthew -------------- next part -------------- A non-text attachment was scrubbed... Name: nose_plugin_workaround Type: application/octet-stream Size: 3311 bytes Desc: not available URL: From cournape at gmail.com Thu Dec 24 17:57:51 2009 From: cournape at gmail.com (David Cournapeau) Date: Fri, 25 Dec 2009 07:57:51 +0900 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: <49d6b3500912220841q38c03fc2t9b08de1510f1e827@mail.gmail.com> References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <49d6b3500912220841q38c03fc2t9b08de1510f1e827@mail.gmail.com> Message-ID: <5b8d13220912241457o93f4087q7a492e99e8397ac9@mail.gmail.com> On Wed, Dec 23, 2009 at 1:41 AM, G?khan Sever wrote: > > > On Tue, Dec 22, 2009 at 9:05 AM, David Cournapeau > wrote: >> >> Hi, >> >> I have just released the 2nd release candidate for numpy 1.4.0, which >> fixes a few critical bugs founds since the RC1. Tarballs and binary >> installers for numpy/scipy may be found on >> https://sourceforge.net/projects/numpy. >> >> cheers, >> >> David >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > This release results with the same import error on my system that I posted > on Don't use develop, and install numpy normally, but from scratch. 
Develop mode has some quirks, and it is not worth it unless you want to work on numpy code yourself IMHO, cheers, David From wesmckinn at gmail.com Fri Dec 25 18:31:37 2009 From: wesmckinn at gmail.com (Wes McKinney) Date: Fri, 25 Dec 2009 18:31:37 -0500 Subject: [Numpy-discussion] [ANN] pandas 0.1, a new NumPy-based data analysis library Message-ID: <6c476c8a0912251531k6fd08e75k103a5d93b5a0359f@mail.gmail.com> Hello all, I'm very happy to announce the release of a new data analysis library that many of you will hopefully find useful. This release is the product of a long period of development and use; hence, despite the low version number, it is quite suitable for general use. The documentation is still a bit sparse but will become much more complete in the coming weeks and months. Info / Documentation: http://pandas.sourceforge.net/ Overview slides: http://pandas.googlecode.com/files/nyfpug.pdf What it is ========== pandas is a library for pan-el da-ta analysis, i.e. multidimensional time series and cross-sectional data sets commonly found in statistics, econometrics, or finance. It provides convenient and easy-to-understand NumPy-based data structures for generic labeled data, with a focus on automatically aligning data based on its label(s) and handling missing observations. One major goal of the library is to simplify the implementation of statistical models on unreliable data. Main Features ============= * Data structures: for 1, 2, and 3 dimensional labeled data sets. Some of their main features include: * Automatically aligning data * Handling missing observations in calculations * Convenient slicing and reshaping ("reindexing") functions * 'Group by' aggregation or transformation functionality * Tools for merging / joining together data sets * Simple matplotlib integration for plotting * Date tools: objects for expressing date offsets or generating date ranges; some functionality similar to scikits.timeseries * Statistical models: convenient ordinary least squares and panel OLS implementations for in-sample or rolling time series / cross-sectional regressions. These will hopefully be the starting point for implementing other models. pandas is not necessarily intended as a standalone library but rather as something which can be used in tandem with other NumPy-based packages like scikits.statsmodels. Where possible wheel-reinvention has largely been avoided. Also, its time series manipulation capability is not as extensive as scikits.timeseries; pandas does have its own time series object which fits into the unified data model. Some other useful tools for time series data (moving average, standard deviation, etc.) are available in the codebase but do not yet have a convenient interface. These will be highlighted in a future release. Where to get it =============== The source code is currently hosted on googlecode at: http://pandas.googlecode.com Releases can currently be downloaded from the Python Package Index or installed using easy_install: http://pypi.python.org/pypi/pandas/ License ======= BSD Documentation ============= The official documentation is hosted on SourceForge: http://pandas.sourceforge.net/ The sphinx documentation is still in an incomplete state, but it should provide a good starting point for learning how to use the library. Expect the docs to continue to expand as time goes on. Background ========== Work on pandas started at AQR (a quantitative hedge fund) in 2008 and has been under active development since then.
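A quick flavor of the alignment behavior described above (a sketch, not lifted from the official docs -- check the documentation linked above for the exact constructor details):

>>> from pandas import Series
>>> s1 = Series([1.0, 2.0, 3.0], index=['a', 'b', 'c'])
>>> s2 = Series([4.0, 5.0, 6.0], index=['b', 'c', 'd'])
>>> s1 + s2   # adds on matching labels; 'a' and 'd' come out missing (NaN)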
Discussion and Development ========================== Since pandas development is related to a number of other scientific Python projects, questions are welcome on the scipy-user mailing list. Specialized discussions or design issues should take place on the pystatsmodels mailing list / google group, where scikits.statsmodels and other libraries will also be discussed: http://groups.google.com/group/pystatsmodels Best regards, Wes McKinney From cournape at gmail.com Thu Dec 24 21:52:26 2009 From: cournape at gmail.com (David Cournapeau) Date: Fri, 25 Dec 2009 11:52:26 +0900 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: <4B322F3D.5050207@gmail.com> References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <4B30EABE.9060407@gmail.com> <5b8d13220912221953m759cd870h68f37998a691c6ea@mail.gmail.com> <4B322F3D.5050207@gmail.com> Message-ID: <5b8d13220912241852q7393c4d6ja241a39a497d2672@mail.gmail.com> On Wed, Dec 23, 2009 at 11:54 PM, Bruce Southey wrote: > Some of the Python 3.1 features have been backported to Python 2.7 which > will help with some of the porting to Python 3. For that reason, I would > suggest that the release notes indicate that Python 2.7 support is > experimental - especially as Python 2.7 has only had one alpha release > and the expected final release is 2010-06-26. I think I will not add this to the 1.4.x branch, then, because I don't understand the fix/what's broken very well. I suspect that people who want to try against python 2.7 can grab numpy from subversion and not expect any release-quality code anyway. cheers, David
From gokhansever at gmail.com Sat Dec 26 16:19:42 2009 From: gokhansever at gmail.com (Gökhan Sever) Date: Sat, 26 Dec 2009 15:19:42 -0600 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: <5b8d13220912241457o93f4087q7a492e99e8397ac9@mail.gmail.com> References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <49d6b3500912220841q38c03fc2t9b08de1510f1e827@mail.gmail.com> <5b8d13220912241457o93f4087q7a492e99e8397ac9@mail.gmail.com> Message-ID: <49d6b3500912261319u4d8dbc0bqed4bb339a952cc75@mail.gmail.com> On Thu, Dec 24, 2009 at 4:57 PM, David Cournapeau wrote: > On Wed, Dec 23, 2009 at 1:41 AM, Gökhan Sever > wrote: > > > > > > On Tue, Dec 22, 2009 at 9:05 AM, David Cournapeau > > wrote: > >> > >> Hi, > >> > >> I have just released the 2nd release candidate for numpy 1.4.0, which > >> fixes a few critical bugs found since RC1. Tarballs and binary > >> installers for numpy/scipy may be found on > >> https://sourceforge.net/projects/numpy.
> >> > >> cheers, > >> > >> David > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > This release results with the same import error on my system that I > posted > > on > > Don't use develop, and install numpy normally, but from scratch. > Develop mode has some quircks, and it does not worth it unless you > want to work on numpy code yourself IMHO, > OK, a clean svn check-out and python setup.py install I get another interesting import error: [gsever at ccn ~]$ pwd /home/gsever [gsever at ccn ~]$ python Python 2.6 (r26:66714, Jun 8 2009, 16:07:26) [GCC 4.4.0 20090506 (Red Hat 4.4.0-4)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy Traceback (most recent call last): File "", line 1, in File "/home/gsever/Desktop/python-repo/numpy/numpy/__init__.py", line 123, in raise ImportError(msg) ImportError: Error importing numpy: you should not try to import numpy from its source directory; please exit the numpy source tree, and relaunch your python intepreter from there. >>> I launch the interpreter from a different directory than where the sources located, however it still complains. For the develop, it is one of easiest ways to catch up the bug-fixes even though I don't work on the source directly. So far besides a few glitches it was always working. I also install scipy, ipython, matplotlib, sympy and all other available packages using develop. Keep the checkouts in the directory on my desktop and if/when necessary do svn up or whichever command it corresponds to their respective vcs. I wonder how other people keep up the changes easily without using develop option. > > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Dec 26 17:15:29 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 26 Dec 2009 15:15:29 -0700 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: <49d6b3500912261319u4d8dbc0bqed4bb339a952cc75@mail.gmail.com> References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <49d6b3500912220841q38c03fc2t9b08de1510f1e827@mail.gmail.com> <5b8d13220912241457o93f4087q7a492e99e8397ac9@mail.gmail.com> <49d6b3500912261319u4d8dbc0bqed4bb339a952cc75@mail.gmail.com> Message-ID: On Sat, Dec 26, 2009 at 2:19 PM, G?khan Sever wrote: > > > On Thu, Dec 24, 2009 at 4:57 PM, David Cournapeau wrote: > >> On Wed, Dec 23, 2009 at 1:41 AM, G?khan Sever >> wrote: >> > >> > >> > On Tue, Dec 22, 2009 at 9:05 AM, David Cournapeau >> > wrote: >> >> >> >> Hi, >> >> >> >> I have just released the 2nd release candidate for numpy 1.4.0, which >> >> fixes a few critical bugs founds since the RC1. Tarballs and binary >> >> installers for numpy/scipy may be found on >> >> https://sourceforge.net/projects/numpy. >> >> >> >> cheers, >> >> >> >> David >> >> _______________________________________________ >> >> NumPy-Discussion mailing list >> >> NumPy-Discussion at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> > This release results with the same import error on my system that I >> posted >> > on >> >> Don't use develop, and install numpy normally, but from scratch. 
>> Develop mode has some quircks, and it does not worth it unless you >> want to work on numpy code yourself IMHO, >> > > OK, a clean svn check-out and python setup.py install I get another > interesting import error: > > [gsever at ccn ~]$ pwd > /home/gsever > [gsever at ccn ~]$ python > > Python 2.6 (r26:66714, Jun 8 2009, 16:07:26) > [GCC 4.4.0 20090506 (Red Hat 4.4.0-4)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import numpy > Traceback (most recent call last): > File "", line 1, in > File "/home/gsever/Desktop/python-repo/numpy/numpy/__init__.py", line > 123, in > raise ImportError(msg) > ImportError: Error importing numpy: you should not try to import numpy from > its source directory; please exit the numpy source tree, and > relaunch > your python intepreter from there. > >>> > > I launch the interpreter from a different directory than where the sources > located, however it still complains. > > For the develop, it is one of easiest ways to catch up the bug-fixes even > though I don't work on the source directly. So far besides a few glitches it > was always working. I also install scipy, ipython, matplotlib, sympy and all > other available packages using develop. Keep the checkouts in the directory > on my desktop and if/when necessary do svn up or whichever command it > corresponds to their respective vcs. I wonder how other people keep up the > changes easily without using develop option. > > I never see any of these problems and apparently no one else does either. There is something unique about your system. What does os.getcwd() return? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From gokhansever at gmail.com Sat Dec 26 17:35:03 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Sat, 26 Dec 2009 16:35:03 -0600 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <49d6b3500912220841q38c03fc2t9b08de1510f1e827@mail.gmail.com> <5b8d13220912241457o93f4087q7a492e99e8397ac9@mail.gmail.com> <49d6b3500912261319u4d8dbc0bqed4bb339a952cc75@mail.gmail.com> Message-ID: <49d6b3500912261435i6a373e21q458cd384763e14ae@mail.gmail.com> On Sat, Dec 26, 2009 at 4:15 PM, Charles R Harris wrote: > > > On Sat, Dec 26, 2009 at 2:19 PM, G?khan Sever wrote: > >> >> >> On Thu, Dec 24, 2009 at 4:57 PM, David Cournapeau wrote: >> >>> On Wed, Dec 23, 2009 at 1:41 AM, G?khan Sever >>> wrote: >>> > >>> > >>> > On Tue, Dec 22, 2009 at 9:05 AM, David Cournapeau >>> > wrote: >>> >> >>> >> Hi, >>> >> >>> >> I have just released the 2nd release candidate for numpy 1.4.0, which >>> >> fixes a few critical bugs founds since the RC1. Tarballs and binary >>> >> installers for numpy/scipy may be found on >>> >> https://sourceforge.net/projects/numpy. >>> >> >>> >> cheers, >>> >> >>> >> David >>> >> _______________________________________________ >>> >> NumPy-Discussion mailing list >>> >> NumPy-Discussion at scipy.org >>> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> > >>> > This release results with the same import error on my system that I >>> posted >>> > on >>> >>> Don't use develop, and install numpy normally, but from scratch. 
>>> Develop mode has some quircks, and it does not worth it unless you >>> want to work on numpy code yourself IMHO, >>> >> >> OK, a clean svn check-out and python setup.py install I get another >> interesting import error: >> >> [gsever at ccn ~]$ pwd >> /home/gsever >> [gsever at ccn ~]$ python >> >> Python 2.6 (r26:66714, Jun 8 2009, 16:07:26) >> [GCC 4.4.0 20090506 (Red Hat 4.4.0-4)] on linux2 >> Type "help", "copyright", "credits" or "license" for more information. >> >>> import numpy >> Traceback (most recent call last): >> File "", line 1, in >> File "/home/gsever/Desktop/python-repo/numpy/numpy/__init__.py", line >> 123, in >> raise ImportError(msg) >> ImportError: Error importing numpy: you should not try to import numpy >> from >> its source directory; please exit the numpy source tree, and >> relaunch >> your python intepreter from there. >> >>> >> >> I launch the interpreter from a different directory than where the sources >> located, however it still complains. >> >> For the develop, it is one of easiest ways to catch up the bug-fixes even >> though I don't work on the source directly. So far besides a few glitches it >> was always working. I also install scipy, ipython, matplotlib, sympy and all >> other available packages using develop. Keep the checkouts in the directory >> on my desktop and if/when necessary do svn up or whichever command it >> corresponds to their respective vcs. I wonder how other people keep up the >> changes easily without using develop option. >> >> > I never see any of these problems and apparently no one else does either. > There is something unique about your system. What does os.getcwd() return? > > Chuck > > I removed numpy.egg-link (a remnant from setupegg.py develop) file under /usr/lib/python2.6/site-packages. Still I get the same error. Your query returns the same as pwd command: >>> import os >>> os.getcwd() '/home/gsever/Desktop' >>> exit() [gsever at ccn Desktop]$ pwd /home/gsever/Desktop > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Sat Dec 26 19:09:45 2009 From: cournape at gmail.com (David Cournapeau) Date: Sun, 27 Dec 2009 09:09:45 +0900 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: <49d6b3500912261319u4d8dbc0bqed4bb339a952cc75@mail.gmail.com> References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <49d6b3500912220841q38c03fc2t9b08de1510f1e827@mail.gmail.com> <5b8d13220912241457o93f4087q7a492e99e8397ac9@mail.gmail.com> <49d6b3500912261319u4d8dbc0bqed4bb339a952cc75@mail.gmail.com> Message-ID: <5b8d13220912261609m14c0b15cw37ffa9276c8b1711@mail.gmail.com> On Sun, Dec 27, 2009 at 6:19 AM, G?khan Sever wrote: > > For the develop, it is one of easiest ways to catch up the bug-fixes even > though I don't work on the source directly. So far besides a few glitches it > was always working. I also install scipy, ipython, matplotlib, sympy and all > other available packages using develop. Keep the checkouts in the directory > on my desktop and if/when necessary do svn up or whichever command it > corresponds to their respective vcs. If you do that, you have to be ready to look into the corresponding issues it brings. > I wonder how other people keep up the > changes easily without using develop option. 
I just install things, and avoid relying on too many development
versions of packages.

David

From cournape at gmail.com Sun Dec 27 20:34:38 2009
From: cournape at gmail.com (David Cournapeau)
Date: Mon, 28 Dec 2009 10:34:38 +0900
Subject: [Numpy-discussion] [ANN] Numpy 1.4.0 release
Message-ID: <5b8d13220912271734m29388676m4db200fa6a02dc92@mail.gmail.com>

Hi,

I am pleased to announce the release of numpy 1.4.0. The highlights of
this release are:

 - Faster import time
 - Extended array wrapping mechanism for ufuncs
 - New Neighborhood iterator (C-level only)
 - C99-like complex functions in npymath, and a lot of portability
   fixes for basic floating point math functions

The full release notes are at the end of the email. The sources are
uploaded on PyPI, and the binary installers will soon follow on the
sourceforge page:

https://sourceforge.net/projects/numpy/

Thank you to everyone involved in this release: developers, users who
reported bugs, fixed documentation, etc.

enjoy,

the numpy developers.

=========================
NumPy 1.4.0 Release Notes
=========================

This minor release includes numerous bug fixes, as well as a few new
features. It is backward compatible with the 1.3.0 release.

Highlights
==========

* Faster import time
* Extended array wrapping mechanism for ufuncs
* New Neighborhood iterator (C-level only)
* C99-like complex functions in npymath

New features
============

Extended array wrapping mechanism for ufuncs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

An __array_prepare__ method has been added to ndarray to provide
subclasses greater flexibility to interact with ufuncs and ufunc-like
functions. ndarray already provided __array_wrap__, which allowed
subclasses to set the array type for the result and populate metadata
on the way out of the ufunc (as seen in the implementation of
MaskedArray). For some applications it is necessary to provide checks
and populate metadata *on the way in*. __array_prepare__ is therefore
called just after the ufunc has initialized the output array but
before computing the results and populating it. This way, checks can
be made and errors raised before operations which may modify data in
place.
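A minimal sketch of how a subclass might use the new hook (the
subclass and its check are invented for illustration)::

    import numpy as np

    class MyArray(np.ndarray):
        def __array_prepare__(self, out_arr, context=None):
            # Called once the ufunc has allocated out_arr, but before
            # any results are computed or written, so it is safe to
            # inspect the operation and raise without touching data.
            if context is not None:
                ufunc, args, _ = context  # e.g. validate ufunc/args here
            return out_arr.view(MyArray)

        def __array_wrap__(self, out_arr, context=None):
            # Still called on the way out, as before.
            return out_arr.view(MyArray)

    a = np.arange(5.0).view(MyArray)
    b = np.sqrt(a)  # __array_prepare__ runs before the computation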
Automatic detection of forward incompatibilities
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Previously, if an extension was built against a version N of NumPy and
used on a system with NumPy M < N, import_array was successful, which
could cause crashes because version M does not have some functions
present in N. Starting from NumPy 1.4.0, this will cause a failure in
import_array, so the error will be caught early on.

New iterators
~~~~~~~~~~~~~

A new neighborhood iterator has been added to the C API. It can be
used to iterate over the items in a neighborhood of an array, and can
handle boundary conditions automatically. Zero and one padding are
available, as well as arbitrary constant value, mirror and circular
padding.

New polynomial support
~~~~~~~~~~~~~~~~~~~~~~

New modules chebyshev and polynomial have been added. The new
polynomial module is not compatible with the current polynomial
support in numpy, but is much like the new chebyshev module. The most
noticeable difference for most users will be that coefficients are
specified from low to high power, that the low level functions do
*not* work with the Chebyshev and Polynomial classes as arguments, and
that the Chebyshev and Polynomial classes include a domain. Mapping
between domains is a linear substitution, and the two classes can be
converted one to the other, allowing, for instance, a Chebyshev series
in one domain to be expanded as a polynomial in another domain. The
new classes should generally be used instead of the low level
functions; the latter are provided for those who wish to build their
own classes. The new modules are not automatically imported into the
numpy namespace; they must be explicitly brought in with an
"import numpy.polynomial" statement.

New C API
~~~~~~~~~

The following C functions have been added to the C API:

#. PyArray_GetNDArrayCFeatureVersion: return the *API* version of the
   loaded numpy.
#. PyArray_Correlate2 - like PyArray_Correlate, but implements the
   usual definition of correlation. Inputs are not swapped, and the
   conjugate is taken for complex arrays.
#. PyArray_NeighborhoodIterNew - a new iterator to iterate over a
   neighborhood of a point, with automatic boundary handling. It is
   documented in the iterators section of the C-API reference, and you
   can find some examples in the multiarray_test.c.src file in
   numpy.core.

New ufuncs
~~~~~~~~~~

The following ufuncs have been added to the C API:

#. copysign - return the value of the first argument with the sign
   copied from the second argument.
#. nextafter - return the next representable floating point value of
   the first argument toward the second argument.

New defines
~~~~~~~~~~~

The alpha processor is now defined and available in numpy/npy_cpu.h.
The failed detection of the PARISC processor has been fixed. The
defines are:

#. NPY_CPU_HPPA: PARISC
#. NPY_CPU_ALPHA: Alpha

Testing
~~~~~~~

#. deprecated decorator: this decorator may be used to avoid
   cluttering testing output while testing that DeprecationWarning is
   effectively raised by the decorated test.
#. assert_array_almost_equal_nulp: new method to compare two arrays of
   floating point values. With this function, two values are
   considered close if there are not many representable floating point
   values in between, thus being more robust than
   assert_array_almost_equal when the values fluctuate a lot.
#. assert_array_max_ulp: raise an assertion if there are more than N
   representable numbers between two floating point values.
#. assert_warns: raise an AssertionError if a callable does not
   generate a warning of the appropriate class, without altering the
   warning state.

Reusing npymath
~~~~~~~~~~~~~~~

In 1.3.0, we started putting portable C math routines in the npymath
library, so that people can use those to write portable extensions.
Unfortunately, it was not possible to easily link against this
library: in 1.4.0, support has been added to numpy.distutils so that
third parties can reuse this library. See the coremath documentation
for more information.

Improved set operations
~~~~~~~~~~~~~~~~~~~~~~~

In previous versions of NumPy some set functions (intersect1d,
setxor1d, setdiff1d and setmember1d) could return incorrect results if
the input arrays contained duplicate items. These now work correctly
for input arrays with duplicates. setmember1d has been renamed to
in1d, as with the change to accept arrays with duplicates it is no
longer a set operation, and is conceptually similar to an elementwise
version of the Python operator 'in'. All of these functions now accept
the boolean keyword assume_unique. This is False by default, but can
be set True if the input arrays are known not to contain duplicates,
which can increase the functions' execution speed.
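For instance, a short sketch of the renamed function (results shown as
comments)::

    import numpy as np

    a = np.array([1, 2, 2, 5, 7])
    b = np.array([2, 5])

    np.in1d(a, b)  # -> array([False,  True,  True,  True, False])

    # If both inputs are known to be free of duplicates, the extra
    # uniquifying work can be skipped:
    np.in1d(np.array([1, 2, 5, 7]), b, assume_unique=True)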
Improvements
============

#. numpy import is noticeably faster (20 to 30% faster, depending on
   the platform and computer)
#. The sort functions now sort nans to the end.

   * Real sort order is [R, nan]
   * Complex sort order is [R + Rj, R + nanj, nan + Rj, nan + nanj]

   Complex numbers with the same nan placements are sorted according
   to the non-nan part if it exists.
#. The type comparison functions have been made consistent with the
   new sort order of nans. Searchsorted now works with sorted arrays
   containing nan values.
#. Complex division has been made more resistant to overflow.
#. Complex floor division has been made more resistant to overflow.

Deprecations
============

The following functions are deprecated:

#. correlate: it takes a new keyword argument old_behavior. When True
   (the default), it returns the same result as before. When False, it
   computes the conventional correlation and takes the conjugate for
   complex arrays. The old behavior will be removed in NumPy 1.5; it
   raises a DeprecationWarning in 1.4.
#. unique1d: use unique instead. unique1d raises a deprecation
   warning in 1.4, and will be removed in 1.5.
#. intersect1d_nu: use intersect1d instead. intersect1d_nu raises
   a deprecation warning in 1.4, and will be removed in 1.5.
#. setmember1d: use in1d instead. setmember1d raises a deprecation
   warning in 1.4, and will be removed in 1.5.

The following raise errors:

#. When operating on 0-d arrays, ``numpy.max`` and other functions
   accept only ``axis=0``, ``axis=-1`` and ``axis=None``. Using an
   out-of-bounds axis is an indication of a bug, so NumPy now raises
   an error for these cases.
#. Specifying ``axis > MAX_DIMS`` is no longer allowed; NumPy now
   raises an error instead of behaving as it did for ``axis=None``.

Internal changes
================

Use C99 complex functions when available
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The numpy complex types are now guaranteed to be ABI compatible with
the C99 complex type, if available on the platform. Moreover, the
complex ufuncs now use the platform C99 functions instead of our own.

split multiarray and umath source code
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The source code of multiarray and umath has been split into separate
logical compilation units. This should make the source code more
approachable for newcomers.

Separate compilation
~~~~~~~~~~~~~~~~~~~~

By default, every file of multiarray (and umath) is merged into one
for compilation as was the case before, but if the
NPY_SEPARATE_COMPILATION env variable is set to a non-negative value,
experimental individual compilation of each file is enabled. This
makes the compile/debug cycle much faster when working on core numpy.

Separate core math library
~~~~~~~~~~~~~~~~~~~~~~~~~~

New functions which have been added:

* npy_copysign
* npy_nextafter
* npy_cpack
* npy_creal
* npy_cimag
* npy_cabs
* npy_cexp
* npy_clog
* npy_cpow
* npy_csqrt
* npy_ccos
* npy_csin

From charlesr.harris at gmail.com Mon Dec 28 01:28:03 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 27 Dec 2009 23:28:03 -0700
Subject: [Numpy-discussion] [ANN] Numpy 1.4.0 release
In-Reply-To: <5b8d13220912271734m29388676m4db200fa6a02dc92@mail.gmail.com>
References: <5b8d13220912271734m29388676m4db200fa6a02dc92@mail.gmail.com>
Message-ID:

On Sun, Dec 27, 2009 at 6:34 PM, David Cournapeau wrote:

> Hi,
>
> I am pleased to announce the release of numpy 1.4.0. The highlights of
> this release are:
>

The new files aren't up on sourceforge yet, although I expect they will
be the same as rc2. I put the announcement up on the scipy homepage
anyway.
When will the official file go up? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Mon Dec 28 02:01:47 2009 From: cournape at gmail.com (David Cournapeau) Date: Mon, 28 Dec 2009 16:01:47 +0900 Subject: [Numpy-discussion] [ANN] Numpy 1.4.0 release In-Reply-To: References: <5b8d13220912271734m29388676m4db200fa6a02dc92@mail.gmail.com> Message-ID: <5b8d13220912272301j1d05b325o769f56722af76d4c@mail.gmail.com> On Mon, Dec 28, 2009 at 3:28 PM, Charles R Harris wrote: > > > On Sun, Dec 27, 2009 at 6:34 PM, David Cournapeau > wrote: >> >> Hi, >> >> I am pleased to announce the release of numpy 1.4.0. The highlights of >> this release are: >> > > > > The new files aren't up on sourceforge yet, although I expect they will be > the same as rc2. I put the announcement up on the scipy homepage anyway. > When will the official file go up? When I have time to build the binaries (hopefully tonight), as everything is generated at the same time, David From matthew.brett at gmail.com Mon Dec 28 05:35:34 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 28 Dec 2009 05:35:34 -0500 Subject: [Numpy-discussion] Error for numpy.test() with doctest environment configs In-Reply-To: <1e2af89e0912240950t6fd9e42fr1c705f1d676246e7@mail.gmail.com> References: <1e2af89e0912240950t6fd9e42fr1c705f1d676246e7@mail.gmail.com> Message-ID: <1e2af89e0912280235m196e09f6kf3de8e939d29c2e3@mail.gmail.com> Hi, > An obvious idea is to allow the testing machinery to parse all > the configs, including the doctest configs, then throw away the native > (non-numpy) doctest plugin before we get to collecting and running the > tests. > > I've attached a patch that does this; it's a little bit magic because > of the class structure of nose, but I hope it makes sense. ? I'd be > very grateful for a review, Any objections to applying this patch to trunk? I believe it is correct... Best, Matthew From cournape at gmail.com Mon Dec 28 09:03:14 2009 From: cournape at gmail.com (David Cournapeau) Date: Mon, 28 Dec 2009 23:03:14 +0900 Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation Message-ID: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> (warning, long post) Hi there, As some of you already know, the packaging and distributions of scientific python packages have been a constant source of frustration. Open source is about making it easy for anyone to use software how they see fit, and I think python packaging infrastructure has not been very successfull for people not intimately familiar with python. A few weeks ago, after Guido visited Berkeley and was told how those issues were still there for the scientific community, he wrote an email asking whether current efforts on distutils-sig will be enough (see http://aspn.activestate.com/ASPN/Mail/Message/distutils-sig/3775972). Several of us have been participating to this discussion, but I feel like the divide between current efforts on distutils-sig and us (the SciPy community) is not getting smaller. At best, their efforts will be more work for us to track the new distribute fork, and more likely, it will be all for nothing as it won't solve any deep issue. To be honest, most of what is considered on distutils-sig sounds like anti-goals to me. Instead of keeping up with the frustrating process of "improving" distutils, I think we have enough smart people and manpower in the scientific community to go with our own solution. 
I am convinced it is doable because R or haskell, with a much smaller
community than python, managed to pull out something which is miles
ahead of pypi. The SciPy community is hopefully big enough so that a
SciPy-specific solution may reach critical mass.

Ideally, I wish we had something with the following capabilities:

 - easy to understand tools
 - http-based package repository a la CRAN, which would be easy to
   mirror and back up (through rsync-like tools)
 - decoupling the building, packaging and distribution of code and data
 - reliable install/uninstall/query of what is installed locally
 - facilities for building windows/mac os x binaries
 - making the life of OS vendors (Linux, *BSD, etc...) easier

The packaging part
==============

Speaking is easy, so I started coding part of this toolset, called
toydist (temporary name), which I presented at Scipy India a few days
ago:

http://github.com/cournape/toydist/

Toydist is more or less a rip off of cabal
(http://www.haskell.org/cabal/), and consists of three parts:

 - a core which builds a package description from a declarative file
   similar to cabal files. The file is almost purely declarative, and
   can be parsed so that no arbitrary code is executed, thus making it
   easy to sandbox package builds (e.g. on a build farm).
 - a set of command line tools to configure, build, install, build
   installers (egg only for now) etc... from the declarative file
 - backward compatibility tools: a tool to convert existing setup.py
   to the new format has been written, and a tool to use distutils
   through the new format for backward compatibility with complex
   distutils extensions should be relatively easy.

The core idea is to make the format just rich enough to describe most
packages out there, but simple enough so that interfacing it with
external tools is possible and reliable. As a regular contributor to
scons, I am all too aware that a build tool is a very complex beast to
get right, and repeating their efforts does not make sense. Typically,
I envision that complex packages such as numpy, scipy or matplotlib
would use make/waf/scons for the build - in a sense, toydist is
written so that writing something like numscons would be easier. OTOH,
most if not all scikits should be buildable from a purely declarative
file.

To give you a feel of the format, here is a snippet for the grin
package from Robert K. (automatically converted):

Name: grin
Version: 1.1.1
Summary: A grep program configured the way I like it.
Description:
    ==== grin ====

    I wrote grin to help me search directories full of source code. The
    venerable GNU grep_ and find_ are great tools, but they fall just a
    little short for my normal use cases.
License: BSD
Platforms: UNKNOWN
Classifiers:
    License :: OSI Approved :: BSD License,
    Development Status :: 5 - Production/Stable,
    Environment :: Console,
    Intended Audience :: Developers,
    Operating System :: OS Independent,
    Programming Language :: Python,
    Topic :: Utilities,
ExtraSourceFiles:
    README.txt,
    setup.cfg,
    setup.py,

Library:
    InstallDepends:
        argparse,
    Modules:
        grin,

Executable: grin
    module: grin
    function: grin_main

Executable: grind
    module: grin
    function: grind_main

Although still very much experimental at this point, toydist already
makes some things much easier than with distutils/setuptools:

 - path customization for any target can be done easily: you can
   easily add an option in the file so that configure --mynewdir=value
   works and is accessible at every step.
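   For instance, purely hypothetically -- the concrete syntax here is
   invented and will certainly change -- a stanza in the declarative
   file plus a configure switch could be all that is needed:

       DataFiles: manpages
           TargetDir: $mynewdir
           Files: grin.1

       configure --mynewdir=/some/path

   with the value then being visible to the following build and
   install steps as well.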
- making packages FHS compliant is not a PITA anymore, and the scheme can be adapted to any OS, be it traditional FHS-like unix, mac os x, windows, etc... - All the options are accessible at every step (no more distutils commands nonsense) - data files can finally be handled correctly and consistently, instead of the 5 or 6 magics methods currently available in distutils/setuptools/numpy.distutils - building eggs does not involve setuptools anymore - not much coupling between package description and build infrastructure (building extensions is actually done through distutils ATM). Repository ======== The goal here is to have something like CRAN (http://cran.r-project.org/web/views/), ideally with a build farm so that whenever anyone submits a package to our repository, it would automatically be checked, and built for windows/mac os x and maybe a few major linux distributions. One could investigate the build service from open suse to that end (http://en.opensuse.org/Build_Service), which is based on xen VM to build installers in a reproducible way. Installed package db =============== I believe that the current open source enstaller package from Enthought can be a good starting point. It is based on eggs, but eggs are only used as a distribution format (eggs are never installed as eggs AFAIK). You can easily remove packages, query installed versions, etc... Since toydist produces eggs, interoperation between toydist and enstaller should not be too difficult. What's next ? ========== At this point, I would like to ask for help and comments, in particular: - Does all this make sense, or hopelessly intractable ? - Besides the points I have mentioned, what else do you think is needed ? - There has already been some work for the scikits webportal, but I think we should bypass pypi entirely (the current philosophy of not enforcing consistent metadata does not make much sense to me, and is at the opposite of most other similar system out there). - I think a build farm for at least windows packages would be a killer feature, and enough incentive to push some people to use our new infrastructure. It would be good to have a windows guy familiar with windows sandboxing/virtualization to do something there. The people working on the opensuse build service have started working on windows support - I think being able to automatically convert most of scientific packages is a significant feature, and needs to be more robust - so anyone is welcomed to try converting existing setup.py with toydist (see toydist readme). thanks, David From cournape at gmail.com Mon Dec 28 10:03:15 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 29 Dec 2009 00:03:15 +0900 Subject: [Numpy-discussion] [matplotlib-devel] Announcing toydist, improving distribution and packaging situation In-Reply-To: References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> Message-ID: <5b8d13220912280703r43b8122ds609cbcd4a9a75ead@mail.gmail.com> On Mon, Dec 28, 2009 at 11:47 PM, Stefan Schwarzburg wrote: > Hi, > I would like to add a comment from the user perspective: > > - the main reason why I'm not satisfied with pypi/distutils/etc. and why I > will not be satisfied with toydist (with the features you listed), is that > they break my installation (debian/ubuntu). Toydist (or distutils) does not break anything as is. It would be like saying make breaks debian - it does not make much sense. As stated, one of the goal of giving up distutils is to make packaging by os vendors easier. 
In particular, by allowing to follow the FHS, and making things more consistent. It should be possible to automatically convert most packages to .deb (or .rpm) relatively easily. When you look at the numpy .deb package, most of the issues are distutils issues, and almost everything else can be done automatically. Note that even ignoring the windows problem, there are systems to do the kind of things I am talking about for linux-only systems (the opensuse build service), because distributions are not always really good at tracking fast changing softwares. IOW, traditional linux packaging has some issues as well. And anyway, nothing prevents debian or other OS vendors to package things as they want (as they do for R packages). David From gokhansever at gmail.com Mon Dec 28 11:31:55 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Mon, 28 Dec 2009 10:31:55 -0600 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: <5b8d13220912261609m14c0b15cw37ffa9276c8b1711@mail.gmail.com> References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <49d6b3500912220841q38c03fc2t9b08de1510f1e827@mail.gmail.com> <5b8d13220912241457o93f4087q7a492e99e8397ac9@mail.gmail.com> <49d6b3500912261319u4d8dbc0bqed4bb339a952cc75@mail.gmail.com> <5b8d13220912261609m14c0b15cw37ffa9276c8b1711@mail.gmail.com> Message-ID: <49d6b3500912280831w1ae206e4nf747873c89802c7b@mail.gmail.com> On Sat, Dec 26, 2009 at 6:09 PM, David Cournapeau wrote: > On Sun, Dec 27, 2009 at 6:19 AM, G?khan Sever > wrote: > > > > > For the develop, it is one of easiest ways to catch up the bug-fixes even > > though I don't work on the source directly. So far besides a few glitches > it > > was always working. I also install scipy, ipython, matplotlib, sympy and > all > > other available packages using develop. Keep the checkouts in the > directory > > on my desktop and if/when necessary do svn up or whichever command it > > corresponds to their respective vcs. > > If you do that, you have to be ready to look into the corresponding > issues it brings. > > > I wonder how other people keep up the > > changes easily without using develop option. > > I just install things, and avoid relying on too many developed > versions of packages. > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Fix comes following your suggestion. Use python install for the time being and remove the the check-out after installation. This prevents the funny import error even if I don't try to import the numpy from within I made the installation. -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gokhansever at gmail.com Mon Dec 28 12:00:47 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Mon, 28 Dec 2009 11:00:47 -0600 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: <49d6b3500912280831w1ae206e4nf747873c89802c7b@mail.gmail.com> References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <49d6b3500912220841q38c03fc2t9b08de1510f1e827@mail.gmail.com> <5b8d13220912241457o93f4087q7a492e99e8397ac9@mail.gmail.com> <49d6b3500912261319u4d8dbc0bqed4bb339a952cc75@mail.gmail.com> <5b8d13220912261609m14c0b15cw37ffa9276c8b1711@mail.gmail.com> <49d6b3500912280831w1ae206e4nf747873c89802c7b@mail.gmail.com> Message-ID: <49d6b3500912280900y36db51bfka3184d3c204e7d6@mail.gmail.com> On Mon, Dec 28, 2009 at 10:31 AM, G?khan Sever wrote: > > > On Sat, Dec 26, 2009 at 6:09 PM, David Cournapeau wrote: > >> On Sun, Dec 27, 2009 at 6:19 AM, G?khan Sever >> wrote: >> >> > >> > For the develop, it is one of easiest ways to catch up the bug-fixes >> even >> > though I don't work on the source directly. So far besides a few >> glitches it >> > was always working. I also install scipy, ipython, matplotlib, sympy and >> all >> > other available packages using develop. Keep the checkouts in the >> directory >> > on my desktop and if/when necessary do svn up or whichever command it >> > corresponds to their respective vcs. >> >> If you do that, you have to be ready to look into the corresponding >> issues it brings. >> >> > I wonder how other people keep up the >> > changes easily without using develop option. >> >> I just install things, and avoid relying on too many developed >> versions of packages. >> >> David >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > Fix comes following your suggestion. Use python install for the time being > and remove the the check-out after installation. This prevents the funny > import error even if I don't try to import the numpy from within I made the > installation. > > -- > G?khan > One interesting thing I have noticed while installing the numpy from the source is that numpy dependent libraries must be re-installed and this must be a clean re-install. For instance I can't import some matplotlib and scipy modules without making a fresh installation for these packages. My attempts result with a runtime error. Could someone clarify this point? Is this due to API change in the numpy core? -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Mon Dec 28 12:07:28 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 28 Dec 2009 11:07:28 -0600 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: <49d6b3500912280900y36db51bfka3184d3c204e7d6@mail.gmail.com> References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <49d6b3500912220841q38c03fc2t9b08de1510f1e827@mail.gmail.com> <5b8d13220912241457o93f4087q7a492e99e8397ac9@mail.gmail.com> <49d6b3500912261319u4d8dbc0bqed4bb339a952cc75@mail.gmail.com> <5b8d13220912261609m14c0b15cw37ffa9276c8b1711@mail.gmail.com> <49d6b3500912280831w1ae206e4nf747873c89802c7b@mail.gmail.com> <49d6b3500912280900y36db51bfka3184d3c204e7d6@mail.gmail.com> Message-ID: <3d375d730912280907o1120fa2bwb8e79519d9c32972@mail.gmail.com> On Mon, Dec 28, 2009 at 11:00, G?khan Sever wrote: > One interesting thing I have noticed while installing the numpy from the > source is that numpy dependent libraries must be re-installed and this must > be a clean re-install. For instance I can't import some matplotlib and scipy > modules without making a fresh installation for these packages. My attempts > result with a runtime error. Please, please, always copy-and-paste the traceback when reporting an error. I know you aren't formally reporting a bug here, but it always helps. > Could someone clarify this point? Is this due > to API change in the numpy core? Cython/Pyrex code does a runtime check on the struct sizes of types. We have carefully added a member to the PyArrayDescr struct; i.e. it shouldn't cause any actual problems, but Cython does the check anyways. This affects a few modules in scipy, but shouldn't have affected anything in matplotlib. The traceback may help us identify the issue you are seeing. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gokhansever at gmail.com Mon Dec 28 12:16:44 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Mon, 28 Dec 2009 11:16:44 -0600 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: <3d375d730912280907o1120fa2bwb8e79519d9c32972@mail.gmail.com> References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <49d6b3500912220841q38c03fc2t9b08de1510f1e827@mail.gmail.com> <5b8d13220912241457o93f4087q7a492e99e8397ac9@mail.gmail.com> <49d6b3500912261319u4d8dbc0bqed4bb339a952cc75@mail.gmail.com> <5b8d13220912261609m14c0b15cw37ffa9276c8b1711@mail.gmail.com> <49d6b3500912280831w1ae206e4nf747873c89802c7b@mail.gmail.com> <49d6b3500912280900y36db51bfka3184d3c204e7d6@mail.gmail.com> <3d375d730912280907o1120fa2bwb8e79519d9c32972@mail.gmail.com> Message-ID: <49d6b3500912280916q69a50e2ci318ec0509ac11bec@mail.gmail.com> On Mon, Dec 28, 2009 at 11:07 AM, Robert Kern wrote: > On Mon, Dec 28, 2009 at 11:00, G?khan Sever wrote: > > > One interesting thing I have noticed while installing the numpy from the > > source is that numpy dependent libraries must be re-installed and this > must > > be a clean re-install. For instance I can't import some matplotlib and > scipy > > modules without making a fresh installation for these packages. My > attempts > > result with a runtime error. > > Please, please, always copy-and-paste the traceback when reporting an > error. I know you aren't formally reporting a bug here, but it always > helps. > > > Could someone clarify this point? Is this due > > to API change in the numpy core? 
> > Cython/Pyrex code does a runtime check on the struct sizes of types. > We have carefully added a member to the PyArrayDescr struct; i.e. it > shouldn't cause any actual problems, but Cython does the check > anyways. This affects a few modules in scipy, but shouldn't have > affected anything in matplotlib. The traceback may help us identify > the issue you are seeing. > > It is too late for the tracebacks. I have already removed the problematic packages and did clean installs. However, next time I will be more careful while reporting such issues. If it helps, in both matplotlib (via ipython -pylab) and scipy.stats import cases the runtime errors was raised due to numpy.core.multiarray module import. > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From gokhansever at gmail.com Mon Dec 28 13:15:11 2009 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Mon, 28 Dec 2009 12:15:11 -0600 Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2 In-Reply-To: <49d6b3500912280916q69a50e2ci318ec0509ac11bec@mail.gmail.com> References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <49d6b3500912220841q38c03fc2t9b08de1510f1e827@mail.gmail.com> <5b8d13220912241457o93f4087q7a492e99e8397ac9@mail.gmail.com> <49d6b3500912261319u4d8dbc0bqed4bb339a952cc75@mail.gmail.com> <5b8d13220912261609m14c0b15cw37ffa9276c8b1711@mail.gmail.com> <49d6b3500912280831w1ae206e4nf747873c89802c7b@mail.gmail.com> <49d6b3500912280900y36db51bfka3184d3c204e7d6@mail.gmail.com> <3d375d730912280907o1120fa2bwb8e79519d9c32972@mail.gmail.com> <49d6b3500912280916q69a50e2ci318ec0509ac11bec@mail.gmail.com> Message-ID: <49d6b3500912281015h5fa4c5a1g74650c78822bae7@mail.gmail.com> On Mon, Dec 28, 2009 at 11:16 AM, G?khan Sever wrote: > > > On Mon, Dec 28, 2009 at 11:07 AM, Robert Kern wrote: > >> On Mon, Dec 28, 2009 at 11:00, G?khan Sever >> wrote: >> >> > One interesting thing I have noticed while installing the numpy from the >> > source is that numpy dependent libraries must be re-installed and this >> must >> > be a clean re-install. For instance I can't import some matplotlib and >> scipy >> > modules without making a fresh installation for these packages. My >> attempts >> > result with a runtime error. >> >> Please, please, always copy-and-paste the traceback when reporting an >> error. I know you aren't formally reporting a bug here, but it always >> helps. >> >> > Could someone clarify this point? Is this due >> > to API change in the numpy core? >> >> Cython/Pyrex code does a runtime check on the struct sizes of types. >> We have carefully added a member to the PyArrayDescr struct; i.e. it >> shouldn't cause any actual problems, but Cython does the check >> anyways. This affects a few modules in scipy, but shouldn't have >> affected anything in matplotlib. The traceback may help us identify >> the issue you are seeing. >> >> > It is too late for the tracebacks. I have already removed the problematic > packages and did clean installs. However, next time I will be more careful > while reporting such issues. 
If it helps, in both matplotlib (via ipython
> -pylab) and scipy.stats import cases the runtime errors was raised due to
> numpy.core.multiarray module import.
>
>
>> --
>> Robert Kern
>>
>> "I have come to believe that the whole world is an enigma, a harmless
>> enigma that is made terrible by our own mad attempt to interpret it as
>> though it had an underlying truth."
>> -- Umberto Eco
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
>
> --
> Gökhan
>

Here is another interesting point to consider. To reproduce the runtime
error I mentioned previously, I downgraded numpy (from the latest
check-out installation) following these steps:

    svn co http://svn.scipy.org/svn/numpy/branches/1.3.x/ numpy
    cd numpy
    python setup.py install

    Writing /usr/lib/python2.6/site-packages/numpy-1.3.1.dev8031-py2.6.egg-info

    I[2]: import matplotlib.pyplot as plt
    Segmentation fault

    I[3]: from scipy import stats
    Segmentation fault

I have installed matplotlib and scipy using the latest numpy dev
version. A little later I will downgrade matplotlib and scipy to their
previous stable versions, and compile them using numpy 1.3.x.
Afterwards I will update numpy and test to see if I can re-produce the
runtime error to provide the tracebacks. First, let me know whether any
tracebacks are needed for these segfaults, or whether these are known
failures?

================================================================================
Platform : Linux-2.6.29.6-217.2.3.fc11.i686.PAE-i686-with-fedora-11-Leonidas
Python   : ('CPython', 'tags/r26', '66714')
================================================================================

--
Gökhan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dagss at student.matnat.uio.no Mon Dec 28 13:49:13 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Mon, 28 Dec 2009 19:49:13 +0100
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com>
Message-ID:

David wrote:
> Repository
> ========
>
> The goal here is to have something like CRAN
> (http://cran.r-project.org/web/views/), ideally with a build farm so
> that whenever anyone submits a package to our repository, it would
> automatically be checked, and built for windows/mac os x and maybe a
> few major linux distributions. One could investigate the build service
> from open suse to that end (http://en.opensuse.org/Build_Service),
> which is based on xen VM to build installers in a reproducible way.

Do you mean automatic generation of Ubuntu debs, Debian debs, Windows
MSI installers, Windows EXE installers, and so on? (If so, then great!)

If this is the goal, I wonder whether, looking outside of Python-land,
one might find something that already does this -- there are a lot of
different package formats, "Linux meta-distributions", "install
everywhere packages" and so on. Of course, toydist could have any such
tool as a backend / in a pipeline.

> What's next ?
> ==========
>
> At this point, I would like to ask for help and comments, in particular:
> - Does all this make sense, or hopelessly intractable ?
> - Besides the points I have mentioned, what else do you think is needed ?

Hmm. What I miss is a discussion of the other native libraries that the
Python libraries need to bundle.

Is it assumed that one wants to continue linking C and Fortran code
directly into Python .so modules, like the scipy library currently
does? Let me take CHOLMOD (sparse Cholesky) as an example.

- The Python package cvxopt uses it, simply by linking about 20 C files
  directly into the Python-loadable module (.so) which goes into the
  Python site-packages (or wherever). This makes sure it just works.
  But it doesn't feel like the right way at all.

- scikits.sparse.cholmod OTOH simply specifies libraries=["cholmod"],
  and leaves it up to the end-user to make sure it is installed. Linux
  users with root access can simply apt-get, but it is a pain for
  everybody else (Windows, Mac, non-root Linux).

- Currently I'm making a Sage SPKG for CHOLMOD. This essentially gets
  the job done by not bothering about the problem, not even using the
  OS-installed Python.
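(For concreteness, the second approach above boils down to a couple of
lines of distutils boilerplate -- the module and file names here are
invented:

    from distutils.core import setup, Extension

    ext = Extension("cholmod_wrapper",
                    sources=["cholmod_wrapper.c"],
                    libraries=["cholmod"])  # resolved at link time

    setup(name="cholmod-example", version="0.1", ext_modules=[ext])

so the hard part is not expressing the dependency, but making sure the
library is actually present on the user's machine.)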
Something that could spit out Sage SPKGs, Ubuntu debs, and Windows
installers, with Python code, C/Fortran code, or a mix (and put each in
the place preferred by the system in question), seems ideal. Of course
one would still need to make sure that the code builds properly
everywhere, but just solving the distribution part of this would be a
huge step ahead.

What I'm saying is that this is a software distribution problem in
general, and I'm afraid that Python-specific solutions are too narrow.

Dag Sverre

From cournape at gmail.com Mon Dec 28 13:55:13 2009
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 29 Dec 2009 03:55:13 +0900
Subject: [Numpy-discussion] [SciPy-dev] [matplotlib-devel] Announcing toydist, improving distribution and packaging situation
In-Reply-To:
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com>
	<5b8d13220912280703r43b8122ds609cbcd4a9a75ead@mail.gmail.com>
Message-ID: <5b8d13220912281055m72fbbc53ld2c3a6d8abe2425a@mail.gmail.com>

On Tue, Dec 29, 2009 at 3:03 AM, Neal Becker wrote:
> David Cournapeau wrote:
>
>> On Mon, Dec 28, 2009 at 11:47 PM, Stefan Schwarzburg
>> wrote:
>>> Hi,
>>> I would like to add a comment from the user perspective:
>>>
>>> - the main reason why I'm not satisfied with pypi/distutils/etc. and why
>>> I will not be satisfied with toydist (with the features you listed), is
>>> that they break my installation (debian/ubuntu).
>>
>> Toydist (or distutils) does not break anything as is. It would be like
>> saying make breaks debian - it does not make much sense. As stated,
>> one of the goal of giving up distutils is to make packaging by os
>> vendors easier. In particular, by allowing to follow the FHS, and
>> making things more consistent. It should be possible to automatically
>> convert most packages to .deb (or .rpm) relatively easily. When you
>> look at the numpy .deb package, most of the issues are distutils
>> issues, and almost everything else can be done automatically.
>>
>> Note that even ignoring the windows problem, there are systems to do
>> the kind of things I am talking about for linux-only systems (the
>> opensuse build service), because distributions are not always really
>> good at tracking fast changing softwares. IOW, traditional linux
>> packaging has some issues as well. And anyway, nothing prevents debian
>> or other OS vendors to package things as they want (as they do for R
>> packages).
>>
>> David
>
> I think the breakage that is referred to I can describe on my favorite
> system, fedora.
>
> I can install the fedora numpy rpm using yum.  I could also use
> easy_install.
> Unfortunately:
> 1) Each one knows nothing about the other
> 2) They may install things into conflicting paths.  In particular, on fedora
> arch-dependent things go in /usr/lib64/python/site-packages while
> arch-independent goes into /usr/lib/python...  If you mix yum with
> easy_install (or setuptools), you many times wind up with 2 versions and a
> lot of confusion.
>
> This is NOT unusual.  Let's say I have numpy-1.3.0 installed from rpms.  I
> see the announcement of numpy-1.4.0, and decide I want it, before the rpm is
> available, so I use easy_install.  Now numpy-1.4.0 shows up as a standard
> rpm, and a subsequent update (which could be automatic!) could produce a
> broken system.

Several points:

- First, this is caused by a distutils misfeature of defaulting to
  /usr. This is a mistake. It should default to /usr/local, as does
  every other install method from sources.
- A lot of instructions start with sudo easy_install... This is very
  bad advice, especially given the previous issue.

> I don't really know what could be done about it.  Perhaps a design that
> attempts to use native backends for installation where available?

The idea would be that for a few major distributions at least, you
would have .rpm available on the repository. If you install from
sources, there would be a few mechanisms to avoid your exact issue
(like maybe defaulting to --user kind of installs). Of course, it can
only be dealt with up to a point.

David

From cournape at gmail.com Mon Dec 28 14:14:01 2009
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 29 Dec 2009 04:14:01 +0900
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To:
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com>
Message-ID: <5b8d13220912281114x18687d85wdaab008b3243a846@mail.gmail.com>

On Tue, Dec 29, 2009 at 3:49 AM, Dag Sverre Seljebotn wrote:
>
> Do you mean automatic generation of Ubuntu debs, Debian debs, Windows
> MSI installers, Windows EXE installers, and so on? (If so, then great!)

Yes (although this is not yet implemented). In particular, on windows,
I want to implement a scheme so that you can convert from eggs to .exe
and vice versa, so people can still install as exe (or msi), even
though the method would default to eggs.

> If this is the goal, I wonder whether, looking outside of Python-land,
> one might find something that already does this -- there are a lot of
> different package formats, "Linux meta-distributions", "install
> everywhere packages" and so on.

Yes, there are things like 0install or autopackage. I think those are
doomed to fail, as long as they are not supported thoroughly by the
distribution. Instead, my goal here is much simpler: producing rpm/deb.
It does not solve every issue (installs by non-root users, multiple
parallel versions), but one has to be realistic :)

I think automatically built rpm/deb and easy integration with native
methods can solve a lot of issues already.

>
> - Currently I'm making a Sage SPKG for CHOLMOD. This essentially gets
> the job done by not bothering about the problem, not even using the
> OS-installed Python.
>
> Something that could spit out Sage SPKGs, Ubuntu debs, and Windows
> installers, with Python code, C/Fortran code, or a mix (and put each in
> the place preferred by the system in question), seems ideal. Of course
> one would still need to make sure that the code builds properly
> everywhere, but just solving the distribution part of this would be a
> huge step ahead.
On windows, this issue may be solved using eggs: enstaller has a feature where dll put in a special location of an egg are installed in python such as they are found by the OS loader. One could have mechanisms based on $ORIGIN + rpath on linux to solve this issue for local installs on Linux, etc... But again, one has to be realistic on the goals. With toydist, I want to remove all the pile of magic, hacks built on top of distutils so that people can again hack their own solutions, as it should have been from the start (that's a big plus of python in general). It won't magically solve every issue out there, but it would hopefully help people to make their own. Bundling solutions like SAGE, EPD, etc... are still the most robust ways to deal with those issues in general, and I do not intended to replace those. > What I'm saying is that this is a software distribution problem in > general, and I'm afraid that Python-specific solutions are too narrow. Distribution is a hard problem. Instead of pushing a very narrow (and mostly ill-funded) view of how people should do things like distutils/setuptools/pip/buildout do, I want people to be able to be able to build their own solutions. No more "use this magic stick v 4.0.3.3.14svn1234, trust me it work you don't have to understand" which is too prevalant with those tools, which has always felt deeply unpythonic to me. David From dagss at student.matnat.uio.no Mon Dec 28 14:21:01 2009 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Mon, 28 Dec 2009 20:21:01 +0100 Subject: [Numpy-discussion] [SciPy-dev] Announcing toydist, improving distribution and packaging situation In-Reply-To: <5b8d13220912281114x18687d85wdaab008b3243a846@mail.gmail.com> References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912281114x18687d85wdaab008b3243a846@mail.gmail.com> Message-ID: <930cd7013d563c4b882be2ad6358a9d7.squirrel@webmail.uio.no> > On Tue, Dec 29, 2009 at 3:49 AM, Dag Sverre Seljebotn > wrote: > >> >> Do you here mean automatic generation of Ubuntu debs, Debian debs, >> Windows >> MSI installer, Windows EXE installer, and so on? (If so then great!) > > Yes (although this is not yet implemented). In particular on windows, > I want to implement a scheme so that you can convert from eggs to .exe > and vice et versa, so people can still install as exe (or msi), even > though the method would default to eggs. > >> If this is the goal, I wonder if one looks outside of Python-land one >> might find something that already does this -- there's a lot of >> different >> package format, "Linux meta-distributions", "install everywhere >> packages" >> and so on. > > Yes, there are things like 0install or autopackage. I think those are > deemed to fail, as long as it is not supported thoroughly by the > distribution. Instead, my goal here is much simpler: producing > rpm/deb. It does not solve every issue (install by non root, multiple > // versions), but one has to be realistic :) > > I think automatically built rpm/deb, easy integration with native > method can solve a lot of issues already. > >> >> ?- Currently I'm making a Sage SPKG for CHOLMOD. This essentially gets >> the >> job done by not bothering about the problem, not even using the >> OS-installed Python. >> >> Something that would spit out both Sage SPKGs, Ubuntu debs, Windows >> installers, both with Python code and C/Fortran code or a mix (and put >> both in the place preferred by the system in question), seems ideal. 
But again, one has to be realistic about the goals. With toydist, I want
to remove the whole pile of magic and hacks built on top of distutils, so
that people can again hack their own solutions, as it should have been
from the start (that's a big plus of python in general). It won't
magically solve every issue out there, but it would hopefully help people
to make their own.

Bundling solutions like SAGE, EPD, etc... are still the most robust ways
to deal with those issues in general, and I do not intend to replace
those.

> What I'm saying is that this is a software distribution problem in
> general, and I'm afraid that Python-specific solutions are too narrow.

Distribution is a hard problem. Instead of pushing the very narrow (and
mostly ill-founded) view of how people should do things that
distutils/setuptools/pip/buildout push, I want people to be able to build
their own solutions. No more "use this magic stick v 4.0.3.3.14svn1234,
trust me, it works, you don't have to understand", which is too prevalent
with those tools and has always felt deeply unpythonic to me.

David

From dagss at student.matnat.uio.no Mon Dec 28 14:21:01 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Mon, 28 Dec 2009 20:21:01 +0100
Subject: [Numpy-discussion] [SciPy-dev] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <5b8d13220912281114x18687d85wdaab008b3243a846@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912281114x18687d85wdaab008b3243a846@mail.gmail.com>
Message-ID: <930cd7013d563c4b882be2ad6358a9d7.squirrel@webmail.uio.no>

> [...]

Thanks, this cleared things up, and I like the direction this is
heading. Thanks a lot for doing this!

Dag Sverre

From cournape at gmail.com Mon Dec 28 15:04:01 2009
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 29 Dec 2009 05:04:01 +0900
Subject: [Numpy-discussion] [ANN] Numpy 1.4.0 release
In-Reply-To: References: <5b8d13220912271734m29388676m4db200fa6a02dc92@mail.gmail.com>
Message-ID: <5b8d13220912281204g55e8225eub3fc666ac4f4cae4@mail.gmail.com>

On Mon, Dec 28, 2009 at 3:28 PM, Charles R Harris wrote:
>
> On Sun, Dec 27, 2009 at 6:34 PM, David Cournapeau wrote:
>>
>> Hi,
>>
>> I am pleased to announce the release of numpy 1.4.0. The highlights of
>> this release are:
>>
>
> The new files aren't up on sourceforge yet, although I expect they will be
> the same as rc2. I put the announcement up on the scipy homepage anyway.

They are on the sourceforge website, now.
David

From rob.clewley at gmail.com Mon Dec 28 16:23:35 2009
From: rob.clewley at gmail.com (Rob Clewley)
Date: Mon, 28 Dec 2009 16:23:35 -0500
Subject: [Numpy-discussion] [ANN] PyDSTool 0.88 -- dynamical systems modeling tools
Message-ID:

A new release of the dynamical systems modeling toolbox PyDSTool is
available from Sourceforge: http://www.sourceforge.net/projects/pydstool/

Highlights from the release notes:

* Cleanup of global imports, especially: entire numpy.random and linalg
namespaces no longer imported by default
* Added support for 'min' and 'max' keywords in functional specifications
(for ODE right-hand sides, for instance)
* Optimization tools from third-party genericOpt (included with
permission) and improved parameter estimation examples making use of this
code
* Numerical phase-response calculations now possible in PRC toolbox
* Fully-fledged DSSRT toolbox for neural modeling (see wiki page)
* New tests/demonstrations in PyDSTool/tests
* Major improvements to intelligent expr2func (symbolic -> python
function conversion)
* Improved compatibility with cross-platform use and with recent python
versions and associated libraries
* Added many minor features (see timeline on Trac
http://jay.cam.cornell.edu/pydstool/timeline)
* Fixed many bugs and quirks (see timeline on Trac
http://jay.cam.cornell.edu/pydstool/timeline)

This is mainly a bugfix release in preparation for a substantial upgrade
at version 0.90, which will have a proper installer, unit testing,
symbolic expression support via SymPy, and greatly improved interfacing
to legacy ODE integrators. These features are being actively developed in
2009/2010.

For installation and setting up, please carefully read the GettingStarted
page at our wiki for platform-specific details:
http://pydstool.sourceforge.net

Please use the bug tracker and user discussion list at Sourceforge to
report bugs or provide feedback. Code and documentation contributions are
always welcome.

Regards,
Rob Clewley

From gokhansever at gmail.com Mon Dec 28 16:30:01 2009
From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=)
Date: Mon, 28 Dec 2009 15:30:01 -0600
Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2
In-Reply-To: <49d6b3500912281015h5fa4c5a1g74650c78822bae7@mail.gmail.com>
References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <49d6b3500912220841q38c03fc2t9b08de1510f1e827@mail.gmail.com> <5b8d13220912241457o93f4087q7a492e99e8397ac9@mail.gmail.com> <49d6b3500912261319u4d8dbc0bqed4bb339a952cc75@mail.gmail.com> <5b8d13220912261609m14c0b15cw37ffa9276c8b1711@mail.gmail.com> <49d6b3500912280831w1ae206e4nf747873c89802c7b@mail.gmail.com> <49d6b3500912280900y36db51bfka3184d3c204e7d6@mail.gmail.com> <3d375d730912280907o1120fa2bwb8e79519d9c32972@mail.gmail.com> <49d6b3500912280916q69a50e2ci318ec0509ac11bec@mail.gmail.com> <49d6b3500912281015h5fa4c5a1g74650c78822bae7@mail.gmail.com>
Message-ID: <49d6b3500912281330g4559722as5a227abdf3ea7ee1@mail.gmail.com>

On Mon, Dec 28, 2009 at 12:15 PM, Gökhan Sever wrote:
>
> On Mon, Dec 28, 2009 at 11:16 AM, Gökhan Sever wrote:
>>
>> On Mon, Dec 28, 2009 at 11:07 AM, Robert Kern wrote:
>>> On Mon, Dec 28, 2009 at 11:00, Gökhan Sever wrote:
>>>
>>> > One interesting thing I have noticed while installing numpy from the
>>> > source is that numpy-dependent libraries must be re-installed, and
>>> > this must be a clean re-install. For instance I can't import some
>>> > matplotlib and scipy modules without making a fresh installation for
>>> > these packages.
>>> > My attempts result in a runtime error.
>>>
>>> Please, please, always copy-and-paste the traceback when reporting an
>>> error. I know you aren't formally reporting a bug here, but it always
>>> helps.
>>>
>>> > Could someone clarify this point? Is this due
>>> > to an API change in the numpy core?
>>>
>>> Cython/Pyrex code does a runtime check on the struct sizes of types.
>>> We have carefully added a member to the PyArray_Descr struct; i.e. it
>>> shouldn't cause any actual problems, but Cython does the check
>>> anyways. This affects a few modules in scipy, but shouldn't have
>>> affected anything in matplotlib. The traceback may help us identify
>>> the issue you are seeing.
>>
>> It is too late for the tracebacks. I have already removed the
>> problematic packages and did clean installs. However, next time I will
>> be more careful while reporting such issues. If it helps, in both the
>> matplotlib (via ipython -pylab) and scipy.stats import cases the runtime
>> errors were raised from the numpy.core.multiarray module import.
>>
>>> --
>>> Robert Kern
>>>
>>> "I have come to believe that the whole world is an enigma, a harmless
>>> enigma that is made terrible by our own mad attempt to interpret it as
>>> though it had an underlying truth."
>>> -- Umberto Eco
>
> Here is another interesting point to consider. To reproduce the runtime
> error I mentioned previously, I downgraded numpy (from the latest
> check-out installation) following these steps:
>
> svn co http://svn.scipy.org/svn/numpy/branches/1.3.x/ numpy
> cd numpy
> python setup.py install
> Writing /usr/lib/python2.6/site-packages/numpy-1.3.1.dev8031-py2.6.egg-info
>
> I[2]: import matplotlib.pyplot as plt
> Segmentation fault
>
> I[3]: from scipy import stats
> Segmentation fault
>
> I have installed matplotlib and scipy using the latest numpy dev version.
> A little later I will downgrade matplotlib and scipy to their previous
> stable versions, and compile them using numpy 1.3.x. Afterwards I will
> update numpy and test to see if I can reproduce the runtime error to
> provide the tracebacks. First let me know whether any tracebacks are
> needed for these segfaults, or whether these are known failures?
>
> ================================================================================
> Platform : Linux-2.6.29.6-217.2.3.fc11.i686.PAE-i686-with-fedora-11-Leonidas
> Python : ('CPython', 'tags/r26', '66714')
> ================================================================================
>
> --
> Gökhan

Since no one has replied, I tried to reproduce the runtime error but
ended up with different errors. Read on for the details:

svn co http://svn.scipy.org/svn/scipy/tags/0.7.1/ scipy
python setup.py install
Writing /usr/lib/python2.6/site-packages/scipy-0.7.1-py2.6.egg-info

svn co https://matplotlib.svn.sourceforge.net/svnroot/matplotlib/tags/v0_99_0/matplotlib
python setup.py install
Writing /usr/lib/python2.6/site-packages/matplotlib-0.99.0-py2.6.egg-info

installing them against:

>>> import numpy
>>> numpy.__version__
'1.3.1.dev8031'

The scipy import is fine, but matplotlib fails with a different import
error. I wonder whether the buildbots test the source and releases
against different numpy versions, or whether it is just my system acting
weird.
>>> from scipy import stats

>>> import matplotlib.pyplot as plt
Traceback (most recent call last):
  File "", line 1, in
  File "/home/gsever/Desktop/python-repo/matplotlib/lib/matplotlib/pyplot.py", line 6, in
    from matplotlib.figure import Figure, figaspect
  File "/home/gsever/Desktop/python-repo/matplotlib/lib/matplotlib/figure.py", line 17, in
    import artist
  File "/home/gsever/Desktop/python-repo/matplotlib/lib/matplotlib/artist.py", line 5, in
    from transforms import Bbox, IdentityTransform, TransformedBbox, TransformedPath
  File "/home/gsever/Desktop/python-repo/matplotlib/lib/matplotlib/transforms.py", line 34, in
    from matplotlib._path import affine_transform
ImportError: No module named _path

Anyways, back to the main point. First remove numpy 1.3.x:

[root at ccn site-packages]# rm -rf numpy
[root at ccn site-packages]# rm -rf numpy-1.3.1.dev8031-py2.6.egg-info

and install from the latest trunk:

svn co http://svn.scipy.org/svn/numpy/trunk numpy
python setup.py install
Writing /usr/lib/python2.6/site-packages/numpy-1.5.0.dev8032-py2.6.egg-info

This time the scipy import fails:

>>> from scipy import stats
Traceback (most recent call last):
  File "", line 1, in
  File "/usr/lib/python2.6/site-packages/scipy/stats/__init__.py", line 7, in
    from stats import *
  File "/usr/lib/python2.6/site-packages/scipy/stats/stats.py", line 203, in
    from morestats import find_repeats #is only reference to scipy.stats
  File "/usr/lib/python2.6/site-packages/scipy/stats/morestats.py", line 7, in
    import distributions
  File "/usr/lib/python2.6/site-packages/scipy/stats/distributions.py", line 27, in
    import vonmises_cython
  File "numpy.pxd", line 30, in scipy.stats.vonmises_cython (scipy/stats/vonmises_cython.c:2939)
ValueError: numpy.dtype does not appear to be the correct type object

import matplotlib.pyplot as plt yields the same error as the previous
import attempt. I must have caught an old issue.

--
Gökhan

From charlesr.harris at gmail.com Mon Dec 28 16:36:44 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 28 Dec 2009 14:36:44 -0700
Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2
In-Reply-To: <49d6b3500912281330g4559722as5a227abdf3ea7ee1@mail.gmail.com>
References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <5b8d13220912241457o93f4087q7a492e99e8397ac9@mail.gmail.com> <49d6b3500912261319u4d8dbc0bqed4bb339a952cc75@mail.gmail.com> <5b8d13220912261609m14c0b15cw37ffa9276c8b1711@mail.gmail.com> <49d6b3500912280831w1ae206e4nf747873c89802c7b@mail.gmail.com> <49d6b3500912280900y36db51bfka3184d3c204e7d6@mail.gmail.com> <3d375d730912280907o1120fa2bwb8e79519d9c32972@mail.gmail.com> <49d6b3500912280916q69a50e2ci318ec0509ac11bec@mail.gmail.com> <49d6b3500912281015h5fa4c5a1g74650c78822bae7@mail.gmail.com> <49d6b3500912281330g4559722as5a227abdf3ea7ee1@mail.gmail.com>
Message-ID:

On Mon, Dec 28, 2009 at 2:30 PM, Gökhan Sever wrote:
> [...]
>
> This time the scipy import fails:
>
> >>> from scipy import stats
> Traceback (most recent call last):
> ...
> ValueError: numpy.dtype does not appear to be the correct type object

That is the cython problem. I think they fixed it, but I don't know if
that fix is in the recent release.

Chuck

From gael.varoquaux at normalesup.org Mon Dec 28 18:02:08 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Tue, 29 Dec 2009 00:02:08 +0100
Subject: [Numpy-discussion] [matplotlib-devel] [SciPy-dev] Announcing toydist, improving distribution and packaging situation
In-Reply-To: References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912280703r43b8122ds609cbcd4a9a75ead@mail.gmail.com> <5b8d13220912281055m72fbbc53ld2c3a6d8abe2425a@mail.gmail.com>
Message-ID: <20091228230208.GA9952@phare.normalesup.org>

On Mon, Dec 28, 2009 at 02:29:24PM -0500, Neal Becker wrote:
> Perhaps this could be useful:
> http://checkinstall.izto.org/

Yes, checkinstall is really cool. However, I tend to prefer things with
no magic that I don't have to sandbox to know what they are doing. This
is why I am also happy to hear about toydist.
Gaël

From cournape at gmail.com Mon Dec 28 23:38:05 2009
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 29 Dec 2009 13:38:05 +0900
Subject: [Numpy-discussion] [SciPy-dev] [matplotlib-devel] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <20091228230208.GA9952@phare.normalesup.org>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912280703r43b8122ds609cbcd4a9a75ead@mail.gmail.com> <5b8d13220912281055m72fbbc53ld2c3a6d8abe2425a@mail.gmail.com> <20091228230208.GA9952@phare.normalesup.org>
Message-ID: <5b8d13220912282038m5a32f6b4iae61b3c9278f562f@mail.gmail.com>

On Tue, Dec 29, 2009 at 8:02 AM, Gael Varoquaux wrote:
> On Mon, Dec 28, 2009 at 02:29:24PM -0500, Neal Becker wrote:
>> Perhaps this could be useful:
>> http://checkinstall.izto.org/
>
> Yes, checkinstall is really cool. However, I tend to prefer things with
> no magic that I don't have to sandbox to know what they are doing.

I am still not sure the design is entirely right, but the install
command in toymaker just reads a build manifest, which is a file
containing all the files necessary for the install. It is explicit, and
lists every file to be installed. By design, it cannot install anything
outside this manifest. That's also how eggs are built (and soon windows
installers and mac os x pkg).

cheers,

David

From cournape at gmail.com Tue Dec 29 06:52:01 2009
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 29 Dec 2009 20:52:01 +0900
Subject: [Numpy-discussion] [ANN] numpy 1.4.0 rc2
In-Reply-To: <49d6b3500912280900y36db51bfka3184d3c204e7d6@mail.gmail.com>
References: <5b8d13220912220705s4febcbe9k10332ecff9166d5b@mail.gmail.com> <49d6b3500912220841q38c03fc2t9b08de1510f1e827@mail.gmail.com> <5b8d13220912241457o93f4087q7a492e99e8397ac9@mail.gmail.com> <49d6b3500912261319u4d8dbc0bqed4bb339a952cc75@mail.gmail.com> <5b8d13220912261609m14c0b15cw37ffa9276c8b1711@mail.gmail.com> <49d6b3500912280831w1ae206e4nf747873c89802c7b@mail.gmail.com> <49d6b3500912280900y36db51bfka3184d3c204e7d6@mail.gmail.com>
Message-ID: <5b8d13220912290352n4cbca8casacaef38bcba2241@mail.gmail.com>

On Tue, Dec 29, 2009 at 2:00 AM, Gökhan Sever wrote:
>
> One interesting thing I have noticed while installing numpy from the
> source is that numpy-dependent libraries must be re-installed, and this
> must be a clean re-install.

This is expected if you build libraries against dev versions of numpy -
we simply cannot guarantee that every revision in the trunk will be ABI
compatible.

David

From renesd at gmail.com Tue Dec 29 08:27:18 2009
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Tue, 29 Dec 2009 13:27:18 +0000
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com>
Message-ID: <64ddb72c0912290527s1143efc7g3efe93936ca5de5@mail.gmail.com>

Hi,

In the toydist proposal/release notes, I would address 'what does
toydist do better' more explicitly.

**** A big problem for science users is that numpy does not work with
pypi + (easy_install, buildout or pip) and python 2.6. ****

Working with the rest of the python community as much as possible is
likely a good goal. At least getting numpy to work with the latest tools
would be great.
An interesting read is the history of python packaging here:
http://faassen.n--tree.net/blog/view/weblog/2009/11/09/0

Buildout is what a lot of the python community are using now. Getting
numpy to work nicely with buildout and pip would be a good start.
numpy used to work with buildout in python2.5, but not with 2.6.
buildout lets other team members get up to speed with a project by
running one command. It installs things in the local directory, not
system wide. So you can have different dependencies per project.

Plenty of good work is going on with python packaging. Lots of the
python community are not using compiled packages, however, so the
requirements are different. There are a lot of people (thousands)
working with the python packaging system, improving it, and building
tools around it. Distribute, for example, has many committers, as do
buildout/pip. E.g., there are fifty or so buildout plugins, which people
use to customise their builds (see the buildout recipe list on pypi at
http://pypi.python.org/pypi?:action=browse&show=all&c=512 ).

There are build farms for windows packages and OSX uploaded to pypi.
Start uploading pre releases to pypi, and you get these for free (once
you make numpy compile out of the box on those compile farms). There
are compile farms for other OSes too... like ubuntu/debian, macports
etc. Some distributions even automatically download, compile and
package new releases once they spot a new file on your ftp/web site.
Speeding the release cycle up to the point of being continuous lets
people take advantage of all these tools together. If you get your tests
running after the build step, all of these distributions also turn into
test farms :)

pypm: http://pypm.activestate.com/list-n.html#numpy
ubuntu PPA: https://launchpad.net/ubuntu/+ppas
the snakebite project: http://www.snakebite.org/ (seems mostly dead...
but they have a lot of hardware)
suse build service: https://build.opensuse.org/
pony-build: http://wiki.github.com/ctb/pony-build

zope and pygame also have their own build/test farms - they are two
other compiled python package projects - as do a number of other python
projects (e.g. twisted...). Projects like pony-build should hopefully
make it easier for people to run their own build farms, independently of
the main projects. You just really need a script to (download, build,
test, post results), and then post a link to your mailing list... and
someone will be able to run a build farm.

Documentation projects are being worked on to document, give tutorials
on, and generally make python packaging easier all round. As witnessed
by the 20 or so releases on pypi every day (and growing), lots of people
are using the python packaging tools successfully.

Documenting how people can make numpy add-on libraries (plugins) would
encourage people to do so. Currently there is no documentation from the
numpy community, or encouragement to do so. This, combined with numpy
being broken with python 2.6 + pypi, will result in fewer
science-related packages.

There is still a whole magnitude of people not releasing on pypi though;
there are thousands of projects on the pygame.org website that are not
on the pypi website, for example. There are likely many hundreds or
thousands of scientific projects not listed on there either. Given all
of these projects not on pypi, obviously things could be improved. The
pygame.org website also shows that community-specific websites are very
helpful. A science view of pypi would make it much more useful - so
people don't have to look through web/game/database etc. packages.
Here is a view of 535 science/engineering related packages on pypi now:
http://pypi.python.org/pypi?:action=browse&c=385

458 science/research packages on pypi:
http://pypi.python.org/pypi?:action=browse&show=all&c=40

So there are already hundreds of science-related packages, and hundreds
of people making those packages for pypi. Not too bad.

Distribution of applications is another issue that needs improving - so
that people can share applications without needing to install a whole
bunch of things. Think about sending applications to your grandma. Do
you ask her to download python, grab these libraries, do this... do
that? It would be much better if you could give her a url, and away you
go!

Bug tracking, and diff tracking between distributions, is an area where
many projects can improve. Searching through the distributions' bug
trackers, and finding the diffs to apply to the core, dramatically helps
packages get updated. So does maintaining good communication with the
different distributions' packagers.

I'm not sure making a separate build tool is a good idea. I think going
with the rest of the python community, and improving the tools there, is
a better idea.

cheers,

P.S. some notes on toydist itself.
- toydist convert is cool for people converting a setup.py. This means
that most people can try out toydist right away. But what does it gain
these people who convert their setup.py files? (A minimal example of the
kind of setup.py involved follows this list.)
- a toydist convert that generates a setup.py file might be cool :) It
could also generate a Makefile and a configure script :)
- arbitrary code execution happens when building or testing with
toydist. However the source packaging part does not with toydist.
Compiling, running and testing the code happens most of the time anyway,
so moving the sandboxing to the OS is more useful, as are reviews, trust
and reputation of different packages.
- it should be possible to build this toydist functionality as a
distutils/distribute/buildout extension.
- extending toydist? How are extensions made? There are 175 buildout
packages which extend buildout, and many that extend
distutils/setuptools - so extension of build tools is a necessary thing.
- scripting builds in python for python developers is easier than
scripting in a different, new language.
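For reference, a minimal setup.py of the sort such a converter would
consume (a sketch using plain distutils; the package name and files are
made up):

    # Toy package description: name/version metadata, one package, and
    # some bundled data files.
    from distutils.core import setup

    setup(
        name="example",
        version="0.1",
        description="Toy package used to try out conversion",
        packages=["example"],
        package_data={"example": ["data/*.dat"]},
    )
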
From cournape at gmail.com Tue Dec 29 09:22:52 2009
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 29 Dec 2009 23:22:52 +0900
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <64ddb72c0912290527s1143efc7g3efe93936ca5de5@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <64ddb72c0912290527s1143efc7g3efe93936ca5de5@mail.gmail.com>
Message-ID: <5b8d13220912290622m2f0ec2c3x5a26e63118cb29a0@mail.gmail.com>

On Tue, Dec 29, 2009 at 10:27 PM, René Dudfield wrote:
> Hi,
>
> In the toydist proposal/release notes, I would address 'what does
> toydist do better' more explicitly.
>
> **** A big problem for science users is that numpy does not work with
> pypi + (easy_install, buildout or pip) and python 2.6. ****
>
> Working with the rest of the python community as much as possible is
> likely a good goal.

Yes, but it is hopeless. Most of what is being discussed on
distutils-sig is useless for us, and what matters is ignored at best. I
think most people on distutils-sig are misguided, and I don't think the
community is representative of people concerned with packaging anyway -
most of the participants seem to come from web development, and are
mostly dismissive of others' concerns (OS packagers, etc...).

I want to note that I am not starting this out of thin air - I know most
of the distutils code very well, and I have been the mostly sole
maintainer of numpy.distutils for 2 years now. I have written extensive
distutils extensions, in particular numscons, which is able to fully
build numpy, scipy and matplotlib on every platform that matters.

Simply put, the distutils code is horrible (this is an objective fact)
and flawed beyond repair (this is more controversial). IMHO, it has
almost no useful feature, except being standard. If you want a more
detailed explanation of why I think distutils and all the tools on top
of it are deeply flawed, you can look here:
http://cournape.wordpress.com/2009/04/01/python-packaging-a-few-observations-cabal-for-a-solution/

> numpy used to work with buildout in python2.5, but not with 2.6.
> buildout lets other team members get up to speed with a project by
> running one command. It installs things in the local directory, not
> system wide. So you can have different dependencies per project.

I don't think it is a very useful feature, honestly. It seems to me that
they created a huge infrastructure to split packages into tiny pieces,
and then try to get them back together, imagining that multiple
installed versions are a replacement for backward compatibility. Anyone
with extensive packaging experience knows that's a deeply flawed model
in general.

> Plenty of good work is going on with python packaging.

That's the opposite of my experience. What I care about is:
- tools which are hackable and easily extensible
- robust install/uninstall
- a real, DAG-based build system
- explicitness and repeatability

None of this is supported by the tools, and the current directions go
even further away. When I have to explain at length why the
command-based design of distutils is a nightmare to work with, I don't
feel very confident that the current maintainers are aware of the
issues, for example. It shows that they never had to extend distutils
much.

>
> There are build farms for windows packages and OSX uploaded to pypi.
> Start uploading pre releases to pypi, and you get these for free (once
> you make numpy compile out of the box on those compile farms). There
> are compile farms for other OSes too... like ubuntu/debian, macports
> etc. Some distributions even automatically download, compile and
> package new releases once they spot a new file on your ftp/web site.

I am familiar with some of those systems (PPA and the opensuse build
service in particular). One of the goals of my proposal is to make it
easier to interoperate with those tools.

I think Pypi is mostly useless. The lack of enforced metadata is a big
no-no IMHO. The fact that Pypi is miles behind CRAN, for example, is
quite significant. I want CRAN for scientific python, and I don't see
Pypi becoming it in the near future. The point of having our own
Pypi-like server is that we could do the following:
- enforce metadata
- make it easy to extend the service to support our needs

> pypm: http://pypm.activestate.com/list-n.html#numpy

It is interesting to note that one of the maintainers of pypm has
recently quit the discussion about Pypi, most likely out of frustration
with the other participants.
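(The metadata in question is the static name/version/requires
information each package is supposed to declare. A quick sketch of
reading it for a package installed with setuptools, using its
pkg_resources module:)

    # Query the declared metadata of an installed distribution.
    import pkg_resources

    dist = pkg_resources.get_distribution("numpy")
    print dist.project_name, dist.version
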
> Documentation projects are being worked on to document, give tutorials
> on, and generally make python packaging easier all round. As witnessed
> by the 20 or so releases on pypi every day (and growing), lots of
> people are using the python packaging tools successfully.

This does not mean much IMO. Uploading to Pypi is almost required to use
virtualenv, buildout, etc... An interesting metric is not how many
packages are uploaded, but how much they are used outside of developers.

>
> I'm not sure making a separate build tool is a good idea. I think going
> with the rest of the python community, and improving the tools there,
> is a better idea.

It has been tried, and IMHO it has been proved to have failed. You can
look at the recent discussion (the one started by Guido in particular).

> P.S. some notes on toydist itself.
> - toydist convert is cool for people converting a setup.py. This means
> that most people can try out toydist right away. But what does it gain
> these people who convert their setup.py files?

Not much ATM, except that it is easier to write a toysetup.info compared
to a setup.py IMO, and that it supports a simple way to include data
files (something which is currently *impossible* to do without writing
your own distutils extensions). It also has the ability to build eggs
without using setuptools (I consider not using setuptools a feature,
given the too many failure modes of this package). The main goals,
though, are to make it easier to build your own tools on top of it, and
to integrate with real build systems.

> - a toydist convert that generates a setup.py file might be cool :)

toydist started like this, actually: you would write a setup.py file
which loads the package description from toysetup.info and converts it
to a dict argument to distutils.core.setup. I have not updated it
recently, but that's definitely on the TODO list for a first alpha, as
it would enable people to benefit from the format with 100 % backward
compatibility with distutils.
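(A sketch of what such a shim could look like - the loader API below is
made up for illustration, it is not toydist's actual interface:)

    # setup.py stays a thin wrapper: parse the static description and
    # hand the result to plain distutils.
    from distutils.core import setup

    from toydist import parse_static_description   # hypothetical helper

    pkg = parse_static_description("toysetup.info")  # hypothetical parse
    setup(**pkg.to_distutils_dict())                 # hypothetical conversion
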
> - arbitrary code execution happens when building or testing with
> toydist.

You are right for testing, but wrong for building. As long as the build
is entirely driven by toysetup.info, you only have to trust toydist
(which is not safe ATM, but that's an implementation detail), and your
build tools of course. Obviously, if you have a package which uses an
external build tool on top of toysetup.info (as will be required for
numpy itself, for example), all bets are off. But I think that's true of
only a tiny fraction of the interesting packages for scientific
computing.

Sandboxing is particularly an issue on windows - I don't know a good
solution for windows sandboxing, outside of full vms, which are
heavyweight.

> - it should be possible to build this toydist functionality as a
> distutils/distribute/buildout extension.

No, it cannot, at least as far as distutils/distribute are concerned (I
know nothing about buildout). Extending distutils is horrible, and
fragile in general. Even autotools, with its mix of sh scripts generated
through m4 and perl, is a breeze compared to distutils.

> - extending toydist? How are extensions made? There are 175 buildout
> packages which extend buildout, and many that extend
> distutils/setuptools - so extension of build tools is a necessary
> thing.

See my answer earlier about interoperation with build tools.

cheers,

David

From cournape at gmail.com Tue Dec 29 09:34:44 2009
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 29 Dec 2009 23:34:44 +0900
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <64ddb72c0912290527s1143efc7g3efe93936ca5de5@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <64ddb72c0912290527s1143efc7g3efe93936ca5de5@mail.gmail.com>
Message-ID: <5b8d13220912290634u5902a6bag33ddb8a15a93406b@mail.gmail.com>

On Tue, Dec 29, 2009 at 10:27 PM, René Dudfield wrote:

> Buildout is what a lot of the python community are using now.

I would like to note that buildout is a solution to a problem that I
don't care to solve. This issue is particularly difficult to explain to
people accustomed to buildout, in my experience - I have not found a way
to explain it very well yet.

Buildout and virtualenv both work by sandboxing from the system python:
they do not see each other, which may be useful for development, but as
a deployment solution for the casual user who may not be familiar with
python, it is useless. A scientist who installs numpy, scipy, etc... to
try things out wants to have everything available in one python
interpreter, and does not want to jump between different virtualenvs and
whatnot to try different packages.

This has strong consequences on how you look at things from a packaging
POV:
- uninstall is crucial
- a package bringing down python is a big no-no (this happens way too
often when you install things through setuptools)
- if something fails, the recovery should be trivial - the person doing
the installation may not know much about python
- you cannot use sandboxing as a replacement for backward compatibility
(that's why I don't care much about all the discussion about versioning
- I don't think it is very useful as long as python itself does not
support it natively).

In the context of ruby, this article makes a similar point:
http://www.madstop.com/ruby/ruby_has_a_distribution_problem.html

David

From gael.varoquaux at normalesup.org Tue Dec 29 10:55:09 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Tue, 29 Dec 2009 16:55:09 +0100
Subject: [Numpy-discussion] [matplotlib-devel] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <5b8d13220912290634u5902a6bag33ddb8a15a93406b@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <64ddb72c0912290527s1143efc7g3efe93936ca5de5@mail.gmail.com> <5b8d13220912290634u5902a6bag33ddb8a15a93406b@mail.gmail.com>
Message-ID: <20091229155509.GA15515@phare.normalesup.org>

On Tue, Dec 29, 2009 at 11:34:44PM +0900, David Cournapeau wrote:
> Buildout and virtualenv both work by sandboxing from the system python:
> they do not see each other, which may be useful for development, but as
> a deployment solution for the casual user who may not be familiar with
> python, it is useless. A scientist who installs numpy, scipy, etc... to
> try things out wants to have everything available in one python
> interpreter, and does not want to jump between different virtualenvs
> and whatnot to try different packages.

I think that you are pointing out a large source of misunderstanding in
packaging discussions. People behind setuptools, pip or buildout care
about having a working ensemble of packages that deliver an application
(often a web application)[1].
You and I, and many scientific developers, see libraries as building
blocks that need to be assembled by the user: the scientist using them to
do new science. Thus the idea of isolation is not something that we can
accept, because it means that we are restricting the user to a set of
libraries.

Our definition of user is not the same as the user targeted by buildout.
Our user does not push buttons, but writes code. However, unlike the
developer targeted by buildout and distutils, our user does not want or
need to learn about packaging.

Trying to make the debate clearer...

Gaël

[1] I know your position on why simply focusing on sandboxing working
ensembles of libraries is not a replacement for backward compatibility,
and will only create impossible problems in the long run. While I agree
with you, this is not my point here.

From emsellem at obs.univ-lyon1.fr Tue Dec 29 10:58:14 2009
From: emsellem at obs.univ-lyon1.fr (Eric Emsellem)
Date: Tue, 29 Dec 2009 16:58:14 +0100
Subject: [Numpy-discussion] Complex slicing and take
Message-ID: <1262102294.4b3a27167cd77@webmail.univ-lyon1.fr>

Hi (sorry if you receive this twice, but I did not see the first post appear)

I have a nagging problem which I think could be solved nicely with numpy
indexing, but I cannot find the solution except by invoking a stupid
loop. I would like to do this with some numpy primitive.

Problem
========
I have a 2D array which is, let's say, 1000 x 100. This represents 100
1D spectra, each one containing 1000 datapoints. For example:

   startarray = random((1000,100))

Assuming I have a list of indices, for example:

   ind = [1,2,5,6,1,2]

and a corresponding list of integer shifts, let's say between -100 and 100:

   shift = [10,20,34,-10,22,-20]

I would like to do the following sum:

   result = startarray[100+shift[0]:-100+shift[0],ind[0]]
          + startarray[100+shift[1]:-100+shift[1],ind[1]]
          + ...

Basically: I would like to sum some extracted 1D lines after they have
been shifted by some amount. Note that I would like to be able to use
each line several times (with different shifts).

At the moment, I am using "take" to extract some of these spectra from
the array, so let's start from scratch here:

   import numpy as num
   startarray = random((1000,100))
   take_sample = [1,2,5,6,1,2]
   temp = num.take(startarray,take_sample,axis=1)

   # and to do the sum after shifting I need a loop

   shift = [10,20,34,-10,22,-20]
   result = num.zeros(800)  # shorter than the input because of the trim
   for i in range(len(shift)) :
      result += temp[100+shift[i]:-100+shift[1], i]

Is there a way to do this without invoking a loop? Is there also a FAST
solution which makes the shifts at the same time as the "take" (that
would then prevent the use of the "temp" array)?

Of course the arrays I am using are much BIGGER than these, so I really
need an efficient way to do all this.

thanks!

Eric

--

--------------------------------------------------------------------------
This message was sent from the IMP webmail (Internet Messaging Program)
From kwgoodman at gmail.com Tue Dec 29 11:25:31 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Tue, 29 Dec 2009 08:25:31 -0800
Subject: [Numpy-discussion] Complex slicing and take
In-Reply-To: <1262102294.4b3a27167cd77@webmail.univ-lyon1.fr>
References: <1262102294.4b3a27167cd77@webmail.univ-lyon1.fr>
Message-ID:

On Tue, Dec 29, 2009 at 7:58 AM, Eric Emsellem wrote:
> Hi (sorry if you receive this twice, but I did not see the first post appear)
>
> I have a nagging problem which I think could be solved nicely with numpy
> indexing, but I cannot find the solution except by invoking a stupid
> loop. [...]
>
>    import numpy as num
>    startarray = random((1000,100))
>    take_sample = [1,2,5,6,1,2]
>    temp = num.take(startarray,take_sample,axis=1)

Would it help to make temp a 1000x4 array instead of 1000x6? Could you
do that by changing take_sample to [1,2,5,6] and multiplying columns 1
and 2 by a factor of 2? That would slow down the construction of temp
but speed up the addition (and slicing?) in the loop below.

>
>    # and to do the sum after shifting I need a loop
>
>    shift = [10,20,34,-10,22,-20]
>    result = num.zeros(800)  # shorter than the input because of the trim
>    for i in range(len(shift)) :
>       result += temp[100+shift[i]:-100+shift[1], i]

This looks fast to me. The slicing doesn't make a copy nor does the
addition. I've read that cython does fast indexing but I don't know if
that applies to slicing as well. I assume that shift[1] is a typo and
should be shift[i].

> Is there a way to do this without invoking a loop? Is there also a FAST
> solution which makes the shifts at the same time as the "take" (that
> would then prevent the use of the "temp" array)?
>
> Of course the arrays I am using are much BIGGER than these, so I really
> need an efficient way to do all this.
>
> thanks!
>
> Eric
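For the archives: the loop above can also be written as a single
fancy-indexing gather (a sketch using the same toy sizes; repeated
indices with different shifts are handled naturally, and the result has
800 points since 100 samples are trimmed from each end):

    import numpy as np

    startarray = np.random.random((1000, 100))
    ind = np.array([1, 2, 5, 6, 1, 2])
    shift = np.array([10, 20, 34, -10, 22, -20])

    # Row index matrix: one shifted 800-sample window per spectrum.
    rows = np.arange(100, 900)[:, None] + shift[None, :]  # shape (800, 6)
    result = startarray[rows, ind].sum(axis=1)            # gather, then sum
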
From josef.pktd at gmail.com Tue Dec 29 11:26:45 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 29 Dec 2009 11:26:45 -0500
Subject: [Numpy-discussion] [matplotlib-devel] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <20091229155509.GA15515@phare.normalesup.org>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <64ddb72c0912290527s1143efc7g3efe93936ca5de5@mail.gmail.com> <5b8d13220912290634u5902a6bag33ddb8a15a93406b@mail.gmail.com> <20091229155509.GA15515@phare.normalesup.org>
Message-ID: <1cd32cbb0912290826h3467e15ah903b5ff05803e52@mail.gmail.com>

On Tue, Dec 29, 2009 at 10:55 AM, Gael Varoquaux wrote:
> [...]
>
> Trying to make the debate clearer...

I wanted to say the same thing. Pylons, during its active development
time, required a different combination of versions of several different
packages almost every month. virtualenv and pip are the only solutions
if you don't want to spend all your time updating. In the last half
year, I started to have similar problems with numpy trunk and scipy and
the rest, but I hope this will be only temporary, and it might not
really be a problem for the end user.

Additionally, for obtaining packages from pypi, I never had problems
with pure python packages, or packages that had complete binary
installers (e.g. wxpython or matplotlib). However, the standard case for
scientific packages, with their different build dependencies, is often a
pain. (A nice example that I never tried is
http://fenics.org/wiki/Installing_DOLFIN_on_Windows - the website
doesn't respond, but it looks like it takes a week to install all the
required source packages.)

On pypm.activestate.com scipy, matplotlib and mayavi all fail - scipy
because of missing lapack/blas. That's also a reason why CRAN is nice:
it has automatic platform-specific binary installation.

And any improvement will be very welcome, especially if we start with a
more widespread use of cython.
I'm reluctant to use cython in statsmodels, exactly to avoid any build
and distribution problems, even though it would be very useful.

Josef

> Gaël
>
> [1] I know your position on why simply focusing on sandboxing working
> ensembles of libraries is not a replacement for backward compatibility,
> and will only create impossible problems in the long run. While I agree
> with you, this is not my point here.

From renesd at gmail.com Tue Dec 29 13:36:12 2009
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Tue, 29 Dec 2009 18:36:12 +0000
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <5b8d13220912290634u5902a6bag33ddb8a15a93406b@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <64ddb72c0912290527s1143efc7g3efe93936ca5de5@mail.gmail.com> <5b8d13220912290634u5902a6bag33ddb8a15a93406b@mail.gmail.com>
Message-ID: <64ddb72c0912291036o79815ee4jf35e4db955a67bed@mail.gmail.com>

On Tue, Dec 29, 2009 at 2:34 PM, David Cournapeau wrote:
> On Tue, Dec 29, 2009 at 10:27 PM, René Dudfield wrote:
>
>> Buildout is what a lot of the python community are using now.
>
> I would like to note that buildout is a solution to a problem that I
> don't care to solve. This issue is particularly difficult to explain to
> people accustomed to buildout, in my experience - I have not found a
> way to explain it very well yet.

Hello,

The main problem buildout solves is getting developers up to speed very
quickly on a project. They should be able to call one command and get
dozens of packages, and everything else needed, ready to go, completely
isolated from the rest of the system.

If a project does not want to upgrade to the latest versions of
packages, it does not have to. This reduces the dependency problem a
lot, as one package does not have to block waiting on 20 other packages.
It makes iterating packages daily, or even hourly, unproblematic - even
with dozens of different packages used. This is not theoretical; many
projects iterate this quickly, and do not have problems.

Backwards compatibility is of course a great thing to keep up... but
harder to do with dozens of packages, some of which are third-party
ones. For example, some people are running pygame applications written 8
years ago that are still running today on the latest versions of pygame.
I don't think people in the python world understand API and ABI
compatibility as much as those in the C world.

However buildout is a solution to their problem, and allows them to
iterate quickly with many participants, on many different projects. Many
of these people work on maybe 20-100 different projects at once, and
some machines may be running that many applications at once too. So
using the system python's packages is completely out of the question for
them.
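(The one-command experience referred to here is the standard buildout
bootstrap; a sketch, assuming a fresh checkout containing a
buildout.cfg:)

    # Fetch zc.buildout locally, then build the project with its pinned
    # dependencies into the project directory.
    import subprocess

    subprocess.check_call(["python", "bootstrap.py"])
    subprocess.check_call(["bin/buildout"])
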
> A scientist who installs numpy, scipy, etc... to try things out wants
> to have everything available in one python interpreter, and does not
> want to jump between different virtualenvs and whatnot to try different
> packages.

It is very easy to include a dozen packages in a buildout, so that you
have all the packages required.

Anyway... here is a skeleton buildout project that uses numpy, if anyone
wants to have a play:
http://renesd.blogspot.com/2009/12/buildout-project-that-uses-numpy.html

cheers,

From Chris.Barker at noaa.gov Tue Dec 29 16:29:46 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Tue, 29 Dec 2009 13:29:46 -0800
Subject: [Numpy-discussion] [matplotlib-devel] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <5b8d13220912290634u5902a6bag33ddb8a15a93406b@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <64ddb72c0912290527s1143efc7g3efe93936ca5de5@mail.gmail.com> <5b8d13220912290634u5902a6bag33ddb8a15a93406b@mail.gmail.com>
Message-ID: <4B3A74CA.8010003@noaa.gov>

David Cournapeau wrote:
> Buildout and virtualenv both work by sandboxing from the system python:
> they do not see each other, which may be useful for development,

And certain kinds of deployment, like web servers or installed tools.

> but as a deployment solution for the casual user who may not be
> familiar with python, it is useless. A scientist who installs numpy,
> scipy, etc... to try things out wants to have everything available in
> one python interpreter, and does not want to jump between different
> virtualenvs and whatnot to try different packages.

Absolutely true -- which is why Python desperately needs package version
selection of some sort. I've been tooting this horn on and off for
years, but never got any interest at all from the core python
developers.

I see putting packages in with no version as like having non-versioned
dynamic libraries in a system -- i.e. dll hell. If I have a bunch of
stuff running just fine with the various package versions I've
installed, but then I start working on something (maybe just testing,
maybe something more real) that requires the latest version of a
package, I have a few choices:

- install the new package and hope I don't break too much

- use something like virtualenv, which requires a lot of overhead to set
up and use (my evidence is personal: despite working with a team that
uses it, somehow I've never gotten around to using it for my dev work,
even though, in theory, it should be a good solution)

- setuptools does supposedly support multiple-version installs and
selection, but it's ugly and poorly documented enough that I've never
figured out how to use it.

This has been addressed with a handful of ad hoc solutions: wxPython has
wxversion.select, and I think PyGTK has something, and who knows what
else. It would be really nice to have a standard solution available.
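For reference, the wxPython mechanism looks like this (a sketch of its
documented usage; it must run before the first import of wx):

    # Pick among multiple installed wxPython versions at import time.
    import wxversion
    wxversion.select("2.8")

    import wx
    print wx.version()
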
Note that the usual response I've gotten is to use py2exe or something
to distribute, so you're defining the whole stack. That's good for some
things, but not all (though py2app's "alias" bundles are nice), and
really pretty worthless for development. Also, many, many packages are a
pain to use with py2exe and friends anyway (see my forthcoming other
long post...)

> - you cannot use sandboxing as a replacement for backward compatibility
> (that's why I don't care much about all the discussion about versioning
> - I don't think it is very useful as long as python itself does not
> support it natively).

Could be -- I'd love to have Python support it natively, though
wxversion isn't too bad.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov

From cournape at gmail.com Tue Dec 29 18:20:39 2009
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 30 Dec 2009 08:20:39 +0900
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <64ddb72c0912291036o79815ee4jf35e4db955a67bed@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <64ddb72c0912290527s1143efc7g3efe93936ca5de5@mail.gmail.com> <5b8d13220912290634u5902a6bag33ddb8a15a93406b@mail.gmail.com> <64ddb72c0912291036o79815ee4jf35e4db955a67bed@mail.gmail.com>
Message-ID: <5b8d13220912291520n2bdbbd00x7e5a19c4b4aa941d@mail.gmail.com>

On Wed, Dec 30, 2009 at 3:36 AM, René Dudfield wrote:
> Hello,
>
> The main problem buildout solves is getting developers up to speed very
> quickly on a project. They should be able to call one command and get
> dozens of packages, and everything else needed, ready to go, completely
> isolated from the rest of the system.
>
> [...]

This is all great, but I don't care about solving this issue: it is a
*developer* issue. I don't mean it is not an important issue, it is just
totally out of scope. The developer issues I care about are much more
fine-grained (correct dependency handling between targets, toolchain
customization, etc...).

Note, however, that hopefully, by simplifying the packaging tools, the
problems you see with numpy on 2.6 would become less common. The whole
distutils/setuptools/distribute stack is hopelessly intractable, given
how messy the code is.

> It is very easy to include a dozen packages in a buildout, so that you
> have all the packages required.

I think there is a confusion - I mostly care about *end users*.
People who may not have compilers, who want to be able to easily upgrade one package, etc...

David

From matthew.brett at gmail.com Tue Dec 29 18:35:55 2009
From: matthew.brett at gmail.com (Matthew Brett)
Date: Tue, 29 Dec 2009 23:35:55 +0000
Subject: [Numpy-discussion] Empty strings not empty?
Message-ID: <1e2af89e0912291535v74e27504t6e3b9f1910bb492@mail.gmail.com>

Hi,

I was surprised by this - should I have been?

In [35]: e = np.array(['a'])

In [36]: e.shape
Out[36]: (1,)

In [37]: e.size
Out[37]: 1

In [38]: e.tostring()
Out[38]: 'a'

In [39]: f = np.array(['a'])

In [40]: f.shape == e.shape
Out[40]: True

In [41]: f.size == e.size
Out[41]: True

In [42]: f.tostring()
Out[42]: 'a'

In [43]: z = np.array(['\x00'])

In [44]: z.shape
Out[44]: (1,)

In [45]: z.size
Out[45]: 1

In [46]: z
Out[46]:
array([''],
      dtype='|S1')

That is, an empty string array seems to be the same as a string array with a single 0 byte, including having shape (1,) and size 1...

Best,

Matthew

From cournape at gmail.com Tue Dec 29 18:44:08 2009
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 30 Dec 2009 08:44:08 +0900
Subject: [Numpy-discussion] Empty strings not empty?
In-Reply-To: <1e2af89e0912291535v74e27504t6e3b9f1910bb492@mail.gmail.com>
References: <1e2af89e0912291535v74e27504t6e3b9f1910bb492@mail.gmail.com>
Message-ID: <5b8d13220912291544o31877c57i7b772b6a6181cb9a@mail.gmail.com>

On Wed, Dec 30, 2009 at 8:35 AM, Matthew Brett wrote:
> Hi,
>
> I was surprised by this - should I have been?
>
> In [35]: e = np.array(['a'])
> In [36]: e.shape
> Out[36]: (1,)
> In [37]: e.size
> Out[37]: 1
> In [38]: e.tostring()
> Out[38]: 'a'
> In [39]: f = np.array(['a'])
> In [40]: f.shape == e.shape
> Out[40]: True
> In [41]: f.size == e.size
> Out[41]: True
> In [42]: f.tostring()
> Out[42]: 'a'
> In [43]: z = np.array(['\x00'])
> In [44]: z.shape
> Out[44]: (1,)
> In [45]: z.size
> Out[45]: 1
> In [46]: z
> Out[46]:
> array([''],
>       dtype='|S1')
>
> That is, an empty string array seems to be the same as a string array
> with a single 0 byte, including having shape (1,) and size 1...

I don't see any empty string in your code - they all have one byte. The last one is slightly confusing as far as printing is concerned (I would have expected array(['\x00'], ...) instead). It may be a bug in numpy, because a byte with value 0 is used as a string delimiter in C.

cheers,

David
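(To make the NUL-delimiter point concrete -- a quick check, assuming the 1.4-era behavior described in this thread:)

    >>> z = np.array(['\x00'])
    >>> len(z[0])     # the trailing NUL byte is stripped when the item is read back
    0
    >>> z[0] == '\x00'
    False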
From warren.weckesser at enthought.com Tue Dec 29 18:45:56 2009
From: warren.weckesser at enthought.com (Warren Weckesser)
Date: Tue, 29 Dec 2009 17:45:56 -0600
Subject: [Numpy-discussion] Empty strings not empty?
In-Reply-To: <1e2af89e0912291535v74e27504t6e3b9f1910bb492@mail.gmail.com>
References: <1e2af89e0912291535v74e27504t6e3b9f1910bb492@mail.gmail.com>
Message-ID: <4B3A94B4.2020006@enthought.com>

Hmmm... I don't see where you created "an empty string array" in your examples. All three of your arrays contain one element, so they are not empty. The element in the array z happens to be zero, is all. Here's an example of creating an empty string array:

In [2]: e = np.array([],dtype='S')

In [3]: e
Out[3]: array([], dtype='|S1')

In [4]: e.shape
Out[4]: (0,)

Warren

Matthew Brett wrote:
> Hi,
>
> I was surprised by this - should I have been?
>
> In [35]: e = np.array(['a'])
> In [36]: e.shape
> Out[36]: (1,)
> In [37]: e.size
> Out[37]: 1
> In [38]: e.tostring()
> Out[38]: 'a'
> In [39]: f = np.array(['a'])
> In [40]: f.shape == e.shape
> Out[40]: True
> In [41]: f.size == e.size
> Out[41]: True
> In [42]: f.tostring()
> Out[42]: 'a'
> In [43]: z = np.array(['\x00'])
> In [44]: z.shape
> Out[44]: (1,)
> In [45]: z.size
> Out[45]: 1
> In [46]: z
> Out[46]:
> array([''],
>       dtype='|S1')
>
> That is, an empty string array seems to be the same as a string array
> with a single 0 byte, including having shape (1,) and size 1...
>
> Best,
>
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From matthew.brett at gmail.com Tue Dec 29 18:52:36 2009
From: matthew.brett at gmail.com (Matthew Brett)
Date: Tue, 29 Dec 2009 23:52:36 +0000
Subject: [Numpy-discussion] Empty strings not empty?
In-Reply-To: <5b8d13220912291544o31877c57i7b772b6a6181cb9a@mail.gmail.com>
References: <1e2af89e0912291535v74e27504t6e3b9f1910bb492@mail.gmail.com> <5b8d13220912291544o31877c57i7b772b6a6181cb9a@mail.gmail.com>
Message-ID: <1e2af89e0912291552t7353724bsee3d7c03dc629b5d@mail.gmail.com>

Hi,

> I don't see any empty string in your code - they all have one byte.
> The last one is slightly confusing as far as printing is concerned (I
> would have expected array(['\x00'], ...) instead). It may be a bug in
> numpy, because a byte with value 0 is used as a string delimiter in C.

Sorry - I pasted the wrong code:

In [49]: e = np.array([''])

In [50]: e.shape
Out[50]: (1,)

In [51]: e.size
Out[51]: 1

In [52]: f = np.array(['a'])

In [53]: f.shape == e.shape
Out[53]: True

In [54]: f.size == e.size
Out[54]: True

In [55]: e.tostring()
Out[55]: '\x00'

In [56]: f.tostring()
Out[56]: 'a'

In [58]: e == z
Out[58]: array([ True], dtype=bool)

Thanks,

Matthew

From cournape at gmail.com Tue Dec 29 19:18:27 2009
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 30 Dec 2009 09:18:27 +0900
Subject: [Numpy-discussion] Empty strings not empty?
In-Reply-To: <1e2af89e0912291552t7353724bsee3d7c03dc629b5d@mail.gmail.com>
References: <1e2af89e0912291535v74e27504t6e3b9f1910bb492@mail.gmail.com> <5b8d13220912291544o31877c57i7b772b6a6181cb9a@mail.gmail.com> <1e2af89e0912291552t7353724bsee3d7c03dc629b5d@mail.gmail.com>
Message-ID: <5b8d13220912291618o49abee77lbc41e53cd927d362@mail.gmail.com>

On Wed, Dec 30, 2009 at 8:52 AM, Matthew Brett wrote:
> In [58]: e == z
> Out[58]: array([ True], dtype=bool)

Ok, it looks like there are at least two issues:

- if an item in a string array is set to '\x00', this seems to be replaced with '', but '' != '\x00':

    x = np.array(['\x00'])
    x[0] == ''  # True, but should be False ?

- if an item in a string array is set to '', tostring will convert it to '\x00':

    x = np.array([''])
    x.tostring() == '\x00'  # True, but should be False ?

I guess the root cause is that there does not seem to be a "|S0" type - but that may be difficult to implement, since the array would have > 0 items, but a 0 size. It may have other, just as quirky, behavior.

What do you need this for ?

cheers,

David

From matthew.brett at gmail.com Tue Dec 29 19:33:53 2009
From: matthew.brett at gmail.com (Matthew Brett)
Date: Wed, 30 Dec 2009 00:33:53 +0000
Subject: [Numpy-discussion] Empty strings not empty?
In-Reply-To: <5b8d13220912291618o49abee77lbc41e53cd927d362@mail.gmail.com>
References: <1e2af89e0912291535v74e27504t6e3b9f1910bb492@mail.gmail.com> <5b8d13220912291544o31877c57i7b772b6a6181cb9a@mail.gmail.com> <1e2af89e0912291552t7353724bsee3d7c03dc629b5d@mail.gmail.com> <5b8d13220912291618o49abee77lbc41e53cd927d362@mail.gmail.com>
Message-ID: <1e2af89e0912291633kb7720b5w5dcbeb8a79624c09@mail.gmail.com>

Hi,

> Ok, it looks like there are at least two issues:
> - if an item in a string array is set to '\x00', this seems to be
> replaced with '', but '' != '\x00'

Sorry - I'm afraid I don't understand. It looks to me as though the buffer contents of [''] is a length-1 string with a 0 byte, and an array.size of 1 - is that also what you think? I guess I think that it should be a length-0 string, with an array.size of 0.

> What do you need this for ?

I noticed it when I found that writing an empty string array to matlab resulted in a single character array when loaded into matlab. I guess that I will have to special-case the writing code to detect 'empty' strings, but I can't (I don't think) distinguish a real string with \x00 from an empty string.

Thanks a lot,

Matthew

From cournape at gmail.com Tue Dec 29 19:57:50 2009
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 30 Dec 2009 09:57:50 +0900
Subject: [Numpy-discussion] Empty strings not empty?
In-Reply-To: <1e2af89e0912291633kb7720b5w5dcbeb8a79624c09@mail.gmail.com>
References: <1e2af89e0912291535v74e27504t6e3b9f1910bb492@mail.gmail.com> <5b8d13220912291544o31877c57i7b772b6a6181cb9a@mail.gmail.com> <1e2af89e0912291552t7353724bsee3d7c03dc629b5d@mail.gmail.com> <5b8d13220912291618o49abee77lbc41e53cd927d362@mail.gmail.com> <1e2af89e0912291633kb7720b5w5dcbeb8a79624c09@mail.gmail.com>
Message-ID: <5b8d13220912291657r532ed639webd6f101f3ffbbe6@mail.gmail.com>

On Wed, Dec 30, 2009 at 9:33 AM, Matthew Brett wrote:
> Hi,
>
>> Ok, it looks like there are at least two issues:
>> - if an item in a string array is set to '\x00', this seems to be
>> replaced with '', but '' != '\x00'
>
> Sorry - I'm afraid I don't understand.

Compare this:

    x = "\x00"
    arr = np.array([x])
    lst = [x]

    arr[0] == x   # False
    arr[0] == ""  # True

    lst[0] == x   # True
    lst[0] == ""  # False

> It looks to me as though the
> buffer contents of [''] is a length-1 string with a 0 byte, and an
> array.size of 1 - is that also what you think? I guess I think that
> it should be a length-0 string, with an array.size of 0

Array size of 0 would be very weird: it means it would have no items, whereas it actually has one item (which itself has a size 0). If you create a list with an empty string (x = [""]), you have len(x) == 1 and len(x[0]) == 0. But an empty string has size 0, so the corresponding dtype should have an itemsize of 0 (assuming the array only contains empty strings).

> I guess that I will have to special-case the writing code to detect
> 'empty' strings, but I can't (I don't think) distinguish a real string
> with \x00 from an empty string.

In python "proper", they are different: "\x00" != "". The problem is that it does not seem possible ATM to create a numpy array with an empty string.

David
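(The round-trip ambiguity in one line -- a quick check, assuming the behavior described above:)

    >>> np.array(['']).tostring() == np.array(['\x00']).tostring()
    True

Both buffers hold a single NUL byte, so once written out the two cases cannot be told apart.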
From d.l.goldsmith at gmail.com Tue Dec 29 20:00:59 2009
From: d.l.goldsmith at gmail.com (David Goldsmith)
Date: Tue, 29 Dec 2009 17:00:59 -0800
Subject: [Numpy-discussion] LA improvements (was: dot function or dot notation, matrices, arrays?)
In-Reply-To:
References: <45d1ab480912222202i2b62e02ay17256fd7e1d17650@mail.gmail.com> <45d1ab480912231119x5c5ecd77wda5b83a4f46dbb25@mail.gmail.com>
Message-ID: <45d1ab480912291700x4f8d244ah7e9fb5944d7d67c0@mail.gmail.com>

On Wed, Dec 23, 2009 at 2:26 PM, David Warde-Farley wrote:
>
> The existing documentation, plus source code from the umath_tests
> module marked up descriptively (what all the parameters do, especially
> the ones which currently receive magic numbers) would probably be the
> way to go down the road.
>
> David

Searching the Wiki, the most pertinent remaining doc is:

http://docs.scipy.org/numpy/docs/numpy-docs/reference/c-api.generalized-ufuncs.rst/

Is this "the existing documentation" that you refer to above? If not, perhaps you can find and point me to what it refers to in the Wiki? If it is, I infer from the above that, modulo "cleaning it up" a bit, it's fine; it just needs to be supplemented with some examples taken from numpy\core\src\umath\umath_tests.c.src, correct?

DG
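(For a concrete feel of what that document and umath_tests cover -- a sketch, assuming the test module exposes inner1d as it does in contemporary sources; its generalized-ufunc signature is "(i),(i)->()":)

    >>> import numpy as np
    >>> from numpy.core.umath_tests import inner1d
    >>> a = np.random.rand(3, 4)
    >>> b = np.random.rand(3, 4)
    >>> inner1d(a, b).shape   # inner product over the last axis, broadcast over the rest
    (3,)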
From matthew.brett at gmail.com Tue Dec 29 20:20:12 2009
From: matthew.brett at gmail.com (Matthew Brett)
Date: Wed, 30 Dec 2009 01:20:12 +0000
Subject: [Numpy-discussion] Empty strings not empty?
In-Reply-To: <5b8d13220912291657r532ed639webd6f101f3ffbbe6@mail.gmail.com>
References: <1e2af89e0912291535v74e27504t6e3b9f1910bb492@mail.gmail.com> <5b8d13220912291544o31877c57i7b772b6a6181cb9a@mail.gmail.com> <1e2af89e0912291552t7353724bsee3d7c03dc629b5d@mail.gmail.com> <5b8d13220912291618o49abee77lbc41e53cd927d362@mail.gmail.com> <1e2af89e0912291633kb7720b5w5dcbeb8a79624c09@mail.gmail.com> <5b8d13220912291657r532ed639webd6f101f3ffbbe6@mail.gmail.com>
Message-ID: <1e2af89e0912291720h753bc3dcq2bcaab82ee3ad9ca@mail.gmail.com>

Hi,

> x = "\x00"
> arr = np.array([x])
> lst = [x]
>
> arr[0] == x   # False
> arr[0] == ""  # True
>
> lst[0] == x   # True
> lst[0] == ""  # False

Ah - thanks - got it.

>> It looks to me as though the
>> buffer contents of [''] is a length-1 string with a 0 byte, and an
>> array.size of 1 - is that also what you think? I guess I think that
>> it should be a length-0 string, with an array.size of 0
>
> Array size of 0 would be very weird: it means it would have no items,
> whereas it actually has one item (which itself has a size 0).

Is this a string-specific thing? I mean, you can have size 0 1d numeric arrays. Sorry if I'm being slow, it's late here.

In [70]: np.array([[]]).shape
Out[70]: (1, 0)

In [71]: np.array([[]]).size
Out[71]: 0

Cheers,

Matthew

From cournape at gmail.com Tue Dec 29 21:36:25 2009
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 30 Dec 2009 11:36:25 +0900
Subject: [Numpy-discussion] Empty strings not empty?
In-Reply-To: <1e2af89e0912291720h753bc3dcq2bcaab82ee3ad9ca@mail.gmail.com>
References: <1e2af89e0912291535v74e27504t6e3b9f1910bb492@mail.gmail.com> <5b8d13220912291657r532ed639webd6f101f3ffbbe6@mail.gmail.com> <1e2af89e0912291720h753bc3dcq2bcaab82ee3ad9ca@mail.gmail.com>
Message-ID: <5b8d13220912291836q5a60c7c3gdfb276f2ba5fb35c@mail.gmail.com>

On Wed, Dec 30, 2009 at 10:20 AM, Matthew Brett wrote:
> Hi,
>
> Ah - thanks - got it.
>
>>> It looks to me as though the
>>> buffer contents of [''] is a length-1 string with a 0 byte, and an
>>> array.size of 1 - is that also what you think? I guess I think that
>>> it should be a length-0 string, with an array.size of 0
>>
>> Array size of 0 would be very weird: it means it would have no items,
>> whereas it actually has one item (which itself has a size 0).
>
> Is this a string-specific thing?

No. I was not very clear: my point was that a size 0 array is likely not what you want. What you want is arrays whose *itemsize* is 0.

> I mean, you can have size 0 1d
> numeric arrays. Sorry if I'm being slow, it's late here.
>
> In [70]: np.array([[]]).shape
> Out[70]: (1, 0)
>
> In [71]: np.array([[]]).size
> Out[71]: 0

Yes, you can create arrays with size 0. But I don't think that's what you want - you cannot index them normally (even though the array is 2d, you cannot do arr[0][0]), so I don't think that's very useful for your case.

David

From eemselle at eso.org Wed Dec 30 03:08:01 2009
From: eemselle at eso.org (Eric Emsellem)
Date: Wed, 30 Dec 2009 09:08:01 +0100
Subject: [Numpy-discussion] Complex slicing and take
Message-ID: <4B3B0A61.4050303@eso.org>

Hi,

thanks for the tips. Unfortunately this is not what I am after.

>> >   import numpy as num
>> >   startarray = random((1000,100))
>> >   take_sample = [1,2,5,6,1,2]
>> >   temp = num.take(startarray,take_sample,axis=1)
>
> Would it help to make temp a 1000x4 array instead of 1000x6? Could you
> do that by changing take_sample to [1,2,5,6] and multiplying columns 1
> and 2 by a factor of 2? That would slow down the construction of temp
> but speed up the addition (and slicing?) in the loop below.

No it wouldn't help, unfortunately, because the second instance of "1,2" would have different shifts. So I cannot just count the number of occurrences of each line. From the initial 2D array, 1D lines could be extracted several times, each time with a different shift.

>> >   shift = [10,20,34,-10,22,-20]
>> >   result = num.zeros(900)  # shorter than initial because of the shift
>> >   for i in range(len(shift)) :
>> >      result += temp[100+shift[i]:-100+shift[1]]
>
> This looks fast to me. The slicing doesn't make a copy nor does the
> addition. I've read that cython does fast indexing but I don't know if
> that applies to slicing as well. I assume that shift[1] is a typo and
> should be shift[i].

(yes of course the shift[1] should be shift[i]) Well this may be fast, but not fast enough. And also, starting from my 2D startarray again, it looks odd that I cannot do something like:

startarray = random((1000,100))
take_sample = [1,2,5,6,1,2]
shift = [10,20,34,-10,22,-20]
result = num.sum(num.take(startarray,take_sample,axis=1)[100+shift:100-shift])

but of course this is nonsense, because I cannot address the data this way (with "shift"). In fact I realise now that my question is simpler: how do I extract and sum 1d lines from a 2D array if I want each line to be "shifted" first? So starting again now, I want a quick way to write:

startarray = random((1000,6))
shift = [10,20,34,-10,22,-20]
result = num.zeros(800, dtype=float)
for i in range(len(shift)) :
    result += startarray[100+shift[i]:900+shift[i], i]

Can I write this directly with some numpy indexing, without the loop in python?

thanks for any tip.

Eric
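(For reference, one vectorized way to express that shifted sum -- a sketch, assuming the intent is to sum column i over a row window offset by shift[i]; the window bounds are taken from the loop above:)

    import numpy as np

    startarray = np.random.random((1000, 6))
    shift = np.array([10, 20, 34, -10, 22, -20])

    rows = np.arange(100, 900)[:, None] + shift   # (800, 6): each column gets its own offset
    cols = np.arange(len(shift))                  # (6,): column index, broadcast across rows
    result = startarray[rows, cols].sum(axis=1)   # (800,): gather the windows, then sum

Note that fancy indexing makes a copy of the gathered windows, so for very large arrays the explicit loop over views can still win on memory traffic.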
From pnorthug at gmail.com Wed Dec 30 03:21:16 2009
From: pnorthug at gmail.com (paul)
Date: Wed, 30 Dec 2009 08:21:16 +0000 (UTC)
Subject: [Numpy-discussion] linalg.py det of transpose
Message-ID:

In numpy/linalg/linalg.py, the function det(a) does a fastCopyAndTranspose of a before calling lapack. Is that necessary, as det(a.T) == det(a)?
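(The identity itself is easy to check numerically -- a quick sketch; the copy-and-transpose is presumably about handing LAPACK a Fortran-ordered scratch copy it may overwrite, not about the value of the determinant:)

    >>> import numpy as np
    >>> a = np.random.rand(5, 5)
    >>> np.allclose(np.linalg.det(a), np.linalg.det(a.T))
    True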
From renesd at gmail.com Wed Dec 30 03:37:35 2009
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Wed, 30 Dec 2009 08:37:35 +0000
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <5b8d13220912291520n2bdbbd00x7e5a19c4b4aa941d@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <64ddb72c0912290527s1143efc7g3efe93936ca5de5@mail.gmail.com> <5b8d13220912290634u5902a6bag33ddb8a15a93406b@mail.gmail.com> <64ddb72c0912291036o79815ee4jf35e4db955a67bed@mail.gmail.com> <5b8d13220912291520n2bdbbd00x7e5a19c4b4aa941d@mail.gmail.com>
Message-ID: <64ddb72c0912300037ief3ec87pc5200669b757034@mail.gmail.com>

On Tue, Dec 29, 2009 at 11:20 PM, David Cournapeau wrote:
> On Wed, Dec 30, 2009 at 3:36 AM, René Dudfield wrote:
>> On Tue, Dec 29, 2009 at 2:34 PM, David Cournapeau wrote:
>>> On Tue, Dec 29, 2009 at 10:27 PM, René Dudfield wrote:
>>>
>>>> Buildout is what a lot of the python community are using now.
>>>
>>> I would like to note that buildout is a solution to a problem that I
>>> don't care to solve. This issue is particularly difficult to explain
>>> to people accustomed to buildout in my experience - I have not found
>>> a way to explain it very well yet.
>>
>> Hello,
>>
>> The main problem buildout solves is getting developers up to speed
>> very quickly on a project. They should be able to call one command
>> and get dozens of packages, and everything else needed, ready to go,
>> completely isolated from the rest of the system.
>>
>> If a project does not want to upgrade to the latest versions of
>> packages, they do not have to. This reduces the dependency problem a
>> lot, as one package does not have to block on waiting for 20 other
>> packages. It makes iterating packages daily, or even hourly, not a
>> problem - even with dozens of different packages used. This is not
>> theoretical; many projects iterate this quickly, and do not have
>> problems.
>>
>> Backwards compatibility is of course a great thing to keep up... but
>> harder to do with dozens of packages, some of which are third party
>> ones. For example, some people are running pygame applications
>> written 8 years ago that are still running today on the latest
>> versions of pygame. I don't think people in the python world
>> understand API and ABI compatibility as much as those in the C world.
>>
>> However buildout is a solution to their problem, and allows them to
>> iterate quickly with many participants, on many different projects.
>> Many of these people work on maybe 20-100 different projects at once,
>> and some machines may be running that many applications at once too.
>> So using the system python's packages is completely out of the
>> question for them.
>
> This is all great, but I don't care about solving this issue; this is
> a *developer* issue. I don't mean this is not an important issue, it
> is just totally out of scope.
>
> The developer issues I care about are much more fine-grained (correct
> dependency handling between targets, toolchain customization, etc...).
> Note however that hopefully, by simplifying the packaging tools, the
> problems you see with numpy on 2.6 would be less common. The whole
> distutils/setuptools/distribute stack is hopelessly intractable, given
> how messy the code is.

The numpy issue is because of the change in package handling for 2.6, which numpy 1.3 was not developed for.

>> It is very easy to include a dozen packages in a buildout, so that you
>> have all the packages required.
>
> I think there is a confusion - I mostly care about *end users*. People
> who may not have compilers, who want to be able to easily upgrade one
> package, etc...

I was just describing the problems that buildout solves (for others). If I have a project that depends on numpy and 12 other packages, I can send it to other people who can get their project up and running fairly quickly (assuming everything installs ok).

btw, numpy 1.4 works with buildout! (at least on my ubuntu box) sweet :)

cd /tmp/
bzr branch lp:numpybuildout
cd numpybuildout/trunk/
python bootstrap.py -d
./bin/buildout
./bin/py
>>> import numpy
>>> numpy.__file__
'/tmp/numpybuildout/trunk/eggs/numpy-1.4.0-py2.6-linux-i686.egg/numpy/__init__.pyc'

From renesd at gmail.com Wed Dec 30 06:15:45 2009
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Wed, 30 Dec 2009 11:15:45 +0000
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <5b8d13220912290622m2f0ec2c3x5a26e63118cb29a0@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <64ddb72c0912290527s1143efc7g3efe93936ca5de5@mail.gmail.com> <5b8d13220912290622m2f0ec2c3x5a26e63118cb29a0@mail.gmail.com>
Message-ID: <64ddb72c0912300315r420bd88dk5bb6be3a960bf44d@mail.gmail.com>

hello again,

On Tue, Dec 29, 2009 at 2:22 PM, David Cournapeau wrote:
> On Tue, Dec 29, 2009 at 10:27 PM, René Dudfield wrote:
>> Hi,
>>
>> In the toydist proposal/release notes, I would address 'what does
>> toydist do better' more explicitly.
>>
>> **** A big problem for science users is that numpy does not work with
>> pypi + (easy_install, buildout or pip) and python 2.6. ****
>>
>> Working with the rest of the python community as much as possible is
>> likely a good goal.
>
> Yes, but it is hopeless. Most of what is being discussed on
> distutils-sig is useless for us, and what matters is ignored at best.
> I think most people on distutils-sig are misguided, and I don't think
> the community is representative of people concerned with packaging
> anyway - most of the participants seem to be around web development,
> and are mostly dismissive of others' concerns (OS packagers, etc...).

Sitting down with Tarek (who is one of the current distutils maintainers) in Berlin, we had a little discussion about packaging over pizza and beer... and he was quite mindful of OS packagers' problems and issues. He was also interested to hear about game developers' issues with packaging (which are different again to scientific users... but similar in many ways).

However these systems were developed by the zope/plone/web crowd, so they are naturally going to be thinking a lot about zope/plone/web issues. Debian and Ubuntu packages for them are mostly useless because of their age. Waiting a couple of years for your package to be released is just not an option (waiting even an hour for bug fixes is sometimes not an option).
Also isolation of packages is needed for machines that have 100s of different applications running, written by different people, each with dozens of packages used by each application. Tools like checkinstall and stdeb ( http://pypi.python.org/pypi/stdeb/ ) can help with older style packaging systems like deb/rpm. I think perhaps if toydist included something like stdeb as not an extension to distutils, but a standalone tool (like toydist) there would be less problems with it. One thing the various zope related communities do is make sure all the relevant and needed packages are built/tested by their compile farms. This makes pypi work for them a lot better than a non-coordinated effort does. There are also lots of people trying out new versions all of the time. > I want to note that I am not starting this out of thin air - I know > most of distutils code very well, I have been the mostly sole > maintainer of numpy.distutils for 2 years now. I have written > extensive distutils extensions, in particular numscons which is able > to fully build numpy, scipy and matplotlib on every platform that > matters. > > Simply put, distutils code is horrible (this is an objective fact) and > ?flawed beyond repair (this is more controversial). IMHO, it has > almost no useful feature, except being standard. > yes, I have also battled with distutils over the years. However it is simpler than autotools (for me... maybe distutils has perverted my fragile mind), and works on more platforms for python than any other current system. It is much worse for C/C++ modules though. It needs dependency, and configuration tools for it to work better (like what many C/C++ projects hack into distutils themselves). Monkey patching, and extensions are especially a problem... as is the horrible code quality of distutils by modern standards. However distutils has had more tests and testing systems added, so that refactoring/cleaning up of distutils can happen more so. > If you want a more detailed explanation of why I think distutils and > all tools on top are deeply flawed, you can look here: > > http://cournape.wordpress.com/2009/04/01/python-packaging-a-few-observations-cabal-for-a-solution/ > I agree with many things in that post. Except your conclusion on multiple versions of packages in isolation. Package isolation is like processes, and package sharing is like threads - and threads are evil! Leave my python site-packages directory alone I say... especially don't let setuptools infect it :) Many people currently find the multi versions of packages in isolation approach works well for them - so for some use cases the tools are working wonderfully. >> numpy used to work with buildout in python2.5, but not with 2.6. >> buildout lets other team members get up to speed with a project by >> running one command. ?It installs things in the local directory, not >> system wide. ?So you can have different dependencies per project. > > I don't think it is a very useful feature, honestly. It seems to me > that they created a huge infrastructure to split packages into tiny > pieces, and then try to get them back together, imaganing that > multiple installed versions is a replacement for backward > compatibility. Anyone with extensive packaging experience knows that's > a deeply flawed model in general. > Science is supposed to allow repeatability. Without the same versions of packages, repeating experiments is harder. 
This is a big problem in science that multiple versions of packages in _isolation_ can help get to a solution to the repeatability problem. Just pick some random paper and try to reproduce their results. It's generally very hard, unless the software is quite well packaged. Especially for graphics related papers, there are often many different types of environments, so setting up the environments to try out their techniques, and verify results quickly is difficult. Multiple versions are not a replacement for backwards compatibility, just a way to avoid the problem in the short term to avoid being blocked. If a new package version breaks your app, then you can either pin it to an old version, fix your app, or fix the package. It is also not a replacement for building on stable high quality components, but helps you work with less stable, and less high quality components - at a much faster rate of change, with a much larger dependency list. >> Plenty of good work is going on with python packaging. > > That's the opposite of my experience. What I care about is: > ?- tools which are hackable and easily extensible > ?- robust install/uninstall > ?- real, DAG-based build system > ?- explicit and repeatability > > None of this is supported by the tools, and the current directions go > even further away. When I have to explain at length why the > command-based design of distutils is a nightmare to work with, I don't > feel very confident that the current maintainers are aware of the > issues, for example. It shows that they never had to extend distutils > much. > All agreed! I'd add to the list parallel builds/tests (make -j 16), and outputting to native build systems. eg, xcode, msvc projects, and makefiles. It would interesting to know your thoughts on buildout recipes ( see creating recipes http://www.buildout.org/docs/recipe.html ). They seem to work better from my perspective. However, that is probably because of isolation. The recipe are only used by those projects that require them. So the chance of them interacting are lower, as they are not installed in the main python. How will you handle toydist extensions so that multiple extensions do not have problems with each other? I don't think this is possible without isolation, and even then it's still a problem. Note, the section in the distutils docs on creating command extensions is only around three paragraphs. There is also no central place to go looking for extra commands (that I know of). Or a place to document or share each others command extensions. Many of the methods for extending distutils are not very well documented either. For example, 'how do I you change compiler command line arguments for certain source files?' Basic things like that are possible with disutils, but not documented (very well). >> >> There are build farms for windows packages and OSX uploaded to pypi. >> Start uploading pre releases to pypi, and you get these for free (once >> you make numpy compile out of the box on those compile farms). ?There >> are compile farms for other OSes too... like ubuntu/debian, macports >> etc. ?Some distributions even automatically download, compile and >> package new releases once they spot a new file on your ftp/web site. > > I am familiar with some of those systems (PPA and opensuse build > service in particular). One of the goal of my proposal is to make it > easier to interoperate with those tools. > yeah, cool. > I think Pypi is mostly useless. The lack of enforced metadata is a big > no-no IMHO. 
>> There are build farms for windows packages and OSX uploaded to pypi.
>> Start uploading pre-releases to pypi, and you get these for free (once
>> you make numpy compile out of the box on those compile farms). There
>> are compile farms for other OSes too... like ubuntu/debian, macports
>> etc. Some distributions even automatically download, compile and
>> package new releases once they spot a new file on your ftp/web site.
>
> I am familiar with some of those systems (PPA and opensuse build
> service in particular). One of the goals of my proposal is to make it
> easier to interoperate with those tools.

yeah, cool.

> I think Pypi is mostly useless. The lack of enforced metadata is a big
> no-no IMHO. The fact that Pypi is miles behind CRAN, for example, is
> quite significant. I want CRAN for scientific python, and I don't see
> Pypi becoming it in the near future.
>
> The point of having our own Pypi-like server is that we could do the following:
> - enforcing metadata
> - making it easy to extend the service to support our needs

Yeah, cool. Many other projects have their own servers too - pygame.org, plone, etc etc - which meet their own needs. Patches are accepted for pypi btw.

What type of enforcement of metadata, and how would it help? I imagine this could be done in a number of ways for pypi:
- a distutils command extension that people could use.
- change pypi source code.
- check the metadata for certain packages, then email their authors telling them about issues.

>> pypm: http://pypm.activestate.com/list-n.html#numpy
>
> It is interesting to note that one of the maintainers of pypm has
> recently quit the discussion about Pypi, most likely out of
> frustration with the other participants.

yeah, big mailing list discussions hardly ever help I think :) oops, this is turning into one.

>> Documentation projects are being worked on to document, give tutorials
>> and make python packaging be easier all round. As witnessed by 20 or
>> so releases on pypi every day (and growing), lots of people are using
>> the python packaging tools successfully.
>
> This does not mean much IMO. Uploading on Pypi is almost required to
> use virtualenv, buildout, etc.. An interesting metric is not how many
> packages are uploaded, but how much they are used outside of
> developers.

Yeah, it only means that there are lots of developers able to use the packaging system to put their own packages up there. However there are over 500 science-related packages on there now - which is pretty cool. A way to measure packages being used would be by downloads, and by which packages depend on which other packages. I think the science ones would be reused less than normal, since a much higher percentage are C/C++ based, and are likely to be more fragile packages.

>> I'm not sure making a separate build tool is a good idea. I think
>> going with the rest of the python community, and improving the tools
>> there, is a better idea.
>
> It has been tried, and IMHO has been proved to have failed. You can
> look at the recent discussion (the one started by Guido in
> particular).

I don't think 500+ science-related packages is a total failure really.

>> pps. some notes on toydist itself.
>> - toydist convert is cool for people converting a setup.py. This
>> means that most people can try out toydist right away. But what does
>> it gain these people who convert their setup.py files?
>
> Not much ATM, except that it is easier to write a toysetup.info
> compared to setup.py IMO, and that it supports a simple way to include
> data files (something which is currently *impossible* to do without
> writing your own distutils extensions). It has also the ability to
> build eggs without using setuptools (I consider not using setuptools a
> feature, given this package's many failure modes).

yeah, I make sure setuptools is not used in my packages by default. However I use command line arguments to enable the features of setuptools when required (eggs, bdist_mpkg etc etc). Having a tool to create eggs without setuptools would be great in itself. Definitely list this in the feature list :)

> The main goals though are to make it easier to build your own tools on
> top of it, and to integrate with real build systems.
yeah, cool.

>> - a toydist convert that generates a setup.py file might be cool :)
>
> toydist started like this, actually: you would write a setup.py file
> which loads the package from toysetup.info, and can be converted to a
> dict argument to distutils.core.setup. I have not updated it recently,
> but that's definitely on the TODO list for a first alpha, as it would
> enable people to benefit from the format, with 100 % backward
> compatibility with distutils.

yeah, cool. That would let you develop things incrementally too, and still have toydist be useful for the whole development period until it catches up with the features of distutils needed.

>> - arbitrary code execution happens when building or testing with
>> toydist.
>
> You are right for testing, but wrong for building. As long as the
> build is entirely driven by toysetup.info, you only have to trust
> toydist (which is not safe ATM, but that's an implementation detail),
> and your build tools of course.

If you execute build tools on arbitrary code, then arbitrary code execution is easy for someone who wants to do bad things. Trust and, secondarily, sandboxing are the best ways to solve these problems imho.

> Obviously, if you have a package which uses an external build tool on
> top of toysetup.info (as will be required for numpy itself for
> example), all bets are off. But I think that's a tiny fraction of the
> interesting packages for scientific computing.

yeah, currently 1/5th of science packages use C/C++/fortran/cython etc (see http://pypi.python.org/pypi?:action=browse&c=40 110/458 on that page ). There seem to be a lot more using C/C++ compared to other types of packages on there (eg zope3 packages list 0 out of 900 packages using C/C++).

So the high number of C/C++ science-related packages on pypi demonstrates that better C/C++ tools for scientific packages are a big need - especially compile/testing farms for all these packages. Compile farms are a bigger need here than for pure python packages, since C/C++ is MUCH harder to write/test in a portable way. I would say it is close to impossible to get code working on multiple platforms without errors unless you have quite good knowledge of each platform. There are many times with pygame development that I make changes on an osx, windows or linux box, commit the change, then wait for the compile/tests to run on the build farm ( http://thorbrian.com/pygame/builds.php ). Releasing packages otherwise makes the process *heaps* longer... and many times I still get errors on different platforms, despite many years of multi-platform coding.

> Yes, that's a difficult process. We cannot fix this - but having
> automatically built (and hopefully tested) installers on major
> platforms would be a significant step in the right direction. That's
> one of the killer features of CRAN (whenever you submit a package for
> CRAN, a windows installer is built, and tested).
>
> Sandboxing is particularly an issue on windows - I don't know a good
> solution for windows sandboxing, outside of full vms, which are
> heavy-weights.

yeah, VMs are the way to go, if only to make the copies a fresh install each time. However I think automated distributed building, and trust, are more useful: ie. only build those packages where you trust the authors, and let anyone download, build and then post their build/test results. MS have given out copies of Windows to some members of the python community in the past to set up VMs for building.

By automated distributed building, I mean what already happens on mailing lists - people posting their test results when they have a problem - except in a more automated manner. Adding a 'Do you want to upload your build/test results?' at the end of a setup.py for subversion builds would give you dozens or hundreds of test results daily from all sorts of machines.
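(The result-upload hook could be as small as the sketch below -- entirely hypothetical; the report URL and field names are made up:)

    # hypothetical post-build report; the URL and fields are invented
    import platform, urllib

    report = urllib.urlencode({
        "package": "numpy",
        "revision": "r7000",          # stand-in for the subversion revision
        "status": "ok",
        "platform": platform.platform(),
    })
    urllib.urlopen("http://buildreports.example.org/submit", report)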
Making it easy for people to set up package builders which also upload their packages somewhere gives you distributed package building, in a fairly safe automated manner. (more details here: http://renesd.blogspot.com/2009/09/python-build-bots-down-maybe-they-need.html ) >> - it should be possible to build this toydist functionality as a >> distutils/distribute/buildout extension. > > No, it cannot, at least as far as distutils/distribute are concerned > (I know nothing about buildout). Extending distutils is horrible, and > fragile in general. Even autotools with its mix of generated sh > scripts through m4 and perl is a breeze compared to distutils. > >> - extending toydist? ?How are extensions made? ?there are 175 buildout >> packages which extend buildout, and many that extend >> distutils/setuptools - so extension of build tools in a necessary >> thing. > > See my answer earlier about interoperation with build tools. > I'm still not clear on how toydist will be extended. I am however, a lot clearer about its goals. cheers, From lists_ravi at lavabit.com Wed Dec 30 09:26:04 2009 From: lists_ravi at lavabit.com (Ravi) Date: Wed, 30 Dec 2009 09:26:04 -0500 Subject: [Numpy-discussion] =?iso-8859-1?q?Announcing_toydist=2C_improving?= =?iso-8859-1?q?_distribution_and_=09packaging_situation?= In-Reply-To: <64ddb72c0912300315r420bd88dk5bb6be3a960bf44d@mail.gmail.com> References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912290622m2f0ec2c3x5a26e63118cb29a0@mail.gmail.com> <64ddb72c0912300315r420bd88dk5bb6be3a960bf44d@mail.gmail.com> Message-ID: <200912300926.05071.lists_ravi@lavabit.com> On Wednesday 30 December 2009 06:15:45 Ren? Dudfield wrote: > I agree with many things in that post. Except your conclusion on > multiple versions of packages in isolation. Package isolation is like > processes, and package sharing is like threads - and threads are evil! You have stated this several times, but is there any evidence that this is the desire of the majority of users? In the scientific community, interactive experimentation is critical and users are typically not seasoned systems administrators. For such users, almost all packages installed after installing python itself are packages they use. In particular, all I want to do is to use apt/yum to get the packages (or ask my sysadmin, who rightfully has no interest in learning the intricacies of python package installation, to do so) and continue with my work. "Packages-in-isolation" is for people whose job is to run server farms, not interactive experimenters. > Leave my python site-packages directory alone I say... especially > don't let setuptools infect it :) Many people currently find the > multi versions of packages in isolation approach works well for them - > so for some use cases the tools are working wonderfully. More power to them. But for the rest of us, that approach is too much hassle. > Science is supposed to allow repeatability. Without the same versions > of packages, repeating experiments is harder. Really? IME, this is not the case. Simulations in signal processing are typically run with two different kinds of data sets: - random data for Monte Carlo simulations - well-known and widely available test streams In both kinds of data sets, reimplementation of the same algorithms is rarely, if ever, affected by the versions of packages, primarily because of the wide variety of tool sets (and even more versions) that are in use. 
> This is a big problem > in science that multiple versions of packages in _isolation_ can help > get to a solution to the repeatability problem. Package versions are, at worst, a very minor distraction in solving the repeatability problem. Usually, the main issues are unclear descriptions of the algorithms and unstated assumptions. > Just pick some random paper and try to reproduce their results. It's > generally very hard, unless the software is quite well packaged. In scientific experimentation, it is folly to rely on software from the author of some random paper. In signal processing, almost every critical algorithm is re-implemented, and usually in a different language. The only exceptions are when the software can be validated with a large amount of test data, but this very rare. Usually, you use some package to get started in your current environment. If it works (i.e., results meet your quality metric), you then build on it. If it does not work (even if only due to version incompatibility), you usually jettison it and either find an alternative or rewrite it. > Multiple versions are not a replacement for backwards compatibility, > just a way to avoid the problem in the short term to avoid being > blocked. If a new package version breaks your app, then you can > either pin it to an old version, fix your app, or fix the package. It > is also not a replacement for building on stable high quality > components, but helps you work with less stable, and less high quality > components - at a much faster rate of change, with a much larger > dependency list. This is a software engineer + systems administrator solution. In larger institutions, this absolutely unworkable if you rely on IT for package management/installation. > >> Plenty of good work is going on with python packaging. > > > > That's the opposite of my experience. What I care about is: > > - tools which are hackable and easily extensible > > - robust install/uninstall > > - real, DAG-based build system > > - explicit and repeatability > > > > None of this is supported by the tools, and the current directions go > > even further away. When I have to explain at length why the > > command-based design of distutils is a nightmare to work with, I don't > > feel very confident that the current maintainers are aware of the > > issues, for example. It shows that they never had to extend distutils > > much. > > All agreed! I'd add to the list parallel builds/tests (make -j 16), > and outputting to native build systems. eg, xcode, msvc projects, and > makefiles. Essentially out of frustration with distutils and setuptools, I have migrated to CMake for pretty much all my build systems (except for a few scons ones I haven't had to touch for a while) since it supports all the features mentioned above. Even dealing with CMake's god-awful "scripting language" is better than dealing with distutils. I am very happy to see David C's efforts to finally get away from distutils, but I am worried that a cross-platform build system that has all the features that he wants is simply beyond the scope of 1-2 people unless they work on it full time for a year or two. > yeah, currently 1/5th of science packages use C/C++/fortran/cython etc > (see http://pypi.python.org/pypi?:action=browse&c=40 110/458 on that > page ). There seems to be a lot more using C/C++ compared to other > types of pakages on there (eg zope3 packages list 0 out of 900 > packages using C/C++). 
>> So the high number of C/C++ science-related packages on pypi
>> demonstrates that better C/C++ tools for scientific packages are a big
>> need - especially compile/testing farms for all these packages.
>> Compile farms are a bigger need here than for pure python packages,
>> since C/C++ is MUCH harder to write/test in a portable way. I would
>> say it is close to impossible to get code working on multiple
>> platforms without errors unless you have quite good knowledge of each
>> platform.

Not sure that that is quite true. C++ is not a very popular language around here, but the combination of boost+Qt+python+scipy+hdf5+h5py has made virtually all of my platform-specific code vanish (with the exception of some platform-specific stuff in my CMake scripts).

Regards,
Ravi

From dsdale24 at gmail.com Wed Dec 30 09:26:00 2009
From: dsdale24 at gmail.com (Darren Dale)
Date: Wed, 30 Dec 2009 09:26:00 -0500
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com>
Message-ID:

Hi David,

On Mon, Dec 28, 2009 at 9:03 AM, David Cournapeau wrote:
> Executable: grin
>     module: grin
>     function: grin_main
>
> Executable: grind
>     module: grin
>     function: grind_main

Have you thought at all about operations that are currently performed by post-installation scripts? For example, it might be desirable for the ipython or MayaVi windows installers to create a folder in the Start menu that contains links to the executable and the documentation. This is probably a secondary issue at this point in toydist's development, but I think it is an important feature in the long run.

Also, have you considered support for package extras (package variants in Ports, allowing you to specify features that pull in additional dependencies like traits[qt4])? Enthought makes good use of them in ETS, and I think they would be worth keeping.

Darren
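(For reference, the setuptools spelling of such an extra -- a sketch; the package names are only examples:)

    # declared in setup.py; installing "myapp[qt4]" then pulls in the extras
    from setuptools import setup

    setup(
        name="myapp",
        version="0.1",
        install_requires=["traits"],
        extras_require={"qt4": ["PyQt4"]},   # e.g. easy_install "myapp[qt4]"
    )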
From dsdale24 at gmail.com Wed Dec 30 09:39:44 2009
From: dsdale24 at gmail.com (Darren Dale)
Date: Wed, 30 Dec 2009 09:39:44 -0500
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <200912300926.05071.lists_ravi@lavabit.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912290622m2f0ec2c3x5a26e63118cb29a0@mail.gmail.com> <64ddb72c0912300315r420bd88dk5bb6be3a960bf44d@mail.gmail.com> <200912300926.05071.lists_ravi@lavabit.com>
Message-ID:

On Wed, Dec 30, 2009 at 9:26 AM, Ravi wrote:
> On Wednesday 30 December 2009 06:15:45 René Dudfield wrote:
>
>> I agree with many things in that post. Except your conclusion on
>> multiple versions of packages in isolation. Package isolation is like
>> processes, and package sharing is like threads - and threads are evil!

I don't think this is an appropriate analogy, and hyperbolic statements like "threads are evil!" are unlikely to persuade a scientific audience.

> You have stated this several times, but is there any evidence that this is
> the desire of the majority of users? In the scientific community,
> interactive experimentation is critical and users are typically not
> seasoned systems administrators. For such users, almost all packages
> installed after installing python itself are packages they use. In
> particular, all I want to do is to use apt/yum to get the packages (or ask
> my sysadmin, who rightfully has no interest in learning the intricacies of
> python package installation, to do so) and continue with my work.
> "Packages-in-isolation" is for people whose job is to run server farms,
> not interactive experimenters.

I agree.

>> Leave my python site-packages directory alone I say... especially
>> don't let setuptools infect it :)

There are already mechanisms in place for this: "python setup.py install --user" or "easy_install --prefix=/usr/local", for example.

Darren

From cournape at gmail.com Wed Dec 30 10:50:10 2009
From: cournape at gmail.com (David Cournapeau)
Date: Thu, 31 Dec 2009 00:50:10 +0900
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <64ddb72c0912300315r420bd88dk5bb6be3a960bf44d@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <64ddb72c0912290527s1143efc7g3efe93936ca5de5@mail.gmail.com> <5b8d13220912290622m2f0ec2c3x5a26e63118cb29a0@mail.gmail.com> <64ddb72c0912300315r420bd88dk5bb6be3a960bf44d@mail.gmail.com>
Message-ID: <5b8d13220912300750kaa4230hde31ced08c32d44e@mail.gmail.com>

On Wed, Dec 30, 2009 at 8:15 PM, René Dudfield wrote:
>
> Sitting down with Tarek (who is one of the current distutils
> maintainers) in Berlin, we had a little discussion about packaging over
> pizza and beer... and he was quite mindful of OS packagers' problems
> and issues.

This has been said many times on distutils-sig, but no concrete action has ever been taken in that direction. For example, toydist already supports the FHS better than distutils, and is more flexible. I have tried several times to explain why this matters on distutils-sig, but you then have the peanut gallery interfering with unrelated nonsense (like "it would break windows", as if it could not be implemented independently). Also, retrofitting support for --*dir in distutils would be *very* difficult, unless you are ready to break backward compatibility (there are 6 ways to install data files, and each of them has some corner cases, for example - it is a real pain to support this correctly in the convert command of toydist, and you simply cannot recover missing information to comply with the FHS in every case).

> However these systems were developed by the zope/plone/web crowd, so
> they are naturally going to be thinking a lot about zope/plone/web
> issues.

Agreed - it is natural that they care about their problems first, that's how it works in open source. What I find difficult is when our concerns are constantly dismissed by people who have no clue about our issues - and who later claim we are not cooperative.

> Debian and Ubuntu packages for them are mostly useless
> because of their age.

That's where the build farm enters. This is a known issue; that's why the build service or PPA exist in the first place.

> I think perhaps if toydist included something like stdeb as not an
> extension to distutils, but a standalone tool (like toydist), there
> would be fewer problems with it.

That's pretty much how I intend to do things. Currently, in toydist, you can do something like:

from toydist.core import PackageDescription

pkg = PackageDescription.from_file("toysetup.info")
# pkg now gives you access to metadata, as well as extensions, python
# modules, etc...

I think this gives almost everything that is needed to implement a sdist_dsc command.
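(Roughly what such a command could do with it -- a sketch; the attribute names on pkg are assumptions, not toydist's confirmed API:)

    from toydist.core import PackageDescription

    pkg = PackageDescription.from_file("toysetup.info")
    # hypothetical: emit the skeleton of a debian/control from the
    # static metadata; 'name' is an assumed attribute
    print "Source: %s" % pkg.name
    print "Section: python"
    print "Build-Depends: python-all-dev"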
Contrary to the Distribution class in distutils, this class would not need to be subclassed/monkey-patched by extensions, as it only cares about the description, and is 100 % uncoupled from the build part. > yes, I have also battled with distutils over the years. ?However it is > simpler than autotools (for me... maybe distutils has perverted my > fragile mind), and works on more platforms for python than any other > current system. Autotools certainly works on more platforms (windows notwhistanding), if only because python itself is built with autoconf. Distutils simplicity is a trap: it is simpler only if you restrict to what distutils gives you. Don't get me wrong, autotools are horrible, but I have never encountered cases where I had to spend hours to do trivial tasks, as has been the case with distutils. Numpy build system would be much, much easier to implement through autotools, and would be much more reliable. >?However > distutils has had more tests and testing systems added, so that > refactoring/cleaning up of distutils can happen more so. You can't refactor distutils without breaking backward compatibility, because distutils has no API. The whole implementation is the API. That's one of the fundamental disagreement I and other scipy dev have with current contributors on distutils-sig: the starting point (distutils) and the goal are so far away from each other that getting there step by step is hopeless. > I agree with many things in that post. ?Except your conclusion on > multiple versions of packages in isolation. ?Package isolation is like > processes, and package sharing is like threads - and threads are evil! I don't find the comparison very helpful (for once, you can share data between processes, whereas virtualenv cannot see each other AFAIK). > Science is supposed to allow repeatability. ?Without the same versions > of packages, repeating experiments is harder. ?This is a big problem > in science that multiple versions of packages in _isolation_ can help > get to a solution to the repeatability problem. I don't think that's true - at least it does not reflect my experience at all. But then, I don't pretend to have an extensive experience either. From most of my discussions at scipy conferences, I know most people are dissatisfied with the current python solutions. > >>> Plenty of good work is going on with python packaging. >> >> That's the opposite of my experience. What I care about is: >> ?- tools which are hackable and easily extensible >> ?- robust install/uninstall >> ?- real, DAG-based build system >> ?- explicit and repeatability >> >> None of this is supported by the tools, and the current directions go >> even further away. When I have to explain at length why the >> command-based design of distutils is a nightmare to work with, I don't >> feel very confident that the current maintainers are aware of the >> issues, for example. It shows that they never had to extend distutils >> much. >> > > All agreed! ?I'd add to the list parallel builds/tests (make -j 16), > and outputting to native build systems. ?eg, xcode, msvc projects, and > makefiles. Yep - I got quite far with numscons already. It cannot be used as a general solution, but as a dev tool for my own work on numpy/scipy, it has been a huge time saver, especially given the top notch dependency tracking system. It supports // builds, and I can build full debug builds of scipy < 1 minute on a fast machine. That's a real productivity booster. 
> > How will you handle toydist extensions so that multiple extensions do > not have problems with each other? ?I don't think this is possible > without isolation, and even then it's still a problem. By doing it mostly the Unix way, through protocols and file format, not through API. Good API is hard, but for build tools, it is much, much harder. When talking about extensions, I mostly think about the following: - adding a new compiler/new platform - adding a new installer format - adding a new kind of source file/target (say ctypes extension, cython compilation, etc...) Instead of using classes for compilers/tools, I am considering using python modules for each tool, and each tool would be registered through a source file extension (associate a function to ".c", for example). Actual compilation steps would be done through strings ("$CC ...."). The system would be kept simple, because for complex projects, one should forward all this to a real build system (like waf or scons). There is also the problem of post/pre hooks, adding new steps in toymaker: I have not thought much about this, but I like waf's way of doing it, and it may be applicable. In waf, the main script (called wscript) defines a function for each build step: def configure(): pass def build(): pass .... And undefined functions are considered unmodified. What I know for sure is that the distutils-way of extending through inheritance does not work at all. As soon as two extensions subclass the same base class, you're done. > > Yeah, cool. ?Many other projects have their own servers too. > pygame.org, plone, etc etc, which meet their own needs. ?Patches are > accepted for pypi btw. Yes, but how long before the patch is accepted and deployed ? > What type of enforcements of meta data, and how would they help? ?I > imagine this could be done in a number of ways to pypi. > - a distutils command extension that people could use. > - change pypi source code. > - check the metadata for certain packages, then email their authors > telling them about issues. First, packages with malformed metadata would be rejected, and it would not be possible to register a package without uploading the sources. I simply do not want to publish a package which does not even have a name or a version, for example. The current way of doing things in pypi in insane if you ask me. For example, if you want to install a package with its dependencies, you need to download the package, which may be in another website, and you need to execute setup.py just to know its dependencies. This has so many failures modes, I don't understand how this can seriously be considered, really. Every other system has an index to do this kind of things (curiously, both EPD and pypm have an index as well AFAIK). Again, a typical example of NIH, with inferior solutions implemented in the case of python. > > yeah, cool. ?That would let you develop things incrementally too, and > still have toydist be useful for the whole development period until it > catches up with the features of distutils needed. Initially, toydist was started to show that writing something compatible with distutils without being tight to distutils was possible. > If you execute build tools on arbitrary code, then arbitrary code > execution is easy for someone who wants to do bad things. Well, you could surely exploit built tools bugs. But at least, I can query metadata and packages features in a safe way - and this is very useful already (cf my points about being able to query packages metadata in one "query"). 
From cournape at gmail.com Wed Dec 30 11:04:11 2009
From: cournape at gmail.com (David Cournapeau)
Date: Thu, 31 Dec 2009 01:04:11 +0900
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: 
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com>
Message-ID: <5b8d13220912300804p47687b99of8f58c154da9e6b3@mail.gmail.com>

On Wed, Dec 30, 2009 at 11:26 PM, Darren Dale wrote:
> Hi David,
>
> On Mon, Dec 28, 2009 at 9:03 AM, David Cournapeau wrote:
>> Executable: grin
>>     module: grin
>>     function: grin_main
>>
>> Executable: grind
>>     module: grin
>>     function: grind_main
>
> Have you thought at all about operations that are currently performed by post-installation scripts? For example, it might be desirable for the ipython or MayaVi windows installers to create a folder in the Start menu that contains links to the executable and the documentation. This is probably a secondary issue at this point in toydist's development, but I think it is an important feature in the long run.

The main problem I see with post hooks is how to support them in installers. For example, you would have a function which does the post install, and declare it as a post install hook through a decorator:

@hook.post_install
def myfunc():
    pass

The main issue is how to communicate data - that's a major issue in every build system I know of (scons' solution is ugly: every function takes an env argument, which is basically a giant global variable).

>
> Also, have you considered support for package extras (package variants in Ports, allowing you to specify features that pull in additional dependencies like traits[qt4])? Enthought makes good use of them in ETS, and I think they would be worth keeping.

The declarative format may declare flags as follows:

Flag: c_exts
    Description: Build (optional) C extensions
    Default: false

Library:
    if flag(c_exts):
        Extension: foo
            sources: foo.c

And this is automatically available at the configure stage. It can be used anywhere in Library, not just for Extension (you could use it within the Requires section). I am considering adding more than Flag (flags are boolean), if it does not make the format too complex. The use case I have in mind is something like:

toydist configure --with-lapack-dir=/opt/mkl/lib

which I have wished to implement for numpy for ages.

David
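[A minimal sketch of how the decorator-based hook registration described above might work. The registry, the run_post_install driver, and the context dictionary - which stands in for the open data-passing question David mentions - are all assumptions for illustration, not toydist's actual API.]

# a hypothetical registry (imagine it lives in a 'hook' module)
_POST_INSTALL = []

def post_install(func):
    # used as @hook.post_install in a package's setup script
    _POST_INSTALL.append(func)
    return func

def run_post_install(context):
    # an installer would call this once the files are on disk; the
    # context dict (install prefix, etc.) stands in for the unsolved
    # question of how data gets communicated to hooks
    for func in _POST_INSTALL:
        func(context)

# in the package's setup script:
@post_install
def make_start_menu_entry(context):
    print("would create shortcuts under %s" % context["prefix"])

run_post_install({"prefix": "/usr/local"})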
From cournape at gmail.com Wed Dec 30 11:16:19 2009
From: cournape at gmail.com (David Cournapeau)
Date: Thu, 31 Dec 2009 01:16:19 +0900
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: 
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com>
Message-ID: <5b8d13220912300816s12c934adh4abdd6d703f8928f@mail.gmail.com>

On Wed, Dec 30, 2009 at 11:26 PM, Darren Dale wrote:
> Hi David,
>
> On Mon, Dec 28, 2009 at 9:03 AM, David Cournapeau wrote:
>> Executable: grin
>>     module: grin
>>     function: grin_main
>>
>> Executable: grind
>>     module: grin
>>     function: grind_main
>
> Have you thought at all about operations that are currently performed by post-installation scripts? For example, it might be desirable for the ipython or MayaVi windows installers to create a folder in the Start menu that contains links to the executable and the documentation. This is probably a secondary issue at this point in toydist's development, but I think it is an important feature in the long run.
>
> Also, have you considered support for package extras (package variants in Ports, allowing you to specify features that pull in additional dependencies like traits[qt4])? Enthought makes good use of them in ETS, and I think they would be worth keeping.

Does this example cover what you have in mind? I am not so familiar with this feature of setuptools:

Name: hello
Version: 1.0

Library:
    BuildRequires: paver, sphinx, numpy
    if os(windows)
        BuildRequires: pywin32
    Packages:
        hello
    Extension: hello._bar
        sources:
            src/hellomodule.c
    if os(linux)
        Extension: hello._linux_backend
            sources:
                src/linbackend.c

Note that instead of os(os_name), you can use flag(flag_name), where flags are boolean variables which can be user defined:

http://github.com/cournape/toydist/blob/master/examples/simples/conditional/toysetup.info

http://github.com/cournape/toydist/blob/master/examples/var_example/toysetup.info

David

From kwgoodman at gmail.com Wed Dec 30 12:19:48 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Wed, 30 Dec 2009 09:19:48 -0800
Subject: [Numpy-discussion] Complex slicing and take
In-Reply-To: <4B3B0A61.4050303@eso.org>
References: <4B3B0A61.4050303@eso.org>
Message-ID: 

On Wed, Dec 30, 2009 at 12:08 AM, Eric Emsellem wrote:
> Hi
>
> thanks for the tips. Unfortunately this is not what I am after.
>
>>> >   import numpy as num
>>> >   startarray = random((1000,100))
>>> >   take_sample = [1,2,5,6,1,2]
>>> >   temp = num.take(startarray,take_sample,axis=1)
>>
>> Would it help to make temp a 1000x4 array instead of 1000x6? Could you do that by changing take_sample to [1,2,5,6] and multiplying columns 1 and 2 by a factor of 2? That would slow down the construction of temp but speed up the addition (and slicing?) in the loop below.
>
> No it wouldn't help unfortunately, because the second instance of "1,2" would have different shifts. So I cannot just count the number of occurrences of each line.
>
> From the initial 2D array, 1D lines could be extracted several times, with each time a different shift.
>
>>> >   shift = [10,20,34,-10,22,-20]
>>> >   result = num.zeros(900)  # shorter than initial because of the shift
>>> >   for i in range(len(shift)) :
>>> >      result += temp[100+shift[i]:-100+shift[1]]
>>
>> This looks fast to me. The slicing doesn't make a copy nor does the addition. I've read that cython does fast indexing but I don't know if that applies to slicing as well. I assume that shift[1] is a typo and should be shift[i].
>
> (yes of course the shift[1] should be shift[i])
> Well this may be fast, but not fast enough. And also, starting from my 2D startarray again, it looks odd that I cannot do something like:
>
> startarray = random((1000,100))
> take_sample = [1,2,5,6,1,2]
> shift = [10,20,34,-10,22,-20]
> result = num.sum(num.take(startarray,take_sample,axis=1)[100+shift:100-shift])
>
> but of course this is nonsense because I cannot address the data this way (with "shift").
>
> In fact I realise now that my question is simpler: how do I extract and sum 1d lines from a 2D array if I want first each line to be "shifted". So starting again now, I want a quick way to write:
>
> startarray = random((1000,6))
> shift = [10,20,34,-10,22,-20]
> result = num.zeros(1000, dtype=float)
> for i in len(shift) :
>   result += startarray[100+shift[i]:900+shift[i]]
>
> Can I write this directly with some numpy indexing without the loop in python?
>
> thanks for any tip.
>
> Eric

Where's the bottleneck? There's the loop, there's constructing the indices (which could be done outside the loop), slicing, adding. The location of the bottleneck probably depends on the relative sizes of the arrays. If the bottleneck is the loop, i.e. shift has a LOT of elements, then it might speed things up to break shift into chunks and use python's multiprocessing module to solve this in parallel. Something like cython would also speed up the loop.

I haven't tried running your code, but if anyone does, I think

result += startarray[100+shift[i]:900+shift[i]]

should be

result += startarray[100+shift[i]:900+shift[i], i]

From josef.pktd at gmail.com Wed Dec 30 12:50:25 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 30 Dec 2009 12:50:25 -0500
Subject: [Numpy-discussion] Complex slicing and take
In-Reply-To: 
References: <4B3B0A61.4050303@eso.org>
Message-ID: <1cd32cbb0912300950k115f8083n2d503456655d066f@mail.gmail.com>

On Wed, Dec 30, 2009 at 12:19 PM, Keith Goodman wrote:
> On Wed, Dec 30, 2009 at 12:08 AM, Eric Emsellem wrote:
>> Hi
>>
>> thanks for the tips. Unfortunately this is not what I am after.
>>
>>>> >   import numpy as num
>>>> >   startarray = random((1000,100))
>>>> >   take_sample = [1,2,5,6,1,2]
>>>> >   temp = num.take(startarray,take_sample,axis=1)
>>>
>>> Would it help to make temp a 1000x4 array instead of 1000x6? Could you do that by changing take_sample to [1,2,5,6] and multiplying columns 1 and 2 by a factor of 2? That would slow down the construction of temp but speed up the addition (and slicing?) in the loop below.
>>
>> No it wouldn't help unfortunately, because the second instance of "1,2" would have different shifts. So I cannot just count the number of occurrences of each line.
>>
>> From the initial 2D array, 1D lines could be extracted several times, with each time a different shift.
>>
>>>> >   shift = [10,20,34,-10,22,-20]
>>>> >   result = num.zeros(900)  # shorter than initial because of the shift
>>>> >   for i in range(len(shift)) :
>>>> >      result += temp[100+shift[i]:-100+shift[1]]
>>>
>>> This looks fast to me. The slicing doesn't make a copy nor does the addition. I've read that cython does fast indexing but I don't know if that applies to slicing as well. I assume that shift[1] is a typo and should be shift[i].
>>
>> (yes of course the shift[1] should be shift[i])
>> Well this may be fast, but not fast enough. And also, starting from my 2D startarray again, it looks odd that I cannot do something like:
>>
>> startarray = random((1000,100))
>> take_sample = [1,2,5,6,1,2]
>> shift = [10,20,34,-10,22,-20]
>> result = num.sum(num.take(startarray,take_sample,axis=1)[100+shift:100-shift])
>>
>> but of course this is nonsense because I cannot address the data this way (with "shift").
>>
>> In fact I realise now that my question is simpler: how do I extract and sum 1d lines from a 2D array if I want first each line to be "shifted".
>> So starting again now, I want a quick way to write:
>>
>> startarray = random((1000,6))
>> shift = [10,20,34,-10,22,-20]
>> result = num.zeros(1000, dtype=float)
>> for i in len(shift) :
>>   result += startarray[100+shift[i]:900+shift[i]]
>>
>> Can I write this directly with some numpy indexing without the loop in python?
>>
>> thanks for any tip.
>>
>> Eric
>
> Where's the bottleneck? There's the loop, there's constructing the indices (which could be done outside the loop), slicing, adding. The location of the bottleneck probably depends on the relative sizes of the arrays. If the bottleneck is the loop, i.e. shift has a LOT of elements, then it might speed things up to break shift into chunks and use python's multiprocessing module to solve this in parallel. Something like cython would also speed up the loop.
>
> I haven't tried running your code, but if anyone does, I think
>
> result += startarray[100+shift[i]:900+shift[i]]
>
> should be
>
> result += startarray[100+shift[i]:900+shift[i], i]

something like this? just trying out, I haven't really checked carefully whether it actually replicates your snippets. Constructing big intermediate arrays might not improve performance compared to a loop.

>>> np.arange(30).reshape(6,5)
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29]])

>>> np.arange(30).reshape(6,5)[np.array([[1,2,2,1]]).T, np.arange(0,3)+np.array([[0,1,2,1]]).T]
array([[ 5,  6,  7],
       [11, 12, 13],
       [12, 13, 14],
       [ 6,  7,  8]])

>>> np.arange(30).reshape(6,5)[np.array([[1,2,2,1]]).T, np.arange(0,3)+np.array([[0,1,2,1]]).T].sum(0)
array([34, 38, 42])

Josef
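[A self-contained version of the shifted-sum question above, comparing a corrected loop (range() and a result sized to the slices) with a fancy-indexing solution along the lines of Josef's example. Whether the vectorized form actually wins depends on the array sizes, as Keith notes.]

import numpy as np

startarray = np.random.random((1000, 6))
shift = np.array([10, 20, 34, -10, 22, -20])

# loop version, with Keith's per-column correction applied
result = np.zeros(800)
for i in range(len(shift)):
    result += startarray[100 + shift[i]:900 + shift[i], i]

# vectorized version: an (800, 6) array of shifted row indices, one
# column per line, paired with the column indices and summed across
rows = np.arange(100, 900)[:, np.newaxis] + shift[np.newaxis, :]
cols = np.arange(len(shift))
result2 = startarray[rows, cols].sum(axis=1)

assert np.allclose(result, result2)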
From charlesr.harris at gmail.com Wed Dec 30 13:34:31 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 30 Dec 2009 11:34:31 -0700
Subject: [Numpy-discussion] Empty strings not empty?
In-Reply-To: <1e2af89e0912291535v74e27504t6e3b9f1910bb492@mail.gmail.com>
References: <1e2af89e0912291535v74e27504t6e3b9f1910bb492@mail.gmail.com>
Message-ID: 

On Tue, Dec 29, 2009 at 4:35 PM, Matthew Brett wrote:

> Hi,
>
> I was surprised by this - should I have been?
>
> In [35]: e = np.array([''])
>
> In [36]: e.shape
> Out[36]: (1,)
>
> In [37]: e.size
> Out[37]: 1
>
> In [38]: e.tostring()
> Out[38]: '\x00'
>
> In [39]: f = np.array(['\x00'])
>
> In [40]: f.shape == e.shape
> Out[40]: True
>
> In [41]: f.size == e.size
> Out[41]: True
>
> In [42]: f.tostring()
> Out[42]: '\x00'
>
> In [43]: z = np.array(['\x00'])
>
> In [44]: z.shape
> Out[44]: (1,)
>
> In [45]: z.size
> Out[45]: 1
>
> In [46]: z
> Out[46]:
> array([''],
>       dtype='|S1')
>
> That is, an empty string array seems to be the same as a string array with a single 0 byte, including having shape (1,) and size 1...

It isn't empty:

In [3]: array(['\x00']).dtype
Out[3]: dtype('|S1')

In [4]: array(['\x00']).tostring()
Out[4]: '\x00'

In [5]: array(['\x00'])[0]
Out[5]: ''

Looks like a printing problem to me, something in __repr__ for the string array. It seems that trailing zeros are trimmed off.

In [11]: array(['a\x00\x00'])
Out[11]:
array(['a'],
      dtype='|S3')

In [12]: array(['a\x00b'])
Out[12]:
array(['a\x00b'],
      dtype='|S3')

Chuck

From renesd at gmail.com Wed Dec 30 13:47:05 2009
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Wed, 30 Dec 2009 18:47:05 +0000
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <200912300926.05071.lists_ravi@lavabit.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912290622m2f0ec2c3x5a26e63118cb29a0@mail.gmail.com> <64ddb72c0912300315r420bd88dk5bb6be3a960bf44d@mail.gmail.com> <200912300926.05071.lists_ravi@lavabit.com>
Message-ID: <64ddb72c0912301047t2dfd4059kd5bcb5a780e92dce@mail.gmail.com>

On Wed, Dec 30, 2009 at 2:26 PM, Ravi wrote:
> On Wednesday 30 December 2009 06:15:45 René Dudfield wrote:
>
>> I agree with many things in that post. Except your conclusion on multiple versions of packages in isolation. Package isolation is like processes, and package sharing is like threads - and threads are evil!
>
> You have stated this several times, but is there any evidence that this is the desire of the majority of users? In the scientific community, interactive experimentation is critical and users are typically not seasoned systems administrators. For such users, almost all packages installed after installing python itself are packages they use. In particular, all I want to do is to use apt/yum to get the packages (or ask my sysadmin, who rightfully has no interest in learning the intricacies of python package installation, to do so) and continue with my work. "Packages-in-isolation" is for people whose job is to run server farms, not interactive experimenters.

500+ packages on pypi. Provide a counter point, otherwise the evidence is against your position - overwhelmingly.

From dagss at student.matnat.uio.no Wed Dec 30 14:01:48 2009
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Wed, 30 Dec 2009 20:01:48 +0100
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <64ddb72c0912301047t2dfd4059kd5bcb5a780e92dce@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912290622m2f0ec2c3x5a26e63118cb29a0@mail.gmail.com> <64ddb72c0912300315r420bd88dk5bb6be3a960bf44d@mail.gmail.com> <200912300926.05071.lists_ravi@lavabit.com> <64ddb72c0912301047t2dfd4059kd5bcb5a780e92dce@mail.gmail.com>
Message-ID: <4B3BA39C.4020401@student.matnat.uio.no>

René Dudfield wrote:
> On Wed, Dec 30, 2009 at 2:26 PM, Ravi wrote:
>> On Wednesday 30 December 2009 06:15:45 René Dudfield wrote:
>>
>>> I agree with many things in that post. Except your conclusion on multiple versions of packages in isolation. Package isolation is like processes, and package sharing is like threads - and threads are evil!
>>
>> You have stated this several times, but is there any evidence that this is the desire of the majority of users? In the scientific community, interactive experimentation is critical and users are typically not seasoned systems administrators. For such users, almost all packages installed after installing python itself are packages they use. In particular, all I want to do is to use apt/yum to get the packages (or ask my sysadmin, who rightfully has no interest in learning the intricacies of python package installation, to do so) and continue with my work. "Packages-in-isolation" is for people whose job is to run server farms, not interactive experimenters.
>
> 500+ packages on pypi.
> Provide a counter point, otherwise the evidence is against your position - overwhelmingly.

?!? Wouldn't you need to measure the number of downloads (and also figure out something else to measure that relative to)? Uploading something to PyPI is easy enough to do and probably done by default by a lot of package authors -- that doesn't mean that it is the main distribution method.

-- 
Dag Sverre

From matthew.brett at gmail.com Wed Dec 30 14:00:50 2009
From: matthew.brett at gmail.com (Matthew Brett)
Date: Wed, 30 Dec 2009 19:00:50 +0000
Subject: [Numpy-discussion] Empty strings not empty?
In-Reply-To: 
References: <1e2af89e0912291535v74e27504t6e3b9f1910bb492@mail.gmail.com>
Message-ID: <1e2af89e0912301100r5e18a0d0kd3335624c1280a41@mail.gmail.com>

Hi,

> It isn't empty:
>
> In [3]: array(['\x00']).dtype
> Out[3]: dtype('|S1')
>
> In [4]: array(['\x00']).tostring()
> Out[4]: '\x00'
>
> In [5]: array(['\x00'])[0]
> Out[5]: ''

No, but my problem was that an empty string is not empty either, and that you can't therefore distinguish between an empty string and a string with all 0 bytes:

In [11]: np.array('') == '\x00\x00\x00'
Out[11]: array(True, dtype=bool)

> Looks like a printing problem to me, something in __repr__ for the string array. It seems that trailing zeros are trimmed off.
>
> In [11]: array(['a\x00\x00'])
> Out[11]:
> array(['a'],
>       dtype='|S3')
>
> In [12]: array(['a\x00b'])
> Out[12]:
> array(['a\x00b'],
>       dtype='|S3')

I don't think it's a printing problem, I think it's that the trailing zeros are pulled off in the string comparisons, and for printing, even though they are present in memory. I mean, that a.tostring() is right, and the __repr__ and comparisons are - at least to me - confusing.

In [2]: a = np.array('a\x00\x00\x00')

In [3]: a
Out[3]:
array('a',
      dtype='|S4')

In [5]: a == 'a'
Out[5]: array(True, dtype=bool)

In [7]: a == 'a\x00\x00\x00'
Out[7]: array(True, dtype=bool)

See you,

Matthew

From josef.pktd at gmail.com Wed Dec 30 14:04:17 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 30 Dec 2009 14:04:17 -0500
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <64ddb72c0912301047t2dfd4059kd5bcb5a780e92dce@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912290622m2f0ec2c3x5a26e63118cb29a0@mail.gmail.com> <64ddb72c0912300315r420bd88dk5bb6be3a960bf44d@mail.gmail.com> <200912300926.05071.lists_ravi@lavabit.com> <64ddb72c0912301047t2dfd4059kd5bcb5a780e92dce@mail.gmail.com>
Message-ID: <1cd32cbb0912301104l231b0c93h749f2a66246bd6b6@mail.gmail.com>

On Wed, Dec 30, 2009 at 1:47 PM, René Dudfield wrote:
> On Wed, Dec 30, 2009 at 2:26 PM, Ravi wrote:
>> On Wednesday 30 December 2009 06:15:45 René Dudfield wrote:
>>
>>> I agree with many things in that post. Except your conclusion on multiple versions of packages in isolation. Package isolation is like processes, and package sharing is like threads - and threads are evil!
>>
>> You have stated this several times, but is there any evidence that this is the desire of the majority of users? In the scientific community, interactive experimentation is critical and users are typically not seasoned systems administrators. For such users, almost all packages installed after installing python itself are packages they use.
>> In particular, all I want to do is to use apt/yum to get the packages (or ask my sysadmin, who rightfully has no interest in learning the intricacies of python package installation, to do so) and continue with my work. "Packages-in-isolation" is for people whose job is to run server farms, not interactive experimenters.
>
> 500+ packages on pypi. Provide a counter point, otherwise the evidence is against your position - overwhelmingly.

The number of packages on pypi has no implication for whether I have only one version of each or several in different isolated environments. Actually, I have only one version of almost all the packages that I use, and those are on the python path (and many of them are easy_installed from pypi, or exe installed). The only ones I have in several versions are the ones I work on myself, or where I track a repository.

Josef

From robert.kern at gmail.com Wed Dec 30 14:08:53 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 30 Dec 2009 13:08:53 -0600
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <64ddb72c0912301047t2dfd4059kd5bcb5a780e92dce@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912290622m2f0ec2c3x5a26e63118cb29a0@mail.gmail.com> <64ddb72c0912300315r420bd88dk5bb6be3a960bf44d@mail.gmail.com> <200912300926.05071.lists_ravi@lavabit.com> <64ddb72c0912301047t2dfd4059kd5bcb5a780e92dce@mail.gmail.com>
Message-ID: <3d375d730912301108m164072c5oe54940cf3d4af92a@mail.gmail.com>

On Wed, Dec 30, 2009 at 12:47, René Dudfield wrote:
> On Wed, Dec 30, 2009 at 2:26 PM, Ravi wrote:
>> On Wednesday 30 December 2009 06:15:45 René Dudfield wrote:
>>
>>> I agree with many things in that post. Except your conclusion on multiple versions of packages in isolation. Package isolation is like processes, and package sharing is like threads - and threads are evil!
>>
>> You have stated this several times, but is there any evidence that this is the desire of the majority of users? In the scientific community, interactive experimentation is critical and users are typically not seasoned systems administrators. For such users, almost all packages installed after installing python itself are packages they use. In particular, all I want to do is to use apt/yum to get the packages (or ask my sysadmin, who rightfully has no interest in learning the intricacies of python package installation, to do so) and continue with my work. "Packages-in-isolation" is for people whose job is to run server farms, not interactive experimenters.
>
> 500+ packages on pypi. Provide a counter point, otherwise the evidence is against your position - overwhelmingly.

Linux distributions, which are much, much more popular than any collection of packages on PyPI you might care to name. Isolated environments have their uses, but they are the exception, not the rule.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco

From renesd at gmail.com Wed Dec 30 14:10:50 2009
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Wed, 30 Dec 2009 19:10:50 +0000
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <3d375d730912301108m164072c5oe54940cf3d4af92a@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912290622m2f0ec2c3x5a26e63118cb29a0@mail.gmail.com> <64ddb72c0912300315r420bd88dk5bb6be3a960bf44d@mail.gmail.com> <200912300926.05071.lists_ravi@lavabit.com> <64ddb72c0912301047t2dfd4059kd5bcb5a780e92dce@mail.gmail.com> <3d375d730912301108m164072c5oe54940cf3d4af92a@mail.gmail.com>
Message-ID: <64ddb72c0912301110g64580920o22826c9fdabce104@mail.gmail.com>

On Wed, Dec 30, 2009 at 7:08 PM, Robert Kern wrote:
> On Wed, Dec 30, 2009 at 12:47, René Dudfield wrote:
>> On Wed, Dec 30, 2009 at 2:26 PM, Ravi wrote:
>>> On Wednesday 30 December 2009 06:15:45 René Dudfield wrote:
>>>
>>>> I agree with many things in that post. Except your conclusion on multiple versions of packages in isolation. Package isolation is like processes, and package sharing is like threads - and threads are evil!
>>>
>>> You have stated this several times, but is there any evidence that this is the desire of the majority of users? In the scientific community, interactive experimentation is critical and users are typically not seasoned systems administrators. For such users, almost all packages installed after installing python itself are packages they use. In particular, all I want to do is to use apt/yum to get the packages (or ask my sysadmin, who rightfully has no interest in learning the intricacies of python package installation, to do so) and continue with my work. "Packages-in-isolation" is for people whose job is to run server farms, not interactive experimenters.
>>
>> 500+ packages on pypi. Provide a counter point, otherwise the evidence is against your position - overwhelmingly.
>
> Linux distributions, which are much, much more popular than any collection of packages on PyPI you might care to name. Isolated environments have their uses, but they are the exception, not the rule.

wrong. pypi has way more python packages than any linux distribution. 8500+ listed, compared to how many in debian?

From kwgoodman at gmail.com Wed Dec 30 14:13:18 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Wed, 30 Dec 2009 11:13:18 -0800
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <64ddb72c0912301110g64580920o22826c9fdabce104@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912290622m2f0ec2c3x5a26e63118cb29a0@mail.gmail.com> <64ddb72c0912300315r420bd88dk5bb6be3a960bf44d@mail.gmail.com> <200912300926.05071.lists_ravi@lavabit.com> <64ddb72c0912301047t2dfd4059kd5bcb5a780e92dce@mail.gmail.com> <3d375d730912301108m164072c5oe54940cf3d4af92a@mail.gmail.com> <64ddb72c0912301110g64580920o22826c9fdabce104@mail.gmail.com>
Message-ID: 

On Wed, Dec 30, 2009 at 11:10 AM, René Dudfield wrote:
> On Wed, Dec 30, 2009 at 7:08 PM, Robert Kern wrote:
>> On Wed, Dec 30, 2009 at 12:47, René Dudfield wrote:
>>> On Wed, Dec 30, 2009 at 2:26 PM, Ravi wrote:
>>>> On Wednesday 30 December 2009 06:15:45 René Dudfield wrote:
>>>>
>>>>> I agree with many things in that post. Except your conclusion on multiple versions of packages in isolation.
>>>>> Package isolation is like processes, and package sharing is like threads - and threads are evil!
>>>>
>>>> You have stated this several times, but is there any evidence that this is the desire of the majority of users? In the scientific community, interactive experimentation is critical and users are typically not seasoned systems administrators. For such users, almost all packages installed after installing python itself are packages they use. In particular, all I want to do is to use apt/yum to get the packages (or ask my sysadmin, who rightfully has no interest in learning the intricacies of python package installation, to do so) and continue with my work. "Packages-in-isolation" is for people whose job is to run server farms, not interactive experimenters.
>>>
>>> 500+ packages on pypi. Provide a counter point, otherwise the evidence is against your position - overwhelmingly.
>>
>> Linux distributions, which are much, much more popular than any collection of packages on PyPI you might care to name. Isolated environments have their uses, but they are the exception, not the rule.
>
> wrong. pypi has way more python packages than any linux distribution. 8500+ listed, compared to how many in debian?

Debian has over 30k packages. But I think he was talking about popularity, not the number of packages.

From kwgoodman at gmail.com Wed Dec 30 14:14:41 2009
From: kwgoodman at gmail.com (Keith Goodman)
Date: Wed, 30 Dec 2009 11:14:41 -0800
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: 
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912290622m2f0ec2c3x5a26e63118cb29a0@mail.gmail.com> <64ddb72c0912300315r420bd88dk5bb6be3a960bf44d@mail.gmail.com> <200912300926.05071.lists_ravi@lavabit.com> <64ddb72c0912301047t2dfd4059kd5bcb5a780e92dce@mail.gmail.com> <3d375d730912301108m164072c5oe54940cf3d4af92a@mail.gmail.com> <64ddb72c0912301110g64580920o22826c9fdabce104@mail.gmail.com>
Message-ID: 

On Wed, Dec 30, 2009 at 11:13 AM, Keith Goodman wrote:
> On Wed, Dec 30, 2009 at 11:10 AM, René Dudfield wrote:
>> On Wed, Dec 30, 2009 at 7:08 PM, Robert Kern wrote:
>>> On Wed, Dec 30, 2009 at 12:47, René Dudfield wrote:
>>>> On Wed, Dec 30, 2009 at 2:26 PM, Ravi wrote:
>>>>> On Wednesday 30 December 2009 06:15:45 René Dudfield wrote:
>>>>>
>>>>>> I agree with many things in that post. Except your conclusion on multiple versions of packages in isolation. Package isolation is like processes, and package sharing is like threads - and threads are evil!
>>>>>
>>>>> You have stated this several times, but is there any evidence that this is the desire of the majority of users? In the scientific community, interactive experimentation is critical and users are typically not seasoned systems administrators. For such users, almost all packages installed after installing python itself are packages they use. In particular, all I want to do is to use apt/yum to get the packages (or ask my sysadmin, who rightfully has no interest in learning the intricacies of python package installation, to do so) and continue with my work. "Packages-in-isolation" is for people whose job is to run server farms, not interactive experimenters.
>>>>
>>>> 500+ packages on pypi. Provide a counter point, otherwise the evidence is against your position - overwhelmingly.
>>>
>>> Linux distributions, which are much, much more popular than any collection of packages on PyPI you might care to name. Isolated environments have their uses, but they are the exception, not the rule.
>>
>> wrong. pypi has way more python packages than any linux distribution. 8500+ listed, compared to how many in debian?
>
> Debian has over 30k packages. But I think he was talking about popularity, not the number of packages.

Oh, 30k is all packages, not just python.

From robert.kern at gmail.com Wed Dec 30 14:15:03 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 30 Dec 2009 13:15:03 -0600
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <64ddb72c0912301110g64580920o22826c9fdabce104@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912290622m2f0ec2c3x5a26e63118cb29a0@mail.gmail.com> <64ddb72c0912300315r420bd88dk5bb6be3a960bf44d@mail.gmail.com> <200912300926.05071.lists_ravi@lavabit.com> <64ddb72c0912301047t2dfd4059kd5bcb5a780e92dce@mail.gmail.com> <3d375d730912301108m164072c5oe54940cf3d4af92a@mail.gmail.com> <64ddb72c0912301110g64580920o22826c9fdabce104@mail.gmail.com>
Message-ID: <3d375d730912301115v66b34ebcl61d078bb1a512faa@mail.gmail.com>

On Wed, Dec 30, 2009 at 13:10, René Dudfield wrote:
> On Wed, Dec 30, 2009 at 7:08 PM, Robert Kern wrote:
>> On Wed, Dec 30, 2009 at 12:47, René Dudfield wrote:
>>> 500+ packages on pypi. Provide a counter point, otherwise the evidence is against your position - overwhelmingly.
>>
>> Linux distributions, which are much, much more popular than any collection of packages on PyPI you might care to name. Isolated environments have their uses, but they are the exception, not the rule.
>
> wrong. pypi has way more python packages than any linux distribution. 8500+ listed, compared to how many in debian?

I said "more popular". As in "more users", not "more packages". But if you insist, Debian has ~30000 or so, depending on the architecture and release and how you count.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco

From charlesr.harris at gmail.com Wed Dec 30 14:21:15 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 30 Dec 2009 12:21:15 -0700
Subject: [Numpy-discussion] Empty strings not empty?
In-Reply-To: <1e2af89e0912301100r5e18a0d0kd3335624c1280a41@mail.gmail.com>
References: <1e2af89e0912291535v74e27504t6e3b9f1910bb492@mail.gmail.com> <1e2af89e0912301100r5e18a0d0kd3335624c1280a41@mail.gmail.com>
Message-ID: 

On Wed, Dec 30, 2009 at 12:00 PM, Matthew Brett wrote:

> Hi.
>
> > It isn't empty:
> >
> > In [3]: array(['\x00']).dtype
> > Out[3]: dtype('|S1')
> >
> > In [4]: array(['\x00']).tostring()
> > Out[4]: '\x00'
> >
> > In [5]: array(['\x00'])[0]
> > Out[5]: ''
>
> No, but my problem was that an empty string is not empty either, and that you can't therefore distinguish between an empty string and a string with all 0 bytes:
>
> In [11]: np.array('') == '\x00\x00\x00'
> Out[11]: array(True, dtype=bool)
>
> > Looks like a printing problem to me, something in __repr__ for the string array. It seems that trailing zeros are trimmed off.
> >
> > In [11]: array(['a\x00\x00'])
> > Out[11]:
> > array(['a'],
> >       dtype='|S3')
> >
> > In [12]: array(['a\x00b'])
> > Out[12]:
> > array(['a\x00b'],
> >       dtype='|S3')
>
> I don't think it's a printing problem, I think it's that the trailing zeros are pulled off in the string comparisons, and for printing, even though they are present in memory. I mean, that a.tostring() is right, and the __repr__ and comparisons are - at least to me - confusing.
>
> In [2]: a = np.array('a\x00\x00\x00')
>
> In [3]: a
> Out[3]:
> array('a',
>       dtype='|S4')
>
> In [5]: a == 'a'
> Out[5]: array(True, dtype=bool)
>
> In [7]: a == 'a\x00\x00\x00'
> Out[7]: array(True, dtype=bool)

That is due to type promotion for the ufunc call:

In [17]: a1 = np.array('a\x00\x00\x00')

In [21]: np.array(['a'], dtype=a1.dtype)[0]
Out[21]: 'a'

In [22]: np.array(['a'], dtype=a1.dtype).tostring()
Out[22]: 'a\x00\x00\x00'

Chuck
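[A short, self-contained demonstration of the padding behaviour discussed in this thread: fixed-width numpy strings are null padded, so the shorter operand of a comparison is promoted by appending zero bytes, while a leading or embedded zero byte stays significant.]

import numpy as np

a = np.array('a\x00\x00\x00')    # dtype '|S4'; repr shows only 'a'
print(a == 'a')                  # True: 'a' is promoted to 'a\x00\x00\x00'
print(repr(a.tostring()))        # the buffer still holds all four bytes

b = np.array('\x00a')            # embedded/leading zero, dtype '|S2'
print(b == 'a')                  # False: padding only happens at the end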
From dsdale24 at gmail.com Wed Dec 30 16:06:56 2009
From: dsdale24 at gmail.com (Darren Dale)
Date: Wed, 30 Dec 2009 16:06:56 -0500
Subject: [Numpy-discussion] [SciPy-dev] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <5b8d13220912300816s12c934adh4abdd6d703f8928f@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912300816s12c934adh4abdd6d703f8928f@mail.gmail.com>
Message-ID: 

On Wed, Dec 30, 2009 at 11:16 AM, David Cournapeau wrote:
> On Wed, Dec 30, 2009 at 11:26 PM, Darren Dale wrote:
>> Hi David,
>>
>> On Mon, Dec 28, 2009 at 9:03 AM, David Cournapeau wrote:
>>> Executable: grin
>>>     module: grin
>>>     function: grin_main
>>>
>>> Executable: grind
>>>     module: grin
>>>     function: grind_main
>>
>> Have you thought at all about operations that are currently performed by post-installation scripts? For example, it might be desirable for the ipython or MayaVi windows installers to create a folder in the Start menu that contains links to the executable and the documentation. This is probably a secondary issue at this point in toydist's development, but I think it is an important feature in the long run.
>>
>> Also, have you considered support for package extras (package variants in Ports, allowing you to specify features that pull in additional dependencies like traits[qt4])? Enthought makes good use of them in ETS, and I think they would be worth keeping.
>
> Does this example cover what you have in mind? I am not so familiar with this feature of setuptools:
>
> Name: hello
> Version: 1.0
>
> Library:
>     BuildRequires: paver, sphinx, numpy
>     if os(windows)
>         BuildRequires: pywin32
>     Packages:
>         hello
>     Extension: hello._bar
>         sources:
>             src/hellomodule.c
>     if os(linux)
>         Extension: hello._linux_backend
>             sources:
>                 src/linbackend.c
>
> Note that instead of os(os_name), you can use flag(flag_name), where flags are boolean variables which can be user defined:
>
> http://github.com/cournape/toydist/blob/master/examples/simples/conditional/toysetup.info
>
> http://github.com/cournape/toydist/blob/master/examples/var_example/toysetup.info

I should defer to the description of extras in the setuptools documentation. It is only a few paragraphs long:

http://peak.telecommunity.com/DevCenter/setuptools#declaring-extras-optional-features-with-their-own-dependencies

Darren
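[For readers unfamiliar with the feature Darren refers to, a minimal setup.py using setuptools extras might look as follows; the package and dependency names here are made up for illustration. Installing 'mypackage[qt4]' pulls in the extra dependency, while a plain install does not.]

from setuptools import setup

setup(
    name='mypackage',
    version='1.0',
    packages=['mypackage'],
    install_requires=['numpy'],
    # optional features and the dependencies they pull in;
    # requested as e.g. 'easy_install mypackage[qt4]'
    extras_require={
        'qt4': ['PyQt4'],
    },
)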
From eemselle at eso.org Wed Dec 30 18:05:17 2009
From: eemselle at eso.org (Eric Emsellem)
Date: Thu, 31 Dec 2009 00:05:17 +0100
Subject: [Numpy-discussion] Complex slicing and take
Message-ID: <4B3BDCAD.2090407@eso.org>

Thanks! will try that and see how the performance varies depending on the size of my arrays.

thanks again!

Eric

> Constructing big intermediate arrays might not improve performance compared to a loop
>
>>>> np.arange(30).reshape(6,5)
> array([[ 0,  1,  2,  3,  4],
>        [ 5,  6,  7,  8,  9],
>        [10, 11, 12, 13, 14],
>        [15, 16, 17, 18, 19],
>        [20, 21, 22, 23, 24],
>        [25, 26, 27, 28, 29]])
>
>>>> np.arange(30).reshape(6,5)[np.array([[1,2,2,1]]).T, np.arange(0,3)+np.array([[0,1,2,1]]).T]
> array([[ 5,  6,  7],
>        [11, 12, 13],
>        [12, 13, 14],
>        [ 6,  7,  8]])
>
>>>> np.arange(30).reshape(6,5)[np.array([[1,2,2,1]]).T, np.arange(0,3)+np.array([[0,1,2,1]]).T].sum(0)
> array([34, 38, 42])
>
> Josef

From warren.weckesser at enthought.com Wed Dec 30 18:23:20 2009
From: warren.weckesser at enthought.com (Warren Weckesser)
Date: Wed, 30 Dec 2009 17:23:20 -0600
Subject: [Numpy-discussion] linalg.py det of transpose
In-Reply-To: 
References: 
Message-ID: <4B3BE0E8.4030105@enthought.com>

paul wrote:
> In numpy/linalg/linalg.py, the function det(a) does a fastCopyAndTranspose of a before calling lapack. Is that necessary, as det(a.T) = det(a)?

If you look further down in the function (or read the docstring), you'll see that det(a) computes the determinant by first performing an LU decomposition of the array. It does this using the Lapack function dgetrf or zgetrf. These functions perform the factorization 'in-place', so it is necessary to make a copy first. I don't know why the "CopyAndTranspose" function is used--maybe it is faster--but, as you point out, the fact that the copy has been transposed should not affect the result.

Warren

From cournape at gmail.com Wed Dec 30 19:19:27 2009
From: cournape at gmail.com (David Cournapeau)
Date: Thu, 31 Dec 2009 09:19:27 +0900
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <64ddb72c0912301047t2dfd4059kd5bcb5a780e92dce@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912290622m2f0ec2c3x5a26e63118cb29a0@mail.gmail.com> <64ddb72c0912300315r420bd88dk5bb6be3a960bf44d@mail.gmail.com> <200912300926.05071.lists_ravi@lavabit.com> <64ddb72c0912301047t2dfd4059kd5bcb5a780e92dce@mail.gmail.com>
Message-ID: <5b8d13220912301619n42c6f577r375c172975849a7f@mail.gmail.com>

On Thu, Dec 31, 2009 at 3:47 AM, René Dudfield wrote:
> On Wed, Dec 30, 2009 at 2:26 PM, Ravi wrote:
>> On Wednesday 30 December 2009 06:15:45 René Dudfield wrote:
>>
>>> I agree with many things in that post. Except your conclusion on multiple versions of packages in isolation. Package isolation is like processes, and package sharing is like threads - and threads are evil!
>>
>> You have stated this several times, but is there any evidence that this is the desire of the majority of users? In the scientific community, interactive experimentation is critical and users are typically not seasoned systems administrators. For such users, almost all packages installed after installing python itself are packages they use. In particular, all I want to do is to use apt/yum to get the packages (or ask my sysadmin, who rightfully has no interest in learning the intricacies of python package installation, to do so) and continue with my work. "Packages-in-isolation" is for people whose job is to run server farms, not interactive experimenters.
>
> 500+ packages on pypi. Provide a counter point, otherwise the evidence is against your position - overwhelmingly.

Number of packages is a useless metric for measuring the success of something like pypi. I don't even know why someone would think it is an interesting number. Note that CRAN has several times more packages, and the R community is much smaller than python's, if you care about that number. Haskell has ~2000 packages on hackageDB ("haskell's pypi"), and the haskell community is much smaller than python's.

You really should not try to make the point that pypi is working for the scipy community: I know there is a bias in conferences and mailing lists, but the consensus is vastly toward "pypi does not work very well for us". The issue keeps coming up. I think trying to convince us otherwise is counterproductive at best.

David

From Chris.Barker at noaa.gov Wed Dec 30 20:50:06 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Wed, 30 Dec 2009 17:50:06 -0800
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <5b8d13220912301619n42c6f577r375c172975849a7f@mail.gmail.com>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912290622m2f0ec2c3x5a26e63118cb29a0@mail.gmail.com> <64ddb72c0912300315r420bd88dk5bb6be3a960bf44d@mail.gmail.com> <200912300926.05071.lists_ravi@lavabit.com> <64ddb72c0912301047t2dfd4059kd5bcb5a780e92dce@mail.gmail.com> <5b8d13220912301619n42c6f577r375c172975849a7f@mail.gmail.com>
Message-ID: <4B3C034E.3010202@noaa.gov>

David Cournapeau wrote:
> You really should not try to make the point that Pypi is working for the scipy community:

I think the evidence supports that pypi is useful, and therefore better than nothing -- which in no way means it couldn't be much better.

My personal experience is that I always try:

easy_install whatever

first, and when it works, I'm happy and a bit surprised. It virtually never works for more complex packages, particularly on OS-X:

 - PIL
 - scipy
 - matplotlib
 - netcdf4
 - gdal

I don't know if wxPython is on PyPi, I've never even tried. Many of these fail because of other non-python dependencies.

So yes -- we really could use something better!

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From Chris.Barker at noaa.gov Wed Dec 30 21:08:02 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Wed, 30 Dec 2009 18:08:02 -0800
Subject: [Numpy-discussion] Empty strings not empty?
In-Reply-To: 
References: <1e2af89e0912291535v74e27504t6e3b9f1910bb492@mail.gmail.com> <1e2af89e0912301100r5e18a0d0kd3335624c1280a41@mail.gmail.com>
Message-ID: <4B3C0782.1050304@noaa.gov>

Charles R Harris wrote:
> That is due to type promotion for the ufunc call:
>
> In [17]: a1 = np.array('a\x00\x00\x00')
>
> In [21]: np.array(['a'], dtype=a1.dtype)[0]
> Out[21]: 'a'
>
> In [22]: np.array(['a'], dtype=a1.dtype).tostring()
> Out[22]: 'a\x00\x00\x00'

It took me a bit to figure out what this meant, so in case I'm not the only one, I thought I'd spell it out:

In [3]: s1 = np.array('a')

In [4]: s1.dtype
Out[4]: dtype('|S1')

so s1's dtype is a length-1 string

In [11]: s2 = np.array('a\x00\x00')

In [12]: s2.dtype
Out[12]: dtype('|S3')

and s2's is a length-3 string

In [13]: s1 == s2
Out[13]: array(True, dtype=bool)

When they are compared, s1's dtype is coerced to a length-3 string by padding with nulls, and thus they compare equal. Otherwise, there is nothing special about zero bytes in a string:

In [14]: s3 = np.array('\x00a\x00')

In [15]: s3 == s2
Out[15]: array(False, dtype=bool)

In [16]: s3 == s1
Out[16]: array(False, dtype=bool)

The problem is that zero bytes are the only way to pad a string. I suppose the comparison could be smarter, by comparing without coercing, but that may not be possible within the ufunc machinery.

As for printing, I think it simply reflects that numpy strings are null padded, and most people probably wouldn't want to see all those nulls every time.

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From cournape at gmail.com Wed Dec 30 23:05:24 2009
From: cournape at gmail.com (David Cournapeau)
Date: Thu, 31 Dec 2009 13:05:24 +0900
Subject: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <4B3C034E.3010202@noaa.gov>
References: <5b8d13220912280603p7221a264o875b0d5e74a5404@mail.gmail.com> <5b8d13220912290622m2f0ec2c3x5a26e63118cb29a0@mail.gmail.com> <64ddb72c0912300315r420bd88dk5bb6be3a960bf44d@mail.gmail.com> <200912300926.05071.lists_ravi@lavabit.com> <64ddb72c0912301047t2dfd4059kd5bcb5a780e92dce@mail.gmail.com> <5b8d13220912301619n42c6f577r375c172975849a7f@mail.gmail.com> <4B3C034E.3010202@noaa.gov>
Message-ID: <5b8d13220912302005w724d0015p78f3d1e69f9c83bd@mail.gmail.com>

On Thu, Dec 31, 2009 at 10:50 AM, Christopher Barker wrote:
> David Cournapeau wrote:
>> You really should not try to make the point that Pypi is working for the scipy community:
>
> I think the evidence supports that pypi is useful

Yes - I stressed that it was not working well for the scipy community, not that it was not working at all for python.

> , and therefore better than nothing -- which in no way means it couldn't be much better.
>
> My personal experience is that I always try:
>
> easy_install whatever
>
> first, and when it works, I'm happy and a bit surprised.

To say that you are happy when it works is telling about your low expectations, no?

My main point of disagreement with most discussions on distutils-sig is that I think the lack of robustness is rooted in the way the tools are conceived and used, whereas others think it can be fixed by adding more features.
This, and the refusal to learn how other communities do it, is why I am considering starting with my own solution.

David

From matthew.brett at gmail.com Thu Dec 31 05:13:51 2009
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 31 Dec 2009 10:13:51 +0000
Subject: [Numpy-discussion] Empty strings not empty?
In-Reply-To: <4B3C0782.1050304@noaa.gov>
References: <1e2af89e0912291535v74e27504t6e3b9f1910bb492@mail.gmail.com> <1e2af89e0912301100r5e18a0d0kd3335624c1280a41@mail.gmail.com> <4B3C0782.1050304@noaa.gov>
Message-ID: <1e2af89e0912310213u2401e0aci6acf5c7a7bc89484@mail.gmail.com>

Hi,

On Thu, Dec 31, 2009 at 2:08 AM, Christopher Barker wrote:
> Charles R Harris wrote:
>> That is due to type promotion for the ufunc call:
>>
>> In [17]: a1 = np.array('a\x00\x00\x00')
>>
>> In [21]: np.array(['a'], dtype=a1.dtype)[0]
>> Out[21]: 'a'
>>
>> In [22]: np.array(['a'], dtype=a1.dtype).tostring()
>> Out[22]: 'a\x00\x00\x00'
>
> it took me a bit to figure out what this meant, so in case I'm not the only one, I thought I'd spell it out:

I think the summary here is 'numpy strings are zero padded; therefore you may run into surprises with a string that has trailing zeros'.

I see why that is - the zero terminator is the only way for numpy arrays to see where the end of the string is...

Best,

Matthew

From Chris.Barker at noaa.gov Thu Dec 31 19:35:23 2009
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Thu, 31 Dec 2009 16:35:23 -0800
Subject: [Numpy-discussion] Empty strings not empty?
In-Reply-To: <1e2af89e0912310213u2401e0aci6acf5c7a7bc89484@mail.gmail.com>
References: <1e2af89e0912291535v74e27504t6e3b9f1910bb492@mail.gmail.com> <1e2af89e0912301100r5e18a0d0kd3335624c1280a41@mail.gmail.com> <4B3C0782.1050304@noaa.gov> <1e2af89e0912310213u2401e0aci6acf5c7a7bc89484@mail.gmail.com>
Message-ID: <4B3D434B.9080806@noaa.gov>

Matthew Brett wrote:
> I think the summary here is 'numpy strings are zero padded; therefore you may run into surprises with a string that has trailing zeros'.
>
> I see why that is - the zero terminator is the only way for numpy arrays to see where the end of the string is...

almost -- it's not quite zero-terminated, you can have embedded zeros:

In [35]: np.array('aa\x00bb', dtype='S6')
Out[35]:
array('aa\x00bb',
      dtype='|S6')

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov