From ralf.gommers at gmail.com Mon Apr 1 07:58:36 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 1 Apr 2013 13:58:36 +0200 Subject: [Numpy-discussion] NumPy/SciPy participation in GSoC 2013 In-Reply-To: References: Message-ID: On Tue, Mar 26, 2013 at 12:27 AM, Ralf Gommers wrote: > > > > On Thu, Mar 21, 2013 at 10:20 PM, Ralf Gommers wrote: > >> Hi all, >> >> It is the time of the year for Google Summer of Code applications. If we >> want to participate with Numpy and/or Scipy, we need two things: enough >> mentors and ideas for projects. If we get those, we'll apply under the PSF >> umbrella. They've outlined the timeline they're working by and guidelines >> at >> http://pyfound.blogspot.nl/2013/03/get-ready-for-google-summer-of-code.html. >> >> >> We should be able to come up with some interesting project ideas I'd >> think, let's put those at >> http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas. Preferably with >> enough detail to be understandable for people new to the projects and a >> proposed mentor. >> >> We need at least 3 people willing to mentor a student. Ideally we'd have >> enough mentors this week, so we can apply to the PSF on time. If you're >> willing to be a mentor, please send me the following: name, email address, >> phone nr, and what you're interested in mentoring. If you have time >> constaints and have doubts about being able to be a primary mentor, being a >> backup mentor would also be helpful. >> > > So far we've only got one primary mentor (thanks Chuck!), most core devs > do not seem to have the bandwidth this year. If there are other people > interested in mentoring please let me know. If not, then it looks like > we're not participating this year. > Hi all, an update on GSoC'13. We do have enough mentoring power after all; NumPy/SciPy is now registered as a participating project on the PSF page: http://wiki.python.org/moin/SummerOfCode/2013 Prospective students: please have a look at http://wiki.python.org/moin/SummerOfCode/Expectations and at http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas. In particular note that we require you to make one pull request to NumPy/SciPy which has to be merged *before* the application deadline (May 3). So please start thinking about that, and start a discussion on your project idea on this list. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From toddrjen at gmail.com Mon Apr 1 08:27:26 2013 From: toddrjen at gmail.com (Todd) Date: Mon, 1 Apr 2013 14:27:26 +0200 Subject: [Numpy-discussion] [SciPy-Dev] NumPy/SciPy participation in GSoC 2013 In-Reply-To: References: Message-ID: On Mon, Apr 1, 2013 at 1:58 PM, Ralf Gommers wrote: > > > > On Tue, Mar 26, 2013 at 12:27 AM, Ralf Gommers wrote: > >> >> >> >> On Thu, Mar 21, 2013 at 10:20 PM, Ralf Gommers wrote: >> >>> Hi all, >>> >>> It is the time of the year for Google Summer of Code applications. If we >>> want to participate with Numpy and/or Scipy, we need two things: enough >>> mentors and ideas for projects. If we get those, we'll apply under the PSF >>> umbrella. They've outlined the timeline they're working by and guidelines >>> at >>> http://pyfound.blogspot.nl/2013/03/get-ready-for-google-summer-of-code.html. >>> >>> >>> We should be able to come up with some interesting project ideas I'd >>> think, let's put those at >>> http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas. Preferably with >>> enough detail to be understandable for people new to the projects and a >>> proposed mentor. 
>>> >>> We need at least 3 people willing to mentor a student. Ideally we'd have >>> enough mentors this week, so we can apply to the PSF on time. If you're >>> willing to be a mentor, please send me the following: name, email address, >>> phone nr, and what you're interested in mentoring. If you have time >>> constaints and have doubts about being able to be a primary mentor, being a >>> backup mentor would also be helpful. >>> >> >> So far we've only got one primary mentor (thanks Chuck!), most core devs >> do not seem to have the bandwidth this year. If there are other people >> interested in mentoring please let me know. If not, then it looks like >> we're not participating this year. >> > > Hi all, an update on GSoC'13. We do have enough mentoring power after all; > NumPy/SciPy is now registered as a participating project on the PSF page: > http://wiki.python.org/moin/SummerOfCode/2013 > > Prospective students: please have a look at > http://wiki.python.org/moin/SummerOfCode/Expectations and at > http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas. In particular > note that we require you to make one pull request to NumPy/SciPy which has > to be merged *before* the application deadline (May 3). So please start > thinking about that, and start a discussion on your project idea on this > list. > > Cheers, > Ralf > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > There were a number of other ideas in this thread: http://mail.scipy.org/pipermail/numpy-discussion/2013-March/065699.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Apr 1 13:23:39 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 01 Apr 2013 19:23:39 +0200 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: Message-ID: <1364837019.2404.12.camel@sebastian-laptop> On Sun, 2013-03-31 at 14:04 -0700, Matthew Brett wrote: > Hi, > > On Sun, Mar 31, 2013 at 1:43 PM, wrote: > > On Sun, Mar 31, 2013 at 3:54 PM, Matthew Brett wrote: > >> Hi, > >> > >> On Sat, Mar 30, 2013 at 10:38 PM, wrote: > >>> On Sun, Mar 31, 2013 at 12:50 AM, Matthew Brett wrote: > >>>> Hi, > >>>> > >>>> On Sat, Mar 30, 2013 at 9:37 PM, wrote: > >>>>> On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett wrote: > >>>>>> Hi, > >>>>>> > >>>>>> On Sat, Mar 30, 2013 at 7:02 PM, wrote: > >>>>>>> On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett wrote: > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>> On Sat, Mar 30, 2013 at 7:50 PM, wrote: > >>>>>>>>> On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle > >>>>>>>>> wrote: > >>>>>>>>>> On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett > >>>>>>>>>> wrote: > >>>>>>>>>>> > >>>>>>>>>>> On Sat, Mar 30, 2013 at 2:20 PM, wrote: > >>>>>>>>>>> > On Sat, Mar 30, 2013 at 4:57 PM, wrote: > >>>>>>>>>>> >> On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett > >>>>>>>>>>> >> wrote: > >>>>>>>>>>> >>> On Sat, Mar 30, 2013 at 4:14 AM, wrote: > >>>>>>>>>>> >>>> On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett > >>>>>>>>>>> >>>> wrote: > >>>>>>>>>>> >>>>> > >>>>>>>>>>> >>>>> Ravel and reshape use the tems 'C' and 'F" in the sense of index > >>>>>>>>>>> >>>>> ordering. > >>>>>>>>>>> >>>>> > >>>>>>>>>>> >>>>> This is very confusing. 
We think the index ordering and memory > >>>>>>>>>>> >>>>> ordering ideas need to be separated, and specifically, we should > >>>>>>>>>>> >>>>> avoid > >>>>>>>>>>> >>>>> using "C" and "F" to refer to index ordering. > >>>>>>>>>>> >>>>> > >>>>>>>>>>> >>>>> Proposal > >>>>>>>>>>> >>>>> ------------- > >>>>>>>>>>> >>>>> > >>>>>>>>>>> >>>>> * Deprecate the use of "C" and "F" meaning backwards and forwards > >>>>>>>>>>> >>>>> index ordering for ravel, reshape > >>>>>>>>>>> >>>>> * Prefer "Z" and "N", being graphical representations of unraveling > >>>>>>>>>>> >>>>> in > >>>>>>>>>>> >>>>> 2 dimensions, axis1 first and axis0 first respectively (excellent > >>>>>>>>>>> >>>>> naming idea by Paul Ivanov) > >>>>>>>>>>> >>>>> > >>>>>>>>>>> >>>>> What do y'all think? > >>>>>>>>>>> >>>> > >>>>>>>>>>> >>>> I always thought "F" and "C" are easy to understand, I always thought > >>>>>>>>>>> >>>> about > >>>>>>>>>>> >>>> the content and never about the memory when using it. > >>>>>>>>>>> >> > >>>>>>>>>>> >> changing the names doesn't make it easier to understand. > >>>>>>>>>>> >> I think the confusion is because the new A and K refer to existing > >>>>>>>>>>> >> memory > >>>>>>>>>>> >> > >>>>>>>>>>> > >>>>>>>>>>> I disagree, I think it's confusing, but I have evidence, and that is > >>>>>>>>>>> that four out of four of us tested ourselves and got it wrong. > >>>>>>>>>>> > >>>>>>>>>>> Perhaps we are particularly dumb or poorly informed, but I think it's > >>>>>>>>>>> rash to assert there is no problem here. > >>>>>>>>> > >>>>>>>>> I think you are overcomplicating things or phrased it as a "trick question" > >>>>>>>> > >>>>>>>> I don't know what you mean by trick question - was there something > >>>>>>>> over-complicated in the example? I deliberately didn't include > >>>>>>>> various much more confusing examples in "reshape". > >>>>>>> > >>>>>>> I meant making the "candidates" think about memory instead of just > >>>>>>> column versus row stacking. > >>>>>> > >>>>>> To be specific, we were teaching about reshaping a (I, J, K, N) 4D > >>>>>> array, it was an image, with time as the 4th dimension (N time > >>>>>> points). Raveling and reshaping 3D and 4D arrays is a common thing > >>>>>> to do in neuroimaging, as you can imagine. > >>>>>> > >>>>>> A student asked what he would get back from raveling this array, a > >>>>>> concatenated time series, or something spatial? > >>>>>> > >>>>>> We showed (I'd worked it out by this time) that the first N values > >>>>>> were the time series given by [0, 0, 0, :]. > >>>>>> > >>>>>> He said - "Oh - I see - so the data is stored as a whole lot of time > >>>>>> series one by one, I thought it would be stored as a series of > >>>>>> images'. > >>>>>> > >>>>>> Ironically, this was a Fortran-ordered array in memory, and he was wrong. > >>>>>> > >>>>>> So, I think the idea of memory ordering and index ordering is very > >>>>>> easy to confuse, and comes up naturally. > >>>>>> > >>>>>> I would like, as a teacher, to be able to say something like: > >>>>>> > >>>>>> This is what C memory layout is (it's the memory layout that gives > >>>>>> arr.flags.C_CONTIGUOUS=True) > >>>>>> This is what F memory layout is (it's the memory layout that gives > >>>>>> arr.flags.F_CONTIGUOUS=True) > >>>>>> It's rather easy to get something that is neither C or F memory layout > >>>>>> Numpy does many memory layouts. > >>>>>> Ravel and reshape and numpy in general do not care (normally) about C > >>>>>> or F layouts, they only care about index ordering. 
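A minimal concrete version of that checklist (a sketch only; the flags and the
identical ravel results are numpy's documented behavior, the variable names are
just illustration):

In [1]: import numpy as np

In [2]: c = np.arange(6).reshape((2, 3))   # C memory layout

In [3]: f = np.asfortranarray(c)           # F memory layout, same values

In [4]: c.flags.c_contiguous, f.flags.f_contiguous
Out[4]: (True, True)

In [5]: neither = c[:, :2]                 # a view that is neither C nor F contiguous

In [6]: neither.flags.c_contiguous, neither.flags.f_contiguous
Out[6]: (False, False)

In [7]: c.ravel('F')                       # 'F' here is *index* order ...
Out[7]: array([0, 3, 1, 4, 2, 5])

In [8]: f.ravel('F')                       # ... so the memory layout makes no difference
Out[8]: array([0, 3, 1, 4, 2, 5])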
> >>>>>> > >>>>>> My point, that I'm repeating, is that my job is made harder by > >>>>>> 'arr.ravel('F')'. > >>>>> > >>>>> But once you know that ravel and reshape don't care about memory, the > >>>>> ravel is easy to predict (maybe not easy to visualize in 4-D): > >>>> > >>>> But this assumes that you already know that there's such a thing as > >>>> memory layout, and there's such a thing as index ordering, and that > >>>> 'C' and 'F' in ravel refer to index ordering. Once you have that, > >>>> you're golden. I'm arguing it's markedly harder to get this > >>>> distinction, and keep it in mind, and teach it, if we are using the > >>>> 'C' and 'F" names for both things. > >>> > >>> No, I think you are still missing my point. > >>> I think explaining ravel and reshape F and C is easy (kind of) because the > >>> students don't need to know at that stage about memory layouts. > >>> > >>> All they need to know is that we look at n-dimensional objects in > >>> C-order or in F-order > >>> (whichever index runs fastest) > >> > >> Would you accept that it may or may not be true that it is desirable > >> or practical not to mention memory layouts when teaching numpy? > > > > I think they should be in two different sections. > > > > basic usage: > > ravel, reshape in pure index order, and indexing, broadcasting, ... > > > > advanced usage: > > memory layout and some ability to predict when you get a view and > > when you get a copy. > > Right - that is what you think - but I was asking - do you agree that > it's possible that that is not best way to teach it? > > What evidence would you give that it was the best way to teach it? > > > And I still think words can mean different things in different context > > (with a qualifier maybe) > > indexing in fortran order > > memory in fortran order > > Right - but you'd probably also accept that using the same word for > different and related things is likely to cause confusion? I'm sure > we could come up with some experimental evidence for that if you do > doubt it. > > > Disclaimer: I never tried to teach numpy > > and with GSOC students my explanations only went a little bit > > beyond what they needed to know for the purpose at hand (I hope) > > > >> > >> You believe it is desirable, I believe that it is not - that teaching > >> numpy naturally involves some discussion of memory layout. > >> > >> As evidence: > >> > >> * My student, without any prompting about memory layouts, is asking about it > >> * Travis' numpy book has a very early section on this (section 2.3 - > >> memory layout) > >> * I often think about memory layouts, and from your discussion, you do > >> too. It's uncommon that you don't have to teach something that > >> experienced users think about often. > > > > I'm mentioning memory layout because I'm talking to you. > > I wouldn't talk about memory layout if I would try to explain ravel, > > reshape and indexing for the first time to a student. > > > >> * The most common use of 'order' only refers to memory layout. For > >> example np.array "order" doesn't refer to index ordering but to memory > >> layout. > > > > No, as I tried to show with the statsmodels example. > > I don't require GSOC students (that are relatively new to numpy) to understand > > much about memory layout. > > The only use of ``order`` in statsmodels refers to *index* order in > > ravel and reshape. > > > >> * The current docstring of 'reshape' cannot be explained without > >> referring to memory order. > > > > really ? 
> > I thought reshape only refers to *index* order for "F" and "C" > > Here's the docstring for 'reshape': > > order : {'C', 'F', 'A'}, optional > Determines whether the array data should be viewed as in C > (row-major) order, FORTRAN (column-major) order, or the C/FORTRAN > order should be preserved. > > The 'A' option cannot be explained without reference to 'C' or 'F' > *memory* layout - i.e. a different meaning of the 'C' and 'F" in the > indexing interpretation. > > Actually, as a matter of interest - how would you explain the behavior > of 'A' when the array is neither 'C' or 'F' memory layout? Maybe that > could be a good test case? > The 'A' means C-order unless `ndarray.flags.fnc == True` (which means "fortran not C"). The detail about "not C" should not matter really for copies, for reshape it should maybe be mentioned more clearly. Though honestly, reshaping with 'A' seems so weird to me, I doubt anyone ever does it. As for ravel... you can probably just as well use 'K' instead which is even less restrictive. - Sebastian > Here's the docstring for 'ravel': > > order : {'C','F', 'A', 'K'}, optional > The elements of ``a`` are read in this order. 'C' means to view > the elements in C (row-major) order. 'F' means to view the elements > in Fortran (column-major) order. 'A' means to view the elements > in 'F' order if a is Fortran contiguous, 'C' order otherwise. > 'K' means to view the elements in the order they occur in memory, > except for reversing the data when strides are negative. > By default, 'C' order is used. > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From matthew.brett at gmail.com Mon Apr 1 15:10:09 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 1 Apr 2013 12:10:09 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: <1364837019.2404.12.camel@sebastian-laptop> References: <1364837019.2404.12.camel@sebastian-laptop> Message-ID: Hi, On Mon, Apr 1, 2013 at 10:23 AM, Sebastian Berg wrote: > On Sun, 2013-03-31 at 14:04 -0700, Matthew Brett wrote: >> Hi, >> >> On Sun, Mar 31, 2013 at 1:43 PM, wrote: >> > On Sun, Mar 31, 2013 at 3:54 PM, Matthew Brett wrote: >> >> Hi, >> >> >> >> On Sat, Mar 30, 2013 at 10:38 PM, wrote: >> >>> On Sun, Mar 31, 2013 at 12:50 AM, Matthew Brett wrote: >> >>>> Hi, >> >>>> >> >>>> On Sat, Mar 30, 2013 at 9:37 PM, wrote: >> >>>>> On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett wrote: >> >>>>>> Hi, >> >>>>>> >> >>>>>> On Sat, Mar 30, 2013 at 7:02 PM, wrote: >> >>>>>>> On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett wrote: >> >>>>>>>> Hi, >> >>>>>>>> >> >>>>>>>> On Sat, Mar 30, 2013 at 7:50 PM, wrote: >> >>>>>>>>> On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle >> >>>>>>>>> wrote: >> >>>>>>>>>> On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett >> >>>>>>>>>> wrote: >> >>>>>>>>>>> >> >>>>>>>>>>> On Sat, Mar 30, 2013 at 2:20 PM, wrote: >> >>>>>>>>>>> > On Sat, Mar 30, 2013 at 4:57 PM, wrote: >> >>>>>>>>>>> >> On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett >> >>>>>>>>>>> >> wrote: >> >>>>>>>>>>> >>> On Sat, Mar 30, 2013 at 4:14 AM, wrote: >> >>>>>>>>>>> >>>> On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett >> >>>>>>>>>>> >>>> wrote: >> >>>>>>>>>>> >>>>> >> >>>>>>>>>>> >>>>> Ravel and reshape use the tems 'C' and 'F" in the sense of index >> >>>>>>>>>>> >>>>> ordering. 
>> >>>>>>>>>>> >>>>> >> >>>>>>>>>>> >>>>> This is very confusing. We think the index ordering and memory >> >>>>>>>>>>> >>>>> ordering ideas need to be separated, and specifically, we should >> >>>>>>>>>>> >>>>> avoid >> >>>>>>>>>>> >>>>> using "C" and "F" to refer to index ordering. >> >>>>>>>>>>> >>>>> >> >>>>>>>>>>> >>>>> Proposal >> >>>>>>>>>>> >>>>> ------------- >> >>>>>>>>>>> >>>>> >> >>>>>>>>>>> >>>>> * Deprecate the use of "C" and "F" meaning backwards and forwards >> >>>>>>>>>>> >>>>> index ordering for ravel, reshape >> >>>>>>>>>>> >>>>> * Prefer "Z" and "N", being graphical representations of unraveling >> >>>>>>>>>>> >>>>> in >> >>>>>>>>>>> >>>>> 2 dimensions, axis1 first and axis0 first respectively (excellent >> >>>>>>>>>>> >>>>> naming idea by Paul Ivanov) >> >>>>>>>>>>> >>>>> >> >>>>>>>>>>> >>>>> What do y'all think? >> >>>>>>>>>>> >>>> >> >>>>>>>>>>> >>>> I always thought "F" and "C" are easy to understand, I always thought >> >>>>>>>>>>> >>>> about >> >>>>>>>>>>> >>>> the content and never about the memory when using it. >> >>>>>>>>>>> >> >> >>>>>>>>>>> >> changing the names doesn't make it easier to understand. >> >>>>>>>>>>> >> I think the confusion is because the new A and K refer to existing >> >>>>>>>>>>> >> memory >> >>>>>>>>>>> >> >> >>>>>>>>>>> >> >>>>>>>>>>> I disagree, I think it's confusing, but I have evidence, and that is >> >>>>>>>>>>> that four out of four of us tested ourselves and got it wrong. >> >>>>>>>>>>> >> >>>>>>>>>>> Perhaps we are particularly dumb or poorly informed, but I think it's >> >>>>>>>>>>> rash to assert there is no problem here. >> >>>>>>>>> >> >>>>>>>>> I think you are overcomplicating things or phrased it as a "trick question" >> >>>>>>>> >> >>>>>>>> I don't know what you mean by trick question - was there something >> >>>>>>>> over-complicated in the example? I deliberately didn't include >> >>>>>>>> various much more confusing examples in "reshape". >> >>>>>>> >> >>>>>>> I meant making the "candidates" think about memory instead of just >> >>>>>>> column versus row stacking. >> >>>>>> >> >>>>>> To be specific, we were teaching about reshaping a (I, J, K, N) 4D >> >>>>>> array, it was an image, with time as the 4th dimension (N time >> >>>>>> points). Raveling and reshaping 3D and 4D arrays is a common thing >> >>>>>> to do in neuroimaging, as you can imagine. >> >>>>>> >> >>>>>> A student asked what he would get back from raveling this array, a >> >>>>>> concatenated time series, or something spatial? >> >>>>>> >> >>>>>> We showed (I'd worked it out by this time) that the first N values >> >>>>>> were the time series given by [0, 0, 0, :]. >> >>>>>> >> >>>>>> He said - "Oh - I see - so the data is stored as a whole lot of time >> >>>>>> series one by one, I thought it would be stored as a series of >> >>>>>> images'. >> >>>>>> >> >>>>>> Ironically, this was a Fortran-ordered array in memory, and he was wrong. >> >>>>>> >> >>>>>> So, I think the idea of memory ordering and index ordering is very >> >>>>>> easy to confuse, and comes up naturally. >> >>>>>> >> >>>>>> I would like, as a teacher, to be able to say something like: >> >>>>>> >> >>>>>> This is what C memory layout is (it's the memory layout that gives >> >>>>>> arr.flags.C_CONTIGUOUS=True) >> >>>>>> This is what F memory layout is (it's the memory layout that gives >> >>>>>> arr.flags.F_CONTIGUOUS=True) >> >>>>>> It's rather easy to get something that is neither C or F memory layout >> >>>>>> Numpy does many memory layouts. 
>> >>>>>> Ravel and reshape and numpy in general do not care (normally) about C >> >>>>>> or F layouts, they only care about index ordering. >> >>>>>> >> >>>>>> My point, that I'm repeating, is that my job is made harder by >> >>>>>> 'arr.ravel('F')'. >> >>>>> >> >>>>> But once you know that ravel and reshape don't care about memory, the >> >>>>> ravel is easy to predict (maybe not easy to visualize in 4-D): >> >>>> >> >>>> But this assumes that you already know that there's such a thing as >> >>>> memory layout, and there's such a thing as index ordering, and that >> >>>> 'C' and 'F' in ravel refer to index ordering. Once you have that, >> >>>> you're golden. I'm arguing it's markedly harder to get this >> >>>> distinction, and keep it in mind, and teach it, if we are using the >> >>>> 'C' and 'F" names for both things. >> >>> >> >>> No, I think you are still missing my point. >> >>> I think explaining ravel and reshape F and C is easy (kind of) because the >> >>> students don't need to know at that stage about memory layouts. >> >>> >> >>> All they need to know is that we look at n-dimensional objects in >> >>> C-order or in F-order >> >>> (whichever index runs fastest) >> >> >> >> Would you accept that it may or may not be true that it is desirable >> >> or practical not to mention memory layouts when teaching numpy? >> > >> > I think they should be in two different sections. >> > >> > basic usage: >> > ravel, reshape in pure index order, and indexing, broadcasting, ... >> > >> > advanced usage: >> > memory layout and some ability to predict when you get a view and >> > when you get a copy. >> >> Right - that is what you think - but I was asking - do you agree that >> it's possible that that is not best way to teach it? >> >> What evidence would you give that it was the best way to teach it? >> >> > And I still think words can mean different things in different context >> > (with a qualifier maybe) >> > indexing in fortran order >> > memory in fortran order >> >> Right - but you'd probably also accept that using the same word for >> different and related things is likely to cause confusion? I'm sure >> we could come up with some experimental evidence for that if you do >> doubt it. >> >> > Disclaimer: I never tried to teach numpy >> > and with GSOC students my explanations only went a little bit >> > beyond what they needed to know for the purpose at hand (I hope) >> > >> >> >> >> You believe it is desirable, I believe that it is not - that teaching >> >> numpy naturally involves some discussion of memory layout. >> >> >> >> As evidence: >> >> >> >> * My student, without any prompting about memory layouts, is asking about it >> >> * Travis' numpy book has a very early section on this (section 2.3 - >> >> memory layout) >> >> * I often think about memory layouts, and from your discussion, you do >> >> too. It's uncommon that you don't have to teach something that >> >> experienced users think about often. >> > >> > I'm mentioning memory layout because I'm talking to you. >> > I wouldn't talk about memory layout if I would try to explain ravel, >> > reshape and indexing for the first time to a student. >> > >> >> * The most common use of 'order' only refers to memory layout. For >> >> example np.array "order" doesn't refer to index ordering but to memory >> >> layout. >> > >> > No, as I tried to show with the statsmodels example. >> > I don't require GSOC students (that are relatively new to numpy) to understand >> > much about memory layout. 
>> > The only use of ``order`` in statsmodels refers to *index* order in >> > ravel and reshape. >> > >> >> * The current docstring of 'reshape' cannot be explained without >> >> referring to memory order. >> > >> > really ? >> > I thought reshape only refers to *index* order for "F" and "C" >> >> Here's the docstring for 'reshape': >> >> order : {'C', 'F', 'A'}, optional >> Determines whether the array data should be viewed as in C >> (row-major) order, FORTRAN (column-major) order, or the C/FORTRAN >> order should be preserved. >> >> The 'A' option cannot be explained without reference to 'C' or 'F' >> *memory* layout - i.e. a different meaning of the 'C' and 'F" in the >> indexing interpretation. >> >> Actually, as a matter of interest - how would you explain the behavior >> of 'A' when the array is neither 'C' or 'F' memory layout? Maybe that >> could be a good test case? >> > > The 'A' means C-order unless `ndarray.flags.fnc == True` (which means > "fortran not C"). The detail about "not C" should not matter really for > copies, for reshape it should maybe be mentioned more clearly. Though > honestly, reshaping with 'A' seems so weird to me, I doubt anyone ever > does it. As for ravel... you can probably just as well use 'K' instead > which is even less restrictive. I was arguing that it is not possible to explain the docstring(s) without reference to memory order - I guess you agree. Cheers, Matthew From josef.pktd at gmail.com Mon Apr 1 16:34:00 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 1 Apr 2013 16:34:00 -0400 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1364837019.2404.12.camel@sebastian-laptop> Message-ID: On Mon, Apr 1, 2013 at 3:10 PM, Matthew Brett wrote: > Hi, > > On Mon, Apr 1, 2013 at 10:23 AM, Sebastian Berg > wrote: >> On Sun, 2013-03-31 at 14:04 -0700, Matthew Brett wrote: >>> Hi, >>> >>> On Sun, Mar 31, 2013 at 1:43 PM, wrote: >>> > On Sun, Mar 31, 2013 at 3:54 PM, Matthew Brett wrote: >>> >> Hi, >>> >> >>> >> On Sat, Mar 30, 2013 at 10:38 PM, wrote: >>> >>> On Sun, Mar 31, 2013 at 12:50 AM, Matthew Brett wrote: >>> >>>> Hi, >>> >>>> >>> >>>> On Sat, Mar 30, 2013 at 9:37 PM, wrote: >>> >>>>> On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett wrote: >>> >>>>>> Hi, >>> >>>>>> >>> >>>>>> On Sat, Mar 30, 2013 at 7:02 PM, wrote: >>> >>>>>>> On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett wrote: >>> >>>>>>>> Hi, >>> >>>>>>>> >>> >>>>>>>> On Sat, Mar 30, 2013 at 7:50 PM, wrote: >>> >>>>>>>>> On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle >>> >>>>>>>>> wrote: >>> >>>>>>>>>> On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett >>> >>>>>>>>>> wrote: >>> >>>>>>>>>>> >>> >>>>>>>>>>> On Sat, Mar 30, 2013 at 2:20 PM, wrote: >>> >>>>>>>>>>> > On Sat, Mar 30, 2013 at 4:57 PM, wrote: >>> >>>>>>>>>>> >> On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett >>> >>>>>>>>>>> >> wrote: >>> >>>>>>>>>>> >>> On Sat, Mar 30, 2013 at 4:14 AM, wrote: >>> >>>>>>>>>>> >>>> On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett >>> >>>>>>>>>>> >>>> wrote: >>> >>>>>>>>>>> >>>>> >>> >>>>>>>>>>> >>>>> Ravel and reshape use the tems 'C' and 'F" in the sense of index >>> >>>>>>>>>>> >>>>> ordering. >>> >>>>>>>>>>> >>>>> >>> >>>>>>>>>>> >>>>> This is very confusing. We think the index ordering and memory >>> >>>>>>>>>>> >>>>> ordering ideas need to be separated, and specifically, we should >>> >>>>>>>>>>> >>>>> avoid >>> >>>>>>>>>>> >>>>> using "C" and "F" to refer to index ordering. 
>>> >>>>>>>>>>> >>>>> >>> >>>>>>>>>>> >>>>> Proposal >>> >>>>>>>>>>> >>>>> ------------- >>> >>>>>>>>>>> >>>>> >>> >>>>>>>>>>> >>>>> * Deprecate the use of "C" and "F" meaning backwards and forwards >>> >>>>>>>>>>> >>>>> index ordering for ravel, reshape >>> >>>>>>>>>>> >>>>> * Prefer "Z" and "N", being graphical representations of unraveling >>> >>>>>>>>>>> >>>>> in >>> >>>>>>>>>>> >>>>> 2 dimensions, axis1 first and axis0 first respectively (excellent >>> >>>>>>>>>>> >>>>> naming idea by Paul Ivanov) >>> >>>>>>>>>>> >>>>> >>> >>>>>>>>>>> >>>>> What do y'all think? >>> >>>>>>>>>>> >>>> >>> >>>>>>>>>>> >>>> I always thought "F" and "C" are easy to understand, I always thought >>> >>>>>>>>>>> >>>> about >>> >>>>>>>>>>> >>>> the content and never about the memory when using it. >>> >>>>>>>>>>> >> >>> >>>>>>>>>>> >> changing the names doesn't make it easier to understand. >>> >>>>>>>>>>> >> I think the confusion is because the new A and K refer to existing >>> >>>>>>>>>>> >> memory >>> >>>>>>>>>>> >> >>> >>>>>>>>>>> >>> >>>>>>>>>>> I disagree, I think it's confusing, but I have evidence, and that is >>> >>>>>>>>>>> that four out of four of us tested ourselves and got it wrong. >>> >>>>>>>>>>> >>> >>>>>>>>>>> Perhaps we are particularly dumb or poorly informed, but I think it's >>> >>>>>>>>>>> rash to assert there is no problem here. >>> >>>>>>>>> >>> >>>>>>>>> I think you are overcomplicating things or phrased it as a "trick question" >>> >>>>>>>> >>> >>>>>>>> I don't know what you mean by trick question - was there something >>> >>>>>>>> over-complicated in the example? I deliberately didn't include >>> >>>>>>>> various much more confusing examples in "reshape". >>> >>>>>>> >>> >>>>>>> I meant making the "candidates" think about memory instead of just >>> >>>>>>> column versus row stacking. >>> >>>>>> >>> >>>>>> To be specific, we were teaching about reshaping a (I, J, K, N) 4D >>> >>>>>> array, it was an image, with time as the 4th dimension (N time >>> >>>>>> points). Raveling and reshaping 3D and 4D arrays is a common thing >>> >>>>>> to do in neuroimaging, as you can imagine. >>> >>>>>> >>> >>>>>> A student asked what he would get back from raveling this array, a >>> >>>>>> concatenated time series, or something spatial? >>> >>>>>> >>> >>>>>> We showed (I'd worked it out by this time) that the first N values >>> >>>>>> were the time series given by [0, 0, 0, :]. >>> >>>>>> >>> >>>>>> He said - "Oh - I see - so the data is stored as a whole lot of time >>> >>>>>> series one by one, I thought it would be stored as a series of >>> >>>>>> images'. >>> >>>>>> >>> >>>>>> Ironically, this was a Fortran-ordered array in memory, and he was wrong. >>> >>>>>> >>> >>>>>> So, I think the idea of memory ordering and index ordering is very >>> >>>>>> easy to confuse, and comes up naturally. >>> >>>>>> >>> >>>>>> I would like, as a teacher, to be able to say something like: >>> >>>>>> >>> >>>>>> This is what C memory layout is (it's the memory layout that gives >>> >>>>>> arr.flags.C_CONTIGUOUS=True) >>> >>>>>> This is what F memory layout is (it's the memory layout that gives >>> >>>>>> arr.flags.F_CONTIGUOUS=True) >>> >>>>>> It's rather easy to get something that is neither C or F memory layout >>> >>>>>> Numpy does many memory layouts. >>> >>>>>> Ravel and reshape and numpy in general do not care (normally) about C >>> >>>>>> or F layouts, they only care about index ordering. >>> >>>>>> >>> >>>>>> My point, that I'm repeating, is that my job is made harder by >>> >>>>>> 'arr.ravel('F')'. 
>>> >>>>> >>> >>>>> But once you know that ravel and reshape don't care about memory, the >>> >>>>> ravel is easy to predict (maybe not easy to visualize in 4-D): >>> >>>> >>> >>>> But this assumes that you already know that there's such a thing as >>> >>>> memory layout, and there's such a thing as index ordering, and that >>> >>>> 'C' and 'F' in ravel refer to index ordering. Once you have that, >>> >>>> you're golden. I'm arguing it's markedly harder to get this >>> >>>> distinction, and keep it in mind, and teach it, if we are using the >>> >>>> 'C' and 'F" names for both things. >>> >>> >>> >>> No, I think you are still missing my point. >>> >>> I think explaining ravel and reshape F and C is easy (kind of) because the >>> >>> students don't need to know at that stage about memory layouts. >>> >>> >>> >>> All they need to know is that we look at n-dimensional objects in >>> >>> C-order or in F-order >>> >>> (whichever index runs fastest) >>> >> >>> >> Would you accept that it may or may not be true that it is desirable >>> >> or practical not to mention memory layouts when teaching numpy? >>> > >>> > I think they should be in two different sections. >>> > >>> > basic usage: >>> > ravel, reshape in pure index order, and indexing, broadcasting, ... >>> > >>> > advanced usage: >>> > memory layout and some ability to predict when you get a view and >>> > when you get a copy. >>> >>> Right - that is what you think - but I was asking - do you agree that >>> it's possible that that is not best way to teach it? >>> >>> What evidence would you give that it was the best way to teach it? >>> >>> > And I still think words can mean different things in different context >>> > (with a qualifier maybe) >>> > indexing in fortran order >>> > memory in fortran order >>> >>> Right - but you'd probably also accept that using the same word for >>> different and related things is likely to cause confusion? I'm sure >>> we could come up with some experimental evidence for that if you do >>> doubt it. >>> >>> > Disclaimer: I never tried to teach numpy >>> > and with GSOC students my explanations only went a little bit >>> > beyond what they needed to know for the purpose at hand (I hope) >>> > >>> >> >>> >> You believe it is desirable, I believe that it is not - that teaching >>> >> numpy naturally involves some discussion of memory layout. >>> >> >>> >> As evidence: >>> >> >>> >> * My student, without any prompting about memory layouts, is asking about it >>> >> * Travis' numpy book has a very early section on this (section 2.3 - >>> >> memory layout) >>> >> * I often think about memory layouts, and from your discussion, you do >>> >> too. It's uncommon that you don't have to teach something that >>> >> experienced users think about often. >>> > >>> > I'm mentioning memory layout because I'm talking to you. >>> > I wouldn't talk about memory layout if I would try to explain ravel, >>> > reshape and indexing for the first time to a student. >>> > >>> >> * The most common use of 'order' only refers to memory layout. For >>> >> example np.array "order" doesn't refer to index ordering but to memory >>> >> layout. >>> > >>> > No, as I tried to show with the statsmodels example. >>> > I don't require GSOC students (that are relatively new to numpy) to understand >>> > much about memory layout. >>> > The only use of ``order`` in statsmodels refers to *index* order in >>> > ravel and reshape. >>> > >>> >> * The current docstring of 'reshape' cannot be explained without >>> >> referring to memory order. 
>>> > >>> > really ? >>> > I thought reshape only refers to *index* order for "F" and "C" >>> >>> Here's the docstring for 'reshape': >>> >>> order : {'C', 'F', 'A'}, optional >>> Determines whether the array data should be viewed as in C >>> (row-major) order, FORTRAN (column-major) order, or the C/FORTRAN >>> order should be preserved. >>> >>> The 'A' option cannot be explained without reference to 'C' or 'F' >>> *memory* layout - i.e. a different meaning of the 'C' and 'F" in the >>> indexing interpretation. >>> >>> Actually, as a matter of interest - how would you explain the behavior >>> of 'A' when the array is neither 'C' or 'F' memory layout? Maybe that >>> could be a good test case? >>> >> >> The 'A' means C-order unless `ndarray.flags.fnc == True` (which means >> "fortran not C"). The detail about "not C" should not matter really for >> copies, for reshape it should maybe be mentioned more clearly. Though >> honestly, reshaping with 'A' seems so weird to me, I doubt anyone ever >> does it. As for ravel... you can probably just as well use 'K' instead >> which is even less restrictive. > > I was arguing that it is not possible to explain the docstring(s) > without reference to memory order - I guess you agree. I was carefully to always refer to "C" and "F" options. I've never seen a usage of "A", nor the "K" in ravel ("K" is not available in numpy 1.5) and I don't expect to run into a case where I need "A" or "K". My impression is that both "A" and "K" are only good for memory optimization, when we do *not* care (much) about the actual sequence. (So, in my opinion, it's mostly useless to try to figure out what the sequence is.) So, I would categorize a question for predicting what happens with "A" or "K" as a question to separate developers in the style of, Do you really understand the tricky parts of numpy? or Do you just have a working knowledge of numpy? (I just avoid certain parts of numpy because they make my head spin. e.g. mixing slices and fancy indexing in more than 2d ?) 
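For what it's worth, a small sketch of how 'A' and 'K' sit next to 'C' and 'F'
(needs numpy >= 1.6 for 'K'; `fnc` is the "fortran-not-C" flag Sebastian
mentioned; the outputs follow from the docstrings quoted above):

In [1]: import numpy as np

In [2]: a = np.asfortranarray(np.arange(6).reshape((2, 3)))

In [3]: a.ravel('C')        # index order, last axis fastest
Out[3]: array([0, 1, 2, 3, 4, 5])

In [4]: a.ravel('F')        # index order, first axis fastest
Out[4]: array([0, 3, 1, 4, 2, 5])

In [5]: a.flags.fnc         # Fortran-contiguous and not C-contiguous
Out[5]: True

In [6]: a.ravel('A')        # acts like 'F' because a.flags.fnc is True
Out[6]: array([0, 3, 1, 4, 2, 5])

In [7]: a.ravel('K')        # memory order; same as 'A' for this array
Out[7]: array([0, 3, 1, 4, 2, 5])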
I'm just against taking away the easy to understand and frequently used (names) "F" and "C", to come back to the original question Josef > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From matthew.brett at gmail.com Mon Apr 1 18:29:34 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 1 Apr 2013 15:29:34 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1364837019.2404.12.camel@sebastian-laptop> Message-ID: Hi, On Mon, Apr 1, 2013 at 1:34 PM, wrote: > On Mon, Apr 1, 2013 at 3:10 PM, Matthew Brett wrote: >> Hi, >> >> On Mon, Apr 1, 2013 at 10:23 AM, Sebastian Berg >> wrote: >>> On Sun, 2013-03-31 at 14:04 -0700, Matthew Brett wrote: >>>> Hi, >>>> >>>> On Sun, Mar 31, 2013 at 1:43 PM, wrote: >>>> > On Sun, Mar 31, 2013 at 3:54 PM, Matthew Brett wrote: >>>> >> Hi, >>>> >> >>>> >> On Sat, Mar 30, 2013 at 10:38 PM, wrote: >>>> >>> On Sun, Mar 31, 2013 at 12:50 AM, Matthew Brett wrote: >>>> >>>> Hi, >>>> >>>> >>>> >>>> On Sat, Mar 30, 2013 at 9:37 PM, wrote: >>>> >>>>> On Sun, Mar 31, 2013 at 12:04 AM, Matthew Brett wrote: >>>> >>>>>> Hi, >>>> >>>>>> >>>> >>>>>> On Sat, Mar 30, 2013 at 7:02 PM, wrote: >>>> >>>>>>> On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett wrote: >>>> >>>>>>>> Hi, >>>> >>>>>>>> >>>> >>>>>>>> On Sat, Mar 30, 2013 at 7:50 PM, wrote: >>>> >>>>>>>>> On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle >>>> >>>>>>>>> wrote: >>>> >>>>>>>>>> On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett >>>> >>>>>>>>>> wrote: >>>> >>>>>>>>>>> >>>> >>>>>>>>>>> On Sat, Mar 30, 2013 at 2:20 PM, wrote: >>>> >>>>>>>>>>> > On Sat, Mar 30, 2013 at 4:57 PM, wrote: >>>> >>>>>>>>>>> >> On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett >>>> >>>>>>>>>>> >> wrote: >>>> >>>>>>>>>>> >>> On Sat, Mar 30, 2013 at 4:14 AM, wrote: >>>> >>>>>>>>>>> >>>> On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett >>>> >>>>>>>>>>> >>>> wrote: >>>> >>>>>>>>>>> >>>>> >>>> >>>>>>>>>>> >>>>> Ravel and reshape use the tems 'C' and 'F" in the sense of index >>>> >>>>>>>>>>> >>>>> ordering. >>>> >>>>>>>>>>> >>>>> >>>> >>>>>>>>>>> >>>>> This is very confusing. We think the index ordering and memory >>>> >>>>>>>>>>> >>>>> ordering ideas need to be separated, and specifically, we should >>>> >>>>>>>>>>> >>>>> avoid >>>> >>>>>>>>>>> >>>>> using "C" and "F" to refer to index ordering. >>>> >>>>>>>>>>> >>>>> >>>> >>>>>>>>>>> >>>>> Proposal >>>> >>>>>>>>>>> >>>>> ------------- >>>> >>>>>>>>>>> >>>>> >>>> >>>>>>>>>>> >>>>> * Deprecate the use of "C" and "F" meaning backwards and forwards >>>> >>>>>>>>>>> >>>>> index ordering for ravel, reshape >>>> >>>>>>>>>>> >>>>> * Prefer "Z" and "N", being graphical representations of unraveling >>>> >>>>>>>>>>> >>>>> in >>>> >>>>>>>>>>> >>>>> 2 dimensions, axis1 first and axis0 first respectively (excellent >>>> >>>>>>>>>>> >>>>> naming idea by Paul Ivanov) >>>> >>>>>>>>>>> >>>>> >>>> >>>>>>>>>>> >>>>> What do y'all think? >>>> >>>>>>>>>>> >>>> >>>> >>>>>>>>>>> >>>> I always thought "F" and "C" are easy to understand, I always thought >>>> >>>>>>>>>>> >>>> about >>>> >>>>>>>>>>> >>>> the content and never about the memory when using it. >>>> >>>>>>>>>>> >> >>>> >>>>>>>>>>> >> changing the names doesn't make it easier to understand. 
>>>> >>>>>>>>>>> >> I think the confusion is because the new A and K refer to existing >>>> >>>>>>>>>>> >> memory >>>> >>>>>>>>>>> >> >>>> >>>>>>>>>>> >>>> >>>>>>>>>>> I disagree, I think it's confusing, but I have evidence, and that is >>>> >>>>>>>>>>> that four out of four of us tested ourselves and got it wrong. >>>> >>>>>>>>>>> >>>> >>>>>>>>>>> Perhaps we are particularly dumb or poorly informed, but I think it's >>>> >>>>>>>>>>> rash to assert there is no problem here. >>>> >>>>>>>>> >>>> >>>>>>>>> I think you are overcomplicating things or phrased it as a "trick question" >>>> >>>>>>>> >>>> >>>>>>>> I don't know what you mean by trick question - was there something >>>> >>>>>>>> over-complicated in the example? I deliberately didn't include >>>> >>>>>>>> various much more confusing examples in "reshape". >>>> >>>>>>> >>>> >>>>>>> I meant making the "candidates" think about memory instead of just >>>> >>>>>>> column versus row stacking. >>>> >>>>>> >>>> >>>>>> To be specific, we were teaching about reshaping a (I, J, K, N) 4D >>>> >>>>>> array, it was an image, with time as the 4th dimension (N time >>>> >>>>>> points). Raveling and reshaping 3D and 4D arrays is a common thing >>>> >>>>>> to do in neuroimaging, as you can imagine. >>>> >>>>>> >>>> >>>>>> A student asked what he would get back from raveling this array, a >>>> >>>>>> concatenated time series, or something spatial? >>>> >>>>>> >>>> >>>>>> We showed (I'd worked it out by this time) that the first N values >>>> >>>>>> were the time series given by [0, 0, 0, :]. >>>> >>>>>> >>>> >>>>>> He said - "Oh - I see - so the data is stored as a whole lot of time >>>> >>>>>> series one by one, I thought it would be stored as a series of >>>> >>>>>> images'. >>>> >>>>>> >>>> >>>>>> Ironically, this was a Fortran-ordered array in memory, and he was wrong. >>>> >>>>>> >>>> >>>>>> So, I think the idea of memory ordering and index ordering is very >>>> >>>>>> easy to confuse, and comes up naturally. >>>> >>>>>> >>>> >>>>>> I would like, as a teacher, to be able to say something like: >>>> >>>>>> >>>> >>>>>> This is what C memory layout is (it's the memory layout that gives >>>> >>>>>> arr.flags.C_CONTIGUOUS=True) >>>> >>>>>> This is what F memory layout is (it's the memory layout that gives >>>> >>>>>> arr.flags.F_CONTIGUOUS=True) >>>> >>>>>> It's rather easy to get something that is neither C or F memory layout >>>> >>>>>> Numpy does many memory layouts. >>>> >>>>>> Ravel and reshape and numpy in general do not care (normally) about C >>>> >>>>>> or F layouts, they only care about index ordering. >>>> >>>>>> >>>> >>>>>> My point, that I'm repeating, is that my job is made harder by >>>> >>>>>> 'arr.ravel('F')'. >>>> >>>>> >>>> >>>>> But once you know that ravel and reshape don't care about memory, the >>>> >>>>> ravel is easy to predict (maybe not easy to visualize in 4-D): >>>> >>>> >>>> >>>> But this assumes that you already know that there's such a thing as >>>> >>>> memory layout, and there's such a thing as index ordering, and that >>>> >>>> 'C' and 'F' in ravel refer to index ordering. Once you have that, >>>> >>>> you're golden. I'm arguing it's markedly harder to get this >>>> >>>> distinction, and keep it in mind, and teach it, if we are using the >>>> >>>> 'C' and 'F" names for both things. >>>> >>> >>>> >>> No, I think you are still missing my point. >>>> >>> I think explaining ravel and reshape F and C is easy (kind of) because the >>>> >>> students don't need to know at that stage about memory layouts. 
>>>> >>> >>>> >>> All they need to know is that we look at n-dimensional objects in >>>> >>> C-order or in F-order >>>> >>> (whichever index runs fastest) >>>> >> >>>> >> Would you accept that it may or may not be true that it is desirable >>>> >> or practical not to mention memory layouts when teaching numpy? >>>> > >>>> > I think they should be in two different sections. >>>> > >>>> > basic usage: >>>> > ravel, reshape in pure index order, and indexing, broadcasting, ... >>>> > >>>> > advanced usage: >>>> > memory layout and some ability to predict when you get a view and >>>> > when you get a copy. >>>> >>>> Right - that is what you think - but I was asking - do you agree that >>>> it's possible that that is not best way to teach it? >>>> >>>> What evidence would you give that it was the best way to teach it? >>>> >>>> > And I still think words can mean different things in different context >>>> > (with a qualifier maybe) >>>> > indexing in fortran order >>>> > memory in fortran order >>>> >>>> Right - but you'd probably also accept that using the same word for >>>> different and related things is likely to cause confusion? I'm sure >>>> we could come up with some experimental evidence for that if you do >>>> doubt it. >>>> >>>> > Disclaimer: I never tried to teach numpy >>>> > and with GSOC students my explanations only went a little bit >>>> > beyond what they needed to know for the purpose at hand (I hope) >>>> > >>>> >> >>>> >> You believe it is desirable, I believe that it is not - that teaching >>>> >> numpy naturally involves some discussion of memory layout. >>>> >> >>>> >> As evidence: >>>> >> >>>> >> * My student, without any prompting about memory layouts, is asking about it >>>> >> * Travis' numpy book has a very early section on this (section 2.3 - >>>> >> memory layout) >>>> >> * I often think about memory layouts, and from your discussion, you do >>>> >> too. It's uncommon that you don't have to teach something that >>>> >> experienced users think about often. >>>> > >>>> > I'm mentioning memory layout because I'm talking to you. >>>> > I wouldn't talk about memory layout if I would try to explain ravel, >>>> > reshape and indexing for the first time to a student. >>>> > >>>> >> * The most common use of 'order' only refers to memory layout. For >>>> >> example np.array "order" doesn't refer to index ordering but to memory >>>> >> layout. >>>> > >>>> > No, as I tried to show with the statsmodels example. >>>> > I don't require GSOC students (that are relatively new to numpy) to understand >>>> > much about memory layout. >>>> > The only use of ``order`` in statsmodels refers to *index* order in >>>> > ravel and reshape. >>>> > >>>> >> * The current docstring of 'reshape' cannot be explained without >>>> >> referring to memory order. >>>> > >>>> > really ? >>>> > I thought reshape only refers to *index* order for "F" and "C" >>>> >>>> Here's the docstring for 'reshape': >>>> >>>> order : {'C', 'F', 'A'}, optional >>>> Determines whether the array data should be viewed as in C >>>> (row-major) order, FORTRAN (column-major) order, or the C/FORTRAN >>>> order should be preserved. >>>> >>>> The 'A' option cannot be explained without reference to 'C' or 'F' >>>> *memory* layout - i.e. a different meaning of the 'C' and 'F" in the >>>> indexing interpretation. >>>> >>>> Actually, as a matter of interest - how would you explain the behavior >>>> of 'A' when the array is neither 'C' or 'F' memory layout? Maybe that >>>> could be a good test case? 
>>>> >>> >>> The 'A' means C-order unless `ndarray.flags.fnc == True` (which means >>> "fortran not C"). The detail about "not C" should not matter really for >>> copies, for reshape it should maybe be mentioned more clearly. Though >>> honestly, reshaping with 'A' seems so weird to me, I doubt anyone ever >>> does it. As for ravel... you can probably just as well use 'K' instead >>> which is even less restrictive. >> >> I was arguing that it is not possible to explain the docstring(s) >> without reference to memory order - I guess you agree. > > I was carefully to always refer to "C" and "F" options. > > I've never seen a usage of "A", nor the "K" in ravel ("K" is not > available in numpy 1.5) > and I don't expect to run into a case where I need "A" or "K". Right. I am only pointing out that one cannot explain the docstring without reference to memory order. > My impression is that both "A" and "K" are only good for memory > optimization, when we do *not* care (much) about the actual sequence. > (So, in my opinion, it's mostly useless to try to figure out what the > sequence is.) > > So, I would categorize a question for predicting what happens with "A" or "K" > as a question to separate developers in the style of, > Do you really understand the tricky parts of numpy? or > Do you just have a working knowledge of numpy? > > (I just avoid certain parts of numpy because they make my head spin. > e.g. mixing slices and fancy indexing in more than 2d ?) > > I'm just against taking away the easy to understand and frequently used > (names) "F" and "C", to come back to the original question I agree 'F' and 'C' are frequently used, but I estimate they are most frequently used with a different meaning. "Easy to understand" is obviously subjective, and not much use for the discussion, hence my attempt to try and find some evidence on the point. 'F' and 'C' are clearly not simple, in a technical sense, because they have two different meanings. The use of C and F are of course familiar, and that gives us a bias to believe they are easy for some someone else to understand. I was hoping for some attempt to get past that bias, which is obviously going to be strong, I believe that evidence on that point is your requirement that someone learning this stuff does not come across 'C' or 'F' in the sense of memory layout, until they are advanced, and my earlier assertion (with some evidence) that that is neither desirable nor practical. Cheers, Matthew From chris.barker at noaa.gov Mon Apr 1 19:51:53 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Mon, 1 Apr 2013 16:51:53 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1364837019.2404.12.camel@sebastian-laptop> Message-ID: HI folks, I've been teaching Python lately, have taught numpy a couple times (formally), and am preparing a leacture about it over the next couple weeks -- so I'm taking an interest here. I've been a regular numpy user for a long time, though as it happens, rarely use ravel() (sode note, what's always confused me the most is that it seems to me that ravel() _unravels_ the array - but that's a side note...) So I ignored the first post, then fired up iPython, read the docstring, and played with ravel a bit -- it behaved EXACTLY like I expected. -- at least for 2-d.... Mathew, I expect your group may have gotten tied up by the fact that you know too much! 
kind of like how I have a hard time getting my iphone to work, and my computer-illiterate wife has no problem at all. So: yes, I do think it's bit confusing and unfortunate that the "order" parameter has two somewhat different meanings, but they are in fat, used fairly similarly. And while the idea of "fortran" or "C" ordering of arrays may be a foreign concept to folks that have not used fortran or C (or most critically, tried to interace the two...) it's a common enough concept that it's a reasonable shorthand. As for "should we teach memory order at all to newbies?' I usually do teach memory order early on, partly that's because I really like to emphasize that numpy arrays are both a really nice Python data structure and set of functions, but also a wrapper around a block of data -- for the later, you need to talk about order. Also, even with pure-python, knowing a bit about whether arrays are contiguous or not is important (and views, and...). You can do a lot with numpy without thinking about memory order at all, but to really make it dance, you need to know about it. In short -- I don't think the situation is too bad, and not bad enough to change any names or flags, but if someone wants to add a bit to the ravel docstring to clarify it, I'm all for it. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From matthew.brett at gmail.com Tue Apr 2 01:15:31 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 1 Apr 2013 22:15:31 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1364837019.2404.12.camel@sebastian-laptop> Message-ID: Hi, On Mon, Apr 1, 2013 at 4:51 PM, Chris Barker - NOAA Federal wrote: > HI folks, > > I've been teaching Python lately, have taught numpy a couple times > (formally), and am preparing a leacture about it over the next couple > weeks -- so I'm taking an interest here. > > I've been a regular numpy user for a long time, though as it happens, > rarely use ravel() (sode note, what's always confused me the most is > that it seems to me that ravel() _unravels_ the array - but that's a > side note...) > > So I ignored the first post, then fired up iPython, read the > docstring, and played with ravel a bit -- it behaved EXACTLY like I > expected. -- at least for 2-d.... > > Mathew, I expect your group may have gotten tied up by the fact that > you know too much! kind of like how I have a hard time getting my > iphone to work, and my computer-illiterate wife has no problem at all. Thank you for the compliment, it's more enjoyable than other potential explanations of my confusion (sigh). But, I don't think that is the explanation. First, there were three of us with different levels of experience getting confused on this. Second, I think we all agree that: > So: yes, I do think it's bit confusing and unfortunate that the > "order" parameter has two somewhat different meanings, - so there is a good reason that we could get confused. Last, as soon as we came to the distinction between index order and memory layout, it was clear. We all agreed that this was an important distinction that would improve numpy if we made it. Before I sent the email I did wonder aloud whether people would read the email, understand the distinction, and then fail to see the problem. 
It is hard to imagine yourself before you understood something. > but they are in > fat, used fairly similarly. And while the idea of "fortran" or "C" > ordering of arrays may be a foreign concept to folks that have not > used fortran or C (or most critically, tried to interace the two...) > it's a common enough concept that it's a reasonable shorthand. > > As for "should we teach memory order at all to newbies?' > > I usually do teach memory order early on, partly that's because I > really like to emphasize that numpy arrays are both a really nice > Python data structure and set of functions, but also a wrapper around > a block of data -- for the later, you need to talk about order. Also, > even with pure-python, knowing a bit about whether arrays are > contiguous or not is important (and views, and...). You can do a lot > with numpy without thinking about memory order at all, but to really > make it dance, you need to know about it. > > In short -- I don't think the situation is too bad, and not bad enough > to change any names or flags, but if someone wants to add a bit to the > ravel docstring to clarify it, I'm all for it. I think you agree that there is potential for confusion, and there doesn't seem any reason to continue with that confusion if we can come up with a clearer name. So here is a compromise proposal. How about: * Preferring the names 'c-style' and 'f-style' for the indexing order case (ravel, reshape, flatiter) * Leaving 'C" and 'F' as functional shortcuts, so there is no possible backwards-compatibility problem. Would you object to that? Cheers, Matthew From eric at depagne.org Tue Apr 2 03:26:05 2013 From: eric at depagne.org (=?iso-8859-1?q?=C9ric_Depagne?=) Date: Tue, 2 Apr 2013 09:26:05 +0200 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: Message-ID: <201304020926.05700.eric@depagne.org> Hi all, Since we're mentionning obvious and non-obvious naming, > > I think you agree that there is potential for confusion, and there > doesn't seem any reason to continue with that confusion if we can come > up with a clearer name. > > So here is a compromise proposal. > > How about: > > * Preferring the names 'c-style' and 'f-style' for the indexing order > case (ravel, reshape, flatiter) This naming scheme is obvious for the ones that have been doing some coding for a long time, but they tend not to speak to anyone else. Why not use naming that are a little bit more explicit (and of course, keep the legacy naming available), and use 'row-first' and 'column-first' (or anything else that may be more explicit) ? Cheers, ?ric. > * Leaving 'C" and 'F' as functional shortcuts, so there is no possible > backwards-compatibility problem. > > Would you object to that? 
> > Cheers,
> > Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
Un clavier azerty en vaut deux
----------------------------------------------------------
Éric Depagne eric at depagne.org
From njs at pobox.com Tue Apr 2 07:04:00 2013
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 2 Apr 2013 12:04:00 +0100
Subject: [Numpy-discussion] Indexing bug
In-Reply-To: References: Message-ID:
On Sun, Mar 31, 2013 at 6:14 AM, Ivan Oseledets wrote:
>> I am using numpy 1.6.1,
>> and encountered a weird fancy indexing bug:
>>
>> import numpy as np
>> c = np.random.randn(10,200,10);
>>
>> In [29]: print c[[0,1],:200,:2].shape
>> (2, 200, 2)
>>
>> In [30]: print c[[0,1],:200,[0,1]].shape
>> (2, 200)
>>
>> It means, that here fancy indexing is not working right for a 3d array.
>>
> > On Sat, Mar 30, 2013 at 11:01 AM, Ivan Oseledets > wrote:
>
>> I am using numpy 1.6.1,
>> and encountered a weird fancy indexing bug:
>>
>> import numpy as np
>> c = np.random.randn(10,200,10);
>>
>> In [29]: print c[[0,1],:200,:2].shape
>> (2, 200, 2)
>>
>> In [30]: print c[[0,1],:200,[0,1]].shape
>> (2, 200)
>>
>> It means, that here fancy indexing is not working right for a 3d array.
>>
> -->
> It is working fine, review the docs:
>
> http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#advanced-indexing
>
> In your return, item[0, :] is c[0, :, 0] and item[1, :] is c[1, :, 1].
>
> If you want a return of shape (2, 200, 2) where item[i, :, j] is c[i, :,
> j] you could use slicing:
>
> c[:2, :200, :2]
>
> or something more elaborate like:
>
> c[np.arange(2)[:, None, None], np.arange(200)[:, None], np.arange(2)]
>
> Jaime
> --->
>
> Oh! So it is not a bug, it is a feature, which is completely
> incompatible with other array based languages (MATLAB and Fortran). To
> me, I can not find a single explanation why it is so in numpy.
> Taking submatrices from a matrix is a common operation and the syntax
> above is very natural to take submatrices, not weird diagonal stuff.
> i.e.,
>
> c = np.random.randn(100,100)
> d = c[[0,3],[2,3]]
>
> should NOT produce two numbers! (and you can not do it using slices!)
>
> In MATLAB and Fortran
> c(indi,indj)
> will produce a 2 x 2 matrix.
> How can it be done in numpy (and why the complications?)
>
> So, please consider this message as a feature request.
Numpy's handling of such things is strictly more general than MATLAB and
Fortran's (AFAIK), and fits in with the rest of the system quite nicely, I'd
say. The logic is: if you do
a[row_coords, col_coords]
then it treats row_coords and col_coords as parallel arrays, where each
corresponding pair of entries in the two arrays gives the coordinates of one
entry in the result -- so you get [a[row_coords[0], col_coords[0]],
a[row_coords[1], col_coords[1]], ...]. This follows numpy's usual rules for
arrays that are supposed to align: just like if you did 'row_coords +
col_coords', say, which would give you [row_coords[0] + col_coords[0],
row_coords[1] + col_coords[1], ...]. AND, all of this works just the same if
row_coords and col_coords are arbitrary-dimensional arrays: the output of
indexing (just like the output of '+') will have the same dimensionality as
its input.
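To make the parallel-arrays rule concrete, a minimal sketch (the array and
index values are arbitrary, picked only for illustration):

    import numpy as np

    a = np.arange(12).reshape(3, 4)
    rows = np.array([0, 2])
    cols = np.array([1, 3])
    print(a[rows, cols])   # [ 1 11], i.e. [a[0, 1], a[2, 3]]
    print(rows + cols)     # [1 5] -- the same element-wise pairing as arithmetic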
So if they're both 2x2 arrays, then a[row_coords, col_coords] will be
[[a[row_coords[0, 0], col_coords[0, 0]], a[row_coords[0, 1], col_coords[0, 1]]],
 [a[row_coords[1, 0], col_coords[1, 0]], a[row_coords[1, 1], col_coords[1, 1]]]]
AND (here's the solution to your problem), this "aligning" uses the same
rules as "+" does, i.e., broadcasting applies:
http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
So for your example, write
c[[[0], [1]], :200, [[0, 1]]]
which by broadcasting is equivalent to
c[[[0, 0], [1, 1]], :200, [[0, 1], [0, 1]]]
which is what you want. The only problem then is that slicing indexing
happens "inside" fancy indexing ("fancy indexing" is the name for this
line-up-the-arrays-and-extract-coordinates thing). So what this means is that
when you use both inside a single indexing operation, what it does is: for
each set of corresponding fancy indexing coordinates, it doesn't just extract
a single element for the final array -- it extracts an entire sub-array. The
practical result is that the shape of your output array will be the shape of
your aligned fancy indexing arrays (after broadcasting), and then with any
sliced axes stuck on the end. So you might expect that the above expression
would give you an array of shape (2, 200, 2) but in fact it will be of shape
(2, 2, 200) because the (2, 2) part is the shape of the fancy index arrays,
and the (200,) is the shape of the slices. (Notice that there is no relation
at all between the shape of the input array and the shape of the fancy
indexes!) You can fix this up with a call to rollaxis(), or by getting even
fancier, using fancy indexes for all three axes, and making sure they each
broadcast to the shape (2, 200, 2):
coord1 = np.asarray([0, 1]).reshape((2, 1, 1))
coord2 = np.arange(200).reshape((1, 200, 1))
coord3 = np.asarray([0, 1]).reshape((1, 1, 2))
c[coord1, coord2, coord3]
-n
From njs at pobox.com Tue Apr 2 07:32:15 2013
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 2 Apr 2013 12:32:15 +0100
Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
In-Reply-To: References: Message-ID:
On Sat, Mar 30, 2013 at 2:08 AM, Matthew Brett wrote:
> Hi,
>
> We were teaching today, and found ourselves getting very confused
> about ravel and shape in numpy.
>
> Summary
> --------------
>
> There are two separate ideas needed to understand ordering in ravel and reshape:
>
> Idea 1): ravel / reshape can proceed from the last axis to the first,
> or the first to the last. This is "ravel index ordering"
> Idea 2) The physical layout of the array (on disk or in memory) can be
> "C" or "F" contiguous or neither.
> This is "memory ordering"
>
> The index ordering is usually (but see below) orthogonal to the memory ordering.
>
> The 'ravel' and 'reshape' commands use "C" and "F" in the sense of
> index ordering, and this mixes the two ideas and is confusing.
>
> What the current situation looks like
> ----------------------------------------------------
>
> Specifically, we've been rolling this around 4 experienced numpy users
> and we all predicted at least one of the results below wrongly.
>
> This was what we knew, or should have known:
>
> In [2]: import numpy as np
>
> In [3]: arr = np.arange(10).reshape((2, 5))
>
> In [5]: arr.ravel()
> Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>
> So, the 'ravel' operation unravels over the last axis (1) first,
> followed by axis 0.
>
> So far so good (even if the opposite to MATLAB, Octave).
> > Then we found the 'order' flag to ravel:
>
> In [10]: arr.flags
> Out[10]:
> C_CONTIGUOUS : True
> F_CONTIGUOUS : False
> OWNDATA : False
> WRITEABLE : True
> ALIGNED : True
> UPDATEIFCOPY : False
>
> In [11]: arr.ravel('C')
> Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>
> But we soon got confused. How about this?
>
> In [12]: arr_F = np.array(arr, order='F')
>
> In [13]: arr_F.flags
> Out[13]:
> C_CONTIGUOUS : False
> F_CONTIGUOUS : True
> OWNDATA : True
> WRITEABLE : True
> ALIGNED : True
> UPDATEIFCOPY : False
>
> In [16]: arr_F
> Out[16]:
> array([[0, 1, 2, 3, 4],
> [5, 6, 7, 8, 9]])
>
> In [17]: arr_F.ravel('C')
> Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>
> Right - so the flag 'C' to ravel, has got nothing to do with *memory*
> ordering, but is to do with *index* ordering.
>
> And in fact, we can ask for memory ordering specifically:
>
> In [22]: arr.ravel('K')
> Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>
> In [23]: arr_F.ravel('K')
> Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])
>
> In [24]: arr.ravel('A')
> Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>
> In [25]: arr_F.ravel('A')
> Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])
>
> There are some confusions to get into with the 'order' flag to reshape
> as well, of the same type.
>
> Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering.
>
> This is very confusing. We think the index ordering and memory
> ordering ideas need to be separated, and specifically, we should avoid
> using "C" and "F" to refer to index ordering.
>
> Proposal
> -------------
>
> * Deprecate the use of "C" and "F" meaning backwards and forwards
> index ordering for ravel, reshape
> * Prefer "Z" and "N", being graphical representations of unraveling in
> 2 dimensions, axis1 first and axis0 first respectively (excellent
> naming idea by Paul Ivanov)
>
> What do y'all think?
Surely it should be "Z" and "И"? ;-)
I knew what your examples would produce, but only because I've bumped into
this before. When you do reshapes of various sorts (ravel() ==
reshape((-1,))), then, like you say, there are two totally different sets of
coordinate mapping in play:
chunk of memory <-1-> virtual array layout <-2-> new array layout
(C pointers) <---> (Python indexes) <---> (Python indexes)
Mapping (1) is determined by the array strides, and you have to think about
it when you interface with C code, but at the Python level it's pretty much
irrelevant; all operations are defined at the "virtual array layout" level.
Further confusing the issue is the fact that the vast majority of legal
memory<->virtual array mappings are *neither* C- nor F-ordered. Strides are
very flexible. Further further confusing the issue is that mapping (2)
actually consists of two mappings: if you have an array with shape (3, 4, 5)
and reshape it to (4, 15), then the way you work out the overall mapping is
by first mapping the (3, 4, 5) onto a flat 1-d space with 60 elements, and
then mapping *that* to the (4, 15) space. Anyway, I agree that this is very
confusing; certainly it confused me. If you bump into these two mappings just
in passing, and separately, then it's very easy to miss the fact that they
have nothing to do with each other. And I agree that using exactly the same
terminology for both of them is part of what causes this. I even kind of like
the "Z"/"N" naming scheme (I still have to look up what C/F actually mean
every time, I'm ashamed to say).
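To make the "strides are very flexible" point concrete, a quick sketch --
the stride numbers assume 8-byte integers, so they will differ on other
platforms:

    import numpy as np

    a = np.arange(24).reshape(4, 6)   # C-contiguous
    v = a[::2, ::3]                   # a perfectly legal strided view...
    print(v.flags['C_CONTIGUOUS'], v.flags['F_CONTIGUOUS'])   # False False
    print(v.strides)                  # (96, 24): neither C- nor F-ordered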
But I don't see how the proposed solution helps, because the problem isn't that mapping (1) and (2) use different ordering schemes -- the column-major/row-major distinction really does apply to both equally. Using different names for those seems like it will confuse the issue further, if anything. The problem IMHO is that sometimes "order=" is used to specify mapping (1), and sometimes it's used to specify mapping (2), when in fact these are totally orthogonal. To see this, note that semantically it would be perfectly possible for .reshape() to take *two* order= arguments: one to specify the coordinate space mapping (2), and the other to specify the desired memory layout used by the result array (1). Of course we shouldn't actually do this, because in the unlikely event that someone actually wanted both of these they could just call asarray() on the output of reshape(). Maybe we should go through and rename "order" to something more descriptive in each case, so we'd have a.reshape(..., index_order="C") a.copy(memory_order="F") etc.? This way if you just bumped into these while reading code, it would still be immediately obvious that they were dealing with totally different concepts. Compare to reading along without the docs and seeing a.reshape(..., order="Z") a.copy(order="C") That'd just leave me even more baffled than the current system -- I'd start thinking that "Z" and "C" somehow were different options for the same order= option, so they must somehow mean ways of ordering elements? -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.h.jaffe at gmail.com Tue Apr 2 07:46:48 2013 From: a.h.jaffe at gmail.com (Andrew Jaffe) Date: Tue, 02 Apr 2013 12:46:48 +0100 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: <1364669723.2556.19.camel@sebastian-laptop> References: <1364669723.2556.19.camel@sebastian-laptop> Message-ID: >> >> Proposal >> ------------- >> >> * Deprecate the use of "C" and "F" meaning backwards and forwards >> index ordering for ravel, reshape >> * Prefer "Z" and "N", being graphical representations of unraveling in >> 2 dimensions, axis1 first and axis0 first respectively (excellent >> naming idea by Paul Ivanov) >> >> What do y'all think? >> > > Personally I think it is clear enough and that "Z" and "N" would confuse > me just as much (though I am used to the other names). Also "Z" and "N" > would seem more like aliases, which would also make sense in the memory > order context. > If anything, I would prefer renaming the arguments iteration_order and > memory_order, but it seems overdoing it... > Maybe the documentation could just be checked if it is always clear > though. I.e. maybe it does not use "iteration" or "memory" order > consistently (though I somewhat feel it is usually clear that it must be > iteration order, since no numpy function cares about the input memory > order as they will just do a copy if necessary). I have been using both C and Fortran for 25 or so years. Despite that, I have to sit and think every time I need to know which way the arrays are stored, basically by remembering that in fortran you do (I,J,*) for an assumed-size array. So I *love* the idea of 'Z' and 'N' which I understood immediately. 
Andrew
From chris.barker at noaa.gov Tue Apr 2 12:29:15 2013
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Tue, 2 Apr 2013 09:29:15 -0700
Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
In-Reply-To: References: <1364837019.2404.12.camel@sebastian-laptop> Message-ID:
On Mon, Apr 1, 2013 at 10:15 PM, Matthew Brett wrote:
> Thank you for the compliment, it's more enjoyable than other potential
> explanations of my confusion (sigh).
>
> But, I don't think that is the explanation.
well, the core explanation is these are difficult and intertwined
concepts...And yes, better names and better docs can help.
> Last, as soon as we came to the distinction between index order and
> memory layout, it was clear.
>
> We all agreed that this was an important distinction that would
> improve numpy if we made it.
yup.
> I think you agree that there is potential for confusion, and there
> doesn't seem any reason to continue with that confusion if we can come
> up with a clearer name.
well, changing an API is not to be taken lightly -- we are not discussing
how we'd do it if we were to start fresh here. So any change should make
things enough better that it is worth dealing with the process of the
change.
> So here is a compromise proposal.
> * Preferring the names 'c-style' and 'f-style' for the indexing order
> case (ravel, reshape, flatiter)
> * Leaving 'C' and 'F' as functional shortcuts, so there is no possible
> backwards-compatibility problem.
seems reasonable enough -- though even with the backward compatibility,
users will be faced with many, many older examples and docs that use 'C'
and 'F', while the new ones refer to the new names -- might this be cause
for even more confusion (at least for a few years...) leaving me with an
equivocal +0 on that ....
another thought:
"""
Definition: np.ravel(a, order='C')

A 1-D array, containing the elements of the input, is returned. A copy is
made only if needed.

Parameters
----------
a : array_like
    Input array. The elements in ``a`` are read in the order specified by
    `order`, and packed as a 1-D array.
order : {'C','F', 'A', 'K'}, optional
    The elements of ``a`` are read in this order. 'C' means to view
    the elements in C (row-major) order. 'F' means to view the elements
    in Fortran (column-major) order. 'A' means to view the elements
    in 'F' order if a is Fortran contiguous, 'C' order otherwise.
    'K' means to view the elements in the order they occur in memory,
    except for reversing the data when strides are negative.
    By default, 'C' order is used.
"""
Does ravel need to support the 'A' and 'K' options? It's kind of an advanced
use, and really more suited to .view(), perhaps? What I'm getting at is that
this version of ravel() conflates the two concepts: virtual ordering and
memory ordering in one function -- maybe they should be considered as two
different functions altogether -- I think that would make for less
confusion.
Éric Depagne wrote:
> 'row-first' and 'column-first' (or anything else that may be more explicit)?
I like more explicit, but 'row-first' and 'column-first' have two issues:
1) what about higher-dimension arrays? and 2) the "row" and "column"
convention is only that -- a convention -- I guess it's the way numpy
prints, which gives it some meaning, but there are times when arrays are
ordered: (col, row), rather than (row, col) (PIL uses that format for
instance). I like the Z and N, and maybe even if they aren't used as flag
names, they could be used in the docstring -- nice and ascii safe....
Nathaniel wrote:
> To see this, note that semantically it would be perfectly possible for .reshape() to
> take *two* order= arguments: one to specify the coordinate space mapping (2),
> and the other to specify the desired memory layout used by the result array (1). Of
> course we shouldn't actually do this, because in the unlikely event that someone
> actually wanted both of these they could just call asarray() on the output of
> reshape().
exactly -- my point about keeping the raveling with "virtual order" separate
from raveling with memory order -- it's really not critical that you can do
both with one function call.
-Chris
--
Christopher Barker, Ph.D. Oceanographer Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov
From matthew.brett at gmail.com Tue Apr 2 13:59:01 2013
From: matthew.brett at gmail.com (Matthew Brett)
Date: Tue, 2 Apr 2013 13:59:01 -0400
Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
In-Reply-To: References: Message-ID:
Hi,
On Tue, Apr 2, 2013 at 7:32 AM, Nathaniel Smith wrote:
> On Sat, Mar 30, 2013 at 2:08 AM, Matthew Brett > wrote:
>> Hi,
>>
>> We were teaching today, and found ourselves getting very confused
>> about ravel and shape in numpy.
>>
>> Summary
>> --------------
>>
>> There are two separate ideas needed to understand ordering in ravel and
>> reshape:
>>
>> Idea 1): ravel / reshape can proceed from the last axis to the first,
>> or the first to the last. This is "ravel index ordering"
>> Idea 2) The physical layout of the array (on disk or in memory) can be
>> "C" or "F" contiguous or neither.
>> This is "memory ordering"
>>
>> The index ordering is usually (but see below) orthogonal to the memory
>> ordering.
>>
>> The 'ravel' and 'reshape' commands use "C" and "F" in the sense of
>> index ordering, and this mixes the two ideas and is confusing.
>>
>> What the current situation looks like
>> ----------------------------------------------------
>>
>> Specifically, we've been rolling this around 4 experienced numpy users
>> and we all predicted at least one of the results below wrongly.
>>
>> This was what we knew, or should have known:
>>
>> In [2]: import numpy as np
>>
>> In [3]: arr = np.arange(10).reshape((2, 5))
>>
>> In [5]: arr.ravel()
>> Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>
>> So, the 'ravel' operation unravels over the last axis (1) first,
>> followed by axis 0.
>>
>> So far so good (even if the opposite to MATLAB, Octave).
>>
>> Then we found the 'order' flag to ravel:
>>
>> In [10]: arr.flags
>> Out[10]:
>> C_CONTIGUOUS : True
>> F_CONTIGUOUS : False
>> OWNDATA : False
>> WRITEABLE : True
>> ALIGNED : True
>> UPDATEIFCOPY : False
>>
>> In [11]: arr.ravel('C')
>> Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>
>> But we soon got confused. How about this?
>> >> In [12]: arr_F = np.array(arr, order='F') >> >> In [13]: arr_F.flags >> Out[13]: >> C_CONTIGUOUS : False >> F_CONTIGUOUS : True >> OWNDATA : True >> WRITEABLE : True >> ALIGNED : True >> UPDATEIFCOPY : False >> >> In [16]: arr_F >> Out[16]: >> array([[0, 1, 2, 3, 4], >> [5, 6, 7, 8, 9]]) >> >> In [17]: arr_F.ravel('C') >> Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) >> >> Right - so the flag 'C' to ravel, has got nothing to do with *memory* >> ordering, but is to do with *index* ordering. >> >> And in fact, we can ask for memory ordering specifically: >> >> In [22]: arr.ravel('K') >> Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) >> >> In [23]: arr_F.ravel('K') >> Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9]) >> >> In [24]: arr.ravel('A') >> Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) >> >> In [25]: arr_F.ravel('A') >> Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9]) >> >> There are some confusions to get into with the 'order' flag to reshape >> as well, of the same type. >> >> Ravel and reshape use the tems 'C' and 'F" in the sense of index ordering. >> >> This is very confusing. We think the index ordering and memory >> ordering ideas need to be separated, and specifically, we should avoid >> using "C" and "F" to refer to index ordering. >> >> Proposal >> ------------- >> >> * Deprecate the use of "C" and "F" meaning backwards and forwards >> index ordering for ravel, reshape >> * Prefer "Z" and "N", being graphical representations of unraveling in >> 2 dimensions, axis1 first and axis0 first respectively (excellent >> naming idea by Paul Ivanov) >> >> What do y'all think? > > Surely it should be "Z" and "?"? ;-) > > I knew what your examples would produce, but only because I've bumped into > this before. When you do reshapes of various sorts (ravel() == > reshape((-1,))), then, like you say, there are two totally different sets of > coordinate mapping in play: > > chunk of memory <-1-> virtual array layout <-2-> new array layout > (C pointers) <---> (Python indexes) <---> (Python indexes) > > Mapping (1) is determined by the array strides, and you have to think about > it when you interface with C code, but at the Python level it's pretty much > irrelevant; all operations are defined at the "virtual array layout" level. > > Further confusing the issue is the fact that the vast majority of legal > memory<->virtual array mappings are *neither* C- nor F-ordered. Strides are > very flexible. > > Further further confusing the issue is that mapping (2) actually consists of > two mappings: if you have an array with shape (3, 4, 5) and reshape it to > (4, 15), then the way you work out the overall mapping is by first mapping > the (3, 4, 5) onto a flat 1-d space with 60 elements, and then mapping > *that* to the (4, 15) space. > > Anyway, I agree that this is very confusing; certainly it confused me. If > you bump into these two mappings just in passing, and separately, then it's > very easy to miss the fact that they have nothing to do with each other. And > I agree that using exactly the same terminology for both of them is part of > what causes this. I even kind of like the "Z"/"N" naming scheme (I still > have to look up what C/F actually mean every time, I'm ashamed to say). > > But I don't see how the proposed solution helps, because the problem isn't > that mapping (1) and (2) use different ordering schemes -- the > column-major/row-major distinction really does apply to both equally. Using > different names for those seems like it will confuse the issue further, if > anything. 
The problem IMHO is that sometimes "order=" is used to specify > mapping (1), and sometimes it's used to specify mapping (2), when in fact > these are totally orthogonal. Yes. Of course ravel is the perfect storm because it refers to order in both senses. > To see this, note that semantically it would be perfectly possible for > .reshape() to take *two* order= arguments: one to specify the coordinate > space mapping (2), and the other to specify the desired memory layout used > by the result array (1). Of course we shouldn't actually do this, because in > the unlikely event that someone actually wanted both of these they could > just call asarray() on the output of reshape(). Yes. > Maybe we should go through and rename "order" to something more descriptive > in each case, so we'd have > a.reshape(..., index_order="C") > a.copy(memory_order="F") > etc.? That seems like a good idea. If you are proposing it, I am "+1". > This way if you just bumped into these while reading code, it would still be > immediately obvious that they were dealing with totally different concepts. > Compare to reading along without the docs and seeing > a.reshape(..., order="Z") > a.copy(order="C") > That'd just leave me even more baffled than the current system -- I'd start > thinking that "Z" and "C" somehow were different options for the same order= > option, so they must somehow mean ways of ordering elements? I don't think you'd be more baffled than the current system, which, as you say, conflates two orthogonal concepts. Rather, I think it would cause the user to stop, as they should, and consider what concept order is using in this case. I don't find it difficult to explain this: There are two different but related concepts of 'order' 1) The memory layout of the array 2) The index ordering used to unravel the array If you see 'Z' or 'N" for 'order' - that refers to index ordering. If you see 'C' or 'F" for order - that refers to memory layout. Cheers, Matthew From matthew.brett at gmail.com Tue Apr 2 14:04:05 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 2 Apr 2013 14:04:05 -0400 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1364837019.2404.12.camel@sebastian-laptop> Message-ID: Hi, On Tue, Apr 2, 2013 at 12:29 PM, Chris Barker - NOAA Federal wrote: > On Mon, Apr 1, 2013 at 10:15 PM, Matthew Brett wrote: >> Thank you for the compliment, it's more enjoyable than other potential >> explanations of my confusion (sigh). >> >> But, I don't think that is the explanation. > > well, the core explanation is these are difficult and intertwined > concepts...And yes, better names and better docs can help. > >> Last, as soon as we came to the distinction between index order and >> memory layout, it was clear. >> >> We all agreed that this was an important distinction that would >> improve numpy if we made it. > > yup. > >> I think you agree that there is potential for confusion, and there >> doesn't seem any reason to continue with that confusion if we can come >> up with a clearer name. > > well, changing an API is not to be taken lightly -- we are not > discussion how we'd do it if we were to start from fresh here. So any > change should make things enough better that it is worth dealing with > the process of teh change. Yes, for sure. I was only trying to point out that we are not talking about breaking backwards compatibility. >> So here is a compromise proposal. 
> >> * Preferring the names 'c-style' and 'f-style' for the indexing order >> case (ravel, reshape, flatiter) > >> * Leaving 'C" and 'F' as functional shortcuts, so there is no possible >> backwards-compatibility problem. > > seems reasonable enough -- though even with the backward > compatibility, users will be faces with many, many older examples and > docs that use "C' and 'F', while the new ones refer to the new names > -- might this be cause for even more confusion (at least for a few > years...) I doubt it would be 'even more' confusion. They would only have to read the docstrings to work out what is meant, and I believe, with better names, they'd be less likely to fall into the traps I fell into, at least. > leaving me with an equivocal +0 on that .... > > antoher thought: > > """ > Definition: np.ravel(a, order='C') > > A 1-D array, containing the elements of the input, is returned. A copy is > made only if needed. > > Parameters > ---------- > a : array_like > Input array. The elements in ``a`` are read in the order specified by > `order`, and packed as a 1-D array. > order : {'C','F', 'A', 'K'}, optional > The elements of ``a`` are read in this order. 'C' means to view > the elements in C (row-major) order. 'F' means to view the elements > in Fortran (column-major) order. 'A' means to view the elements > in 'F' order if a is Fortran contiguous, 'C' order otherwise. > 'K' means to view the elements in the order they occur in memory, > except for reversing the data when strides are negative. > By default, 'C' order is used. > """ > > Does ravel need to support the 'A' and 'K' options? It's kind of an > advanced use, and really more suited to .view(), perhaps? > > What I'm getting at is that this version of ravel() conflates the two > concepts: virtual ordering and memory ordering in one function -- > maybe they should be considered as two different functions altogether > -- I think that would make for less confusion. I think it would conceal the confusion only. If we don't have 'A' and 'K' in there, it allows us to keep the dream of a world where 'C" only refers to index ordering, but *only for this docstring*. As soon as somebody does ``np.array(arr, order='C')`` they will find themselves in conceptual trouble again. Cheers, Matthew From ralf.gommers at gmail.com Tue Apr 2 14:12:24 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 2 Apr 2013 20:12:24 +0200 Subject: [Numpy-discussion] [SciPy-Dev] NumPy/SciPy participation in GSoC 2013 In-Reply-To: References: Message-ID: On Mon, Apr 1, 2013 at 2:27 PM, Todd wrote: > On Mon, Apr 1, 2013 at 1:58 PM, Ralf Gommers wrote: > >> >> >> >> On Tue, Mar 26, 2013 at 12:27 AM, Ralf Gommers wrote: >> >>> >>> >>> >>> On Thu, Mar 21, 2013 at 10:20 PM, Ralf Gommers wrote: >>> >>>> Hi all, >>>> >>>> It is the time of the year for Google Summer of Code applications. If >>>> we want to participate with Numpy and/or Scipy, we need two things: enough >>>> mentors and ideas for projects. If we get those, we'll apply under the PSF >>>> umbrella. They've outlined the timeline they're working by and guidelines >>>> at >>>> http://pyfound.blogspot.nl/2013/03/get-ready-for-google-summer-of-code.html. >>>> >>>> >>>> We should be able to come up with some interesting project ideas I'd >>>> think, let's put those at >>>> http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas. Preferably >>>> with enough detail to be understandable for people new to the projects and >>>> a proposed mentor. 
>>>> >>>> We need at least 3 people willing to mentor a student. Ideally we'd >>>> have enough mentors this week, so we can apply to the PSF on time. If >>>> you're willing to be a mentor, please send me the following: name, email >>>> address, phone nr, and what you're interested in mentoring. If you have >>>> time constaints and have doubts about being able to be a primary mentor, >>>> being a backup mentor would also be helpful. >>>> >>> >>> So far we've only got one primary mentor (thanks Chuck!), most core devs >>> do not seem to have the bandwidth this year. If there are other people >>> interested in mentoring please let me know. If not, then it looks like >>> we're not participating this year. >>> >> >> Hi all, an update on GSoC'13. We do have enough mentoring power after >> all; NumPy/SciPy is now registered as a participating project on the PSF >> page: http://wiki.python.org/moin/SummerOfCode/2013 >> >> Prospective students: please have a look at >> http://wiki.python.org/moin/SummerOfCode/Expectations and at >> http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas. In particular >> note that we require you to make one pull request to NumPy/SciPy which has >> to be merged *before* the application deadline (May 3). So please start >> thinking about that, and start a discussion on your project idea on this >> list. >> >> Cheers, >> Ralf >> >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> > > There were a number of other ideas in this thread: > > http://mail.scipy.org/pipermail/numpy-discussion/2013-March/065699.html > Thanks Todd. Your idea 5 is pretty much what Nathaniel just detailed out, it's on the ideas page now. That should enable idea 1 as well. The other idea that got some positive feedback was 3: http://mail.scipy.org/pipermail/numpy-discussion/2013-March/065710.html. If you or someone else could make that a little more concrete, we can put that up. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From scopatz at gmail.com Tue Apr 2 14:26:25 2013 From: scopatz at gmail.com (Anthony Scopatz) Date: Tue, 2 Apr 2013 13:26:25 -0500 Subject: [Numpy-discussion] ANN: XDress v0.1 -- Automatic Code Generator and C/C++ Wrapper Message-ID: Hello All, I am spamming the lists which may be interested in a C/C++ automatic API wrapper / code generator / type system / thing I wrote. I'll keep future updates more discrete. I'd love to help folks get started with this and more participation is always welcome! Release notes are below. Please visit the docs: http://bit.ly/xdress-code Or just grab the repo: http://github.com/scopatz/xdress Be Well Anthony ======================== XDress 0.1 Release Notes ======================== XDress is an automatic wrapper generator for C/C++ written in pure Python. Currently, xdress may generate Python bindings (via Cython) for C++ classes & functions and in-memory wrappers for C++ standard library containers (sets, vectors, maps). In the future, other tools and bindings will be supported. The main enabling feature of xdress is a dynamic type system that was designed with the purpose of API generation in mind. Release highlights: - Dynamic system for specifying types - Automatically describes C/C++ APIs from source code with no modifications. - Python extension module generation (via Cython) from C++ API descriptions - Python views into C++ STL containers. 
Vectors are NumPy arrays while maps and sets have custom
collections.MutableMapping and collections.MutableSet subclasses.
- Command line interface to the above tools.
Please visit the website for more information: http://bit.ly/xdress-code
Or grab the code from GitHub: http://github.com/scopatz/xdress
XDress is free & open source (BSD 2-clause license) and requires Python 2.7,
NumPy 1.5+, PyTables 2.1+, Cython 0.18+, GCC-XML, and lxml.
New Features
============
Type System
-----------
This module provides a suite of tools for denoting, describing, and
converting between various data types and the types coming from various
systems. This is achieved by providing canonical abstractions of various
kinds of types:
* Base types (int, str, float, non-templated classes)
* Refined types (even or odd ints, strings containing the letter 'a')
* Dependent types (templates such as arrays, maps, sets, vectors)
All types are known by their name (a string identifier) and may be aliased
with other names. However, the string id of a type is not sufficient to
fully describe most types. The system here implements a canonical form for
all kinds of types. This canonical form is itself hashable, being comprised
only of strings, ints, and tuples.
Descriptions
------------
A key component of API wrapper generation is having a top-level, abstract
representation of the software that is being wrapped. In C++ there are three
basic constructs which may be wrapped: variables, functions, and classes.
Here we restrict ourselves to wrapping classes and functions, though
variables may be added in the future. The abstract representation of a C++
class is known as a description (abbr. desc). This description is simply a
Python dictionary with a specific structure. This structure makes heavy use
of the type system to declare the types of all needed parameters.
Mini-FAQ
========
* Why not use an existing solution (e.g., SWIG)? Their type systems don't
support run-time, user-provided refinement types, and thus are unsuited for
verification & validation use cases that often arise in computational
science. Furthermore, they tend to not handle C++ dependent types well
(i.e. vector<T> does not come back as a np.view(..., dtype=T)).
* Why GCC-XML and not Clang's AST? I tried using Clang's AST (and the
remnants of a broken visitor class remain in the code base). However, the
official Clang AST Python bindings lack support for template argument types.
This is a really big deal. Other C++ ASTs may be supported in the future --
including Clang's.
* I run xdress and it creates these files, now what?! It is your job to
integrate the files created by xdress into your build system.
Join in the Fun!
================
If you are interested in using xdress on your project (and need help),
contributing back to xdress, starting up a development team, or writing your
own code generation front end tool on top of the type system and
autodescriber, please let me know. Participation is very welcome!
Authors
=======
XDress was written by `Anthony Scopatz `_, who had many type system
discussions with John Bachan over coffee at the Div school, and was polished
up and released under the encouragement of Christopher Jordan-Squire at
`PyCon 2013 `_.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From josef.pktd at gmail.com Tue Apr 2 14:37:39 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 2 Apr 2013 14:37:39 -0400 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1364837019.2404.12.camel@sebastian-laptop> Message-ID: On Tue, Apr 2, 2013 at 2:04 PM, Matthew Brett wrote: > Hi, > > On Tue, Apr 2, 2013 at 12:29 PM, Chris Barker - NOAA Federal > wrote: >> On Mon, Apr 1, 2013 at 10:15 PM, Matthew Brett wrote: >>> Thank you for the compliment, it's more enjoyable than other potential >>> explanations of my confusion (sigh). >>> >>> But, I don't think that is the explanation. >> >> well, the core explanation is these are difficult and intertwined >> concepts...And yes, better names and better docs can help. >> >>> Last, as soon as we came to the distinction between index order and >>> memory layout, it was clear. >>> >>> We all agreed that this was an important distinction that would >>> improve numpy if we made it. >> >> yup. >> >>> I think you agree that there is potential for confusion, and there >>> doesn't seem any reason to continue with that confusion if we can come >>> up with a clearer name. >> >> well, changing an API is not to be taken lightly -- we are not >> discussion how we'd do it if we were to start from fresh here. So any >> change should make things enough better that it is worth dealing with >> the process of teh change. > > Yes, for sure. I was only trying to point out that we are not talking > about breaking backwards compatibility. > >>> So here is a compromise proposal. >> >>> * Preferring the names 'c-style' and 'f-style' for the indexing order >>> case (ravel, reshape, flatiter) >> >>> * Leaving 'C" and 'F' as functional shortcuts, so there is no possible >>> backwards-compatibility problem. >> >> seems reasonable enough -- though even with the backward >> compatibility, users will be faces with many, many older examples and >> docs that use "C' and 'F', while the new ones refer to the new names >> -- might this be cause for even more confusion (at least for a few >> years...) > > I doubt it would be 'even more' confusion. They would only have to > read the docstrings to work out what is meant, and I believe, with > better names, they'd be less likely to fall into the traps I fell > into, at least. > >> leaving me with an equivocal +0 on that .... >> >> antoher thought: >> >> """ >> Definition: np.ravel(a, order='C') >> >> A 1-D array, containing the elements of the input, is returned. A copy is >> made only if needed. >> >> Parameters >> ---------- >> a : array_like >> Input array. The elements in ``a`` are read in the order specified by >> `order`, and packed as a 1-D array. >> order : {'C','F', 'A', 'K'}, optional >> The elements of ``a`` are read in this order. 'C' means to view >> the elements in C (row-major) order. 'F' means to view the elements >> in Fortran (column-major) order. 'A' means to view the elements >> in 'F' order if a is Fortran contiguous, 'C' order otherwise. >> 'K' means to view the elements in the order they occur in memory, >> except for reversing the data when strides are negative. >> By default, 'C' order is used. >> """ >> >> Does ravel need to support the 'A' and 'K' options? It's kind of an >> advanced use, and really more suited to .view(), perhaps? 
>> >> What I'm getting at is that this version of ravel() conflates the two >> concepts: virtual ordering and memory ordering in one function -- >> maybe they should be considered as two different functions altogether >> -- I think that would make for less confusion. > > I think it would conceal the confusion only. If we don't have 'A' > and 'K' in there, it allows us to keep the dream of a world where 'C" > only refers to index ordering, but *only for this docstring*. As > soon as somebody does ``np.array(arr, order='C')`` they will find > themselves in conceptual trouble again. I still don't see why order is not a general concept, whether it refers to memory or indexing/iterating. The qualifier can be made clear in the docstrings (or from the context). It's all over the documentation: we can iterate in F-order over an array that is in C-order (*), or vice-versa (*) or just some strides http://docs.scipy.org/doc/numpy/reference/arrays.nditer.html http://docs.scipy.org/doc/numpy/reference/generated/numpy.nditer.html#numpy.nditer pure shape http://docs.scipy.org/doc/numpy/reference/routines.array-manipulation.html#changing-array-shape shape and copy http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.flatten.html#numpy.ndarray.flatten memory http://docs.scipy.org/doc/numpy/reference/routines.array-manipulation.html#changing-kind-of-array http://docs.scipy.org/doc/numpy/reference/routines.array-creation.html#from-existing-data Josef > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From njs at pobox.com Tue Apr 2 14:44:31 2013 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 2 Apr 2013 19:44:31 +0100 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: Message-ID: On Tue, Apr 2, 2013 at 6:59 PM, Matthew Brett wrote: > On Tue, Apr 2, 2013 at 7:32 AM, Nathaniel Smith wrote: >> Maybe we should go through and rename "order" to something more descriptive >> in each case, so we'd have >> a.reshape(..., index_order="C") >> a.copy(memory_order="F") >> etc.? > > That seems like a good idea. If you are proposing it, I am "+1". Well, I'm just throwing it out there as an idea, but if people like it, nothing better turns up, and someone implements it, then I'm not going to say no... >> This way if you just bumped into these while reading code, it would still be >> immediately obvious that they were dealing with totally different concepts. >> Compare to reading along without the docs and seeing >> a.reshape(..., order="Z") >> a.copy(order="C") >> That'd just leave me even more baffled than the current system -- I'd start >> thinking that "Z" and "C" somehow were different options for the same order= >> option, so they must somehow mean ways of ordering elements? > > I don't think you'd be more baffled than the current system, which, as > you say, conflates two orthogonal concepts. Rather, I think it would > cause the user to stop, as they should, and consider what concept > order is using in this case. > > I don't find it difficult to explain this: > > There are two different but related concepts of 'order' > > 1) The memory layout of the array > 2) The index ordering used to unravel the array > > If you see 'Z' or 'N" for 'order' - that refers to index ordering. > If you see 'C' or 'F" for order - that refers to memory layout. 
Sure, you can write it down like this, but compare to this system: If you see 'Z' or 'N" for 'order' - that refers to memory ordering. If you see 'C' or 'F" for order - that refers to index layout. Now suppose I forget which system we actually use -- how do you remember which system is which? It's totally arbitrary. Now I have even more things to remember. And I'm certainly not going to work out this distinction just from seeing these used once or twice in someone else's code. This is like observing that if I say "go North" then it's ambiguous about whether I want you to drive or walk, and concluding that we need new words for the directions depending on what sort of vehicle you use. So "go North" means drive North, "go htuoS" means walk North, etc. Totally silly. Makes much more sense to have one set of words for directions, and then make clear from context what the directions are used for -- "drive North", "walk North". Or "iterate C-wards", "store F-wards". "C" and "Z" mean exactly the same thing -- they describe a way of unraveling a cube into a straight line. The difference is what we do with the resulting straight line. That's why I'm suggesting that the distinction should be made in the name of the argument. -n From toddrjen at gmail.com Tue Apr 2 15:10:02 2013 From: toddrjen at gmail.com (Todd) Date: Tue, 2 Apr 2013 21:10:02 +0200 Subject: [Numpy-discussion] [SciPy-Dev] NumPy/SciPy participation in GSoC 2013 In-Reply-To: References: Message-ID: On Tue, Apr 2, 2013 at 8:12 PM, Ralf Gommers wrote: > > > > On Mon, Apr 1, 2013 at 2:27 PM, Todd wrote: > >> >> There were a number of other ideas in this thread: >> >> http://mail.scipy.org/pipermail/numpy-discussion/2013-March/065699.html >> > > Thanks Todd. Your idea 5 is pretty much what Nathaniel just detailed out, > it's on the ideas page now. That should enable idea 1 as well. The other > idea that got some positive feedback was 3: > http://mail.scipy.org/pipermail/numpy-discussion/2013-March/065710.html. > If you or someone else could make that a little more concrete, we can put > that up. > > For 3: With structured arrays, you can access them by name (key) in a manner identical to dictionaries: y = x['f'] It also has a method for accessing the list of names (keys): x.dtype.names This should be maintained for backwards-compatibility, but these methods should also be added: x.keys -- returns a list of field names x.values -- returns a list of views into the array, one for each structure x.items -- returns a list of tuple containing name/structure pairs (the structures being a views) x.iterkeys/itervalues/iteritems -- returns an iterable over the corresponding objects (should not be available in python 3) x.viewkeys/viewvalues/viewitems -- the same as the corresponding item, since they always return views (should not be available in python 3) x.has_key -- tests if a field name is present (should not be available in python 3, should use "key in x.keys()") x.get -- get a field by name, returning a default array (an empty array by default) if not present x.update -- copy values into the matching key from another structured array, a dict, or list of key/value tuples. Unlike dicts this will only work for keys that are already present. Documentation should probably be updated to have these as the default ways of interacting with structured arrays, with the old methods deprecated. "Names" should also probably be renamed "keys" for compatibility with dicts. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Tue Apr 2 15:25:19 2013 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 2 Apr 2013 20:25:19 +0100 Subject: [Numpy-discussion] [SciPy-Dev] NumPy/SciPy participation in GSoC 2013 In-Reply-To: References: Message-ID: On Tue, Apr 2, 2013 at 8:10 PM, Todd wrote: > > On Tue, Apr 2, 2013 at 8:12 PM, Ralf Gommers wrote: >> >> >> >> >> On Mon, Apr 1, 2013 at 2:27 PM, Todd wrote: >>> >>> >>> There were a number of other ideas in this thread: >>> >>> http://mail.scipy.org/pipermail/numpy-discussion/2013-March/065699.html >> >> >> Thanks Todd. Your idea 5 is pretty much what Nathaniel just detailed out, >> it's on the ideas page now. That should enable idea 1 as well. The other >> idea that got some positive feedback was 3: >> http://mail.scipy.org/pipermail/numpy-discussion/2013-March/065710.html. If >> you or someone else could make that a little more concrete, we can put that >> up. >> > > For 3: > > With structured arrays, you can access them by name (key) in a manner > identical to dictionaries: > > y = x['f'] > > It also has a method for accessing the list of names (keys): > > x.dtype.names > > This should be maintained for backwards-compatibility, but these methods > should also be added: > > x.keys -- returns a list of field names > x.values -- returns a list of views into the array, one for each structure > x.items -- returns a list of tuple containing name/structure pairs (the > structures being a views) > x.iterkeys/itervalues/iteritems -- returns an iterable over the > corresponding objects (should not be available in python 3) > x.viewkeys/viewvalues/viewitems -- the same as the corresponding item, since > they always return views (should not be available in python 3) > x.has_key -- tests if a field name is present (should not be available in > python 3, should use "key in x.keys()") > x.get -- get a field by name, returning a default array (an empty array by > default) if not present > x.update -- copy values into the matching key from another structured array, > a dict, or list of key/value tuples. Unlike dicts this will only work for > keys that are already present. > > Documentation should probably be updated to have these as the default ways > of interacting with structured arrays, with the old methods deprecated. > "Names" should also probably be renamed "keys" for compatibility with dicts. I'm concerned that this will be 1 week of coding + 2 months of arguing over whether it's actually a good idea to add all these methods to every ndarray, whether it makes sense to have "views" for ndarrays, what should be done with "in", can/should we really rename "names", etc. Maybe I'm just pessimistic, but it isn't as much a slam-dunk obviously-the-right-thing improvement as some things. (Also I have no idea what the other comment about compact dicts that Ralf referred to above meant.) Handling such debates is a super-important part of OSS coding, but probably the worst thing to put on the critical path of a GSoC project. (Actually I guess I may be guilty of this myself, if anyone is worried about that dtype proposal speak up now... I think it can be done without serious compatibility breaks.) -n From chris.barker at noaa.gov Tue Apr 2 15:42:51 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Tue, 2 Apr 2013 12:42:51 -0700 Subject: [Numpy-discussion] timezones and datetime64 Message-ID: Hey folks, We've recently run into some issues with time zones and datetime64 (in numpy 1.7). 
Specifically, there no longer seems to be a way to have what the python
datetime module calls "naive" datetime objects -- i.e. ones that have no
awareness of timezones. Moreover, datetime64 seems to enforce the locale
settings of the machine you're running on, with no way to turn that off.
This is a "Bad Thing?". Time zones are a nightmare -- particularly a
nightmare for computer code, and particularly with Daylight Savings issues.
As a result, an application needs to either be fully-properly timezone
aware, and manage it all properly, or be completely timezone naive -- mixing
the two is a recipe for disaster! (OK, OK, I'm being a little histrionic
here...) Getting timezone handling right is actually pretty tricky, and
takes a fair bit of code, and is incompatible with some simple libraries. As
a result, many of us punt and go with the naive approach. In particular a
major app I'm working on has always made it the responsibility of the user
to provide all input in the same timezone. If/when we do get smarter, we'll
still treat timezone handling as an I/O issue -- internally, all datetimes
will be in the same, naive, timezone. So I want it to be possible, and
ideally easy, to use naive datetime64s. One way to think about it is that
the UTC time zone is equivalent to a naive object -- if you use UTC
everywhere, no timezone conversions will take place, but with numpy 1.7,
this can be tricky.
These issues come up in two core places:
1) Creating a datetime64
from the docs: "ISO 8601 specifies to use the local time zone if none is
explicitly given"
Well, yes and no -- ISO 8601 specifies that if no time zone is given, it
means "local time". But it does not specify what local time means, nor how
it should be used in a computer library. datetime64 seems to define it as
"use the computer's locale setting for the time zone". It also seems to not
just keep the time zone around as meta-data, but actually change the
internal representation to UTC. This is a really bad idea:
- people have their time zones set wrong
- or move their laptops around between time zones!
- people, for example, run models for a location that is not their computer
time zone
- people run apps through the web -- who knows or should care what time zone
the server is in?
Note that this isn't just the string representation, but also conversion
from a datetime.datetime object:
In [28]: dt = datetime.datetime(2013, 4, 2, 12)
In [29]: dt64 = np.datetime64(dt)
In [30]: dt64
Out[30]: numpy.datetime64('2013-04-02T05:00:00.000000-0700')
This is particularly problematic, as the built-in datetime module has no
tzinfo objects -- you need a third-party library to supply them...
2) creating something else from a datetime64:
In [47]: np.datetime64('2013-04-02T12:00:00-07').astype(datetime.datetime)
Out[47]: datetime.datetime(2013, 4, 2, 19, 0)
so I put in 12:00, but get back 19:00 -- and the datetime.datetime object
has lost the timezone info.
In [54]: str(np.datetime64('2013-04-02T12:00:00Z'))
Out[54]: '2013-04-02T05:00:00-0700'
I put in a UTC datetime, but get a string representation in my locale time
-- this can get pretty ugly, particularly if you have to deal with DST.
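One partial workaround -- assuming I'm reading the 1.7 docs right --
np.datetime_as_string lets you pick the zone used for formatting, so you can
at least force UTC output instead of the locale:

    import numpy as np

    dt64 = np.datetime64('2013-04-02T12:00:00Z')
    print(str(dt64))   # locale-dependent, e.g. '2013-04-02T05:00:00-0700'
    print(np.datetime_as_string(dt64, timezone='UTC'))
    # should give '2013-04-02T12:00:00Z' regardless of the machine's locale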
Using the locale also means that you have to do DST whether you want to or
not, which can be weird:
In [7]: np.arange('2013-03-10T07Z', '2013-03-10T12Z', dtype='datetime64[h]')
Out[7]: array(['2013-03-09T23-0800', '2013-03-10T00-0800',
'2013-03-10T01-0800', '2013-03-10T03-0700', '2013-03-10T04-0700'],
dtype='datetime64[h]')
so not all the elements are in the same TZ. What happens when you want to go
the reverse route? Very odd things with DST:
In [15]: np.datetime64('2013-03-10T01:30')
Out[15]: numpy.datetime64('2013-03-10T01:30-0800')
In [16]: np.datetime64('2013-03-10T02:00')
Out[16]: numpy.datetime64('2013-03-10T03:00-0700')
# I put in 2:00, get back 3:00 !
In [17]: np.datetime64('2013-03-10T02:30')
Out[17]: numpy.datetime64('2013-03-10T03:30-0700')
# I put in 2:30, get back 3:30 !
In [18]: np.datetime64('2013-03-10T03:00')
Out[18]: numpy.datetime64('2013-03-10T03:00-0700')
# I put in 3:00, get back 3:00 !
To deal with all this, what we'll have to do is ensure that we are using UTC
everywhere, and not ever use the built-in string representation. As you can
see from the above, that's kind of a pain -- datetime.datetimes are often
not timezone aware, people put whatever strings they put in, etc. My
understanding is that datetime64 is still experimental, and thus we have
room to make some changes, so I propose:
1) allow a "naive" datetime64 -- one with no specified timezone.
2) have the default for ISO string interpretation be naive if no TZ is
specified
3) never use the locale setting unless explicitly asked for.
we'd need a way to specify timezone in string formatting, etc, not sure how
to do that.
A couple questions: Are there docs defining the internals of timezone
handling, and how one might change the timezone of an existing datetime64
array? As I poke at this a bit, I'm noticing that maybe time zones aren't
handled at all internally -- rather, the conversion is done to UTC when
creating a datetime64, and conversion is then done to the locale when
creating a string representation -- maybe nothing inside at all. Does the
timezone info survive saving in npz, etc?
PS: may have found a bug messing with arange and datetime64:
In [56]: a = np.arange(np.datetime64('2013-04-02T12:00:00Z'))
Bus error: 10
( repeatable with 1.7.0, py2.7 OS-X 32 bit )
not good, even though that probably shouldn't be legal anyway. I'm guessing
it's using the raw 64 bit integer and trying to build an array that big --
but it'd be better to get a ValueError.
Thoughts? In particular, how does Pandas or any other time series package
deal with all this?
-Chris
--
Christopher Barker, Ph.D. Oceanographer Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov
From chris.barker at noaa.gov Tue Apr 2 16:07:57 2013
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Tue, 2 Apr 2013 13:07:57 -0700
Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
In-Reply-To: References: <1364837019.2404.12.camel@sebastian-laptop> Message-ID:
On Tue, Apr 2, 2013 at 11:37 AM, wrote:
> I still don't see why order is not a general concept, whether it
> refers to memory or indexing/iterating.
I agree -- the ordering concept is the same, it's _what_ is being ordered
that's different. So I say we stick with 'C' and 'F' -- numpy users will
need to figure out what it means eventually in any case....
We need some better doc strings and *maybe* to rename a keyword argument or two. Partly I say maybe because the "order" keyword in ravel() actually mixes the two concepts anyway...

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov

From matthew.brett at gmail.com Tue Apr 2 17:21:18 2013
From: matthew.brett at gmail.com (Matthew Brett)
Date: Tue, 2 Apr 2013 17:21:18 -0400
Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
In-Reply-To: References:
Message-ID:

Hi,

On Tue, Apr 2, 2013 at 2:44 PM, Nathaniel Smith wrote:
> On Tue, Apr 2, 2013 at 6:59 PM, Matthew Brett wrote:
>> On Tue, Apr 2, 2013 at 7:32 AM, Nathaniel Smith wrote:
>>> Maybe we should go through and rename "order" to something more descriptive
>>> in each case, so we'd have
>>> a.reshape(..., index_order="C")
>>> a.copy(memory_order="F")
>>> etc.?
>>
>> That seems like a good idea. If you are proposing it, I am "+1".
>
> Well, I'm just throwing it out there as an idea, but if people like
> it, nothing better turns up, and someone implements it, then I'm not
> going to say no...

I would certainly be happy to implement it if there was some agreement it was the right way to go.

>>> This way if you just bumped into these while reading code, it would still be
>>> immediately obvious that they were dealing with totally different concepts.
>>> Compare to reading along without the docs and seeing
>>> a.reshape(..., order="Z")
>>> a.copy(order="C")
>>> That'd just leave me even more baffled than the current system -- I'd start
>>> thinking that "Z" and "C" somehow were different options for the same order=
>>> option, so they must somehow mean ways of ordering elements?
>>
>> I don't think you'd be more baffled than the current system, which, as
>> you say, conflates two orthogonal concepts. Rather, I think it would
>> cause the user to stop, as they should, and consider what concept
>> order is using in this case.
>>
>> I don't find it difficult to explain this:
>>
>> There are two different but related concepts of 'order'
>>
>> 1) The memory layout of the array
>> 2) The index ordering used to unravel the array
>>
>> If you see 'Z' or 'N' for 'order' - that refers to index ordering.
>> If you see 'C' or 'F' for order - that refers to memory layout.
>
> Sure, you can write it down like this, but compare to this system:
>
> If you see 'Z' or 'N' for 'order' - that refers to memory ordering.
> If you see 'C' or 'F' for order - that refers to index layout.
>
> Now suppose I forget which system we actually use -- how do you
> remember which system is which? It's totally arbitrary.

I don't think it is completely arbitrary, as 'Z' / 'N' come from the process of getting elements from a 2D array in a certain order, and C / F memory layouts correspond to exactly what C and Fortran do (whereas the concept of index order cannot be separated from memory order for C, Fortran).

> Now I have
> even more things to remember. And I'm certainly not going to work out
> this distinction just from seeing these used once or twice in someone
> else's code.

The extra things you have to remember are a) that there is a distinction (and this is good) and b) which of the two things you need to distinguish is 'Z' or 'C'. I think the benefit from a) is much greater than the small load from b).
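To make the two senses concrete (nothing proposed here, just plain numpy as it stands):

import numpy as np

a = np.arange(6).reshape(2, 3)

# index ordering: which index varies fastest when unraveling
a.ravel(order='C')        # array([0, 1, 2, 3, 4, 5]) - last index fastest
a.ravel(order='F')        # array([0, 3, 1, 4, 2, 5]) - first index fastest

# memory layout: how the bytes of a copy are arranged
b = np.asarray(a, order='F')
(a == b).all()            # True - same logical array...
b.flags['F_CONTIGUOUS']   # True - ...but Fortran order in memory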
> This is like observing that if I say "go North" then it's ambiguous > about whether I want you to drive or walk, and concluding that we need > new words for the directions depending on what sort of vehicle you > use. So "go North" means drive North, "go htuoS" means walk North, > etc. Totally silly. Makes much more sense to have one set of words for > directions, and then make clear from context what the directions are > used for -- "drive North", "walk North". Or "iterate C-wards", "store > F-wards". > > "C" and "Z" mean exactly the same thing -- they describe a way of > unraveling a cube into a straight line. The difference is what we do > with the resulting straight line. That's why I'm suggesting that the > distinction should be made in the name of the argument. Could you unpack that for the 'ravel' docstring? Because these options all refer to the way of unraveling and not the memory layout that results. Cheers, Matthew From matthew.brett at gmail.com Tue Apr 2 17:23:02 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 2 Apr 2013 17:23:02 -0400 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1364837019.2404.12.camel@sebastian-laptop> Message-ID: Hi, On Tue, Apr 2, 2013 at 4:07 PM, Chris Barker - NOAA Federal wrote: > On Tue, Apr 2, 2013 at 11:37 AM, wrote: >> I still don't see why order is not a general concept, whether it >> refers to memory or indexing/iterating. > > I agree -- the ordering concept is the same, it's _what_ is being > ordered that's different. So I say we stick with 'C' and 'F' -- numpy > users will need to figure out what it means eventually in any case.... I'm not quite sure what you are arguing. I thought we all agreed that the index ordering idea is *orthogonal* to the memory layout idea? Not so? Cheers, Matthew From pav at iki.fi Tue Apr 2 17:39:58 2013 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 03 Apr 2013 00:39:58 +0300 Subject: [Numpy-discussion] timezones and datetime64 In-Reply-To: References: Message-ID: 02.04.2013 22:42, Chris Barker - NOAA Federal kirjoitti: [clip] > As I poke at this a bit, I"m noticing that maybe time zones aren't > handles at all internally -- rather, the conversion is done to UTC > when creating a datetime64, and conversion is then done to the locale > when creating a strng representation -- maybe nothing inside at all. > > Does the timezone info survive saving in npz, etc? As far as I understand (more knowledgeable people, please correct me), Numpy's datetime handling in 1.7 is timezone-agnostic and works in UTC (which is not a time zone). That is, datetime64 represents an absolute point in time, and "2013-04-02T05:00:00-0100", "2013-04-02T04:00:00-0200", ... are represented by the same (on the binary-level) datetime64 value. In particular no timezone information is stored, see "datetime_array.view(np.uint64)", as it is not needed. So it's rather unlike datetime.datetime. I can see the reasons why it was designed as it is now --- the problems seem similar to Unicode on Python 3, the encoding/decoding is painful to get right especially as the timezones are a mess due to historical reasons. The above design seems philosophically at odds with the concept of a "naive" datetime type. A "naive" datetime is sort of datetime64[D] plus HH:MM:SS, but not quite. 
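A quick check of that agnosticism (just illustrating the point above, on 1.7):

import numpy as np

a = np.datetime64('2013-04-02T05:00:00-0100')
b = np.datetime64('2013-04-02T04:00:00-0200')
a == b                             # True - the same absolute instant
np.array([a, b]).view(np.uint64)   # identical integers, no TZ stored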
*** I think your point about using current timezone in interpreting user input being dangerous is probably correct --- perhaps UTC all the way would be a safer (and simpler) choice? -- Pauli Virtanen From njs at pobox.com Tue Apr 2 17:52:57 2013 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 2 Apr 2013 22:52:57 +0100 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: Message-ID: On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett wrote: >> This is like observing that if I say "go North" then it's ambiguous >> about whether I want you to drive or walk, and concluding that we need >> new words for the directions depending on what sort of vehicle you >> use. So "go North" means drive North, "go htuoS" means walk North, >> etc. Totally silly. Makes much more sense to have one set of words for >> directions, and then make clear from context what the directions are >> used for -- "drive North", "walk North". Or "iterate C-wards", "store >> F-wards". >> >> "C" and "Z" mean exactly the same thing -- they describe a way of >> unraveling a cube into a straight line. The difference is what we do >> with the resulting straight line. That's why I'm suggesting that the >> distinction should be made in the name of the argument. > > Could you unpack that for the 'ravel' docstring? Because these > options all refer to the way of unraveling and not the memory layout > that results. Z/C/column-major/whatever-you-want-to-call-it is a general strategy for converting between a 1-dim representation and a n-dim representation. In the case of memory storage, the 1-dim representation is the flat space of pointer arithmetic. In the case of ravel, the 1-dim representation is the flat space of a 1-dim indexed array. But the 1-dim-to-n-dim part is the same in both cases. I think that's why you're seeing people baffled by your proposal -- to them the "C" refers to this general strategy, and what's different is the context where it gets applied. So giving the same strategy two different names is silly; if anything it's the contexts that should have different names. -n From josef.pktd at gmail.com Tue Apr 2 19:09:25 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 2 Apr 2013 19:09:25 -0400 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: Message-ID: On Tue, Apr 2, 2013 at 5:52 PM, Nathaniel Smith wrote: > On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett wrote: >>> This is like observing that if I say "go North" then it's ambiguous >>> about whether I want you to drive or walk, and concluding that we need >>> new words for the directions depending on what sort of vehicle you >>> use. So "go North" means drive North, "go htuoS" means walk North, >>> etc. Totally silly. Makes much more sense to have one set of words for >>> directions, and then make clear from context what the directions are >>> used for -- "drive North", "walk North". Or "iterate C-wards", "store >>> F-wards". >>> >>> "C" and "Z" mean exactly the same thing -- they describe a way of >>> unraveling a cube into a straight line. The difference is what we do >>> with the resulting straight line. That's why I'm suggesting that the >>> distinction should be made in the name of the argument. >> >> Could you unpack that for the 'ravel' docstring? Because these >> options all refer to the way of unraveling and not the memory layout >> that results. 
>
> Z/C/column-major/whatever-you-want-to-call-it is a general strategy
> for converting between a 1-dim representation and a n-dim
> representation. In the case of memory storage, the 1-dim
> representation is the flat space of pointer arithmetic. In the case of
> ravel, the 1-dim representation is the flat space of a 1-dim indexed
> array. But the 1-dim-to-n-dim part is the same in both cases.
>
> I think that's why you're seeing people baffled by your proposal -- to
> them the "C" refers to this general strategy, and what's different is
> the context where it gets applied. So giving the same strategy two
> different names is silly; if anything it's the contexts that should
> have different names.

And once we get into memory optimization (and avoiding copies and preserving contiguity), it is necessary to keep both orders in mind: is the memory order 'F', and am I iterating/raveling in 'F' order (or slicing columns)?

I think having two separate keywords gives the impression we can choose two different things at the same time.

Josef

> -n
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From chris.barker at noaa.gov Tue Apr 2 19:39:42 2013
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Tue, 2 Apr 2013 16:39:42 -0700
Subject: [Numpy-discussion] timezones and datetime64
In-Reply-To: References:
Message-ID:

On Tue, Apr 2, 2013 at 2:39 PM, Pauli Virtanen wrote:
> As far as I understand (more knowledgeable people, please correct me),
> Numpy's datetime handling in 1.7 is timezone-agnostic and works in UTC
> (which is not a time zone). That is, datetime64 represents an absolute
> point in time, and "2013-04-02T05:00:00-0100",

Thanks Pauli -- actually, after writing that message, I realized that it looks like datetime64 doesn't know a thing about timezones of any sort, anyway, anyhow. Rather, what it does is apply timezones on I/O -- when creating a new datetime64 or when writing out (in text format, anyway). So really, it's not very smart.

> So it's rather unlike datetime.datetime. I can see the reasons why it
> was designed as it is now --- the problems seem similar to Unicode on
> Python 3, the encoding/decoding is painful to get right especially as
> the timezones are a mess due to historical reasons.

yup -- a real nightmare.

> The above design seems philosophically at odds with the concept of a
> "naive" datetime type. A "naive" datetime is sort of datetime64[D] plus
> HH:MM:SS, but not quite.

actually, I think datetime64 is naive -- the problem is entirely the I/O

> I think your point about using current timezone in interpreting user
> input being dangerous is probably correct ---

Thanks -- I'm convinced it's a really bad idea.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From josef.pktd at gmail.com Tue Apr 2 20:02:54 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 2 Apr 2013 20:02:54 -0400 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: Message-ID: On Tue, Apr 2, 2013 at 7:09 PM, wrote: > On Tue, Apr 2, 2013 at 5:52 PM, Nathaniel Smith wrote: >> On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett wrote: >>>> This is like observing that if I say "go North" then it's ambiguous >>>> about whether I want you to drive or walk, and concluding that we need >>>> new words for the directions depending on what sort of vehicle you >>>> use. So "go North" means drive North, "go htuoS" means walk North, >>>> etc. Totally silly. Makes much more sense to have one set of words for >>>> directions, and then make clear from context what the directions are >>>> used for -- "drive North", "walk North". Or "iterate C-wards", "store >>>> F-wards". >>>> >>>> "C" and "Z" mean exactly the same thing -- they describe a way of >>>> unraveling a cube into a straight line. The difference is what we do >>>> with the resulting straight line. That's why I'm suggesting that the >>>> distinction should be made in the name of the argument. >>> >>> Could you unpack that for the 'ravel' docstring? Because these >>> options all refer to the way of unraveling and not the memory layout >>> that results. >> >> Z/C/column-major/whatever-you-want-to-call-it is a general strategy >> for converting between a 1-dim representation and a n-dim >> representation. In the case of memory storage, the 1-dim >> representation is the flat space of pointer arithmetic. In the case of >> ravel, the 1-dim representation is the flat space of a 1-dim indexed >> array. But the 1-dim-to-n-dim part is the same in both cases. >> >> I think that's why you're seeing people baffled by your proposal -- to >> them the "C" refers to this general strategy, and what's different is >> the context where it gets applied. So giving the same strategy two >> different names is silly; if anything it's the contexts that should >> have different names. > > And once we get into memory optimization (and avoiding copies and > preserving contiguity), it is necessary to keep both orders in mind, > is memory order in "F" and am I iterating/raveling in "F" order > (or slicing columns). > > I think having two separate keywords give the impression we can > choose two different things at the same time. as aside (math): numpy.flatten made it into the Wikipedia page http://en.wikipedia.org/wiki/Vectorization_%28mathematics%29#Programming_language (and how it's different from R and Matlab/Octave, but doesn't mention: use order="F" to get the same behavior as math and the others) and the corresponding code in statsmodels (tools for vector autoregressive models by Wes) Josef baffled? 
>
> Josef
>
>> -n
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From matthew.brett at gmail.com Tue Apr 2 20:56:42 2013
From: matthew.brett at gmail.com (Matthew Brett)
Date: Tue, 2 Apr 2013 20:56:42 -0400
Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
In-Reply-To: References:
Message-ID:

Hi,

On Tue, Apr 2, 2013 at 5:52 PM, Nathaniel Smith wrote:
> On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett wrote:
>>> This is like observing that if I say "go North" then it's ambiguous
>>> about whether I want you to drive or walk, and concluding that we need
>>> new words for the directions depending on what sort of vehicle you
>>> use. So "go North" means drive North, "go htuoS" means walk North,
>>> etc. Totally silly. Makes much more sense to have one set of words for
>>> directions, and then make clear from context what the directions are
>>> used for -- "drive North", "walk North". Or "iterate C-wards", "store
>>> F-wards".
>>>
>>> "C" and "Z" mean exactly the same thing -- they describe a way of
>>> unraveling a cube into a straight line. The difference is what we do
>>> with the resulting straight line. That's why I'm suggesting that the
>>> distinction should be made in the name of the argument.
>>
>> Could you unpack that for the 'ravel' docstring? Because these
>> options all refer to the way of unraveling and not the memory layout
>> that results.
>
> Z/C/column-major/whatever-you-want-to-call-it is a general strategy
> for converting between a 1-dim representation and a n-dim
> representation. In the case of memory storage, the 1-dim
> representation is the flat space of pointer arithmetic. In the case of
> ravel, the 1-dim representation is the flat space of a 1-dim indexed
> array. But the 1-dim-to-n-dim part is the same in both cases.
>
> I think that's why you're seeing people baffled by your proposal -- to
> them the "C" refers to this general strategy, and what's different is
> the context where it gets applied. So giving the same strategy two
> different names is silly; if anything it's the contexts that should
> have different names.

Thanks - but I guess we all agree that np.array(a, order='C') and np.ravel(a, order='F') are using the term 'order' in two different and orthogonal senses, and the discussion is about whether it is possible to get confused about these two senses and, if so, what we should do about it.

Just to repeat what you're suggesting:

np.array(a, memory_order='C')
np.ravel(a, index_order='C')
np.ravel(a, index_order='K')

That makes sense to me. I guess we'd have to do something like:

def ravel(a, index_order='C', **kwargs):

where kwargs must be empty if index_order was given explicitly; otherwise it can contain only the single key 'order', accepted as an alias for 'index_order'. Thus:

np.ravel(a, index_order='C')

will work for the foreseeable future.
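A rough sketch of that shim (transition logic only, keyword names as proposed above, not a worked patch):

import numpy as np

def ravel(a, index_order=None, **kwargs):
    # sketch: accept the old 'order' keyword as an alias for
    # 'index_order' during a transition period
    if kwargs:
        if index_order is not None or list(kwargs) != ['order']:
            raise TypeError("give either 'order' or 'index_order', not both")
        index_order = kwargs['order']
    if index_order is None:
        index_order = 'C'
    return np.ravel(a, order=index_order)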
Cheers,

Matthew

From matthew.brett at gmail.com Tue Apr 2 21:09:30 2013
From: matthew.brett at gmail.com (Matthew Brett)
Date: Tue, 2 Apr 2013 21:09:30 -0400
Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
In-Reply-To: References:
Message-ID:

Hi,

On Tue, Apr 2, 2013 at 7:09 PM, wrote:
> On Tue, Apr 2, 2013 at 5:52 PM, Nathaniel Smith wrote:
>> On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett wrote:
>>>> This is like observing that if I say "go North" then it's ambiguous
>>>> about whether I want you to drive or walk, and concluding that we need
>>>> new words for the directions depending on what sort of vehicle you
>>>> use. So "go North" means drive North, "go htuoS" means walk North,
>>>> etc. Totally silly. Makes much more sense to have one set of words for
>>>> directions, and then make clear from context what the directions are
>>>> used for -- "drive North", "walk North". Or "iterate C-wards", "store
>>>> F-wards".
>>>>
>>>> "C" and "Z" mean exactly the same thing -- they describe a way of
>>>> unraveling a cube into a straight line. The difference is what we do
>>>> with the resulting straight line. That's why I'm suggesting that the
>>>> distinction should be made in the name of the argument.
>>>
>>> Could you unpack that for the 'ravel' docstring? Because these
>>> options all refer to the way of unraveling and not the memory layout
>>> that results.
>>
>> Z/C/column-major/whatever-you-want-to-call-it is a general strategy
>> for converting between a 1-dim representation and a n-dim
>> representation. In the case of memory storage, the 1-dim
>> representation is the flat space of pointer arithmetic. In the case of
>> ravel, the 1-dim representation is the flat space of a 1-dim indexed
>> array. But the 1-dim-to-n-dim part is the same in both cases.
>>
>> I think that's why you're seeing people baffled by your proposal -- to
>> them the "C" refers to this general strategy, and what's different is
>> the context where it gets applied. So giving the same strategy two
>> different names is silly; if anything it's the contexts that should
>> have different names.
>
> And once we get into memory optimization (and avoiding copies and
> preserving contiguity), it is necessary to keep both orders in mind:
> is the memory order 'F', and am I iterating/raveling in 'F' order
> (or slicing columns)?
>
> I think having two separate keywords gives the impression we can
> choose two different things at the same time.

I guess it could not make sense to do this:

np.ravel(a, index_order='C', memory_order='F')

It could make sense to do this:

np.reshape(a, (3, 4), index_order='F', memory_order='F')

but that just points out the inherent confusion between the uses of 'order', and in this case the fact that you can only do:

np.reshape(a, (3, 4), index_order='F')

correctly distinguishes between the meanings.

Best,

Matthew

From charlesr.harris at gmail.com Wed Apr 3 00:47:36 2013
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 2 Apr 2013 22:47:36 -0600
Subject: [Numpy-discussion] Sphinx question
Message-ID:

Making the numpy documentation generates hundreds of warnings like

...doc/source/reference/generated/numpy.memmap.argsort.rst:: WARNING: document isn't included in any toctree

Note that argsort is a method inherited from ndarray. These warnings are repeated for all ndarray methods in all classes that subclass ndarray. Does anyone know the proper way to deal with this?
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at hilboll.de Wed Apr 3 00:56:03 2013 From: lists at hilboll.de (Andreas Hilboll) Date: Wed, 03 Apr 2013 06:56:03 +0200 Subject: [Numpy-discussion] timezones and datetime64 In-Reply-To: References: Message-ID: <515BB663.7030804@hilboll.de> >> As I poke at this a bit, I"m noticing that maybe time zones aren't >> handles at all internally -- rather, the conversion is done to UTC >> when creating a datetime64, and conversion is then done to the locale >> when creating a strng representation -- maybe nothing inside at all. >> >> Does the timezone info survive saving in npz, etc? > > As far as I understand (more knowledgeable people, please correct me), > Numpy's datetime handling in 1.7 is timezone-agnostic and works in UTC > (which is not a time zone). That is, datetime64 represents an absolute > point in time, and "2013-04-02T05:00:00-0100", > "2013-04-02T04:00:00-0200", ... are represented by the same (on the > binary-level) datetime64 value. In particular no timezone information is > stored, see "datetime_array.view(np.uint64)", as it is not needed. > > So it's rather unlike datetime.datetime. I can see the reasons why it > was designed as it is now --- the problems seem similar to Unicode on > Python 3, the encoding/decoding is painful to get right especially as > the timezones are a mess due to historical reasons. > > The above design seems philosophically at odds with the concept of a > "naive" datetime type. A "naive" datetime is sort of datetime64[D] plus > HH:MM:SS, but not quite. > > *** > > I think your point about using current timezone in interpreting user > input being dangerous is probably correct --- perhaps UTC all the way > would be a safer (and simpler) choice? +1 From pav at iki.fi Wed Apr 3 03:59:27 2013 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 3 Apr 2013 07:59:27 +0000 (UTC) Subject: [Numpy-discussion] timezones and datetime64 References: Message-ID: Chris Barker - NOAA Federal noaa.gov> writes: [clip] > actually, I think datetime64 is naive -- the problem is entirely the I/O It's a counter from unix epoch, but as it's just an integer you can probably redefine the epoch and pretend your local time is UTC (as long as you don't care about DST etc.). It seems it could be made to work by interpreting timezoneless input as UTC dates. -- Pauli Virtanen From josef.pktd at gmail.com Wed Apr 3 08:19:23 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 3 Apr 2013 08:19:23 -0400 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: Message-ID: On Tue, Apr 2, 2013 at 9:09 PM, Matthew Brett wrote: > Hi, > > On Tue, Apr 2, 2013 at 7:09 PM, wrote: >> On Tue, Apr 2, 2013 at 5:52 PM, Nathaniel Smith wrote: >>> On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett wrote: >>>>> This is like observing that if I say "go North" then it's ambiguous >>>>> about whether I want you to drive or walk, and concluding that we need >>>>> new words for the directions depending on what sort of vehicle you >>>>> use. So "go North" means drive North, "go htuoS" means walk North, >>>>> etc. Totally silly. Makes much more sense to have one set of words for >>>>> directions, and then make clear from context what the directions are >>>>> used for -- "drive North", "walk North". Or "iterate C-wards", "store >>>>> F-wards". 
>>>>>
>>>>> "C" and "Z" mean exactly the same thing -- they describe a way of
>>>>> unraveling a cube into a straight line. The difference is what we do
>>>>> with the resulting straight line. That's why I'm suggesting that the
>>>>> distinction should be made in the name of the argument.
>>>>
>>>> Could you unpack that for the 'ravel' docstring? Because these
>>>> options all refer to the way of unraveling and not the memory layout
>>>> that results.
>>>
>>> Z/C/column-major/whatever-you-want-to-call-it is a general strategy
>>> for converting between a 1-dim representation and a n-dim
>>> representation. In the case of memory storage, the 1-dim
>>> representation is the flat space of pointer arithmetic. In the case of
>>> ravel, the 1-dim representation is the flat space of a 1-dim indexed
>>> array. But the 1-dim-to-n-dim part is the same in both cases.
>>>
>>> I think that's why you're seeing people baffled by your proposal -- to
>>> them the "C" refers to this general strategy, and what's different is
>>> the context where it gets applied. So giving the same strategy two
>>> different names is silly; if anything it's the contexts that should
>>> have different names.
>>
>> And once we get into memory optimization (and avoiding copies and
>> preserving contiguity), it is necessary to keep both orders in mind:
>> is the memory order 'F', and am I iterating/raveling in 'F' order
>> (or slicing columns)?
>>
>> I think having two separate keywords gives the impression we can
>> choose two different things at the same time.
>
> I guess it could not make sense to do this:
>
> np.ravel(a, index_order='C', memory_order='F')
>
> It could make sense to do this:
>
> np.reshape(a, (3, 4), index_order='F', memory_order='F')
>
> but that just points out the inherent confusion between the uses of
> 'order', and in this case the fact that you can only do:
>
> np.reshape(a, (3, 4), index_order='F')
>
> correctly distinguishes between the meanings.

So, if index_order and memory_order are never in the same function, then the context should be enough. It was always enough for me.

np.reshape(a, (3, 4), index_order='F', memory_order='F')

really hurts my head because you mix a function that operates on views, indexing and shapes with memory creation (or I have no idea what memory_order should do in this case).

np.asarray(a.reshape((3, 4), order="F"), order="F")

or the examples here

http://docs.scipy.org/doc/numpy/reference/generated/numpy.asfortranarray.html?highlight=asfortranarray#numpy.asfortranarray
http://docs.scipy.org/doc/numpy/reference/generated/numpy.asarray.html

keep functions with index_order and functions with memory_order nicely separated.

(It might be useful but very confusing to add memory_order to every function that creates a view if possible and a copy if necessary: "If you have to make a copy, then I want F memory order, otherwise give me a view." But I cannot find a candidate function right now, except for ravel and reshape -- see the first notes in docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html )

---- a day later (haven't changed my mind):

Isn't specifying "index order" in the Parameters section enough as an explanation? Something like:

```
def ravel

Parameters

order : index order
    how the array is stacked into a 1d array.
    F means we stack by columns (fortran order, first index first),
    C means we stack by rows (c-order, last index first)
```

Most array *creation* functions explicitly mention memory layout in the docstring.

Josef

>
> Best,
>
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From sebastian at sipsolutions.net Wed Apr 3 09:24:48 2013
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Wed, 03 Apr 2013 15:24:48 +0200
Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
In-Reply-To: References:
Message-ID: <1364995488.4038.14.camel@sebastian-laptop>

On Tue, 2013-04-02 at 22:52 +0100, Nathaniel Smith wrote:
> On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett wrote:
> >> This is like observing that if I say "go North" then it's ambiguous
> >> about whether I want you to drive or walk, and concluding that we need
> >> new words for the directions depending on what sort of vehicle you
> >> use. So "go North" means drive North, "go htuoS" means walk North,
> >> etc. Totally silly. Makes much more sense to have one set of words for
> >> directions, and then make clear from context what the directions are
> >> used for -- "drive North", "walk North". Or "iterate C-wards", "store
> >> F-wards".
> >>
> >> "C" and "Z" mean exactly the same thing -- they describe a way of
> >> unraveling a cube into a straight line. The difference is what we do
> >> with the resulting straight line. That's why I'm suggesting that the
> >> distinction should be made in the name of the argument.
> >
> > Could you unpack that for the 'ravel' docstring? Because these
> > options all refer to the way of unraveling and not the memory layout
> > that results.
>
> Z/C/column-major/whatever-you-want-to-call-it is a general strategy
> for converting between a 1-dim representation and a n-dim
> representation. In the case of memory storage, the 1-dim
> representation is the flat space of pointer arithmetic. In the case of
> ravel, the 1-dim representation is the flat space of a 1-dim indexed
> array. But the 1-dim-to-n-dim part is the same in both cases.
>
> I think that's why you're seeing people baffled by your proposal -- to
> them the "C" refers to this general strategy, and what's different is
> the context where it gets applied. So giving the same strategy two
> different names is silly; if anything it's the contexts that should
> have different names.
>

Yup, that's how I think about it too... So I am against different values for the order argument. I am somewhat fine with a new name, but it seems like that may get clumsy.

But I would really love it if someone would try to make the documentation simpler! There is also never a mention of "contiguity", even though when we refer to "memory order", having a C/F-contiguous array is often the reason why (in np.asarray "contiguous='C'" would make as much sense as "order", maybe even more so). Also 'A' often seems explained not quite correctly (though that does not matter (except for reshape, where its explanation is fuzzy), it will matter more in the future -- even if I don't expect 'A' to be actually used). If there is not yet, there should maybe be an overview in the user/reference guide of what order means and how applying it to newly created memory is different from how reshape, etc. use it...
Then the functions using order, can also reference that, plus maybe we have some place to look up what C and F is for all of us who like to forget it... - Sebastian > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From dave.hirschfeld at gmail.com Wed Apr 3 09:26:50 2013 From: dave.hirschfeld at gmail.com (Dave Hirschfeld) Date: Wed, 3 Apr 2013 13:26:50 +0000 (UTC) Subject: [Numpy-discussion] timezones and datetime64 References: <515BB663.7030804@hilboll.de> Message-ID: Andreas Hilboll hilboll.de> writes: > > > > > I think your point about using current timezone in interpreting user > > input being dangerous is probably correct --- perhaps UTC all the way > > would be a safer (and simpler) choice? > > +1 > +10 from me! I've recently come across a bug due to the fact that numpy interprets dates as being in the local timezone. The data comes from a database query where there is no timezone information supplied (and dates are stored as strings). It is assumed that the user doesn't need to know the timezone - i.e. the dates are timezone naive. Working out the correct timezones would be fairly laborious, but whatever the correct timezones are, they're certainly not the timezone the current user happens to find themselves in! e.g. In [32]: rs = [ ...: (u'2000-01-17 00:00:00.000000', u'2000-02-01', u'2000-02-29', 0.1203), ...: (u'2000-01-26 00:00:00.000000', u'2000-02-01', u'2000-02-29', 0.1369), ...: (u'2000-01-18 00:00:00.000000', u'2000-03-01', u'2000-03-31', 0.1122), ...: (u'2000-02-25 00:00:00.000000', u'2000-03-01', u'2000-03-31', 0.1425) ...: ] ...: dtype = [('issue_date', 'datetime64[ns]'), ...: ('start_date', 'datetime64[D]'), ...: ('end_date', 'datetime64[D]'), ...: ('value', float)] ...: # In [33]: # What I see in London, UK ...: recordset = np.array(rs, dtype=dtype) ...: df = pd.DataFrame(recordset) ...: df = df.set_index('issue_date') ...: df ...: Out[33]: start_date end_date value issue_date 2000-01-17 2000-02-01 00:00:00 2000-02-29 00:00:00 0.1203 2000-01-26 2000-02-01 00:00:00 2000-02-29 00:00:00 0.1369 2000-01-18 2000-03-01 00:00:00 2000-03-31 00:00:00 0.1122 2000-02-25 2000-03-01 00:00:00 2000-03-31 00:00:00 0.1425 In [34]: # What my colleague sees in Auckland, NZ ...: recordset = np.array(rs, dtype=dtype) ...: df = pd.DataFrame(recordset) ...: df = df.set_index('issue_date') ...: df ...: Out[34]: start_date end_date value issue_date 2000-01-16 11:00:00 2000-02-01 00:00:00 2000-02-29 00:00:00 0.1203 2000-01-25 11:00:00 2000-02-01 00:00:00 2000-02-29 00:00:00 0.1369 2000-01-17 11:00:00 2000-03-01 00:00:00 2000-03-31 00:00:00 0.1122 2000-02-24 11:00:00 2000-03-01 00:00:00 2000-03-31 00:00:00 0.1425 Oh dear! This isn't acceptable for my use case (in a multinational company) and I found no reasonable way around it other than bypassing the numpy conversion entirely by setting the dtype to object, manually parsing the strings and creating an array from the list of datetime objects. 
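Roughly like this -- a sketch of that workaround; the helper is mine, not any numpy/pandas API:

import datetime
import numpy as np

def parse_naive(strings, fmt='%Y-%m-%d %H:%M:%S.%f'):
    # sketch of the workaround: parse the timestamps ourselves so
    # numpy never gets a chance to apply the local timezone
    return np.array([datetime.datetime.strptime(s, fmt) for s in strings],
                    dtype=object)

issue_dates = parse_naive([u'2000-01-17 00:00:00.000000',
                           u'2000-01-26 00:00:00.000000'])
# -> identical values in London and Auckland, no conversion anywhere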
Regards, Dave From njs at pobox.com Wed Apr 3 09:49:44 2013 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 3 Apr 2013 14:49:44 +0100 Subject: [Numpy-discussion] timezones and datetime64 In-Reply-To: References: <515BB663.7030804@hilboll.de> Message-ID: On Wed, Apr 3, 2013 at 2:26 PM, Dave Hirschfeld wrote: > Andreas Hilboll hilboll.de> writes: >> > I think your point about using current timezone in interpreting user >> > input being dangerous is probably correct --- perhaps UTC all the way >> > would be a safer (and simpler) choice? >> >> +1 >> > > +10 from me! > > I've recently come across a bug due to the fact that numpy interprets dates as > being in the local timezone. > > The data comes from a database query where there is no timezone information > supplied (and dates are stored as strings). It is assumed that the user doesn't > need to know the timezone - i.e. the dates are timezone naive. > > Working out the correct timezones would be fairly laborious, but whatever the > correct timezones are, they're certainly not the timezone the current user > happens to find themselves in! > > e.g. > > In [32]: rs = [ > ...: (u'2000-01-17 00:00:00.000000', u'2000-02-01', u'2000-02-29', 0.1203), > ...: (u'2000-01-26 00:00:00.000000', u'2000-02-01', u'2000-02-29', 0.1369), > ...: (u'2000-01-18 00:00:00.000000', u'2000-03-01', u'2000-03-31', 0.1122), > ...: (u'2000-02-25 00:00:00.000000', u'2000-03-01', u'2000-03-31', 0.1425) > ...: ] > ...: dtype = [('issue_date', 'datetime64[ns]'), > ...: ('start_date', 'datetime64[D]'), > ...: ('end_date', 'datetime64[D]'), > ...: ('value', float)] > ...: # > > In [33]: # What I see in London, UK > ...: recordset = np.array(rs, dtype=dtype) > ...: df = pd.DataFrame(recordset) > ...: df = df.set_index('issue_date') > ...: df > ...: > Out[33]: > start_date end_date value > issue_date > 2000-01-17 2000-02-01 00:00:00 2000-02-29 00:00:00 0.1203 > 2000-01-26 2000-02-01 00:00:00 2000-02-29 00:00:00 0.1369 > 2000-01-18 2000-03-01 00:00:00 2000-03-31 00:00:00 0.1122 > 2000-02-25 2000-03-01 00:00:00 2000-03-31 00:00:00 0.1425 > > In [34]: # What my colleague sees in Auckland, NZ > ...: recordset = np.array(rs, dtype=dtype) > ...: df = pd.DataFrame(recordset) > ...: df = df.set_index('issue_date') > ...: df > ...: > Out[34]: > start_date end_date value > issue_date > 2000-01-16 11:00:00 2000-02-01 00:00:00 2000-02-29 00:00:00 0.1203 > 2000-01-25 11:00:00 2000-02-01 00:00:00 2000-02-29 00:00:00 0.1369 > 2000-01-17 11:00:00 2000-03-01 00:00:00 2000-03-31 00:00:00 0.1122 > 2000-02-24 11:00:00 2000-03-01 00:00:00 2000-03-31 00:00:00 0.1425 > > > Oh dear! > > This isn't acceptable for my use case (in a multinational company) and I found > no reasonable way around it other than bypassing the numpy conversion entirely > by setting the dtype to object, manually parsing the strings and creating an > array from the list of datetime objects. Wow, that's truly broken. I'm sorry. I'm skeptical that just switching to UTC everywhere is actually the right solution. It smells like one of those solutions that's simple, neat, and wrong. (I don't know anything about calendar-time series handling, so I have no ability to actually judge this stuff, but wouldn't one problem be if you want to know about business days/hours? You lose the original day-of-year once you move everything to UTC.) Maybe datetime dtypes should be parametrized by both granularity and timezone? Or we could just declare that datetime64 is always timezone-naive and adjust the code to match? 
I'll CC the pandas list in case they have some insight. Unfortunately AFAIK no-one who's regularly working on numpy at this point works with datetimes, so we have limited ability to judge solutions... please help!

-n

From dave.hirschfeld at gmail.com Wed Apr 3 10:38:52 2013
From: dave.hirschfeld at gmail.com (Dave Hirschfeld)
Date: Wed, 3 Apr 2013 14:38:52 +0000 (UTC)
Subject: [Numpy-discussion] timezones and datetime64
References: <515BB663.7030804@hilboll.de>
Message-ID:

Nathaniel Smith pobox.com> writes:
>
> On Wed, Apr 3, 2013 at 2:26 PM, Dave Hirschfeld
> gmail.com> wrote:
> >
> > This isn't acceptable for my use case (in a multinational company) and I found
> > no reasonable way around it other than bypassing the numpy conversion entirely
> > by setting the dtype to object, manually parsing the strings and creating an
> > array from the list of datetime objects.
>
> Wow, that's truly broken. I'm sorry.
>
> I'm skeptical that just switching to UTC everywhere is actually the
> right solution. It smells like one of those solutions that's simple,
> neat, and wrong. (I don't know anything about calendar-time series
> handling, so I have no ability to actually judge this stuff, but
> wouldn't one problem be if you want to know about business days/hours?
> You lose the original day-of-year once you move everything to UTC.)
> Maybe datetime dtypes should be parametrized by both granularity and
> timezone? Or we could just declare that datetime64 is always
> timezone-naive and adjust the code to match?
>
> I'll CC the pandas list in case they have some insight. Unfortunately
> AFAIK no-one who's regularly working on numpy at this point works with
> datetimes, so we have limited ability to judge solutions... please
> help!
>
> -n
>

I think simply setting the timezone to UTC if it's not specified would solve 99% of use cases because IIUC the internal representation is UTC, so numpy would be doing no conversion of the dates that were passed in. It was the conversion which was the source of the error in my example.

The only potential issue with this is that the dates might carry along an incorrect UTC timezone, making it more difficult to work with naive datetimes.

e.g.

In [42]: d = np.datetime64('2014-01-01 00:00:00', dtype='M8[ns]')

In [43]: d
Out[43]: numpy.datetime64('2014-01-01T00:00:00+0000')

In [44]: str(d)
Out[44]: '2014-01-01T00:00:00+0000'

In [45]: pydate(str(d))
Out[45]: datetime.datetime(2014, 1, 1, 0, 0, tzinfo=tzutc())

In [46]: pydate(str(d)) == datetime.datetime(2014, 1, 1)
Traceback (most recent call last):

  File "", line 1, in
    pydate(str(d)) == datetime.datetime(2014, 1, 1)

TypeError: can't compare offset-naive and offset-aware datetimes

In [47]: pydate(str(d)) == datetime.datetime(2014, 1, 1, tzinfo=tzutc())
Out[47]: True

In [48]: pydate(str(d)).replace(tzinfo=None) == datetime.datetime(2014, 1, 1)
Out[48]: True

In this case it may be best to have numpy not try to set the timezone at all if none was specified. Given the internal representation is UTC, I'm not sure this is feasible though, so defaulting to UTC may be the best solution.
Regards,
Dave

From chris.barker at noaa.gov Wed Apr 3 11:52:47 2013
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Wed, 3 Apr 2013 08:52:47 -0700
Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
In-Reply-To: <1364995488.4038.14.camel@sebastian-laptop>
References: <1364995488.4038.14.camel@sebastian-laptop>
Message-ID:

On Wed, Apr 3, 2013 at 6:24 AM, Sebastian Berg wrote:
>> the context where it gets applied. So giving the same strategy two
>> different names is silly; if anything it's the contexts that should
>> have different names.
>>
> Yup, that's how I think about it too...

me too...

> But I would really love it if someone would try to make the documentation
> simpler!

yes, I think this is where the solution lies.

> There is also never a mention of "contiguity", even though when
> we refer to "memory order", then having a C/F contiguous array is often
> the reason why

good point -- in fact, I have no idea what would happen in many of these cases for a discontiguous array (or one with arbitrarily weird strides...)

> Also 'A' often seems explained not
> quite correctly (though that does not matter (except for reshape, where
> its explanation is fuzzy), it will matter more in the future -- even if
> I don't expect 'A' to be actually used).

I wonder about having an 'A' option in reshape at all -- what the heck does it mean? Why do we need it? Again, I come back to the fact that memory order is kind-of orthogonal to index order. So for reshape (or ravel, which is really just a special case of reshape...) the 'A' flag and 'K' flag (huh?) are pretty dangerous, and prone to error. I think of it this way:

Much of the beauty of numpy is that it presents a consistent interface to various forms of strided data -- that way, folks can write code that works the same way for any ndarray, while still being able to have internal storage be efficient for the use at hand -- i.e. C order for the common case, Fortran order for interaction with libraries that expect that order (or for algorithms that are more efficient in that order, though that's mostly external libs..), and non-contiguous data so one can work on sub-parts of arrays without copying data around.

In most places, the numpy API hides the internal memory order -- this is a good thing, most people have no need to think about it (or most code, anyway), and you can write code that works (even if not optimally) for any (strided) memory layout. All is good.

There are times when you really need to understand, or control or manipulate the memory layout, to make sure your routines are optimized, or the data is in the right form to pass off to an external lib, or to make sense of raw data read from a file, or... That's what we have .view() and friends for.

However, the 'A' and 'K' flags mix and match these concepts -- and I think that's dangerous. It would be easy for a user to use the 'A' flag, and have everything work fine and dandy with all their test cases, only to have it blow up when someone passes in a different-than-expected array. So really, they should only be used in cases where the code has checked memory order beforehand, or in a really well-defined interface where you know exactly what you're getting. In those cases, it makes the code far more clear and less error-prone to do your re-arranging of the memory in a separate step, rather than built-in to a ravel() or reshape() call.
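A concrete example of how 'A' can bite -- the same logical array, in two different memory layouts:

import numpy as np

a = np.arange(6).reshape(2, 3)   # C-contiguous
b = np.asfortranarray(a)         # same values, Fortran memory layout
(a == b).all()                   # True

a.ravel(order='A')               # [0 1 2 3 4 5]
b.ravel(order='A')               # [0 3 1 4 2 5] -- surprise!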
[note] -- I wrote earlier that I wasn't confused by the ravel() examples -- true for the 'C' and 'F' flags, but I'm still not at all clear what 'A' and 'K' would give me -- particularly for 'A' and reshape().

So I think the cause of the confusion here is not that we use "order" in two different contexts, nor the fact that 'C' and 'F' may not mean anything to some people, but that we are conflating two different processes in one function, and with one flag.

My (maybe) proposal: we deprecate the 'A' and 'K' flags in ravel() and reshape() (maybe even deprecate ravel() -- does it add anything to reshape?). If we don't deprecate them, at least encourage people in the docs not to use them, and rather do their memory-structure manipulations with .view or stride manipulation, or...

I'm still trying to figure out when you'd want the 'A' flag -- it seems at the end of your operation you will want:

The resulting array to be a particular shape, with the elements in a particular order

and

You _may_ want the in-memory layout a certain way.

but 'A' can't ensure both of those.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov

From chris.barker at noaa.gov Wed Apr 3 12:33:03 2013
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Wed, 3 Apr 2013 09:33:03 -0700
Subject: [Numpy-discussion] timezones and datetime64
In-Reply-To: References: <515BB663.7030804@hilboll.de>
Message-ID:

wrote:
> I found no reasonable way around it other than bypassing the numpy conversion entirely

Exactly - we have come to the same conclusion.

By the way, it's also consistent -- an ISO string without a TZ is interpreted to mean "use the locale", but a datetime object without a TZ is interpreted as UTC, so you get this:

In [68]: dt
Out[68]: datetime.datetime(2013, 4, 3, 12, 0)

In [69]: np.datetime64(dt)
Out[69]: numpy.datetime64('2013-04-03T05:00:00.000000-0700')

In [70]: np.datetime64(dt.isoformat())
Out[70]: numpy.datetime64('2013-04-03T12:00:00-0700')

two different results! (and as it happens, datetime.datetime does not have an ISO string parser, so it's not completely trivial to round-trip through that...)

On Wed, Apr 3, 2013 at 6:49 AM, Nathaniel Smith wrote:
> Wow, that's truly broken. I'm sorry.

Did you put this in? break out the pitchforks! ( ;-) )

> I'm skeptical that just switching to UTC everywhere is actually the
> right solution. It smells like one of those solutions that's simple,
> neat, and wrong.

well, actually, I don't think UTC everywhere is quite what's proposed -- really it's naive datetimes -- it would be up to the user/application to make sure the time zones are consistent. Which does mean that parsing an ISO string with a timezone becomes problematic...

> (I don't know anything about calendar-time series
> handling, so I have no ability to actually judge this stuff, but
> wouldn't one problem be if you want to know about business days/hours?

right -- then you'd want to use local time, so numpy might think it's ISO, but it'd actually be local time. Anyway, at the moment, I don't think datetime64 does this right anyway. I don't see mention of the timezone in the busday functions.
I haven't checked to see if they use the locale TZ or ignore it, but either way is wrong (actually, using the locale setting is worse...)

> Maybe datetime dtypes should be parametrized by both granularity and
> timezone?

That may be a good option. However, I suspect it's pretty hard to actually use the timezone correctly and consistently, so I'm nervous about that. In any case, we'd need to make sure that the user could specify the timezone on I/O and busday calculations, etc., and *never* assume the locale TZ (or anything else about locale) unless asked for. Using the locale TZ is almost never the right thing to do for the kind of applications numpy is used for.

> Or we could just declare that datetime64 is always
> timezone-naive and adjust the code to match?

That would be the easy way to handle it -- from the numpy side, anyway.

> I'll CC the pandas list in case they have some insight.

I suspect pandas has their own way of dealing with all these issues already. Which makes me think that numpy should take the same approach as the python stdlib: provide a core datatype, but leave the use-case specific stuff for others to build on. For instance, it seems really odd to have the busday* functions in core numpy...

> Unfortunately
> AFAIK no-one who's regularly working on numpy at this point works with
> datetimes, so we have limited ability to judge solutions...

well, that explains how this happened!

> please help!

In 1.7, it is still listed as experimental, so you could say this is all going as planned: release something we can try to use, and see what we find out when using it!

I _think_ one reasonable option may be:

1) Internal is UTC
2) On input:
   a) Default for no-time-zone-specified is UTC (both from datetime objects and ISO strings)
   b) respect TZ if given, converting to UTC
3) On output:
   a) default to UTC
   b) provide a way for the user to specify the timezone desired (perhaps a TZ attribute somewhere, or functions to specifically convert to ISO strings and datetime objects that take an optional TZ parameter)
4) busday* and the like allow a way to specify TZ

Issues I immediately see with this: respecting the TZ on output is a problem because:

1) if people want "naive" datetimes, they will get UTC ISO strings, i.e.: '2013-04-03T05:00:00Z' rather than '2013-04-03T05:00:00' - so there should be a way to specify "naive" or None as a timezone.

2) the python datetime module doesn't have any tzinfo objects built in -- so to respect timezones, numpy would need to maintain its own, or depend on pytz

Given all this, maybe naive is the way to go, perhaps mirroring datetime.datetime, and having an optional tzinfo object attribute. (by the way, I'm confused where that would live -- in the dtype instance? in the array?)

Issue with naive: what do you do with an ISO string that specifies a TZ offset? I'm beginning to see why the datetime module doesn't support reading ISO strings -- it would need to deal with timezones in that case!

Another note about timezones and ISO -- it doesn't really support timezones -- you specify an offset from UTC, that's it -- so you don't know if that is, for instance, Mountain Standard Time or Pacific Daylight Time. All you can do with it is convert to UTC, but you don't have a way to convert back, as you don't know what the timezone is. We'd be taking on a heck of a mess to support this!

Hmm -- maybe only support ISO-like -- i.e. all we do is keep an offset around that can be re-applied on output if you want -- that's it.

That's it for now -- thanks for engaging!
-Chris PS: I'm pretty sure that the C stdlib time handling functions give you no choice but to use the locale when they covert to strings, etc -- this is a freaking nightmare, and I'm wondering if that's in fact why numpy does it. i.e it's easy to use the C lib functions, but writing your own requires the full TZ database, handling DST, etc. etc.... -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From travis at continuum.io Wed Apr 3 12:51:38 2013 From: travis at continuum.io (Travis Oliphant) Date: Wed, 3 Apr 2013 11:51:38 -0500 Subject: [Numpy-discussion] timezones and datetime64 In-Reply-To: References: <515BB663.7030804@hilboll.de> Message-ID: Mark Wiebe and I are both still tracking NumPy development and can provide context and even help when needed. Apologies if we've left a different impression. We have to be prudent about the time we spend as we have other projects we are pursuing as well, but we help clients with NumPy issues all the time and are eager to continue to improve the code base. It seems to me that the biggest issue is just the automatic conversion that is occurring on string or date-time input. We should stop using the local time-zone (explicit is better than implicit strikes again) and not use any time-zone unless time-zone information is provided in the string. I am definitely +1 on that. It may be necessary to carry around another flag in the data-type to indicate whether or not the date-time is naive (not time-zone aware) or time-zone aware so that string printing does not print a time-zone if it didn't have one to begin with as well. If others agree that this is the best way forward, then Mark or I can definitely help contribute a patch. Best, -Travis On Wed, Apr 3, 2013 at 9:38 AM, Dave Hirschfeld wrote: > Nathaniel Smith pobox.com> writes: > > > > > On Wed, Apr 3, 2013 at 2:26 PM, Dave Hirschfeld > > gmail.com> wrote: > > > > > > This isn't acceptable for my use case (in a multinational company) and > I > found > > > no reasonable way around it other than bypassing the numpy conversion > entirely > > > by setting the dtype to object, manually parsing the strings and > creating an > > > array from the list of datetime objects. > > > > Wow, that's truly broken. I'm sorry. > > > > I'm skeptical that just switching to UTC everywhere is actually the > > right solution. It smells like one of those solutions that's simple, > > neat, and wrong. (I don't know anything about calendar-time series > > handling, so I have no ability to actually judge this stuff, but > > wouldn't one problem be if you want to know about business days/hours? > > You lose the original day-of-year once you move everything to UTC.) > > Maybe datetime dtypes should be parametrized by both granularity and > > timezone? Or we could just declare that datetime64 is always > > timezone-naive and adjust the code to match? > > > > I'll CC the pandas list in case they have some insight. Unfortunately > > AFAIK no-one who's regularly working on numpy this point works with > > datetimes, so we have limited ability to judge solutions... please > > help! > > > > -n > > > > It think simply setting the timezone to UTC if it's not specified would > solve > 99% of use cases because IIUC the internal representation is UTC so numpy > would > be doing no conversion of the dates that were passed in. 
It was the > conversion > which was the source of the error in my example. > > The only potential issue with this is that the dates might take along an > incorrect UTC timezone, making it more difficult to work with naive > datetimes. > > e.g. > > In [42]: d = np.datetime64('2014-01-01 00:00:00', dtype='M8[ns]') > > In [43]: d > Out[43]: numpy.datetime64('2014-01-01T00:00:00+0000') > > In [44]: str(d) > Out[44]: '2014-01-01T00:00:00+0000' > > In [45]: pydate(str(d)) > Out[45]: datetime.datetime(2014, 1, 1, 0, 0, tzinfo=tzutc()) > > In [46]: pydate(str(d)) == datetime.datetime(2014, 1, 1) > Traceback (most recent call last): > > File "", line 1, in > pydate(str(d)) == datetime.datetime(2014, 1, 1) > > TypeError: can't compare offset-naive and offset-aware datetimes > > > In [47]: pydate(str(d)) == datetime.datetime(2014, 1, 1, tzinfo=tzutc()) > Out[47]: True > > In [48]: pydate(str(d)).replace(tzinfo=None) == datetime.datetime(2014, 1, > 1) > Out[48]: True > > > In this case it may be best to have numpy not try to set the timezone at > all if > none was specified. Given the internal representation is UTC I'm not sure > this > is feasible though so defaulting to UTC may be the best solution. > > Regards, > Dave > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- --- Travis Oliphant Continuum Analytics, Inc. http://www.continuum.io -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Wed Apr 3 12:52:43 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 03 Apr 2013 18:52:43 +0200 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1364995488.4038.14.camel@sebastian-laptop> Message-ID: <1365007963.4038.57.camel@sebastian-laptop> On Wed, 2013-04-03 at 08:52 -0700, Chris Barker - NOAA Federal wrote: > On Wed, Apr 3, 2013 at 6:24 AM, Sebastian Berg > wrote: > >> the context where it gets applied. So giving the same strategy two > >> different names is silly; if anything it's the contexts that should > >> have different names. > >> > > > > Yup, thats how I think about it too... > > me too... > > > But I would really love if someone would try to make the documentation > > simpler! > > yes, I think this is where the solution lies. > > > There is also never a mention of "contiguity", even though when > > we refer to "memory order", then having a C/F contiguous array is often > > the reason why > > good point -- in fact, I have no idea what would happen in many of > these cases for a discontiguous array (or one with arbitrarily weird > strides...) > > > Also 'A' seems often explained not > > quite correctly (though that does not matter (except for reshape, where > > its explanation is fuzzy), it will matter more in the future -- even if > > I don't expect 'A' to be actually used). > > I wonder about having a 'A' option in reshape at all -- what the heck > does it mean? why do we need it? Again, I come back to the fact that > memory order is kind-of orthogonal to index order. So for reshape (or > ravel, which is really just a special case of reshape...) the 'A' flag > and 'K' flag (huh?) is pretty dangerous, and prone to error. I think > of it this way: > Actually 'K' + reshape is not even implemented sensibly and in current master I changed it to an error. 
For 'K' with reshape, though, I would not even know how to define it, and even if you find a definition I cannot imagine it being useful... Deprecating 'A' for reshape would seem OK to me since I doubt anyone actually uses it. It is currently equivalent to `'F' if input.flags.fnc else 'C'` (fnc means "fortran not c"), and as such is shaky business. I just realized that 'A' is a bit funny. Basically it means anything (Anyorder), including discontinuous memory chunks for np.array with copy=False. But if you do a copy (or reshape), lacking a freer way to do it, it means `'F' if input.flags.fnc else 'C'` again. Not sure about the history, but it seems to me 'K' basically supersedes 'A' for most stuff, and its usage as Fortran or C is more an accident because it is the simplest way to implement "I don't care". The use of 'K' is very sensible for copies of course. 'K' actually does make some sense for ravel, since if you don't care, it has the best chance of no copy. 'A' for ravel could/should in my opinion be deprecated just like for reshape, since it is pretty unpredictable. > Much of the beauty of numpy is that it presents a consistent interface > to various forms of strided data -- that way, folks can write code > that works the same way for any ndarray, while still being able to > have internal storage be efficient for the use at hand -- i.e. C order > for the common case, Fortran order for interaction with libraries that > expect that order (or for algorithms that are more efficient in that > order, though that's mostly external libs..), and non-contiguous data > so one can work on sub-parts of arrays without copying data around. > > In most places, the numpy API hides the internal memory order -- this > is a good thing, most people have no need to think about it (or most > code, anyway), and you can write code that works (even if not > optimally) for any (strided) memory layout. All is good. > > There are times when you really need to understand, or control or > manipulate the memory layout, to make sure your routines are > optimized, or the data is in the right form to pass off to an external > lib, or to make sense of raw data read from a file, or... That's what > we have .view() and friends for. > Yeah, I somewhat dislike the fact that "view" only works right for (roughly) C-contiguous arrays, that's another one of those old traps that is difficult or impossible to get rid of. Maybe some or all of view usages should be superseded by a new command... Regards, Sebastian > However, the 'A' and 'K' flags mix and match these concepts -- and I > think that's dangerous. It would be easy for a user to use the 'A' > flag, and have everything work fine and dandy with all their test > cases, only to have it blow up when someone passes in a > different-than-expected array. So really, they should only be used in > cases where the code has checked memory order beforehand, or in a > really well-defined interface where you know exactly what you're > getting. In those cases, it makes the code far clearer and less > error prone to do your re-arranging of the memory in a separate step, > rather than built-in to a ravel() or reshape() call.
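The two-step style is already possible with the functions we have; a small sketch of what it looks like (standard behaviour, nothing new):

import numpy as np

f = np.asfortranarray(np.arange(6).reshape(2, 3))

c = np.ascontiguousarray(f)  # step 1: fix the memory layout, explicitly
flat = c.ravel()             # step 2: plain index-order ravel; no copy needed now
print(np.may_share_memory(flat, c))  # True -- the ravel was free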
> > [note] -- I wrote earlier that I wasn't confused by the ravel() > examples -- true for the 'C' and 'F' flags, but I'm still not at all > clear what 'A' and 'K' would give me -- particularly for 'A' and > reshape() > > So I think the cause of the confusion here is not that we use "order" > in two different contexts, nor the fact that 'C' and 'F' may not mean > anything to some people, but that we are conflating two different > processes in one function, and with one flag. > > My (maybe) proposal: we deprecate the 'A' and 'K' flags in ravel() and > reshape(). (maybe even deprecate ravel() -- does it add anything to > reshape?) If not deprecate, at least encourage people in the docs not > to use them, and rather do their memory-structure manipulations with > .view or stride manipulation, or... > > I'm still trying to figure out when you'd want the 'A' flag -- it > seems at the end of your operation you will want: > > The resulting array to be a particular shape, with the elements in a > particular order > > and > > You _may_ want the in-memory layout a certain way. > > but 'A' can't ensure both of those. > > -Chris > > From matthew.brett at gmail.com Wed Apr 3 14:39:33 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 3 Apr 2013 11:39:33 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: Message-ID: Hi, On Wed, Apr 3, 2013 at 5:19 AM, wrote: > On Tue, Apr 2, 2013 at 9:09 PM, Matthew Brett wrote: >> Hi, >> >> On Tue, Apr 2, 2013 at 7:09 PM, wrote: >>> On Tue, Apr 2, 2013 at 5:52 PM, Nathaniel Smith wrote: >>>> On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett wrote: >>>>>> This is like observing that if I say "go North" then it's ambiguous >>>>>> about whether I want you to drive or walk, and concluding that we need >>>>>> new words for the directions depending on what sort of vehicle you >>>>>> use. So "go North" means drive North, "go htuoS" means walk North, >>>>>> etc. Totally silly. Makes much more sense to have one set of words for >>>>>> directions, and then make clear from context what the directions are >>>>>> used for -- "drive North", "walk North". Or "iterate C-wards", "store >>>>>> F-wards". >>>>>> >>>>>> "C" and "Z" mean exactly the same thing -- they describe a way of >>>>>> unraveling a cube into a straight line. The difference is what we do >>>>>> with the resulting straight line. That's why I'm suggesting that the >>>>>> distinction should be made in the name of the argument. >>>>> >>>>> Could you unpack that for the 'ravel' docstring? Because these >>>>> options all refer to the way of unraveling and not the memory layout >>>>> that results. >>>> >>>> Z/C/column-major/whatever-you-want-to-call-it is a general strategy >>>> for converting between a 1-dim representation and a n-dim >>>> representation. In the case of memory storage, the 1-dim >>>> representation is the flat space of pointer arithmetic. In the case of >>>> ravel, the 1-dim representation is the flat space of a 1-dim indexed >>>> array. But the 1-dim-to-n-dim part is the same in both cases. >>>> >>>> I think that's why you're seeing people baffled by your proposal -- to >>>> them the "C" refers to this general strategy, and what's different is >>>> the context where it gets applied. So giving the same strategy two >>>> different names is silly; if anything it's the contexts that should >>>> have different names.
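To make the "one strategy, two contexts" reading concrete, here is a toy flat-to-n-d mapping (not a numpy API; numpy's own np.unravel_index exposes the same choice through its order flag):

def unravel(flat, shape, order='C'):
    # 'C': last index varies fastest; 'F': first index varies fastest
    idx = []
    for d in (shape if order == 'F' else shape[::-1]):
        idx.append(flat % d)
        flat //= d
    return tuple(idx if order == 'F' else idx[::-1])

assert unravel(1, (2, 3), 'C') == (0, 1)
assert unravel(1, (2, 3), 'F') == (1, 0)  # same machinery, different convention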
>>> >>> And once we get into memory optimization (and avoiding copies and >>> preserving contiguity), it is necessary to keep both orders in mind, >>> is memory order in "F" and am I iterating/raveling in "F" order >>> (or slicing columns). >>> >>> I think having two separate keywords give the impression we can >>> choose two different things at the same time. >> >> I guess it could not make sense to do this: >> >> np.ravel(a, index_order='C', memory_order='F') >> >> It could make sense to do this: >> >> np.reshape(a, (3,4), index_order='F, memory_order='F') >> >> but that just points out the inherent confusion between the uses of >> 'order', and in this case, the fact that you can only do: >> >> np.reshape(a, (3, 4), index_order='F') >> >> correctly distinguishes between the meanings. > > So, if index_order and memory_order are never in the same function, > then the context should be enough. It was always enough for me. It was not enough for me or the three others who will publicly admit to the shame of finding it confusing without further thought. Again, I just can't see a reason not to separate these ideas. We are not arguing about backwards compatibility here, only about clarity. I guess you do accept that some people, other than yourself, might be less likely to get tripped up by: np.reshape(a, (3, 4), index_order='F') than np.reshape(a, (3, 4), order='F') ? > np.reshape(a, (3,4), index_order='F, memory_order='F') > really hurts my head because you mix a function that operates on > views, indexing and shapes with memory creation, (or I have > no idea what memory_order should do in this case). Right. I think you may now be close to my own discomfort when faced with working out (fast) what: np.reshape(a, (3,4), order='F') means, given 'order' means two different things, and both might be relevant here. Or are you saying that my brain should have quickly calculated that that 'order' would be difficult to understand as memory layout and therefore rejected that and seen immediately that index order was the meaning? Speaking as a psychologist, I don't think that's the way it works. Cheers, Matthew From huangkandiy at gmail.com Wed Apr 3 14:44:23 2013 From: huangkandiy at gmail.com (huangkandiy at gmail.com) Date: Wed, 3 Apr 2013 14:44:23 -0400 Subject: [Numpy-discussion] try to solve issue #2649 and revisit #473 Message-ID: Hello, all I try to solve issue 2649 which is related to 473 on multiplication of a matrix and an array. As 2649 shows import numpy as np x = np.arange(5) I = np.asmatrix(np.identity(5)) print np.dot(I, x).shape # -> (1, 5) First of all I assume we expect that I.dot(x) and I * x behave the same, so I suggest add function dot to matrix, like def dot(self, other): return self * other Then the major issue is the constructor of array and matrix interpret a list differently. array([0,1]).shape = (2,) and matrix([0,1]).shape = (1, 2). It will throw error when run np.dot(I, x), because in __mul__, x will be converted to a 1*5 matrix first. It's not consistent with np.dot(np.identity(5), x), which returns x. To fix that, I suggest to check the dimension of array when convert it to matrix. If it's 1D array, then convert it to a vertical vector explicitly like this if isinstance(data, N.ndarray): + if len(data.shape) == 1: + data = data.reshape(data.shape[0], 1) if dtype is None: intype = data.dtype else: Any comments? -- Kan Huang Department of Applied math & Statistics Stony Brook University 917-767-8018 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthew.brett at gmail.com Wed Apr 3 14:44:29 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 3 Apr 2013 11:44:29 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1364995488.4038.14.camel@sebastian-laptop> Message-ID: Hi, On Wed, Apr 3, 2013 at 8:52 AM, Chris Barker - NOAA Federal wrote: > On Wed, Apr 3, 2013 at 6:24 AM, Sebastian Berg > wrote: >>> the context where it gets applied. So giving the same strategy two >>> different names is silly; if anything it's the contexts that should >>> have different names. >>> >> >> Yup, that's how I think about it too... > > me too... > >> But I would really love if someone would try to make the documentation >> simpler! > > yes, I think this is where the solution lies. No question that better docs would be an improvement, let's all agree on that. We all agree that 'order' is used with two different and orthogonal meanings in numpy. I think we are now more or less agreeing that: np.reshape(a, (3, 4), index_order='F') is at least as clear as: np.reshape(a, (3, 4), order='F') Do I have that right so far? Cheers, Matthew From alan.isaac at gmail.com Wed Apr 3 14:50:44 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Wed, 03 Apr 2013 14:50:44 -0400 Subject: [Numpy-discussion] try to solve issue #2649 and revisit #473 In-Reply-To: References: Message-ID: <515C7A04.6000004@gmail.com> On 4/3/2013 2:44 PM, huangkandiy at gmail.com wrote: > I suggest add function dot to matrix >>> import numpy as np; x = np.arange(5); I = np.asmatrix(np.identity(5)); >>> I.dot(x) matrix([[ 0., 1., 2., 3., 4.]]) Alan Isaac From dave.hirschfeld at gmail.com Wed Apr 3 15:09:42 2013 From: dave.hirschfeld at gmail.com (Dave Hirschfeld) Date: Wed, 3 Apr 2013 19:09:42 +0000 (UTC) Subject: [Numpy-discussion] timezones and datetime64 References: <515BB663.7030804@hilboll.de> Message-ID: Travis Oliphant continuum.io> writes: > > > Mark Wiebe and I are both still tracking NumPy development and can provide context and even help when needed. Apologies if we've left a different impression. We have to be prudent about the time we spend as we have other projects we are pursuing as well, but we help clients with NumPy issues all the time and are eager to continue to improve the code base. > > > It seems to me that the biggest issue is just the automatic conversion that is occurring on string or date-time input. We should stop using the local time-zone (explicit is better than implicit strikes again) and not use any time-zone unless time-zone information is provided in the string. I am definitely +1 on that. > > It may be necessary to carry around another flag in the data-type to indicate whether or not the date-time is naive (not time-zone aware) or time-zone aware so that string printing does not print a time-zone if it didn't have one to begin with as well. > > If others agree that this is the best way forward, then Mark or I can definitely help contribute a patch. > > Best, > > -Travis > That sounds like a good solution. -Dave From huangkandiy at gmail.com Wed Apr 3 15:18:09 2013 From: huangkandiy at gmail.com (huangkandiy at gmail.com) Date: Wed, 3 Apr 2013 15:18:09 -0400 Subject: [Numpy-discussion] try to solve issue #2649 and revisit #473 In-Reply-To: <515C7A04.6000004@gmail.com> References: <515C7A04.6000004@gmail.com> Message-ID: I know matrix will call the dot function of ndarray. However, that will not give the answer we expect.
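To show the chained failure concretely (this is the behavior the ticket describes; plain ndarrays are shown for contrast):

import numpy as np

x = np.arange(5)
I = np.asmatrix(np.identity(5))

y = I.dot(x)   # matrix([[ 0., 1., 2., 3., 4.]]) -- a 1x5 row comes back
# I.dot(y)     # raises ValueError: objects are not aligned ((5,5) vs (1,5))
print(np.identity(5).dot(np.identity(5).dot(x)))  # ndarrays chain fine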
When a 5*5 matrix multiplies another matrix, we expect the answer to be an error or a 5*? matrix, not a 1*5 matrix. As ticket 2649 said, I * I * x or I.dot(I.dot(x)) should be the same as I * x. But it will return an error because I and I.dot(x) are not aligned. On Wed, Apr 3, 2013 at 2:50 PM, Alan G Isaac wrote: > On 4/3/2013 2:44 PM, huangkandiy at gmail.com wrote: > > I suggest add function dot to matrix > > >>> import numpy as np; x = np.arange(5); I = np.asmatrix(np.identity(5)); > >>> I.dot(x) > matrix([[ 0., 1., 2., 3., 4.]]) > > > Alan Isaac > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Kan Huang Department of Applied math & Statistics Stony Brook University 917-767-8018 -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Wed Apr 3 16:03:26 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Wed, 03 Apr 2013 16:03:26 -0400 Subject: [Numpy-discussion] try to solve issue #2649 and revisit #473 In-Reply-To: References: <515C7A04.6000004@gmail.com> Message-ID: <515C8B0E.2060605@gmail.com> On 4/3/2013 3:18 PM, huangkandiy at gmail.com wrote: > When a 5*5 matrix multiplies another matrix, we expect the answer to be an error or a 5*? matrix, not a 1*5 matrix. That is what happens. But you are post"multiplying" a matrix by a one-dimensional list. What should happen then? That is the question. One could argue that this should just raise an error, or that the result should be 1d. In my view, the result should be a 1d array, the same as I.A.dot(x). But the maintainers wanted operations with matrices to return matrices whenever possible. So instead of returning x it returns np.matrix(x). My related grievance is that I[0] is a matrix, not an array. There was a long discussion of this a couple years ago. Anyway, the bottom line is: don't mix matrices and other objects. The matrix object is really built only to interact with other matrix objects. Alan From mwwiebe at gmail.com Wed Apr 3 17:03:16 2013 From: mwwiebe at gmail.com (Mark Wiebe) Date: Wed, 3 Apr 2013 14:03:16 -0700 Subject: [Numpy-discussion] timezones and datetime64 In-Reply-To: References: <515BB663.7030804@hilboll.de> Message-ID: On Wed, Apr 3, 2013 at 9:33 AM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: > wrote: > > I found no reasonable way around it other than bypassing the numpy > conversion entirely > > Exactly - we have come to the same conclusion. By the way, it's also > consistent -- an ISO string without a TZ is interpreted to mean > "use the locale", but a datetime object without a TZ is interpreted as > UTC, so you get this: > > In [68]: dt > Out[68]: datetime.datetime(2013, 4, 3, 12, 0) > > In [69]: np.dateti > np.datetime64 np.datetime_as_string np.datetime_data > > In [69]: np.datetime64(dt) > Out[69]: numpy.datetime64('2013-04-03T05:00:00.000000-0700') > > In [70]: np.datetime64(dt.iso) > dt.isocalendar dt.isoformat dt.isoweekday > > In [70]: np.datetime64(dt.isoformat()) > Out[70]: numpy.datetime64('2013-04-03T12:00:00-0700') > > two different results! > > (and as it happens, datetime.datetime does not have an ISO string > parser, so it's not completely trivial to round-trip through that...) > On Wed, Apr 3, 2013 at 6:49 AM, Nathaniel Smith wrote: > > > Wow, that's truly broken. I'm sorry. > > Did you put this in? break out the pitchforks! ( ;-) ) Many of the aspects of how datetime64 is designed are from me.
I started out from the datetime64 NEP, but it wasn't fleshed out enough so I had to fill in lots of details. I guess your pitchforks are pointing at me. ;) For the way this specific part of the code is, I think it's hard to not have it broken one way or another, no matter how we do it. One thing I observed is the printing of getting the current time is weird if you're looking at it interactively. In general, if you get the current time, and print it in UTC, it's the wrong time unless you're in UTC. Python's datetime doesn't help the situation by having datetime.now() return a 'local' time. In [1]: import numpy as np In [2]: from datetime import datetime In [3]: np.datetime64('now') Out[3]: numpy.datetime64('2013-04-03T12:17:58-0700') In [4]: np.datetime_as_string(np.datetime64('now'), timezone='UTC') Out[4]: '2013-04-03T19:17:59Z' In [5]: datetime.now() Out[5]: datetime.datetime(2013, 4, 3, 12, 18, 2, 582000) In [6]: datetime.now().isoformat() Out[6]: '2013-04-03T12:18:06.796000' In [7]: np.datetime64(datetime.now()) Out[7]: numpy.datetime64('2013-04-03T05:18:15.525000-0700') In [8]: np.datetime64(datetime.now().isoformat()) Out[8]: numpy.datetime64('2013-04-03T12:18:25.291000-0700') > I'm skeptical that just switching to UTC everywhere is actually the > > right solution. It smells like one of those solutions that's simple, > > neat, and wrong. > > well, actually, I don't think UTC everywhere is quite what's proposed > -- really it's naive datetimes -- it would be up to the > user/application to make sure the time zones are consistent. > It seems to me that adding a time zone to the datetime64 metadata might be a good idea, and then allowing it to be None to behave like the Python naive datetimes. This wouldn't be a trivial addition, though. Using Python's timezone object doesn't seem like a good idea, because would require things to be converted to/from Python's datetime to be processed every time, which would remove the performance benefits of NumPy. The boost datetime library has a nice timezone object which could be used as inspiration for an equivalent in NumPy, but I think any way we cut it would be a lot of work. > Which does mean that parsing a ISO string with a timezone becomes > problematic... Yeah, there are a number of cases. How would it transform '2013-04-03T12:18' to a datetime64 with a timezone by default? I guess that would be to use the datetime64's metadata probably. How would it transform '2013-04-03T12:18Z' or '2013-04-03T12:18-0700' to a datetime64 with no timezone? Do we throw an error in the default conversion, and have a separate parsing function that allows more control? > > (I don't know anything about calendar-time series > > handling, so I have no ability to actually judge this stuff, but > > wouldn't one problem be if you want to know about business days/hours? > > right -- then you'd want to use local time, so numpy might think it's > ISO, but it'd actually be local time. Anyway, at the moment, I don't > think datetime64 does this right anyway. I don't see mention of the > timezone in the busday functions. I havne't checked to see if they use > the locale TZ or ignore it, but either way is wrong (actually, using > the locale setting is worse...) The busday functions just operate on datetime64[D]. There is no timezone interaction there, except for how a datetime with a date unit converts to/from a datetime which includes time. > > Maybe datetime dtypes should be parametrized by both granularity and > > timezone? > > That may be a good option. 
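(One way to picture the parametrized idea: keep the stored values in UTC and carry the zone purely as presentation metadata. A rough sketch only, not a proposed API -- the zone name and the pytz calls are just illustration:

from datetime import datetime
import numpy as np
import pytz

values = np.array([datetime(2013, 4, 3, 19, 17, 59)], dtype='M8[s]')  # stored UTC
zone = pytz.timezone('US/Pacific')  # metadata; arithmetic never consults it

local = pytz.utc.localize(values[0].item()).astimezone(zone)
print(local.isoformat())  # 2013-04-03T12:17:59-07:00

Arithmetic would stay plain integer arithmetic on the UTC values; only parsing and printing would consult the zone.)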
However, I suspect it's pretty hard to > actually use the timezone correctly and consistently, so I'm nervous > about that. In any case, we'd need to make sure that the user could > specify timezone on I/O and busday calculations, etc, and *never* > assume the locale TZ (or anything else about locale) unless asked for. > Using the locale TZ is almost never the right thing to do for the kind > of applications numpy is used for. I think for local, interactive use, the locale timezone is good, but for non-interactive use it's not. NumPy plays roles in both contexts, and has many features that are skewed towards the interactive context, so it's not clear to me that excluding the locale TZ would be a good idea. > > Or we could just declare that datetime64 is always > > timezone-naive and adjust the code to match? > > That would be the easy way to handle it -- from the numpy side, anyway. > > > I'll CC the pandas list in case they have some insight. > > I suspect pandas has their own way of dealing with all these issues > already. Which makes me think that numpy should take the same approach > as the python stdlib: provide a core datatype, but leave the use-case > specific stuff for others to build on. For instance, it seems really > odd to have the busday* functions in core numpy... I believe Pandas is using datetime64[ns] for everything, and uses its own code to allow for numpy 1.6 compatibility. It borrowed some code from numpy 1.7 to make this possible. > > Unfortunately > > AFAIK no-one who's regularly working on numpy at this point works with > > datetimes, so we have limited ability to judge solutions... > > well, that explains how this happened! > > > please help! > > in 1.7, it is still listed as experimental, so you could say this is > all going as planned: release something we can try to use, and see > what we find out when using it! > > I _think_ one reasonable option may be: > > 1) Internal is UTC > 2) On input: > a) Default for no-time-zone-specified is UTC (both from datetime > objects and ISO strings) > b) respect TZ if given, converting to UTC > 3) On output: > a) default to UTC > b) provide a way for the user to specify the timezone desired > (perhaps a TZ attribute somewhere, or functions to specifically > convert to ISO strings and datetime objects that take an optional TZ > parameter.) > 4) busday* and the like allow a way to specify TZ > > Issues I immediately see with this: > Respecting the TZ on output is a problem because: > 1) if people want "naive" datetimes, they will get UTC ISO strings, > i.e.: > '2013-04-03T05:00:00Z' rather than '2013-04-03T05:00:00' > - so there should be a way to specify "naive" or None as a > timezone. > > 2) the python datetime module doesn't have any tzinfo objects > built in -- so to respect timezones, numpy would need to maintain its > own, or depend on pytz > > Given all this, maybe naive is the way to go, perhaps mirroring > datetime.datetime, and having an optional tzinfo object attribute. (by > the way, I'm confused where that would live -- in the dtype instance? > in the array?) > > Issue with Naive: what do you do with an ISO string that specifies a TZ > offset? > > I'm beginning to see why the datetime doesn't support reading ISO > strings -- it would need to deal with timezones in that case! > > Another note about Timezones and ISO -- it doesn't really support > timezones -- you specify an offset from UTC, that's it -- so you don't > know if that is, for instance, Mountain Standard time or Pacific > Daylight Time.
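(Concretely: two different zones can produce the identical offset on a given date, so the offset alone cannot be mapped back to a zone -- for example, with pytz and an arbitrary summer instant:

from datetime import datetime
import pytz

t = datetime(2013, 7, 1, 12, 0)
mst = pytz.timezone('US/Arizona').localize(t)  # MST all year: UTC-0700
pdt = pytz.timezone('US/Pacific').localize(t)  # PDT in July: also UTC-0700

print(mst.strftime('%z'), pdt.strftime('%z'))  # -0700 -0700 -- indistinguishable
)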
All you can do with it is convert to UTC, but you don't > have a way to convert back, as you don't know what the timezone is. > We'd be taking on a heck of mess to support this! Hmm -- maybe only > support ISO-like -- i.e. all we do is keep an offset around that can > be re-applied on output if you want -- that's it. > Datetimes are complicated! The biggest advantage of using ISO for the default string format is that it's unambiguous, it doesn't have the problem like with '01/02/03' that could be interpreted in many different ways depending on where in the world you are. I suspect adding a timezone to the datetime64 metadata is the way to proceed. We probably need to start up a new NEP about amending datetime64. The previous one is here: https://github.com/numpy/numpy/blob/master/doc/neps/datetime-proposal.rst > That's it for now -- thanks for engaging! > > -Chris > > PS: I'm pretty sure that the C stdlib time handling functions give you > no choice but to use the locale when they covert to strings, etc -- > this is a freaking nightmare, and I'm wondering if that's in fact why > numpy does it. i.e it's easy to use the C lib functions, but writing > your own requires the full TZ database, handling DST, etc. etc.... The C stdlib provides functions for doing timezone conversions with the locale, but going deeper than that becomes a bit more OS-specific. This seems like the kind of service the OS should provide, so that all libraries would get updates to new timezone databases when they're updated, etc, but unfortunately things aren't that simple. Thanks, Mark > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Apr 3 17:52:24 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 3 Apr 2013 14:52:24 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: Message-ID: On Wed, Apr 3, 2013 at 11:39 AM, Matthew Brett wrote: > It was not enough for me or the three others who will publicly admit > to the shame of finding it confusing without further thought. I would submit that some of the confusion came from the fact that with ravel(), and the 'A' and 'K' flags, you are forced to figure out BOTH index_order and memory_order -- with one flag -- I know I'm still not clear what I'd get in complex situations. > Again, I just can't see a reason not to separate these ideas. I agree, but really separating them -- but ideally having a given function only deal with one or the other, not both at once. > We are > not arguing about backwards compatibility here, only about clarity. while it could be changed while strictly maintaining backward compatibility -- it is a change that would need to filter through the docs, example, random blog posts, stack=overflow questions, etc...... Is that worth it? I'm not convinced > Right. I think you may now be close to my own discomfort when faced > with working out (fast) what: > > np.reshape(a, (3,4), order='F') I still think it's cause you know too much.... 
;-) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From ralf.gommers at gmail.com Wed Apr 3 17:58:37 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 3 Apr 2013 23:58:37 +0200 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: Message-ID: On Wed, Apr 3, 2013 at 11:52 PM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: > On Wed, Apr 3, 2013 at 11:39 AM, Matthew Brett > wrote: > > It was not enough for me or the three others who will publicly admit > > to the shame of finding it confusing without further thought. > > I would submit that some of the confusion came from the fact that with > ravel(), and the 'A' and 'K' flags, you are forced to figure out BOTH > index_order and memory_order -- with one flag -- I know I'm still not > clear what I'd get in complex situations. > > > Again, I just can't see a reason not to separate these ideas. > > I agree, but really separating them -- but ideally having a given > function only deal with one or the other, not both at once. > > > We are > > not arguing about backwards compatibility here, only about clarity. > > while it could be changed while strictly maintaining backward > compatibility -- it is a change that would need to filter through the > docs, example, random blog posts, stack=overflow questions, etc...... > Not only that, we would then also be in the situation of having `order` *and* `xxx_order` keywords. This is also confusing, at least as much as the current situation imho. Ralf > Is that worth it? I'm not convinced > > > Right. I think you may now be close to my own discomfort when faced > > with working out (fast) what: > > > > np.reshape(a, (3,4), order='F') > > I still think it's cause you know too much.... ;-) > > -Chris > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Apr 3 18:00:37 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 3 Apr 2013 15:00:37 -0700 Subject: [Numpy-discussion] Please stop bottom posting!! Message-ID: OK, OK, I know the fashion is to blast people with "please don't top post" messages -- it seems to have invaded all the mailing lists I'm on. But I don't get it. Most of us have threaded mail readers these days, so is it so hard to follow a thread? Maybe I'm a weirdo, but it is FAR more common for me to have been following a thread, and want to know what the latest comment is, then drop into the end of a long thread, and want to see the entire discussion in the most recent post. So why make me scroll WAY THE HECK down literally hundreds of lines to find your one: "+1" give me a break! Even if you have something significant and pithy to say, I still don't want to scroll through all that junk, and layers of 10+ >>>>>>>>>>> to find out the new stuff. (take a look at a the "Raveling, reshape order ..." 
thread to see what I mean. So, while I agree blind top posting is less than Ideal, I actually like it a bit better than blind bottom posting. Best of all is intelligent editing of the thread so far -- edit it down to the key points you are commenting on, and intersperse your comments. That way your email stands on its own as meaningful, but there is not a big pile of left over crap to wade through to read your fabulous pithy opinions.... OK, I've had my rant, carry on! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From sebastian at sipsolutions.net Wed Apr 3 18:12:48 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 04 Apr 2013 00:12:48 +0200 Subject: [Numpy-discussion] try to solve issue #2649 and revisit #473 In-Reply-To: <515C8B0E.2060605@gmail.com> References: <515C7A04.6000004@gmail.com> <515C8B0E.2060605@gmail.com> Message-ID: <1365027168.16562.13.camel@sebastian-laptop> On Wed, 2013-04-03 at 16:03 -0400, Alan G Isaac wrote: > On 4/3/2013 3:18 PM, huangkandiy at gmail.com wrote: > > A 5*5 matrix multiplies another matrix, we expect answer to be error or a 5*? matrix, not a 1*5 matrix. > > > That is what happens. > But you are post"multiplying" a matrix by a one-dimensional list. > What should happen then? That is the question. > > One could argue that this should just raise an error, > or that the result should be 1d. > In my view, the result should be a 1d array, > the same as I.A.dot(x). > Would it be reasonable if this was a Nx1 matrix? I am not sure how you would implement it exactly into dot. Maybe by transposing the result if the second argument was a vector and the result is not a base class? And then __mul__ can use np.asarray instead of np.asmatrix. Or just fix __mul__ itself to transpose the result and don't care about np.dot? - Sebastian > But the maintainers wanted operations with matrices to > return matrices whenever possible. So instead of > returning x it returns np.matrix(x). > > My related grievance is that I[0] is a matrix, > not an array. There was a long discussion of > this a couple years ago. > > Anyway, the bottom line is: don't mix matrices and > other objects. The matrix object is really built > only to interact with other matrix objects. > > Alan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From njs at pobox.com Wed Apr 3 18:21:48 2013 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 3 Apr 2013 23:21:48 +0100 Subject: [Numpy-discussion] Please stop bottom posting!! In-Reply-To: References: Message-ID: On Wed, Apr 3, 2013 at 11:00 PM, Chris Barker - NOAA Federal wrote: > Best of all is intelligent editing of the thread so far -- edit it > down to the key points you are commenting on, and intersperse your > comments. That way your email stands on its own as meaningful, but > there is not a big pile of left over crap to wade through to read your > fabulous pithy opinions.... Traditionally this is what the phrase "bottom posting" meant, as a term of art, and is the key reason why those old "netiquette" guides recommend it. I guess the unexpressed nuances of such definitions get lost over time as people encounter them without the relevant context, though -- sort of like how the full in-context meaning of order= gets lost ;-). 
-n From chris.barker at noaa.gov Wed Apr 3 18:59:15 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 3 Apr 2013 15:59:15 -0700 Subject: [Numpy-discussion] try to solve issue #2649 and revisit #473 In-Reply-To: <515C8B0E.2060605@gmail.com> References: <515C7A04.6000004@gmail.com> <515C8B0E.2060605@gmail.com> Message-ID: On Wed, Apr 3, 2013 at 1:03 PM, Alan G Isaac wrote: > On 4/3/2013 3:18 PM, huangkandiy at gmail.com wrote: > In my view, the result should be a 1d array, > the same as I.A.dot(x). > > But the maintainers wanted operations with matrices to > return matrices whenever possible. So instead of > returning x it returns np.matrix(x). the matrix object is a fine idea, but the key problem is that it provides a 2-d matrix, but no concept of a 1-d vector. I think it would all be cleaner if there were a row-vector and column-vector object to accompany matrix -- then things that naturally return a vector could do so. You can't use a regular 1-d array because there is no way to distinguish between a row or column version. But as Alan said, this was all hashed out a few years back -- a bunch of great ideas, but no one to implement them. The truth is that matrix has little value outside of teaching, so no one with the skills to push it forward uses it themselves. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From chris.barker at noaa.gov Wed Apr 3 19:01:45 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 3 Apr 2013 16:01:45 -0700 Subject: [Numpy-discussion] Please stop bottom posting!! In-Reply-To: References: Message-ID: On Wed, Apr 3, 2013 at 3:21 PM, Nathaniel Smith wrote: > On Wed, Apr 3, 2013 at 11:00 PM, Chris Barker - NOAA Federal > wrote: >> Best of all is intelligent editing of the thread so far ... > Traditionally this is what the phrase "bottom posting" meant, as a > term of art, and is the key reason why those old "netiquette" guides > recommend it. Well, I'm not sure I've seen much use of the term "bottom posting" -- I've seen a lot of "don't top-post", with generally no recommendation as to what to do instead. But I do see a lot of naive bottom posting.... -Chris oh wait, I was supposed to be done ranting.... -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From huangkandiy at gmail.com Wed Apr 3 19:11:10 2013 From: huangkandiy at gmail.com (huangkandiy at gmail.com) Date: Wed, 3 Apr 2013 19:11:10 -0400 Subject: [Numpy-discussion] try to solve issue #2649 and revisit #473 In-Reply-To: References: <515C7A04.6000004@gmail.com> <515C8B0E.2060605@gmail.com> Message-ID: Agree with the row-vector and column-vector thing. I notice that in ndarray multiplication, the 1-d array is treated as a column-vector. But in matrix multiplication, a 1-d array is converted to a row-vector. So just match the 1-d array to a column-vector, and the behavior of ndarray and matrix will be consistent. On Wed, Apr 3, 2013 at 6:59 PM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: > On Wed, Apr 3, 2013 at 1:03 PM, Alan G Isaac wrote: > > On 4/3/2013 3:18 PM, huangkandiy at gmail.com wrote: > > > In my view, the result should be a 1d array, > > the same as I.A.dot(x).
> > > > But the maintainers wanted operations with matrices to > > return matrices whenever possible. So instead of > > returning x it returns np.matrix(x). > > the matrix object is a fine idea, but the key problem is that it > provides a 2-d matrix, but no concept of a 1-d vector. I think it > would all be a cleaner if there were a row-vector and column-vector > object to accompany matrix -- they things that naturally return a > vector could do so, You can't use a regular 1-d array because there is > no way to distinguish between a row or column version. > > But as Alan sid, this was all hashed out a few years back -- a bunch > of great ideas, but no one to implement them. > > The truth is that matrix has little value outside of teaching, so no > one with the skills to push it forward uses it themselves. > > -Chris > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Kan Huang Department of Applied math & Statistics Stony Brook University 917-767-8018 -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Wed Apr 3 19:27:07 2013 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 3 Apr 2013 23:27:07 +0000 (UTC) Subject: [Numpy-discussion] timezones and datetime64 References: <515BB663.7030804@hilboll.de> Message-ID: Mark Wiebe gmail.com> writes: > It seems to me that adding a time zone to the datetime64 > metadata might be a good idea, and then allowing it to be > None to behave like the Python naive datetimes. Probably also TAI and UTC/Posix. Converting from one format to the other is problematic since all of them (except TAI afaik) require looking things up in regularly updated databases. Not only restricted to conversions, but also arithmetic, `b - a`. Affects also UTC/Posix via leap seconds --- this probably doesn't usually matter, but if we want to be strict, it's not good to ignore the issue. On the string representation level, one possible way to go could be to require UTC markers for UTC times, and disallow them for local times:: datetimeutc64('2013-02-02T03:00:00') # -> exception datetimeutc64('2013-02-02T03:00:00Z') # -> valid datetimeutc64('2013-02-02T03:00:00-0200') # -> valid datetime64('2013-02-02T03:00:00') # -> valid datetime64('2013-02-02T03:00:00Z') # -> exception datetime64('2013-02-02T03:00:00-0200') # -> exception Dealing with actual TZ handling could be left to be the responsibility of the users, maybe with some helper conversion functions on the Numpy side. This still leaves open the question how leap seconds should be handled, or if we should just ignore the results and let people live with datetime64('2012-01-01T00:00:00Z') - datetime64('1970-01-01T00:00:00Z') being wrong by ~ 30 seconds... -- Pauli Virtanen From doug.coleman at gmail.com Wed Apr 3 19:28:02 2013 From: doug.coleman at gmail.com (Doug Coleman) Date: Wed, 3 Apr 2013 16:28:02 -0700 Subject: [Numpy-discussion] Please stop bottom posting!! In-Reply-To: References: Message-ID: I swear by a mouse with an unlockable scroll wheel. Scroll to your heart's content! http://www.amazon.com/Logitech-G500-Programmable-Gaming-Mouse/dp/B002J9GDXI Also, gmail "bottom-posts" by default. 
It's transparent to gmail users. I'd imagine they are some of the biggest offenders. Doug -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Apr 3 19:52:13 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 3 Apr 2013 16:52:13 -0700 Subject: [Numpy-discussion] timezones and datetime64 In-Reply-To: References: <515BB663.7030804@hilboll.de> Message-ID: Thanks all for taking an interest. I need to think a bot more about the options before commenting more, but: while we're at it: It seems very odd to me that datetime64 supports different units (right down to attosecond) but not different epochs. How can it possible be useful to use nanoseconds, etc, but only right around 1970? For that matter, why all the units at all? I can see the need for nanosecond resolution, but not without changing the epoch -- so if the epoch is fixed, why bother with different units? Using days (for instance) rather than seconds doesn't save memory, as we're always using 64 bits. It can't be common to need more than 2.9e12 years (OK, that's not quite as old as the universe, so some cosmologists may need it...) Personally, I never need finer resolution than seconds, nor more than a century, so it's no big deal to me, but just wondering.... -Chris On Wed, Apr 3, 2013 at 4:27 PM, Pauli Virtanen wrote: > Mark Wiebe gmail.com> writes: >> It seems to me that adding a time zone to the datetime64 >> metadata might be a good idea, and then allowing it to be >> None to behave like the Python naive datetimes. > > Probably also TAI and UTC/Posix. > > Converting from one format to the other is problematic since > all of them (except TAI afaik) require looking things up in > regularly updated databases. Not only restricted to conversions, > but also arithmetic, `b - a`. Affects also UTC/Posix via leap > seconds --- this probably doesn't usually matter, but if we want > to be strict, it's not good to ignore the issue. > > On the string representation level, one possible way to go > could be to require UTC markers for UTC times, and disallow > them for local times:: > > datetimeutc64('2013-02-02T03:00:00') # -> exception > datetimeutc64('2013-02-02T03:00:00Z') # -> valid > datetimeutc64('2013-02-02T03:00:00-0200') # -> valid > datetime64('2013-02-02T03:00:00') # -> valid > datetime64('2013-02-02T03:00:00Z') # -> exception > datetime64('2013-02-02T03:00:00-0200') # -> exception > > Dealing with actual TZ handling could be left to be the > responsibility of the users, maybe with some helper conversion > functions on the Numpy side. > > This still leaves open the question how leap seconds should be > handled, or if we should just ignore the results and let > people live with > datetime64('2012-01-01T00:00:00Z') - datetime64('1970-01-01T00:00:00Z') > being wrong by ~ 30 seconds... > > -- > Pauli Virtanen > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From josef.pktd at gmail.com Wed Apr 3 19:53:06 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 3 Apr 2013 19:53:06 -0400 Subject: [Numpy-discussion] Please stop bottom posting!! 
In-Reply-To: References: Message-ID: On Wed, Apr 3, 2013 at 7:28 PM, Doug Coleman wrote: > I swear by a mouse with an unlockable scroll wheel. Scroll to your heart's > content! > > http://www.amazon.com/Logitech-G500-Programmable-Gaming-Mouse/dp/B002J9GDXI > > > Also, gmail "bottom-posts" by default. It's transparent to gmail users. I'd > imagine they are some of the biggest offenders. gmail also doesn't show the old messages, only the most recent reply, so there is no scrolling for me (most of the time). The main advantage of inline and bottom posting for me is when I reread a thread, especially long ones, after several years. It's much easier to follow than reading from bottom to top without inline context. (R mailing lists) Josef > > > Doug > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From chris.barker at noaa.gov Wed Apr 3 19:56:47 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 3 Apr 2013 16:56:47 -0700 Subject: [Numpy-discussion] Please stop bottom posting!! In-Reply-To: References: Message-ID: On Wed, Apr 3, 2013 at 4:28 PM, Doug Coleman wrote: > Also, gmail "bottom-posts" by default. It's transparent to gmail users. I'd > imagine they are some of the biggest offenders. Not with my configuration -- which I don't think I changed -- it's top posting by default for me. However, gmail does hide much of the "quoted" content, so this may not bug people as much -- in fact, for me, with gmail, it's often not too bad to read, but then I go to reply and find a huge pile of quoted material that gmail hid from me. But I've got it set to plain text -- not sure how that affects things... -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From charlesr.harris at gmail.com Wed Apr 3 20:06:50 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 3 Apr 2013 18:06:50 -0600 Subject: [Numpy-discussion] Please stop bottom posting!! In-Reply-To: References: Message-ID: What's the problem? ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Apr 3 20:14:56 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 3 Apr 2013 18:14:56 -0600 Subject: [Numpy-discussion] try to solve issue #2649 and revisit #473 In-Reply-To: References: <515C7A04.6000004@gmail.com> <515C8B0E.2060605@gmail.com> Message-ID: On Wed, Apr 3, 2013 at 5:11 PM, huangkandiy at gmail.com wrote: > Agree with the row-vector and column-vector thing. I notice that in > ndarray multiplication, the 1-d array is treated as a column-vector. But > in matrix multiplication, a 1-d array is converted to a row-vector. So just > match the 1-d array to a column-vector, and the behavior of ndarray and > matrix will be consistent. > > If someone is motivated to write some code it might not be too difficult to get it in. I don't know what the backwards compatibility problem would be, however. Maybe someone could put together a separate matrix package based on numpy for teaching purposes. Chuck -------------- next part -------------- An HTML attachment was scrubbed...
URL: From mwwiebe at gmail.com Wed Apr 3 21:02:37 2013 From: mwwiebe at gmail.com (Mark Wiebe) Date: Wed, 3 Apr 2013 18:02:37 -0700 Subject: [Numpy-discussion] timezones and datetime64 In-Reply-To: References: <515BB663.7030804@hilboll.de> Message-ID: On Wed, Apr 3, 2013 at 4:27 PM, Pauli Virtanen wrote: > Mark Wiebe gmail.com> writes: > > It seems to me that adding a time zone to the datetime64 > > metadata might be a good idea, and then allowing it to be > > None to behave like the Python naive datetimes. > > Probably also TAI and UTC/Posix. > > Converting from one format to the other is problematic since > all of them (except TAI afaik) require looking things up in > regularly updated databases. Not only restricted to conversions, > but also arithmetic, `b - a`. Affects also UTC/Posix via leap > seconds --- this probably doesn't usually matter, but if we want > to be strict, it's not good to ignore the issue. > I think this would be nice. Would it be a stretch to extend the ISO syntax with '2013-02-02T03:00:00TAI'? One problem with trying to give technically correct answers for the UTC/Posix format is that it can't actually represent the leap-second, so a datetime64 + timedelta64 could produce an unrepresentable moment in time. > On the string representation level, one possible way to go > could be to require UTC markers for UTC times, and disallow > them for local times:: > > datetimeutc64('2013-02-02T03:00:00') # -> exception > datetimeutc64('2013-02-02T03:00:00Z') # -> valid > datetimeutc64('2013-02-02T03:00:00-0200') # -> valid > datetime64('2013-02-02T03:00:00') # -> valid > datetime64('2013-02-02T03:00:00Z') # -> exception > datetime64('2013-02-02T03:00:00-0200') # -> exception > I like this kind of strict approach. To go along with it, there would need to also be a more flexible string parsing method where you could specify alternative behaviors. > Dealing with actual TZ handling could be left to be the > responsibility of the users, maybe with some helper conversion > functions on the Numpy side. > The np.datetime_as_string function has a start of some of this (looks like I neglected to document it properly, sorry!). In [10]: t = np.datetime64('2013-04-03T13:21Z') In [11]: t Out[11]: numpy.datetime64('2013-04-03T06:21-0700') In [12]: np.datetime_as_string(t, timezone='local') Out[12]: '2013-04-03T06:21-0700' In [13]: np.datetime_as_string(t, timezone='UTC') Out[13]: '2013-04-03T13:21Z' In [14]: np.datetime_as_string(t, timezone=pytz.timezone('US/Eastern')) Out[14]: '2013-04-03T09:21-0400' > This still leaves open the question how leap seconds should be > handled, or if we should just ignore the results and let > people live with > datetime64('2012-01-01T00:00:00Z') - datetime64('1970-01-01T00:00:00Z') > being wrong by ~ 30 seconds... I think it's most straightforward to leave it wrong for Posix-format datetimes, but add a TAI timezone where it will produce correct results. Thanks, Mark > > -- > Pauli Virtanen > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthew.brett at gmail.com Wed Apr 3 21:13:47 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 3 Apr 2013 18:13:47 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1364995488.4038.14.camel@sebastian-laptop> Message-ID: Hi, On Wed, Apr 3, 2013 at 11:44 AM, Matthew Brett wrote: > Hi, > > On Wed, Apr 3, 2013 at 8:52 AM, Chris Barker - NOAA Federal > wrote: >> On Wed, Apr 3, 2013 at 6:24 AM, Sebastian Berg >> wrote: >>>> the context where it gets applied. So giving the same strategy two >>>> different names is silly; if anything it's the contexts that should >>>> have different names. >>>> >>> >>> Yup, that's how I think about it too... >> >> me too... >> >>> But I would really love if someone would try to make the documentation >>> simpler! >> >> yes, I think this is where the solution lies. > > No question that better docs would be an improvement, let's all agree on that. > > We all agree that 'order' is used with two different and orthogonal > meanings in numpy. > > I think we are now more or less agreeing that: > > np.reshape(a, (3, 4), index_order='F') > > is at least as clear as: > > np.reshape(a, (3, 4), order='F') I believe our job here is to come to some consensus. In that spirit, I think we do agree on these statements above. Now we have the cost / benefit. Benefit : Some people may find it easier to understand numpy when these constructs are separated. Cost : There might be some confusion because we have changed the default keywords. Benefit ----------- What proportion of people would find it easier to understand with the order constructs separated? Clearly Chris and Josef and Sebastian - you estimate I think no change in your understanding, because your understanding was near complete already. At least I, Paul Ivanov, JB Poline found the current state strikingly confusing. I think we have other votes for that position here. It's difficult to estimate the proportions now because my original email and the subsequent discussion are based on the distinction already being made. So, it is hard for us to be objective about whether a new user is likely to get confused. At least it seems reasonable to say that some moderate proportion of users will get confused. In that situation, it seems to me the long-term benefit for separating these ideas is relatively high. The benefit will continue over the long term. Cost ------- The ravel docstring would look something like this: index_order : {'C','F', 'A', 'K'}, optional ... This keyword used to be called simply 'order', and you can also use the keyword 'order' to specify index_order (this parameter). The problem would then be that, for a while, there will be older code and docs using 'order' instead of 'index_order'. I think this would not cause much trouble. Reading the docstring will explain the change. The old code will continue to work. This cost will decrease to zero over time. So, if we are planning for the long-term for numpy, I believe the benefit of the change considerably outweighs the cost. I'm happy to do the code changes, so that's not an issue. Cheers, Matthew From waterbug at pangalactic.us Wed Apr 3 21:26:57 2013 From: waterbug at pangalactic.us (Steve Waterbury) Date: Wed, 03 Apr 2013 21:26:57 -0400 Subject: [Numpy-discussion] Please stop bottom posting!! In-Reply-To: References: Message-ID: <515CD6E1.2090906@pangalactic.us> On 04/03/2013 08:06 PM, Charles R Harris wrote: Nice editing!
;) Steve From josef.pktd at gmail.com Wed Apr 3 21:32:34 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 3 Apr 2013 21:32:34 -0400 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1364995488.4038.14.camel@sebastian-laptop> Message-ID: On Wed, Apr 3, 2013 at 9:13 PM, Matthew Brett wrote: > Hi, > > On Wed, Apr 3, 2013 at 11:44 AM, Matthew Brett wrote: >> Hi, >> >> On Wed, Apr 3, 2013 at 8:52 AM, Chris Barker - NOAA Federal >> wrote: >>> On Wed, Apr 3, 2013 at 6:24 AM, Sebastian Berg >>> wrote: >>>>> the context where it gets applied. So giving the same strategy two >>>>> different names is silly; if anything it's the contexts that should >>>>> have different names. >>>>> >>>> >>>> Yup, thats how I think about it too... >>> >>> me too... >>> >>>> But I would really love if someone would try to make the documentation >>>> simpler! >>> >>> yes, I think this is where the solution lies. >> >> No question that better docs would be an improvement, let's all agree on that. >> >> We all agree that 'order' is used with two different and orthogonal >> meanings in numpy. >> >> I think we are now more or less agreeing that: >> >> np.reshape(a, (3, 4), index_order='F') >> >> is at least as clear as: >> >> np.reshape(a, (3, 4), order='F') > > I believe uur job here is to come to some consensus. > > In that spirit, I think we do agree on these statements above. > > Now we have the cost / benefit. > > Benefit : Some people may find it easier to understand numpy when > these constructs are separated. > > Cost : There might be some confusion because we have changed the > default keywords. > > Benefit > ----------- > > What proportion of people would find it easier to understand with the > order constructs separated? Clearly Chris and Josef and Sebastian - > you estimate I think no change in your understanding, because your > understanding was near complete already. > > At least I, Paul Ivanov, JB Poline found the current state strikingly > confusing. I think we have other votes for that position here. It's > difficult to estimate the proportions now because my original email > and the subsequent discussion are based on the distinction already > being made. So, it is hard for us to be objective about whether a new > user is likely to get confused. At least it seems reasonable to say > that some moderate proportion of users will get confused. > > In that situation, it seems to me the long-term benefit for separating > these ideas is relatively high. The benefit will continue over the > long term. > > Cost > ------- > > The ravel docstring would looks something like this: > > index_order : {'C','F', 'A', 'K'}, optional > ... This keyword used to be called simply 'order', and you can > also use the keyword 'order' to specify index_order (this parameter). > > The problem would then be that, for a while, there will be older code > and docs using 'order' instead of 'index_order'. I think this would > not cause much trouble. Reading the docstring will explain the > change. The old code will continue to work. > > This cost will decrease to zero over time. > > So, if we are planning for the long-term for numpy, I believe the > benefit to the change considerably outweighs the cost. > > I'm happy to do the code changes, so that's not an issue. 
> > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Wed Apr 3 23:18:41 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 3 Apr 2013 21:18:41 -0600 Subject: [Numpy-discussion] Moving linalg c code Message-ID: Hi All, There is a PR that adds some blas and lapack functions to numpy. I'm thinking that if that PR is merged it would be good to move all of the blas and lapack functions, including the current ones in numpy/linalg into a single directory somewhere in numpy/core/src. So there are two questions here: should we be adding the new functions, and if so, should we consolidate all the blas and lapack C code into its own directory somewhere in numpy/core/src. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Wed Apr 3 23:31:41 2013 From: ben.root at ou.edu (Benjamin Root) Date: Wed, 3 Apr 2013 23:31:41 -0400 Subject: [Numpy-discussion] timezones and datetime64 In-Reply-To: References: <515BB663.7030804@hilboll.de> Message-ID: On Wed, Apr 3, 2013 at 7:52 PM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: > > Personally, I never need finer resolution than seconds, nor more than > a century, so it's no big deal to me, but just wondering.... > > A use case for finer resolution than seconds (in our field, no less!) is lightning data. At the last SciPy conference, a fellow meteorologist mentioned how difficult it was to plot out lightning data at resolutions finer than microseconds (which is the resolution of the python datetime objects). Matplotlib has not supported the datetime64 object yet (John passed before he could write up that patch). Cheers! Ben By the way, my 12th Rule of Programming is "Never roll your own datetime" -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at gmail.com Thu Apr 4 00:17:30 2013 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Thu, 4 Apr 2013 00:17:30 -0400 Subject: [Numpy-discussion] timezones and datetime64 In-Reply-To: References: <515BB663.7030804@hilboll.de> Message-ID: On 4/3/13, Benjamin Root wrote: > On Wed, Apr 3, 2013 at 7:52 PM, Chris Barker - NOAA Federal < > chris.barker at noaa.gov> wrote: > >> >> Personally, I never need finer resolution than seconds, nor more than >> a century, so it's no big deal to me, but just wondering.... >> >> > A use case for finer resolution than seconds (in our field, no less!) is > lightning data. At the last SciPy conference, a fellow meteorologist > mentioned how difficult it was to plot out lightning data at resolutions > finer than microseconds (which is the resolution of the python datetime > objects). Matplotlib has not supported the datetime64 object yet (John > passed before he could write up that patch). > > Cheers! > Ben > > By the way, my 12th Rule of Programming is "Never roll your own datetime" A rule on par with "never get involved in a land war in Asia": both equally Fraught With Peril. :) Warren > From d.s.seljebotn at astro.uio.no Thu Apr 4 02:53:15 2013 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Thu, 04 Apr 2013 08:53:15 +0200 Subject: [Numpy-discussion] Please stop bottom posting!! 
In-Reply-To: References: Message-ID: <515D235B.4090705@astro.uio.no>

On 04/04/2013 12:00 AM, Chris Barker - NOAA Federal wrote: > OK, OK, I know the fashion is to blast people with "please don't top > post" messages -- it seems to have invaded all the mailing lists I'm > on. > > But I don't get it. > > Most of us have threaded mail readers these days, so is it so hard to > follow a thread?

Fine if you say A, then I say B, then you say C, and so on. As in "where should we go for lunch on Friday". But technical discussions are often not like that -- it's more you say A, B, C, then in response I say D, E, F. Then it helps A LOT if there's an easy convention for recording that D is in response to A, E in response to B, and F in response to C.

With top-posting I'm forced to write "With respect to what you write about the GIL issues, ...". Bleh. In fact, in deep technical threads it's very difficult for me to write top-posting at all, which is why I get so irritated when people switch to it.

Dag Sverre

From dave.hirschfeld at gmail.com Thu Apr 4 03:49:35 2013
From: dave.hirschfeld at gmail.com (Dave Hirschfeld)
Date: Thu, 4 Apr 2013 07:49:35 +0000 (UTC)
Subject: [Numpy-discussion] Moving linalg c code
References: Message-ID:

Charles R Harris gmail.com> writes: > > Hi All,There is a PR that adds some blas and lapack functions to numpy. I'm thinking that if that PR is merged it would be good to move all of the blas and lapack functions, including the current ones in numpy/linalg into a single directory somewhere in numpy/core/src. So there are two questions here: should we be adding the new functions, and if so, should we consolidate all the blas and lapack C code into its own directory somewhere in numpy/core/src.Thoughts? Chuck >

The code in the aforementioned PR would be very useful to me in performance-critical areas of my code. So much so in fact that I've actually rolled my own functions in cython to do what I need. I'd be happy if the functionality was available by default in numpy though.

Regards,
Dave

From daniele at grinta.net Thu Apr 4 05:03:51 2013
From: daniele at grinta.net (Daniele Nicolodi)
Date: Thu, 04 Apr 2013 11:03:51 +0200
Subject: [Numpy-discussion] timezones and datetime64
In-Reply-To: References: <515BB663.7030804@hilboll.de>
Message-ID: <515D41F7.3050908@grinta.net>

On 04/04/2013 01:27, Pauli Virtanen wrote: > > Probably also TAI and UTC/Posix. > > Converting from one format to the other is problematic since > all of them (except TAI afaik) require looking things up in > regularly updated databases. Not only restricted to conversions, > but also arithmetic, `b - a`. Affects also UTC/Posix via leap > seconds --- this probably doesn't usually matter, but if we want > to be strict, it's not good to ignore the issue.

I was about to point out the same issue with UTC. I'm not aware of any library that handles the conversion from UTC to TAI. I would like to know if there is one. Furthermore, UTC cannot be computed unambiguously for times in the future because the leap second insertion is not scheduled regularly but is based on the comparison between UT1 and UTC times.

I think that generally the issue is not relevant for any practical use of a timebase: there are not many applications requiring sub-second accuracy over many-year periods. Solving it correctly (having an internal representation in TAI time and converting to UTC and then to local time for user representation) is complex and potentially confusing for the users (I don't know how many scientists know what TAI is).
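To make the lookup-table point concrete, a toy sketch; the two offsets below are real historical TAI-UTC values, but a usable table needs many more entries and regular updates:

# toy, deliberately incomplete leap-second table:
# (first UTC date of validity, TAI-UTC in whole seconds)
LEAP_TABLE = [("2009-01-01", 34),
              ("2012-07-01", 35)]

def tai_minus_utc(utc_date):
    """Return TAI-UTC in seconds for an ISO date string (toy version)."""
    offset = None
    for start, seconds in LEAP_TABLE:
        if utc_date >= start:   # ISO date strings compare chronologically
            offset = seconds
    if offset is None:
        raise ValueError("date precedes the toy table")
    return offset

tai_minus_utc("2013-04-04")   # 35 -- but only as current as the table itself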
I think it may be neglected in the code but stated in the documentation.

Cheers,
Daniele

From daniele at grinta.net Thu Apr 4 05:08:20 2013
From: daniele at grinta.net (Daniele Nicolodi)
Date: Thu, 04 Apr 2013 11:08:20 +0200
Subject: [Numpy-discussion] timezones and datetime64
In-Reply-To: References: <515BB663.7030804@hilboll.de>
Message-ID: <515D4304.4080007@grinta.net>

On 04/04/2013 03:02, Mark Wiebe wrote: > I think it's most straightforward to leave it wrong for Posix-format > datetimes, but add a TAI timezone where it will produce correct results.

Strictly speaking, TAI is not a timezone but a different time base.

Cheers,
Daniele

From davidmenhur at gmail.com Thu Apr 4 05:20:34 2013
From: davidmenhur at gmail.com (Daπid)
Date: Thu, 4 Apr 2013 11:20:34 +0200
Subject: [Numpy-discussion] timezones and datetime64
In-Reply-To: <515D41F7.3050908@grinta.net>
References: <515BB663.7030804@hilboll.de> <515D41F7.3050908@grinta.net>
Message-ID:

On 4 April 2013 11:03, Daniele Nicolodi wrote: > I think that generally the issue is not relevant for any practical use > of a timebase: there are not many applications requiring sub-second > accuracy over many years periods.

I agree. I think the basic datetime object should ignore this issue (properly documented), leaving room for someone to write a datetime_leapsecond object that would be aware of them. By avoiding this issue we can achieve better performance and a simpler code base, with enough functionality for most practical purposes.

Another point that should be noted: as stated earlier, leap seconds cannot be predicted, so they require a frequent update, making replicability difficult: the same code on two machines with the same numpy version, but updated at different times, can produce different results.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From njs at pobox.com Thu Apr 4 07:54:44 2013
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 4 Apr 2013 12:54:44 +0100
Subject: [Numpy-discussion] timezones and datetime64
In-Reply-To: References: <515BB663.7030804@hilboll.de>
Message-ID:

On Thu, Apr 4, 2013 at 12:52 AM, Chris Barker - NOAA Federal wrote: > Thanks all for taking an interest. I need to think a bot more about > the options before commenting more, but: > > while we're at it: > > It seems very odd to me that datetime64 supports different units > (right down to attosecond) but not different epochs. How can it > possible be useful to use nanoseconds, etc, but only right around > 1970? For that matter, why all the units at all? I can see the need > for nanosecond resolution, but not without changing the epoch -- so if > the epoch is fixed, why bother with different units? Using days (for > instance) rather than seconds doesn't save memory, as we're always > using 64 bits. It can't be common to need more than 2.9e12 years (OK, > that's not quite as old as the universe, so some cosmologists may need > it...)

Another reason why it might be interesting to support different epochs is that many timeseries (e.g., the ones I work with) aren't linked to absolute time, but are instead "milliseconds since we turned on the recording equipment". You can reasonably represent these as timedeltas of course, but it'd be even more elegant to be able to represent them as absolute times against an opaque epoch.
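A sketch of that pattern with the pieces that exist today -- the start time and sampling rate here are invented, and datetime64 of course gives a real (non-opaque) epoch rather than the type-checked one being wished for:

import numpy as np

t0 = np.datetime64('2013-04-04T09:00:00')   # stand-in for "equipment switched on"
recording1_times = t0 + np.arange(1000) * np.timedelta64(1, 'ms')
recording2_times = t0 + np.arange(1000) * np.timedelta64(1, 'ms')

recording1_times[10] - recording2_times[10]   # numpy happily returns 0 milliseconds,
                                              # even if the two start times were
                                              # unrelated clocks in reality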
In particular, when you have multiple recording tracks, only those which were recorded against the same epoch are actually commensurable -- trying to do recording1_times[10] - recording2_times[10] is meaningless and should be an error.

I'm definitely not suggesting we go start retrofitting this into datetime64, but it's a real shame that defining a new dtype is so hard that we can't play around with such things on our own without serious mucking about in numpy's guts :-/.

-n

From josef.pktd at gmail.com Thu Apr 4 08:44:06 2013
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 4 Apr 2013 08:44:06 -0400
Subject: [Numpy-discussion] Moving linalg c code
In-Reply-To: References: Message-ID:

On Thu, Apr 4, 2013 at 3:49 AM, Dave Hirschfeld wrote: > Charles R Harris gmail.com> writes: > >> >> Hi All,There is a PR that adds some blas and lapack functions to numpy. I'm > thinking that if that PR is merged it would be good to move all of the blas > and lapack functions, including the current ones in numpy/linalg into a single > directory somewhere in numpy/core/src. So there are two questions here: should > we be adding the new functions, and if so, should we consolidate all the blas > and lapack C code into its own directory somewhere in numpy/core/src.Thoughts? > Chuck >> > > The code in the aforementioned PR would be very useful to me in performance > critical areas of my code. So much so in fact that I've actually rolled my own > functions in cython to do what I need. I'd be happy if the functionality was > available by default in numpy though.

What I see from a quick browse of the PR is that we can get most of linalg to work on many matrices (2d arrays) at the same time. (It's the first time we get generalized ufuncs that are targeted at users, if I see it correctly.)

I would also like to see these added, and we will have many cases where this will be very useful in statistics and statsmodels.

Josef

> > Regards, > Dave > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion

From jniehof at lanl.gov Thu Apr 4 09:34:09 2013
From: jniehof at lanl.gov (Jonathan T. Niehof)
Date: Thu, 04 Apr 2013 07:34:09 -0600
Subject: [Numpy-discussion] timezones and datetime64
In-Reply-To: <515D41F7.3050908@grinta.net>
References: <515BB663.7030804@hilboll.de> <515D41F7.3050908@grinta.net>
Message-ID: <515D8151.5040400@lanl.gov>

On 04/04/2013 03:03 AM, Daniele Nicolodi wrote: > I'm not aware of any library that handles the conversion from UTC to > TAI. I would like to know if there is one.

The CDF library does, although that's rather a sledgehammer to drive a thumbtack.

> I think that generally the issue is not relevant for any practical use > of a timebase: there are not many applications requiring sub-second > accuracy over many years periods.

Which issue? Leap seconds? Leap seconds are required for *second* accuracy. They're also required for time-tagging anything that takes place during a leap second. The lack of proper no-skip time support in Python datetimes has caused us headaches, and it was annoying to find that datetime64 doesn't improve matters.

Keeping a leap second database up to date is annoying but not as bad as it could be: they're not arbitrary. Although they can occur monthly, they've only ever happened at June 30 and Dec 31, announced in January and July, respectively.
So it would be easy to check the date of a leap second database and warn if 1) the date we're processing is after June 30 of a year AND 2) the LSDB is older than January of the same year (with similar checks for the Dec. 31 opportunity). Er, I guess I'm volunteering to help :)

-- Jonathan Niehof ISR-3 Space Data Systems Los Alamos National Laboratory MS-D466 Los Alamos, NM 87545 Phone: 505-667-9595 email: jniehof at lanl.gov Correspondence / Technical data or Software Publicly Available

From jaakko.luttinen at aalto.fi Thu Apr 4 09:56:33 2013
From: jaakko.luttinen at aalto.fi (Jaakko Luttinen)
Date: Thu, 4 Apr 2013 16:56:33 +0300
Subject: [Numpy-discussion] einsum and broadcasting
Message-ID: <515D8691.9080101@aalto.fi>

I don't quite understand how einsum handles broadcasting. I get the following error, but I don't understand why:

In [8]: import numpy as np
In [9]: A = np.arange(12).reshape((4,3))
In [10]: B = np.arange(6).reshape((3,2))
In [11]: np.einsum('ik,k...->i...', A, B)
---------------------------------------------------------------------------
ValueError: operand 0 did not have enough dimensions to match the broadcasting, and couldn't be extended because einstein sum subscripts were specified at both the start and end

However, if I use explicit indexing, it works:

In [12]: np.einsum('ik,kj->ij', A, B)
Out[12]:
array([[10, 13],
       [28, 40],
       [46, 67],
       [64, 94]])

It seems that it also works if I add '...' to the first operand:

In [12]: np.einsum('ik...,k...->i...', A, B)
Out[12]:
array([[10, 13],
       [28, 40],
       [46, 67],
       [64, 94]])

However, as far as I understand, the syntax np.einsum('ik,k...->i...', A, B) should work. Have I misunderstood something or is there a bug?

Thanks for your help!
Jaakko
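For what it's worth, a sketch extending Jaakko's working variant to a case where the ellipsis genuinely broadcasts; the 3-D shape below is arbitrary, chosen only to show the extra axis being carried through:

import numpy as np

A = np.arange(12).reshape((4, 3))
B = np.arange(6).reshape((3, 2))

# ellipsis on *both* operands, as in the working example above:
C = np.einsum('ik...,k...->i...', A, B)     # same values as A.dot(B)

# with a stacked right-hand operand, '...' matches the trailing axis:
Bs = np.arange(24).reshape((3, 2, 4))
Cs = np.einsum('ik...,k...->i...', A, Bs)   # Cs[i, j, n] = sum over k of A[i, k] * Bs[k, j, n]
Cs.shape                                    # (4, 2, 4)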
From francesc at continuum.io Thu Apr 4 10:37:45 2013
From: francesc at continuum.io (Francesc Alted)
Date: Thu, 04 Apr 2013 16:37:45 +0200
Subject: [Numpy-discussion] timezones and datetime64
In-Reply-To: References: <515BB663.7030804@hilboll.de>
Message-ID: <515D9039.6020006@continuum.io>

On 4/4/13 1:52 AM, Chris Barker - NOAA Federal wrote: > Thanks all for taking an interest. I need to think a bot more about > the options before commenting more, but: > > while we're at it: > > It seems very odd to me that datetime64 supports different units > (right down to attosecond) but not different epochs. How can it > possible be useful to use nanoseconds, etc, but only right around > 1970? For that matter, why all the units at all? I can see the need > for nanosecond resolution, but not without changing the epoch -- so if > the epoch is fixed, why bother with different units?

When Ivan and I were discussing that, I remember us deciding that such small units would be useful mainly for the timedelta datatype, which is a relative, not absolute time. We did not want to come up short for very precise time measurements, and this is why we decided to go with attoseconds.

-- Francesc Alted

From francesc at continuum.io Thu Apr 4 10:52:36 2013
From: francesc at continuum.io (Francesc Alted)
Date: Thu, 04 Apr 2013 16:52:36 +0200
Subject: [Numpy-discussion] timezones and datetime64
In-Reply-To: References: <515BB663.7030804@hilboll.de>
Message-ID: <515D93B4.3000108@continuum.io>

On 4/4/13 1:54 PM, Nathaniel Smith wrote: > On Thu, Apr 4, 2013 at 12:52 AM, Chris Barker - NOAA Federal > wrote: >> Thanks all for taking an interest. I need to think a bot more about >> the options before commenting more, but: >> >> while we're at it: >> >> It seems very odd to me that datetime64 supports different units >> (right down to attosecond) but not different epochs. How can it >> possible be useful to use nanoseconds, etc, but only right around >> 1970? For that matter, why all the units at all? I can see the need >> for nanosecond resolution, but not without changing the epoch -- so if >> the epoch is fixed, why bother with different units? Using days (for >> instance) rather than seconds doesn't save memory, as we're always >> using 64 bits. It can't be common to need more than 2.9e12 years (OK, >> that's not quite as old as the universe, so some cosmologists may need >> it...) > Another reason why it might be interesting to support different epochs > is that many timeseries (e.g., the ones I work with) aren't linked to > absolute time, but are instead "milliseconds since we turned on the > recording equipment". You can reasonably represent these as timedeltas > of course, but it'd be even more elegant to be able to > represent them as absolute times against an opaque epoch. In > particular, when you have multiple recording tracks, only those which > were recorded against the same epoch are actually commensurable -- > trying to do > recording1_times[10] - recording2_times[10] > is meaningless and should be an error.

I remember discussing this in some depth 5 years ago on this list, when we asked people about the convenience of including a user-defined 'epoch'. We were calling it 'origin'. But apparently it was decided that this was not needed because timestamps+timedelta would be enough. The NEP still reflects this discussion:

https://github.com/numpy/numpy/blob/master/doc/neps/datetime-proposal.rst#why-the-origin-metadata-disappeared

This is just an historical note, not that we can't change that again.

-- Francesc Alted

From chris.barker at noaa.gov Thu Apr 4 12:01:34 2013
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Thu, 4 Apr 2013 09:01:34 -0700
Subject: [Numpy-discussion] Please stop bottom posting!!
In-Reply-To: <515D235B.4090705@astro.uio.no>
References: <515D235B.4090705@astro.uio.no>
Message-ID:

On Wed, Apr 3, 2013 at 11:53 PM, Dag Sverre Seljebotn wrote: > With top-posting I'm forced to write "With respect to what you write > about the GIL issues, ...". Bleh. In fact, in deep technical threads > it's very difficult for me to write top-posting at all, which is why I > get so irritated when people switch to it.

Well, raw top posting is pretty painful, I agree, though I personally prefer it over raw bottom posting, particularly for really short, simple comments. And if you are adding to a deep technical thread, raw bottom posting is no better: you still would need to write "With respect to what you write about the GIL issues, ..."

Which is why I advocate interspersed posting.

It comes down to: please take some thought to compose your post in a way that is suited to the thread and what you are writing, rather than simply use whatever your mail or news reader makes easy without thinking about it. Ideally, for each person that writes a post, a lot of people are reading it -- so be respectful of the readers' time, more than your own.

-Chris

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov

From nouiz at nouiz.org Thu Apr 4 12:07:45 2013
From: nouiz at nouiz.org (Frédéric Bastien)
Date: Thu, 4 Apr 2013 18:07:45 +0200
Subject: [Numpy-discussion] Please stop bottom posting!!
In-Reply-To: References: <515D235B.4090705@astro.uio.no>
Message-ID:

On Thu, Apr 4, 2013 at 6:01 PM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: [...] > Which is why I advocate interspersed posting. > > It comes down to: please take some thought to compose your post in a > way that is suited to the thread and what you are writing, rather than > simply use whatever your mail or news reader makes easy without > thinking about it. > > Ideally, for each person that writes a post, a lot of people are > reading it -- so be respectful of the readers' time, more than your > own. >

Since I read that idea somewhere else, that is what I suggest too. But it seems more people say something different, and I suppose they never thought of this. I have never heard an argument against it.

Fred

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From chris.barker at noaa.gov Thu Apr 4 12:21:07 2013
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Thu, 4 Apr 2013 09:21:07 -0700
Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
In-Reply-To: References: <1364995488.4038.14.camel@sebastian-laptop>
Message-ID:

On Wed, Apr 3, 2013 at 6:13 PM, Matthew Brett wrote: >> We all agree that 'order' is used with two different and orthogonal >> meanings in numpy.

Well, not entirely orthogonal -- they are the same concept, used in different contexts, so there is some benefit to their having similarity. So I'd advocate for using the same flag names in any case -- i.e. "C" and "F" in both cases.

>> I think we are now more or less agreeing that: >> >> np.reshape(a, (3, 4), index_order='F') >> >> is at least as clear as: >> >> np.reshape(a, (3, 4), order='F')

sure.

The trick is:

np.reshape(a, (3, 4), index_order='A')

which is mingling index_order and memory order...

>> I believe our job here is to come to some consensus.

yup.

>> In that spirit, I think we do agree on these statements above.

with the caveats I just added...

>> Now we have the cost / benefit. >> >> Benefit : Some people may find it easier to understand numpy when >> these constructs are separated. >> >> Cost : There might be some confusion because we have changed the >> default keywords. >> >> Benefit >> ----------- >> >> What proportion of people would find it easier to understand with the >> order constructs separated?

It's not just numbers -- it's depth of confusion -- if, once you "get" it, you remember it for the rest of your numpy use, then it's no big deal. However, if you need to re-think and test every time you re-visit reshape or ravel, then there's a significant benefit.

We are talking about "separating the concepts", but I think it takes more than a keyword change to do that -- the 'A' and 'K' flags mingle the concepts, and are going to be confusing with new keywords -- maybe even more so (it says index_order, but the docstring talks about memory order).

Does anyone think we should deprecate the 'A' and 'K' flags?
Before you answer that -- does anyone see a use case for the 'A' and 'K' flags that can't be reasonably easily accomplished with .view() or asarray() or ???

If we get rid of the 'A' and 'K' flags, I think the docstring will be more clear, and there may be less need for two names for the different "order" concepts (though we could change the flags and the keywords...)

>> The ravel docstring would look something like this: >> >> index_order : {'C','F', 'A', 'K'}, optional >> ... This keyword used to be called simply 'order', and you can >> also use the keyword 'order' to specify index_order (this parameter). >> >> The problem would then be that, for a while, there will be older code >> and docs using 'order' instead of 'index_order'. I think this would >> not cause much trouble. Reading the docstring will explain the >> change. The old code will continue to work.

not a killer, I agree.

-Chris

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov

From chris.barker at noaa.gov Thu Apr 4 13:01:51 2013
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Thu, 4 Apr 2013 10:01:51 -0700
Subject: [Numpy-discussion] timezones and datetime64
In-Reply-To: References: <515BB663.7030804@hilboll.de>
Message-ID:

On Wed, Apr 3, 2013 at 6:02 PM, Mark Wiebe wrote: > On Wed, Apr 3, 2013 at 4:27 PM, Pauli Virtanen wrote: >> Probably also TAI and UTC/Posix. >> >> Converting from one format to the other is problematic since >> all of them (except TAI afaik) require looking things up in >> regularly updated databases. Not only restricted to conversions, >> but also arithmetic, `b - a`. Affects also UTC/Posix via leap >> seconds --- this probably doesn't usually matter, but if we want >> to be strict, it's not good to ignore the issue. > > I think this would be nice. Would it be a stretch to extend the ISO syntax > with '2013-02-02T03:00:00TAI'?

Is there no standard for that already -- it's not mentioned in ISO_8601, but maybe there is something else.

> One problem with trying to give technically correct answers for the > UTC/Posix format is that it can't actually represent the leap-second, so a > datetime64 + timedelta64 could produce an unrepresentable moment in time.

I'm a bit confused by that -- how is it different from leap-days?

Benjamin Root wrote: > A use case for finer resolution than seconds (in our field, no less!) is lightning data. > At the last SciPy conference, a fellow meteorologist mentioned how difficult it was > to plot out lightning data at resolutions finer than microseconds (which is the > resolution of the python datetime objects).

sure -- but now you can only do that for lightning in 1970.... how useful is that? My point was that without the ability to choose your epoch, high-resolution time is pretty worthless. Also considering the leap-second and TAI vs. UTC issues, you really wouldn't want to use an epoch far away from your time-of-interest for high-resolution data anyway.

> By the way, my 12th Rule of Programming is "Never roll your own datetime"

No kidding! Which is why I was (am) really happy to see it in numpy...

Daniele Nicolodi wrote: > I'm not aware of any library that handles the conversion from UTC to > TAI. I would like to know if there is one.

In keeping with the above -- I doubt the numpy project wants to write and maintain its own from scratch... I'm guessing we'll need to punt on that.
Francesc Alted wrote: > When Ivan and I were discussing that, I remember us deciding that such > small units would be useful mainly for the timedelta datatype, which > is a relative, not absolute time. We did not want to come up short for > very precise time measurements, and this is why we decided to go with > attoseconds.

I thought about that -- but if you have timedelta without datetime, you really just have an integer -- we haven't bought anything.

It seems we have a number of somewhat orthogonal issues with DateTime in front of us:

1) How to handle (or not) time zones
2) How (whether) to handle leap-seconds, etc.
3) Whether to support TAI time (or is that the same as the above?)
4) Should we add a flexible epoch?

I suggest we create separate threads for these, discuss a bit more, then have at the NEP. I'll start one for (1). I don't have the expertise nor use-case for (2) and (3), so I'll punt, but someone can pick it up. I'll start one for (4) also, though I'm not sure I have much to say, other than that I think it's a good idea. My naive view is that it would be pretty easy, actually, but I could be very wrong there.

-Chris

--

Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov

From charlesr.harris at gmail.com Thu Apr 4 13:06:43 2013
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 4 Apr 2013 11:06:43 -0600
Subject: [Numpy-discussion] timezones and datetime64
In-Reply-To: References: <515BB663.7030804@hilboll.de>
Message-ID:

On Thu, Apr 4, 2013 at 11:01 AM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: > On Wed, Apr 3, 2013 at 6:02 PM, Mark Wiebe wrote: > > On Wed, Apr 3, 2013 at 4:27 PM, Pauli Virtanen wrote: > >> Probably also TAI and UTC/Posix. > >> > >> Converting from one format to the other is problematic since > >> all of them (except TAI afaik) require looking things up in > >> regularly updated databases. Not only restricted to conversions, > >> but also arithmetic, `b - a`. Affects also UTC/Posix via leap > >> seconds --- this probably doesn't usually matter, but if we want > >> to be strict, it's not good to ignore the issue. > > > > I think this would be nice. Would it be a stretch to extend the ISO syntax > with '2013-02-02T03:00:00TAI'? > > Is there no standard for that already -- it's not mentioned in > ISO_8601, but maybe there is something else. > > > One problem with trying to give technically correct answers for the > > UTC/Posix format is that it can't actually represent the leap-second, so > a > > datetime64 + timedelta64 could produce an unrepresentable moment in time. > > I'm a bit confused by that -- how is it different from leap-days?

There is a rule for leap days, one knows ahead of time when they will occur.

Chuck

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
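Chuck's point is easy to make concrete: the Gregorian leap-day rule is a pure function of the year, so unlike leap seconds it never needs a lookup table. A two-line sketch:

def is_leap_year(year):
    # Gregorian rule: every 4th year, except century years not divisible by 400
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

is_leap_year(2012), is_leap_year(2100)   # (True, False)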
From njs at pobox.com Thu Apr 4 13:28:53 2013
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 4 Apr 2013 18:28:53 +0100
Subject: [Numpy-discussion] timezones and datetime64
In-Reply-To: References: <515BB663.7030804@hilboll.de>
Message-ID:

On Thu, Apr 4, 2013 at 6:06 PM, Charles R Harris wrote: > > > On Thu, Apr 4, 2013 at 11:01 AM, Chris Barker - NOAA Federal > wrote: >> >> On Wed, Apr 3, 2013 at 6:02 PM, Mark Wiebe wrote: >> > One problem with trying to give technically correct answers for the >> > UTC/Posix format is that it can't actually represent the leap-second, so >> > a >> > datetime64 + timedelta64 could produce an unrepresentable moment in >> > time. >> >> I'm a bit confused by that -- how is it different from leap-days? > > There is a rule for leap days, one knows ahead of time when they will occur.

That is true, but it's not the issue with POSIX. The problem with POSIX is that when a leap second is inserted, you end up with two different physical spans of time that get assigned exactly the same POSIX time value.

Imagine if we represented days-of-the-year by integers in the range [0, 365), and the standard said that all years must have exactly 365 days. When a leap year happened, we'd have nowhere to put Feb. 29. So instead we'd just have to have Feb. 28 twice. That's how POSIX handles leap seconds. (Also, if we ever get a deleted leap second -- which has never happened -- then there will be an integer value that time_t can take, but that does not correspond to any actual physical moment in time.)

Leap second records are recorded in the Olson tz database, so in principle they should be available on most systems, and without us having to take responsibility for keeping the database up to date. I don't know if pytz makes this available.

-n

From francesc at continuum.io Thu Apr 4 13:54:38 2013
From: francesc at continuum.io (Francesc Alted)
Date: Thu, 04 Apr 2013 19:54:38 +0200
Subject: [Numpy-discussion] timezones and datetime64
In-Reply-To: References: <515BB663.7030804@hilboll.de>
Message-ID: <515DBE5E.1010507@continuum.io>

On 4/4/13 7:01 PM, Chris Barker - NOAA Federal wrote: > Francesc Alted wrote: >> When Ivan and I were discussing that, I remember us deciding that such >> small units would be useful mainly for the timedelta datatype, which >> is a relative, not absolute time. We did not want to come up short for >> very precise time measurements, and this is why we decided to go with >> attoseconds. > I thought about that -- but if you have timedelta without datetime, > you really just have an integer -- we haven't bought anything.

Well, it is not just an integer. It is an integer with a time scale:

In []: np.array(1, dtype='timedelta64[us]') + np.array(1, dtype='timedelta64[ns]')
Out[]: numpy.timedelta64(1001,'ns')

That makes a difference. This can be especially important for creating user-defined time origins:

In []: np.array(int(1.5e9), dtype='datetime64[s]') + np.array(1, dtype='timedelta64[ns]')
Out[]: numpy.datetime64('2017-07-14T04:40:00.000000001+0200')

-- Francesc Alted

From hodge at stsci.edu Thu Apr 4 14:09:59 2013
From: hodge at stsci.edu (Phil Hodge)
Date: Thu, 4 Apr 2013 14:09:59 -0400
Subject: [Numpy-discussion] timezones and datetime64
In-Reply-To: <515D41F7.3050908@grinta.net>
References: <515BB663.7030804@hilboll.de> <515D41F7.3050908@grinta.net>
Message-ID: <515DC1F7.6060107@stsci.edu>

On 04/04/2013 05:03 AM, Daniele Nicolodi wrote: > I'm not aware of any library that handles the conversion from UTC to
I would like to know if there is one. Furthermore, UTC cannot be > computed unambiguously for times in the future because the leap second > insertion is not scheduled regularly but it is based on the comparison > between UT1 and UTC times. The slalib function sla_DAT gives the offset to be added to UTC to give TAI. According to the AstroPy web site there's a Python wrapper (pySLALIB). Phil From josef.pktd at gmail.com Thu Apr 4 14:26:51 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 4 Apr 2013 14:26:51 -0400 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1364995488.4038.14.camel@sebastian-laptop> Message-ID: On Thu, Apr 4, 2013 at 12:21 PM, Chris Barker - NOAA Federal wrote: > On Wed, Apr 3, 2013 at 6:13 PM, Matthew Brett wrote: >>> We all agree that 'order' is used with two different and orthogonal >>> meanings in numpy. > > well, not entirely orthogonal -- they are the some concept, used in > different contexts, so there is some benefit to their having > similarity. So I"d advocate for using the same flag names in any case > -- i.e. "C" and "F" in both cases. > >>> I think we are now more or less agreeing that: >>> >>> np.reshape(a, (3, 4), index_order='F') >>> >>> is at least as clear as: >>> >>> np.reshape(a, (3, 4), order='F') > > sure. > > The trick is: > > np.reshape(a, (3, 4), index_order='A') > > which in mingling index_order and memory order...... > >> I believe our job here is to come to some consensus. > > yup. > >> In that spirit, I think we do agree on these statements above. > > with the caveats I just added... > >> Now we have the cost / benefit. >> >> Benefit : Some people may find it easier to understand numpy when >> these constructs are separated. >> >> Cost : There might be some confusion because we have changed the >> default keywords. >> >> Benefit >> ----------- >> >> What proportion of people would find it easier to understand with the >> order constructs separated? > > It's not just numbers -- it's depth of confusion -- if, once you "get" > it, you remember it for the rest of your numpy use, then it's not big > deal. However, if you need to re-think and test every time you > re-visit reshape or ravel, then there's a significant benefit. I would also add: If you need it, it's easy to find and understand, even if it's not completely "obvious" just reading the current docstring. ("Proof": I haven't seen anyone having problems with "column-stacking" in statsmodels.) > > We are talking about "separating the concepts", but I think it takes > more than a keyword change to do that -- the 'A' and 'K' flags mingle > the concpets, and are going to be confusing with new keywords -- maybe > even more so (it says index_order, but the docstring talks about > memory order) > > Does anyone think we should depreciate the 'A' and 'K' flags? > > Before you answer that -- does anyone see a use case for the 'A' and > 'K' flags that can't be reasonably easily accomplished with .view() or > asarray() or ??? What order does a[a>2] use to create the returned 1-D array? I didn't know, don't remember if I ever knew, and I had to try it out. How do you find a docstring for this? 
http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html?highlight=order#boolean

However, I never needed to know and never cared:

a[a>2] = 5
a[a>2] = b[a>2]

Now, after this thread, I know about "K", and there might be cases where it would be appropriate to minimize copying memory, as Sebastian said, when (index) order doesn't matter. (Although I'm still using an older numpy, and won't have it for a while.)

> > if we get rid of the 'A' and 'K' flags, I think the docstring > will be more clear, and there may be less need for two names for the > different "order" concepts (though we could change the flags and the > keywords...) > >> The ravel docstring would look something like this: >> >> index_order : {'C','F', 'A', 'K'}, optional >> ... This keyword used to be called simply 'order', and you can >> also use the keyword 'order' to specify index_order (this parameter). >> >> The problem would then be that, for a while, there will be older code >> and docs using 'order' instead of 'index_order'. I think this would >> not cause much trouble. Reading the docstring will explain the >> change. The old code will continue to work. > > not a killer, I agree.

Not a killer, but not worth the effort either, I still think. As I tried to explain, order is consistently used in the documentation, both in the introduction and in many functions, as a general concept with two levels of application. Either you have to rewrite it everywhere, or you get inconsistency. Newbie: "Why are they suddenly talking about index_order, did I miss something, which other orders are there?"

I think adding a section to explain order more explicitly (Sebastian above) and improving the docstrings would be very helpful, but changing the name of the keyword is secondary. (It will mainly help as a reminder for users that are focused on memory, and not on the values in their arrays.)

Josef

----------------------
>>> aa.shape
(5, 5)
>>> aa.var()
340.0
>>> np.all(aa.ravel("A") == aa.ravel("C"))
True
>>> np.all(aa.ravel("A") == aa.ravel("F"))
True
>>> np.all(aa.ravel("C") == aa.ravel("F"))
True
---------------------

> > -Chris > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion

From chris.barker at noaa.gov Thu Apr 4 14:42:04 2013
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Thu, 4 Apr 2013 11:42:04 -0700
Subject: [Numpy-discussion] timezones and datetime64
In-Reply-To: <515DBE5E.1010507@continuum.io>
References: <515BB663.7030804@hilboll.de> <515DBE5E.1010507@continuum.io>
Message-ID:

On Thu, Apr 4, 2013 at 10:54 AM, Francesc Alted wrote: > On 4/4/13 7:01 PM, Chris Barker - NOAA Federal wrote: >> I thought about that -- but if you have timedelta without datetime, >> you really just have an integer -- we haven't bought anything. > > Well, it is not just an integer. It is an integer with a time scale: > > In []: np.array(1, dtype='timedelta64[us]') + np.array(1, > dtype='timedelta64[ns]') > Out[]: numpy.timedelta64(1001,'ns') > > That makes a difference. This can be especially important for creating > user-defined time origins:

And mixing units, as you show.
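The unit promotion generalizes, and combines naturally with an arbitrary origin -- a small sketch, the origin date being an invented example:

import numpy as np

np.array(1, dtype='timedelta64[us]') + np.array(1, dtype='timedelta64[ns]')
# numpy.timedelta64(1001,'ns') -- the finer unit wins

origin = np.datetime64('2000-01-01')             # user-chosen origin, stored in days
origin + np.array(1, dtype='timedelta64[ns]')    # promoted to a nanosecond-unit
                                                 # datetime64, one ns after the origin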
I'm curious about use-cases, though -- I can't imagine using it, rather than just a standard integer-unit-appropriate-for-the-use-case. It just doesn't buy enough. For much of my code, for instance, we just use integers for time (seconds since some epoch). What I really like about a real datetime type is when I want to convert to-from year-month-day-etc forms. And that nifty feature isn't really usable with high-res datetime64.

-Chris

--

Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov

From matthew.brett at gmail.com Thu Apr 4 14:45:38 2013
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 4 Apr 2013 11:45:38 -0700
Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
In-Reply-To: References: <1364995488.4038.14.camel@sebastian-laptop>
Message-ID:

Hi,

On Thu, Apr 4, 2013 at 9:21 AM, Chris Barker - NOAA Federal wrote: > On Wed, Apr 3, 2013 at 6:13 PM, Matthew Brett wrote: >>> We all agree that 'order' is used with two different and orthogonal >>> meanings in numpy.

A brief thank you for your helpful and thoughtful discussion.

> well, not entirely orthogonal -- they are the same concept, used in > different contexts,

Here's a further clarification, in the hope that it is helpful:

Input and output index orderings are orthogonal - I can read the data with C index ordering and return an array that is index ordered any-old-how.

F and C are used in the sense of F contiguous and C contiguous - where contiguous is not the same concept as index ordering.

So I think it's hard to say these concepts are not orthogonal, simply in the technical sense that order='F' could mean:

* read my data using F-style index ordering
* return my data in an array using F-style index ordering
* (related to above) return my data in F-contiguous memory layout

Would you agree with the stuff above? If you do - do you agree that not separating these ideas could be confusing?

Cheers,

Matthew

From matthew.brett at gmail.com Thu Apr 4 14:48:40 2013
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 4 Apr 2013 11:48:40 -0700
Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
In-Reply-To: References: <1364995488.4038.14.camel@sebastian-laptop>
Message-ID:

Hi,

On Thu, Apr 4, 2013 at 9:21 AM, Chris Barker - NOAA Federal wrote: > On Wed, Apr 3, 2013 at 6:13 PM, Matthew Brett wrote: >>> We all agree that 'order' is used with two different and orthogonal >>> meanings in numpy. > > well, not entirely orthogonal -- they are the same concept, used in > different contexts, so there is some benefit to their having > similarity. So I'd advocate for using the same flag names in any case > -- i.e. "C" and "F" in both cases. > >>> I think we are now more or less agreeing that: >>> >>> np.reshape(a, (3, 4), index_order='F') >>> >>> is at least as clear as: >>> >>> np.reshape(a, (3, 4), order='F') > > sure. > > The trick is: > > np.reshape(a, (3, 4), index_order='A') > > which is mingling index_order and memory order... > >> I believe our job here is to come to some consensus. > > yup. > >> In that spirit, I think we do agree on these statements above. > > with the caveats I just added... > >> Now we have the cost / benefit.
>> >> Benefit : Some people may find it easier to understand numpy when >> these constructs are separated. >> >> Cost : There might be some confusion because we have changed the >> default keywords. >> >> Benefit >> ----------- >> >> What proportion of people would find it easier to understand with the >> order constructs separated? > > It's not just numbers -- it's depth of confusion -- if, once you "get" > it, you remember it for the rest of your numpy use, then it's not big > deal. However, if you need to re-think and test every time you > re-visit reshape or ravel, then there's a significant benefit. > > We are talking about "separating the concepts", but I think it takes > more than a keyword change to do that -- the 'A' and 'K' flags mingle > the concpets, and are going to be confusing with new keywords -- maybe > even more so (it says index_order, but the docstring talks about > memory order) > > Does anyone think we should depreciate the 'A' and 'K' flags? Would you consider moving this one to another thread? Cheers, Matthew From chris.barker at noaa.gov Thu Apr 4 14:56:41 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 4 Apr 2013 11:56:41 -0700 Subject: [Numpy-discussion] timezones and datetime64 In-Reply-To: <515DBE5E.1010507@continuum.io> References: <515BB663.7030804@hilboll.de> <515DBE5E.1010507@continuum.io> Message-ID: On Thu, Apr 4, 2013 at 10:54 AM, Francesc Alted wrote: > That makes a difference. This can be specially important for creating > user-defined time origins: > > In []: np.array(int(1.5e9), dtype='datetime64[s]') + np.array(1, > dtype='timedelta64[ns]') > Out[]: numpy.datetime64('2017-07-14T04:40:00.000000001+0200') but that's worthless if you try it higher-resolution: In [40]: np.array(int(1.5e9), dtype='datetime64[s]') Out[40]: array(datetime.datetime(2017, 7, 14, 2, 40), dtype='datetime64[s]') # Start at 2017 # add a picosecond: In [41]: np.array(int(1.5e9), dtype='datetime64[s]') + np.array(1, dtype='timedelta64[ps]') Out[41]: numpy.datetime64('1970-03-08T22:55:30.029526319105-0800') # get 1970??? And even with nanoseconds, given the leap-second issues, etc, you really wouldn't want to do this anyway -- rather, keep your epoch close by. Now that I think about it -- being able to set your epoch could lessen the impact of leap-seconds for second-resolution as well. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From cjwilliams43 at gmail.com Thu Apr 4 15:17:57 2013 From: cjwilliams43 at gmail.com (Colin J. Williams) Date: Thu, 04 Apr 2013 15:17:57 -0400 Subject: [Numpy-discussion] try to solve issue #2649 and revisit #473 In-Reply-To: References: <515C7A04.6000004@gmail.com> <515C8B0E.2060605@gmail.com> Message-ID: <515DD1E5.40803@gmail.com> An HTML attachment was scrubbed... URL: From francesc at continuum.io Thu Apr 4 15:38:23 2013 From: francesc at continuum.io (Francesc Alted) Date: Thu, 04 Apr 2013 21:38:23 +0200 Subject: [Numpy-discussion] timezones and datetime64 In-Reply-To: References: <515BB663.7030804@hilboll.de> <515DBE5E.1010507@continuum.io> Message-ID: <515DD6AF.5080303@continuum.io> On 4/4/13 8:56 PM, Chris Barker - NOAA Federal wrote: > On Thu, Apr 4, 2013 at 10:54 AM, Francesc Alted wrote: > >> That makes a difference. 
This can be specially important for creating >> user-defined time origins: >> >> In []: np.array(int(1.5e9), dtype='datetime64[s]') + np.array(1, >> dtype='timedelta64[ns]') >> Out[]: numpy.datetime64('2017-07-14T04:40:00.000000001+0200') > but that's worthless if you try it higher-resolution: > > In [40]: np.array(int(1.5e9), dtype='datetime64[s]') > Out[40]: array(datetime.datetime(2017, 7, 14, 2, 40), dtype='datetime64[s]') > > # Start at 2017 > > # add a picosecond: > In [41]: np.array(int(1.5e9), dtype='datetime64[s]') + np.array(1, > dtype='timedelta64[ps]') > Out[41]: numpy.datetime64('1970-03-08T22:55:30.029526319105-0800') > > # get 1970??? This is clearly a bug. Could you file a ticket please? Also, using attoseconds is giving a weird behavior: In []: np.array(int(1.5e9), dtype='datetime64[s]') + np.array(1, dtype='timedelta64[as]') --------------------------------------------------------------------------- OverflowError Traceback (most recent call last) in () ----> 1 np.array(int(1.5e9), dtype='datetime64[s]') + np.array(1, dtype='timedelta64[as]') OverflowError: Integer overflow getting a common metadata divisor for NumPy datetime metadata [s] and [as] I would expect the attosecond to be happily ignored and nothing would be added. > > And even with nanoseconds, given the leap-second issues, etc, you > really wouldn't want to do this anyway -- rather, keep your epoch > close by. > > Now that I think about it -- being able to set your epoch could lessen > the impact of leap-seconds for second-resolution as well. Probably this is the way to go, yes. -- Francesc Alted From matthew.brett at gmail.com Thu Apr 4 15:40:58 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 4 Apr 2013 12:40:58 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1364995488.4038.14.camel@sebastian-laptop> Message-ID: Hi, On Thu, Apr 4, 2013 at 11:45 AM, Matthew Brett wrote: > Hi, > > On Thu, Apr 4, 2013 at 9:21 AM, Chris Barker - NOAA Federal > wrote: >> On Wed, Apr 3, 2013 at 6:13 PM, Matthew Brett wrote: >>>> We all agree that 'order' is used with two different and orthogonal >>>> meanings in numpy. > > Brief thank you for your helpful and thoughtful discussion. > >> well, not entirely orthogonal -- they are the some concept, used in >> different contexts, > > Here's a further clarification, in the hope that it is helpful: > > Input and output index orderings are orthogonal - I can read the data > with C index ordering and return an array that is index ordered > any-old-how. > > F and C are used in the sense of F contiguous and C contiguous - where > contiguous is not the same concept as index ordering. > > So I think it's hard to say these concepts are not orthogonal, simply > in the technical sense that order='F" could mean: > > * read my data using F-style index ordering > * return my data in an array using F-style index ordering > * (related to above) return my data in F-contiguous memory layout Sorry this is not well-put and should increase confusion rather than decrease it. I'll try again if I may. What do we mean by 'Fortran' 'order'. 
Two things:

* np.array(a, order='F') - Fortran contiguous: the array memory is contiguous, the strides vector is strictly increasing
* np.ravel(a, order='F') - first-to-last index ordering used to recover values from the array

They are related in the sense that Fortran-contiguous layout in memory means that returning the elements as stored in memory gives the same answer as first-to-last index ordering. They are different in the sense that first-to-last index ordering applies to any memory layout - it is orthogonal to memory layout. In particular 'contiguous' has no meaning for first-to-last or last-to-first index ordering.

So - to restate in other words - this:

np.reshape(a, (3, 4), order='F')

could reasonably mean one of two orthogonal things:

1) Retrieve data from the array using first-to-last indexing, return any memory layout you like
2) Retrieve data from the array using the default last-to-first index ordering, and return memory in F-contiguous layout

Cheers,

Matthew
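To see the memory-layout half of this concretely, a sketch (the dtype is pinned to int64 so the stride values are determinate; np.asfortranarray is just one convenient way to get F-contiguous memory):

import numpy as np

a = np.arange(6, dtype=np.int64).reshape(2, 3)   # C-contiguous
f = np.asfortranarray(a)                         # same values, F-contiguous

a.strides   # (24, 8) - last index fastest in memory
f.strides   # (8, 16) - strictly increasing, as described above

a.ravel(order='F')   # array([0, 3, 1, 4, 2, 5]) - index ordering, equally
                     # well-defined on the C-contiguous layout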
From josef.pktd at gmail.com Thu Apr 4 15:54:22 2013
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 4 Apr 2013 15:54:22 -0400
Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
In-Reply-To: References: <1364995488.4038.14.camel@sebastian-laptop>
Message-ID:

On Thu, Apr 4, 2013 at 3:40 PM, Matthew Brett wrote: > Hi, > > On Thu, Apr 4, 2013 at 11:45 AM, Matthew Brett wrote: >> Hi, >> >> On Thu, Apr 4, 2013 at 9:21 AM, Chris Barker - NOAA Federal >> wrote: >>> On Wed, Apr 3, 2013 at 6:13 PM, Matthew Brett wrote: >>>>> We all agree that 'order' is used with two different and orthogonal >>>>> meanings in numpy. >> >> A brief thank you for your helpful and thoughtful discussion. >> >>> well, not entirely orthogonal -- they are the same concept, used in >>> different contexts, >> >> Here's a further clarification, in the hope that it is helpful: >> >> Input and output index orderings are orthogonal - I can read the data >> with C index ordering and return an array that is index ordered >> any-old-how. >> >> F and C are used in the sense of F contiguous and C contiguous - where >> contiguous is not the same concept as index ordering. >> >> So I think it's hard to say these concepts are not orthogonal, simply >> in the technical sense that order='F' could mean: >> >> * read my data using F-style index ordering >> * return my data in an array using F-style index ordering >> * (related to above) return my data in F-contiguous memory layout > > Sorry this is not well-put and should increase confusion rather than > decrease it. I'll try again if I may. > > What do we mean by 'Fortran' 'order'. > > Two things: > > * np.array(a, order='F') - Fortran contiguous: the array memory is > contiguous, the strides vector is strictly increasing > * np.ravel(a, order='F') - first-to-last index ordering used to > recover values from the array > > They are related in the sense that Fortran-contiguous layout in memory > means that returning the elements as stored in memory gives the same > answer as first-to-last index ordering. They are different in the > sense that first-to-last index ordering applies to any memory layout - > it is orthogonal to memory layout. In particular 'contiguous' has no > meaning for first-to-last or last-to-first index ordering. > > So - to restate in other words - this: > > np.reshape(a, (3, 4), order='F') > > could reasonably mean one of two orthogonal things: > > 1) Retrieve data from the array using first-to-last indexing, return > any memory layout you like > 2) Retrieve data from the array using the default last-to-first index > ordering, and return memory in F-contiguous layout

No to interpretation 2): reshape and ravel (in contrast to flatten) just return a view (if possible), possibly with some strange strides. docstring:

"numpy.reshape(a, newshape, order='C')
Gives a new shape to an array without changing its data"

functions that return views versus functions that create new arrays

Josef

> > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion

From matthew.brett at gmail.com Thu Apr 4 16:02:20 2013
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 4 Apr 2013 13:02:20 -0700
Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering
In-Reply-To: References: <1364995488.4038.14.camel@sebastian-laptop>
Message-ID:

Hi,

On Thu, Apr 4, 2013 at 12:54 PM, wrote: > On Thu, Apr 4, 2013 at 3:40 PM, Matthew Brett wrote: >> Hi, >> >> On Thu, Apr 4, 2013 at 11:45 AM, Matthew Brett wrote: >>> Hi, >>> >>> On Thu, Apr 4, 2013 at 9:21 AM, Chris Barker - NOAA Federal >>> wrote: >>>> On Wed, Apr 3, 2013 at 6:13 PM, Matthew Brett wrote: >>>>>> We all agree that 'order' is used with two different and orthogonal >>>>>> meanings in numpy. >>> >>> A brief thank you for your helpful and thoughtful discussion. >>> >>>> well, not entirely orthogonal -- they are the same concept, used in >>>> different contexts, >>> >>> Here's a further clarification, in the hope that it is helpful: >>> >>> Input and output index orderings are orthogonal - I can read the data >>> with C index ordering and return an array that is index ordered >>> any-old-how. >>> >>> F and C are used in the sense of F contiguous and C contiguous - where >>> contiguous is not the same concept as index ordering. >>> >>> So I think it's hard to say these concepts are not orthogonal, simply >>> in the technical sense that order='F' could mean: >>> >>> * read my data using F-style index ordering >>> * return my data in an array using F-style index ordering >>> * (related to above) return my data in F-contiguous memory layout >> >> Sorry this is not well-put and should increase confusion rather than >> decrease it. I'll try again if I may. >> >> What do we mean by 'Fortran' 'order'. >> >> Two things: >> >> * np.array(a, order='F') - Fortran contiguous: the array memory is >> contiguous, the strides vector is strictly increasing >> * np.ravel(a, order='F') - first-to-last index ordering used to >> recover values from the array >> >> They are related in the sense that Fortran-contiguous layout in memory >> means that returning the elements as stored in memory gives the same >> answer as first-to-last index ordering. They are different in the >> sense that first-to-last index ordering applies to any memory layout - >> it is orthogonal to memory layout. In particular 'contiguous' has no >> meaning for first-to-last or last-to-first index ordering.
>> >> So - to restate in other words - this : >> >> np.reshape(a, (3, 4), order='F') >> >> could reasonably mean one of two orthogonal things >> >> 1) Retrieve data from the array using first-to-last indexing, return >> any memory layout you like >> 2) Retrieve data from the array using the default last-to-first index >> ordering, and return memory in F-contiguous layout > > no to interpretation 2) > reshape and ravel (in contrast to flatten) just return a view (if possible) > (with possible some strange strides) 'No' meaning what? That it is not possible that it could mean that? Obviously we're not arguing about whether it does mean that, we're arguing about whether such an interpretation would make sense. Cheers, Matthew From josef.pktd at gmail.com Thu Apr 4 16:33:35 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 4 Apr 2013 16:33:35 -0400 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1364995488.4038.14.camel@sebastian-laptop> Message-ID: On Thu, Apr 4, 2013 at 4:02 PM, Matthew Brett wrote: > Hi, > > On Thu, Apr 4, 2013 at 12:54 PM, wrote: >> On Thu, Apr 4, 2013 at 3:40 PM, Matthew Brett wrote: >>> Hi, >>> >>> On Thu, Apr 4, 2013 at 11:45 AM, Matthew Brett wrote: >>>> Hi, >>>> >>>> On Thu, Apr 4, 2013 at 9:21 AM, Chris Barker - NOAA Federal >>>> wrote: >>>>> On Wed, Apr 3, 2013 at 6:13 PM, Matthew Brett wrote: >>>>>>> We all agree that 'order' is used with two different and orthogonal >>>>>>> meanings in numpy. >>>> >>>> Brief thank you for your helpful and thoughtful discussion. >>>> >>>>> well, not entirely orthogonal -- they are the some concept, used in >>>>> different contexts, >>>> >>>> Here's a further clarification, in the hope that it is helpful: >>>> >>>> Input and output index orderings are orthogonal - I can read the data >>>> with C index ordering and return an array that is index ordered >>>> any-old-how. >>>> >>>> F and C are used in the sense of F contiguous and C contiguous - where >>>> contiguous is not the same concept as index ordering. >>>> >>>> So I think it's hard to say these concepts are not orthogonal, simply >>>> in the technical sense that order='F" could mean: >>>> >>>> * read my data using F-style index ordering >>>> * return my data in an array using F-style index ordering >>>> * (related to above) return my data in F-contiguous memory layout >>> >>> Sorry this is not well-put and should increase confusion rather than >>> decrease it. I'll try again if I may. >>> >>> What do we mean by 'Fortran' 'order'. >>> >>> Two things : >>> >>> * np.array(a, order='F') - Fortran contiguous : the array memory is >>> contiguous, the strides vector is strictly increasing >>> * np.ravel(a, order='F') - first-to-last index ordering used to >>> recover values from the array >>> >>> They are related in the sense that Fortran contiguous layout in memory >>> means that returning the elements as stored in memory gives the same >>> answer as first to last index ordering. They are different in the >>> sense that first-to-last index ordering applies to any memory layout - >>> is orthogonal to memory layout. In particular 'contiguous' has no >>> meaning for first-to-last or last-to-first index ordering. 
>>> >>> So - to restate in other words - this : >>> >>> np.reshape(a, (3, 4), order='F') >>> >>> could reasonably mean one of two orthogonal things >>> >>> 1) Retrieve data from the array using first-to-last indexing, return >>> any memory layout you like >>> 2) Retrieve data from the array using the default last-to-first index >>> ordering, and return memory in F-contiguous layout >> >> no to interpretation 2) >> reshape and ravel (in contrast to flatten) just return a view (if possible) >> (with possible some strange strides) > > 'No' meaning what? That it is not possible that it could mean that? > Obviously we're not arguing about whether it does mean that, we're > arguing about whether such an interpretation would make sense. 'No' means: I don't think it makes sense given the current behavior of numpy with respect to functions that are designed to return views (and copy memory only if there is no way to make a view) One objective of functions that create views is *not* to change the underlying memory. So in most cases, requesting a specific contiguity (memory order) for a new array, when you actually want a view with strides, doesn't sound like an obvious explanation for "order". --- slightly more difficult: order = "I don't care" (aka. order="K") means: "I want a view in whichever order of the values, but please try harder not to copy any memory" This also doesn't refer to the memory of a *new* array, if it is really necessary to copy. Josef > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From matthew.brett at gmail.com Thu Apr 4 16:38:06 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 4 Apr 2013 13:38:06 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1364995488.4038.14.camel@sebastian-laptop> Message-ID: Hi, On Thu, Apr 4, 2013 at 1:33 PM, wrote: > On Thu, Apr 4, 2013 at 4:02 PM, Matthew Brett wrote: >> Hi, >> >> On Thu, Apr 4, 2013 at 12:54 PM, wrote: >>> On Thu, Apr 4, 2013 at 3:40 PM, Matthew Brett wrote: >>>> Hi, >>>> >>>> On Thu, Apr 4, 2013 at 11:45 AM, Matthew Brett wrote: >>>>> Hi, >>>>> >>>>> On Thu, Apr 4, 2013 at 9:21 AM, Chris Barker - NOAA Federal >>>>> wrote: >>>>>> On Wed, Apr 3, 2013 at 6:13 PM, Matthew Brett wrote: >>>>>>>> We all agree that 'order' is used with two different and orthogonal >>>>>>>> meanings in numpy. >>>>> >>>>> Brief thank you for your helpful and thoughtful discussion. >>>>> >>>>>> well, not entirely orthogonal -- they are the some concept, used in >>>>>> different contexts, >>>>> >>>>> Here's a further clarification, in the hope that it is helpful: >>>>> >>>>> Input and output index orderings are orthogonal - I can read the data >>>>> with C index ordering and return an array that is index ordered >>>>> any-old-how. >>>>> >>>>> F and C are used in the sense of F contiguous and C contiguous - where >>>>> contiguous is not the same concept as index ordering. >>>>> >>>>> So I think it's hard to say these concepts are not orthogonal, simply >>>>> in the technical sense that order='F" could mean: >>>>> >>>>> * read my data using F-style index ordering >>>>> * return my data in an array using F-style index ordering >>>>> * (related to above) return my data in F-contiguous memory layout >>>> >>>> Sorry this is not well-put and should increase confusion rather than >>>> decrease it. 
I'll try again if I may. >>>> >>>> What do we mean by 'Fortran' 'order'. >>>> >>>> Two things : >>>> >>>> * np.array(a, order='F') - Fortran contiguous : the array memory is >>>> contiguous, the strides vector is strictly increasing >>>> * np.ravel(a, order='F') - first-to-last index ordering used to >>>> recover values from the array >>>> >>>> They are related in the sense that Fortran contiguous layout in memory >>>> means that returning the elements as stored in memory gives the same >>>> answer as first to last index ordering. They are different in the >>>> sense that first-to-last index ordering applies to any memory layout - >>>> is orthogonal to memory layout. In particular 'contiguous' has no >>>> meaning for first-to-last or last-to-first index ordering. >>>> >>>> So - to restate in other words - this : >>>> >>>> np.reshape(a, (3, 4), order='F') >>>> >>>> could reasonably mean one of two orthogonal things >>>> >>>> 1) Retrieve data from the array using first-to-last indexing, return >>>> any memory layout you like >>>> 2) Retrieve data from the array using the default last-to-first index >>>> ordering, and return memory in F-contiguous layout >>> >>> no to interpretation 2) >>> reshape and ravel (in contrast to flatten) just return a view (if possible) >>> (with possible some strange strides) >> >> 'No' meaning what? That it is not possible that it could mean that? >> Obviously we're not arguing about whether it does mean that, we're >> arguing about whether such an interpretation would make sense. > > 'No' means: I don't think it makes sense given the current behavior of numpy > with respect to functions that are designed to return views > (and copy memory only if there is no way to make a view) OK - so no-one is suggesting that it is a good option, only that the concept makes sense. As I was saying before - for most of us it is still possible to get confused between two different meanings of the same word even if one of the meanings would (for complicated reasons) be less likely than the other. Cheers, Matthew From sebastian at sipsolutions.net Thu Apr 4 16:53:41 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 04 Apr 2013 22:53:41 +0200 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1364995488.4038.14.camel@sebastian-laptop> Message-ID: <1365108821.24805.44.camel@sebastian-laptop> On Thu, 2013-04-04 at 12:40 -0700, Matthew Brett wrote: > Hi, > > > So - to restate in other words - this : > > np.reshape(a, (3, 4), order='F') > > could reasonably mean one of two orthogonal things > > 1) Retrieve data from the array using first-to-last indexing, return > any memory layout you like > 2) Retrieve data from the array using the default last-to-first index > ordering, and return memory in F-contiguous layout > Yes, it could mean both. I am simply not sure if it helps enough to warrant the trouble. So if it still interests someone, I feel the docs are more important, but I am neutral to changing this. I don't quite see a big gain, so I am just worried that it bugs a lot of people either because of changing or because of having to remember the different name (you can argue that is good, but if it bugs most maybe it does not help either). As to being confused. Did anyone ever see a np.reshape(arr, ..., order='F') and then continuing assuming the result is F-contiguous (when the original arr is not known to be contiguous)? 
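For concreteness, that scenario looks something like this (a sketch, not a report of a real bug; the array is arbitrary):

>>> import numpy as np
>>> a = np.arange(6).reshape(2, 3)   # C-contiguous
>>> np.reshape(a, (2, 3), order='F').flags['F_CONTIGUOUS']
False
>>> np.reshape(a, (3, 2), order='F').flags['F_CONTIGUOUS']
True

The first call can return a view, which keeps the original C layout; the second has to copy, and the copy happens to come back F-contiguous - so whether the result is F-contiguous depends on whether a copy was forced.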
If that actually created a real bug somewhere, that might actually convince me that it is worth it to walk through trouble and complaints. I guess I just don't believe it really happens in the real world. - Sebastian > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From matthew.brett at gmail.com Thu Apr 4 17:04:20 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 4 Apr 2013 14:04:20 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: <1365108821.24805.44.camel@sebastian-laptop> References: <1364995488.4038.14.camel@sebastian-laptop> <1365108821.24805.44.camel@sebastian-laptop> Message-ID: Hi, On Thu, Apr 4, 2013 at 1:53 PM, Sebastian Berg wrote: > On Thu, 2013-04-04 at 12:40 -0700, Matthew Brett wrote: >> Hi, >> > >> >> So - to restate in other words - this : >> >> np.reshape(a, (3, 4), order='F') >> >> could reasonably mean one of two orthogonal things >> >> 1) Retrieve data from the array using first-to-last indexing, return >> any memory layout you like >> 2) Retrieve data from the array using the default last-to-first index >> ordering, and return memory in F-contiguous layout >> > > Yes, it could mean both. I am simply not sure if it helps enough to > warrant the trouble. So if it still interests someone, I feel the docs > are more important, but I am neutral to changing this. I don't think the docs enter the discussion, because we all agree that changing the docs is a good idea. > I don't quite see a big gain, so I am just worried that it bugs a lot of > people either because of changing or because of having to remember the > different name (you can argue that is good, but if it bugs most maybe it > does not help either). > > As to being confused. Did anyone ever see a np.reshape(arr, ..., > order='F') and then continuing assuming the result is F-contiguous (when > the original arr is not known to be contiguous)? If that actually created > a real bug somewhere, that might actually convince me that it is worth > it to walk through trouble and complaints. I guess I just don't believe > it really happens in the real world. There are two aspects here: 1) Making numpy easier to understand and teach. 2) Avoiding bugs. I'm thinking primarily of the first. I would hate to teach the thing in the current state. As I've said many times before, I found it very confusing, others have said so too. The more confusing it is, the more likely people will make mistakes. Cheers, Matthew From matthew.brett at gmail.com Thu Apr 4 17:20:41 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 4 Apr 2013 14:20:41 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: Message-ID: Hi, On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith wrote: > Maybe we should go through and rename "order" to something more descriptive > in each case, so we'd have > a.reshape(..., index_order="C") > a.copy(memory_order="F") > etc.? I'd like to propose this instead: a.reshape(..., order="C") a.copy(layout="F") This fits well with the terms we've been using during the discussion. It reduces the changes to only one of the two meanings. Thinking about it, I feel that this would have been considerably clearer to me as I learned numpy.
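Spelled out, the two calls that currently share the keyword would look like this (note that 'layout' is the proposal here, not an existing numpy argument):

>>> import numpy as np
>>> a = np.arange(6).reshape(2, 3)
>>> a.reshape(6, order='F')                    # index order used to read values
array([0, 3, 1, 4, 2, 5])
>>> a.copy(order='F').flags['F_CONTIGUOUS']    # memory layout of the copy
True

Under the proposal the second call would be spelled a.copy(layout='F'), leaving 'order' to reshape and ravel only.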
Cheers, Matthew From chris.barker at noaa.gov Thu Apr 4 17:54:49 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 4 Apr 2013 14:54:49 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1364995488.4038.14.camel@sebastian-laptop> Message-ID: On Thu, Apr 4, 2013 at 11:26 AM, wrote: >> Before you answer that -- does anyone see a use case for the 'A' and >> 'K' flags that can't be reasonably easily accomplished with .view() or >> asarray() or ??? > > What order does a[a>2] use to create the returned 1-D array? ... > However, I never needed to know and never cared > a[a>2] = 5 > a[a>2] = b[a>2] > > Now, after this thread, I know about "K", does that use case use ravel() or reshape() under the hood? > and there might be cases > where it would be appropriate to minimize copying memory, hmm -- yes, that makes sense, and perhaps compelling enough to keep them around (at least with perhaps better docs). -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From josef.pktd at gmail.com Thu Apr 4 19:13:31 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 4 Apr 2013 19:13:31 -0400 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1364995488.4038.14.camel@sebastian-laptop> Message-ID: On Thu, Apr 4, 2013 at 5:54 PM, Chris Barker - NOAA Federal wrote: > On Thu, Apr 4, 2013 at 11:26 AM, wrote: >>> Before you answer that -- does anyone see a use case for the 'A' and >>> 'K' flags that can't be reasonably easily accomplished with .view() or >>> asarray() or ??? >> >> What order does a[a>2] use to create the returned 1-D array? > ... >> However, I never needed to know and never cared >> a[a>2] = 5 >> a[a>2] = b[a>2] >> >> Now, after this thread, I know about "K", > > does that use case use ravel() or reshape() under the hood? only ravel has "K" as far as I saw in the current documentation. example for ravel("K") would be if axis=None in functions and we only have elementwise or reduce operations. All the code I've seen uses just ravel() in this case, instead, ravel("K") would have a better chance to avoid array copying, if axis is None: x = x.ravel("K") return ((x - x.mean(0))**2).sum(0) but it's dangerous because, if there is a second array, it might not ravel("K") the same way x.ravel("K") - y.ravel("K") sounds fun similar if x[mask] wouldn't select a fixed "order", then a[a>2] = b[a>2] would also be fun fun := find the bug that I have hidden in this code The only reason to use reshape with "A", I can think of, is, if the array (matrix) is symmetric, or if it's a square picture and we never care whether it's upright or sideways. reshape(.., order="A") and ravel("A") should roundtrip, I guess. Josef > >> and there might be cases >> where it would be appropriate to minimize copying memory, > > hmm -- yes, that makes sense, and perhaps compelling enough to keep > them around (at least with perhaps better docs). > > -Chris > > > -- > > Christopher Barker, Ph.D. 
> Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From josef.pktd at gmail.com Thu Apr 4 23:38:52 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 4 Apr 2013 23:38:52 -0400 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1364995488.4038.14.camel@sebastian-laptop> Message-ID: Catching up with numpy 1.6 > 'No' means: I don't think it makes sense given the current behavior of numpy > with respect to functions that are designed to return views > (and copy memory only if there is no way to make a view) > > One objective of functions that create views is *not* to change the underlying > memory. So in most cases, requesting a specific contiguity (memory order) > for a new array, when you actually want a view with strides, doesn't > sound like an obvious explanation for "order". > why I'm baffled: To me views are just a specific way of looking at an existing array, or parts of it, similar to an iterator but with an n-dimensional shape. ravel is just like calling list(iterator), the iterator determines how we read the existing array. So, asking about the output memory order made no sense to me. What's the output of an iterator? I (and statsmodels) are still on numpy 1.5 but not for much longer. So I'm trying to read up http://docs.scipy.org/doc/numpy/reference/arrays.nditer.html#single-array-iteration explains the case for "K": for elementwise operations, just run the fastest way through the array. The old flat and flatiter were always C-order. >>> a = np.arange(4*5).reshape(4,5) >>> b = np.array(a, order='F') >>> np.fromiter(np.nditer(b, order='K'), int) array([ 0, 5, 10, 15, 1, 6, 11, 16, 2, 7, 12, 17, 3, 8, 13, 18, 4, 9, 14, 19]) >>> np.fromiter(np.nditer(a, order='K'), int) array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]) Is ravel('K') good for anything ?
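As a baseline for the experiments below - 'K' reads the elements in whatever order they happen to sit in memory (a sketch, assuming numpy >= 1.6; b2 is a throwaway example array):

>>> import numpy as np
>>> b2 = np.array(np.arange(6).reshape(2, 3), order='F')
>>> b2.ravel('K')    # walks the buffer as stored - a view, no copy needed
array([0, 3, 1, 4, 2, 5])
>>> b2.ravel('C')    # index order - must copy, since b2 is not C-contiguous
array([0, 1, 2, 3, 4, 5])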
>>> def f(x): '''A function that only works in 1d''' if x.ndim > 1: raise ValueError return np.round(np.piecewise(x, [x < 0, x >= 0], [lambda x: np.sqrt(-x), lambda x: np.sqrt(x)])) >>> b = np.array(np.arange(4*5.).reshape(4,5), order='F') >>> b array([[ 0., 1., 2., 3., 4.], [ 5., 6., 7., 8., 9.], [ 10., 11., 12., 13., 14.], [ 15., 16., 17., 18., 19.]]) >>> f(b[:,:2]) Traceback (most recent call last): File "", line 1, in f(b[:,:2]) File "", line 2, in f if x.ndim > 1: raise ValueError ValueError ravel and reshape with 'K' doesn't roundtrip >>> (b.ravel('K')).reshape(b.shape, order='K') array([[ 0., 5., 10., 15., 1.], [ 6., 11., 16., 2., 7.], [ 12., 17., 3., 8., 13.], [ 18., 4., 9., 14., 19.]]) but we can do inplace transformations with it >>> e = b[:,:2].ravel() >>> e.flags.owndata True >>> e = b[:,:2].ravel('K') >>> e.flags.owndata False >>> e[:] = f(e) >>> b array([[ 0., 1., 2., 3., 4.], [ 2., 2., 7., 8., 9.], [ 3., 3., 12., 13., 14.], [ 4., 4., 17., 18., 19.]]) >>> e[:] = f(e) >>> b array([[ 0., 1., 2., 3., 4.], [ 1., 1., 7., 8., 9.], [ 2., 2., 12., 13., 14.], [ 2., 2., 17., 18., 19.]]) (A few hours of experimenting is more that I wanted to know, 99.5% of my cases are order='C' or order='F') nditer has also an interesting section on Iterator-Allocated Output Arrays Josef I found the scissors From daniele at grinta.net Fri Apr 5 04:30:31 2013 From: daniele at grinta.net (Daniele Nicolodi) Date: Fri, 05 Apr 2013 10:30:31 +0200 Subject: [Numpy-discussion] timezones and datetime64 In-Reply-To: <515D8151.5040400@lanl.gov> References: <515BB663.7030804@hilboll.de> <515D41F7.3050908@grinta.net> <515D8151.5040400@lanl.gov> Message-ID: <515E8BA7.80308@grinta.net> On 04/04/2013 15:34, Jonathan T. Niehof wrote: > Keeping a leap second database up to date is annoying but not as bad as > it could be: they're not arbitrary. Although they can occur monthly, > they've only ever happened at June 30 and Dec 31, announced in January > and July, respectively. So it would be easy to check the date of a > leapsecond database and warn if 1) date we're processing is after June > 30 of a year AND 2) LSDB older than January same year (similar checks > for the Dec. 31 opportunity.) As far as I know there is no official standardization of such rules, furthermore, how do you plan to process datetimes far in the future? Cheers, Daniele From dave.hirschfeld at gmail.com Fri Apr 5 05:06:17 2013 From: dave.hirschfeld at gmail.com (Dave Hirschfeld) Date: Fri, 5 Apr 2013 09:06:17 +0000 (UTC) Subject: [Numpy-discussion] timezones and datetime64 References: <515BB663.7030804@hilboll.de> Message-ID: > > Sorry, having trouble keeping up with this thread! Comments, specific to my (limited) use-cases are inline: Chris Barker - NOAA Federal noaa.gov> writes: > > > I thought about that -- but if you have timedelta without datetime, > you really just have an integer -- we haven't bought anything. > > It seems we have a number of somewhat orthogonal issues with DateTime > in front of us: > > 1) How to handle (or not) time zones IMHO doing any conversion of the input data unless explicitly requested is wrong. That means the current behaviour of converting to the local timezone when no timezone is specified is bad. To prevent any conversion taking place I'm happy with the no timezone implies UTC fix. > 2) How (whether) to handle leap-seconds, etc. I don't care about leap-seconds - I want the difference between any two days to be 86400s, always. 
I don't mind if the leap-second functionality is provided so long as it doesn't incur a large performance penalty in the case that you don't care. > 3) Whether to support TAI time (or is that the same as the above?) I now know about TAI time... > 4) Should we add a flexible epoch? No strong opinion though it does sound sensible. > I suggest we create separate threads for these, discuss a bit more, > then have at the NEP. > > I'll start one for (1). > > I don't have the expertise nor use-case for (2) and (3), so I'll punt, > but someone can pick it up. > > I'll start one for (4) also, though I'm not sure I have much to say, > other than that I think it's good idea. My naive view is that it would > be pretty easy, actually, but I could be very wrong there. > > -Chris > -Dave From sebastian at sipsolutions.net Fri Apr 5 05:20:45 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 05 Apr 2013 11:20:45 +0200 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: Message-ID: <1365153645.683.38.camel@sebastian-laptop> Hey On Thu, 2013-04-04 at 14:20 -0700, Matthew Brett wrote: > Hi, > > On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith wrote: > > > Maybe we should go through and rename "order" to something more descriptive > > in each case, so we'd have > > a.reshape(..., index_order="C") > > a.copy(memory_order="F") > > etc.? > > I'd like to propose this instead: > > a.reshape(..., order="C") > a.copy(layout="F") > I actually like this, makes the point clearer that it has to do with memory layout and implies contiguity, plus it is short and from the numpy perspective copy, etc. are the ones that add additional info to "order" and not reshape (because IMO memory order is something new users should not worry about at first). A and K orders will still have their quirks with np.array and copy=True/False, but for many functions they are esoteric anyway. It will be one hell of a deprecation though, but I am +0.5 for adding an alias for now (maybe someone knows an even better name?), but I think that in this case, it probably really is better to wait with actual deprecation warnings for a few versions, since it touches a *lot* of code. Plus I think at the point of starting deprecation warnings (and best earlier) numpy should provide an automatic fixer script... The only counter point that remains for me is the difficulty of deprecation, since I think the new name idea is very clean. And this is unfortunately even more invasive then the index_order proposal. Fun point at the end: ndarray.tostring takes an order argument, which is correct as "order" but has a lot in common with "layout" :). (that is not an issue IMO, but for me it is a reason to prefer the layout proposal over the index_order one). Regards, Sebastian > This fits well with the terms we've been using during the discussion. > It reduces the changes to only one of the two meanings. > > Thinking about it, I feel that this would have been considerably > clearer to me as I learned numpy. > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From gadiyar at gmail.com Fri Apr 5 07:43:32 2013 From: gadiyar at gmail.com (Anand Gadiyar) Date: Fri, 5 Apr 2013 17:13:32 +0530 Subject: [Numpy-discussion] Import error while freezing with cxfreeze Message-ID: Hi all, I have a small program that uses numpy and scipy. 
I ran into a couple of errors while trying to use cxfreeze to create a windows executable. I'm running Windows 7 x64, Python 2.7.3 64-bit, Numpy 1.7.1rc1 64-bit, Scipy-0.11.0 64-bit, all binary installs from < http://www.lfd.uci.edu/~gohlke/pythonlibs/> I was able to replicate this with scipy-0.12.0c1 as well. 1) "from scipy import constants" triggers the below: Traceback (most recent call last): File "D:\Python27\lib\site-packages\cx_Freeze\initscripts\Console.py", line 27, in exec_code in m.__dict__ File "mSimpleGui.py", line 10, in File "mSystem.py", line 7, in File "D:\Python27\lib\site-packages\scipy\__init__.py", line 64, in from numpy import show_config as show_numpy_config File "D:\Python27\lib\site-packages\numpy\__init__.py", line 165, in from core import * AttributeError: 'module' object has no attribute 'sys' 2) "from scipy import interpolate" triggers the below: Traceback (most recent call last): File "D:\Python27\lib\site-packages\cx_Freeze\initscripts\Console.py", line 27, in exec_code in m.__dict__ File "mSimpleGui.py", line 10, in File "mSystem.py", line 9, in File "mSensor.py", line 10, in File "D:\Python27\lib\site-packages\scipy\interpolate\__init__.py", line 154, in from rbf import Rbf File "D:\Python27\lib\site-packages\scipy\interpolate\rbf.py", line 50, in from scipy import linalg ImportError: cannot import name linalg I've attached a couple of small patches that fix these errors for me, but I'm not sure if these are the best way to fix. Could you please take a look? I'd be happy to test alternative fixes. Thanks in advance, Anand -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-import-sys-as-something-else-to-avoid-namespace-conf.patch Type: application/octet-stream Size: 1598 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-interpolate-import-only-the-required-function-to-avo.patch Type: application/octet-stream Size: 1700 bytes Desc: not available URL: From sebastian at sipsolutions.net Fri Apr 5 08:02:12 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 05 Apr 2013 14:02:12 +0200 Subject: [Numpy-discussion] einsum and broadcasting In-Reply-To: <515D8691.9080101@aalto.fi> References: <515D8691.9080101@aalto.fi> Message-ID: <1365163332.2506.7.camel@sebastian-laptop> On Thu, 2013-04-04 at 16:56 +0300, Jaakko Luttinen wrote: > I don't quite understand how einsum handles broadcasting. I get the > following error, but I don't understand why: > > In [8]: import numpy as np > In [9]: A = np.arange(12).reshape((4,3)) > In [10]: B = np.arange(6).reshape((3,2)) > In [11]: np.einsum('ik,k...->i...', A, B) > --------------------------------------------------------------------------- > ValueError: operand 0 did not have enough dimensions to match the > broadcasting, and couldn't be extended because einstein sum subscripts > were specified at both the start and end > > However, if I use explicit indexing, it works: > > In [12]: np.einsum('ik,kj->ij', A, B) > Out[12]: > array([[10, 13], > [28, 40], > [46, 67], > [64, 94]]) > > It seems that it also works if I add '...' to the first operand: > > In [12]: np.einsum('ik...,k...->i...', A, B) > Out[12]: > array([[10, 13], > [28, 40], > [46, 67], > [64, 94]]) > > However, as far as I understand, the syntax > np.einsum('ik,k...->i...', A, B) > should work. Have I misunderstood something or is there a bug? 
> My guess is, it is by design because the purpose of the ellipsis is more to allow extra dimensions that are not important to the problem itself. A vector product is np.einsum('i,i->i') and if I write np.einsum('...i,...i->...i') I allow generalizing that arrays of 1-d arrays (like the new gufunc linalg stuff). I did not check the code though, so maybe thats not the case. But in any case I don't see a reason why it should not be possible to only allow extra dims for some inputs (I guess it can also make sense to not give ... for the output). So I would say, if you want to generalize it, go ahead ;). - Sebastian > Thanks for your help! > Jaakko > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From brad.froehle at gmail.com Fri Apr 5 11:03:36 2013 From: brad.froehle at gmail.com (Bradley M. Froehle) Date: Fri, 5 Apr 2013 08:03:36 -0700 Subject: [Numpy-discussion] Import error while freezing with cxfreeze In-Reply-To: References: Message-ID: Hi Anand, On Friday, April 5, 2013, Anand Gadiyar wrote: > Hi all, > > I have a small program that uses numpy and scipy. I ran into a couple of > errors while trying to use cxfreeze to create a windows executable. > > I'm running Windows 7 x64, Python 2.7.3 64-bit, Numpy 1.7.1rc1 64-bit, > Scipy-0.11.0 64-bit, all binary installs from < > http://www.lfd.uci.edu/~gohlke/pythonlibs/> > > I was able to replicate this with scipy-0.12.0c1 as well. > > 1) "from scipy import constants" triggers the below: > Traceback (most recent call last): > File "D:\Python27\lib\site-packages\cx_Freeze\initscripts\Console.py", > line 27, in > exec_code in m.__dict__ > File "mSimpleGui.py", line 10, in > File "mSystem.py", line 7, in > File "D:\Python27\lib\site-packages\scipy\__init__.py", line 64, in > > from numpy import show_config as show_numpy_config > File "D:\Python27\lib\site-packages\numpy\__init__.py", line 165, in > > from core import * > AttributeError: 'module' object has no attribute 'sys' > It's a bug in cx_freeze that has been fixed in the development branch. See https://bitbucket.org/anthony_tuininga/cx_freeze/pull-request/17/avoid-polluting-extension-module-namespace/diff > 2) "from scipy import interpolate" triggers the below: > Traceback (most recent call last): > File "D:\Python27\lib\site-packages\cx_Freeze\initscripts\Console.py", > line 27, in > exec_code in m.__dict__ > File "mSimpleGui.py", line 10, in > File "mSystem.py", line 9, in > File "mSensor.py", line 10, in > File "D:\Python27\lib\site-packages\scipy\interpolate\__init__.py", line > 154, in > from rbf import Rbf > File "D:\Python27\lib\site-packages\scipy\interpolate\rbf.py", line 50, in > > from scipy import linalg > ImportError: cannot import name linalg > You might want to try the dev branch of cxfreeze to see if this has been fixed as well. Brad -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthew.brett at gmail.com Fri Apr 5 11:13:15 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 5 Apr 2013 08:13:15 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: <1365153645.683.38.camel@sebastian-laptop> References: <1365153645.683.38.camel@sebastian-laptop> Message-ID: Hi, On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg wrote: > Hey > > On Thu, 2013-04-04 at 14:20 -0700, Matthew Brett wrote: >> Hi, >> >> On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith wrote: >> >> > Maybe we should go through and rename "order" to something more descriptive >> > in each case, so we'd have >> > a.reshape(..., index_order="C") >> > a.copy(memory_order="F") >> > etc.? >> >> I'd like to propose this instead: >> >> a.reshape(..., order="C") >> a.copy(layout="F") >> > > I actually like this, makes the point clearer that it has to do with > memory layout and implies contiguity, plus it is short and from the > numpy perspective copy, etc. are the ones that add additional info to > "order" and not reshape (because IMO memory order is something new users > should not worry about at first). A and K orders will still have their > quirks with np.array and copy=True/False, but for many functions they > are esoteric anyway. > > It will be one hell of a deprecation though, but I am +0.5 for adding an > alias for now (maybe someone knows an even better name?), but I think > that in this case, it probably really is better to wait with actual > deprecation warnings for a few versions, since it touches a *lot* of > code. Plus I think at the point of starting deprecation warnings (and > best earlier) numpy should provide an automatic fixer script... > > The only counter point that remains for me is the difficulty of > deprecation, since I think the new name idea is very clean. And this is > unfortunately even more invasive then the index_order proposal. I completely agree that we'd have to be gentle with the change. The problem we'd want to avoid is people innocently using 'layout' and finding to their annoyance that the code doesn't work with other people's numpy. How about: Step 1: 'order' remains as named keyword, layout added as alias, comment on the lines of "layout will become the default keyword for this option in later versions of numpy; please consider updating any code that does not need to remain backwards compatible'. Step 2: default keyword becomes 'layout' with 'order' as alias, comment like "order is an alias for 'layout' to maintain backwards compatibility with numpy <= 1.7.1', please update any code that does not need to maintain backwards compatibility with these numpy versions' Step 3: Add deprecation warning for 'order', "order will be removed as an alias in future versions of numpy" Step 4: (distant future) Remove alias ? Cheers, Matthew From aronne.merrelli at gmail.com Fri Apr 5 12:11:03 2013 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Fri, 5 Apr 2013 11:11:03 -0500 Subject: [Numpy-discussion] Indexing bug In-Reply-To: References: Message-ID: On Sun, Mar 31, 2013 at 12:14 AM, Ivan Oseledets wrote: Oh! So it is not a bug, it is a feature, which is completely > incompatible with other array based languages (MATLAB and Fortran). To > me, I can not find a single explanation why it is so in numpy. > Taking submatrices from a matrix is a common operation and the syntax > above is very natural to take submatrices, not a weird diagonal stuff. 
> i.e., > > c = np.random.randn(100,100) > d = c[[0,3],[2,3]] > > should NOT produce two numbers! (and you can not do it using slices!) > > In MATLAB and Fortran > c(indi,indj) > will produce a 2 x 2 matrix. > How it can be done in numpy (and why the complications?) > > So, please consider this message as a feature request. > > There is already a function, ix, in the index_tricks that does this (I think it is essentially implementing the broadcasting trick that Nathaniel mentions. For me the index trick is easier, as I often forget the broadcasting details). Example: In [14]: c = np.random.randn(100,100) In [15]: c[[0,3],[2,3]] Out[15]: array([ 0.99141998, -0.88928917]) In [16]: c[np.ix_([0,3],[2,3])] Out[16]: array([[ 0.99141998, -1.98406295], [ 0.0729076 , -0.88928917]]) So for me, I think this is superior to MATLAB, as I have often had the case of wanting the result from the second option. In MATLAB you would need to extract the 2x2 matrix, and then take its diagonal. This can be wasteful when the index arrays become large. Cheers, Aronne -------------- next part -------------- An HTML attachment was scrubbed... URL: From ewalter+numpy at cs.cmu.edu Fri Apr 5 13:08:22 2013 From: ewalter+numpy at cs.cmu.edu (Edward Walter) Date: Fri, 05 Apr 2013 13:08:22 -0400 Subject: [Numpy-discussion] dynamically choosing atlas libraries Message-ID: <515F0506.5030506@cs.cmu.edu> Hello List, We're standing up a new computational cluster and going through the steps of building Numpy etc. We would like to be able to install multiple versions of ATLAS (with different build settings / tunings / etc) and have Numpy load the shared Atlas libraries dynamically based on the LD_LIBRARY_PATH. It's not clear to me how to do this given that the library path is hard coded in Numpy's site.cfg. It seems like other people have gotten this working though (i.e. RHEL/Centos with their collection of Atlas versions {atlas, atlas-sse3, etc}). Does anyone have any pointers on this? Thanks much. -Ed Walter Carnegie Mellon University From ralf.gommers at gmail.com Fri Apr 5 15:09:47 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 5 Apr 2013 21:09:47 +0200 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1365153645.683.38.camel@sebastian-laptop> Message-ID: On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett wrote: > Hi, > > On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg > wrote: > > Hey > > > > On Thu, 2013-04-04 at 14:20 -0700, Matthew Brett wrote: > >> Hi, > >> > >> On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith wrote: > >> > >> > Maybe we should go through and rename "order" to something more > descriptive > >> > in each case, so we'd have > >> > a.reshape(..., index_order="C") > >> > a.copy(memory_order="F") > >> > etc.? > >> > >> I'd like to propose this instead: > >> > >> a.reshape(..., order="C") > >> a.copy(layout="F") > >> > > > > I actually like this, makes the point clearer that it has to do with > > memory layout and implies contiguity, plus it is short and from the > > numpy perspective copy, etc. are the ones that add additional info to > > "order" and not reshape (because IMO memory order is something new users > > should not worry about at first). A and K orders will still have their > > quirks with np.array and copy=True/False, but for many functions they > > are esoteric anyway. 
> > > > It will be one hell of a deprecation though, but I am +0.5 for adding an > > alias for now (maybe someone knows an even better name?), but I think > > that in this case, it probably really is better to wait with actual > > deprecation warnings for a few versions, since it touches a *lot* of > > code. Plus I think at the point of starting deprecation warnings (and > > best earlier) numpy should provide an automatic fixer script... > > > > The only counter point that remains for me is the difficulty of > > deprecation, since I think the new name idea is very clean. And this is > > unfortunately even more invasive then the index_order proposal. > > I completely agree that we'd have to be gentle with the change. The > problem we'd want to avoid is people innocently using 'layout' and > finding to their annoyance that the code doesn't work with other > people's numpy. > > How about: > > Step 1: 'order' remains as named keyword, layout added as alias, > comment on the lines of "layout will become the default keyword for > this option in later versions of numpy; please consider updating any > code that does not need to remain backwards compatible'. > > Step 2: default keyword becomes 'layout' with 'order' as alias, > comment like "order is an alias for 'layout' to maintain backwards > compatibility with numpy <= 1.7.1', please update any code that does > not need to maintain backwards compatibility with these numpy > versions' > > Step 3: Add deprecation warning for 'order', "order will be removed as > an alias in future versions of numpy" > > Step 4: (distant future) Remove alias > > ? > A very strong -1 from me. Now we're talking about deprecation warnings and a backwards compatibility break after all. I thought we agreed that this was a very bad idea, so why are you proposing it now? Here's how I see it: deprecation of "order" is a no go. Therefore we have two choices here: 1. Simply document the current "order" keyword better and leave it at that. 2. Add a "layout" (or "index_order") keyword, and live with both "order" and "layout" keywords forever. (2) is at least as confusing as (1), more work and poor design. Therefore I propose to go with (1). Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Fri Apr 5 15:21:54 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 5 Apr 2013 15:21:54 -0400 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1365153645.683.38.camel@sebastian-laptop> Message-ID: Hi, On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers wrote: > > > > On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett > wrote: >> >> Hi, >> >> On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg >> wrote: >> > Hey >> > >> > On Thu, 2013-04-04 at 14:20 -0700, Matthew Brett wrote: >> >> Hi, >> >> >> >> On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith wrote: >> >> >> >> > Maybe we should go through and rename "order" to something more >> >> > descriptive >> >> > in each case, so we'd have >> >> > a.reshape(..., index_order="C") >> >> > a.copy(memory_order="F") >> >> > etc.? >> >> >> >> I'd like to propose this instead: >> >> >> >> a.reshape(..., order="C") >> >> a.copy(layout="F") >> >> >> > >> > I actually like this, makes the point clearer that it has to do with >> > memory layout and implies contiguity, plus it is short and from the >> > numpy perspective copy, etc. 
are the ones that add additional info to >> > "order" and not reshape (because IMO memory order is something new users >> > should not worry about at first). A and K orders will still have their >> > quirks with np.array and copy=True/False, but for many functions they >> > are esoteric anyway. >> > >> > It will be one hell of a deprecation though, but I am +0.5 for adding an >> > alias for now (maybe someone knows an even better name?), but I think >> > that in this case, it probably really is better to wait with actual >> > deprecation warnings for a few versions, since it touches a *lot* of >> > code. Plus I think at the point of starting deprecation warnings (and >> > best earlier) numpy should provide an automatic fixer script... >> > >> > The only counter point that remains for me is the difficulty of >> > deprecation, since I think the new name idea is very clean. And this is >> > unfortunately even more invasive then the index_order proposal. >> >> I completely agree that we'd have to be gentle with the change. The >> problem we'd want to avoid is people innocently using 'layout' and >> finding to their annoyance that the code doesn't work with other >> people's numpy. >> >> How about: >> >> Step 1: 'order' remains as named keyword, layout added as alias, >> comment on the lines of "layout will become the default keyword for >> this option in later versions of numpy; please consider updating any >> code that does not need to remain backwards compatible'. >> >> Step 2: default keyword becomes 'layout' with 'order' as alias, >> comment like "order is an alias for 'layout' to maintain backwards >> compatibility with numpy <= 1.7.1', please update any code that does >> not need to maintain backwards compatibility with these numpy >> versions' >> >> Step 3: Add deprecation warning for 'order', "order will be removed as >> an alias in future versions of numpy" >> >> Step 4: (distant future) Remove alias >> >> ? > > > A very strong -1 from me. Now we're talking about deprecation warnings and a > backwards compatibility break after all. I thought we agreed that this was a > very bad idea, so why are you proposing it now? > > Here's how I see it: deprecation of "order" is a no go. Therefore we have > two choices here: > 1. Simply document the current "order" keyword better and leave it at that. > 2. Add a "layout" (or "index_order") keyword, and live with both "order" and > "layout" keywords forever. > > (2) is at least as confusing as (1), more work and poor design. Therefore I > propose to go with (1). You are saying that deprecation of 'order' at any stage in the next 10 years of numpy's lifetime is a no go? I think that is short-sighted and I think it will damage numpy. Believe me, I have as much investment in backward compatibility as you do. All the three libraries that I spend a long time maintaining need to test against old numpy versions - but - for heaven's sake - only back to numpy 1.2 or numpy 1.3. We don't support Python 2.5 any more, and I don't think we need to maintain compatibility with Numeric either. If you are saying that we need to maintain compatibility for 10 years at a stretch, then we will have to accept that numpy will gradually decay into a legacy library, because it is certain that, if we stay static, someone else with more ambition will do a better job. There is a cost to being averse to any change at all, no matter how gradually it is managed. 
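For scale: the mechanics of step 1 are tiny. A hypothetical shim (invented names, not numpy source) would be roughly:

import numpy as np

def copy(a, layout=None, order=None):
    # hypothetical step-1 alias: accept both spellings, prefer 'layout'
    if layout is not None and order is not None:
        raise TypeError("pass only one of 'layout' or 'order'")
    if layout is None:
        layout = 'C' if order is None else order
    return np.array(a, order=layout, copy=True)

with a DeprecationWarning for 'order' arriving only at step 3, releases later.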
Best, Matthew From ralf.gommers at gmail.com Fri Apr 5 15:53:00 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 5 Apr 2013 21:53:00 +0200 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1365153645.683.38.camel@sebastian-laptop> Message-ID: On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett wrote: > Hi, > > On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers > wrote: > > > > > > > > On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett > > wrote: > >> > >> Hi, > >> > >> On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg > >> wrote: > >> > Hey > >> > > >> > On Thu, 2013-04-04 at 14:20 -0700, Matthew Brett wrote: > >> >> Hi, > >> >> > >> >> On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith > wrote: > >> >> > >> >> > Maybe we should go through and rename "order" to something more > >> >> > descriptive > >> >> > in each case, so we'd have > >> >> > a.reshape(..., index_order="C") > >> >> > a.copy(memory_order="F") > >> >> > etc.? > >> >> > >> >> I'd like to propose this instead: > >> >> > >> >> a.reshape(..., order="C") > >> >> a.copy(layout="F") > >> >> > >> > > >> > I actually like this, makes the point clearer that it has to do with > >> > memory layout and implies contiguity, plus it is short and from the > >> > numpy perspective copy, etc. are the ones that add additional info to > >> > "order" and not reshape (because IMO memory order is something new > users > >> > should not worry about at first). A and K orders will still have their > >> > quirks with np.array and copy=True/False, but for many functions they > >> > are esoteric anyway. > >> > > >> > It will be one hell of a deprecation though, but I am +0.5 for adding > an > >> > alias for now (maybe someone knows an even better name?), but I think > >> > that in this case, it probably really is better to wait with actual > >> > deprecation warnings for a few versions, since it touches a *lot* of > >> > code. Plus I think at the point of starting deprecation warnings (and > >> > best earlier) numpy should provide an automatic fixer script... > >> > > >> > The only counter point that remains for me is the difficulty of > >> > deprecation, since I think the new name idea is very clean. And this > is > >> > unfortunately even more invasive then the index_order proposal. > >> > >> I completely agree that we'd have to be gentle with the change. The > >> problem we'd want to avoid is people innocently using 'layout' and > >> finding to their annoyance that the code doesn't work with other > >> people's numpy. > >> > >> How about: > >> > >> Step 1: 'order' remains as named keyword, layout added as alias, > >> comment on the lines of "layout will become the default keyword for > >> this option in later versions of numpy; please consider updating any > >> code that does not need to remain backwards compatible'. > >> > >> Step 2: default keyword becomes 'layout' with 'order' as alias, > >> comment like "order is an alias for 'layout' to maintain backwards > >> compatibility with numpy <= 1.7.1', please update any code that does > >> not need to maintain backwards compatibility with these numpy > >> versions' > >> > >> Step 3: Add deprecation warning for 'order', "order will be removed as > >> an alias in future versions of numpy" > >> > >> Step 4: (distant future) Remove alias > >> > >> ? > > > > > > A very strong -1 from me. Now we're talking about deprecation warnings > and a > > backwards compatibility break after all. 
I thought we agreed that this > was a > > very bad idea, so why are you proposing it now? > > > > Here's how I see it: deprecation of "order" is a no go. Therefore we have > > two choices here: > > 1. Simply document the current "order" keyword better and leave it at > that. > > 2. Add a "layout" (or "index_order") keyword, and live with both "order" > and > > "layout" keywords forever. > > > > (2) is at least as confusing as (1), more work and poor design. > Therefore I > > propose to go with (1). > > You are saying that deprecation of 'order' at any stage in the next 10 > years of numpy's lifetime is a no go? > For something like this? Yes. > I think that is short-sighted and I think it will damage numpy. > It will damage numpy to be conservative and not change a name for a little bit of clarity for some people that avoids reading the docs maybe a little more carefully? There's a lot of things that can damage numpy, but this isn't even close in my book. Too few developers, continuous backwards compatibility issues, faster alternative libraries surpassing numpy - that's the kind of thing that causes damage. > Believe me, I have as much investment in backward compatibility as you > do. All the three libraries that I spend a long time maintaining need > to test against old numpy versions - but - for heaven's sake - only > back to numpy 1.2 or numpy 1.3. We don't support Python 2.5 any more, > and I don't think we need to maintain compatibility with Numeric > either. > Really? This is from 3 months ago: http://article.gmane.org/gmane.comp.python.numeric.general/52632. It's now 2013, we are probably dropping numarray compat in 1.8. Not exactly 10 years, but of the same order. > If you are saying that we need to maintain compatibility for 10 years > at a stretch, then we will have to accept that numpy will gradually > decay into a legacy library, because it is certain that, if we stay > static, someone else with more ambition will do a better job. > > There is a cost to being averse to any change at all, no matter how > gradually it is managed. > It's a cost/benefit trade-off, yes. Breaking backwards compatibility for a big step forward is sometimes necessary, in order to avoid decay as you say. You seem to have lost sight of the little thing you're arguing for though. There simply is no big step forward here. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Fri Apr 5 18:09:14 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 5 Apr 2013 15:09:14 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1365153645.683.38.camel@sebastian-laptop> Message-ID: Hi, On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers wrote: > > > > On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett > wrote: >> >> Hi, >> >> On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers >> wrote: >> > >> > >> > >> > On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett >> > wrote: >> >> >> >> Hi, >> >> >> >> On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg >> >> wrote: >> >> > Hey >> >> > >> >> > On Thu, 2013-04-04 at 14:20 -0700, Matthew Brett wrote: >> >> >> Hi, >> >> >> >> >> >> On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith >> >> >> wrote: >> >> >> >> >> >> > Maybe we should go through and rename "order" to something more >> >> >> > descriptive >> >> >> > in each case, so we'd have >> >> >> > a.reshape(..., index_order="C") >> >> >> > a.copy(memory_order="F") >> >> >> > etc.? 
>> >> >> >> >> >> I'd like to propose this instead: >> >> >> >> >> >> a.reshape(..., order="C") >> >> >> a.copy(layout="F") >> >> >> >> >> > >> >> > I actually like this, makes the point clearer that it has to do with >> >> > memory layout and implies contiguity, plus it is short and from the >> >> > numpy perspective copy, etc. are the ones that add additional info to >> >> > "order" and not reshape (because IMO memory order is something new >> >> > users >> >> > should not worry about at first). A and K orders will still have >> >> > their >> >> > quirks with np.array and copy=True/False, but for many functions they >> >> > are esoteric anyway. >> >> > >> >> > It will be one hell of a deprecation though, but I am +0.5 for adding >> >> > an >> >> > alias for now (maybe someone knows an even better name?), but I think >> >> > that in this case, it probably really is better to wait with actual >> >> > deprecation warnings for a few versions, since it touches a *lot* of >> >> > code. Plus I think at the point of starting deprecation warnings (and >> >> > best earlier) numpy should provide an automatic fixer script... >> >> > >> >> > The only counter point that remains for me is the difficulty of >> >> > deprecation, since I think the new name idea is very clean. And this >> >> > is >> >> > unfortunately even more invasive then the index_order proposal. >> >> >> >> I completely agree that we'd have to be gentle with the change. The >> >> problem we'd want to avoid is people innocently using 'layout' and >> >> finding to their annoyance that the code doesn't work with other >> >> people's numpy. >> >> >> >> How about: >> >> >> >> Step 1: 'order' remains as named keyword, layout added as alias, >> >> comment on the lines of "layout will become the default keyword for >> >> this option in later versions of numpy; please consider updating any >> >> code that does not need to remain backwards compatible'. >> >> >> >> Step 2: default keyword becomes 'layout' with 'order' as alias, >> >> comment like "order is an alias for 'layout' to maintain backwards >> >> compatibility with numpy <= 1.7.1', please update any code that does >> >> not need to maintain backwards compatibility with these numpy >> >> versions' >> >> >> >> Step 3: Add deprecation warning for 'order', "order will be removed as >> >> an alias in future versions of numpy" >> >> >> >> Step 4: (distant future) Remove alias >> >> >> >> ? >> > >> > >> > A very strong -1 from me. Now we're talking about deprecation warnings >> > and a >> > backwards compatibility break after all. I thought we agreed that this >> > was a >> > very bad idea, so why are you proposing it now? >> > >> > Here's how I see it: deprecation of "order" is a no go. Therefore we >> > have >> > two choices here: >> > 1. Simply document the current "order" keyword better and leave it at >> > that. >> > 2. Add a "layout" (or "index_order") keyword, and live with both "order" >> > and >> > "layout" keywords forever. >> > >> > (2) is at least as confusing as (1), more work and poor design. >> > Therefore I >> > propose to go with (1). >> >> You are saying that deprecation of 'order' at any stage in the next 10 >> years of numpy's lifetime is a no go? > > > For something like this? Yes. You are saying I think that I am wrong in thinking this is an important change that will make numpy easier to explain and use in the long term. You'd probably expect me to disagree, and I do. 
I think I am right in >> thinking the change is important - I've tried to make that case in >> this thread, as well as I can. >>>> I think that is short-sighted and I think it will damage numpy. >>> >>> >>> It will damage numpy to be conservative and not change a name for a little >>> bit of clarity for some people that avoids reading the docs maybe a little >>> more carefully? There's a lot of things that can damage numpy, but this >>> isn't even close in my book. Too few developers, continuous backwards >>> compatibility issues, faster alternative libraries surpassing numpy - that's >>> the kind of thing that causes damage. We've talked about consensus on this list. Of course it can be very hard to achieve. >>> Believe me, I have as much investment in backward compatibility as you >>> do. All the three libraries that I spend a long time maintaining need >>> to test against old numpy versions - but - for heaven's sake - only >>> back to numpy 1.2 or numpy 1.3. We don't support Python 2.5 any more, >>> and I don't think we need to maintain compatibility with Numeric >>> either. >> Really? This is from 3 months ago: >> http://article.gmane.org/gmane.comp.python.numeric.general/52632. It's now >> 2013, we are probably dropping numarray compat in 1.8. Not exactly 10 years, >> but of the same order. I am happy to make this change over the same time course if you think that is necessary. Cheers, Matthew From brad.froehle at gmail.com Fri Apr 5 18:09:58 2013 From: brad.froehle at gmail.com (Bradley M. Froehle) Date: Fri, 5 Apr 2013 15:09:58 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1365153645.683.38.camel@sebastian-laptop> Message-ID: <2C2E3730212E4FFC977B80D9949DD88C@gmail.com> Hi, On Friday, April 5, 2013 at 12:09 PM, Ralf Gommers wrote: > On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett wrote: > > How about: > > > > Step 1: 'order' remains as named keyword, layout added as alias, > > comment on the lines of "layout will become the default keyword for > > this option in later versions of numpy; please consider updating any > > code that does not need to remain backwards compatible'. > > > > Step 2: default keyword becomes 'layout' with 'order' as alias, > > comment like "order is an alias for 'layout' to maintain backwards > > compatibility with numpy <= 1.7.1', please update any code that does > > not need to maintain backwards compatibility with these numpy > > versions' > > > > Step 3: Add deprecation warning for 'order', "order will be removed as > > an alias in future versions of numpy" > > > > Step 4: (distant future) Remove alias > > > > ? > > A very strong -1 from me. Now we're talking about deprecation warnings and a backwards compatibility break after all. I thought we agreed that this was a very bad idea, so why are you proposing it now? > > Here's how I see it: deprecation of "order" is a no go. Therefore we have two choices here: > 1. Simply document the current "order" keyword better and leave it at that. > 2. Add a "layout" (or "index_order") keyword, and live with both "order" and "layout" keywords forever. > > (2) is at least as confusing as (1), more work and poor design. Therefore I propose to go with (1). I agree with Ralf. It's not worth breaking backwards compatibility or supporting two flags (with only further potential for confusion). If we were designing a system from scratch, I concede that it _might_ have been better to use 'layout' instead of 'order'... but that decision has already been made. 
This proposal fails the cost/benefit analysis, being too expensive for too little benefit. Regards, Brad From matthew.brett at gmail.com Fri Apr 5 19:03:24 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 5 Apr 2013 16:03:24 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1365153645.683.38.camel@sebastian-laptop> Message-ID: Hi, On Fri, Apr 5, 2013 at 3:09 PM, Matthew Brett wrote: > Hi, > > On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers wrote: >> >> >> >> On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett >> wrote: >>> >>> Hi, >>> >>> On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers >>> wrote: >>> > >>> > >>> > >>> > On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett >>> > wrote: >>> >> >>> >> Hi, >>> >> >>> >> On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg >>> >> wrote: >>> >> > Hey >>> >> > >>> >> > On Thu, 2013-04-04 at 14:20 -0700, Matthew Brett wrote: >>> >> >> Hi, >>> >> >> >>> >> >> On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith >>> >> >> wrote: >>> >> >> >>> >> >> > Maybe we should go through and rename "order" to something more >>> >> >> > descriptive >>> >> >> > in each case, so we'd have >>> >> >> > a.reshape(..., index_order="C") >>> >> >> > a.copy(memory_order="F") >>> >> >> > etc.? >>> >> >> >>> >> >> I'd like to propose this instead: >>> >> >> >>> >> >> a.reshape(..., order="C") >>> >> >> a.copy(layout="F") >>> >> >> >>> >> > >>> >> > I actually like this, makes the point clearer that it has to do with >>> >> > memory layout and implies contiguity, plus it is short and from the >>> >> > numpy perspective copy, etc. are the ones that add additional info to >>> >> > "order" and not reshape (because IMO memory order is something new >>> >> > users >>> >> > should not worry about at first). A and K orders will still have >>> >> > their >>> >> > quirks with np.array and copy=True/False, but for many functions they >>> >> > are esoteric anyway. >>> >> > >>> >> > It will be one hell of a deprecation though, but I am +0.5 for adding >>> >> > an >>> >> > alias for now (maybe someone knows an even better name?), but I think >>> >> > that in this case, it probably really is better to wait with actual >>> >> > deprecation warnings for a few versions, since it touches a *lot* of >>> >> > code. Plus I think at the point of starting deprecation warnings (and >>> >> > best earlier) numpy should provide an automatic fixer script... >>> >> > >>> >> > The only counter point that remains for me is the difficulty of >>> >> > deprecation, since I think the new name idea is very clean. And this >>> >> > is >>> >> > unfortunately even more invasive then the index_order proposal. >>> >> >>> >> I completely agree that we'd have to be gentle with the change. The >>> >> problem we'd want to avoid is people innocently using 'layout' and >>> >> finding to their annoyance that the code doesn't work with other >>> >> people's numpy. >>> >> >>> >> How about: >>> >> >>> >> Step 1: 'order' remains as named keyword, layout added as alias, >>> >> comment on the lines of "layout will become the default keyword for >>> >> this option in later versions of numpy; please consider updating any >>> >> code that does not need to remain backwards compatible'. 
>>> >> >>> >> Step 2: default keyword becomes 'layout' with 'order' as alias, >>> >> comment like "order is an alias for 'layout' to maintain backwards >>> >> compatibility with numpy <= 1.7.1', please update any code that does >>> >> not need to maintain backwards compatibility with these numpy >>> >> versions' >>> >> >>> >> Step 3: Add deprecation warning for 'order', "order will be removed as >>> >> an alias in future versions of numpy" >>> >> >>> >> Step 4: (distant future) Remove alias >>> >> >>> >> ? >>> > >>> > >>> > A very strong -1 from me. Now we're talking about deprecation warnings >>> > and a >>> > backwards compatibility break after all. I thought we agreed that this >>> > was a >>> > very bad idea, so why are you proposing it now? >>> > >>> > Here's how I see it: deprecation of "order" is a no go. Therefore we >>> > have >>> > two choices here: >>> > 1. Simply document the current "order" keyword better and leave it at >>> > that. >>> > 2. Add a "layout" (or "index_order") keyword, and live with both "order" >>> > and >>> > "layout" keywords forever. >>> > >>> > (2) is at least as confusing as (1), more work and poor design. >>> > Therefore I >>> > propose to go with (1). >>> >>> You are saying that deprecation of 'order' at any stage in the next 10 >>> years of numpy's lifetime is a no go? >> >> >> For something like this? Yes. > > You are saying I think that I am wrong in thinking this is an > important change that will make numpy easier to explain and use in the > long term. > > You'd probably expect me to disagree, and I do. I think I am right in > thinking the change is important - I've tried to make that case in > this thread, as well as I can. > >>> I think that is short-sighted and I think it will damage numpy. >> >> >> It will damage numpy to be conservative and not change a name for a little >> bit of clarity for some people that avoids reading the docs maybe a little >> more carefully? There's a lot of things that can damage numpy, but this >> isn't even close in my book. Too few developers, continuous backwards >> compatibility issues, faster alternative libraries surpassing numpy - that's >> the kind of thing that causes damage. > > We're talked about consensus on this list. Of course it can be very > hard to achieve. > >>> Believe me, I have as much investment in backward compatibility as you >>> do. All the three libraries that I spend a long time maintaining need >>> to test against old numpy versions - but - for heaven's sake - only >>> back to numpy 1.2 or numpy 1.3. We don't support Python 2.5 any more, >>> and I don't think we need to maintain compatibility with Numeric >>> either. >> >> >> Really? This is from 3 months ago: >> http://article.gmane.org/gmane.comp.python.numeric.general/52632. It's now >> 2013, we are probably dropping numarray compat in 1.8. Not exactly 10 years, >> but of the same order. > > I am happy to make this change over the same time course if you think > that is necessary. I am also happy with only steps 1 and 2 if you feel that deprecation over any time scale is unacceptable. 
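For anyone skimming this long thread, a small interpreter session may make the thing we keep arguing about concrete. This is plain current numpy - nothing proposed, no new keyword - and it shows the two unrelated jobs the single 'order' argument does today:

    import numpy as np

    a = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]], C memory layout

    # 1) 'order' as *index order*: ravel reads the elements out in
    #    Fortran (first-index-fastest) order; memory is not rearranged.
    np.ravel(a, order='F')           # -> array([0, 3, 1, 4, 2, 5])

    # 2) 'order' as *memory layout*: copy makes a column-major block of
    #    memory; the values and the indexing are completely unchanged.
    b = a.copy(order='F')
    bool((b == a).all())             # -> True: indexing is the same
    b.flags['F_CONTIGUOUS']          # -> True: only the layout differs

Same keyword name, two different meanings - that is the whole complaint, in a dozen lines.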
Best, Matthew From josef.pktd at gmail.com Fri Apr 5 19:27:49 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 5 Apr 2013 19:27:49 -0400 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1365153645.683.38.camel@sebastian-laptop> Message-ID: On Fri, Apr 5, 2013 at 6:09 PM, Matthew Brett wrote: > Hi, > > On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers wrote: >> >> >> >> On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett >> wrote: >>> >>> Hi, >>> >>> On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers >>> wrote: >>> > >>> > >>> > >>> > On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett >>> > wrote: >>> >> >>> >> Hi, >>> >> >>> >> On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg >>> >> wrote: >>> >> > Hey >>> >> > >>> >> > On Thu, 2013-04-04 at 14:20 -0700, Matthew Brett wrote: >>> >> >> Hi, >>> >> >> >>> >> >> On Tue, Apr 2, 2013 at 4:32 AM, Nathaniel Smith >>> >> >> wrote: >>> >> >> >>> >> >> > Maybe we should go through and rename "order" to something more >>> >> >> > descriptive >>> >> >> > in each case, so we'd have >>> >> >> > a.reshape(..., index_order="C") >>> >> >> > a.copy(memory_order="F") >>> >> >> > etc.? >>> >> >> >>> >> >> I'd like to propose this instead: >>> >> >> >>> >> >> a.reshape(..., order="C") >>> >> >> a.copy(layout="F") >>> >> >> >>> >> > >>> >> > I actually like this, makes the point clearer that it has to do with >>> >> > memory layout and implies contiguity, plus it is short and from the >>> >> > numpy perspective copy, etc. are the ones that add additional info to >>> >> > "order" and not reshape (because IMO memory order is something new >>> >> > users >>> >> > should not worry about at first). A and K orders will still have >>> >> > their >>> >> > quirks with np.array and copy=True/False, but for many functions they >>> >> > are esoteric anyway. >>> >> > >>> >> > It will be one hell of a deprecation though, but I am +0.5 for adding >>> >> > an >>> >> > alias for now (maybe someone knows an even better name?), but I think >>> >> > that in this case, it probably really is better to wait with actual >>> >> > deprecation warnings for a few versions, since it touches a *lot* of >>> >> > code. Plus I think at the point of starting deprecation warnings (and >>> >> > best earlier) numpy should provide an automatic fixer script... >>> >> > >>> >> > The only counter point that remains for me is the difficulty of >>> >> > deprecation, since I think the new name idea is very clean. And this >>> >> > is >>> >> > unfortunately even more invasive then the index_order proposal. >>> >> >>> >> I completely agree that we'd have to be gentle with the change. The >>> >> problem we'd want to avoid is people innocently using 'layout' and >>> >> finding to their annoyance that the code doesn't work with other >>> >> people's numpy. >>> >> >>> >> How about: >>> >> >>> >> Step 1: 'order' remains as named keyword, layout added as alias, >>> >> comment on the lines of "layout will become the default keyword for >>> >> this option in later versions of numpy; please consider updating any >>> >> code that does not need to remain backwards compatible'. 
>>> >> >>> >> Step 2: default keyword becomes 'layout' with 'order' as alias, >>> >> comment like "order is an alias for 'layout' to maintain backwards >>> >> compatibility with numpy <= 1.7.1', please update any code that does >>> >> not need to maintain backwards compatibility with these numpy >>> >> versions' >>> >> >>> >> Step 3: Add deprecation warning for 'order', "order will be removed as >>> >> an alias in future versions of numpy" >>> >> >>> >> Step 4: (distant future) Remove alias >>> >> >>> >> ? >>> > >>> > >>> > A very strong -1 from me. Now we're talking about deprecation warnings >>> > and a >>> > backwards compatibility break after all. I thought we agreed that this >>> > was a >>> > very bad idea, so why are you proposing it now? >>> > >>> > Here's how I see it: deprecation of "order" is a no go. Therefore we >>> > have >>> > two choices here: >>> > 1. Simply document the current "order" keyword better and leave it at >>> > that. >>> > 2. Add a "layout" (or "index_order") keyword, and live with both "order" >>> > and >>> > "layout" keywords forever. >>> > >>> > (2) is at least as confusing as (1), more work and poor design. >>> > Therefore I >>> > propose to go with (1). >>> >>> You are saying that deprecation of 'order' at any stage in the next 10 >>> years of numpy's lifetime is a no go? >> >> >> For something like this? Yes. > > You are saying I think that I am wrong in thinking this is an > important change that will make numpy easier to explain and use in the > long term. > > You'd probably expect me to disagree, and I do. I think I am right in > thinking the change is important - I've tried to make that case in > this thread, as well as I can. > >>> I think that is short-sighted and I think it will damage numpy. >> >> >> It will damage numpy to be conservative and not change a name for a little >> bit of clarity for some people that avoids reading the docs maybe a little >> more carefully? There's a lot of things that can damage numpy, but this >> isn't even close in my book. Too few developers, continuous backwards >> compatibility issues, faster alternative libraries surpassing numpy - that's >> the kind of thing that causes damage. > > We've talked about consensus on this list. Of course it can be very > hard to achieve. So far the consensus is that the documentation needs improvement. After that ??? my summary: I still think it has way too much impact for just a name change. I think views are unfamiliar coming from another matrix language, that has only memory (and copies arrays all the time), and no single word can help much in "getting it". Once we understand the distinction between views and memory, then the dual use of ``order`` has a nice symmetry to it (for me). 95% of all numpy users/usage (a wild guess) work(s) without any consideration of actual memory. If there is no consensus for change, then the status-quo prevails (unless the executive council over-rides). (I had prepared another message, that got censored, which starts with What is the numpy equivalent of Matlab's x(:)? quick answer, and long answer ... ) Josef > >>> Believe me, I have as much investment in backward compatibility as you >>> do. All the three libraries that I spend a long time maintaining need >>> to test against old numpy versions - but - for heaven's sake - only >>> back to numpy 1.2 or numpy 1.3. We don't support Python 2.5 any more, >>> and I don't think we need to maintain compatibility with Numeric >>> either. >> >> Really? 
This is from 3 months ago: >> http://article.gmane.org/gmane.comp.python.numeric.general/52632. It's now >> 2013, we are probably dropping numarray compat in 1.8. Not exactly 10 years, >> but of the same order. > > I am happy to make this change over the same time course if you think > that is necessary. > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From matthew.brett at gmail.com Fri Apr 5 21:50:03 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 5 Apr 2013 18:50:03 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1365153645.683.38.camel@sebastian-laptop> Message-ID: Hi, On Fri, Apr 5, 2013 at 4:27 PM, wrote: > On Fri, Apr 5, 2013 at 6:09 PM, Matthew Brett wrote: >> Hi, >> >> On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers wrote: >>> >>> >>> >>> On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett >>> wrote: >>>> >>>> Hi, >>>> >>>> On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers >>>> wrote: >>>> > >>>> > >>>> > >>>> > On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett >>>> > wrote: >>>> >> >>>> >> Hi, >>>> >> >>>> >> On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg >>>> >> wrote: >>>> >> > Hey >>>> >> I completely agree that we'd have to be gentle with the change. The >>>> >> problem we'd want to avoid is people innocently using 'layout' and >>>> >> finding to their annoyance that the code doesn't work with other >>>> >> people's numpy. >>>> >> >>>> >> How about: >>>> >> >>>> >> Step 1: 'order' remains as named keyword, layout added as alias, >>>> >> comment on the lines of "layout will become the default keyword for >>>> >> this option in later versions of numpy; please consider updating any >>>> >> code that does not need to remain backwards compatible'. >>>> >> >>>> >> Step 2: default keyword becomes 'layout' with 'order' as alias, >>>> >> comment like "order is an alias for 'layout' to maintain backwards >>>> >> compatibility with numpy <= 1.7.1', please update any code that does >>>> >> not need to maintain backwards compatibility with these numpy >>>> >> versions' >>>> >> >>>> >> Step 3: Add deprecation warning for 'order', "order will be removed as >>>> >> an alias in future versions of numpy" >>>> >> >>>> >> Step 4: (distant future) Remove alias >>>> >> >>>> >> ? >>>> > >>>> > >>>> > A very strong -1 from me. Now we're talking about deprecation warnings >>>> > and a >>>> > backwards compatibility break after all. I thought we agreed that this >>>> > was a >>>> > very bad idea, so why are you proposing it now? >>>> > >>>> > Here's how I see it: deprecation of "order" is a no go. Therefore we >>>> > have >>>> > two choices here: >>>> > 1. Simply document the current "order" keyword better and leave it at >>>> > that. >>>> > 2. Add a "layout" (or "index_order") keyword, and live with both "order" >>>> > and >>>> > "layout" keywords forever. >>>> > >>>> > (2) is at least as confusing as (1), more work and poor design. >>>> > Therefore I >>>> > propose to go with (1). >>>> >>>> You are saying that deprecation of 'order' at any stage in the next 10 >>>> years of numpy's lifetime is a no go? >>> >>> >>> For something like this? Yes. >> >> You are saying I think that I am wrong in thinking this is an >> important change that will make numpy easier to explain and use in the >> long term. >> >> You'd probably expect me to disagree, and I do. 
I think I am right in >> thinking the change is important - I've tried to make that case in >> this thread, as well as I can. >> >>>> I think that is short-sighted and I think it will damage numpy. >>> >>> >>> It will damage numpy to be conservative and not change a name for a little >>> bit of clarity for some people that avoids reading the docs maybe a little >>> more carefully? There's a lot of things that can damage numpy, but this >>> isn't even close in my book. Too few developers, continuous backwards >>> compatibility issues, faster alternative libraries surpassing numpy - that's >>> the kind of thing that causes damage. >> >> We've talked about consensus on this list. Of course it can be very >> hard to achieve. > > So far the consensus is that the documentation needs improvement. The only thing all of the No camp agree with is documentation improvement, I think that's fair. > After that ??? Well I think we have: Flat-no - the change not important, almost any cost is too high You Ralf Bradley Mid-no - maybe something could work, but not sure we've seen it yet. Chris Middle - current situation can be confusing, maybe one of the proposed solutions would be acceptable Sebastian Nathaniel Mid-yes - previous apparent vote for argument name change Éric Depagne Andrew Jaffe (sorry if I misrepresent you) And then me. I am trying to be balanced. Unlike others, I think better names would have a significant impact on how coherent numpy is to explain and use. It seems to me that a change would be beneficial in the long term, and I'm confident we can agree on a schedule for that change that would be acceptable. But you know that. So - as I understand our 'model' - our job is to try and come to some shared agreement, if we possibly can. It has been good and encouraging for me at least to see that we have developed our ideas over the course of this thread. Cheers, Matthew From josef.pktd at gmail.com Fri Apr 5 22:39:55 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 5 Apr 2013 22:39:55 -0400 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1365153645.683.38.camel@sebastian-laptop> Message-ID: On Fri, Apr 5, 2013 at 9:50 PM, Matthew Brett wrote: > Hi, > > On Fri, Apr 5, 2013 at 4:27 PM, wrote: >> On Fri, Apr 5, 2013 at 6:09 PM, Matthew Brett wrote: >>> Hi, >>> >>> On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers wrote: >>>> >>>> >>>> >>>> On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett >>>> wrote: >>>>> >>>>> Hi, >>>>> >>>>> On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers >>>>> wrote: >>>>> > >>>>> > >>>>> > >>>>> > On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett >>>>> > wrote: >>>>> >> >>>>> >> Hi, >>>>> >> >>>>> >> On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg >>>>> >> wrote: >>>>> >> > Hey >>>>> >> I completely agree that we'd have to be gentle with the change. The >>>>> >> problem we'd want to avoid is people innocently using 'layout' and >>>>> >> finding to their annoyance that the code doesn't work with other >>>>> >> people's numpy. >>>>> >> >>>>> >> How about: >>>>> >> >>>>> >> Step 1: 'order' remains as named keyword, layout added as alias, >>>>> >> comment on the lines of "layout will become the default keyword for >>>>> >> this option in later versions of numpy; please consider updating any >>>>> >> code that does not need to remain backwards compatible'. 
>>>>> >> >>>>> >> Step 2: default keyword becomes 'layout' with 'order' as alias, >>>>> >> comment like "order is an alias for 'layout' to maintain backwards >>>>> >> compatibility with numpy <= 1.7.1', please update any code that does >>>>> >> not need to maintain backwards compatibility with these numpy >>>>> >> versions' >>>>> >> >>>>> >> Step 3: Add deprecation warning for 'order', "order will be removed as >>>>> >> an alias in future versions of numpy" >>>>> >> >>>>> >> Step 4: (distant future) Remove alias >>>>> >> >>>>> >> ? >>>>> > >>>>> > >>>>> > A very strong -1 from me. Now we're talking about deprecation warnings >>>>> > and a >>>>> > backwards compatibility break after all. I thought we agreed that this >>>>> > was a >>>>> > very bad idea, so why are you proposing it now? >>>>> > >>>>> > Here's how I see it: deprecation of "order" is a no go. Therefore we >>>>> > have >>>>> > two choices here: >>>>> > 1. Simply document the current "order" keyword better and leave it at >>>>> > that. >>>>> > 2. Add a "layout" (or "index_order") keyword, and live with both "order" >>>>> > and >>>>> > "layout" keywords forever. >>>>> > >>>>> > (2) is at least as confusing as (1), more work and poor design. >>>>> > Therefore I >>>>> > propose to go with (1). >>>>> >>>>> You are saying that deprecation of 'order' at any stage in the next 10 >>>>> years of numpy's lifetime is a no go? >>>> >>>> >>>> For something like this? Yes. >>> >>> You are saying I think that I am wrong in thinking this is an >>> important change that will make numpy easier to explain and use in the >>> long term. >>> >>> You'd probably expect me to disagree, and I do. I think I am right in >>> thinking the change is important - I've tried to make that case in >>> this thread, as well as I can. >>> >>>>> I think that is short-sighted and I think it will damage numpy. >>>> >>>> >>>> It will damage numpy to be conservative and not change a name for a little >>>> bit of clarity for some people that avoids reading the docs maybe a little >>>> more carefully? There's a lot of things that can damage numpy, but this >>>> isn't even close in my book. Too few developers, continuous backwards >>>> compatibility issues, faster alternative libraries surpassing numpy - that's >>>> the kind of thing that causes damage. >>> >>> We're talked about consensus on this list. Of course it can be very >>> hard to achieve. >> >> So far the consensus is that the documentation needs improvement. > > The only thing all of the No camp agree with is documentation > improvement, I think that's fair. > >> After that ??? > > Well I think we have: > > Flat-no - the change not important, almost any cost is too high It's not *any* cost, this goes deep and wide, it's one of the basic concepts of numpy that you want to rename. Note, I'm just a user of numpy My main objection was to "N" and "Z", which would have affected me (and statsmodels developers) I don't really care about the "layout" change. I have no or almost no code depending on it. And, I don't have to implement it, nor do I have to struggle with the low level numpy behavior that would be affected by this. (And renaming doesn't change the concept.) Josef > > You > Ralf > Bradley > > Mid-no - maybe something could work, but not sure we've seen it yet. 
> > Chris > > Middle - current situation can be confusing, maybe one of the proposed > solutions would be acceptable > > Sebastian > Nathaniel > > Mid-yes - previous apparent vote for argument name change > > Éric Depagne > Andrew Jaffe (sorry if I misrepresent you) > > And then me. > > I am trying to be balanced. Unlike others, I think better names would > have a significant impact on how coherent numpy is to explain and use. > It seems to me that a change would be beneficial in the long term, > and I'm confident we can agree on a schedule for that change that > would be acceptable. But you know that. > > So - as I understand our 'model' - our job is to try and come to some > shared agreement, if we possibly can. > > It has been good and encouraging for me at least to see that we have > developed our ideas over the course of this thread. > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From matthew.brett at gmail.com Fri Apr 5 22:47:23 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 5 Apr 2013 19:47:23 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1365153645.683.38.camel@sebastian-laptop> Message-ID: Hi, On Fri, Apr 5, 2013 at 7:39 PM, wrote: > On Fri, Apr 5, 2013 at 9:50 PM, Matthew Brett wrote: >> Hi, >> >> On Fri, Apr 5, 2013 at 4:27 PM, wrote: >>> On Fri, Apr 5, 2013 at 6:09 PM, Matthew Brett wrote: >>>> Hi, >>>> >>>> On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers wrote: >>>>> >>>>> >>>>> >>>>> On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett >>>>> wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers >>>>>> wrote: >>>>>> > >>>>>> > >>>>>> > >>>>>> > On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett >>>>>> > wrote: >>>>>> >> >>>>>> >> Hi, >>>>>> >> >>>>>> >> On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg >>>>>> >> wrote: >>>>>> >> > Hey >> >>>>>> >> I completely agree that we'd have to be gentle with the change. The >>>>>> >> problem we'd want to avoid is people innocently using 'layout' and >>>>>> >> finding to their annoyance that the code doesn't work with other >>>>>> >> people's numpy. >>>>>> >> >>>>>> >> How about: >>>>>> >> >>>>>> >> Step 1: 'order' remains as named keyword, layout added as alias, >>>>>> >> comment on the lines of "layout will become the default keyword for >>>>>> >> this option in later versions of numpy; please consider updating any >>>>>> >> code that does not need to remain backwards compatible'. >>>>>> >> >>>>>> >> Step 2: default keyword becomes 'layout' with 'order' as alias, >>>>>> >> comment like "order is an alias for 'layout' to maintain backwards >>>>>> >> compatibility with numpy <= 1.7.1', please update any code that does >>>>>> >> not need to maintain backwards compatibility with these numpy >>>>>> >> versions' >>>>>> >> >>>>>> >> Step 3: Add deprecation warning for 'order', "order will be removed as >>>>>> >> an alias in future versions of numpy" >>>>>> >> >>>>>> >> Step 4: (distant future) Remove alias >>>>>> >> >>>>>> >> ? >>>>>> > >>>>>> > >>>>>> > A very strong -1 from me. Now we're talking about deprecation warnings >>>>>> > and a >>>>>> > backwards compatibility break after all. I thought we agreed that this >>>>>> > was a >>>>>> > very bad idea, so why are you proposing it now? >>>>>> > >>>>>> > Here's how I see it: deprecation of "order" is a no go. 
Therefore we >>>>>> > have >>>>>> > two choices here: >>>>>> > 1. Simply document the current "order" keyword better and leave it at >>>>>> > that. >>>>>> > 2. Add a "layout" (or "index_order") keyword, and live with both "order" >>>>>> > and >>>>>> > "layout" keywords forever. >>>>>> > >>>>>> > (2) is at least as confusing as (1), more work and poor design. >>>>>> > Therefore I >>>>>> > propose to go with (1). >>>>>> >>>>>> You are saying that deprecation of 'order' at any stage in the next 10 >>>>>> years of numpy's lifetime is a no go? >>>>> >>>>> >>>>> For something like this? Yes. >>>> >>>> You are saying I think that I am wrong in thinking this is an >>>> important change that will make numpy easier to explain and use in the >>>> long term. >>>> >>>> You'd probably expect me to disagree, and I do. I think I am right in >>>> thinking the change is important - I've tried to make that case in >>>> this thread, as well as I can. >>>> >>>>>> I think that is short-sighted and I think it will damage numpy. >>>>> >>>>> >>>>> It will damage numpy to be conservative and not change a name for a little >>>>> bit of clarity for some people that avoids reading the docs maybe a little >>>>> more carefully? There's a lot of things that can damage numpy, but this >>>>> isn't even close in my book. Too few developers, continuous backwards >>>>> compatibility issues, faster alternative libraries surpassing numpy - that's >>>>> the kind of thing that causes damage. >>>> >>>> We're talked about consensus on this list. Of course it can be very >>>> hard to achieve. >>> >>> So far the consensus is that the documentation needs improvement. >> >> The only thing all of the No camp agree with is documentation >> improvement, I think that's fair. >> >>> After that ??? >> >> Well I think we have: >> >> Flat-no - the change not important, almost any cost is too high > > It's not *any* cost, this goes deep and wide, it's one of the basic > concepts of numpy that you want to rename. The proposal I last made was to change the default name to 'layout' after some period to be agreed - say - P - with suitable warning in the docstring up until that time, and after, and leave 'order' as an alias forever. The only problem I can see with this, is that if someone, after period P, does not read the docstring, and uses 'layout' instead of 'order', then they will find that their code is not backwards compatible with versions of numpy of greater age than P. They can fix this, forever, by reverting to 'order'. That's certainly not zero cost, but it's not much cost either, and the cost will depend on P. > Note, I'm just a user of numpy > My main objection was to "N" and "Z", which would have affected me > (and statsmodels developers) Right. > I don't really care about the "layout" change. I have no or almost no > code depending on it. And, I don't have to implement it, nor do I have > to struggle with the low level numpy behavior that would be affected > by this. (And renaming doesn't change the concept.) No, right, the renaming is to clarify and distinguish the concepts. 
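To show how small that cost could be kept for downstream code, here is a minimal sketch - purely hypothetical, since no 'layout' keyword exists in any released numpy - of the one-function shim a library could carry during the transition period:

    import numpy as np

    def ravel_compat(a, layout='C'):
        # Prefer the proposed 'layout' spelling; fall back to 'order' on
        # any numpy that does not know the new name yet.
        try:
            return np.ravel(a, layout=layout)
        except TypeError:
            return np.ravel(a, order=layout)

On every numpy released so far the first call raises TypeError and the fallback runs, so the shim behaves identically everywhere; on a numpy that grew the alias, the new spelling would be used directly.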
Cheers, Matthew From josef.pktd at gmail.com Fri Apr 5 23:31:18 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 5 Apr 2013 23:31:18 -0400 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1365153645.683.38.camel@sebastian-laptop> Message-ID: On Fri, Apr 5, 2013 at 10:47 PM, Matthew Brett wrote: > Hi, > > On Fri, Apr 5, 2013 at 7:39 PM, wrote: >> On Fri, Apr 5, 2013 at 9:50 PM, Matthew Brett wrote: >>> Hi, >>> >>> On Fri, Apr 5, 2013 at 4:27 PM, wrote: >>>> On Fri, Apr 5, 2013 at 6:09 PM, Matthew Brett wrote: >>>>> Hi, >>>>> >>>>> On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers wrote: >>>>>> >>>>>> >>>>>> >>>>>> On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett >>>>>> wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers >>>>>>> wrote: >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> > On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett >>>>>>> > wrote: >>>>>>> >> >>>>>>> >> Hi, >>>>>>> >> >>>>>>> >> On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg >>>>>>> >> wrote: >>>>>>> >> > Hey >>> >>>>>>> >> I completely agree that we'd have to be gentle with the change. The >>>>>>> >> problem we'd want to avoid is people innocently using 'layout' and >>>>>>> >> finding to their annoyance that the code doesn't work with other >>>>>>> >> people's numpy. >>>>>>> >> >>>>>>> >> How about: >>>>>>> >> >>>>>>> >> Step 1: 'order' remains as named keyword, layout added as alias, >>>>>>> >> comment on the lines of "layout will become the default keyword for >>>>>>> >> this option in later versions of numpy; please consider updating any >>>>>>> >> code that does not need to remain backwards compatible'. >>>>>>> >> >>>>>>> >> Step 2: default keyword becomes 'layout' with 'order' as alias, >>>>>>> >> comment like "order is an alias for 'layout' to maintain backwards >>>>>>> >> compatibility with numpy <= 1.7.1', please update any code that does >>>>>>> >> not need to maintain backwards compatibility with these numpy >>>>>>> >> versions' >>>>>>> >> >>>>>>> >> Step 3: Add deprecation warning for 'order', "order will be removed as >>>>>>> >> an alias in future versions of numpy" >>>>>>> >> >>>>>>> >> Step 4: (distant future) Remove alias >>>>>>> >> >>>>>>> >> ? >>>>>>> > >>>>>>> > >>>>>>> > A very strong -1 from me. Now we're talking about deprecation warnings >>>>>>> > and a >>>>>>> > backwards compatibility break after all. I thought we agreed that this >>>>>>> > was a >>>>>>> > very bad idea, so why are you proposing it now? >>>>>>> > >>>>>>> > Here's how I see it: deprecation of "order" is a no go. Therefore we >>>>>>> > have >>>>>>> > two choices here: >>>>>>> > 1. Simply document the current "order" keyword better and leave it at >>>>>>> > that. >>>>>>> > 2. Add a "layout" (or "index_order") keyword, and live with both "order" >>>>>>> > and >>>>>>> > "layout" keywords forever. >>>>>>> > >>>>>>> > (2) is at least as confusing as (1), more work and poor design. >>>>>>> > Therefore I >>>>>>> > propose to go with (1). >>>>>>> >>>>>>> You are saying that deprecation of 'order' at any stage in the next 10 >>>>>>> years of numpy's lifetime is a no go? >>>>>> >>>>>> >>>>>> For something like this? Yes. >>>>> >>>>> You are saying I think that I am wrong in thinking this is an >>>>> important change that will make numpy easier to explain and use in the >>>>> long term. >>>>> >>>>> You'd probably expect me to disagree, and I do. 
I think I am right in >>>>> thinking the change is important - I've tried to make that case in >>>>> this thread, as well as I can. >>>>> >>>>>>> I think that is short-sighted and I think it will damage numpy. >>>>>> >>>>>> >>>>>> It will damage numpy to be conservative and not change a name for a little >>>>>> bit of clarity for some people that avoids reading the docs maybe a little >>>>>> more carefully? There's a lot of things that can damage numpy, but this >>>>>> isn't even close in my book. Too few developers, continuous backwards >>>>>> compatibility issues, faster alternative libraries surpassing numpy - that's >>>>>> the kind of thing that causes damage. >>>>> >>>>> We're talked about consensus on this list. Of course it can be very >>>>> hard to achieve. >>>> >>>> So far the consensus is that the documentation needs improvement. >>> >>> The only thing all of the No camp agree with is documentation >>> improvement, I think that's fair. >>> >>>> After that ??? >>> >>> Well I think we have: >>> >>> Flat-no - the change not important, almost any cost is too high >> >> It's not *any* cost, this goes deep and wide, it's one of the basic >> concepts of numpy that you want to rename. > > The proposal I last made was to change the default name to 'layout' > after some period to be agreed - say - P - with suitable warning in > the docstring up until that time, and after, and leave 'order' as an > alias forever. > > The only problem I can see with this, is that if someone, after period > P, does not read the docstring, and uses 'layout' instead of 'order', > then they will find that their code is not backwards compatible with > versions of numpy of greater age than P. They can fix this, forever, > by reverting to 'order'. That's certainly not zero cost, but it's not > much cost either, and the cost will depend on P. You edit large parts of the numpy tutorial and explanations, you add a second keyword to (rough guess) 10 functions and a similar number of methods (even wilder guess), the methods are in C, so you have to change it both on the c and the python level. Two keywords will confuse users for a long time (and which one is in the tutorial documentation) I'm just guessing and I have no idea about the c-level. Josef > >> Note, I'm just a user of numpy >> My main objection was to "N" and "Z", which would have affected me >> (and statsmodels developers) > > Right. > >> I don't really care about the "layout" change. I have no or almost no >> code depending on it. And, I don't have to implement it, nor do I have >> to struggle with the low level numpy behavior that would be affected >> by this. (And renaming doesn't change the concept.) > > No, right, the renaming is to clarify and distinguish the concepts. 
> > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From matthew.brett at gmail.com Fri Apr 5 23:38:12 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 5 Apr 2013 20:38:12 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1365153645.683.38.camel@sebastian-laptop> Message-ID: Hi, On Fri, Apr 5, 2013 at 8:31 PM, wrote: > On Fri, Apr 5, 2013 at 10:47 PM, Matthew Brett wrote: >> Hi, >> >> On Fri, Apr 5, 2013 at 7:39 PM, wrote: >>> On Fri, Apr 5, 2013 at 9:50 PM, Matthew Brett wrote: >>>> Hi, >>>> >>>> On Fri, Apr 5, 2013 at 4:27 PM, wrote: >>>>> On Fri, Apr 5, 2013 at 6:09 PM, Matthew Brett wrote: >>>>>> Hi, >>>>>> >>>>>> On Fri, Apr 5, 2013 at 12:53 PM, Ralf Gommers wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Fri, Apr 5, 2013 at 9:21 PM, Matthew Brett >>>>>>> wrote: >>>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> On Fri, Apr 5, 2013 at 3:09 PM, Ralf Gommers >>>>>>>> wrote: >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > On Fri, Apr 5, 2013 at 5:13 PM, Matthew Brett >>>>>>>> > wrote: >>>>>>>> >> >>>>>>>> >> Hi, >>>>>>>> >> >>>>>>>> >> On Fri, Apr 5, 2013 at 2:20 AM, Sebastian Berg >>>>>>>> >> wrote: >>>>>>>> >> > Hey >>>> >>>>>>>> >> I completely agree that we'd have to be gentle with the change. The >>>>>>>> >> problem we'd want to avoid is people innocently using 'layout' and >>>>>>>> >> finding to their annoyance that the code doesn't work with other >>>>>>>> >> people's numpy. >>>>>>>> >> >>>>>>>> >> How about: >>>>>>>> >> >>>>>>>> >> Step 1: 'order' remains as named keyword, layout added as alias, >>>>>>>> >> comment on the lines of "layout will become the default keyword for >>>>>>>> >> this option in later versions of numpy; please consider updating any >>>>>>>> >> code that does not need to remain backwards compatible'. >>>>>>>> >> >>>>>>>> >> Step 2: default keyword becomes 'layout' with 'order' as alias, >>>>>>>> >> comment like "order is an alias for 'layout' to maintain backwards >>>>>>>> >> compatibility with numpy <= 1.7.1', please update any code that does >>>>>>>> >> not need to maintain backwards compatibility with these numpy >>>>>>>> >> versions' >>>>>>>> >> >>>>>>>> >> Step 3: Add deprecation warning for 'order', "order will be removed as >>>>>>>> >> an alias in future versions of numpy" >>>>>>>> >> >>>>>>>> >> Step 4: (distant future) Remove alias >>>>>>>> >> >>>>>>>> >> ? >>>>>>>> > >>>>>>>> > >>>>>>>> > A very strong -1 from me. Now we're talking about deprecation warnings >>>>>>>> > and a >>>>>>>> > backwards compatibility break after all. I thought we agreed that this >>>>>>>> > was a >>>>>>>> > very bad idea, so why are you proposing it now? >>>>>>>> > >>>>>>>> > Here's how I see it: deprecation of "order" is a no go. Therefore we >>>>>>>> > have >>>>>>>> > two choices here: >>>>>>>> > 1. Simply document the current "order" keyword better and leave it at >>>>>>>> > that. >>>>>>>> > 2. Add a "layout" (or "index_order") keyword, and live with both "order" >>>>>>>> > and >>>>>>>> > "layout" keywords forever. >>>>>>>> > >>>>>>>> > (2) is at least as confusing as (1), more work and poor design. >>>>>>>> > Therefore I >>>>>>>> > propose to go with (1). >>>>>>>> >>>>>>>> You are saying that deprecation of 'order' at any stage in the next 10 >>>>>>>> years of numpy's lifetime is a no go? >>>>>>> >>>>>>> >>>>>>> For something like this? Yes. 
>>>>>> >>>>>> You are saying I think that I am wrong in thinking this is an >>>>>> important change that will make numpy easier to explain and use in the >>>>>> long term. >>>>>> >>>>>> You'd probably expect me to disagree, and I do. I think I am right in >>>>>> thinking the change is important - I've tried to make that case in >>>>>> this thread, as well as I can. >>>>>> >>>>>>>> I think that is short-sighted and I think it will damage numpy. >>>>>>> >>>>>>> >>>>>>> It will damage numpy to be conservative and not change a name for a little >>>>>>> bit of clarity for some people that avoids reading the docs maybe a little >>>>>>> more carefully? There's a lot of things that can damage numpy, but this >>>>>>> isn't even close in my book. Too few developers, continuous backwards >>>>>>> compatibility issues, faster alternative libraries surpassing numpy - that's >>>>>>> the kind of thing that causes damage. >>>>>> >>>>>> We're talked about consensus on this list. Of course it can be very >>>>>> hard to achieve. >>>>> >>>>> So far the consensus is that the documentation needs improvement. >>>> >>>> The only thing all of the No camp agree with is documentation >>>> improvement, I think that's fair. >>>> >>>>> After that ??? >>>> >>>> Well I think we have: >>>> >>>> Flat-no - the change not important, almost any cost is too high >>> >>> It's not *any* cost, this goes deep and wide, it's one of the basic >>> concepts of numpy that you want to rename. >> >> The proposal I last made was to change the default name to 'layout' >> after some period to be agreed - say - P - with suitable warning in >> the docstring up until that time, and after, and leave 'order' as an >> alias forever. >> >> The only problem I can see with this, is that if someone, after period >> P, does not read the docstring, and uses 'layout' instead of 'order', >> then they will find that their code is not backwards compatible with >> versions of numpy of greater age than P. They can fix this, forever, >> by reverting to 'order'. That's certainly not zero cost, but it's not >> much cost either, and the cost will depend on P. > > You edit large parts of the numpy tutorial and explanations, We agree that these concepts need to be clarified in the explanations. For the docs, we would first add the keyword as an alias and note it so. > you add a second keyword to (rough guess) 10 functions and > a similar number of methods (even wilder guess), the methods > are in C, so you have to change it both on the c and the python > level. I'm OK to do the code changes, I don't think that's a concern at the moment. If I don't or can't do the code changes then of course it won't happen. > Two keywords will confuse users for a long time > (and which one is in the tutorial documentation) 'order' until expiry of 'P' (with note of planned switch). Then 'layout', with 'order' noted as a backwards compatible alias. I think the alternative keyword will help rather than confuse, and that the confusion is likely to come from using 'order' in the sense of 'layout' - but - hey - you know that already. > I'm just guessing and I have no idea about the c-level. I haven't done this kind of thing before - I imagine it must be fairly straightforward, but if not, then any information about that would be useful to the discussion. 
Cheers, Matthew From ralf.gommers at gmail.com Sat Apr 6 04:51:58 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 6 Apr 2013 10:51:58 +0200 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1365153645.683.38.camel@sebastian-laptop> Message-ID: On Sat, Apr 6, 2013 at 4:47 AM, Matthew Brett wrote: > Hi, > > On Fri, Apr 5, 2013 at 7:39 PM, wrote: > > > > It's not *any* cost, this goes deep and wide, it's one of the basic > > concepts of numpy that you want to rename. > > The proposal I last made was to change the default name to 'layout' > after some period to be agreed - say - P - with suitable warning in > the docstring up until that time, and after, and leave 'order' as an > alias forever. > The above paragraph is simply incorrect. Your last proposal also included deprecation warnings and a future backwards compatibility break by removing 'order'. If you now say you're not proposing steps 3 and 4 anymore, then you're back to what I called option (2) - duplicate keywords forever. Which for me is undesirable, for reasons I already mentioned. Ralf P.S. being called short-sighted and damaging numpy by responding to a proposal you now say you didn't make is pretty damn annoying. P.P.S. expect an identical response from me to future proposals that include backwards compatibility breaks of heavily used functions for something that's not a functional enhancement or bug fix. Such proposals are just not OK. P.P.P.S. I'm not sure exactly what you mean by "default keyword". If layout overrules order and layout's default value is not None, you're still proposing a backwards compatibility break. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Apr 6 11:30:06 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 6 Apr 2013 17:30:06 +0200 Subject: [Numpy-discussion] Bug in np.records? In-Reply-To: References: Message-ID: On Wed, Mar 20, 2013 at 2:57 PM, Pierre Barbier de Reuille < pierre.barbierdereuille at gmail.com> wrote: > Hey, > > I am trying to use titles for the record arrays. In the documentation, it > is specified that any column can set to "None". However, trying this fails > on numpy 1.6.2 because in np.core.records, on line 195, the "strip" method > is called on the title object. This is really annoying. Could we fix this > by replacing line 195 with: > > > self._titles = [n.strip() if n is not None else None for n in > titles[:self._nfields]] > > ? > That sounds reasonable. Ideally you'd send a pull request for this, including a regression test. Otherwise providing a self-contained example that can be turned into a test would help. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Sat Apr 6 13:18:55 2013 From: matti.picus at gmail.com (matti picus) Date: Sat, 6 Apr 2013 20:18:55 +0300 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering Message-ID: as a lurker, may I say that this discussion seems to have become non-productive? It seems all agree that docs needs improvement, perhaps a first step would be to suggest doc improvements, and then the need for renaming may become self-evident, or not. aww darn, ruined my lurker status. Matti Picus -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthew.brett at gmail.com Sat Apr 6 13:22:09 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 6 Apr 2013 10:22:09 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1365153645.683.38.camel@sebastian-laptop> Message-ID: Hi, On Sat, Apr 6, 2013 at 1:51 AM, Ralf Gommers wrote: > > > > On Sat, Apr 6, 2013 at 4:47 AM, Matthew Brett > wrote: >> >> Hi, >> >> On Fri, Apr 5, 2013 at 7:39 PM, wrote: >> > >> > It's not *any* cost, this goes deep and wide, it's one of the basic >> > concepts of numpy that you want to rename. >> >> The proposal I last made was to change the default name to 'layout' >> after some period to be agreed - say - P - with suitable warning in >> the docstring up until that time, and after, and leave 'order' as an >> alias forever. > > > The above paragraph is simply incorrect. Your last proposal also included > deprecation warnings and a future backwards compatibility break by removing > 'order'. > > If you now say you're not proposing steps 3 and 4 anymore, then you're back > to what I called option (2) - duplicate keywords forever. Which for me is > undesirable, for reasons I already mentioned. You might not have read my follow-up proposing to drop steps 3 and 4 if you felt they were unacceptable. > P.S. being called short-sighted and damaging numpy by responding to a > proposal you now say you didn't make is pretty damn annoying. No, I did make that proposal, and in the spirit of negotiation and consensus, I subsequently modified my proposal, as I hope you'd expect in this situation. I am honestly sorry that I offended you. In hindsight, although I do worry that numpy feels as if it does resist reasonable change more strongly than is healthy, I was probably responding to my feeling that you were trying to veto the discussion rather than joining it, and I really should have put it that way instead. I am sorry about that. > P.P.S. expect an identical response from me to future proposals that include > backwards compatibility breaks of heavily used functions for something > that's not a functional enhancement or bug fix. Such proposals are just not > OK. It seems to me that each change has to be considered on its merit, and strict rules of that sort are not very useful. You are again implying that this change is not important, and obviously there I don't agree. I addressed the level and timing of backwards compatibility breakage in my comments to Josef. You haven't responded to me on that. > P.P.P.S. I'm not sure exactly what you mean by "default keyword". If layout > overrules order and layout's default value is not None, you're still > proposing a backwards compatibility break. I mean, that until the expiry of some agreed period 'P' - the docstring would read def ravel(self, order='C', **kwargs) where kwargs can only contain 'layout', and 'layout', 'order' cannot both be defined and after the expiry of 'P' def ravel(self, layout='C', **kwargs) where kwargs can only contain 'order', and 'layout', 'order' cannot both be defined At least that's my proposal, I'm happy to change it if there is a better solution. 
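Written out as code, the resolution rule I have in mind is roughly the following - a sketch only, with a hypothetical helper name, and at the Python level even though the real methods live in C:

    def _resolve_order(order=None, **kwargs):
        # Transition-period rule: 'layout' and 'order' are synonyms,
        # but a single call may use only one of them.
        layout = kwargs.pop('layout', None)
        if kwargs:
            raise TypeError('unexpected keyword arguments: %s'
                            % ', '.join(sorted(kwargs)))
        if order is not None and layout is not None:
            raise TypeError("'order' and 'layout' are aliases; "
                            "pass only one of them")
        if layout is not None:
            return layout
        if order is not None:
            return order
        return 'C'  # the current default

so that a.ravel(), a.ravel(order='F') and a.ravel(layout='F') would all work, while a.ravel(order='C', layout='F') would raise.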
Cheers, Matthew From matthew.brett at gmail.com Sat Apr 6 13:39:37 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 6 Apr 2013 10:39:37 -0700 Subject: [Numpy-discussion] Numpy discussion - was: Raveling, reshape order keyword unnecessarily confuses index and memory ordering Message-ID: Hi, On Sat, Apr 6, 2013 at 10:18 AM, matti picus wrote: > as a lurker, may I say that this discussion seems to have become > non-productive? Well - the discussion is about two things - as so often the case on numpy. The first is about the particular change. The second is implicit, and keeps coming up, and that is about how change of any sort comes about in numpy. These questions keep coming up. Who decides on change? Is there a veto? Who has it? When do we vote, when do we negotiate? For example, one small part of the discussion was the lack of developers in numpy. For some, stopping these long discussions (somehow) will help recruit developers. The idea is that developers don't like these discussions, and, implicitly, that there is no problem, so the discussion is unnecessary. This is a version of the old 'no man, no problem' solution. For others, these long and unproductive discussions are pointing at a severe problem right at the heart of numpy development, which is that it is very hard to work in a system where it is unclear how decisions get made. See for example Mark Wiebe's complaint at the end here [1]. If this second lot of people are right then we have two options: a) stop the discussions, numpy decays and dies from lack of direction b) continue the discussions and hope that it becomes clear that this is indeed a serious problem, and there develops some agreement to fix it. > It seems all agree that docs needs improvement, perhaps a first step would > be to suggest doc improvements, and then the need for renaming may become > self-evident, or not. I'm sure you're right - that doing the docs first would help, and thank you for the suggestion. Cheers, Matthew [1] https://github.com/numpy/numpy.org/blob/master/NA-overview.rst From pivanov314 at gmail.com Sat Apr 6 14:16:55 2013 From: pivanov314 at gmail.com (Paul Ivanov) Date: Sat, 6 Apr 2013 11:16:55 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: Message-ID: <20130406181655.GA6177@HbI-OTOH.berkeley.edu> Hi Ralf, Ralf Gommers, on 2013-04-06 10:51, wrote: > P.P.S. expect an identical response from me to future proposals that > include backwards compatibility breaks of heavily used functions for > something that's not a functional enhancement or bug fix. Such proposals > are just not OK. but it is a functional enhancement or bug fix - the ambiguity in the effect of order= values in several places only serves to confuse two different ideas into one. -- Paul Ivanov 314 address only used for lists, off-list direct email at: http://pirsquared.org | GPG/PGP key id: 0x0F3E28F7 From fordas at uw.edu Sat Apr 6 15:03:54 2013 From: fordas at uw.edu (Alex Ford) Date: Sat, 6 Apr 2013 12:03:54 -0700 Subject: [Numpy-discussion] Fwd: Pull request #3188 - Void scalar pickling behavior. In-Reply-To: References: Message-ID: Hello, Do any core developers or users have guidance on how to resolve PR #3188 ( https://github.com/numpy/numpy/pull/3188) in relation to the pickling behavior of array scalar objects? 
To summarize, pickling array scalars with object fields, which are produced when indexing record arrays with object fields, stores the object address instead of pickling the referenced object. This behavior, obviously, results in invalid references on unpickling. Two current options are to: A) Raise an exception on pickling scalars with object fields. B) Transparently convert scalars to zero-rank arrays on pickling, which pickle properly. Unless there are objections or opinions on potential solutions, I am inclined to implement A. Thanks, Alex Ford -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Apr 6 16:25:20 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 6 Apr 2013 22:25:20 +0200 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: <20130406181655.GA6177@HbI-OTOH.berkeley.edu> References: <20130406181655.GA6177@HbI-OTOH.berkeley.edu> Message-ID: On Sat, Apr 6, 2013 at 8:16 PM, Paul Ivanov wrote: > Hi Ralf, > > Ralf Gommers, on 2013-04-06 10:51, wrote: > > P.P.S. expect an identical response from me to future proposals that > > include backwards compatibility breaks of heavily used functions for > > something that's not a functional enhancement or bug fix. Such proposals > > are just not OK. > > but it is a functional enhancement or bug fix - the ambiguity in > the affect of order= values in several places only serve to > confuse two different ideas into one. That sentence makes zero sense. The reason you can't decide whether it's a bug fix or enhancement is because it's neither. What ambiguity there is can be solved with documentation only, there's nothing new you can do with these functions after introducing a new keyword and there is no bug. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Apr 6 16:35:30 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 6 Apr 2013 22:35:30 +0200 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1365153645.683.38.camel@sebastian-laptop> Message-ID: On Sat, Apr 6, 2013 at 7:22 PM, Matthew Brett wrote: > Hi, > > On Sat, Apr 6, 2013 at 1:51 AM, Ralf Gommers > wrote: > > > > > > > > On Sat, Apr 6, 2013 at 4:47 AM, Matthew Brett > > wrote: > >> > >> Hi, > >> > >> On Fri, Apr 5, 2013 at 7:39 PM, wrote: > >> > > >> > It's not *any* cost, this goes deep and wide, it's one of the basic > >> > concepts of numpy that you want to rename. > >> > >> The proposal I last made was to change the default name to 'layout' > >> after some period to be agreed - say - P - with suitable warning in > >> the docstring up until that time, and after, and leave 'order' as an > >> alias forever. > > > > > > The above paragraph is simply incorrect. Your last proposal also included > > deprecation warnings and a future backwards compatibility break by > removing > > 'order'. > > > > If you now say you're not proposing steps 3 and 4 anymore, then you're > back > > to what I called option (2) - duplicate keywords forever. Which for me is > > undesirable, for reasons I already mentioned. > > You might not have read my follow-up proposing to drop steps 3 and 4 > if you felt they were unacceptable. > > > P.S. being called short-sighted and damaging numpy by responding to a > > proposal you now say you didn't make is pretty damn annoying. 
>
> No, I did make that proposal, and in the spirit of negotiation and
> consensus, I subsequently modified my proposal, as I hope you'd expect
> in this situation.

You have had clear NOs to the various incarnations of your proposal from
3 active developers of this community, not once but two or three times
from each of those developers. Furthermore you have got only a couple of
+0.5s; after 90 emails, no one else seems to feel that this is a change
we really have to have. Therefore I don't expect another modification of
your proposal, I expect you to drop it.

As another poster said, this thread has run its course. The technical
issues are clear, and apparently we're going to have to agree to
disagree about the seriousness of the confusion. Please please go and
fix the docs in the way you deem best, and leave it at that. And triple
please not another governance thread.

> I'm am honestly sorry that I offended you.

Thank you. I apologize as well if my tone of the last message was too
strong.

Ralf

> In hindsight, although I do
> worry that numpy feels as if it does resist reasonable change more
> strongly than is healthy, I was probably responding to my feeling that
> you were trying to veto the discussion rather than joining it, and I
> really should have put it that way instead.  I am sorry about that.

> > P.P.S. expect an identical response from me to future proposals that
> > include backwards compatibility breaks of heavily used functions for
> > something that's not a functional enhancement or bug fix. Such
> > proposals are just not OK.

> It seems to me that each change has to be considered on its merit, and
> strict rules of that sort are not very useful.  You are again implying
> that this change is not important, and obviously there I don't agree.
> I addressed the level and timing of backwards compatibility breakage
> in my comments to Josef.  You haven't responded to me on that.

> > P.P.P.S. I'm not sure exactly what you mean by "default keyword". If
> > layout overrules order and layout's default value is not None, you're
> > still proposing a backwards compatibility break.

> I mean, that until the expiry of some agreed period 'P' - the
> docstring would read
>
> def ravel(self, order='C', **kwargs)
>
> where kwargs can only contain 'layout', and 'layout', 'order' cannot
> both be defined
>
> and after the expiry of 'P'
>
> def ravel(self, layout='C', **kwargs)
>
> where kwargs can only contain 'order', and 'layout', 'order' cannot
> both be defined
>
> At least that's my proposal, I'm happy to change it if there is a
> better solution.
>
> Cheers,
>
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From njs at pobox.com Sat Apr 6 16:45:36 2013
From: njs at pobox.com (Nathaniel Smith)
Date: Sat, 6 Apr 2013 21:45:36 +0100
Subject: [Numpy-discussion] Fwd: Pull request #3188 - Void scalar pickling behavior.
In-Reply-To:
References:
Message-ID:

On Sat, Apr 6, 2013 at 8:03 PM, Alex Ford wrote:
> Hello,
>
> Do any core developers or users have guidance on how to resolve PR #3188
> (https://github.com/numpy/numpy/pull/3188) in relation to the pickling
> behavior of array scalar objects?
>
> To summarize, pickling array scalars with object fields, which are
> produced when indexing record arrays with object fields, stores the
> object address instead of pickling the referenced object. This
> behavior, obviously, results in invalid references on unpickling.
>
> Two current options are to:
> A) Raise an exception on pickling scalars with object fields.
> B) Transparently convert scalars to zero-rank arrays on pickling, which
> pickle properly.
>
> Unless there are objections or opinions on potential solutions, I am
> inclined to implement A.

Option (A) would certainly be an improvement over silently saving
corrupted data! I guess the best would be to implement proper pickling
-- which might be as simple as writing a pickle function that casts to a
zero-rank array and saves that, and an unpickler that casts back. But I
don't know if it's worth spending much effort on -- people haven't
exactly been clamoring for this functionality that I've noticed.

-n

From david.verelst at gmail.com Sat Apr 6 18:06:11 2013
From: david.verelst at gmail.com (David Verelst)
Date: Sun, 7 Apr 2013 00:06:11 +0200
Subject: [Numpy-discussion] dynamically choosing atlas libraries
In-Reply-To: <515F0506.5030506@cs.cmu.edu>
References: <515F0506.5030506@cs.cmu.edu>
Message-ID:

Hi,

Not a developer here, but I was under the impression that you can only
use the BLAS/LAPACK libraries that were chosen at build time?

As a side note: I've read [1] that OpenBLAS on some systems could
perform quite well compared to ATLAS. I used some simple benchmarks [2]
and noticed that on my system (Intel Core2 Duo P9700) there was also a
big performance gain. I built standard LAPACK against OpenBLAS. OpenBLAS
has been mentioned a few times in the last months on this list too.

[1] http://www.der-schnorz.de/2012/06/optimized-linear-algebra-and-numpyscipy/
[2] https://dpinte.wordpress.com/2010/01/15/numpy-performance-improvement-with-the-mkl/

On 5 April 2013 19:08, Edward Walter wrote:

> Hello List,
>
> We're standing up a new computational cluster and going through the
> steps of building Numpy etc.  We would like to be able to install
> multiple versions of ATLAS (with different build settings / tunings /
> etc) and have Numpy load the shared Atlas libraries dynamically based on
> the LD_LIBRARY_PATH.
>
> It's not clear to me how to do this given that the library path is hard
> coded in Numpy's site.cfg.  It seems like other people have gotten this
> working though (i.e. RHEL/Centos with their collection of Atlas versions
> {atlas, atlas-sse3, etc}).  Does anyone have any pointers on this?
>
> Thanks much.
>
> -Ed Walter
> Carnegie Mellon University
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

-------------- next part --------------
An HTML attachment was scrubbed...
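To make the zero-rank-array round trip from the pickling thread above
concrete, here is a minimal sketch (illustrative only; the variable
names are made up and this is not the code from PR #3188):

    import pickle
    import numpy as np

    # An array with an object field; indexing it yields numpy.void scalars.
    ra = np.zeros(2, dtype=[('x', np.int64), ('y', object)])
    ra['y'] = ['spam', 'eggs']
    scalar = ra[1]

    # Pickling the scalar directly is what stored a raw object address.
    # Wrapping it in a zero-rank array first pickles the referenced object:
    buf = pickle.dumps(np.asarray(scalar))

    # Unwrapping with an empty-tuple index gives a void scalar back.
    restored = pickle.loads(buf)[()]
    print(restored['y'])  # -> eggs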
URL: From matthew.brett at gmail.com Sat Apr 6 18:15:23 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 6 Apr 2013 15:15:23 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1365153645.683.38.camel@sebastian-laptop> Message-ID: Hi, On Sat, Apr 6, 2013 at 1:35 PM, Ralf Gommers wrote: > > > > On Sat, Apr 6, 2013 at 7:22 PM, Matthew Brett > wrote: >> >> Hi, >> >> On Sat, Apr 6, 2013 at 1:51 AM, Ralf Gommers >> wrote: >> > >> > >> > >> > On Sat, Apr 6, 2013 at 4:47 AM, Matthew Brett >> > wrote: >> >> >> >> Hi, >> >> >> >> On Fri, Apr 5, 2013 at 7:39 PM, wrote: >> >> > >> >> > It's not *any* cost, this goes deep and wide, it's one of the basic >> >> > concepts of numpy that you want to rename. >> >> >> >> The proposal I last made was to change the default name to 'layout' >> >> after some period to be agreed - say - P - with suitable warning in >> >> the docstring up until that time, and after, and leave 'order' as an >> >> alias forever. >> > >> > >> > The above paragraph is simply incorrect. Your last proposal also >> > included >> > deprecation warnings and a future backwards compatibility break by >> > removing >> > 'order'. >> > >> > If you now say you're not proposing steps 3 and 4 anymore, then you're >> > back >> > to what I called option (2) - duplicate keywords forever. Which for me >> > is >> > undesirable, for reasons I already mentioned. >> >> You might not have read my follow-up proposing to drop steps 3 and 4 >> if you felt they were unacceptable. >> >> > P.S. being called short-sighted and damaging numpy by responding to a >> > proposal you now say you didn't make is pretty damn annoying. >> >> No, I did make that proposal, and in the spirit of negotiation and >> consensus, I subsequently modified my proposal, as I hope you'd expect >> in this situation. > > > You have had clear NOs to the various incarnations of your proposal from 3 > active developers of this community, not once but two or three times from > each of those developers. Furthermore you have got only a couple of +0.5s, > after 90 emails no one else seems to feel that this is a change we really > have to have this change. Therefore I don't expect another modification of > your proposal, I expect you to drop it. OK - I think I have a better understanding of the 'model' now. > As another poster said, this thread has run its course. The technical issues > are clear, and apparently we're going to have to agree to disagree about the > seriousness of the confusion. Please please go and fix the docs in the way > you deem best, and leave it at that. And triple please not another > governance thread. The governance threads happen because of the lack of governance, as this thread shows. I don't agree that decisions should be taken like this (+1, -1, No!, Yes!). I think they should be taken by negotiation and agreement. You disagree, but on whose authority, I do not know, and we have no way of resolving that, because there is - no governance thread. >> I'm am honestly sorry that I offended you. > > > Thank you. I apologize as well if my tone of the last message was too > strong. 
Thank you in turn, that is generous of you,

Best,

Matthew

From ondrej.certik at gmail.com Sun Apr 7 01:03:34 2013
From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=)
Date: Sat, 6 Apr 2013 23:03:34 -0600
Subject: [Numpy-discussion] ANN: NumPy 1.7.1rc1 release
In-Reply-To:
References:
Message-ID:

On Tue, Mar 26, 2013 at 6:32 PM, Ondřej Čertík wrote:
[...]
> Yes. I created an issue here for them to test it:
>
> https://github.com/scikit-learn/scikit-learn/issues/1809
>
> Just to make sure.

There doesn't seem to be any more problems, so I am releasing 1.7.1 now.

Ondrej

From ondrej.certik at gmail.com Sun Apr 7 04:09:57 2013
From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=)
Date: Sun, 7 Apr 2013 02:09:57 -0600
Subject: [Numpy-discussion] ANN: NumPy 1.7.1 release
Message-ID:

Hi,

I'm pleased to announce the availability of the final NumPy 1.7.1 release.

Sources and binary installers can be found at
https://sourceforge.net/projects/numpy/files/NumPy/1.7.1/

Only three simple bugs were fixed since 1.7.1rc1 (#3166, #3179, #3187).

I would like to thank everybody who contributed patches since 1.7.1rc1:
Eric Fode, Nathaniel J. Smith and Charles Harris.

Cheers,
Ondrej

P.S. I'll create the Mac binary installers in a few days. Pypi is updated.

=========================
NumPy 1.7.1 Release Notes
=========================

This is a bugfix only release in the 1.7.x series.

Issues fixed
------------

gh-2973  Fix `1` is printed during numpy.test()
gh-2983  BUG: gh-2969: Backport memory leak fix 80b3a34.
gh-3007  Backport gh-3006
gh-2984  Backport fix complex polynomial fit
gh-2982  BUG: Make nansum work with booleans.
gh-2985  Backport large sort fixes
gh-3039  Backport object take
gh-3105  Backport nditer fix op axes initialization
gh-3108  BUG: npy-pkg-config ini files were missing after Bento build.
gh-3124  BUG: PyArray_LexSort allocates too much temporary memory.
gh-3131  BUG: Exported f2py_size symbol prevents linking multiple f2py modules.
gh-3117  Backport gh-2992
gh-3135  DOC: Add mention of PyArray_SetBaseObject stealing a reference
gh-3134  DOC: Fix typo in fft docs (the indexing variable is 'm', not 'n').
gh-3136  Backport #3128

Checksums
=========

9e369a96b94b107bf3fab7e07fef8557  release/installers/numpy-1.7.1-win32-superpack-python2.6.exe
0ab72b3b83528a7ae79c6df9042d61c6  release/installers/numpy-1.7.1.tar.gz
bb0d30de007d649757a2d6d2e1c59c9a  release/installers/numpy-1.7.1-win32-superpack-python3.2.exe
9a72db3cad7a6286c0d22ee43ad9bc6c  release/installers/numpy-1.7.1.zip
0842258fad82060800b8d1f0896cb83b  release/installers/numpy-1.7.1-win32-superpack-python3.1.exe
1b8f29b1fa89a801f83f551adc13aaf5  release/installers/numpy-1.7.1-win32-superpack-python2.7.exe
9ca22df942e5d5362cf7154217cb4b69  release/installers/numpy-1.7.1-win32-superpack-python2.5.exe
2fd475b893d8427e26153e03ad7d5b69  release/installers/numpy-1.7.1-win32-superpack-python3.3.exe

From jnothman at student.usyd.edu.au Sun Apr 7 04:45:13 2013
From: jnothman at student.usyd.edu.au (Joel Nothman)
Date: Sun, 7 Apr 2013 18:45:13 +1000
Subject: [Numpy-discussion] mrecarray indexing behaviour
Message-ID:

Hello all!

I am a bit confused by the behaviour of mrecarray:

>>> import numpy.ma.mrecords
>>> data = [(1, 'a'), (2, 'b')]
>>> ra = numpy.rec.fromrecords(data)
>>> mra = numpy.ma.mrecords.fromrecords(data)
>>> ra
rec.array([(1, 'a'), (2, 'b')], dtype=[('f0', '<i8'), ('f1', '|S1')])
>>> mra
masked_records(
    f0 : [1 2]
    f1 : [a b]
    fill_value : (999999, 'N')
)
>>> for a in [ra, mra]:
...     for cname, cell in [('row first', a[0]['f1']), ('field first', a['f1'][0])]:
...         print(type(a), cname, cell, type(cell), sep=', ')
...
<class 'numpy.core.records.recarray'>, row first, a, <type 'numpy.string_'>
<class 'numpy.core.records.recarray'>, field first, a, <type 'numpy.string_'>
<class 'numpy.ma.mrecords.MaskedRecords'>, row first, a, <class 'numpy.ma.core.MaskedArray'>
<class 'numpy.ma.mrecords.MaskedRecords'>, field first, a, <type 'numpy.string_'>

Why, when I index an mrecarray by offset before field, do I get a
zero-dimension MaskedArray instead of the object I get when indexing by
field then offset? This seems strange behaviour. Is it documented? It
doesn't seem to be tested; ma/tests/test_mrecords.py:79,87 asserts
equality of value but says nothing about the types that should be
returned.

Similarly:

>>> mra['f1'][0] == mra[0]['f1']
masked_array(data = True,
             mask = False,
       fill_value = True)
>>> type(mra['f1'][0]) == type(mra[0]['f1'])
False

I look forward to any explanation or clarification!

Thanks,

- Joel

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From cjwilliams43 at gmail.com Sun Apr 7 05:29:43 2013
From: cjwilliams43 at gmail.com (Colin J. Williams)
Date: Sun, 07 Apr 2013 05:29:43 -0400
Subject: [Numpy-discussion] ANN: NumPy 1.7.1rc1 release
In-Reply-To:
References:
Message-ID: <51613C87.1070400@gmail.com>

An HTML attachment was scrubbed...
URL:

From cjwilliams43 at gmail.com Sun Apr 7 05:40:09 2013
From: cjwilliams43 at gmail.com (Colin J. Williams)
Date: Sun, 07 Apr 2013 05:40:09 -0400
Subject: [Numpy-discussion] ANN: NumPy 1.7.1 release
In-Reply-To:
References:
Message-ID: <51613EF9.90401@gmail.com>

An HTML attachment was scrubbed...
URL:

From ralf.gommers at gmail.com Sun Apr 7 06:44:39 2013
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sun, 7 Apr 2013 12:44:39 +0200
Subject: [Numpy-discussion] ANN: NumPy 1.7.1 release
In-Reply-To: <51613EF9.90401@gmail.com>
References: <51613EF9.90401@gmail.com>
Message-ID:

On Sun, Apr 7, 2013 at 11:40 AM, Colin J. Williams wrote:

> On 07/04/2013 4:09 AM, Ondřej Čertík wrote:
>
> [snip]
> P.S. I'll create the Mac binary installers in a few days. Pypi is updated.
>
> [snip]
>
> Ondřej,
>
> I've seen the PyPi version *criticized on the grounds that it makes no
> provision for the linear algebra optimizations.
>
> Is this valid?*
>

All functions in the PyPi installer work, but indeed they're not
SSE2/SSE3-optimized. If you need a Windows installer and care about
performance, use one from SourceForge (or use EPD, Anaconda, Python(x,y),
etc., just not from PyPi).

Ralf

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From valentin at haenel.co Sun Apr 7 09:26:18 2013
From: valentin at haenel.co (Valentin Haenel)
Date: Sun, 7 Apr 2013 15:26:18 +0200
Subject: [Numpy-discussion] question about the data entry in the __array_interface__
Message-ID: <20130407132616.GS12177@kudu.in-berlin.de>

Hi,

I am currently working with a C extension that wraps a C library.

The library contains a function that takes, amongst others, a 'void *'
as an argument. Now, I would like for that function to be able to read
the 'data' buffer of a numpy array and thus need to pass the address
from Python down into the C-extension.

I know that the address is contained in the 'data' field of the
'__array_interface__' and is either an int or a long. My guess is that
this depends on the architecture of the system, i.e. 32 vs 64 bit
systems.

My question is: what is the correct type to use when using
PyArg_ParseTuple to convert the value. I am currently using:

    k (integer) [unsigned long]

    Convert a Python integer or long integer to a C unsigned long
    without overflow checking.
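For concreteness, a small sketch of where that integer comes from on
the Python side (illustrative only; the ctypes call at the end is an
alternative way to get a typed pointer, not necessarily what the
extension above does):

    import ctypes
    import numpy as np

    a = np.arange(4, dtype=np.float64)

    # The buffer address as a plain Python integer (an int, or a long on
    # some Python 2 builds) -- this is the value handed to the C extension.
    addr, readonly = a.__array_interface__['data']
    print(hex(addr), readonly)

    # The same address via the ctypes attribute, already typed as void*:
    ptr = a.ctypes.data_as(ctypes.c_void_p)
    assert ptr.value == addr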
The reason I chose 'k' is because it seems to be the only option that can deal with both Python int and long types. And was wondering if this is the correct choice. Also note that whatever it is, it will be cast to a 'void *' later. Thanks in advance for advice. V- FYI: the gory details can be found in: https://github.com/FrancescAlted/python-blosc/pull/16 From pav at iki.fi Sun Apr 7 10:03:46 2013 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 07 Apr 2013 17:03:46 +0300 Subject: [Numpy-discussion] question about the data entry in the __array_interface__ In-Reply-To: <20130407132616.GS12177@kudu.in-berlin.de> References: <20130407132616.GS12177@kudu.in-berlin.de> Message-ID: 07.04.2013 16:26, Valentin Haenel kirjoitti: [clip] > My question is: what is the correct type to use when using > PyArg_ParseTuple to convert the value. [clip] This is probably the most correct option: http://docs.python.org/2/c-api/long.html#PyLong_AsVoidPtr Not via ParseTuple, though. -- Pauli Virtanen From bahtiyor_zohidov at mail.ru Sun Apr 7 10:32:11 2013 From: bahtiyor_zohidov at mail.ru (=?UTF-8?B?SGFwcHltYW4=?=) Date: Sun, 07 Apr 2013 18:32:11 +0400 Subject: [Numpy-discussion] =?utf-8?q?Sources_more_confusing_in_Python?= Message-ID: <1365345131.168242828@f226.mail.ru> Hello, I started using python 4-5 months ago. At that time I didn't realize there are incredibly many resource like modules, additional programs (ready one) in python. The problem is to which one I can get all I want "properly". I mean where (exact place) I can download standard modules without going other links?? For example, Excel python module, Image processing module, something module..Every time I get modules from different links.. Is there exact place (stable) to get simply rather than picking/jumping from one to another site?? Any answer is appreciated -------------- next part -------------- An HTML attachment was scrubbed... URL: From shish at keba.be Sun Apr 7 10:41:00 2013 From: shish at keba.be (Olivier Delalleau) Date: Sun, 7 Apr 2013 10:41:00 -0400 Subject: [Numpy-discussion] Sources more confusing in Python In-Reply-To: <1365345131.168242828@f226.mail.ru> References: <1365345131.168242828@f226.mail.ru> Message-ID: The Python Package Index (https://pypi.python.org/pypi) is to my knowledge the largest centralized source of Python packages. That's where easy_install and pip typically fetch them so that you can install from the command line without manual download. -=- Olivier 2013/4/7 Happyman > Hello, > > I started using python 4-5 months ago. At that time I didn't realize there > are incredibly many resource like modules, additional programs (ready one) > in python. The problem is to which one I can get all I want "properly". I > mean where (exact place) I can download standard modules without going > other links?? For example, Excel python module, Image processing module, > something module..Every time I get modules from different links.. > > Is there exact place (stable) to get simply rather than picking/jumping > from one to another site?? > > Any answer is appreciated > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From bahtiyor_zohidov at mail.ru Sun Apr 7 10:53:58 2013
From: bahtiyor_zohidov at mail.ru (=?UTF-8?B?SGFwcHltYW4=?=)
Date: Sun, 07 Apr 2013 18:53:58 +0400
Subject: [Numpy-discussion] =?utf-8?q?Sources_more_confusing_in_Python?=
References: <1365345131.168242828@f226.mail.ru>
Message-ID: <1365346438.41186980@f44.mail.ru>

thanks Olivier

I completed my Registration.. but still no idea how to use it.. it is,
maybe, lack of experience of python.. could you show the basic steps??

Sunday, 7 April 2013, 10:41 -04:00, from Olivier Delalleau:
>The Python Package Index ( https://pypi.python.org/pypi ) is to my
>knowledge the largest centralized source of Python packages. That's
>where easy_install and pip typically fetch them so that you can install
>from the command line without manual download.
>
>-=- Olivier
>
>2013/4/7 Happyman < bahtiyor_zohidov at mail.ru >
>>Hello,
>>
>>I started using python 4-5 months ago. At that time I didn't realize
>>there are incredibly many resource like modules, additional programs
>>(ready one) in python. The problem is to which one I can get all I want
>>"properly". I mean where (exact place) I can download standard modules
>>without going other links?? For example, Excel python module, Image
>>processing module, something module.. Every time I get modules from
>>different links..
>>
>>Is there exact place (stable) to get simply rather than picking/jumping
>>from one to another site??
>>
>>Any answer is appreciated
>>
>>_______________________________________________
>>NumPy-Discussion mailing list
>>NumPy-Discussion at scipy.org
>>http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>_______________________________________________
>NumPy-Discussion mailing list
>NumPy-Discussion at scipy.org
>http://mail.scipy.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From davidmenhur at gmail.com Sun Apr 7 11:06:27 2013
From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=)
Date: Sun, 7 Apr 2013 17:06:27 +0200
Subject: [Numpy-discussion] Sources more confusing in Python
In-Reply-To: <1365346438.41186980@f44.mail.ru>
References: <1365345131.168242828@f226.mail.ru> <1365346438.41186980@f44.mail.ru>
Message-ID:

On 7 April 2013 16:53, Happyman wrote:

> I completed my Registration.. but still no idea how to use it.. it is,
> maybe, lack of experience of python..
>

You don't need registration. The easiest way to use it is via pip or
easy_install. If you are on Windows, download and execute the
easy_install installer, if you are on Linux, browse your repository for
any of them.

Once installed, you can do:

$easy_install pip  # to install pip, easiest way to get it on Windows

$pip install numpy # to install package "numpy"

All this, in the terminal. To get it on Windows, run -> cmd

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From cjwilliams43 at gmail.com Sun Apr 7 11:34:58 2013
From: cjwilliams43 at gmail.com (Colin J. Williams)
Date: Sun, 07 Apr 2013 11:34:58 -0400
Subject: [Numpy-discussion] Sources more confusing in Python
In-Reply-To: <1365345131.168242828@f226.mail.ru>
References: <1365345131.168242828@f226.mail.ru>
Message-ID: <51619222.1010302@gmail.com>

An HTML attachment was scrubbed...
URL:

From sergio_r at mail.com Sun Apr 7 14:28:33 2013
From: sergio_r at mail.com (Sergio Rojas)
Date: Sun, 07 Apr 2013 14:28:33 -0400
Subject: [Numpy-discussion] converting "numpy.bytes_" from numpy 1.7.0 to "numpy.string_" of numpy 1.6.1
Message-ID: <20130407182833.32510@gmx.com>

Dear all,

I am using a function which under python 2.7.2 and numpy 1.6.1
returns a "list" (called d) whose elements are of type
"numpy.string_" (see below).

Under python 3.3.0 and numpy 1.7.0 the same function returns the
list as an object of type "builtins.list" whose elements are
of type "numpy.bytes_".

I am trying to find a way to have the function return the same type
of object in both versions of python. Accordingly, I am wondering if
there is a function which could convert the "numpy.bytes_" into
"numpy.string_". The closest I have found is using
"str(d[1]).lstrip('b')" (see below), but it wraps the contents of
d[1] in double quotes.

In python 2.7.2, numpy 1.6.1:

In [6]: type(d)
Out[6]: list

In [7]: d[1]
Out[7]: '05-Jan-00'

In [8]: type(d[1])
Out[8]: numpy.string_

In [9]: str(d[1]).lstrip('b')
Out[9]: '05-Jan-00'

In [10]: str(d[1]).lstrip('b')[0]
Out[10]: '0'

(I want to have extra double quotes enclosing the string stored in d[1].)

In python 3.3.0, numpy 1.7.0:

In [5]: type(d)
Out[5]: builtins.list

In [6]: d[1]
Out[6]: b'05-Jan-00'

In [7]: type(d[1])
Out[7]: numpy.bytes_

In [10]: str(d[1]).lstrip('b')
Out[10]: "'05-Jan-00'"

In [11]: str(d[1]).lstrip('b')[0]
Out[11]: "'"

(I want to delete the double quotes enclosing the string stored in d[1].)

Thanks in advance for your help,

Sergio

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ecarlson at eng.ua.edu Sun Apr 7 14:30:29 2013
From: ecarlson at eng.ua.edu (Eric Carlson)
Date: Sun, 07 Apr 2013 13:30:29 -0500
Subject: [Numpy-discussion] Sources more confusing in Python
In-Reply-To: <1365345131.168242828@f226.mail.ru>
References: <1365345131.168242828@f226.mail.ru>
Message-ID:

Hello,

For most people who want to be doing amazing things through python with
little fuss, you'd probably be better off downloading a comprehensive
distribution that includes many useful modules. Some examples of several -

For windows:
Pythonxy - http://code.google.com/p/pythonxy/wiki/Downloads
Enthought Python Distribution (EPD, commercial) - http://www.enthought.com/products/epd.php
A La Carte HPC Scientific downloads - http://www.lfd.uci.edu/~gohlke/pythonlibs/

For OSX or Linux:
EPD

Sage - http://www.sagemath.org/index.html

At the very least, you can go to these sites and identify
packages/modules that may be of interest to you, then go to PyPI for
more details.

Cheers,
Eric

From ralf.gommers at gmail.com Sun Apr 7 16:09:07 2013
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sun, 7 Apr 2013 22:09:07 +0200
Subject: [Numpy-discussion] ANN: SciPy 0.12.0 release
Message-ID:

We are pleased to announce the availability of SciPy 0.12.0. This
release has some cool new features (see highlights below) and a large
amount of bug fixes and maintenance work under the hood. The number of
contributors also keeps rising steadily - 75 people contributed patches
to this release. We hope to see this trend continue.

Some of the highlights of this release are:

- Completed QHull wrappers in scipy.spatial.
- cKDTree now a drop-in replacement for KDTree.
- A new global optimizer, basinhopping.
- Support for Python 2 and Python 3 from the same code base (no more 2to3).

This release requires Python 2.6, 2.7 or 3.1-3.3 and NumPy 1.5.1 or greater.
Support for Python 2.4 and 2.5 has been dropped as of this release.

Sources and binaries can be found at
http://sourceforge.net/projects/scipy/files/scipy/0.12.0/,
release notes are copied below.

Enjoy,

The SciPy developers

==========================
SciPy 0.12.0 Release Notes
==========================

SciPy 0.12.0 is the culmination of 7 months of hard work. It contains
many new features, numerous bug-fixes, improved test coverage and better
documentation. There have been a number of deprecations and API changes
in this release, which are documented below. All users are encouraged to
upgrade to this release, as there are a large number of bug-fixes and
optimizations. Moreover, our development attention will now shift to
bug-fix releases on the 0.12.x branch, and on adding new features on the
master branch.

Some of the highlights of this release are:

- Completed QHull wrappers in scipy.spatial.
- cKDTree now a drop-in replacement for KDTree.
- A new global optimizer, basinhopping.
- Support for Python 2 and Python 3 from the same code base (no more 2to3).

This release requires Python 2.6, 2.7 or 3.1-3.3 and NumPy 1.5.1 or
greater. Support for Python 2.4 and 2.5 has been dropped as of this
release.

New features
============

``scipy.spatial`` improvements
------------------------------

cKDTree feature-complete
^^^^^^^^^^^^^^^^^^^^^^^^

Cython version of KDTree, cKDTree, is now feature-complete. Most
operations (construction, query, query_ball_point, query_pairs,
count_neighbors and sparse_distance_matrix) are between 200 and 1000
times faster in cKDTree than in KDTree. With very minor caveats,
cKDTree has exactly the same interface as KDTree, and can be used as a
drop-in replacement.

Voronoi diagrams and convex hulls
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

`scipy.spatial` now contains functionality for computing Voronoi
diagrams and convex hulls using the Qhull library. (Delaunay
triangulation was available since Scipy 0.9.0.)

Delaunay improvements
^^^^^^^^^^^^^^^^^^^^^

It's now possible to pass in custom Qhull options in Delaunay
triangulation. Coplanar points are now also recorded, if present.
Incremental construction of Delaunay triangulations is now also possible.

Spectral estimators (``scipy.signal``)
--------------------------------------

The functions ``scipy.signal.periodogram`` and ``scipy.signal.welch``
were added, providing DFT-based spectral estimators.

``scipy.optimize`` improvements
-------------------------------

Callback functions in L-BFGS-B and TNC
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A callback mechanism was added to L-BFGS-B and TNC minimization solvers.

Basin hopping global optimization (``scipy.optimize.basinhopping``)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A new global optimization algorithm. Basinhopping is designed to
efficiently find the global minimum of a smooth function.

``scipy.special`` improvements
------------------------------

Revised complex error functions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The computation of special functions related to the error function now
uses a new `Faddeeva library from MIT <http://ab-initio.mit.edu/Faddeeva>`__
which increases their numerical precision. The scaled and imaginary
error functions ``erfcx`` and ``erfi`` were also added, and the Dawson
integral ``dawsn`` can now be evaluated for a complex argument.

Faster orthogonal polynomials
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Evaluation of orthogonal polynomials (the ``eval_*`` routines) is now
faster in ``scipy.special``, and their ``out=`` argument functions
properly.
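As a quick illustration of two of the additions described above (a
sketch, not part of the official release notes)::

    import numpy as np
    from scipy import special
    from scipy.optimize import basinhopping

    # Faddeeva-based error functions: erfcx(x) = exp(x**2) * erfc(x)
    # stays finite where plain erfc has long since underflowed, and
    # dawsn now accepts complex arguments.
    xs = np.array([0.5, 2.0, 5.0])
    print(special.erfcx(xs))
    print(special.erfi(xs))
    print(special.dawsn(1.0 + 2.0j))

    # The new basinhopping global optimizer, on a made-up 1-d objective
    # with several local minima:
    f = lambda x: np.cos(14.5 * x - 0.3) + (x + 0.2) * x
    res = basinhopping(f, x0=1.0, niter=100,
                       minimizer_kwargs={"method": "BFGS"})
    print(res.x, res.fun)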
``scipy.sparse.linalg`` features
--------------------------------

- In ``scipy.sparse.linalg.spsolve``, the ``b`` argument can now be
  either a vector or a matrix.
- ``scipy.sparse.linalg.inv`` was added. This uses ``spsolve`` to
  compute a sparse matrix inverse.
- ``scipy.sparse.linalg.expm`` was added. This computes the exponential
  of a sparse matrix using a similar algorithm to the existing dense
  array implementation in ``scipy.linalg.expm``.

Listing Matlab(R) file contents in ``scipy.io``
-----------------------------------------------

A new function ``whosmat`` is available in ``scipy.io`` for inspecting
contents of MAT files without reading them to memory.

Documented BLAS and LAPACK low-level interfaces (``scipy.linalg``)
------------------------------------------------------------------

The modules `scipy.linalg.blas` and `scipy.linalg.lapack` can be used
to access low-level BLAS and LAPACK functions.

Polynomial interpolation improvements (``scipy.interpolate``)
-------------------------------------------------------------

The barycentric, Krogh, piecewise and pchip polynomial interpolators in
``scipy.interpolate`` now accept an ``axis`` argument.

Deprecated features
===================

`scipy.lib.lapack`
------------------

The module `scipy.lib.lapack` is deprecated. You can use
`scipy.linalg.lapack` instead. The module `scipy.lib.blas` was
deprecated earlier in Scipy 0.10.0.

`fblas` and `cblas`
-------------------

Accessing the modules `scipy.linalg.fblas`, `cblas`, `flapack`,
`clapack` is deprecated. Instead, use the modules `scipy.linalg.lapack`
and `scipy.linalg.blas`.

Backwards incompatible changes
==============================

Removal of ``scipy.io.save_as_module``
--------------------------------------

The function ``scipy.io.save_as_module`` was deprecated in Scipy
0.11.0, and is now removed. Its private support modules
``scipy.io.dumbdbm_patched`` and ``scipy.io.dumb_shelve`` are also
removed.

Other changes
=============

Authors
=======

* Anton Akhmerov +
* Alexander Eberspächer +
* Anne Archibald
* Jisk Attema +
* K.-Michael Aye +
* bemasc +
* Sebastian Berg +
* François Boulogne +
* Matthew Brett
* Lars Buitinck
* Steven Byrnes +
* Tim Cera +
* Christian +
* Keith Clawson +
* David Cournapeau
* Nathan Crock +
* endolith
* Bradley M. Froehle +
* Matthew R Goodman
* Christoph Gohlke
* Ralf Gommers
* Robert David Grant +
* Yaroslav Halchenko
* Charles Harris
* Jonathan Helmus
* Andreas Hilboll
* Hugo +
* Oleksandr Huziy
* Jeroen Demeyer +
* Johannes Schönberger +
* Steven G. Johnson +
* Chris Jordan-Squire
* Jonathan Taylor +
* Niklas Kroeger +
* Jerome Kieffer +
* kingson +
* Josh Lawrence
* Denis Laxalde
* Alex Leach +
* Tim Leslie
* Richard Lindsley +
* Lorenzo Luengo +
* Stephen McQuay +
* MinRK
* Sturla Molden +
* Eric Moore +
* mszep +
* Matt Newville +
* Vlad Niculae
* Travis Oliphant
* David Parker +
* Fabian Pedregosa
* Josef Perktold
* Zach Ploskey +
* Alex Reinhart +
* Gilles Rochefort +
* Ciro Duran Santillli +
* Jan Schlueter +
* Jonathan Scholz +
* Anthony Scopatz
* Skipper Seabold
* Fabrice Silva +
* Scott Sinclair
* Jacob Stevenson +
* Sturla Molden +
* Julian Taylor +
* thorstenkranz +
* John Travers +
* True Price +
* Nicky van Foreest
* Jacob Vanderplas
* Patrick Varilly
* Daniel Velkov +
* Pauli Virtanen
* Stefan van der Walt
* Warren Weckesser

A total of 75 people contributed to this release.
People with a "+" by their names contributed a patch for the first time.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From chris.barker at noaa.gov Sun Apr 7 16:49:41 2013
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Sun, 7 Apr 2013 13:49:41 -0700
Subject: [Numpy-discussion] question about the data entry in the __array_interface__
In-Reply-To: <20130407132616.GS12177@kudu.in-berlin.de>
References: <20130407132616.GS12177@kudu.in-berlin.de>
Message-ID:

On Sun, Apr 7, 2013 at 6:26 AM, Valentin Haenel wrote:
> I am currently working with a C extension that wraps a C library.
>
> The library contains a function that takes, amongst others, a 'void *'
> as an argument. Now, I would like for that function to be able to read
> the 'data' buffer of a numpy array and thus need to pass the address
> from Python down into the C-extension.

I can't help but put in a plug for Cython for this kind of thing:

www.cython.org

Cython comes with built-in support for numpy arrays, so they have
figured all this out for you.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From chris.barker at noaa.gov Sun Apr 7 16:57:22 2013
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Sun, 7 Apr 2013 13:57:22 -0700
Subject: [Numpy-discussion] converting "numpy.bytes_" from numpy 1.7.0 to "numpy.string_" of numpy 1.6.1
In-Reply-To: <20130407182833.32510@gmx.com>
References: <20130407182833.32510@gmx.com>
Message-ID:

On Sun, Apr 7, 2013 at 11:28 AM, Sergio Rojas wrote:

> I am using a function which under python 2.7.2 and numpy 1.6.1
> returns a "list" (called d) whose elements are of type
> "numpy.string_" (see below).
>
> Under python 3.3.0 and numpy 1.7.0 the same function returns the
> list as an object of type "builtins.list" whose elements are
> of type "numpy.bytes_".

First, just to be clear, a "list" in py2 is a "builtins.list" in py3.

But the more interesting part: a "string" in py2 is a "bytes_" object
in py3.

In py2, a string was originally a one-byte-per-character ANSI string -
i.e. a char* under the hood, and identical to a string of bytes. In
py3, a "string" is a unicode string (same as a unicode object in py2).
To accommodate this, the bytes object was introduced (also in py2, but
in py2 a bytes and a string are the same thing...).

A numpy "string" type is more like the py2 string -- i.e. a string of
single bytes. This is why you get bytes objects from it. Numpy has no
unicode support built in.

So what you've got is probably what you want. If you do want py3
strings (i.e. unicode), you'll need to decode the numpy bytes object to
make a string. Sorry, I'm not running py3, so not sure of the exact
syntax, but in py2, it's something like:

list_of_unicode_strings = [b.decode('ascii') for b in list_of_bytes_objects]

HTH,

-Chris

--
Christopher Barker, Ph.D.
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From chris.barker at noaa.gov Sun Apr 7 17:02:11 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Sun, 7 Apr 2013 14:02:11 -0700 Subject: [Numpy-discussion] Sources more confusing in Python In-Reply-To: References: <1365345131.168242828@f226.mail.ru> <1365346438.41186980@f44.mail.ru> Message-ID: On Sun, Apr 7, 2013 at 8:06 AM, Da?id wrote: > On 7 April 2013 16:53, Happyman wrote: > $pip install numpy # to install package "numpy" as a warning, last I checked pip did not support binary installs, and you really want a binary installer for numpy/scipy, unless you've got your development environment set up just the way you like it. That requires some significant experience, particularly on Windows. So: the lots-of-stuff-at-once distributions are a great option. Otherwise, go to the PyPi web site, search for what you want, and on the package page there is usually a link to main package website, and/or its download site, on which you should find binary installers for the more common, complex packages. You can get binaries for numpy and scipy from the sourceforge site linked to from scipy.org. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From njs at pobox.com Sun Apr 7 17:07:53 2013 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 7 Apr 2013 22:07:53 +0100 Subject: [Numpy-discussion] question about the data entry in the __array_interface__ In-Reply-To: <20130407132616.GS12177@kudu.in-berlin.de> References: <20130407132616.GS12177@kudu.in-berlin.de> Message-ID: On Sun, Apr 7, 2013 at 2:26 PM, Valentin Haenel wrote: > I know that the address is contained in the 'data' field of the > '__array_interface__' and is either an int or a long. My guess is that > this depends on the architecture of the system, i.e. 32 vs 64 bit > systems. > > My question is: what is the correct type to use when using > PyArg_ParseTuple to convert the value. I am currently using: > > k (integer) [unsigned long] > > Convert a Python integer or long integer to a C unsigned long > without overflow checking. > > The reason I chose 'k' is because it seems to be the only option that > can deal with both Python int and long types. And was wondering if this > is the correct choice. Also note that whatever it is, it will be cast to > a 'void *' later. This won't work on Win64, where C 'long' is 32 bits, but void* is 64 bits. -n From waterbug at pangalactic.us Sun Apr 7 17:25:33 2013 From: waterbug at pangalactic.us (Steve Waterbury) Date: Sun, 07 Apr 2013 17:25:33 -0400 Subject: [Numpy-discussion] Sources more confusing in Python In-Reply-To: References: <1365345131.168242828@f226.mail.ru> <1365346438.41186980@f44.mail.ru> Message-ID: <5161E44D.1000101@pangalactic.us> On 04/07/2013 05:02 PM, Chris Barker - NOAA Federal wrote: > On Sun, Apr 7, 2013 at 8:06 AM, Da?id wrote: >> On 7 April 2013 16:53, Happyman wrote: > >> $pip install numpy # to install package "numpy" > > as a warning, last I checked pip did not support binary installs ... Guess you didn't check very recently ;) -- pip does indeed support binary installs. 
It's trivial to check: waterbug at boson:~$ mkvirtualenv pip-test New python executable in pip-test/bin/python Installing distribute.............................................................................................................................................................................................done. Installing pip...............done. (pip-test)waterbug at boson:~$ cdvirtualenv (pip-test)waterbug at boson:~/envs/pip-test$ pip install PIL Downloading/unpacking PIL Downloading PIL-1.1.7.tar.gz (506Kb): 506Kb downloaded Running setup.py egg_info for package PIL WARNING: '' not a valid package name; please use only.-separated package names in setup.py Installing collected packages: PIL Running setup.py install for PIL WARNING: '' not a valid package name; please use only.-separated package names in setup.py building '_imaging' extension gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -DHAVE_LIBJPEG -DHAVE_LIBZ -I/usr/local/include/freetype2 -IlibImaging -I/home/waterbug/envs/pip-test/include -I/usr/local/include -I/usr/include -I/usr/include/python2.7 -c _imaging.c -o build/temp.linux-i686-2.7/_imaging.o -------------------------------------------------------------------- PIL 1.1.7 SETUP SUMMARY -------------------------------------------------------------------- version 1.1.7 platform linux2 2.7.3 (default, Sep 26 2012, 21:53:58) [GCC 4.7.2] -------------------------------------------------------------------- *** TKINTER support not available (Tcl/Tk 8.5 libraries needed) --- JPEG support available --- ZLIB (PNG/ZIP) support available --- FREETYPE2 support available *** LITTLECMS support not available -------------------------------------------------------------------- To add a missing option, make sure you have the required library, and set the corresponding ROOT variable in the setup.py script. To check the build, run the selftest.py script. changing mode of build/scripts-2.7/pilfont.py from 664 to 775 changing mode of build/scripts-2.7/pildriver.py from 664 to 775 changing mode of build/scripts-2.7/pilfile.py from 664 to 775 changing mode of build/scripts-2.7/pilconvert.py from 664 to 775 changing mode of build/scripts-2.7/pilprint.py from 664 to 775 changing mode of /home/waterbug/envs/pip-test/bin/pilfont.py to 775 changing mode of /home/waterbug/envs/pip-test/bin/pildriver.py to 775 changing mode of /home/waterbug/envs/pip-test/bin/pilfile.py to 775 changing mode of /home/waterbug/envs/pip-test/bin/pilconvert.py to 775 changing mode of /home/waterbug/envs/pip-test/bin/pilprint.py to 775 Successfully installed PIL Cleaning up... (pip-test)waterbug at boson:~/envs/pip-test$ python Python 2.7.3 (default, Sep 26 2012, 21:53:58) [GCC 4.7.2] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import imaging
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named imaging
>>> import PIL
>>> PIL.__file__
'/home/waterbug/envs/pip-test/local/lib/python2.7/site-packages/PIL/__init__.pyc'
>>>

Cheers,
Steve

From njs at pobox.com Sun Apr 7 17:30:14 2013
From: njs at pobox.com (Nathaniel Smith)
Date: Sun, 7 Apr 2013 22:30:14 +0100
Subject: [Numpy-discussion] Sources more confusing in Python
In-Reply-To: <5161E44D.1000101@pangalactic.us>
References: <1365345131.168242828@f226.mail.ru> <1365346438.41186980@f44.mail.ru> <5161E44D.1000101@pangalactic.us>
Message-ID:

On Sun, Apr 7, 2013 at 10:25 PM, Steve Waterbury wrote:
> On 04/07/2013 05:02 PM, Chris Barker - NOAA Federal wrote:
>> On Sun, Apr 7, 2013 at 8:06 AM, Daπid wrote:
>>> On 7 April 2013 16:53, Happyman wrote:
>>
>>> $pip install numpy # to install package "numpy"
>>
>> as a warning, last I checked pip did not support binary installs ...
>
> Guess you didn't check very recently ;) -- pip does indeed
> support binary installs.

Binary install in this case means, downloading a pre-built package
containing .so/.dll files -- very useful if you don't have a working C
compiler environment on the system you're installing onto.

-n

From waterbug at pangalactic.us Sun Apr 7 17:34:57 2013
From: waterbug at pangalactic.us (Steve Waterbury)
Date: Sun, 07 Apr 2013 17:34:57 -0400
Subject: [Numpy-discussion] Sources more confusing in Python
In-Reply-To:
References: <1365345131.168242828@f226.mail.ru> <1365346438.41186980@f44.mail.ru> <5161E44D.1000101@pangalactic.us>
Message-ID: <5161E681.7040808@pangalactic.us>

On 04/07/2013 05:30 PM, Nathaniel Smith wrote:
> On Sun, Apr 7, 2013 at 10:25 PM, Steve Waterbury
> wrote:
>> On 04/07/2013 05:02 PM, Chris Barker - NOAA Federal wrote:
>>> On Sun, Apr 7, 2013 at 8:06 AM, Daπid wrote:
>>>> On 7 April 2013 16:53, Happyman wrote:
>>>
>>>> $pip install numpy # to install package "numpy"
>>>
>>> as a warning, last I checked pip did not support binary installs ...
>>
>> Guess you didn't check very recently ;) -- pip does indeed
>> support binary installs.
>
> Binary install in this case means, downloading a pre-built package
> containing .so/.dll files -- very useful if you don't have a working C
> compiler environment on the system you're installing onto.

Point taken -- just didn't want pip to be sold short.
I'm one of those spoiled Linux people, obviously ... ;)

Steve

From cournape at gmail.com Sun Apr 7 17:40:01 2013
From: cournape at gmail.com (David Cournapeau)
Date: Sun, 7 Apr 2013 22:40:01 +0100
Subject: [Numpy-discussion] [numfocus] Growing the contributor base of Numpy
In-Reply-To:
References:
Message-ID:

I have prepared a preliminary proposal
https://github.com/enthought/davidc-scipy-2013/blob/master/proposal.rst

Roughly, after ensuring everybody knows how to build numpy from
sources in a dev-friendly way, I was thinking about
- describing the source code tree organization
- talk about the main data structures (array + dtype), and how they
relate to the python runtime
- more details about dtype: use basic array operations to describe
the whole mechanism, and how to create a simple one (using wrapping
float128 as an example)

I also intended to give a few tips regarding tools (e.g. how to track
a python call to its core C implementation).

Stéfan van der Walt agreed in principle to contribute, but his
participation is still very conditional.
David From josef.pktd at gmail.com Sun Apr 7 17:45:09 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 7 Apr 2013 17:45:09 -0400 Subject: [Numpy-discussion] Sources more confusing in Python In-Reply-To: <5161E681.7040808@pangalactic.us> References: <1365345131.168242828@f226.mail.ru> <1365346438.41186980@f44.mail.ru> <5161E44D.1000101@pangalactic.us> <5161E681.7040808@pangalactic.us> Message-ID: On Sun, Apr 7, 2013 at 5:34 PM, Steve Waterbury wrote: > On 04/07/2013 05:30 PM, Nathaniel Smith wrote: >> On Sun, Apr 7, 2013 at 10:25 PM, Steve Waterbury >> wrote: >>> On 04/07/2013 05:02 PM, Chris Barker - NOAA Federal wrote: >>>> On Sun, Apr 7, 2013 at 8:06 AM, Da?id wrote: >>>>> On 7 April 2013 16:53, Happyman wrote: >>>> >>>>> $pip install numpy # to install package "numpy" >>>> >>>> as a warning, last I checked pip did not support binary installs ... >>> >>> Guess you didn't check very recently ;) -- pip does indeed >>> support binary installs. >> >> Binary install in this case means, downloading a pre-built package >> containing .so/.dll files -- very useful if you don't have a working C >> compiler environment on the system you're installing onto. > > Point taken -- just didn't want pip to be sold short. > I'm one of those spoiled Linux people, obviously ... ;) However, pip is really awful on Windows. If you have a virtualenv and you use --upgrade, it wants to upgrade all package dependencies (!), but it doesn't know how (with numpy and scipy). (easy_install was so much nicer.) Josef > > Steve > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From shish at keba.be Sun Apr 7 17:49:25 2013 From: shish at keba.be (Olivier Delalleau) Date: Sun, 7 Apr 2013 17:49:25 -0400 Subject: [Numpy-discussion] Sources more confusing in Python In-Reply-To: References: <1365345131.168242828@f226.mail.ru> <1365346438.41186980@f44.mail.ru> <5161E44D.1000101@pangalactic.us> <5161E681.7040808@pangalactic.us> Message-ID: 2013/4/7 > On Sun, Apr 7, 2013 at 5:34 PM, Steve Waterbury > wrote: > > On 04/07/2013 05:30 PM, Nathaniel Smith wrote: > >> On Sun, Apr 7, 2013 at 10:25 PM, Steve Waterbury > >> wrote: > >>> On 04/07/2013 05:02 PM, Chris Barker - NOAA Federal wrote: > >>>> On Sun, Apr 7, 2013 at 8:06 AM, Da?id wrote: > >>>>> On 7 April 2013 16:53, Happyman wrote: > >>>> > >>>>> $pip install numpy # to install package "numpy" > >>>> > >>>> as a warning, last I checked pip did not support binary installs ... > >>> > >>> Guess you didn't check very recently ;) -- pip does indeed > >>> support binary installs. > >> > >> Binary install in this case means, downloading a pre-built package > >> containing .so/.dll files -- very useful if you don't have a working C > >> compiler environment on the system you're installing onto. > > > > Point taken -- just didn't want pip to be sold short. > > I'm one of those spoiled Linux people, obviously ... ;) > > However, pip is really awful on Windows. > > If you have a virtualenv and you use --upgrade, it wants to upgrade all > package dependencies (!), but it doesn't know how (with numpy and scipy). > > (easy_install was so much nicer.) > > Josef > You can use --no-deps to prevent pip from trying to upgrade dependencies. -=- Olivier -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Sun Apr 7 17:59:10 2013 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 7 Apr 2013 22:59:10 +0100 Subject: [Numpy-discussion] Sources more confusing in Python In-Reply-To: References: <1365345131.168242828@f226.mail.ru> <1365346438.41186980@f44.mail.ru> <5161E44D.1000101@pangalactic.us> <5161E681.7040808@pangalactic.us> Message-ID: On Sun, Apr 7, 2013 at 10:49 PM, Olivier Delalleau wrote: > 2013/4/7 >> >> On Sun, Apr 7, 2013 at 5:34 PM, Steve Waterbury >> wrote: >> > On 04/07/2013 05:30 PM, Nathaniel Smith wrote: >> >> On Sun, Apr 7, 2013 at 10:25 PM, Steve Waterbury >> >> wrote: >> >>> On 04/07/2013 05:02 PM, Chris Barker - NOAA Federal wrote: >> >>>> On Sun, Apr 7, 2013 at 8:06 AM, Da?id wrote: >> >>>>> On 7 April 2013 16:53, Happyman wrote: >> >>>> >> >>>>> $pip install numpy # to install package "numpy" >> >>>> >> >>>> as a warning, last I checked pip did not support binary installs ... >> >>> >> >>> Guess you didn't check very recently ;) -- pip does indeed >> >>> support binary installs. >> >> >> >> Binary install in this case means, downloading a pre-built package >> >> containing .so/.dll files -- very useful if you don't have a working C >> >> compiler environment on the system you're installing onto. >> > >> > Point taken -- just didn't want pip to be sold short. >> > I'm one of those spoiled Linux people, obviously ... ;) >> >> However, pip is really awful on Windows. >> >> If you have a virtualenv and you use --upgrade, it wants to upgrade all >> package dependencies (!), but it doesn't know how (with numpy and scipy). >> >> (easy_install was so much nicer.) >> >> Josef > > > You can use --no-deps to prevent pip from trying to upgrade dependencies. This is only a partial workaround, because this also means that if there *are* new needed dependencies, they get ignored, resulting in a possibly broken install. IIRC the full workaround is 'pip install --no-deps --upgrade foo; pip install foo' The other annoying workaround is to instead of using --upgrade, do something like 'pip install numpy==1.7.1'. This requires knowing (or looking up) what the latest version is, but once you've done that it works. -n From waterbug at pangalactic.us Sun Apr 7 19:08:45 2013 From: waterbug at pangalactic.us (Steve Waterbury) Date: Sun, 07 Apr 2013 19:08:45 -0400 Subject: [Numpy-discussion] Sources more confusing in Python In-Reply-To: References: <1365345131.168242828@f226.mail.ru> <1365346438.41186980@f44.mail.ru> <5161E44D.1000101@pangalactic.us> <5161E681.7040808@pangalactic.us> Message-ID: <5161FC7D.8030409@pangalactic.us> On 04/07/2013 05:59 PM, Nathaniel Smith wrote: > On Sun, Apr 7, 2013 at 10:49 PM, Olivier Delalleau wrote: >> 2013/4/7 >>> >>> On Sun, Apr 7, 2013 at 5:34 PM, Steve Waterbury >>> wrote: >>>> On 04/07/2013 05:30 PM, Nathaniel Smith wrote: >>>>> On Sun, Apr 7, 2013 at 10:25 PM, Steve Waterbury >>>>> wrote: >>>>>> On 04/07/2013 05:02 PM, Chris Barker - NOAA Federal wrote: >>>>>>> On Sun, Apr 7, 2013 at 8:06 AM, Da?id wrote: >>>>>>>> On 7 April 2013 16:53, Happyman wrote: >>>>>>> >>>>>>>> $pip install numpy # to install package "numpy" >>>>>>> >>>>>>> as a warning, last I checked pip did not support binary installs ... >>>>>> >>>>>> Guess you didn't check very recently ;) -- pip does indeed >>>>>> support binary installs. >>>>> >>>>> Binary install in this case means, downloading a pre-built package >>>>> containing .so/.dll files -- very useful if you don't have a working C >>>>> compiler environment on the system you're installing onto. 
>>>> >>>> Point taken -- just didn't want pip to be sold short. >>>> I'm one of those spoiled Linux people, obviously ... ;) >>> >>> However, pip is really awful on Windows. >>> >>> If you have a virtualenv and you use --upgrade, it wants to upgrade all >>> package dependencies (!), but it doesn't know how (with numpy and scipy). >>> >>> (easy_install was so much nicer.) >>> >>> Josef >> >> >> You can use --no-deps to prevent pip from trying to upgrade dependencies. > > This is only a partial workaround, because this also means that if > there *are* new needed dependencies, they get ignored, resulting in a > possibly broken install. IIRC the full workaround is 'pip install > --no-deps --upgrade foo; pip install foo' > > The other annoying workaround is to instead of using --upgrade, do > something like 'pip install numpy==1.7.1'. This requires knowing (or > looking up) what the latest version is, but once you've done that it > works. Guess I'm not as easily annoyed -- esp. since looking up what the latest version of numpy is as simple as: waterbug at boson:~$ yolk -V numpy numpy 1.7.1 Steve From huangkandiy at gmail.com Sun Apr 7 19:17:36 2013 From: huangkandiy at gmail.com (huangkandiy at gmail.com) Date: Sun, 7 Apr 2013 19:17:36 -0400 Subject: [Numpy-discussion] [numfocus] Growing the contributor base of Numpy In-Reply-To: References: Message-ID: That's awesome. I'm definitely one of the targeted audience. Thanks. On Sun, Apr 7, 2013 at 5:40 PM, David Cournapeau wrote: > I have prepared a preliminary proposal > https://github.com/enthought/davidc-scipy-2013/blob/master/proposal.rst > > Roughly, after ensuring everybody knows how to build numpy from > sources in a dev-friendly way, I was thinking about > - describing the source code tree organization > - talk about the main data structures (array + dtype), and how they > relate to the python runtime > - more details about dtype: use basic array operations to describe > the whole mechanism, and how to create a simple one (using wrapping > float128 as an example) > > I also intended to give a few tips regarding tools (e.g. how to track > a python call to its core C implementation). > > St?fan Van Der Walt agreed in principle to contribute, but his > participation is still very conditional. > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Kan Huang Department of Applied math & Statistics Stony Brook University 917-767-8018 -------------- next part -------------- An HTML attachment was scrubbed... URL: From aldcroft at head.cfa.harvard.edu Sun Apr 7 19:23:55 2013 From: aldcroft at head.cfa.harvard.edu (Tom Aldcroft) Date: Sun, 7 Apr 2013 19:23:55 -0400 Subject: [Numpy-discussion] Sort performance with structured array Message-ID: I'm seeing about a factor of 50 difference in performance between sorting a random integer array versus sorting that same array viewed as a structured array. Am I doing anything wrong here? In [2]: x = np.random.randint(10000, size=10000) In [3]: xarr = x.view(dtype=[('a', np.int)]) In [4]: timeit np.sort(x) 1000 loops, best of 3: 588 us per loop In [5]: timeit np.sort(xarr) 10 loops, best of 3: 29 ms per loop In [6]: timeit np.sort(xarr, order=('a',)) 10 loops, best of 3: 28.9 ms per loop I was wondering if this slowdown is expected (maybe the comparison is dropping back to pure Python or ??). 
I'm showing a simple example here, but in reality I'm working with non-trivial structured arrays where I might want to sort on multiple columns. Does anyone have suggestions for speeding things up, or have a sort implementation (perhaps Cython) that has better performance for structured arrays? Thanks, Tom From charlesr.harris at gmail.com Sun Apr 7 19:56:19 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 7 Apr 2013 17:56:19 -0600 Subject: [Numpy-discussion] Sort performance with structured array In-Reply-To: References: Message-ID: On Sun, Apr 7, 2013 at 5:23 PM, Tom Aldcroft wrote: > I'm seeing about a factor of 50 difference in performance between > sorting a random integer array versus sorting that same array viewed > as a structured array. Am I doing anything wrong here? > > In [2]: x = np.random.randint(10000, size=10000) > > In [3]: xarr = x.view(dtype=[('a', np.int)]) > > In [4]: timeit np.sort(x) > 1000 loops, best of 3: 588 us per loop > > In [5]: timeit np.sort(xarr) > 10 loops, best of 3: 29 ms per loop > > In [6]: timeit np.sort(xarr, order=('a',)) > 10 loops, best of 3: 28.9 ms per loop > > I was wondering if this slowdown is expected (maybe the comparison is > dropping back to pure Python or ??). I'm showing a simple example > here, but in reality I'm working with non-trivial structured arrays > where I might want to sort on multiple columns. > > Does anyone have suggestions for speeding things up, or have a sort > implementation (perhaps Cython) that has better performance for > structured arrays? > This is probably due to the comparison function used. For straight integers the C operator `<` is used, for dtypes the dtype comparison function is passed as a pointer to the routines. I doubt Cython would make any difference in this case, but making the dtype comparison routine better would probably help a lot. For all I know, the dtype gets parsed on every call to the comparison function. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Apr 7 20:00:56 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 7 Apr 2013 18:00:56 -0600 Subject: [Numpy-discussion] Sort performance with structured array In-Reply-To: References: Message-ID: On Sun, Apr 7, 2013 at 5:56 PM, Charles R Harris wrote: > > > On Sun, Apr 7, 2013 at 5:23 PM, Tom Aldcroft < > aldcroft at head.cfa.harvard.edu> wrote: > >> I'm seeing about a factor of 50 difference in performance between >> sorting a random integer array versus sorting that same array viewed >> as a structured array. Am I doing anything wrong here? >> >> In [2]: x = np.random.randint(10000, size=10000) >> >> In [3]: xarr = x.view(dtype=[('a', np.int)]) >> >> In [4]: timeit np.sort(x) >> 1000 loops, best of 3: 588 us per loop >> >> In [5]: timeit np.sort(xarr) >> 10 loops, best of 3: 29 ms per loop >> >> In [6]: timeit np.sort(xarr, order=('a',)) >> 10 loops, best of 3: 28.9 ms per loop >> >> I was wondering if this slowdown is expected (maybe the comparison is >> dropping back to pure Python or ??). I'm showing a simple example >> here, but in reality I'm working with non-trivial structured arrays >> where I might want to sort on multiple columns. >> >> Does anyone have suggestions for speeding things up, or have a sort >> implementation (perhaps Cython) that has better performance for >> structured arrays? >> > > This is probably due to the comparison function used. 
For straight > integers the C operator `<` is used, for dtypes the dtype comparison > function is passed as a pointer to the routines. I doubt Cython would make > any difference in this case, but making the dtype comparison routine better > would probably help a lot. For all I know, the dtype gets parsed on every > call to the comparison function. > > Note that even sorting as a byte string is notably faster In [13]: sarr = x.view(dtype='S8') From pivanov314 at gmail.com Sun Apr 7 20:04:11 2013 From: pivanov314 at gmail.com (Paul Ivanov) Date: Sun, 7 Apr 2013 17:04:11 -0700 Subject: [Numpy-discussion] Numpy discussion - was: Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: Message-ID: <20130408000411.GB17976@HbI-OTOH.berkeley.edu> Matthew Brett, on 2013-04-06 10:39, wrote: > Hi, > > On Sat, Apr 6, 2013 at 10:18 AM, matti picus wrote: > > as a lurker, may I say that this discussion seems to have become > > non-productive? > > Well - the discussion is about two things - as is so often the case on numpy. > > The first is about the particular change. > > The second is implicit, and keeps coming up, and that is about how > change of any sort comes about in numpy. These questions keep coming > up. Who decides on change? Is there a veto? Who has it? When do we > vote, when do we negotiate? > > For example, one small part of the discussion was the lack of > developers in numpy. > > For some, stopping these long discussions (somehow) will help recruit > developers. The idea is that developers don't like these discussions, > and, implicitly, that there is no problem, so the discussion is > unnecessary. This is a version of the old 'no man, no problem' > solution. > > For others, these long and unproductive discussions are pointing at a > severe problem right at the heart of numpy development, which is that > it is very hard to work in a system where it is unclear how decisions > get made. See for example Mark Wiebe's complaint at the end here [1]. > If this second lot of people are right then we have two options a) > stop the discussions, numpy decays and dies from lack of direction b) > continue the discussions and hope that it becomes clear that this is > indeed a serious problem, and there develops some agreement to fix it. > > [1] https://github.com/numpy/numpy.org/blob/master/NA-overview.rst I just wanted to chime in that while the document on contributing to numpy [2] is pretty thorough in terms of the technical details expected of a submission - it makes practically no allusions to the social aspects of contributing changes - e.g. the expectations one should have when proposing a change - a loose flowchart or set of possible outcomes. 2. http://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html Here's an example of what those expectations might be: 1. your change gets reviewed by a core committer, is deemed acceptable, and gets merged. 2. specific changes are required to the implementation/logic, after which the PR will be deemed acceptable, and get merged 3. the proposal is too broad in scope, or proposes a drastic change that has not been discussed on the list, and should be paused from further action pending the outcome of such a discussion. About a year ago, it seems like something like situation 3 occurred, where folks didn't feel like there was sufficient notice outside of having to pay attention to the Numpy PR queue. On a related note - it should be made clear who the core committers are, at this point.
The github organization lists the following eight: charris cournape njsmith pv rgommers rkern seberg teoliphant Is that the core, the whole core, and nothing but the core? -- Paul Ivanov 314 address only used for lists, off-list direct email at: http://pirsquared.org | GPG/PGP key id: 0x0F3E28F7 From g.brandl at gmx.net Mon Apr 8 03:14:51 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 08 Apr 2013 09:14:51 +0200 Subject: [Numpy-discussion] The "I" dtype character Message-ID: Hi, is it intentional that "I" is supported as a dtype character, but cannot be suffixed with a size? >>> dtype('i1') dtype('int8') >>> dtype('I1') dtype('uint32') I know "u" is documented as unsigned integer, but this seems an unnecessary restriction that is confusing. thanks, Georg From bahtiyor_zohidov at mail.ru Mon Apr 8 04:00:17 2013 From: bahtiyor_zohidov at mail.ru (Happyman) Date: Mon, 08 Apr 2013 12:00:17 +0400 Subject: [Numpy-discussion] Sources more confusing in Python References: <1365345131.168242828@f226.mail.ru> <1365346438.41186980@f44.mail.ru> Message-ID: <1365408017.688813832@f77.mail.ru> Thanks a lot, friends! I got it. By the way I used this one: http://www.python-excel.org/ . I guess this link provides a similar module to the one shown at https://pypi.python.org/pypi. Sunday, 7 April 2013, 17:06 +02:00 from David: >On 7 April 2013 16:53, Happyman < bahtiyor_zohidov at mail.ru > wrote: >>I completed my registration.. but still no idea how to use it.. it is, maybe, lack of experience of python .. > >You don't need registration. The easiest way to use it is via pip or easy_install. If you are on Windows, download and execute the easy_install installer; if you are on Linux, browse your repository for either of them. > >Once installed, you can do: > >$easy_install pip # to install pip, easiest way to get it on Windows > >$pip install numpy # to install package "numpy" > >All this, in the terminal. To get it on Windows, run -> cmd >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion > From chris.barker at noaa.gov Mon Apr 8 11:27:31 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Mon, 8 Apr 2013 08:27:31 -0700 Subject: [Numpy-discussion] Sources more confusing in Python In-Reply-To: <5161E681.7040808@pangalactic.us> References: <1365345131.168242828@f226.mail.ru> <1365346438.41186980@f44.mail.ru> <5161E44D.1000101@pangalactic.us> <5161E681.7040808@pangalactic.us> Message-ID: On Sun, Apr 7, 2013 at 2:34 PM, Steve Waterbury wrote: > Point taken -- just didn't want pip to be sold short. > I'm one of those spoiled Linux people, obviously ... ;) I really like pip -- but it is missing what is really a key feature for Windows (and to a slightly lesser extent, OS-X) -- the ability to install binaries. As a rule, installing binaries is a pain on Linux, but installing from source is a pain on Windows -- binaries work really well there. On OS-X, it's a mixed bag -- there are so many different builds of Python out there that binaries are a real trick -- but compiling from source can be a pain, too... while easy_install does support binaries, it never did it right on OS-X (what with the "universal binary" problem (that's OS-X's way of putting multiple binaries in one file -- i.e.
PPC and Intel -- easy_install always got confused about what it was supposed to install) And while PPC is getting to be history, we still have Intel32 and Intel64 to deal with... So scouring the web for the binaries you need is still the way to go. NOTE for OP: for Windows, Christoph Gohlke's repository is a godsend: http://www.lfd.uci.edu/~gohlke/pythonlibs/ -Chris > Steve > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From pav at iki.fi Mon Apr 8 05:38:52 2013 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 8 Apr 2013 09:38:52 +0000 (UTC) Subject: [Numpy-discussion] Numpy discussion - was: Raveling, reshape order keyword unnecessarily confuses index and memory ordering References: <20130408000411.GB17976@HbI-OTOH.berkeley.edu> Message-ID: Paul Ivanov <pivanov314 at gmail.com> writes: [clip] > On a related note - it should be made clear who the core > committers are, at this point. The github organization lists the > following eight: > > charris > cournape > njsmith > pv > rgommers > rkern > seberg > teoliphant > > Is that the core, the whole core, and nothing but the core? Actually, that's the list of people with admin access to the numpy organization, for creating repositories and managing commit access. The actual committer list is here (github has managed to make the team lists difficult to find): https://github.com/organizations/numpy/teams/14102 certik charris cournape FrancescAlted mwiebe njsmith pv rgommers rkern seberg stefanv teoliphant thouis Sure, it contains some names (such as mine) with previous association with the project but not so much recent activity.
On the other hand, > I don't see obvious missing names -- the policy has been to give > commit access for people with sustained code contributions, see also > > git shortlog -a -c -s --since=now-2year | sort -n I find this helpful for a quick overview: http://www.ohloh.net/p/numpy/contributors?query=&sort=commits_12_mo Josef > > -- > Pauli Virtanen > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From chris.barker at noaa.gov Mon Apr 8 15:24:10 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Mon, 8 Apr 2013 12:24:10 -0700 Subject: [Numpy-discussion] Time Zones and datetime64 Message-ID: Recent discussion has made it clear that the timezone handling in the current (numpy 1.7) version of datetime64 is broken. Below is a discussion of some possible solutions, hopefully including most of the comments made on the recent thread on this list. http://mail.scipy.org/pipermail/numpy-discussion/2013-April/066038.html The intent is that with a bit more discussion (focused, in this thread at least) on the time zone issues, rather than other datetime64 issues, we can start a new datetime64 NEP. Background: =============================== The current version (numpy 1.7) of datetime64 appears to handle timezones in the following ways: datetime64s are assumed to be in UTC internally. Time zone translation is done on I/O -- i.e. creating a new datetime64 and outputting to text format or as a datetime.datetime object. When creating a datetime64 from an ISO string, the timezone info in the string is respected. If there is no timezone info in the string, the system time zone (locale setting) is used. On output (i.e. converting to text: __str__ and __repr__) the system locale is used to set the timezone. In [9]: np.datetime64('2013-04-08T12:00:00Z') Out[9]: numpy.datetime64('2013-04-08T05:00:00-0700') In [10]: np.datetime64('2013-04-08T12:00:00') Out[10]: numpy.datetime64('2013-04-08T12:00:00-0700') However, if a datetime.datetime is used without a tzinfo object (the common case, as no tzinfo objects are provided with the python stdlib), the timezone is assumed to be UTC: In [13]: dt Out[13]: datetime.datetime(2013, 4, 8, 12, 0) In [14]: np.datetime64(dt) Out[14]: numpy.datetime64('2013-04-08T05:00:00.000000-0700') which can give some odd results, as it's different if you convert the datetime object to an ISO string first: In [15]: np.datetime64(dt.isoformat()) Out[15]: numpy.datetime64('2013-04-08T12:00:00-0700') Converting from a datetime64 to a datetime object uses the UTC time (the internal representation with no offset). Issues with the current configuration: =============================== Using the locale time zone is a long-standing tradition, used by the C standard library time functions. However, it is almost always NOT what one wants in a typical numpy application. When working with scientific (and financial) datasets, the time zone of the data at hand is likely to have nothing to do with the timezone of the computer the code is running on. Also, with cloud computing and web applications, the time zone of the machine on which the code is running is irrelevant to the user. A number of early adopters of datetime64 have found that they have needed to wrap the creation and use of datetime64 arrays to override the timezone behavior.
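One minimal sketch of such a wrapper (the helper name is hypothetical, not an actual numpy API): appending an explicit 'Z' to offset-free ISO strings pins the parsing to UTC, so the machine's locale never enters.

import numpy as np

def datetime64_utc(iso_string):
    # hypothetical helper: append 'Z' to offset-free ISO strings so
    # numpy 1.7 parses them as UTC rather than applying the machine's
    # locale time zone
    tail = iso_string[10:]  # skip the date part, whose '-' is not an offset
    has_offset = iso_string.endswith('Z') or '+' in tail or '-' in tail
    return np.datetime64(iso_string if has_offset else iso_string + 'Z')

# stored as 12:00 UTC on any machine, regardless of its locale:
dt = datetime64_utc('2013-04-08T12:00:00')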
The current implementation may be natural for some interactive use, but that's often not the case, and is particularly problematic when datetime.datetime.now() gives locale time, but with no time zone info, so numpy actually appears to shift it. In [19]: datetime.datetime.now().isoformat() Out[19]: '2013-04-08T12:05:26.157475' In [20]: np.datetime64(datetime.datetime.now()) Out[20]: numpy.datetime64('2013-04-08T05:05:45.813027-0700') This is really ugly -- and regardless of whether we like what the std lib does, we need to deal with it. The python standard library datetime implementation uses "naive" datetimes by default, with the provision for an optional "tzinfo" object, so that the user can supply timezone info if desired. However, the library does not provide any tzinfo objects out of the box. This is because timezones are messy, complicated, and change over time, and the core python devs did not want to be in the position of maintaining that code. There is a third party "pytz" package (http://pytz.sourceforge.net/) that provides a pretty complete implementation of time zone handling for those that need it. Note also that in the current implementation, the busday functions just operate on datetime64[D]. There is no timezone interaction there -- which makes it very hard for them to be useful, as it's a bit tricky to make sure your datetime64 arrays are in the correct time zone for your application. In fact, they are assuming that datetime64 is time zone naive, even though the I/O functions assume locale time. Proposed Alternatives: ====================== Principles: ------------------ * Partial time zone handling is probably worse than none. * The library should never apply the locale timezone (or any other) unless explicitly requested by the user. * It should be possible (and easy) to use "naive" datetime64 arrays. 1) A naive datetime64: ==================== This would follow, more or less, the python stdlib datetime approach (with no tzinfo object) -- ignore timezones entirely. This model leaves all time zone handling up to the user. In general, when working with this method, applications either: use UTC everywhere, or use "local time" everywhere, where local time is simply "all data is in the same time zone" and it's up to the user (or lib code) to make sure that's correct. Issues: ------------------ The main issue with a naive datetime64 is what to do on creation if time zone information is supplied (i.e. in an ISO 8601 string, or datetime object with non-None tzinfo). Options include: - ignore the TZ info - raise an exception if there is a TZ adjustment (other than UTC, 00:00, or Z) I propose that we raise an exception, unless there is a way to pass an optional flag in to request timezone conversion. Note that the stdlib datetime package does not provide an ISO8601 parsing function, so it has ignored the issue. There is also the issue of what to provide on output/conversion. I propose: - a datetime object with no tzinfo - ISO8601 with no tz adjustment np.datetime_as_string() could be used with options to allow the user to request a time zone offset. 2) UTC-only ==================== This would be very similar to a naive datetime64 -- there are no timezone adjustments with pure UTC -- and would be similar to the current implementation, except for I/O: On conversion to datetime64, time zone offset would be respected, if it exists in the ISO8601 string or the datetime object has a tzinfo attribute. The value would be stored in UTC.
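To make that conversion rule concrete -- a sketch using only the stdlib, since the proposed numpy behavior does not exist yet -- an input like '2013-04-08T12:00:00-0700' would be normalized to its UTC instant before storage:

from datetime import datetime, timedelta

wall_time = datetime(2013, 4, 8, 12, 0, 0)  # the 12:00 wall-clock value
utc_offset = timedelta(hours=-7)            # the -0700 offset

# subtracting the offset gives the UTC instant that would be stored
stored_utc = wall_time - utc_offset
print(stored_utc)  # 2013-04-08 19:00:00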
If there is no timezone info in the input string or datetime objects, UTC is assumed. On output -- UTC is used, no offset computed. Issues: ------------------ The ISO string produced on output would logically contain the "Z" flag to indicate UTC. This may confuse some apps that really expect a naive datetime. If there were a way to pass in a flag to create ISO strings indicating the time zone, that would be perfect, probably using np.datetime_as_string() 3) Optional time zone support ========================== This would follow the standard library approach -- provide a hook for a tzinfo object -- and if there, handle properly. This would allow one to mix and match datetime64s that are in different time zones, etc. issues: ---------------- The biggest issue is that to be useful, you'd need a comprehensive tzinfo package. pytz provides one, but then you'd need to go through the python layer for every item in an array -- killing performance. However, perhaps that would be worth it for those that need it, and folks that need performance could use naive datetime64s. There apparently is also a datetime library in boost that has a nice timezone object which could be used as inspiration for an equivalent in NumPy. That could be a lot of work, though. 4) Full time zone support ========================== This would be similar to the above, except that every datetime64 array would be required to carry time zone info. This would probably be reasonable, as one could use UTC everywhere if you wanted the simple case. But it would require that a comprehensive tzinfo package be included with numpy -- likely something we don't want to have to maintain (even if someone wants to build it in the first place) issues: ----------- We would still want ways to input/output naive datetimes -- some apps simply don't want to deal with all this! As I (Chris Barker) am not in a position to implement anything, I advocate the simplest possible approach -- which I think is Naive datetime and/or UTC only. But if people want more, and someone wants to implement it, great! Please add your comment, and maybe we'll get a NEP together. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From ralf.gommers at gmail.com Mon Apr 8 18:06:58 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 9 Apr 2013 00:06:58 +0200 Subject: [Numpy-discussion] Numpy discussion - was: Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: <20130408000411.GB17976@HbI-OTOH.berkeley.edu> References: <20130408000411.GB17976@HbI-OTOH.berkeley.edu> Message-ID: On Mon, Apr 8, 2013 at 2:04 AM, Paul Ivanov wrote: > I just wanted to chime in that while the document on contributing > to numpy [2] is pretty thorough in terms of the technical details > expected of a submission - it makes practically no allusions to > the social aspects of contributing changes - e.g. the > expectations one should have when proposing a change - a loose > flowchart or set of possible outcomes. > This is a point I recognize; more description/guidance on how things work (or should work) in practice would be useful. Last year we did make an attempt to write up something on community process for SciPy: https://github.com/scipy/scipy/blob/master/HACKING.rst.txt.
I suspect it's more beginner-oriented than what you have in mind, but large parts of it could be taken over for numpy. > 2. http://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html > > > Here's an example of what those expectations might be: > > 1. your change gets reviewed by a core committer, is deemed > acceptable, and gets merged. > > 2. specific changes are required to the implementation/logic, > after which the PR will be deemed acceptable, and get merged > > 3. the proposal is too broad in scope, or proposes a drastic > change that has not been discussed on the list, and should be > paused from further action pending the outcome of such a > discussion. > Good examples, here are a few more. 4. the proposal looks good, but hasn't yet been discussed on the list. Please go do that, after that we continue. 5. You don't get a response to your PR or your post on the mailing list. This ideally shouldn't happen, however the numpy development team is small and very busy. Don't take it personally. If more than a week has passed (for an email) please ping the list again. For PRs wait at least a week or two, unless it's an urgent bug fix. To me, the type of response that one is likely to receive is connected to the type of contribution: A. bug fix. --> situation 1 or 2 B. new feature/function --> situation 3 or 4 C. documentation, build improvement, etc. --> situation 1 or 2 D. backwards-incompatible changes --> there's a strong feeling in the community that non-backwards-compatible changes are to be avoided unless there's a very good reason why those changes have to be made. Only propose these if you are convinced that there are no ways to achieve what you want in another way. Do expect some resistance. D1. ABI changes --> these are a last resort, and require a major version number change. D2. API changes --> these are in most cases less painful than ABI changes, but don't happen all that often - even if the change is accepted a (>1 year long) deprecation process is needed. E. forwards-incompatible changes --> there's no guarantee that numpy provides forwards compatibility. We don't go and break this for no reason, but the threshold is low. Please discuss on the list. F. adding new sub-modules --> situation 3. Adding new sub-modules is rare. The numpy namespace is relatively flat, we aim to keep it that way. There has to be a very good reason to add a new module. G. website improvement --> thank you, desperately needed. Here's some login credentials and a medal. If the scipy doc I linked and the above make sense, I can draft something for the numpy docs for people to comment on. I don't have much time this week though, so I'd be grateful if someone would help out with this. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From pivanov314 at gmail.com Mon Apr 8 19:30:51 2013 From: pivanov314 at gmail.com (Paul Ivanov) Date: Mon, 8 Apr 2013 16:30:51 -0700 Subject: [Numpy-discussion] Numpy discussion - was: Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <20130408000411.GB17976@HbI-OTOH.berkeley.edu> Message-ID: <20130408233051.GG17976@HbI-OTOH.berkeley.edu> Ralf Gommers, on 2013-04-09 00:06, wrote: > G. website improvement --> thank you, desperately needed. Here's some login > credentials and a medal. see https://github.com/numpy/numpy.org/pull/1 Do I really get a medal? 
;) best, -- Paul Ivanov 314 address only used for lists, off-list direct email at: http://pirsquared.org | GPG/PGP key id: 0x0F3E28F7 From charlesr.harris at gmail.com Mon Apr 8 21:47:13 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 8 Apr 2013 19:47:13 -0600 Subject: [Numpy-discussion] Numpy discussion - was: Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: <20130408233051.GG17976@HbI-OTOH.berkeley.edu> References: <20130408000411.GB17976@HbI-OTOH.berkeley.edu> <20130408233051.GG17976@HbI-OTOH.berkeley.edu> Message-ID: On Mon, Apr 8, 2013 at 5:30 PM, Paul Ivanov wrote: > Ralf Gommers, on 2013-04-09 00:06, wrote: > > G. website improvement --> thank you, desperately needed. Here's some > login > > credentials and a medal. > > see https://github.com/numpy/numpy.org/pull/1 > > Do I really get a medal? ;) > Send the barcode and a self-addressed envelope to NMR (Numpy Medal Rewards), PO Box 314159, Reno NV, and we'll think about it. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From tmp50 at ukr.net Tue Apr 9 06:45:24 2013 From: tmp50 at ukr.net (Dmitrey) Date: Tue, 09 Apr 2013 13:45:24 +0300 Subject: [Numpy-discussion] OpenOpt Suite release 0.45 In-Reply-To: References: <1991.1363353681.6942528722304958464@ffe16.ukr.net> <51436A9C.7010905@gmail.com> <27202.1363376076.16920329947726938112@ffe12.ukr.net> <51438A72.3000301@gmail.com> <6447.1363426297.5285784474692943872@ffe6.ukr.net> <90157.1363430170.9409858753789952@ffe8.ukr.net> <29942.1363457966.17802425454193213440@ffe11.ukr.net> Message-ID: <73318.1365504324.16663206196880080896@ffe17.ukr.net> --- Original message --- From: "Robert Kern" Date: 16 March 2013, 22:15:07 On Sat, Mar 16, 2013 at 6:19 PM, Dmitrey < tmp50 at ukr.net > wrote: > > > --- Original message --- > From: "Robert Kern" < robert.kern at gmail.com > > Date: 16 March 2013, 19:54:51 > > On Sat, Mar 16, 2013 at 10:39 AM, Matthieu Brucher > < matthieu.brucher at gmail.com > wrote: >> Even if they have different hashes, they can be stored in the same >> underlying list before they are retrieved. Then, an actual comparison is >> done to check if the given key (i.e. object instance, not hash) is the >> same >> as one of the stored keys. > > Right. And the rule is that if two objects compare equal, then they > must also hash equal. Unfortunately, it looks like `oofun` objects do > not obey this property. oofun.__eq__() seems to return a Constraint > rather than a bool, so oofun objects should simply not be used as > dictionary keys. > > It is one of several base features FuncDesigner is build on and is used > extremely often and wide; then whole FuncDesigner would work incorrectly > while it is used intensively and solves many problems better than its > competitors. I understand. It just means that you can't use oofun objects as dictionary keys. Adding a __hash__() method is not enough to make that work. No, it just means I had mapped, have mapped, map and will map oofun objects as Python dict keys. As for the bug, I have found and fixed its source (I used some info from a sorted list of free variables and some other info from a non-sorted dict of oofun sizes). D. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From robert.kern at gmail.com Tue Apr 9 07:29:20 2013 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 9 Apr 2013 16:59:20 +0530 Subject: [Numpy-discussion] OpenOpt Suite release 0.45 In-Reply-To: <73318.1365504324.16663206196880080896@ffe17.ukr.net> References: <1991.1363353681.6942528722304958464@ffe16.ukr.net> <51436A9C.7010905@gmail.com> <27202.1363376076.16920329947726938112@ffe12.ukr.net> <51438A72.3000301@gmail.com> <6447.1363426297.5285784474692943872@ffe6.ukr.net> <90157.1363430170.9409858753789952@ffe8.ukr.net> <29942.1363457966.17802425454193213440@ffe11.ukr.net> <73318.1365504324.16663206196880080896@ffe17.ukr.net> Message-ID: On Tue, Apr 9, 2013 at 4:15 PM, Dmitrey wrote: > > > --- Original message --- > From: "Robert Kern" > Date: 16 March 2013, 22:15:07 > > On Sat, Mar 16, 2013 at 6:19 PM, Dmitrey wrote: >> >> >> --- Original message --- >> From: "Robert Kern" >> Date: 16 March 2013, 19:54:51 >> >> On Sat, Mar 16, 2013 at 10:39 AM, Matthieu Brucher >> wrote: >>> Even if they have different hashes, they can be stored in the same >>> underlying list before they are retrieved. Then, an actual comparison is >>> done to check if the given key (i.e. object instance, not hash) is the >>> same >>> as one of the stored keys. >> >> Right. And the rule is that if two objects compare equal, then they >> must also hash equal. Unfortunately, it looks like `oofun` objects do >> not obey this property. oofun.__eq__() seems to return a Constraint >> rather than a bool, so oofun objects should simply not be used as >> dictionary keys. >> >> It is one of several base features FuncDesigner is build on and is used >> extremely often and wide; then whole FuncDesigner would work incorrectly >> while it is used intensively and solves many problems better than its >> competitors. > > I understand. It just means that you can't use oofun objects as dictionary > keys. Adding a __hash__() method is not enough to make that work. > > No, it just means I had mapped, have mapped, map and will map oofun objects > as Python dict keys. Well, it's your software. You are free to make it as buggy as you wish, I guess. -- Robert Kern From jhtu at Princeton.EDU Tue Apr 9 08:40:51 2013 From: jhtu at Princeton.EDU (Jonathan Tu) Date: Tue, 9 Apr 2013 08:40:51 -0400 Subject: [Numpy-discussion] timezones and datetime64 In-Reply-To: References: <515BB663.7030804@hilboll.de> Message-ID: On Thu, Apr 4, 2013 at 12:52 AM, Chris Barker - NOAA Federal wrote: > Thanks all for taking an interest. I need to think a bit more about > the options before commenting more, but: > > while we're at it: > > It seems very odd to me that datetime64 supports different units > (right down to attosecond) but not different epochs. How can it > possibly be useful to use nanoseconds, etc, but only right around > 1970? For that matter, why all the units at all? I can see the need > for nanosecond resolution, but not without changing the epoch -- so if > the epoch is fixed, why bother with different units? Using days (for > instance) rather than seconds doesn't save memory, as we're always > using 64 bits. It can't be common to need more than 2.9e12 years (OK, > that's not quite as old as the universe, so some cosmologists may need > it...) Another reason why it might be interesting to support different epochs is that many timeseries (e.g., the ones I work with) aren't linked to absolute time, but are instead "milliseconds since we turned on the recording equipment".
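Concretely, such tracks look like this (the arrays below are made up), using the timedelta representation discussed next:

import numpy as np

# two sessions, each timestamped in ms since its own start
rec1_times = np.array([0, 40, 80, 120], dtype='timedelta64[ms]')
rec2_times = np.array([0, 40, 80, 120], dtype='timedelta64[ms]')

# within one session the arithmetic is meaningful:
print(rec1_times[3] - rec1_times[1])  # 80 milliseconds

# across sessions it type-checks but is physically meaningless, since
# the two "equipment switched on" epochs are unrelated:
print(rec1_times[3] - rec2_times[1])  # also "80 ms" -- silently wrong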
You can reasonably represent these as timedeltas of course, but it'd be even more elegant to be able to represent them as absolute times against an opaque epoch. In particular, when you have multiple recording tracks, only those which were recorded against the same epoch are actually commensurable -- trying to do recording1_times[10] - recording2_times[10] is meaningless and should be an error. I'm definitely not suggesting we go start retrofitting this into datetime64, but it's a real shame that defining a new dtype is so hard that we can't play around with such things on our own without serious mucking about in numpy's guts :-/. -n _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From mwwiebe at gmail.com Tue Apr 9 17:46:10 2013 From: mwwiebe at gmail.com (Mark Wiebe) Date: Tue, 9 Apr 2013 14:46:10 -0700 Subject: [Numpy-discussion] Time Zones and datetime64 In-Reply-To: References: Message-ID: On Mon, Apr 8, 2013 at 12:24 PM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: > Recent discussion has made it clear that the timezone handling in the > current (numpy 1.7) version of datetime64 is broken. Below is a > discussion of some possible solutions, hopefully including most of the > comments made on the recent thread on this list. > > http://mail.scipy.org/pipermail/numpy-discussion/2013-April/066038.html > > The intent is that with a bit more discussion (focused, in this thread > at least) on the time zone issues, rather than other datetime64 > issues, we can start a new datetime64 NEP. > This looks great, thanks for putting it together! I've put some comments inline. > > Background: > =============================== > > > The current version (numpy 1.7) of datetime64 appears to handle > timezones in the following ways: > > datetime64s are assumed to be in UTC internally. Time zone translation > is done on I/O -- i.e. creating a new datetime64 and outputting to text > format or as a datetime.datetime object. > It might be better to say "defined" instead of "assumed", because that was an explicit choice. > When creating a datetime64 from an ISO string, the timezone info in > the string is respected. If there is no timezone info in the string, > the system time zone (locale setting) is used. On output > (i.e. converting to text: __str__ and __repr__) the system locale is > used to set the timezone. > > In [9]: np.datetime64('2013-04-08T12:00:00Z') > Out[9]: numpy.datetime64('2013-04-08T05:00:00-0700') > > In [10]: np.datetime64('2013-04-08T12:00:00') > Out[10]: numpy.datetime64('2013-04-08T12:00:00-0700') > > However, if a datetime.datetime is used without a tzinfo object (the > common case, as no tzinfo objects are provided with the python > stdlib), the timezone is assumed to be UTC: > > In [13]: dt > Out[13]: datetime.datetime(2013, 4, 8, 12, 0) > > In [14]: np.datetime64(dt) > Out[14]: numpy.datetime64('2013-04-08T05:00:00.000000-0700') > > which can give some odd results, as it's different if you convert the > datetime object to an ISO string first: > > In [15]: np.datetime64(dt.isoformat()) > Out[15]: numpy.datetime64('2013-04-08T12:00:00-0700') > > Converting from a datetime64 to a datetime object uses the UTC time > (the internal representation with no offset).
> > Issues with the current configuration: > =============================== > > Using the locale time zone is a long-standing tradition, used by > the C standard library time functions. However, it is almost always > NOT what one wants in a typical numpy application. When working with > scientific (and financial) datasets, the time zone of the data at hand > is likely to have nothing to do with the timezone of the computer the > code is running on. Also, with cloud computing and web applications, > the time zone of the machine on which the code is running is > irrelevant to the user. A number of early adopters of datetime64 have > found that they have needed to wrap the creation and use of datetime64 > arrays to override the timezone behavior. > > The current implementation may be natural for some interactive use, > but that's often not the case, and is particularly problematic when > datetime.datetime.now() gives locale time, but with no time zone info, > so numpy actually appears to shift it. > > In [19]: datetime.datetime.now().isoformat() > Out[19]: '2013-04-08T12:05:26.157475' > > In [20]: np.datetime64(datetime.datetime.now()) > Out[20]: numpy.datetime64('2013-04-08T05:05:45.813027-0700') > > This is really ugly -- and regardless of whether we like what the std lib > does, we need to deal with it. > > The python standard library datetime implementation uses "naive" > datetimes by default, with the provision for an optional "tzinfo" > object, so that the user can supply timezone info if desired. However, > the library does not provide any tzinfo objects out of the box. This > is because timezones are messy, complicated, and change over time, and > the core python devs did not want to be in the position of maintaining > that code. There is a third party "pytz" package > (http://pytz.sourceforge.net/) that provides a pretty complete > implementation of time zone handling for those that need it. > > Note also that in the current implementation, the busday functions > just operate on datetime64[D]. There is no timezone interaction there > -- which makes it very hard for them to be useful, as it's a bit > tricky to make sure your datetime64 arrays are in the correct time > zone for your application. In fact, they are assuming that datetime64 > is time zone naive, even though the I/O functions assume locale time. > The datetime64[D] type itself doesn't interact with time zones, for example: In [2]: np.datetime64('2000-03-12') Out[2]: numpy.datetime64('2000-03-12') doesn't use a time zone. Where time zones come into play is when converting between datetime64[D] and datetime64[s], or other time-unit datetimes: In [12]: a = np.array(["2012-03-02T22:00", "2013-02-01T01:00"], dtype='M8') In [13]: a Out[13]: array(['2012-03-02T22:00-0800', '2013-02-01T01:00-0800'], dtype='datetime64[m]') In [14]: a.astype('M8[D]') Out[14]: array(['2012-03-03', '2013-02-01'], dtype='datetime64[D]') The casting rules disallow conversion from time to date units, except under the 'unsafe' rule. That's unfortunately the default for the astype function though, so if we override the rule, we get: In [15]: a.astype('M8[D]', casting='same_kind') --------------------------------------------------------------------------- TypeError Traceback (most recent call last) in () ----> 1 a.astype('M8[D]', casting='same_kind') TypeError: Cannot cast array from dtype('<M8[m]') to dtype('<M8[D]') according to the rule 'same_kind' > Proposed Alternatives: > ====================== > > Principles: > ------------------ > > * Partial time zone handling is probably worse than none.
> * The library should never apply the locale timezone (or any other) > unless explicitly requested by the user. > * It should be possible (and easy) to use "naive" datetime64 arrays. > > 1) A naive datetime64: > ==================== > > This would follow, more or less, the python stdlib datetime approach > (with no tzinfo object) -- ignore timezones entirely. This model > leaves all time zone handling up to the user. In general, when working > with this method, applications either: use UTC everywhere, or use > "local time" everywhere, where local time is simply "all data is in > the same time zone" and it's up to the user (or lib code) to make sure > that's correct. > > Issues: > ------------------ > > The main issue with a naive datetime64 is what to do on creation if > time zone information is supplied (i.e. in an ISO 8601 string, or > datetime object with non-None tzinfo). Options include: > - ignore the TZ info > - raise an exception if there is a TZ adjustment (other than UTC, 00:00, > or Z) > I'd still raise the exception for 00:00 and Z; to me they're more like the other time zone specifications than no time zone. > > I propose that we raise an exception, unless there is a way to pass an > optional flag in to request timezone conversion. > > Note that the stdlib datetime package does not provide an ISO8601 > parsing function, so it has ignored the issue. > > There is also the issue of what to provide on output/conversion. I propose: > - a datetime object with no tzinfo > - ISO8601 with no tz adjustment > > np.datetime_as_string() could be used with options to allow the user > to request a time zone offset. > Another thing to consider is adding some global state for default printing of datetimes, similar to that for controlling the number of decimals when printing floats. I don't like this kind of global state, but it would match NumPy's current practice. > 2) UTC-only > ==================== > > This would be very similar to a naive datetime64 -- there are no > timezone adjustments with pure UTC -- and would be similar to the > current implementation, except for I/O: > > On conversion to datetime64, time zone offset would be respected, if > it exists in the ISO8601 string or the datetime object has a tzinfo > attribute. The value would be stored in UTC. > If there is no timezone info in the input string or datetime objects, > UTC is assumed. On output -- UTC is used, no offset computed. > > Issues: > ------------------ > > The ISO string produced on output would logically contain the "Z" flag > to indicate UTC. This may confuse some apps that really expect a naive > datetime. > > If there were a way to pass in a flag to create ISO strings indicating > the time zone, that would be perfect, probably using > np.datetime_as_string() > > > 3) Optional time zone support > ========================== > > This would follow the standard library approach -- provide a hook for > a tzinfo object -- and if there, handle properly. This would allow one > to mix and match datetime64s that are in different time zones, etc. > > issues: > ---------------- > > The biggest issue is that to be useful, you'd need a comprehensive > tzinfo package. pytz provides one, but then you'd need to go through > the python layer for every item in an array -- killing performance. > However, perhaps that would be worth it for those that need it, and > folks that need performance could use naive datetime64s.
> > There apparently is also a datetime library in boost that has a nice > timezone object which could be used as inspiration for an equivalent > in NumPy. That could be a lot of work, though. > > 4) Full time zone support > ========================== > > This would be similar to the above, except that every datetime64 array > would be required to carry time zone info. This would probably be > reasonable, as one could use UTC everywhere if you wanted the simple > case. But it would require that a comprehensive tzinfo package be > included with numpy -- likely something we don't want to have to > maintain (even if someone wants to build it in the first place) > > issues: > ----------- > > We would still want ways to input/output naive datetimes -- some apps > simply don't want to deal with all this! > > > > > > As I (Chris Barker) am not in a position to implement anything, I > advocate the simplest possible approach -- which I think is Naive > datetime and/or UTC only. But if people want more, and someone wants > to implement it, great! > > Please add your comment, and maybe we'll get a NEP together. > Thanks again for putting this together, Mark > -Chris > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tmp50 at ukr.net Wed Apr 10 03:00:56 2013 From: tmp50 at ukr.net (Dmitrey) Date: Wed, 10 Apr 2013 10:00:56 +0300 Subject: [Numpy-discussion] OpenOpt Suite release 0.45 In-Reply-To: References: <1991.1363353681.6942528722304958464@ffe16.ukr.net> <51436A9C.7010905@gmail.com> <27202.1363376076.16920329947726938112@ffe12.ukr.net> <51438A72.3000301@gmail.com> <6447.1363426297.5285784474692943872@ffe6.ukr.net> <90157.1363430170.9409858753789952@ffe8.ukr.net> <29942.1363457966.17802425454193213440@ffe11.ukr.net> <73318.1365504324.16663206196880080896@ffe17.ukr.net> Message-ID: <8233.1365577256.12194646403837853696@ffe10.ukr.net> > --- Original message --- From: "Robert Kern" Date: 9 April 2013, 14:29:43 On Tue, Apr 9, 2013 at 4:15 PM, Dmitrey < tmp50 at ukr.net > wrote: > > > --- Original message --- > From: "Robert Kern" < robert.kern at gmail.com > > Date: 16 March 2013, 22:15:07 > > On Sat, Mar 16, 2013 at 6:19 PM, Dmitrey < tmp50 at ukr.net > wrote: >> >> >> --- Original message --- >> From: "Robert Kern" < robert.kern at gmail.com > >> Date: 16 March 2013, 19:54:51 >> >> On Sat, Mar 16, 2013 at 10:39 AM, Matthieu Brucher >> < matthieu.brucher at gmail.com > wrote: >>> Even if they have different hashes, they can be stored in the same >>> underlying list before they are retrieved. Then, an actual comparison is >>> done to check if the given key (i.e. object instance, not hash) is the >>> same >>> as one of the stored keys. >> >> Right. And the rule is that if two objects compare equal, then they >> must also hash equal. Unfortunately, it looks like `oofun` objects do >> not obey this property. oofun.__eq__() seems to return a Constraint >> rather than a bool, so oofun objects should simply not be used as >> dictionary keys.
>> >> It is one of several base features FuncDesigner is build on and is used >> extremely often and wide; then whole FuncDesigner would work incorrectly >> while it is used intensively and solves many problems better than its >> competitors. > > I understand. It just means that you can't use oofun objects as dictionary > keys. Adding a __hash__() method is not enough to make that work. > > No, it just means I had mapped, have mapped, map and will map oofun objects > as Python dict keys. Well, it's your software. You are free to make it as buggy as you wish, I guess. Yes, and that's why each time I get a bugreport I immediately start working on it, so usually I have zero opened bugs, as now. It somewhat differs from your bugtracker, that has tens of opened bugs, and ~ half of them are hanging for years (also, half of them are mentioned as high and highest priority). But it's definitely your right to keep it as buggy as you wish, as well! D. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Apr 10 03:31:24 2013 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 10 Apr 2013 13:01:24 +0530 Subject: [Numpy-discussion] OpenOpt Suite release 0.45 In-Reply-To: <8233.1365577256.12194646403837853696@ffe10.ukr.net> References: <1991.1363353681.6942528722304958464@ffe16.ukr.net> <51436A9C.7010905@gmail.com> <27202.1363376076.16920329947726938112@ffe12.ukr.net> <51438A72.3000301@gmail.com> <6447.1363426297.5285784474692943872@ffe6.ukr.net> <90157.1363430170.9409858753789952@ffe8.ukr.net> <29942.1363457966.17802425454193213440@ffe11.ukr.net> <73318.1365504324.16663206196880080896@ffe17.ukr.net> <8233.1365577256.12194646403837853696@ffe10.ukr.net> Message-ID: On Wed, Apr 10, 2013 at 12:30 PM, Dmitrey wrote: > > > --- Original message --- > From: "Robert Kern" > Date: 9 April 2013, 14:29:43 >> Well, it's your software. You are free to make it as buggy as you wish, I >> guess. > > Yes, and that's why each time I get a bugreport I immediately start working > on it, so usually I have zero opened bugs, as now. It somewhat differs > from your bugtracker, that has tens of opened bugs, and ~ half of them are > hanging for years (also, half of them are mentioned as high and highest > priority). But it's definitely your right to keep it as buggy as you wish, > as well! You think comparing tracked bug counts across different projects means anything? That's adorable. I admire your diligence at addressing the bugs that you do acknowledge. That was never in question. But refusing to acknowledge a bug is not the same thing as fixing a bug. You cannot use objects that do not have a valid __eq__() (as in, returns boolean True if and only if they are to be considered equivalent for the purpose of dictionary lookup, otherwise returns False) as dictionary keys. Your oofun object still violates this principle. As dictionary keys, you want them to use their `id` attributes to distinguish them, but their __eq__() method still just returns another oofun with the default object.__nonzero__() implementation. This means that bool(some_oofun == other_oofun) is always True regardless of the `id` attributes. You have been unfortunate enough to not run into cases where this causes a problem yet, but the bug is still there, lurking, waiting for a chance hash collision to silently give you wrong results. That is the worst kind of bug.
-- Robert Kern From robert.kern at gmail.com Wed Apr 10 03:38:38 2013 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 10 Apr 2013 13:08:38 +0530 Subject: [Numpy-discussion] OpenOpt Suite release 0.45 In-Reply-To: References: <1991.1363353681.6942528722304958464@ffe16.ukr.net> <51436A9C.7010905@gmail.com> <27202.1363376076.16920329947726938112@ffe12.ukr.net> <51438A72.3000301@gmail.com> <6447.1363426297.5285784474692943872@ffe6.ukr.net> <90157.1363430170.9409858753789952@ffe8.ukr.net> <29942.1363457966.17802425454193213440@ffe11.ukr.net> <73318.1365504324.16663206196880080896@ffe17.ukr.net> <8233.1365577256.12194646403837853696@ffe10.ukr.net> Message-ID: On Wed, Apr 10, 2013 at 1:06 PM, Nathaniel Smith wrote: > This kind of personal attack is never appropriate for this list. Please > stop. My apologies. I will stop. -- Robert Kern From njs at pobox.com Wed Apr 10 03:36:49 2013 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 10 Apr 2013 08:36:49 +0100 Subject: [Numpy-discussion] OpenOpt Suite release 0.45 In-Reply-To: <8233.1365577256.12194646403837853696@ffe10.ukr.net> References: <1991.1363353681.6942528722304958464@ffe16.ukr.net> <51436A9C.7010905@gmail.com> <27202.1363376076.16920329947726938112@ffe12.ukr.net> <51438A72.3000301@gmail.com> <6447.1363426297.5285784474692943872@ffe6.ukr.net> <90157.1363430170.9409858753789952@ffe8.ukr.net> <29942.1363457966.17802425454193213440@ffe11.ukr.net> <73318.1365504324.16663206196880080896@ffe17.ukr.net> <8233.1365577256.12194646403837853696@ffe10.ukr.net> Message-ID: On 10 Apr 2013 08:01, "Dmitrey" wrote: > > --- Original message --- > From: "Robert Kern" > Date: 9 April 2013, 14:29:43 > >> On Tue, Apr 9, 2013 at 4:15 PM, Dmitrey wrote: >> > >> > >> > --- Original message --- >> > From: "Robert Kern" >> > Date: 16 March 2013, 22:15:07 >> > >> > On Sat, Mar 16, 2013 at 6:19 PM, Dmitrey wrote: >> >> >> >> >> >> --- Original message --- >> >> From: "Robert Kern" >> >> Date: 16 March 2013, 19:54:51 >> >> >> >> On Sat, Mar 16, 2013 at 10:39 AM, Matthieu Brucher >> >> wrote: >> >>> Even if they have different hashes, they can be stored in the same >> >>> underlying list before they are retrieved. Then, an actual comparison is >> >>> done to check if the given key (i.e. object instance, not hash) is the >> >>> same >> >>> as one of the stored keys. >> >> >> >> Right. And the rule is that if two objects compare equal, then they >> >> must also hash equal. Unfortunately, it looks like `oofun` objects do >> >> not obey this property. oofun.__eq__() seems to return a Constraint >> >> rather than a bool, so oofun objects should simply not be used as >> >> dictionary keys. >> >> >> >> It is one of several base features FuncDesigner is build on and is used >> >> extremely often and wide; then whole FuncDesigner would work incorrectly >> >> while it is used intensively and solves many problems better than its >> >> competitors. >> > >> > I understand. It just means that you can't use oofun objects as dictionary >> > keys. Adding a __hash__() method is not enough to make that work. >> > >> > No, it just means I had mapped, have mapped, map and will map oofun objects >> > as Python dict keys. >> >> Well, it's your software. You are free to make it as buggy as you wish, I guess. > > Yes, and that's why each time I get a bugreport I immediately start working on it, so usually I have zero opened bugs, as now.
It somewhat differs from your bugtracker, that has tens of opened bugs, and ~ half of them are hanging for years (also, half of them are mentioned as high and highest priority). But it's definitely your right to keep it as buggy as you wish, as well! This kind of personal attack is never appropriate for this list. Please stop. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From tmp50 at ukr.net Wed Apr 10 04:54:30 2013 From: tmp50 at ukr.net (Dmitrey) Date: Wed, 10 Apr 2013 11:54:30 +0300 Subject: [Numpy-discussion] OpenOpt Suite release 0.45 In-Reply-To: References: <1991.1363353681.6942528722304958464@ffe16.ukr.net> <51436A9C.7010905@gmail.com> <27202.1363376076.16920329947726938112@ffe12.ukr.net> <51438A72.3000301@gmail.com> <6447.1363426297.5285784474692943872@ffe6.ukr.net> <90157.1363430170.9409858753789952@ffe8.ukr.net> <29942.1363457966.17802425454193213440@ffe11.ukr.net> <73318.1365504324.16663206196880080896@ffe17.ukr.net> <8233.1365577256.12194646403837853696@ffe10.ukr.net> Message-ID: <20353.1365584070.4744103532183552000@ffe16.ukr.net> On 04/10/2013 10:31 AM, Robert Kern wrote: You think comparing tracked bug counts across different projects means anything? That's adorable. I admire your diligence at addressing the bugs that you do acknowledge. That was never in question. But refusing to acknowledge a bug is not the same thing as fixing a bug. You cannot use objects that do not have a valid __eq__() (as in, returns boolean True if and only if they are to be considered equivalent for the purpose of dictionary lookup, otherwise returns False) as dictionary keys. Your oofun object still violates this principle. As dictionary keys, you want them to use their `id` attributes to distinguish them, but their __eq__() method still just returns another oofun with the default object.__nonzero__() implementation. This means that bool(some_oofun == other_oofun) is always True regardless of the `id` attributes. You have been unfortunate enough to not run into cases where this causes a problem yet, but the bug is still there, lurking, waiting for a chance hash collision to silently give you wrong results. That is the worst kind of bug. -- Robert Kern I had encountered the bugs with bool(some_oofun == other_oofun) when it was raised from cases other than dict, e.g. from "in list" (e.g. "if my_oofun in freeVarsList") etc, and had fixed them all. But that one doesn't occur from "in dict"; I traced it with both a debugger and print("in __eq__"), print("in __le__"), print("in __lt__"), print('in __gt__'), print('in __ge__') statements. As I had mentioned, removing the mapping of oofuns as dict keys is simply impossible - it is a fundamental thing the whole FuncDesigner is built on, as well as its user API. D. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gadiyar at gmail.com Wed Apr 10 05:13:30 2013 From: gadiyar at gmail.com (Anand Gadiyar) Date: Wed, 10 Apr 2013 14:43:30 +0530 Subject: [Numpy-discussion] Import error while freezing with cxfreeze In-Reply-To: References: Message-ID: Hi Brad, On Fri, Apr 5, 2013 at 8:33 PM, Bradley M. Froehle wrote: > Hi Anand, > > On Friday, April 5, 2013, Anand Gadiyar wrote: > >> Hi all, >> >> I have a small program that uses numpy and scipy. I ran into a couple of >> errors while trying to use cxfreeze to create a Windows executable.
>>
>> I'm running Windows 7 x64, Python 2.7.3 64-bit, Numpy 1.7.1rc1 64-bit,
>> Scipy-0.11.0 64-bit, all binary installs from
>> <http://www.lfd.uci.edu/~gohlke/pythonlibs/>
>>
>> I was able to replicate this with scipy-0.12.0c1 as well.
>>
>> 1) "from scipy import constants" triggers the below:
>> Traceback (most recent call last):
>>   File "D:\Python27\lib\site-packages\cx_Freeze\initscripts\Console.py", line 27, in <module>
>>     exec_code in m.__dict__
>>   File "mSimpleGui.py", line 10, in <module>
>>   File "mSystem.py", line 7, in <module>
>>   File "D:\Python27\lib\site-packages\scipy\__init__.py", line 64, in <module>
>>     from numpy import show_config as show_numpy_config
>>   File "D:\Python27\lib\site-packages\numpy\__init__.py", line 165, in <module>
>>     from core import *
>> AttributeError: 'module' object has no attribute 'sys'
>
> It's a bug in cx_freeze that has been fixed in the development branch.
>
> See
> https://bitbucket.org/anthony_tuininga/cx_freeze/pull-request/17/avoid-polluting-extension-module-namespace/diff

Thanks - the development branch fixed this.

>> 2) "from scipy import interpolate" triggers the below:
>> Traceback (most recent call last):
>>   File "D:\Python27\lib\site-packages\cx_Freeze\initscripts\Console.py", line 27, in <module>
>>     exec_code in m.__dict__
>>   File "mSimpleGui.py", line 10, in <module>
>>   File "mSystem.py", line 9, in <module>
>>   File "mSensor.py", line 10, in <module>
>>   File "D:\Python27\lib\site-packages\scipy\interpolate\__init__.py", line 154, in <module>
>>     from rbf import Rbf
>>   File "D:\Python27\lib\site-packages\scipy\interpolate\rbf.py", line 50, in <module>
>>     from scipy import linalg
>> ImportError: cannot import name linalg
>
> You might want to try the dev branch of cxfreeze to see if this has been
> fixed as well.

This one seems to be still an issue with the dev branch. I tried "from
scipy import linalg" in the cxfreeze setup script and that went through.
I then tried using imp.find_module and that didn't work. So maybe
cxfreeze is the problem. I'll report it on the cxfreeze list.

Thanks,
Anand

From sebastian at sipsolutions.net  Wed Apr 10 05:45:09 2013
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Wed, 10 Apr 2013 11:45:09 +0200
Subject: [Numpy-discussion] OpenOpt Suite release 0.45
In-Reply-To: <20353.1365584070.4744103532183552000@ffe16.ukr.net>
Message-ID: <1365587109.2506.14.camel@sebastian-laptop>

On Wed, 2013-04-10 at 11:54 +0300, Dmitrey wrote:
> On 04/10/2013 10:31 AM, Robert Kern wrote:
> > [Robert's analysis of the oofun __eq__() problem snipped]
> > That is the worst kind of bug. -- Robert Kern
> I had encountered the bugs with bool(some_oofun == other_oofun) when
> they were raised from cases other than dict [...] and had fixed them
> all. But that one doesn't occur from "in dict" [...]

This is all good and nice, but Robert is still right. For dictionaries
to work predictably you need to ensure two things.

First:

    if object1 == object2:
        assert bool(hash(object1) == hash(object2))

and second, which is your case: for the dictionary lookup to be
predictable, this must always work:

    keys, values = zip(*dictionary.keys())
    assert(keys.count(object2) == 1)

    index = keys.index(object2)
    value = values[index]

And apparently this is not the case, and it simply invites bugs which
are impossible to track down because you will not see them in small
tests. Instead you will have code that runs great for toy problems and
can suddenly break down in impossible-to-understand ways when you have
large problems.

So hopefully the list fixes you mention ensure that the second code
block will work as you would expect dictionary[object2] to work, but if
this is not the case...

- Sebastian

> As I had mentioned, removing the use of oofuns as dict keys is simply
> impossible -- this is a fundamental thing the whole of FuncDesigner is
> built on, as well as its user API.
> D.

From sebastian at sipsolutions.net  Wed Apr 10 05:47:09 2013
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Wed, 10 Apr 2013 11:47:09 +0200
Subject: [Numpy-discussion] OpenOpt Suite release 0.45
In-Reply-To: <1365587109.2506.14.camel@sebastian-laptop>
Message-ID: <1365587229.2506.15.camel@sebastian-laptop>

On Wed, 2013-04-10 at 11:45 +0200, Sebastian Berg wrote:
> [...]
> keys, values = zip(*dictionary.keys())

sorry, dictionary.items() of course.

> [rest of quote snipped]

From sebastian at sipsolutions.net  Wed Apr 10 06:53:59 2013
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Wed, 10 Apr 2013 12:53:59 +0200
Subject: [Numpy-discussion] OpenOpt Suite release 0.45
In-Reply-To: <1365587109.2506.14.camel@sebastian-laptop>
Message-ID: <1365591239.2506.18.camel@sebastian-laptop>

On Wed, 2013-04-10 at 11:45 +0200, Sebastian Berg wrote:
> [...]
> First:
>
>     if object1 == object2:
>         assert bool(hash(object1) == hash(object2))

Ah well, so maybe it is not quite as strict, if you know that hashes
can't match (and if they do for another type, equality evaluates to
False). But it would be a bit weird anyway if these give different
results.

> [rest of quote snipped]
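[To make the failure mode discussed above concrete, here is a minimal,
self-contained sketch. BadKey and AlwaysTruthy are hypothetical stand-ins
for oofun, not FuncDesigner code; the point is only that a truthy,
non-boolean __eq__ result plus a hash collision silently matches the
wrong key.]

    class AlwaysTruthy(object):
        # stand-in for the array-like object an oofun-style __eq__ returns
        def __bool__(self):          # __nonzero__ on Python 2
            return True
        __nonzero__ = __bool__

    class BadKey(object):
        # distinguished by `ident`, but __eq__ never returns a real bool
        def __init__(self, ident):
            self.ident = ident
        def __hash__(self):
            return 0                 # force a hash collision for the demo
        def __eq__(self, other):
            return AlwaysTruthy()    # bool(...) is True for *any* other key

    d = {BadKey(1): 'value for key 1'}
    print(d[BadKey(2)])              # prints 'value for key 1' -- wrong key matched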
From cjwilliams43 at gmail.com  Wed Apr 10 07:21:02 2013
From: cjwilliams43 at gmail.com (Colin J. Williams)
Date: Wed, 10 Apr 2013 07:21:02 -0400
Subject: [Numpy-discussion] Time Zones and datetime64
Message-ID: <51654B1E.6050708@gmail.com>

An HTML attachment was scrubbed...

From alan.isaac at gmail.com  Wed Apr 10 08:11:46 2013
From: alan.isaac at gmail.com (Alan G Isaac)
Date: Wed, 10 Apr 2013 08:11:46 -0400
Subject: [Numpy-discussion] OpenOpt Suite release 0.45
In-Reply-To: <8233.1365577256.12194646403837853696@ffe10.ukr.net>
Message-ID: <51655702.3050201@gmail.com>

On 4/10/2013 3:31 AM, Robert Kern wrote:
> You cannot use objects that do not have a valid __eq__() (as in,
> returns boolean True if and only if they are to be considered
> equivalent for the purpose of dictionary lookup, otherwise returns
> False) as dictionary keys. Your oofun object still violates this
> principle. [...] You have been unfortunate enough to not run into
> cases where this causes a problem yet, but the bug is still there,
> lurking, waiting for a chance hash collision to silently give you
> wrong results. That is the worst kind of bug.

Hi Dmitrey,

Robert and Sebastian have taken their time to carefully explain to you
why your design is flawed. Your response has been only that you rely on
this design flaw and it has not bitten you yet. I trust you can see that
this is truly not a response. The right response is to explore how you
can refactor to eliminate this lurking bug, or to prove that it can
*never* bite due to another design feature. You have done neither, and
the second looks impossible. So you have work to do.

You say that you *must* use oofuns as dict keys. This is probably false,
but you clearly want to retain this aspect of your design. But this
choice has an implication for the design of oofuns, as carefully
explained in this thread. So you will have to change the design, even
though that may prove painful. No smaller step is adequate to the
quality of software you aspire to.

One last thing. When someone like Robert or Sebastian takes their time
to explain a problem to you, the right response is "thank you", even if
their news is unwelcome. Don't shoot the messenger.

Cheers,
Alan

From sole at esrf.fr  Wed Apr 10 10:21:56 2013
From: sole at esrf.fr ("V. Armando Solé")
Date: Wed, 10 Apr 2013 16:21:56 +0200
Subject: [Numpy-discussion] Import error while freezing with cxfreeze
Message-ID: <51657584.3070507@esrf.fr>

Hello,

On 10/04/2013 11:13, Anand Gadiyar wrote:
>> I'm running Windows 7 x64, Python 2.7.3 64-bit, Numpy 1.7.1rc1 64-bit,
>> Scipy-0.11.0 64-bit, all binary installs from
>> <http://www.lfd.uci.edu/~gohlke/pythonlibs/>

If you intend to use that binary for yourself, please forget this
message. As far as I know, if you intend to distribute that binary
*and* you use the numpy version built with MKL support, you need an MKL
license from Intel.

Best regards,

Armando

From njs at pobox.com  Wed Apr 10 10:24:15 2013
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 10 Apr 2013 15:24:15 +0100
Subject: [Numpy-discussion] OpenOpt Suite release 0.45
In-Reply-To: <1365587109.2506.14.camel@sebastian-laptop>
Message-ID:

An easy solution to all of this is to use a dict-like object that
matches keys based on object identity while ignoring __hash__ and
__eq__ entirely, e.g.:

https://bitbucket.org/pypy/pypy/src/2f51f2142f7b/lib_pypy/identity_dict.py#cl-9

-n

On Wed, Apr 10, 2013 at 10:45 AM, Sebastian Berg wrote:
> [full quote of Sebastian's explanation snipped]
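[A rough sketch of the kind of identity dictionary Nathaniel points at.
The PyPy link above has a complete implementation; this minimal version
is illustrative only and is not the PyPy code.]

    class IdentityDict(object):
        """Minimal mapping keyed on object identity, ignoring __hash__/__eq__."""
        def __init__(self):
            self._data = {}                    # id(key) -> (key, value)

        def __setitem__(self, key, value):
            # store the key too, so it stays alive and its id() is not reused
            self._data[id(key)] = (key, value)

        def __getitem__(self, key):
            return self._data[id(key)][1]

        def __contains__(self, key):
            return id(key) in self._data

    d = IdentityDict()
    a, b = object(), object()
    d[a] = 1
    print(a in d, b in d)                      # True False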
From tmp50 at ukr.net  Wed Apr 10 10:25:37 2013
From: tmp50 at ukr.net (Dmitrey)
Date: Wed, 10 Apr 2013 17:25:37 +0300
Subject: [Numpy-discussion] OpenOpt Suite release 0.45
In-Reply-To: <51655702.3050201@gmail.com>
Message-ID: <48634.1365603937.12506068579671998464@ffe12.ukr.net>

--- Original message ---
From: "Alan G Isaac"
Date: 10 April 2013, 15:12:07

> Robert and Sebastian have taken their time to carefully explain to you
> why your design is flawed. Your response has been only that you rely on
> this design flaw and it has not bitten you yet.

It had bitten me a few times until I understood the source of the bugs,
but as I had mentioned, I have fixed all those parts of the code.

> I trust you can see that this is truly not a response. The right
> response is to explore how you can refactor to eliminate this lurking
> bug, or to prove that it can *never* bite due to another design
> feature. You have done neither, and the second looks impossible. So
> you have work to do. You say that you *must* use oofuns as dict keys.
> [...] So you will have to change the design, even though that may
> prove painful. No smaller step is adequate to the quality of software
> you aspire to.

Refactoring is simply impossible: the user API and thousands of lines of
the whole FuncDesigner kernel rely heavily on oofuns as dict keys. Also,
I don't see any alternative that is as convenient and fast as the
involved approach. As for new features, I just keep this in mind while
implementing them, and now it's quite simple.

> One last thing. When someone like Robert or Sebastian takes their time
> to explain a problem to you, the right response is "thank you", even
> if their news is unwelcome. Don't shoot the messenger.

I understand your opinion, but I'm not the kind of person who gives
thanks for responses like "Well, it's your software. You are free to
make it as buggy as you wish" (Robert has apologised, though). Also, I
hadn't thanked Sebastian because I was AFK. Thanks to all who
participated in the thread.
D.

From chris.barker at noaa.gov  Wed Apr 10 11:47:51 2013
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Wed, 10 Apr 2013 08:47:51 -0700
Subject: [Numpy-discussion] Time Zones and datetime64
In-Reply-To: <51654B1E.6050708@gmail.com>
Message-ID:

On Wed, Apr 10, 2013 at 4:21 AM, Colin J. Williams wrote:
>> Recent discussion has made it clear that the timezone handling in the
>> current (numpy 1.7) version of datetime64 is broken. Below is a
>> discussion of some possible solutions, hopefully including most of the
>> comments made on the recent thread on this list.
> Is MxDateTime helpful?
>
> Colin W,

How so? I remember MxDateTime from way back, before Python had a
datetime in the stdlib. I'm trying to remember why Python itself didn't
adopt MxDateTime rather than make a new one, but I think:

 - The licensing may not have been appropriate
 - MxDateTime is more ambitious -- the core Python devs didn't want to
   support the whole thing

These apply to numpy as well. The goal of Python's datetime is that it
be the basis for more comprehensive datetime packages, so having numpy
work well with it makes sense.

Anyway, if MxDateTime has good timezone handling, it might be nice if
numpy could allow users to optionally plug that in -- also having an
easy MxDateTime => datetime64 conversion could be nice. But we wouldn't
want it as a dependency. If someone wants to take a good look at it and
see what lessons we can learn, that would be great.

(Note: I have no idea how compatible MxDateTime and datetime.datetime
are... that would be nice to know.)

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From riccardodemaria at gmail.com  Wed Apr 10 12:55:33 2013
From: riccardodemaria at gmail.com (Riccardo De Maria)
Date: Wed, 10 Apr 2013 16:55:33 +0000 (UTC)
Subject: [Numpy-discussion] Time Zones and datetime64
References: <51654B1E.6050708@gmail.com>
Message-ID:

Thanks for the effort. I think one should assume UTC when the time
zones are not explicit.

The library should handle leap seconds correctly; otherwise using unix
time as a floating point number is already sufficient for many
applications.

Did you have a look at http://cr.yp.to/libtai.html?

Riccardo
From daniele at grinta.net  Wed Apr 10 13:18:07 2013
From: daniele at grinta.net (Daniele Nicolodi)
Date: Wed, 10 Apr 2013 19:18:07 +0200
Subject: [Numpy-discussion] Time Zones and datetime64
In-Reply-To: References: <51654B1E.6050708@gmail.com>
Message-ID: <51659ECF.6080503@grinta.net>

On 10/04/2013 18:55, Riccardo De Maria wrote:
> The library should handle leap seconds correctly; otherwise using unix
> time as a floating point number is already sufficient for many
> applications.

Please define what you mean by "handle leap seconds correctly". As leap
seconds are not predictable, there is no way to correctly convert a
date-and-time representation of a point in the future into a number of
seconds since an epoch on a TAI timebase. Either we forbid converting
(number of seconds since the TAI epoch) to and from (date and time
representation) for times in the future, or I don't see how leap
seconds may be correctly handled.

> Did you have a look at http://cr.yp.to/libtai.html?

This library looks outdated. It does not list the leap second insertion
that occurred in 2012.

Cheers,
Daniele

From chris.barker at noaa.gov  Wed Apr 10 13:46:16 2013
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Wed, 10 Apr 2013 10:46:16 -0700
Subject: [Numpy-discussion] Time Zones and datetime64
In-Reply-To: References: <51654B1E.6050708@gmail.com>
Message-ID:

On Wed, Apr 10, 2013 at 9:55 AM, Riccardo De Maria wrote:
> The library should handle leap seconds correctly; otherwise using unix
> time as a floating point number is already sufficient for many
> applications.

Well, we could have used floating point in datetime64, but integers
have their advantages. But anyway, I'd like to keep the leap-second
question separate from the time zone question -- they are orthogonal
issues. So if you feel strongly about this, please write up a proposal,
and start a new thread for that. Leap seconds don't happen to be my
itch...

Thanks,
-Chris

From alan.isaac at gmail.com  Wed Apr 10 13:56:43 2013
From: alan.isaac at gmail.com (Alan G Isaac)
Date: Wed, 10 Apr 2013 13:56:43 -0400
Subject: [Numpy-discussion] OpenOpt Suite release 0.45
In-Reply-To: <48634.1365603937.12506068579671998464@ffe12.ukr.net>
Message-ID: <5165A7DB.8080302@gmail.com>

On 4/10/2013 10:25 AM, Dmitrey wrote:
> Refactoring is simply impossible

See Nathaniel's suggestion.

Alan Isaac

From riccardodemaria at gmail.com  Wed Apr 10 17:53:26 2013
From: riccardodemaria at gmail.com (Riccardo De Maria)
Date: Wed, 10 Apr 2013 21:53:26 +0000 (UTC)
Subject: [Numpy-discussion] Time Zones and datetime64
References: <51654B1E.6050708@gmail.com> <51659ECF.6080503@grinta.net>
Message-ID:

Daniele Nicolodi <daniele at grinta.net> writes:
> On 10/04/2013 18:55, Riccardo De Maria wrote:
> > The library should handle leap seconds correctly; otherwise using
> > unix time as a floating point number is already sufficient for many
> > applications.
>
> Please define what you mean by "handle leap seconds correctly".

I know they are not predictable, but one can always update a file, or
the binary itself, for the ones introduced over the years. I mentioned
leap seconds because it is the only feature I see that has deep
implications for the implementation, and that may be useful to people
who need to compute precise time deltas between time-stamped events or
need to parse dates like 2012-06-30T23:59:60UTC.

At the moment I use unix time, since leap seconds would be a corner
case for me that I can handle manually. Unix time can be converted to
and from date strings using datetime and pytz (and this is needed only
during I/O operations), can be used as coordinates in matplotlib plots
with no more than a trivial customized ticker, gives time deltas that
are just differences (although not always physically accurate), and can
be used as a timestamp (with glitches during a leap second).

But please take this comment as a suggestion; I only wanted to share my
use case.

Riccardo
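[A small sketch of the unix-time round trip Riccardo describes, using
only datetime, calendar and pytz. The variable names and the timestamp
are illustrative assumptions, not code from the thread.]

    import calendar
    from datetime import datetime
    import pytz

    ts = 1365603937.0                                 # seconds since the epoch (UTC)

    # unix time -> aware datetime -> string, needed only for I/O
    dt = datetime.fromtimestamp(ts, tz=pytz.utc)
    text = dt.strftime('%Y-%m-%dT%H:%M:%SZ')

    # string -> naive datetime -> unix time
    parsed = datetime.strptime(text, '%Y-%m-%dT%H:%M:%SZ')
    ts_back = calendar.timegm(parsed.timetuple())     # interpret fields as UTC

    assert int(ts) == ts_back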
From cjwilliams43 at gmail.com  Wed Apr 10 19:43:10 2013
From: cjwilliams43 at gmail.com (Colin J. Williams)
Date: Wed, 10 Apr 2013 19:43:10 -0400
Subject: [Numpy-discussion] Canopy and Anaconda
Message-ID: <5165F90E.9080502@gmail.com>

An HTML attachment was scrubbed...

From cournape at gmail.com  Wed Apr 10 19:54:27 2013
From: cournape at gmail.com (David Cournapeau)
Date: Thu, 11 Apr 2013 01:54:27 +0200
Subject: [Numpy-discussion] Canopy and Anaconda
In-Reply-To: <5165F90E.9080502@gmail.com>
Message-ID:

Hi Colin,

Please ask Canopy questions on the corresponding Enthought list, or
Anaconda questions on the corresponding channel at Continuum. This
mailing list is for discussion about NumPy itself.

David

On Thu, Apr 11, 2013 at 1:43 AM, Colin J. Williams wrote:
> Are Canopy and Anaconda intended to work together?
>
> I hope that this is envisaged.
>
> Colin W.
>
> [forwarded Enthought Canopy announcement newsletter snipped]

From pivanov314 at gmail.com  Thu Apr 11 03:14:46 2013
From: pivanov314 at gmail.com (Paul Ivanov)
Date: Thu, 11 Apr 2013 00:14:46 -0700
Subject: [Numpy-discussion] Please stop bottom posting!!
In-Reply-To: References: <515D235B.4090705@astro.uio.no>
Message-ID: <20130411071446.GE12951@HbI-OTOH.berkeley.edu>

Chris Barker - NOAA Federal, on 2013-04-04 09:01, wrote:
> Which is why I advocate interspersed posting.

This. (with heavy snipping). ... But I just came across a wonderfully
short signature from Rick Moen, and thought I'd pass it along:

    Cheers,              A: Yes.
    Rick Moen            > Q: Are you sure?
    rick at linuxmafia      >> A: Because it reverses the logical flow of conversation.
    .com  McQ! (4x80)    >>> Q: Why is top posting frowned upon?

best,
--
Paul Ivanov
314 address only used for lists, off-list direct email at:
http://pirsquared.org | GPG/PGP key id: 0x0F3E28F7

From sebastian at sipsolutions.net  Thu Apr 11 06:32:06 2013
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Thu, 11 Apr 2013 12:32:06 +0200
Subject: [Numpy-discussion] Scheduling the 1.7.1 and 1.8 releases
Message-ID: <1365676326.2506.45.camel@sebastian-laptop>

On Wed, 2013-03-06 at 11:43 -0700, Charles R Harris wrote:
> Hi All,
>
> The development branch has been accumulating stuff since last summer,
> I suggest we look to get it out in May, branching at the end of this
> month.

Hey,

maybe it is a bit early, but I was wondering: what are the things that
should be done before branching? Most of the blockers should be
resolved? I think the largest of those are maybe the deprecations (but
since 1.7.0 is not too long ago, many of them may not need touching).
Aside from that, I think these are lurking/in-progress:

Fix the piled-up mapping.c things:
 * This means resolving https://github.com/numpy/numpy/pull/436 .
   It will be even worse now, but maybe at least most of it can
   be put in without too much work?
 * After that, non-integer deprecations should be done fully before
   1.8, I think, but these need to change mapping.c too (the
   most work is probably the tests...):
    - https://github.com/numpy/numpy/pull/2891
    - https://github.com/numpy/numpy/pull/2825
 * (https://github.com/numpy/numpy/pull/2701)

Are the 2to3 fixers supposed to be finished for 1.8? If yes, there is
probably no point in thinking about branching etc., since backporting
them all is just pain.

Regards,

Sebastian

> Thoughts?
>
> Chuck

From njs at pobox.com  Thu Apr 11 09:14:05 2013
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 11 Apr 2013 14:14:05 +0100
Subject: [Numpy-discussion] Scheduling the 1.7.1 and 1.8 releases
In-Reply-To: <1365676326.2506.45.camel@sebastian-laptop>
Message-ID:

On Thu, Apr 11, 2013 at 11:32 AM, Sebastian Berg wrote:
> maybe it is a bit early, but I was wondering: what are the things that
> should be done before branching? Most of the blockers should be
> resolved? [...]

The "known blockers" list is here:
https://github.com/numpy/numpy/issues?milestone=1&page=1&state=open

#3194: this has a PR and will be closed shortly.

#3008: this is fundamentally trivial, but annoying and confusing. I
*think* we understand generally what we need to do here, just need to
re-arrange some header files so it works:
https://github.com/numpy/numpy/issues/3008#issuecomment-14316385

#2905: trivial, needs doing

#2830: all this actually needs is an INSTALL.txt update.

#2801: need to rip NPY_CHAR out of master. Trivial, needs doing.

#596: needs bumping to 1.9, and a check whether the deprecation warning
message explicitly says "1.8". If so, then it should be made more vague.

#456: same as #596

#378: need to merge #452.

#294: same as #596

> Are the 2to3 fixers supposed to be finished for 1.8? If yes, there is
> probably no point in thinking about branching etc., since backporting
> them all is just pain.

My feeling is that as a matter of policy, we should branch as soon as
the blockers are fixed, and anything that isn't ready can go into 1.9.
If you want a change into 1.8, you should hurry instead of making 1.8
(and everything else in it) wait ;-).
-n

From charlesr.harris at gmail.com  Thu Apr 11 10:28:22 2013
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 11 Apr 2013 08:28:22 -0600
Subject: [Numpy-discussion] Scheduling the 1.7.1 and 1.8 releases
In-Reply-To: <1365676326.2506.45.camel@sebastian-laptop>
Message-ID:

On Thu, Apr 11, 2013 at 4:32 AM, Sebastian Berg wrote:
> Are the 2to3 fixers supposed to be finished for 1.8? If yes, there is
> probably no point in thinking about branching etc., since backporting
> them all is just pain.

There is no milestone for the 2to3 stuff, although I expect it will be
done by the end of April.

Chuck

From njs at pobox.com  Thu Apr 11 12:12:17 2013
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 11 Apr 2013 17:12:17 +0100
Subject: [Numpy-discussion] Scheduling the 1.7.1 and 1.8 releases
Message-ID:

On 11 Apr 2013 15:29, "Charles R Harris" wrote:
> There is no milestone for the 2to3 stuff, although I expect it will be
> done by the end of April.

Will releasing 1.8 with these partially applied cause any issues that
you can think of? (Like making later 1.8.1 backports easier or harder?)
I guess we haven't noticed any problems backporting 1.7 changes yet, so
hopefully it doesn't matter?

-n

From charlesr.harris at gmail.com  Thu Apr 11 12:52:27 2013
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 11 Apr 2013 10:52:27 -0600
Subject: [Numpy-discussion] Scheduling the 1.7.1 and 1.8 releases
Message-ID:

On Thu, Apr 11, 2013 at 10:12 AM, Nathaniel Smith wrote:
> Will releasing 1.8 with these partially applied cause any issues that
> you can think of? (Like making later 1.8.1 backports easier or harder?)

I don't think they should cause problems unless they depend on the
presence of methods not present in 2.4 and 2.5, but that only affects
1.7. It is hard to say what things will cause problems in advance, but
I suspect most things won't be difficult. If the ws_comma fixer is run
and committed after 1.8 is released, there might be cherry-picking
problems that would make backports a bit more work. There are also a
couple of fixes yet to go in that involve version-dependent imports,
memoryview/buffer in particular, that would be good to have in 1.8.

Chuck

From pmhobson at gmail.com  Thu Apr 11 19:20:47 2013
From: pmhobson at gmail.com (Paul Hobson)
Date: Thu, 11 Apr 2013 16:20:47 -0700
Subject: [Numpy-discussion] Please stop bottom posting!!
Message-ID:

On Wed, Apr 3, 2013 at 4:28 PM, Doug Coleman wrote:
> Also, gmail "bottom-posts" by default. It's transparent to gmail
> users. I'd imagine they are some of the biggest offenders.

Interesting. Mine go to the top by default and I always have to expand
the quoted text, trim down as necessary, and then reply below the
relevant bits. A quick gander at gmail's settings doesn't offer
anything obvious. I'll dig deeper later.

From cjwilliams43 at gmail.com  Thu Apr 11 19:49:41 2013
From: cjwilliams43 at gmail.com (Colin J. Williams)
Date: Thu, 11 Apr 2013 19:49:41 -0400
Subject: [Numpy-discussion] Please stop bottom posting!!
Message-ID: <51674C15.1000506@gmail.com>

An HTML attachment was scrubbed...

From matthew.brett at gmail.com  Thu Apr 11 20:07:42 2013
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 11 Apr 2013 17:07:42 -0700
Subject: [Numpy-discussion] Canopy and Anaconda
Message-ID:

Hi,

On Wed, Apr 10, 2013 at 4:54 PM, David Cournapeau wrote:
> Please ask Canopy questions on the corresponding Enthought list, or
> Anaconda questions on the corresponding channel at Continuum. This
> mailing list is for discussion about NumPy itself.

Although it is also worth saying that I would have thought it is
reasonable to ask about general numpy-related infrastructure here.

Cheers,

Matthew

From charlesr.harris at gmail.com  Thu Apr 11 20:14:02 2013
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 11 Apr 2013 18:14:02 -0600
Subject: [Numpy-discussion] Please stop bottom posting!!
In-Reply-To: <51674C15.1000506@gmail.com>
Message-ID:

On Thu, Apr 11, 2013 at 5:49 PM, Colin J. Williams wrote:
> Bottom posting seems to be the accepted Usenet standard.
>
> I don't care; can't someone make a decision, so that we all do the
> same thing? Please develop a rationale or toss a coin and let us know.
> Numpy needs a BDFL (or a shorter term, if you wish).

It's always been bottom posting.

Chuck

From robert.kern at gmail.com  Fri Apr 12 02:03:17 2013
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 12 Apr 2013 11:33:17 +0530
Subject: [Numpy-discussion] Canopy and Anaconda
Message-ID:

On Fri, Apr 12, 2013 at 5:37 AM, Matthew Brett wrote:
> Although it is also worth saying that I would have thought it is
> reasonable to ask about general numpy-related infrastructure here.

Their respective support lines are the best places to get answers about
those products, especially questions about their development roadmaps.
Whether or not it is "reasonable" or on-topic to ask here, one won't
get good answers here.

--
Robert Kern

From robert.kern at gmail.com  Fri Apr 12 02:43:18 2013
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 12 Apr 2013 12:13:18 +0530
Subject: [Numpy-discussion] Please stop bottom posting!!
In-Reply-To: <51674C15.1000506@gmail.com>
Message-ID:

On Fri, Apr 12, 2013 at 5:19 AM, Colin J. Williams wrote:
> PS My last posting used the word Anaconda. It was squelched.

What posting are you referring to? What do you mean by "squelched"? Is
there a post of yours that has not made it to the list yet? I can have
the admin check the queue for messages being held up.

Are you referring to your question about Canopy and Anaconda? It was
not "squelched" in any sense of the word that I understand. You were
pointed to more appropriate fora that will actually have a chance of
answering your question accurately. The Canopy support team does not
monitor this list for questions about Canopy's roadmap (nor should it).
When matplotlib questions get asked here, I consistently point people
to the matplotlib-users list. When people ask numpy questions on
python-list, I redirect them here. While one certainly may ask general
"numpy ecosystem" questions here as a starting point, one must expect
to be pointed to better resources when they exist and to follow up
there.

--
Robert Kern

From g.brandl at gmx.net  Fri Apr 12 02:44:12 2013
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 12 Apr 2013 08:44:12 +0200
Subject: [Numpy-discussion] The "I" dtype character
Message-ID:

On 08.04.2013 09:14, Georg Brandl wrote:
> Hi,
>
> is it intentional that "I" is supported as a dtype character, but
> cannot be suffixed with a size?
>
> >>> dtype('i1')
> dtype('int8')
> >>> dtype('I1')
> dtype('uint32')
>
> I know "u" is documented as unsigned integer, but this seems an
> unnecessary restriction that is confusing.

Looking at the code, I see now that using "I2" raises a
DeprecationWarning anyway, which is fair enough.

Georg

From njs at pobox.com  Fri Apr 12 04:13:20 2013
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 12 Apr 2013 09:13:20 +0100
Subject: [Numpy-discussion] The "I" dtype character
Message-ID:

On Mon, Apr 8, 2013 at 8:14 AM, Georg Brandl wrote:
> is it intentional that "I" is supported as a dtype character, but
> cannot be suffixed with a size?

"i" means "integer". "i1" means "integer with 8 bits". "I" means
"32-bit unsigned integer". "I1" means "32-bit unsigned integer with 8
bits". Obviously this last thing doesn't make much sense :-).
Historically numpy has accepted it anyway, and just ignored the suffix.
In current numpy it's still allowed for backwards compatibility, but
deprecated, and will become an error at some point. See
https://github.com/numpy/numpy/issues/294

-n
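[A short interpreter sketch of the distinction Nathaniel describes. The
unambiguous way to ask for a sized unsigned integer is the "u" code;
output is shown for a typical platform where a C unsigned int is 32
bits.]

    >>> import numpy as np
    >>> np.dtype('i1')       # "integer, 1 byte"
    dtype('int8')
    >>> np.dtype('u4')       # "unsigned integer, 4 bytes" -- the portable spelling
    dtype('uint32')
    >>> np.dtype('I')        # C unsigned int; size suffixes like 'I1' are deprecated
    dtype('uint32')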
From bahtiyor_zohidov at mail.ru  Fri Apr 12 05:34:06 2013
From: bahtiyor_zohidov at mail.ru (Happyman)
Date: Fri, 12 Apr 2013 13:34:06 +0400
Subject: [Numpy-discussion] Impossible to draw a direction of arrows in Python???
Message-ID: <1365759246.587349380@f309.mail.ru>

Hi,

I have encountered a problem while drawing the direction of an arrow. I
have point (x, y) coordinates and their angles. What I want to do is to
draw an arrow according to the given angle (just to show the point
direction as an arrow at each point coordinate). Here, we should assume
the directions '+x', '+y', '-x', '-y' are 90, 0, 270, 180 degrees,
respectively.

I am a bit unfamiliar with Python drawing tools. I am still not sure how
to draw a directional point (an arrow based on the angle), whether I
should use pylab or some other modules... still not sure at all. I put
the following code as a sample to give a better description:

    import numpy as np
    import scipy as sp
    import pylab as pl

    def draw_line(x, y, angle):
        # Inputs:
        x = np.array([2, 4, 8, 10, 12, 14, 16])
        y = np.array([5, 10, 15, 20, 25, 30, 35])
        angles = np.array([45, 275, 190, 100, 280, 18, 45])

        # First, draw the (x, y) coordinates
        ???
        # Second, according to the angle, indicate the direction as an arrow
        ???

Thanks in advance for your friendly support,

--
happy Man

From lists at hilboll.de  Fri Apr 12 05:36:48 2013
From: lists at hilboll.de (Andreas Hilboll)
Date: Fri, 12 Apr 2013 11:36:48 +0200
Subject: [Numpy-discussion] Impossible to draw a direction of arrows in Python???
In-Reply-To: <1365759246.587349380@f309.mail.ru>
Message-ID: <5167D5B0.500@hilboll.de>

> I have encountered a problem while drawing the direction of an arrow.
> [...]

Hi Happyman,

your question would be better suited to the matplotlib-users mailing
list, please re-post there. I think mpl has a quiver method to do what
you want, but I'm not sure. Check the docs at matplotlib.org.

Cheers,
A.
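[For reference, a minimal matplotlib sketch along the lines Andreas
suggests, using the poster's compass-style convention (0 degrees = +y,
90 degrees = +x). The component math is an assumption following that
convention, not code from the thread.]

    import numpy as np
    import matplotlib.pyplot as plt

    x = np.array([2, 4, 8, 10, 12, 14, 16])
    y = np.array([5, 10, 15, 20, 25, 30, 35])
    angles = np.array([45, 275, 190, 100, 280, 18, 45])

    # 0 deg points along +y and 90 deg along +x, so the arrow
    # components are (sin, cos) rather than the usual (cos, sin)
    theta = np.radians(angles)
    u, v = np.sin(theta), np.cos(theta)

    plt.plot(x, y, 'o')           # the points themselves
    plt.quiver(x, y, u, v)        # one arrow per point
    plt.show()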
If I use the same seed on OSX/Windows/Linux, will I get the same stream of random numbers being generated? I need to know if the test I write works across platforms. regards, Andrew. -- _____________________________________ Dr. Andrew Nelson _____________________________________ From sebastian at sipsolutions.net Fri Apr 12 10:56:56 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 12 Apr 2013 16:56:56 +0200 Subject: [Numpy-discussion] Random number generation and testing across different OS's. In-Reply-To: References: Message-ID: <1365778616.3448.11.camel@sebastian-laptop> On Fri, 2013-04-12 at 10:50 -0400, Andrew Nelson wrote: > I have written a differential evolution optimiser that i use for > curvefitting. As a genetic optimisation technique it is stochastic and > relies heavily on random number generators to do the minimisation. As part > of the module tests I would like to write a cross-platform test that checks > if the fitting is being done correctly. > I use an instance of numpy.random.RandomState for the generation. If I use > the seed method on a single platform I get the same output, which I could > use to write a test. However, I am unsure of how the seeding and > RandomState works across platforms. > If I use the same seed on OSX/Windows/Linux, will I get the same stream of > random numbers being generated? I need to know if the test I write works > across platforms. Hi, yes, you can be certain of that. NumPy does exactly this for its test as well. Regards, sebastian > regards, > Andrew. > > -- > _____________________________________ > Dr. Andrew Nelson > > > _____________________________________ > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Fri Apr 12 10:57:37 2013 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 12 Apr 2013 20:27:37 +0530 Subject: [Numpy-discussion] Random number generation and testing across different OS's. In-Reply-To: References: Message-ID: On Fri, Apr 12, 2013 at 8:20 PM, Andrew Nelson wrote: > I have written a differential evolution optimiser that i use for > curvefitting. As a genetic optimisation technique it is stochastic and > relies heavily on random number generators to do the minimisation. As part > of the module tests I would like to write a cross-platform test that checks > if the fitting is being done correctly. > I use an instance of numpy.random.RandomState for the generation. If I use > the seed method on a single platform I get the same output, which I could > use to write a test. However, I am unsure of how the seeding and > RandomState works across platforms. > If I use the same seed on OSX/Windows/Linux, will I get the same stream of > random numbers being generated? I need to know if the test I write works > across platforms. Modulo bugs, if you are using the same exact version of numpy on all platforms, you should be getting the same output bytes from the same seed. Some of the distributions may have some differences between 32-bit and 64-bit in some rare cases where a C long would overflow on the 32-bit platform but won't on a 32-bit platform, but those are usually indicative of bugs to be fixed. -- Robert Kern From chris.barker at noaa.gov Fri Apr 12 11:49:48 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Fri, 12 Apr 2013 08:49:48 -0700 Subject: [Numpy-discussion] Impossible to draw a direction of arrows in Python??? 
In-Reply-To: <1365759246.587349380@f309.mail.ru> References: <1365759246.587349380@f309.mail.ru> Message-ID: On Fri, Apr 12, 2013 at 2:34 AM, Happyman wrote: > I have encountered some problems while I was drawing the direction of an arrow. What are you drawing with? numpy has nothing to do with drawing. It's likely that matplotlib is a good choice, if you are doing general plotting, and not just drawing an arrow. Otherwise, each GUI toolkit has custom drawing, or you can draw to an in-memory image (and save as PNG, pdf, etc) with pyCairo, PIL, pyGD, .... But none of that is numpy. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From chris.barker at noaa.gov Fri Apr 12 12:01:56 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Fri, 12 Apr 2013 09:01:56 -0700 Subject: [Numpy-discussion] Please stop bottom posting!! In-Reply-To: <5EFE864A-01A8-4FE7-807E-5D7D5A94068D@astro.physik.uni-goettingen.de> References: <51674C15.1000506@gmail.com> <5EFE864A-01A8-4FE7-807E-5D7D5A94068D@astro.physik.uni-goettingen.de> Message-ID: On Fri, Apr 12, 2013 at 7:34 AM, Derek Homeier wrote: > In German this kind of faux pas is usually labelled "TOFU" for "text on top, full quote underneath", > and I think it has been a bit overlooked so far that the "full quote" part probably is the bigger problem. Exactly -- my original title "stop bottom posting" was a lame attempt to be a bit ironically humorous.. > IOW a call to try and trim the OP more rigorously should help a lot, and I'd think most people then > can agree on bottom posting Though I'd still argue that I'd prefer top posting to bottom posting in the situation of a small comment added to a big thread. gmail is pretty good at hiding the old stuff, so it's getting less annoying, but when I go to add a comment myself, and find this huge pile of old junk to sort through and delete -- it's really annoying. > ...the issue with mail clients doing that automatically - the thread > in question looks quite readable in Mountain Lion's Mail.app, but a nightmare on Snow Leopard!). Exactly -- email clients are maybe getting too smart for us -- if it looks good in yours, it may not look good in mine. Interesting that this note itself had not a lot, but still too much, worthless junk left over from the previous post... Maybe we need a short-hand for "clean up the previous parts of the thread to show only what you need to make your post relevant and clear" Paul Ivanov wrote: > ... But I just came across a wonderfully short signature from > Rick Moen, and thought I'd pass it along: > Cheers, A: Yes. > Rick Moen > Q: Are you sure? > rick at linuxmafia >> A: Because it reverses the logical flow of conversation. > .com McQ! (4x80) >>> Q: Why is top posting frowned upon? This is cute, and makes a good point, but misses the bigger issue: most interesting technical threads are not a series of linear, simple one-liners. Bottom posting is obviously the way to go for that. They are long, with multiple points, and comments on the middle parts, not just the end, and different people all commenting on the same post. It simply does not end up a neat linear conversation anyway. Neither pure bottom posting nor pure top posting works well. -Chris -- Christopher Barker, Ph.D.
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From derek at astro.physik.uni-goettingen.de Fri Apr 12 12:42:20 2013 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Fri, 12 Apr 2013 18:42:20 +0200 Subject: [Numpy-discussion] Please stop bottom posting!! In-Reply-To: References: <51674C15.1000506@gmail.com> <5EFE864A-01A8-4FE7-807E-5D7D5A94068D@astro.physik.uni-goettingen.de> Message-ID: On 12.04.2013, at 6:01PM, Chris Barker - NOAA Federal wrote: > Maybe we need a short-hand for "clean up the previous parts of the > thread to show only what you need to make your post relevant and > clear" > +1 > Paul Ivanov wrote: >> ... But I just came across a wonderfully short signature from >> Rick Moen, and thought I'd pass it along: > >> Cheers, A: Yes. >> Rick Moen > Q: Are you sure? >> rick at linuxmafia >> A: Because it reverses the logical flow of conversation. >> .com McQ! (4x80) >>> Q: Why is top posting frowned upon? > > This is cute, and makes a good point, but misses the bigger issue: > most interesting technical threads are not a series of linear, simple > one-liners. Bottom posting is obviously the way to go for that. They > are long, with multiple points, and comments on the middle parts, not > just the end, and different people all commenting on the same post. It > simply does not end up a neat linear conversation anyway. Neither pure > bottom posting nor pure top posting works well. Absolutely, I did not intend to suggest posting everything at the bottom of one monolithic quote (even if shortened), but following the corresponding sub-threads of the conversation (and tried to give a better example this time ;-). Cheers, Derek From riccardodemaria at gmail.com Fri Apr 12 12:52:00 2013 From: riccardodemaria at gmail.com (Riccardo De Maria) Date: Fri, 12 Apr 2013 16:52:00 +0000 (UTC) Subject: [Numpy-discussion] Time Zones and datetime64 References: <51654B1E.6050708@gmail.com> Message-ID: Not related to leap seconds and physically accurate time deltas, I have just noticed that SQLite has a nice API: http://www.sqlite.org/lang_datefunc.html from which one can take inspiration. The source contains a date.c which looks reasonably clear. Riccardo From sebastian at sipsolutions.net Fri Apr 12 12:55:30 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 12 Apr 2013 18:55:30 +0200 Subject: [Numpy-discussion] Non-Integer deprecations Message-ID: <1365785730.3448.42.camel@sebastian-laptop> Hey all, just revisiting non-integer (index) deprecations (basically https://github.com/numpy/numpy/pull/2891). I believe for all natural integer arguments, it is correct to do a deprecation if the input is not an integer. (Technically most of these go through PyArray_PyIntAsIntp, and if not they probably should.) This affects all axes, shapes, strides, indices (as well as slices) and (in the future) possibly other integer arguments. Indexing already has a deprecation warning in master, but I think it should be moved further down into PyArray_PyIntAsIntp. Just a couple of minor things to note or that I am wondering about: 1. Does anyone see a problem with rejecting python bools as not integers? Python does not do it, but for array indices they misbehave and I don't see a real reason to allow them even for other integer arguments. 2. This will mean that 1e4, etc. is not valid, 10**4 must be used. 3.
Deprecating (maybe faster than the rest) array.__index__() working for non-0-d arrays seems right, but makes me wonder a bit about __int__ and __float__. (This is unrelated though.) 4. This will affect third party behavior using the public numpy conversion functions. These are long deprecations (except possibly the __index__ one, which I think already has a bug report describing it as wrong...) so if any problems occur there should be time to clear them. This is just a note in case someone sees any problems with the general plan. Regards, Sebastian From chris.barker at noaa.gov Fri Apr 12 15:57:42 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Fri, 12 Apr 2013 12:57:42 -0700 Subject: [Numpy-discussion] Time Zones and datetime64 In-Reply-To: References: <51654B1E.6050708@gmail.com> Message-ID: On Fri, Apr 12, 2013 at 9:52 AM, Riccardo De Maria wrote: > Not related to leap seconds and physically accurate time deltas, I have just > noticed that SQLite has a nice API: > > http://www.sqlite.org/lang_datefunc.html > > from which one can take inspiration. The source contains a date.c which looks > reasonably clear. well, I don't see any timezone support in there at all. It appears they use UTC, though I'm not entirely sure from the docs what now() would return. So I think it's pretty much like my "use UTC" proposal. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From scopatz at gmail.com Fri Apr 12 16:36:14 2013 From: scopatz at gmail.com (Anthony Scopatz) Date: Fri, 12 Apr 2013 15:36:14 -0500 Subject: [Numpy-discussion] Time Zones and datetime64 In-Reply-To: References: <51654B1E.6050708@gmail.com> Message-ID: Thanks for putting this together Chris. I am in favor of option (1) Pure UTC. I think it is the simplest to implement, and to get from / to other time zones is one ufunc application. On the other hand, option (3) full time zone support isn't too bad either. It is more work to implement, but a lot of code could be borrowed from pytz -- which is what makes timezones usable in python at all. Option (2), what datetime does, is the wrong model. This is more complicated in both the implementation and API, and leads to lots of broken code, weird errors, and no clear right way of doing things. Be Well Anthony On Fri, Apr 12, 2013 at 2:57 PM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: > On Fri, Apr 12, 2013 at 9:52 AM, Riccardo De Maria > wrote: > > Not related to leap seconds and physically accurate time deltas, I have > just > > noticed that SQLite has a nice API: > > > > http://www.sqlite.org/lang_datefunc.html > > > > from which one can take inspiration. The source contains a date.c which looks > > reasonably clear. > > well, I don't see any timezone support in there at all. It appears they > use UTC, though I'm not entirely sure from the docs what now() would > return. > > So I think it's pretty much like my "use UTC" proposal. > > > -Chris > > > > > -- > > Christopher Barker, Ph.D.
> Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjwilliams43 at gmail.com Fri Apr 12 17:23:51 2013 From: cjwilliams43 at gmail.com (Colin J. Williams) Date: Fri, 12 Apr 2013 17:23:51 -0400 Subject: [Numpy-discussion] Time Zones and datetime64 In-Reply-To: References: <51654B1E.6050708@gmail.com> Message-ID: <51687B67.7050109@gmail.com> An HTML attachment was scrubbed... URL: From pivanov314 at gmail.com Fri Apr 12 19:33:44 2013 From: pivanov314 at gmail.com (Paul Ivanov) Date: Fri, 12 Apr 2013 16:33:44 -0700 Subject: [Numpy-discussion] Please stop bottom posting!! In-Reply-To: References: <51674C15.1000506@gmail.com> <5EFE864A-01A8-4FE7-807E-5D7D5A94068D@astro.physik.uni-goettingen.de> Message-ID: <20130412233344.GJ9287@HbI-OTOH.berkeley.edu> > Paul Ivanov wrote: > > ... But I just came across a wonderfully short signature from > > Rick Moen, and thought I'd pass it along: > > > Cheers, A: Yes. > > Rick Moen > Q: Are you sure? > > rick at linuxmafia >> A: Because it reverses the logical flow of conversation. > > .com McQ! (4x80) >>> Q: Why is top posting frowned upon? > > This is cute, and makes a good point, but misses the bigger issue: > most interesting technical threads are not a series of linear, simple > one-liners. Bottom posting is obviously the way to go for that. They > are long, with multiple points, and comments on the middle parts, not > just the end, and different people all commenting on the same post. It > simply does not end up a neat linear conversation anyway. Neither pure > bottom posting nor pure top posting works well. Fully agreed, again, so I'm a little puzzled why you say it misses the point, since I started off that post agreeing with you: > Chris Barker - NOAA Federal, on 2013-04-04 09:01, wrote: > > Which is why I advocate interspersed posting. > > This. (with heavy snipping). My inclusion of Rick's signature was intended to just be cute, just as you perceived it. best, -- Paul Ivanov 314 address only used for lists, off-list direct email at: http://pirsquared.org | GPG/PGP key id: 0x0F3E28F7 From riccardodemaria at gmail.com Sat Apr 13 03:42:16 2013 From: riccardodemaria at gmail.com (Riccardo De Maria) Date: Sat, 13 Apr 2013 07:42:16 +0000 (UTC) Subject: [Numpy-discussion] Time Zones and datetime64 References: <51654B1E.6050708@gmail.com> <51687B67.7050109@gmail.com> Message-ID: Colin J. Williams gmail.com> writes: > well, I don't see any timezone support in there at all. It appears they > use UTC, though I'm not entirely sure from the docs what now() would > return. > > So I think it's pretty much like my "use UTC" proposal. I think so. There is no support for leap seconds either. > It's not clear whether the Julian day is an integer or contains > a fractional part. > Colin W. In SQLite the internal representation is Julian day * 86400 * 1000. In astronomy the modified Julian day is apparently often used: CCIR RECOMMENDATION 457-1, USE OF THE MODIFIED JULIAN DATE BY THE STANDARD FREQUENCY AND TIME-SIGNAL SERVICES.
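To make the arithmetic concrete, here is a rough sketch (my own, not taken from SQLite, and it assumes the POSIX convention of exactly 86400 seconds per day, i.e. it ignores leap seconds). The Unix epoch 1970-01-01T00:00 corresponds to JD 2440587.5, and MJD = JD - 2400000.5, so:

def unix_to_mjd(t):
    # POSIX seconds since 1970-01-01T00:00 -> Modified Julian Date
    return t / 86400.0 + 40587.0

def mjd_to_unix(mjd):
    # inverse conversion; the same leap-second caveat applies
    return (mjd - 40587.0) * 86400.0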
There is a python library with classes that support several time coordinate systems (TAI, UTC, ISO, JD, MJD, UNX, RDT, CDF, DOY, eDOY): http://spacepy.lanl.gov/doc/autosummary/spacepy.time.Ticktock.html I also found these pages useful for a quick recap: http://tycho.usno.navy.mil/mjd.html http://tycho.usno.navy.mil/systime.html From charlesr.harris at gmail.com Sun Apr 14 11:39:28 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 14 Apr 2013 09:39:28 -0600 Subject: [Numpy-discussion] Adding gufuncs Message-ID: There is a pull request for work adding linear algebra support as generalized ufuncs. The result is that many of the linear algebra routines can now be applied to stacks of matrices. Another new feature is support for float32 versions of the routines. Some work has also gone into porting the current linalg package to use the new routines. The work isn't finished, the new and old {blas, lapack}_lite libraries should probably be unified and the error handling could maybe use more polish, but I'm inclined to put the PR in at this point. Although some things may break, I think it needs to be out there to gather feedback and testing. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at gmail.com Sun Apr 14 16:32:01 2013 From: faltet at gmail.com (Francesc Alted) Date: Sun, 14 Apr 2013 22:32:01 +0200 Subject: [Numpy-discussion] ANN: numexpr 2.1 RC1 Message-ID: <516B1241.50401@gmail.com> ============================ Announcing Numexpr 2.1 RC1 ============================ Numexpr is a fast numerical expression evaluator for NumPy. With it, expressions that operate on arrays (like "3*a+4*b") are accelerated and use less memory than doing the same calculation in Python. It sports multi-threaded capabilities, as well as support for Intel's VML library, which allows for squeezing the last drop of performance out of your multi-core processors. What's new ========== This version adds compatibility for Python 3. A bunch of thanks to Antonio Valentino for his excellent work on this. I apologize for taking so long in releasing his contributions. In case you want to know in more detail what has changed in this version, see: http://code.google.com/p/numexpr/wiki/ReleaseNotes or have a look at RELEASE_NOTES.txt in the tarball. Where can I find Numexpr? ========================= The project is hosted at Google code at: http://code.google.com/p/numexpr/ This is release candidate 1, so it will not be available on the PyPi repository. I'll post it there when the final version is released. Share your experience ===================== Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. Enjoy! -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidmenhur at gmail.com Mon Apr 15 07:00:03 2013 From: davidmenhur at gmail.com (Daπid) Date: Mon, 15 Apr 2013 13:00:03 +0200 Subject: [Numpy-discussion] Pruned / sparse FFT 2D Message-ID: Hi, I have a sparse cloud of ~10^4 points randomly scattered over a triangular domain [1] from which I want to take the Fourier transform. I have been looking for algorithms and found one library [2], but it only appears to handle the 1-D case (and there seems to be no documentation). In [3] there is C code for FFTW, but it seems the code is itself pruned (pun intended). Does anyone have a clue about how to perform it? Speed is not a big issue, but accuracy is quite important. Thanks, David.
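P.S. For concreteness, the direct evaluation I would use as an accuracy reference is the plain nonuniform DFT. A rough sketch (the names are made up, and it is O(N*M), so slow but exact up to rounding):

import numpy as np

def direct_ft2(x, y, f, u, v):
    # Evaluate F(u[j], v[j]) = sum_k f[k] * exp(-2j*pi*(u[j]*x[k] + v[j]*y[k]))
    # for samples f[k] at scattered points (x[k], y[k]).
    out = np.empty(len(u), dtype=complex)
    for j in range(len(u)):
        out[j] = np.sum(f * np.exp(-2j * np.pi * (u[j] * x + v[j] * y)))
    return out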
[1] http://www.asaaf.org/~davidmh/coverage.png [2] https://pypi.python.org/pypi/pynfftls/1.0 [3] http://www.fftw.org/pruned.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Apr 15 12:29:14 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 15 Apr 2013 18:29:14 +0200 Subject: [Numpy-discussion] MapIter api Message-ID: <1366043354.8595.21.camel@sebastian-laptop> Hey, the MapIter API has only been made public in master, right? So it is no problem at all to change at least the mapiter struct, right? I got annoyed at all those special cases that make it difficult to get an idea of where to put things, i.e. where to fix the boolean array-like stuff. So I actually started rewriting it (and I already got one big function that does all index preparation -- ok, it is untested but it's basically there). I would guess it is not really a big problem even if it had been public for longer, since you probably shouldn't do direct struct access anyway? But just checking. Since I got the test which mimics complex indexes into the tests, I think it should actually be feasible to do bigger refactoring there without having to worry too much about breaking things. Regards, Sebastian From charlesr.harris at gmail.com Mon Apr 15 13:16:37 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 15 Apr 2013 11:16:37 -0600 Subject: [Numpy-discussion] MapIter api In-Reply-To: <1366043354.8595.21.camel@sebastian-laptop> References: <1366043354.8595.21.camel@sebastian-laptop> Message-ID: On Mon, Apr 15, 2013 at 10:29 AM, Sebastian Berg wrote: > Hey, > > the MapIter API has only been made public in master, right? So it is no > problem at all to change at least the mapiter struct, right? > > I got annoyed at all those special cases that make it difficult to > get an idea of where to put things, i.e. where to fix the boolean array-like > stuff. So I actually started rewriting it (and I already got one big > function that > does all index preparation -- ok, it is untested but it's > basically > there).
> > I would guess it is not really a big problem even if it had been > public for > longer, since you probably shouldn't do direct struct access > anyway? But > just checking. > > Since I got the test which mimics complex indexes into the > tests, I think > it should actually be feasible to do bigger refactoring there > without > having to worry too much about breaking things. > > Looks like the public API went in last August but didn't make it into > the 1.7.x release. What sort of schedule are you looking at? > Not sure about a schedule; I somewhat think it is not even that hard, but of course it would still take a while (once I get a bit further, I will put it out there, and hopefully someone else will be interested in helping), but I am certainly not aiming to get anything done for 1.8. My first idea was to just do the parsing differently and keep the mapiter part identical (or with minor modifications).
That seems > actually impractical, since MapIter has a lot of stuff that it does not > need. Plus it seems to me that it might be worth it to use the new > nditer. One could try to keep the fields somewhat identical (likely > identical enough to be binary compatible with that ufunc.at pull request > even), but I am not even sure that that is something to aim for, since > the ufunc.at could be modified too (and might get good speed > improvements out of that). > > - Sebastian > > Makes me wonder if we should expose the API in 1.8 if you are thinking a change might be appropriate. Or am I missing something here? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Apr 16 08:26:45 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 16 Apr 2013 14:26:45 +0200 Subject: [Numpy-discussion] MapIter api In-Reply-To: References: <1366043354.8595.21.camel@sebastian-laptop> <1366054077.8595.47.camel@sebastian-laptop> Message-ID: <1366115205.9563.23.camel@sebastian-laptop> On Mon, 2013-04-15 at 13:36 -0600, Charles R Harris wrote: > > > On Mon, Apr 15, 2013 at 1:27 PM, Sebastian Berg > wrote: > On Mon, 2013-04-15 at 11:16 -0600, Charles R Harris wrote: > > > > > > On Mon, Apr 15, 2013 at 10:29 AM, Sebastian Berg > > wrote: > > Hey, > > > > the MapIter API has only been made public in master, > right? > > Looks like the public API went in last August but didn't > make it into > > the 1.7.x release. What sort of schedule are you looking at? > > > > Not sure about a schedule; I somewhat think it is not even > that hard, > but of course it would still take a while (once I get a bit > further, I > will put it out there, and hopefully someone else will be > interested in > helping), but I am certainly not aiming to get anything done for 1.8. > > My first idea was to just do the parsing differently and keep > the > mapiter part identical (or with minor modifications). That > seems > actually impractical, since MapIter has a lot of stuff that it > does not > need. Plus it seems to me that it might be worth it to use the > new > nditer. One could try to keep the fields somewhat identical > (likely > identical enough to be binary compatible with that ufunc.at > pull request > even), but I am not even sure that that is something to aim > for, since > the ufunc.at could be modified too (and might get good speed > improvements out of that). > > - Sebastian > > > Makes me wonder if we should expose the API in 1.8 if you are thinking > a change might be appropriate. Or am I missing something here? > Yeah, I am wondering about that. But since I am not clear on exactly if and how one would reimplement it right now (certainly it would look very similar in the basic design), there is a bit of time before deciding that, maybe. And maybe someone else has an opinion one way or another? For example, the MapIter currently does not expose the subspace as a separate iterator. You could access it, but you cannot optimize subspace iteration by handling it separately. I am thinking about something like np.nested_iters, but the user would maybe have to check if the inner iterator even exists (the subspace can easily be 0-d or have only one element). Also, I could imagine tagging a second array onto the fancy index iteration itself. That would be iterated together with the fancy indexes in one nditer and would return pointers into its own subspace. That array would be the value array in assignment or the new array in subscription.
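To make the structure I mean concrete, here is a rough pure-Python sketch (public API only, untested, nothing like the actual C internals) of the outer fancy-index iteration with an inner subspace assignment:

import numpy as np

def fancy_assign_sketch(arr, idx, values):
    # Mimics arr[idx, ...] = values for a 1-d integer index array idx
    # and values of shape (len(idx),) + arr.shape[1:].  The outer loop
    # is the "fancy" iteration; each assignment covers the whole
    # subspace (arr.shape[1:]) that the fancy index leaves untouched.
    for k in range(len(idx)):
        arr[idx[k], ...] = values[k]

a = np.zeros((5, 3))
fancy_assign_sketch(a, np.array([4, 0, 4]), np.arange(9.).reshape(3, 3))
# same result as a[np.array([4, 0, 4])] = np.arange(9.).reshape(3, 3)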
- Sebastian > Chuck > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From njs at pobox.com Tue Apr 16 09:54:13 2013 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 16 Apr 2013 14:54:13 +0100 Subject: [Numpy-discussion] MapIter api In-Reply-To: <1366043354.8595.21.camel@sebastian-laptop> References: <1366043354.8595.21.camel@sebastian-laptop> Message-ID: On Mon, Apr 15, 2013 at 5:29 PM, Sebastian Berg wrote: > Hey, > > the MapIter API has only been made public in master, right? So it is no > problem at all to change at least the mapiter struct, right? > > I got annoyed at all those special cases that make it difficult to > get an idea of where to put things, i.e. where to fix the boolean array-like stuff. So > I actually started rewriting it (and I already got one big function that > does all index preparation -- ok, it is untested but it's basically > there). > > I would guess it is not really a big problem even if it had been public for > longer, since you probably shouldn't do direct struct access anyway? But > just checking. Why don't we just make the struct opaque, i.e., just declare it in the public header file and move the actual definition to an internal header file? If it's too annoying I guess we could even make it non-public, at least in 1.8 -- IIRC it's only there so we can use it in umath, and IIRC the patch to use it hasn't landed yet. Or we could just merge umath and multiarray into a single .so, that would save a *lot* of annoying fiddling with the public API that doesn't actually serve any purpose. -n From bwoods at aer.com Tue Apr 16 16:21:52 2013 From: bwoods at aer.com (Bryan Woods) Date: Tue, 16 Apr 2013 16:21:52 -0400 Subject: [Numpy-discussion] taking a 2D uneven surface slice Message-ID: <516DB2E0.6090401@aer.com> I'm trying to do something that at first glance I think should be simple but I can't quite figure out how to do it. The problem is as follows: I have a 3D grid Values[Nx, Ny, Nz] I want to slice Values at a 2D surface in the Z dimension specified by Z_index[Nx, Ny] and return a 2D slice[Nx, Ny]. It is not as simple as Values[:,:,Z_index]. I tried this: >>> values.shape (4, 5, 6) >>> coords.shape (4, 5) >>> slice = values[:,:,coords] >>> slice.shape (4, 5, 4, 5) >>> slice = np.take(values, coords, axis=2) >>> slice.shape (4, 5, 4, 5) >>> Obviously I could create an empty 2D slice and then use np.ndenumerate to fill it point by point, selecting values[i, j, Z_index[i, j]]. This just seems too inefficient and not very pythonic. -------------- next part -------------- A non-text attachment was scrubbed... Name: bwoods.vcf Type: text/x-vcard Size: 341 bytes Desc: not available URL: From brad.froehle at gmail.com Tue Apr 16 16:35:42 2013 From: brad.froehle at gmail.com (Bradley M. Froehle) Date: Tue, 16 Apr 2013 13:35:42 -0700 Subject: [Numpy-discussion] taking a 2D uneven surface slice In-Reply-To: <516DB2E0.6090401@aer.com> References: <516DB2E0.6090401@aer.com> Message-ID: Hi Bryan: On Tue, Apr 16, 2013 at 1:21 PM, Bryan Woods wrote: > I'm trying to do something that at first glance I think should be simple > but I can't quite figure out how to do it. The problem is as follows: > > I have a 3D grid Values[Nx, Ny, Nz] > > I want to slice Values at a 2D surface in the Z dimension specified by > Z_index[Nx, Ny] and return a 2D slice[Nx, Ny]. > > It is not as simple as Values[:,:,Z_index].
> > I tried this: > >>> values.shape > (4, 5, 6) > >>> coords.shape > (4, 5) > >>> slice = values[:,:,coords] > >>> slice.shape > (4, 5, 4, 5) > >>> slice = np.take(values, coords, axis=2) > >>> slice.shape > (4, 5, 4, 5) > >>> > > Obviously I could create an empty 2D slice and then use > np.ndenumerate to fill it point by point, selecting values[i, j, > Z_index[i, j]]. This just seems too inefficient and not very pythonic. > The following should work: >>> values.shape (4,5,6) >>> coords.shape (4,5) >>> values[np.arange(values.shape[0])[:,None], ... np.arange(values.shape[1])[None,:], ... coords].shape (4, 5) Essentially we extract the values we want by values[I,J,K] where the indices I, J and K are each of shape (4,5) [or broadcast-able to that shape]. -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Tue Apr 16 18:44:27 2013 From: srean.list at gmail.com (srean) Date: Tue, 16 Apr 2013 17:44:27 -0500 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: Message-ID: As one lurker to another, thanks for calling it out. Over-argumentative, personality-centric threads like these have actually led me to distance myself from the numpy community. I do not know how common it is now because I do not follow it closely anymore. It used to be quite common at one point in time. I came down to check after a while, and lo, there it is again. If a mail is put forward as a question "I find this confusing, is it confusing for you", it ought not to devolve into a shouting match atop moral high-horses "so you think I am stupid do you? too smart are you? how dare you express that it doesn't bother you as much when it bothers me and my documented case of 4 people. I have four, how many do you have" If something is posed as a question one should be open to the answers. Sometimes it is better not to pose it as a question at all but to offer alternatives and ask for preference. I am not siding with any of the technical options provided, just requesting that the discourse not devolve into these personality-oriented contests. It gets too loud and noisy. Thank you On Sat, Apr 6, 2013 at 12:18 PM, matti picus wrote: > as a lurker, may I say that this discussion seems to have become > non-productive? > > It seems all agree that the docs need improvement, perhaps a first step would > be to suggest doc improvements, and then the need for renaming may become > self-evident, or not. > > aww darn, ruined my lurker status. > Matti Picus > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bob.nnamtrop at gmail.com Tue Apr 16 18:55:53 2013 From: bob.nnamtrop at gmail.com (Bob Nnamtrop) Date: Tue, 16 Apr 2013 16:55:53 -0600 Subject: [Numpy-discussion] datetime64 1970 issue Message-ID: I am curious if others have noticed an issue with datetime64 at the beginning of 1970. First: In [144]: (np.datetime64('1970-01-01') - np.datetime64('1969-12-31')) Out[144]: numpy.timedelta64(1,'D') OK, this looks fine, they are one day apart. But look at this: In [145]: (np.datetime64('1970-01-01 00') - np.datetime64('1969-12-31 00')) Out[145]: numpy.timedelta64(31,'h') Hmmm, seems like there are 7 extra hours? Am I missing something? I don't see this in any other year.
This discontinuity makes it hard to use the datetime64 object without special adjustment in one's code. I assume this is a bug? Thanks, Bob P.S. I'm using the most recent anaconda release on mac os x 10.6.8, which includes numpy 1.7.0. P.P.S. It would be most handy if datetime64 had a constructor of the form np.datetime64(year,month,day,hour,min,sec) where these inputs were numpy arrays and the output would have the same shape as the input arrays (but be of type datetime64). The hour,min,sec would be optional. Scalar inputs would be broadcast to the size of the array inputs, etc. Maybe this is a topic for another post. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej.certik at gmail.com Tue Apr 16 19:45:16 2013 From: ondrej.certik at gmail.com (Ondřej Čertík) Date: Tue, 16 Apr 2013 17:45:16 -0600 Subject: [Numpy-discussion] datetime64 1970 issue In-Reply-To: References: Message-ID: On Tue, Apr 16, 2013 at 4:55 PM, Bob Nnamtrop wrote: > I am curious if others have noticed an issue with datetime64 at the > beginning of 1970. First: > > In [144]: (np.datetime64('1970-01-01') - np.datetime64('1969-12-31')) > Out[144]: numpy.timedelta64(1,'D') > > OK, this looks fine, they are one day apart. But look at this: > > In [145]: (np.datetime64('1970-01-01 00') - np.datetime64('1969-12-31 00')) > Out[145]: numpy.timedelta64(31,'h') > > Hmmm, seems like there are 7 extra hours? Am I missing something? I don't > see this in any other year. This discontinuity makes it hard to use the > datetime64 object without special adjustment in one's code. I assume this is a > bug? Indeed, this looks like a bug, I can reproduce it on linux as well: In [1]: import numpy as np In [2]: np.datetime64('1970-01-01') - np.datetime64('1969-12-31') Out[2]: numpy.timedelta64(1,'D') In [3]: np.datetime64('1970-01-01 00') - np.datetime64('1969-12-31 00') Out[3]: numpy.timedelta64(31,'h') In [4]: np.__version__ Out[4]: '1.7.1' We need to look into the sources to see what is going on. Ondrej From ben.root at ou.edu Tue Apr 16 20:45:11 2013 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 16 Apr 2013 20:45:11 -0400 Subject: [Numpy-discussion] datetime64 1970 issue In-Reply-To: References: Message-ID: On Tue, Apr 16, 2013 at 7:45 PM, Ondřej Čertík wrote: > On Tue, Apr 16, 2013 at 4:55 PM, Bob Nnamtrop > wrote: > > I am curious if others have noticed an issue with datetime64 at the > > beginning of 1970. First: > > > > In [144]: (np.datetime64('1970-01-01') - np.datetime64('1969-12-31')) > > Out[144]: numpy.timedelta64(1,'D') > > > > OK, this looks fine, they are one day apart. But look at this: > > > > In [145]: (np.datetime64('1970-01-01 00') - np.datetime64('1969-12-31 > 00')) > > Out[145]: numpy.timedelta64(31,'h') > > > > Hmmm, seems like there are 7 extra hours? Am I missing something? I don't > > see this in any other year. This discontinuity makes it hard to use the > > datetime64 object without special adjustment in one's code. I assume this is > a > > bug? > > Indeed, this looks like a bug, I can reproduce it on linux as well: > > In [1]: import numpy as np > > In [2]: np.datetime64('1970-01-01') - np.datetime64('1969-12-31') > Out[2]: numpy.timedelta64(1,'D') > > In [3]: np.datetime64('1970-01-01 00') - np.datetime64('1969-12-31 00') > Out[3]: numpy.timedelta64(31,'h') > > Maybe, maybe not... were you alive then? For all we know, Charles and co. were partying an extra 7 hours every day back then?
Just sayin' Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From zploskey at gmail.com Tue Apr 16 23:23:15 2013 From: zploskey at gmail.com (Zachary Ploskey) Date: Tue, 16 Apr 2013 20:23:15 -0700 Subject: [Numpy-discussion] datetime64 1970 issue In-Reply-To: References: Message-ID: The problem does not appear to exist on Linux with numpy version 1.6.2. In [1]: import numpy as np In [2]: np.datetime64('1970-01-01') - np.datetime64('1969-12-31') Out[2]: 1 day, 0:00:00 In [3]: np.datetime64('1970-01-01 00') - np.datetime64('1969-12-31 00') Out[3]: 1 day, 0:00:00 In [4]: np.__version__ Out[4]: '1.6.2' Zach On Tue, Apr 16, 2013 at 4:45 PM, Ondřej Čertík wrote: > On Tue, Apr 16, 2013 at 4:55 PM, Bob Nnamtrop > wrote: > > I am curious if others have noticed an issue with datetime64 at the > > beginning of 1970. First: > > > > In [144]: (np.datetime64('1970-01-01') - np.datetime64('1969-12-31')) > > Out[144]: numpy.timedelta64(1,'D') > > > > OK, this looks fine, they are one day apart. But look at this: > > > > In [145]: (np.datetime64('1970-01-01 00') - np.datetime64('1969-12-31 > 00')) > > Out[145]: numpy.timedelta64(31,'h') > > > > Hmmm, seems like there are 7 extra hours? Am I missing something? I don't > > see this in any other year. This discontinuity makes it hard to use the > > datetime64 object without special adjustment in one's code. I assume this is > a > > bug? > > Indeed, this looks like a bug, I can reproduce it on linux as well: > > In [1]: import numpy as np > > In [2]: np.datetime64('1970-01-01') - np.datetime64('1969-12-31') > Out[2]: numpy.timedelta64(1,'D') > > In [3]: np.datetime64('1970-01-01 00') - np.datetime64('1969-12-31 00') > Out[3]: numpy.timedelta64(31,'h') > > In [4]: np.__version__ > Out[4]: '1.7.1' > > > We need to look into the sources to see what is going on. > > Ondrej > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Apr 17 00:32:08 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 16 Apr 2013 22:32:08 -0600 Subject: [Numpy-discussion] datetime64 1970 issue In-Reply-To: References: Message-ID: On Tue, Apr 16, 2013 at 6:45 PM, Benjamin Root wrote: > > On Tue, Apr 16, 2013 at 7:45 PM, Ondřej Čertík wrote: > >> On Tue, Apr 16, 2013 at 4:55 PM, Bob Nnamtrop >> wrote: >> > I am curious if others have noticed an issue with datetime64 at the >> > beginning of 1970. First: >> > >> > In [144]: (np.datetime64('1970-01-01') - np.datetime64('1969-12-31')) >> > Out[144]: numpy.timedelta64(1,'D') >> > >> > OK, this looks fine, they are one day apart. But look at this: >> > >> > In [145]: (np.datetime64('1970-01-01 00') - np.datetime64('1969-12-31 >> 00')) >> > Out[145]: numpy.timedelta64(31,'h') >> > >> > Hmmm, seems like there are 7 extra hours? Am I missing something? I >> don't >> > see this in any other year. This discontinuity makes it hard to use the >> > datetime64 object without special adjustment in one's code. I assume >> this is a >> > bug?
>> >> Indeed, this looks like a bug, I can reproduce it on linux as well: >> >> In [1]: import numpy as np >> >> In [2]: np.datetime64('1970-01-01') - np.datetime64('1969-12-31') >> Out[2]: numpy.timedelta64(1,'D') >> >> In [3]: np.datetime64('1970-01-01 00') - np.datetime64('1969-12-31 00') >> Out[3]: numpy.timedelta64(31,'h') >> >> > Maybe, maybe not... were you alive then? For all we know, Charles and co. > were partying an extra 7 hours every day back then? > > Dude, it was the 60's, no one remembers. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From bahtiyor_zohidov at mail.ru Wed Apr 17 03:44:12 2013 From: bahtiyor_zohidov at mail.ru (Happyman) Date: Wed, 17 Apr 2013 11:44:12 +0400 Subject: [Numpy-discussion] Finding the same value in List Message-ID: <1366184652.285421609@f246.mail.ru> Hello, I have encountered a problem while I was trying to create the following code, which finds the number of the same values and their indexes in an array or list. Here is the code: y = [ 1, 12, 3, 3, 5, 1, 1, 34, 0, 0, 1, 5] OR y = array( [ 1, 12, 3, 3, 5, 1, 1, 34, 0, 0, 1, 5 ] ) b = [ [ item for item in range( len(y) ) if y[ item ] == y[ j ] ] for j in range(0, len( y ) ) ] answer: [ [ 0, 5, 6, 10], [1], [2, 3], [2, 3], [4, 11], [0, 5, 6, 10], [0, 5, 6, 10], [7], [8, 9], [8, 9], [0, 5, 6, 10], [4, 11] ] The result I want to get is not the one shown above; rather, I want to calculate the number of the same values and their indexes in a non-repeated way as well. For example, '1' - 4, index: '0, 5, 6, 10' '12' - 1, index: '1' '3' - 2, index: '2, 3' '5' - 2, index: '4, 11' '34' - 1, index: '7' '0' - 2, index: '8, 9' Any answer would be appreciated.. -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From toddrjen at gmail.com Wed Apr 17 04:46:17 2013 From: toddrjen at gmail.com (Todd) Date: Wed, 17 Apr 2013 10:46:17 +0200 Subject: [Numpy-discussion] Finding the same value in List In-Reply-To: <1366184652.285421609@f246.mail.ru> References: <1366184652.285421609@f246.mail.ru> Message-ID: x,i=numpy.unique(y, return_inverse=True) f=[numpy.where(i==ind) for ind in range(len(x))] x will give you the list of unique values, and f will give you the indices of each corresponding value in x. So f[0] is the indices of x[0] in y. To explain, unique in this form gives two outputs, a sorted, non-repeating list of values (x), and an array of the same shape as y that gives you the indices of x of each corresponding value of y (i, that is x[i] is the same as y). The second line goes through each index of x and finds where that index occurs in i. On Wed, Apr 17, 2013 at 9:44 AM, Happyman wrote: > Hello, > > I have encountered a problem while I was trying to create the > following code, which finds the number of the same values and their indexes in an > array or list. Here is the code: > > y = [ 1, 12, 3, 3, 5, 1, 1, 34, 0, 0, 1, 5] > OR > y = array( [ 1, 12, 3, 3, 5, 1, 1, 34, 0, 0, 1, 5 ] ) > > b = [ [ item for item in range( len(y) ) if y[ item ] == y[ j ] ] for j > in range(0, len( y ) ) ] > > answer: > [ [ 0, 5, 6, 10], [1], [2, 3], [2, 3], [4, 11], [0, 5, 6, 10], [0, 5, > 6, 10], [7], [8, 9], [8, 9], [0, 5, 6, 10], [4, 11] ] > > The result I want to get is not the one shown above; rather, I > want to calculate the number of the same values and their indexes in a non- > repeated way as well.
For example, > '1' - 4, index: '0, 5, 6, 10' > '12' - 1, index: '1' > '3' - 2, index: '2, 3' > '5' - 2, index: '4, 11' > '34' - 1, index: '7' > '0' - 2, index: '8, 9' > > Any answer would be appreciated.. > > > > > > -- > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From toddrjen at gmail.com Wed Apr 17 05:02:24 2013 From: toddrjen at gmail.com (Todd) Date: Wed, 17 Apr 2013 11:02:24 +0200 Subject: [Numpy-discussion] Finding the same value in List In-Reply-To: References: <1366184652.285421609@f246.mail.ru> Message-ID: On Wed, Apr 17, 2013 at 10:46 AM, Todd wrote: > x,i=numpy.unique(y, return_inverse=True) > f=[numpy.where(i==ind) for ind in range(len(x))] > > > A better version would be (np.where returns tuples, but we don't want tuples): x,i=numpy.unique(y, return_inverse=True) f=[numpy.where(i==ind)[0] for ind in range(len(x))] You can also do it this way, but it is much harder to read IMO: x=numpy.unique(y) f=numpy.split(numpy.argsort(y), numpy.nonzero(numpy.diff(numpy.sort(y)))[0]+1) This version figures out the indexes needed to put the values of y in sorted order (the same order x uses), then splits it into sub-arrays based on value. The principle is simpler but the implementation looks less clear to me. Note that these are only guaranteed to work on 1D arrays; I have not tested them on multidimensional arrays. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bahtiyor_zohidov at mail.ru Wed Apr 17 05:32:26 2013 From: bahtiyor_zohidov at mail.ru (Happyman) Date: Wed, 17 Apr 2013 13:32:26 +0400 Subject: [Numpy-discussion] Finding the same value in List In-Reply-To: References: <1366184652.285421609@f246.mail.ru> Message-ID: <1366191146.411473206@f360.mail.ru> Hi Todd, Greaaat thanks for your help.. By the way, the first one (I think) is much simpler.. I tested it and, of course, it is 1D, but it is also a good idea to consider it for N-dimensional. I prefer the first one! Do you think the first version is okay to use? Wednesday, 17 April 2013, 11:02 +02:00 from Todd : >On Wed, Apr 17, 2013 at 10:46 AM, Todd < toddrjen at gmail.com > wrote: >>x,i=numpy.unique(y, return_inverse=True) >>f=[numpy.where(i==ind) for ind in range(len(x))] >> >> > > >A better version would be (np.where returns tuples, but we don't want tuples): > >x,i=numpy.unique(y, return_inverse=True) >f=[numpy.where(i==ind)[0] for ind in range(len(x))] > >You can also do it this way, but it is much harder to read IMO: > >x=numpy.unique(y) >f=numpy.split(numpy.argsort(y), numpy.nonzero(numpy.diff(numpy.sort(y)))[0]+1) > >This version figures out the indexes needed to put the values of y in sorted order (the same order x uses), then splits it into sub-arrays based on value. The principle is simpler but the implementation looks less clear to me. > >Note that these are only guaranteed to work on 1D arrays; I have not tested them on multidimensional arrays. > >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From sebastian at sipsolutions.net Wed Apr 17 05:53:50 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 17 Apr 2013 11:53:50 +0200 Subject: [Numpy-discussion] Finding the same value in List In-Reply-To: <1366191146.411473206@f360.mail.ru> References: <1366184652.285421609@f246.mail.ru> <1366191146.411473206@f360.mail.ru> Message-ID: <1366192430.9563.28.camel@sebastian-laptop> On Wed, 2013-04-17 at 13:32 +0400, Happyman wrote: > Hi Todd, > Greaaat thanks for your help.. By the way, the first one (I think) is > much simpler.. I tested it and, of course, it is 1D, but it is also a > good idea to consider it for N-dimensional. > I prefer the first one! Do you think the first version is okay to > use? If you are only interested in the count, using np.bincount should be much faster than the list comprehension with "==". Of course that gives you a count of zero for all indexes that do not exist. But even then I very much expect that filtering those out afterwards will be faster, unless your "indexes" can be arbitrarily large. Of course bincount loses the order information, so if you need that, you can only replace the second step with it. - Sebastian > > > Wednesday, 17 April 2013, 11:02 +02:00 from Todd : > On Wed, Apr 17, 2013 at 10:46 AM, Todd > wrote: > x,i=numpy.unique(y, return_inverse=True) > > f=[numpy.where(i==ind) for ind in range(len(x))] > > > > A better version would be (np.where returns tuples, but we > don't want tuples): > > x,i=numpy.unique(y, return_inverse=True) > f=[numpy.where(i==ind)[0] for ind in range(len(x))] > > You can also do it this way, but it is much harder to read > IMO: > > x=numpy.unique(y) > f=numpy.split(numpy.argsort(y), > numpy.nonzero(numpy.diff(numpy.sort(y)))[0]+1) > > This version figures out the indexes needed to put the values > of y in sorted order (the same order x uses), then splits it > into sub-arrays based on value. The principle is simpler but > the implementation looks less clear to me. > > Note that these are only guaranteed to work on 1D arrays; I > have not tested them on multidimensional arrays. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From bahtiyor_zohidov at mail.ru Wed Apr 17 06:05:43 2013 From: bahtiyor_zohidov at mail.ru (Happyman) Date: Wed, 17 Apr 2013 14:05:43 +0400 Subject: [Numpy-discussion] Finding the same value in List In-Reply-To: <1366192430.9563.28.camel@sebastian-laptop> References: <1366184652.285421609@f246.mail.ru> <1366191146.411473206@f360.mail.ru> <1366192430.9563.28.camel@sebastian-laptop> Message-ID: <1366193143.471991139@f431.i.mail.ru> Okay Todd, In both results I got the proper values.. for me the indices are important, and also the counting. From your code, let's say the function is sort_out(data): x, f = sort_out( data ) The data type: x in ndarray and x[ i ] --> int64 type(f) --> ' list ' type( f[ 0 ] ) --> ' tuple ' type( f[ 0 ][ 0 ] ) --> 'ndarray' type( f[ 0 ][ 0 ][ 0 ] ) --> 'int64' How do you think we can avoid this diversity of data types in this example? I think it is not necessary to get diverse dtypes, or more than a 1D array.. Wednesday, 17 April 2013, 11:53 +02:00 from 
Sebastian Berg : >On Wed, 2013-04-17 at 13:32 +0400, Happyman wrote: >> Hi Todd, >> Greaaat thanks for your help.. By the way, the first one (I think) is >> much simpler.. I tested it and, of course, it is 1D, but it is also a >> good idea to consider it for N-dimensional. >> I prefer the first one! Do you think the first version is okay to >> use? > >If you are only interested in the count, using np.bincount should be >much faster than the list comprehension with "==". Of course that gives >you a count of zero for all indexes that do not exist. But even then I >very much expect that filtering those out afterwards will be faster, >unless your "indexes" can be arbitrarily large. Of course bincount loses >the order information, so if you need that, you can only replace the >second step with it. > >- Sebastian >> >> >> Wednesday, 17 April 2013, 11:02 +02:00 from Todd < toddrjen at gmail.com >: >> On Wed, Apr 17, 2013 at 10:46 AM, Todd < toddrjen at gmail.com > >> wrote: >> x,i=numpy.unique(y, return_inverse=True) >> >> f=[numpy.where(i==ind) for ind in range(len(x))] >> >> >> >> A better version would be (np.where returns tuples, but we >> don't want tuples): >> >> x,i=numpy.unique(y, return_inverse=True) >> f=[numpy.where(i==ind)[0] for ind in range(len(x))] >> >> You can also do it this way, but it is much harder to read >> IMO: >> >> x=numpy.unique(y) >> f=numpy.split(numpy.argsort(y), >> numpy.nonzero(numpy.diff(numpy.sort(y)))[0]+1) >> >> This version figures out the indexes needed to put the values >> of y in sorted order (the same order x uses), then splits it >> into sub-arrays based on value. The principle is simpler but >> the implementation looks less clear to me. >> >> Note that these are only guaranteed to work on 1D arrays; I >> have not tested them on multidimensional arrays. >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed...
It must be an int type of some sort of since indices have to be int types. x will be the same dtype as your input array. You could conceivably change the type of f[0] to a list, but why would you want to? One of the big advantages of python is that usually it doesn't matter what the type is. In this case, a numpy ndarray will work the same as a list in most cases where you would want to use these sorts of indices. It is possibly to change the ndarray to a list, but unless there is a specific reason you need to use lists so then it is better not to. You cannot change the list to an ndarray because the elements of the list are different lengths. ndarray doesn't support that. -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew.giessel at gmail.com Wed Apr 17 08:29:25 2013 From: andrew.giessel at gmail.com (andrew giessel) Date: Wed, 17 Apr 2013 08:29:25 -0400 Subject: [Numpy-discussion] contributing to numpy Message-ID: First, my apologies if this isn't the right forum for this question- I looked for a dev list, but couldn't really find it. I have a small method I'd like to contribute to numpy, ideally as a method on ndarrays and a general function in the numpy namespace. I found it on a stackoverflow thread, and it is a generator that yields slices of a multidimensional array over a specified axis, which is convenient for use in list comprehensions and loops. https://gist.github.com/andrewgiessel/5400659 I've forked the numpy source and am familar with git/pull requests/etc but the code base is a bit overwhelming. I have 2 questions: 1) Is there a document which gives an overview of the numpy source and perhaps a tutorial on the best way to add methods/functions? 2) is there a current way to do this in numpy? The only iterator related stuff I found in a brief search last night was for essentially looping over all elements of an array, one by one. best, Andrew ps: please feel free to contribute to the gist! -- Andrew Giessel, PhD Department of Neurobiology, Harvard Medical School 220 Longwood Ave Boston, MA 02115 ph: 617.432.7971 email: andrew_giessel at hms.harvard.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From bahtiyor_zohidov at mail.ru Wed Apr 17 10:19:25 2013 From: bahtiyor_zohidov at mail.ru (=?UTF-8?B?SGFwcHltYW4=?=) Date: Wed, 17 Apr 2013 18:19:25 +0400 Subject: [Numpy-discussion] =?utf-8?q?Finding_the_same_value_in_List?= References: <1366184652.285421609@f246.mail.ru> <1366193143.471991139@f431.i.mail.ru> Message-ID: <1366208365.232811374@f185.mail.ru> okay Todd, I got it. There are some reasons why I preferred asking that question. Let me explain: I am using Excel data which contains 3 columns and say 12 rows to process some simple data. What I want to do with the code you provided is that In the first column A has data that indicates the same ID where 2nd and 3rd has the same value. In other words I will put the following sample data to explain better: A_column = [?22,?92,?64,?64,?77,?77,?64,?64,?22,?92, 99,?200 ] # The same length 12 B_column = [ 8, 8, 8, 8, 8, 0, 0, 0, 0, 0, 12, 0] # The same length 12 C_column = [ 0, 0, 0, 0, 0, 8, 8, 8, 8, 8, 4, 13] # The same length 12 The main reason for the question is as we discussed we already processed data in A_column. B_column has five "8" numbers in C_column as well but not in the same index!!! 
Wednesday, 17 April 2013, 12:34 +02:00 from Todd < toddrjen at gmail.com >: >>The data type: >>x in ndarray and x[ i ]--> int64 >>type(f) --> ' list ' >>type( f[ 0 ] ) --> ' tuple ' >>type( f[ 0][0] ) --> 'ndarray' >>type( f[ 0 ][ 0 ][ 0] ) --> 'int64' >> >>How do you think we can avoid this diversity of data types in this example? I think it is not necessary to get diverse dtypes, or more than a 1D array.. >That is why I suggested this approach was better (note that this is where()[0] instead of just where() as it was in my first example): > >x,i=numpy.unique(y, return_inverse=True) >f=[numpy.where(i==ind)[0] for ind in range(len(x))] > >type(f) --> list >type(f[0]) --> ndarray > >type(f[0][0]) is meaningless since it is just a single element in an array. It must be an int type of some sort, since indices have to be int types. x will be the same dtype as your input array. > >You could conceivably change the type of f[0] to a list, but why would you want to? One of the big advantages of python is that usually it doesn't matter what the type is. In this case, a numpy ndarray will work the same as a list in most cases where you would want to use these sorts of indices. It is possible to change the ndarray to a list, but unless there is a specific reason you need to use lists, it is better not to. > >You cannot change the list to an ndarray because the elements of the list are different lengths; ndarray doesn't support that. >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From bob.nnamtrop at gmail.com Wed Apr 17 10:25:46 2013 From: bob.nnamtrop at gmail.com (Bob Nnamtrop) Date: Wed, 17 Apr 2013 08:25:46 -0600 Subject: [Numpy-discussion] Finding the same value in List In-Reply-To: References: <1366184652.285421609@f246.mail.ru> <1366191146.411473206@f360.mail.ru> <1366192430.9563.28.camel@sebastian-laptop> <1366193143.471991139@f431.i.mail.ru> Message-ID: A bit OT, but I am new to numpy. The help for np.where says: Returns ------- out : ndarray or tuple of ndarrays If both `x` and `y` are specified, the output array contains elements of `x` where `condition` is True, and elements from `y` elsewhere. If only `condition` is given, return the tuple ``condition.nonzero()``, the indices where `condition` is True. However, I don't see any case where it returns an ndarray (it always seems to return a tuple of ndarrays). It seems to me for the case where only 'condition' is given it should return just the ndarray, e.g. (using the case discussed above): In [44]: np.where(i==0) Out[44]: (array([8, 9]),) This should just return the ndarray and not the tuple of ndarrays. In what case does it only return the ndarray? Thanks, Bob
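To make the two return conventions concrete, here is a quick illustration -- this is just standard np.where behavior, not anything from the thread:

import numpy as np

i = np.array([0, 1, 2, 2, 0])

# Condition only: always a tuple of index arrays, one per dimension,
# which is what makes the result directly usable for fancy indexing.
np.where(i == 0)           # -> (array([0, 4]),)
np.where(np.eye(2) == 1)   # -> (array([0, 1]), array([0, 1]))

# With x and y supplied: a plain ndarray, elementwise selection.
np.where(i == 0, -1, i)    # -> array([-1,  1,  2,  2, -1])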
On Wed, Apr 17, 2013 at 4:34 AM, Todd wrote: > The data type: >> x in ndarray and x[ i ]--> int64 >> type(f) --> ' list ' >> type( f[ 0 ] ) --> ' tuple ' >> type( f[ 0][0] ) --> 'ndarray' >> type( f[ 0 ][ 0 ][ 0] ) --> 'int64' >> >> How do you think we can avoid this diversity of data types in this example? I think >> it is not necessary to get diverse dtypes, or more than a 1D array.. >> > > That is why I suggested this approach was better (note that this is > where()[0] instead of just where() as it was in my first example): > > x,i=numpy.unique(y, return_inverse=True) > f=[numpy.where(i==ind)[0] for ind in range(len(x))] > > type(f) --> list > type(f[0]) --> ndarray > > type(f[0][0]) is meaningless since it is just a single element in an > array. It must be an int type of some sort, since indices have to be int > types. x will be the same dtype as your input array. > > You could conceivably change the type of f[0] to a list, but why would you > want to? One of the big advantages of python is that usually it doesn't > matter what the type is. In this case, a numpy ndarray will work the same > as a list in most cases where you would want to use these sorts of indices. > It is possible to change the ndarray to a list, but unless there is a > specific reason you need to use lists, it is better not to. > > You cannot change the list to an ndarray because the elements of the list > are different lengths; ndarray doesn't support that. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From arinkverma at iitrpr.ac.in Wed Apr 17 11:03:45 2013 From: arinkverma at iitrpr.ac.in (Arink Verma) Date: Wed, 17 Apr 2013 20:33:45 +0530 Subject: [Numpy-discussion] Gsoc : Performance parity between numpy arrays and Python scalars Message-ID: Hello everyone I am Arink, a computer science student and open source enthusiast. This year I am interested in working on the project "Performance parity between numpy arrays and Python scalars"[1]. I tried to adopt Raul's work for numpy 1.7 [2] (which was done for numpy 1.6 [3]). So far, by avoiding a) the unnecessary checking for floating point errors, which is slow, and b) the unnecessary creation/destruction of scalar array types, I am getting a speedup of ~1.05 times, which is marginal of course. The project's description mentions that the ufunc lookup code is slow and inefficient. A few questions: 1. Does it have to check every single possible data type until it finds the best match for the data that the operation is being performed on, or is there a better way to find the best possible match? 2. If yes, where are the bottlenecks? Are the checks for proper data types very expensive? [1]http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas [2]https://github.com/arinkverma/numpy/compare/master...gsoc_performance [3]http://article.gmane.org/gmane.comp.python.numeric.general/52480 -- Arink Computer Science and Engineering Indian Institute of Technology Ropar www.arinkverma.in -------------- next part -------------- An HTML attachment was scrubbed... URL:
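A rough way to see the gap the project is about is to time a scalar binary operation directly. This is a sketch only -- the setup strings are illustrative, the numbers are machine-dependent, and none of this comes from Arink's branch:

import timeit

t_py = timeit.timeit('x + y', setup='x = 1.0; y = 2.0', number=1000000)
t_np = timeit.timeit('x + y',
                     setup='import numpy as np; '
                           'x = np.float64(1.0); y = np.float64(2.0)',
                     number=1000000)

ratio = t_np / t_py
# numpy scalar arithmetic typically comes out several times slower than
# the plain Python float operation -- that ratio is the "parity gap".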
From charlesr.harris at gmail.com Wed Apr 17 11:43:40 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 17 Apr 2013 09:43:40 -0600 Subject: [Numpy-discussion] contributing to numpy In-Reply-To: References: Message-ID: On Wed, Apr 17, 2013 at 6:29 AM, andrew giessel wrote: > First, my apologies if this isn't the right forum for this question- I > looked for a dev list, but couldn't really find it. > > I have a small method I'd like to contribute to numpy, ideally as a method > on ndarrays and a general function in the numpy namespace. I found it on a > stackoverflow thread, and it is a generator that yields slices of a > multidimensional array over a specified axis, which is convenient for use > in list comprehensions and loops. > > https://gist.github.com/andrewgiessel/5400659 > > I've forked the numpy source and am familiar with git/pull requests/etc but > the code base is a bit overwhelming. > > I have 2 questions: > > 1) Is there a document which gives an overview of the numpy source and > perhaps a tutorial on the best way to add methods/functions? > The closest thing is probably Contributing to Numpy, which I suspect is not what you are looking for. At this point you pretty much need to dig through the source to see how it is organized. To see how things like tests/documentation are organized look for existing examples and also the relevant docs in doc/. > 2) is there a current way to do this in numpy? The only iterator related > stuff I found in a brief search last night was for essentially looping over > all elements of an array, one by one. > I think a function like this would be useful. There are ad-hoc ways to get the same result but they aren't quite as flexible. A few comments: 1) The function would probably best go in numpy/core/numeric.py 2) It will need a docstring 3) It will need tests in numpy/core/tests/test_numeric.py 4) xrange isn't Python 3 compatible, use range instead. The name isn't very descriptive, maybe iter_over_axis? One possible generalization would be to let axis (axes) be either a number, or a list of axes. The first being the number of leading axes, the second letting one choose arbitrary axes. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Apr 17 11:57:57 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 17 Apr 2013 08:57:57 -0700 Subject: [Numpy-discussion] datetime64 1970 issue In-Reply-To: References: Message-ID: On Tue, Apr 16, 2013 at 9:32 PM, Charles R Harris wrote: > Dude, it was the 60's, no one remembers. I can't say I remember much from then -- but probably because I was 4 years old, not because of too much partying.... -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From chris.barker at noaa.gov Wed Apr 17 12:04:08 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 17 Apr 2013 09:04:08 -0700 Subject: [Numpy-discussion] datetime64 1970 issue In-Reply-To: References: Message-ID: On Tue, Apr 16, 2013 at 8:23 PM, Zachary Ploskey wrote: > The problem does not appear to exist on Linux with numpy version 1.6.2. datetime64 was revamped a fair bit between 1.6 and 1.7; something is up here for sure with 1.7. We can be more dramatic about it: In [5]: np.datetime64('1970-01-01T00') - np.datetime64('1969-12-31T23:59 ') Out[5]: numpy.timedelta64(481,'m') In [6]: np.datetime64('1970-01-01T00:00') - np.datetime64('1969-12-31T23:59 ') Out[6]: numpy.timedelta64(481,'m') 1970-01-01T00 is the epoch, of course, so no surprise that if there is something wrong, it would be there. Interesting that no one noticed this before -- I guess I haven't happened to use pre-1970 dates -- and/or didn't check for accuracy down to a few hours... I'd say we need some more unit-tests! We've been discussing some changes to datetime64 anyway, so it's a good time to open it up.
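In that spirit, a sketch of the kind of regression test this suggests. It is written against the behavior one would expect (timezone-free arithmetic across the epoch), not against what 1.7 currently does -- so on an affected build it would fail, which is the point:

import numpy as np
from numpy.testing import assert_equal

def test_timedelta_across_epoch():
    # One hour across the 1970-01-01 boundary:
    delta = np.datetime64('1970-01-01 00') - np.datetime64('1969-12-31 23')
    assert_equal(delta, np.timedelta64(1, 'h'))
    # One full day across the boundary:
    delta = np.datetime64('1970-01-01 00') - np.datetime64('1969-12-31 00')
    assert_equal(delta, np.timedelta64(24, 'h'))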
-Chris > In [1]: import numpy as np > > In [2]: np.datetime64('1970-01-01') - np.datetime64('1969-12-31') > Out[2]: 1 day, 0:00:00 > > In [3]: np.datetime64('1970-01-01 00') - np.datetime64('1969-12-31 00') > Out[3]: 1 day, 0:00:00 > > In [4]: np.__version__ > Out[4]: '1.6.2' > > Zach > > > On Tue, Apr 16, 2013 at 4:45 PM, Ondřej Čertík > wrote: >> >> On Tue, Apr 16, 2013 at 4:55 PM, Bob Nnamtrop >> wrote: >> > I am curious if others have noticed an issue with datetime64 at the >> > beginning of 1970. First: >> > >> > In [144]: (np.datetime64('1970-01-01') - np.datetime64('1969-12-31')) >> > Out[144]: numpy.timedelta64(1,'D') >> > >> > OK this looks fine, they are one day apart. But look at this: >> > >> > In [145]: (np.datetime64('1970-01-01 00') - np.datetime64('1969-12-31 >> > 00')) >> > Out[145]: numpy.timedelta64(31,'h') >> > >> > Hmmm, seems like there are 7 extra hours? Am I missing something? I >> > don't >> > see this at any other year. This discontinuity makes it hard to use the >> > datetime64 object without special adjustment in one's code. I assume this >> > is a >> > bug? >> >> Indeed, this looks like a bug, I can reproduce it on linux as well: >> >> In [1]: import numpy as np >> >> In [2]: np.datetime64('1970-01-01') - np.datetime64('1969-12-31') >> Out[2]: numpy.timedelta64(1,'D') >> >> In [3]: np.datetime64('1970-01-01 00') - np.datetime64('1969-12-31 00') >> Out[3]: numpy.timedelta64(31,'h') >> >> In [4]: np.__version__ >> Out[4]: '1.7.1' >> >> >> We need to look into the sources to see what is going on. >> >> Ondrej >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From chris.barker at noaa.gov Wed Apr 17 12:07:50 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 17 Apr 2013 09:07:50 -0700 Subject: [Numpy-discussion] datetime64 1970 issue In-Reply-To: References: Message-ID: On Wed, Apr 17, 2013 at 9:04 AM, Chris Barker - NOAA Federal wrote: > On Tue, Apr 16, 2013 at 8:23 PM, Zachary Ploskey wrote: > I'd say we need some more unit-tests! speaking of which, where are the tests?
I just did a quick poke at > github, and found: > > https://github.com/numpy/numpy/tree/master/numpy/testing > and > https://github.com/numpy/numpy/tree/master/numpy/test > > but there's very little in there. ah -- in another thread, Charles just pointed someone to: https://github.com/numpy/numpy/tree/master/numpy/core/tests so -- never mind. -Chris > > -Chris > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From sebastian at sipsolutions.net Wed Apr 17 12:11:24 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 17 Apr 2013 18:11:24 +0200 Subject: [Numpy-discussion] datetime64 1970 issue In-Reply-To: References: Message-ID: <1366215084.9563.50.camel@sebastian-laptop> On Wed, 2013-04-17 at 09:07 -0700, Chris Barker - NOAA Federal wrote: > On Wed, Apr 17, 2013 at 9:04 AM, Chris Barker - NOAA Federal > wrote: > > On Tue, Apr 16, 2013 at 8:23 PM, Zachary Ploskey wrote: > > I'd say we need some more unit-tests! > > speaking of which, where are the tests? I just did a quick poke at > github, and found: > > https://github.com/numpy/numpy/tree/master/numpy/testing > and > https://github.com/numpy/numpy/tree/master/numpy/test > > but there's very little in there. > The datetime is implemented in the core, so the tests are here: https://github.com/numpy/numpy/blob/master/numpy/core/tests/test_datetime.py - Sebastian > -Chris > > From ondrej.certik at gmail.com Wed Apr 17 12:42:53 2013 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Wed, 17 Apr 2013 10:42:53 -0600 Subject: [Numpy-discussion] datetime64 1970 issue In-Reply-To: <1366215084.9563.50.camel@sebastian-laptop> References: <1366215084.9563.50.camel@sebastian-laptop> Message-ID: On Wed, Apr 17, 2013 at 10:11 AM, Sebastian Berg wrote: > On Wed, 2013-04-17 at 09:07 -0700, Chris Barker - NOAA Federal wrote: >> On Wed, Apr 17, 2013 at 9:04 AM, Chris Barker - NOAA Federal >> wrote: >> > On Tue, Apr 16, 2013 at 8:23 PM, Zachary Ploskey wrote: >> > I'd say we need some more unit-tests! >> >> speaking of which, where are the tests? I just did a quick poke at >> github, and found: >> >> https://github.com/numpy/numpy/tree/master/numpy/testing >> and >> https://github.com/numpy/numpy/tree/master/numpy/test >> >> but there's very little in there. 
>> > > The datetime is implemented in the core, so the tests are here: > > https://github.com/numpy/numpy/blob/master/numpy/core/tests/test_datetime.py I wonder if the problem has something to do with these functions here: https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/datetime_strings.c#L189 https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/datetime_strings.c#L265 Ondrej From chris.barker at noaa.gov Wed Apr 17 12:53:53 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 17 Apr 2013 09:53:53 -0700 Subject: [Numpy-discussion] datetime64 1970 issue In-Reply-To: References: Message-ID: On Tue, Apr 16, 2013 at 3:55 PM, Bob Nnamtrop wrote: > pss It would be most handy if datetime64 had a constructor of the form > np.datetime64(year,month,day,hour,min,sec) where these inputs were numpy > arrays and the output would have the same shape as the input arrays (but be > of type datetime64). The hour,min,sec would be optional. Scalar inputs would > be broadcast to the size of the array inputs, etc. Maybe this is a topic for > another post. Indeed -- there are a couple other threads about DateTime64 right now, and we need to get a new NEP draft going (I've started discussion, but haven't had a chance to put it in a NEP yet...) It would be good to add this. How/Where do I start a NEP, anyway? It looks like they go here: https://github.com/numpy/numpy/tree/master/doc/neps But do I get it in pretty good shape first? Do I ask someone to commit it for me? -Chris > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From andrew_giessel at hms.harvard.edu Wed Apr 17 14:18:32 2013 From: andrew_giessel at hms.harvard.edu (Andrew Giessel) Date: Wed, 17 Apr 2013 14:18:32 -0400 Subject: [Numpy-discussion] contributing to numpy In-Reply-To: References: Message-ID: Chuck- Thank you for the very helpful and encouraging email! I will first try to just add a function, rather than a method on ndarray (which looks to be lower level ie: C). The pointer to numeric.py / test_numeric.py is exactly what I needed. I will of course figure out a good test and document it well. For starters, I'll make the function simply take an integer corresponding to an axis. I'm not exactly sure what you mean by generalizing to take multiple axes -- would the idea be to return slices of an array w/o one of the dimensions? I'll try to tackle this over the next week and hopefully the conversation on the PR will be the place to talk about these issues. I'll need to figure out the best way to have a dev branch of numpy side-by-side with a stock version, and how to build the module, first. Lastly, I'll also try to write up something re: my experience so others can have something to take a look at. best+thanks, ag On Wed, Apr 17, 2013 at 11:43 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Wed, Apr 17, 2013 at 6:29 AM, andrew giessel wrote: > >> First, my apologies if this isn't the right forum for this question- I >> looked for a dev list, but couldn't really find it. >> >> I have a small method I'd like to contribute to numpy, ideally as a >> method on ndarrays and a general function in the numpy namespace. 
I found >> it on a stackoverflow thread, and it is a generator that yields slices of a >> multidimensional array over a specified axis, which is convenient for use >> in list comprehensions and loops. >> >> https://gist.github.com/andrewgiessel/5400659 >> >> I've forked the numpy source and am familar with git/pull requests/etc >> but the code base is a bit overwhelming. >> >> I have 2 questions: >> >> 1) Is there a document which gives an overview of the numpy source and >> perhaps a tutorial on the best way to add methods/functions? >> > > The closest thing is probably Contributing to Numpy > , > which I suspect is not what you are looking for. At this point you pretty > much need to dig through the source to see how it is organized. To see how > things like tests/documentation are organized look for existing examples > and also the relevant docs in doc/. > > >> 2) is there a current way to do this in numpy? The only iterator related >> stuff I found in a brief search last night was for essentially looping over >> all elements of an array, one by one. >> > > I think a function like this would be useful. There are ad-hoc ways to get > the same result but they aren't quite as flexible. A few comments > > 1) The function would probably best go in numpy/core/numeric.py > 2) It will need a docstring > 3) It will need tests in numpy/core/tests/test_numeric.py > 4) xrange isn't Python 3 compatible, use range instead. > > The name isn't very descriptive, maybe iter_over_axis? One possible > generalization would to let axis (axes) be either a number, or a list of > axes. The first being the number of leading axes, the second letting one > choose arbitrary axes. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Andrew Giessel, PhD Department of Neurobiology, Harvard Medical School 220 Longwood Ave Boston, MA 02115 ph: 617.432.7971 email: andrew_giessel at hms.harvard.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Wed Apr 17 15:09:34 2013 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 17 Apr 2013 22:09:34 +0300 Subject: [Numpy-discussion] Adding gufuncs In-Reply-To: References: Message-ID: 14.04.2013 18:39, Charles R Harris kirjoitti: > There is a pull request for > work adding linear algebra support as generalized ufuncs. The result is > that many of the linear algebra routines can now be applied to stacks of > matrices. Another new feature is support for float32 versions of the > routines. Some work has also gone into porting the current linalg > package to use the new routines. > > The work isn't finished, the new and old libraries {blas, lapack}_lite > libraries should probably be united and the error handling could maybe > use more polish, but I'm inclined to put the PR in at this point. > Although some things may break, I think it needs to be out there to > gather feedback and testing. It's merged: >>> a = np.array([[[1, 2], [3, 4]], [[1, 2], [2, 1]], [[1, 3], [3, 1]] ]) >>> np.linalg.det(a) array([-2., -3., -8.]) -- Pauli Virtanen From raul at virtualmaterials.com Wed Apr 17 15:17:38 2013 From: raul at virtualmaterials.com (Raul Cota) Date: Wed, 17 Apr 2013 13:17:38 -0600 Subject: [Numpy-discussion] Gsoc : Performance parity between numpy arrays and Python scalars In-Reply-To: References: Message-ID: <516EF552.2000803@virtualmaterials.com> An HTML attachment was scrubbed... 
URL: From bob.nnamtrop at gmail.com Wed Apr 17 16:09:25 2013 From: bob.nnamtrop at gmail.com (Bob Nnamtrop) Date: Wed, 17 Apr 2013 14:09:25 -0600 Subject: [Numpy-discussion] datetime64 1970 issue In-Reply-To: References: Message-ID: It would seem that before 1970 the dates do not include the time zone adjustment while after 1970 they do. This is the source of the extra 7 hours. In [21]: np.datetime64('1970-01-01 00') Out[21]: numpy.datetime64('1970-01-01T00:00-0700','h') In [22]: np.datetime64('1969-12-31 00') Out[22]: numpy.datetime64('1969-12-31T00:00Z','h') I saw the other thread about the time zone issues and I think getting rid of timezones (perhaps unless they are explicitly requested) is the right thing to do. Bob On Tue, Apr 16, 2013 at 4:55 PM, Bob Nnamtrop wrote: > I am curious if others have noticed an issue with datetime64 at the > beginning of 1970. First: > > In [144]: (np.datetime64('1970-01-01') - np.datetime64('1969-12-31')) > Out[144]: numpy.timedelta64(1,'D') > > OK this looks fine, they are one day apart. But look at this: > > In [145]: (np.datetime64('1970-01-01 00') - np.datetime64('1969-12-31 00')) > Out[145]: numpy.timedelta64(31,'h') > > Hmmm, seems like there are 7 extra hours? Am I missing something? I don't > see this at any other year. This discontinuity makes it hard to use the > datetime64 object without special adjustment in one's code. I assume this is a > bug? > > Thanks, > Bob > > ps I'm using the most recent anaconda release on mac os x 10.6.8 which > includes numpy 1.7.0. > > pss It would be most handy if datetime64 had a constructor of the form > np.datetime64(year,month,day,hour,min,sec) where these inputs were numpy > arrays and the output would have the same shape as the input arrays (but be > of type datetime64). The hour,min,sec would be optional. Scalar inputs > would be broadcast to the size of the array inputs, etc. Maybe this is a > topic for another post. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej.certik at gmail.com Wed Apr 17 17:21:09 2013 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Wed, 17 Apr 2013 15:21:09 -0600 Subject: [Numpy-discussion] datetime64 1970 issue In-Reply-To: References: Message-ID: On Wed, Apr 17, 2013 at 2:09 PM, Bob Nnamtrop wrote: > It would seem that before 1970 the dates do not include the time zone > adjustment while after 1970 they do. This is the source of the extra 7 > hours. > > In [21]: np.datetime64('1970-01-01 00') > Out[21]: numpy.datetime64('1970-01-01T00:00-0700','h') > > In [22]: np.datetime64('1969-12-31 00') > Out[22]: numpy.datetime64('1969-12-31T00:00Z','h') > > I saw the other thread about the time zone issues and I think getting rid of > timezones (perhaps unless they are explicitly requested) is the right thing > to do. I think you are right. The timezone conversion only happens for years >= 1970: https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/datetime_strings.c#L299 Which explains the problem. Ondrej From chris.barker at noaa.gov Wed Apr 17 19:10:39 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 17 Apr 2013 16:10:39 -0700 Subject: [Numpy-discussion] datetime64 1970 issue In-Reply-To: References: Message-ID: On Wed, Apr 17, 2013 at 1:09 PM, Bob Nnamtrop wrote: > It would seem that before 1970 the dates do not include the time zone > adjustment while after 1970 they do. This is the source of the extra 7 > hours.
> > In [21]: np.datetime64('1970-01-01 00') > > Out[21]: numpy.datetime64('1970-01-01T00:00-0700','h') > > > > In [22]: np.datetime64('1969-12-31 00') > > Out[22]: numpy.datetime64('1969-12-31T00:00Z','h') wow! that is so wrong, and confusing -- I thought I had an idea what was going on here: datetime64 currently does a timezone adjustment at two places: 1) when constructing a datetime64 from an ISO string 2) when constructing an ISO string from a datetime64 This: In [110]: np.datetime64('1969-12-31 00').view(np.int64) Out[110]: -24 In [111]: np.datetime64('1970-01-01 00').view(np.int64) Out[111]: 8 indicates that it is doing the input transition differently, as the underlying value is wrong for one. (another weird note -- I'm in Pacific time, which is -7 now, with DST....so why the 8?) That explains the timedelta error. But the output is odd, too: In [117]: np.datetime64(datetime.datetime(1969, 12, 31, 0)) Out[117]: numpy.datetime64('1969-12-31T00:00:00.000000Z') In [118]: np.datetime64(datetime.datetime(1970, 1, 1, 0)) Out[118]: numpy.datetime64('1969-12-31T16:00:00.000000-0800') (when converting datetime.datetime objects, no timezone adjustment is applied) I suspect that it's trying to use the system time functions (which will apply the locale), but that they don't work before 1970...at least on *nix machines. Anyone tested this on Windows? We REALLY need to fix this! -Chris > I saw the other thread about the time zone issues and I think getting rid of > timezones (perhaps unless they are explicitly requested) is the right thing > to do. > > Bob > > On Tue, Apr 16, 2013 at 4:55 PM, Bob Nnamtrop > wrote: >> >> I am curious if others have noticed an issue with datetime64 at the >> beginning of 1970. First: >> >> In [144]: (np.datetime64('1970-01-01') - np.datetime64('1969-12-31')) >> Out[144]: numpy.timedelta64(1,'D') >> >> OK this looks fine, they are one day apart. But look at this: >> >> In [145]: (np.datetime64('1970-01-01 00') - np.datetime64('1969-12-31 >> 00')) >> Out[145]: numpy.timedelta64(31,'h') >> >> Hmmm, seems like there are 7 extra hours? Am I missing something? I don't >> see this at any other year. This discontinuity makes it hard to use the >> datetime64 object without special adjustment in one's code. I assume this is a >> bug? >> >> Thanks, >> Bob >> >> ps I'm using the most recent anaconda release on mac os x 10.6.8 which >> includes numpy 1.7.0. >> >> pss It would be most handy if datetime64 had a constructor of the form >> np.datetime64(year,month,day,hour,min,sec) where these inputs were numpy >> arrays and the output would have the same shape as the input arrays (but be >> of type datetime64). The hour,min,sec would be optional. Scalar inputs would >> be broadcast to the size of the array inputs, etc. Maybe this is a topic for >> another post. > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From chris.barker at noaa.gov Wed Apr 17 20:28:11 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 17 Apr 2013 17:28:11 -0700 Subject: [Numpy-discussion] numpy scalars and savez -- bug? Message-ID: Folks, I've discovered something interesting (bug?) with numpy scalars and savez.
If I save a numpy scalar, then reload it, it comes back as a rank-0 array -- similar, but not the same thing: In [144]: single_value, type(single_value) Out[144]: (2.0, numpy.float32) In [145]: np.savez('test.npz', single_value=single_value) In [146]: single_value2 = np.load('test.npz')['single_value'] In [147]: single_value, type(single_value) Out[147]: (2.0, numpy.float32) In [148]: single_value2, type(single_value2) Out[148]: (array(2.0, dtype=float32), numpy.ndarray) straight np.save has the same issue (which makes sense, I'm sure savez uses the save code under the hood): In [149]: single_value, type(single_value) Out[149]: (2.0, numpy.float32) In [150]: np.save('test.npy', single_value) In [151]: single_value2 = np.load('test.npy') In [152]: single_value2, type(single_value2) Out[152]: (array(2.0, dtype=float32), numpy.ndarray) This has been annoying, particularly as rank-zero scalars are kind of a pain. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov
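The coercion can be seen without the file round-trip. A sketch of what appears to be happening (np.save coercing its input to an array -- see Robert Kern's note further down), plus one way to get a scalar back out of the loaded rank-0 array; the empty-tuple indexing is standard numpy, not anything save-specific:

import numpy as np

val = np.float32(2.0)
arr = np.asanyarray(val)   # what save() effectively stores: a rank-0 array
# arr.shape -> (), arr.dtype -> float32

scalar_again = arr[()]     # empty-tuple indexing gives back a numpy.float32
py_float = arr.item()      # .item() gives a plain Python float instead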
From chris.barker at noaa.gov Wed Apr 17 20:36:19 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 17 Apr 2013 17:36:19 -0700 Subject: [Numpy-discussion] Time Zones and datetime64 In-Reply-To: References: <51654B1E.6050708@gmail.com> Message-ID: On Fri, Apr 12, 2013 at 1:36 PM, Anthony Scopatz wrote: > Option (2), what datetime does, is the wrong model. This is more > complicated in both the implementation and API, and leads to lots of broken > code, weird errors, and no clear right way of doing things. could you elaborate a bit more? I've never tried to use timezone support with datetime, so I have no idea what goes wrong -- but it looks reasonable to me. Though it really punts on the hard stuff, so maybe no point. -Chris > Be Well > Anthony > > > > On Fri, Apr 12, 2013 at 2:57 PM, Chris Barker - NOAA Federal > wrote: >> >> On Fri, Apr 12, 2013 at 9:52 AM, Riccardo De Maria >> wrote: >> > Not related to leap seconds and physically accurate time deltas, I have >> > just >> > noticed that SQLite has a nice API: >> > >> > http://www.sqlite.org/lang_datefunc.html >> > >> > that one can be inspired from. The source contains a date.c which looks >> > reasonably clear. >> >> well, I don't see any timezone support in there at all. It appears they >> use UTC, though I'm not entirely sure from the docs what now() would >> return. >> >> So I think it's pretty much like my "use UTC" proposal. >> >> >> -Chris >> >> >> >> >> -- >> >> Christopher Barker, Ph.D. >> Oceanographer >> >> Emergency Response Division >> NOAA/NOS/OR&R (206) 526-6959 voice >> 7600 Sand Point Way NE (206) 526-6329 fax >> Seattle, WA 98115 (206) 526-6317 main reception >> >> Chris.Barker at noaa.gov >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From ben.root at ou.edu Wed Apr 17 21:05:58 2013 From: ben.root at ou.edu (Benjamin Root) Date: Wed, 17 Apr 2013 21:05:58 -0400 Subject: [Numpy-discussion] datetime64 1970 issue In-Reply-To: References: Message-ID: On Wed, Apr 17, 2013 at 7:10 PM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: > On Wed, Apr 17, 2013 at 1:09 PM, Bob Nnamtrop > wrote: > > It would seem that before 1970 the dates do not include the time zone > > adjustment while after 1970 they do. This is the source of the extra 7 > > hours. > > > > In [21]: np.datetime64('1970-01-01 00') > > Out[21]: numpy.datetime64('1970-01-01T00:00-0700','h') > > > > In [22]: np.datetime64('1969-12-31 00') > > Out[22]: numpy.datetime64('1969-12-31T00:00Z','h') > > In [111]: np.datetime64('1970-01-01 00').view(np.int64) > Out[111]: 8 > > indicates that it is doing the input transition differently, as the > underlying value is wrong for one. > (another weird note -- I'm in Pacific time, which is -7 now, with > DST....so why the 8?) > Aren't we on standard time at Jan 1st? So, at that date, you would have been -8. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From john at picloud.com Wed Apr 17 21:21:00 2013 From: john at picloud.com (John Riley) Date: Wed, 17 Apr 2013 18:21:00 -0700 Subject: [Numpy-discussion] NumPy 1.7.0 w/ MKL available on PiCloud Message-ID: Hello, I'm happy to announce that NumPy 1.7.0 is now available on PiCloud [1]. Previously, NumPy 1.6.2 was the most up-to-date version available by default. While any user could always create a custom environment, and install the latest version themselves [2], we've decided to address the issue directly by having the latest NumPy available as part of the public environment "/picloud/science". Using that environment will give you access to 1.7.0, and we plan to have it track the latest version of popular scientific packages including SciPy. Note also that NumPy has been compiled with the Intel Math Kernel Library (MKL), which greatly accelerates certain operations (for ex. numpy.dot), especially on the hyper-threading-enabled f2 core. If you're unfamiliar with how to use NumPy on PiCloud, please see our documentation [3] [4]. Hope this helps! [1] http://www.picloud.com [2] http://docs.picloud.com/environment.html [3] http://docs.picloud.com/howto/pyscientifictools.html [4] http://docs.picloud.com/howto/primer.html Best Regards, John -- John Riley PiCloud, Inc. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jorisvandenbossche at gmail.com Thu Apr 18 02:27:20 2013 From: jorisvandenbossche at gmail.com (Joris Van den Bossche) Date: Thu, 18 Apr 2013 08:27:20 +0200 Subject: [Numpy-discussion] datetime64 1970 issue In-Reply-To: References: Message-ID: 2013/4/18 Chris Barker - NOAA Federal > On Wed, Apr 17, 2013 at 1:09 PM, Bob Nnamtrop > wrote: > > It would seem that before 1970 the dates do not include the time zone > > adjustment while after 1970 they do. This is the source of the extra 7 > > hours. > > > > In [21]: np.datetime64('1970-01-01 00') > > Out[21]: numpy.datetime64('1970-01-01T00:00-0700','h') > > > > In [22]: np.datetime64('1969-12-31 00') > > Out[22]: numpy.datetime64('1969-12-31T00:00Z','h') > > wow!
that is so wrong, and confusing -- I thought I had an idea what > was going on here: > > datetime64 currently does a timezone adjustment at two places: > > 1) when constructing a datetime64 from an ISO string > 2) when constructing an ISO string from a datetime64 > > This: > In [110]: np.datetime64('1969-12-31 00').view(np.int64) > Out[110]: -24 > > In [111]: np.datetime64('1970-01-01 00').view(np.int64) > Out[111]: 8 > > indicates that it is doing the input transition differently, as the > underlying value is wrong for one. > (another weird note -- I'm in Pacific time, which is -7 now, with > DST....so why the 8?) > > That explains the timedelta error. > > But the output is odd, too: > > In [117]: np.datetime64(datetime.datetime(1969, 12, 31, 0)) > Out[117]: numpy.datetime64('1969-12-31T00:00:00.000000Z') > > In [118]: np.datetime64(datetime.datetime(1970, 1, 1, 0)) > Out[118]: numpy.datetime64('1969-12-31T16:00:00.000000-0800') > > (when converting datetime.datetime objects, no timezone adjustment is > applied) > > I suspect that it's trying to use the system time functions (which will > apply the locale), but that they don't work before 1970...at least on > *nix machines. > > Anyone tested this on Windows? > On Windows 7, numpy 1.7.0 (Anaconda 1.4.0 64 bit), I don't even get a wrong answer, but an error: In [3]: np.datetime64('1969-12-31 00') Out[3]: numpy.datetime64('1969-12-31T00:00Z','h') In [4]: np.datetime64('1970-01-01 00') --------------------------------------------------------------------------- OSError Traceback (most recent call last) in () ----> 1 np.datetime64('1970-01-01 00') OSError: Failed to use 'mktime' to convert local time to UTC > We REALLY need to fix this! > > -Chris > > > > > > > > > > > I saw the other thread about the time zone issues and I think getting > rid of > > timezones (perhaps unless they are explicitly requested) is the right > thing > > to do. > > > > Bob > > > > > > On Tue, Apr 16, 2013 at 4:55 PM, Bob Nnamtrop > > wrote: > >> > >> I am curious if others have noticed an issue with datetime64 at the > >> beginning of 1970. First: > >> > >> In [144]: (np.datetime64('1970-01-01') - np.datetime64('1969-12-31')) > >> Out[144]: numpy.timedelta64(1,'D') > >> > >> OK this looks fine, they are one day apart. But look at this: > >> > >> In [145]: (np.datetime64('1970-01-01 00') - np.datetime64('1969-12-31 > >> 00')) > >> Out[145]: numpy.timedelta64(31,'h') > >> > >> Hmmm, seems like there are 7 extra hours? Am I missing something? I > don't > >> see this at any other year. This discontinuity makes it hard to use the > >> datetime64 object without special adjustment in one's code. I assume > this is a > >> bug? > >> > >> Thanks, > >> Bob > >> > >> ps I'm using the most recent anaconda release on mac os x 10.6.8 which > >> includes numpy 1.7.0. > >> > >> pss It would be most handy if datetime64 had a constructor of the form > >> np.datetime64(year,month,day,hour,min,sec) where these inputs were numpy > >> arrays and the output would have the same shape as the input arrays > (but be > >> of type datetime64). The hour,min,sec would be optional. Scalar inputs > would > >> be broadcast to the size of the array inputs, etc. Maybe this is a > topic for > >> another post. > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > -- > > Christopher Barker, Ph.D.
> Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Apr 18 07:04:48 2013 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 18 Apr 2013 16:34:48 +0530 Subject: [Numpy-discussion] numpy scalars and savez -- bug? In-Reply-To: References: Message-ID: On Thu, Apr 18, 2013 at 5:58 AM, Chris Barker - NOAA Federal wrote: > Folks, > > I've discovered something interesting (bug?) with numpy scalars and > savez. If I save a numpy scalar, then reload it, it comes back as > a rank-0 array -- similar, but not the same thing: > > In [144]: single_value, type(single_value) > Out[144]: (2.0, numpy.float32) > > In [145]: np.savez('test.npz', single_value=single_value) > > In [146]: single_value2 = np.load('test.npz')['single_value'] > > In [147]: single_value, type(single_value) > Out[147]: (2.0, numpy.float32) > > In [148]: single_value2, type(single_value2) > Out[148]: (array(2.0, dtype=float32), numpy.ndarray) > > straight np.save has the same issue (which makes sense, I'm sure savez > uses the save code under the hood): > > In [149]: single_value, type(single_value) > Out[149]: (2.0, numpy.float32) > > In [150]: np.save('test.npy', single_value) > > In [151]: single_value2 = np.load('test.npy') > > In [152]: single_value2, type(single_value2) > Out[152]: (array(2.0, dtype=float32), numpy.ndarray) > > > This has been annoying, particularly as rank-zero scalars are kind of a pain. np.save() and company (and the NPY format itself) are for arrays, not for scalars. np.save() uses an np.asanyarray() to coerce its input which is why your scalar gets converted to a rank-zero array. -- Robert Kern From ben.root at ou.edu Thu Apr 18 09:20:34 2013 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 18 Apr 2013 09:20:34 -0400 Subject: [Numpy-discussion] datetime64 1970 issue In-Reply-To: References: Message-ID: On Thu, Apr 18, 2013 at 2:27 AM, Joris Van den Bossche < jorisvandenbossche at gmail.com> wrote: > Anyone tested this on Windows? >> > > On Windows 7, numpy 1.7.0 (Anaconda 1.4.0 64 bit), I don't even get a > wrong answer, but an error: > > In [3]: np.datetime64('1969-12-31 00') > Out[3]: numpy.datetime64('1969-12-31T00:00Z','h') > > In [4]: np.datetime64('1970-01-01 00') > --------------------------------------------------------------------------- > OSError Traceback (most recent call last) > in () > ----> 1 np.datetime64('1970-01-01 00') > > OSError: Failed to use 'mktime' to convert local time to UTC > > Have you tried np.test()? I know some of the tests I added a while back utilized pre-epoch dates to test sorting and min/max finding. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Apr 18 11:31:23 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 18 Apr 2013 08:31:23 -0700 Subject: [Numpy-discussion] numpy scalars and savez -- bug? In-Reply-To: References: Message-ID: On Thu, Apr 18, 2013 at 4:04 AM, Robert Kern wrote: > np.save() and company (and the NPY format itself) are for arrays, not > for scalars. np.save() uses an np.asanyarray() to coerce its input > which is why your scalar gets converted to a rank-zero array. Fair enough -- so a missing feature, not a bug -- I'll need to look at
np.save() uses an np.asanyarray() to coerce its input > which is why your scalar gets converted to a rank-zero array. Fair enough -- so a missing feature, not bug -- I'll need to look at the docs and see if that can be clarified - I note that it never dawned on me to pass anything other than an array in (like a list), but I guess if I did, it would likely work, but return an array when re-loaded. I'm ambivalent about whether I like this feature -- in this case, it resulted in confusion. If I'd gotten an exception in the first place, it would have been simple enough to fix, as it was it took some poking around. As for numpy scalars -- would it be a major lift to support them directly? -Chris > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From scollis.acrf at gmail.com Thu Apr 18 11:32:32 2013 From: scollis.acrf at gmail.com (Scott Collis) Date: Thu, 18 Apr 2013 10:32:32 -0500 Subject: [Numpy-discussion] Position at Brookhaven National Laboratory.. Message-ID: <5CA9A690-37EA-4767-8942-CCAEE94D37FB@gmail.com> Good morning Numpy list! A colleague of mine at Brookhaven National Laboratory (Long Island, NY) is hiring a "Senior Applications Analyst". One of the desired languages is Python, and as a strong collaborator with this group I can attest anyone with good python skills will do well there? Here is the description: Brookhaven National Laboratory's Atmospheric Science Division currently has a Full-Time opportunity for a Senior Applications Analyst. The major duties and responsibilities include the development of meteorological products which add value to and enhance the scientific usability of data (particularly from millimeter-wavelength radar systems) collected by the Department of Energy's Atmospheric Radiation Measurement (ARM) program. Required Knowledge, Skills and Abilities: Bachelor?s degree with at least one year experience or equivalent experience (2:1 / work:education). Expertise in high-level programming languages (including one or more of C, IDL, Python, and Matlab). Experience working in a Linux/Unix environment. Attention to detail is essential, as is the ability to work well as part of a team. Strong written and oral communication skills. Preferred Knowledge, Skills, and Abilities:A background meteorology, radar applications or another physical science. and here are the instructions: To be considered for this position, apply online at www.bnl.gov and click Jobs, then click Search Job List and apply to job #16385 Any questions give me a yell and I can put you in contact with them.. and please feel free to disseminate! Cheers, Scott From chris.barker at noaa.gov Thu Apr 18 11:32:42 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 18 Apr 2013 08:32:42 -0700 Subject: [Numpy-discussion] datetime64 1970 issue In-Reply-To: References: Message-ID: On Wed, Apr 17, 2013 at 6:05 PM, Benjamin Root wrote: > Aren't we on standard time at Jan 1st? So, at that date, you would have > been -8. yes, of course, pardon me for being an idiot. -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From chris.barker at noaa.gov Thu Apr 18 11:37:55 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 18 Apr 2013 08:37:55 -0700 Subject: [Numpy-discussion] datetime64 1970 issue In-Reply-To: References: Message-ID: On Wed, Apr 17, 2013 at 11:27 PM, Joris Van den Bossche wrote: >> Anyone tested this on Windows? > > On Windows 7, numpy 1.7.0 (Anaconda 1.4.0 64 bit), I don't even get a wrong > answer, but an error: > > In [3]: np.datetime64('1969-12-31 00') > Out[3]: numpy.datetime64('1969-12-31T00:00Z','h') > > In [4]: np.datetime64('1970-01-01 00') > --------------------------------------------------------------------------- > OSError Traceback (most recent call last) > in () > ----> 1 np.datetime64('1970-01-01 00') > > OSError: Failed to use 'mktime' to convert local time to UTC OK -- so that confirms the issue is from using the system libs to do this. Though I'm surprised it fails on that date -- I would have expected it to fail on the 1969 date -- unless there is code that bypasses the system libs (or modifies them in some way) for pre-1970 dates. >> We REALLY need to fix this! There is a lot that we _could_ do to handle all this well, but one thing is clear to me -- poor timezone support is MUCH worse than no timezone support! I'll get that NEP up, but I hope someone can get on this -- at least before any next release. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From chris.barker at noaa.gov Thu Apr 18 11:50:11 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 18 Apr 2013 08:50:11 -0700 Subject: [Numpy-discussion] numpy scalars and savez -- bug? In-Reply-To: References: Message-ID: On Thu, Apr 18, 2013 at 8:31 AM, Chris Barker - NOAA Federal wrote: > Fair enough -- so a missing feature, not a bug -- I'll need to look at > the docs and see if that can be clarified - All I've found is the docstring docs (which also show up in the Sphinx docs). I suggest some slight modification: def save(file, arr): """ Save an array to a binary file in NumPy ``.npy`` format. Parameters ---------- file : file or str File or filename to which the data is saved. If file is a file-object, then the filename is unchanged. If file is a string, a ``.npy`` extension will be appended to the file name if it does not already have one. arr : array_like Array data to be saved. Any object that is not an array will be converted to an array with asanyarray(). When reloaded, the array version of the object will be returned. See Also -------- savez : Save several arrays into a ``.npz`` archive savetxt, load Notes ----- For a description of the ``.npy`` format, see `format`. Examples -------- >>> from tempfile import TemporaryFile >>> outfile = TemporaryFile() >>> x = np.arange(10) >>> np.save(outfile, x) >>> outfile.seek(0) # Only needed here to simulate closing & reopening file >>> np.load(outfile) array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) """ I also see: For a description of the ``.npy`` format, see `format`. but no idea where to find 'format' -- it looks like it should be a link in the Sphinx docs, but it's not. -Chris -- Christopher Barker, Ph.D.
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From andrew_giessel at hms.harvard.edu Thu Apr 18 12:02:07 2013 From: andrew_giessel at hms.harvard.edu (Andrew Giessel) Date: Thu, 18 Apr 2013 12:02:07 -0400 Subject: [Numpy-discussion] contributing to numpy In-Reply-To: References: Message-ID: An update--- I submitted a PR if anyone is interested: https://github.com/numpy/numpy/pull/3262 Secondly, it was pointed out to me by Stefan van der Walt that one could use np.rollaxis() to reorder an array such that the default iterator behavior would yield the same slices of the array: a = np.ones((100,10,3)) for i in np.rollaxis(a, 2): print i.shape gives you 3 100x10 slices. My PR is more of a pure iterator, so I sent it in. all the best, ag On Wed, Apr 17, 2013 at 2:18 PM, Andrew Giessel < andrew_giessel at hms.harvard.edu> wrote: > Chuck- > > Thank you for the very helpful and encouraging email! I will first try to > just add a function, rather than a method on ndarray (which looks to be > lower level, i.e., C). The pointer to numeric.py / test_numeric.py is exactly > what I needed. > > I will of course figure out a good test and document it well. For > starters, I'll make the function simply take an integer corresponding to an > axis. I'm not exactly sure what you mean by generalizing to take multiple > axes -- would the idea be to return slices of an array w/o one of the > dimensions? > > I'll try to tackle this over the next week and hopefully the conversation > on the PR will be the place to talk about these issues. I'll need to > figure out the best way to have a dev branch of numpy side-by-side with a > stock version, and how to build the module, first. > > Lastly, I'll also try to write up something re: my experience so others > can have something to take a look at. > > best+thanks, > > ag > > > On Wed, Apr 17, 2013 at 11:43 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Wed, Apr 17, 2013 at 6:29 AM, andrew giessel >> wrote: >> >>> First, my apologies if this isn't the right forum for this question- I >>> looked for a dev list, but couldn't really find it. >>> >>> I have a small method I'd like to contribute to numpy, ideally as a >>> method on ndarrays and a general function in the numpy namespace. I found >>> it on a stackoverflow thread, and it is a generator that yields slices of a >>> multidimensional array over a specified axis, which is convenient for use >>> in list comprehensions and loops. >>> >>> https://gist.github.com/andrewgiessel/5400659 >>> >>> I've forked the numpy source and am familiar with git/pull requests/etc >>> but the code base is a bit overwhelming. >>> >>> I have 2 questions: >>> >>> 1) Is there a document which gives an overview of the numpy source and >>> perhaps a tutorial on the best way to add methods/functions? >>> >> >> The closest thing is probably Contributing to Numpy, >> which I suspect is not what you are looking for. At this point you pretty >> much need to dig through the source to see how it is organized. To see how >> things like tests/documentation are organized look for existing examples >> and also the relevant docs in doc/. >> >> >>> 2) is there a current way to do this in numpy? The only iterator >>> related stuff I found in a brief search last night was for essentially >>> looping over all elements of an array, one by one.
>>> >> >> I think a function like this would be useful. There are ad-hoc ways to >> get the same result but they aren't quite as flexible. A few comments >> >> 1) The function would probably best go in numpy/core/numeric.py >> 2) It will need a docstring >> 3) It will need tests in numpy/core/tests/test_numeric.py >> 4) xrange isn't Python 3 compatible, use range instead. >> >> The name isn't very descriptive, maybe iter_over_axis? One possible >> generalization would to let axis (axes) be either a number, or a list of >> axes. The first being the number of leading axes, the second letting one >> choose arbitrary axes. >> >> Chuck >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > Andrew Giessel, PhD > > Department of Neurobiology, Harvard Medical School > 220 Longwood Ave Boston, MA 02115 > ph: 617.432.7971 email: andrew_giessel at hms.harvard.edu > -- Andrew Giessel, PhD Department of Neurobiology, Harvard Medical School 220 Longwood Ave Boston, MA 02115 ph: 617.432.7971 email: andrew_giessel at hms.harvard.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Apr 18 13:13:13 2013 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 18 Apr 2013 22:43:13 +0530 Subject: [Numpy-discussion] numpy scalars and savez -- bug? In-Reply-To: References: Message-ID: On Thu, Apr 18, 2013 at 9:20 PM, Chris Barker - NOAA Federal wrote: > On Thu, Apr 18, 2013 at 8:31 AM, Chris Barker - NOAA Federal > wrote: > >> Fair enough -- so a missing feature, not bug -- I'll need to look at >> the docs and see if that can be clarified - > > All I've found is the docstring docs (which also show up in the Sphinx > docs). I suggest some slight modification: > > def save(file, arr): > """ > Save an array to a binary file in NumPy ``.npy`` format. > > Parameters > ---------- > file : file or str > File or filename to which the data is saved. If file is a file-object, > then the filename is unchanged. If file is a string, a ``.npy`` > extension will be appended to the file name if it does not already > have one. > arr : array_like > Array data to be saved. Any object that is not an array will be > converted to an array with asanyarray(). When reloaded, the array > version of the object will be returned. > > See Also > -------- > savez : Save several arrays into a ``.npz`` archive > savetxt, load > > Notes > ----- > For a description of the ``.npy`` format, see `format`. > > Examples > -------- > >>> from tempfile import TemporaryFile > >>> outfile = TemporaryFile() > > >>> x = np.arange(10) > >>> np.save(outfile, x) > > >>> outfile.seek(0) # Only needed here to simulate closing & reopening file > >>> np.load(outfile) > array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) > > """ > > I also see: > > For a description of the ``.npy`` format, see `format`. > > but no idea where to find 'format' -- it looks like it should be a > link in the Sphinx docs, but it's not. It does seem to be missing from the docs. https://github.com/numpy/numpy/blob/master/numpy/lib/format.py -- Robert Kern From sebastian at sipsolutions.net Thu Apr 18 15:33:32 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 18 Apr 2013 21:33:32 +0200 Subject: [Numpy-discussion] Index Parsing redo Message-ID: <1366313612.3260.17.camel@sebastian-laptop> Hey, so I ignored trying to redo MapIter (likely it is lobotomized at this time though). 
But actually got a working new index parsing (still needs cleanup, etc.). Also some of the fast paths are not yet put back. For most pure integer indices it got a bit slower, if it actually gets too much one could re-add such special cases I guess... If anyone wants to take a look, it is here: https://github.com/seberg/numpy/compare/rewrite-index-parser The tests run through fine with the exception of the matrix item setting [1] and changes in errors. Certainly many errors still need to be made indexing specific. If anyone is interested in messing with it, give me a ping for direct access. Polishing such a thing up (if deemed right) is a lot of work and I am not sure I will find the time soon. Regards, Sebastian [1] That is because subclass item setting (for non-fancy, non-scalar result and non-single integer indices) is really tmp = subclass.view(np.ndarray) tmp[index] = values` which needs to be put back in. The whole matrix indexing business probably needs some extra functionality in the core to get right I guess. From kmichael.aye at gmail.com Thu Apr 18 19:31:36 2013 From: kmichael.aye at gmail.com (K.-Michael Aye) Date: Thu, 18 Apr 2013 16:31:36 -0700 Subject: [Numpy-discussion] type conversion question Message-ID: I don't understand why sometimes a direct assignment of a new dtype is possible (but messes up the values), and why at other times a seemingly harmless upcast (in my potentially ignorant point of view) is not possible. So, maybe a direct assignment of a new dtype is actually never a good idea? (I'm asking), and one should always go the route of newarray= array(oldarray, dtype=newdtype), but why then sometimes the upcast provides an error and forbids it and sometimes not? Examples: In [140]: slope.read_center_window() In [141]: slope.data.dtype Out[141]: dtype('float32') In [142]: slope.data[1,1] Out[142]: 10.044398 In [143]: val = slope.data[1,1] In [144]: slope.data.dtype='float64' In [145]: slope.data[1,1] Out[145]: 586.98938070189865 #----- #Here, the value of data[1,1] has completely changed (and so has the rest of the array), and no error was given. # But then... #---- In [146]: val.dtype Out[146]: dtype('float32') In [147]: val Out[147]: 10.044398 In [148]: val.dtype='float64' --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) in () ----> 1 val.dtype='float64' AttributeError: attribute 'dtype' of 'numpy.generic' objects is not writable === end of code So why is there an error in the 2nd case, but no error in the first case? Is there a logic to it? Thanks, Michael From ben.root at ou.edu Thu Apr 18 21:02:59 2013 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 18 Apr 2013 21:02:59 -0400 Subject: [Numpy-discussion] type conversion question In-Reply-To: References: Message-ID: On Thu, Apr 18, 2013 at 7:31 PM, K.-Michael Aye wrote: > I don't understand why sometimes a direct assignment of a new dtype is > possible (but messes up the values), and why at other times a seemingly > harmless upcast (in my potentially ignorant point of view) is not > possible. > So, maybe a direct assignment of a new dtype is actually never a good > idea? (I'm asking), and one should always go the route of newarray= > array(oldarray, dtype=newdtype), but why then sometimes the upcast > provides an error and forbids it and sometimes not? 
> > > Examples: > > In [140]: slope.read_center_window() > > In [141]: slope.data.dtype > Out[141]: dtype('float32') > > In [142]: slope.data[1,1] > Out[142]: 10.044398 > > In [143]: val = slope.data[1,1] > > In [144]: slope.data.dtype='float64' > > In [145]: slope.data[1,1] > Out[145]: 586.98938070189865 > > #----- > #Here, the value of data[1,1] has completely changed (and so has the > rest of the array), and no error was given. > # But then... > #---- > > In [146]: val.dtype > Out[146]: dtype('float32') > > In [147]: val > Out[147]: 10.044398 > > In [148]: val.dtype='float64' > --------------------------------------------------------------------------- > AttributeError Traceback (most recent call last) > in () > ----> 1 val.dtype='float64' > > AttributeError: attribute 'dtype' of 'numpy.generic' objects is not > writable > > === end of code > > So why is there an error in the 2nd case, but no error in the first > case? Is there a logic to it? > > When you change a dtype like that in the first one, you aren't really upcasting anything. You are changing how numpy interprets the underlying bits. Because you went from a 32-bit element size to a 64-bit element size, you are actually seeing the double-precision representation of 2 of your original data points together. The correct way to cast is to do something like "a = slope.data.astype('float64')". That makes a copy and does the casting as safely as possible. As for the second one, you have what is called a numpy scalar. These aren't quite the same thing as a numpy array, and can be a bit more restrictive. Can you imagine what sort of issues that would pose if one could start viewing and modifying neighboring chunks of memory without ever having to mess around with pointers? It would be a hacker's dream! I hope that clears things up. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From kmichael.aye at gmail.com Fri Apr 19 01:04:57 2013 From: kmichael.aye at gmail.com (K.-Michael Aye) Date: Thu, 18 Apr 2013 22:04:57 -0700 Subject: [Numpy-discussion] type conversion question References: Message-ID: On 2013-04-19 01:02:59 +0000, Benjamin Root said: > > > > On Thu, Apr 18, 2013 at 7:31 PM, K.-Michael Aye wrote: > I don't understand why sometimes a direct assignment of a new dtype is > possible (but messes up the values), and why at other times a seemingly > harmless upcast (in my potentially ignorant point of view) is not > possible. > So, maybe a direct assignment of a new dtype is actually never a good > idea? (I'm asking), and one should always go the route of newarray= > array(oldarray, dtype=newdtype), but why then sometimes the upcast > provides an error and forbids it and sometimes not? > > > Examples: > > In [140]: slope.read_center_window() > > In [141]: slope.data.dtype > Out[141]: dtype('float32') > > In [142]: slope.data[1,1] > Out[142]: 10.044398 > > In [143]: val = slope.data[1,1] > > In [144]: slope.data.dtype='float64' > > In [145]: slope.data[1,1] > Out[145]: 586.98938070189865 > > #----- > #Here, the value of data[1,1] has completely changed (and so has the > rest of the array), and no error was given. > # But then... > #---- > > In [146]: val.dtype > Out[146]: dtype('float32') > > In [147]: val > Out[147]: 10.044398 > > In [148]: val.dtype='float64' > --------------------------------------------------------------------------- > AttributeError ? ? ? ? ? ? ? ? ? ? ? ? ? 
?Traceback (most recent call last) > in () > ----> 1 val.dtype='float64' > > AttributeError: attribute 'dtype' of 'numpy.generic' objects is not writable > > === end of code > > So why is there an error in the 2nd case, but no error in the first > case? Is there a logic to it? > > > When you change a dtype like that in the first one, you aren't really > upcasting anything.? You are changing how numpy interprets the > underlying bits.? Because you went from a 32-bit element size to a > 64-bit element size, you are actually seeing the double-precision > representation of 2 of your original data points together. > > The correct way to cast is to do something like "a = > slope.data.astype('float64')".? That makes a copy and does the casting > as safely as possible. > > As for the second one, you have what is called a numpy scalar.? These > aren't quite the same thing as a numpy array, and can be a bit more > restrictive.? Can you imagine what sort of issues that would pose if > one could start viewing and modifying neighboring chunks of memory > without ever having to mess around with pointers?? It would be a > hacker's dream! > > I hope that clears things up. > Ben Root yes, thanks! Michael > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From njs at pobox.com Fri Apr 19 02:33:14 2013 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 19 Apr 2013 07:33:14 +0100 Subject: [Numpy-discussion] numpy scalars and savez -- bug? In-Reply-To: References: Message-ID: On 18 Apr 2013 01:29, "Chris Barker - NOAA Federal" wrote: > This has been annoying, particular as rank-zero scalars are kind of a pain. BTW, while we're on the topic, can you elaborate on this? I tend to think scalars (as opposed to 0d ndarrays) are kind of a pain, so I'm curious if you have specific issues you've run into with 0d ndarrays. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Fri Apr 19 09:32:32 2013 From: ndbecker2 at gmail.com (Neal Becker) Date: Fri, 19 Apr 2013 09:32:32 -0400 Subject: [Numpy-discussion] what do I get if I build with MKL? Message-ID: What sorts of functions take advantage of MKL? Linear Algebra (equation solving)? Something like dot product? exp, log, trig of matrix? basic numpy arithmetic? (add matrixes) From matthieu.brucher at gmail.com Fri Apr 19 09:39:08 2013 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Fri, 19 Apr 2013 14:39:08 +0100 Subject: [Numpy-discussion] what do I get if I build with MKL? In-Reply-To: References: Message-ID: Hi, I think you have at least linear algebra (lapack) and dot. Basic arithmetics will not benefit, for expm, logm... I don't know. Matthieu 2013/4/19 Neal Becker > What sorts of functions take advantage of MKL? > > Linear Algebra (equation solving)? > > Something like dot product? > > exp, log, trig of matrix? > > basic numpy arithmetic? (add matrixes) > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher Music band: http://liliejay.com/ -------------- next part -------------- An HTML attachment was scrubbed... 
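Concretely, that split looks something like this (a sketch; exactly which
calls an optimized BLAS/LAPACK such as MKL accelerates depends on how numpy
was built):

    import numpy as np

    a = np.random.rand(1000, 1000)
    b = np.random.rand(1000, 1000)
    v = np.random.rand(1000)

    c = np.dot(a, b)           # matrix-matrix product -> BLAS gemm, MKL helps
    w = np.dot(a, v)           # matrix-vector product -> BLAS gemv
    x = np.linalg.solve(a, v)  # LAPACK driver, faster when linked against MKL
    e = np.exp(a)              # plain ufunc loop: no BLAS/LAPACK involved
    s = a + b                  # basic arithmetic: also plain ufunc loops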
URL: From Tom.KACVINSKY at 3ds.com Fri Apr 19 09:40:39 2013 From: Tom.KACVINSKY at 3ds.com (KACVINSKY Tom) Date: Fri, 19 Apr 2013 13:40:39 +0000 Subject: [Numpy-discussion] what do I get if I build with MKL? In-Reply-To: References: Message-ID: You also get highly optimized BLAS routines, like dgemm and degemv. From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Matthieu Brucher Sent: Friday, April 19, 2013 9:39 AM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] what do I get if I build with MKL? Hi, I think you have at least linear algebra (lapack) and dot. Basic arithmetics will not benefit, for expm, logm... I don't know. Matthieu 2013/4/19 Neal Becker > What sorts of functions take advantage of MKL? Linear Algebra (equation solving)? Something like dot product? exp, log, trig of matrix? basic numpy arithmetic? (add matrixes) _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher Music band: http://liliejay.com/ This email and any attachments are intended solely for the use of the individual or entity to whom it is addressed and may be confidential and/or privileged. If you are not one of the named recipients or have received this email in error, (i) you should not read, disclose, or copy it, (ii) please notify sender of your receipt by reply email and delete this email and all attachments, (iii) Dassault Systemes does not accept or assume any liability or responsibility for any use of or reliance on this email. For other languages, go to http://www.3ds.com/terms/email-disclaimer -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Fri Apr 19 09:47:49 2013 From: ndbecker2 at gmail.com (Neal Becker) Date: Fri, 19 Apr 2013 09:47:49 -0400 Subject: [Numpy-discussion] what do I get if I build with MKL? References: Message-ID: KACVINSKY Tom wrote: > You also get highly optimized BLAS routines, like dgemm and degemv. And does numpy/scipy just then automatically use them? When I do a matrix multiply, for example? From matthieu.brucher at gmail.com Fri Apr 19 09:50:06 2013 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Fri, 19 Apr 2013 14:50:06 +0100 Subject: [Numpy-discussion] what do I get if I build with MKL? In-Reply-To: References: Message-ID: For the matrix multiplication or array dot, you use BLAS3 functions as they are more or less the same. For the rest, nothing inside Numpy uses BLAS or LAPACK explicitelly IIRC. You have to do the calls yourself. 2013/4/19 Neal Becker > KACVINSKY Tom wrote: > > > You also get highly optimized BLAS routines, like dgemm and degemv. > > And does numpy/scipy just then automatically use them? When I do a matrix > multiply, for example? > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher Music band: http://liliejay.com/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.barker at noaa.gov Fri Apr 19 11:03:18 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Fri, 19 Apr 2013 08:03:18 -0700 Subject: [Numpy-discussion] numpy scalars and savez -- bug? In-Reply-To: References: Message-ID: <4532307196415651864@unknownmsgid> On Apr 18, 2013, at 11:33 PM, Nathaniel Smith wrote: On 18 Apr 2013 01:29, "Chris Barker - NOAA Federal" wrote: > This has been annoying, particular as rank-zero scalars are kind of a pain. BTW, while we're on the topic, can you elaborate on this? I tend to think scalars (as opposed to 0d ndarrays) are kind of a pain, so I'm curious if you have specific issues you've run into with 0d ndarrays. Well, I suppose what's really a pain is that we have both, and they are not the same, and neither can be used in all cases one may want. In the case at hand, I really wanted a datetime64 scalar. By saving and re-loading in an npz, it got converted to a rank-zero array, which had different behavior. In this case, the frustrating bit was how to extract a scalar again ( which I really wanted to turn into a datetime object). After the fact, I discovered .item(), which seems to do what I want. On a phone now, so sorry about the lack of examples. Note: I've lost track of why we need both scalers and rank-zero arrays. I can't help thinking that there could be an object that acts like a scalar in most contexts, but also has the array methods that make sense. But I know it's far from simple. -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From Tom.KACVINSKY at 3ds.com Fri Apr 19 11:10:33 2013 From: Tom.KACVINSKY at 3ds.com (KACVINSKY Tom) Date: Fri, 19 Apr 2013 15:10:33 +0000 Subject: [Numpy-discussion] what do I get if I build with MKL? In-Reply-To: References: Message-ID: Looks like the *lapack_lite files have internal calls to dgemm. I alos found this: http://software.intel.com/en-us/articles/numpyscipy-with-intel-mkl So it looks like numpy/scipy performs better with MKL, regardless of how the MKL routines are called (directly, or via a numpy/scipy interface). Tom From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Matthieu Brucher Sent: Friday, April 19, 2013 9:50 AM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] what do I get if I build with MKL? For the matrix multiplication or array dot, you use BLAS3 functions as they are more or less the same. For the rest, nothing inside Numpy uses BLAS or LAPACK explicitelly IIRC. You have to do the calls yourself. 2013/4/19 Neal Becker > KACVINSKY Tom wrote: > You also get highly optimized BLAS routines, like dgemm and degemv. And does numpy/scipy just then automatically use them? When I do a matrix multiply, for example? _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher Music band: http://liliejay.com/ This email and any attachments are intended solely for the use of the individual or entity to whom it is addressed and may be confidential and/or privileged. 
If you are not one of the named recipients or have received this email in error, (i) you should not read, disclose, or copy it, (ii) please notify sender of your receipt by reply email and delete this email and all attachments, (iii) Dassault Systemes does not accept or assume any liability or responsibility for any use of or reliance on this email. For other languages, go to http://www.3ds.com/terms/email-disclaimer -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej.certik at gmail.com Fri Apr 19 11:12:20 2013 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Fri, 19 Apr 2013 09:12:20 -0600 Subject: [Numpy-discussion] ANN: NumPy 1.7.1 release In-Reply-To: References: Message-ID: On Sun, Apr 7, 2013 at 2:09 AM, Ond?ej ?ert?k wrote: > Hi, > > I'm pleased to announce the availability of the final NumPy 1.7.1 release. > > Sources and binary installers can be found at > https://sourceforge.net/projects/numpy/files/NumPy/1.7.1/ > > Only three simple bugs were fixed since 1.7.1rc1 (#3166, #3179, #3187). > > I would like to thank everybody who contributed patches since 1.7.1rc1: > Eric Fode, Nathaniel J. Smith and Charles Harris. > > Cheers, > Ondrej > > P.S. I'll create the Mac binary installers in a few days. Pypi is updated. I've uploaded Mac binaries: https://sourceforge.net/projects/numpy/files/NumPy/1.7.1/ Ralf, if you have a minute, would you mind uploading the Mac 10.6 binary? I don't have access to such a mac. Thanks Ondrej From chris.barker at noaa.gov Fri Apr 19 11:12:58 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Fri, 19 Apr 2013 08:12:58 -0700 Subject: [Numpy-discussion] type conversion question In-Reply-To: References: Message-ID: On Thu, Apr 18, 2013 at 10:04 PM, K.-Michael Aye wrote: > On 2013-04-19 01:02:59 +0000, Benjamin Root said: >> So why is there an error in the 2nd case, but no error in the first >> case? Is there a logic to it? >> >> When you change a dtype like that in the first one, you aren't really >> upcasting anything. You are changing how numpy interprets the >> underlying bits. Because you went from a 32-bit element size to a >> 64-bit element size, you are actually seeing the double-precision >> representation of 2 of your original data points together. I was wondering what would happen if there were not the right number of points available... In [225]: a = np.array((2.0, 3.0), dtype=np.float32) In [226]: a.dtype=np.float64 In [227]: a Out[227]: array([ 32.00000763]) OK , but: In [228]: a = np.array((2.0,), dtype=np.float32) In [229]: a.dtype=np.float64 --------------------------------------------------------------------------- ValueError Traceback (most recent call last) in () ----> 1 a.dtype=np.float64 ValueError: new type not compatible with array. so numpy is smart enough to not let you do it .. good thing. Final note -- changing the dtype in place like that is a very powerful and useful tool, but not likely to be used often -- it's really for things like working with odd binary data and the like. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From matthieu.brucher at gmail.com Fri Apr 19 11:16:11 2013 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Fri, 19 Apr 2013 16:16:11 +0100 Subject: [Numpy-discussion] what do I get if I build with MKL? 
In-Reply-To: References: Message-ID: The graph is a comparison of the dot calls, of course they are better with MKL than the default BLAS version ;) For the rest, Numpy doesn't benefit from MKL, scipy may if they call LAPACK functions wrapped by Numpy or Scipy (I don't remember which does the wrapping). Matthieu 2013/4/19 KACVINSKY Tom > Looks like the *lapack_lite files have internal calls to dgemm. I alos > found this: > > > > http://software.intel.com/en-us/articles/numpyscipy-with-intel-mkl > > > > So it looks like numpy/scipy performs better with MKL, regardless of how > the MKL routines are called (directly, or via a numpy/scipy interface). > > > > Tom > > > > *From:* numpy-discussion-bounces at scipy.org [mailto: > numpy-discussion-bounces at scipy.org] *On Behalf Of *Matthieu Brucher > *Sent:* Friday, April 19, 2013 9:50 AM > > *To:* Discussion of Numerical Python > *Subject:* Re: [Numpy-discussion] what do I get if I build with MKL? > > > > For the matrix multiplication or array dot, you use BLAS3 functions as > they are more or less the same. For the rest, nothing inside Numpy uses > BLAS or LAPACK explicitelly IIRC. You have to do the calls yourself. > > > > 2013/4/19 Neal Becker > > KACVINSKY Tom wrote: > > > You also get highly optimized BLAS routines, like dgemm and degemv. > > And does numpy/scipy just then automatically use them? When I do a matrix > multiply, for example? > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > -- > Information System Engineer, Ph.D. > Blog: http://matt.eifelle.com > LinkedIn: http://www.linkedin.com/in/matthieubrucher > Music band: http://liliejay.com/ > > This email and any attachments are intended solely for the use of the > individual or entity to whom it is addressed and may be confidential and/or > privileged. > > If you are not one of the named recipients or have received this email in > error, > > (i) you should not read, disclose, or copy it, > > (ii) please notify sender of your receipt by reply email and delete this > email and all attachments, > > (iii) Dassault Systemes does not accept or assume any liability or > responsibility for any use of or reliance on this email. > > For other languages, go to http://www.3ds.com/terms/email-disclaimer > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher Music band: http://liliejay.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Apr 19 11:15:39 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Fri, 19 Apr 2013 08:15:39 -0700 Subject: [Numpy-discussion] numpy scalars and savez -- bug? In-Reply-To: <4532307196415651864@unknownmsgid> References: <4532307196415651864@unknownmsgid> Message-ID: Robert, As I think you wrote the code, you may have a quick answer: Given that numpy scalars do exist, and have their uses -- I found this wiki page to remind me: http://projects.scipy.org/numpy/wiki/ZeroRankArray It would be nice if the .npy format could support them. Would that be a major change? I'm trying to decide if this bugs me enough to work on that. 
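For reference, the round trip in question looks something like this (a
sketch; the file and key names are arbitrary):

    import numpy as np

    t = np.datetime64('2013-04-19')      # a numpy scalar
    np.savez('cache.npz', timestamp=t)   # stored via asanyarray() as a 0-d array

    loaded = np.load('cache.npz')['timestamp']
    type(loaded)    # numpy.ndarray with shape () -- no longer a scalar
    loaded.item()   # recovers a plain Python scalar object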
-Chris On Fri, Apr 19, 2013 at 8:03 AM, Chris Barker - NOAA Federal wrote: > On Apr 18, 2013, at 11:33 PM, Nathaniel Smith wrote: > > On 18 Apr 2013 01:29, "Chris Barker - NOAA Federal" > wrote: >> This has been annoying, particular as rank-zero scalars are kind of a >> pain. > > BTW, while we're on the topic, can you elaborate on this? I tend to think > scalars (as opposed to 0d ndarrays) are kind of a pain, so I'm curious if > you have specific issues you've run into with 0d ndarrays. > > Well, I suppose what's really a pain is that we have both, and they are not > the same, and neither can be used in all cases one may want. > > In the case at hand, I really wanted a datetime64 scalar. By saving and > re-loading in an npz, it got converted to a rank-zero array, which had > different behavior. In this case, the frustrating bit was how to extract a > scalar again ( which I really wanted to turn into a datetime object). > > After the fact, I discovered .item(), which seems to do what I want. > > On a phone now, so sorry about the lack of examples. > > Note: I've lost track of why we need both scalers and rank-zero arrays. I > can't help thinking that there could be an object that acts like a scalar in > most contexts, but also has the array methods that make sense. > > But I know it's far from simple. > > -Chris > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From chris.barker at noaa.gov Fri Apr 19 11:17:39 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Fri, 19 Apr 2013 08:17:39 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.7.1 release In-Reply-To: References: Message-ID: On Fri, Apr 19, 2013 at 8:12 AM, Ond?ej ?ert?k wrote: >> I'm pleased to announce the availability of the final NumPy 1.7.1 release. Nice work -- but darn! I was hoping a change/fix to teh datetime64 timezone handlien could get into the next release -- oh well. When do we expect the next one may be? I've lost track, are you trying to keep to a schedule now, or just waiting until there is something compelling to release? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From njs at pobox.com Fri Apr 19 11:46:56 2013 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 19 Apr 2013 16:46:56 +0100 Subject: [Numpy-discussion] ANN: NumPy 1.7.1 release In-Reply-To: References: Message-ID: On Fri, Apr 19, 2013 at 4:17 PM, Chris Barker - NOAA Federal wrote: > On Fri, Apr 19, 2013 at 8:12 AM, Ond?ej ?ert?k wrote: > >>> I'm pleased to announce the availability of the final NumPy 1.7.1 release. > > Nice work -- but darn! I was hoping a change/fix to teh datetime64 > timezone handlien could get into the next release -- oh well. That's probably too big a behavioural chance to go into a point release in any case... > When do we expect the next one may be? I've lost track, are you trying > to keep to a schedule now, or just waiting until there is something > compelling to release? I think the current schedule for 1.8 is "any time now, as soon as we manage to get around to it". 
-n From chris.barker at noaa.gov Fri Apr 19 11:55:39 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Fri, 19 Apr 2013 08:55:39 -0700 Subject: [Numpy-discussion] bug in deepcopy() of rank-zero arrays? Message-ID: Hi folks, In [264]: np.__version__ Out[264]: '1.7.0' I just noticed that deep copying a rank-zero array yields a scalar -- probably not what we want. In [242]: a1 = np.array(3) In [243]: type(a1), a1 Out[243]: (numpy.ndarray, array(3)) In [244]: a2 = copy.deepcopy(a1) In [245]: type(a2), a2 Out[245]: (numpy.int32, 3) regular copy.copy() seems to work fine: In [246]: a3 = copy.copy(a1) In [247]: type(a3), a3 Out[247]: (numpy.ndarray, array(3)) Higher-rank arrays seem to work fine: In [253]: a1 = np.array((3,4)) In [254]: type(a1), a1 Out[254]: (numpy.ndarray, array([3, 4])) In [255]: a2 = copy.deepcopy(a1) In [256]: type(a2), a2 Out[256]: (numpy.ndarray, array([3, 4])) Array scalars seem to work fine as well: In [257]: s1 = np.float32(3) In [258]: s2 = copy.deepcopy(s1) In [261]: type(s1), s1 Out[261]: (numpy.float32, 3.0) In [262]: type(s2), s2 Out[262]: (numpy.float32, 3.0) There are other ways to copy arrays, but in this case, I had a dict with a bunch of arrays in it, and needed a deepcopy of the dict. I was surprised to find that my rank-0 array got turned into a scalar. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From sebastian at sipsolutions.net Fri Apr 19 12:10:17 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 19 Apr 2013 18:10:17 +0200 Subject: [Numpy-discussion] numpy scalars and savez -- bug? In-Reply-To: <4532307196415651864@unknownmsgid> References: <4532307196415651864@unknownmsgid> Message-ID: <1366387817.3260.30.camel@sebastian-laptop> On Fri, 2013-04-19 at 08:03 -0700, Chris Barker - NOAA Federal wrote: > On Apr 18, 2013, at 11:33 PM, Nathaniel Smith wrote: > > > > > On 18 Apr 2013 01:29, "Chris Barker - NOAA Federal" > > wrote: > > > This has been annoying, particular as rank-zero scalars are kind > > of a pain. > > > > BTW, while we're on the topic, can you elaborate on this? I tend to > > think scalars (as opposed to 0d ndarrays) are kind of a pain, so I'm > > curious if you have specific issues you've run into with 0d > > ndarrays. > > > > > Well, I suppose what's really a pain is that we have both, and they > are not the same, and neither can be used in all cases one may want. > > > In the case at hand, I really wanted a datetime64 scalar. By saving > and re-loading in an npz, it got converted to a rank-zero array, which > had different behavior. In this case, the frustrating bit was how to > extract a scalar again ( which I really wanted to turn into a datetime > object). > > > After the fact, I discovered .item(), which seems to do what I want. > Fun fact, array[()] will convert a 0-d array to a scalar, but do nothing (or currently create a view) for other arrays. Which is actually a good question. Should array[()] force a view or not? - Sebastian > > On a phone now, so sorry about the lack of examples. > > > Note: I've lost track of why we need both scalers and rank-zero > arrays. I can't help thinking that there could be an object that acts > like a scalar in most contexts, but also has the array methods that > make sense. > > > But I know it's far from simple. 
> > > -Chris > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From robert.kern at gmail.com Fri Apr 19 13:21:14 2013 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 19 Apr 2013 22:51:14 +0530 Subject: [Numpy-discussion] numpy scalars and savez -- bug? In-Reply-To: References: <4532307196415651864@unknownmsgid> Message-ID: On Fri, Apr 19, 2013 at 8:45 PM, Chris Barker - NOAA Federal wrote: > Robert, > > As I think you wrote the code, you may have a quick answer: > > Given that numpy scalars do exist, and have their uses -- I found this > wiki page to remind me: > > http://projects.scipy.org/numpy/wiki/ZeroRankArray > > It would be nice if the .npy format could support them. Would that be > a major change? I'm trying to decide if this bugs me enough to work on > that. I think that is significant scope creep for the .npy format, and I would like to avoid it. A case might be made for letting np.savez() simply pickle non-arrays that it is given. I have a vague recollection that that was discussed when savez() was designed and rejected as a moral hazard, but I could be wrong. The .npy and .npz formats are intentionally limited by design. As soon as you feel constrained by those limitations, you should start using more full-fledged and standard file formats. -- Robert Kern From robert.kern at gmail.com Fri Apr 19 13:32:24 2013 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 19 Apr 2013 23:02:24 +0530 Subject: [Numpy-discussion] numpy scalars and savez -- bug? In-Reply-To: <1366387817.3260.30.camel@sebastian-laptop> References: <4532307196415651864@unknownmsgid> <1366387817.3260.30.camel@sebastian-laptop> Message-ID: On Fri, Apr 19, 2013 at 9:40 PM, Sebastian Berg wrote: > Fun fact, array[()] will convert a 0-d array to a scalar, but do nothing > (or currently create a view) for other arrays. Which is actually a good > question. Should array[()] force a view or not? Another fun fact: scalar[()] gives you a rank-0 array. :-) I think the array[()] behavior follows logically as a limiting form of multidimensional indexing. Given rank-3 array, array[(i, j, k)] gives a scalar array[(i, j)] gives a rank-1 view of the last axis array[(i,)] gives a rank-2 view of the last 2 axes array[()] gives a rank-3 view of the last 3 axes (i.e. all of them) The rank-N-general rules look like so: For a rank-N array, an N-tuple gives a scalar. Subsequent (N-M)-tuples gives appropriate rank-M views picked out by the tuple. I can't explain the scalar[()] behavior, though. :-) -- Robert Kern From sebastian at sipsolutions.net Fri Apr 19 14:00:11 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 19 Apr 2013 20:00:11 +0200 Subject: [Numpy-discussion] numpy scalars and savez -- bug? In-Reply-To: References: <4532307196415651864@unknownmsgid> <1366387817.3260.30.camel@sebastian-laptop> Message-ID: <1366394411.3260.39.camel@sebastian-laptop> On Fri, 2013-04-19 at 23:02 +0530, Robert Kern wrote: > On Fri, Apr 19, 2013 at 9:40 PM, Sebastian Berg > wrote: > > > Fun fact, array[()] will convert a 0-d array to a scalar, but do nothing > > (or currently create a view) for other arrays. Which is actually a good > > question. Should array[()] force a view or not? > > Another fun fact: scalar[()] gives you a rank-0 array. :-) > Hahahaha, thats pretty (I would say bug). > I think the array[()] behavior follows logically as a limiting form of > multidimensional indexing. 
Given rank-3 array, > > array[(i, j, k)] gives a scalar > array[(i, j)] gives a rank-1 view of the last axis > array[(i,)] gives a rank-2 view of the last 2 axes > array[()] gives a rank-3 view of the last 3 axes (i.e. all of them) > > The rank-N-general rules look like so: For a rank-N array, an N-tuple > gives a scalar. Subsequent (N-M)-tuples gives appropriate rank-M views > picked out by the tuple. > > I can't explain the scalar[()] behavior, though. :-) > Another special case... It doesn't hit the normal indexing code, and so it can never hit the tuple of integers special case to get converted back to a scalar. But it also always gets converted to an array first. So many special cases that are hard to miss, there is a reason I started rewriting it ;). Still not sure if one should force views sometimes though, but I doubt it matters... - Sebastian > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From chris.barker at noaa.gov Fri Apr 19 14:02:00 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Fri, 19 Apr 2013 11:02:00 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.7.1 release In-Reply-To: References: Message-ID: On Fri, Apr 19, 2013 at 8:46 AM, Nathaniel Smith wrote: >> Nice work -- but darn! I was hoping a change/fix to teh datetime64 >> timezone handlien could get into the next release -- oh well. > > That's probably too big a behavioural chance to go into a point > release in any case... well, datetime64 is marked as experimental, and clearly quite broken (in this regard), so I think a point release change would be fine -- the fewer people use it this way the better. > I think the current schedule for 1.8 is "any time now, as soon as we > manage to get around to it". then the point-release question is a non-issue anyway. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From chris.barker at noaa.gov Fri Apr 19 14:21:26 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Fri, 19 Apr 2013 11:21:26 -0700 Subject: [Numpy-discussion] numpy scalars and savez -- bug? In-Reply-To: References: <4532307196415651864@unknownmsgid> Message-ID: On Fri, Apr 19, 2013 at 10:21 AM, Robert Kern wrote: > On Fri, Apr 19, 2013 at 8:45 PM, Chris Barker - NOAA Federal > wrote: >> Given that numpy scalars do exist, and have their uses -- I found this >> wiki page to remind me: >> >> http://projects.scipy.org/numpy/wiki/ZeroRankArray >> >> It would be nice if the .npy format could support them. Would that be >> a major change? I'm trying to decide if this bugs me enough to work on >> that. > > I think that is significant scope creep for the .npy format, and I > would like to avoid it. hmm -- maybe it's more work that we want, but it seems to me that numpy scalars are part and parcel of numpy -- so it makes sense for .npy to save them. > A case might be made for letting np.savez() > simply pickle non-arrays that it is given. I have a vague recollection > that that was discussed when savez() was designed and rejected as a > moral hazard, but I could be wrong. That could be a nice solution -- I'm not _so_ worried about moral hazards! > The .npy and .npz formats are > intentionally limited by design. 
As soon as you feel constrained by > those limitations, you should start using more full-fledged and > standard file formats. well, maybe -- in this case, I'm using it to cache a bunch of data on disk. The data are all in a dict of numpy arrays, so it was really natural and easy (and I presume fast) to use npz. All I want to is dump it to disk, and get back the same way. It worked great. Then I needed a datetime stored with it all -- so I figured a datetime64 scalar would be perfect. It's not a huge deal to use a rank-zero array instead, but it would have been nicer to be able to store a scalar (I suppose one trick may be that there are numpy scalars, and there are regular old pyton scalars...) Anyway -- going to HDF, or netcdf, or role-your-own really seems like overkill for this. I just need something fast and simple and it doesn't need to interchange with anything else. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From njs at pobox.com Fri Apr 19 14:31:00 2013 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 19 Apr 2013 19:31:00 +0100 Subject: [Numpy-discussion] numpy scalars and savez -- bug? In-Reply-To: References: <4532307196415651864@unknownmsgid> Message-ID: On 19 Apr 2013 19:22, "Chris Barker - NOAA Federal" wrote: > Anyway -- going to HDF, or netcdf, or role-your-own really seems like > overkill for this. I just need something fast and simple and it > doesn't need to interchange with anything else. Just use pickle...? -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Apr 19 15:06:02 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Fri, 19 Apr 2013 12:06:02 -0700 Subject: [Numpy-discussion] numpy scalars and savez -- bug? In-Reply-To: References: <4532307196415651864@unknownmsgid> Message-ID: On Fri, Apr 19, 2013 at 11:31 AM, Nathaniel Smith wrote: > On 19 Apr 2013 19:22, "Chris Barker - NOAA Federal" > wrote: >> Anyway -- going to HDF, or netcdf, or role-your-own really seems like >> overkill for this. I just need something fast and simple and it >> doesn't need to interchange with anything else. > > Just use pickle...? hmm -- for some reason, I always have thought as pickle as unreliable and ill-suited to numpy arrays -- we developed savez for a reason... but maybe I just need to give it a shot and see how it works. Thanks, -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From pav at iki.fi Fri Apr 19 15:10:27 2013 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 19 Apr 2013 22:10:27 +0300 Subject: [Numpy-discussion] numpy scalars and savez -- bug? In-Reply-To: References: <4532307196415651864@unknownmsgid> Message-ID: 19.04.2013 22:06, Chris Barker - NOAA Federal kirjoitti: > On Fri, Apr 19, 2013 at 11:31 AM, Nathaniel Smith wrote: >> On 19 Apr 2013 19:22, "Chris Barker - NOAA Federal" >> wrote: >>> Anyway -- going to HDF, or netcdf, or role-your-own really seems like >>> overkill for this. I just need something fast and simple and it >>> doesn't need to interchange with anything else. >> >> Just use pickle...? 
> > hmm -- for some reason, I always have thought as pickle as unreliable > and ill-suited to numpy arrays -- we developed savez for a reason... > but maybe I just need to give it a shot and see how it works. protocol=2 so it doesn't needlessly ascii-quote the data. -- Pauli Virtanen From robert.kern at gmail.com Fri Apr 19 15:25:45 2013 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 20 Apr 2013 00:55:45 +0530 Subject: [Numpy-discussion] numpy scalars and savez -- bug? In-Reply-To: References: <4532307196415651864@unknownmsgid> Message-ID: On Sat, Apr 20, 2013 at 12:36 AM, Chris Barker - NOAA Federal wrote: > On Fri, Apr 19, 2013 at 11:31 AM, Nathaniel Smith wrote: >> On 19 Apr 2013 19:22, "Chris Barker - NOAA Federal" >> wrote: >>> Anyway -- going to HDF, or netcdf, or role-your-own really seems like >>> overkill for this. I just need something fast and simple and it >>> doesn't need to interchange with anything else. >> >> Just use pickle...? > > hmm -- for some reason, I always have thought as pickle as unreliable > and ill-suited to numpy arrays -- we developed savez for a reason... > but maybe I just need to give it a shot and see how it works. The rationale behind .npy format are laid out here: https://github.com/numpy/numpy/blob/master/doc/neps/npy-format.txt -- Robert Kern From bakhtiyor_zokhidov at mail.ru Fri Apr 19 19:01:08 2013 From: bakhtiyor_zokhidov at mail.ru (=?UTF-8?B?QmFraHRpeW9yIFpva2hpZG92?=) Date: Sat, 20 Apr 2013 03:01:08 +0400 Subject: [Numpy-discussion] =?utf-8?q?Reading_String_type_to_find_long_=22?= =?utf-8?q?magic_word=22?= Message-ID: <1366412468.869435448@f298.mail.ru> Hello everybody, I just have one long string type:ThesampletextthatcouldbereadedthesameinbothordersArozaupalanalapuazorA The result I want to take is?ArozaupalanalapuazorA - which means reading directly each letter should be the same as reading?reversely?...? Is there any function which can deals with this problem in Python??? Thanks for answer -- Bakhti -------------- next part -------------- An HTML attachment was scrubbed... URL: From xabart at gmail.com Fri Apr 19 19:11:55 2013 From: xabart at gmail.com (Xavier Barthelemy) Date: Sat, 20 Apr 2013 09:11:55 +1000 Subject: [Numpy-discussion] what do I get if I build with MKL? In-Reply-To: References: Message-ID: One major advantage you can have using mkl is installing "numexpr" compiling it with MLK. That's a strong suggestion to easily use mkl and go faster on common operations. Xavier On 20/04/2013 1:16 AM, "Matthieu Brucher" wrote: > The graph is a comparison of the dot calls, of course they are better with > MKL than the default BLAS version ;) > For the rest, Numpy doesn't benefit from MKL, scipy may if they call > LAPACK functions wrapped by Numpy or Scipy (I don't remember which does the > wrapping). > > Matthieu > > > 2013/4/19 KACVINSKY Tom > >> Looks like the *lapack_lite files have internal calls to dgemm. I alos >> found this: >> >> >> >> http://software.intel.com/en-us/articles/numpyscipy-with-intel-mkl >> >> >> >> So it looks like numpy/scipy performs better with MKL, regardless of how >> the MKL routines are called (directly, or via a numpy/scipy interface). >> >> >> >> Tom >> >> >> >> *From:* numpy-discussion-bounces at scipy.org [mailto: >> numpy-discussion-bounces at scipy.org] *On Behalf Of *Matthieu Brucher >> *Sent:* Friday, April 19, 2013 9:50 AM >> >> *To:* Discussion of Numerical Python >> *Subject:* Re: [Numpy-discussion] what do I get if I build with MKL? 
>> >> >> >> For the matrix multiplication or array dot, you use BLAS3 functions as >> they are more or less the same. For the rest, nothing inside Numpy uses >> BLAS or LAPACK explicitelly IIRC. You have to do the calls yourself. >> >> >> >> 2013/4/19 Neal Becker >> >> KACVINSKY Tom wrote: >> >> > You also get highly optimized BLAS routines, like dgemm and degemv. >> >> And does numpy/scipy just then automatically use them? When I do a matrix >> multiply, for example? >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> >> >> -- >> Information System Engineer, Ph.D. >> Blog: http://matt.eifelle.com >> LinkedIn: http://www.linkedin.com/in/matthieubrucher >> Music band: http://liliejay.com/ >> >> This email and any attachments are intended solely for the use of the >> individual or entity to whom it is addressed and may be confidential and/or >> privileged. >> >> If you are not one of the named recipients or have received this email in >> error, >> >> (i) you should not read, disclose, or copy it, >> >> (ii) please notify sender of your receipt by reply email and delete this >> email and all attachments, >> >> (iii) Dassault Systemes does not accept or assume any liability or >> responsibility for any use of or reliance on this email. >> >> For other languages, go to http://www.3ds.com/terms/email-disclaimer >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > Information System Engineer, Ph.D. > Blog: http://matt.eifelle.com > LinkedIn: http://www.linkedin.com/in/matthieubrucher > Music band: http://liliejay.com/ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kalatsky at gmail.com Fri Apr 19 20:05:27 2013 From: kalatsky at gmail.com (Val Kalatsky) Date: Fri, 19 Apr 2013 19:05:27 -0500 Subject: [Numpy-discussion] Reading String type to find long "magic word" In-Reply-To: <1366412468.869435448@f298.mail.ru> References: <1366412468.869435448@f298.mail.ru> Message-ID: Here's a seed for your function: s = 'ThesampletextthatcouldbereadedthesameinbothordersArozaupalanalapuazorA' f = np.array(list(s)).view('int8').astype(float) f -= f.mean() maybe_here = np.argmax(np.convolve(f,f))/2 magic = 10 print s[maybe_here - magic:maybe_here + magic + 1] Let us now how to assign significance to the result and to find optimal 'magic' Val On Fri, Apr 19, 2013 at 6:01 PM, Bakhtiyor Zokhidov < bakhtiyor_zokhidov at mail.ru> wrote: > Hello everybody, > > I just have one long string > type:ThesampletextthatcouldbereadedthesameinbothordersArozaupalanalapuazorA > > The result I want to take is ArozaupalanalapuazorA - which means reading > directly each letter should be the same as reading reversely ... > > Is there any function which can deals with this problem in Python??? > > Thanks for answer > > > -- > Bakhti > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
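For comparison, a plain brute-force version with no numpy at all: expand
around every possible center and keep the longest palindromic substring
found (quadratic, but plenty fast for strings of this size):

    def longest_palindrome(s):
        best = ''
        for center in range(len(s)):
            # (center, center) covers odd lengths, (center, center+1) even ones
            for lo, hi in ((center, center), (center, center + 1)):
                while lo >= 0 and hi < len(s) and s[lo] == s[hi]:
                    lo -= 1
                    hi += 1
                cand = s[lo + 1:hi]
                if len(cand) > len(best):
                    best = cand
        return best

    s = 'ThesampletextthatcouldbereadedthesameinbothordersArozaupalanalapuazorA'
    print(longest_palindrome(s))   # -> ArozaupalanalapuazorA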
URL: From bakhtiyor_zokhidov at mail.ru Fri Apr 19 20:19:48 2013 From: bakhtiyor_zokhidov at mail.ru (=?UTF-8?B?QmFraHRpeW9yIFpva2hpZG92?=) Date: Sat, 20 Apr 2013 04:19:48 +0400 Subject: [Numpy-discussion] =?utf-8?q?Reading_String_type_to_find_long_=22?= =?utf-8?q?magic_word=22?= References: <1366412468.869435448@f298.mail.ru> Message-ID: <1366417188.241827415@f266.mail.ru> Hi, I am a bit unaware with that you put magic = 10 . Why? ???????, 19 ?????? 2013, 19:05 -05:00 ?? Val Kalatsky : >Here's a seed for your function: > >s = ' Thesampletextthatcouldbereaded thesameinbothordersArozaupalan alapuazorA ' >f = np.array(list(s)).view('int8').astype(float) >f -= f.mean() >maybe_here = np.argmax(np.convolve(f,f))/2 >magic = 10 >print?s[maybe_here - magic:maybe_here + magic + 1] > > >Let us now how to assign significance to the result and to find optimal 'magic' > >Val > > >On Fri, Apr 19, 2013 at 6:01 PM, Bakhtiyor Zokhidov < bakhtiyor_zokhidov at mail.ru > wrote: >>Hello everybody, >> >>I just have one long string type:ThesampletextthatcouldbereadedthesameinbothordersArozaupalanalapuazorA >> >>The result I want to take is?ArozaupalanalapuazorA - which means reading directly each letter should be the same as reading?reversely?...? >> >>Is there any function which can deals with this problem in Python??? >> >>Thanks for answer >> >> >>-- >>Bakhti >>_______________________________________________ >>NumPy-Discussion mailing list >>NumPy-Discussion at scipy.org >>http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Sun Apr 21 06:59:33 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 21 Apr 2013 12:59:33 +0200 Subject: [Numpy-discussion] KeepDims flag? Message-ID: <1366541973.5989.50.camel@sebastian-laptop> Hi, just something that has been spooking around in my mind. Considering that matrix indexing does not really support fancy indexing, I was wondering about introducing a KeepDims flag. Maybe it is not worth it, at least not unless other subclasses could make use of it, too. And a big reason for it being not worth the trouble is probably that you could not use it inside the current matrix class because it would i.e. break workarounds for ufunc reductions (like np.multiply.reduce(a, axis=1).T, as the .T would be unnecessary). Such a flag could only be set (not unset) and never be set on base class arrays (basically it should be set by array_finalize). If set it would toggle ufunc reductions to always use keepdims (unless the reduction is to a scalar, maybe). And the same thing for indexing (meaning that some fancy indices and np.newaxis would just error out), though axes added by broadcasting should be caught by the subclass itself. That way, a matrix-like class would normally have a 1:1 mapping of old to new axes (even if they might be transposed or elements arbitrarily shuffled), and does not have to do magic to guess where to add the missing one (instead the magic is done in the core, where it is actually easier to implement). Anyway, as I never use matrix I personally have no real use for it, but I thought I would throw the thought out there. For starters one might rather think about something specific to indexing. 
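For illustration, the per-call behaviour such a flag would make automatic
already exists as the keepdims argument that reductions grew in 1.7:

    import numpy as np

    a = np.arange(6).reshape(2, 3)
    a.sum(axis=1).shape                  # (2,)   -- the reduced axis is dropped
    a.sum(axis=1, keepdims=True).shape   # (2, 1) -- kept as length one

    # Keeping the axis means the result still broadcasts against `a`:
    normalized = a - a.mean(axis=1, keepdims=True)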
Regards, Sebastian From ribonucleico at gmail.com Sun Apr 21 13:35:12 2013 From: ribonucleico at gmail.com (James Jong) Date: Sun, 21 Apr 2013 13:35:12 -0400 Subject: [Numpy-discussion] Can Numpy use static libraries from LAPACK? Message-ID: Note: I started a thread in StackOverflow a few days ago with this question, but I have not received any response yet (the link is: http://stackoverflow.com/questions/16093910/numpy-and-scipy-static-vs-dynamic-loading ) The question is the following: Say that I build ATLAS with LAPACK as follows: wget http://sourceforge.net/projects/math-atlas/files/Stable/3.10.1/atlas3.10.1.tar.bz2/download wget http://www.netlib.org/lapack/lapack-3.4.2.tgz tar -jxvf atlas3.10.1.tar.bz2 mkdir BUILD cd BUILD ../ATLAS/configure -b 64 -Fa alg -fPIC \ --with-netlib-lapack-tarfile=../lapack-3.4.2.tgz \ --prefix= make cd lib make shared make ptshared cd .. make install Note that I did *not *pass the flag --shared in .my call to configure. I end up with the following files under BUILD/lib: Make.inc@ Makefile the following .a files: libatlas.a libcblas.a libf77blas.a libptf77blas.a libtstatlas.a liblapack.a libf77refblas.a libptlapack.a libptcblas.a and the following .so files: libsatlas.so* libtatlas.so* Finally, if I define: BLAS=/path_to_BUILD/lib/libcblas.a LAPACK=/path_to_BUILD/lib/liblapack.a ATLAS=/path_to_BUILD/lib/libatlas.a and add /path_to_BUILD/lib to LD_LIBRARY_PATH and to the library_dirs variable within thesite.cfg file in NumPy. Would NumPy and SciPy use my libraries? (even though they all seem to be static?). Thanks, Jason -------------- next part -------------- An HTML attachment was scrubbed... URL: From ilanschnell at gmail.com Sun Apr 21 14:24:52 2013 From: ilanschnell at gmail.com (Ilan Schnell) Date: Sun, 21 Apr 2013 13:24:52 -0500 Subject: [Numpy-discussion] Can Numpy use static libraries from LAPACK? In-Reply-To: References: Message-ID: Hello Jason, the answer is yes. This is how my site.cfg on Linux look like: [DEFAULT] library_dirs = /lib include_dirs = /include [blas_opt] libraries = f77blas, cblas, atlas [lapack_opt] libraries = lapack, f77blas, cblas, atlas - Ilan On Sun, Apr 21, 2013 at 12:35 PM, James Jong wrote: > Note: I started a thread in StackOverflow a few days ago with this > question, but I have not received any response yet (the link is: > http://stackoverflow.com/questions/16093910/numpy-and-scipy-static-vs-dynamic-loading > ) > > The question is the following: > > Say that I build ATLAS with LAPACK as follows: > > wget http://sourceforge.net/projects/math-atlas/files/Stable/3.10.1/atlas3.10.1.tar.bz2/download > wget http://www.netlib.org/lapack/lapack-3.4.2.tgz > tar -jxvf atlas3.10.1.tar.bz2 > mkdir BUILD > cd BUILD > ../ATLAS/configure -b 64 -Fa alg -fPIC \ > --with-netlib-lapack-tarfile=../lapack-3.4.2.tgz \ > --prefix= > make > cd lib > make shared > make ptshared > cd .. > make install > > Note that I did *not *pass the flag --shared in .my call to configure. > > I end up with the following files under BUILD/lib: > > Make.inc@ > Makefile > > the following .a files: > > libatlas.a > libcblas.a > libf77blas.a > libptf77blas.a > libtstatlas.a > liblapack.a > libf77refblas.a > libptlapack.a > libptcblas.a > > and the following .so files: > > libsatlas.so* > libtatlas.so* > > Finally, if I define: > > BLAS=/path_to_BUILD/lib/libcblas.a > LAPACK=/path_to_BUILD/lib/liblapack.a > ATLAS=/path_to_BUILD/lib/libatlas.a > > and add /path_to_BUILD/lib to LD_LIBRARY_PATH and to the library_dirs variable > within thesite.cfg file in NumPy. 
> > Would NumPy and SciPy use my libraries? (even though they all seem to be > static?). > > Thanks, > > Jason > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ribonucleico at gmail.com Sun Apr 21 14:48:57 2013 From: ribonucleico at gmail.com (James Jong) Date: Sun, 21 Apr 2013 14:48:57 -0400 Subject: [Numpy-discussion] Can Numpy use static libraries from LAPACK? In-Reply-To: References: Message-ID: Thanks a lot Ilan, That's great to know. Do you know if there is any way to verify this? Perhaps seeing which specific files with their extensions are actually Numpy loads and uses? Jason On Sun, Apr 21, 2013 at 2:24 PM, Ilan Schnell wrote: > Hello Jason, > the answer is yes. This is how my site.cfg on Linux look like: > > [DEFAULT] > library_dirs = /lib > include_dirs = /include > > [blas_opt] > libraries = f77blas, cblas, atlas > > [lapack_opt] > libraries = lapack, f77blas, cblas, atlas > > - Ilan > > On Sun, Apr 21, 2013 at 12:35 PM, James Jong wrote: > >> Note: I started a thread in StackOverflow a few days ago with this >> question, but I have not received any response yet (the link is: >> http://stackoverflow.com/questions/16093910/numpy-and-scipy-static-vs-dynamic-loading >> ) >> >> The question is the following: >> >> Say that I build ATLAS with LAPACK as follows: >> >> wget http://sourceforge.net/projects/math-atlas/files/Stable/3.10.1/atlas3.10.1.tar.bz2/download >> wget http://www.netlib.org/lapack/lapack-3.4.2.tgz >> tar -jxvf atlas3.10.1.tar.bz2 >> mkdir BUILD >> cd BUILD >> ../ATLAS/configure -b 64 -Fa alg -fPIC \ >> --with-netlib-lapack-tarfile=../lapack-3.4.2.tgz \ >> --prefix= >> make >> cd lib >> make shared >> make ptshared >> cd .. >> make install >> >> Note that I did *not *pass the flag --shared in .my call to configure. >> >> I end up with the following files under BUILD/lib: >> >> Make.inc@ >> Makefile >> >> the following .a files: >> >> libatlas.a >> libcblas.a >> libf77blas.a >> libptf77blas.a >> libtstatlas.a >> liblapack.a >> libf77refblas.a >> libptlapack.a >> libptcblas.a >> >> and the following .so files: >> >> libsatlas.so* >> libtatlas.so* >> >> Finally, if I define: >> >> BLAS=/path_to_BUILD/lib/libcblas.a >> LAPACK=/path_to_BUILD/lib/liblapack.a >> ATLAS=/path_to_BUILD/lib/libatlas.a >> >> and add /path_to_BUILD/lib to LD_LIBRARY_PATH and to the library_dirs variable >> within thesite.cfg file in NumPy. >> >> Would NumPy and SciPy use my libraries? (even though they all seem to be >> static?). >> >> Thanks, >> >> Jason >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Sun Apr 21 15:11:26 2013 From: cournape at gmail.com (David Cournapeau) Date: Sun, 21 Apr 2013 20:11:26 +0100 Subject: [Numpy-discussion] Can Numpy use static libraries from LAPACK? In-Reply-To: References: Message-ID: On Sun, Apr 21, 2013 at 7:48 PM, James Jong wrote: > Thanks a lot Ilan, > > That's great to know. Do you know if there is any way to verify this? 
> Perhaps by seeing which specific files (with their extensions) NumPy
> actually loads and uses?

numpy.show_config() will give you the configuration set up at build time.

You can't see static libraries being loaded: once linked, the static
library is out of the picture. Generally, to check which libraries are
linked (dynamically), you use ldd on Unix, otool -L on Mac and dumpbin or
Dependency Walker on Windows.

David
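To make David's suggestion concrete, here is a sketch of the checks (the
lapack_lite path below is illustrative; the exact filename and location
vary by platform and NumPy version):

    >>> import numpy as np
    >>> np.show_config()             # prints the blas/lapack sections found at build time
    >>> np.core.multiarray.__file__  # locate a compiled extension worth inspecting

    $ ldd /path/to/site-packages/numpy/linalg/lapack_lite.so   # Linux
    $ otool -L /path/to/.../lapack_lite.so                     # Mac

Statically linked ATLAS/LAPACK code will simply not appear in the ldd
output; only dynamically linked libraries are listed there.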
From christopher.ruesch at gmail.com  Sun Apr 21 17:41:03 2013
From: christopher.ruesch at gmail.com (Christopher Ruesch)
Date: Sun, 21 Apr 2013 17:41:03 -0400
Subject: [Numpy-discussion] NumPy-Discussion Digest, Vol 79, Issue 67
In-Reply-To: References: Message-ID:

help

On Sun, Apr 21, 2013 at 2:44 PM, wrote:

> Send NumPy-Discussion mailing list submissions to
>         numpy-discussion at scipy.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         http://mail.scipy.org/mailman/listinfo/numpy-discussion
> or, via email, send a message with subject or body 'help' to
>         numpy-discussion-request at scipy.org
>
> You can reach the person managing the list at
>         numpy-discussion-owner at scipy.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of NumPy-Discussion digest..."
>
> Today's Topics:
>
>    1. Can Numpy use static libraries from LAPACK? (James Jong)
>    2. Re: Can Numpy use static libraries from LAPACK? (Ilan Schnell)
>    3. Re: Can Numpy use static libraries from LAPACK? (James Jong)
>
> [The three digest messages, duplicated verbatim from the thread above,
> snipped.]
>
> End of NumPy-Discussion Digest, Vol 79, Issue 67
> ************************************************
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com  Mon Apr 22 13:00:06 2013
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 22 Apr 2013 11:00:06 -0600
Subject: [Numpy-discussion] KeepDims flag?
In-Reply-To: <1366541973.5989.50.camel@sebastian-laptop>
References: <1366541973.5989.50.camel@sebastian-laptop>
Message-ID:

On Sun, Apr 21, 2013 at 4:59 AM, Sebastian Berg wrote:

> Hi,
>
> just something that has been spooking around in my mind. Considering
> that matrix indexing does not really support fancy indexing, I was
> wondering about introducing a KeepDims flag. Maybe it is not worth it,
> at least not unless other subclasses could make use of it, too. And a
> big reason for it being not worth the trouble is probably that you could
> not use it inside the current matrix class because it would, e.g., break
> workarounds for ufunc reductions (like np.multiply.reduce(a, axis=1).T,
> as the .T would be unnecessary).
>
> Such a flag could only be set (not unset) and never be set on base class
> arrays (basically it should be set by array_finalize). If set it would
> toggle ufunc reductions to always use keepdims (unless the reduction is
> to a scalar, maybe). And the same thing for indexing (meaning that some
> fancy indices and np.newaxis would just error out), though axes added by
> broadcasting should be caught by the subclass itself.
>
> That way, a matrix-like class would normally have a 1:1 mapping of old
> to new axes (even if they might be transposed or elements arbitrarily
> shuffled), and does not have to do magic to guess where to add the
> missing one (instead the magic is done in the core, where it is actually
> easier to implement).
>
> Anyway, as I never use matrix I personally have no real use for it, but
> I thought I would throw the thought out there. For starters one might
> rather think about something specific to indexing.

I'd be hesitant to add another flag. Perhaps a better direction to go
would be to add row/column vectors, something that has been much
discussed. There is also a keepdims flag for reduce operations that is a
fairly new addition that I don't think has been exploited in the matrix
class. ISTR discussion of adding a facility for subclasses to intercept
ufunc calls, or something along those lines which would help with that,
both for matrix and for masked arrays.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
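For reference, the keepdims flag Chuck mentions behaves like this on a
plain ndarray (a small sketch; keepdims was added to the reductions in
NumPy 1.7):

    >>> import numpy as np
    >>> a = np.ones((3, 4))
    >>> a.sum(axis=1).shape
    (3,)
    >>> a.sum(axis=1, keepdims=True).shape  # reduced axis kept as length 1
    (3, 1)

A matrix-like subclass that always wants 2-D results could toggle such
behavior once instead of patching up shapes after every reduction, which
is the gist of Sebastian's proposal.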
From jjhelmus at gmail.com  Tue Apr 23 13:13:52 2013
From: jjhelmus at gmail.com (Jonathan Helmus)
Date: Tue, 23 Apr 2013 12:13:52 -0500
Subject: [Numpy-discussion] Vectorized percentile function in Numpy (PR #2970)
Message-ID: <5176C150.6020904@gmail.com>

Back in December it was pointed out on the scipy-user list[1] that
numpy has a percentile function which has similar functionality to
scipy's stats.scoreatpercentile. I've been trying to harmonize these
two functions into a single version which has the features of both.
Scipy PR 374[2] introduced a version which took the parameters from
both the scipy and numpy percentile functions and was accepted into Scipy
with the plan that it would be deprecated when a similar function was
introduced into Numpy. Then I moved to enhancing the Numpy version with
Pull Request 2970 [3]. With some input from Sebastian Berg the
percentile function was rewritten with further vectorization, but
neither of us felt fully comfortable with the final product. Can
someone look at the implementation in the PR and suggest what should be
done from here?

Cheers,

    - Jonathan Helmus

[1] http://thread.gmane.org/gmane.comp.python.scientific.user/33331
[2] https://github.com/scipy/scipy/pull/374
[3] https://github.com/numpy/numpy/pull/2970

From nouiz at nouiz.org  Tue Apr 23 14:10:54 2013
From: nouiz at nouiz.org (Frédéric Bastien)
Date: Tue, 23 Apr 2013 14:10:54 -0400
Subject: [Numpy-discussion] ANN: NumPy 1.7.1 release
In-Reply-To: References: Message-ID:

Hi,

A big thanks for that release.

I also think it would have been useful to do a release candidate for
this one. This release changed the behavior related to Python longs and
broke a test in Theano. Nothing important, but we could have fixed it
before the release. The NumPy change is that a Python long that doesn't
fit in an int64, but fits in a uint64, was throwing an overflow
exception. Now it returns a uint64.

thanks again!

Fred

On Sun, Apr 7, 2013 at 4:09 AM, Ondřej Čertík wrote:

> Hi,
>
> I'm pleased to announce the availability of the final NumPy 1.7.1 release.
>
> Sources and binary installers can be found at
> https://sourceforge.net/projects/numpy/files/NumPy/1.7.1/
>
> Only three simple bugs were fixed since 1.7.1rc1 (#3166, #3179, #3187).
>
> I would like to thank everybody who contributed patches since 1.7.1rc1:
> Eric Fode, Nathaniel J. Smith and Charles Harris.
>
> Cheers,
> Ondrej
>
> P.S. I'll create the Mac binary installers in a few days. Pypi is updated.
>
>
> =========================
> NumPy 1.7.1 Release Notes
> =========================
>
> This is a bugfix only release in the 1.7.x series.
>
>
> Issues fixed
> ------------
>
> gh-2973   Fix `1` is printed during numpy.test()
> gh-2983   BUG: gh-2969: Backport memory leak fix 80b3a34.
> gh-3007   Backport gh-3006
> gh-2984   Backport fix complex polynomial fit
> gh-2982   BUG: Make nansum work with booleans.
> gh-2985   Backport large sort fixes
> gh-3039   Backport object take
> gh-3105   Backport nditer fix op axes initialization
> gh-3108   BUG: npy-pkg-config ini files were missing after Bento build.
> gh-3124   BUG: PyArray_LexSort allocates too much temporary memory.
> gh-3131   BUG: Exported f2py_size symbol prevents linking multiple f2py
>           modules.
> gh-3117   Backport gh-2992
> gh-3135   DOC: Add mention of PyArray_SetBaseObject stealing a reference
> gh-3134   DOC: Fix typo in fft docs (the indexing variable is 'm', not
>           'n').
> gh-3136   Backport #3128
>
> Checksums
> =========
>
> 9e369a96b94b107bf3fab7e07fef8557
> release/installers/numpy-1.7.1-win32-superpack-python2.6.exe
> 0ab72b3b83528a7ae79c6df9042d61c6  release/installers/numpy-1.7.1.tar.gz
> bb0d30de007d649757a2d6d2e1c59c9a
> release/installers/numpy-1.7.1-win32-superpack-python3.2.exe
> 9a72db3cad7a6286c0d22ee43ad9bc6c  release/installers/numpy-1.7.1.zip
> 0842258fad82060800b8d1f0896cb83b
> release/installers/numpy-1.7.1-win32-superpack-python3.1.exe
> 1b8f29b1fa89a801f83f551adc13aaf5
> release/installers/numpy-1.7.1-win32-superpack-python2.7.exe
> 9ca22df942e5d5362cf7154217cb4b69
> release/installers/numpy-1.7.1-win32-superpack-python2.5.exe
> 2fd475b893d8427e26153e03ad7d5b69
> release/installers/numpy-1.7.1-win32-superpack-python3.3.exe
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
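The change Frédéric describes is easy to reproduce (a sketch; the exact
1.7.0 error text may have differed):

    >>> import numpy as np
    >>> n = 2**64 - 1          # fits in uint64 but not in int64
    >>> np.array(n).dtype
    dtype('uint64')            # on 1.7.1; 1.7.0 raised OverflowError here

Values that already fit in int64 are unaffected; only Python longs in the
gap between int64 and uint64 changed behavior.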
From wesmckinn at gmail.com  Tue Apr 23 14:34:08 2013
From: wesmckinn at gmail.com (Wes McKinney)
Date: Tue, 23 Apr 2013 11:34:08 -0700
Subject: [Numpy-discussion] ANN: pandas 0.11.0 released!
Message-ID:

hi all,

We've released pandas 0.11.0, a big release that spans 3 months of
continuous development, led primarily by the intrepid Jeff Reback and
y-p. The release brings many new features, performance and API
improvements, bug fixes, and other goodies. Some highlights:

- New precision indexing fields loc, iloc, at, and iat, to reduce
  occasional ambiguity in the catch-all hitherto ix method.
- Expanded support for NumPy data types in DataFrame
- NumExpr integration to accelerate various operator evaluation
- New Cookbook and 10 minutes to pandas pages in the documentation by
  Jeff Reback
- Improved DataFrame to CSV exporting performance
- Experimental "rplot" branch with faceted plots with matplotlib
  merged and open for community hacking

Source archives and Windows installers are on PyPI. Thanks to all who
contributed to this release, especially Jeff and y-p.

What's new: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html
Installers: http://pypi.python.org/pypi/pandas

$ git log v0.10.1..v0.11.0 --pretty=format:%aN | sort | uniq -c | sort -rn
    308 y-p
    279 jreback
     85 Vytautas Jancauskas
     74 Wes McKinney
     25 Stephen Lin
     22 Andy Hayden
     19 Chang She
     13 Wouter Overmeire
      8 Spencer Lyon
      6 Phillip Cloud
      6 Nicholaus E. Halecky
      5 Thierry Moisan
      5 Skipper Seabold
      4 waitingkuo
      4 Loïc Estève
      4 Jeff Reback
      4 Garrett Drapala
      4 Alvaro Tejero-Cantero
      3 lexual
      3 Dražen Lučanin
      3 dieterv77
      3 dengemann
      3 Dan Birken
      3 Adam Greenhall
      2 Will Furnass
      2 Vytautas Jančauskas
      2 Robert Gieseke
      2 Peter Prettenhofer
      2 Jonathan Chambers
      2 Dieter Vandenbussche
      2 Damien Garaud
      2 Christopher Whelan
      2 Chapman Siu
      2 Brad Buran
      1 vytas
      1 Tim Akinbo
      1 Thomas Kluyver
      1 thauck
      1 stephenwlin
      1 K.-Michael Aye
      1 Karmel Allison
      1 Jeremy Wagner
      1 James Casbon
      1 Illia Polosukhin
      1 Dražen Lučanin
      1 davidjameshumphreys
      1 Dan Davison
      1 Chris Withers
      1 Christian Geier
      1 anomrake

Happy data hacking!

- Wes

What is it
==========
pandas is a Python package providing fast, flexible, and expressive data
structures designed to make working with relational, time series, or any
other kind of labeled data both easy and intuitive. It aims to be the
fundamental high-level building block for doing practical, real world
data analysis in Python.

Links
=====
Release Notes: http://github.com/pydata/pandas/blob/master/RELEASE.rst
Documentation: http://pandas.pydata.org
Installers: http://pypi.python.org/pypi/pandas
Code Repository: http://github.com/pydata/pandas
Mailing List: http://groups.google.com/group/pydata

From nouiz at nouiz.org  Tue Apr 23 17:08:19 2013
From: nouiz at nouiz.org (Frédéric Bastien)
Date: Tue, 23 Apr 2013 17:08:19 -0400
Subject: [Numpy-discussion] MapIter api
In-Reply-To: References: <1366043354.8595.21.camel@sebastian-laptop>
Message-ID:

Hi,

this is currently used in Theano! In fact, it was John S. who
implemented it in NumPy to allow a fast gradient of advanced indexing
in Theano. It allows code like:

matrix1[vector1, vector2] += matrix2

where there are duplicate indices in the vectors.

In looking at the code, I saw that it uses at least these parts of the
interface:

PyArrayMapIterObject
PyArray_MapIterNext
PyArray_ITER_NEXT
PyArray_MapIterSwapAxes
PyArray_BroadcastToShape

I lost the end of this discussion, but I think this is still not
possible in plain NumPy, as there was no agreement to include it. But I
remember a few other users on this list asking for this (and they were
Theano users, to my knowledge).

So I would prefer that you don't remove the parts that we use for the
next 1.8 release.

thanks

Frédéric

On Tue, Apr 16, 2013 at 9:54 AM, Nathaniel Smith wrote:

> On Mon, Apr 15, 2013 at 5:29 PM, Sebastian Berg wrote:
> > Hey,
> >
> > the MapIter API has only been made public in master right? So it is no
> > problem at all to change at least the mapiter struct, right?
> >
> > I got annoyed at all those special cases that make things difficult to
> > get an idea where to put i.e.
to fix the boolean array-like stuff. So > > actually started rewriting it (and I already got one big function that > > does all index preparation -- ok it is untested but its basically > > there). > > > > I would guess it is not really a big problem even if it was public for > > longer, since you shouldn't do those direct struct access probably? But > > just checking. > > Why don't we just make the struct opaque, i.e., just declare it in the > public header file and move the actual definition to an internal > header file? > > If it's too annoying I guess we could even make it non-public, at > least in 1.8 -- IIRC it's only there so we can use it in umath, and > IIRC the patch to use it hasn't landed yet. Or we could just merge > umath and multiarray into a single .so, that would save a *lot* of > annoying fiddling with the public API that doesn't actually serve any > purpose. > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Apr 23 18:06:20 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 24 Apr 2013 00:06:20 +0200 Subject: [Numpy-discussion] MapIter api In-Reply-To: References: <1366043354.8595.21.camel@sebastian-laptop> Message-ID: <1366754780.17435.18.camel@sebastian-laptop> On Tue, 2013-04-23 at 17:08 -0400, Fr?d?ric Bastien wrote: > Hi, > > this is currently used in Theano! In fact, it is a John S. that > implemented it in NumPy to allow fast gradient of the advanced > indexing in Theano. It allow code like: > > > matrix1[vector1, vector2] += matrix2 > Yes, I had missed that and thought maybe nobody actually used it yet. I gave some points why I think there should be some changes in the original pull request [1]. Mostly I think it would make sense (also a lot for theano) to rewrite it with the new iterators and expose the subspace more directly. That would give vast speedups for mixed fancy/non-fancy indices. But if this is useful to you, I guess one can also just create a new one if someone finds time, leaving the old MapIter deprecated and unmaintained. [1] https://github.com/numpy/numpy/pull/377 > where there is duplicate indices in the vector > > In looking at the code, I saw it use at least those part of the > interface. > > PyArrayMapIterObject > PyArray_MapIterNext > PyArray_ITER_NEXT > PyArray_MapIterSwapAxes > PyArray_BroadcastToShape > There is likely no reason for changing these, but improving MapIter would likely break binary compatibility because of struct access. - Sebastian > > I lost the end of this discussion, but I think this is not possible in > NumPy as there was not an agreement to include that. But I remember a > few other user on this list asking for this(and they where Theano user > to my knowledge). > > > So I would prefer that you don't remove the part that we use for the > next 1.8 release. > > thanks > > Fr?d?ric > > > > On Tue, Apr 16, 2013 at 9:54 AM, Nathaniel Smith > wrote: > On Mon, Apr 15, 2013 at 5:29 PM, Sebastian Berg > wrote: > > Hey, > > > > the MapIter API has only been made public in master right? > So it is no > > problem at all to change at least the mapiter struct, right? > > > > I got annoyed at all those special cases that make things > difficult to > > get an idea where to put i.e. to fix the boolean array-like > stuff. 
> > So actually started rewriting it (and I already got one big function
> > that does all index preparation -- ok it is untested but it's basically
> > there).
> >
> > I would guess it is not really a big problem even if it was public for
> > longer, since you shouldn't do those direct struct access probably? But
> > just checking.
>
> Why don't we just make the struct opaque, i.e., just declare it in the
> public header file and move the actual definition to an internal
> header file?
>
> If it's too annoying I guess we could even make it non-public, at
> least in 1.8 -- IIRC it's only there so we can use it in umath, and
> IIRC the patch to use it hasn't landed yet. Or we could just merge
> umath and multiarray into a single .so, that would save a *lot* of
> annoying fiddling with the public API that doesn't actually serve any
> purpose.
>
> -n
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
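The duplicate-index subtlety that drives Theano's use of MapIter is easy
to demonstrate (a sketch; plain fancy-indexed += reads first and writes
once per unique index, so repeats are lost):

    >>> import numpy as np
    >>> a = np.zeros(3)
    >>> idx = np.array([0, 0, 1])
    >>> a[idx] += 1
    >>> a
    array([ 1.,  1.,  0.])   # not [ 2.,  1.,  0.]

MapIter lets extension code walk the selected elements one at a time and
accumulate correctly; NumPy 1.8 later added np.add.at (ufunc.at) for the
same unbuffered pattern.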
From sebastian at sipsolutions.net  Tue Apr 23 18:16:50 2013
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Wed, 24 Apr 2013 00:16:50 +0200
Subject: [Numpy-discussion] Vectorized percentile function in Numpy (PR #2970)
In-Reply-To: <5176C150.6020904@gmail.com>
References: <5176C150.6020904@gmail.com>
Message-ID: <1366755410.17435.27.camel@sebastian-laptop>

On Tue, 2013-04-23 at 12:13 -0500, Jonathan Helmus wrote:
> Back in December it was pointed out on the scipy-user list[1] that
> numpy has a percentile function which has similar functionality to
> scipy's stats.scoreatpercentile. I've been trying to harmonize these
> two functions into a single version which has the features of both.
> Scipy PR 374[2] introduced a version which took the parameters from
> both the scipy and numpy percentile functions and was accepted into Scipy
> with the plan that it would be deprecated when a similar function was
> introduced into Numpy. Then I moved to enhancing the Numpy version with
> Pull Request 2970 [3]. With some input from Sebastian Berg the
> percentile function was rewritten with further vectorization, but
> neither of us felt fully comfortable with the final product. Can
> someone look at the implementation in the PR and suggest what should be
> done from here?
>

Thanks! For me the main question is the vectorized usage when both
haystack (`a`) and needle (`q`) are vectorized. What I mean is for:

np.percentile(np.random.randn(n1, n2, N), [25., 50., 75.], axis=-1)

I would probably expect an output shape of (n1, n2, 3), but currently
you will get the needle dimensions first, because it is roughly the same
as

[np.percentile(np.random.randn(n1, n2, N), q, axis=-1) for q in [25., 50., 75.]]

so for the (probably rare) vectorization of both `a` and `q`, would it
be preferable to do some kind of long term behaviour change, or just put
the dimensions in `q` first, which should be compatible to the current
list?

Regards,

Sebastian

> Cheers,
>
>     - Jonathan Helmus
>
>
> [1] http://thread.gmane.org/gmane.comp.python.scientific.user/33331
> [2] https://github.com/scipy/scipy/pull/374
> [3] https://github.com/numpy/numpy/pull/2970

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
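A small illustration of the convention under discussion (a sketch; the
values follow np.percentile's linear interpolation, and the shape shows
the "needle dimensions first" behavior Sebastian describes, not the
hypothetical (n1, n2, 3) alternative):

    >>> import numpy as np
    >>> a = np.arange(10.)
    >>> np.percentile(a, 50)
    4.5
    >>> np.percentile(a, [25., 50., 75.])
    [2.25, 4.5, 6.75]

    >>> b = np.random.randn(4, 5, 100)
    >>> np.asarray(np.percentile(b, [25., 50., 75.], axis=-1)).shape
    (3, 4, 5)

np.asarray() is there because the released function returns a plain list
for a sequence of percentiles; the PR is what turns this into a properly
vectorized array result.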
From charlesr.harris at gmail.com  Tue Apr 23 18:29:00 2013
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 23 Apr 2013 16:29:00 -0600
Subject: [Numpy-discussion] MapIter api
In-Reply-To: <1366754780.17435.18.camel@sebastian-laptop>
References: <1366043354.8595.21.camel@sebastian-laptop>
 <1366754780.17435.18.camel@sebastian-laptop>
Message-ID:

On Tue, Apr 23, 2013 at 4:06 PM, Sebastian Berg wrote:

> [Sebastian's reply, quoted verbatim from above, snipped]

Does this have any overlap with https://github.com/numpy/numpy/pull/2821 ?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From josef.pktd at gmail.com  Tue Apr 23 23:33:29 2013
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 23 Apr 2013 23:33:29 -0400
Subject: [Numpy-discussion] Vectorized percentile function in Numpy (PR #2970)
In-Reply-To: <1366755410.17435.27.camel@sebastian-laptop>
References: <5176C150.6020904@gmail.com>
 <1366755410.17435.27.camel@sebastian-laptop>
Message-ID:

On Tue, Apr 23, 2013 at 6:16 PM, Sebastian Berg wrote:
> [earlier part of Sebastian's message, quoted above, snipped]
>
> so for the (probably rare) vectorization of both `a` and `q`, would it
> be preferable to do some kind of long term behaviour change, or just put
> the dimensions in `q` first, which should be compatible to the current
> list?

I don't have much of a preference either way, but I'm glad this is
going into numpy. We can work with it either way.

In stats, the most common case will be axis=0, and then the two are
the same, aren't they?

What I like about the second version is unrolling (with 2 or 3
quantiles), which I think will work

u, l = np.random.randn(2,5)
or
res = np.percentile(...)
func(*res)

The first case will be nicer when there are lots of percentiles, but I
guess I won't need it much except for axis=0.

Actually, I would prefer the second version, because it might be a bit
more cumbersome to get the individual percentiles out if the axis is
somewhere in the middle, however I don't think I have a case like
that.

The first version would be consistent with reduceat, and that would be
more numpythonic. I would go for that in numpy.
my 2.5c Josef > > Regards, > > Sebastian > >> Cheers, >> >> - Jonathan Helmus >> >> >> [1] http://thread.gmane.org/gmane.comp.python.scientific.user/33331 >> [2] https://github.com/scipy/scipy/pull/374 >> [3] https://github.com/numpy/numpy/pull/2970 >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From sebastian at sipsolutions.net Wed Apr 24 04:11:57 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 24 Apr 2013 10:11:57 +0200 Subject: [Numpy-discussion] Vectorized percentile function in Numpy (PR #2970) In-Reply-To: References: <5176C150.6020904@gmail.com> <1366755410.17435.27.camel@sebastian-laptop> Message-ID: <1366791117.17435.30.camel@sebastian-laptop> On Tue, 2013-04-23 at 23:33 -0400, josef.pktd at gmail.com wrote: > On Tue, Apr 23, 2013 at 6:16 PM, Sebastian Berg > wrote: > > On Tue, 2013-04-23 at 12:13 -0500, Jonathan Helmus wrote: > >> Back in December it was pointed out on the scipy-user list[1] that > >> numpy has a percentile function which has similar functionality to > >> scipy's stats.scoreatpercentile. I've been trying to harmonize these > >> two functions into a single version which has the features of both. > >> Scipy PR 374[2] introduced a version which look the parameters from > >> both the scipy and numpy percentile function and was accepted into Scipy > >> with the plan that it would be depreciated when a similar function was > >> introduced into Numpy. Then I moved to enhancing the Numpy version with > >> Pull Request 2970 [3]. With some input from Sebastian Berg the > >> percentile function was rewritten with further vectorization, but > >> neither of us felt fully comfortable with the final product. Can > >> someone look at implementation in the PR and suggest what should be done > >> from here? > >> > > > > Thanks! For me the main question is the vectorized usage when both > > haystack (`a`) and needle (`q`) are vectorized. What I mean is for: > > > > np.percentile(np.random.randn(n1, n2, N), [25., 50., 75.], axis=-1) > > > > I would probably expect an output shape of (n1, n2, 3), but currently > > you will get the needle dimensions first, because it is roughly the same > > as > > > > [np.percentile(np.random.randn(n1, n2, N), q, axis=-1) for q in [25., 50., 75.]] > > > > so for the (probably rare) vectorization of both `a` and `q`, would it > > be preferable to do some kind of long term behaviour change, or just put > > the dimensions in `q` first, which should be compatible to the current > > list? > > I don't have much of a preference either way, but I'm glad this is > going into numpy. > We can work with it either way. > > In stats, the most common case will be axis=0, and then the two are > the same, aren't they? > > What I like about the second version is unrolling (with 2 or 3 > quantiles), which I think will work > > u, l = np.random.randn(2,5) > or > res = np.percentile(...) > func(*res) > > The first case will be nicer when there are lots of percentiles, but I > guess I won't need it much except for axis=0. > > Actually, I would prefer the second version, because it might be a bit > more cumbersome to get the individual percentiles out if the axis is > somewhere in the middle, however I don't think I have a case like > that. 
> I never thought about the axis being where to insert the dimensions of the quantiles. That would be a third option. It feels simpler to me to just always use the end (or the start) though. - Sebastian > The first version would be consistent with reduceat, and that would be > more numpythonic. I would go for that in numpy. > > my 2.5c > > Josef > > > > > Regards, > > > > Sebastian > > > >> Cheers, > >> > >> - Jonathan Helmus > >> > >> > >> [1] http://thread.gmane.org/gmane.comp.python.scientific.user/33331 > >> [2] https://github.com/scipy/scipy/pull/374 > >> [3] https://github.com/numpy/numpy/pull/2970 > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Wed Apr 24 12:03:17 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 24 Apr 2013 12:03:17 -0400 Subject: [Numpy-discussion] Vectorized percentile function in Numpy (PR #2970) In-Reply-To: <1366791117.17435.30.camel@sebastian-laptop> References: <5176C150.6020904@gmail.com> <1366755410.17435.27.camel@sebastian-laptop> <1366791117.17435.30.camel@sebastian-laptop> Message-ID: On Wed, Apr 24, 2013 at 4:11 AM, Sebastian Berg wrote: > On Tue, 2013-04-23 at 23:33 -0400, josef.pktd at gmail.com wrote: >> On Tue, Apr 23, 2013 at 6:16 PM, Sebastian Berg >> wrote: >> > On Tue, 2013-04-23 at 12:13 -0500, Jonathan Helmus wrote: >> >> Back in December it was pointed out on the scipy-user list[1] that >> >> numpy has a percentile function which has similar functionality to >> >> scipy's stats.scoreatpercentile. I've been trying to harmonize these >> >> two functions into a single version which has the features of both. >> >> Scipy PR 374[2] introduced a version which look the parameters from >> >> both the scipy and numpy percentile function and was accepted into Scipy >> >> with the plan that it would be depreciated when a similar function was >> >> introduced into Numpy. Then I moved to enhancing the Numpy version with >> >> Pull Request 2970 [3]. With some input from Sebastian Berg the >> >> percentile function was rewritten with further vectorization, but >> >> neither of us felt fully comfortable with the final product. Can >> >> someone look at implementation in the PR and suggest what should be done >> >> from here? >> >> >> > >> > Thanks! For me the main question is the vectorized usage when both >> > haystack (`a`) and needle (`q`) are vectorized. What I mean is for: >> > >> > np.percentile(np.random.randn(n1, n2, N), [25., 50., 75.], axis=-1) >> > >> > I would probably expect an output shape of (n1, n2, 3), but currently >> > you will get the needle dimensions first, because it is roughly the same >> > as >> > >> > [np.percentile(np.random.randn(n1, n2, N), q, axis=-1) for q in [25., 50., 75.]] >> > >> > so for the (probably rare) vectorization of both `a` and `q`, would it >> > be preferable to do some kind of long term behaviour change, or just put >> > the dimensions in `q` first, which should be compatible to the current >> > list? 
>> >> I don't have much of a preference either way, but I'm glad this is >> going into numpy. >> We can work with it either way. >> >> In stats, the most common case will be axis=0, and then the two are >> the same, aren't they? >> >> What I like about the second version is unrolling (with 2 or 3 >> quantiles), which I think will work >> >> u, l = np.random.randn(2,5) >> or >> res = np.percentile(...) >> func(*res) >> >> The first case will be nicer when there are lots of percentiles, but I >> guess I won't need it much except for axis=0. >> >> Actually, I would prefer the second version, because it might be a bit >> more cumbersome to get the individual percentiles out if the axis is >> somewhere in the middle, however I don't think I have a case like >> that. >> > > I never thought about the axis being where to insert the dimensions of > the quantiles. That would be a third option. It feels simpler to me to > just always use the end (or the start) though. If the choices are start or end, then I prefer start for unpacking. Josef > > - Sebastian > >> The first version would be consistent with reduceat, and that would be >> more numpythonic. I would go for that in numpy. >> >> my 2.5c >> >> Josef >> >> > >> > Regards, >> > >> > Sebastian >> > >> >> Cheers, >> >> >> >> - Jonathan Helmus >> >> >> >> >> >> [1] http://thread.gmane.org/gmane.comp.python.scientific.user/33331 >> >> [2] https://github.com/scipy/scipy/pull/374 >> >> [3] https://github.com/numpy/numpy/pull/2970 >> >> _______________________________________________ >> >> NumPy-Discussion mailing list >> >> NumPy-Discussion at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> > >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From sebastian at sipsolutions.net Wed Apr 24 13:43:40 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 24 Apr 2013 19:43:40 +0200 Subject: [Numpy-discussion] Vectorized percentile function in Numpy (PR #2970) In-Reply-To: References: <5176C150.6020904@gmail.com> <1366755410.17435.27.camel@sebastian-laptop> <1366791117.17435.30.camel@sebastian-laptop> Message-ID: <1366825420.17435.43.camel@sebastian-laptop> On Wed, 2013-04-24 at 12:03 -0400, josef.pktd at gmail.com wrote: > On Wed, Apr 24, 2013 at 4:11 AM, Sebastian Berg > wrote: > > On Tue, 2013-04-23 at 23:33 -0400, josef.pktd at gmail.com wrote: > >> On Tue, Apr 23, 2013 at 6:16 PM, Sebastian Berg > >> wrote: > >> > On Tue, 2013-04-23 at 12:13 -0500, Jonathan Helmus wrote: > >> >> Back in December it was pointed out on the scipy-user list[1] that > >> >> numpy has a percentile function which has similar functionality to > >> >> scipy's stats.scoreatpercentile. I've been trying to harmonize these > >> >> two functions into a single version which has the features of both. 
> >> >> Scipy PR 374[2] introduced a version which look the parameters from > >> >> both the scipy and numpy percentile function and was accepted into Scipy > >> >> with the plan that it would be depreciated when a similar function was > >> >> introduced into Numpy. Then I moved to enhancing the Numpy version with > >> >> Pull Request 2970 [3]. With some input from Sebastian Berg the > >> >> percentile function was rewritten with further vectorization, but > >> >> neither of us felt fully comfortable with the final product. Can > >> >> someone look at implementation in the PR and suggest what should be done > >> >> from here? > >> >> > >> > > >> > Thanks! For me the main question is the vectorized usage when both > >> > haystack (`a`) and needle (`q`) are vectorized. What I mean is for: > >> > > >> > np.percentile(np.random.randn(n1, n2, N), [25., 50., 75.], axis=-1) > >> > > >> > I would probably expect an output shape of (n1, n2, 3), but currently > >> > you will get the needle dimensions first, because it is roughly the same > >> > as > >> > > >> > [np.percentile(np.random.randn(n1, n2, N), q, axis=-1) for q in [25., 50., 75.]] > >> > > >> > so for the (probably rare) vectorization of both `a` and `q`, would it > >> > be preferable to do some kind of long term behaviour change, or just put > >> > the dimensions in `q` first, which should be compatible to the current > >> > list? > >> > >> I don't have much of a preference either way, but I'm glad this is > >> going into numpy. > >> We can work with it either way. > >> > >> In stats, the most common case will be axis=0, and then the two are > >> the same, aren't they? > >> > >> What I like about the second version is unrolling (with 2 or 3 > >> quantiles), which I think will work > >> > >> u, l = np.random.randn(2,5) > >> or > >> res = np.percentile(...) > >> func(*res) > >> > >> The first case will be nicer when there are lots of percentiles, but I > >> guess I won't need it much except for axis=0. > >> > >> Actually, I would prefer the second version, because it might be a bit > >> more cumbersome to get the individual percentiles out if the axis is > >> somewhere in the middle, however I don't think I have a case like > >> that. > >> > > > > I never thought about the axis being where to insert the dimensions of > > the quantiles. That would be a third option. It feels simpler to me to > > just always use the end (or the start) though. > > If the choices are start or end, then I prefer start for unpacking. > I missed the reduceat argument, it kind of makes sense to me (and usually we will have either axis=0 or axis=-1 I guess). I was going to check what searchsorted does, but it doesn't vectorize :). Sebastian > Josef > > > > > - Sebastian > > > >> The first version would be consistent with reduceat, and that would be > >> more numpythonic. I would go for that in numpy. 
> >> > >> my 2.5c > >> > >> Josef > >> > >> > > >> > Regards, > >> > > >> > Sebastian > >> > > >> >> Cheers, > >> >> > >> >> - Jonathan Helmus > >> >> > >> >> > >> >> [1] http://thread.gmane.org/gmane.comp.python.scientific.user/33331 > >> >> [2] https://github.com/scipy/scipy/pull/374 > >> >> [3] https://github.com/numpy/numpy/pull/2970 > >> >> _______________________________________________ > >> >> NumPy-Discussion mailing list > >> >> NumPy-Discussion at scipy.org > >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> >> > >> > > >> > > >> > _______________________________________________ > >> > NumPy-Discussion mailing list > >> > NumPy-Discussion at scipy.org > >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From ralf.gommers at gmail.com Wed Apr 24 14:43:42 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 24 Apr 2013 20:43:42 +0200 Subject: [Numpy-discussion] GSoC application time! Message-ID: Hi, This is for all students planning to put in a GSoC'13 application for Numpy/Scipy. As you probably noticed, you are now able to submit your application in Melange. We're participating under the PSF org. Terri Oda, who is the PSF organizer, sent out the below announcement with tips and requirements as well as a helpful application template - please read those. When you have a draft proposal ready, I suggest you a) submit it already, since you can edit it later b) discuss it on the numpy-discussion or scipy-dev list If you haven't submitted a pull request yet, it's a good idea to do so asap - getting your first patch merged may be nontrivial and require some rework, so don't wait till the last day! Of all the requirements, I want to stress that discussing your proposal and interacting with the community is especially important. Not only will it help improve your proposal, it is also something we will pay attention to when ranking the proposals. Reason: it's a good predictor of the success of your project, both in terms of code/features contributed and of whether you're interested and likely to stay involved after GSoC ends. The latter is at least as important to us as the former. Cheers, Ralf P.S. I will be offline from April 25th to 29th. ---------- Forwarded message ---------- From: Terri Oda Date: Mon, Apr 22, 2013 at 12:22 AM Subject: [Soc2013-general] Student Application Template (Applications start April 22!) To: soc2013-general at python.org As hopefully all of you are aware, student applications to GSoC will be opening April 22 19:00 UTC (tomorrow to me) and closing May 3rd. I highly recommend that you all submit applications early -- you can modify them up until the final deadline. Google will not extend the deadline for any reason, including technical problems with the melange system (which have been known to happen at the last minute in the past), so the sooner you can get an application in the better! 
We have a template to help you prepare your application with the PSF:
http://wiki.python.org/moin/SummerOfCode/ApplicationTemplate2013

Your sub-organizations may have additional requirements; ask them if
there's any extra information they need from you. Please note a few
things we ask for that are not always required by other orgs:

* We do require students to blog about their projects, so you will need
to set up a GSoC blog for weekly status updates and any other thoughts
you wish to record about your project.
* We do require students to submit a link to some sort of code sample,
preferably a patch to the sub-org to which you are applying. Talk to
your mentors if you're uncertain what would be appropriate.
* Don't forget to put the name of your sub-organization (e.g. OpenHatch,
MNE-Python) into the title of your application.

If you're not sure about how to write a good proposal, ask your
prospective mentors: they're the ones who will be deciding if they hire
you or not, so they get the final word as to what a good proposal looks
like for them.

Terri
_______________________________________________
Soc2013-general mailing list
Soc2013-general at python.org
http://mail.python.org/mailman/listinfo/soc2013-general

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From andrew.giessel at gmail.com  Wed Apr 24 17:37:09 2013
From: andrew.giessel at gmail.com (andrew giessel)
Date: Wed, 24 Apr 2013 17:37:09 -0400
Subject: [Numpy-discussion] Proposal of new function: iteraxis()
Message-ID:

Hello all-

A while back I emailed the list about a function for the numpy
namespace, iteraxis(), which allows you to generalize the default
iteration behavior of numpy arrays over any axis.

I've implemented this function more cleanly and the pull request is
here: https://github.com/numpy/numpy/pull/3262, and includes passing
tests and documentation.

This is very simple code, which uses np.rollaxis() to bring the desired
dimension to the front, and then allows you to loop over slices in this
re-structured view of the array. While little more than an alias, I feel
this is a very useful function because looping over iterators is a core
pattern in python, and makes working with slices of any multidimensional
array very pythonic. Adding this function makes this more visible for
users, new and old, and I hope members of this list will agree it is
worth adding to the namespace.

Generalizing this to iterate over multiple axes is something that might
be worthwhile, but the specifics of how to implement the axis ordering
would take some thought. I'm happy to discuss and tackle this if people
are really interested.

Hoping for some nice feedback,

ag

--
Andrew Giessel, PhD
Department of Neurobiology, Harvard Medical School
220 Longwood Ave Boston, MA 02115
ph: 617.432.7971
email: andrew_giessel at hms.harvard.edu
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
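The idiom being wrapped, for readers following along (this is the
np.rollaxis pattern the proposal builds on, not the code from the PR
itself):

    >>> import numpy as np
    >>> a = np.arange(24).reshape(2, 3, 4)
    >>> np.rollaxis(a, 2).shape   # axis 2 moved to the front
    (4, 2, 3)
    >>> for plane in np.rollaxis(a, 2):
    ...     print plane.shape
    (2, 3)
    (2, 3)
    (2, 3)
    (2, 3)

Default iteration is always over axis 0, so rolling the desired axis to
the front generalizes it; rollaxis returns a view, so no data is copied.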
From charlesr.harris at gmail.com  Thu Apr 25 11:16:00 2013
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 25 Apr 2013 09:16:00 -0600
Subject: [Numpy-discussion] 1.8 release
Message-ID:

Hi All,

I think it is time to start the run-up to the 1.8 release. I don't know
of any outstanding blockers, but if anyone has a PR/issue that they feel
needs to be in the next Numpy release, now is the time to make it known.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dave.hirschfeld at gmail.com  Thu Apr 25 11:19:38 2013
From: dave.hirschfeld at gmail.com (Dave Hirschfeld)
Date: Thu, 25 Apr 2013 15:19:38 +0000 (UTC)
Subject: [Numpy-discussion] 1.8 release
References: Message-ID:

Charles R Harris <charlesr.harris at gmail.com> writes:

> Hi All, I think it is time to start the run-up to the 1.8 release. I
> don't know of any outstanding blockers, but if anyone has a PR/issue
> that they feel needs to be in the next Numpy release, now is the time
> to make it known.
> Chuck

It would be good to get the utc-everywhere fix for datetime64 in there
if someone has time to look into it.

Thanks,
Dave

From sebastian at sipsolutions.net  Thu Apr 25 11:33:17 2013
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Thu, 25 Apr 2013 17:33:17 +0200
Subject: [Numpy-discussion] 1.8 release
In-Reply-To: References: Message-ID: <1366903997.17435.55.camel@sebastian-laptop>

On Thu, 2013-04-25 at 09:16 -0600, Charles R Harris wrote:
> Hi All,
>
> I think it is time to start the run-up to the 1.8 release. I don't know
> of any outstanding blockers, but if anyone has a PR/issue that they
> feel needs to be in the next Numpy release, now is the time to make it
> known.

Sounds good. I would like to get the deprecation stuff done, because if
we prefer to do it deeper down (I do), it changes the warnings a little
and what happens when they are raised as errors.

- Sebastian

> Chuck
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From robert.kern at gmail.com  Thu Apr 25 13:14:27 2013
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 25 Apr 2013 18:14:27 +0100
Subject: [Numpy-discussion] Proposal of new function: iteraxis()
In-Reply-To: References: Message-ID:

On Wed, Apr 24, 2013 at 10:37 PM, andrew giessel wrote:
> Hello all-
>
> [proposal snipped]
>
> Adding this function makes this more visible for users, new and old,
> and I hope members of this list will agree it is worth adding to the
> namespace.

I'm afraid I don't. It's just a reduced-functionality version of
rollaxis(). I don't think the additional name adds anything
substantial.

--
Robert Kern

From jslavin at cfa.harvard.edu  Thu Apr 25 13:34:41 2013
From: jslavin at cfa.harvard.edu (Jonathan Slavin)
Date: Thu, 25 Apr 2013 13:34:41 -0400
Subject: [Numpy-discussion] f2py and object libraries
Message-ID: <1366911281.16984.35.camel@shevek>

Hi all,

I have recently started using f2py to access some legacy Fortran code,
and it has mostly worked better than I expected. It handles common
blocks, block data, etc. with no problems. I did need to declare the
types of all the arguments in subroutine and function calls, but not in
the body of the code (though of course that would be good programming
practice).

The one thing that I haven't been able to get to work is linking to code
in an object library. My normal approach to compiling my code that uses
this particular code collection (RS) goes like:

f77 -c (...other options) main_prog.f
f77 -o main_prog main_prog.o -L/lib_dir -lRS

where /lib_dir is the directory where the library resides and the
library name is libRS.a. The library was created via:

ar cru libRS.a file1.o file2.o (...list of object files, compiled
fortran routines)
ranlib libRS.a

f2py does accept the -L and -l arguments but doesn't seem to be able to
find the code in the libraries. Any suggestions?

Jon
--
______________________________________________________________
Jonathan D. Slavin                 Harvard-Smithsonian CfA
jslavin at cfa.harvard.edu       60 Garden Street, MS 83
phone: (617) 496-7981              Cambridge, MA 02138-1516
cell: (781) 363-0035               USA
______________________________________________________________
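A sketch of invocations that usually resolve this (the module and path
names here are illustrative, not taken from Jon's setup):

    $ f2py -c main_prog.f -m main_prog -L/lib_dir -lRS

or, bypassing the -l search by naming the archive directly as an input
file:

    $ f2py -c main_prog.f /lib_dir/libRS.a -m main_prog

One common failure mode worth ruling out: f2py builds a *shared*
extension module, so if the objects archived into libRS.a were compiled
without -fPIC, the link step can fail in ways that look like the symbols
simply weren't found.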
The one thing that I haven't been able to get to work is linking to
code in an object library. My normal approach to compiling my code that
uses this particular code collection (RS) goes like:

f77 -c (...other options) main_prog.f
f77 -o main_prog main_prog.o -L/lib_dir -lRS

where /lib_dir is the directory where the library resides and the
library name is libRS.a. The library was created via:

ar cru libRS.a file1.o file2.o (...list of object files, compiled
fortran routines)
ranlib libRS.a

f2py does accept the -L and -l arguments but doesn't seem to be able to
find the code in the libraries. Any suggestions?

Jon
-- 
______________________________________________________________
Jonathan D. Slavin Harvard-Smithsonian CfA
jslavin at cfa.harvard.edu 60 Garden Street, MS 83
phone: (617) 496-7981 Cambridge, MA 02138-1516
cell: (781) 363-0035 USA
______________________________________________________________

From matthew.brett at gmail.com  Thu Apr 25 13:30:38 2013
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 25 Apr 2013 10:30:38 -0700
Subject: [Numpy-discussion] Proposal of new function: iteraxis()
In-Reply-To: References: Message-ID:

Hi,

On Thu, Apr 25, 2013 at 10:14 AM, Robert Kern wrote:
> On Wed, Apr 24, 2013 at 10:37 PM, andrew giessel wrote:
>> Hello all-
>>
>> A while back I emailed the list about a function for the numpy
>> namespace, iteraxis(), which allows you to generalize the default
>> iteration behavior of numpy arrays over any axis.
>>
>> I've implemented this function more cleanly and the pull request is
>> here: https://github.com/numpy/numpy/pull/3262, and includes passing
>> tests and documentation.
>>
>> This is very simple code, which uses np.rollaxis() to bring the desired
>> dimension to the front, and then allows you to loop over slices in this
>> re-structured view of the array. While little more than an alias, I
>> feel this is a very useful function because looping over iterators is a
>> core pattern in python, and makes working with slices of any
>> multidimensional array very pythonic. Adding this function makes this
>> more visible for users, new and old, and I hope members of this list
>> will agree it is worth adding to the namespace.
>
> I'm afraid I don't. It's just a reduced-functionality version of
> rollaxis(). I don't think the additional name adds anything
> substantial.

There's a little more on this in the pull request discussion for those
of y'all that are interested.

So the decision has to be based on some estimate of:

1) Cost for adding a new function to the namespace
2) Benefit: some combination of: likelihood of needing to iterate over
an arbitrary axis; likelihood of not finding rollaxis / transpose as a
solution to this; increased likelihood of finding iteraxis in this
situation.

As a data point - Gael pointed me to rollaxis back in the day, I didn't
find it myself, and although it was completely obvious in retrospect,
it had not previously occurred to me to use transposing for this task.
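(For concreteness, the idiom I have in mind is a sketch like this, where
`a` is just a made-up 3-d array:

import numpy as np

a = np.zeros((3, 4, 5))

# iterate over axis 1 with rollaxis; each plane has shape (3, 5)
for plane in np.rollaxis(a, 1):
    print plane.shape

# the same iteration, spelled as a transpose of the axes
for plane in a.transpose(1, 0, 2):
    print plane.shape

Neither spelling is something a newcomer is likely to guess.)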
Cheers, Matthew From robert.kern at gmail.com Thu Apr 25 13:42:11 2013 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 25 Apr 2013 18:42:11 +0100 Subject: [Numpy-discussion] Proposal of new function: iteraxis() In-Reply-To: References: Message-ID: On Thu, Apr 25, 2013 at 6:30 PM, Matthew Brett wrote: > Hi, > > On Thu, Apr 25, 2013 at 10:14 AM, Robert Kern wrote: >> On Wed, Apr 24, 2013 at 10:37 PM, andrew giessel >> wrote: >>> Hello all- >>> >>> A while back I emailed the list about function for the numpy namespace, >>> iteraxis(), which allows you to generalize the default iteration behavior of >>> numpy arrays over any axis. >>> >>> I've implemented this function more cleanly and the pull request is here: >>> https://github.com/numpy/numpy/pull/3262, and includes passing tests and >>> documentation. >>> >>> This is very simple code, which uses np.rollaxis() to bring the desired >>> dimension to the front, and then allows you to loop over slices in this >>> re-structured view of the array. While little more than an alias, I feel >>> this is a very useful function because looping over iterators is a core >>> pattern in python, and makes working with slices of any multidimensional >>> array very pythonic. Adding this function makes this more visible for >>> users, new and old, and I hope members of this list will agree it is worth >>> adding to the namespace. >> >> I'm afraid I don't. It's a just a reduced-functionality version of >> rollaxis(). I don't think the additional name adds anything >> substantial. > > There's a little more on this in the pull request discussion for those > of y'all that are interested. > > So the decision has to be based on some estimate of: > > 1) Cost for adding a new function to the namespace > 2) Benefit : some combination of: Likelihood of needing to iterate > over arbitrary axis. Likelihood of not finding rollaxis / transpose as > a solution to this. Increased likelihood of finding iteraxis in this > situation. 3) Comparison with other solutions that might obtain the same benefits without the attendant costs: i.e. additional documentation in any number of forms. -- Robert Kern From matthew.brett at gmail.com Thu Apr 25 13:54:12 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 25 Apr 2013 10:54:12 -0700 Subject: [Numpy-discussion] Proposal of new function: iteraxis() In-Reply-To: References: Message-ID: Hi, On Thu, Apr 25, 2013 at 10:42 AM, Robert Kern wrote: > On Thu, Apr 25, 2013 at 6:30 PM, Matthew Brett wrote: >> Hi, >> >> On Thu, Apr 25, 2013 at 10:14 AM, Robert Kern wrote: >>> On Wed, Apr 24, 2013 at 10:37 PM, andrew giessel >>> wrote: >>>> Hello all- >>>> >>>> A while back I emailed the list about function for the numpy namespace, >>>> iteraxis(), which allows you to generalize the default iteration behavior of >>>> numpy arrays over any axis. >>>> >>>> I've implemented this function more cleanly and the pull request is here: >>>> https://github.com/numpy/numpy/pull/3262, and includes passing tests and >>>> documentation. >>>> >>>> This is very simple code, which uses np.rollaxis() to bring the desired >>>> dimension to the front, and then allows you to loop over slices in this >>>> re-structured view of the array. While little more than an alias, I feel >>>> this is a very useful function because looping over iterators is a core >>>> pattern in python, and makes working with slices of any multidimensional >>>> array very pythonic. 
Adding this function makes this more visible for >>>> users, new and old, and I hope members of this list will agree it is worth >>>> adding to the namespace. >>> >>> I'm afraid I don't. It's a just a reduced-functionality version of >>> rollaxis(). I don't think the additional name adds anything >>> substantial. >> >> There's a little more on this in the pull request discussion for those >> of y'all that are interested. >> >> So the decision has to be based on some estimate of: >> >> 1) Cost for adding a new function to the namespace >> 2) Benefit : some combination of: Likelihood of needing to iterate >> over arbitrary axis. Likelihood of not finding rollaxis / transpose as >> a solution to this. Increased likelihood of finding iteraxis in this >> situation. > > 3) Comparison with other solutions that might obtain the same benefits > without the attendant costs: i.e. additional documentation in any > number of forms. Right, good point. That would also need to be weighted with the likelihood that people will find and read that documentation. Cheers, Matthew From jay.bourque at continuum.io Thu Apr 25 14:54:10 2013 From: jay.bourque at continuum.io (Jay Bourque) Date: Thu, 25 Apr 2013 13:54:10 -0500 Subject: [Numpy-discussion] 1.8 release In-Reply-To: References: Message-ID: I would love to get the following pull requests of mine merged in: https://github.com/numpy/numpy/pull/2822 https://github.com/numpy/numpy/pull/462 https://github.com/numpy/numpy/pull/359 https://github.com/numpy/numpy/pull/2821 The last one probably requires a bit more work, but I'm still waiting on feedback on my solution (see my last comment in the PR discussion thread). Thanks, -Jay On Thu, Apr 25, 2013 at 10:16 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > Hi All, > > I think it is time to start the runup to the 1.8 release. I don't know of > any outstanding blockers but if anyone has a PR/issue that they feel needs > to be in the next Numpy release now is the time to make it known. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Apr 25 15:10:32 2013 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 25 Apr 2013 20:10:32 +0100 Subject: [Numpy-discussion] Proposal of new function: iteraxis() In-Reply-To: References: Message-ID: On Thu, Apr 25, 2013 at 6:54 PM, Matthew Brett wrote: > Hi, > > On Thu, Apr 25, 2013 at 10:42 AM, Robert Kern wrote: >> On Thu, Apr 25, 2013 at 6:30 PM, Matthew Brett wrote: >>> So the decision has to be based on some estimate of: >>> >>> 1) Cost for adding a new function to the namespace >>> 2) Benefit : some combination of: Likelihood of needing to iterate >>> over arbitrary axis. Likelihood of not finding rollaxis / transpose as >>> a solution to this. Increased likelihood of finding iteraxis in this >>> situation. >> >> 3) Comparison with other solutions that might obtain the same benefits >> without the attendant costs: i.e. additional documentation in any >> number of forms. > > Right, good point. That would also need to be weighted with the > likelihood that people will find and read that documentation. In my opinion, duplicating functionality under different aliases just so people can supposedly find things without reading the documentation is not a viable strategy for building out an API. 
My suggestion is to start building out a "How do I ...?" section to the User's Guide that answers small questions like this. "How do I iterate over an arbitrary axis of an array?" should be sufficiently discoverable. This is precisely the kind of problem that documentation solves better than anything else. This is what we write documentation for. Let's make use of it before trying something else. If we add such a section, and still see many people not finding it, then we can consider adding aliases. -- Robert Kern From andrew_giessel at hms.harvard.edu Thu Apr 25 15:21:19 2013 From: andrew_giessel at hms.harvard.edu (Andrew Giessel) Date: Thu, 25 Apr 2013 15:21:19 -0400 Subject: [Numpy-discussion] Proposal of new function: iteraxis() In-Reply-To: References: Message-ID: I respect this opinion. However (and maybe this is legacy), while reading through the numeric.py source file, I was surprised at how short many of the functions are, generally. Functions like ones() and zeros() are pretty simple wrappers which call empty() and then copy over values. FWIW, I had used numpy for over two years before realizing that the default behavior of iterating on a numpy array was to return slices over the first axis (although, this makes sense because it makes a 1d array like a list), and I think it is generally left out of any tutorials or guides. If nothing else I learned how to build the numpy source and how to make tests. And how to iterate over axes with np.rollaxis() ;) Any other opinions from people that haven't commented on the PR thread already? ag On Thu, Apr 25, 2013 at 3:10 PM, Robert Kern wrote: > On Thu, Apr 25, 2013 at 6:54 PM, Matthew Brett > wrote: > > Hi, > > > > On Thu, Apr 25, 2013 at 10:42 AM, Robert Kern > wrote: > >> On Thu, Apr 25, 2013 at 6:30 PM, Matthew Brett > wrote: > > >>> So the decision has to be based on some estimate of: > >>> > >>> 1) Cost for adding a new function to the namespace > >>> 2) Benefit : some combination of: Likelihood of needing to iterate > >>> over arbitrary axis. Likelihood of not finding rollaxis / transpose as > >>> a solution to this. Increased likelihood of finding iteraxis in this > >>> situation. > >> > >> 3) Comparison with other solutions that might obtain the same benefits > >> without the attendant costs: i.e. additional documentation in any > >> number of forms. > > > > Right, good point. That would also need to be weighted with the > > likelihood that people will find and read that documentation. > > In my opinion, duplicating functionality under different aliases just > so people can supposedly find things without reading the documentation > is not a viable strategy for building out an API. > > My suggestion is to start building out a "How do I ...?" section to > the User's Guide that answers small questions like this. "How do I > iterate over an arbitrary axis of an array?" should be sufficiently > discoverable. This is precisely the kind of problem that documentation > solves better than anything else. This is what we write documentation > for. Let's make use of it before trying something else. If we add such > a section, and still see many people not finding it, then we can > consider adding aliases. 
>
> --
> Robert Kern
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

-- 
Andrew Giessel, PhD

Department of Neurobiology, Harvard Medical School
220 Longwood Ave Boston, MA 02115
ph: 617.432.7971 email: andrew_giessel at hms.harvard.edu
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From robert.kern at gmail.com  Thu Apr 25 15:40:59 2013
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 25 Apr 2013 20:40:59 +0100
Subject: [Numpy-discussion] Proposal of new function: iteraxis()
In-Reply-To: References: Message-ID:

On Thu, Apr 25, 2013 at 8:21 PM, Andrew Giessel wrote:
> I respect this opinion. However (and maybe this is legacy), while
> reading through the numeric.py source file, I was surprised at how
> short many of the functions are, generally. Functions like ones() and
> zeros() are pretty simple wrappers which call empty() and then copy
> over values.

Many of these are short, but they do tend to do at least two things
that someone would otherwise have to do. This really isn't the case
for iteraxis() and rollaxis(). One can use rollaxis() pretty much
everywhere you would use iteraxis(), but not vice-versa.

> FWIW, I had used numpy for over two years before realizing that the
> default behavior of iterating on a numpy array was to return slices
> over the first axis (although, this makes sense because it makes a 1d
> array like a list), and I think it is generally left out of any
> tutorials or guides.

Then let's add it.

-- 
Robert Kern

From josef.pktd at gmail.com  Thu Apr 25 15:51:16 2013
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 25 Apr 2013 15:51:16 -0400
Subject: [Numpy-discussion] Proposal of new function: iteraxis()
In-Reply-To: References: Message-ID:

On Thu, Apr 25, 2013 at 3:40 PM, Robert Kern wrote:
> On Thu, Apr 25, 2013 at 8:21 PM, Andrew Giessel wrote:
>> I respect this opinion. However (and maybe this is legacy), while
>> reading through the numeric.py source file, I was surprised at how
>> short many of the functions are, generally. Functions like ones() and
>> zeros() are pretty simple wrappers which call empty() and then copy
>> over values.
>
> Many of these are short, but they do tend to do at least two things
> that someone would otherwise have to do. This really isn't the case
> for iteraxis() and rollaxis(). One can use rollaxis() pretty much
> everywhere you would use iteraxis(), but not vice-versa.
>
>> FWIW, I had used numpy for over two years before realizing that the
>> default behavior of iterating on a numpy array was to return slices
>> over the first axis (although, this makes sense because it makes a 1d
>> array like a list), and I think it is generally left out of any
>> tutorials or guides.

That definitely sounds like a documentation problem.

I often use the fact that it's a python iterator over the first
dimension, which can be used with *args and tuple unpacking.
(I didn't need it with anything else than axis=0 or axis=-1 for
matplotlib IIRC)

I never used rollaxis, but I have seen it a lot when I was still
reading the nipy source.

In general, I think that there are already too many aliases in numpy,
or functions where it's not really clear whether they are aliases or
something slightly different.

It took me more than a year to remember what `expand_dims` is called
(I always tried add_axis) until I bookmarked it for a while.

Josef

> > Then let's add it.
> > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Thu Apr 25 16:04:31 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 25 Apr 2013 14:04:31 -0600 Subject: [Numpy-discussion] Proposal of new function: iteraxis() In-Reply-To: References: Message-ID: On Thu, Apr 25, 2013 at 1:51 PM, wrote: > On Thu, Apr 25, 2013 at 3:40 PM, Robert Kern > wrote: > > On Thu, Apr 25, 2013 at 8:21 PM, Andrew Giessel > > wrote: > >> I respect this opinion. However (and maybe this is legacy), while > reading > >> through the numeric.py source file, I was surprised at how short many > of the > >> functions are, generally. Functions like ones() and zeros() are pretty > >> simple wrappers which call empty() and then copy over values. > > > > Many of these are short, but they do tend to do at least two things > > that someone would otherwise have to do. This really isn't the case > > for iteraxis() and rollaxis(). One can use rollaxis() pretty much > > everywhere you would use iteraxis(), but not vice-versa. > > > >> FWIW, I had used numpy for over two years before realizing that the > default > >> behavior of iterating on a numpy array was to return slices over the > first > >> axis (although, this makes sense because it makes a 1d array like a > list), > >> and I think it is generally left out of any tutorials or guides. > > That definitely sounds like a documentation problem. > I'm using often that it's a python iterator in the first dimension, > and can be used with *args and tuple unpacking. > (I didn't need it with anything else than axis=0 or axis=-1 for matplotlib > IIRC) > > I never used rollaxis, but I have seen it a lot when I was still > reading the nipy source. > > In general, I think that there are already too many aliases in numpy, > or function whether it's not really clear if they are aliases or > something slightly different. > > It took me more than a year to remember what `expand_dims` is called, > (I always tried, add_axis) until I bookmarked it for a while. > > After thinking about it, I'm in favor of this small function. Rollaxis takes a bit of thought and document reading to figure out how to use it, whereas this function covers a common use with an easy to understand API. I'm not completely satisfied with the name, it isn't as memorable as I'd like, but that is a small quibble. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu Apr 25 16:38:52 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 25 Apr 2013 22:38:52 +0200 Subject: [Numpy-discussion] Proposal of new function: iteraxis() In-Reply-To: References: Message-ID: <1366922332.17435.85.camel@sebastian-laptop> On Thu, 2013-04-25 at 14:04 -0600, Charles R Harris wrote: > > > On Thu, Apr 25, 2013 at 1:51 PM, wrote: > On Thu, Apr 25, 2013 at 3:40 PM, Robert Kern > wrote: > > On Thu, Apr 25, 2013 at 8:21 PM, Andrew Giessel > > wrote: > >> I respect this opinion. However (and maybe this is > legacy), while reading > >> through the numeric.py source file, I was surprised at how > short many of the > >> functions are, generally. Functions like ones() and > zeros() are pretty > >> simple wrappers which call empty() and then copy over > values. > > > > Many of these are short, but they do tend to do at least two > things > > that someone would otherwise have to do. 
This really isn't
> > the case for iteraxis() and rollaxis(). One can use rollaxis() pretty
> > much everywhere you would use iteraxis(), but not vice-versa.
>
> >> FWIW, I had used numpy for over two years before realizing that the
> >> default behavior of iterating on a numpy array was to return slices
> >> over the first axis (although, this makes sense because it makes a
> >> 1d array like a list), and I think it is generally left out of any
> >> tutorials or guides.
>
> That definitely sounds like a documentation problem.
> I often use the fact that it's a python iterator over the first
> dimension, which can be used with *args and tuple unpacking.
> (I didn't need it with anything else than axis=0 or axis=-1 for
> matplotlib IIRC)
>
> I never used rollaxis, but I have seen it a lot when I was still
> reading the nipy source.
>
> In general, I think that there are already too many aliases in numpy,
> or functions where it's not really clear whether they are aliases or
> something slightly different.
>
> It took me more than a year to remember what `expand_dims` is called
> (I always tried add_axis) until I bookmarked it for a while.
>
> After thinking about it, I'm in favor of this small function. Rollaxis
> takes a bit of thought and document reading to figure out how to use
> it, whereas this function covers a common use with an easy to
> understand API. I'm not completely satisfied with the name, it isn't
> as memorable as I'd like, but that is a small quibble.

What I am not quite happy with is that, if we want to keep this open to
supporting multiple axes at some point (and maybe also to a method of
the same name), defaulting to flat iteration may be better. So from a
future point of view, maybe it should have axes=None as default? I.e.
(oh, evil code!) the long term goal could be something like this
(obviously it would be preferable and much faster in C...):

def iteraxes(arr, axis=None, order='C'):
    view_shape = []
    view_strides = []
    op_axes = []
    if axis is None:
        op_axes = range(arr.ndim)
    else:
        if not isinstance(axis, tuple):
            axis = {axis}
        else:
            axis = set(axis)  # ignores duplicates...
        for ax in range(arr.ndim):
            if ax in axis:
                axis.remove(ax)
                op_axes.append(ax)
            else:
                view_shape.append(arr.shape[ax])
                view_strides.append(arr.strides[ax])
        if len(axis) != 0:
            raise ValueError
    i = np.nditer(arr, op_axes=[op_axes], order=order)
    for s in i:
        view = np.lib.stride_tricks.as_strided(s, view_shape,
                                               view_strides)
        yield view

> Chuck
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From ckkart at hoc.net  Thu Apr 25 18:44:18 2013
From: ckkart at hoc.net (Christian K.)
Date: Thu, 25 Apr 2013 19:44:18 -0300
Subject: [Numpy-discussion] Random number generation and testing across
 different OS's.
In-Reply-To: References: Message-ID:

Hi Andrew,

Am 12.04.13 11:50, schrieb Andrew Nelson:
> I have written a differential evolution optimiser that i use for
> curvefitting. As a genetic optimisation technique it is stochastic and
> relies heavily on random number generators to do the minimisation.

Out of curiosity, I would like to know for what type of problems/models
you need DE, or why you think it is superior to gradient-based
minimizers? Btw., are you aware of ecspy (https://code.google.com/p/ecspy/)?
I used it some years ago and found it very powerful.
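(On the testing-across-different-OS's part of the subject: I assume a
fixed seed already gives you an identical stream everywhere - a minimal
sketch, with a made-up seed:

import numpy as np

# a seeded Mersenne Twister stream is platform independent,
# so stochastic optimisers can be tested against fixed draws
rng = np.random.RandomState(12345)
print rng.random_sample(3)  # the same three numbers on any OS

so the tricky part is presumably the floating point arithmetic around
the generator, not the generator itself.)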
Regards, Christian

From aron at ahmadia.net  Thu Apr 25 20:16:11 2013
From: aron at ahmadia.net (Aron Ahmadia)
Date: Fri, 26 Apr 2013 01:16:11 +0100
Subject: [Numpy-discussion] f2py and object libraries
In-Reply-To: <1366911281.16984.35.camel@shevek>
References: <1366911281.16984.35.camel@shevek>
Message-ID:

Hi Jon,

I have personally never used f2py to link to library code, but
according to the documentation:

http://cens.ioc.ee/projects/f2py2e/usersguide/index.html#command-f2py

if you are building a module (that is, invoking f2py -c), then you can
either include the path to the .a file directly at the end of the list,
or you can use the -L and -l flags:

f2py -c <extra options> <fortran files> \
  [[ only: <fortran functions> : ] \
   [ skip: <fortran functions> : ]]... \
  [ <fortran objects> ] [ <.o, .a, .so files> ]

One trap I've run into in the past is that if the object code library
was not compiled with the correct compiler flags (-fPIC, or target
architecture), the linker will actually skip over the incompatible
library. Usually you will see some sort of warning or complaint.

Do you mind running with the --verbose option and posting the complete
log somewhere, including the commands you are using to build the
library and your invocation of f2py?

Cheers,
Aron

On Thu, Apr 25, 2013 at 6:34 PM, Jonathan Slavin wrote:
> Hi all,
>
> I have recently started using f2py to access some legacy fortran code
> and it's mostly worked better than I expected. It handles common
> blocks, block data, etc. with no problems. I did need to define the
> type of all the arguments in subroutine and function calls, but not in
> the body of the code (though of course that would be good programming
> practice). The one thing that I haven't been able to get to work is
> linking to code in an object library. My normal approach to compiling
> my code that uses this particular code collection (RS) goes like:
> f77 -c (...other options) main_prog.f
> f77 -o main_prog main_prog.o -L/lib_dir -lRS
> where /lib_dir is the directory where the library resides and the
> library name is libRS.a. The library was created via:
> ar cru libRS.a file1.o file2.o (...list of object files, compiled
> fortran routines)
> ranlib libRS.a
> f2py does accept the -L and -l arguments but doesn't seem to be able to
> find the code in the libraries. Any suggestions?
>
> Jon
> --
> ______________________________________________________________
> Jonathan D. Slavin Harvard-Smithsonian CfA
> jslavin at cfa.harvard.edu 60 Garden Street, MS 83
> phone: (617) 496-7981 Cambridge, MA 02138-1516
> cell: (781) 363-0035 USA
> ______________________________________________________________
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ondrej.certik at gmail.com  Thu Apr 25 22:11:24 2013
From: ondrej.certik at gmail.com (Ondřej Čertík)
Date: Thu, 25 Apr 2013 20:11:24 -0600
Subject: [Numpy-discussion] ANN: NumPy 1.7.1 release
In-Reply-To: References: Message-ID:

On Tue, Apr 23, 2013 at 12:10 PM, Frédéric Bastien wrote:
> Hi,
>
> A big thanks for that release.
>
> I also think it would have been useful to do a release candidate for
> this. This release changed the behavior related to python longs and
> broke a test in Theano. Nothing important, but we could have fixed
> this before the release.
>
> The numpy change is that a python long that doesn't fit in an int64,
> but fits in a uint64, used to throw an overflow exception. Now it
> returns a uint64.
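(Concretely, if I read the change right, that would be something like:

>>> import numpy as np
>>> np.array(2**63)  # too big for int64, fits in uint64
array(9223372036854775808, dtype=uint64)

where the previous behavior raised an OverflowError instead - a sketch
from the description above, not checked against both versions.)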
My apologies for this. There was a release candidate here:

http://mail.scipy.org/pipermail/numpy-discussion/2013-March/065948.html

and I don't see any offending patch between the 1.7.1rc1 and 1.7.1.

If the bugs are in numpy, would you please report them in the issues,
so that we can fix them?

Thanks,
Ondrej

From robert.kern at gmail.com  Fri Apr 26 05:42:35 2013
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 26 Apr 2013 10:42:35 +0100
Subject: [Numpy-discussion] Proposal of new function: iteraxis()
In-Reply-To: References: Message-ID:

On Thu, Apr 25, 2013 at 9:04 PM, Charles R Harris wrote:
> After thinking about it, I'm in favor of this small function. Rollaxis
> takes a bit of thought and document reading to figure out how to use
> it, whereas this function covers a common use with an easy to
> understand API.

It seems to me that just an additional example in the rollaxis()
docstring solves that problem:

``rollaxis()`` can be used to iterate over a given axis of a
multidimensional array:

    >>> for x in np.rollaxis(a, 2):
    ...     print x.shape
    ...
    (3, 4, 6)
    (3, 4, 6)
    (3, 4, 6)
    (3, 4, 6)
    (3, 4, 6)

-- 
Robert Kern

From andrew_giessel at hms.harvard.edu  Fri Apr 26 07:26:01 2013
From: andrew_giessel at hms.harvard.edu (Andrew Giessel)
Date: Fri, 26 Apr 2013 07:26:01 -0400
Subject: [Numpy-discussion] Proposal of new function: iteraxis()
In-Reply-To: References: Message-ID:

I agree with Charles that rollaxis() isn't immediately intuitive.

It seems to me that documentation like this doesn't belong in
rollaxis() but instead wherever people talk about indexing and/or
iterating over an array. Nothing about the iteration depends on
rollaxis(); rollaxis is just giving you a different view of the array
to call __getitem__() on, if I understand correctly.
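(A quick check of that "view" point, with a made-up array:

>>> import numpy as np
>>> a = np.zeros((3, 4, 5))
>>> v = np.rollaxis(a, 2)
>>> v.shape
(5, 3, 4)
>>> v.base is a  # no data is copied, it is the same memory
True

so the iteration itself is just plain __getitem__ on a view.)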
I'm counting 2 for (me, Charles), 2 against (Robert, Josef), and two or
three neutral parties (based on interest/comments: Matthew, Sebastian,
and Phil Elson (who commented on the PR)). I don't know how best to
proceed from here.

Best,
Andrew

On Fri, Apr 26, 2013 at 5:42 AM, Robert Kern wrote:
> On Thu, Apr 25, 2013 at 9:04 PM, Charles R Harris wrote:
>
>> After thinking about it, I'm in favor of this small function. Rollaxis
>> takes a bit of thought and document reading to figure out how to use
>> it, whereas this function covers a common use with an easy to
>> understand API.
>
> It seems to me that just an additional example in the rollaxis()
> docstring solves that problem:
>
> ``rollaxis()`` can be used to iterate over a given axis of a
> multidimensional array:
>
>     >>> for x in np.rollaxis(a, 2):
>     ...     print x.shape
>     ...
>     (3, 4, 6)
>     (3, 4, 6)
>     (3, 4, 6)
>     (3, 4, 6)
>     (3, 4, 6)
>
> --
> Robert Kern
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

-- 
Andrew Giessel, PhD

Department of Neurobiology, Harvard Medical School
220 Longwood Ave Boston, MA 02115
ph: 617.432.7971 email: andrew_giessel at hms.harvard.edu
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From robert.kern at gmail.com  Fri Apr 26 07:33:49 2013
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 26 Apr 2013 12:33:49 +0100
Subject: [Numpy-discussion] Proposal of new function: iteraxis()
In-Reply-To: References: Message-ID:

On Fri, Apr 26, 2013 at 12:26 PM, Andrew Giessel wrote:
> I agree with Charles that rollaxis() isn't immediately intuitive.
>
> It seems to me that documentation like this doesn't belong in
> rollaxis() but instead wherever people talk about indexing and/or
> iterating over an array. Nothing about the iteration depends on
> rollaxis(); rollaxis is just giving you a different view of the array
> to call __getitem__() on, if I understand correctly.

Docstrings are perfect places to briefly describe and demonstrate
common use cases for a function. There is no problem with including
the example that I wrote in the rollaxis() docstring.

In any case, whether you put the documentation in the rollaxis()
docstring or in one of the indexing/iteration sections, or
(preferably) both, I strongly encourage you to do that first and see
how it goes before adding a new alias.

-- 
Robert Kern

From jason-sage at creativetrax.com  Fri Apr 26 09:37:29 2013
From: jason-sage at creativetrax.com (Jason Grout)
Date: Fri, 26 Apr 2013 08:37:29 -0500
Subject: [Numpy-discussion] Proposal of new function: iteraxis()
In-Reply-To: References: Message-ID: <517A8319.8010306@creativetrax.com>

On 4/26/13 6:33 AM, Robert Kern wrote:
> In any case, whether you put the documentation in the rollaxis()
> docstring or in one of the indexing/iteration sections, or
> (preferably) both, I strongly encourage you to do that first and see
> how it goes before adding a new alias.

+1 (for what it's worth) to being conservative with API changes as a
first resort.

Jason

From josef.pktd at gmail.com  Fri Apr 26 09:52:52 2013
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 26 Apr 2013 09:52:52 -0400
Subject: [Numpy-discussion] Proposal of new function: iteraxis()
In-Reply-To: <517A8319.8010306@creativetrax.com>
References: <517A8319.8010306@creativetrax.com> Message-ID:

The "new" documentation:
http://stackoverflow.com/questions/1589706/iterating-over-arbitrary-dimension-of-numpy-array
(second answer; the first answer is what I usually use).

Search term: "[numpy] iterate over axis"

Josef

On Fri, Apr 26, 2013 at 9:37 AM, Jason Grout wrote:
> On 4/26/13 6:33 AM, Robert Kern wrote:
>> In any case, whether you put the documentation in the rollaxis()
>> docstring or in one of the indexing/iteration sections, or
>> (preferably) both, I strongly encourage you to do that first and see
>> how it goes before adding a new alias.
>
> +1 (for what it's worth) to being conservative with API changes as a
> first resort.
>
> Jason
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From pelson.pub at gmail.com  Fri Apr 26 09:56:43 2013
From: pelson.pub at gmail.com (Phil Elson)
Date: Fri, 26 Apr 2013 14:56:43 +0100
Subject: [Numpy-discussion] Proposal of new function: iteraxis()
In-Reply-To: References: Message-ID:

I didn't find the rollaxis solution particularly obvious and also had
to think about what rollaxis did before understanding its usefulness
for iteration.

Now that I've understood it, I'm +1 for the statement that, as it
stands, the proposed iteraxis method doesn't add enough to warrant its
inclusion.

That said, I do think array iteration could be made simpler (or the
function I've missed better documented!).
I've put together an implementation of a "slices" function which can
return subsets of an array based on the axes provided (a generalisation
of iteraxis but implemented slightly differently):

import numpy as np

def slices(a, axes=-1):
    indices = np.repeat(slice(None), a.ndim)
    # turn axes into a 1d array of axes indices
    axes = np.array(axes).flatten()

    bad_indices = (axes < -a.ndim) | (axes > (a.ndim - 1))
    if np.any(bad_indices):
        raise ValueError('The axis index/indices were out of range.')

    # Turn negative indices into real indices
    axes[axes < 0] = a.ndim + axes[axes < 0]

    if np.unique(axes).shape != axes.shape:
        raise ValueError('Repeated axis indices were given.')

    indexing_shape = np.array(a.shape)[axes]

    for ind in np.ndindex(*indexing_shape):
        indices[axes] = ind
        yield a[tuple(indices)]

This can be used simply with:

>>> a = np.ones([2, 3, 4, 5])
>>> for s in slices(a, 2):
...     print s.shape
...
(2, 3, 5)
(2, 3, 5)
(2, 3, 5)
(2, 3, 5)

Or with the slightly more complex:

>>> len(list(slices(a, [2, -1])))
20

Without focusing on my actual implementation, would this kind of
interface be more desirable?

Cheers,

On 26 April 2013 12:33, Robert Kern wrote:
> On Fri, Apr 26, 2013 at 12:26 PM, Andrew Giessel wrote:
>> I agree with Charles that rollaxis() isn't immediately intuitive.
>>
>> It seems to me that documentation like this doesn't belong in
>> rollaxis() but instead wherever people talk about indexing and/or
>> iterating over an array. Nothing about the iteration depends on
>> rollaxis(); rollaxis is just giving you a different view of the array
>> to call __getitem__() on, if I understand correctly.
>
> Docstrings are perfect places to briefly describe and demonstrate
> common use cases for a function. There is no problem with including
> the example that I wrote in the rollaxis() docstring.
>
> In any case, whether you put the documentation in the rollaxis()
> docstring or in one of the indexing/iteration sections, or
> (preferably) both, I strongly encourage you to do that first and see
> how it goes before adding a new alias.
>
> --
> Robert Kern
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From chanley at gmail.com  Fri Apr 26 10:20:29 2013
From: chanley at gmail.com (Christopher Hanley)
Date: Fri, 26 Apr 2013 10:20:29 -0400
Subject: [Numpy-discussion] numpy.scipy.org page 404s
Message-ID:

Dear Numpy Webmasters,

Would it be possible to either redirect numpy.scipy.org to
www.numpy.org or to the main numpy github landing page? Currently
numpy.scipy.org hits a Github 404 page. As the numpy.scipy.org site
still shows up in searches it would be useful to have that address
resolve to something more helpful.

Thank you for your time and help,
Chris
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From nouiz at nouiz.org  Fri Apr 26 09:27:43 2013
From: nouiz at nouiz.org (Frédéric Bastien)
Date: Fri, 26 Apr 2013 09:27:43 -0400
Subject: [Numpy-discussion] ANN: NumPy 1.7.1 release
In-Reply-To: References: Message-ID:

Sorry, I didn't see the release candidate. I was away for a month and
didn't read all my email in order. Normally I try to test the release
candidate, but I wasn't able to this time. I have nothing to report
against NumPy 1.7.1.

I reread the previous emails and realized that I had misread one of
them the first time.
I understood someone to be suggesting that you do a release candidate,
as if you hadn't done one, but he was writing about not doing a 1.7.2
for datetime! Sorry for the noise.

Fred

On Thu, Apr 25, 2013 at 10:11 PM, Ondřej Čertík wrote:
> On Tue, Apr 23, 2013 at 12:10 PM, Frédéric Bastien wrote:
>> Hi,
>>
>> A big thanks for that release.
>>
>> I also think it would have been useful to do a release candidate for
>> this. This release changed the behavior related to python longs and
>> broke a test in Theano. Nothing important, but we could have fixed
>> this before the release.
>>
>> The numpy change is that a python long that doesn't fit in an int64,
>> but fits in a uint64, used to throw an overflow exception. Now it
>> returns a uint64.
>
> My apologies for this. There was a release candidate here:
>
> http://mail.scipy.org/pipermail/numpy-discussion/2013-March/065948.html
>
> and I don't see any offending patch between the 1.7.1rc1 and 1.7.1.
>
> If the bugs are in numpy, would you please report them in the issues,
> so that we can fix them?
>
> Thanks,
> Ondrej
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From robert.kern at gmail.com  Fri Apr 26 11:11:47 2013
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 26 Apr 2013 16:11:47 +0100
Subject: [Numpy-discussion] numpy.scipy.org page 404s
In-Reply-To: References: Message-ID:

On Fri, Apr 26, 2013 at 3:20 PM, Christopher Hanley wrote:
> Dear Numpy Webmasters,
>
> Would it be possible to either redirect numpy.scipy.org to
> www.numpy.org or to the main numpy github landing page? Currently
> numpy.scipy.org hits a Github 404 page. As the numpy.scipy.org site
> still shows up in searches it would be useful to have that address
> resolve to something more helpful.
>
> Thank you for your time and help,

$ dig numpy.scipy.org

; <<>> DiG 9.8.3-P1 <<>> numpy.scipy.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49456
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;numpy.scipy.org.              IN      A

;; ANSWER SECTION:
numpy.scipy.org.      682     IN      CNAME   www.numpy.org.
www.numpy.org.        60      IN      CNAME   numpy.github.com.
numpy.github.com.     42982   IN      A       204.232.175.78

;; Query time: 14 msec
;; SERVER: 10.44.0.1#53(10.44.0.1)
;; WHEN: Fri Apr 26 16:05:20 2013
;; MSG SIZE  rcvd: 103

Unfortunately, Github can only deal with one CNAME, www.numpy.org. The
documentation recommends that one "redirect" the other domains, but
it's not clear exactly what it is referring to. Having an HTTP server
with an A record for numpy.scipy.org that just issues HTTP 301
redirects for everything? I can look into getting that set up.

https://help.github.com/articles/my-custom-domain-isn-t-working#multiple-domains-in-cname-file

-- 
Robert Kern

From p.j.a.cock at googlemail.com  Fri Apr 26 11:18:17 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Fri, 26 Apr 2013 16:18:17 +0100
Subject: [Numpy-discussion] numpy.scipy.org page 404s
In-Reply-To: References: Message-ID:

On Fri, Apr 26, 2013 at 4:11 PM, Robert Kern wrote:
> On Fri, Apr 26, 2013 at 3:20 PM, Christopher Hanley wrote:
>> Dear Numpy Webmasters,
>>
>> Would it be possible to either redirect numpy.scipy.org to
>> www.numpy.org or to the main numpy github landing page? Currently
>> numpy.scipy.org hits a Github 404 page.
As the numpy.scipy.org site still shows up in searches it >> would be useful to have that address resolve to something more helpful. >> >> Thank you for your time and help, > > $ dig numpy.scipy.org > > ; <<>> DiG 9.8.3-P1 <<>> numpy.scipy.org > ;; global options: +cmd > ;; Got answer: > ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49456 > ;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0 > > ;; QUESTION SECTION: > ;numpy.scipy.org. IN A > > ;; ANSWER SECTION: > numpy.scipy.org. 682 IN CNAME www.numpy.org. > www.numpy.org. 60 IN CNAME numpy.github.com. > numpy.github.com. 42982 IN A 204.232.175.78 > > ;; Query time: 14 msec > ;; SERVER: 10.44.0.1#53(10.44.0.1) > ;; WHEN: Fri Apr 26 16:05:20 2013 > ;; MSG SIZE rcvd: 103 > > > Unfortunately, Github can only deal with one CNAME, www.numpy.org. The > documentation recommends that one "redirect" the other domains, but > it's not clear exactly what it is referring to. Having an HTTP server > with an A record for numpy.scipy.org that just issues HTTP 301 > redirects for everything? I can look into getting that set up. > > https://help.github.com/articles/my-custom-domain-isn-t-working#multiple-domains-in-cname-file > +1 for fixing this - I tried to report this back in February, but checking the archive my email seems to have gotten lost. I noticed this from a manuscript in proof when the copy editor pointed out http://numpy.scipy.org wasn't working. As http://numpy.scipy.org used to be a widely used URL for the project, and likely appears in many printed references, fixing it to redirect to the (relatively new) http://www.numpy.org would be good. Peter From lists at hilboll.de Fri Apr 26 11:30:32 2013 From: lists at hilboll.de (Andreas Hilboll) Date: Fri, 26 Apr 2013 17:30:32 +0200 Subject: [Numpy-discussion] numpy.scipy.org page 404s In-Reply-To: References: Message-ID: <517A9D98.70003@hilboll.de> > Unfortunately, Github can only deal with one CNAME, www.numpy.org. The > documentation recommends that one "redirect" the other domains, but > it's not clear exactly what it is referring to. Having an HTTP server > with an A record for numpy.scipy.org that just issues HTTP 301 > redirects for everything? I can look into getting that set up. > > https://help.github.com/articles/my-custom-domain-isn-t-working#multiple-domains-in-cname-file If I understand that correctly, you'll need to point the numpy.scipy.org CNAME to a non-github IP, and install a HTTP redirect **or** a HTTPD rewrite on that IP. So we need to find a server to do that. Probably easiest to ask numfocus, right? Cheers, Andreas. From robert.kern at gmail.com Fri Apr 26 11:38:27 2013 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 26 Apr 2013 16:38:27 +0100 Subject: [Numpy-discussion] numpy.scipy.org page 404s In-Reply-To: <517A9D98.70003@hilboll.de> References: <517A9D98.70003@hilboll.de> Message-ID: On Fri, Apr 26, 2013 at 4:30 PM, Andreas Hilboll wrote: >> Unfortunately, Github can only deal with one CNAME, www.numpy.org. The >> documentation recommends that one "redirect" the other domains, but >> it's not clear exactly what it is referring to. Having an HTTP server >> with an A record for numpy.scipy.org that just issues HTTP 301 >> redirects for everything? I can look into getting that set up. 
>> >> https://help.github.com/articles/my-custom-domain-isn-t-working#multiple-domains-in-cname-file > > If I understand that correctly, you'll need to point the numpy.scipy.org > CNAME to a non-github IP, and install a HTTP redirect **or** a HTTPD > rewrite on that IP. So we need to find a server to do that. Probably > easiest to ask numfocus, right? There's no need. We'll just use the existing www.scipy.org Apache server to host the redirects. -- Robert Kern From lists at hilboll.de Fri Apr 26 11:44:15 2013 From: lists at hilboll.de (Andreas Hilboll) Date: Fri, 26 Apr 2013 17:44:15 +0200 Subject: [Numpy-discussion] numpy.scipy.org page 404s In-Reply-To: References: <517A9D98.70003@hilboll.de> Message-ID: <517AA0CF.3000801@hilboll.de> >>> Unfortunately, Github can only deal with one CNAME, www.numpy.org. The >>> documentation recommends that one "redirect" the other domains, but >>> it's not clear exactly what it is referring to. Having an HTTP server >>> with an A record for numpy.scipy.org that just issues HTTP 301 >>> redirects for everything? I can look into getting that set up. >>> >>> https://help.github.com/articles/my-custom-domain-isn-t-working#multiple-domains-in-cname-file >> >> If I understand that correctly, you'll need to point the numpy.scipy.org >> CNAME to a non-github IP, and install a HTTP redirect **or** a HTTPD >> rewrite on that IP. So we need to find a server to do that. Probably >> easiest to ask numfocus, right? > > There's no need. We'll just use the existing www.scipy.org Apache > server to host the redirects. Good. I have no clue about who operates which servers, and just assumed numfocus is doing that. BTW, is there help needed in server administration (for numpy, scipy, or whatever)? I could happily volunteer to help out. Cheers, Andreas. From robert.kern at gmail.com Fri Apr 26 11:51:06 2013 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 26 Apr 2013 16:51:06 +0100 Subject: [Numpy-discussion] numpy.scipy.org page 404s In-Reply-To: <517AA0CF.3000801@hilboll.de> References: <517A9D98.70003@hilboll.de> <517AA0CF.3000801@hilboll.de> Message-ID: On Fri, Apr 26, 2013 at 4:44 PM, Andreas Hilboll wrote: >>>> Unfortunately, Github can only deal with one CNAME, www.numpy.org. The >>>> documentation recommends that one "redirect" the other domains, but >>>> it's not clear exactly what it is referring to. Having an HTTP server >>>> with an A record for numpy.scipy.org that just issues HTTP 301 >>>> redirects for everything? I can look into getting that set up. >>>> >>>> https://help.github.com/articles/my-custom-domain-isn-t-working#multiple-domains-in-cname-file >>> >>> If I understand that correctly, you'll need to point the numpy.scipy.org >>> CNAME to a non-github IP, and install a HTTP redirect **or** a HTTPD >>> rewrite on that IP. So we need to find a server to do that. Probably >>> easiest to ask numfocus, right? >> >> There's no need. We'll just use the existing www.scipy.org Apache >> server to host the redirects. > > Good. I have no clue about who operates which servers, and just assumed > numfocus is doing that. Enthought hosts and administers the scipy.org domain. > BTW, is there help needed in server administration (for numpy, scipy, or > whatever)? I could happily volunteer to help out. Right now, the recurring cost is kicking the www.scipy.org wiki every once in a while under the deluge of spam. 
The best way to stop that is to help finish the migration of content to the new static site: https://github.com/scipy/scipy.org-new I think the major TODO items there are the conversion of the Topical Software and Cookbook pages. I can give dumps of the current wiki pages to anyone who wants to help with that. -- Robert Kern From ognen at enthought.com Fri Apr 26 12:06:32 2013 From: ognen at enthought.com (Ognen Duzlevski) Date: Fri, 26 Apr 2013 11:06:32 -0500 Subject: [Numpy-discussion] numpy.scipy.org page 404s In-Reply-To: References: <517A9D98.70003@hilboll.de> <517AA0CF.3000801@hilboll.de> Message-ID: Should be fixed now. Ognen On Fri, Apr 26, 2013 at 10:51 AM, Robert Kern wrote: > On Fri, Apr 26, 2013 at 4:44 PM, Andreas Hilboll wrote: > >>>> Unfortunately, Github can only deal with one CNAME, www.numpy.org. > The > >>>> documentation recommends that one "redirect" the other domains, but > >>>> it's not clear exactly what it is referring to. Having an HTTP server > >>>> with an A record for numpy.scipy.org that just issues HTTP 301 > >>>> redirects for everything? I can look into getting that set up. > >>>> > >>>> > https://help.github.com/articles/my-custom-domain-isn-t-working#multiple-domains-in-cname-file > >>> > >>> If I understand that correctly, you'll need to point the > numpy.scipy.org > >>> CNAME to a non-github IP, and install a HTTP redirect **or** a HTTPD > >>> rewrite on that IP. So we need to find a server to do that. Probably > >>> easiest to ask numfocus, right? > >> > >> There's no need. We'll just use the existing www.scipy.org Apache > >> server to host the redirects. > > > > Good. I have no clue about who operates which servers, and just assumed > > numfocus is doing that. > > Enthought hosts and administers the scipy.org domain. > > > BTW, is there help needed in server administration (for numpy, scipy, or > > whatever)? I could happily volunteer to help out. > > Right now, the recurring cost is kicking the www.scipy.org wiki every > once in a while under the deluge of spam. The best way to stop that is > to help finish the migration of content to the new static site: > > https://github.com/scipy/scipy.org-new > > I think the major TODO items there are the conversion of the Topical > Software and Cookbook pages. I can give dumps of the current wiki > pages to anyone who wants to help with that. > > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chanley at gmail.com Fri Apr 26 12:20:06 2013 From: chanley at gmail.com (Christopher Hanley) Date: Fri, 26 Apr 2013 12:20:06 -0400 Subject: [Numpy-discussion] numpy.scipy.org page 404s In-Reply-To: References: <517A9D98.70003@hilboll.de> <517AA0CF.3000801@hilboll.de> Message-ID: Thank you! On Fri, Apr 26, 2013 at 12:06 PM, Ognen Duzlevski wrote: > Should be fixed now. > Ognen > > > On Fri, Apr 26, 2013 at 10:51 AM, Robert Kern wrote: > >> On Fri, Apr 26, 2013 at 4:44 PM, Andreas Hilboll >> wrote: >> >>>> Unfortunately, Github can only deal with one CNAME, www.numpy.org. >> The >> >>>> documentation recommends that one "redirect" the other domains, but >> >>>> it's not clear exactly what it is referring to. Having an HTTP server >> >>>> with an A record for numpy.scipy.org that just issues HTTP 301 >> >>>> redirects for everything? I can look into getting that set up. 
>> >>>>
>> >>>> https://help.github.com/articles/my-custom-domain-isn-t-working#multiple-domains-in-cname-file
>> >>>
>> >>> If I understand that correctly, you'll need to point the
>> >>> numpy.scipy.org CNAME to a non-github IP, and install a HTTP
>> >>> redirect **or** a HTTPD rewrite on that IP. So we need to find a
>> >>> server to do that. Probably easiest to ask numfocus, right?
>> >>
>> >> There's no need. We'll just use the existing www.scipy.org Apache
>> >> server to host the redirects.
>> >
>> > Good. I have no clue about who operates which servers, and just
>> > assumed numfocus is doing that.
>>
>> Enthought hosts and administers the scipy.org domain.
>>
>> > BTW, is there help needed in server administration (for numpy,
>> > scipy, or whatever)? I could happily volunteer to help out.
>>
>> Right now, the recurring cost is kicking the www.scipy.org wiki every
>> once in a while under the deluge of spam. The best way to stop that is
>> to help finish the migration of content to the new static site:
>>
>> https://github.com/scipy/scipy.org-new
>>
>> I think the major TODO items there are the conversion of the Topical
>> Software and Cookbook pages. I can give dumps of the current wiki
>> pages to anyone who wants to help with that.
>>
>> --
>> Robert Kern
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From andrew_giessel at hms.harvard.edu  Fri Apr 26 13:02:00 2013
From: andrew_giessel at hms.harvard.edu (Andrew Giessel)
Date: Fri, 26 Apr 2013 13:02:00 -0400
Subject: [Numpy-discussion] Proposal of new function: iteraxis()
In-Reply-To: References: Message-ID:

I like this, thank you Phil.

From what I can see, the ordering of the returned slices when you use
more than one axis (i.e. slices(a, [1, 2])) increments the last axis
fastest. Does this make sense based on the default ordering of, say,
nditer()? I know that C-order (row major) and Fortran order (column
major) are two ways of ordering the returned values - which does this
default to? Is there a default across numpy?

best,

On Fri, Apr 26, 2013 at 9:56 AM, Phil Elson wrote:
> I didn't find the rollaxis solution particularly obvious and also had
> to think about what rollaxis did before understanding its usefulness
> for iteration.
> Now that I've understood it, I'm +1 for the statement that, as it
> stands, the proposed iteraxis method doesn't add enough to warrant its
> inclusion.
>
> That said, I do think array iteration could be made simpler (or the
> function I've missed better documented!).
I've put together an > implementation of a "slices" function which can return subsets of an array > based on the axes provided (a generalisation of iteraxis but implemented > slightly differently): > > def slices(a, axes=-1): > indices = np.repeat(slice(None), a.ndim) > # turn axes into a 1d array of axes indices > axes = np.array(axes).flatten() > > bad_indices = (axes < (-a.ndim + 1)) | axes > (a.ndim - 1) > if np.any(bad_indices): > raise ValueError('The axis index/indices were out of range.') > > # Turn negative indices into real indices > axes[axes < 0] = a.ndim + axes[axes < 0] > > if np.unique(axes).shape != axes.shape: > raise ValueError('Repeated axis indices were given.') > > indexing_shape = np.array(a.shape)[axes] > > for ind in np.ndindex(*indexing_shape): > indices[axes] = ind > yield a[tuple(indices)] > > > This can be used simply with: > > >>> a = np.ones([2, 3, 4, 5]) > >>> for s in slices(a, 2): > ... print s.shape > ... > (2, 3, 5) > (2, 3, 5) > (2, 3, 5) > (2, 3, 5) > > > Or slightly with the slightly more complex: > > >>> len(list(slices(a, [2, -1]))) > 20 > > Without focusing on my actual implementation, would this kind of interface > be more desirable? > > Cheers, > > > > > On 26 April 2013 12:33, Robert Kern wrote: > >> On Fri, Apr 26, 2013 at 12:26 PM, Andrew Giessel >> wrote: >> > I agree with Charles that rollaxis() isn't immediately intuitive. >> > >> > It seems to me that documentation like this doesn't belong in >> rollaxis() but >> > instead wherever people talk about indexing and/or iterating over an >> array. >> > Nothing about the iteration depends on rollaxis(), rollaxis is just >> giving >> > you a different view of the array to call __getitem__() on, if I >> understand >> > correctly. >> >> Docstrings are perfect places to briefly describe and demonstrate >> common use cases for a function. There is no problem with including >> the example that I wrote in the rollaxis() docstring. >> >> In any case, whether you put the documentation in the rollaxis() >> docstring or in one of the indexing/iteration sections, or >> (preferably) both, I strongly encourage you to do that first and see >> how it goes before adding a new alias. >> >> -- >> Robert Kern >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Andrew Giessel, PhD Department of Neurobiology, Harvard Medical School 220 Longwood Ave Boston, MA 02115 ph: 617.432.7971 email: andrew_giessel at hms.harvard.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Fri Apr 26 16:45:38 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 26 Apr 2013 13:45:38 -0700 Subject: [Numpy-discussion] Proposal of new function: iteraxis() In-Reply-To: References: Message-ID: Hi, On Fri, Apr 26, 2013 at 10:02 AM, Andrew Giessel wrote: > I like this, thank you Phil. > > From what I can see, the ordering of the returned slices when you use more > than one axis (ie: slices(a, [1,2]), increments the last axis fastest. Does > this makes sense based on the default ordering of, say, nditer()? I know > that C-order (row major) and Fortran order (column major) are two ways of > ordering the returned values- which does this default to? 
> Is there a default across numpy?

There was a thread on the distinction between index ordering and memory
layout starting here:

http://www.mail-archive.com/numpy-discussion at scipy.org/msg40956.html

The answer is that C-like index ordering is the default across numpy
(last changing fastest), and that, typically (always?) you can change
this ordering to Fortran-like (first-fastest) with an 'order' keyword
to the function or method.

Cheers,

Matthew
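A small sketch makes the two orderings concrete (an illustration assuming
only ravel's documented 'order' argument):

    >>> import numpy as np
    >>> a = np.arange(6).reshape(2, 3)
    >>> a.ravel('C')   # C-like: last index varies fastest
    array([0, 1, 2, 3, 4, 5])
    >>> a.ravel('F')   # Fortran-like: first index varies fastest
    array([0, 3, 1, 4, 2, 5])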
From gael.varoquaux at normalesup.org Fri Apr 26 16:50:02 2013
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Fri, 26 Apr 2013 22:50:02 +0200
Subject: [Numpy-discussion] Proposal of new function: iteraxis()
In-Reply-To: References: Message-ID: <20130426205002.GA4942@phare.normalesup.org>

On Thu, Apr 25, 2013 at 08:10:32PM +0100, Robert Kern wrote:
> In my opinion, duplicating functionality under different aliases just
> so people can supposedly find things without reading the documentation
> is not a viable strategy for building out an API.

+1. It's been my experience over and over again. Richer APIs are actually
less used and known than simple APIs. People don't find their way around
them.

> My suggestion is to start building out a "How do I ...?" section to
> the User's Guide that answers small questions like this. "How do I
> iterate over an arbitrary axis of an array?" should be sufficiently
> discoverable.

Indeed, I agree that this is a documentation problem. That does not make
it a simple problem.

G

From scopatz at gmail.com Sat Apr 27 13:30:26 2013
From: scopatz at gmail.com (Anthony Scopatz)
Date: Sat, 27 Apr 2013 12:30:26 -0500
Subject: [Numpy-discussion] [Pytables-users] ANN: numexpr 2.1 (Python 3 support is here!)
In-Reply-To: <517BA352.8020102@gmail.com> References: <517BA352.8020102@gmail.com> Message-ID:

Congrats Francesc!

On Sat, Apr 27, 2013 at 5:07 AM, Francesc Alted wrote:

> ========================
>  Announcing Numexpr 2.1
> ========================
>
> Numexpr is a fast numerical expression evaluator for NumPy. With it,
> expressions that operate on arrays (like "3*a+4*b") are accelerated
> and use less memory than doing the same calculation in Python.
>
> It sports multi-threaded capabilities, as well as support for Intel's
> VML library (included in Intel MKL), which allows an extremely fast
> evaluation of transcendental functions (sin, cos, tan, exp, log...)
> while squeezing the last drop of performance out of your multi-core
> processors.
>
> Its only dependency is NumPy (MKL is optional), so it works well as an
> easy-to-deploy, easy-to-use, computational kernel for projects that
> don't want to adopt other solutions that require heavier dependencies.
>
> What's new
> ==========
>
> The main feature of this version is that it adds much-needed
> **compatibility with Python 3**.
>
> Many thanks to Antonio Valentino for his fine work on this.
> Also, Christoph Gohlke quickly provided feedback and binaries for
> Windows, and Mark Wiebe and Gaëtan de Menten provided many small
> (but important!) fixes and improvements. All of you made numexpr 2.1
> the best release ever. Thanks!
>
> In case you want to know in more detail what has changed in this
> version, see:
>
> http://code.google.com/p/numexpr/wiki/ReleaseNotes
>
> or have a look at RELEASE_NOTES.txt in the tarball.
>
> Where can I find Numexpr?
> =========================
>
> The project is hosted at Google code at:
>
> http://code.google.com/p/numexpr/
>
> You can get the packages from PyPI as well:
>
> http://pypi.python.org/pypi/numexpr
>
> Share your experience
> =====================
>
> Let us know of any bugs, suggestions, gripes, kudos, etc. you may
> have.
>
> Enjoy data!
>
> Francesc Alted
>
> _______________________________________________
> Pytables-users mailing list
> Pytables-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/pytables-users

From opossumnano at gmail.com Mon Apr 29 09:16:51 2013
From: opossumnano at gmail.com (Tiziano Zito)
Date: Mon, 29 Apr 2013 15:16:51 +0200 (CEST)
Subject: [Numpy-discussion] EuroSciPy 2013: deadline extension 5 May 2013!
Message-ID: <20130429131651.4D19312E00D6@comms.bccn-berlin.de>

The committee of the EuroSciPy 2013 conference has extended the deadline
for abstract submission to **Sunday May 5th 2013, 23:59:50 (UTC)**. Up to
then, new abstracts may be submitted on http://www.euroscipy.org .

We are very much looking forward to your submissions to the conference.

EuroSciPy 2013 is the annual European conference for scientists using
Python. It will be held August 21-25 2013 in Brussels, Belgium.

Any other questions should be addressed exclusively to
euroscipy-org at python.org

--
Tiziano Zito (Program Chair)

From josef.pktd at gmail.com Mon Apr 29 11:15:14 2013
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 29 Apr 2013 11:15:14 -0400
Subject: [Numpy-discussion] int to binary
Message-ID:

Is there an available function to convert an int to a binary
representation as a sequence of 0s and 1s?

binary_repr produces strings and is not vectorized

>>> np.binary_repr(5)
'101'
>>> np.binary_repr(5, width=4)
'0101'
>>> np.binary_repr(np.arange(5), width=4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python26\lib\site-packages\numpy\core\numeric.py", line 1732, in binary_repr
    if num < 0:
ValueError: The truth value of an array with more than one element is ambiguous.
Use a.any() or a.all() ------------ That's the best I could come up with in a few minutes: >>> k = 3; int2bin(np.arange(2**k), k, roll=False) array([[ 0., 0., 0.], [ 1., 0., 0.], [ 0., 0., 1.], [ 1., 0., 1.], [ 0., 1., 0.], [ 1., 1., 0.], [ 0., 1., 1.], [ 1., 1., 1.]]) >>> k = 3; int2bin(np.arange(2**k), k, roll=True) array([[ 0., 0., 0.], [ 0., 0., 1.], [ 0., 1., 0.], [ 0., 1., 1.], [ 1., 0., 0.], [ 1., 0., 1.], [ 1., 1., 0.], [ 1., 1., 1.]]) ----------- def int2bin(x, width, roll=True): x = np.atleast_1d(x) res = np.zeros(x.shape + (width,) ) for i in range(width): x, r = divmod(x, 2) res[..., -i] = r if roll: res = np.roll(res, width-1, axis=-1) return res Josef From sebastian at sipsolutions.net Mon Apr 29 11:25:10 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 29 Apr 2013 17:25:10 +0200 Subject: [Numpy-discussion] int to binary In-Reply-To: References: Message-ID: <1367249110.2545.2.camel@sebastian-laptop> On Mon, 2013-04-29 at 11:15 -0400, josef.pktd at gmail.com wrote: > Is there a available function to convert an int to binary > representation as sequence of 0 and 1? > Maybe unpackbits/packbits? It only supports the uint8 type, but you can view anything as that (being aware of endianess where necessary). You will also have to reshape the result, but that should not be a problem. - Sebastian > > binary_repr produces strings and is not vectorized > > >>> np.binary_repr(5) > '101' > >>> np.binary_repr(5, width=4) > '0101' > >>> np.binary_repr(np.arange(5), width=4) > Traceback (most recent call last): > File "", line 1, in > File "C:\Python26\lib\site-packages\numpy\core\numeric.py", line > 1732, in binary_repr > if num < 0: > ValueError: The truth value of an array with more than one element is > ambiguous. Use a.any() or a.all() > > ------------ > That's the best I could come up with in a few minutes: > > > >>> k = 3; int2bin(np.arange(2**k), k, roll=False) > array([[ 0., 0., 0.], > [ 1., 0., 0.], > [ 0., 0., 1.], > [ 1., 0., 1.], > [ 0., 1., 0.], > [ 1., 1., 0.], > [ 0., 1., 1.], > [ 1., 1., 1.]]) > >>> k = 3; int2bin(np.arange(2**k), k, roll=True) > array([[ 0., 0., 0.], > [ 0., 0., 1.], > [ 0., 1., 0.], > [ 0., 1., 1.], > [ 1., 0., 0.], > [ 1., 0., 1.], > [ 1., 1., 0.], > [ 1., 1., 1.]]) > > ----------- > def int2bin(x, width, roll=True): > x = np.atleast_1d(x) > res = np.zeros(x.shape + (width,) ) > for i in range(width): > x, r = divmod(x, 2) > res[..., -i] = r > if roll: > res = np.roll(res, width-1, axis=-1) > return res > > > Josef > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Mon Apr 29 11:48:07 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 29 Apr 2013 11:48:07 -0400 Subject: [Numpy-discussion] int to binary In-Reply-To: <1367249110.2545.2.camel@sebastian-laptop> References: <1367249110.2545.2.camel@sebastian-laptop> Message-ID: On Mon, Apr 29, 2013 at 11:25 AM, Sebastian Berg wrote: > On Mon, 2013-04-29 at 11:15 -0400, josef.pktd at gmail.com wrote: >> Is there a available function to convert an int to binary >> representation as sequence of 0 and 1? >> > > Maybe unpackbits/packbits? It only supports the uint8 type, but you can > view anything as that (being aware of endianess where necessary). You > will also have to reshape the result, but that should not be a problem. 
endianess sounds scary, maybe too close to the memory layout for my taste (for me to maintain as a helper function) >>> k=3; np.unpackbits(np.arange(2**k, dtype=np.uint32).view(np.uint8), axis=-1).reshape(2**k,-1)[:, 4:8] array([[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 0, 1, 1], [0, 1, 0, 0], [0, 1, 0, 1], [0, 1, 1, 0], [0, 1, 1, 1]], dtype=uint8) >>> k=3; np.unpackbits(np.arange(256, 256+2**k, dtype=np.uint32).view(np.uint8), axis=-1).reshape(2**k,-1)[-1,:] array([0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=uint8) >>> k=3; np.unpackbits(np.arange(256, 256+2**k, dtype=np.uint32).view(np.uint8), axis=-1).reshape(2**k,-1).sum(0) array([0, 0, 0, 0, 0, 4, 4, 4, 0, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=uint32) Thanks, Josef > > - Sebastian > >> >> binary_repr produces strings and is not vectorized >> >> >>> np.binary_repr(5) >> '101' >> >>> np.binary_repr(5, width=4) >> '0101' >> >>> np.binary_repr(np.arange(5), width=4) >> Traceback (most recent call last): >> File "", line 1, in >> File "C:\Python26\lib\site-packages\numpy\core\numeric.py", line >> 1732, in binary_repr >> if num < 0: >> ValueError: The truth value of an array with more than one element is >> ambiguous. Use a.any() or a.all() >> >> ------------ >> That's the best I could come up with in a few minutes: >> >> >> >>> k = 3; int2bin(np.arange(2**k), k, roll=False) >> array([[ 0., 0., 0.], >> [ 1., 0., 0.], >> [ 0., 0., 1.], >> [ 1., 0., 1.], >> [ 0., 1., 0.], >> [ 1., 1., 0.], >> [ 0., 1., 1.], >> [ 1., 1., 1.]]) >> >>> k = 3; int2bin(np.arange(2**k), k, roll=True) >> array([[ 0., 0., 0.], >> [ 0., 0., 1.], >> [ 0., 1., 0.], >> [ 0., 1., 1.], >> [ 1., 0., 0.], >> [ 1., 0., 1.], >> [ 1., 1., 0.], >> [ 1., 1., 1.]]) >> >> ----------- >> def int2bin(x, width, roll=True): >> x = np.atleast_1d(x) >> res = np.zeros(x.shape + (width,) ) >> for i in range(width): >> x, r = divmod(x, 2) >> res[..., -i] = r >> if roll: >> res = np.roll(res, width-1, axis=-1) >> return res >> >> >> Josef >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From warren.weckesser at gmail.com Mon Apr 29 12:24:11 2013 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Mon, 29 Apr 2013 12:24:11 -0400 Subject: [Numpy-discussion] int to binary In-Reply-To: References: Message-ID: On 4/29/13, josef.pktd at gmail.com wrote: > Is there a available function to convert an int to binary > representation as sequence of 0 and 1? > > > binary_repr produces strings and is not vectorized > >>>> np.binary_repr(5) > '101' >>>> np.binary_repr(5, width=4) > '0101' >>>> np.binary_repr(np.arange(5), width=4) > Traceback (most recent call last): > File "", line 1, in > File "C:\Python26\lib\site-packages\numpy\core\numeric.py", line > 1732, in binary_repr > if num < 0: > ValueError: The truth value of an array with more than one element is > ambiguous. 
Use a.any() or a.all()
>
> ------------
> That's the best I could come up with in a few minutes:
>
>>>> k = 3; int2bin(np.arange(2**k), k, roll=False)
> array([[ 0.,  0.,  0.],
>        [ 1.,  0.,  0.],
>        [ 0.,  0.,  1.],
>        [ 1.,  0.,  1.],
>        [ 0.,  1.,  0.],
>        [ 1.,  1.,  0.],
>        [ 0.,  1.,  1.],
>        [ 1.,  1.,  1.]])
>>>> k = 3; int2bin(np.arange(2**k), k, roll=True)
> array([[ 0.,  0.,  0.],
>        [ 0.,  0.,  1.],
>        [ 0.,  1.,  0.],
>        [ 0.,  1.,  1.],
>        [ 1.,  0.,  0.],
>        [ 1.,  0.,  1.],
>        [ 1.,  1.,  0.],
>        [ 1.,  1.,  1.]])
>
> -----------
> def int2bin(x, width, roll=True):
>     x = np.atleast_1d(x)
>     res = np.zeros(x.shape + (width,) )
>     for i in range(width):
>         x, r = divmod(x, 2)
>         res[..., -i] = r
>     if roll:
>         res = np.roll(res, width-1, axis=-1)
>     return res
>

Here is one way, in which each value is and'ed (with broadcasting) with
an array of values with a 1 in each consecutive bit. The comparison
`!= 0` converts the values from powers of 2 to bools, and then
`astype(int)` converts those to 0s and 1s. You'll probably want to
adjust how reshaping is done to get the result the way you want it.
> > In [1]: x = array([0, 1, 2, 3, 15, 16]) > > In [2]: width = 5 > > In [3]: ((x.reshape(-1,1) & (2**arange(width))) != 0).astype(int) > Out[3]: > array([[0, 0, 0, 0, 0], > [1, 0, 0, 0, 0], > [0, 1, 0, 0, 0], > [1, 1, 0, 0, 0], > [1, 1, 1, 1, 0], > [0, 0, 0, 0, 1]]) nice and tricky. I've never seen the bitwise_and in action like this. Maybe something to remember. -------- example indexing into a 2x2x2 contingency table >>> k = 3 >>> a3 = int2bin(np.arange(2**k), k) >>> c3 = np.arange(2**k).reshape(*([2]*k)) >>> c3[tuple(a3.T)] array([0, 1, 2, 3, 4, 5, 6, 7]) but I need to flatten 2 of those into a 2*2*2 x 2*2*2 table (2**(2k) table) Thanks, Josef > > > Warren > > >> >> Josef >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From andrew_giessel at hms.harvard.edu Mon Apr 29 14:10:14 2013 From: andrew_giessel at hms.harvard.edu (Andrew Giessel) Date: Mon, 29 Apr 2013 14:10:14 -0400 Subject: [Numpy-discussion] Proposal of new function: iteraxis() In-Reply-To: <20130426205002.GA4942@phare.normalesup.org> References: <20130426205002.GA4942@phare.normalesup.org> Message-ID: Matthew: Thanks for the link to array order discussion. Any more thoughts on Phil's slice() function? On Fri, Apr 26, 2013 at 4:50 PM, Gael Varoquaux < gael.varoquaux at normalesup.org> wrote: > On Thu, Apr 25, 2013 at 08:10:32PM +0100, Robert Kern wrote: > > In my opinion, duplicating functionality under different aliases just > > so people can supposedly find things without reading the documentation > > is not a viable strategy for building out an API. > > +1. It's been my experience over and over again. Richer APIs are actually > less used and known than simple APIs. People don't find their way around > them. > > > My suggestion is to start building out a "How do I ...?" section to > > the User's Guide that answers small questions like this. "How do I > > iterate over an arbitrary axis of an array?" should be sufficiently > > discoverable. > > Indeed, I agree that this is a documentation problem. It does not make it > a simple problem. > > G > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Andrew Giessel, PhD Department of Neurobiology, Harvard Medical School 220 Longwood Ave Boston, MA 02115 ph: 617.432.7971 email: andrew_giessel at hms.harvard.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniele at grinta.net Mon Apr 29 14:17:41 2013 From: daniele at grinta.net (Daniele Nicolodi) Date: Mon, 29 Apr 2013 20:17:41 +0200 Subject: [Numpy-discussion] int to binary In-Reply-To: References: Message-ID: <517EB945.20200@grinta.net> On 29/04/2013 17:15, josef.pktd at gmail.com wrote: > Is there a available function to convert an int to binary > representation as sequence of 0 and 1? ... > That's the best I could come up with in a few minutes: ... 
> def int2bin(x, width, roll=True):
>     x = np.atleast_1d(x)
>     res = np.zeros(x.shape + (width,) )
>     for i in range(width):
>         x, r = divmod(x, 2)
>         res[..., -i] = r
>     if roll:
>         res = np.roll(res, width-1, axis=-1)
>     return res

I haven't actually run a benchmark, but

  r = (x >> i) & 0x01

or, shifting x in place,

  x >>= 1
  r = x & 0x01

may be faster than

  x, r = divmod(x, 2)

Cheers,
Daniele

From chris.barker at noaa.gov Mon Apr 29 14:58:04 2013
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Mon, 29 Apr 2013 11:58:04 -0700
Subject: [Numpy-discussion] 1.8 release
In-Reply-To: References: Message-ID:

On Thu, Apr 25, 2013 at 8:19 AM, Dave Hirschfeld wrote:

> > Hi All, I think it is time to start the runup to the 1.8 release. I
> > don't know of any outstanding blockers but if anyone has a PR/issue
> > that they feel needs to be in the next Numpy release now is the time
> > to make it known. Chuck
>
> It would be good to get the utc-everywhere fix for datetime64 in there if
> someone has time to look into it.

+1

I've been on vacation, so haven't written up the various notes and
comments as a NEP yet -- I'll try to do that soon. There are some larger
proposals in the mix, which I doubt could be done by 1.8, but we really
should fix the "utc-everywhere" issue ASAP. I think it will be pretty easy
to do, but someone still needs to do it...

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception

Chris.Barker at noaa.gov

From ben.root at ou.edu Mon Apr 29 15:04:04 2013
From: ben.root at ou.edu (Benjamin Root)
Date: Mon, 29 Apr 2013 15:04:04 -0400
Subject: [Numpy-discussion] 1.8 release
In-Reply-To: References: Message-ID:

On Thu, Apr 25, 2013 at 11:16 AM, Charles R Harris <
charlesr.harris at gmail.com> wrote:

> Hi All,
>
> I think it is time to start the runup to the 1.8 release. I don't know of
> any outstanding blockers but if anyone has a PR/issue that they feel needs
> to be in the next Numpy release now is the time to make it known.
>
> Chuck

Has a np.minmax() function been added yet? I know it keeps getting +1's
whenever suggested, but I haven't seen it done yet. Another annoyance is
the lack of a np.nanmean() and np.nanstd() function.

Ben Root

From charlesr.harris at gmail.com Mon Apr 29 15:07:43 2013
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 29 Apr 2013 13:07:43 -0600
Subject: [Numpy-discussion] 1.8 release
In-Reply-To: References: Message-ID:

On Mon, Apr 29, 2013 at 12:58 PM, Chris Barker - NOAA Federal <
chris.barker at noaa.gov> wrote:

> On Thu, Apr 25, 2013 at 8:19 AM, Dave Hirschfeld wrote:
>
>> It would be good to get the utc-everywhere fix for datetime64 in there if
>> someone has time to look into it.
>>
> +1
>
> I've been on vacation, so haven't written up the various notes and
> comments as a NEP yet -- I'll try to do that soon. There are some larger
> proposals in the mix, which I doubt could be done by 1.8, but we really
> should fix the "utc-everywhere" issue ASAP. I think it will be pretty easy
> to do, but someone still needs to do it...
>
I think it will be pretty easy > to do, but someone still needs to do it... > > Is there a issue for this? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Apr 29 15:08:44 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 29 Apr 2013 13:08:44 -0600 Subject: [Numpy-discussion] 1.8 release In-Reply-To: References: Message-ID: On Mon, Apr 29, 2013 at 1:04 PM, Benjamin Root wrote: > > On Thu, Apr 25, 2013 at 11:16 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> Hi All, >> >> I think it is time to start the runup to the 1.8 release. I don't know of >> any outstanding blockers but if anyone has a PR/issue that they feel needs >> to be in the next Numpy release now is the time to make it known. >> >> Chuck >> >> > Has a np.minmax() function been added yet? I know it keeps getting +1's > whenever suggested, but I haven't seen it done yet. Another annoyance is > the lack of a np.nanmean() and np.nanstd() function. > Best make this a separate thread and open issues based on the outcome. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Apr 29 15:12:26 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Mon, 29 Apr 2013 12:12:26 -0700 Subject: [Numpy-discussion] 1.8 release In-Reply-To: References: Message-ID: On Mon, Apr 29, 2013 at 12:07 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > It would be good to get the utc-everywhere fix for datetime64 in there if >>> someone has time to look into it. >>> >>> +1 >> >> I've been on vacation, so haven't written up the various notes and >> comments as a NEP yet -- I'll try to do that soon. There are some larger >> proposals in the mix, which I doubt could be done by 1.8, but we really >> should fix the "utc-everywhere" issue ASAP. I think it will be pretty easy >> to do, but someone still needs to do it... >> >> > Is there a issue for this? > > no -- I don't think so -- just a bunch of discussion on the list. I was starting down the path of a "proper" NEP, etc, but I think that: a) Datetime64 is still "officially" experimental, so we can change things more rapidly than we might otherwise. b) It's really pretty broken as it is. I'll see if I can open an issue for the "easy" fix. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Apr 29 18:03:40 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Mon, 29 Apr 2013 15:03:40 -0700 Subject: [Numpy-discussion] 1.8 release In-Reply-To: References: Message-ID: On Mon, Apr 29, 2013 at 12:12 PM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: > > It would be good to get the utc-everywhere fix for datetime64 in there if >>>> someone has time to look into it. >>>> >>> I'll see if I can open an issue for the "easy" fix. > > DONE: Issue #3290 -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rhattersley at gmail.com Tue Apr 30 08:35:07 2013 From: rhattersley at gmail.com (Richard Hattersley) Date: Tue, 30 Apr 2013 13:35:07 +0100 Subject: [Numpy-discussion] bug in deepcopy() of rank-zero arrays? In-Reply-To: References: Message-ID: +1 for getting rid of this inconsistency We've hit this with Iris (a met/ocean analysis package - see github), and have had to add several workarounds. On 19 April 2013 16:55, Chris Barker - NOAA Federal wrote: > Hi folks, > > In [264]: np.__version__ > Out[264]: '1.7.0' > > I just noticed that deep copying a rank-zero array yields a scalar -- > probably not what we want. > > In [242]: a1 = np.array(3) > > In [243]: type(a1), a1 > Out[243]: (numpy.ndarray, array(3)) > > In [244]: a2 = copy.deepcopy(a1) > > In [245]: type(a2), a2 > Out[245]: (numpy.int32, 3) > > regular copy.copy() seems to work fine: > > In [246]: a3 = copy.copy(a1) > > In [247]: type(a3), a3 > Out[247]: (numpy.ndarray, array(3)) > > Higher-rank arrays seem to work fine: > > In [253]: a1 = np.array((3,4)) > > In [254]: type(a1), a1 > Out[254]: (numpy.ndarray, array([3, 4])) > > In [255]: a2 = copy.deepcopy(a1) > > In [256]: type(a2), a2 > Out[256]: (numpy.ndarray, array([3, 4])) > > Array scalars seem to work fine as well: > > In [257]: s1 = np.float32(3) > > In [258]: s2 = copy.deepcopy(s1) > > In [261]: type(s1), s1 > Out[261]: (numpy.float32, 3.0) > > In [262]: type(s2), s2 > Out[262]: (numpy.float32, 3.0) > > There are other ways to copy arrays, but in this case, I had a dict > with a bunch of arrays in it, and needed a deepcopy of the dict. I was > surprised to find that my rank-0 array got turned into a scalar. > > -Chris > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Apr 30 11:49:16 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Tue, 30 Apr 2013 08:49:16 -0700 Subject: [Numpy-discussion] bug in deepcopy() of rank-zero arrays? In-Reply-To: References: Message-ID: hmm -- I suppose one of us should post an issue on github -- then ask for it ti be fixed before 1.8 ;-) I'll try to get to the issue if no one beats me to it -- got to run now... -Chris On Tue, Apr 30, 2013 at 5:35 AM, Richard Hattersley wrote: > +1 for getting rid of this inconsistency > > We've hit this with Iris (a met/ocean analysis package - see github), and > have had to add several workarounds. > > > On 19 April 2013 16:55, Chris Barker - NOAA Federal > wrote: > >> Hi folks, >> >> In [264]: np.__version__ >> Out[264]: '1.7.0' >> >> I just noticed that deep copying a rank-zero array yields a scalar -- >> probably not what we want. 
>
> In [242]: a1 = np.array(3)
>
> In [243]: type(a1), a1
> Out[243]: (numpy.ndarray, array(3))
>
> In [244]: a2 = copy.deepcopy(a1)
>
> In [245]: type(a2), a2
> Out[245]: (numpy.int32, 3)
>
> regular copy.copy() seems to work fine:
>
> In [246]: a3 = copy.copy(a1)
>
> In [247]: type(a3), a3
> Out[247]: (numpy.ndarray, array(3))
>
> Higher-rank arrays seem to work fine:
>
> In [253]: a1 = np.array((3,4))
>
> In [254]: type(a1), a1
> Out[254]: (numpy.ndarray, array([3, 4]))
>
> In [255]: a2 = copy.deepcopy(a1)
>
> In [256]: type(a2), a2
> Out[256]: (numpy.ndarray, array([3, 4]))
>
> Array scalars seem to work fine as well:
>
> In [257]: s1 = np.float32(3)
>
> In [258]: s2 = copy.deepcopy(s1)
>
> In [261]: type(s1), s1
> Out[261]: (numpy.float32, 3.0)
>
> In [262]: type(s2), s2
> Out[262]: (numpy.float32, 3.0)
>
> There are other ways to copy arrays, but in this case, I had a dict
> with a bunch of arrays in it, and needed a deepcopy of the dict. I was
> surprised to find that my rank-0 array got turned into a scalar.
>
> -Chris
>
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959 voice
> 7600 Sand Point Way NE   (206) 526-6329 fax
> Seattle, WA 98115        (206) 526-6317 main reception
>
> Chris.Barker at noaa.gov
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From chris.barker at noaa.gov Tue Apr 30 11:49:16 2013
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Tue, 30 Apr 2013 08:49:16 -0700
Subject: [Numpy-discussion] bug in deepcopy() of rank-zero arrays?
In-Reply-To: References: Message-ID:

hmm -- I suppose one of us should post an issue on github -- then ask for
it to be fixed before 1.8 ;-)

I'll try to get to the issue if no one beats me to it -- got to run now...

-Chris

On Tue, Apr 30, 2013 at 5:35 AM, Richard Hattersley wrote:

> +1 for getting rid of this inconsistency
>
> We've hit this with Iris (a met/ocean analysis package - see github), and
> have had to add several workarounds.
>
> On 19 April 2013 16:55, Chris Barker - NOAA Federal wrote:
>
>> Hi folks,
>>
>> In [264]: np.__version__
>> Out[264]: '1.7.0'
>>
>> I just noticed that deep copying a rank-zero array yields a scalar --
>> probably not what we want.
>>
>> In [242]: a1 = np.array(3)
>>
>> In [243]: type(a1), a1
>> Out[243]: (numpy.ndarray, array(3))
>>
>> In [244]: a2 = copy.deepcopy(a1)
>>
>> In [245]: type(a2), a2
>> Out[245]: (numpy.int32, 3)
>>
>> regular copy.copy() seems to work fine:
>>
>> In [246]: a3 = copy.copy(a1)
>>
>> In [247]: type(a3), a3
>> Out[247]: (numpy.ndarray, array(3))
>>
>> Higher-rank arrays seem to work fine:
>>
>> In [253]: a1 = np.array((3,4))
>>
>> In [254]: type(a1), a1
>> Out[254]: (numpy.ndarray, array([3, 4]))
>>
>> In [255]: a2 = copy.deepcopy(a1)
>>
>> In [256]: type(a2), a2
>> Out[256]: (numpy.ndarray, array([3, 4]))
>>
>> Array scalars seem to work fine as well:
>>
>> In [257]: s1 = np.float32(3)
>>
>> In [258]: s2 = copy.deepcopy(s1)
>>
>> In [261]: type(s1), s1
>> Out[261]: (numpy.float32, 3.0)
>>
>> In [262]: type(s2), s2
>> Out[262]: (numpy.float32, 3.0)
>>
>> There are other ways to copy arrays, but in this case, I had a dict
>> with a bunch of arrays in it, and needed a deepcopy of the dict. I was
>> surprised to find that my rank-0 array got turned into a scalar.
>>
>> -Chris
>>
>> --
>> Christopher Barker, Ph.D.
>> Oceanographer
>>
>> Emergency Response Division
>> NOAA/NOS/OR&R            (206) 526-6959 voice
>> 7600 Sand Point Way NE   (206) 526-6329 fax
>> Seattle, WA 98115        (206) 526-6317 main reception
>>
>> Chris.Barker at noaa.gov
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception

Chris.Barker at noaa.gov

From blake.a.griffith at gmail.com Tue Apr 30 15:19:43 2013
From: blake.a.griffith at gmail.com (Blake Griffith)
Date: Tue, 30 Apr 2013 14:19:43 -0500
Subject: [Numpy-discussion] GSoC proposal -- Numpy SciPy
Message-ID:

Hello, I'm writing a GSoC proposal, mostly concerning SciPy, but it
involves a few changes to NumPy.

The proposal is titled: Improvements to the sparse package of Scipy:
support for bool dtype and better interaction with NumPy, and can be
found on my GitHub:
https://github.com/cowlicks/GSoC-proposal/blob/master/proposal.markdown#numpy-interactions----july-8th-to-august-26th-7-weeks

Basically, I want to change the ufunc class to be aware of SciPy's sparse
matrices, so that when a ufunc is passed a sparse matrix as an argument,
it will dispatch to a function in the sparse matrix package, which will
then decide what to do. I just wanted to ping NumPy to make sure this is
reasonable, and I'm not totally off track. Suggestions, feedback and
criticism welcome.

Thanks!

From njs at pobox.com Tue Apr 30 15:37:27 2013
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 30 Apr 2013 15:37:27 -0400
Subject: [Numpy-discussion] GSoC proposal -- Numpy SciPy
In-Reply-To: References: Message-ID:

On Tue, Apr 30, 2013 at 3:19 PM, Blake Griffith wrote:
> Hello, I'm writing a GSoC proposal, mostly concerning SciPy, but it involves
> a few changes to NumPy.
> The proposal is titled: Improvements to the sparse package of Scipy:
> support for bool dtype and better interaction with NumPy, and can be
> found on my GitHub:
> https://github.com/cowlicks/GSoC-proposal/blob/master/proposal.markdown#numpy-interactions----july-8th-to-august-26th-7-weeks
>
> Basically, I want to change the ufunc class to be aware of SciPy's sparse
> matrices, so that when a ufunc is passed a sparse matrix as an argument,
> it will dispatch to a function in the sparse matrix package, which will
> then decide what to do. I just wanted to ping NumPy to make sure this is
> reasonable, and I'm not totally off track. Suggestions, feedback and
> criticism welcome.

How do you plan to go about this? The obvious option of just calling
scipy.sparse.issparse() on ufunc entry raises some problems, since
numpy can't depend on or even import scipy, and we might be reluctant
to add such a special case for what's a rather more general problem.
OTOH it might be possible to solve the problem in general, e.g., see
the prototyped _ufunc_override_ special method in:
https://github.com/njsmith/numpyNEP/blob/master/numpyNEP.py
but I don't know if you want to get into such a debate within the
scope of your GSoC. What were you thinking?

-n

From charlesr.harris at gmail.com Tue Apr 30 16:00:26 2013
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 30 Apr 2013 14:00:26 -0600
Subject: [Numpy-discussion] GSoC proposal -- Numpy SciPy
In-Reply-To: References: Message-ID:

On Tue, Apr 30, 2013 at 1:37 PM, Nathaniel Smith wrote:

> On Tue, Apr 30, 2013 at 3:19 PM, Blake Griffith wrote:
> > Hello, I'm writing a GSoC proposal, mostly concerning SciPy, but it
> > involves a few changes to NumPy.
> > The proposal is titled: Improvements to the sparse package of Scipy:
> > support for bool dtype and better interaction with NumPy, and can be
> > found on my GitHub:
> > https://github.com/cowlicks/GSoC-proposal/blob/master/proposal.markdown#numpy-interactions----july-8th-to-august-26th-7-weeks
> >
> > Basically, I want to change the ufunc class to be aware of SciPy's sparse
> > matrices, so that when a ufunc is passed a sparse matrix as an argument,
> > it will dispatch to a function in the sparse matrix package, which will
> > then decide what to do. I just wanted to ping NumPy to make sure this is
> > reasonable, and I'm not totally off track. Suggestions, feedback and
> > criticism welcome.
>
> How do you plan to go about this? The obvious option of just calling
> scipy.sparse.issparse() on ufunc entry raises some problems, since
> numpy can't depend on or even import scipy, and we might be reluctant
> to add such a special case for what's a rather more general problem.
> OTOH it might be possible to solve the problem in general, e.g., see
> the prototyped _ufunc_override_ special method in:
> https://github.com/njsmith/numpyNEP/blob/master/numpyNEP.py
> but I don't know if you want to get into such a debate within the
> scope of your GSoC. What were you thinking?

ISTR that Mark Wiebe also had thoughts on that functionality. There was a
thread on the topic but I don't recall when.

Chuck

From pav at iki.fi Tue Apr 30 16:02:24 2013
From: pav at iki.fi (Pauli Virtanen)
Date: Tue, 30 Apr 2013 23:02:24 +0300
Subject: [Numpy-discussion] GSoC proposal -- Numpy SciPy
In-Reply-To: References: Message-ID:

30.04.2013 22:37, Nathaniel Smith wrote:
[clip]
> How do you plan to go about this?
The obvious option of just calling > scipy.sparse.issparse() on ufunc entry raises some problems, since > numpy can't depend on or even import scipy, and we might be reluctant > to add such a special case for what's a rather more general problem. > OTOH it might be possible to solve the problem in general, e.g., see > the prototyped _ufunc_override_ special method in: > > https://github.com/njsmith/numpyNEP/blob/master/numpyNEP.py > > but I don't know if you want to get into such a debate within the > scope of your GSoC. What were you thinking? To me it seems that the right thing to do here is the general solution. Do you see immediate problems in e.g. just enabling something like your _ufunc_override_? The easy thing is that there are no backward compatibility problems here, since if the magic is missing, the old logic is used. Currently, the numpy dot() and ufuncs also most of the time do nothing sensible with sparse matrix inputs even though they in some cases return values. Which then makes writing generic sparse/dense code more painful than just __mul__ being matrix multiplication. IIRC, I seem to remember that also the quantities package had some issues with operations involving ndarrays, to which being able to override this could be a solution. -- Pauli Virtanen From matthew.brett at gmail.com Tue Apr 30 16:16:38 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 30 Apr 2013 13:16:38 -0700 Subject: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering In-Reply-To: References: <1365153645.683.38.camel@sebastian-laptop> Message-ID: Hi, On Sat, Apr 6, 2013 at 3:15 PM, Matthew Brett wrote: > Hi, > > On Sat, Apr 6, 2013 at 1:35 PM, Ralf Gommers wrote: >> >> >> >> On Sat, Apr 6, 2013 at 7:22 PM, Matthew Brett >> wrote: >>> >>> Hi, >>> >>> On Sat, Apr 6, 2013 at 1:51 AM, Ralf Gommers >>> wrote: >>> > >>> > >>> > >>> > On Sat, Apr 6, 2013 at 4:47 AM, Matthew Brett >>> > wrote: >>> >> >>> >> Hi, >>> >> >>> >> On Fri, Apr 5, 2013 at 7:39 PM, wrote: >>> >> > >>> >> > It's not *any* cost, this goes deep and wide, it's one of the basic >>> >> > concepts of numpy that you want to rename. >>> >> >>> >> The proposal I last made was to change the default name to 'layout' >>> >> after some period to be agreed - say - P - with suitable warning in >>> >> the docstring up until that time, and after, and leave 'order' as an >>> >> alias forever. >>> > >>> > >>> > The above paragraph is simply incorrect. Your last proposal also >>> > included >>> > deprecation warnings and a future backwards compatibility break by >>> > removing >>> > 'order'. >>> > >>> > If you now say you're not proposing steps 3 and 4 anymore, then you're >>> > back >>> > to what I called option (2) - duplicate keywords forever. Which for me >>> > is >>> > undesirable, for reasons I already mentioned. >>> >>> You might not have read my follow-up proposing to drop steps 3 and 4 >>> if you felt they were unacceptable. >>> >>> > P.S. being called short-sighted and damaging numpy by responding to a >>> > proposal you now say you didn't make is pretty damn annoying. >>> >>> No, I did make that proposal, and in the spirit of negotiation and >>> consensus, I subsequently modified my proposal, as I hope you'd expect >>> in this situation. >> >> >> You have had clear NOs to the various incarnations of your proposal from 3 >> active developers of this community, not once but two or three times from >> each of those developers. 
>> Furthermore you have got only a couple of +0.5s; after 90 emails no one
>> else seems to feel that this is a change we really have to have.
>> Therefore I don't expect another modification of your proposal, I expect
>> you to drop it.
>
> OK - I think I have a better understanding of the 'model' now.
>
>> As another poster said, this thread has run its course. The technical
>> issues are clear, and apparently we're going to have to agree to disagree
>> about the seriousness of the confusion. Please please go and fix the docs
>> in the way you deem best, and leave it at that. And triple please not
>> another governance thread.

https://github.com/numpy/numpy/pull/3294

Cheers,

Matthew

From njs at pobox.com Tue Apr 30 19:53:08 2013
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 30 Apr 2013 19:53:08 -0400
Subject: [Numpy-discussion] GSoC proposal -- Numpy SciPy
In-Reply-To: References: Message-ID:

On Tue, Apr 30, 2013 at 4:02 PM, Pauli Virtanen wrote:
> 30.04.2013 22:37, Nathaniel Smith wrote:
> [clip]
>> How do you plan to go about this? The obvious option of just calling
>> scipy.sparse.issparse() on ufunc entry raises some problems, since
>> numpy can't depend on or even import scipy, and we might be reluctant
>> to add such a special case for what's a rather more general problem.
>> OTOH it might be possible to solve the problem in general, e.g., see
>> the prototyped _ufunc_override_ special method in:
>>
>> https://github.com/njsmith/numpyNEP/blob/master/numpyNEP.py
>>
>> but I don't know if you want to get into such a debate within the
>> scope of your GSoC. What were you thinking?
>
> To me it seems that the right thing to do here is the general solution.
>
> Do you see immediate problems in e.g. just enabling something like your
> _ufunc_override_?

Just that we might want to think a bit about the design space before
implementing something. E.g., apparently doing Python attribute lookup
is very expensive -- we recently had a patch to skip __array_interface__
checks whenever possible -- is adding another such per-operation overhead
ok? I guess we could use similar checks (skip checking for known types
like int/float/ndarray), or only check for _ufunc_override_ on the class
(not the instance) and cache the result per-class?

> The easy thing is that there are no backward compatibility problems
> here, since if the magic is missing, the old logic is used. Currently,
> the numpy dot() and ufuncs also most of the time do nothing sensible
> with sparse matrix inputs even though they in some cases return values.
> Which then makes writing generic sparse/dense code more painful than
> just __mul__ being matrix multiplication.

I agree, but, if the main target is 'dot' then the current
_ufunc_override_ design alone won't do it, since 'dot' is not a ufunc...

-n

From ben.root at ou.edu Tue Apr 30 21:36:55 2013
From: ben.root at ou.edu (Benjamin Root)
Date: Tue, 30 Apr 2013 21:36:55 -0400
Subject: [Numpy-discussion] nanmean(), nanstd() and other "missing" functions for 1.8
Message-ID:

Currently, I am in the process of migrating some co-workers from Matlab
and IDL, and the number one complaint I get is that numpy has nansum() but
no nanmean() and nanstd(). While we do have an alternative in the form of
masked arrays, most of these people are busy enough trying to port their
existing code over to python that this sort of stumbling block becomes
difficult to explain away.
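As a minimal sketch of what is being asked for -- flattened versions with
no axis or ddof handling, illustrative only rather than what NumPy would
eventually ship:

    import numpy as np

    def nanmean(a):
        # mean over the non-NaN values only
        a = np.asarray(a, dtype=float)
        return a[~np.isnan(a)].mean()

    def nanstd(a):
        # standard deviation over the non-NaN values only
        a = np.asarray(a, dtype=float)
        return a[~np.isnan(a)].std()

    >>> x = np.array([1.0, np.nan, 3.0])
    >>> nanmean(x), nanstd(x)
    (2.0, 1.0)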
Given how relatively simple these functions are, I cannot think of any
reason not to include these functions in v1.8. Of course, the
documentation for these functions should certainly include mention of
masked arrays.

There is one other non-trivial function that has been discussed before:
np.minmax(). My thinking is that it would return a 2xN array (where N is
whatever size of the result that would be returned if just np.min() was
used). This would allow one to do "min, max = np.minmax(X)".

Are there any other functions that others feel are "missing" from numpy
and would like to see for v1.8? Let's discuss them here.

Cheers!
Ben Root

From arinkverma at iitrpr.ac.in Tue Apr 30 22:26:39 2013
From: arinkverma at iitrpr.ac.in (Arink Verma)
Date: Wed, 1 May 2013 07:56:39 +0530
Subject: [Numpy-discussion] GSoC : Performance parity between numpy arrays and Python scalars
Message-ID:

Hi all!

I have written my application[1] for "Performance parity between numpy
arrays and Python scalars"[2]. It would be a great help if you could
review it. Does it look achievable and deliverable according to the
project?

[1] http://www.google-melange.com/gsoc/proposal/review/google/gsoc2013/arinkverma/40001#
[2] http://projects.scipy.org/scipy/wiki/SummerofCodeIdeas

--
Arink
Computer Science and Engineering
Indian Institute of Technology Ropar
www.arinkverma.in

From davidmenhur at gmail.com Tue Apr 30 22:33:15 2013
From: davidmenhur at gmail.com (Daπid)
Date: Wed, 1 May 2013 04:33:15 +0200
Subject: [Numpy-discussion] nanmean(), nanstd() and other "missing" functions for 1.8
In-Reply-To: References: Message-ID:

On 1 May 2013 03:36, Benjamin Root wrote:
> There is one other non-trivial function that has been discussed before:
> np.minmax(). My thinking is that it would return a 2xN array (where N is
> whatever size of the result that would be returned if just np.min() was
> used). This would allow one to do "min, max = np.minmax(X)".

I had been looking for this function in the past, I think it is a good
and necessary addition. It should also come with its companion,
np.argminmax.

David.

From lists at onerussian.com Tue Apr 30 23:08:49 2013
From: lists at onerussian.com (Yaroslav Halchenko)
Date: Tue, 30 Apr 2013 23:08:49 -0400
Subject: [Numpy-discussion] could anyone check on a 32bit system?
In-Reply-To: References: Message-ID: <20130501030849.GO5140@onerussian.com>

could anyone on a 32-bit system with a fresh numpy (1.7.1) test the
following:

> wget -nc http://www.onerussian.com/tmp/data.npy ; python -c 'import numpy as np; data1 = np.load("/tmp/data.npy"); print np.sum(data1[1,:,0,1]) - np.sum(data1, axis=1)[1,0,1]'
0.0

because unfortunately it seems that on fresh ubuntu raring (in the 32-bit
build only; it seems ok in 64-bit, and I also never ran into it on older
numpy releases):

> python -c 'import numpy as np; data1 = np.load("/tmp/data.npy"); print np.sum(data1[1,:,0,1]) - np.sum(data1, axis=1)[1,0,1]'
-1.11022302463e-16

PS detected by failed tests of pymvpa

--
Yaroslav O. Halchenko, Ph.D.
http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
Senior Research Associate, Psychological and Brain Sciences Dept.
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW: http://www.linkedin.com/in/yarik
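A note on the -1.11e-16 discrepancy reported above: that is about one unit
in the last place of a float64 near 1, which is the size of difference one
expects when the same values are accumulated in a different order --
np.sum over an axis of a multidimensional array need not add the elements
in the same order as summing a 1-d slice, and 32-bit x86 builds typically
go through x87 instructions with 80-bit intermediates while 64-bit builds
use SSE2, so intermediate rounding can differ between the two as well. A
minimal, data-independent illustration of the ordering effect, assuming
only IEEE-754 double arithmetic:

    >>> import numpy as np
    >>> a = np.array([1.0, 1e-16, 1e-16])
    >>> (a[0] + a[1]) + a[2]   # each tiny term is rounded away in turn
    1.0
    >>> a[0] + (a[1] + a[2])   # the tiny terms combine before rounding
    1.0000000000000002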