From davidmenhur at gmail.com Thu Oct 1 02:54:14 2015 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Thu, 1 Oct 2015 08:54:14 +0200 Subject: [Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver In-Reply-To: References: <560AA195.8030001@jyu.fi> <-4907756911279440734@unknownmsgid> <560B9B3B.7000106@jyu.fi> Message-ID: On 30 September 2015 at 18:20, Nathaniel Smith wrote: > On Sep 30, 2015 2:28 AM, "Da?id" wrote: > [...] > > Is there a nice way to ship both versions? After all, most > implementations of BLAS and friends do spawn OpenMP threads, so I don't > think it would be outrageous to take advantage of it in more places; > provided there is a nice way to fallback to a serial version when it is not > available. > > This is incorrect -- the only common implementation of BLAS that uses > *OpenMP* threads is OpenBLAS, and even then it's not the default -- it only > happens if you run it in a special non-default configuration. > Right, sorry. I wanted to say they spawn parallel threads. What do you mean by a non default configuration? Setting he OMP_NUM_THREADS? > The challenges to providing transparent multithreading in numpy generally > are: > > - gcc + OpenMP on linux still breaks multiprocessing. There's a patch to > fix this but they still haven't applied it; alternatively there's a > workaround you can use in multiprocessing (not using fork mode), but this > requires every user update their code and the workaround has other > limitations. We're unlikely to use OpenMP while this is the case. > Any idea when is this going to be released? As I understand it, OpenBLAS doesn't have this problem, am I right? > - parallel code in general is not very composable. If someone is calling a > numpy operation from one thread, great, transparently using multiple > threads internally is a win. If they're exploiting some higher-level > structure in their problem to break it into pieces and process each in > parallel, and then using numpy on each piece, then numpy spawning threads > internally will probably destroy performance. And numpy is too low-level to > know which case it's in. This problem exists to some extent already with > multi-threaded BLAS, so people use various BLAS-specific knobs to manage it > in ad hoc ways, but this doesn't scale. > > (Ironically OpenMP is more composable then most approaches to threading, > but only if everyone is using it and, as per above, not everyone is and we > currently can't.) > That is what I meant with providing also a single threaded version. The user can choose if they want the parallel or the serial, depending on the case. -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Oct 1 03:05:22 2015 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 1 Oct 2015 00:05:22 -0700 Subject: [Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver In-Reply-To: References: <560AA195.8030001@jyu.fi> <-4907756911279440734@unknownmsgid> <560B9B3B.7000106@jyu.fi> Message-ID: On Wed, Sep 30, 2015 at 11:54 PM, Da?id wrote: > > > On 30 September 2015 at 18:20, Nathaniel Smith wrote: >> >> On Sep 30, 2015 2:28 AM, "Da?id" wrote: >> [...] >> > Is there a nice way to ship both versions? After all, most >> > implementations of BLAS and friends do spawn OpenMP threads, so I don't >> > think it would be outrageous to take advantage of it in more places; >> > provided there is a nice way to fallback to a serial version when it is not >> > available. 
>> >> This is incorrect -- the only common implementation of BLAS that uses >> *OpenMP* threads is OpenBLAS, and even then it's not the default -- it only >> happens if you run it in a special non-default configuration. > > Right, sorry. I wanted to say they spawn parallel threads. What do you mean > by a non default configuration? Setting he OMP_NUM_THREADS? I don't remember the details -- I think it might be a special setting you have to enable when you build OpenBLAS. >> The challenges to providing transparent multithreading in numpy generally >> are: >> >> - gcc + OpenMP on linux still breaks multiprocessing. There's a patch to >> fix this but they still haven't applied it; alternatively there's a >> workaround you can use in multiprocessing (not using fork mode), but this >> requires every user update their code and the workaround has other >> limitations. We're unlikely to use OpenMP while this is the case. > > Any idea when is this going to be released? Which? The gcc patch? I spent 2 full release cycles nagging them and they still can't be bothered to make a decision either way, so :-(. If anyone has some ideas for how to get traction in gcc-land then I'm happy to pass on details... > As I understand it, OpenBLAS doesn't have this problem, am I right? Right, in the default configuration then OpenBLAS will use its own internal thread pool code, and that code has the fixes needed to work with fork-based multiprocessing. Of course if you configure OpenBLAS to use OpenMP instead of its internal thread code then this no longer applies... -n -- Nathaniel J. Smith -- http://vorpus.org From alex.rogozhnikov at yandex.ru Thu Oct 1 14:46:59 2015 From: alex.rogozhnikov at yandex.ru (Alex Rogozhnikov) Date: Thu, 1 Oct 2015 21:46:59 +0300 Subject: [Numpy-discussion] Fwd: Numpy for data manipulation In-Reply-To: <560D7F05.3000308@yandex.ru> References: <560D7F05.3000308@yandex.ru> Message-ID: <560D7FA3.6070505@yandex.ru> Hi, I have written some numpy tips and tricks I am using, which may be interesting to you. This is quite long reading, so I've splitted it into two parts: http://arogozhnikov.github.io/2015/09/29/NumpyTipsAndTricks1.html http://arogozhnikov.github.io/2015/09/30/NumpyTipsAndTricks2.html Comments are welcome, specially if you know any other ways to make this code faster (or better). Regards, Alex. From stefanv at berkeley.edu Thu Oct 1 18:28:08 2015 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Thu, 01 Oct 2015 15:28:08 -0700 Subject: [Numpy-discussion] Fwd: Numpy for data manipulation In-Reply-To: <560D7FA3.6070505@yandex.ru> References: <560D7F05.3000308@yandex.ru> <560D7FA3.6070505@yandex.ru> Message-ID: <87h9mans07.fsf@berkeley.edu> On 2015-10-01 11:46:59, Alex Rogozhnikov wrote: > Hi, I have written some numpy tips and tricks I am using, which may be > interesting to you. > This is quite long reading, so I've splitted it into two parts: > > http://arogozhnikov.github.io/2015/09/29/NumpyTipsAndTricks1.html > http://arogozhnikov.github.io/2015/09/30/NumpyTipsAndTricks2.html I think that's a nice list already! I would probably start with: %matplotlib inline import numpy as np Then port all the code to Python 3 (or at least Python 2 & 3 compatible). Perhaps some illustrations could be useful, e.g. how to use the IronTransform to do histogram equalization. 
St?fan From jaime.frio at gmail.com Thu Oct 1 20:05:18 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Thu, 1 Oct 2015 17:05:18 -0700 Subject: [Numpy-discussion] Fwd: Numpy for data manipulation In-Reply-To: <560D7FA3.6070505@yandex.ru> References: <560D7F05.3000308@yandex.ru> <560D7FA3.6070505@yandex.ru> Message-ID: On Thu, Oct 1, 2015 at 11:46 AM, Alex Rogozhnikov < alex.rogozhnikov at yandex.ru> wrote: > Hi, I have written some numpy tips and tricks I am using, which may be > interesting to you. > This is quite long reading, so I've splitted it into two parts: > > http://arogozhnikov.github.io/2015/09/29/NumpyTipsAndTricks1.html The recommendation of inverting a permutation by argsort'ing it, while it works, is suboptimal, as it takes O(n log(n)) time, and you can do it in linear time: In [14]: import numpy as np In [15]: arr = np.random.rand(10) In [16]: perm = arr.argsort() In [17]: perm Out[17]: array([5, 0, 9, 4, 2, 8, 6, 7, 1, 3]) In [18]: inv_perm = np.empty_like(perm) In [19]: inv_perm[perm] = np.arange(len(perm)) In [20]: np.all(inv_perm == perm.argsort()) Out[20]: True It does require two lines of code, so for small stuff it is probably good enough to argsort, but it gave e.g. np.unique a nice boost on larger arrays when we applied it there. Jaime > > http://arogozhnikov.github.io/2015/09/30/NumpyTipsAndTricks2.html > > Comments are welcome, specially if you know any other ways to make this > code faster (or better). > > Regards, > Alex. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tcaswell at gmail.com Thu Oct 1 22:35:06 2015 From: tcaswell at gmail.com (Thomas Caswell) Date: Fri, 02 Oct 2015 02:35:06 +0000 Subject: [Numpy-discussion] Fwd: Numpy for data manipulation In-Reply-To: References: <560D7F05.3000308@yandex.ru> <560D7FA3.6070505@yandex.ru> Message-ID: I would suggest %matplotlib notebook It will still have to a nice png, but you get an interactive figure when it is live. I agree that making the example code Python3 is critical. Tom On Thu, Oct 1, 2015 at 8:05 PM Jaime Fern?ndez del R?o wrote: > On Thu, Oct 1, 2015 at 11:46 AM, Alex Rogozhnikov < > alex.rogozhnikov at yandex.ru> wrote: > >> Hi, I have written some numpy tips and tricks I am using, which may be >> interesting to you. >> This is quite long reading, so I've splitted it into two parts: >> >> http://arogozhnikov.github.io/2015/09/29/NumpyTipsAndTricks1.html > > > The recommendation of inverting a permutation by argsort'ing it, while it > works, is suboptimal, as it takes O(n log(n)) time, and you can do it in > linear time: > > In [14]: import numpy as np > > In [15]: arr = np.random.rand(10) > > In [16]: perm = arr.argsort() > > In [17]: perm > Out[17]: array([5, 0, 9, 4, 2, 8, 6, 7, 1, 3]) > > In [18]: inv_perm = np.empty_like(perm) > > In [19]: inv_perm[perm] = np.arange(len(perm)) > > In [20]: np.all(inv_perm == perm.argsort()) > Out[20]: True > > It does require two lines of code, so for small stuff it is probably good > enough to argsort, but it gave e.g. np.unique a nice boost on larger arrays > when we applied it there. 
> > Jaime > > >> >> http://arogozhnikov.github.io/2015/09/30/NumpyTipsAndTricks2.html >> >> Comments are welcome, specially if you know any other ways to make this >> code faster (or better). >> >> Regards, >> Alex. >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > > -- > (\__/) > ( O.o) > ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes > de dominaci?n mundial. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jni.soma at gmail.com Thu Oct 1 23:09:18 2015 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Thu, 01 Oct 2015 20:09:18 -0700 (PDT) Subject: [Numpy-discussion] Fwd: Numpy for data manipulation In-Reply-To: References: Message-ID: <1443755358119.1d9bb73e@Nodemailer> It will still have to a nice png, but you get an interactive figure when it is live. You just blew my mind. =D +1 to Python 3 and aliasing numpy as np. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.rogozhnikov at yandex.ru Fri Oct 2 03:38:16 2015 From: alex.rogozhnikov at yandex.ru (Alex Rogozhnikov) Date: Fri, 2 Oct 2015 10:38:16 +0300 Subject: [Numpy-discussion] Fwd: Numpy for data manipulation In-Reply-To: <1443755358119.1d9bb73e@Nodemailer> References: <1443755358119.1d9bb73e@Nodemailer> Message-ID: <560E3468.8010502@yandex.ru> > I would suggest > > %matplotlib notebook > > It will still have to a nice png, but you get an interactive figure > when it is live. Amazing, thanks. I was using mpld3 for this. (for some strange reason I need to put %matplotlib notebook before each plot) > The recommendation of inverting a permutation by argsort'ing it, while > it works, is suboptimal, as it takes O(n log(n)) time, and you can do > it in linear time: Actually, there is (later in post) a linear solution using bincount, but your code is definitely better. Thanks! From kikocorreoso at gmail.com Fri Oct 2 03:48:54 2015 From: kikocorreoso at gmail.com (Kiko) Date: Fri, 2 Oct 2015 09:48:54 +0200 Subject: [Numpy-discussion] Fwd: Numpy for data manipulation In-Reply-To: <560E3468.8010502@yandex.ru> References: <1443755358119.1d9bb73e@Nodemailer> <560E3468.8010502@yandex.ru> Message-ID: 2015-10-02 9:38 GMT+02:00 Alex Rogozhnikov : > I would suggest >> >> %matplotlib notebook >> >> It will still have to a nice png, but you get an interactive figure when >> it is live. >> > > Amazing, thanks. I was using mpld3 for this. > (for some strange reason I need to put %matplotlib notebook before each > plot) > You should create a figure before each plot instead of putthon %matplotlib notebook plt.figure() .... > > The recommendation of inverting a permutation by argsort'ing it, while it >> works, is suboptimal, as it takes O(n log(n)) time, and you can do it in >> linear time: >> > Actually, there is (later in post) a linear solution using bincount, but > your code is definitely better. Thanks! > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kikocorreoso at gmail.com Fri Oct 2 03:50:08 2015 From: kikocorreoso at gmail.com (Kiko) Date: Fri, 2 Oct 2015 09:50:08 +0200 Subject: [Numpy-discussion] Fwd: Numpy for data manipulation In-Reply-To: References: <1443755358119.1d9bb73e@Nodemailer> <560E3468.8010502@yandex.ru> Message-ID: 2015-10-02 9:48 GMT+02:00 Kiko : > > > 2015-10-02 9:38 GMT+02:00 Alex Rogozhnikov : > >> I would suggest >>> >>> %matplotlib notebook >>> >>> It will still have to a nice png, but you get an interactive figure when >>> it is live. >>> >> >> Amazing, thanks. I was using mpld3 for this. >> (for some strange reason I need to put %matplotlib notebook before each >> plot) >> > > You should create a figure before each plot instead of putthon %matplotlib > notebook > plt.figure() > .... > putthon == putting > > >> >> The recommendation of inverting a permutation by argsort'ing it, while it >>> works, is suboptimal, as it takes O(n log(n)) time, and you can do it in >>> linear time: >>> >> Actually, there is (later in post) a linear solution using bincount, but >> your code is definitely better. Thanks! >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From juha.jeronen at jyu.fi Fri Oct 2 05:47:30 2015 From: juha.jeronen at jyu.fi (Juha Jeronen) Date: Fri, 2 Oct 2015 12:47:30 +0300 Subject: [Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver In-Reply-To: References: <560AA195.8030001@jyu.fi> <-4907756911279440734@unknownmsgid> <560B9B3B.7000106@jyu.fi> <560C7C3D.9080107@jyu.fi> Message-ID: <560E52B2.2020504@jyu.fi> On 01.10.2015 03:32, Sturla Molden wrote: > On 01/10/15 02:20, Juha Jeronen wrote: > >> Then again, the matter is further complicated by considering codes that >> run on a single machine, versus codes that run on a cluster.Threads >> being local to each node in a cluster, > > You can run MPI programs on a single machine and you get OpenMP > implementations for clusters. Just pick an API and stick with it. Mm. I've quite often run MPI locally (it's nice for multicore scientific computing on Python), but I had no idea that OpenMP had cluster implementations. Thanks for the tip. I've got the impression that the way these APIs market themselves is that MPI is for processes, while OpenMP is for threads, even if this is not completely true across all implementations. (If I wanted maximal control over what each process/thread is doing, I'd go for ZeroMQ :) ) -J From juha.jeronen at jyu.fi Fri Oct 2 05:58:49 2015 From: juha.jeronen at jyu.fi (Juha Jeronen) Date: Fri, 2 Oct 2015 12:58:49 +0300 Subject: [Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver In-Reply-To: References: <560AA195.8030001@jyu.fi> <-4907756911279440734@unknownmsgid> <560B9B3B.7000106@jyu.fi> <560C7F1F.9060106@jyu.fi> Message-ID: <560E5559.9010402@jyu.fi> On 01.10.2015 03:52, Sturla Molden wrote: > On 01/10/15 02:32, Juha Jeronen wrote: > >> Sounds good. Out of curiosity, are there any standard fork-safe >> threadpools, or would this imply rolling our own? > > You have to roll your own. > > Basically use pthreads_atfork to register a callback that shuts down > the threadpool before a fork and another callback that restarts it. > Python's threading module does not expose the pthreads_atfork > function, so you must call it from Cython. 
> > I believe there is also a tiny atfork module in PyPI. Ok. Thanks. This approach fixes the issue of the threads not being there for the child process. I think it still leaves open the issue of creating the correct number of threads in the pools for each of the processes when the pool is restarted (so that in total there will be as many threads as cores (physical or virtual, whichever the user desires)). But this is again something that requires context... >> So maybe it would be better, at least at first, to make a pure-Cython >> version with no attempt at multithreading? > > I would start by making a pure Cython version that works correctly. > The next step would be to ensure that it releases the GIL. After that > you can worry about parallel processing, or just tell the user to use > threads or joblib. First version done and uploaded: https://yousource.it.jyu.fi/jjrandom2/miniprojects/trees/master/misc/polysolve_for_numpy OpenMP support removed; this version uses only Cython. The example program has been renamed to main.py, and setup.py has been cleaned, removing the irrelevant module. This folder contains only the files for the polynomial solver. As I suspected, removing OpenMP support only required changing a few lines, and dropping the import for Cython.parallel. The "prange"s have been replaced with "with nogil" and "range". Note that both the original version and this version release the GIL when running the processing loops. It may be better to leave this single-threaded for now. Using Python threads isn't that difficult and joblib sounds nice, too. What's the next step? -J From davidmenhur at gmail.com Fri Oct 2 06:07:10 2015 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Fri, 2 Oct 2015 12:07:10 +0200 Subject: [Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver In-Reply-To: <560E5559.9010402@jyu.fi> References: <560AA195.8030001@jyu.fi> <-4907756911279440734@unknownmsgid> <560B9B3B.7000106@jyu.fi> <560C7F1F.9060106@jyu.fi> <560E5559.9010402@jyu.fi> Message-ID: On 2 October 2015 at 11:58, Juha Jeronen wrote: > >> > First version done and uploaded: > > > https://yousource.it.jyu.fi/jjrandom2/miniprojects/trees/master/misc/polysolve_for_numpy > Small comment: now you are checking if the input is a scalar or a ndarray, but it should also accept any array-like. If I pass a list, I expect it to work, internally converting it into an array. -------------- next part -------------- An HTML attachment was scrubbed... URL: From juha.jeronen at jyu.fi Fri Oct 2 06:31:47 2015 From: juha.jeronen at jyu.fi (Juha Jeronen) Date: Fri, 2 Oct 2015 13:31:47 +0300 Subject: [Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver In-Reply-To: References: <560AA195.8030001@jyu.fi> <-4907756911279440734@unknownmsgid> <560B9B3B.7000106@jyu.fi> <560C7F1F.9060106@jyu.fi> <560E5559.9010402@jyu.fi> Message-ID: <560E5D13.30008@jyu.fi> On 02.10.2015 13:07, Da?id wrote: > > On 2 October 2015 at 11:58, Juha Jeronen > wrote: > > > > First version done and uploaded: > > https://yousource.it.jyu.fi/jjrandom2/miniprojects/trees/master/misc/polysolve_for_numpy > > > Small comment: now you are checking if the input is a scalar or a > ndarray, but it should also accept any array-like. If I pass a list, I > expect it to work, internally converting it into an array. Good catch. Is there an official way to test for array-likes? Or should I always convert with asarray()? Or something else? 
-J -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidmenhur at gmail.com Fri Oct 2 07:05:30 2015 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Fri, 2 Oct 2015 13:05:30 +0200 Subject: [Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver In-Reply-To: References: <560AA195.8030001@jyu.fi> <-4907756911279440734@unknownmsgid> <560B9B3B.7000106@jyu.fi> Message-ID: On 1 October 2015 at 09:05, Nathaniel Smith wrote: > > >> - gcc + OpenMP on linux still breaks multiprocessing. There's a patch to > >> fix this but they still haven't applied it; alternatively there's a > >> workaround you can use in multiprocessing (not using fork mode), but > this > >> requires every user update their code and the workaround has other > >> limitations. We're unlikely to use OpenMP while this is the case. > > > > Any idea when is this going to be released? > > Which? The gcc patch? I spent 2 full release cycles nagging them and > they still can't be bothered to make a decision either way, so :-(. If > anyone has some ideas for how to get traction in gcc-land then I'm > happy to pass on details... > :( Have you tried asking Python-dev for help with this? Hopefully they would have some weight there. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jslavin at cfa.harvard.edu Fri Oct 2 08:52:01 2015 From: jslavin at cfa.harvard.edu (Slavin, Jonathan) Date: Fri, 2 Oct 2015 08:52:01 -0400 Subject: [Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver Message-ID: ?Personally I like atleast_1d, which will convert a scalar into a 1d array but will leave arrays untouched (i.e. won't change the dimensions. Not sure what the advantages/disadvantages are relative to asarray. Jon? On Fri, Oct 2, 2015 at 7:05 AM, wrote: > From: Juha Jeronen > To: Discussion of Numerical Python > Cc: > Date: Fri, 2 Oct 2015 13:31:47 +0300 > Subject: Re: [Numpy-discussion] Cython-based OpenMP-accelerated quartic > polynomial solver > On 02.10.2015 13:07, Da?id wrote: > > > On 2 October 2015 at 11:58, Juha Jeronen wrote: > >> >>> >> First version done and uploaded: >> >> >> https://yousource.it.jyu.fi/jjrandom2/miniprojects/trees/master/misc/polysolve_for_numpy >> > > Small comment: now you are checking if the input is a scalar or a ndarray, > but it should also accept any array-like. If I pass a list, I expect it to > work, internally converting it into an array. > > > Good catch. > > Is there an official way to test for array-likes? Or should I always > convert with asarray()? Or something else? > > > -J > -- ________________________________________________________ Jonathan D. Slavin Harvard-Smithsonian CfA jslavin at cfa.harvard.edu 60 Garden Street, MS 83 phone: (617) 496-7981 Cambridge, MA 02138-1516 cell: (781) 363-0035 USA ________________________________________________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: From rmay31 at gmail.com Fri Oct 2 10:18:16 2015 From: rmay31 at gmail.com (Ryan May) Date: Fri, 2 Oct 2015 08:18:16 -0600 Subject: [Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver In-Reply-To: References: Message-ID: numpy.asanyarray() would be my preferred goto, as it will leave subclasses of ndarray untouched; asarray() and atleast_1d() force ndarray. It's nice to do the whenever possible. 
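For concreteness, here is a minimal sketch of that normalization pattern (the helper name below is made up for illustration and is not part of the actual polysolve code):

import numpy as np

def _normalize_coeffs(coeffs):
    # Accept scalars, lists, tuples or ndarrays.  asanyarray() converts
    # array-likes to ndarrays but leaves ndarray subclasses untouched;
    # atleast_1d() then guarantees at least one dimension so the solver
    # loop can always iterate over the coefficients.
    return np.atleast_1d(np.asanyarray(coeffs, dtype=np.float64))

_normalize_coeffs(2.0)        # -> array([ 2.])
_normalize_coeffs([1, 2, 3])  # -> array([ 1.,  2.,  3.])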
Ryan On Fri, Oct 2, 2015 at 6:52 AM, Slavin, Jonathan wrote: > ?Personally I like atleast_1d, which will convert a scalar into a 1d array > but will leave arrays untouched (i.e. won't change the dimensions. Not > sure what the advantages/disadvantages are relative to asarray. > > Jon? > > > On Fri, Oct 2, 2015 at 7:05 AM, > wrote: > >> From: Juha Jeronen >> To: Discussion of Numerical Python >> Cc: >> Date: Fri, 2 Oct 2015 13:31:47 +0300 >> Subject: Re: [Numpy-discussion] Cython-based OpenMP-accelerated quartic >> polynomial solver >> >> On 02.10.2015 13:07, Da?id wrote: >> >> >> On 2 October 2015 at 11:58, Juha Jeronen wrote: >> >>> >>>> >>> First version done and uploaded: >>> >>> >>> https://yousource.it.jyu.fi/jjrandom2/miniprojects/trees/master/misc/polysolve_for_numpy >>> >> >> Small comment: now you are checking if the input is a scalar or a >> ndarray, but it should also accept any array-like. If I pass a list, I >> expect it to work, internally converting it into an array. >> >> >> Good catch. >> >> Is there an official way to test for array-likes? Or should I always >> convert with asarray()? Or something else? >> >> >> -J >> > > > > > -- > ________________________________________________________ > Jonathan D. Slavin Harvard-Smithsonian CfA > jslavin at cfa.harvard.edu 60 Garden Street, MS 83 > phone: (617) 496-7981 Cambridge, MA 02138-1516 > cell: (781) 363-0035 USA > ________________________________________________________ > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Ryan May -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Fri Oct 2 15:41:27 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 2 Oct 2015 19:41:27 +0000 (UTC) Subject: [Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver References: <560AA195.8030001@jyu.fi> <-4907756911279440734@unknownmsgid> <560B9B3B.7000106@jyu.fi> <560C7C3D.9080107@jyu.fi> <560E52B2.2020504@jyu.fi> Message-ID: <2019608500465506663.143317sturla.molden-gmail.com@news.gmane.org> Juha Jeronen wrote: > Mm. I've quite often run MPI locally (it's nice for multicore scientific > computing on Python), but I had no idea that OpenMP had cluster > implementations. Thanks for the tip. Intel has been selling one, I think there are others too. OpenMP has a flush pragma for synchronizing shared variables. This means that OpenMP is not restricted to shared memory hardware. A "pragma omp flush" can just as well invoke some IPC mechanism, even network communication. Sturla From sturla.molden at gmail.com Fri Oct 2 16:00:23 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 2 Oct 2015 20:00:23 +0000 (UTC) Subject: [Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver References: <560AA195.8030001@jyu.fi> <-4907756911279440734@unknownmsgid> <560B9B3B.7000106@jyu.fi> <560C7C3D.9080107@jyu.fi> <560E52B2.2020504@jyu.fi> <2019608500465506663.143317sturla.molden-gmail.com@news.gmane.org> Message-ID: <2065264910465507888.986922sturla.molden-gmail.com@news.gmane.org> Sturla Molden wrote: > OpenMP has a flush pragma for synchronizing shared variables. This means > that OpenMP is not restricted to shared memory hardware. A "pragma omp > flush" can just as well invoke some IPC mechanism, even network > communication. 
By the way, while this is the case for C and Fortran, it is certainly not the case for Cython. In a Cython prange block, a shared variable is accessed by dereferencing its address. This requires shared memory. Pure OpenMP in C does not, because shared variables are not accessed through pointers, but are rather normal variables that are synchronized with a pragma. Cython actually requires that there is a shared address space, and it invokes something that strictly speaking has undefined behavior under the OpenMP standard. So thus, a prange block in Cython is expected to work correctly on a laptop with a multicore processor, but it is not expected to work correctly on a cluster. IIRC, Intel's cluster OpenMP is based on MPI, which means the compiler will internally translate code with OpenMP pragmas into equivalent code that calls MPI functions. A program written for OpenMP can then run on any cluster that provides an MPI implementation. From sturla.molden at gmail.com Fri Oct 2 16:15:05 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 2 Oct 2015 20:15:05 +0000 (UTC) Subject: [Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver References: <560AA195.8030001@jyu.fi> <-4907756911279440734@unknownmsgid> <560B9B3B.7000106@jyu.fi> <560C7C3D.9080107@jyu.fi> <560E52B2.2020504@jyu.fi> <2019608500465506663.143317sturla.molden-gmail.com@news.gmane.org> <2065264910465507888.986922sturla.molden-gmail.com@news.gmane.org> Message-ID: <481170864465509391.402498sturla.molden-gmail.com@news.gmane.org> Sturla Molden wrote: > Cython actually requires that there is a shared address space, and it > invokes something that strictly speaking has undefined behavior under the > OpenMP standard. So thus, a prange block in Cython is expected to work > correctly on a laptop with a multicore processor, but it is not expected to > work correctly on a cluster. OpenMP does not guarrantee that dereferencing a pointer in a parallel block will dereference the same object across all processors. It only guarrantees that the value of a shared object can be synchronized. There are many who use OpenMP and think only in terms of threads that do this incorrectly. Cython is actually among those. S.M. From travis at continuum.io Fri Oct 2 21:40:00 2015 From: travis at continuum.io (Travis Oliphant) Date: Fri, 2 Oct 2015 20:40:00 -0500 Subject: [Numpy-discussion] Let's move forward with the current governance document. Message-ID: Hi everyone, After some further thought and spending quite a bit of time re-reading the discussion on a few threads, I now believe that my request to be on the steering council might be creating more trouble than it's worth. Nothing matters to me more than seeing NumPy continue to grow and improve. So, I'm switching my position to supporting the adoption of the governance model outlined and just contributing as I can outside the steering council. The people on the steering council are committed to the success of NumPy and will do a great job --- they already have in contributing to the community over the past year(s). We can always revisit the question in a year if difficulties arise with the model. If my voice and other strong voices remain outside the council, perhaps we can all encourage that the intended community governance of NumPy does in fact happen, and most decisions continue to be made in the open. I had the pleasure last night of meeting one of the new NumPy core contributors, Allan Haldane. 
This only underscored my confidence in everyone who is contributing to NumPy today. This confidence has already been established by watching the great contributions of many talented developers who have given their time and talents to the project over the past several years. I hope that we can move on from the governance discussion and continue to promote the success of the project together. Best, -Travis -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Oct 3 15:23:07 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 3 Oct 2015 13:23:07 -0600 Subject: [Numpy-discussion] Numpy 1.10.0 coming Monday, 5 Oct. Message-ID: Hi All, A heads up about the coming Numpy release. If you have discovered any problems with rc1 or rc2, please report them. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffreback at gmail.com Sat Oct 3 17:33:34 2015 From: jeffreback at gmail.com (Jeff Reback) Date: Sat, 3 Oct 2015 17:33:34 -0400 Subject: [Numpy-discussion] ANN: pandas v0.17.0rc2 - RELEASE CANDIDATE 2 Message-ID: Hi, I'm pleased to announce the availability of the second release candidate of Pandas 0.17.0. Please try this RC and report any issues here: Pandas Issues We will be releasing officially on October 9. **RELEASE CANDIDATE 2** >From RC 1 we have: - compat for Python 3.5 - compat for matplotlib 1.5.0 - .convert_objects is now restored to the original, and is deprecated This is a major release from 0.16.2 and includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. We recommend that all users upgrade to this version. Highlights include: - Release the Global Interpreter Lock (GIL) on some cython operations, see here - Plotting methods are now available as attributes of the .plot accessor, see here - The sorting API has been revamped to remove some long-time inconsistencies, see here - Support for a datetime64[ns] with timezones as a first-class dtype, see here - The default for to_datetime will now be to raise when presented with unparseable formats, previously this would return the original input, see here - The default for dropna in HDFStore has changed to False, to store by default all rows even if they are all NaN, see here - Support for Series.dt.strftime to generate formatted strings for datetime-likes, see here - Development installed versions of pandas will now have PEP440 compliant version strings GH9518 - Development support for benchmarking with the Air Speed Velocity library GH8316 - Support for reading SAS xport files, see here - Removal of the automatic TimeSeries broadcasting, deprecated since 0.8.0, see here - Display format with plain text can optionally align with Unicode East Asian Width, see here - Compatibility with Python 3.5 GH11097 - Compatibility with matplotlib 1.5.0 GH11111 See the Whatsnew for much more information. Best way to get this is to install via conda from our development channel. Builds for osx-64,linux-64,win-64 for Python 2.7, Python 3.4, and Python 3.5 (for osx/linux) are all available. conda install pandas -c pandas Thanks to all who made this release happen. It is a very large release! Jeff -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthew.brett at gmail.com Sun Oct 4 01:35:42 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 3 Oct 2015 22:35:42 -0700 Subject: [Numpy-discussion] [pydata] ANN: pandas v0.17.0rc2 - RELEASE CANDIDATE 2 In-Reply-To: References: Message-ID: Hi, On Sat, Oct 3, 2015 at 2:33 PM, Jeff Reback wrote: > Hi, > > I'm pleased to announce the availability of the second release candidate of > Pandas 0.17.0. > Please try this RC and report any issues here: Pandas Issues > We will be releasing officially on October 9. > > **RELEASE CANDIDATE 2** > > From RC 1 we have: > > compat for Python 3.5 > compat for matplotlib 1.5.0 > .convert_objects is now restored to the original, and is deprecated > > This is a major release from 0.16.2 and includes a small number of API > changes, several new features, enhancements, and performance improvements > along with a large number of bug fixes. We recommend that all users upgrade > to this version. > > Highlights include: > > Release the Global Interpreter Lock (GIL) on some cython operations, see > here > Plotting methods are now available as attributes of the .plot accessor, see > here > The sorting API has been revamped to remove some long-time inconsistencies, > see here > Support for a datetime64[ns] with timezones as a first-class dtype, see here > The default for to_datetime will now be to raise when presented with > unparseable formats, previously this would return the original input, see > here > The default for dropna in HDFStore has changed to False, to store by default > all rows even if they are all NaN, see here > Support for Series.dt.strftime to generate formatted strings for > datetime-likes, see here > Development installed versions of pandas will now have PEP440 compliant > version strings GH9518 > Development support for benchmarking with the Air Speed Velocity library > GH8316 > Support for reading SAS xport files, see here > Removal of the automatic TimeSeries broadcasting, deprecated since 0.8.0, > see here > Display format with plain text can optionally align with Unicode East Asian > Width, see here > Compatibility with Python 3.5 GH11097 > Compatibility with matplotlib 1.5.0 GH11111 > > > See the Whatsnew for much more information. > > Best way to get this is to install via conda from our development channel. > Builds for osx-64,linux-64,win-64 for Python 2.7, Python 3.4, and Python 3.5 > (for osx/linux) are all available. > > conda install pandas -c pandas I built OSX wheels for Pythons 2.7, 3.4, 3.5. To test: pip install --pre -f http://wheels.scipy.org pandas There were some test failures for Python 3.3 - issue here: https://github.com/pydata/pandas/issues/11232 Cheers, Matthew From njs at pobox.com Sun Oct 4 03:24:16 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 4 Oct 2015 00:24:16 -0700 Subject: [Numpy-discussion] Let's move forward with the current governance document. In-Reply-To: References: Message-ID: On Fri, Oct 2, 2015 at 6:40 PM, Travis Oliphant wrote: > Hi everyone, > > After some further thought and spending quite a bit of time re-reading the > discussion on a few threads, I now believe that my request to be on the > steering council might be creating more trouble than it's worth. Nothing > matters to me more than seeing NumPy continue to grow and improve. > > So, I'm switching my position to supporting the adoption of the governance > model outlined and just contributing as I can outside the steering council. 
> The people on the steering council are committed to the success of NumPy and > will do a great job --- they already have in contributing to the community > over the past year(s). We can always revisit the question in a year if > difficulties arise with the model. Wow -- I can't imagine this was an easy decision, but I share your confidence that it will work out -- esp. since we'll still have you around to contribute wisdom when necessary :-). Thank you for your efforts -- they're very much appreciated. I believe this means all outstanding issues have been addressed, and that we can now declare the governance document to be ready. I'm avoiding using the word "finished" because of course we can continue to adapt it as necessary -- but from this point on I think we can do that using the mechanisms described in the document itself. I've just updated the governance document pull request with final formatting tweaks, in case anyone wants to review the current text or the (very minor and boring) changes that have been made since it was first posted: https://github.com/numpy/numpy/pull/6352/commits I think that PR is now ready to merge -- Chuck, perhaps you'd like to do the honors? -n -- Nathaniel J. Smith -- http://vorpus.org From charlesr.harris at gmail.com Sun Oct 4 19:14:05 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 4 Oct 2015 17:14:05 -0600 Subject: [Numpy-discussion] Let's move forward with the current governance document. In-Reply-To: References: Message-ID: On Sun, Oct 4, 2015 at 1:24 AM, Nathaniel Smith wrote: > On Fri, Oct 2, 2015 at 6:40 PM, Travis Oliphant > wrote: > > Hi everyone, > > > > After some further thought and spending quite a bit of time re-reading > the > > discussion on a few threads, I now believe that my request to be on the > > steering council might be creating more trouble than it's worth. > Nothing > > matters to me more than seeing NumPy continue to grow and improve. > > > > So, I'm switching my position to supporting the adoption of the > governance > > model outlined and just contributing as I can outside the steering > council. > > The people on the steering council are committed to the success of NumPy > and > > will do a great job --- they already have in contributing to the > community > > over the past year(s). We can always revisit the question in a year if > > difficulties arise with the model. > > Wow -- I can't imagine this was an easy decision, but I share your > confidence that it will work out -- esp. since we'll still have you > around to contribute wisdom when necessary :-). Thank you for your > efforts -- they're very much appreciated. > > I believe this means all outstanding issues have been addressed, and > that we can now declare the governance document to be ready. I'm > avoiding using the word "finished" because of course we can continue > to adapt it as necessary -- but from this point on I think we can do > that using the mechanisms described in the document itself. > > I've just updated the governance document pull request with final > formatting tweaks, in case anyone wants to review the current text or > the (very minor and boring) changes that have been made since it was > first posted: > > https://github.com/numpy/numpy/pull/6352/commits > > I think that PR is now ready to merge -- Chuck, perhaps you'd like to > do the honors? > I've added a few comments. It looks almost ready. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Sun Oct 4 20:43:40 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 4 Oct 2015 17:43:40 -0700 Subject: [Numpy-discussion] Let's move forward with the current governance document. In-Reply-To: References: Message-ID: On Sun, Oct 4, 2015 at 4:14 PM, Charles R Harris wrote: > > On Sun, Oct 4, 2015 at 1:24 AM, Nathaniel Smith wrote: [...] >> I've just updated the governance document pull request with final >> formatting tweaks, in case anyone wants to review the current text or >> the (very minor and boring) changes that have been made since it was >> first posted: >> >> https://github.com/numpy/numpy/pull/6352/commits >> >> I think that PR is now ready to merge -- Chuck, perhaps you'd like to >> do the honors? > > > I've added a few comments. It looks almost ready. Thanks Chuck! It looks like it's just wording tweaks / clarifications at this point, so nothing we need to discuss in detail on the list. If anyone wants to watch the sausage being made, then the link is above :-), and we'll continue the discussion in the PR unless anything substantive comes up. -n -- Nathaniel J. Smith -- http://vorpus.org From faltet at gmail.com Mon Oct 5 05:00:57 2015 From: faltet at gmail.com (Francesc Alted) Date: Mon, 5 Oct 2015 11:00:57 +0200 Subject: [Numpy-discussion] ANN: bcolz 0.11.3 released! Message-ID: ======================= Announcing bcolz 0.11.3 ======================= What's new ========== Implemented new feature (#255): bcolz.zeros() can create new ctables too, either empty or filled with zeros. (#256 @FrancescElies @FrancescAlted). Also, in previous, non announced versions (0.11.1 and 0.11.2), new dependencies were added and other fixes are there too. For a more detailed change log, see: https://github.com/Blosc/bcolz/blob/master/RELEASE_NOTES.rst What it is ========== *bcolz* provides columnar and compressed data containers that can live either on-disk or in-memory. Column storage allows for efficiently querying tables with a large number of columns. It also allows for cheap addition and removal of column. In addition, bcolz objects are compressed by default for reducing memory/disk I/O needs. The compression process is carried out internally by Blosc, an extremely fast meta-compressor that is optimized for binary data. Lastly, high-performance iterators (like ``iter()``, ``where()``) for querying the objects are provided. bcolz can use numexpr internally so as to accelerate many vector and query operations (although it can use pure NumPy for doing so too). numexpr optimizes the memory usage and use several cores for doing the computations, so it is blazing fast. Moreover, since the carray/ctable containers can be disk-based, and it is possible to use them for seamlessly performing out-of-memory computations. bcolz has minimal dependencies (NumPy), comes with an exhaustive test suite and fully supports both 32-bit and 64-bit platforms. Also, it is typically tested on both UNIX and Windows operating systems. Together, bcolz and the Blosc compressor, are finally fulfilling the promise of accelerating memory I/O, at least for some real scenarios: http://nbviewer.ipython.org/github/Blosc/movielens-bench/blob/master/querying-ep14.ipynb#Plots Other users of bcolz are Visualfabriq (http://www.visualfabriq.com/) the Blaze project (http://blaze.pydata.org/), Quantopian (https://www.quantopian.com/) and Scikit-Allel (https://github.com/cggh/scikit-allel) which you can read more about by pointing your browser at the links below. 
* Visualfabriq: * *bquery*, A query and aggregation framework for Bcolz: * https://github.com/visualfabriq/bquery * Blaze: * Notebooks showing Blaze + Pandas + BColz interaction: * http://nbviewer.ipython.org/url/blaze.pydata.org/notebooks/timings-csv.ipynb * http://nbviewer.ipython.org/url/blaze.pydata.org/notebooks/timings-bcolz.ipynb * Quantopian: * Using compressed data containers for faster backtesting at scale: * https://quantopian.github.io/talks/NeedForSpeed/slides.html * Scikit-Allel * Provides an alternative backend to work with compressed arrays * https://scikit-allel.readthedocs.org/en/latest/model/bcolz.html Installing ========== bcolz is in the PyPI repository, so installing it is easy:: $ pip install -U bcolz Resources ========= Visit the main bcolz site repository at: http://github.com/Blosc/bcolz Manual: http://bcolz.blosc.org Home of Blosc compressor: http://blosc.org User's mail list: bcolz at googlegroups.com http://groups.google.com/group/bcolz License is the new BSD: https://github.com/Blosc/bcolz/blob/master/LICENSES/BCOLZ.txt Release notes can be found in the Git repository: https://github.com/Blosc/bcolz/blob/master/RELEASE_NOTES.rst ---- **Enjoy data!** -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Mon Oct 5 16:50:25 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Mon, 5 Oct 2015 20:50:25 +0000 (UTC) Subject: [Numpy-discussion] Let's move forward with the current governance document. References: Message-ID: <1555232434465770574.717974sturla.molden-gmail.com@news.gmane.org> Nathaniel Smith wrote: > Thanks Chuck! It looks like it's just wording tweaks / clarifications > at this point, so nothing we need to discuss in detail on the list. If > anyone wants to watch the sausage being made, then the link is above > :-), and we'll continue the discussion in the PR unless anything > substantive comes up. Anyone has a veto? That reminds me of something that happened in the senate of Rome; they only had a small number of vetoers, sometimes only one or two, and even that caused havoc. I think it should be better clarified how much contribution is needed before someone can be considered to have veto rights. It would e.g. be ridiculous if I were to begin and veto stuff, as my contributions are minute... OMG. Sturla From njs at pobox.com Mon Oct 5 16:59:57 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 5 Oct 2015 13:59:57 -0700 Subject: [Numpy-discussion] Let's move forward with the current governance document. In-Reply-To: <1555232434465770574.717974sturla.molden-gmail.com@news.gmane.org> References: <1555232434465770574.717974sturla.molden-gmail.com@news.gmane.org> Message-ID: On Mon, Oct 5, 2015 at 1:50 PM, Sturla Molden wrote: > > Nathaniel Smith wrote: > > > Thanks Chuck! It looks like it's just wording tweaks / clarifications > > at this point, so nothing we need to discuss in detail on the list. If > > anyone wants to watch the sausage being made, then the link is above > > :-), and we'll continue the discussion in the PR unless anything > > substantive comes up. > > Anyone has a veto? That reminds me of something that happened in the senate > of Rome; they only had a small number of vetoers, sometimes only one or > two, and even that caused havoc. I think it should be better clarified how > much contribution is needed before someone can be considered to have veto > rights. It would e.g. 
be ridiculous if I were to begin and veto stuff, as > my contributions are minute... OMG. Are you planning to go around vetoing things for ridiculous reasons and causing havoc? If so, then notice that the steering council reserves the right to kick you out ;-). And if not, then who is it that you're worried about? -n -- Nathaniel J. Smith -- http://vorpus.org From sturla.molden at gmail.com Mon Oct 5 17:11:27 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Mon, 5 Oct 2015 21:11:27 +0000 (UTC) Subject: [Numpy-discussion] Let's move forward with the current governance document. References: <1555232434465770574.717974sturla.molden-gmail.com@news.gmane.org> Message-ID: <765286879465771974.903732sturla.molden-gmail.com@news.gmane.org> Nathaniel Smith wrote: > Are you planning to go around vetoing things I don't consider myself qualified. > for ridiculous reasons and causing havoc? That would be unpolite. > And if not, then who is it that you're worried about? I am not sure :) I just envisioned a Roman patron shouting veto or a US senator filibustering. Expulsion would be the appropriate recation, yes :-) Sturla From ben.v.root at gmail.com Mon Oct 5 17:16:09 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Mon, 5 Oct 2015 17:16:09 -0400 Subject: [Numpy-discussion] Let's move forward with the current governance document. In-Reply-To: <765286879465771974.903732sturla.molden-gmail.com@news.gmane.org> References: <1555232434465770574.717974sturla.molden-gmail.com@news.gmane.org> <765286879465771974.903732sturla.molden-gmail.com@news.gmane.org> Message-ID: There is the concept of consensus-driven development, which centers on veto rights. It does assume that all actors are driven by a common goal to improve the project. For example, the fact that we didn't have consensus back during the whole NA brouhaha was actually a good thing because IMHO including NA into NumPy would have hurt the community more than it would have helped. Ben Root On Mon, Oct 5, 2015 at 5:11 PM, Sturla Molden wrote: > Nathaniel Smith wrote: > > > Are you planning to go around vetoing things > > I don't consider myself qualified. > > > for ridiculous reasons and causing havoc? > > That would be unpolite. > > > And if not, then who is it that you're worried about? > > I am not sure :) > > I just envisioned a Roman patron shouting veto or a US senator > filibustering. Expulsion would be the appropriate recation, yes :-) > > > Sturla > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Mon Oct 5 17:52:23 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Mon, 05 Oct 2015 23:52:23 +0200 Subject: [Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver In-Reply-To: References: <560AA195.8030001@jyu.fi> <-4907756911279440734@unknownmsgid> <560B9B3B.7000106@jyu.fi> Message-ID: On 02/10/15 13:05, Da?id wrote: > Have you tried asking Python-dev for help with this? Hopefully they > would have some weight there. It seems both GCC dev and Apple (for GCD and Accelerate) has taken a similar stance on this. There is tiny set of functions the POSIX standard demands should work on both sides of a fork without exec, but OpenMP, GCD, BLAS or LAPAPCK are not included. 
As long as there is no bug, it is hard to convince them to follow Intel and allow fork-based multiprocessing. As it stands now, using Intel compilers and MKL is the only way to make this work, but Intel's development tools are not freeware. Sturla From jeffreback at gmail.com Mon Oct 5 18:00:38 2015 From: jeffreback at gmail.com (Jeff Reback) Date: Mon, 5 Oct 2015 18:00:38 -0400 Subject: [Numpy-discussion] [pydata] ANN: pandas v0.17.0rc2 - RELEASE CANDIDATE 2 In-Reply-To: <86269805-aeb7-4cff-97ba-ed0794600d76@googlegroups.com> References: <86269805-aeb7-4cff-97ba-ed0794600d76@googlegroups.com> Message-ID: <8FF3E86A-6FAC-4FA9-AC8D-99CCC00ED2C0@gmail.com> it should be exactly the same (they are going to release soon as well I think) - with an updated version > On Oct 5, 2015, at 2:25 PM, Big Stone wrote: > > hi, > > on pypi, pandas_datareader (0.1.1) is dated from April 10th. > > Is it up-to-date with pandas 0.17rc2 ? > >> On Sunday, October 4, 2015 at 7:36:26 AM UTC+2, Matthew Brett wrote: >> Hi, >> >> On Sat, Oct 3, 2015 at 2:33 PM, Jeff Reback wrote: >> > Hi, >> > >> > I'm pleased to announce the availability of the second release candidate of >> > Pandas 0.17.0. >> > Please try this RC and report any issues here: Pandas Issues >> > We will be releasing officially on October 9. >> > >> > **RELEASE CANDIDATE 2** >> > >> > From RC 1 we have: >> > >> > compat for Python 3.5 >> > compat for matplotlib 1.5.0 >> > .convert_objects is now restored to the original, and is deprecated >> > >> > This is a major release from 0.16.2 and includes a small number of API >> > changes, several new features, enhancements, and performance improvements >> > along with a large number of bug fixes. We recommend that all users upgrade >> > to this version. >> > >> > Highlights include: >> > >> > Release the Global Interpreter Lock (GIL) on some cython operations, see >> > here >> > Plotting methods are now available as attributes of the .plot accessor, see >> > here >> > The sorting API has been revamped to remove some long-time inconsistencies, >> > see here >> > Support for a datetime64[ns] with timezones as a first-class dtype, see here >> > The default for to_datetime will now be to raise when presented with >> > unparseable formats, previously this would return the original input, see >> > here >> > The default for dropna in HDFStore has changed to False, to store by default >> > all rows even if they are all NaN, see here >> > Support for Series.dt.strftime to generate formatted strings for >> > datetime-likes, see here >> > Development installed versions of pandas will now have PEP440 compliant >> > version strings GH9518 >> > Development support for benchmarking with the Air Speed Velocity library >> > GH8316 >> > Support for reading SAS xport files, see here >> > Removal of the automatic TimeSeries broadcasting, deprecated since 0.8.0, >> > see here >> > Display format with plain text can optionally align with Unicode East Asian >> > Width, see here >> > Compatibility with Python 3.5 GH11097 >> > Compatibility with matplotlib 1.5.0 GH11111 >> > >> > >> > See the Whatsnew for much more information. >> > >> > Best way to get this is to install via conda from our development channel. >> > Builds for osx-64,linux-64,win-64 for Python 2.7, Python 3.4, and Python 3.5 >> > (for osx/linux) are all available. >> > >> > conda install pandas -c pandas >> >> I built OSX wheels for Pythons 2.7, 3.4, 3.5. 
To test: >> >> pip install --pre -f http://wheels.scipy.org pandas >> >> There were some test failures for Python 3.3 - issue here: >> >> https://github.com/pydata/pandas/issues/11232 >> >> Cheers, >> >> Matthew > > -- > You received this message because you are subscribed to the Google Groups "PyData" group. > To unsubscribe from this group and stop receiving emails from it, send an email to pydata+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Oct 5 18:05:33 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 5 Oct 2015 15:05:33 -0700 Subject: [Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver In-Reply-To: References: <560AA195.8030001@jyu.fi> <-4907756911279440734@unknownmsgid> <560B9B3B.7000106@jyu.fi> Message-ID: On Mon, Oct 5, 2015 at 2:52 PM, Sturla Molden wrote: > On 02/10/15 13:05, Da?id wrote: > >> Have you tried asking Python-dev for help with this? Hopefully they >> would have some weight there. > > It seems both GCC dev and Apple (for GCD and Accelerate) has taken a similar > stance on this. There is tiny set of functions the POSIX standard demands > should work on both sides of a fork without exec, but OpenMP, GCD, BLAS or > LAPAPCK are not included. As long as there is no bug, it is hard to convince > them to follow Intel and allow fork-based multiprocessing. To be clear, the GCC devs are open to supporting fork+OpenMP in principle, they just aren't willing to do it in a way that risks breaking strict POSIX or OpenMP compatibility. But that isn't even the problem -- we have a patch that is strictly compatible with POSIX and OpenMP. The problem is that with the patch, the cases that would formerly have deadlocked instead leak some memory. This is not a big deal IMO for a variety of reasons (mostly that a one time leak per child process is tiny, esp. compared to the current situation with deadlocks), but it means the patch needs someone serious in the GCC community to take a look at it carefully, understand what the tradeoffs actually are, and make a judgement call. And so far we haven't convinced anyone to do this. -n -- Nathaniel J. Smith -- http://vorpus.org From njs at pobox.com Mon Oct 5 18:13:30 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 5 Oct 2015 15:13:30 -0700 Subject: [Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver In-Reply-To: References: <560AA195.8030001@jyu.fi> <-4907756911279440734@unknownmsgid> <560B9B3B.7000106@jyu.fi> Message-ID: On Mon, Oct 5, 2015 at 3:05 PM, Nathaniel Smith wrote: > On Mon, Oct 5, 2015 at 2:52 PM, Sturla Molden wrote: >> On 02/10/15 13:05, Da?id wrote: >> >>> Have you tried asking Python-dev for help with this? Hopefully they >>> would have some weight there. >> >> It seems both GCC dev and Apple (for GCD and Accelerate) has taken a similar >> stance on this. There is tiny set of functions the POSIX standard demands >> should work on both sides of a fork without exec, but OpenMP, GCD, BLAS or >> LAPAPCK are not included. As long as there is no bug, it is hard to convince >> them to follow Intel and allow fork-based multiprocessing. > > To be clear, the GCC devs are open to supporting fork+OpenMP in > principle, they just aren't willing to do it in a way that risks > breaking strict POSIX or OpenMP compatibility. But that isn't even the > problem -- we have a patch that is strictly compatible with POSIX and > OpenMP. 
The problem is that with the patch, the cases that would > formerly have deadlocked instead leak some memory. This is not a big > deal IMO for a variety of reasons (mostly that a one time leak per > child process is tiny, esp. compared to the current situation with > deadlocks), but it means the patch needs someone serious in the GCC > community to take a look at it carefully, understand what the > tradeoffs actually are, and make a judgement call. And so far we > haven't convinced anyone to do this. Since this discussion's come around again, I finally got curious enough to check the Intel OpenMP runtime's new(ish) open source releases, and it turns out that they leak memory in exactly the same way as the gcc patch :-). -- Nathaniel J. Smith -- http://vorpus.org From njs at pobox.com Mon Oct 5 18:26:17 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 5 Oct 2015 15:26:17 -0700 Subject: [Numpy-discussion] Should we drop support for "one file" compilation mode? Message-ID: Hi all, For a long time, NumPy has supported two different ways of being compiled: "Separate compilation" mode: like most C projects, each .c file gets compiled to a .o file, and then the .o files get linked together to make a shared library. (This has been the default since 1.8.0.) "One file" mode: first concatenate all the .c files together to make one monster .c file, and then compile that .c file to make a shared library. (This was the default before 1.8.0.) Supporting these two different build modes creates a drag on development progress; in particular Stefan recently ran into this in this experiments with porting parts of the NumPy internals to Cython: https://github.com/numpy/numpy/pull/6408 (I suspect the particular problem he's running into can be fixed b/c so far he only has one .pyx file, but I also suspect that it will be impossible to support "one file" mode once we have multiple .pyx files.) There are some rumors that "one file" mode might be needed on some obscure platform somewhere, or that it might be necessary for statically linking numpy into the CPython executable, but we can't continue supporting things forever based only on rumors. If all we can get are rumors, then eventually we have to risk breaking things just to force anyone who cares to actually show up and explain what they need so we can support it properly :-). Would anyone object if we dropped support for the "one file" mode, making "separate compilation" mandatory, e.g. in 1.11? -n -- Nathaniel J. Smith -- http://vorpus.org From solipsis at pitrou.net Mon Oct 5 18:35:58 2015 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 6 Oct 2015 00:35:58 +0200 Subject: [Numpy-discussion] Should we drop support for "one file" compilation mode? References: Message-ID: <20151006003558.3a824512@fsol> On Mon, 5 Oct 2015 15:26:17 -0700 Nathaniel Smith wrote: > Hi all, > > For a long time, NumPy has supported two different ways of being compiled: > > "Separate compilation" mode: like most C projects, each .c file gets > compiled to a .o file, and then the .o files get linked together to > make a shared library. (This has been the default since 1.8.0.) > > "One file" mode: first concatenate all the .c files together to make > one monster .c file, and then compile that .c file to make a shared > library. (This was the default before 1.8.0.) > [...] 
> > There are some rumors that "one file" mode might be needed on some > obscure platform somewhere, or that it might be necessary for > statically linking numpy into the CPython executable, but we can't > continue supporting things forever based only on rumors. If those rumors were true, CPython would not even be able to build (the _io module in 3.x is linked from several C object files, for example). Regards Antoine. From charlesr.harris at gmail.com Mon Oct 5 19:24:28 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 5 Oct 2015 17:24:28 -0600 Subject: [Numpy-discussion] NumPy Governance Document. Message-ID: Hi All The NumPy Governance Document has been merged and is now in effect. Thanks to all who contributed to the discussion. And a special thanks to Nathaniel, who wrote the draft and kept it up to date as the discussion progressed. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Oct 5 19:37:49 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 5 Oct 2015 16:37:49 -0700 Subject: [Numpy-discussion] NumPy Governance Document. In-Reply-To: References: Message-ID: > > > The NumPy Governance Document > has been merged and is now in effect. > whoo hoo! And a special thanks to Nathaniel, > Indeed -- and everyone else that put a lot of their time into the discussion. Looking forward to discussing technical issues again :-) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Oct 5 19:44:55 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 5 Oct 2015 16:44:55 -0700 Subject: [Numpy-discussion] Let's move forward with the current governance document. In-Reply-To: <765286879465771974.903732sturla.molden-gmail.com@news.gmane.org> References: <1555232434465770574.717974sturla.molden-gmail.com@news.gmane.org> <765286879465771974.903732sturla.molden-gmail.com@news.gmane.org> Message-ID: On Mon, Oct 5, 2015 at 2:11 PM, Sturla Molden wrote: > I just envisioned a Roman patron shouting veto or a US senator > filibustering. Expulsion would be the appropriate recation, yes :-) Oh if only the US Senate could expulse people! -sigh -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From phillip.m.feldman at gmail.com Tue Oct 6 00:08:13 2015 From: phillip.m.feldman at gmail.com (Phillip Feldman) Date: Mon, 5 Oct 2015 21:08:13 -0700 Subject: [Numpy-discussion] method to calculate the magnitude squared Message-ID: My apologies for the slow response; I was experiencing some technical problems with e-mail. In answer to Antoine's question, my main desire is for a numpy ndarray method, for the convenience, with a secondary goal being improved performance. I have added the function `magsq` to my library, but would like to access it as a method rather than as a function. I understand that I could create a class that inherits from NumPy and add a `magsq` method to that class, but this has a number of disadvantages. 
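For illustration, a minimal sketch of the kind of helper described above, written as a plain function (the name `magsq` follows the text; the exact signature and the use of asanyarray here are assumptions, not an actual implementation):

    import numpy as np

    def magsq(z):
        # Squared magnitude |z|**2, i.e. z * conj(z) kept as a real array.
        z = np.asanyarray(z)
        return z.real ** 2 + z.imag ** 2

Compared to np.abs(z)**2 this avoids computing a square root only to square it again, which is the main performance motivation; a method spelling (x.magsq()) would still need either an ndarray subclass or a change to NumPy itself.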
Phillip -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Oct 6 00:52:50 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 5 Oct 2015 22:52:50 -0600 Subject: [Numpy-discussion] Numpy 1.10.0 release Message-ID: Hi All, It is my pleasure to release NumPy 1.10.0. Files may be found at Sourceforge and pypi . This release is the result of 789 non-merge commits made by 160 developers over a period of a year and supports Python 2.6 - 2.7 and 3.2 - 3.5. NumPy 1.10.0 Release Notes ************************** This release supports Python 2.6 - 2.7 and 3.2 - 3.5. Highlights ========== * numpy.distutils now supports parallel compilation via the --parallel/-j argument passed to setup.py build * numpy.distutils now supports additional customization via site.cfg to control compilation parameters, i.e. runtime libraries, extra linking/compilation flags. * Addition of *np.linalg.multi_dot*: compute the dot product of two or more arrays in a single function call, while automatically selecting the fastest evaluation order. * The new function `np.stack` provides a general interface for joining a sequence of arrays along a new axis, complementing `np.concatenate` for joining along an existing axis. * Addition of `nanprod` to the set of nanfunctions. * Support for the '@' operator in Python 3.5. Dropped Support: * The _dotblas module has been removed. CBLAS Support is now in Multiarray. * The testcalcs.py file has been removed. * The polytemplate.py file has been removed. * npy_PyFile_Dup and npy_PyFile_DupClose have been removed from npy_3kcompat.h. * splitcmdline has been removed from numpy/distutils/exec_command.py. * try_run and get_output have been removed from numpy/distutils/command/config.py * The a._format attribute is no longer supported for array printing. * Keywords ``skiprows`` and ``missing`` removed from np.genfromtxt. * Keyword ``old_behavior`` removed from np.correlate. Future Changes: * In array comparisons like ``arr1 == arr2``, many corner cases involving strings or structured dtypes that used to return scalars now issue ``FutureWarning`` or ``DeprecationWarning``, and in the future will be change to either perform elementwise comparisons or raise an error. * The SafeEval class will be removed. * The alterdot and restoredot functions will be removed. See below for more details on these changes. Compatibility notes =================== numpy version string ~~~~~~~~~~~~~~~~~~~~ The numpy version string for development builds has been changed from ``x.y.z.dev-githash`` to ``x.y.z.dev0+githash`` (note the +) in order to comply with PEP 440. relaxed stride checking ~~~~~~~~~~~~~~~~~~~~~~~ NPY_RELAXED_STRIDE_CHECKING is now true by default. Concatenation of 1d arrays along any but ``axis=0`` raises ``IndexError`` ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Using axis != 0 has raised a DeprecationWarning since NumPy 1.7, it now raises an error. *np.ravel*, *np.diagonal* and *np.diag* now preserve subtypes ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ There was inconsistent behavior between *x.ravel()* and *np.ravel(x)*, as well as between *x.diagonal()* and *np.diagonal(x)*, with the methods preserving subtypes while the functions did not. This has been fixed and the functions now behave like the methods, preserving subtypes except in the case of matrices. Matrices are special cased for backward compatibility and still return 1-D arrays as before. 
If you need to preserve the matrix subtype, use the methods instead of the functions. *rollaxis* and *swapaxes* always return a view ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Previously, a view was returned except when no change was made in the order of the axes, in which case the input array was returned. A view is now returned in all cases. *nonzero* now returns base ndarrays ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Previously, an inconsistency existed between 1-D inputs (returning a base ndarray) and higher dimensional ones (which preserved subclasses). Behavior has been unified, and the return will now be a base ndarray. Subclasses can still override this behavior by providing their own *nonzero* method. C API ~~~~~ The changes to *swapaxes* also apply to the *PyArray_SwapAxes* C function, which now returns a view in all cases. The changes to *nonzero* also apply to the *PyArray_Nonzero* C function, which now returns a base ndarray in all cases. The dtype structure (PyArray_Descr) has a new member at the end to cache its hash value. This shouldn't affect any well-written applications. The change to the concatenation function DeprecationWarning also affects PyArray_ConcatenateArrays. recarray field return types ~~~~~~~~~~~~~~~~~~~~~~~~~~~ Previously the returned types for recarray fields accessed by attribute and by index were inconsistent, and fields of string type were returned as chararrays. Now, fields accessed by either attribute or indexing will return an ndarray for fields of non-structured type, and a recarray for fields of structured type. Notably, this affects recarrays containing strings with whitespace, as trailing whitespace is trimmed from chararrays but kept in ndarrays of string type. Also, the dtype.type of nested structured fields is now inherited. recarray views ~~~~~~~~~~~~~~ Viewing an ndarray as a recarray now automatically converts the dtype to np.record. See new record array documentation. Additionally, viewing a recarray with a non-structured dtype no longer converts the result's type to ndarray - the result will remain a recarray. 'out' keyword argument of ufuncs now accepts tuples of arrays ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ When using the 'out' keyword argument of a ufunc, a tuple of arrays, one per ufunc output, can be provided. For ufuncs with a single output a single array is also a valid 'out' keyword argument. Previously a single array could be provided in the 'out' keyword argument, and it would be used as the first output for ufuncs with multiple outputs. This usage is deprecated, and will result in a `DeprecationWarning` now and an error in the future. byte-array indices now raises an IndexError ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Indexing an ndarray using a byte-string in Python 3 now raises an IndexError instead of a ValueError. Masked arrays containing objects with arrays ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ For such (rare) masked arrays, getting a single masked item no longer returns a corrupted masked array, but a fully masked version of the item. Median warns and returns nan when invalid values are encountered ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Similar to mean, median and percentile now emit a RuntimeWarning and return `NaN` in slices where a `NaN` is present. To compute the median or percentile while ignoring invalid values use the new `nanmedian` or `nanpercentile` functions.
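For example (a small sketch; the values are chosen only for illustration)::

    import numpy as np
    a = np.array([[10., 7., np.nan],
                  [3.,  2., 1.]])
    np.median(a, axis=1)     # emits a RuntimeWarning; the row containing NaN gives nan
    np.nanmedian(a, axis=1)  # ignores the NaN -> array([ 8.5,  2. ])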
Functions available from numpy.ma.testutils have changed ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ All functions from numpy.testing were once available from numpy.ma.testutils but not all of them were redefined to work with masked arrays. Most of those functions have now been removed from numpy.ma.testutils with a small subset retained in order to preserve backward compatibility. In the long run this should help avoid mistaken use of the wrong functions, but it may cause import problems for some. New Features ============ Reading extra flags from site.cfg ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Previously customization of compilation of dependency libraries and numpy itself was only accomplishable via code changes in the distutils package. Now numpy.distutils reads in the following extra flags from each group of the *site.cfg*: * ``runtime_library_dirs/rpath``, sets runtime library directories to override ``LD_LIBRARY_PATH`` * ``extra_compile_args``, add extra flags to the compilation of sources * ``extra_link_args``, add extra flags when linking libraries This should, at least partially, complete user customization. *np.cbrt* to compute cube root for real floats ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *np.cbrt* wraps the C99 cube root function *cbrt*. Compared to *np.power(x, 1./3.)* it is well defined for negative real floats and a bit faster. numpy.distutils now allows parallel compilation ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ By passing *--parallel=n* or *-j n* to *setup.py build* the compilation of extensions is now performed in *n* parallel processes. The parallelization is limited to files within one extension so projects using Cython will not profit because it builds extensions from single files. *genfromtxt* has a new ``max_rows`` argument ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ A ``max_rows`` argument has been added to *genfromtxt* to limit the number of rows read in a single call. Using this functionality, it is possible to read in multiple arrays stored in a single file by making repeated calls to the function. New function *np.broadcast_to* for invoking array broadcasting ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *np.broadcast_to* manually broadcasts an array to a given shape according to numpy's broadcasting rules. The functionality is similar to broadcast_arrays, which in fact has been rewritten to use broadcast_to internally, but only a single array is necessary. New context manager *clear_and_catch_warnings* for testing warnings ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ When Python emits a warning, it records that this warning has been emitted in the module that caused the warning, in a module attribute ``__warningregistry__``. Once this has happened, it is not possible to emit the warning again, unless you clear the relevant entry in ``__warningregistry__``. This makes it hard and fragile to test warnings, because if your test comes after another that has already caused the warning, you will not be able to emit the warning or test it. The context manager ``clear_and_catch_warnings`` clears warnings from the module registry on entry and resets them on exit, meaning that warnings can be re-raised. *cov* has new ``fweights`` and ``aweights`` arguments ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The ``fweights`` and ``aweights`` arguments add new functionality to covariance calculations by applying two types of weighting to observation vectors.
An array of ``fweights`` indicates the number of repeats of each observation vector, and an array of ``aweights`` provides their relative importance or probability. Support for the '@' operator in Python 3.5+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Python 3.5 adds support for a matrix multiplication operator '@' proposed in PEP465. Preliminary support for that has been implemented, and an equivalent function ``matmul`` has also been added for testing purposes and use in earlier Python versions. The function is preliminary and the order and number of its optional arguments can be expected to change. New argument ``norm`` to fft functions ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The default normalization has the direct transforms unscaled and the inverse transforms are scaled by :math:`1/n`. It is possible to obtain unitary transforms by setting the keyword argument ``norm`` to ``"ortho"`` (default is `None`) so that both direct and inverse transforms will be scaled by :math:`1/\\sqrt{n}`. Improvements ============ *np.digitize* using binary search ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *np.digitize* is now implemented in terms of *np.searchsorted*. This means that a binary search is used to bin the values, which scales much better for larger number of bins than the previous linear search. It also removes the requirement for the input array to be 1-dimensional. *np.poly* now casts integer inputs to float ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *np.poly* will now cast 1-dimensional input arrays of integer type to double precision floating point, to prevent integer overflow when computing the monic polynomial. It is still possible to obtain higher precision results by passing in an array of object type, filled e.g. with Python ints. *np.interp* can now be used with periodic functions ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *np.interp* now has a new parameter *period* that supplies the period of the input data *xp*. In such case, the input data is properly normalized to the given period and one end point is added to each extremity of *xp* in order to close the previous and the next period cycles, resulting in the correct interpolation behavior. *np.pad* supports more input types for ``pad_width`` and ``constant_values`` ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ``constant_values`` parameters now accepts NumPy arrays and float values. NumPy arrays are supported as input for ``pad_width``, and an exception is raised if its values are not of integral type. *np.argmax* and *np.argmin* now support an ``out`` argument ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The ``out`` parameter was added to *np.argmax* and *np.argmin* for consistency with *ndarray.argmax* and *ndarray.argmin*. The new parameter behaves exactly as it does in those methods. More system C99 complex functions detected and used ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ All of the functions ``in complex.h`` are now detected. There are new fallback implementations of the following functions. * npy_ctan, * npy_cacos, npy_casin, npy_catan * npy_ccosh, npy_csinh, npy_ctanh, * npy_cacosh, npy_casinh, npy_catanh As a result of these improvements, there will be some small changes in returned values, especially for corner cases. 
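As a small illustration of the new ``period`` argument of *np.interp* described above (a sketch; the sample values are chosen only for demonstration)::

    import numpy as np
    xp = np.array([0., 90., 180., 270.])   # angular sample points, in degrees
    fp = np.array([0., 1., 0., -1.])
    # With period=360 the data wrap around, so a query between 270 and 360
    # interpolates between fp[-1] and fp[0] instead of clamping to the last value:
    np.interp(315., xp, fp, period=360.)   # -> -0.5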
*np.loadtxt* support for the strings produced by the ``float.hex`` method ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The strings produced by ``float.hex`` look like ``0x1.921fb54442d18p+1``, so this is not the hex used to represent unsigned integer types. *np.isclose* properly handles minimal values of integer dtypes ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In order to properly handle minimal values of integer types, *np.isclose* will now cast to the float dtype during comparisons. This aligns its behavior with what was provided by *np.allclose*. *np.allclose* uses *np.isclose* internally. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *np.allclose* now uses *np.isclose* internally and inherits the ability to compare NaNs as equal by setting ``equal_nan=True``. Subclasses, such as *np.ma.MaskedArray*, are also preserved now. *np.genfromtxt* now handles large integers correctly ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *np.genfromtxt* now correctly handles integers larger than ``2**31-1`` on 32-bit systems and larger than ``2**63-1`` on 64-bit systems (it previously crashed with an ``OverflowError`` in these cases). Integers larger than ``2**63-1`` are converted to floating-point values. *np.load*, *np.save* have pickle backward compatibility flags ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The functions *np.load* and *np.save* have additional keyword arguments for controlling backward compatibility of pickled Python objects. This enables Numpy on Python 3 to load npy files containing object arrays that were generated on Python 2. MaskedArray support for more complicated base classes ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Built-in assumptions that the baseclass behaved like a plain array are being removed. In particular, setting and getting elements and ranges will respect baseclass overrides of ``__setitem__`` and ``__getitem__``, and arithmetic will respect overrides of ``__add__``, ``__sub__``, etc. Changes ======= dotblas functionality moved to multiarray ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The cblas versions of dot, inner, and vdot have been integrated into the multiarray module. In particular, vdot is now a multiarray function, which it was not before. stricter check of gufunc signature compliance ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Inputs to generalized universal functions are now more strictly checked against the function's signature: all core dimensions are now required to be present in input arrays; core dimensions with the same label must have the exact same size; and output core dimensions must be specified, either by a same label input core dimension or by a passed-in output array. views returned from *np.einsum* are writeable ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Views returned by *np.einsum* will now be writeable whenever the input array is writeable. *np.argmin* skips NaT values ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *np.argmin* now skips NaT values in datetime64 and timedelta64 arrays, making it consistent with *np.min*, *np.argmax* and *np.max*. Deprecations ============ Array comparisons involving strings or structured dtypes ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Normally, comparison operations on arrays perform elementwise comparisons and return arrays of booleans. But in some corner cases, especially involving strings or structured dtypes, NumPy has historically returned a scalar instead.
For example:: ### Current behaviour np.arange(2) == "foo" # -> False np.arange(2) < "foo" # -> True on Python 2, error on Python 3 np.ones(2, dtype="i4,i4") == np.ones(2, dtype="i4,i4,i4") # -> False Continuing work started in 1.9, in 1.10 these comparisons will now raise ``FutureWarning`` or ``DeprecationWarning``, and in the future they will be modified to behave more consistently with other comparison operations, e.g.:: ### Future behaviour np.arange(2) == "foo" # -> array([False, False]) np.arange(2) < "foo" # -> error, strings and numbers are not orderable np.ones(2, dtype="i4,i4") == np.ones(2, dtype="i4,i4,i4") # -> [False, False] SafeEval ~~~~~~~~ The SafeEval class in numpy/lib/utils.py is deprecated and will be removed in the next release. alterdot, restoredot ~~~~~~~~~~~~~~~~~~~~ The alterdot and restoredot functions no longer do anything, and are deprecated. pkgload, PackageLoader ~~~~~~~~~~~~~~~~~~~~~~ These ways of loading packages are now deprecated. bias, ddof arguments to corrcoef ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The values for the ``bias`` and ``ddof`` arguments to the ``corrcoef`` function canceled in the division implied by the correlation coefficient and so had no effect on the returned values. We now deprecate these arguments to ``corrcoef`` and the masked array version ``ma.corrcoef``. Because we are deprecating the ``bias`` argument to ``ma.corrcoef``, we also deprecate the use of the ``allow_masked`` argument as a positional argument, as its position will change with the removal of ``bias``. ``allow_masked`` will in due course become a keyword-only argument. dtype string representation changes ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Since 1.6, creating a dtype object from its string representation, e.g. ``'f4'``, would issue a deprecation warning if the size did not correspond to an existing type, and default to creating a dtype of the default size for the type. Starting with this release, this will now raise a ``TypeError``. The only exception is object dtypes, where both ``'O4'`` and ``'O8'`` will still issue a deprecation warning. This platform-dependent representation will raise an error in the next release. In preparation for this upcoming change, the string representation of an object dtype, i.e. ``np.dtype(object).str``, no longer includes the item size, i.e. will return ``'|O'`` instead of ``'|O4'`` or ``'|O8'`` as before. Authors ======= This release contains work by the following people who contributed at least one patch to this release. The names are in alphabetical order by first name. Names followed by a "+" contributed a patch for the first time. Abdul Muneer+ Adam Williams+ Alan Briolat+ Alex Griffing Alex Willmer+ Alexander Belopolsky Alistair Muldal+ Allan Haldane+ Amir Sarabadani+ Andrea Bedini+ Andrew Dawson+ Andrew Nelson+ Antoine Pitrou+ Anton Ovchinnikov+ Antony Lee+ Behzad Nouri+ Bertrand+ Blake Griffith Bob Poekert+ Brian Kearns+ CJ Carey Carl Kleffner+ Chander G+ Charles Harris Chris Hogan+ Chris Kerr Chris Lamb+ Chris Laumann+ Christian Brodbeck+ Christian Brueffer Christoph Gohlke Cimarron Mittelsteadt Daniel da Silva Darsh P. Ranjan+ David Cournapeau David M Fobes+ David Powell+ Didrik Pinte+ Dimas Abreu Dutra Dmitry Zagorny+ Eric Firing Eric Hunsberger+ Eric Martin+ Eric Moore Eric O. LEBIGOT (EOL)+ Erik M. Bray Ernest N. Mamikonyan+ Fei Liu+ Fran?ois Magimel+ Gabor Kovacs+ Gabriel-p+ Garrett-R+ George Castillo+ Gerrit Holl+ Gert-Ludwig Ingold+ Glen Mabey+ Graham Christensen+ Greg Thomsen+ Gregory R. 
Lee+ Helder Cesar+ Helder Oliveira+ Henning Dickten+ Ian Henriksen+ Jaime Fernandez James Camel+ James Salter+ Jan Schl?ter+ Jarl Haggerty+ Jay Bourque Joel Nothman+ John Kirkham+ John Tyree+ Joris Van den Bossche+ Joseph Martinot-Lagarde Josh Warner (Mac) Juan Luis Cano Rodr?guez Julian Taylor Kreiswolke+ Lars Buitinck Leonardo Donelli+ Lev Abalkin Lev Levitsky+ Malik Woods+ Maniteja Nandana+ Marshall Farrier+ Marten van Kerkwijk Martin Spacek Martin Thoma+ Masud Rahman+ Matt Newville+ Mattheus Ueckermann+ Matthew Brett Matthew Craig+ Michael Currie+ Michael Droettboom Michele Vallisneri+ Mortada Mehyar+ Nate Jensen+ Nathaniel J. Smith Nick Papior Andersen+ Nick Papior+ Nils Werner Oliver Eberle+ Patrick Peglar+ Paul Jacobson Pauli Virtanen Peter Iannucci+ Ralf Gommers Richard Barnes+ Ritta Narita+ Robert Johansson+ Robert LU+ Robert McGibbon+ Ryan Blakemore+ Ryan Nelson+ Sandro Tosi Saullo Giovani+ Sebastian Berg Sebastien Gouezel+ Simon Gibbons+ Simon Guillot+ Stefan Eng+ Stefan Otte+ Stefan van der Walt Stephan Hoyer+ Stuart Berg+ Sturla Molden+ Thomas A Caswell+ Thomas Robitaille Tim D. Smith+ Tom Krauss+ Tom Poole+ Toon Verstraelen+ Ulrich Seidl Valentin Haenel Vraj Mohan+ Warren Weckesser Wendell Smith+ Yaroslav Halchenko Yotam Doron Yousef Hamza+ Yuval Langer+ Yuxiang Wang+ Zbigniew J?drzejewski-Szmek+ cel+ chebee7i+ empeeu+ endolith hannaro+ immerrr jmrosen155+ jnothman kanhua+ mbyt+ mlai+ styr+ tdihp+ wim glenn+ yolanda15+ ?smund Hjulstad+ Enjjoy, Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidmenhur at gmail.com Tue Oct 6 04:14:15 2015 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Tue, 6 Oct 2015 10:14:15 +0200 Subject: [Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver In-Reply-To: References: <560AA195.8030001@jyu.fi> <-4907756911279440734@unknownmsgid> <560B9B3B.7000106@jyu.fi> Message-ID: On 30 September 2015 at 18:20, Nathaniel Smith wrote: > - parallel code in general is not very composable. If someone is calling a > numpy operation from one thread, great, transparently using multiple > threads internally is a win. If they're exploiting some higher-level > structure in their problem to break it into pieces and process each in > parallel, and then using numpy on each piece, then numpy spawning threads > internally will probably destroy performance. And numpy is too low-level to > know which case it's in. This problem exists to some extent already with > multi-threaded BLAS, so people use various BLAS-specific knobs to manage it > in ad hoc ways, but this doesn't scale. > One idea: what about creating a "parallel numpy"? There are a few algorithms that can benefit from parallelisation. This library would mimic Numpy's signature, and the user would be responsible for choosing the single threaded or the parallel one by just changing np.function(x, y) to pnp.function(x, y) If that were deemed a good one, what would be the best parallelisation scheme? OpenMP? Threads? -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Tue Oct 6 04:20:18 2015 From: shoyer at gmail.com (Stephan Hoyer) Date: Tue, 6 Oct 2015 01:20:18 -0700 Subject: [Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver In-Reply-To: References: <560AA195.8030001@jyu.fi> <-4907756911279440734@unknownmsgid> <560B9B3B.7000106@jyu.fi> Message-ID: On Tue, Oct 6, 2015 at 1:14 AM, Da?id wrote: > One idea: what about creating a "parallel numpy"? 
There are a few > algorithms that can benefit from parallelisation. This library would mimic > Numpy's signature, and the user would be responsible for choosing the > single threaded or the parallel one by just changing np.function(x, y) to > pnp.function(x, y) > I would recommend taking a look at dask.array [1], which in many cases works exactly like a parallel NumPy, though it also does lazy and out-of-core computation. It's a new project, but it's remarkably mature -- we use it as an alternative array backend (to numpy) in xray, and it's also being used by scikit-image. [1] http://dask.pydata.org/en/latest/array.html > If that were deemed a good one, what would be the best parallelisation > scheme? OpenMP? Threads? > Dask uses threads. That works pretty well as long as all the hard work is calling into something that releases the GIL (which includes NumPy, of course). -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Tue Oct 6 06:00:30 2015 From: cournape at gmail.com (David Cournapeau) Date: Tue, 6 Oct 2015 11:00:30 +0100 Subject: [Numpy-discussion] Should we drop support for "one file" compilation mode? In-Reply-To: References: Message-ID: On Mon, Oct 5, 2015 at 11:26 PM, Nathaniel Smith wrote: > Hi all, > > For a long time, NumPy has supported two different ways of being compiled: > > "Separate compilation" mode: like most C projects, each .c file gets > compiled to a .o file, and then the .o files get linked together to > make a shared library. (This has been the default since 1.8.0.) > > "One file" mode: first concatenate all the .c files together to make > one monster .c file, and then compile that .c file to make a shared > library. (This was the default before 1.8.0.) > > Supporting these two different build modes creates a drag on > development progress; in particular Stefan recently ran into this in > this experiments with porting parts of the NumPy internals to Cython: > https://github.com/numpy/numpy/pull/6408 > (I suspect the particular problem he's running into can be fixed b/c > so far he only has one .pyx file, but I also suspect that it will be > impossible to support "one file" mode once we have multiple .pyx > files.) > > There are some rumors that "one file" mode might be needed on some > obscure platform somewhere, or that it might be necessary for > statically linking numpy into the CPython executable, but we can't > continue supporting things forever based only on rumors. If all we can > get are rumors, then eventually we have to risk breaking things just > to force anyone who cares to actually show up and explain what they > need so we can support it properly :-). > Assuming one of the rumour is related to some comments I made some time (years ?) earlier, the context was the ability to hide exported symbols. As you know, the issue is not to build extensions w/ multiple compilation units, but sharing functionalities between them without sharing them outside the extension. I am just reiterating that point so that we all discuss under the right context :) I also agree the current situation is not sustainable -- as we discussed privately before, cythonizing numpy.core is made quite more complicated by this. I have myself quite a few issues w/ cythonizing the other parts of umath. I would also like to support the static link better than we do now (do we know some static link users we can contact to validate our approach ?) 
Currently, what we have in numpy core is the following: numpy.core.multiarray -> compilation units in numpy/core/src/multiarray/ + statically link npymath numpy.core.umath -> compilation units in numpy/core/src/umath + statically link npymath/npysort + some shenanigans to use things in numpy.core.multiarray I would suggest to have a more layered approach, to enable both 'normal' build and static build, without polluting the public namespace too much. This is an approach followed by most large libraries (e.g. MKL), and is fairly flexible. Concretely, we could start by putting more common functionalities (aka the 'core' library) into its own static library. The API would be considered private to numpy (no stability guaranteed outside numpy), and every exported symbol from that library would be decorated appropriately to avoid potential clashes (e.g. '_npy_internal_'). FWIW, that has always been my intention to go toward this when I split up multiarray/umath into multiple .c files and extracted out npymath. cheers, David -------------- next part -------------- An HTML attachment was scrubbed... URL: From juha.jeronen at jyu.fi Tue Oct 6 07:06:42 2015 From: juha.jeronen at jyu.fi (Juha Jeronen) Date: Tue, 6 Oct 2015 14:06:42 +0300 Subject: [Numpy-discussion] Cython-based OpenMP-accelerated quartic polynomial solver In-Reply-To: References: Message-ID: <5613AB42.8050800@jyu.fi> Hi all, Thanks Jon and Ryan for the suggestions. Both asanyarray() or atleast_1d() sound good. There's the technical detail that Cython needs to know the object type (e.g. in the parameters to quartic_z_array()), so I think atleast_1d() may be safer. I've taken this option for now. This simplified the code somewhat. The *_scalar() functions have been removed, as they are no longer needed. The *_array() versions have been renamed, removing the _array suffix. The return values have stayed as they were - if there is only one problem to solve, the singleton dimension is dropped, and otherwise a 2D array is returned. (The exception is linear(), which does not need the second dimension, since the solution for each problem is unique. It will return a scalar in the case of a single problem, or a 1D array in the case of multiple problems.) I've pushed the new version. It's available from the same repository: https://yousource.it.jyu.fi/jjrandom2/miniprojects/trees/master/misc/polysolve_for_numpy Other comments? The next step? -J P.S.: I'm not sure how exact the object type must be - i.e. whether Cython wants to know that the object stores its data somewhat like an ndarray, or that its C API exactly matches that of an ndarray. In Cython there are some surprising details regarding this kind of things, such as the ctypedef'ing of scalar types. For example, see the comments about complex support near the beginning of polysolve2.pyx, and the commented-out SSE2 intrinsics experiment in https://yousource.it.jyu.fi/jjrandom2/miniprojects/blobs/master/misc/tworods/tworods.pyx . In the latter, it was somewhat tricky to get Cython to recognize __m128d - turns out it's close enough for Cython to know that it behaves like a double, although it actually contains two doubles. (Note that these ctypedefs never end up in the generated C code; Cython uses them as extra context information when mapping the Python code into C.) 
(And no need to worry, I'm not planning to put any SSE into polysolve2 :) ) On 02.10.2015 17:18, Ryan May wrote: > numpy.asanyarray() would be my preferred goto, as it will leave > subclasses of ndarray untouched; asarray() and atleast_1d() force > ndarray. It's nice to do the whenever possible. > > Ryan > > On Fri, Oct 2, 2015 at 6:52 AM, Slavin, Jonathan > > wrote: > > ? Personally I like atleast_1d, which will convert a scalar into a > 1d array but will leave arrays untouched (i.e. won't change the > dimensions. Not sure what the advantages/disadvantages are > relative to asarray. > > Jon? > > > On Fri, Oct 2, 2015 at 7:05 AM, > > wrote: > > From: Juha Jeronen > > To: Discussion of Numerical Python > > Cc: > Date: Fri, 2 Oct 2015 13:31:47 +0300 > Subject: Re: [Numpy-discussion] Cython-based > OpenMP-accelerated quartic polynomial solver > > On 02.10.2015 13:07, Da?id wrote: >> >> On 2 October 2015 at 11:58, Juha Jeronen > > wrote: >> >> >> >> First version done and uploaded: >> >> https://yousource.it.jyu.fi/jjrandom2/miniprojects/trees/master/misc/polysolve_for_numpy >> >> >> Small comment: now you are checking if the input is a scalar >> or a ndarray, but it should also accept any array-like. If I >> pass a list, I expect it to work, internally converting it >> into an array. > > Good catch. > > Is there an official way to test for array-likes? Or should I > always convert with asarray()? Or something else? > > > -J > > > > > > -- > ________________________________________________________ > Jonathan D. Slavin Harvard-Smithsonian CfA > jslavin at cfa.harvard.edu 60 > Garden Street, MS 83 > phone: (617) 496-7981 Cambridge, > MA 02138-1516 > cell: (781) 363-0035 USA > ________________________________________________________ > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > -- > Ryan May > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Tue Oct 6 07:07:17 2015 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 6 Oct 2015 13:07:17 +0200 Subject: [Numpy-discussion] Should we drop support for "one file" compilation mode? References: Message-ID: <20151006130717.5304c762@fsol> On Tue, 6 Oct 2015 11:00:30 +0100 David Cournapeau wrote: > > Assuming one of the rumour is related to some comments I made some time > (years ?) earlier, the context was the ability to hide exported symbols. As > you know, the issue is not to build extensions w/ multiple compilation > units, but sharing functionalities between them without sharing them > outside the extension. Can't you use the visibility attribute with gcc for this? Other Unix compilers probably provide something similar. The issue doesn't exist on Windows by construction. https://gcc.gnu.org/onlinedocs/gcc-5.2.0/gcc/Function-Attributes.html#Function-Attributes By the way, external packages may reuse the npy_* functions, so I would like them not the be hidden :-) Regards Antoine. From ndbecker2 at gmail.com Tue Oct 6 07:31:09 2015 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 06 Oct 2015 07:31:09 -0400 Subject: [Numpy-discussion] Numpy 1.10.0 release References: Message-ID: lots of warning with openblas python setup.py build Running from numpy source directory. 
/usr/lib64/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'test_suite' warnings.warn(msg) blas_opt_info: blas_mkl_info: libraries mkl,vml,guide not found in ['/usr/local/lib64', '/usr/local/lib', '/usr/lib64', '/usr/lib', '/usr/lib/'] NOT AVAILABLE openblas_info: /home/nbecker/numpy-1.10.0/numpy/distutils/system_info.py:635: UserWarning: Specified path is invalid. warnings.warn('Specified path %s is invalid.' % d) libraries openblas not found in [] Runtime library openblas was not found. Ignoring FOUND: libraries = ['openblas', 'openblas'] library_dirs = ['/usr/lib64'] language = c define_macros = [('HAVE_CBLAS', None)] FOUND: libraries = ['openblas', 'openblas'] extra_compile_args = ['-march=native -O3'] library_dirs = ['/usr/lib64'] language = c define_macros = [('HAVE_CBLAS', None)] ... cc /tmp/tmpoQq7Hu/tmp/tmpoQq7Hu/source.o -L/usr/lib64 -lopenblas -o /tmp/tmpoQq7Hu/a.out libraries openblas not found in [] Runtime library openblas was not found. Ignoring FOUND: libraries = ['openblas', 'openblas'] library_dirs = ['/usr/lib64'] language = c define_macros = [('HAVE_CBLAS', None)] FOUND: libraries = ['openblas', 'openblas'] extra_compile_args = ['-march=native -O3'] library_dirs = ['/usr/lib64'] language = c define_macros = [('HAVE_CBLAS', None)] ... But I think openblas was used, despite the warnings, because later on I see -lopenblas in the link step. From ndbecker2 at gmail.com Tue Oct 6 07:45:30 2015 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 06 Oct 2015 07:45:30 -0400 Subject: [Numpy-discussion] Numpy 1.10.0 release References: Message-ID: Are extra_compile_args actually used in all compile steps? I set: extra_compile_args = -march=native -O3 On some compile steps, it echos: compile options: '-DHAVE_CBLAS -Inumpy/core/include -Ibuild/src.linux- x86_64-2.7/numpy/core/include/numpy -Inumpy/core/src/private - Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath - Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort - I/usr/include/python2.7 -Ibuild/src.linux-x86_64-2.7/numpy/core/src/private -Ibuild/src.linux-x86_64-2.7/numpy/core/src/private -Ibuild/src.linux- x86_64-2.7/numpy/core/src/private -c' extra options: '-march=native -O3' But on at least one it doesn't: building 'numpy.random.mtrand' extension compiling C sources C compiler: gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall - Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack- protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 - mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall - Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack- protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 - mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -march=native -O3 -fPIC creating build/temp.linux-x86_64-2.7/numpy/random creating build/temp.linux-x86_64-2.7/numpy/random/mtrand compile options: '-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE=1 - D_LARGEFILE64_SOURCE=1 -Inumpy/core/include -Ibuild/src.linux- x86_64-2.7/numpy/core/include/numpy -Inumpy/core/src/private - Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath - Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort - I/usr/include/python2.7 -Ibuild/src.linux-x86_64-2.7/numpy/core/src/private -Ibuild/src.linux-x86_64-2.7/numpy/core/src/private -Ibuild/src.linux- x86_64-2.7/numpy/core/src/private -c' Building with: CFLAGS='-march=native -O3' python setup.py build Does seem to use my CFLAGS, as it always did on previous 
numpy versions. It seems a bit difficult to verify what the exact compile/link steps were. Is there a verbose flag? From cournape at gmail.com Tue Oct 6 07:46:18 2015 From: cournape at gmail.com (David Cournapeau) Date: Tue, 6 Oct 2015 12:46:18 +0100 Subject: [Numpy-discussion] Should we drop support for "one file" compilation mode? In-Reply-To: <20151006130717.5304c762@fsol> References: <20151006130717.5304c762@fsol> Message-ID: On Tue, Oct 6, 2015 at 12:07 PM, Antoine Pitrou wrote: > On Tue, 6 Oct 2015 11:00:30 +0100 > David Cournapeau wrote: > > > > Assuming one of the rumour is related to some comments I made some time > > (years ?) earlier, the context was the ability to hide exported symbols. > As > > you know, the issue is not to build extensions w/ multiple compilation > > units, but sharing functionalities between them without sharing them > > outside the extension. > > Can't you use the visibility attribute with gcc for this? > We do that already for gcc, I think the question was whether every platform supported this or not (and whether we should care). > Other Unix compilers probably provide something similar. The issue > doesn't exist on Windows by construction. > > > https://gcc.gnu.org/onlinedocs/gcc-5.2.0/gcc/Function-Attributes.html#Function-Attributes > > By the way, external packages may reuse the npy_* functions, so I would > like them not the be hidden :-) > The npy_ functions in npymath were designed to be exported. Those would stay that way. David > > Regards > > Antoine. > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Tue Oct 6 07:53:07 2015 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 06 Oct 2015 07:53:07 -0400 Subject: [Numpy-discussion] Numpy 1.10.0 release References: Message-ID: 1 test failure: FAIL: test_blasdot.test_blasdot_used ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/nbecker/.local/lib/python2.7/site- packages/numpy/testing/decorators.py", line 146, in skipper_func return f(*args, **kwargs) File "/home/nbecker/.local/lib/python2.7/site- packages/numpy/core/tests/test_blasdot.py", line 31, in test_blasdot_used assert_(dot is _dotblas.dot) File "/home/nbecker/.local/lib/python2.7/site- packages/numpy/testing/utils.py", line 53, in assert_ raise AssertionError(smsg) AssertionError From sebastian at sipsolutions.net Tue Oct 6 08:07:23 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 06 Oct 2015 14:07:23 +0200 Subject: [Numpy-discussion] Numpy 1.10.0 release In-Reply-To: References: Message-ID: <1444133243.2029.13.camel@sipsolutions.net> On Di, 2015-10-06 at 07:53 -0400, Neal Becker wrote: > 1 test failure: > > FAIL: test_blasdot.test_blasdot_used > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/usr/lib/python2.7/site-packages/nose/case.py", line 197, in runTest > self.test(*self.arg) > File "/home/nbecker/.local/lib/python2.7/site- > packages/numpy/testing/decorators.py", line 146, in skipper_func > return f(*args, **kwargs) > File "/home/nbecker/.local/lib/python2.7/site- > packages/numpy/core/tests/test_blasdot.py", line 31, in test_blasdot_used > assert_(dot 
is _dotblas.dot) > File "/home/nbecker/.local/lib/python2.7/site- > packages/numpy/testing/utils.py", line 53, in assert_ > raise AssertionError(smsg) > AssertionError > My first guess would be, that it sounds like you got some old test files flying around. Can you try cleaning up everything and reinstall? It can happen that old installed test files survive the new version. And most of all, thanks a lot Chuck! - Sebastian > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From davidmenhur at gmail.com Tue Oct 6 08:08:36 2015 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Tue, 6 Oct 2015 14:08:36 +0200 Subject: [Numpy-discussion] Numpy 1.10.0 release In-Reply-To: References: Message-ID: I don't get any failures on Fedora 22. I have installed it with pip, setting my CFLAGS to "-march=core-avx-i -O2 -pipe -mtune=native" and linking against openblas. With the new Numpy, Scipy full suite shows two errors, I am sorry I didn't think of running that in the RC phase: ====================================================================== FAIL: test_weighting (test_stats.TestHistogram) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/david/.local/virtualenv/py27/lib/python2.7/site-packages/scipy/stats/tests/test_stats.py", line 892, in test_weighting decimal=2) File "/home/david/.local/virtualenv/py27/lib/python2.7/site-packages/numpy/testing/utils.py", line 886, in assert_array_almost_equal precision=decimal) File "/home/david/.local/virtualenv/py27/lib/python2.7/site-packages/numpy/testing/utils.py", line 708, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not almost equal to 2 decimals (mismatch 40.0%) x: array([ 4. , 0. , 4.5, -0.9, 0. , 0.3, 110.2, 0. , 0. , 42. ]) y: array([ 4. , 0. , 4.5, -0.9, 0.3, 0. , 7. , 103.2, 0. , 42. ]) ====================================================================== FAIL: test_nanmedian_all_axis (test_stats.TestNanFunc) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/david/.local/virtualenv/py27/lib/python2.7/site-packages/scipy/stats/tests/test_stats.py", line 226, in test_nanmedian_all_axis assert_equal(len(w), 4) File "/home/david/.local/virtualenv/py27/lib/python2.7/site-packages/numpy/testing/utils.py", line 354, in assert_equal raise AssertionError(msg) AssertionError: Items are not equal: ACTUAL: 1 DESIRED: 4 I am almost sure these errors weren't there before. 
On 6 October 2015 at 13:53, Neal Becker wrote: > 1 test failure: > > FAIL: test_blasdot.test_blasdot_used > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/usr/lib/python2.7/site-packages/nose/case.py", line 197, in > runTest > self.test(*self.arg) > File "/home/nbecker/.local/lib/python2.7/site- > packages/numpy/testing/decorators.py", line 146, in skipper_func > return f(*args, **kwargs) > File "/home/nbecker/.local/lib/python2.7/site- > packages/numpy/core/tests/test_blasdot.py", line 31, in test_blasdot_used > assert_(dot is _dotblas.dot) > File "/home/nbecker/.local/lib/python2.7/site- > packages/numpy/testing/utils.py", line 53, in assert_ > raise AssertionError(smsg) > AssertionError > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Tue Oct 6 08:41:49 2015 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 06 Oct 2015 08:41:49 -0400 Subject: [Numpy-discussion] Numpy 1.10.0 release References: <1444133243.2029.13.camel@sipsolutions.net> Message-ID: Sebastian Berg wrote: > On Di, 2015-10-06 at 07:53 -0400, Neal Becker wrote: >> 1 test failure: >> >> FAIL: test_blasdot.test_blasdot_used >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File "/usr/lib/python2.7/site-packages/nose/case.py", line 197, in >> runTest >> self.test(*self.arg) >> File "/home/nbecker/.local/lib/python2.7/site- >> packages/numpy/testing/decorators.py", line 146, in skipper_func >> return f(*args, **kwargs) >> File "/home/nbecker/.local/lib/python2.7/site- >> packages/numpy/core/tests/test_blasdot.py", line 31, in test_blasdot_used >> assert_(dot is _dotblas.dot) >> File "/home/nbecker/.local/lib/python2.7/site- >> packages/numpy/testing/utils.py", line 53, in assert_ >> raise AssertionError(smsg) >> AssertionError >> > > My first guess would be, that it sounds like you got some old test files > flying around. Can you try cleaning up everything and reinstall? It can > happen that old installed test files survive the new version. > Yes, I rm'd the old ~/.local/lib/python2.7/site-packages/numpy, reinstalled, and now no test failure: Ran 5955 tests in 30.778s OK (KNOWNFAIL=3, SKIP=1) From njs at pobox.com Tue Oct 6 12:40:43 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 6 Oct 2015 09:40:43 -0700 Subject: [Numpy-discussion] Should we drop support for "one file" compilation mode? In-Reply-To: <20151006130717.5304c762@fsol> References: <20151006130717.5304c762@fsol> Message-ID: On Tue, Oct 6, 2015 at 4:07 AM, Antoine Pitrou wrote: > On Tue, 6 Oct 2015 11:00:30 +0100 > David Cournapeau wrote: >> >> Assuming one of the rumour is related to some comments I made some time >> (years ?) earlier, the context was the ability to hide exported symbols. As >> you know, the issue is not to build extensions w/ multiple compilation >> units, but sharing functionalities between them without sharing them >> outside the extension. > > Can't you use the visibility attribute with gcc for this? > Other Unix compilers probably provide something similar. The issue > doesn't exist on Windows by construction. 
> > https://gcc.gnu.org/onlinedocs/gcc-5.2.0/gcc/Function-Attributes.html#Function-Attributes This is what we do normally when building a shared extension, but in the exceptional case where people want to statically link numpy into cpython, it doesn't work -- the visibility attribute and related machinery only works on shared libraries, not static libraries. (Recall that traditionally, doing 'a.o + b.o -> static.a; static.a + c.o -> final' is just a shorthand for doing 'a.o + b.o + c.o -> final'.) But this is still a solved problem, you just have to use the static linking version instead of the shared library version :-). E.g. with GNU tools the magic incantation is objcopy --keep-symbol-name PyInit_multiarray multiarray.a It's possible that there's some system somewhere that both needs static linking *and* doesn't have access to objcopy-or-equivalent, but then we're back with the thing where it's not a great plan to keep spending energy on supporting purely theoretical platforms. > By the way, external packages may reuse the npy_* functions, so I would > like them not the be hidden :-) This is a separate issue from the one file/multi-file compilation mode, but: I'd really prefer that we stick to just one way of exporting stuff to external packages, and that that be the (admittedly somewhat cumbersome) public API / import_array() mechanism. Trying to manage numpy's API/ABI exposure is a huge challenge in general, so having two mechanisms is not really sustainable. And trying to use the linker directly creates huge cross-platform headaches -- no-one can agree on what's exported by default, or how you find shared libraries, and the numpy extensions will certainly not be on the default library search path, so you need some platform-specific hack to find them... OTOH the "public API" mechanism takes some ugly stuff on numpy's side to set things up, but then the result is a uniform mechanism that uses Python's familiar package lookup rules to find the relevant symbols. If you need some npy_* function it'd be much better to let us know what it is and let us export it in an intentional way, instead of just relying on whatever stuff we accidentally exposed? -n -- Nathaniel J. Smith -- http://vorpus.org From njs at pobox.com Tue Oct 6 12:44:08 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 6 Oct 2015 09:44:08 -0700 Subject: [Numpy-discussion] Should we drop support for "one file" compilation mode? In-Reply-To: References: <20151006130717.5304c762@fsol> Message-ID: On Tue, Oct 6, 2015 at 4:46 AM, David Cournapeau wrote: > The npy_ functions in npymath were designed to be exported. Those would stay > that way. If we want to export these then I vote that we either: - use the usual API export mechanism, or else - provide a static library for people to link to, instead of trying to do runtime binding. (I.e. drop it in some known place, and then provide some functions for extension modules to find it at build time -- similar to how np.get_include() works.) -n -- Nathaniel J. Smith -- http://vorpus.org From cournape at gmail.com Tue Oct 6 12:51:08 2015 From: cournape at gmail.com (David Cournapeau) Date: Tue, 6 Oct 2015 17:51:08 +0100 Subject: [Numpy-discussion] Should we drop support for "one file" compilation mode? In-Reply-To: References: <20151006130717.5304c762@fsol> Message-ID: On Tue, Oct 6, 2015 at 5:44 PM, Nathaniel Smith wrote: > On Tue, Oct 6, 2015 at 4:46 AM, David Cournapeau > wrote: > > The npy_ functions in npymath were designed to be exported. Those would > stay > > that way. 
> > If we want to export these then I vote that we either: > - use the usual API export mechanism, or else > - provide a static library for people to link to, instead of trying to > do runtime binding. (I.e. drop it in some known place, and then > provide some functions for extension modules to find it at build time > -- similar to how np.get_include() works.) > Unless something changed, that's more or less how it works already (npymath is used in scipy, for example, which was one of the rationale for writing it in the first place !). You access the compilation/linking issues through the numpy distutils get_info function. David > -n > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Tue Oct 6 12:56:50 2015 From: cournape at gmail.com (David Cournapeau) Date: Tue, 6 Oct 2015 17:56:50 +0100 Subject: [Numpy-discussion] Should we drop support for "one file" compilation mode? In-Reply-To: References: <20151006130717.5304c762@fsol> Message-ID: On Tue, Oct 6, 2015 at 5:51 PM, David Cournapeau wrote: > > > On Tue, Oct 6, 2015 at 5:44 PM, Nathaniel Smith wrote: > >> On Tue, Oct 6, 2015 at 4:46 AM, David Cournapeau >> wrote: >> > The npy_ functions in npymath were designed to be exported. Those would >> stay >> > that way. >> >> If we want to export these then I vote that we either: >> - use the usual API export mechanism, or else >> - provide a static library for people to link to, instead of trying to >> do runtime binding. (I.e. drop it in some known place, and then >> provide some functions for extension modules to find it at build time >> -- similar to how np.get_include() works.) >> > > Unless something changed, that's more or less how it works already > (npymath is used in scipy, for example, which was one of the rationale for > writing it in the first place !). > > You access the compilation/linking issues through the numpy distutils > get_info function. > And my suggestion is to use a similar mechanism for multiarray and umath, so that in the end the exported Python C API is just a thin layer on top of the underlying static library. That would make cython and I suspect static linking quite a bit easier. The API between the low layer and python C API of multiarray/umath would be considered private and outside any API/ABI stability. IOW, it would be an internal change, and should not cause visible changes to the users, except that some _npy_private_ symbols would be exported (but you would be crazy to use them, and the prototype declarations would not be available when you install numpy anyway). Think of those as the internal driver API/ABI of Linux or similar. David > > David > > >> -n >> >> -- >> Nathaniel J. Smith -- http://vorpus.org >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Oct 6 12:58:20 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 6 Oct 2015 09:58:20 -0700 Subject: [Numpy-discussion] Should we drop support for "one file" compilation mode? 
In-Reply-To: References: <20151006130717.5304c762@fsol> Message-ID: On Tue, Oct 6, 2015 at 9:51 AM, David Cournapeau wrote: > > On Tue, Oct 6, 2015 at 5:44 PM, Nathaniel Smith wrote: >> >> On Tue, Oct 6, 2015 at 4:46 AM, David Cournapeau >> wrote: >> > The npy_ functions in npymath were designed to be exported. Those would >> > stay >> > that way. >> >> If we want to export these then I vote that we either: >> - use the usual API export mechanism, or else >> - provide a static library for people to link to, instead of trying to >> do runtime binding. (I.e. drop it in some known place, and then >> provide some functions for extension modules to find it at build time >> -- similar to how np.get_include() works.) > > Unless something changed, that's more or less how it works already (npymath > is used in scipy, for example, which was one of the rationale for writing it > in the first place !). Okay... in fact multiarray.so right now *does* export tons and tons of random junk into the global symbol namespace (on systems like Linux that do have a global symbol namespace), so it isn't obvious whether people are asking for that to continue :-). I'm just specifically saying that we should try to get this back down to the 1 exported symbol. (Try: objdump -T $(python -c 'import numpy; print(numpy.core.multiarray.__file__)') This *should* print 1 line... I currently get ~700. numpy.core.umath is similar.) -n -- Nathaniel J. Smith -- http://vorpus.org From solipsis at pitrou.net Tue Oct 6 13:00:55 2015 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 6 Oct 2015 19:00:55 +0200 Subject: [Numpy-discussion] Should we drop support for "one file" compilation mode? References: <20151006130717.5304c762@fsol> Message-ID: <20151006190055.120ffa8a@fsol> On Tue, 6 Oct 2015 09:40:43 -0700 Nathaniel Smith wrote: > > If you need some npy_* function it'd be much better to let us know > what it is and let us export it in an intentional way, instead of just > relying on whatever stuff we accidentally exposed? Ok, we seem to be using only the complex math functions (npy_cpow and friends, I could make a complete list if required). And, of course, we would also benefit from the CBLAS functions (or any kind of C wrappers around them) :-) https://github.com/numpy/numpy/issues/6324 Regards Antoine. From njs at pobox.com Tue Oct 6 13:07:13 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 6 Oct 2015 10:07:13 -0700 Subject: [Numpy-discussion] Should we drop support for "one file" compilation mode? In-Reply-To: <20151006190055.120ffa8a@fsol> References: <20151006130717.5304c762@fsol> <20151006190055.120ffa8a@fsol> Message-ID: On Tue, Oct 6, 2015 at 10:00 AM, Antoine Pitrou wrote: > On Tue, 6 Oct 2015 09:40:43 -0700 > Nathaniel Smith wrote: >> >> If you need some npy_* function it'd be much better to let us know >> what it is and let us export it in an intentional way, instead of just >> relying on whatever stuff we accidentally exposed? > > Ok, we seem to be using only the complex math functions (npy_cpow and > friends, I could make a complete list if required). And how are you getting at them? Are you just relying the way that on ELF systems, if two libraries are loaded into the same address space then they automatically get access to each other's symbols, even if they aren't linked to each other? What do you do on Windows? 
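(For concreteness, one quick-and-dirty way to see what is reachable at runtime on Linux -- just a sketch, and it only tells you what sits in the dynamic symbol table, not whether depending on it is a good idea:

    import ctypes
    import numpy.core.multiarray as m

    lib = ctypes.CDLL(m.__file__)          # dlopen the already-built extension
    print(hasattr(lib, 'npy_cpow'))        # True only if npy_cpow is actually exported

npy_cpow here is just the example Antoine mentioned; anything that is not in the dynamic symbol table will not resolve this way, and none of this says anything about Windows.)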
> And, of course, we would also benefit from the CBLAS functions (or any > kind of C wrappers around them) :-) > https://github.com/numpy/numpy/issues/6324 This is difficult to do from NumPy itself -- we don't necessarily have access to a full BLAS or LAPACK API -- in some configurations we fall back on our minimal internal implementations that just have what we need. There was an interesting idea that came up in some discussions here a few weeks ago -- we already know that we want to package up BLAS inside a Python package that (numpy / scipy / scikit-learn / ...) can depend on and assume is there to link against. Maybe this new package would also be a good place for exposing these wrappers? -n -- Nathaniel J. Smith -- http://vorpus.org From cournape at gmail.com Tue Oct 6 13:08:17 2015 From: cournape at gmail.com (David Cournapeau) Date: Tue, 6 Oct 2015 18:08:17 +0100 Subject: [Numpy-discussion] Should we drop support for "one file" compilation mode? In-Reply-To: References: <20151006130717.5304c762@fsol> Message-ID: On Tue, Oct 6, 2015 at 5:58 PM, Nathaniel Smith wrote: > On Tue, Oct 6, 2015 at 9:51 AM, David Cournapeau > wrote: > > > > On Tue, Oct 6, 2015 at 5:44 PM, Nathaniel Smith wrote: > >> > >> On Tue, Oct 6, 2015 at 4:46 AM, David Cournapeau > >> wrote: > >> > The npy_ functions in npymath were designed to be exported. Those > would > >> > stay > >> > that way. > >> > >> If we want to export these then I vote that we either: > >> - use the usual API export mechanism, or else > >> - provide a static library for people to link to, instead of trying to > >> do runtime binding. (I.e. drop it in some known place, and then > >> provide some functions for extension modules to find it at build time > >> -- similar to how np.get_include() works.) > > > > Unless something changed, that's more or less how it works already > (npymath > > is used in scipy, for example, which was one of the rationale for > writing it > > in the first place !). > > Okay... in fact multiarray.so right now *does* export tons and tons of > random junk into the global symbol namespace (on systems like Linux > that do have a global symbol namespace), so it isn't obvious whether > people are asking for that to continue :-). I'm just specifically > saying that we should try to get this back down to the 1 exported > symbol. > > (Try: > objdump -T $(python -c 'import numpy; > print(numpy.core.multiarray.__file__)') > This *should* print 1 line... I currently get ~700. numpy.core.umath > is similar.) > > I think this overestimates the amount by quite a bit, since you see GLIBC symbols, etc... I am using nm -Dg --defined-only $(python -c 'import numpy; print(numpy.core.multiarray.__file__)') instead. I see around 290 symboles: the npy_ from npymath don't bother me, but the ones from npysort do. We should at least prefix those with npy_ (I don't think npysort API has ever been publisher in our header like npymath was ?) David -n > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Oct 6 13:09:41 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 6 Oct 2015 10:09:41 -0700 Subject: [Numpy-discussion] Should we drop support for "one file" compilation mode? 
In-Reply-To: References: <20151006130717.5304c762@fsol> Message-ID: On Tue, Oct 6, 2015 at 9:51 AM, David Cournapeau wrote: > > On Tue, Oct 6, 2015 at 5:44 PM, Nathaniel Smith wrote: >> >> On Tue, Oct 6, 2015 at 4:46 AM, David Cournapeau >> wrote: >> > The npy_ functions in npymath were designed to be exported. Those would >> > stay >> > that way. >> >> If we want to export these then I vote that we either: >> - use the usual API export mechanism, or else >> - provide a static library for people to link to, instead of trying to >> do runtime binding. (I.e. drop it in some known place, and then >> provide some functions for extension modules to find it at build time >> -- similar to how np.get_include() works.) > > Unless something changed, that's more or less how it works already (npymath > is used in scipy, for example, which was one of the rationale for writing it > in the first place !). Okay... in fact multiarray.so right now *does* export tons and tons of random junk into the global symbol namespace (on systems like Linux that do have a global symbol namespace), so it isn't obvious whether people are asking for that to continue :-). I'm just specifically saying that we should try to get this back down to the 1 exported symbol. (Try: objdump -T $(python -c 'import numpy; print(numpy.core.multiarray.__file__)') This *should* print 1 line... I currently get ~700. numpy.core.umath is similar.) -n -- Nathaniel J. Smith -- http://vorpus.org From cournape at gmail.com Tue Oct 6 13:10:42 2015 From: cournape at gmail.com (David Cournapeau) Date: Tue, 6 Oct 2015 18:10:42 +0100 Subject: [Numpy-discussion] Should we drop support for "one file" compilation mode? In-Reply-To: References: <20151006130717.5304c762@fsol> <20151006190055.120ffa8a@fsol> Message-ID: On Tue, Oct 6, 2015 at 6:07 PM, Nathaniel Smith wrote: > On Tue, Oct 6, 2015 at 10:00 AM, Antoine Pitrou > wrote: > > On Tue, 6 Oct 2015 09:40:43 -0700 > > Nathaniel Smith wrote: > >> > >> If you need some npy_* function it'd be much better to let us know > >> what it is and let us export it in an intentional way, instead of just > >> relying on whatever stuff we accidentally exposed? > > > > Ok, we seem to be using only the complex math functions (npy_cpow and > > friends, I could make a complete list if required). > > And how are you getting at them? Are you just relying the way that on > ELF systems, if two libraries are loaded into the same address space > then they automatically get access to each other's symbols, even if > they aren't linked to each other? What do you do on Windows? > It is possible (and documented) to use any of the npy_ symbols from npymath from outside numpy: http://docs.scipy.org/doc/numpy-dev/reference/c-api.coremath.html#linking-against-the-core-math-library-in-an-extension The design is not perfect (I was young and foolish :) ), but it has worked fairly well and has been used in at least scipy since the 1.4/1.5 days IIRC (including windows). David > > > And, of course, we would also benefit from the CBLAS functions (or any > > kind of C wrappers around them) :-) > > https://github.com/numpy/numpy/issues/6324 > > This is difficult to do from NumPy itself -- we don't necessarily have > access to a full BLAS or LAPACK API -- in some configurations we fall > back on our minimal internal implementations that just have what we > need. 
> > There was an interesting idea that came up in some discussions here a > few weeks ago -- we already know that we want to package up BLAS > inside a Python package that (numpy / scipy / scikit-learn / ...) can > depend on and assume is there to link against. > > Maybe this new package would also be a good place for exposing these > wrappers? > > -n > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Oct 6 13:14:19 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 6 Oct 2015 10:14:19 -0700 Subject: [Numpy-discussion] Should we drop support for "one file" compilation mode? In-Reply-To: References: <20151006130717.5304c762@fsol> <20151006190055.120ffa8a@fsol> Message-ID: On Tue, Oct 6, 2015 at 10:10 AM, David Cournapeau wrote: > > > On Tue, Oct 6, 2015 at 6:07 PM, Nathaniel Smith wrote: >> >> On Tue, Oct 6, 2015 at 10:00 AM, Antoine Pitrou >> wrote: >> > On Tue, 6 Oct 2015 09:40:43 -0700 >> > Nathaniel Smith wrote: >> >> >> >> If you need some npy_* function it'd be much better to let us know >> >> what it is and let us export it in an intentional way, instead of just >> >> relying on whatever stuff we accidentally exposed? >> > >> > Ok, we seem to be using only the complex math functions (npy_cpow and >> > friends, I could make a complete list if required). >> >> And how are you getting at them? Are you just relying the way that on >> ELF systems, if two libraries are loaded into the same address space >> then they automatically get access to each other's symbols, even if >> they aren't linked to each other? What do you do on Windows? > > > It is possible (and documented) to use any of the npy_ symbols from npymath > from outside numpy: > http://docs.scipy.org/doc/numpy-dev/reference/c-api.coremath.html#linking-against-the-core-math-library-in-an-extension > > The design is not perfect (I was young and foolish :) ), but it has worked > fairly well and has been used in at least scipy since the 1.4/1.5 days IIRC > (including windows). Okay, so just to confirm, it looks like this does indeed implement the static linking thing I just suggested (so perhaps I am also young and foolish ;-)) -- from looking at the output of get_info("npymath"), it seems to add -I.../numpy/core/include to the compiler flags, add -lnpymath -L.../numpy/core/lib to the linker flags, and then .../numpy/core/lib contains only libnpymath.a, so it's static linking. -n -- Nathaniel J. Smith -- http://vorpus.org From cournape at gmail.com Tue Oct 6 13:18:17 2015 From: cournape at gmail.com (David Cournapeau) Date: Tue, 6 Oct 2015 18:18:17 +0100 Subject: [Numpy-discussion] Should we drop support for "one file" compilation mode? 
In-Reply-To: References: <20151006130717.5304c762@fsol> <20151006190055.120ffa8a@fsol> Message-ID: On Tue, Oct 6, 2015 at 6:14 PM, Nathaniel Smith wrote: > On Tue, Oct 6, 2015 at 10:10 AM, David Cournapeau > wrote: > > > > > > On Tue, Oct 6, 2015 at 6:07 PM, Nathaniel Smith wrote: > >> > >> On Tue, Oct 6, 2015 at 10:00 AM, Antoine Pitrou > >> wrote: > >> > On Tue, 6 Oct 2015 09:40:43 -0700 > >> > Nathaniel Smith wrote: > >> >> > >> >> If you need some npy_* function it'd be much better to let us know > >> >> what it is and let us export it in an intentional way, instead of > just > >> >> relying on whatever stuff we accidentally exposed? > >> > > >> > Ok, we seem to be using only the complex math functions (npy_cpow and > >> > friends, I could make a complete list if required). > >> > >> And how are you getting at them? Are you just relying the way that on > >> ELF systems, if two libraries are loaded into the same address space > >> then they automatically get access to each other's symbols, even if > >> they aren't linked to each other? What do you do on Windows? > > > > > > It is possible (and documented) to use any of the npy_ symbols from > npymath > > from outside numpy: > > > http://docs.scipy.org/doc/numpy-dev/reference/c-api.coremath.html#linking-against-the-core-math-library-in-an-extension > > > > The design is not perfect (I was young and foolish :) ), but it has > worked > > fairly well and has been used in at least scipy since the 1.4/1.5 days > IIRC > > (including windows). > > Okay, so just to confirm, it looks like this does indeed implement the > static linking thing I just suggested (so perhaps I am also young and > foolish ;-)) -- from looking at the output of get_info("npymath"), it > seems to add -I.../numpy/core/include to the compiler flags, add > -lnpymath -L.../numpy/core/lib to the linker flags, and then > .../numpy/core/lib contains only libnpymath.a, so it's static linking. > Yes, I was not trying to argue otherwise. If you thought I was, blame it on my poor English (which sadly does not get better as I get less young...). My proposal is to extend this technique for *internal* API, but with the following differences: * the declarations are not put in any public header * we don't offer any way to link to this library, and name it something scary enough that people would have to be foolish (young or not) to use it. David > -n > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Tue Oct 6 13:27:53 2015 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 6 Oct 2015 19:27:53 +0200 Subject: [Numpy-discussion] Should we drop support for "one file" compilation mode? References: <20151006130717.5304c762@fsol> <20151006190055.120ffa8a@fsol> Message-ID: <20151006192753.117c1902@fsol> On Tue, 6 Oct 2015 10:07:13 -0700 Nathaniel Smith wrote: > > And how are you getting at them? Are you just relying the way that on > ELF systems, if two libraries are loaded into the same address space > then they automatically get access to each other's symbols, even if > they aren't linked to each other? What do you do on Windows? Well it seems to work on Windows too, thanks to numpy.distutils.misc_util.get_info('npymath'). 
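(In case it helps anyone reading along, the recipe on the page David linked boils down to roughly this in a numpy.distutils-based setup.py -- a sketch, with 'foo' as a placeholder package/extension name:

    from numpy.distutils.misc_util import Configuration, get_info

    def configuration(parent_package='', top_path=None):
        config = Configuration('foo', parent_package, top_path)
        info = get_info('npymath')
        config.add_extension('foo', sources=['foo.c'], extra_info=info)
        return config

i.e. get_info('npymath') hands back the include dirs, library dirs and libraries needed to compile and link against the static npymath library.)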
Under Windows, I seem to have a "\site-packages\numpy\core\lib\npymath.lib" static library, and there's also a "npy-pkg-config" subdirectory there with some INI files in it. Hopefully you know better than me what this all is :-) > > And, of course, we would also benefit from the CBLAS functions (or any > > kind of C wrappers around them) :-) > > https://github.com/numpy/numpy/issues/6324 > > This is difficult to do from NumPy itself -- we don't necessarily have > access to a full BLAS or LAPACK API -- in some configurations we fall > back on our minimal internal implementations that just have what we > need. I'm thinking about the functions exposed in "numpy/core/src/private/npy_cblas.h". My knowledge of the Numpy build system doesn't allow me to tell if it's always available or not :-) > There was an interesting idea that came up in some discussions here a > few weeks ago -- we already know that we want to package up BLAS > inside a Python package that (numpy / scipy / scikit-learn / ...) can > depend on and assume is there to link against. > > Maybe this new package would also be a good place for exposing these wrappers? Yeah, why not - as long as there's something well-known and well-supported to depend on. But given Numpy is a hard dependency for all the other packages you mentioned, it may make sense (and simplify dependency management) to bundle it with Numpy. Regards Antoine. From cournape at gmail.com Tue Oct 6 13:31:04 2015 From: cournape at gmail.com (David Cournapeau) Date: Tue, 6 Oct 2015 18:31:04 +0100 Subject: [Numpy-discussion] Should we drop support for "one file" compilation mode? In-Reply-To: References: <20151006130717.5304c762@fsol> <20151006190055.120ffa8a@fsol> Message-ID: On Tue, Oct 6, 2015 at 6:18 PM, David Cournapeau wrote: > > > On Tue, Oct 6, 2015 at 6:14 PM, Nathaniel Smith wrote: > >> On Tue, Oct 6, 2015 at 10:10 AM, David Cournapeau >> wrote: >> > >> > >> > On Tue, Oct 6, 2015 at 6:07 PM, Nathaniel Smith wrote: >> >> >> >> On Tue, Oct 6, 2015 at 10:00 AM, Antoine Pitrou >> >> wrote: >> >> > On Tue, 6 Oct 2015 09:40:43 -0700 >> >> > Nathaniel Smith wrote: >> >> >> >> >> >> If you need some npy_* function it'd be much better to let us know >> >> >> what it is and let us export it in an intentional way, instead of >> just >> >> >> relying on whatever stuff we accidentally exposed? >> >> > >> >> > Ok, we seem to be using only the complex math functions (npy_cpow and >> >> > friends, I could make a complete list if required). >> >> >> >> And how are you getting at them? Are you just relying the way that on >> >> ELF systems, if two libraries are loaded into the same address space >> >> then they automatically get access to each other's symbols, even if >> >> they aren't linked to each other? What do you do on Windows? >> > >> > >> > It is possible (and documented) to use any of the npy_ symbols from >> npymath >> > from outside numpy: >> > >> http://docs.scipy.org/doc/numpy-dev/reference/c-api.coremath.html#linking-against-the-core-math-library-in-an-extension >> > >> > The design is not perfect (I was young and foolish :) ), but it has >> worked >> > fairly well and has been used in at least scipy since the 1.4/1.5 days >> IIRC >> > (including windows). 
>> >> Okay, so just to confirm, it looks like this does indeed implement the >> static linking thing I just suggested (so perhaps I am also young and >> foolish ;-)) -- from looking at the output of get_info("npymath"), it >> seems to add -I.../numpy/core/include to the compiler flags, add >> -lnpymath -L.../numpy/core/lib to the linker flags, and then >> .../numpy/core/lib contains only libnpymath.a, so it's static linking. >> > > Yes, I was not trying to argue otherwise. If you thought I was, blame it > on my poor English (which sadly does not get better as I get less young...). > > My proposal is to extend this technique for *internal* API, but with the > following differences: > * the declarations are not put in any public header > * we don't offer any way to link to this library, and name it something > scary enough that people would have to be foolish (young or not) to use it. > I am stupid: we of course do not even ship that internal library, it would just be linked into multiarray/umath and never installed or part of binary packages. David > > David > > >> -n >> >> -- >> Nathaniel J. Smith -- http://vorpus.org >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Oct 6 14:30:53 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 6 Oct 2015 11:30:53 -0700 Subject: [Numpy-discussion] reorganizing numpy internal extensions (was: Re: Should we drop support for "one file" compilation mode?) Message-ID: [splitting this off into a new thread] On Tue, Oct 6, 2015 at 3:00 AM, David Cournapeau wrote: [...] > I also agree the current situation is not sustainable -- as we discussed > privately before, cythonizing numpy.core is made quite more complicated by > this. I have myself quite a few issues w/ cythonizing the other parts of > umath. I would also like to support the static link better than we do now > (do we know some static link users we can contact to validate our approach > ?) > > Currently, what we have in numpy core is the following: > > numpy.core.multiarray -> compilation units in numpy/core/src/multiarray/ + > statically link npymath > numpy.core.umath -> compilation units in numpy/core/src/umath + statically > link npymath/npysort + some shenanigans to use things in > numpy.core.multiarray There are also shenanigans in the other direction - supposedly umath is layered "above" multiarray, but in practice there are circular dependencies (see e.g. np.set_numeric_ops). > I would suggest to have a more layered approach, to enable both 'normal' > build and static build, without polluting the public namespace too much. > This is an approach followed by most large libraries (e.g. MKL), and is > fairly flexible. > > Concretely, we could start by putting more common functionalities (aka the > 'core' library) into its own static library. The API would be considered > private to numpy (no stability guaranteed outside numpy), and every exported > symbol from that library would be decorated appropriately to avoid potential > clashes (e.g. '_npy_internal_'). I don't see why we need this multi-layered complexity, though. npymath is a well-defined utility library that other people use, so sure, it makes sense to keep that somewhat separate as a static library (as discussed in the other thread). Beyond that -- NumPy is really not a large library. 
multiarray is <50k lines of code, and umath is only ~6k (!). And there's no particular reason to keep them split up from the user point of view -- all their functionality gets combined into the flat numpy namespace anyway. So we *could* rewrite them as three libraries, with a "common core" that then gets exported via two different wrapper libraries -- but it's much simpler just to do mv umath/* multiarray/ rmdir umath and then make multiarray work the way we want. (After fixing up the build system of course :-).) -n -- Nathaniel J. Smith -- http://vorpus.org From cournape at gmail.com Tue Oct 6 14:52:11 2015 From: cournape at gmail.com (David Cournapeau) Date: Tue, 6 Oct 2015 19:52:11 +0100 Subject: [Numpy-discussion] reorganizing numpy internal extensions (was: Re: Should we drop support for "one file" compilation mode?) In-Reply-To: References: Message-ID: On Tue, Oct 6, 2015 at 7:30 PM, Nathaniel Smith wrote: > [splitting this off into a new thread] > > On Tue, Oct 6, 2015 at 3:00 AM, David Cournapeau > wrote: > [...] > > I also agree the current situation is not sustainable -- as we discussed > > privately before, cythonizing numpy.core is made quite more complicated > by > > this. I have myself quite a few issues w/ cythonizing the other parts of > > umath. I would also like to support the static link better than we do now > > (do we know some static link users we can contact to validate our > approach > > ?) > > > > Currently, what we have in numpy core is the following: > > > > numpy.core.multiarray -> compilation units in numpy/core/src/multiarray/ > + > > statically link npymath > > numpy.core.umath -> compilation units in numpy/core/src/umath + > statically > > link npymath/npysort + some shenanigans to use things in > > numpy.core.multiarray > > There are also shenanigans in the other direction - supposedly umath > is layered "above" multiarray, but in practice there are circular > dependencies (see e.g. np.set_numeric_ops). > Indeed, I am not arguing about merging umath and multiarray. > > I would suggest to have a more layered approach, to enable both 'normal' > > build and static build, without polluting the public namespace too much. > > This is an approach followed by most large libraries (e.g. MKL), and is > > fairly flexible. > > > > Concretely, we could start by putting more common functionalities (aka > the > > 'core' library) into its own static library. The API would be considered > > private to numpy (no stability guaranteed outside numpy), and every > exported > > symbol from that library would be decorated appropriately to avoid > potential > > clashes (e.g. '_npy_internal_'). > > I don't see why we need this multi-layered complexity, though. > For several reasons: - when you want to cythonize either extension, it is much easier to separate it as cython for CPython API, C for the rest. - if numpy.core.multiarray.so is built as cython-based .o + a 'large' C static library, it should become much simpler to support static link. - maybe that's just personal, but I find the whole multiarray + umath quite beyond manageable in terms of intertwined complexity. You may argue it is not that big, and we all have different preferences in terms of organization, but if I look at the binary size of multiarray + umath, it is quite larger than the median size of the .so I have in my /usr/lib. I am also hoping that splitting up numpy.core in separate elements that communicate through internal APIs would make participating into numpy easier. 
We could also swap the argument: assuming it does not make the build more complex, and that it does help static linking, why not doing it ? David -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Oct 6 15:04:59 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 6 Oct 2015 12:04:59 -0700 Subject: [Numpy-discussion] reorganizing numpy internal extensions (was: Re: Should we drop support for "one file" compilation mode?) In-Reply-To: References: Message-ID: On Tue, Oct 6, 2015 at 11:52 AM, David Cournapeau wrote: > > > On Tue, Oct 6, 2015 at 7:30 PM, Nathaniel Smith wrote: >> >> [splitting this off into a new thread] >> >> On Tue, Oct 6, 2015 at 3:00 AM, David Cournapeau >> wrote: >> [...] >> > I also agree the current situation is not sustainable -- as we discussed >> > privately before, cythonizing numpy.core is made quite more complicated >> > by >> > this. I have myself quite a few issues w/ cythonizing the other parts of >> > umath. I would also like to support the static link better than we do >> > now >> > (do we know some static link users we can contact to validate our >> > approach >> > ?) >> > >> > Currently, what we have in numpy core is the following: >> > >> > numpy.core.multiarray -> compilation units in numpy/core/src/multiarray/ >> > + >> > statically link npymath >> > numpy.core.umath -> compilation units in numpy/core/src/umath + >> > statically >> > link npymath/npysort + some shenanigans to use things in >> > numpy.core.multiarray >> >> There are also shenanigans in the other direction - supposedly umath >> is layered "above" multiarray, but in practice there are circular >> dependencies (see e.g. np.set_numeric_ops). > > Indeed, I am not arguing about merging umath and multiarray. Oh, okay :-). >> > I would suggest to have a more layered approach, to enable both 'normal' >> > build and static build, without polluting the public namespace too much. >> > This is an approach followed by most large libraries (e.g. MKL), and is >> > fairly flexible. >> > >> > Concretely, we could start by putting more common functionalities (aka >> > the >> > 'core' library) into its own static library. The API would be considered >> > private to numpy (no stability guaranteed outside numpy), and every >> > exported >> > symbol from that library would be decorated appropriately to avoid >> > potential >> > clashes (e.g. '_npy_internal_'). >> >> I don't see why we need this multi-layered complexity, though. > > > For several reasons: > > - when you want to cythonize either extension, it is much easier to > separate it as cython for CPython API, C for the rest. I don't think this will help much, because I think we'll want to have multiple cython files, and that we'll probably move individual functions between being implemented in C and Cython (including utility functions). So that means we need to solve the problem of mixing C and Cython files inside a single library. If you look at Stefan's PR: https://github.com/numpy/numpy/pull/6408 it does solve most of these problems. It would help if Cython added a few tweaks to officially support compiling multiple modules into one .so, and I'm not sure whether the current code quite handles initialization of the submodule correctly, but it's actually surprisingly easy to make work. 
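(To be clear, the basic pattern of mixing plain C and Cython inside one extension is not exotic -- stripped of all the numpy specifics it is just something like the following sketch, where every file name and symbol is made up for illustration:

    # helper.c -- plain C compiled into the same extension
    #     double example_square(double x) { return x * x; }
    #
    # wrapper.pyx -- Cython code calling into it
    #     cdef extern from "helper.h":
    #         double example_square(double x)
    #     def square(x):
    #         return example_square(x)

    # setup.py -- both sources are linked into a single wrapper.so
    from distutils.core import setup
    from distutils.extension import Extension
    from Cython.Build import cythonize

    ext = Extension("wrapper", sources=["wrapper.pyx", "helper.c"])
    setup(ext_modules=cythonize([ext]))

The interesting work is wiring this into numpy's own build machinery and deciding where the internal API boundaries go, not the mechanism itself.)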
(Obviously we won't want to go overboard here -- but the point of removing the technical constraints is that then it frees us to pick whatever arrangement makes the most sense, instead of deciding based on what makes the build system and linker easiest.) > - if numpy.core.multiarray.so is built as cython-based .o + a 'large' C > static library, it should become much simpler to support static link. I don't see this at all, so I must be missing something? Either way you build a bunch of .o files, and then you have to either combine them into a shared library or combine them into a static library. Why does pre-combining some of them into a static library make this easier? > - maybe that's just personal, but I find the whole multiarray + umath quite > beyond manageable in terms of intertwined complexity. You may argue it is > not that big, and we all have different preferences in terms of > organization, but if I look at the binary size of multiarray + umath, it is > quite larger than the median size of the .so I have in my /usr/lib. The binary size isn't a good measure here -- most of that is the bazillions of copies of slightly tweaked loops that we auto-generate, which take up a lot of space but don't add much intertwined complexity. (Though now that I think about it, my LOC estimate was probably a bit low because cloc is probably ignoring those autogeneration template files.) We definitely could do a better job with our internal APIs -- I just think that'll be easiest if everything is in the same directory so there are minimal obstacles to rearranging and refactoring things. Anyway, it sounds like we agree that the next step is to merge multiarray and umath, so possibly we should worry about doing that and then see what makes sense from there :-). -n -- Nathaniel J. Smith -- http://vorpus.org From charlesr.harris at gmail.com Tue Oct 6 15:19:41 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 6 Oct 2015 13:19:41 -0600 Subject: [Numpy-discussion] reorganizing numpy internal extensions (was: Re: Should we drop support for "one file" compilation mode?) In-Reply-To: References: Message-ID: On Tue, Oct 6, 2015 at 1:04 PM, Nathaniel Smith wrote: > On Tue, Oct 6, 2015 at 11:52 AM, David Cournapeau > wrote: > > > > > > On Tue, Oct 6, 2015 at 7:30 PM, Nathaniel Smith wrote: > >> > >> [splitting this off into a new thread] > >> > >> On Tue, Oct 6, 2015 at 3:00 AM, David Cournapeau > >> wrote: > >> [...] > >> > I also agree the current situation is not sustainable -- as we > discussed > >> > privately before, cythonizing numpy.core is made quite more > complicated > >> > by > >> > this. I have myself quite a few issues w/ cythonizing the other parts > of > >> > umath. I would also like to support the static link better than we do > >> > now > >> > (do we know some static link users we can contact to validate our > >> > approach > >> > ?) > >> > > >> > Currently, what we have in numpy core is the following: > >> > > >> > numpy.core.multiarray -> compilation units in > numpy/core/src/multiarray/ > >> > + > >> > statically link npymath > >> > numpy.core.umath -> compilation units in numpy/core/src/umath + > >> > statically > >> > link npymath/npysort + some shenanigans to use things in > >> > numpy.core.multiarray > >> > >> There are also shenanigans in the other direction - supposedly umath > >> is layered "above" multiarray, but in practice there are circular > >> dependencies (see e.g. np.set_numeric_ops). > > > > Indeed, I am not arguing about merging umath and multiarray. 
> > Oh, okay :-). > > >> > I would suggest to have a more layered approach, to enable both > 'normal' > >> > build and static build, without polluting the public namespace too > much. > >> > This is an approach followed by most large libraries (e.g. MKL), and > is > >> > fairly flexible. > >> > > >> > Concretely, we could start by putting more common functionalities (aka > >> > the > >> > 'core' library) into its own static library. The API would be > considered > >> > private to numpy (no stability guaranteed outside numpy), and every > >> > exported > >> > symbol from that library would be decorated appropriately to avoid > >> > potential > >> > clashes (e.g. '_npy_internal_'). > >> > >> I don't see why we need this multi-layered complexity, though. > > > > > > For several reasons: > > > > - when you want to cythonize either extension, it is much easier to > > separate it as cython for CPython API, C for the rest. > > I don't think this will help much, because I think we'll want to have > multiple cython files, and that we'll probably move individual > functions between being implemented in C and Cython (including utility > functions). So that means we need to solve the problem of mixing C and > Cython files inside a single library. > > If you look at Stefan's PR: > https://github.com/numpy/numpy/pull/6408 > it does solve most of these problems. It would help if Cython added a > few tweaks to officially support compiling multiple modules into one > .so, and I'm not sure whether the current code quite handles > initialization of the submodule correctly, but it's actually > surprisingly easy to make work. > > (Obviously we won't want to go overboard here -- but the point of > removing the technical constraints is that then it frees us to pick > whatever arrangement makes the most sense, instead of deciding based > on what makes the build system and linker easiest.) > > > - if numpy.core.multiarray.so is built as cython-based .o + a 'large' C > > static library, it should become much simpler to support static link. > > I don't see this at all, so I must be missing something? Either way > you build a bunch of .o files, and then you have to either combine > them into a shared library or combine them into a static library. Why > does pre-combining some of them into a static library make this > easier? > > > - maybe that's just personal, but I find the whole multiarray + umath > quite > > beyond manageable in terms of intertwined complexity. You may argue it is > > not that big, and we all have different preferences in terms of > > organization, but if I look at the binary size of multiarray + umath, it > is > > quite larger than the median size of the .so I have in my /usr/lib. > > The binary size isn't a good measure here -- most of that is the > bazillions of copies of slightly tweaked loops that we auto-generate, > which take up a lot of space but don't add much intertwined > complexity. (Though now that I think about it, my LOC estimate was > probably a bit low because cloc is probably ignoring those > autogeneration template files.) > > We definitely could do a better job with our internal APIs -- I just > think that'll be easiest if everything is in the same directory so > there are minimal obstacles to rearranging and refactoring things. > > Anyway, it sounds like we agree that the next step is to merge > multiarray and umath, so possibly we should worry about doing that and > then see what makes sense from there :-). > > What about removing the single file build? 
That seems somewhat orthogonal to this discussion. Would someone explain to me the advantages of the single file build for static linking, apart from possible doing a better job of hiding symbols? If symbols are the problem, it there not a solution we could implement? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.rogozhnikov at yandex.ru Tue Oct 6 20:26:22 2015 From: alex.rogozhnikov at yandex.ru (Alex Rogozhnikov) Date: Wed, 7 Oct 2015 03:26:22 +0300 Subject: [Numpy-discussion] Fwd: Numpy for data manipulation In-Reply-To: References: <1443755358119.1d9bb73e@Nodemailer> <560E3468.8010502@yandex.ru> Message-ID: <561466AE.9040501@yandex.ru> Thanks for comments, I've fixed the named issues. Code is python2&3 compatible, I aliased numpy and used better inversion. Specially thanks for pointing at histogram equalization - I've added example for images. Probably some other 'visual' examples would help - I'll try to invent something to other points, but this is not simple. (I left %matplolib inline due to more appropriate rendering) Alex. 02.10.15 10:50, Kiko ?????: > > > 2015-10-02 9:48 GMT+02:00 Kiko >: > > > > 2015-10-02 9:38 GMT+02:00 Alex Rogozhnikov > >: > > I would suggest > > %matplotlib notebook > > It will still have to a nice png, but you get an > interactive figure when it is live. > > > Amazing, thanks. I was using mpld3 for this. > (for some strange reason I need to put %matplotlib notebook > before each plot) > > > You should create a figure before each plot instead of putthon > %matplotlib notebook > plt.figure() > .... > > > putthon == putting > > > The recommendation of inverting a permutation by > argsort'ing it, while it works, is suboptimal, as it takes > O(n log(n)) time, and you can do it in linear time: > > Actually, there is (later in post) a linear solution using > bincount, but your code is definitely better. Thanks! > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Wed Oct 7 07:30:18 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 7 Oct 2015 13:30:18 +0200 Subject: [Numpy-discussion] Numpy 1.10.0 release In-Reply-To: References: Message-ID: <5615024A.2020403@googlemail.com> On 10/06/2015 01:45 PM, Neal Becker wrote: > Are extra_compile_args actually used in all compile steps? extra_compile_args is not used by numpy, its to support some third party use case I never understood. As the typical site.cfg used by numpy only contains binaries that are never compiled by numpy itself it should have no effect on anything. > CFLAGS='-march=native -O3' python setup.py build > > Does seem to use my CFLAGS, as it always did on previous numpy versions. > still seems to work for me, though the preferred variable is OPT= as CFLAGS will contain a bunch of other stuff related to building python extensions themselves (e.g. 
-fno-strict-aliasing) From jtaylor.debian at googlemail.com Wed Oct 7 07:40:46 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 7 Oct 2015 13:40:46 +0200 Subject: [Numpy-discussion] [SciPy-Dev] Numpy 1.10.0 release In-Reply-To: References: Message-ID: <561504BE.50500@googlemail.com> On 10/06/2015 02:08 PM, Da?id wrote: > I don't get any failures on Fedora 22. I have installed it with pip, > setting my CFLAGS to "-march=core-avx-i -O2 -pipe -mtune=native" and > linking against openblas. > > With the new Numpy, Scipy full suite shows two errors, I am sorry I > didn't think of running that in the RC phase: > ====================================================================== > FAIL: test_weighting (test_stats.TestHistogram) this is a known issue see scipy/scipy/#5148 It can most likely be ignored as the scipy test is too sensitive to floating point rounding. From matti.picus at gmail.com Wed Oct 7 15:59:04 2015 From: matti.picus at gmail.com (Matti Picus) Date: Wed, 7 Oct 2015 22:59:04 +0300 Subject: [Numpy-discussion] nditer when using operands with mixed C and F order Message-ID: <56157988.5020801@gmail.com> An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Wed Oct 7 16:14:39 2015 From: mwwiebe at gmail.com (Mark Wiebe) Date: Wed, 7 Oct 2015 13:14:39 -0700 Subject: [Numpy-discussion] nditer when using operands with mixed C and F order In-Reply-To: <56157988.5020801@gmail.com> References: <56157988.5020801@gmail.com> Message-ID: On Wed, Oct 7, 2015 at 12:59 PM, Matti Picus wrote: > I am trying to understand how nditer(ops, order='K') handles C and F > order. In the documentation it states > "?K? means as close to the order the array elements appear in memory as > possible" > but I seem to be getting inconsistent results (numpy 1.9): > > >>> a = np.array([[1, 2], [3, 4]], order="C") > >>> b = np.array([[1, 2], [3, 4]], order="F") > >>> [v for v in np.nditer([a], order='K')] > > [array(1), array(2), array(3), array(4)] > > >>> [v for v in np.nditer([b], order='K')] > [array(1), array(3), array(2), array(4)] > >>> [v for v in np.nditer([a,b], order='K')] > [(array(1), array(1)), (array(2), array(2)), (array(3), array(3)), > (array(4), array(4))] > > The result for np.nditer([b], order='K') seems to be wrong. Could someone > confirm this is an issue or explain what is going on? > In this example, elements of a and b are being matched up according to their array indices, and then the iteration order is chosen according to the 'K' rule. The array a suggests to go in 'C' order, while the array b suggests to go in 'F' order. When there's a conflict/ambiguity such as this, it's resolved in the direction of 'C' order. If it were to go through a and b in each individual 'K' order, the elements wouldn't be paired up/broadcast together, which is the whole point of iterating over multiple arrays via the nditer. -Mark > > > Matti > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Oct 8 01:57:00 2015 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 7 Oct 2015 22:57:00 -0700 Subject: [Numpy-discussion] NumFOCUS fiscal sponsorship agreement Message-ID: Hi all, Now that the governance document is in place, we need to get our legal ducks in a row by signing a fiscal sponsorship agreement with NumFOCUS. 
The basic idea here is that there are times when you really need some kind of corporation to represent the project -- the legal system for better or worse does not understand "a bunch of folks on a mailing list" as a legal entity capable of accepting donations, or holding funds or other assets like domain names. The obvious solution is to incorporate a company to represent the project -- but incorporating a company involves lots of super-annoying paperwork. (Like, *super* annoying.) So a standard trick is that a single non-profit corporation acts as an umbrella organization providing these services to multiple projects at once, and this is called "fiscal sponsorship". You can read more about it here: https://en.wikipedia.org/wiki/Fiscal_sponsorship NumFOCUS's standard comprehensive FSA agreement can be seen here: https://docs.google.com/document/d/11YqMX9UrgfCSgiQEUzmOFyg6Ku-vED6gMxhO6J9lCgg/edit?usp=sharing and we have the option of negotiating changes if there's anything we don't like. They also have a FAQ: https://docs.google.com/document/d/1zdXp07dLvkbqBrDsw96P6mkqxnWzKJuM-1f4408I6Qs/edit?usp=sharing I've read through the document and didn't see anything that bothered me, except that I'm not quite sure how to make the split between the steering council and numfocus subcommittee that we have in our governance model sync up with their language about the "leadership body", and in particular the language in section 10 about simple majority votes. So I've queried them about that already. In the mean time, I'd encourage anyone with an interest to look it over and speak up if you see anything that you think should be changed before we sign. Cheers, -n -- Nathaniel J. Smith -- http://vorpus.org From njs at pobox.com Thu Oct 8 03:10:17 2015 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 8 Oct 2015 00:10:17 -0700 Subject: [Numpy-discussion] reorganizing numpy internal extensions (was: Re: Should we drop support for "one file" compilation mode?) In-Reply-To: References: Message-ID: On Tue, Oct 6, 2015 at 12:19 PM, Charles R Harris wrote: > > > On Tue, Oct 6, 2015 at 1:04 PM, Nathaniel Smith wrote: [...] >> Anyway, it sounds like we agree that the next step is to merge >> multiarray and umath, so possibly we should worry about doing that and >> then see what makes sense from there :-). >> > > What about removing the single file build? That seems somewhat orthogonal to > this discussion. We seem to also have consensus about removing the single file build, but yeah, it's orthogonal -- notice the changed subject line in this subthread :-). > Would someone explain to me the advantages of the single > file build for static linking, apart from possible doing a better job of > hiding symbols? If symbols are the problem, it there not a solution we could > implement? Hiding symbols is the only advantage that I'm aware of, and as noted in the other thread there do exist other solutions. The only thing is that we can't be absolutely certain these tools will work until someone who needs static builds actually tries it -- the tools definitely exist on regular linux, but IIUC the people who need static builds are generally on really weird architectures that we can't test ourselves. Or for all I know the weird architectures have finally added shared linking and no-one uses static builds anymore. I think we need to just try dropping it and see. -n -- Nathaniel J. 
Smith -- http://vorpus.org From daniele at grinta.net Thu Oct 8 06:44:24 2015 From: daniele at grinta.net (Daniele Nicolodi) Date: Thu, 8 Oct 2015 12:44:24 +0200 Subject: [Numpy-discussion] reorganizing numpy internal extensions (was: Re: Should we drop support for "one file" compilation mode?) In-Reply-To: References: Message-ID: <56164908.20306@grinta.net> Hello, sorry for replying in the wrong thread, but I don't find an appropriate message to reply to in the original one. On 08/10/15 09:10, Nathaniel Smith wrote: > Hiding symbols is the only advantage that I'm aware of, and as noted > in the other thread there do exist other solutions. Indeed, and those are way easier than maintaining the single file build. > The only thing is > that we can't be absolutely certain these tools will work until > someone who needs static builds actually tries it -- the tools > definitely exist on regular linux, but IIUC the people who need static > builds are generally on really weird architectures that we can't test > ourselves. Or for all I know the weird architectures have finally > added shared linking and no-one uses static builds anymore. I think we > need to just try dropping it and see. I don't really see how building from a single source file or multiple source files affects the linking of a static library. Can you be more precise about what the problems are? The only thing that I may think of is instructing distutils to do the right thing, but that should not be a stopper. Cheers, Daniele From cournape at gmail.com Thu Oct 8 09:30:03 2015 From: cournape at gmail.com (David Cournapeau) Date: Thu, 8 Oct 2015 14:30:03 +0100 Subject: [Numpy-discussion] reorganizing numpy internal extensions (was: Re: Should we drop support for "one file" compilation mode?) In-Reply-To: References: Message-ID: On Tue, Oct 6, 2015 at 8:04 PM, Nathaniel Smith wrote: > On Tue, Oct 6, 2015 at 11:52 AM, David Cournapeau > wrote: > > > > > > On Tue, Oct 6, 2015 at 7:30 PM, Nathaniel Smith wrote: > >> > >> [splitting this off into a new thread] > >> > >> On Tue, Oct 6, 2015 at 3:00 AM, David Cournapeau > >> wrote: > >> [...] > >> > I also agree the current situation is not sustainable -- as we > discussed > >> > privately before, cythonizing numpy.core is made quite more > complicated > >> > by > >> > this. I have myself quite a few issues w/ cythonizing the other parts > of > >> > umath. I would also like to support the static link better than we do > >> > now > >> > (do we know some static link users we can contact to validate our > >> > approach > >> > ?) > >> > > >> > Currently, what we have in numpy core is the following: > >> > > >> > numpy.core.multiarray -> compilation units in > numpy/core/src/multiarray/ > >> > + > >> > statically link npymath > >> > numpy.core.umath -> compilation units in numpy/core/src/umath + > >> > statically > >> > link npymath/npysort + some shenanigans to use things in > >> > numpy.core.multiarray > >> > >> There are also shenanigans in the other direction - supposedly umath > >> is layered "above" multiarray, but in practice there are circular > >> dependencies (see e.g. np.set_numeric_ops). > > > > Indeed, I am not arguing about merging umath and multiarray. > > Oh, okay :-). > > >> > I would suggest to have a more layered approach, to enable both > 'normal' > >> > build and static build, without polluting the public namespace too > much. > >> > This is an approach followed by most large libraries (e.g. MKL), and > is > >> > fairly flexible. 
> >> > > >> > Concretely, we could start by putting more common functionalities (aka > >> > the > >> > 'core' library) into its own static library. The API would be > considered > >> > private to numpy (no stability guaranteed outside numpy), and every > >> > exported > >> > symbol from that library would be decorated appropriately to avoid > >> > potential > >> > clashes (e.g. '_npy_internal_'). > >> > >> I don't see why we need this multi-layered complexity, though. > > > > > > For several reasons: > > > > - when you want to cythonize either extension, it is much easier to > > separate it as cython for CPython API, C for the rest. > > I don't think this will help much, because I think we'll want to have > multiple cython files, and that we'll probably move individual > functions between being implemented in C and Cython (including utility > functions). So that means we need to solve the problem of mixing C and > Cython files inside a single library. > Separating the pure C code into static lib is the simple way of achieving the same goal. Essentially, you write: # implemented in npyinternal.a _npy_internal_foo(....) # implemented in merged_multiarray_umath.pyx cdef PyArray_Foo(...): # use _npy_internal_foo() then our merged_multiarray_umath.so is built by linking the .pyx and the npyinternal.a together. IOW, the static link is internal. Going through npyinternal.a instead of just linking .o from pure C and Cython together gives us the following: 1. the .a can just use normal linking strategies instead of the awkward capsule thing. Those are easy to get wrong when using cython as you may end up with multiple internal copies of the wrapped object inside capsule, causing hard to track bugs (this is what we wasted most of the time on w/ Stefan and Kurt during ds4ds) 2. the only public symbols in .a are the ones needed by the cython wrapping, and since those are decorated with npy_internal, clashes are unlikely to happen 3. since most of the code is already in .a internally, supporting the static linking should be simpler since the only difference is how you statically link the cython-generated code. Because of 1, you are also less likely to cause nasty surprises when putting everything together. When you cythonize umath/multiarray, you need to do most of the underlying work anyway I don't really care if the files are in the same directory or not, we can keep things as they are now. David -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Thu Oct 8 13:06:09 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 8 Oct 2015 19:06:09 +0200 Subject: [Numpy-discussion] reorganizing numpy internal extensions (was: Re: Should we drop support for "one file" compilation mode?) In-Reply-To: References: Message-ID: <5616A281.4070802@googlemail.com> On 10/08/2015 03:30 PM, David Cournapeau wrote: > > > On Tue, Oct 6, 2015 at 8:04 PM, Nathaniel Smith > wrote: > > On Tue, Oct 6, 2015 at 11:52 AM, David Cournapeau > > wrote: > > > > > > On Tue, Oct 6, 2015 at 7:30 PM, Nathaniel Smith > wrote: > >> > >> [splitting this off into a new thread] > >> > >> On Tue, Oct 6, 2015 at 3:00 AM, David Cournapeau > > > >> wrote: > >> [...] > >> > I also agree the current situation is not sustainable -- as we > discussed > >> > privately before, cythonizing numpy.core is made quite more > complicated > >> > by > >> > this. I have myself quite a few issues w/ cythonizing the > other parts of > >> > umath. 
I would also like to support the static link better > than we do > >> > now > >> > (do we know some static link users we can contact to validate our > >> > approach > >> > ?) > >> > > >> > Currently, what we have in numpy core is the following: > >> > > >> > numpy.core.multiarray -> compilation units in > numpy/core/src/multiarray/ > >> > + > >> > statically link npymath > >> > numpy.core.umath -> compilation units in numpy/core/src/umath + > >> > statically > >> > link npymath/npysort + some shenanigans to use things in > >> > numpy.core.multiarray > >> > >> There are also shenanigans in the other direction - supposedly umath > >> is layered "above" multiarray, but in practice there are circular > >> dependencies (see e.g. np.set_numeric_ops). > > > > Indeed, I am not arguing about merging umath and multiarray. > > Oh, okay :-). > > >> > I would suggest to have a more layered approach, to enable both 'normal' > >> > build and static build, without polluting the public namespace too much. > >> > This is an approach followed by most large libraries (e.g. MKL), and is > >> > fairly flexible. > >> > > >> > Concretely, we could start by putting more common functionalities (aka > >> > the > >> > 'core' library) into its own static library. The API would be considered > >> > private to numpy (no stability guaranteed outside numpy), and every > >> > exported > >> > symbol from that library would be decorated appropriately to avoid > >> > potential > >> > clashes (e.g. '_npy_internal_'). > >> > >> I don't see why we need this multi-layered complexity, though. > > > > > > For several reasons: > > > > - when you want to cythonize either extension, it is much easier to > > separate it as cython for CPython API, C for the rest. > > I don't think this will help much, because I think we'll want to have > multiple cython files, and that we'll probably move individual > functions between being implemented in C and Cython (including utility > functions). So that means we need to solve the problem of mixing C and > Cython files inside a single library. > > > Separating the pure C code into static lib is the simple way of > achieving the same goal. Essentially, you write: > > # implemented in npyinternal.a > _npy_internal_foo(....) > > # implemented in merged_multiarray_umath.pyx > cdef PyArray_Foo(...): > # use _npy_internal_foo() > > then our merged_multiarray_umath.so is built by linking the .pyx and the > npyinternal.a together. IOW, the static link is internal. > > Going through npyinternal.a instead of just linking .o from pure C and > Cython together gives us the following: > > 1. the .a can just use normal linking strategies instead of the > awkward capsule thing. Those are easy to get wrong when using cython as > you may end up with multiple internal copies of the wrapped object > inside capsule, causing hard to track bugs (this is what we wasted most > of the time on w/ Stefan and Kurt during ds4ds) > 2. the only public symbols in .a are the ones needed by the cython > wrapping, and since those are decorated with npy_internal, clashes are > unlikely to happen > 3. since most of the code is already in .a internally, supporting the > static linking should be simpler since the only difference is how you > statically link the cython-generated code. Because of 1, you are also > less likely to cause nasty surprises when putting everything together. > I don't see why static libraries for internals are discussed at all? There is not much difference between an .a (archive) file and an .o (object) file. 
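(As a concrete illustration of the build-system mechanics being debated in this thread, here is a rough numpy.distutils sketch; every path, library, and module name in it is hypothetical and is only meant to show how the "internal static library" spelling differs from linking the object files directly, not to describe numpy's actual setup.)

# Hypothetical sketch only -- not numpy's real configuration.
from numpy.distutils.misc_util import Configuration

def configuration(parent_package='', top_path=None):
    config = Configuration('core', parent_package, top_path)

    # One option: build the pure-C internals once as a static
    # convenience library (an npyinternal.a-style archive) ...
    config.add_library('npyinternal',
                       sources=['src/internal/foo.c',
                                'src/internal/bar.c'])

    # ... and link the (Cython-generated) extension against it.
    config.add_extension('merged_multiarray_umath',
                         sources=['src/merged_multiarray_umath.c'],
                         libraries=['npyinternal'])

    # The alternative discussed below is to skip the archive and simply
    # list the same C sources on the extension, letting distutils link
    # the resulting object files directly.
    return config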
What you call a static library is just a collection of object files with an index slapped on top for faster lookup. Whether a symbol is exported or not is defined in the object file, not the archive file, so in this regard a static library or a collection of .o files makes no difference. So our current system also produces a library; the only thing that's "missing" is bundling it into an archive via ar cru *.o. I also don't see how pycapsule plays a role in this. You don't need pycapsule to link a bunch of object files together. So for me the issue is simply, what is easier with distutils: get the list of object files to link against the cython file, or first create a static library from the list of object files and link that against the cython object. I don't think either way should be particularly hard. So there is not really much to discuss. Do whatever is easier or results in nicer code. As for adding cython to numpy, I'd start with letting a cython file provide the multiarraymodule init function with all regular numpy object files linked into that thing. Then we have a pyx file with minimal bloat to get started, and it should also be independent of merging umath (which I'm in favour of). When that single pyx module file gets too large, concatenating multiple files together could probably work until cython supports a split util/user-code build. From njs at pobox.com Thu Oct 8 15:47:56 2015 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 8 Oct 2015 12:47:56 -0700 Subject: [Numpy-discussion] reorganizing numpy internal extensions (was: Re: Should we drop support for "one file" compilation mode?) In-Reply-To: References: Message-ID: On Oct 8, 2015 06:30, "David Cournapeau" wrote: > [...] > > Separating the pure C code into static lib is the simple way of achieving the same goal.
Essentially, you write: > > > > # implemented in npyinternal.a > > _npy_internal_foo(....) > > > > # implemented in merged_multiarray_umath.pyx > > cdef PyArray_Foo(...): > > # use _npy_internal_foo() > > > > then our merged_multiarray_umath.so is built by linking the .pyx and the > npyinternal.a together. IOW, the static link is internal. > > > > Going through npyinternal.a instead of just linking .o from pure C and > Cython together gives us the following: > > > > 1. the .a can just use normal linking strategies instead of the awkward > capsule thing. Those are easy to get wrong when using cython as you may end > up with multiple internal copies of the wrapped object inside capsule, > causing hard to track bugs (this is what we wasted most of the time on w/ > Stefan and Kurt during ds4ds) > > Check out St?fan's branch -- it just uses regular linking to mix cython > and C. > I know, we worked on this together after all ;) My suggested organisation is certainly not mandatory, I was not trying to claim otherwise, sorry if that was unclear. At that point, I guess the consensus is that I have to prove my suggestion is useful. I will take a few more hours to submit a PR with the umath conversion (maybe merging w/ the work from St?fan). I discovered on my flight back that you can call PyModule_Init multiple times for a given module, which is useful while we do the transition C->Cython for the module initialization (it is not documented as possible, so I would not to rely on it for long either). David -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Oct 8 16:57:02 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 Oct 2015 14:57:02 -0600 Subject: [Numpy-discussion] Should we drop support for "one file" compilation mode? In-Reply-To: References: <20151006130717.5304c762@fsol> <20151006190055.120ffa8a@fsol> Message-ID: PR #6429 is a preliminary cut at removing single file build support. A bit of cleanup remains, mostly rearranging some defines for style. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Oct 8 18:05:09 2015 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 8 Oct 2015 15:05:09 -0700 Subject: [Numpy-discussion] method to calculate the magnitude squared In-Reply-To: References: Message-ID: Hi Phillip, My advice would be to stick with the function call. It's consistent with most other array operations (esp. when you consider that the vast majority of operations on arrays are functions defined in third party libraries like yours), and the more things we add to the core array object, the more work it is for people implementing new array-style containers. I definitely would not recommend subclassing ndarray for this purpose -- there are all kinds of subtle problems that you'll run into that mean it's extremely difficult to do well, and may well be impossible to do perfectly. Good luck, -n On Oct 5, 2015 21:08, "Phillip Feldman" wrote: > My apologies for the slow response; I was experiencing some technical > problems with e-mail. > > In answer to Antoine's question, my main desire is for a numpy ndarray > method, for the convenience, with a secondary goal being improved > performance. > > I have added the function `magsq` to my library, but would like to access > it as a method rather than as a function. 
I understand that I could create > a class that inherits from NumPy and add a `magsq` method to that class, > but this has a number of disadvantages. > > Phillip > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu Oct 8 20:30:12 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 8 Oct 2015 17:30:12 -0700 Subject: [Numpy-discussion] Oops - maybe post3 numpy file? Message-ID: Hi, I'm afraid I made a mistake uploading OSX wheels for numpy 1.10.0. Using twine to do the upload generated a new release - 1.10.0.post2 - containing only the wheels. I deleted that new release to avoid confusion, but now, when I try and upload the wheels to the 1.10.0 pypi release via the web form, I get this error: Error processing form This filename has previously been used, you should use a different version. Any chance of a post3 upload so I can upload some matching wheels? Sorry about that, Matthew From charlesr.harris at gmail.com Thu Oct 8 20:39:39 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 Oct 2015 18:39:39 -0600 Subject: [Numpy-discussion] Oops - maybe post3 numpy file? In-Reply-To: References: Message-ID: On Thu, Oct 8, 2015 at 6:30 PM, Matthew Brett wrote: > Hi, > > I'm afraid I made a mistake uploading OSX wheels for numpy 1.10.0. > Using twine to do the upload generated a new release - 1.10.0.post2 - > containing only the wheels. I deleted that new release to avoid > confusion, but now, when I try and upload the wheels to the 1.10.0 > pypi release via the web form, I get this error: > > Error processing form > > This filename has previously been used, you should use a different version. > > Any chance of a post3 upload so I can upload some matching wheels? > > Sorry about that, > Yeah, pipy is why we are on post2 already. Given the problem with msvc9, I think we are due for 1.10.1 in a day or two. Or, I could revert the troublesome commit and do a post3 tomorrow. Hmm... decisions, decisions. I'll see if Julian has anything to say in the morning and go from there. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Oct 8 20:44:48 2015 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 8 Oct 2015 17:44:48 -0700 Subject: [Numpy-discussion] NumFOCUS fiscal sponsorship agreement In-Reply-To: References: Message-ID: <1510892797960772092@unknownmsgid> Looks good to me. This pretty exciting, actually :-) -CHB Sent from my iPhone > On Oct 7, 2015, at 10:57 PM, Nathaniel Smith wrote: > > Hi all, > > Now that the governance document is in place, we need to get our legal > ducks in a row by signing a fiscal sponsorship agreement with > NumFOCUS. > > The basic idea here is that there are times when you really need some > kind of corporation to represent the project -- the legal system for > better or worse does not understand "a bunch of folks on a mailing > list" as a legal entity capable of accepting donations, or holding > funds or other assets like domain names. The obvious solution is to > incorporate a company to represent the project -- but incorporating a > company involves lots of super-annoying paperwork. (Like, *super* > annoying.) 
So a standard trick is that a single non-profit corporation > acts as an umbrella organization providing these services to multiple > projects at once, and this is called "fiscal sponsorship". You can > read more about it here: > https://en.wikipedia.org/wiki/Fiscal_sponsorship > > NumFOCUS's standard comprehensive FSA agreement can be seen here: > > https://docs.google.com/document/d/11YqMX9UrgfCSgiQEUzmOFyg6Ku-vED6gMxhO6J9lCgg/edit?usp=sharing > > and we have the option of negotiating changes if there's anything we don't like. > > They also have a FAQ: > https://docs.google.com/document/d/1zdXp07dLvkbqBrDsw96P6mkqxnWzKJuM-1f4408I6Qs/edit?usp=sharing > > I've read through the document and didn't see anything that bothered > me, except that I'm not quite sure how to make the split between the > steering council and numfocus subcommittee that we have in our > governance model sync up with their language about the "leadership > body", and in particular the language in section 10 about simple > majority votes. So I've queried them about that already. > > In the mean time, I'd encourage anyone with an interest to look it > over and speak up if you see anything that you think should be changed > before we sign. > > Cheers, > -n > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From josef.pktd at gmail.com Thu Oct 8 21:19:48 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 8 Oct 2015 21:19:48 -0400 Subject: [Numpy-discussion] Oops - maybe post3 numpy file? In-Reply-To: References: Message-ID: On Thu, Oct 8, 2015 at 8:39 PM, Charles R Harris wrote: > > > On Thu, Oct 8, 2015 at 6:30 PM, Matthew Brett > wrote: > >> Hi, >> >> I'm afraid I made a mistake uploading OSX wheels for numpy 1.10.0. >> Using twine to do the upload generated a new release - 1.10.0.post2 - >> containing only the wheels. I deleted that new release to avoid >> confusion, but now, when I try and upload the wheels to the 1.10.0 >> pypi release via the web form, I get this error: >> >> Error processing form >> >> This filename has previously been used, you should use a different >> version. >> >> Any chance of a post3 upload so I can upload some matching wheels? >> >> Sorry about that, >> > > Yeah, pipy is why we are on post2 already. Given the problem with msvc9, I > think we are due for 1.10.1 in a day or two. Or, I could revert the > troublesome commit and do a post3 tomorrow. Hmm... decisions, decisions. > I'll see if Julian has anything to say in the morning and go from there. > If you manage a release without a `post` post-fix, then I would not have to worry right away about what to do about a statsmodels setup.py that cannot handle it. Josef > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Oct 8 21:26:39 2015 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 8 Oct 2015 18:26:39 -0700 Subject: [Numpy-discussion] Oops - maybe post3 numpy file? In-Reply-To: References: Message-ID: On Oct 8, 2015 5:39 PM, "Charles R Harris" wrote: > > On Thu, Oct 8, 2015 at 6:30 PM, Matthew Brett wrote: >> >> Hi, >> >> I'm afraid I made a mistake uploading OSX wheels for numpy 1.10.0. 
>> Using twine to do the upload generated a new release - 1.10.0.post2 - >> containing only the wheels. I deleted that new release to avoid >> confusion, but now, when I try and upload the wheels to the 1.10.0 >> pypi release via the web form, I get this error: >> >> Error processing form >> >> This filename has previously been used, you should use a different version. >> >> Any chance of a post3 upload so I can upload some matching wheels? >> >> Sorry about that, > > > Yeah, pipy is why we are on post2 already. Given the problem with msvc9, I think we are due for 1.10.1 in a day or two. Or, I could revert the troublesome commit and do a post3 tomorrow. Hmm... decisions, decisions. I'll see if Julian has anything to say in the morning and go from there. I vote that we increment the micro number every time we upload a new source release, and reserve the postN suffix for binary-only uploads. If this means we have a tiny 1.10.1 then oh well, there's always 1.10.2 -- we probably won't run out of numbers :-). -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Oct 8 21:28:08 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 Oct 2015 19:28:08 -0600 Subject: [Numpy-discussion] Oops - maybe post3 numpy file? In-Reply-To: References: Message-ID: On Thu, Oct 8, 2015 at 7:19 PM, wrote: > > > On Thu, Oct 8, 2015 at 8:39 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Thu, Oct 8, 2015 at 6:30 PM, Matthew Brett >> wrote: >> >>> Hi, >>> >>> I'm afraid I made a mistake uploading OSX wheels for numpy 1.10.0. >>> Using twine to do the upload generated a new release - 1.10.0.post2 - >>> containing only the wheels. I deleted that new release to avoid >>> confusion, but now, when I try and upload the wheels to the 1.10.0 >>> pypi release via the web form, I get this error: >>> >>> Error processing form >>> >>> This filename has previously been used, you should use a different >>> version. >>> >>> Any chance of a post3 upload so I can upload some matching wheels? >>> >>> Sorry about that, >>> >> >> Yeah, pipy is why we are on post2 already. Given the problem with msvc9, >> I think we are due for 1.10.1 in a day or two. Or, I could revert the >> troublesome commit and do a post3 tomorrow. Hmm... decisions, decisions. >> I'll see if Julian has anything to say in the morning and go from there. >> > > > If you manage a release without a `post` post-fix, then I would not have > to worry right away about what to do about a statsmodels setup.py that > cannot handle it. > It's a learning experience all round :-) Might take a look at NumpyVersion in numpy/lib if handling versioning is a problem. But... NumpyVersion is buggy also In [7]: NumpyVersion('1.10.0.post1') < '1.10.0' Out[7]: True Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Oct 8 21:32:12 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 8 Oct 2015 19:32:12 -0600 Subject: [Numpy-discussion] Oops - maybe post3 numpy file? In-Reply-To: References: Message-ID: On Thu, Oct 8, 2015 at 7:26 PM, Nathaniel Smith wrote: > On Oct 8, 2015 5:39 PM, "Charles R Harris" > wrote: > > > > On Thu, Oct 8, 2015 at 6:30 PM, Matthew Brett > wrote: > >> > >> Hi, > >> > >> I'm afraid I made a mistake uploading OSX wheels for numpy 1.10.0. > >> Using twine to do the upload generated a new release - 1.10.0.post2 - > >> containing only the wheels. 
I deleted that new release to avoid > >> confusion, but now, when I try and upload the wheels to the 1.10.0 > >> pypi release via the web form, I get this error: > >> > >> Error processing form > >> > >> This filename has previously been used, you should use a different > version. > >> > >> Any chance of a post3 upload so I can upload some matching wheels? > >> > >> Sorry about that, > > > > > > Yeah, pipy is why we are on post2 already. Given the problem with msvc9, > I think we are due for 1.10.1 in a day or two. Or, I could revert the > troublesome commit and do a post3 tomorrow. Hmm... decisions, decisions. > I'll see if Julian has anything to say in the morning and go from there. > > I vote that we increment the micro number every time we upload a new > source release, and reserve the postN suffix for binary-only uploads. If > this means we have a tiny 1.10.1 then oh well, there's always 1.10.2 -- we > probably won't run out of numbers :-). > The only difference between 1.10.0 and 1.10.0.post2 is that the latter is signed. Sigh. We need to capture this experience in the HOWTO_RELEASE document. Matthew, can you take care of that? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu Oct 8 22:45:57 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 8 Oct 2015 19:45:57 -0700 Subject: [Numpy-discussion] Oops - maybe post3 numpy file? In-Reply-To: References: Message-ID: Hi, On Thu, Oct 8, 2015 at 6:32 PM, Charles R Harris wrote: > > > On Thu, Oct 8, 2015 at 7:26 PM, Nathaniel Smith wrote: >> >> On Oct 8, 2015 5:39 PM, "Charles R Harris" >> wrote: >> > >> > On Thu, Oct 8, 2015 at 6:30 PM, Matthew Brett >> > wrote: >> >> >> >> Hi, >> >> >> >> I'm afraid I made a mistake uploading OSX wheels for numpy 1.10.0. >> >> Using twine to do the upload generated a new release - 1.10.0.post2 - >> >> containing only the wheels. I deleted that new release to avoid >> >> confusion, but now, when I try and upload the wheels to the 1.10.0 >> >> pypi release via the web form, I get this error: >> >> >> >> Error processing form >> >> >> >> This filename has previously been used, you should use a different >> >> version. >> >> >> >> Any chance of a post3 upload so I can upload some matching wheels? >> >> >> >> Sorry about that, >> > >> > >> > Yeah, pipy is why we are on post2 already. Given the problem with msvc9, >> > I think we are due for 1.10.1 in a day or two. Or, I could revert the >> > troublesome commit and do a post3 tomorrow. Hmm... decisions, decisions. >> > I'll see if Julian has anything to say in the morning and go from there. >> >> I vote that we increment the micro number every time we upload a new >> source release, and reserve the postN suffix for binary-only uploads. If >> this means we have a tiny 1.10.1 then oh well, there's always 1.10.2 -- we >> probably won't run out of numbers :-). > > > The only difference between 1.10.0 and 1.10.0.post2 is that the latter is > signed. Sigh. We need to capture this experience in the HOWTO_RELEASE > document. Matthew, can you take care of that? Is the summary this: * never have an actual numpy version .postN; * releases always have source with a clean Major.Minor.Micro release number; * binary packages for Minor.Minor.Micro release numbers may have filenames ending in .postN * these binary packages should be uploaded via the web interface to avoid creating a new release ? 
Matthew From njs at pobox.com Fri Oct 9 02:14:55 2015 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 8 Oct 2015 23:14:55 -0700 Subject: [Numpy-discussion] reorganizing numpy internal extensions (was: Re: Should we drop support for "one file" compilation mode?) In-Reply-To: References: Message-ID: On Thu, Oct 8, 2015 at 1:07 PM, David Cournapeau wrote: > > On Thu, Oct 8, 2015 at 8:47 PM, Nathaniel Smith wrote: >> >> On Oct 8, 2015 06:30, "David Cournapeau" wrote: >> > >> [...] >> > >> > Separating the pure C code into static lib is the simple way of >> > achieving the same goal. Essentially, you write: >> > >> > # implemented in npyinternal.a >> > _npy_internal_foo(....) >> > >> > # implemented in merged_multiarray_umath.pyx >> > cdef PyArray_Foo(...): >> > # use _npy_internal_foo() >> > >> > then our merged_multiarray_umath.so is built by linking the .pyx and the >> > npyinternal.a together. IOW, the static link is internal. >> > >> > Going through npyinternal.a instead of just linking .o from pure C and >> > Cython together gives us the following: >> > >> > 1. the .a can just use normal linking strategies instead of the awkward >> > capsule thing. Those are easy to get wrong when using cython as you may end >> > up with multiple internal copies of the wrapped object inside capsule, >> > causing hard to track bugs (this is what we wasted most of the time on w/ >> > Stefan and Kurt during ds4ds) >> >> Check out St?fan's branch -- it just uses regular linking to mix cython >> and C. > > I know, we worked on this together after all ;) > > My suggested organisation is certainly not mandatory, I was not trying to > claim otherwise, sorry if that was unclear. > > At that point, I guess the consensus is that I have to prove my suggestion > is useful. I will take a few more hours to submit a PR with the umath > conversion (maybe merging w/ the work from St?fan). Okay! Still not sure what capsules have to do with anything, but I guess the PR will either make it clear or else make it clear that it doesn't matter :-). > I discovered on my > flight back that you can call PyModule_Init multiple times for a given > module, which is useful while we do the transition C->Cython for the module > initialization (it is not documented as possible, so I would not to rely on > it for long either). Oh, right, Stefan mentioned something about this... PyModule_Init is a python-2-only thing, so whatever it does now is what it will do forever and ever amen. But I can't think of any good reason to call it twice -- if your goal is just to get a reference to the new module, then once PyModule_Init has run once, you can just run PyImport_ImportModule (assuming you know your fully-qualified module name). -n -- Nathaniel J. Smith -- http://vorpus.org From ralf.gommers at gmail.com Fri Oct 9 06:28:58 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 9 Oct 2015 12:28:58 +0200 Subject: [Numpy-discussion] Oops - maybe post3 numpy file? In-Reply-To: References: Message-ID: On Fri, Oct 9, 2015 at 4:45 AM, Matthew Brett wrote: > Hi, > > On Thu, Oct 8, 2015 at 6:32 PM, Charles R Harris > wrote: > > > > > > On Thu, Oct 8, 2015 at 7:26 PM, Nathaniel Smith wrote: > >> > >> On Oct 8, 2015 5:39 PM, "Charles R Harris" > >> wrote: > >> > > >> > On Thu, Oct 8, 2015 at 6:30 PM, Matthew Brett < > matthew.brett at gmail.com> > >> > wrote: > >> >> > >> >> Hi, > >> >> > >> >> I'm afraid I made a mistake uploading OSX wheels for numpy 1.10.0. 
> >> >> Using twine to do the upload generated a new release - 1.10.0.post2 - > >> >> containing only the wheels. I deleted that new release to avoid > >> >> confusion, but now, when I try and upload the wheels to the 1.10.0 > >> >> pypi release via the web form, I get this error: > >> >> > >> >> Error processing form > >> >> > >> >> This filename has previously been used, you should use a different > >> >> version. > >> >> > >> >> Any chance of a post3 upload so I can upload some matching wheels? > >> >> > >> >> Sorry about that, > >> > > >> > > >> > Yeah, pipy is why we are on post2 already. Given the problem with > msvc9, > >> > I think we are due for 1.10.1 in a day or two. Or, I could revert the > >> > troublesome commit and do a post3 tomorrow. Hmm... decisions, > decisions. > >> > I'll see if Julian has anything to say in the morning and go from > there. > >> > >> I vote that we increment the micro number every time we upload a new > >> source release, and reserve the postN suffix for binary-only uploads. If > >> this means we have a tiny 1.10.1 then oh well, there's always 1.10.2 -- > we > >> probably won't run out of numbers :-). > > > > > > The only difference between 1.10.0 and 1.10.0.post2 is that the latter is > > signed. Sigh. We need to capture this experience in the HOWTO_RELEASE > > document. Matthew, can you take care of that? > > Is the summary this: > > * never have an actual numpy version .postN; > * releases always have source with a clean Major.Minor.Micro release > number; > * binary packages for Minor.Minor.Micro release numbers may have > filenames ending in .postN > The few times in the past when we've needed to fix a binary, we've just re-uploaded it with the same name. This seems much preferable to me than confusing users with a post-fix on PyPi that doesn't even match ``numpy.__version__`` and that is so uncommon that I've never seen it used anywhere. If re-uploading with the same name is now disallowed by PyPi (is it?) then bumping the micro version number as Nathaniel proposes would be the way to go imho. Ralf > * these binary packages should be uploaded via the web interface to > avoid creating a new release > > ? > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Oct 9 10:43:50 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 9 Oct 2015 08:43:50 -0600 Subject: [Numpy-discussion] Oops - maybe post3 numpy file? In-Reply-To: References: Message-ID: On Fri, Oct 9, 2015 at 4:28 AM, Ralf Gommers wrote: > > > On Fri, Oct 9, 2015 at 4:45 AM, Matthew Brett > wrote: > >> Hi, >> >> On Thu, Oct 8, 2015 at 6:32 PM, Charles R Harris >> wrote: >> > >> > >> > On Thu, Oct 8, 2015 at 7:26 PM, Nathaniel Smith wrote: >> >> >> >> On Oct 8, 2015 5:39 PM, "Charles R Harris" >> >> wrote: >> >> > >> >> > On Thu, Oct 8, 2015 at 6:30 PM, Matthew Brett < >> matthew.brett at gmail.com> >> >> > wrote: >> >> >> >> >> >> Hi, >> >> >> >> >> >> I'm afraid I made a mistake uploading OSX wheels for numpy 1.10.0. >> >> >> Using twine to do the upload generated a new release - 1.10.0.post2 >> - >> >> >> containing only the wheels. 
I deleted that new release to avoid >> >> >> confusion, but now, when I try and upload the wheels to the 1.10.0 >> >> >> pypi release via the web form, I get this error: >> >> >> >> >> >> Error processing form >> >> >> >> >> >> This filename has previously been used, you should use a different >> >> >> version. >> >> >> >> >> >> Any chance of a post3 upload so I can upload some matching wheels? >> >> >> >> >> >> Sorry about that, >> >> > >> >> > >> >> > Yeah, pipy is why we are on post2 already. Given the problem with >> msvc9, >> >> > I think we are due for 1.10.1 in a day or two. Or, I could revert the >> >> > troublesome commit and do a post3 tomorrow. Hmm... decisions, >> decisions. >> >> > I'll see if Julian has anything to say in the morning and go from >> there. >> >> >> >> I vote that we increment the micro number every time we upload a new >> >> source release, and reserve the postN suffix for binary-only uploads. >> If >> >> this means we have a tiny 1.10.1 then oh well, there's always 1.10.2 >> -- we >> >> probably won't run out of numbers :-). >> > >> > >> > The only difference between 1.10.0 and 1.10.0.post2 is that the latter >> is >> > signed. Sigh. We need to capture this experience in the HOWTO_RELEASE >> > document. Matthew, can you take care of that? >> >> Is the summary this: >> >> * never have an actual numpy version .postN; >> * releases always have source with a clean Major.Minor.Micro release >> number; >> * binary packages for Minor.Minor.Micro release numbers may have >> filenames ending in .postN >> > > The few times in the past when we've needed to fix a binary, we've just > re-uploaded it with the same name. This seems much preferable to me than > confusing users with a post-fix on PyPi that doesn't even match > ``numpy.__version__`` and that is so uncommon that I've never seen it used > anywhere. > > If re-uploading with the same name is now disallowed by PyPi (is it?) then > bumping the micro version number as Nathaniel proposes would be the way to > go imho. > You are not allowed to reuse a file name, and numpy.__version__ must match the file name or pip install will fail. This has all been a bit of experimentation and I think we have learned something. Agree about not using the `.postN` suffix. I expect we will have fewer problems next time around. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Oct 9 12:50:25 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 9 Oct 2015 10:50:25 -0600 Subject: [Numpy-discussion] msvc9 comipiler problems Message-ID: Hi All, There is a compilation problem with 1.10.0 on 32 bit windows using the msvc9 compiler. One possible solution to this is to drop support for python 2.6. The last, and final, release of of that series was Python 2.6.9 in Oct 2013. The first release was in 2008. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Fri Oct 9 12:53:17 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Fri, 9 Oct 2015 18:53:17 +0200 Subject: [Numpy-discussion] msvc9 comipiler problems In-Reply-To: References: Message-ID: <5617F0FD.1080601@googlemail.com> On 10/09/2015 06:50 PM, Charles R Harris wrote: > Hi All, > > There is a compilation problem > with 1.10.0 on 32 bit > windows using the msvc9 compiler. One possible solution to this is to > drop support for python 2.6. 
The last, and final, release of of that > series was Python 2.6.9 in Oct 2013. The first release was in 2008. > > Thoughts? > doesn't the problem also affect python2.7? I don't recall which msvc is required for that but I though it was v9. If its only the compiler needed for python2.6 thats affected then +1, we already dropped binary support for that in numpy 1.9. From charlesr.harris at gmail.com Fri Oct 9 12:53:39 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 9 Oct 2015 10:53:39 -0600 Subject: [Numpy-discussion] msvc9 comipiler problems In-Reply-To: References: Message-ID: On Fri, Oct 9, 2015 at 10:50 AM, Charles R Harris wrote: > Hi All, > > There is a compilation problem > with 1.10.0 on 32 bit windows > using the msvc9 compiler. One possible solution to this is to drop support > for python 2.6. The last, and final, release of of that series was Python > 2.6.9 in Oct 2013. The first release was in 2008. > > Thoughts? > NVM. Looks like Python 2.7 also uses msvc9. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From erik.m.bray+numpy at gmail.com Fri Oct 9 13:06:45 2015 From: erik.m.bray+numpy at gmail.com (Erik Bray) Date: Fri, 9 Oct 2015 13:06:45 -0400 Subject: [Numpy-discussion] Should we allow arrays with "empty string" dtypes? Message-ID: Hi all, This is a post about strings--for the purpose of discussion then I'll be assuming Python 2 and string means non-unicode strings. However, the discussion applies all the same to unicode strings. For a long time Numpy has had the following behavior: When creating an array with a zero-width string dtype like 'S0', Numpy automatically increases the width of the dtype to support the longest string in the input, like so: >>> np.array(['abc', 'de'], dtype='S0') # or equivalently dtype=str array(['abc', 'de'], dtype='|S3') But it *always* converts to a one character string dtype, at a minimum. So even when passing in a list of empty strings: >>> np.array(['', '', ''], dtype='S0') array(['', '', ''], dtype='|S1') Or even >>> np.zeros(3, dtype='S0') array(['', '', ''], dtype='|S1') This behavior is encoded in PyArray_NewFromDescr_int [1] and is very old (since 2006) [2]. This made sense at the time, certainly, since the logic for handling zero-sized strides was shaky, but most issues with that have long since been worked out. However, there's an oversight associated with this that it *is* possible to make a structured dtype that has a zero-width string as one of its fields. But since even PyArray_View goes through PyArray_NewFromDescr, viewing such a field results in a non-empty view that contains garbage and allows writing garbage into a structured array. This is documented in several issues, such as #473 [3]. A fixed I've proposed in #6430 [4] takes a conservative approach of keeping all the existing behavior *except* in the case of structured arrays, where views with a dtype of 'S0' would be allowed. However, a simpler fix would be to just remove the restriction on creating arrays of dtype 'S0' in general (with my first example above being one exception--given a list of strings it will still convert 'S0' to a dtype that can hold the longest string in the list). I think I would prefer the general fix, but it would be a slight change in behavior for any code using PyArray_NewFromDescr to create string arrays. But would anyone actually be negatively impacted by such a change? It seems to me that any code actually relies on the existing behavior would smell fishy anyways. 
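(A rough sketch of the structured-array corner described above, for anyone who wants to reproduce it locally; the field names are arbitrary and the exact output depends on the numpy version.)

import numpy as np

# The automatic widening described above: 'S0' silently becomes 'S1'.
print(np.zeros(3, dtype='S0').dtype)      # |S1

# A structured dtype can nevertheless carry a zero-width string field.
# Viewing that field also goes through PyArray_NewFromDescr, so on
# affected versions the "empty" field comes back one byte wide and
# aliases the storage of the neighbouring field (see gh-473):
dt = np.dtype([('s', 'S0'), ('x', 'i4')])
a = np.zeros(3, dtype=dt)
print(a['s'].dtype)                       # expected |S0, wider on affected versions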
Thanks, Erik [1] https://github.com/numpy/numpy/blob/8cb3ec6ab804f594daf553e53e7cf7478656bebd/numpy/core/src/multiarray/ctors.c#L940-L956 [2] https://github.com/numpy/numpy/commit/b022765aa487070866663b1707e4a2a0d8ead2e8 [3] https://github.com/numpy/numpy/issues/473 [4] https://github.com/numpy/numpy/pull/6430 From matthew.brett at gmail.com Fri Oct 9 14:17:15 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 9 Oct 2015 11:17:15 -0700 Subject: [Numpy-discussion] Oops - maybe post3 numpy file? In-Reply-To: References: Message-ID: On Fri, Oct 9, 2015 at 7:43 AM, Charles R Harris wrote: > > > On Fri, Oct 9, 2015 at 4:28 AM, Ralf Gommers wrote: >> >> >> >> On Fri, Oct 9, 2015 at 4:45 AM, Matthew Brett >> wrote: >>> >>> Hi, >>> >>> On Thu, Oct 8, 2015 at 6:32 PM, Charles R Harris >>> wrote: >>> > >>> > >>> > On Thu, Oct 8, 2015 at 7:26 PM, Nathaniel Smith wrote: >>> >> >>> >> On Oct 8, 2015 5:39 PM, "Charles R Harris" >>> >> wrote: >>> >> > >>> >> > On Thu, Oct 8, 2015 at 6:30 PM, Matthew Brett >>> >> > >>> >> > wrote: >>> >> >> >>> >> >> Hi, >>> >> >> >>> >> >> I'm afraid I made a mistake uploading OSX wheels for numpy 1.10.0. >>> >> >> Using twine to do the upload generated a new release - 1.10.0.post2 >>> >> >> - >>> >> >> containing only the wheels. I deleted that new release to avoid >>> >> >> confusion, but now, when I try and upload the wheels to the 1.10.0 >>> >> >> pypi release via the web form, I get this error: >>> >> >> >>> >> >> Error processing form >>> >> >> >>> >> >> This filename has previously been used, you should use a different >>> >> >> version. >>> >> >> >>> >> >> Any chance of a post3 upload so I can upload some matching wheels? >>> >> >> >>> >> >> Sorry about that, >>> >> > >>> >> > >>> >> > Yeah, pipy is why we are on post2 already. Given the problem with >>> >> > msvc9, >>> >> > I think we are due for 1.10.1 in a day or two. Or, I could revert >>> >> > the >>> >> > troublesome commit and do a post3 tomorrow. Hmm... decisions, >>> >> > decisions. >>> >> > I'll see if Julian has anything to say in the morning and go from >>> >> > there. >>> >> >>> >> I vote that we increment the micro number every time we upload a new >>> >> source release, and reserve the postN suffix for binary-only uploads. >>> >> If >>> >> this means we have a tiny 1.10.1 then oh well, there's always 1.10.2 >>> >> -- we >>> >> probably won't run out of numbers :-). >>> > >>> > >>> > The only difference between 1.10.0 and 1.10.0.post2 is that the latter >>> > is >>> > signed. Sigh. We need to capture this experience in the HOWTO_RELEASE >>> > document. Matthew, can you take care of that? >>> >>> Is the summary this: >>> >>> * never have an actual numpy version .postN; >>> * releases always have source with a clean Major.Minor.Micro release >>> number; >>> * binary packages for Minor.Minor.Micro release numbers may have >>> filenames ending in .postN >> >> >> The few times in the past when we've needed to fix a binary, we've just >> re-uploaded it with the same name. This seems much preferable to me than >> confusing users with a post-fix on PyPi that doesn't even match >> ``numpy.__version__`` and that is so uncommon that I've never seen it used >> anywhere. >> >> If re-uploading with the same name is now disallowed by PyPi (is it?) then >> bumping the micro version number as Nathaniel proposes would be the way to >> go imho. > > > You are not allowed to reuse a file name, and numpy.__version__ must match > the file name or pip install will fail. 
This has all been a bit of > experimentation and I think we have learned something. Agree about not using > the `.postN` suffix. I expect we will have fewer problems next time around. OK - any chance of a 1.10.1 release urgently? Otherwise the wheel installs don't work on OSX... Matthew From jeffreback at gmail.com Fri Oct 9 14:31:13 2015 From: jeffreback at gmail.com (Jeff Reback) Date: Fri, 9 Oct 2015 14:31:13 -0400 Subject: [Numpy-discussion] ANN: pandas v0.17.0 released Message-ID: Hi, We are proud to announce v0.17.0 of pandas. This is a major release from 0.16.2 and includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. We recommend that all users upgrade to this version. This was a release of 4 months with 515 commits by 112 authors encompassing 233 issues and 362 pull-requests. We recommend that all users upgrade to this version. *What is it:* *pandas* is a Python package providing fast, flexible, and expressive data structures designed to make working with ?relational? or ?labeled? data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language. *Highlights*: - Release the Global Interpreter Lock (GIL) on some cython operations, see here - Plotting methods are now available as attributes of the .plot accessor, see here - The sorting API has been revamped to remove some long-time inconsistencies, see here - Support for a datetime64[ns] with timezones as a first-class dtype, see here - The default for to_datetime will now be to raise when presented with unparseable formats, previously this would return the original input, see here - The default for dropna in HDFStore has changed to False, to store by default all rows even if they are all NaN, see here - Support for Series.dt.strftime to generate formatted strings for datetime-likes, see here - Development installed versions of pandas will now have PEP440 compliant version strings GH9518 - Development support for benchmarking with the Air Speed Velocity library GH8316 - Support for reading SAS xport files, see here - Removal of the automatic TimeSeries broadcasting, deprecated since 0.8.0, see here - Display format with plain text can optionally align with Unicode East Asian Width, see here - Compatibility with Python 3.5 GH11097 - Compatibility with matplotlib 1.5.0 GH11111 See the Whatsnew for much more information and the full Documentation link. *How to get it:* Source tarballs, windows wheels, macosx wheels are available on PyPI - note that currently PyPi is not accepting 3.5 wheels. Installation via conda is: - conda install pandas windows wheels are courtesy of Christoph Gohlke and are built on Numpy 1.9 macosx wheels are courtesy of Matthew Brett *Issues:* Please report any issues on our issue tracker : Thanks to all who made this release happen. It is a very large release! 
Jeff *Thanks to all of the contributors* - Alex Rothberg - Andrea Bedini - Andrew Rosenfeld - Andy Li - Anthonios Partheniou - Artemy Kolchinsky - Bernard Willers - Charlie Clark - Chris - Chris Whelan - Christoph Gohlke - Christopher Whelan - Clark Fitzgerald - Clearfield Christopher - Dan Ringwalt - Daniel Ni - Data & Code Expert Experimenting with Code on Data - David Cottrell - David John Gagne - David Kelly - ETF - Eduardo Schettino - Egor - Egor Panfilov - Evan Wright - Frank Pinter - Gabriel Araujo - Garrett-R - Gianluca Rossi - Guillaume Gay - Guillaume Poulin - Harsh Nisar - Ian Henriksen - Ian Hoegen - Jaidev Deshpande - Jan Rudolph - Jan Schulz - Jason Swails - Jeff Reback - Jonas Buyl - Joris Van den Bossche - Joris Vankerschaver - Josh Levy-Kramer - Julien Danjou - Ka Wo Chen - Karrie Kehoe - Kelsey Jordahl - Kerby Shedden - Kevin Sheppard - Lars Buitinck - Leif Johnson - Luis Ortiz - Mac - Matt Gambogi - Matt Savoie - Matthew Gilbert - Maximilian Roos - Michelangelo D'Agostino - Mortada Mehyar - Nick Eubank - Nipun Batra - Ond?ej ?ert?k - Phillip Cloud - Pratap Vardhan - Rafal Skolasinski - Richard Lewis - Rinoc Johnson - Rob Levy - Robert Gieseke - Safia Abdalla - Samuel Denny - Saumitra Shahapure - Sebastian P?lsterl - Sebastian Rubbert - Sheppard, Kevin - Sinhrks - Siu Kwan Lam - Skipper Seabold - Spencer Carrucciu - Stephan Hoyer - Stephen Hoover - Stephen Pascoe - Terry Santegoeds - Thomas Grainger - Tjerk Santegoeds - Tom Augspurger - Vincent Davis - Winterflower - Yaroslav Halchenko - Yuan Tang (Terry) - agijsberts - ajcr - behzad nouri - cel4 - cyrusmaher - davidovitch - ganego - jreback - juricast - larvian - maximilianr - msund - rekcahpassyla - robertzk - scls19fr - seth-p - sinhrks - springcoil - terrytangyuan - tzinckgraf -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Oct 9 14:54:46 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 9 Oct 2015 12:54:46 -0600 Subject: [Numpy-discussion] Oops - maybe post3 numpy file? In-Reply-To: References: Message-ID: On Fri, Oct 9, 2015 at 12:17 PM, Matthew Brett wrote: > On Fri, Oct 9, 2015 at 7:43 AM, Charles R Harris > wrote: > > > > > > On Fri, Oct 9, 2015 at 4:28 AM, Ralf Gommers > wrote: > >> > >> > >> > >> On Fri, Oct 9, 2015 at 4:45 AM, Matthew Brett > >> wrote: > >>> > >>> Hi, > >>> > >>> On Thu, Oct 8, 2015 at 6:32 PM, Charles R Harris > >>> wrote: > >>> > > >>> > > >>> > On Thu, Oct 8, 2015 at 7:26 PM, Nathaniel Smith > wrote: > >>> >> > >>> >> On Oct 8, 2015 5:39 PM, "Charles R Harris" < > charlesr.harris at gmail.com> > >>> >> wrote: > >>> >> > > >>> >> > On Thu, Oct 8, 2015 at 6:30 PM, Matthew Brett > >>> >> > > >>> >> > wrote: > >>> >> >> > >>> >> >> Hi, > >>> >> >> > >>> >> >> I'm afraid I made a mistake uploading OSX wheels for numpy > 1.10.0. > >>> >> >> Using twine to do the upload generated a new release - > 1.10.0.post2 > >>> >> >> - > >>> >> >> containing only the wheels. I deleted that new release to avoid > >>> >> >> confusion, but now, when I try and upload the wheels to the > 1.10.0 > >>> >> >> pypi release via the web form, I get this error: > >>> >> >> > >>> >> >> Error processing form > >>> >> >> > >>> >> >> This filename has previously been used, you should use a > different > >>> >> >> version. > >>> >> >> > >>> >> >> Any chance of a post3 upload so I can upload some matching > wheels? > >>> >> >> > >>> >> >> Sorry about that, > >>> >> > > >>> >> > > >>> >> > Yeah, pipy is why we are on post2 already. 
Given the problem with > >>> >> > msvc9, > >>> >> > I think we are due for 1.10.1 in a day or two. Or, I could revert > >>> >> > the > >>> >> > troublesome commit and do a post3 tomorrow. Hmm... decisions, > >>> >> > decisions. > >>> >> > I'll see if Julian has anything to say in the morning and go from > >>> >> > there. > >>> >> > >>> >> I vote that we increment the micro number every time we upload a new > >>> >> source release, and reserve the postN suffix for binary-only > uploads. > >>> >> If > >>> >> this means we have a tiny 1.10.1 then oh well, there's always 1.10.2 > >>> >> -- we > >>> >> probably won't run out of numbers :-). > >>> > > >>> > > >>> > The only difference between 1.10.0 and 1.10.0.post2 is that the > latter > >>> > is > >>> > signed. Sigh. We need to capture this experience in the HOWTO_RELEASE > >>> > document. Matthew, can you take care of that? > >>> > >>> Is the summary this: > >>> > >>> * never have an actual numpy version .postN; > >>> * releases always have source with a clean Major.Minor.Micro release > >>> number; > >>> * binary packages for Minor.Minor.Micro release numbers may have > >>> filenames ending in .postN > >> > >> > >> The few times in the past when we've needed to fix a binary, we've just > >> re-uploaded it with the same name. This seems much preferable to me than > >> confusing users with a post-fix on PyPi that doesn't even match > >> ``numpy.__version__`` and that is so uncommon that I've never seen it > used > >> anywhere. > >> > >> If re-uploading with the same name is now disallowed by PyPi (is it?) > then > >> bumping the micro version number as Nathaniel proposes would be the way > to > >> go imho. > > > > > > You are not allowed to reuse a file name, and numpy.__version__ must > match > > the file name or pip install will fail. This has all been a bit of > > experimentation and I think we have learned something. Agree about not > using > > the `.postN` suffix. I expect we will have fewer problems next time > around. > > OK - any chance of a 1.10.1 release urgently? Otherwise the wheel > installs don't work on OSX... > Working on it. There is a problem with msvc9 that needs to be addressed, otherwise it would be out already. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Oct 9 14:56:33 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 9 Oct 2015 11:56:33 -0700 Subject: [Numpy-discussion] msvc9 comipiler problems In-Reply-To: References: Message-ID: > > > NVM. Looks like Python 2.7 also uses msvc9. > yup, according to Wikipedia: *Visual C++ 2008* (known also as Visual C++ 9.0) so py2.7 Are you testing with the "MS Visual C++ compiler for Python 2.7" here: http://www.microsoft.com/en-us/download/details.aspx?id=44266 I think the only difference is how.where it is installed, but you never know... -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Fri Oct 9 15:18:02 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 9 Oct 2015 12:18:02 -0700 Subject: [Numpy-discussion] Oops - maybe post3 numpy file? 
In-Reply-To: References: Message-ID: On Fri, Oct 9, 2015 at 11:54 AM, Charles R Harris wrote: > > > On Fri, Oct 9, 2015 at 12:17 PM, Matthew Brett > wrote: >> >> On Fri, Oct 9, 2015 at 7:43 AM, Charles R Harris >> wrote: >> > >> > >> > On Fri, Oct 9, 2015 at 4:28 AM, Ralf Gommers >> > wrote: >> >> >> >> >> >> >> >> On Fri, Oct 9, 2015 at 4:45 AM, Matthew Brett >> >> wrote: >> >>> >> >>> Hi, >> >>> >> >>> On Thu, Oct 8, 2015 at 6:32 PM, Charles R Harris >> >>> wrote: >> >>> > >> >>> > >> >>> > On Thu, Oct 8, 2015 at 7:26 PM, Nathaniel Smith >> >>> > wrote: >> >>> >> >> >>> >> On Oct 8, 2015 5:39 PM, "Charles R Harris" >> >>> >> >> >>> >> wrote: >> >>> >> > >> >>> >> > On Thu, Oct 8, 2015 at 6:30 PM, Matthew Brett >> >>> >> > >> >>> >> > wrote: >> >>> >> >> >> >>> >> >> Hi, >> >>> >> >> >> >>> >> >> I'm afraid I made a mistake uploading OSX wheels for numpy >> >>> >> >> 1.10.0. >> >>> >> >> Using twine to do the upload generated a new release - >> >>> >> >> 1.10.0.post2 >> >>> >> >> - >> >>> >> >> containing only the wheels. I deleted that new release to avoid >> >>> >> >> confusion, but now, when I try and upload the wheels to the >> >>> >> >> 1.10.0 >> >>> >> >> pypi release via the web form, I get this error: >> >>> >> >> >> >>> >> >> Error processing form >> >>> >> >> >> >>> >> >> This filename has previously been used, you should use a >> >>> >> >> different >> >>> >> >> version. >> >>> >> >> >> >>> >> >> Any chance of a post3 upload so I can upload some matching >> >>> >> >> wheels? >> >>> >> >> >> >>> >> >> Sorry about that, >> >>> >> > >> >>> >> > >> >>> >> > Yeah, pipy is why we are on post2 already. Given the problem with >> >>> >> > msvc9, >> >>> >> > I think we are due for 1.10.1 in a day or two. Or, I could revert >> >>> >> > the >> >>> >> > troublesome commit and do a post3 tomorrow. Hmm... decisions, >> >>> >> > decisions. >> >>> >> > I'll see if Julian has anything to say in the morning and go from >> >>> >> > there. >> >>> >> >> >>> >> I vote that we increment the micro number every time we upload a >> >>> >> new >> >>> >> source release, and reserve the postN suffix for binary-only >> >>> >> uploads. >> >>> >> If >> >>> >> this means we have a tiny 1.10.1 then oh well, there's always >> >>> >> 1.10.2 >> >>> >> -- we >> >>> >> probably won't run out of numbers :-). >> >>> > >> >>> > >> >>> > The only difference between 1.10.0 and 1.10.0.post2 is that the >> >>> > latter >> >>> > is >> >>> > signed. Sigh. We need to capture this experience in the >> >>> > HOWTO_RELEASE >> >>> > document. Matthew, can you take care of that? >> >>> >> >>> Is the summary this: >> >>> >> >>> * never have an actual numpy version .postN; >> >>> * releases always have source with a clean Major.Minor.Micro release >> >>> number; >> >>> * binary packages for Minor.Minor.Micro release numbers may have >> >>> filenames ending in .postN >> >> >> >> >> >> The few times in the past when we've needed to fix a binary, we've just >> >> re-uploaded it with the same name. This seems much preferable to me >> >> than >> >> confusing users with a post-fix on PyPi that doesn't even match >> >> ``numpy.__version__`` and that is so uncommon that I've never seen it >> >> used >> >> anywhere. >> >> >> >> If re-uploading with the same name is now disallowed by PyPi (is it?) >> >> then >> >> bumping the micro version number as Nathaniel proposes would be the way >> >> to >> >> go imho. 
>> > >> > >> > You are not allowed to reuse a file name, and numpy.__version__ must >> > match >> > the file name or pip install will fail. This has all been a bit of >> > experimentation and I think we have learned something. Agree about not >> > using >> > the `.postN` suffix. I expect we will have fewer problems next time >> > around. >> >> OK - any chance of a 1.10.1 release urgently? Otherwise the wheel >> installs don't work on OSX... > > > Working on it. There is a problem with msvc9 that needs to be addressed, > otherwise it would be out already. Great, thanks - meanwhile I'll get onto the HOWTO_RELEASE PR in the next couple of hours. Matthew From cmkleffner at gmail.com Fri Oct 9 17:02:11 2015 From: cmkleffner at gmail.com (Carl Kleffner) Date: Fri, 9 Oct 2015 23:02:11 +0200 Subject: [Numpy-discussion] msvc9 comipiler problems In-Reply-To: References: Message-ID: The error occurs also for Python-2.7 win32. I tested numpy-1.10.0+mkl-cp27-none-win32.whl some days ago and reported to C. Gohlke. Carl 2015-10-09 20:56 GMT+02:00 Chris Barker : > >> NVM. Looks like Python 2.7 also uses msvc9. >> > > yup, according to Wikipedia: > > *Visual C++ 2008* (known also as Visual C++ 9.0) > > so py2.7 > > Are you testing with the "MS Visual C++ compiler for Python 2.7" here: > > http://www.microsoft.com/en-us/download/details.aspx?id=44266 > > I think the only difference is how.where it is installed, but you never > know... > > -Chris > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cmkleffner at gmail.com Fri Oct 9 18:29:33 2015 From: cmkleffner at gmail.com (Carl Kleffner) Date: Sat, 10 Oct 2015 00:29:33 +0200 Subject: [Numpy-discussion] numpy-1.11.0.dev0 windows wheels compiled with mingwpy available Message-ID: I made numpy master (numpy-1.11.0.dev0 , https://github.com/numpy/numpy/commit/0243bce23383ff5e894b99e40df2f8fd806ad79f) windows binary wheels available for testing. Install it with pip: > pip install -i https://pypi.anaconda.org/carlkl/simple numpy These builds are compiled with OPENBLAS trunk for BLAS/LAPACK support and the mingwpy compiler toolchain. OpenBLAS is deployed within the numpy wheels. To be performant on all usual CPU architectures OpenBLAS is configured with it's 'dynamic architecture' and automatic CPU detection. This version of numpy fakes long double as double just like the MSVC builds. Some test statistics: win32 (32 bit) numpy-1.11.0.dev0, python-2.6: errors=8, failures=1 numpy-1.11.0.dev0, python-2.7: errors=8, failures=1 numpy-1.11.0.dev0, python-3.3: errors=9 numpy-1.11.0.dev0, python-3.4: errors=9 amd64 (64bit) numpy-1.11.0.dev0, python-2.6: errors=9, failures=6 numpy-1.11.0.dev0, python-2.7: errors=9, failures=6 numpy-1.11.0.dev0, python-3.3: errors=10, failures=6 numpy-1.11.0.dev0, python-3.4: errors=10, failures=6 Carl -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From phillip.m.feldman at gmail.com Sat Oct 10 00:32:40 2015 From: phillip.m.feldman at gmail.com (Phillip Feldman) Date: Fri, 9 Oct 2015 21:32:40 -0700 Subject: [Numpy-discussion] method to calculate the magnitude squared In-Reply-To: References: Message-ID: Hello Nathaniel, It is hard to say what is normative practice with NumPy, because there are at least three paradigms: (1) Some operations are implemented as methods of the `ndarray` class. `sum` and `mean` are examples. (2) Some operations are implemented via functions that invoke a private method of the class. `abs` is an example of this: In [8]: x= array([1+1J]) In [9]: x.__abs__() Out[9]: array([ 1.41421356]) (3) Some operations are implemented as functions that operate directly on the array, e.g., RMS (root-mean-square). Because calculating the square of the magnitude is such a widely-used operation, and is often done in a grossly inefficient manner (e.g., by taking the absolute value, which involves a square-root, and then squaring), I believe that there is a strong argument for doing either (1) or (2). I'd prefer (1). Phillip On Thu, Oct 8, 2015 at 3:05 PM, Nathaniel Smith wrote: > Hi Phillip, > > My advice would be to stick with the function call. It's consistent with > most other array operations (esp. when you consider that the vast majority > of operations on arrays are functions defined in third party libraries like > yours), and the more things we add to the core array object, the more work > it is for people implementing new array-style containers. I definitely > would not recommend subclassing ndarray for this purpose -- there are all > kinds of subtle problems that you'll run into that mean it's extremely > difficult to do well, and may well be impossible to do perfectly. > > Good luck, > -n > On Oct 5, 2015 21:08, "Phillip Feldman" > wrote: > >> My apologies for the slow response; I was experiencing some technical >> problems with e-mail. >> >> In answer to Antoine's question, my main desire is for a numpy ndarray >> method, for the convenience, with a secondary goal being improved >> performance. >> >> I have added the function `magsq` to my library, but would like to access >> it as a method rather than as a function. I understand that I could create >> a class that inherits from NumPy and add a `magsq` method to that class, >> but this has a number of disadvantages. >> >> Phillip >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Oct 10 10:44:38 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 10 Oct 2015 08:44:38 -0600 Subject: [Numpy-discussion] method to calculate the magnitude squared In-Reply-To: References: Message-ID: On Fri, Oct 9, 2015 at 10:32 PM, Phillip Feldman < phillip.m.feldman at gmail.com> wrote: > Hello Nathaniel, > > It is hard to say what is normative practice with NumPy, because there are > at least three paradigms: > > (1) Some operations are implemented as methods of the `ndarray` class. > `sum` and `mean` are examples. > > (2) Some operations are implemented via functions that invoke a private > method of the class. 
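For readers following along, here is a minimal sketch of the function-style spelling (paradigm 3 above) that avoids the square-root round trip. The name `magsq` is the one Phillip mentions for his own library; the body below is only an assumption about what such a helper looks like:

import numpy as np

def magsq(x):
    # |x|**2 without computing a square root first (sketch, not Phillip's code)
    x = np.asarray(x)
    if np.iscomplexobj(x):
        return x.real**2 + x.imag**2
    return x * x

x = np.array([1 + 1J])
print(np.abs(x)**2)  # sqrt followed by squaring
print(magsq(x))      # same value, no square root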
`abs` is an example of this: > > In [8]: x= array([1+1J]) > In [9]: x.__abs__() > Out[9]: array([ 1.41421356]) > > (3) Some operations are implemented as functions that operate directly on > the array, e.g., RMS (root-mean-square). > > Because calculating the square of the magnitude is such a widely-used > operation, and is often done in a grossly inefficient manner (e.g., by > taking the absolute value, which involves a square-root, and then > squaring), I believe that there is a strong argument for doing either (1) > or (2). I'd prefer (1). > > We tend to avoid adding methods. 2) would be a very easy enhancement, just a slight modification of sqr. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Sat Oct 10 13:14:28 2015 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sat, 10 Oct 2015 13:14:28 -0400 Subject: [Numpy-discussion] method to calculate the magnitude squared In-Reply-To: References: Message-ID: > We tend to avoid adding methods. 2) would be a very easy enhancement, just a slight modification of sqr. Did you mean `np.square`? Sadly, that doesn't do the right thing: `np.square(1+1j)` yields `2j`, while one wants `c*c.conj()` and thus `2`. Or, for fastest speed, really just `c.real**2 + c.imag**2`. My guess would be that a new ufunc, say `np.abs2` or `np.modulus2` or so, would be more appropriate than defining a new method. I'd also be hesitant to define a new private method -- I like how those usually are just used to override python basics. -- Marten -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Oct 10 13:50:32 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 10 Oct 2015 11:50:32 -0600 Subject: [Numpy-discussion] method to calculate the magnitude squared In-Reply-To: References: Message-ID: On Sat, Oct 10, 2015 at 11:14 AM, Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > > We tend to avoid adding methods. 2) would be a very easy enhancement, > just a slight modification of sqr. > > Did you mean `np.square`? Sadly, that doesn't do the right thing: > `np.square(1+1j)` yields `2j`, while one wants `c*c.conj()` and thus `2`. > Or, for fastest speed, really just `c.real**2 + c.imag**2`. > Yes, I meant the new function could made by reusing the square code with slight modifications. > My guess would be that a new ufunc, say `np.abs2` or `np.modulus2` or so, > would be more appropriate than defining a new method. I'd also be hesitant > to define a new private method -- I like how those usually are just used to > override python basics. > Julia uses abs2. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Oct 10 14:29:22 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 10 Oct 2015 11:29:22 -0700 Subject: [Numpy-discussion] method to calculate the magnitude squared In-Reply-To: References: Message-ID: On Oct 10, 2015 10:50 AM, "Charles R Harris" wrote: > > On Sat, Oct 10, 2015 at 11:14 AM, Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: >> >> > We tend to avoid adding methods. 2) would be a very easy enhancement, just a slight modification of sqr. >> >> Did you mean `np.square`? Sadly, that doesn't do the right thing: `np.square(1+1j)` yields `2j`, while one wants `c*c.conj()` and thus `2`. Or, for fastest speed, really just `c.real**2 + c.imag**2`. 
> > > Yes, I meant the new function could made by reusing the square code with slight modifications. > >> >> My guess would be that a new ufunc, say `np.abs2` or `np.modulus2` or so, would be more appropriate than defining a new method. I'd also be hesitant to define a new private method -- I like how those usually are just used to override python basics. > > > Julia uses abs2. I don't have an opinion on whether abs2 is important enough to bother with (I don't work much with complex numbers myself, nor have I run any benchmarks), but I agree that if we do want it then adding it as a regular ufunc would definitely be the right approach. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From phillip.m.feldman at gmail.com Sat Oct 10 15:07:57 2015 From: phillip.m.feldman at gmail.com (Phillip Feldman) Date: Sat, 10 Oct 2015 12:07:57 -0700 Subject: [Numpy-discussion] method to calculate the magnitude squared In-Reply-To: References: Message-ID: The ufunc approach makes sense. Something like abs2 is essential for anyone who does signal processing simulations using NumPy. Phillip On Sat, Oct 10, 2015 at 11:29 AM, Nathaniel Smith wrote: > On Oct 10, 2015 10:50 AM, "Charles R Harris" > wrote: > > > > On Sat, Oct 10, 2015 at 11:14 AM, Marten van Kerkwijk < > m.h.vankerkwijk at gmail.com> wrote: > >> > >> > We tend to avoid adding methods. 2) would be a very easy enhancement, > just a slight modification of sqr. > >> > >> Did you mean `np.square`? Sadly, that doesn't do the right thing: > `np.square(1+1j)` yields `2j`, while one wants `c*c.conj()` and thus `2`. > Or, for fastest speed, really just `c.real**2 + c.imag**2`. > > > > > > Yes, I meant the new function could made by reusing the square code with > slight modifications. > > > >> > >> My guess would be that a new ufunc, say `np.abs2` or `np.modulus2` or > so, would be more appropriate than defining a new method. I'd also be > hesitant to define a new private method -- I like how those usually are > just used to override python basics. > > > > > > Julia uses abs2. > > I don't have an opinion on whether abs2 is important enough to bother with > (I don't work much with complex numbers myself, nor have I run any > benchmarks), but I agree that if we do want it then adding it as a regular > ufunc would definitely be the right approach. > > -n > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Oct 10 18:45:02 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 11 Oct 2015 00:45:02 +0200 Subject: [Numpy-discussion] 1.10 and development version docs Message-ID: Hi, I see that there are no docs for 1.10 on docs.scipy.org yet, and the development version docs are from Nov'14. Anyone with permissions want to look at rectifying that situation? Also, building development version docs on TravisCI after each merge would be useful (for Numpy and Scipy). A little more work, but unlike updating the current docs you don't need doc server permissions so can be done by anyone. Here's a more concrete idea of how to achieve this https://github.com/scipy/scipy/issues/5343 (with links to things to script to borrow from IPython/MPL). Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bryanv at continuum.io Sat Oct 10 18:57:58 2015 From: bryanv at continuum.io (Bryan Van de Ven) Date: Sat, 10 Oct 2015 17:57:58 -0500 Subject: [Numpy-discussion] 1.10 and development version docs In-Reply-To: References: Message-ID: Bokeh also uses TravisCI, and we automatically build deploy docs on "dev" builds and releases, using encrypted Travis variables to store the necessary credentials. In case any of that sounds useful, most of the machinery is in these files: https://github.com/bokeh/bokeh/blob/master/.travis.yml https://github.com/bokeh/bokeh/blob/master/scripts/build_upload.sh https://github.com/bokeh/bokeh/blob/master/sphinx/fabfile.py Bryan > On Oct 10, 2015, at 5:45 PM, Ralf Gommers wrote: > > Hi, > > I see that there are no docs for 1.10 on docs.scipy.org yet, and the development version docs are from Nov'14. Anyone with permissions want to look at rectifying that situation? > > Also, building development version docs on TravisCI after each merge would be useful (for Numpy and Scipy). A little more work, but unlike updating the current docs you don't need doc server permissions so can be done by anyone. Here's a more concrete idea of how to achieve this https://github.com/scipy/scipy/issues/5343 (with links to things to script to borrow from IPython/MPL). > > Ralf > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From nilsc.becker at gmail.com Sun Oct 11 08:56:45 2015 From: nilsc.becker at gmail.com (Nils Becker) Date: Sun, 11 Oct 2015 14:56:45 +0200 Subject: [Numpy-discussion] method to calculate the magnitude squared In-Reply-To: References: Message-ID: Hey, I use complex numbers a lot and obviously need the modulus a lot. However, I am not sure if we need a special function for _performance_ reasons. At 10:01 AM 9/20/2015, you wrote: It is, but since that involves taking sqrt, it is *much* slower. Even now, ``` In [32]: r = np.arange(10000)*(1+1j) In [33]: %timeit np.abs(r)**2 1000 loops, best of 3: 213 ??s per loop In [34]: %timeit r.real**2 + r.imag**2 10000 loops, best of 3: 47.5 ??s per loop This benchmark is not quite fair as the first example needs a python function call and the second doesn't. If you benchmark a modulus function against np.abs(x)**2 the performance gain is ca. 30% on my machine. This means that for such a basic operation most of the time is spent in the function call. In my opinion if you want to have speed you write the modulus explicitly in your expression (3-4x speedup on my machine). If you don't need speed you can afford the function call (be it to abs2 or to abs). By not providing abs2 in numpy, however, people do not loose out on a lot of performance... There may be reasons to provide abs2 related to accuracy. If people (for not knowing it better) use np.abs(x)**2 they lose significant digits I think (may be wrong on that...). I did not look into it, though. Cheers Nils -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Sun Oct 11 14:56:08 2015 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sun, 11 Oct 2015 14:56:08 -0400 Subject: [Numpy-discussion] method to calculate the magnitude squared In-Reply-To: References: Message-ID: Hi Nils, I think performance will actually be better than I indicated, especially for larger arrays, since `r.real**2 + r.imag**2` makes a quite unnecessary intermediate arrays. 
With a `ufunc`, this can be done much faster. Indeed, it should be no slower than `np.square` (which does more operations): %timeit b = np.square(a) 100000 loops, best of 3: 16.6 ?s per loop [This is on same laptop as the timings above.] Chuck: agreed that a future np.abs2 could just reuse the internal ufunc loops for np.square except for the complex case. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Oct 11 15:02:15 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 11 Oct 2015 21:02:15 +0200 Subject: [Numpy-discussion] 1.10 and development version docs In-Reply-To: References: Message-ID: On Sun, Oct 11, 2015 at 12:57 AM, Bryan Van de Ven wrote: > Bokeh also uses TravisCI, and we automatically build deploy docs on "dev" > builds and releases, using encrypted Travis variables to store the > necessary credentials. In case any of that sounds useful, most of the > machinery is in these files: > > https://github.com/bokeh/bokeh/blob/master/.travis.yml > https://github.com/bokeh/bokeh/blob/master/scripts/build_upload.sh > https://github.com/bokeh/bokeh/blob/master/sphinx/fabfile.py > > Thanks Bryan. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Sun Oct 11 23:38:15 2015 From: shoyer at gmail.com (Stephan Hoyer) Date: Sun, 11 Oct 2015 20:38:15 -0700 Subject: [Numpy-discussion] Make all comparisons with NaT false? Message-ID: Currently, NaT (not a time) does not have any special treatment when used in comparison with datetime64/timedelta64 objects. This means that it's equal to itself, and treated as the smallest possible value in comparisons, e.g., NaT == NaT and NaT < any_other_time. To me, this seems a little crazy for a value meant to denote a missing/invalid time -- NaT should really have the same comparison behavior as NaN. That is, all comparisons with NaT should be false. The good news is that updating this behavior turns out to be only a matter of adding a single conditional to umath/loops.c.src -- most of the work would be fixing tests. Whether you call this an API change or a bug fix is somewhat of a judgment call, but I believe this change is certainly consistent with the goals of datetime64. It's also consistent with how NaT is used in pandas, which uses its own wrappers around datetime64 precisely to fix these sorts of issues. So I'm raising this here to get some opinions on the right path forward: 1. Is this a bug fix that we can backport to 1.10.x? 2. Is this an API change that should wait until 1.11? 3. Is this something where we need to start issuing warnings and deprecate the existing behavior? My vote would be for option 2. I think it's really a bug fix, but it would break enough code that I wouldn't want to spring this on anybody in a bug fix release. I'd rather not wait several releases on this one because that will only exacerbate issues with being able to use datetime64 reliably. Stephan -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Mon Oct 12 03:10:26 2015 From: shoyer at gmail.com (Stephan Hoyer) Date: Mon, 12 Oct 2015 00:10:26 -0700 Subject: [Numpy-discussion] Making datetime64 timezone naive Message-ID: As has come up repeatedly over the past few years, nobody seems to be very happy with the way that NumPy's datetime64 type parses and prints datetimes in local timezones. 
The tentative consensus from last year's discussion was that we should make datetime64 timezone naive, like the standard library's datetime.datetime: http://thread.gmane.org/gmane.comp.python.numeric.general/57184 That makes sense to me, and it's exactly what I'd like to see happen for NumPy 1.11. Here's my PR to make that happen: https://github.com/numpy/numpy/pull/6453 As a temporary measure, we still will parse datetimes that include a timezone specification by converting them to UTC, but will issue a DeprecationWarning. This is important for a smooth transition, because at the very least I suspect the "Z" modifier for UTC is widely used. Another option would be to preserve this conversion indefinitely, without any deprecation warning. There's one (slightly) contentious API decision to make: What should we do with the numpy.datetime_to_string function? As far as I can tell, it was never documented as part of the NumPy API and has not been used very much or at all outside of NumPy's own test suite, but it is exposed in the main numpy namespace. If we can remove it, then we can delete and simplify a lot more code related to timezone parsing and display. If not, we'll need to do a bit of work so we can distinguish between the string representations of timezone naive and UTC. Best, Stephan -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Oct 12 03:38:09 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 12 Oct 2015 00:38:09 -0700 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: On Mon, Oct 12, 2015 at 12:10 AM, Stephan Hoyer wrote: > As has come up repeatedly over the past few years, nobody seems to be very > happy with the way that NumPy's datetime64 type parses and prints datetimes > in local timezones. > > The tentative consensus from last year's discussion was that we should make > datetime64 timezone naive, like the standard library's datetime.datetime: > http://thread.gmane.org/gmane.comp.python.numeric.general/57184 > > That makes sense to me, and it's exactly what I'd like to see happen for > NumPy 1.11. Here's my PR to make that happen: > https://github.com/numpy/numpy/pull/6453 > > As a temporary measure, we still will parse datetimes that include a > timezone specification by converting them to UTC, but will issue a > DeprecationWarning. This is important for a smooth transition, because at > the very least I suspect the "Z" modifier for UTC is widely used. Another > option would be to preserve this conversion indefinitely, without any > deprecation warning. I'm dubious about supporting conversions in the long run -- even "Z" -- because UTC datetimes and naive datetimes are really not the same thing. OTOH maybe if we dropped this it would break everyone's code and they would hate us -- I actually have no idea what people are doing with datetime64 outside of pandas. One way to find out is to start issuing DeprecationWarnings and see if anyone notices :-). (Though of course this is far from fool-proof.) > There's one (slightly) contentious API decision to make: What should we do > with the numpy.datetime_to_string function? As far as I can tell, it was > never documented as part of the NumPy API and has not been used very much or > at all outside of NumPy's own test suite, but it is exposed in the main > numpy namespace. If we can remove it, then we can delete and simplify a lot > more code related to timezone parsing and display. 
If not, we'll need to do > a bit of work so we can distinguish between the string representations of > timezone naive and UTC. One possible strategy here would be to do some corpus analysis to find out whether anyone is actually using it, like I did for the ufunc ABI stuff: https://github.com/njsmith/codetrawl https://github.com/njsmith/ufunc-abi-analysis "datetime_to_string" is an easy token to search for, though it looks like enough people have their own functions named that that you'd have to do a bit of filtering to ignore non-numpy-related uses. A filter("content", "import.*numpy") would collect all files that import numpy into a single group for further examination. -n -- Nathaniel J. Smith -- http://vorpus.org From charlesr.harris at gmail.com Mon Oct 12 12:27:08 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 12 Oct 2015 10:27:08 -0600 Subject: [Numpy-discussion] Numpy 1.10.1 released. Message-ID: Hi All, I'm pleased to announce the release of Numpy 1.10.1. This release fixes some build problems and serves to reset the release number on pipy to something usable. As a note for future release managers, I had to upload these files from the command line, as using the file upload option at pipy resulted in a failure to parse the version. NumPy 1.10.1 Release Notes ************************** This release deals with a few build problems that showed up in 1.10.0. Most users would not have seen these problems. The differences are: * Compiling with msvc9 or msvc10 for 32 bit Windows now requires SSE2. This was the easiest fix for what looked to be some miscompiled code when SSE2 was not used. If you need to compile for 32 bit Windows systems without SSE2 support, mingw32 should still work. * Make compiling with VS2008 python2.7 SDK easier * Change Intel compiler options so that code will also be generated to support systems without SSE4.2. * Some _config test functions needed an explicit integer return in order to avoid the openSUSE rpmlinter erring out. * We ran into a problem with pipy not allowing reuse of filenames and a resulting proliferation of *.*.*.postN releases. Not only were the names getting out of hand, some packages were unable to work with the postN suffix. Numpy 1.10.1 supports Python 2.6 - 2.7 and 3.2 - 3.5. Commits: 45a3d84 DEP: Remove warning for `full` when dtype is set. 0c1a5df BLD: import setuptools to allow compile with VS2008 python2.7 sdk 04211c6 BUG: mask nan to 1 in ordered compare 826716f DOC: Document the reason msvc requires SSE2 on 32 bit platforms. 49fa187 BLD: enable SSE2 for 32-bit msvc 9 and 10 compilers dcbc4cc MAINT: remove Wreturn-type warnings from config checks d6564cb BLD: do not build exclusively for SSE4.2 processors 15cb66f BLD: do not build exclusively for SSE4.2 processors c38bc08 DOC: fix var. reference in percentile docstring 78497f4 DOC: Sync 1.10.0-notes.rst in 1.10.x branch with master. Cheers, Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Mon Oct 12 13:15:00 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 12 Oct 2015 10:15:00 -0700 Subject: [Numpy-discussion] Numpy 1.10.1 released. In-Reply-To: References: Message-ID: Hi, On Mon, Oct 12, 2015 at 9:27 AM, Charles R Harris wrote: > Hi All, > > I'm pleased to announce the release of Numpy 1.10.1. This release fixes some > build problems and serves to reset the release number on pipy to something > usable. 
As a note for future release managers, I had to upload these files > from the command line, as using the file upload option at pipy resulted in a > failure to parse the version. > > NumPy 1.10.1 Release Notes > ************************** > > This release deals with a few build problems that showed up in 1.10.0. Most > users would not have seen these problems. The differences are: > > * Compiling with msvc9 or msvc10 for 32 bit Windows now requires SSE2. > This was the easiest fix for what looked to be some miscompiled code when > SSE2 was not used. If you need to compile for 32 bit Windows systems > without SSE2 support, mingw32 should still work. > > * Make compiling with VS2008 python2.7 SDK easier > > * Change Intel compiler options so that code will also be generated to > support systems without SSE4.2. > > * Some _config test functions needed an explicit integer return in > order to avoid the openSUSE rpmlinter erring out. > > * We ran into a problem with pipy not allowing reuse of filenames and a > resulting proliferation of *.*.*.postN releases. Not only were the names > getting out of hand, some packages were unable to work with the postN > suffix. > > > Numpy 1.10.1 supports Python 2.6 - 2.7 and 3.2 - 3.5. > > > Commits: > > 45a3d84 DEP: Remove warning for `full` when dtype is set. > 0c1a5df BLD: import setuptools to allow compile with VS2008 python2.7 sdk > 04211c6 BUG: mask nan to 1 in ordered compare > 826716f DOC: Document the reason msvc requires SSE2 on 32 bit platforms. > 49fa187 BLD: enable SSE2 for 32-bit msvc 9 and 10 compilers > dcbc4cc MAINT: remove Wreturn-type warnings from config checks > d6564cb BLD: do not build exclusively for SSE4.2 processors > 15cb66f BLD: do not build exclusively for SSE4.2 processors > c38bc08 DOC: fix var. reference in percentile docstring > 78497f4 DOC: Sync 1.10.0-notes.rst in 1.10.x branch with master. Thanks a lot for guiding this through. I uploaded the OSX wheels to pypi via : https://github.com/MacPython/numpy-wheels Cheers, Matthew From oss at behrisch.de Mon Oct 12 14:19:58 2015 From: oss at behrisch.de (Michael Behrisch) Date: Mon, 12 Oct 2015 20:19:58 +0200 Subject: [Numpy-discussion] Problem with os.environ.clear in numpy initialization Message-ID: <561BF9CE.8040001@behrisch.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi list, I encountered a problem in my code which depends on numpy. Due to an unusual variable in my environment and this python bug https://bugs.python.org/issue20658 the os.environ.clear call in numpy/core/__init__.py fails. I wrote a patch and submitted a pull request: https://github.com/behrisch/numpy/pull/1 Although it is mainly a workaround for the bug mentioned, I would be happy if it could get accepted because I have only limited control of the environment. 
Best regards, Michael -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iEYEARECAAYFAlYb+c0ACgkQPBD+ltFwpilo1wCZAbvS192vVwnWCYT0y7L5At6m TdsAoOmLjTnq7iVlh2QmnsM1qJ72PTyE =wDnI -----END PGP SIGNATURE----- From ndarray at mac.com Mon Oct 12 14:48:30 2015 From: ndarray at mac.com (Alexander Belopolsky) Date: Mon, 12 Oct 2015 14:48:30 -0400 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: On Mon, Oct 12, 2015 at 3:10 AM, Stephan Hoyer wrote: > The tentative consensus from last year's discussion was that we should > make datetime64 timezone naive, like the standard library's > datetime.datetime If you are going to make datetime64 more like datetime.datetime, please consider adding the "fold" bit. See PEP 495. [1] [1]: https://www.python.org/dev/peps/pep-0495/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From oss at behrisch.de Mon Oct 12 16:25:10 2015 From: oss at behrisch.de (Michael Behrisch) Date: Mon, 12 Oct 2015 22:25:10 +0200 Subject: [Numpy-discussion] Problem with os.environ.clear in numpy initialization In-Reply-To: <561BF9CE.8040001@behrisch.de> References: <561BF9CE.8040001@behrisch.de> Message-ID: <561C1726.1020203@behrisch.de> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, I am sorry I sent the wrong pull request. Here is the correct one: https://github.com/numpy/numpy/pull/6460 Best regards, Michael Am 12.10.2015 um 20:19 schrieb Michael Behrisch: > Hi list, I encountered a problem in my code which depends on numpy. > Due to an unusual variable in my environment and this python bug > https://bugs.python.org/issue20658 the os.environ.clear call in > numpy/core/__init__.py fails. I wrote a patch and submitted a pull > request: https://github.com/behrisch/numpy/pull/1 > > Although it is mainly a workaround for the bug mentioned, I would > be happy if it could get accepted because I have only limited > control of the environment. > > Best regards, Michael > _______________________________________________ NumPy-Discussion > mailing list NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iEYEARECAAYFAlYcFyYACgkQPBD+ltFwpilpqACgjWgriUX8qhz8tDIcsTnSXijM 6G4An2XlDvg7owA7e13mPrMjEzIKHx3H =HAMn -----END PGP SIGNATURE----- From ralf.gommers at gmail.com Mon Oct 12 17:01:26 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 12 Oct 2015 23:01:26 +0200 Subject: [Numpy-discussion] NumFOCUS fiscal sponsorship agreement In-Reply-To: References: Message-ID: Hi, Thanks Nathaniel and everyone else who contributed for pushing forward with formalizing Numpy governance and with this FSA. I'm quite excited about both! Before I start commenting on the FSA, I'd like to point out that I'm both on the numpy steering committee and the NumFOCUS board. I don't see that as a problem for being involved in the discussions or signing the FSA, however I will obviously abstain from voting or (non-)consenting in case of a possible conflict of interest. On Thu, Oct 8, 2015 at 7:57 AM, Nathaniel Smith wrote: > Hi all, > > Now that the governance document is in place, we need to get our legal > ducks in a row by signing a fiscal sponsorship agreement with > NumFOCUS. 
> > The basic idea here is that there are times when you really need some > kind of corporation to represent the project -- the legal system for > better or worse does not understand "a bunch of folks on a mailing > list" as a legal entity capable of accepting donations, Additional clarification: NumFOCUS is a 501(c)3 organization, which means that in the US donations that are tax-deductable can be made to it (and hence to Numpy after this FSA is signed). From European or other countries donations can be made, but they won't be deductable. > or holding > funds or other assets like domain names. The obvious solution is to > incorporate a company to represent the project -- but incorporating a > company involves lots of super-annoying paperwork. (Like, *super* > annoying.) So a standard trick is that a single non-profit corporation > acts as an umbrella organization providing these services to multiple > projects at once, and this is called "fiscal sponsorship". You can > read more about it here: > https://en.wikipedia.org/wiki/Fiscal_sponsorship > > NumFOCUS's standard comprehensive FSA agreement can be seen here: > > > https://docs.google.com/document/d/11YqMX9UrgfCSgiQEUzmOFyg6Ku-vED6gMxhO6J9lCgg/edit?usp=sharing There's one upcoming change to this FSA: the overhead percentage (now 4-7%) charged will go up significantly, to around 10-15%. Re4ason: NumFOCUS cannot cover its admin/legal costs as well as support its projects based on what the doc says now. This is still at the lower end of the scale for non-profits, and universities typically charge way more on grants. So I don't see any issue here, but it's good to know now rather than right after we sign. > and we have the option of negotiating changes if there's anything we don't > like. > > They also have a FAQ: > > https://docs.google.com/document/d/1zdXp07dLvkbqBrDsw96P6mkqxnWzKJuM-1f4408I6Qs/edit?usp=sharing > > I've read through the document and didn't see anything that bothered > me, except that I'm not quite sure how to make the split between the > steering council and numfocus subcommittee that we have in our > governance model sync up with their language about the "leadership > body", and in particular the language in section 10 about simple > majority votes. So I've queried them about that already. > I'd like to clarify that the Numfocus subcommittee is only meant to facility interaction with NumFOCUS and to ensure that if funds are spent, they are spent in a way consistent with the mission and non-profit nature of NumFOCUS. The same applies to possible legal impacts of decisions made in the Numpy project. Regarding the question about the "simple majority votes" language, we can simply replace that with the appropriate text describing how decisions are made in the Numpy project. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Tue Oct 13 12:42:38 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Tue, 13 Oct 2015 12:42:38 -0400 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: I'd be totally in support of switching to timezone naive form. While it would be ideal that everyone stores their dates in UTC, the real world is messy and most of the time, people are just loading dates as-is and don't even care about timezones. I work on machines with different TZs, and I hate it when I save a bunch of data on one machine in UTC, but then go to view it on my local machine and everything is shifted. 
It gets even more confusing around DST switches because it gets all mixed up. Ben Root On Mon, Oct 12, 2015 at 2:48 PM, Alexander Belopolsky wrote: > > On Mon, Oct 12, 2015 at 3:10 AM, Stephan Hoyer wrote: > >> The tentative consensus from last year's discussion was that we should >> make datetime64 timezone naive, like the standard library's >> datetime.datetime > > > > If you are going to make datetime64 more like datetime.datetime, please > consider adding the "fold" bit. See PEP 495. [1] > > [1]: https://www.python.org/dev/peps/pep-0495/ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Tue Oct 13 13:36:29 2015 From: shoyer at gmail.com (Stephan Hoyer) Date: Tue, 13 Oct 2015 10:36:29 -0700 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: On Mon, Oct 12, 2015 at 12:38 AM, Nathaniel Smith wrote: > > One possible strategy here would be to do some corpus analysis to find > out whether anyone is actually using it, like I did for the ufunc ABI > stuff: > https://github.com/njsmith/codetrawl > https://github.com/njsmith/ufunc-abi-analysis > > "datetime_to_string" is an easy token to search for, though it looks > like enough people have their own functions named that that you'd have > to do a bit of filtering to ignore non-numpy-related uses. Yes, this is a good approach. I actually mistyped the name here -- it's actually "datetime_as_string". A GitHub search does turn up a handful of uses outside of NumPy: https://github.com/search?utf8=%E2%9C%93&q=numpy.datetime_as_string+in%3Afile%2Cpath+NOT+numpy%2Fcore+NOT+test_datetime.py+NOT+arrayprint.py&type=Code&ref=searchresults That said, I'm not sure it's worth going to the trouble to ensure it continues to work in the future. This function was entirely undocumented, and doesn't even have an inspectable function signature. Stephan -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Tue Oct 13 13:48:33 2015 From: shoyer at gmail.com (Stephan Hoyer) Date: Tue, 13 Oct 2015 10:48:33 -0700 Subject: [Numpy-discussion] Deprecating unitless timedelta64 and "safe" casting of integers to timedelta64 Message-ID: As part of the datetime64 cleanup I've been working on over the past few days, I noticed that NumPy's casting rules for np.datetime64('NaT') were not working properly: https://github.com/numpy/numpy/pull/6465 This led to my discovery that NumPy currently supports unit-less timedeltas (e.g., "np.timedelta64(5)"), which indicate some sort of generic time difference. The current behavior is to take the time units from the other argument when these are used in a binary operation. Even worse, we currently support "safe" casting of integers to timedelta64, which means that integer + datetime64 and integer + timedelta64 arithmetic works: In [4]: np.datetime64('2000-01-01T00') + 10 Out[4]: numpy.datetime64('2000-01-01T10:00-0800','h') Based on the principle that NumPy's datetime support should mirror the standard library as much as possible, both of these behaviors seem like a bad idea. We have datetime types precisely to disambiguate these sort of situations. I'd like to propose deprecating such casting in NumPy 1.11, with the intent of removing it entirely as soon as practical. 
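To make the two spellings concrete, here is a short sketch of the difference; only the first form would be deprecated, and the exact deprecation mechanics are not decided here:

import numpy as np

d = np.datetime64('2000-01-01T00', 'h')

# Implicit: a plain integer is "safely" cast to timedelta64, picking up its
# unit from the other operand -- the behavior proposed for deprecation.
print(d + 10)

# Explicit: spell out the unit with a timedelta64; this stays valid.
print(d + np.timedelta64(10, 'h'))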
Stephan -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Oct 13 17:17:12 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 13 Oct 2015 14:17:12 -0700 Subject: [Numpy-discussion] Deprecating unitless timedelta64 and "safe" casting of integers to timedelta64 In-Reply-To: References: Message-ID: On Oct 13, 2015 10:48 AM, "Stephan Hoyer" wrote: > > As part of the datetime64 cleanup I've been working on over the past few days, I noticed that NumPy's casting rules for np.datetime64('NaT') were not working properly: > https://github.com/numpy/numpy/pull/6465 > > This led to my discovery that NumPy currently supports unit-less timedeltas (e.g., "np.timedelta64(5)"), which indicate some sort of generic time difference. The current behavior is to take the time units from the other argument when these are used in a binary operation. > > Even worse, we currently support "safe" casting of integers to timedelta64, which means that integer + datetime64 and integer + timedelta64 arithmetic works: > > In [4]: np.datetime64('2000-01-01T00') + 10 > Out[4]: numpy.datetime64('2000-01-01T10:00-0800','h') > > Based on the principle that NumPy's datetime support should mirror the standard library as much as possible, both of these behaviors seem like a bad idea. We have datetime types precisely to disambiguate these sort of situations. > > I'd like to propose deprecating such casting in NumPy 1.11, with the intent of removing it entirely as soon as practical. Makes sense to me. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Oct 13 17:44:02 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 13 Oct 2015 14:44:02 -0700 Subject: [Numpy-discussion] Make all comparisons with NaT false? In-Reply-To: References: Message-ID: On Sun, Oct 11, 2015 at 8:38 PM, Stephan Hoyer wrote: > Currently, NaT (not a time) does not have any special treatment when used > in comparison with datetime64/timedelta64 objects. > > To me, this seems a little crazy for a value meant to denote a > missing/invalid time -- NaT should really have the same comparison behavior > as NaN. > Yes, indeed. > Whether you call this an API change or a bug fix is somewhat of a judgment > call, but I believe this change is certainly consistent with the goals of > datetime64. It's also consistent with how NaT is used in pandas, which uses > its own wrappers around datetime64 precisely to fix these sorts of issues. > Getting closer to Pandas is a Good Thing too... > So I'm raising this here to get some opinions on the right path forward: > 1. Is this a bug fix that we can backport to 1.10.x? > 2. Is this an API change that should wait until 1.11? > 3. Is this something where we need to start issuing warnings and deprecate > the existing behavior? > > My vote would be for option 2. > I agree. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
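For concreteness, a small sketch of the contrast under discussion; the "proposed" comment describes what option 2 would return, not what NumPy 1.10 returns today:

import numpy as np

nat = np.datetime64('NaT')
t = np.datetime64('2015-10-13')

# Current behavior, as described at the top of the thread:
print(nat == nat)  # True  -- NaT compares equal to itself
print(nat < t)     # True  -- NaT sorts as the smallest possible time

# NaN already behaves the way NaT is proposed to behave:
print(np.nan == np.nan)  # False
print(np.nan < 0.0)      # False
# Under the proposal, the two NaT comparisons above would also return False.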
URL: From chris.barker at noaa.gov Tue Oct 13 17:52:38 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 13 Oct 2015 14:52:38 -0700 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: On Mon, Oct 12, 2015 at 11:48 AM, Alexander Belopolsky wrote: > If you are going to make datetime64 more like datetime.datetime, please > consider adding the "fold" bit. See PEP 495. [1] > > [1]: https://www.python.org/dev/peps/pep-0495/ > well, adding any timezone support is not (yet) in the table. (no need for "fold" with purely naive time, yes?) But yes, when we get there, absolutely. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Oct 13 18:04:50 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 13 Oct 2015 15:04:50 -0700 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: On Mon, Oct 12, 2015 at 12:38 AM, Nathaniel Smith wrote: > > As a temporary measure, we still will parse datetimes that include a > > timezone specification by converting them to UTC, but will issue a > > DeprecationWarning. This is important for a smooth transition, because at > > the very least I suspect the "Z" modifier for UTC is widely used. Another > > option would be to preserve this conversion indefinitely, without any > > deprecation warning. > > I'm dubious about supporting conversions in the long run -- even "Z" > -- because UTC datetimes and naive datetimes are really not the same > thing. no -- but almost! > OTOH maybe if we dropped this it would break everyone's code > and they would hate us -- I think it probably would. In the current implementation, an ISO string without an offset specifier is converted using the system's locale timezone. So to get naive time (or UTC), we need to tack a Z (or 00:00) on there. So killing that would likely break a lot of code! And excepting a Z or 00:00 and then treating it as naive, while being perhaps misleading, would not actually change any results. So I say we keep it. Depreciating it eventually would be good in the long run -- but maybe when we have actual time zone support. I actually have no idea what people are > doing with datetime64 outside of pandas. What do we need to do with this not to break Panda? I'm guessing more people use datetime64 wrapped by Pandas than any other way... (not me, though) > There's one (slightly) contentious API decision to make: What should we do > > with the numpy.datetime_to_string function? As far as I can tell, it was > > never documented as part of the NumPy API and has not been used very > much Well, I'm not using it :-) though I can see that it might be pretty useful. Though once we get rid of datetime64 adjusting for the locale timezone, maybe not anymore. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
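A short sketch of the parsing behavior being described, as it stands in NumPy 1.10 (the exact reprs depend on the machine's timezone setting, which is the point):

import numpy as np

# No offset: the string is interpreted in the machine's local timezone, so
# the stored value (and its repr) shifts with the TZ setting of the machine.
print(np.datetime64('2015-10-13T12:00'))

# Tacking on 'Z' or an explicit offset pins the value down:
print(np.datetime64('2015-10-13T12:00Z'))
print(np.datetime64('2015-10-13T12:00+00:00'))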
URL: From njs at pobox.com Tue Oct 13 18:21:07 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 13 Oct 2015 15:21:07 -0700 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: On Oct 12, 2015 11:48 AM, "Alexander Belopolsky" wrote: > > > On Mon, Oct 12, 2015 at 3:10 AM, Stephan Hoyer wrote: >> >> The tentative consensus from last year's discussion was that we should make datetime64 timezone naive, like the standard library's datetime.datetime > > > > If you are going to make datetime64 more like datetime.datetime, please consider adding the "fold" bit. See PEP 495. [1] > > [1]: https://www.python.org/dev/peps/pep-0495/ The challenge here is that we literally do not have a bit too use :-) Unless we make it datetime65 + 63 bits of padding, stealing a bit to use for fold would halve the range of representable times, and I'm guessing this would not be acceptable? -- pandas's 64-bits-of-nanoseconds already has a somewhat narrow range (584 years). I think for now the two goals are to make the built in datetime64 minimally functional and self consistent, and to make it possible for fancier datetime needs to be handled using third party dtypes. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Oct 13 18:24:36 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 13 Oct 2015 15:24:36 -0700 Subject: [Numpy-discussion] Deprecating unitless timedelta64 and "safe" casting of integers to timedelta64 In-Reply-To: References: Message-ID: On Tue, Oct 13, 2015 at 10:48 AM, Stephan Hoyer wrote: > This led to my discovery that NumPy currently supports unit-less > timedeltas (e.g., "np.timedelta64(5)"), which indicate some sort of generic > time difference. The current behavior is to take the time units from the > other argument when these are used in a binary operation. > this really is odd :-) > Even worse, we currently support "safe" casting of integers to > timedelta64, which means that integer + datetime64 and integer + > timedelta64 arithmetic works: > which makes the above even odder -- underlying datetime64 is, "just" a 64 bit int -- so I can see how someone _may_ want to work directly with that -- but if you can use regular integerts, why have a unitless timedelta? Based on the principle that NumPy's datetime support should mirror the > standard library as much as possible, both of these behaviors seem like a > bad idea. We have datetime types precisely to disambiguate these sort of > situations. > > I'd like to propose deprecating such casting in NumPy 1.11, with the > intent of removing it entirely as soon as practical. > Agreed -- I can imagine use-cases, but users can cadt to/from integers if that's what they want to do e.g. with .astype() -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Oct 13 18:48:38 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 13 Oct 2015 15:48:38 -0700 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: On Tue, Oct 13, 2015 at 3:21 PM, Nathaniel Smith wrote: > > If you are going to make datetime64 more like datetime.datetime, please > consider adding the "fold" bit. See PEP 495. 
[1] > The challenge here is that we literally do not have a bit too use :-) > hmm -- I was first thinking that this could all be in the timezone stuff (when we get there), but while I imagine we'll want an entire array to be in a single timezone, each individual value would need its own "fold" flag. But in any case, we don't need it 'till we do timezones, and my understanding is that we aren't' going to do timezones until we have the mythical new-and-improved-dtype-system. So a future datetime dtype could be 64 bits + a byte of extra info, or be 63 bits plus the fold flag, or... > Unless we make it datetime65 + 63 bits of padding, stealing a bit to use > for fold would halve the range of representable times, and I'm guessing > this would not be acceptable? > well, not now, with eh fixed epoch, but if the epoch could be adjusted, maybe a small range would be fine -- who need nanosecond accuracy, AND centuries of range? Thinking a bit more here: For those that didn't follow the massive discussion on this on Python-dev and the new datetime list: the fold flag is required to round-trip properly for timezones with discontiguous time -- i.e. Daylight savings. So if you have: 2015-11-01T01:30 Do you mean the first 1:30 am or the seconds one, after the DST transition? (i.e. in the fold, or not?) So it is key, for Python's Datetime, to make sure to keep that information around. However: Python's datetime was designed to be optimized for: - converting between datetime and other representations in Database, etc. - fast math for "naive time" -- i.e. basic manipulations within the same timezone, like "one day later" - Fast math for "absolute relative deltas" is of secondary concern. The result of this is that datetime stores: year, month, day, hour minute second, microsecond It does NOT store some time_unit_since_an_epch, like unix time or numpy datetime64. Also, IIUC, when you associate a datetime with a timezone, it stores the year, month, day, hour, second,... in the specified timezone -- NOT in UTC, or anything else. This makes manipulations within that timezone easy -- the next day simply required adding a day to teh day field (then normalizing to the month). Given all that -- the "fold" bit is needed, as a particular datetime in a particular timezone may have more than one meaning. Note that to compute a proper time span between two "aware" datetimes, it is necessary to convert to UTC, do the math, then convert back to the timezone you want. However, numpy datetime is optimized for compact storage and fast computation of absolute deltas (actual hours, minutes, seconds... not calendar units like "the next day" ). Because of this, and because it's what we already have, datetime64 stores times as "some number of time units since an epoch -- a simple integer. And because we probably want fast absolute delta computation, when we add timezones, we'll probably want to store the datetime in UTC, and apply the timezone on I/O. Alexander: Am I right that we don't need the "fold" bit in this case? You'd still need it when specifying a time in a timezone with folds.. -- but again, only on I/O -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
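For anyone who wants to see the ambiguity Chris describes in code: the standard library now expresses it with the PEP 495 `fold` attribute. Note that this postdates the discussion here (fold arrived in Python 3.6, zoneinfo in 3.9), so the snippet is purely illustrative:

from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

tz = ZoneInfo("America/Los_Angeles")

# 2015-11-01 01:30 local time happened twice: clocks fell back at 02:00.
first = datetime(2015, 11, 1, 1, 30, tzinfo=tz, fold=0)   # before the fall-back
second = datetime(2015, 11, 1, 1, 30, tzinfo=tz, fold=1)  # after the fall-back

print(first.utcoffset(), second.utcoffset())  # UTC-7 vs UTC-8
print(first.astimezone(timezone.utc))         # 2015-11-01 08:30:00+00:00
print(second.astimezone(timezone.utc))        # 2015-11-01 09:30:00+00:00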
URL: From njs at pobox.com Tue Oct 13 18:58:14 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 13 Oct 2015 15:58:14 -0700 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: On Oct 13, 2015 3:49 PM, "Chris Barker" wrote: > [...] > However, numpy datetime is optimized for compact storage and fast computation of absolute deltas (actual hours, minutes, seconds... not calendar units like "the next day" ). Except that ironically it actually can't compute absolute deltas accurately with one second resolution, because it does the POSIX time thing of pretending that all UTC days have the same number of seconds, even though this is not true (leap seconds). This isn't really relevant to anything else in this thread, except as a reminder of how freaky date/time handling is. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffreback at gmail.com Tue Oct 13 20:07:20 2015 From: jeffreback at gmail.com (Jeff Reback) Date: Tue, 13 Oct 2015 20:07:20 -0400 Subject: [Numpy-discussion] Make all comparisons with NaT false? In-Reply-To: References: Message-ID: Here another oddity to add to the list In [28]: issubclass(np.datetime64,np.integer) Out[28]: False In [29]: issubclass(np.timedelta64,np.integer) Out[29]: True On Tue, Oct 13, 2015 at 5:44 PM, Chris Barker wrote: > On Sun, Oct 11, 2015 at 8:38 PM, Stephan Hoyer wrote: > >> Currently, NaT (not a time) does not have any special treatment when used >> in comparison with datetime64/timedelta64 objects. >> >> To me, this seems a little crazy for a value meant to denote a >> missing/invalid time -- NaT should really have the same comparison behavior >> as NaN. >> > > Yes, indeed. > > >> Whether you call this an API change or a bug fix is somewhat of a >> judgment call, but I believe this change is certainly consistent with the >> goals of datetime64. It's also consistent with how NaT is used in pandas, >> which uses its own wrappers around datetime64 precisely to fix these sorts >> of issues. >> > > Getting closer to Pandas is a Good Thing too... > > >> So I'm raising this here to get some opinions on the right path forward: >> 1. Is this a bug fix that we can backport to 1.10.x? >> 2. Is this an API change that should wait until 1.11? >> 3. Is this something where we need to start issuing warnings and >> deprecate the existing behavior? >> >> My vote would be for option 2. >> > > I agree. > > -CHB > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Tue Oct 13 20:08:14 2015 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Tue, 13 Oct 2015 20:08:14 -0400 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: > > > However, numpy datetime is optimized for compact storage and fast > computation of absolute deltas (actual hours, minutes, seconds... not > calendar units like "the next day" ). 
> > Except that ironically it actually can't compute absolute deltas > accurately with one second resolution, because it does the POSIX time thing > of pretending that all UTC days have the same number of seconds, even > though this is not true (leap seconds). > > This isn't really relevant to anything else in this thread, except as a > reminder of how freaky date/time handling is. > Maybe not directly relevant, but also very clearly why one should ideally not use these at all! Perhaps even less relevant, but if you do need absolute times (and thus work with UTC or TAI or GPS), have a look at astropy's `Time` class. It does use two doubles, but with that maintains "sub-nanosecond precision over times spanning the age of the universe" [1]. And it even converts to strings nicely! -- Marten [1] http://docs.astropy.org/en/latest/time/index.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From nadavh at visionsense.com Wed Oct 14 01:23:48 2015 From: nadavh at visionsense.com (Nadav Horesh) Date: Wed, 14 Oct 2015 05:23:48 +0000 Subject: [Numpy-discussion] A regression in numpy 1.10: VERY slow memory mapped file generation Message-ID: I have binary files of size ranging between a few MB and 1GB, which I read and process as memory mapped files (via np.memmap). Until numpy 1.9 the creation of a recarray on an existing file (without reading its content) was instantaneous, and now it takes ~6 seconds (system: archlinux on sandy bridge). A profiling (using ipython %prun) top of the list is:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       21    3.037    0.145    4.266    0.203 _internal.py:372(_check_field_overlap)
  3713431    1.663    0.000    1.663    0.000 _internal.py:366()
  3713750    0.790    0.000    0.790    0.000 {range}
  3713709    0.406    0.000    0.406    0.000 {method 'update' of 'set' objects}
      322    0.320    0.001    1.984    0.006 {method 'extend' of 'list' objects}

Nadav. From allanhaldane at gmail.com Wed Oct 14 11:59:57 2015 From: allanhaldane at gmail.com (Allan Haldane) Date: Wed, 14 Oct 2015 11:59:57 -0400 Subject: [Numpy-discussion] A regression in numpy 1.10: VERY slow memory mapped file generation In-Reply-To: References: Message-ID: <561E7BFD.1060506@gmail.com> On 10/14/2015 01:23 AM, Nadav Horesh wrote:
>
> I have binary files of size ranging between a few MB and 1GB, which I read and process as memory mapped files (via np.memmap). Until numpy 1.9 the creation of a recarray on an existing file (without reading its content) was instantaneous, and now it takes ~6 seconds (system: archlinux on sandy bridge). A profiling (using ipython %prun) top of the list is:
>
> ncalls tottime percall cumtime percall filename:lineno(function)
> 21 3.037 0.145 4.266 0.203 _internal.py:372(_check_field_overlap)
> 3713431 1.663 0.000 1.663 0.000 _internal.py:366()
> 3713750 0.790 0.000 0.790 0.000 {range}
> 3713709 0.406 0.000 0.406 0.000 {method 'update' of 'set' objects}
> 322 0.320 0.001 1.984 0.006 {method 'extend' of 'list' objects}
>
> Nadav.

Hi Nadav, The slowdown is due to a problem in a PR I introduced to add safety checks to views of structured arrays (to prevent segfaults involving object fields), which will hopefully be fixed quickly. It is being discussed here https://github.com/numpy/numpy/issues/6467 Also, I do not think the problem is with memmap - as far as I have tested, memmap is still fast. Most likely what is slowing your script down is subsequent access to the fields of the array, which is what has regressed.
Is that right? Allan From chris.barker at noaa.gov Wed Oct 14 11:59:53 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 14 Oct 2015 08:59:53 -0700 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: On Tue, Oct 13, 2015 at 3:58 PM, Nathaniel Smith wrote: > > However, numpy datetime is optimized for compact storage and fast > computation of absolute deltas (actual hours, minutes, seconds... not > calendar units like "the next day" ). > > Except that ironically it actually can't compute absolute deltas > accurately with one second resolution, because it does the POSIX time thing > of pretending that all UTC days have the same number of seconds, even > though this is not true (leap seconds). > Note that I said "fast", not "accurate" -- but the leap second thing may be one more reason not to call datetime64 "UTC" -- who's to say that "naive" time should include leap seconds :-) Also, we could certainly add a leap seconds implementation to the current infrastructure -- the real technical problem with that is how to keep the leap-seconds table up to date -- we have no way to know when there will be leap-seconds in the future... Also -- this may be one more reason to have a selectable epoch -- then you'd likely overlap fewer leap-seconds in a given us case. > This isn't really relevant to anything else in this thread, except as a > reminder of how freaky date/time handling is. > yup -- it sure is. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Oct 14 12:07:41 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 14 Oct 2015 09:07:41 -0700 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: On Tue, Oct 13, 2015 at 5:08 PM, Marten van Kerkwijk < m.h.vankerkwijk at gmail.com> wrote: > Maybe not directly relevant, but also very clearly why one should ideally >> not use these a >> > all! >> > I wouldn't say not at all -- I'd say "not in some circumstances" > Perhaps even less relevant, but if you do need absolute times (and thus >> work with UTC or TAI or GPS), have a look at astropy's `Time` class. It >> does use two doubles, >> > interesting -- I wonder why not two integers? > but with that maintains "sub-nanosecond precision over times spanning the >> age of the universe" [1]. >> > well, we do all need that! Seriously, though -- if we are opening all this up, maybe it's worth considering other options, rather than kludging datetime64 -- particularly if there is something someone has already implemented and tested... But for now, Stephan's patches to make datetime64 far more useful and easy are very welcome! -CHB [1] http://docs.astropy.org/en/latest/time/index.html > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
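For readers who haven't used the astropy `Time` class that Marten and Chris discuss above, a minimal illustrative sketch of the API (not from the thread itself; it assumes a recent astropy is installed, and `jd1`/`jd2` are the two internal doubles Marten mentions):

    from astropy.time import Time

    # Two absolute times on the UTC scale. Each instant is stored internally
    # as a pair of 64-bit floats (t.jd1 + t.jd2), which is where the
    # "sub-nanosecond precision over the age of the universe" comes from.
    t0 = Time('2015-06-30 23:59:59', scale='utc')
    t1 = Time('2015-07-01 00:00:00', scale='utc')

    print(t0.jd1, t0.jd2)   # the two-double internal representation
    print((t1 - t0).sec)    # elapsed SI seconds; a leap second was inserted
                            # between these instants, so this should be ~2, not 1

Using a pair of doubles rather than a single fixed-unit integer is what lets one class cover both very long spans and fine resolution -- the trade-off this thread keeps circling around.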
URL: From chris.barker at noaa.gov Wed Oct 14 12:14:46 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 14 Oct 2015 09:14:46 -0700 Subject: [Numpy-discussion] Numpy 1.10.1 released. In-Reply-To: References: Message-ID: On Mon, Oct 12, 2015 at 9:27 AM, Charles R Harris wrote: > * Compiling with msvc9 or msvc10 for 32 bit Windows now requires SSE2. > This was the easiest fix for what looked to be some miscompiled code when > SSE2 was not used. > Note that there is discussion right now on python-dev about requiring SSE2 for the python.org build of python3.5 -- it does now, so it's fine for third party packages to also require it. But there is some talk of removing that requirement -- still a lot of old machines around, I guess -- particularly at schools and the like. Ideally, any binary wheels on PyPi should be compatible with the python.org builds -- so not require SSE2, if the python.org builds don't. Though we had this discussion a while back -- and numpy could, and maybe should require more -- did we ever figure out a way to get a meaningful message to the user if they try to run an SSE2 build on a machine without SSE2? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Oct 14 12:38:45 2015 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 14 Oct 2015 09:38:45 -0700 Subject: [Numpy-discussion] Numpy 1.10.1 released. In-Reply-To: References: Message-ID: On Oct 14, 2015 9:15 AM, "Chris Barker" wrote: > > On Mon, Oct 12, 2015 at 9:27 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: >> >> * Compiling with msvc9 or msvc10 for 32 bit Windows now requires SSE2. >> This was the easiest fix for what looked to be some miscompiled code when >> SSE2 was not used. > > > Note that there is discussion right now on python-dev about requiring SSE2 for the python.org build of python3.5 -- it does now, so it's fine for third party packages to also require it. But there is some talk of removing that requirement -- still a lot of old machines around, I guess -- particularly at schools and the like. Note that the 1.10.1 release announcement is somewhat misleading -- apparently the affected builds have actually required SSE2 since numpy 1.8, and the change here just makes it even more required. I'm not sure if this is all 32 bit builds or only ones using msvc that have been needing SSE2 all along. The change in 1.10.1 only affects msvc, which is not what most people are using (IIUC Enthought Canopy uses msvc, but the pypi, gohlke, and Anaconda builds don't). I'm actually not sure if anyone even uses the 32 bit builds at all :-) > Ideally, any binary wheels on PyPi should be compatible with the python.org builds -- so not require SSE2, if the python.org builds don't. > > Though we had this discussion a while back -- and numpy could, and maybe should require more -- did we ever figure out a way to get a meaningful message to the user if they try to run an SSE2 build on a machine without SSE2? It's not that difficult in principle, just someone has to do it :-). -n -------------- next part -------------- An HTML attachment was scrubbed...
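On the "meaningful message" question: the check itself is easy to sketch; the hard part is doing it portably, and before any compiled (possibly SSE2-using) extension module gets loaded. A rough, Linux-only illustration -- this is not how numpy does it, and a real Windows check would need a cpuid call (e.g. via ctypes) instead:

    import os

    def have_sse2():
        # Linux-only sketch: /proc/cpuinfo lists the CPU's feature flags.
        # On other platforms we can't tell this way, so don't block the import.
        if not os.path.exists('/proc/cpuinfo'):
            return True
        with open('/proc/cpuinfo') as f:
            return 'sse2' in f.read()

    if not have_sse2():
        raise ImportError("This numpy build requires SSE2, but this CPU does not "
                          "support it. Please install a build compiled without SSE2.")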
URL: From chris.barker at noaa.gov Wed Oct 14 12:47:23 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 14 Oct 2015 09:47:23 -0700 Subject: [Numpy-discussion] Numpy 1.10.1 released. In-Reply-To: References: Message-ID: On Wed, Oct 14, 2015 at 9:38 AM, Nathaniel Smith wrote: > The change in 1.10.1 only affects msvc, which is not what most people are > using (IIUC Enthought Canopy uses msvc, but the pypi, gohlke, and Anaconda > builds don't). > Anaconda uses MSVC for the most part -- they _may_ compile numpy itself some other way, no one but continuum knows for sure :-) > I'm actually not sure if anyone even uses the 32 bit builds at all :-) > There's a lot of 32 bit python use out there still, including numpy. We ever figure out a way to get a meaningful message to the user if they > try to run an SSE2 build on a machine without SSE2? > > It's not that difficult in principle, just someone has to do it :-). > yeah, there's always that .... -CHB > -n > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From hodge at stsci.edu Wed Oct 14 13:34:42 2015 From: hodge at stsci.edu (Phil Hodge) Date: Wed, 14 Oct 2015 13:34:42 -0400 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: <561E9232.5080807@stsci.edu> On 10/14/2015 11:59 AM, Chris Barker wrote: > we have no way to know when there will be leap-seconds in the future Leap seconds are announced about six months in advance. Phil From chris.barker at noaa.gov Wed Oct 14 13:55:17 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 14 Oct 2015 10:55:17 -0700 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: <561E9232.5080807@stsci.edu> References: <561E9232.5080807@stsci.edu> Message-ID: On Wed, Oct 14, 2015 at 10:34 AM, Phil Hodge wrote: > On 10/14/2015 11:59 AM, Chris Barker wrote: > >> we have no way to know when there will be leap-seconds in the future >> > > Leap seconds are announced about six months in advance. exactly -- so more than six month, we have no idea. and even within six months, then you'd need to update some sort of database of leapseconds to get it. So depending on what version of the DB someone was using, they'd get different answers. That could all get ugly :-( -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Oct 14 15:38:56 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 14 Oct 2015 21:38:56 +0200 Subject: [Numpy-discussion] Numpy 1.10.1 released. In-Reply-To: References: Message-ID: On Wed, Oct 14, 2015 at 6:47 PM, Chris Barker wrote: > > On Wed, Oct 14, 2015 at 9:38 AM, Nathaniel Smith wrote: > >> I'm actually not sure if anyone even uses the 32 bit builds at all :-) >> > There's a lot of 32 bit python use out there still, including numpy. 
> If you want a quick impression, there are download stats for our binaries: http://sourceforge.net/projects/numpy/files/NumPy/ The total number of 32-bit .exe installer downloads for the last week is ~5000. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Oct 14 15:55:50 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 14 Oct 2015 12:55:50 -0700 Subject: [Numpy-discussion] Numpy 1.10.1 released. In-Reply-To: References: Message-ID: On Wed, Oct 14, 2015 at 12:38 PM, Ralf Gommers wrote: > I'm actually not sure if anyone even uses the 32 bit builds at all :-) >>> >> There's a lot of 32 bit python use out there still, including numpy. >> > > If you want a quick impression, there are download stats for our binaries: > http://sourceforge.net/projects/numpy/files/NumPy/ > > The total number of 32-bit .exe installer downloads for the last week is > ~5000. > That may be somewhat skewed by the fact that we don't provide 64 bit intstallers a t all (or did I miss something?) But nevertheless, plenty of users... -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Wed Oct 14 16:34:35 2015 From: cournape at gmail.com (David Cournapeau) Date: Wed, 14 Oct 2015 21:34:35 +0100 Subject: [Numpy-discussion] Numpy 1.10.1 released. In-Reply-To: References: Message-ID: On Wed, Oct 14, 2015 at 5:38 PM, Nathaniel Smith wrote: > On Oct 14, 2015 9:15 AM, "Chris Barker" wrote: > > > > On Mon, Oct 12, 2015 at 9:27 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> > >> * Compiling with msvc9 or msvc10 for 32 bit Windows now requires SSE2. > >> This was the easiest fix for what looked to be some miscompiled code > when > >> SSE2 was not used. > > > > > > Note that there is discusion right now on pyton-dev about requireing > SSE2 for teh python.org build of python3.5 -- it does now, so it's fine > for third party pacakges to also require it. But there is some talk of > removing that requirement -- still a lot of old machines around, I guess -- > particular at schools and the like. > > Note that the 1.10.1 release announcement is somewhat misleading -- > apparently the affected builds have actually required SSE2 since numpy 1.8, > and the change here just makes it even more required. I'm not sure if this > is all 32 bit builds or only ones using msvc that have been needing SSE2 > all along. The change in 1.10.1 only affects msvc, which is not what most > people are using (IIUC Enthought Canopy uses msvc, but the pypi, gohlke, > and Anaconda builds don't). > > I'm actually not sure if anyone even uses the 32 bit builds at all :-) > I cannot divulge exact figures for downloads, but for us at Enthought, windows 32 bits is in the same ballpark as OS X and Linux (64 bits) in terms of proportion, windows 64 bits being significantly more popular. Linux 32 bits and OS X 32 bits have been in the 1 % range each of our downloads for a while (we recently stopped support for both). David > > Ideally, any binary wheels on PyPi should be compatible with the > python.org builds -- so not require SSE2, if the python.org builds don't. 
> > > > Though we had this discussion a while back -- and numpy could, and maybe > should require more -- did we ever figure out a way to get a meaningful > message to the user if they try to run an SSE2 build on a machine without > SSE2? > > It's not that difficult in principle, just someone has to do it :-). > > -n > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nadavh at visionsense.com Thu Oct 15 01:10:48 2015 From: nadavh at visionsense.com (Nadav Horesh) Date: Thu, 15 Oct 2015 05:10:48 +0000 Subject: [Numpy-discussion] A regression in numpy 1.10: VERY slow memory mapped file generation In-Reply-To: <561E7BFD.1060506@gmail.com> References: , <561E7BFD.1060506@gmail.com> Message-ID: You right, the delay is not in the memmap: ... _data = N.memmap(filename, dtype=frame_type, mode=mode, offset=fh_size, shape=nframes) data = _data['data'] The delay is in the 2nd line which selects a field from a recarray. I use a common drawing application mypaint that uses numpy, and I think it also suffers from that delay. Thank you, Nadav ________________________________________ From: NumPy-Discussion on behalf of Allan Haldane Sent: 14 October 2015 18:59 To: numpy-discussion at scipy.org Subject: Re: [Numpy-discussion] A regression in numpy 1.10: VERY slow memory mapped file generation On 10/14/2015 01:23 AM, Nadav Horesh wrote: > > I have binary files of size range between few MB to 1GB, which I read process as memory mapped files (via np.memmap). Until numpy 1.9 the creation of recarray on an existing file (without reading its content) was instantaneous, and now it takes ~6 seconds (system: archlinux on sandy bridge). A profiling (using ipython %prun) top of the list is: > > > ncalls tottime percall cumtime percall filename:lineno(function) > 21 3.037 0.145 4.266 0.203 _internal.py:372(_check_field_overlap) > 3713431 1.663 0.000 1.663 0.000 _internal.py:366() > 3713750 0.790 0.000 0.790 0.000 {range} > 3713709 0.406 0.000 0.406 0.000 {method 'update' of 'set' objects} > 322 0.320 0.001 1.984 0.006 {method 'extend' of 'list' objects} > > Nadav. Hi Nadav, The slowdown is due to a problem in PR I introduced to add safety checks to views of structured arrays (to prevent segfaults involving object fields), which will hopefully be fixed quickly. It is being discussed here https://github.com/numpy/numpy/issues/6467 Also, I do not think the problem is with memmap - as far as I have tested, memmmap is still fast. Most likely what is slowing your script down is subsequent access to the fields of the array, which is what has regressed. Is that right? Allan _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion From a.h.jaffe at gmail.com Thu Oct 15 11:49:17 2015 From: a.h.jaffe at gmail.com (Andrew Jaffe) Date: Thu, 15 Oct 2015 16:49:17 +0100 Subject: [Numpy-discussion] how is toolchain selected for compiling (OS X with python.org build)? Message-ID: This isn't, strictly speaking, a numpy question, but I suspect it's something that numpy devs and users have some insight into. I am trying to compile an extension that requires a fairly advanced c++ compiler. Using the built-in apple python, it defaults to the latest clang from apple, and it works just fine. 
Using the python.org framework build, it still selects clang, which is in principle a new enough compiler, but for some reason it seems to end up pointing to /usr/include/c++/4.2.1/ which of course is too old, and the build fails. So the questions I have are: - *why* is it using such an old toolchain (I am pretty sure that the answer is backward compatibility, and specifically because that is the way the framework build python is itself compiled). - *how* is it selecting those tools, and in particular, that include directory? It doesn't seem to explicitly show up in the logs, until there's an error. If I just use the same clang invocation as seems to be used by the build, it is able to compile full C++-11 code... - Is there any way to still use the apple clang, but in full c++-11 mode to build extensions? The solution/workaround is to install and then explicitly select a more advanced compiler, e.g., from homebrew, using environment variables, but it would be nice if it could work out of the box, and ideally with the same behaviour as with apple's python build. -Andrew p.s. for the aficionados, this is for [healpy][1], and we're looking at it with [this issue][2]. [1]: https://github.com/healpy [2]: https://github.com/healpy/healpy/issues/284#issuecomment-148354405 From chris.barker at noaa.gov Thu Oct 15 13:39:03 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 15 Oct 2015 10:39:03 -0700 Subject: [Numpy-discussion] how is toolchain selected for compiling (OS X with python.org build)? In-Reply-To: References: Message-ID: you might try the python-mac list: https://mail.python.org/mailman/listinfo/pythonmac-sig not very active, but folks there know what they are doing :-) -CHB On Thu, Oct 15, 2015 at 8:49 AM, Andrew Jaffe wrote: > This isn't, strictly speaking, a numpy question, but I suspect it's > something that numpy devs and users have some insight into. > > I am trying to compile an extension that requires a fairly advanced c++ > compiler. Using the built-in apple python, it defaults to the latest clang > from apple, and it works just fine. > > Using the python.org framework build, it still selects clang, which is in > principle a new enough compiler, but for some reason it seems to end up > pointing to /usr/include/c++/4.2.1/ which of course is too old, and the > build fails. > > So the questions I have are: > > - *why* is it using such an old toolchain (I am pretty sure that the > answer is backward compatibility, and specifically because that is the way > the framework build python is itself compiled). > > - *how* is it selecting those tools, and in particular, that include > directory? It doesn't seem to explicitly show up in the logs, until there's > an error. If I just use the same clang invocation as seems to be used by > the build, it is able to compile full C++-11 code... > > - Is there any way to still use the apple clang, but in full c++-11 mode > to build extensions? > > The solution/workaround is to install and then explicitly select a more > advanced compiler, e.g., from homebrew, using environment variables, but it > would be nice if it could work out of the box, and ideally with the same > behaviour as with apple's python build. > > -Andrew > > p.s. for the aficionados, this is for [healpy][1], and we're looking at it > with [this issue][2]. 
> > [1]: https://github.com/healpy > [2]: https://github.com/healpy/healpy/issues/284#issuecomment-148354405 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Oct 15 23:28:23 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 15 Oct 2015 21:28:23 -0600 Subject: [Numpy-discussion] Interesting discussion on copyrighting files. Message-ID: Worth a read at A&D . Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Fri Oct 16 13:19:58 2015 From: ndarray at mac.com (Alexander Belopolsky) Date: Fri, 16 Oct 2015 13:19:58 -0400 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: On Tue, Oct 13, 2015 at 6:48 PM, Chris Barker wrote: > And because we probably want fast absolute delta computation, when we add > timezones, we'll probably want to store the datetime in UTC, and apply the > timezone on I/O. > > Alexander: Am I right that we don't need the "fold" bit in this case? > You'd still need it when specifying a time in a timezone with folds.. -- > but again, only on I/O Since Guido hates leap seconds, PEP 495 is silent on this issue, but strictly speaking UTC leap seconds are "folds." AFAICT, a strictly POSIX system must repeat the same value of time_t when a leap second is inserted. While datetime will never extend the second field to allow second=60, with PEP 495, it is now possible to represent 23:59:60 as 23:59:59/fold=1. Apart from leap seconds, there is no need to use "fold" on datetimes that represent time in UTC or any timezone at a fixed offset from utc. -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Oct 16 13:40:15 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 16 Oct 2015 13:40:15 -0400 Subject: [Numpy-discussion] Bug Message-ID: -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Oct 16 13:41:25 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 16 Oct 2015 13:41:25 -0400 Subject: [Numpy-discussion] Bug In-Reply-To: References: Message-ID: Sorry, wrong shortcut key, question will arrive later. Josef On Fri, Oct 16, 2015 at 1:40 PM, wrote: > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Oct 16 13:58:39 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 16 Oct 2015 13:58:39 -0400 Subject: [Numpy-discussion] numpy 1.10.1 reduce operation on recarrays Message-ID: was there a change with reduce operations with recarrays in 1.10 or 1.10.1? Travis shows a new test failure in the statsmodels testsuite with 1.10.1: ERROR: test suite for File "/home/travis/miniconda/envs/statsmodels-test/lib/python2.7/site-packages/statsmodels-0.8.0-py2.7-linux-x86_64.egg/statsmodels/base/data.py", line 131, in _handle_constant const_idx = np.where(self.exog.ptp(axis=0) == 0)[0].squeeze() TypeError: cannot perform reduce with flexible type Sorry for asking so late. 
(statsmodels is short on maintainers, and I'm distracted) statsmodels still has code to support recarrays and structured dtypes from the time before pandas became popular, but I don't think anyone is using them together with statsmodels anymore. Josef -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Oct 16 14:20:36 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 16 Oct 2015 12:20:36 -0600 Subject: [Numpy-discussion] numpy 1.10.1 reduce operation on recarrays In-Reply-To: References: Message-ID: On Fri, Oct 16, 2015 at 11:58 AM, wrote: > was there a change with reduce operations with recarrays in 1.10 or 1.10.1? > > Travis shows a new test failure in the statsmodels testsuite with 1.10.1: > > ERROR: test suite for 'statsmodels.base.tests.test_data.TestRecarrays'> > > File > "/home/travis/miniconda/envs/statsmodels-test/lib/python2.7/site-packages/statsmodels-0.8.0-py2.7-linux-x86_64.egg/statsmodels/base/data.py", > line 131, in _handle_constant > const_idx = np.where(self.exog.ptp(axis=0) == 0)[0].squeeze() > TypeError: cannot perform reduce with flexible type > > > Sorry for asking so late. > (statsmodels is short on maintainers, and I'm distracted) > > > statsmodels still has code to support recarrays and structured dtypes from > the time before pandas became popular, but I don't think anyone is using > them together with statsmodels anymore. > > There were several commits dealing both recarrays and ufuncs, so this might well be a regression. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Oct 16 14:21:19 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 16 Oct 2015 12:21:19 -0600 Subject: [Numpy-discussion] numpy 1.10.1 reduce operation on recarrays In-Reply-To: References: Message-ID: On Fri, Oct 16, 2015 at 12:20 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Fri, Oct 16, 2015 at 11:58 AM, wrote: > >> was there a change with reduce operations with recarrays in 1.10 or >> 1.10.1? >> >> Travis shows a new test failure in the statsmodels testsuite with 1.10.1: >> >> ERROR: test suite for > 'statsmodels.base.tests.test_data.TestRecarrays'> >> >> File >> "/home/travis/miniconda/envs/statsmodels-test/lib/python2.7/site-packages/statsmodels-0.8.0-py2.7-linux-x86_64.egg/statsmodels/base/data.py", >> line 131, in _handle_constant >> const_idx = np.where(self.exog.ptp(axis=0) == 0)[0].squeeze() >> TypeError: cannot perform reduce with flexible type >> >> >> Sorry for asking so late. >> (statsmodels is short on maintainers, and I'm distracted) >> >> >> statsmodels still has code to support recarrays and structured dtypes >> from the time before pandas became popular, but I don't think anyone is >> using them together with statsmodels anymore. >> >> > There were several commits dealing both recarrays and ufuncs, so this > might well be a regression. > > A bisection would be helpful. Also, open an issue. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
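On the "I don't even know how to automatically bisect" point: `git bisect run` can do the walking for you, given a small script whose exit status says good/bad. A rough sketch of such a driver -- the dtype below is just a stand-in for whatever statsmodels actually feeds in, and each bisect step still needs an in-place rebuild (e.g. `python setup.py build_ext --inplace`) before the script runs:

    # bisect_test.py -- hypothetical driver script for `git bisect run`
    import sys
    import numpy as np

    x = np.zeros(9, dtype=[('const', 'f8'), ('x_1', 'f8'), ('x_2', 'f8')]).view(np.recarray)
    try:
        # the statsmodels-style conversion to a plain float array, then a reduce
        x.view((float, 3)).ptp(axis=0)
    except TypeError:
        sys.exit(1)   # bad commit: the reduce still fails
    sys.exit(0)       # good commit

Then something like `git bisect start v1.10.1 v1.9.2` followed by `git bisect run <rebuild-and-run command>` narrows it down to a single commit.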
URL: From josef.pktd at gmail.com Fri Oct 16 17:31:37 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 16 Oct 2015 17:31:37 -0400 Subject: [Numpy-discussion] numpy 1.10.1 reduce operation on recarrays In-Reply-To: References: Message-ID: On Fri, Oct 16, 2015 at 2:21 PM, Charles R Harris wrote: > > > On Fri, Oct 16, 2015 at 12:20 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Fri, Oct 16, 2015 at 11:58 AM, wrote: >> >>> was there a change with reduce operations with recarrays in 1.10 or >>> 1.10.1? >>> >>> Travis shows a new test failure in the statsmodels testsuite with 1.10.1: >>> >>> ERROR: test suite for >> 'statsmodels.base.tests.test_data.TestRecarrays'> >>> >>> File >>> "/home/travis/miniconda/envs/statsmodels-test/lib/python2.7/site-packages/statsmodels-0.8.0-py2.7-linux-x86_64.egg/statsmodels/base/data.py", >>> line 131, in _handle_constant >>> const_idx = np.where(self.exog.ptp(axis=0) == 0)[0].squeeze() >>> TypeError: cannot perform reduce with flexible type >>> >>> >>> Sorry for asking so late. >>> (statsmodels is short on maintainers, and I'm distracted) >>> >>> >>> statsmodels still has code to support recarrays and structured dtypes >>> from the time before pandas became popular, but I don't think anyone is >>> using them together with statsmodels anymore. >>> >>> >> There were several commits dealing both recarrays and ufuncs, so this >> might well be a regression. >> >> > A bisection would be helpful. Also, open an issue. > The reason for the test failure might be somewhere else hiding behind several layers of statsmodels, but only started to show up with numpy 1.10.1 I already have the reduce exception with my currently installed numpy '1.9.2rc1' >>> x = np.random.random(9*3).view([('const', 'f8'),('x_1', 'f8'), ('x_2', 'f8')]).view(np.recarray) >>> np.ptp(x, axis=0) Traceback (most recent call last): File "", line 1, in File "C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\numpy\core\fromnumeric.py", line 2047, in ptp return ptp(axis, out) TypeError: cannot perform reduce with flexible type Sounds like fun, and I don't even know how to automatically bisect. Josef > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Fri Oct 16 20:56:31 2015 From: allanhaldane at gmail.com (Allan Haldane) Date: Fri, 16 Oct 2015 20:56:31 -0400 Subject: [Numpy-discussion] numpy 1.10.1 reduce operation on recarrays In-Reply-To: References: Message-ID: <56219CBF.1070605@gmail.com> On 10/16/2015 05:31 PM, josef.pktd at gmail.com wrote: > > > On Fri, Oct 16, 2015 at 2:21 PM, Charles R Harris > > wrote: > > > > On Fri, Oct 16, 2015 at 12:20 PM, Charles R Harris > > wrote: > > > > On Fri, Oct 16, 2015 at 11:58 AM, > wrote: > > was there a change with reduce operations with recarrays in > 1.10 or 1.10.1? 
> > Travis shows a new test failure in the statsmodels testsuite > with 1.10.1: > > ERROR: test suite for 'statsmodels.base.tests.test_data.TestRecarrays'> > > File > "/home/travis/miniconda/envs/statsmodels-test/lib/python2.7/site-packages/statsmodels-0.8.0-py2.7-linux-x86_64.egg/statsmodels/base/data.py", > line 131, in _handle_constant > const_idx = np.where(self.exog.ptp(axis=0) == > 0)[0].squeeze() > TypeError: cannot perform reduce with flexible type > > > Sorry for asking so late. > (statsmodels is short on maintainers, and I'm distracted) > > > statsmodels still has code to support recarrays and > structured dtypes from the time before pandas became > popular, but I don't think anyone is using them together > with statsmodels anymore. > > > There were several commits dealing both recarrays and ufuncs, so > this might well be a regression. > > > A bisection would be helpful. Also, open an issue. > > > > The reason for the test failure might be somewhere else hiding behind > several layers of statsmodels, but only started to show up with numpy 1.10.1 > > I already have the reduce exception with my currently installed numpy > '1.9.2rc1' > >>>> x = np.random.random(9*3).view([('const', 'f8'),('x_1', 'f8'), > ('x_2', 'f8')]).view(np.recarray) > >>>> np.ptp(x, axis=0) > Traceback (most recent call last): > File "", line 1, in > File > "C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\numpy\core\fromnumeric.py", > line 2047, in ptp > return ptp(axis, out) > TypeError: cannot perform reduce with flexible type > > > Sounds like fun, and I don't even know how to automatically bisect. > > Josef That example isn't the problem (ptp should definitely fail on structured arrays), but I've tracked down what is - it has to do with views of record arrays. The fix looks simple, I'll get it in for the next release. Allan From josef.pktd at gmail.com Fri Oct 16 21:17:22 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 16 Oct 2015 21:17:22 -0400 Subject: [Numpy-discussion] numpy 1.10.1 reduce operation on recarrays In-Reply-To: <56219CBF.1070605@gmail.com> References: <56219CBF.1070605@gmail.com> Message-ID: On Fri, Oct 16, 2015 at 8:56 PM, Allan Haldane wrote: > On 10/16/2015 05:31 PM, josef.pktd at gmail.com wrote: > > > > > > On Fri, Oct 16, 2015 at 2:21 PM, Charles R Harris > > > wrote: > > > > > > > > On Fri, Oct 16, 2015 at 12:20 PM, Charles R Harris > > > > wrote: > > > > > > > > On Fri, Oct 16, 2015 at 11:58 AM, > > wrote: > > > > was there a change with reduce operations with recarrays in > > 1.10 or 1.10.1? > > > > Travis shows a new test failure in the statsmodels testsuite > > with 1.10.1: > > > > ERROR: test suite for > 'statsmodels.base.tests.test_data.TestRecarrays'> > > > > File > > > "/home/travis/miniconda/envs/statsmodels-test/lib/python2.7/site-packages/statsmodels-0.8.0-py2.7-linux-x86_64.egg/statsmodels/base/data.py", > > line 131, in _handle_constant > > const_idx = np.where(self.exog.ptp(axis=0) == > > 0)[0].squeeze() > > TypeError: cannot perform reduce with flexible type > > > > > > Sorry for asking so late. > > (statsmodels is short on maintainers, and I'm distracted) > > > > > > statsmodels still has code to support recarrays and > > structured dtypes from the time before pandas became > > popular, but I don't think anyone is using them together > > with statsmodels anymore. > > > > > > There were several commits dealing both recarrays and ufuncs, so > > this might well be a regression. > > > > > > A bisection would be helpful. 
Also, open an issue. > > > > > > > > The reason for the test failure might be somewhere else hiding behind > > several layers of statsmodels, but only started to show up with numpy > 1.10.1 > > > > I already have the reduce exception with my currently installed numpy > > '1.9.2rc1' > > > >>>> x = np.random.random(9*3).view([('const', 'f8'),('x_1', 'f8'), > > ('x_2', 'f8')]).view(np.recarray) > > > >>>> np.ptp(x, axis=0) > > Traceback (most recent call last): > > File "", line 1, in > > File > > > "C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\numpy\core\fromnumeric.py", > > line 2047, in ptp > > return ptp(axis, out) > > TypeError: cannot perform reduce with flexible type > > > > > > Sounds like fun, and I don't even know how to automatically bisect. > > > > Josef > > That example isn't the problem (ptp should definitely fail on structured > arrays), but I've tracked down what is - it has to do with views of > record arrays. > > The fix looks simple, I'll get it in for the next release. > Thanks, I realized that at that point in the statsmodels code we should have only regular ndarrays, so the array conversion fails somewhere. AFAICS, the main helper function to convert is def struct_to_ndarray(arr): return arr.view((float, len(arr.dtype.names))) which doesn't look like it will handle other dtypes than float64. Nobody ever complained, so maybe our test suite is the only user of this. What is now the recommended way of converting structured dtypes/recarrays to ndarrays? Josef > > Allan > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Fri Oct 16 21:31:11 2015 From: allanhaldane at gmail.com (Allan Haldane) Date: Fri, 16 Oct 2015 21:31:11 -0400 Subject: [Numpy-discussion] numpy 1.10.1 reduce operation on recarrays In-Reply-To: References: <56219CBF.1070605@gmail.com> Message-ID: <5621A4DF.6080001@gmail.com> On 10/16/2015 09:17 PM, josef.pktd at gmail.com wrote: > > > On Fri, Oct 16, 2015 at 8:56 PM, Allan Haldane > wrote: > > On 10/16/2015 05:31 PM, josef.pktd at gmail.com > wrote: > > > > > > On Fri, Oct 16, 2015 at 2:21 PM, Charles R Harris > > > >> wrote: > > > > > > > > On Fri, Oct 16, 2015 at 12:20 PM, Charles R Harris > > > >> wrote: > > > > > > > > On Fri, Oct 16, 2015 at 11:58 AM, > > >> wrote: > > > > was there a change with reduce operations with > recarrays in > > 1.10 or 1.10.1? > > > > Travis shows a new test failure in the statsmodels > testsuite > > with 1.10.1: > > > > ERROR: test suite for > 'statsmodels.base.tests.test_data.TestRecarrays'> > > > > File > > > "/home/travis/miniconda/envs/statsmodels-test/lib/python2.7/site-packages/statsmodels-0.8.0-py2.7-linux-x86_64.egg/statsmodels/base/data.py", > > line 131, in _handle_constant > > const_idx = np.where(self.exog.ptp(axis=0) == > > 0)[0].squeeze() > > TypeError: cannot perform reduce with flexible type > > > > > > Sorry for asking so late. > > (statsmodels is short on maintainers, and I'm distracted) > > > > > > statsmodels still has code to support recarrays and > > structured dtypes from the time before pandas became > > popular, but I don't think anyone is using them together > > with statsmodels anymore. > > > > > > There were several commits dealing both recarrays and > ufuncs, so > > this might well be a regression. 
> > > > > > A bisection would be helpful. Also, open an issue. > > > > > > > > The reason for the test failure might be somewhere else hiding behind > > several layers of statsmodels, but only started to show up with > numpy 1.10.1 > > > > I already have the reduce exception with my currently installed numpy > > '1.9.2rc1' > > > >>>> x = np.random.random(9*3).view([('const', 'f8'),('x_1', 'f8'), > > ('x_2', 'f8')]).view(np.recarray) > > > >>>> np.ptp(x, axis=0) > > Traceback (most recent call last): > > File "", line 1, in > > File > > > "C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\numpy\core\fromnumeric.py", > > line 2047, in ptp > > return ptp(axis, out) > > TypeError: cannot perform reduce with flexible type > > > > > > Sounds like fun, and I don't even know how to automatically bisect. > > > > Josef > > That example isn't the problem (ptp should definitely fail on structured > arrays), but I've tracked down what is - it has to do with views of > record arrays. > > The fix looks simple, I'll get it in for the next release. > > > Thanks, > > I realized that at that point in the statsmodels code we should have > only regular ndarrays, so the array conversion fails somewhere. > > AFAICS, the main helper function to convert is > > def struct_to_ndarray(arr): > return arr.view((float, len(arr.dtype.names))) > > which doesn't look like it will handle other dtypes than float64. Nobody > ever complained, so maybe our test suite is the only user of this. > > What is now the recommended way of converting structured > dtypes/recarrays to ndarrays? > > Josef Yes, that's the code I narrowed it down to as well. I think the code in statsmodels is fine, the problem is actually a bug I must admit I introduced in changes to the way views of recarrays work. If you are curious, the bug is in this line: https://github.com/numpy/numpy/blob/master/numpy/core/records.py#L467 This line was intended to fix the problem that accessing a nested record array field would lose the 'np.record' dtype. I only considered void structured arrays, and had forgotten about sub-arrays which statsmodels uses. I think the fix is to replace `issubclass(val.type, nt.void)` with `val.names` or something similar. I'll take a closer look soon. Allan From josef.pktd at gmail.com Sat Oct 17 13:24:45 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 17 Oct 2015 13:24:45 -0400 Subject: [Numpy-discussion] Interesting discussion on copyrighting files. In-Reply-To: References: Message-ID: On Thu, Oct 15, 2015 at 11:28 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > Worth a read at A&D . > Thanks, it is worth a read. Most of the time when I see code copied from scipy or statsmodels, it is properly attributed. But every once in a while (like just now) I see code in an interesting sounding package on github where I start to recognize parts because they have my code comments still left in but don't have an attribution to the origin. It's almost ok if it's MIT or BSD licensed because then I can "borrow back" the changes, but not if the new license is GPL. This is to the point in the discussion of seeing modules or functions that got isolated from the parent package. Josef (slightly grumpy) > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Sat Oct 17 15:25:14 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 17 Oct 2015 12:25:14 -0700 Subject: [Numpy-discussion] Interesting discussion on copyrighting files. In-Reply-To: References: Message-ID: On Sat, Oct 17, 2015 at 10:24 AM, wrote: > > On Thu, Oct 15, 2015 at 11:28 PM, Charles R Harris > wrote: >> >> Worth a read at A&D. > > Thanks, it is worth a read. > > Most of the time when I see code copied from scipy or statsmodels, it is > properly attributed. > But every once in a while (like just now) I see code in an interesting > sounding package on github where I start to recognize parts because they > have my code comments still left in but don't have an attribution to the > origin. > > It's almost ok if it's MIT or BSD licensed because then I can "borrow back" > the changes, but not if the new license is GPL. I'm not sure I fully agree about the GPL thing (I understand and sympathize with how annoying it is, but when we fight for BSD licensing then what are we fighting for, if not for the right of random people to take our stuff without letting us "borrow back" changes?), but more importantly it should be noted: People who take MIT/BSD licensed code and strip off the attribution are actually violating the license. This is pretty much the only thing you can do that violates the license, but they're doing it. Better practice is to keep a list of places where code was taken from, and the licenses governing its use, in your LICENSE.txt file: https://github.com/pydata/patsy/blob/master/LICENSE.txt https://github.com/rust-lang/rust/blob/master/COPYRIGHT -n -- Nathaniel J. Smith -- http://vorpus.org From chris.barker at noaa.gov Sat Oct 17 18:59:16 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Sat, 17 Oct 2015 15:59:16 -0700 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: On Fri, Oct 16, 2015 at 10:19 AM, Alexander Belopolsky wrote: > Since Guido hates leap seconds, PEP 495 is silent on this issue, but > strictly speaking UTC leap seconds are "folds." AFAICT, a strictly POSIX > system must repeat the same value of time_t when a leap second is > inserted. While datetime will never extend the second field to > allow second=60, with PEP 495, it is now possible to represent 23:59:60 as 23:59:59/fold=1. Thanks -- If anyone decides to actually get around to leap second support in numpy datetime, s/he can decide whether to do folds or allow second=60. Off the top of my head, I think allowing a 60th second makes more sense -- just like we do leap years. Granted, external systems often don't understand/allow a 60th second, but they generally don't understand a fold bit, either.... -CHB > Apart from leap seconds, there is no need to use "fold" on datetimes that > represent time in UTC or any timezone at a fixed offset from utc. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed...
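To make the leap-second limitation concrete, a small illustration of current datetime64 behaviour (not from the thread itself), using the real leap second inserted at the end of 2015-06-30:

    import numpy as np

    t0 = np.datetime64('2015-06-30T23:59:59')
    t1 = np.datetime64('2015-07-01T00:00:00')

    # 23:59:60 existed between these two instants, so 2 SI seconds elapsed,
    # but datetime64 pretends every day has 86400 seconds and reports 1.
    print(t1 - t0)   # -> 1 seconds

(On numpy 1.10 and earlier these offset-less strings are also interpreted as local time, which is the other half of what this thread is about.)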
URL: From njs at pobox.com Sat Oct 17 19:15:01 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 17 Oct 2015 16:15:01 -0700 Subject: [Numpy-discussion] [SciPy-Dev] Setting up a dev environment with conda In-Reply-To: References: <3A97BA32-9431-4923-A7F3-C9A24D950449@cfa.harvard.edu> <7A24EAB8-98E2-437F-85F1-6D573924FDEC@cfa.harvard.edu> <9DAD3224-EA17-4303-A7B1-3A130DE4F51D@gmail.com> Message-ID: Hi Luke, For day-to-day development and testing of numpy, I don't bother with either inplace builds *or* installing it -- I just use the magical "runtests.py" script that you'll find in the root of your git checkout. E.g., to build and then test the (possibly modified) source in your current checkout, just do: ./runtests.py That's all. This builds into a hidden directory and then sets up the correct PYTHONPATH before running the tests etc. -- you don't have to worry about any of it, it's magic. There are also lots of options, see ./runtests.py --help. Try adding -j for multi-core builds, or you can specify arbitrary options to pass to nose, or you can run it under gdb (there's an example in --help), or if you just want an interactive shell to futz around in manually instead of running the test suite then try passing --ipython. BTW, numpy has its own mailing list at numpy-discussion at scipy.org (CC'ed), which is where numpy development discussions usually take place -- this list is more for scipy-the-package itself. There's lots of overlap in readership between the two lists, but numpy-discussion will probably give you quicker and more useful answers to questions like this in general :-) -n On Sat, Oct 17, 2015 at 4:03 PM, Luke Zoltan Kelley wrote: > Thanks Nathan, I'll try that. Both without the inplace build, I'll have to > rebuild and install everytime I want to test something, right? > > On Oct 17, 2015, at 6:42 PM, Nathan Woods wrote: > > My best guess is to nuke and reclone Numpy, then do setup.py install without > the inplace build. What you're doing seems like it should work, though, so > I'm not sure what's going on. > > Nathan Woods > > On Oct 17, 2015, at 3:59 PM, Luke Zoltan Kelley > wrote: > > When trying to do that is when I get the error I described in the OP. i.e. > I get an error when trying to install. > > > On Oct 17, 2015, at 5:57 PM, Andrew Nelson wrote: > > It would get installed into whatever conda environment you had activated. > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > https://mail.scipy.org/mailman/listinfo/scipy-dev > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > https://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > https://mail.scipy.org/mailman/listinfo/scipy-dev > > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > https://mail.scipy.org/mailman/listinfo/scipy-dev > -- Nathaniel J. Smith -- http://vorpus.org From nadavh at visionsense.com Sun Oct 18 00:38:04 2015 From: nadavh at visionsense.com (Nadav Horesh) Date: Sun, 18 Oct 2015 04:38:04 +0000 Subject: [Numpy-discussion] dot product: large speed difference metween seemingly indentical operations Message-ID: The functions dot, matmul and tensordot performs the same on a MxN matrix multiplied by length N vector, but very different if the matrix is replaced by a PxQxN array. Why? 
In [3]: a = rand(1000000,3) In [4]: a1 = a.reshape(1000,1000,3) In [5]: w = rand(3) In [6]: %timeit a.dot(w) 100 loops, best of 3: 3.47 ms per loop In [7]: %timeit a1.dot(w) # Very slow! 10 loops, best of 3: 25.5 ms per loop In [8]: %timeit a at w 100 loops, best of 3: 3.45 ms per loop In [9]: %timeit a1 at w 100 loops, best of 3: 6.77 ms per loop In [10]: %timeit tensordot(a,w,1) 100 loops, best of 3: 3.44 ms per loop In [11]: %timeit tensordot(a1,w,1) 100 loops, best of 3: 3.41 ms per loop BTW, this is not a corner case, since PxQx3 arrays represent RGB images. ? Nadav From ndarray at mac.com Sun Oct 18 15:20:03 2015 From: ndarray at mac.com (Alexander Belopolsky) Date: Sun, 18 Oct 2015 15:20:03 -0400 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: On Sat, Oct 17, 2015 at 6:59 PM, Chris Barker wrote: > Off the top of my head, I think allowing a 60th second makes more sense -- > jsut like we do leap years. Yet we don't implement DST by allowing the 24th hour. Even the countries that adjust the clocks at midnight don't do that. In some sense leap seconds are more similar to timezone changes (DST or political) because they are irregular and unpredictable. Furthermore, the notion of "fold" is not tied to a particular 24/60/60 system of encoding times and thus more applicable to numpy where times are encoded as binary integers. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Sun Oct 18 15:57:36 2015 From: ndarray at mac.com (Alexander Belopolsky) Date: Sun, 18 Oct 2015 15:57:36 -0400 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: On Sat, Oct 17, 2015 at 6:59 PM, Chris Barker wrote: > If anyone decides to actually get around to leap seconds support in numpy > datetime, s/he can decide ... This attitude is the reason why we will probably never have bug free software when it comes to civil time reckoning. Even though ANSI C has the difftime(time_t time1, time_t time0) function which in theory may not reduce to time1 - time0, in practice it is only useful to avoid overflows in integer to float conversions in cross-platform code and cannot account for the fact some days are longer than others. Similarly, current numpy.datetime64 design ties arithmetic with encoding. This makes arithmetic easier, but in the long run may preclude designs that better match the problem domain. Note how the development of PEP 495 has highlighted the fact that allowing binary operations (subtraction, comparison etc.) between times in different timezones was a design mistake. It will be wise to learn from such mistakes when redesigning numpy.datetime64. If you ever plan to support civil time in some form, you should think about it now. In Python 3.6, datetime.now() will return different values in the first and the second repeated hour in the "fall-back fold." If you allow datetime.datetime to numpy.datetime64 conversion, you should decide what you do with that difference. -------------- next part -------------- An HTML attachment was scrubbed... URL: From lzkelley at gmail.com Sun Oct 18 16:25:15 2015 From: lzkelley at gmail.com (Luke Zoltan Kelley) Date: Sun, 18 Oct 2015 16:25:15 -0400 Subject: [Numpy-discussion] [SciPy-Dev] Setting up a dev environment with conda Message-ID: Thanks for the help Nathaniel --- but building via `./runtests.py` is failing in the same way. Hopefully Numpy-discussion can help me out. 
I'm able to build using `python setup.py build_ext --inplace` but both trying to run `python setup.py install` or `./runtests.py` leads to the following error: (numpy-py27)daedalus-2:numpy lzkelley$ ./runtests.py Building, see build.log... Running from numpy source directory. Traceback (most recent call last): File "setup.py", line 264, in setup_package() File "setup.py", line 248, in setup_package from numpy.distutils.core import setup File "/Users/lzkelley/Programs/public/numpy/numpy/distutils/__init__.py", line 21, in from numpy.testing import Tester File "/Users/lzkelley/Programs/public/numpy/numpy/testing/__init__.py", line 14, in from .utils import * File "/Users/lzkelley/Programs/public/numpy/numpy/testing/utils.py", line 17, in from numpy.core import float32, empty, arange, array_repr, ndarray File "/Users/lzkelley/Programs/public/numpy/numpy/core/__init__.py", line 59, in test = Tester().test File "/Users/lzkelley/Programs/public/numpy/numpy/testing/nosetester.py", line 180, in __init__ if raise_warnings is None and '.dev0' in np.__version__: AttributeError: 'module' object has no attribute '__version__' Build failed! Has anyone seen something like this before? Thanks! Luke -------------- next part -------------- An HTML attachment was scrubbed... URL: From rainwoodman at gmail.com Sun Oct 18 19:22:26 2015 From: rainwoodman at gmail.com (Feng Yu) Date: Sun, 18 Oct 2015 16:22:26 -0700 Subject: [Numpy-discussion] [SciPy-Dev] Setting up a dev environment with conda In-Reply-To: References: Message-ID: Hi Luke, Could you check if you have "/Users/lzkelley/Programs/public/numpy/ in your PYTHONPATH? I would also suggest you add a print(np) line before the crash in nosetester.py. I got something like this (which didn't crash): If you see something not starting with 'numpy/build', then it is again pointing at PYTHONPATH. I hope these helps. Best, - Yu On Sun, Oct 18, 2015 at 1:25 PM, Luke Zoltan Kelley wrote: > Thanks for the help Nathaniel --- but building via `./runtests.py` is > failing in the same way. Hopefully Numpy-discussion can help me out. > > I'm able to build using `python setup.py build_ext --inplace` but both > trying to run `python setup.py install` or `./runtests.py` leads to the > following error: > > (numpy-py27)daedalus-2:numpy lzkelley$ ./runtests.py > Building, see build.log... > Running from numpy source directory. > Traceback (most recent call last): > File "setup.py", line 264, in > setup_package() > File "setup.py", line 248, in setup_package > from numpy.distutils.core import setup > File "/Users/lzkelley/Programs/public/numpy/numpy/distutils/__init__.py", > line 21, in > from numpy.testing import Tester > File "/Users/lzkelley/Programs/public/numpy/numpy/testing/__init__.py", > line 14, in > from .utils import * > File "/Users/lzkelley/Programs/public/numpy/numpy/testing/utils.py", line > 17, in > from numpy.core import float32, empty, arange, array_repr, ndarray > File "/Users/lzkelley/Programs/public/numpy/numpy/core/__init__.py", line > 59, in > test = Tester().test > File "/Users/lzkelley/Programs/public/numpy/numpy/testing/nosetester.py", > line 180, in __init__ > if raise_warnings is None and '.dev0' in np.__version__: > AttributeError: 'module' object has no attribute '__version__' > > Build failed! > > > Has anyone seen something like this before? > > Thanks! 
> Luke > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From lzkelley at gmail.com Sun Oct 18 19:46:51 2015 From: lzkelley at gmail.com (Luke Zoltan Kelley) Date: Sun, 18 Oct 2015 19:46:51 -0400 Subject: [Numpy-discussion] [SciPy-Dev] Setting up a dev environment with conda In-Reply-To: References: Message-ID: Thanks Yu, There was nothing in my PYTHONPATH at first, and adding my numpy directory ('/Users/lzkelley/Programs/public/numpy') didn't help (same error). In both cases, adding 'print(np)' yields: > On Oct 18, 2015, at 7:22 PM, Feng Yu wrote: > > Hi Luke, > > Could you check if you have "/Users/lzkelley/Programs/public/numpy/ in > your PYTHONPATH? > > I would also suggest you add a print(np) line before the crash in > nosetester.py. I got something like this (which didn't crash): > > '/home/yfeng1/source/numpy/build/testenv/lib64/python2.7/site-packages/numpy/__init__.pyc'> > > If you see something not starting with 'numpy/build', then it is again > pointing at PYTHONPATH. > > I hope these helps. > > Best, > > - Yu > > On Sun, Oct 18, 2015 at 1:25 PM, Luke Zoltan Kelley wrote: >> Thanks for the help Nathaniel --- but building via `./runtests.py` is >> failing in the same way. Hopefully Numpy-discussion can help me out. >> >> I'm able to build using `python setup.py build_ext --inplace` but both >> trying to run `python setup.py install` or `./runtests.py` leads to the >> following error: >> >> (numpy-py27)daedalus-2:numpy lzkelley$ ./runtests.py >> Building, see build.log... >> Running from numpy source directory. >> Traceback (most recent call last): >> File "setup.py", line 264, in >> setup_package() >> File "setup.py", line 248, in setup_package >> from numpy.distutils.core import setup >> File "/Users/lzkelley/Programs/public/numpy/numpy/distutils/__init__.py", >> line 21, in >> from numpy.testing import Tester >> File "/Users/lzkelley/Programs/public/numpy/numpy/testing/__init__.py", >> line 14, in >> from .utils import * >> File "/Users/lzkelley/Programs/public/numpy/numpy/testing/utils.py", line >> 17, in >> from numpy.core import float32, empty, arange, array_repr, ndarray >> File "/Users/lzkelley/Programs/public/numpy/numpy/core/__init__.py", line >> 59, in >> test = Tester().test >> File "/Users/lzkelley/Programs/public/numpy/numpy/testing/nosetester.py", >> line 180, in __init__ >> if raise_warnings is None and '.dev0' in np.__version__: >> AttributeError: 'module' object has no attribute '__version__' >> >> Build failed! >> >> >> Has anyone seen something like this before? >> >> Thanks! >> Luke >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From msarahan at gmail.com Sun Oct 18 20:02:34 2015 From: msarahan at gmail.com (Michael Sarahan) Date: Mon, 19 Oct 2015 00:02:34 +0000 Subject: [Numpy-discussion] [SciPy-Dev] Setting up a dev environment with conda In-Reply-To: References: Message-ID: Running tests in the folder might be causing your problem. If it's trying to import numpy, and numpy is a folder in your current folder, sometimes you see errors like this. 
The confusion is that Python treats folders (packages) similarly to modules, and the resolution order sometimes bites you. Try cd'ing to a different folder (importantly, one NOT containing a numpy folder!) and run the test command from there. HTH, Michael On Sun, Oct 18, 2015 at 6:46 PM Luke Zoltan Kelley wrote: > Thanks Yu, > > There was nothing in my PYTHONPATH at first, and adding my numpy directory > ('/Users/lzkelley/Programs/public/numpy') didn't help (same error). In > both cases, adding 'print(np)' yields: > > '/Users/lzkelley/Programs/public/numpy/numpy/__init__.pyc'> > > > > On Oct 18, 2015, at 7:22 PM, Feng Yu wrote: > > > > Hi Luke, > > > > Could you check if you have "/Users/lzkelley/Programs/public/numpy/ in > > your PYTHONPATH? > > > > I would also suggest you add a print(np) line before the crash in > > nosetester.py. I got something like this (which didn't crash): > > > > > > '/home/yfeng1/source/numpy/build/testenv/lib64/python2.7/site-packages/numpy/__init__.pyc'> > > > > If you see something not starting with 'numpy/build', then it is again > > pointing at PYTHONPATH. > > > > I hope these helps. > > > > Best, > > > > - Yu > > > > On Sun, Oct 18, 2015 at 1:25 PM, Luke Zoltan Kelley > wrote: > >> Thanks for the help Nathaniel --- but building via `./runtests.py` is > >> failing in the same way. Hopefully Numpy-discussion can help me out. > >> > >> I'm able to build using `python setup.py build_ext --inplace` but both > >> trying to run `python setup.py install` or `./runtests.py` leads to the > >> following error: > >> > >> (numpy-py27)daedalus-2:numpy lzkelley$ ./runtests.py > >> Building, see build.log... > >> Running from numpy source directory. > >> Traceback (most recent call last): > >> File "setup.py", line 264, in > >> setup_package() > >> File "setup.py", line 248, in setup_package > >> from numpy.distutils.core import setup > >> File > "/Users/lzkelley/Programs/public/numpy/numpy/distutils/__init__.py", > >> line 21, in > >> from numpy.testing import Tester > >> File "/Users/lzkelley/Programs/public/numpy/numpy/testing/__init__.py", > >> line 14, in > >> from .utils import * > >> File "/Users/lzkelley/Programs/public/numpy/numpy/testing/utils.py", > line > >> 17, in > >> from numpy.core import float32, empty, arange, array_repr, ndarray > >> File "/Users/lzkelley/Programs/public/numpy/numpy/core/__init__.py", > line > >> 59, in > >> test = Tester().test > >> File > "/Users/lzkelley/Programs/public/numpy/numpy/testing/nosetester.py", > >> line 180, in __init__ > >> if raise_warnings is None and '.dev0' in np.__version__: > >> AttributeError: 'module' object has no attribute '__version__' > >> > >> Build failed! > >> > >> > >> Has anyone seen something like this before? > >> > >> Thanks! > >> Luke > >> > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Sun Oct 18 21:04:33 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 18 Oct 2015 18:04:33 -0700 Subject: [Numpy-discussion] [SciPy-Dev] Setting up a dev environment with conda In-Reply-To: References: Message-ID: On Sun, Oct 18, 2015 at 5:02 PM, Michael Sarahan wrote: > Running tests in the folder might be causing your problem. If it's trying > to import numpy, and numpy is a folder in your current folder, sometimes you > see errors like this. The confusion is that Python treats folders > (packages) similarly to modules, and the resolution order sometimes bites > you. Try cd'ing to a different folder (importantly, one NOT containing a > numpy folder!) and run the test command from there. This isn't the problem -- ./runtests.py is designed to work fine when run from the root of the numpy checkout. You might try nuking your checkout and environment and starting over just in case your earlier attempts left behind some broken detritus somewhere. 'git clean -xdf' will clear everything out of a git directory aside from tracked files (so make sure to add any new files you want to keep first!). -n -- Nathaniel J. Smith -- http://vorpus.org From rmcgibbo at gmail.com Sun Oct 18 21:40:04 2015 From: rmcgibbo at gmail.com (Robert McGibbon) Date: Sun, 18 Oct 2015 18:40:04 -0700 Subject: [Numpy-discussion] numpy-1.11.0.dev0 windows wheels compiled with mingwpy available In-Reply-To: References: Message-ID: Hi, Is it possible to test this with py35 as well? For MSVC, py35 requires a new compiler toolchain (VS2015) -- is that something mingwpy/mingw-w64 can handle? -Robert On Fri, Oct 9, 2015 at 3:29 PM, Carl Kleffner wrote: > I made numpy master (numpy-1.11.0.dev0 , > https://github.com/numpy/numpy/commit/0243bce23383ff5e894b99e40df2f8fd806ad79f) > windows binary wheels available for testing. > > Install it with pip: > > > pip install -i https://pypi.anaconda.org/carlkl/simple numpy > > These builds are compiled with OPENBLAS trunk for BLAS/LAPACK support and > the mingwpy compiler toolchain. > > OpenBLAS is deployed within the numpy wheels. To be performant on all > usual CPU architectures OpenBLAS is configured with it's 'dynamic > architecture' and automatic CPU detection. > > This version of numpy fakes long double as double just like the MSVC > builds. > > Some test statistics: > > win32 (32 bit) > numpy-1.11.0.dev0, python-2.6: errors=8, failures=1 > numpy-1.11.0.dev0, python-2.7: errors=8, failures=1 > numpy-1.11.0.dev0, python-3.3: errors=9 > numpy-1.11.0.dev0, python-3.4: errors=9 > > amd64 (64bit) > numpy-1.11.0.dev0, python-2.6: errors=9, failures=6 > numpy-1.11.0.dev0, python-2.7: errors=9, failures=6 > numpy-1.11.0.dev0, python-3.3: errors=10, failures=6 > numpy-1.11.0.dev0, python-3.4: errors=10, failures=6 > > Carl > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lzkelley at gmail.com Sun Oct 18 23:31:46 2015 From: lzkelley at gmail.com (Luke Zoltan Kelley) Date: Sun, 18 Oct 2015 23:31:46 -0400 Subject: [Numpy-discussion] [SciPy-Dev] Setting up a dev environment with conda In-Reply-To: References: Message-ID: I tried cleaning the git dir, and trying again. 
It still didn't work giving me the report: ====================================================================== ERROR: test_scripts.test_f2py ---------------------------------------------------------------------- Traceback (most recent call last): File "/Users/lzkelley/anaconda/envs/numpy-py27/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/Users/lzkelley/Programs/public/numpy/build/testenv/lib/python2.7/site-packages/numpy/testing/decorators.py", line 146, in skipper_func return f(*args, **kwargs) File "/Users/lzkelley/Programs/public/numpy/build/testenv/lib/python2.7/site-packages/numpy/tests/test_scripts.py", line 68, in test_f2py code, stdout, stderr = run_command([f2py_cmd, '-v']) File "/Users/lzkelley/Programs/public/numpy/build/testenv/lib/python2.7/site-packages/numpy/tests/test_scripts.py", line 48, in run_command proc = Popen(cmd, stdout=PIPE, stderr=PIPE) File "/Users/lzkelley/anaconda/envs/numpy-py27/lib/python2.7/subprocess.py", line 710, in __init__ errread, errwrite) File "/Users/lzkelley/anaconda/envs/numpy-py27/lib/python2.7/subprocess.py", line 1335, in _execute_child raise child_exception OSError: [Errno 2] No such file or directory ---------------------------------------------------------------------- Ran 6029 tests in 82.132s FAILED (KNOWNFAIL=6, SKIP=10, errors=2) f2py itself did seem to work fine from the command-line... I did get things (seemingly) working, by cleaning the dir again, and then running: python setupegg.py develop --user This built properly, and now lets me make modification to the source files and have them take effect immediately. This is really all I need for now, but I will try to get the `./runtests.py` working for the future. Perhaps the problem is something to do with my previous python environment installed via macports... > On Oct 18, 2015, at 9:04 PM, Nathaniel Smith wrote: > > On Sun, Oct 18, 2015 at 5:02 PM, Michael Sarahan wrote: >> Running tests in the folder might be causing your problem. If it's trying >> to import numpy, and numpy is a folder in your current folder, sometimes you >> see errors like this. The confusion is that Python treats folders >> (packages) similarly to modules, and the resolution order sometimes bites >> you. Try cd'ing to a different folder (importantly, one NOT containing a >> numpy folder!) and run the test command from there. > > This isn't the problem -- ./runtests.py is designed to work fine when > run from the root of the numpy checkout. > > You might try nuking your checkout and environment and starting over > just in case your earlier attempts left behind some broken detritus > somewhere. 'git clean -xdf' will clear everything out of a git > directory aside from tracked files (so make sure to add any new files > you want to keep first!). > > -n > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Oct 19 00:35:58 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 19 Oct 2015 00:35:58 -0400 Subject: [Numpy-discussion] when did column_stack become C-contiguous? 
Message-ID: >>> np.column_stack((np.ones(10), np.ones(10))).flags C_CONTIGUOUS : True F_CONTIGUOUS : False >>> np.__version__ '1.9.2rc1' on my notebook which has numpy 1.6.1 it is f_contiguous I was just trying to optimize a loop over variable adjustment in regression, and found out that we lost fortran contiguity. I always thought column_stack is for fortran usage (linalg) What's the alternative? column_stack was one of my favorite commands, and I always assumed we have in statsmodels the right memory layout to call the linalg libraries. ("assumed" means we don't have timing nor unit tests for it.) Josef -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Oct 19 00:51:33 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 19 Oct 2015 00:51:33 -0400 Subject: [Numpy-discussion] when did column_stack become C-contiguous? In-Reply-To: References: Message-ID: On Mon, Oct 19, 2015 at 12:35 AM, wrote: > >>> np.column_stack((np.ones(10), np.ones(10))).flags > C_CONTIGUOUS : True > F_CONTIGUOUS : False > > >>> np.__version__ > '1.9.2rc1' > > > on my notebook which has numpy 1.6.1 it is f_contiguous > > > I was just trying to optimize a loop over variable adjustment in > regression, and found out that we lost fortran contiguity. > > I always thought column_stack is for fortran usage (linalg) > > What's the alternative? > column_stack was one of my favorite commands, and I always assumed we have > in statsmodels the right memory layout to call the linalg libraries. > > ("assumed" means we don't have timing nor unit tests for it.) > What's the difference between using array and column_stack except for a transpose and memory order? my current usecase is copying columns on top of each other #exog0 = np.column_stack((np.ones(nobs), x0, x0s2)) exog0 = np.array((np.ones(nobs), x0, x0s2)).T exog_opt = exog0.copy(order='F') the following part is in a loop, followed by some linear algebra for OLS, res_optim is a scalar parameter. exog_opt[:, -1] = np.clip(exog0[:, k] + res_optim, 0, np.inf) Are my assumption on memory access correct, or is there a better way? (I have quite a bit code in statsmodels that is optimized for fortran ordered memory layout especially for sequential regression, under the assumption that column_stack provides that Fortran order.) Also, do I need to start timing and memory benchmarking or is it obvious that a loop for k in range(maxi): x = arr[:, :k] depends on memory order? Josef > > Josef > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Mon Oct 19 01:10:50 2015 From: shoyer at gmail.com (Stephan Hoyer) Date: Sun, 18 Oct 2015 22:10:50 -0700 Subject: [Numpy-discussion] when did column_stack become C-contiguous? In-Reply-To: References: Message-ID: Looking at the git logs, column_stack appears to have been that way (creating a new array with concatenate) since at least NumPy 0.9.2, way back in January 2006: https://github.com/numpy/numpy/blob/v0.9.2/numpy/lib/shape_base.py#L271 Stephan -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Oct 19 01:27:26 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 19 Oct 2015 01:27:26 -0400 Subject: [Numpy-discussion] when did column_stack become C-contiguous? 
In-Reply-To: References: Message-ID: On Mon, Oct 19, 2015 at 1:10 AM, Stephan Hoyer wrote: > Looking at the git logs, column_stack appears to have been that way > (creating a new array with concatenate) since at least NumPy 0.9.2, way > back in January 2006: > https://github.com/numpy/numpy/blob/v0.9.2/numpy/lib/shape_base.py#L271 > Then it must have been changed somewhere else between 1.6.1 amd 1.9.2rc1 I have my notebook and my desktop with different numpy and python versions next to each other and I don't see a typo in my command. I assume python 2.7 versus python 3.4 doesn't make a difference. ------------------ >>> np.column_stack((np.ones(10), np.ones(10))).flags C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False >>> np.__version__ '1.6.1' >>> import sys >>> sys.version '2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit (Intel)]' ---------------- >>> np.column_stack((np.ones(10), np.ones(10))).flags C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False >>> np.__version__ '1.9.2rc1' >>> import sys >>> sys.version '3.4.3 (v3.4.3:9b73f1c3e601, Feb 24 2015, 22:44:40) [MSC v.1600 64 bit (AMD64)]' --------------------------- comparing all flags, owndata also has changed, but I don't think that has any effect Josef > > > Stephan > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Oct 19 01:34:12 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 19 Oct 2015 01:34:12 -0400 Subject: [Numpy-discussion] when did column_stack become C-contiguous? In-Reply-To: References: Message-ID: On Mon, Oct 19, 2015 at 1:27 AM, wrote: > > > On Mon, Oct 19, 2015 at 1:10 AM, Stephan Hoyer wrote: > >> Looking at the git logs, column_stack appears to have been that way >> (creating a new array with concatenate) since at least NumPy 0.9.2, way >> back in January 2006: >> https://github.com/numpy/numpy/blob/v0.9.2/numpy/lib/shape_base.py#L271 >> > > Then it must have been changed somewhere else between 1.6.1 amd 1.9.2rc1 > > I have my notebook and my desktop with different numpy and python versions > next to each other and I don't see a typo in my command. > > I assume python 2.7 versus python 3.4 doesn't make a difference. 
> > ------------------ > > >>> np.column_stack((np.ones(10), np.ones(10))).flags > C_CONTIGUOUS : False > F_CONTIGUOUS : True > OWNDATA : False > WRITEABLE : True > ALIGNED : True > UPDATEIFCOPY : False > > >>> np.__version__ > '1.6.1' > >>> import sys > >>> sys.version > '2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit (Intel)]' > > ---------------- > > >>> np.column_stack((np.ones(10), np.ones(10))).flags > C_CONTIGUOUS : True > F_CONTIGUOUS : False > OWNDATA : True > WRITEABLE : True > ALIGNED : True > UPDATEIFCOPY : False > > >>> np.__version__ > '1.9.2rc1' > >>> import sys > >>> sys.version > '3.4.3 (v3.4.3:9b73f1c3e601, Feb 24 2015, 22:44:40) [MSC v.1600 64 bit > (AMD64)]' > > --------------------------- > > comparing all flags, owndata also has changed, but I don't think that has > any effect > qualification It looks like in 1.9 it depends on the order of the 2-d arrays, which it didn't do in 1.6 >>> np.column_stack((np.ones(10), np.ones((10, 2), order='F'))).flags C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False which means the default order looks more like "K" now, not "C", IIUC Josef > > Josef > > >> >> >> Stephan >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Oct 19 02:14:05 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 18 Oct 2015 23:14:05 -0700 Subject: [Numpy-discussion] when did column_stack become C-contiguous? In-Reply-To: References: Message-ID: On Sun, Oct 18, 2015 at 9:35 PM, wrote: >>>> np.column_stack((np.ones(10), np.ones(10))).flags > C_CONTIGUOUS : True > F_CONTIGUOUS : False > >>>> np.__version__ > '1.9.2rc1' > > > on my notebook which has numpy 1.6.1 it is f_contiguous > > > I was just trying to optimize a loop over variable adjustment in regression, > and found out that we lost fortran contiguity. > > I always thought column_stack is for fortran usage (linalg) > > What's the alternative? > column_stack was one of my favorite commands, and I always assumed we have > in statsmodels the right memory layout to call the linalg libraries. > > ("assumed" means we don't have timing nor unit tests for it.) In general practice no numpy functions make any guarantee about memory layout, unless that's explicitly a documented part of their contract (e.g. 'ascontiguous', or some functions that take an order= argument -- I say "some" b/c there are functions like 'reshape' that take an argument called order= that doesn't actually refer to memory layout). This isn't so much an official policy as just a fact of life -- if no-one has any idea that the someone is depending on some memory layout detail then there's no way to realize that we've broken something. (But it is a good policy IMO.) If this kind of problem gets caught during a pre-release cycle then we generally do try to fix it, because we try not to break code, but if it's been broken for 2 full releases then there's no much we can do -- we can't go back in time to fix it so it sounds like you're stuck working around the problem no matter what (unless you want to refuse to support 1.9.0 through 1.10.1, which I assume you don't... worst case, you just have to do a global search replace of np.column_stack with statsmodels.utils.column_stack_f, right?). 
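For what it's worth, such a wrapper is essentially a one-liner. A minimal sketch -- the statsmodels.utils location and the column_stack_f name are just the hypothetical from the paragraph above:

    import numpy as np

    def column_stack_f(tup):
        # np.column_stack with a guaranteed Fortran-contiguous result;
        # np.asfortranarray copies only when the result is not already F-ordered
        return np.asfortranarray(np.column_stack(tup))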
And the regression issue seems like the only real argument for changing it back -- we'd never guarantee f-contiguity here if starting from a blank slate, I think? -n -- Nathaniel J. Smith -- http://vorpus.org From sebastian at sipsolutions.net Mon Oct 19 05:16:59 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 19 Oct 2015 11:16:59 +0200 Subject: [Numpy-discussion] when did column_stack become C-contiguous? In-Reply-To: References: Message-ID: <1445246219.30031.6.camel@sipsolutions.net> On Mo, 2015-10-19 at 01:34 -0400, josef.pktd at gmail.com wrote: > > > It looks like in 1.9 it depends on the order of the 2-d arrays, which > it didn't do in 1.6 > Yes, it uses concatenate, and concatenate probably changed in 1.7 to use "K" (since "K" did not really exists before 1.7 IIRC). Not sure what we can do about it, the order is not something that is easily fixed unless explicitly given. It might be optimized (as in this case I would guess). Whether or not doing the fastest route for these kind of functions is faster for the user is of course impossible to know, we can only hope that in most cases it is better. If someone has an idea how to decide I am all ears, but I think all we can do is put in asserts/tests in the downstream code if it relies heavily on the order (or just copy, if the order is wrong) :(, another example is change of the output order in advanced indexing in some cases, it makes it faster sometimes, and probably slower in others, what is right seems very much non-trivial. - Sebastian > > >>> np.column_stack((np.ones(10), np.ones((10, 2), order='F'))).flags > C_CONTIGUOUS : False > F_CONTIGUOUS : True > OWNDATA : True > WRITEABLE : True > ALIGNED : True > UPDATEIFCOPY : False > > > > > which means the default order looks more like "K" now, not "C", IIUC > > > Josef > > > > > > Josef > > > > > > Stephan > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From olivier.grisel at ensta.org Mon Oct 19 05:26:42 2015 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Mon, 19 Oct 2015 11:26:42 +0200 Subject: [Numpy-discussion] numpy-1.11.0.dev0 windows wheels compiled with mingwpy available In-Reply-To: References: Message-ID: > Is it possible to test this with py35 as well? Unfortunately not yet. > For MSVC, py35 requires a new compiler toolchain (VS2015) -- is that something mingwpy/mingw-w64 can handle? I am pretty sure that mingwpy does not support Python 3.5 yet. I don't know the status of the interop of mingw-w64 w.r.t. VS2015 but as far as I know it's not supported yet either. Once the issue is fixed at the upstream level, I think mingwpy could be rebuilt to benefit from the fix. -- Olivier Grisel ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From thecy18 at gmail.com Mon Oct 19 06:06:31 2015 From: thecy18 at gmail.com (cy18) Date: Mon, 19 Oct 2015 06:06:31 -0400 Subject: [Numpy-discussion] [Feature Suggestion]More comparison functions for floating point numbers Message-ID: I think these would be useful and easy to implement. 
greater_close(a, b) = greater_equal(a, b) | isclose(a, b) less_close(a, b) = less_equal(a, b) | isclose(a, b) greater_no_close = greater(a, b) & ~isclose(a, b) less_no_close = less(a, b) & ~isclose(a, b) The results are element-wise, just like the original functions. I'm not sure if it is useful enough to be a part of numpy. If so, I will try to implement them and make a pull request. -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Oct 19 08:55:05 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 19 Oct 2015 08:55:05 -0400 Subject: [Numpy-discussion] when did column_stack become C-contiguous? In-Reply-To: References: Message-ID: On Mon, Oct 19, 2015 at 2:14 AM, Nathaniel Smith wrote: > On Sun, Oct 18, 2015 at 9:35 PM, wrote: > >>>> np.column_stack((np.ones(10), np.ones(10))).flags > > C_CONTIGUOUS : True > > F_CONTIGUOUS : False > > > >>>> np.__version__ > > '1.9.2rc1' > > > > > > on my notebook which has numpy 1.6.1 it is f_contiguous > > > > > > I was just trying to optimize a loop over variable adjustment in > regression, > > and found out that we lost fortran contiguity. > > > > I always thought column_stack is for fortran usage (linalg) > > > > What's the alternative? > > column_stack was one of my favorite commands, and I always assumed we > have > > in statsmodels the right memory layout to call the linalg libraries. > > > > ("assumed" means we don't have timing nor unit tests for it.) > > In general practice no numpy functions make any guarantee about memory > layout, unless that's explicitly a documented part of their contract > (e.g. 'ascontiguous', or some functions that take an order= argument > -- I say "some" b/c there are functions like 'reshape' that take an > argument called order= that doesn't actually refer to memory layout). > This isn't so much an official policy as just a fact of life -- if > no-one has any idea that the someone is depending on some memory > layout detail then there's no way to realize that we've broken > something. (But it is a good policy IMO.) > I understand that in general. However, I always thought column_stack is a array creation function which have guaranteed memory layout. And since it's stacking by columns I thought that order is always Fortran. And the fact that it doesn't have an order keyword yet, I thought is just a missing extension. > > If this kind of problem gets caught during a pre-release cycle then we > generally do try to fix it, because we try not to break code, but if > it's been broken for 2 full releases then there's no much we can do -- > we can't go back in time to fix it so it sounds like you're stuck > working around the problem no matter what (unless you want to refuse > to support 1.9.0 through 1.10.1, which I assume you don't... worst > case, you just have to do a global search replace of np.column_stack > with statsmodels.utils.column_stack_f, right?). > > And the regression issue seems like the only real argument for > changing it back -- we'd never guarantee f-contiguity here if starting > from a blank slate, I think? > When the cat is out of the bag, the down stream developer writes compatibility code or helper functions. I will do that at at least the parts I know are intentionally designed for F memory order. --- statsmodels doesn't really check or consistently optimize the memory order, except in some cython functions. But, I thought we should be doing quite well with getting Fortran ordered arrays. 
I only paid attention where we have more extensive loops internally. Nathniel, Does patsy guarantee memory layout (F-contiguous) when creating design matrices? Josef > > -n > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Oct 19 09:00:57 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 19 Oct 2015 09:00:57 -0400 Subject: [Numpy-discussion] when did column_stack become C-contiguous? In-Reply-To: <1445246219.30031.6.camel@sipsolutions.net> References: <1445246219.30031.6.camel@sipsolutions.net> Message-ID: On Mon, Oct 19, 2015 at 5:16 AM, Sebastian Berg wrote: > On Mo, 2015-10-19 at 01:34 -0400, josef.pktd at gmail.com wrote: > > > > > > > > > > It looks like in 1.9 it depends on the order of the 2-d arrays, which > > it didn't do in 1.6 > > > > Yes, it uses concatenate, and concatenate probably changed in 1.7 to use > "K" (since "K" did not really exists before 1.7 IIRC). > Not sure what we can do about it, the order is not something that is > easily fixed unless explicitly given. It might be optimized (as in this > case I would guess). > Whether or not doing the fastest route for these kind of functions is > faster for the user is of course impossible to know, we can only hope > that in most cases it is better. > If someone has an idea how to decide I am all ears, but I think all we > can do is put in asserts/tests in the downstream code if it relies > heavily on the order (or just copy, if the order is wrong) :(, another > example is change of the output order in advanced indexing in some > cases, it makes it faster sometimes, and probably slower in others, what > is right seems very much non-trivial. > To understand the reason: Is this to have more efficient memory access during copying? AFAIU, column_stack needs to create a new array which has to be either F or C contiguous, so we always have to pick one of the two. With a large number of 1d arrays it seemed more "intuitive" to me to copy them by columns. Josef > > - Sebastian > > > > > > >>> np.column_stack((np.ones(10), np.ones((10, 2), order='F'))).flags > > C_CONTIGUOUS : False > > F_CONTIGUOUS : True > > OWNDATA : True > > WRITEABLE : True > > ALIGNED : True > > UPDATEIFCOPY : False > > > > > > > > > > which means the default order looks more like "K" now, not "C", IIUC > > > > > > Josef > > > > > > > > > > > > Josef > > > > > > > > > > > > Stephan > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Oct 19 10:11:24 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 19 Oct 2015 10:11:24 -0400 Subject: [Numpy-discussion] when did column_stack become C-contiguous? 
In-Reply-To: References: <1445246219.30031.6.camel@sipsolutions.net> Message-ID: On Mon, Oct 19, 2015 at 9:00 AM, wrote: > > > On Mon, Oct 19, 2015 at 5:16 AM, Sebastian Berg < > sebastian at sipsolutions.net> wrote: > >> On Mo, 2015-10-19 at 01:34 -0400, josef.pktd at gmail.com wrote: >> > >> >> >> >> >> > >> > It looks like in 1.9 it depends on the order of the 2-d arrays, which >> > it didn't do in 1.6 >> > >> >> Yes, it uses concatenate, and concatenate probably changed in 1.7 to use >> "K" (since "K" did not really exists before 1.7 IIRC). >> Not sure what we can do about it, the order is not something that is >> easily fixed unless explicitly given. It might be optimized (as in this >> case I would guess). >> Whether or not doing the fastest route for these kind of functions is >> faster for the user is of course impossible to know, we can only hope >> that in most cases it is better. >> If someone has an idea how to decide I am all ears, but I think all we >> can do is put in asserts/tests in the downstream code if it relies >> heavily on the order (or just copy, if the order is wrong) :(, another >> example is change of the output order in advanced indexing in some >> cases, it makes it faster sometimes, and probably slower in others, what >> is right seems very much non-trivial. >> > > To understand the reason: > > Is this to have more efficient memory access during copying? > > AFAIU, column_stack needs to create a new array which has to be either F > or C contiguous, so we always have to pick one of the two. With a large > number of 1d arrays it seemed more "intuitive" to me to copy them by > columns. > just as background I was mainly surprised last night about having my long held beliefs shattered. I skipped numpy 1.7 and 1.8 in my development environment and still need to catch up now that I use 1.9 as my main numpy version. I might have to update a bit my "folk wisdom", which is not codified anywhere and doesn't have unit tests. For example, the improvement iteration for Fortran contiguous or not C or F contiguous arrays sounded very useful, but I never checked if it would affect us. Josef > > Josef > > > >> >> - Sebastian >> >> >> > >> > >>> np.column_stack((np.ones(10), np.ones((10, 2), order='F'))).flags >> > C_CONTIGUOUS : False >> > F_CONTIGUOUS : True >> > OWNDATA : True >> > WRITEABLE : True >> > ALIGNED : True >> > UPDATEIFCOPY : False >> > >> > >> > >> > >> > which means the default order looks more like "K" now, not "C", IIUC >> > >> > >> > Josef >> > >> > >> > >> > >> > >> > Josef >> > >> > >> > >> > >> > >> > Stephan >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> > >> > >> > >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jaime.frio at gmail.com Mon Oct 19 12:11:36 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Mon, 19 Oct 2015 09:11:36 -0700 Subject: [Numpy-discussion] Workshop tonight, expect GitHub activity Message-ID: Hi all, As mentioned a few weeks ago, I am organizing a "Become an Open Source Contributor" workshop tonight, for the Data Science Student Society at UCSD. During this morning I will be creating a few ridiculously simple issues, e.g. "missing space, arrayobject --> array object", for participants to work on as part of the workshop. So there may also be a surge in simple PRs starting at around 7 PM PST. Please, bear with us. And refrain from fixing those issues if you are not a workshop participant! Thanks, Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Mon Oct 19 13:27:06 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Mon, 19 Oct 2015 10:27:06 -0700 Subject: [Numpy-discussion] Workshop tonight, expect GitHub activity In-Reply-To: References: Message-ID: On Mon, Oct 19, 2015 at 9:11 AM, Jaime Fern?ndez del R?o < jaime.frio at gmail.com> wrote: > Hi all, > > As mentioned a few weeks ago, I am organizing a "Become an Open Source > Contributor" workshop tonight, for the Data Science Student Society at UCSD. > > During this morning I will be creating a few ridiculously simple issues, > e.g. "missing space, arrayobject --> array object", for participants to > work on as part of the workshop. So there may also be a surge in simple PRs > starting at around 7 PM PST. > Ok, so issues 6515 up to and including 6525 are mine, should have them all fixed and closed by the end of today. Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Oct 19 15:34:56 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 19 Oct 2015 12:34:56 -0700 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: On Sun, Oct 18, 2015 at 12:20 PM, Alexander Belopolsky wrote: > > On Sat, Oct 17, 2015 at 6:59 PM, Chris Barker > wrote: > >> Off the top of my head, I think allowing a 60th second makes more sense >> -- jsut like we do leap years. > > > Yet we don't implement DST by allowing the 24th hour. Even the countries > that adjust the clocks at midnight don't do that. > Well, isn't that about conforming to already existing standards? DST is a civil construct -- and mst (all?) implementations use the convention of having repeated times. -- so that's what software has to deal with. IIUC, at least +some+standards handle leap seconds by adding a 60th (61st) second, rather than having a repeated one. So it's at least an option to do it that way. And it can then fit into the already existing standards for representing datetimes, etc. Does the "fold" flag approach for representing, well, "folds" exist in a widely used standards? It's my impression that it doesn't since we had to argue a lot about what to call it :-) > In some sense leap seconds are more similar to timezone changes (DST or > political) because they are irregular and unpredictable. > in that regard, yes -- you need a constantly updating database to use them. 
but I don't know that that has any impact on how you represent them. They seem a lot more like leap years to me -- some februaries have a 29th day -- some hours on some days have a 61st second. > Furthermore, the notion of "fold" is not tied to a particular 24/60/60 > system of encoding times and thus more applicable to numpy where > times are encoded as binary integers. > but there are no folds in the underlying integer representation -- that is the "continuous" time scale -- the folds (or leap seconds, or leap years, or any of the 24/60/60 business comes in only when you want to go to-from the "datetime" representation. If anyone decides to actually get around to leap seconds support in numpy > datetime, s/he can decide ... This attitude is the reason why we will probably never have bug free software when it comes to civil time reckoning. OK -- fair enough -- good to think about it sooner than later. Similarly, current numpy.datetime64 design ties arithmetic with encoding. > This makes arithmetic easier, but in the long run may preclude designs that > better match the problem domain. I don't follow here -- how can you NOT tied arithmetic to encoding? sure you could decide that you are going to overload the arithmetic, and it's up t the object that encodes the data to do that math -- but that's pretty much what datetime64 is doing -- defining an encoding so that it can do math -- numpy dtypes are very much about binary representation. No reason one couldn't make a different numpy dtype for datetimes that encoded it a different way, and then it would have to implement math, too. Note how the development of PEP 495 has highlighted the fact that allowing binary operations (subtraction, comparison etc.) between times in different timezones was a design mistake. It will be wise to learn from such mistakes when redesigning numpy.datetime64. So was not considering folds -- frankly, and I this this may be your point, I don't think timezones were well thought out at all when datetime was first introduced -- and however well thought out it was, if you don't provide an implementation, you are not going to find the limitations. And despite Tim's articulate defense of the original impp;imentation decisions, I think encoding the datetime in the local "calendar/clock" just invites a mess. And I'm quite convinced that it wouldn't be a the way to go for numpy use-cases. If you ever plan to support civil time in some form, you should think about it now. well, the goal for now is naive time -- and unlike the original datetime -- we are not adding on a "you can implement your own timezone handling this way" hook yet. > In Python 3.6, datetime.now() will return different values in the first and the second repeated hour in the "fall-back fold." > If you allow datetime.datetime to numpy.datetime64 conversion, you should decide what you do with that difference. Indeed. Though will that only occur with timezones that have DST? I know I'd be fine with NOT being able to create a numpy datetime64 from a non-naive datetime object. Which would force the user to think about and convert to the timezone they want before passing off to numpy. Unless you can suggest a sensible default way to handle this. At first blush, I think naive time does not have folds, so there is no way to handle them "properly" Also -- I think we are at phase one of a (at least) two step process: 1) clean up datetime64 just enough that it is useful, and less error-prone -- i.e. have it not pretend to support anything other than naive datetimes. 
2) Do it right -- perhaps adding some time zone support. This is going to wait until the numpy dtype machinery is cleaned up some. Phase 2 is where we really need the thinking ahead. And I'm still confused about what thinking ahead needs to be done for potential leap second support. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Oct 19 15:46:27 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 19 Oct 2015 12:46:27 -0700 Subject: [Numpy-discussion] [Feature Suggestion]More comparison functions for floating point numbers In-Reply-To: References: Message-ID: On Mon, Oct 19, 2015 at 3:06 AM, cy18 wrote: > I think these would be useful and easy to implement. > > greater_close(a, b) = greater_equal(a, b) | isclose(a, b) > less_close(a, b) = less_equal(a, b) | isclose(a, b) > greater_no_close = greater(a, b) & ~isclose(a, b) > less_no_close = less(a, b) & ~isclose(a, b) > What's the use-case here? we need is_close because we want to test equality, but precision errors are such that two floats may be as close to equal as they can be given the computations done. And the assumption is that you don't care about the precision to the point you specify. But for a greater_than (or equiv) comparison, if you the precision is not important beyond a certain level, then it's generally not important whether you get greater than or less than when it's that close.... And this would great a wierd property that some values would be greater than, less than, and equal to a target value -- pretty weird! note that you can get the same effect by subtracting a bit from your comparison value for a greater than check... But maybe there is a common use-case that I'm not thinking of.. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Mon Oct 19 16:00:49 2015 From: ndarray at mac.com (Alexander Belopolsky) Date: Mon, 19 Oct 2015 16:00:49 -0400 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: On Mon, Oct 19, 2015 at 3:34 PM, Chris Barker wrote: > DST is a civil construct -- and mst (all?) implementations use the > convention of having repeated times. What is "mst"? -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Mon Oct 19 16:11:54 2015 From: ndarray at mac.com (Alexander Belopolsky) Date: Mon, 19 Oct 2015 16:11:54 -0400 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: On Mon, Oct 19, 2015 at 3:34 PM, Chris Barker wrote: > > > In Python 3.6, datetime.now() will return different values in the first > and the second repeated hour in the "fall-back fold." > If you allow > datetime.datetime to numpy.datetime64 conversion, you should decide what > you do with that difference. > > Indeed. Though will that only occur with timezones that have DST? I know > I'd be fine with NOT being able to create a numpy datetime64 from a > non-naive datetime object. 
> datetime.now() returns *naive* datetime objects unless you supply the timezone. In Python 3.6 *naive* datetime objects will have the fold attribute and datetime.now() will occasionally return fold=1 values unless your system timezone has a fixed UTC offset. -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Mon Oct 19 16:12:19 2015 From: shoyer at gmail.com (Stephan Hoyer) Date: Mon, 19 Oct 2015 13:12:19 -0700 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: On Mon, Oct 19, 2015 at 12:34 PM, Chris Barker wrote: > Also -- I think we are at phase one of a (at least) two step process: > > 1) clean up datetime64 just enough that it is useful, and less error-prone > -- i.e. have it not pretend to support anything other than naive datetimes. > > 2) Do it right -- perhaps adding some time zone support. This is going to > wait until the numpy dtype machinery is cleaned up some. > I agree with Chris. My intent with this work for now (for NumPy 1.11) is simply to complete phase 1. Once NumPy stops pretending to be time zone aware (and with a few other small cleanups), datetime64 will be far more useable. For major fixes, we'll have to wait until dtype support is better. Alexander -- by "mst" I think Chris meant "most". Best, Stephan -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Mon Oct 19 16:14:38 2015 From: ndarray at mac.com (Alexander Belopolsky) Date: Mon, 19 Oct 2015 16:14:38 -0400 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: On Mon, Oct 19, 2015 at 4:12 PM, Stephan Hoyer wrote: > Alexander -- by "mst" I think Chris meant "most". Good because in context it could be "Moscow Standard Time" or "Mean Solar Time". :-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Mon Oct 19 16:25:07 2015 From: ndarray at mac.com (Alexander Belopolsky) Date: Mon, 19 Oct 2015 16:25:07 -0400 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: On Mon, Oct 19, 2015 at 4:12 PM, Stephan Hoyer wrote: > On Mon, Oct 19, 2015 at 12:34 PM, Chris Barker > wrote: > >> Also -- I think we are at phase one of a (at least) two step process: >> >> 1) clean up datetime64 just enough that it is useful, and less >> error-prone -- i.e. have it not pretend to support anything other than >> naive datetimes. >> > > I agree with Chris. My intent with this work for now (for NumPy 1.11) is > simply to complete phase 1. > This is fine. Just be aware that *naive* datetimes will also have the PEP 495 "fold" attribute in Python 3.6. You are free to ignore it, but you will loose the ability to round-trip between naive stdlib datetimes and numpy.datetime64. -------------- next part -------------- An HTML attachment was scrubbed... URL: From thecy18 at gmail.com Mon Oct 19 16:51:28 2015 From: thecy18 at gmail.com (cy18) Date: Mon, 19 Oct 2015 16:51:28 -0400 Subject: [Numpy-discussion] [Feature Suggestion]More comparison functions for floating point numbers In-Reply-To: References: Message-ID: It would be useful when we need to subtracting a bit before comparing by greater or less. By subtracting a bit, we only have an absolute error tolerance and with the new functions, we can have both absolute and relative error tolerance. This is how isclose(a, b) better than abs(a-b)<=atol. 
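For concreteness, the four proposed functions reduce to one-liners on top of the existing ufuncs. A sketch of two of them (the other two are symmetric), with the tolerance defaults simply mirroring np.isclose:

    import numpy as np

    def greater_close(a, b, rtol=1e-05, atol=1e-08):
        # element-wise: a >= b, or a within tolerance of b
        return np.greater_equal(a, b) | np.isclose(a, b, rtol=rtol, atol=atol)

    def less_no_close(a, b, rtol=1e-05, atol=1e-08):
        # element-wise: a < b and a not within tolerance of b
        return np.less(a, b) & ~np.isclose(a, b, rtol=rtol, atol=atol)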
2015-10-19 15:46 GMT-04:00 Chris Barker : > > > On Mon, Oct 19, 2015 at 3:06 AM, cy18 wrote: > >> I think these would be useful and easy to implement. >> >> greater_close(a, b) = greater_equal(a, b) | isclose(a, b) >> less_close(a, b) = less_equal(a, b) | isclose(a, b) >> greater_no_close = greater(a, b) & ~isclose(a, b) >> less_no_close = less(a, b) & ~isclose(a, b) >> > > What's the use-case here? we need is_close because we want to test > equality, but precision errors are such that two floats may be as close to > equal as they can be given the computations done. And the assumption is > that you don't care about the precision to the point you specify. > > But for a greater_than (or equiv) comparison, if you the precision is not > important beyond a certain level, then it's generally not important whether > you get greater than or less than when it's that close.... > > And this would great a wierd property that some values would be greater > than, less than, and equal to a target value -- pretty weird! > > note that you can get the same effect by subtracting a bit from your > comparison value for a greater than check... > > But maybe there is a common use-case that I'm not thinking of.. > > -CHB > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Oct 19 17:04:45 2015 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 19 Oct 2015 22:04:45 +0100 Subject: [Numpy-discussion] [Feature Suggestion]More comparison functions for floating point numbers In-Reply-To: References: Message-ID: On Mon, Oct 19, 2015 at 9:51 PM, cy18 wrote: > > It would be useful when we need to subtracting a bit before comparing by greater or less. By subtracting a bit, we only have an absolute error tolerance and with the new functions, we can have both absolute and relative error tolerance. This is how isclose(a, b) better than abs(a-b)<=atol. You just adjust the value by whichever tolerance is greatest in magnitude. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Oct 19 20:54:49 2015 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Mon, 19 Oct 2015 17:54:49 -0700 Subject: [Numpy-discussion] Making datetime64 timezone naive In-Reply-To: References: Message-ID: <-3103455990780817434@unknownmsgid> > This is fine. Just be aware that *naive* datetimes will also have the PEP 495 "fold" attribute in Python 3.6. You are free to ignore it, but you will loose the ability to round-trip between naive stdlib datetimes and numpy.datetime64. Sigh. I can see why it's there ( primarily to support now(), I suppose). But a naive datetime doesn't have a timezone, so how could you know what time one actually corresponds to if fold is True? And what could you do with it if you did know? I've always figured that if you are using naive time for times in a timezone that has DST, than you'd better know wether you were in DST or not. (Which fold tells you, I guess) but the fold isn't guaranteed to be an hour is it? So without more info, what can you do? 
And if the fold bit is False, then you still have no idea if you are in DST or not. And then what if you attach a timezone to it? Then the fold bit could be wrong... I take it back, I can't see why the fold bit could be anything but confusing for a naive datetime. :-) Anyway, all I can see to do here is for the datetime64 docs to say that fold is ignored if it's there. But what should datetime64 do when provided with a datetime with a timezone? - Raise an exception? - ignore the timezone? - Convert to UTC? If the time zone is ignored, then you could get DST and non DST times in the same array - that could be ugly. Is there any way to query a timezone object to ask if it's a constant-offset? And yes, I did mean "most". There is no way I'm ever going to introduce a three letter "timezone" abbreviation in one of these threads! -CHB From njs at pobox.com Mon Oct 19 21:15:51 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 19 Oct 2015 18:15:51 -0700 Subject: [Numpy-discussion] when did column_stack become C-contiguous? In-Reply-To: References: Message-ID: On Mon, Oct 19, 2015 at 5:55 AM, wrote: > > > On Mon, Oct 19, 2015 at 2:14 AM, Nathaniel Smith wrote: >> >> On Sun, Oct 18, 2015 at 9:35 PM, wrote: >> >>>> np.column_stack((np.ones(10), np.ones(10))).flags >> > C_CONTIGUOUS : True >> > F_CONTIGUOUS : False >> > >> >>>> np.__version__ >> > '1.9.2rc1' >> > >> > >> > on my notebook which has numpy 1.6.1 it is f_contiguous >> > >> > >> > I was just trying to optimize a loop over variable adjustment in >> > regression, >> > and found out that we lost fortran contiguity. >> > >> > I always thought column_stack is for fortran usage (linalg) >> > >> > What's the alternative? >> > column_stack was one of my favorite commands, and I always assumed we >> > have >> > in statsmodels the right memory layout to call the linalg libraries. >> > >> > ("assumed" means we don't have timing nor unit tests for it.) >> >> In general practice no numpy functions make any guarantee about memory >> layout, unless that's explicitly a documented part of their contract >> (e.g. 'ascontiguous', or some functions that take an order= argument >> -- I say "some" b/c there are functions like 'reshape' that take an >> argument called order= that doesn't actually refer to memory layout). >> This isn't so much an official policy as just a fact of life -- if >> no-one has any idea that the someone is depending on some memory >> layout detail then there's no way to realize that we've broken >> something. (But it is a good policy IMO.) > > > I understand that in general. > > However, I always thought column_stack is a array creation function which > have guaranteed memory layout. And since it's stacking by columns I thought > that order is always Fortran. > And the fact that it doesn't have an order keyword yet, I thought is just a > missing extension. I guess I don't know what to say except that I'm sorry to hear that and sorry that no-one noticed until several releases later. >> If this kind of problem gets caught during a pre-release cycle then we >> generally do try to fix it, because we try not to break code, but if >> it's been broken for 2 full releases then there's no much we can do -- >> we can't go back in time to fix it so it sounds like you're stuck >> working around the problem no matter what (unless you want to refuse >> to support 1.9.0 through 1.10.1, which I assume you don't... worst >> case, you just have to do a global search replace of np.column_stack >> with statsmodels.utils.column_stack_f, right?). 
>> >> And the regression issue seems like the only real argument for >> changing it back -- we'd never guarantee f-contiguity here if starting >> from a blank slate, I think? > > > When the cat is out of the bag, the down stream developer writes > compatibility code or helper functions. > > I will do that at at least the parts I know are intentionally designed for F > memory order. > > --- > > statsmodels doesn't really check or consistently optimize the memory order, > except in some cython functions. > But, I thought we should be doing quite well with getting Fortran ordered > arrays. I only paid attention where we have more extensive loops internally. > > Nathniel, Does patsy guarantee memory layout (F-contiguous) when creating > design matrices? I never thought about it :-). So: no, it looks like right now patsy usually returns C-order matrices (or really, whatever np.empty or np.repeat returns), and there aren't any particular guarantees that this will continue to be the case in the future. Is returning matrices in F-contiguous layout really important? Should there be a return_type="fortran_matrix" option or something like that? -n -- Nathaniel J. Smith -- http://vorpus.org From josef.pktd at gmail.com Mon Oct 19 21:51:10 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 19 Oct 2015 21:51:10 -0400 Subject: [Numpy-discussion] when did column_stack become C-contiguous? In-Reply-To: References: Message-ID: On Mon, Oct 19, 2015 at 9:15 PM, Nathaniel Smith wrote: > On Mon, Oct 19, 2015 at 5:55 AM, wrote: > > > > > > On Mon, Oct 19, 2015 at 2:14 AM, Nathaniel Smith wrote: > >> > >> On Sun, Oct 18, 2015 at 9:35 PM, wrote: > >> >>>> np.column_stack((np.ones(10), np.ones(10))).flags > >> > C_CONTIGUOUS : True > >> > F_CONTIGUOUS : False > >> > > >> >>>> np.__version__ > >> > '1.9.2rc1' > >> > > >> > > >> > on my notebook which has numpy 1.6.1 it is f_contiguous > >> > > >> > > >> > I was just trying to optimize a loop over variable adjustment in > >> > regression, > >> > and found out that we lost fortran contiguity. > >> > > >> > I always thought column_stack is for fortran usage (linalg) > >> > > >> > What's the alternative? > >> > column_stack was one of my favorite commands, and I always assumed we > >> > have > >> > in statsmodels the right memory layout to call the linalg libraries. > >> > > >> > ("assumed" means we don't have timing nor unit tests for it.) > >> > >> In general practice no numpy functions make any guarantee about memory > >> layout, unless that's explicitly a documented part of their contract > >> (e.g. 'ascontiguous', or some functions that take an order= argument > >> -- I say "some" b/c there are functions like 'reshape' that take an > >> argument called order= that doesn't actually refer to memory layout). > >> This isn't so much an official policy as just a fact of life -- if > >> no-one has any idea that the someone is depending on some memory > >> layout detail then there's no way to realize that we've broken > >> something. (But it is a good policy IMO.) > > > > > > I understand that in general. > > > > However, I always thought column_stack is a array creation function which > > have guaranteed memory layout. And since it's stacking by columns I > thought > > that order is always Fortran. > > And the fact that it doesn't have an order keyword yet, I thought is > just a > > missing extension. > > I guess I don't know what to say except that I'm sorry to hear that > and sorry that no-one noticed until several releases later. 
> Were there more contiguity changes in 0.10? I just saw a large number of test errors and failures in statespace models which are heavily based on cython code where it's not just a question of performance. I don't know yet what's going on, but I just saw that we have some explicit tests for fortran contiguity which just started to fail. > > >> If this kind of problem gets caught during a pre-release cycle then we > >> generally do try to fix it, because we try not to break code, but if > >> it's been broken for 2 full releases then there's no much we can do -- > >> we can't go back in time to fix it so it sounds like you're stuck > >> working around the problem no matter what (unless you want to refuse > >> to support 1.9.0 through 1.10.1, which I assume you don't... worst > >> case, you just have to do a global search replace of np.column_stack > >> with statsmodels.utils.column_stack_f, right?). > >> > >> And the regression issue seems like the only real argument for > >> changing it back -- we'd never guarantee f-contiguity here if starting > >> from a blank slate, I think? > > > > > > When the cat is out of the bag, the down stream developer writes > > compatibility code or helper functions. > > > > I will do that at at least the parts I know are intentionally designed > for F > > memory order. > > > > --- > > > > statsmodels doesn't really check or consistently optimize the memory > order, > > except in some cython functions. > > But, I thought we should be doing quite well with getting Fortran ordered > > arrays. I only paid attention where we have more extensive loops > internally. > > > > Nathniel, Does patsy guarantee memory layout (F-contiguous) when creating > > design matrices? > > I never thought about it :-). So: no, it looks like right now patsy > usually returns C-order matrices (or really, whatever np.empty or > np.repeat returns), and there aren't any particular guarantees that > this will continue to be the case in the future. > > Is returning matrices in F-contiguous layout really important? Should > there be a return_type="fortran_matrix" option or something like that? > I don't know, yet. My intuition was that it would be better because we feed the arrays directly to pinv/SVD or QR which, I think, require by default Fortran contiguous. However, my intuition might not be correct, and it might not make much difference in a single OLS estimation. There are a few critical loops in variable selection that I'm planning to investigate to see how much it matters. Memory optimization was never high in our priority compared to expanding the functionality overall, but reading the Julia mailing list is starting to worry me a bit. :) (I'm even starting to see the reason for multiple dispatch.) Josef > > -n > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Oct 19 21:55:27 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 19 Oct 2015 18:55:27 -0700 Subject: [Numpy-discussion] numpy-1.11.0.dev0 windows wheels compiled with mingwpy available In-Reply-To: References: Message-ID: On Mon, Oct 19, 2015 at 2:26 AM, Olivier Grisel wrote: >> Is it possible to test this with py35 as well? > > Unfortunately not yet. 
> >> For MSVC, py35 requires a new compiler toolchain (VS2015) -- is that >> something mingwpy/mingw-w64 can handle? > > I am pretty sure that mingwpy does not support Python 3.5 yet. Correct. > I don't know the status of the interop of mingw-w64 w.r.t. VS2015 but as far > as I know it's not supported yet either. Once the issue is fixed at the > upstream level, I think mingwpy could be rebuilt to benefit from the fix. Upstream mingw-w64 doesn't support interop with any version of visual studio that was released this millennium -- all the interop stuff is new in mingwpy. VS2015 had a major reorganization of how it handles runtime libraries, so it's not quite so trivial as just adding support the same way as was done for VS2008 and VS2010. Or rather, IIUC: we *could* just add support the same way as before, but there are undocumented rules about which parts of the new runtime are considered stable and which are not, so if we did this willy-nilly then we might end up using some of the "unstable" parts. And then in 2017 the Windows team does some internal refactoring and pushes it out through windows update and suddenly NumPy / R / Julia / git / ... all start segfaulting at startup on Windows, which would be a disaster from everyone's point of view. We've pointed this out to the Python team at Microsoft and they've promised to try and put Carl and the relevant mingw-w64 folks in touch with the relevant internal folks at MS to hopefully tell us how to do this correctly... fingers crossed :-). Aside from that, the main challenge for mingwpy in general is exactly the issue of upstream support: if we don't get the interop stuff pushed upstream from mingwpy to mingw-w64, then it will rot and break. And upstream would love to have this interoperability as an officially supported feature... but upstream doesn't consider what we have right now to be maintainable, so they won't take it as is. (And honestly, this is a reasonable opinion.) So what I've been trying to do is to scrounge up some funding to support Carl and upstream doing this right (the rough estimate is ~3 person-months of work). The original goal was to get MS to pay for this, on the theory that they should be cleaning up their own messes, but after 6 months of back-and-forth we've pretty much given up on that at this point, and I'm in the process of emailing everyone I can think of who might be convinced to donate some money to the cause. Maybe we should have a kickstarter or something, I dunno :-). -n -- Nathaniel J. Smith -- http://vorpus.org From josef.pktd at gmail.com Mon Oct 19 21:56:59 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 19 Oct 2015 21:56:59 -0400 Subject: [Numpy-discussion] numpy 1.10.1 reduce operation on recarrays In-Reply-To: <5621A4DF.6080001@gmail.com> References: <56219CBF.1070605@gmail.com> <5621A4DF.6080001@gmail.com> Message-ID: On Fri, Oct 16, 2015 at 9:31 PM, Allan Haldane wrote: > On 10/16/2015 09:17 PM, josef.pktd at gmail.com wrote: > >> >> >> On Fri, Oct 16, 2015 at 8:56 PM, Allan Haldane > > wrote: >> >> On 10/16/2015 05:31 PM, josef.pktd at gmail.com >> wrote: >> > >> > >> > On Fri, Oct 16, 2015 at 2:21 PM, Charles R Harris >> > >> > >> wrote: >> > >> > >> > >> > On Fri, Oct 16, 2015 at 12:20 PM, Charles R Harris >> > >> > >> wrote: >> > >> > >> > >> > On Fri, Oct 16, 2015 at 11:58 AM, > >> > > >> >> wrote: >> > >> > was there a change with reduce operations with >> recarrays in >> > 1.10 or 1.10.1? 
>> > >> > Travis shows a new test failure in the statsmodels >> testsuite >> > with 1.10.1: >> > >> > ERROR: test suite for > > 'statsmodels.base.tests.test_data.TestRecarrays'> >> > >> > File >> > >> >> "/home/travis/miniconda/envs/statsmodels-test/lib/python2.7/site-packages/statsmodels-0.8.0-py2.7-linux-x86_64.egg/statsmodels/base/data.py", >> > line 131, in _handle_constant >> > const_idx = np.where(self.exog.ptp(axis=0) == >> > 0)[0].squeeze() >> > TypeError: cannot perform reduce with flexible type >> > >> > >> > Sorry for asking so late. >> > (statsmodels is short on maintainers, and I'm >> distracted) >> > >> > >> > statsmodels still has code to support recarrays and >> > structured dtypes from the time before pandas became >> > popular, but I don't think anyone is using them >> together >> > with statsmodels anymore. >> > >> > >> > There were several commits dealing both recarrays and >> ufuncs, so >> > this might well be a regression. >> > >> > >> > A bisection would be helpful. Also, open an issue. >> > >> > >> > >> > The reason for the test failure might be somewhere else hiding >> behind >> > several layers of statsmodels, but only started to show up with >> numpy 1.10.1 >> > >> > I already have the reduce exception with my currently installed >> numpy >> > '1.9.2rc1' >> > >> >>>> x = np.random.random(9*3).view([('const', 'f8'),('x_1', 'f8'), >> > ('x_2', 'f8')]).view(np.recarray) >> > >> >>>> np.ptp(x, axis=0) >> > Traceback (most recent call last): >> > File "", line 1, in >> > File >> > >> >> "C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\numpy\core\fromnumeric.py", >> > line 2047, in ptp >> > return ptp(axis, out) >> > TypeError: cannot perform reduce with flexible type >> > >> > >> > Sounds like fun, and I don't even know how to automatically bisect. >> > >> > Josef >> >> That example isn't the problem (ptp should definitely fail on >> structured >> arrays), but I've tracked down what is - it has to do with views of >> record arrays. >> >> The fix looks simple, I'll get it in for the next release. >> >> >> Thanks, >> >> I realized that at that point in the statsmodels code we should have >> only regular ndarrays, so the array conversion fails somewhere. >> >> AFAICS, the main helper function to convert is >> >> def struct_to_ndarray(arr): >> return arr.view((float, len(arr.dtype.names))) >> >> which doesn't look like it will handle other dtypes than float64. Nobody >> ever complained, so maybe our test suite is the only user of this. >> >> What is now the recommended way of converting structured >> dtypes/recarrays to ndarrays? >> >> Josef >> > > Yes, that's the code I narrowed it down to as well. I think the code in > statsmodels is fine, the problem is actually a bug I must admit I > introduced in changes to the way views of recarrays work. > > If you are curious, the bug is in this line: > > https://github.com/numpy/numpy/blob/master/numpy/core/records.py#L467 > > This line was intended to fix the problem that accessing a nested record > array field would lose the 'np.record' dtype. I only considered void > structured arrays, and had forgotten about sub-arrays which statsmodels > uses. > > I think the fix is to replace `issubclass(val.type, nt.void)` with > `val.names` or something similar. I'll take a closer look soon. 
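A sketch of a more dtype-tolerant variant of the struct_to_ndarray helper quoted above; struct_to_float_ndarray is a made-up name here, and it copies field by field instead of relying on a view that only works when every field is already float64:

import numpy as np

def struct_to_float_ndarray(arr):
    # pull each field out and cast it, so mixed int/float structured dtypes
    # also come out as a plain 2-D float array
    return np.column_stack([arr[name].astype(float) for name in arr.dtype.names])

rec = np.zeros(3, dtype=[('const', 'f8'), ('x_1', 'i4')]).view(np.recarray)
print(struct_to_float_ndarray(rec).shape)   # (3, 2)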
> > Another example fresh from Travis that might have the same source and I didn't even know statsmodels uses recarrays in the models AssertionError: Arrays are not almost equal to 7 decimals (shapes (6,), (6, 3) mismatch) x: recarray([??, ?;?:B??](?D??????????, ??L???????? ?C?3Y??, O?????N;?j???8???H??, ?N?A???????T??B;??p?, 9m?;_???J??... y: array([[ 1. , 0. , 0. ], [-0.2794347, -0.100468 , -1.9709737], [-0.0469873, -0.1728197, 0.0436493],... Josef > > Allan > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jjhelmus at gmail.com Mon Oct 19 22:23:18 2015 From: jjhelmus at gmail.com (Jonathan Helmus) Date: Mon, 19 Oct 2015 21:23:18 -0500 Subject: [Numpy-discussion] Behavior of numpy.copy with sub-classes Message-ID: <5625A596.6020603@gmail.com> In GitHub issue #3474, a number of us have started a conversation on how NumPy's copy function should behave when passed an instance which is a sub-class of the array class. Specifically, the issue began by noting that when a MaskedArray is passed to np.copy, the sub-class is not passed through but rather a ndarray is returned. I suggested adding a "subok" parameter which controls how sub-classes are handled and others suggested having the function call a copy method on duck arrays. The "subok" parameter is implemented in PR #6509 as an example. Both of these options would change the API of numpy.copy and possibly break backwards compatibility. Do others have an opinion of how np.copy should handle sub-classes? For a concrete example of this behavior and possible changes, what type should copy_x be in the following snippet: import numpy as np x = np.ma.array([1,2,3]) copy_x = np.copy(x) Cheers, - Jonathan Helmus From nathan12343 at gmail.com Mon Oct 19 22:28:26 2015 From: nathan12343 at gmail.com (Nathan Goldbaum) Date: Mon, 19 Oct 2015 19:28:26 -0700 Subject: [Numpy-discussion] Behavior of numpy.copy with sub-classes In-Reply-To: <5625A596.6020603@gmail.com> References: <5625A596.6020603@gmail.com> Message-ID: On Mon, Oct 19, 2015 at 7:23 PM, Jonathan Helmus wrote: > In GitHub issue #3474, a number of us have started a conversation on how > NumPy's copy function should behave when passed an instance which is a > sub-class of the array class. Specifically, the issue began by noting that > when a MaskedArray is passed to np.copy, the sub-class is not passed > through but rather a ndarray is returned. > > I suggested adding a "subok" parameter which controls how sub-classes are > handled and others suggested having the function call a copy method on duck > arrays. The "subok" parameter is implemented in PR #6509 as an example. > Both of these options would change the API of numpy.copy and possibly break > backwards compatibility. Do others have an opinion of how np.copy should > handle sub-classes? > > For a concrete example of this behavior and possible changes, what type > should copy_x be in the following snippet: > > import numpy as np > x = np.ma.array([1,2,3]) > copy_x = np.copy(x) > FWIW, it looks like np.copy() is never used in our code to work with the ndarray subclass we maintain in yt. Instead we use the copy() method much more often, and that returns the appropriate type. I guess it makes sense to have the type of the return value of np.copy() agree with the type of the copy() member function. 
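A short demonstration of the difference being discussed; the subok= keyword for np.copy itself only exists in the linked PR, so the last line shows a subclass-preserving spelling that already works:

import numpy as np

x = np.ma.array([1, 2, 3], mask=[False, True, False])

print(type(np.copy(x)))   # plain ndarray: the MaskedArray subclass (and its mask) is dropped
print(type(x.copy()))     # MaskedArray: the method preserves the subclass

y = np.array(x, subok=True, copy=True)
print(type(y))            # MaskedArray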
That said, breaking backwards compatibility here before numpy 2.0 might very well break real code. It might be worth it search e.g. github for all instances of np.copy() to see if they're dealing with subclasses. > > > Cheers, > > - Jonathan Helmus > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Oct 19 22:40:15 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 19 Oct 2015 20:40:15 -0600 Subject: [Numpy-discussion] Behavior of numpy.copy with sub-classes In-Reply-To: References: <5625A596.6020603@gmail.com> Message-ID: On Mon, Oct 19, 2015 at 8:28 PM, Nathan Goldbaum wrote: > > > On Mon, Oct 19, 2015 at 7:23 PM, Jonathan Helmus > wrote: > >> In GitHub issue #3474, a number of us have started a conversation on how >> NumPy's copy function should behave when passed an instance which is a >> sub-class of the array class. Specifically, the issue began by noting that >> when a MaskedArray is passed to np.copy, the sub-class is not passed >> through but rather a ndarray is returned. >> >> I suggested adding a "subok" parameter which controls how sub-classes are >> handled and others suggested having the function call a copy method on duck >> arrays. The "subok" parameter is implemented in PR #6509 as an example. >> Both of these options would change the API of numpy.copy and possibly break >> backwards compatibility. Do others have an opinion of how np.copy should >> handle sub-classes? >> >> For a concrete example of this behavior and possible changes, what type >> should copy_x be in the following snippet: >> >> import numpy as np >> x = np.ma.array([1,2,3]) >> copy_x = np.copy(x) >> > > FWIW, it looks like np.copy() is never used in our code to work with the > ndarray subclass we maintain in yt. Instead we use the copy() method much > more often, and that returns the appropriate type. I guess it makes sense > to have the type of the return value of np.copy() agree with the type of > the copy() member function. > > That said, breaking backwards compatibility here before numpy 2.0 might > very well break real code. It might be worth it search e.g. github for all > instances of np.copy() to see if they're dealing with subclasses. > The problem with github searches is that there are a ton of numpy forks. ISTR once finding a method to avoid them, but can't remember what is was. If anyone knows how to do that, I'd appreciate learning. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nevion at gmail.com Tue Oct 20 00:05:51 2015 From: nevion at gmail.com (Jason Newton) Date: Tue, 20 Oct 2015 00:05:51 -0400 Subject: [Numpy-discussion] correct sizeof for ndarray Message-ID: Hi folks, I noticed an unexpected behavior of itemsize for structures with offsets that are larger than that of a packed structure in memory. This matters when parsing in memory structures from C and some others (recently and HDF5/h5py detail got me for a bit). So what is the correct way to get "sizeof" a structure? AFAIK this is the size of the last item + it's offset. If this doesn't exist... shouldn't it? Thanks, Jason -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ben.v.root at gmail.com Tue Oct 20 09:28:32 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Tue, 20 Oct 2015 09:28:32 -0400 Subject: [Numpy-discussion] Behavior of numpy.copy with sub-classes In-Reply-To: References: <5625A596.6020603@gmail.com> Message-ID: In many other parts of numpy, calling the numpy function that had an equivalent array method would result in the method being called. I would certainly be surprised if the copy() method behaved differently from the np.copy() function. Now it is time for me to do some grepping of my code-bases... On Mon, Oct 19, 2015 at 10:40 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Mon, Oct 19, 2015 at 8:28 PM, Nathan Goldbaum > wrote: > >> >> >> On Mon, Oct 19, 2015 at 7:23 PM, Jonathan Helmus >> wrote: >> >>> In GitHub issue #3474, a number of us have started a conversation on how >>> NumPy's copy function should behave when passed an instance which is a >>> sub-class of the array class. Specifically, the issue began by noting that >>> when a MaskedArray is passed to np.copy, the sub-class is not passed >>> through but rather a ndarray is returned. >>> >>> I suggested adding a "subok" parameter which controls how sub-classes >>> are handled and others suggested having the function call a copy method on >>> duck arrays. The "subok" parameter is implemented in PR #6509 as an >>> example. Both of these options would change the API of numpy.copy and >>> possibly break backwards compatibility. Do others have an opinion of how >>> np.copy should handle sub-classes? >>> >>> For a concrete example of this behavior and possible changes, what type >>> should copy_x be in the following snippet: >>> >>> import numpy as np >>> x = np.ma.array([1,2,3]) >>> copy_x = np.copy(x) >>> >> >> FWIW, it looks like np.copy() is never used in our code to work with the >> ndarray subclass we maintain in yt. Instead we use the copy() method much >> more often, and that returns the appropriate type. I guess it makes sense >> to have the type of the return value of np.copy() agree with the type of >> the copy() member function. >> >> That said, breaking backwards compatibility here before numpy 2.0 might >> very well break real code. It might be worth it search e.g. github for all >> instances of np.copy() to see if they're dealing with subclasses. >> > > The problem with github searches is that there are a ton of numpy forks. > ISTR once finding a method to avoid them, but can't remember what is was. > If anyone knows how to do that, I'd appreciate learning. > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Tue Oct 20 09:56:02 2015 From: allanhaldane at gmail.com (Allan Haldane) Date: Tue, 20 Oct 2015 09:56:02 -0400 Subject: [Numpy-discussion] correct sizeof for ndarray In-Reply-To: References: Message-ID: <562647F2.3040204@gmail.com> On 10/20/2015 12:05 AM, Jason Newton wrote: > Hi folks, > > I noticed an unexpected behavior of itemsize for structures with offsets > that are larger than that of a packed structure in memory. This matters > when parsing in memory structures from C and some others (recently and > HDF5/h5py detail got me for a bit). > > So what is the correct way to get "sizeof" a structure? AFAIK this is > the size of the last item + it's offset. If this doesn't exist... 
> shouldn't it? > > Thanks, > Jason Hi Jason, The 'itemsize' attribute of a dtype object is probably what you're looking for. It gives the itemsize in bytes. "last item + it's offset" is not a reliable way to get the itemsize because "aligned" (and other) structures can have trailing padding, just like C structs: >>> dtype('i4,u1', align=True).itemsize 8 The documentation on all this is a little scattered right now, but there are hints in the array.dtypes reference page and the dtype docstring. Cheers, Allan From josef.pktd at gmail.com Tue Oct 20 13:01:46 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 20 Oct 2015 13:01:46 -0400 Subject: [Numpy-discussion] when did column_stack become C-contiguous? In-Reply-To: References: Message-ID: On Mon, Oct 19, 2015 at 9:51 PM, wrote: > > > On Mon, Oct 19, 2015 at 9:15 PM, Nathaniel Smith wrote: > >> On Mon, Oct 19, 2015 at 5:55 AM, wrote: >> > >> > >> > On Mon, Oct 19, 2015 at 2:14 AM, Nathaniel Smith wrote: >> >> >> >> On Sun, Oct 18, 2015 at 9:35 PM, wrote: >> >> >>>> np.column_stack((np.ones(10), np.ones(10))).flags >> >> > C_CONTIGUOUS : True >> >> > F_CONTIGUOUS : False >> >> > >> >> >>>> np.__version__ >> >> > '1.9.2rc1' >> >> > >> >> > >> >> > on my notebook which has numpy 1.6.1 it is f_contiguous >> >> > >> >> > >> >> > I was just trying to optimize a loop over variable adjustment in >> >> > regression, >> >> > and found out that we lost fortran contiguity. >> >> > >> >> > I always thought column_stack is for fortran usage (linalg) >> >> > >> >> > What's the alternative? >> >> > column_stack was one of my favorite commands, and I always assumed we >> >> > have >> >> > in statsmodels the right memory layout to call the linalg libraries. >> >> > >> >> > ("assumed" means we don't have timing nor unit tests for it.) >> >> >> >> In general practice no numpy functions make any guarantee about memory >> >> layout, unless that's explicitly a documented part of their contract >> >> (e.g. 'ascontiguous', or some functions that take an order= argument >> >> -- I say "some" b/c there are functions like 'reshape' that take an >> >> argument called order= that doesn't actually refer to memory layout). >> >> This isn't so much an official policy as just a fact of life -- if >> >> no-one has any idea that the someone is depending on some memory >> >> layout detail then there's no way to realize that we've broken >> >> something. (But it is a good policy IMO.) >> > >> > >> > I understand that in general. >> > >> > However, I always thought column_stack is a array creation function >> which >> > have guaranteed memory layout. And since it's stacking by columns I >> thought >> > that order is always Fortran. >> > And the fact that it doesn't have an order keyword yet, I thought is >> just a >> > missing extension. >> >> I guess I don't know what to say except that I'm sorry to hear that >> and sorry that no-one noticed until several releases later. >> > > > Were there more contiguity changes in 0.10? > I just saw a large number of test errors and failures in statespace models > which are heavily based on cython code where it's not just a question of > performance. > > I don't know yet what's going on, but I just saw that we have some > explicit tests for fortran contiguity which just started to fail. 
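A sketch of the kind of explicit Fortran-contiguity test mentioned just above, for code paths that hand arrays to order-sensitive cython or LAPACK routines; assert_fortran is a made-up helper name:

import numpy as np
from numpy.testing import assert_

def assert_fortran(a):
    # fail loudly if an array that must be Fortran-ordered is not
    assert_(a.flags['F_CONTIGUOUS'], 'expected a Fortran-contiguous array')

assert_fortran(np.asfortranarray(np.ones((5, 3))))
# assert_fortran(np.column_stack((np.ones(5), np.ones(5))))  # passes or fails depending on the numpy version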
> > > > >> >> >> If this kind of problem gets caught during a pre-release cycle then we >> >> generally do try to fix it, because we try not to break code, but if >> >> it's been broken for 2 full releases then there's no much we can do -- >> >> we can't go back in time to fix it so it sounds like you're stuck >> >> working around the problem no matter what (unless you want to refuse >> >> to support 1.9.0 through 1.10.1, which I assume you don't... worst >> >> case, you just have to do a global search replace of np.column_stack >> >> with statsmodels.utils.column_stack_f, right?). >> >> >> >> And the regression issue seems like the only real argument for >> >> changing it back -- we'd never guarantee f-contiguity here if starting >> >> from a blank slate, I think? >> > >> > >> > When the cat is out of the bag, the down stream developer writes >> > compatibility code or helper functions. >> > >> > I will do that at at least the parts I know are intentionally designed >> for F >> > memory order. >> > >> > --- >> > >> > statsmodels doesn't really check or consistently optimize the memory >> order, >> > except in some cython functions. >> > But, I thought we should be doing quite well with getting Fortran >> ordered >> > arrays. I only paid attention where we have more extensive loops >> internally. >> > >> > Nathniel, Does patsy guarantee memory layout (F-contiguous) when >> creating >> > design matrices? >> >> I never thought about it :-). So: no, it looks like right now patsy >> usually returns C-order matrices (or really, whatever np.empty or >> np.repeat returns), and there aren't any particular guarantees that >> this will continue to be the case in the future. >> >> Is returning matrices in F-contiguous layout really important? Should >> there be a return_type="fortran_matrix" option or something like that? >> > > I don't know, yet. My intuition was that it would be better because we > feed the arrays directly to pinv/SVD or QR which, I think, require by > default Fortran contiguous. > > However, my intuition might not be correct, and it might not make much > difference in a single OLS estimation. > I did some quick timing checks of pinv and qr, and the Fortran ordered is only about 5% to 15% faster and uses about the same amount of memory (watching the Task manager). So, nothing to get excited about. I used functions like this where xf is either C or Fortran contiguous def funcl_f(): for k in range(3, xf.shape[1]): np.linalg.pinv(xf[:,:k]) with MKL which looks, to my surprise, multithreaded. (It used 50% CPU on my Quadcore, single processor is 13% CPU) BTW: default scipy.linalg.qr is pretty bad, 9GB peak memory instead of 200MB in mode='economic' Josef > > There are a few critical loops in variable selection that I'm planning to > investigate to see how much it matters. > Memory optimization was never high in our priority compared to expanding > the functionality overall, but reading the Julia mailing list is starting > to worry me a bit. :) > > (I'm even starting to see the reason for multiple dispatch.) > > Josef > > >> >> -n >> >> -- >> Nathaniel J. Smith -- http://vorpus.org >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.barker at noaa.gov Thu Oct 22 13:03:15 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 22 Oct 2015 10:03:15 -0700 Subject: [Numpy-discussion] deprecate fromstring() for text reading? Message-ID: There was just a question about a bug/issue with scipy.fromstring (which is numpy.fromstring) when used to read integers from a text file. https://mail.scipy.org/pipermail/scipy-user/2015-October/036746.html fromstring() is bugging and inflexible for reading text files -- and it is a very, very ugly mess of code. I dug into it a while back, and gave up -- just to much of a mess! So we really should completely re-implement it, or deprecate it. I doubt anyone is going to do a big refactor, so that means deprecating it. Also -- if we do want a fast read numbers from text files function (which would be nice, actually), it really should get a new name anyway. (and the hopefully coming new dtype system would make it easier to write cleanly) I'm not sure what deprecating something means, though -- have it raise a deprecation warning in the next version? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Thu Oct 22 18:35:28 2015 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Thu, 22 Oct 2015 18:35:28 -0400 Subject: [Numpy-discussion] deprecate fromstring() for text reading? In-Reply-To: References: Message-ID: I think it would be good to keep the usage to read binary data at least. Or is there a good alternative to `np.fromstring(, dtype=...)`? -- Marten On Thu, Oct 22, 2015 at 1:03 PM, Chris Barker wrote: > There was just a question about a bug/issue with scipy.fromstring (which > is numpy.fromstring) when used to read integers from a text file. > > https://mail.scipy.org/pipermail/scipy-user/2015-October/036746.html > > fromstring() is bugging and inflexible for reading text files -- and it is > a very, very ugly mess of code. I dug into it a while back, and gave up -- > just to much of a mess! > > So we really should completely re-implement it, or deprecate it. I doubt > anyone is going to do a big refactor, so that means deprecating it. > > Also -- if we do want a fast read numbers from text files function (which > would be nice, actually), it really should get a new name anyway. > > (and the hopefully coming new dtype system would make it easier to write > cleanly) > > I'm not sure what deprecating something means, though -- have it raise a > deprecation warning in the next version? > > -CHB > > > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Oct 22 19:47:30 2015 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 22 Oct 2015 16:47:30 -0700 Subject: [Numpy-discussion] deprecate fromstring() for text reading? 
In-Reply-To: References: Message-ID: <2283704104052164280@unknownmsgid> I think it would be good to keep the usage to read binary data at least. Agreed -- it's only the text file reading I'm proposing to deprecate. It was kind of weird to cram it in there in the first place. Oh, fromfile() has the same issues. Chris Or is there a good alternative to `np.fromstring(, dtype=...)`? -- Marten On Thu, Oct 22, 2015 at 1:03 PM, Chris Barker wrote: > There was just a question about a bug/issue with scipy.fromstring (which > is numpy.fromstring) when used to read integers from a text file. > > https://mail.scipy.org/pipermail/scipy-user/2015-October/036746.html > > fromstring() is bugging and inflexible for reading text files -- and it is > a very, very ugly mess of code. I dug into it a while back, and gave up -- > just to much of a mess! > > So we really should completely re-implement it, or deprecate it. I doubt > anyone is going to do a big refactor, so that means deprecating it. > > Also -- if we do want a fast read numbers from text files function (which > would be nice, actually), it really should get a new name anyway. > > (and the hopefully coming new dtype system would make it easier to write > cleanly) > > I'm not sure what deprecating something means, though -- have it raise a > deprecation warning in the next version? > > -CHB > > > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From rmcgibbo at gmail.com Thu Oct 22 21:14:11 2015 From: rmcgibbo at gmail.com (Robert McGibbon) Date: Thu, 22 Oct 2015 18:14:11 -0700 Subject: [Numpy-discussion] numpy-1.11.0.dev0 windows wheels compiled with mingwpy available In-Reply-To: References: Message-ID: Got it. Thanks, Nathaniel -- this is really good information. -Robert On Mon, Oct 19, 2015 at 6:55 PM, Nathaniel Smith wrote: > On Mon, Oct 19, 2015 at 2:26 AM, Olivier Grisel > wrote: > >> Is it possible to test this with py35 as well? > > > > Unfortunately not yet. > > > >> For MSVC, py35 requires a new compiler toolchain (VS2015) -- is that > >> something mingwpy/mingw-w64 can handle? > > > > I am pretty sure that mingwpy does not support Python 3.5 yet. > > Correct. > > > I don't know the status of the interop of mingw-w64 w.r.t. VS2015 but as > far > > as I know it's not supported yet either. Once the issue is fixed at the > > upstream level, I think mingwpy could be rebuilt to benefit from the fix. > > Upstream mingw-w64 doesn't support interop with any version of visual > studio that was released this millennium -- all the interop stuff is > new in mingwpy. > > VS2015 had a major reorganization of how it handles runtime libraries, > so it's not quite so trivial as just adding support the same way as > was done for VS2008 and VS2010. 
Or rather, IIUC: we *could* just add > support the same way as before, but there are undocumented rules about > which parts of the new runtime are considered stable and which are > not, so if we did this willy-nilly then we might end up using some of > the "unstable" parts. And then in 2017 the Windows team does some > internal refactoring and pushes it out through windows update and > suddenly NumPy / R / Julia / git / ... all start segfaulting at > startup on Windows, which would be a disaster from everyone's point of > view. We've pointed this out to the Python team at Microsoft and > they've promised to try and put Carl and the relevant mingw-w64 folks > in touch with the relevant internal folks at MS to hopefully tell us > how to do this correctly... fingers crossed :-). > > Aside from that, the main challenge for mingwpy in general is exactly > the issue of upstream support: if we don't get the interop stuff > pushed upstream from mingwpy to mingw-w64, then it will rot and break. > And upstream would love to have this interoperability as an officially > supported feature... but upstream doesn't consider what we have right > now to be maintainable, so they won't take it as is. (And honestly, > this is a reasonable opinion.) So what I've been trying to do is to > scrounge up some funding to support Carl and upstream doing this right > (the rough estimate is ~3 person-months of work). > > The original goal was to get MS to pay for this, on the theory that > they should be cleaning up their own messes, but after 6 months of > back-and-forth we've pretty much given up on that at this point, and > I'm in the process of emailing everyone I can think of who might be > convinced to donate some money to the cause. Maybe we should have a > kickstarter or something, I dunno :-). > > -n > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From webmastertux1 at gmail.com Fri Oct 23 12:45:57 2015 From: webmastertux1 at gmail.com (Charles Rilhac) Date: Sat, 24 Oct 2015 00:45:57 +0800 Subject: [Numpy-discussion] Nansum function behavior Message-ID: Hello, I noticed the change regarding nan function and especially nansum function. I think this choice is a big mistake. I know that Matlab and R have made this choice but it is illogical and counterintuitive. First argument is about logic. An arithmetic operation between Nothing and Nothing cannot make a figure or an object. Nothing + Object can be an object or something else, but from nothing, it cannot ensue something else than nothing. I hope you see what I mean. Secondly, it's counterintuitive and not convenient. Because, if you want to fill the result of nanfunction you can do that easily : a = np.array([[np.nan, np.nan], [1,np.nan]]) a = np.nansum(a, axis=1) print(a) array([np.nan, 1.]) a[np.isnan(a)] = 0 Whereas, if the result is already filled with zero on NaN-full rows, you cannot replace the result of NaN-full rows by NaN easily. In the case above, you cannot because you lost information about NaN-full rows. I know it is tough to come back to a previous stage but I really think that it is wrong to absolutely fill with zeros the result of arithmetic operation containing NaN. Thank for your work guys ;-) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Fri Oct 23 13:08:07 2015 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 23 Oct 2015 18:08:07 +0100 Subject: [Numpy-discussion] Nansum function behavior In-Reply-To: References: Message-ID: On Fri, Oct 23, 2015 at 5:45 PM, Charles Rilhac wrote: > > Hello, > > I noticed the change regarding nan function and especially nansum function. I think this choice is a big mistake. I know that Matlab and R have made this choice but it is illogical and counterintuitive. What change are you referring to? -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Fri Oct 23 13:11:13 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Fri, 23 Oct 2015 13:11:13 -0400 Subject: [Numpy-discussion] Nansum function behavior In-Reply-To: References: Message-ID: The change to nansum() happened several years ago. The main thrust of it was to make the following consistent: np.sum([]) # zero np.nansum([np.nan]) # zero np.sum([1]) # one np.nansum([np.nan, 1]) # one If you want to propagate masks and such, use masked arrays. Ben Root On Fri, Oct 23, 2015 at 12:45 PM, Charles Rilhac wrote: > Hello, > > I noticed the change regarding nan function and especially nansum > function. I think this choice is a big mistake. I know that Matlab and R > have made this choice but it is illogical and counterintuitive. > > First argument is about logic. An arithmetic operation between Nothing and > Nothing cannot make a figure or an object. Nothing + Object can be an > object or something else, but from nothing, it cannot ensue something else > than nothing. I hope you see what I mean. > > Secondly, it's counterintuitive and not convenient. Because, if you want > to fill the result of nanfunction you can do that easily : > > a = np.array([[np.nan, np.nan], [1,np.nan]]) > a = np.nansum(a, axis=1)print(a) > array([np.nan, 1.]) > a[np.isnan(a)] = 0 > > Whereas, if the result is already filled with zero on NaN-full rows, you > cannot replace the result of NaN-full rows by NaN easily. In the case > above, you cannot because you lost information about NaN-full rows. > > I know it is tough to come back to a previous stage but I really think > that it is wrong to absolutely fill with zeros the result of arithmetic > operation containing NaN. > Thank for your work guys ;-) > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Oct 23 18:13:02 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 23 Oct 2015 16:13:02 -0600 Subject: [Numpy-discussion] deprecate fromstring() for text reading? In-Reply-To: <2283704104052164280@unknownmsgid> References: <2283704104052164280@unknownmsgid> Message-ID: On Thu, Oct 22, 2015 at 5:47 PM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: > > I think it would be good to keep the usage to read binary data at least. > > > Agreed -- it's only the text file reading I'm proposing to deprecate. It > was kind of weird to cram it in there in the first place. > > Oh, fromfile() has the same issues. > > Chris > > > Or is there a good alternative to `np.fromstring(, dtype=...)`? 
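Regarding the question in the fromstring thread about an alternative for the binary case, np.frombuffer covers the same ground as np.fromstring(<binary>, dtype=...) without the text-parsing baggage, while text files are better served by np.loadtxt or np.genfromtxt; a small sketch, assuming little-endian int32 data:

import numpy as np

raw = np.arange(5, dtype='<i4').tobytes()    # some binary data

a = np.frombuffer(raw, dtype='<i4')          # read-only view on the buffer
b = np.frombuffer(raw, dtype='<i4').copy()   # independent, writable copy
print(a)                                     # [0 1 2 3 4]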
-- > Marten > > On Thu, Oct 22, 2015 at 1:03 PM, Chris Barker > wrote: > >> There was just a question about a bug/issue with scipy.fromstring (which >> is numpy.fromstring) when used to read integers from a text file. >> >> https://mail.scipy.org/pipermail/scipy-user/2015-October/036746.html >> >> fromstring() is bugging and inflexible for reading text files -- and it >> is a very, very ugly mess of code. I dug into it a while back, and gave up >> -- just to much of a mess! >> >> So we really should completely re-implement it, or deprecate it. I doubt >> anyone is going to do a big refactor, so that means deprecating it. >> >> Also -- if we do want a fast read numbers from text files function (which >> would be nice, actually), it really should get a new name anyway. >> >> (and the hopefully coming new dtype system would make it easier to write >> cleanly) >> >> I'm not sure what deprecating something means, though -- have it raise a >> deprecation warning in the next version? >> >> There was discussion at SciPy 2015 of separating out the text reading abilities of Pandas so that numpy could include it. We should contact Jeff Rebeck and see about moving that forward. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffreback at gmail.com Fri Oct 23 18:30:39 2015 From: jeffreback at gmail.com (Jeff Reback) Date: Fri, 23 Oct 2015 18:30:39 -0400 Subject: [Numpy-discussion] deprecate fromstring() for text reading? In-Reply-To: References: <2283704104052164280@unknownmsgid> Message-ID: > On Oct 23, 2015, at 6:13 PM, Charles R Harris wrote: > > > >> On Thu, Oct 22, 2015 at 5:47 PM, Chris Barker - NOAA Federal wrote: >> >>> I think it would be good to keep the usage to read binary data at least. >> >> Agreed -- it's only the text file reading I'm proposing to deprecate. It was kind of weird to cram it in there in the first place. >> >> Oh, fromfile() has the same issues. >> >> Chris >> >> >>> Or is there a good alternative to `np.fromstring(, dtype=...)`? -- Marten >>> >>>> On Thu, Oct 22, 2015 at 1:03 PM, Chris Barker wrote: >>>> There was just a question about a bug/issue with scipy.fromstring (which is numpy.fromstring) when used to read integers from a text file. >>>> >>>> https://mail.scipy.org/pipermail/scipy-user/2015-October/036746.html >>>> >>>> fromstring() is bugging and inflexible for reading text files -- and it is a very, very ugly mess of code. I dug into it a while back, and gave up -- just to much of a mess! >>>> >>>> So we really should completely re-implement it, or deprecate it. I doubt anyone is going to do a big refactor, so that means deprecating it. >>>> >>>> Also -- if we do want a fast read numbers from text files function (which would be nice, actually), it really should get a new name anyway. >>>> >>>> (and the hopefully coming new dtype system would make it easier to write cleanly) >>>> >>>> I'm not sure what deprecating something means, though -- have it raise a deprecation warning in the next version? > > There was discussion at SciPy 2015 of separating out the text reading abilities of Pandas so that numpy could include it. We should contact Jeff Rebeck and see about moving that forward. > > Chuck > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion IIRC Thomas Caswell was interested in doing this :) Jeff -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Fri Oct 23 18:49:06 2015 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 23 Oct 2015 15:49:06 -0700 Subject: [Numpy-discussion] deprecate fromstring() for text reading? In-Reply-To: References: <2283704104052164280@unknownmsgid> Message-ID: On Oct 23, 2015 3:30 PM, "Jeff Reback" wrote: > > On Oct 23, 2015, at 6:13 PM, Charles R Harris wrote: > >> >> >> On Thu, Oct 22, 2015 at 5:47 PM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: >>> >>> >>>> I think it would be good to keep the usage to read binary data at least. >>> >>> >>> Agreed -- it's only the text file reading I'm proposing to deprecate. It was kind of weird to cram it in there in the first place. >>> >>> Oh, fromfile() has the same issues. >>> >>> Chris >>> >>> >>>> Or is there a good alternative to `np.fromstring(, dtype=...)`? -- Marten >>>> >>>> On Thu, Oct 22, 2015 at 1:03 PM, Chris Barker wrote: >>>>> >>>>> There was just a question about a bug/issue with scipy.fromstring (which is numpy.fromstring) when used to read integers from a text file. >>>>> >>>>> https://mail.scipy.org/pipermail/scipy-user/2015-October/036746.html >>>>> >>>>> fromstring() is bugging and inflexible for reading text files -- and it is a very, very ugly mess of code. I dug into it a while back, and gave up -- just to much of a mess! >>>>> >>>>> So we really should completely re-implement it, or deprecate it. I doubt anyone is going to do a big refactor, so that means deprecating it. >>>>> >>>>> Also -- if we do want a fast read numbers from text files function (which would be nice, actually), it really should get a new name anyway. >>>>> >>>>> (and the hopefully coming new dtype system would make it easier to write cleanly) >>>>> >>>>> I'm not sure what deprecating something means, though -- have it raise a deprecation warning in the next version? >>>>> >> >> There was discussion at SciPy 2015 of separating out the text reading abilities of Pandas so that numpy could include it. We should contact Jeff Rebeck and see about moving that forward. > > > IIRC Thomas Caswell was interested in doing this :) When he was in Berkeley a few weeks ago he assured me that every night since SciPy he has dutifully been feeling guilty about not having done it yet. I think this week his paltry excuse is that he's "on his honeymoon" or something. ...which is to say that if someone has some spare cycles to take this over then I think that might be a nice wedding present for him :-). (The basic idea is to take the text reading backend behind pandas.read_csv and extract it into a standalone package that pandas could depend on, and that could also be used by other packages like numpy (among others -- I thing dato's SFrame package has a fork of this code as well?)) -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffreback at gmail.com Fri Oct 23 19:02:36 2015 From: jeffreback at gmail.com (Jeff Reback) Date: Fri, 23 Oct 2015 19:02:36 -0400 Subject: [Numpy-discussion] deprecate fromstring() for text reading? In-Reply-To: References: <2283704104052164280@unknownmsgid> Message-ID: > On Oct 23, 2015, at 6:49 PM, Nathaniel Smith wrote: > > On Oct 23, 2015 3:30 PM, "Jeff Reback" wrote: > > > > On Oct 23, 2015, at 6:13 PM, Charles R Harris wrote: > > > >> > >> > >> On Thu, Oct 22, 2015 at 5:47 PM, Chris Barker - NOAA Federal wrote: > >>> > >>> > >>>> I think it would be good to keep the usage to read binary data at least. 
> >>> > >>> > >>> Agreed -- it's only the text file reading I'm proposing to deprecate. It was kind of weird to cram it in there in the first place. > >>> > >>> Oh, fromfile() has the same issues. > >>> > >>> Chris > >>> > >>> > >>>> Or is there a good alternative to `np.fromstring(, dtype=...)`? -- Marten > >>>> > >>>> On Thu, Oct 22, 2015 at 1:03 PM, Chris Barker wrote: > >>>>> > >>>>> There was just a question about a bug/issue with scipy.fromstring (which is numpy.fromstring) when used to read integers from a text file. > >>>>> > >>>>> https://mail.scipy.org/pipermail/scipy-user/2015-October/036746.html > >>>>> > >>>>> fromstring() is bugging and inflexible for reading text files -- and it is a very, very ugly mess of code. I dug into it a while back, and gave up -- just to much of a mess! > >>>>> > >>>>> So we really should completely re-implement it, or deprecate it. I doubt anyone is going to do a big refactor, so that means deprecating it. > >>>>> > >>>>> Also -- if we do want a fast read numbers from text files function (which would be nice, actually), it really should get a new name anyway. > >>>>> > >>>>> (and the hopefully coming new dtype system would make it easier to write cleanly) > >>>>> > >>>>> I'm not sure what deprecating something means, though -- have it raise a deprecation warning in the next version? > >>>>> > >> > >> There was discussion at SciPy 2015 of separating out the text reading abilities of Pandas so that numpy could include it. We should contact Jeff Rebeck and see about moving that forward. > > > > > > IIRC Thomas Caswell was interested in doing this :) > > When he was in Berkeley a few weeks ago he assured me that every night since SciPy he has dutifully been feeling guilty about not having done it yet. I think this week his paltry excuse is that he's "on his honeymoon" or something. > > ...which is to say that if someone has some spare cycles to take this over then I think that might be a nice wedding present for him :-). > > (The basic idea is to take the text reading backend behind pandas.read_csv and extract it into a standalone package that pandas could depend on, and that could also be used by other packages like numpy (among others -- I thing dato's SFrame package has a fork of this code as well?)) > > -n > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion I can certainly provide guidance on how/what to extract but don't have spare cycles myself for this :( -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Oct 23 20:22:55 2015 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Fri, 23 Oct 2015 17:22:55 -0700 Subject: [Numpy-discussion] deprecate fromstring() for text reading? In-Reply-To: References: <2283704104052164280@unknownmsgid> Message-ID: <-1464708838107245522@unknownmsgid> Grabbing the pandas csv reader would be great, and I hope it happens sooner than later, though alas, I haven't the spare cycles for it either. In the meantime though, can we put a deprecation Warning in when using fromstring() on text files? It's really pretty broken. 
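Regarding the earlier question in this thread about what deprecating the text path would mean in practice, the usual pattern is to raise a DeprecationWarning while the old behaviour keeps working for a release or two; this is only an illustration, not numpy's actual code:

import warnings
import numpy as np

def fromstring_compat(string, dtype=float, sep=''):
    # warn on the text path (sep != ''), leave the binary path untouched
    if sep != '':
        warnings.warn("fromstring() for text reading is deprecated; "
                      "use np.loadtxt or np.genfromtxt instead",
                      DeprecationWarning, stacklevel=2)
    return np.fromstring(string, dtype=dtype, sep=sep)

print(fromstring_compat("1 2 3", dtype=int, sep=" "))   # [1 2 3]; run with -W default to see the warning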
-Chris On Oct 23, 2015, at 4:02 PM, Jeff Reback wrote: On Oct 23, 2015, at 6:49 PM, Nathaniel Smith wrote: On Oct 23, 2015 3:30 PM, "Jeff Reback" wrote: > > On Oct 23, 2015, at 6:13 PM, Charles R Harris wrote: > >> >> >> On Thu, Oct 22, 2015 at 5:47 PM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: >>> >>> >>>> I think it would be good to keep the usage to read binary data at least. >>> >>> >>> Agreed -- it's only the text file reading I'm proposing to deprecate. It was kind of weird to cram it in there in the first place. >>> >>> Oh, fromfile() has the same issues. >>> >>> Chris >>> >>> >>>> Or is there a good alternative to `np.fromstring(, dtype=...)`? -- Marten >>>> >>>> On Thu, Oct 22, 2015 at 1:03 PM, Chris Barker wrote: >>>>> >>>>> There was just a question about a bug/issue with scipy.fromstring (which is numpy.fromstring) when used to read integers from a text file. >>>>> >>>>> https://mail.scipy.org/pipermail/scipy-user/2015-October/036746.html >>>>> >>>>> fromstring() is bugging and inflexible for reading text files -- and it is a very, very ugly mess of code. I dug into it a while back, and gave up -- just to much of a mess! >>>>> >>>>> So we really should completely re-implement it, or deprecate it. I doubt anyone is going to do a big refactor, so that means deprecating it. >>>>> >>>>> Also -- if we do want a fast read numbers from text files function (which would be nice, actually), it really should get a new name anyway. >>>>> >>>>> (and the hopefully coming new dtype system would make it easier to write cleanly) >>>>> >>>>> I'm not sure what deprecating something means, though -- have it raise a deprecation warning in the next version? >>>>> >> >> There was discussion at SciPy 2015 of separating out the text reading abilities of Pandas so that numpy could include it. We should contact Jeff Rebeck and see about moving that forward. > > > IIRC Thomas Caswell was interested in doing this :) When he was in Berkeley a few weeks ago he assured me that every night since SciPy he has dutifully been feeling guilty about not having done it yet. I think this week his paltry excuse is that he's "on his honeymoon" or something. ...which is to say that if someone has some spare cycles to take this over then I think that might be a nice wedding present for him :-). (The basic idea is to take the text reading backend behind pandas.read_csv and extract it into a standalone package that pandas could depend on, and that could also be used by other packages like numpy (among others -- I thing dato's SFrame package has a fork of this code as well?)) -n _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion I can certainly provide guidance on how/what to extract but don't have spare cycles myself for this :( _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From webmastertux1 at gmail.com Fri Oct 23 20:47:28 2015 From: webmastertux1 at gmail.com (Charles Rilhac) Date: Sat, 24 Oct 2015 08:47:28 +0800 Subject: [Numpy-discussion] Nansum function behavior In-Reply-To: References: Message-ID: Why do we keep this behaviour ? : np.nansum([np.nan]) # zero Firstly, you lose information. 
You can easily fill nan with zero after applying nansum but you cannot keep nan for nan-full rows if you doesn?t have a mask or keep the information about nan-full row before. It is not convenient, useful. Secondly, it is illogical. A arithmetic operation or whatever else between Nothing and Nothing cannot return Something. We can accept that Nothing + Object = Object but we cannot get a figure from nothing. It is counterintuitive. I really disagree with this change happened few years ago. > On 24 Oct 2015, at 01:11, Benjamin Root wrote: > > The change to nansum() happened several years ago. The main thrust of it was to make the following consistent: > > np.sum([]) # zero > np.nansum([np.nan]) # zero > np.sum([1]) # one > np.nansum([np.nan, 1]) # one > > If you want to propagate masks and such, use masked arrays. > Ben Root > > > On Fri, Oct 23, 2015 at 12:45 PM, Charles Rilhac > wrote: > Hello, > > I noticed the change regarding nan function and especially nansum function. I think this choice is a big mistake. I know that Matlab and R have made this choice but it is illogical and counterintuitive. > > First argument is about logic. An arithmetic operation between Nothing and Nothing cannot make a figure or an object. Nothing + Object can be an object or something else, but from nothing, it cannot ensue something else than nothing. I hope you see what I mean. > > Secondly, it's counterintuitive and not convenient. Because, if you want to fill the result of nanfunction you can do that easily : > > a = np.array([[np.nan, np.nan], [1,np.nan]]) > a = np.nansum(a, axis=1) > print(a) > array([np.nan, 1.]) > a[np.isnan(a)] = 0 > Whereas, if the result is already filled with zero on NaN-full rows, you cannot replace the result of NaN-full rows by NaN easily. In the case above, you cannot because you lost information about NaN-full rows. > > I know it is tough to come back to a previous stage but I really think that it is wrong to absolutely fill with zeros the result of arithmetic operation containing NaN. > > Thank for your work guys ;-) > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Fri Oct 23 21:28:27 2015 From: shoyer at gmail.com (Stephan Hoyer) Date: Fri, 23 Oct 2015 18:28:27 -0700 Subject: [Numpy-discussion] Nansum function behavior In-Reply-To: References: Message-ID: Hi Charles, You should read the previous discussion about this issue on GitHub: https://github.com/numpy/numpy/issues/1721 For what it's worth, I do think the new definition of nansum is more consistent. If you want to preserve NaN if there are no non-NaN values, you can often calculate this desired quantity from nanmean, which does return NaN if there are only NaNs. Stephan -------------- next part -------------- An HTML attachment was scrubbed... URL: From webmastertux1 at gmail.com Fri Oct 23 21:43:54 2015 From: webmastertux1 at gmail.com (Charles Rilhac) Date: Sat, 24 Oct 2015 09:43:54 +0800 Subject: [Numpy-discussion] Nansum function behavior In-Reply-To: References: Message-ID: <08DB1F07-5A22-4E21-B9B3-61A55B719D54@gmail.com> I saw this thread and I totally disagree with thouis argument? 
Of course, you can get NaN if there are only NaNs. Thank goodness, there are plenty of ways to do that. But it's not convenient or consistent and, above all, it is logically wrong to do that. NaN does not mean zero, and an operation involving only NaN cannot return a figure... You lose information about your array. It is easier to fill the result of nansum with zeros afterwards than to have to keep a mask of your original array, or whatever else you would need to do. Why is it misleading? For example, say you want to sum the rows of an array and then take the mean of the result:
a = np.array([[2, np.nan, 4], [np.nan, np.nan, np.nan]])
b = np.nansum(a, axis=1)  # array([ 6., 0.])
m = np.nanmean(b)  # 3.0 -- WRONG, because you wanted to get 6
> On 24 Oct 2015, at 09:28, Stephan Hoyer wrote: > > Hi Charles, > > You should read the previous discussion about this issue on GitHub: > https://github.com/numpy/numpy/issues/1721 > > For what it's worth, I do think the new definition of nansum is more consistent. > > If you want to preserve NaN if there are no non-NaN values, you can often calculate this desired quantity from nanmean, which does return NaN if there are only NaNs. > > Stephan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From jni.soma at gmail.com Sat Oct 24 02:08:18 2015 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Fri, 23 Oct 2015 23:08:18 -0700 (PDT) Subject: [Numpy-discussion] Nansum function behavior In-Reply-To: <08DB1F07-5A22-4E21-B9B3-61A55B719D54@gmail.com> References: <08DB1F07-5A22-4E21-B9B3-61A55B719D54@gmail.com> Message-ID: <1445666897743.824aaa9c@Nodemailer> Hi Charles, Just providing an outsider's perspective... Your specific use-case doesn't address the general definition of nansum: perform a sum while ignoring nans. As others have pointed out (especially in the linked thread), the sum of nothing is 0. Although the current behaviour of nansum doesn't quite match your use-case, there is no doubt at all that it follows a consistent convention. "Wrong" is certainly not the correct way to describe it. You can easily cater to your use case as follows:
def rilhac_nansum(ar, axis=None):
    if axis is None:
        return np.nanmean(ar)
    else:
        return np.nanmean(ar, axis=axis) * ar.shape[axis]
nanmean _consistently_ returns nans when encountering nan-only values because the mean of nothing is nan (the sum of nothing divided by the length of nothing, i.e. 0/0). Hope this helps... Juan. On Sat, Oct 24, 2015 at 12:44 PM, Charles Rilhac wrote: > I saw this thread and I totally disagree with thouis argument... > Of course, you can get NaN if there are only NaNs. Thank goodness, there are plenty of ways to do that. > But it's not convenient or consistent and, above all, it is logically wrong to do that. NaN does not mean zero, and an operation involving only NaN cannot return a figure... > You lose information about your array. It is easier to fill the result of nansum with zeros afterwards than to have to keep a mask of your original array, or whatever else you would need to do. > Why is it misleading? 
> For example you want to sum rows of a array and mean the result : > a = np.array([[2,np.nan,4], [np.nan,np.nan, np.nan]]) > b = np.nansum(a, axis=1) # array([ 6., 0.]) > m = np.nanmean(b) # 3.0 WRONG because you wanted to get 6 >> On 24 Oct 2015, at 09:28, Stephan Hoyer wrote: >> >> Hi Charles, >> >> You should read the previous discussion about this issue on GitHub: >> https://github.com/numpy/numpy/issues/1721 >> >> For what it's worth, I do think the new definition of nansum is more consistent. >> >> If you want to preserve NaN if there are no non-NaN values, you can often calculate this desired quantity from nanmean, which does return NaN if there are only NaNs. >> >> Stephan >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Oct 25 06:25:26 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 25 Oct 2015 11:25:26 +0100 Subject: [Numpy-discussion] ANN: Scipy 0.16.1 release Message-ID: Hi all, I'm happy to announce the availability of the Scipy 0.16.1 release. This is a bugfix only release; it contains no new features compared to 0.16.0. The sources and binary installers can be found at: - Source tarballs: at https://github.com/scipy/scipy/releases and on PyPi. - OS X: there are wheels on PyPi, so simply install with pip. - Windows: .exe installers can be found on https://github.com/scipy/scipy/releases Cheers, Ralf ========================== SciPy 0.16.1 Release Notes ========================== SciPy 0.16.1 is a bug-fix release with no new features compared to 0.16.0. Issues closed for 0.16.1 ------------------------ - `#5077 `__: cKDTree not indexing properly for arrays with too many elements - `#5127 `__: Regression in 0.16.0: solve_banded errors out in patsy test suite - `#5149 `__: linalg tests apparently cause python to crash with numpy 1.10.0b1 - `#5154 `__: 0.16.0 fails to build on OS X; can't find Python.h - `#5173 `__: failing stats.histogram test with numpy 1.10 - `#5191 `__: Scipy 0.16.x - TypeError: _asarray_validated() got an unexpected... - `#5195 `__: tarballs missing documentation source - `#5363 `__: FAIL: test_orthogonal.test_j_roots, test_orthogonal.test_js_roots Pull requests for 0.16.1 ------------------------ - `#5088 `__: BUG: fix logic error in cKDTree.sparse_distance_matrix - `#5089 `__: BUG: Don't overwrite b in lfilter's FIR path - `#5128 `__: BUG: solve_banded failed when solving 1x1 systems - `#5155 `__: BLD: fix missing Python include for Homebrew builds. - `#5192 `__: BUG: backport as_inexact kwarg to _asarray_validated - `#5203 `__: BUG: fix uninitialized use in lartg 0.16 backport - `#5204 `__: BUG: properly return error to fortran from ode_jacobian_function... - `#5207 `__: TST: Fix TestCtypesQuad failure on Python 3.5 for Windows - `#5352 `__: TST: sparse: silence warnings about boolean indexing - `#5355 `__: MAINT: backports for 0.16.1 release - `#5356 `__: REL: update Paver file to ensure sdist contents are OK for releases. - `#5382 `__: 0.16.x backport: MAINT: work around a possible numpy ufunc loop... 
- `#5393 `__: TST:special: bump tolerance levels for test_j_roots and test_js_roots - `#5417 From eleanore.young at artorg.unibe.ch Sun Oct 25 08:06:46 2015 From: eleanore.young at artorg.unibe.ch (eleanore.young at artorg.unibe.ch) Date: Sun, 25 Oct 2015 12:06:46 +0000 Subject: [Numpy-discussion] Numpy Generalized Ufuncs: Pointer Arithmetic and Segmentation Faults (Debugging?) Message-ID: <2577F295-530A-4433-90B9-1CA9701156F5@artorg.unibe.ch> Dear Numpy maintainers and developers, Thanks for providing such a great numerical library! I'm currently trying to implement the Dynamic Time Warping metric as a set of generalised numpy ufuncs, but unfortunately, I have lasting issues with pointer arithmetic and segmentation faults. Is there any way that I can use GDB or some such to debug a python/numpy extension? Furthermore: is it necessary to use pointer arithmetic to access the function arguments (as seen on http://docs.scipy.org/doc/numpy/user/c-info.ufunc-tutorial.html) or is element access (operator[]) also permissible? To break it down quickly, I need to have a fast DTW distance function dist_dtw() with two vector inputs (broadcasting should be possible), two scalar parameters and one scalar output (signature: (i), (j), (), () -> ()) usable in Python for a 1-Nearest Neighbor classification algorithm. The extension also implements two functions compute_envelope() and piecewise_mean_reduction() which are used for lower-bounding based on Keogh and Ratanamahatana, 2005. The source code is available at http://pastebin.com/MunNaP7V and the prominent segmentation fault happens somewhere in the chain dist_dtw() -> meta_dtw_dist() -> slow_dtw_dist(), but I fail to pin it down. Aside from my primary questions, I wonder how to approach errors/exceptions and unit testing when developing numpy ufuncs. Are there any examples apart from the numpy manual that I could use as reference implementations of generalised numpy ufuncs? I would greatly appreciate some insight into properly developing generalised ufuncs. Best, Eleanore -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Sun Oct 25 10:13:02 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Sun, 25 Oct 2015 07:13:02 -0700 Subject: [Numpy-discussion] Numpy Generalized Ufuncs: Pointer Arithmetic and Segmentation Faults (Debugging?) In-Reply-To: <2577F295-530A-4433-90B9-1CA9701156F5@artorg.unibe.ch> References: <2577F295-530A-4433-90B9-1CA9701156F5@artorg.unibe.ch> Message-ID: Hi Eleanore, Thanks for the kind words, you are very welcome! As for your issues, I think they are coming from the handling of the strides you are doing in the slow_dtw_dist function. The strides are the number of bytes you have to advance your pointer to get to the next item. In your code, you end up doing something akin to:
dtype *v_i = v0;
...
for (...) {
    ...
    v_i += stride_v;
}
This, rather than increasing the v_i pointer by stride_v bytes, increases it by stride_v * sizeof(dtype), and with the npy_double you seem to be using as dtype, sends you out of your allocated memory at a rate 8x too fast. What you increase by stride_v has to be of char* type, so one simple solution would be to do something like:
char *v_ptr = (char *)v0;
...
for (...) {
    dtype v_val = *(dtype *)v_ptr;
    ...
    v_ptr += stride_v;
}
And use v_val directly wherever you were dereferencing v_i before. 
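Putting those two pieces together, here is a rough, untested sketch of what an inner loop for a toy gufunc signature (i)->() can look like when all of the pointer arithmetic stays in char* and bytes -- the function name, variable names and the squared-sum placeholder are purely illustrative, this is not your pastebin code:

static void toy_loop(char **args, npy_intp *dimensions, npy_intp *steps, void *data)
{
    /* signature (i)->(): args[0] is the vector input, args[1] the scalar output */
    char *in = args[0], *out = args[1];
    npy_intp n_outer = dimensions[0];    /* number of outer (broadcast) iterations */
    npy_intp n_i = dimensions[1];        /* length of the core dimension i */
    npy_intp in_step = steps[0];         /* outer byte strides, one per argument... */
    npy_intp out_step = steps[1];
    npy_intp in_i_step = steps[2];       /* ...then the core byte stride along i */
    npy_intp k, i;

    for (k = 0; k < n_outer; k++) {
        char *p = in;
        npy_double acc = 0.0;
        for (i = 0; i < n_i; i++) {
            npy_double v = *(npy_double *)p;  /* cast only at the dereference */
            acc += v * v;                     /* placeholder for the real DTW work */
            p += in_i_step;                   /* advance by bytes, not by elements */
        }
        *(npy_double *)out = acc;
        in += in_step;
        out += out_step;
    }
}

The pattern is always the same: the pointers you walk are char*, the strides numpy hands you in steps are byte counts, and you only cast to (npy_double *) at the moment you actually read or write a value. 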
Jaime On Sun, Oct 25, 2015 at 5:06 AM, wrote: > Dear Numpy maintainers and developers, > > Thanks for providing such a great numerical library! > > I?m currently trying to implement the Dynamic Time Warping metric as a set > of generalised numpy ufuncs, but unfortunately, I have lasting issues with > pointer arithmetic and segmentation faults. Is there any way that I can > use GDB or some such to debug a python/numpy extension? Furthermore: is it > necessary to use pointer arithmetic to access the function arguments (as > seen on http://docs.scipy.org/doc/numpy/user/c-info.ufunc-tutorial.html) > or is element access (operator[]) also permissible? > > To break it down quickly, I need to have a fast DTW distance function > dist_dtw() with two vector inputs (broadcasting should be possible), two > scalar parameters and one scalar output (signature: (i), (j), (), () -> ()) > usable in python for a 1-Nearest Neighbor classification algorithm. The > extension also implements two functions compute_envelope() and > piecewise_mean_reduction() which are used for lower-bounding based on Keogh > and Ratanamahatana, 2005. The source code is available at > http://pastebin.com/MunNaP7V and the prominent segmentation fault happens > somewhere in the chain dist_dtw() ?> meta_dtw_dist() ?> slow_dtw_dist(), > but I fail to pin it down. > > Aside from my primary questions, I wonder how to approach > errors/exceptions and unit testing when developing numpy ufuncs. Are there > any examples apart from the numpy manual that I could use as reference > implementations of generalised numpy ufuncs? > > I would greatly appreciate some insight into properly developing > generalised ufuncs. > > Best, > Eleanore > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Mon Oct 26 01:04:28 2015 From: travis at continuum.io (Travis Oliphant) Date: Mon, 26 Oct 2015 00:04:28 -0500 Subject: [Numpy-discussion] Numpy Generalized Ufuncs: Pointer Arithmetic and Segmentation Faults (Debugging?) In-Reply-To: <2577F295-530A-4433-90B9-1CA9701156F5@artorg.unibe.ch> References: <2577F295-530A-4433-90B9-1CA9701156F5@artorg.unibe.ch> Message-ID: Two things that might help you create generalized ufuncs: 1) Look at Numba --- it makes it very easy to write generalized ufuncs in simple Python code. Numba will compile to machine code so it can be as fast as writing in C. Here is the documentation for that specific feature: http://numba.pydata.org/numba-doc/0.21.0/user/vectorize.html#the-guvectorize-decorator. One wart of the interface is that scalars need to be treated as 1-element 1-d arrays (but still use '()' in the signature). 2) Look at the linear algebra module in NumPy which now wraps a bunch of linear-algebra based generalized ufuncs (all written in C): https://github.com/numpy/numpy/blob/master/numpy/linalg/umath_linalg.c.src -Travis On Sun, Oct 25, 2015 at 7:06 AM, wrote: > Dear Numpy maintainers and developers, > > Thanks for providing such a great numerical library! > > I?m currently trying to implement the Dynamic Time Warping metric as a set > of generalised numpy ufuncs, but unfortunately, I have lasting issues with > pointer arithmetic and segmentation faults. 
Is there any way that I can > use GDB or some such to debug a python/numpy extension? Furthermore: is it > necessary to use pointer arithmetic to access the function arguments (as > seen on http://docs.scipy.org/doc/numpy/user/c-info.ufunc-tutorial.html) > or is element access (operator[]) also permissible? > > To break it down quickly, I need to have a fast DTW distance function > dist_dtw() with two vector inputs (broadcasting should be possible), two > scalar parameters and one scalar output (signature: (i), (j), (), () -> ()) > usable in python for a 1-Nearest Neighbor classification algorithm. The > extension also implements two functions compute_envelope() and > piecewise_mean_reduction() which are used for lower-bounding based on Keogh > and Ratanamahatana, 2005. The source code is available at > http://pastebin.com/MunNaP7V and the prominent segmentation fault happens > somewhere in the chain dist_dtw() ?> meta_dtw_dist() ?> slow_dtw_dist(), > but I fail to pin it down. > > Aside from my primary questions, I wonder how to approach > errors/exceptions and unit testing when developing numpy ufuncs. Are there > any examples apart from the numpy manual that I could use as reference > implementations of generalised numpy ufuncs? > > I would greatly appreciate some insight into properly developing > generalised ufuncs. > > Best, > Eleanore > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- *Travis Oliphant* *Co-founder and CEO* @teoliphant 512-222-5440 http://www.continuum.io -------------- next part -------------- An HTML attachment was scrubbed... URL: From eleanore.young at artorg.unibe.ch Mon Oct 26 12:25:05 2015 From: eleanore.young at artorg.unibe.ch (eleanore.young at artorg.unibe.ch) Date: Mon, 26 Oct 2015 16:25:05 +0000 Subject: [Numpy-discussion] Numpy Generalized Ufuncs: Pointer Arithmetic and Segmentation Faults (Debugging?) Message-ID: <349C9218-6197-4D11-B0C2-68A166FEB201@artorg.unibe.ch> Dear Jaime, dear Travis thanks for pointing out my stride errors. This just gets me every time. After trying out Travis? suggestion to work with numba, I feel that this works best for me. Functions are easier to generalise to different data types and I can make use of my existing Python development environment that way. Thanks again for your rapid and helpful support! Best, Eleanore From njs at pobox.com Tue Oct 27 00:31:12 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 26 Oct 2015 21:31:12 -0700 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead Message-ID: Hi all, Apparently it is not well known that if you have a Python project source tree (e.g., a numpy checkout), then the correct way to install it is NOT to type python setup.py install # bad and broken! but rather to type pip install . (I.e., pip install isn't just for packages on pypi -- you can also pass it the path to an arbitrary source directory or the URL of a source tarball and it will do its thing. In this case "install ." means "install the project in the current directory".) These don't quite have identical results -- the main difference is that the latter makes sure that proper metadata gets installed so that later on it will be possible to upgrade or uninstall correctly. 
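(As a rough illustration of what that metadata buys you -- an illustrative sketch, not a literal transcript, and the details vary with your pip and setuptools versions:

    $ pip install .
    $ pip list | grep -i numpy   # pip can see the install and its version...
    $ pip uninstall numpy        # ...and knows exactly which files to remove

A bare 'setup.py install' just copies files into site-packages without leaving pip a reliable record like that.) 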
If you call setup.py directly, and then later you try to upgrade your package, then it's entirely possible to end up with a mixture of old and new versions of the package installed in your PYTHONPATH. (One common effect is in numpy's case is that we get people sending us mysterious bug reports about failing tests in files don't even exist (!) -- because nose is finding tests in files from one version of numpy and running them against a different version of numpy.) But this isn't the only issue -- using pip also avoids a bunch of weird corner cases in distutils/setuptools. E.g., if setup.py uses plain distutils, then it turns out this will mangle numpy version numbers in ways that cause weird horribleness -- see [1] for a bug report of the form "matplotlib doesn't build anymore" which turned out to be because of using 'setup.py install' to install numpy. OTOH if setup.py uses setuptools then you get different weirdnesses, like you can easily end up with multiple versions of the same library installed simultaneously. And finally, an advantage of getting used to using 'pip install .' now is that you'll be prepared for the glorious future when we kill distutils and get rid of setup.py entirely in favor of something less terrible [2]. So a proposal that came out of the discussion in [1] is that we modify numpy's setup.py now so that if you try running python setup.py install you get Error: Calling 'setup.py install' directly is NOT SUPPORTED! Instead, do: pip install . Alternatively, if you want to proceed at your own risk, you can try 'setup.py install --force-raw-setup.py' For more information see http://... (Other setup.py commands would continue to work as normal.) I believe that this would also break both 'easy_install numpy', and attempts to install numpy via the setup_requires= argument to setuptools.setup (because setup_requires= implicitly calls easy_install). install_requires= would *not* be affected, and setup_requires= would still be fine in cases where numpy was already installed. This would hopefully cut down on the amount of time everyone spends trying to track down these stupid weird bugs, but it will also require some adjustment in people's workflows, so... objections? concerns? -n [1] https://github.com/numpy/numpy/issues/6551 [2] https://mail.python.org/pipermail/distutils-sig/2015-October/027360.html -- Nathaniel J. Smith -- http://vorpus.org From njs at pobox.com Tue Oct 27 01:44:15 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 26 Oct 2015 22:44:15 -0700 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Mon, Oct 26, 2015 at 9:31 PM, Nathaniel Smith wrote: [...] > I believe that this would also break both 'easy_install numpy', and > attempts to install numpy via the setup_requires= argument to > setuptools.setup (because setup_requires= implicitly calls > easy_install). install_requires= would *not* be affected, and > setup_requires= would still be fine in cases where numpy was already > installed. On further investigation, it looks like the simplest approach to doing this would actually treat easy_install and setup_requires= the same way as they treat pip, i.e., they would all be allowed. (I was misreading some particularly confusing code in setuptools.) It also looks like easy_installed packages can at least be safely upgraded, so I guess allowing this is okay :-). -n -- Nathaniel J. 
Smith -- http://vorpus.org From charlesr.harris at gmail.com Tue Oct 27 02:03:22 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 27 Oct 2015 00:03:22 -0600 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Mon, Oct 26, 2015 at 10:31 PM, Nathaniel Smith wrote: > Hi all, > > Apparently it is not well known that if you have a Python project > source tree (e.g., a numpy checkout), then the correct way to install > it is NOT to type > > python setup.py install # bad and broken! > > but rather to type > > pip install . > > (I.e., pip install isn't just for packages on pypi -- you can also > pass it the path to an arbitrary source directory or the URL of a > source tarball and it will do its thing. In this case "install ." > means "install the project in the current directory".) > > These don't quite have identical results -- the main difference is > that the latter makes sure that proper metadata gets installed so that > later on it will be possible to upgrade or uninstall correctly. If you > call setup.py directly, and then later you try to upgrade your > package, then it's entirely possible to end up with a mixture of old > and new versions of the package installed in your PYTHONPATH. (One > common effect is in numpy's case is that we get people sending us > mysterious bug reports about failing tests in files don't even exist > (!) -- because nose is finding tests in files from one version of > numpy and running them against a different version of numpy.) > > But this isn't the only issue -- using pip also avoids a bunch of > weird corner cases in distutils/setuptools. E.g., if setup.py uses > plain distutils, then it turns out this will mangle numpy version > numbers in ways that cause weird horribleness -- see [1] for a bug > report of the form "matplotlib doesn't build anymore" which turned out > to be because of using 'setup.py install' to install numpy. OTOH if > setup.py uses setuptools then you get different weirdnesses, like you > can easily end up with multiple versions of the same library installed > simultaneously. > > And finally, an advantage of getting used to using 'pip install .' now > is that you'll be prepared for the glorious future when we kill > distutils and get rid of setup.py entirely in favor of something less > terrible [2]. > > So a proposal that came out of the discussion in [1] is that we modify > numpy's setup.py now so that if you try running > > python setup.py install > > you get > > Error: Calling 'setup.py install' directly is NOT SUPPORTED! > Instead, do: > > pip install . > > Alternatively, if you want to proceed at your own risk, you > can try 'setup.py install --force-raw-setup.py' > For more information see http://... > > (Other setup.py commands would continue to work as normal.) > > I believe that this would also break both 'easy_install numpy', and > attempts to install numpy via the setup_requires= argument to > setuptools.setup (because setup_requires= implicitly calls > easy_install). install_requires= would *not* be affected, and > setup_requires= would still be fine in cases where numpy was already > installed. > > This would hopefully cut down on the amount of time everyone spends > trying to track down these stupid weird bugs, but it will also require > some adjustment in people's workflows, so... objections? concerns? > I gave it a shot the other day. 
Pip keeps a record of the path to the repo and in order to cleanup I needed to search out the file and delete the repo path. There is probably a better way to do that, but it didn't strike me as less troublesome than ` python setup.py install --local`. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Oct 27 02:08:07 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 26 Oct 2015 23:08:07 -0700 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Mon, Oct 26, 2015 at 11:03 PM, Charles R Harris wrote: > [...] > I gave it a shot the other day. Pip keeps a record of the path to the repo > and in order to cleanup I needed to search out the file and delete the repo > path. There is probably a better way to do that, but it didn't strike me as > less troublesome than ` python setup.py install --local`. Sorry, what did you "give a shot", and what problem did it create? What does `setup.py install --local` do? (it doesn't seem to be mentioned in `setup.py install --help`.) -n -- Nathaniel J. Smith -- http://vorpus.org From charlesr.harris at gmail.com Tue Oct 27 02:33:01 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 27 Oct 2015 00:33:01 -0600 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Tue, Oct 27, 2015 at 12:08 AM, Nathaniel Smith wrote: > On Mon, Oct 26, 2015 at 11:03 PM, Charles R Harris > wrote: > > > [...] > > I gave it a shot the other day. Pip keeps a record of the path to the > repo > > and in order to cleanup I needed to search out the file and delete the > repo > > path. There is probably a better way to do that, but it didn't strike me > as > > less troublesome than ` python setup.py install --local`. > > Sorry, what did you "give a shot", and what problem did it create? > What does `setup.py install --local` do? (it doesn't seem to be > mentioned in `setup.py install --help`.) > `pip install --user -e . `. However, `pip install --user .` seems to work fine. The pip documentation isn't the best. Yeah, `--user` not `--local`. It's getting late... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jni.soma at gmail.com Tue Oct 27 02:53:09 2015 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Tue, 27 Oct 2015 06:53:09 +0000 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: Is there a pip equivalent of "python setup.py develop"? On Tue, Oct 27, 2015 at 5:33 PM Charles R Harris wrote: > On Tue, Oct 27, 2015 at 12:08 AM, Nathaniel Smith wrote: > >> On Mon, Oct 26, 2015 at 11:03 PM, Charles R Harris >> wrote: >> > >> [...] >> > I gave it a shot the other day. Pip keeps a record of the path to the >> repo >> > and in order to cleanup I needed to search out the file and delete the >> repo >> > path. There is probably a better way to do that, but it didn't strike >> me as >> > less troublesome than ` python setup.py install --local`. >> >> Sorry, what did you "give a shot", and what problem did it create? >> What does `setup.py install --local` do? (it doesn't seem to be >> mentioned in `setup.py install --help`.) >> > > `pip install --user -e . `. However, `pip install --user .` seems to work > fine. The pip documentation isn't the best. 
> > Yeah, `--user` not `--local`. It's getting late... > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Oct 27 03:18:21 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 27 Oct 2015 00:18:21 -0700 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Mon, Oct 26, 2015 at 11:33 PM, Charles R Harris wrote: > > > On Tue, Oct 27, 2015 at 12:08 AM, Nathaniel Smith wrote: >> >> On Mon, Oct 26, 2015 at 11:03 PM, Charles R Harris >> wrote: >> > >> [...] >> > I gave it a shot the other day. Pip keeps a record of the path to the >> > repo >> > and in order to cleanup I needed to search out the file and delete the >> > repo >> > path. There is probably a better way to do that, but it didn't strike me >> > as >> > less troublesome than ` python setup.py install --local`. >> >> Sorry, what did you "give a shot", and what problem did it create? >> What does `setup.py install --local` do? (it doesn't seem to be >> mentioned in `setup.py install --help`.) > > > `pip install --user -e . `. However, `pip install --user .` seems to work > fine. The pip documentation isn't the best. > > Yeah, `--user` not `--local`. It's getting late... Ah. I think if you want to undo a `pip install --user -e .` you can just do `pip uninstall numpy`? But in any case the equivalent of 'pip install -e' is 'setup.py develop', and the proposal is only to disable `setup.py install`. So if you prefer `setup.py develop` then you could still use it. (IIUC `pip install -e` is a *very* thin wrapper around `setup.py develop` -- normally pip takes over the job of actually installing files, so pip gets to interpret options like --user and figure out what it means for where it puts the files, but in the case of `install -e` it looks like it just calls `setup.py develop` directly, so if --user doesn't work it's probably because setup.py develop doesn't know about --user?) -n -- Nathaniel J. Smith -- http://vorpus.org From ralf.gommers at gmail.com Tue Oct 27 03:19:04 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 27 Oct 2015 08:19:04 +0100 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Tue, Oct 27, 2015 at 6:44 AM, Nathaniel Smith wrote: > On Mon, Oct 26, 2015 at 9:31 PM, Nathaniel Smith wrote: > [...] > > I believe that this would also break both 'easy_install numpy', and > > attempts to install numpy via the setup_requires= argument to > > setuptools.setup (because setup_requires= implicitly calls > > easy_install). install_requires= would *not* be affected, and > > setup_requires= would still be fine in cases where numpy was already > > installed. > > On further investigation, it looks like the simplest approach to doing > this would actually treat easy_install and setup_requires= the same > way as they treat pip, i.e., they would all be allowed. (I was > misreading some particularly confusing code in setuptools.) > > It also looks like easy_installed packages can at least be safely > upgraded, so I guess allowing this is okay :-). I just discovered https://bitbucket.org/dholth/setup-requires, which ensures that setup_requires uses pip instead of easy_install. 
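(For anyone not following the setuptools details: the pattern we're talking about is a downstream project declaring numpy in its setup.py, roughly like this -- an illustrative sketch, the package name is made up:

    from setuptools import setup

    setup(
        name="some-downstream-package",
        setup_requires=["numpy"],    # needed at build time, e.g. for numpy.get_include()
        install_requires=["numpy"],  # needed at run time
    )

It is the setup_requires half that stock setuptools services through easy_install, and that the trick above routes through pip instead.) 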
So we can not only keep setup-requires working, but make it work significantly better. So if/when we accept the proposal in this thread, I'm thinking we should make a bunch of changes at once: - always use setuptools (this is a new dependency) - error on ``python setup.py install`` - add the setup-requires trick - error on ``python setup.py clean`` (saying "use `git clean -xdf` (or -Xdf ...) instead") - change ``python setup.py --help`` to first show numpy-specific stuff before setuptools help info - update all our install docs And when "pip upgrade" is released (should be soon, see https://github.com/pypa/pip/pull/3194), officially change our mind and recommend the use of install_requires/setup_requires to packages depending on numpy. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Oct 27 03:28:05 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 27 Oct 2015 00:28:05 -0700 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Tue, Oct 27, 2015 at 12:19 AM, Ralf Gommers wrote: > > > On Tue, Oct 27, 2015 at 6:44 AM, Nathaniel Smith wrote: >> >> On Mon, Oct 26, 2015 at 9:31 PM, Nathaniel Smith wrote: >> [...] >> > I believe that this would also break both 'easy_install numpy', and >> > attempts to install numpy via the setup_requires= argument to >> > setuptools.setup (because setup_requires= implicitly calls >> > easy_install). install_requires= would *not* be affected, and >> > setup_requires= would still be fine in cases where numpy was already >> > installed. >> >> On further investigation, it looks like the simplest approach to doing >> this would actually treat easy_install and setup_requires= the same >> way as they treat pip, i.e., they would all be allowed. (I was >> misreading some particularly confusing code in setuptools.) >> >> It also looks like easy_installed packages can at least be safely >> upgraded, so I guess allowing this is okay :-). > > > I just discovered https://bitbucket.org/dholth/setup-requires, which ensures > that setup_requires uses pip instead of easy_install. So we can not only > keep setup-requires working, but make it work significantly better. IIUC this is not something that we (= numpy) could use ourselves, but instead something that everyone who does setup_requires=["numpy"] would have to set up in their individual projects? -n -- Nathaniel J. Smith -- http://vorpus.org From njs at pobox.com Tue Oct 27 03:31:09 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 27 Oct 2015 00:31:09 -0700 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Mon, Oct 26, 2015 at 11:53 PM, Juan Nunez-Iglesias wrote: > Is there a pip equivalent of "python setup.py develop"? Kinda answered this already when replying to Chuck, but: yes, it's `pip install -e ` (the -e is short for --editable), not that you would need it necessarily because `setup.py develop` would still be legal. -n -- Nathaniel J. Smith -- http://vorpus.org From ralf.gommers at gmail.com Tue Oct 27 03:32:00 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 27 Oct 2015 08:32:00 +0100 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' 
instead In-Reply-To: References: Message-ID: On Tue, Oct 27, 2015 at 8:28 AM, Nathaniel Smith wrote: > On Tue, Oct 27, 2015 at 12:19 AM, Ralf Gommers > wrote: > > > > > > On Tue, Oct 27, 2015 at 6:44 AM, Nathaniel Smith wrote: > >> > >> On Mon, Oct 26, 2015 at 9:31 PM, Nathaniel Smith wrote: > >> [...] > >> > I believe that this would also break both 'easy_install numpy', and > >> > attempts to install numpy via the setup_requires= argument to > >> > setuptools.setup (because setup_requires= implicitly calls > >> > easy_install). install_requires= would *not* be affected, and > >> > setup_requires= would still be fine in cases where numpy was already > >> > installed. > >> > >> On further investigation, it looks like the simplest approach to doing > >> this would actually treat easy_install and setup_requires= the same > >> way as they treat pip, i.e., they would all be allowed. (I was > >> misreading some particularly confusing code in setuptools.) > >> > >> It also looks like easy_installed packages can at least be safely > >> upgraded, so I guess allowing this is okay :-). > > > > > > I just discovered https://bitbucket.org/dholth/setup-requires, which > ensures > > that setup_requires uses pip instead of easy_install. So we can not only > > keep setup-requires working, but make it work significantly better. > > IIUC this is not something that we (= numpy) could use ourselves, but > instead something that everyone who does setup_requires=["numpy"] > would have to set up in their individual projects? > Right. I was thinking about using it in scipy. Ah well, I'm sure we can manage to not break ``setup_requires=numpy`` in some way. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Oct 27 09:07:56 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 27 Oct 2015 09:07:56 -0400 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Tue, Oct 27, 2015 at 3:32 AM, Ralf Gommers wrote: > > > On Tue, Oct 27, 2015 at 8:28 AM, Nathaniel Smith wrote: > >> On Tue, Oct 27, 2015 at 12:19 AM, Ralf Gommers >> wrote: >> > >> > >> > On Tue, Oct 27, 2015 at 6:44 AM, Nathaniel Smith wrote: >> >> >> >> On Mon, Oct 26, 2015 at 9:31 PM, Nathaniel Smith >> wrote: >> >> [...] >> >> > I believe that this would also break both 'easy_install numpy', and >> >> > attempts to install numpy via the setup_requires= argument to >> >> > setuptools.setup (because setup_requires= implicitly calls >> >> > easy_install). install_requires= would *not* be affected, and >> >> > setup_requires= would still be fine in cases where numpy was already >> >> > installed. >> >> >> >> On further investigation, it looks like the simplest approach to doing >> >> this would actually treat easy_install and setup_requires= the same >> >> way as they treat pip, i.e., they would all be allowed. (I was >> >> misreading some particularly confusing code in setuptools.) >> >> >> >> It also looks like easy_installed packages can at least be safely >> >> upgraded, so I guess allowing this is okay :-). >> > >> > >> > I just discovered https://bitbucket.org/dholth/setup-requires, which >> ensures >> > that setup_requires uses pip instead of easy_install. So we can not only >> > keep setup-requires working, but make it work significantly better. 
>> >> IIUC this is not something that we (= numpy) could use ourselves, but >> instead something that everyone who does setup_requires=["numpy"] >> would have to set up in their individual projects? >> > > Right. I was thinking about using it in scipy. Ah well, I'm sure we can > manage to not break ``setup_requires=numpy`` in some way. > What's the equivalent of python setup.py build_ext --inplace brief google search (I didn't follow up on those) https://github.com/pypa/pip/issues/1887 https://github.com/pypa/pip/issues/18 Given that I rely completely on binary distributions for numpy and scipy, I won't be affected. (I'm still allergic to pip and will switch only several years after everybody else.) Josef > > Ralf > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jehturner at gmail.com Tue Oct 27 09:48:21 2015 From: jehturner at gmail.com (James E.H. Turner) Date: Tue, 27 Oct 2015 10:48:21 -0300 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: <562F80A5.6060004@gmail.com> > Apparently it is not well known that if you have a Python project > source tree (e.g., a numpy checkout), then the correct way to install > it is NOT to type > > python setup.py install # bad and broken! > > but rather to type > > pip install . Though I haven't studied it exhaustively, it always seems to me that pip is bad & broken, whereas python setup.py install does what I expect (even if it's a mess internally). In particular, when maintaining a distribution of Python packages, you try to have some well-defined, reproducible build from source tarballs and then you find that pip is going off and downloading stuff under the radar without being asked (etc.). Stopping that can be a pain & I always groan whenever some package insists on using pip. Maybe I don't understand it well enough but in this role its dependency handling is an unnecessary complication with no purpose. Just a comment that not every installation is someone trying to get numpy on their laptop... Cheers, James. From ben.v.root at gmail.com Tue Oct 27 10:30:08 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Tue, 27 Oct 2015 10:30:08 -0400 Subject: [Numpy-discussion] deprecate fromstring() for text reading? In-Reply-To: <-1464708838107245522@unknownmsgid> References: <2283704104052164280@unknownmsgid> <-1464708838107245522@unknownmsgid> Message-ID: FWIW, when I needed a fast Fixed Width reader for a very large dataset last year, I found that np.genfromtext() was faster than pandas' read_fwf(). IIRC, pandas' text reading code fell back to pure python for fixed width scenarios. On Fri, Oct 23, 2015 at 8:22 PM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: > Grabbing the pandas csv reader would be great, and I hope it happens > sooner than later, though alas, I haven't the spare cycles for it either. > > In the meantime though, can we put a deprecation Warning in when using > fromstring() on text files? It's really pretty broken. 
> > -Chris > > On Oct 23, 2015, at 4:02 PM, Jeff Reback wrote: > > > > On Oct 23, 2015, at 6:49 PM, Nathaniel Smith wrote: > > On Oct 23, 2015 3:30 PM, "Jeff Reback" wrote: > > > > On Oct 23, 2015, at 6:13 PM, Charles R Harris > wrote: > > > >> > >> > >> On Thu, Oct 22, 2015 at 5:47 PM, Chris Barker - NOAA Federal < > chris.barker at noaa.gov> wrote: > >>> > >>> > >>>> I think it would be good to keep the usage to read binary data at > least. > >>> > >>> > >>> Agreed -- it's only the text file reading I'm proposing to deprecate. > It was kind of weird to cram it in there in the first place. > >>> > >>> Oh, fromfile() has the same issues. > >>> > >>> Chris > >>> > >>> > >>>> Or is there a good alternative to `np.fromstring(, > dtype=...)`? -- Marten > >>>> > >>>> On Thu, Oct 22, 2015 at 1:03 PM, Chris Barker > wrote: > >>>>> > >>>>> There was just a question about a bug/issue with scipy.fromstring > (which is numpy.fromstring) when used to read integers from a text file. > >>>>> > >>>>> https://mail.scipy.org/pipermail/scipy-user/2015-October/036746.html > >>>>> > >>>>> fromstring() is bugging and inflexible for reading text files -- and > it is a very, very ugly mess of code. I dug into it a while back, and gave > up -- just to much of a mess! > >>>>> > >>>>> So we really should completely re-implement it, or deprecate it. I > doubt anyone is going to do a big refactor, so that means deprecating it. > >>>>> > >>>>> Also -- if we do want a fast read numbers from text files function > (which would be nice, actually), it really should get a new name anyway. > >>>>> > >>>>> (and the hopefully coming new dtype system would make it easier to > write cleanly) > >>>>> > >>>>> I'm not sure what deprecating something means, though -- have it > raise a deprecation warning in the next version? > >>>>> > >> > >> There was discussion at SciPy 2015 of separating out the text reading > abilities of Pandas so that numpy could include it. We should contact Jeff > Rebeck and see about moving that forward. > > > > > > IIRC Thomas Caswell was interested in doing this :) > > When he was in Berkeley a few weeks ago he assured me that every night > since SciPy he has dutifully been feeling guilty about not having done it > yet. I think this week his paltry excuse is that he's "on his honeymoon" or > something. > > ...which is to say that if someone has some spare cycles to take this over > then I think that might be a nice wedding present for him :-). > > (The basic idea is to take the text reading backend behind pandas.read_csv > and extract it into a standalone package that pandas could depend on, and > that could also be used by other packages like numpy (among others -- I > thing dato's SFrame package has a fork of this code as well?)) > > -n > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > I can certainly provide guidance on how/what to extract but don't have > spare cycles myself for this :( > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From edisongustavo at gmail.com Tue Oct 27 10:31:56 2015 From: edisongustavo at gmail.com (Edison Gustavo Muenz) Date: Tue, 27 Oct 2015 12:31:56 -0200 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: <562F80A5.6060004@gmail.com> References: <562F80A5.6060004@gmail.com> Message-ID: I'm sorry if this is out-of-topic, but I'm curious on why nobody mentioned Conda yet. Is there any particular reason for not using it? On Tue, Oct 27, 2015 at 11:48 AM, James E.H. Turner wrote: > Apparently it is not well known that if you have a Python project >> source tree (e.g., a numpy checkout), then the correct way to install >> it is NOT to type >> >> python setup.py install # bad and broken! >> >> but rather to type >> >> pip install . >> > > Though I haven't studied it exhaustively, it always seems to me that > pip is bad & broken, whereas python setup.py install does what I > expect (even if it's a mess internally). In particular, when > maintaining a distribution of Python packages, you try to have some > well-defined, reproducible build from source tarballs and then you > find that pip is going off and downloading stuff under the radar > without being asked (etc.). Stopping that can be a pain & I always > groan whenever some package insists on using pip. Maybe I don't > understand it well enough but in this role its dependency handling > is an unnecessary complication with no purpose. Just a comment that > not every installation is someone trying to get numpy on their > laptop... > > Cheers, > > James. > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Tue Oct 27 10:46:28 2015 From: cournape at gmail.com (David Cournapeau) Date: Tue, 27 Oct 2015 14:46:28 +0000 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: <562F80A5.6060004@gmail.com> Message-ID: On Tue, Oct 27, 2015 at 2:31 PM, Edison Gustavo Muenz < edisongustavo at gmail.com> wrote: > I'm sorry if this is out-of-topic, but I'm curious on why nobody mentioned > Conda yet. > Conda is a binary distribution system, whereas we are talking about installing from sources. You will need a way to install things when building a conda package in any case David > Is there any particular reason for not using it? > > On Tue, Oct 27, 2015 at 11:48 AM, James E.H. Turner > wrote: > >> Apparently it is not well known that if you have a Python project >>> source tree (e.g., a numpy checkout), then the correct way to install >>> it is NOT to type >>> >>> python setup.py install # bad and broken! >>> >>> but rather to type >>> >>> pip install . >>> >> >> Though I haven't studied it exhaustively, it always seems to me that >> pip is bad & broken, whereas python setup.py install does what I >> expect (even if it's a mess internally). In particular, when >> maintaining a distribution of Python packages, you try to have some >> well-defined, reproducible build from source tarballs and then you >> find that pip is going off and downloading stuff under the radar >> without being asked (etc.). Stopping that can be a pain & I always >> groan whenever some package insists on using pip. 
Maybe I don't >> understand it well enough but in this role its dependency handling >> is an unnecessary complication with no purpose. Just a comment that >> not every installation is someone trying to get numpy on their >> laptop... >> >> Cheers, >> >> James. >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Tue Oct 27 10:47:01 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Tue, 27 Oct 2015 10:47:01 -0400 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: <562F80A5.6060004@gmail.com> Message-ID: Conda is for binary installs and largely targeted for end-users. This topic pertains to source installs, and is mostly relevant to developers, testers, and those who like to live on the bleeding edge of a particular project. On Tue, Oct 27, 2015 at 10:31 AM, Edison Gustavo Muenz < edisongustavo at gmail.com> wrote: > I'm sorry if this is out-of-topic, but I'm curious on why nobody mentioned > Conda yet. > > Is there any particular reason for not using it? > > On Tue, Oct 27, 2015 at 11:48 AM, James E.H. Turner > wrote: > >> Apparently it is not well known that if you have a Python project >>> source tree (e.g., a numpy checkout), then the correct way to install >>> it is NOT to type >>> >>> python setup.py install # bad and broken! >>> >>> but rather to type >>> >>> pip install . >>> >> >> Though I haven't studied it exhaustively, it always seems to me that >> pip is bad & broken, whereas python setup.py install does what I >> expect (even if it's a mess internally). In particular, when >> maintaining a distribution of Python packages, you try to have some >> well-defined, reproducible build from source tarballs and then you >> find that pip is going off and downloading stuff under the radar >> without being asked (etc.). Stopping that can be a pain & I always >> groan whenever some package insists on using pip. Maybe I don't >> understand it well enough but in this role its dependency handling >> is an unnecessary complication with no purpose. Just a comment that >> not every installation is someone trying to get numpy on their >> laptop... >> >> Cheers, >> >> James. >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Oct 27 10:59:48 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 27 Oct 2015 07:59:48 -0700 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Oct 27, 2015 6:08 AM, wrote: > [...] 
> > > What's the equivalent of > python setup.py build_ext --inplace It's python setup.py build_ext --inplace ;-) There's also no replacement for setup.py sdist, or setup.py upload (which is broken and should never be used), or setup.py clean (which is also broken and should never be used in numpy's case). pip is a better package installer than raw distutils or setuptools, for non-installation-related tasks it has nothing to offer. (With the partial exception of 'pip wheel'.) -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Oct 27 11:18:29 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 27 Oct 2015 08:18:29 -0700 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: <562F80A5.6060004@gmail.com> References: <562F80A5.6060004@gmail.com> Message-ID: On Oct 27, 2015 6:48 AM, "James E.H. Turner" wrote: >> >> Apparently it is not well known that if you have a Python project >> source tree (e.g., a numpy checkout), then the correct way to install >> it is NOT to type >> >> python setup.py install # bad and broken! >> >> but rather to type >> >> pip install . > > > Though I haven't studied it exhaustively, it always seems to me that > pip is bad & broken, whereas python setup.py install does what I > expect (even if it's a mess internally). Unfortunately this is only true if what you expect is for packages to be installed in subtly corrupted ways, as described in the original email. Sorry to be the bearer of bad tidings :-/ > In particular, when > maintaining a distribution of Python packages, you try to have some > well-defined, reproducible build from source tarballs and then you > find that pip is going off and downloading stuff under the radar > without being asked (etc.). Stopping that can be a pain & I always > groan whenever some package insists on using pip. Maybe I don't > understand it well enough but in this role its dependency handling > is an unnecessary complication with no purpose. There are two cases where a 'pip install' run might go off and start downloading packages without asking you: - if the project is using setuptools with setup_requires=..., then the setup.py itself will go off and start downloading things without asking. This has nothing to do with pip. The way Debian prevents this is that they always define an intentionally invalid http_proxy environment variable before building any python package. - if the project has declared that they do not work without some other package installed via install_requires=... For this case, if you really know what you're doing and you intentionally want to install a non-functional configuration (which yeah, a package build tool might indeed want to do), then just add --no-deps to the pip install command line. Maybe add --no-index and/or the magic http_proxy setting if you want to be extra sure. > Just a comment that > not every installation is someone trying to get numpy on their > laptop... Sure, we're well aware of the importance of downstream packagers -- part of the point of having this email thread is to smoke out such non-trivial use cases. (And note that worst case if you decide that you'd rather take your chances with setup.py install, then that's why the proposal includes an escape hatch of passing a special --force switch.) 
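(And to make the distro-packaging scenario concrete: a reproducible, network-free install from an unpacked source tree can be spelled roughly like this -- an illustrative sketch, the exact flag spellings depend on your pip version:

    $ export http_proxy=http://nonexistent.invalid/   # the Debian-style guard mentioned above
    $ pip install --no-deps --no-index .

--no-deps skips dependency resolution entirely and --no-index keeps pip away from PyPI, so the only thing that gets built and installed is the tree you pointed it at.) 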
But unless you're somehow planning to disable pip entirely in your distribution, so that end users have to get upgrades through your tools rather than using pip, then you do probably want to think about how to provide accurate pip-style metadata. (And even then it doesn't hurt.) -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From nathan12343 at gmail.com Tue Oct 27 11:25:30 2015 From: nathan12343 at gmail.com (Nathan Goldbaum) Date: Tue, 27 Oct 2015 10:25:30 -0500 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: <562F80A5.6060004@gmail.com> Message-ID: Interestingly, conda actually does "setup.py install" in the recipe for numpy: https://github.com/conda/conda-recipes/blob/master/numpy-openblas/build.sh I'm not sure if this is the one they use to build the anaconda package, I think they have internal versions of most of the recipes on conda-recipes. On Tue, Oct 27, 2015 at 10:18 AM, Nathaniel Smith wrote: > On Oct 27, 2015 6:48 AM, "James E.H. Turner" wrote: > >> > >> Apparently it is not well known that if you have a Python project > >> source tree (e.g., a numpy checkout), then the correct way to install > >> it is NOT to type > >> > >> python setup.py install # bad and broken! > >> > >> but rather to type > >> > >> pip install . > > > > > > Though I haven't studied it exhaustively, it always seems to me that > > pip is bad & broken, whereas python setup.py install does what I > > expect (even if it's a mess internally). > > Unfortunately this is only true if what you expect is for packages to be > installed in subtly corrupted ways, as described in the original email. > Sorry to be the bearer of bad tidings :-/ > > > In particular, when > > maintaining a distribution of Python packages, you try to have some > > well-defined, reproducible build from source tarballs and then you > > find that pip is going off and downloading stuff under the radar > > without being asked (etc.). Stopping that can be a pain & I always > > groan whenever some package insists on using pip. Maybe I don't > > understand it well enough but in this role its dependency handling > > is an unnecessary complication with no purpose. > > There are two cases where a 'pip install' run might go off and start > downloading packages without asking you: > > - if the project is using setuptools with setup_requires=..., then the > setup.py itself will go off and start downloading things without asking. > This has nothing to do with pip. The way Debian prevents this is that they > always define an intentionally invalid http_proxy environment variable > before building any python package. > > - if the project has declared that they do not work without some other > package installed via install_requires=... For this case, if you really > know what you're doing and you intentionally want to install a > non-functional configuration (which yeah, a package build tool might indeed > want to do), then just add --no-deps to the pip install command line. Maybe > add --no-index and/or the magic http_proxy setting if you want to be extra > sure. > > > Just a comment that > > not every installation is someone trying to get numpy on their > > laptop... > > Sure, we're well aware of the importance of downstream packagers -- part > of the point of having this email thread is to smoke out such non-trivial > use cases. 
(And note that worst case if you decide that you'd rather take > your chances with setup.py install, then that's why the proposal includes > an escape hatch of passing a special --force switch.) > > But unless you're somehow planning to disable pip entirely in your > distribution, so that end users have to get upgrades through your tools > rather than using pip, then you do probably want to think about how to > provide accurate pip-style metadata. (And even then it doesn't hurt.) > > -n > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Oct 27 11:28:52 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 27 Oct 2015 11:28:52 -0400 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Tue, Oct 27, 2015 at 10:59 AM, Nathaniel Smith wrote: > On Oct 27, 2015 6:08 AM, wrote: > > > [...] > > > > > > What's the equivalent of > > python setup.py build_ext --inplace > > It's > python setup.py build_ext --inplace > > ;-) > Ok, Sorry, I read now the small print and the issue. Sounds reasonable, given we can `force` our way out. (If the reason to run to pip is a misspelled dev version number, then it looks like a hammer to me.) Josef > There's also no replacement for setup.py sdist, or setup.py upload (which > is broken and should never be used), or setup.py clean (which is also > broken and should never be used in numpy's case). pip is a better package > installer than raw distutils or setuptools, for non-installation-related > tasks it has nothing to offer. (With the partial exception of 'pip wheel'.) > > -n > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nathan12343 at gmail.com Tue Oct 27 11:33:37 2015 From: nathan12343 at gmail.com (Nathan Goldbaum) Date: Tue, 27 Oct 2015 10:33:37 -0500 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: Would this happen at the level of numpy's setup.py script or would it be implemented in numpy.distutils? I'm asking as the developer of a package that uses numpy.distutils to manage C extensions. On Tue, Oct 27, 2015 at 10:28 AM, wrote: > > > On Tue, Oct 27, 2015 at 10:59 AM, Nathaniel Smith wrote: > >> On Oct 27, 2015 6:08 AM, wrote: >> > >> [...] >> > >> > >> > What's the equivalent of >> > python setup.py build_ext --inplace >> >> It's >> python setup.py build_ext --inplace >> >> ;-) >> > Ok, Sorry, I read now the small print and the issue. > > Sounds reasonable, given we can `force` our way out. > > (If the reason to run to pip is a misspelled dev version number, then it > looks like a hammer to me.) > > Josef > > > >> There's also no replacement for setup.py sdist, or setup.py upload (which >> is broken and should never be used), or setup.py clean (which is also >> broken and should never be used in numpy's case). pip is a better package >> installer than raw distutils or setuptools, for non-installation-related >> tasks it has nothing to offer. (With the partial exception of 'pip wheel'.) 
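(A rough summary of the commands implied in this exchange, sketched for illustration only -- the first two are untouched by the proposal, it is only the install step that changes:)

    python setup.py build_ext --inplace   # in-place build for development, still plain distutils/setuptools
    pip wheel .                           # build a wheel; the one packaging task pip does cover
    pip install .                         # install from the source tree, instead of setup.py install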
>> >> -n >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Oct 27 14:40:42 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 27 Oct 2015 11:40:42 -0700 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Oct 27, 2015 8:34 AM, "Nathan Goldbaum" wrote: > > Would this happen at the level of numpy's setup.py script or would it be implemented in numpy.distutils? I'm asking as the developer of a package that uses numpy.distutils to manage C extensions. NumPy's setup.py, no effect on numpy.distutils users. Unless you also get fed up and implement the same thing, of course ;-). -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Oct 27 17:54:10 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 27 Oct 2015 22:54:10 +0100 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Tue, Oct 27, 2015 at 4:28 PM, wrote: > > > > On Tue, Oct 27, 2015 at 10:59 AM, Nathaniel Smith wrote: > >> On Oct 27, 2015 6:08 AM, wrote: >> > >> [...] >> > >> > >> > What's the equivalent of >> > python setup.py build_ext --inplace >> >> It's >> python setup.py build_ext --inplace >> >> ;-) >> > Ok, Sorry, I read now the small print and the issue. > > Sounds reasonable, given we can `force` our way out. > > (If the reason to run to pip is a misspelled dev version number, then it > looks like a hammer to me.) > That's not the reason. A main reason is that we want reliable uninstall of numpy, as explained in the initial post. We also want to avoid as many other broken parts of setuptools/easy_install as possible. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Oct 27 18:16:56 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 27 Oct 2015 23:16:56 +0100 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Tue, Oct 27, 2015 at 8:19 AM, Ralf Gommers wrote: Updating this list for comments made after I sent it and now that I've looked in more detail at what the less common commands do: > So if/when we accept the proposal in this thread, I'm thinking we should > make a bunch of changes at once: > - always use setuptools (this is a new dependency) > - error on ``python setup.py install`` > (removed the item about setup_requires, relevant for scipy but not numpy) > - error on ``python setup.py clean`` (saying "use `git clean -xdf` (or > -Xdf ...) 
instead") > - change ``python setup.py --help`` to first show numpy-specific stuff > before setuptools help info > - update all our install docs > - error on ``python setup.py upload`` (saying "use `twine upload -s` instead") - error on ``python setup.py upload_docs`` - error on ``python setup.py easy_install`` (I'm not joking, that exists) - error on ``python setup.py test`` (saying "use `python runtests.py` instead") - remove setupegg.py Ralf And when "pip upgrade" is released (should be soon, see > https://github.com/pypa/pip/pull/3194), officially change our mind and > recommend the use of install_requires/setup_requires to packages depending > on numpy. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jni.soma at gmail.com Tue Oct 27 18:35:50 2015 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Tue, 27 Oct 2015 15:35:50 -0700 (PDT) Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: <1445985349990.b3ed8378@Nodemailer> Can someone here who understands more about distribution maybe write a blog post detailing: - why these setup.py commands are bad - which alternative corresponds to each command and why it's better - where to find information about this For example, I had never heard of "twine", and parenthetical statements such as "setup.py upload (which is broken and should never be used)" are useless to those who don't know this and useless to those who do. I understand that this is an "internal" discussion, but it's nice if those following to learn can get quick pointers. Since there is a *ton* of material online telling us *to use* python setup.py install, all the time, it would be extremely helpful for the community if discussions such as this one helped to bubble up the Right Way of doing Python packaging and distribution. Thanks, Juan. On Wed, Oct 28, 2015 at 9:16 AM, Ralf Gommers wrote: > On Tue, Oct 27, 2015 at 8:19 AM, Ralf Gommers > wrote: > Updating this list for comments made after I sent it and now that I've > looked in more detail at what the less common commands do: >> So if/when we accept the proposal in this thread, I'm thinking we should >> make a bunch of changes at once: >> - always use setuptools (this is a new dependency) >> - error on ``python setup.py install`` >> > (removed the item about setup_requires, relevant for scipy but not numpy) >> - error on ``python setup.py clean`` (saying "use `git clean -xdf` (or >> -Xdf ...) instead") >> - change ``python setup.py --help`` to first show numpy-specific stuff >> before setuptools help info >> - update all our install docs >> > - error on ``python setup.py upload`` (saying "use `twine upload -s` > instead") > - error on ``python setup.py upload_docs`` > - error on ``python setup.py easy_install`` (I'm not joking, that exists) > - error on ``python setup.py test`` (saying "use `python runtests.py` > instead") > - remove setupegg.py > Ralf > And when "pip upgrade" is released (should be soon, see >> https://github.com/pypa/pip/pull/3194), officially change our mind and >> recommend the use of install_requires/setup_requires to packages depending >> on numpy. >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at gmail.com Tue Oct 27 19:02:26 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 28 Oct 2015 00:02:26 +0100 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: <1445985349990.b3ed8378@Nodemailer> References: <1445985349990.b3ed8378@Nodemailer> Message-ID: On Tue, Oct 27, 2015 at 11:35 PM, Juan Nunez-Iglesias wrote: > Can someone here who understands more about distribution maybe write a > blog post detailing: > > - why these setup.py commands are bad > - which alternative corresponds to each command and why it's better > - where to find information about this > Good question. Not that I have a blog, but I can try to write something a bit longer the coming weekend. > > For example, I had never heard of "twine", and parenthetical statements > such as "setup.py upload (which is broken and should never be used)" are > useless to those who don't know this and useless to those who do. > IIRC `setup.py upload` sends passwords over plain http. I've also seen it do weird things like change one's own PyPi rights from maintainer to owner. The most comprehensive overview of all this stuff is https://packaging.python.org/en/latest/, which starts with tool recommendations. Twine is one of the first things mentioned. Ralf > I understand that this is an "internal" discussion, but it's nice if those > following to learn can get quick pointers. Since there is a *ton* of > material online telling us *to use* python setup.py install, all the time, > it would be extremely helpful for the community if discussions such as this > one helped to bubble up the Right Way of doing Python packaging and > distribution. > > Thanks, > > Juan. > > > > > > On Wed, Oct 28, 2015 at 9:16 AM, Ralf Gommers > wrote: > >> >> >> On Tue, Oct 27, 2015 at 8:19 AM, Ralf Gommers >> wrote: >> >> Updating this list for comments made after I sent it and now that I've >> looked in more detail at what the less common commands do: >> >> >>> So if/when we accept the proposal in this thread, I'm thinking we should >>> make a bunch of changes at once: >>> - always use setuptools (this is a new dependency) >>> - error on ``python setup.py install`` >>> >> >> (removed the item about setup_requires, relevant for scipy but not numpy) >> >> >>> - error on ``python setup.py clean`` (saying "use `git clean -xdf` (or >>> -Xdf ...) instead") >>> - change ``python setup.py --help`` to first show numpy-specific stuff >>> before setuptools help info >>> - update all our install docs >>> >> >> - error on ``python setup.py upload`` (saying "use `twine upload -s` >> instead") >> - error on ``python setup.py upload_docs`` >> - error on ``python setup.py easy_install`` (I'm not joking, that exists) >> - error on ``python setup.py test`` (saying "use `python runtests.py` >> instead") >> - remove setupegg.py >> >> Ralf >> >> And when "pip upgrade" is released (should be soon, see >>> https://github.com/pypa/pip/pull/3194), officially change our mind and >>> recommend the use of install_requires/setup_requires to packages depending >>> on numpy. >>> >>> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jni.soma at gmail.com Tue Oct 27 20:04:40 2015 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Tue, 27 Oct 2015 17:04:40 -0700 (PDT) Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: <1445990680204.47cd9173@Nodemailer> Thanks Ralf! The pointer to Python Packaging User Guide is already gold! (But a wider discussion e.g. in the NumPy repo, mirroring the docstring conventions, would also be good!) On Wed, Oct 28, 2015 at 10:02 AM, Ralf Gommers wrote: > On Tue, Oct 27, 2015 at 11:35 PM, Juan Nunez-Iglesias > wrote: >> Can someone here who understands more about distribution maybe write a >> blog post detailing: >> >> - why these setup.py commands are bad >> - which alternative corresponds to each command and why it's better >> - where to find information about this >> > Good question. Not that I have a blog, but I can try to write something a > bit longer the coming weekend. >> >> For example, I had never heard of "twine", and parenthetical statements >> such as "setup.py upload (which is broken and should never be used)" are >> useless to those who don't know this and useless to those who do. >> > IIRC `setup.py upload` sends passwords over plain http. I've also seen it > do weird things like change one's own PyPi rights from maintainer to owner. > The most comprehensive overview of all this stuff is > https://packaging.python.org/en/latest/, which starts with tool > recommendations. Twine is one of the first things mentioned. > Ralf >> I understand that this is an "internal" discussion, but it's nice if those >> following to learn can get quick pointers. Since there is a *ton* of >> material online telling us *to use* python setup.py install, all the time, >> it would be extremely helpful for the community if discussions such as this >> one helped to bubble up the Right Way of doing Python packaging and >> distribution. >> >> Thanks, >> >> Juan. >> >> >> >> >> >> On Wed, Oct 28, 2015 at 9:16 AM, Ralf Gommers >> wrote: >> >>> >>> >>> On Tue, Oct 27, 2015 at 8:19 AM, Ralf Gommers >>> wrote: >>> >>> Updating this list for comments made after I sent it and now that I've >>> looked in more detail at what the less common commands do: >>> >>> >>>> So if/when we accept the proposal in this thread, I'm thinking we should >>>> make a bunch of changes at once: >>>> - always use setuptools (this is a new dependency) >>>> - error on ``python setup.py install`` >>>> >>> >>> (removed the item about setup_requires, relevant for scipy but not numpy) >>> >>> >>>> - error on ``python setup.py clean`` (saying "use `git clean -xdf` (or >>>> -Xdf ...) instead") >>>> - change ``python setup.py --help`` to first show numpy-specific stuff >>>> before setuptools help info >>>> - update all our install docs >>>> >>> >>> - error on ``python setup.py upload`` (saying "use `twine upload -s` >>> instead") >>> - error on ``python setup.py upload_docs`` >>> - error on ``python setup.py easy_install`` (I'm not joking, that exists) >>> - error on ``python setup.py test`` (saying "use `python runtests.py` >>> instead") >>> - remove setupegg.py >>> >>> Ralf >>> >>> And when "pip upgrade" is released (should be soon, see >>>> https://github.com/pypa/pip/pull/3194), officially change our mind and >>>> recommend the use of install_requires/setup_requires to packages depending >>>> on numpy. 
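(For readers following the list quoted above, a hedged sketch of the suggested replacement commands -- twine is a separate tool that has to be installed first, runtests.py is the script shipped in the numpy source tree, and dist/* is only the usual artifact path:)

    git clean -xdf            # instead of: python setup.py clean   (-Xdf to remove only ignored files)
    twine upload -s dist/*    # instead of: python setup.py upload
    python runtests.py        # instead of: python setup.py test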
>>>> >>>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From Jerome.Kieffer at esrf.fr Wed Oct 28 03:36:23 2015 From: Jerome.Kieffer at esrf.fr (Jerome Kieffer) Date: Wed, 28 Oct 2015 08:36:23 +0100 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: <1445985349990.b3ed8378@Nodemailer> References: <1445985349990.b3ed8378@Nodemailer> Message-ID: <20151028083623.46534010@lintaillefer.esrf.fr> On Tue, 27 Oct 2015 15:35:50 -0700 (PDT) "Juan Nunez-Iglesias" wrote: > Can someone here who understands more about distribution maybe write a blog post detailing: Hi, Olivier Grisel from sklearn gave a very good talk on this topic at PyCon, earlier this year: http://www.pyvideo.org/video/3473/build-and-test-wheel-packages-on-linux-osx-win Very instructive. -- J?r?me Kieffer tel +33 476 882 445 From jni.soma at gmail.com Wed Oct 28 05:19:29 2015 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Wed, 28 Oct 2015 02:19:29 -0700 (PDT) Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: <20151028083623.46534010@lintaillefer.esrf.fr> References: <20151028083623.46534010@lintaillefer.esrf.fr> Message-ID: <1446023969271.774cdff2@Nodemailer> Thanks, Jerome! I?ve added it to my to-watch list. It sounds really useful! Juan. On Wed, Oct 28, 2015 at 6:36 PM, Jerome Kieffer wrote: > On Tue, 27 Oct 2015 15:35:50 -0700 (PDT) > "Juan Nunez-Iglesias" wrote: >> Can someone here who understands more about distribution maybe write a blog post detailing: > Hi, > Olivier Grisel from sklearn gave a very good talk on this topic at PyCon, earlier > this year: > http://www.pyvideo.org/video/3473/build-and-test-wheel-packages-on-linux-osx-win > Very instructive. > -- > J?r?me Kieffer > tel +33 476 882 445 > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Oct 28 17:27:00 2015 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 28 Oct 2015 14:27:00 -0700 Subject: [Numpy-discussion] Commit rights for Jonathan J. Helmus Message-ID: Hi all, Jonathan J. Helmus (@jjhelmus) has been given commit rights -- let's all welcome him aboard. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Oct 28 17:48:18 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 28 Oct 2015 22:48:18 +0100 Subject: [Numpy-discussion] Commit rights for Jonathan J. Helmus In-Reply-To: References: Message-ID: On Wed, Oct 28, 2015 at 10:27 PM, Nathaniel Smith wrote: Hi all, > > Jonathan J. Helmus (@jjhelmus) has been given commit rights -- let's all > welcome him aboard. > Welcome Jonathan! And thanks for tackling the numpy.ma backlog in the issue tracker - it can certainly use some attention:) Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at gmail.com Wed Oct 28 18:48:45 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 28 Oct 2015 23:48:45 +0100 Subject: [Numpy-discussion] NumFOCUS fiscal sponsorship agreement In-Reply-To: References: Message-ID: On Mon, Oct 12, 2015 at 11:01 PM, Ralf Gommers wrote: > Hi, > > Thanks Nathaniel and everyone else who contributed for pushing forward > with formalizing Numpy governance and with this FSA. I'm quite excited > about both! > > Before I start commenting on the FSA, I'd like to point out that I'm both > on the numpy steering committee and the NumFOCUS board. I don't see that as > a problem for being involved in the discussions or signing the FSA, however > I will obviously abstain from voting or (non-)consenting in case of a > possible conflict of interest. > > > On Thu, Oct 8, 2015 at 7:57 AM, Nathaniel Smith wrote: > >> Hi all, >> >> Now that the governance document is in place, we need to get our legal >> ducks in a row by signing a fiscal sponsorship agreement with >> NumFOCUS. >> >> The basic idea here is that there are times when you really need some >> kind of corporation to represent the project -- the legal system for >> better or worse does not understand "a bunch of folks on a mailing >> list" as a legal entity capable of accepting donations, > > > Additional clarification: NumFOCUS is a 501(c)3 organization, which means > that in the US donations that are tax-deductable can be made to it (and > hence to Numpy after this FSA is signed). From European or other countries > donations can be made, but they won't be deductable. > > >> or holding >> funds or other assets like domain names. The obvious solution is to >> incorporate a company to represent the project -- but incorporating a >> company involves lots of super-annoying paperwork. (Like, *super* >> annoying.) So a standard trick is that a single non-profit corporation >> acts as an umbrella organization providing these services to multiple >> projects at once, and this is called "fiscal sponsorship". You can >> read more about it here: >> https://en.wikipedia.org/wiki/Fiscal_sponsorship >> >> NumFOCUS's standard comprehensive FSA agreement can be seen here: >> >> >> https://docs.google.com/document/d/11YqMX9UrgfCSgiQEUzmOFyg6Ku-vED6gMxhO6J9lCgg/edit?usp=sharing > > > There's one upcoming change to this FSA: the overhead percentage (now > 4-7%) charged will go up significantly, to around 10-15%. Re4ason: NumFOCUS > cannot cover its admin/legal costs as well as support its projects based on > what the doc says now. This is still at the lower end of the scale for > non-profits, and universities typically charge way more on grants. So I > don't see any issue here, but it's good to know now rather than right after > we sign. > > >> and we have the option of negotiating changes if there's anything we >> don't like. >> >> They also have a FAQ: >> >> https://docs.google.com/document/d/1zdXp07dLvkbqBrDsw96P6mkqxnWzKJuM-1f4408I6Qs/edit?usp=sharing >> >> I've read through the document and didn't see anything that bothered >> me, except that I'm not quite sure how to make the split between the >> steering council and numfocus subcommittee that we have in our >> governance model sync up with their language about the "leadership >> body", and in particular the language in section 10 about simple >> majority votes. So I've queried them about that already. 
>> > > I'd like to clarify that the Numfocus subcommittee is only meant to > facility interaction with NumFOCUS and to ensure that if funds are spent, > they are spent in a way consistent with the mission and non-profit nature > of NumFOCUS. The same applies to possible legal impacts of decisions made > in the Numpy project. > > Regarding the question about the "simple majority votes" language, we can > simply replace that with the appropriate text describing how decisions are > made in the Numpy project. > Hi all, there wasn't much feedback on this FSA, but I want to point out that it's actually quite important for the project. Maybe everyone already thought about this when the governance model was agreed on (it does include a NumFOCUS subcommittee after all), but if not: read / think /ask question fast, because we're moving forward with signing of the agreement with the people listed at http://docs.scipy.org/doc/numpy-dev/dev/governance/people.html#numfocus-subcommittee. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Oct 28 19:26:10 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 28 Oct 2015 17:26:10 -0600 Subject: [Numpy-discussion] Commit rights for Jonathan J. Helmus In-Reply-To: References: Message-ID: On Wed, Oct 28, 2015 at 3:27 PM, Nathaniel Smith wrote: > Hi all, > > Jonathan J. Helmus (@jjhelmus) has been given commit rights -- let's all > welcome him aboard. > > Welcome to the crew. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From morph at debian.org Wed Oct 28 19:28:32 2015 From: morph at debian.org (Sandro Tosi) Date: Wed, 28 Oct 2015 19:28:32 -0400 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: please, pretty please, do not disable setup.py install or at least keep providing a way for distribution (Debian in this case) to be able to build/install numpy in a temporary location for packaging reasons. pip is not the solution for us On Tue, Oct 27, 2015 at 12:31 AM, Nathaniel Smith wrote: > Hi all, > > Apparently it is not well known that if you have a Python project > source tree (e.g., a numpy checkout), then the correct way to install > it is NOT to type > > python setup.py install # bad and broken! > > but rather to type > > pip install . > > (I.e., pip install isn't just for packages on pypi -- you can also > pass it the path to an arbitrary source directory or the URL of a > source tarball and it will do its thing. In this case "install ." > means "install the project in the current directory".) > > These don't quite have identical results -- the main difference is > that the latter makes sure that proper metadata gets installed so that > later on it will be possible to upgrade or uninstall correctly. If you > call setup.py directly, and then later you try to upgrade your > package, then it's entirely possible to end up with a mixture of old > and new versions of the package installed in your PYTHONPATH. (One > common effect is in numpy's case is that we get people sending us > mysterious bug reports about failing tests in files don't even exist > (!) -- because nose is finding tests in files from one version of > numpy and running them against a different version of numpy.) > > But this isn't the only issue -- using pip also avoids a bunch of > weird corner cases in distutils/setuptools. 
E.g., if setup.py uses > plain distutils, then it turns out this will mangle numpy version > numbers in ways that cause weird horribleness -- see [1] for a bug > report of the form "matplotlib doesn't build anymore" which turned out > to be because of using 'setup.py install' to install numpy. OTOH if > setup.py uses setuptools then you get different weirdnesses, like you > can easily end up with multiple versions of the same library installed > simultaneously. > > And finally, an advantage of getting used to using 'pip install .' now > is that you'll be prepared for the glorious future when we kill > distutils and get rid of setup.py entirely in favor of something less > terrible [2]. > > So a proposal that came out of the discussion in [1] is that we modify > numpy's setup.py now so that if you try running > > python setup.py install > > you get > > Error: Calling 'setup.py install' directly is NOT SUPPORTED! > Instead, do: > > pip install . > > Alternatively, if you want to proceed at your own risk, you > can try 'setup.py install --force-raw-setup.py' > For more information see http://... > > (Other setup.py commands would continue to work as normal.) > > I believe that this would also break both 'easy_install numpy', and > attempts to install numpy via the setup_requires= argument to > setuptools.setup (because setup_requires= implicitly calls > easy_install). install_requires= would *not* be affected, and > setup_requires= would still be fine in cases where numpy was already > installed. > > This would hopefully cut down on the amount of time everyone spends > trying to track down these stupid weird bugs, but it will also require > some adjustment in people's workflows, so... objections? concerns? > > -n > > [1] https://github.com/numpy/numpy/issues/6551 > [2] https://mail.python.org/pipermail/distutils-sig/2015-October/027360.html > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -- Sandro Tosi (aka morph, morpheus, matrixhasu) My website: http://matrixhasu.altervista.org/ Me at Debian: http://wiki.debian.org/SandroTosi From ralf.gommers at gmail.com Wed Oct 28 19:34:50 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 29 Oct 2015 00:34:50 +0100 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Thu, Oct 29, 2015 at 12:28 AM, Sandro Tosi wrote: > please, pretty please, do not disable setup.py install or at least > keep providing a way for distribution (Debian in this case) to be able > to build/install numpy in a temporary location for packaging reasons. > pip is not the solution for us > ``python setup.py install --force`` would still work. Would that be OK then? Ralf > On Tue, Oct 27, 2015 at 12:31 AM, Nathaniel Smith wrote: > > Hi all, > > > > Apparently it is not well known that if you have a Python project > > source tree (e.g., a numpy checkout), then the correct way to install > > it is NOT to type > > > > python setup.py install # bad and broken! > > > > but rather to type > > > > pip install . > > > > (I.e., pip install isn't just for packages on pypi -- you can also > > pass it the path to an arbitrary source directory or the URL of a > > source tarball and it will do its thing. In this case "install ." > > means "install the project in the current directory".) 
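(For the distribution-packaging case raised above -- building into a temporary location rather than the live site-packages -- a sketch of how the same install can be staged with pip; the destination path is only a placeholder:)

    pip install --no-deps --root=/path/to/staging-dir .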
> > > > These don't quite have identical results -- the main difference is > > that the latter makes sure that proper metadata gets installed so that > > later on it will be possible to upgrade or uninstall correctly. If you > > call setup.py directly, and then later you try to upgrade your > > package, then it's entirely possible to end up with a mixture of old > > and new versions of the package installed in your PYTHONPATH. (One > > common effect is in numpy's case is that we get people sending us > > mysterious bug reports about failing tests in files don't even exist > > (!) -- because nose is finding tests in files from one version of > > numpy and running them against a different version of numpy.) > > > > But this isn't the only issue -- using pip also avoids a bunch of > > weird corner cases in distutils/setuptools. E.g., if setup.py uses > > plain distutils, then it turns out this will mangle numpy version > > numbers in ways that cause weird horribleness -- see [1] for a bug > > report of the form "matplotlib doesn't build anymore" which turned out > > to be because of using 'setup.py install' to install numpy. OTOH if > > setup.py uses setuptools then you get different weirdnesses, like you > > can easily end up with multiple versions of the same library installed > > simultaneously. > > > > And finally, an advantage of getting used to using 'pip install .' now > > is that you'll be prepared for the glorious future when we kill > > distutils and get rid of setup.py entirely in favor of something less > > terrible [2]. > > > > So a proposal that came out of the discussion in [1] is that we modify > > numpy's setup.py now so that if you try running > > > > python setup.py install > > > > you get > > > > Error: Calling 'setup.py install' directly is NOT SUPPORTED! > > Instead, do: > > > > pip install . > > > > Alternatively, if you want to proceed at your own risk, you > > can try 'setup.py install --force-raw-setup.py' > > For more information see http://... > > > > (Other setup.py commands would continue to work as normal.) > > > > I believe that this would also break both 'easy_install numpy', and > > attempts to install numpy via the setup_requires= argument to > > setuptools.setup (because setup_requires= implicitly calls > > easy_install). install_requires= would *not* be affected, and > > setup_requires= would still be fine in cases where numpy was already > > installed. > > > > This would hopefully cut down on the amount of time everyone spends > > trying to track down these stupid weird bugs, but it will also require > > some adjustment in people's workflows, so... objections? concerns? > > > > -n > > > > [1] https://github.com/numpy/numpy/issues/6551 > > [2] > https://mail.python.org/pipermail/distutils-sig/2015-October/027360.html > > > > -- > > Nathaniel J. Smith -- http://vorpus.org > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > -- > Sandro Tosi (aka morph, morpheus, matrixhasu) > My website: http://matrixhasu.altervista.org/ > Me at Debian: http://wiki.debian.org/SandroTosi > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From allanhaldane at gmail.com Wed Oct 28 22:43:51 2015 From: allanhaldane at gmail.com (Allan Haldane) Date: Wed, 28 Oct 2015 22:43:51 -0400 Subject: [Numpy-discussion] Commit rights for Jonathan J. Helmus In-Reply-To: References: Message-ID: <563187E7.10801@gmail.com> On 10/28/2015 05:27 PM, Nathaniel Smith wrote: > Hi all, > > Jonathan J. Helmus (@jjhelmus) has been given commit rights -- let's all > welcome him aboard. > > -n Welcome Jonathan, happy to have you on the team! Allan From toddrjen at gmail.com Thu Oct 29 11:25:17 2015 From: toddrjen at gmail.com (Todd) Date: Thu, 29 Oct 2015 16:25:17 +0100 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Oct 29, 2015 00:29, "Sandro Tosi" wrote: > > please, pretty please, do not disable setup.py install or at least > keep providing a way for distribution (Debian in this case) to be able > to build/install numpy in a temporary location for packaging reasons. > pip is not the solution for us What is wrong with "pip install --root" ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidmenhur at gmail.com Thu Oct 29 13:25:01 2015 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Thu, 29 Oct 2015 18:25:01 +0100 Subject: [Numpy-discussion] Linking Numpy with parallel OpenBLAS Message-ID: I have installed all the OpenBLAS versions availables at the Fedora repos, that include openMP and pthreads versions. But Numpy installed by pip on a virtualenv seems to only link to the serial version. Is there a way to convince it to use the parallel one? Here are my libraries: (py27)[david at SQUIDS lib64]$ ls libopenblas* libopenblas64.a libopenblaso64.so.0 libopenblasp64.so.0 libopenblas64-r0.2.14.so libopenblaso.a libopenblasp.a libopenblas64.so libopenblaso-r0.2.14.so libopenblasp-r0.2.14.so libopenblas64.so.0 libopenblaso.so libopenblasp.so libopenblas.a libopenblaso.so.0 libopenblasp.so.0 libopenblaso64.a libopenblasp64.a libopenblas-r0.2.14.so libopenblaso64-r0.2.14.so libopenblasp64-r0.2.14.so libopenblas.so libopenblaso64.so libopenblasp64.so libopenblas.so.0 And importing numpy shows that the serial is the only one open: (py27)[david at SQUIDS lib64]$ lsof libopenbl* lsof: WARNING: can't stat() tracefs file system /sys/kernel/debug/tracing Output information may be incomplete. COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME ipython 2355 david mem REG 8,2 32088056 2372346 libopenblas-r0.2.14.so This is the output of np.show_config(): lapack_opt_info: libraries = ['openblas'] library_dirs = ['/usr/lib64'] define_macros = [('HAVE_CBLAS', None)] language = c blas_opt_info: libraries = ['openblas'] library_dirs = ['/usr/lib64'] define_macros = [('HAVE_CBLAS', None)] language = c openblas_info: libraries = ['openblas'] library_dirs = ['/usr/lib64'] define_macros = [('HAVE_CBLAS', None)] language = c openblas_lapack_info: libraries = ['openblas'] library_dirs = ['/usr/lib64'] define_macros = [('HAVE_CBLAS', None)] language = c blas_mkl_info: NOT AVAILABLE Thanks, /David. -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at gmail.com Thu Oct 29 15:11:44 2015 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Thu, 29 Oct 2015 15:11:44 -0400 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' 
instead In-Reply-To: References: Message-ID: On Tue, Oct 27, 2015 at 12:31 AM, Nathaniel Smith wrote: > Hi all, > > Apparently it is not well known that if you have a Python project > source tree (e.g., a numpy checkout), then the correct way to install > it is NOT to type > > python setup.py install # bad and broken! > > but rather to type > > pip install . > > FWIW, I don't see any mention of this in the numpy docs, but I do see a lot of instructions involving `setup.py build` and `setup.py install`. See, for example, INSTALL.txt. Also see http://docs.scipy.org/doc/numpy/user/install.html#building-from-source So I guess it is not surprising that it is not well known. Warren > (I.e., pip install isn't just for packages on pypi -- you can also > pass it the path to an arbitrary source directory or the URL of a > source tarball and it will do its thing. In this case "install ." > means "install the project in the current directory".) > > These don't quite have identical results -- the main difference is > that the latter makes sure that proper metadata gets installed so that > later on it will be possible to upgrade or uninstall correctly. If you > call setup.py directly, and then later you try to upgrade your > package, then it's entirely possible to end up with a mixture of old > and new versions of the package installed in your PYTHONPATH. (One > common effect is in numpy's case is that we get people sending us > mysterious bug reports about failing tests in files don't even exist > (!) -- because nose is finding tests in files from one version of > numpy and running them against a different version of numpy.) > > But this isn't the only issue -- using pip also avoids a bunch of > weird corner cases in distutils/setuptools. E.g., if setup.py uses > plain distutils, then it turns out this will mangle numpy version > numbers in ways that cause weird horribleness -- see [1] for a bug > report of the form "matplotlib doesn't build anymore" which turned out > to be because of using 'setup.py install' to install numpy. OTOH if > setup.py uses setuptools then you get different weirdnesses, like you > can easily end up with multiple versions of the same library installed > simultaneously. > > And finally, an advantage of getting used to using 'pip install .' now > is that you'll be prepared for the glorious future when we kill > distutils and get rid of setup.py entirely in favor of something less > terrible [2]. > > So a proposal that came out of the discussion in [1] is that we modify > numpy's setup.py now so that if you try running > > python setup.py install > > you get > > Error: Calling 'setup.py install' directly is NOT SUPPORTED! > Instead, do: > > pip install . > > Alternatively, if you want to proceed at your own risk, you > can try 'setup.py install --force-raw-setup.py' > For more information see http://... > > (Other setup.py commands would continue to work as normal.) > > I believe that this would also break both 'easy_install numpy', and > attempts to install numpy via the setup_requires= argument to > setuptools.setup (because setup_requires= implicitly calls > easy_install). install_requires= would *not* be affected, and > setup_requires= would still be fine in cases where numpy was already > installed. > > This would hopefully cut down on the amount of time everyone spends > trying to track down these stupid weird bugs, but it will also require > some adjustment in people's workflows, so... objections? concerns? 
> > -n > > [1] https://github.com/numpy/numpy/issues/6551 > [2] > https://mail.python.org/pipermail/distutils-sig/2015-October/027360.html > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Thu Oct 29 15:25:47 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 29 Oct 2015 20:25:47 +0100 Subject: [Numpy-discussion] Linking Numpy with parallel OpenBLAS In-Reply-To: References: Message-ID: <563272BB.2060905@googlemail.com> should be possible by putting this into: ~/.numpy-site.cfg [openblas] libraries = openblasp LD_PRELOAD the file should also work. On 29.10.2015 18:25, Da?id wrote: > I have installed all the OpenBLAS versions availables at the Fedora > repos, that include openMP and pthreads versions. But Numpy installed by > pip on a virtualenv seems to only link to the serial version. Is there a > way to convince it to use the parallel one? > > Here are my libraries: > > (py27)[david at SQUIDS lib64]$ ls libopenblas* > libopenblas64.a libopenblaso64.so.0 libopenblasp64.so.0 > libopenblas64-r0.2.14.so > libopenblaso.a libopenblasp.a > libopenblas64.so libopenblaso-r0.2.14.so > libopenblasp-r0.2.14.so > > libopenblas64.so.0 libopenblaso.so libopenblasp.so > libopenblas.a libopenblaso.so.0 libopenblasp.so.0 > libopenblaso64.a libopenblasp64.a > libopenblas-r0.2.14.so > libopenblaso64-r0.2.14.so > libopenblasp64-r0.2.14.so libopenblas.so > libopenblaso64.so libopenblasp64.so libopenblas.so.0 > > And importing numpy shows that the serial is the only one open: > > (py27)[david at SQUIDS lib64]$ lsof libopenbl* > lsof: WARNING: can't stat() tracefs file system /sys/kernel/debug/tracing > Output information may be incomplete. > COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME > ipython 2355 david mem REG 8,2 32088056 2372346 > libopenblas-r0.2.14.so > > > This is the output of np.show_config(): > > lapack_opt_info: > libraries = ['openblas'] > library_dirs = ['/usr/lib64'] > define_macros = [('HAVE_CBLAS', None)] > language = c > blas_opt_info: > libraries = ['openblas'] > library_dirs = ['/usr/lib64'] > define_macros = [('HAVE_CBLAS', None)] > language = c > openblas_info: > libraries = ['openblas'] > library_dirs = ['/usr/lib64'] > define_macros = [('HAVE_CBLAS', None)] > language = c > openblas_lapack_info: > libraries = ['openblas'] > library_dirs = ['/usr/lib64'] > define_macros = [('HAVE_CBLAS', None)] > language = c > blas_mkl_info: > NOT AVAILABLE > > > Thanks, > > > /David. > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From davidmenhur at gmail.com Thu Oct 29 16:50:34 2015 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Thu, 29 Oct 2015 21:50:34 +0100 Subject: [Numpy-discussion] Linking Numpy with parallel OpenBLAS In-Reply-To: <563272BB.2060905@googlemail.com> References: <563272BB.2060905@googlemail.com> Message-ID: On 29 October 2015 at 20:25, Julian Taylor wrote: > should be possible by putting this into: ~/.numpy-site.cfg > > [openblas] > libraries = openblasp > > LD_PRELOAD the file should also work. > > Thank! I did some timings on a dot product of a square matrix of size 10000 with LD_PRELOADing the different versions. 
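(Sketch of the benchmark being described, under the stated assumptions -- a 10000x10000 dot product, with LD_PRELOAD used to pick one of the OpenBLAS builds listed earlier; the helper script name is made up for illustration:)

    LD_PRELOAD=/usr/lib64/libopenblasp.so python time_dot.py

    # time_dot.py
    import time
    import numpy as np

    a = np.random.rand(10000, 10000)
    start = time.time()
    a.dot(a)
    print(time.time() - start)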
I checked that all the cores were crunching when an other than plain libopenblas/64 was selected. Here are the timings in seconds: Intel i5-3317U: /usr/lib64/libopenblaso.so 86.3651878834 /usr/lib64/libopenblasp64.so 96.8817200661 /usr/lib64/libopenblas.so 114.60265708 /usr/lib64/libopenblasp.so 107.927740097 /usr/lib64/libopenblaso64.so 97.5418870449 /usr/lib64/libopenblas64.so 109.000799179 Intel i7-4770: /usr/lib64/libopenblas.so 37.9794859886 /usr/lib64/libopenblasp.so 12.3455951214 /usr/lib64/libopenblas64.so 38.0571939945 /usr/lib64/libopenblasp64.so 12.5558650494 /usr/lib64/libopenblaso64.so 12.4118559361 /usr/lib64/libopenblaso.so 13.4787950516 Both computers have the same software and OS. So, it seems that openblas doesn't get a significant advantage from going parallel in the older i5; the i7 using all its cores (4 + 4 hyperthread) gains a 3x speed up, and there is no big different between OpenMP and pthreads. I am particullary puzzled by the i5 results, shouldn't threads get a noticeable speedup? /David. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Thu Oct 29 17:07:57 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 29 Oct 2015 22:07:57 +0100 Subject: [Numpy-discussion] Linking Numpy with parallel OpenBLAS In-Reply-To: References: <563272BB.2060905@googlemail.com> Message-ID: <56328AAD.9050802@googlemail.com> On 29.10.2015 21:50, Da?id wrote: > > On 29 October 2015 at 20:25, Julian Taylor > > > wrote: > > should be possible by putting this into: ~/.numpy-site.cfg > > [openblas] > libraries = openblasp > > LD_PRELOAD the file should also work. > > > Thank! > > I did some timings on a dot product of a square matrix of size 10000 > with LD_PRELOADing the different versions. I checked that all the cores > were crunching when an other than plain libopenblas/64 was selected. > Here are the timings in seconds: > > > Intel i5-3317U: > /usr/lib64/libopenblaso.so > 86.3651878834 > /usr/lib64/libopenblasp64.so > 96.8817200661 > /usr/lib64/libopenblas.so > 114.60265708 > /usr/lib64/libopenblasp.so > 107.927740097 > /usr/lib64/libopenblaso64.so > 97.5418870449 > /usr/lib64/libopenblas64.so > 109.000799179 > > Intel i7-4770: > /usr/lib64/libopenblas.so > 37.9794859886 > /usr/lib64/libopenblasp.so > 12.3455951214 > /usr/lib64/libopenblas64.so > 38.0571939945 > /usr/lib64/libopenblasp64.so > 12.5558650494 > /usr/lib64/libopenblaso64.so > 12.4118559361 > /usr/lib64/libopenblaso.so > 13.4787950516 > > Both computers have the same software and OS. So, it seems that openblas > doesn't get a significant advantage from going parallel in the older i5; > the i7 using all its cores (4 + 4 hyperthread) gains a 3x speed up, and > there is no big different between OpenMP and pthreads. > > I am particullary puzzled by the i5 results, shouldn't threads get a > noticeable speedup? > > > /David. > > Try with only 2 cores instead of the 2+2 via OMP_NUM_THREADS=2, its possible the hyperthreading is just leading to cache trashing. Also when only one core is active the cpus will overclock themselves a bit which will decrease relative parallelization speedups (intel turbo boost). From tcaswell at gmail.com Thu Oct 29 23:44:40 2015 From: tcaswell at gmail.com (Thomas Caswell) Date: Fri, 30 Oct 2015 03:44:40 +0000 Subject: [Numpy-discussion] [announce] matplotlib 1.5.0 released Message-ID: Hey all, We are pleased to finally announce the release of matplotlib 1.5.0! 
It has been over a year since the last feature release and we have had over 230 people contribute to this cycle. This release of matplotlib has several major new features including - Auto-redraw using the object-oriented API in interactive mode. - Most plotting functions now support labeled data API [Jan Schulz]. - Color cycling has extended to all style properties [Ben Root]. - Four new perceptually uniform color maps, including the soon-to-be default 'viridis'. [Stefan van der Walt and Nathaniel Smith]. - More included style sheets. - Many small plotting improvements. - Proposed new framework for managing the GUI toolbar and tools. - Pixel-value on mouse over for imshow [Steven Silvester] For demos of some of these features in action see this notebook: https://gist.github.com/tacaswell/72b0d579aeb54d4fbf87 which is version of the talk I presented at scipy, pydata Seattle and pygotham this summer. There will be more in-depth demos of the new features coming. This release has a new required dependency, cycler , for composing complex style cycles. In 1.5.0 we have dropped official support for python 2.6 and 3.3. The next matplotlib release will be the 2.0 default-style-only release, planned for 1-2 months from now. Tom -------------- next part -------------- An HTML attachment was scrubbed... URL: From jni.soma at gmail.com Thu Oct 29 23:59:24 2015 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Thu, 29 Oct 2015 20:59:24 -0700 (PDT) Subject: [Numpy-discussion] [announce] matplotlib 1.5.0 released In-Reply-To: References: Message-ID: <1446177564355.c2d2a061@Nodemailer> Yay! I have been eagerly awaiting this! =D Thank you everyone! On Fri, Oct 30, 2015 at 2:44 PM, Thomas Caswell wrote: > Hey all, > We are pleased to finally announce the release of matplotlib 1.5.0! It has > been over a year since the last feature release and we have had over 230 > people contribute to this cycle. > This release of matplotlib has several major new features including > - Auto-redraw using the object-oriented API in interactive mode. > - Most plotting functions now support labeled data API [Jan Schulz]. > - Color cycling has extended to all style properties [Ben Root]. > - Four new perceptually uniform color maps, including the soon-to-be > default 'viridis'. [Stefan van der Walt and Nathaniel Smith]. > - More included style sheets. > - Many small plotting improvements. > - Proposed new framework for managing the GUI toolbar and tools. > - Pixel-value on mouse over for imshow [Steven Silvester] > For demos of some of these features in action see this notebook: > https://gist.github.com/tacaswell/72b0d579aeb54d4fbf87 > which is version of the talk I presented at scipy, pydata Seattle and > pygotham this summer. There will be more in-depth demos of the new > features coming. > This release has a new required dependency, cycler > , for composing complex style cycles. > In 1.5.0 we have dropped official support for python 2.6 and 3.3. > The next matplotlib release will be the 2.0 default-style-only release, > planned for 1-2 months from now. > Tom -------------- next part -------------- An HTML attachment was scrubbed... URL: From jjhelmus at gmail.com Fri Oct 30 11:20:04 2015 From: jjhelmus at gmail.com (Jonathan Helmus) Date: Fri, 30 Oct 2015 10:20:04 -0500 Subject: [Numpy-discussion] Commit rights for Jonathan J. 
Helmus In-Reply-To: <563187E7.10801@gmail.com> References: <563187E7.10801@gmail.com> Message-ID: <56338AA4.5080308@gmail.com> On 10/28/2015 09:43 PM, Allan Haldane wrote: > On 10/28/2015 05:27 PM, Nathaniel Smith wrote: >> Hi all, >> >> Jonathan J. Helmus (@jjhelmus) has been given commit rights -- let's all >> welcome him aboard. >> >> -n > > Welcome Jonathan, happy to have you on the team! > > Allan > Thanks you everyone for the kind welcome. I'm looking forwarding to being part of them team. - Jonathan Helmus From charlesr.harris at gmail.com Fri Oct 30 16:53:31 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 30 Oct 2015 14:53:31 -0600 Subject: [Numpy-discussion] Back to the Future for np.lib.split Message-ID: Hi All, This is to note that gh-6592 proposes to accelerate a planned future change to array_split. In Numpy < 1.9 empty arrays resulting from split always had dimension (0,),regardless of the dimensions of the other arrays in the result. In Numpy 1.9 a FutureWarning was raised saying that that would change and at some point empty arrays would have dimensions consistent with the other arrays in the result. However, there was a bug in the condition for raising that warning and empty arrays that were not the last of the returned arrays already had consistent dimensions. As a result, 1.9 was half way to the future but not quite there. That bug was fixed in 1.10.0, but some folks were already relying on the future behavior and opened a bug report on 1.10, see gh-6575 . This left us in a sticky position. The decision has been made to revert the bug fix in 1.10 so that things remain as they were in 1.9, and move completely to consistent dimensions in 1.11. The move is a bit accelerated but seems the most graceful way out of the morass. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Oct 30 18:12:50 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 30 Oct 2015 16:12:50 -0600 Subject: [Numpy-discussion] isfortran compatibility in numpy 1.10. Message-ID: Hi All, The isfortran function calls a.fnc (Fortran-Not-C), which is implemented as F_CONTIGUOUS && !C_CONTIGUOUS. Before relaxed stride checking contiguous multidimensional arrays could not be both and continguous 1-D arrays were always CONTIGUOUS, but this is not longer the case. Consequently current isfortran breaks backward compatiblity. There are two suggested solutions 1. Return `a.flags.f_contiguous`. This differs for 1-D arrays, but is most consistent with the name isfortran. 2. Return `a.flags.f_contiguous and a.ndim > 1`, which would be backward compatible. It is also possible to start with 2. but add a FutureWarning and later move to 1, which it my preferred solution. See gh-6590 for the issue. Thoughts? -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Fri Oct 30 20:03:19 2015 From: travis at continuum.io (Travis Oliphant) Date: Fri, 30 Oct 2015 19:03:19 -0500 Subject: [Numpy-discussion] isfortran compatibility in numpy 1.10. In-Reply-To: References: Message-ID: As I posted to the github issue, I support #2 as it is the original meaning. The most common case of isfortran that I recall was to support transpositions that needed to occur before calling Fortran-compiled linear algebra routines. However, with that said, you could also reasonably do #1 and likely have no real problem --- because transposing a 1-d array doesn't have any effect. 
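(A small sketch of the 1-D corner case under discussion; only the flags are shown as fact, since what np.isfortran returns depends on which option is adopted:)

    import numpy as np

    a = np.arange(10.0)              # 1-D contiguous array
    a.flags.c_contiguous             # True
    a.flags.f_contiguous             # True
    # option 1: isfortran(a) would be True  (pure a.flags.f_contiguous)
    # option 2: isfortran(a) would be False (f_contiguous and a.ndim > 1), matching the old behaviour

    b = np.ones((3, 4), order='F')   # 2-D Fortran-ordered array
    # both options agree here: isfortran(b) is True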
In NumPy 1.0.1, isfortran was intended to be True only for arrays with a.ndim > 1. Thus, it would have been possible for someone to rely on that invariant for some other reason. With relaxed stride checking, this invariant changed because isfortran was implemented by returning True if the F_Contiguous flag was set but the C_Contiguous flag was not (this was only ever previously possible for a.ndim > 1). If you choose to go with #1, please emphasize in the release notes that isfortran now does not assume a.ndim > 1 but is simply short-hand for a.flags.f_contiguous. -Travis On Fri, Oct 30, 2015 at 5:12 PM, Charles R Harris wrote: > Hi All, > > The isfortran function calls a.fnc (Fortran-Not-C), which is implemented > as F_CONTIGUOUS && !C_CONTIGUOUS. Before relaxed stride checking > contiguous multidimensional arrays could not be both and continguous 1-D > arrays were always CONTIGUOUS, but this is not longer the case. > Consequently current isfortran breaks backward compatiblity. There are two > suggested solutions > > 1. Return `a.flags.f_contiguous`. This differs for 1-D arrays, but is > most consistent with the name isfortran. > 2. Return `a.flags.f_contiguous and a.ndim > 1`, which would be > backward compatible. > > It is also possible to start with 2. but add a FutureWarning and later > move to 1, which it my preferred solution. See gh-6590 > for the issue. > > Thoughts? > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- *Travis Oliphant* *Co-founder and CEO* @teoliphant 512-222-5440 http://www.continuum.io -------------- next part -------------- An HTML attachment was scrubbed... URL: From laurentelshafey at gmail.com Sat Oct 31 02:15:09 2015 From: laurentelshafey at gmail.com (laurentes) Date: Fri, 30 Oct 2015 23:15:09 -0700 (MST) Subject: [Numpy-discussion] [NumPy/Swig] Return NumPy array with same size as input array (no additional length argument) Message-ID: <1446272109262-41601.post@n7.nabble.com> Hello, Using Swig, I don't manage to (properly) create the Python Binding for the following C-like function: void add_array(double* input_array1, double* input_array2, double* output_array, int length); where the three arrays have all the same length. This is similar to this thread , which has never been fully addressed online. >From Python, I would like to be able to call: add_array(input_array1, input_array2) which would return me a newly allocated NumPy array (output_array) with the result. In my Swig file, I've first used the wrapper function trick described here , that is: %apply (double* IN_ARRAY1, int DIM1) {(double* input_array1, int length1), (double* input_array2, int length2)}; %apply (double* ARGOUT_ARRAY1, int DIM1) {(double* output_array, int length3)}; %rename (add_array) my_add_array; %exception my_add_array { $action if (PyErr_Occurred()) SWIG_fail; } %inline %{ void my_add_array(double* input_array1, int length1, double* input_array2, int length2, double* output_array, int length3) { if (length1 != length2 || length1 != length3) { PyErr_Format(PyExc_ValueError, "Arrays of lengths (%d,%d,%d) given", length1, length2, length3); } else { add_array(input_array1, input_array2, output_array, length1); } } %} This allows me to call the function from Python using add_array(input_array1, input_array2, length). But the third argument of this function is useless and this function does not look 'Pythonic'. 
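(Not an answer from the thread, just an illustrative sketch of one commonly suggested route for the question that follows: keep the three-argument wrapper un-renamed and add a thin Python shim that supplies the length itself. The names follow the snippet above; whether this fits the poster's constraints is exactly what is asked below.)

    // in the .i file: drop the %rename so my_add_array stays visible, then add
    %pythoncode %{
    def add_array(input_array1, input_array2):
        """Add two equal-length arrays; the length argument is filled in automatically."""
        return my_add_array(input_array1, input_array2, len(input_array1))
    %}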
Could someone help me to modify my Swig file, such that only the first two arguments are required for the Python API? Thanks a lot, Laurent -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/NumPy-Swig-Return-NumPy-array-with-same-size-as-input-array-no-additional-length-argument-tp41601.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From ralf.gommers at gmail.com Sat Oct 31 19:01:55 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 1 Nov 2015 00:01:55 +0100 Subject: [Numpy-discussion] NumFOCUS fiscal sponsorship agreement In-Reply-To: References: Message-ID: Hi all, On Wed, Oct 28, 2015 at 11:48 PM, Ralf Gommers wrote: > > Hi all, there wasn't much feedback on this FSA, but I want to point out > that it's actually quite important for the project. > > Maybe everyone already thought about this when the governance model was > agreed on (it does include a NumFOCUS subcommittee after all), but if not: > read / think /ask question fast, because we're moving forward with signing > of the agreement with the people listed at > http://docs.scipy.org/doc/numpy-dev/dev/governance/people.html#numfocus-subcommittee. > > The document is now signed. No other project with an FSA seems to have done this (yet), but I think it would be good to publish the FSA. Either on numpy.org, scipy.org or numfocus.org. Any objections/concerns about that? +1's from the people that signed would be good to have before moving forward. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From tcaswell at gmail.com Sat Oct 31 19:20:21 2015 From: tcaswell at gmail.com (Thomas Caswell) Date: Sat, 31 Oct 2015 23:20:21 +0000 Subject: [Numpy-discussion] NumFOCUS fiscal sponsorship agreement In-Reply-To: References: Message-ID: +1 to posting it as part of the documentation. I also like the idea of numfocus hosting the whole collection of them locally so that we can just link to them. On Sat, Oct 31, 2015, 19:01 Ralf Gommers wrote: > Hi all, > > On Wed, Oct 28, 2015 at 11:48 PM, Ralf Gommers > wrote: > >> >> Hi all, there wasn't much feedback on this FSA, but I want to point out >> that it's actually quite important for the project. >> >> Maybe everyone already thought about this when the governance model was >> agreed on (it does include a NumFOCUS subcommittee after all), but if not: >> read / think /ask question fast, because we're moving forward with signing >> of the agreement with the people listed at >> http://docs.scipy.org/doc/numpy-dev/dev/governance/people.html#numfocus-subcommittee. >> >> > > The document is now signed. No other project with an FSA seems to have > done this (yet), but I think it would be good to publish the FSA. Either on > numpy.org, scipy.org or numfocus.org. Any objections/concerns about that? > > +1's from the people that signed would be good to have before moving > forward. > > Cheers, > Ralf > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Sat Oct 31 19:53:05 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 31 Oct 2015 16:53:05 -0700 Subject: [Numpy-discussion] NumFOCUS fiscal sponsorship agreement In-Reply-To: References: Message-ID: On Oct 31, 2015 4:01 PM, "Ralf Gommers" wrote: > > Hi all, > > On Wed, Oct 28, 2015 at 11:48 PM, Ralf Gommers wrote: >> >> >> Hi all, there wasn't much feedback on this FSA, but I want to point out that it's actually quite important for the project. >> >> Maybe everyone already thought about this when the governance model was agreed on (it does include a NumFOCUS subcommittee after all), but if not: read / think /ask question fast, because we're moving forward with signing of the agreement with the people listed at http://docs.scipy.org/doc/numpy-dev/dev/governance/people.html#numfocus-subcommittee. > > > The document is now signed. No other project with an FSA seems to have done this (yet), but I think it would be good to publish the FSA. Either on numpy.org, scipy.org or numfocus.org. Any objections/concerns about that? > > +1's from the people that signed would be good to have before moving forward. +1 from me, with the proviso that most project FSAs probably contain individual contributor's home addresses, which should be redacted just in case. (Correctly redacting PDFs is notoriously tricky [1], but it looks like recent versions of e.g. Acrobat have tools that take appropriate care [2].) -n [1] http://blog.foxitsoftware.com/how-to-properly-redact-pdf-files/ [2] https://helpx.adobe.com/acrobat/using/removing-sensitive-content-pdfs.html From ralf.gommers at gmail.com Sat Oct 31 19:57:08 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 1 Nov 2015 00:57:08 +0100 Subject: [Numpy-discussion] NumFOCUS fiscal sponsorship agreement In-Reply-To: References: Message-ID: On Sun, Nov 1, 2015 at 12:53 AM, Nathaniel Smith wrote: > On Oct 31, 2015 4:01 PM, "Ralf Gommers" wrote: > > > > Hi all, > > > > On Wed, Oct 28, 2015 at 11:48 PM, Ralf Gommers > wrote: > >> > >> > >> Hi all, there wasn't much feedback on this FSA, but I want to point out > that it's actually quite important for the project. > >> > >> Maybe everyone already thought about this when the governance model was > agreed on (it does include a NumFOCUS subcommittee after all), but if not: > read / think /ask question fast, because we're moving forward with signing > of the agreement with the people listed at > http://docs.scipy.org/doc/numpy-dev/dev/governance/people.html#numfocus-subcommittee > . > > > > > > The document is now signed. No other project with an FSA seems to have > done this (yet), but I think it would be good to publish the FSA. Either on > numpy.org, scipy.org or numfocus.org. Any objections/concerns about that? > > > > +1's from the people that signed would be good to have before moving > forward. > > +1 from me, with the proviso that most project FSAs probably contain > individual contributor's home addresses, which should be redacted just > in case. > Ours doesn't contain any home addresses, and I don't think the ones for other projects do either (didn't check yet). Ralf > (Correctly redacting PDFs is notoriously tricky [1], but it looks like > recent versions of e.g. Acrobat have tools that take appropriate care > [2].)
> > -n > > [1] http://blog.foxitsoftware.com/how-to-properly-redact-pdf-files/ > [2] > https://helpx.adobe.com/acrobat/using/removing-sensitive-content-pdfs.html > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Oct 31 20:00:58 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 31 Oct 2015 17:00:58 -0700 Subject: [Numpy-discussion] NumFOCUS fiscal sponsorship agreement In-Reply-To: References: Message-ID: On Sat, Oct 31, 2015 at 4:57 PM, Ralf Gommers wrote: > > On Sun, Nov 1, 2015 at 12:53 AM, Nathaniel Smith wrote: >> >> On Oct 31, 2015 4:01 PM, "Ralf Gommers" wrote: >> > >> > Hi all, >> > >> > On Wed, Oct 28, 2015 at 11:48 PM, Ralf Gommers >> > wrote: >> >> >> >> >> >> Hi all, there wasn't much feedback on this FSA, but I want to point out >> >> that it's actually quite important for the project. >> >> >> >> Maybe everyone already thought about this when the governance model was >> >> agreed on (it does include a NumFOCUS subcommittee after all), but if not: >> >> read / think /ask question fast, because we're moving forward with signing >> >> of the agreement with the people listed at >> >> http://docs.scipy.org/doc/numpy-dev/dev/governance/people.html#numfocus-subcommittee. >> > >> > >> > The document is now signed. No other project with an FSA seems to have >> > done this (yet), but I think it would be good to publish the FSA. Either on >> > numpy.org, scipy.org or numfocus.org. Any objections/concerns about that? >> > >> > +1's from the people that signed would be good to have before moving >> > forward. >> >> +1 from me, with the proviso that most project FSAs probably contain >> individual contributor's home addresses, which should be redacted just >> in case. > > > Ours doesn't contain any home addressed, and I don't think the ones for > other projects do either (didn't check yet). Yes it does -- one of the things to fill in on the FSA template is the "project mailing address", and since most projects don't actually have one of those, the advice is to pick some contributor's address and use that. -n -- Nathaniel J. Smith -- http://vorpus.org From ralf.gommers at gmail.com Sat Oct 31 20:19:36 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 1 Nov 2015 01:19:36 +0100 Subject: [Numpy-discussion] NumFOCUS fiscal sponsorship agreement In-Reply-To: References: Message-ID: On Sun, Nov 1, 2015 at 1:00 AM, Nathaniel Smith wrote: > On Sat, Oct 31, 2015 at 4:57 PM, Ralf Gommers > wrote: > > > > On Sun, Nov 1, 2015 at 12:53 AM, Nathaniel Smith wrote: > >> > >> On Oct 31, 2015 4:01 PM, "Ralf Gommers" wrote: > >> > > >> > Hi all, > >> > > >> > On Wed, Oct 28, 2015 at 11:48 PM, Ralf Gommers < > ralf.gommers at gmail.com> > >> > wrote: > >> >> > >> >> > >> >> Hi all, there wasn't much feedback on this FSA, but I want to point > out > >> >> that it's actually quite important for the project. > >> >> > >> >> Maybe everyone already thought about this when the governance model > was > >> >> agreed on (it does include a NumFOCUS subcommittee after all), but > if not: > >> >> read / think /ask question fast, because we're moving forward with > signing > >> >> of the agreement with the people listed at > >> >> > http://docs.scipy.org/doc/numpy-dev/dev/governance/people.html#numfocus-subcommittee > . > >> > > >> > > >> > The document is now signed. 
No other project with an FSA seems to have > >> > done this (yet), but I think it would be good to publish the FSA. > Either on > >> > numpy.org, scipy.org or numfocus.org. Any objections/concerns about > that? > >> > > >> > +1's from the people that signed would be good to have before moving > >> > forward. > >> > >> +1 from me, with the proviso that most project FSAs probably contain > >> individual contributor's home addresses, which should be redacted just > >> in case. > > > > > > Ours doesn't contain any home addressed, and I don't think the ones for > > other projects do either (didn't check yet). > > Yes it does -- one of the things to fill in on the FSA template is the > "project mailing address", and since most projects don't actually have > one of those, the advice is to pick some contributor's address and use > that. That was pretty well hidden (I searched for "address" and "mailing") but found it now, thanks. So good point, home addresses will need redacting. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Oct 31 20:54:45 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 1 Nov 2015 01:54:45 +0100 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Thu, Oct 29, 2015 at 8:11 PM, Warren Weckesser < warren.weckesser at gmail.com> wrote: > > > On Tue, Oct 27, 2015 at 12:31 AM, Nathaniel Smith wrote: > >> Hi all, >> >> Apparently it is not well known that if you have a Python project >> source tree (e.g., a numpy checkout), then the correct way to install >> it is NOT to type >> >> python setup.py install # bad and broken! >> >> but rather to type >> >> pip install . >> >> > > FWIW, I don't see any mention of this in the numpy docs, but I do see a > lot of instructions involving `setup.py build` and `setup.py install`. > See, for example, INSTALL.txt. Also see > http://docs.scipy.org/doc/numpy/user/install.html#building-from-source > So I guess it is not surprising that it is not well known. > Indeed, install docs are always hopelessly outdated. And we have too many of them. There's duplicate info in INSTALL.txt and http://scipy.org/scipylib/building/index.html for example. We should probably just empty out INSTALL.txt and simply put a link in it to the html docs. I've created an issue with a long todo list and a bunch of links: https://github.com/numpy/numpy/issues/6599. Feel free to add stuff. Or to go fix something:) Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Oct 31 20:59:09 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 1 Nov 2015 01:59:09 +0100 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Sun, Nov 1, 2015 at 1:54 AM, Ralf Gommers wrote: > > > > On Thu, Oct 29, 2015 at 8:11 PM, Warren Weckesser < > warren.weckesser at gmail.com> wrote: > >> >> >> On Tue, Oct 27, 2015 at 12:31 AM, Nathaniel Smith wrote: >> >>> Hi all, >>> >>> Apparently it is not well known that if you have a Python project >>> source tree (e.g., a numpy checkout), then the correct way to install >>> it is NOT to type >>> >>> python setup.py install # bad and broken! >>> >>> but rather to type >>> >>> pip install . 
>>> >>> >> >> FWIW, I don't see any mention of this in the numpy docs, but I do see a >> lot of instructions involving `setup.py build` and `setup.py install`. >> See, for example, INSTALL.txt. Also see >> >> http://docs.scipy.org/doc/numpy/user/install.html#building-from-source >> So I guess it is not surprising that it is not well known. >> > > Indeed, install docs are always hopelessly outdated. And we have too many > of them. There's duplicate info in INSTALL.txt and > http://scipy.org/scipylib/building/index.html for example. We should > probably just empty out INSTALL.txt and simply put a link in it to the html > docs. > > I've created an issue with a long todo list and a bunch of links: > https://github.com/numpy/numpy/issues/6599. Feel free to add stuff. Or to > go fix something:) > Oh, and: looking at this thread there haven't been serious unanswered concerns (at least in my perception), so without more discussion I'd interpret the current status as "go ahead". Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Sat Oct 31 21:00:27 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Sun, 1 Nov 2015 02:00:27 +0100 Subject: [Numpy-discussion] Commit rights for Jonathan J. Helmus In-Reply-To: <56338AA4.5080308@gmail.com> References: <563187E7.10801@gmail.com> <56338AA4.5080308@gmail.com> Message-ID: "Gruetzi!", as I just found out we say in Switzerland... On Oct 30, 2015 8:20 AM, "Jonathan Helmus" wrote: > On 10/28/2015 09:43 PM, Allan Haldane wrote: > > On 10/28/2015 05:27 PM, Nathaniel Smith wrote: > >> Hi all, > >> > >> Jonathan J. Helmus (@jjhelmus) has been given commit rights -- let's all > >> welcome him aboard. > >> > >> -n > > > > Welcome Jonathan, happy to have you on the team! > > > > Allan > > > > Thanks you everyone for the kind welcome. I'm looking forwarding to > being part of them team. > > - Jonathan Helmus > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: