From joerg at joergdietrich.com Thu Jan 2 07:42:29 2014 From: joerg at joergdietrich.com (Joerg Dietrich) Date: Thu, 02 Jan 2014 12:42:29 -0000 Subject: [SciPy-Dev] k-sample Anderson Darling test Message-ID: Hi, I'm new to scipy development and I'm trying to follow the HACKING document. I have code that implements the k-sample Anderson Darling test (Scholz & Stephens 1987, Journal of the American Statistical Association 82, 918). This test tries to reject the null hypothesis that k samples of a distribution F_i are drawn from the same parent distribution, i.e. H_0: F_1 = ... = F_k. It is similar to the 2 sample KS test and an extension of the one sample AD test already present in scipy. If this is deemed of interest for inclusion in scipy, I'd be happy to add documentation to my code and prepare a pull request. Cheers, Joerg -- ---=== Encrypted mail preferred. Key-ID: 1024D/2B693EBF ===--- Fortune cookie of the day: There was a phone call for you. From josef.pktd at gmail.com Thu Jan 2 08:22:55 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 2 Jan 2014 08:22:55 -0500 Subject: [SciPy-Dev] k-sample Anderson Darling test In-Reply-To: <52c56010.ea34b60a.588c.50c4SMTPIN_ADDED_MISSING@mx.google.com> References: <52c56010.ea34b60a.588c.50c4SMTPIN_ADDED_MISSING@mx.google.com> Message-ID: On Thu, Jan 2, 2014 at 7:48 AM, Joerg Dietrich wrote: > Hi, > > I'm new to scipy development and I'm trying to follow the HACKING > document. > > I have code that implements the k-sample Anderson Darling test (Scholz > & Stephens 1987, Journal of the American Statistical Association 82, > 918). This test tries to reject the null hypothesis that k samples of > a distribution F_i are drawn from the same parent distribution, > i.e. H_0: F_1 = ... = F_k. It is similar to the 2 sample KS test and > an extension of the one sample AD test already present in scipy. > > If this is deemed of interest for inclusion in scipy, I'd be happy to > add documentation to my code and prepare a pull request. This would be a good contribution to scipy.stats (or to statsmodels). Josef > > Cheers, > Joerg > > -- > ---=== Encrypted mail preferred. Key-ID: 1024D/2B693EBF ===--- > Fortune cookie of the day: > There was a phone call for you. > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From joerg at joergdietrich.com Fri Jan 3 14:29:23 2014 From: joerg at joergdietrich.com (Joerg Dietrich) Date: Fri, 3 Jan 2014 20:29:23 +0100 Subject: [SciPy-Dev] k-sample Anderson Darling test In-Reply-To: References: <52c56010.ea34b60a.588c.50c4SMTPIN_ADDED_MISSING@mx.google.com> Message-ID: <20140103192923.GA20606@riesling.joergdietrich.com> On Thu, Jan 02, 2014 at 08:22:55AM -0500, josef.pktd at gmail.com wrote: > On Thu, Jan 2, 2014 at 7:48 AM, Joerg Dietrich wrot> > I have code that implements the k-sample Anderson Darling test (Scholz > > This would be a good contribution to scipy.stats (or to statsmodels). There's code https://github.com/joergdietrich/scipy/compare/k-sample-AD ready for a review. Or should I send a PR right away? Cheers, Joerg -- ---=== Encrypted mail preferred. Key-ID: 1024D/2B693EBF ===--- Fortune cookie of the day: You will be awarded some great honor. 
From josef.pktd at gmail.com Fri Jan 3 14:52:24 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 3 Jan 2014 14:52:24 -0500 Subject: [SciPy-Dev] k-sample Anderson Darling test In-Reply-To: <20140103192923.GA20606@riesling.joergdietrich.com> References: <52c56010.ea34b60a.588c.50c4SMTPIN_ADDED_MISSING@mx.google.com> <20140103192923.GA20606@riesling.joergdietrich.com> Message-ID: On Fri, Jan 3, 2014 at 2:29 PM, Joerg Dietrich wrote: > On Thu, Jan 02, 2014 at 08:22:55AM -0500, josef.pktd at gmail.com wrote: >> On Thu, Jan 2, 2014 at 7:48 AM, Joerg Dietrich wrot> > I have code that implements the k-sample Anderson Darling test (Scholz >> >> This would be a good contribution to scipy.stats (or to statsmodels). > > There's code > https://github.com/joergdietrich/scipy/compare/k-sample-AD ready for a > review. Or should I send a PR right away? You could send the PR right away, it will make it easier to add line comments and discuss. Overall looks good. The main thing we should check is whether and how to get rid of some of the loops. It has been a while since I looked at this, but as far as I remember there should be some numpy shortcuts to avoid at least some of the loops. The main thing that puzzles me from a quick look is the use of np.unique and `==` (line 1232) when we assume continuous variables, where all values "should" be unique. Josef > > Cheers, > Joerg > > -- > ---=== Encrypted mail preferred. Key-ID: 1024D/2B693EBF ===--- > Fortune cookie of the day: > You will be awarded some great honor. > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From josef.pktd at gmail.com Fri Jan 3 15:49:29 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 3 Jan 2014 15:49:29 -0500 Subject: [SciPy-Dev] signature for k-sample tests ??? Message-ID: I don't like it, but I don't care. Opinions please. The inherited pattern in scipy.stats for k-sample tests is to use *args anderson_ksamp(*args) f_oneway(*args) obrientransform(*args) what do we do if we want to add keyword arguments? >>> def f(*args, method='approx'): SyntaxError: invalid syntax unless this changes in newer versions of python In statsmodels I either require a tuple instead of *args, or stack the data into a long format. (I don't use *args for the main required arguments.) Josef From josef.pktd at gmail.com Fri Jan 3 15:50:34 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 3 Jan 2014 15:50:34 -0500 Subject: [SciPy-Dev] signature for k-sample tests ??? In-Reply-To: References: Message-ID: On Fri, Jan 3, 2014 at 3:49 PM, wrote: > I don't like it, but I don't care. Opinions please. > > The inherited pattern in scipy.stats for k-sample tests is to use *args > > anderson_ksamp(*args) > f_oneway(*args) > obrientransform(*args) > > what do we do if we want to add keyword arguments? > >>>> def f(*args, method='approx'): > SyntaxError: invalid syntax > unless this changes in newer versions of python > > In statsmodels I either require a tuple instead of *args, or stack the > data into a long format. (I don't use *args for the main required > arguments.) this is a follow-up to PR https://github.com/scipy/scipy/pull/3183 Josef > > Josef From warren.weckesser at gmail.com Fri Jan 3 16:02:15 2014 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Fri, 3 Jan 2014 16:02:15 -0500 Subject: [SciPy-Dev] signature for k-sample tests ??? 
In-Reply-To: References: Message-ID: On Fri, Jan 3, 2014 at 3:49 PM, wrote: > I don't like it, but I don't care. Opinions please. > > The inherited pattern in scipy.stats for k-sample tests is to use *args > > anderson_ksamp(*args) > f_oneway(*args) > obrientransform(*args) > > what do we do if we want to add keyword arguments? > > >>> def f(*args, method='approx'): > SyntaxError: invalid syntax > unless this changes in newer versions of python > > In statsmodels I either require a tuple instead of *args, or stack the > data into a long format. (I don't use *args for the main required > arguments.) > An option that maintains the scipy API is to define the function as def f(*args, **kwargs): and then add code to check that any given keyword arguments are valid. In this case, check that the only key in kwargs is 'method'. Warren > Josef > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Jan 3 16:18:46 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 3 Jan 2014 16:18:46 -0500 Subject: [SciPy-Dev] signature for k-sample tests ??? In-Reply-To: References: Message-ID: On Fri, Jan 3, 2014 at 4:02 PM, Warren Weckesser wrote: > On Fri, Jan 3, 2014 at 3:49 PM, wrote: >> >> I don't like it, but I don't care. Opinions please. >> >> The inherited pattern in scipy.stats for k-sample tests is to use *args >> >> anderson_ksamp(*args) >> f_oneway(*args) >> obrientransform(*args) >> >> what do we do if we want to add keyword arguments? >> >> >>> def f(*args, method='approx'): >> SyntaxError: invalid syntax >> unless this changes in newer versions of python >> >> In statsmodels I either require a tuple instead of *args, or stack the >> data into a long format. (I don't use *args for the main required >> arguments.) > > > > An option that maintains the scipy API is to define the function as > > def f(*args, **kwargs): > > and then add code to check that any given keyword arguments are valid. In > this case, check that the only key in kwargs is 'method'. I was thinking about that as the only solution that follows the scipy pattern. However, ugly and not informative. But I don't think that the scipy functions will get lots of keywords, so still bearable. In the anderson_ksamp case there might be "eventually" an option to choose the p-value calculation, which would end up in the **kwargs. In one of my versions of anova/f_oneway I have variance and trimming options IIRC. Josef > > Warren > > >> >> Josef >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From ralf.gommers at gmail.com Sat Jan 4 16:18:07 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 4 Jan 2014 22:18:07 +0100 Subject: [SciPy-Dev] signature for k-sample tests ??? In-Reply-To: References: Message-ID: On Fri, Jan 3, 2014 at 10:18 PM, wrote: > On Fri, Jan 3, 2014 at 4:02 PM, Warren Weckesser > wrote: > > On Fri, Jan 3, 2014 at 3:49 PM, wrote: > >> > >> I don't like it, but I don't care. Opinions please. 
> >> > >> The inherited pattern in scipy.stats for k-sample tests is to use *args > >> > >> anderson_ksamp(*args) > >> f_oneway(*args) > >> obrientransform(*args) > >> > >> what do we do if we want to add keyword arguments? > >> > >> >>> def f(*args, method='approx'): > >> SyntaxError: invalid syntax > >> unless this changes in newer versions of python > >> > >> In statsmodels I either require a tuple instead of *args, or stack the > >> data into a long format. (I don't use *args for the main required > >> arguments.) > > > > > > > > An option that maintains the scipy API is to define the function as > > > > def f(*args, **kwargs): > > > > and then add code to check that any given keyword arguments are valid. > In > > this case, check that the only key in kwargs is 'method'. > > I was thinking about that as the only solution that follows the scipy > pattern. > > However, ugly and not informative. > Agreed. > But I don't think that the scipy functions will get lots of keywords, > so still bearable. > Bearable, but for new functions I'd rather have def f(arrays, kw1=None, ...) where arrays is a sequence of arrays (or a single ndarray if that makes sense). These tests don't have *args uniformly now, so I don't think it's required for anderson_ksamp to copy it. Ralf > > In the anderson_ksamp case there might be "eventually" an option to > choose the p-value calculation, which would end up in the **kwargs. > In one of my versions of anova/f_oneway I have variance and trimming > options IIRC. -------------- next part -------------- An HTML attachment was scrubbed... URL: From nouiz at nouiz.org Mon Jan 6 16:45:03 2014 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Mon, 6 Jan 2014 16:45:03 -0500 Subject: [SciPy-Dev] conv2 and normxcorr2 In-Reply-To: References: <52c2bbf2.c57bcc0a.61c6.4ceb@mx.google.com> <52c2d968.0937cc0a.334f.5101@mx.google.com> Message-ID: Hi, I implemented faster CPU convolution in Theano. The code isn't easily portable... So here is the optimization that I recall where important. 1) In the inner loop, there is an if. This is very bad speed wise. We can replace the inner loop with if with 3 consecutive inner loop. One for before the image (when we pad with 0 or something else). The second for when we are in the image and the last for after. In the valid mode, the first and last loop will be empty. 2) Don't copy data! This is very slow and not needed in many cases. 3) Don't use a jump table to function call to just do an multiplication. This is used to make it work for all dtype. It need to have different code path for each dtype. Doing a pseudo-function call to just do a multiplication is very slow. 4) do some type of unrolling If someone want to see the part of Theano code that could be the more portable is this: https://github.com/Theano/Theano/blob/master/theano/tensor/nnet/conv.py#L2167 It do those 4 optimizations and I point to just the code that do the computation. So this should be redable by people knowing numpy c-api. We add a paper that compare this implementation to the scipy one. From memory it was 100x faster... but for neural network, as it also generalize this code to do more then one convolution per call. That is why there is other loop before line 2167. Also, the parallel version don't always speed up, so I disabled it by default in Theano. It need test to disable it when the shape are too small. If someone look at this and have questions, I can answer them. 
HTH Fred On Tue, Dec 31, 2013 at 10:50 AM, Ralf Gommers wrote: > > > > On Tue, Dec 31, 2013 at 4:07 PM, Luke Pfister > wrote: >> >> I *believe* that Matlab is calling either the Intel MKL or Intel IPP >> convolution routines, which is why they are so much faster. >> >> I ran into a situation where I needed to perform many, many small 2D >> convolutions, and wound up writing a Cython wrapper to call the IPP >> convolution. I seem to remember getting speedups of ~200x when >> convolving an 8x8 kernel with a 512x512 image. >> >> I'm not familiar with how the Scipy convolution functions are >> implemented under the hood. Do they use efficient algorithms for >> small convolution sizes (ie, overlap-add, overlap-save)? > > > It looks like the implementation is very straightforward and could benefit > from some optimization: > Convolve2d: > > https://github.com/scipy/scipy/blob/master/scipy/signal/sigtoolsmodule.c#L1006 > https://github.com/scipy/scipy/blob/master/scipy/signal/firfilter.c#L84 > And correlate2d just calls convolve2d: > > https://github.com/scipy/scipy/blob/master/scipy/signal/signaltools.py#L503 > > >> >> >> -- >> Luke >> >> On Tue, Dec 31, 2013 at 8:49 AM, Aaron Webster >> wrote: >> > On Tue, Dec 31, 2013 at 2:42 PM, Ralf Gommers >> > wrote: >> >> >> >> >> >> >> >> On Tue, Dec 31, 2013 at 1:43 PM, awebster at falsecolour.com >> >> wrote: >> >>> I noticed a couple of popular matlab functions - conv2 and >> >>> normxcorr2 were not present in the scipy.signal packages. I would >> >>> like to submit them for addition. Can anyone point me on >> >>> instructions on how to write such a thing? Below are examples. >> >>> >> >> >> >> Hi Aaron, isn't conv2 the same as signal.convolve2d? And can what >> >> normxcorr2 does be done with signal.correlate2d? >> >> >> > I did a quick test and it seems that you are correct: signal.convolve2d >> > appears to generate basically the same output as conv2, and following >> > normxcorr2 can be done with signal.correlate2d. However, I noticed >> > while doing this that both signal.convolve2d and signal.correlate2d are >> > *extremely* slow. For example, on my computer with a random 100x100 >> > matrix signal.correlate2d takes 4.73 seconds while normxcorr2 take >> > 0.253 seconds. The results are similar for signal.convolve2d and conv2. >> > >> > As a practical matter, would it make most sense to fix >> > signal.correlate2d and signal.convolve2d, or implement new functions? > > > Speeding up the existing function would be preferable. firfilter.c already > contains a suggestion on how to do that. > > Ralf > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From awebster at falsecolour.com Tue Jan 7 00:32:50 2014 From: awebster at falsecolour.com (Aaron Webster) Date: Tue, 7 Jan 2014 06:32:50 +0100 Subject: [SciPy-Dev] conv2 and normxcorr2 In-Reply-To: References: <52c2bbf2.c57bcc0a.61c6.4ceb@mx.google.com> <52c2d968.0937cc0a.334f.5101@mx.google.com> Message-ID: Do you know the reason why the convolutions here aren't computed using it's Fourier transform property? It seems like this is an obvious path to take advantage of existing code and speed. On Mon, Jan 6, 2014 at 10:45 PM, Fr?d?ric Bastien wrote: > Hi, > > I implemented faster CPU convolution in Theano. The code isn't easily > portable... So here is the optimization that I recall where important. > > 1) In the inner loop, there is an if. This is very bad speed wise. 
We > can replace the inner loop with if with 3 consecutive inner loop. One > for before the image (when we pad with 0 or something else). The > second for when we are in the image and the last for after. In the > valid mode, the first and last loop will be empty. > > 2) Don't copy data! This is very slow and not needed in many cases. > > 3) Don't use a jump table to function call to just do an > multiplication. This is used to make it work for all dtype. It need to > have different code path for each dtype. Doing a pseudo-function call > to just do a multiplication is very slow. > > 4) do some type of unrolling > > If someone want to see the part of Theano code that could be the more > portable is this: > > > https://github.com/Theano/Theano/blob/master/theano/tensor/nnet/conv.py#L2167 > > It do those 4 optimizations and I point to just the code that do the > computation. So this should be redable by people knowing numpy c-api. > We add a paper that compare this implementation to the scipy one. From > memory it was 100x faster... but for neural network, as it also > generalize this code to do more then one convolution per call. That is > why there is other loop before line 2167. Also, the parallel version > don't always speed up, so I disabled it by default in Theano. It need > test to disable it when the shape are too small. > > If someone look at this and have questions, I can answer them. > > HTH > > Fred > > > > On Tue, Dec 31, 2013 at 10:50 AM, Ralf Gommers > wrote: > > > > > > > > On Tue, Dec 31, 2013 at 4:07 PM, Luke Pfister > > wrote: > >> > >> I *believe* that Matlab is calling either the Intel MKL or Intel IPP > >> convolution routines, which is why they are so much faster. > >> > >> I ran into a situation where I needed to perform many, many small 2D > >> convolutions, and wound up writing a Cython wrapper to call the IPP > >> convolution. I seem to remember getting speedups of ~200x when > >> convolving an 8x8 kernel with a 512x512 image. > >> > >> I'm not familiar with how the Scipy convolution functions are > >> implemented under the hood. Do they use efficient algorithms for > >> small convolution sizes (ie, overlap-add, overlap-save)? > > > > > > It looks like the implementation is very straightforward and could > benefit > > from some optimization: > > Convolve2d: > > > > > https://github.com/scipy/scipy/blob/master/scipy/signal/sigtoolsmodule.c#L1006 > > > https://github.com/scipy/scipy/blob/master/scipy/signal/firfilter.c#L84 > > And correlate2d just calls convolve2d: > > > > > https://github.com/scipy/scipy/blob/master/scipy/signal/signaltools.py#L503 > > > > > >> > >> > >> -- > >> Luke > >> > >> On Tue, Dec 31, 2013 at 8:49 AM, Aaron Webster < > awebster at falsecolour.com> > >> wrote: > >> > On Tue, Dec 31, 2013 at 2:42 PM, Ralf Gommers > > >> > wrote: > >> >> > >> >> > >> >> > >> >> On Tue, Dec 31, 2013 at 1:43 PM, awebster at falsecolour.com > >> >> wrote: > >> >>> I noticed a couple of popular matlab functions - conv2 and > >> >>> normxcorr2 were not present in the scipy.signal packages. I would > >> >>> like to submit them for addition. Can anyone point me on > >> >>> instructions on how to write such a thing? Below are examples. > >> >>> > >> >> > >> >> Hi Aaron, isn't conv2 the same as signal.convolve2d? And can what > >> >> normxcorr2 does be done with signal.correlate2d? 
> >> >> > >> > I did a quick test and it seems that you are correct: > signal.convolve2d > >> > appears to generate basically the same output as conv2, and following > >> > normxcorr2 can be done with signal.correlate2d. However, I noticed > >> > while doing this that both signal.convolve2d and signal.correlate2d > are > >> > *extremely* slow. For example, on my computer with a random 100x100 > >> > matrix signal.correlate2d takes 4.73 seconds while normxcorr2 take > >> > 0.253 seconds. The results are similar for signal.convolve2d and > conv2. > >> > > >> > As a practical matter, would it make most sense to fix > >> > signal.correlate2d and signal.convolve2d, or implement new functions? > > > > > > Speeding up the existing function would be preferable. firfilter.c > already > > contains a suggestion on how to do that. > > > > Ralf > > > > > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-dev > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -- Aaron Webster -------------- next part -------------- An HTML attachment was scrubbed... URL: From nouiz at nouiz.org Tue Jan 7 09:03:33 2014 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Tue, 7 Jan 2014 09:03:33 -0500 Subject: [SciPy-Dev] conv2 and normxcorr2 In-Reply-To: References: <52c2bbf2.c57bcc0a.61c6.4ceb@mx.google.com> <52c2d968.0937cc0a.334f.5101@mx.google.com> Message-ID: Using the FFT version need a conversion to from the FFT space. This take time. This make that using the FFT version is useful only for big convolution. For small convolution, the direct version is faster. But I never timed it and so I can't give any idea of what is small and big. This also depend of how efficient both version are. Comparing the current slow version in SciPy vs the FFT would make the FFT practically always faster. If someone have the time to compare the FFT version vs the Theano implementation, we could have an idea of the size for each case. Fred On Tue, Jan 7, 2014 at 12:32 AM, Aaron Webster wrote: > Do you know the reason why the convolutions here aren't computed using it's > Fourier transform property? It seems like this is an obvious path to take > advantage of existing code and speed. > > > On Mon, Jan 6, 2014 at 10:45 PM, Fr?d?ric Bastien wrote: >> >> Hi, >> >> I implemented faster CPU convolution in Theano. The code isn't easily >> portable... So here is the optimization that I recall where important. >> >> 1) In the inner loop, there is an if. This is very bad speed wise. We >> can replace the inner loop with if with 3 consecutive inner loop. One >> for before the image (when we pad with 0 or something else). The >> second for when we are in the image and the last for after. In the >> valid mode, the first and last loop will be empty. >> >> 2) Don't copy data! This is very slow and not needed in many cases. >> >> 3) Don't use a jump table to function call to just do an >> multiplication. This is used to make it work for all dtype. It need to >> have different code path for each dtype. Doing a pseudo-function call >> to just do a multiplication is very slow. 
>> >> 4) do some type of unrolling >> >> If someone want to see the part of Theano code that could be the more >> portable is this: >> >> >> https://github.com/Theano/Theano/blob/master/theano/tensor/nnet/conv.py#L2167 >> >> It do those 4 optimizations and I point to just the code that do the >> computation. So this should be redable by people knowing numpy c-api. >> We add a paper that compare this implementation to the scipy one. From >> memory it was 100x faster... but for neural network, as it also >> generalize this code to do more then one convolution per call. That is >> why there is other loop before line 2167. Also, the parallel version >> don't always speed up, so I disabled it by default in Theano. It need >> test to disable it when the shape are too small. >> >> If someone look at this and have questions, I can answer them. >> >> HTH >> >> Fred >> >> >> >> On Tue, Dec 31, 2013 at 10:50 AM, Ralf Gommers >> wrote: >> > >> > >> > >> > On Tue, Dec 31, 2013 at 4:07 PM, Luke Pfister >> > wrote: >> >> >> >> I *believe* that Matlab is calling either the Intel MKL or Intel IPP >> >> convolution routines, which is why they are so much faster. >> >> >> >> I ran into a situation where I needed to perform many, many small 2D >> >> convolutions, and wound up writing a Cython wrapper to call the IPP >> >> convolution. I seem to remember getting speedups of ~200x when >> >> convolving an 8x8 kernel with a 512x512 image. >> >> >> >> I'm not familiar with how the Scipy convolution functions are >> >> implemented under the hood. Do they use efficient algorithms for >> >> small convolution sizes (ie, overlap-add, overlap-save)? >> > >> > >> > It looks like the implementation is very straightforward and could >> > benefit >> > from some optimization: >> > Convolve2d: >> > >> > >> > https://github.com/scipy/scipy/blob/master/scipy/signal/sigtoolsmodule.c#L1006 >> > >> > https://github.com/scipy/scipy/blob/master/scipy/signal/firfilter.c#L84 >> > And correlate2d just calls convolve2d: >> > >> > >> > https://github.com/scipy/scipy/blob/master/scipy/signal/signaltools.py#L503 >> > >> > >> >> >> >> >> >> -- >> >> Luke >> >> >> >> On Tue, Dec 31, 2013 at 8:49 AM, Aaron Webster >> >> >> >> wrote: >> >> > On Tue, Dec 31, 2013 at 2:42 PM, Ralf Gommers >> >> > >> >> > wrote: >> >> >> >> >> >> >> >> >> >> >> >> On Tue, Dec 31, 2013 at 1:43 PM, awebster at falsecolour.com >> >> >> wrote: >> >> >>> I noticed a couple of popular matlab functions - conv2 and >> >> >>> normxcorr2 were not present in the scipy.signal packages. I would >> >> >>> like to submit them for addition. Can anyone point me on >> >> >>> instructions on how to write such a thing? Below are examples. >> >> >>> >> >> >> >> >> >> Hi Aaron, isn't conv2 the same as signal.convolve2d? And can what >> >> >> normxcorr2 does be done with signal.correlate2d? >> >> >> >> >> > I did a quick test and it seems that you are correct: >> >> > signal.convolve2d >> >> > appears to generate basically the same output as conv2, and following >> >> > normxcorr2 can be done with signal.correlate2d. However, I noticed >> >> > while doing this that both signal.convolve2d and signal.correlate2d >> >> > are >> >> > *extremely* slow. For example, on my computer with a random 100x100 >> >> > matrix signal.correlate2d takes 4.73 seconds while normxcorr2 take >> >> > 0.253 seconds. The results are similar for signal.convolve2d and >> >> > conv2. 
>> >> > >> >> > As a practical matter, would it make most sense to fix >> >> > signal.correlate2d and signal.convolve2d, or implement new functions? >> > >> > >> > Speeding up the existing function would be preferable. firfilter.c >> > already >> > contains a suggestion on how to do that. >> > >> > Ralf >> > >> > >> > _______________________________________________ >> > SciPy-Dev mailing list >> > SciPy-Dev at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-dev >> > >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > > > > > -- > Aaron Webster > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From amicitas at gmail.com Tue Jan 7 09:17:22 2014 From: amicitas at gmail.com (mir amicitas) Date: Tue, 7 Jan 2014 09:17:22 -0500 Subject: [SciPy-Dev] Add a "Wrapping Code From Other Languages" section to topical-software In-Reply-To: References: Message-ID: Dear Ralf, Thank you for your suggestions. I went ahead and made a new sub-section named "Wrapping MATLAB, R and C codes" and moved the relevant entries. I then put in a pull request. If anyone knows of anything else that should go into this section let me know and I'll add it in. P.S. Sorry for the long delay in responding. Traveling and Christmas break got in my way . . . Sincerely, Novimir On Mon, Dec 23, 2013 at 3:20 AM, Ralf Gommers wrote: > > > > On Mon, Dec 23, 2013 at 8:33 AM, mir amicitas wrote: > >> Dear Scipy-dev, >> >> I would like to add a new section to the topical-software page for >> wrapping code in Matlab, IDL etc. Some of these tools are currently listed >> in Miscellaneous, while they would naturally fall under "Running Code >> Written In Other Languages". As part of this I will add a new link to the >> package pyidlrpc (https://bitbucket.org/amicitas/pyidlrpc/) which can be >> used to wrap IDL code. >> >> My specific suggestion is to add a section under: >> "Running Code Written In Other Languages" >> >> With the title: >> "Wrapping/Executing Code in Other Languages" >> >> Under this heading I will add the MATLAB and IDL excecution/wrapping >> libraries. >> > > Hi Novimir, that topic could indeed use some attention. Your proposed > title is almost the same as > http://scipy.org/topical-software.html#running-code-written-in-other-languagesthough. Would it make sense to add separate a separate subsection under it > for Matlab/IDL/R? Collecting your new link, some Matlab links that are in > other places and RPy which is under "basic science". > > >> Please reply with any comments. Otherwise I will make a pull request >> with the above changes. >> > > A PR would be very welcome. > > Ralf > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at gmail.com Wed Jan 15 15:40:15 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 15 Jan 2014 21:40:15 +0100 Subject: [SciPy-Dev] 0.13.3 and 0.14.0 releases Message-ID: Hi all, It looks to me like we should do an 0.13.3 bugfix release soon to fix these two issues: - another memory leak in ndimage.label: https://github.com/scipy/scipy/issues/3148 - we broke weave.inline with Visual Studio: https://github.com/scipy/scipy/issues/3216 I propose to make this release within a week. For the 0.14.0 release I propose to branch this around Feb 23rd. That gives us a month to work through a decent part of the backlog of PRs. (plus I'm on holiday until the 15th, so earlier wouldn't work for me). Does that schedule work for everyone? Any more fixes that have to go into 0.13.3? Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Wed Jan 15 17:41:10 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 15 Jan 2014 23:41:10 +0100 Subject: [SciPy-Dev] 0.13.3 and 0.14.0 releases In-Reply-To: References: Message-ID: <52D70E86.2040701@googlemail.com> On 15.01.2014 21:40, Ralf Gommers wrote: > Hi all, > > It looks to me like we should do an 0.13.3 bugfix release soon to fix > these two issues: > - another memory leak in ndimage.label: > https://github.com/scipy/scipy/issues/3148 > - we broke weave.inline with Visual Studio: > https://github.com/scipy/scipy/issues/3216 > > I propose to make this release within a week. has it been tested with python3.4b2? I just tried with numpy 1.7.1 cython 0.19.2 and got a failure: test_interpnd.TestEstimateGradients2DGlobal.test_regression_2359 Traceback (most recent call last): File "/usr/lib/python3/dist-packages/nose/case.py", line 198, in runTest self.test(*self.arg) File "/usr/lib/python3/dist-packages/scipy/interpolate/tests/test_interpnd.py", line 144, in test_regression_2359 points = np.load(data_file('estimate_gradients_hang.npy')) File "/usr/lib/python3/dist-packages/numpy/lib/npyio.py", line 378, in load return format.read_array(fid) File "/usr/lib/python3/dist-packages/numpy/lib/format.py", line 440, in read_array shape, fortran_order, dtype = read_array_header_1_0(fp) File "/usr/lib/python3/dist-packages/numpy/lib/format.py", line 339, in read_array_header_1_0 raise ValueError(msg % (header, e)) ValueError: Cannot parse header: b"{'descr': '",) as neither the used numpy nor cython support python3.4 its probably an issue on of those two. I can check in more detail and up to date dependencies this weekend but if somebody already knows what is to blame let me know. From jtaylor.debian at googlemail.com Wed Jan 15 17:47:05 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 15 Jan 2014 23:47:05 +0100 Subject: [SciPy-Dev] 0.13.3 and 0.14.0 releases In-Reply-To: <52D70E86.2040701@googlemail.com> References: <52D70E86.2040701@googlemail.com> Message-ID: <52D70FE9.5080703@googlemail.com> On 15.01.2014 23:41, Julian Taylor wrote: > On 15.01.2014 21:40, Ralf Gommers wrote: >> Hi all, >> >> It looks to me like we should do an 0.13.3 bugfix release soon to fix >> these two issues: >> - another memory leak in ndimage.label: >> https://github.com/scipy/scipy/issues/3148 >> - we broke weave.inline with Visual Studio: >> https://github.com/scipy/scipy/issues/3216 >> >> I propose to make this release within a week. > > has it been tested with python3.4b2? 
> I just tried with numpy 1.7.1 cython 0.19.2 and got a failure: > > test_interpnd.TestEstimateGradients2DGlobal.test_regression_2359 > Exception: SyntaxError("Unsupported source construct: '_ast.NameConstant'>",) > > > as neither the used numpy nor cython support python3.4 its probably an > issue on of those two. > > I can check in more detail and up to date dependencies this weekend but > if somebody already knows what is to blame let me know. > probably this: https://github.com/numpy/numpy/pull/3701 so no issue in scipy. From pav at iki.fi Thu Jan 16 15:24:46 2014 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 16 Jan 2014 22:24:46 +0200 Subject: [SciPy-Dev] 0.13.3 and 0.14.0 releases In-Reply-To: References: Message-ID: 15.01.2014 22:40, Ralf Gommers kirjoitti: > It looks to me like we should do an 0.13.3 bugfix release soon to fix these > two issues: > - another memory leak in ndimage.label: > https://github.com/scipy/scipy/issues/3148 > - we broke weave.inline with Visual Studio: > https://github.com/scipy/scipy/issues/3216 > > I propose to make this release within a week. > > For the 0.14.0 release I propose to branch this around Feb 23rd. That gives > us a month to work through a decent part of the backlog of PRs. (plus I'm > on holiday until the 15th, so earlier wouldn't work for me). > > Does that schedule work for everyone? +1 sounds OK to me. > Any more fixes that have to go into 0.13.3? Nothing that I recall at the moment. I don't recall serious regressions. -- Pauli Virtanen From matthew.brett at gmail.com Mon Jan 20 07:24:47 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 20 Jan 2014 12:24:47 +0000 Subject: [SciPy-Dev] Symbol not found: __ZNSt8ios_base4InitD1Ev for scipy.sparse Message-ID: Hi, I am trying to install scipy master on OSX 10.9. I'm using export CC=clang export CXX=clang export FFLAGS=--ff2c from http://www.scipy.org/scipylib/building/macosx.html I built and installed numpy with these flags, then scipy. 
scipy installs: >>> scipy.__version__ '0.14.0.dev-6b18a3b' but then: >>> import scipy.sparse Traceback (most recent call last): File "", line 1, in File "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/__init__.py", line 206, in from .csr import * File "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/csr.py", line 13, in from .sparsetools import csr_tocsc, csr_tobsr, csr_count_blocks, \ File "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/__init__.py", line 5, in from .csr import * File "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/csr.py", line 26, in _csr = swig_import_helper() File "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/csr.py", line 22, in swig_import_helper _mod = imp.load_module('_csr', fp, pathname, description) File "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/imp.py", line 183, in load_module return load_dynamic(name, filename, file) ImportError: dlopen(/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/_csr.so, 2): Symbol not found: __ZNSt8ios_base4InitD1Ev Referenced from: /Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/_csr.so Expected in: flat namespace in /Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/_csr.so Here's clang version info: $ clang -v Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn) Target: x86_64-apple-darwin13.0.0 Thread model: posix I noticed a similar issue here: https://github.com/scipy/scipy/issues/3053 but I think I have clean install (into a virtualenv). Any hints as to where to look next? Cheers, Matthew From warren.weckesser at gmail.com Tue Jan 21 02:03:03 2014 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Tue, 21 Jan 2014 02:03:03 -0500 Subject: [SciPy-Dev] 0.13.3 and 0.14.0 releases In-Reply-To: References: Message-ID: On Wed, Jan 15, 2014 at 3:40 PM, Ralf Gommers wrote: > Hi all, > > It looks to me like we should do an 0.13.3 bugfix release soon to fix > these two issues: > - another memory leak in ndimage.label: > https://github.com/scipy/scipy/issues/3148 > - we broke weave.inline with Visual Studio: > https://github.com/scipy/scipy/issues/3216 > > I propose to make this release within a week. > > For the 0.14.0 release I propose to branch this around Feb 23rd. That > gives us a month to work through a decent part of the backlog of PRs. (plus > I'm on holiday until the 15th, so earlier wouldn't work for me). > > Does that schedule work for everyone? Any more fixes that have to go into > 0.13.3? > Sounds good to me. Warren > > Cheers, > Ralf > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Jan 22 14:33:21 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 22 Jan 2014 20:33:21 +0100 Subject: [SciPy-Dev] Symbol not found: __ZNSt8ios_base4InitD1Ev for scipy.sparse In-Reply-To: References: Message-ID: On Mon, Jan 20, 2014 at 1:24 PM, Matthew Brett wrote: > Hi, > > I am trying to install scipy master on OSX 10.9. 
> > I'm using > > export CC=clang > export CXX=clang > export FFLAGS=--ff2c > > from http://www.scipy.org/scipylib/building/macosx.html > > I built and installed numpy with these flags, then scipy. > > scipy installs: > > >>> scipy.__version__ > '0.14.0.dev-6b18a3b' > > but then: > > >>> import scipy.sparse > Traceback (most recent call last): > File "", line 1, in > File > "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/__init__.py", > line 206, in > from .csr import * > File > "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/csr.py", > line 13, in > from .sparsetools import csr_tocsc, csr_tobsr, csr_count_blocks, \ > File > "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/__init__.py", > line 5, in > from .csr import * > File > "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/csr.py", > line 26, in > _csr = swig_import_helper() > File > "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/csr.py", > line 22, in swig_import_helper > _mod = imp.load_module('_csr', fp, pathname, description) > File "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/imp.py", > line 183, in load_module > return load_dynamic(name, filename, file) > ImportError: > dlopen(/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/_csr.so, > 2): Symbol not found: __ZNSt8ios_base4InitD1Ev > Referenced from: > > /Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/_csr.so > Expected in: flat namespace > in > /Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/_csr.so > > Here's clang version info: > > $ clang -v > Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn) > Target: x86_64-apple-darwin13.0.0 > Thread model: posix > > I noticed a similar issue here: > > https://github.com/scipy/scipy/issues/3053 > > but I think I have clean install (into a virtualenv). > > Any hints as to where to look next? > I can't reproduce this with the same OS and Clang versions, XCode 5.0.1 and Homebrew Python 2.7. Do you get the same without a virtualenv? Maybe try ``git clean -xdf`` followed by an in-place build. Do you get this for Python 2.7 as well? And which XCode and Python (version + how installed)? Ralf > Cheers, > > Matthew > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Wed Jan 22 19:59:37 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 22 Jan 2014 16:59:37 -0800 Subject: [SciPy-Dev] Symbol not found: __ZNSt8ios_base4InitD1Ev for scipy.sparse In-Reply-To: References: Message-ID: Hi, On Wed, Jan 22, 2014 at 11:33 AM, Ralf Gommers wrote: > > > > On Mon, Jan 20, 2014 at 1:24 PM, Matthew Brett > wrote: >> >> Hi, >> >> I am trying to install scipy master on OSX 10.9. >> >> I'm using >> >> export CC=clang >> export CXX=clang >> export FFLAGS=--ff2c >> >> from http://www.scipy.org/scipylib/building/macosx.html >> >> I built and installed numpy with these flags, then scipy. 
>> >> scipy installs: >> >> >>> scipy.__version__ >> '0.14.0.dev-6b18a3b' >> >> but then: >> >> >>> import scipy.sparse >> Traceback (most recent call last): >> File "", line 1, in >> File >> "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/__init__.py", >> line 206, in >> from .csr import * >> File >> "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/csr.py", >> line 13, in >> from .sparsetools import csr_tocsc, csr_tobsr, csr_count_blocks, \ >> File >> "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/__init__.py", >> line 5, in >> from .csr import * >> File >> "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/csr.py", >> line 26, in >> _csr = swig_import_helper() >> File >> "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/csr.py", >> line 22, in swig_import_helper >> _mod = imp.load_module('_csr', fp, pathname, description) >> File "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/imp.py", >> line 183, in load_module >> return load_dynamic(name, filename, file) >> ImportError: >> dlopen(/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/_csr.so, >> 2): Symbol not found: __ZNSt8ios_base4InitD1Ev >> Referenced from: >> >> /Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/_csr.so >> Expected in: flat namespace >> in >> /Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/_csr.so >> >> Here's clang version info: >> >> $ clang -v >> Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn) >> Target: x86_64-apple-darwin13.0.0 >> Thread model: posix >> >> I noticed a similar issue here: >> >> https://github.com/scipy/scipy/issues/3053 >> >> but I think I have clean install (into a virtualenv). >> >> Any hints as to where to look next? > > > I can't reproduce this with the same OS and Clang versions, XCode 5.0.1 and > Homebrew Python 2.7. > > Do you get the same without a virtualenv? Maybe try ``git clean -xdf`` > followed by an in-place build. Do you get this for Python 2.7 as well? And > which XCode and Python (version + how installed)? Thanks for checking. I do get the same with an in-place python 2.7 build, after git clean -fxd, and I get the same in a virtualenv with python 2.7. For both python 3.3 and python 2.7 I'm using the python.org binaries installed from the binary dmg installers. [mb312 at Kerbin ~]$ python2.7 --version Python 2.7.6 [mb312 at Kerbin ~]$ python3.3 --version Python 3.3.2 [mb312 at Kerbin ~]$ xcodebuild -version Xcode 5.0.2 Build version 5A3005 Thanks, Matthew From warren.weckesser at gmail.com Sat Jan 25 15:24:18 2014 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Sat, 25 Jan 2014 15:24:18 -0500 Subject: [SciPy-Dev] API suggestions wanted for an enhancement to scipy.signal.filtfilt Message-ID: Hey all, I'm adding an option to `scipy.signal.filtfilt` that uses Gustaffson's method [1] to handle the edges of the data. In this method, initial conditions for the forward and backward passes of `lfilter` are chosen such that applying the filter first in the forward direction and then backward gives the same result as applying the filter backward and then forward. There is no padding applied to the edges. Gustaffson's method has one optional parameter. It is the estimate of the length of the impulse response of the filter (i.e. 
anything after this length is supposed to be negligible and is ignored). If it is not given, no truncation of the impulse response is done. The current signature of `filtfilt` is def filtfilt(b, a, x, axis=-1, padtype='odd', padlen=None) (See http://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.filtfilt.html ) The arguments `padtype` and `padlen` control the type ('odd', 'even', 'constant' or None) and length of the padding. Any suggestions for modifying the signature in a backwards-compatible way? Here are a few options I've considered: (1) Specify the use of Gustaffson's method with `padtype='gust'`, and specify the estimate of the impulse response length using `padlen`. (I don't like this version--there is no padding performed by Gustaffson's method; using `padlen` for the impulse response length is just wrong.) (2) Specify the use of Gustaffson's method with `padtype='gust'`, and specify the estimate of the impulse response with a new argument `irlen`. (A bit better than (1); I could live with using `padtype` to specify the method, even though it isn't actually padding the data.) New signature: def filtfilt(b, a, x, axis=-1, padtype='odd', padlen=None, irlen=None) (3) A new argument `method` specifies the method. It accepts either `'gust'` or `'pad'`. If the method is `'gust'`, the argument `irlen` specifies the impulse response length (and `padtype` and `padlen` are ignored). If the method is `'pad'`, `padtype` specifies the type of padding, `padlen` specifies the padding length (and `irlen` is ignored). The new signature would be: def filtfilt(b, a, x, axis=-1, padtype='odd', padlen=None, method='pad', irlen=None) (4) Don't touch `filtfilt`. Create a new function with the signature: def filtfilt_gust(b, a, x, axis=-1, irlen=None) Any suggestions? Any other APIs that would be preferable? Warren [1] F. Gustaffson. Determining the initial states in forward-backward filtering. Transactions on Signal Processing, 46(4):988-992, 1996. -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sat Jan 25 15:34:18 2014 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 25 Jan 2014 22:34:18 +0200 Subject: [SciPy-Dev] 64-bit sparse matrix indices Message-ID: Hi, The 32 & 64 support for sparse matrices is nearly finished, and essentially waiting for more testing & merging: https://github.com/scipy/scipy/pull/442 What this will do is that sparse matrices with nnz that fit into 32 bit use 32-bit index arrays, but those with larger nnz automatically switch into 64-bit indices. This means that e.g. csr_matrix.indices can be either int32 or int64. In most cases (for sparse matrices taking less than a few gigabytes of memory) it will be int32. Direct matrix inversion & sparse graph algorithms are currently not implemented for the int64 matrices. Iterative solvers and pure-Python algorithms probably work, however. *** Now would be a good moment to voice concerns on the approach, and to actually test it in real life if you happen to have big data and enough memory at hand (16 GB of memory is still a bit small to properly test it in real life). User code that relies on 32-bit indices (e.g. written in C/Cython) will continue to work for < 2GB matrices as previously, so that this change are mostly backward compatible. Suggested transition route for future is to either (i) implement your routines both in 32 and 64-bit integers, or, (ii) implement your routines in intp size, and cast input arrays when necessary. 
-- Pauli Virtanen From njs at pobox.com Sat Jan 25 15:58:00 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 25 Jan 2014 20:58:00 +0000 Subject: [SciPy-Dev] 64-bit sparse matrix indices In-Reply-To: References: Message-ID: On Sat, Jan 25, 2014 at 8:34 PM, Pauli Virtanen wrote: > Hi, > > The 32 & 64 support for sparse matrices is nearly finished, and > essentially waiting for more testing & merging: > > https://github.com/scipy/scipy/pull/442 > > What this will do is that sparse matrices with nnz that fit into 32 bit > use 32-bit index arrays, but those with larger nnz automatically switch > into 64-bit indices. > > This means that e.g. csr_matrix.indices can be either int32 or int64. In > most cases (for sparse matrices taking less than a few gigabytes of > memory) it will be int32. Does this also allow for sparse matrices with more than 2**32 rows or columns (which might have arbitrarily small nnz)? -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From pav at iki.fi Sat Jan 25 16:08:51 2014 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 25 Jan 2014 23:08:51 +0200 Subject: [SciPy-Dev] 64-bit sparse matrix indices In-Reply-To: References: Message-ID: 25.01.2014 22:58, Nathaniel Smith kirjoitti: > On Sat, Jan 25, 2014 at 8:34 PM, Pauli Virtanen wrote: >> The 32 & 64 support for sparse matrices is nearly finished, and >> essentially waiting for more testing & merging: >> >> https://github.com/scipy/scipy/pull/442 >> >> What this will do is that sparse matrices with nnz that fit into 32 bit >> use 32-bit index arrays, but those with larger nnz automatically switch >> into 64-bit indices. >> >> This means that e.g. csr_matrix.indices can be either int32 or int64. In >> most cases (for sparse matrices taking less than a few gigabytes of >> memory) it will be int32. > > Does this also allow for sparse matrices with more than 2**32 rows or > columns (which might have arbitrarily small nnz)? Yes, it should enable also that. -- Pauli Virtanen From yw5aj at virginia.edu Mon Jan 27 16:36:05 2014 From: yw5aj at virginia.edu (Yuxiang Wang) Date: Mon, 27 Jan 2014 16:36:05 -0500 Subject: [SciPy-Dev] An inconsistency in scipy.optimize.minimize Message-ID: Dear all, I have been using the wrapper in scipy.optimize, the minimize(), to call L-BFGS-B method. One option that I'd like to set, is called "epsilon", as shown in this page. http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fmin_l_bfgs_b.html However, by digging into the code, I realize that this value's name is called "eps" instead of "epsilon" in the minimize wrapper. That is to say, options = {'epsilon': 1e-3} will not work and says it is not recgonized, but options = {'eps': 1e-3} can work. I think this will cause future users a lot of confusions (not everyone will dig into the code to find out what it is called). Or... it could be me understanding something wrong as well. What do you guys think? 
-Shawn -- Yuxiang "Shawn" Wang Gerling Research Lab University of Virginia yw5aj at virginia.edu +1 (434) 284-0836 https://sites.google.com/a/virginia.edu/yw5aj/ From awebster at falsecolour.com Mon Jan 27 16:46:28 2014 From: awebster at falsecolour.com (Aaron Webster) Date: Mon, 27 Jan 2014 22:46:28 +0100 Subject: [SciPy-Dev] An inconsistency in scipy.optimize.minimize In-Reply-To: References: Message-ID: On Mon, Jan 27, 2014 at 10:36 PM, Yuxiang Wang wrote: > However, by digging into the code, I realize that this value's name is > called "eps" instead of "epsilon" in the minimize wrapper. That is to > say, "epsilon" seems to be the consistent naming convention; I suggest it be changed to that. Good observation. -- Aaron Webster From yw5aj at virginia.edu Mon Jan 27 16:55:39 2014 From: yw5aj at virginia.edu (Yuxiang Wang) Date: Mon, 27 Jan 2014 16:55:39 -0500 Subject: [SciPy-Dev] An inconsistency in scipy.optimize.minimize In-Reply-To: References: Message-ID: Aaron, Thanks for confirming! I agree that epsilon is better, as in the following functions "epsilon" instead of "eps" are used: scipy.optimize.fmin_cg scipy.optimize.fmin_ncg scipy.optimize.fmin_tnc scipy.optimize.fmin_bfgs scipy.optimize.fmin_l_bfgs_b -Shawn On Mon, Jan 27, 2014 at 4:46 PM, Aaron Webster wrote: > On Mon, Jan 27, 2014 at 10:36 PM, Yuxiang Wang wrote: >> However, by digging into the code, I realize that this value's name is >> called "eps" instead of "epsilon" in the minimize wrapper. That is to >> say, > > "epsilon" seems to be the consistent naming convention; I suggest it > be changed to that. > > Good observation. > > -- > Aaron Webster > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -- Yuxiang "Shawn" Wang Gerling Research Lab University of Virginia yw5aj at virginia.edu +1 (434) 284-0836 https://sites.google.com/a/virginia.edu/yw5aj/ From emanuele at relativita.com Tue Jan 28 11:04:48 2014 From: emanuele at relativita.com (Emanuele Olivetti) Date: Tue, 28 Jan 2014 17:04:48 +0100 Subject: [SciPy-Dev] faster pdist and cdist when metric='sqeuclidean' (pull-request) Message-ID: <52E7D520.7070503@relativita.com> Dear SciPy developers, I noticed that both scipy.spatial.pdist() and cdist() have pretty inefficient computation when metric='sqeuclidean'. In essence the current code computes the distances with metric='euclidean' and then - literally - does **2. So a lot of useless computation is done: there is no need to do sqrt() first and then **2. I've added an issue on the github repository, see: https://github.com/scipy/scipy/issues/3251 I've also prepared a pull-request to address this issue: https://github.com/scipy/scipy/pull/3252 which adds C functions for metric='sqeuclidean' and wraps them up till the module level (distance.py). The added code is (no surprise) almost identical to the case metric='euclidean', so this enhancement was pretty straighforward and mainly a (careful :)) cut-n-paste from previous code. The nice part is that there is a (at least) 2x speedup in the computation. Of course all tests that passed before, still pass now. There are no changes at the API-level, so there is no impact on the documentation. Comments are welcome. 
Best, Emanuele From emanuele at relativita.com Wed Jan 29 04:56:49 2014 From: emanuele at relativita.com (Emanuele Olivetti) Date: Wed, 29 Jan 2014 10:56:49 +0100 Subject: [SciPy-Dev] faster pdist and cdist when metric='sqeuclidean' (pull-request) In-Reply-To: <52E7D520.7070503@relativita.com> References: <52E7D520.7070503@relativita.com> Message-ID: <52E8D061.4040208@relativita.com> On 01/28/2014 05:04 PM, Emanuele Olivetti wrote: > > I've also prepared a pull-request to address this issue: > https://github.com/scipy/scipy/pull/3252 > The PR has been merged. Great! (Thanks Pauli) Best, Emanuele From matthew.brett at gmail.com Thu Jan 30 23:09:50 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 30 Jan 2014 20:09:50 -0800 Subject: [SciPy-Dev] Symbol not found: __ZNSt8ios_base4InitD1Ev for scipy.sparse In-Reply-To: References: Message-ID: Hi, On Wed, Jan 22, 2014 at 4:59 PM, Matthew Brett wrote: > Hi, > > On Wed, Jan 22, 2014 at 11:33 AM, Ralf Gommers wrote: >> >> >> >> On Mon, Jan 20, 2014 at 1:24 PM, Matthew Brett >> wrote: >>> >>> Hi, >>> >>> I am trying to install scipy master on OSX 10.9. >>> >>> I'm using >>> >>> export CC=clang >>> export CXX=clang >>> export FFLAGS=--ff2c >>> >>> from http://www.scipy.org/scipylib/building/macosx.html >>> >>> I built and installed numpy with these flags, then scipy. >>> >>> scipy installs: >>> >>> >>> scipy.__version__ >>> '0.14.0.dev-6b18a3b' >>> >>> but then: >>> >>> >>> import scipy.sparse >>> Traceback (most recent call last): >>> File "", line 1, in >>> File >>> "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/__init__.py", >>> line 206, in >>> from .csr import * >>> File >>> "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/csr.py", >>> line 13, in >>> from .sparsetools import csr_tocsc, csr_tobsr, csr_count_blocks, \ >>> File >>> "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/__init__.py", >>> line 5, in >>> from .csr import * >>> File >>> "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/csr.py", >>> line 26, in >>> _csr = swig_import_helper() >>> File >>> "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/csr.py", >>> line 22, in swig_import_helper >>> _mod = imp.load_module('_csr', fp, pathname, description) >>> File "/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/imp.py", >>> line 183, in load_module >>> return load_dynamic(name, filename, file) >>> ImportError: >>> dlopen(/Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/_csr.so, >>> 2): Symbol not found: __ZNSt8ios_base4InitD1Ev >>> Referenced from: >>> >>> /Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/_csr.so >>> Expected in: flat namespace >>> in >>> /Users/mb312/.virtualenvs/py33-sp-devel/lib/python3.3/site-packages/scipy/sparse/sparsetools/_csr.so >>> >>> Here's clang version info: >>> >>> $ clang -v >>> Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn) >>> Target: x86_64-apple-darwin13.0.0 >>> Thread model: posix >>> >>> I noticed a similar issue here: >>> >>> https://github.com/scipy/scipy/issues/3053 >>> >>> but I think I have clean install (into a virtualenv). >>> >>> Any hints as to where to look next? >> >> >> I can't reproduce this with the same OS and Clang versions, XCode 5.0.1 and >> Homebrew Python 2.7. >> >> Do you get the same without a virtualenv? 
Cheers,
Matthew

From ralf.gommers at gmail.com  Fri Jan 31 09:42:34 2014
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Fri, 31 Jan 2014 15:42:34 +0100
Subject: [SciPy-Dev] API suggestions wanted for an enhancement to scipy.signal.filtfilt
In-Reply-To:
References:
Message-ID:

On Sat, Jan 25, 2014 at 9:24 PM, Warren Weckesser
<warren.weckesser at gmail.com> wrote:

> Hey all,
>
> I'm adding an option to `scipy.signal.filtfilt` that uses Gustafsson's
> method [1] to handle the edges of the data. In this method, initial
> conditions for the forward and backward passes of `lfilter` are chosen
> such that applying the filter first in the forward direction and then
> backward gives the same result as applying the filter backward and
> then forward. No padding is applied to the edges.
>
> Gustafsson's method has one optional parameter: an estimate of the
> length of the impulse response of the filter (i.e. anything after this
> length is assumed to be negligible and is ignored). If it is not
> given, the impulse response is not truncated.
>
> The current signature of `filtfilt` is
>
>     def filtfilt(b, a, x, axis=-1, padtype='odd', padlen=None)
>
> (See
> http://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.filtfilt.html)
>
> The arguments `padtype` and `padlen` control the type ('odd', 'even',
> 'constant' or None) and length of the padding.
>
> Any suggestions for modifying the signature in a backwards-compatible
> way? Here are a few options I've considered:
>
> (1) Specify the use of Gustafsson's method with `padtype='gust'`, and
> specify the estimate of the impulse response length using `padlen`. (I
> don't like this version--there is no padding performed by Gustafsson's
> method; using `padlen` for the impulse response length is just wrong.)
>
> (2) Specify the use of Gustafsson's method with `padtype='gust'`, and
> specify the estimate of the impulse response with a new argument
> `irlen`. (A bit better than (1); I could live with using `padtype` to
> specify the method, even though it isn't actually padding the data.)
> New signature:
>
>     def filtfilt(b, a, x, axis=-1, padtype='odd', padlen=None, irlen=None)
>
> (3) A new argument `method` specifies the method. It accepts either
> `'gust'` or `'pad'`. If the method is `'gust'`, the argument `irlen`
> specifies the impulse response length (and `padtype` and `padlen` are
> ignored). If the method is `'pad'`, `padtype` specifies the type of
> padding and `padlen` specifies the padding length (and `irlen` is
> ignored). The new signature would be:
>
>     def filtfilt(b, a, x, axis=-1, padtype='odd', padlen=None,
>                  method='pad', irlen=None)
>
> (4) Don't touch `filtfilt`. Create a new function with the signature:
>
>     def filtfilt_gust(b, a, x, axis=-1, irlen=None)
>
> Any suggestions? Any other APIs that would be preferable?

My preference would be (3) or (2), in that order.
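For concreteness, option (3) would read like this at the call site (a
sketch of the proposed API only, not something in scipy yet; the
filter and data below are placeholders):

    import numpy as np
    from scipy.signal import butter, filtfilt

    b, a = butter(4, 0.125)    # placeholder 4th-order low-pass filter
    x = np.random.randn(200)   # placeholder data

    # Proposed: Gustafsson's method, truncating the impulse response
    # estimate at 50 samples.
    y_gust = filtfilt(b, a, x, method='gust', irlen=50)

    # Existing behaviour stays the default.
    y_pad = filtfilt(b, a, x)  # i.e. method='pad', padtype='odd'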
Ralf

> Warren
>
> [1] F. Gustafsson. Determining the initial states in forward-backward
> filtering. Transactions on Signal Processing, 46(4):988-992, 1996.

From Brian.Newsom at Colorado.EDU  Fri Jan 31 14:18:07 2014
From: Brian.Newsom at Colorado.EDU (Brian Lee Newsom)
Date: Fri, 31 Jan 2014 12:18:07 -0700
Subject: [SciPy-Dev] Notification of Pull Request
Message-ID:

Hey all,

Just submitted a PR; here is the included info:

Implement back end of faster multivariate integration

This PR implements my recent work on the back end of scipy.integrate,
adding:

1. A lower-level (C) method of handling multivariate functions so that
   they may be passed to the quadpack routines
2. C wrappers for all double-precision Fortran quadpack routines,
   supporting this function type
3. A simple additional wrapper that allows the wrappers to be
   incorporated into the existing SciPy integration, for testing
   purposes

1.) Currently SciPy's integration is inhibited in that its multivariate
integration requires nested calls between Python and Fortran, which are
slow. The intent of my work (which is not fully completed) is to move
the innermost-loop integration entirely to compiled languages, which
should be much faster.

The first step of my solution is the cwrapper.h file, which converts a
multivariate function of the form f(int nargs, double args[nargs]) to a
function of a single variable - the form expected by quadpack. This is
handled nearly exactly as it is currently done in Python: the original
function and any additional arguments are stored in global variables
and then used from inside a generic "call" function passed into
quadpack. An example function of this type would be:

    double f(int nargs, double args[nargs])
    {
        return args[0] * args[1] + args[2] * args[2];
    }

which is equivalent to f(x) = x*y + z*z (with y and z as extra
arguments), but now allows us to pass the number of extra arguments (2)
and their values in the args array so that they may be handled in the
Fortran integration.
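The pattern is perhaps easiest to see in Python, where the same trick
is used today (a minimal sketch of the idea only; the names here are
made up for illustration and are not those in cwrapper.h):

    # Module-level state standing in for the C globals.
    _func = None
    _extra_args = ()

    def _init(func, extra_args):
        """Stash the multivariate function and its fixed arguments."""
        global _func, _extra_args
        _func, _extra_args = func, extra_args

    def _call(x):
        """Single-variable trampoline: the form quadpack expects."""
        return _func(x, *_extra_args)

    # Integrate f(x, y, z) over x with y and z held fixed:
    _init(lambda x, y, z: x * y + z * z, (2.0, 3.0))
    _call(0.5)  # 0.5*2.0 + 3.0*3.0 == 10.0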
2.) This file additionally includes C wrappers for each of quadpack's
double-precision routines that handle this evaluation. I added wrappers
for every routine, for future use if necessary, though the current
scheme seems to use only about 6 of them. These can be removed if
superfluous. Between these two elements, the foundation is laid for
handling the n-dimensional integration at this lower level.

3.) Additionally, so that these wrappers may be tested and used, one
more wrapper has been added. It allows the currently handled Python and
ctypes functions to be evaluated in the f(nargs, args) form with
another simple initialization and call. These functions are
funcwrapper_init and funcwrapper, and they have been applied in the
__quadpack.h file inside each routine, so that the cwrapper is
exercised as it will be with future additions. These functions should
be removed once SciPy's integration module is adapted to use the
new-style wrappers, as described earlier.

What does this mean for the end user? As of now, nothing. No
functionality has changed: quad works exactly the same way as before,
passing all existing unit tests while going through my wrappers. It can
still handle ctypes or Python functions of any number of variables, and
this handling is still quite slow. Because of this, no additional
documentation has been included.

Why does this PR matter, then? This code adds a framework for doing
exactly what I described earlier: moving the handling of multiple
variables down into a compiled language. Once a framework is written
that allows ctypes or Cython (or both) functions of this new format to
be passed directly into the wrappers, this should provide a significant
speed advantage to the power user, without changing anything for the
traditional user.

I am a first-time contributor to SciPy but believe this will be an
important contribution, and I am open to any advice, criticism, or
comments on the code or process.

Thanks,
Brian Newsom

From mws at lionex.de  Fri Jan 31 16:40:02 2014
From: mws at lionex.de (Maximilian Singh)
Date: Fri, 31 Jan 2014 22:40:02 +0100
Subject: [SciPy-Dev] Add detrend=None for scipy.signal.welch et al.
Message-ID: <1391204402.13427.9.camel@mws-deb>

Hey all,

I suggest adding the possibility to turn off the 'detrend'
functionality in scipy.signal.welch and scipy.signal.periodogram,
e.g. by passing 'detrend=None' as a parameter. Although detrending is
a useful option, there are also use cases where DC should not be
suppressed. Do you agree that this is common enough to be added?

-Max
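For concreteness, the suggestion amounts to allowing a call like the
following (a sketch of the proposed spelling only - 'detrend=None' is
not an accepted value at the time of writing, and the signal here is a
placeholder):

    import numpy as np
    from scipy.signal import welch

    fs = 100.0
    t = np.arange(0, 10, 1 / fs)
    x = 2.5 + np.sin(2 * np.pi * 5 * t)  # a tone riding on a DC offset

    # Proposed: keep the DC component instead of detrending it away.
    f, Pxx = welch(x, fs=fs, detrend=None)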