From josef.pktd at gmail.com Fri Feb 1 01:20:41 2013
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 1 Feb 2013 01:20:41 -0500
Subject: [SciPy-User] norm.logpdf fails with float128?
In-Reply-To: References: Message-ID:

On Thu, Jan 31, 2013 at 11:55 PM, Jacob Biesinger wrote:
> Hi!
>
> I know float128 support is a bit sketchy. I've noticed the norm.pdf and
> norm.logpdf functions both choke when given float128 values. The weird
> thing is that the corresponding norm._pdf and norm._logpdf functions seem
> to work fine:
>
> # pdf function fails
>>>> m = sp.ones(5, dtype=sp.longdouble)
>>>> norm.pdf(m)
> ...
> TypeError: array cannot be safely cast to required type
>
>>>> norm.pdf(m, dtype=sp.longdouble)
> ...
> TypeError: array cannot be safely cast to required type
>
> # but the _pdf function works fine, with appropriate long-double precision
>>>> norm._pdf(m*100)
> array([ 1.3443135e-2172,  1.3443135e-2172,  1.3443135e-2172,
>         1.3443135e-2172,  1.3443135e-2172], dtype=float128)
>
> Is this expected behaviour? Perhaps a problem with the `place` command? If
> it's not expected, I'm happy to tinker and submit a pull request.

Is this recent behavior because numpy became more picky about downcasting
(which I welcome)?
I'm on Windows and cannot really check the behavior with float128.

http://projects.scipy.org/scipy/ticket/1718

the output array is defined as "d" float64
https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L1179

I never looked closely at what the behavior of the continuous
distributions is for different dtypes.

There is no direct casting of arguments.
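(A minimal illustration of the mechanism, my own sketch computing the normal
pdf by hand rather than through scipy: the computation keeps the longdouble
dtype, but writing the result into a hard-coded "d" output array is not a
"safe" cast.)

```python
import numpy as np

# The result of computing with longdouble input stays longdouble,
# but the public pdf allocates its output as "d" (float64).
x = np.ones(5, dtype=np.longdouble)
vals = np.exp(-x**2 / 2.0) / np.sqrt(2 * np.pi)   # the norm pdf, by hand
print(vals.dtype)
print(np.can_cast(vals.dtype, np.float64, casting='safe'))
# False where longdouble is extended precision, hence
# "array cannot be safely cast to required type"
```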
As you showed, the private, distribution-specific methods calculate with
whatever is given: lower precision if float32, or an exception if they
cannot handle it

>>> stats.norm._cdf(np.array([-1j, 0, 1, np.nan, np.inf]))
Traceback (most recent call last):
  File "", line 1, in
    stats.norm._cdf(np.array([-1j, 0, 1, np.nan, np.inf]))
  File "C:\Programs\Python33\lib\site-packages\scipy\stats\distributions.py", line 2032, in _cdf
    return _norm_cdf(x)
  File "C:\Programs\Python33\lib\site-packages\scipy\stats\distributions.py", line 2003, in _norm_cdf
    return special.ndtr(x)
TypeError: ufunc 'ndtr' not supported for the input types, and the
inputs could not be safely coerced to any supported types according to
the casting rule ''safe''

>>> stats.norm._pdf(np.array([-1j, 0, 1, np.nan, np.inf]))
array([ 0.65774462 -0.j,  0.39894228 +0.j,  0.24197072 +0.j,
              nan+nanj,         nan+nanj])

I don't know what we should do about it; some of the special functions
can only handle float64, and float32 I think.

1) status quo: let users decide, and raise an exception, as now, if
results cannot be cast to float64.

2) status quo with an explicit type check of the arguments: we can raise
an exception before doing the calculations.
We could return the output as float32 if the input is also float32.

3) match the output type to the input type (at least for floating point,
not integers): will work in some cases, like norm.pdf; will not work
(exception or implicit casting) if the supporting functions
(scipy.special, scipy.integrate) don't support it.
I don't know if all of them raise exceptions on downcasting.

Other options, like always casting to float64, go against the trend
toward being more explicit.

???
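For what it's worth, option 3 could be prototyped in user code today. A
rough sketch (`pdf_matching_dtype` is a hypothetical name; it leans on the
private norm._pdf, which does the standard-normal computation without the
hard-coded output array):

```python
import numpy as np
from scipy.stats import norm

def pdf_matching_dtype(x, loc=0.0, scale=1.0):
    # Option 3 sketch: keep the input's floating dtype end to end.
    x = np.asarray(x)
    dt = x.dtype if np.issubdtype(x.dtype, np.floating) else np.float64
    z = (x.astype(dt) - loc) / scale          # standardize by hand
    return (norm._pdf(z) / scale).astype(dt)  # norm._pdf preserves the dtype
```

On a non-Windows build this returns float128 for float128 input; no such
guarantee for distributions whose _pdf goes through scipy.special.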
Josef > > -- > Jake Biesinger > Graduate Student > Xie Lab, UC Irvine > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jake.biesinger at gmail.com Fri Feb 1 02:31:18 2013 From: jake.biesinger at gmail.com (Jacob Biesinger) Date: Thu, 31 Jan 2013 23:31:18 -0800 Subject: [SciPy-User] norm.logpdf fails with float128? In-Reply-To: References: Message-ID: I don't think it would solve the issues of having different dtypes supported differently by the various pdf functions, but would it be helpful to replace the hard coded 'd' data type of output with a kwargs.get('dtype', 'd') so the user could specify the output data type? On Jan 31, 2013 10:20 PM, wrote: > On Thu, Jan 31, 2013 at 11:55 PM, Jacob Biesinger > wrote: > > Hi! > > > > I know float128 support is a bit sketchy. I've noticed the norm.pdf and > > norm.logpdf functions both choke when given float128 values. The weird > > thing is the corresponding norm._pdf and norm._logpdf functions seem to > work > > fine: > > > > # pdf function fails > >>>> m = sp.ones(5, dtype=sp.longdouble) > >>>> norm.pdf(m) > > ... > > TypeError: array cannot be safely cast to required type > > > >>>> norm.pdf(m, dtype=sp.longdouble) > > ... > > TypeError: array cannot be safely cast to required type > > > > > > # but the _pdf function works fine, with appropriate > long-double-precision > >>>> norm._pdf(m*100) > > array([ 1.3443135e-2172, 1.3443135e-2172, 1.3443135e-2172, > > 1.3443135e-2172, 1.3443135e-2172], dtype=float128) > > > > Is this expected behaviour? Perhaps a problem with the `place` command? > If > > it's not expected, I'm happy to tinker and submit a pull request. > > Is this recent behavior because numpy became more picky on downcasting > (which I welcome)? > I'm on Windows and cannot really check the behavior with float128. 
> > http://projects.scipy.org/scipy/ticket/1718 > > the output array is defined as "d" float64 > > https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L1179 > > I never looked closely at what the behavior of the continuous > distributions for different dtypes. > > There is no direct casting of arguments. As you showed, the private, > distributions specific methods are calculating with whatever is given, > lower precision if float32, or an exception if it cannot handle it > > >>> stats.norm._cdf(np.array([-1j, 0, 1, np.nan, np.inf])) > Traceback (most recent call last): > File "", line 1, in > stats.norm._cdf(np.array([-1j, 0, 1, np.nan, np.inf])) > File > "C:\Programs\Python33\lib\site-packages\scipy\stats\distributions.py", > line 2032, in _cdf > return _norm_cdf(x) > File > "C:\Programs\Python33\lib\site-packages\scipy\stats\distributions.py", > line 2003, in _norm_cdf > return special.ndtr(x) > TypeError: ufunc 'ndtr' not supported for the input types, and the > inputs could not be safely coerced to any supported types according to > the casting rule ''safe'' > > >>> stats.norm._pdf(np.array([-1j, 0, 1, np.nan, np.inf])) > array([ 0.65774462 -0.j, 0.39894228 +0.j, 0.24197072 +0.j, > nan+nanj, nan+nanj]) > > > I don't know what we should do about it, some of the special functions > can only handle float64, and float32 I think. > > 1) status quo: let users decide, and raise an exception as now if > results cannot be cast to float64. > > 2) status quo with explicit type check of arguments: we can raise an > exception before doing the calculations. > We could return output as float32 if input is also float32 > > 3) match output type to input type (at least a floating point, no > integers): Will work in some cases, like norm.pdf, will not work > (exception or implicit casting) if the supporting functions > (scipy.special, scipy.integrate) don't support it. > I don't know if all of them raise exceptions with downcasting. 
> other versions like always cast to float64 goes against the trend to
> be more explicit.
>
> ???
>
> Josef
>
> >
> > --
> > Jake Biesinger
> > Graduate Student
> > Xie Lab, UC Irvine
> >
> > _______________________________________________
> > SciPy-User mailing list
> > SciPy-User at scipy.org
> > http://mail.scipy.org/mailman/listinfo/scipy-user
> >
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From fboulogne at sciunto.org Fri Feb 1 09:23:00 2013
From: fboulogne at sciunto.org (François Boulogne)
Date: Fri, 01 Feb 2013 15:23:00 +0100
Subject: [SciPy-User] convolve/deconvolve
Message-ID: <510BCFC4.1030507@sciunto.org>

Hi,

I need to deconvolve a signal with a filter. I had a look in the
documentation. The function exists, but the docstring is missing and I'm
not satisfied with the result I got from a "simple" example.

filter = np.array([0,1,1,1,1,0])
step = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1])
# I convolve both
res = convolve(step, filter, 'valid')
# and it returns a slope as expected
array([0, 0, 1, 2, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4])

Now, I want to deconvolve.
deconvolve(res, filter)
# oops, it raises an exception
ValueError: BUG: filter coefficient a[0] == 0 not supported yet

# well, let's try this
deconvolve(res, filter+1e-9)
(array([  0.00000000e+00,   0.00000000e+00,   1.00000000e+09,
         -9.99999999e+17,   9.99999999e+26,  -9.99999999e+35,
          9.99999999e+44,  -9.99999999e+53,   9.99999999e+62,
         -9.99999999e+71,   9.99999999e+80,  -9.99999999e+89]),
 array([  0.00000000e+00,   0.00000000e+00,   1.11022302e-16,
         -8.27130862e-08,  -4.42500000e+01,   5.46901335e+10,
          8.27266814e+19,   7.56858250e+28,  -8.74285726e+37,
          9.99419626e+46,   8.27205507e+55,  -8.26933326e+64,
          9.99999999e+89,   9.99999999e+89,   9.99999999e+89,
          1.00000000e+90,   9.99999999e+80]))

It's better, but I do not recognize my signal :)
1/ Am I misunderstanding or missing something?
2/ How can I do it correctly?

I also noted that no test exists for deconvolve() :(

Cheers,
François.

From josef.pktd at gmail.com Fri Feb 1 09:50:21 2013
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 1 Feb 2013 09:50:21 -0500
Subject: [SciPy-User] convolve/deconvolve
In-Reply-To: <510BCFC4.1030507@sciunto.org>
References: <510BCFC4.1030507@sciunto.org>
Message-ID:

On Fri, Feb 1, 2013 at 9:23 AM, François Boulogne wrote:
> Hi,
>
> I need to deconvolve a signal with a filter. I had a look in the
> documentation. The function exists but the docstring is missing and I'm
> not satisfied of the result I got from a "simple" example.
>
> filter = np.array([0,1,1,1,1,0])
> step = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1])
> # I convolve both
> res = convolve(step, filter, 'valid')
> # and it returns a slope as expected
> array([0, 0, 1, 2, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
>
> Now, I want to deconvolve.
> deconvolve(res, filter) > # oops, it raises an exception > ValueError: BUG: filter coefficient a[0] == 0 not supported yet > > # well, let's try this > deconvolve(res, filter+1e-9) > (array([ 0.00000000e+00, 0.00000000e+00, 1.00000000e+09, > -9.99999999e+17, 9.99999999e+26, -9.99999999e+35, > 9.99999999e+44, -9.99999999e+53, 9.99999999e+62, > -9.99999999e+71, 9.99999999e+80, -9.99999999e+89]), > array([ 0.00000000e+00, 0.00000000e+00, 1.11022302e-16, > -8.27130862e-08, -4.42500000e+01, 5.46901335e+10, > 8.27266814e+19, 7.56858250e+28, -8.74285726e+37, > 9.99419626e+46, 8.27205507e+55, -8.26933326e+64, > 9.99999999e+89, 9.99999999e+89, 9.99999999e+89, > 1.00000000e+90, 9.99999999e+80])) > > It's better but I do not recognize my signal :) > 1/ Am I misunderstanding or missing something? > 2/ How can I do it correctly? AFAICS: not supported, maybe using the numpy polynomial might work for the deconvolution from the docstring of lfilter, which is used by deconvolve: a : array_like The denominator coefficient vector in a 1-D sequence. If ``a[0]`` is not 1, then both `a` and `b` are normalized by ``a[0]``. with the normalization your 1e-9 blows up the calculations. (it's been a long time since I tried to figure out deconvolve, and I always had 1 in the first position) Josef > > I also noted that no test exists for deconvolve() :( > > Cheers, > Fran?ois. > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Fri Feb 1 09:50:59 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 1 Feb 2013 09:50:59 -0500 Subject: [SciPy-User] convolve/deconvolve In-Reply-To: References: <510BCFC4.1030507@sciunto.org> Message-ID: On Fri, Feb 1, 2013 at 9:50 AM, wrote: > On Fri, Feb 1, 2013 at 9:23 AM, Fran?ois Boulogne wrote: >> Hi, >> >> I need to deconvolve a signal with a filter. I had a look in the >> documentation. 
The function exists but the docstring is missing and I'm >> not satisfied of the result I got from a "simple" example. >> >> filter = np.array([0,1,1,1,1,0]) >> step = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, >> 1, 1, 1, 1]) >> # I convolve both >> res = convolve(step, filter, 'valid') >> # and it returns a slope as expected >> array([0, 0, 1, 2, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]) >> >> Now, I want to deconvolve. >> deconvolve(res, filter) >> # oops, it raises an exception >> ValueError: BUG: filter coefficient a[0] == 0 not supported yet >> >> # well, let's try this >> deconvolve(res, filter+1e-9) >> (array([ 0.00000000e+00, 0.00000000e+00, 1.00000000e+09, >> -9.99999999e+17, 9.99999999e+26, -9.99999999e+35, >> 9.99999999e+44, -9.99999999e+53, 9.99999999e+62, >> -9.99999999e+71, 9.99999999e+80, -9.99999999e+89]), >> array([ 0.00000000e+00, 0.00000000e+00, 1.11022302e-16, >> -8.27130862e-08, -4.42500000e+01, 5.46901335e+10, >> 8.27266814e+19, 7.56858250e+28, -8.74285726e+37, >> 9.99419626e+46, 8.27205507e+55, -8.26933326e+64, >> 9.99999999e+89, 9.99999999e+89, 9.99999999e+89, >> 1.00000000e+90, 9.99999999e+80])) >> >> It's better but I do not recognize my signal :) >> 1/ Am I misunderstanding or missing something? >> 2/ How can I do it correctly? > > AFAICS: > > not supported, maybe using the numpy polynomial might work for the deconvolution > > from the docstring of lfilter, which is used by deconvolve: > > a : array_like > The denominator coefficient vector in a 1-D sequence. If ``a[0]`` > is not 1, then both `a` and `b` are normalized by ``a[0]``. > > with the normalization your 1e-9 blows up the calculations. > > (it's been a long time since I tried to figure out deconvolve, and I > always had 1 in the first position) > > Josef > > >> >> I also noted that no test exists for deconvolve() :( Volunteers ? Josef >> >> Cheers, >> Fran?ois. 
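For the record, a quick sketch of a variant that does round-trip (my
example, not from the original post): trim the kernel's zero taps so that
a[0] != 0, and use 'full' convolution instead of 'valid' so no edge
information is lost.

```python
import numpy as np
from scipy.signal import convolve, deconvolve

step = np.array([0] * 6 + [1] * 16, dtype=float)
filt = np.array([0, 1, 1, 1, 1, 0], dtype=float)

# deconvolve (via lfilter) requires the first filter coefficient to be
# nonzero; the leading/trailing zeros only shift the result in time.
nz = np.flatnonzero(filt)
core = filt[nz[0]:nz[-1] + 1]        # array([1., 1., 1., 1.])

res = convolve(step, core)           # 'full' convolution
recovered, remainder = deconvolve(res, core)

print(np.allclose(recovered, step))  # True: exact recovery
print(np.allclose(remainder, 0.0))   # True
```

With 'valid' convolution the edges are discarded, so an exact inverse is
not possible in general; 'full' keeps the polynomial-division picture intact.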
>> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Fri Feb 1 10:01:55 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 1 Feb 2013 10:01:55 -0500 Subject: [SciPy-User] convolve/deconvolve In-Reply-To: References: <510BCFC4.1030507@sciunto.org> Message-ID: On Fri, Feb 1, 2013 at 9:50 AM, wrote: > On Fri, Feb 1, 2013 at 9:50 AM, wrote: >> On Fri, Feb 1, 2013 at 9:23 AM, Fran?ois Boulogne wrote: >>> Hi, >>> >>> I need to deconvolve a signal with a filter. I had a look in the >>> documentation. The function exists but the docstring is missing and I'm >>> not satisfied of the result I got from a "simple" example. >>> >>> filter = np.array([0,1,1,1,1,0]) >>> step = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, >>> 1, 1, 1, 1]) >>> # I convolve both >>> res = convolve(step, filter, 'valid') >>> # and it returns a slope as expected >>> array([0, 0, 1, 2, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]) >>> >>> Now, I want to deconvolve. >>> deconvolve(res, filter) >>> # oops, it raises an exception >>> ValueError: BUG: filter coefficient a[0] == 0 not supported yet >>> >>> # well, let's try this >>> deconvolve(res, filter+1e-9) >>> (array([ 0.00000000e+00, 0.00000000e+00, 1.00000000e+09, >>> -9.99999999e+17, 9.99999999e+26, -9.99999999e+35, >>> 9.99999999e+44, -9.99999999e+53, 9.99999999e+62, >>> -9.99999999e+71, 9.99999999e+80, -9.99999999e+89]), >>> array([ 0.00000000e+00, 0.00000000e+00, 1.11022302e-16, >>> -8.27130862e-08, -4.42500000e+01, 5.46901335e+10, >>> 8.27266814e+19, 7.56858250e+28, -8.74285726e+37, >>> 9.99419626e+46, 8.27205507e+55, -8.26933326e+64, >>> 9.99999999e+89, 9.99999999e+89, 9.99999999e+89, >>> 1.00000000e+90, 9.99999999e+80])) >>> >>> It's better but I do not recognize my signal :) >>> 1/ Am I misunderstanding or missing something? >>> 2/ How can I do it correctly? 
>> AFAICS:
>>
>> not supported, maybe using the numpy polynomial might work for the
>> deconvolution
>>
>> from the docstring of lfilter, which is used by deconvolve:
>>
>>     a : array_like
>>         The denominator coefficient vector in a 1-D sequence. If ``a[0]``
>>         is not 1, then both `a` and `b` are normalized by ``a[0]``.
>>
>> with the normalization your 1e-9 blows up the calculations.
>>
>> (it's been a long time since I tried to figure out deconvolve, and I
>> always had 1 in the first position)

Another problem in your case is the unit roots:

>>> np.roots(filter_)
array([ -1.00000000e+00+0.j,  -7.77156117e-16+1.j,
        -7.77156117e-16-1.j,   0.00000000e+00+0.j])

I don't remember whether lfilter supports those. Or if setting explicit
initial conditions would help. Or if deconvolution makes sense with a
nonstationary sequence (infinite ?).

Josef

>>
>> Josef
>>
>>>
>>> I also noted that no test exists for deconvolve() :(
>
> Volunteers ?
>
> Josef
>
>>>
>>> Cheers,
>>> François.
>>> _______________________________________________
>>> SciPy-User mailing list
>>> SciPy-User at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/scipy-user

From josef.pktd at gmail.com Fri Feb 1 10:16:39 2013
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 1 Feb 2013 10:16:39 -0500
Subject: [SciPy-User] norm.logpdf fails with float128?
In-Reply-To: References: Message-ID:

please reply at end of message or inline, this is a bottom-posting
mailing list

On Fri, Feb 1, 2013 at 2:31 AM, Jacob Biesinger wrote:
> I don't think it would solve the issues of having different dtypes supported
> differently by the various pdf functions, but would it be helpful to replace
> the hard coded 'd' data type of output with a kwargs.get('dtype', 'd') so
> the user could specify the output data type?

That would be another possibility

>
> On Jan 31, 2013 10:20 PM, wrote:
>>
>> On Thu, Jan 31, 2013 at 11:55 PM, Jacob Biesinger
>> wrote:
>> > Hi!
>> > >> > I know float128 support is a bit sketchy. I've noticed the norm.pdf and >> > norm.logpdf functions both choke when given float128 values. The weird >> > thing is the corresponding norm._pdf and norm._logpdf functions seem to >> > work >> > fine: >> > >> > # pdf function fails >> >>>> m = sp.ones(5, dtype=sp.longdouble) >> >>>> norm.pdf(m) >> > ... >> > TypeError: array cannot be safely cast to required type >> > >> >>>> norm.pdf(m, dtype=sp.longdouble) >> > ... >> > TypeError: array cannot be safely cast to required type >> > >> > >> > # but the _pdf function works fine, with appropriate >> > long-double-precision >> >>>> norm._pdf(m*100) >> > array([ 1.3443135e-2172, 1.3443135e-2172, 1.3443135e-2172, >> > 1.3443135e-2172, 1.3443135e-2172], dtype=float128) >> > >> > Is this expected behaviour? Perhaps a problem with the `place` command? >> > If >> > it's not expected, I'm happy to tinker and submit a pull request. >> >> Is this recent behavior because numpy became more picky on downcasting >> (which I welcome)? >> I'm on Windows and cannot really check the behavior with float128. >> >> http://projects.scipy.org/scipy/ticket/1718 >> >> the output array is defined as "d" float64 >> >> https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L1179 >> >> I never looked closely at what the behavior of the continuous >> distributions for different dtypes. >> >> There is no direct casting of arguments. 
As you showed, the private, >> distributions specific methods are calculating with whatever is given, >> lower precision if float32, or an exception if it cannot handle it >> >> >>> stats.norm._cdf(np.array([-1j, 0, 1, np.nan, np.inf])) >> Traceback (most recent call last): >> File "", line 1, in >> stats.norm._cdf(np.array([-1j, 0, 1, np.nan, np.inf])) >> File >> "C:\Programs\Python33\lib\site-packages\scipy\stats\distributions.py", >> line 2032, in _cdf >> return _norm_cdf(x) >> File >> "C:\Programs\Python33\lib\site-packages\scipy\stats\distributions.py", >> line 2003, in _norm_cdf >> return special.ndtr(x) >> TypeError: ufunc 'ndtr' not supported for the input types, and the >> inputs could not be safely coerced to any supported types according to >> the casting rule ''safe'' >> >> >>> stats.norm._pdf(np.array([-1j, 0, 1, np.nan, np.inf])) >> array([ 0.65774462 -0.j, 0.39894228 +0.j, 0.24197072 +0.j, >> nan+nanj, nan+nanj]) >> >> >> I don't know what we should do about it, some of the special functions >> can only handle float64, and float32 I think. >> >> 1) status quo: let users decide, and raise an exception as now if >> results cannot be cast to float64. >> >> 2) status quo with explicit type check of arguments: we can raise an >> exception before doing the calculations. >> We could return output as float32 if input is also float32 >> >> 3) match output type to input type (at least a floating point, no >> integers): Will work in some cases, like norm.pdf, will not work >> (exception or implicit casting) if the supporting functions >> (scipy.special, scipy.integrate) don't support it. >> I don't know if all of them raise exceptions with downcasting. 
4) add a dtype argument to the methods

implementation: the current *args, **kwds parsing would need to be
changed, since, with two exceptions, we rely on loc and scale being the
last arguments; Ralf is just cleaning this up a bit
https://github.com/scipy/scipy/pull/400

usability: for the usage in statsmodels, automatic (input-dependent)
casting would be more convenient. Since we cannot hardcode the dtype
(we would like different dtypes at different calls to the function), we
would have to choose the dtype case by case. It would be nicer if that
were in the methods directly.
edit: not really a problem, we could just do (..., dtype=x.dtype)

---
I think we should widen the supported dtypes, but we are too close to a
release to change this now, because the impact of any change is not so
easy to guess, and we might need more time to figure out how to handle it.

float128 would be useful to get higher precision in some of the methods,
especially when they are struggling numerically close to corner or
extreme cases. (Although it won't help me on Windows.)
But maybe we already have the precision in the actual calculations, we
just don't return it.

Just to get a better background:
What's your use case for using float128? Why don't you cast to float64?

The test suite has a list of distributions and example parameters that
makes it easy to loop over all distributions. As a relatively quick
check, you could run them on the private methods, _pdf and _cdf, and see
which return float128 again.
I have no idea about the number of cases that would work correctly;
there might be exceptions and implicit casting in some of the helper
functions.

Josef

>>
>> other versions like always cast to float64 goes against the trend to
>> be more explicit.
>>
>> ???
>> >> Josef >> >> >> > >> > -- >> > Jake Biesinger >> > Graduate Student >> > Xie Lab, UC Irvine >> > >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jkhilmer at chemistry.montana.edu Fri Feb 1 11:02:59 2013 From: jkhilmer at chemistry.montana.edu (jkhilmer at chemistry.montana.edu) Date: Fri, 1 Feb 2013 09:02:59 -0700 Subject: [SciPy-User] convolve/deconvolve In-Reply-To: References: <510BCFC4.1030507@sciunto.org> Message-ID: I have some old code for Richardson-Lucy deconvolution, although the method is so simple there's no reason not to try it yourself. Jonathan On Fri, Feb 1, 2013 at 7:50 AM, wrote: > On Fri, Feb 1, 2013 at 9:50 AM, wrote: > > On Fri, Feb 1, 2013 at 9:23 AM, Fran?ois Boulogne > wrote: > >> Hi, > >> > >> I need to deconvolve a signal with a filter. I had a look in the > >> documentation. The function exists but the docstring is missing and I'm > >> not satisfied of the result I got from a "simple" example. > >> > >> filter = np.array([0,1,1,1,1,0]) > >> step = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, > >> 1, 1, 1, 1]) > >> # I convolve both > >> res = convolve(step, filter, 'valid') > >> # and it returns a slope as expected > >> array([0, 0, 1, 2, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]) > >> > >> Now, I want to deconvolve. 
> >> deconvolve(res, filter) > >> # oops, it raises an exception > >> ValueError: BUG: filter coefficient a[0] == 0 not supported yet > >> > >> # well, let's try this > >> deconvolve(res, filter+1e-9) > >> (array([ 0.00000000e+00, 0.00000000e+00, 1.00000000e+09, > >> -9.99999999e+17, 9.99999999e+26, -9.99999999e+35, > >> 9.99999999e+44, -9.99999999e+53, 9.99999999e+62, > >> -9.99999999e+71, 9.99999999e+80, -9.99999999e+89]), > >> array([ 0.00000000e+00, 0.00000000e+00, 1.11022302e-16, > >> -8.27130862e-08, -4.42500000e+01, 5.46901335e+10, > >> 8.27266814e+19, 7.56858250e+28, -8.74285726e+37, > >> 9.99419626e+46, 8.27205507e+55, -8.26933326e+64, > >> 9.99999999e+89, 9.99999999e+89, 9.99999999e+89, > >> 1.00000000e+90, 9.99999999e+80])) > >> > >> It's better but I do not recognize my signal :) > >> 1/ Am I misunderstanding or missing something? > >> 2/ How can I do it correctly? > > > > AFAICS: > > > > not supported, maybe using the numpy polynomial might work for the > deconvolution > > > > from the docstring of lfilter, which is used by deconvolve: > > > > a : array_like > > The denominator coefficient vector in a 1-D sequence. If > ``a[0]`` > > is not 1, then both `a` and `b` are normalized by ``a[0]``. > > > > with the normalization your 1e-9 blows up the calculations. > > > > (it's been a long time since I tried to figure out deconvolve, and I > > always had 1 in the first position) > > > > Josef > > > > > >> > >> I also noted that no test exists for deconvolve() :( > > Volunteers ? > > Josef > > > >> > >> Cheers, > >> Fran?ois. > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jake.biesinger at gmail.com Fri Feb 1 13:44:22 2013 From: jake.biesinger at gmail.com (Jacob Biesinger) Date: Fri, 1 Feb 2013 10:44:22 -0800 Subject: [SciPy-User] norm.logpdf fails with float128? In-Reply-To: References: Message-ID: On Fri, Feb 1, 2013 at 7:16 AM, wrote: > please reply at end of message or inline, this is a bottom-posting > mailinglist > Sorry about that. The post was from my mobile during my (walking) commute. > >> I don't know what we should do about it, some of the special functions > >> can only handle float64, and float32 I think. > >> > >> 1) status quo: let users decide, and raise an exception as now if > >> results cannot be cast to float64. > >> > >> 2) status quo with explicit type check of arguments: we can raise an > >> exception before doing the calculations. > >> We could return output as float32 if input is also float32 > >> > >> 3) match output type to input type (at least a floating point, no > >> integers): Will work in some cases, like norm.pdf, will not work > >> (exception or implicit casting) if the supporting functions > >> (scipy.special, scipy.integrate) don't support it. > >> I don't know if all of them raise exceptions with downcasting. > > 4) add dtype argument to methods > > implementation: the current *args, **kwds parsing would need to be > changed, since with two exceptions, we rely on loc scale being the > last arguments, Ralph is just cleaning up this a bit > https://github.com/scipy/scipy/pull/400 > > usability: for the usage in statsmodels, automatic (input dependent) > casting would be more convenient. Since we cannot hardcode the dtype, > because we would like different dtypes at different calls to the > function, we would have to choose the dtype case dependent. Would be > nicer if that is in the methods directly. 
> edit: not really a problem, we could just do (..., dtype=x.dtype)
>

It would be nice to have both options: use x's dtype by default, but let
the user specify what the output dtype should be. Does the latter imply
that x, loc, and scale would be cast before the pdf/cdf/etc. operation?

>
> ---
> I think we should widen the supported dtypes, but I think that we are
> too close to a release to change this now because the impact of any
> change is not so easy to guess, and we might need more time to figure
> out how to handle it.
>

No problem for me. Since this is research code, I can just hack the
changes in temporarily.

>
> float128 would be useful to have higher precision in some of the
> methods, especially when they are struggling numerically close to
> corner or extreme cases. (Although it won't help me on Windows.)
> But maybe we have the precision already in the actual calculations, we
> just don't return it.
>

It seems like that's the case, at least for several of the norm functions.

>
> Just to get a better background
>
> What's your use case for using float128, why don't you cast to float64?
>

I have been using float64 in this code for a while. A change in
parameterization of a hefty graphical model to high-dimensional Gaussian
observed variables has led to underflow nightmares. I've tried several
normalization schemes, but they all eventually fail. Upgrading to
float128 lets the model refine quite a bit more, and that's all I need
in this code.

>
> The test suite has a list of distributions and example parameters that
> makes it easy to loop over all distributions. As a relatively quick
> check, you could run them on the private methods, _pdf _cdf, and see
> which return again float128.
> I have no idea about the number of cases that would work correctly,
> there might be exceptions and implicit casting in some of the helper
> functions.
>

Cool. I'll have a look and post back here.
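A quick survey loop might look like this (my sketch; the case list here is
hand-picked for illustration, the full list of distributions with example
shape parameters lives in scipy's stats test suite):

```python
import numpy as np
from scipy import stats

# A few continuous distributions with example shape parameters.
cases = [("norm", ()), ("expon", ()), ("gamma", (2.5,)), ("t", (4.0,))]

x = np.array([0.5, 1.0, 1.5], dtype=np.longdouble)
for name, shapes in cases:
    dist = getattr(stats, name)
    try:
        out = dist._pdf(x, *shapes)
        print(name, "->", out.dtype)
    except TypeError as exc:
        # special-function based implementations may refuse longdouble
        print(name, "-> TypeError:", exc)
```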
> > Josef > > >> > >> other versions like always cast to float64 goes against the trend to > >> be more explicit. > >> > >> ??? > >> > >> Josef > >> > >> > >> > > >> > -- > >> > Jake Biesinger > >> > Graduate Student > >> > Xie Lab, UC Irvine > >> > > >> > _______________________________________________ > >> > SciPy-User mailing list > >> > SciPy-User at scipy.org > >> > http://mail.scipy.org/mailman/listinfo/scipy-user > >> > > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Feb 1 14:03:19 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 1 Feb 2013 14:03:19 -0500 Subject: [SciPy-User] norm.logpdf fails with float128? In-Reply-To: References: Message-ID: On Fri, Feb 1, 2013 at 1:44 PM, Jacob Biesinger wrote: > On Fri, Feb 1, 2013 at 7:16 AM, wrote: >> >> please reply at end of message or inline, this is a bottom-posting >> mailinglist > > > Sorry about that. The post was from my mobile during my (walking) commute. > >> >> >> I don't know what we should do about it, some of the special functions >> >> can only handle float64, and float32 I think. >> >> >> >> 1) status quo: let users decide, and raise an exception as now if >> >> results cannot be cast to float64. >> >> >> >> 2) status quo with explicit type check of arguments: we can raise an >> >> exception before doing the calculations. 
>> >> We could return output as float32 if input is also float32 >> >> >> >> 3) match output type to input type (at least a floating point, no >> >> integers): Will work in some cases, like norm.pdf, will not work >> >> (exception or implicit casting) if the supporting functions >> >> (scipy.special, scipy.integrate) don't support it. >> >> I don't know if all of them raise exceptions with downcasting. >> >> 4) add dtype argument to methods >> >> implementation: the current *args, **kwds parsing would need to be >> changed, since with two exceptions, we rely on loc and scale being the >> last arguments; Ralf is just cleaning this up a bit >> https://github.com/scipy/scipy/pull/400 >> >> usability: for the usage in statsmodels, automatic (input dependent) >> casting would be more convenient. Since we cannot hardcode the dtype, >> because we would like different dtypes at different calls to the >> function, we would have to choose the dtype case by case. It would be >> nicer if that were in the methods directly. >> edit: not really a problem, we could just do (..., dtype=x.dtype) >> > > It would be nice to have both options: use x's dtype by default but let the > user specify what the output dtype should be. Does the latter imply that x, > loc, and scale would be cast before the pdf/cdf/et al. operation? Currently nothing is explicitly cast, except for the output array, as far as I can see and remember. One argument in favor of just using a dtype option is that we don't have to worry about which dtype to pick for the output if there are several different dtypes used in the arguments (which will often be the case). I thought for pdf and cdf we could cast according to x, but for the other methods (isf, ppf, ...) it's not obvious. 
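Option (3), matching the output dtype to the input, is essentially what plain numpy ufuncs already do on their own. A minimal sketch of that behavior (a hypothetical helper for illustration, not the scipy.stats API):

```python
import numpy as np

def norm_pdf_keep_dtype(x):
    # Hypothetical helper, not scipy.stats: numpy ufuncs preserve the
    # input's floating dtype, so longdouble in means longdouble out.
    x = np.asarray(x)
    return np.exp(-x * x / 2.0) / np.sqrt(2.0 * np.pi)
```

On platforms where np.longdouble is a true extended-precision type, this keeps the extra precision that norm._pdf shows above; the harder question in the thread is what to do for the methods (isf, ppf, ...) that go through scipy.special.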
> >> >> >> --- >> I think we should widen the supported dtypes, but I think that we are >> too close to a release to change this now because the impact of any >> change is not so easy to guess, and we might need more time to figure >> out how to handle it. > > > No problem for me. Since this is research code, I can just hack the changes > temporarily. > >> >> >> float128 would be useful to have higher precision in some of the >> methods, especially when they are struggling numerically close to >> corner or extreme cases. (Although it won't help me on Windows.) >> But maybe we have the precision already in the actual calculations, we >> just don't return it. > > > It seems like that's the case at least for several of the norm functions. > >> >> >> Just to get a better background >> >> What's your use case for using float128, why don't you cast to float64? > > > I have been using float64 in this code for a while. A change in > parameterization of a hefty graphical model to high-dimensional gaussian > observed variables has led to underflow nightmares. I've tried several > normalization schemes but they all eventually they fail. Upgrading to > float128's lets the model refine quite a bit more and that's all I need in > this code. Ok, that's the gain in numerical precision I would expect in many of the distributions. As a cautionary note: use loc and scale as kwds and not as positional arguments, because there might still be strange hard coded behavior in the argument parsing, especially if you add an extra keyword. > >> >> >> The test suite has a list of distributions and example parameters that >> makes it easy to loop over all distributions. As a relatively quick >> check, you could run them on the private methods, _pdf _cdf, and see >> which return again float128. >> I have no idea about the number of cases that would work correctly, >> there might be exceptions and implicit casting in some of the helper >> functions. > > > Cool. 
I'll have a look and post back here. Thanks, I'm curious about the result Josef > >> >> >> Josef >> >> >> >> >> other versions like always cast to float64 goes against the trend to >> >> be more explicit. >> >> >> >> ??? >> >> >> >> Josef >> >> >> >> >> >> > >> >> > -- >> >> > Jake Biesinger >> >> > Graduate Student >> >> > Xie Lab, UC Irvine >> >> > >> >> > _______________________________________________ >> >> > SciPy-User mailing list >> >> > SciPy-User at scipy.org >> >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > >> >> _______________________________________________ >> >> SciPy-User mailing list >> >> SciPy-User at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> > >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From cweisiger at msg.ucsf.edu Fri Feb 1 17:07:59 2013 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Fri, 1 Feb 2013 14:07:59 -0800 Subject: [SciPy-User] Help optimizing an algorithm In-Reply-To: <5ECF7AB6-5544-47FD-9107-A2ADFD08AB6F@yale.edu> References: <05182E1C-E5D6-41DF-93FC-FB1BB3028CE6@yale.edu> <640567B9-BDEE-43A9-B80F-9B3D55F347FF@yale.edu> <5ECF7AB6-5544-47FD-9107-A2ADFD08AB6F@yale.edu> Message-ID: On Thu, Jan 31, 2013 at 4:00 PM, Zachary Pincus wrote: > > Let's go back a few steps to make sure we're on the same page... You have > a series of flat-field images acquired at different exposure times, which > together define a per-pixel gain function, right? 
Then for each new image > you want to calculate the "effective exposure time" for the count at a > given pixel. Which is to say, the light input. Is this all correct? > > So for each pixel, you are estimating the gain function f(exposure) -> > value from your series of flat-field calibration images. > Because it's monotonic, you can invert this to g(value) -> exposure. > Then for any given value in an input image, you want to apply function g(). > Again, is this all correct? > > Yes, this is all correct. ... Instead let's resample the exposures and values to be uniform: > num_samples = 10 > vmin, vmax = values.min(), values.max() > uniform_values = numpy.linspace(vmin, vmax, num_samples) > uniform_exposures = numpy.interp(uniform_values, values, exposures) > > I think this is what I was missing: it's the function values that need to be uniformly-spaced, not the exposure times used to collect those values. Which in hindsight makes sense. I often have trouble wrapping my head around vectorized problems; I'm much more of a software engineer than a mathematician so this is a difficult area for me. Incidentally, I really appreciate your help! I understood the rest of the explanation, I'm pretty sure, and mocked up this vectorized version that appears to function properly: http://pastebin.com/4QjHUf37 I'd appreciate a more experienced (and, I suspect, more mentally awake!) look-over. And thanks again for your assistance! -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From pmhobson at gmail.com Fri Feb 1 18:07:52 2013 From: pmhobson at gmail.com (Paul Hobson) Date: Fri, 1 Feb 2013 15:07:52 -0800 Subject: [SciPy-User] Vectorizing scipy.optimize.curve_fit Message-ID: Hey folks, I've run into a bit of a roadblock. I've got some model runs (x) in an Nx2 array where the first column is the input, and the second column is the output. 
So in a single case, I'd do: popt, pcov = scipy.optimize.curve_fit(myFit, x[:,0], x[:,1]) But how should I handle, say, 5000 model runs such that x.shape = (5000, N, 2) and I want the 5000 results for popt? This works: popt_array = np.empty((5000, 2)) for r, layer in enumerate(x): popt, pcov = scipy.optimize.curve_fit(myFit, layer[:,0], layer[:,1]) popt_array[r] = popt But is there a better (faster) way? The number of model runs and data points may grow substantially (~10^4 runs and 10^3 data points). Thanks, -paul -------------- next part -------------- An HTML attachment was scrubbed... URL: From zachary.pincus at yale.edu Fri Feb 1 20:31:53 2013 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Fri, 1 Feb 2013 20:31:53 -0500 Subject: [SciPy-User] Help optimizing an algorithm In-Reply-To: References: <05182E1C-E5D6-41DF-93FC-FB1BB3028CE6@yale.edu> <640567B9-BDEE-43A9-B80F-9B3D55F347FF@yale.edu> <5ECF7AB6-5544-47FD-9107-A2ADFD08AB6F@yale.edu> Message-ID: > I often have trouble wrapping my head around vectorized problems; I'm much more of a software engineer than a mathematician so this is a difficult area for me. Incidentally, I really appreciate your help! I understood the rest of the explanation, I'm pretty sure, and mocked up this vectorized version that appears to function properly: > http://pastebin.com/4QjHUf37 > > I'd appreciate a more experienced (and, I suspect, more mentally awake!) look-over. And thanks again for your assistance! This looks reasonable enough. Note that you'll run into trouble if you have image pixels that are above or below the per-pixel min/max range. With the map_coordinates() 'mode' parameter set to 'constant' and cval=-1 as you have, this will break spectacularly if by chance a pixel winds up darker than in the zero-exposure calibration image... yet this will happen occasionally, since there's a statistical distribution of noise in the pixel readout. 
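The low-end failure mode just described is easy to see in miniature. A hedged sketch, with a made-up five-entry exposure table standing in for one pixel's inverted gain curve (none of this is Chris's actual code):

```python
import numpy as np
from scipy import ndimage

# Made-up lookup table: exposure time as a function of (fractional)
# table index for a single pixel.
table = np.array([0.0, 1.0, 2.0, 4.0, 8.0])

# A coordinate below 0 models a pixel that reads darker than the
# zero-exposure calibration image.
coords = [np.array([-0.3, 0.5, 3.5])]

# mode='constant' with cval=-1 poisons the out-of-range pixel with a
# negative, nonphysical "exposure"; mode='nearest' clamps it to the
# zero-exposure entry instead.
bad = ndimage.map_coordinates(table, coords, order=1, mode='constant', cval=-1.0)
good = ndimage.map_coordinates(table, coords, order=1, mode='nearest')
```

In-range coordinates interpolate identically under both modes; only the out-of-range pixel differs.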
Similarly, your calibration images probably don't (and definitely shouldn't!) contain saturated pixels, so you'll need to think about what to do when you get a pixel with a value in the region between that of the longest-exposure calibration image and 2**16-1. Probably to deal with the low-end case, you should use mode="nearest", so that randomly darker pixels still get assigned a zero-second exposure time, as opposed to -1 as currently. For the high-end case, you need to make sure that the index calculated isn't larger than the array, and if so, trigger an error condition. (Alternately, you could add an extra image to the stack with NAN values, so that any offending pixels get set to NAN to signal local error, but the rest of the image is still usable.) Zach From charlesr.harris at gmail.com Fri Feb 1 22:23:32 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 1 Feb 2013 20:23:32 -0700 Subject: [SciPy-User] Help optimizing an algorithm In-Reply-To: References: Message-ID: On Wed, Jan 30, 2013 at 10:29 AM, Chris Weisiger wrote: > We have a camera at our lab that has a nonlinear (but monotonic) response > to light. I'm attempting to linearize the data output by the camera. I'm > doing this by sampling the response curve of the camera, generating a > linear fit of the sample, and mapping new data to the linear fit by way of > the sample. In other words, we have the following functions: > One thing that I have seen with some CCD cameras is that the non-linearity is in the photon flux to current conversion, which means that the output from a fixed light source scales linearly with exposure time and the slopes scale non-linearly with focal plane irradiance. 
I don't know if that is the case for CMOS cameras but would be happy to find out ;) > f(x): the response curve of the camera (maps photon intensity to reported > counts by the camera) > g(x): an approximation of f(x), composed of line segments > h(x): a linear fit of g(x) > > We get a new pixel value Y in -- this is counts reported by the camera. We > invert g() to get the approximate photon intensity for that many counts. > And then we plug that photon intensity into the linear fit. > > Right now I believe I have a working algorithm, but it's very slow (which > in turn makes testing for validity slow), largely because inverting g() > involves iterating over each datapoint in the approximation to find the two > that bracket Y so that I can linearly interpolate between them. Having to > iterate over every pixel in the image in Python isn't doing me any favors > either; we typically deal with 528x512 images so that's 270k iterations per > image. > Try 4K x 4K. It helps if the data accessed in inner loops is laid out contiguously in memory. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
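Chuck's point about inner-loop data layout can be checked directly; a small sketch (the array size is illustrative):

```python
import numpy as np

# A transposed view walks memory with a stride of one full row, so inner
# loops over it touch non-adjacent addresses; a contiguous copy restores
# cache-friendly access at the cost of one copy.
img = np.zeros((1024, 1024), dtype=np.float32)
view = img.T                        # no copy, strided access
fixed = np.ascontiguousarray(view)  # contiguous copy of the same values
```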
URL: From charlesr.harris at gmail.com Fri Feb 1 22:47:37 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 1 Feb 2013 20:47:37 -0700 Subject: [SciPy-User] Help optimizing an algorithm In-Reply-To: References: <05182E1C-E5D6-41DF-93FC-FB1BB3028CE6@yale.edu> <640567B9-BDEE-43A9-B80F-9B3D55F347FF@yale.edu> Message-ID: On Thu, Jan 31, 2013 at 10:57 AM, Chris Weisiger wrote: > On Thu, Jan 31, 2013 at 9:39 AM, Zachary Pincus wrote: > >> I presume you've seen this article about some of the sCMOS cameras, but >> if not: >> >> http://www.microscopy-analysis.com/files/jwiley_microscopy/2012_January_Sabharwal.pdf >> >> They mention the dual amplifier gain issues, and also point out some >> potential trouble spots (toward the end in the "unexpected findings" >> section) with the low-gain amplifier at least for the (unidentified) camera >> they used. Worth knowing about... >> >> > I hadn't seen that; shame on me for not doing my due-diligence. However, > their plots look significantly worse than ours do, even if they're > cherry-picking bad pixels. For comparison, here's our 7 worst (most > nonlinear) pixels: > http://derakon.dyndns.org/~chriswei/temp2/badPixels.png > > And here's our sensor-wide average low-end nonlinearity (note that camera > baseline is 100 counts): > http://derakon.dyndns.org/~chriswei/temp2/lowEndToe.png > > So the light source is held constant here and only the integration time varied? Due to pipelining, it is possible that polynomial fits might be as fast as the linear splines you are using. In any case, a polynomial fit to the inverse function could be used to sample the output to input conversion at equally spaced output values and the result stored. With proper scaling you could then determine the table index and offset for the interpolation using divmod. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
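Chuck's table idea might be sketched as follows; the calibration arrays and sample count here are invented, and np.interp stands in for the polynomial fit of the inverse function he suggests:

```python
import numpy as np

# Invented monotonic calibration data for one pixel: camera counts as a
# function of exposure time.
exposures = np.array([0.0, 1.0, 2.0, 4.0, 8.0])
values = np.array([100.0, 180.0, 250.0, 370.0, 520.0])

# Sample the inverse map g(value) -> exposure at equally spaced counts.
num_samples = 64
vmin, vmax = values[0], values[-1]
dv = (vmax - vmin) / (num_samples - 1)
table = np.interp(vmin + dv * np.arange(num_samples), values, exposures)

def lookup(pixel_values):
    # With proper scaling, divmod splits each count into a table index
    # and a fractional offset for linear interpolation.
    scaled = (np.clip(pixel_values, vmin, vmax) - vmin) / dv
    whole, frac = np.divmod(scaled, 1.0)
    idx = whole.astype(int)
    top = idx > num_samples - 2      # exact top edge: keep idx + 1 in range
    idx[top] = num_samples - 2
    frac[top] = 1.0
    return table[idx] * (1.0 - frac) + table[idx + 1] * frac
```

The per-value cost is then a handful of vectorized operations instead of a search for the bracketing sample points.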
URL: From josef.pktd at gmail.com Fri Feb 1 23:54:20 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 1 Feb 2013 23:54:20 -0500 Subject: [SciPy-User] convolve/deconvolve In-Reply-To: References: <510BCFC4.1030507@sciunto.org> Message-ID: On Fri, Feb 1, 2013 at 11:02 AM, jkhilmer at chemistry.montana.edu wrote: > I have some old code for Richardson-Lucy deconvolution, although the method > is so simple there's no reason not to try it yourself. > > Jonathan > > > > On Fri, Feb 1, 2013 at 7:50 AM, wrote: >> >> On Fri, Feb 1, 2013 at 9:50 AM, wrote: >> > On Fri, Feb 1, 2013 at 9:23 AM, Fran?ois Boulogne >> > wrote: >> >> Hi, >> >> >> >> I need to deconvolve a signal with a filter. I had a look in the >> >> documentation. The function exists but the docstring is missing and I'm >> >> not satisfied of the result I got from a "simple" example. >> >> >> >> filter = np.array([0,1,1,1,1,0]) >> >> step = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, >> >> 1, 1, 1, 1]) >> >> # I convolve both >> >> res = convolve(step, filter, 'valid') >> >> # and it returns a slope as expected >> >> array([0, 0, 1, 2, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]) >> >> >> >> Now, I want to deconvolve. 
>> >> deconvolve(res, filter) >> >> # oops, it raises an exception >> >> ValueError: BUG: filter coefficient a[0] == 0 not supported yet >> >> >> >> # well, let's try this >> >> deconvolve(res, filter+1e-9) >> >> (array([ 0.00000000e+00, 0.00000000e+00, 1.00000000e+09, >> >> -9.99999999e+17, 9.99999999e+26, -9.99999999e+35, >> >> 9.99999999e+44, -9.99999999e+53, 9.99999999e+62, >> >> -9.99999999e+71, 9.99999999e+80, -9.99999999e+89]), >> >> array([ 0.00000000e+00, 0.00000000e+00, 1.11022302e-16, >> >> -8.27130862e-08, -4.42500000e+01, 5.46901335e+10, >> >> 8.27266814e+19, 7.56858250e+28, -8.74285726e+37, >> >> 9.99419626e+46, 8.27205507e+55, -8.26933326e+64, >> >> 9.99999999e+89, 9.99999999e+89, 9.99999999e+89, >> >> 1.00000000e+90, 9.99999999e+80])) >> >> >> >> It's better but I do not recognize my signal :) >> >> 1/ Am I misunderstanding or missing something? >> >> 2/ How can I do it correctly? >> > >> > AFAICS: >> > >> > not supported, maybe using the numpy polynomial might work for the >> > deconvolution just occured to me: leading zeros just shift the array, lets try to drop them, here's an almost roundtrip with "full" >>> filter_ array([0, 1, 1, 1, 1, 0]) >>> a = np.round(np.random.randn(22), 2) >>> res2 = signal.convolve(a, filter_, 'full') >>> s, z = signal.deconvolve(res2, filter_[1:]) #drop leading zero in filter >>> a array([-0.96, -0.99, 1.3 , 0.76, -1.1 , -0.3 , -0.51, 0.86, 1.53, -0.11, -0.21, 0.46, -0.48, -0.64, -0.85, 0.02, 0.21, -0.5 , 2.09, 1.47, -1.73, 0.01]) >>> s array([ 0. , -0.96, -0.99, 1.3 , 0.76, -1.1 , -0.3 , -0.51, 0.86, 1.53, -0.11, -0.21, 0.46, -0.48, -0.64, -0.85, 0.02, 0.21, -0.5 , 2.09, 1.47, -1.73, 0.01]) >>> len(a) 22 >>> len(s) 23 >>> np.allclose(a, s[1:]) True Josef >> > >> > from the docstring of lfilter, which is used by deconvolve: >> > >> > a : array_like >> > The denominator coefficient vector in a 1-D sequence. If >> > ``a[0]`` >> > is not 1, then both `a` and `b` are normalized by ``a[0]``. 
>> > >> > with the normalization your 1e-9 blows up the calculations. >> > >> > (it's been a long time since I tried to figure out deconvolve, and I >> > always had 1 in the first position) >> > >> > Josef >> > >> > >> >> >> >> I also noted that no test exists for deconvolve() :( >> >> Volunteers ? >> >> Josef >> >> >> >> >> >> Cheers, >> >> Fran?ois. >> >> _______________________________________________ >> >> SciPy-User mailing list >> >> SciPy-User at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From pierre at barbierdereuille.net Sat Feb 2 07:58:49 2013 From: pierre at barbierdereuille.net (Pierre Barbier de Reuille) Date: Sat, 2 Feb 2013 13:58:49 +0100 Subject: [SciPy-User] why is my scipy slow? In-Reply-To: References: Message-ID: In this example, 99% of the time is spent in the convolution operation. But you are comparing the generic nD convolution operation in python, against the specialized 1D convolution in Matlab. Using the 1D convolution of scipy (i.e. scipy.convolve), I get almost a factor 10 in speed on my machine. In general, I find that Matlab can be a bit (~10%) faster on pure matrix operations. However, the language is so much slower than python that any useful code ends up being quite a lot faster in python. The worst case is when dealing with irregular structures where cell-arrays would be used in Matlab and dictionnaries in python. The only times Matlab can still end up being faster, is if the JIT triggers and optimise the code correctly. In these cases, you might want to use Cython, which can improve your speed a lot sometimes just by compiling the python itself. 
Hope this helps, -- Barbier de Reuille Pierre On 1 February 2013 04:55, Warren Weckesser wrote: > > > On Thu, Jan 31, 2013 at 10:39 PM, John wrote: > >> Hello, >> >> I've been using scipy for a few weeks now and for the most part am >> thoroughly >> enjoying it! However I have been porting code from matlab and have been >> surprised by how much slower it is running under python. So much so >> that I >> suspect I must be doing something wrong. Below is an example. In matlab >> the >> doSomething() function takes 6.4ms. In python it takes 78ms, more than 10x >> slower. Does this seem right? Or am I missing something? I installed the >> Enthought distribution for Windows. Any advice is much appreciated! >> >> First in python: >> >> import time >> import scipy.signal >> >> def speedTest(): >> rep = 1000 >> tt = time.time() >> for i in range(rep): >> doSomething() >> print (time.time() - tt) / rep >> >> def doSomething(): >> lp = scipy.signal.firwin(16, 0.5) >> data = scipy.rand(100000) >> data = scipy.signal.convolve(data, lp) >> >> if __name__ == '__main__': >> speedTest() >> >> >> >> Now in matlab: >> >> function matlabSpeedTest() >> rep = 1000; >> tStart=tic; >> for j=1:rep >> doSomething(); >> end >> tElapsed=toc(tStart)/rep; >> str = sprintf('time %s', tElapsed); >> disp(str); >> end >> >> function data = doSomething() >> lp = fir1(16,0.5); >> data = rand(100000, 1, 'double'); >> data = conv(lp, data); >> end >> >> >> > There are several methods you can use to apply a FIR filter to a signal; > scipy.signal.convolve is actually one of the slowest. See > http://www.scipy.org/Cookbook/ApplyFIRFilter for a comparison of the > methods. 
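One of the faster methods from that Cookbook comparison is scipy.signal.lfilter; a hedged sketch showing it agrees with convolve for a FIR filter (sizes reduced from the original example):

```python
import numpy as np
from scipy import signal

# For a FIR filter b, lfilter(b, 1.0, x) computes the same moving sum as
# the full convolution of x with b, truncated to len(x).
rng = np.random.RandomState(0)
lp = signal.firwin(16, 0.5)
data = rng.rand(10000)

y_conv = signal.convolve(data, lp)[: len(data)]
y_filt = signal.lfilter(lp, 1.0, data)
```

Whether lfilter is actually faster here depends on sizes and the scipy version; the Cookbook page benchmarks the alternatives.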
> > Warren > > >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sat Feb 2 08:33:15 2013 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 02 Feb 2013 15:33:15 +0200 Subject: [SciPy-User] why is my scipy slow? In-Reply-To: References: Message-ID: 02.02.2013 14:58, Pierre Barbier de Reuille kirjoitti: > In this example, 99% of the time is spent in the convolution operation. > But you are comparing the generic nD convolution operation in python, > against the specialized 1D convolution in Matlab. Using the 1D > convolution of scipy (i.e. scipy.convolve), I get almost a factor 10 in > speed on my machine. However, to be fair, scipy.signal.convolve should automatically switch to the 1-D version of the algorithm, when possible. Patches are accepted. -- Pauli Virtanen From rnelsonchem at gmail.com Sat Feb 2 21:38:31 2013 From: rnelsonchem at gmail.com (Ryan Nelson) Date: Sat, 02 Feb 2013 21:38:31 -0500 Subject: [SciPy-User] Vectorizing scipy.optimize.curve_fit In-Reply-To: References: Message-ID: <510DCDA7.1090502@gmail.com> Hi Paul, I've played around with this for scipy.optimize.leastsq, and I've never been successful. I'm not an expert, but I don't think the original code was set up for that type of usage. (Just speculating here... An equally likely explanation is that I'm not smart enough to figure it out :) What I've found really useful for this type of problem is IPython's new parallel architecture (IPython version 0.13.1). 
It's easy to set up an IPython cluster and get things running in parallel through the notebook interface, so I've attached a simple notebook that does a trivial fitting (using leastsq) of some noisy data. Before you run the notebook, you'll need to set up a cluster through the Cluster tab in the IPython notebook dashboard. Unfortunately, in my experience I've found that the speed improvement is only noticeable until the number of IPython engines in your cluster equals the number of cores not the number of processor threads. (This might be obvious for those in the know, but it took me a while to figure out.) But every little bit of improvement helps, I suppose. Ryan On 2/1/2013 6:07 PM, Paul Hobson wrote: > Hey folks, > > I've run into a bit of a roadblock. I've got some model runs (x) in > an Nx2 array where the first column is the input, and the second > column is the output. So in a single case, I'd do: > > popt, pcov = scipy.optimize.curve_fit(myFit, x[:,0], x[:,1]) > > But how should I handle, say, 5000 model runs such that x.shape = > (500, N, 2) and I want the 5000 results for popt? > > This works: > popt_array = np.empty(5000, 2) > for r, layer in enumerate(model_runs): > popt, pcov = scipy.optimize.curve_fit(myFit, layer[:,0] layer[:,1]) > popt_array[r] = popt > > But is there a better (faster way)? The number of model runs and data > points may grow sustatially (~10^4 runs and 10^3 data points). > > Thanks, > -paul > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- { "metadata": { "name": "Parallel Opt" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Import all of the stuff we'll need into the current IPython kernel. 
The Client object will let us connect to our parallel IPython kernels, which were started from the Clusters tab on the IPython dashboard." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import scipy.optimize as spo\n", "from IPython.parallel import Client" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Set up our parallel client object, and make a direct view to the cluster kernels. Block the execution of the code in the main loop while the parallel code is running on the direct view kernels." ] }, { "cell_type": "code", "collapsed": false, "input": [ "c = Client()\n", "dview = c[:]\n", "dview.block = True" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The %%px magic sends all of the cell code to the parallel kernels. Here I've done a couple imports, defined a function for fitting (from a colleague), define a generic residual function, and set up a bunch of 'global' values that I'll need for each fit." ] }, { "cell_type": "code", "collapsed": false, "input": [ "%%px\n", "import numpy as np\n", "import scipy.optimize as spo\n", "\n", "def my_funct(x, h, csmax, Klf):\n", " numerator = csmax*(Klf*x)**h\n", " denominator = 1 + (Klf*x)**h\n", " return numerator/denominator\n", "\n", "def resid(parameters, y, x, funct):\n", " return y - funct(x, *parameters)\n", "\n", "true_values = np.array([3., 1., 5.]) # Actual values\n", "x_vals = np.array([0.01, 0.03, 0.07, 0.1, 0.2, 0.5, 0.8]) # x axis data\n", "cs_noise_global = my_funct(x_vals, *true_values) # Generate non-noisy data\n", "guesses = [0, 0, 0.1] # An initial guess for fitting" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here I'll define a function that does my fitting. 
In this case, I am simply adding random noise to my original data and then fitting it. I put a little test in there to make sure I don't get bad fit values.\n", "\n", "Using the my direct view 'map' function, I can run this function on each of the parallel kernels and collect the results as a Python list. The first argument is the function name, and the second is an array to use for each execution of the function. In my case, it's a simple array of ints, but I guess it could be a list of values for each fit. Added some timing info because I was testing things out. In my tests, I've noticed that the execution time decreases until you reach the number of cores on your machine, not the number of processor threads. I don't know why that is, and it might just be something I'm doing wrong." ] }, { "cell_type": "code", "collapsed": false, "input": [ "def tester(n):\n", " # Make an array of random numbers\n", " random_norm = np.random.randn( len(cs_noise_global) )\n", " # Use the random numbers to add noise to my clean data.\n", " cs_noise = cs_noise_global + cs_noise_global*0.07*random_norm\n", " # Fit the noisy data using the residual function defined above\n", " fit, error = spo.leastsq(resid, guesses, args=(cs_noise, x_vals, my_funct) )\n", " # If the fit was successful, return the fit results, or else return None. 
That\n", " # way it is pretty easy to loop through the output and drop the duds.\n", " if error in (1, 2, 3, 4):\n", " return fit\n", " else:\n", " return None\n", "\n", "%time vals = dview.map(tester, np.arange(5000))" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "len(vals)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] } ], "metadata": {} } ] } From josef.pktd at gmail.com Sun Feb 3 07:28:14 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 3 Feb 2013 07:28:14 -0500 Subject: [SciPy-User] Vectorizing scipy.optimize.curve_fit In-Reply-To: <510DCDA7.1090502@gmail.com> References: <510DCDA7.1090502@gmail.com> Message-ID: On Sat, Feb 2, 2013 at 9:38 PM, Ryan Nelson wrote: > Hi Paul, > > I've played around with this for scipy.optimize.leastsq, and I've never been > successful. I'm not an expert, but I don't think the original code was set > up for that type of usage. (Just speculating here... An equally likely > explanation is that I'm not smart enough to figure it out :) > > What I've found really useful for this type of problem is IPython's new > parallel architecture (IPython version 0.13.1). It's easy to set up an > IPython cluster and get things running in parallel through the notebook > interface, so I've attached a simple notebook that does a trivial fitting > (using leastsq) of some noisy data. Before you run the notebook, you'll need > to set up a cluster through the Cluster tab in the IPython notebook > dashboard. > > Unfortunately, in my experience I've found that the speed improvement is > only noticeable until the number of IPython engines in your cluster equals > the number of cores not the number of processor threads. (This might be > obvious for those in the know, but it took me a while to figure out.) 
But > every little bit of improvement helps, I suppose. > > Ryan > > > On 2/1/2013 6:07 PM, Paul Hobson wrote: > > Hey folks, > > I've run into a bit of a roadblock. I've got some model runs (x) in an Nx2 > array where the first column is the input, and the second column is the > output. So in a single case, I'd do: > > popt, pcov = scipy.optimize.curve_fit(myFit, x[:,0], x[:,1]) > > But how should I handle, say, 5000 model runs such that x.shape = (5000, N, > 2) and I want the 5000 results for popt? > > This works: > popt_array = np.empty((5000, 2)) > for r, layer in enumerate(x): > popt, pcov = scipy.optimize.curve_fit(myFit, layer[:,0], layer[:,1]) > popt_array[r] = popt > > But is there a better (faster) way? The number of model runs and data points > may grow substantially (~10^4 runs and 10^3 data points). I think there is no efficient way to avoid the loop. Besides going parallel: If you are only interested in popt, then using optimize.leastsq directly will save some calculations. The other major speedup for this kind of problem is to find good starting values. For example, if the solutions in the sequence are close to each other, then using previous solutions as starting values for the next case will speed up convergence. IIRC, in statsmodels we use an average of previous solutions, after some initial warm-up, in a similar problem. A moving average could also be useful. 
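Josef's two suggestions combined — calling optimize.leastsq directly and seeding each fit with the previous solution — might look like the following hedged sketch; myFit, the data, and the noise level are all invented:

```python
import numpy as np
from scipy import optimize

def myFit(x, a, b):
    # invented model, standing in for Paul's actual fit function
    return a * x + b * x ** 2

def resid(params, x, y):
    return y - myFit(x, *params)

rng = np.random.RandomState(42)
x = np.linspace(0.1, 1.0, 50)
true_params = np.array([2.0, -1.0])

popt = np.array([1.0, 0.0])   # initial guess, used only for the first run
popt_array = np.empty((20, 2))
for r in range(20):
    y = myFit(x, *true_params) + 0.01 * rng.randn(x.size)
    # warm start: the previous solution seeds this run's fit
    popt, ier = optimize.leastsq(resid, popt, args=(x, y))
    popt_array[r] = popt
```

When successive runs are similar, the warm start typically cuts the number of function evaluations per leastsq call; the loop itself remains.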
Josef > > Thanks, > -paul > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From scipy at samueljohn.de Sun Feb 3 12:18:47 2013 From: scipy at samueljohn.de (Samuel John) Date: Sun, 3 Feb 2013 18:18:47 +0100 Subject: [SciPy-User] scipy.test() fails for 0.11.0 on OS X 10.8.2 In-Reply-To: References: Message-ID: <98C1947F-1856-44D3-B9DB-E48E95FBD0F6@samueljohn.de> Thanks for mentioning my numpy/scipy formulae for Homebrew (on Mac), Will! But please note that I found the "LAPACK" part of OpenBLAS is actually slower than Accelerate.framework (tested with eigenvalue decomposition), because OpenBLAS does not optimize LAPACK and just ships unoptimized LAPACK. The BLAS part, however, is quite fast. To try my Homebrew "tap": brew tap samueljohn/python brew info numpy On 14.01.2013, at 18:31, Will wrote: > > Just a follow up in case anyone else is interested: > > The scipy homebrew formula available from Samuel John > (https://github.com/samueljohn/homebrew-python) > has a --with-openblas option that builds numpy and > scipy using netlib LAPACK and OpenBLAS > (http://xianyi.github.com/OpenBLAS/). By this method, > I was able to get scipy 0.11.0 installed in OS X > 10.8.2 and have it run its tests with no failures. > > Will From pmhobson at gmail.com Sun Feb 3 13:29:57 2013 From: pmhobson at gmail.com (Paul Hobson) Date: Sun, 3 Feb 2013 10:29:57 -0800 Subject: [SciPy-User] Vectorizing scipy.optimize.curve_fit In-Reply-To: References: <510DCDA7.1090502@gmail.com> Message-ID: On Sun, Feb 3, 2013 at 4:28 AM, wrote: > On Sat, Feb 2, 2013 at 9:38 PM, Ryan Nelson wrote: > > Hi Paul, > > > > I've played around with this for scipy.optimize.leastsq, and I've never > been > > successful. 
I'm not an expert, but I don't think the original code was > set > > up for that type of usage. (Just speculating here... An equally likely > > explanation is that I'm not smart enough to figure it out :) > > > > What I've found really useful for this type of problem is IPython's new > > parallel architecture (IPython version 0.13.1). It's easy to set up an > > IPython cluster and get things running in parallel through the notebook > > interface, so I've attached a simple notebook that does a trivial fitting > > (using leastsq) of some noisy data. Before you run the notebook, you'll > need > > to set up a cluster through the Cluster tab in the IPython notebook > > dashboard. > > > > Unfortunately, in my experience I've found that the speed improvement is > > only noticeable until the number of IPython engines in your cluster > equals > > the number of cores not the number of processor threads. (This might be > > obvious for those in the know, but it took me a while to figure out.) But > > every little bit of improvement helps, I suppose. > > > > Ryan > > > > > > On 2/1/2013 6:07 PM, Paul Hobson wrote: > > > > Hey folks, > > > > I've run into a bit of a roadblock. I've got some model runs (x) in an > Nx2 > > array where the first column is the input, and the second column is the > > output. So in a single case, I'd do: > > > > popt, pcov = scipy.optimize.curve_fit(myFit, x[:,0], x[:,1]) > > > > But how should I handle, say, 5000 model runs such that x.shape = (500, > N, > > 2) and I want the 5000 results for popt? > > > > This works: > > popt_array = np.empty(5000, 2) > > for r, layer in enumerate(model_runs): > > popt, pcov = scipy.optimize.curve_fit(myFit, layer[:,0] layer[:,1]) > > popt_array[r] = popt > > > > But is there a better (faster way)? The number of model runs and data > points > > may grow sustatially (~10^4 runs and 10^3 data points). > > I think there is no efficient way to avoid the loop. 
> > besides going parallel: > > If you are only interested in popt, then using optimize.leastsq > directly will save some calculations. > > The other major speedup for this kind of problem is to find good > starting values. For example, if the solutions in the sequence are > close to each other, then using previous solutions as starting values > for the next case will speed up convergence. > IIRC, in statsmodels we use an average of previous solutions, after > some initial warm-up, in a similar problem. A moving average could > also be useful. > > Josef > > > > > Thanks, > > -paul > > Thanks for the advice, Ryan and Josef. I've been meaning to get a feel for IPython Parallel for some time now, so this will be a good impetus to do so. Also, Josef, the suggestion about using optimize.leastsq with an initial guess will be a very good one, I think. A better description of what I'm doing is actually bootstrapping the regression parameters to find their 95% confidence intervals. So it makes a lot of sense, really, to get the initial guess at the parameters from the full data set for use in the bootstrap iterations. Cheers, -paul -------------- next part -------------- An HTML attachment was scrubbed... URL: From fboulogne at sciunto.org Sun Feb 3 13:34:49 2013 From: fboulogne at sciunto.org (=?ISO-8859-1?Q?Fran=E7ois_Boulogne?=) Date: Sun, 03 Feb 2013 19:34:49 +0100 Subject: [SciPy-User] convolve/deconvolve In-Reply-To: References: <510BCFC4.1030507@sciunto.org> Message-ID: <510EADC9.2020105@sciunto.org> Thank you Josef for your help. -- François Boulogne.
http://www.sciunto.org GPG fingerprint: 25F6 C971 4875 A6C1 EDD1 75C8 1AA7 216E 32D5 F22F From josef.pktd at gmail.com Sun Feb 3 13:47:00 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 3 Feb 2013 13:47:00 -0500 Subject: [SciPy-User] Vectorizing scipy.optimize.curve_fit In-Reply-To: References: <510DCDA7.1090502@gmail.com> Message-ID: On Sun, Feb 3, 2013 at 1:29 PM, Paul Hobson wrote: > > On Sun, Feb 3, 2013 at 4:28 AM, wrote: >> >> On Sat, Feb 2, 2013 at 9:38 PM, Ryan Nelson wrote: >> > Hi Paul, >> > >> > I've played around with this for scipy.optimize.leastsq, and I've never >> > been >> > successful. I'm not an expert, but I don't think the original code was >> > set >> > up for that type of usage. (Just speculating here... An equally likely >> > explanation is that I'm not smart enough to figure it out :) >> > >> > What I've found really useful for this type of problem is IPython's new >> > parallel architecture (IPython version 0.13.1). It's easy to set up an >> > IPython cluster and get things running in parallel through the notebook >> > interface, so I've attached a simple notebook that does a trivial >> > fitting >> > (using leastsq) of some noisy data. Before you run the notebook, you'll >> > need >> > to set up a cluster through the Cluster tab in the IPython notebook >> > dashboard. >> > >> > Unfortunately, in my experience I've found that the speed improvement is >> > only noticeable until the number of IPython engines in your cluster >> > equals >> > the number of cores not the number of processor threads. (This might be >> > obvious for those in the know, but it took me a while to figure out.) >> > But >> > every little bit of improvement helps, I suppose. >> > >> > Ryan >> > >> > >> > On 2/1/2013 6:07 PM, Paul Hobson wrote: >> > >> > Hey folks, >> > >> > I've run into a bit of a roadblock. I've got some model runs (x) in an >> > Nx2 >> > array where the first column is the input, and the second column is the >> > output. 
So in a single case, I'd do: >> > >> > popt, pcov = scipy.optimize.curve_fit(myFit, x[:,0], x[:,1]) >> > >> > But how should I handle, say, 5000 model runs such that x.shape = (500, >> > N, >> > 2) and I want the 5000 results for popt? >> > >> > This works: >> > popt_array = np.empty(5000, 2) >> > for r, layer in enumerate(model_runs): >> > popt, pcov = scipy.optimize.curve_fit(myFit, layer[:,0] layer[:,1]) >> > popt_array[r] = popt >> > >> > But is there a better (faster way)? The number of model runs and data >> > points >> > may grow sustatially (~10^4 runs and 10^3 data points). >> >> I think there is no efficient way to avoid the loop. >> >> besides going parallel: >> >> If you are only interested in popt, then using optimize.leastsq >> directly will save some calculations. >> >> The other major speedup for this kind of problems is to find good >> starting values. For example, if the solutions in the sequence are >> close to each other, then using previous solutions as starting values >> for the next case will speed up convergence. >> IIRC, in statsmodels we use an average of previous solutions, after >> some initial warm-up, in a similar problem. A moving average could >> also be useful. >> >> Josef >> >> > >> > Thanks, >> > -paul >> > > Thanks for the advice, Ryan and Josef. I've been meaning to get a feel of > for IPython Parallel for some time now, so this will be a good impetus to do > so. If you just want to run it on multiple cores, then joblib is easier. scikit-learn uses it extensively, and statsmodels uses it at the few parts, especially for bootstrap. > > Also, Josef, the suggestion about using optimize.leastsq with an initial > guess, will be a very good one, I think. A better description of what I'm > doing is actually bootstrapping the regression parameters to find their 95% > confidence intervals. So it make a lot sense, really, to get the initial > guess at the parameters from the full data set for use in the bootstrap > iterations. 
The only problems I would worry a bit about in this case are the possible presence of multiple local minima, if there are any, and convergence failures. Josef > > Cheers, > -paul > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From pmhobson at gmail.com Mon Feb 4 00:13:23 2013 From: pmhobson at gmail.com (Paul Hobson) Date: Sun, 3 Feb 2013 21:13:23 -0800 Subject: [SciPy-User] Vectorizing scipy.optimize.curve_fit In-Reply-To: References: <510DCDA7.1090502@gmail.com> Message-ID: On Sun, Feb 3, 2013 at 10:47 AM, wrote: > On Sun, Feb 3, 2013 at 1:29 PM, Paul Hobson wrote: > > > > On Sun, Feb 3, 2013 at 4:28 AM, wrote: > >> > >> On Sat, Feb 2, 2013 at 9:38 PM, Ryan Nelson > wrote: > >> > Hi Paul, > >> > > >> > I've played around with this for scipy.optimize.leastsq, and I've > never > >> > been > >> > successful. I'm not an expert, but I don't think the original code was > >> > set > >> > up for that type of usage. (Just speculating here... An equally likely > >> > explanation is that I'm not smart enough to figure it out :) > >> > > >> > What I've found really useful for this type of problem is IPython's > new > >> > parallel architecture (IPython version 0.13.1). It's easy to set up an > >> > IPython cluster and get things running in parallel through the > notebook > >> > interface, so I've attached a simple notebook that does a trivial > >> > fitting > >> > (using leastsq) of some noisy data. Before you run the notebook, > you'll > >> > need > >> > to set up a cluster through the Cluster tab in the IPython notebook > >> > dashboard. > >> > > >> > Unfortunately, in my experience I've found that the speed improvement > is > >> > only noticeable until the number of IPython engines in your cluster > >> > equals > >> > the number of cores not the number of processor threads. 
(This might > be > >> > obvious for those in the know, but it took me a while to figure out.) > >> > But > >> > every little bit of improvement helps, I suppose. > >> > > >> > Ryan > >> > > >> > > >> > On 2/1/2013 6:07 PM, Paul Hobson wrote: > >> > > >> > Hey folks, > >> > > >> > I've run into a bit of a roadblock. I've got some model runs (x) in > an > >> > Nx2 > >> > array where the first column is the input, and the second column is > the > >> > output. So in a single case, I'd do: > >> > > >> > popt, pcov = scipy.optimize.curve_fit(myFit, x[:,0], x[:,1]) > >> > > >> > But how should I handle, say, 5000 model runs such that x.shape = > (500, > >> > N, > >> > 2) and I want the 5000 results for popt? > >> > > >> > This works: > >> > popt_array = np.empty(5000, 2) > >> > for r, layer in enumerate(model_runs): > >> > popt, pcov = scipy.optimize.curve_fit(myFit, layer[:,0] > layer[:,1]) > >> > popt_array[r] = popt > >> > > >> > But is there a better (faster way)? The number of model runs and data > >> > points > >> > may grow sustatially (~10^4 runs and 10^3 data points). > >> > >> I think there is no efficient way to avoid the loop. > >> > >> besides going parallel: > >> > >> If you are only interested in popt, then using optimize.leastsq > >> directly will save some calculations. > >> > >> The other major speedup for this kind of problems is to find good > >> starting values. For example, if the solutions in the sequence are > >> close to each other, then using previous solutions as starting values > >> for the next case will speed up convergence. > >> IIRC, in statsmodels we use an average of previous solutions, after > >> some initial warm-up, in a similar problem. A moving average could > >> also be useful. > >> > >> Josef > >> > >> > > >> > Thanks, > >> > -paul > >> > > > > Thanks for the advice, Ryan and Josef. I've been meaning to get a feel of > > for IPython Parallel for some time now, so this will be a good impetus > to do > > so. 
> > If you just want to run it on multiple cores, then joblib is easier. > scikit-learn uses it extensively, and statsmodels uses it at the few > parts, especially for bootstrap. > > > > > Also, Josef, the suggestion about using optimize.leastsq with an initial > > guess, will be a very good one, I think. A better description of what I'm > > doing is actually bootstrapping the regression parameters to find their > 95% > > confidence intervals. So it make a lot sense, really, to get the initial > > guess at the parameters from the full data set for use in the bootstrap > > iterations. > > The only problems I would worry a bit about in this case are the > possible presence of multiple local minima, if there are any, and > convergence failures. > > Josef > > Thanks for the advice. Currently working with linear fits, but I'm hoping to expand this work to be more general. I'll be sure to keep these things in mind. -p -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Feb 4 00:16:08 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 4 Feb 2013 00:16:08 -0500 Subject: [SciPy-User] Vectorizing scipy.optimize.curve_fit In-Reply-To: References: <510DCDA7.1090502@gmail.com> Message-ID: On Mon, Feb 4, 2013 at 12:13 AM, Paul Hobson wrote: > On Sun, Feb 3, 2013 at 10:47 AM, wrote: >> >> On Sun, Feb 3, 2013 at 1:29 PM, Paul Hobson wrote: >> > >> > On Sun, Feb 3, 2013 at 4:28 AM, wrote: >> >> >> >> On Sat, Feb 2, 2013 at 9:38 PM, Ryan Nelson >> >> wrote: >> >> > Hi Paul, >> >> > >> >> > I've played around with this for scipy.optimize.leastsq, and I've >> >> > never >> >> > been >> >> > successful. I'm not an expert, but I don't think the original code >> >> > was >> >> > set >> >> > up for that type of usage. (Just speculating here... 
An equally >> >> > likely >> >> > explanation is that I'm not smart enough to figure it out :) >> >> > >> >> > What I've found really useful for this type of problem is IPython's >> >> > new >> >> > parallel architecture (IPython version 0.13.1). It's easy to set up >> >> > an >> >> > IPython cluster and get things running in parallel through the >> >> > notebook >> >> > interface, so I've attached a simple notebook that does a trivial >> >> > fitting >> >> > (using leastsq) of some noisy data. Before you run the notebook, >> >> > you'll >> >> > need >> >> > to set up a cluster through the Cluster tab in the IPython notebook >> >> > dashboard. >> >> > >> >> > Unfortunately, in my experience I've found that the speed improvement >> >> > is >> >> > only noticeable until the number of IPython engines in your cluster >> >> > equals >> >> > the number of cores not the number of processor threads. (This might >> >> > be >> >> > obvious for those in the know, but it took me a while to figure out.) >> >> > But >> >> > every little bit of improvement helps, I suppose. >> >> > >> >> > Ryan >> >> > >> >> > >> >> > On 2/1/2013 6:07 PM, Paul Hobson wrote: >> >> > >> >> > Hey folks, >> >> > >> >> > I've run into a bit of a roadblock. I've got some model runs (x) in >> >> > an >> >> > Nx2 >> >> > array where the first column is the input, and the second column is >> >> > the >> >> > output. So in a single case, I'd do: >> >> > >> >> > popt, pcov = scipy.optimize.curve_fit(myFit, x[:,0], x[:,1]) >> >> > >> >> > But how should I handle, say, 5000 model runs such that x.shape = >> >> > (500, >> >> > N, >> >> > 2) and I want the 5000 results for popt? >> >> > >> >> > This works: >> >> > popt_array = np.empty(5000, 2) >> >> > for r, layer in enumerate(model_runs): >> >> > popt, pcov = scipy.optimize.curve_fit(myFit, layer[:,0] >> >> > layer[:,1]) >> >> > popt_array[r] = popt >> >> > >> >> > But is there a better (faster way)? 
The number of model runs and data >> >> > points >> >> > may grow sustatially (~10^4 runs and 10^3 data points). >> >> >> >> I think there is no efficient way to avoid the loop. >> >> >> >> besides going parallel: >> >> >> >> If you are only interested in popt, then using optimize.leastsq >> >> directly will save some calculations. >> >> >> >> The other major speedup for this kind of problems is to find good >> >> starting values. For example, if the solutions in the sequence are >> >> close to each other, then using previous solutions as starting values >> >> for the next case will speed up convergence. >> >> IIRC, in statsmodels we use an average of previous solutions, after >> >> some initial warm-up, in a similar problem. A moving average could >> >> also be useful. >> >> >> >> Josef >> >> >> >> > >> >> > Thanks, >> >> > -paul >> >> >> > >> > Thanks for the advice, Ryan and Josef. I've been meaning to get a feel >> > of >> > for IPython Parallel for some time now, so this will be a good impetus >> > to do >> > so. >> >> If you just want to run it on multiple cores, then joblib is easier. >> scikit-learn uses it extensively, and statsmodels uses it at the few >> parts, especially for bootstrap. >> >> > >> > Also, Josef, the suggestion about using optimize.leastsq with an initial >> > guess, will be a very good one, I think. A better description of what >> > I'm >> > doing is actually bootstrapping the regression parameters to find their >> > 95% >> > confidence intervals. So it make a lot sense, really, to get the initial >> > guess at the parameters from the full data set for use in the bootstrap >> > iterations. >> >> The only problems I would worry a bit about in this case are the >> possible presence of multiple local minima, if there are any, and >> convergence failures. >> >> Josef >> > > Thanks for the advice. Currently working with linear fits, but I'm hoping to > expand this work to be more general. I'll be sure to keep these things in > mind. 
If you work with linear fits, why do you use curve_fit/leastsq instead of linalg? Josef > -p > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From lists at hilboll.de Mon Feb 4 06:23:53 2013 From: lists at hilboll.de (Andreas Hilboll) Date: Mon, 04 Feb 2013 12:23:53 +0100 Subject: [SciPy-User] scipy.stats.pearsonr returns exactly 0.0 Message-ID: <510F9A49.9020404@hilboll.de> Hi, in a situation where scipy.stats.spearmanr and numpy.corrcoef[0,1] return sensible results, scipy.stats.pearsonr returns exactly 0.0. Are there situations where this is to be expected? I have sanitized my input arrays for nans before calling the correlation methods: tmpdata1 = data[key1].data[:, x, y] tmpdata2 = data[key2].data[:, x, y] tmpidx = True - (np.isnan(tmpdata1) | np.isnan(tmpdata2)) tmpdata1, tmpdata2 = tmpdata1[tmpidx], tmpdata2[tmpidx] The results I get are: sp.stats.linregress: 0.09.../..... sp.stats.spearmanr: 0.331... np.corrcoef: 0.2574... sp.stats.pearsonr: 0.0 (exactly) I'm a bit worried that pearsonr is exactly 0.00. Any ideas? Cheers, Andreas. -- Andreas Hilboll PhD Student Institute of Environmental Physics University of Bremen U3145 Otto-Hahn-Allee 1 D-28359 Bremen Germany +49(0)421 218 62133 (phone) +49(0)421 218 98 62133 (fax) http://www.iup.uni-bremen.de/~hilboll From warren.weckesser at gmail.com Mon Feb 4 07:20:46 2013 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Mon, 4 Feb 2013 07:20:46 -0500 Subject: [SciPy-User] scipy.stats.pearsonr returns exactly 0.0 In-Reply-To: <510F9A49.9020404@hilboll.de> References: <510F9A49.9020404@hilboll.de> Message-ID: On 2/4/13, Andreas Hilboll wrote: > Hi, > > in a situation where scipy.stats.spearmanr and numpy.corrcoef[0,1] > return sensible results, scipy.stats.pearsonr returns exactly 0.0. > > Are there situations where this is to be expected?
I have sanitized my > input arrays for nans before calling the correlation methods: > > tmpdata1 = data[key1].data[:, x, y] > tmpdata2 = data[key2].data[:, x, y] > tmpidx = True - (np.isnan(tmpdata1) | np.isnan(tmpdata2)) > tmpdata1, tmpdata2 = tmpdata1[tmpidx], tmpdata2[tmpidx] > > The results I get is: > > sp.stats.linregress: 0.09.../..... > sp.stats.spearmanr: 0.331... > np.corrcoef: 0.2574... > sp.stats.pearsonr: 0.0 (exactly) > > I'm a bit worried that pearsonr is exactly 0.00. Any ideas? > Can you provide your data? Warren > Cheers, > Andreas. > > -- > Andreas Hilboll > PhD Student > > Institute of Environmental Physics > University of Bremen > > U3145 > Otto-Hahn-Allee 1 > D-28359 Bremen > Germany > > +49(0)421 218 62133 (phone) > +49(0)421 218 98 62133 (fax) > http://www.iup.uni-bremen.de/~hilboll > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From cweisiger at msg.ucsf.edu Mon Feb 4 11:19:36 2013 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Mon, 4 Feb 2013 08:19:36 -0800 Subject: [SciPy-User] Help optimizing an algorithm In-Reply-To: References: <05182E1C-E5D6-41DF-93FC-FB1BB3028CE6@yale.edu> <640567B9-BDEE-43A9-B80F-9B3D55F347FF@yale.edu> Message-ID: On Fri, Feb 1, 2013 at 7:47 PM, Charles R Harris wrote: > > So the light source is held constant here and only the integration time > varied? Due to pipelining, it is possible that polynomial fits might be as > fast as the linear splines you are using. In any case, a polynomial fit to > the inverse function could be used to sample the output to input conversion > at equally spaced output values and the result stored. With proper scaling > you could then determine the table index and offset for the interpolation > using divmod. > > You're correct that we're just varying the exposure time while the light intensity is constant. 
I tried doing polynomial fits, but even upwards of 10th-degree polynomials still gave terrible fit qualities. -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From cweisiger at msg.ucsf.edu Mon Feb 4 11:23:49 2013 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Mon, 4 Feb 2013 08:23:49 -0800 Subject: [SciPy-User] Help optimizing an algorithm In-Reply-To: References: <05182E1C-E5D6-41DF-93FC-FB1BB3028CE6@yale.edu> <640567B9-BDEE-43A9-B80F-9B3D55F347FF@yale.edu> <5ECF7AB6-5544-47FD-9107-A2ADFD08AB6F@yale.edu> Message-ID: On Fri, Feb 1, 2013 at 5:31 PM, Zachary Pincus wrote: > > Note that you'll run into trouble if you have image pixels that are above > or below the per-pixel min/max range. With the map_coordinates() 'mode' > parameter set to 'constant' and cval=-1 as you have, this will break > spectacularly if by chance a pixel winds up darker than in the > zero-exposure calibration image... yet this will happen occasionally, since > there's a statistical distribution of noise in the pixel readout. > > My current plan is to linearly extrapolate my low-end datapoints (which are near the baseline of 100) to 0; thus it should be impossible to go off the low end. The rare 99 output from the camera would still map to a valid "exposure time". Going off the high end is certainly still possible, but I can recognize those because they'll use the cval of -1 (or NaN, as you say) and handle them specially -- most likely by again using linear extrapolation of the last few points where I do have data. Thanks for looking over the code. Time to upscale it to deal with real data... -Chris -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lists at hilboll.de Mon Feb 4 11:30:00 2013 From: lists at hilboll.de (Andreas Hilboll) Date: Mon, 04 Feb 2013 17:30:00 +0100 Subject: [SciPy-User] scipy.stats.pearsonr returns exactly 0.0 In-Reply-To: References: <510F9A49.9020404@hilboll.de> Message-ID: <510FE208.1070708@hilboll.de> >> Hi, >> >> in a situation where scipy.stats.spearmanr and numpy.corrcoef[0,1] >> return sensible results, scipy.stats.pearsonr returns exactly 0.0. >> >> Are there situations where this is to be expected? I have sanitized my >> input arrays for nans before calling the correlation methods: >> >> tmpdata1 = data[key1].data[:, x, y] >> tmpdata2 = data[key2].data[:, x, y] >> tmpidx = True - (np.isnan(tmpdata1) | np.isnan(tmpdata2)) >> tmpdata1, tmpdata2 = tmpdata1[tmpidx], tmpdata2[tmpidx] >> >> The results I get are: >> >> sp.stats.linregress: 0.09.../..... >> sp.stats.spearmanr: 0.331... >> np.corrcoef: 0.2574... >> sp.stats.pearsonr: 0.0 (exactly) >> >> I'm a bit worried that pearsonr is exactly 0.00. Any ideas? >> > > > Can you provide your data? > > Warren Problem solved. For some reason (which I don't understand) warnings were suppressed in my script. After extracting the data and running pearsonr in an ipython shell, I got a RuntimeWarning("overflow encountered in float_scalars"). My tmpdata arrays were float32. After changing them to float64, I get a pearsonr of 0.30694, which seems reasonable. Sorry for the noise. Andreas.
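The workaround above can be sketched like this (the arrays here are synthetic stand-ins, not the original data; the point is only the dtype cast before the call):

```python
import numpy as np
from scipy import stats

rng = np.random.RandomState(0)
tmpdata1 = rng.rand(500).astype(np.float32)
tmpdata2 = (tmpdata1 + rng.rand(500)).astype(np.float32)

# promote to double precision before computing the correlation;
# pearsonr's intermediate sums can overflow when carried out in float32
r, p = stats.pearsonr(tmpdata1.astype(np.float64),
                      tmpdata2.astype(np.float64))
```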
From charlesr.harris at gmail.com Mon Feb 4 14:08:30 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 4 Feb 2013 12:08:30 -0700 Subject: [SciPy-User] Help optimizing an algorithm In-Reply-To: References: <05182E1C-E5D6-41DF-93FC-FB1BB3028CE6@yale.edu> <640567B9-BDEE-43A9-B80F-9B3D55F347FF@yale.edu> Message-ID: On Mon, Feb 4, 2013 at 9:19 AM, Chris Weisiger wrote: > On Fri, Feb 1, 2013 at 7:47 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> So the light source is held constant here and only the integration time >> varied? Due to pipelining, it is possible that polynomial fits might be as >> fast as the linear splines you are using. In any case, a polynomial fit to >> the inverse function could be used to sample the output to input conversion >> at equally spaced output values and the result stored. With proper scaling >> you could then determine the table index and offset for the interpolation >> using divmod. >> >> > You're correct that we're just varying the exposure time while the light > intensity is constant. I tried doing polynomial fits, but even upwards of > 10th-degree polynomials still gave terrible fit qualities. > I've gone higher than that without problems, although I do include more exposure times. I've a routine that does this while combining several data sets that use different backgrounds as a check on the consistency of the exposure time numbers. If you would be interested, I'll send it along. Or you could send me some data and I'll run the fit to see what it looks like. Chuck -------------- next part -------------- An HTML attachment was scrubbed...
URL: From pmhobson at gmail.com Mon Feb 4 19:26:00 2013 From: pmhobson at gmail.com (Paul Hobson) Date: Mon, 4 Feb 2013 16:26:00 -0800 Subject: [SciPy-User] Vectorizing scipy.optimize.curve_fit In-Reply-To: References: <510DCDA7.1090502@gmail.com> Message-ID: On Sun, Feb 3, 2013 at 9:16 PM, wrote: > On Mon, Feb 4, 2013 at 12:13 AM, Paul Hobson wrote: > > > > Thanks for the advice. Currently working with linear fits, but I'm > hoping to > > expand this work to be more general. I'll be sure to keep these things in > > mind. > > If you work with linear fits, why do you use curve_fit/leastsq instead > of linalg? > > Josef First reason: I didn't know any better :) Second reason: I'm pretty certain that non-linear data will be coming my way soon, so I'm trying to plan ahead. -paul -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Feb 4 19:54:34 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 4 Feb 2013 19:54:34 -0500 Subject: [SciPy-User] Vectorizing scipy.optimize.curve_fit In-Reply-To: References: <510DCDA7.1090502@gmail.com> Message-ID: On Mon, Feb 4, 2013 at 7:26 PM, Paul Hobson wrote: > On Sun, Feb 3, 2013 at 9:16 PM, wrote: >> >> On Mon, Feb 4, 2013 at 12:13 AM, Paul Hobson wrote: >> > >> > Thanks for the advice. Currently working with linear fits, but I'm >> > hoping to >> > expand this work to be more general. I'll be sure to keep these things >> > in >> > mind. >> >> If you work with linear fits, why do you use curve_fit/leastsq instead >> of linalg? >> >> Josef > > > First reason: I didn't know any better :) > Second reason: I'm pretty certain that non-linear data will be coming my way > soon, so I'm trying to plan ahead. The speed difference between linear and nonlinear models is huge. So, unless you have only very rare cases for linear fit, I would split the code. You could still share almost all the code, just use two different fit methods/functions in the bootstrap loop. 
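For the linear case, the whole fit collapses to a single linear-algebra call. A sketch with made-up data (np.linalg.lstsq here replaces curve_fit entirely):

```python
import numpy as np

rng = np.random.RandomState(1)
x = np.linspace(0.0, 5.0, 60)
y = 2.5 * x + 0.7 + 0.1 * rng.randn(60)

# design matrix [x, 1] for the straight-line model y = a*x + b
A = np.column_stack([x, np.ones_like(x)])
coef, resid, rank, sv = np.linalg.lstsq(A, y, rcond=None)
a, b = coef
```

Inside a bootstrap loop this is orders of magnitude cheaper than an iterative optimizer, since each resample is solved in one shot.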
Josef > -paul > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From mailinglistxf at yahoo.fr Tue Feb 5 04:28:18 2013 From: mailinglistxf at yahoo.fr (mailinglist) Date: Tue, 05 Feb 2013 10:28:18 +0100 Subject: [SciPy-User] sparse lu factorization Message-ID: <5110D0B2.5010903@yahoo.fr> Hello, I do LU factorization using scipy.sparse.linalg.splu, but I want to know the matrices L and U explicitly. How can I achieve this? Best regards fxing From nils106 at googlemail.com Mon Feb 4 14:57:59 2013 From: nils106 at googlemail.com (Nils Wagner) Date: Mon, 4 Feb 2013 19:57:59 +0000 Subject: [SciPy-User] DeprecationWarning: non-integer slice parameter. In a future numpy release, this will raise an error. Message-ID: Hi all, Please find attached my first patch for scipy. Cheers, Nils -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-NW-non-integer-slice-parameter.patch Type: application/octet-stream Size: 949 bytes Desc: not available URL: From pranava.madhyastha at gmail.com Wed Feb 6 13:04:12 2013 From: pranava.madhyastha at gmail.com (Pranava Swaroop Madhyastha) Date: Wed, 6 Feb 2013 19:04:12 +0100 Subject: [SciPy-User] Please help me optimize this chunk of code. Message-ID: Hi all, I have been trying to optimize this chunk of code, but this is all I could do. The code is actually trying to sparsify (increase the number of 0s in the matrix); so when I give a sparse matrix, it checks the value and removes all values that are less than value% of the mean of each row.
Here is the gist of the code: https://gist.github.com/anonymous/4724485 Thanks, Pranava From ralf.gommers at gmail.com Wed Feb 6 14:30:03 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 6 Feb 2013 20:30:03 +0100 Subject: [SciPy-User] DeprecationWarning: non-integer slice parameter. In a future numpy release, this will raise an error. In-Reply-To: References: Message-ID: On Mon, Feb 4, 2013 at 8:57 PM, Nils Wagner wrote: > Hi all, > > Please find attached my first patch for scipy. > Hi Nils, thanks for the patch. It fixes the issue. The issue was fixed in parallel by Pauli already in https://github.com/scipy/scipy/pull/420 though, so while your patch is fine it doesn't need to be applied anymore. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From srey at asu.edu Fri Feb 8 10:36:21 2013 From: srey at asu.edu (Serge Rey) Date: Fri, 8 Feb 2013 08:36:21 -0700 Subject: [SciPy-User] [ANN] PySAL 1.5 Message-ID: = PySAL 1.5 Released = On behalf of the PySAL development team, I'm happy to announce the official release of PySAL 1.5. PySAL is a library of tools for spatial data analysis and geocomputation written in Python. PySAL 1.5, the sixth official release of PySAL, brings the following key enhancements: ===spatial regression (spreg)=== Adding regime classes for all GM methods and OLS available in pysal.spreg, i.e. OLS, TSLS, spatial lag models, spatial error models and SARAR models. All tests and heteroskedasticity corrections/estimators currently available in pysal.spreg apply to regime models (e.g. White, HAC and KP-HET).
With the regimes, it is possible to estimate models that have: * Common or regime-specific error variance * Common or regime-specific coefficients for all variables or for a selection of variables * Common or regime-specific constant term ===FileIO=== * support for kwt ===contrib modules (sandbox modules, not in core)=== * shapely * mapping * network * spatial databases among the 116 commits and bug fixes since the last release, 6 months ago. ==PySAL modules== * pysal.core - Core Data Structures and IO * pysal.cg - Computational Geometry * pysal.esda - Exploratory Spatial Data Analysis * pysal.inequality - Spatial Inequality Analysis * pysal.spatial_dynamics - Spatial Dynamics * pysal.spreg - Regression and Diagnostics * pysal.region - Spatially Constrained Clustering * pysal.weights - Spatial Weights * pysal.FileIO - PySAL FileIO: Module for reading and writing various file types in a Pythonic way ==Downloads== Binary installers and source distributions are available for download at http://code.google.com/p/pysal/downloads/list PySAL can also be installed with pip or easy_install. ==Documentation== The documentation site is here http://pysal.geodacenter.org/1.5 ==Web sites== PySAL's home is here http://pysal.org/ The developer's site is here http://code.google.com/p/pysal/ ==Mailing Lists== Please see the developer's list here http://groups.google.com/group/pysal-dev Help for users is here http://groups.google.com/group/openspace-list ==Bug reports and feature requests== To search for or report bugs, as well as request enhancements, please see http://code.google.com/p/pysal/issues/list ==License information== See the file "LICENSE.txt" for information on the history of this software, terms & conditions for usage, and a DISCLAIMER OF ALL WARRANTIES. Many thanks to [http://code.google.com/p/pysal/source/browse/tags/1.5/THANKS.txt all who contributed!] Serge, on behalf of the PySAL development team.
-- Sergio (Serge) Rey Professor, School of Geographical Sciences and Urban Planning GeoDa Center for Geospatial Analysis and Computation Arizona State University http://geoplan.asu.edu/rey Editor, International Regional Science Review http://irx.sagepub.com From erareph07 at gmail.com Sun Feb 10 02:38:48 2013 From: erareph07 at gmail.com (luis sote) Date: Sun, 10 Feb 2013 07:38:48 +0000 (UTC) Subject: [SciPy-User] Anyone Interested in Developing Simulation Software for Biological Wastewater Treatment? References: Message-ID: Hi Kai Zhang I'm from Mexico and interest for your propose. I don't write English very well but I'm interesting developed simulation software for biological wastewater treatment. My wife and I are wastewater designers. I wish collaborate in this new project and you remember: Math is language of the universe we're understand. Bye. From ondrej.certik at gmail.com Sat Feb 9 20:25:03 2013 From: ondrej.certik at gmail.com (Ondřej Čertík) Date: Sat, 9 Feb 2013 17:25:03 -0800 Subject: [SciPy-User] ANN: NumPy 1.7.0 release Message-ID: Hi, I'm pleased to announce the availability of the final release of NumPy 1.7.0. Sources and binary installers can be found at https://sourceforge.net/projects/numpy/files/NumPy/1.7.0/ This release is equivalent to the 1.7.0rc2 release, since no more problems were found. For release notes see below. I would like to thank everybody who contributed to this release. Cheers, Ondrej ========================= NumPy 1.7.0 Release Notes ========================= This release includes several new features as well as numerous bug fixes and refactorings. It supports Python 2.4 - 2.7 and 3.1 - 3.3 and is the last release that supports Python 2.4 - 2.5.
Highlights ========== * ``where=`` parameter to ufuncs (allows the use of boolean arrays to choose where a computation should be done) * ``vectorize`` improvements (added 'excluded' and 'cache' keyword, general cleanup and bug fixes) * ``numpy.random.choice`` (random sample generating function) Compatibility notes =================== In a future version of numpy, the functions np.diag, np.diagonal, and the diagonal method of ndarrays will return a view onto the original array, instead of producing a copy as they do now. This makes a difference if you write to the array returned by any of these functions. To facilitate this transition, numpy 1.7 produces a FutureWarning if it detects that you may be attempting to write to such an array. See the documentation for np.diagonal for details. Similar to np.diagonal above, in a future version of numpy, indexing a record array by a list of field names will return a view onto the original array, instead of producing a copy as they do now. As with np.diagonal, numpy 1.7 produces a FutureWarning if it detects that you may be attempting to write to such an array. See the documentation for array indexing for details. In a future version of numpy, the default casting rule for UFunc out= parameters will be changed from 'unsafe' to 'same_kind'. (This also applies to in-place operations like a += b, which is equivalent to np.add(a, b, out=a).) Most usages which violate the 'same_kind' rule are likely bugs, so this change may expose previously undetected errors in projects that depend on NumPy. In this version of numpy, such usages will continue to succeed, but will raise a DeprecationWarning. Full-array boolean indexing has been optimized to use a different, optimized code path. This code path should produce the same results, but any feedback about changes to your code would be appreciated. 
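A few of the 1.7 additions described in these notes — the ``where=`` ufunc parameter, ``numpy.random.choice``, and the generalized reduction axes with ``keepdims=`` — can be sketched as follows (an illustrative aside, not part of the release notes):

```python
import numpy as np

# numpy.random.choice: draw 3 distinct items from an array-like
sample = np.random.choice([10, 20, 30, 40], size=3, replace=False)

# where= on a ufunc: compute only where the mask is True;
# masked-out slots keep whatever `out` already holds (here, zeros)
a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 10.0, 10.0])
out = np.zeros_like(a)
np.add(a, b, out=out, where=np.array([True, False, True]))
# out is now [11., 0., 13.]

# generalized reductions: a tuple of axes, and keepdims=True keeps
# size-one axes so the result broadcasts against the original operand
x = np.arange(24).reshape(2, 3, 4)
s = x.sum(axis=(0, 2))                    # shape (3,)
m = x.mean(axis=(0, 2), keepdims=True)    # shape (1, 3, 1)
centered = x - m                          # broadcasts cleanly
```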
Attempting to write to a read-only array (one with ``arr.flags.writeable`` set to ``False``) used to raise either a RuntimeError, ValueError, or TypeError inconsistently, depending on which code path was taken. It now consistently raises a ValueError. The .reduce functions evaluate some reductions in a different order than in previous versions of NumPy, generally providing higher performance. Because of the nature of floating-point arithmetic, this may subtly change some results, just as linking NumPy to a different BLAS implementation such as MKL can. If upgrading from 1.5, then generally in 1.6 and 1.7 substantial code has been added and some code paths altered, particularly in the areas of type resolution and buffered iteration over universal functions. This might have an impact on your code particularly if you relied on accidental behavior in the past. New features ============ Reduction UFuncs Generalize axis= Parameter ------------------------------------------- Any ufunc.reduce function call, as well as other reductions like sum, prod, any, all, max and min support the ability to choose a subset of the axes to reduce over. Previously, one could say axis=None to mean all the axes or axis=# to pick a single axis. Now, one can also say axis=(#,#) to pick a list of axes for reduction. Reduction UFuncs New keepdims= Parameter ---------------------------------------- There is a new keepdims= parameter, which if set to True, doesn't throw away the reduction axes but instead sets them to have size one. When this option is set, the reduction result will broadcast correctly to the original operand which was reduced. Datetime support ---------------- .. note:: The datetime API is *experimental* in 1.7.0, and may undergo changes in future versions of NumPy.
There have been a lot of fixes and enhancements to datetime64 compared to NumPy 1.6: * the parser is quite strict about only accepting ISO 8601 dates, with a few convenience extensions * converts between units correctly * datetime arithmetic works correctly * business day functionality (allows the datetime to be used in contexts where only certain days of the week are valid) The notes in `doc/source/reference/arrays.datetime.rst `_ (also available in the online docs at `arrays.datetime.html `_) should be consulted for more details. Custom formatter for printing arrays ------------------------------------ See the new ``formatter`` parameter of the ``numpy.set_printoptions`` function. New function numpy.random.choice --------------------------------- A generic sampling function has been added which will generate samples from a given array-like. The samples can be with or without replacement, and with uniform or given non-uniform probabilities. New function isclose -------------------- Returns a boolean array where two arrays are element-wise equal within a tolerance. Both relative and absolute tolerance can be specified. Preliminary multi-dimensional support in the polynomial package --------------------------------------------------------------- Axis keywords have been added to the integration and differentiation functions and a tensor keyword was added to the evaluation functions. These additions allow multi-dimensional coefficient arrays to be used in those functions. New functions for evaluating 2-D and 3-D coefficient arrays on grids or sets of points were added together with 2-D and 3-D pseudo-Vandermonde matrices that can be used for fitting. Ability to pad rank-n arrays ---------------------------- A pad module containing functions for padding n-dimensional arrays has been added. The various private padding functions are exposed as options to a public 'pad' function. 
Example:: pad(a, 5, mode='mean') Current modes are ``constant``, ``edge``, ``linear_ramp``, ``maximum``, ``mean``, ``median``, ``minimum``, ``reflect``, ``symmetric``, ``wrap``, and ``<function>``. New argument to searchsorted ---------------------------- The function searchsorted now accepts a 'sorter' argument that is a permutation array that sorts the array to search. Build system ------------ Added experimental support for the AArch64 architecture. C API ----- New function ``PyArray_RequireWriteable`` provides a consistent interface for checking array writeability -- any C code which works with arrays whose WRITEABLE flag is not known to be True a priori, should make sure to call this function before writing. NumPy C Style Guide added (``doc/C_STYLE_GUIDE.rst.txt``). Changes ======= General ------- The function np.concatenate tries to match the layout of its input arrays. Previously, the layout did not follow any particular reason, and depended in an undesirable way on the particular axis chosen for concatenation. A bug was also fixed which silently allowed out of bounds axis arguments. The ufuncs logical_or, logical_and, and logical_not now follow Python's behavior with object arrays, instead of trying to call methods on the objects. For example the expression (3 and 'test') produces the string 'test', and now np.logical_and(np.array(3, 'O'), np.array('test', 'O')) produces 'test' as well. The ``.base`` attribute on ndarrays, which is used on views to ensure that the underlying array owning the memory is not deallocated prematurely, now collapses out references when you have a view-of-a-view. For example:: a = np.arange(10) b = a[1:] c = b[1:] In numpy 1.6, ``c.base`` is ``b``, and ``c.base.base`` is ``a``. In numpy 1.7, ``c.base`` is ``a``. To increase backwards compatibility for software which relies on the old behaviour of ``.base``, we only 'skip over' objects which have exactly the same type as the newly created view.
This makes a difference if you use ``ndarray`` subclasses. For example, if we have a mix of ``ndarray`` and ``matrix`` objects which are all views on the same original ``ndarray``:: a = np.arange(10) b = np.asmatrix(a) c = b[0, 1:] d = c[0, 1:] then ``d.base`` will be ``b``. This is because ``d`` is a ``matrix`` object, and so the collapsing process only continues so long as it encounters other ``matrix`` objects. It considers ``c``, ``b``, and ``a`` in that order, and ``b`` is the last entry in that list which is a ``matrix`` object. Casting Rules ------------- Casting rules have undergone some changes in corner cases, due to the NA-related work. In particular for combinations of scalar+scalar: * the `longlong` type (`q`) now stays `longlong` for operations with any other number (`? b h i l q p B H I`), previously it was cast as `int_` (`l`). The `ulonglong` type (`Q`) now stays as `ulonglong` instead of `uint` (`L`). * the `timedelta64` type (`m`) can now be mixed with any integer type (`b h i l q p B H I L Q P`), previously it raised `TypeError`. For array + scalar, the above rules just broadcast except the case when the array and scalars are unsigned/signed integers, then the result gets converted to the array type (of possibly larger size) as illustrated by the following examples:: >>> (np.zeros((2,), dtype=np.uint8) + np.int16(257)).dtype dtype('uint16') >>> (np.zeros((2,), dtype=np.int8) + np.uint16(257)).dtype dtype('int16') >>> (np.zeros((2,), dtype=np.int16) + np.uint32(2**17)).dtype dtype('int32') Whether the size gets increased depends on the size of the scalar, for example:: >>> (np.zeros((2,), dtype=np.uint8) + np.int16(255)).dtype dtype('uint8') >>> (np.zeros((2,), dtype=np.uint8) + np.int16(256)).dtype dtype('uint16') Also a ``complex128`` scalar + ``float32`` array is cast to ``complex64``. In NumPy 1.7 the `datetime64` type (`M`) must be constructed by explicitly specifying the type as the second argument (e.g. ``np.datetime64(2000, 'Y')``). 
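The datetime64 behaviour summarised above can be sketched as follows (an illustrative aside; note that the integer form counts units from the 1970 epoch):

```python
import numpy as np

d = np.datetime64('2013-02-12')              # strict ISO 8601 parsing
week_later = d + np.timedelta64(7, 'D')      # datetime arithmetic
gap = np.datetime64('2013-03-01') - d        # a timedelta64, here in days
y = np.datetime64(43, 'Y')                   # 43 years after the 1970 epoch
```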
Deprecations ============ General ------- Specifying a custom string formatter with a `_format` array attribute is deprecated. The new `formatter` keyword in ``numpy.set_printoptions`` or ``numpy.array2string`` can be used instead. The deprecated imports in the polynomial package have been removed. ``concatenate`` now raises DeprecationWarning for 1D arrays if ``axis != 0``. Versions of numpy < 1.7.0 ignored axis argument value for 1D arrays. We allow this for now, but in due course we will raise an error. C-API ----- Direct access to the fields of PyArrayObject* has been deprecated. Direct access has been recommended against for many releases. Expect similar deprecations for PyArray_Descr* and other core objects in the future as preparation for NumPy 2.0. The macros in old_defines.h are deprecated and will be removed in the next major release (>= 2.0). The sed script tools/replace_old_macros.sed can be used to replace these macros with the newer versions. You can test your code against the deprecated C API by #defining NPY_NO_DEPRECATED_API to the target version number, for example NPY_1_7_API_VERSION, before including any NumPy headers. The ``NPY_CHAR`` member of the ``NPY_TYPES`` enum is deprecated and will be removed in NumPy 1.8. See the discussion at `gh-2801 `_ for more details.
Checksums ========= 7b72cc17b6a9043f6d46af4e71cd3dbe release/installers/numpy-1.7.0-win32-superpack-python3.3.exe 4fa54e40b6a243416f0248123b6ec332 release/installers/numpy-1.7.0.tar.gz 9ef1688bb9f8deb058a8022b4788686c release/installers/numpy-1.7.0-win32-superpack-python2.7.exe 909fe47da05d2a35edd6909ba0152213 release/installers/numpy-1.7.0-win32-superpack-python3.2.exe 5d4318b722d0098f78b49c0030d47026 release/installers/numpy-1.7.0-win32-superpack-python2.6.exe 92b61d6f278a81cf9a5033b0c8e7b53e release/installers/numpy-1.7.0-win32-superpack-python3.1.exe 51d6f4f854cdca224fa56a327ad7c620 release/installers/numpy-1.7.0-win32-superpack-python2.5.exe ca27913c59393940e880fab420f985b4 release/installers/numpy-1.7.0.zip 3f20becbb80da09412d94815ad3b586b release/installers/numpy-1.7.0-py2.5-python.org-macosx10.3.dmg 600dfa4dab31db5dc2ed9655521cfa9e release/installers/numpy-1.7.0-py2.6-python.org-macosx10.3.dmg a907a37416163b3245a30cfd160506ab release/installers/numpy-1.7.0-py2.7-python.org-macosx10.3.dmg From bouloumag at gmail.com Sun Feb 10 19:55:50 2013 From: bouloumag at gmail.com (bouloumag at gmail.com) Date: Sun, 10 Feb 2013 16:55:50 -0800 (PST) Subject: [SciPy-User] spectral derivative of aperiodic function Message-ID: <0007d837-006b-420d-b570-4dfc0ec65a14@googlegroups.com> I would like to calculate the spatial derivative of a variable defined on the points of a uniform grid. The domain is NOT periodic, so the FFT method is useless here. I wonder if there exist other spectral methods to calculate the derivative on an aperiodic domain or if my only hope is to use the finite difference method? -------------- next part -------------- An HTML attachment was scrubbed... 
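If the data happens to satisfy zero-slope (reflexive) boundary conditions, one spectral option is to mirror it into an even periodic extension and reuse the FFT machinery, which is equivalent to a cosine-transform method. A minimal sketch, under the illustrative assumptions of a uniform grid on [0, pi] with f'(0) = f'(pi) = 0:

```python
import numpy as np

# uniform grid on [0, pi], endpoints included
n = 64
x = np.linspace(0.0, np.pi, n + 1)
f = np.cos(3 * x)                      # satisfies f'(0) = f'(pi) = 0

# even (mirror) extension turns f into a smooth 2*pi-periodic signal
f_ext = np.concatenate([f, f[-2:0:-1]])

# standard FFT derivative on the extended, periodic signal
k = 2 * np.pi * np.fft.fftfreq(2 * n, d=np.pi / n)   # integer wavenumbers
df = np.real(np.fft.ifft(1j * k * np.fft.fft(f_ext)))[: n + 1]
# df approximates -3*sin(3*x) to spectral accuracy
```

If the boundary slopes are not zero, the even extension has derivative kinks at the ends and the accuracy collapses; Chebyshev collocation (at the cost of a non-uniform grid) is the usual alternative.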
URL: From pierre at barbierdereuille.net Mon Feb 11 11:52:17 2013 From: pierre at barbierdereuille.net (Pierre Barbier de Reuille) Date: Mon, 11 Feb 2013 17:52:17 +0100 Subject: [SciPy-User] spectral derivative of aperiodic function In-Reply-To: <0007d837-006b-420d-b570-4dfc0ec65a14@googlegroups.com> References: <0007d837-006b-420d-b570-4dfc0ec65a14@googlegroups.com> Message-ID: If you have reflexive boundary conditions, then you can use DCT the same way you were using FFT (i.e. I assume multiplying the transform of the differentiation matrix and the grid). -- Barbier de Reuille Pierre On 11 February 2013 01:55, wrote: > I would like to calculate the spatial derivative of a variable defined on > the points of a uniform grid. The domain is NOT periodic, so the FFT method > is useless here. I wonder if there exist other spectral methods to > calculate the derivative on an aperiodic domain or if my only hope is to > use the finite difference method? > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vs at it.uu.se Mon Feb 11 13:10:41 2013 From: vs at it.uu.se (Virgil Stokes) Date: Mon, 11 Feb 2013 19:10:41 +0100 Subject: [SciPy-User] Anyone Interested in Developing Simulation Software for Biological Wastewater Treatment? In-Reply-To: References: Message-ID: <51193421.7090104@it.uu.se> On 10-Feb-2013 08:38, luis sote wrote: > Hi Kai Zhang > I'm from Mexico and interest for your propose. > I don't write English very well but I'm interesting developed simulation > software for biological wastewater treatment. > My wife and I are wastewater designers. > > I wish collaborate in this new project and you remember: > Math is language of the universe > > > we're understand. > > Bye. 
> > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user Please provide more information (details) about your plans/project. From josephsmidt at gmail.com Mon Feb 11 22:48:31 2013 From: josephsmidt at gmail.com (Joseph Smidt) Date: Mon, 11 Feb 2013 20:48:31 -0700 Subject: [SciPy-User] How can I interpolate array from spherical to cartesian coordinates? Message-ID: I have an array of density values in spherical coordinates. More specifically I have an array called density with shape (180,200,200). I also have an array called r_coord, theta_coord and phi_coord also with shape (180,200,200) being the spherical coordinates for the density array. I would like to map this density to cartesian coordinates using python. I will need therefore a new array density_prime which is interpolated over cartesian coordinates x_coord, y_coord and z_coord. I found scipy.ndimage.interpolation.map_coordinates which looks promising but I can't figure out how to get it to work. Any help would be appreciated. Thanks. -- ------------------------------------------------------------------------ Joseph Smidt Theoretical Division P.O. Box 1663, Mail Stop B283 Los Alamos, NM 87545 Office: 505-665-9752 Fax: 505-667-1931 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ddvento at ucar.edu Mon Feb 11 23:00:23 2013 From: ddvento at ucar.edu (Davide Del Vento) Date: Mon, 11 Feb 2013 21:00:23 -0700 Subject: [SciPy-User] number of tests Message-ID: I compiled scipy 0.11.0 myself with the intel compiler, on top of numpy 1.6.2 (with intel compiler too and MKL). I'm trying to assess whether or not everything has been build fine. Since my machine is actually a cluster, I'm running the tests in different configurations (login node and batch script). However, I'm confused by the number of tests which ran. 
On the login nodes (either interactively or in a script without the tty) I get: Ran 6218 tests in 514.867s FAILED (KNOWNFAIL=17, SKIP=42, errors=1, failures=1) Whereas in a remote batch node (with a script) I get: Ran 6167 tests in 271.833s FAILED (KNOWNFAIL=17, SKIP=42, errors=201, failures=1) I am not worried about the difference in timing, since it's about what I expected. However, I am surprised about the different number of tests that ran, because I'd expect it to be the same (possibly with differences in success vs failure vs skipped, but not in the overall number). Instead I've got 51 "missing" tests while running on a remote node. Had the number of errors been 52, I'd suspect an odd way of counting what had run; however, the difference in errors is 200, so it must be something else. Why is it so? Thanks and Regards, Davide From charlesr.harris at gmail.com Tue Feb 12 01:07:56 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 11 Feb 2013 23:07:56 -0700 Subject: [SciPy-User] How can I interpolate array from spherical to cartesian coordinates? In-Reply-To: References: Message-ID: On Mon, Feb 11, 2013 at 8:48 PM, Joseph Smidt wrote: > I have an array of density values in spherical coordinates. More > specifically I have an array called density with shape (180,200,200). I > also have an array called r_coord, theta_coord and phi_coord also with > shape (180,200,200) being the spherical coordinates for the density array. > > I would like to map this density to cartesian coordinates using python. I > will need therefore a new array density_prime which is interpolated over > cartesian coordinates x_coord, y_coord and z_coord. I found > scipy.ndimage.interpolation.map_coordinates which looks promising but I > can't figure out how to get it to work.
> > I'm not clear on what you are trying to do, but I'm guessing you have sample points on a sphere and you want to find interpolated values at other points on the sphere, the cartesian coordinates being a means rather than an end. Is that the case? If not, can you be more explicit. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josephsmidt at gmail.com Tue Feb 12 01:30:11 2013 From: josephsmidt at gmail.com (Joseph Smidt) Date: Mon, 11 Feb 2013 23:30:11 -0700 Subject: [SciPy-User] How can I interpolate array from spherical to cartesian coordinates? In-Reply-To: References: Message-ID: Chuck and everyone, Okay, I will give a more specific example. Consider the following script: from pylab import * # Build 3D arrays for spherical coordinates. r, theta, phi = mgrid[0:201,0:201,0:201] r = r/20.0 # r goes from 0 to 10. theta = theta/200.0*pi # Theta goes from 0 to pi phi = phi/200.0*2*pi # Phi goes from 0 to 2pi # Density is spherically symmetric. Only depends on r. density = exp(-r**2/20.0) # Plot density. Doesn't look spherical because # not in cartesian coordinates. imshow(squeeze(density[:,:,1])) Okay, so density is defined in terms of spherical coordinates. I would like a function that transforms density to density_prime to cartesian coordinate arrays x, y, and z such that the r = 0 line gets mapped to x = 0, y = 0, z = 0. The r = 1 line gets mapped to x = sin(theta)*cos(phi), y = sin(theta)*sin(phi), z = cos(theta), the r = 2 line gets mapped to... etc. So that when I plot density_prime it looks like a nice spherical function peaking at x = 0, y = 0, z = 0 and getting small for x, y, and z large. If anyone knows how I could do such a transformation to get density_prime with scipy.ndimage.interpolation.map_coordinates or any other interpolator for N-dim data I would appreciate it.
On Mon, Feb 11, 2013 at 11:07 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Mon, Feb 11, 2013 at 8:48 PM, Joseph Smidt wrote: > >> I have an array of density values in spherical coordinates. More >> specifically I have an array called density with shape (180,200,200). I >> also have an array called r_coord, theta_coord and phi_coord also with >> shape (180,200,200) being the spherical coordinates for the density array. >> >> I would like to map this density to cartesian coordinates using python. I >> will need therefore a new array density_prime which is interpolated over >> cartesian coordinates x_coord, y_coord and z_coord. I found >> scipy.ndimage.interpolation.map_coordinates which looks promising but I >> can't figure out how to get it to work. >> >> > I'm not clear on what you are trying to do, but I'm guessing you have > sample points on a sphere and you want to find interpolated values at other > points on the sphere, the cartesian coordinates being a means rather than > an end. Is that the case? If not, can you be more explicit. > > Chuck > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- ------------------------------------------------------------------------ Joseph Smidt Theoretical Division P.O. Box 1663, Mail Stop B283 Los Alamos, NM 87545 Office: 505-665-9752 Fax: 505-667-1931 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Jerome.Kieffer at esrf.fr Tue Feb 12 02:59:27 2013 From: Jerome.Kieffer at esrf.fr (Jerome Kieffer) Date: Tue, 12 Feb 2013 08:59:27 +0100 Subject: [SciPy-User] scipy.stats.linregress Message-ID: <20130212085927.1db43472.Jerome.Kieffer@esrf.fr> Dear Scipy Community, I am looking at the scipy.stats.linregress code and see: # average sum of squares: ssxm, ssxym, ssyxm, ssym = np.cov(x, y, bias=1).flat r_num = ssxym r_den = np.sqrt(ssxm*ssym) if r_den == 0.0: r = 0.0 else: r = r_num / r_den if (r > 1.0): r = 1.0 # from numerical error if the slope is negative, the correlation factor R is -1, not one so one should add: if (r < -1.0): r = -1.0 # from numerical error or did I completely misunderstand the code? Cheers, -- Jérôme Kieffer On-Line Data analysis / Software Group ISDD / ESRF tel +33 476 882 445 From Jerome.Kieffer at esrf.fr Tue Feb 12 03:02:34 2013 From: Jerome.Kieffer at esrf.fr (Jerome Kieffer) Date: Tue, 12 Feb 2013 09:02:34 +0100 Subject: [SciPy-User] How can I interpolate array from spherical to cartesian coordinates? In-Reply-To: References: Message-ID: <20130212090234.468c3159.Jerome.Kieffer@esrf.fr> On Mon, 11 Feb 2013 23:30:11 -0700 Joseph Smidt wrote: > If anyone knows how I could do such a transformation to get density_prime > with scipy.ndimage.interpolation.map_coordinates or any other interpolator > for N-dim data I would appreciate it. I never did it in 3D but you need the inverse transformation for map_coordinates: r, theta, phi -> x, y, z = r*sin(theta)*cos(phi), r*sin(theta)*sin(phi), r*cos(theta) I think that's all.
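Putting the inverse transform together with map_coordinates might look like this (a sketch with made-up grid sizes; the index scaling assumes r in [0, r_max], theta in [0, pi], and phi in [0, 2*pi)):

```python
import numpy as np
from scipy.ndimage import map_coordinates

# density on an (nr, ntheta, nphi) spherical grid (illustrative sizes);
# here it depends only on r, matching the earlier example
nr, nt, nphi = 50, 60, 60
r_max = 10.0
r = np.linspace(0.0, r_max, nr)
density = np.exp(-r**2 / 20.0)[:, None, None] * np.ones((nr, nt, nphi))

# target cartesian grid
n = 41
ax = np.linspace(-r_max, r_max, n)
x, y, z = np.meshgrid(ax, ax, ax, indexing='ij')

# inverse map: cartesian -> spherical
rr = np.sqrt(x**2 + y**2 + z**2)
theta = np.arccos(np.divide(z, rr, out=np.zeros_like(rr), where=rr > 0))
phi = np.arctan2(y, x) % (2 * np.pi)

# convert physical coordinates to fractional array indices
ir = rr / r_max * (nr - 1)
it = theta / np.pi * (nt - 1)
ip = phi / (2 * np.pi) * (nphi - 1)

# points with ir > nr-1 lie outside the sphere and get cval
density_prime = map_coordinates(density, [ir, it, ip], order=1, cval=0.0)
```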
Cheers, -- Jérôme Kieffer On-Line Data analysis / Software Group ISDD / ESRF tel +33 476 882 445 From davidmenhur at gmail.com Tue Feb 12 05:55:31 2013 From: davidmenhur at gmail.com (Daπid) Date: Tue, 12 Feb 2013 11:55:31 +0100 Subject: [SciPy-User] scipy.stats.linregress In-Reply-To: <20130212085927.1db43472.Jerome.Kieffer@esrf.fr> References: <20130212085927.1db43472.Jerome.Kieffer@esrf.fr> Message-ID: This parameter is R**2, the square of R. You are computing it using the squares of the residuals, so everything should be positive. If you get r negative, something has gone terribly wrong. On Feb 12, 2013 8:59 AM, "Jerome Kieffer" wrote: > > Dear Scipy Community, > > I am looking at the scipy.stats.linregress code and see: > # average sum of squares: > ssxm, ssxym, ssyxm, ssym = np.cov(x, y, bias=1).flat > r_num = ssxym > r_den = np.sqrt(ssxm*ssym) > if r_den == 0.0: > r = 0.0 > else: > r = r_num / r_den > if (r > 1.0): r = 1.0 # from numerical error > > if the slope is negative, the correlation factor R is -1, not one so one > should add: > > if (r < -1.0): r = -1.0 # from numerical error > > or did I completely mis-understood the code ? > > Cheers, > > > -- > Jérôme Kieffer > On-Line Data analysis / Software Group > ISDD / ESRF > tel +33 476 882 445 > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Jerome.Kieffer at esrf.fr Tue Feb 12 07:19:04 2013 From: Jerome.Kieffer at esrf.fr (Jerome Kieffer) Date: Tue, 12 Feb 2013 13:19:04 +0100 Subject: [SciPy-User] scipy.stats.linregress In-Reply-To: References: <20130212085927.1db43472.Jerome.Kieffer@esrf.fr> Message-ID: <20130212131904.b7b97502.Jerome.Kieffer@esrf.fr> On Tue, 12 Feb 2013 11:55:31 +0100 Daπid wrote: > This parameter is R**2, the square of R.
You are computing it using the > squares of the residuals, so everything should be positive. If you get r > negative, something has gone terribly wrong. In [3]: scipy.stats.linregress([1,2,3],[3,2,1]) Out[3]: (-1.0, 4.0, -1.0, 9.0031759825137669e-11, 0.0) I am not very confident in my knowledge in statistics but here R = -1 and it does not look like an error. I implemented a method to perform thousands of linear regression and few of them returned -1.00001 (or so) which later gave NaN as stderr. so either the test should be: if (r*r > 1.0): r = r/abs(r) or: if (r > 1.0): r = 1 elif (r < -1.0): r = -1 Cheers, -- Jérôme Kieffer On-Line Data analysis / Software Group ISDD / ESRF tel +33 476 882 445 From josef.pktd at gmail.com Tue Feb 12 07:37:56 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 12 Feb 2013 07:37:56 -0500 Subject: [SciPy-User] scipy.stats.linregress In-Reply-To: <20130212131904.b7b97502.Jerome.Kieffer@esrf.fr> References: <20130212085927.1db43472.Jerome.Kieffer@esrf.fr> <20130212131904.b7b97502.Jerome.Kieffer@esrf.fr> Message-ID: On Tue, Feb 12, 2013 at 7:19 AM, Jerome Kieffer wrote: > On Tue, 12 Feb 2013 11:55:31 +0100 > Daπid wrote: > >> This parameter is R**2, the square of R. You are computing it using the >> squares of the residuals, so everything should be positive. If you get r >> negative, something has gone terribly wrong. > > In [3]: scipy.stats.linregress([1,2,3],[3,2,1]) > Out[3]: (-1.0, 4.0, -1.0, 9.0031759825137669e-11, 0.0) > > I am not very confident in my knowledge in statistics but here R = -1 and it does not look like an error. > I implemented a method to perform thousands of linear regression and few of them returned -1.00001 (or so) which later gave NaN as stderr.
> > so either the test should be: > if (r*r > 1.0): r = r/abs(r) > or: > if (r > 1.0): r = 1 > elif (r < -1.0): r = -1 The correlation coefficient that is reported is the signed correlation, the docstring has an example that takes the square to get R_squared. I don't know whether it's the R_squared, I need to check. But because r is signed we should have the check r<-1 then -1 Josef > > Cheers, > > -- > Jérôme Kieffer > On-Line Data analysis / Software Group > ISDD / ESRF > tel +33 476 882 445 > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From davidmenhur at gmail.com Tue Feb 12 07:43:05 2013 From: davidmenhur at gmail.com (Daπid) Date: Tue, 12 Feb 2013 13:43:05 +0100 Subject: [SciPy-User] scipy.stats.linregress In-Reply-To: <20130212131904.b7b97502.Jerome.Kieffer@esrf.fr> References: <20130212085927.1db43472.Jerome.Kieffer@esrf.fr> <20130212131904.b7b97502.Jerome.Kieffer@esrf.fr> Message-ID: On 12 February 2013 13:19, Jerome Kieffer wrote: > I am not very confident in my knowledge in statistics but here R = -1 and it does not look like an error. > I implemented a method to perform thousands of linear regression and few of them returned -1.00001 (or so) which later gave NaN as stderr. You are absolutely right, my mistake. I got the formula confused in my head, I should have looked it up. Indeed, the covariance can be negative, giving you the r<0. David.
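The guard proposed in this thread can also be written with a single clip (a sketch of the same idea, not the actual scipy patch):

```python
import numpy as np

def clamp_r(r):
    # keep the signed correlation within [-1, 1] despite rounding noise,
    # so that stderr computations downstream never see |r| > 1
    return float(np.clip(r, -1.0, 1.0))

# clamp_r(-1.00001) -> -1.0, avoiding the NaN stderr reported above
```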
From josef.pktd at gmail.com Tue Feb 12 07:46:02 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 12 Feb 2013 07:46:02 -0500 Subject: [SciPy-User] scipy.stats.linregress In-Reply-To: References: <20130212085927.1db43472.Jerome.Kieffer@esrf.fr> <20130212131904.b7b97502.Jerome.Kieffer@esrf.fr> Message-ID: On Tue, Feb 12, 2013 at 7:37 AM, wrote: > On Tue, Feb 12, 2013 at 7:19 AM, Jerome Kieffer wrote: >> On Tue, 12 Feb 2013 11:55:31 +0100 >> Daπid wrote: >> >>> This parameter is R**2, the square of R. You are computing it using the >>> squares of the residuals, so everything should be positive. If you get r >>> negative, something has gone terribly wrong. >> >> In [3]: scipy.stats.linregress([1,2,3],[3,2,1]) >> Out[3]: (-1.0, 4.0, -1.0, 9.0031759825137669e-11, 0.0) >> >> I am not very confident in my knowledge in statistics but here R = -1 and it does not look like an error. >> I implemented a method to perform thousands of linear regression and few of them returned -1.00001 (or so) which later gave NaN as stderr. >> >> so either the test should be: >> if (r*r > 1.0): r = r/abs(r) >> or: >> if (r > 1.0): r = 1 >> elif (r < -1.0): r = -1 > > The correlation coefficient that is reported is the signed > correlation, the docstring has an example that takes the square to get > R_squared.
>>> sm.OLS([1,2,3],sm.add_constant([3.1,2,1])).fit().rsquared 0.99924471299093653 >>> stats.linregress([3.1,2,1], [1,2,3])[2] -0.99962228516121865 >>> stats.linregress([3.1,2,1], [1,2,3])[2]**2 0.99924471299093676 Josef Josef > > But because r is signed we should have the check r<-1 then -1 > > Josef > >> >> Cheers, >> >> -- >> Jérôme Kieffer >> On-Line Data analysis / Software Group >> ISDD / ESRF >> tel +33 476 882 445 >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Tue Feb 12 07:54:38 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 12 Feb 2013 07:54:38 -0500 Subject: [SciPy-User] scipy.stats.linregress In-Reply-To: References: <20130212085927.1db43472.Jerome.Kieffer@esrf.fr> <20130212131904.b7b97502.Jerome.Kieffer@esrf.fr> Message-ID: On Tue, Feb 12, 2013 at 7:46 AM, wrote: > On Tue, Feb 12, 2013 at 7:37 AM, wrote: >> On Tue, Feb 12, 2013 at 7:19 AM, Jerome Kieffer wrote: >>> On Tue, 12 Feb 2013 11:55:31 +0100 >>> Daπid wrote: >>> >>>> This parameter is R**2, the square of R. You are computing it using the >>>> squares of the residuals, so everything should be positive. If you get r >>>> negative, something has gone terribly wrong. >>> >>> In [3]: scipy.stats.linregress([1,2,3],[3,2,1]) >>> Out[3]: (-1.0, 4.0, -1.0, 9.0031759825137669e-11, 0.0) >>> >>> I am not very confident in my knowledge in statistics but here R = -1 and it does not look like an error. >>> I implemented a method to perform thousands of linear regression and few of them returned -1.00001 (or so) which later gave NaN as stderr. >>> >>> so either the test should be: >>> if (r*r > 1.0): r = r/abs(r) >>> or: >>> if (r > 1.0): r = 1 >>> elif (r < -1.0): r = -1 >> >> The correlation coefficient that is reported is the signed >> correlation, the docstring has an example that takes the square to get >> R_squared.
>> >> I don't know whether it's the R_squared, I need to check. > >>>> sm.OLS([1,2,3],sm.add_constant([3.1,2,1])).fit().rsquared > 0.99924471299093653 > >>>> stats.linregress([3.1,2,1], [1,2,3])[2] > -0.99962228516121865 >>>> stats.linregress([3.1,2,1], [1,2,3])[2]**2 > 0.99924471299093676 > > Josef > Josef >> >> But because r is signed we should have the check r<-1 then -1 Volunteers for a pull request ? and for checking that the test suite has a case with negative r, so this doesn't get changed by accident. Josef >> >> Josef >> >>> >>> Cheers, >>> >>> -- >>> J?r?me Kieffer >>> On-Line Data analysis / Software Group >>> ISDD / ESRF >>> tel +33 476 882 445 >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user From Jerome.Kieffer at esrf.fr Tue Feb 12 10:33:40 2013 From: Jerome.Kieffer at esrf.fr (Jerome Kieffer) Date: Tue, 12 Feb 2013 16:33:40 +0100 Subject: [SciPy-User] scipy.stats.linregress In-Reply-To: References: <20130212085927.1db43472.Jerome.Kieffer@esrf.fr> <20130212131904.b7b97502.Jerome.Kieffer@esrf.fr> Message-ID: <20130212163340.caa8e158.Jerome.Kieffer@esrf.fr> On Tue, 12 Feb 2013 07:54:38 -0500 josef.pktd at gmail.com wrote: > Volunteers for a pull request ? > and for checking that the test suite has a case with negative r, so > this doesn't get changed by accident. I can ... 
but I am not used to the tests in scipy -- Jérôme Kieffer On-Line Data analysis / Software Group ISDD / ESRF tel +33 476 882 445 From josef.pktd at gmail.com Tue Feb 12 10:48:33 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 12 Feb 2013 10:48:33 -0500 Subject: [SciPy-User] scipy.stats.linregress In-Reply-To: <20130212163340.caa8e158.Jerome.Kieffer@esrf.fr> References: <20130212085927.1db43472.Jerome.Kieffer@esrf.fr> <20130212131904.b7b97502.Jerome.Kieffer@esrf.fr> <20130212163340.caa8e158.Jerome.Kieffer@esrf.fr> Message-ID: On Tue, Feb 12, 2013 at 10:33 AM, Jerome Kieffer wrote: > On Tue, 12 Feb 2013 07:54:38 -0500 > josef.pktd at gmail.com wrote: > >> Volunteers for a pull request ? >> and for checking that the test suite has a case with negative r, so >> this doesn't get changed by accident. > > I can ... but I am not used to the tests in scipy Usually I just have to search the test folder by function. Most of the linregress tests are from an old benchmark test suite, checking mainly difficult cases. This https://github.com/scipy/scipy/blob/master/scipy/stats/tests/test_stats.py#L736 is a test that I added. Similarly, we could add one with a negative coefficient. statsmodels OLS is verified against other packages, so we can get the correct test results from there. Thank you, Josef > > > -- > Jérôme Kieffer > On-Line Data analysis / Software Group > ISDD / ESRF > tel +33 476 882 445 From google at terre-adelie.org Mon Feb 11 15:34:13 2013 From: google at terre-adelie.org (=?ISO-8859-1?Q?J=E9r=F4me?= Kieffer) Date: Mon, 11 Feb 2013 21:34:13 +0100 Subject: [SciPy-User] scipy.stats.linregress bug ?
Message-ID: <20130211213413.647a8311722efe3ae27bf9c9@terre-adelie.org> Dear Scipy Community, I am looking at the scipy.stats.linregress code and see:

# average sum of squares:
ssxm, ssxym, ssyxm, ssym = np.cov(x, y, bias=1).flat
r_num = ssxym
r_den = np.sqrt(ssxm*ssym)
if r_den == 0.0:
    r = 0.0
else:
    r = r_num / r_den
if (r > 1.0): r = 1.0  # from numerical error

If the slope is negative, the correlation factor R is -1, not one, so one should add:

if (r < -1.0): r = -1.0  # from numerical error

or did I completely misunderstand the code? Cheers, -- Jérôme Kieffer From Jerome.Kieffer at esrf.fr Tue Feb 12 11:45:57 2013 From: Jerome.Kieffer at esrf.fr (Jerome Kieffer) Date: Tue, 12 Feb 2013 17:45:57 +0100 Subject: [SciPy-User] scipy.stats.linregress In-Reply-To: References: <20130212085927.1db43472.Jerome.Kieffer@esrf.fr> <20130212131904.b7b97502.Jerome.Kieffer@esrf.fr> <20130212163340.caa8e158.Jerome.Kieffer@esrf.fr> Message-ID: <20130212174557.a1f6337c.Jerome.Kieffer@esrf.fr> On Tue, 12 Feb 2013 10:48:33 -0500 josef.pktd at gmail.com wrote: > On Tue, Feb 12, 2013 at 10:33 AM, Jerome Kieffer wrote: > > On Tue, 12 Feb 2013 07:54:38 -0500 > > josef.pktd at gmail.com wrote: > > > >> Volunteers for a pull request ? > >> and for checking that the test suite has a case with negative r, so > >> this doesn't get changed by accident. > > > > I can ... but I am not used to the tests in scipy > > Usually I just have to search the test folder by function. > the patch for the code is trivial... but getting a valid test case (one that currently breaks) is not trivial, especially since numpy.cov casts to double precision (I have seen R < -1 only with single-precision arithmetic). I am preparing a pull-request.
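[Editor's note] For readers following along, the fix under discussion can be sketched in isolation like this. This is a minimal illustration of the symmetric clamp, not the actual scipy patch; `clamp_r` is a made-up helper name, and the `ssxm`/`ssxym` names mirror the snippet Jérôme quotes above:

```python
import numpy as np

def clamp_r(r):
    # Guard against |r| creeping past 1 through floating-point error,
    # while preserving the sign of the correlation coefficient.
    return min(1.0, max(-1.0, r))

# Perfectly anti-correlated data: r must come out as -1, never below it.
x = np.array([1.0, 2.0, 3.0])
y = np.array([3.0, 2.0, 1.0])

# The same quantities linregress computes internally.
ssxm, ssxym, ssyxm, ssym = np.cov(x, y, bias=1).flat
r = clamp_r(ssxym / np.sqrt(ssxm * ssym))
print(r)  # -1.0
```

With single-precision input, where Jérôme observed values like -1.00001, the clamp keeps downstream quantities such as the standard error finite instead of producing NaN.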
Cheers, -- J?r?me Kieffer On-Line Data analysis / Software Group ISDD / ESRF tel +33 476 882 445 From josef.pktd at gmail.com Tue Feb 12 11:56:20 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 12 Feb 2013 11:56:20 -0500 Subject: [SciPy-User] scipy.stats.linregress In-Reply-To: <20130212174557.a1f6337c.Jerome.Kieffer@esrf.fr> References: <20130212085927.1db43472.Jerome.Kieffer@esrf.fr> <20130212131904.b7b97502.Jerome.Kieffer@esrf.fr> <20130212163340.caa8e158.Jerome.Kieffer@esrf.fr> <20130212174557.a1f6337c.Jerome.Kieffer@esrf.fr> Message-ID: On Tue, Feb 12, 2013 at 11:45 AM, Jerome Kieffer wrote: > On Tue, 12 Feb 2013 10:48:33 -0500 > josef.pktd at gmail.com wrote: > >> On Tue, Feb 12, 2013 at 10:33 AM, Jerome Kieffer wrote: >> > On Tue, 12 Feb 2013 07:54:38 -0500 >> > josef.pktd at gmail.com wrote: >> > >> >> Volunteers for a pull request ? >> >> and for checking that the test suite has a case with negative r, so >> >> this doesn't get changed by accident. >> > >> > I can ... but I am not used to the tests in scipy >> >> Usually I just have to search the test folder by function. >> > > the patch for the code is trivial... > but get a valid test-case (currently breaking) is not trivial, especially that numpy.cov is casting to double precision (I have seen R<-1 only with single precision arithmetic) We should have a test for negative r, as backwards compatibility check for the future, but if it's too difficult to get one with r<-1, then we can leave two lines untested. I don't remember if I have ever seen a case with abs(r)>1. Josef > > I am preparing a pull-request. 
> > Cheers, > -- > J?r?me Kieffer > On-Line Data analysis / Software Group > ISDD / ESRF > tel +33 476 882 445 > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From ddvento at ucar.edu Tue Feb 12 12:05:48 2013 From: ddvento at ucar.edu (Davide Del Vento) Date: Tue, 12 Feb 2013 10:05:48 -0700 Subject: [SciPy-User] number of tests In-Reply-To: References: Message-ID: <511A766C.60302@ucar.edu> I should have added: $ lsb_release -a LSB Version: :core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch Distributor ID: RedHatEnterpriseServer Description: Red Hat Enterprise Linux Server release 6.2 (Santiago) Release: 6.2 Codename: Santiago $ python -c "import scipy; scipy.test('full')" Running unit tests for scipy NumPy version 1.6.2 NumPy is installed in /opt/numpy/1.6.2/intel/13.0.1/lib/python2.7/site-packages/numpy SciPy version 0.11.0 SciPy is installed in /opt/scipy/0.11.0/intel/13.0.1/lib/python2.7/site-packages/scipy Python version 2.7.3 (default, Feb 9 2013, 16:14:16) [GCC 4.7.2] nose version 1.2.1 Does anybody knows why the number of tests run are different among different runs of the same binary/library on different nodes? https://github.com/numpy/numpy/blob/master/doc/TESTS.rst.txt implies they shouldn't... Thanks!! Davide Del Vento, On 02/11/2013 09:00 PM, Davide Del Vento wrote: > I compiled scipy 0.11.0 myself with the intel compiler, on top of > numpy 1.6.2 (with intel compiler too and MKL). I'm trying to assess > whether or not everything has been build fine. Since my machine is > actually a cluster, I'm running the tests in different configurations > (login node and batch script). However, I'm confused by the number of > tests which ran. 
> On the login nodes (either interactively or in a script without the > tty) I get: > > Ran 6218 tests in 514.867s > FAILED (KNOWNFAIL=17, SKIP=42, errors=1, failures=1) > > Whereas in a remote batch node (with a script) I get: > > Ran 6167 tests in 271.833s > FAILED (KNOWNFAIL=17, SKIP=42, errors=201, failures=1) > > I am not worried about the difference in timing, since it's about what > I expected. However, I am surprised about the different number of > tests that ran, because I'd expect it to be the same (possibly with > differences in success vs failure vs skipped, but not in the overall > number). > Instead I've got 51 "missing" tests while running on a remote node. > Had the number of errors been 52, I'd suspect an odd way of > counting what had run; however, the difference in errors is 200, so it must > be something else. > > Why is it so? > > Thanks and Regards, > Davide From josephsmidt at gmail.com Tue Feb 12 14:34:27 2013 From: josephsmidt at gmail.com (Joseph Smidt) Date: Tue, 12 Feb 2013 12:34:27 -0700 Subject: [SciPy-User] How can I interpolate array from spherical to cartesian coordinates? In-Reply-To: <20130212090234.468c3159.Jerome.Kieffer@esrf.fr> References: <20130212090234.468c3159.Jerome.Kieffer@esrf.fr> Message-ID: Hey everyone, I found a solution to this problem so I decided to post it here for posterity's sake. The code below seems to do the trick:

from pylab import *
from scipy.interpolate import interp1d
from scipy.ndimage import map_coordinates
import scitools

def spherical2cartesian(r, th, phi, grid, x, y, z, order=3):
    # Build relationship between Cartesian and spherical coordinates.
    X, Y, Z = scitools.numpytools.meshgrid(x, y, z)
    new_r = np.sqrt(X*X + Y*Y + Z*Z)
    new_th = np.arccos(Z/new_r)
    new_phi = np.arctan2(Y, X)

    # Find these values for the input grid
    ir = interp1d(r, np.arange(len(r)), bounds_error=False)
    ith = interp1d(th, np.arange(len(th)))
    iphi = interp1d(phi, np.arange(len(phi)))

    new_ir = ir(new_r.ravel())
    new_ith = ith(new_th.ravel())
    new_iphi = iphi(new_phi.ravel())

    new_ir[new_r.ravel() > r.max()] = len(r) - 1
    new_ir[new_r.ravel() < r.min()] = 0

    # Interpolate to Cartesian coordinates.
    return map_coordinates(grid, np.array([new_ir, new_ith, new_iphi]),
                           order=order).reshape(new_r.shape)

# Build 3D arrays for spherical coordinates.
r, th, phi = mgrid[0:201, 0:201, 0:201]
r = r/20.0           # r goes from 0 to 10.
th = th/200.0*pi     # Theta goes from 0 to pi
phi = phi/200.0*2*pi # Phi goes from 0 to 2pi

# Density is spherically symmetric. Only depends on r.
density = exp(-r**2/20.0)

# Build ranges for function
r = linspace(0, 200, 200)
th = linspace(0, np.pi, 200)
phi = linspace(-np.pi, np.pi, 200)
x = linspace(-200, 200, 200)
y = linspace(-200, 200, 200)
z = linspace(-200, 200, 200)

# Map to Cartesian coordinates.
density = spherical2cartesian(r, th, phi, density, x, y, z, order=3)

# Plot density, now in Cartesian coordinates,
# not in spherical coordinates.
figure()
imshow(squeeze(density[:, :, 100]))
show()

On Tue, Feb 12, 2013 at 1:02 AM, Jerome Kieffer wrote: > On Mon, 11 Feb 2013 23:30:11 -0700 > Joseph Smidt wrote: > > > > If anyone knows how I could do such a transformation to get density_prime > > with scipy.ndimage.interpolation.map_coordinates or any other interpolator > > for N-dim data I would appreciate it. > > I never did it in 3D but you need the inverse transformation for > map_coordinates: > r, theta, phi -> x, y, z = r*sin(theta)*cos(phi), r*sin(theta)*sin(phi), > r*cos(theta) > I think that's all.
> > Cheers, > -- > Jérôme Kieffer > On-Line Data analysis / Software Group > ISDD / ESRF > tel +33 476 882 445 > -- ------------------------------------------------------------------------ Joseph Smidt Theoretical Division P.O. Box 1663, Mail Stop B283 Los Alamos, NM 87545 Office: 505-665-9752 Fax: 505-667-1931 From ndbecker2 at gmail.com Wed Feb 13 07:05:44 2013 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 13 Feb 2013 07:05:44 -0500 Subject: [SciPy-User] Simple math question Message-ID: A basic combinatorial prob problem, which you will no doubt recognize, would have a solution: 1 - (1 - p)^n Any suggestion to calculate this, when n -> 10^5 ? Best I can think of is clip (n*p, -nan, 1) From pierre at barbierdereuille.net Wed Feb 13 08:23:50 2013 From: pierre at barbierdereuille.net (Pierre Barbier de Reuille) Date: Wed, 13 Feb 2013 14:23:50 +0100 Subject: [SciPy-User] Simple math question In-Reply-To: References: Message-ID: Assuming 0 <= p <= 1, I don't see the problem there. -- Barbier de Reuille Pierre On 13 February 2013 13:05, Neal Becker wrote: > A basic combinatorial prob problem, which you will no doubt recognize, > would > have a solution: > > 1 - (1 - p)^n > > Any suggestion to calculate this, when n -> 10^5 ? > > Best I can think of is > > clip (n*p, -nan, 1) From ndbecker2 at gmail.com Wed Feb 13 08:34:40 2013 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 13 Feb 2013 08:34:40 -0500 Subject: [SciPy-User] Simple math question References: Message-ID: Pierre Barbier de Reuille wrote: > Assuming 0 <= p <= 1, I don't see the problem there. > I had just assumed there'd be numerical accuracy issues, but maybe not.
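[Editor's note] Neal's worry about numerical accuracy can be made concrete. A small sketch comparing the naive formula with a log1p/expm1 formulation (the variant Robert Kern suggests later in this thread), using a p far smaller than the 1e-8 lower bound Neal states later, where the loss of digits becomes visible:

```python
import numpy as np

n = 10**5
p = 1e-12  # well below the 1e-8 lower bound discussed later in the thread

naive = 1.0 - (1.0 - p)**n            # computing 1 - p rounds away low-order digits of p
stable = -np.expm1(n * np.log1p(-p))  # stays accurate even for tiny p

# Both are close to the first-order approximation n*p = 1e-7,
# but the naive form typically agrees only in the leading digits.
print(naive, stable)
```

For the range 1e-8 < p < 1 that Neal actually has, the naive formula is adequate, as Pierre and Josef point out; the expm1/log1p form only starts to matter as p shrinks much further.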
From josef.pktd at gmail.com Wed Feb 13 08:41:27 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 13 Feb 2013 08:41:27 -0500 Subject: [SciPy-User] Simple math question In-Reply-To: References: Message-ID: On Wed, Feb 13, 2013 at 8:34 AM, Neal Becker wrote: > Pierre Barbier de Reuille wrote: > >> Assuming 0 <= p <= 1, I don't see the problem there. >> > > I had just assumed there'd be numerical accuracy issues, but maybe not. Any numerical problems that you might have with the power gets swamped because you can only get within floating point precision of 1. If the results were zero, then there are special function to get better precision for tiny numbers. AFAICS. Josef > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From ndbecker2 at gmail.com Wed Feb 13 08:44:01 2013 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 13 Feb 2013 08:44:01 -0500 Subject: [SciPy-User] Simple math question References: Message-ID: josef.pktd at gmail.com wrote: > On Wed, Feb 13, 2013 at 8:34 AM, Neal Becker wrote: >> Pierre Barbier de Reuille wrote: >> >>> Assuming 0 <= p <= 1, I don't see the problem there. >>> >> >> I had just assumed there'd be numerical accuracy issues, but maybe not. > > Any numerical problems that you might have with the power gets swamped > because you can only get within floating point precision of 1. > > If the results were zero, then there are special function to get > better precision for tiny numbers. > > AFAICS. 
> > Josef Here, 0 <= p < 10^-8 From ndbecker2 at gmail.com Wed Feb 13 08:45:22 2013 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 13 Feb 2013 08:45:22 -0500 Subject: [SciPy-User] Simple math question References: Message-ID: Neal Becker wrote: > josef.pktd at gmail.com wrote: > >> On Wed, Feb 13, 2013 at 8:34 AM, Neal Becker wrote: >>> Pierre Barbier de Reuille wrote: >>> >>>> Assuming 0 <= p <= 1, I don't see the problem there. >>>> >>> >>> I had just assumed there'd be numerical accuracy issues, but maybe not. >> >> Any numerical problems that you might have with the power gets swamped >> because you can only get within floating point precision of 1. >> >> If the results were zero, then there are special function to get >> better precision for tiny numbers. >> >> AFAICS. >> >> Josef > > Here, 0 <= p < 10^-8 Sorry, that should be 10^-8 < p < 1, so 1-p is between 0 and 1-10^-8 From sturla at molden.no Wed Feb 13 10:58:21 2013 From: sturla at molden.no (Sturla Molden) Date: Wed, 13 Feb 2013 16:58:21 +0100 Subject: [SciPy-User] Problem with the mailing lists? Message-ID: <511BB81D.2030505@molden.no> I have not received any e-mail from the scipy and numpy mailing lists since February 4. However I am still able to post. I tried to resubscribe, but no luck. Is there a problem with the mailing lists or my subscription? (I'll try to read any responses over the web archive.) Sturla From robert.kern at gmail.com Wed Feb 13 10:59:50 2013 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 13 Feb 2013 15:59:50 +0000 Subject: [SciPy-User] Problem with the mailing lists? In-Reply-To: <511BB81D.2030505@molden.no> References: <511BB81D.2030505@molden.no> Message-ID: On Wed, Feb 13, 2013 at 3:58 PM, Sturla Molden wrote: > I have not received any e-mail from the scipy and numpy mailing lists > since February 4. However I am still able to post. I tried to > resubscribe, but no luck. > > Is there a problem with the mailing lists or my subscription? 
> > (I'll try to read any responses over the web archive.) Have you checked your spam filters? -- Robert Kern From sturla at molden.no Wed Feb 13 12:16:59 2013 From: sturla at molden.no (Sturla Molden) Date: Wed, 13 Feb 2013 18:16:59 +0100 Subject: [SciPy-User] Problem with the mailing lists? In-Reply-To: References: <511BB81D.2030505@molden.no> Message-ID: <511BCA8B.8060506@molden.no> On 13.02.2013 16:59, Robert Kern wrote: > On Wed, Feb 13, 2013 at 3:58 PM, Sturla Molden wrote: >> I have not received any e-mail from the scipy and numpy mailing lists >> since February 4. However I am still able to post. I tried to >> resubscribe, but no luck. >> >> Is there a problem with the mailing lists or my subscription? >> >> (I'll try to read any responses over the web archive.) > > Have you checked your spam filters? Yes, they were unchanged, and have been unchanged for a very long time. Your last message was received (probably because of the Cc), and got routed correctly to my scipy-user folder. Also I do get mail on the EPD-User list, so I am not rejecting e-mails from Enthought's postmaster (which I believe also run the scipy and numpy lists). Not sure what has happened, but I don't get the e-mails anymore. :-( Sorry for spamming the list, I just had to ask. Sturla From daniele at grinta.net Wed Feb 13 12:21:55 2013 From: daniele at grinta.net (Daniele Nicolodi) Date: Wed, 13 Feb 2013 18:21:55 +0100 Subject: [SciPy-User] Problem with the mailing lists? In-Reply-To: <511BCA8B.8060506@molden.no> References: <511BB81D.2030505@molden.no> <511BCA8B.8060506@molden.no> Message-ID: <511BCBB3.1000908@grinta.net> On 13/02/2013 18:16, Sturla Molden wrote: > Not sure what has happened, but I don't get the e-mails anymore. :-( It is possible that the mailing list manager suspended delivery to your email address after a few messages bounces. Check your subscription preferences through the link at the bottom of each list email. 
Cheers, Daniele From ognen at enthought.com Wed Feb 13 22:23:01 2013 From: ognen at enthought.com (Ognen Duzlevski) Date: Wed, 13 Feb 2013 21:23:01 -0600 Subject: [SciPy-User] Problem with the mailing lists? In-Reply-To: <511BB81D.2030505@molden.no> References: <511BB81D.2030505@molden.no> Message-ID: >From what I can tell your account seems to be active - no issues with it on this end. Ognen On Wed, Feb 13, 2013 at 9:58 AM, Sturla Molden wrote: > I have not received any e-mail from the scipy and numpy mailing lists > since February 4. However I am still able to post. I tried to > resubscribe, but no luck. > > Is there a problem with the mailing lists or my subscription? > > (I'll try to read any responses over the web archive.) > > > Sturla > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Thu Feb 14 08:24:14 2013 From: sturla at molden.no (Sturla Molden) Date: Thu, 14 Feb 2013 14:24:14 +0100 Subject: [SciPy-User] Problem with the mailing lists? In-Reply-To: <511BCA8B.8060506@molden.no> References: <511BB81D.2030505@molden.no> <511BCA8B.8060506@molden.no> Message-ID: > On 13.02.2013 16:59, Robert Kern wrote: > Yes, they were unchanged, and have been unchanged for a very long time. And now the filter is turned completely off. Sorry for the spam. Sturla From vanforeest at gmail.com Thu Feb 14 08:31:32 2013 From: vanforeest at gmail.com (nicky van foreest) Date: Thu, 14 Feb 2013 14:31:32 +0100 Subject: [SciPy-User] Simple math question In-Reply-To: References: Message-ID: 1 - exp(n*log(1.-p)) should do too. Taking logs is an easy way to handle this sort of problems. 
On 13 February 2013 14:45, Neal Becker wrote: > Neal Becker wrote: > > > josef.pktd at gmail.com wrote: > > > >> On Wed, Feb 13, 2013 at 8:34 AM, Neal Becker > wrote: > >>> Pierre Barbier de Reuille wrote: > >>> > >>>> Assuming 0 <= p <= 1, I don't see the problem there. > >>>> > >>> > >>> I had just assumed there'd be numerical accuracy issues, but maybe not. > >> > >> Any numerical problems that you might have with the power gets swamped > >> because you can only get within floating point precision of 1. > >> > >> If the results were zero, then there are special function to get > >> better precision for tiny numbers. > >> > >> AFAICS. > >> > >> Josef > > > > Here, 0 <= p < 10^-8 > Sorry, that should be 10^-8 < p < 1, so 1-p is between 0 and 1-10^-8 > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From isaac.gerg at gergltd.com Thu Feb 14 09:35:53 2013 From: isaac.gerg at gergltd.com (Isaac Gerg) Date: Thu, 14 Feb 2013 09:35:53 -0500 Subject: [SciPy-User] Matlab histeq equivalent Message-ID: Hi, I am wondering if there is a Matlab histeq equivalent in scipy? In googling, I have not found such a function other than one here: http://www.janeriksolem.net/2009/06/histogram-equalization-with-python-and.html This image processing function is common enough I would think it would be in scipy and want to make sure I am not missing it for some trivial reason. Thanks in advance, Isaac -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lists at hilboll.de Thu Feb 14 09:41:42 2013 From: lists at hilboll.de (Andreas Hilboll) Date: Thu, 14 Feb 2013 15:41:42 +0100 Subject: [SciPy-User] Matlab histeq equivalent In-Reply-To: References: Message-ID: <511CF7A6.7040307@hilboll.de> Am 14.02.2013 15:35, schrieb Isaac Gerg: > Hi, > > I am wondering if there is a Matlab histeq equivalent in scipy? > > In googling, I have not found such a function other than one here: > http://www.janeriksolem.net/2009/06/histogram-equalization-with-python-and.html > > This image processing function is common enough I would think it would > be in scipy and want to make sure I am not missing it for some trivial > reason. skimage seems to have it: http://scikit-image.org/docs/dev/auto_examples/plot_local_equalize.html A. From robert.kern at gmail.com Thu Feb 14 11:45:37 2013 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 14 Feb 2013 16:45:37 +0000 Subject: [SciPy-User] Simple math question In-Reply-To: References: Message-ID: On Thu, Feb 14, 2013 at 1:31 PM, nicky van foreest wrote: > 1 - exp(n*log(1.-p)) > > should do too. Taking logs is an easy way to handle this sort of problems. Possibly better depending on the values: -np.expm1(n * np.log(1-p)) You could use np.log1p(-p), but given the values of p, it doesn't appear to be worth it. > On 13 February 2013 14:45, Neal Becker wrote: >> >> Neal Becker wrote: >> >> > josef.pktd at gmail.com wrote: >> > >> >> On Wed, Feb 13, 2013 at 8:34 AM, Neal Becker >> >> wrote: >> >>> Pierre Barbier de Reuille wrote: >> >>> >> >>>> Assuming 0 <= p <= 1, I don't see the problem there. >> >>>> >> >>> >> >>> I had just assumed there'd be numerical accuracy issues, but maybe >> >>> not. >> >> >> >> Any numerical problems that you might have with the power gets swamped >> >> because you can only get within floating point precision of 1. >> >> >> >> If the results were zero, then there are special function to get >> >> better precision for tiny numbers. >> >> >> >> AFAICS. 
>> >> >> >> Josef >> > >> > Here, 0 <= p < 10^-8 >> Sorry, that should be 10^-8 < p < 1, so 1-p is between 0 and 1-10^-8 >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Robert Kern From andrea.gavana at gmail.com Thu Feb 14 16:01:28 2013 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Thu, 14 Feb 2013 22:01:28 +0100 Subject: [SciPy-User] (Possible) new optimization routines - scipy.optimize Message-ID: Hi All, as my team and I are constantly facing very hard/complex numerical optimization problems, I have taken a look at the various *global* optimization routines available in Python and I thought I could throw in a couple of algorithms I implemented, mostly drawing from my previous thesis work. I have implemented two routines (based on numpy and scipy), namely: - AMPGO: Adaptive Memory Programming for Global Optimization: this is my Python implementation of the algorithm described here: http://leeds-faculty.colorado.edu/glover/fred%20pubs/416%20-%20AMP%20(TS)%20for%20Constrained%20Global%20Opt%20w%20Lasdon%20et%20al%20.pdf I have added a few improvements here and there based on my Master Thesis work on the standard Tunnelling Algorithm of Levy, Montalvo and Gomez. - Firefly: the Firefly algorithm, this is my Python implementation of the procedure described here: http://www.mathworks.com/matlabcentral/fileexchange/29693-firefly-algorithm As it appears that most numerical optimization "experts" still report their benchmark results based on "CPU time" or "elapsed time" or similar meaningless performance indicators, I have built a fairly sizeable benchmark test suite to test various optimization routines. 
The test suite currently contains: - 18 one-dimensional test functions with multiple local/global minima; - 85 multivariate problems (where the number of independent variables ranges from 2 to 10), again with multiple local/global minima. The main page describing the rules, algorithms and motivation is here: http://infinity77.net/global_optimization/index.html Algorithms comparisons: http://infinity77.net/global_optimization/multidimensional.html http://infinity77.net/global_optimization/univariate.html Test functions: http://infinity77.net/global_optimization/test_functions.html The overall conclusion is that, while the Firefly method performances are average at best, AMPGO is superior to all the other algorithms I have tried. However, to be fair, it is an algorithm designed for low-dimensional optimization problems (i.e., 1-10 variables). I haven't published the code for the two algorithms (yet), as they are not quite up to the documentation standards of scipy. However, if there is interest from the community to add them (or one of them) to scipy.optimize I will be happy to polish them up and contribute to this great library. The same applies to the benchmark test suite. Andrea. "Imagination Is The Only Weapon In The War Against Reality." 
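[Editor's note] Andrea's objection to CPU-time benchmarks usually leads to counting objective-function evaluations instead, which is machine-independent. A minimal sketch of that idea (illustrative only — this counting wrapper is not part of the benchmark suite described above):

```python
import numpy as np
from scipy.optimize import minimize

evals = {"count": 0}

def rosenbrock(x):
    # Classic 2-D test function; global minimum at (1, 1).
    evals["count"] += 1
    return (1.0 - x[0])**2 + 100.0 * (x[1] - x[0]**2)**2

result = minimize(rosenbrock, x0=np.array([-1.2, 1.0]), method="Nelder-Mead")
# Compare solvers by the evaluations needed to reach the minimum, not wall time.
print(result.x, evals["count"])
```

The same wrapper works for any solver that only sees the objective, which is the derivative-free regime Andrea mentions for AMPGO's default BOBYQA local solver.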
http://www.infinity77.net # ------------------------------------------------------------- # def ask_mailing_list_support(email): if mention_platform_and_version() and include_sample_app(): send_message(email) else: install_malware() erase_hard_drives() # ------------------------------------------------------------- # From josef.pktd at gmail.com Thu Feb 14 16:53:41 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 14 Feb 2013 16:53:41 -0500 Subject: [SciPy-User] (Possible) new optimization routines - scipy.optimize In-Reply-To: References: Message-ID: On Thu, Feb 14, 2013 at 4:01 PM, Andrea Gavana wrote: > Hi All, > > as my team and I are constantly facing very hard/complex numerical > optimization problems, I have taken a look at the various *global* > optimization routines available in Python and I thought I could throw > in a couple of algorithms I implemented, mostly drawing from my > previous thesis work. > > I have implemented two routines (based on numpy and scipy), namely: > > - AMPGO: Adaptive Memory Programming for Global Optimization: this is my Python > implementation of the algorithm described here: > > http://leeds-faculty.colorado.edu/glover/fred%20pubs/416%20-%20AMP%20(TS)%20for%20Constrained%20Global%20Opt%20w%20Lasdon%20et%20al%20.pdf > > I have added a few improvements here and there based on my Master Thesis work > on the standard Tunnelling Algorithm of Levy, Montalvo and Gomez. This could also be a good addition. similar to basinhopping. >From my perspective, this kind of global optimizers are the most promising, (compared to the evolutionary, ...) >From a quick browse: Is the local optimizer fixed to a specific one, or can it be any available solver as in basinhopping? The only thing I might worry about that it only has 6 citations in Google Scholar (which might not mean much if the optimizer is not widely available). 
Given that there seem to be many variations of this kind of optimizers, it might be good to have some background on comparison with similar optimizers. If you have other comparisons of similar optimizers, it would be useful to see them. Also given that you have a large benchmark suite, you could compare it with the new basinhopping in scipy.optimize. > > - Firefly: the Firefly algorithm, this is my Python implementation of > the procedure > described here: > > http://www.mathworks.com/matlabcentral/fileexchange/29693-firefly-algorithm the fileexchange has a large number of "animal" optimizers, and I doubt they are all good. I think there needs to be a convincing case before adding any of them should be added to scipy. On the other hand, having them available outside of scipy would make it easier to try them out. (and see if fireflies, or bees or ants are doing better :) thank you for proposing this, from a potential user. I expect to try out basinhopping soon on estimation parameters of mixture distributions. Josef > > > As it appears that most numerical optimization "experts" still report > their benchmark results based on "CPU time" or "elapsed time" or > similar meaningless performance indicators, I have built a fairly > sizeable benchmark test suite to test various optimization routines. > The test suite currently contains: > > - 18 one-dimensional test functions with multiple local/global minima; > - 85 multivariate problems (where the number of independent variables > ranges from 2 to 10), again with multiple local/global minima. 
> > The main page describing the rules, algorithms and motivation is here: > > http://infinity77.net/global_optimization/index.html > > Algorithms comparisons: > > http://infinity77.net/global_optimization/multidimensional.html > http://infinity77.net/global_optimization/univariate.html > > Test functions: > > http://infinity77.net/global_optimization/test_functions.html > > The overall conclusion is that, while the Firefly method performances > are average at best, AMPGO is superior to all the other algorithms I > have tried. However, to be fair, it is an algorithm designed for > low-dimensional optimization problems (i.e., 1-10 variables). > > I haven't published the code for the two algorithms (yet), as they are > not quite up to the documentation standards of scipy. However, if > there is interest from the community to add them (or one of them) to > scipy.optimize I will be happy to polish them up and contribute to > this great library. The same applies to the benchmark test suite. > > > Andrea. > > "Imagination Is The Only Weapon In The War Against Reality." 
> http://www.infinity77.net > > # ------------------------------------------------------------- # > def ask_mailing_list_support(email): > > if mention_platform_and_version() and include_sample_app(): > send_message(email) > else: > install_malware() > erase_hard_drives() > # ------------------------------------------------------------- # > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From andrea.gavana at gmail.com Thu Feb 14 17:06:14 2013 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Thu, 14 Feb 2013 23:06:14 +0100 Subject: [SciPy-User] (Possible) new optimization routines - scipy.optimize In-Reply-To: References: Message-ID: On 14 February 2013 22:53, wrote: > On Thu, Feb 14, 2013 at 4:01 PM, Andrea Gavana wrote: >> Hi All, >> >> as my team and I are constantly facing very hard/complex numerical >> optimization problems, I have taken a look at the various *global* >> optimization routines available in Python and I thought I could throw >> in a couple of algorithms I implemented, mostly drawing from my >> previous thesis work. >> >> I have implemented two routines (based on numpy and scipy), namely: >> >> - AMPGO: Adaptive Memory Programming for Global Optimization: this is my Python >> implementation of the algorithm described here: >> >> http://leeds-faculty.colorado.edu/glover/fred%20pubs/416%20-%20AMP%20(TS)%20for%20Constrained%20Global%20Opt%20w%20Lasdon%20et%20al%20.pdf >> >> I have added a few improvements here and there based on my Master Thesis work >> on the standard Tunnelling Algorithm of Levy, Montalvo and Gomez. > > This could also be a good addition. similar to basinhopping. > >From my perspective, this kind of global optimizers are the most > promising, (compared to the evolutionary, ...) > > >From a quick browse: Is the local optimizer fixed to a specific one, > or can it be any available solver as in basinhopping? 
It can be changed to be any available solver. As most of the optimization problems I deal with are horrendously complex (i.e., no gradient information available), I decided to use BOBYQA as default local solver inside AMPGO. So the benchmark comparison is fair only against derivative-free algorithms. > The only thing I might worry about that it only has 6 citations in > Google Scholar (which might not mean much if the optimizer is not > widely available). I am not sure how relevant Google Scholar is regarding the effectiveness of an algorithm: AMPGO was the only method able to crack two of our latest (real-life) optimization problems (42 and 9 variables respectively), while all the others failed. > Given that there seem to be many variations of this kind of > optimizers, it might be good to have some background on comparison > with similar optimizers. As far as I know, there is no Open Source implementation of the Tunnelling Algorithm. I used to have access to a Fortran implementation (gradient-based only) in the past, but the code belongs to the National Autonomous University of Mexico. There is no Python implementation of it. AMPGO is a short-memory tunnelling algorithm loosely connected with the standard tunnelling method. > If you have other comparisons of similar optimizers, it would be > useful to see them. Also given that you have a large benchmark suite, > you could compare it with the new basinhopping in scipy.optimize. I'll give it a try tomorrow and report back. I still don't have scipy 0.11 at work (they won't let me upgrade), but I guess I can just get the Python source code for it and run it, can't I? >> - Firefly: the Firefly algorithm, this is my Python implementation of >> the procedure >> described here: >> >> http://www.mathworks.com/matlabcentral/fileexchange/29693-firefly-algorithm > > the fileexchange has a large number of "animal" optimizers, and I > doubt they are all good.
> > I think there needs to be a convincing case before adding any of them > should be added to scipy. > On the other hand, having them available outside of scipy would make > it easier to try them out. > (and see if fireflies, or bees or ants are doing better :) I agree. I did it as an exercise to see if I remembered my Matlab after 10 years of using Numpy/Scipy only :-) Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://www.infinity77.net From josef.pktd at gmail.com Thu Feb 14 18:05:46 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 14 Feb 2013 18:05:46 -0500 Subject: [SciPy-User] (Possible) new optimization routines - scipy.optimize In-Reply-To: References: Message-ID: On Thu, Feb 14, 2013 at 5:06 PM, Andrea Gavana wrote: > On 14 February 2013 22:53, wrote: >> On Thu, Feb 14, 2013 at 4:01 PM, Andrea Gavana wrote: >>> Hi All, >>> >>> as my team and I are constantly facing very hard/complex numerical >>> optimization problems, I have taken a look at the various *global* >>> optimization routines available in Python and I thought I could throw >>> in a couple of algorithms I implemented, mostly drawing from my >>> previous thesis work. >>> >>> I have implemented two routines (based on numpy and scipy), namely: >>> >>> - AMPGO: Adaptive Memory Programming for Global Optimization: this is my Python >>> implementation of the algorithm described here: >>> >>> http://leeds-faculty.colorado.edu/glover/fred%20pubs/416%20-%20AMP%20(TS)%20for%20Constrained%20Global%20Opt%20w%20Lasdon%20et%20al%20.pdf >>> >>> I have added a few improvements here and there based on my Master Thesis work >>> on the standard Tunnelling Algorithm of Levy, Montalvo and Gomez. >> >> This could also be a good addition. similar to basinhopping. >> >From my perspective, this kind of global optimizers are the most >> promising, (compared to the evolutionary, ...) 
>> >> >From a quick browse: Is the local optimizer fixed to a specific one, >> or can it be any available solver as in basinhopping? > > It can be changed to be any available solver. As most of the > optimization problems I deal with are horrendously complex (i.e., no > gradient information available), I decided to use BOBYQA as default > local solver inside AMPGO. So the benchmark comparison is fair only > against derivative-free algorithms. > >> The only thing I might worry about that it only has 6 citations in >> Google Scholar (which might not mean much if the optimizer is not >> widely available). > > I am not sure how relevant the Google Scholar is regarding the > effectiveness of an algorithm: AMPGO was the only method able to crack > two of our latest (real-life) optimization problems (42 and 9 > variables respectively), while all the others failed. That sounds good. I'm all in favor, But I'm just a user and cheerleader for scipy.optimize ( http://scipy-user.10969.n7.nabble.com/OT-global-optimization-hybrid-global-local-search-td5769.html ) > >> Given that there seem to be many variations of this kind of >> optimizers, it might be good to have some background on comparison >> with similar optimizers. > > As far as I know, there are no Open Source implementation of the > Tunnelling Algorithm. I used to have access to a Fortran > implementation (gradient-based only) in the past, but the code belongs > to the National Autonomous University of Mexico. There are no Python > implementation of it. AMPGO is a short-memory tunnelling algorithm > loosely connected with the standard tunnelling method. > >> If you have other comparisons of similar optimizers, it would be >> useful to see them. Also given that you have a large benchmark suite, >> you could compare it with the new basinhopping in scipy.optimize. > > I'll give it a try tomorrow and report back. 
I still don't have scipy > 0.11 at work (they won't let me upgrade), but I guess I can just get > the Python source code for it and run it, can't I? It's essentially just one module to grab and to adjust the imports. That's what I did. In the pull request there is also a link to benchmarks https://github.com/js850/scipy/blob/benchmark_basinhopping/scipy/optimize/benchmarks/_basinhopping_benchmarks.py which were not added to the scipy source AFAICS. Josef > >>> - Firefly: the Firefly algorithm, this is my Python implementation of >>> the procedure >>> described here: >>> >>> http://www.mathworks.com/matlabcentral/fileexchange/29693-firefly-algorithm >> >> the fileexchange has a large number of "animal" optimizers, and I >> doubt they are all good. >> >> I think there needs to be a convincing case before adding any of them >> should be added to scipy. >> On the other hand, having them available outside of scipy would make >> it easier to try them out. >> (and see if fireflies, or bees or ants are doing better :) > > I agree. I did it as an exercise to see if I remembered my Matlab > after 10 years of using Numpy/Scipy only :-) > > > Andrea. > > "Imagination Is The Only Weapon In The War Against Reality." 
> http://www.infinity77.net > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From dave.hirschfeld at gmail.com Fri Feb 15 05:49:54 2013 From: dave.hirschfeld at gmail.com (Dave Hirschfeld) Date: Fri, 15 Feb 2013 10:49:54 +0000 (UTC) Subject: [SciPy-User] (Possible) new optimization routines - scipy.optimize References: Message-ID: Andrea Gavana gmail.com> writes: > > Hi All, > > as my team and I are constantly facing very hard/complex numerical > optimization problems, I have taken a look at the various *global* > optimization routines available in Python and I thought I could throw > in a couple of algorithms I implemented, mostly drawing from my > previous thesis work. > > I haven't published the code for the two algorithms (yet), as they are > not quite up to the documentation standards of scipy. However, if > there is interest from the community to add them (or one of them) to > scipy.optimize I will be happy to polish them up and contribute to > this great library. The same applies to the benchmark test suite. > > Andrea. > I can't speak for the maintainers however as a user I'm definitely interested in scipy having more optimization capabilities, especially in the tricky global optimization area. Having to install a different package for each optimiser you want to try is a pain, especially on Windows where they often require extensive knowledge of both distutils and general compilation of C/C++ software. Thanks, Dave From xrodgers at gmail.com Thu Feb 14 19:06:11 2013 From: xrodgers at gmail.com (Chris Rodgers) Date: Thu, 14 Feb 2013 16:06:11 -0800 Subject: [SciPy-User] Questions/comments about scipy.stats.mannwhitneyu Message-ID: Hi all I use scipy.stats.mannwhitneyu extensively because my data is not at all normal. I have run into a few "gotchas" with this function and I wanted to discuss possible workarounds with the list. 
1) When this function returns a significant result, it is non-trivial to determine the direction of the effect! The Mann-Whitney test is NOT a test on difference of medians or means, so you cannot determine the direction from these statistics. Wikipedia has a good example of why it is not a test for difference of median. http://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U#Illustration_of_object_of_test I've reprinted it here. The data are the finishing order of hares and tortoises. Obviously this is contrived but it indicates the problem. First the setup: results_l = 'H H H H H H H H H T T T T T T T T T T H H H H H H H H H H T T T T T T T T T'.split(' ') h = [i for i in range(len(results_l)) if results_l[i] == 'H'] t = [i for i in range(len(results_l)) if results_l[i] == 'T'] And the results: In [12]: scipy.stats.mannwhitneyu(h, t) Out[12]: (100.0, 0.0097565768849708391) In [13]: np.median(h), np.median(t) Out[13]: (19.0, 18.0) Hares are significantly faster than tortoises, but we cannot determine this from the output of mannwhitneyu. This could be fixed by either returning u1 and u2 from the guts of the function, or testing them in the function and returning the comparison. My current workaround is testing the means which is absolutely wrong in theory but usually correct in practice. 2) The documentation states that the sample sizes must be at least 20. I think this is because the normal approximation for U is not valid for smaller sample sizes. Is there a table of critical values for U in scipy.stats that is appropriate for small sample sizes or should the user implement his or her own? 3) This is picky but is there a reason that it returns a one-tailed p-value, while other tests (eg ttest_*) default to two-tailed? Thanks for any thoughts, tips, or corrections and please don't take these comments as criticisms ... if I didn't enjoy using scipy.stats so much I wouldn't bother bringing this up! 
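As an illustration of the suggested fix, both U statistics can be recovered from the rank sums (u1 + u2 = n1*n2), which resolves the direction directly. A sketch only — `mann_whitney_u_pair` is a hypothetical helper, not part of scipy.stats:

```python
import numpy as np
from scipy.stats import rankdata

def mann_whitney_u_pair(x, y):
    """Hypothetical helper (not scipy API): return (u_x, u_y).

    u_x counts the pairs (xi, yj) with xi > yj (ties count 1/2), so
    the sample with the smaller U tends to have the smaller values.
    """
    n1, n2 = len(x), len(y)
    ranks = rankdata(np.concatenate([x, y]))  # ranks in the pooled sample
    r1 = ranks[:n1].sum()                     # rank sum of the first sample
    u1 = r1 - n1 * (n1 + 1) / 2.0
    u2 = n1 * n2 - u1
    return float(u1), float(u2)

# Hare/tortoise data from above: u_h < u_t, so hares tend to have the
# smaller finishing positions, i.e. they are the faster group.
results_l = ('H H H H H H H H H T T T T T T T T T T '
             'H H H H H H H H H H T T T T T T T T T').split(' ')
h = [i for i in range(len(results_l)) if results_l[i] == 'H']
t = [i for i in range(len(results_l)) if results_l[i] == 'T']
print(mann_whitney_u_pair(h, t))  # (100.0, 261.0)
```

The smaller value (100.0) matches the U reported by mannwhitneyu, and knowing which sample it belongs to gives the direction without falling back on means.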
Chris From xrodgers at gmail.com Thu Feb 14 20:46:31 2013 From: xrodgers at gmail.com (Chris Rodgers) Date: Thu, 14 Feb 2013 17:46:31 -0800 Subject: [SciPy-User] bug in rankdata? Message-ID: The results I'm getting from rankdata seem completely wrong for large datasets. I'll illustrate with a case where all data are equal, so every rank should be len(data) / 2 + 0.5. In [220]: rankdata(np.ones((10000,), dtype=np.int)) Out[220]: array([ 5000.5, 5000.5, 5000.5, ..., 5000.5, 5000.5, 5000.5]) In [221]: rankdata(np.ones((100000,), dtype=np.int)) Out[221]: array([ 7050.82704, 7050.82704, 7050.82704, ..., 7050.82704, 7050.82704, 7050.82704]) In [222]: rankdata(np.ones((1000000,), dtype=np.int)) Out[222]: array([ 1784.293664, 1784.293664, 1784.293664, ..., 1784.293664, 1784.293664, 1784.293664]) In [223]: scipy.__version__ Out[223]: '0.11.0' In [224]: numpy.__version__ Out[224]: '1.6.1' The results are completely off for N>10000 or so. Am I doing something wrong? From MaraMaus at nurfuerspam.de Fri Feb 15 06:04:07 2013 From: MaraMaus at nurfuerspam.de (MaraMaus at nurfuerspam.de) Date: Fri, 15 Feb 2013 12:04:07 +0100 Subject: [SciPy-User] BLAS libraries not found (Windows) Message-ID: <20130215110407.203940@gmx.net> Hi, I'm trying to install scipy (originally I only wanted to install odespy, which requires odelab which requires scipy...). I'm using Windows (32 bit). I followed the instructions in http://www.scipy.org/Installing_SciPy/BuildingGeneral to install BLAS, where I only had to adjust the step where the environment variable is set by export for Windows. 
I use MinGW and blas.tgz was unpacked in the directory /c/users/mara/desktop/PythonDoku/BLAS: Mara at Mara-PC /c/users/mara/desktop/PythonDoku/BLAS $ gfortran -O2 -std=legacy -fno-second-underscore -c *.f then build libfblas.a library, and set environment variable BLAS Mara at Mara-PC /c/users/mara/desktop/PythonDoku/BLAS $ ar r libfblas.a *.o C:\MinGW\bin\ar.exe: creating libfblas.a Mara at Mara-PC /c/users/mara/desktop/PythonDoku/BLAS $ ranlib libfblas.a Mara at Mara-PC /c/users/mara/desktop/PythonDoku/BLAS $ rm -rf *.o Mara at Mara-PC /c/users/mara/desktop/PythonDoku/BLAS $ export BLAS="c\users\mara\desktop\PythonDoku\BLAS\libfblas.a" Now when trying to install scipy via >python setup.py install I encounter the following error: in get_info raise self.notfounderror(self.notfounderror.__doc__) numpy.distutils.system_info.BlasNotFoundError: Blas (http://www.netlib.org/blas/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [blas]) or by setting the BLAS environment variable. I assume it has something to do with the step $ export BLAS="c\users\mara\desktop\PythonDoku\BLAS\libfblas.a" above, which might be wrong...? I would be glad for any help! Thank you in advance, Mara From warren.weckesser at gmail.com Fri Feb 15 10:32:09 2013 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Fri, 15 Feb 2013 10:32:09 -0500 Subject: [SciPy-User] bug in rankdata? In-Reply-To: References: Message-ID: On 2/14/13, Chris Rodgers wrote: > The results I'm getting from rankdata seem completely wrong for large > datasets. I'll illustrate with a case where all data are equal, so > every rank should be len(data) / 2 + 0.5.
> > In [220]: rankdata(np.ones((10000,), dtype=np.int)) > Out[220]: array([ 5000.5, 5000.5, 5000.5, ..., 5000.5, 5000.5, > 5000.5]) > > In [221]: rankdata(np.ones((100000,), dtype=np.int)) > Out[221]: > array([ 7050.82704, 7050.82704, 7050.82704, ..., 7050.82704, > 7050.82704, 7050.82704]) > > In [222]: rankdata(np.ones((1000000,), dtype=np.int)) > Out[222]: > array([ 1784.293664, 1784.293664, 1784.293664, ..., 1784.293664, > 1784.293664, 1784.293664]) > > In [223]: scipy.__version__ > Out[223]: '0.11.0' > > In [224]: numpy.__version__ > Out[224]: '1.6.1' > > > The results are completely off for N>10000 or so. Am I doing something > wrong? Looks like a bug. The code that accumulates the ranks of the tied values is using a 32 bit integer for the sum of the ranks, and this is overflowing. I'll see if I can get this fixed for the imminent release of 0.12. Warren > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Fri Feb 15 11:16:29 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 15 Feb 2013 11:16:29 -0500 Subject: [SciPy-User] Questions/comments about scipy.stats.mannwhitneyu In-Reply-To: References: Message-ID: On Thu, Feb 14, 2013 at 7:06 PM, Chris Rodgers wrote: > Hi all > > I use scipy.stats.mannwhitneyu extensively because my data is not at > all normal. I have run into a few "gotchas" with this function and I > wanted to discuss possible workarounds with the list. Can you open a ticket ? http://projects.scipy.org/scipy/report I partially agree, but any changes won't be backwards compatible, and I don't have time to think about this enough. > > 1) When this function returns a significant result, it is non-trivial > to determine the direction of the effect! The Mann-Whitney test is NOT > a test on difference of medians or means, so you cannot determine the > direction from these statistics. 
Wikipedia has a good example of why > it is not a test for difference of median. > http://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U#Illustration_of_object_of_test > > I've reprinted it here. The data are the finishing order of hares and > tortoises. Obviously this is contrived but it indicates the problem. > First the setup: > results_l = 'H H H H H H H H H T T T T T T T T T T H H H H H H H H H H > T T T T T T T T T'.split(' ') > h = [i for i in range(len(results_l)) if results_l[i] == 'H'] > t = [i for i in range(len(results_l)) if results_l[i] == 'T'] > > And the results: > In [12]: scipy.stats.mannwhitneyu(h, t) > Out[12]: (100.0, 0.0097565768849708391) > > In [13]: np.median(h), np.median(t) > Out[13]: (19.0, 18.0) > > Hares are significantly faster than tortoises, but we cannot determine > this from the output of mannwhitneyu. This could be fixed by either > returning u1 and u2 from the guts of the function, or testing them in > the function and returning the comparison. My current workaround is > testing the means which is absolutely wrong in theory but usually > correct in practice. In some cases I'm reluctant to return the direction when we use a two-sided test. In this case we don't have a one sided tests. In analogy to ttests, I think we could return the individual u1, u2 > > 2) The documentation states that the sample sizes must be at least 20. > I think this is because the normal approximation for U is not valid > for smaller sample sizes. Is there a table of critical values for U in > scipy.stats that is appropriate for small sample sizes or should the > user implement his or her own? not available in scipy. I never looked at this. pull requests for this are welcome if it works. It would be backwards compatible. > > 3) This is picky but is there a reason that it returns a one-tailed > p-value, while other tests (eg ttest_*) default to two-tailed? legacy wart, that I don't like, but it wasn't offending me enough to change it. 
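For the one-tailed default in (3), the two-sided value can be recovered by doubling, since the large-sample null distribution of U is symmetric (normal). A convention sketch — `two_sided_from_one_tailed` is a hypothetical name, not a scipy function:

```python
def two_sided_from_one_tailed(p_one):
    # For a symmetric null distribution (the normal approximation used
    # by mannwhitneyu), the two-sided p-value is twice the one-tailed
    # value, capped at 1.
    return min(1.0, 2.0 * p_one)

# One-tailed p-value from the hare/tortoise example above:
print(two_sided_from_one_tailed(0.0097565768849708391))  # ~0.0195
```

This doubling is only valid under the normal approximation; for exact small-sample tables the one- and two-sided values have to be looked up separately.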
> > > Thanks for any thoughts, tips, or corrections and please don't take > these comments as criticisms ... if I didn't enjoy using scipy.stats > so much I wouldn't bother bringing this up! Thanks for the feedback. In large parts review of the functions relies on comments by users (and future contributors). The main problem is how to make changes without breaking current usage, since many of those functions are widely used. Josef > > Chris > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Fri Feb 15 11:35:03 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 15 Feb 2013 11:35:03 -0500 Subject: [SciPy-User] Questions/comments about scipy.stats.mannwhitneyu In-Reply-To: References: Message-ID: On Fri, Feb 15, 2013 at 11:16 AM, wrote: > On Thu, Feb 14, 2013 at 7:06 PM, Chris Rodgers wrote: >> Hi all >> >> I use scipy.stats.mannwhitneyu extensively because my data is not at >> all normal. I have run into a few "gotchas" with this function and I >> wanted to discuss possible workarounds with the list. > > Can you open a ticket ? http://projects.scipy.org/scipy/report > > I partially agree, but any changes won't be backwards compatible, and > I don't have time to think about this enough. > >> >> 1) When this function returns a significant result, it is non-trivial >> to determine the direction of the effect! The Mann-Whitney test is NOT >> a test on difference of medians or means, so you cannot determine the >> direction from these statistics. Wikipedia has a good example of why >> it is not a test for difference of median. >> http://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U#Illustration_of_object_of_test >> >> I've reprinted it here. The data are the finishing order of hares and >> tortoises. Obviously this is contrived but it indicates the problem. 
>> First the setup: >> results_l = 'H H H H H H H H H T T T T T T T T T T H H H H H H H H H H >> T T T T T T T T T'.split(' ') >> h = [i for i in range(len(results_l)) if results_l[i] == 'H'] >> t = [i for i in range(len(results_l)) if results_l[i] == 'T'] >> >> And the results: >> In [12]: scipy.stats.mannwhitneyu(h, t) >> Out[12]: (100.0, 0.0097565768849708391) >> >> In [13]: np.median(h), np.median(t) >> Out[13]: (19.0, 18.0) >> >> Hares are significantly faster than tortoises, but we cannot determine >> this from the output of mannwhitneyu. This could be fixed by either >> returning u1 and u2 from the guts of the function, or testing them in >> the function and returning the comparison. My current workaround is >> testing the means which is absolutely wrong in theory but usually >> correct in practice. > > In some cases I'm reluctant to return the direction when we use a > two-sided test. In this case we don't have a one sided tests. > In analogy to ttests, I think we could return the individual u1, u2 to expand a bit: For the Kolmogorov Smirnov test, we refused to return an indication of the direction. The alternative is two-sided and the distribution of the test statistic and the test statistic itself are different in the one-sided test. So we shouldn't draw any one-sided conclusions from the two-sided test.
> > not available in scipy. I never looked at this. > pull requests for this are welcome if it works. It would be backwards > compatible. > >> >> 3) This is picky but is there a reason that it returns a one-tailed >> p-value, while other tests (eg ttest_*) default to two-tailed? > > legacy wart, that I don't like, but it wasn't offending me enough to change it. > >> >> >> Thanks for any thoughts, tips, or corrections and please don't take >> these comments as criticisms ... if I didn't enjoy using scipy.stats >> so much I wouldn't bother bringing this up! > > Thanks for the feedback. > In large parts review of the functions relies on comments by users > (and future contributors). > > The main problem is how to make changes without breaking current > usage, since many of those functions are widely used. > > Josef > > >> >> Chris >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Fri Feb 15 11:58:19 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 15 Feb 2013 11:58:19 -0500 Subject: [SciPy-User] Questions/comments about scipy.stats.mannwhitneyu In-Reply-To: References: Message-ID: On Fri, Feb 15, 2013 at 11:35 AM, wrote: > On Fri, Feb 15, 2013 at 11:16 AM, wrote: >> On Thu, Feb 14, 2013 at 7:06 PM, Chris Rodgers wrote: >>> Hi all >>> >>> I use scipy.stats.mannwhitneyu extensively because my data is not at >>> all normal. I have run into a few "gotchas" with this function and I >>> wanted to discuss possible workarounds with the list. >> >> Can you open a ticket ? http://projects.scipy.org/scipy/report >> >> I partially agree, but any changes won't be backwards compatible, and >> I don't have time to think about this enough. >> >>> >>> 1) When this function returns a significant result, it is non-trivial >>> to determine the direction of the effect! 
The Mann-Whitney test is NOT >>> a test on difference of medians or means, so you cannot determine the >>> direction from these statistics. Wikipedia has a good example of why >>> it is not a test for difference of median. >>> http://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U#Illustration_of_object_of_test >>> >>> I've reprinted it here. The data are the finishing order of hares and >>> tortoises. Obviously this is contrived but it indicates the problem. >>> First the setup: >>> results_l = 'H H H H H H H H H T T T T T T T T T T H H H H H H H H H H >>> T T T T T T T T T'.split(' ') >>> h = [i for i in range(len(results_l)) if results_l[i] == 'H'] >>> t = [i for i in range(len(results_l)) if results_l[i] == 'T'] >>> >>> And the results: >>> In [12]: scipy.stats.mannwhitneyu(h, t) >>> Out[12]: (100.0, 0.0097565768849708391) >>> >>> In [13]: np.median(h), np.median(t) >>> Out[13]: (19.0, 18.0) >>> >>> Hares are significantly faster than tortoises, but we cannot determine >>> this from the output of mannwhitneyu. This could be fixed by either >>> returning u1 and u2 from the guts of the function, or testing them in >>> the function and returning the comparison. My current workaround is >>> testing the means which is absolutely wrong in theory but usually >>> correct in practice. >> >> In some cases I'm reluctant to return the direction when we use a >> two-sided test. In this case we don't have a one sided tests. >> In analogy to ttests, I think we could return the individual u1, u2 > > to expand a bit: > For the Kolmogorov Smirnov test, we refused to return an indication of > the direction. The alternative is two-sided and the distribution of > the test statistic and the test statistic itself are different in the > one-sided test. > So we shouldn't draw any one-sided conclusions from the two-sided test.
> > In the t_test and mannwhitenyu the test statistic is normally > distributed (in large samples), so we can infer the one-sided test > from the two-sided statistic and p-value. > > If there are tables for the small sample case, we would need to check > if we get consistent interpretation between one- and two-sided tests. > > Josef > >> >>> >>> 2) The documentation states that the sample sizes must be at least 20. >>> I think this is because the normal approximation for U is not valid >>> for smaller sample sizes. Is there a table of critical values for U in >>> scipy.stats that is appropriate for small sample sizes or should the >>> user implement his or her own? >> >> not available in scipy. I never looked at this. >> pull requests for this are welcome if it works. It would be backwards >> compatible. since I just looked at a table collection for some other test, they also have Mann-Whitney U statistic http://faculty.washington.edu/heagerty/Books/Biostatistics/TABLES/Wilcoxon/ but I didn't check if it matches the test statistic in scipy.stats Josef >> >>> >>> 3) This is picky but is there a reason that it returns a one-tailed >>> p-value, while other tests (eg ttest_*) default to two-tailed? >> >> legacy wart, that I don't like, but it wasn't offending me enough to change it. >> >>> >>> >>> Thanks for any thoughts, tips, or corrections and please don't take >>> these comments as criticisms ... if I didn't enjoy using scipy.stats >>> so much I wouldn't bother bringing this up! >> >> Thanks for the feedback. >> In large parts review of the functions relies on comments by users >> (and future contributors). >> >> The main problem is how to make changes without breaking current >> usage, since many of those functions are widely used. 
>> >> Josef >> >> >>> >>> Chris >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user From warren.weckesser at gmail.com Fri Feb 15 13:22:09 2013 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Fri, 15 Feb 2013 13:22:09 -0500 Subject: [SciPy-User] bug in rankdata? In-Reply-To: References: Message-ID: On Fri, Feb 15, 2013 at 10:32 AM, Warren Weckesser < warren.weckesser at gmail.com> wrote: > On 2/14/13, Chris Rodgers wrote: > > The results I'm getting from rankdata seem completely wrong for large > > datasets. I'll illustrate with a case where all data are equal, so > > every rank should be len(data) / 2 + 0.5. > > > > In [220]: rankdata(np.ones((10000,), dtype=np.int)) > > Out[220]: array([ 5000.5, 5000.5, 5000.5, ..., 5000.5, 5000.5, > > 5000.5]) > > > > In [221]: rankdata(np.ones((100000,), dtype=np.int)) > > Out[221]: > > array([ 7050.82704, 7050.82704, 7050.82704, ..., 7050.82704, > > 7050.82704, 7050.82704]) > > > > In [222]: rankdata(np.ones((1000000,), dtype=np.int)) > > Out[222]: > > array([ 1784.293664, 1784.293664, 1784.293664, ..., 1784.293664, > > 1784.293664, 1784.293664]) > > > > In [223]: scipy.__version__ > > Out[223]: '0.11.0' > > > > In [224]: numpy.__version__ > > Out[224]: '1.6.1' > > > > > > The results are completely off for N>10000 or so. Am I doing something > > wrong? > > > Looks like a bug. The code that accumulates the ranks of the tied > values is using a 32 bit integer for the sum of the ranks, and this is > overflowing. I'll see if I can get this fixed for the imminent > release of 0.12. 
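The reported numbers are exactly what a 32-bit signed accumulator would produce for the rank sum of n tied values: the sum 1 + 2 + ... + n = n*(n+1)/2 first exceeds 2**31 - 1 somewhere between n = 10**4 and n = 10**5. A sketch simulating the suspected overflow (not the actual scipy code):

```python
def buggy_tied_mean_rank(n):
    # Sum of the ranks 1..n of n tied values, accumulated as if in a
    # 32-bit signed integer (wraparound), then averaged over the ties.
    rank_sum = (n * (n + 1) // 2) % (2 ** 32)
    if rank_sum >= 2 ** 31:        # reinterpret the bit pattern as signed
        rank_sum -= 2 ** 32
    return rank_sum / n

print(buggy_tied_mean_rank(10000))    # 5000.5 (no overflow yet)
print(buggy_tied_mean_rank(100000))   # 7050.82704
print(buggy_tied_mean_rank(1000000))  # 1784.293664
```

These reproduce the values in the report above, which supports the 32-bit overflow diagnosis.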
> > Warren > > A pull request with the fix is here: https://github.com/scipy/scipy/pull/436 Warren > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From g_haslett at yahoo.co.uk Fri Feb 15 14:13:17 2013 From: g_haslett at yahoo.co.uk (Garvin Haslett) Date: Fri, 15 Feb 2013 19:13:17 +0000 (UTC) Subject: [SciPy-User] numpy.random.choice throws AttributeError when called with a list of objects. Message-ID: The code below results in: AttributeError: SgCoord instance has no attribute 'ndim' Given that an example in the documentation operates on an array of strings, I'd expect numpy.random.choice to work with an array of objects. Is this not the case? ------------------------------- import numpy as np class SgCoord(): "A 2D coordinate" def __init__(self, x = 0, y = 0): self.x = x self.y = y if __name__ == '__main__': l = [] l.append(SgCoord(1,1)) np.array(l) np.random.choice(l) From sebastian at sipsolutions.net Fri Feb 15 14:35:52 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 15 Feb 2013 20:35:52 +0100 Subject: [SciPy-User] numpy.random.choice throws AttributeError when called with a list of objects. In-Reply-To: References: Message-ID: <1360956952.20932.6.camel@sebastian-laptop> On Fri, 2013-02-15 at 19:13 +0000, Garvin Haslett wrote: > The code below results in: > AttributeError: SgCoord instance has no attribute 'ndim' > > Given that an example in the documentation operates on an array of strings, > I'd expect numpy.random.choice to work with an array of objects. Is this > not the case? > Hey, Yes you are right, it is a bug by me. It is due to code ensuring that you get an array if you give size=() being wrong for object arrays. Anyway it means it fails for size=None and object arrays right now, so thanks for reporting. 
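In the meantime, a minimal workaround sketch: per the description above, only the size=None path fails for object arrays, so passing an explicit size and indexing the result sidesteps the bug (variable names are illustrative):

```python
import numpy as np

class SgCoord:
    "A 2D coordinate"
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

coords = np.array([SgCoord(1, 1), SgCoord(2, 2)], dtype=object)
# size=1 avoids the failing size=None code path for object arrays;
# index [0] to unwrap the single drawn element.
pick = np.random.choice(coords, size=1)[0]
print(isinstance(pick, SgCoord))  # True
```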
However, it will work perfectly fine for any other size or dtype, so I hope it is not a big deal for you, as it will give you no trouble if your real application draws multiple items at once. Regards, Sebastian > ------------------------------- > > import numpy as np > > class SgCoord(): > "A 2D coordinate" > def __init__(self, x = 0, y = 0): > self.x = x > self.y = y > > if __name__ == '__main__': > l = [] > l.append(SgCoord(1,1)) > np.array(l) > np.random.choice(l) > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From scopatz at gmail.com Fri Feb 15 12:43:17 2013 From: scopatz at gmail.com (Anthony Scopatz) Date: Fri, 15 Feb 2013 11:43:17 -0600 Subject: [SciPy-User] SciPy 2013 Announcement -- June 24 - 29, 2013, Austin, TX! Message-ID: Hello All, As this year's conference communications co-chair, it is my extreme pleasure to announce the *SciPy 2013 conference from June 24th - 29th in sunny Austin, Texas, USA.* Please see our website (or below) for more details. A call for presentations, tutorials, and papers will be coming out very soon. Please make sure to sign up on our mailing list, follow us on twitter, or Google Plus so that you don't miss out on any important updates. I sincerely hope that you can attend this year and make 2013 the best year for SciPy yet! Happy Hacking! Anthony Scopatz SciPy 2013 Conference Announcement ---------------------------------- SciPy 2013, the twelfth annual Scientific Computing with Python conference, will be held this June 24th-29th in Austin, Texas. SciPy is a community dedicated to the advancement of scientific computing through open source Python software for mathematics, science, and engineering. The annual SciPy Conference allows participants from academic, commercial, and governmental organizations to showcase their latest projects, learn from skilled users and developers, and collaborate on code development.
The conference consists of two days of tutorials followed by two days of presentations, and concludes with two days of developer sprints on projects of interest to the attendees. Specialized Tracks ------------------ This year we are happy to announce two specialized tracks run in parallel to the general conference: *Machine Learning* In recent years, Python's machine learning libraries have rapidly matured, with a flurry of new libraries and cutting-edge algorithm implementation and development occurring within Python. As Python makes these algorithms more accessible, machine learning algorithm application has spread across disciplines. Showcase your favorite machine learning library or how it has been used as an effective tool in your work! *Reproducible Science* Over recent years, the Open Science movement has stoked a renewed acknowledgement of the importance of reproducible research. The goals of this movement include improving the dissemination of progress, preventing fraud through transparency, and enabling deeper/wider development and collaboration. This track will discuss the tools and methods used to achieve reproducible scientific computing. Domain-specific Mini-symposia ----------------------------- Introduced in 2012, mini-symposia are held to discuss scientific computing applied to a specific scientific domain/industry during a half-afternoon after the general conference. Their goal is to promote industry-specific libraries and tools, and gather people with similar interests for discussions. Mini-symposia on the following topics will take place this year: - Meteorology, climatology, and atmospheric and oceanic science - Astronomy and astrophysics - Medical imaging - Bio-informatics Tutorials --------- Multiple interactive half-day tutorials will be taught by community experts. The tutorials provide conceptual and practical coverage of tools that have broad interest, at both introductory and advanced levels.
This year, a third track will be added, specifically targeting programmers with no prior knowledge of scientific Python. Developer Sprints ----------------- A hackathon environment is set up for attendees to work on the core SciPy packages or their own personal projects. The conference is an opportunity for developers who are usually physically separated to come together and engage in highly productive sessions. It is also an occasion for new community members to introduce themselves and receive tips from community experts. This year, some of the sprints will be scheduled and announced ahead of the conference. Birds-of-a-Feather (BOF) Sessions --------------------------------- Birds-of-a-Feather sessions are self-organized discussions that run parallel to the main conference. The BOF sessions cover primary, tangential, or unrelated topics in an interactive, discussion setting. This year, some of the BOF sessions will be scheduled and announced ahead of the conference. Important Dates --------------- - March 18th: Presentation abstracts, poster, tutorial submission deadline. Application for sponsorship deadline. - April 15th: Speakers selected - April 22nd: Sponsorship acceptance deadline - May 1st: Speaker schedule announced - May 5th: Paper submission deadline - May 6th: Early-bird registration ends - June 24th-29th: 2 days of tutorials, 2 days of conference, 2 days of sprints We look forward to a very exciting conference and hope to see you all there.
The SciPy2013 organization team: * Andy Terrel, Co-chair * Jonathan Rocher, Co-chair * Katy Huff, Program Committee co-chair * Matt McCormick, Program Committee co-chair * Dharhas Pothina, Tutorial co-chair * Francesc Alted, Tutorial co-chair * Corran Webster, Sprint co-chair * Peter Wang, Sprint co-chair * Matthew Turk, BoF co-chair * Jarrod Millman, Proceeding co-chair * Stéfan van der Walt, Proceeding co-chair * Anthony Scopatz, Communications co-chair * Majken Tranby, Communications co-chair * Jeff Daily, Financial Aid co-chair * John Wiggins, Financial Aid co-chair * Leah Jones, Operations chair * Brett Murphy, Sponsor chair * Bill Cowan, Financial chair -------------- next part -------------- An HTML attachment was scrubbed... URL: From xrodgers at gmail.com Fri Feb 15 13:29:38 2013 From: xrodgers at gmail.com (Chris Rodgers) Date: Fri, 15 Feb 2013 10:29:38 -0800 Subject: [SciPy-User] bug in rankdata? In-Reply-To: References: Message-ID: Thanks very much! I discovered this bug because Mann-Whitney U was giving me bizarre results, like a negative U statistic. My data is a large number of integer counts, mostly zeros, which is the worst case for ties. Until I can update scipy, I'll either write my own rankdata method, which will be very slow, or I'll use the R equivalent which is more feature-ful (but then I have to figure out rpy2 which will also be slow). On Fri, Feb 15, 2013 at 10:22 AM, Warren Weckesser wrote: > > > On Fri, Feb 15, 2013 at 10:32 AM, Warren Weckesser > wrote: >> >> On 2/14/13, Chris Rodgers wrote: >> > The results I'm getting from rankdata seem completely wrong for large >> > datasets. I'll illustrate with a case where all data are equal, so >> > every rank should be len(data) / 2 + 0.5.
>> > >> > In [220]: rankdata(np.ones((10000,), dtype=np.int)) >> > Out[220]: array([ 5000.5, 5000.5, 5000.5, ..., 5000.5, 5000.5, >> > 5000.5]) >> > >> > In [221]: rankdata(np.ones((100000,), dtype=np.int)) >> > Out[221]: >> > array([ 7050.82704, 7050.82704, 7050.82704, ..., 7050.82704, >> > 7050.82704, 7050.82704]) >> > >> > In [222]: rankdata(np.ones((1000000,), dtype=np.int)) >> > Out[222]: >> > array([ 1784.293664, 1784.293664, 1784.293664, ..., 1784.293664, >> > 1784.293664, 1784.293664]) >> > >> > In [223]: scipy.__version__ >> > Out[223]: '0.11.0' >> > >> > In [224]: numpy.__version__ >> > Out[224]: '1.6.1' >> > >> > >> > The results are completely off for N>10000 or so. Am I doing something >> > wrong? >> >> >> Looks like a bug. The code that accumulates the ranks of the tied >> values is using a 32 bit integer for the sum of the ranks, and this is >> overflowing. I'll see if I can get this fixed for the imminent >> release of 0.12. >> >> Warren >> > > > A pull request with the fix is here: > https://github.com/scipy/scipy/pull/436 > > > Warren > > >> >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From xrodgers at gmail.com Fri Feb 15 13:44:32 2013 From: xrodgers at gmail.com (Chris Rodgers) Date: Fri, 15 Feb 2013 10:44:32 -0800 Subject: [SciPy-User] Questions/comments about scipy.stats.mannwhitneyu In-Reply-To: References: Message-ID: Thanks Josef. Your points make sense to me. While we're on the subject, maybe I should ask whether this function is even appropriate for my data. My data are Poisson-like integer counts, and I want to know if the rate is significantly higher in dataset1 or dataset2. 
I'm reluctant to use poissfit because there is a scientific reason to believe that my data might deviate significantly from Poisson, although I haven't checked this statistically. Mann-whitney U seemed like a safe alternative because it doesn't make distributional assumptions and it deals with ties, which is especially important for me because half the counts or more can be zero. Does that seem like a good choice, as long as I have >20 samples and the large-sample approximation is appropriate? Comments welcome. Thanks Chris On Fri, Feb 15, 2013 at 8:58 AM, wrote: > On Fri, Feb 15, 2013 at 11:35 AM, wrote: >> On Fri, Feb 15, 2013 at 11:16 AM, wrote: >>> On Thu, Feb 14, 2013 at 7:06 PM, Chris Rodgers wrote: >>>> Hi all >>>> >>>> I use scipy.stats.mannwhitneyu extensively because my data is not at >>>> all normal. I have run into a few "gotchas" with this function and I >>>> wanted to discuss possible workarounds with the list. >>> >>> Can you open a ticket ? http://projects.scipy.org/scipy/report >>> >>> I partially agree, but any changes won't be backwards compatible, and >>> I don't have time to think about this enough. >>> >>>> >>>> 1) When this function returns a significant result, it is non-trivial >>>> to determine the direction of the effect! The Mann-Whitney test is NOT >>>> a test on difference of medians or means, so you cannot determine the >>>> direction from these statistics. Wikipedia has a good example of why >>>> it is not a test for difference of median. >>>> http://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U#Illustration_of_object_of_test >>>> >>>> I've reprinted it here. The data are the finishing order of hares and >>>> tortoises. Obviously this is contrived but it indicates the problem. 
>>>> First the setup: >>>> results_l = 'H H H H H H H H H T T T T T T T T T T H H H H H H H H H H >>>> T T T T T T T T T'.split(' ') >>>> h = [i for i in range(len(results_l)) if results_l[i] == 'H'] >>>> t = [i for i in range(len(results_l)) if results_l[i] == 'T'] >>>> >>>> And the results: >>>> In [12]: scipy.stats.mannwhitneyu(h, t) >>>> Out[12]: (100.0, 0.0097565768849708391) >>>> >>>> In [13]: np.median(h), np.median(t) >>>> Out[13]: (19.0, 18.0) >>>> >>>> Hares are significantly faster than tortoises, but we cannot determine >>>> this from the output of mannwhitneyu. This could be fixed by either >>>> returning u1 and u2 from the guts of the function, or testing them in >>>> the function and returning the comparison. My current workaround is >>>> testing the means which is absolutely wrong in theory but usually >>>> correct in practice. >>> >>> In some cases I'm reluctant to return the direction when we use a >>> two-sided test. In this case we don't have a one-sided test. >>> In analogy to ttests, I think we could return the individual u1, u2 >> >> to expand a bit: >> For the Kolmogorov Smirnov test, we refused to return an indication of >> the direction. The alternative is two-sided, and both the test >> statistic and its distribution are different in the >> one-sided test. >> So we shouldn't draw any one-sided conclusions from the two-sided test. >> >> In the t-test and mannwhitneyu the test statistic is normally >> distributed (in large samples), so we can infer the one-sided test >> from the two-sided statistic and p-value. >> >> If there are tables for the small sample case, we would need to check >> if we get consistent interpretation between one- and two-sided tests. >> >> Josef >> >>> >>>> >>>> 2) The documentation states that the sample sizes must be at least 20. >>>> I think this is because the normal approximation for U is not valid >>>> for smaller sample sizes.
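Josef's suggestion of returning the individual u1, u2 can be illustrated on the hare/tortoise data: the two statistics satisfy the identity U1 + U2 = n1*n2, so whichever direction information the current return value drops is recoverable from the rank sums. A self-contained sketch (pure Python, from the textbook rank-sum definition, assuming no ties — which holds for finishing positions):

```python
def mann_whitney_u(x, y):
    """Both Mann-Whitney U statistics from the rank-sum definition.

    Assumes no ties between observations (true for finishing positions)."""
    combined = sorted(x + y)
    rank = {v: i + 1 for i, v in enumerate(combined)}   # 1-based ranks
    r1 = sum(rank[v] for v in x)                        # rank sum of sample 1
    n1, n2 = len(x), len(y)
    u1 = r1 - n1 * (n1 + 1) // 2
    u2 = n1 * n2 - u1                                   # identity: U1 + U2 = n1*n2
    return u1, u2

results_l = ('H H H H H H H H H T T T T T T T T T T '
             'H H H H H H H H H H T T T T T T T T T').split()
h = [i for i in range(len(results_l)) if results_l[i] == 'H']
t = [i for i in range(len(results_l)) if results_l[i] == 'T']

u1, u2 = mann_whitney_u(h, t)
print(u1, u2)  # 100 261 -- u1 matches the statistic scipy reports; u1 < u2
               # says the hares occupy the earlier finishing positions
```

So returning (or exposing) both values would let the caller read off the direction without falling back on means.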
Is there a table of critical values for U in >>>> scipy.stats that is appropriate for small sample sizes or should the >>>> user implement his or her own? >>> >>> not available in scipy. I never looked at this. >>> pull requests for this are welcome if it works. It would be backwards >>> compatible. > > since I just looked at a table collection for some other test, they > also have Mann-Whitney U statistic > http://faculty.washington.edu/heagerty/Books/Biostatistics/TABLES/Wilcoxon/ > but I didn't check if it matches the test statistic in scipy.stats > > Josef > >>> >>>> >>>> 3) This is picky but is there a reason that it returns a one-tailed >>>> p-value, while other tests (eg ttest_*) default to two-tailed? >>> >>> legacy wart, that I don't like, but it wasn't offending me enough to change it. >>> >>>> >>>> >>>> Thanks for any thoughts, tips, or corrections and please don't take >>>> these comments as criticisms ... if I didn't enjoy using scipy.stats >>>> so much I wouldn't bother bringing this up! >>> >>> Thanks for the feedback. >>> In large parts review of the functions relies on comments by users >>> (and future contributors). >>> >>> The main problem is how to make changes without breaking current >>> usage, since many of those functions are widely used. >>> >>> Josef >>> >>> >>>> >>>> Chris >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From warren.weckesser at gmail.com Sat Feb 16 13:22:57 2013 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Sat, 16 Feb 2013 13:22:57 -0500 Subject: [SciPy-User] bug in rankdata? In-Reply-To: References: Message-ID: On 2/15/13, Chris Rodgers wrote: > Thanks very much! 
I discovered this bug because mann-whitney U was > giving me bizarre results, like a negative U statistic. My data is a > large number of integer counts, mostly zeros, which is the worst case > for ties. > > Until I can update scipy, I'll either write my own rankdata method, > which will be very slow, or I'll use the R equivalent which is more > feature-ful (but then I have to figure out rpy2 which will also be > slow). You could also try pandas (http://pandas.pydata.org/). The DataFrame and Series classes have a 'rank' method (http://pandas.pydata.org/pandas-docs/stable/computation.html#data-ranking). Warren > > On Fri, Feb 15, 2013 at 10:22 AM, Warren Weckesser > wrote: >> >> >> On Fri, Feb 15, 2013 at 10:32 AM, Warren Weckesser >> wrote: >>> >>> On 2/14/13, Chris Rodgers wrote: >>> > The results I'm getting from rankdata seem completely wrong for large >>> > datasets. I'll illustrate with a case where all data are equal, so >>> > every rank should be len(data) / 2 + 0.5. >>> > >>> > In [220]: rankdata(np.ones((10000,), dtype=np.int)) >>> > Out[220]: array([ 5000.5, 5000.5, 5000.5, ..., 5000.5, 5000.5, >>> > 5000.5]) >>> > >>> > In [221]: rankdata(np.ones((100000,), dtype=np.int)) >>> > Out[221]: >>> > array([ 7050.82704, 7050.82704, 7050.82704, ..., 7050.82704, >>> > 7050.82704, 7050.82704]) >>> > >>> > In [222]: rankdata(np.ones((1000000,), dtype=np.int)) >>> > Out[222]: >>> > array([ 1784.293664, 1784.293664, 1784.293664, ..., 1784.293664, >>> > 1784.293664, 1784.293664]) >>> > >>> > In [223]: scipy.__version__ >>> > Out[223]: '0.11.0' >>> > >>> > In [224]: numpy.__version__ >>> > Out[224]: '1.6.1' >>> > >>> > >>> > The results are completely off for N>10000 or so. Am I doing something >>> > wrong? >>> >>> >>> Looks like a bug. The code that accumulates the ranks of the tied >>> values is using a 32 bit integer for the sum of the ranks, and this is >>> overflowing. I'll see if I can get this fixed for the imminent >>> release of 0.12. 
>>> >>> Warren >>> >> >> >> A pull request with the fix is here: >> https://github.com/scipy/scipy/pull/436 >> >> >> Warren >> >> >>> >>> > _______________________________________________ >>> > SciPy-User mailing list >>> > SciPy-User at scipy.org >>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>> > >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From warren.weckesser at gmail.com Sat Feb 16 13:44:48 2013 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Sat, 16 Feb 2013 13:44:48 -0500 Subject: [SciPy-User] bug in rankdata? In-Reply-To: References: Message-ID: On 2/16/13, Warren Weckesser wrote: > On 2/15/13, Chris Rodgers wrote: >> Thanks very much! I discovered this bug because mann-whitney U was >> giving me bizarre results, like a negative U statistic. My data is a >> large number of integer counts, mostly zeros, which is the worst case >> for ties. >> >> Until I can update scipy, I'll either write my own rankdata method, >> which will be very slow, or I'll use the R equivalent which is more >> feature-ful (but then I have to figure out rpy2 which will also be >> slow). > > > You could also try pandas (http://pandas.pydata.org/). The DataFrame > and Series classes have a 'rank' method > (http://pandas.pydata.org/pandas-docs/stable/computation.html#data-ranking). > > Warren P.S. Since rankdata was rewritten in cython, the comment in the pandas documentation about the pandas rank function being 10 to 20 times faster than scipy.stats.rankdata is no longer true. The pandas rank function does provide more options for how the tied ranks are assigned, which might be useful for you. 
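And if Chris does roll his own in the interim, an overflow-safe average-rank function is only a few lines of pure Python — Python ints are arbitrary precision, so the all-ties case comes out right even for large n. A sketch (far slower than the cython version; the function name here is ours, not scipy's):

```python
def rankdata_average(values):
    """1-based ranks with ties averaged (scipy.stats.rankdata's behavior).

    Plain Python ints cannot overflow the way the 32-bit accumulator did."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j + 2) / 2            # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

print(rankdata_average([1] * 5))       # [3.0, 3.0, 3.0, 3.0, 3.0]
print(rankdata_average([0, 2, 0, 1]))  # [1.5, 4.0, 1.5, 3.0]
```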
Warren > > >> >> On Fri, Feb 15, 2013 at 10:22 AM, Warren Weckesser >> wrote: >>> >>> >>> On Fri, Feb 15, 2013 at 10:32 AM, Warren Weckesser >>> wrote: >>>> >>>> On 2/14/13, Chris Rodgers wrote: >>>> > The results I'm getting from rankdata seem completely wrong for large >>>> > datasets. I'll illustrate with a case where all data are equal, so >>>> > every rank should be len(data) / 2 + 0.5. >>>> > >>>> > In [220]: rankdata(np.ones((10000,), dtype=np.int)) >>>> > Out[220]: array([ 5000.5, 5000.5, 5000.5, ..., 5000.5, 5000.5, >>>> > 5000.5]) >>>> > >>>> > In [221]: rankdata(np.ones((100000,), dtype=np.int)) >>>> > Out[221]: >>>> > array([ 7050.82704, 7050.82704, 7050.82704, ..., 7050.82704, >>>> > 7050.82704, 7050.82704]) >>>> > >>>> > In [222]: rankdata(np.ones((1000000,), dtype=np.int)) >>>> > Out[222]: >>>> > array([ 1784.293664, 1784.293664, 1784.293664, ..., 1784.293664, >>>> > 1784.293664, 1784.293664]) >>>> > >>>> > In [223]: scipy.__version__ >>>> > Out[223]: '0.11.0' >>>> > >>>> > In [224]: numpy.__version__ >>>> > Out[224]: '1.6.1' >>>> > >>>> > >>>> > The results are completely off for N>10000 or so. Am I doing >>>> > something >>>> > wrong? >>>> >>>> >>>> Looks like a bug. The code that accumulates the ranks of the tied >>>> values is using a 32 bit integer for the sum of the ranks, and this is >>>> overflowing. I'll see if I can get this fixed for the imminent >>>> release of 0.12. 
>>>> >>>> Warren >>>> >>> >>> >>> A pull request with the fix is here: >>> https://github.com/scipy/scipy/pull/436 >>> >>> >>> Warren >>> >>> >>>> >>>> > _______________________________________________ >>>> > SciPy-User mailing list >>>> > SciPy-User at scipy.org >>>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>>> > >>> >>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > From wesmckinn at gmail.com Sat Feb 16 13:58:51 2013 From: wesmckinn at gmail.com (Wes McKinney) Date: Sat, 16 Feb 2013 13:58:51 -0500 Subject: [SciPy-User] bug in rankdata? In-Reply-To: References: Message-ID: On Sat, Feb 16, 2013 at 1:22 PM, Warren Weckesser wrote: > On 2/15/13, Chris Rodgers wrote: >> Thanks very much! I discovered this bug because mann-whitney U was >> giving me bizarre results, like a negative U statistic. My data is a >> large number of integer counts, mostly zeros, which is the worst case >> for ties. >> >> Until I can update scipy, I'll either write my own rankdata method, >> which will be very slow, or I'll use the R equivalent which is more >> feature-ful (but then I have to figure out rpy2 which will also be >> slow). > > > You could also try pandas (http://pandas.pydata.org/). The DataFrame > and Series classes have a 'rank' method > (http://pandas.pydata.org/pandas-docs/stable/computation.html#data-ranking). > > Warren > I will add that the pandas rank methods are approx 20x faster than scipy.stats.rankdata, so another reason to use them for large-ish arrays. 
- Wes > >> >> On Fri, Feb 15, 2013 at 10:22 AM, Warren Weckesser >> wrote: >>> >>> >>> On Fri, Feb 15, 2013 at 10:32 AM, Warren Weckesser >>> wrote: >>>> >>>> On 2/14/13, Chris Rodgers wrote: >>>> > The results I'm getting from rankdata seem completely wrong for large >>>> > datasets. I'll illustrate with a case where all data are equal, so >>>> > every rank should be len(data) / 2 + 0.5. >>>> > >>>> > In [220]: rankdata(np.ones((10000,), dtype=np.int)) >>>> > Out[220]: array([ 5000.5, 5000.5, 5000.5, ..., 5000.5, 5000.5, >>>> > 5000.5]) >>>> > >>>> > In [221]: rankdata(np.ones((100000,), dtype=np.int)) >>>> > Out[221]: >>>> > array([ 7050.82704, 7050.82704, 7050.82704, ..., 7050.82704, >>>> > 7050.82704, 7050.82704]) >>>> > >>>> > In [222]: rankdata(np.ones((1000000,), dtype=np.int)) >>>> > Out[222]: >>>> > array([ 1784.293664, 1784.293664, 1784.293664, ..., 1784.293664, >>>> > 1784.293664, 1784.293664]) >>>> > >>>> > In [223]: scipy.__version__ >>>> > Out[223]: '0.11.0' >>>> > >>>> > In [224]: numpy.__version__ >>>> > Out[224]: '1.6.1' >>>> > >>>> > >>>> > The results are completely off for N>10000 or so. Am I doing something >>>> > wrong? >>>> >>>> >>>> Looks like a bug. The code that accumulates the ranks of the tied >>>> values is using a 32 bit integer for the sum of the ranks, and this is >>>> overflowing. I'll see if I can get this fixed for the imminent >>>> release of 0.12. 
>>>> >>>> Warren >>>> >>> >>> >>> A pull request with the fix is here: >>> https://github.com/scipy/scipy/pull/436 >>> >>> >>> Warren >>> >>> >>>> >>>> > _______________________________________________ >>>> > SciPy-User mailing list >>>> > SciPy-User at scipy.org >>>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>>> > >>> >>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From wesmckinn at gmail.com Sat Feb 16 13:59:33 2013 From: wesmckinn at gmail.com (Wes McKinney) Date: Sat, 16 Feb 2013 13:59:33 -0500 Subject: [SciPy-User] bug in rankdata? In-Reply-To: References: Message-ID: On Sat, Feb 16, 2013 at 1:44 PM, Warren Weckesser wrote: > On 2/16/13, Warren Weckesser wrote: >> On 2/15/13, Chris Rodgers wrote: >>> Thanks very much! I discovered this bug because mann-whitney U was >>> giving me bizarre results, like a negative U statistic. My data is a >>> large number of integer counts, mostly zeros, which is the worst case >>> for ties. >>> >>> Until I can update scipy, I'll either write my own rankdata method, >>> which will be very slow, or I'll use the R equivalent which is more >>> feature-ful (but then I have to figure out rpy2 which will also be >>> slow). >> >> >> You could also try pandas (http://pandas.pydata.org/). The DataFrame >> and Series classes have a 'rank' method >> (http://pandas.pydata.org/pandas-docs/stable/computation.html#data-ranking). >> >> Warren > > > P.S. 
Since rankdata was rewritten in cython, the comment in the pandas > documentation about the pandas rank function being 10 to 20 times > faster than scipy.stats.rankdata is no longer true. The pandas rank > function does provide more options for how the tied ranks are > assigned, which might be useful for you. > > Warren > Oh, that's good to know. Will have to update the docs. - Wes > >> >> >>> >>> On Fri, Feb 15, 2013 at 10:22 AM, Warren Weckesser >>> wrote: >>>> >>>> >>>> On Fri, Feb 15, 2013 at 10:32 AM, Warren Weckesser >>>> wrote: >>>>> >>>>> On 2/14/13, Chris Rodgers wrote: >>>>> > The results I'm getting from rankdata seem completely wrong for large >>>>> > datasets. I'll illustrate with a case where all data are equal, so >>>>> > every rank should be len(data) / 2 + 0.5. >>>>> > >>>>> > In [220]: rankdata(np.ones((10000,), dtype=np.int)) >>>>> > Out[220]: array([ 5000.5, 5000.5, 5000.5, ..., 5000.5, 5000.5, >>>>> > 5000.5]) >>>>> > >>>>> > In [221]: rankdata(np.ones((100000,), dtype=np.int)) >>>>> > Out[221]: >>>>> > array([ 7050.82704, 7050.82704, 7050.82704, ..., 7050.82704, >>>>> > 7050.82704, 7050.82704]) >>>>> > >>>>> > In [222]: rankdata(np.ones((1000000,), dtype=np.int)) >>>>> > Out[222]: >>>>> > array([ 1784.293664, 1784.293664, 1784.293664, ..., 1784.293664, >>>>> > 1784.293664, 1784.293664]) >>>>> > >>>>> > In [223]: scipy.__version__ >>>>> > Out[223]: '0.11.0' >>>>> > >>>>> > In [224]: numpy.__version__ >>>>> > Out[224]: '1.6.1' >>>>> > >>>>> > >>>>> > The results are completely off for N>10000 or so. Am I doing >>>>> > something >>>>> > wrong? >>>>> >>>>> >>>>> Looks like a bug. The code that accumulates the ranks of the tied >>>>> values is using a 32 bit integer for the sum of the ranks, and this is >>>>> overflowing. I'll see if I can get this fixed for the imminent >>>>> release of 0.12. 
>>>>> >>>>> Warren >>>>> >>>> >>>> >>>> A pull request with the fix is here: >>>> https://github.com/scipy/scipy/pull/436 >>>> >>>> >>>> Warren >>>> >>>> >>>>> >>>>> > _______________________________________________ >>>>> > SciPy-User mailing list >>>>> > SciPy-User at scipy.org >>>>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> > >>>> >>>> >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From ralf.gommers at gmail.com Sat Feb 16 16:34:44 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 16 Feb 2013 22:34:44 +0100 Subject: [SciPy-User] ANN: SciPy 0.12.0 beta 1 release Message-ID: Hi, I am pleased to announce the availability of the first beta release of SciPy 0.12.0. This is shaping up to be another solid release, with some cool new features (see highlights below) and a large amount of bug fixes and maintenance work under the hood. The number of contributors also keeps rising steadily, we're at 74 so far for this release. Sources and binaries can be found at http://sourceforge.net/projects/scipy/files/scipy/0.12.0b1/, release notes are copied below. A Python 3.3 Windows installer will follow soon. Please try this release and report any problems on the mailing list. Cheers, Ralf ========================== SciPy 0.12.0 Release Notes ========================== .. note:: Scipy 0.12.0 is not released yet! SciPy 0.12.0 is the culmination of 7 months of hard work. It contains many new features, numerous bug-fixes, improved test coverage and better documentation. 
There have been a number of deprecations and API changes in this release, which are documented below. All users are encouraged to upgrade to this release, as there are a large number of bug-fixes and optimizations. Moreover, our development attention will now shift to bug-fix releases on the 0.12.x branch, and on adding new features on the master branch. Some of the highlights of this release are: - Completed QHull wrappers in scipy.spatial. - cKDTree now a drop-in replacement for KDTree. - A new global optimizer, basinhopping. - Support for Python 2 and Python 3 from the same code base (no more 2to3). This release requires Python 2.6, 2.7 or 3.1-3.3 and NumPy 1.5.1 or greater. Support for Python 2.4 and 2.5 has been dropped as of this release. New features ============ ``scipy.spatial`` improvements ------------------------------ cKDTree feature-complete ^^^^^^^^^^^^^^^^^^^^^^^^ Cython version of KDTree, cKDTree, is now feature-complete. Most operations (construction, query, query_ball_point, query_pairs, count_neighbors and sparse_distance_matrix) are between 200 and 1000 times faster in cKDTree than in KDTree. With very minor caveats, cKDTree has exactly the same interface as KDTree, and can be used as a drop-in replacement. Voronoi diagrams and convex hulls ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `scipy.spatial` now contains functionality for computing Voronoi diagrams and convex hulls using the Qhull library. (Delaunay triangulation was available since Scipy 0.9.0.) Delaunay improvements ^^^^^^^^^^^^^^^^^^^^^ It's now possible to pass in custom Qhull options in Delaunay triangulation. Coplanar points are now also recorded, if present. Incremental construction of Delaunay triangulations is now also possible. Spectral estimators (``scipy.signal``) -------------------------------------- The functions ``scipy.signal.periodogram`` and ``scipy.signal.welch`` were added, providing DFT-based spectral estimators. 
``scipy.optimize`` improvements ------------------------------- Callback functions in L-BFGS-B and TNC ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ A callback mechanism was added to L-BFGS-B and TNC minimization solvers. Basin hopping global optimization (``scipy.optimize.basinhopping``) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ A new global optimization algorithm. Basinhopping is designed to efficiently find the global minimum of a smooth function. ``scipy.special`` improvements ------------------------------ Revised complex error functions ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The computation of special functions related to the error function now uses a new `Faddeeva library from MIT `__ which increases their numerical precision. The scaled and imaginary error functions ``erfcx`` and ``erfi`` were also added, and the Dawson integral ``dawsn`` can now be evaluated for a complex argument. Faster orthogonal polynomials ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Evaluation of orthogonal polynomials (the ``eval_*`` routines) is now faster in ``scipy.special``, and their ``out=`` argument functions properly. ``scipy.sparse.linalg`` features -------------------------------- - In ``scipy.sparse.linalg.spsolve``, the ``b`` argument can now be either a vector or a matrix. - ``scipy.sparse.linalg.inv`` was added. This uses ``spsolve`` to compute a sparse matrix inverse. - ``scipy.sparse.linalg.expm`` was added. This computes the exponential of a sparse matrix using a similar algorithm to the existing dense array implementation in ``scipy.linalg.expm``. Listing Matlab(R) file contents in ``scipy.io`` ----------------------------------------------- A new function ``whosmat`` is available in ``scipy.io`` for inspecting contents of MAT files without reading them to memory.
Documented BLAS and LAPACK low-level interfaces (``scipy.linalg``)
------------------------------------------------------------------

The modules `scipy.linalg.blas` and `scipy.linalg.lapack` can be used to access low-level BLAS and LAPACK functions.

Polynomial interpolation improvements (``scipy.interpolate``)
-------------------------------------------------------------

The barycentric, Krogh, piecewise and pchip polynomial interpolators in ``scipy.interpolate`` now accept an ``axis`` argument.

Deprecated features
===================

`scipy.lib.lapack`
------------------

The module `scipy.lib.lapack` is deprecated. You can use `scipy.linalg.lapack` instead. The module `scipy.lib.blas` was deprecated earlier, in Scipy 0.10.0.

`fblas` and `cblas`
-------------------

Accessing the modules `scipy.linalg.fblas`, `cblas`, `flapack`, `clapack` is deprecated. Instead, use the modules `scipy.linalg.lapack` and `scipy.linalg.blas`.

Backwards incompatible changes
==============================

Removal of ``scipy.io.save_as_module``
--------------------------------------

The function ``scipy.io.save_as_module`` was deprecated in Scipy 0.11.0, and is now removed. Its private support modules ``scipy.io.dumbdbm_patched`` and ``scipy.io.dumb_shelve`` are also removed.

Other changes
=============

Authors
=======

* Anton Akhmerov +
* Alexander Eberspächer +
* Anne Archibald
* Jisk Attema +
* K.-Michael Aye +
* bemasc +
* Sebastian Berg +
* François Boulogne +
* Matthew Brett
* Lars Buitinck
* Steven Byrnes +
* Tim Cera +
* Christian +
* Keith Clawson +
* David Cournapeau
* Nathan Crock +
* endolith
* Bradley M. Froehle +
* Matty G
* Christoph Gohlke
* Ralf Gommers
* Robert David Grant +
* Yaroslav Halchenko
* Charles Harris
* Jonathan Helmus
* Andreas Hilboll
* Hugo +
* Oleksandr Huziy
* Jeroen Demeyer +
* Johannes Schönberger +
* Steven G. Johnson +
* Chris Jordan-Squire
* Jonathan Taylor +
* Niklas Kroeger +
* Jerome Kieffer +
* kingson +
* Josh Lawrence
* Denis Laxalde
* Alex Leach +
* Lorenzo Luengo +
* Stephen McQuay +
* MinRK
* Sturla Molden +
* Eric Moore +
* mszep +
* Matt Newville +
* Vlad Niculae
* Travis Oliphant
* David Parker +
* Fabian Pedregosa
* Josef Perktold
* Zach Ploskey +
* Alex Reinhart +
* richli +
* Gilles Rochefort +
* Ciro Duran Santillli +
* Jan Schlueter +
* Jonathan Scholz +
* Anthony Scopatz
* Skipper Seabold
* Fabrice Silva +
* Scott Sinclair
* Jacob Stevenson +
* Sturla Molden +
* Julian Taylor +
* thorstenkranz +
* John Travers +
* trueprice +
* Nicky van Foreest
* Jacob Vanderplas
* Patrick Varilly
* Daniel Velkov +
* Pauli Virtanen
* Stefan van der Walt
* Warren Weckesser

A total of 74 people contributed to this release. People with a "+" by their names contributed a patch for the first time. -------------- next part -------------- An HTML attachment was scrubbed... URL: From g_haslett at yahoo.co.uk Sat Feb 16 17:15:21 2013 From: g_haslett at yahoo.co.uk (Garvin Haslett) Date: Sat, 16 Feb 2013 22:15:21 +0000 (UTC) Subject: [SciPy-User] numpy.random.choice throws AttributeError when called with a list of objects. References: <1360956952.20932.6.camel@sebastian-laptop> Message-ID: Sebastian Berg sipsolutions.net> writes: > > On Fri, 2013-02-15 at 19:13 +0000, Garvin Haslett wrote: > > The code below results in: > > AttributeError: SgCoord instance has no attribute 'ndim' > > > > Given that an example in the documentation operates on an array of strings, > > I'd expect numpy.random.choice to work with an array of objects. Is this > > not the case? > > > > Hey, > > Yes you are right, it is a bug by me. ... > However, it will work perfectly fine for any other size or dtype ... 
> Regards, > > Sebastian > > > ------------------------------- > > > > import numpy as np > > > > class SgCoord(): > > "A 2D coordinate" > > def __init__(self, x = 0, y = 0): > > self.x = x > > self.y = y > > > > if __name__ == '__main__': > > l = [] > > l.append(SgCoord(1,1)) > > np.array(l) > > np.random.choice(l) > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > Hi Sebastian, Many thanks for your prompt reply; I can confirm that modifying the call to include an explicit size = 1 parameter works around this issue. Since I have your attention, I'd like to clarify whether there are any advantages to using numpy.random.choice over random.choice. Is it (numpy.random.choice) (i) more efficient? (ii) a more rigorous random distribution? Regards, Garvin. From sebastian at sipsolutions.net Sat Feb 16 17:49:33 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sat, 16 Feb 2013 23:49:33 +0100 Subject: [SciPy-User] numpy.random.choice throws AttributeError when called with a list of objects. In-Reply-To: References: <1360956952.20932.6.camel@sebastian-laptop> Message-ID: <1361054973.20932.47.camel@sebastian-laptop> On Sat, 2013-02-16 at 22:15 +0000, Garvin Haslett wrote: > Sebastian Berg sipsolutions.net> writes: > > > > > On Fri, 2013-02-15 at 19:13 +0000, Garvin Haslett wrote: > > > The code below results in: > > > AttributeError: SgCoord instance has no attribute 'ndim' > > > > > > Given that an example in the documentation operates on an array of strings, > > > I'd expect numpy.random.choice to work with an array of objects. Is this > > > not the case? > > > > > > > Hey, > > > > Yes you are right, it is a bug by me. > ... > > However, it will work perfectly fine for any other size or dtype > ... 
> > Regards, > > > > Sebastian > > > > > ------------------------------- > > > > > > import numpy as np > > > > > > class SgCoord(): > > > "A 2D coordinate" > > > def __init__(self, x = 0, y = 0): > > > self.x = x > > > self.y = y > > > > > > if __name__ == '__main__': > > > l = [] > > > l.append(SgCoord(1,1)) > > > np.array(l) > > > np.random.choice(l) > > > > > > _______________________________________________ > > > SciPy-User mailing list > > > SciPy-User scipy.org > > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > > Hi Sebastian, > Many thanks for your prompt reply and I can confirm that modifying the call > to include an explicit size = 1 parameter works around this issue. > Since I have your attention I'd like to clarify if there are any advantages > to using numpy.random.choice over random.choice. If your example is close to your real code, there is probably no reason for using numpy here. (On top of that, you have conversions from list to array that np.random.choice will do.) > Is it (numpy.random.choice) > (i) more efficient? If you draw a single element it is probably slower, not faster (I did not try); if you draw many it will be much faster. When compared with random.sample this is even more so. With replace=False (the random.sample equivalent), drawing few samples when no `p` is given is not efficiently implemented (I think it would be better to improve numpy there, and best soon, to give fewer people a headache about their random number state, but someone has to put in some brains and time for that). > (ii) a more rigourous random distribution? The algorithm is the same, I think, but if you need reproducible results, it is easier to use a np.random.RandomState object and avoid the python functions. In the longer run the numpy function should probably get an axis keyword as well as another feature. Btw: https://github.com/numpy/numpy/issues/2992 fixes the issue Regards, Sebastian > Regards, > Garvin. 
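[Editor's note: the workaround Garvin confirms and Sebastian's reproducibility advice can be combined in a short sketch. The ``SgCoord`` class is taken from the original report; on NumPy versions affected by the bug, passing an explicit ``size`` avoids the failing single-draw code path.]

```python
import numpy as np

class SgCoord:
    "A 2D coordinate (from the original report)"
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

coords = [SgCoord(1, 1), SgCoord(2, 2), SgCoord(3, 3)]

# Explicit size=1 works around the AttributeError discussed above,
# and a seeded RandomState gives reproducible draws.
rng = np.random.RandomState(42)
picked = rng.choice(coords, size=1)[0]
print(isinstance(picked, SgCoord))  # True
```

Re-seeding a fresh ``RandomState`` with the same seed yields the same draw, which is the reproducibility point Sebastian makes.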
> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From klonuo at gmail.com Sat Feb 16 17:55:24 2013 From: klonuo at gmail.com (klo uo) Date: Sat, 16 Feb 2013 23:55:24 +0100 Subject: [SciPy-User] BLAS libraries not found (Windows) In-Reply-To: <20130215110407.203940@gmx.net> References: <20130215110407.203940@gmx.net> Message-ID: I guess you need LAPACK, and then to set the correct paths in the site.cfg file. Moreover, if you would like to build under MinGW, you can use ATLAS with LAPACK for an additional speedup. If you are interested in this path, let me know and I'll point you to the ATLAS project that compiles under MinGW. But you can of course download and install a ready-made SciPy without all this trouble; you can even find prebuilt Windows SciPy packages compiled with MKL, kindly provided here: http://www.lfd.uci.edu/~gohlke/pythonlibs/#scipy On Fri, Feb 15, 2013 at 12:04 PM, wrote: > Hi, > > I'm trying to install scipy (originally I only wanted to install odespy, > which requires odelab which requires scipy...). > > I'm using Windows (32 bit). I followed the instructions in > http://www.scipy.org/Installing_SciPy/BuildingGeneral > to install BLAS, where I only had to adjust the step where the environment > variable is set by export for Windows. 
I use MinGW and blas.tgz was > unpacked in the directory /c/users/mara/desktop/PythonDoku/BLAS: > > Mara at Mara-PC /c/users/mara/desktop/PythonDoku/BLAS > $ gfortran -O2 -std=legacy -fno-second-underscore -c *.f > then build libfblas.a library, and set environment variable BLAS > Mara at Mara-PC /c/users/mara/desktop/PythonDoku/BLAS > $ ar r libfblas.a *.o > C:\MinGW\bin\ar.exe: creating libfblas.a > Mara at Mara-PC /c/users/mara/desktop/PythonDoku/BLAS > $ ranlib libfblas.a > Mara at Mara-PC /c/users/mara/desktop/PythonDoku/BLAS > $ rm -rf *.o > Mara at Mara-PC /c/users/mara/desktop/PythonDoku/BLAS > $ export BLAS="c\users\mara\desktop\PythonDoku\BLAS\libfblas.a" > > Now when trying to install scipy via > >python setup.py install > I encounter the following error: > > in get_info > raise self.notfounderror(self.notfounderror.__doc__) > numpy.distutils.system_info.BlasNotFoundError: > Blas (http://www.netlib.org/blas/) libraries not found. > Directories to search for the libraries can be specified in the > numpy/distutils/site.cfg file (section [blas]) or by setting > the BLAS environment variable. > > I assume it has something to do with the step > $ export BLAS="c\users\mara\desktop\PythonDoku\BLAS\libfblas.a" > above, which might be wrong...? > > I would be glad for any help! Thank you in advance, > Mara > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat Feb 16 19:51:19 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 16 Feb 2013 19:51:19 -0500 Subject: [SciPy-User] Questions/comments about scipy.stats.mannwhitneyu In-Reply-To: References: Message-ID: On Fri, Feb 15, 2013 at 1:44 PM, Chris Rodgers wrote: > Thanks Josef. Your points make sense to me. 
> > While we're on the subject, maybe I should ask whether this function > is even appropriate for my data. My data are Poisson-like integer > counts, and I want to know if the rate is significantly higher in > dataset1 or dataset2. I'm reluctant to use poissfit because there is a > scientific reason to believe that my data might deviate significantly > from Poisson, although I haven't checked this statistically. > > Mann-whitney U seemed like a safe alternative because it doesn't make > distributional assumptions and it deals with ties, which is especially > important for me because half the counts or more can be zero. Does > that seem like a good choice, as long as I have >20 samples and the > large-sample approximation is appropriate? Comments welcome. Please bottom or inline post. I don't have any direct experience with this. The >20 samples is just a guideline (as usual). If you have many ties, then I would expect that you need more samples (no reference). What I would do in cases like this is to run a small Monte Carlo, with Poisson data, or data that looks somewhat similar to your data, to see whether the test has the correct size (for example reject roughly 5% at a 5% alpha), and to see whether the test has much power in small samples. I would expect that the size is ok, but power might not be large unless the difference in the rate parameter is large. 
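[Editor's note: the Monte Carlo size check Josef describes could be sketched along these lines. This is an illustration, not his gist; it uses the ``alternative='two-sided'`` keyword of modern SciPy, whereas the 2013-era ``mannwhitneyu`` returned a one-sided p-value.]

```python
# Monte Carlo size check: under equal Poisson rates (the null hypothesis),
# a correctly sized two-sided test at alpha = 0.05 should reject roughly
# 5% of the time.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.RandomState(0)
n_mc, nobs, lam, alpha = 1000, 20, 1.0, 0.05

rejections = 0
for _ in range(n_mc):
    x = rng.poisson(lam, size=nobs)
    y = rng.poisson(lam, size=nobs)   # same rate: null is true
    stat, pval = mannwhitneyu(x, y, alternative='two-sided')
    if pval < alpha:
        rejections += 1

rate = rejections / float(n_mc)
print(rate)   # roughly 0.05; slight under-rejection is expected with many ties
```

Replacing the second sample's rate with a larger ``lam`` turns the same loop into the power check Josef mentions.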
Josef > > Thanks > Chris > > On Fri, Feb 15, 2013 at 8:58 AM, wrote: >> On Fri, Feb 15, 2013 at 11:35 AM, wrote: >>> On Fri, Feb 15, 2013 at 11:16 AM, wrote: >>>> On Thu, Feb 14, 2013 at 7:06 PM, Chris Rodgers wrote: >>>>> Hi all >>>>> >>>>> I use scipy.stats.mannwhitneyu extensively because my data is not at >>>>> all normal. I have run into a few "gotchas" with this function and I >>>>> wanted to discuss possible workarounds with the list. >>>> >>>> Can you open a ticket ? http://projects.scipy.org/scipy/report >>>> >>>> I partially agree, but any changes won't be backwards compatible, and >>>> I don't have time to think about this enough. >>>> >>>>> >>>>> 1) When this function returns a significant result, it is non-trivial >>>>> to determine the direction of the effect! The Mann-Whitney test is NOT >>>>> a test on difference of medians or means, so you cannot determine the >>>>> direction from these statistics. Wikipedia has a good example of why >>>>> it is not a test for difference of median. >>>>> http://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U#Illustration_of_object_of_test >>>>> >>>>> I've reprinted it here. The data are the finishing order of hares and >>>>> tortoises. Obviously this is contrived but it indicates the problem. >>>>> First the setup: >>>>> results_l = 'H H H H H H H H H T T T T T T T T T T H H H H H H H H H H >>>>> T T T T T T T T T'.split(' ') >>>>> h = [i for i in range(len(results_l)) if results_l[i] == 'H'] >>>>> t = [i for i in range(len(results_l)) if results_l[i] == 'T'] >>>>> >>>>> And the results: >>>>> In [12]: scipy.stats.mannwhitneyu(h, t) >>>>> Out[12]: (100.0, 0.0097565768849708391) >>>>> >>>>> In [13]: np.median(h), np.median(t) >>>>> Out[13]: (19.0, 18.0) >>>>> >>>>> Hares are significantly faster than tortoises, but we cannot determine >>>>> this from the output of mannwhitneyu. 
This could be fixed by either >>>>> returning u1 and u2 from the guts of the function, or testing them in >>>>> the function and returning the comparison. My current workaround is >>>>> testing the means which is absolutely wrong in theory but usually >>>>> correct in practice. >>>> >>>> In some cases I'm reluctant to return the direction when we use a >>>> two-sided test. In this case we don't have a one sided tests. >>>> In analogy to ttests, I think we could return the individual u1, u2 >>> >>> to expand a bit: >>> For the Kolmogorov Smirnov test, we refused to return an indication of >>> the direction. The alternative is two-sided and the distribution of >>> the test statististic and the test statistic are different in the >>> one-sided test. >>> So we shouldn't draw any one-sided conclusions from the two-sided test. >>> >>> In the t_test and mannwhitenyu the test statistic is normally >>> distributed (in large samples), so we can infer the one-sided test >>> from the two-sided statistic and p-value. >>> >>> If there are tables for the small sample case, we would need to check >>> if we get consistent interpretation between one- and two-sided tests. >>> >>> Josef >>> >>>> >>>>> >>>>> 2) The documentation states that the sample sizes must be at least 20. >>>>> I think this is because the normal approximation for U is not valid >>>>> for smaller sample sizes. Is there a table of critical values for U in >>>>> scipy.stats that is appropriate for small sample sizes or should the >>>>> user implement his or her own? >>>> >>>> not available in scipy. I never looked at this. >>>> pull requests for this are welcome if it works. It would be backwards >>>> compatible. 
>> >> since I just looked at a table collection for some other test, they >> also have Mann-Whitney U statistic >> http://faculty.washington.edu/heagerty/Books/Biostatistics/TABLES/Wilcoxon/ >> but I didn't check if it matches the test statistic in scipy.stats >> >> Josef >> >>>> >>>>> >>>>> 3) This is picky but is there a reason that it returns a one-tailed >>>>> p-value, while other tests (eg ttest_*) default to two-tailed? >>>> >>>> legacy wart, that I don't like, but it wasn't offending me enough to change it. >>>> >>>>> >>>>> >>>>> Thanks for any thoughts, tips, or corrections and please don't take >>>>> these comments as criticisms ... if I didn't enjoy using scipy.stats >>>>> so much I wouldn't bother bringing this up! >>>> >>>> Thanks for the feedback. >>>> In large parts review of the functions relies on comments by users >>>> (and future contributors). >>>> >>>> The main problem is how to make changes without breaking current >>>> usage, since many of those functions are widely used. >>>> >>>> Josef >>>> >>>> >>>>> >>>>> Chris >>>>> _______________________________________________ >>>>> SciPy-User mailing list >>>>> SciPy-User at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Sat Feb 16 21:17:09 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 16 Feb 2013 21:17:09 -0500 Subject: [SciPy-User] Questions/comments about scipy.stats.mannwhitneyu In-Reply-To: References: Message-ID: On Sat, Feb 16, 2013 at 7:51 PM, wrote: > On Fri, Feb 15, 2013 at 1:44 PM, Chris Rodgers wrote: >> Thanks Josef. Your points make sense to me. 
>> >> While we're on the subject, maybe I should ask whether this function >> is even appropriate for my data. My data are Poisson-like integer >> counts, and I want to know if the rate is significantly higher in >> dataset1 or dataset2. I'm reluctant to use poissfit because there is a >> scientific reason to believe that my data might deviate significantly >> from Poisson, although I haven't checked this statistically. >> >> Mann-whitney U seemed like a safe alternative because it doesn't make >> distributional assumptions and it deals with ties, which is especially >> important for me because half the counts or more can be zero. Does >> that seem like a good choice, as long as I have >20 samples and the >> large-sample approximation is appropriate? Comments welcome. > > Please bottom or inline post. > > I don't have any direct experience with this. > > The >20 samples is just a guideline (as usual). If you have many ties, > then I would expect be that you need more samples (no reference). > > What I would do in cases like this is to run a small Monte Carlo, with > Poisson data, or data that looks somewhat similar to your data, to see > whether the test has the correct size (for example reject roughly 5% > at a 5% alpha), and to see whether the test has much power in small > samples. > I would expect that the size is ok, but power might not be large > unless the difference in the rate parameter is large. (Since I was just working on a different 2 sample test, I had this almost ready) https://gist.github.com/josef-pkt/4969715 Even with a sample size of 10 in each sample, the results still look pretty ok, slightly under rejecting. With 20 observations each, the size is pretty good, and power is good for most lambda differences I looked at (largish). (I only used 1000 replications.) Sometimes I'm surprised how fast we get to the asymptotics. Josef > > Another possibility is to compare permutation p-values with asymptotic > p-values, to see whether they are close. 
> > There should be alternative tests, but I don't think they are > available in python, specific tests for comparing count data (I have > no idea), general 2 sample goodness-of-fit test (like ks_2samp) but we > don't have anything for discrete data. > > If you want to go parametric, then you could also use poisson (or > negative binomial) regression in statsmodels, and directly test the > equality of the distribution parameter. (there is also zeroinflated > poisson, but with less verification). > > Josef > > >> >> Thanks >> Chris >> >> On Fri, Feb 15, 2013 at 8:58 AM, wrote: >>> On Fri, Feb 15, 2013 at 11:35 AM, wrote: >>>> On Fri, Feb 15, 2013 at 11:16 AM, wrote: >>>>> On Thu, Feb 14, 2013 at 7:06 PM, Chris Rodgers wrote: >>>>>> Hi all >>>>>> >>>>>> I use scipy.stats.mannwhitneyu extensively because my data is not at >>>>>> all normal. I have run into a few "gotchas" with this function and I >>>>>> wanted to discuss possible workarounds with the list. >>>>> >>>>> Can you open a ticket ? http://projects.scipy.org/scipy/report >>>>> >>>>> I partially agree, but any changes won't be backwards compatible, and >>>>> I don't have time to think about this enough. >>>>> >>>>>> >>>>>> 1) When this function returns a significant result, it is non-trivial >>>>>> to determine the direction of the effect! The Mann-Whitney test is NOT >>>>>> a test on difference of medians or means, so you cannot determine the >>>>>> direction from these statistics. Wikipedia has a good example of why >>>>>> it is not a test for difference of median. >>>>>> http://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U#Illustration_of_object_of_test >>>>>> >>>>>> I've reprinted it here. The data are the finishing order of hares and >>>>>> tortoises. Obviously this is contrived but it indicates the problem. 
>>>>>> First the setup: >>>>>> results_l = 'H H H H H H H H H T T T T T T T T T T H H H H H H H H H H >>>>>> T T T T T T T T T'.split(' ') >>>>>> h = [i for i in range(len(results_l)) if results_l[i] == 'H'] >>>>>> t = [i for i in range(len(results_l)) if results_l[i] == 'T'] >>>>>> >>>>>> And the results: >>>>>> In [12]: scipy.stats.mannwhitneyu(h, t) >>>>>> Out[12]: (100.0, 0.0097565768849708391) >>>>>> >>>>>> In [13]: np.median(h), np.median(t) >>>>>> Out[13]: (19.0, 18.0) >>>>>> >>>>>> Hares are significantly faster than tortoises, but we cannot determine >>>>>> this from the output of mannwhitneyu. This could be fixed by either >>>>>> returning u1 and u2 from the guts of the function, or testing them in >>>>>> the function and returning the comparison. My current workaround is >>>>>> testing the means which is absolutely wrong in theory but usually >>>>>> correct in practice. >>>>> >>>>> In some cases I'm reluctant to return the direction when we use a >>>>> two-sided test. In this case we don't have a one sided tests. >>>>> In analogy to ttests, I think we could return the individual u1, u2 >>>> >>>> to expand a bit: >>>> For the Kolmogorov Smirnov test, we refused to return an indication of >>>> the direction. The alternative is two-sided and the distribution of >>>> the test statististic and the test statistic are different in the >>>> one-sided test. >>>> So we shouldn't draw any one-sided conclusions from the two-sided test. >>>> >>>> In the t_test and mannwhitenyu the test statistic is normally >>>> distributed (in large samples), so we can infer the one-sided test >>>> from the two-sided statistic and p-value. >>>> >>>> If there are tables for the small sample case, we would need to check >>>> if we get consistent interpretation between one- and two-sided tests. >>>> >>>> Josef >>>> >>>>> >>>>>> >>>>>> 2) The documentation states that the sample sizes must be at least 20. 
>>>>>> I think this is because the normal approximation for U is not valid >>>>>> for smaller sample sizes. Is there a table of critical values for U in >>>>>> scipy.stats that is appropriate for small sample sizes or should the >>>>>> user implement his or her own? >>>>> >>>>> not available in scipy. I never looked at this. >>>>> pull requests for this are welcome if it works. It would be backwards >>>>> compatible. >>> >>> since I just looked at a table collection for some other test, they >>> also have Mann-Whitney U statistic >>> http://faculty.washington.edu/heagerty/Books/Biostatistics/TABLES/Wilcoxon/ >>> but I didn't check if it matches the test statistic in scipy.stats >>> >>> Josef >>> >>>>> >>>>>> >>>>>> 3) This is picky but is there a reason that it returns a one-tailed >>>>>> p-value, while other tests (eg ttest_*) default to two-tailed? >>>>> >>>>> legacy wart, that I don't like, but it wasn't offending me enough to change it. >>>>> >>>>>> >>>>>> >>>>>> Thanks for any thoughts, tips, or corrections and please don't take >>>>>> these comments as criticisms ... if I didn't enjoy using scipy.stats >>>>>> so much I wouldn't bother bringing this up! >>>>> >>>>> Thanks for the feedback. >>>>> In large parts review of the functions relies on comments by users >>>>> (and future contributors). >>>>> >>>>> The main problem is how to make changes without breaking current >>>>> usage, since many of those functions are widely used. 
>>>>> >>>>> Josef >>>>> >>>>> >>>>>> >>>>>> Chris >>>>>> _______________________________________________ >>>>>> SciPy-User mailing list >>>>>> SciPy-User at scipy.org >>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Sat Feb 16 21:36:08 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 16 Feb 2013 21:36:08 -0500 Subject: [SciPy-User] Questions/comments about scipy.stats.mannwhitneyu In-Reply-To: References: Message-ID: On Sat, Feb 16, 2013 at 9:17 PM, wrote: > On Sat, Feb 16, 2013 at 7:51 PM, wrote: >> On Fri, Feb 15, 2013 at 1:44 PM, Chris Rodgers wrote: >>> Thanks Josef. Your points make sense to me. >>> >>> While we're on the subject, maybe I should ask whether this function >>> is even appropriate for my data. My data are Poisson-like integer >>> counts, and I want to know if the rate is significantly higher in >>> dataset1 or dataset2. I'm reluctant to use poissfit because there is a >>> scientific reason to believe that my data might deviate significantly >>> from Poisson, although I haven't checked this statistically. >>> >>> Mann-whitney U seemed like a safe alternative because it doesn't make >>> distributional assumptions and it deals with ties, which is especially >>> important for me because half the counts or more can be zero. Does >>> that seem like a good choice, as long as I have >20 samples and the >>> large-sample approximation is appropriate? Comments welcome. >> >> Please bottom or inline post. >> >> I don't have any direct experience with this. >> >> The >20 samples is just a guideline (as usual). 
If you have many ties, >> then I would expect be that you need more samples (no reference). >> >> What I would do in cases like this is to run a small Monte Carlo, with >> Poisson data, or data that looks somewhat similar to your data, to see >> whether the test has the correct size (for example reject roughly 5% >> at a 5% alpha), and to see whether the test has much power in small >> samples. >> I would expect that the size is ok, but power might not be large >> unless the difference in the rate parameter is large. > > (Since I was just working on a different 2 sample test, I had this almost ready) > https://gist.github.com/josef-pkt/4969715 > > Even for sample size of each sample equal to 10, the results look > still pretty ok, slightly under rejecting. > with 20 observations each, size is pretty good > power is good for most lambda differences I looked at (largish). > (I only used 1000 replications) With asymmetric small sample sizes (n_mc = 50000; nobs1, nobs2 = 15, 5 instead of 20, 20) we also get a bit of under rejection, especially at small alpha (0.005 or 0.01) (and as a fun part: plotting the histogram of the p-values shows gaps, because with ranks not all values are possible, if I remember the interpretation correctly.) Josef > > Sometimes I'm surprised how fast we get to the asymptotics. > > Josef > > > >> >> Another possibility is to compare permutation p-values with asymptotic >> p-values, to see whether they are close. >> >> There should be alternative tests, but I don't think they are >> available in python, specific tests for comparing count data (I have >> no idea), general 2 sample goodness-of-fit test (like ks_2samp) but we >> don't have anything for discrete data. >> >> If you want to go parametric, then you could also use poisson (or >> negative binomial) regression in statsmodels, and directly test the >> equality of the distribution parameter. (there is also zeroinflated >> poisson, but with less verification). 
>> >> Josef >> >> >>> >>> Thanks >>> Chris >>> >>> On Fri, Feb 15, 2013 at 8:58 AM, wrote: >>>> On Fri, Feb 15, 2013 at 11:35 AM, wrote: >>>>> On Fri, Feb 15, 2013 at 11:16 AM, wrote: >>>>>> On Thu, Feb 14, 2013 at 7:06 PM, Chris Rodgers wrote: >>>>>>> Hi all >>>>>>> >>>>>>> I use scipy.stats.mannwhitneyu extensively because my data is not at >>>>>>> all normal. I have run into a few "gotchas" with this function and I >>>>>>> wanted to discuss possible workarounds with the list. >>>>>> >>>>>> Can you open a ticket ? http://projects.scipy.org/scipy/report >>>>>> >>>>>> I partially agree, but any changes won't be backwards compatible, and >>>>>> I don't have time to think about this enough. >>>>>> >>>>>>> >>>>>>> 1) When this function returns a significant result, it is non-trivial >>>>>>> to determine the direction of the effect! The Mann-Whitney test is NOT >>>>>>> a test on difference of medians or means, so you cannot determine the >>>>>>> direction from these statistics. Wikipedia has a good example of why >>>>>>> it is not a test for difference of median. >>>>>>> http://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U#Illustration_of_object_of_test >>>>>>> >>>>>>> I've reprinted it here. The data are the finishing order of hares and >>>>>>> tortoises. Obviously this is contrived but it indicates the problem. >>>>>>> First the setup: >>>>>>> results_l = 'H H H H H H H H H T T T T T T T T T T H H H H H H H H H H >>>>>>> T T T T T T T T T'.split(' ') >>>>>>> h = [i for i in range(len(results_l)) if results_l[i] == 'H'] >>>>>>> t = [i for i in range(len(results_l)) if results_l[i] == 'T'] >>>>>>> >>>>>>> And the results: >>>>>>> In [12]: scipy.stats.mannwhitneyu(h, t) >>>>>>> Out[12]: (100.0, 0.0097565768849708391) >>>>>>> >>>>>>> In [13]: np.median(h), np.median(t) >>>>>>> Out[13]: (19.0, 18.0) >>>>>>> >>>>>>> Hares are significantly faster than tortoises, but we cannot determine >>>>>>> this from the output of mannwhitneyu. 
This could be fixed by either >>>>>>> returning u1 and u2 from the guts of the function, or testing them in >>>>>>> the function and returning the comparison. My current workaround is >>>>>>> testing the means which is absolutely wrong in theory but usually >>>>>>> correct in practice. >>>>>> >>>>>> In some cases I'm reluctant to return the direction when we use a >>>>>> two-sided test. In this case we don't have a one sided tests. >>>>>> In analogy to ttests, I think we could return the individual u1, u2 >>>>> >>>>> to expand a bit: >>>>> For the Kolmogorov Smirnov test, we refused to return an indication of >>>>> the direction. The alternative is two-sided and the distribution of >>>>> the test statististic and the test statistic are different in the >>>>> one-sided test. >>>>> So we shouldn't draw any one-sided conclusions from the two-sided test. >>>>> >>>>> In the t_test and mannwhitenyu the test statistic is normally >>>>> distributed (in large samples), so we can infer the one-sided test >>>>> from the two-sided statistic and p-value. >>>>> >>>>> If there are tables for the small sample case, we would need to check >>>>> if we get consistent interpretation between one- and two-sided tests. >>>>> >>>>> Josef >>>>> >>>>>> >>>>>>> >>>>>>> 2) The documentation states that the sample sizes must be at least 20. >>>>>>> I think this is because the normal approximation for U is not valid >>>>>>> for smaller sample sizes. Is there a table of critical values for U in >>>>>>> scipy.stats that is appropriate for small sample sizes or should the >>>>>>> user implement his or her own? >>>>>> >>>>>> not available in scipy. I never looked at this. >>>>>> pull requests for this are welcome if it works. It would be backwards >>>>>> compatible. 
>>>> >>>> since I just looked at a table collection for some other test, they >>>> also have Mann-Whitney U statistic >>>> http://faculty.washington.edu/heagerty/Books/Biostatistics/TABLES/Wilcoxon/ >>>> but I didn't check if it matches the test statistic in scipy.stats >>>> >>>> Josef >>>> >>>>>> >>>>>>> >>>>>>> 3) This is picky but is there a reason that it returns a one-tailed >>>>>>> p-value, while other tests (eg ttest_*) default to two-tailed? >>>>>> >>>>>> legacy wart, that I don't like, but it wasn't offending me enough to change it. >>>>>> >>>>>>> >>>>>>> >>>>>>> Thanks for any thoughts, tips, or corrections and please don't take >>>>>>> these comments as criticisms ... if I didn't enjoy using scipy.stats >>>>>>> so much I wouldn't bother bringing this up! >>>>>> >>>>>> Thanks for the feedback. >>>>>> In large parts review of the functions relies on comments by users >>>>>> (and future contributors). >>>>>> >>>>>> The main problem is how to make changes without breaking current >>>>>> usage, since many of those functions are widely used. 
>>>>>> >>>>>> Josef >>>>>> >>>>>> >>>>>>> >>>>>>> Chris >>>>>>> _______________________________________________ >>>>>>> SciPy-User mailing list >>>>>>> SciPy-User at scipy.org >>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user From lists at hilboll.de Sun Feb 17 03:49:11 2013 From: lists at hilboll.de (Andreas Hilboll) Date: Sun, 17 Feb 2013 09:49:11 +0100 Subject: [SciPy-User] scipy 0.11 packages for Ubuntu 12.04LTS Message-ID: <51209987.6070505@hilboll.de> Good morning, I finally managed to package scipy 0.11 for Ubuntu Precise (12.04LTS), and uploaded the packages to the pylab PPA: ppa:pylab/stable I ran ``scipy.test('full')`` for both the Python2 and Python3 packages, and with both packages, I get one failure (named slightly differently for py3, but also related to #651): FAIL: Regression test for #651: better handling of badly conditioned Is this something to worry about? I basically just took the current 'stable' package from Ubuntu Precise (which is 0.9.0) and adapted the packaging in minor places so as to eliminate any packaging errors (there was some patching around which didn't seem to make sense for 0.11.0). Please feel free to use this PPA and report any bugs back to me. Any wishes and further suggestions are also welcome =) Cheers, Andreas. From mail.till at gmx.de Sun Feb 17 17:59:18 2013 From: mail.till at gmx.de (Till Stensitzki) Date: Sun, 17 Feb 2013 22:59:18 +0000 (UTC) Subject: [SciPy-User] [Release] qt-dataflow 0.1 Message-ID: Hi, I think my weekend project could be useful for other people: it tries to make implementing your own visual programming canvas easy.
It is inspired by Orange's canvas, but it tries to implement only as much as necessary; the license is also different. It is based on Qt's QGraphicsScene and needs either PySide or PyQt. It should run on Python 2.6 through 3.3, but only 2.7 and 3.3 are tested. https://github.com/Tillsten/qt-dataflow See the examples for how to use it. Feedback and code are welcome. Greetings Till Stensitzki From MaraMaus at nurfuerspam.de Sun Feb 17 09:25:52 2013 From: MaraMaus at nurfuerspam.de (Mara Grahl) Date: Sun, 17 Feb 2013 15:25:52 +0100 Subject: [SciPy-User] BLAS libraries not found (Windows) In-Reply-To: References: <20130215110407.203940@gmx.net> Message-ID: <20130217142552.69670@gmx.net> Hi, thank you very much for pointing me to the installer for Python 3.3! Scipy works now. Unfortunately, odelab does not, although the installation process ended with "successfully installed": C:\Python33\Scripts>pip install -e git+https://github.com/olivierverdier/odelab#egg=odelab Obtaining odelab from git+https://github.com/olivierverdier/odelab#egg=odelab Cloning https://github.com/olivierverdier/odelab to c:\python33\scripts\src\odelab Running setup.py egg_info for package odelab Installing collected packages: odelab Running setup.py develop for odelab Creating c:\python33\lib\site-packages\odelab.egg-link (link to .) Adding odelab 0.0.0 to easy-install.pth file Installed c:\python33\scripts\src\odelab Successfully installed odelab Cleaning up... When trying to import odelab in the Python shell, I always obtain an ImportError: >>> from odelab import System Traceback (most recent call last): File "", line 1, in File "c:\python33\scripts\src\odelab\odelab\__init__.py", line 1, in from solver import Solver, load_solver ImportError: No module named 'solver' Maybe you can help me with this issue, too?
Thank you very much again, Mara -------- Original Message -------- > Date: Sat, 16 Feb 2013 23:55:24 +0100 > From: klo uo > To: SciPy Users List > Subject: Re: [SciPy-User] BLAS libraries not found (Windows) > I guess you need LAPACK and then set the correct paths in the site.cfg file. Even > more, if you like to build under MinGW you can use ATLAS with LAPACK for > additional speedup. If you are interested in this path, let me know and I'll > point you to the ATLAS project that compiles under MinGW. > > But you can of course download and install ready-made Scipy without all > this trouble, and you can even find prebuilt Windows SciPy packages > compiled > with MKL kindly provided here: > http://www.lfd.uci.edu/~gohlke/pythonlibs/#scipy > > > On Fri, Feb 15, 2013 at 12:04 PM, wrote: > > > Hi, > > > > I'm trying to install scipy (originally I only wanted to install odespy, > > which requires odelab which requires scipy...). > > > > I'm using Windows (32 bit). I followed the instructions in > > http://www.scipy.org/Installing_SciPy/BuildingGeneral > > to install BLAS, where I only had to adjust the step where the > environment > > variable is set by export for Windows.
I use MinGW and blas.tgz was > > unpacked in the directory /c/users/mara/desktop/PythonDoku/BLAS: > > > > Mara at Mara-PC /c/users/mara/desktop/PythonDoku/BLAS > > $ gfortran -O2 -std=legacy -fno-second-underscore -c *.f > > then build the libfblas.a library, and set the environment variable BLAS > > Mara at Mara-PC /c/users/mara/desktop/PythonDoku/BLAS > > $ ar r libfblas.a *.o > > C:\MinGW\bin\ar.exe: creating libfblas.a > > Mara at Mara-PC /c/users/mara/desktop/PythonDoku/BLAS > > $ ranlib libfblas.a > > Mara at Mara-PC /c/users/mara/desktop/PythonDoku/BLAS > > $ rm -rf *.o > > Mara at Mara-PC /c/users/mara/desktop/PythonDoku/BLAS > > $ export BLAS="c\users\mara\desktop\PythonDoku\BLAS\libfblas.a" > > > > Now when trying to install scipy via > > >python setup.py install > > I encounter the following error: > > > > in get_info > > raise self.notfounderror(self.notfounderror.__doc__) > > numpy.distutils.system_info.BlasNotFoundError: > > Blas (http://www.netlib.org/blas/) libraries not found. > > Directories to search for the libraries can be specified in the > > numpy/distutils/site.cfg file (section [blas]) or by setting > > the BLAS environment variable. > > > > I assume it has something to do with the step > > $ export BLAS="c\users\mara\desktop\PythonDoku\BLAS\libfblas.a" > > above, which might be wrong...? > > > > I would be glad for any help!
Thank you in advance, > > Mara > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > From xrodgers at gmail.com Sun Feb 17 18:28:53 2013 From: xrodgers at gmail.com (Chris Rodgers) Date: Sun, 17 Feb 2013 15:28:53 -0800 Subject: [SciPy-User] Questions/comments about scipy.stats.mannwhitneyu In-Reply-To: References: Message-ID: On Sat, Feb 16, 2013 at 6:36 PM, wrote: > On Sat, Feb 16, 2013 at 9:17 PM, wrote: >> On Sat, Feb 16, 2013 at 7:51 PM, wrote: >>> On Fri, Feb 15, 2013 at 1:44 PM, Chris Rodgers wrote: >>>> Thanks Josef. Your points make sense to me. >>>> >>>> While we're on the subject, maybe I should ask whether this function >>>> is even appropriate for my data. My data are Poisson-like integer >>>> counts, and I want to know if the rate is significantly higher in >>>> dataset1 or dataset2. I'm reluctant to use poissfit because there is a >>>> scientific reason to believe that my data might deviate significantly >>>> from Poisson, although I haven't checked this statistically. >>>> >>>> Mann-whitney U seemed like a safe alternative because it doesn't make >>>> distributional assumptions and it deals with ties, which is especially >>>> important for me because half the counts or more can be zero. Does >>>> that seem like a good choice, as long as I have >20 samples and the >>>> large-sample approximation is appropriate? Comments welcome. >>> >>> Please bottom or inline post. >>> >>> I don't have any direct experience with this. >>> >>> The >20 samples is just a guideline (as usual). If you have many ties, >>> then I would expect be that you need more samples (no reference). 
>>> >>> What I would do in cases like this is to run a small Monte Carlo, with >>> Poisson data, or data that looks somewhat similar to your data, to see >>> whether the test has the correct size (for example reject roughly 5% >>> at a 5% alpha), and to see whether the test has much power in small >>> samples. >>> I would expect that the size is ok, but power might not be large >>> unless the difference in the rate parameter is large. >> >> (Since I was just working on a different 2 sample test, I had this almost ready) >> https://gist.github.com/josef-pkt/4969715 >> >> Even for sample size of each sample equal to 10, the results look >> still pretty ok, slightly under rejecting. >> with 20 observations each, size is pretty good >> power is good for most lambda differences I looked at (largish). >> (I only used 1000 replications) > > with asymmetric small sample sizes > n_mc = 50000 > nobs1, nobs2 = 15, 5 #20, 20 > we also get a bit of under rejection, especially at small alpha (0.005 or 0.01) > > (and as fun part: > plotting the histogram of the p-values shows gaps, because with ranks > not all values are possible; if I remember the interpretation > correctly.) > > Josef > > >> >> Sometimes I'm surprised how fast we get to the asymptotics. >> >> Josef >> >> >> >>> >>> Another possibility is to compare permutation p-values with asymptotic >>> p-values, to see whether they are close. >>> >>> There should be alternative tests, but I don't think they are >>> available in python, specific tests for comparing count data (I have >>> no idea), general 2 sample goodness-of-fit test (like ks_2samp) but we >>> don't have anything for discrete data. >>> >>> If you want to go parametric, then you could also use poisson (or >>> negative binomial) regression in statsmodels, and directly test the >>> equality of the distribution parameter. (there is also zeroinflated >>> poisson, but with less verification). 
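The small Monte Carlo described above is only a few lines. A sketch of the size check under the null (equal Poisson rates); note the `alternative='two-sided'` keyword belongs to later scipy releases — the scipy of this thread returned a one-sided p-value, so halve or double accordingly:

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Draw two Poisson samples with the SAME rate and count how often the
# two-sided Mann-Whitney test rejects at alpha = 0.05.
rng = np.random.default_rng(12345)
n_mc, nobs1, nobs2, lam = 1000, 20, 20, 2.0
alpha = 0.05
rejections = 0
for _ in range(n_mc):
    x = rng.poisson(lam, nobs1)
    y = rng.poisson(lam, nobs2)
    stat, p = mannwhitneyu(x, y, alternative='two-sided')
    rejections += (p < alpha)
print(rejections / n_mc)  # expect roughly alpha, with slight under-rejection
```

Giving the second sample a larger rate turns the same loop into a power estimate, which is exactly the comparison run in the gist linked above.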
>>> >>> Josef >>> >>> >>>> >>>> Thanks >>>> Chris >>>> >>>> On Fri, Feb 15, 2013 at 8:58 AM, wrote: >>>>> On Fri, Feb 15, 2013 at 11:35 AM, wrote: >>>>>> On Fri, Feb 15, 2013 at 11:16 AM, wrote: >>>>>>> On Thu, Feb 14, 2013 at 7:06 PM, Chris Rodgers wrote: >>>>>>>> Hi all >>>>>>>> >>>>>>>> I use scipy.stats.mannwhitneyu extensively because my data is not at >>>>>>>> all normal. I have run into a few "gotchas" with this function and I >>>>>>>> wanted to discuss possible workarounds with the list. >>>>>>> >>>>>>> Can you open a ticket ? http://projects.scipy.org/scipy/report >>>>>>> >>>>>>> I partially agree, but any changes won't be backwards compatible, and >>>>>>> I don't have time to think about this enough. >>>>>>> >>>>>>>> >>>>>>>> 1) When this function returns a significant result, it is non-trivial >>>>>>>> to determine the direction of the effect! The Mann-Whitney test is NOT >>>>>>>> a test on difference of medians or means, so you cannot determine the >>>>>>>> direction from these statistics. Wikipedia has a good example of why >>>>>>>> it is not a test for difference of median. >>>>>>>> http://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U#Illustration_of_object_of_test >>>>>>>> >>>>>>>> I've reprinted it here. The data are the finishing order of hares and >>>>>>>> tortoises. Obviously this is contrived but it indicates the problem. 
>>>>>>>> First the setup: >>>>>>>> results_l = 'H H H H H H H H H T T T T T T T T T T H H H H H H H H H H >>>>>>>> T T T T T T T T T'.split(' ') >>>>>>>> h = [i for i in range(len(results_l)) if results_l[i] == 'H'] >>>>>>>> t = [i for i in range(len(results_l)) if results_l[i] == 'T'] >>>>>>>> >>>>>>>> And the results: >>>>>>>> In [12]: scipy.stats.mannwhitneyu(h, t) >>>>>>>> Out[12]: (100.0, 0.0097565768849708391) >>>>>>>> >>>>>>>> In [13]: np.median(h), np.median(t) >>>>>>>> Out[13]: (19.0, 18.0) >>>>>>>> >>>>>>>> Hares are significantly faster than tortoises, but we cannot determine >>>>>>>> this from the output of mannwhitneyu. This could be fixed by either >>>>>>>> returning u1 and u2 from the guts of the function, or testing them in >>>>>>>> the function and returning the comparison. My current workaround is >>>>>>>> testing the means which is absolutely wrong in theory but usually >>>>>>>> correct in practice. >>>>>>> >>>>>>> In some cases I'm reluctant to return the direction when we use a >>>>>>> two-sided test. In this case we don't have a one sided tests. >>>>>>> In analogy to ttests, I think we could return the individual u1, u2 >>>>>> >>>>>> to expand a bit: >>>>>> For the Kolmogorov Smirnov test, we refused to return an indication of >>>>>> the direction. The alternative is two-sided and the distribution of >>>>>> the test statististic and the test statistic are different in the >>>>>> one-sided test. >>>>>> So we shouldn't draw any one-sided conclusions from the two-sided test. >>>>>> >>>>>> In the t_test and mannwhitenyu the test statistic is normally >>>>>> distributed (in large samples), so we can infer the one-sided test >>>>>> from the two-sided statistic and p-value. >>>>>> >>>>>> If there are tables for the small sample case, we would need to check >>>>>> if we get consistent interpretation between one- and two-sided tests. 
>>>>>> >>>>>> Josef >>>>>> >>>>>>> >>>>>>>> >>>>>>>> 2) The documentation states that the sample sizes must be at least 20. >>>>>>>> I think this is because the normal approximation for U is not valid >>>>>>>> for smaller sample sizes. Is there a table of critical values for U in >>>>>>>> scipy.stats that is appropriate for small sample sizes or should the >>>>>>>> user implement his or her own? >>>>>>> >>>>>>> not available in scipy. I never looked at this. >>>>>>> pull requests for this are welcome if it works. It would be backwards >>>>>>> compatible. >>>>> >>>>> since I just looked at a table collection for some other test, they >>>>> also have Mann-Whitney U statistic >>>>> http://faculty.washington.edu/heagerty/Books/Biostatistics/TABLES/Wilcoxon/ >>>>> but I didn't check if it matches the test statistic in scipy.stats >>>>> >>>>> Josef >>>>> >>>>>>> >>>>>>>> >>>>>>>> 3) This is picky but is there a reason that it returns a one-tailed >>>>>>>> p-value, while other tests (eg ttest_*) default to two-tailed? >>>>>>> >>>>>>> legacy wart, that I don't like, but it wasn't offending me enough to change it. >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Thanks for any thoughts, tips, or corrections and please don't take >>>>>>>> these comments as criticisms ... if I didn't enjoy using scipy.stats >>>>>>>> so much I wouldn't bother bringing this up! >>>>>>> >>>>>>> Thanks for the feedback. >>>>>>> In large parts review of the functions relies on comments by users >>>>>>> (and future contributors). >>>>>>> >>>>>>> The main problem is how to make changes without breaking current >>>>>>> usage, since many of those functions are widely used. 
>>>>>>> >>>>>>> Josef >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Chris >>>>>>>> _______________________________________________ >>>>>>>> SciPy-User mailing list >>>>>>>> SciPy-User at scipy.org >>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> _______________________________________________ >>>>> SciPy-User mailing list >>>>> SciPy-User at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user Thanks for checking that, and great to hear that mann-whitney works well (or even slightly conservatively) for this use case. To add some content besides a thanks, here is my wrapper for Python calls to R's wilcox.test , in case it is useful for anyone out there. https://github.com/cxrodgers/my/blob/master/stats.py The actual call is pretty simple, but there is a lot of extra error checking. I'm an experimentalist so a lot of my data is ugly/incomplete compared to simulations, so I check for empty variables or all-ties cases which will throw RuntimeErrors. I also worry about rounding error distorting the ranks of equal floats (again, ugly data). The overall running time is much much slower than Scipy's mannwhitneyu, but the 1) increased error checking; 2) avoiding the current bug in scipy.stats.rankdata; and 3) returning the inferred direction of the effect makes it worth it for me personally. 
From takowl at gmail.com Mon Feb 18 10:13:48 2013 From: takowl at gmail.com (Thomas Kluyver) Date: Mon, 18 Feb 2013 15:13:48 +0000 Subject: [SciPy-User] BLAS libraries not found (Windows) In-Reply-To: <20130217142552.69670@gmx.net> References: <20130215110407.203940@gmx.net> <20130217142552.69670@gmx.net> Message-ID: On 17 February 2013 14:25, Mara Grahl wrote: > >>> from odelab import System > Traceback (most recent call last): > File "", line 1, in > File "c:\python33\scripts\src\odelab\odelab\__init__.py", line 1, in > > from solver import Solver, load_solver > ImportError: No module named 'solver' > > Maybe you can help me with this issue, too? > It appears that odelab doesn't work on Python 3 yet. It's probably not that hard to do (a relatively small codebase, not handling lots of text, already using __future__.division), if someone wants to have a go at it. Otherwise, you'll need to install Python 2.7 for now. Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrea.gavana at gmail.com Mon Feb 18 11:41:31 2013 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Mon, 18 Feb 2013 17:41:31 +0100 Subject: [SciPy-User] (Possible) new optimization routines - scipy.optimize In-Reply-To: References: Message-ID: On 14 February 2013 22:53, wrote: > On Thu, Feb 14, 2013 at 4:01 PM, Andrea Gavana wrote: >> Hi All, >> >> as my team and I are constantly facing very hard/complex numerical >> optimization problems, I have taken a look at the various *global* >> optimization routines available in Python and I thought I could throw >> in a couple of algorithms I implemented, mostly drawing from my >> previous thesis work. 
>> >> I have implemented two routines (based on numpy and scipy), namely: >> >> - AMPGO: Adaptive Memory Programming for Global Optimization: this is my Python >> implementation of the algorithm described here: >> >> http://leeds-faculty.colorado.edu/glover/fred%20pubs/416%20-%20AMP%20(TS)%20for%20Constrained%20Global%20Opt%20w%20Lasdon%20et%20al%20.pdf >> >> I have added a few improvements here and there based on my Master Thesis work >> on the standard Tunnelling Algorithm of Levy, Montalvo and Gomez. > > This could also be a good addition, similar to basinhopping. > From my perspective, this kind of global optimizer is the most > promising (compared to the evolutionary ones, ...) > > From a quick browse: Is the local optimizer fixed to a specific one, > or can it be any available solver as in basinhopping? > > The only thing I might worry about is that it only has 6 citations in > Google Scholar (which might not mean much if the optimizer is not > widely available). > Given that there seem to be many variations of this kind of > optimizer, it might be good to have some background on comparison > with similar optimizers. > > If you have other comparisons of similar optimizers, it would be > useful to see them. Also, given that you have a large benchmark suite, > you could compare it with the new basinhopping in scipy.optimize. OK, I have run the test suite with basinhopping as well, even though I had to modify it a bit to support the maximum number of function evaluations and the tolerance on achieving the global optimum. Results are here: - N-D : http://infinity77.net/global_optimization/multidimensional.html - 1-D : http://infinity77.net/global_optimization/univariate.html Overall it appears to have average performance, even though I must stress again that, while my test suite is relatively large, it only encompasses low-dimensional problems (1-10 variables) and my stopping/convergence criteria may not be applicable to everyone else's needs. Andrea.
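For reference, current scipy's basinhopping already supports a user-defined stopping rule through its callback (returning True halts the run), which covers the "stop once within tolerance of the known optimum" benchmark criterion; the maximum-evaluation budget still needs the modification discussed. A sketch on the 2-D Rastrigin function (the tolerance value and starting point are arbitrary choices for illustration):

```python
import numpy as np
from scipy.optimize import basinhopping

def rastrigin(x):
    # classic multimodal benchmark; global minimum f(0, 0) = 0
    x = np.asarray(x)
    return 10.0 * x.size + np.sum(x**2 - 10.0 * np.cos(2.0 * np.pi * x))

f_global, tol = 0.0, 1e-6  # known optimum, only available in benchmarks

def stop_when_close(x, f, accept):
    # basinhopping halts as soon as the callback returns True
    return f - f_global < tol

res = basinhopping(rastrigin, x0=[2.0, -2.0],
                   minimizer_kwargs={'method': 'L-BFGS-B'},
                   niter=200, callback=stop_when_close)
print(res.fun)
```

As noted above, in production runs the global optimum is unknown, so this particular rule only makes sense when scoring algorithms against test functions.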
"Imagination Is The Only Weapon In The War Against Reality." http://www.infinity77.net # ------------------------------------------------------------- # def ask_mailing_list_support(email): if mention_platform_and_version() and include_sample_app(): send_message(email) else: install_malware() erase_hard_drives() # ------------------------------------------------------------- # From josef.pktd at gmail.com Mon Feb 18 12:25:31 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 18 Feb 2013 12:25:31 -0500 Subject: [SciPy-User] (Possible) new optimization routines - scipy.optimize In-Reply-To: References: Message-ID: On Mon, Feb 18, 2013 at 11:41 AM, Andrea Gavana wrote: > On 14 February 2013 22:53, wrote: >> On Thu, Feb 14, 2013 at 4:01 PM, Andrea Gavana wrote: >>> Hi All, >>> >>> as my team and I are constantly facing very hard/complex numerical >>> optimization problems, I have taken a look at the various *global* >>> optimization routines available in Python and I thought I could throw >>> in a couple of algorithms I implemented, mostly drawing from my >>> previous thesis work. >>> >>> I have implemented two routines (based on numpy and scipy), namely: >>> >>> - AMPGO: Adaptive Memory Programming for Global Optimization: this is my Python >>> implementation of the algorithm described here: >>> >>> http://leeds-faculty.colorado.edu/glover/fred%20pubs/416%20-%20AMP%20(TS)%20for%20Constrained%20Global%20Opt%20w%20Lasdon%20et%20al%20.pdf >>> >>> I have added a few improvements here and there based on my Master Thesis work >>> on the standard Tunnelling Algorithm of Levy, Montalvo and Gomez. >> >> This could also be a good addition. similar to basinhopping. >> >From my perspective, this kind of global optimizers are the most >> promising, (compared to the evolutionary, ...) >> >> >From a quick browse: Is the local optimizer fixed to a specific one, >> or can it be any available solver as in basinhopping? 
>> >> The only thing I might worry about is that it only has 6 citations in >> Google Scholar (which might not mean much if the optimizer is not >> widely available). >> Given that there seem to be many variations of this kind of >> optimizer, it might be good to have some background on comparison >> with similar optimizers. >> >> If you have other comparisons of similar optimizers, it would be >> useful to see them. Also, given that you have a large benchmark suite, >> you could compare it with the new basinhopping in scipy.optimize. > > OK, I have run the test suite with basinhopping as well, even though I > had to modify it a bit to support the maximum number of function > evaluations and the tolerance on achieving the global optimum. Anything that might be useful to incorporate in the scipy version? > > Results are here: > > - N-D : http://infinity77.net/global_optimization/multidimensional.html > - 1-D : http://infinity77.net/global_optimization/univariate.html > > Overall it appears to have average performance, even though I must > stress again that, while my test suite is relatively large, it only > encompasses low-dimensional problems (1-10 variables) and my > stopping/convergence criteria may not be applicable to everyone else's > needs. Interesting results, AMPGO looks good. I'm a bit surprised that the number of function evaluations of basinhopping is relatively large. (One possible difference to your benchmark is that with smooth models and numerical derivatives, a more efficient local optimizer could be chosen.) Two things I saw when I looked at your benchmark before: DE has several cases that are complementary to AMPGO; it might be interesting in applications to cross-check results across optimizers. Why do DE and some others have a success rate of either 0 or 100 and nothing in between? I don't see what could cause this result. Thanks Josef > > > Andrea. > > "Imagination Is The Only Weapon In The War Against Reality."
> http://www.infinity77.net > > # ------------------------------------------------------------- # > def ask_mailing_list_support(email): > > if mention_platform_and_version() and include_sample_app(): > send_message(email) > else: > install_malware() > erase_hard_drives() > # ------------------------------------------------------------- # > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From mutantturkey at gmail.com Mon Feb 18 14:45:32 2013 From: mutantturkey at gmail.com (Calvin Morrison) Date: Mon, 18 Feb 2013 14:45:32 -0500 Subject: [SciPy-User] scipy.optimize.nnls - Incompatible Dimensions Message-ID: Hi, I am trying to use scipy's nnls function. Unfortunately I am getting an "incompatible dimensions" error, but my dimensions look correct: >> print matrix.shape (13969, 4096) >> print counts.shape (4096,) >> solutions, rnorm = scipy.optimize.nnls(trained_matrix, counts) *** ValueError: incompatible dimensions Any ideas? Thanks, Calvin Morrison From jsseabold at gmail.com Mon Feb 18 14:55:01 2013 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 18 Feb 2013 14:55:01 -0500 Subject: [SciPy-User] scipy.optimize.nnls - Incompatible Dimensions In-Reply-To: References: Message-ID: On Mon, Feb 18, 2013 at 2:45 PM, Calvin Morrison wrote: > Hi, > > I am trying to use scipy's nnls function. Unfortunately I am getting > an "incompatible dimensions" error, but my dimensions look correct: > > >> print matrix.shape > (13969, 4096) > > >> print counts.shape > (4096,) > > >> solutions, rnorm = scipy.optimize.nnls(trained_matrix, counts) > *** ValueError: incompatible dimensions > > Any ideas? > Ax is a vector of length 13969, not 4096. Maybe you want ||A.T x - b||_2? Skipper -------------- next part -------------- An HTML attachment was scrubbed...
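The shape rule here is that `nnls(A, b)` minimizes ||Ax - b||_2 with x >= 0, so `b` must match the number of *rows* of A. One caveat with the `np.rot90` fix that follows: rotating is a transpose *plus* a reversal of the row order, so the equations get paired with the wrong entries of `b`; a plain `.T` is probably what is wanted. A small-scale sketch (the shapes stand in for the 13969x4096 case):

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
A = rng.random((30, 8))   # stand-in for the (13969, 4096) matrix
b = rng.random(8)         # stand-in for the (4096,) counts

x, rnorm = nnls(A.T, b)   # A.T has 8 rows, matching len(b)
assert x.shape == (30,)   # one coefficient per row of the original A

# np.rot90 is NOT a transpose: it also reverses the row order, so
# nnls(np.rot90(A), b) silently fits b against the reversed system.
assert np.allclose(np.rot90(A), A.T[::-1])
```

So both calls run without a dimensions error, but only the `.T` version keeps each row of b paired with the equation it came from.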
URL: From mutantturkey at gmail.com Mon Feb 18 14:57:20 2013 From: mutantturkey at gmail.com (Calvin Morrison) Date: Mon, 18 Feb 2013 14:57:20 -0500 Subject: [SciPy-User] scipy.optimize.nnls - Incompatible Dimensions In-Reply-To: References: Message-ID: ah okay, it works if I rotate it! Thanks! trained_matrix = np.rot90(trained_matrix) On 18 February 2013 14:55, Skipper Seabold wrote: > On Mon, Feb 18, 2013 at 2:45 PM, Calvin Morrison > wrote: >> >> Hi, >> >> I am trying to use scipy's nnls function. Unfortunately I am getting >> an "incompatible dimensions error, but my dimensions look correct: >> >> >> print matrix.shape >> (13969, 4096) >> >> >> print counts.shape >> (4096,) >> >> >> solutions, rnorm = scipy.optimize.nnls(trained_matrix, counts) >> *** ValueError: incompatible dimensions >> >> Any ideas? > > > Ax is a vector of length 13969 not 4096. Maybe you want ||A.Tx - b||_2? > > Skipper > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From andrea.gavana at gmail.com Mon Feb 18 15:30:33 2013 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Mon, 18 Feb 2013 21:30:33 +0100 Subject: [SciPy-User] (Possible) new optimization routines - scipy.optimize In-Reply-To: References: Message-ID: On 18 February 2013 18:25, wrote: > On Mon, Feb 18, 2013 at 11:41 AM, Andrea Gavana wrote: >> On 14 February 2013 22:53, wrote: >>> On Thu, Feb 14, 2013 at 4:01 PM, Andrea Gavana wrote: >>>> Hi All, >>>> >>>> as my team and I are constantly facing very hard/complex numerical >>>> optimization problems, I have taken a look at the various *global* >>>> optimization routines available in Python and I thought I could throw >>>> in a couple of algorithms I implemented, mostly drawing from my >>>> previous thesis work. 
>>>> >>>> I have implemented two routines (based on numpy and scipy), namely: >>>> >>>> - AMPGO: Adaptive Memory Programming for Global Optimization: this is my Python >>>> implementation of the algorithm described here: >>>> >>>> http://leeds-faculty.colorado.edu/glover/fred%20pubs/416%20-%20AMP%20(TS)%20for%20Constrained%20Global%20Opt%20w%20Lasdon%20et%20al%20.pdf >>>> >>>> I have added a few improvements here and there based on my Master Thesis work >>>> on the standard Tunnelling Algorithm of Levy, Montalvo and Gomez. >>> >>> This could also be a good addition. similar to basinhopping. >>> >From my perspective, this kind of global optimizers are the most >>> promising, (compared to the evolutionary, ...) >>> >>> >From a quick browse: Is the local optimizer fixed to a specific one, >>> or can it be any available solver as in basinhopping? >>> >>> The only thing I might worry about that it only has 6 citations in >>> Google Scholar (which might not mean much if the optimizer is not >>> widely available). >>> Given that there seem to be many variations of this kind of >>> optimizers, it might be good to have some background on comparison >>> with similar optimizers. >>> >>> If you have other comparisons of similar optimizers, it would be >>> useful to see them. Also given that you have a large benchmark suite, >>> you could compare it with the new basinhopping in scipy.optimize. >> >> OK, I have run the test suite with basinhopping as well, even though I >> had to modify it a bit to support the maximum number of functions >> evaluations and the tolerance on achieving the global optimum. > > Anything that might be useful to incorporate in the scipy version? Might be. The tolerance on the global optimum is close to meaningless for real-life problems (as we normally don't know where the global optimum is), but the number of functions evaluations stopping condition might be useful. 
I am not very familiar (i.e., close to zero) with the PR process, but I'll see what I can do. I can of course provide a patch, which is so much easier than the entire PR chain IMNSHO. >> Results are here: >> >> - N-D : http://infinity77.net/global_optimization/multidimensional.html >> - 1-D : http://infinity77.net/global_optimization/univariate.html >> >> Overall it appears to have average performances, even though I must >> stress again that, while my test suite is relatively large, it only >> encompasses low-dimensional problem (1-10 variables) and my >> stopping/convergence criteria may not be applicable to everyone else's >> needs. > > Interesting results, AMPGO looks good. > > I'm a bit surprised that the number of function evaluations of > basinhopping is relatively large. > (One possible difference to your benchmark is that with smooth models > and numerical derivatives, a more efficient local optimizer could be > chosen.) The results on my website have been obtained using L-BFGS-B as a local optimizer. Most of the test functions, however, are designed to fool gradient-based descent algorithms like BFGS, so I am not particularly surprised by the performances. I had also tried with SLSQP and TNC, with similar results. I couldn't use COBYLA as the Fortran interface to it changed between Scipy 0.9 and 0.11 and I have trouble compiling/installing anything at work. I'll give it another go tomorrow by bringing Scipy from my home PC. > two things I saw when I looked at your benchmark before: > > DE has several cases that are complementary to AMPGO, it might be > interesting in applications to cross-check results across optimizers. I am not sure I understand what you mean here... You mean that when AMPGO fails DE succeeds and vice-versa? > Why does DE and some other have success rate either 0 or 100 and > nothing in between? > I don't see a reason what could cause this result. 
0 means that, whatever the random starting point was, the algorithm could never locate the global optimum over 100 trials (100 different random starting points). 100 means that the global optimum could always be located, irrespective of the starting point chosen. I see two reasons for this behaviour: 1. For the vast majority of the test functions, the lower/upper bounds are symmetric around the origin and the actual global optimum is *at* the origin: most derivative-free algorithms (but not AMPGO), when they don't know where to turn because they are desperate and can't find a better minimum, choose the middle point of the search space (which happens to be the global optimum for most test functions - cheating). This is very difficult for me to correct, as most of the algorithms' black magic is hidden behind obsolescent, user-unfriendly Fortran/C code (as if we needed Fortran/C speed in their algorithms to calibrate the Large Hadron Collider detectors... please, just use an intelligent language like Python or, if worst comes to worst, Matlab). This issue actually got some of the algorithms close to being excluded from the benchmark - it's not fair. The DIRECT and MLSL algorithms are just two examples (there are others). I'll try another approach tomorrow by shifting the lower/upper bounds to be asymmetric, and we'll see how it goes. 2. There is a (widespread) tendency amongst numerical optimization "experts" to pick and choose a set of benchmark functions that is better suited to demonstrate the superiority of their own algorithm over other methods (or they simply write new functions for which they know their algorithm will perform better). To write my benchmark suite I looked at around 30-40 papers, and most of them showed this behaviour: a couple of examples are the "Deflected Corrugated Spring" and the "Xin-She Yang" test functions, which are known to be favourable to DE and PSWARM. 
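The bound-shifting fix in point 1 can be prototyped in a few lines: translate each test function so its optimum moves away from the centre of the search box, and any midpoint-guessing heuristic loses its free win. `sphere` and `shift_problem` below are toy stand-ins, not functions from the actual benchmark:

```python
import numpy as np

def sphere(x):
    # toy test function whose global optimum sits at the origin
    return float(np.sum(np.asarray(x, dtype=float) ** 2))

def shift_problem(f, shift):
    """Return g(x) = f(x - shift): same landscape, optimum moved to `shift`."""
    shift = np.asarray(shift, dtype=float)
    return lambda x: f(np.asarray(x, dtype=float) - shift)

g = shift_problem(sphere, shift=[1.5, -2.0])
# the midpoint of the symmetric box [-5, 5]^2 is the origin, which is
# no longer the optimum of the shifted problem
print(g([0.0, 0.0]), g([1.5, -2.0]))  # 6.25 0.0
```

The same wrapper applied to every function in the suite would remove the centre-of-the-box advantage without touching the (Fortran/C) optimizers themselves.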
But anyway, I'll run a modified benchmark tomorrow (or the next few days) and I'll report back if there is any interest. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://www.infinity77.net

# ------------------------------------------------------------- #
def ask_mailing_list_support(email):

    if mention_platform_and_version() and include_sample_app():
        send_message(email)
    else:
        install_malware()
        erase_hard_drives()
# ------------------------------------------------------------- #

From pav at iki.fi Mon Feb 18 15:45:13 2013 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 18 Feb 2013 22:45:13 +0200 Subject: [SciPy-User] (Possible) new optimization routines - scipy.optimize In-Reply-To: References: Message-ID: On 18.02.2013 22:30, Andrea Gavana wrote: [clip] > I am not very familiar (i.e., close to > zero) with the PR process, but I'll see what I can do. I can of course > provide a patch, which is so much easier than the entire PR chain > IMNSHO. It's not that much more difficult than patches, and way easier than patch queues:

- create an account on github.com and log in
- click "Fork" at https://github.com/scipy/scipy
- git clone -o github git@github.com:YOURUSERNAME/scipy.git
- git checkout -b my-feature
- edit
- git commit -m "commit message"
- git push github my-feature
- go to https://github.com/YOURUSERNAME/scipy and click "Pull request" next to "my-feature"

-- Pauli Virtanen

From MaraMaus at nurfuerspam.de Mon Feb 18 15:31:28 2013 From: MaraMaus at nurfuerspam.de (Mara Grahl) Date: Mon, 18 Feb 2013 21:31:28 +0100 Subject: [SciPy-User] BLAS libraries not found (Windows) In-Reply-To: References: <20130215110407.203940@gmx.net> <20130217142552.69670@gmx.net> Message-ID: <20130218203128.245970@gmx.net> Hi Thomas, you were right: I installed Python 2.7 and the prerequisites for odelab, and now odelab is imported :) thank you very much! 
Mara -------- Original Message -------- > Date: Mon, 18 Feb 2013 15:13:48 +0000 > From: Thomas Kluyver > To: SciPy Users List > Subject: Re: [SciPy-User] BLAS libraries not found (Windows) > On 17 February 2013 14:25, Mara Grahl wrote: > > > >>> from odelab import System > > Traceback (most recent call last): > > File "", line 1, in > > File "c:\python33\scripts\src\odelab\odelab\__init__.py", line 1, in > > > > from solver import Solver, load_solver > > ImportError: No module named 'solver' > > > > Maybe you can help me with this issue, too? > > > > It appears that odelab doesn't work on Python 3 yet. It's probably not > that > hard to do (a relatively small codebase, not handling lots of text, > already > using __future__.division), if someone wants to have a go at it. > Otherwise, > you'll need to install Python 2.7 for now. > > Thomas From sharath20284 at yahoo.com Mon Feb 18 18:38:04 2013 From: sharath20284 at yahoo.com (Sharath Venkatesha) Date: Mon, 18 Feb 2013 15:38:04 -0800 (PST) Subject: [SciPy-User] Problem with 64 bit Scipy Stats: cannot install MKL In-Reply-To: <1361230036.48189.YahooMailNeo@web141202.mail.bf1.yahoo.com> References: <1361230036.48189.YahooMailNeo@web141202.mail.bf1.yahoo.com> Message-ID: <1361230684.35105.YahooMailNeo@web141206.mail.bf1.yahoo.com> hello, I am using Christoph Gohlke's builds for 64 bit machines. My version is Python 2.7, Scipy 0.11.0 and Numpy 1.7.0 (non-MKL version) on a 64-bit Win7 machine. When using from scipy.stats import norm I get an error Traceback (most recent call last): File "", line 1, in File "C:\Python27\lib\site-packages\scipy\stats\__init__.py", line 321, in from stats import * File "C:\Python27\lib\site-packages\scipy\stats\stats.py", line 193, in import scipy.special as special File "C:\Python27\lib\site-packages\scipy\special\__init__.py", line 525, in from _cephes import * ImportError: DLL load failed: The specified module could not be found. 
I looked online for solutions that Chris has posted. Most of them included using the MKL version of numpy. I installed the MKL version of numpy (without Intel MKL installation) and still get the same error, which is expected! I cannot install Intel MKL on my machine due to the costs involved. Please suggest if there is a workaround. thanks Sharath -------------- next part -------------- An HTML attachment was scrubbed... URL: From cgohlke at uci.edu Tue Feb 19 13:08:13 2013 From: cgohlke at uci.edu (Christoph Gohlke) Date: Tue, 19 Feb 2013 10:08:13 -0800 Subject: [SciPy-User] Problem with 64 bit Scipy Stats: cannot install MKL In-Reply-To: <1361230684.35105.YahooMailNeo@web141206.mail.bf1.yahoo.com> References: <1361230036.48189.YahooMailNeo@web141202.mail.bf1.yahoo.com> <1361230684.35105.YahooMailNeo@web141206.mail.bf1.yahoo.com> Message-ID: <5123BF8D.8010008@uci.edu> On 2/18/2013 3:38 PM, Sharath Venkatesha wrote: > > hello, > > I am using Christoph Gohlke's builds for 64 bit machines. > > My version is Python 2.7, Scipy 0.11.0 and Numpy 1.7.0 (non-MKL version) > on a 64-bit Win7 machine. > > When using > > from scipy.stats import norm > > I get an error > > Traceback (most recent call last): > File "", line 1, in > File "C:\Python27\lib\site-packages\scipy\stats\__init__.py", line > 321, in > from stats import * > File "C:\Python27\lib\site-packages\scipy\stats\stats.py", line 193, > in > import scipy.special as special > File "C:\Python27\lib\site-packages\scipy\special\__init__.py", line > 525, in > from _cephes import * > ImportError: DLL load failed: The specified module could not be found. > > I looked online for solutions that Chris has posted. Most of them > included using the MKL version of numpy. I installed the MKL version of > numpy (without Intel MKL installation) and still get the same error, > which is expected! > > I cannot install Intel MKL on my machine due to the costs involved. > > Please suggest if there is a workaround. 
> > thanks > Sharath > This has already been resolved off-list. Christoph From paulhtremblay at gmail.com Tue Feb 19 16:04:20 2013 From: paulhtremblay at gmail.com (Paul Tremblay) Date: Tue, 19 Feb 2013 16:04:20 -0500 Subject: [SciPy-User] installing scipy under rhel 5 Message-ID: I have been trying to install scipy for RHEL 5 all day and have had no luck. I know RHEL 5 is something like a decade old, but my job requires that I use it. I installed numpy and matplotlib with almost no problem. Both packages told me I needed BLAS and LAPACK. I downloaded the RPMs and installed them. However, when I try to build with python2.7 setup.py build I get the following message: Blas (http://www.netlib.org/blas/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [blas]) or by setting the BLAS environment variable. I know I need to link to BLAS, but I can't find this library anywhere. -------------- next part -------------- An HTML attachment was scrubbed... URL: From aldcroft at head.cfa.harvard.edu Tue Feb 19 16:37:37 2013 From: aldcroft at head.cfa.harvard.edu (Tom Aldcroft) Date: Tue, 19 Feb 2013 16:37:37 -0500 Subject: [SciPy-User] installing scipy under rhel 5 In-Reply-To: References: Message-ID: I use a home-brew package installer for CentOS5 (RHEL5) which is not pretty but does successfully build SciPy and its dependencies all from source. (As a bonus, you can see how to build Gtk+ and PyGtk on RHEL5, which is even harder.) The build script includes the exact version numbers (sometimes not the latest) that are known to work. 
You can try building scipy following the steps in these build scripts: https://github.com/sot/skare/blob/master/cfg/num_libs.cfg https://github.com/sot/skare/blob/master/cfg/num_sci_src.cfg The exact versions of libraries that are known to work are available in: https://github.com/sot/skare/blob/master/pkgs.manifest There is also a lot of cruft in there related to our work environment that you should ignore. Hope that helps, Tom On Tue, Feb 19, 2013 at 4:04 PM, Paul Tremblay wrote: > I have been trying to install scipy for rhel 5 all day and have had no luck. > I know rhel is something like a decade old, but my job requires I use it. > > I installed numpy and matplotlib with almost no problem. Both packages told > me I needed blas and lapack. I downloaded the rpms and installed them. > > However, when I try to build with > > python2.7 setup.py build > > I get the following message: > > Blas (http://www.netlib.org/blas/) libraries not found. > Directories to search for the libraries can be specified in the > numpy/distutils/site.cfg file (section [blas]) or by setting > the BLAS environment variable. > > I know I need to link to BLAS, but I can't find this library anywhere. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From rob.clewley at gmail.com Wed Feb 20 00:47:15 2013 From: rob.clewley at gmail.com (Rob Clewley) Date: Wed, 20 Feb 2013 00:47:15 -0500 Subject: [SciPy-User] Seeking help and support for next-gen math modeling tools using Python Message-ID: Hi all, and apologies for a little cross-posting: First, thanks to those of you who have used and contributed to the PyDSTool math modeling environment [1]. This project has greatly benefitted from the underlying platform of numpy / scipy / matplotlib / ipython. Going forward I have three goals, for which I would like the urgent input and support of existing or potential users. 
(i) I have ideas for expanding PyDSTool with innovative tools in my research area, which is essentially for the reverse engineering of complex mechanisms in multi-scale dynamic systems [2]. These tools have already been prototyped and show promise, but they need a lot of work. (ii) I want to grow and develop the community of users who will help drive new ideas, provide feedback, and collaborate on writing and testing code for both the core and application aspects of PyDSTool. (iii) The first two goals will help me to expand the scientific / engineering applications and use cases of PyDSTool as well as further sustain the project in the long-term. I am applying for NSF funding to support these software and application goals over the next few years [3], but the proposal deadline is in just four weeks! If you are interested in helping in any way I would greatly appreciate your replies (off list) to either of the following queries: I need to better understand my existing and potential users, many of whom may not be registered on the sourceforge users list. Please tell me who you are and what you use PyDSTool for. If you are not using it yet but you're interested in this area, then please provide feedback regarding what you would like to see change. If you are interested in these future goals, even if you are not an existing user but may be in the future, please write a brief letter of support on a letterhead document that I will send in with the proposal as PDFs. I have sample text that I can send you, as well as my draft proposal's introduction and specific aims. These letters can make a great deal of difference during review. Without funding, collaborators, user demand and community support, these more ambitious goals for PyDSTool will not happen, although I am committed to a basic level of maintenance. For instance, based on user feedback I am about to release an Ubuntu-based Live CD [4] that will allow users to try PyDSTool on any OS without having to install it. 
PyDSTool will also acquire an improved setup procedure and will be added to the NeuroDebian repository [5], among others. I am also finalizing an integrated interface to CUDA GPUs to perform fast parallel ODE solving [6]. Thanks for your time, Rob Clewley [1] http://pydstool.sourceforge.net [2] http://www.ni.gsu.edu/~rclewley/Research/index.html, and in particular http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002628 [3] NSF Software Infrastructure for Sustained Innovation (SI2-SSE) program solicitation: http://www.nsf.gov/pubs/2013/nsf13525/nsf13525.htm [4] http://help.ubuntu.com/community/LiveCD [5] http://neuro.debian.net/ [6] http://www.nvidia.com/object/cuda_home_new.html -- Robert Clewley, Ph.D. Assistant Professor Neuroscience Institute and Department of Mathematics and Statistics Georgia State University PO Box 5030 Atlanta, GA 30302, USA tel: 404-413-6420 fax: 404-413-5446 http://neuroscience.gsu.edu/rclewley.html From francescoboccacci at libero.it Wed Feb 20 03:10:06 2013 From: francescoboccacci at libero.it (francescoboccacci at libero.it) Date: Wed, 20 Feb 2013 09:10:06 +0100 (CET) Subject: [SciPy-User] Get bandwidth from gaussian_kde Message-ID: <32828853.3204281361347806637.JavaMail.defaultUser@defaultHost> Hi all, I have a question about the gaussian_kde method. I use that method in this way: kernel = stats.kde.gaussian_kde(values) I think, as written in the documentation ("If None (default), 'scott' is used."), the scott method is used by default. My question is: Is it possible to get the bandwidth value used? I would like to show this value in my script. 
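For the one-dimensional case asked about here, the bandwidth can be read back from the fitted object; a minimal sketch (the comments state how `gaussian_kde` applies Scott's rule, with n = 200 samples in d = 1 dimension):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
values = rng.normal(size=200)

kernel = stats.gaussian_kde(values)   # Scott's rule is the default
# `covariance` is the kernel covariance actually used for smoothing;
# in 1-D the bandwidth is the square root of its single entry
bandwidth = float(np.sqrt(kernel.covariance[0, 0]))
# `factor` is the dimensionless multiplier, n**(-1/(d+4)) for Scott's rule,
# so the bandwidth works out to factor * sample standard deviation
print(kernel.factor, bandwidth)
```

The bandwidth can also be changed after fitting with `kernel.set_bandwidth`, which takes either a named rule or a scalar multiplier.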
Thanks Francesco From pierre at barbierdereuille.net Wed Feb 20 03:55:38 2013 From: pierre at barbierdereuille.net (Pierre Barbier de Reuille) Date: Wed, 20 Feb 2013 09:55:38 +0100 Subject: [SciPy-User] Get bandwidth from gaussian_kde In-Reply-To: <32828853.3204281361347806637.JavaMail.defaultUser@defaultHost> References: <32828853.3204281361347806637.JavaMail.defaultUser@defaultHost> Message-ID: Yes, it is possible. Or more precisely, it is possible to get the covariance matrix with the 'covariance' attribute. The bandwidth is simply the square root of the covariance ... -- Barbier de Reuille Pierre On 20 February 2013 09:10, francescoboccacci at libero.it < francescoboccacci at libero.it> wrote: > Hi all, > i have a question about gaussian_kde method. > I use that method in this way: > > kernel = stats.kde.gaussian_kde(values) > > I think as write in the documentation "If None (default), 'scott' is used." > the scott method is used as default. > My question is: Is it possible to get the bandwidth value used ? I would > like > to show this value in my script. > > Thanks > > Francesco > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From francescoboccacci at libero.it Wed Feb 20 04:03:29 2013 From: francescoboccacci at libero.it (francescoboccacci at libero.it) Date: Wed, 20 Feb 2013 10:03:29 +0100 (CET) Subject: [SciPy-User] R: Re: Get bandwidth from gaussian_kde Message-ID: <187174.3238151361351009460.JavaMail.defaultUser@defaultHost> Thanks Francesco ----Original Message---- From: pierre at barbierdereuille.net Date: 20/02/2013 9.55 To: "francescoboccacci at libero.it", "SciPy Users List" Subject: Re: [SciPy-User] Get bandwidth from gaussian_kde Yes, it is possible. 
Or more precisely, it is possible to get the covariance matrix with the 'covariance' attribute. The bandwidth is simply the square root of the covariance ... -- Barbier de Reuille Pierre On 20 February 2013 09:10, francescoboccacci at libero.it wrote: Hi all, i have a question about gaussian_kde method. I use that method in this way: kernel = stats.kde.gaussian_kde(values) I think as write in the documentation "If None (default), 'scott' is used." the scott method is used as default. My question is: Is it possible to get the bandwidth value used ? I would like to show this value in my script. Thanks Francesco _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From balem at univ-brest.fr Wed Feb 20 10:24:26 2013 From: balem at univ-brest.fr (Kevin) Date: Wed, 20 Feb 2013 15:24:26 +0000 (UTC) Subject: [SciPy-User] Reading TDM/TDMS Files with scipy References: Message-ID: Floris hotmail.com> writes: > > Nils Wagner iam.uni-stuttgart.de> writes: > > > > > Hi all, > > > > Is it possible to read TDM/TDMS files with scipy ? > > > > I found a tool for Matlab > > http://zone.ni.com/devzone/cda/epd/p/id/5957 > > > > Nils > > > > Hello Nils, > I made a little tool for that: pyTDMS. > http://sourceforge.net/projects/pytdms/ > Hope that helps. > Floris > Hi, This message is quite old, but I am trying to use your pytdms, which seems to be a great tool. 
However I get this error message: ==> Interleaved Traceback (most recent call last): File "test_fft.py", line 4, in (ob,ra) = tdm.read("14_02_2013_10_52_12.tdms") File "/home/balem/Desktop/TANDEM/test actif rade 14022013/pyTDMS.py", line 900, in read data = readSegment(f,sz,data) File "/home/balem/Desktop/TANDEM/test actif rade 14022013/pyTDMS.py", line 762, in readSegment newdata = readRawData(f,leadin,newobjects,newobjectorder,filesize) File "/home/balem/Desktop/TANDEM/test actif rade 14022013/pyTDMS.py", line 648, in readRawData for c in channel: data[c]=[] TypeError: unhashable type: 'dict' Unfortunately I have no clue about how the TDMS file is written; maybe you can help me with this? Many thanks Kevin From nils106 at googlemail.com Wed Feb 20 11:49:16 2013 From: nils106 at googlemail.com (Nils Wagner) Date: Wed, 20 Feb 2013 17:49:16 +0100 Subject: [SciPy-User] Reading TDM/TDMS Files with scipy In-Reply-To: References: Message-ID: Hi Kevin, IMHO, you should ask Floris directly. http://www.florisvanvugt.com/index.html Cheers, Nils On 2/20/13, Kevin wrote: > Floris hotmail.com> writes: > >> >> Nils Wagner iam.uni-stuttgart.de> writes: >> >> > >> > Hi all, >> > >> > Is it possible to read TDM/TDMS files with scipy ? >> > >> > I found a tool for Matlab >> > http://zone.ni.com/devzone/cda/epd/p/id/5957 >> > >> > Nils >> > >> >> Hello Nils, >> I made a little tool for that: pyTDMS. >> http://sourceforge.net/projects/pytdms/ >> Hope that helps. >> Floris >> > Hi, > This message is quite old but I try to use your pytdms which seems to be a > great > tools. 
However I get this error message : > > ==> Interleaved > Traceback (most recent call last): > File "test_fft.py", line 4, in > (ob,ra) = tdm.read("14_02_2013_10_52_12.tdms") > File "/home/balem/Desktop/TANDEM/test actif rade 14022013/pyTDMS.py", line > > 900, in read > data = readSegment(f,sz,data) > File "/home/balem/Desktop/TANDEM/test actif rade 14022013/pyTDMS.py", line > > 762, in readSegment > newdata = readRawData(f,leadin,newobjects,newobjectorder,filesize) > File "/home/balem/Desktop/TANDEM/test actif rade 14022013/pyTDMS.py", line > > 648, in readRawData > for c in channel: data[c]=[] > TypeError: unhashable type: 'dict' > > > Unfortunately I have no clue about how the tdms file is written, maybe you > can > help me on this ? > Many thanks > > Kevin > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From MaraMaus at nurfuerspam.de Wed Feb 20 22:39:03 2013 From: MaraMaus at nurfuerspam.de (MaraMaus at nurfuerspam.de) Date: Thu, 21 Feb 2013 04:39:03 +0100 Subject: [SciPy-User] odespy AttributeError Message-ID: <20130221033903.185300@gmx.net> Hi, since there doesn't seem to be a mailing list for the package odespy, I place my question here - I hope this is ok? Whereas the examples from the odespy manual work fine, I encounter a strange error when trying to use odespy for some complicated system of ordinary differential equations: Traceback (most recent call last): File "ode1.py", line 131, in u, s = solver.solve(time_points) File "C:\Python27\lib\site-packages\odespy\solvers.py", line 1036, in solve self.u[n+1] = self.advance() # new value File "C:\Python27\lib\site-packages\odespy\RungeKutta.py", line 194, in advance rms_norm = np.sqrt(np.sum(rms*rms)/self.neq) AttributeError: sqrt Does anyone know what might be the reason for this error message? 
I would be glad for any help, thank you in advance, Mara For completeness the full code: A short description: most of the code (between the hashes #setup_beg and #setup_end) sets up the system of 9 ordinary 1st order differential equations for the variables v1, v2, v3, vp1, vp2, vp3, vpp1, vpp2, vpp3. The list DLHS contains explicit expressions for [diff(v1,s), diff(v2,s), ... , diff(vpp3,s)] #setup_beg nx=3 T=45.1 mu=253.9 sL=500.0 sR=6.0 xL=0.0 xR=100.0**2 dx=(xR-xL)/(nx-1) x={} for i in range(1,nx+1): x[i]=xL+dx*(i-1) print(x[nx]) # vp is first derivative of v, vp1 is first derivative of v at grid point 1 etc. # note that vp[0]=vp1, ..., vp[nx-1]=vpnx from sympy import solve, Symbol pts=range(1,nx+1) vp=[Symbol('vp'+str(i)) for i in pts] vpp=[Symbol('vpp'+str(i)) for i in pts] vppp=[Symbol('vppp'+str(i)) for i in pts] vpppp=[Symbol('vpppp'+str(i)) for i in pts] # note: vp[0]=vp1,...,vp[nx-1]=vpnx etc. eq1={} for i in range(1,nx): eq1[i]=vp[i-1]+vpp[i-1]*(x[i+1]-x[i])/2 + vppp[i-1]*(x[i+1]-x[i])**2/8+vpppp[i-1]*(x[i+1]-x[i])**3/48 \ -vp[i]+vpp[i]*(x[i+1]-x[i])/2 - vppp[i]*(x[i+1]-x[i])**2/8+vpppp[i]*(x[i+1]-x[i])**3/48 eq2={} for i in range(1,nx): eq2[i]=vpp[i-1]+vppp[i-1]*(x[i+1]-x[i])/2 + vpppp[i-1]*(x[i+1]-x[i])**2/8 \ -vpp[i]+vppp[i]*(x[i+1]-x[i])/2 - vpppp[i]*(x[i+1]-x[i])**2/8 eq3={} eq3[1]=vppp[0]+vpppp[0]*(x[2]-x[1])/2-vppp[1]-vpppp[1]*(x[1]-x[2])/2 eq3[2]=vppp[nx-2]+vpppp[nx-2]*(x[nx]-x[nx-1])/2-vppp[nx-1]-vpppp[nx-1]*(x[nx-1]-x[nx])/2 eqs=[] for i in range(1,nx): eqs.append(eq1[i]) for i in range(1,nx): eqs.append(eq2[i]) eqs.append(eq3[1]) eqs.append(eq3[2]) vars=[] for i in range(1,nx+1): vars.append(vppp[i-1]) for i in range(1,nx+1): vars.append(vpppp[i-1]) sol=solve(eqs,vars) # note: vppp[0]=vppp1,...,vppp[nx-1]=vpppnx etc. # Check z.B. 
fuer nx=5: (gleiche Loesung mit MM) # vpppp5= #print(sol[vpppp[4]]) # vpppp1= #print(sol[vpppp[0]]) from sympy import Function, diff from sympy.functions import coth, tanh, sqrt s=Symbol('s') vh=Function('vh') xh=Symbol('xh') Epi= sqrt( s**2 + 2*diff(vh(xh),xh) ) Esi= sqrt( s**2 + 2*diff(vh(xh),xh) +4*xh*diff(vh(xh),xh,xh) ) Eq= sqrt(s**2 +3.2**2*xh) import math pi=math.pi gl1= s**4/12/pi**2*( 3/Epi*coth(Epi/2/T) +1/Esi*coth(Esi/2/T) -12/Eq*(tanh( (Eq-mu)/2/T ) + tanh( (Eq+mu)/2/T ) ) ) gl2= diff(gl1,xh) gl3= diff(gl2,xh) DLHS=[] for i in range(1,nx+1): DLHS.append( gl1.subs({'Derivative(vh(xh), xh, xh)':vpp[i-1],'Derivative(vh(xh), xh)':vp[i-1],'xh':x[i]}).subs(sol) ) for i in range(1,nx+1): DLHS.append( gl2.subs({'Derivative(vh(xh), xh, xh, xh)':vppp[i-1],'Derivative(vh(xh), xh, xh)':vpp[i-1],'Derivative(vh(xh), xh)':vp[i-1],'xh':x[i]}).subs(sol) ) for i in range(1,nx+1): DLHS.append( gl3.subs({'Derivative(vh(xh), xh, xh, xh, xh)':vpppp[i-1],'Derivative(vh(xh), xh, xh, xh)':vppp[i-1],'Derivative(vh(xh), xh, xh)':vpp[i-1], \ 'Derivative(vh(xh), xh)':vp[i-1],'xh':x[i]}).subs(sol) ) #DLHS[0]=D[v(x_1),s],...,DLHS[nx-1]=D[v(x_nx),s],DLHS[nx]=D[vp(x_1),s],..., DLHS[nx+nx+nx-1]=D[vpp(x_nx),s] #setup_end v=[Symbol('v'+str(i)) for i in pts] #initial conditions isc=[] for i in range(1,nx+1): isc.append( 5/2*x[i]**2) for i in range(1,nx+1): isc.append( 5*x[i]) for i in range(1,nx+1): isc.append( 5) # to do: generalize to arbitrary nx def f(u,s): v1, v2, v3, vp1, vp2, vp3, vpp1, vpp2, vpp3 = u return DLHS import odespy solver = odespy.RungeKutta.CashKarp(f) solver.set_initial_condition(isc) from numpy import linspace T = 490 # end of simulation N = 30 # no of time steps time_points = linspace(sL, T, N+1) u, s = solver.solve(time_points) #from matplotlib.pyplot import * #first=u[:,0] #plot(s, first) #show() From johann.cohentanugi at gmail.com Thu Feb 21 11:02:26 2013 From: johann.cohentanugi at gmail.com (Johann Cohen-Tanugi) Date: Thu, 21 Feb 2013 17:02:26 +0100 Subject: 
[SciPy-User] odespy AttributeError In-Reply-To: <20130221033903.185300@gmx.net> References: <20130221033903.185300@gmx.net> Message-ID: <51264512.9090503@gmail.com> hello, best is probably to contact the main developer directly, as this from github does not look like a community effort : https://github.com/hplgit good luck, Johann On 02/21/2013 04:39 AM, MaraMaus at nurfuerspam.de wrote: > Hi, > > since there doesn't seem to be a mailing list for the package odespy, I place my question here - I hope this is ok? > > Whereas the examples from the odespy manual work fine, I encounter a strange error when trying to use odespy for some complicated system of ordinary differential equations: > > Traceback (most recent call last): > File "ode1.py", line 131, in > u, s = solver.solve(time_points) > File "C:\Python27\lib\site-packages\odespy\solvers.py", line 1036, in solve > self.u[n+1] = self.advance() # new value > File "C:\Python27\lib\site-packages\odespy\RungeKutta.py", line 194, in advance > rms_norm = np.sqrt(np.sum(rms*rms)/self.neq) > AttributeError: sqrt > > Does anyone know what might be the reason for this error message? > > I would be glad for any help, thank you in advance, > > Mara > > > > For completeness the full code: > A short description: most of the code (between the hashs #setup_beg and #setup_end) sets up the system of 9 ordinary 1st order differential equations for the variables v1, v2, v3, vp1, vp2, vp3, vpp1, vpp2, vpp3. The list DLHS contains explicit expressions for > [diff(v1,s), diff(v2,s), ... , diff(vpp3,s)] > > > #setup_beg > nx=3 > T=45.1 > mu=253.9 > sL=500.0 > sR=6.0 > xL=0.0 > xR=100.0**2 > dx=(xR-xL)/(nx-1) > > x={} > for i in range(1,nx+1): > x[i]=xL+dx*(i-1) > > print(x[nx]) > > # vp is first derivative of v, vp1 is first derivative of v at grid point 1 etc. 
> # note that vp[0]=vp1, ..., vp[nx-1]=vpnx > > from sympy import solve, Symbol > > pts=range(1,nx+1) > vp=[Symbol('vp'+str(i)) for i in pts] > vpp=[Symbol('vpp'+str(i)) for i in pts] > vppp=[Symbol('vppp'+str(i)) for i in pts] > vpppp=[Symbol('vpppp'+str(i)) for i in pts] > #beachte: vp[0]=vp1,...,vp[nx-1]=vpnx etc. > > eq1={} > for i in range(1,nx): > eq1[i]=vp[i-1]+vpp[i-1]*(x[i+1]-x[i])/2 + vppp[i-1]*(x[i+1]-x[i])**2/8+vpppp[i-1]*(x[i+1]-x[i])**3/48 \ > -vp[i]+vpp[i]*(x[i+1]-x[i])/2 - vppp[i]*(x[i+1]-x[i])**2/8+vpppp[i]*(x[i+1]-x[i])**3/48 > > eq2={} > for i in range(1,nx): > eq2[i]=vpp[i-1]+vppp[i-1]*(x[i+1]-x[i])/2 + vpppp[i-1]*(x[i+1]-x[i])**2/8 \ > -vpp[i]+vppp[i]*(x[i+1]-x[i])/2 - vpppp[i]*(x[i+1]-x[i])**2/8 > > eq3={} > eq3[1]=vppp[0]+vpppp[0]*(x[2]-x[1])/2-vppp[1]-vpppp[1]*(x[1]-x[2])/2 > eq3[2]=vppp[nx-2]+vpppp[nx-2]*(x[nx]-x[nx-1])/2-vppp[nx-1]-vpppp[nx-1]*(x[nx-1]-x[nx])/2 > > > eqs=[] > for i in range(1,nx): > eqs.append(eq1[i]) > for i in range(1,nx): > eqs.append(eq2[i]) > eqs.append(eq3[1]) > eqs.append(eq3[2]) > > vars=[] > for i in range(1,nx+1): > vars.append(vppp[i-1]) > for i in range(1,nx+1): > vars.append(vpppp[i-1]) > > sol=solve(eqs,vars) > > #beachte: vppp[0]=vppp1,...,vppp[nx-1]=vpppnx etc. > # Check z.B. 
fuer nx=5: (gleiche Loesung mit MM) > # vpppp5= > #print(sol[vpppp[4]]) > # vpppp1= > #print(sol[vpppp[0]]) > > from sympy import Function, diff > from sympy.functions import coth, tanh, sqrt > > s=Symbol('s') > vh=Function('vh') > xh=Symbol('xh') > > Epi= sqrt( s**2 + 2*diff(vh(xh),xh) ) > Esi= sqrt( s**2 + 2*diff(vh(xh),xh) +4*xh*diff(vh(xh),xh,xh) ) > Eq= sqrt(s**2 +3.2**2*xh) > > import math > pi=math.pi > > gl1= s**4/12/pi**2*( 3/Epi*coth(Epi/2/T) +1/Esi*coth(Esi/2/T) -12/Eq*(tanh( (Eq-mu)/2/T ) + tanh( (Eq+mu)/2/T ) ) ) > gl2= diff(gl1,xh) > gl3= diff(gl2,xh) > > DLHS=[] > > for i in range(1,nx+1): > DLHS.append( gl1.subs({'Derivative(vh(xh), xh, xh)':vpp[i-1],'Derivative(vh(xh), xh)':vp[i-1],'xh':x[i]}).subs(sol) ) > for i in range(1,nx+1): > DLHS.append( gl2.subs({'Derivative(vh(xh), xh, xh, xh)':vppp[i-1],'Derivative(vh(xh), xh, xh)':vpp[i-1],'Derivative(vh(xh), xh)':vp[i-1],'xh':x[i]}).subs(sol) ) > for i in range(1,nx+1): > DLHS.append( gl3.subs({'Derivative(vh(xh), xh, xh, xh, xh)':vpppp[i-1],'Derivative(vh(xh), xh, xh, xh)':vppp[i-1],'Derivative(vh(xh), xh, xh)':vpp[i-1], \ > 'Derivative(vh(xh), xh)':vp[i-1],'xh':x[i]}).subs(sol) ) > > #DLHS[0]=D[v(x_1),s],...,DLHS[nx-1]=D[v(x_nx),s],DLHS[nx]=D[vp(x_1),s],..., DLHS[nx+nx+nx-1]=D[vpp(x_nx),s] > > > #setup_end > > > v=[Symbol('v'+str(i)) for i in pts] > > #initial conditions > isc=[] > > for i in range(1,nx+1): > isc.append( 5/2*x[i]**2) > for i in range(1,nx+1): > isc.append( 5*x[i]) > for i in range(1,nx+1): > isc.append( 5) > > > > > # to do: generalize to arbitrary nx > def f(u,s): > v1, v2, v3, vp1, vp2, vp3, vpp1, vpp2, vpp3 = u > return DLHS > > import odespy > solver = odespy.RungeKutta.CashKarp(f) > solver.set_initial_condition(isc) > from numpy import linspace > T = 490 # end of simulation > N = 30 # no of time steps > time_points = linspace(sL, T, N+1) > u, s = solver.solve(time_points) > > #from matplotlib.pyplot import * > #first=u[:,0] > #plot(s, first) > #show() > > 
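The `AttributeError: sqrt` in the traceback is what numpy raises when `np.sqrt` hits an object array, and that is what happens here: `f(u, s)` returns the list `DLHS` of unevaluated SymPy expressions, so the solver's error-norm arithmetic runs on SymPy objects rather than floats. One way out is to compile the expressions into numeric functions with `sympy.lambdify`; the sketch below uses two toy expressions in place of the real `DLHS` entries:

```python
import numpy as np
import sympy as sp

u1, u2, t = sp.symbols('u1 u2 t')
# toy stand-ins for the symbolic right-hand sides collected in DLHS
dlhs = [u1 + u2, sp.sqrt(u1 ** 2 + u2 ** 2)]

# lambdify compiles the expressions into plain numpy callables, so the
# ODE right-hand side returns a float array instead of SymPy objects
rhs = sp.lambdify((u1, u2, t), dlhs, modules='numpy')

def f(u, s):
    return np.asarray(rhs(u[0], u[1], s), dtype=float)

print(f([3.0, 4.0], 0.0))  # [7. 5.]
```

In the real code this would mean lambdifying once, outside `f`, over all nine state symbols (v1 ... vpp3) plus s. A separate thing worth checking: under Python 2.7 the initial condition `5/2*x[i]**2` uses integer division (5/2 == 2); `5.0/2` was probably intended.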
_______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From paulhtremblay at gmail.com Thu Feb 21 23:53:22 2013 From: paulhtremblay at gmail.com (Paul Tremblay) Date: Thu, 21 Feb 2013 23:53:22 -0500 Subject: [SciPy-User] installing scipy under rhel 5 In-Reply-To: References: Message-ID: <5126F9C2.5090903@gmail.com> Thanks Tom. I don't quite understand how the .cfg file works with a build, so I'll have to keep experimenting until I get success. Paul On 2/19/13 4:37 PM, Tom Aldcroft wrote: > I use a home-brew package installer for CentOS5 (RHEL5) which is not > pretty but does successfully build SciPy and its dependencies all from > source. (For bonus you can see how to build Gtk+ and PyGtk on RHEL5 > which is even harder). The build script includes the exact version > numbers (sometimes not the latest) that are known to work. > > You can try building scipy following the steps in these build scripts: > > https://github.com/sot/skare/blob/master/cfg/num_libs.cfg > https://github.com/sot/skare/blob/master/cfg/num_sci_src.cfg > > The exact versions of libraries that are known to work are available in: > > https://github.com/sot/skare/blob/master/pkgs.manifest > > There is also a lot of cruft in there related to our work environment > that you should ignore. > > Hope that helps, > Tom > > On Tue, Feb 19, 2013 at 4:04 PM, Paul Tremblay wrote: >> I have been trying to install scipy for rhel 5 all day and have had no luck. >> I know rhel is something like a decade old, but my job requires I use it. >> >> I installed numpy and matplotlib with almost no problem. Both packages told >> me I needed blas and lapack. I downloaded the rpms and installed them. >> >> However, when I try to build with >> >> python2.7 setup.py build >> >> I get the following message: >> >> Blas (http://www.netlib.org/blas/) libraries not found. 
>> Directories to search for the libraries can be specified in the >> numpy/distutils/site.cfg file (section [blas]) or by setting >> the BLAS environment variable. >> >> I know I need to link to BLAS, but I can't find this library anywhere. >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From aldcroft at head.cfa.harvard.edu Fri Feb 22 10:41:59 2013 From: aldcroft at head.cfa.harvard.edu (Tom Aldcroft) Date: Fri, 22 Feb 2013 10:41:59 -0500 Subject: [SciPy-User] Revisit Unexpected covariance matrix from scipy.optimize.curve_fit Message-ID: In Aug 2011 there was a thread [Unexpected covariance matrix from scipy.optimize.curve_fit](http://mail.scipy.org/pipermail/scipy-user/2011-August/030412.html) where Christoph Deil reported that "scipy.optimize.curve_fit returns parameter errors that don't scale with sigma, the standard deviation of ydata, as I expected." Today I independently came to the same conclusion. This thread generated some discussion but seemingly no agreement that the covariance output of `curve_fit` is not what would be expected. I think the discussion wasn't as focused as possible because the example was too complicated. With that I provide here about the simplest possible example, which is fitting a constant to a constant dataset, aka computing the mean and error on the mean. Since we know the answers we can compare the output of `curve_fit`. To illustrate things more easily I put the examples into an IPython notebook which is available at: http://nbviewer.ipython.org/5014170/ This was run using scipy 0.11.0 by the way. Any further discussion on this topic to come to an understanding of the covariance output from `curve_fit` would be appreciated. 
Thanks, Tom From pierre at barbierdereuille.net Fri Feb 22 11:04:43 2013 From: pierre at barbierdereuille.net (Pierre Barbier de Reuille) Date: Fri, 22 Feb 2013 17:04:43 +0100 Subject: [SciPy-User] Revisit Unexpected covariance matrix from scipy.optimize.curve_fit In-Reply-To: References: Message-ID: As far as I understand the documentation, the `sigma` parameter is only used as weights for the least-square problem. It only *supposed* to be the variance of the data on each of the y data points, and not the variance of ydata as a whole. So in your example, the specification of `sigma` is incorrect (1 value instead of a N-length sequence). You can try to input a ramp (i.e. range(1,len(yn)+1) and you will see a big difference this time (but of course, this would be incorrect). -- Barbier de Reuille Pierre On 22 February 2013 16:41, Tom Aldcroft wrote: > In Aug 2011 there was a thread [Unexpected covariance matrix from > scipy.optimize.curve_fit]( > http://mail.scipy.org/pipermail/scipy-user/2011-August/030412.html) > where Christoph Deil reported that "scipy.optimize.curve_fit returns > parameter errors that don't scale with sigma, the standard deviation > of ydata, as I expected." Today I independently came to the same > conclusion. > > This thread generated some discussion but seemingly no agreement that > the covariance output of `curve_fit` is not what would be expected. I > think the discussion wasn't as focused as possible because the example > was too complicated. With that I provide here about the simplest > possible example, which is fitting a constant to a constant dataset, > aka computing the mean and error on the mean. Since we know the > answers we can compare the output of `curve_fit`. > > To illustrate things more easily I put the examples into an IPython > notebook which is available at: > > http://nbviewer.ipython.org/5014170/ > > This was run using scipy 0.11.0 by the way. 
Any further discussion on > this topic to come to an understanding of the covariance output from > `curve_fit` would be appreciated. > > Thanks, > Tom > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric.moore2 at nih.gov Fri Feb 22 11:12:41 2013 From: eric.moore2 at nih.gov (Moore, Eric (NIH/NIDDK) [F]) Date: Fri, 22 Feb 2013 11:12:41 -0500 Subject: [SciPy-User] Revisit Unexpected covariance matrix from scipy.optimize.curve_fit In-Reply-To: References: Message-ID: > -----Original Message----- > From: Tom Aldcroft [mailto:aldcroft at head.cfa.harvard.edu] > Sent: Friday, February 22, 2013 10:42 AM > To: SciPy Users List > Subject: [SciPy-User] Revisit Unexpected covariance matrix from > scipy.optimize.curve_fit > > In Aug 2011 there was a thread [Unexpected covariance matrix from > scipy.optimize.curve_fit](http://mail.scipy.org/pipermail/scipy- > user/2011-August/030412.html) > where Christoph Deil reported that "scipy.optimize.curve_fit returns > parameter errors that don't scale with sigma, the standard deviation > of ydata, as I expected." Today I independently came to the same > conclusion. > > This thread generated some discussion but seemingly no agreement that > the covariance output of `curve_fit` is not what would be expected. I > think the discussion wasn't as focused as possible because the example > was too complicated. With that I provide here about the simplest > possible example, which is fitting a constant to a constant dataset, > aka computing the mean and error on the mean. Since we know the > answers we can compare the output of `curve_fit`. > > To illustrate things more easily I put the examples into an IPython > notebook which is available at: > > http://nbviewer.ipython.org/5014170/ > > This was run using scipy 0.11.0 by the way. 
Any further discussion on > this topic to come to an understanding of the covariance output from > `curve_fit` would be appreciated. > > Thanks, > Tom > _______________________________________________ chi2 = np.sum(((yn-const(x, *popt))/sigma)**2) perr = np.sqrt(np.diag(pcov)/(chi2/(x.shape[0]-1))) Perr is then the actual error in the fit parameter. No? -Eric From mutantturkey at gmail.com Fri Feb 22 11:42:43 2013 From: mutantturkey at gmail.com (Calvin Morrison) Date: Fri, 22 Feb 2013 11:42:43 -0500 Subject: [SciPy-User] Loading sparse matrices Message-ID: Hi, Is there an easy way to load sparse matrices with loadtxt? I have a large delimiter-separated value file that I'd like to read in, but it's extremely sparse and so it's very inefficient to load it as a regular matrix. Any ideas? Calvin From pierre at barbierdereuille.net Fri Feb 22 13:03:53 2013 From: pierre at barbierdereuille.net (Pierre Barbier de Reuille) Date: Fri, 22 Feb 2013 19:03:53 +0100 Subject: [SciPy-User] Revisit Unexpected covariance matrix from scipy.optimize.curve_fit In-Reply-To: References: Message-ID: I don't know about this result I must say, do you have a reference? But intuitively, perr shouldn't change when applying the same weight to all the values. -- Barbier de Reuille Pierre On 22 February 2013 17:12, Moore, Eric (NIH/NIDDK) [F] wrote: > > -----Original Message----- > > From: Tom Aldcroft [mailto:aldcroft at head.cfa.harvard.edu] > > Sent: Friday, February 22, 2013 10:42 AM > > To: SciPy Users List > > Subject: [SciPy-User] Revisit Unexpected covariance matrix from > > scipy.optimize.curve_fit > > > > In Aug 2011 there was a thread [Unexpected covariance matrix from > > scipy.optimize.curve_fit](http://mail.scipy.org/pipermail/scipy- > > user/2011-August/030412.html) > > where Christoph Deil reported that "scipy.optimize.curve_fit returns > > parameter errors that don't scale with sigma, the standard deviation > > of ydata, as I expected."
Today I independently came to the same > > conclusion. > > > > This thread generated some discussion but seemingly no agreement that > > the covariance output of `curve_fit` is not what would be expected. I > > think the discussion wasn't as focused as possible because the example > > was too complicated. With that I provide here about the simplest > > possible example, which is fitting a constant to a constant dataset, > > aka computing the mean and error on the mean. Since we know the > > answers we can compare the output of `curve_fit`. > > > > To illustrate things more easily I put the examples into an IPython > > notebook which is available at: > > > > http://nbviewer.ipython.org/5014170/ > > > > This was run using scipy 0.11.0 by the way. Any further discussion on > > this topic to come to an understanding of the covariance output from > > `curve_fit` would be appreciated. > > > > Thanks, > > Tom > > _______________________________________________ > > chi2 = np.sum(((yn-const(x, *popt))/sigma)**2) > perr = np.sqrt(np.diag(pcov)/(chi2/(x.shape[0]-1))) > > Perr is then the actual error in the fit parameter. No? > > -Eric > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Feb 22 13:17:47 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 22 Feb 2013 13:17:47 -0500 Subject: [SciPy-User] Revisit Unexpected covariance matrix from scipy.optimize.curve_fit In-Reply-To: References: Message-ID: On Fri, Feb 22, 2013 at 1:03 PM, Pierre Barbier de Reuille wrote: > I don't know about this result I must say, do you have a reference? > > But intuitively, perr shouldn't change when applying the same weight to all > the values. 
> > -- > Barbier de Reuille Pierre > > > On 22 February 2013 17:12, Moore, Eric (NIH/NIDDK) [F] > wrote: >> >> > -----Original Message----- >> > From: Tom Aldcroft [mailto:aldcroft at head.cfa.harvard.edu] >> > Sent: Friday, February 22, 2013 10:42 AM >> > To: SciPy Users List >> > Subject: [SciPy-User] Revisit Unexpected covariance matrix from >> > scipy.optimize.curve_fit >> > >> > In Aug 2011 there was a thread [Unexpected covariance matrix from >> > scipy.optimize.curve_fit](http://mail.scipy.org/pipermail/scipy- >> > user/2011-August/030412.html) >> > where Christoph Deil reported that "scipy.optimize.curve_fit returns >> > parameter errors that don't scale with sigma, the standard deviation >> > of ydata, as I expected." Today I independently came to the same >> > conclusion. >> > >> > This thread generated some discussion but seemingly no agreement that >> > the covariance output of `curve_fit` is not what would be expected. I >> > think the discussion wasn't as focused as possible because the example >> > was too complicated. With that I provide here about the simplest >> > possible example, which is fitting a constant to a constant dataset, >> > aka computing the mean and error on the mean. Since we know the >> > answers we can compare the output of `curve_fit`. >> > >> > To illustrate things more easily I put the examples into an IPython >> > notebook which is available at: >> > >> > http://nbviewer.ipython.org/5014170/ If my fast reading is correct, then this is a very good example of what I DON'T want in curve_fit. Your actual standard deviation (in simulation) is 1. Then you impose a sigma of 100, and your results are completely inconsistent with the data, huge error margins and confidence intervals 5 times the range of the actual observations.
There are some cases where we have another estimate for sigma, a Bayesian can impose any prior; if we have more information about measurement errors, then we can use ODR, but in my opinion curve_fit should just be a boring standard weighted least squares. Josef >> > >> > This was run using scipy 0.11.0 by the way. Any further discussion on >> > this topic to come to an understanding of the covariance output from >> > `curve_fit` would be appreciated. >> > >> > Thanks, >> > Tom >> > _______________________________________________ >> >> chi2 = np.sum(((yn-const(x, *popt))/sigma)**2) >> perr = np.sqrt(np.diag(pcov)/(chi2/(x.shape[0]-1))) >> >> Perr is then the actual error in the fit parameter. No? >> >> -Eric >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From pierre at barbierdereuille.net Fri Feb 22 13:24:46 2013 From: pierre at barbierdereuille.net (Pierre Barbier de Reuille) Date: Fri, 22 Feb 2013 19:24:46 +0100 Subject: [SciPy-User] Revisit Unexpected covariance matrix from scipy.optimize.curve_fit In-Reply-To: References: Message-ID: I am not the one that have designed curve_fit. And I don't know what you think sigma is for. But curve_fit seems to be just using it as weighting for the various points. And as such, I really don't understand why the perr should change depending on a constant factor ... would anybody care to explain? -- Barbier de Reuille Pierre On 22 February 2013 19:17, wrote: > On Fri, Feb 22, 2013 at 1:03 PM, Pierre Barbier de Reuille > wrote: > > I don't know about this result I must say, do you have a reference? > > > > But intuitively, perr shouldn't change when applying the same weight to > all > > the values. 
> > > > -- > > Barbier de Reuille Pierre > > > > > > On 22 February 2013 17:12, Moore, Eric (NIH/NIDDK) [F] < > eric.moore2 at nih.gov> > > wrote: > >> > >> > -----Original Message----- > >> > From: Tom Aldcroft [mailto:aldcroft at head.cfa.harvard.edu] > >> > Sent: Friday, February 22, 2013 10:42 AM > >> > To: SciPy Users List > >> > Subject: [SciPy-User] Revisit Unexpected covariance matrix from > >> > scipy.optimize.curve_fit > >> > > >> > In Aug 2011 there was a thread [Unexpected covariance matrix from > >> > scipy.optimize.curve_fit](http://mail.scipy.org/pipermail/scipy- > >> > user/2011-August/030412.html) > >> > where Christoph Deil reported that "scipy.optimize.curve_fit returns > >> > parameter errors that don't scale with sigma, the standard deviation > >> > of ydata, as I expected." Today I independently came to the same > >> > conclusion. > >> > > >> > This thread generated some discussion but seemingly no agreement that > >> > the covariance output of `curve_fit` is not what would be expected. I > >> > think the discussion wasn't as focused as possible because the example > >> > was too complicated. With that I provide here about the simplest > >> > possible example, which is fitting a constant to a constant dataset, > >> > aka computing the mean and error on the mean. Since we know the > >> > answers we can compare the output of `curve_fit`. > >> > > >> > To illustrate things more easily I put the examples into an IPython > >> > notebook which is available at: > >> > > >> > http://nbviewer.ipython.org/5014170/ > > If my fast reading is correct, then this is a very good example what I > DON'T want in curve_fit. > > Your actual standard deviation (in simulation) is 1. > > Then you impose a sigma of 100, and your results are completely > inconsistent with the data, huge error margins and confidence > intervals 5 times the range of the actual observations. 
> > In most cases (maybe not in astronomy) we would like to estimate the > parameter uncertainty based on the actual data. > There are some cases where we have another estimate for sigma, a > Bayesian can impose any prior; if we have more information about > measurement errors, then we can use ODR, but in my opinion curve_fit > should just be a boring standard weighted least squares. > > Josef > > > > >> > > >> > This was run using scipy 0.11.0 by the way. Any further discussion on > >> > this topic to come to an understanding of the covariance output from > >> > `curve_fit` would be appreciated. > >> > > >> > Thanks, > >> > Tom > >> > _______________________________________________ > >> > >> chi2 = np.sum(((yn-const(x, *popt))/sigma)**2) > >> perr = np.sqrt(np.diag(pcov)/(chi2/(x.shape[0]-1))) > >> > >> Perr is then the actual error in the fit parameter. No? > >> > >> -Eric > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aldcroft at head.cfa.harvard.edu Fri Feb 22 13:27:03 2013 From: aldcroft at head.cfa.harvard.edu (Tom Aldcroft) Date: Fri, 22 Feb 2013 13:27:03 -0500 Subject: [SciPy-User] Revisit Unexpected covariance matrix from scipy.optimize.curve_fit In-Reply-To: References: Message-ID: The 0.11 documentation on curve_fit says: sigma : None or N-length sequence If not None, it represents the standard-deviation of ydata. This vector, if given, will be used as weights in the least-squares problem. 
It unambiguously states that sigma is the standard deviation of ydata, which is different from a relative weight. That gives a clear implication that increasing the standard deviation of all the data points by some factor should change the parameter covariance. Can the doc string be changed to say "If not None, it represents the relative weighting of data points." I would say that most astronomers and physicists are likely to be tripped up by this otherwise because "sigma" has such a well-understood meaning. - Tom On Fri, Feb 22, 2013 at 1:03 PM, Pierre Barbier de Reuille wrote: > I don't know about this result I must say, do you have a reference? > > But intuitively, perr shouldn't change when applying the same weight to all > the values. > > -- > Barbier de Reuille Pierre > > > On 22 February 2013 17:12, Moore, Eric (NIH/NIDDK) [F] > wrote: >> >> > -----Original Message----- >> > From: Tom Aldcroft [mailto:aldcroft at head.cfa.harvard.edu] >> > Sent: Friday, February 22, 2013 10:42 AM >> > To: SciPy Users List >> > Subject: [SciPy-User] Revisit Unexpected covariance matrix from >> > scipy.optimize.curve_fit >> > >> > In Aug 2011 there was a thread [Unexpected covariance matrix from >> > scipy.optimize.curve_fit](http://mail.scipy.org/pipermail/scipy- >> > user/2011-August/030412.html) >> > where Christoph Deil reported that "scipy.optimize.curve_fit returns >> > parameter errors that don't scale with sigma, the standard deviation >> > of ydata, as I expected." Today I independently came to the same >> > conclusion. >> > >> > This thread generated some discussion but seemingly no agreement that >> > the covariance output of `curve_fit` is not what would be expected. I >> > think the discussion wasn't as focused as possible because the example >> > was too complicated. With that I provide here about the simplest >> > possible example, which is fitting a constant to a constant dataset, >> > aka computing the mean and error on the mean. 
Since we know the >> > answers we can compare the output of `curve_fit`. >> > >> > To illustrate things more easily I put the examples into an IPython >> > notebook which is available at: >> > >> > http://nbviewer.ipython.org/5014170/ >> > >> > This was run using scipy 0.11.0 by the way. Any further discussion on >> > this topic to come to an understanding of the covariance output from >> > `curve_fit` would be appreciated. >> > >> > Thanks, >> > Tom >> > _______________________________________________ >> >> chi2 = np.sum(((yn-const(x, *popt))/sigma)**2) >> perr = np.sqrt(np.diag(pcov)/(chi2/(x.shape[0]-1))) >> >> Perr is then the actual error in the fit parameter. No? >> >> -Eric >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From deil.christoph at googlemail.com Fri Feb 22 13:30:57 2013 From: deil.christoph at googlemail.com (Christoph Deil) Date: Fri, 22 Feb 2013 19:30:57 +0100 Subject: [SciPy-User] Revisit Unexpected covariance matrix from scipy.optimize.curve_fit In-Reply-To: References: Message-ID: <789E1D2C-B286-47AE-A9E2-3BE26C6FB13E@gmail.com> (I posted an hour ago, but my message apparently didn't get through, but also didn't bounce -- trying again.) Hi Tom, I think I understood what scipy.optimize.curve_fit is doing thanks to Josef's comments in the previous thread you mentioned. It scales the covariance matrix (i.e. the inverse of the HESSE matrix of second derivatives of the chi2 fit statistic) from the fit by a factor. If you want to get the covariance matrix that e.g. sherpa (http://cxc.harvard.edu/sherpa/index.html) or MINUIT (https://github.com/iminuit/iminuit) would return and that e.g.
physicists / astronomers expect, you can re-compute this factor and divide the scaled covariance matrix from curve_fit like this:

# Define inputs: model, x, y, p0 and sigma
...

# Compute best-fit values and "scaled covariance matrix" with curve_fit
popt, pcov = curve_fit(model, x, y, p0=p0, sigma=sigma)

# Undo the scale factor to get the "real covariance matrix", which was
# automatically applied by curve_fit
chi = (y - model(x, *popt)) / sigma
chi2 = (chi ** 2).sum()
dof = len(x) - len(popt)
factor = (chi2 / dof)
pcov /= factor

(I haven't checked if this is equivalent to the code Eric gave above.) If I understand correctly, the motivation for multiplying pcov by this factor in curve_fit is that this was written by an economist, and there it is common to interpret the sigma not as errors on measurement points, but as relative weights between measurement points, with no meaning for the absolute scale of these weights. I think applying this scale factor to pcov is equivalent to re-scaling the sigmas to achieve a chi2 / dof of 1, which is a reasonable thing to do if you say the sigmas are only relative weights. Does this make sense? How about adding an option "scale_pcov" to curve_fit controlling whether this scale factor is applied. To keep backward compatibility the default would have to be scale_pcov=True. The advantage of this option would be that the issue is explained in the docstring and that people with "sigma=error" instead of "sigma=relative weight" can easily get what they want. Should I make a pull request? Christoph On Feb 22, 2013, at 7:27 PM, Tom Aldcroft wrote: > The 0.11 documentation on curve_fit says: > > sigma : None or N-length sequence > If not None, it represents the standard-deviation of ydata. This > vector, if given, will be used as weights in the least-squares > problem. > > It unambiguously states that sigma is the standard deviation of ydata, > which is different from a relative weight.
That gives a clear > implication that increasing the standard deviation of all the data > points by some factor should change the parameter covariance. > > Can the doc string be changed to say "If not None, it represents the > relative weighting of data points." I would say that most astronomers > and physicists are likely to be tripped up by this otherwise because > "sigma" has such a well-understood meaning. > > - Tom > > > On Fri, Feb 22, 2013 at 1:03 PM, Pierre Barbier de Reuille > wrote: >> I don't know about this result I must say, do you have a reference? >> >> But intuitively, perr shouldn't change when applying the same weight to all >> the values. >> >> -- >> Barbier de Reuille Pierre >> >> >> On 22 February 2013 17:12, Moore, Eric (NIH/NIDDK) [F] >> wrote: >>> >>>> -----Original Message----- >>>> From: Tom Aldcroft [mailto:aldcroft at head.cfa.harvard.edu] >>>> Sent: Friday, February 22, 2013 10:42 AM >>>> To: SciPy Users List >>>> Subject: [SciPy-User] Revisit Unexpected covariance matrix from >>>> scipy.optimize.curve_fit >>>> >>>> In Aug 2011 there was a thread [Unexpected covariance matrix from >>>> scipy.optimize.curve_fit](http://mail.scipy.org/pipermail/scipy- >>>> user/2011-August/030412.html) >>>> where Christoph Deil reported that "scipy.optimize.curve_fit returns >>>> parameter errors that don't scale with sigma, the standard deviation >>>> of ydata, as I expected." Today I independently came to the same >>>> conclusion. >>>> >>>> This thread generated some discussion but seemingly no agreement that >>>> the covariance output of `curve_fit` is not what would be expected. I >>>> think the discussion wasn't as focused as possible because the example >>>> was too complicated. With that I provide here about the simplest >>>> possible example, which is fitting a constant to a constant dataset, >>>> aka computing the mean and error on the mean. Since we know the >>>> answers we can compare the output of `curve_fit`. 
>>>> >>>> To illustrate things more easily I put the examples into an IPython >>>> notebook which is available at: >>>> >>>> http://nbviewer.ipython.org/5014170/ >>>> >>>> This was run using scipy 0.11.0 by the way. Any further discussion on >>>> this topic to come to an understanding of the covariance output from >>>> `curve_fit` would be appreciated. >>>> >>>> Thanks, >>>> Tom >>>> _______________________________________________ >>> >>> chi2 = np.sum(((yn-const(x, *popt))/sigma)**2) >>> perr = np.sqrt(np.diag(pcov)/(chi2/(x.shape[0]-1))) >>> >>> Perr is then the actual error in the fit parameter. No? >>> >>> -Eric >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Feb 22 13:33:07 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 22 Feb 2013 13:33:07 -0500 Subject: [SciPy-User] Revisit Unexpected covariance matrix from scipy.optimize.curve_fit In-Reply-To: References: Message-ID: On Fri, Feb 22, 2013 at 1:27 PM, Tom Aldcroft wrote: > The 0.11 documentation on curve_fit says: > > sigma : None or N-length sequence > If not None, it represents the standard-deviation of ydata. This > vector, if given, will be used as weights in the least-squares > problem. > > It unambiguously states that sigma is the standard deviation of ydata, > which is different from a relative weight. 
That gives a clear > implication that increasing the standard deviation of all the data > points by some factor should change the parameter covariance. > > Can the doc string be changed to say "If not None, it represents the > relative weighting of data points." I would say that most astronomers > and physicists are likely to be tripped up by this otherwise because > "sigma" has such a well-understood meaning. I agree that this is a very misleading, and should be changed. documentation editor or pull requests are available to change this. Josef > > - Tom > > > On Fri, Feb 22, 2013 at 1:03 PM, Pierre Barbier de Reuille > wrote: >> I don't know about this result I must say, do you have a reference? >> >> But intuitively, perr shouldn't change when applying the same weight to all >> the values. >> >> -- >> Barbier de Reuille Pierre >> >> >> On 22 February 2013 17:12, Moore, Eric (NIH/NIDDK) [F] >> wrote: >>> >>> > -----Original Message----- >>> > From: Tom Aldcroft [mailto:aldcroft at head.cfa.harvard.edu] >>> > Sent: Friday, February 22, 2013 10:42 AM >>> > To: SciPy Users List >>> > Subject: [SciPy-User] Revisit Unexpected covariance matrix from >>> > scipy.optimize.curve_fit >>> > >>> > In Aug 2011 there was a thread [Unexpected covariance matrix from >>> > scipy.optimize.curve_fit](http://mail.scipy.org/pipermail/scipy- >>> > user/2011-August/030412.html) >>> > where Christoph Deil reported that "scipy.optimize.curve_fit returns >>> > parameter errors that don't scale with sigma, the standard deviation >>> > of ydata, as I expected." Today I independently came to the same >>> > conclusion. >>> > >>> > This thread generated some discussion but seemingly no agreement that >>> > the covariance output of `curve_fit` is not what would be expected. I >>> > think the discussion wasn't as focused as possible because the example >>> > was too complicated. 
With that I provide here about the simplest >>> > possible example, which is fitting a constant to a constant dataset, >>> > aka computing the mean and error on the mean. Since we know the >>> > answers we can compare the output of `curve_fit`. >>> > >>> > To illustrate things more easily I put the examples into an IPython >>> > notebook which is available at: >>> > >>> > http://nbviewer.ipython.org/5014170/ >>> > >>> > This was run using scipy 0.11.0 by the way. Any further discussion on >>> > this topic to come to an understanding of the covariance output from >>> > `curve_fit` would be appreciated. >>> > >>> > Thanks, >>> > Tom >>> > _______________________________________________ >>> >>> chi2 = np.sum(((yn-const(x, *popt))/sigma)**2) >>> perr = np.sqrt(np.diag(pcov)/(chi2/(x.shape[0]-1))) >>> >>> Perr is then the actual error in the fit parameter. No? >>> >>> -Eric >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Fri Feb 22 13:43:30 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 22 Feb 2013 13:43:30 -0500 Subject: [SciPy-User] Revisit Unexpected covariance matrix from scipy.optimize.curve_fit In-Reply-To: <789E1D2C-B286-47AE-A9E2-3BE26C6FB13E@gmail.com> References: <789E1D2C-B286-47AE-A9E2-3BE26C6FB13E@gmail.com> Message-ID: On Fri, Feb 22, 2013 at 1:30 PM, Christoph Deil wrote: > (I posted an hour ago, but my message apparently didn't get through, but > also didn't bounce ? trying again.) 
> > Hi Tom, > > I think I understood what scipy.optimize.curve_fit is doing thanks to > Josef's comments in the previous thread you mentioned. > > It scales the covariance matrix (i.e. the inverse of the HESSE matrix of > second derivatives of the chi2 fit statistic) from the fit by a factor. > If you want to get the covariance matrix that e.g. sherpa > (http://cxc.harvard.edu/sherpa/index.html) or MINUIT > (https://github.com/iminuit/iminuit) would return and that e.g. physicists / > astronomers expect, you can re-compute this factor and divide the scaled > covariance matrix from curve_fit like this: > > # Define inputs: model, x, y, p0 and sigma > ? > > # Compute best-fit values and "scaled covariance matrix" with curve_fit > popt, pcov = curve_fit(model, x, y, p0=p0, sigma=sigma) > > # Undo the scale factor to get the "real covariance matrix", which was > automatically applied by curve_fit > chi = (y - model(x, *popt)) / sigma > chi2 = (chi ** 2).sum() > dof = len(x) - len(popt) > factor = (chi2 / dof) > pcov /= factor > > (I haven't checked if this is equivalent to the code Eric gave above.) > > If I understand correctly, the motivation for multiplying pcov by this > factor in curve_fit is that this was written by an economist, and there it I have never seen it outside of economics either. I doubt you find it in any (general) statistics package. However, when curve_fit got introduced and until the first round of this discussion started, I didn't know that there is any field that does not standardize by actual noise sigma. > is common to interpret the sigma not as errors on measurement points, but as > relative weights between measurement points, with no meaning for the > absolute scale of these weights. > I think applying this scale factor to pcov is equivalent to re-scaling the > sigmas to achieve a chi2 / dof of 1, which is a reasonable thing to do if > you say the sigmas are only relative weights. > > Does this make sense? 
> > How about adding an option "scale_pcov" to curve_fit whether this scale > factor should be applied. > To keep backward compatibility the default would have to be scale_pcov=True. > The advantage of this option would be that the issue is explained in the > docstring and that people with "sigma=error" instead of "sigma=relative > weight" can easily get what they want. > Should I make a pull request? I'm +1, both for clarification and for users who really have "sigma" and not "weights" (aside: in statsmodels we also have generalized least squares which allows for error correlation, in this case (capital) Sigma is the covariance matrix of the error, but it is still scaled by the (lower case) sigma which is estimated from the residuals as in curve_fit) Josef > > Christoph > > > On Feb 22, 2013, at 7:27 PM, Tom Aldcroft > wrote: > > The 0.11 documentation on curve_fit says: > > sigma : None or N-length sequence > If not None, it represents the standard-deviation of ydata. This > vector, if given, will be used as weights in the least-squares > problem. > > It unambiguously states that sigma is the standard deviation of ydata, > which is different from a relative weight. That gives a clear > implication that increasing the standard deviation of all the data > points by some factor should change the parameter covariance. > > Can the doc string be changed to say "If not None, it represents the > relative weighting of data points." I would say that most astronomers > and physicists are likely to be tripped up by this otherwise because > "sigma" has such a well-understood meaning. > > - Tom > > > On Fri, Feb 22, 2013 at 1:03 PM, Pierre Barbier de Reuille > wrote: > > I don't know about this result I must say, do you have a reference? > > But intuitively, perr shouldn't change when applying the same weight to all > the values. 
> > -- > Barbier de Reuille Pierre > > > On 22 February 2013 17:12, Moore, Eric (NIH/NIDDK) [F] > wrote: > > > -----Original Message----- > From: Tom Aldcroft [mailto:aldcroft at head.cfa.harvard.edu] > Sent: Friday, February 22, 2013 10:42 AM > To: SciPy Users List > Subject: [SciPy-User] Revisit Unexpected covariance matrix from > scipy.optimize.curve_fit > > In Aug 2011 there was a thread [Unexpected covariance matrix from > scipy.optimize.curve_fit](http://mail.scipy.org/pipermail/scipy- > user/2011-August/030412.html) > where Christoph Deil reported that "scipy.optimize.curve_fit returns > parameter errors that don't scale with sigma, the standard deviation > of ydata, as I expected." Today I independently came to the same > conclusion. > > This thread generated some discussion but seemingly no agreement that > the covariance output of `curve_fit` is not what would be expected. I > think the discussion wasn't as focused as possible because the example > was too complicated. With that I provide here about the simplest > possible example, which is fitting a constant to a constant dataset, > aka computing the mean and error on the mean. Since we know the > answers we can compare the output of `curve_fit`. > > To illustrate things more easily I put the examples into an IPython > notebook which is available at: > > http://nbviewer.ipython.org/5014170/ > > This was run using scipy 0.11.0 by the way. Any further discussion on > this topic to come to an understanding of the covariance output from > `curve_fit` would be appreciated. > > Thanks, > Tom > _______________________________________________ > > > chi2 = np.sum(((yn-const(x, *popt))/sigma)**2) > perr = np.sqrt(np.diag(pcov)/(chi2/(x.shape[0]-1))) > > Perr is then the actual error in the fit parameter. No? 
> > -Eric > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From aldcroft at head.cfa.harvard.edu Fri Feb 22 13:44:21 2013 From: aldcroft at head.cfa.harvard.edu (Tom Aldcroft) Date: Fri, 22 Feb 2013 13:44:21 -0500 Subject: [SciPy-User] Revisit Unexpected covariance matrix from scipy.optimize.curve_fit In-Reply-To: References: Message-ID: On Fri, Feb 22, 2013 at 1:17 PM, wrote: > On Fri, Feb 22, 2013 at 1:03 PM, Pierre Barbier de Reuille > wrote: >> I don't know about this result I must say, do you have a reference? >> >> But intuitively, perr shouldn't change when applying the same weight to all >> the values. >> >> -- >> Barbier de Reuille Pierre >> >> >> On 22 February 2013 17:12, Moore, Eric (NIH/NIDDK) [F] >> wrote: >>> >>> > -----Original Message----- >>> > From: Tom Aldcroft [mailto:aldcroft at head.cfa.harvard.edu] >>> > Sent: Friday, February 22, 2013 10:42 AM >>> > To: SciPy Users List >>> > Subject: [SciPy-User] Revisit Unexpected covariance matrix from >>> > scipy.optimize.curve_fit >>> > >>> > In Aug 2011 there was a thread [Unexpected covariance matrix from >>> > scipy.optimize.curve_fit](http://mail.scipy.org/pipermail/scipy- >>> > user/2011-August/030412.html) >>> > where Christoph Deil reported that "scipy.optimize.curve_fit returns >>> > parameter errors that don't scale with sigma, the standard deviation >>> > of ydata, as I expected." 
Today I independently came to the same >>> > conclusion. >>> > >>> > This thread generated some discussion but seemingly no agreement that >>> > the covariance output of `curve_fit` is not what would be expected. I >>> > think the discussion wasn't as focused as possible because the example >>> > was too complicated. With that I provide here about the simplest >>> > possible example, which is fitting a constant to a constant dataset, >>> > aka computing the mean and error on the mean. Since we know the >>> > answers we can compare the output of `curve_fit`. >>> > >>> > To illustrate things more easily I put the examples into an IPython >>> > notebook which is available at: >>> > >>> > http://nbviewer.ipython.org/5014170/ > > If my fast reading is correct, then this is a very good example what I > DON'T want in curve_fit. > > Your actual standard deviation (in simulation) is 1. > > Then you impose a sigma of 100, and your results are completely > inconsistent with the data, huge error margins and confidence > intervals 5 times the range of the actual observations. > > In most cases (maybe not in astronomy) we would like to estimate the > parameter uncertainty based on the actual data. It happens all the time in astronomy and physics (and probably any experimental science) that you either have well-justified priors for the errors, or need to account for systematic errors that are not reflected in the scatter of the data. This is exactly why (at least in physics/astronomy) one has the option to specify the actual uncertainty of each data point for parameter estimation algorithms, not just relative weighting. > There are some cases where we have another estimate for sigma, a > Bayesian can impose any prior; if we have more information about > measurement errors, then we can use ODR, but in my opinion curve_fit > should just be a boring standard weighted least squares. But curve_fit is *much* simpler to use and I suspect most astronomers will go straight there. 
The fundamental algorithm underneath (Levenberg-Marquardt + chi^2
statistics) has no problem handling absolute sigma values.  I grew up
on Numerical Recipes, and "mrqmin", which gives as its output a
covariance matrix which is exactly the "expected" covariance matrix.
So why limit the usefulness of curve_fit to the case of empirical
scatter estimates?

>
> Josef
>
>
>
>>> >
>>> > This was run using scipy 0.11.0 by the way. Any further discussion on
>>> > this topic to come to an understanding of the covariance output from
>>> > `curve_fit` would be appreciated.
>>> >
>>> > Thanks,
>>> > Tom
>>> > _______________________________________________
>>>
>>> chi2 = np.sum(((yn-const(x, *popt))/sigma)**2)
>>> perr = np.sqrt(np.diag(pcov)/(chi2/(x.shape[0]-1)))
>>>
>>> Perr is then the actual error in the fit parameter. No?
>>>
>>> -Eric
>>> _______________________________________________
>>> SciPy-User mailing list
>>> SciPy-User at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>
>>
>>
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>

From aldcroft at head.cfa.harvard.edu  Fri Feb 22 13:53:30 2013
From: aldcroft at head.cfa.harvard.edu (Tom Aldcroft)
Date: Fri, 22 Feb 2013 13:53:30 -0500
Subject: [SciPy-User] Revisit Unexpected covariance matrix from scipy.optimize.curve_fit
In-Reply-To: <789E1D2C-B286-47AE-A9E2-3BE26C6FB13E@gmail.com>
References: <789E1D2C-B286-47AE-A9E2-3BE26C6FB13E@gmail.com>
Message-ID:

On Fri, Feb 22, 2013 at 1:30 PM, Christoph Deil wrote:
> How about adding an option "scale_pcov" to curve_fit whether this scale
> factor should be applied.
> To keep backward compatibility the default would have to be scale_pcov=True.
> The advantage of this option would be that the issue is explained in the > docstring and that people with "sigma=error" instead of "sigma=relative > weight" can easily get what they want. > Should I make a pull request? +1 From aldcroft at head.cfa.harvard.edu Fri Feb 22 17:28:26 2013 From: aldcroft at head.cfa.harvard.edu (Tom Aldcroft) Date: Fri, 22 Feb 2013 17:28:26 -0500 Subject: [SciPy-User] Revisit Unexpected covariance matrix from scipy.optimize.curve_fit In-Reply-To: References: <789E1D2C-B286-47AE-A9E2-3BE26C6FB13E@gmail.com> Message-ID: Thanks for the clarifications and discussion here. Just to close this out, I updated the notebook with the suggestion from Christoph (which is formally the same as the one from Eric Moore). This gives the expected result. http://nbviewer.ipython.org/5014170 I just made a documentation update PR (https://github.com/scipy/scipy/pull/446). Hopefully this is not controversial and can be merged. Then Christoph and/or I will make a pull request to add a scale_pcov parameter. Thanks, Tom On Fri, Feb 22, 2013 at 1:53 PM, Tom Aldcroft wrote: > On Fri, Feb 22, 2013 at 1:30 PM, Christoph Deil > wrote: >> How about adding an option "scale_pcov" to curve_fit whether this scale >> factor should be applied. >> To keep backward compatibility the default would have to be scale_pcov=True. >> The advantage of this option would be that the issue is explained in the >> docstring and that people with "sigma=error" instead of "sigma=relative >> weight" can easily get what they want. >> Should I make a pull request? 
> > +1 From marquett at iap.fr Thu Feb 21 17:29:08 2013 From: marquett at iap.fr (Marquette Jean-Baptiste) Date: Thu, 21 Feb 2013 23:29:08 +0100 Subject: [SciPy-User] Unresolved symbol in scipy/special/_cephes.so on Mac OS 10.7.5 Message-ID: <4F2EE7F4-FB39-4635-A6BE-E26C7DBAFBE6@iap.fr> Hi scipy gurus, I have an issue in scipy, latest version: which python /Library/Frameworks/EPD64.framework/Versions/Current/bin/python scipy was uninstalled using the convenient enpkg option, then installed using pip and icc Intel compiler for C/C++ routines. The traceback is: python TestStat.py cg074 ERROR: ImportError: dlopen(/Library/Frameworks/EPD64.framework/Versions/Current/lib/python2.7/site-packages/scipy/special/_cephes.so, 2): Symbol not found: ___finite Referenced from: /Library/Frameworks/EPD64.framework/Versions/Current/lib/python2.7/site-packages/scipy/special/_cephes.so Expected in: dynamic lookup [unknown] Traceback (most recent call last): File "TestStat.py", line 18, in import scipy.stats as sci File "/Library/Frameworks/EPD64.framework/Versions/Current/lib/python2.7/site-packages/scipy/stats/__init__.py", line 321, in from stats import * File "/Library/Frameworks/EPD64.framework/Versions/Current/lib/python2.7/site-packages/scipy/stats/stats.py", line 193, in import scipy.special as special File "/Library/Frameworks/EPD64.framework/Versions/Current/lib/python2.7/site-packages/scipy/special/__init__.py", line 525, in from _cephes import * ImportError: dlopen(/Library/Frameworks/EPD64.framework/Versions/Current/lib/python2.7/site-packages/scipy/special/_cephes.so, 2): Symbol not found: ___finite Referenced from: /Library/Frameworks/EPD64.framework/Versions/Current/lib/python2.7/site-packages/scipy/special/_cephes.so Expected in: dynamic lookup Any hint welcome; Cheers, Jean-Baptiste From Deil.Christoph at gmail.com Fri Feb 22 12:18:06 2013 From: Deil.Christoph at gmail.com (Christoph Deil) Date: Fri, 22 Feb 2013 18:18:06 +0100 Subject: [SciPy-User] Revisit Unexpected 
covariance matrix from scipy.optimize.curve_fit In-Reply-To: References: Message-ID: On Feb 22, 2013, at 5:12 PM, "Moore, Eric (NIH/NIDDK) [F]" wrote: >> -----Original Message----- >> From: Tom Aldcroft [mailto:aldcroft at head.cfa.harvard.edu] >> Sent: Friday, February 22, 2013 10:42 AM >> To: SciPy Users List >> Subject: [SciPy-User] Revisit Unexpected covariance matrix from >> scipy.optimize.curve_fit >> >> In Aug 2011 there was a thread [Unexpected covariance matrix from >> scipy.optimize.curve_fit](http://mail.scipy.org/pipermail/scipy- >> user/2011-August/030412.html) >> where Christoph Deil reported that "scipy.optimize.curve_fit returns >> parameter errors that don't scale with sigma, the standard deviation >> of ydata, as I expected." Today I independently came to the same >> conclusion. >> >> This thread generated some discussion but seemingly no agreement that >> the covariance output of `curve_fit` is not what would be expected. I >> think the discussion wasn't as focused as possible because the example >> was too complicated. With that I provide here about the simplest >> possible example, which is fitting a constant to a constant dataset, >> aka computing the mean and error on the mean. Since we know the >> answers we can compare the output of `curve_fit`. >> >> To illustrate things more easily I put the examples into an IPython >> notebook which is available at: >> >> http://nbviewer.ipython.org/5014170/ >> >> This was run using scipy 0.11.0 by the way. Any further discussion on >> this topic to come to an understanding of the covariance output from >> `curve_fit` would be appreciated. >> >> Thanks, >> Tom >> _______________________________________________ > > chi2 = np.sum(((yn-const(x, *popt))/sigma)**2) > perr = np.sqrt(np.diag(pcov)/(chi2/(x.shape[0]-1))) > > Perr is then the actual error in the fit parameter. No? 
> > -Eric Hi Tom, I think I understood what scipy.optimize.curve_fit is doing thanks to Josef's comments in the previous thread you mentioned. It scales the covariance matrix (i.e. the inverse of the HESSE matrix of second derivatives of the chi2 fit statistic) from the fit by a factor. If you want to get the covariance matrix that e.g. sherpa (http://cxc.harvard.edu/sherpa/index.html) or MINUIT (https://github.com/iminuit/iminuit) would return and that e.g. physicists / astronomers expect, you can re-compute this factor and divide the scaled covariance matrix from curve_fit like this: # Define inputs: model, x, y, p0 and sigma ? # Compute best-fit values and "scaled covariance matrix" with curve_fit popt, pcov = curve_fit(model, x, y, p0=p0, sigma=sigma) # Undo the scale factor to get the "real covariance matrix", which was automatically applied by curve_fit chi = (y - model(x, *popt)) / sigma chi2 = (chi ** 2).sum() dof = len(x) - len(popt) factor = (chi2 / dof) pcov /= factor (I haven't checked if this is equivalent to the code Eric gave above.) If I understand correctly, the motivation for multiplying pcov by this factor in curve_fit is that this was written by an economist, and there it is common to interpret the sigma not as errors on measurement points, but as relative weights between measurement points, with no meaning for the absolute scale of these weights. I think applying this scale factor to pcov is equivalent to re-scaling the sigmas to achieve a chi2 / dof of 1, which is a reasonable thing to do if you say the sigmas are only relative weights. Does this make sense? How about adding an option "scale_pcov" to curve_fit whether this scale factor should be applied. To keep backward compatibility the default would have to be scale_pcov=True. The advantage of this option would be that the issue is explained in the docstring and that people with "sigma=error" instead of "sigma=relative weight" can easily get what they want. Should I make a pull request? 
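For concreteness, the rescaling recipe above can be exercised end-to-end on the "mean and error on the mean" case from Tom's notebook (the constant model and the simulated numbers below are made up for illustration, they are not Tom's actual data):

```python
import numpy as np
from scipy.optimize import curve_fit

# Toy data: fit a constant to N points whose true standard deviation is 1,
# i.e. computing the mean and the error on the mean.
def model(x, c):
    return c * np.ones_like(x)

rng = np.random.RandomState(0)
x = np.arange(20.0)
y = 5.0 + rng.normal(scale=1.0, size=x.size)
sigma = np.ones_like(x)  # claimed *absolute* errors on y

popt, pcov = curve_fit(model, x, y, p0=[1.0], sigma=sigma)

# Undo the chi2/dof factor that curve_fit applies automatically, to
# recover the covariance matrix expected when sigma are absolute errors.
chi = (y - model(x, *popt)) / sigma
chi2 = (chi ** 2).sum()
dof = len(x) - len(popt)
pcov_abs = pcov / (chi2 / dof)

# With sigma = 1 the error on the fitted mean should be 1/sqrt(N).
print(np.sqrt(pcov_abs[0, 0]))  # ~ 1/sqrt(20) = 0.2236
```

The fitted constant comes out as the sample mean, and the rescaled parameter error matches the textbook 1/sqrt(N), which is exactly what the scaled pcov fails to give when the supplied sigmas are off.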
Christoph From paulc.mnt at gmail.com Sat Feb 23 07:20:48 2013 From: paulc.mnt at gmail.com (Paul Manta) Date: Sat, 23 Feb 2013 12:20:48 +0000 (UTC) Subject: [SciPy-User] Checking Gradients with Scipy Message-ID: I want to use `scipy.optimize.check_grad` to check the gradient of my implementation of the sigmoid function; here's my Python function: > def sigmoid(x, gradient=False): > y = 1 / (1 + numpy.exp(-x)) > return numpy.multiply(y, 1 - y) if gradient else y Here are the arguments and the call to `check_grad`: > x0 = numpy.random.uniform(-30, 30, (4, 5)) > func = sigmoid > grad = lambda x: sigmoid(x, gradient=True) > error = scipy.optimize.check_grad(func, grad, x0) I get the error below. The shape mismatch refers to the subtraction `f(* ((xk+d,)+args)) - f0`. Any idea what could be causing this and how I should fix it? > File "scipy\optimize\optimize.py", line 597, in approx_fprime > grad[k] = (f(*((xk+d,)+args)) - f0) / d[k] > ValueError: operands could not be broadcast together with shapes (4,5) (4) From pav at iki.fi Sat Feb 23 08:45:42 2013 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 23 Feb 2013 15:45:42 +0200 Subject: [SciPy-User] Checking Gradients with Scipy In-Reply-To: References: Message-ID: 23.02.2013 14:20, Paul Manta kirjoitti: > I want to use `scipy.optimize.check_grad` to check the gradient of my > implementation of the sigmoid function; here's my Python function: > >> def sigmoid(x, gradient=False): >> y = 1 / (1 + numpy.exp(-x)) >> return numpy.multiply(y, 1 - y) if gradient else y > > Here are the arguments and the call to `check_grad`: > >> x0 = numpy.random.uniform(-30, 30, (4, 5)) >> func = sigmoid >> grad = lambda x: sigmoid(x, gradient=True) >> error = scipy.optimize.check_grad(func, grad, x0) > > I get the error below. The shape mismatch refers to the subtraction `f(* > ((xk+d,)+args)) - f0`. Any idea what could be causing this and how I should fix > it? 
You are here telling check_grad that you have a function of 4*5 = 20 variables, as `check_grad` deduces the number of variables from the size of `x0`. `x0` needs to be a scalar here. Note that `numpy.multiply(a, b)` can be written as `a*b`. -- Pauli Virtanen From sergio.callegari at gmail.com Sat Feb 23 11:07:26 2013 From: sergio.callegari at gmail.com (Sergio Callegari) Date: Sat, 23 Feb 2013 16:07:26 +0000 (UTC) Subject: [SciPy-User] Calling scipy blas from cython is slow and non-deterministic Message-ID: Hi, following the excellent advice that I have received from V. Armando Sole on the numpy mailing list, I have finally succeeded in calling the blas routines shipped with scipy from cython. I am doing this to avoid shipping an extra blas library in windows, for a project of mine that uses scipy but has some things coded in cython for speed. So far I managed getting things (almost) working on Linux. Here is what I do: The following code snippet gives me the dgemv pointer (which is a pointer to a fortran function, even if it comes from scipy.linalg.blas.cblas, weird). from cpython cimport PyCObject_AsVoidPtr import scipy as sp __import__('scipy.linalg.blas') ctypedef void (*dgemv_ptr) (char *trans, int *m, int *n,\ double *alpha, double *a, int *lda, double *x,\ int *incx,\ double *beta, double *y, int *incy) cdef dgemv_ptr dgemv=PyCObject_AsVoidPtr(\ sp.linalg.blas.cblas.dgemv._cpointer) Then, in a tight loop, I can call dgemv by first defining some constants (since fortran stuff wants everything passed by address) cdef int one=1 cdef double onedot = 1.0 cdef double zerodot = 0.0 cdef char trans = 'N' ... and then calling dgemv inside the loop for i in xrange(N): dgemv(&trans, &nq, &order,\ &onedot, np.PyArray_DATA(C), &order, \ np.PyArray_DATA(c_x0), &one, \ &zerodot, np.PyArray_DATA(y0), &one) This works, sort of. But I have two major issues: 1) It is non deterministic. About 1 times over 6 it gives the wrong result. 
Using cblas_dgemv always gives the correct result.  Am I forgetting
the initialization of something?

2) It is much much slower than using the cblas_dgemv that I can get by
linking to atlas.  Specifically, I have about 8 calls to blas in my
tight loop, 6 of them are to dgemv and the others are to dcopy.
Changing a single dgemv call from the system cblas to the blas
function returned by scipy.linalg.blas.cblas.dgemv._cpointer makes the
execution time of a test case jump from about 0.7 s to 1.25 s on my
system.  This is a huge slowdown of 80% for just a single function!

Any clue about why this is happening?  In the end, on Linux, scipy's
fblas.so is dynamically linked to atlas exactly like my code when I
access cblas_dgemv.

From sergio.callegari at gmail.com  Sat Feb 23 13:34:05 2013
From: sergio.callegari at gmail.com (Sergio Callegari)
Date: Sat, 23 Feb 2013 18:34:05 +0000 (UTC)
Subject: [SciPy-User] Calling scipy blas from cython is slow and
	non-deterministic
References:
Message-ID:

OK, I've made it work.  I was messing up the row/column order mapping
and the transpose param.

However, it is still slower than cblas.  After having replaced all the
cblas_xxx calls in my code (about 8 of them), I get the following
result

cblas version  0.69 sec
scipy blas     0.74 sec

Any clue why?
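For reference, the trans/storage-order conventions that caused the wrong results above can be sanity-checked from pure Python with scipy's own wrappers before dropping to Cython. This is a sketch using `scipy.linalg.get_blas_funcs` (the matrix shapes and values are arbitrary):

```python
import numpy as np
from scipy.linalg import get_blas_funcs

# BLAS gemv computes y = alpha*A*x with A in Fortran (column-major)
# order; getting trans or the storage order wrong silently permutes
# the result, which matches the symptom described in the thread.
a = np.asfortranarray(np.arange(12.0).reshape(4, 3))
x = np.array([1.0, 2.0, 3.0])
gemv, = get_blas_funcs(('gemv',), (a, x))    # selects dgemv for float64
y = gemv(1.0, a, x)                          # trans='N' (no transpose)
print(np.allclose(y, a.dot(x)))              # True
yt = gemv(1.0, a, np.ones(4.0 == 4.0 and 4), trans=1)  # trans='T': A.T @ x
print(np.allclose(yt, a.T.dot(np.ones(4))))  # True
```

Once the Python-level call agrees with `numpy.dot`, the same argument order (and the same Fortran-order array layout) can be carried over to the raw function pointer in the Cython loop.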
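On the earlier `check_grad` question: as Pauli notes, `check_grad` treats `x0` as a flat vector of variables of a scalar-valued function, so an elementwise function like the sigmoid has to be checked one coordinate at a time. A minimal sketch (the scalar wrappers `f` and `g` below are illustrative):

```python
import numpy as np
from scipy.optimize import check_grad

def sigmoid(x, gradient=False):
    y = 1.0 / (1.0 + np.exp(-x))
    return y * (1.0 - y) if gradient else y

# check_grad wants f: R^n -> R and grad: R^n -> R^n, so wrap the
# elementwise sigmoid as a function of a single variable (n = 1).
f = lambda x: float(sigmoid(x[0]))
g = lambda x: np.atleast_1d(sigmoid(x[0], gradient=True))

rng = np.random.RandomState(42)
for x0 in rng.uniform(-5.0, 5.0, size=10):
    err = check_grad(f, g, np.array([x0]))
    assert err < 1e-5, err
print("gradient OK")
```

The finite-difference error for a smooth function like the sigmoid is tiny, so a threshold around 1e-5 (chosen here for illustration) comfortably distinguishes a correct analytic gradient from a wrong one.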
From MaraMaus at nurfuerspam.de Fri Feb 22 20:18:48 2013 From: MaraMaus at nurfuerspam.de (Mara Grahl) Date: Sat, 23 Feb 2013 02:18:48 +0100 Subject: [SciPy-User] odespy AttributeError In-Reply-To: <51264512.9090503@gmail.com> References: <20130221033903.185300@gmx.net> <51264512.9090503@gmail.com> Message-ID: <20130223011848.302220@gmx.net> Hi, thank you for the suggestion, I'll consider that, best Mara -------- Original-Nachricht -------- > Datum: Thu, 21 Feb 2013 17:02:26 +0100 > Von: Johann Cohen-Tanugi > An: SciPy Users List > CC: MaraMaus at nurfuerspam.de > Betreff: Re: [SciPy-User] odespy AttributeError > hello, best is probably to contact the main developer directly, as this > from github does not look like a community effort : > https://github.com/hplgit > > good luck, > Johann > > On 02/21/2013 04:39 AM, MaraMaus at nurfuerspam.de wrote: > > Hi, > > > > since there doesn't seem to be a mailing list for the package odespy, I > place my question here - I hope this is ok? > > > > Whereas the examples from the odespy manual work fine, I encounter a > strange error when trying to use odespy for some complicated system of > ordinary differential equations: > > > > Traceback (most recent call last): > > File "ode1.py", line 131, in > > u, s = solver.solve(time_points) > > File "C:\Python27\lib\site-packages\odespy\solvers.py", line 1036, in > solve > > self.u[n+1] = self.advance() # new value > > File "C:\Python27\lib\site-packages\odespy\RungeKutta.py", line 194, in > advance > > rms_norm = np.sqrt(np.sum(rms*rms)/self.neq) > > AttributeError: sqrt > > > > Does anyone know what might be the reason for this error message? > > > > I would be glad for any help, thank you in advance, > > > > Mara > > > > > > > > For completeness the full code: > > A short description: most of the code (between the hashs #setup_beg and > #setup_end) sets up the system of 9 ordinary 1st order differential > equations for the variables v1, v2, v3, vp1, vp2, vp3, vpp1, vpp2, vpp3. 
The list > DLHS contains explicit expressions for > > [diff(v1,s), diff(v2,s), ... , diff(vpp3,s)] > > > > > > #setup_beg > > nx=3 > > T=45.1 > > mu=253.9 > > sL=500.0 > > sR=6.0 > > xL=0.0 > > xR=100.0**2 > > dx=(xR-xL)/(nx-1) > > > > x={} > > for i in range(1,nx+1): > > x[i]=xL+dx*(i-1) > > > > print(x[nx]) > > > > # vp is first derivative of v, vp1 is first derivative of v at grid > point 1 etc. > > # note that vp[0]=vp1, ..., vp[nx-1]=vpnx > > > > from sympy import solve, Symbol > > > > pts=range(1,nx+1) > > vp=[Symbol('vp'+str(i)) for i in pts] > > vpp=[Symbol('vpp'+str(i)) for i in pts] > > vppp=[Symbol('vppp'+str(i)) for i in pts] > > vpppp=[Symbol('vpppp'+str(i)) for i in pts] > > #beachte: vp[0]=vp1,...,vp[nx-1]=vpnx etc. > > > > eq1={} > > for i in range(1,nx): > > eq1[i]=vp[i-1]+vpp[i-1]*(x[i+1]-x[i])/2 + > vppp[i-1]*(x[i+1]-x[i])**2/8+vpppp[i-1]*(x[i+1]-x[i])**3/48 \ > > -vp[i]+vpp[i]*(x[i+1]-x[i])/2 - > vppp[i]*(x[i+1]-x[i])**2/8+vpppp[i]*(x[i+1]-x[i])**3/48 > > > > eq2={} > > for i in range(1,nx): > > eq2[i]=vpp[i-1]+vppp[i-1]*(x[i+1]-x[i])/2 + > vpppp[i-1]*(x[i+1]-x[i])**2/8 \ > > -vpp[i]+vppp[i]*(x[i+1]-x[i])/2 - vpppp[i]*(x[i+1]-x[i])**2/8 > > > > eq3={} > > eq3[1]=vppp[0]+vpppp[0]*(x[2]-x[1])/2-vppp[1]-vpppp[1]*(x[1]-x[2])/2 > > > eq3[2]=vppp[nx-2]+vpppp[nx-2]*(x[nx]-x[nx-1])/2-vppp[nx-1]-vpppp[nx-1]*(x[nx-1]-x[nx])/2 > > > > > > eqs=[] > > for i in range(1,nx): > > eqs.append(eq1[i]) > > for i in range(1,nx): > > eqs.append(eq2[i]) > > eqs.append(eq3[1]) > > eqs.append(eq3[2]) > > > > vars=[] > > for i in range(1,nx+1): > > vars.append(vppp[i-1]) > > for i in range(1,nx+1): > > vars.append(vpppp[i-1]) > > > > sol=solve(eqs,vars) > > > > #beachte: vppp[0]=vppp1,...,vppp[nx-1]=vpppnx etc. > > # Check z.B. 
fuer nx=5: (gleiche Loesung mit MM) > > # vpppp5= > > #print(sol[vpppp[4]]) > > # vpppp1= > > #print(sol[vpppp[0]]) > > > > from sympy import Function, diff > > from sympy.functions import coth, tanh, sqrt > > > > s=Symbol('s') > > vh=Function('vh') > > xh=Symbol('xh') > > > > Epi= sqrt( s**2 + 2*diff(vh(xh),xh) ) > > Esi= sqrt( s**2 + 2*diff(vh(xh),xh) +4*xh*diff(vh(xh),xh,xh) ) > > Eq= sqrt(s**2 +3.2**2*xh) > > > > import math > > pi=math.pi > > > > gl1= s**4/12/pi**2*( 3/Epi*coth(Epi/2/T) +1/Esi*coth(Esi/2/T) > -12/Eq*(tanh( (Eq-mu)/2/T ) + tanh( (Eq+mu)/2/T ) ) ) > > gl2= diff(gl1,xh) > > gl3= diff(gl2,xh) > > > > DLHS=[] > > > > for i in range(1,nx+1): > > DLHS.append( gl1.subs({'Derivative(vh(xh), xh, > xh)':vpp[i-1],'Derivative(vh(xh), xh)':vp[i-1],'xh':x[i]}).subs(sol) ) > > for i in range(1,nx+1): > > DLHS.append( gl2.subs({'Derivative(vh(xh), xh, xh, > xh)':vppp[i-1],'Derivative(vh(xh), xh, xh)':vpp[i-1],'Derivative(vh(xh), > xh)':vp[i-1],'xh':x[i]}).subs(sol) ) > > for i in range(1,nx+1): > > DLHS.append( gl3.subs({'Derivative(vh(xh), xh, xh, xh, > xh)':vpppp[i-1],'Derivative(vh(xh), xh, xh, xh)':vppp[i-1],'Derivative(vh(xh), xh, > xh)':vpp[i-1], \ > > 'Derivative(vh(xh), > xh)':vp[i-1],'xh':x[i]}).subs(sol) ) > > > > > #DLHS[0]=D[v(x_1),s],...,DLHS[nx-1]=D[v(x_nx),s],DLHS[nx]=D[vp(x_1),s],..., DLHS[nx+nx+nx-1]=D[vpp(x_nx),s] > > > > > > #setup_end > > > > > > v=[Symbol('v'+str(i)) for i in pts] > > > > #initial conditions > > isc=[] > > > > for i in range(1,nx+1): > > isc.append( 5/2*x[i]**2) > > for i in range(1,nx+1): > > isc.append( 5*x[i]) > > for i in range(1,nx+1): > > isc.append( 5) > > > > > > > > > > # to do: generalize to arbitrary nx > > def f(u,s): > > v1, v2, v3, vp1, vp2, vp3, vpp1, vpp2, vpp3 = u > > return DLHS > > > > import odespy > > solver = odespy.RungeKutta.CashKarp(f) > > solver.set_initial_condition(isc) > > from numpy import linspace > > T = 490 # end of simulation > > N = 30 # no of time steps > > time_points = linspace(sL, T, 
N+1) > > u, s = solver.solve(time_points) > > > > #from matplotlib.pyplot import * > > #first=u[:,0] > > #plot(s, first) > > #show() > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From lernerbrandon2 at gmail.com Sun Feb 24 11:34:37 2013 From: lernerbrandon2 at gmail.com (Brandon Lerner) Date: Sun, 24 Feb 2013 11:34:37 -0500 Subject: [SciPy-User] Template Matching via fftconvolve Message-ID: Hi all, I've been trying to write a program that will find all matches of a smaller image in a bigger image by using fftconvolve to do a normalized cross correlation of two differently sized 2D numpy arrays. I've written a brute force pixel by pixel comparison that works but it is way too slow for me to use. Any guidance to someone new at SciPy is appreciated. After days of research, I've found some discussions about this, but I can't figure out what to do next (look at attached code for an example): http://stackoverflow.com/questions/7670112/finding-a-subimage-inside-a-numpy-image http://stackoverflow.com/questions/12715673/numpy-template-matching-using-matrix-multiplications http://dsp.stackexchange.com/questions/736/how-do-i-implement-cross-correlation-to-prove-two-audio-files-are-similar http://stackoverflow.com/questions/4196453/simple-and-fast-method-to-compare-images-for-similarity Note: I've also found cv2 (cv2.templateMatch) or skimage(matchTemplate), but I want to learn the math behind all of this. Best, B -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
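A sketch of the math behind FFT-based template matching: normalized cross-correlation can be built from three `fftconvolve` calls, using the identity that correlation is convolution with a flipped kernel. The helper name `normxcorr` and all constants below are illustrative, not from the scrubbed attachment:

```python
import numpy as np
from scipy.signal import fftconvolve

def normxcorr(image, template):
    """Normalized cross-correlation of template over image ('valid' positions).

    Peaks at 1.0 where a window matches the (mean-removed) template exactly.
    """
    t = template - template.mean()
    # Correlation is convolution with the template flipped in both axes.
    num = fftconvolve(image, t[::-1, ::-1], mode='valid')
    # Local window sums of the image and its square, again via FFT.
    win = np.ones(template.shape)
    s1 = fftconvolve(image, win, mode='valid')
    s2 = fftconvolve(image ** 2, win, mode='valid')
    n = template.size
    local_var = np.maximum(s2 - s1 ** 2 / n, 0.0)  # clip FFT round-off
    denom = np.sqrt(local_var) * np.sqrt((t ** 2).sum())
    out = np.zeros_like(num)
    np.divide(num, denom, out=out, where=denom > 1e-10)  # avoid 0/0 on flat windows
    return out

# Plant the template inside a random image and find it again.
rng = np.random.RandomState(0)
image = rng.rand(50, 60)
template = image[10:18, 20:27].copy()
ncc = normxcorr(image, template)
print(np.unravel_index(ncc.argmax(), ncc.shape))  # (10, 20)
```

By Cauchy-Schwarz the output is bounded by 1 in magnitude, so "all matches" can be read off by thresholding `ncc` close to 1; this replaces the per-pixel brute-force loop with a few FFTs.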
Name: templatematch.py Type: application/octet-stream Size: 2045 bytes Desc: not available URL: From dacmorton at gmail.com Sun Feb 24 22:36:33 2013 From: dacmorton at gmail.com (Daniel Morton) Date: Sun, 24 Feb 2013 21:36:33 -0600 Subject: [SciPy-User] Scipy build problem Message-ID: When I attempted to upgrade to the most recent version of scipy the build failed with the following error message: error: command 'swig' failed with exit status 1. I have Mac OS X 10.6.8. I have gcc version 4.2.1 and gfortran version 4.6.0. I don't know where I got gfortran from. I'm trying to install scipy 0.11.0 and have numpy 1.8.0. The complete printout from the build follows: blas_opt_info: FOUND: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] define_macros = [('NO_ATLAS_INFO', 3)] extra_compile_args = ['-msse3', '-I/System/Library/Frameworks/vecLib.framework/Headers'] lapack_opt_info: FOUND: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] define_macros = [('NO_ATLAS_INFO', 3)] extra_compile_args = ['-msse3'] Running from scipy source directory. 
umfpack_info: amd_info: FOUND: libraries = ['amd'] library_dirs = ['/usr/local/lib'] swig_opts = ['-I/usr/local/include'] define_macros = [('SCIPY_AMD_H', None)] include_dirs = ['/usr/local/include'] FOUND: libraries = ['umfpack', 'amd'] library_dirs = ['/usr/local/lib'] swig_opts = ['-I/usr/local/include', '-I/usr/local/include'] define_macros = [('SCIPY_UMFPACK_H', None), ('SCIPY_AMD_H', None)] include_dirs = ['/usr/local/include'] running build running config_cc unifing config_cc, config, build_clib, build_ext, build commands --compiler options running config_fc unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options running build_src build_src building py_modules sources building library "dfftpack" sources building library "fftpack" sources building library "linpack_lite" sources building library "mach" sources building library "quadpack" sources building library "odepack" sources building library "dop" sources building library "fitpack" sources building library "odrpack" sources building library "minpack" sources building library "rootfind" sources building library "superlu_src" sources building library "arpack_scipy" sources building library "sc_c_misc" sources building library "sc_cephes" sources building library "sc_mach" sources building library "sc_amos" sources building library "sc_cdf" sources building library "sc_specfun" sources building library "statlib" sources building extension "scipy.cluster._vq" sources building extension "scipy.cluster._hierarchy_wrap" sources building extension "scipy.fftpack._fftpack" sources f2py options: [] adding 'build/src.macosx-10.6-intel-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-intel-2.7' to include_dirs. building extension "scipy.fftpack.convolve" sources f2py options: [] adding 'build/src.macosx-10.6-intel-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-intel-2.7' to include_dirs. 
building extension "scipy.integrate._quadpack" sources building extension "scipy.integrate._odepack" sources building extension "scipy.integrate.vode" sources f2py options: [] adding 'build/src.macosx-10.6-intel-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-intel-2.7' to include_dirs. building extension "scipy.integrate.lsoda" sources f2py options: [] adding 'build/src.macosx-10.6-intel-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-intel-2.7' to include_dirs. building extension "scipy.integrate._dop" sources f2py options: [] adding 'build/src.macosx-10.6-intel-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-intel-2.7' to include_dirs. building extension "scipy.interpolate.interpnd" sources building extension "scipy.interpolate._fitpack" sources building extension "scipy.interpolate.dfitpack" sources f2py options: [] adding 'build/src.macosx-10.6-intel-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-intel-2.7' to include_dirs. adding 'build/src.macosx-10.6-intel-2.7/scipy/interpolate/src/dfitpack-f2pywrappers.f' to sources. building extension "scipy.interpolate._interpolate" sources building extension "scipy.io.matlab.streams" sources building extension "scipy.io.matlab.mio_utils" sources building extension "scipy.io.matlab.mio5_utils" sources building extension "scipy.lib.blas.fblas" sources f2py options: [] adding 'build/src.macosx-10.6-intel-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-intel-2.7' to include_dirs. adding 'build/src.macosx-10.6-intel-2.7/build/src.macosx-10.6-intel-2.7/scipy/lib/blas/fblas-f2pywrappers.f' to sources. building extension "scipy.lib.blas.cblas" sources adding 'build/src.macosx-10.6-intel-2.7/scipy/lib/blas/cblas.pyf' to sources. f2py options: [] adding 'build/src.macosx-10.6-intel-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-intel-2.7' to include_dirs. 
building extension "scipy.lib.lapack.flapack" sources f2py options: [] adding 'build/src.macosx-10.6-intel-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-intel-2.7' to include_dirs. building extension "scipy.lib.lapack.clapack" sources adding 'build/src.macosx-10.6-intel-2.7/scipy/lib/lapack/clapack.pyf' to sources. f2py options: [] adding 'build/src.macosx-10.6-intel-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-intel-2.7' to include_dirs. building extension "scipy.lib.lapack.calc_lwork" sources f2py options: [] adding 'build/src.macosx-10.6-intel-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-intel-2.7' to include_dirs. building extension "scipy.linalg._fblas" sources f2py options: [] adding 'build/src.macosx-10.6-intel-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-intel-2.7' to include_dirs. adding 'build/src.macosx-10.6-intel-2.7/build/src.macosx-10.6-intel-2.7/scipy/linalg/_fblas-f2pywrappers.f' to sources. building extension "scipy.linalg._flapack" sources f2py options: [] adding 'build/src.macosx-10.6-intel-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-intel-2.7' to include_dirs. adding 'build/src.macosx-10.6-intel-2.7/build/src.macosx-10.6-intel-2.7/scipy/linalg/_flapack-f2pywrappers.f' to sources. building extension "scipy.linalg._flinalg" sources f2py options: [] adding 'build/src.macosx-10.6-intel-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-intel-2.7' to include_dirs. building extension "scipy.linalg.calc_lwork" sources f2py options: [] adding 'build/src.macosx-10.6-intel-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-intel-2.7' to include_dirs. building extension "scipy.odr.__odrpack" sources building extension "scipy.optimize._minpack" sources building extension "scipy.optimize._zeros" sources building extension "scipy.optimize._lbfgsb" sources f2py options: [] adding 'build/src.macosx-10.6-intel-2.7/fortranobject.c' to sources. 
adding 'build/src.macosx-10.6-intel-2.7' to include_dirs. building extension "scipy.optimize.moduleTNC" sources building extension "scipy.optimize._cobyla" sources f2py options: [] adding 'build/src.macosx-10.6-intel-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-intel-2.7' to include_dirs. building extension "scipy.optimize.minpack2" sources f2py options: [] adding 'build/src.macosx-10.6-intel-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-intel-2.7' to include_dirs. building extension "scipy.optimize._slsqp" sources f2py options: [] adding 'build/src.macosx-10.6-intel-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-intel-2.7' to include_dirs. building extension "scipy.optimize._nnls" sources f2py options: [] adding 'build/src.macosx-10.6-intel-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-intel-2.7' to include_dirs. building extension "scipy.signal.sigtools" sources building extension "scipy.signal._spectral" sources building extension "scipy.signal.spline" sources building extension "scipy.sparse.linalg.isolve._iterative" sources f2py options: [] adding 'build/src.macosx-10.6-intel-2.7/fortranobject.c' to sources. adding 'build/src.macosx-10.6-intel-2.7' to include_dirs. building extension "scipy.sparse.linalg.dsolve._superlu" sources building extension "scipy.sparse.linalg.dsolve.umfpack.__umfpack" sources adding 'scipy/sparse/linalg/dsolve/umfpack/umfpack.i' to sources. 
swig: scipy/sparse/linalg/dsolve/umfpack/umfpack.i swig -python -I/usr/local/include -I/usr/local/include -I/usr/local/include -o build/src.macosx-10.6-intel-2.7/scipy/sparse/linalg/dsolve/umfpack/_umfpack_wrap.c -outdir build/src.macosx-10.6-intel-2.7/scipy/sparse/linalg/dsolve/umfpack scipy/sparse/linalg/dsolve/umfpack/umfpack.i scipy/sparse/linalg/dsolve/umfpack/umfpack.i:192: Error: Unable to find 'umfpack.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:193: Error: Unable to find 'umfpack_solve.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:194: Error: Unable to find 'umfpack_defaults.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:195: Error: Unable to find 'umfpack_triplet_to_col.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:196: Error: Unable to find 'umfpack_col_to_triplet.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:197: Error: Unable to find 'umfpack_transpose.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:198: Error: Unable to find 'umfpack_scale.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:200: Error: Unable to find 'umfpack_report_symbolic.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:201: Error: Unable to find 'umfpack_report_numeric.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:202: Error: Unable to find 'umfpack_report_info.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:203: Error: Unable to find 'umfpack_report_control.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:215: Error: Unable to find 'umfpack_symbolic.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:216: Error: Unable to find 'umfpack_numeric.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:225: Error: Unable to find 'umfpack_free_symbolic.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:226: Error: Unable to find 'umfpack_free_numeric.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:248: Error: Unable to find 'umfpack_get_lunz.h' scipy/sparse/linalg/dsolve/umfpack/umfpack.i:272: Error: Unable to find 'umfpack_get_numeric.h' error: command 'swig' failed with exit status 1 
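For readers hitting the same failure: the build detected UMFPACK (system_info found umfpack.h), but SWIG cannot find the per-function headers (umfpack_solve.h, umfpack_defaults.h, ...) that the interface file includes, so either the SuiteSparse header set is incomplete or it lives somewhere -I/usr/local/include does not reach. A minimal workaround sketch, assuming the headers actually sit in a suitesparse subdirectory (that path is an assumption -- check your installation): point numpy.distutils at them with a site.cfg next to SciPy's setup.py.

```shell
# Sketch: write a site.cfg telling numpy.distutils where SuiteSparse lives.
# The include/library paths below are assumptions -- replace them with the
# directory that actually contains umfpack.h, umfpack_solve.h, etc.
cat > site.cfg <<'EOF'
[amd]
library_dirs = /usr/local/lib
include_dirs = /usr/local/include/suitesparse
amd_libs = amd

[umfpack]
library_dirs = /usr/local/lib
include_dirs = /usr/local/include/suitesparse
umfpack_libs = umfpack
EOF
```

Note that the UMFPACK wrapper is optional: if the headers cannot be located at all, SciPy still builds, and scipy.sparse.linalg falls back to SuperLU for sparse direct solves.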
-------------- next part -------------- An HTML attachment was scrubbed... URL: From kevin.gullikson at gmail.com Sun Feb 24 21:40:56 2013 From: kevin.gullikson at gmail.com (Kevin Gullikson) Date: Sun, 24 Feb 2013 20:40:56 -0600 Subject: [SciPy-User] leastsq (probably simple) question Message-ID: Hey all, I am trying to use scipy.optimize.leastsq to do a one-parameter fit. The final result is not very sensitive to the exact value, so I am wondering how you can limit leastsq to stop iterating once the values it is testing are reasonably close together? Example output from my code: Resolution-fitting X^2 = 1056.95654172 at R = [ 51908.71176496] Resolution-fitting X^2 = 1054.42705462 at R = [ 51908.71170349] Resolution-fitting X^2 = 1054.42705427 at R = [ 51908.71174447] Resolution-fitting X^2 = 1056.95654172 at R = [ 51908.71176496] Resolution-fitting X^2 = 1056.95654172 at R = [ 51908.71176496] Resolution-fitting X^2 = 1054.42705436 at R = [ 51908.71173423] Resolution-fitting X^2 = 1056.95654181 at R = [ 51908.71175472] Resolution-fitting X^2 = 1054.42705431 at R = [ 51908.71173935] Resolution-fitting X^2 = 1054.42705427 at R = [ 51908.71174447] Resolution-fitting X^2 = 1054.42705427 at R = [ 51908.71174447] Resolution-fitting X^2 = 1054.42705427 at R = [ 51908.71174447] Resolution-fitting X^2 = 3408043.11138 at R = [ 216058.47126214] Resolution-fitting X^2 = 1054.8705442 at R = [ 51857.89239384] Resolution-fitting X^2 = 1054.64687339 at R = [ 51883.47239909] Resolution-fitting X^2 = 1054.53612148 at R = [ 51896.17634439] Resolution-fitting X^2 = 1054.48119765 at R = [ 51902.48582026] Resolution-fitting X^2 = 1054.4539391 at R = [ 51905.61951157] Resolution-fitting X^2 = 1054.44040558 at R = [ 51907.1759188] Resolution-fitting X^2 = 1054.4336851 at R = [ 51907.9489416] Resolution-fitting X^2 = 1054.43034753 at R = [ 51908.33288072] Resolution-fitting X^2 = 1054.42868992 at R = [ 51908.52357292] Resolution-fitting X^2 = 1054.42786665 at R = [ 51908.61828465] 
Resolution-fitting X^2 = 1054.42745775 at R = [ 51908.66532545] Resolution-fitting X^2 = 1054.42725467 at R = [ 51908.68868937] Resolution-fitting X^2 = 1054.4271538 at R = [ 51908.70029361] Resolution-fitting X^2 = 1054.4271037 at R = [ 51908.70605713] Resolution-fitting X^2 = 1054.42707882 at R = [ 51908.70891972] Resolution-fitting X^2 = 1054.42706646 at R = [ 51908.71034149] The parameter I am adjusting is R (the resolution of a spectrograph), and as you can see it is making these tiny changes which have very little effect on the X^2 value (that is not reduced X^2, so don't worry that it isn't near 1!) Is there a way to tell leastsq to stop once it gets to say 51908 (in this example)? Kevin Gullikson -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Feb 25 10:50:41 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 25 Feb 2013 10:50:41 -0500 Subject: [SciPy-User] leastsq (probably simple) question In-Reply-To: References: Message-ID: On Sun, Feb 24, 2013 at 9:40 PM, Kevin Gullikson wrote: > Hey all, > > I am trying to use scipy.optimize.leastsq to do a one-parameter fit. The > final result is not very sensitive to the exact value, so I am wondering how > you can limit leastsq to stop iterating once the values it is testing are > reasonably close together?
Example output from my code: > > Resolution-fitting X^2 = 1056.95654172 at R = [ 51908.71176496] > Resolution-fitting X^2 = 1054.42705462 at R = [ 51908.71170349] > Resolution-fitting X^2 = 1054.42705427 at R = [ 51908.71174447] > Resolution-fitting X^2 = 1056.95654172 at R = [ 51908.71176496] > Resolution-fitting X^2 = 1056.95654172 at R = [ 51908.71176496] > Resolution-fitting X^2 = 1054.42705436 at R = [ 51908.71173423] > Resolution-fitting X^2 = 1056.95654181 at R = [ 51908.71175472] > Resolution-fitting X^2 = 1054.42705431 at R = [ 51908.71173935] > Resolution-fitting X^2 = 1054.42705427 at R = [ 51908.71174447] > Resolution-fitting X^2 = 1054.42705427 at R = [ 51908.71174447] > Resolution-fitting X^2 = 1054.42705427 at R = [ 51908.71174447] > Resolution-fitting X^2 = 3408043.11138 at R = [ 216058.47126214] > Resolution-fitting X^2 = 1054.8705442 at R = [ 51857.89239384] > Resolution-fitting X^2 = 1054.64687339 at R = [ 51883.47239909] > Resolution-fitting X^2 = 1054.53612148 at R = [ 51896.17634439] > Resolution-fitting X^2 = 1054.48119765 at R = [ 51902.48582026] > Resolution-fitting X^2 = 1054.4539391 at R = [ 51905.61951157] > Resolution-fitting X^2 = 1054.44040558 at R = [ 51907.1759188] > Resolution-fitting X^2 = 1054.4336851 at R = [ 51907.9489416] > Resolution-fitting X^2 = 1054.43034753 at R = [ 51908.33288072] > Resolution-fitting X^2 = 1054.42868992 at R = [ 51908.52357292] > Resolution-fitting X^2 = 1054.42786665 at R = [ 51908.61828465] > Resolution-fitting X^2 = 1054.42745775 at R = [ 51908.66532545] > Resolution-fitting X^2 = 1054.42725467 at R = [ 51908.68868937] > Resolution-fitting X^2 = 1054.4271538 at R = [ 51908.70029361] > Resolution-fitting X^2 = 1054.4271037 at R = [ 51908.70605713] > Resolution-fitting X^2 = 1054.42707882 at R = [ 51908.70891972] > Resolution-fitting X^2 = 1054.42706646 at R = [ 51908.71034149] > > The parameter I am adjusting is R (the resolution of a spectrograph), and as > you can see it is doing these tiny changes 
which have very little affect on > the X^2 value (that is not reduced X^2, so don't worry that it isn't near > 1!) > > Is there a way to tell leastsq to stop once it gets to say 51908 (in this > example)? Rescale your function and/or parameters, or loosen xtol and ftol; the default of about 1e-8 will be too small if your values are large. Josef > > Kevin Gullikson > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From kevin.gullikson.signup at gmail.com Mon Feb 25 10:56:30 2013 From: kevin.gullikson.signup at gmail.com (Kevin Gullikson) Date: Mon, 25 Feb 2013 09:56:30 -0600 Subject: [SciPy-User] leastsq (probably simple) question In-Reply-To: References: Message-ID: I have tried playing with xtol and ftol, even pushing them all the way to 10, but it doesn't seem to have any effect. On Mon, Feb 25, 2013 at 9:50 AM, wrote: > On Sun, Feb 24, 2013 at 9:40 PM, Kevin Gullikson > wrote: > > Hey all, > > > > I am trying to use scipy.optimize.leastsq to do a one-parameter fit. The > > final result is not very sensitive to the exact value, so I am wondering > how > > you can limit leastsq to stop iterating once the values it is testing are > > reasonably close together?
Example output from my code: > > > > Resolution-fitting X^2 = 1056.95654172 at R = [ 51908.71176496] > > Resolution-fitting X^2 = 1054.42705462 at R = [ 51908.71170349] > > Resolution-fitting X^2 = 1054.42705427 at R = [ 51908.71174447] > > Resolution-fitting X^2 = 1056.95654172 at R = [ 51908.71176496] > > Resolution-fitting X^2 = 1056.95654172 at R = [ 51908.71176496] > > Resolution-fitting X^2 = 1054.42705436 at R = [ 51908.71173423] > > Resolution-fitting X^2 = 1056.95654181 at R = [ 51908.71175472] > > Resolution-fitting X^2 = 1054.42705431 at R = [ 51908.71173935] > > Resolution-fitting X^2 = 1054.42705427 at R = [ 51908.71174447] > > Resolution-fitting X^2 = 1054.42705427 at R = [ 51908.71174447] > > Resolution-fitting X^2 = 1054.42705427 at R = [ 51908.71174447] > > Resolution-fitting X^2 = 3408043.11138 at R = [ 216058.47126214] > > Resolution-fitting X^2 = 1054.8705442 at R = [ 51857.89239384] > > Resolution-fitting X^2 = 1054.64687339 at R = [ 51883.47239909] > > Resolution-fitting X^2 = 1054.53612148 at R = [ 51896.17634439] > > Resolution-fitting X^2 = 1054.48119765 at R = [ 51902.48582026] > > Resolution-fitting X^2 = 1054.4539391 at R = [ 51905.61951157] > > Resolution-fitting X^2 = 1054.44040558 at R = [ 51907.1759188] > > Resolution-fitting X^2 = 1054.4336851 at R = [ 51907.9489416] > > Resolution-fitting X^2 = 1054.43034753 at R = [ 51908.33288072] > > Resolution-fitting X^2 = 1054.42868992 at R = [ 51908.52357292] > > Resolution-fitting X^2 = 1054.42786665 at R = [ 51908.61828465] > > Resolution-fitting X^2 = 1054.42745775 at R = [ 51908.66532545] > > Resolution-fitting X^2 = 1054.42725467 at R = [ 51908.68868937] > > Resolution-fitting X^2 = 1054.4271538 at R = [ 51908.70029361] > > Resolution-fitting X^2 = 1054.4271037 at R = [ 51908.70605713] > > Resolution-fitting X^2 = 1054.42707882 at R = [ 51908.70891972] > > Resolution-fitting X^2 = 1054.42706646 at R = [ 51908.71034149] > > > > The parameter I am adjusting is R (the resolution of a 
spectrograph), > and as > > you can see it is doing these tiny changes which have very little affect > on > > the X^2 value (that is not reduced X^2, so don't worry that it isn't near > > 1!) > > > > Is there a way to tell leastsq to stop once it gets to say 51908 (in this > > example)? > > rescale your function and/or parameters or reduce xtol and ftol > 1e-8 will be too small if your values are large > > Josef > > > > > Kevin Gullikson > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at barbierdereuille.net Mon Feb 25 11:53:16 2013 From: pierre at barbierdereuille.net (Pierre Barbier de Reuille) Date: Mon, 25 Feb 2013 17:53:16 +0100 Subject: [SciPy-User] leastsq (probably simple) question In-Reply-To: References: Message-ID: And did you check how many evaluations are made? It's in the 'nfev' element of the infodict if you ask for the full output. -- Barbier de Reuille Pierre On 25 February 2013 16:56, Kevin Gullikson wrote: > I have tried playing with xtol and ftol, even pushing them all the way to > 10, but it doesn't seem to have any effect. > > > On Mon, Feb 25, 2013 at 9:50 AM, wrote: > >> On Sun, Feb 24, 2013 at 9:40 PM, Kevin Gullikson >> wrote: >> > Hey all, >> > >> > I am trying to use scipy.optimize.leastsq to do a one-parameter fit. The >> > final result is not very sensitive to the exact value, so I am >> wondering how >> > you can limit leastsq to stop iterating once the values it is testing >> are >> > reasonably close together? 
Example output from my code: >> > >> > Resolution-fitting X^2 = 1056.95654172 at R = [ 51908.71176496] >> > Resolution-fitting X^2 = 1054.42705462 at R = [ 51908.71170349] >> > Resolution-fitting X^2 = 1054.42705427 at R = [ 51908.71174447] >> > Resolution-fitting X^2 = 1056.95654172 at R = [ 51908.71176496] >> > Resolution-fitting X^2 = 1056.95654172 at R = [ 51908.71176496] >> > Resolution-fitting X^2 = 1054.42705436 at R = [ 51908.71173423] >> > Resolution-fitting X^2 = 1056.95654181 at R = [ 51908.71175472] >> > Resolution-fitting X^2 = 1054.42705431 at R = [ 51908.71173935] >> > Resolution-fitting X^2 = 1054.42705427 at R = [ 51908.71174447] >> > Resolution-fitting X^2 = 1054.42705427 at R = [ 51908.71174447] >> > Resolution-fitting X^2 = 1054.42705427 at R = [ 51908.71174447] >> > Resolution-fitting X^2 = 3408043.11138 at R = [ 216058.47126214] >> > Resolution-fitting X^2 = 1054.8705442 at R = [ 51857.89239384] >> > Resolution-fitting X^2 = 1054.64687339 at R = [ 51883.47239909] >> > Resolution-fitting X^2 = 1054.53612148 at R = [ 51896.17634439] >> > Resolution-fitting X^2 = 1054.48119765 at R = [ 51902.48582026] >> > Resolution-fitting X^2 = 1054.4539391 at R = [ 51905.61951157] >> > Resolution-fitting X^2 = 1054.44040558 at R = [ 51907.1759188] >> > Resolution-fitting X^2 = 1054.4336851 at R = [ 51907.9489416] >> > Resolution-fitting X^2 = 1054.43034753 at R = [ 51908.33288072] >> > Resolution-fitting X^2 = 1054.42868992 at R = [ 51908.52357292] >> > Resolution-fitting X^2 = 1054.42786665 at R = [ 51908.61828465] >> > Resolution-fitting X^2 = 1054.42745775 at R = [ 51908.66532545] >> > Resolution-fitting X^2 = 1054.42725467 at R = [ 51908.68868937] >> > Resolution-fitting X^2 = 1054.4271538 at R = [ 51908.70029361] >> > Resolution-fitting X^2 = 1054.4271037 at R = [ 51908.70605713] >> > Resolution-fitting X^2 = 1054.42707882 at R = [ 51908.70891972] >> > Resolution-fitting X^2 = 1054.42706646 at R = [ 51908.71034149] >> > >> > The parameter I am adjusting 
is R (the resolution of a spectrograph), >> and as > > you can see it is doing these tiny changes which have very little >> affect on >> > the X^2 value (that is not reduced X^2, so don't worry that it isn't >> near >> > 1!) >> > >> > Is there a way to tell leastsq to stop once it gets to say 51908 (in >> this >> > example)? >> >> rescale your function and/or parameters or reduce xtol and ftol >> 1e-8 will be too small if your values are large >> >> Josef >> >> > >> > Kevin Gullikson >> > >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From L.J.Buitinck at uva.nl Mon Feb 25 14:36:34 2013 From: L.J.Buitinck at uva.nl (Lars Buitinck) Date: Mon, 25 Feb 2013 20:36:34 +0100 Subject: [SciPy-User] Hacking scipy and running the tests Message-ID: Hi all, I've submitted several patches to Scipy so far, and I've hit this problem every time: you can't import Scipy from within the source dir, so you can't run the tests without building and installing. But that takes a long time, and then when you edit a file, you either have to edit it in the installation and not forget to copy it back (without help from Git), or you must rebuild and reinstall. Is there a smarter way to do all this? How do the Scipy core developers do this?
TIA, -- Lars Buitinck Scientific programmer, ILPS University of Amsterdam From jsseabold at gmail.com Mon Feb 25 14:51:27 2013 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 25 Feb 2013 14:51:27 -0500 Subject: [SciPy-User] Hacking scipy and running the tests In-Reply-To: References: Message-ID: On Mon, Feb 25, 2013 at 2:36 PM, Lars Buitinck wrote: > Hi all, > > I've submitted several patches to Scipy so far, am I've hit this > problem every time: you can't import Scipy from within the source dir, > so you can't run the tests without building and installing. But that > takes a long time, and then when you edit a file, you either have to > edit it in the installation and not forget to copy it back (without > help from Git), or you must rebuild and reinstall. > > Is there a smarter way to do all this? How do the Scipy core developers do > this? > > AFAIK, you don't have to rebuild unless you're editing the C/Cython sources. In this case I often use (or make) the existing subpackage setup.py to only rebuild what I need. If you've built the source in place

    python setup.py build_ext --inplace

then you can just edit the python sources and run nosetests in the source directory without rebuilding. Alternatively, I think you can use

    python setup.py develop

which I believe builds in place and adds the source to your python path, though I rarely do this. I'd be interested to hear if there are better ways. Skipper -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ralf.gommers at gmail.com Mon Feb 25 15:17:40 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 25 Feb 2013 21:17:40 +0100 Subject: [SciPy-User] Hacking scipy and running the tests In-Reply-To: References: Message-ID: On Mon, Feb 25, 2013 at 8:51 PM, Skipper Seabold wrote: > On Mon, Feb 25, 2013 at 2:36 PM, Lars Buitinck wrote: > >> Hi all, >> >> I've submitted several patches to Scipy so far, am I've hit this >> problem every time: you can't import Scipy from within the source dir, >> so you can't run the tests without building and installing. But that >> takes a long time, and then when you edit a file, you either have to >> edit it in the installation and not forget to copy it back (without >> help from Git), or you must rebuild and reinstall. >> >> Is there a smarter way to do all this? How do the Scipy core developers >> do this? >> >> > AFAIK, you don't have to rebuild unless you're editing the C/Cython > sources. In this case I often use (or make) the existing subpackage > setup.py to only rebuild what I need. > > If you've built the source in place > > python setup.py build_ext --inplace > In-place build is what I use as well. Plus a second git repo (which pulls from upstream and my main repo) for multi-python-version testing of PRs and my own changes with "tox -e py27,py33". Ralf > then you can just edit the python sources and run nosetests in the source > directory without rebuilding. Alternatively, I think you can use > > python setup.py develop > > which I believe builds in place and adds the source to your python path, > though I rarely do this. > > I'd be interested to hear if there are better ways. > > Skipper > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sguo at bournemouth.ac.uk Tue Feb 26 05:09:06 2013 From: sguo at bournemouth.ac.uk (SHIHUI GUO) Date: Tue, 26 Feb 2013 10:09:06 +0000 Subject: [SciPy-User] _dop.error: failed in processing argument list for call-back fcn. Message-ID: Hi all, I want to use scipy to implement an oscillator, i.e. to solve a second-order ordinary differential equation. My code is:
===================================
from scipy.integrate import ode

y0, t0 = [0, 1], 0

def fun(t, y, params):
    rou = 1
    omega = 10
    sigma = 1
    # convergence rate, ie, lambda
    conrate = 10
    temp = -conrate*((y[0]^2+y[1]^2)/rou^2-sigma)
    dy = temp*y[0] - omega*y[1]
    ddy = omega*y[0] + temp*y[1]
    return [dy, ddy]

test = ode(fun).set_integrator('dopri5')
test.set_initial_value(y0, t0)
t1 = 10
dt = 0.1

while test.successful() and test.t < t1:
    test.integrate(test.t+dt)
    print test.t, test.yy
===================================

===================================
The error says:
_dop.error: failed in processing argument list for call-back fcn.
File "/home/shepherd/python/research/testode.py", line 23, in <module>
    test.integrate(test.t+dt)
File "/home/shepherd/epd/epd_free-7.3-2-rh5-x86/lib/python2.7/site-packages/scipy/integrate/_ode.py", line 333, in integrate
    self.f_params, self.jac_params)
File "/home/shepherd/epd/epd_free-7.3-2-rh5-x86/lib/python2.7/site-packages/scipy/integrate/_ode.py", line 827, in run
    tuple(self.call_args) + (f_params,)))
===================================
Previously I used the Ubuntu default Python, with scipy 0.9.0. Some thread says it is a bug that was fixed in 0.10.0, so I switched to Enthought; the scipy there is the newest version, but the error remains. Thanks for any help. Shihui -- * --------------------------------------------------------------------------------------- * SHIHUI GUO National Center for Computer Animation Bournemouth University United Kingdom BU is a Disability Two Ticks Employer and has signed up to the Mindful Employer charter. Information about the accessibility of University buildings can be found on the BU DisabledGo webpages [ http://www.disabledgo.com/en/org/bournemouth-university ] This email is intended only for the person to whom it is addressed and may contain confidential information.
If you have received this email in error, please notify the sender and delete this email, which must not be copied, distributed or disclosed to any other person. Any views or opinions presented are solely those of the author and do not necessarily represent those of Bournemouth University or its subsidiary companies. Nor can any contract be formed on behalf of the University or its subsidiary companies via email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at gmail.com Tue Feb 26 06:16:01 2013 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Tue, 26 Feb 2013 06:16:01 -0500 Subject: [SciPy-User] _dop.error: failed in processing argument list for call-back fcn. In-Reply-To: References: Message-ID: On 2/26/13, SHIHUI GUO wrote:
> HI all,
>
> I want to use scipy to implement an oscillator, ie. solving a secod-order
> ordinary differential equation, my code is:
> ===================================
> from scipy.integrate import ode
>
> y0,t0 = [0, 1], 0
>
> def fun(t, y, params):
>     rou = 1
>     omega = 10
>     sigma = 1
>     # convergence rate, ie, lambda
>     conrate = 10
>     temp = -conrate*((y[0]^2+y[1]^2)/rou^2-sigma)
>     dy = temp*y[0] - omega*y[1]
>     ddy = omega*y[0] + temp*y[1]
>     return [dy, ddy]
>
> test = ode(fun).set_integrator('dopri5')
> test.set_initial_value(y0, t0)
> t1 = 10
> dt = 0.1
>
> while test.successful() and test.t < t1:
>     test.integrate(test.t+dt)
>     print test.t, test.yy
> ===================================
>
> ===================================
> The error says:
> _dop.error: failed in processing argument list for call-back fcn.
> File "/home/shepherd/python/research/testode.py", line 23, in <module>
>     test.integrate(test.t+dt)
> File "/home/shepherd/epd/epd_free-7.3-2-rh5-x86/lib/python2.7/site-packages/scipy/integrate/_ode.py", line 333, in integrate
>     self.f_params, self.jac_params)
> File "/home/shepherd/epd/epd_free-7.3-2-rh5-x86/lib/python2.7/site-packages/scipy/integrate/_ode.py", line 827, in run
>     tuple(self.call_args) + (f_params,)))
> ===================================
>
> Previously I use the ubuntu default python, and scipy is 0.9.0. Some thread
> says it is a bug and has been fixed in 0.10.0, so I switched to enthought,
> now the scipy is newest version, but the error remains.
>
> Thanks for any help.
>
> Shihui
>

There are a few problems in your code. You haven't told the `ode` object that your function accepts the extra argument `params`. Normally you would do this with `test.set_f_params(...)`, but since your function doesn't actually use `params`, it is simpler to just change the function signature to

    def fun(t, y):

Next, in Python, the operator to raise a value to a power is **, not ^, so the formula for `temp` should be:

    temp = -conrate*((y[0]**2+y[1]**2)/rou**2-sigma)

Finally, the attribute for the solution is `y`, not `yy`, so the last line should be:

    print test.t, test.y

Cheers, Warren > -- > * > --------------------------------------------------------------------------------------- > * > > SHIHUI GUO > National Center for Computer Animation > Bournemouth University > United Kingdom > > BU is a Disability Two Ticks Employer and has signed up to the Mindful > Employer charter. Information about the accessibility of University > buildings can be found on the BU DisabledGo webpages [ > http://www.disabledgo.com/en/org/bournemouth-university ] > This email is intended only for the person to whom it is addressed and may > contain confidential information.
> If you have received this email in error, > please notify the sender and delete this email, which must not be copied, > distributed or disclosed to any other person. > Any views or opinions presented are solely those of the author and do not > necessarily represent those of Bournemouth University or its subsidiary > companies. Nor can any contract be formed on behalf of the University or its > subsidiary companies via email. > > > From sp4g10 at soton.ac.uk Mon Feb 25 16:45:04 2013 From: sp4g10 at soton.ac.uk (pal s. (sp4g10)) Date: Mon, 25 Feb 2013 21:45:04 +0000 Subject: [SciPy-User] Missing files Message-ID: <50285F0A7296A04C8514FFBB6BE05DCA022BF54D@UOS-MSG00039-SI.soton.ac.uk> Hello, Installing Scipy on Cygwin's Python fails, and none of the cases found in the install.txt matches our problem. After downloading the tarball, and attempting to install, there are some errors regarding missing modules, namely extension modules, and the ldd command reports that the following doesn't exist:

    $ ldd /path/to/ext_module.so
    ldd: /path/to/ext_module.so: No such file or directory

There is no INSTALLDIR folder to be found anywhere. The scipy tarball came from the folder: 0.12.0b1, and is named "scipy-0.12.0b1.tar.gz" This is the output when attempting to install scipy: Kanol PAL at KanolPAL-PC /scipy-0.12.0b1 $ python setup.py install Running from scipy source directory.
blas_opt_info: blas_mkl_info: libraries mkl,vml,guide not found in /usr/local/lib libraries mkl,vml,guide not found in /usr/lib NOT AVAILABLE atlas_blas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib libraries ptf77blas,ptcblas,atlas not found in /usr/lib NOT AVAILABLE atlas_blas_info: libraries f77blas,cblas,atlas not found in /usr/local/lib libraries f77blas,cblas,atlas not found in /usr/lib NOT AVAILABLE /usr/lib/python2.6/site-packages/numpy/distutils/system_info.py:1425: UserWarning: Atlas (http://math-atlas.sourceforge.net/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [atlas]) or by setting the ATLAS environment variable. warnings.warn(AtlasNotFoundError.__doc__) blas_info: libraries blas not found in /usr/local/lib FOUND: libraries = ['blas'] library_dirs = ['/usr/lib'] language = f77 FOUND: libraries = ['blas'] library_dirs = ['/usr/lib'] define_macros = [('NO_ATLAS_INFO', 1)] language = f77 lapack_opt_info: lapack_mkl_info: mkl_info: libraries mkl,vml,guide not found in /usr/local/lib libraries mkl,vml,guide not found in /usr/lib NOT AVAILABLE NOT AVAILABLE atlas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib libraries lapack_atlas not found in /usr/local/lib libraries ptf77blas,ptcblas,atlas not found in /usr/lib libraries lapack_atlas not found in /usr/lib numpy.distutils.system_info.atlas_threads_info NOT AVAILABLE atlas_info: libraries f77blas,cblas,atlas not found in /usr/local/lib libraries lapack_atlas not found in /usr/local/lib libraries f77blas,cblas,atlas not found in /usr/lib libraries lapack_atlas not found in /usr/lib numpy.distutils.system_info.atlas_info NOT AVAILABLE /usr/lib/python2.6/site-packages/numpy/distutils/system_info.py:1340: UserWarning: Atlas (http://math-atlas.sourceforge.net/) libraries not found. 
Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [atlas]) or by setting the ATLAS environment variable. warnings.warn(AtlasNotFoundError.__doc__) lapack_info: libraries lapack not found in /usr/local/lib FOUND: libraries = ['lapack'] library_dirs = ['/usr/lib'] language = f77 FOUND: libraries = ['lapack', 'blas'] library_dirs = ['/usr/lib'] define_macros = [('NO_ATLAS_INFO', 1)] language = f77 umfpack_info: libraries umfpack not found in /usr/local/lib amd_info: libraries amd not found in /usr/local/lib FOUND: libraries = ['amd'] library_dirs = ['/usr/lib'] swig_opts = ['-I/usr/include/suitesparse'] define_macros = [('SCIPY_AMD_H', None)] include_dirs = ['/usr/include/suitesparse'] FOUND: libraries = ['umfpack', 'amd'] library_dirs = ['/usr/lib'] swig_opts = ['-I/usr/include/suitesparse', '-I/usr/include/suitesparse'] define_macros = [('SCIPY_UMFPACK_H', None), ('SCIPY_AMD_H', None)] include_dirs = ['/usr/include/suitesparse'] running install running build running config_cc unifing config_cc, config, build_clib, build_ext, build commands --compiler options running config_fc unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options running build_src build_src building py_modules sources building library "dfftpack" sources building library "fftpack" sources building library "linpack_lite" sources building library "mach" sources building library "quadpack" sources building library "odepack" sources building library "dop" sources building library "fitpack" sources building library "odrpack" sources building library "minpack" sources building library "rootfind" sources building library "superlu_src" sources building library "arpack_scipy" sources building library "sc_c_misc" sources building library "sc_cephes" sources building library "sc_mach" sources building library "sc_amos" sources building library "sc_cdf" sources building library "sc_specfun" sources building library "statlib" 
sources building extension "scipy.cluster._vq" sources building extension "scipy.cluster._hierarchy_wrap" sources building extension "scipy.fftpack._fftpack" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. building extension "scipy.fftpack.convolve" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. building extension "scipy.integrate._quadpack" sources building extension "scipy.integrate._odepack" sources building extension "scipy.integrate.vode" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. building extension "scipy.integrate.lsoda" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. building extension "scipy.integrate._dop" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. building extension "scipy.interpolate.interpnd" sources building extension "scipy.interpolate._fitpack" sources building extension "scipy.interpolate.dfitpack" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. adding 'build/src.cygwin-1.7.17-i686-2.6/scipy/interpolate/src/dfitpack-f2pywrappers.f' to sources. building extension "scipy.interpolate._interpolate" sources building extension "scipy.io.matlab.streams" sources building extension "scipy.io.matlab.mio_utils" sources building extension "scipy.io.matlab.mio5_utils" sources building extension "scipy.lib.blas.fblas" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. 
adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. adding 'build/src.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/lib/blas/fblas-f2pywrappers.f' to sources. building extension "scipy.lib.blas.cblas" sources adding 'build/src.cygwin-1.7.17-i686-2.6/scipy/lib/blas/cblas.pyf' to sources. f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. building extension "scipy.lib.lapack.flapack" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. building extension "scipy.lib.lapack.clapack" sources adding 'build/src.cygwin-1.7.17-i686-2.6/scipy/lib/lapack/clapack.pyf' to sources. f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. building extension "scipy.lib.lapack.calc_lwork" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. building extension "scipy.linalg._fblas" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. adding 'build/src.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_fblas-f2pywrappers.f' to sources. building extension "scipy.linalg._flapack" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. adding 'build/src.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flapack-f2pywrappers.f' to sources. building extension "scipy.linalg._flinalg" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. 
building extension "scipy.linalg.calc_lwork" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. building extension "scipy.odr.__odrpack" sources building extension "scipy.optimize._minpack" sources building extension "scipy.optimize._zeros" sources building extension "scipy.optimize._lbfgsb" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. building extension "scipy.optimize.moduleTNC" sources building extension "scipy.optimize._cobyla" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. building extension "scipy.optimize.minpack2" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. building extension "scipy.optimize._slsqp" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. building extension "scipy.optimize._nnls" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. building extension "scipy.signal.sigtools" sources building extension "scipy.signal._spectral" sources building extension "scipy.signal.spline" sources building extension "scipy.sparse.linalg.isolve._iterative" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. building extension "scipy.sparse.linalg.dsolve._superlu" sources building extension "scipy.sparse.linalg.dsolve.umfpack.__umfpack" sources adding 'scipy/sparse/linalg/dsolve/umfpack/umfpack.i' to sources. 
building extension "scipy.sparse.linalg.eigen.arpack._arpack" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. adding 'build/src.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/sparse/linalg/eigen/arpack/_arpack-f2pywrappers.f' to sources. building extension "scipy.sparse.sparsetools._csr" sources building extension "scipy.sparse.sparsetools._csc" sources building extension "scipy.sparse.sparsetools._coo" sources building extension "scipy.sparse.sparsetools._bsr" sources building extension "scipy.sparse.sparsetools._dia" sources building extension "scipy.sparse.sparsetools._csgraph" sources building extension "scipy.sparse.csgraph._shortest_path" sources building extension "scipy.sparse.csgraph._traversal" sources building extension "scipy.sparse.csgraph._min_spanning_tree" sources building extension "scipy.sparse.csgraph._tools" sources building extension "scipy.spatial.qhull" sources building extension "scipy.spatial.ckdtree" sources building extension "scipy.spatial._distance_wrap" sources building extension "scipy.special.specfun" sources f2py options: ['--no-wrap-functions'] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. building extension "scipy.special._ufuncs" sources building extension "scipy.special._ufuncs_cxx" sources building extension "scipy.stats.statlib" sources f2py options: ['--no-wrap-functions'] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. building extension "scipy.stats.vonmises_cython" sources building extension "scipy.stats._rank" sources building extension "scipy.stats.futil" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. 
building extension "scipy.stats.mvn" sources f2py options: [] adding 'build/src.cygwin-1.7.17-i686-2.6/fortranobject.c' to sources. adding 'build/src.cygwin-1.7.17-i686-2.6' to include_dirs. adding 'build/src.cygwin-1.7.17-i686-2.6/scipy/stats/mvn-f2pywrappers.f' to sources. building extension "scipy.ndimage._nd_image" sources building data_files sources build_src: building npy-pkg config files running build_py copying scipy/version.py -> build/lib.cygwin-1.7.17-i686-2.6/scipy copying build/src.cygwin-1.7.17-i686-2.6/scipy/__config__.py -> build/lib.cygwin-1.7.17-i686-2.6/scipy running build_clib customize UnixCCompiler customize UnixCCompiler using build_clib customize GnuFCompiler Found executable /usr/bin/g77 gnu: no Fortran 90 compiler found gnu: no Fortran 90 compiler found customize GnuFCompiler gnu: no Fortran 90 compiler found gnu: no Fortran 90 compiler found customize GnuFCompiler using build_clib running build_ext customize UnixCCompiler customize UnixCCompiler using build_ext extending extension 'scipy.sparse.linalg.dsolve._superlu' defined_macros with [('USE_VENDOR_BLAS', 1)] customize UnixCCompiler customize UnixCCompiler using build_ext customize GnuFCompiler gnu: no Fortran 90 compiler found gnu: no Fortran 90 compiler found customize GnuFCompiler gnu: no Fortran 90 compiler found gnu: no Fortran 90 compiler found customize GnuFCompiler using build_ext building 'scipy.linalg._flinalg' extension compiling C sources C compiler: gcc -fno-strict-aliasing -g -O2 -pipe -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes compile options: '-DNO_ATLAS_INFO=1 -Ibuild/src.cygwin-1.7.17-i686-2.6 -I/usr/lib/python2.6/site-packages/numpy/core/include -I/usr/include/python2.6 -c' gcc: build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.c gcc: build/src.cygwin-1.7.17-i686-2.6/fortranobject.c compiling Fortran sources Fortran f77 compiler: /usr/bin/g77 -g -Wall -O3 -funroll-loops compile options: '-DNO_ATLAS_INFO=1 -Ibuild/src.cygwin-1.7.17-i686-2.6 
-I/usr/lib/python2.6/site-packages/numpy/core/include -I/usr/include/python2.6 -c' g77:f77: scipy/linalg/src/det.f g77:f77: scipy/linalg/src/lu.f /usr/bin/g77 -g -Wall -g -Wall -shared build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/fortranobject.o build/temp.cygwin-1.7.17-i686-2.6/scipy/linalg/src/det.o build/temp.cygwin-1.7.17-i686-2.6/scipy/linalg/src/lu.o -L/usr/lib -L/usr/lib/gcc/i686-pc-cygwin/3.4.4 -L/usr/lib/python2.6/config -Lbuild/temp.cygwin-1.7.17-i686-2.6 -llapack -lblas -lpython2.6 -lg2c -o build/lib.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalg.dll build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0xac): undefined reference to `_ddet_c_' build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x164): undefined reference to `_ddet_r_' build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x21c): undefined reference to `_sdet_c_' build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x2d4): undefined reference to `_sdet_r_' build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x38c): undefined reference to `_zdet_c_' build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x444): undefined reference to `_zdet_r_' build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x4fc): undefined reference to `_cdet_c_' build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x5b4): undefined reference to `_cdet_r_' 
build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x66c): undefined reference to `_dlu_c_' build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x724): undefined reference to `_zlu_c_' build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x7dc): undefined reference to `_slu_c_' build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x894): undefined reference to `_clu_c_' collect2: ld returned 1 exit status build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0xac): undefined reference to `_ddet_c_' build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x164): undefined reference to `_ddet_r_' build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x21c): undefined reference to `_sdet_c_' build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x2d4): undefined reference to `_sdet_r_' build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x38c): undefined reference to `_zdet_c_' build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x444): undefined reference to `_zdet_r_' build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x4fc): undefined reference to `_cdet_c_' build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x5b4): undefined reference to `_cdet_r_' 
build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x66c): undefined reference to `_dlu_c_' build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x724): undefined reference to `_zlu_c_' build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x7dc): undefined reference to `_slu_c_' build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o:_flinalgmodule.c:(.data+0x894): undefined reference to `_clu_c_' collect2: ld returned 1 exit status error: Command "/usr/bin/g77 -g -Wall -g -Wall -shared build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalgmodule.o build/temp.cygwin-1.7.17-i686-2.6/build/src.cygwin-1.7.17-i686-2.6/fortranobject.o build/temp.cygwin-1.7.17-i686-2.6/scipy/linalg/src/det.o build/temp.cygwin-1.7.17-i686-2.6/scipy/linalg/src/lu.o -L/usr/lib -L/usr/lib/gcc/i686-pc-cygwin/3.4.4 -L/usr/lib/python2.6/config -Lbuild/temp.cygwin-1.7.17-i686-2.6 -llapack -lblas -lpython2.6 -lg2c -o build/lib.cygwin-1.7.17-i686-2.6/scipy/linalg/_flinalg.dll" failed with exit status 1 Kanol PAL at KanolPAL-PC /scipy-0.12.0b1 $ python -c 'from numpy.f2py.diagnose import run; run()' ------ os.name='posix' ------ sys.platform='cygwin' ------ sys.version: 2.6.8 (unknown, Jun 9 2012, 11:30:32) [GCC 4.5.3] ------ sys.prefix: /usr ------ sys.path=':/usr/lib/python26.zip:/usr/lib/python2.6:/usr/lib/python2.6/plat-cygwin:/usr/lib/python2.6/lib-tk:/usr/lib/python2.6/lib-old:/usr/lib/python2.6/lib-dynload:/usr/lib/python2.6/site-packages:/usr/lib/python2.6/site-packages/PIL:/usr/lib/python2.6/site-packages/gtk-2.0' ------ Found new numpy version '1.6.2' in /usr/lib/python2.6/site-packages/numpy/__init__.pyc Found f2py2e version '2' in /usr/lib/python2.6/site-packages/numpy/f2py/f2py2e.pyc Found numpy.distutils 
version '0.4.0' in '/usr/lib/python2.6/site-packages/numpy/distutils/__init__.pyc' ------ Importing numpy.distutils.fcompiler ... ok ------ Checking availability of supported Fortran compilers: GnuFCompiler instance properties: archiver = ['/usr/bin/g77', '-cr'] compile_switch = '-c' compiler_f77 = ['/usr/bin/g77', '-g', '-Wall', '-O3', '-funroll-loops'] compiler_f90 = None compiler_fix = None libraries = ['g2c'] library_dirs = ['/usr/lib/gcc/i686-pc-cygwin/3.4.4'] linker_exe = ['/usr/bin/g77', '-g', '-Wall', '-g', '-Wall'] linker_so = ['/usr/bin/g77', '-g', '-Wall', '-g', '-Wall', '- shared'] object_switch = '-o ' ranlib = ['/usr/bin/g77'] version = LooseVersion ('3.4.4') version_cmd = ['/usr/bin/g77', '--version'] Gnu95FCompiler instance properties: archiver = ['/usr/bin/gfortran', '-cr'] compile_switch = '-c' compiler_f77 = ['/usr/bin/gfortran', '-Wall', '-ffixed-form', '-O3', '- funroll-loops'] compiler_f90 = ['/usr/bin/gfortran', '-Wall', '-O3', '-funroll-loops'] compiler_fix = ['/usr/bin/gfortran', '-Wall', '-ffixed-form', '-Wall', '-O3', '-funroll-loops'] libraries = ['gfortran'] library_dirs = ['/usr/lib/gcc/i686-pc-cygwin/4.5.3'] linker_exe = ['/usr/bin/gfortran', '-Wall', '-Wall'] linker_so = ['/usr/bin/gfortran', '-Wall', '-Wall', '-shared'] object_switch = '-o ' ranlib = ['/usr/bin/gfortran'] version = LooseVersion ('4.5.3') version_cmd = ['/usr/bin/gfortran', '--version'] Fortran compilers found: --fcompiler=gnu GNU Fortran 77 compiler (3.4.4) --fcompiler=gnu95 GNU Fortran 95 compiler (4.5.3) Compilers available for this platform, but not found: --fcompiler=absoft Absoft Corp Fortran Compiler --fcompiler=compaqv DIGITAL or Compaq Visual Fortran Compiler --fcompiler=g95 G95 Fortran Compiler --fcompiler=intelev Intel Visual Fortran Compiler for Itanium apps --fcompiler=intelv Intel Visual Fortran Compiler for 32-bit apps Compilers not available on this platform: --fcompiler=compaq Compaq Fortran Compiler --fcompiler=hpux HP Fortran 90 Compiler 
--fcompiler=ibm IBM XL Fortran Compiler --fcompiler=intel Intel Fortran Compiler for 32-bit apps --fcompiler=intele Intel Fortran Compiler for Itanium apps --fcompiler=intelem Intel Fortran Compiler for 64-bit apps --fcompiler=intelvem Intel Visual Fortran Compiler for 64-bit apps --fcompiler=lahey Lahey/Fujitsu Fortran 95 Compiler --fcompiler=mips MIPSpro Fortran Compiler --fcompiler=nag NAGWare Fortran 95 Compiler --fcompiler=none Fake Fortran compiler --fcompiler=pathf95 PathScale Fortran Compiler --fcompiler=pg Portland Group Fortran Compiler --fcompiler=sun Sun or Forte Fortran 95 Compiler --fcompiler=vast Pacific-Sierra Research Fortran 90 Compiler For compiler details, run 'config_fc --verbose' setup command. ------ Importing numpy.distutils.cpuinfo ... ok ------ CPU information: CPUInfoBase__get_nbits getNCPUs has_mmx has_sse has_sse2 has_sse3 has_ssse3 is_32bit is_Intel is_Pentium is_i686 ------ $ python -c 'import os,sys;print os.name,sys.platform' posix cygwin $ uname -a CYGWIN_NT-6.0 KanolPAL-PC 1.7.17(0.262/5/3) 2012-10-19 14:39 i686 Cygwin $ gcc -v Using built-in specs. 
COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/lib/gcc/i686-pc-cygwin/4.5.3/lto-wrapper.exe Target: i686-pc-cygwin Configured with: /gnu/gcc/releases/respins/4.5.3-3/gcc4-4.5.3-3/src/gcc-4.5.3/configure --srcdir=/gnu/gcc/releases/respins/4.5.3-3/gcc4-4.5.3-3/src/gcc-4.5.3 --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --libexecdir=/usr/lib --datadir=/usr/share --localstatedir=/var --sysconfdir=/etc --datarootdir=/usr/share --docdir=/usr/share/doc/gcc4 -C --datadir=/usr/share --infodir=/usr/share/info --mandir=/usr/share/man -v --with-gmp=/usr --with-mpfr=/usr --enable-bootstrap --enable-version-specific-runtime-libs --libexecdir=/usr/lib --enable-static --enable-shared --enable-shared-libgcc --disable-__cxa_atexit --with-gnu-ld --with-gnu-as --with-dwarf2 --disable-sjlj-exceptions --enable-languages=ada,c,c++,fortran,java,lto,objc,obj-c++ --enable-graphite --enable-lto --enable-java-awt=gtk --disable-symvers --enable-libjava --program-suffix=-4 --enable-libgomp --enable-libssp --enable-libada --enable-threads=posix --with-arch=i686 --with-tune=generic --enable-libgcj-sublibs CC=gcc-4 CXX=g++-4 CC_FOR_TARGET=gcc-4 CXX_FOR_TARGET=g++-4 GNATMAKE_FOR_TARGET=gnatmake GNATBIND_FOR_TARGET=gnatbind --with-ecj-jar=/usr/share/java/ecj.jar Thread model: posix gcc version 4.5.3 (GCC) $ g77 --version GNU Fortran (GCC) 3.4.4 (cygming special, gdc 0.12, using dmd 0.125) Copyright (C) 2004 Free Software Foundation, Inc. GNU Fortran comes with NO WARRANTY, to the extent permitted by law. You may redistribute copies of GNU Fortran under the terms of the GNU General Public License. For more information about these matters, see the file named COPYING or type the command `info -f g77 Copying'. $ python -c 'import sys;print sys.version' 2.6.8 (unknown, Jun 9 2012, 11:30:32) [GCC 4.5.3] $ python -c 'import numpy;print numpy.__version__' 1.6.2 ___________________ What might be the mistake on our part? Is ATLAS necessary for the installation?
Thank you for your advice, Regards, Sullivan From calvin.r.robinson at nasa.gov Tue Feb 26 09:56:47 2013 From: calvin.r.robinson at nasa.gov (Robinson, Calvin R. (GRC-VM00)) Date: Tue, 26 Feb 2013 08:56:47 -0600 Subject: [SciPy-User] Ubuntu 12.04 Scipy and Numpy Versions Outdated Message-ID: <2900D4FF9C4FCC419304A58D4F08666D0491CC0512@NDJSSCC08.ndc.nasa.gov> My current numpy and scipy versions are 1.6.1 and 0.9. I was going to update them using Ubuntu's package manager but it turns out for 12.04, those are the latest versions. Are there other repositories I can use to update these packages? Thanks Calvin -------------- next part -------------- An HTML attachment was scrubbed... URL: From sguo at bournemouth.ac.uk Tue Feb 26 11:17:36 2013 From: sguo at bournemouth.ac.uk (SHIHUI GUO) Date: Tue, 26 Feb 2013 16:17:36 +0000 Subject: [SciPy-User] _dop.error: failed in processing argument list for call-back fcn. In-Reply-To: References: Message-ID: Hi Warren, Thanks for this. The major issue is that I declared "params" in the function definition but didn't pass it when calling. Another question: when we do ========================== while test.successful() and test.t < t1: ... ========================== wrote: > On 2/26/13, SHIHUI GUO wrote: > > HI all, > > > > I want to use scipy to implement an oscillator, ie.
solving a second-order > ordinary differential equation, my code is: > =================================== > from scipy.integrate import ode > > y0,t0 = [0, 1], 0 > > def fun(t, y, params): > rou = 1 > omega = 10 > sigma = 1 > # convergence rate, ie, lambda > conrate = 10 > temp = -conrate*((y[0]^2+y[1]^2)/rou^2-sigma) > dy = temp*y[0] - omega*y[1] > ddy = omega*y[0] + temp*y[1] > return [dy, ddy] > > test = ode(fun).set_integrator('dopri5') > test.set_initial_value(y0, t0) > t1 = 10 > dt = 0.1 > > while test.successful() and test.t < t1: > test.integrate(test.t+dt) > > print test.t, test.yy > =================================== > > =================================== > The error says: > _dop.error: failed in processing argument list for call-back fcn. > File "/home/shepherd/python/research/testode.py", line 23, in <module> > test.integrate(test.t+dt) > File > > "/home/shepherd/epd/epd_free-7.3-2-rh5-x86/lib/python2.7/site-packages/scipy/integrate/_ode.py", > line 333, in integrate > self.f_params, self.jac_params) > File > > "/home/shepherd/epd/epd_free-7.3-2-rh5-x86/lib/python2.7/site-packages/scipy/integrate/_ode.py", > line 827, in run > tuple(self.call_args) + (f_params,))) > =================================== > > Previously I used the Ubuntu default Python, with scipy 0.9.0. Some > thread > says it is a bug that has been fixed in 0.10.0, so I switched to > Enthought; > now scipy is the newest version, but the error remains. > > Thanks for any help. > > Shihui > > > There are a few problems in your code. > > You haven't told the `ode` object that your function accepts the extra > argument `params`.
Normally you would do this with > `test.set_f_params(...)`, but since your function doesn't actually use > `params`, it is simpler to just change the function signature to > > def fun(t, y): > > Next, in Python, the operator to raise a value to a power is **, not > ^, so the formula for `temp` should be: > > temp = -conrate*((y[0]**2+y[1]**2)/rou**2-sigma) > > Finally, the attribute for the solution is `y`, not `yy`, so the last > line should be: > > print test.t, test.y > > > Cheers, > > Warren > > > > -- > > * > > > --------------------------------------------------------------------------------------- > > * > > > > SHIHUI GUO > > National Center for Computer Animation > > Bournemouth University > > United Kingdom > > > > BU is a Disability Two Ticks Employer and has signed up to the Mindful > > Employer charter. Information about the accessibility of University > > buildings can be found on the BU DisabledGo webpages [ > > http://www.disabledgo.com/en/org/bournemouth-university ] > > This email is intended only for the person to whom it is addressed and > may > > contain confidential information. If you have received this email in > error, > > please notify the sender and delete this email, which must not be copied, > > distributed or disclosed to any other person. > > Any views or opinions presented are solely those of the author and do not > > necessarily represent those of Bournemouth University or its subsidiary > > companies. Nor can any contract be formed on behalf of the University or > its > > subsidiary companies via email. 
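Putting the three fixes above together (drop the unused `params` argument from the signature, use `**` rather than `^` for powers, and read the solution from the `y` attribute), the oscillator integrates cleanly. A sketch, reusing the original post's variable names but with Python 3 print syntax (the final-radius check is mine, not from the thread):

```python
import numpy as np
from scipy.integrate import ode

def fun(t, y):
    rou = 1.0
    omega = 10.0
    sigma = 1.0
    conrate = 10.0  # convergence rate, i.e. lambda
    # ** (not ^) is Python's power operator
    temp = -conrate * ((y[0]**2 + y[1]**2) / rou**2 - sigma)
    dy = temp * y[0] - omega * y[1]
    ddy = omega * y[0] + temp * y[1]
    return [dy, ddy]

solver = ode(fun).set_integrator('dopri5')
solver.set_initial_value([0.0, 1.0], 0.0)

t1, dt = 10.0, 0.1
while solver.successful() and solver.t < t1:
    solver.integrate(solver.t + dt)

# The state lives in solver.y (not .yy); the dynamics contract onto a
# limit cycle of radius rou, so the final radius should be close to 1.
radius = np.hypot(solver.y[0], solver.y[1])
print(solver.t, radius)
```

Since the initial condition [0, 1] already lies on the unit circle and sigma = 1, temp stays near zero and the motion is essentially a pure rotation at frequency omega, which makes the radius a convenient sanity check.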
> > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > -- * --------------------------------------------------------------------------------------- * SHIHUI GUO National Center for Computer Animation Bournemouth University United Kingdom -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Wed Feb 27 09:02:44 2013 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 27 Feb 2013 09:02:44 -0500 Subject: [SciPy-User] fast autocorrelation Message-ID: I'm trying to use fft to do fast auto-correlation. (Actually, my real problem is slightly different, but I'll leave that detail out). The input sequence is complex, so a complex->complex FFT is used. A length-2N FFT is used with zero-padding to perform linear correlation. If we now take the magnitude-squared of the FFT output, we have a real sequence. A complex IFFT could be used to get the autocorrelation output, but that would be wasteful. My question is: is there a trick to compute the IFFT more efficiently when the input is real? A couple of thoughts: 1.
there is a nice trick for real->complex fft (not ifft) 2. an fft can be used to compute ifft as ifft (x) = conj (fft (conj (x))) From pierre at barbierdereuille.net Wed Feb 27 09:44:26 2013 From: pierre at barbierdereuille.net (Pierre Barbier de Reuille) Date: Wed, 27 Feb 2013 15:44:26 +0100 Subject: [SciPy-User] fast autocorrelation In-Reply-To: References: Message-ID: scipy.fftpack has two functions for that: rfft and irfft. They are both faster than fft/ifft, so maybe you want to use that? Or is it still too slow? -- Dr. Barbier de Reuille, Pierre Institute of Plant Sciences Altenbergrain 21, CH-3013 Bern, Switzerland http://www.botany.unibe.ch/associated/systemsx/index.php On 27 February 2013 15:02, Neal Becker wrote: > I'm trying to use fft to do fast auto-correlation. (Actually, my real > problem > is slightly different, but I'll leave that detail out). > > The input sequence is complex, so a complex->complex FFT is used. A > length 2N > FFT is used with 0-padding to perform linear correlation. > > If now in FFT output we take magnitude-squared, we have a real sequence. A > complex IFFT could be used to get the autocorrelation output, but that > would be > wasteful. > > My question is, is there a trick to more efficiently compute IFFT where the > input is real? > > A couple of thoughts: > 1. there is a nice trick for real->complex fft (not ifft) > 2. an fft can be used to compute ifft as ifft (x) = conj (fft (conj (x))) > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Wed Feb 27 09:48:25 2013 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 27 Feb 2013 09:48:25 -0500 Subject: [SciPy-User] fast autocorrelation References: Message-ID: Pierre Barbier de Reuille wrote: > scipy.fftpack has two functions for that: rfft and irfft. 
They are both > faster than fft/ifft, so maybe you want to use that? Or is it still too > slow? > I believe irfft will do complex->real (just as rfft does real->complex). In this case, we have frequency domain values, but they are pure real. So we want ifft from real->complex. From pierre at barbierdereuille.net Wed Feb 27 09:59:13 2013 From: pierre at barbierdereuille.net (Pierre Barbier de Reuille) Date: Wed, 27 Feb 2013 15:59:13 +0100 Subject: [SciPy-User] fast autocorrelation In-Reply-To: References: Message-ID: My bad, I hadn't read the description fully. -- Dr. Barbier de Reuille, Pierre Institute of Plant Sciences Altenbergrain 21, CH-3013 Bern, Switzerland http://www.botany.unibe.ch/associated/systemsx/index.php On 27 February 2013 15:48, Neal Becker wrote: > Pierre Barbier de Reuille wrote: > > > scipy.fftpack has two functions for that: rfft and irfft. They are both > > faster than fft/ifft, so maybe you want to use that? Or is it still too > > slow? > > > > I believe irfft will do complex->real (just as rfft does real->complex). > > In this case, we have frequency domain values, but they are pure real. So > we want ifft from real->complex. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From silva at lma.cnrs-mrs.fr Wed Feb 27 10:14:08 2013 From: silva at lma.cnrs-mrs.fr (Fabrice Silva) Date: Wed, 27 Feb 2013 16:14:08 +0100 Subject: [SciPy-User] fast autocorrelation In-Reply-To: References: Message-ID: <1361978048.8228.7.camel@laptop-101> On Wednesday 27 February 2013 at 09:02 -0500, Neal Becker wrote: > I'm trying to use fft to do fast auto-correlation. (Actually, my real problem > is slightly different, but I'll leave that detail out). > > The input sequence is complex, so a complex->complex FFT is used.
A length 2N
> FFT is used with 0-padding to perform linear correlation.
>
> If now in FFT output we take magnitude-squared, we have a real sequence. A
> complex IFFT could be used to get the autocorrelation output, but that would be
> wasteful.
>
> My question is, is there a trick to more efficiently compute IFFT where the
> input is real?
>
> A couple of thoughts:
> 1. there is a nice trick for real->complex fft (not ifft)
> 2. an fft can be used to compute ifft as ifft (x) = conj (fft (conj (x)))

For #1: np.fft.rfft

For #2: regarding your real PSD (after squaring the output of FFT), the inner conj is trivial, and the normalization by the length of the sequence is lacking:

if x is real, ifft(x) = 1/n * conj(fft(x))

so you may use for your purpose (just computing the positive-lag samples):

    1/n * np.fft.rfft(PSD).conj()

From evgeny.burovskiy at gmail.com  Wed Feb 27 10:54:44 2013
From: evgeny.burovskiy at gmail.com (Evgeni Burovski)
Date: Wed, 27 Feb 2013 15:54:44 +0000
Subject: [SciPy-User] am I using interpolate.PiecewisePolynomial correctly?
Message-ID: 

Dear All,

I'm trying to use interpolate.PiecewisePolynomial for the first time, and I'm wondering if the timings I see are not due to some simple misunderstanding of mine. Specifically, given a simple example,

$ cat pp.py
import numpy as np
from scipy.interpolate import interp1d, PiecewisePolynomial

def f(x):
    return np.tan(x)

def fprime(x):
    return 1./np.cos(x)**2

Npts = 50
grid = np.array([(np.pi/2.-0.1)*j/Npts for j in xrange(Npts+1)])

interp = interp1d(grid, f(grid), kind='cubic')
piecewise = PiecewisePolynomial(grid,
                                np.array([np.r_[f(x), fprime(x)] for x in grid]),
                                orders=3)

it looks like evaluation of a PiecewisePolynomial takes ages:

$ python -mtimeit -s"from numpy import random, pi; import pp; x_new = random.rand(1000)*pi/3." "pp.piecewise(x_new)"
100 loops, best of 3: 3.2 msec per loop
$
$ python -mtimeit -s"from numpy import random, pi; import pp; x_new = random.rand(1000)*pi/3."
"pp.interp(x_new)" 10000 loops, best of 3: 143 usec per loop (I understand the difference in functionality between the two). I'm wondering if this sort of timings is an artifact of the generality of PiecewisePolynomials, or am I just not using them properly? Best, Evgeni -------------- next part -------------- An HTML attachment was scrubbed... URL: From andreas at hilboll.de Tue Feb 26 10:01:55 2013 From: andreas at hilboll.de (Andreas Hilboll) Date: Tue, 26 Feb 2013 16:01:55 +0100 Subject: [SciPy-User] Ubuntu 12.04 Scipy and Numpy Versions Outdated In-Reply-To: <2900D4FF9C4FCC419304A58D4F08666D0491CC0512@NDJSSCC08.ndc.nasa.gov> References: <2900D4FF9C4FCC419304A58D4F08666D0491CC0512@NDJSSCC08.ndc.nasa.gov> Message-ID: <512CCE63.7020204@hilboll.de> > My current scipy and numpy versions are 1.6.1 and 0.9. I was going to > update them using Ubuntu?s package manager but it turns out for 12.04, > those are the latest versions. Are there other repositories I can use to > update these packages? Hi Calvin, I created a PPA for this purpose. See my mail at http://mail.scipy.org/pipermail/scipy-user/2013-February/034165.html. So far, I only put scipy 0.11 up there. Any help in packaging is welcome ;) Cheers, Andreas. From takowl at gmail.com Wed Feb 27 11:04:09 2013 From: takowl at gmail.com (Thomas Kluyver) Date: Wed, 27 Feb 2013 16:04:09 +0000 Subject: [SciPy-User] Ubuntu 12.04 Scipy and Numpy Versions Outdated In-Reply-To: <512CCE63.7020204@hilboll.de> References: <2900D4FF9C4FCC419304A58D4F08666D0491CC0512@NDJSSCC08.ndc.nasa.gov> <512CCE63.7020204@hilboll.de> Message-ID: On 26 February 2013 15:01, Andreas Hilboll wrote: > I created a PPA for this purpose. See my mail at > http://mail.scipy.org/pipermail/scipy-user/2013-February/034165.html. So > far, I only put scipy 0.11 up there. 
Any help in packaging is welcome ;)
>

If you're lucky, a package can be backported with a single command:

    backportpackage -u ppa:pylab/stable -s raring -d precise python-numpy

(the options being u for upload, s for source and d for destination). Of course, there might be more to do than that - e.g. in this case, I think you'll need to backport python-tz first, because it has a build dependency on python3-tz, which isn't in precise.

Thomas
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From cimrman3 at ntc.zcu.cz  Wed Feb 27 13:19:45 2013
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Wed, 27 Feb 2013 19:19:45 +0100
Subject: [SciPy-User] ANN: SfePy 2013.1
Message-ID: <512E4E41.1040906@ntc.zcu.cz>

I am pleased to announce release 2013.1 of SfePy.

Description
-----------

SfePy (simple finite elements in Python) is software for solving systems of coupled partial differential equations by the finite element method. The code is based on the NumPy and SciPy packages. It is distributed under the new BSD license.

Home page: http://sfepy.org
Downloads, mailing list, wiki: http://code.google.com/p/sfepy/
Git (source) repository, issue tracker: http://github.com/sfepy

Highlights of this release
--------------------------

- unified use of stationary and evolutionary solvers
- new implicit adaptive time stepping solver
- elements of set and nodes of set region selectors
- simplified setting of variables data

For full release notes see http://docs.sfepy.org/doc/release_notes.html#id1 (rather long and technical).

Best regards,
Robert Cimrman and Contributors (*)

(*) Contributors to this release (alphabetical order): Vladimír Lukeš, Matyáš
Novák

From jrocher at enthought.com  Wed Feb 27 17:17:33 2013
From: jrocher at enthought.com (Jonathan Rocher)
Date: Wed, 27 Feb 2013 16:17:33 -0600
Subject: [SciPy-User] [ANN] SciPy2013: Call for abstracts
Message-ID: 

[Apologies for cross-posts]

Dear all,

The annual SciPy Conference (Scientific Computing with Python) allows participants from academic, commercial, and governmental organizations to showcase their latest projects, learn from skilled users and developers, and collaborate on code development. *The deadline for abstract submissions is March 20th, 2013.*

Submissions are welcome that address general Scientific Computing with Python, one of the two special themes for this year's conference (machine learning & reproducible science), or the domain-specific mini-symposia held during the conference (Meteorology, climatology, and atmospheric and oceanic science; Astronomy and astrophysics; Medical imaging; Bio-informatics).

Please submit your abstract at the SciPy 2013 website abstract submission form. Abstracts will be accepted for posters or presentations. Optional papers to be published in the conference proceedings will be requested following abstract submission. This year the proceedings will be made available prior to the conference to help attendees navigate the conference.

We look forward to an exciting and interesting set of talks, posters, and discussions and hope to see you at the conference.

The SciPy 2013 Program Committee Chairs
Matt McCormick, Kitware, Inc.
Katy Huff, University of Wisconsin-Madison and Argonne National Laboratory
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From josef.pktd at gmail.com  Wed Feb 27 21:08:40 2013
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 27 Feb 2013 21:08:40 -0500
Subject: [SciPy-User] Power
Message-ID: 

see for example http://www.statmethods.net/stats/power.html

some almost one-liners coming soon to statsmodels

------------
import numpy as np
from scipy import stats

def chisquare_power(effect_size, nobs, n_bins, alpha=0.05, ddof=0):
    '''power of chisquare goodness of fit test

    effect size is sqrt of chisquare statistic divided by nobs

    Parameters
    ----------
    effect_size : float
        This is the deviation from the Null of the normalized chi_square
        statistic.
    nobs : int or float
        number of observations
    n_bins : int (or float)
        number of bins, or points in the discrete distribution
    alpha : float in (0,1)
        significance level of the test, default alpha=0.05

    Returns
    -------
    power : float
        power of the test at the given significance level and effect size

    Notes
    -----
    This function also works vectorized if all arguments broadcast.

    '''
    crit = stats.chi2.isf(alpha, n_bins - 1 - ddof)
    power = stats.ncx2.sf(crit, n_bins - 1 - ddof, effect_size**2 * nobs)
    return power

def chisquare_effectsize(probs0, probs1):
    '''effect size for a chisquare goodness-of-fit test

    Parameters
    ----------
    probs0 : array_like
        probabilities or cell frequencies under the Null hypothesis
    probs1 : array_like
        probabilities or cell frequencies under the Alternative hypothesis

    probs0 and probs1 need to have the same shape.
    Both probs0 and probs1 are normalized to add to one.

    Returns
    -------
    effectsize : float
        effect size of chisquare test

    '''
    probs0 = np.asarray(probs0, float)
    probs1 = np.asarray(probs1, float)
    probs0 = probs0 / probs0.sum(0)
    probs1 = probs1 / probs1.sum(0)
    return np.sqrt(((probs1 - probs0)**2 / probs0).sum(0))
------------

Josef

Brain: We must prepare for tomorrow night.
Pinky: Why? What are we going to do tomorrow night?
Brain: The same thing we do every night, Pinky - try to take over the world!
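[Editor's note: as a standalone sanity check of the two helpers above, the power computation reduces to the same two scipy.stats calls. The probability vectors and sample size below are made up for illustration; they are not from the original post.]

```python
import numpy as np
from scipy import stats

# Effect size (Cohen's w) for a 4-bin goodness-of-fit test:
# uniform null vs. a mildly skewed alternative.
probs0 = np.array([0.25, 0.25, 0.25, 0.25])  # hypothetical null
probs1 = np.array([0.30, 0.25, 0.25, 0.20])  # hypothetical alternative
effect_size = np.sqrt(((probs1 - probs0)**2 / probs0).sum())

# Power at alpha=0.05 with nobs observations: chi2 critical value under
# the null, then the survival function of the noncentral chi-square with
# noncentrality effect_size**2 * nobs at that critical value.
nobs, n_bins, alpha = 500, 4, 0.05
crit = stats.chi2.isf(alpha, n_bins - 1)
power = stats.ncx2.sf(crit, n_bins - 1, effect_size**2 * nobs)
print(effect_size, power)
```

With these numbers the effect size is sqrt(0.02), roughly 0.14, a small effect in Cohen's terms.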
From jrocher at enthought.com  Thu Feb 28 08:31:45 2013
From: jrocher at enthought.com (Jonathan Rocher)
Date: Thu, 28 Feb 2013 07:31:45 -0600
Subject: [SciPy-User] [ANN] SciPy2013 Tutorials: Call for Submissions
Message-ID: 

[Apologies for cross-posts]

Dear all,

We are excited to kick off the SciPy2013 conference with two days of tutorials. This year we are proud to expand the session to include *THREE parallel tracks*: introductory, intermediate and advanced. Teachers will receive a stipend for their service. We are accepting tutorial proposals from individuals or teams until *April 1st*. Click here for more details and to submit applications.

Looking forward to a very exciting conference!

The SciPy 2013 Tutorial Chairs
Francesc Alted, Continuum Analytics Inc.
Dharhas Pothina, Texas Water Development Board

--
Jonathan Rocher, PhD
Scientific software developer
Co-chair of SciPy2013 Conference
Enthought, Inc.
jrocher at enthought.com
1-512-536-1057
http://www.enthought.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jrocher at enthought.com  Thu Feb 28 13:29:24 2013
From: jrocher at enthought.com (Jonathan Rocher)
Date: Thu, 28 Feb 2013 12:29:24 -0600
Subject: [SciPy-User] [ANN] SciPy2013 Sponsorships: Apply Now!
Message-ID: 

[Apologies for cross-posts]

Dear all,

The SciPy2013 conference will continue the tradition of offering sponsorships to attend the conference. These sponsorships provide funding for airfare, lodging, and conference registration. This year, these sponsorships will be *open to community members rather than just students*. Applications will be judged both on merit as well as need. If you would like to apply for yourself or a worthy candidate, please note our application due date of *Monday, March 25th*. Winners will be announced on April 22nd.

Looking forward to a very exciting conference!

The SciPy 2013 Financial Aid Chairs
Jeff Daily, Pacific Northwest National Lab.
John Wiggins, Enthought Inc.
--
Jonathan Rocher, PhD
Scientific software developer
Co-chair of SciPy2013 Conference
Enthought, Inc.
jrocher at enthought.com
1-512-536-1057
http://www.enthought.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From todd.holloway at gmail.com  Thu Feb 28 01:10:24 2013
From: todd.holloway at gmail.com (Todd Holloway)
Date: Wed, 27 Feb 2013 22:10:24 -0800
Subject: [SciPy-User] Looking for speakers to talk about SciPy at SF Data Mining meetup
Message-ID: 

Hi all,

I'm looking for a SciPy expert or two to come out and give a talk about SciPy at the SF Data Mining meetup (http://www.meetup.com/Data-Mining/) in March or April. Any takers or recommendations? The meetup would be at a San Francisco venue, and our meetups usually have 200-300 people attending.

Cheers,
Todd

From josef.pktd at gmail.com  Thu Feb 28 22:44:37 2013
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 28 Feb 2013 22:44:37 -0500
Subject: [SciPy-User] optimize.brentq and step function
Message-ID: 

brentq documentation says "f must be a continuous function"

I forgot that I have a step function and tried brentq and it worked. Is this an accident or a feature?

The underlying function has a ceil and floor in it to convert the argument to integer.

example: trying to invert stats.binom_test to get a confidence interval

>>> n_rep
10000
>>> for count in np.arange(455, 460, 0.5): count, stats.binom_test(count, n_rep, p=0.05)
...
(455.0, 0.038914401175656158)
(455.5, 0.038914401175656158)
(456.0, 0.043471737392089219)
(456.5, 0.043471737392089219)
(457.0, 0.048471945345671313)
(457.5, 0.048471945345671313)
(458.0, 0.053946450095604122)
(458.5, 0.053946450095604122)
(459.0, 0.059927554205846612)
(459.5, 0.059927554205846612)
>>> def func(qi):
...     return stats.binom_test(qi * n_rep, n_rep, p=0.05) - 0.05
...
>>> qi = optimize.brentq(func, 0.01, 0.05)
>>> (qi * n_rep)
457.99999999456901
>>> stats.binom_test(np.floor(qi * n_rep), n_rep, p=0.05)
0.048471945345671313
>>> stats.binom_test((qi * n_rep), n_rep, p=0.05)
0.048471945345671313
>>> stats.binom_test(np.floor(qi * n_rep)+1, n_rep, p=0.05)
0.053946450095604122

It looks like I get an integer next to the step root.

Josef

From charlesr.harris at gmail.com  Thu Feb 28 23:43:28 2013
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 28 Feb 2013 21:43:28 -0700
Subject: [SciPy-User] optimize.brentq and step function
In-Reply-To: References: Message-ID: 

On Thu, Feb 28, 2013 at 8:44 PM, wrote:
> brentq documentation says "f must be a continuous function"
>
> I forgot that I have a step function and tried brentq and it worked.
> Is this an accident or a feature?
>

Feature, the documentation is off. Brentq falls back on bisection when it converges too slowly. And there is some subtlety in 'slowly', but it does find a point where the function changes sign; all that is required is that the function be defined everywhere on the interval and that there be a finite number of 'zeros'. If you know you have a discontinuity, plain old bisection is probably faster, but one of the best things about brentq is its generality.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From charlesr.harris at gmail.com  Thu Feb 28 23:57:27 2013
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 28 Feb 2013 21:57:27 -0700
Subject: [SciPy-User] optimize.brentq and step function
In-Reply-To: References: Message-ID: 

On Thu, Feb 28, 2013 at 9:43 PM, Charles R Harris wrote:
>
> On Thu, Feb 28, 2013 at 8:44 PM, wrote:
>
>> brentq documentation says "f must be a continuous function"
>>
>> I forgot that I have a step function and tried brentq and it worked.
>> Is this an accident or a feature?
>>
>
> Feature, the documentation is off.
Brentq falls back on bisection when it
> converges too slowly. And there is some subtlety in 'slowly', but it does
> find a point where the function changes sign; all that is required is
> that the function be defined everywhere on the interval and that there
> be a finite number of 'zeros'. If you know you have a discontinuity,
> plain old bisection is probably faster, but one of the best things about
> brentq is its generality.
>

And the finite part is wrong. Bisection will always find a `zero` if the ends have opposite signs, since the interval is halved on every iteration and it will terminate when the interval is sufficiently small. But all you know at that point is that the ends have different signs, not that the function is almost zero there. So continuity is required for the function to actually be close to zero.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From warren.weckesser at gmail.com  Thu Feb 28 23:57:42 2013
From: warren.weckesser at gmail.com (Warren Weckesser)
Date: Thu, 28 Feb 2013 23:57:42 -0500
Subject: [SciPy-User] _dop.error: failed in processing argument list for call-back fcn.
In-Reply-To: References: Message-ID: 

On 2/26/13, SHIHUI GUO wrote:
> Hi Warren,
>
> Thanks for this. The major issue is that I declared the "param" in function
> definition, but didn't pass that when calling.
>
> Another question, when we do:
> ==========================
> while test.successful() and test.t < t1:
>     test.integrate(test.t+dt)
> ==========================
> the result is returned after each time step. Is there any default way I
> could specify the time span and get the result over the whole time span,
> instead of individual integrative step?

Yes, in fact, that is what it is doing, but the loop is set up to get a value every `dt` time units. If you want just the final value, eliminate the loop, and just give the final time as the argument to test.integrate.
For example, the following computes the solution to dy/dt = -y with y(0) = 1 at time t=10:

-----
In [34]: def func(t, y):
   ....:     return -y
   ....:

In [35]: solver = ode(f=func)

In [36]: solver.set_integrator("lsoda")
Out[36]: 

In [37]: solver.set_initial_value(1.0)
Out[37]: 

In [38]: result = solver.integrate(10.0)  # Get the solution at t=10.

In [39]: result
Out[39]: array([ 4.53998024e-05])

In [40]: np.exp(-10)
Out[40]: 4.5399929762484854e-05

In [41]: solver.successful()
Out[41]: True
-----

Warren

> Thanks.
>
> Shihui
>
> On 26 February 2013 11:16, Warren Weckesser > wrote:
>
>> On 2/26/13, SHIHUI GUO wrote:
>> > HI all,
>> >
>> > I want to use scipy to implement an oscillator, ie. solving a
>> > second-order ordinary differential equation, my code is:
>> > ===================================
>> > from scipy.integrate import ode
>> >
>> > y0,t0 = [0, 1], 0
>> >
>> > def fun(t, y, params):
>> >     rou = 1
>> >     omega = 10
>> >     sigma = 1
>> >     # convergence rate, ie, lambda
>> >     conrate = 10
>> >     temp = -conrate*((y[0]^2+y[1]^2)/rou^2-sigma)
>> >     dy = temp*y[0] - omega*y[1]
>> >     ddy = omega*y[0] + temp*y[1]
>> >     return [dy, ddy]
>> >
>> > test = ode(fun).set_integrator('dopri5')
>> > test.set_initial_value(y0, t0)
>> > t1 = 10
>> > dt = 0.1
>> >
>> > while test.successful() and test.t < t1:
>> >     test.integrate(test.t+dt)
>> >     print test.t, test.yy
>> > ===================================
>> >
>> > ===================================
>> > The error says:
>> > _dop.error: failed in processing argument list for call-back fcn.
>> > File "/home/shepherd/python/research/testode.py", line 23, in >> > test.integrate(test.t+dt) >> > File >> > >> "/home/shepherd/epd/epd_free-7.3-2-rh5-x86/lib/python2.7/site-packages/scipy/integrate/_ode.py", >> > line 333, in integrate >> > self.f_params, self.jac_params) >> > File >> > >> "/home/shepherd/epd/epd_free-7.3-2-rh5-x86/lib/python2.7/site-packages/scipy/integrate/_ode.py", >> > line 827, in run >> > tuple(self.call_args) + (f_params,))) >> > =================================== >> > >> > Previously I use the ubuntu default python, and scipy is 0.9.0. Some >> thread >> > says it is a bug and has been fixed in 0.10.0, so I switched to >> enthought, >> > now the scipy is newest version, but the error remains. >> > >> > Thanks for any help. >> > >> > Shihui >> > >> >> >> There are a few problems in your code. >> >> You haven't told the `ode` object that your function accepts the extra >> argument `params`. Normally you would do this with >> `test.set_f_params(...)`, but since your function doesn't actually use >> `params`, it is simpler to just change the function signature to >> >> def fun(t, y): >> >> Next, in Python, the operator to raise a value to a power is **, not >> ^, so the formula for `temp` should be: >> >> temp = -conrate*((y[0]**2+y[1]**2)/rou**2-sigma) >> >> Finally, the attribute for the solution is `y`, not `yy`, so the last >> line should be: >> >> print test.t, test.y >> >> >> Cheers, >> >> Warren >> >> >> > -- >> > * >> > >> --------------------------------------------------------------------------------------- >> > * >> > >> > SHIHUI GUO >> > National Center for Computer Animation >> > Bournemouth University >> > United Kingdom >> > >> > BU is a Disability Two Ticks Employer and has signed up to the Mindful >> > Employer charter. 
Information about the accessibility of University >> > buildings can be found on the BU DisabledGo webpages [ >> > http://www.disabledgo.com/en/org/bournemouth-university ] >> > This email is intended only for the person to whom it is addressed and >> may >> > contain confidential information. If you have received this email in >> error, >> > please notify the sender and delete this email, which must not be >> > copied, >> > distributed or disclosed to any other person. >> > Any views or opinions presented are solely those of the author and do >> > not >> > necessarily represent those of Bournemouth University or its subsidiary >> > companies. Nor can any contract be formed on behalf of the University >> > or >> its >> > subsidiary companies via email. >> > >> > >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> > > > -- > * > --------------------------------------------------------------------------------------- > * > > SHIHUI GUO > National Center for Computer Animation > Bournemouth University > United Kingdom > > BU is a Disability Two Ticks Employer and has signed up to the Mindful > Employer charter. Information about the accessibility of University > buildings can be found on the BU DisabledGo webpages [ > http://www.disabledgo.com/en/org/bournemouth-university ] > This email is intended only for the person to whom it is addressed and may > contain confidential information. If you have received this email in error, > please notify the sender and delete this email, which must not be copied, > distributed or disclosed to any other person. > Any views or opinions presented are solely those of the author and do not > necessarily represent those of Bournemouth University or its subsidiary > companies. Nor can any contract be formed on behalf of the University or its > subsidiary companies via email. > > >
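[Editor's note: pulling Warren's three fixes together (drop the unused `params` argument, use Python's `**` power operator instead of `^`, and read the solution from the `y` attribute rather than `yy`), the corrected oscillator script from this thread looks as follows. This is a sketch, updated to use print() as a function; parameter values are as in Shihui's original post.]

```python
from scipy.integrate import ode

def fun(t, y):
    # Hopf-like oscillator with a stable limit cycle of radius rou.
    # Note ** (not ^) is Python's power operator.
    rou = 1.0
    omega = 10.0
    sigma = 1.0
    conrate = 10.0  # convergence rate, i.e. lambda
    temp = -conrate * ((y[0]**2 + y[1]**2) / rou**2 - sigma)
    dy = temp * y[0] - omega * y[1]
    ddy = omega * y[0] + temp * y[1]
    return [dy, ddy]

y0, t0 = [0, 1], 0
solver = ode(fun).set_integrator('dopri5')
solver.set_initial_value(y0, t0)

t1, dt = 10, 0.1
while solver.successful() and solver.t < t1:
    solver.integrate(solver.t + dt)
    print(solver.t, solver.y)  # the attribute is .y, not .yy
```

Since y0 = [0, 1] starts exactly on the limit cycle of radius 1, the trajectory should stay close to that circle for the whole integration.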