From gael.varoquaux at normalesup.org Tue May 1 03:08:50 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 1 May 2012 09:08:50 +0200 Subject: [SciPy-Dev] Independent T-tests with unequal variances In-Reply-To: <4F9F3853.9050600@gmail.com> References: <4F9F2A83.1070206@gmail.com> <4F9F3853.9050600@gmail.com> Message-ID: <20120501070850.GB18707@phare.normalesup.org> Hi Gavin, This is interesting. Thanks for sharing. Do you think that you could submit this as a pull request on Github? A diff on a mailing list has much more chances of getting lost in the flow. Thanks a lot, Gael On Mon, Apr 30, 2012 at 06:11:47PM -0700, Junkshops wrote: > Well, this is embarrassing. I appear to have sent the wrong diff > with a sign error in it. The correct one is attached. > My apologies, > Gavin > On 4/30/2012 5:12 PM, Junkshops wrote: > >Hello all, > >I hope, as an utter newb poking my nose into this list, that I'm > >not giving the author of Miss Manner's Book of Netiquette the > >vapours. > >This is a follow up to Deniz Turgut's recent email: > >http://article.gmane.org/gmane.comp.python.scientific.devel/16291/ > >"There is also a form of t-test for independent samples with > >different variances, also known as Welch's t-test [2]. I think it > >is better to include the 'identical variance' assumption in the > >doc to avoid confusion." > >I was just caught by this problem and heartily agree with Deniz's > >views. However, I didn't see any explicit plans to add Welch's > >test to scipy/stats/stats.py, and I needed an implementation of > >the test and so implemented it. A diff against scipy 0.10.1 is > >attached if anyone might find it useful. > >Cheers, > >Gavin > 214c214 > < 'ttest_1samp', 'ttest_ind', 'ttest_rel', > --- > > 'ttest_1samp', 'ttest_ind', 'ttest_ind_uneq_var', 'ttest_rel', > 2901c2901 > < """Calculates the T-test for the means of TWO INDEPENDENT samples of scores. > --- > > """Calculates the T-test for the means of TWO INDEPENDENT samples of scores with equal variances. > 2903c2903 > < This is a two-sided test for the null hypothesis that 2 independent samples > --- > > This is a two-sided test for the null hypothesis that 2 independent samples with equal variances > 2926c2926 > < We can use this test, if we observe two independent samples from > --- > > We can use this test, if we observe two independent samples with equal variances from > 2972c2972 > < > --- > 2988c2988 > < > --- > 2990a2991,3116 > > def ttest_ind_uneq_var(a, b, axis=0): > > """Calculates the T-test for the means of TWO INDEPENDENT samples of scores with > > unequal variances (or situations where it is unknown if the variances are equal). > > This is a two-sided test for the null hypothesis that 2 independent samples with unequal variances > > have identical average (expected) values. > > Parameters > > ---------- > > a, b : sequence of ndarrays > > The arrays must have the same shape, except in the dimension > > corresponding to `axis` (the first, by default). > > axis : int, optional > > Axis can equal None (ravel array first), or an integer (the axis > > over which to operate on a and b). > > Returns > > ------- > > t : float or array > > t-statistic > > prob : float or array > > two-tailed p-value > > Notes > > ----- > > We can use this test, if we observe two independent samples from > > the same or different population, e.g. exam scores of boys and > > girls or of two ethnic groups. The test measures whether the > > average (expected) value differs significantly across samples. 
If > > we observe a large p-value, for example larger than 0.05 or 0.1, > > then we cannot reject the null hypothesis of identical average scores. > > If the p-value is smaller than the threshold, e.g. 1%, 5% or 10%, > > then we reject the null hypothesis of equal averages. > > References > > ---------- > > http://en.wikipedia.org/wiki/T-test#Independent_two-sample_t-test > > Examples > > -------- > > >>> from scipy import stats > > >>> import numpy as np > > >>> #fix seed to get the same result > > >>> np.random.seed(12345678) > > test with sample with identical means > > >>> rvs1 = stats.norm.rvs(loc=5,scale=10,size=500) > > >>> rvs2 = stats.norm.rvs(loc=5,scale=10,size=500) > > >>> stats.ttest_ind(rvs1,rvs2) > > (0.26833823296239279, 0.78849443369564765) > > >>> stats.ttest_ind_uneq_var(rvs1, rvs2) > > (0.26833823296239279, 0.78849419539158605) > > ttest_ind underestimates p for unequal variances > > >>> rvs3 = stats.norm.rvs(loc=5, scale=20, size=500) > > >>> stats.ttest_ind(rvs1, rvs3) > > (-0.46580283298287162, 0.64145827413436174) > > >>> stats.ttest_ind_uneq_var(rvs1, rvs3) > > (-0.46580283298287162, 0.64149552307593671) > > >>> rvs4 = stats.norm.rvs(loc=5, scale=20, size=100) > > >>> stats.ttest_ind(rvs1, rvs4) > > (-0.99882539442782481, 0.31828327091038955) > > >>> stats.ttest_ind_uneq_var(rvs1, rvs4) > > (-0.69712570584654099, 0.48711638692035597) > > test with sample with different means > > >>> rvs5 = stats.norm.rvs(loc=8, scale=10, size=500) > > >>> stats.ttest_ind(rvs1, rvs5) > > (-4.130511725493573, 3.922607411074624e-05) > > >>> stats.ttest_ind_uneq_var(rvs1, rvs5) > > (-4.130511725493573, 3.9209626240360421e-05) > > >>> rvs6 = stats.norm.rvs(loc=8, scale=20, size=500) > > >>> stats.ttest_ind(rvs1, rvs6) > > (-3.8383088416156559, 0.00013167799566923922) > > >>> stats.ttest_ind_uneq_var(rvs1, rvs6) > > (-3.8383088416156559, 0.00013475714831652827) > > >>> rvs7 = stats.norm.rvs(loc=8, scale=20, size=100) > > >>> stats.ttest_ind(rvs1, rvs7) > > (-0.79821473077740479, 0.42506275883963907) > > >>> stats.ttest_ind_uneq_var(rvs1, rvs7) > > (-0.51902756162811092, 0.60475596772293294) > > """ > > a, b, axis = _chk2_asarray(a, b, axis) > > v1 = np.var(a, axis, ddof = 1) > > v2 = np.var(b, axis, ddof = 1) > > n1 = a.shape[axis] > > n2 = b.shape[axis] > > vn1 = v1 / n1 > > vn2 = v2 / n2 > > df = ((vn1 + vn2)**2) / ((vn1**2) / (n1 + 1) + (vn2**2) / (n2 + 1)) - 2 > > d = np.mean(a, axis) - np.mean(b, axis) > > t = d / np.sqrt(vn1 + vn2) > > # why is this defining t = 1 when the means and variances are equal...? 
> > #t = np.where((d==0)*(svar==0), 1.0, t) #define t=0/0 = 0, identical means > > #don't think this is really necessary, but returns t as array if not > > t = np.where((d==0), 0.0, t) #define t=0/0 = 0, identical means > > prob = distributions.t.sf(np.abs(t), df) * 2 #use np.abs to get upper tail > > #distributions.t.sf currently does not propagate nans > > #this can be dropped, if distributions.t.sf propagates nans > > #if this is removed, then prob = prob[()] needs to be removed > > prob = np.where(np.isnan(t), np.nan, prob) > > if t.ndim == 0: > > t = t[()] > > prob = prob[()] > > return t, prob > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -- Gael Varoquaux Researcher, INRIA Parietal Laboratoire de Neuro-Imagerie Assistee par Ordinateur NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France Phone: ++ 33-1-69-08-79-68 http://gael-varoquaux.info http://twitter.com/GaelVaroquaux From ralf.gommers at googlemail.com Tue May 1 15:53:59 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 1 May 2012 21:53:59 +0200 Subject: [SciPy-Dev] scipy sprint @ EuroSciPy '12? Message-ID: Hi all, Would people be interested in having a scipy sprint at EuroSciPy this year? Last year we tried to do a last minute mini-sprint and ended up hunting for wifi access for half the time, so it would be good to organize things better this time around. I'm thinking a one-day sprint on Wed Aug 22nd would be good. If there's interest, I'm happy to organize (contact the conference organizers for a room, create a wiki page, etc.). Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Wed May 2 07:00:50 2012 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 02 May 2012 07:00:50 -0400 Subject: [SciPy-Dev] google releases ceres solver Message-ID: http://google-opensource.blogspot.com/2012/05/introducing-ceres-solver- nonlinear.html From mierle at gmail.com Wed May 2 13:25:15 2012 From: mierle at gmail.com (Keir Mierle) Date: Wed, 2 May 2012 10:25:15 -0700 Subject: [SciPy-Dev] google releases ceres solver In-Reply-To: References: Message-ID: We spent some time thinking about how to integrate Python with Ceres, and our current thinking is that we should implement a CostFunction with Cython, then pass that into Ceres. If one of you could take a peek at the Ceres API and offer thoughts on how to do this, we might get started on it. Thanks, Keir On Wed, May 2, 2012 at 4:00 AM, Neal Becker wrote: > http://google-opensource.blogspot.com/2012/05/introducing-ceres-solver- > nonlinear.html > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Wed May 2 15:34:03 2012 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 2 May 2012 12:34:03 -0700 Subject: [SciPy-Dev] google releases ceres solver In-Reply-To: References: Message-ID: Hi Keir On Wed, May 2, 2012 at 10:25 AM, Keir Mierle wrote: > We spent some time thinking about how to integrate Python with Ceres, and > our current thinking is that we should implement a CostFunction with Cython, > then pass that into Ceres. If one of you could take a peek at the Ceres API > and offer thoughts on how to do this, we might get started on it. 
That should work; I have a related example (that Dag showed me) on p. 20 of my Cython slides over here: http://mentat.za.net/numpy/ay250_cython.pdf It requires having a Cython wrapper around the Ceres API, but I assume that's what you had in mind. I'm not sure how to get hold of a raw function pointer, if you needed that instead. St?fan From d.s.seljebotn at astro.uio.no Thu May 3 05:03:02 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Thu, 03 May 2012 11:03:02 +0200 Subject: [SciPy-Dev] google releases ceres solver In-Reply-To: References: Message-ID: <4FA249C6.5000401@astro.uio.no> On 05/02/2012 07:25 PM, Keir Mierle wrote: > We spent some time thinking about how to integrate Python with Ceres, > and our current thinking is that we should implement a CostFunction with > Cython, then pass that into Ceres. If one of you could take a peek at > the Ceres API and offer thoughts on how to do this, we might get started > on it. You would essentially write this in Cython: cdef class CostFunction: cpdef void evaluate(args...): raise NotImplementedError cdef int call_CostFunction_evaluate(args..., void* ctx) with gil: try: (ctx).evaluate(args...) return 0 except: # somehow save sys.exc_info to where the # Ceres entry point can find it return -1 If you do not release the GIL while inside Ceres one can remove the "with gil" part. Then you subclass the C++-CostFunction in C++, have it save the PyObject* that corresponds to the Cython-CostFunction instance, and have it call call_CostFunction_evaluate (which should likely be passed as a function pointer to a function initializing the C++ part of the wrapper). The C++ part of the wrapper can be written in a header file and included verbatim in Cython: cdef extern from "cpp_part_of_wrapper.hpp": pass Finally: For wrapping the double* arguments one may want to play with the new memoryviews in Cython 0.16. Alternatively, change "cpdef" to "cdef" and just pass the pointers along. For Python implementations of CostFunction, you likely want the double* wrapped as NumPy arrays. Constructing such arrays on each evaluate() would be rather expensive (whether this matters depends on the vector size of course), so if it is possible to somehow pass in to Ceres the buffers to use, and just hold on to the same NumPy arrays Python-side, and ignore the arguments to evaluate(), that would save some overhead. Dag From josef.pktd at gmail.com Thu May 3 08:52:57 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 3 May 2012 08:52:57 -0400 Subject: [SciPy-Dev] spam on ticket Message-ID: The problem with a responsive trac is that it's easier to get spam http://projects.scipy.org/scipy/ticket/162 and two more so far What's the procedure to remove them? Thanks, Josef From guyer at nist.gov Thu May 3 10:23:58 2012 From: guyer at nist.gov (Jonathan Guyer) Date: Thu, 3 May 2012 10:23:58 -0400 Subject: [SciPy-Dev] spam on ticket In-Reply-To: References: Message-ID: On May 3, 2012, at 8:52 AM, wrote: > The problem with a responsive trac is that it's easier to get spam > > http://projects.scipy.org/scipy/ticket/162 > and two more so far > > What's the procedure to remove them? Go to the Admin tab and look in the sidebar for Ticket System:Delete Changes. It will ask you for the ticket number and then allow you to delete the offending entries. 
We have significantly reduced the quantity of spam on the FiPy Trac by using the Akismet spam filtering system and the TracCaptcha plugin (we're using Trac 0.11; it looks like some or all of these capabilities are built-in with Trac 0.12). http://trac.edgewall.org/wiki/SpamFilter#Akismet http://www.schwarz.eu/opensource/projects/trac_captcha From jsseabold at gmail.com Thu May 3 12:35:08 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 3 May 2012 12:35:08 -0400 Subject: [SciPy-Dev] LAPACK problems with *gges wrappers Message-ID: Hi, Can someone take a look at the comments on this pull request [1]? I'm unable to replicate the failures and it's unclear to me that there are problems with the python code and the fortran wrappers. Thanks, Skipper [1] https://github.com/jseabold/scipy/commit/da19fbffa5a745fe441745987302abac201c6ff2#commitcomment-1287242 -------------- next part -------------- An HTML attachment was scrubbed... URL: From denis at laxalde.org Thu May 3 13:14:01 2012 From: denis at laxalde.org (Denis Laxalde) Date: Thu, 03 May 2012 13:14:01 -0400 Subject: [SciPy-Dev] LAPACK problems with *gges wrappers In-Reply-To: References: Message-ID: <4FA2BCD9.3090700@laxalde.org> Skipper Seabold a ?crit : > Can someone take a look at the comments on this pull request [1]? I'm > unable to replicate the failures and it's unclear to me that there are > problems with the python code and the fortran wrappers. I can confirm the tests failures: ====================================================================== ERROR: test_qz_double (test_decomp.TestQZ) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/denis/.local/lib/python2.7/site-packages/scipy/linalg/tests/test_decomp.py", line 1645, in test_qz_double AA,BB,Q,Z = qz(A,B) File "/home/denis/.local/lib/python2.7/site-packages/scipy/linalg/_decomp_qz.py", line 168, in qz overwrite_b=overwrite_b, sort_t=sort_t) error: (lwork>=MAX(1,8*n+16)||lwork==-1) failed for 6th keyword lwork: dgges:lwork=46 ====================================================================== ERROR: test_qz_double_sort (test_decomp.TestQZ) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/denis/.local/lib/python2.7/site-packages/scipy/linalg/tests/test_decomp.py", line 1705, in test_qz_double_sort AA,BB,Q,Z,sdim = qz(A,B,sort=sort) File "/home/denis/.local/lib/python2.7/site-packages/scipy/linalg/_decomp_qz.py", line 168, in qz overwrite_b=overwrite_b, sort_t=sort_t) error: (lwork>=MAX(1,8*n+16)||lwork==-1) failed for 6th keyword lwork: dgges:lwork=40 ====================================================================== ERROR: test_qz_single (test_decomp.TestQZ) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/denis/.local/lib/python2.7/site-packages/scipy/linalg/tests/test_decomp.py", line 1634, in test_qz_single AA,BB,Q,Z = qz(A,B) File "/home/denis/.local/lib/python2.7/site-packages/scipy/linalg/_decomp_qz.py", line 168, in qz overwrite_b=overwrite_b, sort_t=sort_t) error: (lwork>=MAX(1,8*n+16)||lwork==-1) failed for 6th keyword lwork: sgges:lwork=46 ---------------------------------------------------------------------- Lapack 3.3.1, Atlas 3.8.4 from Debian. 
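For reference, the failing calls boil down to roughly the following (a minimal sketch; the 6x6 size is arbitrary, and it assumes the qz wrapper from the pull request is built):

import numpy as np
from scipy.linalg import qz

np.random.seed(0)
A = np.random.randn(6, 6)   # arbitrary real test matrices
B = np.random.randn(6, 6)

# generalized Schur decomposition: A = Q.dot(AA).dot(Z.T), B = Q.dot(BB).dot(Z.T)
AA, BB, Q, Z = qz(A, B)     # this is the call that hits the lwork check above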
-- Denis From jsseabold at gmail.com Thu May 3 13:51:55 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 3 May 2012 13:51:55 -0400 Subject: [SciPy-Dev] LAPACK problems with *gges wrappers In-Reply-To: <4FA2BCD9.3090700@laxalde.org> References: <4FA2BCD9.3090700@laxalde.org> Message-ID: On Thu, May 3, 2012 at 1:14 PM, Denis Laxalde wrote: > > Skipper Seabold a ?crit : > > Can someone take a look at the comments on this pull request [1]? I'm > > unable to replicate the failures and it's unclear to me that there are > > problems with the python code and the fortran wrappers. > > I can confirm the tests failures: > > > ?====================================================================== > ?ERROR: test_qz_double (test_decomp.TestQZ) > ?---------------------------------------------------------------------- > ?Traceback (most recent call last): > ? ?File > "/home/denis/.local/lib/python2.7/site-packages/scipy/linalg/tests/test_decomp.py", > line 1645, in test_qz_double > ? ? ?AA,BB,Q,Z = qz(A,B) > ? ?File > "/home/denis/.local/lib/python2.7/site-packages/scipy/linalg/_decomp_qz.py", > line 168, in qz > ? ? ?overwrite_b=overwrite_b, sort_t=sort_t) > ?error: (lwork>=MAX(1,8*n+16)||lwork==-1) failed for 6th keyword lwork: > dgges:lwork=46 > > ?====================================================================== > ?ERROR: test_qz_double_sort (test_decomp.TestQZ) > ?---------------------------------------------------------------------- > ?Traceback (most recent call last): > ? ?File > "/home/denis/.local/lib/python2.7/site-packages/scipy/linalg/tests/test_decomp.py", > line 1705, in test_qz_double_sort > ? ? ?AA,BB,Q,Z,sdim = qz(A,B,sort=sort) > ? ?File > "/home/denis/.local/lib/python2.7/site-packages/scipy/linalg/_decomp_qz.py", > line 168, in qz > ? ? ?overwrite_b=overwrite_b, sort_t=sort_t) > ?error: (lwork>=MAX(1,8*n+16)||lwork==-1) failed for 6th keyword lwork: > dgges:lwork=40 > > ?====================================================================== > ?ERROR: test_qz_single (test_decomp.TestQZ) > ?---------------------------------------------------------------------- > ?Traceback (most recent call last): > ? ?File > "/home/denis/.local/lib/python2.7/site-packages/scipy/linalg/tests/test_decomp.py", > line 1634, in test_qz_single > ? ? ?AA,BB,Q,Z = qz(A,B) > ? ?File > "/home/denis/.local/lib/python2.7/site-packages/scipy/linalg/_decomp_qz.py", > line 168, in qz > ? ? ?overwrite_b=overwrite_b, sort_t=sort_t) > ?error: (lwork>=MAX(1,8*n+16)||lwork==-1) failed for 6th keyword lwork: > sgges:lwork=46 > > ?---------------------------------------------------------------------- > > Lapack 3.3.1, Atlas 3.8.4 from Debian. > > Thanks. Can you check again with pushed fix? I think the documentation is wrong and the actual minimum value for lwork should be max(8*n, 6*n+16) and not the 8*n+16 in the documentation. If anyone cares to double check my reading of the source. http://www.netlib.no/netlib/lapack/double/dgges.f Skipper From matthew.brett at gmail.com Thu May 3 13:59:05 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 3 May 2012 10:59:05 -0700 Subject: [SciPy-Dev] LAPACK problems with *gges wrappers In-Reply-To: References: <4FA2BCD9.3090700@laxalde.org> Message-ID: Hi, On Thu, May 3, 2012 at 10:51 AM, Skipper Seabold wrote: > On Thu, May 3, 2012 at 1:14 PM, Denis Laxalde wrote: >> >> Skipper Seabold a ?crit : >> > Can someone take a look at the comments on this pull request [1]? 
I'm >> > unable to replicate the failures and it's unclear to me that there are >> > problems with the python code and the fortran wrappers. >> >> I can confirm the tests failures: >> >> >> ?====================================================================== >> ?ERROR: test_qz_double (test_decomp.TestQZ) >> ?---------------------------------------------------------------------- >> ?Traceback (most recent call last): >> ? ?File >> "/home/denis/.local/lib/python2.7/site-packages/scipy/linalg/tests/test_decomp.py", >> line 1645, in test_qz_double >> ? ? ?AA,BB,Q,Z = qz(A,B) >> ? ?File >> "/home/denis/.local/lib/python2.7/site-packages/scipy/linalg/_decomp_qz.py", >> line 168, in qz >> ? ? ?overwrite_b=overwrite_b, sort_t=sort_t) >> ?error: (lwork>=MAX(1,8*n+16)||lwork==-1) failed for 6th keyword lwork: >> dgges:lwork=46 >> >> ?====================================================================== >> ?ERROR: test_qz_double_sort (test_decomp.TestQZ) >> ?---------------------------------------------------------------------- >> ?Traceback (most recent call last): >> ? ?File >> "/home/denis/.local/lib/python2.7/site-packages/scipy/linalg/tests/test_decomp.py", >> line 1705, in test_qz_double_sort >> ? ? ?AA,BB,Q,Z,sdim = qz(A,B,sort=sort) >> ? ?File >> "/home/denis/.local/lib/python2.7/site-packages/scipy/linalg/_decomp_qz.py", >> line 168, in qz >> ? ? ?overwrite_b=overwrite_b, sort_t=sort_t) >> ?error: (lwork>=MAX(1,8*n+16)||lwork==-1) failed for 6th keyword lwork: >> dgges:lwork=40 >> >> ?====================================================================== >> ?ERROR: test_qz_single (test_decomp.TestQZ) >> ?---------------------------------------------------------------------- >> ?Traceback (most recent call last): >> ? ?File >> "/home/denis/.local/lib/python2.7/site-packages/scipy/linalg/tests/test_decomp.py", >> line 1634, in test_qz_single >> ? ? ?AA,BB,Q,Z = qz(A,B) >> ? ?File >> "/home/denis/.local/lib/python2.7/site-packages/scipy/linalg/_decomp_qz.py", >> line 168, in qz >> ? ? ?overwrite_b=overwrite_b, sort_t=sort_t) >> ?error: (lwork>=MAX(1,8*n+16)||lwork==-1) failed for 6th keyword lwork: >> sgges:lwork=46 >> >> ?---------------------------------------------------------------------- >> >> Lapack 3.3.1, Atlas 3.8.4 from Debian. >> >> > > Thanks. Can you check again with pushed fix? > > I think the documentation is wrong and the actual minimum value for > lwork should be max(8*n, 6*n+16) and not the 8*n+16 in the > documentation. > > If anyone cares to double check my reading of the source. > > http://www.netlib.no/netlib/lapack/double/dgges.f Your reading looks right to me :) Matthew From denis at laxalde.org Thu May 3 14:08:13 2012 From: denis at laxalde.org (Denis Laxalde) Date: Thu, 03 May 2012 14:08:13 -0400 Subject: [SciPy-Dev] LAPACK problems with *gges wrappers In-Reply-To: References: <4FA2BCD9.3090700@laxalde.org> Message-ID: <4FA2C98D.4010304@laxalde.org> Skipper Seabold a ?crit : > Thanks. Can you check again with pushed fix? All test pass now. > I think the documentation is wrong and the actual minimum value for > lwork should be max(8*n, 6*n+16) and not the 8*n+16 in the > documentation. > > If anyone cares to double check my reading of the source. > > http://www.netlib.no/netlib/lapack/double/dgges.f That's indeed what the code checks for. However 8*n+16 > max(8*n, 6*n+16) so the documentation indicates a conservative value. 
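The relation between the two bounds is easy to check directly; a tiny illustrative sketch (the n = 5 case is hypothetical, only to show how a workspace-query value like 46 can satisfy one check and fail the other):

# workspace bounds for dgges: the docs quote lwork >= 8*n + 16, while the
# source only enforces lwork >= max(8*n, 6*n + 16)
def documented_bound(n):
    return 8 * n + 16

def source_bound(n):
    return max(8 * n, 6 * n + 16)

for n in range(1, 50):
    assert documented_bound(n) >= source_bound(n)   # docs are conservative

# hypothetical n = 5: a query returning 46 (= 6*5 + 16) meets the source
# requirement but fails the stricter documented check of 56
print(source_bound(5), documented_bound(5))   # 46 56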
From jsseabold at gmail.com Thu May 3 14:14:38 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 3 May 2012 14:14:38 -0400 Subject: [SciPy-Dev] LAPACK problems with *gges wrappers In-Reply-To: <4FA2C98D.4010304@laxalde.org> References: <4FA2BCD9.3090700@laxalde.org> <4FA2C98D.4010304@laxalde.org> Message-ID: On Thu, May 3, 2012 at 2:08 PM, Denis Laxalde wrote: > Skipper Seabold a ?crit : >> Thanks. Can you check again with pushed fix? > > All test pass now. Thanks again. > >> I think the documentation is wrong and the actual minimum value for >> lwork should be max(8*n, 6*n+16) and not the 8*n+16 in the >> documentation. >> >> If anyone cares to double check my reading of the source. >> >> http://www.netlib.no/netlib/lapack/double/dgges.f > > That's indeed what the code checks for. However 8*n+16 > max(8*n, > 6*n+16) so the documentation indicates a conservative value. Sure. I guess I read the docs as saying that LWORK _should be_ >= 8*n+16, so that's what I also checked in the wrappers. However, this is not always the case given an LWORK query. Skipper From pav at iki.fi Thu May 3 17:58:54 2012 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 03 May 2012 23:58:54 +0200 Subject: [SciPy-Dev] spam on ticket In-Reply-To: References: Message-ID: 03.05.2012 14:52, josef.pktd at gmail.com kirjoitti: > The problem with a responsive trac is that it's easier to get spam > > http://projects.scipy.org/scipy/ticket/162 > and two more so far > > What's the procedure to remove them? Drop me a mail with the name of the offending account, or use the "Delete ticket" feature. Actually, drop a mail in any case, so that the spammer account can be disabled. 03.05.2012 16:23, Jonathan Guyer kirjoitti: [clip] > We have significantly reduced the quantity of spam on the FiPy > Trac by using the Akismet spam filtering system and the > TracCaptcha plugin (we're using Trac 0.11; it looks like some > or all of these capabilities are built-in with Trac 0.12). > > http://trac.edgewall.org/wiki/SpamFilter#Akismet > http://www.schwarz.eu/opensource/projects/trac_captcha The Captcha we have already, and we disallow images. These stopped the last spammers. Akismet needs someone to pay for it. DNS-based URL blacklists could be used for free, though. Pauli From pav at iki.fi Thu May 3 18:26:20 2012 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 04 May 2012 00:26:20 +0200 Subject: [SciPy-Dev] spam on ticket In-Reply-To: References: Message-ID: 03.05.2012 23:58, Pauli Virtanen kirjoitti: > 03.05.2012 14:52, josef.pktd at gmail.com kirjoitti: >> The problem with a responsive trac is that it's easier to get spam >> >> http://projects.scipy.org/scipy/ticket/162 >> and two more so far >> >> What's the procedure to remove them? > > Drop me a mail with the name of the offending account, or use the > "Delete ticket" feature. Actually, drop a mail in any case, so that the > spammer account can be disabled. 
Ok, here are some toys from deputy spam watchers :) http://projects.scipy.org/numpy/report/16 http://projects.scipy.org/scipy/report/17 Pauli From josef.pktd at gmail.com Thu May 3 19:19:20 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 3 May 2012 19:19:20 -0400 Subject: [SciPy-Dev] spam on ticket In-Reply-To: References: Message-ID: On Thu, May 3, 2012 at 6:26 PM, Pauli Virtanen wrote: > 03.05.2012 23:58, Pauli Virtanen kirjoitti: >> 03.05.2012 14:52, josef.pktd at gmail.com kirjoitti: >>> The problem with a responsive trac is that it's easier to get spam >>> >>> http://projects.scipy.org/scipy/ticket/162 >>> and two more so far >>> >>> What's the procedure to remove them? >> >> Drop me a mail with the name of the offending account, or use the >> "Delete ticket" feature. Actually, drop a mail in any case, so that the >> spammer account can be disabled. Thanks, I never realized that I have the admin tab > > Ok, here are some toys from deputy spam watchers :) > > ? ? ? ?http://projects.scipy.org/numpy/report/16 > ? ? ? ?http://projects.scipy.org/scipy/report/17 I think scipy trac tickets are pretty clean. I randomly checked some of the ones in report 17, all clean Also, I have close to 90% of tickets in my email as read, and I almost never see spam. If I can do something immediately after seeing them, I won't forget about it. Josef > > ? ?Pauli > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From guyer at nist.gov Fri May 4 10:02:04 2012 From: guyer at nist.gov (Jonathan Guyer) Date: Fri, 4 May 2012 10:02:04 -0400 Subject: [SciPy-Dev] spam on ticket In-Reply-To: References: Message-ID: <9CE97EC4-B688-4085-96CB-92CE93CA7F82@nist.gov> On May 3, 2012, at 5:58 PM, Pauli Virtanen wrote: > 03.05.2012 16:23, Jonathan Guyer kirjoitti: > [clip] >> We have significantly reduced the quantity of spam on the FiPy >> Trac by using the Akismet spam filtering system and the >> TracCaptcha plugin (we're using Trac 0.11; it looks like some >> or all of these capabilities are built-in with Trac 0.12). >> >> http://trac.edgewall.org/wiki/SpamFilter#Akismet >> http://www.schwarz.eu/opensource/projects/trac_captcha > > The Captcha we have already, and we disallow images. These stopped the > last spammers. Akismet needs someone to pay for it. DNS-based URL > blacklists could be used for free, though. The folks who administer our Trac told us Akismet was free for non-commercial use. That was six years ago, so maybe we're grandfathered under an old pricing scheme? Looking at the correspondence from several years ago, it looks like our admins also activated the Bayesian filters in TracSpamFilter and felt that was much more useful than just Akismet alone. Anyway, I'm just offering what we have; I don't manage any of it. From scopatz at gmail.com Sat May 5 03:01:03 2012 From: scopatz at gmail.com (Anthony Scopatz) Date: Sat, 5 May 2012 02:01:03 -0500 Subject: [SciPy-Dev] ANN: PyNE v0.1 Message-ID: Hello All, I am pleased to announce the first release of PyNE, or Python for Nuclear Engineering . While this is a domain specific package, inside of the nuclear industry it is similar is scope to SciPy. I am announcing here to hopefully gain some traction with interested parties who may be monitoring SciPy. Moreover, there are some parts which may hold a more general interest (dealing with wrapping C++) if anyone is interested in breaking these out. 
Please feel free to contact me with questions, comments, complaints, or contributions. The release notes are posted below. Be Well Anthony ====================== PyNE 0.1 Release Notes ====================== PyNE 0.1 is the first release of Python for Nuclear Engineering project after an initial last year of effort. PyNE is a free and open source (BSD licensed) project which is meant to compliment other project, such as NumPy and SciPy, as a necessary package in the computational nuclear engineer's toolkit. It is meant to play nicely with existing, industry standard nuclear engineering tools. The goal of PyNE is to be both fast and useful. As such, this is only the begging! Release highlights: - Support for many I/O routines. - Nuclear data interface. - C/C++ library which may be linked against independent of Python. - Cython wrappers for C++ standard library containers. Please visit our website for more information: http://pyne.github.com/ PyNE requires Python 2.7, NumPy 1.5+, PyTables 2.1+. New features ============ Nuclide Naming in ``pyne.nucname`` ---------------------------------- This module may be used to convert between various nuclide naming schemes. Currently the following naming conventions are supported: zzaaam, human readable names, MCNP, Serpent, NIST, and CINDER. This module is implemented in C. Basic Nuclear Data via ``pyne.data`` ------------------------------------ This aims to provide quick access to very high fidelity nuclear data. Usually values are taken from the nuc_data.h5 library which is generated with the new ``nuc_data_make`` utility at install. Current data includes atomic masses, decay data, neutron scattering lengths, and simple cross section data. 63-group neutron cross sections, photon cross sections, and fission product yields are also added when CINDER is available. This module is implemented in C. Material Class in ``pyne.material`` ----------------------------------- Materials are the primary container for radionuclides throughout PyNE. They map nuclides to mass weights, though they contain methods for converting to/from atom fractions as well. In many ways they take inspiration from numpy arrays and python dictionaries. Materials are implemented in C++ and support both text and HDF5 I/O. CCCC Formats in ``pyne.cccc`` ----------------------------- The CCCC module contains a number of classes for reading various cross section, flux, geometry, and data files with specifications given by the Committee for Computer Code Coordination. The following types of files can be read using classes from this module: ISOTXS, DLAYXS, BRKOXS, RTFLUX, ATFLUX, RZFLUX, MATXS, and SPECTR. ACE Cross Sections in ``pyne.ace`` ---------------------------------- This module is for reading ACE-format cross sections. ACE stands for "A Compact ENDF" format and originated from work on MCNP. It is used in a number of other Monte Carlo particle transport codes. Cross Section Interface via ``pyne.xs`` --------------------------------------- This is a top-level interface for computing (and caching) multigroup neutron cross sections. These cross sections will be computed from a variety of available data sources (stored in nuc_data.h5). In order of preference: 1. CINDER 63-group cross sections, 2. A two-point fast/thermal interpolation (using 'simple_xs' data from KAERI), 3. or physical models implemented in this sub-package. In the future, this package will support generating multigroup cross sections from user-specified pointwise data sources (such as ENDF or ACE files). 
ORIGEN 2.2 Support in ``pyne.origen22`` --------------------------------------- This provides an interface for reading, writing, and merging certain ORIGEN 2.2 input and output files. Specifically, tapes 4, 5, 6, and 9 are supported. Serpent Support in ``pyne.serpent`` ----------------------------------- Serpent is a continuous energy Monte Carlo reactor physics code. Pyne contains support for reading in Serpent's three types of output files: res, dep, and det. These are all in Matlab's ``*.m`` format and are read in as Python dictionaries of numpy arrays and Materials. They may be optionally written out to corresponding ``*.py`` files and imported later. C++ Standard Library Converters in ``pyne.stlconverters`` --------------------------------------------------------- This module contains wrapper classes for commonly used containers in the C++ standard library. This module is largely used by PyNE under the covers, in Cython and elsewhere. However, these classes are of more general interest so feel free to use them in your own code as well. Currently implemented are SetInt, SetStr, MapStrInt, MapIntStr, MapIntDouble, and MapIntComplex. Nuclear Data Generation in ``pyne.dbgen`` ----------------------------------------- Pyne provides an easy-to-use, repeatable aggregation utility for nuclear data. This command line utility is called ``nuc_data_make`` builds and installs an HDF5 file named ``nuc_data.h5`` to the current PyNE install. Nuclear data is gathered from a variety of sources, including the web and the data files for other programs installed on your system (such as MCNP). Authors ======= This release contains code written by the following people (in alphabetical order): * Christopher Dembia * Robert Flanagan * Paul Romano * Anthony Scopatz * Paul Wilson Additionally, we would like to thank the following people for their inspiration, guidance, and testing: * Katy Huff * Seth Johnson * Joshua Peterson * Rachel Slaybaugh * Nick Touran * Morgan White -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Sun May 6 15:08:38 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 6 May 2012 21:08:38 +0200 Subject: [SciPy-Dev] ANN Scikit-learn 0.11-beta Message-ID: <20120506190838.GA9889@phare.normalesup.org> On behalf of our release manager, Andreas Mueller, and all the scikit-learn contributors, I am happy to announce the 0.11 beta. We are doing a quick beta and will hopefuly be releasing the final version tomorrow. The purpose of this beta is to get feedback on any release-critical bugs such as build issues. You can download the zip files of the beta on: https://github.com/scikit-learn/scikit-learn/zipball/0.11-beta You can also retrieve the latest code on https://github.com/scikit-learn/scikit-learn/zipball/master or using 'git clone git at github.com:scikit-learn/scikit-learn.git' Any feedback is more than welcome, Cheers, Ga?l From vanforeest at gmail.com Mon May 7 07:51:28 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Mon, 7 May 2012 13:51:28 +0200 Subject: [SciPy-Dev] scipy.stats documentation Message-ID: Hi, I am still struggling to understand some of the scipy stats package, and ran into some obscure points. 1) What is actually the shape parameter? Let me include some references to show my confusion here. In expon it does not seem to exist: https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L2770 Then, in Erlang it is called 'n'. 
I suppose this would mean the number of stages. So in Erlang, why then is the scale parameter corresponding to the shape? BTW: should the scale in the erlang dist dosctring not be explained? Then, from the gamma dist I learn the following: https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L3382 So that would mean that in the expon dist the shape is set to 1. Then, here: http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.erlang.html it states that the shape parameter should be an int, but in the examples section it is set to 0.9, i.e., the documentation states this: >>> from scipy.stats import erlang >>> numargs = erlang.numargs >>> [ n ] = [0.9,] * numargs >>> rv = erlang(n) from which I infer that the shape is set to 0.9. All in all, I don't quite know what to expect with regard to the use and purpose of shape. Is the shape parameter explained somewhere explicitly? if not, wouldn't the stats tutorial be the best place? Who is the author of this doc? How can I help change it? 2) Would it be a good idea to make the use of the loc and scale parameter explicit in the doc strings of the distributions? I recall that, as a first time user, I had no clue what they meant, and that it took some struggling and searching to figure out what they came down to. Besides, the doc strings are not allways complete. For instance, this is the string for the epx distribution: The probability density function for `expon` is:: expon.pdf(x) = exp(-x) for ``x >= 0``. The scale parameter is equal to ``scale = 1.0 / lambda``. So, what is lambda here? Is it: pdf(x) = lambda * exp(-x lambda), or is it pdf(x) = exp(-x/lambda)/lamda? After some experimentation I found out, but the documentation is not explicit enough in my opinion. Suppose we would restate it like this: cdf(x) = 1. - exp( -(x-loc)/scale). Then I think it would be clear immediately, and also interpretation-free. Likewise for other distributions. 3) I am really willing to help improve stats and the documentation at points more consistent, but I don't quite know where to start. In the process I raise all these points. Is this list the best place, or should I send my comments to Josef (?)? Nicky From jsseabold at gmail.com Mon May 7 08:51:12 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 7 May 2012 08:51:12 -0400 Subject: [SciPy-Dev] scipy.stats documentation In-Reply-To: References: Message-ID: On Mon, May 7, 2012 at 7:51 AM, nicky van foreest wrote: > Hi, > > I am still struggling to understand some of the scipy stats package, > and ran into some obscure points. > > 1) > > What is actually the shape parameter? Let me include some references > to show my confusion here. > > In expon it does not seem to exist: > > > https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L2770 > > Then, in Erlang it is called 'n'. I suppose this would mean the number > of stages. So in Erlang, why then is the scale parameter corresponding > to the shape? BTW: should the scale in the erlang dist dosctring not > be explained? > > Then, from the gamma dist I learn the following: > > > https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L3382 > > So that would mean that in the expon dist the shape is set to 1. 
> > Then, here: > > http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.erlang.html > > it states that the shape parameter should be an int, but in the > examples section it is set to 0.9, i.e., the documentation states > this: > > >>> from scipy.stats import erlang > >>> numargs = erlang.numargs > >>> [ n ] = [0.9,] * numargs > >>> rv = erlang(n) > > from which I infer that the shape is set to 0.9. > > All in all, I don't quite know what to expect with regard to the use > and purpose of shape. > > Is the shape parameter explained somewhere explicitly? if not, > wouldn't the stats tutorial be the best place? Who is the author of > this doc? How can I help change it? > > 2) > > Would it be a good idea to make the use of the loc and scale parameter > explicit in the doc strings of the distributions? I recall that, as a > first time user, I had no clue what they meant, and that it took some > struggling and searching to figure out what they came down to. > Besides, the doc strings are not allways complete. For instance, this > is the string for the epx distribution: > > The probability density function for `expon` is:: > > expon.pdf(x) = exp(-x) > > for ``x >= 0``. > > The scale parameter is equal to ``scale = 1.0 / lambda``. > > So, what is lambda here? Is it: pdf(x) = lambda * exp(-x lambda), or > is it pdf(x) = exp(-x/lambda)/lamda? After some experimentation I > found out, but the documentation is not explicit enough in my opinion. > Suppose we would restate it like this: > > cdf(x) = 1. - exp( -(x-loc)/scale). > > Then I think it would be clear immediately, and also > interpretation-free. Likewise for other distributions. > > 3) > I am really willing to help improve stats and the documentation at > points more consistent, but I don't quite know where to start. In the > process I raise all these points. Is this list the best place, or > should I send my comments to Josef (?)? > I'd prefer if the conversations stayed on list. FWIW, I'm really glad you are stepping up to help Josef out here. I am somewhat familiar with the stats code and the internals, but I still struggle with it at times. Anything that can be done to make this more user-friendly from documentation to refactoring would be very welcome. Skipper -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon May 7 10:30:04 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 7 May 2012 10:30:04 -0400 Subject: [SciPy-Dev] scipy.stats documentation In-Reply-To: References: Message-ID: On Mon, May 7, 2012 at 8:51 AM, Skipper Seabold wrote: > On Mon, May 7, 2012 at 7:51 AM, nicky van foreest > wrote: >> >> Hi, >> >> I am still struggling to understand some of the scipy stats package, >> and ran into some obscure points. I think the main point to understanding the distributions is to realize that we know very little about individual distributions. stats.distributions has an elaborate generic structure for the distributions. When I started with this, it was relatively easy because it was just "coding". I knew maybe 10 to 15 distributions out of the around 90 and didn't need to go through all of them or understand the distribution specific parts. When I tried to find parameters for the test suite for each distribution, I started to go through them individually, specifically which distributions have integer parameters. 
Then I realized that none of the distribution complains when I feed it a real (non-integer) number, then I just did random search until I found parameters that worked. Over time we worked our way through some of the individual distribution, either because there were bugs, or because other developers were interested in them or because they showed up on the mailing lists. Some years ago Ralf rewrote the automatic docstring generation, which allow now for better distribution specific docstrings. So, the way would be open to incorporate more distribution specific information, but someone needs to know what a distribution is supposed to be. Also, for some purposes we don't really care about the initial history of a distribution, for example pymvpa and some proprietary software packages I looked at just try to find the best fitting distribution given the data, independent of what the interpretation of the distribution was when it was initially developed, for statistical tests, for queuing models, extreme value analysis or whichever. To most points below parameterization in stats.distributions versus text books All continuous distributions have a generic treatment of loc and scale independent of whether this is part of a standard definition of the distribution. (discrete distributions only have a loc shift.) (location-scale families of distributions) dist.cdf((x-loc)/scale) = dist.cdf(x, loc=loc, scale=scale) the standard distributions have loc=0, scale=1 pdf and other methods follow from this every other parameter besides loc and scale is called a shape parameter, say theta, so we have dist.cdf((x-loc)/scale; theta) = dist.cdf(x, *theta, loc=loc, scale=scale) (requires python >=2.6 :) many distributions like normal and exponential don't have a shape parameter, just loc and scale many "standard" definitions of distributions don't include loc and scale as separate parameters, and scale, or a function of scale, is often the "standard" parameter, see your example of the exponential distribution below http://en.wikipedia.org/wiki/Exponential_distribution standard definition has lambda as parameter, with interpretation as rate parameter. If we look at the cdf = 1 ? exp{??x), then lambda just multiplies x (instead of x/scale), so the lambda in the standard definition of exponential is just our 1./scale, or scale=1./lambda Sometimes the parameterization in stats.distributions is "a bit difficult" to translate to the standard parameterization, example lognormal that regularly raises questions. The documentation should be improved wherever possible, some parts I might never have read very carefully, other parts I might interpret in the "right way" even if it's not clear as general documentation for someone that isn't familiar with the details. >> >> 1) >> >> What is actually the shape parameter? ?Let me include some references >> to show my confusion here. >> >> In expon it does not seem to exist: >> >> >> https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L2770 >> >> Then, in Erlang it is called 'n'. I suppose this would mean the number >> of stages. So in Erlang, why then is the scale parameter corresponding >> to the shape? BTW: should the scale in the erlang dist dosctring not >> be explained? Not sure I understand the first part. I never looked at Erlang until recently. >> >> Then, from the gamma dist I learn the following: >> >> >> https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L3382 >> >> So that would mean that in the expon dist the shape is set to 1. 
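To make the generic loc/scale treatment above concrete, a small illustrative check (nothing distribution specific; any recent scipy.stats behaves the same way):

import numpy as np
from scipy import stats

x, loc, scale = 3.0, 1.0, 2.0

# loc/scale only (no shape parameter): shift and scale the standard cdf
print(stats.expon.cdf(x, loc=loc, scale=scale))     # equals the next line
print(stats.expon.cdf((x - loc) / scale))

# with a shape parameter (gamma's a), loc and scale are still generic
a = 2.5
print(stats.gamma.cdf(x, a, loc=loc, scale=scale))  # equals the next line
print(stats.gamma.cdf((x - loc) / scale, a))

# the textbook rate parameter lambda of the exponential is just 1/scale
lam = 0.5
print(stats.expon.cdf(x, scale=1.0 / lam))          # equals 1 - exp(-lam*x)
print(1 - np.exp(-lam * x))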
>> >> Then, here: >> >> >> http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.erlang.html >> >> it states that the shape parameter should be an int, but in the >> examples section it is set to 0.9, i.e., the documentation states >> this: >> >> >>> from scipy.stats import erlang >> >>> numargs = erlang.numargs >> >>> [ n ] = [0.9,] * numargs >> >>> rv = erlang(n) >> >> from which I infer that the shape is set to 0.9. these are generic template numbers and could be replaced by distribution specific docstrings >> >> All in all, I don't quite know what to expect with regard to the use >> and purpose of shape. >> >> Is the shape parameter explained somewhere explicitly? if not, >> wouldn't the stats tutorial be the best place? Who is the author of >> this doc? How can I help change it? online editing is the easiest. I wrote the stats tutorial a long time ago, and it contains the description of individual distributions written by Travis. I haven't looked at the overall documentation for the distributions in a while. Suggestions, or, even better, direct improvements in the doc editor or with pull request would be very welcome. >> >> 2) >> >> Would it be a good idea to make the use of the loc and scale parameter >> explicit in the doc strings of the distributions? I recall that, as a >> first time user, I had no clue what they meant, and that it took some >> struggling and searching to figure out what they came down to. >> Besides, the doc strings are not allways complete. For instance, this >> is the string for the epx distribution: >> >> The probability density function for `expon` is:: >> >> ? ? ? ?expon.pdf(x) = exp(-x) >> >> ? ?for ``x >= 0``. >> >> ? ?The scale parameter is equal to ``scale = 1.0 / lambda``. >> >> So, what is lambda here? Is it: pdf(x) = lambda * exp(-x lambda), or >> is it pdf(x) = exp(-x/lambda)/lamda? After some experimentation I >> found out, but the documentation is not explicit enough in my opinion. >> Suppose we would restate it like this: >> >> cdf(x) = 1. - exp( -(x-loc)/scale). >> >> Then I think it would be clear immediately, and also >> interpretation-free. Likewise for other distributions. As above, (I will have to browse the documentation, to be able to comment on specific items.) I hope the description above helps for this, and we can keep going to clear this up and improve the documentation. >> >> 3) >> I am really willing to help improve stats and the documentation at >> points more consistent, but I don't quite know where to start. ?In the >> process I raise all these points. Is this list the best place, or >> should I send my comments to Josef (?)? > > > I'd prefer if the conversations stayed on list. > > FWIW, I'm really glad you are stepping up to help Josef out here. I am > somewhat familiar with the stats code and the internals, but I still > struggle with it at times. Anything that can be done to make this more > user-friendly from documentation to refactoring would be very welcome. I fully agree with Skipper. Nicky, thanks for looking into this. 
Josef > > Skipper > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From ralf.gommers at googlemail.com Mon May 7 12:03:23 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 7 May 2012 18:03:23 +0200 Subject: [SciPy-Dev] scipy.sparse.csgraph merged Message-ID: Hi, I finally merged https://github.com/scipy/scipy/pull/119, the sparse graph algorithms module written by Jake Vanderplas. This will certainly by one of the highlights of the 0.11 release (which shouldn't be too far off). Thanks again to Jake for all the work he put in. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanforeest at gmail.com Mon May 7 14:20:26 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Mon, 7 May 2012 20:20:26 +0200 Subject: [SciPy-Dev] scipy.stats documentation In-Reply-To: References: Message-ID: Hi, Many thanks for your explanations. I think it is best to just respond to the point below that are not yet completely clear to me. I'll remove the rest. > All continuous distributions have a generic treatment of loc and scale > independent of whether this is part of a standard definition of the > distribution. (discrete distributions only have a loc shift.) > (location-scale families of distributions) > > dist.cdf((x-loc)/scale) = dist.cdf(x, loc=loc, scale=scale) > the standard distributions have loc=0, scale=1 > pdf and other methods follow from this I know this, but it took me some searching to find this out (already a few years ago.) I would like to make this somewhat easier to understand for first time users. Besides this, if I read (for the exponential) cdf(x) = 1-exp( -(x-loc)/scale) it is perfectly clear that 1/scale represents the rate. The confusing point is that text books, wikipedia, and so on, are inconsistent. Hence, before I use any distribution in scipy.stats I first run a few test to be sure that using scale works the way as I expect. If, on the other hand, the docstring would be completely explicit, including the use of loc and scale (even though it is overkill as loc and scale are explained somewhere else), no confusions should arise, at least not from my part. As an example, I think as a first time user the following doc string would have helped me the most (Hopefully I am not too biased here): """An exponential continuous random variable. %(before_notes)s Notes ----- The cumulative distribution function for `expon` is:: expon.cdf(x) = 1. - exp(-(x-loc)/scale) To compute cdf(x) = exp(-lambda x ) it is required to take ``scale = 1.0/lambda``; since ``loc = 0`` automatically it is not necessary to set ``loc = 0.`` explicitly. The shape parameter is not implemented for ``expon`` as ``loc`` and ``scale`` suffice. %(example)s """ > Sometimes the parameterization in stats.distributions is "a bit > difficult" to translate to the standard parameterization, example > lognormal that regularly raises questions. Ok. Can we resolve this by making the doc-string more explicit, like my example above? > these are generic template numbers and could be replaced by > distribution specific docstrings Ok. How about fixing part for part? > online editing is the easiest. Many of these points seem, at least to me, too minor to raise a ticket, or am I mistaken here? > I wrote the stats tutorial a long time ago, and it contains the > description of individual distributions written by Travis. 
> I haven't looked at the overall documentation for the distributions in a while. Where can I find the tutorial? Then I'll try to add some description about the shape parameter, and improve/add parts where necessary. > > Suggestions, or, even better, direct improvements in the doc editor or > with pull request would be very welcome. I'll try that. > Nicky, thanks for looking into this. I am happy to be able to do something in return. python, scipy, and stats, made my life (in some respects :-) much easier. From stefan at sun.ac.za Mon May 7 14:47:17 2012 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 7 May 2012 11:47:17 -0700 Subject: [SciPy-Dev] scipy.sparse.csgraph merged In-Reply-To: References: Message-ID: On Mon, May 7, 2012 at 9:03 AM, Ralf Gommers wrote: > I finally merged https://github.com/scipy/scipy/pull/119, the sparse graph > algorithms module written by Jake Vanderplas. This will certainly by one of > the highlights of the 0.11 release (which shouldn't be too far off). Thanks > again to Jake for all the work he put in. What a great pull request discussion and resulting changeset! Kudos to Jake, Ralf, Dan and Pauli for seeing it through. I'm curious to see how this compares to the N-d minimum cost path code in scikits-image: https://github.com/scikits-image/scikits-image/blob/master/skimage/graph/_mcp.pyx St?fan From vanforeest at gmail.com Mon May 7 15:31:53 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Mon, 7 May 2012 21:31:53 +0200 Subject: [SciPy-Dev] scipy stats: argcheck Message-ID: Shouldn't the distribution gamma_gen be equiped with an argcheck function like so: def _argcheck(self, a: return (a > 0) compare the line https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L3402 Likewise for https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L2609 ---- One more question about the workflow. I notice that while I am trying to understand the stats module I encounter lots of minor points/unclarities and so on, like the above. I have the feeling the easiest way to deal with such points is as follows 1: send a mail to this list. Discuss this until it is resolved. 2: if it is a minor point, I implement it and make a pull request. 3: If it turns out to be a major point, file a report in scipy.trac. If Josef (or other maintainers) prefer otherwise, please let me know. From vanforeest at gmail.com Mon May 7 15:34:50 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Mon, 7 May 2012 21:34:50 +0200 Subject: [SciPy-Dev] scipy stats: argcheck In-Reply-To: References: Message-ID: Sorry, I just realize that _argcheck() in rv_continuous already checks whether the shape a>0, making my question below superfluous. On 7 May 2012 21:31, nicky van foreest wrote: > Shouldn't the distribution gamma_gen be equiped with an argcheck > function like so: > > ? ?def _argcheck(self, a: > ? ? ? ?return (a > 0) > > compare the line > > https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L3402 > > Likewise for > > https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L2609 > > ---- > > One more question about the workflow. ?I notice that while I am trying > to understand the stats module I encounter lots of minor > points/unclarities and so on, like the above. I have the feeling the > easiest way to deal with such points is as follows > > 1: send a mail to this list. Discuss this until it is resolved. > 2: if it is a minor point, I ?implement it and make a pull request. 
> 3: If it turns out to be a major point, file a report in scipy.trac. > > If Josef (or other maintainers) prefer otherwise, please let me know. From jjhelmus at gmail.com Mon May 7 15:38:32 2012 From: jjhelmus at gmail.com (Jonathan Helmus) Date: Mon, 07 May 2012 15:38:32 -0400 Subject: [SciPy-Dev] ndimage grey morphology tickets Message-ID: <4FA824B8.2060709@gmail.com> All, Trac tickets #1135, #1281 and #1498 point out bugs in the ndimage.grey_dilation and grey_erosion function. I've started a branch that tries to address these issues: https://github.com/jjhelmus/scipy/tree/grey_morphology-fix This branch currently passes the tests in test_ndimage.py and the dilation_test.py file attached to ticket 1135, but I am not certain on two issues that I was hoping someone here might be able to comment on them: 1. Should there be a sanity check on the shape of footprint, structure and size when more than one is provided? 2. Ticket #1281 points out that grey_erosion does not have the parameter checking that grey_dilation has. I added these checks but noticed that to pass tests structure and footprint should not be reversed and the origin negated. Is this correct? If so the commented out lines in the branch should will be deleted. If this would be better discussed in a pull request I'd be happy to make one. http://projects.scipy.org/scipy/ticket/1135 http://projects.scipy.org/scipy/ticket/1281 http://projects.scipy.org/scipy/ticket/1498 Cheers, -Jonathan Helmus From josef.pktd at gmail.com Mon May 7 15:41:37 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 7 May 2012 15:41:37 -0400 Subject: [SciPy-Dev] scipy.stats documentation In-Reply-To: References: Message-ID: On Mon, May 7, 2012 at 2:20 PM, nicky van foreest wrote: > Hi, > > Many thanks for your explanations. I think it is best to just respond > to the point below that are not yet completely clear to me. I'll > remove the rest. > >> All continuous distributions have a generic treatment of loc and scale >> independent of whether this is part of a standard definition of the >> distribution. (discrete distributions only have a loc shift.) >> (location-scale families of distributions) >> >> dist.cdf((x-loc)/scale) = dist.cdf(x, loc=loc, scale=scale) >> the standard distributions have loc=0, scale=1 >> pdf and other methods follow from this > > I know this, but it took me some searching to find this out (already a > few years ago.) I would like to make this somewhat easier to > understand for first time users. Besides this, if I read (for the > exponential) > > cdf(x) = 1-exp( -(x-loc)/scale) > > it is perfectly clear that 1/scale represents the rate. > > The confusing point is that text books, wikipedia, and so on, are > inconsistent. Hence, before I use any distribution in scipy.stats I > first run a few test to be sure that using scale works the way as I > expect. If, on the other hand, the docstring would be completely > explicit, including the use of loc and scale (even though it is > overkill as loc and scale are explained somewhere else), no confusions > should arise, at least not from my part. > > As an example, I think as a first time user the following doc string > would have helped me the most (Hopefully I am not too biased here): > > ? ?"""An exponential continuous random variable. > > ? ?%(before_notes)s > > ? ?Notes > ? ?----- > ? ?The cumulative distribution function for `expon` is:: > > ? ? ? ?expon.cdf(x) = 1. - exp(-(x-loc)/scale) > > ? ?To compute > > ? ? ? ? ?cdf(x) = exp(-lambda x ) > > ? ? 
it is required to take ``scale = 1.0/lambda``; since ``loc = 0`` > automatically it ? ? ? ? ? ?is not necessary to set ``loc = 0.`` > explicitly. > > ? ?The shape parameter is not implemented for ``expon`` as ``loc`` > and ``scale`` suffice. > > ? ?%(example)s > > ? ?""" Yes that looks good to me. a few problems: in most cases we have the pdf currently in the docs which is, I guess, more familiar to most users. I'm not sure having 1./scale in front and (x-loc)/scale inside makes the pdf easier to read or understand, but it's more explicit. For many distributions, we don't have an explicit expression for the cdf, so loc and scale should still be understandable from the general documentation > >> Sometimes the parameterization in stats.distributions is "a bit >> difficult" to translate to the standard parameterization, example >> lognormal that regularly raises questions. > > Ok. Can we resolve this by making the doc-string more explicit, like > my example above? Yes > >> these are generic template numbers and could be replaced by >> distribution specific docstrings > > Ok. How about fixing part for part? If someone is going through individual distributions, this would be very good. (My initial worry a few years ago was that it will be difficult to maintain 90 individual docstrings.) > >> online editing is the easiest. > > Many of these points seem, at least to me, too minor to raise a > ticket, or am I mistaken here? no individual tickets are necessary. One possibility is a pull request with many changes. I usually prefer the online doc system. If you have edit permission, otherwise sign up and ping the list. here is the tutorial for editing http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/stats.rst/ the distribution docstrings are a bit trickier: don't edit the generated docstring of the instance, e.g. http://docs.scipy.org/scipy/docs/scipy.stats.expon/edit/ I think that would create a mess The docstring of the class with template is here http://docs.scipy.org/scipy/docs/scipy.stats.distributions.expon_gen/edit/ the only way I found the link is going through the milestones and look for xxx_gen http://docs.scipy.org/scipy/Milestones/Milestones_11/ > >> I wrote the stats tutorial a long time ago, and it contains the >> description of individual distributions written by Travis. >> I haven't looked at the overall documentation for the distributions in a while. > > Where can I find the tutorial? Then I'll try to add some description > about the shape parameter, and improve/add parts where necessary. http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/stats.rst/ or in the source https://github.com/scipy/scipy/blob/master/doc/source/tutorial/stats.rst Josef > >> >> Suggestions, or, even better, direct improvements in the doc editor or >> with pull request would be very welcome. > > I'll try that. > >> Nicky, thanks for looking into this. > > I am happy to be able to do something in return. python, scipy, and > stats, made my life (in some respects :-) much easier. 
> _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From zachary.pincus at yale.edu Mon May 7 15:44:37 2012 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Mon, 7 May 2012 15:44:37 -0400 Subject: [SciPy-Dev] scipy.sparse.csgraph merged In-Reply-To: References: Message-ID: <234B78D8-282B-4A81-9496-BF9D955C8E78@yale.edu> On May 7, 2012, at 2:47 PM, St?fan van der Walt wrote: > On Mon, May 7, 2012 at 9:03 AM, Ralf Gommers > wrote: >> I finally merged https://github.com/scipy/scipy/pull/119, the sparse graph >> algorithms module written by Jake Vanderplas. This will certainly by one of >> the highlights of the 0.11 release (which shouldn't be too far off). Thanks >> again to Jake for all the work he put in. > > What a great pull request discussion and resulting changeset! Kudos > to Jake, Ralf, Dan and Pauli for seeing it through. > > I'm curious to see how this compares to the N-d minimum cost path code > in scikits-image: > > https://github.com/scikits-image/scikits-image/blob/master/skimage/graph/_mcp.pyx Oh great, that's phenomenal to see in scipy! Fantastic stuff. The stuff in skimage is (essentially) Dijkstra's algorithm (on top of a binary heap, if I recall), for dense arrays and with some image-specific niceties (the graph connectivity is determined from a set of offsets that can be applied to any position in the array to find the "successor" nodes, and the edge weights can be determined in ways that make geodesic sense -- e.g. taking into account that diagonal "steps" are longer than axial ones). So the algorithms in skimage make most sense for dense matrices with some sort of spatial ordering, though they need not be "images" per se. Zach From tsyu80 at gmail.com Mon May 7 15:58:18 2012 From: tsyu80 at gmail.com (Tony Yu) Date: Mon, 7 May 2012 15:58:18 -0400 Subject: [SciPy-Dev] ndimage grey morphology tickets In-Reply-To: <4FA824B8.2060709@gmail.com> References: <4FA824B8.2060709@gmail.com> Message-ID: On Mon, May 7, 2012 at 3:38 PM, Jonathan Helmus wrote: > All, > > Trac tickets #1135, #1281 and #1498 point out bugs in the > ndimage.grey_dilation and grey_erosion function. I've started a branch > that tries to address these issues: > https://github.com/jjhelmus/scipy/tree/grey_morphology-fix > > This branch currently passes the tests in test_ndimage.py and the > dilation_test.py file attached to ticket 1135, but I am not certain on > two issues that I was hoping someone here might be able to comment on them: > > 1. Should there be a sanity check on the shape of footprint, structure > and size when more than one is provided? > 2. Ticket #1281 points out that grey_erosion does not have the > parameter checking that grey_dilation has. I added these checks but > noticed that to pass tests structure and footprint should not be > reversed and the origin negated. Is this correct? If so the commented > out lines in the branch should will be deleted. > > If this would be better discussed in a pull request I'd be happy to make > one. > > http://projects.scipy.org/scipy/ticket/1135 > http://projects.scipy.org/scipy/ticket/1281 > http://projects.scipy.org/scipy/ticket/1498 > > > Cheers, > > -Jonathan Helmus > > Hi Jonathan, I recently submitted a fix for ticket 1135, but I realize now that I should have pinged the list. I didn't actually know about tickets 1281 or 1498. I think the PR I submitted should also take care of 1498, but I don't think 1281 is actually a bug. 
IIRC, the reason for the difference is that dilate shifts the origin of the structuring element/footprint if its size is even-numbered (i.e. doesn't have a "center" pixel). This shift makes dilation and erosion reversible---otherwise applying one, then the other would shift features. Are you sure the changes to `grey_erosion` are necessary? Also, I trimmed down the tests from PR 1135 (which seem to be the same tests in your branch) so that they only test for the submitted change. I think I originally wrote the tests (it's been more than 2 years so I could be wrong about that), and I was just testing everything related to the change. I trimmed it down because some tests replicated existing tests (and `test_ndimage.py` is already really long). I haven't had a chance to carefully look over your branch to compare, but I will tonight. -Tony -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Mon May 7 16:02:14 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 7 May 2012 22:02:14 +0200 Subject: [SciPy-Dev] scipy.stats documentation In-Reply-To: References: Message-ID: On Mon, May 7, 2012 at 9:41 PM, wrote: > On Mon, May 7, 2012 at 2:20 PM, nicky van foreest > wrote: > > > > >> Sometimes the parameterization in stats.distributions is "a bit > >> difficult" to translate to the standard parameterization, example > >> lognormal that regularly raises questions. > > > > Ok. Can we resolve this by making the doc-string more explicit, like > > my example above? > > Yes > > > > >> these are generic template numbers and could be replaced by > >> distribution specific docstrings > > > > Ok. How about fixing part for part? > > If someone is going through individual distributions, this would be > very good. (My initial worry a few years ago was that it will be > difficult to maintain 90 individual docstrings.) +1 for fixing this for individual distributions. Getting working and sensible examples out of auto-generated code is pretty much impossible. > > > > >> online editing is the easiest. > > > > Many of these points seem, at least to me, too minor to raise a > > ticket, or am I mistaken here? > > no individual tickets are necessary. One possibility is a pull request > with many changes. I usually prefer the online doc system. If you have edit permission, > otherwise sign up and ping the list. > Either way is very welcome, but a PR with multiple changes has much lower overhead than the doc wiki (and will be merged much sooner). > here is the tutorial for editing > http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/stats.rst/ > > the distribution docstrings are a bit trickier: > don't edit the generated docstring of the instance, e.g. > http://docs.scipy.org/scipy/docs/scipy.stats.expon/edit/ > I think that would create a mess > Yes, that won't work. Please don't touch those. > > The docstring of the class with template is here > http://docs.scipy.org/scipy/docs/scipy.stats.distributions.expon_gen/edit/ > the only way I found the link is going through the milestones and look > for xxx_gen > http://docs.scipy.org/scipy/Milestones/Milestones_11/ > > Good point. Looks like all the _gen entries should be made white (=edit status) there instead of unimportant. Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jsseabold at gmail.com Mon May 7 16:03:59 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 7 May 2012 16:03:59 -0400 Subject: [SciPy-Dev] scipy.stats documentation In-Reply-To: References: Message-ID: On Mon, May 7, 2012 at 3:41 PM, wrote: > > On Mon, May 7, 2012 at 2:20 PM, nicky van foreest wrote: > > Hi, > > > > Many thanks for your explanations. I think it is best to just respond > > to the point below that are not yet completely clear to me. I'll > > remove the rest. > > > >> All continuous distributions have a generic treatment of loc and scale > >> independent of whether this is part of a standard definition of the > >> distribution. (discrete distributions only have a loc shift.) > >> (location-scale families of distributions) > >> > >> dist.cdf((x-loc)/scale) = dist.cdf(x, loc=loc, scale=scale) > >> the standard distributions have loc=0, scale=1 > >> pdf and other methods follow from this > > > > I know this, but it took me some searching to find this out (already a > > few years ago.) I would like to make this somewhat easier to > > understand for first time users. Besides this, if I read (for the > > exponential) > > > > cdf(x) = 1-exp( -(x-loc)/scale) > > > > it is perfectly clear that 1/scale represents the rate. > > > > The confusing point is that text books, wikipedia, and so on, are > > inconsistent. Hence, before I use any distribution in scipy.stats I > > first run a few test to be sure that using scale works the way as I > > expect. If, on the other hand, the docstring would be completely > > explicit, including the use of loc and scale (even though it is > > overkill as loc and scale are explained somewhere else), no confusions > > should arise, at least not from my part. > > > > As an example, I think as a first time user the following doc string > > would have helped me the most (Hopefully I am not too biased here): > > > > ? ?"""An exponential continuous random variable. > > > > ? ?%(before_notes)s > > > > ? ?Notes > > ? ?----- > > ? ?The cumulative distribution function for `expon` is:: > > > > ? ? ? ?expon.cdf(x) = 1. - exp(-(x-loc)/scale) > > > > ? ?To compute > > > > ? ? ? ? ?cdf(x) = exp(-lambda x ) > > > > ? ? it is required to take ``scale = 1.0/lambda``; since ``loc = 0`` > > automatically it ? ? ? ? ? ?is not necessary to set ``loc = 0.`` > > explicitly. > > > > ? ?The shape parameter is not implemented for ``expon`` as ``loc`` > > and ``scale`` suffice. > > > > ? ?%(example)s > > > > ? ?""" > > Yes that looks good to me. > a few problems: > in most cases we have the pdf currently in the docs which is, I guess, > more familiar to most users. > I'm not sure having 1./scale in front and (x-loc)/scale inside makes > the pdf easier to read or understand, but it's more explicit. > For many distributions, we don't have an explicit expression for the > cdf, so loc and scale should still be understandable from the general > documentation FWIW, every time I want to use a distribution, I have to open Wikipedia, the online docs such as http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/stats/continuous.rst/, and look at the docstring of the frozen distribution. Even then it's still not entirely clear to me how scale works in each case. I would love to see verbose, hand-holding documentation for the distributions. IMO it's not reasonable to ask users to understand the high level of abstraction used here. 
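To make that concrete, this is the kind of throwaway check I end up writing almost every time, just to convince myself how scale enters (nothing here beyond released numpy and scipy.stats, and the numbers are arbitrary):

    import numpy as np
    from scipy import stats

    lam = 2.0                                  # the rate I actually have in mind
    x = np.linspace(0.0, 5.0, 11)

    # expon has no shape parameter; the rate enters only through scale = 1/lambda
    assert np.allclose(stats.expon.cdf(x, scale=1.0/lam), 1.0 - np.exp(-lam * x))

A worked line like that in each docstring would remove most of the guesswork.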
> > > > > >> Sometimes the parameterization in stats.distributions is "a bit > >> difficult" to translate to the standard parameterization, example > >> lognormal that regularly raises questions. > > > > Ok. Can we resolve this by making the doc-string more explicit, like > > my example above? > > Yes > > > > >> these are generic template numbers and could be replaced by > >> distribution specific docstrings > > > > Ok. How about fixing part for part? > > If someone is going through individual distributions, this would be > very good. (My initial worry a few years ago was that it will be > difficult to maintain 90 individual docstrings.) > Sure re-using the templates is useful (except when they generate incorrect information such as the wrong number of args and non-integer args, etc.), but it seems like (to me) confusion still reigns here. I don't know how the docs work anymore, but is it possible to have a wiki-editable 'Notes' section for each distribution? I think having a lot more information individually would go a long way. > > > >> online editing is the easiest. > > > > Many of these points seem, at least to me, too minor to raise a > > ticket, or am I mistaken here? > > no individual tickets are necessary. One possibility is a pull request > with many changes. > I usually prefer the online doc system. If you have edit permission, > otherwise sign up and ping the list. > > here is the tutorial for editing > http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/stats.rst/ > > the distribution docstrings are a bit trickier: > don't edit the generated docstring of the instance, e.g. > http://docs.scipy.org/scipy/docs/scipy.stats.expon/edit/ > I think that would create a mess > > The docstring of the class with template is here > http://docs.scipy.org/scipy/docs/scipy.stats.distributions.expon_gen/edit/ > the only way I found the link is going through the milestones and look > for ?xxx_gen > http://docs.scipy.org/scipy/Milestones/Milestones_11/ > > > > > >> I wrote the stats tutorial a long time ago, and it contains the > >> description of individual distributions written by Travis. > >> I haven't looked at the overall documentation for the distributions in a while. > > > > Where can I find the tutorial? Then I'll try to add some description > > about the shape parameter, and improve/add parts where necessary. > > http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/stats.rst/ > or in the source > https://github.com/scipy/scipy/blob/master/doc/source/tutorial/stats.rst > > Josef > > > >> > >> Suggestions, or, even better, direct improvements in the doc editor or > >> with pull request would be very welcome. > > > > I'll try that. > > > >> Nicky, thanks for looking into this. > > > > I am happy to be able to do something in return. python, scipy, and > > stats, made my life (in some respects :-) much easier. 
> > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From ralf.gommers at googlemail.com Mon May 7 16:18:47 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 7 May 2012 22:18:47 +0200 Subject: [SciPy-Dev] scipy.stats documentation In-Reply-To: References: Message-ID: On Mon, May 7, 2012 at 10:03 PM, Skipper Seabold wrote: > On Mon, May 7, 2012 at 3:41 PM, wrote: > > > > On Mon, May 7, 2012 at 2:20 PM, nicky van foreest > wrote: > > > > > > > > > >> Sometimes the parameterization in stats.distributions is "a bit > > >> difficult" to translate to the standard parameterization, example > > >> lognormal that regularly raises questions. > > > > > > Ok. Can we resolve this by making the doc-string more explicit, like > > > my example above? > > > > Yes > > > > > > > >> these are generic template numbers and could be replaced by > > >> distribution specific docstrings > > > > > > Ok. How about fixing part for part? > > > > If someone is going through individual distributions, this would be > > very good. (My initial worry a few years ago was that it will be > > difficult to maintain 90 individual docstrings.) > > > > Sure re-using the templates is useful (except when they generate > incorrect information such as the wrong number of args and non-integer > args, etc.), but it seems like (to me) confusion still reigns here. I > don't know how the docs work anymore, but is it possible to have a > wiki-editable 'Notes' section for each distribution? I think having a > lot more information individually would go a long way. > They do (the distname_gen pages, as Josef pointed out), with the exception of a few discrete distributions. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Mon May 7 16:23:44 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 7 May 2012 16:23:44 -0400 Subject: [SciPy-Dev] scipy.stats documentation In-Reply-To: References: Message-ID: On Mon, May 7, 2012 at 4:18 PM, Ralf Gommers wrote: > > > On Mon, May 7, 2012 at 10:03 PM, Skipper Seabold > wrote: >> >> On Mon, May 7, 2012 at 3:41 PM, wrote: >> > >> > On Mon, May 7, 2012 at 2:20 PM, nicky van foreest >> > wrote: >> > >> > >> > > >> > >> Sometimes the parameterization in stats.distributions is "a bit >> > >> difficult" to translate to the standard parameterization, example >> > >> lognormal that regularly raises questions. >> > > >> > > Ok. Can we resolve this by making the doc-string more explicit, like >> > > my example above? >> > >> > Yes >> > >> > > >> > >> these are generic template numbers and could be replaced by >> > >> distribution specific docstrings >> > > >> > > Ok. How about fixing part for part? >> > >> > If someone is going through individual distributions, this would be >> > very good. (My initial worry a few years ago was that it will be >> > difficult to maintain 90 individual docstrings.) >> > >> >> Sure re-using the templates is useful (except when they generate >> incorrect information such as the wrong number of args and non-integer >> args, etc.), but it seems like (to me) confusion still reigns here. I >> don't know how the docs work anymore, but is it possible to have a >> wiki-editable 'Notes' section for each distribution? 
I think having a >> lot more information individually would go a long way. > > > They do (the distname_gen pages, as Josef pointed out), with the exception > of a few discrete distributions. > Ah good. The link didn't work for me, but I see it in the source now. Skipper From josef.pktd at gmail.com Mon May 7 16:24:10 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 7 May 2012 16:24:10 -0400 Subject: [SciPy-Dev] scipy.stats documentation In-Reply-To: References: Message-ID: On Mon, May 7, 2012 at 4:03 PM, Skipper Seabold wrote: > On Mon, May 7, 2012 at 3:41 PM, wrote: >> >> On Mon, May 7, 2012 at 2:20 PM, nicky van foreest wrote: >> > Hi, >> > >> > Many thanks for your explanations. I think it is best to just respond >> > to the point below that are not yet completely clear to me. I'll >> > remove the rest. >> > >> >> All continuous distributions have a generic treatment of loc and scale >> >> independent of whether this is part of a standard definition of the >> >> distribution. (discrete distributions only have a loc shift.) >> >> (location-scale families of distributions) >> >> >> >> dist.cdf((x-loc)/scale) = dist.cdf(x, loc=loc, scale=scale) >> >> the standard distributions have loc=0, scale=1 >> >> pdf and other methods follow from this >> > >> > I know this, but it took me some searching to find this out (already a >> > few years ago.) I would like to make this somewhat easier to >> > understand for first time users. Besides this, if I read (for the >> > exponential) >> > >> > cdf(x) = 1-exp( -(x-loc)/scale) >> > >> > it is perfectly clear that 1/scale represents the rate. >> > >> > The confusing point is that text books, wikipedia, and so on, are >> > inconsistent. Hence, before I use any distribution in scipy.stats I >> > first run a few test to be sure that using scale works the way as I >> > expect. If, on the other hand, the docstring would be completely >> > explicit, including the use of loc and scale (even though it is >> > overkill as loc and scale are explained somewhere else), no confusions >> > should arise, at least not from my part. >> > >> > As an example, I think as a first time user the following doc string >> > would have helped me the most (Hopefully I am not too biased here): >> > >> > ? ?"""An exponential continuous random variable. >> > >> > ? ?%(before_notes)s >> > >> > ? ?Notes >> > ? ?----- >> > ? ?The cumulative distribution function for `expon` is:: >> > >> > ? ? ? ?expon.cdf(x) = 1. - exp(-(x-loc)/scale) >> > >> > ? ?To compute >> > >> > ? ? ? ? ?cdf(x) = exp(-lambda x ) >> > >> > ? ? it is required to take ``scale = 1.0/lambda``; since ``loc = 0`` >> > automatically it ? ? ? ? ? ?is not necessary to set ``loc = 0.`` >> > explicitly. >> > >> > ? ?The shape parameter is not implemented for ``expon`` as ``loc`` >> > and ``scale`` suffice. >> > >> > ? ?%(example)s >> > >> > ? ?""" >> >> Yes that looks good to me. >> a few problems: >> in most cases we have the pdf currently in the docs which is, I guess, >> more familiar to most users. >> I'm not sure having 1./scale in front and (x-loc)/scale inside makes >> the pdf easier to read or understand, but it's more explicit. 
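(Maybe a short check is the clearest way to state the generic rule; gamma is picked arbitrarily here, any continuous distribution with a shape parameter would do:

    import numpy as np
    from scipy import stats

    x, loc, scale, a = 1.3, 0.5, 2.0, 2.5
    z = (x - loc) / scale

    # the cdf is simply evaluated at (x - loc)/scale; the pdf picks up an extra 1/scale
    assert np.allclose(stats.gamma.cdf(x, a, loc=loc, scale=scale), stats.gamma.cdf(z, a))
    assert np.allclose(stats.gamma.pdf(x, a, loc=loc, scale=scale), stats.gamma.pdf(z, a) / scale)

that is all the loc/scale machinery does.)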
>> For many distributions, we don't have an explicit expression for the >> cdf, so loc and scale should still be understandable from the general >> documentation > > FWIW, every time I want to use a distribution, I have to open > Wikipedia, the online docs such as > http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/stats/continuous.rst/, > and look at the docstring of the frozen distribution. Even then it's > still not entirely clear to me how scale works in each case. I would > love to see verbose, hand-holding documentation ?for the > distributions. IMO it's not reasonable to ask users to understand the > high level of abstraction used here. > >> >> >> > >> >> Sometimes the parameterization in stats.distributions is "a bit >> >> difficult" to translate to the standard parameterization, example >> >> lognormal that regularly raises questions. >> > >> > Ok. Can we resolve this by making the doc-string more explicit, like >> > my example above? >> >> Yes >> >> > >> >> these are generic template numbers and could be replaced by >> >> distribution specific docstrings >> > >> > Ok. How about fixing part for part? >> >> If someone is going through individual distributions, this would be >> very good. (My initial worry a few years ago was that it will be >> difficult to maintain 90 individual docstrings.) >> > > Sure re-using the templates is useful (except when they generate > incorrect information such as the wrong number of args and non-integer > args, etc.), but it seems like (to me) confusion still reigns here. I > don't know how the docs work anymore, but is it possible to have a > wiki-editable 'Notes' section for each distribution? I think having a > lot more information individually would go a long way. A long time ago I proposed to break up the distribution documentation into individual pages http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/stats/continuous.rst/#continuous-random-variables (slow in loading) 150 (?) pages of formulas, and to cross-link them from the docstrings. Personally I like Wikipedia, and we don't want to duplicate that information. The main information that would be very helpful in the notes section is explaining different parameterizations, and how to translate the Wikipedia version to the stats.distributions parameterization. For some exotic distributions where no informative Wikipedia page exists, more information would be very helpful, but in many cases I had problems finding that information. Josef > >> > >> >> online editing is the easiest. >> > >> > Many of these points seem, at least to me, too minor to raise a >> > ticket, or am I mistaken here? >> >> no individual tickets are necessary. One possibility is a pull request >> with many changes. >> I usually prefer the online doc system. If you have edit permission, >> otherwise sign up and ping the list. >> >> here is the tutorial for editing >> http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/stats.rst/ >> >> the distribution docstrings are a bit trickier: >> don't edit the generated docstring of the instance, e.g. 
>> http://docs.scipy.org/scipy/docs/scipy.stats.expon/edit/ >> I think that would create a mess >> >> The docstring of the class with template is here >> http://docs.scipy.org/scipy/docs/scipy.stats.distributions.expon_gen/edit/ >> the only way I found the link is going through the milestones and look >> for ?xxx_gen >> http://docs.scipy.org/scipy/Milestones/Milestones_11/ >> >> >> > >> >> I wrote the stats tutorial a long time ago, and it contains the >> >> description of individual distributions written by Travis. >> >> I haven't looked at the overall documentation for the distributions in a while. >> > >> > Where can I find the tutorial? Then I'll try to add some description >> > about the shape parameter, and improve/add parts where necessary. >> >> http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/stats.rst/ >> or in the source >> https://github.com/scipy/scipy/blob/master/doc/source/tutorial/stats.rst >> >> Josef >> > >> >> >> >> Suggestions, or, even better, direct improvements in the doc editor or >> >> with pull request would be very welcome. >> > >> > I'll try that. >> > >> >> Nicky, thanks for looking into this. >> > >> > I am happy to be able to do something in return. python, scipy, and >> > stats, made my life (in some respects :-) much easier. >> > _______________________________________________ >> > SciPy-Dev mailing list >> > SciPy-Dev at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-dev >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From vanforeest at gmail.com Mon May 7 16:25:27 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Mon, 7 May 2012 22:25:27 +0200 Subject: [SciPy-Dev] scipy.stats documentation In-Reply-To: References: Message-ID: > Yes that looks good to me. > a few problems: > in most cases we have the pdf currently in the docs which is, I guess, > more familiar to most users. Ok. I agree that sticking to the pdf is best. > I'm not sure having 1./scale in front and (x-loc)/scale inside makes > the pdf easier to read or understand, but it's more explicit. I am also in doubt about what would be a good generic solution. The doc string for, for example, the gamma distribution will not become particularly easy to read, lots of x's... On the other hand, as Skipper points out below, I have the same problem as him when I want to use a distribution: I want to rely on the documentation, and not necessarily first check wikipedia, do some checking, and so on. I'll think about it a bit more. > If someone is going through individual distributions, this would be > very good. (My initial worry a few years ago was that it will be > difficult to maintain 90 individual docstrings.) Sure. However, I can start with three or four, and slowly expand. >> Many of these points seem, at least to me, too minor to raise a >> ticket, or am I mistaken here? > > no individual tickets are necessary. Hold on. I suppose you mean with "no" that I am mistaken, and that tickets are necessary. > the distribution docstrings are a bit trickier: > don't edit the generated docstring of the instance, e.g. 
> http://docs.scipy.org/scipy/docs/scipy.stats.expon/edit/ > I think that would create a mess I inferred that the doc-strings in distributions.py should be edited, and that the reference documentation is automatically created from these doc-strings. Hence, I intended to change distributions.py. If this is wrong, let me know. From jsseabold at gmail.com Mon May 7 16:38:30 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 7 May 2012 16:38:30 -0400 Subject: [SciPy-Dev] scipy.stats documentation In-Reply-To: References: Message-ID: On Mon, May 7, 2012 at 4:25 PM, nicky van foreest wrote: >> Yes that looks good to me. >> a few problems: >> in most cases we have the pdf currently in the docs which is, I guess, >> more familiar to most users. > > Ok. I agree that sticking to the pdf is best. > >> I'm not sure having 1./scale in front and (x-loc)/scale inside makes >> the pdf easier to read or understand, but it's more explicit. > > I am also in doubt about what would be a good generic solution. The > doc string for, for example, the gamma distribution will not become > particularly easy to read, lots of x's... On the other hand, as > Skipper points out below, I have the same problem as him when I want > to use a distribution: I want to rely on the documentation, and not > necessarily first check wikipedia, do some checking, and so on. I'll > think about it a bit more. Gamma was actually the use case I was thinking of. I once wrote a two-parameter gamma distribution only to realize (when Josef pointed it out) that I could've just used scale to get what I wanted for the second parameter... I don't know though. Just my anecdotal data point... > >> If someone is going through individual distributions, this would be >> very good. (My initial worry a few years ago was that it will be >> difficult to maintain 90 individual docstrings.) > > Sure. However, I can start with three or four, and slowly expand. > >>> Many of these points seem, at least to me, too minor to raise a >>> ticket, or am I mistaken here? >> >> no individual tickets are necessary. > > Hold on. I suppose you mean with ?"no" that I am mistaken, and that > tickets are necessary. > >> the distribution docstrings are a bit trickier: >> don't edit the generated docstring of the instance, e.g. >> http://docs.scipy.org/scipy/docs/scipy.stats.expon/edit/ >> I think that would create a mess > > I inferred that the doc-strings in distributions.py should be edited, > and that the reference documentation is automatically created from > these doc-strings. Hence, I intended to change distributions.py. If > this is wrong, let me know. > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From josef.pktd at gmail.com Mon May 7 16:48:43 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 7 May 2012 16:48:43 -0400 Subject: [SciPy-Dev] scipy.stats documentation In-Reply-To: References: Message-ID: On Mon, May 7, 2012 at 4:25 PM, nicky van foreest wrote: >> Yes that looks good to me. >> a few problems: >> in most cases we have the pdf currently in the docs which is, I guess, >> more familiar to most users. > > Ok. I agree that sticking to the pdf is best. > >> I'm not sure having 1./scale in front and (x-loc)/scale inside makes >> the pdf easier to read or understand, but it's more explicit. > > I am also in doubt about what would be a good generic solution. 
The > doc string for, for example, the gamma distribution will not become > particularly easy to read, lots of x's... On the other hand, as > Skipper points out below, I have the same problem as him when I want > to use a distribution: I want to rely on the documentation, and not > necessarily first check wikipedia, do some checking, and so on. I'll > think about it a bit more. > >> If someone is going through individual distributions, this would be >> very good. (My initial worry a few years ago was that it will be >> difficult to maintain 90 individual docstrings.) > > Sure. However, I can start with three or four, and slowly expand. > >>> Many of these points seem, at least to me, too minor to raise a >>> ticket, or am I mistaken here? >> >> no individual tickets are necessary. > > Hold on. I suppose you mean with ?"no" that I am mistaken, and that > tickets are necessary. "no" as in "individual tickets are *not* necessary" (or as in "I would prefer not to see any individual tickets if we can do it wholesale in a pull request") (or as in "a ticket to fix a missing comma has too much overhead") I like tickets for bugs, because that leaves a better (electronic) paper trail. explicit enough, this time :) Josef > >> the distribution docstrings are a bit trickier: >> don't edit the generated docstring of the instance, e.g. >> http://docs.scipy.org/scipy/docs/scipy.stats.expon/edit/ >> I think that would create a mess > > I inferred that the doc-strings in distributions.py should be edited, > and that the reference documentation is automatically created from > these doc-strings. Hence, I intended to change distributions.py. If > this is wrong, let me know. > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From vanforeest at gmail.com Mon May 7 16:53:01 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Mon, 7 May 2012 22:53:01 +0200 Subject: [SciPy-Dev] scipy.stats documentation In-Reply-To: References: Message-ID: > Gamma was actually the use case I was thinking of. I once wrote a > two-parameter gamma distribution only to realize (when Josef pointed > it out) that I could've just used scale to get what I wanted for the > second parameter... I don't know though. Just my anecdotal data > point... I can imagine. However, I have my doubts about including the loc and scale in the descritpion of the gamma distribution. That will become quite messy. So, currently, the string is a bit too terse, but the explicit version (with loc and scale included) will not be much better. How about this: """ gamma.pdf(x, a) = (lambda*x)**(a-1) * exp(-lambda*x) / gamma(a), where gamma(a) refers to the gamma function. Note that ``lambda`` can be changed by setting ``scale = 1./lambda``. '""" I also wonder whether an explicit ref in the documentation of each distribution to the stats tutorial would be helpful. Something like: For general background, see the tutorial. I also have in mind to give the loc, scale, and shape parameters a somewhat more prominent place in the tutorial. However, let me first try to get some simple things done. Up to now, I have been learning how to do things, but nothing has come out of my hands yet. 
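P.S. A quick numerical check of that wording against the current implementation (plain scipy.stats and scipy.special, nothing new; note that the density also needs a leading lambda factor so that it integrates to one):

    import numpy as np
    from scipy import stats
    from scipy.special import gamma as gamma_func

    a, lam = 3.0, 2.0                     # shape a, intended rate lambda
    x = np.linspace(0.1, 4.0, 9)

    # textbook gamma density written with a rate lambda ...
    pdf_rate = lam * (lam * x)**(a - 1) * np.exp(-lam * x) / gamma_func(a)
    # ... agrees with scipy's gamma once scale = 1/lambda
    assert np.allclose(stats.gamma.pdf(x, a, scale=1.0/lam), pdf_rate)

So ``scale = 1./lambda`` indeed gives the rate parameterization.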
From vanforeest at gmail.com Mon May 7 16:56:16 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Mon, 7 May 2012 22:56:16 +0200 Subject: [SciPy-Dev] scipy.stats documentation In-Reply-To: References: Message-ID: > "no" as in "individual tickets are *not* necessary" > (or as in "I would prefer not to see any individual tickets if we can > do it wholesale in a pull request") > (or as in "a ticket to fix a missing comma has too much overhead") > > I like tickets for bugs, because that leaves a better (electronic) paper trail. > > explicit enough, this time :) Yes :-) Very clear now. Tomorrow evening I'll improve some doc-strings and pull (push?) them via github. From gael.varoquaux at normalesup.org Mon May 7 19:12:10 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 8 May 2012 01:12:10 +0200 Subject: [SciPy-Dev] Announce: scikit-learn v0.11 Message-ID: <20120507231210.GM19857@phare.normalesup.org> On behalf of Andy Mueller, our release manager, I am happy to announce the 0.11 release of scikit-learn. This release includes some major new features such as randomized sparse models, gradient boosted regression trees, label propagation and many more. The release also has major improvements in the documentation and in stability. Details can be found on the [1]what's new page. We also have a new page with [2]video tutorials on machine learning with scikit-learn and different aspects of the package. Sources and windows binaries are available on sourceforge, through pypi (http://pypi.python.org/pypi/scikit-learn/0.11) or can be installed directly using pip: pip install -U scikit-learn Thanks again to all the contributors who made this release possible. Cheers, Ga?l 1. http://scikit-learn.org/stable/whats_new.html 2. http://scikit-learn.org/stable/presentations.html From tsyu80 at gmail.com Mon May 7 21:51:23 2012 From: tsyu80 at gmail.com (Tony Yu) Date: Mon, 7 May 2012 21:51:23 -0400 Subject: [SciPy-Dev] ndimage grey morphology tickets In-Reply-To: References: <4FA824B8.2060709@gmail.com> Message-ID: On Mon, May 7, 2012 at 3:58 PM, Tony Yu wrote: > > > On Mon, May 7, 2012 at 3:38 PM, Jonathan Helmus wrote: > >> All, >> >> Trac tickets #1135, #1281 and #1498 point out bugs in the >> ndimage.grey_dilation and grey_erosion function. I've started a branch >> that tries to address these issues: >> https://github.com/jjhelmus/scipy/tree/grey_morphology-fix >> >> This branch currently passes the tests in test_ndimage.py and the >> dilation_test.py file attached to ticket 1135, but I am not certain on >> two issues that I was hoping someone here might be able to comment on >> them: >> >> 1. Should there be a sanity check on the shape of footprint, structure >> and size when more than one is provided? >> 2. Ticket #1281 points out that grey_erosion does not have the >> parameter checking that grey_dilation has. I added these checks but >> noticed that to pass tests structure and footprint should not be >> reversed and the origin negated. Is this correct? If so the commented >> out lines in the branch should will be deleted. >> >> If this would be better discussed in a pull request I'd be happy to make >> one. >> >> http://projects.scipy.org/scipy/ticket/1135 >> http://projects.scipy.org/scipy/ticket/1281 >> http://projects.scipy.org/scipy/ticket/1498 >> >> >> Cheers, >> >> -Jonathan Helmus >> >> Hi Jonathan, > > I recently submitted a fix for ticket 1135, > but I realize now that I should have pinged the list. I didn't actually > know about tickets 1281 or 1498. 
I think the PR I submitted should also > take care of 1498, but I don't think 1281 is actually a bug. > > IIRC, the reason for the difference is that dilate shifts the origin of > the structuring element/footprint if its size is even-numbered (i.e. > doesn't have a "center" pixel). This shift makes dilation and erosion > reversible---otherwise applying one, then the other would shift > features. Are you sure the changes to `grey_erosion` are necessary? > > Also, I trimmed down the tests from PR 1135 (which seem to be the same > tests in your branch) so that they only test for the submitted change. I > think I originally wrote the tests (it's been more than 2 years so I could > be wrong about that), and I was just testing everything related to the > change. I trimmed it down because some tests replicated existing tests (and > `test_ndimage.py` is already really long). > > I haven't had a chance to carefully look over your branch to compare, but > I will tonight. > > -Tony > > After a closer look, I'm pretty sure your changes to `grey_dilate` do essentially the same thing as PR #199 . As I mentioned in the previous email, the reason `grey_dilate` isn't symmetric in appearance to `grey_erosion` is because it needs to shift the origin of `structure`/`footprint`. Also, after closer review, I'm pretty sure the changes to `grey_erosion` are unnecessary. Could you take a closer look, just to make sure? In particular, I believe the changes to `grey_erosion` are handled by `filter._min_or_max_filter`. (The preprocessing of arguments---e.g. calls to `asarray` and `_normalize_sequence`--- in `grey_dilation` is required to shift the origin, but this processing is redone in `_min_or_max_filter`.) -Tony -------------- next part -------------- An HTML attachment was scrubbed... URL: From jjhelmus at gmail.com Tue May 8 10:52:10 2012 From: jjhelmus at gmail.com (Jonathan Helmus) Date: Tue, 08 May 2012 10:52:10 -0400 Subject: [SciPy-Dev] ndimage grey morphology tickets In-Reply-To: References: <4FA824B8.2060709@gmail.com> Message-ID: <4FA9331A.3070603@gmail.com> Tony Yu wrote: > > > On Mon, May 7, 2012 at 3:58 PM, Tony Yu > wrote: > > > > On Mon, May 7, 2012 at 3:38 PM, Jonathan Helmus > > wrote: > > All, > > Trac tickets #1135, #1281 and #1498 point out bugs in the > ndimage.grey_dilation and grey_erosion function. I've started > a branch > that tries to address these issues: > https://github.com/jjhelmus/scipy/tree/grey_morphology-fix > > This branch currently passes the tests in test_ndimage.py and the > dilation_test.py file attached to ticket 1135, but I am not > certain on > two issues that I was hoping someone here might be able to > comment on them: > > 1. Should there be a sanity check on the shape of footprint, > structure > and size when more than one is provided? > 2. Ticket #1281 points out that grey_erosion does not have the > parameter checking that grey_dilation has. I added these > checks but > noticed that to pass tests structure and footprint should not be > reversed and the origin negated. Is this correct? If so the > commented > out lines in the branch should will be deleted. > > If this would be better discussed in a pull request I'd be > happy to make > one. > > http://projects.scipy.org/scipy/ticket/1135 > http://projects.scipy.org/scipy/ticket/1281 > http://projects.scipy.org/scipy/ticket/1498 > > > Cheers, > > -Jonathan Helmus > > Hi Jonathan, > > I recently submitted a fix for ticket 1135 > , but I realize now > that I should have pinged the list. 
I didn't actually know about > tickets 1281 or 1498. I think the PR I submitted should also take > care of 1498, but I don't think 1281 is actually a bug. > > IIRC, the reason for the difference is that dilate shifts the > origin of the structuring element/footprint if its size is > even-numbered (i.e. doesn't have a "center" pixel). This shift > makes dilation and erosion reversible---otherwise applying one, > then the other would shift features. Are you sure the changes to > `grey_erosion` are necessary? > > Also, I trimmed down the tests from PR 1135 (which seem to be the > same tests in your branch) so that they only test for the > submitted change. I think I originally wrote the tests (it's been > more than 2 years so I could be wrong about that), and I was just > testing everything related to the change. I trimmed it down > because some tests replicated existing tests (and > `test_ndimage.py` is already really long). > > I haven't had a chance to carefully look over your branch to > compare, but I will tonight. > > -Tony > > > After a closer look, I'm pretty sure your changes to `grey_dilate` do > essentially the same thing as PR #199 > . > > As I mentioned in the previous email, the reason `grey_dilate` isn't > symmetric in appearance to `grey_erosion` is because it needs to shift > the origin of `structure`/`footprint`. > > Also, after closer review, I'm pretty sure the changes to > `grey_erosion` are unnecessary. Could you take a closer look, just to > make sure? In particular, I believe the changes to `grey_erosion` are > handled by `filter._min_or_max_filter` > . > (The preprocessing of arguments---e.g. calls to `asarray` and > `_normalize_sequence`--- in `grey_dilation` is required to shift the > origin, but this processing is redone in `_min_or_max_filter`.) > > -Tony > Tony, I looked at your pull request and you are correct, no change is needed in grey_erosion and our two modification to grey_dilation have the same effect. As I mentioned in the pull request comment the following might be good to add: Raise a RuntimeError if footprint, structure and size are all None. Current ndimage.grey_dilation([1]) raises a TypeError at line 1428, which is not as helpful. If this sanity check is added something similar might be added to grey_erosion, which currently raises a RuntimeError at line 816 complaing about no footprint, but probably should mention that structure or size can also be provided. Update the doc for grey_dilation and grey_erosion to mention that size is optional if footprint or structure is provided. I'll watch your pull request and stop working on my branch as your branch is more mature. Cheers, -Jonathan Helmus From tsyu80 at gmail.com Tue May 8 15:02:25 2012 From: tsyu80 at gmail.com (Tony Yu) Date: Tue, 8 May 2012 15:02:25 -0400 Subject: [SciPy-Dev] ndimage grey morphology tickets In-Reply-To: <4FA9331A.3070603@gmail.com> References: <4FA824B8.2060709@gmail.com> <4FA9331A.3070603@gmail.com> Message-ID: On Tue, May 8, 2012 at 10:52 AM, Jonathan Helmus wrote: > Tony Yu wrote: > > > > > > On Mon, May 7, 2012 at 3:58 PM, Tony Yu > > wrote: > > > > > > > > On Mon, May 7, 2012 at 3:38 PM, Jonathan Helmus > > > wrote: > > > > All, > > > > Trac tickets #1135, #1281 and #1498 point out bugs in the > > ndimage.grey_dilation and grey_erosion function. 
I've started > > a branch > > that tries to address these issues: > > https://github.com/jjhelmus/scipy/tree/grey_morphology-fix > > > > This branch currently passes the tests in test_ndimage.py and the > > dilation_test.py file attached to ticket 1135, but I am not > > certain on > > two issues that I was hoping someone here might be able to > > comment on them: > > > > 1. Should there be a sanity check on the shape of footprint, > > structure > > and size when more than one is provided? > > 2. Ticket #1281 points out that grey_erosion does not have the > > parameter checking that grey_dilation has. I added these > > checks but > > noticed that to pass tests structure and footprint should not be > > reversed and the origin negated. Is this correct? If so the > > commented > > out lines in the branch should will be deleted. > > > > If this would be better discussed in a pull request I'd be > > happy to make > > one. > > > > http://projects.scipy.org/scipy/ticket/1135 > > http://projects.scipy.org/scipy/ticket/1281 > > http://projects.scipy.org/scipy/ticket/1498 > > > > > > Cheers, > > > > -Jonathan Helmus > > > > Hi Jonathan, > > > > I recently submitted a fix for ticket 1135 > > , but I realize now > > that I should have pinged the list. I didn't actually know about > > tickets 1281 or 1498. I think the PR I submitted should also take > > care of 1498, but I don't think 1281 is actually a bug. > > > > IIRC, the reason for the difference is that dilate shifts the > > origin of the structuring element/footprint if its size is > > even-numbered (i.e. doesn't have a "center" pixel). This shift > > makes dilation and erosion reversible---otherwise applying one, > > then the other would shift features. Are you sure the changes to > > `grey_erosion` are necessary? > > > > Also, I trimmed down the tests from PR 1135 (which seem to be the > > same tests in your branch) so that they only test for the > > submitted change. I think I originally wrote the tests (it's been > > more than 2 years so I could be wrong about that), and I was just > > testing everything related to the change. I trimmed it down > > because some tests replicated existing tests (and > > `test_ndimage.py` is already really long). > > > > I haven't had a chance to carefully look over your branch to > > compare, but I will tonight. > > > > -Tony > > > > > > After a closer look, I'm pretty sure your changes to `grey_dilate` do > > essentially the same thing as PR #199 > > . > > > > As I mentioned in the previous email, the reason `grey_dilate` isn't > > symmetric in appearance to `grey_erosion` is because it needs to shift > > the origin of `structure`/`footprint`. > > > > Also, after closer review, I'm pretty sure the changes to > > `grey_erosion` are unnecessary. Could you take a closer look, just to > > make sure? In particular, I believe the changes to `grey_erosion` are > > handled by `filter._min_or_max_filter` > > < > https://github.com/scipy/scipy/blob/master/scipy/ndimage/filters.py#L811>. > > (The preprocessing of arguments---e.g. calls to `asarray` and > > `_normalize_sequence`--- in `grey_dilation` is required to shift the > > origin, but this processing is redone in `_min_or_max_filter`.) > > > > -Tony > > > Tony, > > I looked at your pull request and you are correct, no change is > needed in grey_erosion and our two modification to grey_dilation have > the same effect. 
As I mentioned in the pull request comment the > following might be good to add: > > Raise a RuntimeError if footprint, structure and size are all None. > Current ndimage.grey_dilation([1]) raises a TypeError at line 1428, > which is not as helpful. If this sanity check is added something > similar might be added to grey_erosion, which currently raises a > RuntimeError at line 816 complaing about no footprint, but probably > should mention that structure or size can also be provided. > > Update the doc for grey_dilation and grey_erosion to mention that size > is optional if footprint or structure is provided. > > I'll watch your pull request and stop working on my branch as your > branch is more mature. > > Cheers, > > -Jonathan Helmus > Thanks for looking over the PR, Jonathan. I added one commit to check for the no-input case, and a second commit to update the docstring for (grey_) dilate, erosion, opening, closing, and morphological_gradient (I also verified that these functions do not require `size` to be specified if `structure` or `footprint` is specified). Could a dev with commit rights take a look at this? The main change is 4 lines of code (see first commit). The rest of the changes are just docstrings and tests. Thanks, -Tony -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanforeest at gmail.com Tue May 8 16:13:34 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Tue, 8 May 2012 22:13:34 +0200 Subject: [SciPy-Dev] scipy stats, doc string Message-ID: Hi, As Josef pointed out earlier, the Erlang distribution causes some further problems wrt the documentation too. The automatically generated documentation uses n = 0.9 as a shape parameter, but the documentation says that n should be an int. This is a bit silly of course. There are two ways to repair this IMO. 1) As Josef suggested: port Erlang to the gamma distribution and refer in the erlang doc string to the gamma distribution. I admit that I favor this. 2) Ensure that Erlang only accepts an int as a shape, implement an extra _arg_check (as is currently done), and change the string in distributions.py to automatically set the shape parameter. Specifically: https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L222 Would it be possible to change this line to something like >>> [ %(shapes)s ] = [%(shape_value)f,] * numargs Then it would be possible to pass a distribution dependent shape value. All in all option 1 seems the easiest, and from a user's point of view (I still regard myself a user of scipy.stats) sufficiently clear. Given this, I propose the following docstring for erlang: """An Erlang continuous random variable. %(before_notes)s See Also -------- gamma Notes ----- The Erlang distribution is a special case of the Gamma distribution, with the shape parameter ``a`` an integer. Refer to the ``gamma`` distribution for further examples. """ Note that I also remove the examples. In the gamma docsting, on the other hand, I include some extra info. I'll do a pull request this evening, including this change. Nicky From vanforeest at gmail.com Tue May 8 16:18:11 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Tue, 8 May 2012 22:18:11 +0200 Subject: [SciPy-Dev] scipy stats: doc strings of discrete distributions Message-ID: Hopefully you don't mind that I try to keep topics separated by sending separated mails. I noticed that the implementation of the doc strings differ for rv_continuous and rv_discrete. 
rv_continuous implementations look like class expon_gen(....): """ docs... """ def _rvs(...) expon = expon_gen(....) On the other hand, binom looks like class binom_gen(rv_discrete): def _rvs(self, n, p): binom = binom_gen(name='binom',shapes="n, p",extradoc=""" Binomial distribution """ Any objections against moving the extra doc and make the discrete distributions look more like the continuous ones? From josef.pktd at gmail.com Tue May 8 16:31:57 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 8 May 2012 16:31:57 -0400 Subject: [SciPy-Dev] scipy stats, doc string In-Reply-To: References: Message-ID: On Tue, May 8, 2012 at 4:13 PM, nicky van foreest wrote: > Hi, > > As Josef pointed out earlier, the Erlang distribution causes some > further problems wrt the documentation too. The automatically > generated documentation uses n = 0.9 as a shape parameter, but the > documentation says that n should be an int. This is a bit silly of > course. There are two ways to repair this IMO. > > 1) As Josef suggested: port Erlang to the gamma distribution and refer > in the erlang doc string to the gamma distribution. I admit that I > favor this. > > 2) Ensure that Erlang only accepts an int as a shape, implement an > extra _arg_check (as is currently done), and change > the string in distributions.py to automatically set the shape > parameter. Specifically: > > https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L222 > > Would it be possible to change this line to something like > >>>> [ %(shapes)s ] = [%(shape_value)f,] * numargs I think so, in general, http://projects.scipy.org/scipy/ticket/1581 However it wouldn't work as in your example, since shape_value might be empty or a list_like (several shape parameters) each distribution would need a new attribute, example_shape_values (?) (or maybe default inherited is empty) I'm not sure what to do about Erlang anymore. Josef > > Then it would be possible to pass a distribution dependent shape value. > > All in all option 1 seems the easiest, and from a user's point of view > (I still regard myself a user of scipy.stats) sufficiently clear. > Given this, I propose the following docstring for erlang: > > > ? ?"""An Erlang continuous random variable. > > ? ?%(before_notes)s > > ? ?See Also > ? ?-------- > ? ?gamma > > ? ?Notes > ? ?----- > ? ?The Erlang distribution is a special case of the Gamma > ? ?distribution, with the shape parameter ``a`` an integer. Refer to > ? ?the ``gamma`` distribution for further examples. > > ? ?""" > > Note that I also remove the examples. In the gamma docsting, on the > other hand, I include some extra info. I'll do a pull request this > evening, including this change. > > > > > Nicky > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From josef.pktd at gmail.com Tue May 8 16:35:13 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 8 May 2012 16:35:13 -0400 Subject: [SciPy-Dev] scipy stats: doc strings of discrete distributions In-Reply-To: References: Message-ID: On Tue, May 8, 2012 at 4:18 PM, nicky van foreest wrote: > Hopefully you don't mind that I try to keep topics separated by > sending separated mails. > > I noticed that the implementation of the doc strings differ for > rv_continuous and rv_discrete. ?rv_continuous implementations look > like > > class expon_gen(....): > ? """ > ? docs... > ? """ > ? def _rvs(...) > > expon = expon_gen(....) 
> > > On the other hand, binom looks like > > class binom_gen(rv_discrete): > ? ?def _rvs(self, n, p): > > binom = binom_gen(name='binom',shapes="n, p",extradoc=""" > > Binomial distribution > """ > > Any objections against moving the extra doc and make the discrete > distributions look more like the continuous ones? No objections, they should also be converted to distribution specific docstring templates as Ralf did for the continuous distributions. Josef > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From vanforeest at gmail.com Tue May 8 17:47:20 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Tue, 8 May 2012 23:47:20 +0200 Subject: [SciPy-Dev] scipy stats: doc strings of discrete distributions In-Reply-To: References: Message-ID: Hi, I just committed some changes and issues a pull request. I don't know why but I cannot notify anybody about my changes. At least, there is no list of people (which should be there according to the github docs that I am reading) that I can notify. I'll try to see tomorrow how to resolve this. I made the pull request as user nokfi, in case this helps finding the commits. Once all this works, I'll try my hands on some more serious things, like solving for the ppf() Josef and I have been discussing previously. NIcky On 8 May 2012 22:35, wrote: > On Tue, May 8, 2012 at 4:18 PM, nicky van foreest wrote: >> Hopefully you don't mind that I try to keep topics separated by >> sending separated mails. >> >> I noticed that the implementation of the doc strings differ for >> rv_continuous and rv_discrete. ?rv_continuous implementations look >> like >> >> class expon_gen(....): >> ? """ >> ? docs... >> ? """ >> ? def _rvs(...) >> >> expon = expon_gen(....) >> >> >> On the other hand, binom looks like >> >> class binom_gen(rv_discrete): >> ? ?def _rvs(self, n, p): >> >> binom = binom_gen(name='binom',shapes="n, p",extradoc=""" >> >> Binomial distribution >> """ >> >> Any objections against moving the extra doc and make the discrete >> distributions look more like the continuous ones? > > No objections, they should also be converted to distribution specific > docstring templates as Ralf did for the continuous distributions. > > Josef > >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From ralf.gommers at googlemail.com Wed May 9 14:01:15 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 9 May 2012 20:01:15 +0200 Subject: [SciPy-Dev] scipy stats: doc strings of discrete distributions In-Reply-To: References: Message-ID: On Tue, May 8, 2012 at 11:47 PM, nicky van foreest wrote: > Hi, > > I just committed some changes and issues a pull request. I don't know > why but I cannot notify anybody about my changes. At least, there is > no list of people (which should be there according to the github docs > that I am reading) that I can notify. I'll try to see tomorrow how to > resolve this. > Can you point me to the github doc that was unclear? > I made the pull request as user nokfi, in case this helps finding the > commits. > > You did everything correctly and your PR looks good (didn't check in detail yet). 
Committers get automatically notified for PRs, and your email notified the scipy-dev list as requested. In the howto-contribute document it says: "When you send the PR for a new feature, be sure to also mention this on the scipy-dev mailing list. This can prompt interested people to help review your PR." I'll add to that that committers automatically get notified already. Cheers, Ralf > Once all this works, I'll try my hands on some more serious things, > like solving for the ppf() Josef and I have been discussing > previously. > > NIcky > > On 8 May 2012 22:35, wrote: > > On Tue, May 8, 2012 at 4:18 PM, nicky van foreest > wrote: > >> Hopefully you don't mind that I try to keep topics separated by > >> sending separated mails. > >> > >> I noticed that the implementation of the doc strings differ for > >> rv_continuous and rv_discrete. rv_continuous implementations look > >> like > >> > >> class expon_gen(....): > >> """ > >> docs... > >> """ > >> def _rvs(...) > >> > >> expon = expon_gen(....) > >> > >> > >> On the other hand, binom looks like > >> > >> class binom_gen(rv_discrete): > >> def _rvs(self, n, p): > >> > >> binom = binom_gen(name='binom',shapes="n, p",extradoc=""" > >> > >> Binomial distribution > >> """ > >> > >> Any objections against moving the extra doc and make the discrete > >> distributions look more like the continuous ones? > > > > No objections, they should also be converted to distribution specific > > docstring templates as Ralf did for the continuous distributions. > > > > Josef > > > >> _______________________________________________ > >> SciPy-Dev mailing list > >> SciPy-Dev at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fastcluster at math.stanford.edu Wed May 9 14:51:30 2012 From: fastcluster at math.stanford.edu (=?ISO-8859-1?Q?Daniel_M=FCllner?=) Date: Wed, 09 May 2012 11:51:30 -0700 Subject: [SciPy-Dev] Suggestion: Fast hierarchical clustering Message-ID: <4FAABCB2.7040309@math.stanford.edu> Dear SciPy developers, I am the author of a package for fast hierarchical clustering: http://cran.r-project.org/web/packages/fastcluster/ (The C++ library has two interfaces for R and Python, hence the source code is published on CRAN.) The package improves the time complexity of the current algorithms in scipy.cluster.hierarchy from O(N^3) to O(N^2). The syntax of the Python interface agrees with the SciPy methods, so that users can quickly switch to the faster algorithms. But really, the best place for the clustering package would be to incorporate it into SciPy so that everyone could use the faster algorithms by default. Here are my questions: (1) Is there sufficient interest to replace scipy.cluster.hierarchy.linkage by faster code so that it makes the effort worthwhile for me? (2) To whom can I submit the suggested changes? Making a SciKit as described on http://www.scipy.org/Developer_Zone does not seem the right approach here since I am not offering a new, independent package but replacement of certain code within the existing module scipy.cluster.hierarchy. (3) Who decides whether the suggested changes are accepted or not? 
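For reference, switching existing code over is meant to be a one-line change, since the linkage matrix format is the same (a rough sketch only -- the exact keyword arguments should be checked against the package documentation, so treat the call below as an assumption rather than a spec):

>>> import numpy as np
>>> import fastcluster
>>> from scipy.cluster.hierarchy import fcluster
>>> X = np.random.rand(100, 3)
>>> # Z = scipy.cluster.hierarchy.linkage(X, method='average')   # current O(N^3) route
>>> Z = fastcluster.linkage(X, method='average')                  # proposed O(N^2) replacement
>>> labels = fcluster(Z, t=5, criterion='maxclust')               # downstream SciPy functions keep working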
Best, Daniel Here are a few facts that you might want to know: * The core library is in C++ since I use templates a lot. However, I already took care that the R interface compiles on a variety of systems, see: http://cran.r-project.org/web/checks/check_results_fastcluster.html Therefore, I don't expect compilation issues for the Python interface. * The license is currently GPL. I am willing to publish the code under a different license if this is required for SciPy. * The latest version (not published yet) compiles and works under Python 2 and 3, so there are also no issues here. From vanderplas at astro.washington.edu Wed May 9 15:44:36 2012 From: vanderplas at astro.washington.edu (Jacob VanderPlas) Date: Wed, 09 May 2012 12:44:36 -0700 Subject: [SciPy-Dev] Suggestion: Fast hierarchical clustering In-Reply-To: <4FAABCB2.7040309@math.stanford.edu> References: <4FAABCB2.7040309@math.stanford.edu> Message-ID: <4FAAC924.7020301@astro.washington.edu> Hi Daniel Daniel M?llner wrote: > > > (1) Is there sufficient interest to replace > scipy.cluster.hierarchy.linkage by faster code so that it makes the > effort worthwhile for me? > I'm not sure, but you've definitely asked in the right place. If people have a strong opinion either way, they should let you know. > (2) To whom can I submit the suggested changes? Making a SciKit as > described on http://www.scipy.org/Developer_Zone does not seem the right > approach here since I am not offering a new, independent package but > replacement of certain code within the existing module > scipy.cluster.hierarchy. > > Code is submitted for inclusion in scipy through github. If it seems like there is interest in this submission, you can get an account on github.com and clone the scipy source code, incorporate your changes, and do a pull request where the code will be reviewed. > (3) Who decides whether the suggested changes are accepted or not? > Decisions are made by consensus on this list and through the github review process. There are some more detailed documentation notes in this Pull Request: https://github.com/scipy/scipy/pull/184/files Good luck! Jake From vanforeest at gmail.com Wed May 9 16:07:08 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Wed, 9 May 2012 22:07:08 +0200 Subject: [SciPy-Dev] scipy stats: doc strings of discrete distributions In-Reply-To: References: Message-ID: Hi, Thanks for the feedback. >> I just committed some changes and issues a pull request. I don't know >> why but I cannot notify anybody about my changes. At least, there is >> no list of people (which should be there according to the github docs >> that I am reading) that I can notify. I'll try to see tomorrow how to >> resolve this. > > > Can you point me to the github doc that was unclear? I looked at this: http://help.github.com/send-pull-requests/ In particular I read the following: "After pressing the Pull Request button, you are presented with a preview page where you can enter a title and optional description, see exactly what commits will be included when the pull request is sent, and also see who the pull request will be sent to:" Now, on a second reading, I realize that the list of receivers will be shown automatically, once I check the preview page. For some reason this was not clear to me yesterday evening. I suppose a combination of confusion and eagerness to see results. > You did everything correctly and your PR looks good (didn't check in detail > yet). 
Committers get automatically notified for PRs, and your email notified > the scipy-dev list as requested. In the howto-contribute document it says: > > "When you send the PR for a new feature, be sure to also mention this on the > scipy-dev mailing list.? This can prompt interested people to help review > your PR." > > I'll add to that that committers automatically get notified already. Thanks. Nicky > > Cheers, > Ralf > >> >> Once all this works, I'll try my hands on some more serious things, >> like solving for the ppf() Josef and I have been discussing >> previously. >> >> NIcky >> >> On 8 May 2012 22:35, ? wrote: >> > On Tue, May 8, 2012 at 4:18 PM, nicky van foreest >> > wrote: >> >> Hopefully you don't mind that I try to keep topics separated by >> >> sending separated mails. >> >> >> >> I noticed that the implementation of the doc strings differ for >> >> rv_continuous and rv_discrete. ?rv_continuous implementations look >> >> like >> >> >> >> class expon_gen(....): >> >> ? """ >> >> ? docs... >> >> ? """ >> >> ? def _rvs(...) >> >> >> >> expon = expon_gen(....) >> >> >> >> >> >> On the other hand, binom looks like >> >> >> >> class binom_gen(rv_discrete): >> >> ? ?def _rvs(self, n, p): >> >> >> >> binom = binom_gen(name='binom',shapes="n, p",extradoc=""" >> >> >> >> Binomial distribution >> >> """ >> >> >> >> Any objections against moving the extra doc and make the discrete >> >> distributions look more like the continuous ones? >> > >> > No objections, they should also be converted to distribution specific >> > docstring templates as Ralf did for the continuous distributions. >> > >> > Josef >> > >> >> _______________________________________________ >> >> SciPy-Dev mailing list >> >> SciPy-Dev at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> > _______________________________________________ >> > SciPy-Dev mailing list >> > SciPy-Dev at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-dev >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From robert.kern at gmail.com Wed May 9 16:22:16 2012 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 9 May 2012 21:22:16 +0100 Subject: [SciPy-Dev] Suggestion: Fast hierarchical clustering In-Reply-To: <4FAABCB2.7040309@math.stanford.edu> References: <4FAABCB2.7040309@math.stanford.edu> Message-ID: On Wed, May 9, 2012 at 7:51 PM, Daniel M?llner wrote: > * The license is currently GPL. I am willing to publish the code under a > different license if this is required for SciPy. Yes, we require a BSD-like license. https://github.com/scipy/scipy/blob/master/LICENSE.txt -- Robert Kern From pav at iki.fi Wed May 9 19:34:49 2012 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 10 May 2012 01:34:49 +0200 Subject: [SciPy-Dev] Suggestion: Fast hierarchical clustering In-Reply-To: <4FAAC924.7020301@astro.washington.edu> References: <4FAABCB2.7040309@math.stanford.edu> <4FAAC924.7020301@astro.washington.edu> Message-ID: 09.05.2012 21:44, Jacob VanderPlas kirjoitti: > Daniel M?llner wrote: [clip] > > (1) Is there sufficient interest to replace > > scipy.cluster.hierarchy.linkage by faster code so that it makes the > > effort worthwhile for me? > > I'm not sure, but you've definitely asked in the right place. 
If people > have a strong opinion either way, they should let you know. Based on a quick look, this looks like an useful addition to me. Your code looks basically like completed careful work, and is modeled after the existing functions, so dropping it in scipy.cluster is probably quite straightforward. The only possible question is whether it's fully backwards compatible --- this seems to be so, and in any case the current scipy.cluster test suite seems complete enough to be able to check this easily. > > (2) To whom can I submit the suggested changes? Making a SciKit as > > described on http://www.scipy.org/Developer_Zone does not seem the right > > approach here since I am not offering a new, independent package but > > replacement of certain code within the existing module > > scipy.cluster.hierarchy. > > Code is submitted for inclusion in scipy through github. If it seems > like there is interest in this submission, you can get an account on > github.com and clone the scipy source code, incorporate your changes, > and do a pull request where the code will be reviewed. Here's some more general information, in the works: https://github.com/scipy/scipy/pull/191/files The "Developer_Zone" page is unfortunately somewhat out of date, and should be updated once we finish writing these instructions. > > (3) Who decides whether the suggested changes are accepted or not? > > Decisions are made by consensus on this list and through the github > review process. There are some more detailed documentation notes in > this Pull Request: https://github.com/scipy/scipy/pull/184/files In this case, I think it seems pretty likely that there are no objections to merging this --- correctness can be checked with the existing tests, and there are no questions about the scope or the API, as it's a replacement of the underlying algorithm only. -- Pauli Virtanen From lists at hilboll.de Thu May 10 05:52:58 2012 From: lists at hilboll.de (Andreas H.) Date: Thu, 10 May 2012 11:52:58 +0200 Subject: [SciPy-Dev] STL / LOESS seasonal trend decomposition Message-ID: <4FAB8FFA.7060301@hilboll.de> Hi, googling for a Python implementation of STL (seasonal trend decomposition based on Loess, see http://cs.wellesley.edu/~cs315/Papers/stl%20statistical%20model.pdf) found this: http://projects.scipy.org/scipy/browser/trunk/scipy/sandbox/pyloess/mpyloess.py?rev=3473 Can anyone tell me something about the status of this code? I'll probably need this for my work anyway, and thought it might be nice to bring it to a level at which it could be included into some package. Where would it belong? scipy.stats? statsmodels? Cheers, Andreas. From jsseabold at gmail.com Thu May 10 09:14:47 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 10 May 2012 09:14:47 -0400 Subject: [SciPy-Dev] STL / LOESS seasonal trend decomposition In-Reply-To: <4FAB8FFA.7060301@hilboll.de> References: <4FAB8FFA.7060301@hilboll.de> Message-ID: On Thu, May 10, 2012 at 5:52 AM, Andreas H. wrote: > Hi, > > googling for a Python implementation of STL (seasonal trend > decomposition based on Loess, see > http://cs.wellesley.edu/~cs315/Papers/stl%20statistical%20model.pdf) > found this: > > > http://projects.scipy.org/scipy/browser/trunk/scipy/sandbox/pyloess/mpyloess.py?rev=3473 > > Can anyone tell me something about the status of this code? I'll > probably need this for my work anyway, and thought it might be nice to > bring it to a level at which it could be included into some package. > Where would it belong? scipy.stats? statsmodels? 
> Statsmodels would definitely be interested in including something like this. It doesn't look like the original implementation is BSD compatible with SciPy/statsmodels though because of the "without fee" http://projects.scipy.org/scipy/browser/trunk/scipy/sandbox/pyloess/LICENSE.txt?rev=3473 If you can code something up that doesn't rely on this, then by all means. If it helps (I don't know the details of STL yet), we have a LOWESS implementation here. http://statsmodels.sourceforge.net/devel/generated/statsmodels.nonparametric.api.lowess.html#statsmodels.nonparametric.api.lowess Any feedback on speed would be welcome, if you're comparing with the above code. I've also started working on Seasonal ARIMA models in a branch of my fork (jseabold) of statsmodels. I'd be interested to compare STL to our other filtering methods used to decompose economic time series into trend and cycle components http://statsmodels.sourceforge.net/devel/tsa.html#other-time-series-filters Skipper From robert.kern at gmail.com Thu May 10 09:20:56 2012 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 10 May 2012 14:20:56 +0100 Subject: [SciPy-Dev] [pystatsmodels] Re: STL / LOESS seasonal trend decomposition In-Reply-To: References: <4FAB8FFA.7060301@hilboll.de> Message-ID: On Thu, May 10, 2012 at 2:14 PM, Skipper Seabold wrote: > Statsmodels would definitely be interested in including something like > this. It doesn't look like the original implementation is BSD > compatible with SciPy/statsmodels though because of the "without fee" > > http://projects.scipy.org/scipy/browser/trunk/scipy/sandbox/pyloess/LICENSE.txt?rev=3473 No, that's fine. What it means is "You don't need to pay the authors a fee in order to use, copy, modify, etc." It's a standard part of such MIT-like licenses which are BSD-compatible: http://www.xfree86.org/current/LICENSE5.html -- Robert Kern From jsseabold at gmail.com Thu May 10 09:37:40 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 10 May 2012 09:37:40 -0400 Subject: [SciPy-Dev] [pystatsmodels] Re: STL / LOESS seasonal trend decomposition In-Reply-To: References: <4FAB8FFA.7060301@hilboll.de> Message-ID: On Thu, May 10, 2012 at 9:20 AM, Robert Kern wrote: > On Thu, May 10, 2012 at 2:14 PM, Skipper Seabold wrote: > >> Statsmodels would definitely be interested in including something like >> this. It doesn't look like the original implementation is BSD >> compatible with SciPy/statsmodels though because of the "without fee" >> >> http://projects.scipy.org/scipy/browser/trunk/scipy/sandbox/pyloess/LICENSE.txt?rev=3473 > > No, that's fine. What it means is "You don't need to pay the authors a > fee in order to use, copy, modify, etc." It's a standard part of such > MIT-like licenses which are BSD-compatible: > > ?http://www.xfree86.org/current/LICENSE5.html > Ah, ok. Great then. Thanks, Skipper From pgmdevlist at gmail.com Thu May 10 10:02:06 2012 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 10 May 2012 16:02:06 +0200 Subject: [SciPy-Dev] STL / LOESS seasonal trend decomposition In-Reply-To: References: <4FAB8FFA.7060301@hilboll.de> Message-ID: Oh my, old memories... I haven't touched that piece of code in 5 years. The package was stuck in the sandbox because nobody really tested it but me, and I ended not really needing STL anyway... At that time, the MaskedArray-as-subclass implementation wasn't part of numpy per se. 
Some trivial work will be needed in renaming some of the imports and we need to think about future adaptations (w/ or w/o NA), but that should remain rather easy. Don't hesitate to contact me offlist if needed (faster answers that way). On May 10, 2012 3:15 PM, "Skipper Seabold" wrote: > On Thu, May 10, 2012 at 5:52 AM, Andreas H. wrote: > > Hi, > > > > googling for a Python implementation of STL (seasonal trend > > decomposition based on Loess, see > > http://cs.wellesley.edu/~cs315/Papers/stl%20statistical%20model.pdf) > > found this: > > > > > > > http://projects.scipy.org/scipy/browser/trunk/scipy/sandbox/pyloess/mpyloess.py?rev=3473 > > > > Can anyone tell me something about the status of this code? I'll > > probably need this for my work anyway, and thought it might be nice to > > bring it to a level at which it could be included into some package. > > Where would it belong? scipy.stats? statsmodels? > > > > Statsmodels would definitely be interested in including something like > this. It doesn't look like the original implementation is BSD > compatible with SciPy/statsmodels though because of the "without fee" > > > http://projects.scipy.org/scipy/browser/trunk/scipy/sandbox/pyloess/LICENSE.txt?rev=3473 > > If you can code something up that doesn't rely on this, then by all > means. If it helps (I don't know the details of STL yet), we have a > LOWESS implementation here. > > > http://statsmodels.sourceforge.net/devel/generated/statsmodels.nonparametric.api.lowess.html#statsmodels.nonparametric.api.lowess > > Any feedback on speed would be welcome, if you're comparing with the > above code. I've also started working on Seasonal ARIMA models in a > branch of my fork (jseabold) of statsmodels. I'd be interested to > compare STL to our other filtering methods used to decompose economic > time series into trend and cycle components > > http://statsmodels.sourceforge.net/devel/tsa.html#other-time-series-filters > > Skipper > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at hilboll.de Thu May 10 11:46:12 2012 From: lists at hilboll.de (Andreas H.) Date: Thu, 10 May 2012 17:46:12 +0200 Subject: [SciPy-Dev] STL / LOESS seasonal trend decomposition In-Reply-To: <4FAB8FFA.7060301@hilboll.de> References: <4FAB8FFA.7060301@hilboll.de> Message-ID: <4FABE2C4.1070009@hilboll.de> On 10.05.2012 11:52, Andreas H. wrote: > Hi, > > googling for a Python implementation of STL (seasonal trend > decomposition based on Loess, see > http://cs.wellesley.edu/~cs315/Papers/stl%20statistical%20model.pdf) > found this: > > > http://projects.scipy.org/scipy/browser/trunk/scipy/sandbox/pyloess/mpyloess.py?rev=3473 > > Can anyone tell me something about the status of this code? I'll > probably need this for my work anyway, and thought it might be nice to > bring it to a level at which it could be included into some package. > Where would it belong? scipy.stats? statsmodels? For a start (cosmetics, purity, whatever ...) I'd like to clone the code from the old SVN to a local GIT repo, to still have the version history. Any idea how I can do that? It seems the old SVN isn't alive any more ... Andreas. From lists at hilboll.de Thu May 10 11:52:23 2012 From: lists at hilboll.de (Andreas H.) 
Date: Thu, 10 May 2012 17:52:23 +0200 Subject: [SciPy-Dev] STL / LOESS seasonal trend decomposition In-Reply-To: References: <4FAB8FFA.7060301@hilboll.de> Message-ID: <4FABE437.90908@hilboll.de> On 10.05.2012 16:02, Pierre GM wrote: > Oh my, old memories... I haven't touched that piece of code in 5 years. > The package was stuck in the sandbox because nobody really tested it > but me, and I ended not really needing STL anyway... > At that time, the MaskedArray-as-subclass implementation wasn't part of > numpy per se. Some trivial work will be needed in renaming some of the > imports and we need to think about future adaptations (w/ or w/o NA), > but that should remain rather easy. So basically the code was working for you, and it's a good point to start to implement STL? > Don't hesitate to contact me offlist > if needed (faster answers that way). Thanks, Pierre - I'll do that as soon as it becomes necessary. Andreas. From pgmdevlist at gmail.com Thu May 10 12:28:02 2012 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 10 May 2012 18:28:02 +0200 Subject: [SciPy-Dev] STL / LOESS seasonal trend decomposition In-Reply-To: <4FABE437.90908@hilboll.de> References: <4FAB8FFA.7060301@hilboll.de> <4FABE437.90908@hilboll.de> Message-ID: The code is just a wrapper around the original library. It worked fine for me and my needs at the time, but I felt a wider testing would not hurt. I do think statmodels would be its natural home. On May 10, 2012 5:52 PM, "Andreas H." wrote: > On 10.05.2012 16:02, Pierre GM wrote: > > Oh my, old memories... I haven't touched that piece of code in 5 years. > > The package was stuck in the sandbox because nobody really tested it > > but me, and I ended not really needing STL anyway... > > At that time, the MaskedArray-as-subclass implementation wasn't part of > > numpy per se. Some trivial work will be needed in renaming some of the > > imports and we need to think about future adaptations (w/ or w/o NA), > > but that should remain rather easy. > > So basically the code was working for you, and it's a good point to > start to implement STL? > > > Don't hesitate to contact me offlist > > if needed (faster answers that way). > > Thanks, Pierre - I'll do that as soon as it becomes necessary. > > Andreas. > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Thu May 10 15:00:26 2012 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 10 May 2012 21:00:26 +0200 Subject: [SciPy-Dev] STL / LOESS seasonal trend decomposition In-Reply-To: <4FABE2C4.1070009@hilboll.de> References: <4FAB8FFA.7060301@hilboll.de> <4FABE2C4.1070009@hilboll.de> Message-ID: 10.05.2012 17:46, Andreas H. kirjoitti: [clip] > For a start (cosmetics, purity, whatever ...) I'd like to clone the code > from the old SVN to a local GIT repo, to still have the version history. > Any idea how I can do that? It seems the old SVN isn't alive any more ... The version history of the sandbox is preserved in Scipy's Git repository. Try git log -- scipy/sandbox/pyloess Should be easy to resurrect. 
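To actually pull the files back out of history, something along these lines should do it (an untested sketch; <sha> is a placeholder for whichever commit the log above shows as the last one touching that directory):

git checkout <sha> -- scipy/sandbox/pyloess
# or, for a single file:
git show <sha>:scipy/sandbox/pyloess/mpyloess.py > mpyloess.py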
Pauli From vanforeest at gmail.com Thu May 10 16:42:24 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Thu, 10 May 2012 22:42:24 +0200 Subject: [SciPy-Dev] documentation and sphinx Message-ID: Hi, I am happily changing some documentation for scipy stats, and in the process I am assuming that I can type latex commands just as I do in Sphinx. (I also built my home page in sphinx). AFAIK, shpinx makes pngs of the formulas, and I had to change some settings in my config file for sphinx to enable this Can I rely on the shpinx version used for the numpy/scipy doc that formulas are converted to pngs, or do mathematical formulas end up very convoluted? Nicky From josef.pktd at gmail.com Thu May 10 16:46:17 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 10 May 2012 16:46:17 -0400 Subject: [SciPy-Dev] documentation and sphinx In-Reply-To: References: Message-ID: On Thu, May 10, 2012 at 4:42 PM, nicky van foreest wrote: > Hi, > > I am happily changing some documentation for scipy stats, and in the > process I am assuming that I can type latex commands just as I do in > Sphinx. (I also built my home page in sphinx). AFAIK, shpinx makes > pngs of the formulas, and I had to change some settings in my config > file for sphinx to enable this ?Can I rely on the shpinx version used > for the numpy/scipy doc that formulas are converted to pngs, or do > mathematical formulas end up very convoluted? latex math works well, you can check any of the scipy tutorials e.g. http://docs.scipy.org/doc/scipy/reference/_sources/tutorial/stats/discrete.txt Josef > > Nicky > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From ralf.gommers at googlemail.com Thu May 10 16:46:23 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 10 May 2012 22:46:23 +0200 Subject: [SciPy-Dev] documentation and sphinx In-Reply-To: References: Message-ID: On Thu, May 10, 2012 at 10:42 PM, nicky van foreest wrote: > Hi, > > I am happily changing some documentation for scipy stats, and in the > process I am assuming that I can type latex commands just as I do in > Sphinx. (I also built my home page in sphinx). AFAIK, shpinx makes > pngs of the formulas, and I had to change some settings in my config > file for sphinx to enable this Can I rely on the shpinx version used > for the numpy/scipy doc that formulas are converted to pngs, or do > mathematical formulas end up very convoluted? > You can use LaTeX with the Sphinx .. math:: directive, but please do so sparingly - preferably confine it to the Notes section. The reason for that is that many users read docstrings in the terminal as plain text, and LaTeX isn't known for being very readable. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu May 10 16:47:07 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 10 May 2012 16:47:07 -0400 Subject: [SciPy-Dev] documentation and sphinx In-Reply-To: References: Message-ID: On Thu, May 10, 2012 at 4:46 PM, wrote: > On Thu, May 10, 2012 at 4:42 PM, nicky van foreest wrote: >> Hi, >> >> I am happily changing some documentation for scipy stats, and in the >> process I am assuming that I can type latex commands just as I do in >> Sphinx. (I also built my home page in sphinx). 
AFAIK, shpinx makes >> pngs of the formulas, and I had to change some settings in my config >> file for sphinx to enable this ?Can I rely on the shpinx version used >> for the numpy/scipy doc that formulas are converted to pngs, or do >> mathematical formulas end up very convoluted? > > latex math works well, you can check any of the scipy tutorials > > e.g. > http://docs.scipy.org/doc/scipy/reference/_sources/tutorial/stats/discrete.txt link for rendered http://docs.scipy.org/doc/scipy/reference/tutorial/stats/discrete.html#percent-point-function-inverse-cdf > > Josef > >> >> Nicky >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev From warren.weckesser at enthought.com Thu May 10 16:48:30 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Thu, 10 May 2012 15:48:30 -0500 Subject: [SciPy-Dev] documentation and sphinx In-Reply-To: References: Message-ID: On Thu, May 10, 2012 at 3:46 PM, Ralf Gommers wrote: > > > On Thu, May 10, 2012 at 10:42 PM, nicky van foreest wrote: > >> Hi, >> >> I am happily changing some documentation for scipy stats, and in the >> process I am assuming that I can type latex commands just as I do in >> Sphinx. (I also built my home page in sphinx). AFAIK, shpinx makes >> pngs of the formulas, and I had to change some settings in my config >> file for sphinx to enable this Can I rely on the shpinx version used >> for the numpy/scipy doc that formulas are converted to pngs, or do >> mathematical formulas end up very convoluted? >> > > You can use LaTeX with the Sphinx .. math:: directive, but please do so > sparingly - preferably confine it to the Notes section. > +1. LaTeX makes beautiful PDF documents, but anything except the simplest notation looks terrible in a docstring. Warren > The reason for that is that many users read docstrings in the terminal as > plain text, and LaTeX isn't known for being very readable. > > Ralf > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu May 10 16:59:30 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 10 May 2012 16:59:30 -0400 Subject: [SciPy-Dev] documentation and sphinx In-Reply-To: References: Message-ID: On Thu, May 10, 2012 at 4:48 PM, Warren Weckesser wrote: > > > On Thu, May 10, 2012 at 3:46 PM, Ralf Gommers > wrote: >> >> >> >> On Thu, May 10, 2012 at 10:42 PM, nicky van foreest >> wrote: >>> >>> Hi, >>> >>> I am happily changing some documentation for scipy stats, and in the >>> process I am assuming that I can type latex commands just as I do in >>> Sphinx. (I also built my home page in sphinx). AFAIK, shpinx makes >>> pngs of the formulas, and I had to change some settings in my config >>> file for sphinx to enable this ?Can I rely on the shpinx version used >>> for the numpy/scipy doc that formulas are converted to pngs, or do >>> mathematical formulas end up very convoluted? >> >> >> You can use LaTeX with the Sphinx .. math:: directive, but please do so >> sparingly - preferably confine it to the Notes section. > > > > +1.? LaTeX makes beautiful PDF documents, but anything except the simplest > notation looks terrible in a docstring. 
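For the docstrings that usually means keeping it to a short entry in the Notes section, along these lines (a minimal sketch, using the standard normal density as the example):

Notes
-----
The probability density function for `norm` is

.. math:: f(x) = \frac{\exp(-x^2/2)}{\sqrt{2\pi}}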
But there is no restriction in the tutorial section Josef > > Warren > > > >> >> The reason for that is that many users read docstrings? in the terminal as >> plain text, and LaTeX isn't known for being very readable. >> >> Ralf >> >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From stefan at sun.ac.za Thu May 10 16:59:20 2012 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 10 May 2012 13:59:20 -0700 Subject: [SciPy-Dev] Removing misc functions from module docstring Message-ID: Hi all, After a bug reported via Debian: http://projects.scipy.org/scipy/ticket/1656 I've made a small doc PR to remove mentions of some misc functions (factorial, comb, lena, etc.) from the scipy module level docstring: https://github.com/scipy/scipy/pull/209/files Please let me know if you feel those function names need to remain in the docstring for some reason. Thanks St?fan From warren.weckesser at enthought.com Thu May 10 17:03:41 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Thu, 10 May 2012 16:03:41 -0500 Subject: [SciPy-Dev] documentation and sphinx In-Reply-To: References: Message-ID: On Thu, May 10, 2012 at 3:59 PM, wrote: > On Thu, May 10, 2012 at 4:48 PM, Warren Weckesser > wrote: > > > > > > On Thu, May 10, 2012 at 3:46 PM, Ralf Gommers < > ralf.gommers at googlemail.com> > > wrote: > >> > >> > >> > >> On Thu, May 10, 2012 at 10:42 PM, nicky van foreest < > vanforeest at gmail.com> > >> wrote: > >>> > >>> Hi, > >>> > >>> I am happily changing some documentation for scipy stats, and in the > >>> process I am assuming that I can type latex commands just as I do in > >>> Sphinx. (I also built my home page in sphinx). AFAIK, shpinx makes > >>> pngs of the formulas, and I had to change some settings in my config > >>> file for sphinx to enable this Can I rely on the shpinx version used > >>> for the numpy/scipy doc that formulas are converted to pngs, or do > >>> mathematical formulas end up very convoluted? > >> > >> > >> You can use LaTeX with the Sphinx .. math:: directive, but please do so > >> sparingly - preferably confine it to the Notes section. > > > > > > > > +1. LaTeX makes beautiful PDF documents, but anything except the > simplest > > notation looks terrible in a docstring. > > But there is no restriction in the tutorial section > Right. The normal way to view the tutorials is as rendered HTML in a browser, so there is no need to restrict the use of LaTeX. Warren > > Josef > > > > > Warren > > > > > > > >> > >> The reason for that is that many users read docstrings in the terminal > as > >> plain text, and LaTeX isn't known for being very readable. > >> > >> Ralf > >> > >> > >> _______________________________________________ > >> SciPy-Dev mailing list > >> SciPy-Dev at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-dev > >> > > > > > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-dev > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From vanforeest at gmail.com Thu May 10 17:12:53 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Thu, 10 May 2012 23:12:53 +0200 Subject: [SciPy-Dev] documentation and sphinx In-Reply-To: References: Message-ID: Ok. Thanks. I was actually editing a tutorial, so no problem there with using latex, which makes me a happier man. Ncky On May 10, 2012 11:04 PM, "Warren Weckesser" wrote: > > > On Thu, May 10, 2012 at 3:59 PM, wrote: > >> On Thu, May 10, 2012 at 4:48 PM, Warren Weckesser >> wrote: >> > >> > >> > On Thu, May 10, 2012 at 3:46 PM, Ralf Gommers < >> ralf.gommers at googlemail.com> >> > wrote: >> >> >> >> >> >> >> >> On Thu, May 10, 2012 at 10:42 PM, nicky van foreest < >> vanforeest at gmail.com> >> >> wrote: >> >>> >> >>> Hi, >> >>> >> >>> I am happily changing some documentation for scipy stats, and in the >> >>> process I am assuming that I can type latex commands just as I do in >> >>> Sphinx. (I also built my home page in sphinx). AFAIK, shpinx makes >> >>> pngs of the formulas, and I had to change some settings in my config >> >>> file for sphinx to enable this Can I rely on the shpinx version used >> >>> for the numpy/scipy doc that formulas are converted to pngs, or do >> >>> mathematical formulas end up very convoluted? >> >> >> >> >> >> You can use LaTeX with the Sphinx .. math:: directive, but please do so >> >> sparingly - preferably confine it to the Notes section. >> > >> > >> > >> > +1. LaTeX makes beautiful PDF documents, but anything except the >> simplest >> > notation looks terrible in a docstring. >> >> But there is no restriction in the tutorial section >> > > > Right. The normal way to view the tutorials is as rendered HTML in a > browser, so there is no need to restrict the use of LaTeX. > > Warren > > > >> >> Josef >> >> > >> > Warren >> > >> > >> > >> >> >> >> The reason for that is that many users read docstrings in the >> terminal as >> >> plain text, and LaTeX isn't known for being very readable. >> >> >> >> Ralf >> >> >> >> >> >> _______________________________________________ >> >> SciPy-Dev mailing list >> >> SciPy-Dev at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> >> > >> > >> > _______________________________________________ >> > SciPy-Dev mailing list >> > SciPy-Dev at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-dev >> > >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wingusr at gmail.com Thu May 10 21:41:42 2012 From: wingusr at gmail.com (TP) Date: Thu, 10 May 2012 18:41:42 -0700 Subject: [SciPy-Dev] documentation and sphinx In-Reply-To: References: Message-ID: On Thu, May 10, 2012 at 1:42 PM, nicky van foreest wrote: > AFAIK, shpinx makes > pngs of the formulas, and I had to change some settings in my config > file for sphinx to enable this Sphinx also has a new MathJax extension [1]. Using MathJax [2] means that math in HTML doesn't have to be converted to relatively ugly PNGs to be displayed. 
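Enabling it amounts to listing that extension in conf.py instead of the PNG-based one (a sketch, assuming a Sphinx version recent enough to ship it [1]):

# conf.py
extensions = ['sphinx.ext.mathjax']   # rather than 'sphinx.ext.pngmath'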
[1] http://sphinx.pocoo.org/latest/ext/math.html#module-sphinx.ext.mathjax [2] http://www.mathjax.org/ From vanforeest at gmail.com Fri May 11 16:09:32 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Fri, 11 May 2012 22:09:32 +0200 Subject: [SciPy-Dev] scipy stats tutorial Message-ID: Hi, I have rewritten the first part of the scipy tutorial to such a form that it is (hopefully) easy accessible to new user of scipy.stats. To this aim I explicitly discuss loc and scale, and so on. (I recall that as a first time user I had no idea about vectorization, broadcasting, ... and that quite a lot of info was thrown at me at the same time, while I just wanted to compute the cdf of the Poisson distribution.) Since I made numerous changes at the start of the document, checking the differences through diff on github seems a very unattractive job to me. Second, to check whether the mark up was correct, I ran stats.rst through my local sphinx. For these to reasons I attach my local stats.html to this mail, so that you can check the end result. I don't have the plot directive, but I don't think that that is a problem for the moment. I left the sample analyses and the kernel density estimation untouched. I think these are already good enough to stand on their own. Once a user made it through my parts of the tutorial, these topics should be accessible (with sufficient background in statistics). Looking forward to your feedback, Nicky -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanforeest at gmail.com Fri May 11 16:16:20 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Fri, 11 May 2012 22:16:20 +0200 Subject: [SciPy-Dev] scipy tutorial Message-ID: I am still new to github... The end result can also be (without the plots) be checked at. https://github.com/nokfi/scipy/blob/repair-typo/doc/source/tutorial/stats.rst. From ralf.gommers at googlemail.com Sat May 12 11:23:57 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 12 May 2012 17:23:57 +0200 Subject: [SciPy-Dev] scipy stats tutorial In-Reply-To: References: Message-ID: On Fri, May 11, 2012 at 10:09 PM, nicky van foreest wrote: > Hi, > > I have rewritten the first part of the scipy tutorial to such a form > that it is (hopefully) easy accessible to new user of scipy.stats. To > this aim I explicitly discuss loc and scale, and so on. (I recall that > as a first time user I had no idea about vectorization, broadcasting, > ... and that quite a lot of info was thrown at me at the same time, > while I just wanted to compute the cdf of the Poisson distribution.) > Looks good from a quick read. Could you remove the use of the extradoc parameter? It's not needed anymore, and we'll probably get rid of it at some point. Looking at the docstring can be done with help() in the Python interpreter or "?" in IPython. > Since I made numerous changes at the start of the document, checking > the differences through diff on github seems a very unattractive job > to me. Second, to check whether the mark up was correct, I ran > stats.rst through my local sphinx. For these to reasons I attach my > local stats.html to this mail, so that you can check the end result. I > don't have the plot directive, but I don't think that that is a > problem for the moment. > Note that when you push commits to a branch on Github from which you have already made a pull request, those commits are added to that PR. So yours ended up at https://github.com/scipy/scipy/pull/205. 
Which is fine with me, since it's related anyway. And diffs are always good to have. Ralf > I left the sample analyses and the kernel density estimation > untouched. I think these are already good enough to stand on their > own. Once a user made it through my parts of the tutorial, these > topics should be accessible (with sufficient background in > statistics). > > Looking forward to your feedback, > > Nicky > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat May 12 11:58:36 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 12 May 2012 11:58:36 -0400 Subject: [SciPy-Dev] scipy stats tutorial In-Reply-To: References: Message-ID: On Sat, May 12, 2012 at 11:23 AM, Ralf Gommers wrote: > > > On Fri, May 11, 2012 at 10:09 PM, nicky van foreest > wrote: >> >> Hi, >> >> I have rewritten the first part of the scipy tutorial to such a form >> that it is (hopefully) easy accessible to new user of scipy.stats. To >> this aim I explicitly discuss loc and scale, and so on. (I recall that >> as a first time user I had no idea about vectorization, broadcasting, >> ... and that quite a lot of info was thrown at me at the same time, >> while I just wanted to compute the cdf of the Poisson distribution.) > > > Looks good from a quick read. > > Could you remove the use of the extradoc parameter? It's not needed anymore, > and we'll probably get rid of it at some point. Looking at the docstring can > be done with help() in the Python interpreter or "?" in IPython. Or with the object inspector in spyder, which has the bonus of showing the plain template docstring if source is selected, instead of having to read the same full docstring each time. The tools have changed, no more extradoc. Josef > >> >> Since I made numerous changes at the start of the document, checking >> the differences through diff on github seems a very unattractive job >> to me. Second, to check whether the mark up was correct, I ran >> stats.rst through my local sphinx. ?For these to reasons I attach my >> local stats.html to this mail, so that you can check the end result. I >> don't have the plot directive, but I don't think that that is a >> problem for the moment. > > > Note that when you push commits to a branch on Github from which you have > already made a pull request, those commits are added to that PR. So yours > ended up at https://github.com/scipy/scipy/pull/205. Which is fine with me, > since it's related anyway. And diffs are always good to have. > > Ralf > >> >> I left the sample analyses and the kernel density estimation >> untouched. I think these are already good enough to stand on their >> own. Once a user made it through my parts of the tutorial, these >> topics should be accessible (with sufficient background in >> statistics). 
>> >> Looking forward to your feedback, >> >> Nicky >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From vanforeest at gmail.com Sat May 12 15:20:03 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Sat, 12 May 2012 21:20:03 +0200 Subject: [SciPy-Dev] scipy stats tutorial In-Reply-To: References: Message-ID: > Could you remove the use of the extradoc parameter? It's not needed anymore, > and we'll probably get rid of it at some point. Looking at the docstring can > be done with help() in the Python interpreter or "?" in IPython. I have removed it. > Note that when you push commits to a branch on Github from which you have > already made a pull request, those commits are added to that PR. So yours > ended up at https://github.com/scipy/scipy/pull/205. Which is fine with me, > since it's related anyway. And diffs are always good to have. I was not aware of this. I hope the commits are separated, though, just to separate the different changes (some wrt distribututions.py and some wrt the tutorial). So I suppose that I could/should have prevented this by making a new branch. Nicky From ralf.gommers at googlemail.com Sat May 12 15:41:31 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 12 May 2012 21:41:31 +0200 Subject: [SciPy-Dev] scipy stats tutorial In-Reply-To: References: Message-ID: On Sat, May 12, 2012 at 9:20 PM, nicky van foreest wrote: > > Could you remove the use of the extradoc parameter? It's not needed > anymore, > > and we'll probably get rid of it at some point. Looking at the docstring > can > > be done with help() in the Python interpreter or "?" in IPython. > > I have removed it. > Thanks. Looks like you still have to push that change to Github. > > > Note that when you push commits to a branch on Github from which you have > > already made a pull request, those commits are added to that PR. So yours > > ended up at https://github.com/scipy/scipy/pull/205. Which is fine with > me, > > since it's related anyway. And diffs are always good to have. > > I was not aware of this. I hope the commits are separated, though, > just to separate the different changes (some wrt distribututions.py > and some wrt the tutorial). So I suppose that I could/should have > prevented this by making a new branch. > > Yes, one commit stays one commit, no merging of those behind the scenes. So it should look fine. You should normally indeed make a new branch when you start working on some unrelated changes. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanforeest at gmail.com Sat May 12 15:43:05 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Sat, 12 May 2012 21:43:05 +0200 Subject: [SciPy-Dev] scipy stats tutorial In-Reply-To: References: Message-ID: On 12 May 2012 21:41, Ralf Gommers wrote: > > > On Sat, May 12, 2012 at 9:20 PM, nicky van foreest > wrote: >> >> > Could you remove the use of the extradoc parameter? It's not needed >> > anymore, >> > and we'll probably get rid of it at some point. Looking at the docstring >> > can >> > be done with help() in the Python interpreter or "?" in IPython. >> >> I have removed it. > > > Thanks. Looks like you still have to push that change to Github. Yes. I'll push it right now. 
> Yes, one commit stays one commit, no merging of those behind the scenes. So > it should look fine. You should normally indeed make a new branch when you > start working on some unrelated changes. Ok. I have to get used to different working habits. Nicky > > Ralf > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From vanforeest at gmail.com Sat May 12 15:54:39 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Sat, 12 May 2012 21:54:39 +0200 Subject: [SciPy-Dev] scipy stats tutorial In-Reply-To: References: Message-ID: > Could you remove the use of the extradoc parameter? It's not needed anymore, > and we'll probably get rid of it at some point. Looking at the docstring can > be done with help() in the Python interpreter or "?" in IPython. Look here: nicky at chuck:~$ python Python 2.7.3 (default, Apr 20 2012, 22:39:59) [GCC 4.6.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> from scipy.stats import norm Help on norm_gen in module scipy.stats.distributions object: class norm_gen(rv_continuous) | Method resolution order: etc. I did not research the cause of this, but I don't get the doc of norm, but of norm_gen. When I use ipython I get the correct results though. What would be the best way to call the documentation of extra_doc from within python? NIcky From wingusr at gmail.com Sat May 12 16:22:09 2012 From: wingusr at gmail.com (TP) Date: Sat, 12 May 2012 13:22:09 -0700 Subject: [SciPy-Dev] documentation and sphinx In-Reply-To: References: Message-ID: On Sat, May 12, 2012 at 12:25 PM, nicky van foreest wrote: > Hi, > > Thanks for your reply. I have a question about using mathjax (I am not > an expert on this, in any way). Currently I write my documentation in > sphinx, then I use >make html. Once the making in finished I ftp the > _build directory to a webserver. Should this webserver run java script > to enable me to use mathjax to render the formulas? > > Nicky No. Only the web browser used to view your site needs to support javascript. From ralf.gommers at googlemail.com Sat May 12 16:51:19 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 12 May 2012 22:51:19 +0200 Subject: [SciPy-Dev] scipy stats tutorial In-Reply-To: References: Message-ID: On Sat, May 12, 2012 at 9:54 PM, nicky van foreest wrote: > > Could you remove the use of the extradoc parameter? It's not needed > anymore, > > and we'll probably get rid of it at some point. Looking at the docstring > can > > be done with help() in the Python interpreter or "?" in IPython. > > Look here: > > nicky at chuck:~$ python > Python 2.7.3 (default, Apr 20 2012, 22:39:59) > [GCC 4.6.3] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> from scipy.stats import norm > Help on norm_gen in module scipy.stats.distributions object: > > class norm_gen(rv_continuous) > | Method resolution order: > > etc. > > I did not research the cause of this, but I don't get the doc of norm, > but of norm_gen. When I use ipython I get the correct results though. > What would be the best way to call the documentation of extra_doc from > within python? Oh yes, forgot about that. Python help() is working poorly. Better to do "print norm.__doc__" then. Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Sat May 12 17:10:58 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 12 May 2012 17:10:58 -0400 Subject: [SciPy-Dev] scipy stats tutorial In-Reply-To: References: Message-ID: On Sat, May 12, 2012 at 3:54 PM, nicky van foreest wrote: >> Could you remove the use of the extradoc parameter? It's not needed anymore, >> and we'll probably get rid of it at some point. Looking at the docstring can >> be done with help() in the Python interpreter or "?" in IPython. > > Look here: > > nicky at chuck:~$ python > Python 2.7.3 (default, Apr 20 2012, 22:39:59) > [GCC 4.6.3] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> from scipy.stats import norm > Help on norm_gen in module scipy.stats.distributions object: > > class norm_gen(rv_continuous) > ?| ?Method resolution order: > > etc. > > I did not research the cause of this, but I don't get the doc of norm, > but of norm_gen. When I use ipython I get the correct results though. I don't think we ever managed to get this to work. The original thread from 2009 is not clear on whether we did. I'm not sure there are many users left that use help(stats.xxx) for xxx is a distribution. print stats.norm.__doc__ works and is shorter (without methods) > What would be the best way to call the documentation of extra_doc from > within python? you mean like this ? >>> print stats.poisson.extradoc Poisson distribution poisson.pmf(k, mu) = exp(-mu) * mu**k / k! for k >= 0 Josef > NIcky > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From pablo.winant at gmail.com Sat May 12 20:13:15 2012 From: pablo.winant at gmail.com (Pablo Winant) Date: Sat, 12 May 2012 20:13:15 -0400 Subject: [SciPy-Dev] Find points in delaunay triangulation : scipy.spatial vs. scipy.interpolation Message-ID: <4FAEFC9B.8040402@gmail.com> Hi, I tried to use interpolation routines in scipy recently and I have found two slight performance issues - The LinearNDInterpolation object implemented in cython requires a list of points and a list of values to be created. But is is not documented how to change the values of the interpolator without doing the mesh again. This is useful when one is solving the values of a function at the vertices of the mesh : one doesn't want to do the triangulation again and again. Maybe there could be a simple specific method to set the values in this case. In that case it would consist in changing the value of a property but it would be consistent with more general interpolation schemes. - I tried to use the delaunay object from scipy and noticed a strange thing: for a given set of coordinates it takes longer to get the indices of the triangles containing the points than it takes to perform the interpolation using LinearND object. This is puzzling since apparently the implementation of LinearND performs many calls to the qhull library to get this indices. Attached is a simple exampe demonstrating this anomaly. One last thing: I have written an interpolation object on sparse grids, using smolyak product of chebychev polynomials. It is written in pure python (vectorized) and licensed under the bsd license. Currently it lives in another library but I guess it would make more sense to have something like that in a more general scientific lib. Let me know if you are interested. 
(it is available there anyway: https://github.com/albop/dynare-python/tree/master/dolo/src/dolo/numeric: chebychev.py and smolyak.py) Best regards, Pablo From pav at iki.fi Sun May 13 05:42:01 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 13 May 2012 11:42:01 +0200 Subject: [SciPy-Dev] Find points in delaunay triangulation : scipy.spatial vs. scipy.interpolation In-Reply-To: <4FAEFC9B.8040402@gmail.com> References: <4FAEFC9B.8040402@gmail.com> Message-ID: 13.05.2012 02:13, Pablo Winant kirjoitti: [clip] > - The LinearNDInterpolation object implemented in cython requires a > list of points and a list of values to be created. But is is not > documented how to change the values of the interpolator without doing > the mesh again. [clip] This is a good suggestion --- the interpolator constructors should accept an existing triangulation. As a workaround, you can try modifying its ".values" attribute in-place, ip.values[...] = new_values > - I tried to use the delaunay object from scipy and noticed a strange > thing: for a given set of coordinates it takes longer to get the indices > of the triangles containing the points than it takes to perform the > interpolation using LinearND object. This is puzzling since apparently > the implementation of LinearND performs many calls to the qhull library > to get this indices. Attached is a simple exampe demonstrating this anomaly. The example doesn't seem to be attached, so I'm not exactly sure what you are seeing? Constructing a Delaunay triangulation is more expensive than performing the interpolation. There are also memory advantages in getting the point positions one-by-one (as in during interpolation) than all-at-once. > One last thing: I have written an interpolation object on sparse grids, > using smolyak product of chebychev polynomials. It is written in pure > python (vectorized) and licensed under the bsd license. Currently it > lives in another library but I guess it would make more sense to have > something like that in a more general scientific lib. Let me know if you > are interested. (it is available there anyway: > https://github.com/albop/dynare-python/tree/master/dolo/src/dolo/numeric: chebychev.py > and smolyak.py) This seems slightly similar to the 2-D tensor approach in Fitpack, although that uses regular grids only. I can think of a couple of questions about this method that are not immediately clear to me: - Is it robust? I.e. how stable is the interpolant against perturbations of the data, badly scattered data points etc? Do you get problems with Chebychev polynomials if a high order is required? - Is your grid selection adaptive, or does the user need to tinker with grid parameters when interpolating? - How fast is this? As far I see, you need to solve a global linear least squares problem, which maybe is not so nice with a large number of data points? - There are a large number of possible interpolation methods we might like to include. What are the advantages of this one? One seems to be that it generalizes to n-D, and could be used to get smooth interpolants. -- Pauli Virtanen From josef.pktd at gmail.com Sun May 13 08:08:34 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 13 May 2012 08:08:34 -0400 Subject: [SciPy-Dev] dead link in cookbook Message-ID: there is a dead link in cookbook to the script http://matplotlib.sf.net/examples/anim.py http://www.scipy.org/Cookbook/Matplotlib/Animations Does anyone know the correct link or is this obsolete? 
Thanks, Josef clueless From josef.pktd at gmail.com Sun May 13 08:44:36 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 13 May 2012 08:44:36 -0400 Subject: [SciPy-Dev] Find points in delaunay triangulation : scipy.spatial vs. scipy.interpolation In-Reply-To: <4FAEFC9B.8040402@gmail.com> References: <4FAEFC9B.8040402@gmail.com> Message-ID: On Sat, May 12, 2012 at 8:13 PM, Pablo Winant wrote: > Hi, > > I tried to use interpolation routines in scipy recently and I have found > two slight performance issues > > ?- The LinearNDInterpolation object implemented in cython requires a > list of points and a list of values to be created. But is is not > documented how to change the values of the interpolator without doing > the mesh again. This is useful when one is solving the values of a > function at the vertices of the mesh : one doesn't want to do the > triangulation again and again. Maybe there could be a simple specific > method to set the values in this case. In that case it would consist in > changing the value of a property but it would be consistent with more > general interpolation schemes. > > - I tried to use the delaunay object from scipy and noticed a strange > thing: for a given set of coordinates it takes longer to get the indices > of the triangles containing the points than it takes to perform the > interpolation using LinearND object. This is puzzling since apparently > the implementation of LinearND performs many calls to the qhull library > to get this indices. Attached is a simple exampe demonstrating this anomaly. > > One last thing: I have written an interpolation object on sparse grids, > using smolyak product of chebychev polynomials. It is written in pure > python (vectorized) and licensed under the bsd license. Currently it > lives in another library but I guess it would make more sense to have > something like that in a more general scientific lib. Let me know if you > are interested. (it is available there anyway: > https://github.com/albop/dynare-python/tree/master/dolo/src/dolo/numeric: chebychev.py > and smolyak.py) Pablo, what is actually your license for dolo? your license file is GPL https://github.com/albop/dynare-python/blob/master/dolo/LICENSE but setup.py says BSD and and I found another package using some of your code as BSD-2 Thanks, Josef > > Best regards, > > Pablo > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From warren.weckesser at enthought.com Sun May 13 09:45:02 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Sun, 13 May 2012 08:45:02 -0500 Subject: [SciPy-Dev] dead link in cookbook In-Reply-To: References: Message-ID: On Sun, May 13, 2012 at 7:08 AM, wrote: > there is a dead link in cookbook to the script > http://matplotlib.sf.net/examples/anim.py > > http://www.scipy.org/Cookbook/Matplotlib/Animations > > Does anyone know the correct link or is this obsolete? > It is obsolete: http://old.nabble.com/documentation-issue-td32805008.html#a32805008 I added a comment at the top of the cookbook entry saying that some parts may be deprecated or obsolete. Warren > > Thanks, > > Josef > clueless > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pablo.winant at gmail.com Sun May 13 10:36:05 2012 From: pablo.winant at gmail.com (Pablo Winant) Date: Sun, 13 May 2012 10:36:05 -0400 Subject: [SciPy-Dev] Find points in delaunay triangulation : scipy.spatial vs. scipy.interpolation In-Reply-To: References: <4FAEFC9B.8040402@gmail.com> Message-ID: <4FAFC6D5.9090900@gmail.com> Le 13/05/2012 05:42, Pauli Virtanen a ?crit : > 13.05.2012 02:13, Pablo Winant kirjoitti: > [clip] >> - The LinearNDInterpolation object implemented in cython requires a >> list of points and a list of values to be created. But is is not >> documented how to change the values of the interpolator without doing >> the mesh again. > [clip] > > This is a good suggestion --- the interpolator constructors should > accept an existing triangulation. > > As a workaround, you can try modifying its ".values" attribute in-place, > > ip.values[...] = new_values Thank you for the suggestion, it works. >> - I tried to use the delaunay object from scipy and noticed a strange >> thing: for a given set of coordinates it takes longer to get the indices >> of the triangles containing the points than it takes to perform the >> interpolation using LinearND object. This is puzzling since apparently >> the implementation of LinearND performs many calls to the qhull library >> to get this indices. Attached is a simple exampe demonstrating this anomaly. > The example doesn't seem to be attached, so I'm not exactly sure what > you are seeing? Happens to me all the time :-) Here it is. > > Constructing a Delaunay triangulation is more expensive than performing > the interpolation. Yes. Unless I am mistaken, in my example the triangulation is done before and the more costly operation becomes point location. It seems to be faster with LinearNd. > There are also memory advantages in getting the point > positions one-by-one (as in during interpolation) than all-at-once. I understand that one. But in order to play with some variations in the interpolation scheme in python, it becomes more efficient to operate on a list of points. For instance, applying the barycentric coefficients to the list of values on the vertices can be done in one single matrix operation, if I have extracted the list of locations before. > >> One last thing: I have written an interpolation object on sparse grids, >> using smolyak product of chebychev polynomials. It is written in pure >> python (vectorized) and licensed under the bsd license. Currently it >> lives in another library but I guess it would make more sense to have >> something like that in a more general scientific lib. Let me know if you >> are interested. (it is available there anyway: >> https://github.com/albop/dynare-python/tree/master/dolo/src/dolo/numeric: chebychev.py >> and smolyak.py) > This seems slightly similar to the 2-D tensor approach in Fitpack, > although that uses regular grids only. This is the classical approach to get an interpolation with more than one dimension. But is is impossible if there are more than 3/4 dimensions while smolyak tensors can be used with more dimensions. There are actually several ways to use sparse tensor grids to interpolate: one can do piecewise interpolation on each hypercube of the grid or use the lagrange polynomial on the grid (spectral method). There is a synthetic paper describing precisely many possible methods: Sparse grids 2004, Bungarts and Griebel. Some of these methods were implemented in matlab http://www.ians.uni-stuttgart.de/spinterp/about.html . 
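Since the comparison script attached to the earlier message is not reproduced above, the following is only a rough sketch, assuming the scipy.spatial.Delaunay and scipy.interpolate.LinearNDInterpolator interfaces, of the two points under discussion: timing pure point location against a full interpolation call, and reusing an existing triangulation by overwriting the interpolator's values in place.

# rough sketch, not the original example_linearnd_vs_delaunay.py
import time
import numpy as np
from scipy.spatial import Delaunay
from scipy.interpolate import LinearNDInterpolator

np.random.seed(0)
points = np.random.rand(10000, 2)            # scattered 2-d sample points
values = np.sin(points[:, 0]) * points[:, 1]
xi = np.random.rand(100000, 2)               # query points

tri = Delaunay(points)                       # triangulation built once
ip = LinearNDInterpolator(points, values)    # builds its own triangulation

t0 = time.time()
simplices = tri.find_simplex(xi)             # point location only
t1 = time.time()
zi = ip(xi)                                  # point location plus barycentric interpolation
t2 = time.time()
print('find_simplex: %.3f s   LinearND call: %.3f s' % (t1 - t0, t2 - t1))

# reuse the triangulation for a new set of values, as suggested above;
# the reshape guards against the values being stored as a 2-d array internally
new_values = np.cos(points[:, 0])
ip.values[...] = new_values.reshape(ip.values.shape)
zi2 = ip(xi)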
I actually found the method in an economic paper: "Computing Stochastic Dynamic Economic Models with a Large Number of State Variables: A Description and Application of a Smolyak-Collocation Method", Malin, B. Krueger, D., Kubler, F. > > I can think of a couple of questions about this method that are not > immediately clear to me: > > - Is it robust? I.e. how stable is the interpolant against > perturbations of the data, badly scattered data points etc? > Do you get problems with Chebychev polynomials if a high order is > required? It has roughly the same problems as chebychev polynomials: they are designed so as to minimize Runge-Kutta errors and there is a similar property for sparse product of chebychev polynomials. The method is meant to represent a function which you can evaluate at the points of the grid: they are precomputed, not directly taken from the data so I don't know how to use it to interpolate scattered data. It can achieve high precision for relatively smooth functions and fail badly if there are discontinuities. Outside of the approximation space, the extrapolation is known to behave badly. But in my experiments it was better than having a constant value everywhere. True, I had some problems using very high order polynomials (with l=6 meaning chebychev polynomials of order up to 2^6=64) but I don't know where does the limitation comes from (my implementation or the algorithm) > > - Is your grid selection adaptive, or does the user need to tinker with > grid parameters when interpolating? In the current version the user only specifies a parameter l, which quantify the precision of the grid (there are 2^l points in each dimension) and the interconnection between dimensions. For instance at l=1, there are two points in each dimension and no cross products. The main advantage, is that increasing the number of dimensions keeping l constant increases the number of points of the grid in a sub-exponential way: this breaks the curse of dimensionality. In the matlab toolbox by Andreas Klimke, there are adaptive schemes to increase the number of points in each dimension so it is at least theoretically possible. > > - How fast is this? As far I see, you need to solve a global linear > least squares problem, which maybe is not so nice with a large number > of data points? I would say it is relatively fast: for the same sparse grid I found it to be faster than the triangulation approach. And I use it with 5-6 dimensions without problems while I could not go that high with another interpolation scheme. The global linear least square problem is solved to find the langrange polynomials interpolating a given set of values. It is not done again when the interpolation is computed for new points (cf. discussion about LinearND and Delaunay). It could be avoided as there are explicit combinatorial formulas to do the same (but at that time I found it too complex ; it is probably not). > > - There are a large number of possible interpolation methods we might > like to include. What are the advantages of this one? One seems to be > that it generalizes to n-D, and could be used to get smooth > interpolants. > > Yes absolutely. It is especially suited when the number of dimensions is bigger than 3-4 where existing method (even delaunay based as far as I know) have a huge performance penalty cost. 
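The univariate ingredient of the smolyak/chebychev construction discussed here can be sketched with numpy alone; this is only the 1-d piece (interpolation at Chebyshev nodes), not Pablo's smolyak.py, and the node formula is the standard one for Chebyshev points of the first kind.

# 1-d Chebyshev interpolation of Runge's function; the sparse (Smolyak)
# construction combines such univariate rules across dimensions
import numpy as np
from numpy.polynomial import chebyshev as cheb

f = lambda x: 1.0 / (1.0 + 25.0 * x ** 2)    # hard for equispaced nodes

deg = 32
k = np.arange(deg + 1)
nodes = np.cos(np.pi * (2 * k + 1) / (2.0 * (deg + 1)))   # Chebyshev points of the first kind
coef = cheb.chebfit(nodes, f(nodes), deg)    # deg+1 points, degree deg: an exact fit

x = np.linspace(-1.0, 1.0, 201)
print('max error: %g' % np.max(np.abs(cheb.chebval(x, coef) - f(x))))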
There are two other advantages of the current implementation: it can compute efficiently the derivative of the interpolated function at given points and it can also compute the derivative of the interpolated values w.r.t. to the values on the grid. What other interpolation schemes are you considering ? I am very interested by these developments as interpolation is really a practical/theoretical bottleneck for many of my problems. At some point I had a prototype of a multilinear interpolation in dimension n. Would you be interested in it ? Best regards, Pablo -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: example_linearnd_vs_delaunay.py Type: text/x-python Size: 1316 bytes Desc: not available URL: From josef.pktd at gmail.com Sun May 13 10:51:35 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 13 May 2012 10:51:35 -0400 Subject: [SciPy-Dev] dead link in cookbook In-Reply-To: References: Message-ID: On Sun, May 13, 2012 at 9:45 AM, Warren Weckesser wrote: > > > On Sun, May 13, 2012 at 7:08 AM, wrote: >> >> there is a dead link in cookbook to the script >> http://matplotlib.sf.net/examples/anim.py >> >> http://www.scipy.org/Cookbook/Matplotlib/Animations >> >> Does anyone know the correct link or is this obsolete? > > > > > It is obsolete: > http://old.nabble.com/documentation-issue-td32805008.html#a32805008 > > I added a comment at the top of the cookbook entry saying that some parts > may be deprecated or obsolete. Thank you, Josef > > Warren > > > >> >> >> Thanks, >> >> Josef >> clueless >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From pablo.winant at gmail.com Sun May 13 10:57:55 2012 From: pablo.winant at gmail.com (Pablo Winant) Date: Sun, 13 May 2012 10:57:55 -0400 Subject: [SciPy-Dev] Find points in delaunay triangulation : scipy.spatial vs. scipy.interpolation In-Reply-To: References: <4FAEFC9B.8040402@gmail.com> Message-ID: <4FAFCBF3.8070107@gmail.com> Le 13/05/2012 08:44, josef.pktd at gmail.com a ?crit : > On Sat, May 12, 2012 at 8:13 PM, Pablo Winant wrote: >> Hi, >> >> I tried to use interpolation routines in scipy recently and I have found >> two slight performance issues >> >> - The LinearNDInterpolation object implemented in cython requires a >> list of points and a list of values to be created. But is is not >> documented how to change the values of the interpolator without doing >> the mesh again. This is useful when one is solving the values of a >> function at the vertices of the mesh : one doesn't want to do the >> triangulation again and again. Maybe there could be a simple specific >> method to set the values in this case. In that case it would consist in >> changing the value of a property but it would be consistent with more >> general interpolation schemes. >> >> - I tried to use the delaunay object from scipy and noticed a strange >> thing: for a given set of coordinates it takes longer to get the indices >> of the triangles containing the points than it takes to perform the >> interpolation using LinearND object. This is puzzling since apparently >> the implementation of LinearND performs many calls to the qhull library >> to get this indices. 
Attached is a simple exampe demonstrating this anomaly. >> >> One last thing: I have written an interpolation object on sparse grids, >> using smolyak product of chebychev polynomials. It is written in pure >> python (vectorized) and licensed under the bsd license. Currently it >> lives in another library but I guess it would make more sense to have >> something like that in a more general scientific lib. Let me know if you >> are interested. (it is available there anyway: >> https://github.com/albop/dynare-python/tree/master/dolo/src/dolo/numeric: chebychev.py >> and smolyak.py) > Pablo, > > what is actually your license for dolo? It used to be GPL, but I changed it to BSD a while ago, precisely in order to integrate better with the python community. As I was using google-code at that time I switched to the new-bsd license which was the only bsd option. If I understand well, it is refered to as BSD-3. Now I must say I am a bit lost in this jungle, so I guess I would follow your suggestion if you say BSD-n is better. There several other parts of the library which would fit better outside (such as a nonlinear solver with complementarity constraints) so it is important to me that the license makes a sensible relocation of the code possible. > > your license file is GPL Thank you for spotting that. I will need to properly do all these legal stuff once I am sure about the good license. > https://github.com/albop/dynare-python/blob/master/dolo/LICENSE > but setup.py says BSD and and I found another package using some of > your code as BSD-2 Can you tell me which one it is ? Best, Pablo > > Thanks, > > Josef > >> Best regards, >> >> Pablo >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From josef.pktd at gmail.com Sun May 13 11:40:10 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 13 May 2012 11:40:10 -0400 Subject: [SciPy-Dev] Find points in delaunay triangulation : scipy.spatial vs. scipy.interpolation In-Reply-To: <4FAFCBF3.8070107@gmail.com> References: <4FAEFC9B.8040402@gmail.com> <4FAFCBF3.8070107@gmail.com> Message-ID: On Sun, May 13, 2012 at 10:57 AM, Pablo Winant wrote: > Le 13/05/2012 08:44, josef.pktd at gmail.com a ?crit : >> On Sat, May 12, 2012 at 8:13 PM, Pablo Winant ?wrote: >>> Hi, >>> >>> I tried to use interpolation routines in scipy recently and I have found >>> two slight performance issues >>> >>> ? - The LinearNDInterpolation object implemented in cython requires a >>> list of points and a list of values to be created. But is is not >>> documented how to change the values of the interpolator without doing >>> the mesh again. This is useful when one is solving the values of a >>> function at the vertices of the mesh : one doesn't want to do the >>> triangulation again and again. Maybe there could be a simple specific >>> method to set the values in this case. In that case it would consist in >>> changing the value of a property but it would be consistent with more >>> general interpolation schemes. >>> >>> - I tried to use the delaunay object from scipy and noticed a strange >>> thing: for a given set of coordinates it takes longer to get the indices >>> of the triangles containing the points than it takes to perform the >>> interpolation using LinearND object. 
This is puzzling since apparently >>> the implementation of LinearND performs many calls to the qhull library >>> to get this indices. Attached is a simple exampe demonstrating this anomaly. >>> >>> One last thing: I have written an interpolation object on sparse grids, >>> using smolyak product of chebychev polynomials. It is written in pure >>> python (vectorized) and licensed under the bsd license. Currently it >>> lives in another library but I guess it would make more sense to have >>> something like that in a more general scientific lib. Let me know if you >>> are interested. (it is available there anyway: >>> https://github.com/albop/dynare-python/tree/master/dolo/src/dolo/numeric: chebychev.py >>> and smolyak.py) >> Pablo, >> >> what is actually your license for dolo? > > It used to be GPL, but I changed it to BSD a while ago, precisely in > order to integrate better with the python community. > As I was using google-code at that time I switched to the new-bsd > license which was the only bsd option. > If I understand well, it is refered to as BSD-3. > > Now I must say I am a bit lost in this jungle, so I guess I would follow > your suggestion if you say BSD-n is better. A while ago I got confused about all these qualifiers for BSD (for statsmodels), BSD with number is less ambiguous or easier to remember. I don't think which BSD or which MIT doesn't matter as long as it is clearly stated. > > There several other parts of the library which would fit better outside > (such as a nonlinear solver with complementarity constraints) so it is > important to me that the license makes a sensible relocation of the code > possible. > >> >> your license file is GPL > Thank you for spotting that. I will need to properly do all these legal > stuff once I am sure about the good license. >> https://github.com/albop/dynare-python/blob/master/dolo/LICENSE >> but setup.py says BSD and and I found another package using some of >> your code as BSD-2 > Can you tell me which one it is ? https://github.com/christophe-gouel/RECS/blob/master/LICENSE.txt I only looked briefly, he uses your code to parse the model definition files, otherwise matlab. see bottom of this page https://github.com/christophe-gouel/RECS I was just browsing after your link. There is some interesting code and it would be good if we can share some of it. I also didn't know about a python - octave bridge. and was trying to see how easily we could create qnwnorm with scipy. Cheers, Josef > > Best, > > Pablo > >> >> Thanks, >> >> Josef >> >>> Best regards, >>> >>> Pablo >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From pablo.winant at gmail.com Sun May 13 12:12:10 2012 From: pablo.winant at gmail.com (Pablo Winant) Date: Sun, 13 May 2012 12:12:10 -0400 Subject: [SciPy-Dev] Find points in delaunay triangulation : scipy.spatial vs. 
scipy.interpolation In-Reply-To: References: <4FAEFC9B.8040402@gmail.com> <4FAFCBF3.8070107@gmail.com> Message-ID: <4FAFDD5A.8030207@gmail.com> Le 13/05/2012 11:40, josef.pktd at gmail.com a ?crit : > On Sun, May 13, 2012 at 10:57 AM, Pablo Winant wrote: >> Le 13/05/2012 08:44, josef.pktd at gmail.com a ?crit : >>> On Sat, May 12, 2012 at 8:13 PM, Pablo Winant wrote: >>>> Hi, >>>> >>>> I tried to use interpolation routines in scipy recently and I have found >>>> two slight performance issues >>>> >>>> - The LinearNDInterpolation object implemented in cython requires a >>>> list of points and a list of values to be created. But is is not >>>> documented how to change the values of the interpolator without doing >>>> the mesh again. This is useful when one is solving the values of a >>>> function at the vertices of the mesh : one doesn't want to do the >>>> triangulation again and again. Maybe there could be a simple specific >>>> method to set the values in this case. In that case it would consist in >>>> changing the value of a property but it would be consistent with more >>>> general interpolation schemes. >>>> >>>> - I tried to use the delaunay object from scipy and noticed a strange >>>> thing: for a given set of coordinates it takes longer to get the indices >>>> of the triangles containing the points than it takes to perform the >>>> interpolation using LinearND object. This is puzzling since apparently >>>> the implementation of LinearND performs many calls to the qhull library >>>> to get this indices. Attached is a simple exampe demonstrating this anomaly. >>>> >>>> One last thing: I have written an interpolation object on sparse grids, >>>> using smolyak product of chebychev polynomials. It is written in pure >>>> python (vectorized) and licensed under the bsd license. Currently it >>>> lives in another library but I guess it would make more sense to have >>>> something like that in a more general scientific lib. Let me know if you >>>> are interested. (it is available there anyway: >>>> https://github.com/albop/dynare-python/tree/master/dolo/src/dolo/numeric: chebychev.py >>>> and smolyak.py) >>> Pablo, >>> >>> what is actually your license for dolo? >> It used to be GPL, but I changed it to BSD a while ago, precisely in >> order to integrate better with the python community. >> As I was using google-code at that time I switched to the new-bsd >> license which was the only bsd option. >> If I understand well, it is refered to as BSD-3. >> >> Now I must say I am a bit lost in this jungle, so I guess I would follow >> your suggestion if you say BSD-n is better. > A while ago I got confused about all these qualifiers for BSD (for > statsmodels), BSD with number is less ambiguous or easier to remember. > I don't think which BSD or which MIT doesn't matter as long as it is > clearly stated. > >> There several other parts of the library which would fit better outside >> (such as a nonlinear solver with complementarity constraints) so it is >> important to me that the license makes a sensible relocation of the code >> possible. >> >>> your license file is GPL >> Thank you for spotting that. I will need to properly do all these legal >> stuff once I am sure about the good license. >>> https://github.com/albop/dynare-python/blob/master/dolo/LICENSE >>> but setup.py says BSD and and I found another package using some of >>> your code as BSD-2 >> Can you tell me which one it is ? 
> https://github.com/christophe-gouel/RECS/blob/master/LICENSE.txt > I only looked briefly, he uses your code to parse the model definition > files, otherwise matlab. > see bottom of this page https://github.com/christophe-gouel/RECS Ah, I know this one: he is a friend. He has written a software to solve rational expectation models in matlab. In addition to dolo, it uses the compecon toolbox which has some interesting interpolation routines (everything is documented in a book "/Applied Computational Economics and Finance/, Mario J. Miranda & Paul L. Fackler, MIT Press" ). They have three kind of one-dimensional interpolation routines (linear, splines, chebychev) and they provide a flexible way to produce multidimensional interpolation as a product of these one-dimensional routines. > > I was just browsing after your link. There is some interesting code > and it would be good if we can share some of it. I would be very glad to do so. Until now, I have been very busy doing research (thesis, post-doc,...) but I am now trying to turn the code from dolo into something useful for others. > I also didn't know about a python - octave bridge. > and was trying to see how easily we could create qnwnorm with scipy. qnwnorm comes from the compecon toolbox. It is based on a very well known algorithm for the one-dimensional quadrature and it should be fairly easy to rewrite/port it to python. I can do that if you are interested (it is also something I tried a while ago). > > Cheers, > > Josef > >> Best, >> >> Pablo >> >>> Thanks, >>> >>> Josef >>> >>>> Best regards, >>>> >>>> Pablo >>>> _______________________________________________ >>>> SciPy-Dev mailing list >>>> SciPy-Dev at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-dev >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanforeest at gmail.com Sun May 13 12:54:15 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Sun, 13 May 2012 18:54:15 +0200 Subject: [SciPy-Dev] scipy stats tutorial In-Reply-To: References: Message-ID: > print stats.norm.__doc__ I now use this in the tutorial. > >> What would be the best way to call the documentation of extra_doc from >> within python? I wasn't clear. The above, i.e., print xxx.__doc___, is what I needed. From josef.pktd at gmail.com Sun May 13 13:03:53 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 13 May 2012 13:03:53 -0400 Subject: [SciPy-Dev] Find points in delaunay triangulation : scipy.spatial vs. 
scipy.interpolation In-Reply-To: <4FAFDD5A.8030207@gmail.com> References: <4FAEFC9B.8040402@gmail.com> <4FAFCBF3.8070107@gmail.com> <4FAFDD5A.8030207@gmail.com> Message-ID: On Sun, May 13, 2012 at 12:12 PM, Pablo Winant wrote: > Le 13/05/2012 11:40, josef.pktd at gmail.com a ?crit?: > > On Sun, May 13, 2012 at 10:57 AM, Pablo Winant > wrote: > > Le 13/05/2012 08:44, josef.pktd at gmail.com a ?crit : > > On Sat, May 12, 2012 at 8:13 PM, Pablo Winant > ?wrote: > > Hi, > > I tried to use interpolation routines in scipy recently and I have found > two slight performance issues > > ? - The LinearNDInterpolation object implemented in cython requires a > list of points and a list of values to be created. But is is not > documented how to change the values of the interpolator without doing > the mesh again. This is useful when one is solving the values of a > function at the vertices of the mesh : one doesn't want to do the > triangulation again and again. Maybe there could be a simple specific > method to set the values in this case. In that case it would consist in > changing the value of a property but it would be consistent with more > general interpolation schemes. > > - I tried to use the delaunay object from scipy and noticed a strange > thing: for a given set of coordinates it takes longer to get the indices > of the triangles containing the points than it takes to perform the > interpolation using LinearND object. This is puzzling since apparently > the implementation of LinearND performs many calls to the qhull library > to get this indices. Attached is a simple exampe demonstrating this anomaly. > > One last thing: I have written an interpolation object on sparse grids, > using smolyak product of chebychev polynomials. It is written in pure > python (vectorized) and licensed under the bsd license. Currently it > lives in another library but I guess it would make more sense to have > something like that in a more general scientific lib. Let me know if you > are interested. (it is available there anyway: > https://github.com/albop/dynare-python/tree/master/dolo/src/dolo/numeric: > chebychev.py > and smolyak.py) > > Pablo, > > what is actually your license for dolo? > > It used to be GPL, but I changed it to BSD a while ago, precisely in > order to integrate better with the python community. > As I was using google-code at that time I switched to the new-bsd > license which was the only bsd option. > If I understand well, it is refered to as BSD-3. > > Now I must say I am a bit lost in this jungle, so I guess I would follow > your suggestion if you say BSD-n is better. > > A while ago I got confused about all these qualifiers for BSD (for > statsmodels), BSD with number is less ambiguous or easier to remember. > I don't think which BSD or which MIT doesn't matter as long as it is > clearly stated. > > There several other parts of the library which would fit better outside > (such as a nonlinear solver with complementarity constraints) so it is > important to me that the license makes a sensible relocation of the code > possible. > > your license file is GPL > > Thank you for spotting that. I will need to properly do all these legal > stuff once I am sure about the good license. > > https://github.com/albop/dynare-python/blob/master/dolo/LICENSE > but setup.py says BSD and and I found another package using some of > your code as BSD-2 > > Can you tell me which one it is ? 
> > https://github.com/christophe-gouel/RECS/blob/master/LICENSE.txt > I only looked briefly, he uses your code to parse the model definition > files, otherwise matlab. > see bottom of this page https://github.com/christophe-gouel/RECS > > Ah, I know this one: he is a friend. He has written a software to solve > rational expectation models in matlab. In addition to dolo, it uses the > compecon toolbox which has some interesting interpolation routines > (everything is documented in a book "Applied Computational Economics and > Finance, Mario J. Miranda & Paul L. Fackler, MIT Press" ). They have three > kind of one-dimensional interpolation routines (linear, splines, chebychev) > and they provide a flexible way to produce multidimensional interpolation as > a product of these one-dimensional routines. I looked at Miranda Fackler's code a few years ago, and thought they have a restrictive license. Now, I cannot find anything except the standard disclaimer on liability. I haven't quite figured out (no code) yet how to go in general from 1d to nd. (I was and am more interested in smoothing noisy versions, then pure interpolation.) > > > > I was just browsing after your link. There is some interesting code > and it would be good if we can share some of it. > > I would be very glad to do so. Until now, I have been very busy doing > research (thesis, post-doc,...) but I am now trying to turn the code from > dolo into something useful for others. > > I also didn't know about a python - octave bridge. > and was trying to see how easily we could create qnwnorm with scipy. > > qnwnorm comes from the compecon toolbox. It is based on a very well known > algorithm for the one-dimensional quadrature and it should be fairly easy to > rewrite/port it to python. I can do that if you are interested (it is also > something I tried a while ago). scipy provides the tools for 1d quadrature (points, weights), but I didn't know that it's so easy to extend to multivariate integration. Do you know if this extends to quadrature with respect to other distributions than normal. Josef > > > Cheers, > > Josef > > Best, > > Pablo > > Thanks, > > Josef > > Best regards, > > Pablo > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From pablo.winant at gmail.com Sun May 13 13:05:35 2012 From: pablo.winant at gmail.com (Pablo Winant) Date: Sun, 13 May 2012 13:05:35 -0400 Subject: [SciPy-Dev] Find points in delaunay triangulation : scipy.spatial vs. 
scipy.interpolation In-Reply-To: References: <4FAEFC9B.8040402@gmail.com> <4FAFCBF3.8070107@gmail.com> Message-ID: <4FAFE9DF.6000908@gmail.com> Le 13/05/2012 11:40, josef.pktd at gmail.com a ?crit : > On Sun, May 13, 2012 at 10:57 AM, Pablo Winant wrote: >> Le 13/05/2012 08:44, josef.pktd at gmail.com a ?crit : >>> On Sat, May 12, 2012 at 8:13 PM, Pablo Winant wrote: >>>> Hi, >>>> >>>> I tried to use interpolation routines in scipy recently and I have found >>>> two slight performance issues >>>> >>>> - The LinearNDInterpolation object implemented in cython requires a >>>> list of points and a list of values to be created. But is is not >>>> documented how to change the values of the interpolator without doing >>>> the mesh again. This is useful when one is solving the values of a >>>> function at the vertices of the mesh : one doesn't want to do the >>>> triangulation again and again. Maybe there could be a simple specific >>>> method to set the values in this case. In that case it would consist in >>>> changing the value of a property but it would be consistent with more >>>> general interpolation schemes. >>>> >>>> - I tried to use the delaunay object from scipy and noticed a strange >>>> thing: for a given set of coordinates it takes longer to get the indices >>>> of the triangles containing the points than it takes to perform the >>>> interpolation using LinearND object. This is puzzling since apparently >>>> the implementation of LinearND performs many calls to the qhull library >>>> to get this indices. Attached is a simple exampe demonstrating this anomaly. >>>> >>>> One last thing: I have written an interpolation object on sparse grids, >>>> using smolyak product of chebychev polynomials. It is written in pure >>>> python (vectorized) and licensed under the bsd license. Currently it >>>> lives in another library but I guess it would make more sense to have >>>> something like that in a more general scientific lib. Let me know if you >>>> are interested. (it is available there anyway: >>>> https://github.com/albop/dynare-python/tree/master/dolo/src/dolo/numeric: chebychev.py >>>> and smolyak.py) >>> Pablo, >>> >>> what is actually your license for dolo? >> It used to be GPL, but I changed it to BSD a while ago, precisely in >> order to integrate better with the python community. >> As I was using google-code at that time I switched to the new-bsd >> license which was the only bsd option. >> If I understand well, it is refered to as BSD-3. >> >> Now I must say I am a bit lost in this jungle, so I guess I would follow >> your suggestion if you say BSD-n is better. > A while ago I got confused about all these qualifiers for BSD (for > statsmodels), BSD with number is less ambiguous or easier to remember. > I don't think which BSD or which MIT doesn't matter as long as it is > clearly stated. > >> There several other parts of the library which would fit better outside >> (such as a nonlinear solver with complementarity constraints) so it is >> important to me that the license makes a sensible relocation of the code >> possible. >> >>> your license file is GPL >> Thank you for spotting that. I will need to properly do all these legal >> stuff once I am sure about the good license. >>> https://github.com/albop/dynare-python/blob/master/dolo/LICENSE >>> but setup.py says BSD and and I found another package using some of >>> your code as BSD-2 >> Can you tell me which one it is ? 
> https://github.com/christophe-gouel/RECS/blob/master/LICENSE.txt > I only looked briefly, he uses your code to parse the model definition > files, otherwise matlab. > see bottom of this page https://github.com/christophe-gouel/RECS > > I was just browsing after your link. There is some interesting code > and it would be good if we can share some of it. > I also didn't know about a python - octave bridge. > and was trying to see how easily we could create qnwnorm with scipy. One quick update on this question : qnwnorm basically uses gauss-hermite quadrature in one dimension and use it to construct points and weights for a multivariate normal law. The univariate part is already implemented in scipy http://docs.scipy.org/doc/numpy/reference/generated/numpy.polynomial.hermite.hermgauss.html, so it could be used the multivariate gauss-hermite quadrature. However, it is not clear at all that it is theoretically the best method available (see: https://www.google.com/search?q=gauss+hermite+quadrature+multivariate&ie=utf-8&oe=utf-8&client=ubuntu&channel=fs) so it could be interesting to have more flexibility than there is in qnwnorm. > > Cheers, > > Josef > >> Best, >> >> Pablo >> >>> Thanks, >>> >>> Josef >>> >>>> Best regards, >>>> >>>> Pablo >>>> _______________________________________________ >>>> SciPy-Dev mailing list >>>> SciPy-Dev at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-dev >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From vanforeest at gmail.com Sun May 13 13:06:16 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Sun, 13 May 2012 19:06:16 +0200 Subject: [SciPy-Dev] documentation and sphinx In-Reply-To: References: Message-ID: Fantastic. I replaced in conf.py the extension pngmath to mathjax, and it worked out of the box. Really nice. Thanks Nicky On 12 May 2012 22:22, TP wrote: > On Sat, May 12, 2012 at 12:25 PM, nicky van foreest > wrote: >> Hi, >> >> Thanks for your reply. I have a question about using mathjax (I am not >> an expert on this, in any way). Currently I write my documentation in >> sphinx, then I use >make html. Once the making in finished I ftp the >> _build directory to a webserver. Should this webserver run java script >> to enable me to use mathjax to render the formulas? >> >> Nicky > > No. Only the web browser used to view your site needs to support javascript. From pablo.winant at gmail.com Sun May 13 13:32:32 2012 From: pablo.winant at gmail.com (Pablo Winant) Date: Sun, 13 May 2012 13:32:32 -0400 Subject: [SciPy-Dev] Find points in delaunay triangulation : scipy.spatial vs. 
scipy.interpolation In-Reply-To: References: <4FAEFC9B.8040402@gmail.com> <4FAFCBF3.8070107@gmail.com> <4FAFDD5A.8030207@gmail.com> Message-ID: <4FAFF030.2030107@gmail.com> Le 13/05/2012 13:03, josef.pktd at gmail.com a ?crit : > On Sun, May 13, 2012 at 12:12 PM, Pablo Winant wrote: >> Le 13/05/2012 11:40, josef.pktd at gmail.com a ?crit : >> >> On Sun, May 13, 2012 at 10:57 AM, Pablo Winant >> wrote: >> >> Le 13/05/2012 08:44, josef.pktd at gmail.com a ?crit : >> >> On Sat, May 12, 2012 at 8:13 PM, Pablo Winant >> wrote: >> >> Hi, >> >> I tried to use interpolation routines in scipy recently and I have found >> two slight performance issues >> >> - The LinearNDInterpolation object implemented in cython requires a >> list of points and a list of values to be created. But is is not >> documented how to change the values of the interpolator without doing >> the mesh again. This is useful when one is solving the values of a >> function at the vertices of the mesh : one doesn't want to do the >> triangulation again and again. Maybe there could be a simple specific >> method to set the values in this case. In that case it would consist in >> changing the value of a property but it would be consistent with more >> general interpolation schemes. >> >> - I tried to use the delaunay object from scipy and noticed a strange >> thing: for a given set of coordinates it takes longer to get the indices >> of the triangles containing the points than it takes to perform the >> interpolation using LinearND object. This is puzzling since apparently >> the implementation of LinearND performs many calls to the qhull library >> to get this indices. Attached is a simple exampe demonstrating this anomaly. >> >> One last thing: I have written an interpolation object on sparse grids, >> using smolyak product of chebychev polynomials. It is written in pure >> python (vectorized) and licensed under the bsd license. Currently it >> lives in another library but I guess it would make more sense to have >> something like that in a more general scientific lib. Let me know if you >> are interested. (it is available there anyway: >> https://github.com/albop/dynare-python/tree/master/dolo/src/dolo/numeric: >> chebychev.py >> and smolyak.py) >> >> Pablo, >> >> what is actually your license for dolo? >> >> It used to be GPL, but I changed it to BSD a while ago, precisely in >> order to integrate better with the python community. >> As I was using google-code at that time I switched to the new-bsd >> license which was the only bsd option. >> If I understand well, it is refered to as BSD-3. >> >> Now I must say I am a bit lost in this jungle, so I guess I would follow >> your suggestion if you say BSD-n is better. >> >> A while ago I got confused about all these qualifiers for BSD (for >> statsmodels), BSD with number is less ambiguous or easier to remember. >> I don't think which BSD or which MIT doesn't matter as long as it is >> clearly stated. >> >> There several other parts of the library which would fit better outside >> (such as a nonlinear solver with complementarity constraints) so it is >> important to me that the license makes a sensible relocation of the code >> possible. >> >> your license file is GPL >> >> Thank you for spotting that. I will need to properly do all these legal >> stuff once I am sure about the good license. 
>> >> https://github.com/albop/dynare-python/blob/master/dolo/LICENSE >> but setup.py says BSD and and I found another package using some of >> your code as BSD-2 >> >> Can you tell me which one it is ? >> >> https://github.com/christophe-gouel/RECS/blob/master/LICENSE.txt >> I only looked briefly, he uses your code to parse the model definition >> files, otherwise matlab. >> see bottom of this page https://github.com/christophe-gouel/RECS >> >> Ah, I know this one: he is a friend. He has written a software to solve >> rational expectation models in matlab. In addition to dolo, it uses the >> compecon toolbox which has some interesting interpolation routines >> (everything is documented in a book "Applied Computational Economics and >> Finance, Mario J. Miranda& Paul L. Fackler, MIT Press" ). They have three >> kind of one-dimensional interpolation routines (linear, splines, chebychev) >> and they provide a flexible way to produce multidimensional interpolation as >> a product of these one-dimensional routines. > I looked at Miranda Fackler's code a few years ago, and thought they > have a restrictive license. Now, I cannot find anything except the > standard disclaimer on liability. I've been wondering many time too and Christophe, the author of compecon, has asked one of the authors. Apparently they were not really aware of the various opensource license and simply meant the code to be "usable by anyone" but wanted to remain in control of their library. However, I guess they would have no problem with letting python code of part of the code be released under the BSD license, provided they are cited and the derived work si clearly distinct from their. > > I haven't quite figured out (no code) yet how to go in general from 1d > to nd. (I was and am more interested in smoothing noisy versions, then > pure interpolation.) You can basically do a tensor product of the functions in each base. ( f(x,y) = (a11 f1(x) + a12 f2(x)) * (a21 f1(y) + a22 f2(y)) ) From a numerical point of view, I think it is dominated by direct nd representation but it is relatively easy to implement and fast. For instance, multilinear interpolation where (f1 and f2 are linear) is very fast. >> >> >> I was just browsing after your link. There is some interesting code >> and it would be good if we can share some of it. >> >> I would be very glad to do so. Until now, I have been very busy doing >> research (thesis, post-doc,...) but I am now trying to turn the code from >> dolo into something useful for others. >> >> I also didn't know about a python - octave bridge. >> and was trying to see how easily we could create qnwnorm with scipy. >> >> qnwnorm comes from the compecon toolbox. It is based on a very well known >> algorithm for the one-dimensional quadrature and it should be fairly easy to >> rewrite/port it to python. I can do that if you are interested (it is also >> something I tried a while ago). > scipy provides the tools for 1d quadrature (points, weights), but I > didn't know that it's so easy to extend to multivariate integration. > Do you know if this extends to quadrature with respect to other > distributions than normal. Frankly, no. 
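A sketch of that construction, using only numpy (the function name and signature below are invented for illustration; this is not compecon's qnwnorm): rescale the univariate Gauss-Hermite rule for a standard normal, take the tensor product across dimensions, and push the nodes through a Cholesky factor of the covariance.

# hedged sketch of a qnwnorm-style rule built from numpy's Gauss-Hermite nodes
import itertools
import numpy as np
from numpy.polynomial.hermite import hermgauss

def qnwnorm_like(n, mu, sigma):
    """Nodes and weights approximating E[f(X)] for X ~ N(mu, sigma).

    n: nodes per dimension (sequence of ints), mu: (d,) mean, sigma: (d, d) covariance.
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    nodes_1d, weights_1d = [], []
    for ni in n:
        x, w = hermgauss(ni)                   # rule for the weight exp(-x**2)
        nodes_1d.append(np.sqrt(2.0) * x)      # change of variables z = sqrt(2)*x ...
        weights_1d.append(w / np.sqrt(np.pi))  # ... so the weights sum to one
    # tensor product over dimensions
    z = np.array(list(itertools.product(*nodes_1d)))
    weights = np.array([np.prod(w) for w in itertools.product(*weights_1d)])
    # map standard-normal nodes to N(mu, sigma) through a Cholesky factor
    nodes = mu + z.dot(np.linalg.cholesky(sigma).T)
    return nodes, weights

nodes, weights = qnwnorm_like([7, 7], [0.5, -1.0], [[1.0, 0.3], [0.3, 2.0]])
print(weights.sum())       # close to 1
print(weights.dot(nodes))  # close to the mean vector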
> > Josef > >> >> Cheers, >> >> Josef >> >> Best, >> >> Pablo >> >> Thanks, >> >> Josef >> >> Best regards, >> >> Pablo >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From vanforeest at gmail.com Sun May 13 16:34:15 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Sun, 13 May 2012 22:34:15 +0200 Subject: [SciPy-Dev] scipy.stats: algorithm to for ticket 1493 In-Reply-To: References: Message-ID: Hi Josef, Some time ago we discussed an algorithm to find x such that cdf(x) = q for given q, that is, a generic algorithm to solve the ppf. In the mail below you propose to set xa and xb to good initial values, but this is not a simple task. Besides this, xa and xb are solely used in the ppf compution. This made me think about an algorithm that avoid the use of xa and xb altogether. I came up with the algorithm below. The included test cases show that it works really well. What is your opinion? In case motivation is required, I shift left and right such that eventually cdf(left) < q < cdf(right). I update by a factor of 10. I rather don't spend a lot of time on the while loop, but prefer to leave the actual solving to brentq. This algo homes in very fast, so it is better to rely on brentq to do the fast searching. Hence, I prefer to take 10 rather than 2 or 3 at a growth factor. Just for fun I tried 1000 as a factor. This works also well. Do you perhaps have more challenging cases? Once we have a set of tests, I'll try to set up a performance test. Once we are satisfied I'll make a pull request. I attached the code. BTW. I suppose that as long as we are experimenting with algorithms/code there is no reason to make a pull request. Nicky On 26 April 2012 05:15, wrote: > On Wed, Apr 25, 2012 at 3:49 PM, ? wrote: >> On Wed, Apr 25, 2012 at 3:21 PM, nicky van foreest wrote: >>>>>>>> The difficult cases will be where cdf also doesn't exist and we need >>>>>>>> to get it through integrate.quad, but I don't remember which >>>>>>>> distribution is a good case. >>>>> >>>>> This case is harder indeed. (I assume you mean by 'not exist' that >>>>> there is no closed form expression for the cdf, like the normal >>>>> distribution). Computing the ppf would involve calling quad a lot of >>>>> times. This is wasteful especially since the computation of cdf(b) >>>>> includes the computation of cdf(a) for a < b, supposing that quad runs >>>>> from -np.inf to b. We could repair this by computing cdf(b) = cdf(a) + >>>>> quad(f, a, b), assuming that cdf(a) has been computed already. >>>>> (perhaps I am not clear enough here. If so, let me know.) 
>>>> >>>> not exists = not defined as _cdf method ?could also be scipy.special >>>> if there are no closed form expressions >>> >>> I see, sure. >>> >>>>>>> I just think that we are not able to reach the q=0, q=1 boundaries, >>>>>>> since for some distributions we will run into other numerical >>>>>>> problems. And I'm curious how far we can get with this. >>>>> >>>>> I completely missed to include a test on the obvious cases q >= 1. - >>>>> np.finfo(float).eps and q <= np.finfo(float).eps. It is now in the >>>>> attached file. >>>> >>>>>>> findppf(stats.expon, 1e-30) >>>> -6.3593574850511882e-13 >>> >>> This result shows actually that xa and xb are necessary to include in >>> the specification of the distribution. The exponential distribution is >>> (usually) defined only on [0, \infty) not on the negative numbers. The >>> result above is negative though. This is of course a simple >>> consequence of calling brentq. From a user's perspective, though, I >>> would become very suspicious about this negative result. >> >> good argument to clean up xa, xb >> >>> >>>> The right answer should be dist.b for q=numerically 1, lower support >>>> point is dist.a but I don't see when we would need it. >>> >>> I agree, provided xa and xb are always properly defined. But then, >>> (just to be nitpicking), the definition of expon does not set xa and >>> xb explicitly. Hence xa = -10, and this is somewhat undesirable, given >>> the negative value above. >>> >>>>> >>>>> The simultaneous updating of left and right in the previous algo is >>>>> wrong. Suppose for instance that cdf(left) < cdf(right) < q. Then both >>>>> left and right would `move to the left'. This is clearly wrong. The >>>>> included code should be better. >>>> >>>> would move to the *right* ? >>> >>> Sure. >>> >>>> >>>> I thought the original was a nice trick, we can shift both left and >>>> right since we know it has to be in that direction, the cut of range >>>> cannot contain the answer. >>>> >>>> Or do I miss the point? >>> >>> No, you are right. When I wrote this at first, I also thought about >>> the point you bring up here. Then, I was somewhat dissatisfied with >>> calling the while loop twice (suppose the left bound requires >>> updating, then certainly the second while loop (to update the right >>> bound) is unnecessary, and calling cdf(right) is useless). While >>> trying to fix this, I forgot about my initial ideas... >>> >>>> >>>>> >>>>> With regard to the values of xb and xa. Can a `ordinary' user change >>>>> these? If so, then the ppf finder should include some protection in my >>>>> opinion. If not, the user will get an error that brentq has not the >>>>> right limits, but this error might be somewhat unexpected. (What has >>>>> brentq to do with finding the ppf?) Of course, looking at the code >>>>> this is clear, but I expect most users will no do so. >>>> >>>> I don't think ?`ordinary' users should touch xa, xb, but they could. >>>> Except for getting around the limitation in this ticket there is no >>>> reason to change xa, xb, so we could make them private _xa, _xb >>>> instead. >>> >>> I think that would be better. Thus, the developer that subclasses >>> rv_continuous should set _xa and _xb properly. >>> >>>>> The code contains two choices about how to handle xa and xb. Do you >>>>> have any preference? >>>> >>>> I don't really like choice 1, because it removes the use of the >>>> predefined xa, xb. On the other hand, with this extension, xa and xb >>>> wouldn't be really necessary anymore. 
>>> >>> In view of your example with findppf(expon(1e-30)) I prefer to use _xa and _xb. >>> >>>> >>>> another possibility would be to try except brentq with xa, xb first >>>> and get most cases, and switch to your version if needed. I'm not sure >>>> xa, xb are defined well enough that it's worth to go this route, >>>> though. >>> >>> I think that this makes the most sense. The definition of the class >>> should include sensible values of xa and xb. >>> >>> All in all, I would like to make the following proposal to resolve the >>> ticket in a generic way. >>> >>> 1) xa and xb should become private class members _xa and _xb >>> 2) _xa and _xb should be given proper values in the class definition, >>> e.g. expon._xa = 0 and expon._xb = 30., since exp(-30) = 9.35e-14. >>> 3) given a quantile q in the ppf function, include a test on _cdf(_xa) >>> <= q <= _cdf(_xb). If this fails, return an exception with some text >>> that tells that either _cdf(_xa) > q or _cdf(_xb) < q. >>> >>> Given your comments I actually favor all this searching for left and >>> right not that much anymore. It is generic, but it places the >>> responsibility of good code at the wrong place. >> >> 3) I prefer your expanding the search to raising an exception to the >> user. Note also that your 3) is inconsistent with 1). If a user >> visible exception is raised, then the user needs to change xa or xb, >> so it shouldn't be private. That's the current situation (except for a >> more cryptic message). >> >> 2) I'm all in favor, especially for one-side bound distributions, >> where it should be easy to go through those. There might be a few >> where the bound moves with the shape, but the only one I remember is >> genextreme and that has an explicit _ppf >> >> So I would prefer 1), 2) and your new enhanced generic _ppf > > forgot to mention > > the main reason that I like your expanding search space is that the > shape of the distribution can change a lot. 
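The try/except idea quoted above might look roughly like this (ppf_with_fallback is a made-up name; xa and xb are the bracket attributes rv_continuous currently carries, and findppf is the expanding search sketched earlier in the thread):

from scipy import optimize

def ppf_with_fallback(dist, q, *args):
    f = lambda x: dist.cdf(x, *args) - q
    try:
        # the fixed bracket catches most cases cheaply
        return optimize.brentq(f, dist.xa, dist.xb)
    except ValueError:
        # cdf(xa) - q and cdf(xb) - q had the same sign, so the fixed bracket
        # does not contain the root; fall back to the expanding search
        return findppf(dist, q, *args)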
Even if we set xa, xb to > reasonable values for likely shape parameters they won't be good > enough for others, as in the original ticket > >>>> stats.invgauss.stats(2) > (array(2.0), array(8.0)) >>>> stats.invgauss.stats(7) > (array(7.0), array(343.0)) >>>> stats.invgauss.stats(20) > (array(20.0), array(8000.0)) >>>> stats.invgauss.stats(100) > (array(100.0), array(1000000.0)) >>>> stats.invgauss.cdf(1000, 100) > 0.98335562794321207 > >>>> findppf(stats.invgauss, 0.99, 100) > 1926.520850319389 >>>> findppf(stats.invgauss, 0.999, 100) > 13928.012903371644 >>>> findppf(stats.invgauss, 0.999, 1) > 8.3548649291400938 > --------- > > to get a rough idea: > for xa, xb and a finite bound either left or right, all have generic > xa=-10 or xb=10 > >>>> dist_cont = [getattr(stats.distributions, dname) ?for dname in dir(stats.distributions) if isinstance(getattr(stats.distributions, dname), stats.distributions.rv_continuous)] > >>>> left = [(d.name, d.a, d.xa) for d in dist_cont if not np.isneginf(d.a)] >>>> pprint(left) > [('alpha', 0.0, -10.0), > ?('anglit', -0.78539816339744828, -10.0), > ?('arcsine', 0.0, -10.0), > ?('beta', 0.0, -10.0), > ?('betaprime', 0.0, -10.0), > ?('bradford', 0.0, -10.0), > ?('burr', 0.0, -10.0), > ?('chi', 0.0, -10.0), > ?('chi2', 0.0, -10.0), > ?('cosine', -3.1415926535897931, -10.0), > ?('erlang', 0.0, -10.0), > ?('expon', 0.0, 0), > ?('exponpow', 0.0, -10.0), > ?('exponweib', 0.0, -10.0), > ?('f', 0.0, -10.0), > ?('fatiguelife', 0.0, -10.0), > ?('fisk', 0.0, -10.0), > ?('foldcauchy', 0.0, -10.0), > ?('foldnorm', 0.0, -10.0), > ?('frechet_r', 0.0, -10.0), > ?('gamma', 0.0, -10.0), > ?('gausshyper', 0.0, -10.0), > ?('genexpon', 0.0, -10.0), > ?('gengamma', 0.0, -10.0), > ?('genhalflogistic', 0.0, -10.0), > ?('genpareto', 0.0, -10.0), > ?('gilbrat', 0.0, -10.0), > ?('gompertz', 0.0, -10.0), > ?('halfcauchy', 0.0, -10.0), > ?('halflogistic', 0.0, -10.0), > ?('halfnorm', 0.0, -10.0), > ?('invgamma', 0.0, -10.0), > ?('invgauss', 0.0, -10.0), > ?('invnorm', 0.0, -10.0), > ?('invweibull', 0, -10.0), > ?('johnsonb', 0.0, -10.0), > ?('ksone', 0.0, -10.0), > ?('kstwobign', 0.0, -10.0), > ?('levy', 0.0, -10.0), > ?('loglaplace', 0.0, -10.0), > ?('lognorm', 0.0, -10.0), > ?('lomax', 0.0, -10.0), > ?('maxwell', 0.0, -10.0), > ?('mielke', 0.0, -10.0), > ?('nakagami', 0.0, -10.0), > ?('ncf', 0.0, -10.0), > ?('ncx2', 0.0, -10.0), > ?('pareto', 1.0, -10.0), > ?('powerlaw', 0.0, -10.0), > ?('powerlognorm', 0.0, -10.0), > ?('rayleigh', 0.0, -10.0), > ?('rdist', -1.0, -10.0), > ?('recipinvgauss', 0.0, -10.0), > ?('rice', 0.0, -10.0), > ?('semicircular', -1.0, -10.0), > ?('triang', 0.0, -10.0), > ?('truncexpon', 0.0, -10.0), > ?('uniform', 0.0, -10.0), > ?('wald', 0.0, -10.0), > ?('weibull_min', 0.0, -10.0), > ?('wrapcauchy', 0.0, -10.0)] > >>>> right = [(d.name, d.b, d.xb) for d in dist_cont if not np.isposinf(d.b)] >>>> pprint(right) > [('anglit', 0.78539816339744828, 10.0), > ?('arcsine', 1.0, 10.0), > ?('beta', 1.0, 10.0), > ?('betaprime', 500.0, 10.0), > ?('bradford', 1.0, 10.0), > ?('cosine', 3.1415926535897931, 10.0), > ?('frechet_l', 0.0, 10.0), > ?('gausshyper', 1.0, 10.0), > ?('johnsonb', 1.0, 10.0), > ?('levy_l', 0.0, 10.0), > ?('powerlaw', 1.0, 10.0), > ?('rdist', 1.0, 10.0), > ?('semicircular', 1.0, 10.0), > ?('triang', 1.0, 10.0), > ?('uniform', 1.0, 10.0), > ?('weibull_max', 0.0, 10.0), > ?('wrapcauchy', 6.2831853071795862, 10.0)] > > only pareto has both limits on the same side of zero > >>>> pprint ([(d.name, d.a, d.b) for d in dist_cont if d.a*d.b>0]) > [('pareto', 1.0, 
inf)] > > > genextreme, and maybe one or two others, are missing because finite a, > b are set in _argcheck > vonmises is for circular and doesn't behave properly > > only two distributions define non-generic xa or xb > >>>> pprint ([(d.name, d.a, d.b, d.xa, d.xb) for d in dist_cont if not d.xa*d.xb==-100]) > [('foldcauchy', 0.0, inf, -10.0, 1000), ('recipinvgauss', 0.0, inf, -10.0, 50)] > > a pull request setting correct xa, xb would be very welcome > > Josef > > >> >> Josef >> >>> >>> Nicky >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- A non-text attachment was scrubbed... Name: findppf2.py Type: application/octet-stream Size: 1199 bytes Desc: not available URL: From josef.pktd at gmail.com Sun May 13 17:57:21 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 13 May 2012 17:57:21 -0400 Subject: [SciPy-Dev] scipy.stats: algorithm to for ticket 1493 In-Reply-To: References: Message-ID: On Sun, May 13, 2012 at 4:34 PM, nicky van foreest wrote: > Hi Josef, > > Some time ago we discussed an algorithm to find x such that cdf(x) = q > for given q, that is, a generic algorithm to solve the ppf. In the > mail below you propose to set xa and xb to good initial values, but > this is not a simple task. Besides this, xa and xb are solely used in > the ppf compution. This made me think about an algorithm that avoid > the use of xa and xb altogether. I came up with the algorithm below. > The included test cases show that it works really well. What is your > opinion? > > In case motivation is required, I shift left and right such that > eventually cdf(left) < q < cdf(right). I update by a factor of 10. I > rather don't spend a lot of time on the while loop, but prefer to > leave the actual solving to brentq. This algo homes in very fast, so > it is better to rely on brentq to do the fast searching. Hence, I > prefer to take 10 rather than 2 or 3 at a growth factor. Just for fun > I tried 1000 as a factor. This works also well. > > Do you perhaps have more challenging cases? ?Once we have a set of > tests, I'll try to set up a performance test. brentq seems to work very well one case I was worried about, but works well: >>> p,r = optimize.brentq(lambda x: np.minimum(1, np.maximum(x,0)) -1e-30, -1000, 1000, full_output=1) >>> p -1.2079226507921669e-12 >>> r.iterations 53 >>> r.function_calls 54 >>> p,r = optimize.brentq(lambda x: np.minimum(1, np.maximum(x,0)) -1e-30, -100, 100, full_output=1) >>> p -9.0949470177290643e-13 >>> r.iterations 50 >>> r.function_calls 51 >>> p,r = optimize.brentq(lambda x: np.minimum(1, np.maximum(x,0)) -1e-30, -1000, 10, full_output=1) >>> r.iterations 71 >>> r.function_calls 72 aside: if q = 0 or q=1, then brentq will find something outside the support, but this should be handled by the ppf generic code class Fake(object): #actually uniform def cdf(wlf, x): return np.minimum(1, np.maximum(1000+x,0)) q = 1e-6 d = Fake() sol = findppf(d, q) print sol, q, d.cdf(sol) -999.999999 1e-06 9.99999997475e-07 Right now I cannot think of another case that would be difficult. I don't see anything yet to criticize in your latest version :( Josef > > Once we are satisfied I'll make a pull request. > > I attached the code. BTW. 
I suppose that as long as we are > experimenting with algorithms/code there is no reason to make a pull > request. > > Nicky > > > > On 26 April 2012 05:15, ? wrote: >> On Wed, Apr 25, 2012 at 3:49 PM, ? wrote: >>> On Wed, Apr 25, 2012 at 3:21 PM, nicky van foreest wrote: >>>>>>>>> The difficult cases will be where cdf also doesn't exist and we need >>>>>>>>> to get it through integrate.quad, but I don't remember which >>>>>>>>> distribution is a good case. >>>>>> >>>>>> This case is harder indeed. (I assume you mean by 'not exist' that >>>>>> there is no closed form expression for the cdf, like the normal >>>>>> distribution). Computing the ppf would involve calling quad a lot of >>>>>> times. This is wasteful especially since the computation of cdf(b) >>>>>> includes the computation of cdf(a) for a < b, supposing that quad runs >>>>>> from -np.inf to b. We could repair this by computing cdf(b) = cdf(a) + >>>>>> quad(f, a, b), assuming that cdf(a) has been computed already. >>>>>> (perhaps I am not clear enough here. If so, let me know.) >>>>> >>>>> not exists = not defined as _cdf method ?could also be scipy.special >>>>> if there are no closed form expressions >>>> >>>> I see, sure. >>>> >>>>>>>> I just think that we are not able to reach the q=0, q=1 boundaries, >>>>>>>> since for some distributions we will run into other numerical >>>>>>>> problems. And I'm curious how far we can get with this. >>>>>> >>>>>> I completely missed to include a test on the obvious cases q >= 1. - >>>>>> np.finfo(float).eps and q <= np.finfo(float).eps. It is now in the >>>>>> attached file. >>>>> >>>>>>>> findppf(stats.expon, 1e-30) >>>>> -6.3593574850511882e-13 >>>> >>>> This result shows actually that xa and xb are necessary to include in >>>> the specification of the distribution. The exponential distribution is >>>> (usually) defined only on [0, \infty) not on the negative numbers. The >>>> result above is negative though. This is of course a simple >>>> consequence of calling brentq. From a user's perspective, though, I >>>> would become very suspicious about this negative result. >>> >>> good argument to clean up xa, xb >>> >>>> >>>>> The right answer should be dist.b for q=numerically 1, lower support >>>>> point is dist.a but I don't see when we would need it. >>>> >>>> I agree, provided xa and xb are always properly defined. But then, >>>> (just to be nitpicking), the definition of expon does not set xa and >>>> xb explicitly. Hence xa = -10, and this is somewhat undesirable, given >>>> the negative value above. >>>> >>>>>> >>>>>> The simultaneous updating of left and right in the previous algo is >>>>>> wrong. Suppose for instance that cdf(left) < cdf(right) < q. Then both >>>>>> left and right would `move to the left'. This is clearly wrong. The >>>>>> included code should be better. >>>>> >>>>> would move to the *right* ? >>>> >>>> Sure. >>>> >>>>> >>>>> I thought the original was a nice trick, we can shift both left and >>>>> right since we know it has to be in that direction, the cut of range >>>>> cannot contain the answer. >>>>> >>>>> Or do I miss the point? >>>> >>>> No, you are right. When I wrote this at first, I also thought about >>>> the point you bring up here. Then, I was somewhat dissatisfied with >>>> calling the while loop twice (suppose the left bound requires >>>> updating, then certainly the second while loop (to update the right >>>> bound) is unnecessary, and calling cdf(right) is useless). While >>>> trying to fix this, I forgot about my initial ideas... 
>>>> >>>>> >>>>>> >>>>>> With regard to the values of xb and xa. Can a `ordinary' user change >>>>>> these? If so, then the ppf finder should include some protection in my >>>>>> opinion. If not, the user will get an error that brentq has not the >>>>>> right limits, but this error might be somewhat unexpected. (What has >>>>>> brentq to do with finding the ppf?) Of course, looking at the code >>>>>> this is clear, but I expect most users will no do so. >>>>> >>>>> I don't think ?`ordinary' users should touch xa, xb, but they could. >>>>> Except for getting around the limitation in this ticket there is no >>>>> reason to change xa, xb, so we could make them private _xa, _xb >>>>> instead. >>>> >>>> I think that would be better. Thus, the developer that subclasses >>>> rv_continuous should set _xa and _xb properly. >>>> >>>>>> The code contains two choices about how to handle xa and xb. Do you >>>>>> have any preference? >>>>> >>>>> I don't really like choice 1, because it removes the use of the >>>>> predefined xa, xb. On the other hand, with this extension, xa and xb >>>>> wouldn't be really necessary anymore. >>>> >>>> In view of your example with findppf(expon(1e-30)) I prefer to use _xa and _xb. >>>> >>>>> >>>>> another possibility would be to try except brentq with xa, xb first >>>>> and get most cases, and switch to your version if needed. I'm not sure >>>>> xa, xb are defined well enough that it's worth to go this route, >>>>> though. >>>> >>>> I think that this makes the most sense. The definition of the class >>>> should include sensible values of xa and xb. >>>> >>>> All in all, I would like to make the following proposal to resolve the >>>> ticket in a generic way. >>>> >>>> 1) xa and xb should become private class members _xa and _xb >>>> 2) _xa and _xb should be given proper values in the class definition, >>>> e.g. expon._xa = 0 and expon._xb = 30., since exp(-30) = 9.35e-14. >>>> 3) given a quantile q in the ppf function, include a test on _cdf(_xa) >>>> <= q <= _cdf(_xb). If this fails, return an exception with some text >>>> that tells that either _cdf(_xa) > q or _cdf(_xb) < q. >>>> >>>> Given your comments I actually favor all this searching for left and >>>> right not that much anymore. It is generic, but it places the >>>> responsibility of good code at the wrong place. >>> >>> 3) I prefer your expanding the search to raising an exception to the >>> user. Note also that your 3) is inconsistent with 1). If a user >>> visible exception is raised, then the user needs to change xa or xb, >>> so it shouldn't be private. That's the current situation (except for a >>> more cryptic message). >>> >>> 2) I'm all in favor, especially for one-side bound distributions, >>> where it should be easy to go through those. There might be a few >>> where the bound moves with the shape, but the only one I remember is >>> genextreme and that has an explicit _ppf >>> >>> So I would prefer 1), 2) and your new enhanced generic _ppf >> >> forgot to mention >> >> the main reason that I like your expanding search space is that the >> shape of the distribution can change a lot. 
Even if we set xa, xb to >> reasonable values for likely shape parameters they won't be good >> enough for others, as in the original ticket >> >>>>> stats.invgauss.stats(2) >> (array(2.0), array(8.0)) >>>>> stats.invgauss.stats(7) >> (array(7.0), array(343.0)) >>>>> stats.invgauss.stats(20) >> (array(20.0), array(8000.0)) >>>>> stats.invgauss.stats(100) >> (array(100.0), array(1000000.0)) >>>>> stats.invgauss.cdf(1000, 100) >> 0.98335562794321207 >> >>>>> findppf(stats.invgauss, 0.99, 100) >> 1926.520850319389 >>>>> findppf(stats.invgauss, 0.999, 100) >> 13928.012903371644 >>>>> findppf(stats.invgauss, 0.999, 1) >> 8.3548649291400938 >> --------- >> >> to get a rough idea: >> for xa, xb and a finite bound either left or right, all have generic >> xa=-10 or xb=10 >> >>>>> dist_cont = [getattr(stats.distributions, dname) ?for dname in dir(stats.distributions) if isinstance(getattr(stats.distributions, dname), stats.distributions.rv_continuous)] >> >>>>> left = [(d.name, d.a, d.xa) for d in dist_cont if not np.isneginf(d.a)] >>>>> pprint(left) >> [('alpha', 0.0, -10.0), >> ?('anglit', -0.78539816339744828, -10.0), >> ?('arcsine', 0.0, -10.0), >> ?('beta', 0.0, -10.0), >> ?('betaprime', 0.0, -10.0), >> ?('bradford', 0.0, -10.0), >> ?('burr', 0.0, -10.0), >> ?('chi', 0.0, -10.0), >> ?('chi2', 0.0, -10.0), >> ?('cosine', -3.1415926535897931, -10.0), >> ?('erlang', 0.0, -10.0), >> ?('expon', 0.0, 0), >> ?('exponpow', 0.0, -10.0), >> ?('exponweib', 0.0, -10.0), >> ?('f', 0.0, -10.0), >> ?('fatiguelife', 0.0, -10.0), >> ?('fisk', 0.0, -10.0), >> ?('foldcauchy', 0.0, -10.0), >> ?('foldnorm', 0.0, -10.0), >> ?('frechet_r', 0.0, -10.0), >> ?('gamma', 0.0, -10.0), >> ?('gausshyper', 0.0, -10.0), >> ?('genexpon', 0.0, -10.0), >> ?('gengamma', 0.0, -10.0), >> ?('genhalflogistic', 0.0, -10.0), >> ?('genpareto', 0.0, -10.0), >> ?('gilbrat', 0.0, -10.0), >> ?('gompertz', 0.0, -10.0), >> ?('halfcauchy', 0.0, -10.0), >> ?('halflogistic', 0.0, -10.0), >> ?('halfnorm', 0.0, -10.0), >> ?('invgamma', 0.0, -10.0), >> ?('invgauss', 0.0, -10.0), >> ?('invnorm', 0.0, -10.0), >> ?('invweibull', 0, -10.0), >> ?('johnsonb', 0.0, -10.0), >> ?('ksone', 0.0, -10.0), >> ?('kstwobign', 0.0, -10.0), >> ?('levy', 0.0, -10.0), >> ?('loglaplace', 0.0, -10.0), >> ?('lognorm', 0.0, -10.0), >> ?('lomax', 0.0, -10.0), >> ?('maxwell', 0.0, -10.0), >> ?('mielke', 0.0, -10.0), >> ?('nakagami', 0.0, -10.0), >> ?('ncf', 0.0, -10.0), >> ?('ncx2', 0.0, -10.0), >> ?('pareto', 1.0, -10.0), >> ?('powerlaw', 0.0, -10.0), >> ?('powerlognorm', 0.0, -10.0), >> ?('rayleigh', 0.0, -10.0), >> ?('rdist', -1.0, -10.0), >> ?('recipinvgauss', 0.0, -10.0), >> ?('rice', 0.0, -10.0), >> ?('semicircular', -1.0, -10.0), >> ?('triang', 0.0, -10.0), >> ?('truncexpon', 0.0, -10.0), >> ?('uniform', 0.0, -10.0), >> ?('wald', 0.0, -10.0), >> ?('weibull_min', 0.0, -10.0), >> ?('wrapcauchy', 0.0, -10.0)] >> >>>>> right = [(d.name, d.b, d.xb) for d in dist_cont if not np.isposinf(d.b)] >>>>> pprint(right) >> [('anglit', 0.78539816339744828, 10.0), >> ?('arcsine', 1.0, 10.0), >> ?('beta', 1.0, 10.0), >> ?('betaprime', 500.0, 10.0), >> ?('bradford', 1.0, 10.0), >> ?('cosine', 3.1415926535897931, 10.0), >> ?('frechet_l', 0.0, 10.0), >> ?('gausshyper', 1.0, 10.0), >> ?('johnsonb', 1.0, 10.0), >> ?('levy_l', 0.0, 10.0), >> ?('powerlaw', 1.0, 10.0), >> ?('rdist', 1.0, 10.0), >> ?('semicircular', 1.0, 10.0), >> ?('triang', 1.0, 10.0), >> ?('uniform', 1.0, 10.0), >> ?('weibull_max', 0.0, 10.0), >> ?('wrapcauchy', 6.2831853071795862, 10.0)] >> >> only pareto has both limits 
on the same side of zero >> >>>>> pprint ([(d.name, d.a, d.b) for d in dist_cont if d.a*d.b>0]) >> [('pareto', 1.0, inf)] >> >> >> genextreme, and maybe one or two others, are missing because finite a, >> b are set in _argcheck >> vonmises is for circular and doesn't behave properly >> >> only two distributions define non-generic xa or xb >> >>>>> pprint ([(d.name, d.a, d.b, d.xa, d.xb) for d in dist_cont if not d.xa*d.xb==-100]) >> [('foldcauchy', 0.0, inf, -10.0, 1000), ('recipinvgauss', 0.0, inf, -10.0, 50)] >> >> a pull request setting correct xa, xb would be very welcome >> >> Josef >> >> >>> >>> Josef >>> >>>> >>>> Nicky >>>> _______________________________________________ >>>> SciPy-Dev mailing list >>>> SciPy-Dev at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-dev >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From aia8v at virginia.edu Sun May 13 19:42:20 2012 From: aia8v at virginia.edu (alex arsenovic) Date: Sun, 13 May 2012 19:42:20 -0400 Subject: [SciPy-Dev] ipython notebook for docs/examples Message-ID: <4FB046DC.9030106@virginia.edu> i recently took a look at the ipython notebook feature. its awesome. if it is interface-able with sphinx, i could write docs way faster. also, it seems to me that making interactive tutorials/examples with this accesable through a server would work great. has this idea already been considered? if not , does anyone have thoughts on this? alex From vanderplas at astro.washington.edu Sun May 13 20:39:43 2012 From: vanderplas at astro.washington.edu (Jacob VanderPlas) Date: Sun, 13 May 2012 17:39:43 -0700 Subject: [SciPy-Dev] ipython notebook for docs/examples In-Reply-To: <4FB046DC.9030106@virginia.edu> References: <4FB046DC.9030106@virginia.edu> Message-ID: <4FB0544F.8090006@astro.washington.edu> Hi Alex, At the ipython/scikit-learn sprint after PyCon this spring there were some people taking a look at this. I'm not sure what progress was made, but people were considering creating an ipython plugin for sphinx, so that sphinx docs could be exported to ipython notebooks just like they can now be exported to html or pdf. Then in the sphinx build, each page of the documentation could automatically include a link to an associated ipython notebook. Perhaps someone else can give an update about any progress that's been made in this area: I agree that it could be a very useful feature in the documentation of a lot of python projects Jake alex arsenovic wrote: > i recently took a look at the ipython notebook feature. its awesome. > if it is interface-able with sphinx, i could write docs way faster. > also, it seems to me that making interactive tutorials/examples with > this accesable through a server would work great. > > has this idea already been considered? if not , does anyone have > thoughts on this? 
> > > alex > > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From fperez.net at gmail.com Sun May 13 20:42:09 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Sun, 13 May 2012 17:42:09 -0700 Subject: [SciPy-Dev] ipython notebook for docs/examples In-Reply-To: <4FB046DC.9030106@virginia.edu> References: <4FB046DC.9030106@virginia.edu> Message-ID: Hi Alex, On Sun, May 13, 2012 at 4:42 PM, alex arsenovic wrote: > i recently took a look ?at the ipython notebook feature. its ?awesome. > if it is interface-able with sphinx, i could write docs way faster. > also, it seems to me that making interactive tutorials/examples with > this accesable through a ?server would work great. > > has this idea already been considered? if not , does anyone have > thoughts on this? Sure :) Just to give everyone a quick status check on this idea: the main point is that we haven't yet finished the machinery to generate sphinx-compatible rst from notebooks. The code currently lives in a standalone repo: https://github.com/ipython/nbconvert So this is still a bit 'raw'. But with a bit of luck, in a few days I'll finish off the rst conversion machinery and we'll be in a reasonable shape to start looking at merging it into ipython proper. If you are interested in helping along, let me know and I'll provide more details. I hope that in the future we'll be able to provide with all the 'scipy*' projects: - executable notebooks for users to play with examples - nice sphinx html generated from these for online docs - pure .py versions of the codes for non-ipython use We're very close to all of this being possible, we just need to finish up a tiny bit of code. Cheers, f From tim at cerazone.net Mon May 14 10:58:06 2012 From: tim at cerazone.net (Tim Cera) Date: Mon, 14 May 2012 10:58:06 -0400 Subject: [SciPy-Dev] Missing docstrings in the SciPy docstring editor Message-ID: Most of the docstrings are gone from the on-line editor... http://docs.scipy.org/scipy/docs/ Kindest regards, Tim -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon May 14 11:43:53 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 14 May 2012 11:43:53 -0400 Subject: [SciPy-Dev] Missing docstrings in the SciPy docstring editor In-Reply-To: References: Message-ID: On Mon, May 14, 2012 at 10:58 AM, Tim Cera wrote: > Most of the docstrings are gone from the on-line editor... > http://docs.scipy.org/scipy/docs/ What are we supposed to see? I see a long list of blocks of function paths, and some spot checking shows the individual docstrings. Cheers, Josef > > Kindest regards, > Tim > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From tim at cerazone.net Mon May 14 11:48:16 2012 From: tim at cerazone.net (Tim Cera) Date: Mon, 14 May 2012 11:48:16 -0400 Subject: [SciPy-Dev] Missing docstrings in the SciPy docstring editor In-Reply-To: References: Message-ID: I just checked it and it is back to normal. When I sent the message there where only three links. On Mon, May 14, 2012 at 11:43 AM, wrote: > On Mon, May 14, 2012 at 10:58 AM, Tim Cera wrote: > > Most of the docstrings are gone from the on-line editor... > > http://docs.scipy.org/scipy/docs/ > > What are we supposed to see? 
> > I see a long list of blocks of function paths, and some spot checking > shows the individual docstrings. > > Cheers, > > Josef > > > > > Kindest regards, > > Tim > > > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-dev > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanforeest at gmail.com Mon May 14 13:56:13 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Mon, 14 May 2012 19:56:13 +0200 Subject: [SciPy-Dev] scipy.stats: algorithm to for ticket 1493 In-Reply-To: References: Message-ID: > one case I was worried about, but works well: > >>>> p,r = optimize.brentq(lambda x: np.minimum(1, np.maximum(x,0)) -1e-30, -1000, 1000, full_output=1) >>>> p > -1.2079226507921669e-12 Nice example. The answer is negative, while it should be positive, but the answer is within numerical accuracy I would say. > I don't see anything yet to criticize in your latest version :( Ok. I just checked the tests in scipy/stats/tests. It seems that these need not be changed. Thus I propose to do the following - make a new branch - repair for the cases q = 0 and q = 1 by means of an explicit test. - implement findppf in a suitable way in distributions.py - remove xa and xb - send a pull request In case this list is not complete, please let me know. Otherwise you'll see the pull request Nicky In case I m From josef.pktd at gmail.com Mon May 14 14:08:02 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 14 May 2012 14:08:02 -0400 Subject: [SciPy-Dev] scipy.stats: algorithm to for ticket 1493 In-Reply-To: References: Message-ID: On Mon, May 14, 2012 at 1:56 PM, nicky van foreest wrote: >> one case I was worried about, but works well: >> >>>>> p,r = optimize.brentq(lambda x: np.minimum(1, np.maximum(x,0)) -1e-30, -1000, 1000, full_output=1) >>>>> p >> -1.2079226507921669e-12 > > Nice example. The answer is negative, while it should be positive, but > the answer is within numerical accuracy I would say. oops, didn't we have a case with negative sign already ? maybe a check self.a <= p <= self.b ? > >> I don't see anything yet to criticize in your latest version :( > > Ok. I just checked the tests in scipy/stats/tests. If you are curious, you could temporarily go closer to q=0 and q=1 in the tests for ppf, and see whether it breaks for any distribution. > It seems that these > need not be changed. Thus I propose to do the following > > - make a new branch > - repair for the cases q = ?0 and q = 1 by means of an explicit test. isn't ppf (generic part) taking care of this, if not then it should, I think ppf(0) = self.a ppf(1) = self.b > - implement findppf in a suitable way in distributions.py > - remove xa and xb > - send a pull request > > In case this list is not complete, please let me know. Otherwise > you'll see the pull request sounds good. Josef > > Nicky > > In case I m > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From vanforeest at gmail.com Mon May 14 14:45:29 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Mon, 14 May 2012 20:45:29 +0200 Subject: [SciPy-Dev] scipy.stats: algorithm to for ticket 1493 In-Reply-To: References: Message-ID: >> Nice example. 
The answer is negative, while it should be positive, but >> the answer is within numerical accuracy I would say. > > oops, didn't we have a case with negative sign already ? > maybe a check self.a <= p <= self.b ?? I included this. I also think that a check on whether left and right stay within self.a and self.b should be included, perhaps just for safety reasons. > >> >>> I don't see anything yet to criticize in your latest version :( >> >> Ok. I just checked the tests in scipy/stats/tests. > > If you are curious, you could temporarily go closer to q=0 and q=1 in > the tests for ppf, and see whether it breaks for any distribution. Good idea. Just to see what would happen I changed the following code in test_continuous_basic.py: @_silence_fp_errors def check_cdf_ppf(distfn,arg,msg): values = [-1.e-5, 0.,0.001,0.5,0.999,1.] npt.assert_almost_equal(distfn.cdf(distfn.ppf(values, *arg), *arg), values, decimal=DECIMAL, err_msg= msg + \ ' - cdf-ppf roundtrip') Thus, I changed the values into an array. It should fail on the first value, as it is negative, but I get a pass. Specifically, I ran: nicky at chuck:~/prog/scipy/scipy/stats/tests$ python test_continuous_basic.py .............................................................................................................................. ---------------------------------------------------------------------- Ran 126 tests in 93.990s OK > Weird result. If I add a q = 1.0000001 I get a fail on the fourth test, as expected. >> - repair for the cases q = ?0 and q = 1 by means of an explicit test. > > isn't ppf (generic part) taking care of this, if not then it should, I think Actually, from the code in lines: https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L1529 I am inclined to believe you. However, in view of the above test ... Might it be that the conditions on L1529 have been added quite recently, and did not yet make it to my machine? I'll check this right now....As a matter of fact, my distributions.py contains the same check, i.e., cond1 = (q > 0) & (q < 1) . Hmmm. Now I admit that I do not understand in all nitty-gritty detail the entire implementation of ppf(), but I suspect that this is a bug. > > ppf(0) = self.a > ppf(1) = self.b Good idea. I'll implement the code in my branch, and do a pull request. From josef.pktd at gmail.com Mon May 14 15:51:35 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 14 May 2012 15:51:35 -0400 Subject: [SciPy-Dev] scipy.stats: algorithm to for ticket 1493 In-Reply-To: References: Message-ID: On Mon, May 14, 2012 at 2:45 PM, nicky van foreest wrote: >>> Nice example. The answer is negative, while it should be positive, but >>> the answer is within numerical accuracy I would say. >> >> oops, didn't we have a case with negative sign already ? >> maybe a check self.a <= p <= self.b ?? > > I included this. I also think that a check on whether left and right > stay within ?self.a and self.b should be included, perhaps just for > safety reasons. > >> >>> >>>> I don't see anything yet to criticize in your latest version :( >>> >>> Ok. I just checked the tests in scipy/stats/tests. >> >> If you are curious, you could temporarily go closer to q=0 and q=1 in >> the tests for ppf, and see whether it breaks for any distribution. > > Good idea. Just to see what would happen I changed the following code > in test_continuous_basic.py: > > @_silence_fp_errors > def check_cdf_ppf(distfn,arg,msg): > ? ?values = [-1.e-5, 0.,0.001,0.5,0.999,1.] > ? 
?npt.assert_almost_equal(distfn.cdf(distfn.ppf(values, *arg), *arg), > ? ? ? ? ? ? ? ? ? ? ? ? ? ?values, decimal=DECIMAL, err_msg= msg + \ > ? ? ? ? ? ? ? ? ? ? ? ? ? ?' - cdf-ppf roundtrip') roundtrip: looks like ppf should be ok, but cdf is not >>> stats.norm.ppf(-1e-5) nan >>> stats.norm.cdf(np.nan) 0.0 >>> stats.norm.cdf(stats.norm.ppf(-1e-5)) 0.0 I'm using scipy 0.9. but I don't think this has changed, not that I know of I'm trying to track down when this got changed. (github doesn't show changes in a file that has too many changes, need to dig out git) > > > Thus, I changed the values into an array. It should fail on the first > value, as it is negative, but I get a pass. Specifically, I ran: > > nicky at chuck:~/prog/scipy/scipy/stats/tests$ python test_continuous_basic.py > .............................................................................................................................. > ---------------------------------------------------------------------- > Ran 126 tests in 93.990s > > OK > >> > > Weird result. If I add a q ?= 1.0000001 I get a fail on the fourth > test, as expected. > >>> - repair for the cases q = ?0 and q = 1 by means of an explicit test. >> >> isn't ppf (generic part) taking care of this, if not then it should, I think > > Actually, from the code in lines: > > https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L1529 > > I am inclined to believe you. However, in view of the above test ... > Might it be that the conditions on L1529 have been added quite > recently, and did not yet make it to my machine? I'll check this right > now....As a matter of fact, my distributions.py contains the same > check, i.e., ? ? ? ? cond1 = (q > 0) & (q < 1) . Hmmm. > > Now I admit that I do not understand in all nitty-gritty detail the > entire implementation of ppf(), but I suspect that this is a bug. > >> >> ppf(0) = self.a >> ppf(1) = self.b > > Good idea. this already looks correct in the generic ppf code >>> stats.beta.ppf(0, 0.5) 0.0 >>> stats.beta.a 0.0 Josef > > I'll implement the code in my branch, and do a pull request. > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From vanforeest at gmail.com Mon May 14 16:01:01 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Mon, 14 May 2012 22:01:01 +0200 Subject: [SciPy-Dev] reading a module under test Message-ID: Hi, I would like to run a test on some code in my local branch of scipy. Now the problem in one of the test files is that it says from scipy import stats Now this reads the standard stats module, not the one I want to test, i.e, the one on my local branch. I changed the pythonpath, but this does not help. Is there a generic way to say something like from scipy import path_to_my_stats_under_test/stats ? Specifically, I want to load this file: /home/nicky/prog/scipy/scipy/stats/distributions.py, and I don't want to load /usr/lib/python2.7/dist-packages/scipy/stats/distributions.py. Thanks for any help. Nicky From denis at laxalde.org Mon May 14 16:09:37 2012 From: denis at laxalde.org (Denis Laxalde) Date: Mon, 14 May 2012 16:09:37 -0400 Subject: [SciPy-Dev] reading a module under test In-Reply-To: References: Message-ID: <4FB16681.2000802@laxalde.org> nicky van foreest a ?crit : > I would like to run a test on some code in my local branch of scipy. 
> Now the problem in one of the test files is that it says > > from scipy import stats > > Now this reads the standard stats module, not the one I want to test, > i.e, the one on my local branch. I changed the pythonpath, but this > does not help. Is there a generic way to say something like > > from scipy import path_to_my_stats_under_test/stats ? > > Specifically, I want to load this file: > /home/nicky/prog/scipy/scipy/stats/distributions.py, and I don't want > to load /usr/lib/python2.7/dist-packages/scipy/stats/distributions.py. You could build scipy from source and install it under your home directory using (from the root directory of sources): python setup.py install --user This will (on UNIX-like systems) install scipy in .local/lib/python2.7/site-packages/scipy. This directory comes before /usr/lib/python2.7/dist-packages so no need to tweak the PYTHONPATH. Then to run the tests, python -c "from scipy import stats; stats.test()". -- Denis From ralf.gommers at googlemail.com Mon May 14 16:13:48 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 14 May 2012 22:13:48 +0200 Subject: [SciPy-Dev] reading a module under test In-Reply-To: <4FB16681.2000802@laxalde.org> References: <4FB16681.2000802@laxalde.org> Message-ID: On Mon, May 14, 2012 at 10:09 PM, Denis Laxalde wrote: > nicky van foreest a ?crit : > > I would like to run a test on some code in my local branch of scipy. > > Now the problem in one of the test files is that it says > > > > from scipy import stats > > > > Now this reads the standard stats module, not the one I want to test, > > i.e, the one on my local branch. I changed the pythonpath, but this > > does not help. > It should help. Check with $echo $PYTHONPATH that the dir you installed it in comes before your site-packages dir. To achieve this, I have in my .bash_login this: export PYTHONPATH="$HOME/Code/numpy:$HOME/Code/scipy:${PYTHONPATH}" > Is there a generic way to say something like > > > > from scipy import path_to_my_stats_under_test/stats ? > Not really. > > Specifically, I want to load this file: > > /home/nicky/prog/scipy/scipy/stats/distributions.py, and I don't want > > to load /usr/lib/python2.7/dist-packages/scipy/stats/distributions.py. > > You could build scipy from source and install it under your home > directory using (from the root directory of sources): > > python setup.py install --user > > This will (on UNIX-like systems) install scipy in > .local/lib/python2.7/site-packages/scipy. This directory comes before > /usr/lib/python2.7/dist-packages so no need to tweak the PYTHONPATH. > > Possible too, but not so handy if you want to edit in-place and then commit the results. Ralf Then to run the tests, python -c "from scipy import stats; stats.test()". > > -- > Denis > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon May 14 16:15:37 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 14 May 2012 16:15:37 -0400 Subject: [SciPy-Dev] scipy.stats: algorithm to for ticket 1493 In-Reply-To: References: Message-ID: On Mon, May 14, 2012 at 3:51 PM, wrote: > On Mon, May 14, 2012 at 2:45 PM, nicky van foreest wrote: >>>> Nice example. The answer is negative, while it should be positive, but >>>> the answer is within numerical accuracy I would say. >>> >>> oops, didn't we have a case with negative sign already ? 
>>> maybe a check self.a <= p <= self.b ?? >> >> I included this. I also think that a check on whether left and right >> stay within ?self.a and self.b should be included, perhaps just for >> safety reasons. >> >>> >>>> >>>>> I don't see anything yet to criticize in your latest version :( >>>> >>>> Ok. I just checked the tests in scipy/stats/tests. >>> >>> If you are curious, you could temporarily go closer to q=0 and q=1 in >>> the tests for ppf, and see whether it breaks for any distribution. >> >> Good idea. Just to see what would happen I changed the following code >> in test_continuous_basic.py: >> >> @_silence_fp_errors >> def check_cdf_ppf(distfn,arg,msg): >> ? ?values = [-1.e-5, 0.,0.001,0.5,0.999,1.] >> ? ?npt.assert_almost_equal(distfn.cdf(distfn.ppf(values, *arg), *arg), >> ? ? ? ? ? ? ? ? ? ? ? ? ? ?values, decimal=DECIMAL, err_msg= msg + \ >> ? ? ? ? ? ? ? ? ? ? ? ? ? ?' - cdf-ppf roundtrip') > > roundtrip: looks like ppf should be ok, but cdf is not > >>>> stats.norm.ppf(-1e-5) > nan >>>> stats.norm.cdf(np.nan) > 0.0 >>>> stats.norm.cdf(stats.norm.ppf(-1e-5)) > 0.0 > > I'm using scipy 0.9. but I don't think this has changed, not that I know of > > I'm trying to track down when this got changed. > (github doesn't show changes in a file that has too many changes, need > to dig out git) It would be better to run the same version as looking at the code. It's difficult to find the bug or understand the behavior if it's not there anymore switching to scipy 0.10 >>> stats.norm.cdf(np.nan) nan >>> scipy.__version__ '0.10.0b2' nan propagation is not available in 0.9.0 https://github.com/scipy/scipy/commit/96e39ecc6a2b671ed7f99a9c0375adc9238c6056#L0L1343 Josef > >> >> >> Thus, I changed the values into an array. It should fail on the first >> value, as it is negative, but I get a pass. Specifically, I ran: >> >> nicky at chuck:~/prog/scipy/scipy/stats/tests$ python test_continuous_basic.py >> .............................................................................................................................. >> ---------------------------------------------------------------------- >> Ran 126 tests in 93.990s >> >> OK >> >>> >> >> Weird result. If I add a q ?= 1.0000001 I get a fail on the fourth >> test, as expected. >> >>>> - repair for the cases q = ?0 and q = 1 by means of an explicit test. >>> >>> isn't ppf (generic part) taking care of this, if not then it should, I think >> >> Actually, from the code in lines: >> >> https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L1529 >> >> I am inclined to believe you. However, in view of the above test ... >> Might it be that the conditions on L1529 have been added quite >> recently, and did not yet make it to my machine? I'll check this right >> now....As a matter of fact, my distributions.py contains the same >> check, i.e., ? ? ? ? cond1 = (q > 0) & (q < 1) . Hmmm. >> >> Now I admit that I do not understand in all nitty-gritty detail the >> entire implementation of ppf(), but I suspect that this is a bug. >> >>> >>> ppf(0) = self.a >>> ppf(1) = self.b >> >> Good idea. > > this already looks correct in the generic ppf code > >>>> stats.beta.ppf(0, 0.5) > 0.0 >>>> stats.beta.a > 0.0 > > Josef >> >> I'll implement the code in my branch, and do a pull request. 
>> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev From vanforeest at gmail.com Mon May 14 16:45:24 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Mon, 14 May 2012 22:45:24 +0200 Subject: [SciPy-Dev] reading a module under test In-Reply-To: References: <4FB16681.2000802@laxalde.org> Message-ID: Thanks. On 14 May 2012 22:13, Ralf Gommers wrote: > > > On Mon, May 14, 2012 at 10:09 PM, Denis Laxalde wrote: >> >> nicky van foreest a ?crit : >> > I would like to run a test on some code in my local branch of scipy. >> > Now the problem in one of the test files is that it says >> > >> > from scipy import stats >> > >> > Now this reads the standard stats module, not the one I want to test, >> > i.e, the one on my local branch. I changed the pythonpath, but this >> > does not help. > > > It should help. Check with $echo $PYTHONPATH that the dir you installed it I made a typo in the python path... > in comes before your site-packages dir. To achieve this, I have in my > .bash_login this: > > export PYTHONPATH="$HOME/Code/numpy:$HOME/Code/scipy:${PYTHONPATH}" This does not really work. Look here: nicky at chuck:~$ export PYTHONPATH="$HOME/prog/scipy/scipy:${PYTHONPATH}" nicky at chuck:~$ python Python 2.7.3 (default, Apr 20 2012, 22:39:59) [GCC 4.6.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import scipy >>> scipy.__version__ '0.9.0' >>> nicky at chuck:~$ python -c "from scipy import stats; stats.test()" Running unit tests for scipy.stats NumPy version 1.6.1 NumPy is installed in /usr/lib/python2.7/dist-packages/numpy SciPy version 0.9.0 SciPy is installed in /usr/lib/python2.7/dist-packages/scipy Python version 2.7.3 (default, Apr 20 2012, 22:39:59) [GCC 4.6.3] I suspect that ubuntu first searches along some other paths, and then uses my PYTHONPATH. @Denis >> You could build scipy from source and install it under your home >> directory using (from the root directory of sources): >> >> ? ? python setup.py install --user >> >> This will (on UNIX-like systems) install scipy in >> .local/lib/python2.7/site-packages/scipy. This directory comes before >> /usr/lib/python2.7/dist-packages so no need to tweak the PYTHONPATH. >> > Possible too, but not so handy if you want to edit in-place and then commit > the results. Thanks. There is also another problem. I get lots of blas and lapack warnings and errors. I'll first try to resolve the matter with setting a correct path. Nicky From vanforeest at gmail.com Mon May 14 16:46:35 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Mon, 14 May 2012 22:46:35 +0200 Subject: [SciPy-Dev] scipy.stats: algorithm to for ticket 1493 In-Reply-To: References: Message-ID: Yes, you're right. I am trying to use the right version of scipy and stats, but I first have to figure out how to that. Nicky On 14 May 2012 22:15, wrote: > On Mon, May 14, 2012 at 3:51 PM, ? wrote: >> On Mon, May 14, 2012 at 2:45 PM, nicky van foreest wrote: >>>>> Nice example. The answer is negative, while it should be positive, but >>>>> the answer is within numerical accuracy I would say. >>>> >>>> oops, didn't we have a case with negative sign already ? >>>> maybe a check self.a <= p <= self.b ?? >>> >>> I included this. I also think that a check on whether left and right >>> stay within ?self.a and self.b should be included, perhaps just for >>> safety reasons. 
>>> >>>> >>>>> >>>>>> I don't see anything yet to criticize in your latest version :( >>>>> >>>>> Ok. I just checked the tests in scipy/stats/tests. >>>> >>>> If you are curious, you could temporarily go closer to q=0 and q=1 in >>>> the tests for ppf, and see whether it breaks for any distribution. >>> >>> Good idea. Just to see what would happen I changed the following code >>> in test_continuous_basic.py: >>> >>> @_silence_fp_errors >>> def check_cdf_ppf(distfn,arg,msg): >>> ? ?values = [-1.e-5, 0.,0.001,0.5,0.999,1.] >>> ? ?npt.assert_almost_equal(distfn.cdf(distfn.ppf(values, *arg), *arg), >>> ? ? ? ? ? ? ? ? ? ? ? ? ? ?values, decimal=DECIMAL, err_msg= msg + \ >>> ? ? ? ? ? ? ? ? ? ? ? ? ? ?' - cdf-ppf roundtrip') >> >> roundtrip: looks like ppf should be ok, but cdf is not >> >>>>> stats.norm.ppf(-1e-5) >> nan >>>>> stats.norm.cdf(np.nan) >> 0.0 >>>>> stats.norm.cdf(stats.norm.ppf(-1e-5)) >> 0.0 >> >> I'm using scipy 0.9. but I don't think this has changed, not that I know of >> >> I'm trying to track down when this got changed. >> (github doesn't show changes in a file that has too many changes, need >> to dig out git) > > It would be better to run the same version as looking at the code. > It's difficult to find the bug or understand the behavior if it's not > there anymore > > switching to scipy 0.10 > >>>> stats.norm.cdf(np.nan) > nan >>>> scipy.__version__ > '0.10.0b2' > > nan propagation is not available in 0.9.0 > > https://github.com/scipy/scipy/commit/96e39ecc6a2b671ed7f99a9c0375adc9238c6056#L0L1343 > > Josef > >> >>> >>> >>> Thus, I changed the values into an array. It should fail on the first >>> value, as it is negative, but I get a pass. Specifically, I ran: >>> >>> nicky at chuck:~/prog/scipy/scipy/stats/tests$ python test_continuous_basic.py >>> .............................................................................................................................. >>> ---------------------------------------------------------------------- >>> Ran 126 tests in 93.990s >>> >>> OK >>> >>>> >>> >>> Weird result. If I add a q ?= 1.0000001 I get a fail on the fourth >>> test, as expected. >>> >>>>> - repair for the cases q = ?0 and q = 1 by means of an explicit test. >>>> >>>> isn't ppf (generic part) taking care of this, if not then it should, I think >>> >>> Actually, from the code in lines: >>> >>> https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L1529 >>> >>> I am inclined to believe you. However, in view of the above test ... >>> Might it be that the conditions on L1529 have been added quite >>> recently, and did not yet make it to my machine? I'll check this right >>> now....As a matter of fact, my distributions.py contains the same >>> check, i.e., ? ? ? ? cond1 = (q > 0) & (q < 1) . Hmmm. >>> >>> Now I admit that I do not understand in all nitty-gritty detail the >>> entire implementation of ppf(), but I suspect that this is a bug. >>> >>>> >>>> ppf(0) = self.a >>>> ppf(1) = self.b >>> >>> Good idea. >> >> this already looks correct in the generic ppf code >> >>>>> stats.beta.ppf(0, 0.5) >> 0.0 >>>>> stats.beta.a >> 0.0 >> >> Josef >>> >>> I'll implement the code in my branch, and do a pull request. 
>>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From ralf.gommers at googlemail.com Mon May 14 16:51:03 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 14 May 2012 22:51:03 +0200 Subject: [SciPy-Dev] reading a module under test In-Reply-To: References: <4FB16681.2000802@laxalde.org> Message-ID: On Mon, May 14, 2012 at 10:45 PM, nicky van foreest wrote: > Thanks. > > On 14 May 2012 22:13, Ralf Gommers wrote: > > > > > > On Mon, May 14, 2012 at 10:09 PM, Denis Laxalde > wrote: > >> > >> nicky van foreest a ?crit : > >> > I would like to run a test on some code in my local branch of scipy. > >> > Now the problem in one of the test files is that it says > >> > > >> > from scipy import stats > >> > > >> > Now this reads the standard stats module, not the one I want to test, > >> > i.e, the one on my local branch. I changed the pythonpath, but this > >> > does not help. > > > > > > It should help. Check with $echo $PYTHONPATH that the dir you installed > it > > I made a typo in the python path... > > > in comes before your site-packages dir. To achieve this, I have in my > > .bash_login this: > > > > export PYTHONPATH="$HOME/Code/numpy:$HOME/Code/scipy:${PYTHONPATH}" > > This does not really work. Look here: > > nicky at chuck:~$ export PYTHONPATH="$HOME/prog/scipy/scipy:${PYTHONPATH}" > I assume that prog/scipy is the base folder of your git repo, containing the main setup.py. If so, remove the second scipy from prog/scipy/scipy. Then it should work. nicky at chuck:~$ python > Python 2.7.3 (default, Apr 20 2012, 22:39:59) > [GCC 4.6.3] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import scipy > >>> scipy.__version__ > '0.9.0' > >>> > nicky at chuck:~$ python -c "from scipy import stats; stats.test()" > Running unit tests for scipy.stats > NumPy version 1.6.1 > NumPy is installed in /usr/lib/python2.7/dist-packages/numpy > SciPy version 0.9.0 > SciPy is installed in /usr/lib/python2.7/dist-packages/scipy > Python version 2.7.3 (default, Apr 20 2012, 22:39:59) [GCC 4.6.3] > > I suspect that ubuntu first searches along some other paths, and then > uses my PYTHONPATH. > That's quite unlikely. Ralf > @Denis > >> You could build scipy from source and install it under your home > >> directory using (from the root directory of sources): > >> > >> python setup.py install --user > >> > >> This will (on UNIX-like systems) install scipy in > >> .local/lib/python2.7/site-packages/scipy. This directory comes before > >> /usr/lib/python2.7/dist-packages so no need to tweak the PYTHONPATH. > >> > > Possible too, but not so handy if you want to edit in-place and then > commit > > the results. > > Thanks. There is also another problem. I get lots of blas and lapack > warnings and errors. > > I'll first try to resolve the matter with setting a correct path. > > Nicky > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From denis at laxalde.org Mon May 14 16:52:03 2012 From: denis at laxalde.org (Denis Laxalde) Date: Mon, 14 May 2012 16:52:03 -0400 Subject: [SciPy-Dev] reading a module under test In-Reply-To: References: <4FB16681.2000802@laxalde.org> Message-ID: <4FB17073.20003@laxalde.org> nicky van foreest a ?crit : > Thanks. There is also another problem. I get lots of blas and lapack > warnings and errors. Do you have the package libatlas-base-dev installed? From josef.pktd at gmail.com Mon May 14 17:11:32 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 14 May 2012 17:11:32 -0400 Subject: [SciPy-Dev] scipy.stats: algorithm to for ticket 1493 In-Reply-To: References: Message-ID: On Mon, May 14, 2012 at 4:46 PM, nicky van foreest wrote: > Yes, ?you're right. I am trying to use the right version of scipy and > stats, but I first have to figure out how to that. It was also directed at myself, I spent half an hour staring at the online code looking for the problem that wasn't there. :) I switch python versions when I want to switch scipy versions. (python 2.5 with scipy 0.7, ...) Josef > > Nicky > > On 14 May 2012 22:15, ? wrote: >> On Mon, May 14, 2012 at 3:51 PM, ? wrote: >>> On Mon, May 14, 2012 at 2:45 PM, nicky van foreest wrote: >>>>>> Nice example. The answer is negative, while it should be positive, but >>>>>> the answer is within numerical accuracy I would say. >>>>> >>>>> oops, didn't we have a case with negative sign already ? >>>>> maybe a check self.a <= p <= self.b ?? >>>> >>>> I included this. I also think that a check on whether left and right >>>> stay within ?self.a and self.b should be included, perhaps just for >>>> safety reasons. >>>> >>>>> >>>>>> >>>>>>> I don't see anything yet to criticize in your latest version :( >>>>>> >>>>>> Ok. I just checked the tests in scipy/stats/tests. >>>>> >>>>> If you are curious, you could temporarily go closer to q=0 and q=1 in >>>>> the tests for ppf, and see whether it breaks for any distribution. >>>> >>>> Good idea. Just to see what would happen I changed the following code >>>> in test_continuous_basic.py: >>>> >>>> @_silence_fp_errors >>>> def check_cdf_ppf(distfn,arg,msg): >>>> ? ?values = [-1.e-5, 0.,0.001,0.5,0.999,1.] >>>> ? ?npt.assert_almost_equal(distfn.cdf(distfn.ppf(values, *arg), *arg), >>>> ? ? ? ? ? ? ? ? ? ? ? ? ? ?values, decimal=DECIMAL, err_msg= msg + \ >>>> ? ? ? ? ? ? ? ? ? ? ? ? ? ?' - cdf-ppf roundtrip') >>> >>> roundtrip: looks like ppf should be ok, but cdf is not >>> >>>>>> stats.norm.ppf(-1e-5) >>> nan >>>>>> stats.norm.cdf(np.nan) >>> 0.0 >>>>>> stats.norm.cdf(stats.norm.ppf(-1e-5)) >>> 0.0 >>> >>> I'm using scipy 0.9. but I don't think this has changed, not that I know of >>> >>> I'm trying to track down when this got changed. >>> (github doesn't show changes in a file that has too many changes, need >>> to dig out git) >> >> It would be better to run the same version as looking at the code. >> It's difficult to find the bug or understand the behavior if it's not >> there anymore >> >> switching to scipy 0.10 >> >>>>> stats.norm.cdf(np.nan) >> nan >>>>> scipy.__version__ >> '0.10.0b2' >> >> nan propagation is not available in 0.9.0 >> >> https://github.com/scipy/scipy/commit/96e39ecc6a2b671ed7f99a9c0375adc9238c6056#L0L1343 >> >> Josef >> >>> >>>> >>>> >>>> Thus, I changed the values into an array. It should fail on the first >>>> value, as it is negative, but I get a pass. 
Specifically, I ran: >>>> >>>> nicky at chuck:~/prog/scipy/scipy/stats/tests$ python test_continuous_basic.py >>>> .............................................................................................................................. >>>> ---------------------------------------------------------------------- >>>> Ran 126 tests in 93.990s >>>> >>>> OK >>>> >>>>> >>>> >>>> Weird result. If I add a q ?= 1.0000001 I get a fail on the fourth >>>> test, as expected. >>>> >>>>>> - repair for the cases q = ?0 and q = 1 by means of an explicit test. >>>>> >>>>> isn't ppf (generic part) taking care of this, if not then it should, I think >>>> >>>> Actually, from the code in lines: >>>> >>>> https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L1529 >>>> >>>> I am inclined to believe you. However, in view of the above test ... >>>> Might it be that the conditions on L1529 have been added quite >>>> recently, and did not yet make it to my machine? I'll check this right >>>> now....As a matter of fact, my distributions.py contains the same >>>> check, i.e., ? ? ? ? cond1 = (q > 0) & (q < 1) . Hmmm. >>>> >>>> Now I admit that I do not understand in all nitty-gritty detail the >>>> entire implementation of ppf(), but I suspect that this is a bug. >>>> >>>>> >>>>> ppf(0) = self.a >>>>> ppf(1) = self.b >>>> >>>> Good idea. >>> >>> this already looks correct in the generic ppf code >>> >>>>>> stats.beta.ppf(0, 0.5) >>> 0.0 >>>>>> stats.beta.a >>> 0.0 >>> >>> Josef >>>> >>>> I'll implement the code in my branch, and do a pull request. >>>> _______________________________________________ >>>> SciPy-Dev mailing list >>>> SciPy-Dev at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-dev >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From vanforeest at gmail.com Mon May 14 17:13:49 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Mon, 14 May 2012 23:13:49 +0200 Subject: [SciPy-Dev] reading a module under test In-Reply-To: References: <4FB16681.2000802@laxalde.org> Message-ID: > I assume that prog/scipy is the base folder of your git repo, containing the > main setup.py. If so, remove the second scipy from prog/scipy/scipy. Then it > should work. I did that, but then I got the following error: nicky at chuck:~/prog/scipy$ export PYTHONPATH="$HOME/prog/scipy/:${PYTHONPATH}" nicky at chuck:~/prog/scipy$ python Python 2.7.3 (default, Apr 20 2012, 22:39:59) [GCC 4.6.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import scipy Traceback (most recent call last): File "", line 1, in File "scipy/__init__.py", line 126, in raise ImportError(msg) ImportError: Error importing scipy: you cannot import scipy while being in scipy source directory; please exit the scipy source tree first, and relaunch your python intepreter. >>> nicky at chuck:~/prog/scipy$ cd .. nicky at chuck:~/prog$ python Python 2.7.3 (default, Apr 20 2012, 22:39:59) [GCC 4.6.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import scipy Traceback (most recent call last): File "", line 1, in File "/home/nicky/prog/scipy/scipy/__init__.py", line 126, in raise ImportError(msg) ImportError: Error importing scipy: you cannot import scipy while being in scipy source directory; please exit the scipy source tree first, and relaunch your python intepreter. >>> BTW, there is a typo here: intepreter Strange, don't you think? I tried Denis' trick, and this works. Thanks for the help From vanforeest at gmail.com Mon May 14 17:20:33 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Mon, 14 May 2012 23:20:33 +0200 Subject: [SciPy-Dev] reading a module under test In-Reply-To: <4FB17073.20003@laxalde.org> References: <4FB16681.2000802@laxalde.org> <4FB17073.20003@laxalde.org> Message-ID: On 14 May 2012 22:52, Denis Laxalde wrote: > nicky van foreest a ?crit : >> Thanks. There is also another problem. I get lots of blas and lapack >> warnings and errors. > > Do you have the package libatlas-base-dev installed? That helped quite a bit. I also installed gfortran, and then everything compiled. Finally, to make it work I linked .local/lib/.../distributions.py to the distributions.py in my github repo. So, testing a few lines of code in distributions.py took quite some time. As an anecdote: I moved from gentoo linux to ubuntu, since I didn't like to compile all the stuff to get something working. In fact, I changed from C++ and fortan to python for the very same reason. But then, to test some lines in a python file made me feel 10 years younger :-) Nevertheless, I reached the goal. Thanks for your help. I must admit that I would like to see a simpler way to test just a few new lines of a file like distributions.py... NIcky > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From vanforeest at gmail.com Mon May 14 17:25:39 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Mon, 14 May 2012 23:25:39 +0200 Subject: [SciPy-Dev] scipy.stats: algorithm to for ticket 1493 In-Reply-To: References: Message-ID: As you might have seen in another set of mails, I am up and running again: nicky at chuck:~$ python Python 2.7.3 (default, Apr 20 2012, 22:39:59) [GCC 4.6.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> >>> import scipy >>> scipy.__version__ '0.11.0.dev-85c0992' >>> > I switch python versions when I want to switch scipy versions. (python > 2.5 with scipy 0.7, ...) I don't know how to do this. Let's see how I fare with the approach of pulling scipy, compiling, and implementing changes. Tomorrow I'll try to add the ppf code. I plan to change these lines: https://github.com/scipy/scipy/blob/master/scipy/stats/distributions.py#L1180 In case this is wrong, please let me know. From robert.kern at gmail.com Mon May 14 17:33:26 2012 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 14 May 2012 22:33:26 +0100 Subject: [SciPy-Dev] reading a module under test In-Reply-To: References: Message-ID: On Mon, May 14, 2012 at 9:01 PM, nicky van foreest wrote: > Hi, > > I would like to run a test on some code in my local branch of scipy. > Now the problem in one of the test files is that it says > > from scipy import stats > > Now this reads the standard stats module, not the one I want to test, > i.e, the one on my local branch. I changed the pythonpath, but this > does not help. 
I do an in-place build: $ python setup.py build_src --inplace build_ext --inplace Then add the source directory to the $PYTHONPATH. Remove your installation in site-packages. -- Robert Kern From vanforeest at gmail.com Tue May 15 04:54:40 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Tue, 15 May 2012 10:54:40 +0200 Subject: [SciPy-Dev] reading a module under test In-Reply-To: References: Message-ID: > I do an in-place build: > > ?$ python setup.py build_src --inplace build_ext --inplace > > Then add the source directory to the $PYTHONPATH. Remove your > installation in site-packages. This works! Thanks. From lists at hilboll.de Tue May 15 05:45:40 2012 From: lists at hilboll.de (Andreas H.) Date: Tue, 15 May 2012 11:45:40 +0200 Subject: [SciPy-Dev] Question about stats.mstats.kendalltau_seasonal Message-ID: <4FB225C4.9050508@hilboll.de> Hi, looking at stats.mstats.kendalltau_seasonal, i noticed the lack of proper documentation for that function. I'd like to write something up about it, but wanted to verify that the original author was indeed following the approach in Hirsch, Robert M., and James R. Slack. ?A Nonparametric Trend Test for Seasonal Data With Serial Dependence.? Water Resources Research 20, no. 6 (1984): P. 727. (http://www.agu.org/pubs/crossref/1984/WR020i006p00727.shtml) Also, it would be nice to hear what exactly the different return values are. If you guys tell me, I'll gladly write docstring + example. Cheers, Andreas. From pgmdevlist at gmail.com Tue May 15 06:31:50 2012 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 15 May 2012 12:31:50 +0200 Subject: [SciPy-Dev] Question about stats.mstats.kendalltau_seasonal In-Reply-To: <4FB225C4.9050508@hilboll.de> References: <4FB225C4.9050508@hilboll.de> Message-ID: Andreas, IIRC, that's indeed the reference I used. The meaning of the different outputs can be inferred from the paper and from the code: the seasonal tau and its corresponding p-value are the main results, but I also added the corresponding (global) tau and p-value when seasonality isn't taken into account, for comparison purposes. The corresponding unit-test should use some data presented in (Hirsch and Slack, 1984). Once again, reading the paper should clarify everything (alas, my copy is in a box in an attic an ocean away). Sorry for not being more helpful P. -- Pierre GM Sent with Sparrow On Tuesday, May 15, 2012 at 11:45 , Andreas H. wrote: Hi, looking at stats.mstats.kendalltau_seasonal, i noticed the lack of proper documentation for that function. I'd like to write something up about it, but wanted to verify that the original author was indeed following the approach in Hirsch, Robert M., and James R. Slack. ?A Nonparametric Trend Test for Seasonal Data With Serial Dependence.? Water Resources Research 20, no. 6 (1984): P. 727. (http://www.agu.org/pubs/crossref/1984/WR020i006p00727.shtml) Also, it would be nice to hear what exactly the different return values are. If you guys tell me, I'll gladly write docstring + example. Cheers, Andreas. _______________________________________________ SciPy-Dev mailing list SciPy-Dev at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aia8v at virginia.edu Tue May 15 08:30:31 2012 From: aia8v at virginia.edu (alex arsenovic) Date: Tue, 15 May 2012 08:30:31 -0400 Subject: [SciPy-Dev] ipython notebook for docs/examples In-Reply-To: References: <4FB046DC.9030106@virginia.edu> Message-ID: <4FB24C67.20408@virginia.edu> sounds great. fernando, will notebook handle sphinx cross-referencing? alex On 05/13/2012 08:42 PM, Fernando Perez wrote: > Hi Alex, > > On Sun, May 13, 2012 at 4:42 PM, alex arsenovic wrote: >> i recently took a look at the ipython notebook feature. its awesome. >> if it is interface-able with sphinx, i could write docs way faster. >> also, it seems to me that making interactive tutorials/examples with >> this accesable through a server would work great. >> >> has this idea already been considered? if not , does anyone have >> thoughts on this? > Sure :) Just to give everyone a quick status check on this idea: the > main point is that we haven't yet finished the machinery to generate > sphinx-compatible rst from notebooks. The code currently lives in a > standalone repo: > > https://github.com/ipython/nbconvert > > So this is still a bit 'raw'. But with a bit of luck, in a few days > I'll finish off the rst conversion machinery and we'll be in a > reasonable shape to start looking at merging it into ipython proper. > > If you are interested in helping along, let me know and I'll provide > more details. > > I hope that in the future we'll be able to provide with all the > 'scipy*' projects: > > - executable notebooks for users to play with examples > - nice sphinx html generated from these for online docs > - pure .py versions of the codes for non-ipython use > > We're very close to all of this being possible, we just need to finish > up a tiny bit of code. > > Cheers, > > f > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From fperez.net at gmail.com Tue May 15 14:36:45 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 15 May 2012 11:36:45 -0700 Subject: [SciPy-Dev] ipython notebook for docs/examples In-Reply-To: <4FB24C67.20408@virginia.edu> References: <4FB046DC.9030106@virginia.edu> <4FB24C67.20408@virginia.edu> Message-ID: On Tue, May 15, 2012 at 5:30 AM, alex arsenovic wrote: > sounds great. fernando, will notebook ?handle sphinx cross-referencing? Handling reST is tricky because it can't be rendered client-side. But we have the notion of 'raw' cells that get passed to the conversion code unmodified, and you could use any reST/sphinx feature there. It wouldn't be rendered in any way in the live view, it would just show up as plain text, but it would work. Cheers, f From denis at laxalde.org Wed May 16 09:32:02 2012 From: denis at laxalde.org (Denis Laxalde) Date: Wed, 16 May 2012 09:32:02 -0400 Subject: [SciPy-Dev] prepending underscores to private module names In-Reply-To: References: Message-ID: <1337175122.8985.4.camel@schloss.campus.mcgill.ca> Hi, Le vendredi 03 juin 2011 ? 21:57 +0200, Ralf Gommers a ?crit : > A while ago we discussed making a clear distinction between public and > private modules by using underscores consistently, see > http://thread.gmane.org/gmane.comp.python.scientific.devel/14936. > > I've done this for a few modules now, see > https://github.com/rgommers/scipy/tree/underscores. Most files could > simply be underscored, some other would conflict with extension > modules so I also had to append _py to the name. 
Also I added a file > doc/source/api.rst that states which modules are public (up for > discussion of course). > > With google code search I checked how common it is to import from > modules that are supposed to be private, for example "from > scipy.optimize.optimize import fmin". It's not very common but does > happen, so deprecation warnings should be put in. > > Before going further I'd like to check that this looks okay to people. > What do you think? What is the status of this refactoring work? I think it is an interesting move. -- Denis From josef.pktd at gmail.com Wed May 16 09:36:46 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 16 May 2012 09:36:46 -0400 Subject: [SciPy-Dev] prepending underscores to private module names In-Reply-To: <1337175122.8985.4.camel@schloss.campus.mcgill.ca> References: <1337175122.8985.4.camel@schloss.campus.mcgill.ca> Message-ID: On Wed, May 16, 2012 at 9:32 AM, Denis Laxalde wrote: > Hi, > > Le vendredi 03 juin 2011 ? 21:57 +0200, Ralf Gommers a ?crit : >> A while ago we discussed making a clear distinction between public and >> private modules by using underscores consistently, see >> http://thread.gmane.org/gmane.comp.python.scientific.devel/14936. >> >> I've done this for a few modules now, see >> https://github.com/rgommers/scipy/tree/underscores. Most files could >> simply be underscored, some other would conflict with extension >> modules so I also had to append _py to the name. Also I added a file >> doc/source/api.rst that states which modules are public (up for >> discussion of course). >> >> With google code search I checked how common it is to import from >> modules that are supposed to be private, for example "from >> scipy.optimize.optimize import fmin". It's not very common but does >> happen, so deprecation warnings should be put in. >> >> Before going further I'd like to check that this looks okay to people. >> What do you think? > > What is the status of this refactoring work? > I think it is an interesting move. I think, dormant for renaming existing modules until or unless it finds more support. recommended for new modules Josef > > -- > Denis > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From denis at laxalde.org Wed May 16 09:47:16 2012 From: denis at laxalde.org (Denis Laxalde) Date: Wed, 16 May 2012 09:47:16 -0400 Subject: [SciPy-Dev] relative imports Message-ID: <1337176036.8985.6.camel@schloss.campus.mcgill.ca> Hi, What do people think about using relative imports [1] within scipy modules? One advantage I can see is to allow packages to be built and tested independently. Any drawbacks? [1]: http://docs.python.org/whatsnew/2.5.html#pep-328-absolute-and-relative-imports -- Denis From njs at pobox.com Wed May 16 09:58:14 2012 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 16 May 2012 14:58:14 +0100 Subject: [SciPy-Dev] relative imports In-Reply-To: <1337176036.8985.6.camel@schloss.campus.mcgill.ca> References: <1337176036.8985.6.camel@schloss.campus.mcgill.ca> Message-ID: On Wed, May 16, 2012 at 2:47 PM, Denis Laxalde wrote: > Hi, > > What do people think about using relative imports [1] within scipy > modules? One advantage I can see is to allow packages to be built and > tested independently. Any drawbacks? They require Python 2.5+, while (officially at least) scipy still supports Python 2.4. 
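For concreteness, the two spellings of an intra-package import look like this inside a package module (the importing file is hypothetical here; `minpack` is a real module under scipy.optimize, and only the second form needs Python 2.5+):

    # inside some module scipy/optimize/foo.py (foo.py is made up for the example)
    from scipy.optimize import minpack   # absolute import, also works on Python 2.4
    from . import minpack                # PEP 328 explicit relative import, Python 2.5+ only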
- N From robert.kern at gmail.com Wed May 16 10:12:45 2012 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 16 May 2012 15:12:45 +0100 Subject: [SciPy-Dev] relative imports In-Reply-To: <1337176036.8985.6.camel@schloss.campus.mcgill.ca> References: <1337176036.8985.6.camel@schloss.campus.mcgill.ca> Message-ID: On Wed, May 16, 2012 at 2:47 PM, Denis Laxalde wrote: > Hi, > > What do people think about using relative imports [1] within scipy > modules? One advantage I can see is to allow packages to be built and > tested independently. Any drawbacks? Relative imports won't allow subpackages to be built or tested independently. The only thing preventing that right now is building a subpackage by itself. If you solve that problem, then you just need to do an in-place build of the whole scipy package once, and then in-place builds of the subpackages that you modify. You can already test subpackages independently. -- Robert Kern From denis at laxalde.org Wed May 16 10:33:59 2012 From: denis at laxalde.org (Denis Laxalde) Date: Wed, 16 May 2012 10:33:59 -0400 Subject: [SciPy-Dev] relative imports In-Reply-To: References: <1337176036.8985.6.camel@schloss.campus.mcgill.ca> Message-ID: <1337178839.8985.15.camel@schloss.campus.mcgill.ca> Robert Kern a ?crit : > Relative imports won't allow subpackages to be built or tested > independently. Well, I've tried this for optimize and this does work. > The only thing preventing that right now is building a > subpackage by itself. What do you mean? cd scipy/optimize python setup.py install --user This works. > If you solve that problem, then you just need to > do an in-place build of the whole scipy package once, and then > in-place builds of the subpackages that you modify. You can already > test subpackages independently. I agree, that's one possibility although I personally prefer standard "out-of-place" builds. From robert.kern at gmail.com Wed May 16 10:46:11 2012 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 16 May 2012 15:46:11 +0100 Subject: [SciPy-Dev] relative imports In-Reply-To: <1337178839.8985.15.camel@schloss.campus.mcgill.ca> References: <1337176036.8985.6.camel@schloss.campus.mcgill.ca> <1337178839.8985.15.camel@schloss.campus.mcgill.ca> Message-ID: On Wed, May 16, 2012 at 3:33 PM, Denis Laxalde wrote: > Robert Kern a ?crit : >> Relative imports won't allow subpackages to be built or tested >> independently. > > Well, I've tried this for optimize and this does work. Oh, adding relative imports won't *hurt* things, but it doesn't *help* building and testing subpackages independently. >> ?The only thing preventing that right now is building a >> subpackage by itself. > > What do you mean? > > ?cd scipy/optimize > ?python setup.py install --user > > This works. Great. Then in-place builds of individual subpackages should work fine too. >> ?If you solve that problem, then you just need to >> do an in-place build of the whole scipy package once, and then >> in-place builds of the subpackages that you modify. You can already >> test subpackages independently. > > I agree, that's one possibility although I personally prefer standard > "out-of-place" builds. Installing just "optimize" won't be standard, either, so I'm not really sure what the reluctance is. Building and testing individual subpackages does not require relative imports. *Installing* individual subpackages in non-standard ways is what does not work. 
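Spelled out, the in-place workflow being suggested looks roughly like this sketch (the checkout path and the chosen subpackage are only examples):

    # one-time in-place build, run from the root of the checkout:
    #     python setup.py build_src --inplace build_ext --inplace
    #     export PYTHONPATH=$HOME/prog/scipy:$PYTHONPATH
    # afterwards a single subpackage can be rebuilt and exercised without reinstalling:
    import scipy.optimize
    scipy.optimize.test()    # per-subpackage test runner attached via numpy.testing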
-- Robert Kern From denis at laxalde.org Wed May 16 11:11:52 2012 From: denis at laxalde.org (Denis Laxalde) Date: Wed, 16 May 2012 11:11:52 -0400 Subject: [SciPy-Dev] relative imports In-Reply-To: References: <1337176036.8985.6.camel@schloss.campus.mcgill.ca> Message-ID: <1337181112.8985.24.camel@schloss.campus.mcgill.ca> Nathaniel Smith a ?crit : > They require Python 2.5+, while (officially at least) scipy still > supports Python 2.4. It seems that there are incompatibilities now (e.g. python2.4 does not have the any() and all() builtin functions). Current master built with python2.4 has 19 tests failing and (at least) a syntax error. -- Denis From robert.kern at gmail.com Wed May 16 12:33:36 2012 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 16 May 2012 17:33:36 +0100 Subject: [SciPy-Dev] relative imports In-Reply-To: <1337178839.8985.15.camel@schloss.campus.mcgill.ca> References: <1337176036.8985.6.camel@schloss.campus.mcgill.ca> <1337178839.8985.15.camel@schloss.campus.mcgill.ca> Message-ID: On Wed, May 16, 2012 at 3:33 PM, Denis Laxalde wrote: > Robert Kern a ?crit : >> ?The only thing preventing that right now is building a >> subpackage by itself. > > What do you mean? > > ?cd scipy/optimize > ?python setup.py install --user > > This works. It installs, but it won't be functional even with relative imports. Most of scipy's subpackages import from each other. In order to use relative imports to import from other subpackages, all of the subpackages will need to be installed into a parent package. All told, just doing the in-place builds are easier, and it works today without any modifications. -- Robert Kern From josh.k.lawrence at gmail.com Wed May 16 13:51:11 2012 From: josh.k.lawrence at gmail.com (Josh Lawrence) Date: Wed, 16 May 2012 13:51:11 -0400 Subject: [SciPy-Dev] SciPy Docs Edit Privileges Message-ID: Hello, I found a typo in one of the doc strings in scipy and figured I should probably start contributing where I can. Could you please give me edit permissions? My username is wa03. Thanks, Josh From pav at iki.fi Wed May 16 15:01:58 2012 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 16 May 2012 21:01:58 +0200 Subject: [SciPy-Dev] relative imports In-Reply-To: <1337181112.8985.24.camel@schloss.campus.mcgill.ca> References: <1337176036.8985.6.camel@schloss.campus.mcgill.ca> <1337181112.8985.24.camel@schloss.campus.mcgill.ca> Message-ID: 16.05.2012 17:11, Denis Laxalde kirjoitti: > Nathaniel Smith a ?crit : > > They require Python 2.5+, while (officially at least) scipy still > > supports Python 2.4. > > It seems that there are incompatibilities now (e.g. python2.4 does not > have the any() and all() builtin functions). > Current master built with python2.4 has 19 tests failing and (at least) > a syntax error. Yes, those should be fixed before the next release. Usually, this is fairly easy to do. Pauli From ralf.gommers at googlemail.com Wed May 16 15:06:06 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 16 May 2012 21:06:06 +0200 Subject: [SciPy-Dev] relative imports In-Reply-To: <1337181112.8985.24.camel@schloss.campus.mcgill.ca> References: <1337176036.8985.6.camel@schloss.campus.mcgill.ca> <1337181112.8985.24.camel@schloss.campus.mcgill.ca> Message-ID: On Wed, May 16, 2012 at 5:11 PM, Denis Laxalde wrote: > Nathaniel Smith a ?crit : > > They require Python 2.5+, while (officially at least) scipy still > > supports Python 2.4. > > It seems that there are incompatibilities now (e.g. 
python2.4 does not > have the any() and all() builtin functions). > Current master built with python2.4 has 19 tests failing and (at least) > a syntax error. That should be fixed before the next release then. This unfortunately happens regularly because we don't have a 2.4 buildbot. Usage of any/all functions slips in way too often; should be replaced with np.any/all. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Wed May 16 15:14:17 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 16 May 2012 21:14:17 +0200 Subject: [SciPy-Dev] SciPy Docs Edit Privileges In-Reply-To: References: Message-ID: On Wed, May 16, 2012 at 7:51 PM, Josh Lawrence wrote: > Hello, > > I found a typo in one of the doc strings in scipy and figured I should > probably start contributing where I can. Could you please give me edit > permissions? My username is wa03. > Done. Thanks for helping out! Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu May 17 16:30:49 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 17 May 2012 16:30:49 -0400 Subject: [SciPy-Dev] github: disappearing comments - clarification? Message-ID: A question for the github experts: Under what scenarios do comments on github pull requests disappear? I've never seen a clear answer just my speculation. (based on: Did it disappear or is my search fu insufficient?) For example, some comments disappear after a rebased force push. all comments or just inline code comments? If a pull request never gets merged and the originating branch is deleted, do the comments also disappear? any other cases of disappearing comments? Thanks, Josef From pav at iki.fi Sat May 19 04:57:16 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 19 May 2012 10:57:16 +0200 Subject: [SciPy-Dev] minimize(): tolerance unification Message-ID: Hi, One thing which would be useful to bring to conclusion in 0.11.0 in the new minimize() interface: tolerance specification. First, it would be useful if there was a common `tol` keyword in `minimize()` which would just set whatever appropriate tolerance parameters, so that the user wouldn't need to bother about looking into the specific solver documentation. Second, the main bulk of the work is actually mostly finished, but some of the remaining different optimizers still take the termination parameters in a non-standard way: - fmin_cobyla: rhoend The `rhoend` parameter seems essentially equivalent to absolute x-tolerance, and could be renamed to `xtol` in minimize(). - fmin_l_bfgs_b: factr, pgtol The `factr` parameter specifies the absolute and relative f-tolerances as follows: rftol = eps * factr, aftol = eps * factr. In my opinion, the minimize() interface should expose this as the `ftol` parameter, and compute `factr = ftol/np.finfo(float).eps`. The `pgtol` parameter is a criterion for the projected gradient. Could we rename `pgtol` -> `gtol`, as some other routines use `gtol` to specify conditions for the gradient. The documentation could still describe that it's actually for the projected gradient, but I expect most users just don't care. - brent: ftol The `ftol` is actually the relative x-tolerance, and is currently erroneously named in the minimize() interface! - golden: ftol The `ftol` is actually the relative x-tolerance, and is currently erroneously named in the minimize() interface! - fmin_tnc: pgtol Ditto, pgtol -> gtol? 
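For illustration, a common `tol` keyword could be dispatched to the solver-specific options along these lines (the helper name is invented and the mapping simply restates the list above, it is not the final implementation):

    import numpy as np

    def _tol_to_options(method, tol):
        # illustrative mapping of a generic `tol` to per-solver termination options
        method = method.lower()
        if method == 'l-bfgs-b':
            # factr is expressed in units of machine epsilon: ftol = factr * eps
            return {'factr': tol / np.finfo(float).eps, 'pgtol': tol}
        elif method == 'cobyla':
            # rhoend is essentially an absolute tolerance on x
            return {'rhoend': tol}
        elif method in ('brent', 'golden'):
            # both use a relative x-tolerance internally
            return {'xtol': tol}
        else:
            return {'xtol': tol, 'ftol': tol}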
-- Pauli Virtanen From pav at iki.fi Sat May 19 05:08:45 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 19 May 2012 11:08:45 +0200 Subject: [SciPy-Dev] minimize(): being strict on options Message-ID: Hi, Currently, the minimize() solvers silently accept unknown (= mistyped or inapplicable) options. It might be useful to change this so that it raises an error if unknown options are passed to a solver. Alternatively, it could raise a warning instead --- when trying out different solvers an error could be a PITA. Code changes: tol = options.get('ftol', 1.48e-8) maxiter = options.get('maxiter', 500) change to tol = options.pop('ftol', 1.48e-8) maxiter = options.pop('maxiter', 500) if options: warnings.warn("Unknown solver options: %r" % sorted(options.keys()), scipy.optimize.OptimizationWarning) and class OptimizationWarning(UserWarning): pass -- Pauli Virtanen From pav at iki.fi Sat May 19 08:30:41 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 19 May 2012 14:30:41 +0200 Subject: [SciPy-Dev] minimize(): tolerance unification In-Reply-To: References: Message-ID: 19.05.2012 10:57, Pauli Virtanen kirjoitti: > One thing which would be useful to bring to conclusion in 0.11.0 in the > new minimize() interface: tolerance specification. Plus the PR (also for the other thread): https://github.com/scipy/scipy/pull/217 From ralf.gommers at googlemail.com Sat May 19 10:05:43 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 19 May 2012 16:05:43 +0200 Subject: [SciPy-Dev] github: disappearing comments - clarification? In-Reply-To: References: Message-ID: On Thu, May 17, 2012 at 10:30 PM, wrote: > A question for the github experts: > > Under what scenarios do comments on github pull requests disappear? > > I've never seen a clear answer just my speculation. (based on: Did it > disappear or is my search fu insufficient?) > > For example, some comments disappear after a rebased force push. all > comments or just inline code comments? > Only inline comments on commits that have been rebased/deleted should disappear. Github is also smart enough to update links for commits that get merged from the repo where the PR came from to the repo into which it got merged. > If a pull request never gets merged and the originating branch is > deleted, do the comments also disappear? > > Yes, I think so. > any other cases of disappearing comments? > > Not that I'm aware of. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat May 19 10:24:16 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 19 May 2012 10:24:16 -0400 Subject: [SciPy-Dev] github: disappearing comments - clarification? In-Reply-To: References: Message-ID: On Sat, May 19, 2012 at 10:05 AM, Ralf Gommers wrote: > > > On Thu, May 17, 2012 at 10:30 PM, wrote: >> >> A question for the github experts: >> >> Under what scenarios do comments on github pull requests disappear? >> >> I've never seen a clear answer just my speculation. (based on: Did it >> disappear or is my search fu insufficient?) >> >> For example, some comments disappear after a rebased force push. all >> comments or just inline code comments? > > > Only inline comments on commits that have been rebased/deleted should > disappear. Github is also smart enough to update links for commits that get > merged from the repo where the PR came from to the repo into which it got > merged. 
> >> >> If a pull request never gets merged and the originating branch is >> deleted, do the comments also disappear? >> > Yes, I think so. > >> >> any other cases of disappearing comments? >> > Not that I'm aware of. Thanks for the clarification Josef > > Ralf > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From aia8v at virginia.edu Sun May 20 09:53:45 2012 From: aia8v at virginia.edu (alex arsenovic) Date: Sun, 20 May 2012 09:53:45 -0400 Subject: [SciPy-Dev] scikit for virtual instruments? Message-ID: <4FB8F769.1060002@virginia.edu> i was considering creating a scikit to provide a library of virtual instruments. scikit-vi? import skvi? my needs are solely for VISA/GPIB vi's using pyvisa, but the project could support other interfaces as well. perhaps it would provide instances of VI's for specific applications (Voltmeter, Ohmeter, etc) as well as general instruments (Keithely2000). although i dont have a large library of VI's, or plan to write them, i have seen the writing of VI's is being re-done constantly. i think a centralized effort would save a lot of wasted time, as well as provide more robust code. the only existing solution to this that i am aware of is pythics, which i have used and i think its great, but it has a larger-than-necessary scope. a simple project, just for python VI's on Github would be the easiest for people to contribute to (in my opinion) and ease of contribution is necessary for a project such as this. any thoughts? alex From newville at cars.uchicago.edu Sun May 20 10:44:32 2012 From: newville at cars.uchicago.edu (Matt Newville) Date: Sun, 20 May 2012 09:44:32 -0500 Subject: [SciPy-Dev] minimize(): tolerance unification Message-ID: Hi Pauli, If you're working on unifying the signatures of the various minimizer functions in scipy.optimize (Thanks!! Great!!), I have a small request to try to make the behavior of the user-supplied objective functions more uniform as well. Currently, most solvers expect the objective function to return a scalar, and raise a ValueError if the objective function returns an array. OTOH, scipy.optimize.leastsq() requires the objective function to return an array. As far as I can tell, the docs are actually silent on what the value returned from the objective function should be. Is it worth considering the following change?: For all solvers expecting a scalar result, if the objective function returns an array, convert to (array**2).sum(). Obviously, this implies a 'least squares' approach as the default, which is debatable but also probably defensible. I believe this change would not break existing code (since the user can always return a scalar value), but it would allow more consistent objective functions, making it easier for users to try out different algorithms with the same objective function. Thanks, --Matt From bjorn.forsman at gmail.com Sun May 20 16:20:36 2012 From: bjorn.forsman at gmail.com (=?UTF-8?Q?Bj=C3=B8rn_Forsman?=) Date: Sun, 20 May 2012 22:20:36 +0200 Subject: [SciPy-Dev] New contribution: bode() function for LTI systems Message-ID: Hi all, I'm new on this list. I have been wanting to have a bode() function for LTI systems in Python for some time now. 
Today I got tired of waiting and wrote one myself :-) I have sent pull request on github: https://github.com/scipy/scipy/pull/224 - ENH: ltisys: new bode() function While at it I also fixed a small bug in the lti class init function: https://github.com/scipy/scipy/pull/223 - ENH: ltisys: make lti zpk initialization work with plain lists Is this OK? Should I post the patches here too? Best regards, Bj?rn Forsman From pav at iki.fi Sun May 20 16:34:47 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 20 May 2012 22:34:47 +0200 Subject: [SciPy-Dev] New contribution: bode() function for LTI systems In-Reply-To: References: Message-ID: 20.05.2012 22:20, Bj?rn Forsman kirjoitti: > I'm new on this list. I have been wanting to have a bode() function > for LTI systems in Python for some time now. Today I got tired of > waiting and wrote one myself :-) I have sent pull request on github: > > https://github.com/scipy/scipy/pull/224 - ENH: ltisys: new bode() function > > While at it I also fixed a small bug in the lti class init function: > > https://github.com/scipy/scipy/pull/223 - ENH: ltisys: make lti zpk > initialization work with plain lists > > Is this OK? Yep, thanks for sending in fixes. > Should I post the patches here too? No, the pull requests are enough. Best, Pauli Virtanen From pav at iki.fi Sun May 20 16:48:13 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 20 May 2012 22:48:13 +0200 Subject: [SciPy-Dev] minimize(): tolerance unification In-Reply-To: References: Message-ID: Hi, 20.05.2012 16:44, Matt Newville kirjoitti: > If you're working on unifying the signatures of the various minimizer > functions in scipy.optimize (Thanks!! Great!!), I have a small request > to try to make the behavior of the user-supplied objective functions > more uniform as well. This work was mostly done by Dennis Laxalde, I've only done some minor fiddling around here. [clip] > Currently, most solvers expect the objective function to return a > scalar, and raise a ValueError if the objective function returns an > array. OTOH, scipy.optimize.leastsq() requires the objective > function to return an array. As far as I can tell, the docs are > actually silent on what the value returned from the objective function > should be. Good point. If this is not immediately clear after reading the documentation, they should be improved. [clip] > Is it worth considering the following change?: For all solvers > expecting a scalar result, if the objective function returns an array, > convert to (array**2).sum(). > > Obviously, this implies a 'least squares' approach as the default, > which is debatable but also probably defensible. I believe this > change would not break existing code (since the user can always return > a scalar value), but it would allow more consistent objective > functions, making it easier for users to try out different algorithms > with the same objective function. My first thought on this is that it would be mixing two separate things --- nonlinear least squares, and optimization of scalar functions, and I'm not so sure how desirable this is. However, what we should have is a separate interface for least squares. Most preferably, a thin wrapper to the solver routines themselves. Currently, there is only one actual dedicated routine for this, Levenberg-Marquardt, but it might make sense to do as lmfit does and make it easy to use optimization solvers here (which would make it easier for users to see how to do LSQ with constraints). 
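In code, the thin-wrapper idea amounts to something like the following sketch: one residual function feeds leastsq directly, or any scalar solver after summing the squares (the model, data and starting values are only an example):

    import numpy as np
    from scipy import optimize

    def residuals(p, x, y):
        # residual vector for the model y = p[0] * exp(-p[1] * x)
        return y - p[0] * np.exp(-p[1] * x)

    def sum_of_squares(p, x, y):
        r = residuals(p, x, y)
        return np.dot(r, r)

    x = np.linspace(0, 4, 50)
    y = 2.5 * np.exp(-1.3 * x) + 0.05 * np.random.randn(50)

    # Levenberg-Marquardt wants the residual vector ...
    p_lm, ier = optimize.leastsq(residuals, [1.0, 1.0], args=(x, y))

    # ... a scalar solver wants the summed objective, but can handle bounds
    p_box, f, info = optimize.fmin_l_bfgs_b(sum_of_squares, [1.0, 1.0],
                                            args=(x, y), approx_grad=True,
                                            bounds=[(0, None), (0, None)])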
What to then do with the question of fancier higher-level interfaces to fitting is not so clear to me --- curve_fit made it in, and two different object oriented approaches have been proposed (your lmfit, and the PR from Martin Teichmann). There are libraries such as openopt and others which take a more structured approach, and I'm not clear on how much such stuff should be in Scipy. -- Pauli Virtanen From ralf.gommers at googlemail.com Sun May 20 17:25:11 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 20 May 2012 23:25:11 +0200 Subject: [SciPy-Dev] minimize(): being strict on options In-Reply-To: References: Message-ID: On Sat, May 19, 2012 at 11:08 AM, Pauli Virtanen wrote: > Hi, > > Currently, the minimize() solvers silently accept unknown (= mistyped or > inapplicable) options. It might be useful to change this so that it > raises an error if unknown options are passed to a solver. > > Alternatively, it could raise a warning instead --- when trying out > different solvers an error could be a PITA. > A warning would indeed be useful here. Ralf > Code changes: > > tol = options.get('ftol', 1.48e-8) > maxiter = options.get('maxiter', 500) > > change to > > tol = options.pop('ftol', 1.48e-8) > maxiter = options.pop('maxiter', 500) > if options: > warnings.warn("Unknown solver options: %r" > % sorted(options.keys()), > scipy.optimize.OptimizationWarning) > > and > > class OptimizationWarning(UserWarning): > pass > > -- > Pauli Virtanen > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jordens at gmail.com Mon May 21 00:50:11 2012 From: jordens at gmail.com (=?UTF-8?Q?Robert_J=C3=B6rdens?=) Date: Sun, 20 May 2012 22:50:11 -0600 Subject: [SciPy-Dev] sum_angle() and sum_polar() functions Message-ID: Hello, I submitted two functions to numpy that sum 2d matrices along angled cartesian or polar coordinates. https://github.com/numpy/numpy/pull/230 The two functions certainly have their main application in image processing and might be better suited for scipy of scikits-image. sum_angle() is not much more than the old scipy.misc.pilutils.radon() transform. But the later is deprecated and has several problems (floats, non-conserved sum(), interpolation, speed) as discussed in the pull request. The new scikits-image.transform.radon() appears to be more generic but a bit complicated and potentially even slower than the imrotate()-based version in scipy. Could sum_angle() and sum_polar() find a place in scipy or scikits-image or are they simple enough to be useful for numpy? -- Robert Jordens. From ralf.gommers at googlemail.com Mon May 21 03:16:13 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 21 May 2012 09:16:13 +0200 Subject: [SciPy-Dev] sum_angle() and sum_polar() functions In-Reply-To: References: Message-ID: On Mon, May 21, 2012 at 6:50 AM, Robert J?rdens wrote: > Hello, > I submitted two functions to numpy that sum 2d matrices along angled > cartesian or polar coordinates. > https://github.com/numpy/numpy/pull/230 > The two functions certainly have their main application in image > processing and might be better suited for scipy of scikits-image. > > sum_angle() is not much more than the old scipy.misc.pilutils.radon() > transform. 
But the later is deprecated and has several problems > (floats, non-conserved sum(), interpolation, speed) as discussed in > the pull request. > > The new scikits-image.transform.radon() appears to be more generic but > a bit complicated and potentially even slower than the > imrotate()-based version in scipy. > Could you time the scikits-image version, it's not obvious from reading the code that it will be very slow? > Could sum_angle() and sum_polar() find a place in scipy or > scikits-image or are they simple enough to be useful for numpy? > My impression is that it would fit best in scikits-image. Perhaps ndimage would be a reasonable place too. It's too specialized for numpy imho. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From jordens at gmail.com Mon May 21 04:06:29 2012 From: jordens at gmail.com (=?UTF-8?Q?Robert_J=C3=B6rdens?=) Date: Mon, 21 May 2012 02:06:29 -0600 Subject: [SciPy-Dev] sum_angle() and sum_polar() functions In-Reply-To: References: Message-ID: On Mon, May 21, 2012 at 1:16 AM, Ralf Gommers wrote: > Could you time the scikits-image version, it's not obvious from reading the > code that it will be very slow? In [47]: from scipy.misc.pilutil import radon as scipy_radon In [50]: from skimage.transform import radon as scikits_radon In [46]: a=np.random.randn(1000, 1000) In [53]: angle_sum(a, .1).shape Out[53]: (1100,) In [54]: scikits_radon(a, [.1]).shape Out[54]: (1415, 1) In [55]: scipy_radon(a, [.1]).shape Out[55]: (1000, 1) In [56]: %timeit angle_sum(a, .1) 10 loops, best of 3: 24.7 ms per loop In [57]: %timeit scikits_radon(a, [.1]) 10 loops, best of 3: 136 ms per loop In [58]: %timeit scipy_radon(a, [.1]) 10 loops, best of 3: 47.6 ms per loop In [59]: a.sum() Out[59]: -682.83728184031156 In [60]: angle_sum(a, .1).sum() Out[60]: -682.83728184032327 In [61]: scikits_radon(a, [.1]).sum() Out[61]: -682.83727334014043 In [62]: scipy_radon(a, [.1]).sum() Out[62]: 124316062.0 > My impression is that it would fit best in scikits-image. Perhaps ndimage > would be a reasonable place too. It's too specialized for numpy imho. angle_sum() at least can be applied beyond image processing in the analysis of coupling matrices or adjacency graphs. Both functions are simple and generic. -- Robert Jordens. From charlesr.harris at gmail.com Mon May 21 12:20:40 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 21 May 2012 10:20:40 -0600 Subject: [SciPy-Dev] sum_angle() and sum_polar() functions In-Reply-To: References: Message-ID: On Mon, May 21, 2012 at 2:06 AM, Robert J?rdens wrote: > On Mon, May 21, 2012 at 1:16 AM, Ralf Gommers > wrote: > > Could you time the scikits-image version, it's not obvious from reading > the > > code that it will be very slow? 
> > In [47]: from scipy.misc.pilutil import radon as scipy_radon > In [50]: from skimage.transform import radon as scikits_radon > > In [46]: a=np.random.randn(1000, 1000) > > In [53]: angle_sum(a, .1).shape > Out[53]: (1100,) > > In [54]: scikits_radon(a, [.1]).shape > Out[54]: (1415, 1) > > In [55]: scipy_radon(a, [.1]).shape > Out[55]: (1000, 1) > > In [56]: %timeit angle_sum(a, .1) > 10 loops, best of 3: 24.7 ms per loop > > In [57]: %timeit scikits_radon(a, [.1]) > 10 loops, best of 3: 136 ms per loop > > In [58]: %timeit scipy_radon(a, [.1]) > 10 loops, best of 3: 47.6 ms per loop > > In [59]: a.sum() > Out[59]: -682.83728184031156 > > In [60]: angle_sum(a, .1).sum() > Out[60]: -682.83728184032327 > > In [61]: scikits_radon(a, [.1]).sum() > Out[61]: -682.83727334014043 > > In [62]: scipy_radon(a, [.1]).sum() > Out[62]: 124316062.0 > > > My impression is that it would fit best in scikits-image. Perhaps ndimage > > would be a reasonable place too. It's too specialized for numpy imho. > > angle_sum() at least can be applied beyond image processing in the > analysis of coupling matrices or adjacency graphs. Both functions are > simple and generic. > > Looks like you need to sell this as a faster Radon or Hough, angle_sum is a bit too generic ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josh.k.lawrence at gmail.com Mon May 21 14:45:55 2012 From: josh.k.lawrence at gmail.com (Josh Lawrence) Date: Mon, 21 May 2012 14:45:55 -0400 Subject: [SciPy-Dev] scipy.signal.normalize problems? Message-ID: <3D094679-4BC5-4647-9739-776DA7539CBB@gmail.com> Hey all, I've been having some problems designing a Chebyshev filter and I think I have narrowed down the hang-up to scipy.signal.normalize. I think what's going on in my case is that line 286 of filter_design.py (the first allclose call in the normalize function) is producing a false positive. Here's the function definition: def normalize(b, a): """Normalize polynomial representation of a transfer function. If values of b are too close to 0, they are removed. In that case, a BadCoefficients warning is emitted. """ b, a = map(atleast_1d, (b, a)) if len(a.shape) != 1: raise ValueError("Denominator polynomial must be rank-1 array.") if len(b.shape) > 2: raise ValueError("Numerator polynomial must be rank-1 or" " rank-2 array.") if len(b.shape) == 1: b = asarray([b], b.dtype.char) while a[0] == 0.0 and len(a) > 1: a = a[1:] outb = b * (1.0) / a[0] outa = a * (1.0) / a[0] if allclose(outb[:, 0], 0, rtol=1e-14): <------------------ Line 286 warnings.warn("Badly conditioned filter coefficients (numerator): the " "results may be meaningless", BadCoefficients) while allclose(outb[:, 0], 0, rtol=1e-14) and (outb.shape[-1] > 1): outb = outb[:, 1:] if outb.shape[0] == 1: outb = outb[0] return outb, outa I marked line 286. 
If I reproduce all the steps carried out by scipy.signal.iirdesign, I end up with a (b, a) pair which results of scipy.signal.lp2lp and looks like this: In [106]: b_lp2 Out[106]: array([ 1.55431359e-06+0.j]) In [107]: a_lp2 Out[107]: array([ 1.00000000e+00 +0.00000000e+00j, 3.46306104e-01 -2.01282794e-16j, 2.42572185e-01 -6.08207573e-17j, 5.92946943e-02 +0.00000000e+00j, 1.82069156e-02 +5.55318531e-18j, 2.89328123e-03 +0.00000000e+00j, 4.36566281e-04 -2.95766719e-19j, 3.50842810e-05 -3.19180568e-20j, 1.64641246e-06 -1.00966301e-21j]) scipy.signal.iirdesign takes b_lp2, a_lp2 (my local variable names to keep track of what's going on) and runs them through scipy.signal.bilinear (in filter_design.py bilinear is called on line 624 within iirfilter. iirdesign calls iirfilter which calls bilinear). Inside bilinear, normalize is called on line 445. I've made my own class with bilinear copied and pasted from filter_design.py to test things. In bilinear, the input to normalize is given by b = [ 1.55431359e-06 1.24345087e-05 4.35207804e-05 8.70415608e-05 1.08801951e-04 8.70415608e-05 4.35207804e-05 1.24345087e-05 1.55431359e-06] a = [ 72269.02590913 -562426.61430468 1918276.19173089 -3745112.83646825 4577612.13937628 -3586970.61385926 1759651.18184723 -494097.93515708 60799.46134722] In normalize, right before the allclose() call, outb is defined by outb = [[ 2.15073272e-11 1.72058618e-10 6.02205162e-10 1.20441032e-09 1.50551290e-09 1.20441032e-09 6.02205162e-10 1.72058618e-10 2.15073272e-11]] From what I read of the function normalize, it should only evaluate true if all of the coefficients in outb are smaller than 1e-14. However, that's not what is going on. I have access to MATLAB and if I go through the same design criteria to design a chebyshev filter, I get the following: b = 1.0e-08 * Columns 1 through 5 0.002150733144728 0.017205865157826 0.060220528052392 0.120441056104784 0.150551320130980 Columns 6 through 9 0.120441056104784 0.060220528052392 0.017205865157826 0.002150733144728 which matches up rather well for several significant figures. I apologize if this is not clearly explained, but I'm not sure what to do. I tried messing around with the arguments to allclose (switching it to be allclose(0, outb[:,0], ...), or changing the keyword from rtol to atol). I am also not sure why normalize is setup to run on rank-2 arrays. I looked through filter_design.py and none of the functions contained in it send a rank-2 array to normalize from what I can tell. Any thoughts? Cheers, Josh Lawrence From josh.k.lawrence at gmail.com Mon May 21 15:11:56 2012 From: josh.k.lawrence at gmail.com (Josh Lawrence) Date: Mon, 21 May 2012 15:11:56 -0400 Subject: [SciPy-Dev] scipy.signal.normalize problems? In-Reply-To: <3D094679-4BC5-4647-9739-776DA7539CBB@gmail.com> References: <3D094679-4BC5-4647-9739-776DA7539CBB@gmail.com> Message-ID: <95DB673F-E1B6-4C9D-9050-39D8A412D52F@gmail.com> If I change the allclose lines to have allclose(0, outb[:,0], atol=1e-14) it works. I think that is what the original goal was, anyways. From the documentation of allclose, what I have written above result in ensuring np.abs(0 - outb[:,0]) > atol + rtol * np.abs(outb[:,0]) with rtol defaulting to 1e-5. I'm still not sure about how things have been written for the 'b' argument of normalize being rank-2, so I can't guarantee that my fix makes things work for that. Should I file a bug report/submit a patch/send a git pull request, etc? 
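For reference, numpy.allclose(a, b) checks abs(a - b) <= atol + rtol * abs(b), with defaults rtol=1e-5 and atol=1e-8. With 0 as the second argument the rtol term vanishes entirely, so the rtol=1e-14 in normalize has no effect and the default atol=1e-8 is what lets coefficients of order 1e-11 count as "close to zero". A quick check, reusing the leading coefficient from the example above:

    import numpy as np

    outb0 = np.array([2.15073272e-11])   # leading numerator coefficient from above

    # rtol multiplies the *second* argument; with b = 0 only atol (default 1e-8) matters
    np.allclose(outb0, 0, rtol=1e-14)               # True  -> spurious BadCoefficients warning
    np.allclose(outb0, 0, rtol=1e-14, atol=1e-14)   # False -> behaves as intended
    np.allclose(0, outb0, atol=1e-14)               # False -> the swapped-argument variant above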
Cheers, Josh Lawrence On May 21, 2012, at 2:45 PM, Josh Lawrence wrote: > Hey all, > > I've been having some problems designing a Chebyshev filter and I think I have narrowed down the hang-up to scipy.signal.normalize. I think what's going on in my case is that line 286 of filter_design.py (the first allclose call in the normalize function) is producing a false positive. Here's the function definition: > > def normalize(b, a): > """Normalize polynomial representation of a transfer function. > > If values of b are too close to 0, they are removed. In that case, a > BadCoefficients warning is emitted. > """ > b, a = map(atleast_1d, (b, a)) > if len(a.shape) != 1: > raise ValueError("Denominator polynomial must be rank-1 array.") > if len(b.shape) > 2: > raise ValueError("Numerator polynomial must be rank-1 or" > " rank-2 array.") > if len(b.shape) == 1: > b = asarray([b], b.dtype.char) > while a[0] == 0.0 and len(a) > 1: > a = a[1:] > outb = b * (1.0) / a[0] > outa = a * (1.0) / a[0] > if allclose(outb[:, 0], 0, rtol=1e-14): <------------------ Line 286 > warnings.warn("Badly conditioned filter coefficients (numerator): the " > "results may be meaningless", BadCoefficients) > while allclose(outb[:, 0], 0, rtol=1e-14) and (outb.shape[-1] > 1): > outb = outb[:, 1:] > if outb.shape[0] == 1: > outb = outb[0] > return outb, outa > > I marked line 286. If I reproduce all the steps carried out by scipy.signal.iirdesign, I end up with a (b, a) pair which results of scipy.signal.lp2lp and looks like this: > > In [106]: b_lp2 > Out[106]: array([ 1.55431359e-06+0.j]) > > In [107]: a_lp2 > Out[107]: > array([ 1.00000000e+00 +0.00000000e+00j, > 3.46306104e-01 -2.01282794e-16j, > 2.42572185e-01 -6.08207573e-17j, > 5.92946943e-02 +0.00000000e+00j, > 1.82069156e-02 +5.55318531e-18j, > 2.89328123e-03 +0.00000000e+00j, > 4.36566281e-04 -2.95766719e-19j, > 3.50842810e-05 -3.19180568e-20j, 1.64641246e-06 -1.00966301e-21j]) > > scipy.signal.iirdesign takes b_lp2, a_lp2 (my local variable names to keep track of what's going on) and runs them through scipy.signal.bilinear (in filter_design.py bilinear is called on line 624 within iirfilter. iirdesign calls iirfilter which calls bilinear). Inside bilinear, normalize is called on line 445. I've made my own class with bilinear copied and pasted from filter_design.py to test things. In bilinear, the input to normalize is given by > > b = [ 1.55431359e-06 1.24345087e-05 4.35207804e-05 8.70415608e-05 > 1.08801951e-04 8.70415608e-05 4.35207804e-05 1.24345087e-05 > 1.55431359e-06] > a = [ 72269.02590913 -562426.61430468 1918276.19173089 -3745112.83646825 > 4577612.13937628 -3586970.61385926 1759651.18184723 -494097.93515708 > 60799.46134722] > > In normalize, right before the allclose() call, outb is defined by > > outb = [[ 2.15073272e-11 1.72058618e-10 6.02205162e-10 1.20441032e-09 > 1.50551290e-09 1.20441032e-09 6.02205162e-10 1.72058618e-10 > 2.15073272e-11]] > > From what I read of the function normalize, it should only evaluate true if all of the coefficients in outb are smaller than 1e-14. However, that's not what is going on. I have access to MATLAB and if I go through the same design criteria to design a chebyshev filter, I get the following: > > b = > > 1.0e-08 * > > Columns 1 through 5 > > 0.002150733144728 0.017205865157826 0.060220528052392 0.120441056104784 0.150551320130980 > > Columns 6 through 9 > > 0.120441056104784 0.060220528052392 0.017205865157826 0.002150733144728 > > which matches up rather well for several significant figures. 
> > I apologize if this is not clearly explained, but I'm not sure what to do. I tried messing around with the arguments to allclose (switching it to be allclose(0, outb[:,0], ...), or changing the keyword from rtol to atol). I am also not sure why normalize is setup to run on rank-2 arrays. I looked through filter_design.py and none of the functions contained in it send a rank-2 array to normalize from what I can tell. Any thoughts? > > Cheers, > > Josh Lawrence From charlesr.harris at gmail.com Mon May 21 17:25:23 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 21 May 2012 15:25:23 -0600 Subject: [SciPy-Dev] Faster Hough/Radon transform (was sum_angle() sum_polar() functions) Message-ID: Thought I'd advertize this under a different subject heading. The sum_angle function is really the inner loop but the timings are about 10x faster than the Radon transform in skimage, so some here might be interested. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Mon May 21 18:39:43 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 21 May 2012 18:39:43 -0400 Subject: [SciPy-Dev] warnings in scipy.stats.entropy Message-ID: Currently in scipy.stats.entropy if you are not ignoring them you will see warnings when the function is given a probability of zero even though the case of zero is specifically handled in the function. Rightly or wrongly this makes me cringe. What do people think about fixing this by using seterr explicitly in the function or masking the zeros. Eg., import numpy as np from scipy.stats import entropy prob = np.random.uniform(0,20, size=10) prob[5] = 0 prob = prob/prob.sum() np.seterr(all = 'warn') entropy(prob) # too loud Instead we could do (within entropy) oldstate = np.geterr() np.seterr(divide='ignore', invalid='ignore') entropy(prob) np.seterr(**oldstate) or just mask the zeros in the first place if this is too much idx = prob > 0 -np.sum(prob[idx] * np.log(prob[idx])) Thoughts? Skipper From njs at pobox.com Mon May 21 18:43:13 2012 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 21 May 2012 23:43:13 +0100 Subject: [SciPy-Dev] warnings in scipy.stats.entropy In-Reply-To: References: Message-ID: On Mon, May 21, 2012 at 11:39 PM, Skipper Seabold wrote: > Currently in scipy.stats.entropy if you are not ignoring them you will > see warnings when the function is given a probability of zero even > though the case of zero is specifically handled in the function. > Rightly or wrongly this makes me cringe. What do people think about > fixing this by using seterr explicitly in the function or masking the > zeros. Eg., > > import numpy as np > from scipy.stats import entropy > > prob = np.random.uniform(0,20, size=10) > prob[5] = 0 > prob = prob/prob.sum() > > np.seterr(all = 'warn') > entropy(prob) # too loud > > Instead we could do (within entropy) > > oldstate = np.geterr() > np.seterr(divide='ignore', invalid='ignore') > entropy(prob) > np.seterr(**oldstate) > > or just mask the zeros in the first place if this is too much > > idx = prob > 0 > -np.sum(prob[idx] * np.log(prob[idx])) > > Thoughts? I like the mask version better. 
- N From josef.pktd at gmail.com Mon May 21 19:11:14 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 21 May 2012 19:11:14 -0400 Subject: [SciPy-Dev] warnings in scipy.stats.entropy In-Reply-To: References: Message-ID: On Mon, May 21, 2012 at 6:43 PM, Nathaniel Smith wrote: > On Mon, May 21, 2012 at 11:39 PM, Skipper Seabold wrote: >> Currently in scipy.stats.entropy if you are not ignoring them you will >> see warnings when the function is given a probability of zero even >> though the case of zero is specifically handled in the function. >> Rightly or wrongly this makes me cringe. What do people think about >> fixing this by using seterr explicitly in the function or masking the >> zeros. Eg., >> >> import numpy as np >> from scipy.stats import entropy >> >> prob = np.random.uniform(0,20, size=10) >> prob[5] = 0 >> prob = prob/prob.sum() >> >> np.seterr(all = 'warn') >> entropy(prob) # too loud >> >> Instead we could do (within entropy) >> >> oldstate = np.geterr() >> np.seterr(divide='ignore', invalid='ignore') >> entropy(prob) >> np.seterr(**oldstate) >> >> or just mask the zeros in the first place if this is too much >> >> idx = prob > 0 >> -np.sum(prob[idx] * np.log(prob[idx])) >> >> Thoughts? > > I like the mask version better. +1, buggy: if qk is given, then the function isn't vectorized. Josef > > - N > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From jsseabold at gmail.com Mon May 21 19:23:33 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 21 May 2012 19:23:33 -0400 Subject: [SciPy-Dev] warnings in scipy.stats.entropy In-Reply-To: References: Message-ID: On Mon, May 21, 2012 at 7:11 PM, wrote: > On Mon, May 21, 2012 at 6:43 PM, Nathaniel Smith wrote: >> On Mon, May 21, 2012 at 11:39 PM, Skipper Seabold wrote: >>> Currently in scipy.stats.entropy if you are not ignoring them you will >>> see warnings when the function is given a probability of zero even >>> though the case of zero is specifically handled in the function. >>> Rightly or wrongly this makes me cringe. What do people think about >>> fixing this by using seterr explicitly in the function or masking the >>> zeros. Eg., >>> >>> import numpy as np >>> from scipy.stats import entropy >>> >>> prob = np.random.uniform(0,20, size=10) >>> prob[5] = 0 >>> prob = prob/prob.sum() >>> >>> np.seterr(all = 'warn') >>> entropy(prob) # too loud >>> >>> Instead we could do (within entropy) >>> >>> oldstate = np.geterr() >>> np.seterr(divide='ignore', invalid='ignore') >>> entropy(prob) >>> np.seterr(**oldstate) >>> >>> or just mask the zeros in the first place if this is too much >>> >>> idx = prob > 0 >>> -np.sum(prob[idx] * np.log(prob[idx])) >>> >>> Thoughts? >> >> I like the mask version better. > > +1, https://github.com/scipy/scipy/pull/226 > > buggy: if qk is given, then the function isn't vectorized. 
> > Josef > >> >> - N >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From josef.pktd at gmail.com Mon May 21 19:34:54 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 21 May 2012 19:34:54 -0400 Subject: [SciPy-Dev] warnings in scipy.stats.entropy In-Reply-To: References: Message-ID: On Mon, May 21, 2012 at 7:23 PM, Skipper Seabold wrote: > On Mon, May 21, 2012 at 7:11 PM, ? wrote: >> On Mon, May 21, 2012 at 6:43 PM, Nathaniel Smith wrote: >>> On Mon, May 21, 2012 at 11:39 PM, Skipper Seabold wrote: >>>> Currently in scipy.stats.entropy if you are not ignoring them you will >>>> see warnings when the function is given a probability of zero even >>>> though the case of zero is specifically handled in the function. >>>> Rightly or wrongly this makes me cringe. What do people think about >>>> fixing this by using seterr explicitly in the function or masking the >>>> zeros. Eg., >>>> >>>> import numpy as np >>>> from scipy.stats import entropy >>>> >>>> prob = np.random.uniform(0,20, size=10) >>>> prob[5] = 0 >>>> prob = prob/prob.sum() >>>> >>>> np.seterr(all = 'warn') >>>> entropy(prob) # too loud >>>> >>>> Instead we could do (within entropy) >>>> >>>> oldstate = np.geterr() >>>> np.seterr(divide='ignore', invalid='ignore') >>>> entropy(prob) >>>> np.seterr(**oldstate) >>>> >>>> or just mask the zeros in the first place if this is too much >>>> >>>> idx = prob > 0 >>>> -np.sum(prob[idx] * np.log(prob[idx])) >>>> >>>> Thoughts? >>> >>> I like the mask version better. >> >> +1, > > https://github.com/scipy/scipy/pull/226 won't work as replacement, if qk is None then the function is vectorized for axis=0 >>> rr array([[ 0.13878479, 0.03527334, 0.12000785, 0.14706888], [ 0.07682377, 0.12749588, 0.15172758, 0.19499206], [ 0.10462715, 0.1766166 , 0. , 0.09346067], [ 0.02208519, 0.14443609, 0.11331574, 0.15090141], [ 0.00830154, 0.06009464, 0.05424912, 0.11603281], [ 0.05205531, 0.0792505 , 0.02387006, 0.0061777 ], [ 0.00526626, 0.08439299, 0.17298407, 0.09992403], [ 0.16510456, 0.07008839, 0.01962196, 0.07101189], [ 0.23265325, 0.15908956, 0.2072021 , 0.08105922], [ 0.19429818, 0.06326201, 0.13702153, 0.03937134]]) >>> stats.entropy(rr) array([ 1.9678332 , 2.19817097, 2.0136922 , 2.1379255 ]) >>> -(rr[idx]*np.log(rr[idx])).sum(0) 8.3176218626994789 >>> stats.entropy(rr).sum() 8.3176218626994789 Josef > >> >> buggy: if qk is given, then the function isn't vectorized. 
>> >> Josef >> >>> >>> - N >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From warren.weckesser at enthought.com Mon May 21 20:33:06 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Mon, 21 May 2012 19:33:06 -0500 Subject: [SciPy-Dev] Code question about signal.ltisys.py Message-ID: While taking a look at https://github.com/scipy/scipy/pull/225, which makes a small changes to signal/ltisys.py,I noticed the surrounding code, which has lines like this: self.__dict__['num'], self.__dict__['den'] = normalize(*args) My inclination is to rewrite that as self.num, self.den = normalize(*args) But am I missing something? Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Mon May 21 22:43:00 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 21 May 2012 22:43:00 -0400 Subject: [SciPy-Dev] warnings in scipy.stats.entropy In-Reply-To: References: Message-ID: On Mon, May 21, 2012 at 7:34 PM, wrote: > On Mon, May 21, 2012 at 7:23 PM, Skipper Seabold wrote: >> On Mon, May 21, 2012 at 7:11 PM, ? wrote: >>> On Mon, May 21, 2012 at 6:43 PM, Nathaniel Smith wrote: >>>> On Mon, May 21, 2012 at 11:39 PM, Skipper Seabold wrote: >>>>> Currently in scipy.stats.entropy if you are not ignoring them you will >>>>> see warnings when the function is given a probability of zero even >>>>> though the case of zero is specifically handled in the function. >>>>> Rightly or wrongly this makes me cringe. What do people think about >>>>> fixing this by using seterr explicitly in the function or masking the >>>>> zeros. Eg., >>>>> >>>>> import numpy as np >>>>> from scipy.stats import entropy >>>>> >>>>> prob = np.random.uniform(0,20, size=10) >>>>> prob[5] = 0 >>>>> prob = prob/prob.sum() >>>>> >>>>> np.seterr(all = 'warn') >>>>> entropy(prob) # too loud >>>>> >>>>> Instead we could do (within entropy) >>>>> >>>>> oldstate = np.geterr() >>>>> np.seterr(divide='ignore', invalid='ignore') >>>>> entropy(prob) >>>>> np.seterr(**oldstate) >>>>> >>>>> or just mask the zeros in the first place if this is too much >>>>> >>>>> idx = prob > 0 >>>>> -np.sum(prob[idx] * np.log(prob[idx])) >>>>> >>>>> Thoughts? >>>> >>>> I like the mask version better. >>> >>> +1, >> >> https://github.com/scipy/scipy/pull/226 > > won't work as replacement, if qk is None then the function is > vectorized for axis=0 > Hmm, I didn't think it was intended for 2d cases since there is no axis keyword and no tests for this. Docstring is unclear, but I've only used it for 1d and... import numpy as np p = np.random.random((10,4)) p[2,3] = 0 q = np.random.random((10,4)) q[2,3] = 0 p /= p.sum(0) q /= q.sum(0) from scipy import stats # bad logic for > 1d # plus it would return inf, not a 1d array stats.entropy(p,q) stats.entropy(p.flatten(), q.flatten()) # len check not shape q = np.random.random((10,3)) stats.entropy(p, q) >>>> rr > array([[ 0.13878479, ?0.03527334, ?0.12000785, ?0.14706888], > ? ? ? [ 0.07682377, ?0.12749588, ?0.15172758, ?0.19499206], > ? ? ? [ 0.10462715, ?0.1766166 , ?0. ? ? ? ?, ?0.09346067], > ? ? ? 
[ 0.02208519, ?0.14443609, ?0.11331574, ?0.15090141], > ? ? ? [ 0.00830154, ?0.06009464, ?0.05424912, ?0.11603281], > ? ? ? [ 0.05205531, ?0.0792505 , ?0.02387006, ?0.0061777 ], > ? ? ? [ 0.00526626, ?0.08439299, ?0.17298407, ?0.09992403], > ? ? ? [ 0.16510456, ?0.07008839, ?0.01962196, ?0.07101189], > ? ? ? [ 0.23265325, ?0.15908956, ?0.2072021 , ?0.08105922], > ? ? ? [ 0.19429818, ?0.06326201, ?0.13702153, ?0.03937134]]) > >>>> stats.entropy(rr) > array([ 1.9678332 , ?2.19817097, ?2.0136922 , ?2.1379255 ]) > >>>> -(rr[idx]*np.log(rr[idx])).sum(0) > 8.3176218626994789 >>>> stats.entropy(rr).sum() > 8.3176218626994789 > > Josef > >> >>> >>> buggy: if qk is given, then the function isn't vectorized. >>> >>> Josef >>> >>>> >>>> - N >>>> _______________________________________________ >>>> SciPy-Dev mailing list >>>> SciPy-Dev at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-dev >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From josef.pktd at gmail.com Mon May 21 23:29:19 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 21 May 2012 23:29:19 -0400 Subject: [SciPy-Dev] warnings in scipy.stats.entropy In-Reply-To: References: Message-ID: On Mon, May 21, 2012 at 10:43 PM, Skipper Seabold wrote: > On Mon, May 21, 2012 at 7:34 PM, ? wrote: >> On Mon, May 21, 2012 at 7:23 PM, Skipper Seabold wrote: >>> On Mon, May 21, 2012 at 7:11 PM, ? wrote: >>>> On Mon, May 21, 2012 at 6:43 PM, Nathaniel Smith wrote: >>>>> On Mon, May 21, 2012 at 11:39 PM, Skipper Seabold wrote: >>>>>> Currently in scipy.stats.entropy if you are not ignoring them you will >>>>>> see warnings when the function is given a probability of zero even >>>>>> though the case of zero is specifically handled in the function. >>>>>> Rightly or wrongly this makes me cringe. What do people think about >>>>>> fixing this by using seterr explicitly in the function or masking the >>>>>> zeros. Eg., >>>>>> >>>>>> import numpy as np >>>>>> from scipy.stats import entropy >>>>>> >>>>>> prob = np.random.uniform(0,20, size=10) >>>>>> prob[5] = 0 >>>>>> prob = prob/prob.sum() >>>>>> >>>>>> np.seterr(all = 'warn') >>>>>> entropy(prob) # too loud >>>>>> >>>>>> Instead we could do (within entropy) >>>>>> >>>>>> oldstate = np.geterr() >>>>>> np.seterr(divide='ignore', invalid='ignore') >>>>>> entropy(prob) >>>>>> np.seterr(**oldstate) >>>>>> >>>>>> or just mask the zeros in the first place if this is too much >>>>>> >>>>>> idx = prob > 0 >>>>>> -np.sum(prob[idx] * np.log(prob[idx])) >>>>>> >>>>>> Thoughts? >>>>> >>>>> I like the mask version better. >>>> >>>> +1, >>> >>> https://github.com/scipy/scipy/pull/226 >> >> won't work as replacement, if qk is None then the function is >> vectorized for axis=0 >> > > Hmm, I didn't think it was intended for 2d cases since there is no > axis keyword and no tests for this. Docstring is unclear, but I've > only used it for 1d and... I works for 2d or nd if qk=None, and uses the (sometimes hidden) default axis=0. If qk is given, it doesn't work but still uses axis=0 in the sum. I would say typical state for a stats function that hasn't been cleaned up. 
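A cleaned-up version of the masked idea could stay vectorized over an axis; here is a minimal sketch (illustrative only, with an invented name, not the scipy.stats.entropy implementation, and without any qk handling):

import numpy as np

def entropy_sketch(pk, axis=0):
    # normalize pk so it sums to 1 along `axis`, as stats.entropy does
    pk = np.asarray(pk, dtype=float)
    pk = pk / pk.sum(axis=axis, keepdims=True)
    # mask the zeros instead of toggling seterr: 0*log(0) is taken as 0
    plogp = np.zeros_like(pk)
    nz = pk > 0
    plogp[nz] = pk[nz] * np.log(pk[nz])
    return -plogp.sum(axis=axis)

Applied to an array like rr above, this reproduces the per-column entropies without emitting divide-by-zero or invalid-value warnings.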
For the ones that I did clean up, I usually added the axis keyword in cases like this. Josef > > import numpy as np > > p = np.random.random((10,4)) > p[2,3] = 0 > q = np.random.random((10,4)) > q[2,3] = 0 > > p /= p.sum(0) > q /= q.sum(0) > > from scipy import stats > > # bad logic for > 1d > # plus it would return inf, not a 1d array > stats.entropy(p,q) > > stats.entropy(p.flatten(), q.flatten()) > > # len check not shape > q = np.random.random((10,3)) > > stats.entropy(p, q) > >>>>> rr >> array([[ 0.13878479, ?0.03527334, ?0.12000785, ?0.14706888], >> ? ? ? [ 0.07682377, ?0.12749588, ?0.15172758, ?0.19499206], >> ? ? ? [ 0.10462715, ?0.1766166 , ?0. ? ? ? ?, ?0.09346067], >> ? ? ? [ 0.02208519, ?0.14443609, ?0.11331574, ?0.15090141], >> ? ? ? [ 0.00830154, ?0.06009464, ?0.05424912, ?0.11603281], >> ? ? ? [ 0.05205531, ?0.0792505 , ?0.02387006, ?0.0061777 ], >> ? ? ? [ 0.00526626, ?0.08439299, ?0.17298407, ?0.09992403], >> ? ? ? [ 0.16510456, ?0.07008839, ?0.01962196, ?0.07101189], >> ? ? ? [ 0.23265325, ?0.15908956, ?0.2072021 , ?0.08105922], >> ? ? ? [ 0.19429818, ?0.06326201, ?0.13702153, ?0.03937134]]) >> >>>>> stats.entropy(rr) >> array([ 1.9678332 , ?2.19817097, ?2.0136922 , ?2.1379255 ]) >> >>>>> -(rr[idx]*np.log(rr[idx])).sum(0) >> 8.3176218626994789 >>>>> stats.entropy(rr).sum() >> 8.3176218626994789 >> >> Josef >> >>> >>>> >>>> buggy: if qk is given, then the function isn't vectorized. >>>> >>>> Josef >>>> >>>>> >>>>> - N >>>>> _______________________________________________ >>>>> SciPy-Dev mailing list >>>>> SciPy-Dev at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-dev >>>> _______________________________________________ >>>> SciPy-Dev mailing list >>>> SciPy-Dev at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-dev >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From pav at iki.fi Tue May 22 05:11:56 2012 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 22 May 2012 09:11:56 +0000 (UTC) Subject: [SciPy-Dev] Code question about signal.ltisys.py References: Message-ID: Warren Weckesser enthought.com> writes: [clip] > self.__dict__['num'], self.__dict__['den'] = normalize(*args) > > My inclination is to rewrite that as > > self.num, self.den = normalize(*args) > > But am I missing something? These lines are equivalent, *unless* the class has __setattr__ defined. That's the case here, so rewriting may end up in different behavior than intended, depending on what __setattr__ does. Pauli From pav at iki.fi Tue May 22 05:15:11 2012 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 22 May 2012 09:15:11 +0000 (UTC) Subject: [SciPy-Dev] warnings in scipy.stats.entropy References: Message-ID: Skipper Seabold gmail.com> writes: [clip] > Currently in scipy.stats.entropy if you are not ignoring them you will > see warnings when the function is given a probability of zero even > though the case of zero is specifically handled in the function. > Rightly or wrongly this makes me cringe. What do people think about > fixing this by using seterr explicitly in the function or masking the > zeros. 
I'd rather add `xlogy(x, y) = x*log(y)` that treats the case `x==0` specially to scipy.special, than kludge around the issue by masking or turning warnings off. -- Pauli Virtanen From warren.weckesser at enthought.com Tue May 22 06:05:20 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Tue, 22 May 2012 05:05:20 -0500 Subject: [SciPy-Dev] Code question about signal.ltisys.py In-Reply-To: References: Message-ID: On Tue, May 22, 2012 at 4:11 AM, Pauli Virtanen wrote: > Warren Weckesser enthought.com> writes: > [clip] > > self.__dict__['num'], self.__dict__['den'] = normalize(*args) > > > > My inclination is to rewrite that as > > > > self.num, self.den = normalize(*args) > > > > But am I missing something? > > These lines are equivalent, *unless* the class has __setattr__ defined. > That's the case here, so rewriting may end up in different behavior > than intended, depending on what __setattr__ does. > > Ah, right. Thanks. Warren > Pauli > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junkshops at gmail.com Tue May 22 11:39:02 2012 From: junkshops at gmail.com (Junkshops) Date: Tue, 22 May 2012 08:39:02 -0700 Subject: [SciPy-Dev] scipy.test() runs 0 tests Message-ID: <4FBBB316.90001@gmail.com> Hi all, Hopefully I'm not doing something too stupid here, but I've googled around quite a bit and can't seem to find any tips. Perhaps my google-fu is not strong enough. Anyway, I built scipy 0.11.0 in place, seemingly successfully, in ~/git/scipy. Ubuntu version is 12.04. I haven't added scipy 0.11 to the pypath yet so I'm running in the parent dir. js at ubuntuVB12:~/git/scipy$ sudo aptitude install python-nose Setting up python-nose (1.1.2-3) ... js at ubuntuVB12:~/git/scipy$ python Python 2.7.3 (default, Apr 20 2012, 22:39:59) [GCC 4.6.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import scipy >>> scipy.test() Running unit tests for scipy NumPy version 1.6.1 NumPy is installed in /usr/lib/python2.7/dist-packages/numpy SciPy version 0.11.0.dev-3852ce2 SciPy is installed in scipy Python version 2.7.3 (default, Apr 20 2012, 22:39:59) [GCC 4.6.3] nose version 1.1.2 ---------------------------------------------------------------------- Ran 0 tests in 0.161s OK >>> scipy.test("full") ---------------------------------------------------------------------- Ran 0 tests in 0.139s OK Anyone have any idea what I'm doing wrong? Thanks very much, Gavin From jsseabold at gmail.com Tue May 22 14:20:09 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 22 May 2012 14:20:09 -0400 Subject: [SciPy-Dev] scipy.test() runs 0 tests In-Reply-To: <4FBBB316.90001@gmail.com> References: <4FBBB316.90001@gmail.com> Message-ID: On Tue, May 22, 2012 at 11:39 AM, Junkshops wrote: > > Hi all, > > Hopefully I'm not doing something too stupid here, but I've googled > around quite a bit and can't seem to find any tips. Perhaps my google-fu > is not strong enough. Anyway, I built scipy 0.11.0 in place, seemingly > successfully, in ~/git/scipy. Ubuntu version is 12.04. I haven't added > scipy 0.11 to the pypath yet so I'm running in the parent dir. > How did you build scipy? This works fine for me, though you'll want to run the tests from the command line since you're not putting scipy on the path. 
[~/src/scipy-skipper] (master) |11 $ nosetests scipy ---------------------------------------------------------------------- Ran 6116 tests in 387.441s FAILED (SKIP=28, errors=16, failures=19) Otherwise, you might try doing nosetest --exe scipy though I can't imagine why the tests would have the executable bit set in your source directory. From junkshops at gmail.com Tue May 22 15:44:02 2012 From: junkshops at gmail.com (Junkshops) Date: Tue, 22 May 2012 12:44:02 -0700 Subject: [SciPy-Dev] scipy.test() runs 0 tests In-Reply-To: References: <4FBBB316.90001@gmail.com> Message-ID: <4FBBEC82.2000505@gmail.com> Hi Skipper, > Otherwise, you might try doing > > nosetest --exe scipy > > though I can't imagine why the tests would have the executable bit set > in your source directory. That's the fix, and the reason the tests are executable is because my git repo is in a VirtualBox shared ntfs drive, so Linux permissions don't work and every single file in the repo is executable. I think that if I do |git config core.filemode false | I should be in good shape when I push, although if I'm wrong I'd definitely appreciate it if someone sets me right. Thanks for the help, Gavin On 5/22/2012 11:20 AM, Skipper Seabold wrote: > On Tue, May 22, 2012 at 11:39 AM, Junkshops wrote: >> Hi all, >> >> Hopefully I'm not doing something too stupid here, but I've googled >> around quite a bit and can't seem to find any tips. Perhaps my google-fu >> is not strong enough. Anyway, I built scipy 0.11.0 in place, seemingly >> successfully, in ~/git/scipy. Ubuntu version is 12.04. I haven't added >> scipy 0.11 to the pypath yet so I'm running in the parent dir. >> > How did you build scipy? This works fine for me, though you'll want to > run the tests from the command line since you're not putting scipy on > the path. > > [~/src/scipy-skipper] (master) > |11 $ nosetests scipy > > ---------------------------------------------------------------------- > Ran 6116 tests in 387.441s > > FAILED (SKIP=28, errors=16, failures=19) > > Otherwise, you might try doing > > nosetest --exe scipy > > though I can't imagine why the tests would have the executable bit set > in your source directory. > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junkshops at gmail.com Wed May 23 03:38:34 2012 From: junkshops at gmail.com (Junkshops) Date: Wed, 23 May 2012 00:38:34 -0700 Subject: [SciPy-Dev] Adding t-test with unequal variances to stats.py Message-ID: <4FBC93FA.8010901@gmail.com> Hi all, I've issued a pull request (http://github.com/scipy/scipy/pull/227) for a version of scipy/stats/stats.py with the following changes: 1) Adds a method for running a t-test with unequal or unknown population variances. ttest_ind assumes that population variances are equal. 2) Refactored common code in the 4 t-test methods into shared methods. 3) This section of code, which has variations in multiple methods, looks buggy to me: d = np.mean(a,axis) - np.mean(b,axis) svar = ((n1-1)*v1+(n2-1)*v2) / float(df) t = d/np.sqrt(svar*(1.0/n1 + 1.0/n2)) t = np.where((d==0)*(svar==0), 1.0, t) #define t=0/0 = 0, identical means Surely if d=0, regardless of svar, t should be set to 0, not 1. Similarly, if svar = 0 then both variances are zero (assuming that each data set has at least 2 points - perhaps there should be a check for this?). In that case, if d==0 t should be zero. 
Otherwise, t should be +/-inf. Hence, (svar==0) is redundant. Accordingly, I've changed the lines in all functions to be the equivalent of t = np.where((d==0), 0.0, t) This handles the case where both d and svar are 0. The respective tests have also been changed. If I'm missing something here, please let me know. Thanks, Gavin From stefan at sun.ac.za Wed May 23 19:41:42 2012 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 23 May 2012 16:41:42 -0700 Subject: [SciPy-Dev] sum_angle() and sum_polar() functions In-Reply-To: References: Message-ID: On Mon, May 21, 2012 at 1:06 AM, Robert J?rdens wrote: > On Mon, May 21, 2012 at 1:16 AM, Ralf Gommers > wrote: >> Could you time the scikits-image version, it's not obvious from reading the >> code that it will be very slow? > > In [47]: from scipy.misc.pilutil import radon as scipy_radon > In [50]: from skimage.transform import radon as scikits_radon Looking at this again, I don't think the comparison is accurate. The skimage version of the radon transform does linear interpolation, whereas sum_angle uses the nearest neighbor. St?fan From karmel at arcaio.com Sat May 26 00:03:55 2012 From: karmel at arcaio.com (Karmel Allison) Date: Fri, 25 May 2012 21:03:55 -0700 Subject: [SciPy-Dev] Scipy Docs permissions Message-ID: <9B2542D1-4E0B-4B20-8B63-C13547F98D88@arcaio.com> Hi all, After many years of my living off the work you do, I thought I'd start helping out. Can I get edit permissions for the docs to start? I just registered with the username karmel. Thanks! Karmel From stefan at sun.ac.za Sat May 26 03:15:57 2012 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 26 May 2012 00:15:57 -0700 Subject: [SciPy-Dev] Scipy Docs permissions In-Reply-To: <9B2542D1-4E0B-4B20-8B63-C13547F98D88@arcaio.com> References: <9B2542D1-4E0B-4B20-8B63-C13547F98D88@arcaio.com> Message-ID: Hi Karmel On Fri, May 25, 2012 at 9:03 PM, Karmel Allison wrote: > After many years of my living off the work you do, I thought I'd start helping out. Can I get edit permissions for the docs to start? I just registered with the username karmel. Thank you for helping out! You now have editing permissions. St?fan From ralf.gommers at googlemail.com Mon May 28 11:26:50 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 28 May 2012 17:26:50 +0200 Subject: [SciPy-Dev] scipy 0.11.0 release schedule Message-ID: Hi all, It's time for a new release I think; 0.10.0 was released just over 6 months ago. There are a lot of PRs to merge, but no release blockers anymore as far as I know. Does anyone still have important fixes or other things that should go in? For the release schedule I propose: June 7: beta 1 June 17: rc 1 June 30: rc 2 July 7: final release Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From karmel at arcaio.com Mon May 28 14:53:50 2012 From: karmel at arcaio.com (Karmel Allison) Date: Mon, 28 May 2012 11:53:50 -0700 Subject: [SciPy-Dev] Scipy Docs permissions In-Reply-To: References: <9B2542D1-4E0B-4B20-8B63-C13547F98D88@arcaio.com> Message-ID: Thanks! As I begin to poke around, it occurs to me that there are many docs marked "Needs editing," but it's a little unclear what exactly makes a particular doc still in need of editing, as opposed to review. For example, for docs like http://docs.scipy.org/scipy/docs/scipy.stats.stats.fisher_exact/ and http://docs.scipy.org/scipy/docs/scipy.linalg.basic.solveh_banded/ what should be improved on? 
Or are those just cases that haven't been marked "Needs review"? Similarly, for docs like http://docs.scipy.org/scipy/docs/scipy.linalg.decomp.eigh/ and http://docs.scipy.org/scipy/docs/scipy.cluster.vq.sqrt/ is it the case that fixing the errors would qualify them for being ready for review? Or is there more on top of that you would like to see done? To generalize, then: how do you know when a particular doc is done baking? Thanks, Karmel On May 26, 2012, at 12:15 AM, St?fan van der Walt wrote: > Hi Karmel > > On Fri, May 25, 2012 at 9:03 PM, Karmel Allison wrote: >> After many years of my living off the work you do, I thought I'd start helping out. Can I get edit permissions for the docs to start? I just registered with the username karmel. > > Thank you for helping out! You now have editing permissions. > > St?fan > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From ralf.gommers at googlemail.com Mon May 28 15:17:37 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 28 May 2012 21:17:37 +0200 Subject: [SciPy-Dev] Scipy Docs permissions In-Reply-To: References: <9B2542D1-4E0B-4B20-8B63-C13547F98D88@arcaio.com> Message-ID: Hi Karmel, On Mon, May 28, 2012 at 8:53 PM, Karmel Allison wrote: > Thanks! > > As I begin to poke around, it occurs to me that there are many docs marked > "Needs editing," but it's a little unclear what exactly makes a particular > doc still in need of editing, as opposed to review. For example, for docs > like > You're right that many docstrings are actually pretty good, but simply haven't been marked as "needs review" in the doc editor. For the numpy docs this was done, but some parts of scipy haven't been touched much in the wiki yet. > > http://docs.scipy.org/scipy/docs/scipy.stats.stats.fisher_exact/ and > this one looks ready for review. > http://docs.scipy.org/scipy/docs/scipy.linalg.basic.solveh_banded/ > this one misses an example, the reST formatting is not right (for example the line "ab[u + i - j, j] ..." should be marked green in the rendered version) and the parameter/return types are incorrect (boolean --> bool, array --> ndarray). > > what should be improved on? Or are those just cases that haven't been > marked "Needs review"? > > Similarly, for docs like > > http://docs.scipy.org/scipy/docs/scipy.linalg.decomp.eigh/ and > similar, example and formatting. > http://docs.scipy.org/scipy/docs/scipy.cluster.vq.sqrt/ > here you caught an imperfection in the wiki software. sqrt is a numpy function and shouldn't show up at all here. > > is it the case that fixing the errors would qualify them for being ready > for review? Or is there more on top of that you would like to see done? > > To generalize, then: how do you know when a particular doc is done baking? > The most important thing that's often missing in docstrings is a clear usage example. When that exists, the formatting is OK and the descriptions are clear, it is ready for review. Some functions could also benefit from a reference and/or notes that explain details of the algorithm or background. Whether that's appropriate or not - and what level of detail - is a bit up to your own good judgment. Thanks for helping out, Ralf > Thanks, > Karmel > > On May 26, 2012, at 12:15 AM, St?fan van der Walt wrote: > > > Hi Karmel > > > > On Fri, May 25, 2012 at 9:03 PM, Karmel Allison > wrote: > >> After many years of my living off the work you do, I thought I'd start > helping out. 
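As a concrete illustration of those criteria, a docstring that would count as ready for review might look roughly like the following; the function is invented purely to show the layout (summary line, typed Parameters/Returns sections, and a short Examples block):

import numpy as np

def clip_negative(x, copy=True):
    """Replace negative entries of `x` with zero.

    Parameters
    ----------
    x : ndarray
        Input array.
    copy : bool, optional
        If True (default), operate on a copy of `x`; otherwise modify
        `x` in place.

    Returns
    -------
    out : ndarray
        Array with negative entries set to zero.

    Examples
    --------
    >>> clip_negative(np.array([-1.0, 2.0]))
    array([ 0.,  2.])
    """
    out = np.asarray(x)
    if copy:
        out = out.copy()
    out[out < 0] = 0
    return out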
Can I get edit permissions for the docs to start? I just > registered with the username karmel. > > > > Thank you for helping out! You now have editing permissions. > > > > St?fan > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon May 28 15:21:08 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 28 May 2012 15:21:08 -0400 Subject: [SciPy-Dev] Scipy Docs permissions In-Reply-To: References: <9B2542D1-4E0B-4B20-8B63-C13547F98D88@arcaio.com> Message-ID: On Mon, May 28, 2012 at 2:53 PM, Karmel Allison wrote: > Thanks! > > As I begin to poke around, it occurs to me that there are many docs marked "Needs editing," but it's a little unclear what exactly makes a particular doc still in need of editing, as opposed to review. For example, for docs like > > http://docs.scipy.org/scipy/docs/scipy.stats.stats.fisher_exact/ and > http://docs.scipy.org/scipy/docs/scipy.linalg.basic.solveh_banded/ > > what should be improved on? Or are those just cases that haven't been marked "Needs review"? Mostly the later I think. Some functions like fisher_exact get good docstrings during a commit, and nobody updates the label in the editor. for stats, I went through many of them some time ago to change labels, but I just found that fisher_exact is missing from the milestone http://docs.scipy.org/scipy/Milestones/Milestones_11/ which I had used as basis for relabeling. examples if they don't require a huge program would be nice to make a docstring complete, link to tutorial, if available, would also be good Josef > > Similarly, for docs like > > http://docs.scipy.org/scipy/docs/scipy.linalg.decomp.eigh/ and > http://docs.scipy.org/scipy/docs/scipy.cluster.vq.sqrt/ what is sqrt doing in here? > > is it the case that fixing the errors would qualify them for being ready for review? Or is there more on top of that you would like to see done? > > To generalize, then: how do you know when a particular doc is done baking? > > Thanks, > Karmel > > On May 26, 2012, at 12:15 AM, St?fan van der Walt wrote: > >> Hi Karmel >> >> On Fri, May 25, 2012 at 9:03 PM, Karmel Allison wrote: >>> After many years of my living off the work you do, I thought I'd start helping out. Can I get edit permissions for the docs to start? I just registered with the username karmel. >> >> Thank you for helping out! ?You now have editing permissions. >> >> St?fan >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From josef.pktd at gmail.com Mon May 28 15:26:47 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 28 May 2012 15:26:47 -0400 Subject: [SciPy-Dev] Scipy Docs permissions In-Reply-To: References: <9B2542D1-4E0B-4B20-8B63-C13547F98D88@arcaio.com> Message-ID: On Mon, May 28, 2012 at 3:17 PM, Ralf Gommers wrote: > Hi Karmel, > > On Mon, May 28, 2012 at 8:53 PM, Karmel Allison wrote: >> >> Thanks! 
>> >> As I begin to poke around, it occurs to me that there are many docs marked >> "Needs editing," but it's a little unclear what exactly makes a particular >> doc still in need of editing, as opposed to review. For example, for docs >> like > > > You're right that many docstrings are actually pretty good, but simply > haven't been marked as "needs review" in the doc editor. For the numpy docs > this was done, but some parts of scipy haven't been touched much in the wiki > yet. > >> >> >> http://docs.scipy.org/scipy/docs/scipy.stats.stats.fisher_exact/ and > > this one looks ready for review. > >> >> http://docs.scipy.org/scipy/docs/scipy.linalg.basic.solveh_banded/ > > this one misses an example, the reST formatting is not right (for example > the line "ab[u + i - j, j] ..." should be marked green in the rendered > version) and the parameter/return types are incorrect? (boolean --> bool, > array --> ndarray). >> >> >> what should be improved on? Or are those just cases that haven't been >> marked "Needs review"? >> >> Similarly, for docs like >> >> http://docs.scipy.org/scipy/docs/scipy.linalg.decomp.eigh/ and > > similar, example and formatting. > >> >> http://docs.scipy.org/scipy/docs/scipy.cluster.vq.sqrt/ > > here you caught an imperfection in the wiki software. sqrt is a numpy > function and shouldn't show up at all here. just as extra info: Review status: Unimportant often means there is some reason that the docstring shouldn't be edited. For example, in scipy.stats.distributions those docstrings are autogenerated and they shouldn't be changed in the doceditor. (there is a template code that can be edited.) Josef >> >> >> is it the case that fixing the errors would qualify them for being ready >> for review? Or is there more on top of that you would like to see done? >> >> To generalize, then: how do you know when a particular doc is done baking? > > > The most important thing that's often missing in docstrings is a clear usage > example. When that exists, the formatting is OK and the descriptions are > clear, it is ready for review. > > Some functions could also benefit from a reference and/or notes that explain > details of the algorithm or background. Whether that's appropriate or not - > and what level of detail - is a bit up to your own good judgment. > > Thanks for helping out, > Ralf > >> >> Thanks, >> Karmel >> >> On May 26, 2012, at 12:15 AM, St?fan van der Walt wrote: >> >> > Hi Karmel >> > >> > On Fri, May 25, 2012 at 9:03 PM, Karmel Allison >> > wrote: >> >> After many years of my living off the work you do, I thought I'd start >> >> helping out. Can I get edit permissions for the docs to start? I just >> >> registered with the username karmel. >> > >> > Thank you for helping out! ?You now have editing permissions. 
>> > >> > St?fan >> > _______________________________________________ >> > SciPy-Dev mailing list >> > SciPy-Dev at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From karmel at arcaio.com Mon May 28 16:29:28 2012 From: karmel at arcaio.com (Karmel Allison) Date: Mon, 28 May 2012 13:29:28 -0700 Subject: [SciPy-Dev] Scipy Docs permissions In-Reply-To: References: <9B2542D1-4E0B-4B20-8B63-C13547F98D88@arcaio.com> Message-ID: <420C4728-A9B4-49B7-97A4-6B3FF4118D97@arcaio.com> That all helps clarify. Thanks! On May 28, 2012, at 12:17 PM, Ralf Gommers wrote: > Hi Karmel, > > On Mon, May 28, 2012 at 8:53 PM, Karmel Allison wrote: > Thanks! > > As I begin to poke around, it occurs to me that there are many docs marked "Needs editing," but it's a little unclear what exactly makes a particular doc still in need of editing, as opposed to review. For example, for docs like > > You're right that many docstrings are actually pretty good, but simply haven't been marked as "needs review" in the doc editor. For the numpy docs this was done, but some parts of scipy haven't been touched much in the wiki yet. > > > http://docs.scipy.org/scipy/docs/scipy.stats.stats.fisher_exact/ and > this one looks ready for review. > > http://docs.scipy.org/scipy/docs/scipy.linalg.basic.solveh_banded/ > this one misses an example, the reST formatting is not right (for example the line "ab[u + i - j, j] ..." should be marked green in the rendered version) and the parameter/return types are incorrect (boolean --> bool, array --> ndarray). > > what should be improved on? Or are those just cases that haven't been marked "Needs review"? > > Similarly, for docs like > > http://docs.scipy.org/scipy/docs/scipy.linalg.decomp.eigh/ and > similar, example and formatting. > > http://docs.scipy.org/scipy/docs/scipy.cluster.vq.sqrt/ > here you caught an imperfection in the wiki software. sqrt is a numpy function and shouldn't show up at all here. > > is it the case that fixing the errors would qualify them for being ready for review? Or is there more on top of that you would like to see done? > > To generalize, then: how do you know when a particular doc is done baking? > > The most important thing that's often missing in docstrings is a clear usage example. When that exists, the formatting is OK and the descriptions are clear, it is ready for review. > > Some functions could also benefit from a reference and/or notes that explain details of the algorithm or background. Whether that's appropriate or not - and what level of detail - is a bit up to your own good judgment. > > Thanks for helping out, > Ralf > > > Thanks, > Karmel > > On May 26, 2012, at 12:15 AM, St?fan van der Walt wrote: > > > Hi Karmel > > > > On Fri, May 25, 2012 at 9:03 PM, Karmel Allison wrote: > >> After many years of my living off the work you do, I thought I'd start helping out. Can I get edit permissions for the docs to start? I just registered with the username karmel. > > > > Thank you for helping out! You now have editing permissions. 
> > > > St?fan > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From cimrman3 at ntc.zcu.cz Tue May 29 09:28:26 2012 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Tue, 29 May 2012 15:28:26 +0200 Subject: [SciPy-Dev] ANN: SfePy 2012.2 Message-ID: <4FC4CEFA.3030500@ntc.zcu.cz> I am pleased to announce release 2012.2 of SfePy. Description ----------- SfePy (simple finite elements in Python) is a software for solving systems of coupled partial differential equations by the finite element method. The code is based on NumPy and SciPy packages. It is distributed under the new BSD license. Home page: http://sfepy.org Downloads, mailing list, wiki: http://code.google.com/p/sfepy/ Git (source) repository, issue tracker: http://github.com/sfepy Highlights of this release -------------------------- - reimplemented acoustic band gaps code using the homogenization engine - high order quadrature rules - unified dot product and mass terms, lots of other term updates/fixes, - updated the PDE solver application For full release notes see http://docs.sfepy.org/doc/release_notes.html#id1 (rather long and technical). Best regards, Robert Cimrman and Contributors (*) (*) Contributors to this release (alphabetical order): Vladim?r Luke?, Andre Smit From ralf.gommers at googlemail.com Tue May 29 16:08:27 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 29 May 2012 22:08:27 +0200 Subject: [SciPy-Dev] scipy.signal.normalize problems? In-Reply-To: <95DB673F-E1B6-4C9D-9050-39D8A412D52F@gmail.com> References: <3D094679-4BC5-4647-9739-776DA7539CBB@gmail.com> <95DB673F-E1B6-4C9D-9050-39D8A412D52F@gmail.com> Message-ID: On Mon, May 21, 2012 at 9:11 PM, Josh Lawrence wrote: > If I change the allclose lines to have > > allclose(0, outb[:,0], atol=1e-14) > > it works. I think that is what the original goal was, anyways. From the > documentation of allclose, what I have written above result in ensuring > > np.abs(0 - outb[:,0]) > atol + rtol * np.abs(outb[:,0]) > > with rtol defaulting to 1e-5. I'm still not sure about how things have > been written for the 'b' argument of normalize being rank-2, so I can't > guarantee that my fix makes things work for that. Should I file a bug > report/submit a patch/send a git pull request, etc? > Sounds like you understood the issue and have a fix that does what you expect (matches Matlab), so a pull request would be nice. For the rank-2 thing, you can just note it on the PR if you can't figure it out. Ralf > Cheers, > > Josh Lawrence > > On May 21, 2012, at 2:45 PM, Josh Lawrence wrote: > > > Hey all, > > > > I've been having some problems designing a Chebyshev filter and I think > I have narrowed down the hang-up to scipy.signal.normalize. I think what's > going on in my case is that line 286 of filter_design.py (the first > allclose call in the normalize function) is producing a false positive. > Here's the function definition: > > > > def normalize(b, a): > > """Normalize polynomial representation of a transfer function. 
> > > > If values of b are too close to 0, they are removed. In that case, a > > BadCoefficients warning is emitted. > > """ > > b, a = map(atleast_1d, (b, a)) > > if len(a.shape) != 1: > > raise ValueError("Denominator polynomial must be rank-1 array.") > > if len(b.shape) > 2: > > raise ValueError("Numerator polynomial must be rank-1 or" > > " rank-2 array.") > > if len(b.shape) == 1: > > b = asarray([b], b.dtype.char) > > while a[0] == 0.0 and len(a) > 1: > > a = a[1:] > > outb = b * (1.0) / a[0] > > outa = a * (1.0) / a[0] > > if allclose(outb[:, 0], 0, rtol=1e-14): <------------------ Line 286 > > warnings.warn("Badly conditioned filter coefficients (numerator): > the " > > "results may be meaningless", BadCoefficients) > > while allclose(outb[:, 0], 0, rtol=1e-14) and (outb.shape[-1] > > 1): > > outb = outb[:, 1:] > > if outb.shape[0] == 1: > > outb = outb[0] > > return outb, outa > > > > I marked line 286. If I reproduce all the steps carried out by > scipy.signal.iirdesign, I end up with a (b, a) pair which results of > scipy.signal.lp2lp and looks like this: > > > > In [106]: b_lp2 > > Out[106]: array([ 1.55431359e-06+0.j]) > > > > In [107]: a_lp2 > > Out[107]: > > array([ 1.00000000e+00 +0.00000000e+00j, > > 3.46306104e-01 -2.01282794e-16j, > > 2.42572185e-01 -6.08207573e-17j, > > 5.92946943e-02 +0.00000000e+00j, > > 1.82069156e-02 +5.55318531e-18j, > > 2.89328123e-03 +0.00000000e+00j, > > 4.36566281e-04 -2.95766719e-19j, > > 3.50842810e-05 -3.19180568e-20j, 1.64641246e-06 > -1.00966301e-21j]) > > > > scipy.signal.iirdesign takes b_lp2, a_lp2 (my local variable names to > keep track of what's going on) and runs them through scipy.signal.bilinear > (in filter_design.py bilinear is called on line 624 within iirfilter. > iirdesign calls iirfilter which calls bilinear). Inside bilinear, normalize > is called on line 445. I've made my own class with bilinear copied and > pasted from filter_design.py to test things. In bilinear, the input to > normalize is given by > > > > b = [ 1.55431359e-06 1.24345087e-05 4.35207804e-05 8.70415608e-05 > > 1.08801951e-04 8.70415608e-05 4.35207804e-05 1.24345087e-05 > > 1.55431359e-06] > > a = [ 72269.02590913 -562426.61430468 1918276.19173089 > -3745112.83646825 > > 4577612.13937628 -3586970.61385926 1759651.18184723 -494097.93515708 > > 60799.46134722] > > > > In normalize, right before the allclose() call, outb is defined by > > > > outb = [[ 2.15073272e-11 1.72058618e-10 6.02205162e-10 > 1.20441032e-09 > > 1.50551290e-09 1.20441032e-09 6.02205162e-10 1.72058618e-10 > > 2.15073272e-11]] > > > > From what I read of the function normalize, it should only evaluate true > if all of the coefficients in outb are smaller than 1e-14. However, that's > not what is going on. I have access to MATLAB and if I go through the same > design criteria to design a chebyshev filter, I get the following: > > > > b = > > > > 1.0e-08 * > > > > Columns 1 through 5 > > > > 0.002150733144728 0.017205865157826 0.060220528052392 > 0.120441056104784 0.150551320130980 > > > > Columns 6 through 9 > > > > 0.120441056104784 0.060220528052392 0.017205865157826 > 0.002150733144728 > > > > which matches up rather well for several significant figures. > > > > I apologize if this is not clearly explained, but I'm not sure what to > do. I tried messing around with the arguments to allclose (switching it to > be allclose(0, outb[:,0], ...), or changing the keyword from rtol to atol). > I am also not sure why normalize is setup to run on rank-2 arrays. 
I looked > through filter_design.py and none of the functions contained in it send a > rank-2 array to normalize from what I can tell. Any thoughts? > > > > Cheers, > > > > Josh Lawrence > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jordens at gmail.com Tue May 29 19:17:39 2012 From: jordens at gmail.com (=?UTF-8?Q?Robert_J=C3=B6rdens?=) Date: Tue, 29 May 2012 17:17:39 -0600 Subject: [SciPy-Dev] sum_angle() and sum_polar() functions In-Reply-To: References: Message-ID: On Wed, May 23, 2012 at 5:41 PM, St?fan van der Walt wrote: > Looking at this again, I don't think the comparison is accurate. ?The > skimage version of the radon transform does linear interpolation, > whereas sum_angle uses the nearest neighbor. Point taken. IDL and some other toolkits I had a look at defaulted to nearest neighbor. It might be good to have the option of choosing the interpolation method in skimage. -- Robert Jordens. From stefan at sun.ac.za Wed May 30 01:31:11 2012 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 29 May 2012 22:31:11 -0700 Subject: [SciPy-Dev] sum_angle() and sum_polar() functions In-Reply-To: References: Message-ID: On Tue, May 29, 2012 at 4:17 PM, Robert J?rdens wrote: > Point taken. IDL and some other toolkits I had a look at defaulted to > nearest neighbor. > It might be good to have the option of choosing the interpolation > method in skimage. At the moment we make use of an accelerated linear interpolation routine to speed things up, but we could default to ndimage for higher orders--it'll just be quite a bit slower. St?fan From lists at hilboll.de Wed May 30 12:46:34 2012 From: lists at hilboll.de (Andreas Hilboll) Date: Wed, 30 May 2012 18:46:34 +0200 Subject: [SciPy-Dev] How to get "correct" values for unit tests Message-ID: <4FC64EEA.1010707@hilboll.de> Hi, while working on the tests for the new scipy.interpolate.{Smooth,LSQ}SphereBiavariateSpline classes, I'm wondering how to come up with sensible TRUE example values to test against. In the case mentioned (see https://github.com/scipy/scipy/pull/192), I simply wrapped a routine (sphere.f) from FITPACK. So I could write a direct FORTRAN program using sphere.f to calculate some "TRUE" values. However, that would just check that the wrapping actually works. Is this considered enough? Ultimately, I would like a test to assure that the results are correct. But for that, wouldn't it be "better" (whatever that means) to use a different library to calculate the TRUE results? Sorry, this might be a confusing email. Cheers, Andreas. From lists at hilboll.de Wed May 30 12:50:34 2012 From: lists at hilboll.de (Andreas Hilboll) Date: Wed, 30 May 2012 18:50:34 +0200 Subject: [SciPy-Dev] scipy 0.11.0 release schedule In-Reply-To: References: Message-ID: <4FC64FDA.7090402@hilboll.de> > Hi all, > > It's time for a new release I think; 0.10.0 was released just over 6 > months ago. There are a lot of PRs to merge, but no release blockers > anymore as far as I know. Does anyone still have important fixes or > other things that should go in? Though it's by no means important, I personally would really appreciate if https://github.com/scipy/scipy/pull/192 could go in. Andreas. 
From josh.k.lawrence at gmail.com Wed May 30 13:20:39 2012 From: josh.k.lawrence at gmail.com (Josh Lawrence) Date: Wed, 30 May 2012 13:20:39 -0400 Subject: [SciPy-Dev] scipy.signal.normalize problems? In-Reply-To: References: <3D094679-4BC5-4647-9739-776DA7539CBB@gmail.com> <95DB673F-E1B6-4C9D-9050-39D8A412D52F@gmail.com> Message-ID: <51356F2E-A648-400E-855B-D919B44330EA@gmail.com> Okay, I made a pull request: https://github.com/scipy/scipy/pull/233 Do I need to file a ticket on Trac or is the PR good enough? --Josh On May 29, 2012, at 4:08 PM, Ralf Gommers wrote: > > > On Mon, May 21, 2012 at 9:11 PM, Josh Lawrence wrote: > If I change the allclose lines to have > > allclose(0, outb[:,0], atol=1e-14) > > it works. I think that is what the original goal was, anyways. From the documentation of allclose, what I have written above result in ensuring > > np.abs(0 - outb[:,0]) > atol + rtol * np.abs(outb[:,0]) > > with rtol defaulting to 1e-5. I'm still not sure about how things have been written for the 'b' argument of normalize being rank-2, so I can't guarantee that my fix makes things work for that. Should I file a bug report/submit a patch/send a git pull request, etc? > > Sounds like you understood the issue and have a fix that does what you expect (matches Matlab), so a pull request would be nice. > > For the rank-2 thing, you can just note it on the PR if you can't figure it out. > > Ralf > > > Cheers, > > Josh Lawrence > > On May 21, 2012, at 2:45 PM, Josh Lawrence wrote: > > > Hey all, > > > > I've been having some problems designing a Chebyshev filter and I think I have narrowed down the hang-up to scipy.signal.normalize. I think what's going on in my case is that line 286 of filter_design.py (the first allclose call in the normalize function) is producing a false positive. Here's the function definition: > > > > def normalize(b, a): > > """Normalize polynomial representation of a transfer function. > > > > If values of b are too close to 0, they are removed. In that case, a > > BadCoefficients warning is emitted. > > """ > > b, a = map(atleast_1d, (b, a)) > > if len(a.shape) != 1: > > raise ValueError("Denominator polynomial must be rank-1 array.") > > if len(b.shape) > 2: > > raise ValueError("Numerator polynomial must be rank-1 or" > > " rank-2 array.") > > if len(b.shape) == 1: > > b = asarray([b], b.dtype.char) > > while a[0] == 0.0 and len(a) > 1: > > a = a[1:] > > outb = b * (1.0) / a[0] > > outa = a * (1.0) / a[0] > > if allclose(outb[:, 0], 0, rtol=1e-14): <------------------ Line 286 > > warnings.warn("Badly conditioned filter coefficients (numerator): the " > > "results may be meaningless", BadCoefficients) > > while allclose(outb[:, 0], 0, rtol=1e-14) and (outb.shape[-1] > 1): > > outb = outb[:, 1:] > > if outb.shape[0] == 1: > > outb = outb[0] > > return outb, outa > > > > I marked line 286. 
If I reproduce all the steps carried out by scipy.signal.iirdesign, I end up with a (b, a) pair which results of scipy.signal.lp2lp and looks like this: > > > > In [106]: b_lp2 > > Out[106]: array([ 1.55431359e-06+0.j]) > > > > In [107]: a_lp2 > > Out[107]: > > array([ 1.00000000e+00 +0.00000000e+00j, > > 3.46306104e-01 -2.01282794e-16j, > > 2.42572185e-01 -6.08207573e-17j, > > 5.92946943e-02 +0.00000000e+00j, > > 1.82069156e-02 +5.55318531e-18j, > > 2.89328123e-03 +0.00000000e+00j, > > 4.36566281e-04 -2.95766719e-19j, > > 3.50842810e-05 -3.19180568e-20j, 1.64641246e-06 -1.00966301e-21j]) > > > > scipy.signal.iirdesign takes b_lp2, a_lp2 (my local variable names to keep track of what's going on) and runs them through scipy.signal.bilinear (in filter_design.py bilinear is called on line 624 within iirfilter. iirdesign calls iirfilter which calls bilinear). Inside bilinear, normalize is called on line 445. I've made my own class with bilinear copied and pasted from filter_design.py to test things. In bilinear, the input to normalize is given by > > > > b = [ 1.55431359e-06 1.24345087e-05 4.35207804e-05 8.70415608e-05 > > 1.08801951e-04 8.70415608e-05 4.35207804e-05 1.24345087e-05 > > 1.55431359e-06] > > a = [ 72269.02590913 -562426.61430468 1918276.19173089 -3745112.83646825 > > 4577612.13937628 -3586970.61385926 1759651.18184723 -494097.93515708 > > 60799.46134722] > > > > In normalize, right before the allclose() call, outb is defined by > > > > outb = [[ 2.15073272e-11 1.72058618e-10 6.02205162e-10 1.20441032e-09 > > 1.50551290e-09 1.20441032e-09 6.02205162e-10 1.72058618e-10 > > 2.15073272e-11]] > > > > From what I read of the function normalize, it should only evaluate true if all of the coefficients in outb are smaller than 1e-14. However, that's not what is going on. I have access to MATLAB and if I go through the same design criteria to design a chebyshev filter, I get the following: > > > > b = > > > > 1.0e-08 * > > > > Columns 1 through 5 > > > > 0.002150733144728 0.017205865157826 0.060220528052392 0.120441056104784 0.150551320130980 > > > > Columns 6 through 9 > > > > 0.120441056104784 0.060220528052392 0.017205865157826 0.002150733144728 > > > > which matches up rather well for several significant figures. > > > > I apologize if this is not clearly explained, but I'm not sure what to do. I tried messing around with the arguments to allclose (switching it to be allclose(0, outb[:,0], ...), or changing the keyword from rtol to atol). I am also not sure why normalize is setup to run on rank-2 arrays. I looked through filter_design.py and none of the functions contained in it send a rank-2 array to normalize from what I can tell. Any thoughts? > > > > Cheers, > > > > Josh Lawrence > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Wed May 30 13:22:45 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 30 May 2012 13:22:45 -0400 Subject: [SciPy-Dev] How to get "correct" values for unit tests In-Reply-To: <4FC64EEA.1010707@hilboll.de> References: <4FC64EEA.1010707@hilboll.de> Message-ID: On Wed, May 30, 2012 at 12:46 PM, Andreas Hilboll wrote: > Hi, > > while working on the tests for the new > scipy.interpolate.{Smooth,LSQ}SphereBiavariateSpline classes, I'm > wondering how to come up with sensible TRUE example values to test against. > > In the case mentioned (see https://github.com/scipy/scipy/pull/192), I > simply wrapped a routine (sphere.f) from FITPACK. So I could write a > direct FORTRAN program using sphere.f to calculate some "TRUE" values. > However, that would just check that the wrapping actually works. > > Is this considered enough? Ultimately, I would like a test to assure > that the results are correct. But for that, wouldn't it be "better" > (whatever that means) to use a different library to calculate the TRUE > results? It's better to verify against results from an outside library, but it's not always possible to find exactly the same algorithm. In that case, all we can test is whether the numbers are approximately (wil low precision) the same. Many of the scipy.stats function are now and most of statsmodels models are verified against R (or other packages). (lowess is no identical to R up to 6 decimals or so.) In some cases it's possible to verify against a theoretical and hand calculated example, but I guess not in your case. Josef > > Sorry, this might be a confusing email. > > Cheers, > Andreas. > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From ralf.gommers at googlemail.com Wed May 30 13:38:45 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 30 May 2012 19:38:45 +0200 Subject: [SciPy-Dev] scipy.signal.normalize problems? In-Reply-To: <51356F2E-A648-400E-855B-D919B44330EA@gmail.com> References: <3D094679-4BC5-4647-9739-776DA7539CBB@gmail.com> <95DB673F-E1B6-4C9D-9050-39D8A412D52F@gmail.com> <51356F2E-A648-400E-855B-D919B44330EA@gmail.com> Message-ID: On Wed, May 30, 2012 at 7:20 PM, Josh Lawrence wrote: > Okay, > > I made a pull request: > > https://github.com/scipy/scipy/pull/233 > > Do I need to file a ticket on Trac or is the PR good enough? > The PR should be good enough. Two suggestions though for the PR: - in a comment, link to this discussion (gmane archive) - please add a unit test. You already have an example that didn't work before, and now gives correct results (or at least matches Matlab). You can convert that into a test, with the output of normalize() checked against the hardcoded correct numerical values. Thanks, Ralf > --Josh > > On May 29, 2012, at 4:08 PM, Ralf Gommers wrote: > > > > On Mon, May 21, 2012 at 9:11 PM, Josh Lawrence wrote: > >> If I change the allclose lines to have >> >> allclose(0, outb[:,0], atol=1e-14) >> >> it works. I think that is what the original goal was, anyways. From the >> documentation of allclose, what I have written above result in ensuring >> >> np.abs(0 - outb[:,0]) > atol + rtol * np.abs(outb[:,0]) >> >> with rtol defaulting to 1e-5. I'm still not sure about how things have >> been written for the 'b' argument of normalize being rank-2, so I can't >> guarantee that my fix makes things work for that. 
Should I file a bug >> report/submit a patch/send a git pull request, etc? >> > > Sounds like you understood the issue and have a fix that does what you > expect (matches Matlab), so a pull request would be nice. > > For the rank-2 thing, you can just note it on the PR if you can't figure > it out. > > Ralf > > >> Cheers, >> >> Josh Lawrence >> >> On May 21, 2012, at 2:45 PM, Josh Lawrence wrote: >> >> > Hey all, >> > >> > I've been having some problems designing a Chebyshev filter and I think >> I have narrowed down the hang-up to scipy.signal.normalize. I think what's >> going on in my case is that line 286 of filter_design.py (the first >> allclose call in the normalize function) is producing a false positive. >> Here's the function definition: >> > >> > def normalize(b, a): >> > """Normalize polynomial representation of a transfer function. >> > >> > If values of b are too close to 0, they are removed. In that case, a >> > BadCoefficients warning is emitted. >> > """ >> > b, a = map(atleast_1d, (b, a)) >> > if len(a.shape) != 1: >> > raise ValueError("Denominator polynomial must be rank-1 array.") >> > if len(b.shape) > 2: >> > raise ValueError("Numerator polynomial must be rank-1 or" >> > " rank-2 array.") >> > if len(b.shape) == 1: >> > b = asarray([b], b.dtype.char) >> > while a[0] == 0.0 and len(a) > 1: >> > a = a[1:] >> > outb = b * (1.0) / a[0] >> > outa = a * (1.0) / a[0] >> > if allclose(outb[:, 0], 0, rtol=1e-14): <------------------ Line 286 >> > warnings.warn("Badly conditioned filter coefficients >> (numerator): the " >> > "results may be meaningless", BadCoefficients) >> > while allclose(outb[:, 0], 0, rtol=1e-14) and (outb.shape[-1] > >> 1): >> > outb = outb[:, 1:] >> > if outb.shape[0] == 1: >> > outb = outb[0] >> > return outb, outa >> > >> > I marked line 286. If I reproduce all the steps carried out by >> scipy.signal.iirdesign, I end up with a (b, a) pair which results of >> scipy.signal.lp2lp and looks like this: >> > >> > In [106]: b_lp2 >> > Out[106]: array([ 1.55431359e-06+0.j]) >> > >> > In [107]: a_lp2 >> > Out[107]: >> > array([ 1.00000000e+00 +0.00000000e+00j, >> > 3.46306104e-01 -2.01282794e-16j, >> > 2.42572185e-01 -6.08207573e-17j, >> > 5.92946943e-02 +0.00000000e+00j, >> > 1.82069156e-02 +5.55318531e-18j, >> > 2.89328123e-03 +0.00000000e+00j, >> > 4.36566281e-04 -2.95766719e-19j, >> > 3.50842810e-05 -3.19180568e-20j, 1.64641246e-06 >> -1.00966301e-21j]) >> > >> > scipy.signal.iirdesign takes b_lp2, a_lp2 (my local variable names to >> keep track of what's going on) and runs them through scipy.signal.bilinear >> (in filter_design.py bilinear is called on line 624 within iirfilter. >> iirdesign calls iirfilter which calls bilinear). Inside bilinear, normalize >> is called on line 445. I've made my own class with bilinear copied and >> pasted from filter_design.py to test things. 
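One way the requested regression test could look, built from the coefficient arrays and the MATLAB reference numerator already quoted in this thread (a sketch only, not necessarily the test that ends up in the pull request):

import numpy as np
from numpy.testing import assert_allclose
from scipy.signal import normalize

def test_normalize_tiny_numerator():
    # numerator/denominator pair passed to normalize() inside bilinear()
    # in the Chebyshev design discussed above
    b = [1.55431359e-06, 1.24345087e-05, 4.35207804e-05, 8.70415608e-05,
         1.08801951e-04, 8.70415608e-05, 4.35207804e-05, 1.24345087e-05,
         1.55431359e-06]
    a = [72269.02590913, -562426.61430468, 1918276.19173089,
         -3745112.83646825, 4577612.13937628, -3586970.61385926,
         1759651.18184723, -494097.93515708, 60799.46134722]
    # reference numerator reported from MATLAB earlier in the thread
    b_ref = 1e-8 * np.array([0.002150733144728, 0.017205865157826,
                             0.060220528052392, 0.120441056104784,
                             0.150551320130980, 0.120441056104784,
                             0.060220528052392, 0.017205865157826,
                             0.002150733144728])
    b_norm, a_norm = normalize(b, a)
    # the coefficients are tiny but meaningful: none should be stripped
    assert b_norm.shape == (9,)
    assert_allclose(b_norm, b_ref, rtol=1e-5)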
In bilinear, the input to >> normalize is given by >> > >> > b = [ 1.55431359e-06 1.24345087e-05 4.35207804e-05 8.70415608e-05 >> > 1.08801951e-04 8.70415608e-05 4.35207804e-05 1.24345087e-05 >> > 1.55431359e-06] >> > a = [ 72269.02590913 -562426.61430468 1918276.19173089 >> -3745112.83646825 >> > 4577612.13937628 -3586970.61385926 1759651.18184723 -494097.93515708 >> > 60799.46134722] >> > >> > In normalize, right before the allclose() call, outb is defined by >> > >> > outb = [[ 2.15073272e-11 1.72058618e-10 6.02205162e-10 >> 1.20441032e-09 >> > 1.50551290e-09 1.20441032e-09 6.02205162e-10 1.72058618e-10 >> > 2.15073272e-11]] >> > >> > From what I read of the function normalize, it should only evaluate >> true if all of the coefficients in outb are smaller than 1e-14. However, >> that's not what is going on. I have access to MATLAB and if I go through >> the same design criteria to design a chebyshev filter, I get the following: >> > >> > b = >> > >> > 1.0e-08 * >> > >> > Columns 1 through 5 >> > >> > 0.002150733144728 0.017205865157826 0.060220528052392 >> 0.120441056104784 0.150551320130980 >> > >> > Columns 6 through 9 >> > >> > 0.120441056104784 0.060220528052392 0.017205865157826 >> 0.002150733144728 >> > >> > which matches up rather well for several significant figures. >> > >> > I apologize if this is not clearly explained, but I'm not sure what to >> do. I tried messing around with the arguments to allclose (switching it to >> be allclose(0, outb[:,0], ...), or changing the keyword from rtol to atol). >> I am also not sure why normalize is setup to run on rank-2 arrays. I looked >> through filter_design.py and none of the functions contained in it send a >> rank-2 array to normalize from what I can tell. Any thoughts? >> > >> > Cheers, >> > >> > Josh Lawrence >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Wed May 30 13:39:50 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 30 May 2012 19:39:50 +0200 Subject: [SciPy-Dev] scipy 0.11.0 release schedule In-Reply-To: <4FC64FDA.7090402@hilboll.de> References: <4FC64FDA.7090402@hilboll.de> Message-ID: On Wed, May 30, 2012 at 6:50 PM, Andreas Hilboll wrote: > > Hi all, > > > > It's time for a new release I think; 0.10.0 was released just over 6 > > months ago. There are a lot of PRs to merge, but no release blockers > > anymore as far as I know. Does anyone still have important fixes or > > other things that should go in? > > Though it's by no means important, I personally would really appreciate > if https://github.com/scipy/scipy/pull/192 could go in. > That looks ready to be merged now, I'll make sure to not forget it. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From josh.k.lawrence at gmail.com Wed May 30 13:45:33 2012 From: josh.k.lawrence at gmail.com (Josh Lawrence) Date: Wed, 30 May 2012 13:45:33 -0400 Subject: [SciPy-Dev] scipy.signal.normalize problems? 
In-Reply-To: References: <3D094679-4BC5-4647-9739-776DA7539CBB@gmail.com> <95DB673F-E1B6-4C9D-9050-39D8A412D52F@gmail.com> <51356F2E-A648-400E-855B-D919B44330EA@gmail.com> Message-ID: <012FDD5E-653D-4279-8A18-0E4D5A216986@gmail.com> Okay, I might take a few days to get around to writing a test for it. But I'll update the pull request when I do. --Josh On May 30, 2012, at 1:38 PM, Ralf Gommers wrote: > > > On Wed, May 30, 2012 at 7:20 PM, Josh Lawrence wrote: > Okay, > > I made a pull request: > > https://github.com/scipy/scipy/pull/233 > > Do I need to file a ticket on Trac or is the PR good enough? > > The PR should be good enough. Two suggestions though for the PR: > - in a comment, link to this discussion (gmane archive) > - please add a unit test. You already have an example that didn't work before, and now gives correct results (or at least matches Matlab). You can convert that into a test, with the output of normalize() checked against the hardcoded correct numerical values. > > Thanks, > Ralf > > > --Josh > > On May 29, 2012, at 4:08 PM, Ralf Gommers wrote: > >> >> >> On Mon, May 21, 2012 at 9:11 PM, Josh Lawrence wrote: >> If I change the allclose lines to have >> >> allclose(0, outb[:,0], atol=1e-14) >> >> it works. I think that is what the original goal was, anyways. From the documentation of allclose, what I have written above result in ensuring >> >> np.abs(0 - outb[:,0]) > atol + rtol * np.abs(outb[:,0]) >> >> with rtol defaulting to 1e-5. I'm still not sure about how things have been written for the 'b' argument of normalize being rank-2, so I can't guarantee that my fix makes things work for that. Should I file a bug report/submit a patch/send a git pull request, etc? >> >> Sounds like you understood the issue and have a fix that does what you expect (matches Matlab), so a pull request would be nice. >> >> For the rank-2 thing, you can just note it on the PR if you can't figure it out. >> >> Ralf >> >> >> Cheers, >> >> Josh Lawrence >> >> On May 21, 2012, at 2:45 PM, Josh Lawrence wrote: >> >> > Hey all, >> > >> > I've been having some problems designing a Chebyshev filter and I think I have narrowed down the hang-up to scipy.signal.normalize. I think what's going on in my case is that line 286 of filter_design.py (the first allclose call in the normalize function) is producing a false positive. Here's the function definition: >> > >> > def normalize(b, a): >> > """Normalize polynomial representation of a transfer function. >> > >> > If values of b are too close to 0, they are removed. In that case, a >> > BadCoefficients warning is emitted. >> > """ >> > b, a = map(atleast_1d, (b, a)) >> > if len(a.shape) != 1: >> > raise ValueError("Denominator polynomial must be rank-1 array.") >> > if len(b.shape) > 2: >> > raise ValueError("Numerator polynomial must be rank-1 or" >> > " rank-2 array.") >> > if len(b.shape) == 1: >> > b = asarray([b], b.dtype.char) >> > while a[0] == 0.0 and len(a) > 1: >> > a = a[1:] >> > outb = b * (1.0) / a[0] >> > outa = a * (1.0) / a[0] >> > if allclose(outb[:, 0], 0, rtol=1e-14): <------------------ Line 286 >> > warnings.warn("Badly conditioned filter coefficients (numerator): the " >> > "results may be meaningless", BadCoefficients) >> > while allclose(outb[:, 0], 0, rtol=1e-14) and (outb.shape[-1] > 1): >> > outb = outb[:, 1:] >> > if outb.shape[0] == 1: >> > outb = outb[0] >> > return outb, outa >> > >> > I marked line 286. 
If I reproduce all the steps carried out by scipy.signal.iirdesign, I end up with a (b, a) pair which results of scipy.signal.lp2lp and looks like this: >> > >> > In [106]: b_lp2 >> > Out[106]: array([ 1.55431359e-06+0.j]) >> > >> > In [107]: a_lp2 >> > Out[107]: >> > array([ 1.00000000e+00 +0.00000000e+00j, >> > 3.46306104e-01 -2.01282794e-16j, >> > 2.42572185e-01 -6.08207573e-17j, >> > 5.92946943e-02 +0.00000000e+00j, >> > 1.82069156e-02 +5.55318531e-18j, >> > 2.89328123e-03 +0.00000000e+00j, >> > 4.36566281e-04 -2.95766719e-19j, >> > 3.50842810e-05 -3.19180568e-20j, 1.64641246e-06 -1.00966301e-21j]) >> > >> > scipy.signal.iirdesign takes b_lp2, a_lp2 (my local variable names to keep track of what's going on) and runs them through scipy.signal.bilinear (in filter_design.py bilinear is called on line 624 within iirfilter. iirdesign calls iirfilter which calls bilinear). Inside bilinear, normalize is called on line 445. I've made my own class with bilinear copied and pasted from filter_design.py to test things. In bilinear, the input to normalize is given by >> > >> > b = [ 1.55431359e-06 1.24345087e-05 4.35207804e-05 8.70415608e-05 >> > 1.08801951e-04 8.70415608e-05 4.35207804e-05 1.24345087e-05 >> > 1.55431359e-06] >> > a = [ 72269.02590913 -562426.61430468 1918276.19173089 -3745112.83646825 >> > 4577612.13937628 -3586970.61385926 1759651.18184723 -494097.93515708 >> > 60799.46134722] >> > >> > In normalize, right before the allclose() call, outb is defined by >> > >> > outb = [[ 2.15073272e-11 1.72058618e-10 6.02205162e-10 1.20441032e-09 >> > 1.50551290e-09 1.20441032e-09 6.02205162e-10 1.72058618e-10 >> > 2.15073272e-11]] >> > >> > From what I read of the function normalize, it should only evaluate true if all of the coefficients in outb are smaller than 1e-14. However, that's not what is going on. I have access to MATLAB and if I go through the same design criteria to design a chebyshev filter, I get the following: >> > >> > b = >> > >> > 1.0e-08 * >> > >> > Columns 1 through 5 >> > >> > 0.002150733144728 0.017205865157826 0.060220528052392 0.120441056104784 0.150551320130980 >> > >> > Columns 6 through 9 >> > >> > 0.120441056104784 0.060220528052392 0.017205865157826 0.002150733144728 >> > >> > which matches up rather well for several significant figures. >> > >> > I apologize if this is not clearly explained, but I'm not sure what to do. I tried messing around with the arguments to allclose (switching it to be allclose(0, outb[:,0], ...), or changing the keyword from rtol to atol). I am also not sure why normalize is setup to run on rank-2 arrays. I looked through filter_design.py and none of the functions contained in it send a rank-2 array to normalize from what I can tell. Any thoughts? >> > >> > Cheers, >> > >> > Josh Lawrence >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pav at iki.fi Wed May 30 13:54:56 2012 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 30 May 2012 19:54:56 +0200 Subject: [SciPy-Dev] How to get "correct" values for unit tests In-Reply-To: <4FC64EEA.1010707@hilboll.de> References: <4FC64EEA.1010707@hilboll.de> Message-ID: 30.05.2012 18:46, Andreas Hilboll kirjoitti: > while working on the tests for the new > scipy.interpolate.{Smooth,LSQ}SphereBivariateSpline classes, I'm > wondering how to come up with sensible TRUE example values to test against. > > In the case mentioned (see https://github.com/scipy/scipy/pull/192), I > simply wrapped a routine (sphere.f) from FITPACK. So I could write a > direct FORTRAN program using sphere.f to calculate some "TRUE" values. > However, that would just check that the wrapping actually works. > > Is this considered enough? Ultimately, I would like a test to assure > that the results are correct. But for that, wouldn't it be "better" > (whatever that means) to use a different library to calculate the TRUE > results? I think a useful philosophy for the tests should be about ensuring that the code, as a whole, does what is promised. (Not all decades-old Fortran code is reliable...) So, more in the direction of functional tests than unit tests, and this also works as a QA step... Testing interpolation is a bit more difficult than testing other types of code, since what counts as a "good" result is more fuzzily defined there, and the "correct" results are not fully well-defined. If I had to manually verify that the interpolation on a sphere works, what I'd try first would be: generate a random dataset (with a fixed random seed) and check (plot) that the result looks reasonable:
- interpolant at points maps to original data values
- continuity across "edges" of the sphere
- checks for the flat derivative options
- that the interpolant is "nice" in some sense
The first three can be converted to assert_allclose-style tests with some amount of work. The last relies on the eyeball-norm, but I could just pick a few data points out from a plot I think is reasonable, and write a small test that checks against those (as a statement that someone actually looked at the output). I'd guess the above would also catch essentially all possible problems in the wrapping. IMO testing just the wrapper is not very useful --- the above sort of tests are not much more difficult to write, and should catch a wider range of problems. Pauli From pav at iki.fi Wed May 30 14:00:53 2012 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 30 May 2012 20:00:53 +0200 Subject: [SciPy-Dev] scipy 0.11.0 release schedule In-Reply-To: References: Message-ID: Hi, 28.05.2012 17:26, Ralf Gommers kirjoitti: > It's time for a new release I think; 0.10.0 was released just over 6 > months ago. There are a lot of PRs to merge, but no release blockers > anymore as far as I know. Does anyone still have important fixes or > other things that should go in? > > For the release schedule I propose: > June 7: beta 1 > June 17: rc 1 > June 30: rc 2 > July 7: final release The schedule seems possible. I'd like to get in as many as possible of the open PRs and any "nearly easy" stuff from the Trac. There are also a couple of low-hanging bugfixes in scipy.special listed in the Trac that only need some quality time with Abramowitz & Stegun to verify that the fixes are correct. Finally, making the Qhull wrappers expose circumcenters (for easy Voronoi) and allow passing in custom Qhull options could also be useful, and not too difficult to do.
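To illustrate the "easy Voronoi" part: in 2-D the Voronoi vertices are the circumcenters of the Delaunay triangles, so exposing them would replace per-user code along these lines (a rough, untested sketch; `vertices` is the existing simplex-to-point index array, and degenerate triangles would need extra care):

import numpy as np
from scipy.spatial import Delaunay

points = np.random.rand(20, 2)
tri = Delaunay(points)

# circumcenter of each triangle: solve |x-a|^2 = |x-b|^2 = |x-c|^2,
# which reduces to a 2x2 linear system per simplex
centers = np.empty((len(tri.vertices), 2))
for k, (ia, ib, ic) in enumerate(tri.vertices):
    a, b, c = points[ia], points[ib], points[ic]
    M = 2.0 * np.array([b - a, c - a])
    rhs = np.array([np.dot(b, b) - np.dot(a, a),
                    np.dot(c, c) - np.dot(a, a)])
    centers[k] = np.linalg.solve(M, rhs)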
I'll see how much of this gets done. Pauli From vanderplas at astro.washington.edu Wed May 30 15:03:05 2012 From: vanderplas at astro.washington.edu (Jacob VanderPlas) Date: Wed, 30 May 2012 12:03:05 -0700 Subject: [SciPy-Dev] scipy 0.11.0 release schedule In-Reply-To: References: Message-ID: <4FC66EE9.2020908@astro.washington.edu> Pauli, If you look at QHull, could you check if there's an easy way for it to expose information on edges as well? (that is, a listing of the segments connecting the points which define the simplexes). If we had that, it could be combined with the new csgraph tools to allow a very fast computation of euclidean minimum spanning trees for spatial clustering. I'm not aware of any good python implementation of EMST available currently. Jake Pauli Virtanen wrote: > Hi, > > 28.05.2012 17:26, Ralf Gommers kirjoitti: > >> It's time for a new release I think; 0.10.0 was released just over 6 >> months ago. There are a lot of PRs to merge, but no release blockers >> anymore as far as I know. Does anyone still have important fixes or >> other things that should go in? >> >> For the release schedule I propose: >> June 7: beta 1 >> June 17: rc 1 >> June 30: rc 2 >> July 7: final release >> > > The schedule seems possible. > > I'd like to get in as many as possible of the open PRs and any "nearly > easy" stuff from the Trac. There are also a couple of low-hanging > bugfixes in scipy.special listed in the Trac that are short of some > quality time with Abramowitz&Stegun on verifying that the fixes are correct. > > Finally, making the Qhull wrappers expose circumcenters (for easy > Voronoi) and allow passing in custom Qhull options could also be useful, > and not too difficult to do. I'll see how much of this gets done. > > Pauli > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From pav at iki.fi Wed May 30 15:24:41 2012 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 30 May 2012 21:24:41 +0200 Subject: [SciPy-Dev] scipy 0.11.0 release schedule In-Reply-To: <4FC66EE9.2020908@astro.washington.edu> References: <4FC66EE9.2020908@astro.washington.edu> Message-ID: 30.05.2012 21:03, Jacob VanderPlas kirjoitti: > If you look at QHull, could you check if there's an easy way for it to > expose information on edges as well? (that is, a listing of the > segments connecting the points which define the simplexes). If we had > that, it could be combined with the new csgraph tools to allow a very > fast computation of euclidean minimum spanning trees for spatial > clustering. I'm not aware of any good python implementation of EMST > available currently. That's currently implicitly contained in the adjacency information (the `neighbors` attribute). The edge between simplex and its j-th neighbor consists of its vertices, excluding the j-th one. Sounds like that might be extracted with some slicing and a range(ndim) for loop. Needs some filtering out of the -1 entries in the neighbor array, though... But this could also be added as a new attribute. The information can be lazily computed when first accessed, so there's no memory penalty. 
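For Jake's EMST use case, a rough, untested sketch of what is already possible by hand: pull the unique 1-d edges out of the triangulation (via the `vertices` index array), weight them by their lengths, and hand the graph to the new csgraph routines. Since the Euclidean minimum spanning tree is a subgraph of the Delaunay triangulation, this is sufficient:

import itertools
import numpy as np
from scipy.spatial import Delaunay
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import minimum_spanning_tree

points = np.random.rand(30, 2)
tri = Delaunay(points)

# unique 1-d edges of the triangulation, as pairs of point indices
edges = set()
for simplex in tri.vertices:
    for i, j in itertools.combinations(simplex, 2):
        edges.add((min(i, j), max(i, j)))
edges = np.array(sorted(edges))

# weight each edge by its Euclidean length and build a sparse graph
lengths = np.sqrt(((points[edges[:, 0]] - points[edges[:, 1]]) ** 2).sum(axis=1))
n = len(points)
graph = coo_matrix((lengths, (edges[:, 0], edges[:, 1])), shape=(n, n))

# the Euclidean minimum spanning tree, returned as a sparse matrix of kept edges
mst = minimum_spanning_tree(graph)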
Pauli From pav at iki.fi Wed May 30 15:31:37 2012 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 30 May 2012 21:31:37 +0200 Subject: [SciPy-Dev] scipy 0.11.0 release schedule In-Reply-To: References: <4FC66EE9.2020908@astro.washington.edu> Message-ID: 30.05.2012 21:24, Pauli Virtanen kirjoitti: > 30.05.2012 21:03, Jacob VanderPlas kirjoitti: >> If you look at QHull, could you check if there's an easy way for it to >> expose information on edges as well? (that is, a listing of the >> segments connecting the points which define the simplexes). If we had >> that, it could be combined with the new csgraph tools to allow a very >> fast computation of euclidean minimum spanning trees for spatial >> clustering. I'm not aware of any good python implementation of EMST >> available currently. > > That's currently implicitly contained in the adjacency information (the > `neighbors` attribute). [clip] Sorry, those were the ridges (or facets). The 1-d edges (pruned for duplicates) could also be useful to get easily. Pauli From ralf.gommers at googlemail.com Wed May 30 17:53:02 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 30 May 2012 23:53:02 +0200 Subject: [SciPy-Dev] scipy 0.11.0 release schedule In-Reply-To: References: Message-ID: On Wed, May 30, 2012 at 8:00 PM, Pauli Virtanen wrote: > Hi, > > 28.05.2012 17:26, Ralf Gommers kirjoitti: > > It's time for a new release I think; 0.10.0 was released just over 6 > > months ago. There are a lot of PRs to merge, but no release blockers > > anymore as far as I know. Does anyone still have important fixes or > > other things that should go in? > > > > For the release schedule I propose: > > June 7: beta 1 > > June 17: rc 1 > > June 30: rc 2 > > July 7: final release > > The schedule seems possible. > > I'd like to get in as many as possible of the open PRs and any "nearly > easy" stuff from the Trac. There are also a couple of low-hanging > bugfixes in scipy.special listed in the Trac that are short of some > quality time with Abramowitz&Stegun on verifying that the fixes are > correct. > > Finally, making the Qhull wrappers expose circumcenters (for easy > Voronoi) and allow passing in custom Qhull options could also be useful, > and not too difficult to do. I'll see how much of this gets done. > Sounds like a busy weekend. I'll also focus on the open PRs. Besides that, some more python 2.4 fixes are needed and the wiki edits need to be merged. If anyone already wants to do some testing, especially on problematic platforms (Windows + MSVC, PPC OS X), that would be quite helpful. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From denis at laxalde.org Thu May 31 11:50:02 2012 From: denis at laxalde.org (Denis Laxalde) Date: Thu, 31 May 2012 11:50:02 -0400 Subject: [SciPy-Dev] test failures with python 2.4 Message-ID: <1338479402.4062.24.camel@schloss.campus.mcgill.ca> Hi, Here is a list of test failures with python 2.4 (there are also failures with 2.7, see below). Interpolate: test_interpnd.TestCloughTocher2DInterpolator.test_dense ... Exception exceptions.TypeError: in 'scipy.spatial.qhull._get_delaunay_info' ignored Segmentation fault Spatial: test_qhull.TestUtilities.test_convex_hull ... ERROR test_qhull.TestUtilities.test_degenerate_barycentric_transforms ... ERROR test_qhull.TestUtilities.test_find_simplex ... Exception exceptions.TypeError: in 'scipy.spatial.qhull._get_delaunay_info' ignored FAIL test_qhull.TestUtilities.test_more_barycentric_transforms ... 
ERROR test_qhull.TestUtilities.test_plane_distance ... ok ====================================================================== ERROR: test_qhull.TestUtilities.test_convex_hull ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/denis/.local/lib/python2.4/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/denis/.local/lib/python2.4/site-packages/scipy/spatial/tests/test_qhull.py", line 75, in test_convex_hull assert_equal(tri.convex_hull, [[1, 2], [3, 2], [1, 0], [3, 0]]) File "qhull.pyx", line 1109, in scipy.spatial.qhull.Delaunay.convex_hull (scipy/spatial/qhull.c:8406) ValueError: cannot resize an array references or is referenced by another array in this way. Use the resize function ====================================================================== ERROR: test_qhull.TestUtilities.test_degenerate_barycentric_transforms ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/denis/.local/lib/python2.4/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/denis/.local/lib/python2.4/site-packages/scipy/spatial/tests/test_qhull.py", line 133, in test_degenerate_barycentric_transforms bad_count = np.isnan(tri.transform[:,0,0]).sum() File "qhull.pyx", line 1028, in scipy.spatial.qhull.Delaunay.transform (scipy/spatial/qhull.c:7449) File "qhull.pyx", line 398, in scipy.spatial.qhull._get_barycentric_transforms (scipy/spatial/qhull.c:4300) File "/home/denis/.local/lib/python2.4/site-packages/numpy/core/numeric.py", line 235, in asarray return array(a, dtype, copy=False, order=order) File "stringsource", line 366, in View.MemoryView.memoryview.__getitem__ (scipy/spatial/qhull.c:14704) File "stringsource", line 650, in View.MemoryView._unellipsify (scipy/spatial/qhull.c:17965) TypeError: Cannot index with type '' ====================================================================== ERROR: test_qhull.TestUtilities.test_more_barycentric_transforms ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/denis/.local/lib/python2.4/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/denis/.local/lib/python2.4/site-packages/scipy/spatial/tests/test_qhull.py", line 157, in test_more_barycentric_transforms unit_cube=True) File "/home/denis/.local/lib/python2.4/site-packages/scipy/spatial/tests/test_qhull.py", line 96, in _check_barycentric_transforms c = barycentric_transform(tri.transform, centroids) File "qhull.pyx", line 1028, in scipy.spatial.qhull.Delaunay.transform (scipy/spatial/qhull.c:7449) File "qhull.pyx", line 398, in scipy.spatial.qhull._get_barycentric_transforms (scipy/spatial/qhull.c:4300) File "/home/denis/.local/lib/python2.4/site-packages/numpy/core/numeric.py", line 235, in asarray return array(a, dtype, copy=False, order=order) File "stringsource", line 366, in View.MemoryView.memoryview.__getitem__ (scipy/spatial/qhull.c:14704) File "stringsource", line 650, in View.MemoryView._unellipsify (scipy/spatial/qhull.c:17965) TypeError: Cannot index with type '' ====================================================================== FAIL: test_qhull.TestUtilities.test_find_simplex ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/denis/.local/lib/python2.4/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File 
"/home/denis/.local/lib/python2.4/site-packages/scipy/spatial/tests/test_qhull.py", line 33, in test_find_simplex assert_equal(i, p[2], err_msg='%r' % (p,)) File "/home/denis/.local/lib/python2.4/site-packages/numpy/testing/utils.py", line 256, in assert_equal return assert_array_equal(actual, desired, err_msg, verbose) File "/home/denis/.local/lib/python2.4/site-packages/numpy/testing/utils.py", line 707, in assert_array_equal verbose=verbose, header='Arrays are not equal') File "/home/denis/.local/lib/python2.4/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not equal (0.25, 0.25, 1) (mismatch 100.0%) x: array(-1, dtype=int32) y: array(1) ---------------------------------------------------------------------- Ran 291 tests in 13.335s FAILED (errors=3, failures=1) Weave: FAIL: test_set_complex (test_scxx_object.TestObjectSetItemOpKey) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/denis/.local/lib/python2.4/site-packages/scipy/weave/tests/test_scxx_object.py", line 943, in test_set_complex assert_equal(sys.getrefcount(key),4) # should be 3 File "/home/denis/.local/lib/python2.4/site-packages/numpy/testing/utils.py", line 313, in assert_equal raise AssertionError(msg) AssertionError: Items are not equal: ACTUAL: 3 DESIRED: 4 Failures not specific to python 2.4. Sparse: ====================================================================== ERROR: adding a dense matrix to a sparse matrix ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/denis/.local/lib/python2.4/site-packages/scipy/sparse/tests/test_base.py", line 527, in test_add_dense sum1 = self.dat + self.datsp File "/home/denis/.local/lib/python2.4/site-packages/scipy/sparse/dok.py", line 133, in __getitem__ raise TypeError('index must be a pair of integers or slices') TypeError: index must be a pair of integers or slices ====================================================================== ERROR: test_matmat_sparse (test_base.TestDOK) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/denis/.local/lib/python2.4/site-packages/scipy/sparse/tests/test_base.py", line 425, in test_matmat_sparse assert_array_almost_equal( a2*bsp, a*b) File "/home/denis/.local/lib/python2.4/site-packages/scipy/sparse/dok.py", line 133, in __getitem__ raise TypeError('index must be a pair of integers or slices') TypeError: index must be a pair of integers or slices ====================================================================== ERROR: test_radd (test_base.TestDOK) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/denis/.local/lib/python2.4/site-packages/scipy/sparse/tests/test_base.py", line 279, in test_radd c = a + b File "/home/denis/.local/lib/python2.4/site-packages/scipy/sparse/dok.py", line 133, in __getitem__ raise TypeError('index must be a pair of integers or slices') TypeError: index must be a pair of integers or slices ====================================================================== ERROR: test_rsub (test_base.TestDOK) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/denis/.local/lib/python2.4/site-packages/scipy/sparse/tests/test_base.py", line 290, in test_rsub assert_array_equal((self.dat - 
self.datsp),[[0,0,0,0],[0,0,0,0],[0,0,0,0]]) File "/home/denis/.local/lib/python2.4/site-packages/scipy/sparse/dok.py", line 133, in __getitem__ raise TypeError('index must be a pair of integers or slices') TypeError: index must be a pair of integers or slices ====================================================================== ERROR: subtracting a dense matrix to/from a sparse matrix ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/denis/.local/lib/python2.4/site-packages/scipy/sparse/tests/test_base.py", line 535, in test_sub_dense sum1 = 3*self.dat - self.datsp File "/home/denis/.local/lib/python2.4/site-packages/scipy/sparse/dok.py", line 133, in __getitem__ raise TypeError('index must be a pair of integers or slices') TypeError: index must be a pair of integers or slices ---------------------------------------------------------------------- Ran 1210 tests in 12.543s FAILED (KNOWNFAIL=4, errors=5) Special: ................../home/denis/.local/lib/python2.7/site-packages/scipy/special/tests/test_basic.py:1645: RuntimeWarning: divide by zero encountered in iv c1 = special.iv(v, x) /home/denis/.local/lib/python2.7/site-packages/scipy/special/tests/test_basic.py:1645: RuntimeWarning: overflow encountered in iv c1 = special.iv(v, x) /home/denis/.local/lib/python2.7/site-packages/scipy/special/tests/test_basic.py:1646: RuntimeWarning: invalid value encountered in iv c2 = special.iv(v, x+0j) /home/denis/.local/lib/python2.7/site-packages/scipy/special/tests/test_basic.py:1652: RuntimeWarning: divide by zero encountered in divide dc = abs(c1/c2 - 1) /home/denis/.local/lib/python2.7/site-packages/scipy/special/tests/test_basic.py:1652: RuntimeWarning: invalid value encountered in divide dc = abs(c1/c2 - 1) /home/denis/.local/lib/python2.7/site-packages/scipy/special/tests/test_basic.py:1659: RuntimeWarning: overflow encountered in iv assert_(dc[k] < 1e-9, (v[k], x[k], special.iv(v[k], x[k]), special.iv(v[k], x[k]+0j))) F...................................................K.K.............................................................................................................................................................................................................................................................................................................................................................................................K........K.................................................. 
====================================================================== FAIL: test_iv_cephes_vs_amos_mass_test (test_basic.TestBessel) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/denis/.local/lib/python2.7/site-packages/scipy/special/tests/test_basic.py", line 1659, in test_iv_cephes_vs_amos_mass_test assert_(dc[k] < 1e-9, (v[k], x[k], special.iv(v[k], x[k]), special.iv(v[k], x[k]+0j))) File "/usr/lib/pymodules/python2.7/numpy/testing/utils.py", line 34, in assert_ raise AssertionError(msg) AssertionError: (189.2947429454936, 3.0238805556481037, 4.089165443940765e-317, 0j) ---------------------------------------------------------------------- Ran 514 tests in 6.908s FAILED (KNOWNFAIL=4, failures=1) Building/testing environment for python 2.4 (Debian wheezy/sid x86_64): NumPy version 1.6.2 NumPy is installed in /home/denis/.local/lib/python2.4/site-packages/numpy SciPy version 0.11.0.dev-127acfd SciPy is installed in /home/denis/.local/lib/python2.4/site-packages/scipy Python version 2.4.6 (#2, Sep 25 2009, 22:22:06) [GCC 4.3.4] nose version 1.1.2 -- Denis From denis at laxalde.org Thu May 31 11:54:12 2012 From: denis at laxalde.org (Denis Laxalde) Date: Thu, 31 May 2012 11:54:12 -0400 Subject: [SciPy-Dev] test failures with python 2.4 In-Reply-To: <1338479402.4062.24.camel@schloss.campus.mcgill.ca> References: <1338479402.4062.24.camel@schloss.campus.mcgill.ca> Message-ID: <1338479652.4062.25.camel@schloss.campus.mcgill.ca> Le jeudi 31 mai 2012 à 11:50 -0400, Denis Laxalde a écrit : > Failures not specific to python 2.4. > > Sparse: The sparse test failures are actually specific to python 2.4.