From sturla at molden.no Thu Feb 5 09:49:47 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 05 Feb 2009 15:49:47 +0100 Subject: [SciPy-dev] scipy.org is down again In-Reply-To: References: Message-ID: <498AFC8B.6010908@molden.no> Someone restart the server please? S.M. From anand.prabhakar.patil at gmail.com Thu Feb 5 13:11:00 2009 From: anand.prabhakar.patil at gmail.com (anand.prabhakar.patil) Date: Thu, 5 Feb 2009 18:11:00 +0000 Subject: [SciPy-dev] Distutils: incorrect inference of cpu type on Ubuntu Intrepid Message-ID: <2bc7a5a50902051011o47904200lcf8028cc2d26cdcd@mail.gmail.com> Hi all, I just installed numpy from the debian packages on a virtual machine in a 'cloud' to do some multiprocessing. The numpy version is 1.1.1, but the problem (I think) would happen with 1.3.0.dev6034 as well. While trying to build a package that incorporates Fortran sources, I ran into the following: gfortran:f77: mbgw/st_cov_fun/fst_cov_fun.f f951: error: CPU you selected does not support x86-64 instruction set f951: error: CPU you selected does not support x86-64 instruction set f951: error: CPU you selected does not support x86-64 instruction set f951: error: CPU you selected does not support x86-64 instruction set error: Command "/usr/bin/gfortran -Wall -ffixed-form -fno-second-underscore -fPIC -O3 -funroll-loops -march=k6-2 -mmmx -msse2 -msse -msse3 -Ibuild/src.linux-x86_64-2.5 -I/usr/lib/python2.5/site-packages/numpy/core/include -I/usr/include/python2.5 -c -c mbgw/st_cov_fun/fst_cov_fun.f -o build/temp.linux-x86_64-2.5/mbgw/st_cov_fun/fst_cov_fun.o" failed with exit status 1 The problem seems to be the -march=k6-2 flag, which is not right because the chip is a 64-bit Opteron. 
The flag was set because the method numpy.distutils.cpuinfo.cpu._is_AthlonK6_2 returned True:

    def _is_AthlonK6_2(self):
        return self._is_AMD() and self.info[0]['model'] == '2'

    def _is_AMD(self):
        return self.info[0]['vendor_id']=='AuthenticAMD'

which is understandable given the machine's /proc/cpuinfo, which follows... but the chip is a 64-bit Opteron, as I said. Should I file a bug report?

Thanks,
Anand

0# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 16
model : 2
model name : Quad-Core AMD Opteron(tm) Processor 2352
stepping : 3
cpu MHz : 2109.267
cache size : 512 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good pni monitor cx16 lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs
bogomips : 4218.52
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate
[...]

From pav at iki.fi Thu Feb 5 14:51:37 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 5 Feb 2009 19:51:37 +0000 (UTC) Subject: [SciPy-dev] Distutils: incorrect inference of cpu type on Ubuntu Intrepid References: <2bc7a5a50902051011o47904200lcf8028cc2d26cdcd@mail.gmail.com> Message-ID: Thu, 05 Feb 2009 18:11:00 +0000, anand.prabhakar.patil wrote: > Hi all, > > I just installed numpy from the debian packages on a virtual machine in > a 'cloud' to do some multiprocessing. The numpy version is 1.1.1, but > the problem (I think) would happen with 1.3.0.dev6034 as well. 
> > While trying to build a package that incorporates Fortran sources, I ran > into the following: > > gfortran:f77: mbgw/st_cov_fun/fst_cov_fun.f f951: error: CPU you > selected does not support x86-64 instruction set f951: error: CPU you > selected does not support x86-64 instruction set f951: error: CPU you > selected does not support x86-64 instruction set f951: error: CPU you > selected does not support x86-64 instruction set error: Command > "/usr/bin/gfortran -Wall -ffixed-form -fno-second-underscore -fPIC -O3 > -funroll-loops -march=k6-2 -mmmx -msse2 -msse -msse3 > -Ibuild/src.linux-x86_64-2.5 > -I/usr/lib/python2.5/site-packages/numpy/core/include > -I/usr/include/python2.5 -c -c mbgw/st_cov_fun/fst_cov_fun.f -o > build/temp.linux-x86_64-2.5/mbgw/st_cov_fun/fst_cov_fun.o" failed with > exit status 1 Seems like this is already addressed in SVN trunk: http://scipy.org/scipy/numpy/changeset/5978 -- Pauli Virtanen From anand.prabhakar.patil at gmail.com Thu Feb 5 15:24:27 2009 From: anand.prabhakar.patil at gmail.com (anand.prabhakar.patil) Date: Thu, 5 Feb 2009 20:24:27 +0000 Subject: [SciPy-dev] Distutils: incorrect inference of cpu type on Ubuntu Intrepid In-Reply-To: References: <2bc7a5a50902051011o47904200lcf8028cc2d26cdcd@mail.gmail.com> Message-ID: <2bc7a5a50902051224m5fab50damcb063d9b5005f768@mail.gmail.com> Thanks Pauli, what a relief. I'll give it a whirl. Anand On Thu, Feb 5, 2009 at 7:51 PM, Pauli Virtanen wrote: > Thu, 05 Feb 2009 18:11:00 +0000, anand.prabhakar.patil wrote: > > > Hi all, > > > > I just installed numpy from the debian packages on a virtual machine in > > a 'cloud' to do some multiprocessing. The numpy version is 1.1.1, but > > the problem (I think) would happen with 1.3.0.dev6034 as well. 
> > > > While trying to build a package that incorporates Fortran sources, I ran > > into the following: > > > > gfortran:f77: mbgw/st_cov_fun/fst_cov_fun.f f951: error: CPU you > > selected does not support x86-64 instruction set f951: error: CPU you > > selected does not support x86-64 instruction set f951: error: CPU you > > selected does not support x86-64 instruction set f951: error: CPU you > > selected does not support x86-64 instruction set error: Command > > "/usr/bin/gfortran -Wall -ffixed-form -fno-second-underscore -fPIC -O3 > > -funroll-loops -march=k6-2 -mmmx -msse2 -msse -msse3 > > -Ibuild/src.linux-x86_64-2.5 > > -I/usr/lib/python2.5/site-packages/numpy/core/include > > -I/usr/include/python2.5 -c -c mbgw/st_cov_fun/fst_cov_fun.f -o > > build/temp.linux-x86_64-2.5/mbgw/st_cov_fun/fst_cov_fun.o" failed with > > exit status 1 > > Seems like this is already addressed in SVN trunk: > http://scipy.org/scipy/numpy/changeset/5978 > > -- > Pauli Virtanen > > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jh at physics.ucf.edu Thu Feb 5 15:44:14 2009 From: jh at physics.ucf.edu (Joe Harrington) Date: Thu, 05 Feb 2009 15:44:14 -0500 Subject: [SciPy-dev] updating the numpy/scipy versions in Linux distros Message-ID: The versions of our stuff about to go out in Ubuntu 9.04 are: numpy 1.1.1 scipy 0.6.0 Would it be possible to get 1.2.1 and 0.7.0 to go instead? Who takes care of pushing these to the various Linux distros? It would be nice if the Packaging section of Developer Zone on the web site made the packaging process a little more transparent... Say what you do to get a release out and to get it in distribution, who does it, what the timetables tend to look like, etc. If you ask for help there, you're likely to get it. 
Thanks, --jh-- From robert.kern at gmail.com Thu Feb 5 15:46:43 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 5 Feb 2009 14:46:43 -0600 Subject: [SciPy-dev] updating the numpy/scipy versions in Linux distros In-Reply-To: References: Message-ID: <3d375d730902051246n66c6e91dv82d58a8c6cde36d@mail.gmail.com> On Thu, Feb 5, 2009 at 14:44, Joe Harrington wrote: > The versions of our stuff about to go out in Ubuntu 9.04 are: > > numpy 1.1.1 > scipy 0.6.0 > > Would it be possible to get 1.2.1 and 0.7.0 to go instead? Who takes > care of pushing these to the various Linux distros? > > It would be nice if the Packaging section of Developer Zone on the web > site made the packaging process a little more transparent... Say what > you do to get a release out and to get it in distribution, who does > it, what the timetables tend to look like, etc. If you ask for help > there, you're likely to get it. My experience is that most distribution packagers aren't on this list. We don't push; they pull. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From michael.abshoff at googlemail.com Thu Feb 5 15:53:45 2009 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Thu, 05 Feb 2009 12:53:45 -0800 Subject: [SciPy-dev] updating the numpy/scipy versions in Linux distros In-Reply-To: <3d375d730902051246n66c6e91dv82d58a8c6cde36d@mail.gmail.com> References: <3d375d730902051246n66c6e91dv82d58a8c6cde36d@mail.gmail.com> Message-ID: <498B51D9.7010709@gmail.com> Robert Kern wrote: > On Thu, Feb 5, 2009 at 14:44, Joe Harrington wrote: >> The versions of our stuff about to go out in Ubuntu 9.04 are: >> >> numpy 1.1.1 >> scipy 0.6.0 >> >> Would it be possible to get 1.2.1 and 0.7.0 to go instead? Who takes >> care of pushing these to the various Linux distros? 
>> >> It would be nice if the Packaging section of Developer Zone on the web >> site made the packaging process a little more transparent... Say what >> you do to get a release out and to get it in distribution, who does >> it, what the timetables tend to look like, etc. If you ask for help >> there, you're likely to get it. > > My experience is that most distribution packagers aren't on this list. > We don't push; they pull. > Well, given that Ubuntu gets most of its packages from Debian and unstable has some fairly current numpy/scipy I guess you need to wait for things to trickle down. But poking the right person in the Ubuntu universe might also be a good idea, not that I know who that is. Cheers, Michael From millman at berkeley.edu Thu Feb 5 17:10:32 2009 From: millman at berkeley.edu (Jarrod Millman) Date: Thu, 5 Feb 2009 14:10:32 -0800 Subject: [SciPy-dev] updating the numpy/scipy versions in Linux distros In-Reply-To: References: Message-ID: On Thu, Feb 5, 2009 at 12:44 PM, Joe Harrington wrote: > It would be nice if the Packaging section of Developer Zone on the web > site made the packaging process a little more transparent... Say what > you do to get a release out and to get it in distribution, who does > it, what the timetables tend to look like, etc. If you ask for help > there, you're likely to get it. I don't use Ubuntu, but a quick google search gave me this: http://packages.ubuntu.com/jaunty/python-numpy http://packages.ubuntu.com/jaunty/python-scipy I would rather not have information about Ubuntu's release process on the SciPy site. Ubuntu should be the system of record and it is very easy to find out that information using google. Anything added to the SciPy site will either quickly become out of date or will need maintenance (we have too much out-of-date, conflicting, or duplicate information on the site). 
If you want the Ubuntu developers to use more recent versions of numpy and scipy, try "asking a question" about whether they will upgrade through launchpad: https://answers.launchpad.net/ubuntu/+source/python-scipy/+addquestion https://answers.launchpad.net/ubuntu/+source/python-numpy/+addquestion Hope that is enough information to get you started. Good luck, Jarrod From werner.ho at gmx.de Sat Feb 7 04:57:33 2009 From: werner.ho at gmx.de (Werner Hoch) Date: Sat, 7 Feb 2009 10:57:33 +0100 Subject: [SciPy-dev] updating the numpy/scipy versions in Linux distros In-Reply-To: References: Message-ID: <200902071057.34169.werner.ho@gmx.de> Hi all, here are some notes about openSUSE. numpy and scipy are not part of the openSUSE core distribution, but several projects in the openSUSE BuildService are packaging and using scipy and numpy. (https://build.opensuse.org/) I've maintained numpy and scipy for a while in the science project. Now both packages are maintained in the Education project. Misc links: http://download.opensuse.org/repositories/Education/ http://download.opensuse.org/repositories/Education/openSUSE_11.1/repodata/repoview/python-scipy-0-0.6.0-4.5.html http://download.opensuse.org/repositories/Education/openSUSE_11.1/repodata/repoview/python-numpy-0-1.2.1-2.6.html The current versions are numpy 1.2.1 scipy 0.6.0 I think as soon as scipy 0.7.0 is out the package will be updated in the Education project. Lars has added some patches to scipy 0.6.0 to fix some compilation warnings in the Education project. I don't know if he has reported them upstream to the scipy project. openSUSE_11.1 has some strict rules about compilation warnings. The compiler warnings are checked after the build, and some warnings are treated as errors. In my private project (not published) I've created a test build for scipy 0.7.0rc2. The buildservice complains that scipy 0.7.0rc2 has the following errors: ---------- I: Program is using implicit definitions of special functions. 
these functions need to use their correct prototypes to allow the lightweight buffer overflow checking to work. - Implicit memory/string functions need #include <string.h>. - Implicit *printf functions need #include <stdio.h>. - Implicit *printf functions need #include <stdio.h>. - Implicit *read* functions need #include <unistd.h>. - Implicit *recv* functions need #include <sys/socket.h>. E: python-scipy implicit-fortify-decl scipy/sparse/linalg/dsolve/SuperLU/SRC/xerbla.c:33 I: Program returns random data in a function E: python-scipy no-return-in-nonvoid-function build/src.linux-x86_64-2.6/scipy/sparse/linalg/isolve/iterative/getbreak.f:74, 54, 34, 14 I: Program causes undefined operation (likely same variable used twice and post/pre incremented in the same expression). e.g. x = x++; Split it in two operations. E: python-scipy sequence-point scipy/sparse/linalg/dsolve/SuperLU/SRC/cutil.c:243 E: python-scipy sequence-point scipy/sparse/linalg/dsolve/SuperLU/SRC/zutil.c:243 ---------- I'm attaching the patches from Lars (they are against scipy 0.6.0). Maybe someone can review them and integrate them into scipy. Regards Werner -------------- next part -------------- A non-text attachment was scrubbed... Name: scipy-0.6.0-implicit-fortify-decl.patch Type: text/x-diff Size: 364 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: scipy-0.6.0-no-return-in-nonvoid-function.patch Type: text/x-diff Size: 380 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: scipy-0.6.0-undefined_operation.patch Type: text/x-diff Size: 1675 bytes Desc: not available URL: From pav at iki.fi Sun Feb 8 12:23:00 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 8 Feb 2009 17:23:00 +0000 (UTC) Subject: [SciPy-dev] Bessel functions from Boost References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> Message-ID: Some of the real-valued Bessel function implementations from the Cephes library currently used in scipy.special have problems. (See #503, #851, #853, #854.) Fixing some of these (eg. #503) would require implementing robust computation algorithms from scratch. (The Specfun code is IMHO too obscure and badly commented to be relied on as an alternative.) However, the Boost library seems to have good implementations Bessel (and some other) special functions: http://svn.boost.org/svn/boost/trunk/boost/math/special_functions/detail/ http://www.boost.org/doc/libs/1_37_0/libs/math/doc/sf_and_dist/html/math_toolkit/special.html Also the license seems Scipy-compatible: http://www.boost.org/LICENSE_1_0.txt So, I'd like to bring these over to Scipy, to replace some of the Cephes routines. The only problem is that being in Boost, they are written in C++, and I guess we can't make Scipy to depend on it. I see two options: A) Bundle the relevant subset of Boost with Scipy. The problem here is that the special functions seem to pull in a sizable subset of the whole Boost library. Also, I don't know how well compilers handle the template-happy C++ in boost today on all platforms where Scipy must work on. B) Convert the Boost code from C++ to C. This is in fact quite trivial search-and-replace operation. One example here: http://github.com/pv/scipy/blob/ticket-503-special-iv-fix/scipy/special/cephes/scipy_iv.c I'd like to see (B) happen in scipy.special. Thoughts? 
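Whichever option is chosen, a replacement routine has to reproduce known values. What follows is a minimal sketch of such a check, assuming only the standard ascending series for I_v(x) and the closed form I_{1/2}(x) = sqrt(2/(pi x)) sinh(x). The names `iv_series` and `iv_half_exact` are hypothetical helpers made up for illustration; this is not the Cephes, Specfun or Boost algorithm, and it only handles x > 0 and v >= 0.

```python
import math


def iv_series(v, x, max_terms=2000, rel_tol=1e-17):
    """Modified Bessel function I_v(x) via its ascending series.

    Hypothetical helper for illustration only (x > 0, v >= 0):
        I_v(x) = sum_k (x/2)**(2k+v) / (k! * Gamma(k+v+1))
    """
    if x <= 0 or v < 0:
        raise ValueError("sketch only handles x > 0 and v >= 0")
    log_half_x = math.log(x / 2.0)
    total = 0.0
    for k in range(max_terms):
        # Work in log space so large x does not overflow intermediate powers.
        log_term = ((2 * k + v) * log_half_x
                    - math.lgamma(k + 1) - math.lgamma(k + v + 1))
        term = math.exp(log_term)
        total += term
        # Terms shrink monotonically once k > x/2; stop when negligible.
        if k > x and term < rel_tol * total:
            break
    return total


def iv_half_exact(x):
    """Closed form I_{1/2}(x) = sqrt(2 / (pi x)) * sinh(x)."""
    return math.sqrt(2.0 / (math.pi * x)) * math.sinh(x)


# Point checks spanning a few orders of magnitude in x.
for x in (1e-2, 1e-1, 1.0, 10.0, 100.0):
    approx, exact = iv_series(0.5, x), iv_half_exact(x)
    print(x, approx, exact, abs(approx - exact) / exact)
```

All series terms are positive here, so there is no cancellation; the hard cases (negative or very large orders, large arguments) are exactly the ones this sketch does not cover.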
-- Pauli Virtanen From matthieu.brucher at gmail.com Sun Feb 8 12:25:32 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sun, 8 Feb 2009 18:25:32 +0100 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> Message-ID: > The only problem is that being in Boost, they are written in C++, and I > guess we can't make Scipy to depend on it. The sparse module is already based on C++, so why not more? Matthieu -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From pav at iki.fi Sun Feb 8 12:35:41 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 8 Feb 2009 17:35:41 +0000 (UTC) Subject: [SciPy-dev] Bessel functions from Boost References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> Message-ID: Sun, 08 Feb 2009 18:25:32 +0100, Matthieu Brucher wrote: >> The only problem is that being in Boost, they are written in C++, and I >> guess we can't make Scipy to depend on it. > > The sparse module is already based on C++, so why not more? The problem is not C++ per se, but Boost: (i) How much of it we need to bundle with Scipy? (ii) Are there portability/build issues? But yes, using unmodified upstream code could be a relief from the maintenance POV. -- Pauli Virtanen From pav at iki.fi Sun Feb 8 13:07:24 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 8 Feb 2009 18:07:24 +0000 (UTC) Subject: [SciPy-dev] Bessel functions from Boost References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> Message-ID: Sun, 08 Feb 2009 17:35:41 +0000, Pauli Virtanen wrote: > Sun, 08 Feb 2009 18:25:32 +0100, Matthieu Brucher wrote: > >>> The only problem is that being in Boost, they are written in C++, and >>> I guess we can't make Scipy to depend on it. >> >> The sparse module is already based on C++, so why not more? 
> > The problem is not C++ per se, but Boost: > > (i) How much of it we need to bundle with Scipy? To answer myself: the bessel.hpp appears to pull in 2.7 Mb of boost headers. -- Pauli Virtanen From charlesr.harris at gmail.com Sun Feb 8 13:39:55 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 8 Feb 2009 11:39:55 -0700 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> Message-ID: On Sun, Feb 8, 2009 at 10:23 AM, Pauli Virtanen wrote: > > Some of the real-valued Bessel function implementations from the Cephes > library currently used in scipy.special have problems. (See #503, #851, > #853, #854.) Fixing some of these (eg. #503) would require implementing > robust computation algorithms from scratch. (The Specfun code is IMHO too > obscure and badly commented to be relied on as an alternative.) > > However, the Boost library seems to have good implementations Bessel > (and some other) special functions: > > > http://svn.boost.org/svn/boost/trunk/boost/math/special_functions/detail/ > > http://www.boost.org/doc/libs/1_37_0/libs/math/doc/sf_and_dist/html/math_toolkit/special.html > > Also the license seems Scipy-compatible: > > http://www.boost.org/LICENSE_1_0.txt > > So, I'd like to bring these over to Scipy, to replace some of the Cephes > routines. > > The only problem is that being in Boost, they are written in C++, and I > guess we can't make Scipy to depend on it. > > I see two options: > > A) Bundle the relevant subset of Boost with Scipy. The problem here > is that the special functions seem to pull in a sizable subset > of the whole Boost library. > That's a common problem with big c++ libraries. > > Also, I don't know how well compilers handle the template-happy > C++ in boost today on all platforms where Scipy must work on. > > B) Convert the Boost code from C++ to C. This is in fact quite trivial > search-and-replace operation. 
One example here: > > > http://github.com/pv/scipy/blob/ticket-503-special-iv-fix/scipy/special/cephes/scipy_iv.c > > I'd like to see (B) happen in scipy.special. Thoughts? > I think it's a good idea. It would also be nice if we picked best of breed from several libraries to make up our own special functions collection. For instance, there are several log gamma functions. That's probably a big job though and we would need extensive tests for the functions before trying it. Do you know of any other project that has put together such a test suite? Does boost have versions for log1p? We need a better implementation in numpy itself. Chuck From michael.abshoff at googlemail.com Sun Feb 8 13:58:13 2009 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Sun, 08 Feb 2009 10:58:13 -0800 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> Message-ID: <498F2B45.1080008@gmail.com> Pauli Virtanen wrote: Hi, > Sun, 08 Feb 2009 18:25:32 +0100, Matthieu Brucher wrote: > >>> The only problem is that being in Boost, they are written in C++, and I >>> guess we can't make Scipy to depend on it. >> The sparse module is already based on C++, so why not more? > > The problem is not C++ per se, but Boost: > > (i) How much of it we need to bundle with Scipy? > (ii) Are there portability/build issues? Boost is a never-ending source of portability/build issues and every project I ever touched using boost had specific version requirements, e.g. PyCUDA wanted either one of two specific releases while quantlib wanted another set, but in between them there wasn't any boost that worked for both of them. Putting that code in-tree opens you up to all kinds of version mismatches and confusion if boost is installed system wide. 
I have had to fix or work around issues with recent boost on common platforms like OSX, much less seemingly "exotic" things like FreeBSD :). Boost has its own build system (jam), which isn't exactly used commonly anywhere else and is quite painful; e.g. boost always used the global Python headers for quantlib, and you need either the latest release or some snapshot to work around that bug. Boost code requires beefy resources to compile, and on and on and on. Please do not touch boost code, but if you must, either translate it to C code or look at some alternative like mpmath, i.e. http://code.google.com/p/mpmath/ Fredrik is quite responsive about bugs and feature requests, and we have talked to him about replacing some of the functionality provided by cephes in Sage via mpmath, since its functions are arbitrary precision and pretty fast when optionally using gmp. But it also works in pure mode, i.e. all BSD licensed code. > But yes, using unmodified upstream code could be a relief from the > maintenance POV. Well, I am not so sure about that :) Cheers, Michael From pav at iki.fi Sun Feb 8 14:11:24 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 8 Feb 2009 19:11:24 +0000 (UTC) Subject: [SciPy-dev] Bessel functions from Boost References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> Message-ID: Sun, 08 Feb 2009 11:39:55 -0700, Charles R Harris wrote: [clip] > I think it's a good idea. It would also be nice if we picked best of > breed from several libraries to make up our own special functions > collection. For instance, there are several log gamma functions. I think we need to do this eventually, even if it means lots of work. At points the Cephes and Specfun codes seem like the author has not wanted to bother with the best possible algorithm, which leads to problems in corner cases. > That's > probably a big job though and we would need extensive tests for the > functions before trying it. Do you know of any other project that has > put together such a test suite? 
For Bessel functions we can easily test against the AMOS library, which appears to be reliable --- unless the order is close to negative integers in which case there can be cancellation errors of the order of 1e-6 in the reflection formulas. Boost itself has tests for its special functions; these are spot tests at precomputed points, from 1e-2..1e2 magnitudes in both parameters. [As an aside, I noticed Boost's I(v,x) overflows to infty somewhat earlier than necessary for very large orders, though.] GSL has similar point tests. (But it's GPLed.) Netlib/Specfun has a test suite: http://netlib.org/specfun/ ; in F77. Anyway, point tests across some magnitudes of parameters should be easy to generate. What takes more work is checking the behavior of the functions in transition regions where the method of computation changes, and asymptotic behavior (overflows, etc.) at large or small parameters and near singularities. > Does boost have versions for log1p? We need a better implementation in > numpy itself. It has. http://svn.boost.org/svn/boost/trunk/boost/math/special_functions/log1p.hpp It's cluttered by C++ templates, but the algorithm looks like some serious effort has been put into it. 
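To illustrate why a dedicated log1p matters, here is a minimal sketch; it is not Boost's or numpy's algorithm. Evaluating log(1 + x) directly loses most digits for tiny x, because 1 + x is rounded before the logarithm is taken. The classic correction below (commonly attributed to Kahan) rescales by x / ((1 + x) - 1). The names `log1p_naive` and `log1p_corrected` are hypothetical, and `math.log1p` from the Python standard library is used only as the reference value.

```python
import math


def log1p_naive(x):
    # Rounding 1 + x discards the low-order bits of a tiny x.
    return math.log(1.0 + x)


def log1p_corrected(x):
    # u carries the rounding error of 1 + x; (u - 1) is the value whose
    # logarithm was actually taken, so rescale the result by x / (u - 1).
    u = 1.0 + x
    if u == 1.0:
        return x  # x is too small to change 1.0; log1p(x) ~ x
    return math.log(u) * x / (u - 1.0)


# Relative errors of both variants against the library reference.
for x in (1e-16, 1e-10, 1e-5, 0.5):
    ref = math.log1p(x)
    print(x,
          abs(log1p_naive(x) - ref) / ref,
          abs(log1p_corrected(x) - ref) / ref)
```

The correction costs one extra division and relies on the compiler or interpreter not simplifying (u - 1) algebraically, which is one reason production implementations tend to use more careful schemes.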
-- Pauli Virtanen From simpson at math.toronto.edu Sun Feb 8 15:05:25 2009 From: simpson at math.toronto.edu (Gideon Simpson) Date: Sun, 8 Feb 2009 15:05:25 -0500 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> Message-ID: <9B2539E1-72A6-411F-B8C8-9A68D49AF9CA@math.toronto.edu> On Feb 8, 2009, at 12:23 PM, Pauli Virtanen wrote: > > However, the Boost library seems to have good implementations Bessel > (and some other) special functions: > > http://svn.boost.org/svn/boost/trunk/boost/math/special_functions/detail/ > http://www.boost.org/doc/libs/1_37_0/libs/math/doc/sf_and_dist/html/math_toolkit/special.html > > Also the license seems Scipy-compatible: > > http://www.boost.org/LICENSE_1_0.txt > > So, I'd like to bring these over to Scipy, to replace some of the > Cephes > routines. > > The only problem is that being in Boost, they are written in C++, > and I > guess we can't make Scipy to depend on it. What about the GSL implementation of the Bessel function? That's already in C and seems, in some sense, a more natural companion library to SciPy than boost. -gideon -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Sun Feb 8 15:15:18 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 8 Feb 2009 13:15:18 -0700 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: <9B2539E1-72A6-411F-B8C8-9A68D49AF9CA@math.toronto.edu> References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <9B2539E1-72A6-411F-B8C8-9A68D49AF9CA@math.toronto.edu> Message-ID: On Sun, Feb 8, 2009 at 1:05 PM, Gideon Simpson wrote: > On Feb 8, 2009, at 12:23 PM, Pauli Virtanen wrote: > > > However, the Boost library seems to have good implementations Bessel > (and some other) special functions: > > http://svn.boost.org/svn/boost/trunk/boost/math/special_functions/detail/ > > http://www.boost.org/doc/libs/1_37_0/libs/math/doc/sf_and_dist/html/math_toolkit/special.html > > Also the license seems Scipy-compatible: > > http://www.boost.org/LICENSE_1_0.txt > > So, I'd like to bring these over to Scipy, to replace some of the Cephes > routines. > > The only problem is that being in Boost, they are written in C++, and I > guess we can't make Scipy to depend on it. > > > What about the GSL implementation of the Bessel function? That's already > in C and seems, in some sense, a more natural companion library to SciPy > than boost. > Wrong license. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From michael.abshoff at googlemail.com Sun Feb 8 15:16:45 2009 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Sun, 08 Feb 2009 12:16:45 -0800 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: <9B2539E1-72A6-411F-B8C8-9A68D49AF9CA@math.toronto.edu> References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <9B2539E1-72A6-411F-B8C8-9A68D49AF9CA@math.toronto.edu> Message-ID: <498F3DAD.5000707@gmail.com> Gideon Simpson wrote: > On Feb 8, 2009, at 12:23 PM, Pauli Virtanen wrote: > >> >> However, the Boost library seems to have good implementations Bessel >> (and some other) special functions: >> >> http://svn.boost.org/svn/boost/trunk/boost/math/special_functions/detail/ >> >> http://www.boost.org/doc/libs/1_37_0/libs/math/doc/sf_and_dist/html/math_toolkit/special.html >> >> >> Also the license seems Scipy-compatible: >> >> http://www.boost.org/LICENSE_1_0.txt >> >> So, I'd like to bring these over to Scipy, to replace some of the Cephes >> routines. >> >> The only problem is that being in Boost, they are written in C++, and I >> guess we can't make Scipy to depend on it. Hi, > What about the GSL implementation of the Bessel function? That's > already in C and seems, in some sense, a more natural companion library > to SciPy than boost. 
GSL is GPL licensed - not surprisingly since it is the GNU Scientific Library :) > -gideon Cheers, Michael From simpson at math.toronto.edu Sun Feb 8 15:30:11 2009 From: simpson at math.toronto.edu (Gideon Simpson) Date: Sun, 8 Feb 2009 15:30:11 -0500 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <9B2539E1-72A6-411F-B8C8-9A68D49AF9CA@math.toronto.edu> Message-ID: On Feb 8, 2009, at 3:15 PM, Charles R Harris wrote: > What about the GSL implementation of the Bessel function? That's > already in C and seems, in some sense, a more natural companion > library to SciPy than boost. > > Wrong license. > > Chuck I gather there is interest in reconciling the GPL and BSD licenses? -gideon From michael.abshoff at googlemail.com Sun Feb 8 15:42:58 2009 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Sun, 08 Feb 2009 12:42:58 -0800 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <9B2539E1-72A6-411F-B8C8-9A68D49AF9CA@math.toronto.edu> Message-ID: <498F43D2.7090506@gmail.com> Gideon Simpson wrote: > On Feb 8, 2009, at 3:15 PM, Charles R Harris wrote: > >> What about the GSL implementation of the Bessel function? That's >> already in C and seems, in some sense, a more natural companion >> library to SciPy than boost. >> >> Wrong license. >> >> Chuck > > > I gather there is interest in reconciling the GPL and BSD licenses? What do you mean by that? Without going into details and triggering an epic flamewar: GPL and BSD define "freedom" differently, and given 25 years of history between the FSF and the BSD camp and endless flamewars, I don't see how this can be resolved in any way, shape or form. 
:) > -gideon Cheers, Michael > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > From simpson at math.toronto.edu Sun Feb 8 15:58:45 2009 From: simpson at math.toronto.edu (Gideon Simpson) Date: Sun, 8 Feb 2009 15:58:45 -0500 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: <498F43D2.7090506@gmail.com> References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <9B2539E1-72A6-411F-B8C8-9A68D49AF9CA@math.toronto.edu> <498F43D2.7090506@gmail.com> Message-ID: On Feb 8, 2009, at 3:42 PM, Michael Abshoff wrote: > > What do you mean by that? Without going into details and triggering an > epic flamewar GPL and BSD define "freedom" differently and given 25 > years of history between the FSF and the BSD camp and endless > flamewars > I don't see how this can be resolved in any way, shape or form. :) > > That's a real shame for an end-user like myself. I use a mixture of GSL and SciPy in my work, for different things, depending on the problem at hand. It would be good if there were robust interoperability there. -gideon From gael.varoquaux at normalesup.org Sun Feb 8 16:02:02 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 8 Feb 2009 22:02:02 +0100 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <9B2539E1-72A6-411F-B8C8-9A68D49AF9CA@math.toronto.edu> <498F43D2.7090506@gmail.com> Message-ID: <20090208210202.GV14469@phare.normalesup.org> On Sun, Feb 08, 2009 at 03:58:45PM -0500, Gideon Simpson wrote: > On Feb 8, 2009, at 3:42 PM, Michael Abshoff wrote: > > What do you mean by that? Without going into details and triggering an > > epic flamewar GPL and BSD define "freedom" differently and given 25 > > years of history between the FSF and the BSD camp and endless > > flamewars > > I don't see how this can be resolved in any way, shape or form. 
:) > That's a real shame for an end-user like myself. I use a mixture of > GSL and SciPy in my work, for different things, depending on the > problem at hand. It would be good if there were robust > interoperability there. That's because you don't have to worry about distributing software built upon these tools or making a profit. If you are trying to run a company, or simply if you are in a lab that wishes to sell some of the software it has developed, you end up caring about these things. Gaël From robert.kern at gmail.com Sun Feb 8 17:25:08 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 8 Feb 2009 16:25:08 -0600 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <9B2539E1-72A6-411F-B8C8-9A68D49AF9CA@math.toronto.edu> <498F43D2.7090506@gmail.com> Message-ID: <3d375d730902081425t441800e7uc239340b9dd88fb6@mail.gmail.com> On Sun, Feb 8, 2009 at 14:58, Gideon Simpson wrote: > On Feb 8, 2009, at 3:42 PM, Michael Abshoff wrote: > >> >> What do you mean by that? Without going into details and triggering an >> epic flamewar GPL and BSD define "freedom" differently and given 25 >> years of history between the FSF and the BSD camp and endless >> flamewars >> I don't see how this can be resolved in any way, shape or form. :) > > That's a real shame for an end-user like myself. I use a mixture of > GSL and SciPy in my work, for different things, depending on the > problem at hand. It would be good if there were robust > interoperability there. Interoperability is not the concern here. *You* can combine scipy and GSL all you like. That's not a problem. However, because we want to continue to use a BSD license for scipy, we don't include GPLed code like the GSL. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco From ondrej at certik.cz Sun Feb 8 19:12:51 2009 From: ondrej at certik.cz (Ondrej Certik) Date: Sun, 8 Feb 2009 16:12:51 -0800 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: <498F2B45.1080008@gmail.com> References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <498F2B45.1080008@gmail.com> Message-ID: <85b5c3130902081612wfabda9of8eaa3530e8d2309@mail.gmail.com> Hi, On Sun, Feb 8, 2009 at 10:58 AM, Michael Abshoff wrote: > Pauli Virtanen wrote: > > Hi, > >> Sun, 08 Feb 2009 18:25:32 +0100, Matthieu Brucher wrote: >> >>>> The only problem is that being in Boost, they are written in C++, and I >>>> guess we can't make Scipy to depend on it. >>> The sparse module is already based on C++, so why not more ? >> >> The problem is not C++ per se, but Boost: >> >> (i) How much of it we need to bundle with Scipy? >> (ii) Are there portability/build issues? > > Boost is a neverending source of portability/build issues and every > project I ever touched using boost had specific version requirements, > i.e. PyCUDA wanted either one of two specific releases while quantlib > wanted another set, but in between them there wasn't any boost that > worked for both of them. Putting that code in-tree opens you up to all > kinds of version mismatches and confusion if boost is installed system > wide. > > I have had to fix or work around issues with recent boost on common > platforms like OSX, much less seemingly "exotic" things like FreeBSD :), > boost has its own build system (jam) which isn't exactly used commonly > anywhere else and quite painful, i.e. boost always used the global > Python headers for quantlib for example and you need either the latest > release or some snapshot to work around that bug. Boost code requires > beefy resources to compile and on and on and on. Please do not touch > boost code, but if you must either translate C code or look at some > alternative like mpmath, i.e.
> > http://code.google.com/p/mpmath/ > > Fredrick is quite responsive about bugs and feature requests and we have > talked to him about replacing some of the functionality provided by > cephes in Sage via mpmath since they are arbitrary precision and pretty > fast when optionally using gmp. But it also works in pure mode, i.e. all > BSD licensed code. I completely agree with Michael here. Why not to use mpmath? It's bsd, it started as part of sympy and it was the GSoC project for sympy the last year. It's pretty competitive with gmp (e.g. for the pi digits calculations, it's even faster than Sage, unless Sage fixed that already), but one doesn't have to use gmp, if one doesn't want to. And I think both Fredrik and other mpmath and sympy developers would help to make mpmath working with scipy. Definitely I would. I think that's a better option, than to port some boost stuff and then you would have to maintain it. If you use mpmath, all of us win, imho. Ondrej From robert.kern at gmail.com Sun Feb 8 19:28:45 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 8 Feb 2009 18:28:45 -0600 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: <85b5c3130902081612wfabda9of8eaa3530e8d2309@mail.gmail.com> References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <498F2B45.1080008@gmail.com> <85b5c3130902081612wfabda9of8eaa3530e8d2309@mail.gmail.com> Message-ID: <3d375d730902081628m2a02f96cua4debfbd3ebeffeb@mail.gmail.com> On Sun, Feb 8, 2009 at 18:12, Ondrej Certik wrote: > Hi, > > On Sun, Feb 8, 2009 at 10:58 AM, Michael Abshoff > wrote: >> Pauli Virtanen wrote: >> >> Hi, >> >>> Sun, 08 Feb 2009 18:25:32 +0100, Matthieu Brucher wrote: >>> >>>>> The only problem is that being in Boost, they are written in C++, and I >>>>> guess we can't make Scipy to depend on it. >>>> The sparse module is already based on C+, so why not more ? >>> >>> The problem is not C++ per se, but Boost: >>> >>> (i) How much of it we need to bundle with Scipy? 
>>> (ii) Are there portability/build issues? >> >> Boost is a neverending souce of portability/build issues and every >> project I ever touched using boost had specific version requirements, >> i.e PuCUDA wanted either one of two speccfic release while quantlib >> wanted another set, but in between them there wasn't any boost that >> worked for both of them. Putting that code in-tree opens you up to all >> kinds of version mismatches and confusion if boost is installed system >> wide. >> >> I have had to fix or work around issues with recent boost on common >> platforms like OSX, much less seemingly "exotic" things like FreeBSD :), >> boost has its own build system (jam) which isn't exactly used commonly >> anywhere else and quite painful, i.e. boost always used the global >> Python headers for quantlib for example and you need either the latest >> release or some snapshot to work around that bug. Boost code requires >> beefy resources to compile and on and on an on. Please do not touch >> boost code, but if you must either translate C code or look at some >> alternative like mpmath, i.e. >> >> http://code.google.com/p/mpmath/ >> >> Fredrick is quite responsive about bugs and feature requests and we have >> talked to him about replacing some of the functionality provided by >> cephes in Sage via mpmath since they are arbitrary precision and pretty >> fast when optionally using gmp. But it also works in pure mode, i.e. all >> BSD licensed code. > > I completely agree with Michael here. Why not to use mpmath? It's bsd, > it started as part of sympy and it was the GSoC project for sympy the > last year. It's pretty competitive with gmp (e.g. for the pi digits > calculations, it's even faster than Sage, unless Sage fixed that > already), but one doesn't have to use gmp, if one doesn't want to. > > And I think both Fredrik and other mpmath and sympy developers would > help to make mpmath working with scipy. Definitely I would. 
I think > that's a better option, than to port some boost stuff and then you > would have to maintain it. If you use mpmath, all of us win, imho. ??? For implementing a C ufunc? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From cournape at gmail.com Sun Feb 8 20:10:46 2009 From: cournape at gmail.com (David Cournapeau) Date: Mon, 9 Feb 2009 10:10:46 +0900 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> Message-ID: <5b8d13220902081710n53ed4241t254dc99728eec6db@mail.gmail.com> On Mon, Feb 9, 2009 at 2:23 AM, Pauli Virtanen wrote: > > Some of the real-valued Bessel function implementations from the Cephes > library currently used in scipy.special have problems. (See #503, #851, > #853, #854.) Fixing some of these (eg. #503) would require implementing > robust computation algorithms from scratch. (The Specfun code is IMHO too > obscure and badly commented to be relied on as an alternative.) > > However, the Boost library seems to have good implementations Bessel > (and some other) special functions: > > http://svn.boost.org/svn/boost/trunk/boost/math/special_functions/detail/ > http://www.boost.org/doc/libs/1_37_0/libs/math/doc/sf_and_dist/html/math_toolkit/special.html > > Also the license seems Scipy-compatible: > > http://www.boost.org/LICENSE_1_0.txt > > So, I'd like to bring these over to Scipy, to replace some of the Cephes > routines. > > The only problem is that being in Boost, they are written in C++, and I > guess we can't make Scipy to depend on it. > > I see two options: > > A) Bundle the relevant subset of Boost with Scipy. The problem here > is that the special functions seem to pull in a sizable subset > of the whole Boost library. 
> > Also, I don't know how well compilers handle the template-happy > C++ in boost today on all platforms where Scipy must work on. I am -1 on boost. It is a nightmare to support on many platforms, and it is unreadable for people who are not C++ hackers. > B) Convert the Boost code from C++ to C. This is in fact quite trivial > search-and-replace operation. One example here: > > http://github.com/pv/scipy/blob/ticket-503-special-iv-fix/scipy/special/cephes/scipy_iv.c > > I'd like to see (B) happen in scipy.special. Thoughts? This is much better - I really don't see the point of using C++ for math functions. I am ok with this. David From cournape at gmail.com Sun Feb 8 20:22:43 2009 From: cournape at gmail.com (David Cournapeau) Date: Mon, 9 Feb 2009 10:22:43 +0900 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> Message-ID: <5b8d13220902081722h1dfbc23cxcf59583aaf084349@mail.gmail.com> On Mon, Feb 9, 2009 at 4:11 AM, Pauli Virtanen wrote: > Sun, 08 Feb 2009 11:39:55 -0700, Charles R Harris wrote: > [clip] >> I think it's a good idea. It would also be nice if we picked best of >> breed from several libraries to make up our own special functions >> collection. For instance, there are several log gamma functions. > > I think we need to do this eventually, even if it means lots of work. At > points the Cephes and Specfun codes seem like the author has not wanted > to bother with the best possible algorithm, which leads to problems in > corner cases. Yes, I agree - I already asked about this a few weeks ago after some problems with other functions. I think cephes and specfun are not reliable - R does not use it, they have their own implementation of core math functions (sometimes inspired from cephes/specfun, but not that often). > > Anyway, point tests across some magnitudes of parameters should be easy > to generate. 
What takes more work is checking the behavior of the > functions in transition regions where the method of computation changes, > and asymptotic behavior (overflows, etc.) at large or small parameters > and near singularities. Yes, it would be a lot of work - I think we should focus on the tests before rewriting some functions. I would like to have a core scipy.special which is reliable: bessel, gamma/digamma/co, chebychev, basically most functions in R core would be a good start - and already quite heavy from a work POV. > It has. > > http://svn.boost.org/svn/boost/trunk/boost/math/special_functions/ > log1p.hpp > > It's cluttered by C++ templates, but the algorithm looks like some > serious effort has been put into it. we could use their test-suite, maybe. cheers, David From ondrej at certik.cz Sun Feb 8 22:38:02 2009 From: ondrej at certik.cz (Ondrej Certik) Date: Sun, 8 Feb 2009 19:38:02 -0800 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: <3d375d730902081628m2a02f96cua4debfbd3ebeffeb@mail.gmail.com> References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <498F2B45.1080008@gmail.com> <85b5c3130902081612wfabda9of8eaa3530e8d2309@mail.gmail.com> <3d375d730902081628m2a02f96cua4debfbd3ebeffeb@mail.gmail.com> Message-ID: <85b5c3130902081938n6c42bdb5l33b626316c41fe67@mail.gmail.com> >> I completely agree with Michael here. Why not to use mpmath? It's bsd, >> it started as part of sympy and it was the GSoC project for sympy the >> last year. It's pretty competitive with gmp (e.g. for the pi digits >> calculations, it's even faster than Sage, unless Sage fixed that >> already), but one doesn't have to use gmp, if one doesn't want to. >> >> And I think both Fredrik and other mpmath and sympy developers would >> help to make mpmath working with scipy. Definitely I would. I think >> that's a better option, than to port some boost stuff and then you >> would have to maintain it. If you use mpmath, all of us win, imho. > > ??? For implementing a C ufunc? 
Using Cython? If it's as fast as anything else, why not. If it's not as fast, then that would be a reason not to use it. Ondrej From robert.kern at gmail.com Sun Feb 8 23:02:17 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 8 Feb 2009 22:02:17 -0600 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: <85b5c3130902081938n6c42bdb5l33b626316c41fe67@mail.gmail.com> References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <498F2B45.1080008@gmail.com> <85b5c3130902081612wfabda9of8eaa3530e8d2309@mail.gmail.com> <3d375d730902081628m2a02f96cua4debfbd3ebeffeb@mail.gmail.com> <85b5c3130902081938n6c42bdb5l33b626316c41fe67@mail.gmail.com> Message-ID: <3d375d730902082002t71ca184dy697d9a8e07ccb54f@mail.gmail.com> On Sun, Feb 8, 2009 at 21:38, Ondrej Certik wrote: >>> I completely agree with Michael here. Why not to use mpmath? It's bsd, >>> it started as part of sympy and it was the GSoC project for sympy the >>> last year. It's pretty competitive with gmp (e.g. for the pi digits >>> calculations, it's even faster than Sage, unless Sage fixed that >>> already), but one doesn't have to use gmp, if one doesn't want to. >>> >>> And I think both Fredrik and other mpmath and sympy developers would >>> help to make mpmath working with scipy. Definitely I would. I think >>> that's a better option, than to port some boost stuff and then you >>> would have to maintain it. If you use mpmath, all of us win, imho. >> >> ??? For implementing a C ufunc? > > Using Cython? If it's as fast as anything else, why not. If it's not > as fast, then that would be a reason not to use it. Well, show me the code. :-) I wasn't aware that mpmath could be Cythonized. If it can, and the result is reasonably fast, that would be *really* useful. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From michael.abshoff at googlemail.com Sun Feb 8 23:25:37 2009 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Sun, 08 Feb 2009 20:25:37 -0800 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: <5b8d13220902081710n53ed4241t254dc99728eec6db@mail.gmail.com> References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <5b8d13220902081710n53ed4241t254dc99728eec6db@mail.gmail.com> Message-ID: <498FB041.5080309@gmail.com> David Cournapeau wrote: Hi David, > This is much better - I really don't see the point of using C++ for > math functions. I am ok with this. > > David Out of curiosity: I checked the boost website and it states for the math lib: "All the implementations are fully generic and support the use of arbitrary "real-number" types, although they are optimised for use with types with known-about significand (or mantissa) sizes: typically float, double or long double." Since I assume some people around here are interested in arbitrary precisions and after looking some more at the documentation it seems that that library only supports this via using an NTL type which in turn uses GMP. NTL itself is GPLed, GMP is LGPL, so either one does not fit the licensing requirements of Scipy. David mentioned to write a library from scratch and I also assume that you want arbitrary precisions. Given that GMP is LGPL, the arbitrary precisions code in OpenSSL is covered by a BSD advertising clause (which might or might not be a deal breaker around here) what do you suggest to do about arbitrary precisions? 
I am not aware of any BSD 2 or 3 clause license library besides mpmath :) Cheers, Michael > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > From michael.abshoff at googlemail.com Sun Feb 8 23:30:27 2009 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Sun, 08 Feb 2009 20:30:27 -0800 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: <3d375d730902082002t71ca184dy697d9a8e07ccb54f@mail.gmail.com> References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <498F2B45.1080008@gmail.com> <85b5c3130902081612wfabda9of8eaa3530e8d2309@mail.gmail.com> <3d375d730902081628m2a02f96cua4debfbd3ebeffeb@mail.gmail.com> <85b5c3130902081938n6c42bdb5l33b626316c41fe67@mail.gmail.com> <3d375d730902082002t71ca184dy697d9a8e07ccb54f@mail.gmail.com> Message-ID: <498FB163.9030907@gmail.com> Robert Kern wrote: Hi, > On Sun, Feb 8, 2009 at 21:38, Ondrej Certik wrote: >>>> I completely agree with Michael here. Why not to use mpmath? It's bsd, >>>> it started as part of sympy and it was the GSoC project for sympy the >>>> last year. It's pretty competitive with gmp (e.g. for the pi digits >>>> calculations, it's even faster than Sage, unless Sage fixed that >>>> already), but one doesn't have to use gmp, if one doesn't want to. >>>> >>>> And I think both Fredrik and other mpmath and sympy developers would >>>> help to make mpmath working with scipy. Definitely I would. I think >>>> that's a better option, than to port some boost stuff and then you >>>> would have to maintain it. If you use mpmath, all of us win, imho. >>> ??? For implementing a C ufunc? >> Using Cython? If it's as fast as anything else, why not. If it's not >> as fast, then that would be a reason not to use it. > > Well, show me the code. :-) > > I wasn't aware that mpmath could be Cythonized. If it can, and the > result is reasonably fast, that would be *really* useful. 
> I believe Ondrej was talking about using Cython to reduce call overhead, not to Cythonize mpmath which might or might not pay off. Given that Cephes does not do arbitrary precision, taking a look at mpmath before you decide to reinvent the wheel seems like a good idea. I don't really see the problem since mpmath works and in some cases is competitive with MPFR. And I don't mean that silly Pi to some 10^X computation which isn't particularly useful in the real world. It is quite hard to do arbitrary precision arithmetic and numerically stable special functions, so building on top of mpmath has its advantages. Cheers, Michael From ondrej at certik.cz Sun Feb 8 23:34:51 2009 From: ondrej at certik.cz (Ondrej Certik) Date: Sun, 8 Feb 2009 20:34:51 -0800 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: <498FB163.9030907@gmail.com> References: <498F2B45.1080008@gmail.com> <85b5c3130902081612wfabda9of8eaa3530e8d2309@mail.gmail.com> <3d375d730902081628m2a02f96cua4debfbd3ebeffeb@mail.gmail.com> <85b5c3130902081938n6c42bdb5l33b626316c41fe67@mail.gmail.com> <3d375d730902082002t71ca184dy697d9a8e07ccb54f@mail.gmail.com> <498FB163.9030907@gmail.com> Message-ID: <85b5c3130902082034j636eda7eq6db8e0a73f81816e@mail.gmail.com> >>>> ??? For implementing a C ufunc? >>> Using Cython? If it's as fast as anything else, why not. If it's not >>> as fast, then that would be a reason not to use it. >> >> Well, show me the code. :-) I say that usually. :) >> >> I wasn't aware that mpmath could be Cythonized. If it can, and the >> result is reasonably fast, that would be *really* useful. >> > > I believe Ondrej was talking about using Cython to reduce call overhead, > not to Cythonize mpmath which might or might not pay off. Yes. > > Given that Cephes does not do arbitrary precision taking a look at > mpmath before you decide to reinvent the wheel seems like a good idea. I > don't really see the problem since mpmath works and in some cases is > competitive with MPFR.
And I don't meant that silly Pi to some 10^X > computation which isn't particularly useful in the real world. It is > quite hard to do arbitrary precision arithmetic and numerically stable > special functions, so building on top of mpmath has its advantages. Plus given that there are people who can help out with this. Ondrej From robert.kern at gmail.com Sun Feb 8 23:43:29 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 8 Feb 2009 22:43:29 -0600 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: <498FB163.9030907@gmail.com> References: <498F2B45.1080008@gmail.com> <85b5c3130902081612wfabda9of8eaa3530e8d2309@mail.gmail.com> <3d375d730902081628m2a02f96cua4debfbd3ebeffeb@mail.gmail.com> <85b5c3130902081938n6c42bdb5l33b626316c41fe67@mail.gmail.com> <3d375d730902082002t71ca184dy697d9a8e07ccb54f@mail.gmail.com> <498FB163.9030907@gmail.com> Message-ID: <3d375d730902082043u344994cbr4ca2740141df9525@mail.gmail.com> On Sun, Feb 8, 2009 at 22:30, Michael Abshoff wrote: > Robert Kern wrote: > > Hi, > >> On Sun, Feb 8, 2009 at 21:38, Ondrej Certik wrote: >>>>> I completely agree with Michael here. Why not to use mpmath? It's bsd, >>>>> it started as part of sympy and it was the GSoC project for sympy the >>>>> last year. It's pretty competitive with gmp (e.g. for the pi digits >>>>> calculations, it's even faster than Sage, unless Sage fixed that >>>>> already), but one doesn't have to use gmp, if one doesn't want to. >>>>> >>>>> And I think both Fredrik and other mpmath and sympy developers would >>>>> help to make mpmath working with scipy. Definitely I would. I think >>>>> that's a better option, than to port some boost stuff and then you >>>>> would have to maintain it. If you use mpmath, all of us win, imho. >>>> ??? For implementing a C ufunc? >>> Using Cython? If it's as fast as anything else, why not. If it's not >>> as fast, then that would be a reason not to use it. >> >> Well, show me the code. 
:-) >> >> I wasn't aware that mpmath could be Cythonized. If it can, and the >> result is reasonably fast, that would be *really* useful. >> > > I believe Ondrej was talking about using Cython to reduce call overhead, > not to Cythonize mpmath which might or might not pay off. Actually, that's all I meant by "Cythonize". You're right that it's not quite the right word. > Given that Cepehes does not do arbitrary precision taking a look at > mpmath before you decide to reinvent the wheel seems like a good idea. Actually, we weren't talking about multiprecision at all until you brought it up. The point of using the boost code wasn't the multiprecision aspect, but just that it was an alternative double-precision implementation that looked like it didn't have the bugs Cephes has. Multiprecision is an entirely different discussion, which goes some way towards explaining why I was confused why you and Ondrej think it's a good fit. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From michael.abshoff at googlemail.com Sun Feb 8 23:58:20 2009 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Sun, 08 Feb 2009 20:58:20 -0800 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: <3d375d730902082043u344994cbr4ca2740141df9525@mail.gmail.com> References: <498F2B45.1080008@gmail.com> <85b5c3130902081612wfabda9of8eaa3530e8d2309@mail.gmail.com> <3d375d730902081628m2a02f96cua4debfbd3ebeffeb@mail.gmail.com> <85b5c3130902081938n6c42bdb5l33b626316c41fe67@mail.gmail.com> <3d375d730902082002t71ca184dy697d9a8e07ccb54f@mail.gmail.com> <498FB163.9030907@gmail.com> <3d375d730902082043u344994cbr4ca2740141df9525@mail.gmail.com> Message-ID: <498FB7EC.9040906@gmail.com> Robert Kern wrote: >> I believe Ondrej was talking about using Cython to reduce call overhead, >> not to Cythonize mpmath which might or might not pay off. > > Actually, that's all I meant by "Cythonize". You're right that it's > not quite the right word. > >> Given that Cephes does not do arbitrary precision taking a look at >> mpmath before you decide to reinvent the wheel seems like a good idea. > > Actually, we weren't talking about multiprecision at all until you > brought it up. Fair enough. For the Sage people the fact that Cephes is limited in precision and quite buggy is a major issue. That is why we are looking at mpmath. > The point of using the boost code wasn't the > multiprecision aspect, but just that it was an alternative > double-precision implementation that looked like it didn't have the > bugs Cephes has. By the way: In Sage we disabled all the special mtune and sse flags being set for the gfortran since it produced crashes for special functions all over the map on common architectures, i.e. Linux/P4 with gfortran 4.2.x as well as gfortran 4.3.x. On Solaris 10 I ended up using gfortran 4.3.2 for scipy and it works well, i.e. no crashes while gfortran 4.2.4 and earlier was a complete and buggy disaster.
So someone might want to put a big warning in the release notes or the wiki that gfortran 4.2.x and Solaris do not play well together with numpy as well as scipy. > Multiprecision is an entirely different discussion, which goes some > way towards explaining why I was confused why you and Ondrej think > it's a good fit. Those pesky non-engineering mathematicians and physicists :) Cheers, Michael From david at ar.media.kyoto-u.ac.jp Sun Feb 8 23:46:34 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 09 Feb 2009 13:46:34 +0900 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: <498FB041.5080309@gmail.com> References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <5b8d13220902081710n53ed4241t254dc99728eec6db@mail.gmail.com> <498FB041.5080309@gmail.com> Message-ID: <498FB52A.5090603@ar.media.kyoto-u.ac.jp> Hi Michael, Michael Abshoff wrote: > Since I assume some people around here are interested in arbitrary > precisions and after looking some more at the documentation it seems > that that library only supports this via using an NTL type which in turn > uses GMP. NTL itself is GPLed, GMP is LGPL, so either one does not fit > the licensing requirements of Scipy. > > David mentioned to write a library from scratch and I also assume that > you want arbitrary precisions. Actually, I did not think about arbitrary precision at all. Not that it would not be nice, but I simply know nothing about the topic :) I am far from being entirely convinced by my own suggestion of writing from scratch: it is a huge amount of work, and the possibilities to get it wrong are numerous. But I am wondering whether we have a choice: the only existing code which is licence compatible with scipy is the one we use now (cephes, specfun, etc...), and the code is not great (cephes is not even ANSI C, for example), except for toms maybe. 
My suggestion is based on how R is doing things; I tend to consider R as a reference, or at least a pretty good baseline when precision and accuracy are concerned. R does not use Cephes nor specfun. We of course can't use R code, but we could at least take inspiration from their references (e.g. citations for implementation), and use R for testing. I also think one of the problems of scipy.special is its size - there are so many functions, with little to no testing. So maybe we could make up a list of a small subset of functions which are 'essential', and focus on them (we would of course keep the current code). I don't know whether such a small subset exists. For information, nmath (the core maths routines in R) is ~7500 LOC according to sloccount, and they have ~ 100 functions, maybe that would be a good subset ? I also have no idea how to test those functions: when someone says their function is precise up to 1e-6, is it against some theoretical values which are computable (like gamma(0.5) = sqrt(pi)), or from theoretical considerations on the implementation ? Pauli talked about point tests, but it sounds hard to get the right points for testing ?
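A minimal sketch of the kind of point test discussed above, assuming Python's standard-library math.gamma as the function under test. It only stands in for a scipy.special routine and is not code from this thread; the helper names rel_err and gamma_point_checks are hypothetical. The checks use values and identities known in closed form: Gamma(1/2) = sqrt(pi), Gamma(n+1) = n!, the recurrence Gamma(x+1) = x Gamma(x), and the reflection formula Gamma(x) Gamma(1-x) = pi / sin(pi x).

```python
import math


def rel_err(got, expected):
    """Relative error of `got` with respect to a nonzero `expected`."""
    return abs(got - expected) / abs(expected)


def gamma_point_checks():
    """Return (label, relative error) pairs for math.gamma.

    Every reference value comes from an exact identity, so no external
    table of "true" values is needed.
    """
    checks = []

    # Closed-form value: Gamma(1/2) = sqrt(pi).
    checks.append(("gamma(0.5) vs sqrt(pi)",
                   rel_err(math.gamma(0.5), math.sqrt(math.pi))))

    # Integer arguments: Gamma(n + 1) = n!.
    for n in (1, 5, 10, 20):
        checks.append(("gamma(%d) vs %d!" % (n + 1, n),
                       rel_err(math.gamma(n + 1), float(math.factorial(n)))))

    # Recurrence: Gamma(x + 1) = x * Gamma(x).
    for x in (0.1, 3.7, 25.5):
        checks.append(("recurrence at x=%g" % x,
                       rel_err(math.gamma(x + 1.0), x * math.gamma(x))))

    # Reflection: Gamma(x) * Gamma(1 - x) = pi / sin(pi * x).
    for x in (0.1, 0.3, 0.75):
        checks.append(("reflection at x=%g" % x,
                       rel_err(math.gamma(x) * math.gamma(1.0 - x),
                               math.pi / math.sin(math.pi * x))))
    return checks


if __name__ == "__main__":
    for label, err in gamma_point_checks():
        print("%-28s rel. error = %.2e" % (label, err))
```

Identities like these only check self-consistency at the chosen points. They do not replace a comparison against an independent high-precision reference in transition regions or near singularities.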
cheers, David From david at ar.media.kyoto-u.ac.jp Sun Feb 8 23:48:07 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 09 Feb 2009 13:48:07 +0900 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: <498FB7EC.9040906@gmail.com> References: <498F2B45.1080008@gmail.com> <85b5c3130902081612wfabda9of8eaa3530e8d2309@mail.gmail.com> <3d375d730902081628m2a02f96cua4debfbd3ebeffeb@mail.gmail.com> <85b5c3130902081938n6c42bdb5l33b626316c41fe67@mail.gmail.com> <3d375d730902082002t71ca184dy697d9a8e07ccb54f@mail.gmail.com> <498FB163.9030907@gmail.com> <3d375d730902082043u344994cbr4ca2740141df9525@mail.gmail.com> <498FB7EC.9040906@gmail.com> Message-ID: <498FB587.5080308@ar.media.kyoto-u.ac.jp> Michael Abshoff wrote: > > By the way: In Sage we disabled all the special mtune and sse flags > being set for the gfortran since it produced crashes for special > functions all over the map on common architectures, i.e. Linux/P4 with > gfortran 4.2.x as well as gfortran 4.3.x. > We disabled them as well in numpy trunk, so starting at 1.3, this should not be a problem. David From pav at iki.fi Mon Feb 9 03:35:06 2009 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 9 Feb 2009 08:35:06 +0000 (UTC) Subject: [SciPy-dev] Bessel functions from Boost References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <5b8d13220902081710n53ed4241t254dc99728eec6db@mail.gmail.com> <498FB041.5080309@gmail.com> Message-ID: Sun, 08 Feb 2009 20:25:37 -0800, Michael Abshoff wrote: > David Cournapeau wrote: >> This is much better - I really don't see the point of using C++ for >> math functions. I am ok with this. > > Out of curiosity: I checked the boost website and it states for the math > lib: > > "All the implementations are fully generic and support the use of > arbitrary "real-number" types, although they are optimised for use with > types with known-about significand (or mantissa) sizes: typically float, > double or long double." 
> > Since I assume some people around here are interested in arbitrary > precisions and after looking some more at the documentation it seems > that that library only supports this via using an NTL type which in turn > uses GMP. NTL itself is GPLed, GMP is LGPL, so either one does not fit > the licensing requirements of Scipy. Yeh, arbitrary precision could be nice in principle. But as I see it, at the moment it's out-of-scope for Scipy. Right now, we only need good implementations in double precision. These we can get for some functions for example by adapting the Boost code (and re-testing it) -- this is much less work than rewriting everything from scratch. > David mentioned to write a library from scratch and I also assume that > you want arbitrary precisions. Given that GMP is LGPL, the arbitrary > precisions code in OpenSSL is covered by a BSD advertising clause (which > might or might not be a deal breaker around here) what do you suggest to > do about arbitrary precisions? I am not aware of any BSD 2 or 3 clause > license library besides mpmath :) For the present, I'd say that we should leave the arbitrary-precision functions implemented in mpmath. -- Pauli Virtanen From ondrej at certik.cz Mon Feb 9 04:20:11 2009 From: ondrej at certik.cz (Ondrej Certik) Date: Mon, 9 Feb 2009 01:20:11 -0800 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <5b8d13220902081710n53ed4241t254dc99728eec6db@mail.gmail.com> <498FB041.5080309@gmail.com> Message-ID: <85b5c3130902090120v46e96e18ie390fc0278551d08@mail.gmail.com> On Mon, Feb 9, 2009 at 12:35 AM, Pauli Virtanen wrote: > Sun, 08 Feb 2009 20:25:37 -0800, Michael Abshoff wrote: >> David Cournapeau wrote: >>> This is much better - I really don't see the point of using C++ for >>> math functions. I am ok with this. 
>> >> Out of curiosity: I checked the boost website and it states for the math >> lib: >> >> "All the implementations are fully generic and support the use of >> arbitrary "real-number" types, although they are optimised for use with >> types with known-about significand (or mantissa) sizes: typically float, >> double or long double." >> >> Since I assume some people around here are interested in arbitrary >> precisions and after looking some more at the documentation it seems >> that that library only supports this via using an NTL type which in turn >> uses GMP. NTL itself is GPLed, GMP is LGPL, so either one does not fit >> the licensing requirements of Scipy. > > Yeh, arbitrary precision could be nice in principle. > > But as I see it, at the moment it's out-of-scope for Scipy. Right now, we > only need good implementations in double precision. These we can get for > some functions for example by adapting the Boost code (and re-testing it) > -- this is much less work than rewriting everything from scratch. > >> David mentioned to write a library from scratch and I also assume that >> you want arbitrary precisions. Given that GMP is LGPL, the arbitrary >> precisions code in OpenSSL is covered by a BSD advertising clause (which >> might or might not be a deal breaker around here) what do you suggest to >> do about arbitrary precisions? I am not aware of any BSD 2 or 3 clause >> license library besides mpmath :) > > For the present, I'd say that we should leave the arbitrary-precision > functions implemented in mpmath. Right. For double precision I think mpmath is not so fast. Fredrik, is it difficult to make mpmath fast even for double precision? Last time I asked: http://groups.google.com/group/mpmath/browse_thread/thread/bca53c3382945c34/ you replied: " SciPy already provides a truckload of machine precision special functions, with excellent (fast and robust) implementations. It'd be hard to top that. " But apparently, maybe mpmath can be useful. 
Ondrej From pav at iki.fi Mon Feb 9 05:26:50 2009 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 9 Feb 2009 10:26:50 +0000 (UTC) Subject: [SciPy-dev] Bessel functions from Boost References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <5b8d13220902081710n53ed4241t254dc99728eec6db@mail.gmail.com> <498FB041.5080309@gmail.com> <85b5c3130902090120v46e96e18ie390fc0278551d08@mail.gmail.com> Message-ID: Mon, 09 Feb 2009 01:20:11 -0800, Ondrej Certik wrote: [clip] > Right. For double precision I think mpmath is not so fast. Fredrik, is > it difficult to make mpmath fast even for double precision? > > Last time I asked: > > http://groups.google.com/group/mpmath/browse_thread/thread/bca53c3382945c34/ > > you replied: > > " > SciPy already provides a truckload of machine precision special > functions, with excellent (fast and robust) implementations. It'd be > hard to top that. > " > > But apparently, maybe mpmath can be useful. I'd say that mpmath faces the same robustness and testing issues as Scipy with regard to special functions. (In addition, since it's written in Python, I'd assume it also faces additional performance issues.) Also, algorithms that work well in arbitrary precision might not work for limited precision, due to loss of precision or under/overflows in intermediate steps. Looking at the Bessel function implementations in mpmath/functions.py, I'd say that at least besselj and besseli would face overflow issues for large arguments if they were working in double precision. These kinds of issues are actually the most difficult to get right. To clarify: Definitely I think that mpmath is great work, and I'm happy to see people working on it, including improvements to its special function library. But at the present, I think the path of least resistance for Scipy is to continue using, testing, and improving existing implementations of special function codes, written in C or F77, directly for limited precision.
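The point above about algorithms that are fine in arbitrary precision losing accuracy in double precision can be illustrated with a small sketch. It sums the textbook Maclaurin series for J0 once in floats and once in exact rational arithmetic. This is only an illustration under stated assumptions: the function name j0_series and the choice x = 50 are made up for the example, and the series is not taken from mpmath or Scipy. For x = 50 the individual terms grow to the order of 1e19 before cancelling, so rounding errors in double precision swamp a result that is smaller than 1 in magnitude.

```python
from fractions import Fraction


def j0_series(x, terms=120):
    """Sum the Maclaurin series J0(x) = sum_k (-1)^k (x/2)^(2k) / (k!)^2.

    Hypothetical helper for illustration only. The arithmetic follows the
    type of x: float gives double precision, Fraction gives exact rationals.
    """
    q = (x / 2) ** 2           # (x/2)^2, in the same type as x
    term = type(x)(1)          # first series term, k = 0
    total = term
    for k in range(1, terms):
        term = -term * q / (k * k)   # ratio of consecutive series terms
        total += term
    return total


x = 50
in_double = j0_series(float(x))             # every step rounded to double
exact_sum = float(j0_series(Fraction(x)))   # rounded only once, at the end

print("series summed in double precision:", in_double)
print("series summed exactly, then rounded:", exact_sum)
```

The same formula gives a usable answer only when the intermediate terms are carried exactly (or with enough extra digits), which is why a limited-precision implementation typically switches to a different method, such as an asymptotic expansion, for large arguments.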
-- Pauli Virtanen From david at ar.media.kyoto-u.ac.jp Mon Feb 9 05:35:10 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 09 Feb 2009 19:35:10 +0900 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <5b8d13220902081710n53ed4241t254dc99728eec6db@mail.gmail.com> <498FB041.5080309@gmail.com> Message-ID: <499006DE.5030707@ar.media.kyoto-u.ac.jp> Pauli Virtanen wrote: > Sun, 08 Feb 2009 20:25:37 -0800, Michael Abshoff wrote: > >> David Cournapeau wrote: >> >>> This is much better - I really don't see the point of using C++ for >>> math functions. I am ok with this. >>> >> Out of curiosity: I checked the boost website and it states for the math >> lib: >> >> "All the implementations are fully generic and support the use of >> arbitrary "real-number" types, although they are optimised for use with >> types with known-about significand (or mantissa) sizes: typically float, >> double or long double." >> >> Since I assume some people around here are interested in arbitrary >> precisions and after looking some more at the documentation it seems >> that that library only supports this via using an NTL type which in turn >> uses GMP. NTL itself is GPLed, GMP is LGPL, so either one does not fit >> the licensing requirements of Scipy. >> > > Yeh, arbitrary precision could be nice in principle. > > But as I see it, at the moment it's out-of-scope for Scipy. Right now, we > only need good implementations in double precision. These we can get for > some functions for example by adapting the Boost code (and re-testing it) > -- this is much less work than rewriting everything from scratch. > Are you familiar with boost testing ? I wonder whether it would be possible to automatically convert it to something usable for scipy (the .ipp files which contain the data should be relatively easy to convert to python, they all look the same with almost no code at all). 
cheers, David From tom.grydeland at gmail.com Mon Feb 9 06:39:41 2009 From: tom.grydeland at gmail.com (Tom Grydeland) Date: Mon, 9 Feb 2009 12:39:41 +0100 Subject: [SciPy-dev] some scipy.interpolate functions appear to be inaccessible Message-ID: Hi developers, I tried figuring out how to use scipy.interpolate to create cubic splines with "natural" boundary conditions (i.e. y'' = 0 at the end points), but failed. When I look in the source file for scipy/interpolate/interpolate.py it certainly looks like there is code to create the tridiagonal system to solve for such splines ( _get_spline3_Bb()), but this function appears to be orphaned. There are a number of functions _find_xxx() that do nothing except raise NotImplementedError, although these should be called from splmake(), from the looks of it. Does anyone know the intended structure of these functions? I could try to make them work, given a roadmap. Regards, -- Tom Grydeland From tom.grydeland at gmail.com Mon Feb 9 06:51:58 2009 From: tom.grydeland at gmail.com (Tom Grydeland) Date: Mon, 9 Feb 2009 12:51:58 +0100 Subject: [SciPy-dev] DCT naming conventions ? In-Reply-To: References: <49731006.3050503@ar.media.kyoto-u.ac.jp> <9457e7c80901180725l70e1fa8di1ed7b035cd6e00a7@mail.gmail.com> <5b8d13220901180830t62716f58o4b7d86d26eb6bb5@mail.gmail.com> <827183970901180904r36264d97v2fe01075bd26a6ac@mail.gmail.com> <5b8d13220901190039l5757e35fsb95ccb22dfe8a12@mail.gmail.com> <5b8d13220901200117s75a6033fgcd25d5e941aed016@mail.gmail.com> Message-ID: On Wed, Jan 21, 2009 at 3:31 PM, Tom Grydeland wrote: > Okay, I've formatted the maths (with an eye towards keeping it > readable also as text) and added a link to the online version of the > reference cited. > > Are the limits in the sums all correct now? As promised, I edited the docstrings for maths (using the docs.scipy.org wiki). Since then, changes appear to have been made to the SVN archive without committing the doc edits, so the wiki now shows conflicts. 
Can somebody with SVN access clean this up, please? Also, please confirm that the limits on the sums are all correct. Regards, -- Tom Grydeland From pav at iki.fi Mon Feb 9 07:14:14 2009 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 9 Feb 2009 12:14:14 +0000 (UTC) Subject: [SciPy-dev] DCT naming conventions ? References: <49731006.3050503@ar.media.kyoto-u.ac.jp> <9457e7c80901180725l70e1fa8di1ed7b035cd6e00a7@mail.gmail.com> <5b8d13220901180830t62716f58o4b7d86d26eb6bb5@mail.gmail.com> <827183970901180904r36264d97v2fe01075bd26a6ac@mail.gmail.com> <5b8d13220901190039l5757e35fsb95ccb22dfe8a12@mail.gmail.com> <5b8d13220901200117s75a6033fgcd25d5e941aed016@mail.gmail.com> Message-ID: Mon, 09 Feb 2009 12:51:58 +0100, Tom Grydeland wrote: > On Wed, Jan 21, 2009 at 3:31 PM, Tom Grydeland > wrote: > >> Okay, I've formatted the maths (with an eye towards keeping it readable >> also as text) and added a link to the online version of the reference >> cited. >> >> Are the limits in the sums all correct now? > > As promised, I edited the docstrings for maths (using the docs.scipy.org > wiki). Since then, changes appear to have been made to the SVN archive > without committing the doc edits, so the wiki now shows conflicts. > > Can somebody with SVN access clean this up, please? The procedure with conflicts in wiki is to resolve them in the wiki, so that the changes can be easily propagated to SVN. Anyone with edit permissions in the wiki can do this, by editing the docstring and resolving the conflicting parts. 
-- Pauli Virtanen From fredrik.johansson at gmail.com Mon Feb 9 07:19:45 2009 From: fredrik.johansson at gmail.com (Fredrik Johansson) Date: Mon, 9 Feb 2009 13:19:45 +0100 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: <85b5c3130902090120v46e96e18ie390fc0278551d08@mail.gmail.com> References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <5b8d13220902081710n53ed4241t254dc99728eec6db@mail.gmail.com> <498FB041.5080309@gmail.com> <85b5c3130902090120v46e96e18ie390fc0278551d08@mail.gmail.com> Message-ID: <3d0cebfb0902090419l27c8b4c6sa695657aab602783@mail.gmail.com> On Mon, Feb 9, 2009 at 10:20 AM, Ondrej Certik wrote: > > Right. For double precision I think mpmath is not so fast. Fredrik, is > it difficult to make mpmath fast even for double precision? Fast and accurate machine floating-point implementations require entirely different algorithms. One also needs to piece together multiple algorithms to handle cases where, in arbitrary-precision arithmetic, increasing some parameter is sufficient. I can see a use for a pure-Python double precision special functions library. I don't think mpmath should be it, at least not for now; getting the arbitrary-precision algorithms right is enough work. Should someone be interested in writing such a library, they could use mpmath to test accuracy against, and even to generate Chebyshev approximations and the like. Fredrik From charlesr.harris at gmail.com Mon Feb 9 11:28:42 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 9 Feb 2009 09:28:42 -0700 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <5b8d13220902081710n53ed4241t254dc99728eec6db@mail.gmail.com> <498FB041.5080309@gmail.com> <85b5c3130902090120v46e96e18ie390fc0278551d08@mail.gmail.com> Message-ID: On Mon, Feb 9, 2009 at 3:26 AM, Pauli Virtanen wrote: > Mon, 09 Feb 2009 01:20:11 -0800, Ondrej Certik wrote: > [clip] > > Right. For double precision I think mpmath is not so fast. 
Fredrik, is > > it difficult to make mpmath fast even for double precision? > > > > Last time I asked: > > > > http://groups.google.com/group/mpmath/browse_thread/thread/ > bca53c3382945c34/ > > > > you replied: > > > > " > > SciPy already provides a truckload of machine precision special > > functions, with excellent (fast and robust) implementations. It'd be > > hard to top that. > > " > > > > But apparently, maybe mpmath can be useful. > > I'd say that mpmath faces the same robustness and testing issues as Scipy > with regard to special functions. (In addition, since it's written in > Python, I'd assume it also faces additional performance issues.) > > Also, algorithms that work well in arbitrary precision might not work for > limited precision, due to loss of precision or under/overflows in > intermediate steps. Looking at the Bessel function implementations in > mpmath/functions.py, I'd say that at least besselj and besseli would face > overflow issues for large arguments if they were working in double > precision. This kind of issues are actually the most difficult to get > right. > > To clarify: Definitely I think that mpmath is great work, and I'm happy > to see people working on it, including improvements to its special > function library. > > But at the present, I think the path of least resistance for Scipy is to > continue using, testing, and improving existing implementations of > special function codes, written in C or F77, directly for limited > precision. > Here are some references for both testing and implementations: http://math.nist.gov/mcsd/Reports/2001/nesf/ . I found Cody's book(s) a good reference back in the day. I wonder if anything happened with this proposal: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.44.7298 ? Note the references to Cody which are to works with both implementations and tests. I think Cody was responsible for specfun, have there been problems with that package? 
Chuck From pav at iki.fi Mon Feb 9 12:05:05 2009 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 9 Feb 2009 17:05:05 +0000 (UTC) Subject: [SciPy-dev] Bessel functions from Boost References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <5b8d13220902081710n53ed4241t254dc99728eec6db@mail.gmail.com> <498FB041.5080309@gmail.com> <85b5c3130902090120v46e96e18ie390fc0278551d08@mail.gmail.com> Message-ID: Mon, 09 Feb 2009 09:28:42 -0700, Charles R Harris wrote: [clip] > Here are some references for both testing and implementations: > http://math.nist.gov/mcsd/Reports/2001/nesf/ . I found Cody's book(s) a > good reference back in the day. > > I wonder if anything happened with this proposal: > http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.44.7298 ? Thanks for the references. > Note the references to Cody which are to works with both > implementations and tests. I think Cody was responsible for specfun, > have there been problems with that package? Confusingly enough, the 'specfun.f' in Scipy is AFAICS not Cody's netlib.org/specfun. Cody's code does have tests. -- Pauli Virtanen From oliphant at enthought.com Mon Feb 9 18:54:06 2009 From: oliphant at enthought.com (Travis E. Oliphant) Date: Mon, 09 Feb 2009 18:54:06 -0500 Subject: [SciPy-dev] some scipy.interpolate functions appear to be inaccessible In-Reply-To: References: Message-ID: <4990C21E.1010301@enthought.com> Tom Grydeland wrote: > Hi developers, > > I tried figuring out how to use scipy.interpolate to create cubic > splines with "natural" boundary conditions (i.e. y'' = 0 at the end > points), but failed. When I look in the source file for > scipy/interpolate/interpolate.py it certainly looks like there is code > to create the tridiagonal system to solve for such splines ( > _get_spline3_Bb()), but this function appears to be orphaned.
There > are a number of functions _find_xxx() that do nothing except raise > NotImplementedError, although these should be called from splmake(), > from the looks of it. > > I'm the one who started the process of trying to add more options to the creation of cubic splines. Much of the infrastructure was created and a few options were provided as well as some low-level operations. However, as you noticed not all of the details are worked out for all the possible options. Basically, the issue is that splines for order N have N-1 additional degrees of freedom and I wanted there to be some way to both provide for an arbitrary setting of these degrees of freedom as well as an easy interface to the "typical" ways the degrees of freedom are resolved. > Does anyone know the intended structure of these functions? I could > try to make them work, given a roadmap. > If you have time to work on the code, I could answer questions, but don't have a roadmap to share. I'm teaching this week --- maybe I'll have time to work on the interpolation again --- but don't count on it. Best regards, -Travis -- Travis Oliphant Enthought, Inc. (512) 536-1057 (office) (512) 536-1059 (fax) http://www.enthought.com oliphant at enthought.com From david at ar.media.kyoto-u.ac.jp Tue Feb 10 05:07:13 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 10 Feb 2009 19:07:13 +0900 Subject: [SciPy-dev] DCT naming conventions ? 
In-Reply-To: References: <49731006.3050503@ar.media.kyoto-u.ac.jp> <9457e7c80901180725l70e1fa8di1ed7b035cd6e00a7@mail.gmail.com> <5b8d13220901180830t62716f58o4b7d86d26eb6bb5@mail.gmail.com> <827183970901180904r36264d97v2fe01075bd26a6ac@mail.gmail.com> <5b8d13220901190039l5757e35fsb95ccb22dfe8a12@mail.gmail.com> <5b8d13220901200117s75a6033fgcd25d5e941aed016@mail.gmail.com> Message-ID: <499151D1.1030106@ar.media.kyoto-u.ac.jp> Tom Grydeland wrote: > On Wed, Jan 21, 2009 at 3:31 PM, Tom Grydeland wrote: > > >> Okay, I've formatted the maths (with an eye towards keeping it >> readable also as text) and added a link to the online version of the >> reference cited. >> >> Are the limits in the sums all correct now? >> > > As promised, I edited the docstrings for maths (using the > docs.scipy.org wiki). Since then, changes appear to have been made to > the SVN archive without committing the doc edits, so the wiki now > shows conflicts. > > Can somebody with SVN access clean this up, please? > > Also, please confirm that the limits on the sums are all correct. > The limits look correct now, thank you. I would prefer keeping the non-LaTeX formulas, though, because the LaTeX ones are not readable from the terminal - the LaTeX formulas could be kept in a module-level discussion, but I don't think they are appropriate for docstrings. cheers, David From Miroslav.Houdek at esa.int Tue Feb 10 05:19:14 2009 From: Miroslav.Houdek at esa.int (Miroslav.Houdek at esa.int) Date: Tue, 10 Feb 2009 11:19:14 +0100 Subject: [SciPy-dev] SciLAB Message-ID: I needed to read MatLAB and SciLAB files and found only MatLAB support in Numpy/Scipy, so I developed the support for SciLAB files. Now it's working with basic data types, so it works all right for the task I needed. Would anyone be interested in including it in Scipy.Io? It would need some extra effort to include more obscure data types and stuff like that. If so, what should I do?
Best regards, Miroslav -------------------------------------------------------- Bla, bla, bla, something deep as usual in mailing lists, nothing is as it seems, this and that sucks while other things rule. From tom.grydeland at gmail.com Tue Feb 10 06:39:16 2009 From: tom.grydeland at gmail.com (Tom Grydeland) Date: Tue, 10 Feb 2009 12:39:16 +0100 Subject: [SciPy-dev] DCT naming conventions ? In-Reply-To: <5b8d13220901200117s75a6033fgcd25d5e941aed016@mail.gmail.com> References: <49731006.3050503@ar.media.kyoto-u.ac.jp> <9457e7c80901180725l70e1fa8di1ed7b035cd6e00a7@mail.gmail.com> <5b8d13220901180830t62716f58o4b7d86d26eb6bb5@mail.gmail.com> <827183970901180904r36264d97v2fe01075bd26a6ac@mail.gmail.com> <5b8d13220901190039l5757e35fsb95ccb22dfe8a12@mail.gmail.com> <5b8d13220901200117s75a6033fgcd25d5e941aed016@mail.gmail.com> Message-ID: >> On Mon, Jan 19, 2009 at 11:55 AM, Pauli Virtanen wrote: >>> This seems to be a use case for the math:: directive. > On Mon, Jan 19, 2009 at 8:13 PM, Tom Grydeland wrote: >> I thought so also. I'll volunteer to do it if David is too busy. On Tue, Jan 20, 2009 at 10:17 AM, David Cournapeau wrote: > Sure, go ahead. On Tue, Feb 10, 2009 at 11:07 AM, David Cournapeau wrote: > The limits look correct, now, thank you. I would prefer keeping the non > latex formula, though, because they are not readable from the terminal - > the latex formula could be kept in a module-level discussion, but I > don't think they are appropriate for docstrings. Any way is fine with me; redoing things twice is bad enough, three times is a real drag. The math version is there now, the text version is in the previous revision, do with them as you see fit, I'll go edit something else.
Regards, -- Tom Grydeland From cournape at gmail.com Tue Feb 10 07:39:09 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 10 Feb 2009 21:39:09 +0900 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: <499006DE.5030707@ar.media.kyoto-u.ac.jp> References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <5b8d13220902081710n53ed4241t254dc99728eec6db@mail.gmail.com> <498FB041.5080309@gmail.com> <499006DE.5030707@ar.media.kyoto-u.ac.jp> Message-ID: <5b8d13220902100439i68a073cdw124e366795f87453@mail.gmail.com> On Mon, Feb 9, 2009 at 7:35 PM, David Cournapeau wrote: > Pauli Virtanen wrote: >> Sun, 08 Feb 2009 20:25:37 -0800, Michael Abshoff wrote: >> >>> David Cournapeau wrote: >>> >>>> This is much better - I really don't see the point of using C++ for >>>> math functions. I am ok with this. >>>> >>> Out of curiosity: I checked the boost website and it states for the math >>> lib: >>> >>> "All the implementations are fully generic and support the use of >>> arbitrary "real-number" types, although they are optimised for use with >>> types with known-about significand (or mantissa) sizes: typically float, >>> double or long double." >>> >>> Since I assume some people around here are interested in arbitrary >>> precisions and after looking some more at the documentation it seems >>> that that library only supports this via using an NTL type which in turn >>> uses GMP. NTL itself is GPLed, GMP is LGPL, so either one does not fit >>> the licensing requirements of Scipy. >>> >> >> Yeh, arbitrary precision could be nice in principle. >> >> But as I see it, at the moment it's out-of-scope for Scipy. Right now, we >> only need good implementations in double precision. These we can get for >> some functions for example by adapting the Boost code (and re-testing it) >> -- this is much less work than rewriting everything from scratch. >> > > Are you familiar with boost testing ? 
I wonder whether it would be > possible to automatically convert it to something usable for scipy (the > .ipp files which contain the data should be relatively easy to convert > to python, they all look the same with almost no code at all). I started to work on this, actually. Getting the test data is easy, but there is still some manual work to know which function is used for which data (it does not look like the files are consistent enough so that the mapping data file -> function tested can be done automatically). Shall I integrate this into scipy ? cheers, David From charlesr.harris at gmail.com Tue Feb 10 08:08:23 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 10 Feb 2009 06:08:23 -0700 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: <5b8d13220902100439i68a073cdw124e366795f87453@mail.gmail.com> References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <5b8d13220902081710n53ed4241t254dc99728eec6db@mail.gmail.com> <498FB041.5080309@gmail.com> <499006DE.5030707@ar.media.kyoto-u.ac.jp> <5b8d13220902100439i68a073cdw124e366795f87453@mail.gmail.com> Message-ID: On Tue, Feb 10, 2009 at 5:39 AM, David Cournapeau wrote: > On Mon, Feb 9, 2009 at 7:35 PM, David Cournapeau > wrote: > > Pauli Virtanen wrote: > >> Sun, 08 Feb 2009 20:25:37 -0800, Michael Abshoff wrote: > >> > >>> David Cournapeau wrote: > >>> > >>>> This is much better - I really don't see the point of using C++ for > >>>> math functions. I am ok with this. > >>>> > >>> Out of curiosity: I checked the boost website and it states for the > math > >>> lib: > >>> > >>> "All the implementations are fully generic and support the use of > >>> arbitrary "real-number" types, although they are optimised for use with > >>> types with known-about significand (or mantissa) sizes: typically > float, > >>> double or long double." 
> >>> > >>> Since I assume some people around here are interested in arbitrary > >>> precisions and after looking some more at the documentation it seems > >>> that that library only supports this via using an NTL type which in turn > >>> uses GMP. NTL itself is GPLed, GMP is LGPL, so either one does not fit > >>> the licensing requirements of Scipy. > >>> > >> > >> Yeh, arbitrary precision could be nice in principle. > >> > >> But as I see it, at the moment it's out-of-scope for Scipy. Right now, we > >> only need good implementations in double precision. These we can get for > >> some functions for example by adapting the Boost code (and re-testing it) > >> -- this is much less work than rewriting everything from scratch. > >> > > > > Are you familiar with boost testing ? I wonder whether it would be > > possible to automatically convert it to something usable for scipy (the > > .ipp files which contain the data should be relatively easy to convert > > to python, they all look the same with almost no code at all). > > I started to work on this, actually. Getting the test data is easy, > but there is still some manual work to know which function is used for > which data (it does not look like the files are consistent enough so > that the mapping data file -> function tested can be done > automatically). > > Shall I integrate this into scipy ? > I think you should also take a look at Cody's netlib.org/specfun. It has the Bessel functions along with the tests and Cody does careful work with attention to detail and error. As Pauli says, the netlib specfun isn't the same as the specfun in Scipy. The notes in specfun are also worth a read. Chuck
From pav at iki.fi Tue Feb 10 11:15:15 2009 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 10 Feb 2009 16:15:15 +0000 (UTC) Subject: [SciPy-dev] Bessel functions from Boost References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <5b8d13220902081710n53ed4241t254dc99728eec6db@mail.gmail.com> <498FB041.5080309@gmail.com> <499006DE.5030707@ar.media.kyoto-u.ac.jp> <5b8d13220902100439i68a073cdw124e366795f87453@mail.gmail.com> Message-ID: Tue, 10 Feb 2009 21:39:09 +0900, David Cournapeau wrote: [clip: tests for special functions] > I started to work on this, actually. Getting the test data is easy, but > there is still some manual work to know which function is used for which > data (it does not look like the files are consistent enough so that the > mapping data file -> function tested can be done automatically). > > Shall I integrate this into scipy ? Please do. It can't hurt to have better tests. And we can have them committed even before we start to fix any bugs they expose. Btw, it would be nice if buildbot.scipy.org also handled Scipy in addition to Numpy. (Buildbot unfortunately has no real support for multiple projects, so this would require running a second buildmaster daemon on a separate port.) Meanwhile, there's a dump of local buildbot results here: http://www.iki.fi/pav/tmp/bb/scipy/waterfall/ but this machine can't really be used as a build master as it isn't online 24/7. -- Pauli Virtanen From charlesr.harris at gmail.com Tue Feb 10 11:17:13 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 10 Feb 2009 09:17:13 -0700 Subject: [SciPy-dev] SciLAB In-Reply-To: References: Message-ID: On Tue, Feb 10, 2009 at 3:19 AM, wrote: > > I needed to read MatLAB and SciLAB files and found only MatLAB support in > Numpy/Scipy, so I developed the support for SciLAB files. Now it's working > with basic data types, so it works all right for the task I needed. Would > anyone be interested in including it in Scipy.Io?
It would need some extra > effort to include more obscure data types and stuff like that. If so, what > should I do? > I can't think of any reason not to have this as long as someone maintains it. Anyone else have an opinion about this? Chuck From jason-sage at creativetrax.com Tue Feb 10 11:32:55 2009 From: jason-sage at creativetrax.com (jason-sage at creativetrax.com) Date: Tue, 10 Feb 2009 10:32:55 -0600 Subject: [SciPy-dev] 0.7 release Message-ID: <4991AC37.4030402@creativetrax.com> I am curious about the status of the 0.7 release. IIRC, there was a scipy 0.7rc2 tagged about two weeks ago. Thanks, Jason From eads at soe.ucsc.edu Tue Feb 10 11:56:32 2009 From: eads at soe.ucsc.edu (Damian Eads) Date: Tue, 10 Feb 2009 08:56:32 -0800 Subject: [SciPy-dev] 0.7 release In-Reply-To: <4991AC37.4030402@creativetrax.com> References: <4991AC37.4030402@creativetrax.com> Message-ID: <91b4b1ab0902100856t55c01615yab6472a71245a967@mail.gmail.com> The source tarball for 0.7.0 rc 2 is now on SourceForge and is the default download. On 2/10/09, jason-sage at creativetrax.com wrote: > I am curious about the status of the 0.7 release. IIRC, there was a > scipy 0.7rc2 tagged about two weeks ago. > > Thanks, > > Jason > > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > -- Sent from my mobile device ----------------------------------------------------- Damian Eads Ph.D.
Student Jack Baskin School of Engineering, UCSC E2-489 1156 High Street Machine Learning Lab Santa Cruz, CA 95064 http://www.soe.ucsc.edu/~eads From eads at soe.ucsc.edu Tue Feb 10 12:02:34 2009 From: eads at soe.ucsc.edu (Damian Eads) Date: Tue, 10 Feb 2009 09:02:34 -0800 Subject: [SciPy-dev] SciLAB In-Reply-To: References: Message-ID: <91b4b1ab0902100902t273ee2a1r4a1a49dbb5614b9e@mail.gmail.com> I'm supportive of including it as long as you are interested in polishing it, documenting it consistent with standards, and writing regression tests with nose, or can find others to do these tasks. Damian On 2/10/09, Miroslav.Houdek at esa.int wrote: > I needed to read MatLAB and SciLAB files and found only MatLAB support in > Numpy/Scipy, so I developed the support for SciLAB files. Now it's working > with basic data types, so it works allright for the task I needed. Would > anyone be interested in including it in Scipy.Io? It would need some extra > effort to include more obscure data types and stuff like that. If so, what > should I do? > > Best regards, > Miroslav > > -------------------------------------------------------- > Bla, bla, bla, something deep as usual in mailing lists, nothing is as it > seems, this and that sucks while other things rule. -- Sent from my mobile device ----------------------------------------------------- Damian Eads Ph.D. 
Student Jack Baskin School of Engineering, UCSC E2-489 1156 High Street Machine Learning Lab Santa Cruz, CA 95064 http://www.soe.ucsc.edu/~eads From cournape at gmail.com Tue Feb 10 13:31:30 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 11 Feb 2009 03:31:30 +0900 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <5b8d13220902081710n53ed4241t254dc99728eec6db@mail.gmail.com> <498FB041.5080309@gmail.com> <499006DE.5030707@ar.media.kyoto-u.ac.jp> <5b8d13220902100439i68a073cdw124e366795f87453@mail.gmail.com> Message-ID: <5b8d13220902101031i32bfbc29xe70c7e6b45efb75d@mail.gmail.com> On Wed, Feb 11, 2009 at 1:15 AM, Pauli Virtanen wrote: > Tue, 10 Feb 2009 21:39:09 +0900, David Cournapeau wrote: > [clip: tests for special functions] >> I started to work on this, actually. Getting the test data is easy, but >> there is still some manual work to know which function is used for which >> data (it does not look like the files are consistent enough so that the >> mapping data file -> function tested can be done automatically). >> >> Shall I integrate this into scipy ? > > Please do. It can't hurt to have better tests. And we can have them > committed even before we start to fix any bugs they expose. I started a branch, special_refactor. I added all the converted Boost data sets (the .ipp files to .csv), plus the small Python script I used to generate them. I started implementing the corresponding tests - but this takes some time, because of all this template stuff which is awkward to follow. The only thing to do is to find which function is called for which test with which parameter - someone more familiar with boost could do this much faster, I guess.
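For readers who want a feel for the .ipp to .csv step mentioned above, here is a minimal sketch of what such a conversion could look like. It is not the script from the special_refactor branch. The row layout (values wrapped in an SC_() macro inside brace groups), the sample text, and the helper names ipp_rows and rows_to_csv are all assumptions made for illustration, and the numbers are placeholders rather than real test data.

```python
import csv
import io
import re

# Hypothetical excerpt in the style of a Boost test-data (.ipp) file.
# The layout is an assumption; the values are placeholders, not real data.
SAMPLE_IPP = """
static const boost::array<boost::array<T, 3>, 2> example_data = {{
    {{ SC_(1.0), SC_(2.0), SC_(3.0) }},
    {{ SC_(4.0), SC_(5.0), SC_(6.0) }}
}};
"""


def ipp_rows(text):
    """Return one list of literal strings per innermost brace group."""
    rows = []
    # An innermost group is a pair of braces with no braces inside it.
    for group in re.findall(r"\{([^{}]+)\}", text):
        values = re.findall(r"SC_\(\s*([^()\s]+)\s*\)", group)
        if values:
            rows.append(values)
    return rows


def rows_to_csv(rows):
    """Serialise the rows as CSV text, keeping the literals as strings."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


rows = ipp_rows(SAMPLE_IPP)
print(rows_to_csv(rows))
```

Keeping the literals as strings avoids rounding the reference values to double precision during conversion. The harder part described above, working out which function each data file exercises, is not addressed by this sketch.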
David

From doutriaux1 at llnl.gov  Tue Feb 10 14:09:15 2009
From: doutriaux1 at llnl.gov (Charles سمير Doutriaux)
Date: Tue, 10 Feb 2009 11:09:15 -0800
Subject: [SciPy-dev] 0.7 release
In-Reply-To: <91b4b1ab0902100856t55c01615yab6472a71245a967@mail.gmail.com>
References: <4991AC37.4030402@creativetrax.com> <91b4b1ab0902100856t55c01615yab6472a71245a967@mail.gmail.com>
Message-ID: <03EB7A06-654A-486F-A02B-05DC2417B738@llnl.gov>

Thanks Damian,

I believe what Jason was really asking is when will the final version
be released?

I would like to know as well, as we had to release our software with
rc2 instead of final; we simply couldn't wait any longer.

Thanks,

C.

On Feb 10, 2009, at 8:56 AM, Damian Eads wrote:

> The source tarball for 0.7.0 rc 2 is now on SourceForge and is the
> default download.
>
> On 2/10/09, jason-sage at creativetrax.com wrote:
>> I am curious about the status of the 0.7 release. IIRC, there was a
>> scipy 0.7rc2 tagged about two weeks ago.
>>
>> Thanks,
>>
>> Jason
>>
>> _______________________________________________
>> Scipy-dev mailing list
>> Scipy-dev at scipy.org
>> http://projects.scipy.org/mailman/listinfo/scipy-dev
>>
>
> --
> Sent from my mobile device
>
> -----------------------------------------------------
> Damian Eads  Ph.D. Student
> Jack Baskin School of Engineering, UCSC  E2-489
> 1156 High Street  Machine Learning Lab
> Santa Cruz, CA 95064  http://www.soe.ucsc.edu/~eads
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>

From eads at soe.ucsc.edu  Tue Feb 10 14:34:38 2009
From: eads at soe.ucsc.edu (Damian Eads)
Date: Tue, 10 Feb 2009 11:34:38 -0800
Subject: [SciPy-dev] 0.7 release
In-Reply-To: <03EB7A06-654A-486F-A02B-05DC2417B738@llnl.gov>
References: <4991AC37.4030402@creativetrax.com> <91b4b1ab0902100856t55c01615yab6472a71245a967@mail.gmail.com> <03EB7A06-654A-486F-A02B-05DC2417B738@llnl.gov>
Message-ID: <91b4b1ab0902101134l1512af7s4c39bfe41ef52760@mail.gmail.com>

I thought the release candidate *was* final but the betas b1 and b2
were not. My mistake. Sorry. I just learned about the distinction from
here: http://en.wikipedia.org/wiki/Software_release_life_cycle

Damian

On Tue, Feb 10, 2009 at 11:09 AM, Charles سمير Doutriaux wrote:
> Thanks Damian,
>
> I believe what Jason was really asking is when will the final version
> be released?
>
> I would like to know as well, as we had to release our software with
> rc2 instead of final, we simply couldn't wait any longer.
>
> Thanks,
>
> C.
>
> On Feb 10, 2009, at 8:56 AM, Damian Eads wrote:
>
>> The source tarball for 0.7.0 rc 2 is now on SourceForge and is the
>> default download.
>>
>> On 2/10/09, jason-sage at creativetrax.com wrote:
>>> I am curious about the status of the 0.7 release. IIRC, there was a
>>> scipy 0.7rc2 tagged about two weeks ago.
>>>
>>> Thanks,
>>>
>>> Jason
>>>
>>> _______________________________________________
>>> Scipy-dev mailing list
>>> Scipy-dev at scipy.org
>>> http://projects.scipy.org/mailman/listinfo/scipy-dev
>>>
>>
>> --
>> Sent from my mobile device
>>
>> -----------------------------------------------------
>> Damian Eads  Ph.D. Student
>> Jack Baskin School of Engineering, UCSC  E2-489
>> 1156 High Street  Machine Learning Lab
>> Santa Cruz, CA 95064  http://www.soe.ucsc.edu/~eads
>> _______________________________________________
>> Scipy-dev mailing list
>> Scipy-dev at scipy.org
>> http://projects.scipy.org/mailman/listinfo/scipy-dev
>>
>
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>

--
-----------------------------------------------------
Damian Eads  Ph.D. Student
Jack Baskin School of Engineering, UCSC  E2-489
1156 High Street  Machine Learning Lab
Santa Cruz, CA 95064  http://www.soe.ucsc.edu/~eads

From millman at berkeley.edu  Tue Feb 10 15:02:15 2009
From: millman at berkeley.edu (Jarrod Millman)
Date: Tue, 10 Feb 2009 12:02:15 -0800
Subject: [SciPy-dev] 0.7 release
In-Reply-To: <4991AC37.4030402@creativetrax.com>
References: <4991AC37.4030402@creativetrax.com>
Message-ID:

On Tue, Feb 10, 2009 at 8:32 AM, wrote:
> I am curious about the status of the 0.7 release. IIRC, there was a
> scipy 0.7rc2 tagged about two weeks ago.

The release candidate is essentially the final version as there have
been no major regressions or show stoppers of which I am aware. I
spoke with David Cournapeau this morning about making the final
release and we will both have some time over the next few days to get
0.7.0 final out. I will tag the final release and announce it over
the next couple of days.

Jarrod

From doutriaux1 at llnl.gov  Tue Feb 10 23:15:31 2009
From: doutriaux1 at llnl.gov (Charles سمير Doutriaux)
Date: Tue, 10 Feb 2009 20:15:31 -0800
Subject: [SciPy-dev] 0.7 release
In-Reply-To:
References: <4991AC37.4030402@creativetrax.com>
Message-ID:

Hi Jarrod,

This is great news! I figured RC2 was close enough that we could ship
CDAT with it. I'll do a minor update of CDAT whenever this comes out.

Thanks again for all the good work and effort you guys put into this!
It is definitely better to release a working project later than a flaky
one earlier!

C.

On Feb 10, 2009, at 12:02 PM, Jarrod Millman wrote:

> On Tue, Feb 10, 2009 at 8:32 AM, wrote:
>> I am curious about the status of the 0.7 release. IIRC, there was a
>> scipy 0.7rc2 tagged about two weeks ago.
>
> The release candidate is essentially the final version as there have
> been no major regressions or show stoppers of which I am aware. I
> spoke with David Cournapeau this morning about making the final
> release and we will both have some time over the next few days to get
> 0.7.0 final out. I will tag the final release and announce it over
> the next couple of days.
>
> Jarrod
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>

From jason-sage at creativetrax.com  Tue Feb 10 23:26:25 2009
From: jason-sage at creativetrax.com (jason-sage at creativetrax.com)
Date: Tue, 10 Feb 2009 22:26:25 -0600
Subject: [SciPy-dev] 0.7 release
In-Reply-To:
References: <4991AC37.4030402@creativetrax.com>
Message-ID: <49925371.7010107@creativetrax.com>

Jarrod Millman wrote:
> On Tue, Feb 10, 2009 at 8:32 AM, wrote:
>
>> I am curious about the status of the 0.7 release. IIRC, there was a
>> scipy 0.7rc2 tagged about two weeks ago.
>>
>
> The release candidate is essentially the final version as there have
> been no major regressions or show stoppers of which I am aware. I
> spoke with David Cournapeau this morning about making the final
> release and we will both have some time over the next few days to get
> 0.7.0 final out. I will tag the final release and announce it over
> the next couple of days.
>

Thanks!

Jason

From millman at berkeley.edu  Wed Feb 11 03:26:24 2009
From: millman at berkeley.edu (Jarrod Millman)
Date: Wed, 11 Feb 2009 00:26:24 -0800
Subject: [SciPy-dev] ANN: SciPy 0.7.0
Message-ID:

I'm pleased to announce SciPy 0.7.0. SciPy is a package of tools for
science and engineering for Python.
It includes modules for statistics, optimization, integration, linear
algebra, Fourier transforms, signal and image processing, ODE solvers,
and more.

This release comes sixteen months after the 0.6.0 release and contains
many new features, numerous bug-fixes, improved test coverage, and
better documentation. Please note that SciPy 0.7.0 requires Python 2.4
or greater (but not Python 3) and NumPy 1.2.0 or greater.

For information, please see the release notes:
https://sourceforge.net/project/shownotes.php?release_id=660191&group_id=27747

You can download the release from here:
https://sourceforge.net/project/showfiles.php?group_id=27747&package_id=19531&release_id=660191

Thank you to everybody who contributed to this release.

Enjoy,

Jarrod Millman

From njs at pobox.com  Wed Feb 11 05:45:14 2009
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 11 Feb 2009 02:45:14 -0800
Subject: [SciPy-dev] huge speed regression in loadmat from 0.6.0 to 0.7.0
Message-ID: <961fa2b40902110245p4c78c00ar921fdaa81939a684@mail.gmail.com>

The improvements to loadmat in 0.7.0 are wonderful! Thanks for the
work on that. But I've run into one snag... on a matlab file I care
about:

$ easy_install scipy==0.6.0
$ time python -c 'import scipy.io; scipy.io.loadmat("test.mat")'
real    0m4.172s
user    0m2.908s
sys     0m1.056s

$ easy_install scipy==0.7.0
$ time python -c 'import scipy.io; scipy.io.loadmat("test.mat")'
real    3m10.556s
user    1m14.713s
sys     1m55.731s

So it became ~50 times slower, and quite unusable.

All that time seems to be disappearing into GzipInputStream.__fill,
and in particular, line_profiler says:

    95      8509       270975     31.8      0.1      data = self.fileobj.read(n_to_fetch)
    96      8509        37703      4.4      0.0      self._bytes_read += len(data)
    97      8509        27164      3.2      0.0      if data:
    98      8509    190425980  22379.4     99.6          self.data += self._unzipper.decompress(data)

I'm thinking this is one of those times where the quadratic-time
overhead to string append is worth avoiding...
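
A minimal sketch of the concern raised above, under stated assumptions: it is not the scipy GzipInputStream code, the names inflate_by_append and inflate_by_join are made up for illustration, and it is written for current Python 3 (bytes) rather than the Python 2 strings of this thread. It only contrasts growing one buffer with += against collecting the pieces and joining once.

```python
import zlib


def inflate_by_append(chunks):
    """Inflate compressed chunks, growing a single bytes object with +=.

    Each += may copy everything accumulated so far, which is where a
    quadratic cost can come from when there are many chunks.
    """
    unzipper = zlib.decompressobj()
    data = b""
    for chunk in chunks:
        data += unzipper.decompress(chunk)
    data += unzipper.flush()
    return data


def inflate_by_join(chunks):
    """Inflate compressed chunks, collecting pieces and joining once."""
    unzipper = zlib.decompressobj()
    pieces = []
    for chunk in chunks:
        pieces.append(unzipper.decompress(chunk))  # no copy of earlier pieces
    pieces.append(unzipper.flush())
    return b"".join(pieces)                        # one final copy


# Hypothetical test data: 1 MiB of bytes, fed back in 512-byte slices.
payload = bytes(range(256)) * 4096
compressed = zlib.compress(payload)
chunks = [compressed[i:i + 512] for i in range(0, len(compressed), 512)]

assert inflate_by_append(chunks) == payload
assert inflate_by_join(chunks) == payload
```

Both functions return the same bytes; the difference is only in how much copying happens along the way, so any speedup should be measured on the real file rather than assumed.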

-- Nathaniel

From rmay31 at gmail.com  Wed Feb 11 11:53:46 2009
From: rmay31 at gmail.com (Ryan May)
Date: Wed, 11 Feb 2009 10:53:46 -0600
Subject: [SciPy-dev] huge speed regression in loadmat from 0.6.0 to 0.7.0
In-Reply-To: <961fa2b40902110245p4c78c00ar921fdaa81939a684@mail.gmail.com>
References: <961fa2b40902110245p4c78c00ar921fdaa81939a684@mail.gmail.com>
Message-ID:

On Wed, Feb 11, 2009 at 4:45 AM, Nathaniel Smith wrote:
> The improvements to loadmat in 0.7.0 are wonderful! Thanks for the
> work on that. But I've run into one snag... on a matlab file I care
> about:
>
> $ easy_install scipy==0.6.0
> $ time python -c 'import scipy.io; scipy.io.loadmat("test.mat")'
> real    0m4.172s
> user    0m2.908s
> sys     0m1.056s
>
> $ easy_install scipy==0.7.0
> $ time python -c 'import scipy.io; scipy.io.loadmat("test.mat")'
> real    3m10.556s
> user    1m14.713s
> sys     1m55.731s
>
> So it became ~50 times slower, and quite unusable.
>
> All that time seems to be disappearing into GzipInputStream.__fill,
> and in particular, line_profiler says:
>
>     95      8509       270975     31.8      0.1      data = self.fileobj.read(n_to_fetch)
>     96      8509        37703      4.4      0.0      self._bytes_read += len(data)
>     97      8509        27164      3.2      0.0      if data:
>     98      8509    190425980  22379.4     99.6          self.data += self._unzipper.decompress(data)
>
> I'm thinking this is one of those times where the quadratic-time
> overhead to string append is worth avoiding...

Well, here's a patch against gzipstreams.py that adds the chunks to a
list and only joins them onto the string at the very end. See if it
helps your case. If not, is there somewhere you can put the datafile so
that we can test with it?

Ryan

--
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
Sent from: Norman Oklahoma United States.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gzipstreams_speedup.diff
Type: application/octet-stream
Size: 1290 bytes
Desc: not available
URL:

From Scott.Daniels at Acm.Org  Wed Feb 11 15:03:13 2009
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Wed, 11 Feb 2009 12:03:13 -0800
Subject: [SciPy-dev] huge speed regression in loadmat from 0.6.0 to 0.7.0
In-Reply-To:
References: <961fa2b40902110245p4c78c00ar921fdaa81939a684@mail.gmail.com>
Message-ID:

Ryan May wrote:
> ... Well, here's a patch against gzipstreams.py that changes to add the
> chunks to a list and only add to the string at the very end. See if it
> helps your case. If not, is there somewhere you can put the datafile so
> that we can test with it?

Well, in your patch, instead of:

@@ -95,11 +100,12 @@
             data = self.fileobj.read(n_to_fetch)
             self._bytes_read += len(data)
             if data:
-                self.data += self._unzipper.decompress(data)
+                self_data += self._unzipper.decompress(data)
             if len(data) < n_to_fetch: # hit end of file
-                self.data += self._unzipper.flush()
+                self_data += self._unzipper.flush()
                 self.exhausted = True
                 break
+        self.data += ''.join(self_data)

Use:

@@ -95,11 +100,12 @@
             data = self.fileobj.read(n_to_fetch)
             self._bytes_read += len(data)
             if data:
-                self.data += self._unzipper.decompress(data)
+                self_data.append(self._unzipper.decompress(data))
             if len(data) < n_to_fetch: # hit end of file
-                self.data += self._unzipper.flush()
+                self_data.append(self._unzipper.flush())
                 self.exhausted = True
                 break
+        self.data += ''.join(self_data)

From nwagner at iam.uni-stuttgart.de  Wed Feb 11 15:05:46 2009
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Wed, 11 Feb 2009 21:05:46 +0100
Subject: [SciPy-dev] scikits learn
Message-ID:

Hi all,

The installation of learn failed with

compile options:
'-I/home/nwagner/local/lib64/python2.6/site-packages/numpy/core/include -I/usr/include/python2.6 -c'
gcc: _lk.c
gcc: _lk.c: No such file or directory
gcc: no input files
gcc: _lk.c: No such file or directory
gcc: no input files
error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector -funwind-tables -fasynchronous-unwind-tables -g -fwrapv -fPIC -I/home/nwagner/local/lib64/python2.6/site-packages/numpy/core/include -I/usr/include/python2.6 -c _lk.c -o build/temp.linux-x86_64-2.6/_lk.o" failed with exit status 1

Any idea?

Nils

From rmay31 at gmail.com  Wed Feb 11 15:09:30 2009
From: rmay31 at gmail.com (Ryan May)
Date: Wed, 11 Feb 2009 14:09:30 -0600
Subject: [SciPy-dev] huge speed regression in loadmat from 0.6.0 to 0.7.0
In-Reply-To:
References: <961fa2b40902110245p4c78c00ar921fdaa81939a684@mail.gmail.com>
Message-ID:

On Wed, Feb 11, 2009 at 2:03 PM, Scott David Daniels wrote:
> Ryan May wrote:
> > ... Well, here's a patch against gzipstreams.py that changes to add the
> > chunks to a list and only add to the string at the very end. See if it
> > helps your case. If not, is there somewhere you can put the datafile so
> > that we can test with it?
> Well, in your patch, instead of:
> @@ -95,11 +100,12 @@
>              data = self.fileobj.read(n_to_fetch)
>              self._bytes_read += len(data)
>              if data:
> -                self.data += self._unzipper.decompress(data)
> +                self_data += self._unzipper.decompress(data)
>              if len(data) < n_to_fetch: # hit end of file
> -                self.data += self._unzipper.flush()
> +                self_data += self._unzipper.flush()
>                  self.exhausted = True
>                  break
> +        self.data += ''.join(self_data)
>
> Use:
> @@ -95,11 +100,12 @@
>              data = self.fileobj.read(n_to_fetch)
>              self._bytes_read += len(data)
>              if data:
> -                self.data += self._unzipper.decompress(data)
> +                self_data.append(self._unzipper.decompress(data))
>              if len(data) < n_to_fetch: # hit end of file
> -                self.data += self._unzipper.flush()
> +                self_data.append(self._unzipper.flush())
>                  self.exhausted = True
>                  break
> +        self.data += ''.join(self_data)
>
>

Yeah, you're right.
I thought += for lists just mapped to append, but apparently it appends
other lists, but extends the list by other sequences. Weird.

But if you do make that change, it solves your performance problem?

Ryan

--
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
Sent from: Norman Oklahoma United States.

From robert.kern at gmail.com  Wed Feb 11 15:13:09 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 11 Feb 2009 14:13:09 -0600
Subject: [SciPy-dev] huge speed regression in loadmat from 0.6.0 to 0.7.0
In-Reply-To:
References: <961fa2b40902110245p4c78c00ar921fdaa81939a684@mail.gmail.com>
Message-ID: <3d375d730902111213tef8f12ard1bb3c020f731e7c@mail.gmail.com>

On Wed, Feb 11, 2009 at 14:09, Ryan May wrote:
> On Wed, Feb 11, 2009 at 2:03 PM, Scott David Daniels
> wrote:
>>
>> Ryan May wrote:
>> > ... Well, here's a patch against gzipstreams.py that changes to add the
>> > chunks to a list and only add to the string at the very end. See if it
>> > helps your case. If not, is there somewhere you can put the datafile so
>> > that we can test with it?
>> Well, in your patch, instead of:
>> @@ -95,11 +100,12 @@
>>              data = self.fileobj.read(n_to_fetch)
>>              self._bytes_read += len(data)
>>              if data:
>> -                self.data += self._unzipper.decompress(data)
>> +                self_data += self._unzipper.decompress(data)
>>              if len(data) < n_to_fetch: # hit end of file
>> -                self.data += self._unzipper.flush()
>> +                self_data += self._unzipper.flush()
>>                  self.exhausted = True
>>                  break
>> +        self.data += ''.join(self_data)
>>
>> Use:
>> @@ -95,11 +100,12 @@
>>              data = self.fileobj.read(n_to_fetch)
>>              self._bytes_read += len(data)
>>              if data:
>> -                self.data += self._unzipper.decompress(data)
>> +                self_data.append(self._unzipper.decompress(data))
>>              if len(data) < n_to_fetch: # hit end of file
>> -                self.data += self._unzipper.flush()
>> +                self_data.append(self._unzipper.flush())
>>                  self.exhausted = True
>>                  break
>> +        self.data += ''.join(self_data)
>>
>
> Yeah, you're right. I thought += for lists just mapped to append, but
> apparently it appends other lists, but extends the list by other sequences.
> Weird.

Not weird at all. "x += y" should be the same as "x = x + y" except
for possible in-place modification, per the reference manual.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From Scott.Daniels at Acm.Org  Wed Feb 11 15:21:30 2009
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Wed, 11 Feb 2009 12:21:30 -0800
Subject: [SciPy-dev] huge speed regression in loadmat from 0.6.0 to 0.7.0
In-Reply-To:
References: <961fa2b40902110245p4c78c00ar921fdaa81939a684@mail.gmail.com>
Message-ID:

Ryan May wrote:
> On Wed, Feb 11, 2009 at 2:03 PM, Scott David Daniels wrote:
>
> Ryan May wrote:
> > ... Well, here's a patch against gzipstreams.py that changes to add the
> > chunks to a list and only add to the string at the very end. See if it
> > helps your case.
If not, is there somewhere you can put the datafile so
> > that we can test with it?
> Well, in your patch, instead of:
> @@ -95,11 +100,12 @@
>              data = self.fileobj.read(n_to_fetch)
>              self._bytes_read += len(data)
>              if data:
> -                self.data += self._unzipper.decompress(data)
> +                self_data += self._unzipper.decompress(data)
>              if len(data) < n_to_fetch: # hit end of file
> -                self.data += self._unzipper.flush()
> +                self_data += self._unzipper.flush()
>                  self.exhausted = True
>                  break
> +        self.data += ''.join(self_data)
>
> Use:
> @@ -95,11 +100,12 @@
>              data = self.fileobj.read(n_to_fetch)
>              self._bytes_read += len(data)
>              if data:
> -                self.data += self._unzipper.decompress(data)
> +                self_data.append(self._unzipper.decompress(data))
>              if len(data) < n_to_fetch: # hit end of file
> -                self.data += self._unzipper.flush()
> +                self_data.append(self._unzipper.flush())
>                  self.exhausted = True
>                  break
> +        self.data += ''.join(self_data)
>
>
> Yeah, you're right. I thought += for lists just mapped to append, but
> apparently it appends other lists, but extends the list by other
> sequences. Weird.
>
> But if you do make that change, it solves your performance problem?

I am not the OP. I just noticed a problem. However, there is another:
the loop control is now wrong:

    while read_to_end or len(self.data) < bytes:

Clearly the second clause won't work right, so deeper surgery on your
patch is needed. I'd calculate needed bytes = bytes - len(self.data)
and decrement it by the length of each chunk added to self_data.
But clearly I don't understand what is going on, since I see bytes
initialized to -1 and never updated in the fragment, so the loop
control boils down to "while read_to_end:". I think the code needs
some further study there.
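
The two points above (append rather than += on the list, and a loop condition that no longer sees the growing buffer) can be sketched as follows. This is an illustration under stated assumptions, not the patch attached in this thread: fill, nbytes and blocksize are hypothetical names, the function is free-standing rather than a GzipInputStream method, and it targets current Python 3. Note that `pieces += chunk` would extend the list with the chunk's individual elements, which is why append is used.

```python
import io
import zlib


def fill(fileobj, unzipper, buffered, nbytes=-1, blocksize=4096):
    """Return (data, exhausted): `buffered` plus newly inflated bytes.

    Reads until at least `nbytes` bytes are available, or to the end of
    the stream when nbytes == -1.
    """
    read_to_end = (nbytes == -1)
    pieces = [buffered]
    have = len(buffered)      # byte count kept separately from the list
    exhausted = False
    while read_to_end or have < nbytes:
        raw = fileobj.read(blocksize)
        if raw:
            piece = unzipper.decompress(raw)
            pieces.append(piece)      # one list item per chunk
            have += len(piece)
        if len(raw) < blocksize:      # hit end of file
            piece = unzipper.flush()
            pieces.append(piece)
            have += len(piece)
            exhausted = True
            break
    return b"".join(pieces), exhausted


# Hypothetical usage on an in-memory zlib stream.
payload = b"scipy" * 10000
everything, done = fill(io.BytesIO(zlib.compress(payload)),
                        zlib.decompressobj(), b"")
assert everything == payload and done

some, _ = fill(io.BytesIO(zlib.compress(payload)),
               zlib.decompressobj(), b"", nbytes=100, blocksize=64)
assert len(some) >= 100 and payload.startswith(some)
```

Keeping a running count (`have`) is one way to do the bookkeeping suggested above; decrementing a "bytes still needed" counter would work equally well.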

--Scott David Daniels
Scott.Daniels at Acm.Org

From rmay31 at gmail.com  Wed Feb 11 15:40:01 2009
From: rmay31 at gmail.com (Ryan May)
Date: Wed, 11 Feb 2009 14:40:01 -0600
Subject: [SciPy-dev] huge speed regression in loadmat from 0.6.0 to 0.7.0
In-Reply-To:
References: <961fa2b40902110245p4c78c00ar921fdaa81939a684@mail.gmail.com>
Message-ID:

On Wed, Feb 11, 2009 at 2:21 PM, Scott David Daniels wrote:
> Ryan May wrote:
> > On Wed, Feb 11, 2009 at 2:03 PM, Scott David Daniels wrote:
> >
> > Ryan May wrote:
> > > ... Well, here's a patch against gzipstreams.py that changes to add the
> > > chunks to a list and only add to the string at the very end. See if it
> > > helps your case. If not, is there somewhere you can put the datafile so
> > > that we can test with it?
> > Well, in your patch, instead of:
> > @@ -95,11 +100,12 @@
> >              data = self.fileobj.read(n_to_fetch)
> >              self._bytes_read += len(data)
> >              if data:
> > -                self.data += self._unzipper.decompress(data)
> > +                self_data += self._unzipper.decompress(data)
> >              if len(data) < n_to_fetch: # hit end of file
> > -                self.data += self._unzipper.flush()
> > +                self_data += self._unzipper.flush()
> >                  self.exhausted = True
> >                  break
> > +        self.data += ''.join(self_data)
> >
> > Use:
> > @@ -95,11 +100,12 @@
> >              data = self.fileobj.read(n_to_fetch)
> >              self._bytes_read += len(data)
> >              if data:
> > -                self.data += self._unzipper.decompress(data)
> > +                self_data.append(self._unzipper.decompress(data))
> >              if len(data) < n_to_fetch: # hit end of file
> > -                self.data += self._unzipper.flush()
> > +                self_data.append(self._unzipper.flush())
> >                  self.exhausted = True
> >                  break
> > +        self.data += ''.join(self_data)
> >
> >
> > Yeah, you're right. I thought += for lists just mapped to append, but
> > apparently it appends other lists, but extends the list by other
> > sequences. Weird.
> >
> > But if you do make that change, it solves your performance problem?
>
> I am not the OP. I just noticed a problem.
However, there is another
> The loop control is now wrong:
> while read_to_end or len(self.data) < bytes:
> Clearly the second clause won't work right, so deeper surgery on your
> patch is needed. I'd calculate needed bytes = bytes - len(self.data)
> and decrement it by the length of each chunk added to self_data.
> But clearly I don't understand what is going on, since I see bytes
> initialized to -1 and never updated in the fragment, so the loop
> control boils down to "while read_to_end:". I think the code needs
> some further study there.
>

I can't believe I didn't notice you weren't the OP. And yeah, I forgot
the loop control. Clearly, this is evidence that I shouldn't start my
day by creating a patch, though I did at least have the sense to run
the test suite. Obviously, the tests don't exercise a code path that
uses the len(self.data) < bytes clause.

As far as bytes goes, it isn't initialized to -1, but rather read_to_end
is a boolean set to the value of (bytes == -1), so that you can pass
bytes in as -1 and read all the data.

Anyhow, for anyone who cares, here's a patch that removes the
brain-deadness and should actually work.

Ryan

--
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
Sent from: Norman Oklahoma United States.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gzipstreams_speedup.diff
Type: application/octet-stream
Size: 1444 bytes
Desc: not available
URL:

From nwagner at iam.uni-stuttgart.de  Wed Feb 11 15:46:18 2009
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Wed, 11 Feb 2009 21:46:18 +0100
Subject: [SciPy-dev] scikits umfpack
Message-ID:

python test_umfpack.py
Traceback (most recent call last):
  File "test_umfpack.py", line 195, in
    nose.run(argv=['', __file__])
NameError: name 'nose' is not defined

#!/usr/bin/env python
#
""" Test functions for UMFPACK solver.
The solver is accessed via spsolve(), so the built-in SuperLU solver is
tested too, in single precision.
"""

import warnings
import nose

An "import nose" works for me.

Another issue

python try_umfpack.py
/home/nwagner/local/lib64/python2.6/site-packages/scipy/linsolve/__init__.py:4: DeprecationWarning: scipy.linsolve has moved to scipy.sparse.linalg.dsolve
  warn('scipy.linsolve has moved to scipy.sparse.linalg.dsolve', DeprecationWarning)
Traceback (most recent call last):
  File "try_umfpack.py", line 7, in
    import scipy.linsolve.umfpack as um
ImportError: No module named umfpack

nwagner at linux-mogv:~/svn/umfpack/tests> python
Python 2.6 (r26:66714, Feb 3 2009, 20:49:49)
[GCC 4.3.2 [gcc-4_3-branch revision 141291]] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from scikits import umfpack as um

The corresponding line in try_umfpack.py should be replaced by

#import scipy.linsolve.umfpack as um
from scikits import umfpack as um

Nils

From nwagner at iam.uni-stuttgart.de  Wed Feb 11 15:59:02 2009
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Wed, 11 Feb 2009 21:59:02 +0100
Subject: [SciPy-dev] try_umfpack.py
Message-ID:

Hi Robert C.,

I have updated the URL in try_umfpack.py

#defaultURL = 'http://www.cise.ufl.edu/research/sparse/HBformat/'
defaultURL = 'http://www.cise.ufl.edu/research/sparse/RB/HB/'

./try_umfpack.py -d bcsstm21.tar.gz
**************************************************
url: http://www.cise.ufl.edu/research/sparse/RB/HB/bcsstm21.tar.gz
file: /tmp/tmpDwKACc.gz
format: triplet
reading...
Traceback (most recent call last):
  File "./try_umfpack.py", line 222, in
    main()
  File "./try_umfpack.py", line 138, in main
    mtx = readMatrix( matrixName, options )
  File "./try_umfpack.py", line 98, in readMatrix
    mtx = readMatrix( fd )
  File "./try_umfpack.py", line 37, in read_triplet
    nRow, nCol = map( int, fd.readline().split() )
ValueError: invalid literal for int() with base 10: 'bcsstm21/bcsstm21.rb'

Any idea?

Nils

From njs at pobox.com  Wed Feb 11 21:05:00 2009
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 11 Feb 2009 18:05:00 -0800
Subject: [SciPy-dev] huge speed regression in loadmat from 0.6.0 to 0.7.0
In-Reply-To:
References: <961fa2b40902110245p4c78c00ar921fdaa81939a684@mail.gmail.com>
Message-ID: <961fa2b40902111805h3172bcc4i88ab1fb7da67bf3d@mail.gmail.com>

On Wed, Feb 11, 2009 at 12:40 PM, Ryan May wrote:
> Anyhow, for anyone who cares, here's a patch that removes the
> braindeaded-ness and should actually work.

It doesn't for me -- seems to have an infinite loop or somesuch (I got
bored after half an hour).

I redid the patch a bit (fixed the loop condition, and renamed some
variables for clarity), and my version (attached, against stock 0.7.0)
does terminate, and is almost as fast as 0.6.0:

$ time python -c 'import scipy.io; scipy.io.loadmat("test.mat")'
real    0m5.020s
user    0m3.480s
sys     0m1.540s

So now the GzipInputStream overhead is only about 20-25%. Still seems
a bit higher than it should be, but certainly usable, and worth it for
the memory win.

BTW, the name GzipInputStream is very confusing for something that
reads raw deflate format and not, say, gzip format :-).

-- Nathaniel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: scipy-gzipstreams.patch
Type: text/x-diff
Size: 1775 bytes
Desc: not available
URL:

From cournape at gmail.com  Thu Feb 12 03:55:11 2009
From: cournape at gmail.com (David Cournapeau)
Date: Thu, 12 Feb 2009 17:55:11 +0900
Subject: [SciPy-dev] scikits learn
In-Reply-To:
References:
Message-ID: <5b8d13220902120055n5d978a16k25c8a83d8a97d2c9@mail.gmail.com>

On Thu, Feb 12, 2009 at 5:05 AM, Nils Wagner wrote:
> Hi all,
>
> The installation of learn failed with
>
>
>
> compile options:
> '-I/home/nwagner/local/lib64/python2.6/site-packages/numpy/core/include
> -I/usr/include/python2.6 -c'
> gcc: _lk.c
> gcc: _lk.c: No such file or directory
> gcc: no input files
> gcc: _lk.c: No such file or directory
> gcc: no input files
> error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG
> -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2
> -fstack-protector -funwind-tables
> -fasynchronous-unwind-tables -g -fwrapv -fPIC
> -I/home/nwagner/local/lib64/python2.6/site-packages/numpy/core/include
> -I/usr/include/python2.6 -c _lk.c -o
> build/temp.linux-x86_64-2.6/_lk.o" failed with exit status

Yes, em2 (intended to be a simpler, more complete and much more scalable
version of em) is not ready; I should not have enabled it by default.

cheers,

David

From lars.bittrich at googlemail.com  Thu Feb 12 04:15:15 2009
From: lars.bittrich at googlemail.com (Lars Bittrich)
Date: Thu, 12 Feb 2009 10:15:15 +0100
Subject: [SciPy-dev] Problem with weave and blitz and gcc 4.3 (SciPy 0.7.0)
Message-ID: <200902121015.15832.lars.bittrich@googlemail.com>

Hi all,

on Ubuntu intrepid using gcc 4.3 I have problems using weave together
with blitz converters.
There is already an old ticket:

http://www.scipy.org/scipy/scipy/ticket/739

Here is a minimal example to reproduce the problem (even with SciPy 0.7.0):

from scipy.weave import inline, converters
from numpy import array

a = array([1])
code = "return_val = a(0);"
print inline(code, ['a'], type_converters=converters.blitz)

For the complete error output look at the ticket. The essential part is:

/usr/lib/python2.5/site-packages/scipy/weave/blitz/blitz/funcs.h:530:
error: 'labs' is not a member of 'std'
...
/usr/lib/python2.5/site-packages/scipy/weave/blitz/blitz/mathfunc.h:45:
error: 'labs' is not a member of 'std'

As a workaround I could use gcc 4.2, but the patch from Debian that is
also mentioned in the ticket:

http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=10;filename=blitz%2B%2B.patch;att=1;bug=455661

seems much easier to apply. It is only two includes (both #include ).
The patch works well and all tests in scipy.weave.test('full') pass. Is
there any reason why the patch is not applied in SciPy?

I have even tested the patch with Ubuntu hardy (SciPy 0.6.0) and gcc 4.2.
There, at least, the number of errors and failures does not change with
or without the patch.

Best regards,
Lars

From matthieu.brucher at gmail.com  Thu Feb 12 04:37:54 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Thu, 12 Feb 2009 10:37:54 +0100
Subject: [SciPy-dev] Problem with weave and blitz and gcc 4.3 (SciPy 0.7.0)
In-Reply-To: <200902121015.15832.lars.bittrich@googlemail.com>
References: <200902121015.15832.lars.bittrich@googlemail.com>
Message-ID:

I concur, this is in fact a Blitz issue (I had the same on my computer
when I compiled Blitz). The correct headers are not included. The patch
is not big, and it will not change anything for weave (the former gcc
headers included it already, so the additional cost is zero).

Matthieu

2009/2/12 Lars Bittrich :
> Hi all,
>
> on Ubuntu intrepid using gcc 4.3 I have problems using weave together with
> blitz converters.
There is already an old ticket:
>
> http://www.scipy.org/scipy/scipy/ticket/739
>
> Here is a minimal example to reproduce the problem (even with SciPy 0.7.0):
>
>
> from scipy.weave import inline, converters
> from numpy import array
>
> a = array([1])
> code = "return_val = a(0);"
> print inline(code, ['a'], type_converters=converters.blitz)
>
>
> For the complete error output look at the ticket. The essential part is:
>
> /usr/lib/python2.5/site-packages/scipy/weave/blitz/blitz/funcs.h:530:
> error: 'labs' is not a member of 'std'
> ...
> /usr/lib/python2.5/site-packages/scipy/weave/blitz/blitz/mathfunc.h:45:
> error: 'labs' is not a member of 'std'
>
> As a workaround I could use gcc 4.2, but the patch from Debian that is
> also mentioned in the ticket:
>
> http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=10;filename=blitz%2B%2B.patch;att=1;bug=455661
>
> seems much easier to apply. It is only two includes (both #include ).
> The patch works well and all tests in scipy.weave.test('full') pass. Is there
> any reason why the patch is not applied in SciPy?
>
> I have even tested the patch with Ubuntu hardy (SciPy 0.6.0) and gcc 4.2.
> There, at least, the number of errors and failures does not change with or
> without the patch.
>
>
> Best regards,
> Lars
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>

--
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From david at ar.media.kyoto-u.ac.jp Thu Feb 12 04:27:43 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 12 Feb 2009 18:27:43 +0900 Subject: [SciPy-dev] Problem with weave and blitz and gcc 4.3 (SciPy 0.7.0) In-Reply-To: <200902121015.15832.lars.bittrich@googlemail.com> References: <200902121015.15832.lars.bittrich@googlemail.com> Message-ID: <4993EB8F.9000503@ar.media.kyoto-u.ac.jp> Lars Bittrich wrote: > Is there > any reason why the patch is not applied in SciPy? > Yes, it was too late to fix in 0.7.0; I did not want to deal with C++ issues for 0.7.0 when I noticed this problem. It will be fixed for 0.7.1. cheers, David From perry at stsci.edu Thu Feb 12 09:16:50 2009 From: perry at stsci.edu (Perry Greenfield) Date: Thu, 12 Feb 2009 09:16:50 -0500 Subject: [SciPy-dev] astrolib repository and trac site being relocated References: <71EA2E2F-C40A-4E96-A2BA-C9FC9EA25BF0@stsci.edu> Message-ID: <3A566ECF-CD78-4746-9901-0226E5AFA999@stsci.edu> We will be relocating the astrolib svn repository to STScI this Friday (February 13, 2009). When the new repository becomes available, we will have the scipy.org repository and trac site inactivated and post the new URLs for the STScI-hosted ones. We will continue to provide commit privileges to the repository to those that have software or changes to contribute at the new site. We are very grateful to Enthought for the many years it has hosted the repository. They provided a great service to STScI and the community by doing so.
We will continue to use the scipy.org wiki resources for astropy/astrolib-related information Perry Greenfield From pav at iki.fi Thu Feb 12 17:27:10 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 12 Feb 2009 22:27:10 +0000 (UTC) Subject: [SciPy-dev] Bessel functions from Boost References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <5b8d13220902081710n53ed4241t254dc99728eec6db@mail.gmail.com> <498FB041.5080309@gmail.com> <499006DE.5030707@ar.media.kyoto-u.ac.jp> <5b8d13220902100439i68a073cdw124e366795f87453@mail.gmail.com> <5b8d13220902101031i32bfbc29xe70c7e6b45efb75d@mail.gmail.com> Message-ID: Wed, 11 Feb 2009 03:31:30 +0900, David Cournapeau wrote: [clip] > I started a branch, special_refactor. I added all the converted Boost > data set (the .ipp files to .csv), plus the small python script I used > to generate them. I started implementing the corresponding tests - but > this takes some time, because of all this template stuff which is > awkward to follow. The only thing to do is to find which function is > called for which test with which parameter - someone more familiar with > boost could to this much faster, I guess. I added a couple of more functions to the tests: They correctly point out that in 0.7.0: + The problems in Cephes's Iv (large argument), Yv (large order) and Kn (large order) + Numpy's complex-valued `arcsinh` and `arctanh` can have large relative errors (~1e-5) for small arguments (< eps)! Loss of precision in the naive implementation, I'll bet. but they fail to spot the other known issues. But on the positive side, the `arcsinh` issue is the only new one that came up. One problem with these tests is that the data files are *huge*, they currently total ~ 7 Mb. Even compressed, or saved as .npy files, these would add ~ 2 Mb to the Scipy source tarball. So I'm not sure what to do with this... 
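The loss of precision suspected above can be illustrated with a minimal sketch. It assumes that the "naive implementation" is the textbook formula log(x + sqrt(x^2 + 1)); that is an assumption made for illustration, not a statement about NumPy's actual code. The sketch also uses real arguments instead of complex ones, and the function names `arcsinh_naive` and `arcsinh_stable` are hypothetical.

```python
import math


def arcsinh_naive(x):
    # Textbook formula (assumed here to be the "naive implementation").
    # For tiny x the sum x + sqrt(x*x + 1) rounds to a value near 1.0,
    # so most of the digits of x are lost before the logarithm is taken.
    return math.log(x + math.sqrt(x * x + 1.0))


def arcsinh_stable(x):
    # Algebraically equivalent rewrite that never adds a tiny number to 1
    # outside of log1p:
    #   asinh(|x|) = log1p(|x| + x^2 / (1 + sqrt(1 + x^2)))
    # The sign is restored afterwards because asinh is an odd function.
    ax = abs(x)
    r = math.log1p(ax + ax * ax / (1.0 + math.sqrt(1.0 + ax * ax)))
    return math.copysign(r, x)


if __name__ == "__main__":
    for x in (1e-20, 1e-9, 1e-3, 2.0):
        ref = math.asinh(x)
        print(x,
              abs(arcsinh_naive(x) - ref) / ref,
              abs(arcsinh_stable(x) - ref) / ref)
```

Running the loop prints the relative error of both variants against `math.asinh`; the naive variant should degrade as the argument shrinks, while the rewritten one should stay close to machine precision.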
-- Pauli Virtanen From david at ar.media.kyoto-u.ac.jp Thu Feb 12 20:31:12 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 13 Feb 2009 10:31:12 +0900 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: References: <497D9E9F.7010401@ar.media.kyoto-u.ac.jp> <5b8d13220902081710n53ed4241t254dc99728eec6db@mail.gmail.com> <498FB041.5080309@gmail.com> <499006DE.5030707@ar.media.kyoto-u.ac.jp> <5b8d13220902100439i68a073cdw124e366795f87453@mail.gmail.com> <5b8d13220902101031i32bfbc29xe70c7e6b45efb75d@mail.gmail.com> Message-ID: <4994CD60.5020109@ar.media.kyoto-u.ac.jp> Pauli Virtanen wrote: > Wed, 11 Feb 2009 03:31:30 +0900, David Cournapeau wrote: > > [clip] > >> I started a branch, special_refactor. I added all the converted Boost >> data set (the .ipp files to .csv), plus the small python script I used >> to generate them. I started implementing the corresponding tests - but >> this takes some time, because of all this template stuff which is >> awkward to follow. The only thing to do is to find which function is >> called for which test with which parameter - someone more familiar with >> boost could to this much faster, I guess. >> > > I added a couple of more functions to the tests: > > They correctly point out that in 0.7.0: > > + The problems in Cephes's Iv (large argument), Yv (large order) > and Kn (large order) > > + Numpy's complex-valued `arcsinh` and `arctanh` can have large > relative errors (~1e-5) for small arguments (< eps)! > > Loss of precision in the naive implementation, I'll bet. > > but they fail to spot the other known issues. But on the positive side, > the `arcsinh` issue is the only new one that came up. > > One problem with these tests is that the data files are *huge*, > they currently total ~ 7 Mb. Even compressed, or saved as .npy files, > these would add ~ 2 Mb to the Scipy source tarball. So I'm not sure > what to do with this... 
> That's the reason why I started a branch - I did not know how it would end up. I don't see an obvious answer to the problem: those are tests for ~ 100 functions, so this means 20kb of compressed data/function on average. Each test is two data points at least (x and f(x)), this means around ~ 500 test points/function. That does not sound that big anymore. Maybe we could have an option to split the dataset to make them separate from the main tarball ? I kept the data in .csv because I thought it would be nice to test for double and float at least, and the gain using binary would not be that huge anymore (it is also easier to use for tests outside the python machinery), cheers, David From charlesr.harris at gmail.com Thu Feb 12 21:58:25 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 12 Feb 2009 19:58:25 -0700 Subject: [SciPy-dev] Bessel functions from Boost In-Reply-To: <4994CD60.5020109@ar.media.kyoto-u.ac.jp> References: <5b8d13220902081710n53ed4241t254dc99728eec6db@mail.gmail.com> <498FB041.5080309@gmail.com> <499006DE.5030707@ar.media.kyoto-u.ac.jp> <5b8d13220902100439i68a073cdw124e366795f87453@mail.gmail.com> <5b8d13220902101031i32bfbc29xe70c7e6b45efb75d@mail.gmail.com> <4994CD60.5020109@ar.media.kyoto-u.ac.jp> Message-ID: On Thu, Feb 12, 2009 at 6:31 PM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Pauli Virtanen wrote: > > Wed, 11 Feb 2009 03:31:30 +0900, David Cournapeau wrote: > > > > [clip] > > > >> I started a branch, special_refactor. I added all the converted Boost > >> data set (the .ipp files to .csv), plus the small python script I used > >> to generate them. I started implementing the corresponding tests - but > >> this takes some time, because of all this template stuff which is > >> awkward to follow. The only thing to do is to find which function is > >> called for which test with which parameter - someone more familiar with > >> boost could to this much faster, I guess. 
> >> > > > > I added a couple of more functions to the tests: > > > > They correctly point out that in 0.7.0: > > > > + The problems in Cephes's Iv (large argument), Yv (large order) > > and Kn (large order) > > > > + Numpy's complex-valued `arcsinh` and `arctanh` can have large > > relative errors (~1e-5) for small arguments (< eps)! > > > > Loss of precision in the naive implementation, I'll bet. > > > > but they fail to spot the other known issues. But on the positive side, > > the `arcsinh` issue is the only new one that came up. > > > > One problem with these tests is that the data files are *huge*, > > they currently total ~ 7 Mb. Even compressed, or saved as .npy files, > > these would add ~ 2 Mb to the Scipy source tarball. So I'm not sure > > what to do with this... > > > > That's the reason why I started a branch - I did not know how it would > end up. I don't see an obvious answer to the problem: those are tests > for ~ 100 functions, so this means 20kb of compressed data/function on > average. Each test is two data points at least (x and f(x)), this means > around ~ 500 test points/function. That does not sound that big anymore. > Maybe we could have an option to split the dataset to make them separate > from the main tarball ? > > I kept the data in .csv because I thought it would be nice to test for > double and float at least, and the gain using binary would not be that > huge anymore (it is also easier to use for tests outside the python > machinery), > Maybe it would be best to split out the tests into a separate project and not distribute it with scipy. It could be turned into a generic test suite based on python that could be used to test any implementation of a specific function. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mforbes at physics.ubc.ca Fri Feb 13 11:11:37 2009 From: mforbes at physics.ubc.ca (Michael McNeil Forbes) Date: Fri, 13 Feb 2009 09:11:37 -0700 Subject: [SciPy-dev] WTFM Message-ID: <36596A3A-B968-4161-B7AA-7F0739F1AD4D@physics.ubc.ca> I'd like to register "mforbes" for documentation editing as editor/reviewer and possibly proofer. Thanks, Michael. From gael.varoquaux at normalesup.org Fri Feb 13 11:49:43 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 13 Feb 2009 17:49:43 +0100 Subject: [SciPy-dev] WTFM In-Reply-To: <36596A3A-B968-4161-B7AA-7F0739F1AD4D@physics.ubc.ca> References: <36596A3A-B968-4161-B7AA-7F0739F1AD4D@physics.ubc.ca> Message-ID: <20090213164943.GB6469@phare.normalesup.org> On Fri, Feb 13, 2009 at 09:11:37AM -0700, Michael McNeil Forbes wrote: > I'd like to register "mforbes" for documentation editing as editor/ > reviewer and possibly proofer. I have added you as an editor. I'll leave you a little while to get familiar with the system before I add you as a reviewer. Please do ping the list when you believe it is time for you to become reviewer, as I will forget. Gaël From perry at stsci.edu Fri Feb 13 12:20:10 2009 From: perry at stsci.edu (Perry Greenfield) Date: Fri, 13 Feb 2009 12:20:10 -0500 Subject: [SciPy-dev] astrolib, pyfits, pyraf repositories relocated Message-ID: <6ECE2B27-D097-4808-8F49-BF14883CC93E@stsci.edu> As previously announced, we have migrated the astrolib, pyfits, and pyraf repositories (along with the respective trac sites) to a server at STScI. The URLs for the new repositories and trac sites are listed below. All of the previous content of the repositories and trac sites (e.g., history and tickets) has been migrated as well.
astrolib: svn: https://www.stsci.edu/svn/ssb/astrolib/ trac: https://www.stsci.edu/trac/ssb/astrolib pyfits: svn: https://www.stsci.edu/svn/ssb/pyfits/ trac: https://www.stsci.edu/trac/ssb/astrolib pyraf: svn: https://www.stsci.edu/svn/ssb/pyraf/ trac: https://www.stsci.edu/trac/ssb/pyraf Those that previously had commit privileges for astrolib should contact sienkiew at stsci.edu to get access to the new repository. We would again like to thank Enthought for hosting the repositories for the many years that they have. Perry From Scott.Daniels at Acm.Org Fri Feb 13 16:24:55 2009 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Fri, 13 Feb 2009 13:24:55 -0800 Subject: [SciPy-dev] huge speed regression in loadmat from 0.6.0 to 0.7.0 In-Reply-To: References: <961fa2b40902110245p4c78c00ar921fdaa81939a684@mail.gmail.com> Message-ID: My former follow-up bounced, so just to close the loop: Ryan May wrote: > ... I can't believe I didn't notice you weren't > the OP. And yeah, I forgot the loop control. Clearly, this is > evidence that I shouldn't start my day with creating a patch, though I > did at least have the sense to run the test suite. Obviously, the > tests don't exercise a code path that uses the len(self.data) < bytes. Actually this issue is hard to test as a black box, since over-filling should work correctly but inefficiently. > As far as bytes goes, it isn't initialized to -1, but rather > read_to_end is a boolean set to the value of (bytes == -1), so > that you can pass bytes in as -1 and read all the data. Right. I figured that out when I actually went back to the original to make a patch. Many eyes make bugs shallow.
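The fill logic discussed in this thread can be sketched roughly as follows. This is only an illustration under stated assumptions: the class `BufferedFill`, its `fill` method and the `blocksize` attribute are hypothetical names, not the code in scipy.io; decompression is left out entirely; and it uses Python 3 `bytes` where the original worked on Python 2 strings. It shows the `-1` convention (read to the end), the loop control on the buffered length, and collecting chunks in a list that is joined once.

```python
import io


class BufferedFill:
    """Hypothetical stand-in for the stream wrapper discussed above."""

    blocksize = 16 * 1024  # assumed block size, chosen for illustration

    def __init__(self, fileobj):
        self.fileobj = fileobj
        self.data = b""

    def fill(self, nbytes=-1):
        # nbytes == -1 means "read everything that is left".
        read_to_end = (nbytes == -1)
        # Seed the list with what is already buffered so that a single
        # join at the end builds the final buffer in one allocation.
        chunks = [self.data]
        have = len(self.data)
        # Loop control: stop as soon as enough bytes are buffered,
        # unless the caller asked for the whole stream.
        while read_to_end or have < nbytes:
            block = self.fileobj.read(self.blocksize)
            if not block:
                break  # end of the underlying stream
            chunks.append(block)
            have += len(block)
        self.data = b"".join(chunks)


if __name__ == "__main__":
    stream = BufferedFill(io.BytesIO(b"x" * 50000))
    stream.fill(10)
    print(len(stream.data))
    stream.fill()
    print(len(stream.data))
```

Because reads happen in whole blocks, a request for a few bytes over-fills the buffer, which is correct but hard to detect from the outside, as noted above.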
There is one thing I did that you might want to incorporate: Instead of: self_data = [] Use: self_data = [self.data] And then at the bottom, instead of: self.data += ''.join(self_data) Use: self.data = ''.join(self_data) This way, the full length of the result is known before combining anything (so you get a single large buffer allocation, rather than two). --Scott David Daniels Scott.Daniels at Acm.Org From josef.pktd at gmail.com Sat Feb 14 18:16:11 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 14 Feb 2009 18:16:11 -0500 Subject: [SciPy-dev] test failures in current trunk: sparse\linalg\isolve, and special Message-ID: <1cd32cbb0902141516q56c8928evaae33b560566dedf@mail.gmail.com> 5 failures, * the ones in special have been reported before (I think) >>> from scipy import special >>> special.yn(301,1) nan * I haven't seen the failures in sparse\linalg\isolve before, * first error is a typo that I corrected in trunk Josef Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information.
>>> import scipy >>> scipy.test() Running unit tests for scipy NumPy version 1.3.0.dev6362 NumPy is installed in C:\Programs\Python25\lib\site-packages\numpy SciPy version 0.8.0.dev5551 SciPy is installed in C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\ scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy Python version 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Int el)] nose version 0.10.4 ====================================================================== ERROR: Failure: NameError (global name 'path_candiates' is not defined) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Programs\Python25\lib\site-packages\nose-0.10.4-py2.5.egg\nose\loader .py", line 364, in loadTestsFromName addr.filename, addr.module) File "C:\Programs\Python25\lib\site-packages\nose-0.10.4-py2.5.egg\nose\import er.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "C:\Programs\Python25\lib\site-packages\nose-0.10.4-py2.5.egg\nose\import er.py", line 84, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.de v5551.win32\Programs\Python25\Lib\site-packages\scipy\weave\__init__.py", line 9 , in from blitz_tools import blitz File "\Programs\Python25\Lib\site-packages\scipy\weave\blitz_tools.py", line 1 1, in import inline_tools File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.de v5551.win32\Programs\Python25\Lib\site-packages\scipy\weave\inline_tools.py", li ne 15, in function_catalog = catalog.catalog() File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.de v5551.win32\Programs\Python25\Lib\site-packages\scipy\weave\catalog.py", line 35 6, in __init__ sys.path.append(default_dir()) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.de 
v5551.win32\Programs\Python25\Lib\site-packages\scipy\weave\catalog.py", line 19 7, in default_dir path_candiates.append(os.path.join(tempfile.gettempdir(), NameError: global name 'path_candiates' is not defined ====================================================================== FAIL: test whether all methods converge ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.de v5551.win32\Programs\Python25\Lib\site-packages\scipy\sparse\linalg\isolve\tests \test_iterative.py", line 101, in test_convergence assert_equal(info,0) File "\Programs\Python25\Lib\site-packages\numpy\testing\utils.py", line 183, in assert_equal raise AssertionError(msg) AssertionError: Items are not equal: ACTUAL: -10 DESIRED: 0 ====================================================================== FAIL: test whether maxiter is respected ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.de v5551.win32\Programs\Python25\Lib\site-packages\scipy\sparse\linalg\isolve\tests \test_iterative.py", line 82, in test_maxiter assert_equal(len(residuals), 3) File "\Programs\Python25\Lib\site-packages\numpy\testing\utils.py", line 183, in assert_equal raise AssertionError(msg) AssertionError: Items are not equal: ACTUAL: 1 DESIRED: 3 ====================================================================== FAIL: test whether all methods accept a trivial preconditioner ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.de v5551.win32\Programs\Python25\Lib\site-packages\scipy\sparse\linalg\isolve\tests \test_iterative.py", line 132, in test_precond assert_equal(info,0) File 
"\Programs\Python25\Lib\site-packages\numpy\testing\utils.py", line 183, in assert_equal raise AssertionError(msg) AssertionError: Items are not equal: ACTUAL: -10 DESIRED: 0 ====================================================================== FAIL: test_iv_cephes_vs_amos (test_basic.TestBessel) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.de v5551.win32\Programs\Python25\Lib\site-packages\scipy\special\tests\test_basic.p y", line 1653, in test_iv_cephes_vs_amos self.check_cephes_vs_amos(iv, iv, rtol=1e-8, atol=1e-305) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.de v5551.win32\Programs\Python25\Lib\site-packages\scipy\special\tests\test_basic.p y", line 1640, in check_cephes_vs_amos assert c2.imag != 0, (v, z) AssertionError: (-100.3, 200.5) ====================================================================== FAIL: test_yv_cephes_vs_amos (test_basic.TestBessel) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.de v5551.win32\Programs\Python25\Lib\site-packages\scipy\special\tests\test_basic.p y", line 1650, in test_yv_cephes_vs_amos self.check_cephes_vs_amos(yv, yn, rtol=1e-11, atol=1e-305) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.de v5551.win32\Programs\Python25\Lib\site-packages\scipy\special\tests\test_basic.p y", line 1640, in check_cephes_vs_amos assert c2.imag != 0, (v, z) AssertionError: (301, 1.0) ---------------------------------------------------------------------- Ran 3365 tests in 70.313s FAILED (KNOWNFAIL=2, SKIP=32, errors=1, failures=5) From wnbell at gmail.com Sun Feb 15 19:53:58 2009 From: wnbell at gmail.com (Nathan Bell) Date: Sun, 15 Feb 2009 19:53:58 -0500 Subject: [SciPy-dev] test failures in current trunk: 
sparse\linalg\isolve, and special In-Reply-To: <1cd32cbb0902141516q56c8928evaae33b560566dedf@mail.gmail.com> References: <1cd32cbb0902141516q56c8928evaae33b560566dedf@mail.gmail.com> Message-ID: On Sat, Feb 14, 2009 at 6:16 PM, wrote: > ====================================================================== > FAIL: test whether all methods converge > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.de > v5551.win32\Programs\Python25\Lib\site-packages\scipy\sparse\linalg\isolve\tests > \test_iterative.py", line 101, in test_convergence > assert_equal(info,0) > File "\Programs\Python25\Lib\site-packages\numpy\testing\utils.py", line 183, > in assert_equal > raise AssertionError(msg) > AssertionError: > Items are not equal: > ACTUAL: -10 > DESIRED: 0 > I can confirm the failure on Win32 and have isolated it to bicgstab. I'll continue searching for the source of the error. I noticed that the Windows binaries for SciPy 0.7 also have this problem, but the test has been marked as a known failure. Was this bug ever reported? -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From wnbell at gmail.com Sun Feb 15 22:00:36 2009 From: wnbell at gmail.com (Nathan Bell) Date: Sun, 15 Feb 2009 22:00:36 -0500 Subject: [SciPy-dev] test failures in current trunk: sparse\linalg\isolve, and special In-Reply-To: References: <1cd32cbb0902141516q56c8928evaae33b560566dedf@mail.gmail.com> Message-ID: On Sun, Feb 15, 2009 at 7:53 PM, Nathan Bell wrote: > > I can confirm the failure on Win32 and have isolated it to bicgstab. > I'll continue searching for the source of the error. > > I noticed that the Windows binaries for SciPy 0.7 also have this > problem, but the test has been marked as a known failure. Was this > bug ever reported? 
> The problem seems to be that when bicgstab enters the Fortran function BICGSTABREVCOM it dies a horrible death and returns INFO=-10 signifying breakdown of the algorithm. This doesn't make sense, because the calling function hasn't really provided the Fortran side with any real data except the right hand side (b) and the initial iterate (x) which happens to be 0 in this case. I don't know Fortran, but I believe the condition on 269 should be false since (I think) WORK(1,RTLD) and WORK(1,R) are the same vector (both copies of b) which, in the tests, is chosen randomly. 268 RHO = ( N, WORK(1,RTLD), 1, WORK(1,R), 1 ) 269 IF ( ABS( RHO ).LT.RHOTOL ) GO TO 25 When I changed RHOTOL and OMEGATOL from: 228 RHOTOL = GETBREAK() 229 OMEGATOL = GETBREAK() To: 228 RHOTOL = 0 229 OMEGATOL = 0 The tests pass. Can a Fortran guru tell us if GETBREAK() is suspect? Here are the suspects: http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/linalg/isolve/iterative.py#L134 http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/linalg/isolve/iterative/BiCGSTABREVCOM.f.src http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/linalg/isolve/iterative/getbreak.f.src I give up for now. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From cournape at gmail.com Sun Feb 15 22:36:37 2009 From: cournape at gmail.com (David Cournapeau) Date: Mon, 16 Feb 2009 12:36:37 +0900 Subject: [SciPy-dev] test failures in current trunk: sparse\linalg\isolve, and special In-Reply-To: References: <1cd32cbb0902141516q56c8928evaae33b560566dedf@mail.gmail.com> Message-ID: <5b8d13220902151936l13bfcbd8k3ff4dd464a574bde@mail.gmail.com> On Mon, Feb 16, 2009 at 12:00 PM, Nathan Bell wrote: > On Sun, Feb 15, 2009 at 7:53 PM, Nathan Bell wrote: >> >> I can confirm the failure on Win32 and have isolated it to bicgstab. >> I'll continue searching for the source of the error. 
>> >> I noticed that the Windows binaries for SciPy 0.7 also have this >> problem, but the test has been marked as a known failure. Was this >> bug ever reported? I should have checked whether it was; it is more than likely that I forgot. > > The problem seems to be that when bicgstab enters the Fortran function > BICGSTABREVCOM it dies a horrible death and returns INFO=-10 > signifying breakdown of the algorithm. This doesn't make sense, > because the calling function hasn't really provided the Fortran side > with any real data except the right hand side (b) and the initial > iterate (x) which happens to be 0 in this case. I don't know Fortran, > but I believe the condition on 269 should be false since (I think) > WORK(1,RTLD) and WORK(1,R) are the same vector (both copies of b) > which, in the tests, is chosen randomly. Without going into the details of the code, it is strange that the same code works on Linux and Mac OS X and not on Windows. Since it happens within Fortran, it may be a Fortran compiler bug. When I have more time, I will test it with gfortran. cheers, David From cimrman3 at ntc.zcu.cz Mon Feb 16 12:16:00 2009 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Mon, 16 Feb 2009 18:16:00 +0100 Subject: [SciPy-dev] try_umfpack.py In-Reply-To: References: Message-ID: <49999F50.4010100@ntc.zcu.cz> Nils Wagner wrote: > Hi Robert C., > > > I have updated the URL in try_umfpack.py > > #defaultURL = 'http://www.cise.ufl.edu/research/sparse/HBformat/' > defaultURL = 'http://www.cise.ufl.edu/research/sparse/RB/HB/' > > > ./try_umfpack.py -d bcsstm21.tar.gz > ************************************************** url: > http://www.cise.ufl.edu/research/sparse/RB/HB/bcsstm21.tar.gz file: > /tmp/tmpDwKACc.gz format: triplet reading...
Traceback (most recent > call last): File "./try_umfpack.py", line 222, in main() > File "./try_umfpack.py", line 138, in main mtx = readMatrix( > matrixName, options ) File "./try_umfpack.py", line 98, in readMatrix > mtx = readMatrix( fd ) File "./try_umfpack.py", line 37, in > read_triplet nRow, nCol = map( int, fd.readline().split() ) > ValueError: invalid literal for int() with base 10: > 'bcsstm21/bcsstm21.rb' > > Any idea ? Hi Nils, thanks for your import fixes. It seems that the matrices are no longer stored simply in a gzipped files in the triplet (coordinate) format, but instead are stored in a tar.gz archive with .txt info files and .rb file in the Rutherford-Boeing format. try_umfpack.py cannot read this format yet - what is the status of the ticket http://projects.scipy.org/scipy/scipy/ticket/354 ? As soon as that gets into scipy, fixing try_umfpack.py is trivial. r. From jh at physics.ucf.edu Wed Feb 18 01:55:11 2009 From: jh at physics.ucf.edu (Joe Harrington) Date: Wed, 18 Feb 2009 15:55:11 +0900 Subject: [SciPy-dev] updating the numpy/scipy versions in Linux distros In-Reply-To: (message from Jarrod Millman on Thu, 5 Feb 2009 14:10:32 -0800) References: Message-ID: Jarrod Millman writes: > On Thu, Feb 5, 2009 at 12:44 PM, Joe Harrington wrote: > > It would be nice if the Packaging section of Developer Zone on the web > > site made the packaging process a little more transparent... Say what > > you do to get a release out and to get it in distribution, who does > > it, what the timetables tend to look like, etc. If you ask for help > > there, you're likely to get it. > > I don't use Ubuntu, but a quick google search gave me this: > http://packages.ubuntu.com/jaunty/python-numpy > http://packages.ubuntu.com/jaunty/python-scipy > > I would rather not have information about Ubuntu's release process on > the SciPy site. Ubuntu should be the system of record and it is very > easy to find out that information using google. 
Anything added to the > SciPy site will either quickly become out of date or will need > maintenance (we have too much out-of-date, conflicting, or duplicate > information on the site). > > If you want the Ubuntu developers to use more recent versions of numpy > and scipy, try "asking a question" about whether they will upgrade > through launchpad: > https://answers.launchpad.net/ubuntu/+source/python-scipy/+addquestion > https://answers.launchpad.net/ubuntu/+source/python-numpy/+addquestion > > Hope that is enough information to get you started. > > Good luck, > Jarrod Well, there are two parts to the question: 1. what happens when the Packaging Team produces a release 2. how those releases get into distros I have put some basic text on each topic under Packaging on the Developer Zone page, but it needs more info than I know. Jarrod, could you flesh that out to say what the Packaging Team does to produce a release, who is on the Packaging Team, etc.? What are the principles applied when deciding what features go into what releases (might go under Source Code)? How do you make and test the packages? Once you declare a source release, what happens to make binary packages? Who does what in all these processes? What skills would a volunteer need and whom would they contact to help out? What kinds of help do you need? I think you'd get offers of help if the process were more transparent to new users. For the second part of the problem, of course there's no reason for us to document what Ubuntu or any other distro does to cut one of their releases. What's needed is to document how our stuff gets into their release. If they pull, who does it, and what's the best way to signal them that it's time to pull, given that they're not on our lists and that for many distros, the person responsible for the package might not use or even personally care about our software. I have hung a page off DevZone that has a section for each distro. I hope knowledgeable users will fill it in. 
Robert Kern writes: > My experience is that most distribution packagers aren't on this list. > We don't push; they pull. This is actually good news. Maybe we'd be more up-to-date in the distros if we had some volunteers act as liaisons between the Packaging Team and the distros. All they'd have to do would be to follow the instructions on the Distros page to ping the distros whenever we cut a new release. There are plenty of newish folks who have asked how they could help, but who can't actually contribute code or docs yet. This would be a good task for them. It would only take one or two people. Thanks, --jh-- From jh at physics.ucf.edu Wed Feb 18 02:07:33 2009 From: jh at physics.ucf.edu (Joe Harrington) Date: Wed, 18 Feb 2009 16:07:33 +0900 Subject: [SciPy-dev] updating the numpy/scipy versions in Linux distros In-Reply-To: (scipy-dev-request@scipy.org) References: Message-ID: Werner Hoch writes: > here are some notes about openSUSE. > ... Werner, could you edit http://scipy.org/Developer_Zone/Distros and put the appropriate info into an entry for OpenSUSE? Thanks! --jh-- Date: Sat, 7 Feb 2009 10:57:33 +0100 From: Werner Hoch Subject: Re: [SciPy-dev] updating the numpy/scipy versions in Linux distros To: SciPy Developers List Message-ID: <200902071057.34169.werner.ho at gmx.de> Content-Type: text/plain; charset="iso-8859-1" Hi all, here are some notes about openSUSE. numpy and scipy are not part of the openSUSE core distribution but several projects in the openSUSE BuildService are packaging and using scipy and numpy. (https://build.opensuse.org/) I've maintained numpy and scipy for a while in the science project. Now both packages are maintained in the Education project.
Misc links: http://download.opensuse.org/repositories/Education/ http://download.opensuse.org/repositories/Education/openSUSE_11.1/repodata/repoview/python-scipy-0-0.6.0-4.5.html http://download.opensuse.org/repositories/Education/openSUSE_11.1/repodata/repoview/python-numpy-0-1.2.1-2.6.html The current version are numpy 1.2.1 scipy 0.6.0 I think as soon as scipy 0.7.0 is out the package will be updated in the Education project. Lars has added some patches to scipy 0.6.0 to fix some compilation warnings in the Education project. I don't know if he has reported them upstream to the scipy project. openSUSE_11.1 has some strict rules about compilation warnings. The compiler warnings are checked after the build some warnings are treated as errors. In my privat project (not published) I've created a test build for scipy 0.7.0rc2. The buildservice complains that scipy 0.7.0rc2 has the following errors: ---------- I: Program is using implicit definitions of special functions. these functions need to use their correct prototypes to allow the lightweight buffer overflow checking to work. - Implicit memory/string functions need #include . - Implicit *printf functions need #include . - Implicit *printf functions need #include . - Implicit *read* functions need #include . - Implicit *recv* functions need #include . E: python-scipy implicit-fortify-decl scipy/sparse/linalg/dsolve/SuperLU/SRC/xerbla.c:33 I: Program returns random data in a function E: python-scipy no-return-in-nonvoid-function build/src.linux-x86_64-2.6/scipy/sparse/linalg/isolve/iterative/getbreak.f:74, 54, 34, 14 I: Program causes undefined operation (likely same variable used twiceand post/pre incremented in the same expression). e.g. x = x++; Split it in two operations. 
E: python-scipy sequence-point scipy/sparse/linalg/dsolve/SuperLU/SRC/cutil.c:243 E: python-scipy sequence-point scipy/sparse/linalg/dsolve/SuperLU/SRC/zutil.c:243 ---------- I'm attaching the patches from Lars (they are against scipy 0.6.0). Maybe someone can review them and integrate them into scipy. Regards Werner From david at ar.media.kyoto-u.ac.jp Wed Feb 18 01:55:50 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 18 Feb 2009 15:55:50 +0900 Subject: [SciPy-dev] updating the numpy/scipy versions in Linux distros In-Reply-To: References: Message-ID: <499BB0F6.80402@ar.media.kyoto-u.ac.jp> Joe Harrington wrote: > For the second part of the problem, of course there's no reason for us > to document what Ubuntu or any other distro does to cut one of their > releases. What's needed is to document how our stuff gets into their > release. I think there is little to document, because there is no formal process between "us" and "them". There is a lot of process within a distribution, what goes where and when for which version, but that's documented by each distribution. > If they pull, who does it, and what's the best way to signal > them that it's time to pull, given that they're not on our lists and > that for many distros, the person responsible for the package might > not use or even personally care about our software. > In my experience, what matters the most is maintainer time (I sound like a broken record, don't I ? :) ). If they have some time, they will update - but there is the problem that the window to update a version does not always fit with maintainer 'free' time. Then, there is the incentive of users asking for it. One thing to keep in mind, which may not be obvious to everyone: our goal, as numpy/scipy developers, and the distributions' goals are not the same, if not antithetic. We care about distributing the most recent version, and they care about stability and the least work possible.
Not updating is almost always easier than updating for them. For example, some RH developers are really pissed about python 3k breaking a lot of stuff, and would even want to see python 3k failing so that they don't have to deal with the numerous maintenance problems (see http://lwn.net/Articles/310450/). Another example is that debian developers would like to split numpy in many small pieces, because numpy is too big; from our POV, that does not make any sense. My own opinion is that the only solution is to have our own packages, published when we can; distributions then do what they want/can on their own schedule. > > This is actually good news. Maybe we'd be more up-to-date in the > distros if we had some volunteers act as liaisons between the Packaging > Team and the distros. All they'd have to do would be to follow the > instructions on the Distros page to ping the distros whenever we cut a > new release. There are plenty of newish folks who have asked how they > could help, but who can't actually contribute code or docs yet. This > would be a good task for them. It would only take one or two people. > I am not sure packaging counts as a good task for newcomers. Packaging is difficult to do well, there are a lot of small details to keep right, and there is a lot of politics involved, which does not sound like the kind of thing newcomers would like to do. cheers, David From matthew.brett at gmail.com Wed Feb 18 02:30:01 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 17 Feb 2009 23:30:01 -0800 Subject: [SciPy-dev] huge speed regression in loadmat from 0.6.0 to 0.7.0 In-Reply-To: <961fa2b40902111805h3172bcc4i88ab1fb7da67bf3d@mail.gmail.com> References: <961fa2b40902110245p4c78c00ar921fdaa81939a684@mail.gmail.com> <961fa2b40902111805h3172bcc4i88ab1fb7da67bf3d@mail.gmail.com> Message-ID: <1e2af89e0902172330r66f0ef0bh4bf201ab84a448d6@mail.gmail.com> Hi, > So now the GzipInputStream overhead is only about 20-25%.
Still seems > a bit higher than it should be, but certainly usable, and worth it for > the memory win. > > BTW, the name GzipInputStream is very confusing for something that > reads raw deflate format and not, say, gzip format :-). Thanks to all in this thread. I've committed a version of this patch with the final suggestions here, and renamed the class to ZlibInputStream. I've also put in a two-shot version of the class that's a bit better matched to the matlab io situation, where we want to be able to do one small read for the variable name, then a long read for the rest of the data. Chris Burns kindly put some benchmarks in place. Could you test current SVN and let me know how speed is for you? Matthew From gael.varoquaux at normalesup.org Wed Feb 18 03:32:17 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 18 Feb 2009 09:32:17 +0100 Subject: [SciPy-dev] Scipy.org seems down Message-ID: <20090218083217.GB11533@phare.normalesup.org> scipy.org seems down. conference.scipy.org is up, as well as other servers hosted at Enthought... Gaël From jh at physics.ucf.edu Wed Feb 18 08:23:44 2009 From: jh at physics.ucf.edu (Joe Harrington) Date: Wed, 18 Feb 2009 22:23:44 +0900 Subject: [SciPy-dev] updating the numpy/scipy versions in Linux distros In-Reply-To: <499BB0F6.80402@ar.media.kyoto-u.ac.jp> (message from David Cournapeau on Wed, 18 Feb 2009 15:55:50 +0900) References: <499BB0F6.80402@ar.media.kyoto-u.ac.jp> Message-ID: Hi David, greetings from not-so-faraway Tokyo... David Cournapeau wrote: > > For the second part of the problem, of course there's no reason for us > > to document what Ubuntu or any other distro does to cut one of their > > releases. What's needed is to document how our stuff gets into their > > release. > > I think there is little to document, because there is no formal process > between "us" and "them".
There is a lot of process within a > distribution, what's go where and when for which version, but that's > documented on each distribution. So document the little there is to document. I don't expect that it takes more than an email or a click on a web page to get most distributors to put "pick up new numpy before next build" on their to-do list. But we could benefit, because as you point out these are busy people and they might not bother looking on their own every release. What I'm looking for on each distro may be no more than: The Fnord release process is described at: http://releases.fnord.com/timetable.html The current numpy in Fnord is: 1.2.3 The current scipy in Fnord is: 4.5.6 The Fnord Numpy build is based at http://building.fnord.com/numpy/ Fnord Smith handles both numpy and scipy packages for Fnord. The best way to get a new numpy release into Fnord is to email fnordsmith at fnord.com as soon as it's released. > > If they pull, who does it, and what's the best way to signal > > them that it's time to pull, given that they're not on our lists and > > that for many distros, the person responsible for the package might > > not use or even personally care about our software. > > > > In my experience, what matters the most is maintainer time (I sound like > a broken record, don't I ? :) ). If they have some time, they will > update - but there is the problem that the window to update a version > does not always fit with maintainer 'free' time. Then, there is the > incentive of users asking for it. Your time or theirs? I'm not asking for more of your time. But, if we organize ourselves just a little, and ping the distros when we have a new release, maybe someone on their end (perhaps even someone in our community) will take the time to ingest a new release into their system before they start hitting deadlines. Again, a little human contact will go a long way. 
> One thing to keep in mind, which may not be obvious to everyone: our > goal, as numpy/scipy developers, and the distributions goals are not the > same, if not antithetic. We care about distributing the most recent > version, and they care about stability and the least work possible. Not > updating is almost always easier than updating for them. For example, > some RH developers are really pissed about python 3k breaking a lot of > stuff, and would even want to see python 3k failing so that they don't > have to deal with the numerous maintenance problems (see > http://lwn.net/Articles/310450/). Another example is that debian > developers would like to split numpy in many small pieces, because numpy > is too big; from our POV, that does not make any sense. Well, it's not completely antithetic: we all want to take as little time as possible!:-) We can't force them into action, but one thing is certain: They won't take more action if users don't ask them to. A little human contact will go a long way. > My own opinion is that the only solution is to have our own packages, > published when we can; distributions then do what they want/can on their > own schedule. Didn't we switch to frequent, time-based releases in part to make it so that anybody picking up a package got a relatively recent version? Our interests don't have to conflict with theirs if we think creatively and even talk to them occasionally. > > This is actually good news. Maybe we'd be more up-to-date in the > > distros if we had some volunteers act as liasons between the Packaging > > Team and the distros. All they'd have to do would be to follow the > > instructions on the Distros page to ping the distros whenever we cut a > > new release. There are plenty of newish folks who have asked how they > > could help, but who can't actually contribute code or docs yet. This > > would be a good task for them. It would only take one or two people. 
> > > > I am not sure packaging counts as a good task for newcomers. Packaging > is difficult to do well, there are a lof of small details to keep right, > and there is a lot of politics involved, which does not sound as the > kind of things newcomers would like to do. I'm not advocating bringing lots of newbies into the binary packaging effort. I'm suggesting that we sign up a few people, who may be newbies to Python but may not be newbies to programming and even packaging, to follow the instructions on a web page about sending an email once a release. This is just not a big deal, but it could have a significant return, because, sorry if I sound like a broken record, but: a little human contact will go a long way. --jh-- From david at ar.media.kyoto-u.ac.jp Wed Feb 18 09:22:39 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 18 Feb 2009 23:22:39 +0900 Subject: [SciPy-dev] updating the numpy/scipy versions in Linux distros In-Reply-To: References: <499BB0F6.80402@ar.media.kyoto-u.ac.jp> Message-ID: <499C19AF.7060908@ar.media.kyoto-u.ac.jp> Joe Harrington wrote: > Hi David, greetings from not-so-faraway Tokyo... > > David Cournapeau wrote: > >>> For the second part of the problem, of course there's no reason for us >>> to document what Ubuntu or any other distro does to cut one of their >>> releases. What's needed is to document how our stuff gets into their >>> release. >>> >> I think there is little to document, because there is no formal process >> between "us" and "them". There is a lot of process within a >> distribution, what's go where and when for which version, but that's >> documented on each distribution. >> > > So document the little there is to document. I don't expect that it > takes more than an email or a click on a web page to get most > distributors to put "pick up new numpy before next build" on their > to-do list. 
But we could benefit, because as you point out these are > busy people and they might not bother looking on their own every > release. What I'm looking for on each distro may be no more than: > > The Fnord release process is described at: > http://releases.fnord.com/timetable.html > > The current numpy in Fnord is: 1.2.3 > The current scipy in Fnord is: 4.5.6 > > The Fnord Numpy build is based at http://building.fnord.com/numpy/ > > Fnord Smith handles both numpy and scipy packages > for Fnord. > > The best way to get a new numpy release into Fnord is to email > fnordsmith at fnord.com as soon as it's released. > I don't think we can do that: it changes too often, and it is almost guaranteed it won't be updated accordingly. No info is better than wrong info IMHO. And the info is easily available on the given distributions. Unless you have something which is 100% automated and foolproof, I don't think it is a good idea. > > Your time or theirs? Theirs. Especially for numpy, it takes a long time to integrate well (because many packages depend on it; one example: numpy causes trouble because of pygtk for next Ubuntu - that's not something we care about). > I'm not asking for more of your time. But, if > we organize ourselves just a little, and ping the distros when we have > a new release, maybe someone on their end (perhaps even someone in our > community) will take the time to ingest a new release into their > system before they start hitting deadlines. Again, a little human > contact will go a long way. > FWIW, I recently joined debian python packagers ML, and I have svn access to fix some issues. That goes toward your human contact, I think :) > > Well, it's not completely antithetic: we all want to take as little > time as possible!:-) What is antithetic is that we care about being up to date, and they care about stability. Since we don't have a fully backward-compatible release process (which is not possible in python anyway), that's a big problem for distributions.
Once say scipy 0.7 is released, we don't care at all about 0.6 anymore. But they do. They care that numpy 1.3 may break 20 packages which depend on it. Think about stable releases (several years), where they anyway have to care about old releases (which we don't); ubuntu hardy will have to support numpy 1.1.1 for several years to come anyway. Every new version means more work, because they still have to support the old one anyway. > Didn't we switch to frequent, time-based releases in part to make it > so that anybody picking up a package got a relatively recent version? > Our interests don't have to conflict with theirs if we think > creatively and even talk to them occasionally. > It is more complicated than that. A good example is bzr: that's a software which is crucial to Canonical, and is updated every month. It is heavily funded by Canonical. Yet, how do bzr developers distribute bzr ? By producing the .deb themselves, and distributing it to the PPA. That's why I think we should focus on things like PPA: we can do the same process for every distribution that matters, in a centralized way, which we can document. It can be mostly automated. By being independent of the distributions, we avoid the pain, while having most of the benefits. Trying to make things faster on the distribution side of things is hopeless in my experience - if you look at many big projects out there which care about Linux, they all distribute their own packages (mono, bzr, etc...). cheers, David From pinto at mit.edu Wed Feb 18 13:36:34 2009 From: pinto at mit.edu (Nicolas Pinto) Date: Wed, 18 Feb 2009 13:36:34 -0500 Subject: [SciPy-dev] rgb_to_hsv in scipy.misc ? 
(was: [Numpy-discussion] Optimizing speed for large-array inter-element algorithms (specifically, color space conversion)) Message-ID: <954ae5aa0902181036i192ad020kd152e0629974ed6c@mail.gmail.com> Hello, Would it be possible to include the following rgb to hsv conversion code in scipy (probably in misc along with misc.imread, etc.) ? What do you think? Thanks in advance. Best regards, -- Nicolas Pinto Ph.D. Candidate, Brain & Computer Sciences Massachusetts Institute of Technology, USA http://web.mit.edu/pinto
# ------------------------------------------------------------------------------
import numpy as np

def rgb_to_hsv_arr(arr):
    """ fast rgb_to_hsv using numpy array """
    # adapted from Arnar Flatberg
    # http://www.mail-archive.com/numpy-discussion at scipy.org/msg06147.html
    # it now handles NaN properly and mimics colorsys.rgb_to_hsv output
    arr = arr/255.
    out = np.empty_like(arr)
    arr_max = arr.max(-1)
    # np.ptp (function form) works on both old and new numpy versions
    delta = np.ptp(arr, axis=-1)
    s = delta / arr_max
    s[delta==0] = 0
    # red is max
    idx = (arr[:,:,0] == arr_max)
    out[idx, 0] = (arr[idx, 1] - arr[idx, 2]) / delta[idx]
    # green is max
    idx = (arr[:,:,1] == arr_max)
    out[idx, 0] = 2. + (arr[idx, 2] - arr[idx, 0] ) / delta[idx]
    # blue is max
    idx = (arr[:,:,2] == arr_max)
    out[idx, 0] = 4. + (arr[idx, 0] - arr[idx, 1] ) / delta[idx]
    out[:,:,0] = (out[:,:,0]/6.0) % 1.0
    out[:,:,1] = s
    out[:,:,2] = arr_max
    # rescale back to [0, 255]
    out *= 255.
    # remove NaN
    out[np.isnan(out)] = 0
    return out

From stefan at sun.ac.za Wed Feb 18 15:46:23 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Wed, 18 Feb 2009 22:46:23 +0200 Subject: [SciPy-dev] [Numpy-discussion] rgb_to_hsv in scipy.misc ?
(was: Optimizing speed for large-array inter-element algorithms (specifically, color space conversion)) In-Reply-To: <954ae5aa0902181036i192ad020kd152e0629974ed6c@mail.gmail.com> References: <954ae5aa0902181036i192ad020kd152e0629974ed6c@mail.gmail.com> Message-ID: <9457e7c80902181246j43ec0cddg4a444f911d77b175@mail.gmail.com> Hi Nicolas 2009/2/18 Nicolas Pinto : > Would it be possible to include the following rgb to hsv conversion code in > scipy (probably in misc along with misc.imread, etc.) ? I think SciPy could do with some more image processing algorithms. Would anyone mind if we added this sort of thing to the ndimage namespace? I'd also like to make `imread` available in that module. If there aren't any objections, I'd gladly help integrate the color conversion code. Regards Stéfan From wnbell at gmail.com Wed Feb 18 17:04:45 2009 From: wnbell at gmail.com (Nathan Bell) Date: Wed, 18 Feb 2009 17:04:45 -0500 Subject: [SciPy-dev] huge speed regression in loadmat from 0.6.0 to 0.7.0 In-Reply-To: <1e2af89e0902172330r66f0ef0bh4bf201ab84a448d6@mail.gmail.com> References: <961fa2b40902110245p4c78c00ar921fdaa81939a684@mail.gmail.com> <961fa2b40902111805h3172bcc4i88ab1fb7da67bf3d@mail.gmail.com> <1e2af89e0902172330r66f0ef0bh4bf201ab84a448d6@mail.gmail.com> Message-ID: On Wed, Feb 18, 2009 at 2:30 AM, Matthew Brett wrote: > > Thanks to all in this thread. I've committed a version of this patch > with the final suggestions here, and renamed the class to > ZlibInputStream. > Will this change be backported to 0.7 or 0.7.x?
-- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From matthew.brett at gmail.com Wed Feb 18 17:20:11 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 18 Feb 2009 14:20:11 -0800 Subject: [SciPy-dev] huge speed regression in loadmat from 0.6.0 to 0.7.0 In-Reply-To: References: <961fa2b40902110245p4c78c00ar921fdaa81939a684@mail.gmail.com> <961fa2b40902111805h3172bcc4i88ab1fb7da67bf3d@mail.gmail.com> <1e2af89e0902172330r66f0ef0bh4bf201ab84a448d6@mail.gmail.com> Message-ID: <1e2af89e0902181420q59fcb63bk289d3f9d446f466e@mail.gmail.com> Hi, On Wed, Feb 18, 2009 at 2:04 PM, Nathan Bell wrote: > On Wed, Feb 18, 2009 at 2:30 AM, Matthew Brett wrote: >> >> Thanks to all in this thread. I've committed a version of this patch >> with the final suggestions here, and renamed the class to >> ZlibInputStream. >> > > Will this change be backported to 0.7 or 0.7.x? I think this one was bad enough that it would justify a 0.7.1 in short order. Matthew From wnbell at gmail.com Wed Feb 18 17:50:01 2009 From: wnbell at gmail.com (Nathan Bell) Date: Wed, 18 Feb 2009 17:50:01 -0500 Subject: [SciPy-dev] huge speed regression in loadmat from 0.6.0 to 0.7.0 In-Reply-To: <1e2af89e0902181420q59fcb63bk289d3f9d446f466e@mail.gmail.com> References: <961fa2b40902110245p4c78c00ar921fdaa81939a684@mail.gmail.com> <961fa2b40902111805h3172bcc4i88ab1fb7da67bf3d@mail.gmail.com> <1e2af89e0902172330r66f0ef0bh4bf201ab84a448d6@mail.gmail.com> <1e2af89e0902181420q59fcb63bk289d3f9d446f466e@mail.gmail.com> Message-ID: On Wed, Feb 18, 2009 at 5:20 PM, Matthew Brett wrote: >> >> Will this change be backported to 0.7 or 0.7.x? > > I think this one was bad enough that it would justify a 0.7.1 in short order. > Good. I ask because there's a nasty problem with bicgstab() on win32 that I'd like to include in 0.7.1 soon also. 
-- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From matthew.brett at gmail.com Wed Feb 18 17:55:39 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 18 Feb 2009 14:55:39 -0800 Subject: [SciPy-dev] huge speed regression in loadmat from 0.6.0 to 0.7.0 In-Reply-To: References: <961fa2b40902110245p4c78c00ar921fdaa81939a684@mail.gmail.com> <961fa2b40902111805h3172bcc4i88ab1fb7da67bf3d@mail.gmail.com> <1e2af89e0902172330r66f0ef0bh4bf201ab84a448d6@mail.gmail.com> <1e2af89e0902181420q59fcb63bk289d3f9d446f466e@mail.gmail.com> Message-ID: <1e2af89e0902181455x13e65404l23a9250af379baed@mail.gmail.com> Hi, On Wed, Feb 18, 2009 at 2:50 PM, Nathan Bell wrote: > On Wed, Feb 18, 2009 at 5:20 PM, Matthew Brett wrote: >>> >>> Will this change be backported to 0.7 or 0.7.x? >> >> I think this one was bad enough that it would justify a 0.7.1 in short order. >> > > Good. I ask because there's a nasty problem with bicgstab() on win32 > that I'd like to include in 0.7.1 soon also. I'm happy to have caused a big enough problem to help someone else out! Matthew From scipy at mspacek.mm.st Wed Feb 18 17:48:04 2009 From: scipy at mspacek.mm.st (Martin Spacek) Date: Wed, 18 Feb 2009 14:48:04 -0800 Subject: [SciPy-dev] docs editing permission request for: mspacek Message-ID: <499C9024.4030709@mspacek.mm.st> Hi, Could you please give me editing rights for the scipy/numpy docs at docs.scipy.org? My username is mspacek Cheers, Martin From gael.varoquaux at normalesup.org Wed Feb 18 18:01:39 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 19 Feb 2009 00:01:39 +0100 Subject: [SciPy-dev] docs editing permission request for: mspacek In-Reply-To: <499C9024.4030709@mspacek.mm.st> References: <499C9024.4030709@mspacek.mm.st> Message-ID: <20090218230139.GA13781@phare.normalesup.org> On Wed, Feb 18, 2009 at 02:48:04PM -0800, Martin Spacek wrote: > Could you please give me editing rights for the scipy/numpy docs at > docs.scipy.org?
My username is mspacek Done. From eads at soe.ucsc.edu Wed Feb 18 20:21:42 2009 From: eads at soe.ucsc.edu (Damian Eads) Date: Wed, 18 Feb 2009 17:21:42 -0800 Subject: [SciPy-dev] doc typo in fclusterdata() function Message-ID: <91b4b1ab0902181721u18fe4c24iedd97e0003bba4fb@mail.gmail.com> Hi Martin, Good catch, thanks. I would appreciate it if you fixed the doc. I've forwarded your message to the scipy-dev mailing list. Create an account, reply with your account name, and someone will grant you editing rights. Thanks again, Damian On Wed, Feb 18, 2009 at 2:28 PM, Martin Spacek wrote: > Hi Damian, > > The first entry in the arguments list in the docstring for > scipy.cluster.hierarchy.fclusterdata() is: > > - Z : ndarray > The hierarchical clustering encoded with the matrix returned > by the ``linkage`` function. > > It should be something like: > > - X : ndarray > ``n`` by ``m`` data matrix with ``n`` observations in ``m`` > dimensions > > This is in scipy 0.7rc2, the online scipy docs, and also in hcluster 0.2. > > I should get myself editing rights for the scipy docs :) > > Cheers, > > Martin From matthew.brett at gmail.com Wed Feb 18 22:26:01 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 18 Feb 2009 19:26:01 -0800 Subject: [SciPy-dev] Matlab io bug; request for advice Message-ID: <1e2af89e0902181926q728b0074l4bcf119b4af4908b@mail.gmail.com> Hi, I've just found another bug in the matlab io (mine of course). It's easy to fix, but I'd like some advice about a change in undocumented and somewhat unexpected behavior of the mat file writing. The bug is, that when writing an array to a matlab matrix variable, I had been appending 0s to the shape, for less than 2d arrays. For 0d arrays (scalars), this results in shape (0,0), which is read in matlab as an empty array. That's obviously a bug, and easy to fix. However, when passing in 1d arrays, it results in shapes like (3,0). 
Although this shape is illegal in matlab, it seems to kindly just change the 0 to a 1, resulting in a column vector. However, matlab tends to think of an unshaped vector as being a row vector: >> a = 1:12; >> size(a) ans = 1 12 And numpy thinks the same: >>> print np.atleast_2d(np.array([1,2])).shape (1, 2) It seems to me then, that I should assume the same, that a 1 dimensional array is a row vector. However, this will change the matlab shape of a 1 d array passed into the matlab mat file routines from a column vector to a row vector. Do y'all think I should: a) Given this is undocumented anyway, just switch to the row vector b) Keep 1d arrays as column vectors c) Raise a warning and change for a future release ? Thanks, Matthew From david at ar.media.kyoto-u.ac.jp Wed Feb 18 22:31:14 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 19 Feb 2009 12:31:14 +0900 Subject: [SciPy-dev] Matlab io bug; request for advice In-Reply-To: <1e2af89e0902181926q728b0074l4bcf119b4af4908b@mail.gmail.com> References: <1e2af89e0902181926q728b0074l4bcf119b4af4908b@mail.gmail.com> Message-ID: <499CD282.6090109@ar.media.kyoto-u.ac.jp> Hi Matthew, Matthew Brett wrote: > However, matlab tends to think of an unshaped vector as being a row vector: > As you know, matlab does not have any rank 1 array concept. You can either create matrix (2 dimensions) or 'array' (N dimensions). I think there will always be problems at this level because of this mismatch. > It seems to me then, that I should assume the same, that a 1 > dimensional array is a row vector. However, this will change the > matlab shape of a 1 d array passed into the matlab mat file routines > from a column vector to a row vector. > > Do y'all think I should: > > a) Given this is undocumented anyway, just switch to the row vector > It may be undocumented, but I think it is safe to assume it will break a lot of code. If it changes, I think it is better to raise a warning before changing it. 
Because of the mismatch mentioned above, there may not be such a thing as a best solution; one choice shall be made, and then we should stick to it. cheers, David From matthew.brett at gmail.com Wed Feb 18 23:03:52 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 18 Feb 2009 20:03:52 -0800 Subject: [SciPy-dev] Matlab io bug; request for advice In-Reply-To: <499CD282.6090109@ar.media.kyoto-u.ac.jp> References: <1e2af89e0902181926q728b0074l4bcf119b4af4908b@mail.gmail.com> <499CD282.6090109@ar.media.kyoto-u.ac.jp> Message-ID: <1e2af89e0902182003v7f51359arb374eefa7d3e3595@mail.gmail.com> Hi, >> Do y'all think I should: >> >> a) Given this is undocumented anyway, just switch to the row vector >> > > It may be undocumented, but I think it is safe to assume it will break a > lot of code. If it changes, I think it is better to raise a warning > before changing it. Because of the mismatch mentioned above, there may > not be such a thing as a best solution; one choice shall be made, and > then we should stick to it. OK - I propose then that I do this:
Leave the current behavior as it is (with the bug fixed) for 0.7.x
Add a parameter at some point to the call allowing this behavior to be changed
Raise a deprecation warning in 0.8 for the change in default behavior
Change for 0.9
Seem sensible for everyone? Matthew From stefan at sun.ac.za Thu Feb 19 01:04:21 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Thu, 19 Feb 2009 08:04:21 +0200 Subject: [SciPy-dev] Matlab io bug; request for advice In-Reply-To: <1e2af89e0902182003v7f51359arb374eefa7d3e3595@mail.gmail.com> References: <1e2af89e0902181926q728b0074l4bcf119b4af4908b@mail.gmail.com> <499CD282.6090109@ar.media.kyoto-u.ac.jp> <1e2af89e0902182003v7f51359arb374eefa7d3e3595@mail.gmail.com> Message-ID: <9457e7c80902182204r36247a16m2667bdeeb89177cb@mail.gmail.com> 2009/2/19 Matthew Brett : >> It may be undocumented, but I think it is safe to assume it will break a >> lot of code.
If it changes, I think it is better to raise a warning >> before changing it. Because of the mismatch mentioned above, there may >> not be such a thing as a best solution; one choice shall be made, and >> then we should stick to it. > > OK - I propose then that I do this: > > Leave the current behavior as it is (with the bug fixed) for 0.7.x > Add a parameter at some point to the call allowing this behavior to be changed > Raise a deprecation warning in 0.8 for the change in default behavior > Change for 0.9 Personally, I would have changed this behaviour ASAP. By putting it off to 0.9 we are forcing users to produce more of these "broken" MATLAB files. With 0.7 just out the door, we could fix the bug in 0.7.1 with the proper documentation (changelog and docstring). If that is not acceptable, at least propose the fix for 0.8, rather than 0.9. Cheers Stéfan From matthew.brett at gmail.com Thu Feb 19 03:22:51 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 19 Feb 2009 00:22:51 -0800 Subject: [SciPy-dev] Matlab io bug; request for advice In-Reply-To: <9457e7c80902182204r36247a16m2667bdeeb89177cb@mail.gmail.com> References: <1e2af89e0902181926q728b0074l4bcf119b4af4908b@mail.gmail.com> <499CD282.6090109@ar.media.kyoto-u.ac.jp> <1e2af89e0902182003v7f51359arb374eefa7d3e3595@mail.gmail.com> <9457e7c80902182204r36247a16m2667bdeeb89177cb@mail.gmail.com> Message-ID: <1e2af89e0902190022j2215e275td4ecb60435b14e12@mail.gmail.com> Hi, > Personally, I would have changed this behaviour ASAP. By putting it > off to 0.9 we are forcing users to produce more of these "broken" > MATLAB files. > > With 0.7 just out the door, we could fix the bug in 0.7.1 with the > proper documentation (changelog and docstring). If that is not > acceptable, at least propose the fix for 0.8, rather than 0.9. Ah. Just to be clear - I am proposing to apply the bug fix for 0d arrays in 0.7.1.
But do you mean that the 1D as column-vector behavior should also be considered a bug and fixed in 0.7.1? David's right, that it's a judgment call as to which 1d->2d behavior is right, but it's also fairly obvious that the row-vector choice is more sensible, given the current behavior of matlab and numpy. I'm happy either way. It's easier for me, and might be cleaner in the long run, to just fix and document, but I can also see that broken code is not a good outcome. See you, Matthew From p.c.degroot at tudelft.nl Thu Feb 19 03:57:35 2009 From: p.c.degroot at tudelft.nl (Pieter Cristiaan de Groot) Date: Thu, 19 Feb 2009 09:57:35 +0100 Subject: [SciPy-dev] old link in releas notes 0.7.0 Message-ID: <499D1EFF.5000204@tudelft.nl> Hello, I think that there is an old link in the release notes of 0.7.0. It seems that nose has moved from: http://code.google.com/p/python-nose/ to here: http://somethingaboutorange.com/mrl/projects/nose/ best, Pieter From wnbell at gmail.com Thu Feb 19 04:00:37 2009 From: wnbell at gmail.com (Nathan Bell) Date: Thu, 19 Feb 2009 04:00:37 -0500 Subject: [SciPy-dev] Matlab io bug; request for advice In-Reply-To: <1e2af89e0902190022j2215e275td4ecb60435b14e12@mail.gmail.com> References: <1e2af89e0902181926q728b0074l4bcf119b4af4908b@mail.gmail.com> <499CD282.6090109@ar.media.kyoto-u.ac.jp> <1e2af89e0902182003v7f51359arb374eefa7d3e3595@mail.gmail.com> <9457e7c80902182204r36247a16m2667bdeeb89177cb@mail.gmail.com> <1e2af89e0902190022j2215e275td4ecb60435b14e12@mail.gmail.com> Message-ID: On Thu, Feb 19, 2009 at 3:22 AM, Matthew Brett wrote: > > I'm happy either way. It's easier for me, and might be cleaner in the > long run, to just fix and document, but I can also see that broken > code is not a good outcome. > I vote fix and document in 0.7.1. I don't think a prolonged transition period would help matters. 
-- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From robince at gmail.com Thu Feb 19 04:05:56 2009 From: robince at gmail.com (Robin) Date: Thu, 19 Feb 2009 09:05:56 +0000 Subject: [SciPy-dev] Matlab io bug; request for advice In-Reply-To: References: <1e2af89e0902181926q728b0074l4bcf119b4af4908b@mail.gmail.com> <499CD282.6090109@ar.media.kyoto-u.ac.jp> <1e2af89e0902182003v7f51359arb374eefa7d3e3595@mail.gmail.com> <9457e7c80902182204r36247a16m2667bdeeb89177cb@mail.gmail.com> <1e2af89e0902190022j2215e275td4ecb60435b14e12@mail.gmail.com> Message-ID: On Thu, Feb 19, 2009 at 9:00 AM, Nathan Bell wrote: > On Thu, Feb 19, 2009 at 3:22 AM, Matthew Brett wrote: >> >> I'm happy either way. It's easier for me, and might be cleaner in the >> long run, to just fix and document, but I can also see that broken >> code is not a good outcome. >> > > I vote fix and document in 0.7.1. I don't think a prolonged > transition period would help matters. As a user of the MATLAB functionality, I would vote for the warning message in 0.7.1 and the change in 0.8 (with an option to keep the current behaviour). I use MATLAB files a lot to store results/data and it would/will be a pain to change everywhere I load files that I've previously saved (and then have to keep track of which data files were saved with the old version and the new which is why the compatibility option is important). I'm sure a lot of other people are in a similar situation... I think we really need a warning and a grace period - I wouldn't expect this sort of possible breakage in a point release. 
Cheers Robin From robince at gmail.com Thu Feb 19 04:07:58 2009 From: robince at gmail.com (Robin) Date: Thu, 19 Feb 2009 09:07:58 +0000 Subject: [SciPy-dev] Matlab io bug; request for advice In-Reply-To: References: <1e2af89e0902181926q728b0074l4bcf119b4af4908b@mail.gmail.com> <499CD282.6090109@ar.media.kyoto-u.ac.jp> <1e2af89e0902182003v7f51359arb374eefa7d3e3595@mail.gmail.com> <9457e7c80902182204r36247a16m2667bdeeb89177cb@mail.gmail.com> <1e2af89e0902190022j2215e275td4ecb60435b14e12@mail.gmail.com> Message-ID: On Thu, Feb 19, 2009 at 9:05 AM, Robin wrote: > On Thu, Feb 19, 2009 at 9:00 AM, Nathan Bell wrote: >> On Thu, Feb 19, 2009 at 3:22 AM, Matthew Brett wrote: >>> >>> I'm happy either way. It's easier for me, and might be cleaner in the >>> long run, to just fix and document, but I can also see that broken >>> code is not a good outcome. >>> >> >> I vote fix and document in 0.7.1. I don't think a prolonged >> transition period would help matters. > > As a user of the MATLAB functionality, I would vote for the warning > message in 0.7.1 and the change in 0.8 (with an option to keep the > current behaviour). > > I use MATLAB files a lot to store results/data and it would/will be a > pain to change everywhere I load files that I've previously saved (and > then have to keep track of which data files were saved with the old > version and the new which is why the compatibility option is > important). I'm sure a lot of other people are in a similar > situation... I think we really need a warning and a grace period - I > wouldn't expect this sort of possible breakage in a point release. > > Cheers > > Robin PS - to be clear - (I also forgot to differentiate) - fine with the 0d fix going in, but my above comments apply to the change in dimensionality of a MATLAB saved/loaded 1d vector. 
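One way to make calling code insensitive to whether a saved 1d vector comes back as a column (N, 1) or as a row (1, N) is to flatten it after loading. What follows is only a minimal sketch, assuming numpy is available; the helper name as_1d is hypothetical and is not part of scipy.io.

```python
import numpy as np


def as_1d(arr):
    """Return a 1-D view of a vector stored as (N,), (N, 1) or (1, N).

    Hypothetical helper for illustration; genuine 2-D data is rejected
    rather than silently flattened.
    """
    arr = np.asarray(arr)
    if arr.ndim == 1:
        return arr
    if arr.ndim == 2 and 1 in arr.shape:
        return arr.ravel()
    raise ValueError("expected a vector, got shape %s" % (arr.shape,))


# Shapes that a 1d array could come back with under either convention.
column = np.arange(5).reshape(-1, 1)  # current behaviour: column vector
row = np.arange(5).reshape(1, -1)     # proposed behaviour: row vector

print(as_1d(column).shape, as_1d(row).shape)
```

Code written against such a helper would not need to know which scipy version wrote a given file, at the cost of discarding the row/column distinction.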
From gregor.thalhammer at gmail.com Thu Feb 19 06:18:01 2009 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Thu, 19 Feb 2009 12:18:01 +0100 Subject: [SciPy-dev] Matlab io bug; request for advice In-Reply-To: <1e2af89e0902181926q728b0074l4bcf119b4af4908b@mail.gmail.com> References: <1e2af89e0902181926q728b0074l4bcf119b4af4908b@mail.gmail.com> Message-ID: <499D3FE9.20501@googlemail.com> Matthew Brett wrote: ... > However, matlab tends to think of an unshaped vector as being a row vector: > > >>> a = 1:12; >>> size(a) >>> > > ans = > > 1 12 > > And numpy thinks the same: > > >>>> print np.atleast_2d(np.array([1,2])).shape >>>> > (1, 2) > > It seems to me then, that I should assume the same, that a 1 > dimensional array is a row vector. However, this will change the > matlab shape of a 1 d array passed into the matlab mat file routines > from a column vector to a row vector. > > Do y'all think I should: > > a) Given this is undocumented anyway, just switch to the row vector > b) Keep 1d arrays as column vectors > c) Raise a warning and change for a future release > > I vote for option b: keep 1d arrays as column vectors. Numpy and matlab follow two fundamentally different conventions: numpy uses C contiguous arrays by default, matlab uses Fortran contiguous arrays. If A is a matrix, in numpy 'for vector in A: ...' loops over the row vectors, in matlab 'for vector = A' loops over the column vectors. Matlab follows more closely the notation mathematicians prefer, e.g., indices start with 1, 1d vectors are column vectors. Therefore I am convinced that a 1d vector in numpy (a row vector) corresponds more naturally to a column vector in matlab. I see the argument that [1:12] in matlab is a row vector, but I think this is simply to be consistent with the direct entry of matrices ([1:12; 1:12] is a 2x12 matrix) and to be more economical when displayed on screen. In my opinion this is not sufficient to deduce what should be the default shape of a 1d vector in matlab.
Gregor From njs at pobox.com Thu Feb 19 12:12:40 2009 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 19 Feb 2009 09:12:40 -0800 Subject: [SciPy-dev] Matlab io bug; request for advice In-Reply-To: <499D3FE9.20501@googlemail.com> References: <1e2af89e0902181926q728b0074l4bcf119b4af4908b@mail.gmail.com> <499D3FE9.20501@googlemail.com> Message-ID: <961fa2b40902190912t43071dc9g8fc43fb5d36000fb@mail.gmail.com> On Thu, Feb 19, 2009 at 3:18 AM, Gregor Thalhammer wrote: > Therefore I am convinced that a 1d vector > in numpy (a row vector) corresponds more naturally to a column vector in > matlab. I see the argument that [1:12] in matlab is a row vector, but I > think this is simply to be consistent with the direct entry of matrices > ([1:12; 1:12] is a 2x12 matrix) and to be more economic in displaying on > screen. In my opinion this is not sufficient to deduce what should be > the default shape of a 1d vector in matlab. Also, I believe matlab is inconsistent on this point anyway -- isn't A(:) (basically ".ravel()") a column vector? -- Nathaniel From matthew.brett at gmail.com Thu Feb 19 12:30:36 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 19 Feb 2009 09:30:36 -0800 Subject: [SciPy-dev] Matlab io bug; request for advice In-Reply-To: <961fa2b40902190912t43071dc9g8fc43fb5d36000fb@mail.gmail.com> References: <1e2af89e0902181926q728b0074l4bcf119b4af4908b@mail.gmail.com> <499D3FE9.20501@googlemail.com> <961fa2b40902190912t43071dc9g8fc43fb5d36000fb@mail.gmail.com> Message-ID: <1e2af89e0902190930r5240eb13nb7cfd663a99c4e9d@mail.gmail.com> Hi, >> Therefore I am convinced that a 1d vector >> in numpy (a row vector) corresponds more naturally to a column vector in >> matlab. I see the argument that [1:12] in matlab is a row vector, but I >> think this is simply to be consistent with the direct entry of matrices >> ([1:12; 1:12] is a 2x12 matrix) and to be more economic in displaying on >> screen. 
In my opinion this is not sufficient to deduce what should be >> the default shape of a 1d vector in matlab. > > Also, I believe matlab is inconsistent on this point anyway -- isn't > A(:) (basically ".ravel()") a column vector? I agree it's not completely clear what matlab thinks. However, at the moment we have: In [19]: arr = np.arange(5) In [20]: arr.shape Out[20]: (5,) In [21]: np.atleast_2d(arr).shape Out[21]: (1, 5) In [22]: scipy.io.savemat('afile.mat', {'arr':arr}) In [24]: vars = scipy.io.loadmat('afile.mat') In [25]: vars['arr'].shape Out[25]: (5, 1) I think that is moderately surprising. Best, Matthew From gregor.thalhammer at gmail.com Thu Feb 19 13:45:40 2009 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Thu, 19 Feb 2009 19:45:40 +0100 Subject: [SciPy-dev] Matlab io bug; request for advice In-Reply-To: <1e2af89e0902190930r5240eb13nb7cfd663a99c4e9d@mail.gmail.com> References: <1e2af89e0902181926q728b0074l4bcf119b4af4908b@mail.gmail.com> <499D3FE9.20501@googlemail.com> <961fa2b40902190912t43071dc9g8fc43fb5d36000fb@mail.gmail.com> <1e2af89e0902190930r5240eb13nb7cfd663a99c4e9d@mail.gmail.com> Message-ID: <499DA8D4.4020704@googlemail.com> Matthew Brett schrieb: > Hi, > > >>> Therefore I am convinced that a 1d vector >>> in numpy (a row vector) corresponds more naturally to a column vector in >>> matlab. I see the argument that [1:12] in matlab is a row vector, but I >>> think this is simply to be consistent with the direct entry of matrices >>> ([1:12; 1:12] is a 2x12 matrix) and to be more economic in displaying on >>> screen. In my opinion this is not sufficient to deduce what should be >>> the default shape of a 1d vector in matlab. >>> >> Also, I believe matlab is inconsistent on this point anyway -- isn't >> A(:) (basically ".ravel()") a column vector? >> > > I agree it's not completely clear what matlab thinks. 
> > However, at the moment we have: > > In [19]: arr = np.arange(5) > > In [20]: arr.shape > Out[20]: (5,) > > In [21]: np.atleast_2d(arr).shape > Out[21]: (1, 5) > > In [22]: scipy.io.savemat('afile.mat', {'arr':arr}) > > In [24]: vars = scipy.io.loadmat('afile.mat') > > In [25]: vars['arr'].shape > Out[25]: (5, 1) > > I think that is moderately surprising. > Why not convert matlab 1xN or Nx1 arrays to numpy 1d arrays when loading from a matlab file? (If I remember correctly, this was the case a long time ago, at least in some of the predecessors of scipy.io). I guess this change would create more protest, at least on this list, since it would break python code instead of matlab code. Gregor From matthew.brett at gmail.com Thu Feb 19 13:57:29 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 19 Feb 2009 10:57:29 -0800 Subject: [SciPy-dev] Matlab io bug; request for advice In-Reply-To: <499DA8D4.4020704@googlemail.com> References: <1e2af89e0902181926q728b0074l4bcf119b4af4908b@mail.gmail.com> <499D3FE9.20501@googlemail.com> <961fa2b40902190912t43071dc9g8fc43fb5d36000fb@mail.gmail.com> <1e2af89e0902190930r5240eb13nb7cfd663a99c4e9d@mail.gmail.com> <499DA8D4.4020704@googlemail.com> Message-ID: <1e2af89e0902191057h7af4b222tdfdaa053172680f3@mail.gmail.com> Hi. > Why not convert matlab 1xN or Nx1 arrays to numpy 1d arrays when > loading from a matlab file? (If I remember correctly, this was the > case a long time ago, at least in some of the predecessors of scipy.io). I > guess this change would create more protest, at least on this list, > since it would break python code instead of matlab code. Yes, the old matfile reader used to squeeze out any redundant (1-length) dimensions. We changed that behavior because, if matlab has specified a row or a column vector, it seemed a shame to throw this information away.
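The information loss described above can be seen without any .mat file at all. A small sketch, assuming only numpy: a row vector and a column vector are distinct 2-D arrays, but squeezing out the 1-length dimension makes them indistinguishable.

```python
import numpy as np

arr = np.arange(5)

row = np.atleast_2d(arr)     # numpy's own 1d -> 2d promotion gives a row
column = arr.reshape(-1, 1)  # the shape a column-vector convention implies

# Squeezing removes every 1-length dimension, so both collapse to 1-D.
squeezed_row = np.squeeze(row)
squeezed_column = np.squeeze(column)

print(row.shape, column.shape)                    # (1, 5) (5, 1)
print(squeezed_row.shape, squeezed_column.shape)  # (5,) (5,)
```

After squeezing there is no way to tell which orientation the data originally had, which is the reason given for no longer squeezing on load.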
Best, Matthew From wnbell at gmail.com Thu Feb 19 14:07:17 2009 From: wnbell at gmail.com (Nathan Bell) Date: Thu, 19 Feb 2009 14:07:17 -0500 Subject: [SciPy-dev] Matlab io bug; request for advice In-Reply-To: <499DA8D4.4020704@googlemail.com> References: <1e2af89e0902181926q728b0074l4bcf119b4af4908b@mail.gmail.com> <499D3FE9.20501@googlemail.com> <961fa2b40902190912t43071dc9g8fc43fb5d36000fb@mail.gmail.com> <1e2af89e0902190930r5240eb13nb7cfd663a99c4e9d@mail.gmail.com> <499DA8D4.4020704@googlemail.com> Message-ID: On Thu, Feb 19, 2009 at 1:45 PM, Gregor Thalhammer wrote: > > Why not converting matlab 1xN or Nx1 arrays to numpy 1d arrays when > loading from a matlab file? (If I remember correctly, this has been the > case long time ago, at least in some of the predecessors of scipy.io). I > guess this change would create more protest, at least on this list, > since it would break python code instead of matlab code. > If I store a 1xN or Nx1 array in a .mat I expect the matrix I read back later to have the same dimensions. Think of how obnoxious it would be if I had a code that (depending on the input) might write a Nx3 or a Nx2 or a Nx1 matrix to disk, and then read that back at a later time. If someone has a proper 2d matrix in numpy we ought to respect their (explicit) intentions. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From stefan at sun.ac.za Thu Feb 19 14:50:22 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 19 Feb 2009 21:50:22 +0200 Subject: [SciPy-dev] Server errors Message-ID: <9457e7c80902191150t382c1ee5tc4e009339c48a196@mail.gmail.com> Hi, Those "500 Internal Server Errors" are back again. Could someone with access to the machine please restart the necessary services? 
Thanks Stéfan From stefan at sun.ac.za Thu Feb 19 15:03:03 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Thu, 19 Feb 2009 22:03:03 +0200 Subject: [SciPy-dev] old link in releas notes 0.7.0 In-Reply-To: <499D1EFF.5000204@tudelft.nl> References: <499D1EFF.5000204@tudelft.nl> Message-ID: <9457e7c80902191203r7d70631fvb30aadac72e8d384@mail.gmail.com> Hi Pieter 2009/2/19 Pieter Cristiaan de Groot : > I think that there is an old link in the release notes of 0.7.0. It > seems that nose has moved from: > > http://code.google.com/p/python-nose/ > > to here: > > http://somethingaboutorange.com/mrl/projects/nose/ These look like the developer and project pages. Are they not both valid? Regards Stéfan From robert.kern at gmail.com Thu Feb 19 15:05:24 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 19 Feb 2009 14:05:24 -0600 Subject: [SciPy-dev] old link in releas notes 0.7.0 In-Reply-To: <9457e7c80902191203r7d70631fvb30aadac72e8d384@mail.gmail.com> References: <499D1EFF.5000204@tudelft.nl> <9457e7c80902191203r7d70631fvb30aadac72e8d384@mail.gmail.com> Message-ID: <3d375d730902191205p3f57ca2agacc7dd388150c1a2@mail.gmail.com> On Thu, Feb 19, 2009 at 14:03, Stéfan van der Walt wrote: > Hi Pieter > > 2009/2/19 Pieter Cristiaan de Groot : >> I think that there is an old link in the release notes of 0.7.0. It >> seems that nose has moved from: >> >> http://code.google.com/p/python-nose/ >> >> to here: >> >> http://somethingaboutorange.com/mrl/projects/nose/ > > These look like the developer and project pages. Are they not both valid? Only the latter has the most recent release downloads. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco From stefan at sun.ac.za Thu Feb 19 15:45:02 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Thu, 19 Feb 2009 22:45:02 +0200 Subject: [SciPy-dev] old link in releas notes 0.7.0 In-Reply-To: <3d375d730902191205p3f57ca2agacc7dd388150c1a2@mail.gmail.com> References: <499D1EFF.5000204@tudelft.nl> <9457e7c80902191203r7d70631fvb30aadac72e8d384@mail.gmail.com> <3d375d730902191205p3f57ca2agacc7dd388150c1a2@mail.gmail.com> Message-ID: <9457e7c80902191245u770b106ap26bc21d46a650bf3@mail.gmail.com> 2009/2/19 Robert Kern : >> These look like the developer and project pages. Are they not both valid? > > Only the latter has the most recent release downloads. Thanks. Updated on the 0.7.0 tag, the 0.7.x branch and trunk. Cheers Stéfan From p.c.degroot at tudelft.nl Thu Feb 19 15:46:14 2009 From: p.c.degroot at tudelft.nl (Pieter Cristiaan de Groot) Date: Thu, 19 Feb 2009 21:46:14 +0100 Subject: [SciPy-dev] old link in releas notes 0.7.0 In-Reply-To: <3d375d730902191205p3f57ca2agacc7dd388150c1a2@mail.gmail.com> References: <499D1EFF.5000204@tudelft.nl> <9457e7c80902191203r7d70631fvb30aadac72e8d384@mail.gmail.com> <3d375d730902191205p3f57ca2agacc7dd388150c1a2@mail.gmail.com> Message-ID: <499DC516.9020005@tudelft.nl> Everything on the Google Code website seems to date from before January 2008. But I didn't find a note anywhere about this apparent change. Robert Kern wrote: > On Thu, Feb 19, 2009 at 14:03, Stéfan van der Walt wrote: > >> Hi Pieter >> >> 2009/2/19 Pieter Cristiaan de Groot : >> >>> I think that there is an old link in the release notes of 0.7.0. It >>> seems that nose has moved from: >>> >>> http://code.google.com/p/python-nose/ >>> >>> to here: >>> >>> http://somethingaboutorange.com/mrl/projects/nose/ >>> >> These look like the developer and project pages. Are they not both valid? >> > > Only the latter has the most recent release downloads.
> > From p.c.degroot at tudelft.nl Thu Feb 19 15:49:34 2009 From: p.c.degroot at tudelft.nl (Pieter Cristiaan de Groot) Date: Thu, 19 Feb 2009 21:49:34 +0100 Subject: [SciPy-dev] old link in releas notes 0.7.0 In-Reply-To: <9457e7c80902191245u770b106ap26bc21d46a650bf3@mail.gmail.com> References: <499D1EFF.5000204@tudelft.nl> <9457e7c80902191203r7d70631fvb30aadac72e8d384@mail.gmail.com> <3d375d730902191205p3f57ca2agacc7dd388150c1a2@mail.gmail.com> <9457e7c80902191245u770b106ap26bc21d46a650bf3@mail.gmail.com> Message-ID: <499DC5DE.6050801@tudelft.nl> thanks, and good luck, Pieter Stéfan van der Walt wrote: > 2009/2/19 Robert Kern : > >>> These look like the developer and project pages. Are they not both valid? >>> >> Only the latter has the most recent release downloads. >> > > Thanks. Updated on the 0.7.0 tag, the 0.7.x branch and trunk. > > Cheers > Stéfan > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > From gael.varoquaux at normalesup.org Thu Feb 19 16:31:06 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 19 Feb 2009 22:31:06 +0100 Subject: [SciPy-dev] [OT] Skipping a function wronlgy identified by nose Message-ID: <20090219213106.GA17356@phare.normalesup.org> Sorry, this is off topic, but I know there are many nose users on this mailing list. Here is the problem: in our library, we have a perfectly valid function that is called 'onesample_test'. Due to the name, nose identifies it as a test function (also because it is defined in a file called 'statistical_test.py'). Does anybody have an idea of what the right way to tell nose to skip it is? Nose is passing it the wrong number of arguments, and as a result it is appearing as a bogus error in the test suite.
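For context, nose decides what to collect by matching names against a regular expression. The following is a minimal sketch of why 'onesample_test' gets picked up, using only the standard library. The pattern below is an assumption: it is written from memory of nose's default test-match expression on a POSIX system and should be checked against the installed nose version.

```python
import re

# Assumed approximation of nose's default test-matching pattern
# (POSIX path separator); not taken from this thread.
TESTMATCH = re.compile(r'(?:^|[\b_\./-])[Tt]est')


def looks_like_test(name):
    """Return True if `name` matches the assumed collection pattern."""
    return TESTMATCH.search(name) is not None


for name in ('onesample_test', 'statistical_test', 'test_io', 'onesample'):
    print(name, looks_like_test(name))
```

Under that assumption, any name in which "test" follows an underscore is collected, so both the function and the module named in the message would match.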
Thanks, Gaël From robert.kern at gmail.com Thu Feb 19 16:34:49 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 19 Feb 2009 15:34:49 -0600 Subject: [SciPy-dev] [OT] Skipping a function wronlgy identified by nose In-Reply-To: <20090219213106.GA17356@phare.normalesup.org> References: <20090219213106.GA17356@phare.normalesup.org> Message-ID: <3d375d730902191334u388189adg34122b4407c7f5c7@mail.gmail.com> On Thu, Feb 19, 2009 at 15:31, Gael Varoquaux wrote: > Sorry, this is off topic, but I know there are many nose users on this > mailing list. > > Here is the problem: in our library, we have a perfectly valid function > that is called 'onesample_test'. Due to the name, nose identifies it as a > function (also because it is defined in a file called > 'statistical_test.py'). > > Does anybody have an idea of what the right way to tell nose to skip it > is? Nose is passing it the wrong number of arguments, and as a result it > is appearing as a bogus error in the test suite. Some combination of these should be useful to you. -m TESTMATCH, --match=TESTMATCH, --testmatch=TESTMATCH Use this regular expression to find tests [NOSE_TESTMATCH] -e EXCLUDE, --exclude=EXCLUDE Don't run tests that match regular expression [NOSE_EXCLUDE] -i INCLUDE, --include=INCLUDE Also run tests that match regular expression [NOSE_INCLUDE] -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco From millman at berkeley.edu Thu Feb 19 16:36:34 2009 From: millman at berkeley.edu (Jarrod Millman) Date: Thu, 19 Feb 2009 13:36:34 -0800 Subject: [SciPy-dev] old link in releas notes 0.7.0 In-Reply-To: <499DC516.9020005@tudelft.nl> References: <499D1EFF.5000204@tudelft.nl> <9457e7c80902191203r7d70631fvb30aadac72e8d384@mail.gmail.com> <3d375d730902191205p3f57ca2agacc7dd388150c1a2@mail.gmail.com> <499DC516.9020005@tudelft.nl> Message-ID: On Thu, Feb 19, 2009 at 12:46 PM, Pieter Cristiaan de Groot wrote: > Everything on the google code website seems from before january 2008. > But I didn't find a note anywhere about this apparant change. They use the google code site for development (source code, bug tracking, developer wiki). The other page is the project home page. They both link to one another. The downloads have moved to the home page from the google site. The discrepancy I think is from having two sites with redundant information. Thanks for catching this, we should be linking to the project home page (and not the developer site) anyway. I updated the sourceforge site as well: https://sourceforge.net/project/shownotes.php?release_id=660191&group_id=27747 Jarrod From gael.varoquaux at normalesup.org Thu Feb 19 16:37:54 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 19 Feb 2009 22:37:54 +0100 Subject: [SciPy-dev] [OT] Skipping a function wronlgy identified by nose In-Reply-To: <3d375d730902191334u388189adg34122b4407c7f5c7@mail.gmail.com> References: <20090219213106.GA17356@phare.normalesup.org> <3d375d730902191334u388189adg34122b4407c7f5c7@mail.gmail.com> Message-ID: <20090219213754.GB17356@phare.normalesup.org> On Thu, Feb 19, 2009 at 03:34:49PM -0600, Robert Kern wrote: > Some combination of these should be useful to you. 
> -m TESTMATCH, --match=TESTMATCH, --testmatch=TESTMATCH > Use this regular expression to find tests > [NOSE_TESTMATCH] > -e EXCLUDE, --exclude=EXCLUDE > Don't run tests that match regular expression > [NOSE_EXCLUDE] > -i INCLUDE, --include=INCLUDE > Also run tests that match regular expression > [NOSE_INCLUDE] Do you know a way of making these options, or some attribute-plugin options, module-level? I know how to add them in the setup.cfg, but the developers run the test suite from all over the place. And I don't want to make it more difficult for them to run the test suite, as they already don't run it often. Cheers, Gaël From robert.kern at gmail.com Thu Feb 19 16:40:57 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 19 Feb 2009 15:40:57 -0600 Subject: [SciPy-dev] [OT] Skipping a function wronlgy identified by nose In-Reply-To: <20090219213754.GB17356@phare.normalesup.org> References: <20090219213106.GA17356@phare.normalesup.org> <3d375d730902191334u388189adg34122b4407c7f5c7@mail.gmail.com> <20090219213754.GB17356@phare.normalesup.org> Message-ID: <3d375d730902191340x73969129sda8ef0a49d179f74@mail.gmail.com> On Thu, Feb 19, 2009 at 15:37, Gael Varoquaux wrote: > On Thu, Feb 19, 2009 at 03:34:49PM -0600, Robert Kern wrote: >> Some combination of these should be useful to you. > >> -m TESTMATCH, --match=TESTMATCH, --testmatch=TESTMATCH >> Use this regular expression to find tests >> [NOSE_TESTMATCH] >> -e EXCLUDE, --exclude=EXCLUDE >> Don't run tests that match regular expression >> [NOSE_EXCLUDE] >> -i INCLUDE, --include=INCLUDE >> Also run tests that match regular expression >> [NOSE_INCLUDE] > > Do you know a way of making these options, or some attribute-plugin > options, module-level? I know how to add them in the setup.cfg, but the > developers run the test suite from all over the place. And I don't want > to make it more difficult for them to run the test suite, as they already > don't run it often.
If the settings don't conflict with other nose-using projects (e.g. -e '.*_test$') then your developers can probably use ~/.noserc . -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From matthew.brett at gmail.com Thu Feb 19 16:41:25 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 19 Feb 2009 13:41:25 -0800 Subject: [SciPy-dev] [OT] Skipping a function wronlgy identified by nose In-Reply-To: <20090219213106.GA17356@phare.normalesup.org> References: <20090219213106.GA17356@phare.normalesup.org> Message-ID: <1e2af89e0902191341s1d33abd7u788e4f36dc0ecfeb@mail.gmail.com> Hi, > Here is the problem: in our library, we have a perfectly valid function > that is called 'onesample_test'. Due to the name, nose identifies it as a > function (also because it is defined in a file called > 'statistical_test.py'). Probably you want: from numpy.testing import dec @dec.setastest(False) def function_that_tests(): pass If you don't want to import the decorator, just def function_that_tests(): pass function_that_tests.__test__ = False ? Matthew From alan.mcintyre at gmail.com Thu Feb 19 16:43:15 2009 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Thu, 19 Feb 2009 13:43:15 -0800 Subject: [SciPy-dev] [OT] Skipping a function wronlgy identified by nose In-Reply-To: <20090219213106.GA17356@phare.normalesup.org> References: <20090219213106.GA17356@phare.normalesup.org> Message-ID: <1d36917a0902191343h56d13c82j6f791843490daecd@mail.gmail.com> You can use the setastest decorator from numpy.testing.dec if you want nose to ignore it. On Thu, Feb 19, 2009 at 1:31 PM, Gael Varoquaux wrote: > Sorry, this is off topic, but I know there are many nose users on this > mailing list. > > Here is the problem: in our library, we have a perfectly valid function > that is called 'onesample_test'. 
Due to the name, nose identifies it as a > function (also because it is defined in a file called > 'statistical_test.py'). > > Does anybody have an idea of what the right way to tell nose to skip it > is? Nose is passing it the wrong number of arguments, and as a result it > is appearing as a bogus error in the test suite. > > Thanks, > > Gaël > From gael.varoquaux at normalesup.org Thu Feb 19 16:43:34 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 19 Feb 2009 22:43:34 +0100 Subject: [SciPy-dev] [OT] Skipping a function wronlgy identified by nose In-Reply-To: <1e2af89e0902191341s1d33abd7u788e4f36dc0ecfeb@mail.gmail.com> References: <20090219213106.GA17356@phare.normalesup.org> <1e2af89e0902191341s1d33abd7u788e4f36dc0ecfeb@mail.gmail.com> Message-ID: <20090219214334.GC17356@phare.normalesup.org> On Thu, Feb 19, 2009 at 01:41:25PM -0800, Matthew Brett wrote: > def function_that_tests(): > pass > function_that_tests.__test__ = False Fantastic, just what I needed. Simple, no performance overhead, no tight coupling with a framework. Matthew, I love you. luv, Gaël From john.stachurski at gmail.com Thu Feb 19 20:22:47 2009 From: john.stachurski at gmail.com (John Stachurski) Date: Fri, 20 Feb 2009 10:22:47 +0900 Subject: [SciPy-dev] on-line lectures and new text book using python, numpy, scipy Message-ID: <92120a230902191722w277d2662j3e1b2d69d81b67dc@mail.gmail.com> Hi all, Thanks to all scipy and numpy developers for creating a great package. I'm an economist researching computational techniques, and I'm trying to wean economists off MATLAB and onto python/numpy/scipy.
I have a book just published through MIT Press on computational economics using python: http://johnstachurski.net/book/book.html I've also written a fairly comprehensive set of lectures on python/numpy/ scipy with applications in economics http://johnstachurski.net/lectures/index.html I've added them to this page: http://www.scipy.org/Documentation listed under "Other". I hope people find them useful. All feedback is most welcome. Regards, John. -- John Stachurski Born to fish, forced to work Visit me at http://johnstachurski.net From 00ai99 at gmail.com Thu Feb 19 22:08:42 2009 From: 00ai99 at gmail.com (David Gowers) Date: Fri, 20 Feb 2009 13:38:42 +1030 Subject: [SciPy-dev] on-line lectures and new text book using python, numpy, scipy In-Reply-To: <92120a230902191722w277d2662j3e1b2d69d81b67dc@mail.gmail.com> References: <92120a230902191722w277d2662j3e1b2d69d81b67dc@mail.gmail.com> Message-ID: <23f4e3390902191908n63c198beoe7e4ecf3ab541bb9@mail.gmail.com> Hello John, On Fri, Feb 20, 2009 at 11:52 AM, John Stachurski wrote: > Hi all, > > Thanks to all scipy and numpy developers for creating a great package. > > I'm an economist researching computational techniques, and I'm trying > to wean economists off MATLAB and onto python/numpy/scipy. I have > a book just published through MIT Press on computational economics > using python: > > http://johnstachurski.net/book/book.html > > I've also written a fairly comprehensive set of lectures on python/numpy/ > scipy with applications in economics When I see that you are using python+numpy+scipy, I wonder whether http://en.wikipedia.org/wiki/Sage_(mathematics_software) could be useful to you, since it packages up those three and much more mathematical software into a single interface, in one package. 
I know part of the motivation for SAGE was to provide more complete MATLAB-equivalent functionality (the primary motivation being to have an open-source software suite capable of replacing Mathematica) David From matthew.brett at gmail.com Thu Feb 19 22:40:54 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 19 Feb 2009 19:40:54 -0800 Subject: [SciPy-dev] Matlab io bug; request for advice In-Reply-To: References: <1e2af89e0902181926q728b0074l4bcf119b4af4908b@mail.gmail.com> <499CD282.6090109@ar.media.kyoto-u.ac.jp> <1e2af89e0902182003v7f51359arb374eefa7d3e3595@mail.gmail.com> <9457e7c80902182204r36247a16m2667bdeeb89177cb@mail.gmail.com> <1e2af89e0902190022j2215e275td4ecb60435b14e12@mail.gmail.com> Message-ID: <1e2af89e0902191940h74b5aedfkd796432771a77c4b@mail.gmail.com> Hi, >> As a user of the MATLAB functionality, I would vote for the warning >> message in 0.7.1 and the change in 0.8 (with an option to keep the >> current behaviour). >> >> I use MATLAB files a lot to store results/data and it would/will be a >> pain to change everywhere I load files that I've previously saved (and >> then have to keep track of which data files were saved with the old >> version and the new which is why the compatibility option is >> important). I'm sure a lot of other people are in a similar >> situation... I think we really need a warning and a grace period - I >> wouldn't expect this sort of possible breakage in a point release. That seems the most reasonable compromise given the range of opinions. I've put the deprecation warning into SVN. I will put current SVN into 0.7.1, and intend to change the behavior for 0.8 as suggested.
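The "warn now, change later, with an option to keep the current behaviour" approach can be sketched as follows. This is only an illustration using the standard library: the function name matlab_shape and the keyword oned_as are hypothetical names chosen for the sketch, and nothing here is claimed to be the code that went into SVN.

```python
import warnings


def matlab_shape(n, oned_as=None):
    """Return the 2-D shape used to store a length-n 1d array.

    Hypothetical helper: `oned_as` is 'column', 'row', or None
    (meaning the caller did not choose).
    """
    if oned_as is None:
        # Unspecified: keep today's behaviour but announce the change.
        # FutureWarning is used here because Python shows it by default.
        warnings.warn(
            "1d arrays are currently written as column vectors; this "
            "will change to row vectors in a future release. Pass "
            "oned_as='column' to keep the current behaviour.",
            FutureWarning,
            stacklevel=2,
        )
        oned_as = 'column'
    if oned_as == 'column':
        return (n, 1)
    if oned_as == 'row':
        return (1, n)
    raise ValueError("oned_as must be 'column' or 'row'")


print(matlab_shape(5, oned_as='column'), matlab_shape(5, oned_as='row'))
```

Callers who state the orientation explicitly never see the warning and are unaffected when the default is eventually switched.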
Best, Matthew From matthew.brett at gmail.com Thu Feb 19 22:42:45 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 19 Feb 2009 19:42:45 -0800 Subject: [SciPy-dev] matlab io - request for testing Message-ID: <1e2af89e0902191942t174d10bdr38d79e7fefa4f2d5@mail.gmail.com> Hi, I have been beating up the matlab io rather severely in order to implement some cleanups, fixes, and add new options. I would very much appreciate it if people could pick up the current SVN and let me know whether they have any problems. Thanks a lot, Matthew From matthew.brett at gmail.com Thu Feb 19 23:12:29 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 19 Feb 2009 20:12:29 -0800 Subject: [SciPy-dev] Matlab io bug; request for advice In-Reply-To: <1e2af89e0902191940h74b5aedfkd796432771a77c4b@mail.gmail.com> References: <1e2af89e0902181926q728b0074l4bcf119b4af4908b@mail.gmail.com> <499CD282.6090109@ar.media.kyoto-u.ac.jp> <1e2af89e0902182003v7f51359arb374eefa7d3e3595@mail.gmail.com> <9457e7c80902182204r36247a16m2667bdeeb89177cb@mail.gmail.com> <1e2af89e0902190022j2215e275td4ecb60435b14e12@mail.gmail.com> <1e2af89e0902191940h74b5aedfkd796432771a77c4b@mail.gmail.com> Message-ID: <1e2af89e0902192012t725de0e0rd4729707fe04c27@mail.gmail.com> One more question. > I've put the deprecation warning into SVN. I will put current SVN > into 7.1, and intend to change the behavior for 0.8 as suggested. I noticed while I was working on this, that the default 1D shape for matlab 4 files was a row vector, and for matlab 5 was a column vector. Possibilities: a) Change matlab 4 behavior to be as for matlab 5 (1d->column), thus possibly causing confusion to the (?rather few) matlab 4 users. Who will then be switched back to what they were expecting - row - for 0.8. - ? b) Leave as is, with matlab 4 writing row vectors and matlab 5 writing column vectors? In due course (0.8) they will both be writing row vectors. I'm tempted by b). That's how it is as of current SVN. Any thoughts? 
Matthew From wnbell at gmail.com Thu Feb 19 23:57:33 2009 From: wnbell at gmail.com (Nathan Bell) Date: Thu, 19 Feb 2009 23:57:33 -0500 Subject: [SciPy-dev] matlab io - request for testing In-Reply-To: <1e2af89e0902191942t174d10bdr38d79e7fefa4f2d5@mail.gmail.com> References: <1e2af89e0902191942t174d10bdr38d79e7fefa4f2d5@mail.gmail.com> Message-ID: On Thu, Feb 19, 2009 at 10:42 PM, Matthew Brett wrote: > > I have been beating up the matlab io rather severely in order to > implement some cleanups, fixes, and add new options. > > I would very much appreciate it if people could pick up the current > SVN and let me know whether they have any problems. > r5579 works fine on my system (Ubuntu 8.04 64-bit Python 2.5). -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From peter.skomoroch at gmail.com Fri Feb 20 01:24:27 2009 From: peter.skomoroch at gmail.com (Peter Skomoroch) Date: Thu, 19 Feb 2009 22:24:27 -0800 Subject: [SciPy-dev] on-line lectures and new text book using python, numpy, scipy In-Reply-To: <92120a230902191722w277d2662j3e1b2d69d81b67dc@mail.gmail.com> References: <92120a230902191722w277d2662j3e1b2d69d81b67dc@mail.gmail.com> Message-ID: Great book, I posted a link to scipy-users a few weeks ago. Sent from my iPhone On Feb 19, 2009, at 5:22 PM, John Stachurski wrote: > Hi all, > > Thanks to all scipy and numpy developers for creating a great package. > > I'm an economist researching computational techniques, and I'm trying > to wean economists off MATLAB and onto python/numpy/scipy. I have > a book just published through MIT Press on computational economics > using python: > > http://johnstachurski.net/book/book.html > > I've also written a fairly comprehensive set of lectures on python/ > numpy/ > scipy with applications in economics > > http://johnstachurski.net/lectures/index.html > > I've added them to this page: > > http://www.scipy.org/Documentation > > listed under "Other". I hope people find them useful. 
All feedback > is most > welcome. > > Regards, John. > > -- > John Stachurski > Born to fish, forced to work > Visit me at http://johnstachurski.net > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev From arokem at berkeley.edu Fri Feb 20 02:01:51 2009 From: arokem at berkeley.edu (Ariel Rokem) Date: Thu, 19 Feb 2009 23:01:51 -0800 Subject: [SciPy-dev] matlab io - request for testing In-Reply-To: References: <1e2af89e0902191942t174d10bdr38d79e7fefa4f2d5@mail.gmail.com> Message-ID: <43958ee60902192301y3e854edch5240e1719e273473@mail.gmail.com> Hi Matthew - it seems to work on my computer (Mac OS 10.5.6), and quite fast at that (though I haven't measured precisely). However, it isn't quite backwards compatible with code written with a previous version of mio. If I am getting things right, the changes are such that, in order to get the same result as I got with the previous version, these lines of code: mat_file = sio.loadmat('file_name.mat') variable_values = mat_file['field_name'].variable now have to be written: mat_file = sio.loadmat('file_name.mat') field_values = mat_file['field_name'][0][0].variable[0][0] Cheers -- Ariel On Thu, Feb 19, 2009 at 8:57 PM, Nathan Bell wrote: > On Thu, Feb 19, 2009 at 10:42 PM, Matthew Brett > wrote: > > > > I have been beating up the matlab io rather severely in order to > > implement some cleanups, fixes, and add new options. > > > > I would very much appreciate it if people could pick up the current > > SVN and let me know whether they have any problems. > > > > r5579 works fine on my system (Ubuntu 8.04 64-bit Python 2.5). > > -- > Nathan Bell wnbell at gmail.com > http://graphics.cs.uiuc.edu/~wnbell/ > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev >
From dwf at cs.toronto.edu Fri Feb 20 04:21:32 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Fri, 20 Feb 2009 04:21:32 -0500 Subject: [SciPy-dev] scipy.spatial 'output' args Message-ID: <7A90652A-D7F9-4B1B-8E9F-17B87764583D@cs.toronto.edu> Howdy, This is mostly a question for Damian (sorry, I seem to be bugging you on a lot of fronts!). I'm wondering if there's a principled reason why the pdist/squareform/etc. functions don't allow you to specify an output array. It seems hard to justify not having a way of avoiding repeated memory allocations if you're doing this more than once (as points change, for example -- if you're curious I'm implementing a 2-D embedding algorithm and the objective function relies on a lot of distances being recomputed). My guess is this is a MATLABism and that this was just an oversight. Is there a reason I shouldn't go about trying to add it? Would other people find this useful? David From gregor.thalhammer at gmail.com Fri Feb 20 07:48:07 2009 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Fri, 20 Feb 2009 13:48:07 +0100 Subject: [SciPy-dev] Matlab io bug; request for advice In-Reply-To: References: <1e2af89e0902181926q728b0074l4bcf119b4af4908b@mail.gmail.com> <499D3FE9.20501@googlemail.com> <961fa2b40902190912t43071dc9g8fc43fb5d36000fb@mail.gmail.com> <1e2af89e0902190930r5240eb13nb7cfd663a99c4e9d@mail.gmail.com> <499DA8D4.4020704@googlemail.com> Message-ID: <499EA687.1000406@googlemail.com> Nathan Bell wrote: > On Thu, Feb 19, 2009 at 1:45 PM, Gregor Thalhammer > wrote: > >> Why not convert matlab 1xN or Nx1 arrays to numpy 1d arrays when >> loading from a matlab file? (If I remember correctly, this has been the >> case a long time ago, at least in some of the predecessors of scipy.io). I >> guess this change would create more protest, at least on this list, >> since it would break python code instead of matlab code.
>> >> > > If I store a 1xN or Nx1 array in a .mat I expect the matrix I read > back later to have the same dimensions. Think of how obnoxious it > would be if I had a code that (depending on the input) might write a > Nx3 or a Nx2 or a Nx1 matrix to disk, and then read that back at a > later time. > > If someone has a proper 2d matrix in numpy we ought to respect their > (explicit) intentions. > You can use exactly the same argument to propose the old behaviour: If I store a 1d array in a .mat I expect the vector I read back later to have the same dimensions. Now we can argue what is the more common use case, 1d arrays or 1xN arrays? Which of them should behave consistently? Generally, I think it's a very bad choice to use a matlab file to store and retrieve numpy arrays, since you lose information about the shape of the arrays, one way or the other. The matlab read/write functions in scipy.io are useful to exchange data between numpy and matlab. If I save a vector in matlab, isn't it natural to get a 1d array when I load it with numpy? Gregor From Nicolas.Rougier at loria.fr Fri Feb 20 10:44:37 2009 From: Nicolas.Rougier at loria.fr (Nicolas Rougier) Date: Fri, 20 Feb 2009 16:44:37 +0100 Subject: [SciPy-dev] bug in scipy.ndimage.interpolation.zoom ? Message-ID: <1235144677.27155.2.camel@sulfur.loria.fr> Hello, From the following script: import numpy from scipy.ndimage.interpolation import zoom n = 5 Z = numpy.ones((n)) print Z print zoom(Z,1,order=0) print zoom(Z.reshape((1,n)),(1,1),order=0) I get the following result: [ 1. 1. 1. 1. 1.] [ 1. 1. 1. 1. 1.] [[ 0. 0. 0. 0. 0.]] Is that the expected behavior ? Nicolas From stefan at sun.ac.za Fri Feb 20 10:51:45 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Fri, 20 Feb 2009 17:51:45 +0200 Subject: [SciPy-dev] bug in scipy.ndimage.interpolation.zoom ?
In-Reply-To: <1235144677.27155.2.camel@sulfur.loria.fr> References: <1235144677.27155.2.camel@sulfur.loria.fr> Message-ID: <9457e7c80902200751u42d5e8c4vda24bf7965b168d@mail.gmail.com> 2009/2/20 Nicolas Rougier : > I get the following result: > > [ 1. 1. 1. 1. 1.] > [ 1. 1. 1. 1. 1.] > [[ 0. 0. 0. 0. 0.]] > > Is that the expected behavior ? Hmm, no -- that doesn't look right. Unfortunately, I don't have time to hunt it right now. Maybe you can take a look, or alternatively create a ticket? Regards Stéfan From Nicolas.Rougier at loria.fr Fri Feb 20 11:14:45 2009 From: Nicolas.Rougier at loria.fr (Nicolas Rougier) Date: Fri, 20 Feb 2009 17:14:45 +0100 Subject: [SciPy-dev] bug in scipy.ndimage.interpolation.zoom ? In-Reply-To: <9457e7c80902200751u42d5e8c4vda24bf7965b168d@mail.gmail.com> References: <1235144677.27155.2.camel@sulfur.loria.fr> <9457e7c80902200751u42d5e8c4vda24bf7965b168d@mail.gmail.com> Message-ID: <1235146485.27155.13.camel@sulfur.loria.fr> In the zoom function, the faulty line seems to be: zoom = (numpy.array(input.shape)-1)/(numpy.array(output_shape,float)-1) that generates nan in the zoom array for each output dimension equal to 1. I'm not sure about the meaning of this -1, but a quick fix could be to replace nan in zoom with 1 just after: zoom = numpy.nan_to_num(zoom)+numpy.isnan(zoom) Nicolas On Fri, 2009-02-20 at 17:51 +0200, Stéfan van der Walt wrote: > 2009/2/20 Nicolas Rougier : > > I get the following result: > > > > [ 1. 1. 1. 1. 1.] > > [ 1. 1. 1. 1. 1.] > > [[ 0. 0. 0. 0. 0.]] > > > > Is that the expected behavior ? > > Hmm, no -- that doesn't look right. Unfortunately, I don't have time > to hunt it right now. Maybe you can take a look, or alternatively > create a ticket?
> > Regards > Stéfan > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev From matthew.brett at gmail.com Fri Feb 20 12:01:14 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 20 Feb 2009 09:01:14 -0800 Subject: [SciPy-dev] Matlab io bug; request for advice In-Reply-To: <499EA687.1000406@googlemail.com> References: <1e2af89e0902181926q728b0074l4bcf119b4af4908b@mail.gmail.com> <499D3FE9.20501@googlemail.com> <961fa2b40902190912t43071dc9g8fc43fb5d36000fb@mail.gmail.com> <1e2af89e0902190930r5240eb13nb7cfd663a99c4e9d@mail.gmail.com> <499DA8D4.4020704@googlemail.com> <499EA687.1000406@googlemail.com> Message-ID: <1e2af89e0902200901t780a779arceefdb237593827@mail.gmail.com> Hi, > Generally, I think it's a very bad choice to use a matlab file to store > and retrieve numpy arrays, since you lose information about the shape > of the arrays, one way or the other. The matlab read/write functions > in scipy.io are useful to exchange data between numpy and matlab. If I > save a vector in matlab, isn't it natural to get a 1d array when I load it > with numpy? You don't lose the shape of matlab arrays when loaded into python. I suppose you do lose the distinction between 1D and 2D arrays when going from numpy -> matlab -> python, but, as matlab can't preserve this, there's no way round that. Best, Matthew From matthew.brett at gmail.com Fri Feb 20 12:05:49 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 20 Feb 2009 09:05:49 -0800 Subject: [SciPy-dev] matlab io - request for testing In-Reply-To: <43958ee60902192301y3e854edch5240e1719e273473@mail.gmail.com> References: <1e2af89e0902191942t174d10bdr38d79e7fefa4f2d5@mail.gmail.com> <43958ee60902192301y3e854edch5240e1719e273473@mail.gmail.com> Message-ID: <1e2af89e0902200905q34e08cedu909bac60ce9e9530@mail.gmail.com> Hi Ariel, Here's a wave up the hill.
> mat_file = sio.loadmat('file_name.mat') > variable_values = mat_file['field_name'].variable > > Has to now be written: > > mat_file = sio.loadmat('file_name.mat') > field_values = mat_file['field_name'][0][0].variable[0][0] That's surprising. For a long time now, the reader has always returned at least 2D arrays from matlab, so the latter is what I was expecting. Are you sure this is a difference between 0.7 and current SVN? Can you check and then send me an example mat file with different behavior for the two versions? See you, Matthew From njs at pobox.com Sat Feb 21 01:28:11 2009 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 20 Feb 2009 22:28:11 -0800 Subject: [SciPy-dev] matlab io - request for testing In-Reply-To: <1e2af89e0902191942t174d10bdr38d79e7fefa4f2d5@mail.gmail.com> References: <1e2af89e0902191942t174d10bdr38d79e7fefa4f2d5@mail.gmail.com> Message-ID: <961fa2b40902202228s5752c88bu9dbf420674211536@mail.gmail.com> On Thu, Feb 19, 2009 at 7:42 PM, Matthew Brett wrote: > I have been beating up the matlab io rather severely in order to > implement some cleanups, fixes, and add new options. > > I would very much appreciate it if people could pick up the current > SVN and let me know whether they have any problems. I finally got a chance to test with my nasty file, and with r5561, it now takes ~32 minutes of cpu time to load (as compared to ~5 minutes for 0.7.0, and 3 seconds for 0.6.0). All the time is in zlibstreams.py:read. I talked to the guy whose data it is now, though, and he okayed my distributing an example: http://roberts.vorpus.org/~njs/tmp/test.mat http://roberts.vorpus.org/~njs/tmp/test-mat.txt http://roberts.vorpus.org/~njs/tmp/test-mat.profile (Sorry the file is so large, all my attempts to minimize it somehow also fixed whatever is making it so pathological.) Does that help track things down? 
(This is also a good example file for why struct_as_record=True can be Very Very Useless, and if you combine struct_as_record=True with squeeze_me=True, the file ends up as gibberish -- a big tuple of anonymous variables, not so useful...) I'm also wondering, though, if (as you mentioned downthread somewhere) the matlab IO code ends up doing a single short read and then reads the whole actual matrix data in one fell swoop, then what benefit does this streaming code give us? I thought that the point was that one could read small chunks and avoid taking the memory for a large temporary buffer, but if that's not happening, then it seems like a very slow and fragile chunk of code for no benefit. -- Nathaniel From matthew.brett at gmail.com Sat Feb 21 02:58:19 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 20 Feb 2009 23:58:19 -0800 Subject: [SciPy-dev] matlab io - request for testing In-Reply-To: <961fa2b40902202228s5752c88bu9dbf420674211536@mail.gmail.com> References: <1e2af89e0902191942t174d10bdr38d79e7fefa4f2d5@mail.gmail.com> <961fa2b40902202228s5752c88bu9dbf420674211536@mail.gmail.com> Message-ID: <1e2af89e0902202358x1a346f6dx31c0e1b1224997fc@mail.gmail.com> Hi, > I finally got a chance to test with my nasty file, and with r5561, it > now takes ~32 minutes of cpu time to load (as compared to ~5 minutes > for 0.7.0, and 3 seconds for 0.6.0). All the time is in > zlibstreams.py:read. > > I talked to the guy whose data it is now, though, and he okayed my > distributing an example: > http://roberts.vorpus.org/~njs/tmp/test.mat > http://roberts.vorpus.org/~njs/tmp/test-mat.txt > http://roberts.vorpus.org/~njs/tmp/test-mat.profile > (Sorry the file is so large, all my attempts to minimize it somehow > also fixed whatever is making it so pathological.) Thanks - that's very useful. > Does that help track things down?
(This is also a good example file > for why struct_as_record=True can be Very Very Useless, and if you > combine struct_as_record=True with squeeze_me=True, the file ends up > as gibberish -- a big tuple of anonymous variables, not so useful...) Also useful - thank you. > I'm also wondering, though, if (as you mentioned downthread somewhere) > the matlab IO code ends up doing a single short read and then reads > the whole actual matrix data in one fell swoop, then what benefit does > this streaming code give us? I though that the point was that one > could read small chunks and avoid taking the memory for a large > temporary buffer, but if that's not happening, then it seems like a > very slow and fragile chunk of code for no benefit. It may be that we'll have to pull it. The purpose of the two stage read - and the original purpose of the code - was to allow someone who is trying to read a particular variable to read enough of the zlib stream to get the name, in order to be able to skip it if the name is not the one they are looking for. Otherwise, they would have to read the whole stream - that might be very large - just to get the name. Thanks again, Matthew From matthew.brett at gmail.com Sat Feb 21 03:25:04 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 21 Feb 2009 00:25:04 -0800 Subject: [SciPy-dev] matlab io - request for testing In-Reply-To: <1e2af89e0902202358x1a346f6dx31c0e1b1224997fc@mail.gmail.com> References: <1e2af89e0902191942t174d10bdr38d79e7fefa4f2d5@mail.gmail.com> <961fa2b40902202228s5752c88bu9dbf420674211536@mail.gmail.com> <1e2af89e0902202358x1a346f6dx31c0e1b1224997fc@mail.gmail.com> Message-ID: <1e2af89e0902210025n3905298cp52b4bc5568f9654@mail.gmail.com> Hi, On Fri, Feb 20, 2009 at 11:58 PM, Matthew Brett wrote: > Hi, > >> I finally got a chance to test with my nasty file, and with r5561, it >> now takes ~32 minutes of cpu time to load (as compared to ~5 minutes >> for 0.7.0, and 3 seconds for 0.6.0). 
All the time is in >> zlibstreams.py:read. Actually, thinking about it, I wonder if it's the string slicing in getting the data out of zlibstream that is taking the time. I suppose that might happen if you have lots of tiny matrices in there. Could you try: import scipy.io.matlab as matlab matlab.bench() What kind of numbers do you get? Best, Matthew From nwagner at iam.uni-stuttgart.de Sat Feb 21 03:51:59 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Sat, 21 Feb 2009 09:51:59 +0100 Subject: [SciPy-dev] matlab io - request for testing In-Reply-To: <1e2af89e0902210025n3905298cp52b4bc5568f9654@mail.gmail.com> References: <1e2af89e0902191942t174d10bdr38d79e7fefa4f2d5@mail.gmail.com> <961fa2b40902202228s5752c88bu9dbf420674211536@mail.gmail.com> <1e2af89e0902202358x1a346f6dx31c0e1b1224997fc@mail.gmail.com> <1e2af89e0902210025n3905298cp52b4bc5568f9654@mail.gmail.com> Message-ID: On Sat, 21 Feb 2009 00:25:04 -0800 Matthew Brett wrote: > Hi, > > On Fri, Feb 20, 2009 at 11:58 PM, Matthew Brett > wrote: >> Hi, >> >>> I finally got a chance to test with my nasty file, and >>>with r5561, it >>> now takes ~32 minutes of cpu time to load (as compared >>>to ~5 minutes >>> for 0.7.0, and 3 seconds for 0.6.0). All the time is in >>> zlibstreams.py:read. > > Actually, thinking about it, I wonder if it's the string >slicing in > getting the data out of zlibstream that is taking the >time. I suppose > that might happen if you have lots of tiny matrices in >there. Could > you try: > > import scipy.io.matlab as matlab > matlab.bench() > > What kind of numbers do you get? > > Best, > > Matthew Hi Matthew, I just ran the benchmark.
Here are the results: >>> matlab.bench() Running benchmarks for scipy.io.matlab NumPy version 1.3.0.dev6436 NumPy is installed in /home/nwagner/local/lib64/python2.6/site-packages/numpy SciPy version 0.8.0.dev5581 SciPy is installed in /home/nwagner/local/lib64/python2.6/site-packages/scipy Python version 2.6 (r26:66714, Feb 3 2009, 20:49:49) [GCC 4.3.2 [gcc-4_3-branch revision 141291]] nose version 0.10.4 reading gzip streams ======================================== time(s) | nbytes ---------------------------------------- 0.060 | 1.500 | 4000000 0.240 | 1.200 | 20000000 reading gzip streams ======================================== time(s) | nbytes ---------------------------------------- 0.060 | 1.500 | 4000000 0.240 | 1.333 | 20000000 . ---------------------------------------------------------------------- Ran 1 test in 10.152s OK True Nils From stefan at sun.ac.za Sat Feb 21 04:24:24 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Sat, 21 Feb 2009 11:24:24 +0200 Subject: [SciPy-dev] RFR: #794 - linalg.eig segmentation fault on unpickled arrays Message-ID: <9457e7c80902210124l7ac66101g617d9e42e8ecd68b@mail.gmail.com> Hi, Please review the patch attached to http://scipy.org/scipy/scipy/ticket/794 I'd like to know whether checking memory alignment and Fortran contiguity is sufficient. Please also let me know if you can think of a better way to avoid the LAPACK segfaults. Regards Stéfan From john.stachurski at gmail.com Sat Feb 21 15:24:19 2009 From: john.stachurski at gmail.com (John Stachurski) Date: Sun, 22 Feb 2009 05:24:19 +0900 Subject: [SciPy-dev] on-line lectures and new text book using python, numpy, scipy In-Reply-To: References: <92120a230902191722w277d2662j3e1b2d69d81b67dc@mail.gmail.com> Message-ID: <92120a230902211224o608a0aafvd7a249a034d1522d@mail.gmail.com> On Fri, Feb 20, 2009 at 3:24 PM, Peter Skomoroch wrote: > Great book, I posted a link to scipy-users a few weeks ago. Many thanks ; ) Your website rocks.
I read it all the time. -- John Stachurski Born to fish, forced to work Visit me at http://johnstachurski.net From arokem at berkeley.edu Sat Feb 21 17:11:59 2009 From: arokem at berkeley.edu (Ariel Rokem) Date: Sat, 21 Feb 2009 14:11:59 -0800 Subject: [SciPy-dev] matlab io - request for testing In-Reply-To: <1e2af89e0902200905q34e08cedu909bac60ce9e9530@mail.gmail.com> References: <1e2af89e0902191942t174d10bdr38d79e7fefa4f2d5@mail.gmail.com> <43958ee60902192301y3e854edch5240e1719e273473@mail.gmail.com> <1e2af89e0902200905q34e08cedu909bac60ce9e9530@mail.gmail.com> Message-ID: <43958ee60902211411q345856d2mab04b08d1c85ff0b@mail.gmail.com> Hi Matthew, no - I have been comparing 0.6.0 and r5579, so everything I am saying henceforth may turn out to be irrelevant. At any rate, I attach a .mat file - for this file: In [24]: sp.__version__ Out[24]: '0.6.0' In [25]: mat_file = sio.loadmat('RMT110408.mat') In [26]: mat_file Out[26]: {'ROI': , '__globals__': [], '__header__': 'MATLAB 5.0 MAT-file, Platform: MAC, Created on: Wed Dec 3 18:45:42 2008', '__version__': '1.0'} In [31]: sp.__version__ Out[31]: '0.8.0.dev5579' In [32]: mat_file = sio.loadmat('RMT110408.mat') In [33]: mat_file Out[33]: {'ROI': array([[]], dtype=object), '__globals__': [], '__header__': 'MATLAB 5.0 MAT-file, Platform: MAC, Created on: Wed Dec 3 18:45:42 2008', '__version__': '1.0'} From all that you have said, this is probably no surprise to you. Cheers, Ariel On Fri, Feb 20, 2009 at 9:05 AM, Matthew Brett wrote: > Hi Ariel, > > Here's a wave up the hill. > > > mat_file = sio.loadmat('file_name.mat') > > variable_values = mat_file['field_name'].variable > > > > Has to now be written: > > > > mat_file = sio.loadmat('file_name.mat') > > field_values = mat_file['field_name'][0][0].variable[0][0] > > That's surprising. For a long time now, the reader has always > returned at least 2D arrays from matlab, so the latter is what I was > expecting.
> > Are you sure this is a difference between 0.7 and current SVN? Can > you check and then send me an example mat file with different behavior > for the two versions? > > See you, > > Matthew > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- A non-text attachment was scrubbed... Name: RMT110408.mat Type: application/octet-stream Size: 995 bytes Desc: not available URL: From eads at soe.ucsc.edu Sat Feb 21 20:59:27 2009 From: eads at soe.ucsc.edu (Damian Eads) Date: Sat, 21 Feb 2009 17:59:27 -0800 Subject: [SciPy-dev] scipy.spatial 'output' args In-Reply-To: <7A90652A-D7F9-4B1B-8E9F-17B87764583D@cs.toronto.edu> References: <7A90652A-D7F9-4B1B-8E9F-17B87764583D@cs.toronto.edu> Message-ID: <91b4b1ab0902211759n3d9dec36rc78d3ff9ebbef78f@mail.gmail.com> Hi David, On Fri, Feb 20, 2009 at 1:21 AM, David Warde-Farley wrote: > Howdy, > > This is mostly a question for Damian (sorry, I seem to be bugging you > on a lot of fronts!). I'm wondering if there's a principled reason why > the pdist/squareform/etc. functions don't allow you to specify an > output array. > > It seems hard to justify not having a way of avoiding repeated memory > allocations if you're doing this more than once (as points change, for > example -- if you're curious I'm implementing a 2-D embedding > algorithm and the objective function relies on a lot of distances > being recomputed). There is no reason why I did not provide an 'out' parameter for preallocated memory in these functions. This is probably just a MATLABism oversight. I'd encourage you to add this feature.
If you do, please make sure to check the following: * the array holds the right number of values, * it is C contiguous (the C functions don't do any special striding), * the data type of the out array is 'f'. Please provide regression tests, making sure an exception is thrown when these conditions are not met. See 'test_distance.py' for the nose tests. Thanks so much for agreeing to add this feature. Damian > My guess is this is a MATLABism and that this was just an oversight. > Is there a reason I shouldn't go about trying to add it? Would other > people find this useful? > > David > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > -- ----------------------------------------------------- Damian Eads Ph.D. Student Jack Baskin School of Engineering, UCSC E2-489 1156 High Street Machine Learning Lab Santa Cruz, CA 95064 http://www.soe.ucsc.edu/~eads From still.horse at gmail.com Sat Feb 21 21:47:05 2009 From: still.horse at gmail.com (Kevin Daley) Date: Sat, 21 Feb 2009 21:47:05 -0500 Subject: [SciPy-dev] new member. Message-ID: <93d69480902211847le067bd6re56a562c608af732@mail.gmail.com> Hi, all. Just want to let everyone know I'm here. Name's Kevin Daley, from Atlanta. What I know: math: everything through linear algebra Lie calculus/Functional Calculus/Geometric Calculus (usually) chaos theory differential geometry (still needs a bit of polishing) Functional Integrals Stochastic calculus Fourier Theory/Harmonic Analysis/Wavelets science: Everything through Classical Mechanics (of course) Statistical Thermodynamics General Relativity/ESCK gravity (still needs a little work) Quantum Electrodynamics (am learning quantum optics) Fluid Dynamics/Plasmas/Turbulence statistical mechanics/information theory (some) physical neuroscience/EEG analysis computers: Python (8 years' experience. also fluent in C/C++) GPU parallelism (approx.
2-year CUDA experience, learning OpenCL) a variety of grid-based simulation techniques (the list is too long) wavelet compression (a little) computer graphics (a lot) basic FEM/level-set stuff bunch of api-level stuff: OpenGL, cg/glsl/hlsl, most widely-used python modules (and a few very arcane ones), miscellaneous: excellent verbal skills. when I'm available: most of the time. what I am most excited to help with: suggesting and helping out with new features/enhancements, extensions to new platforms, optimizations of numeric methods, user interface, integration with other modules. Everyone at scipy has been doing a great job...it's great software. But I don't see why we can't compete with industry technology's feature set----while providing extra modularity, flexibility, portability, and extensibility. You'll hear back from me soon. Cheers! From danielsjensen1 at gmail.com Sat Feb 21 23:52:02 2009 From: danielsjensen1 at gmail.com (Daniel Jensen) Date: Sat, 21 Feb 2009 21:52:02 -0700 Subject: [SciPy-dev] Request to help edit documentation Message-ID: <200902212152.03089.danielsjensen1@gmail.com> I'm just following the instructions on the page: http://docs.scipy.org/numpy/Front%20Page/ for requesting rights to help edit documentation. I've found a few simple spelling errors in the numpy documentation and thought that I could help contribute a little. Just an example, the following page: http://docs.scipy.org/doc/numpy/reference/generated/numpy.array_str.html#numpy.array_str is missing the 'f' in 'floating point'.
-Daniel From matthew.brett at gmail.com Sun Feb 22 04:01:39 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 22 Feb 2009 01:01:39 -0800 Subject: [SciPy-dev] matlab io - request for testing In-Reply-To: <961fa2b40902202228s5752c88bu9dbf420674211536@mail.gmail.com> References: <1e2af89e0902191942t174d10bdr38d79e7fefa4f2d5@mail.gmail.com> <961fa2b40902202228s5752c88bu9dbf420674211536@mail.gmail.com> Message-ID: <1e2af89e0902220101x139136aycfbfb47a267f62d4@mail.gmail.com> Hi, On Fri, Feb 20, 2009 at 10:28 PM, Nathaniel Smith wrote: > On Thu, Feb 19, 2009 at 7:42 PM, Matthew Brett wrote: >> I have been beating up the matlab io rather severely in order to >> implement some cleanups, fixes, and add new options. >> >> I would very much appreciate it if people could pick up the current >> SVN and let me know whether they have any problems. > > I finally got a chance to test with my nasty file, and with r5561, it > now takes ~32 minutes of cpu time to load (as compared to ~5 minutes > for 0.7.0, and 3 seconds for 0.6.0). All the time is in > zlibstreams.py:read. Could you check current SVN again and see how it works? I've sped up zlibstreams and it's now saving memory on the read, at about a 12% drop in speed, now I think due to the overhead of the single extra function calls on many small reads. I'm unsure whether I want to leave zlibstreams in. It has the advantage of making skipping variables much faster and more memory efficient, and maybe some increase in memory efficiency as the variable is read, but still, the small performance penalty is annoying. 
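The skipping advantage described here can be illustrated with a minimal sketch. This is not the zlibstreams code: it assumes Python 3, and the `NAME:...;` header layout and the `peek` helper are made up for illustration (real MAT-file tags are laid out differently). It relies only on the standard-library `zlib.decompressobj` and the `max_length` argument of its `decompress` method, which caps how much output is produced.

```python
import zlib

# Stand-in for one compressed variable: a short header followed by a large body.
# The "NAME:...;" layout is hypothetical, chosen only to keep the example small.
header = b"NAME:my_matrix;"
body = bytes(10_000_000)  # ten million zero bytes as fake matrix data
compressed = zlib.compress(header + body)


def peek(compressed_bytes, n):
    """Decompress at most the first n bytes of a zlib stream."""
    d = zlib.decompressobj()
    # max_length caps the output size; input that was not needed
    # stays available in d.unconsumed_tail.
    return d.decompress(compressed_bytes, n)


# Read just enough to learn the variable name, without inflating the body.
first = peek(compressed, len(header))
name = first.split(b":")[1].rstrip(b";").decode()
```

A reader that only wants one named variable can decide from `name` whether to continue decompressing or move on, which is the memory saving being weighed against the per-call overhead.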
Best, Matthew From nwagner at iam.uni-stuttgart.de Sun Feb 22 04:30:05 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Sun, 22 Feb 2009 10:30:05 +0100 Subject: [SciPy-dev] New test failures scipy 0.8.0.dev5585 Message-ID: Hi all, I see some new test failures beside the special function issues (python2.4, scipy 0.8.0.dev5585, numpy 1.3.0.dev6450) ====================================================================== FAIL: Test generator for parametric tests ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python2.4/site-packages/nose-0.10.3-py2.4.egg/nose/case.py", line 182, in runTest self.test(*self.arg) File "/usr/lib/python2.4/site-packages/scipy/misc/tests/test_pilutil.py", line 35, in tst_fromimage assert img.min() >= imin AssertionError ====================================================================== FAIL: Test generator for parametric tests ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python2.4/site-packages/nose-0.10.3-py2.4.egg/nose/case.py", line 182, in runTest self.test(*self.arg) File "/usr/lib/python2.4/site-packages/scipy/misc/tests/test_pilutil.py", line 35, in tst_fromimage assert img.min() >= imin AssertionError ====================================================================== FAIL: test_morestats.test_fligner ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python2.4/site-packages/nose-0.10.3-py2.4.egg/nose/case.py", line 182, in runTest self.test(*self.arg) File "/usr/lib/python2.4/site-packages/scipy/stats/tests/test_morestats.py", line 117, in test_fligner (3.2282229927203558, 0.072379187848207877)) File "/usr/lib/python2.4/site-packages/numpy/testing/utils.py", line 303, in assert_array_equal verbose=verbose, header='Arrays are not equal') File 
"/usr/lib/python2.4/site-packages/numpy/testing/utils.py", line 295, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not equal (mismatch 100.0%) x: array([ 3.228, 0.072]) y: array([ 3.228, 0.072]) Nils From scott.sinclair.za at gmail.com Sun Feb 22 04:34:27 2009 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Sun, 22 Feb 2009 11:34:27 +0200 Subject: [SciPy-dev] Request to help edit documentation In-Reply-To: <200902212152.03089.danielsjensen1@gmail.com> References: <200902212152.03089.danielsjensen1@gmail.com> Message-ID: <6a17e9ee0902220134g5e1312abt8b97907a8c2b733c@mail.gmail.com> > 2009/2/22 Daniel Jensen : > I'm just following the instructions on the page: > http://docs.scipy.org/numpy/Front%20Page/ > for requesting rights to help edit documentation. I've found a few simple > spelling errors in the numpy documentation and thought that I could help > contribute a little. Just an example, the following page: > http://docs.scipy.org/doc/numpy/reference/generated/numpy.array_str.html#numpy.array_str > is missing the 'f' in 'floating point'. > -Daniel Hi Daniel, You'll need to register a user name first at: http://docs.scipy.org/numpy/accounts/register/ Once you've done so post your user name here and someone will give you the appropriate editing rights. 
Cheers, Scott From njs at pobox.com Sun Feb 22 05:05:41 2009 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 22 Feb 2009 02:05:41 -0800 Subject: [SciPy-dev] matlab io - request for testing In-Reply-To: <1e2af89e0902220101x139136aycfbfb47a267f62d4@mail.gmail.com> References: <1e2af89e0902191942t174d10bdr38d79e7fefa4f2d5@mail.gmail.com> <961fa2b40902202228s5752c88bu9dbf420674211536@mail.gmail.com> <1e2af89e0902220101x139136aycfbfb47a267f62d4@mail.gmail.com> Message-ID: <961fa2b40902220205n10f6da61le52d62c297c6055f@mail.gmail.com> On Sun, Feb 22, 2009 at 1:01 AM, Matthew Brett wrote: > On Fri, Feb 20, 2009 at 10:28 PM, Nathaniel Smith wrote: >> I finally got a chance to test with my nasty file, and with r5561, it >> now takes ~32 minutes of cpu time to load (as compared to ~5 minutes >> for 0.7.0, and 3 seconds for 0.6.0). All the time is in >> zlibstreams.py:read. > > Could you check current SVN again and see how it works? It's down to 4 seconds. Yay. > I've sped up zlibstreams and it's now saving memory on the read, at > about a 12% drop in speed, now I think due to the overhead of the > single extra function calls on many small reads. > > I'm unsure whether I want to leave zlibstreams in. It has the > advantage of making skipping variables much faster and more memory > efficient, and maybe some increase in memory efficiency as the > variable is read, but still, the small performance penalty is > annoying. IMHO, if it lets one load gigabyte-matrices without allocating gigabyte temp variables, then that's a qualitative difference that's worth a small slowdown. If not, then neither the memory savings nor the slowdown is large enough for me to care much. (I don't tend to save/load matlab files in my inner loops, personally.) The thing that does make me nervous is this code's fragility (as has been demonstrated repeatedly now). It's really non-obvious how small changes will affect its performance characteristics.
Having read your changes, it isn't at all obvious to me why it's faster now. And e.g. I had to read StringIO.py to understand why you were recreating the StringIO object on every __fill. Just looking at zlibstreams.py, it appears wasteful and should be removed, but now I think that doing so could make it super-slow again. Basically, I just don't want to have to come back at every release and complain about my weird files again... -- Nathaniel From gael.varoquaux at normalesup.org Sun Feb 22 06:29:30 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 22 Feb 2009 12:29:30 +0100 Subject: [SciPy-dev] matlab io - request for testing In-Reply-To: <961fa2b40902220205n10f6da61le52d62c297c6055f@mail.gmail.com> References: <1e2af89e0902191942t174d10bdr38d79e7fefa4f2d5@mail.gmail.com> <961fa2b40902202228s5752c88bu9dbf420674211536@mail.gmail.com> <1e2af89e0902220101x139136aycfbfb47a267f62d4@mail.gmail.com> <961fa2b40902220205n10f6da61le52d62c297c6055f@mail.gmail.com> Message-ID: <20090222112930.GC23273@phare.normalesup.org> On Sun, Feb 22, 2009 at 02:05:41AM -0800, Nathaniel Smith wrote: > Basically, I just don't want to have to come back at every release and > complain about my weird files again... Contribute tests? If possible this seems the best way to ensure consistency. 
Gaël From josef.pktd at gmail.com Sun Feb 22 07:37:48 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 22 Feb 2009 07:37:48 -0500 Subject: [SciPy-dev] New test failures scipy 0.8.0.dev5585 In-Reply-To: References: Message-ID: <1cd32cbb0902220437u4c639f5cj4c8d125d60236c23@mail.gmail.com> On Sun, Feb 22, 2009 at 4:30 AM, Nils Wagner wrote: > Hi all, > > I see some new test failures beside the special function > issues (python2.4, scipy 0.8.0.dev5585, numpy > 1.3.0.dev6450) > > ====================================================================== > FAIL: test_morestats.test_fligner > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/usr/lib/python2.4/site-packages/nose-0.10.3-py2.4.egg/nose/case.py", > line 182, in runTest > self.test(*self.arg) > File > "/usr/lib/python2.4/site-packages/scipy/stats/tests/test_morestats.py", > line 117, in test_fligner > (3.2282229927203558, 0.072379187848207877)) > File > "/usr/lib/python2.4/site-packages/numpy/testing/utils.py", > line 303, in assert_array_equal > verbose=verbose, header='Arrays are not equal') > File > "/usr/lib/python2.4/site-packages/numpy/testing/utils.py", > line 295, in assert_array_compare > raise AssertionError(msg) > AssertionError: > Arrays are not equal > > (mismatch 100.0%) > x: array([ 3.228, 0.072]) > y: array([ 3.228, 0.072]) > Fixed in trunk: the test now requires reduced precision instead of using assert_equal.
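A minimal sketch of why the two printed arrays look identical yet compare unequal, assuming the platforms differ only in the last bits of the result. The reference values come from the traceback above; the `computed` array is a hypothetical stand-in built with `np.nextafter`, and `decimal=11` is an illustrative tolerance, not necessarily the one committed to trunk.

```python
import numpy as np
from numpy.testing import assert_array_equal, assert_array_almost_equal

# Reference values copied from the traceback quoted above.
expected = np.array([3.2282229927203558, 0.072379187848207877])

# Hypothetical platform result: each value nudged to the next
# representable float, so it prints the same at 3 decimals.
computed = np.nextafter(expected, np.inf)


def exact_check_fails(actual, desired):
    """Return True if the exact comparison raises AssertionError."""
    try:
        assert_array_equal(actual, desired)
    except AssertionError:
        return True
    return False


# The exact comparison rejects a last-bit difference ...
exact_fails = exact_check_fails(computed, expected)

# ... while a reduced-precision comparison accepts it (raises nothing).
assert_array_almost_equal(computed, expected, decimal=11)
```

The tolerance-based check is the usual choice for floating-point results that may vary across compilers and platforms.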
The test passed on win32 with the equality check, so I didn't catch this. Thanks, Josef From njs at pobox.com Sun Feb 22 07:59:10 2009 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 22 Feb 2009 04:59:10 -0800 Subject: [SciPy-dev] matlab io - request for testing In-Reply-To: <20090222112930.GC23273@phare.normalesup.org> References: <1e2af89e0902191942t174d10bdr38d79e7fefa4f2d5@mail.gmail.com> <961fa2b40902202228s5752c88bu9dbf420674211536@mail.gmail.com> <1e2af89e0902220101x139136aycfbfb47a267f62d4@mail.gmail.com> <961fa2b40902220205n10f6da61le52d62c297c6055f@mail.gmail.com> <20090222112930.GC23273@phare.normalesup.org> Message-ID: <961fa2b40902220459l65b32346kff4e29f3f3b92e7@mail.gmail.com> On Sun, Feb 22, 2009 at 3:29 AM, Gael Varoquaux wrote: > On Sun, Feb 22, 2009 at 02:05:41AM -0800, Nathaniel Smith wrote: >> Basically, I just don't want to have to come back at every release and >> complain about my weird files again... > > Contribute tests? If possible this seems the best way to ensure > consistency. I would -- and I posted a link to the test file I'm using upthread -- but it's 300 megabytes and I don't know how to produce a smaller one. (The obvious tricks don't seem to work.) You're certainly welcome to include it if you *want*, but... -- Nathaniel From stefan at sun.ac.za Sun Feb 22 09:02:12 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Sun, 22 Feb 2009 16:02:12 +0200 Subject: [SciPy-dev] Server problems: resetting with cronjob? Message-ID: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> Hi all, The server has been down intermittently all weekend, which makes it hard to edit and close tickets. Could we please install a cron job to restart trac and whatever else is running once daily? Thanks, Stéfan From strawman at astraw.com Sun Feb 22 12:26:20 2009 From: strawman at astraw.com (Andrew Straw) Date: Sun, 22 Feb 2009 09:26:20 -0800 Subject: [SciPy-dev] Server problems: resetting with cronjob?
In-Reply-To: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> References: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> Message-ID: <49A18ABC.40407@astraw.com> Stéfan van der Walt wrote: > Hi all, > > The server has been down intermittently all weekend, which makes it > hard to edit and close tickets. > > Could we please install a cron job to restart trac and whatever else > is running once daily? > > Along these lines, if Trac (or any webapp) is run as an FCGI script on Apache, the standard configuration will restart the process automatically after a certain number of requests have been handled. From pwang at enthought.com Sun Feb 22 14:11:52 2009 From: pwang at enthought.com (Peter Wang) Date: Sun, 22 Feb 2009 13:11:52 -0600 Subject: [SciPy-dev] Server problems: resetting with cronjob? In-Reply-To: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> References: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> Message-ID: <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> On Feb 22, 2009, at 8:02 AM, Stéfan van der Walt wrote: > Hi all, > The server has been down intermittently all weekend, which makes it > hard to edit and close tickets. I have restarted the apache process; when I checked just now it was clearly hung. Lately the server has been experiencing abnormally high load, and we are devoting resources now to transitioning services from it to the new hardware at conference.scipy.org. Today I will start working on moving the mailman mailing lists over, and then I will coordinate with admins of individual subdomains to move the websites, trac, and svn repositories to the new hardware. I know that the server issues lately have been frustrating, and appreciate everyone's patience. > Could we please install a cron job to restart trac and whatever else > is running once daily? For the time being I'll be monitoring the server more closely as I work on it, and will manually do a graceful restart if necessary.
If you continue to have problems with connectivity, please let me know and we can do this as a last resort. Thanks, Peter From stefan at sun.ac.za Sun Feb 22 15:43:35 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Sun, 22 Feb 2009 22:43:35 +0200 Subject: [SciPy-dev] Server problems: resetting with cronjob? In-Reply-To: <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> References: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> Message-ID: <9457e7c80902221243s37728158s37aa2ae204844941@mail.gmail.com> 2009/2/22 Peter Wang : > Lately the server has been experiencing abnormally high load, and we > are devoting resources now to transitioning services from it to the > new hardware at conference.scipy.org. Today I will start working on > moving the mailman mailing lists over, and then I will coordinate with > admins of individual subdomains to move the websites, trac, and svn > repositories to the new hardware. Peter, thank you very much. I know that you are doing this in addition to your normal duties; we are all extremely grateful. Regards Stéfan From luis94855510 at gmail.com Sun Feb 22 16:24:09 2009 From: luis94855510 at gmail.com (Luis Saavedra) Date: Sun, 22 Feb 2009 18:24:09 -0300 Subject: [SciPy-dev] Another request to edit documentation... Message-ID: <49A1C279.1090608@gmail.com> Hi list, I'm learning to use the C-API for numpy and I had some problems... but I have already solved them myself, and I'd like to share that experience by working on the documentation, regards, Luis From gael.varoquaux at normalesup.org Sun Feb 22 16:27:03 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 22 Feb 2009 22:27:03 +0100 Subject: [SciPy-dev] Server problems: resetting with cronjob?
In-Reply-To: <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> References: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> Message-ID: <20090222212703.GR6701@phare.normalesup.org> On Sun, Feb 22, 2009 at 01:11:52PM -0600, Peter Wang wrote: > Lately the server has been experiencing abnormally high load, and we > are devoting resources now to transitioning services from it to the > new hardware at conference.scipy.org. Today I will start working on > moving the mailman mailing lists over, and then I will coordinate with > admins of individual subdomains to move the websites, trac, and svn > repositories to the new hardware. I am just wondering if this will change anything. I'd like to know where these high loads are coming from. I am afraid the same problems will come up with the new server. A maybe unrelated fact: the spammers are really devastating the moin instance. I don't know what to do about this. Cleaning it up takes ages, especially with how reactive it is. :(. Gaël From gael.varoquaux at normalesup.org Sun Feb 22 16:28:05 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 22 Feb 2009 22:28:05 +0100 Subject: [SciPy-dev] Another request to edit documentation... In-Reply-To: <49A1C279.1090608@gmail.com> References: <49A1C279.1090608@gmail.com> Message-ID: <20090222212805.GS6701@phare.normalesup.org> On Sun, Feb 22, 2009 at 06:24:09PM -0300, Luis Saavedra wrote: > Hi list, > I'm learning to use the C-API for numpy and I had some problems... but I > have already solved them myself, and I'd like to share that experience by working on the > documentation, Just create a login on the doc server: docs.scipy.org, and send me your user name. I'll authorize you. Thanks for helping out, Gaël From luis94855510 at gmail.com Sun Feb 22 16:29:14 2009 From: luis94855510 at gmail.com (Luis Saavedra) Date: Sun, 22 Feb 2009 18:29:14 -0300 Subject: [SciPy-dev] Another request to edit documentation...
In-Reply-To: <20090222212805.GS6701@phare.normalesup.org> References: <49A1C279.1090608@gmail.com> <20090222212805.GS6701@phare.normalesup.org> Message-ID: <49A1C3AA.5090909@gmail.com> Gael Varoquaux wrote: > On Sun, Feb 22, 2009 at 06:24:09PM -0300, Luis Saavedra wrote: > >> Hi list, >> > > >> I'm learning to use the C-API for numpy and I had some problems... but I >> have already solved them myself, and I'd like to share that experience by working on the >> documentation, >> > > Just create a login on the doc server: docs.scipy.org, and send me your > user name. I'll authorize you. > > Thanks for helping out, > > Gaël > lsaavedr From gael.varoquaux at normalesup.org Sun Feb 22 16:32:03 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 22 Feb 2009 22:32:03 +0100 Subject: [SciPy-dev] Another request to edit documentation... In-Reply-To: <49A1C3AA.5090909@gmail.com> References: <49A1C279.1090608@gmail.com> <20090222212805.GS6701@phare.normalesup.org> <49A1C3AA.5090909@gmail.com> Message-ID: <20090222213203.GU6701@phare.normalesup.org> On Sun, Feb 22, 2009 at 06:29:14PM -0300, Luis Saavedra wrote: > lsaavedr OK, you're authorized. Thanks for your involvement. Gaël From michael.abshoff at googlemail.com Sun Feb 22 16:40:20 2009 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Sun, 22 Feb 2009 13:40:20 -0800 Subject: [SciPy-dev] Server problems: resetting with cronjob?
In-Reply-To: <20090222212703.GR6701@phare.normalesup.org> References: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> <20090222212703.GR6701@phare.normalesup.org> Message-ID: <49A1C644.5030205@gmail.com> Gael Varoquaux wrote: > On Sun, Feb 22, 2009 at 01:11:52PM -0600, Peter Wang wrote: >> Lately the server has been experiencing abnormally high load, and we >> are devoting resources now to transitioning services from it to the >> new hardware at conference.scipy.org. Today I will start working on >> moving the mailman mailing lists over, and then I will coordinate with >> admins of individual subdomains to move the websites, trac, and svn >> repositories to the new hardware. > > I am just wondering if this will change anything. I'd like to know where > these high loads are coming from. I am afraid the same probems will come > up with the new server. > > A maybe unrelated fact: the spammers are really devastating the moin > instance. I don't know what to do about this. Cleaning it up takes ages, > especially with how reactive it is. :(. Hi, two tips of fighting spammers from the Sage project's wiki: * add a list of common Chinese words to LocalBadContent, i.e. http://wiki.sagemath.org/LocalBadContent Also make sure to clean out all the spammer attempts on the hard disk. I.e I deleted 6,000 directories in "pages" of the Cython wiki since Spam attempts are preserved and not actually deleted from disk. If you have a couple ten thousand of those in one directory this might make every wiki access painfully slow and impact the whole server. * upgrade to the latest moin moin release and activate the question captcha. Spam has dropped to zero in the last 3 months since we used it. 
> Gaël Cheers, Michael From gael.varoquaux at normalesup.org Sun Feb 22 16:49:28 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 22 Feb 2009 22:49:28 +0100 Subject: [SciPy-dev] Server problems: resetting with cronjob? In-Reply-To: <49A1C644.5030205@gmail.com> References: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> <20090222212703.GR6701@phare.normalesup.org> <49A1C644.5030205@gmail.com> Message-ID: <20090222214928.GW6701@phare.normalesup.org> On Sun, Feb 22, 2009 at 01:40:20PM -0800, Michael Abshoff wrote: > two tips of fighting spammers from the Sage project's wiki: > * add a list of common Chinese words to LocalBadContent, i.e. > http://wiki.sagemath.org/LocalBadContent > Also make sure to clean out all the spammer attempts on the hard disk. > I.e I deleted 6,000 directories in "pages" of the Cython wiki since Spam > attempts are preserved and not actually deleted from disk. If you have a > couple ten thousand of those in one directory this might make every wiki > access painfully slow and impact the whole server. > * upgrade to the latest moin moin release and activate the question > captcha. Spam has dropped to zero in the last 3 months since we used it. Thanks, that's useful. Gaël From stefan at sun.ac.za Sun Feb 22 18:16:39 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Mon, 23 Feb 2009 01:16:39 +0200 Subject: [SciPy-dev] Response to ticket #872: overwrite_a has no effect for cholesky Message-ID: <9457e7c80902221516l464a9a63j8ca5f6281677e7bc@mail.gmail.com> Hi all, I think ticket #872 may have uncovered a larger problem, in that none of the scipy.linalg functions support overwrite_a. Maybe I'm mistaken; could anyone shed some light?
Thanks Stéfan From jtravs at gmail.com Sun Feb 22 18:56:55 2009 From: jtravs at gmail.com (John Travers) Date: Sun, 22 Feb 2009 23:56:55 +0000 Subject: [SciPy-dev] some new ode solvers Message-ID: <3a1077e70902221556u794f39e1u74f833a812a11423@mail.gmail.com> Hi All, Attached is a patch which adds two new ODE solvers to the scipy.integrate.ode module. The solvers are dopri5 and dop853, which are explicit Runge-Kutta pairs originally developed by Dormand and Prince. The Fortran code was downloaded from: http://www.unige.ch/~hairer/software.html The license is clearly BSD and SciPy compatible. These are excellent solvers, described in detail in the authors' book: Solving Ordinary Differential Equations. Nonstiff Problems. 2nd edition. Springer Series in Comput. Math., vol. 8. The dopri5 code is what Matlab's ode45 is based on. I think they would be a good addition to SciPy and I have used them often (in Fortran). The attached patch tries to follow the (somewhat strange to me) coding practices of the current ode module. I have added them to the test suite, but note that there are not many tests! (I might add some if I get time). I have also tested them with my own code which uses the ode module. If I get the go ahead from some regular SciPy contributors then I'll go ahead and commit this patch. I think I still have SVN access, but I wanted this patch to be reviewed first as it has been a long time since I did anything with SciPy. Cheers, John -- Telephone: (+44) (0) 7739 105209 -------------- next part -------------- A non-text attachment was scrubbed...
Name: ode-diff.tgz Type: application/x-gzip Size: 16255 bytes Desc: not available URL: From robert.kern at gmail.com Sun Feb 22 19:21:07 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 22 Feb 2009 18:21:07 -0600 Subject: [SciPy-dev] some new ode solvers In-Reply-To: <3a1077e70902221556u794f39e1u74f833a812a11423@mail.gmail.com> References: <3a1077e70902221556u794f39e1u74f833a812a11423@mail.gmail.com> Message-ID: <3d375d730902221621s28e1641cu96f507f98f6fc3a7@mail.gmail.com> On Sun, Feb 22, 2009 at 17:56, John Travers wrote: > Hi All, > > Attached is a patch which adds two new ODE solvers to the > scipy.integrate.ode module. > The solvers are dopri5 and dop853, which are explicit Runge-Kutta > pairs originally developed > by Dormand and Prince. The fortran code was downloaded from: > > http://www.unige.ch/~hairer/software.html > > The license is clearly BSD and SciPy compatible. > > These are excellent solvers, described in detail in the authors book: > > Solving Ordinary Differential Equations. Nonstiff Problems. 2nd > edition. Springer Series in Comput. Math., vol. 8. > > The dopri5 code is what Matlab's ode45 is based on. > > I think they would be a good addition to SciPy and I have used them > often (in Fortran). The attached patch tries > to follow the (somewhat strange to me) coding practices of the current > ode module. I have added them to the > test suite, but note that there are not many tests! (I might add some > if I get time). I have also tested them with > my own code which uses the ode module. > > If I get the go ahead from some regular SciPy contributers then I'll > go ahead and commit this patch. I think I still > have SVN access, but I wanted this pacth to be reviewed first as it > has been a long time since I did anything with > SciPy. * Typo: "first 0rder" * You're missing dop.pyf * I don't like the conditional imports inside the classes. * Don't print things. Use warnings.warn() if you must. But otherwise, +1. 
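The last two review points can be sketched as follows. This is an illustration only: `SketchIntegrator`, `max_steps`, and the warning text are hypothetical names, not taken from ode.py or from the attached patch. The sketch shows a module-level import in place of a conditional import inside the class, and `warnings.warn()` in place of a print statement.

```python
import warnings  # imported once at module level, not inside the class


class SketchIntegrator(object):
    """Hypothetical solver wrapper, for illustration only."""

    def __init__(self, max_steps=500):
        self.max_steps = max_steps

    def run(self, steps_needed):
        """Pretend to integrate; warn instead of printing on trouble."""
        if steps_needed > self.max_steps:
            warnings.warn(
                "step limit of %d reached before completion" % self.max_steps,
                RuntimeWarning,
            )
            return False
        return True


# Callers can capture, silence, or escalate the warning themselves,
# which is not possible with a bare print.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ok = SketchIntegrator(max_steps=10).run(steps_needed=50)
```

A caller who prefers a hard failure could, under the same assumptions, use `warnings.simplefilter("error")` to turn the warning into an exception.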
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jtravs at gmail.com Sun Feb 22 19:36:01 2009 From: jtravs at gmail.com (John Travers) Date: Mon, 23 Feb 2009 00:36:01 +0000 Subject: [SciPy-dev] some new ode solvers In-Reply-To: <3d375d730902221621s28e1641cu96f507f98f6fc3a7@mail.gmail.com> References: <3a1077e70902221556u794f39e1u74f833a812a11423@mail.gmail.com> <3d375d730902221621s28e1641cu96f507f98f6fc3a7@mail.gmail.com> Message-ID: <3a1077e70902221636ib6c28d9w5f6535f4734f6fb3@mail.gmail.com> at 12:21 AM, Robert Kern wrote: > > * Typo: "first 0rder" Fixed. This was also in the original fortran code. > > * You're missing dop.pyf Fixed in the attached patch. > > * I don't like the conditional imports inside the classes. Neither do I, but as I said, I followed the coding style already in ode.py > > * Don't print things. Use warnings.warn() if you must. Same as above. I might get round to cleaing this if I find time, but generally I don't like to change others code. > > But otherwise, +1. Thanks. -------------- next part -------------- A non-text attachment was scrubbed... Name: ode-diff.tgz Type: application/x-gzip Size: 16968 bytes Desc: not available URL: From jh at physics.ucf.edu Sun Feb 22 22:34:39 2009 From: jh at physics.ucf.edu (Joe Harrington) Date: Sun, 22 Feb 2009 22:34:39 -0500 Subject: [SciPy-dev] new member. In-Reply-To: (scipy-dev-request@scipy.org) References: Message-ID: Hi Kevin, welcome to the list. > But I don't see why we can't compete with industry technology's > feature set----while providing extra modularity, flexibility, > portability, and extensibility. I think the answer is simply one of available labor and coordination, so if you're willing to work, you are part of the solution. Others will have differing opinions, but I feel we are weakest, still, in documentation and packaging. 
Both areas are improving, but both have a long way to go. So, if you want to get this package adopted fastest, helping in those areas would probably bring the biggest bang per hour spent. My position is that if we make this package easy for new people to install and learn (as in, click in synaptic or yum or whatever package manager you use, read an obviously-placed getting started doc, and start playing), it will spread rapidly. Teachers and professors will use it in their classes because they know anyone can run it on any machine, easily and for free. The resulting, enlarged community will then make it much easier to build out and maintain, since that will solve the labor problem. So, head on over to docs.scipy.org, or contact the packaging team, and dive in. (And if it isn't obvious, the same invitation goes to all our lurkers. Pitching in is easy, and very rewarding!) Welcome again, --jh-- From dwf at cs.toronto.edu Sun Feb 22 22:58:49 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Sun, 22 Feb 2009 22:58:49 -0500 Subject: [SciPy-dev] Response to ticket #872: overwrite_a has no effect for cholesky In-Reply-To: <9457e7c80902221516l464a9a63j8ca5f6281677e7bc@mail.gmail.com> References: <9457e7c80902221516l464a9a63j8ca5f6281677e7bc@mail.gmail.com> Message-ID: Hi Stefan, I just recently realized in another thread that a number of the BLAS wrappers in scipy.linalg.fblas have broken overwrite_foo. dgemv's seems to work, but dgemm's doesn't, for sure. Maybe these two things are related? David On 22-Feb-09, at 6:16 PM, Stéfan van der Walt wrote: > Hi all, > > I think ticket #872 may have uncovered a larger problem, in that none > of the scipy.linalg functions support overwrite_a. Maybe I'm > mistaken; could anyone shed some light?
> > Thanks > Stéfan From wnbell at gmail.com Sun Feb 22 23:51:31 2009 From: wnbell at gmail.com (Nathan Bell) Date: Sun, 22 Feb 2009 23:51:31 -0500 Subject: [SciPy-dev] new member. In-Reply-To: References: Message-ID: On Sun, Feb 22, 2009 at 10:34 PM, Joe Harrington wrote: > > My position is that if we make this package easy for new people to > install and learn (as in, click in synaptic or yum or whatever package > manager you use, read an obviously-placed getting started doc, and > start playing), it will spread rapidly. Teachers and professors will > use it in their classes because they know anyone can run it on any > machine, easily and for free. The resulting, enlarged community will > then make it much easier to build out and maintain, since that will > solve the labor problem. > +1 Unfortunately, those things are the least fun to work on :) Kevin, you might also look over the roadmap and open tickets to see if anything piques your interest. http://projects.scipy.org/scipy/scipy/roadmap http://projects.scipy.org/scipy/scipy/report/1 I primarily work on the sparse module, so if you're interested in sparse matrices and/or solvers I can suggest some ideas. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From oliphant at enthought.com Sun Feb 22 23:56:41 2009 From: oliphant at enthought.com (Travis E. Oliphant) Date: Sun, 22 Feb 2009 22:56:41 -0600 Subject: [SciPy-dev] new member. In-Reply-To: <93d69480902211847le067bd6re56a562c608af732@mail.gmail.com> References: <93d69480902211847le067bd6re56a562c608af732@mail.gmail.com> Message-ID: <49A22C89.1020600@enthought.com> Kevin Daley wrote: > Hi, all. > > Just want to let everyone know I'm here. Name's Kevin Daley, from > Atlanta. Hi Kevin, Welcome to the list! Jump in wherever you feel comfortable.
We are glad to have you and look forward to your contributions. Best regards, -Travis From oliphant at enthought.com Mon Feb 23 00:02:21 2009 From: oliphant at enthought.com (Travis E. Oliphant) Date: Sun, 22 Feb 2009 23:02:21 -0600 Subject: [SciPy-dev] some new ode solvers In-Reply-To: <3a1077e70902221556u794f39e1u74f833a812a11423@mail.gmail.com> References: <3a1077e70902221556u794f39e1u74f833a812a11423@mail.gmail.com> Message-ID: <49A22DDD.10904@enthought.com> John Travers wrote: > If I get the go ahead from some regular SciPy contributers then I'll > go ahead and commit this patch. I think I still > have SVN access, but I wanted this pacth to be reviewed first as it > has been a long time since I did anything with > SciPy. > +1 -Travis -- Travis Oliphant Enthought, Inc. (512) 536-1057 (office) (512) 536-1059 (fax) http://www.enthought.com oliphant at enthought.com From robert.kern at gmail.com Mon Feb 23 00:04:50 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 22 Feb 2009 23:04:50 -0600 Subject: [SciPy-dev] some new ode solvers In-Reply-To: <3a1077e70902221636ib6c28d9w5f6535f4734f6fb3@mail.gmail.com> References: <3a1077e70902221556u794f39e1u74f833a812a11423@mail.gmail.com> <3d375d730902221621s28e1641cu96f507f98f6fc3a7@mail.gmail.com> <3a1077e70902221636ib6c28d9w5f6535f4734f6fb3@mail.gmail.com> Message-ID: <3d375d730902222104g4b8510a9ve000bbd2edcc36ba@mail.gmail.com> On Sun, Feb 22, 2009 at 18:36, John Travers wrote: > at 12:21 AM, Robert Kern wrote: >> >> * Typo: "first 0rder" > > Fixed. This was also in the original fortran code. > >> >> * You're missing dop.pyf > > Fixed in the attached patch. > >> >> * I don't like the conditional imports inside the classes. > > Neither do I, but as I said, I followed the coding style already in ode.py > >> >> * Don't print things. Use warnings.warn() if you must. > > Same as above. I might get round to cleaing this if I find time, but > generally I don't > like to change others code. 
Well, it looks like some smartass went in and made those changes to the old code .... And that smartass will be happy to fix up yours when you check it in, if you like. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pav at iki.fi Mon Feb 23 04:03:50 2009 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 23 Feb 2009 09:03:50 +0000 (UTC) Subject: [SciPy-dev] some new ode solvers References: <3a1077e70902221556u794f39e1u74f833a812a11423@mail.gmail.com> Message-ID: Sun, 22 Feb 2009 23:56:55 +0000, John Travers wrote: > Attached is a patch which adds two new ODE solvers to the > scipy.integrate.ode module. [clip] I think this is a good point to discuss some API design decisions on scipy.integrate.ode. There are currently two main interfaces to ODE integration: - vode: a class, requires people to repeatedly call .integrate() to get values at separated points. Parameters set via method calls to the object. Uses DVODE/ZVODE. - odeint: a function, computes values at points given as arguments. Parameters set via (keyword) arguments to the function. Uses LSODA. Clearly, here we have one interface too many, and it's a bit of a mess. Either both LSODA and DVODE should be available only via one way (or both ways, as we decided to go with scipy.interpolate). Which to deprecate? Also, how to specify the integrator to use: choose the correct function, or specify the name of the integrator as a string? I'd perhaps like to see: - All integrators moved to classes (with CamelCase names). If you want to use eg. the ZVODE solver, you'd instantiate 'Zvode' class. - The 'integrate' method would support getting multiple time points at once. - There'd also be thin wrapper functions (with lowercase names), e.g. 'zvode' that would allow all solvers to have a simple odeint-type interface. 
- Both of the current 'ode' and 'odeint' interfaces would be dropped in Scipy 1.0, and deprecated before that. How would this sound? -- Pauli Virtanen From wnbell at gmail.com Mon Feb 23 05:08:23 2009 From: wnbell at gmail.com (Nathan Bell) Date: Mon, 23 Feb 2009 05:08:23 -0500 Subject: [SciPy-dev] matlab io - request for testing In-Reply-To: <1e2af89e0902191942t174d10bdr38d79e7fefa4f2d5@mail.gmail.com> References: <1e2af89e0902191942t174d10bdr38d79e7fefa4f2d5@mail.gmail.com> Message-ID: On Thu, Feb 19, 2009 at 10:42 PM, Matthew Brett wrote: > Hi, > > I have been beating up the matlab io rather severely in order to > implement some cleanups, fixes, and add new options. > > I would very much appreciate it if people could pick up the current > SVN and let me know whether they have any problems. > Adding Antonino, who made some comments in a related SciPy-User thread: http://article.gmane.org/gmane.comp.python.scientific.user/19614 -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From jtravs at gmail.com Mon Feb 23 05:38:53 2009 From: jtravs at gmail.com (John Travers) Date: Mon, 23 Feb 2009 10:38:53 +0000 Subject: [SciPy-dev] some new ode solvers In-Reply-To: References: <3a1077e70902221556u794f39e1u74f833a812a11423@mail.gmail.com> Message-ID: <3a1077e70902230238p58e59697gdb95fc3ee8517a6e@mail.gmail.com> OK the new solvers were commited in revision 5589. On the api re-design I have some notes below: On Mon, Feb 23, 2009 at 9:03 AM, Pauli Virtanen wrote: > I'd perhaps like to see: > - All integrators moved to classes (with CamelCase names). > If you want to use eg. the ZVODE solver, you'd instantiate 'Zvode' > class. +1 > - The 'integrate' method would support getting multiple time points at > once. > - There'd also be thin wrapper functions (with lowercase names), e.g. > 'zvode' that would allow all solvers to have a simple odeint-type > interface. 
> - Both of the current 'ode' and 'odeint' interfaces would be dropped in > Scipy 1.0, and deprecated before that. +1, I too thought the api could be much cleaner. One extra convenience 'feature' would be to have wrappers to use the real data solvers with complex input. For my stuff at least I find vode or dopri5, with the problem split into real and imaginary parts, with the translation occurring at each step (with more than 2**15 points), considerably faster than the full system under zvode. It would be easy to supply wrappers which converted the data to/from the rhs function and input data. From gael.varoquaux at normalesup.org Mon Feb 23 06:16:16 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 23 Feb 2009 12:16:16 +0100 Subject: [SciPy-dev] some new ode solvers In-Reply-To: References: <3a1077e70902221556u794f39e1u74f833a812a11423@mail.gmail.com> Message-ID: <20090223111616.GC19510@phare.normalesup.org> On Mon, Feb 23, 2009 at 09:03:50AM +0000, Pauli Virtanen wrote: > How would this sound? In general good, but keep a simple odeint interface, with maybe a keyword argument to specify which integrator to use (it could even take an integrator instance). I was trying to help my little sister doing her homework with SciPy this weekend, and grokking ODEs and integrators is hard enough without thinking of objects and classes. Gaël From nwagner at iam.uni-stuttgart.de Mon Feb 23 06:23:22 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Mon, 23 Feb 2009 12:23:22 +0100 Subject: [SciPy-dev] some new ode solvers In-Reply-To: <20090223111616.GC19510@phare.normalesup.org> References: <3a1077e70902221556u794f39e1u74f833a812a11423@mail.gmail.com> <20090223111616.GC19510@phare.normalesup.org> Message-ID: On Mon, 23 Feb 2009 12:16:16 +0100 Gael Varoquaux wrote: > On Mon, Feb 23, 2009 at 09:03:50AM +0000, Pauli Virtanen >wrote: >> How would this sound?
> > In general good, but keep a simple odeint interface, >with maybe a keyword > argument to specify which integrator to use (it could >even take an > integrator instance). +1 Nils From stefan at sun.ac.za Mon Feb 23 11:04:36 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 23 Feb 2009 18:04:36 +0200 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> Message-ID: <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> *[If you only have 30 seconds to read this email, read the **bold text only] * *Dear* SciPy *developer*s The past while has seen a rocky ride with the SciPy servers, but yesterday Peter Wang announced that he is attending to the situation. This, then, seems like the perfect time to *stand back and take a look at our infrastructure*, and whether we should continue with the current setup. To put this conversation into context, we have to face the facts: SciPy has a large user community relative to the number of developers. A big library of code, used by many scientists, is supported by a small handful of people all over the world. *We cannot afford* *a high barrier to contribution*, and we have to lower the effort it takes for a developer to merge contributed code. *I'd like to propose two changes* to the status quo: 1. *Change to a distributed revision control system*, encouraging more open collaboration. 2. *Determine guidelines for code acceptance*, in terms of unit tests, documentation and peer review. Allow me to motivate these changes, and then suggest practical approaches for their implementation: Subversion allows only a selected group of developers to change the SciPy source code. This does not encourage a culture of meritocracy, but worse, has practical implications, in that users cannot merge their own patches. 
I won't discuss the advantages of distributed revision control here, but note that it shifts responsibility from the current core developers to contributers; *that benefits us all!* This ties in with my second point: code review. The current developers have access to SVN because they are experienced programmers with knowledge of SciPy's scientific domains of application. We are unable to employ this scarce resource fully, because it simply takes too long to merge a patch from Trac, review it, *bring it up to scratch*, and commit it. *We have to put a system in place which allows contributers to take responsibility for their own patches, and for core developers to guide and advise during this process.* As it is, we have many patches waiting on Trac for up to a year or more without any feedback; that is not acceptable. My view on testing is simple: *untested code is probably broken code* (and I can show examples from the past year's commit logs to corroborate this statement). *As for documentation, we cannot afford to be without it. * Implementation: Enthought generously hosts SciPy, and I hope they will continue doing so. New software will need to be installed on the server, but we have many hands willing to tackle that task: David Cournapeau and myself included. Before deploying to scipy.org, *we will configure a *different* server as a proof of concept.* 1) *Distributed revision control system: David Cournapeau and myself have been test driving Git [1] on SciPy and NumPy for a while. It is fast, well supported, has great branch support, and is simple to use for the average contributor, while allowing powerful patch-carving for the more adventurous. * 2) *Ticketing back-end:* David is exploring RedMine [2], and I'd like to take a look at InDefero [3], but *we'll do a careful analysis* of trac-git (like FedoraHosted) too. Thank you for taking the time to deliberate on SciPy's future. I would love to hear your comments. 
Kind regards Stéfan [1] http://git.or.cz/course/svn.html [2] http://www.redmine.org/ [3] http://scipy.indefero.net/p/numpy/ [4] http://fedorahosted.org From cimrman3 at ntc.zcu.cz Mon Feb 23 11:17:38 2009 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Mon, 23 Feb 2009 17:17:38 +0100 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> Message-ID: <49A2CC22.8050006@ntc.zcu.cz> Hi Stéfan, Stéfan van der Walt wrote: > Implementation: > > Enthought generously hosts SciPy, and I hope they will continue doing so. > New software will need to be installed on the server, but we have many hands > willing to tackle that task: David Cournapeau and myself included. Before > deploying to scipy.org, *we will configure a *different* server as a proof > of concept.* > > 1) *Distributed revision control system: David Cournapeau and myself have > been test driving Git [1] on SciPy and NumPy for a while. It is fast, well > supported, has great branch support, and is simple to use for the average > contributor, while allowing powerful patch-carving for the more adventurous. > * Going git would make my life as an occasional numpy/scipy contributor really a lot easier, so big +1! cheers, r.
From charlesr.harris at gmail.com Mon Feb 23 11:29:18 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 23 Feb 2009 09:29:18 -0700 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> Message-ID: On Mon, Feb 23, 2009 at 9:04 AM, St?fan van der Walt wrote: > *[If you only have 30 seconds to read this email, read the **bold text > only]* > > *Dear* SciPy *developer*s > > The past while has seen a rocky ride with the SciPy servers, but yesterday > Peter Wang announced that he is attending to the situation. This, then, > seems like the perfect time to *stand back and take a look at our > infrastructure*, and whether we should continue with the current setup. > > To put this conversation into context, we have to face the facts: SciPy has > a large user community relative to the number of developers. A big library > of code, used by many scientists, is supported by a small handful of people > all over the world. *We cannot afford* *a high barrier to contribution*, > and we have to lower the effort it takes for a developer to merge > contributed code. > > *I'd like to propose two changes* to the status quo: > > 1. *Change to a distributed revision control system*, encouraging more > open collaboration. > 2. *Determine guidelines for code acceptance*, in terms of unit tests, > documentation and peer review. > > Allow me to motivate these changes, and then suggest practical approaches > for their implementation: > > Subversion allows only a selected group of developers to change the SciPy > source code. This does not encourage a culture of meritocracy, but worse, > has practical implications, in that users cannot merge their own patches. 
I > won't discuss the advantages of distributed revision control here, but note > that it shifts responsibility from the current core developers to > contributers; *that benefits us all!* > > This ties in with my second point: code review. The current developers > have access to SVN because they are experienced programmers with knowledge > of SciPy's scientific domains of application. We are unable to employ this > scarce resource fully, because it simply takes too long to merge a patch > from Trac, review it, *bring it up to scratch*, and commit it. *We have > to put a system in place which allows contributers to take responsibility > for their own patches, and for core developers to guide and advise during > this process.* As it is, we have many patches waiting on Trac for up to a > year or more without any feedback; that is not acceptable. > > My view on testing is simple: *untested code is probably broken code* (and > I can show examples from the past year's commit logs to corroborate this > statement). *As for documentation, we cannot afford to be without it. > * > Implementation: > > Enthought generously hosts SciPy, and I hope they will continue doing so. > New software will need to be installed on the server, but we have many hands > willing to tackle that task: David Cournapeau and myself included. Before > deploying to scipy.org, *we will configure a *different* server as a proof > of concept.* > > 1) *Distributed revision control system: David Cournapeau and myself have > been test driving Git [1] on SciPy and NumPy for a while. It is fast, well > supported, has great branch support, and is simple to use for the average > contributor, while allowing powerful patch-carving for the more adventurous. > * > I really like Git, but... the last time I looked windows support wasn't up to snuff. Does anyone have more recent feedback on the windows situation? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
From jtravs at gmail.com Mon Feb 23 11:39:45 2009 From: jtravs at gmail.com (John Travers) Date: Mon, 23 Feb 2009 16:39:45 +0000 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> Message-ID: <3a1077e70902230839v60809fddo9342182facff7877@mail.gmail.com> On Mon, Feb 23, 2009 at 4:29 PM, Charles R Harris wrote: > On Mon, Feb 23, 2009 at 9:04 AM, Stéfan van der Walt > wrote: >> >> 1) Distributed revision control system: David Cournapeau and myself have >> been test driving Git [1] on SciPy and NumPy for a while. It is fast, well >> supported, has great branch support, and is simple to use for the average >> contributor, while allowing powerful patch-carving for the more adventurous. > > I really like Git, but... the last time I looked windows support wasn't up > to snuff. Does anyone have more recent feedback on the windows situation? > MSysGit works pretty well these days, and is apparently being merged with official git. http://code.google.com/p/msysgit/ J From sturla at molden.no Mon Feb 23 11:43:06 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 23 Feb 2009 17:43:06 +0100 (CET) Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <3a1077e70902230839v60809fddo9342182facff7877@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3a1077e70902230839v60809fddo9342182facff7877@mail.gmail.com> Message-ID: > On Mon, Feb 23, 2009 at 4:29 PM, Charles R Harris > MSysGit works pretty well these days, and is apparently being merged > with official git. > > http://code.google.com/p/msysgit/ And there is of course Cygwin, which includes git. S.M.
From jtravs at gmail.com Mon Feb 23 11:43:24 2009 From: jtravs at gmail.com (John Travers) Date: Mon, 23 Feb 2009 16:43:24 +0000 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <3a1077e70902230839v60809fddo9342182facff7877@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3a1077e70902230839v60809fddo9342182facff7877@mail.gmail.com> Message-ID: <3a1077e70902230843nef38b79i2ece22509b3e03fb@mail.gmail.com> On Mon, Feb 23, 2009 at 4:39 PM, John Travers wrote: > On Mon, Feb 23, 2009 at 4:29 PM, Charles R Harris > wrote: >> On Mon, Feb 23, 2009 at 9:04 AM, Stéfan van der Walt >> wrote: >>> >>> 1) Distributed revision control system: David Cournapeau and myself have >>> been test driving Git [1] on SciPy and NumPy for a while. It is fast, well >>> supported, has great branch support, and is simple to use for the average >>> contributor, while allowing powerful patch-carving for the more adventurous. >> >> I really like Git, but... the last time I looked windows support wasn't up >> to snuff. Does anyone have more recent feedback on the windows situation? >> > > MSysGit works pretty well these days, and is apparently being merged > with official git. > > http://code.google.com/p/msysgit/ > See http://kylecordes.com/2008/04/30/git-windows-go/ for more info.
J From stefan at sun.ac.za Mon Feb 23 11:43:55 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 23 Feb 2009 18:43:55 +0200 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> Message-ID: <9457e7c80902230843o3b95576ev2e759690e6c09ecb@mail.gmail.com> Hi Chuck 2009/2/23 Charles R Harris : >> 1) Distributed revision control system: David Cournapeau and myself have >> been test driving Git [1] on SciPy and NumPy for a while. It is fast, well >> supported, has great branch support, and is simple to use for the average >> contributor, while allowing powerful patch-carving for the more adventurous. > > I really like Git, but... the last time I looked windows support wasn't up > to snuff. Does anyone have more recent feedback on the windows situation? I've read that msysgit (http://code.google.com/p/msysgit/) works well. From http://garrys-brain.blogspot.com/2008/04/git-for-windows-msysgit.html : """ I would say Git For Windows is very close to being "ready" and providing you are not in need of the more difficult corner cases it is ready for production use. The guys working on it have done a great job. """ I can't vouch for this information, so it would be great to hear from someone who tried it themselves. 
Cheers Stéfan From josef.pktd at gmail.com Mon Feb 23 11:43:59 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 23 Feb 2009 11:43:59 -0500 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <49A2CC22.8050006@ntc.zcu.cz> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> Message-ID: <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> On Mon, Feb 23, 2009 at 11:17 AM, Robert Cimrman wrote: > Hi Stéfan, > > Stéfan van der Walt wrote: >> Implementation: >> >> Enthought generously hosts SciPy, and I hope they will continue doing so. >> New software will need to be installed on the server, but we have many hands >> willing to tackle that task: David Cournapeau and myself included. Before >> deploying to scipy.org, *we will configure a *different* server as a proof >> of concept.* >> >> 1) *Distributed revision control system: David Cournapeau and myself have >> been test driving Git [1] on SciPy and NumPy for a while. It is fast, well >> supported, has great branch support, and is simple to use for the average >> contributor, while allowing powerful patch-carving for the more adventurous. >> * > > Going git would make my life as an occasional numpy/scipy contributor > really a lot easier, so big +1! > > cheers, > r. I'm pretty happy with svn; it is relatively simple and has good integration and GUI tools on Windows. From all I read, git would be a big barrier for casual users (of git). From all the descriptions I've read, git is powerful for "command line junkies" who remember a large number of commands and options but not for occasional users of it. But I never installed git, because some time ago when I compared Bazaar, mercurial, git still didn't have much support on windows.
One problem is that the Bazaar mirror of scipy on Launchpad still fails to import, but otherwise working with mirrors for the different version control systems would make creating patches easier for users of scipy; for example, before I had commit access to scipy, I used bzr-svn to maintain my local branches. My main problem with trac tickets are missing tests, not the actual applying of the patch or bugfixes. I think low test coverage and weak testing "culture" is more of a problem than the revision control system. From what I have seen in the scipy code, it is true that, if it doesn't have a test, it is broken with high probability. Josef From ctw at cogsci.info Mon Feb 23 11:44:41 2009 From: ctw at cogsci.info (Christoph T. Weidemann) Date: Mon, 23 Feb 2009 11:44:41 -0500 Subject: [SciPy-dev] Subclassed ndarray fails with ValueError when assigning to a sliced array In-Reply-To: <6b7179780902230838g2615d1e2u5940a489eed063ce@mail.gmail.com> References: <6b7179780902230838g2615d1e2u5940a489eed063ce@mail.gmail.com> Message-ID: Per, who created ticket 1023, tried to send the following message to the mailing list, but it bounced, so I am resending it on his behalf. ================================================================= Hi Folks: Someone mentioned that I should send an email to the mailing list when reporting a bug / issue, so here's the repeat of a new ticket I just opened: http://scipy.org/scipy/numpy/ticket/1023 I've been working on a subclass of ndarray that has named dimensions that you can use to slice within a custom __getitem__. For example, you can do something like this: x['time>0'] to slice the array where the time dimension is greater than zero. This is all working nicely, right up until I try to assign to a slice like this: x['time>0'] += 1 or this x['time>0'] = x['time>0'] + 1 giving me the following error: ValueError: field named time>0 not found.
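A minimal sketch of what may be going on in the report above, under stated assumptions: the class `DimArray`, its `time` attribute and the `'time>VALUE'` key syntax are hypothetical stand-ins, not Per's actual code. In Python, `x[key] += 1` is evaluated as a `__getitem__` call, an in-place add on the result, and then a `__setitem__` call with the same key. If only `__getitem__` is customised, the string key presumably reaches `ndarray.__setitem__` unchanged, where it would be looked up as a field name. Overriding `__setitem__` as well is one possible way around this:

```python
import numpy as np


class DimArray(np.ndarray):
    """Hypothetical 1-D array with coordinates stored in a 'time' attribute."""

    def __new__(cls, data, time):
        obj = np.asarray(data, dtype=float).view(cls)
        obj.time = np.asarray(time)
        return obj

    def __array_finalize__(self, obj):
        # Carry the coordinates over to views and results.
        self.time = getattr(obj, "time", None)

    def _mask(self, key):
        # Illustrative parser: only understands keys of the form 'name>value'.
        name, _, value = key.partition(">")
        axis = getattr(self, name.strip())
        return np.asarray(axis) > float(value)

    def __getitem__(self, key):
        if isinstance(key, str):
            key = self._mask(key)
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        # Without this override the string key goes straight to
        # ndarray.__setitem__, which does not know about 'time>0'.
        if isinstance(key, str):
            key = self._mask(key)
        super().__setitem__(key, value)


x = DimArray([1.0, 2.0, 3.0], time=[-1, 0, 1])
x['time>0'] += 1          # __getitem__, in-place add, then __setitem__
x['time>0'] = x['time>0'] + 1
print(x.tolist())
```

The sketch translates the string key into a boolean mask in both methods, so the read and the write address the same elements. Whether this matches the behaviour of NumPy 1.1.0 in every detail is not something the thread confirms.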
I'm not sure what is raising this error (the interpreter gives no module or line number that's raising it), so I'm scared it's actually in the interpreter or somewhere deep in the numpy code, but I figured I'd try here first. Any ideas for what's going on here? All assignments with standard slices work just fine. Is there some code in numpy that is checking the contents of __getitem__ and causing it to fail here? Thanks for any thoughts, Per PS-> I'm running this on NumPy 1.1.0 in Debian Lenny. From david at ar.media.kyoto-u.ac.jp Mon Feb 23 11:31:31 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 24 Feb 2009 01:31:31 +0900 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> Message-ID: <49A2CF63.9080806@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > > On Mon, Feb 23, 2009 at 9:04 AM, Stéfan van der Walt > wrote: > > *[If you only have 30 seconds to read this email, read the **bold > text only]* > > *Dear* SciPy *developer*s > > The past while has seen a rocky ride with the SciPy servers, but > yesterday Peter Wang announced that he is attending to the > situation. This, then, seems like the perfect time to *stand back > and take a look at our infrastructure*, and whether we should > continue with the current setup. > > To put this conversation into context, we have to face the facts: > SciPy has a large user community relative to the number of > developers. A big library of code, used by many scientists, is > supported by a small handful of people all over the world. *We > cannot afford* *a high barrier to contribution*, and we have to > lower the effort it takes for a developer to merge contributed code. > > *I'd like to propose two changes* to the status quo: > > 1. *Change to a distributed revision control system*, encouraging > more open collaboration. > 2.
*Determine guidelines for code acceptance*, in terms of unit > tests, documentation and peer review. > > Allow me to motivate these changes, and then suggest practical > approaches for their implementation: > > Subversion allows only a selected group of developers to change > the SciPy source code. This does not encourage a culture of > meritocracy, but worse, has practical implications, in that users > cannot merge their own patches. I won't discuss the advantages of > distributed revision control here, but note that it shifts > responsibility from the current core developers to contributers; > *that benefits us all!* > > This ties in with my second point: code review. The current > developers have access to SVN because they are experienced > programmers with knowledge of SciPy's scientific domains of > application. We are unable to employ this scarce resource fully, > because it simply takes too long to merge a patch from Trac, > review it, *bring it up to scratch*, and commit it. *We have to > put a system in place which allows contributers to take > responsibility for their own patches, and for core developers to > guide and advise during this process.* As it is, we have many > patches waiting on Trac for up to a year or more without any > feedback; that is not acceptable. > > My view on testing is simple: *untested code is probably broken > code* (and I can show examples from the past year's commit logs to > corroborate this statement). *As for documentation, we cannot > afford to be without it. > * > Implementation: > > Enthought generously hosts SciPy, and I hope they will continue > doing so. New software will need to be installed on the server, > but we have many hands willing to tackle that task: David > Cournapeau and myself included. 
Before deploying to scipy.org > , *we will configure a *different* server as a > proof of concept.* > > 1) *Distributed revision control system: David Cournapeau and > myself have been test driving Git [1] on SciPy and NumPy for a > while. It is fast, well supported, has great branch support, and > is simple to use for the average contributor, while allowing > powerful patch-carving for the more adventurous.* > > > I really like Git, but... the last time I looked windows support > wasn't up to snuff. Does anyone have more recent feedback on the > windows situation? It is not ideal: it is based on a bash shell. But it does not require cygwin anymore - you can grab an exe, and get it installed on your machine for e.g. cloning and submitting a patch to the bug tracker. If you want GUI, it won't work (but no DVCS has a decent GUI: TortoiseBZR and TortoiseHG are really far behind what I would expect from a reasonable GUI on windows). I think git will never be on par compared to other tools, because git is fundamentally engrained into the unix mentality (set of tools who communicate together through text). But after having used bzr for > 2 years, I am entirely convinced that git is far ahead bzr or even hg (I don't know much hg - I looked at it at some point because it had the best svn support, but I have not followed it recently - I still closely follow bzr development). One thing about git is that the speed factor is too much emphasized IMHO - even if git was as slow as bzr, I would prefer git today. My main worries about git usage for numpy/scipy are related to the bug tracker; tracking branches is more of a problem than I initially thought. Github has some very nice concepts, but still none of the git hosting projects can for example display the history graph, which is very helpful for newcomers I think (the tools exists locally, though). 
cheers, David From cimrman3 at ntc.zcu.cz Mon Feb 23 11:48:14 2009 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Mon, 23 Feb 2009 17:48:14 +0100 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <9457e7c80902230843o3b95576ev2e759690e6c09ecb@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <9457e7c80902230843o3b95576ev2e759690e6c09ecb@mail.gmail.com> Message-ID: <49A2D34E.1090509@ntc.zcu.cz> Stéfan van der Walt wrote: > Hi Chuck > > 2009/2/23 Charles R Harris : >>> 1) Distributed revision control system: David Cournapeau and myself have >>> been test driving Git [1] on SciPy and NumPy for a while. It is fast, well >>> supported, has great branch support, and is simple to use for the average >>> contributor, while allowing powerful patch-carving for the more adventurous. >> I really like Git, but... the last time I looked windows support wasn't up >> to snuff. Does anyone have more recent feedback on the windows situation? > > I've read that msysgit (http://code.google.com/p/msysgit/) works well. > From http://garrys-brain.blogspot.com/2008/04/git-for-windows-msysgit.html > : > > """ > I would say Git For Windows is very close to being "ready" and > providing you are not in need of the more difficult corner cases it is > ready for production use. The guys working on it have done a great > job. > """ > > I can't vouch for this information, so it would be great to hear from > someone who tried it themselves. I have tried git on windows XP a few days ago, and it was smooth - the installation using the usual wizard, launching git gui (no command line needed!), looking at an existing project... And did not done anything fancy, though. r.
From pgmdevlist at gmail.com Mon Feb 23 11:49:47 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 23 Feb 2009 11:49:47 -0500 Subject: [SciPy-dev] Subclassed ndarray fails with ValueError when assigning to a sliced array In-Reply-To: References: <6b7179780902230838g2615d1e2u5940a489eed063ce@mail.gmail.com> Message-ID: On Feb 23, 2009, at 11:44 AM, Christoph T. Weidemann wrote: > Per, who created ticket 1023 tried to send the following message to > the mailing list, but it bounced, so I am resending it on his behalf. Could you post the relevant code ? From cimrman3 at ntc.zcu.cz Mon Feb 23 11:51:07 2009 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Mon, 23 Feb 2009 17:51:07 +0100 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <49A2D34E.1090509@ntc.zcu.cz> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <9457e7c80902230843o3b95576ev2e759690e6c09ecb@mail.gmail.com> <49A2D34E.1090509@ntc.zcu.cz> Message-ID: <49A2D3FB.4070802@ntc.zcu.cz> Robert Cimrman wrote: > St?fan van der Walt wrote: >> Hi Chuck >> >> 2009/2/23 Charles R Harris : >>>> 1) Distributed revision control system: David Cournapeau and myself have >>>> been test driving Git [1] on SciPy and NumPy for a while. It is fast, well >>>> supported, has great branch support, and is simple to use for the average >>>> contributor, while allowing powerful patch-carving for the more adventurous. >>> I really like Git, but... the last time I looked windows support wasn't up >>> to snuff. Does anyone have more recent feedback on the windows situation? >> I've read that msysgit (http://code.google.com/p/msysgit/) works well. 
>> From http://garrys-brain.blogspot.com/2008/04/git-for-windows-msysgit.html >> : >> >> """ >> I would say Git For Windows is very close to being "ready" and >> providing you are not in need of the more difficult corner cases it is >> ready for production use. The guys working on it have done a great >> job. >> """ >> >> I can't vouch for this information, so it would be great to hear from >> someone who tried it themselves. > > I have tried git on windows XP a few days ago, and it was smooth - the > installation using the usual wizard, launching git gui (no command line > needed!), looking at an existing project... > > And did not done anything fancy, though. A little correction: And -> I I have tried: http://msysgit.googlecode.com/files/Git-1.6.1-preview20081227.exe r. From david at ar.media.kyoto-u.ac.jp Mon Feb 23 11:44:07 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 24 Feb 2009 01:44:07 +0900 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> Message-ID: <49A2D257.7070104@ar.media.kyoto-u.ac.jp> josef.pktd at gmail.com wrote: > On Mon, Feb 23, 2009 at 11:17 AM, Robert Cimrman wrote: > >> Hi St?fan, >> >> St?fan van der Walt wrote: >> >>> Implementation: >>> >>> Enthought generously hosts SciPy, and I hope they will continue doing so. >>> New software will need to be installed on the server, but we have many hands >>> willing to tackle that task: David Cournapeau and myself included. 
Before >>> deploying to scipy.org, *we will configure a *different* server as a proof >>> of concept.* >>> >>> 1) *Distributed revision control system: David Cournapeau and myself have >>> been test driving Git [1] on SciPy and NumPy for a while. It is fast, well >>> supported, has great branch support, and is simple to use for the average >>> contributor, while allowing powerful patch-carving for the more adventurous. >>> * >>> >> Going git would make my life as an occasional numpy/scipy contributor >> really a lot easier, so big +1! >> >> cheers, >> r. >> > > I'm pretty happy with svn; it is relatively simple and has good > integration and GUI tools on Windows. svn is simple for simple things - but those things are simple on bzr/hg/git whatever too. Cloning, diffing, committing; those are the same on every tool. More advanced things are really a PITA in svn - svn is actually extremely counterintuitive IMHO. I mean, what's intuitive about using copy to create a tag, really ? 50 % of the time I create a branch for numpy, I screw up because I need like 10 commands, which fail half of the time for stupid errors or time out. It is really a drag to handle branches in svn, specially when it takes 10 minutes to merge 5 revisions (granted, this is at least partially due to the server hanging up). You're right that there is no decent GUI for git - tortoiseSVN has seen years of development, so obviously, all the other tools are far behind. And from a GUI POV, I think there are some deep, unsolved problems on how to present things simply. I don't think anybody has found a solution yet. > From all I read, git would be a > big barrier for casual users (of git). From all the descriptions I've > read, git is powerful for "command line junkies" who remember a large > number of commands and options but not for occasional users of it. Really, the basic commands are the same for all the tools out there: http://git-scm.com/ But I am not denying that git has some rough edges UI-wise. 
In particular, handling of remotes (forwarding changes to other repositories) is still too complicated for simple tasks, and some error messages are cryptic. But so is the case for svn, really. > My main problem with trac tickets are missing tests, not the actual > applying of the patch or bugfixes. I think low test coverage and weak > testing "culture" is more of a problem than the revision control > system. I agree that tools cannot solve the problem of lack of test. But to give you a scenario: I had a couple of hours to spend on triaging bugs on numpy for 1.3 release last WE. I have done almost nothing: I can't easily get tickets with attached patches, I can't control the bug tracker from the command line (I can't say: give me all the tickets since 1.2 with attached patches, give me all tickets on numpy.core since 1.2, etc...). I find the whole workflow extremely frustrating personally. cheers, David From stefan at sun.ac.za Mon Feb 23 12:08:22 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 23 Feb 2009 19:08:22 +0200 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> Message-ID: <9457e7c80902230908g7aef68d9o47f03f29e4c3617d@mail.gmail.com> Hi Josef 2009/2/23 > I'm pretty happy with svn; it is relatively simple and has good > integration and GUI tools on Windows. From all I read, git would be a > big barrier for casual users (of git). From all the descriptions I've > read, git is powerful for "command line junkies" who remember a large > number of commands and options but not for occasional users of it. 
> But I never installed git, because some time ago when I compared bazaar,
> mercurial, git still didn't have much support on windows.

Git used to be hard to use, even casually, but that has changed. Most of the basic commands are very similar to what they are in SVN. Some are even simpler; for example, it is much easier to merge under Git. I think many people fall into the trap of trying to do Revision Control Acrobatics before having mastered the basics (and I count myself in that group), but the link given (http://git.or.cz/course/svn.html) shows that simple things remain simple.

Apart from the technical benefits, distributed revision control has a profound impact upon the social structuring of a project. Flat is better than nested, even when it comes to code development :)

> My main problem with trac tickets are missing tests, not the actual
> applying of the patch or bugfixes.

I would like contributors to take responsibility for their own patches. If we have a clearly designated set of criteria for inclusion (i.e. tested, documented, peer reviewed), it becomes easier to get code *into* SciPy, providing the scaffolding needed to develop and mature code.

> I think low test coverage and weak
> testing "culture" is more of a problem than the revision control
> system. From what I have seen in the scipy code, it is true that, if
> it doesn't have a test, it is broken with high probability.

I agree, but I think that the two subjects go hand-in-hand. In essence, by making each user part of the development team, we give them the mandate to develop or solicit unit tests for their own patches.
Regards
Stéfan

From guyer at nist.gov Mon Feb 23 12:24:54 2009
From: guyer at nist.gov (Jonathan Guyer)
Date: Mon, 23 Feb 2009 12:24:54 -0500
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To: <49A2D257.7070104@ar.media.kyoto-u.ac.jp>
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp>
Message-ID:

On Feb 23, 2009, at 11:44 AM, David Cournapeau wrote:

> 50 % of the time I create a
> branch for numpy, I screw up because I need like 10 commands, which
> fail
> half of the time for stupid errors or time out.

I have no opinion on a switch of SciPy to git or anything else, and I'm generally interested in the prospects for distributed version control, but I really have to ask, what 10 commands could you possibly need to execute to create a branch in svn?

From cournape at gmail.com Mon Feb 23 13:10:42 2009
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 24 Feb 2009 03:10:42 +0900
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To:
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp>
Message-ID: <5b8d13220902231010i19753309m15a1da7b4e9ac673@mail.gmail.com>

On Tue, Feb 24, 2009 at 2:24 AM, Jonathan Guyer wrote:
>
> On Feb 23, 2009, at 11:44 AM, David Cournapeau wrote:
>
>> 50 % of the time I create a
>> branch for numpy, I screw up because I need like 10 commands, which
>> fail
>> half of the time for stupid errors or time out.
>
> I have no opinion on a switch of SciPy to git or anything else, and
> I'm generally interested in the prospects for distributed version
> control, but I really have to ask, what 10 commands could you possibly
> need to execute to create a branch in svn?

    svn cp trunk -> branch
    svnmerge switch branch
    svnmerge init trunk
    svn ci -F svnmerge-commit.txt
    svn switch trunk
    svnmerge init branch
    svn ci -F svnmerge-commit.txt

Ok, that's 7 :)

cheers,

David

From matthieu.brucher at gmail.com Mon Feb 23 13:22:19 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Mon, 23 Feb 2009 19:22:19 +0100
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To: <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com>
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com>
Message-ID:

> 2) Ticketing back-end: David is exploring RedMine [2], and I'd like to take
> a look at InDefero [3], but we'll do a careful analysis of trac-git (like
> FedoraHosted) too.

Why not staying with Trac ?

Matthieu
--
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From guyer at nist.gov Mon Feb 23 13:23:09 2009
From: guyer at nist.gov (Jonathan Guyer)
Date: Mon, 23 Feb 2009 13:23:09 -0500
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To: <5b8d13220902231010i19753309m15a1da7b4e9ac673@mail.gmail.com>
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <5b8d13220902231010i19753309m15a1da7b4e9ac673@mail.gmail.com>
Message-ID: <978A5007-A724-421C-BC6E-67CEC5D99159@nist.gov>

On Feb 23, 2009, at 1:10 PM, David Cournapeau wrote:

> svn cp trunk -> branch
> svnmerge switch branch
> svnmerge init trunk
> svn ci -F svnmerge-commit.txt
> svn switch trunk
> svnmerge init branch
> svn ci -F svnmerge-commit.txt

Ahah. The answer to that is, don't use svnmerge. I tried it after you told me about it on this list and it's a disaster, at least from our perspective. We have a protocol for merges, based directly on the guidance of the SVN Book, and it works very well. I was prepared to concede to you that *merging* changes takes way too many steps, but you said "create a branch", which piqued my curiosity.

While svnmerge appeared to dramatically simplify all of the tagging and commenting that we presently have to do, in practice I found that it made a complete hash of things. I have no doubt that it could be used safely, but I don't believe that it actually saves any effort over doing it manually. As a case in point, seven steps for what should only be one.

> Ok, that's 7 :)

J'accuse!
8^)

From jtravs at gmail.com Mon Feb 23 13:25:45 2009
From: jtravs at gmail.com (John Travers)
Date: Mon, 23 Feb 2009 18:25:45 +0000
Subject: [SciPy-dev] complex wrapper to ode
Message-ID: <3a1077e70902231025n2dc7e3deo284f7632b2e26c3a@mail.gmail.com>

Hello,

Attached is a patch which adds a wrapper class 'zode' to integrate.ode. It allows one to conveniently solve systems of ODEs with complex values using the existing real-valued solvers vode, dopri5, dop853, instead of zvode, by simply integrating the real/imag parts.

Is this worth committing? It appears to me to be considerably faster than zvode for my big systems of equations. I'm not sure why, as I intuitively thought all the data copying etc. would slow it down. But it is after all only a convenience, anyone can do it themselves.

Cheers,
John
-------------- next part --------------
A non-text attachment was scrubbed...
Name: zode.diff
Type: text/x-patch
Size: 4703 bytes
Desc: not available
URL:

From matthew.brett at gmail.com Mon Feb 23 13:29:25 2009
From: matthew.brett at gmail.com (Matthew Brett)
Date: Mon, 23 Feb 2009 10:29:25 -0800
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To: <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com>
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com>
Message-ID: <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com>

Hi Stefan, and team,

In the spirit of the 30 second reader, I split your email into your two parts.

1. Change to a distributed revision control system, encouraging more open collaboration.
2. Determine guidelines for code acceptance, in terms of unit tests, documentation and peer review.

In that order:

1) The distributed vs SVN issue is one that generates a lot of heat. Perhaps we could back off that one for now? We've been using bzr for a while. It's been a mixture of confusing and liberating.
Like writing tests first, it's very difficult to explain why DVCS is important. It does require a lot of discipline for it not to get out of hand. 2) Yes. Please. That would really help. How do we think we should best get there? If we don't switch to DVCS immediately? See you, Matthew From nwagner at iam.uni-stuttgart.de Mon Feb 23 13:33:04 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Mon, 23 Feb 2009 19:33:04 +0100 Subject: [SciPy-dev] scipy.org is down Message-ID: scipy.org is down again... From david at ar.media.kyoto-u.ac.jp Mon Feb 23 13:19:59 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 24 Feb 2009 03:19:59 +0900 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <978A5007-A724-421C-BC6E-67CEC5D99159@nist.gov> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <5b8d13220902231010i19753309m15a1da7b4e9ac673@mail.gmail.com> <978A5007-A724-421C-BC6E-67CEC5D99159@nist.gov> Message-ID: <49A2E8CF.7040606@ar.media.kyoto-u.ac.jp> Jonathan Guyer wrote: > > Ahah. The answer to that is, don't use svnmerge. I tried it after you > told me about it on this list and it's a disaster, at least from our > perspective. We have a protocol for merges >, based directly on the guidance of the The SVN Book, and it works > very well. I was prepared to concede to you that *merging* changes > takes way too many steps, but you said "create a branch", which piqued > my curiosity. Let's say "using" branches is a PITA in svn. > > While svnmerge appeared to dramatically simplify all of the tagging > and commenting that we presently have to do, in practice I found that > it made a complete hash of things. 
> I have no doubt that it could be
> used safely, but I don't believe that it actually saves any effort
> over doing it manually. As a case in point, seven steps for what
> should only be one.

    git tag my_tag_name             # create a tag
    git checkout -b work_branch     # create a new branch work_branch
    git merge branch1               # merge branch1
    git log branch1..work_branch    # log the revisions in work_branch that are not in branch1

Note that I have not yet mentioned the speed: doing the above in svn takes like 1 minute or 2 for me, whereas it takes < 1s with git.

cheers,

David

From wnbell at gmail.com Mon Feb 23 13:37:05 2009
From: wnbell at gmail.com (Nathan Bell)
Date: Mon, 23 Feb 2009 13:37:05 -0500
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To: <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com>
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com>
Message-ID:

On Mon, Feb 23, 2009 at 11:04 AM, Stéfan van der Walt wrote:
>
> process. As it is, we have many patches waiting on Trac for up to a year or
> more without any feedback; that is not acceptable.
>
> My view on testing is simple: untested code is probably broken code (and I
> can show examples from the past year's commit logs to corroborate this
> statement). As for documentation, we cannot afford to be without it.
>

I agree that these are problems, but I don't see why a different revision management system or bugtracker is going to bring about qualitative change. If a patch has languished on Trac for a year it's because: (1) the patch is not going to be included and no one has closed it, (2) the relevant authorities lack the time, or (3) no one actively maintains that part of scipy.

Perhaps Git + whatever will be a better combination than SVN + Trac. However, I'd argue that having a dedicated maintainer/supervisor for each instance of scipy.X is more valuable, and the lack thereof is our current problem.
Can anyone claim that using SVN or Trac is so onerous that *it* is the problem? I own at least one 6+ month old ticket with a patch. I can locate this ticket with Trac in about 30 seconds. The problem is that integrating the patch would take a few hours of my time, and I simply haven't had time to dedicate to it. How many other patches are like this?

I'm neutral on Git vs. SVN since they seem roughly equivalent for basic tasks ( http://git.or.cz/course/svn.html ). However, I think the following are more significant problems:
- limited maintenance of scipy.X (i.e. who do we blame when tests fail?)
- distribution woes (setup.py build should just work)
- packaging woes (installers should just work, creating binary installers should be easy)
- unreasonably long release cycle (why commit a fix, or report a bug when the next version is 18 months away)
- lack of automated testing (build bots)

And I'd argue for:
- someone who we can spam when scipy.X fails
- a setup.py that didn't lead to questions about Fortran ABI incompatibilities
- a setup.py (or equivalent) with bdist_foo for every foo we care about
- a ~6 month cycle and nightly builds (with binary installers)
- a website where the scipy.X maintainer can see errors for their module on a dozen different platforms

--
Nathan Bell wnbell at gmail.com
http://graphics.cs.uiuc.edu/~wnbell/

From david at ar.media.kyoto-u.ac.jp Mon Feb 23 13:22:49 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 24 Feb 2009 03:22:49 +0900
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To:
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com>
Message-ID: <49A2E979.2040201@ar.media.kyoto-u.ac.jp>

Matthieu Brucher wrote:
>> 2) Ticketing back-end: David is exploring RedMine [2], and I'd like to take
>> a look at InDefero [3], but we'll do a careful analysis of trac-git (like
>> FedoraHosted) too.
>>
>
> Why not staying with Trac ?
>

- no multiple project support (mostly a problem for scikits)
- the ticket workflow is very awkward and too simplistic
- no command line interface (may be solved with the xmlrpc plugin, though, I have no experience with it)
- query system very primitive
- etc...

cheers,

David

From stefan at sun.ac.za Mon Feb 23 13:39:51 2009
From: stefan at sun.ac.za (Stéfan van der Walt)
Date: Mon, 23 Feb 2009 20:39:51 +0200
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To:
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com>
Message-ID: <9457e7c80902231039n3b1f8be0xeb62542da4b9f16b@mail.gmail.com>

2009/2/23 Matthieu Brucher :
>> 2) Ticketing back-end: David is exploring RedMine [2], and I'd like to take
>> a look at InDefero [3], but we'll do a careful analysis of trac-git (like
>> FedoraHosted) too.
>
> Why not staying with Trac ?

I'm still investigating Trac. It is installed on our students' code hosting service, and I would like to see whether trac-git is mature and well integrated. Do you have any experience with it?

Cheers
Stéfan

From rob.clewley at gmail.com Mon Feb 23 13:50:15 2009
From: rob.clewley at gmail.com (Rob Clewley)
Date: Mon, 23 Feb 2009 13:50:15 -0500
Subject: [SciPy-dev] some new ode solvers
In-Reply-To: <3a1077e70902221556u794f39e1u74f833a812a11423@mail.gmail.com>
References: <3a1077e70902221556u794f39e1u74f833a812a11423@mail.gmail.com>
Message-ID:

Hi John,

> Attached is a patch which adds two new ODE solvers to the
> scipy.integrate.ode module.
> The solvers are dopri5 and dop853, which are explicit Runge-Kutta
> pairs originally developed
> by Dormand and Prince. The fortran code was downloaded from:
>
> http://www.unige.ch/~hairer/software.html

This is good news, and the scipy module certainly needs an updated API.
I hope that previous discussions on this list about API changes will be looked up as there were some good suggestions then. I wonder how much extra work it would be to include H&W's stiff and delayed ODE and DAE solvers such as Radau, Retard, and Hem? Those would be of great value to Scipy users, I think, as there's little high-level language support available for those AFAIK (Radau is in PyDSTool but not the others). Thanks, Rob From david at ar.media.kyoto-u.ac.jp Mon Feb 23 13:34:58 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 24 Feb 2009 03:34:58 +0900 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> Message-ID: <49A2EC52.4020300@ar.media.kyoto-u.ac.jp> Nathan Bell wrote: > Perhaps Git + whatever will be a better combination than SVN + Trac. > However, I'd argue that having a dedicate maintainer/supervisor for > each instance of scipy.X is more valuable, and the lack thereof is our > current problem. > > Can anyone claim that using SVN or Trac is so onerous that *it* is the > problem? Yes, I claim this. Doing bug triaging in a web interface is already not a pleasant experience, but with trac, it just becomes very frustrating. When preparing for releases (massive bug triaging), I waste hours doing things which could be at least partially automated. > I'm neutral on Git vs. SVN since they seem roughly equivalent for > basic tasks ( http://git.or.cz/course/svn.html ). However, I think > the following are more significant problems: > - limited maintenance of scipy.X (i.e. who do we blame when tests fail?) 
> - distribution woes (setup.py build should just work)
> - packaging woes (installers should just work, creating binary
> installers should be easy)
> - unreasonably long release cycle (why commit a fix, or report a bug
> when the next version is 18 months away)

I think that I spent quite some time on most of those issues myself. Trac has been the most frustrating point in the whole process. For a release, you need the following from a bug POV:
- a good idea of bugs + regressions
- a good idea of what changed
- a quick way to retriage things

None of this is made easy with trac, at least without a command line interface. Also, my recent work on several build issues on windows, etc. was done in branches (to avoid breaking the trunk for everyone), but this is a huge time waster for me.

> And I'd argue for:
> - someone who we can spam when scipy.X fails
> - a setup.py that didn't lead to questions about Fortran ABI incompatibilities
> - a setup.py (or equivalent) with bdist_foo for every foo we care about
> - a ~6 month cycle and nightly builds (with binary installers)
> - a website where the scipy.X maintainer can see errors for their
> module on a dozen different platforms

I would argue those issues are not orthogonal to the quality of the tools we are using. The time I waste on trac and svn is time I don't spend on those issues, and this time easily goes up to hours now.

cheers,

David

From peter.skomoroch at gmail.com Mon Feb 23 13:57:56 2009
From: peter.skomoroch at gmail.com (Peter Skomoroch)
Date: Mon, 23 Feb 2009 13:57:56 -0500
Subject: [SciPy-dev] scipy.org is down
In-Reply-To:
References:
Message-ID:

What is the root cause of this? I was changing my password on the wiki at the exact time it went down.

On Mon, Feb 23, 2009 at 1:33 PM, Nils Wagner wrote:

> scipy.org is down again...
>
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>

--
Peter N.
Skomoroch 617.285.8348 http://www.datawrangling.com http://delicious.com/pskomoroch http://twitter.com/peteskomoroch -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Mon Feb 23 14:03:14 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 23 Feb 2009 11:03:14 -0800 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <49A2EC52.4020300@ar.media.kyoto-u.ac.jp> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2EC52.4020300@ar.media.kyoto-u.ac.jp> Message-ID: <1e2af89e0902231103p5c5d37b6p5dff9697e1fd84e3@mail.gmail.com> Hi, > Yes, I claim this. Doing bug triaging in a web interface is already not > a pleasant experience, but with trac, it just becomes very frustrating. > When preparing for releases (massive bug triaging), I waste hours doing > things which could be at least partially automated. Can I just check we've got a clear idea who is doing the most work of general scipy maintenance and release? I have the feeling that this group is David, Stefan, Jarrod at least? Then there are those of us who maintain or wrote packages. As the matlab io maintainer (more or less) I would not claim to have a good overview on general scipy release problems. Is there a successful template project workflow we can take up formally - matplotlib, ipython, python? 
Best, Matthew From guyer at nist.gov Mon Feb 23 14:03:38 2009 From: guyer at nist.gov (Jonathan Guyer) Date: Mon, 23 Feb 2009 14:03:38 -0500 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <49A2E8CF.7040606@ar.media.kyoto-u.ac.jp> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <5b8d13220902231010i19753309m15a1da7b4e9ac673@mail.gmail.com> <978A5007-A724-421C-BC6E-67CEC5D99159@nist.gov> <49A2E8CF.7040606@ar.media.kyoto-u.ac.jp> Message-ID: On Feb 23, 2009, at 1:19 PM, David Cournapeau wrote: > Let's say "using" branches is a PITA in svn. I don't quite concede that, but I'll grant that it's harder than it ought to be. Some of my tolerance probably stems from the fact that svn beats the snot out of cvs in this regard. That doesn't mean that svn shouldn't be better (and, as I understand, the latest revs are better, but we're not using them yet). From pav at iki.fi Mon Feb 23 14:06:32 2009 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 23 Feb 2009 19:06:32 +0000 (UTC) Subject: [SciPy-dev] The future of SciPy and its development infrastructure References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> Message-ID: Mon, 23 Feb 2009 10:29:25 -0800, Matthew Brett wrote: > In the spirit of the 30 second reader, I split your email into your two > parts. > > 1. Change to a distributed revision control system, encouraging more > open collaboration. > 2. Determine guidelines for code acceptance, in terms of unit tests, > documentation and peer review. > > In that order: > [clip] > 1) The distributed vs SVN issue is one that generates a lot of heat. 
> Perhaps we could back off that one for now? We've been using bzr for a
> while. It's been a mixture of confusing and liberating. Like writing
> tests first, it's very difficult to explain why DVCS is important. It
> does require a lot of discipline for it not to get out of hand.

Using a DVCS makes the life of a contributor easier:

- You can track upstream changes.
- It in general feels much more secure to have your upcoming contribution in a version control system from day 1 so that you can't lose anything. In practice, I've found myself using Git in Numpy/Scipy development, even though I have SVN commit access.
- You have a single standard way to publish your current version of the proposed change (i.e. push to a branch in a git repo somewhere). The current practice: multiple versions of patches on the mailing list or attached in Trac.

Granted, the first two you get with git-svn. But I assume many people don't know that such tools exist.

Having the official repo in a DVCS may also help the gatekeeper:

- There are code review tools for DVCSes. E.g., you can comment on a patch line-by-line in Github. Compare this to unpacking a .tar.gz sent to the mailing list, and commenting on it.
- You can track the contributor's changes easily. No need to guess which version the patch was based on, or which version of the patch is the latest.
- You can merge the contributor's changes in easily; a DVCS usually does a better job resolving conflicts than applying a patch.

> 2) Yes. Please. That would really help. How do we think we should
> best get there? If we don't switch to DVCS immediately?

The problem with patch review is mainly, I think, lack of manpower and dedicated maintainers for subcomponents. Also, it seems that it's not very clear to contributors how to submit their code, and what is required.

A DVCS could formalise and streamline part of the above. But it doesn't solve manpower and lack-of-responsibility issues.
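
The contributor-side points above (keep the change under version control from day 1, then publish it in one standard form) can be illustrated with a minimal sketch. It runs in a throwaway local repository rather than against the real SVN server, and every name in it (the `demo` repository, the `fix-doc-typo` branch, the file contents, the contributor identity) is hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Sketch only: a topic-branch workflow in a scratch repository.
# Repository name, branch name, identity and file contents are made up.
set -e

workdir=$(mktemp -d)
cd "$workdir"

git init -q demo
cd demo
# Set the identity locally so the sketch does not depend on global config.
git config user.name "Example Contributor"
git config user.email "contributor@example.com"

# Stand-in for the upstream history.
echo "def f(x): return x" > module.py
git add module.py
git commit -q -m "Initial upstream state"
base=$(git symbolic-ref --short HEAD)   # default branch name varies between git versions

# Keep the proposed change on its own branch from the start.
git checkout -q -b fix-doc-typo
echo "# documented" >> module.py
git commit -q -a -m "Document f"

# Export one patch file per commit, ready to attach to a ticket or mail to the list.
git format-patch -q -o ../patches "$base"..fix-doc-typo
patch_count=$(ls ../patches | wc -l | tr -d ' ')
echo "patches written: $patch_count"
```

With a real SVN upstream, `git svn clone` and `git svn rebase` would take the place of the stand-in history here; that part is left out because it needs network access to the repository.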
--
Pauli Virtanen

From strawman at astraw.com Mon Feb 23 14:08:36 2009
From: strawman at astraw.com (Andrew Straw)
Date: Mon, 23 Feb 2009 11:08:36 -0800
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To: <9457e7c80902231039n3b1f8be0xeb62542da4b9f16b@mail.gmail.com>
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <9457e7c80902231039n3b1f8be0xeb62542da4b9f16b@mail.gmail.com>
Message-ID: <49A2F434.5010607@astraw.com>

[I'm +1 on git, although my level of developer-ness should be weighted appropriately. (However, it's likely to be at least marginally higher if the switch to git is made.)]

I just came across bugs everywhere, which looks like an interesting in-repo bug tracker with potential for web GUIs: http://bugseverywhere.org/be/show/HomePage As a big plus for this crowd, it's written in Python.

-Andrew

Stéfan van der Walt wrote:
> 2009/2/23 Matthieu Brucher :
>>> 2) Ticketing back-end: David is exploring RedMine [2], and I'd like to take
>>> a look at InDefero [3], but we'll do a careful analysis of trac-git (like
>>> FedoraHosted) too.
>> Why not staying with Trac ?
>
> I'm still investigating Trac. It is installed on our students' code
> hosting service, and I would like to see whether trac-git is mature
> and well integrated. Do you have any experience with it?
>
> Cheers
> Stéfan

From wnbell at gmail.com Mon Feb 23 14:10:24 2009
From: wnbell at gmail.com (Nathan Bell)
Date: Mon, 23 Feb 2009 14:10:24 -0500
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To: <49A2EC52.4020300@ar.media.kyoto-u.ac.jp>
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2EC52.4020300@ar.media.kyoto-u.ac.jp>
Message-ID:

On Mon, Feb 23, 2009 at 1:34 PM, David Cournapeau wrote:
>> And I'd argue for:
>> - someone who we can spam when scipy.X fails
>> - a setup.py that didn't lead to questions about Fortran ABI incompatibilities
>> - a setup.py (or equivalent) with bdist_foo for every foo we care about
>> - a ~6 month cycle and nightly builds (with binary installers)
>> - a website where the scipy.X maintainer can see errors for their
>> module on a dozen different platforms
>
> I would argue those issues are not orthogonal to the quality of the
> tools we are using. The time I waste on trac and svn is time I don't
> spend on those issues, and this time easily go up to hours now.
>

The collective time wasted by Fortran ABI problems *alone* is 10x more than that wasted by the problems you seek to remedy. I've been using development versions of scipy for 2 years now and even I get burned by which fortran/BLAS/LAPACK I need to install.

--
Nathan Bell wnbell at gmail.com
http://graphics.cs.uiuc.edu/~wnbell/

From pwang at enthought.com Mon Feb 23 14:13:20 2009
From: pwang at enthought.com (Peter Wang)
Date: Mon, 23 Feb 2009 13:13:20 -0600
Subject: [SciPy-dev] scipy.org is down
In-Reply-To:
References:
Message-ID: <6060051C-609F-4755-B812-6F5ED1968F26@enthought.com>

> On Mon, Feb 23, 2009 at 1:33 PM, Nils Wagner
> wrote:
> scipy.org is down again...

It is back up now.
-Peter

From stefan at sun.ac.za Mon Feb 23 14:39:55 2009
From: stefan at sun.ac.za (Stéfan van der Walt)
Date: Mon, 23 Feb 2009 21:39:55 +0200
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To: <49A2F434.5010607@astraw.com>
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <9457e7c80902231039n3b1f8be0xeb62542da4b9f16b@mail.gmail.com> <49A2F434.5010607@astraw.com>
Message-ID: <9457e7c80902231139s7f54e521oe58a3c198d7189b7@mail.gmail.com>

Hi Andrew

2009/2/23 Andrew Straw :
> I just came across bugs everywhere, which looks like an interesting
> in-repo bug tracker with potential for web GUIs:
> http://bugseverywhere.org/be/show/HomePage As a big plus for this crowd,
> it's written in Python.

Thanks for the link -- a very novel way of tracking bugs!

Cheers
Stéfan

From cournape at gmail.com Mon Feb 23 14:40:28 2009
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 24 Feb 2009 04:40:28 +0900
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To:
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2EC52.4020300@ar.media.kyoto-u.ac.jp>
Message-ID: <5b8d13220902231140pedb9fb8hf02e07091077dd19@mail.gmail.com>

On Tue, Feb 24, 2009 at 4:10 AM, Nathan Bell wrote:
> On Mon, Feb 23, 2009 at 1:34 PM, David Cournapeau
> wrote:
>>> And I'd argue for:
>>> - someone who we can spam when scipy.X fails
>>> - a setup.py that didn't lead to questions about Fortran ABI incompatibilities
>>> - a setup.py (or equivalent) with bdist_foo for every foo we care about
>>> - a ~6 month cycle and nightly builds (with binary installers)
>>> - a website where the scipy.X maintainer can see errors for their
>>> module on a dozen different platforms
>>
>> I would argue those issues are not orthogonal to the quality of the
>> tools we
>> are using. The time I waste on trac and svn is time I don't
>> spend on those issues, and this time easily go up to hours now.
>>
>
> The collective time wasted by Fortran ABI problems *alone* is 10x more
> than that wasted by the problems you seek to remedy.

I think that's a bit unfair. I built ubuntu packages for scipy/numpy, I built a win32 binary which solved another common ABI problem, I have added unit tests to detect fortran ABI problems, I have set up a build bot on the build service from open source for automatic rpm builds with correct fortran ABI. I spent hours on a broken platform I don't even use because I think it is important for windows to be a first class citizen for numpy/scipy.

But I think you don't realize all the work that some of us need to do to make decent releases, properly tested on a vast range of platforms, without a lot of resources. Just looking at new bugs and assigning them takes a lot of time, and with an email interface and/or command line interface, it would be a matter of one minute or two.

I personally don't care too much about using git instead of svn - because git-svn gets me a lot of advantages already, without changing anyone's workflow. But concerning the bug tracker, something has to be done: I don't care what, as long as it enables command-line and offline handling.
cheers, David From stefan at sun.ac.za Mon Feb 23 14:42:44 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 23 Feb 2009 21:42:44 +0200 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2EC52.4020300@ar.media.kyoto-u.ac.jp> Message-ID: <9457e7c80902231142w28890956vb5f7028b7da396ed@mail.gmail.com> Hi Nathan 2009/2/23 Nathan Bell : > The collective time wasted by Fortran ABI problems *alone* is 10x more > than that wasted by the problems you seek to remedy. I've been using > development versions of scipy for 2 years now and even I get burned by > which fortran/BLAS/LAPACK I need to install. I take your point. Yet, developers can't fix bugs while they are frustrated with the system. We depend on David to look at these issues (I don't know many people who get so excited about build systems :-), so we should provide him (and ourselves!) with a pleasant environment in which to do his/our work. Regards Stéfan From cournape at gmail.com Mon Feb 23 14:47:03 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 24 Feb 2009 04:47:03 +0900 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <1e2af89e0902231103p5c5d37b6p5dff9697e1fd84e3@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2EC52.4020300@ar.media.kyoto-u.ac.jp> <1e2af89e0902231103p5c5d37b6p5dff9697e1fd84e3@mail.gmail.com> Message-ID: <5b8d13220902231147u274fa607y83323da0bbf6539a@mail.gmail.com> On Tue, Feb 24, 2009 at 4:03 AM, Matthew Brett wrote: > Hi, > >> Yes, I claim this. Doing bug triaging in a web interface is already not >> a pleasant experience, but with trac, it just becomes very frustrating.
>> When preparing for releases (massive bug triaging), I waste hours doing >> things which could be at least partially automated. > > Can I just check we've got a clear idea who is doing the most work of > general scipy maintenance and release? > > I have the feeling that this group is David, Stefan, Jarrod at least? Pauli has been involved too - for last scipy, Nathan has been heavily involved thanks to his great sparse contributions. > Then there are those of us who maintain or wrote packages. As the > matlab io maintainer (more or less) I would not claim to have a good > overview on general scipy release problems. > > Is there a successful template project workflow we can take up > formally - matplotlib, ipython, python? I don't know about those projects workflow, but I can tell you about the workflow I would enjoy: - launch the bug tracker interface - get all the new bugs since last time I did some bug triage - select N bugs to assign to contributor joe - look at all the bugs with patchs without review etc.... Doing this in trac takes a lot of time (or is even not possible, unless you type your own SQL queries). With something integrated to vi, a bit like an email interface, it would be much more pleasant. Imagine that instead of svn, you would have to use a web interface to commit, diff, log every revision. I think people would quickly become crazy, cheers, David From peter.skomoroch at gmail.com Mon Feb 23 14:51:59 2009 From: peter.skomoroch at gmail.com (Peter Skomoroch) Date: Mon, 23 Feb 2009 14:51:59 -0500 Subject: [SciPy-dev] Scientific packages for a distributed computing Amazon EC2 image? Message-ID: I'm collecting a wishlist of scientific and python related packages (numpy, scipy, etc) people would want installed on a Debian based Amazon EC2 machine image (AMI)for distributed computing. I'll make more information available as the machine image develops, some of these will also go into the Machetec2AMI. 
Several variants of the AMI should become available in the next month. Please feel free to add any packages you would want pre-installed on the following wiki page: http://scipy.org/SciPyAmazonAmi Let me know if you spot any potential license conflicts with listed software. -- Peter N. Skomoroch 617.285.8348 http://www.datawrangling.com http://delicious.com/pskomoroch http://twitter.com/peteskomoroch -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Feb 23 15:00:54 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 23 Feb 2009 15:00:54 -0500 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <49A2D257.7070104@ar.media.kyoto-u.ac.jp> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> Message-ID: <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> On Mon, Feb 23, 2009 at 11:44 AM, David Cournapeau wrote: > > I agree that tools cannot solve the problem of lack of test. But to give > you a scenario: I had a couple of hours to spend on triaging bugs on > numpy for 1.3 release last WE. I have done almost nothing: I can't > easily get tickets with attached patches, I can't control the bug > tracker from the command line (I can't say: give me all the tickets > since 1.2 with attached patches, give me all tickets on numpy.core since > 1.2, etc...). I find the whole workflow extremely frustrating personally. > This seems to me to be a problem with the (older) trac ticketing. Queries for attachment, I found, work ok, but I didn't find a query for tickets with recent comments. 
Going through the stats review tickets to see which have any relevant information is really slow, and I still never went through all of them to see if someone added a comment or not. Has this improved with the new version of trac, scipy trac is 0.10.2 and current trac is 0.11.3? But I don't see how changing the revision control system helps with this. I installed msysgit following the links provided and the git gui looks ok, although the file browser is missing the basic file information (revision numbers, dates, added to repository or not). I don't like the bash shell because I don't know the keys for basic things (or I have to look them up), but that's only a smaller problem. The question is whether the revision control should be easier for developer or easier for users to create patches, and that might not be the same system. Also, at least with bazaar and bzrsvn and, I guess, git-svn, I still don't see what the (major) disadvantage for branching is of using a mirror of the central svn repository in a decentralized version control. (However, I usually just work with only short lived branches to try out fixes and new code, or do selective merging manually. Personally, my second incentive for keeping essentially only one main branch (svn) in my own code is, that I am not very well organized with branches, and having 10 to 15 versions/branches of a package on my hard drive ends up just wasting space and time. Also, I like the svn integration in eclipse.) Josef I wrote this quite some time ago. But, being one of the (few?) pure Windows (and GUI) users, I'm still in favor of the current, familiar setup, with svn and trac. Code review tools are also available for svn (and in python, see google code), however, code review won't change much if the man power isn't there. 
From stefan at sun.ac.za Mon Feb 23 15:05:56 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 23 Feb 2009 22:05:56 +0200 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> Message-ID: <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> Hi Matthew 2009/2/23 Matthew Brett : > 2) Yes. Please. That would really help. How do we think we should > best get there? If we don't switch to DVCS immediately? It won't require a revolution (hopefully), just consensus amongst developers. I view anyone with SVN access as a gatekeeper. Gatekeepers should agree to a code of conduct, to which they are held. 1. No code enters SciPy unless it has had two pairs of eyes on it: reviewer and committer, reviewer and reviewer, reviewer and release manager, etc. All tickets ready for merging are marked in Trac for convenience. 2. No code enters SciPy unless it is fully documented. 3. No code enters SciPy unless it is fully tested (this holds for both bug-fixes and enhancements). It sounds tough, but it guarantees fewer bugs and a higher quality code base. I think it would be *less* intimidating for new contributors if they knew that everybody's code got reviewed, not just theirs, and that we'd like to work with them to improve their code to the point where it can be included.
Cheers Stéfan From robert.kern at gmail.com Mon Feb 23 15:06:58 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 23 Feb 2009 14:06:58 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> Message-ID: <3d375d730902231206h7c89e2e8yc2fea5d3dcfcbf55@mail.gmail.com> On Mon, Feb 23, 2009 at 10:04, Stéfan van der Walt wrote: > 1) Distributed revision control system: David Cournapeau and myself have > been test driving Git [1] on SciPy and NumPy for a while. It is fast, well > supported, has great branch support, and is simple to use for the average > contributor, while allowing powerful patch-carving for the more adventurous. While I really like DVCS in general, I don't think there is much benefit to switching. The various DVCS-SVN bridges account for most of the benefits, I think. > 2) Ticketing back-end: David is exploring RedMine [2], and I'd like to take > a look at InDefero [3], but we'll do a careful analysis of trac-git (like > FedoraHosted) too. You may also want to consider using Roundup for just bug tracking and forgoing "integrated" solutions like the above entirely. We don't use the Trac wiki for anything we couldn't do on the Moin site. While it is not entirely clear what is causing the scipy.org problems, Trac does appear to be poorly behaved at least on that machine. Dropping it for something more manageable IT-wise would be a benefit by itself. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco From gael.varoquaux at normalesup.org Mon Feb 23 15:15:49 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 23 Feb 2009 21:15:49 +0100 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2EC52.4020300@ar.media.kyoto-u.ac.jp> Message-ID: <20090223201549.GB5038@phare.normalesup.org> On Mon, Feb 23, 2009 at 02:10:24PM -0500, Nathan Bell wrote: > The collective time wasted by Fortran ABI problems *alone* is 10x more > than that wasted by the problems you seek to remedy. I've been using > development versions of scipy for 2 years now and even I get burned by > which fortran/BLAS/LAPACK I need to install. While I mostly agree with you that changing VCS will probably not bring huge improvements to the development process, the fact is that resolving the problem you are talking about (Fortran ABI compatibility) is very hard, and probably much harder than changing VCS. So I claim your remark is unfair. Gaël From cournape at gmail.com Mon Feb 23 15:16:32 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 24 Feb 2009 05:16:32 +0900 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <3d375d730902231206h7c89e2e8yc2fea5d3dcfcbf55@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231206h7c89e2e8yc2fea5d3dcfcbf55@mail.gmail.com> Message-ID: <5b8d13220902231216p65f13536lcb541cba34853d15@mail.gmail.com> On Tue, Feb 24, 2009 at 5:06 AM, Robert Kern wrote: > On Mon, Feb 23, 2009 at 10:04, Stéfan van der Walt wrote: > >> 1) Distributed revision control system: David Cournapeau and myself have >> been test driving Git [1] on SciPy and NumPy for a while.
It is fast, well >> supported, has great branch support, and is simple to use for the average >> contributor, while allowing powerful patch-carving for the more adventurous. > > While I really like DVCS in general, I don't think there is much > benefit to switching. The various DVCS-SVN bridges account for most of > the benefits, I think. I agree on this - having a "blessed" mirror so that anyone into DVCS could have a "reference" would be enough for me, at least as long as there is no good solution for bug tracking (the one advantage of switching to a DVCS is easier merging/branching, but I am worried about the workflow if the bug tracker cannot track branches). > >> 2) Ticketing back-end: David is exploring RedMine [2], and I'd like to take >> a look at InDefero [3], but we'll do a careful analysis of trac-git (like >> FedoraHosted) too. > > You may also want to consider using Roundup for just bug tracking and > forgoing "integrated" solutions like the above entirely. We don't use > the Trac wiki for anything we couldn't do on the Moin site. Ah, I did not know about Roundup, it looks really nice: at least from the feature set, it has everything I miss from Trac. The reason why I thought about Redmine is that it is easy to migrate to, and has support for the things I care about the most. I otherwise do not care about the solution as long as it is scriptable and ideally can be used offline. Would hosting Roundup be an option?
cheers, David From bsouthey at gmail.com Mon Feb 23 15:17:33 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 23 Feb 2009 14:17:33 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> Message-ID: <49A3045D.2000509@gmail.com> Hi, You may find the following PEP useful in discussing the first proposed change: Python PEP 0374 "Migrating from svn to a distributed VCS". http://www.python.org/dev/peps/pep-0374/ Bruce St?fan van der Walt wrote: > *[If you only have 30 seconds to read this email, read the **bold text > only]* > > *Dear* SciPy *developer*s > > The past while has seen a rocky ride with the SciPy servers, but > yesterday Peter Wang announced that he is attending to the situation. > This, then, seems like the perfect time to *stand back and take a > look at our infrastructure*, and whether we should continue with the > current setup. > > To put this conversation into context, we have to face the facts: > SciPy has a large user community relative to the number of > developers. A big library of code, used by many scientists, is > supported by a small handful of people all over the world. *We cannot > afford* *a high barrier to contribution*, and we have to lower the > effort it takes for a developer to merge contributed code. > > *I'd like to propose two changes* to the status quo: > > 1. *Change to a distributed revision control system*, encouraging more > open collaboration. > 2. *Determine guidelines for code acceptance*, in terms of unit tests, > documentation and peer review. > > Allow me to motivate these changes, and then suggest practical > approaches for their implementation: > > Subversion allows only a selected group of developers to change the > SciPy source code. 
This does not encourage a culture of meritocracy, > but worse, has practical implications, in that users cannot merge > their own patches. I won't discuss the advantages of distributed > revision control here, but note that it shifts responsibility from the > current core developers to contributers; *that benefits us all!* > > This ties in with my second point: code review. The current > developers have access to SVN because they are experienced programmers > with knowledge of SciPy's scientific domains of application. We are > unable to employ this scarce resource fully, because it simply takes > too long to merge a patch from Trac, review it, *bring it up to > scratch*, and commit it. *We have to put a system in place which > allows contributers to take responsibility for their own patches, and > for core developers to guide and advise during this process.* As it > is, we have many patches waiting on Trac for up to a year or more > without any feedback; that is not acceptable. > > My view on testing is simple: *untested code is probably broken code* > (and I can show examples from the past year's commit logs to > corroborate this statement). *As for documentation, we cannot afford > to be without it. > * > Implementation: > > Enthought generously hosts SciPy, and I hope they will continue doing > so. New software will need to be installed on the server, but we have > many hands willing to tackle that task: David Cournapeau and myself > included. Before deploying to scipy.org , *we will > configure a *different* server as a proof of concept.* > > 1) *Distributed revision control system: David Cournapeau and myself > have been test driving Git [1] on SciPy and NumPy for a while. 
It is > fast, well supported, has great branch support, and is simple to use > for the average contributor, while allowing powerful patch-carving for > the more adventurous.* > > 2) *Ticketing back-end:* David is exploring RedMine [2], and I'd like > to take a look at InDefero [3], but *we'll do a careful analysis* of > trac-git (like FedoraHosted) too. > > Thank you for taking the time to deliberate on SciPy's future. I > would love to hear your comments. > > Kind regards > St?fan > > [1] http://git.or.cz/course/svn.html > [2] http://www.redmine.org/ > [3] http://scipy.indefero.net/p/numpy/ > [4] http://fedorahosted.org > > > ------------------------------------------------------------------------ > > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > From pwang at enthought.com Mon Feb 23 15:26:00 2009 From: pwang at enthought.com (Peter Wang) Date: Mon, 23 Feb 2009 14:26:00 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <5b8d13220902231216p65f13536lcb541cba34853d15@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231206h7c89e2e8yc2fea5d3dcfcbf55@mail.gmail.com> <5b8d13220902231216p65f13536lcb541cba34853d15@mail.gmail.com> Message-ID: On Feb 23, 2009, at 2:16 PM, David Cournapeau wrote: > Would hosting roundup be an option ? Certainly, especially if moving to roundup is done as part of "The Big Transition" off of the old hardware, so I don't have to worry about moving the existing Trac instances/users/etc. It also separates the tasks of moving the wiki, the SVN repos, and ticket trackers. (I'm assuming that moving off of Trac for ticketing also means moving off of it for the wiki, unless people have particular love for the Trac wiki system.) 
-Peter From josef.pktd at gmail.com Mon Feb 23 15:27:39 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 23 Feb 2009 15:27:39 -0500 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <5b8d13220902231147u274fa607y83323da0bbf6539a@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2EC52.4020300@ar.media.kyoto-u.ac.jp> <1e2af89e0902231103p5c5d37b6p5dff9697e1fd84e3@mail.gmail.com> <5b8d13220902231147u274fa607y83323da0bbf6539a@mail.gmail.com> Message-ID: <1cd32cbb0902231227m56ac7829u3df4b81840375b1c@mail.gmail.com> On Mon, Feb 23, 2009 at 2:47 PM, David Cournapeau wrote: > On Tue, Feb 24, 2009 at 4:03 AM, Matthew Brett wrote: >> Hi, >> >>> Yes, I claim this. Doing bug triaging in a web interface is already not >>> a pleasant experience, but with trac, it just becomes very frustrating. >>> When preparing for releases (massive bug triaging), I waste hours doing >>> things which could be at least partially automated. >> >> Can I just check we've got a clear idea who is doing the most work of >> general scipy maintenance and release? >> >> I have the feeling that this group is David, Stefan, Jarrod at least? > > Pauli has been involved too - for last scipy, Nathan has been heavily > involved thanks to his great sparse contributions. > >> Then there are those of us who maintain or wrote packages. As the >> matlab io maintainer (more or less) I would not claim to have a good >> overview on general scipy release problems. >> >> Is there a successful template project workflow we can take up >> formally - matplotlib, ipython, python? 
> > I don't know about those projects workflow, but I can tell you about > the workflow I would enjoy: > - launch the bug tracker interface > - get all the new bugs since last time I did some bug triage > - select N bugs to assign to contributor joe > - look at all the bugs with patchs without review > etc.... > > Doing this in trac takes a lot of time (or is even not possible, > unless you type your own SQL queries). With something integrated to > vi, a bit like an email interface, it would be much more pleasant. > > Imagine that instead of svn, you would have to use a web interface to > commit, diff, log every revision. I think people would quickly become > crazy, > > cheers, > > David I was looking at the changes in 0.11. in trac and I found a trac commandshell (I just found it, I didn't try it out, the google code source is very recent) http://code.google.com/p/tracshell/ http://trac-hacks.org/wiki/TracShellScript Also the trac instance of edgwall.org has ticket query by last modified date. So, in comparing ticket systems the current trac release and not 0.10.2 should be compared to the alternatives Josef From matthew.brett at gmail.com Mon Feb 23 15:29:30 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 23 Feb 2009 12:29:30 -0800 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> Message-ID: <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> Hi Stefan, > 1. No code enters SciPy unless it had two pairs of eyes on it: > reviewer and committer, reviewer and reviewer, reviewer and release > manager, etc. All tickets ready for merging are marked in Trac for > convenience. > 2. 
No code enters SciPy unless it is fully documented. > 3. No code enters SciPy unless it is fully tested (this holds for both > bug-fixes and enhancements) Right. So, the real problem here is that the people doing the actual work have severe problems with the current workflow. It seems to me the issues here are: A) Do we agree in general to a more disciplined tests / review / accept cycle. B) What specifically are the problems that y'all are having, and what options are there for solving them. Would someone consider writing a workflow PEP for discussion? We need the use-cases clearly defined here, otherwise I feel we are going to get lost on the DVCS discussion. See you, Matthew From robert.kern at gmail.com Mon Feb 23 15:32:54 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 23 Feb 2009 14:32:54 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <5b8d13220902231216p65f13536lcb541cba34853d15@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231206h7c89e2e8yc2fea5d3dcfcbf55@mail.gmail.com> <5b8d13220902231216p65f13536lcb541cba34853d15@mail.gmail.com> Message-ID: <3d375d730902231232i7c9c28d8t4f5424acfb67ba23@mail.gmail.com> On Mon, Feb 23, 2009 at 14:16, David Cournapeau wrote: > On Tue, Feb 24, 2009 at 5:06 AM, Robert Kern wrote: >> On Mon, Feb 23, 2009 at 10:04, St?fan van der Walt wrote: >> >>> 1) Distributed revision control system: David Cournapeau and myself have >>> been test driving Git [1] on SciPy and NumPy for a while. It is fast, well >>> supported, has great branch support, and is simple to use for the average >>> contributor, while allowing powerful patch-carving for the more adventurous. >> >> While I really like DVCS in general, I don't think there is much >> benefit to switching. The various DVCS-SVN bridges account for most of >> the benefits, I think. 
> > I agree on this - having a "blessed" mirror so that anyone into DVCS > could have a "reference" would be enough for me, at least as long as > there is no good solution for bug tracking (the one advantage of > switching to a DVCS is easier merging/branching, but I am worried > about the workflow if the bug tracker cannot track branches). The user pastes in the URL of his branch. Bug trackers don't really "track branches"; they track the status of issues. Some bug trackers, like that in Trac and Redmine, have some special mark up to make it easy to refer to a particular revision in their repo browser. Is that what you are talking about? Remember that with a DVCS, it is intrinsically difficult to accomplish this; anyone can host their branches anywhere. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From stefan at sun.ac.za Mon Feb 23 15:35:01 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 23 Feb 2009 22:35:01 +0200 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <3d375d730902231206h7c89e2e8yc2fea5d3dcfcbf55@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231206h7c89e2e8yc2fea5d3dcfcbf55@mail.gmail.com> Message-ID: <9457e7c80902231235j14f9f3d1s274b070b48cabaa9@mail.gmail.com> Hi Robert 2009/2/23 Robert Kern : >> 1) Distributed revision control system: David Cournapeau and myself have >> been test driving Git [1] on SciPy and NumPy for a while. It is fast, well >> supported, has great branch support, and is simple to use for the average >> contributor, while allowing powerful patch-carving for the more adventurous. > > While I really like DVCS in general, I don't think there is much > benefit to switching. 
The various DVCS-SVN bridges account for most of > the benefits, I think. The thing I miss most is merge support. They are working on that for 1.6, but all the DVC systems have it now. If we don't switch, we should consider an official DVCS-based mirror, so that patches can be supplied as links to Github or elsewhere. > You may also want to consider using Roundup for just bug tracking and > forgoing "integrated" solutions like the above entirely. We don't use > the Trac wiki for anything we couldn't do on the Moin site. Thanks, I completely forgot about Roundup. I remember it from yonks ago when it won the Software Carpentry competition! Cheers St?fan From cournape at gmail.com Mon Feb 23 15:40:01 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 24 Feb 2009 05:40:01 +0900 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231206h7c89e2e8yc2fea5d3dcfcbf55@mail.gmail.com> <5b8d13220902231216p65f13536lcb541cba34853d15@mail.gmail.com> Message-ID: <5b8d13220902231240y4e8ae6ecl6bafd192aa847ce3@mail.gmail.com> On Tue, Feb 24, 2009 at 5:26 AM, Peter Wang wrote: > On Feb 23, 2009, at 2:16 PM, David Cournapeau wrote: > >> Would hosting roundup be an option ? > > Certainly, especially if moving to roundup is done as part of "The Big > Transition" off of the old hardware, so I don't have to worry about > moving the existing Trac instances/users/etc. Do you have experience in "integrating" wiki, svn and roundup ? Or is that something we could do on our side as an experiment first, so that you don't have too much to do ? 
David From seefeld at sympatico.ca Mon Feb 23 15:33:54 2009 From: seefeld at sympatico.ca (Stefan Seefeld) Date: Mon, 23 Feb 2009 15:33:54 -0500 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231206h7c89e2e8yc2fea5d3dcfcbf55@mail.gmail.com> <5b8d13220902231216p65f13536lcb541cba34853d15@mail.gmail.com> Message-ID: <49A30832.9010701@sympatico.ca> Peter Wang wrote: > On Feb 23, 2009, at 2:16 PM, David Cournapeau wrote: > > >> Would hosting roundup be an option ? >> > > Certainly, especially if moving to roundup is done as part of "The Big > Transition" off of the old hardware, As it happens, Roundup itself is right now in (somewhat) active development again, so this would be a good time to evaluate it and let us know if anything not available through extensions is missing. ('us' as in 'the Roundup developers') Thanks, Stefan -- ...ich hab' noch einen Koffer in Berlin...
From pav at iki.fi Mon Feb 23 15:41:22 2009 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 23 Feb 2009 20:41:22 +0000 (UTC) Subject: [SciPy-dev] The future of SciPy and its development infrastructure References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231206h7c89e2e8yc2fea5d3dcfcbf55@mail.gmail.com> <5b8d13220902231216p65f13536lcb541cba34853d15@mail.gmail.com> <3d375d730902231232i7c9c28d8t4f5424acfb67ba23@mail.gmail.com> Message-ID: Mon, 23 Feb 2009 14:32:54 -0600, Robert Kern wrote: [clip] >> I agree on this - having a "blessed" mirror so that anyone into DVCS >> could have a "reference" would be enough for me, at least as long as >> there is no good solution for bug tracking (the one advantage of >> switching to a DVCS is easier merging/branching, but I am worried about >> the workflow if the bug tracker cannot track branches). > > The user pastes in the URL of his branch. Bug trackers don't really > "track branches"; they track the status of issues. Some bug trackers, > like that in Trac and Redmine, have some special mark up to make it easy > to refer to a particular revision in their repo browser. Is that what > you are talking about? Remember that with a DVCS, it is intrinsically > difficult to accomplish this; anyone can host their branches anywhere. I'd think the markup/URLs shouldn't be a problem, since, as Robert says, one can just paste URLs. I would be a bit more worried about support for 'Milestones' and 'Versions'. AFAIK Trac (at least the 0.10.3 we have) is a bit primitive here, as a bug can be only assigned to a single version/branch. So it's not easy to query for bugs that have been fixed in trunk, but haven't been backported eg. to 0.7.x branch. 
-- Pauli Virtanen From cournape at gmail.com Mon Feb 23 15:45:04 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 24 Feb 2009 05:45:04 +0900 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <3d375d730902231232i7c9c28d8t4f5424acfb67ba23@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231206h7c89e2e8yc2fea5d3dcfcbf55@mail.gmail.com> <5b8d13220902231216p65f13536lcb541cba34853d15@mail.gmail.com> <3d375d730902231232i7c9c28d8t4f5424acfb67ba23@mail.gmail.com> Message-ID: <5b8d13220902231245h3c04c9a2qdd96fba2eec04505@mail.gmail.com> On Tue, Feb 24, 2009 at 5:32 AM, Robert Kern wrote: > On Mon, Feb 23, 2009 at 14:16, David Cournapeau wrote: >> On Tue, Feb 24, 2009 at 5:06 AM, Robert Kern wrote: >>> On Mon, Feb 23, 2009 at 10:04, Stéfan van der Walt wrote: >>> >>>> 1) Distributed revision control system: David Cournapeau and myself have >>>> been test driving Git [1] on SciPy and NumPy for a while. It is fast, well >>>> supported, has great branch support, and is simple to use for the average >>>> contributor, while allowing powerful patch-carving for the more adventurous. >>> >>> While I really like DVCS in general, I don't think there is much >>> benefit to switching. The various DVCS-SVN bridges account for most of >>> the benefits, I think. >> >> I agree on this - having a "blessed" mirror so that anyone into DVCS >> could have a "reference" would be enough for me, at least as long as >> there is no good solution for bug tracking (the one advantage of >> switching to a DVCS is easier merging/branching, but I am worried >> about the workflow if the bug tracker cannot track branches). > > The user pastes in the URL of his branch. Bug trackers don't really > "track branches"; they track the status of issues.
Some bug trackers, > like that in Trac and Redmine, have some special mark up to make it > easy to refer to a particular revision in their repo browser. Is that > what you are talking about? Yes: especially when reviewing for release, I like the ability to go from a bug to the revision and vice-versa. But my concern is a bit more general: I think there are still a lot of issues on how to integrate DVCS and bug tracking, and nobody really solved it yet. Some people have suggested putting the bugs themselves in the repository, launchpad is trying a lot of things, but is clearly struggling to get what's important and what's not, etc... DVCS have some very interesting UI issues, and I haven't seen a clear solution yet. cheers, David From robert.kern at gmail.com Mon Feb 23 15:48:48 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 23 Feb 2009 14:48:48 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231206h7c89e2e8yc2fea5d3dcfcbf55@mail.gmail.com> <5b8d13220902231216p65f13536lcb541cba34853d15@mail.gmail.com> <3d375d730902231232i7c9c28d8t4f5424acfb67ba23@mail.gmail.com> Message-ID: <3d375d730902231248q16fb8c0fq4918ce8debb7a8dd@mail.gmail.com> On Mon, Feb 23, 2009 at 14:41, Pauli Virtanen wrote: > Mon, 23 Feb 2009 14:32:54 -0600, Robert Kern wrote: > [clip] >>> I agree on this - having a "blessed" mirror so that anyone into DVCS >>> could have a "reference" would be enough for me, at least as long as >>> there is no good solution for bug tracking (the one advantage of >>> switching to a DVCS is easier merging/branching, but I am worried about >>> the workflow if the bug tracker cannot track branches). >> >> The user pastes in the URL of his branch. Bug trackers don't really >> "track branches"; they track the status of issues.
Some bug trackers, >> like that in Trac and Redmine, have some special mark up to make it easy >> to refer to a particular revision in their repo browser. Is that what >> you are talking about? Remember that with a DVCS, it is intrinsically >> difficult to accomplish this; anyone can host their branches anywhere. > > I'd think the markup/URLs shouldn't be a problem, since, as Robert says, > one can just paste URLs. > > I would be a bit more worried about support for 'Milestones' and > 'Versions'. AFAIK Trac (at least the 0.10.3 we have) is a bit primitive > here, as a bug can be only assigned to a single version/branch. So it's > not easy to query for bugs that have been fixed in trunk, but haven't > been backported eg. to 0.7.x branch. It's worth noting that Roundup isn't a bug tracker so much as it is a toolkit for constructing bug trackers (this is not always a good thing). The database schema is ridiculously flexible. The only required table is "user". You can make whatever fields you like according to the semantics that are appropriate for the workflow you want to use. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From robert.kern at gmail.com Mon Feb 23 15:53:55 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 23 Feb 2009 14:53:55 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <5b8d13220902231245h3c04c9a2qdd96fba2eec04505@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231206h7c89e2e8yc2fea5d3dcfcbf55@mail.gmail.com> <5b8d13220902231216p65f13536lcb541cba34853d15@mail.gmail.com> <3d375d730902231232i7c9c28d8t4f5424acfb67ba23@mail.gmail.com> <5b8d13220902231245h3c04c9a2qdd96fba2eec04505@mail.gmail.com> Message-ID: <3d375d730902231253v33dced7dkb01a02938d08de46@mail.gmail.com> On Mon, Feb 23, 2009 at 14:45, David Cournapeau wrote: > On Tue, Feb 24, 2009 at 5:32 AM, Robert Kern wrote: >> On Mon, Feb 23, 2009 at 14:16, David Cournapeau wrote: >>> On Tue, Feb 24, 2009 at 5:06 AM, Robert Kern wrote: >>>> On Mon, Feb 23, 2009 at 10:04, Stéfan van der Walt wrote: >>>> >>>>> 1) Distributed revision control system: David Cournapeau and myself have >>>>> been test driving Git [1] on SciPy and NumPy for a while. It is fast, well >>>>> supported, has great branch support, and is simple to use for the average >>>>> contributor, while allowing powerful patch-carving for the more adventurous. >>>> >>>> While I really like DVCS in general, I don't think there is much >>>> benefit to switching. The various DVCS-SVN bridges account for most of >>>> the benefits, I think. >>> >>> I agree on this - having a "blessed" mirror so that anyone into DVCS >>> could have a "reference" would be enough for me, at least as long as >>> there is no good solution for bug tracking (the one advantage of >>> switching to a DVCS is easier merging/branching, but I am worried >>> about the workflow if the bug tracker cannot track branches). >> >> The user pastes in the URL of his branch.
Bug trackers don't really >> "track branches"; they track the status of issues. Some bug trackers, >> like that in Trac and Redmine, have some special mark up to make it >> easy to refer to a particular revision in their repo browser. Is that >> what you are talking about? > > Yes: especially when reviewing for release, I like the ability to go > from a bug to the revision and vice-versa. With Roundup, for example, we can make a field for people to paste in the relevant branch URL. > But my concern is a bit more general: I think there are still a lot of > issues on how to integrate DVCS and bug tracking, and nobody really > solved it yet. Some people have suggested putting the bugs themselves > in the repository, launchpad is trying a lot of things, but is clearly > struggling to get what's important and what's not, etc... DVCS have > some very interesting UI issues, and I haven't seen a clear solution > yet. I think we should recognize that we will almost certainly not solve the general problem. We should not let this failure impede us from solving the more immediate problems. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pav at iki.fi Mon Feb 23 15:54:20 2009 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 23 Feb 2009 20:54:20 +0000 (UTC) Subject: [SciPy-dev] complex wrapper to ode References: <3a1077e70902231025n2dc7e3deo284f7632b2e26c3a@mail.gmail.com> Message-ID: Mon, 23 Feb 2009 18:25:45 +0000, John Travers wrote: > Attached is a patch which adds a wrapper class 'zode' to integrate.ode. > It allows one to conveniently solve systems of odes with complex values > using the existing real valued solvers vode, dopri5, dop853, instead of > zode, by simply integrating the real/imag parts. > > Is this worth committing?
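A minimal sketch of the idea in the quoted message above: a complex system is packed into a real vector of real and imaginary parts, handed to a real-valued integrator, and unpacked afterwards. This is not the attached patch. The names `rk4_real`, `pack`, `unpack` and `solve_complex` are hypothetical, and a toy fixed-step RK4 routine stands in for the real solvers (vode, dopri5, dop853) so that the example runs without SciPy.

```python
import cmath


def rk4_real(f, t0, y0, t1, steps=1000):
    """Toy fixed-step RK4 for real-valued systems (stand-in for a real solver)."""
    h = (t1 - t0) / steps
    t, y = t0, list(y0)
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, [a + h / 2 * b for a, b in zip(y, k1)])
        k3 = f(t + h / 2, [a + h / 2 * b for a, b in zip(y, k2)])
        k4 = f(t + h, [a + h * b for a, b in zip(y, k3)])
        y = [a + h / 6 * (p + 2 * q + 2 * r + s)
             for a, p, q, r, s in zip(y, k1, k2, k3, k4)]
        t += h
    return y


def pack(z):
    """Complex list -> real list: all real parts, then all imaginary parts."""
    return [x.real for x in z] + [x.imag for x in z]


def unpack(y):
    """Inverse of pack: real list of length 2n -> complex list of length n."""
    n = len(y) // 2
    return [complex(re, im) for re, im in zip(y[:n], y[n:])]


def solve_complex(f, t0, z0, t1, steps=1000):
    """Integrate a complex system z' = f(t, z) using only a real-valued solver."""
    def f_real(t, y):
        # Rebuild the complex state, evaluate the user's RHS, split it again.
        return pack(f(t, unpack(y)))
    return unpack(rk4_real(f_real, t0, pack(z0), t1, steps))


def rhs(t, z):
    # Assumed test problem: z' = i*z with z(0) = 1, exact solution exp(i*t).
    return [1j * z[0]]


z1 = solve_complex(rhs, 0.0, [1 + 0j], 1.0)
print(abs(z1[0] - cmath.exp(1j)))  # error against the exact solution
```

The packing doubles the state size, so every right-hand-side evaluation copies the data twice; that copying is the overhead the original poster expected to slow things down.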
Looks good to me, and may be generally useful, so I'm +1 But before committing, I'd suggest a couple of things: - The name 'zode' is slightly confusing vs. ZVODE and not very descriptive. Maybe 'complex_ode' would be better? This would leave us wiggle room later on with the naming... - Is it possible to do the real -> complex switch automatically, based on the type of return value from (a trial evaluation of) f? On second thought, this might be brittle. - Since 'ode' supports Jacobians, it'd be nice if the wrapper supported them, too. > It appears to me to be considerably faster than > zvode for my big systems of equations. I'm not sure why, as I > intuitively thought all the data copying etc. would slow it down. Is your RHS an analytic function of all of the variables? The ZVODE docs seem to mention this as a requirement. But I don't know if the ZVODE implementation itself is supposed to be fast. -- Pauli Virtanen From stefan at sun.ac.za Mon Feb 23 16:03:06 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Mon, 23 Feb 2009 23:03:06 +0200 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> Message-ID: <9457e7c80902231303u7f98de30sd5f119bdbe48bcd3@mail.gmail.com> 2009/2/23 Matthew Brett : > A) Do we agree in general to a more disciplined tests / review / accept cycle. > B) What specifically are the problems that y'all are having, and what > options are there for solving them. Current workflow: 1. Cook up a patch 2.
Apply the patch or, if you are not a dev, upload to trac So, currently, unreviewed, untested code ends up in SciPy, or languishes on Trac for a long time. Proposed workflow: 1. Cook up a patch 2. Attach the patch (or a URL to the patchset/branch) to the issue tracker with a REVIEW tag 3. Ping the mailing list or IRC to request a review (rinse and repeat) Workflow for dev: 1. Request a list of patches ready for review: review - Has tests [check] - Has docs [check] - Does what it is supposed to do [check] 2. Add a POSITIVE_REVIEW or NEGATIVE_REVIEW tag as appropriate 3. Request a list of patches ready to be merged (code can be merged if seen by two pairs of eyes: reviewer + committer, reviewer + reviewer, etc. In the end it must have "positive_reviews - negative_reviews >= 2"). Review the patch (this adds one pair of eyes) and merge if appropriate. That's the rough idea. Comments welcome. Cheers Stéfan From charlesr.harris at gmail.com Mon Feb 23 16:06:24 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 23 Feb 2009 14:06:24 -0700 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> Message-ID: On Mon, Feb 23, 2009 at 1:00 PM, wrote: > On Mon, Feb 23, 2009 at 11:44 AM, David Cournapeau > wrote: > > > > > I agree that tools cannot solve the problem of lack of test. But to give > > you a scenario: I had a couple of hours to spend on triaging bugs on > > numpy for 1.3 release last WE.
I have done almost nothing: I can't > > easily get tickets with attached patches, I can't control the bug > > tracker from the command line (I can't say: give me all the tickets > > since 1.2 with attached patches, give me all tickets on numpy.core since > > 1.2, etc...). I find the whole workflow extremely frustrating personally. > > > > This seems to me to be a problem with the (older) trac ticketing. > Queries for attachment, I found, work ok, but I didn't find a query > for tickets with recent comments. Going through the stats review > tickets to see which have any relevant information is really slow, and > I still never went through all of them to see if someone added a > comment or not. Has this improved with the new version of trac, > scipy trac is 0.10.2 and current trac is 0.11.3? > > But I don't see how changing the revision control system helps with this. > > I installed msysgit following the links provided and the git gui looks > ok, although the file browser is missing the basic file information > (revision numbers, dates, added to repository or not). I don't like > the bash shell because I don't know the keys for basic things (or I > have to look them up), but that's only a smaller problem. > Bash on windows is ugly anyways ;) > > The question is whether the revision control should be easier for > developer or easier for users to create patches, and that might not be > the same system. > > Also, at least with bazaar and bzrsvn and, I guess, git-svn, I still > don't see what the (major) disadvantage for branching is of using a > mirror of the central svn repository in a decentralized version > control. > I think git-svn takes care of most of the branching problem on a local basis. Where it falls down, IMHO, is in testing a branch on the buildbots and sharing a branch among two or three people. > (However, I usually just work with only short lived branches to try > out fixes and new code, or do selective merging manually.
Personally, > my second incentive for keeping essentially only one main branch (svn) > in my own code is, that I am not very well organized with branches, > and having 10 to 15 versions/branches of a package on my hard drive > ends up just wasting space and time. Also, I like the svn integration > in eclipse.) > I tend to agree that changing the VCS shouldn't be the first priority. It sounds like tracking the tickets and svn changes relevant to particular tagged releases might be where the effort should go. Chuck From stefan at sun.ac.za Mon Feb 23 16:07:26 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Mon, 23 Feb 2009 23:07:26 +0200 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <3d375d730902231253v33dced7dkb01a02938d08de46@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231206h7c89e2e8yc2fea5d3dcfcbf55@mail.gmail.com> <5b8d13220902231216p65f13536lcb541cba34853d15@mail.gmail.com> <3d375d730902231232i7c9c28d8t4f5424acfb67ba23@mail.gmail.com> <5b8d13220902231245h3c04c9a2qdd96fba2eec04505@mail.gmail.com> <3d375d730902231253v33dced7dkb01a02938d08de46@mail.gmail.com> Message-ID: <9457e7c80902231307t51449c5am1953775ca600717a@mail.gmail.com> 2009/2/23 Robert Kern : > I think we should recognize that we will almost certainly not solve > the general problem. We should not let this failure impede us from > solving the more immediate problems. I just love that paragraph :) S.
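The merge criterion from the proposed workflow earlier in this thread ("positive_reviews - negative_reviews >= 2") could be checked mechanically along these lines. This is only a sketch: the tag names come from the proposal, but the function names and the idea of storing tags as a plain list per patch are assumptions, not features of Trac or Roundup.

```python
def net_reviews(tags):
    """Positive minus negative review tags attached to one patch."""
    return tags.count("POSITIVE_REVIEW") - tags.count("NEGATIVE_REVIEW")


def ready_to_merge(tags, threshold=2):
    """True once the patch has been seen favourably by enough pairs of eyes."""
    return net_reviews(tags) >= threshold


# Hypothetical patches; the names are illustrative only.
patches = {
    "stats-fix": ["REVIEW", "POSITIVE_REVIEW"],
    "random-fix": ["REVIEW", "POSITIVE_REVIEW", "POSITIVE_REVIEW"],
    "disputed": ["REVIEW", "POSITIVE_REVIEW", "POSITIVE_REVIEW", "NEGATIVE_REVIEW"],
}

for name, tags in patches.items():
    print(name, net_reviews(tags), ready_to_merge(tags))

# The committer's own positive review counts as one more pair of eyes.
print(ready_to_merge(patches["stats-fix"] + ["POSITIVE_REVIEW"]))
```

Counting a net score rather than raw approvals means a single negative review has to be outweighed, not merely outnumbered by one.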
From robert.kern at gmail.com Mon Feb 23 16:11:06 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 23 Feb 2009 15:11:06 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> Message-ID: <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> On Mon, Feb 23, 2009 at 15:06, Charles R Harris wrote: > I think git-svn takes care of most of the branching problem on a local > basis. Where it falls down, IMHO, is in testing a branch on the buildbots and > sharing a branch among two or three people. Why is that? Push your git branch to github. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco From josef.pktd at gmail.com Mon Feb 23 16:15:51 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 23 Feb 2009 16:15:51 -0500 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <9457e7c80902231303u7f98de30sd5f119bdbe48bcd3@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <9457e7c80902231303u7f98de30sd5f119bdbe48bcd3@mail.gmail.com> Message-ID: <1cd32cbb0902231315g677e59d3i4b25196efde20902@mail.gmail.com> On Mon, Feb 23, 2009 at 4:03 PM, Stéfan van der Walt wrote: > 2009/2/23 Matthew Brett : >> A) Do we agree in general to a more disciplined tests / review / accept cycle. >> B) What specifically are the problems that y'all are having, and what >> options are there for solving them. > > Current workflow: > > 1. Cook up a patch > 2. Apply the patch or, if you are not a dev, upload to trac > > So, currently, unreviewed, untested code ends up in SciPy, or > languishes on Trac for a long time. > > Proposed workflow: > > 1. Cook up a patch > 2. Attach the patch (or a URL to the patchset/branch) to the issue > tracker with a REVIEW tag > 3. Ping the mailing list or IRC to request a review (rinse and repeat) > > Workflow for dev: > > 1. Request a list of patches ready for review: review > - Has tests [check] > - Has docs [check] > - Does what it is supposed to do [check] > 2. Add a POSITIVE_REVIEW or NEGATIVE_REVIEW tag as appropriate > 3. Request a list of patches ready to be merged (code can be merged if > seen by two pairs of eyes: reviewer + committer, reviewer + reviewer, > etc. In the end it must have "positive_reviews - negative_reviews >= > 2").
Review the patch (this adds one pair of eyes) and merge if > appropriate. > > That's the rough idea. Comments welcome. > > Cheers > Stéfan I agree it is a good idea, theoretically, but maybe I'm slightly pessimistic: almost the only comments or reviews that I got for my bugfixes in scipy.stats were from Per Brodtkorb, and my tickets and patches were sitting for half a year in Trac. If I have to wait for a review, then ... (and I'm still waiting for 2 fixes to numpy.random) Josef From cournape at gmail.com Mon Feb 23 16:17:23 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 24 Feb 2009 06:17:23 +0900 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> Message-ID: <5b8d13220902231317i799eb927t7d0e44010ed2f8e4@mail.gmail.com> On Tue, Feb 24, 2009 at 6:11 AM, Robert Kern wrote: > On Mon, Feb 23, 2009 at 15:06, Charles R Harris > wrote: > >> I think git-svn takes care of most of the branching problem on a local >> basis. Where it falls down, IMHO, is in testing a branch on the buildbots and >> sharing a branch among two or three people. > > Why is that? Push your git branch to github. This can cause trouble when dcommitting back to svn. I don't remember the exact scenario, but I managed to break some things (like committing twice to svn) because I was not careful. It is easy to forget getting svn references in addition to git ones, maybe this is linked - I have not tried really hard to understand the problem to be honest.
cheers, David > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > From charlesr.harris at gmail.com Mon Feb 23 16:19:49 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 23 Feb 2009 14:19:49 -0700 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> Message-ID: On Mon, Feb 23, 2009 at 2:11 PM, Robert Kern wrote: > On Mon, Feb 23, 2009 at 15:06, Charles R Harris > wrote: > > > I think git-svn takes care of most of the branching problem on a local > > basis. Where it falls down, IMHO, is in testing a branch on the buildbots > and > > sharing a branch among two or three people. > > Why is that? Push your git branch to github. > Does github work with our buildbots? Chuck
From stefan at sun.ac.za Mon Feb 23 16:24:33 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Mon, 23 Feb 2009 23:24:33 +0200 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> Message-ID: <9457e7c80902231324l112a00acg2cfdc06d1c03709d@mail.gmail.com> Hi Robert 2009/2/23 Robert Kern : > On Mon, Feb 23, 2009 at 15:06, Charles R Harris > wrote: > >> I think git-svn takes care of most of the branching problem on a local >> basis. Where it falls down, IMHO, is in testing a branch on the buildbots and >> sharing a branch among two or three people. > > Why is that? Push your git branch to github. What do you say to the merging difficulties that SVN causes? Git-svn doesn't address that, unfortunately.
Cheers Stéfan From cournape at gmail.com Mon Feb 23 16:25:46 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 24 Feb 2009 06:25:46 +0900 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <1cd32cbb0902231315g677e59d3i4b25196efde20902@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <9457e7c80902231303u7f98de30sd5f119bdbe48bcd3@mail.gmail.com> <1cd32cbb0902231315g677e59d3i4b25196efde20902@mail.gmail.com> Message-ID: <5b8d13220902231325w730ec35fo330c0f2ca31a0c7e@mail.gmail.com> On Tue, Feb 24, 2009 at 6:15 AM, wrote: > On Mon, Feb 23, 2009 at 4:03 PM, Stéfan van der Walt wrote: >> 2009/2/23 Matthew Brett : >>> A) Do we agree in general to a more disciplined tests / review / accept cycle. >>> B) What specifically are the problems that y'all are having, and what >>> options are there for solving them. >> >> Current workflow: >> >> 1. Cook up a patch >> 2. Apply the patch or, if you are not a dev, upload to trac >> >> So, currently, unreviewed, untested code ends up in SciPy, or >> languishes on Trac for a long time. >> >> Proposed workflow: >> >> 1. Cook up a patch >> 2. Attach the patch (or a URL to the patchset/branch) to the issue >> tracker with a REVIEW tag >> 3. Ping the mailing list or IRC to request a review (rinse and repeat) >> >> Workflow for dev: >> >> 1. Request a list of patches ready for review: review >> - Has tests [check] >> - Has docs [check] >> - Does what it is supposed to do [check] >> 2. Add a POSITIVE_REVIEW or NEGATIVE_REVIEW tag as appropriate >> 3. Request a list of patches ready to be merged (code can be merged if >> seen by two pairs of eyes: reviewer + committer, reviewer + reviewer, >> etc.
In the end it must have "positive_reviews - negative_reviews >= >> 2"). Review the patch (this adds one pair of eyes) and merge if >> appropriate. >> >> That's the rough idea. Comments welcome. >> >> Cheers >> Stéfan > > I agree it is a good idea, theoretically, but > > Maybe I'm slightly pessimistic, but almost the only comment or review > for my bugfixes in scipy.stats that I got, were from Per Brodtkorb, > and my tickets and patches were sitting for half a year in trac.. If I > have to wait for a review, then ... For patches: not being able to even retrieve them is one problem. Let's say right now I feel guilty about your email, and look into trac: I can't easily retrieve all your patches which are > 6 months old without getting into a SQL query :) Or if I look at them, and think I have nothing to say, I can't mark them as "read" so I won't bother reading them next time I look at the bugs. I don't know if this scenario makes the problems I have with trac ATM clearer ? David From robert.kern at gmail.com Mon Feb 23 16:27:21 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 23 Feb 2009 15:27:21 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> Message-ID: <3d375d730902231327u4f1f41eah56f99857d23578e1@mail.gmail.com> On Mon, Feb 23, 2009 at 15:19, Charles R Harris wrote: > > > On Mon, Feb 23, 2009 at 2:11 PM, Robert Kern wrote: >> >> On Mon, Feb 23, 2009 at 15:06, Charles R Harris >> wrote: >> >> > I think git-svn takes care of most of the branching problem on a local >> > basis.
Where it falls down, IMHO, is in testing a branch on the buildbots >> > and >> > sharing a branch among two or three people. >> >> Why is that? Push your git branch to github. > > Does github work with our buildbots? Probably not right now. But that's exactly the same problem if the main repo were a git one, too. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From stefan at sun.ac.za Mon Feb 23 16:30:57 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Mon, 23 Feb 2009 23:30:57 +0200 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <1cd32cbb0902231315g677e59d3i4b25196efde20902@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <9457e7c80902231303u7f98de30sd5f119bdbe48bcd3@mail.gmail.com> <1cd32cbb0902231315g677e59d3i4b25196efde20902@mail.gmail.com> Message-ID: <9457e7c80902231330g341471e6ld6f5e6fc5819864a@mail.gmail.com> Hi Josef 2009/2/23 : > I agree it is a good idea, theoretically, but > > Maybe I'm slightly pessimistic, but almost the only comment or review > for my bugfixes in scipy.stats that I got, were from Per Brodtkorb, > and my tickets and patches were sitting for half a year in trac.. If I > have to wait for a review, then ... I hope that, if we have a decent workflow in place, this kind of thing won't happen any longer. I can't go onto trac and search for "tickets with patches that need review".
Whenever I have time, I try to close bugs and review patches, but more often than not it takes a very long time just to access all the required info from Trac. I remember spending a full day reviewing your changes to distributions. Unfortunately, technology thwarted us there as well. You had one huge patch, that you then very carefully split into parts, but I never managed to merge your changes with my tree. Then you got SVN access and, well, the rest is SVN history. > (and I'm still waiting for 2 fixes to numpy.random) See, I didn't know that :) I would review it for you right now, but I waited 60 seconds and then got 500 Internal Server Error. Regards Stéfan From robert.kern at gmail.com Mon Feb 23 16:32:17 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 23 Feb 2009 15:32:17 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <9457e7c80902231324l112a00acg2cfdc06d1c03709d@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> <9457e7c80902231324l112a00acg2cfdc06d1c03709d@mail.gmail.com> Message-ID: <3d375d730902231332k61e564f9uaca6a44f48b23a8d@mail.gmail.com> On Mon, Feb 23, 2009 at 15:24, Stéfan van der Walt wrote: > Hi Robert > > 2009/2/23 Robert Kern : >> On Mon, Feb 23, 2009 at 15:06, Charles R Harris >> wrote: >> >>> I think git-svn takes care of most of the branching problem on a local >>> basis. Where it falls down, IMHO, is in testing a branch on the buildbots and >>> sharing a branch among two or three people. >> >> Why is that? Push your git branch to github. > > What do you say to the merging difficulties that SVN causes?
Git-svn > doesn't address that, unfortunately. Scenario? There are several merging difficulties of SVN, and I do believe that git-svn addresses at least some of them. Precisely what are you thinking of? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From cournape at gmail.com Mon Feb 23 16:35:36 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 24 Feb 2009 06:35:36 +0900 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <3d375d730902231332k61e564f9uaca6a44f48b23a8d@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> <9457e7c80902231324l112a00acg2cfdc06d1c03709d@mail.gmail.com> <3d375d730902231332k61e564f9uaca6a44f48b23a8d@mail.gmail.com> Message-ID: <5b8d13220902231335t104e1e34p749e2c57ad55d601@mail.gmail.com> On Tue, Feb 24, 2009 at 6:32 AM, Robert Kern wrote: > On Mon, Feb 23, 2009 at 15:24, Stéfan van der Walt wrote: >> Hi Robert >> >> 2009/2/23 Robert Kern : >>> On Mon, Feb 23, 2009 at 15:06, Charles R Harris >>> wrote: >>> >>>> I think git-svn takes care of most of the branching problem on a local >>>> basis. Where it falls down, IMHO, is in testing a branch on the buildbots and >>>> sharing a branch among two or three people. >>> >>> Why is that? Push your git branch to github. >> >> What do you say to the merging difficulties that SVN causes? Git-svn >> doesn't address that, unfortunately. > > Scenario?
There are several merging difficulties of SVN, and I do > believe that git-svn addresses at least some of them. Precisely what > are you thinking of? Multiple branch releases (1.2.x and 1.3.0 for example) are one scenario where easy (fast) merging would be useful. In that case, svn merge capabilities are enough, but really slow. It takes more time to merge than building numpy, for example. I don't know if my location in Japan matters for latency or whatever, but every svnmerge merge taking between 1-2 to 10 minutes is not a great experience. David From ellisonbg.net at gmail.com Mon Feb 23 16:36:23 2009 From: ellisonbg.net at gmail.com (Brian Granger) Date: Mon, 23 Feb 2009 13:36:23 -0800 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <5b8d13220902231325w730ec35fo330c0f2ca31a0c7e@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <9457e7c80902231303u7f98de30sd5f119bdbe48bcd3@mail.gmail.com> <1cd32cbb0902231315g677e59d3i4b25196efde20902@mail.gmail.com> <5b8d13220902231325w730ec35fo330c0f2ca31a0c7e@mail.gmail.com> Message-ID: <6ce0ac130902231336le64a405wcc668eb77c789d01@mail.gmail.com> About a year ago, we moved IPython development to bzr. Since then I have moved all my projects to DVCS's (mainly git and bzr). At this point, I can't imagine using a non-DVCS like svn. Using bzr (even given the downsides of bzr) has really helped the IPython development workflow and has really encouraged new people to contribute (this has actually happened!). So for me the choice to move numpy/scipy development to a DVCS is a no-brainer. Personally, I don't see how everyone has survived using svn this long. In terms of which DVCS to pick. 
I have primarily used git and bzr (some hg too) and all of them will get the job done. I like certain things about git and other things about bzr. But, I find myself using git if I have a choice, mainly because I am impatient and git is fast. While the bzr+Launchpad setup that we are using with IPython works OK, it is painfully slow (especially the web interface on Launchpad). Cheers, Brian From strawman at astraw.com Mon Feb 23 16:37:06 2009 From: strawman at astraw.com (Andrew Straw) Date: Mon, 23 Feb 2009 13:37:06 -0800 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <5b8d13220902231317i799eb927t7d0e44010ed2f8e4@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> <5b8d13220902231317i799eb927t7d0e44010ed2f8e4@mail.gmail.com> Message-ID: <49A31702.9010806@astraw.com> David Cournapeau wrote: > On Tue, Feb 24, 2009 at 6:11 AM, Robert Kern wrote: >> On Mon, Feb 23, 2009 at 15:06, Charles R Harris >> wrote: >> >>> I think git-svn takes care of most of the branching problem on a local >>> basis. Where it fall down, IMHO, is in testing a branch on the builtbots and >>> sharing a branch among two or three people. >> Why is that? Push your git branch to github. > > This can cause trouble when dcommitting back to svn. I don't remember > the exact scenario, but I managed to break some things (like > committing twice to svn) because I was not careful. It is easy to > forget getting svn references in addition to git ones, maybe this is > linked - I have not tried really hard to understand the problem to be > honest. 
The problem with git and svn integration is that you have to rebase your git branch onto the svn branch to merge your changes back to svn. You thereby lose the true history of your git branch. And you also make it difficult for anyone to track your git branch because you've rebased (rewritten history). Thus, use of git-svn doesn't really make code sharing much easier -- in general, there still has to be svn in there. Furthermore, because there is no bidirectional one-to-one map between git and svn branches, in practice only one svn branch, probably the trunk, can be used within a git system. -Andrew From stefan at sun.ac.za Mon Feb 23 16:42:36 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Mon, 23 Feb 2009 23:42:36 +0200 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <3d375d730902231327u4f1f41eah56f99857d23578e1@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> <3d375d730902231327u4f1f41eah56f99857d23578e1@mail.gmail.com> Message-ID: <9457e7c80902231342t36290b5ds1e834de181351d7f@mail.gmail.com> 2009/2/23 Robert Kern : >> Does github work with our buildbots? > > Probably not right now. But that's exactly the same problem if the > main repo were a git one, too. The latest version of Buildbot has support for git. S.
From stefan at sun.ac.za Mon Feb 23 16:44:01 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 23 Feb 2009 23:44:01 +0200 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <3d375d730902231332k61e564f9uaca6a44f48b23a8d@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> <9457e7c80902231324l112a00acg2cfdc06d1c03709d@mail.gmail.com> <3d375d730902231332k61e564f9uaca6a44f48b23a8d@mail.gmail.com> Message-ID: <9457e7c80902231344k1a88293em4759aab71015f6f9@mail.gmail.com> 2009/2/23 Robert Kern : > On Mon, Feb 23, 2009 at 15:24, St?fan van der Walt wrote: >> Hi Robert >> >> 2009/2/23 Robert Kern : >>> On Mon, Feb 23, 2009 at 15:06, Charles R Harris >>> wrote: >>> >>>> I think git-svn takes care of most of the branching problem on a local >>>> basis. Where it fall down, IMHO, is in testing a branch on the builtbots and >>>> sharing a branch among two or three people. >>> >>> Why is that? Push your git branch to github. >> >> What do you say to the merging difficulties that SVN causes? Git-svn >> doesn't address that, unfortunately. > > Scenario? There are several merging difficulties of SVN, and I do > believe that git-svn addresses at least some of them. Precisely what > are you thinking of? I was thinking of this, from http://www.kernel.org/pub/software/scm/git/docs/git-svn.html """ Running git-merge or git-pull is NOT recommended on a branch you plan to dcommit from. Subversion does not represent merges in any reasonable or useful fashion; so users using Subversion cannot see any merges you've made. 
Furthermore, if you merge or pull from a git branch that is a mirror of an SVN branch, dcommit may commit to the wrong branch. """ Cheers Stéfan From michael.abshoff at googlemail.com Mon Feb 23 16:46:10 2009 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Mon, 23 Feb 2009 13:46:10 -0800 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <9457e7c80902231330g341471e6ld6f5e6fc5819864a@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <9457e7c80902231303u7f98de30sd5f119bdbe48bcd3@mail.gmail.com> <1cd32cbb0902231315g677e59d3i4b25196efde20902@mail.gmail.com> <9457e7c80902231330g341471e6ld6f5e6fc5819864a@mail.gmail.com> Message-ID: <49A31922.3020708@gmail.com> Stéfan van der Walt wrote: > Hi Josef Hello folks, > 2009/2/23 : >> I agree it is a good idea, theoretically, but >> >> Maybe I'm slightly pessimistic, but almost the only comment or review >> for my bugfixes in scipy.stats that I got, were from Per Brodtkorb, >> and my tickets and patches were sitting for half a year in trac.. If I >> have to wait for a review, then ... > > I hope that, if we have a decent workflow in place, this kind of thing > won't happen any longer. I can't go onto trac and search for "tickets > with patches that needs review". Well, it can be done even with trac 0.10.x: http://trac.sagemath.org/sage_trac/report We adopted the workflow in Sage around trac and it has been running rather smoothly. Now that we have updated trac to 0.11.3 we will use workflows to get around the rather primitive manual need to change summaries. But as is often the case, KISS works really well.
Cheers, Michael From charlesr.harris at gmail.com Mon Feb 23 16:46:09 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 23 Feb 2009 14:46:09 -0700 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <3d375d730902231327u4f1f41eah56f99857d23578e1@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> <3d375d730902231327u4f1f41eah56f99857d23578e1@mail.gmail.com> Message-ID: On Mon, Feb 23, 2009 at 2:27 PM, Robert Kern wrote: > On Mon, Feb 23, 2009 at 15:19, Charles R Harris > wrote: > > > > > > On Mon, Feb 23, 2009 at 2:11 PM, Robert Kern > wrote: > >> > >> On Mon, Feb 23, 2009 at 15:06, Charles R Harris > >> wrote: > >> > >> > I think git-svn takes care of most of the branching problem on a local > >> > basis. Where it fall down, IMHO, is in testing a branch on the > builtbots > >> > and > >> > sharing a branch among two or three people. > >> > >> Why is that? Push your git branch to github. > > > > Does github work with our buildbots? > > Probably not right now. But that's exactly the same problem if the > main repo were a git one, too. > True enough, I'm not arguing for changing the main repository, just pointing out what I miss when making branches locally with git-svn. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From peter.skomoroch at gmail.com Mon Feb 23 17:07:57 2009 From: peter.skomoroch at gmail.com (Peter Skomoroch) Date: Mon, 23 Feb 2009 17:07:57 -0500 Subject: [SciPy-dev] scipy wiki throwing 500 errors Message-ID: It seems like the wiki still has some server issues, I've had several 500 errors while attempting to edit the wiki in the last few minutes. -- Peter N. Skomoroch 617.285.8348 http://www.datawrangling.com http://delicious.com/pskomoroch http://twitter.com/peteskomoroch -------------- next part -------------- An HTML attachment was scrubbed... URL: From pwang at enthought.com Mon Feb 23 17:12:54 2009 From: pwang at enthought.com (Peter Wang) Date: Mon, 23 Feb 2009 16:12:54 -0600 Subject: [SciPy-dev] scipy wiki throwing 500 errors In-Reply-To: References: Message-ID: On Feb 23, 2009, at 4:07 PM, Peter Skomoroch wrote: > It seems like the wiki still has some server issues, I've had > several 500 errors while attempting to edit the wiki in the last few > minutes. Yep, I'm looking at it. From pav at iki.fi Mon Feb 23 17:16:41 2009 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 23 Feb 2009 22:16:41 +0000 (UTC) Subject: [SciPy-dev] The future of SciPy and its development infrastructure References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> Message-ID: Mon, 23 Feb 2009 15:11:06 -0600, Robert Kern wrote: > On Mon, Feb 23, 2009 at 15:06, Charles R Harris > wrote: > >> I think git-svn takes care of most of the branching problem on a local >> basis. Where it fall down, IMHO, is in testing a branch on the >> builtbots and sharing a branch among two or three people. > > Why is that? Push your git branch to github. 
git-svn relies much on rebasing (for merging with SVN), and AFAIK doesn't work well in a multi-user scenario. Quote from the manual page: The recommended method of exchanging code between git branches and users is git-format-patch and git-am, or just dcommiting to the SVN repository. That is, the recommended way to collaborate when using git-svn is to send patches via mail. Also, if you want to merge, there are caveats: Running git-merge or git-pull is NOT recommended on a branch you plan to dcommit from. Subversion does not represent merges in any reasonable or useful fashion; so users using Subversion cannot see any merges you've made. Furthermore, if you merge or pull from a git branch that is a mirror of an SVN branch, dcommit may commit to the wrong branch. So I don't think it's very simple to use it for sharing work with people. Working on an SVN branch is IMO better than this. *** I'm wondering if we would benefit from an official, pull-only, automatically updating, Git mirror of the SVN repository: - Easy "feature branches" also for those who don't have SVN commit rights. - Rebasing against SVN is not necessary, merges will just work. - Easier to work with than dealing with patches. Some issues turn up when changes are committed back to SVN: - There's some minor manual work before dcommit works, if you clone a git-svn repository from somewhere else: http://subtlegradient.com/articles/2008/04/22/cloning-a-git-svn-clone But this needs to be done only once. - cherry-pick + rebase is likely necessary before changes can be committed to SVN. - Merge is necessary on the feature branch, if further work needs to be done on it. The last point is probably the biggest factor undermining usability of this workflow. But this still probably would beat updating patches, and it does not seem to be any worse than SVN branches. 
-- Pauli Virtanen From robince at gmail.com Mon Feb 23 17:36:07 2009 From: robince at gmail.com (Robin) Date: Mon, 23 Feb 2009 22:36:07 +0000 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> <3d375d730902231327u4f1f41eah56f99857d23578e1@mail.gmail.com> Message-ID: On Mon, Feb 23, 2009 at 9:46 PM, Charles R Harris wrote: > > > On Mon, Feb 23, 2009 at 2:27 PM, Robert Kern wrote: >> >> On Mon, Feb 23, 2009 at 15:19, Charles R Harris >> wrote: >> > >> > >> > On Mon, Feb 23, 2009 at 2:11 PM, Robert Kern >> > wrote: >> >> >> >> On Mon, Feb 23, 2009 at 15:06, Charles R Harris >> >> wrote: >> >> >> >> > I think git-svn takes care of most of the branching problem on a >> >> > local >> >> > basis. Where it fall down, IMHO, is in testing a branch on the >> >> > builtbots >> >> > and >> >> > sharing a branch among two or three people. >> >> >> >> Why is that? Push your git branch to github. >> > >> > Does github work with our buildbots? >> >> Probably not right now. But that's exactly the same problem if the >> main repo were a git one, too. > > True enough, I'm not arguing for changing the main repository, just pointing > out what I miss when making branches locally with git-svn. > Watching this discussion with interest... Although git seems to have more momentum here, I though I would point out bzr-svn, which I think allows distributed collaboration around a central svn-hosted branch without the problems people have mentioned with git-svn. By storing the bzr metadata in svn properties, a full bazaar branch can be hosted in svn, so people can branch off it, merge with each other, merge back etc. 
In my experience it's worked very well for giving the distributed benefits of branching from a centralised svn repo. Cheers Robin From stefan at sun.ac.za Mon Feb 23 17:36:42 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 24 Feb 2009 00:36:42 +0200 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> Message-ID: <9457e7c80902231436y53fff875xe58f8e0b47191b50@mail.gmail.com> 2009/2/24 Pauli Virtanen : > I'm wondering if we would benefit from an official, pull-only, > automatically updating, Git mirror of the SVN repository: I'm still wondering what the advantages are of staying with SVN. I haven't heard any compelling arguments so far, whereas we've heard numerous accounts from developers regarding the positive aspects of DVC systems. 
Cheers St?fan From charlesr.harris at gmail.com Mon Feb 23 17:54:04 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 23 Feb 2009 15:54:04 -0700 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <9457e7c80902231436y53fff875xe58f8e0b47191b50@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> <9457e7c80902231436y53fff875xe58f8e0b47191b50@mail.gmail.com> Message-ID: On Mon, Feb 23, 2009 at 3:36 PM, St?fan van der Walt wrote: > 2009/2/24 Pauli Virtanen : > > I'm wondering if we would benefit from an official, pull-only, > > automatically updating, Git mirror of the SVN repository: > > I'm still wondering what the advantages are of staying with SVN. I > haven't heard any compelling arguments so far, whereas we've heard > numerous accounts from developers regarding the positive aspects of > DVC systems. > Decent windows support via Tortoise and no need to learn a new system. Plus no need to revamp the whole setup, which is always a bigger pain than one plans for. The problems with the current system seem to be ticket tracking and branches. I think the first thing to do is take a look at what sage is doing and see if we can't refurbish our current system to make it more useable. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mhansen at gmail.com Mon Feb 23 17:58:09 2009 From: mhansen at gmail.com (Mike Hansen) Date: Mon, 23 Feb 2009 14:58:09 -0800 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> <9457e7c80902231436y53fff875xe58f8e0b47191b50@mail.gmail.com> Message-ID: Hello, On Mon, Feb 23, 2009 at 2:54 PM, Charles R Harris wrote: > I think the first thing to do is take a look at what sage is > doing and see if we can't refurbish our current system to make it more > useable. On Mon, Feb 23, 2009 at 1:03 PM, Stéfan van der Walt wrote: > Proposed workflow: > > 1. Cook up a patch > 2. Attach the patch (or a URL to the patchset/branch) to the issue > tracker with a REVIEW tag > 3. Ping the mailing list or IRC to request a review (rinse and repeat) > > Workflow for dev: > > 1. Request a list of patches ready for review: review > - Has tests [check] > - Has docs [check] > - Does what it is supposed to do [check] > 2. Add a POSITIVE_REVIEW or NEGATIVE_REVIEW tag as appropriate > 3. Request a list of patches ready to be merged (code can be merged if > seen by two pairs of eyes: reviewer + committer, reviewer + reviewer, > etc. In the end it must have "positive_reviews - negative_reviews >= > 2"). Review the patch (this adds one pair of eyes) and merge if > appropriate. This is roughly what we do for Sage, and it's fairly effective at getting code merged in. Also, _every_ piece of code that goes in does so via a Trac ticket. It's easy to see when bugs have been fixed.
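[Editorial illustration] The merge rule quoted in the workflow above ("positive_reviews - negative_reviews >= 2") could be sketched in Python roughly as follows. This is only a sketch under assumptions: the function name ready_to_merge, the idea of passing the tags as a plain list of strings, and the threshold parameter are all hypothetical, and none of this is part of Trac or of Sage's actual setup.

```python
# Hypothetical sketch of the quoted merge rule; not a Trac or Sage API.
POSITIVE = "POSITIVE_REVIEW"
NEGATIVE = "NEGATIVE_REVIEW"


def ready_to_merge(review_tags, threshold=2):
    """Return True if positive reviews exceed negative ones by at least `threshold`.

    `review_tags` is assumed to be a list of tag strings attached to a ticket;
    tags other than the two review tags are ignored.
    """
    positive = sum(1 for tag in review_tags if tag == POSITIVE)
    negative = sum(1 for tag in review_tags if tag == NEGATIVE)
    return positive - negative >= threshold


if __name__ == "__main__":
    # Two positive reviews and no negative ones: eligible for merging.
    print(ready_to_merge([POSITIVE, POSITIVE]))
    # One negative review cancels one positive review: not yet eligible.
    print(ready_to_merge([POSITIVE, POSITIVE, NEGATIVE]))
```

Under this reading, a single negative review has to be outweighed by an extra positive one before a patch becomes mergeable again.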
--Mike From strawman at astraw.com Mon Feb 23 17:59:25 2009 From: strawman at astraw.com (Andrew Straw) Date: Mon, 23 Feb 2009 14:59:25 -0800 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> <3d375d730902231327u4f1f41eah56f99857d23578e1@mail.gmail.com> Message-ID: <49A32A4D.5040101@astraw.com> Robin wrote: > On Mon, Feb 23, 2009 at 9:46 PM, Charles R Harris > wrote: >> >> On Mon, Feb 23, 2009 at 2:27 PM, Robert Kern wrote: >>> On Mon, Feb 23, 2009 at 15:19, Charles R Harris >>> wrote: >>>> >>>> On Mon, Feb 23, 2009 at 2:11 PM, Robert Kern >>>> wrote: >>>>> On Mon, Feb 23, 2009 at 15:06, Charles R Harris >>>>> wrote: >>>>> >>>>>> I think git-svn takes care of most of the branching problem on a >>>>>> local >>>>>> basis. Where it fall down, IMHO, is in testing a branch on the >>>>>> builtbots >>>>>> and >>>>>> sharing a branch among two or three people. >>>>> Why is that? Push your git branch to github. >>>> Does github work with our buildbots? >>> Probably not right now. But that's exactly the same problem if the >>> main repo were a git one, too. >> True enough, I'm not arguing for changing the main repository, just pointing >> out what I miss when making branches locally with git-svn. >> > > Watching this discussion with interest... Although git seems to have > more momentum here, I though I would point out bzr-svn, which I think > allows distributed collaboration around a central svn-hosted branch > without the problems people have mentioned with git-svn. 
By storing > the bzr metadata in svn properties, a full bazaar branch can be hosted > in svn, so people can branch off it, merge with each other, merge back > etc. > > In my experience it's worked very well for giving the distributed > benefits of branching from a centralised svn repo. Robin, can you point us to a public svn repo where non-trivial branching is happening with bzr? I had lots of trouble trying to get anything working in my attempts. From robince at gmail.com Mon Feb 23 18:38:55 2009 From: robince at gmail.com (Robin) Date: Mon, 23 Feb 2009 23:38:55 +0000 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <49A32A4D.5040101@astraw.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> <3d375d730902231327u4f1f41eah56f99857d23578e1@mail.gmail.com> <49A32A4D.5040101@astraw.com> Message-ID: On Mon, Feb 23, 2009 at 10:59 PM, Andrew Straw wrote: >>> True enough, I'm not arguing for changing the main repository, just pointing >>> out what I miss when making branches locally with git-svn. >>> >> >> Watching this discussion with interest... Although git seems to have >> more momentum here, I though I would point out bzr-svn, which I think >> allows distributed collaboration around a central svn-hosted branch >> without the problems people have mentioned with git-svn. By storing >> the bzr metadata in svn properties, a full bazaar branch can be hosted >> in svn, so people can branch off it, merge with each other, merge back >> etc. >> >> In my experience it's worked very well for giving the distributed >> benefits of branching from a centralised svn repo. > > Robin, can you point us to a public svn repo where non-trivial branching > is happening with bzr? I had lots of trouble trying to get anything > working in my attempts. 
Hi, I'm not sure what qualifies as non-trivial branching (DVCS people say all branching should be trivial!). I have only used it on my private repo myself, but I haven't had any problems. Asking on IRC, a good example is GNOME. Here are the developer branches created using bzr-svn http://bzr-playground.gnome.org/ - although it looks like they aren't pushing directly back to svn (but going through patches). This is the best I could find documenting a workflow for using bzr with svn: http://www.serverzen.net/starting-with-bazaar-bzr-svn But I think to avoid problems with merging the trick is to have a bzr checkout from svn as your sort of trunk branch, which you can then branch with bzr to create feature branches. Upstream branches can be pulled to the trunk branch, then merged to your feature branches, and when you want to push stuff back you merge it into your trunk checkout (which also commits it to svn). There are options to either push each individual commit as an svn commit, or just have a single merge commit in svn (the metadata for the bzr commits are there so they can be seen by other bzr users). Because all the metadata is in svn, you should be able to merge with anyone else who has their branch based on a svn checkout... Again not 100% on all this, but I believe that's how it works. When I looked at the DVCSs, I settled on this since I wanted to work with my existing svn repo and it seemed to be the best DVCS subversion interface. Cheers Robin From oliphant at enthought.com Mon Feb 23 19:06:13 2009 From: oliphant at enthought.com (Travis E.
Oliphant) Date: Mon, 23 Feb 2009 18:06:13 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> Message-ID: <49A339F5.1040703@enthought.com> Matthew Brett wrote: > Hi Stefan, > > >> 1. No code enters SciPy unless it had two pairs of eyes on it: >> reviewer and committer, reviewer and reviewer, reviewer and release >> manager, etc. All tickets ready for merging are marked in Trac for >> convenience. >> 2. No code enters SciPy unless it is fully documented. >> 3. No code enters SciPy unless it is fully tested (this holds for both >> bug-fixes and enhancements) >> > > Right. > > So, the real problem here is that the people doing the actual work > have severe problems with the current workflow. > > It seems to me the issue > > A) Do we agree in general to a more disciplined tests / review / accept cycle. > I'm a bit concerned about getting too top-heavy here. I think the biggest problem has been time and adding too formal of a process will just increase the time it takes to get code into SciPy. I'm fine with emphasizing documentation and tests as we discuss things and we should encourage each other, but I'm not comfortable with hard-line statements like the ones being made above. Yes, such things are helpful, but they are also expensive and I worry more about what we lose in contributions. The quality of what we create should emerge as all interested parties critically look at the code that is available in SciPy. Not everyone can do that on the same schedule. I'm opposed to trying to force that to happen. 
I very much favor cultivating a culture that wants someone to fix the problems in their code. Once we have a git-svn integration working, then I can support a simple policy like 1) "this list of people can push from git to svn" 2) "code to submit must either be O.K.'d by one other or have a certain time limit expired with no response" But, my favorite workflow is a bit more chaotic, than that. People create their own DVCS versions of SciPy using their best judgment and publish revisions they consider to be working code. Branches that are given the thumbs up by 2 people (or 1 on the steering committee) get pushed to the main branch. This review happens regularly, on IRC channels at regularly scheduled times. Good conversation... -Travis From cournape at gmail.com Mon Feb 23 19:08:37 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 24 Feb 2009 09:08:37 +0900 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> <9457e7c80902231436y53fff875xe58f8e0b47191b50@mail.gmail.com> Message-ID: <5b8d13220902231608u7078c0a5v38fe28ac0bf4d472@mail.gmail.com> On Tue, Feb 24, 2009 at 7:54 AM, Charles R Harris wrote: > I think the first thing to do is take a look at what sage is > doing and see if we can't refurbish our current system to make it more > useable. 
sage uses mercurial, David From matthew.brett at gmail.com Mon Feb 23 19:21:30 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 23 Feb 2009 16:21:30 -0800 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <49A339F5.1040703@enthought.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> Message-ID: <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> Hi, > I'm a bit concerned about getting too top-heavy here. > > I think the biggest problem has been time and adding too formal of a > process will just increase the time it takes to get code into SciPy. Yes, right, that's the key issue. I think Stefan's position is that, as more people start using and contributing to Scipy, it's become near impossible to maintain in a release-worthy way (Stefan - is that right)? That if we want to keep going without collapsing we need a more formal process. I guess the alternative position is that not having a discipline of code review, testing and documentation will make it more likely we'll have contributors, and the code will get better that way. I really am no expert, but I have the impression that projects of the size of Scipy do tend to use (and change to) fairly formal review / accept cycles, with testing and documentation. Is that impression correct? 
See you, Matthew From strawman at astraw.com Mon Feb 23 19:28:21 2009 From: strawman at astraw.com (Andrew Straw) Date: Mon, 23 Feb 2009 16:28:21 -0800 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> <3d375d730902231327u4f1f41eah56f99857d23578e1@mail.gmail.com> <49A32A4D.5040101@astraw.com> Message-ID: <49A33F25.4000903@astraw.com> Robin wrote: > On Mon, Feb 23, 2009 at 10:59 PM, Andrew Straw wrote: >> Robin, can you point us to a public svn repo where non-trivial branching >> is happening with bzr? I had lots of trouble trying to get anything >> working in my attempts. > I'm not sure what qualifies as non-trivial branching (DVCS people say > all branching should be trivial!). I have only used it on my private > repo myself, but I haven't had any problems. Asking on IRC, a good > example is GNOME. Here are the developer branches created using > bzr-svn > http://bzr-playground.gnome.org/ - although it looks like they aren't > pushing directly back to svn (but going through patches). > > This is the best I could find documentating a workflow for using bzr with svn: > http://www.serverzen.net/starting-with-bazaar-bzr-svn Thanks for the links. > But I think to avoid problems with merging the trick is to have a bzr > checkout from svn as your sort of trunk branch, which you can then > branch with bzr to create feature branches. Upstream branches can be > pulled to the trunk branch, then merged to your feature branches, and > when you want to push stuff back you merge it into your trunk checkout > (which also commits it to svn). 
There are options to either push each > individual commit as an svn commit, or just have a single merge commit > in svn (the metadata for the bzr commits are there so they can be seen > by other bzr users). Because all the metadata is in svn, you should be > able to merge with anyone else who has their branch based on a svn > checkout... My problem when I tried this out was that the svn metadata wasn't the same across different bzr repos cloned from the same svn repo -- thus no ability to actually share the bzr branches between bzr repos. It sounds like you have only tried this from a single bzr repo? Plus, I didn't like polluting the svn repo with the bzr metadata, particularly given this no-ability-to-create-the-same-bzr-clones issue. Git solves the first issue (different git clones of the same svn repo produce the same git repo) and thereby somewhat eliminates the need to store git metadata in the svn repo. Which is why I like git-svn more than bzr-svn. But the lack of one-to-one bidirectional mapping between DVCS branches and svn branches it was prevents any of these schemes from working on any DVCS, as far as I can see. From ctw at cogsci.info Mon Feb 23 19:43:38 2009 From: ctw at cogsci.info (Christoph T. Weidemann) Date: Mon, 23 Feb 2009 19:43:38 -0500 Subject: [SciPy-dev] Subclassed ndarray fails with ValueError when assigning to a sliced array In-Reply-To: References: <6b7179780902230838g2615d1e2u5940a489eed063ce@mail.gmail.com> Message-ID: It turns out that this was a bug on our end, not in numpy ... sorry for the confusion! From pgmdevlist at gmail.com Mon Feb 23 19:45:56 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 23 Feb 2009 19:45:56 -0500 Subject: [SciPy-dev] Subclassed ndarray fails with ValueError when assigning to a sliced array In-Reply-To: References: <6b7179780902230838g2615d1e2u5940a489eed063ce@mail.gmail.com> Message-ID: <319CF280-56F0-4067-88AE-A262E5C79F32@gmail.com> On Feb 23, 2009, at 7:43 PM, Christoph T. 
Weidemann wrote: > It turns out that this was a bug on our end, not in numpy ... sorry > for the confusion! No problem. From robert.kern at gmail.com Mon Feb 23 20:40:27 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 23 Feb 2009 19:40:27 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> Message-ID: <3d375d730902231740t2204860cnb71e961eb3db3dcd@mail.gmail.com> Here is my take at the current time: The experimentalist in me cries out that we should make just one major infrastructure change at a time. For a variety of reasons, from IT support issues to just plain hating {{{}}}, I think replacing the bug tracker should be the change to make right now. That said, we can try a blessed DVCS-SVN bridge and see how it works out. It doesn't solve all problems, but it should enable a better workflow for casual contributors, the raison d'être for this discussion. It also doesn't commit us to anything while we are separately seeing how the tracker changes work out. Pauli, you seem familiar with setting up a git-to-svn bridge. Can you do this? What do you need done one the actual SVN server to support this? If fans of other DVCSes want to set up and administer blessed mirrors of their own, let us know. David and Stéfan, can you work on proposing a new tracker configuration? I ask that you take a glance at Roundup, but I'll leave that up to your schedules. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco From pav at iki.fi Mon Feb 23 20:46:03 2009 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 24 Feb 2009 01:46:03 +0000 (UTC) Subject: [SciPy-dev] Server spam problems spam spam: spam References: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> <20090222212703.GR6701@phare.normalesup.org> <49A1C644.5030205@gmail.com> Message-ID: Sun, 22 Feb 2009 13:40:20 -0800, Michael Abshoff wrote: [clip] > two tips of fighting spammers from the Sage project's wiki: > > * add a list of common Chinese words to LocalBadContent, i.e. > > http://wiki.sagemath.org/LocalBadContent > > Also make sure to clean out all the spammer attempts on the hard disk. > I.e I deleted 6,000 directories in "pages" of the Cython wiki since Spam > attempts are preserved and not actually deleted from disk. If you have a > couple ten thousand of those in one directory this might make every wiki > access painfully slow and impact the whole server. Continuing Gael's work, I tried to expand the LocalBadContent list: http://scipy.org/LocalBadContent I wonder how useful this turns out to be in the end, this smells like an arms race... I doubt the additions cause problems to real pages, but if they do, some of them need to be reverted. [Btw, shouldn't LocalBadContent editing be restricted to those in EditorGroup? And could my account PauliVirtanen be added in the group?] Another thing is that there are apparently ca. 11600 pages in the Scipy.org wiki. I'd make a wild guess that at most ~500 of these are valid content; the rest is spam. I'm not sure if getting rid of the spam pages improves Moin's performance. Do we have any valid pages with CJK characters? Much of the spam seems Chinese, so mass-deleting at least this portion of it shouldn't be impossible to do, given Moin's database format. 
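A rough sketch of the kind of CJK check that such a mass-deletion pass might start from, assuming Python 3 and UTF-8 text files. The Unicode ranges, the 0.3 threshold, and the `find_cjk_pages` helper are illustrative assumptions; nothing here is taken from Moin's actual storage format. The helper only reports candidates and deletes nothing, since legitimate CJK pages may exist.

```python
import os

# Unicode ranges treated as CJK here (an illustrative, non-exhaustive choice).
CJK_RANGES = (
    (0x3040, 0x30FF),  # Hiragana and Katakana
    (0x3400, 0x4DBF),  # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xAC00, 0xD7AF),  # Hangul Syllables
)


def is_cjk(ch):
    """Return True if the single character ch falls in one of CJK_RANGES."""
    code = ord(ch)
    return any(lo <= code <= hi for lo, hi in CJK_RANGES)


def cjk_fraction(text):
    """Fraction of non-whitespace characters in text that are CJK."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(1 for c in chars if is_cjk(c)) / len(chars)


def find_cjk_pages(root, threshold=0.3):
    """Hypothetical helper: list files under root whose text is mostly CJK.

    The directory layout is an assumption; this only reports paths.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    text = fh.read()
            except OSError:
                continue  # unreadable file: skip rather than guess
            if cjk_fraction(text) >= threshold:
                hits.append(path)
    return sorted(hits)
```

The resulting list would still need a manual look before anything is removed.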
-- Pauli Virtanen From robert.kern at gmail.com Mon Feb 23 20:58:27 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 23 Feb 2009 19:58:27 -0600 Subject: [SciPy-dev] Server spam problems spam spam: spam In-Reply-To: References: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> <20090222212703.GR6701@phare.normalesup.org> <49A1C644.5030205@gmail.com> Message-ID: <3d375d730902231758s760e4ea9q528e61847be1a831@mail.gmail.com> On Mon, Feb 23, 2009 at 19:46, Pauli Virtanen wrote: > Sun, 22 Feb 2009 13:40:20 -0800, Michael Abshoff wrote: > [clip] >> two tips of fighting spammers from the Sage project's wiki: >> >> * add a list of common Chinese words to LocalBadContent, i.e. >> >> http://wiki.sagemath.org/LocalBadContent >> >> Also make sure to clean out all the spammer attempts on the hard disk. >> I.e I deleted 6,000 directories in "pages" of the Cython wiki since Spam >> attempts are preserved and not actually deleted from disk. If you have a >> couple ten thousand of those in one directory this might make every wiki >> access painfully slow and impact the whole server. > > Continuing Gael's work, I tried to expand the LocalBadContent list: > > http://scipy.org/LocalBadContent > > I wonder how useful this turns out to be in the end, this smells like an > arms race... I doubt the additions cause problems to real pages, but if > they do, some of them need to be reverted. > > [Btw, shouldn't LocalBadContent editing be restricted to those in > EditorGroup? And could my account PauliVirtanen be added in the group?] Done and done. > Another thing is that there are apparently ca. 11600 pages in the > Scipy.org wiki. I'd make a wild guess that at most ~500 of these are > valid content; the rest is spam. I'm not sure if getting rid of the spam > pages improves Moin's performance. Probably. Are you volunteering? Peter can give you a shell account. 
If you are willing to take on the other upgrades Michael recommended, to add the Captcha, for instance, that would go well, too. > Do we have any valid pages with CJK characters? Much of the spam seems > Chinese, so mass-deleting at least this portion of it shouldn't be > impossible to do, given Moin's database format. The Chinese localized Moin help pages are valid, but that should be it. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From peter.skomoroch at gmail.com Mon Feb 23 21:13:03 2009 From: peter.skomoroch at gmail.com (Peter Skomoroch) Date: Mon, 23 Feb 2009 21:13:03 -0500 Subject: [SciPy-dev] Server spam problems spam spam: spam In-Reply-To: <3d375d730902231758s760e4ea9q528e61847be1a831@mail.gmail.com> References: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> <20090222212703.GR6701@phare.normalesup.org> <49A1C644.5030205@gmail.com> <3d375d730902231758s760e4ea9q528e61847be1a831@mail.gmail.com> Message-ID: What about black listing spam ips? http://moinmoin.wikiwikiweb.de/BlackList On Mon, Feb 23, 2009 at 8:58 PM, Robert Kern wrote: > On Mon, Feb 23, 2009 at 19:46, Pauli Virtanen wrote: > > Sun, 22 Feb 2009 13:40:20 -0800, Michael Abshoff wrote: > > [clip] > >> two tips of fighting spammers from the Sage project's wiki: > >> > >> * add a list of common Chinese words to LocalBadContent, i.e. > >> > >> http://wiki.sagemath.org/LocalBadContent > >> > >> Also make sure to clean out all the spammer attempts on the hard disk. > >> I.e I deleted 6,000 directories in "pages" of the Cython wiki since Spam > >> attempts are preserved and not actually deleted from disk. If you have a > >> couple ten thousand of those in one directory this might make every wiki > >> access painfully slow and impact the whole server. 
> > > > Continuing Gael's work, I tried to expand the LocalBadContent list: > > > > http://scipy.org/LocalBadContent > > > > I wonder how useful this turns out to be in the end, this smells like an > > arms race... I doubt the additions cause problems to real pages, but if > > they do, some of them need to be reverted. > > > > [Btw, shouldn't LocalBadContent editing be restricted to those in > > EditorGroup? And could my account PauliVirtanen be added in the group?] > > Done and done. > > > Another thing is that there are apparently ca. 11600 pages in the > > Scipy.org wiki. I'd make a wild guess that at most ~500 of these are > > valid content; the rest is spam. I'm not sure if getting rid of the spam > > pages improves Moin's performance. > > Probably. Are you volunteering? Peter can give you a shell account. If > you are willing to take on the other upgrades Michael recommended, to > add the Captcha, for instance, that would go well, too. > > > Do we have any valid pages with CJK characters? Much of the spam seems > > Chinese, so mass-deleting at least this portion of it shouldn't be > > impossible to do, given Moin's database format. > > The Chinese localized Moin help pages are valid, but that should be it. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > -- Peter N. Skomoroch 617.285.8348 http://www.datawrangling.com http://delicious.com/pskomoroch http://twitter.com/peteskomoroch -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Mon Feb 23 21:17:57 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 23 Feb 2009 20:17:57 -0600 Subject: [SciPy-dev] Server spam problems spam spam: spam In-Reply-To: References: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> <20090222212703.GR6701@phare.normalesup.org> <49A1C644.5030205@gmail.com> <3d375d730902231758s760e4ea9q528e61847be1a831@mail.gmail.com> Message-ID: <3d375d730902231817t4409de45h5382b312318833db@mail.gmail.com> On Mon, Feb 23, 2009 at 20:13, Peter Skomoroch wrote: > What about black listing spam ips? http://moinmoin.wikiwikiweb.de/BlackList The blacklist available there is from 2004. I doubt it is still useful. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pav at iki.fi Mon Feb 23 21:42:21 2009 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 24 Feb 2009 02:42:21 +0000 (UTC) Subject: [SciPy-dev] The future of SciPy and its development infrastructure References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231740t2204860cnb71e961eb3db3dcd@mail.gmail.com> Message-ID: Mon, 23 Feb 2009 19:40:27 -0600, Robert Kern wrote: [clip] > Pauli, you seem familiar with setting up a git-to-svn bridge. Can you do > this? Sure. I'll need a box on which to deploy the update script, though. Would one (which?) of the virtual hosts of conference.scipy.org do? I'd guess what's needed of the web server would be only to enable CGI for a single script. 
When poked, it would then fetch new stuff from SVN and either - Push to github or some such service - Push to a HTTP location on the machine, served statically - Push to a HTTP location on the machine, served by gitweb (cgi) The first option is probably the easiest, if account/password issues can be sorted out. The second option is probably enough for practical purposes. > What do you need done one the actual SVN server to support this? A post-commit hook poking a CGI script somewhere should be enough: /usr/bin/curl -d "revision=$REV&repository=$REPOS" \ http://host/path/to/script.cgi & If all this runs on the same machine as the SVN server, going over the network can be skipped, too. Anyway, I'll think about the details tomorrow. -- Pauli Virtanen From robert.kern at gmail.com Mon Feb 23 21:48:43 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 23 Feb 2009 20:48:43 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231740t2204860cnb71e961eb3db3dcd@mail.gmail.com> Message-ID: <3d375d730902231848n381099cp4f791018a5ab63a7@mail.gmail.com> On Mon, Feb 23, 2009 at 20:42, Pauli Virtanen wrote: > Mon, 23 Feb 2009 19:40:27 -0600, Robert Kern wrote: > [clip] >> Pauli, you seem familiar with setting up a git-to-svn bridge. Can you do >> this? > > Sure. I'll need a box on which to deploy the update script, though. > Would one (which?) of the virtual hosts of conference.scipy.org do? Probably. That machine is where all of the services will be moving to. Peter might be able to say which one. > I'd guess what's needed of the web server would be only to enable CGI for > a single script. 
When poked, it would then fetch new stuff from SVN and > either > > - Push to github or some such service > - Push to a HTTP location on the machine, served statically > - Push to a HTTP location on the machine, served by gitweb (cgi) > > The first option is probably the easiest, if account/password issues can > be sorted out. The second option is probably enough for practical > purposes. 1) probably has the advantage of having smaller overhead for people publishing their own branches on github. >> What do you need done one the actual SVN server to support this? > > A post-commit hook poking a CGI script somewhere should be enough: > > /usr/bin/curl -d "revision=$REV&repository=$REPOS" \ > http://host/path/to/script.cgi & > > If all this runs on the same machine as the SVN server, going over the > network can be skipped, too. > > Anyway, I'll think about the details tomorrow. Great. Thank you! -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From peter.skomoroch at gmail.com Mon Feb 23 22:06:43 2009 From: peter.skomoroch at gmail.com (Peter Skomoroch) Date: Mon, 23 Feb 2009 22:06:43 -0500 Subject: [SciPy-dev] Server spam problems spam spam: spam In-Reply-To: <3d375d730902231817t4409de45h5382b312318833db@mail.gmail.com> References: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> <20090222212703.GR6701@phare.normalesup.org> <49A1C644.5030205@gmail.com> <3d375d730902231758s760e4ea9q528e61847be1a831@mail.gmail.com> <3d375d730902231817t4409de45h5382b312318833db@mail.gmail.com> Message-ID: <9268CC07-10F3-42CC-8210-9E67BB2A56D8@gmail.com> I guess the idea would be to append ips to the list automatically if an edit is marked as spam, and cut down on the manual checks. 
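A minimal sketch of the automatic append idea, assuming Python 3 and a plain text blacklist with one IP address per line. The file format and the names `updated_blacklist` and `record_spammer` are hypothetical; how Moin itself reads a blacklist is not shown here.

```python
import ipaddress


def updated_blacklist(entries, ip):
    """Return the cleaned entries with ip appended unless already listed.

    Raises ValueError if ip is not a valid IPv4 or IPv6 address.
    """
    normalized = str(ipaddress.ip_address(ip.strip()))
    cleaned = [e.strip() for e in entries if e.strip()]
    if normalized in cleaned:
        return cleaned
    return cleaned + [normalized]


def record_spammer(path, ip):
    """Hypothetical wrapper: rewrite the blacklist file at path with ip added."""
    try:
        with open(path, encoding="utf-8") as fh:
            entries = fh.read().splitlines()
    except FileNotFoundError:
        entries = []  # first spammer: start a new list
    new_entries = updated_blacklist(entries, ip)
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(new_entries) + ("\n" if new_entries else ""))
```

Calling `record_spammer` from whatever step marks an edit as spam would keep the list free of duplicates without a manual check.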
Sent from my iPhone On Feb 23, 2009, at 9:17 PM, Robert Kern wrote: > On Mon, Feb 23, 2009 at 20:13, Peter Skomoroch > wrote: >> What about black listing spam ips? http://moinmoin.wikiwikiweb.de/BlackList > > The blacklist available there is from 2004. I doubt it is still > useful. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev From charlesr.harris at gmail.com Mon Feb 23 22:59:20 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 23 Feb 2009 20:59:20 -0700 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <3d375d730902231740t2204860cnb71e961eb3db3dcd@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231740t2204860cnb71e961eb3db3dcd@mail.gmail.com> Message-ID: On Mon, Feb 23, 2009 at 6:40 PM, Robert Kern wrote: > Here is my take at the current time: > > The experimentalist in me cries out that we should make just one major > infrastructure change at a time. For a variety of reasons, from IT > support issues to just plain hating {{{}}}, I think replacing the bug > tracker should be the change to make right now. > I think this is a sensible approach. Getting the up-time and bug tracking issues straightened will be a big help. > > That said, we can try a blessed DVCS-SVN bridge and see how it works > out. It doesn't solve all problems, but it should enable a better > workflow for casual contributors, the raison d'être for this > discussion. It also doesn't commit us to anything while we are > separately seeing how the tracker changes work out.
> Git on windows has improved a lot in the last year. Another year or two might make it a good solution for everyone. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From wbaxter at gmail.com Mon Feb 23 23:21:23 2009 From: wbaxter at gmail.com (Bill Baxter) Date: Tue, 24 Feb 2009 13:21:23 +0900 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231740t2204860cnb71e961eb3db3dcd@mail.gmail.com> Message-ID: > Git on windows has improved a lot in the last year. Another year or two > might make it a good solution for everyone. I gave msys-Git a try last week and it was not smooth sailing for me. Seemed to maybe be having conflicts with the Unix tool ports I usually use. I'll probably give it another try next year or so. To me it looks like a race to see if Bzr can improve its performance before Git improves it's Windows support, or if Hg will improve its Gui and overall ease of use before both those guys. Nobody yet seems to have the formula for world dominance quite down. --bb From cournape at gmail.com Mon Feb 23 23:52:42 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 24 Feb 2009 13:52:42 +0900 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231740t2204860cnb71e961eb3db3dcd@mail.gmail.com> Message-ID: <5b8d13220902232052v11dd214auab455428982bf2e3@mail.gmail.com> On Tue, Feb 24, 2009 at 1:21 PM, Bill Baxter wrote: >> Git on windows has improved a lot in the last year. Another year or two >> might make it a good solution for everyone. > > I gave msys-Git a try last week and it was not smooth sailing for me. 
> Seemed to maybe be having conflicts with the Unix tool ports I usually > use. IIRC, there is an option to check/uncheck to avoid this exact issue in the installer. Concerning the bzr vs git thing, it is hard to really know the problems without having used both of them extensively; a lot are just "look how cool git is", look how cool bzr is. I wrote a couple of months ago a comparison which totally omits the speed aspect if that interests you: http://cournape.wordpress.com/2008/10/30/going-away-from-bzr-toward-git/ David From cournape at gmail.com Tue Feb 24 00:11:01 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 24 Feb 2009 14:11:01 +0900 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <3d375d730902231740t2204860cnb71e961eb3db3dcd@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231740t2204860cnb71e961eb3db3dcd@mail.gmail.com> Message-ID: <5b8d13220902232111w411774c6tb5f2231cb4b0f80b@mail.gmail.com> On Tue, Feb 24, 2009 at 10:40 AM, Robert Kern wrote: > David and Stéfan, can you work on proposing a new tracker > configuration? I ask that you take a glance at Roundup, but I'll leave > that up to your schedules. That sounds like a plan.
One thing which I think would be helpful is a svn + trac dump, so that we have something to try things on, thank you very much, David From stefan at sun.ac.za Tue Feb 24 00:25:47 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Tue, 24 Feb 2009 07:25:47 +0200 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231740t2204860cnb71e961eb3db3dcd@mail.gmail.com> Message-ID: <9457e7c80902232125s1b98aa13x970a8129af172c5d@mail.gmail.com> 2009/2/24 Pauli Virtanen : > The first option is probably the easiest, if account/password issues can > be sorted out. The second option is probably enough for practical > purposes. Authentication to github is done with SSH keys, so it should be easy to automate. Cheers Stéfan From stefan at sun.ac.za Tue Feb 24 00:39:43 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Tue, 24 Feb 2009 07:39:43 +0200 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <49A339F5.1040703@enthought.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> Message-ID: <9457e7c80902232139k438f04a0h59b7e345dad55c62@mail.gmail.com> Hi Travis 2009/2/24 Travis E. Oliphant : > I think the biggest problem has been time and adding too formal of a > process will just increase the time it takes to get code into SciPy.
> I'm fine with emphasizing documentation and tests as we discuss things > and we should encourage each other, but I'm not comfortable with > hard-line statements like the ones being made above. Yes, such things > are helpful, but they are also expensive and I worry more about what we > lose in contributions. Having so little time means that we cannot be cavalier about adding broken code to SciPy. Like Matthew mentioned, this becomes an immense maintenance burden. > The quality of what we create should emerge as all interested parties > critically look at the code that is available in SciPy. I agree with that sentiment; and looking critically at code in SciPy starts with our own patches. > Not everyone > can do that on the same schedule. I'm opposed to trying to force that > to happen. I very much favor cultivating a culture that wants someone > to fix the problems in their code. Sure, let's be inclusive, but also set a bar. If you make the time to write a patch, make the time to do it well (it does not take long to construct a test -- you have to make sure your code works properly anyhow). > But, my favorite workflow is a bit more chaotic, than that. People > create their own DVCS versions of SciPy using their best judgment and > publish revisions they consider to be working code. > > Branches that are given the thumbs up by 2 people (or 1 on the steering > committee) get pushed to the main branch. This review happens > regularly, on IRC channels at regularly scheduled times. Two eyes on every piece of code in SciPy, that's all we need. Two critical eyes that realise the value of tests and documentation. Your outline above fits in with my view of how this could happen.
Cheers Stéfan From stefan at sun.ac.za Tue Feb 24 00:51:44 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Tue, 24 Feb 2009 07:51:44 +0200 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> Message-ID: <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> Hi Matthew 2009/2/24 Matthew Brett : > I think Stefan's position is that, as more people start using and > contributing to Scipy, it's become near impossible to maintain in a > release-worthy way (Stefan - is that right)? That if we want to > keep going without collapsing we need a more formal process. Exactly. If we keep introducing new bugs ourselves, there's not enough time in the world to bring SciPy up to standard. > I guess the alternative position is that not having a discipline of > code review, testing and documentation will make it more likely we'll > have contributors, and the code will get better that way. That's an interesting position, and one I don't understand entirely. I think that a clear, structured guideline for contributions would make SciPy *easier* to collaborate on. It is much easier to please someone if you know what they want! I would have committed many patches in the past if I had had a guarantee that they were working as advertised. That guarantee is provided by tests.
With the nose framework in place, writing tests is so very easy: def test_myfoo(): assert 1 == 1 So I hope that everyone would agree that proper testing and documentation improves life, not only for the user community, but also for the contributor. Regards Stéfan From cournape at gmail.com Tue Feb 24 01:28:26 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 24 Feb 2009 15:28:26 +0900 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> Message-ID: <5b8d13220902232228y31a09e05r426176bf4d4f6347@mail.gmail.com> Hi Matthew >> I think Stefan's position is that, as more people start using and >> contributing to Scipy, it's become near impossible to maintain in a >> release-worthy way (Stefan - is that right)? That if we want to >> keep going without collapsing we need a more formal process. FWIW, that's my position as well. Several people complained about slow releases - but getting faster releases is only possible with a non linear timeline, where you don't accept random code near a release date. Even linux itself has merge windows to limit things a few weeks before a release, and Linux is not known to have a very formal process. To avoid red-herring, I will only work on real-case examples, as they happened recently. I don't think anybody has been happy with the 18 months between 0.6 and 0.7.
IMO, the bare minimum to do for a release is: - check that it builds on windows, mac os X and Linux (both 32 and 64 bits) - check that it runs the test-suite - check that no blocker issue is kept opened This cannot be done if we don't have tests for new features. If code is pure python and pure computation, I don't mind so much, as long as it passes basic sanity check, but if it involves C or worse Fortran, it is almost guaranteed to break somewhere. Most people only code on one platform, and don't know that it may break something on python 2.4, or python 2.6, or on windows x64, or visual studio, or solaris, etc... Again, a concrete example: kdtree code, through its use of cython, was broken on python 2.4 on linux 64 bits, because of a cython bug. When this happens a few days before a planned release, it is enough to break one more RC, which takes several hours of work for release managers (to rebuild the binaries, set up things on sourceforge, etc...). To be clear, I do not blame the author of the code, I don't expect every contributor to check for those things. But I expect people to care that it is work for other people, and that somebody else has to check those things. Already having "merge windows", and blocked windows (nobody can commit anything without approval from the release management team) would be a huge gain. It would really help for making releases - I don't know a single big open source project which does not use this process in one way or the other. I don't feel like our process is on par with the size of scipy at this point. 
David From wbaxter at gmail.com Tue Feb 24 02:04:32 2009 From: wbaxter at gmail.com (Bill Baxter) Date: Tue, 24 Feb 2009 16:04:32 +0900 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <5b8d13220902232052v11dd214auab455428982bf2e3@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231740t2204860cnb71e961eb3db3dcd@mail.gmail.com> <5b8d13220902232052v11dd214auab455428982bf2e3@mail.gmail.com> Message-ID: On Tue, Feb 24, 2009 at 1:52 PM, David Cournapeau wrote: > On Tue, Feb 24, 2009 at 1:21 PM, Bill Baxter wrote: >>> Git on windows has improved a lot in the last year. Another year or two >>> might make it a good solution for everyone. >> >> I gave msys-Git a try last week and it was not smooth sailing for me. >> Seemed to maybe be having conflicts with the Unix tool ports I usually >> use. > > IIRC, there is an option to check/uncheck to avoid this exact issue in > the installer. Concerning the bzr vs git thing, it is hard to really > know the problems without having used both of them extensively; a lot > are just "look how cool git is", look how cool bzr is. I wrote a > couple of months ago a comparison which totally omits the speed aspect > if that interests you: > http://cournape.wordpress.com/2008/10/30/going-away-from-bzr-toward-git/ Thanks for that. I gave it another try. On the second go I noticed that it said something about making sure I didn't have Cygwin git on my path, which it looks like I did have. So maybe it will work this time... One thing you didn't mention in your annoyances with bzr is the multiple workflows. They say it's an advantage but it seems to me like added complexity for very little benefit. So you can have a checkout instead of a branch just to save yourself the trouble of having to say "bzr push" after a commit. 
But because of that they have to have commands to specify which flavor to use (initially checkout/branch, and to change bind/unbind), and to query the state of the current tree (bzr info gives that), Then they have repos with or without trees and various commands to set and query those states. I just found it unnecessarily confusing compared to Hg. And shared vs unshared repositories. Just too many ways to do things than really necessary. --bb From michael.abshoff at googlemail.com Tue Feb 24 02:31:49 2009 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Mon, 23 Feb 2009 23:31:49 -0800 Subject: [SciPy-dev] Server spam problems spam spam: spam In-Reply-To: References: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> <20090222212703.GR6701@phare.normalesup.org> <49A1C644.5030205@gmail.com> Message-ID: <49A3A265.9060200@gmail.com> Pauli Virtanen wrote: > Sun, 22 Feb 2009 13:40:20 -0800, Michael Abshoff wrote: Hi, > [clip] >> two tips of fighting spammers from the Sage project's wiki: >> >> * add a list of common Chinese words to LocalBadContent, i.e. >> >> http://wiki.sagemath.org/LocalBadContent >> >> Also make sure to clean out all the spammer attempts on the hard disk. >> I.e I deleted 6,000 directories in "pages" of the Cython wiki since Spam >> attempts are preserved and not actually deleted from disk. If you have a >> couple ten thousand of those in one directory this might make every wiki >> access painfully slow and impact the whole server. > > Continuing Gael's work, I tried to expand the LocalBadContent list: > > http://scipy.org/LocalBadContent > > I wonder how useful this turns out to be in the end, this smells like an > arms race... I doubt the additions cause problems to real pages, but if > they do, some of them need to be reverted. We added those six or seven words to out Wiki setup for various wikis and they just work. 
Chinese spam attempts went from dozens a day to none that were successful. I just got tired of despamming the wiki since it made RecentChanges useless to me, so I spent a lot of time cleaning out spammer accounts (a couple thousand in the end).

Another thing I regularly do for some of the wikis is to delete auto-generated spammer accounts, i.e. zkjefgkjq1 to zkjefgkjq102 at some Chinese ISP were somehow not connected to the Sage project ;). Since I manage four different wikis hosted at the same IP with widely different audiences (sage, MPIR, l-functions and cython), simultaneous registration at two or more of them by someone I have never heard of leads to automatic deletion. This policy is possible because l-functions requires account holders to use names along the lines of first letter of first name + last name, and it is enforced. Doing that at the scipy wiki is probably not possible.

> [Btw, shouldn't LocalBadContent editing be restricted to those in
> EditorGroup? And could my account PauliVirtanen be added in the group?]

No spammer has ever edited LocalBadContent in our wikis. I would do it since deleting it would obviously open the gates for spam.

> Another thing is that there are apparently ca. 11600 pages in the
> Scipy.org wiki. I'd make a wild guess that at most ~500 of these are
> valid content; the rest is spam. I'm not sure if getting rid of the spam
> pages improves Moin's performance.
>
> Do we have any valid pages with CJK characters? Much of the spam seems
> Chinese, so mass-deleting at least this portion of it shouldn't be
> impossible to do, given Moin's database format.

Well, 11600 directories in one directory does not exactly improve the directory lookup time (assuming you are using sqlite). I just deleted

rm -rf \(e[0/-9]*

but a visual inspection might be appropriate first.
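
The numbered-account pattern mentioned above (zkjefgkjq1 to zkjefgkjq102) could be flagged automatically before a human decides what to delete. What follows is only a rough sketch, not anything the Sage or SciPy wikis actually run: the function name `numbered_families`, the `min_size` threshold and the sample user names are all made up for illustration. It groups names of the form stem plus trailing digits and reports stems that occur suspiciously often.

```python
import re
from collections import defaultdict


def numbered_families(usernames, min_size=3):
    """Group names shaped like <stem><digits>; keep stems with >= min_size members.

    Hypothetical helper for illustration only; it is not part of Moin.
    """
    families = defaultdict(list)
    for name in usernames:
        # The stem must end in a non-digit and is followed by digits up to the end.
        match = re.fullmatch(r"(.*?\D)(\d+)", name)
        if match:
            families[match.group(1)].append(name)
    return {
        stem: sorted(names)
        for stem, names in families.items()
        if len(names) >= min_size
    }


if __name__ == "__main__":
    # Invented sample data; real account names would come from the wiki's user list.
    sample = ["zkjefgkjq1", "zkjefgkjq2", "zkjefgkjq102", "PauliVirtanen", "user2009"]
    for stem, names in numbered_families(sample).items():
        print("suspicious family:", stem, names)
```

The output is a list of candidates only; a legitimate user with a year in their name would show up in a family of one and be filtered out by the threshold, but the final call should still be made by a person.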
Cheers,
Michael

From david.douard at logilab.fr Tue Feb 24 02:49:11 2009
From: david.douard at logilab.fr (David Douard)
Date: Tue, 24 Feb 2009 08:49:11 +0100
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To:
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902231436y53fff875xe58f8e0b47191b50@mail.gmail.com>
Message-ID: <200902240849.11558.david.douard@logilab.fr>

On Monday 23 February 2009 23:54:04, Charles R Harris wrote:
> On Mon, Feb 23, 2009 at 3:36 PM, Stéfan van der Walt wrote:
> > 2009/2/24 Pauli Virtanen :
> > > I'm wondering if we would benefit from an official, pull-only,
> > > automatically updating, Git mirror of the SVN repository:
> >
> > I'm still wondering what the advantages are of staying with SVN. I
> > haven't heard any compelling arguments so far, whereas we've heard
> > numerous accounts from developers regarding the positive aspects of
> > DVC systems.
>
> Decent windows support via Tortoise

Then why not use Mercurial?
- it is written in Python (which people on this list should consider as a strong and valuable argument :-),
- it has many extensions (they are easy to write, since it's Python code),
- it has a decent win32 integration (with TortoiseHg),
- IIRC, Trac now supports Hg,
- it is probably easier to learn than git (even if the latter has greatly improved in this area).

> and no need to learn a new system.

Sometimes one realizes that one should have learned the new stuff long before. IMHO, DVCSs fall exactly into this category ;-)

David

> Plus
> no need to revamp the whole setup, which is always a bigger pain than one
> plans for. The problems with the current system seem to be ticket tracking
> and branches. I think the first thing to do is take a look at what sage is
> doing and see if we can't refurbish our current system to make it more
> useable.
>
> Chuck

--
David Douard
LOGILAB, Paris (France), +33 1 45 32 03 12
Python, Zope, Debian training: http://www.logilab.fr/formations
Custom software development: http://www.logilab.fr/services
Scientific computing: http://www.logilab.fr/science

From matthieu.brucher at gmail.com Tue Feb 24 03:12:40 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Tue, 24 Feb 2009 09:12:40 +0100
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To:
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <1cd32cbb0902231200t6ad7638ag2dba59d3138d10d0@mail.gmail.com> <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> <9457e7c80902231436y53fff875xe58f8e0b47191b50@mail.gmail.com>
Message-ID:

> This is roughly what we do for Sage, and it's fairly effective at
> getting code merged in. Also, _every_ piece of code that goes in does
> so through via Trac ticket. It's easy to see when bugs have been
> fixed.

Do you have an automated process for this? I know that some people work with merge request plugins, like Bundle Buggy for Bazaar (using emails in this case). Do you use something like this?

Matthieu
--
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From benny.malengier at gmail.com Tue Feb 24 03:52:17 2009
From: benny.malengier at gmail.com (Benny Malengier)
Date: Tue, 24 Feb 2009 09:52:17 +0100
Subject: [SciPy-dev] some new ode solvers
In-Reply-To:
References: <3a1077e70902221556u794f39e1u74f833a812a11423@mail.gmail.com>
Message-ID:

I would just like to note here that I added a scikit, odes, with two extra ODE solvers (actually DAE solvers), so the API had to be different (they are based on a residual, not on a lhs).
See: http://cage.ugent.be/~bm/progs.html However the API changes, it would be nice if this type of dae solvers is considered. Benny 2009/2/23 Rob Clewley > Hi John, > > > Attached is a patch which adds two new ODE solvers to the > > scipy.integrate.ode module. > > The solvers are dopri5 and dop853, which are explicit Runge-Kutta > > pairs originally developed > > by Dormand and Prince. The fortran code was downloaded from: > > > > http://www.unige.ch/~hairer/software.html > > This is good news, and the scipy module certainly needs an updated > API. I hope that previous discussions on this list about API changes > will be looked up as there were some good suggestions then. > > I wonder how much extra work it would be to include H&W's stiff and > delayed ODE and DAE solvers such as Radau, Retard, and Hem? Those > would be of great value to Scipy users, I think, as there's little > high-level language support available for those AFAIK (Radau is in > PyDSTool but not the others). > > Thanks, > Rob > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tritemio at gmail.com Tue Feb 24 05:08:17 2009 From: tritemio at gmail.com (Antonio) Date: Tue, 24 Feb 2009 10:08:17 +0000 (UTC) Subject: [SciPy-dev] matlab io - request for testing References: <1e2af89e0902191942t174d10bdr38d79e7fefa4f2d5@mail.gmail.com> Message-ID: Matthew Brett gmail.com> writes: > > Hi, > > I have been beating up the matlab io rather severely in order to > implement some cleanups, fixes, and add new options. > > I would very much appreciate it if people could pick up the current > SVN and let me know whether they have any problems. I tried the SVN version and found it very fast. Even putting 1M as blocksize in scipy 0.7.0 the new version is a lot faster. 
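
For anyone who wants to repeat this kind of timing comparison, a listing sorted by internal time and cut to the top few entries can be produced with the standard cProfile and pstats modules. This is a sketch only: the `workload` function below is a stand-in that compresses and decompresses a buffer with zlib, and `profile_top` is a made-up helper name; in a real test one would call the matlab loader on an actual .mat file instead.

```python
import cProfile
import io
import pstats
import zlib


def workload():
    """Stand-in for reading a compressed file: compress, then decompress a buffer."""
    payload = zlib.compress(b"scipy" * 200000)
    return zlib.decompress(payload)


def profile_top(func, limit=3):
    """Profile func() and return the pstats report, sorted by internal time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # "time" sorts by internal time; print_stats(limit) keeps the top entries.
    stats.sort_stats("time").print_stats(limit)
    return stream.getvalue()


if __name__ == "__main__":
    print(profile_top(workload))
```

Running the same helper against two installed versions gives reports that can be compared side by side; the absolute numbers will of course depend on the machine and the file.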
Here are benchmarks loading a 50MB matlab file:

*SCIPY 0.7.0 modified with blocksize=1M*

4771 function calls (4768 primitive calls) in 1.318 CPU seconds

Ordered by: internal time
List reduced from 49 to 3 due to restriction <3>

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   400    1.011    0.003    1.011    0.003 {built-in method decompress}
    10    0.086    0.009    0.158    0.016 /usr/lib/python2.5/StringIO.py:95(seek)
     5    0.072    0.014    0.072    0.014 {method 'join' of 'str' objects}

*SCIPY '0.8.0.dev5592'*

582 function calls (579 primitive calls) in 2.957 CPU seconds

Ordered by: internal time
List reduced from 40 to 3 due to restriction <3>

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    27    1.823    0.068    2.846    0.105 gzipstreams.py:77(__fill)
    52    0.963    0.019    0.963    0.019 {built-in method decompress}
     9    0.065    0.007    0.065    0.007 {method 'copy' of 'numpy.ndarray' objects}

> Thanks a lot,

Thanks for your work :)

> Matthew
>

~ Antonio

PS: put me in CC since I'm not a SciPy subscriber

From Ralf_Ahlbrink at web.de Tue Feb 24 05:28:03 2009
From: Ralf_Ahlbrink at web.de (Ralf Ahlbrink)
Date: Tue, 24 Feb 2009 11:28:03 +0100
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <5b8d13220902231010i19753309m15a1da7b4e9ac673@mail.gmail.com>
Message-ID:

David Cournapeau wrote:

> On Tue, Feb 24, 2009 at 2:24 AM, Jonathan Guyer wrote:
>>
>> On Feb 23, 2009, at 11:44 AM, David Cournapeau wrote:
>>
>>> 50 % of the time I create a
>>> branch for numpy, I screw up because I need like 10 commands, which
>>> fail
>>> half of the time for stupid errors or time out.
>> >> I have no opinion on a switch of SciPy to git or anything else, and >> I'm generally interested in the prospects for distributed version >> control, but I really have to ask, what 10 commands could you possibly >> need to execute to create a branch in svn? > > svn cp trunk -> branch > svnmerge switch branch > svnmerge init trunk > svn ci -F svnmerge-commit.txt > svn switch trunk > svnmerge init branch > svn ci -F svnmerge-commit.txt > > Ok, that's 7 :) Hi David, your statement here about subversion branching/merging is somewhat misleading, because you presume subversion <= 1.4. The current version (1.5) supports 'merge-tracking', i.e. the svnmerge functionality is transparently incorporated into svn. See e.g. http://blog.red-bean.com/sussman/?p=92 or http://svnbook.red-bean.com/en/1.5/index.html (especially chapter 4). Migration to 1.5 repositories by svnadmin dump and load actions works well. Regards, Ralf. > > cheers, > > David From cournape at gmail.com Tue Feb 24 07:18:57 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 24 Feb 2009 21:18:57 +0900 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <49A2CC22.8050006@ntc.zcu.cz> <1cd32cbb0902230843j1eee041fp7fdb7aa8cbb2788c@mail.gmail.com> <49A2D257.7070104@ar.media.kyoto-u.ac.jp> <5b8d13220902231010i19753309m15a1da7b4e9ac673@mail.gmail.com> Message-ID: <5b8d13220902240418h736cff7fgcdb420778941c5ae@mail.gmail.com> On Tue, Feb 24, 2009 at 7:28 PM, Ralf Ahlbrink wrote: > David Cournapeau wrote: > >> On Tue, Feb 24, 2009 at 2:24 AM, Jonathan Guyer wrote: >>> >>> On Feb 23, 2009, at 11:44 AM, David Cournapeau wrote: >>> >>>> 50 % of the time I create a >>>> branch for numpy, I screw up because I need like 10 commands, which >>>> fail >>>> half of the time for stupid errors or time out. 
>>> >>> I have no opinion on a switch of SciPy to git or anything else, and >>> I'm generally interested in the prospects for distributed version >>> control, but I really have to ask, what 10 commands could you possibly >>> need to execute to create a branch in svn? >> >> svn cp trunk -> branch >> svnmerge switch branch >> svnmerge init trunk >> svn ci -F svnmerge-commit.txt >> svn switch trunk >> svnmerge init branch >> svn ci -F svnmerge-commit.txt >> >> Ok, that's 7 :) > > Hi David, > > your statement here about subversion branching/merging is somewhat > misleading, because you presume subversion <= 1.4. The current version (1.5) > supports 'merge-tracking', i.e. the svnmerge functionality is transparently > incorporated into svn. See e.g. http://blog.red-bean.com/sussman/?p=92 or > http://svnbook.red-bean.com/en/1.5/index.html (especially chapter 4). > Migration to 1.5 repositories by svnadmin dump and load actions works well. AFAIK, svn 1.5 only solve some of the problems, but it it still very slow, which is one of the main issue. It also fails in case of renames, etc... I have seen reports of people sticking to svnmerge with 1.5. Also, building subversion is a royal PITA, I had to do it once on a CENTOS system, it took me a while - not all distributions have svn 1.5. Frankly, if we change, better change to a better system. svn is just an inferior tool in almost every possible way. Changing svn to 1.5 brings most of the pain that would bring DVCS, and for no clear improvement. 
I think at the moment, we would be better to stick to 1.4 for now, bring some official git mirrors, and work on other issues, David From pwang at enthought.com Tue Feb 24 10:47:41 2009 From: pwang at enthought.com (Peter Wang) Date: Tue, 24 Feb 2009 09:47:41 -0600 Subject: [SciPy-dev] cleaning out wiki spam In-Reply-To: <49A3A265.9060200@gmail.com> References: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> <20090222212703.GR6701@phare.normalesup.org> <49A1C644.5030205@gmail.com> <49A3A265.9060200@gmail.com> Message-ID: Hi everyone, I have gone through with a blunt grep hammer and moved ~9300 pages off of the main scipy wiki. This seems to have helped Moin's performance somewhat. There are still approximately 3300 pages remaining. If folks are interested in a distributed approach to culling the rest of the spam, I can send out an 80kb file listing of the remaining pages. It would be helpful to have both "definite ham" and "definite spam" lists, especially in the foreign language pages and user pages, which are the toughest to figure out. (e.g. What is the difference between French spam and French ham? Surely we have *some* legitimate Chinese contributors on the wiki?) In my wild grepping it's possible I've blown away some good pages. I'm including my list of patterns below, so folks can identify major or obvious problems. The sketchiest (but also the most effective) was eliminating pages with '(2b)', but I recognize that was a pretty broad stroke. Of course, if anyone notices missing pages, please let me know and I will restore the page ASAP. 
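
The move-aside approach described above (match page names against shell-style patterns, move them out of the wiki rather than delete them, restore on request) can be sketched in a few lines of Python. This is an illustration under assumptions, not the commands that were actually run: the function name `quarantine`, the directory names and the demo page names are invented, and a layout with one directory per page is assumed.

```python
import fnmatch
import os
import shutil
import tempfile


def quarantine(pages_dir, quarantine_dir, patterns, dry_run=True):
    """Move page directories whose names match any glob pattern into quarantine_dir.

    With dry_run=True nothing is moved; the matching names are only returned,
    so the list can be inspected first. Hypothetical helper, not part of Moin.
    """
    matched = sorted(
        name
        for name in os.listdir(pages_dir)
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)
    )
    if not dry_run:
        os.makedirs(quarantine_dir, exist_ok=True)
        for name in matched:
            shutil.move(
                os.path.join(pages_dir, name),
                os.path.join(quarantine_dir, name),
            )
    return matched


if __name__ == "__main__":
    # Throwaway directories so the sketch can be run anywhere.
    root = tempfile.mkdtemp()
    pages = os.path.join(root, "pages")
    os.mkdir(pages)
    for name in ["FrontPage", "cheapgold", "Cookbook"]:
        os.mkdir(os.path.join(pages, name))
    print("would move:", quarantine(pages, os.path.join(root, "badpages"), ["*gold*"]))
```

Because pages are moved and not removed, restoring a false positive is a matter of moving one directory back, which is what makes broad patterns tolerable at all.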
-Peter --------------------------- *\(2b\)* *gold* *ffxi* *ountertop* granite* Gold* guild*wars* *Hangzhou* *hangzhou* Injection*Molding* lineage*2* liuhecai* Louis*Vuitton* ltage* Mabinogi* maple*story* Maple*Story* ok????* qq\(* replica* Rohan* rohan* ROHAN* rs* RS* Rs* runescape* Runescape* (e2* (e3* (e4* (e5* (e6* (e7* (e8* (e9* tm?????* xinggan* zxcv* cai* *d0????* hare* Hj* hj* hk* jack* Lex* seo* SEO* tema* Tombstone* usr* *arhammer* *arcraft* *WoW* *wow* *WOW www\(2e\)* zg* zhonggo* 315* 200{6,7,8,9}* 1878* 123* 13* 5* 6* 7* Ajd* baixiao* China* china* game* Game* google* Google* GOOGLE* kcc* nobye* oforu* power* Power* tibet* Tibet* ?urbocharger* ?holesale* ?rusher* From tritemio at gmail.com Tue Feb 24 11:01:31 2009 From: tritemio at gmail.com (Antonio) Date: Tue, 24 Feb 2009 16:01:31 +0000 (UTC) Subject: [SciPy-dev] matlab io - request for testing References: <1e2af89e0902191942t174d10bdr38d79e7fefa4f2d5@mail.gmail.com> Message-ID: Antonio gmail.com> writes: > *SCIPY 0.7.0 modified with blocksize=1M* I swapped the headers cutting and pasting the benchmarks, sorry. The conclusions do not change. 
This one is the new version SCIPY '0.8.0.dev5592 (not the 0.7.0) > 4771 function calls (4768 primitive calls) in 1.318 CPU seconds > > Ordered by: internal time > List reduced from 49 to 3 due to restriction <3> > > ncalls tottime percall cumtime percall filename:lineno(function) > 400 1.011 0.003 1.011 0.003 {built-in method decompress} > 10 0.086 0.009 0.158 0.016 > /usr/lib/python2.5/StringIO.py:95(seek) > 5 0.072 0.014 0.072 0.014 {method 'join' of 'str' objects} > while > *SCIPY '0.8.0.dev5592'* the following refers to the old SCIPY 0.7.0 > 582 function calls (579 primitive calls) in 2.957 CPU seconds > > Ordered by: internal time > List reduced from 40 to 3 due to restriction <3> > > ncalls tottime percall cumtime percall filename:lineno(function) > 27 1.823 0.068 2.846 0.105 gzipstreams.py:77(__fill) > 52 0.963 0.019 0.963 0.019 {built-in method decompress} > 9 0.065 0.007 0.065 0.007 {method 'copy' of 'numpy.ndarray' > objects} > As previously mentioned, the new version is faster. ~ Antonio From robince at gmail.com Tue Feb 24 13:24:24 2009 From: robince at gmail.com (Robin) Date: Tue, 24 Feb 2009 18:24:24 +0000 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <49A33F25.4000903@astraw.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <3d375d730902231311w3c137015t59e8eacd45aabc32@mail.gmail.com> <3d375d730902231327u4f1f41eah56f99857d23578e1@mail.gmail.com> <49A32A4D.5040101@astraw.com> <49A33F25.4000903@astraw.com> Message-ID: On Tue, Feb 24, 2009 at 12:28 AM, Andrew Straw wrote: > Robin wrote: >> On Mon, Feb 23, 2009 at 10:59 PM, Andrew Straw wrote: >>> Robin, can you point us to a public svn repo where non-trivial branching >>> is happening with bzr? I had lots of trouble trying to get anything >>> working in my attempts. >> I'm not sure what qualifies as non-trivial branching (DVCS people say >> all branching should be trivial!). 
I have only used it on my private >> repo myself, but I haven't had any problems. Asking on IRC, a good >> example is GNOME. Here are the developer branches created using >> bzr-svn >> http://bzr-playground.gnome.org/ - although it looks like they aren't >> pushing directly back to svn (but going through patches). >> >> This is the best I could find documentating a workflow for using bzr with svn: >> http://www.serverzen.net/starting-with-bazaar-bzr-svn > > Thanks for the links. > >> But I think to avoid problems with merging the trick is to have a bzr >> checkout from svn as your sort of trunk branch, which you can then >> branch with bzr to create feature branches. Upstream branches can be >> pulled to the trunk branch, then merged to your feature branches, and >> when you want to push stuff back you merge it into your trunk checkout >> (which also commits it to svn). There are options to either push each >> individual commit as an svn commit, or just have a single merge commit >> in svn (the metadata for the bzr commits are there so they can be seen >> by other bzr users). Because all the metadata is in svn, you should be >> able to merge with anyone else who has their branch based on a svn >> checkout... > > My problem when I tried this out was that the svn metadata wasn't the > same across different bzr repos cloned from the same svn repo -- thus no > ability to actually share the bzr branches between bzr repos. It sounds > like you have only tried this from a single bzr repo? Plus, I didn't > like polluting the svn repo with the bzr metadata, particularly given > this no-ability-to-create-the-same-bzr-clones issue. Git solves the > first issue (different git clones of the same svn repo produce the same > git repo) and thereby somewhat eliminates the need to store git metadata > in the svn repo. Which is why I like git-svn more than bzr-svn. 
But the
> lack of one-to-one bidirectional mapping between DVCS branches and svn
> branches it was prevents any of these schemes from working on any DVCS,
> as far as I can see.

I guess this is getting slightly off-topic - also I'm not trying to advocate anything over anything else (and don't have the experience to do so!) but for the record I'm pretty sure what you say here about bzr-svn is wrong.

Separate bzr clones of the same svn repository *are* the same, and one can merge between them. This works without bzr metadata in the svn repo (simple example below). Bzr metadata is only needed to provide the one-to-one mapping you mention, between bzr branches and svn branches (for example, one can check out from subversion using bzr-svn, branch using bazaar, modify, and push back to a different location in svn and it will be an svn branch). Bzr metadata is not visible if you are running svn >= 1.5, and for other versions of svn whatever you are using can be configured to ignore it (it is just svn properties as far as I understand).

I know there are lots of arguments for/against the different DVCSs - but I think the one thing bazaar is clearly ahead on is this svn integration.

# two separate checkouts
jm-g26b101:tmp robince$ bzr co http://svn.scipy.org/svn/scikits/trunk/mlabwrap mlabwrap-trunk
jm-g26b101:tmp robince$ bzr co http://svn.scipy.org/svn/scikits/trunk/mlabwrap mlabwrap-trunk2
# branch each checkout
jm-g26b101:tmp robince$ bzr branch mlabwrap-trunk mlabwrap-branch1
Branched 93 revision(s).
jm-g26b101:tmp robince$ bzr branch mlabwrap-trunk2 mlabwrap-branch2
Branched 93 revision(s).
# add a file to each
jm-g26b101:tmp robince$ cd mlabwrap-branch1
jm-g26b101:mlabwrap-branch1 robince$ echo "a new file" > new.txt
jm-g26b101:mlabwrap-branch1 robince$ bzr add new.txt
adding new.txt
add completed
jm-g26b101:mlabwrap-branch1 robince$ bzr ci -m "add a new file"
Committing to: /Users/robince/tmp/mlabwrap-branch1/
added new.txt
Committed revision 94.
jm-g26b101:mlabwrap-branch1 robince$ cd ../mlabwrap-branch2/ jm-g26b101:mlabwrap-branch2 robince$ echo "another new file" > new.txt jm-g26b101:mlabwrap-branch2 robince$ bzr add new.txt adding new.txt add completed jm-g26b101:mlabwrap-branch2 robince$ bzr ci -m "a new file" Committing to: /Users/robince/tmp/mlabwrap-branch2/ added new.txt Committed revision 94. # merge works fine jm-g26b101:mlabwrap-branch2 robince$ bzr merge ../mlabwrap-branch1/ +N new.txt R new.txt => new.txt.moved Conflict adding file new.txt. Moved existing file to new.txt.moved. 1 conflicts encountered. Cheers Robin From fperez.net at gmail.com Tue Feb 24 13:46:07 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 24 Feb 2009 10:46:07 -0800 Subject: [SciPy-dev] cleaning out wiki spam In-Reply-To: References: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> <20090222212703.GR6701@phare.normalesup.org> <49A1C644.5030205@gmail.com> <49A3A265.9060200@gmail.com> Message-ID: On Tue, Feb 24, 2009 at 7:47 AM, Peter Wang wrote: > In my wild grepping it's possible I've blown away some good pages. > I'm including my list of patterns below, so folks can identify major > or obvious problems. ?The sketchiest (but also the most effective) was > eliminating pages with '(2b)', but I recognize that was a pretty broad > stroke. > power* > Power* I would at least double check these. Things like 'power spectrum' could have ended up killed by this one. The others look pretty safe. In passing, I'll mention how we eventually got rid of the ipython wiki spam. We made the wiki read-only for authenticated users, with only those listed here: http://ipython.scipy.org/moin/WritersGroup being allowed to write. Anyone who asks is added immediately to this list, so the barrier is low for legitimate contributions, and any of these people: http://ipython.scipy.org/moin/EditorsGroup can edit the writers list. 
This way it's easy to ensure there will be always someone around who can add writers with minimal delay for real contributions, while keeping the spammers out. It may be that with the new moin this approach isn't necessary, but for ipython it was the only way to finally eliminate the spam problem. And it did, 100%. Cheers, f From matthew.brett at gmail.com Tue Feb 24 13:59:01 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 24 Feb 2009 10:59:01 -0800 Subject: [SciPy-dev] Scipy workflow (and not tools). Message-ID: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> Hi, I've split this off into a new thread because I felt there were two issues in Stefan's original thread. This is in the hope that we can stimulate discussion on the workflow (as opposed to - say - which version control system to use, or which bugtracker). I would be very interested to see if we can come to a consensus on the important discussion of whether to introduce fairly formal code review into the scipy workflow. I've appended the key piece of discussion below. > 2009/2/24 Travis E. Oliphant : >> I think the biggest problem has been time and adding too formal of a >> process will just increase the time it takes to get code into SciPy. >> I'm fine with emphasizing documentation and tests as we discuss things >> and we should encourage each other, but I'm not comfortable with >> hard-line statements like the ones being made above. ?Yes, such things >> are helpful, but they are also expensive and I worry more about what we >> lose in contributions. > > Having so little time means that we cannot be cavalier about adding > broken code to SciPy. ?Like Matthew mentioned, this becomes an immense > maintenance burden. > >> The quality of what we create should emerge as all interested parties >> critically look at the code that is available in SciPy. > > I agree with that sentiment; and looking critically at code in SciPy > starts with our own patches. 
> >> Not everyone >> can do that on the same schedule. ?I'm opposed to trying to force that >> to happen. ?I very much favor cultivating a culture that wants someone >> to fix the problems in their code. > > Sure, let's be inclusive, but also set a bar. ?If you make the time to > write a patch, make the time to do it well (it does not take long to > construct a test -- you have to make sure your code works properly > anyhow). > >> But, my favorite workflow is a bit more chaotic, than that. ?People >> create their own DVCS versions of SciPy using their best judgment and >> publish revisions they consider to be working code. >> >> Branches that are given the thumbs up by 2 people (or 1 on the steering >> committee) get pushed to the main branch. ? ? ?This review happens >> regularly, on IRC channels at regularly scheduled times. > > Two eyes on every piece of code in SciPy, that's all we need. ?Two > critical eyes that realise the value of tests and documentation. ?Your > outline above fits in with my view of how this could happen. Best, Matthew From pwang at enthought.com Tue Feb 24 14:33:05 2009 From: pwang at enthought.com (Peter Wang) Date: Tue, 24 Feb 2009 13:33:05 -0600 Subject: [SciPy-dev] cleaning out wiki spam In-Reply-To: References: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> <20090222212703.GR6701@phare.normalesup.org> <49A1C644.5030205@gmail.com> <49A3A265.9060200@gmail.com> Message-ID: <2991E947-B332-4979-9D4D-9B089AC615E7@enthought.com> On Feb 24, 2009, at 12:46 PM, Fernando Perez wrote: > On Tue, Feb 24, 2009 at 7:47 AM, Peter Wang > wrote: > >> In my wild grepping it's possible I've blown away some good pages. >> I'm including my list of patterns below, so folks can identify major >> or obvious problems. The sketchiest (but also the most effective) >> was >> eliminating pages with '(2b)', but I recognize that was a pretty >> broad >> stroke. 
>> power*
>> Power*
>
> I would at least double check these. Things like 'power spectrum'
> could have ended up killed by this one.

Indeed. For common English words I was careful to do an ls first and then "mv -v".

> It may be that with the new moin this approach isn't necessary, but
> for ipython it was the only way to finally eliminate the spam problem.
> And it did, 100%.

I would not be averse to locking things down a bit; OTOH, if we move to the new Moin on the new server with CAPTCHAs, that might do most of the trick.

Incidentally, I went through and cleared out 2500 spam pages from the ipython wiki directory as well, and moved them into /home/ipython/wiki/data/badpages. These were done with a much more conservative set of patterns than what I applied to the main scipy page, and I'm fairly confident they were all spam (mostly Chinese characters, World of Warcraft gold, etc.).

-Peter

From charlesr.harris at gmail.com Tue Feb 24 16:13:44 2009
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 24 Feb 2009 14:13:44 -0700
Subject: [SciPy-dev] Scipy workflow (and not tools).
In-Reply-To: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com>
References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com>
Message-ID:

Hi Matthew,

On Tue, Feb 24, 2009 at 11:59 AM, Matthew Brett wrote:

> Hi,
>
> I've split this off into a new thread because I felt there were two
> issues in Stefan's original thread.
>
> This is in the hope that we can stimulate discussion on the workflow
> (as opposed to - say - which version control system to use, or which
> bugtracker).
>
> I would be very interested to see if we can come to a consensus on the
> important discussion of whether to introduce fairly formal code review
> into the scipy workflow. I've appended the key piece of discussion
> below.
>
> > 2009/2/24 Travis E.
Oliphant : > >> I think the biggest problem has been time and adding too formal of a > >> process will just increase the time it takes to get code into SciPy. > >> I'm fine with emphasizing documentation and tests as we discuss things > >> and we should encourage each other, but I'm not comfortable with > >> hard-line statements like the ones being made above. Yes, such things > >> are helpful, but they are also expensive and I worry more about what we > >> lose in contributions. > > > > Having so little time means that we cannot be cavalier about adding > > broken code to SciPy. Like Matthew mentioned, this becomes an immense > > maintenance burden. > > > >> The quality of what we create should emerge as all interested parties > >> critically look at the code that is available in SciPy. > > > > I agree with that sentiment; and looking critically at code in SciPy > > starts with our own patches. > > > >> Not everyone > >> can do that on the same schedule. I'm opposed to trying to force that > >> to happen. I very much favor cultivating a culture that wants someone > >> to fix the problems in their code. > > > > Sure, let's be inclusive, but also set a bar. If you make the time to > > write a patch, make the time to do it well (it does not take long to > > construct a test -- you have to make sure your code works properly > > anyhow). > > > >> But, my favorite workflow is a bit more chaotic, than that. People > >> create their own DVCS versions of SciPy using their best judgment and > >> publish revisions they consider to be working code. > >> > >> Branches that are given the thumbs up by 2 people (or 1 on the steering > >> committee) get pushed to the main branch. This review happens > >> regularly, on IRC channels at regularly scheduled times. > > > > Two eyes on every piece of code in SciPy, that's all we need. Two > > critical eyes that realise the value of tests and documentation. Your > > outline above fits in with my view of how this could happen. 
> I don't think there are enough eyes at this point for a strict review policy. How many of the current packages have any maintainer? Who was maintaining the stats package before Josef got involved? How many folks besides Robert could look over the changes usefully? How many folks looked over Travis' recent addition to optimize? Who is working on the interpolation package? I think at this point we would be better off trying to recruit at least one person to "own" each package. For new packages that is usually the person who committed it but we also need ownership of older packages. Someone with a personal stake in a package is likely to do more for quality assurance at this point than any amount of required review. I don't have a problem with folks complaining about missing tests, etc., but I worry that if we put too many review steps into the submission path there won't be enough people to make it work. Chuck From mhansen at gmail.com Tue Feb 24 16:48:47 2009 From: mhansen at gmail.com (Mike Hansen) Date: Tue, 24 Feb 2009 13:48:47 -0800 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> Message-ID: On Tue, Feb 24, 2009 at 1:13 PM, Charles R Harris wrote: > I don't think there are enough eyes at this point for a strict review > policy. How many of the current packages have any maintainer? Who was > maintaining the stats package before Josef got involved? How many folks > besides Robert could look over the changes usefully? How many folks looked > over Travis' recent addition to optimize? Who is working on the > interpolation package? > > I think at this point we would be better off trying to recruit at least one > person to "own" each package. For new packages that is usually the person > who committed it but we also need ownership of older packages.
Someone with > a personal stake in a package is likely to do more for quality assurance at > this point than any amount of required review. It doesn't seem your current process is conducive to have a "maintainer" for each package. What is the maintainer supposed to do? Monitor all the relevant SVN commits hoping that some broken, untested change doesn't go in behind his or her back? With a review process, maintainers emerge since code doesn't get included if they don't. You also get a lot more people knowledgeable about more areas of the codebase. I think Stefan's comment should be reiterated: > Having so little time means that we cannot be cavalier about adding > broken code to SciPy. Like Matthew mentioned, this becomes an immense > maintenance burden. --Mike From gael.varoquaux at normalesup.org Tue Feb 24 16:57:41 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 24 Feb 2009 22:57:41 +0100 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> Message-ID: <20090224215741.GD26812@phare.normalesup.org> On Tue, Feb 24, 2009 at 01:48:47PM -0800, Mike Hansen wrote: > It doesn't seem your current process is conducive to have a > "maintainer" for each package. What is the maintainer supposed to do? > Monitor all the relevant SVN commits hoping that some broken, > untested change doesn't go in behind his or her back? > With a review process, maintainers emerge since code doesn't get > included if they don't. You also get a lot more people knowledgeable > about more areas of the codebase. The problem is simply that you are lacking people. No more, no less. I say you because my contribution to scipy code has been nothing, although I try to be supportive of the project. I wonder how we go ahead and fix this problem. Tough question. What puzzles me is that there are plenty of people writing numerical code. In external packages, or in their in-house libraries. 
Why these people are not interested in putting their effort into something bigger, and thus probably longer-lived, puzzles me. Bah... Gaël From charlesr.harris at gmail.com Tue Feb 24 17:02:34 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 24 Feb 2009 15:02:34 -0700 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> Message-ID: On Tue, Feb 24, 2009 at 2:48 PM, Mike Hansen wrote: > On Tue, Feb 24, 2009 at 1:13 PM, Charles R Harris > wrote: > > I don't think there are enough eyes at this point for a strict review > > policy. How many of the current packages have any maintainer? Who was > > maintaining the stats package before Josef got involved? How many folks > > besides Robert could look over the changes usefully? How many folks > looked > > over Travis' recent addition to optimize? Who is working on the > > interpolation package? > > > > I think at this point we would be better off trying to recruit at least > one > > person to "own" each package. For new packages that is usually the person > > who committed it but we also need ownership of older packages. Someone > with > > a personal stake in a package is likely to do more for quality assurance > at > > this point than any amount of required review. > > It doesn't seem your current process is conducive to have a > "maintainer" for each package. What is the maintainer supposed to do? > Monitor all the relevant SVN commits hoping that some broken, > untested change doesn't go in behind his or her back? > > With a review process, maintainers emerge since code doesn't get > included if they don't. You also get a lot more people knowledgeable > about more areas of the codebase. > > I think Stefan's comment should be reiterated: > > > Having so little time means that we cannot be cavalier about adding > > broken code to SciPy. Like Matthew mentioned, this becomes an immense > > maintenance burden.
> Are we adding a lot of broken code? Does the gain offset the pain? I think we need more folks with commit privileges and interest. In the short term I would propose the following. 1) Additions get posted on the mailing list for comment before commit. I'll bet few knew of the additions to optimize. 2) We look for folks with patches in trac and consider giving them commit privileges to fix things up. 3) We put together a list of needed tests. Then we will see how serious folks are about writing tests. Chuck From fperez.net at gmail.com Tue Feb 24 17:15:47 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 24 Feb 2009 14:15:47 -0800 Subject: [SciPy-dev] cleaning out wiki spam In-Reply-To: <2991E947-B332-4979-9D4D-9B089AC615E7@enthought.com> References: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> <20090222212703.GR6701@phare.normalesup.org> <49A1C644.5030205@gmail.com> <49A3A265.9060200@gmail.com> <2991E947-B332-4979-9D4D-9B089AC615E7@enthought.com> Message-ID: On Tue, Feb 24, 2009 at 11:33 AM, Peter Wang wrote: > I would not be adverse to locking things down a bit; OTOH, if we move > to the new Moin on the new server with CAPTCHAs, that might do most of > the trick. That would be great. I'm not totally happy with having had to lock things down, but it was the only solution at the time. > > Incidentally, I went through and cleared out 2500 spam pages from the > ipython wiki directory as well, and moved them into /home/ipython/wiki/ > data/badpages. These were done with a much more conservative set of > patterns than what I applied to the main scipy page, and I'm fairly > confident they were all spam (mostly Chinese characters, World of > Warcraft gold, etc.). Thanks a lot! We did have a lot of that for a while (before the lockdown), and any cleanup that helps the server be more responsive is welcome.
Cheers, f From mhansen at gmail.com Tue Feb 24 17:17:00 2009 From: mhansen at gmail.com (Mike Hansen) Date: Tue, 24 Feb 2009 14:17:00 -0800 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> Message-ID: On Tue, Feb 24, 2009 at 2:02 PM, Charles R Harris wrote: > Are we adding a lot of broken code? Does the gain offset the pain? I think > we need more folks with commit privileges and interest. In the short term I > would propose the following. Many would argue that any untested code is broken code and will become an maintenance burden in the future. > 1) Additions get posted on the mailing list for comment before commit. I'll > bet few knew of the additions to optimize. Does this really provide motivation for people to look at code? > 2) We look for folks with patches in trac and consider giving them commit > privileges to fix things up. Ideally, you don't want anything committed until it's "fixed up". Using SVN makes this a bit more difficult to do. > 3) We put together a list of needed tests. Then we will see how serious folks are about writing tests. It's definitely important to know what you're actually testing and what you're not. Also, being able to do so in an automated way is important. http://ivory.idyll.org/blog/feb-09/people-who-dont-use-code-coverage-are-idiots.html --Mike From stefan at sun.ac.za Tue Feb 24 18:02:43 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 25 Feb 2009 01:02:43 +0200 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> Message-ID: <9457e7c80902241502p5be85f0atb730788ba7ab66b6@mail.gmail.com> Charles, 2009/2/25 Charles R Harris : >> On Tue, Feb 24, 2009 at 1:13 PM, Charles R Harris >> wrote: >> > I don't think there are enough eyes at this point for a strict review >> > policy. 
How many of the current packages have any maintainer? Who was >> > maintaining the stats package before Josef got involved? How many folks >> > besides Robert could look over the changes usefully? How many folks >> > looked >> > over Travis' recent addition to optimize? Who is working on the >> > interpolation package? Thank you for providing some prime examples of why code review and testing are needed. I didn't know about the changes to the optimisation module, but now I have to ask these questions: 1. Is it quality code, suitable for SciPy? (It is, I read the code, or in other words *reviewed* it) 2. Does it work? I don't know. Nobody does. There aren't any tests, no guarantees. That would be my reaction to one change, but there were 6 or more, none with any tests (and at least one contains a spelling mistake!). How do we know that they work under Windows, Solaris, Linux and OSX? Worse: what if I decide to make some updates to that code but, not having understood the author's intention perfectly, break it horribly? Who would be any the wiser? Tests protect the user and the developer alike. It is irresponsible to carry on the way we do. > Are we adding a lot of broken code? Yes, we are. And it is OK to write broken code, we all do. My argument is that, together, we write better code (review), and by using the tools at our disposal (testing), we minimise the chances of failure. > Does the gain offset the pain? I think > we need more folks with commit privileges and interest. In the short term I > would propose the following. More folks with commit privileges would just perpetuate this chaos. Our community is sophisticated enough not to apply a brute-force solution to the problem. > 1) Additions get posted on the mailing list for comment before commit. I'll > bet few knew of the additions to optimize. Simply being aware of a patch does not improve its quality. > 3) We put together a list of needed tests.
Then we will see how serious > folks are about writing tests. It should not be anyone's job to clean up after his/her peers. If each patch is accompanied by appropriate tests, this situation would never occur. Regards Stéfan From wnbell at gmail.com Tue Feb 24 18:17:13 2009 From: wnbell at gmail.com (Nathan Bell) Date: Tue, 24 Feb 2009 18:17:13 -0500 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> Message-ID: On Tue, Feb 24, 2009 at 4:13 PM, Charles R Harris wrote: > > I think at this point we would be better off trying to recruit at least one > person to "own" each package. For new packages that is usually the person > who committed it but we also need ownership of older packages. Someone with > a personal stake in a package is likely to do more for quality assurance at > this point than any amount of required review. > +1 This ought to be at the top of everyone's list. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From rob.clewley at gmail.com Tue Feb 24 18:18:44 2009 From: rob.clewley at gmail.com (Rob Clewley) Date: Tue, 24 Feb 2009 18:18:44 -0500 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> Message-ID: Hi, > I think at this point we would be better off trying to recruit at least one > person to "own" each package. For new packages that is usually the person > who committed it but we also need ownership of older packages. Someone with > a personal stake in a package is likely to do more for quality assurance at > this point than any amount of required review. I would hesitate to make the model as strong as "ownership". Maybe "curator"?
I don't mean to play with semantics but the choice of language for the model will be important in giving the right impression to new and/or timid users/contributors (myself included) who don't need to be put off getting involved because of perceived responsibilities. Ownership suggests a strict hierarchy, and potential curators will be less likely to get involved if the workflow model labels them "owners." Also, this perception also enables non-owners (who might perceive themselves as unqualified to help) to justify leaving the poor blighters to do everything by themselves. I don't want to have the responsibility of "owning" anything about the existing code for ODE solving (and maybe some other numerical methods), even though I have some stake in it. But I'll happily share in some of the reviewing and possibly testing of changes and improvements to that code. So, can't there be informal teams of curatorship so that not everyone involved has to be really familiar with the tools discussed in the other thread?! Unfortunately I cannot afford the time to ride the waves of changing fashion in VCS, etc. Wouldn't this help to get more people involved? ... those many people that Gael correctly assumes are out there but staying silent! -Rob From jtravs at gmail.com Tue Feb 24 18:21:22 2009 From: jtravs at gmail.com (John Travers) Date: Tue, 24 Feb 2009 23:21:22 +0000 Subject: [SciPy-dev] complex wrapper to ode In-Reply-To: References: <3a1077e70902231025n2dc7e3deo284f7632b2e26c3a@mail.gmail.com> Message-ID: <3a1077e70902241521p4e160b50p8bcbb8f60529b8dc@mail.gmail.com> On Mon, Feb 23, 2009 at 8:54 PM, Pauli Virtanen wrote: > Mon, 23 Feb 2009 18:25:45 +0000, John Travers wrote: >> Attached is a patch which adds a wrapper class 'zode' to integrate.ode. >> It allows one to conviniently solve systems of odes with complex values >> using the existing real valued solvers vode, dopri5, dop853, instead of >> zode, by simply integrating the real/imag parts. 
>> >> Is this worth committing? > > Looks good to me, and may be generally useful, so I'm +1 OK, it was committed as rev 5594 with the following corrections: > > But before committing, I'd suggest a couple of things: > > - The name 'zode' is slightly confusing vs. ZVODE and not very > descriptive. Maybe 'complex_ode' would be better? Fixed. > > This would leave us wiggle room later on with the naming... > > - Is it possible to do the real -> complex switch automatically, > based on the type of return value from (a trial evaluation of) f? > > On a second thought, this might be brittle. I think this would be too much black magic. At least with the current approach the user's intention must be explicit. > - Since 'ode' supports Jacobians, it'd be nice if the wrapper supported > them, too. I've added this, but it could do with more testing as I'm a little unsure of the signs. It passes the one complex problem test with a Jacobian. >> It appears to me to be considerably faster than >> zvode for my big systems of equations. I'm not sure why, as I >> intuitively thought all the data copying etc. would slow it down. > > Is your RHS an analytic function of all of the variables? The ZVODE docs > seem to mention this as a requirement. But I don't know if the ZVODE > implementation itself is supposed to be fast. I think it is only a requirement for the stiff solver. But my RHS is analytic anyway. Further testing has shown that vode only has a slight advantage over zvode, but dopri5 with the complex wrapper thrashes them both. Cheers, John From robert.kern at gmail.com Tue Feb 24 18:28:30 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 24 Feb 2009 17:28:30 -0600 Subject: [SciPy-dev] Scipy workflow (and not tools).
In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> Message-ID: <3d375d730902241528r72a47e86v5bf23b9b81df9896@mail.gmail.com> On Tue, Feb 24, 2009 at 15:13, Charles R Harris wrote: > I think at this point we would be better off trying to recruit at least one > person to "own" each package. For new packages that is usually the person > who committed it but we also need ownership of older packages. Someone with > a personal stake in a package is likely to do more for quality assurance at > this point than any amount of required review. "Ownership" has a bad failure mode. Case in point: nominally, I am the "owner" of scipy.stats and numpy.random and completely failed to move Josef's patches along. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From wnbell at gmail.com Tue Feb 24 18:32:17 2009 From: wnbell at gmail.com (Nathan Bell) Date: Tue, 24 Feb 2009 18:32:17 -0500 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> Message-ID: On Tue, Feb 24, 2009 at 6:18 PM, Rob Clewley wrote: > > I would hesitate to make the model as strong as "ownership". Maybe > "curator"? I don't mean to play with semantics but the choice of > language for the model will be important in giving the right > impression to new and/or timid users/contributors (myself included) > who don't need to be put off getting involved because of perceived > responsibilities. Ownership suggests a strict hierarchy, and potential > curators will be less likely to get involved if the workflow model > labels them "owners." Also, this perception also enables non-owners > (who might perceive themselves as unqualified to help) to justify > leaving the poor blighters to do everything by themselves. 
> > I don't want to have the responsibility of "owning" anything about the > existing code for ODE solving (and maybe some other numerical > methods), even though I have some stake in it. But I'll happily share > in some of the reviewing and possibly testing of changes and > improvements to that code. > > So, can't there be informal teams of curatorship so that not everyone > involved has to be really familiar with the tools discussed in the > other thread?! Unfortunately I cannot afford the time to ride the > waves of changing fashion in VCS, etc. > > Wouldn't this help to get more people involved? ... those many people > that Gael correctly assumes are out there but staying silent! > I wouldn't get too hung-up on the word "owner". I think the necessary part is that one or more people feel some level of responsibility for each component of scipy. As an example, in Trac you can configure an "owner" (their term, not mine) for each component. When a ticket is issued the owner will receive an email. For simple fixes, this facilitates quick turn around. Ideally, each component would have at least one owner who was notified when tickets are issued. Here's the current list: Name Owner Build issues cdavid Other somebody Trac somebody Website somebody numexpr cookedm scipy.cluster somebody scipy.fftpack somebody scipy.integrate somebody scipy.interpolate somebody scipy.io somebody scipy.lib somebody scipy.linalg somebody scipy.maxentropy somebody scipy.misc somebody scipy.ndimage somebody scipy.odr cdavid scipy.optimize somebody scipy.signal somebody scipy.sparse wnbell scipy.sparse.linalg wnbell scipy.spatial peridot scipy.special somebody scipy.stats somebody scipy.weave somebody -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wnbell at gmail.com Tue Feb 24 18:33:33 2009 From: wnbell at gmail.com (Nathan Bell) Date: Tue, 24 Feb 2009 18:33:33 -0500 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <3d375d730902241528r72a47e86v5bf23b9b81df9896@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <3d375d730902241528r72a47e86v5bf23b9b81df9896@mail.gmail.com> Message-ID: On Tue, Feb 24, 2009 at 6:28 PM, Robert Kern wrote: > > "Ownership" has a bad failure mode. Case in point: nominally, I am the > "owner" of scipy.stats and numpy.random and completely failed to move > Josef's patches along. > Does Josef not now "own" certain parts of scipy.stats? /case in point :) -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From stefan at sun.ac.za Tue Feb 24 18:50:01 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 25 Feb 2009 01:50:01 +0200 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> Message-ID: <9457e7c80902241550r153378a3o90a6f795c2fe9baa@mail.gmail.com> Rob, 2009/2/25 Rob Clewley : > I don't want to have the responsibility of "owning" anything about the > existing code for ODE solving (and maybe some other numerical > methods), even though I have some stake in it. But I'll happily share > in some of the reviewing and possibly testing of changes and > improvements to that code. I think you raise an important point. As an Open Source project, we could could succeed without much (any) formal hierarchy. Naturally, the system evolves into a kind of meritocracy, where capability and dedication counts heavily (as it should). Informal ownership (I care for this code, therefore I take note of its progress) is helpful in moderate doses. For example, I like the fact that Nathan reviews all patches related to sparse matrices; he knows that part of SciPy extremely well, and his advice is valuable. 
Of course, other reviews of such a patch would be just as welcome. The main argument I tried to put forth earlier was: Contributing to a project is easy when you know what is expected of you (clear guidelines), and when you know that you'll be treated on merit alone (the same as everybody else). Merit carries no malice, and is about as impartial as it gets. Looking forward to your input on ODEs! :) Cheers Stéfan From charlesr.harris at gmail.com Tue Feb 24 18:57:37 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 24 Feb 2009 16:57:37 -0700 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> Message-ID: On Tue, Feb 24, 2009 at 4:32 PM, Nathan Bell wrote: > On Tue, Feb 24, 2009 at 6:18 PM, Rob Clewley > wrote: > > > > I would hesitate to make the model as strong as "ownership". Maybe > > "curator"? I don't mean to play with semantics but the choice of > > language for the model will be important in giving the right > > impression to new and/or timid users/contributors (myself included) > > who don't need to be put off getting involved because of perceived > > responsibilities. Ownership suggests a strict hierarchy, and potential > > curators will be less likely to get involved if the workflow model > > labels them "owners." Also, this perception also enables non-owners > > (who might perceive themselves as unqualified to help) to justify > > leaving the poor blighters to do everything by themselves. > > > > I don't want to have the responsibility of "owning" anything about the > > existing code for ODE solving (and maybe some other numerical > > methods), even though I have some stake in it. But I'll happily share > > in some of the reviewing and possibly testing of changes and > > improvements to that code.
> > > > So, can't there be informal teams of curatorship so that not everyone > > involved has to be really familiar with the tools discussed in the > > other thread?! Unfortunately I cannot afford the time to ride the > > waves of changing fashion in VCS, etc. > > > > Wouldn't this help to get more people involved? ... those many people > > that Gael correctly assumes are out there but staying silent! > > > > I wouldn't get too hung-up on the word "owner". I think the necessary part > is that one or more people feel some level of responsibility for each > component of scipy. > > As an example, in Trac you can configure an "owner" (their term, not mine) > for each component. When a ticket is issued the owner will receive an > email. For simple fixes, this facilitates quick turn around. Ideally, each > component would have at least one owner who was notified when tickets are > issued. > > Here's the current list: > > Name Owner > Build issues cdavid > Other somebody > Trac somebody > Website somebody > numexpr cookedm > scipy.cluster somebody > scipy.fftpack somebody > scipy.integrate somebody > scipy.interpolate somebody > scipy.io somebody > scipy.lib somebody > scipy.linalg somebody > scipy.maxentropy somebody > scipy.misc somebody > scipy.ndimage somebody > scipy.odr cdavid > scipy.optimize somebody > scipy.signal somebody > scipy.sparse wnbell > scipy.sparse.linalg wnbell > scipy.spatial peridot > scipy.special somebody > scipy.stats somebody > scipy.weave somebody > > Nice list. I note that Anne was quick to fix problems in scipy.spatial and David works hard on build issues. I feel responsible for the 1D solvers in scipy.optimize and the sort functions in Numpy. Josef should probably be added to the stats list. I nominate Pauli for special functions (Hi Pauli). A finer breakdown of categories might help in parceling out responsibilities. Volunteers? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From strawman at astraw.com Tue Feb 24 19:06:02 2009 From: strawman at astraw.com (Andrew Straw) Date: Tue, 24 Feb 2009 16:06:02 -0800 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <9457e7c80902241550r153378a3o90a6f795c2fe9baa@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <9457e7c80902241550r153378a3o90a6f795c2fe9baa@mail.gmail.com> Message-ID: <49A48B6A.50101@astraw.com> St?fan van der Walt wrote: > Rob, > > 2009/2/25 Rob Clewley : >> I don't want to have the responsibility of "owning" anything about the >> existing code for ODE solving (and maybe some other numerical >> methods), even though I have some stake in it. But I'll happily share >> in some of the reviewing and possibly testing of changes and >> improvements to that code. > > I think you raise an important point. > > As an Open Source project, we could could succeed without much (any) > formal hierarchy. Naturally, the system evolves into a kind of > meritocracy, where capability and dedication counts heavily (as it > should). > > Informal ownership (I care for this code, therefore I take note of its > progress) is helpful in moderate doses. For example, I like the fact > that Nathan reviews all patches related to sparse matrices; he knows > that part of SciPy extremely well, and his advice is valuable. Of > course, other reviews of such a patch would be just as welcome. > > The main argument I tried to put forth earlier was: > > Contributing to a project is easy when you know what is expected of > you (clear guidelines), and when you know that you'll be treated on > merit alone (the same as everybody else). Merit carries no malice, > and is about as impartial as it gets. 
> I also want to point out that a formal code review process that is open (such as a web gui) encourages participation by people who may not feel they have the time or abilities to write new code, but would feel comfortable commenting on a patch sitting in front of them. I think new developers could be fostered this way, too. Finally, while the prospect of having code go up for review won't encourage Travo to submit new code, it might have that effect on someone with less experience who is afraid that his/her new feature won't compile on Windows (for example). A formal code review process allows that person to put something online and say "don't apply as-is -- I'm looking for help integrating with Windows" or simply "I got this far, but I wonder how to do XXX". I realize these things are possible, to a degree, with Trac, but I agree that a better workflow process would be very useful. -Andrew From pav at iki.fi Tue Feb 24 19:11:45 2009 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 25 Feb 2009 00:11:45 +0000 (UTC) Subject: [SciPy-dev] The future of SciPy and its development infrastructure References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231740t2204860cnb71e961eb3db3dcd@mail.gmail.com> <3d375d730902231848n381099cp4f791018a5ab63a7@mail.gmail.com> Message-ID: Mon, 23 Feb 2009 20:48:43 -0600, Robert Kern wrote: > On Mon, Feb 23, 2009 at 20:42, Pauli Virtanen wrote: >> Mon, 23 Feb 2009 19:40:27 -0600, Robert Kern wrote: [clip] >>> Pauli, you seem familiar with setting up a git-to-svn bridge. Can you >>> do this? >> >> Sure. I'll need a box on which to deploy the update script, though. >> Would one (which?) of the virtual hosts of conference.scipy.org do? > > Probably. That machine is where all of the services will be moving to. > Peter might be able to say which one.
As a band-aid before this is configured, I put hourly updating mirrors on Github: http://github.com/pv/numpy-svn http://github.com/pv/scipy-svn Not much new under the sun here: these are based on and compatible with David's Git branches. Since git-svn produces a reproducible tree, I think we can just pronounce our trees as the official Git mirror. Some time ago David already wrote documentation on what to do when you want to use git-svn to commit changes back to SVN: http://scipy.org/scipy/numpy/wiki/GitMirror If there are merges in the stuff to be dcommitted, git-svn seems to become a bit confusing. It's probably useful to do "git-rebase -i HASH" to linearize history before dcommit, but I'm a bit unsure what's the proper workflow in this case... -- Pauli Virtanen From wnbell at gmail.com Tue Feb 24 19:13:24 2009 From: wnbell at gmail.com (Nathan Bell) Date: Tue, 24 Feb 2009 19:13:24 -0500 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> Message-ID: On Tue, Feb 24, 2009 at 1:59 PM, Matthew Brett wrote: > Hi, > > I've split this off into a new thread because I felt there were two > issues in Stefan's original thread. > > This is in the hope that we can stimulate discussion on the workflow > (as opposed to - say - which version control system to use, or which > bugtracker). > > I would be very interested to see if we can come to a consensus on the > important discussion of whether to introduce fairly formal code review > into the scipy workflow. I've appended the key piece of discussion > below.
> I'd summarize my position with the following points: - SciPy components should have one or more maintainers - "Maintainer" means anyone who has an interest in that particular component - Maintainers should be notified by the bugtracker when problems arise My hope is that by resolving more problems at lower levels we can partially relieve the burden on release managers and the like. I see the introduction of a new VCS/bugtracker as mainly for the benefit of these people, whose responsibilities require more scalability. So, I'm definitely not opposed to introducing these changes and experimenting with alternatives a bit. However, I think we need a distributed-responsibility system *as much if not more* than a DVCS or a new bug tracker. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From peridot.faceted at gmail.com Tue Feb 24 19:20:16 2009 From: peridot.faceted at gmail.com (Anne Archibald) Date: Tue, 24 Feb 2009 19:20:16 -0500 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <3d375d730902241528r72a47e86v5bf23b9b81df9896@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <3d375d730902241528r72a47e86v5bf23b9b81df9896@mail.gmail.com> Message-ID: 2009/2/24 Robert Kern : > On Tue, Feb 24, 2009 at 15:13, Charles R Harris > wrote: > >> I think at this point we would be better off trying to recruit at least one >> person to "own" each package. For new packages that is usually the person >> who committed it but we also need ownership of older packages. Someone with >> a personal stake in a package is likely to do more for quality assurance at >> this point than any amount of required review. > > "Ownership" has a bad failure mode. Case in point: nominally, I am the > "owner" of scipy.stats and numpy.random and completely failed to move > Josef's patches along. 
It seems to me that scipy's development model is a classic open-source "scratch an itch": it bothered me that people were forever asking questions that needed spatial data structures, so I took a weekend and wrote some. I don't foresee this changing without some major change (e.g. a company suddenly hiring ten people to work full-time on scipy). So the question is how to make this model produce reliable code. Suggestions people have made to accomplish this: (1) Don't allow anything into SVN without tests and documentation. (2) Make sure everything gets reviewed before it goes in. (3) Appoint owners for parts of scipy. Of these, I strongly approve of (1). It's really not a barrier. Writing tests is easy. Every programmer does *some* testing (well maybe not Knuth, but everybody else) to make sure the code does what it's supposed to. Writing these tests in nose-compatible form really isn't hard. Documentation is more of an obstacle, just because it's extra work. But I think it's not too much to ask. (2) I'm not so sure of. For an example, a few days ago I fixed a couple of spatial bugs. In both cases, the bug fix was a one-line change to scipy proper, plus a unit test that would have caught the bug but now passes. What would be gained by waiting until somebody else got around to looking at those fixes before committing them? I am tempted to suggest a weaker standard: optional code review. If you want to submit a piece of code to scipy and don't have SVN access, or do but want someone else to take a look at it (as, e.g., I did for scipy.spatial as a whole), post it; people can review it and when it's been adequately reviewed it goes in. Of course, here we return to infrastructure: as far as I know we don't have any reasonable tool for doing these reviews, or for connecting them to bug reports. (3) I am highly dubious of. Certainly we'll have informal owners - I fixed the bugs in spatial in part because I wrote the code and was embarrassed to see it broken. 
I know the spatial code pretty well, so I will probably have an easier time assessing patches to it. But I am often busy - if those spatial bugs had been reported a month earlier I would not have been able to get to them any sooner. Making it my fault if patches don't get in to scipy.spatial - which is, really, what we're talking about - is a recipe for driving people like me away from developing scipy. Don't do it. Anne From wnbell at gmail.com Tue Feb 24 19:33:06 2009 From: wnbell at gmail.com (Nathan Bell) Date: Tue, 24 Feb 2009 19:33:06 -0500 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <3d375d730902241528r72a47e86v5bf23b9b81df9896@mail.gmail.com> Message-ID: On Tue, Feb 24, 2009 at 7:20 PM, Anne Archibald wrote: > > (3) I am highly dubious of. Certainly we'll have informal owners - I > fixed the bugs in spatial in part because I wrote the code and was > embarrassed to see it broken. I know the spatial code pretty well, so > I will probably have an easier time assessing patches to it. IMO that's all "owner" implies. I don't think anyone can seriously expect more of a largely volunteer effort. The word "stakeholder" seems popular nowadays, should we use that instead? The aversion to "owner" must be a cultural thing :) -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From pav at iki.fi Tue Feb 24 19:37:54 2009 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 25 Feb 2009 00:37:54 +0000 (UTC) Subject: [SciPy-dev] Server spam problems spam spam: spam References: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> <20090222212703.GR6701@phare.normalesup.org> <49A1C644.5030205@gmail.com> <3d375d730902231758s760e4ea9q528e61847be1a831@mail.gmail.com> Message-ID: Mon, 23 Feb 2009 19:58:27 -0600, Robert Kern wrote: [clip] >> Another thing is that there are apparently ca. 
11600 pages in the >> Scipy.org wiki. I'd make a wild guess that at most ~500 of these are >> valid content; the rest is spam. I'm not sure if getting rid of the >> spam pages improves Moin's performance. > > Probably. Are you volunteering? Peter can give you a shell account. If > you are willing to take on the other upgrades Michael recommended, to > add the Captcha, for instance, that would go well, too. I can lend a hand here, if needed. But I see Peter already managed to tackle a lot of spam pages (thanks!). The wiki does feel more responsive now. >> Do we have any valid pages with CJK characters? Much of the spam seems >> Chinese, so mass-deleting at least this portion of it shouldn't be >> impossible to do, given Moin's database format. > > The Chinese localized Moin help pages are valid, but that should be it. Those are in the underlay/ (i.e. they are stock pages that don't have revision history yet), so this would mean that there should be no pages in Chinese under data/. -- Pauli Virtanen From robert.kern at gmail.com Tue Feb 24 19:45:43 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 24 Feb 2009 18:45:43 -0600 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> Message-ID: <3d375d730902241645n3315f55bha76faed7e9dd7e50@mail.gmail.com> On Tue, Feb 24, 2009 at 12:59, Matthew Brett wrote: > Hi, > > I've split this off into a new thread because I felt there were two > issues in Stefan's original thread. > > This is in the hope that we can stimulate discussion on the workflow > (as opposed to - say - which version control system to use, or which > bugtracker). > > I would be very interested to see if we can come to a consensus on the > important discussion of whether to introduce fairly formal code review > into the scipy workflow. I've appended the key piece of discussion > below.
My feeling about workflow is similar to my feeling about tools: the experimentalist in me is back in the corner with his hand raised high in the air. (He's very enthusiastic. "Ooh! Pickmepickmepickme!" There's one in every class. You know the kind.) There are a large number of unsupported assertions, gut feelings, and common sense flying about, and I don't want to get any of it on me. These are fairly poor guides for predicting the effects of project policy decisions, especially common sense. So let me make a meta-proposal: Let's do a series of one-month trials. We'll pick a workflow to try for a month. Much of the opposition to the various suggestions is coming from people who haven't tried to work in the proposed environment (at least not with this group of developers and this project; I conjecture without proof that variation between groups and projects is a large factor in the differing success of policies). The policy would be strictly enforced (to the extent that it involves enforcement) for that month. Detractors from the policy will grin and bear it for the duration of the month. Because it's just a month. We'll try their idea next month. This is far from scientific; the end result won't be measurable, per se. But it will give us experience with each of the suggestions. Perhaps what we fear about a policy won't be nearly as onerous as we think, or even has the reverse effect. Maybe we'll generate better ideas as we play around. Maybe some ideas truly suck. At the end, we may not generate a true consensus, but I suspect that we'll all be happier with *one* of the solutions than we are going to be if we just talk about it. And our happiness is really the thing to optimize, here, objective reality be damned. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco From peridot.faceted at gmail.com Tue Feb 24 19:49:08 2009 From: peridot.faceted at gmail.com (Anne Archibald) Date: Tue, 24 Feb 2009 19:49:08 -0500 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> Message-ID: Hi, Here's a concrete workflow problem: is there a way for me to test my changes on Mac and Windows before committing? I don't have access to either kind of machine, so I just write code I hope is portable, and David Cournapeau ends up having to suffer for it. Anne From bjracine at glosten.com Tue Feb 24 19:35:09 2009 From: bjracine at glosten.com (Benjamin J. Racine) Date: Tue, 24 Feb 2009 16:35:09 -0800 Subject: [SciPy-dev] [SciPy-user] Scientific packages for a distributed computing Amazon EC2 image? In-Reply-To: Message-ID: <8C2B20C4348091499673D86BF10AB6763B05797AE6@clipper.glosten.local> I put on there, but perhaps might have missed... cython mpi4py ETS (Enthought Tool Suite) Ben R. ________________________________ From: scipy-user-bounces at scipy.org [mailto:scipy-user-bounces at scipy.org] On Behalf Of Peter Skomoroch Sent: Monday, February 23, 2009 11:52 AM To: SciPy Developers List; SciPy Users List Subject: [SciPy-user] Scientific packages for a distributed computing Amazon EC2 image? I'm collecting a wishlist of scientific and python related packages (numpy, scipy, etc) people would want installed on a Debian based Amazon EC2 machine image (AMI) for distributed computing. I'll make more information available as the machine image develops, some of these will also go into the Machetec2 AMI. Several variants of the AMI should become available in the next month. Please feel free to add any packages you would want pre-installed on the following wiki page: http://scipy.org/SciPyAmazonAmi Let me know if you spot any potential license conflicts with listed software. -- Peter N. 
Skomoroch 617.285.8348 http://www.datawrangling.com http://delicious.com/pskomoroch http://twitter.com/peteskomoroch -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Feb 24 19:52:16 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 24 Feb 2009 17:52:16 -0700 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <3d375d730902241528r72a47e86v5bf23b9b81df9896@mail.gmail.com> Message-ID: On Tue, Feb 24, 2009 at 5:20 PM, Anne Archibald wrote: > 2009/2/24 Robert Kern : > > On Tue, Feb 24, 2009 at 15:13, Charles R Harris > > wrote: > > > >> I think at this point we would be better off trying to recruit at least > one > >> person to "own" each package. For new packages that is usually the > person > >> who committed it but we also need ownership of older packages. Someone > with > >> a personal stake in a package is likely to do more for quality assurance > at > >> this point than any amount of required review. > > > > "Ownership" has a bad failure mode. Case in point: nominally, I am the > > "owner" of scipy.stats and numpy.random and completely failed to move > > Josef's patches along. > > It seems to me that scipy's development model is a classic open-source > "scratch an itch": it bothered me that people were forever asking > questions that needed spatial data structures, so I took a weekend and > wrote some. I don't foresee this changing without some major change > (e.g. a company suddenly hiring ten people to work full-time on > scipy). So the question is how to make this model produce reliable > code. > > Suggestions people have made to accomplish this: > > (1) Don't allow anything into SVN without tests and documentation. > (2) Make sure everything gets reviewed before it goes in. > (3) Appoint owners for parts of scipy. > > Of these, I strongly approve of (1). It's really not a barrier. 
> Writing tests is easy. Every programmer does *some* testing (well > maybe not Knuth, but everybody else) to make sure the code does what > it's supposed to. Writing these tests in nose-compatible form really > isn't hard. Documentation is more of an obstacle, just because it's > extra work. But I think it's not too much to ask. > > (2) I'm not so sure of. For an example, a few days ago I fixed a > couple of spatial bugs. In both cases, the bug fix was a one-line > change to scipy proper, plus a unit test that would have caught the > bug but now passes. What would be gained by waiting until somebody > else got around to looking at those fixes before committing them? > > I am tempted to suggest a weaker standard: optional code review. If > you want to submit a piece of code to scipy and don't have SVN access, > or do but want someone else to take a look at it (as, e.g., I did for > scipy.spatial as a whole), post it; people can review it and when it's > been adequately reviewed it goes in. Of course, here we return to > infrastructure: as far as I know we don't have any reasonable tool for > doing these reviews, or for connecting them to bug reports. > > (3) I am highly dubious of. Certainly we'll have informal owners - I > fixed the bugs in spatial in part because I wrote the code and was > embarrassed to see it broken. I know the spatial code pretty well, so > I will probably have an easier time assessing patches to it. But I am > often busy - if those spatial bugs had been reported a month earlier I > would not have been able to get to them any sooner. Making it my fault > if patches don't get in to scipy.spatial - which is, really, what > we're talking about - is a recipe for driving people like me away from > developing scipy. Don't do it. > > I don't think that's what we are "really talking about", rather, I think we need folks who feel an informal ownership about parts of scipy. I simply pointed out where I felt responsible as an example. 
Your sense of "owning" scipy.spatial is another example. And I think the best way to get folks attached to orphaned bits of code that have languished untouched all these years is to let them make actual changes without jumping through umpteen legal hoops. I also think we need more developers, and the place to find them is among folks who have contributed patches. We should actively offer commit privileges to such folks. The main advantage of a DVCS in such a situation is that commit privilege becomes less important and additions can be reviewed offline and brought in easily when ready. But until we have such a system I think more folks need the ability to touch SVN. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Feb 24 19:53:15 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 24 Feb 2009 18:53:15 -0600 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> Message-ID: <3d375d730902241653m43e5ef79y1140eab6376d00c9@mail.gmail.com> On Tue, Feb 24, 2009 at 18:49, Anne Archibald wrote: > Hi, > > Here's a concrete workflow problem: is there a way for me to test my > changes on Mac and Windows before committing? I don't have access to > either kind of machine, so I just write code I hope is portable, and > David Cournapeau ends up having to suffer for it. You can make a branch, commit there, then force a build on the appropriate buildbot: http://buildbot.scipy.org/builders -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jason-sage at creativetrax.com Tue Feb 24 20:24:38 2009 From: jason-sage at creativetrax.com (jason-sage at creativetrax.com) Date: Tue, 24 Feb 2009 19:24:38 -0600 Subject: [SciPy-dev] Scipy workflow (and not tools). 
In-Reply-To: <49A48B6A.50101@astraw.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <9457e7c80902241550r153378a3o90a6f795c2fe9baa@mail.gmail.com> <49A48B6A.50101@astraw.com> Message-ID: <49A49DD6.9070708@creativetrax.com> Andrew Straw wrote: > Stéfan van der Walt wrote: > >> Rob, >> >> 2009/2/25 Rob Clewley : >> >>> I don't want to have the responsibility of "owning" anything about the >>> existing code for ODE solving (and maybe some other numerical >>> methods), even though I have some stake in it. But I'll happily share >>> in some of the reviewing and possibly testing of changes and >>> improvements to that code. >>> >> I think you raise an important point. >> >> As an Open Source project, we could succeed without much (any) >> formal hierarchy. Naturally, the system evolves into a kind of >> meritocracy, where capability and dedication counts heavily (as it >> should). >> >> Informal ownership (I care for this code, therefore I take note of its >> progress) is helpful in moderate doses. For example, I like the fact >> that Nathan reviews all patches related to sparse matrices; he knows >> that part of SciPy extremely well, and his advice is valuable. Of >> course, other reviews of such a patch would be just as welcome. >> >> The main argument I tried to put forth earlier was: >> >> Contributing to a project is easy when you know what is expected of >> you (clear guidelines), and when you know that you'll be treated on >> merit alone (the same as everybody else). Merit carries no malice, >> and is about as impartial as it gets. >> >> > > I also want to point out that a formal code review process that is open > (such as a web gui) encourages participation by people who may not feel > they have the time or abilities to write new code, but would feel > comfortable commenting on a patch sitting in front of them. I think > new developers could be fostered this way, too.
Finally, while the > prospect of having code go up for review won't encourage Travo to submit > new code, it might have that effect on someone with less experience who > is afraid that his/her new feature won't compile on Windows (for > example). A formal code review process allows that person to put > something online and say "don't apply as-is -- I'm looking for help > integrating with Windows" or simply "I got this far, but I wonder how to > do XXX". I realize these things are possible, to a degree, with Trac, > but I agree that a better workflow process would be very useful. > This guarantee of a review made a huge difference in me being comfortable starting to contribute to Sage. I didn't have to get everything perfect; I knew someone would review my changes and offer suggestions. Another thing that I learned from the Sage project was that if you require it ("build it" :), they will come. It may take time, but requiring review of patches seems to pull people out of the woodwork to try reviewing the patches; people that would be too hesitant to actually write code for Sage, at least yet. It's a great, nonthreatening way to learn the system (there is often more than one review of a patch, especially if one of the reviewers is new to Sage). Just a few more reasons for a formal policy of reviews. Jason -- Jason Grout From josef.pktd at gmail.com Tue Feb 24 20:59:08 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 24 Feb 2009 20:59:08 -0500 Subject: [SciPy-dev] Scipy workflow (and not tools). 
In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <3d375d730902241528r72a47e86v5bf23b9b81df9896@mail.gmail.com> Message-ID: <1cd32cbb0902241759x6a09b9e0n53de142f28f5548c@mail.gmail.com> On Tue, Feb 24, 2009 at 7:20 PM, Anne Archibald wrote: > 2009/2/24 Robert Kern : >> On Tue, Feb 24, 2009 at 15:13, Charles R Harris >> wrote: >> >>> I think at this point we would be better off trying to recruit at least one >>> person to "own" each package. For new packages that is usually the person >>> who committed it but we also need ownership of older packages. Someone with >>> a personal stake in a package is likely to do more for quality assurance at >>> this point than any amount of required review. >> >> "Ownership" has a bad failure mode. Case in point: nominally, I am the >> "owner" of scipy.stats and numpy.random and completely failed to move >> Josef's patches along. > > It seems to me that scipy's development model is a classic open-source > "scratch an itch": it bothered me that people were forever asking > questions that needed spatial data structures, so I took a weekend and > wrote some. I don't foresee this changing without some major change > (e.g. a company suddenly hiring ten people to work full-time on > scipy). So the question is how to make this model produce reliable > code. > > Suggestions people have made to accomplish this: > > (1) Don't allow anything into SVN without tests and documentation. > (2) Make sure everything gets reviewed before it goes in. > (3) Appoint owners for parts of scipy. I think that having someone who feels responsible for the different parts of scipy is the main problem. And whatever we do to make this easier and that expands the number of active participants will be an improvement. I don't feel like the "owner" of stats, but it's more a case of adoption. I like the centralized trac timeline since it is easy to monitor new tickets and changes to svn. 
And I'm doing code review ex-post (after commits) to minimize new problems. This is also an incentive to increase test coverage, so that the tests complain immediately if something breaks. (My main problem with trac was monitoring old tickets, which I haven't figured out how to do efficiently.) I think for packages that have a responsible and responsive "maintainer" my experience with the mailing list was pretty good. On the other hand, looking at the mailing list history, I saw many comments and threads about the problems in stats, and while some problems got fixed, many reports of problems were never followed by any action, which is also pretty frustrating for the user. A new workflow and code review might help, but if there is nobody who adopts the orphaned subpackages, it will be just another place to store comments. I'm a huge fan of full test coverage, but writing full verified tests is for me a lot of work and I still have a backlog of bugfixes because I haven't had time to write sufficient tests. Also, I think that the commitment to maintain and increase test coverage should be sufficient for some cases. For example, in stats.mstats Pierre rewrote and added statistics functions for masked arrays; the test coverage is good, and although there are still quite a few functions not covered and still some rough edges, overall it looks in better condition than scipy.stats did. In this case I find it more useful to have the full set of functions that Pierre wrote available immediately than to add them piecemeal as he finds time to write tests. The documentation editor is a good example where easier access by new contributors increased the number of participants, and maybe collective writing and review of code and tests can lower the entry barrier. But for now, I think, I still need to be able to get some bug fixes into stats without a large bureaucracy, or with an expiration date on any code review.
Josef From pwang at enthought.com Wed Feb 25 00:32:43 2009 From: pwang at enthought.com (Peter Wang) Date: Tue, 24 Feb 2009 23:32:43 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231740t2204860cnb71e961eb3db3dcd@mail.gmail.com> Message-ID: On Feb 23, 2009, at 8:42 PM, Pauli Virtanen wrote: > Mon, 23 Feb 2009 19:40:27 -0600, Robert Kern wrote: > [clip] >> Pauli, you seem familiar with setting up a git-to-svn bridge. Can >> you do >> this? > > Sure. I'll need a box on which to deploy the update script, though. > Would one (which?) of the virtual hosts of conference.scipy.org do? We should refer to the machine as new.scipy.org; it's slightly more apropos than "conference". What is the desired URL of this bridge script? > I'd guess what's needed of the web server would be only to enable > CGI for > a single script. When poked, it would then fetch new stuff from SVN > and > either > > - Push to github or some such service > - Push to a HTTP location on the machine, served statically > - Push to a HTTP location on the machine, served by gitweb (cgi) > > The first option is probably the easiest, if account/password issues > can > be sorted out. The second option is probably enough for practical > purposes. ... > Anyway, I'll think about the details tomorrow. How does the above interact with, and what are the ramifications for: - user accounts and permissions - subdomains (git-to-svn for numpy and various scipy subdomains like mpi4py, etc.) 
- logging/monitoring (so we can detect if the CGI goes wrong/zombie/berserk) -Peter From hoytak at cs.ubc.ca Wed Feb 25 02:10:09 2009 From: hoytak at cs.ubc.ca (Hoyt Koepke) Date: Tue, 24 Feb 2009 23:10:09 -0800 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231740t2204860cnb71e961eb3db3dcd@mail.gmail.com> Message-ID: <4db580fd0902242310s4d3bead5pb217fbf6cf1c6d11@mail.gmail.com> Hi, Only two small things to add to this (interesting) discussion on git stuff: 1. tortoisegit is in active development (http://code.google.com/p/tortoisegit/). Haven't tried it, but looks good already. 2. I've found that, for a new user, having a git cheat sheet (e.g. http://zrusin.blogspot.com/2007/09/git-cheat-sheet.html, the one I prefer) gives about the same power as using svn. For basic users, the features of svn are pretty much just a subset of those of git, and such a cheat sheet greatly helps the learning curve. Just a few thoughts. (I use git for all my stuff, and love it).
--Hoyt ++++++++++++++++++++++++++++++++++++++++++++++++ + Hoyt Koepke + University of Washington Department of Statistics + http://www.stat.washington.edu/~hoytak/ + hoytak at gmail.com ++++++++++++++++++++++++++++++++++++++++++ From fperez.net at gmail.com Wed Feb 25 02:20:59 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 24 Feb 2009 23:20:59 -0800 Subject: [SciPy-dev] cleaning out wiki spam In-Reply-To: References: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> <20090222212703.GR6701@phare.normalesup.org> <49A1C644.5030205@gmail.com> <49A3A265.9060200@gmail.com> Message-ID: Hi Peter, On Tue, Feb 24, 2009 at 7:47 AM, Peter Wang wrote: > Hi everyone, > > I have gone through with a blunt grep hammer and moved ~9300 pages off > of the main scipy wiki. This seems to have helped Moin's performance > somewhat. Inspired by this, I just went and nuked ~1900 out of the ipython one, leaving only the 128 that are probably for real. I hope this helps also reduce the load a bit more. Thanks again for all your work on getting the system to be more responsive! Cheers, f From stefan at sun.ac.za Wed Feb 25 02:51:02 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Wed, 25 Feb 2009 09:51:02 +0200 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <1cd32cbb0902241759x6a09b9e0n53de142f28f5548c@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <3d375d730902241528r72a47e86v5bf23b9b81df9896@mail.gmail.com> <1cd32cbb0902241759x6a09b9e0n53de142f28f5548c@mail.gmail.com> Message-ID: <9457e7c80902242351u3e26162fr7ddcb8b7cfd3bf1@mail.gmail.com> 2009/2/25 : > A new workflow and code review might help, but if there is nobody, that > adopts the orphaned subpackages, it will be just another place to store > comments. [...]
> But for now, I think, I still need to be able to get some bug fixes > into stats without a large bureaucracy, or with an > expiration date on any code review. [...] I don't see code review as a very formal process. Review by a domain expert would be ideal, but if such a person is not available a review by any other programmer would do. Here is the first patch I submitted for review: http://codereview.appspot.com/1105 It touches only 5 lines of code in numpy/lib/function_base.py, and functionally the patch was fine. Note how the patch progressed with the advice of Ondrej and Robert: spacing according to PEP8, and a better way of referring to the list-type. Once Robert pointed that out, I also fixed another occurrence in the file. If you see this measure of improvement on a 5 line patch, imagine the positive impact it can have on a more complicated piece of code. Regards Stéfan From stefan at sun.ac.za Wed Feb 25 02:58:38 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Wed, 25 Feb 2009 09:58:38 +0200 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <3d375d730902241645n3315f55bha76faed7e9dd7e50@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <3d375d730902241645n3315f55bha76faed7e9dd7e50@mail.gmail.com> Message-ID: <9457e7c80902242358s28dfe14cue96c2441d184c768@mail.gmail.com> 2009/2/25 Robert Kern : > There are a large number of unsupported assertions, gut feelings, and > common sense flying about, and I don't want to get any of it on me. > These are fairly poor guides for predicting the effects of project > policy decisions, especially common sense. [...] > So let me make a meta-proposal: Let's do a series of one-month trials. I'd be glad to try out different work-flows, and to pick the one that suits our project best. Would you like to propose a schedule? Just give me and David a week or so to sort out the issue tracker first!
Regards Stéfan From pwang at enthought.com Wed Feb 25 07:20:46 2009 From: pwang at enthought.com (Peter Wang) Date: Wed, 25 Feb 2009 06:20:46 -0600 Subject: [SciPy-dev] cleaning out wiki spam In-Reply-To: References: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> <20090222212703.GR6701@phare.normalesup.org> <49A1C644.5030205@gmail.com> <49A3A265.9060200@gmail.com> Message-ID: <09ECD184-2822-4A91-9F58-1BB4122FD590@enthought.com> On Feb 25, 2009, at 1:20 AM, Fernando Perez wrote: > Inspired by this, I just went and nuked ~1900 out of the ipython one, > leaving only the 128 that are probably for real. I hope this helps > also reduce the load a bit more. Great, thank you! One thing that occurs to me is that once you have a fairly high ratio of ham to spam, it might be worth saving the directory listing into a base "goodpages.txt" that can then be used as a whitelist filter in the future when blowing away spam via regexes. (Hopefully we won't have to do that on this scale again, but if history is any indicator, spammers always find a way...) -Peter From oliphant at enthought.com Wed Feb 25 11:49:10 2009 From: oliphant at enthought.com (Travis E. Oliphant) Date: Wed, 25 Feb 2009 10:49:10 -0600 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> Message-ID: <49A57686.70707@enthought.com> Charles R Harris wrote: > I don't think there are enough eyes at this point for a strict review > policy. How many of the current packages have any maintainer? Who was > maintaining the stats package before Josef got involved? How many > folks besides Robert could look over the changes usefully? How many > folks looked over Travis' recent addition to optimize? Who is working > on the interpolation package? > > I think at this point we would be better off trying to recruit at > least one person to "own" each package.
For new packages that is > usually the person who committed it but we also need ownership of > older packages. Someone with a personal stake in a package is likely > to do more for quality assurance at this point than any amount of > required review. Yes, my feelings exactly. Quality goes up when people who have a personal stake or attachment to the code are engaged. How do we get more of this to happen? Formal review processes can actually have at least some negative impact in getting people engaged. Let's make a tweak here and a tweak there. Right now, I'm of the opinion that whatever makes the *workflow* of people like David, Pauli, Jarrod, Robert K, Robert C, Nathan, Matthew, Charles, Anne, Andrew, Gael, and Stefan (and other big contributors I may have missed) easier, I'm totally in favor of. If that is a DVCS and/or something different than Trac, then let's do that. It sounds like we are making steps in that direction which is excellent. > > I don't have a problem with folks complaining about missing tests, > etc., but I worry that if we put too many review steps into the > submission path there won't be enough people to make it work. This is exactly the way I feel.... I don't want to imply at all that we shouldn't be bugging each other about documentation and testing. I personally welcome any reminders in that direction. I am just worried about whether or not we are really solving the real problems that make it hard to contribute by instituting policy rather than providing examples of code to model. I do see a real need to fix the SVN-Trac workflow bottleneck as well as anything that helps the release process. It's actually at the release process where I would institute any formal review process. I'm also in favor of having a regular (i.e. every 3-6 months) release process. The difficulty there again is man-power. -Travis From oliphant at enthought.com Wed Feb 25 11:54:05 2009 From: oliphant at enthought.com (Travis E.
Oliphant) Date: Wed, 25 Feb 2009 10:54:05 -0600 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <1cd32cbb0902241759x6a09b9e0n53de142f28f5548c@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <3d375d730902241528r72a47e86v5bf23b9b81df9896@mail.gmail.com> <1cd32cbb0902241759x6a09b9e0n53de142f28f5548c@mail.gmail.com> Message-ID: <49A577AD.1020106@enthought.com> josef.pktd at gmail.com wrote: > > I think that having someone who feels responsible for the different parts > of scipy is the main problem. And whatever we do to make this > easier and that expands the number of active participants will be an > improvement. > It's a man-power thing again in my mind. I would love to spend more time on SciPy, but have not found the time. > I'm a huge fan of full test coverage, but writing full verified tests is for me > a lot of work and I still have a backlog of bugfixes because I haven't > had time to write sufficient tests. > > Also, I think that the commitment to maintain and increase test > coverage should be sufficient for some cases. > For example, in stats.mstats Pierre rewrote and added statistics > functions for masked arrays, the test coverage is good, but there > are still quite a few functions not covered and still some rough edges, > but overall it looks in better condition than scipy.stats did. In this case > I find it more useful to have the full set of functions that Pierre wrote > available immediately than adding them piecemeal as he finds time > to write tests. > Yes. This is exactly the way I feel. I'd rather have functionality written by someone who cared about it perhaps without full test coverage than no functionality because someone can't find time to write tests. > But for now, I think, I still need to be able to get some bug fixes > into stats without a large bureaucracy, or with an > expiration date on any code review. > +1 BTW: Thanks for all your hard work on stats and optimize, Josef.
-Travis From cournape at gmail.com Wed Feb 25 12:02:43 2009 From: cournape at gmail.com (David Cournapeau) Date: Thu, 26 Feb 2009 02:02:43 +0900 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <49A57686.70707@enthought.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <49A57686.70707@enthought.com> Message-ID: <5b8d13220902250902v61c0e038odb513265e21264af@mail.gmail.com> On Thu, Feb 26, 2009 at 1:49 AM, Travis E. Oliphant wrote: > > I do see a real need to fix the SVN-Trac workflow bottleneck as well as > anything that helps the release process. It's actually at the release > process where I would institute any formal review process. I'm also in > favor of having a regular (i.e. every 3-6 months) release process. The > difficulty there again is man-power. One thing which may help here is to rotate the release manager: a different person every time. This person would have the last word on what goes in and what does not, with almost strictly enforced deadlines. In particular, we should really enforce code freeze - although I can understand the point that reviews may make things harder, I don't think it is possible at all to make a good release without enforcing very strict timelines. There has to be no new code for some time before the release, time which is more than just one day or two. C/Fortran code would be the first to be frozen, then python, then docstrings. The exact time can be tweaked after experiments, of course. But if we get this right, I believe that having freeze periods can actually make the time from patch to inclusion shorter. Having a different person means it is not always the same person, obviously, and it may also keep people "honest", in the sense that a release manager will also be a coder later under a different release manager. cheers, David From oliphant at enthought.com Wed Feb 25 12:41:52 2009 From: oliphant at enthought.com (Travis E.
Oliphant) Date: Wed, 25 Feb 2009 11:41:52 -0600 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <9457e7c80902241502p5be85f0atb730788ba7ab66b6@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <9457e7c80902241502p5be85f0atb730788ba7ab66b6@mail.gmail.com> Message-ID: <49A582E0.9000001@enthought.com> Stéfan van der Walt wrote: > Cha > Thank you for providing some prime examples on why code review and > testing is needed. > > I didn't know about the changes to the optimisation module, but now I > have to ask these questions: > > 1. Is it quality code, suitable for SciPy? (It is, I read the code, > or in other words *reviewed* it) > > 2. Does it work? I don't know. Nobody does. There aren't any tests, > no guarantees. > Which code has no tests? Are you speaking rhetorically or about a specific check-in? > That would be my reaction to one change, but there were 6 or more, > none with any tests (and at least one contains a spelling mistake!). > How do we know that they work under Windows, Solaris, Linux and OSX? > > Worse; what if I decide to make some updates to that code but, not > having understood the author's intention perfectly, break it horribly. > Who would be any the wiser? > > Tests protect the user and the developer alike. It is irresponsible > to carry on the way we do. > No it's not. It's just a different, more organic, development model than the one you are championing. It's one where people do their best to create quality code and help fix the pieces that they care about, no matter who the original author was. Getting quality code is not as easy as establishing a formal review process. We've had a review process for years. *Everybody* is welcome to review the code that is checked in and complain / argue about better ways to do it. If you feel that strongly and have commit access, you can even change the code. >> Are we adding a lot of broken code? >> > > Yes, we are.
There are *degrees* of broken-ness. It's impossible to prove that code is not "broken" in some way depending on your definition of broken. So, it's a bit misleading to complain about broken-ness. We are also adding a lot of very useful code. I agree code gets better with more eye-balls, generally, but developers are also not interchangeable. I'm not at all convinced that a formal review process will actually make things better, when what we need is more time spent. > More folks with commit priviledges would just perpetuate this chaos. > Our community is sophisticated enough not to apply a brute-force > solution to the problem. > I'm not sure it actually would. I can see, however, that we do need a DVCS system to make the work-flow easier because branching is the key to having real development take place and allow the developer to utilize the benefits of version control while not impacting the "released-line" until they are ready to commit all changes (i.e. branching and merging needs to be as seamless as possible). > It should not be anyone's job to clean up after his/her peers. If > each patch is accompanied by appropriate tests, this situation would > never occur. > I think this is a fundamentally wrong mindset. Yes, we should all test before we commit code, and I'm not opposed to reminding people that the tests they already use to make sure things work can be easily placed as unit-tests. But, my best effort in an area which actually solves the problem, or scratches-the-itch I had, may look like "a mess" to someone else. And I don't think we can find a one-size fits-all solution to that fundamental difference of perspective. Who decides what "clean-up" constitutes? My answer is "those most interested" which is a dynamic and possibly varying group. 
If there are concerns about the commits people are making, then let's talk about specifics and address those rather than masking that behind a "review process" -Travis From oliphant at enthought.com Wed Feb 25 12:55:52 2009 From: oliphant at enthought.com (Travis E. Oliphant) Date: Wed, 25 Feb 2009 11:55:52 -0600 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <49A48B6A.50101@astraw.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <9457e7c80902241550r153378a3o90a6f795c2fe9baa@mail.gmail.com> <49A48B6A.50101@astraw.com> Message-ID: <49A58628.8020607@enthought.com> Andrew Straw wrote: > > I also want to point out that a formal code review process that is open > (such as a web gui) encourages participation by people who may not feel > they have the time or abilities to write new code, but would feel > comfortable commenting on a patch sitting in front of them. I think > new developers could be fostered this way, too. Great idea! Let's have review/mentoring processes to assist new-comers. I'm all for that. I would like to move those people who are timid at first to the point of being willing to dive in and get their hands dirty. I suspect my view is a bit organic for some, but I've encouraged people for a long time to commit code with as much documentation and testing as can be provided. Then, let the process of further documenting and using that code "harden" it rather than a "review" process. If we feel that there have been too many "buggy-commits" then what are examples of that? I think the switch to NumPy and the integration of ndimage did bring in some "less-reviewed" code with API changes that were possibly too hurried. But, that was a time-problem again. Is imposing an extra burden on the developer really going to solve that problem more than just a willingness to allow your own code to be critiqued and being willing to speak up when you see specifics you disagree with?
I don't see this discussion as review or not review. Open source *will* be reviewed. It's just "when." On the question of whether or not you make the code available until you can guarantee someone else has looked at it, I come down on the side of "make it available" so that other people will look at it when it becomes interesting to them. Tools that let us monitor the results of commits (buildbots, dashboards, automatic emails, etc...) are much more valuable in my mind than (difficult-to-quantify and establish) processes that try to prevent any problems. "More tools, please" is fundamentally what I say to the question of formal review... -Travis From peridot.faceted at gmail.com Wed Feb 25 13:04:57 2009 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 25 Feb 2009 13:04:57 -0500 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <49A58628.8020607@enthought.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <9457e7c80902241550r153378a3o90a6f795c2fe9baa@mail.gmail.com> <49A48B6A.50101@astraw.com> <49A58628.8020607@enthought.com> Message-ID: 2009/2/25 Travis E. Oliphant : > I don't see this discussion as review or not review. Open source > *will* be reviewed. It's just "when." On the question of whether or > not you make the code available until you can guarantee someone else has > looked at it, I come down on the side of "make it available" so that > other people will look at it when it becomes interesting to them. Unfortunately, some of that "when" is "when the release manager is trying to get the release candidate to compile on all architectures", which makes the release process quite onerous. Anne From oliphant at enthought.com Wed Feb 25 13:07:01 2009 From: oliphant at enthought.com (Travis E. Oliphant) Date: Wed, 25 Feb 2009 12:07:01 -0600 Subject: [SciPy-dev] Scipy workflow (and not tools).
In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <3d375d730902241528r72a47e86v5bf23b9b81df9896@mail.gmail.com> Message-ID: <49A588C5.30500@enthought.com> Anne Archibald wrote: > wrote some. I don't foresee this changing without some major change > (e.g. a company suddenly hiring ten people to work full-time on > scipy). So the question is how to make this model produce reliable > code. > > Suggestions people have made to accomplish this: > > (1) Don't allow anything into SVN without tests and documentation. > (2) Make sure everything gets reviewed before it goes in. > (3) Appoint owners for parts of scipy. > > Of these, I strongly approve of (1). It's really not a barrier. > As long as we don't do #2, then having the rule of #1 is completely fine. Say it in a similar way to that: "Don't commit to trunk until there are tests and documentation." I would be opposed to attempts to modify the nouns with fuzzy words like "complete" or "full" or something impossible to quantify. Here's an attempt at the wording: "Don't commit new code to trunk until you are sure the code works by passing unit-tests and being documented by a doc-string that follows the pattern established" Bug-fixes should (usually) be accompanied by a unit test unless they are "bug-guard changes" (i.e. like the one-liner I recently made to NumPy to catch the error-condition which I've never actually seen and don't know how to test for). I definitely think review should be encouraged but absolutely optional pre-commit. We should then encourage each other to review post-commit. As far as "ownership" goes, how about we just have a posted list of "interested committers" that people can refer to in order to direct code-review requests and patch-submission. More than one person can be listed for each sub-project.
-Travis From cournape at gmail.com Wed Feb 25 13:16:32 2009 From: cournape at gmail.com (David Cournapeau) Date: Thu, 26 Feb 2009 03:16:32 +0900 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <49A58628.8020607@enthought.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <9457e7c80902241550r153378a3o90a6f795c2fe9baa@mail.gmail.com> <49A48B6A.50101@astraw.com> <49A58628.8020607@enthought.com> Message-ID: <5b8d13220902251016l27142a55pd838bc2875186e23@mail.gmail.com> On Thu, Feb 26, 2009 at 2:55 AM, Travis E. Oliphant wrote: > Andrew Straw wrote: >> >> I also want to point out that a formal code review process that is open >> (such as a web gui) encourages participation by people who may not feel >> they have the time or abilities to write new code, but would feel >> comfortable commenting on a patch sitting in front of them. I think >> new developers could be fostered this way, too. > Great idea! Let's have review/mentoring processes to assist > new-comers. I'm all for that. > > I would like to move those people who are timid at first to the point of > being willing to dive in and get their hands dirty. I suspect my view > is a bit organic for some, but I've encouraged people for a long time > to commit code with as much documentation and testing as can be > provided. Then, let the process of further documenting and using that > code "harden" it rather than a "review" process. > > If we feel that there have been too many "buggy-commits" then what are > examples of that? I think the switch to NumPy and the integration of > ndimage did bring in some "less-reviewed" code with API changes that > were possibly too hurried. But, that was a time-problem again. Is > imposing an extra burden on the developer really going to solve that > problem more than just a willingness to allow your own code to be > critiqued and being willing to speak up when you see specifics you > disagree with?
> > I don't see this discussion as review or not review. Open source > *will* be reviewed. It's just "when." On the question of whether or > not you make the code available until you can guarantee someone else has > looked at it, I come down on the side of "make it available" so that > other people will look at it when it becomes interesting to them. > > Tools that let us monitor the results of commits (buildbots, dashboards, > automatic emails, etc...) are much more valuable in my mind than > (difficult-to-quantify and establish) processes that try to prevent any > problems. > > More tools please is fundamentally what I say to the question of formal > review... I agree. I know the topic is not about tools, but at least in my case, I don't do review because it is too much of a pain right now. It is not so much about tools as about the barrier to entry: I am honestly not really interested in doing code review if the process of becoming aware of the code to review, looking at it, and testing it makes up 75 % of the time. And that's exactly the situation right now. More concretely, if I could: - receive email specifically for the packages/topics I want to review, not others - get the code and test it, ideally without even having to click once in a website - send back what's wrong/what's not - mark a review as done That would make me much more willing to do review. I think discussing the review process in the dark, without actual experiments, is a bit useless at this point - like Robert, I think as long as we don't have any process on how to do things, we won't have much more than gut feeling. To see actual problems and limitations, we have to try things at some point. David From bsouthey at gmail.com Wed Feb 25 13:16:51 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 25 Feb 2009 12:16:51 -0600 Subject: [SciPy-dev] Scipy workflow (and not tools).
In-Reply-To: <49A57686.70707@enthought.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <49A57686.70707@enthought.com> Message-ID: <49A58B13.1040206@gmail.com> Travis E. Oliphant wrote: > Charles R Harris wrote: > >> I don't think there are enough eyes at this point for a strict review >> policy. How many of the current packages have any maintainer? Who was >> maintaining the stats package before Josef got involved? How many >> folks besides Robert could look over the changes usefully? How many >> folks looked over Travis' recent addition to optimize? Who is working >> on the interpolation package? >> >> I think at this point we would be better off trying to recruit at >> least one person to "own" each package. For new packages that is >> usually the person who committed it but we also need ownership of >> older packages. Someone with a personal stake in a package is likely >> to do more for quality assurance at this point than any amount of >> required review. >> > Yes, my feelings exactly. Quality goes up when people who have a > personal stake or attachment to the code are engaged. How do we get > more of this to happen? Formal review processes can actually have at > least some negative impact in getting people engaged. Let's make a > tweak here and a tweak there. Right now, I'm of the opinion that > whatever makes the *workflow* of people like David, Pauli, Jarrod, > Robert K, Robert C, Nathan, Matthew, Charles, Anne, Andrew, Gael, and > Stefan (and others big contributors I may have missed) easier, I'm > totally in favor of. If that is a DVCS and/or something different > than Trac, then let's do that. > > It sounds like we are making steps in that direction which is excellent. > Really, based on the discussion (including the latter comments), it appears to me that this discussion has moved towards what sort of developmental structure scipy should be using with a DVCS.
I viewed much of the discussion as following what happens with Linux kernel development since they adopted DVCS starting with Bitkeeper. Jonathan Corbet has an interesting article on this http://ldn.linuxfoundation.org/book/how-participate-linux-community . Essentially there are sub-maintainer trees that feed into the testing tree (-mm), the staging tree (where patches are applied against that should minimize tree divergence) and hopefully Linus's tree. During that process there is informal code review, at least for bug fixes, as new or major features still have a problem with code review. In some aspects, the man-power restriction with the Linux kernel development has been removed because code no longer has to flow through a single node. So this allows a user to easily get code not only from these trees but also other developers. So scipy could do something similar, where the use of a DVCS would hopefully reduce the burden on people like Robert and you. I do not see a real need at this time to say you 'own' that module and you must 'control' the development of it. I would suspect that 'ownership' of scipy components will naturally develop over time and, thus, should not be forced upon anyone. If sub-trees were created it would permit a sharing of incomplete code so that the burden of developing appropriate tests, writing documentation and testing can be distributed to interested parties. This would also foster mentoring and getting hands dirty in a positive way. > > >> I don't have a problem with folks complaining about missing tests, >> etc., but I worry that if we put too many review steps into the >> submission path there won't be enough people to make it work. >> > This is exactly the way I feel.... I don't want to imply at all that we > shouldn't be bugging each other about documentation and testing. I > personally welcome any reminders in that direction.
I am just worried > about whether or not we are really solving the real problems that make > it hard to contribute by instituting policy rather than providing > examples of code to model. > From this it appears that in order to get code into scipy you have to have all this documentation and tests. But really my concern with strict requirements for tests and documentation is that we will get minimal tests and inferior documentation or nothing at all. Rather, I hope that if the burden can be shifted from one person to a group of people, then perhaps we can get more extensive tests and documentation as well as people actually testing the code on different systems with hopefully realistic situations. So at least there is some fix in some tree for a problem and eventually the rest of it will follow by the time everything is ready for mainline inclusion. Regards Bruce From matthew.brett at gmail.com Wed Feb 25 13:58:56 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 25 Feb 2009 10:58:56 -0800 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <49A582E0.9000001@enthought.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <9457e7c80902241502p5be85f0atb730788ba7ab66b6@mail.gmail.com> <49A582E0.9000001@enthought.com> Message-ID: <1e2af89e0902251058i3ceefa6fx1cbd3578482f6bf2@mail.gmail.com> Hi, >> Tests protect the user and the developer alike. It is irresponsible >> to carry on the way we do. >> > No it's not. Scipy is rarely released. David and Stefan are saying that it is very hard to release. It might be true that continuing with the organic, 'add it if it seems good' approach will be fine. But it might also be true that it will make Scipy grind to a halt, as it becomes too poorly structured and tested to maintain. http://anyall.org/blog/2009/02/comparison-of-data-analysis-packages-r-matlab-scipy-excel-sas-spss-stata/ rates Numpy / Scipy / matplotlib as 'immature'. This is mainly because of Scipy, and it's fair.
If we want it to change we have to be able to release versions that have good documentation and low bug counts. The choices we make now are going to have long-lasting consequences for Scipy. I think our best guess, from what David and Stefan are saying, is that we need a change towards more structured process. I stress the word "need". This doesn't seem surprising to me. I think we've got to listen to them, because they are doing the work of maintaining and releasing Scipy. See y'all, Matthew From josef.pktd at gmail.com Wed Feb 25 14:15:24 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 25 Feb 2009 14:15:24 -0500 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <49A58B13.1040206@gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <49A57686.70707@enthought.com> <49A58B13.1040206@gmail.com> Message-ID: <1cd32cbb0902251115p2ad7798dq811bd39142535392@mail.gmail.com> On Wed, Feb 25, 2009 at 1:16 PM, Bruce Southey wrote: > Travis E. Oliphant wrote: >> Charles R Harris wrote: >> >>> I don't think there are enough eyes at this point for a strict review >>> policy. How many of the current packages have any maintainer? Who was >>> maintaining the stats package before Josef got involved? How many >>> folks besides Robert could look over the changes usefully? How many >>> folks looked over Travis' recent addition to optimize? Who is working >>> on the interpolation package? >>> >>> I think at this point we would be better off trying to recruit at >>> least one person to "own" each package. For new packages that is >>> usually the person who committed it but we also need ownership of >>> older packages. Someone with a personal stake in a package is likely >>> to do more for quality assurance at this point than any amount of >>> required review. >>> >> Yes, my feelings exactly. Quality goes up when people who have a >> personal stake or attachment to the code are engaged.
How do we get >> more of this to happen? Formal review processes can actually have at >> least some negative impact in getting people engaged. Let's make a >> tweak here and a tweak there. Right now, I'm of the opinion that >> whatever makes the *workflow* of people like David, Pauli, Jarrod, >> Robert K, Robert C, Nathan, Matthew, Charles, Anne, Andrew, Gael, and >> Stefan (and others big contributors I may have missed) easier, I'm >> totally in favor of. If that is a DVCS and/or something different >> than Trac, then let's do that. >> >> It sounds like we are making steps in that direction which is excellent. >> > Really, based on the discussion (including the latter comments), it > appears to me that this discussion has moved towards what sort of > developmental structure scipy should be using with a DVCS. > > I viewed much of the discussion as following what happens with > Linux kernel development since they adopted DVCS starting with > Bitkeeper. Jonathan Corbet has an interesting article on this > http://ldn.linuxfoundation.org/book/how-participate-linux-community . > Essentially there are sub-maintainer trees that feed into the testing > tree (-mm), the staging tree (where patches are applied against that > should minimize tree divergence) and hopefully Linus's tree. During > that process there is informal code review, at least for bug fixes, as new or > major features still have a problem with code review. In some aspects, > the man-power restriction with the Linux kernel development has been > removed because code no longer has to flow through a single node. So > this allows a user to easily get code not only from these trees but > also other developers. > > So scipy could do something similar, where the use of a DVCS would > hopefully reduce the burden on people like Robert and you. > > I do not see a real need at this time to say you 'own' that module and > you must 'control' the development of it.
I would suspect that > 'ownership' of scipy components will naturally develop over time and, > thus, should not be forced upon anyone. > > If sub-trees were created it would permit a sharing of incomplete code > so that the burden of developing appropriate tests, writing > documentation and testing can be distributed to interested parties. This > would also foster mentoring and getting hands dirty in a positive way. > > >> >> >>> I don't have a problem with folks complaining about missing tests, >>> etc., but I worry that if we put too many review steps into the >>> submission path there won't be enough people to make it work. >>> >> This is exactly the way I feel.... I don't want to imply at all that we >> shouldn't be bugging each other about documentation and testing. I >> personally welcome any reminders in that direction. I am just worried >> about whether or not we are really solving the real problems that make >> it hard to contribute by instituting policy rather than providing >> examples of code to model. >> > From this it appears that in order to get code into scipy you have > to have all this documentation and tests. But really my concern with > strict requirements for tests and documentation is that we will get > minimal tests and inferior documentation or nothing at all. Rather, I > hope that if the burden can be shifted from one person to a group of people, > then perhaps we can get more extensive tests and documentation as well > as people actually testing the code on different systems with hopefully > realistic situations. So at least there is some fix in some tree for a > problem and eventually the rest of it will follow by the time everything > is ready for mainline inclusion. > > > Regards > Bruce > R has recently had a discussion on quality control for statistical functions, especially for use in the health industry, because they were criticized by SAS on the grounds that open source has insufficient guarantees of correctness.
Scipy is not in the same group, but I think a review process before commit, if it attracts more users, will make it more likely to catch any problems. There are many good statistical tools in scipy; however, until recently I wasn't sure what I would use in a "serious" application, since there are too many possibly incorrect results. The second point is that more eyes might catch problems with refactoring, given that the test coverage is still shaky, and might reduce the chance of dead code. Two examples for stats-related functions: the recent removal of var and mean from scipy stats broke several functions that didn't have test coverage, so the breakage didn't show up in the tests. The second case is the recent addition of curve_fit, where the documentation didn't correspond to what was actually calculated. In both cases the review and corrections happened after the commit, since I keep an eye on any stats-related commits. Without the review we might get misleading (or incorrect) numbers and broken code. And I've seen a lot of both in stats. But I also hope that any changes in the workflow help in spreading the work of testing and documentation and make adding new code easier and safer. Josef From charlesr.harris at gmail.com Wed Feb 25 14:22:21 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 25 Feb 2009 12:22:21 -0700 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <1e2af89e0902251058i3ceefa6fx1cbd3578482f6bf2@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <9457e7c80902241502p5be85f0atb730788ba7ab66b6@mail.gmail.com> <49A582E0.9000001@enthought.com> <1e2af89e0902251058i3ceefa6fx1cbd3578482f6bf2@mail.gmail.com> Message-ID: On Wed, Feb 25, 2009 at 11:58 AM, Matthew Brett wrote: > Hi, > > >> Tests protect the user and the developer alike. It is irresponsible > >> to carry on the way we do. > >> > > No it's not. > > Scipy is rarely released.
David and Stefan are saying that it is very > hard to release. > > It might be true, that continuing with the organic, 'add it if it > seems good' approach, will be fine. But it might also be true that > it will make Scipy grind to a halt, as it becomes too poorly > structured and tested to maintain. > > > http://anyall.org/blog/2009/02/comparison-of-data-analysis-packages-r-matlab-scipy-excel-sas-spss-stata/ > > rates Numpy / Scipy / matplotlib as 'immature'. This is mainly > because of Scipy, and it's fair. If we want it to change we have to > be able to release versions that have good documentation and low bug > counts. > > The choices we make now are going to have long-lasting consequences for > Scipy. > > I think our best guess, from what David and Stefan are saying, is that we > need a change towards a more structured process. I stress the word > "need". This doesn't seem surprising to me. I think we've got to > listen to them, because they are doing the work of maintaining and > releasing Scipy. > Much of Scipy *isn't* maintained; that is why it is immature. There are parts that need to be worked over and rationalized and that isn't happening. You can't review code that hasn't been written. Some of that is history: the initial impetus in Scipy was interfacing existing C and Fortran libraries with Python and scratching itches. But that isn't the same as putting together a large package with smoothly interacting parts and verified results. And before that can happen we need more people working on the parts. Chuck From cournape at gmail.com Wed Feb 25 14:26:37 2009 From: cournape at gmail.com (David Cournapeau) Date: Thu, 26 Feb 2009 04:26:37 +0900 Subject: [SciPy-dev] Scipy workflow (and not tools).
In-Reply-To: <1e2af89e0902251058i3ceefa6fx1cbd3578482f6bf2@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <9457e7c80902241502p5be85f0atb730788ba7ab66b6@mail.gmail.com> <49A582E0.9000001@enthought.com> <1e2af89e0902251058i3ceefa6fx1cbd3578482f6bf2@mail.gmail.com> Message-ID: <5b8d13220902251126n62b15b21oc53fcc2ce3a71f89@mail.gmail.com> On Thu, Feb 26, 2009 at 3:58 AM, Matthew Brett wrote: > Hi, > >>> Tests protect the user and the developer alike. It is irresponsible >>> to carry on the way we do. >>> >> No it's not. > > Scipy is rarely released. David and Stefan are saying that it is very > hard to release. > > It might be true, that continuing with the organic, 'add it if it > seems good' approach, will be fine. But it might also be true that > it will make Scipy grind to a halt, as it becomes too poorly > structured and tested to maintain. I personally don't much buy the argument that asking for some tests would mean less useful code. True, not asking for tests at commit time requires less work, but having to fix, afterwards, code without tests that I have not written is much more time-consuming. If the code really is useful but has no tests, maybe it does not belong in scipy. Maybe it belongs in something else - I mean, creating a setup.py and publishing on PyPI is not really hard. Maybe we should first see how much easier we can make the process of including new code instead of worrying about making it too difficult.
David From fperez.net at gmail.com Wed Feb 25 14:29:17 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 25 Feb 2009 11:29:17 -0800 Subject: [SciPy-dev] cleaning out wiki spam In-Reply-To: <09ECD184-2822-4A91-9F58-1BB4122FD590@enthought.com> References: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> <20090222212703.GR6701@phare.normalesup.org> <49A1C644.5030205@gmail.com> <49A3A265.9060200@gmail.com> <09ECD184-2822-4A91-9F58-1BB4122FD590@enthought.com> Message-ID: On Wed, Feb 25, 2009 at 4:20 AM, Peter Wang wrote: > On Feb 25, 2009, at 1:20 AM, Fernando Perez wrote: > >> Inspired by this, I just went and nuked ~1900 out of the ipython one, >> leaving only the 128 that are probably for real. I hope this helps >> also reduce the load a bit more. > > Great, thank you! One thing that occurs to me is that once you have a > fairly high ratio of ham to spam, it might be worth saving the > directory listing into a base "goodpages.txt" that can then be used as > a whitelist filter in the future when blowing away spam via regexes. > (Hopefully we won't have to do that on this scale again, but if > history is any indicator, spammers always find a way...) Good idea, I just did it (in fact it's only 97 long, I cleaned up a few more after sending my email, so those are really 'pure ham' now, since I checked every one of them). BTW, I'm sure you have your tools by now for the cleanup, but in case this is useful, here's the little script I used. I found it easier to check interactively in small batches by pattern rather than doing one giant regexp run: /home/ipython/usr/bin/movepages It still takes time, since you have to look for false positives. In any case, many thanks for all your work, the moin wikis do feel already a LOT more responsive. I don't know how many times in the last few weeks I got timeout errors on the scipy cookbook, and now it's fairly snappy.
This was a real problem, and it's much better now. Cheers, f From cournape at gmail.com Wed Feb 25 14:30:03 2009 From: cournape at gmail.com (David Cournapeau) Date: Thu, 26 Feb 2009 04:30:03 +0900 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <9457e7c80902241502p5be85f0atb730788ba7ab66b6@mail.gmail.com> <49A582E0.9000001@enthought.com> <1e2af89e0902251058i3ceefa6fx1cbd3578482f6bf2@mail.gmail.com> Message-ID: <5b8d13220902251130g45d4607ay7829ca5c374e5fe2@mail.gmail.com> On Thu, Feb 26, 2009 at 4:22 AM, Charles R Harris wrote: > > > On Wed, Feb 25, 2009 at 11:58 AM, Matthew Brett > wrote: >> >> Hi, >> >> >> Tests protect the user and the developer alike. It is irresponsible >> >> to carry on the way we do. >> >> >> > No it's not. >> >> Scipy is rarely released. David and Stefan are saying that it is very >> hard to release. >> >> It might be true, that continuing with the organic, 'add it if it >> seems good' approach, will be fine. But it might also be true that >> it will make Scipy grind to a halt, as it becomes too poorly >> structured and tested to maintain. >> >> >> http://anyall.org/blog/2009/02/comparison-of-data-analysis-packages-r-matlab-scipy-excel-sas-spss-stata/ >> >> rates Numpy / Scipy / matplotlib as 'immature'. This is mainly >> because of Scipy, and it's fair. If we want it to change we have to >> be able to release versions that have good documentation and low bug >> counts. >> >> The choices we make now are going to have long-lasting consequences for >> Scipy. >> >> I think our best guess, from what David and Stefan are saying, is that we >> need a change towards a more structured process. I stress the word >> "need". This doesn't seem surprising to me. I think we've got to >> listen to them, because they are doing the work of maintaining and >> releasing Scipy. > > Much of Scipy *isn't* maintained, that is why it is immature.
There are > parts that need to be worked over and rationalized and that isn't happening. > You can't review code that hasn't been written. Some of that is history: the > initial impetus in Scipy was interfacing existing C and Fortran libraries > with Python and scratching itches. But that isn't the same as putting > together a large package with smoothly interacting parts and verified > results. And before that can happen we need more people working on the > parts. Also, if the problem is manpower, adding more code, which makes the whole package more difficult to handle, does not sound like a future-proof path, unless the goal of scipy is to become a bag of tricks which may be useful to some people, without any commitment from our side. Some parts of scipy are difficult to maintain because they have no tests and no documentation - it is not even obvious what they are supposed to do. I am afraid we can't have it both ways: if we want to increase quality with the manpower we have, we have to reduce the amount of code which requires constant attention. If we want more features first, then we can continue like we do now. But then we can't expect regular releases which are relatively well tested. David From matthew.brett at gmail.com Wed Feb 25 14:32:47 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 25 Feb 2009 11:32:47 -0800 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <1cd32cbb0902251115p2ad7798dq811bd39142535392@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <49A57686.70707@enthought.com> <49A58B13.1040206@gmail.com> <1cd32cbb0902251115p2ad7798dq811bd39142535392@mail.gmail.com> Message-ID: <1e2af89e0902251132y3f3f5923if60cfc49f0b19cda@mail.gmail.com> Hi, > Scipy is not in the same group, but I think a review process before > commit, if it attracts more users, will make it more likely to catch any > problems.
There are many good statistical tools in scipy, however > until recently I wasn't sure what I would use in a "serious" application > since there are too many, possibly incorrect results. Actually, watching your (Josef's) fixes to stats was an eye-opener for me. My impression was as you've said, that it was badly enough broken that you wouldn't use it - before you went through it. We can't afford to carry on shipping code that is that broken, otherwise we'll lose momentum and users. And at least - for stats - you (Josef) came along and went deep into it. Doing that is much harder without documentation and tests, making it more likely that bad code with major bugs will carry on making Scipy limp in the world of numerical libraries like matlab and R. I think we do - clearly - have a problem, and that we do - clearly - need a change to fix it. Matthew From stefan at sun.ac.za Wed Feb 25 14:45:21 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Wed, 25 Feb 2009 21:45:21 +0200 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <49A582E0.9000001@enthought.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <9457e7c80902241502p5be85f0atb730788ba7ab66b6@mail.gmail.com> <49A582E0.9000001@enthought.com> Message-ID: <9457e7c80902251145i577e832by7bf4b35017859385@mail.gmail.com> Hi Travis 2009/2/25 Travis E. Oliphant : > Which code has no tests? Are you speaking rhetorically or about a > specific check-in? I don't want to focus on any commits specifically (this kind of thing happens across the board), but I'll give one example that involves yourself. Take a look at http://projects.scipy.org/scipy/scipy/changeset/5554 You added a very useful function (thank you very much!). I haven't played with decimation filters recently, so I added a very naïve test: http://projects.scipy.org/scipy/scipy/changeset/5560 Let's face it, you could have done a much better job there than I could.
Besides the neat function you added, you also fixed a spelling mistake. Unfortunately, "Filifilt" became "filtflit", which is also incorrect. Also, you accidentally changed "lfilter_zi" to "lfiltir_zi". I fixed that in: http://projects.scipy.org/scipy/scipy/changeset/5557 Even a cursory review of the patch would have caught these issues. I appreciate that you want SciPy to remain organic and free; a place where motivated developers strive to grow the codebase to an efficient, powerful computing machine. I share that vision, although I'd like us to turn the gears up a bit, oil the machine, and encourage one another to write better code. Regards Stéfan From peridot.faceted at gmail.com Wed Feb 25 14:53:09 2009 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 25 Feb 2009 14:53:09 -0500 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <5b8d13220902251130g45d4607ay7829ca5c374e5fe2@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <9457e7c80902241502p5be85f0atb730788ba7ab66b6@mail.gmail.com> <49A582E0.9000001@enthought.com> <1e2af89e0902251058i3ceefa6fx1cbd3578482f6bf2@mail.gmail.com> <5b8d13220902251130g45d4607ay7829ca5c374e5fe2@mail.gmail.com> Message-ID: 2009/2/25 David Cournapeau : > On Thu, Feb 26, 2009 at 4:22 AM, Charles R Harris > wrote: >> >> On Wed, Feb 25, 2009 at 11:58 AM, Matthew Brett >> wrote: >>> >>> Hi, >>> >>> >> Tests protect the user and the developer alike. It is irresponsible >>> >> to carry on the way we do. >>> >> >>> > No it's not. >>> >>> Scipy is rarely released. David and Stefan are saying that it is very >>> hard to release. >>> >>> It might be true, that continuing with the organic, 'add it if it >>> seems good' approach, will be fine. But it might also be true that >>> it will make Scipy grind to a halt, as it becomes too poorly >>> structured and tested to maintain.
>>> http://anyall.org/blog/2009/02/comparison-of-data-analysis-packages-r-matlab-scipy-excel-sas-spss-stata/ >>> >>> rates Numpy / Scipy / matplotlib as 'immature'. This is mainly >>> because of Scipy, and it's fair. It we want it to change we have to >>> be able to release versions that have good documentation and low bug >>> counts. >>> >>> The choices we make now are going to have long-lasting consequences for >>> Scipy. >>> >>> I think our best guess, from what David and Stefan are saying, that we >>> need a change towards more structured process. I stress the word >>> "need". This doesn't seem surprising to me. I think we've got to >>> listen to them, because they are doing the work of maintaining and >>> releasing Scipy. >> >> Much of Scipy *isn't* maintained, that is why it is immature. There are >> parts that need to be worked over and rationalized and that isn't happening. >> You can't review code that hasn't been written. Some of that is history: the >> initial impetus in Scipy was interfacing existing C and Fortran libraries >> with Python and scratching itches. But that isn't the same as putting >> together a large package with smoothly interacting parts and verified >> results. And before that can happen we need more people working on the >> parts. > > Also, if the problem is man power, adding more code which makes the > whole package more difficult to handle does not sound like a future > proof path. Unless the goal of scipy is to become a bag of tricks > which may be useful to some people, without any commitment from our > side. > > Some parts of scipy are difficult to maintain because they have no > tests and no documentation - it is not even obvious what it is > supposed to do. I am afraid we can't have it both ways: if we want to > increase quality, given man power, we have to reduce the amount of > code which requires constant attention. If we want more features > first, then, we can continute like we do now. 
But then we can't expect > constant releases, which are relatively well tested. It seems to me that one reason for the current disagreement is that people are talking about two different things: (1) Getting new code written from scratch and into the repository, and (2) Getting (and keeping) the code we have working reliably. For (1), tests and documentation are indeed a barrier (albeit in my opinion a very low one). For (2), though, requiring tests and documentation will drastically decrease the effort required. Put another way: some people are arguing that not requiring tests or documentation will get more people contributing new code. Others are arguing that allowing code without tests or documentation into the trunk will increase the manpower required to do basic things like make releases. Personally, I don't think requiring tests and documentation is a barrier to new users. I was very hesitant about my first contribution because I really didn't want to put in broken or embarrassingly bad code, so the fact that I could test it systematically, confirm that it didn't break anything else, and document it clearly made me more confident that I was contributing something that wouldn't require me to wear a paper bag on my head. But let's assume that it is a barrier to contributing new code. Which does scipy need more right now: reliability in the code it has and a regular release cycle, or lots more new code? Anne From charlesr.harris at gmail.com Wed Feb 25 14:56:57 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 25 Feb 2009 12:56:57 -0700 Subject: [SciPy-dev] Scipy workflow (and not tools). 
In-Reply-To: <9457e7c80902251145i577e832by7bf4b35017859385@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <9457e7c80902241502p5be85f0atb730788ba7ab66b6@mail.gmail.com> <49A582E0.9000001@enthought.com> <9457e7c80902251145i577e832by7bf4b35017859385@mail.gmail.com> Message-ID: On Wed, Feb 25, 2009 at 12:45 PM, Stéfan van der Walt wrote: > Hi Travis > > 2009/2/25 Travis E. Oliphant : > > Which code has no tests? Are you speaking rhetorically or about a > > specific check-in? > > I don't want to focus on any commits specifically (this kind of thing > happens across the board), but I'll give one example that involves > yourself. Take a look at > I would have made some comments on the curve_fit function also, but there was no easy way to do it. I don't know that we need a formal review process at this point, but it would be nice if changes to the code showed up somewhere with tools that made it easy to add comments that went directly back to the submitter. I also miss that on the tickets, where some way to contact the submitter could sometimes be helpful. 'Course, we don't want the spam machines grabbing the addresses ;) Chuck From matthew.brett at gmail.com Wed Feb 25 15:01:20 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 25 Feb 2009 12:01:20 -0800 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <9457e7c80902251145i577e832by7bf4b35017859385@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <9457e7c80902241502p5be85f0atb730788ba7ab66b6@mail.gmail.com> <49A582E0.9000001@enthought.com> <9457e7c80902251145i577e832by7bf4b35017859385@mail.gmail.com> Message-ID: <1e2af89e0902251201y3f501928sf1089c3a129b3b45@mail.gmail.com> Hi, > I don't want to focus on any commits specifically (this kind of thing > happens across the board), but I'll give one example that involves > yourself.
And I will give an example that involves myself. I added a patch that was partly tested and not properly benchmarked to the 0.7 matlab io and rendered it more or less unusable for large datasets. I speak only for myself, but I like having people have a look at my code, teaching me stuff I don't know, or just checking things that I didn't think of myself. So, how about this: A proposal ------------- We set up a patch review policy. The review involves checking for and suggesting tests and documentation. That's the default. If you don't want this to happen to your code, then you ask for an opt-out. Does that sound reasonable? Matthew From peridot.faceted at gmail.com Wed Feb 25 15:10:45 2009 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 25 Feb 2009 15:10:45 -0500 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <1e2af89e0902251201y3f501928sf1089c3a129b3b45@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <9457e7c80902241502p5be85f0atb730788ba7ab66b6@mail.gmail.com> <49A582E0.9000001@enthought.com> <9457e7c80902251145i577e832by7bf4b35017859385@mail.gmail.com> <1e2af89e0902251201y3f501928sf1089c3a129b3b45@mail.gmail.com> Message-ID: 2009/2/25 Matthew Brett : > Hi, > >> I don't want to focus on any commits specifically (this kind of thing >> happens across the board), but I'll give one example that involves >> yourself. > > And I will give an example that involves myself. I added a patch that > was partly tested and not properly benchmarked to the 0.7 matlab io > and rendered it more or less unusable for large datasets. > > I speak only for myself, but I like having people have a look at my > code, teaching me stuff I don't know, or just checking things that I > didn't think of myself. > > So, how about this: > > A proposal > ------------- > > We set up a patch review policy. The review involves checking for and > suggesting tests and documentation. That's the default. 
If you don't > want this to happen to your code, then you ask for an opt-out. > > > Does that sound reasonable? I think as far as patch review goes, opt-in is quite enough - what's lacking is a reasonable mechanism to implement patch review. Simply posting the code to the mailing list really doesn't work well. Possibly that online code review site people have pointed to a few times could work? It would help if I could avoid looking at patches I've already reviewed (or decided not to review). Here's somewhere infrastructure could help. Anne From oliphant at enthought.com Wed Feb 25 15:33:25 2009 From: oliphant at enthought.com (Travis E. Oliphant) Date: Wed, 25 Feb 2009 14:33:25 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> Message-ID: <49A5AB15.4060609@enthought.com> Stéfan van der Walt wrote: > Hi Matthew > > 2009/2/24 Matthew Brett : > >> I think Stefan's position is that, as more people start using and >> contributing to Scipy, it's become near impossible to maintain in a >> release-worthy way (Stefan - is that right)? That if we want to >> keep going without collapsing we need a more formal process. >> > > Exactly. If we keep introducing new bugs ourselves, there's not > enough time in the world to bring SciPy up to standard. > When have you done enough "testing" or "documentation"? We need bright, dedicated people working on SciPy and using good judgment.
Sometimes that means emphasizing tests. Sometimes that means emphasizing documentation. Sometimes it means thinking hard about the algorithm you are implementing and carefully coding it. Unit-testing is a tool, but it requires more than that to create code that others can use and rely on --- a skill I don't think we can quantify sufficiently to formalize a process that by itself "produces more contributions". > With the nose framework in place, writing tests is so very easy: > > def test_myfoo(): > assert 1 == 1 > > So I hope that everyone would agree that proper testing and > documentation improves life, not only for the user community, but also > for the contributor. > Yes, unit-testing is essential when you need to re-factor --- but it comes with a cost. Code that is unit-tested requires those tests to also be re-factored when the code gets refactored, so there is such a thing as "unit-testing an API too early". Thus, there is a life-cycle question for unit-test coverage. Early on, fewer core unit-tests are appropriate. Later, when the API is stabilized, more unit-tests are appropriate. I can't give hard numbers for what is "right" or when the transition is made. -Travis From matthew.brett at gmail.com Wed Feb 25 15:34:32 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 25 Feb 2009 12:34:32 -0800 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <9457e7c80902241502p5be85f0atb730788ba7ab66b6@mail.gmail.com> <49A582E0.9000001@enthought.com> <9457e7c80902251145i577e832by7bf4b35017859385@mail.gmail.com> <1e2af89e0902251201y3f501928sf1089c3a129b3b45@mail.gmail.com> Message-ID: <1e2af89e0902251234y3a048290g49616a1a87ebc1a2@mail.gmail.com> Hi, >> A proposal >> ------------- >> >> We set up a patch review policy. The review involves checking for and >> suggesting tests and documentation. That's the default.
If you don't >> want this to happen to your code, then you ask for an opt-out. > I think as far as patch review goes, opt-in is quite enough I would suggest - following Stefan's comments - that opt-out would encourage people to think of this as being the standard way to work. Opting-out would be saying 'this is not the way I like to work' - and that is OK. > - what's > lacking is a reasonable mechanism to implement patch review. Right. So, staying clear of the actual tools - can we ask Stefan, or David, or Pauli, to suggest a specific workflow along these lines? Then we can see how to implement it. Best, Matthew From matthew.brett at gmail.com Wed Feb 25 15:41:33 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 25 Feb 2009 12:41:33 -0800 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <49A5AB15.4060609@enthought.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> Message-ID: <1e2af89e0902251241t246d16c4p32191dfa2a1f4e86@mail.gmail.com> Hi, > Yes, unit-testing is essential when you need to re-factor --- but it > comes with a cost. Code that is unit-tested requires those tests to > also be re-factored when the code gets refactored, so there is such a > thing as "unit-testing an API too early". Thus, there is a life-cycle > question for unit-test coverage. Early-on fewer core unit-tests are > appropriate. Later, when the API is stabilized, more unit-tests are > appropriate.
I can't give hard numbers for what is "right" or when > the transition is made. Yes, that's right, but code that is that provisional should not be in Scipy - right? And, I am learning, slowly, that writing the tests first makes the API better. I know that not everyone likes to do this, but it does seem to me a reasonable request, that by the time the code reaches the Scipy trunk, it should have good test coverage and documentation. Branches - whatever you like. Scikits - probably also fine. But not Scipy trunk... Matthew From perry at stsci.edu Wed Feb 25 15:58:21 2009 From: perry at stsci.edu (Perry Greenfield) Date: Wed, 25 Feb 2009 15:58:21 -0500 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <49A5AB15.4060609@enthought.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> Message-ID: <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> I'll add a couple of comments regarding this whole discussion (including the tools that will be used for scipy/numpy software development). 1) I'm not sure that it is a good idea to change everything at once (e.g., svn->git, trac->roundup, etc), particularly if these changes can be done incrementally. It's easy for those that hunger for these changes to think of why doing them sooner is better. I suspect there are many others who feel otherwise but may not be as vocal. And some of the impacts may not be obvious.
So if it is at all possible, even if it means some extra work, try just doing one thing at a time and evaluating its impact first before making other changes. Arguably the same point can be made about process changes. 2) While I understand the desire to increase the quality of commits to scipy by putting in a more formal process, like making sure code is reviewed, tests are present, and documentation is provided, I too, like Travis, worry that this may inhibit many useful contributions. Rather than act as a barrier, why not just have some sort of "seal of approval" for things that have gone through that process. As a user, I'd rather have the choice of using an unreviewed, poorly tested, or poorly documented module than have someone else decide that I can't make that choice myself. Who knows, I might find it useful enough to improve. Yes, it can be put in another area (e.g., scikits), but it should be just as easy to get at and see that it is available. If one thing should be most required it would be tests to ensure the main functionality works on all supported platforms so that building releases isn't held up by problems discovered on yet untested platforms. I don't think the reviewing or documentation issues generally affect how much work is involved in making releases though. 
Perry From matthewturk at gmail.com Wed Feb 25 16:22:13 2009 From: matthewturk at gmail.com (Matthew Turk) Date: Wed, 25 Feb 2009 13:22:13 -0800 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> Message-ID: Hi there, I've only just subscribed to this list (after following on GMANE-RSS) because I wanted to contribute to the discussion. I'd like to echo Perry's first point below, and add on some specific concerns I have about the entire workflow discussion. An aspect I worry is being overlooked is that in some communities change comes rather slowly. I have helped out a number of people with user-space deployment of Python packages, and the biggest impediment is -- as with many things -- installation. I worry that if the release schedule of SciPy doesn't speed up substantially, accessing source control will be the primary means of getting the code. Installing git and mercurial (and maybe Bazaar, but I've had the most trouble with that) into some user-space area is not difficult, but it adds on another layer of overhead. Until all of the supercomputing centers provide DVCS, users (and developers!) targeting deployment there will have yet another barrier to entry for using SciPy. (And as a result, they may fall back on old habits: IDL, for instance.) 
To that end, I'd like to strongly and plaintively request that some kind of mirror in SVN, or even archived nightly tarballs, be kept of the primary tree of development. Those concerns aside, I think that anything that reduces the barrier to entry for developers is likely to be a great boon to SciPy. For me, the biggest barrier to using and deploying SciPy is still installation, but in recent months that has become substantially easier -- largely due to the efforts by everyone here. (That being said, I do know that some of my colleagues do on occasion still struggle with the installation process.) Thanks for listening. :) -Matt On Wed, Feb 25, 2009 at 12:58 PM, Perry Greenfield wrote: > I'll add a couple comments regarding this whole discussion (including > the tools that will be used for scipy/numpy software development). > > 1) I'm not sure that it is a good idea to change everything at once > (e.g., svn->git, trac->roundup, etc), particularly if these changes > can be done incrementally. It's easy for those that hunger for these > changes to think of why doing sooner is better. I suspect there are > many other that may feel otherwise that may not be as vocal. And some > of the impacts may not be obvious. So if it is at all possible, even > if it means some extra work, try just doing one thing at a time and > evaluating its impact first before making other changes. Arguably the > same point can be made about process changes. > > 2) While I understand the desire to increase the quality of commits to > scipy by putting in a more formal process, like making sure code is > reviewed, tests are present, and documentation is provided, I too, > like Travis, worry that this may inhibit many useful contributions. > Rather than act as a barrier, why not just have some sort of "seal of > approval" for things that have gone through that process. 
As a user, > I'd rather have the choice of using an unreviewed, poorly tested, or > poorly documented module than have someone else decide that I can't > make that choice myself. Who knows, I might find it useful enough to > improve. Yes, it can be put in another area (e.g., scikits), but it > should be just as easy to get at and see that it is available. If one > thing should be most required it would be tests to ensure the main > functionality works on all supported platforms so that building > releases isn't held up by problems discovered on yet untested > platforms. I don't think the reviewing or documentation issues > generally affect how much work is involved in making releases though. > > Perry > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > From stefan at sun.ac.za Wed Feb 25 16:51:06 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Wed, 25 Feb 2009 23:51:06 +0200 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <1e2af89e0902251234y3a048290g49616a1a87ebc1a2@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <9457e7c80902241502p5be85f0atb730788ba7ab66b6@mail.gmail.com> <49A582E0.9000001@enthought.com> <9457e7c80902251145i577e832by7bf4b35017859385@mail.gmail.com> <1e2af89e0902251201y3f501928sf1089c3a129b3b45@mail.gmail.com> <1e2af89e0902251234y3a048290g49616a1a87ebc1a2@mail.gmail.com> Message-ID: <9457e7c80902251351q36f920c2r44f4f46ccaf01c16@mail.gmail.com> Hi Matthew 2009/2/25 Matthew Brett : > Right. So, staying clear of the actual tools - can we ask Stefan, or > David, or Pauli, to suggest a specific workflow along these lines? > Then we can see how to implement it. David and I have installed both roundup and trac 0.11 so far, and we are busy exploring the different code review options. We'll keep the list posted!
Cheers Stéfan From robert.kern at gmail.com Wed Feb 25 16:58:00 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 25 Feb 2009 15:58:00 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> Message-ID: <3d375d730902251358r2a949bbdy98e6b3ef80509db0@mail.gmail.com> On Wed, Feb 25, 2009 at 15:22, Matthew Turk wrote: > Hi there, > > I've only just subscribed to this list (after following on GMANE-RSS) > because I wanted to contribute to the discussion. I'd like to echo > Perry's first point below, and add on some specific concerns I have > about the entire workflow discussion. > > An aspect I worry is being overlooked is that in some communities > change comes rather slowly. I have helped out a number of people with > user-space deployment of Python packages, and the biggest impediment > is -- as with many things -- installation. I worry that if the > release schedule of SciPy doesn't speed up substantially, accessing > source control will be the primary means of getting the code. > Installing git and mercurial (and maybe Bazaar, but I've had the most > trouble with that) into some user-space area is not difficult, but it > adds on another layer of overhead. Until all of the supercomputing > centers provide DVCS, users (and developers!) targeting deployment > there will have yet another barrier to entry for using SciPy. (And as > a result, they may fall back on old habits: IDL, for instance.)
To > that end, I'd like to strongly and plaintively request that some kind > of mirror in SVN, or even archived nightly tarballs, be kept of the > primary tree of development. Pretty much all of the DVCS web frontends allow users to get tarballs of any revision they like. For example, the "bz2", "zip" and "gz" links at the top of my Mercurial repo for line_profiler: http://www.enthought.com/~rkern/cgi-bin/hgwebdir.cgi/line_profiler/ Click the "files" link for any of the previous revisions to go back in the history, and click on the "bz2", etc., links to get past revisions. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From matthewturk at gmail.com Wed Feb 25 17:02:06 2009 From: matthewturk at gmail.com (Matthew Turk) Date: Wed, 25 Feb 2009 14:02:06 -0800 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <3d375d730902251358r2a949bbdy98e6b3ef80509db0@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <3d375d730902251358r2a949bbdy98e6b3ef80509db0@mail.gmail.com> Message-ID: > Pretty much all of the DVCS web frontends allow users to get tarballs > of any revision they like. For example, the "bz2", "zip" and "gz" > links at the top of my Mercurial repo for line_profiler: Ah, excellent. I apologize for the noise and withdraw my concern. -Matt From oliphant at enthought.com Wed Feb 25 17:19:06 2009 From: oliphant at enthought.com (Travis E.
Oliphant) Date: Wed, 25 Feb 2009 16:19:06 -0600 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <5b8d13220902250902v61c0e038odb513265e21264af@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <49A57686.70707@enthought.com> <5b8d13220902250902v61c0e038odb513265e21264af@mail.gmail.com> Message-ID: <49A5C3DA.8040709@enthought.com> David Cournapeau wrote: > On Thu, Feb 26, 2009 at 1:49 AM, Travis E. Oliphant > wrote: > > >> I do see a real need to fix the SVN-Trac workflow bottleneck as well as >> anything that helps the release process. It's actually at the release >> process where I would institute any formal review process. I'm also in >> favor of having a regular (i.e. every 3-6 months) release process. The >> difficulty there again is man-power. >> > > One thing which may help here is to have a turn-around for the release > manager: a different person every time. This person would have the > last word of what goes in/what does not, with almost strictly > enforced deadlines. In particular, we should really enforce code > freeze - although I can understand the point that reviews may make > things harder, I don't think it is possible at all to make good > release without enforcing very strict timelines. There has to be no > new code for some time before the release, time which is more than > just one day or two. C/Fortran code would be the first to be frozen, > then python, then docstring. The exact time can be tweaked after > experiments, of course. But if we get this right, I believe that > having freeze periods can make the time from patch to inclusion > actually faster. > That sounds fine. > Having a different person means it is not always the same person, > obviously, and it may also keep people "honest", in the sense that a > release manager will also be a coder later under a different release > manager. > > This sounds great. I think the release manager gets to pick how strictly things are enforced.
-Travis From oliphant at enthought.com Wed Feb 25 17:44:25 2009 From: oliphant at enthought.com (Travis E. Oliphant) Date: Wed, 25 Feb 2009 16:44:25 -0600 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <1cd32cbb0902251115p2ad7798dq811bd39142535392@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <49A57686.70707@enthought.com> <49A58B13.1040206@gmail.com> <1cd32cbb0902251115p2ad7798dq811bd39142535392@mail.gmail.com> Message-ID: <49A5C9C9.7040402@enthought.com> josef.pktd at gmail.com wrote: > > for dead code. Two examples for stats related functions: The recent > removal of var and mean from scipy stats broke several functions > that didn't have test coverage and so didn't show up in the tests. > The second case is the recent addition of curvefit where the > documentation didn't correspond to what was actually calculated. > Yes, and this is a perfect example of "people who care" catch the problem even after commit and it gets fixed (extremely quickly) --- especially with this kind of documentation problem. I don't see how a formal review process would have helped in this case given that I probably would not have spent time on it had I had to jump through that hoop. And, the parameter estimation was working fine, but the error estimate was not (at least for a day). > In both cases the review and corrections happened after the commit, > since I keep an eye on any stats related commits. Without the > review we might get misleading (or incorrect) numbers and broken > code. And I've seen a lot of both in stats. > This is exactly what we need. More people like Josef looking at check-ins and offering suggestions / assistance that corresponds to their area of interest / expertise. I was very grateful for the assistance (and the unit-tests based on NIST data). I don't think we need any formal rules except the natural ones. 
We definitely need better tools to "track which code has been reviewed", check builds on multiple platforms, merge development branches, etc., etc. We also need more cycles to spend on SciPy. -Travis From oliphant at enthought.com Wed Feb 25 17:49:10 2009 From: oliphant at enthought.com (Travis E. Oliphant) Date: Wed, 25 Feb 2009 16:49:10 -0600 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <9457e7c80902241502p5be85f0atb730788ba7ab66b6@mail.gmail.com> <49A582E0.9000001@enthought.com> <9457e7c80902251145i577e832by7bf4b35017859385@mail.gmail.com> Message-ID: <49A5CAE6.5030104@enthought.com> Charles R Harris wrote: > > I would have made some comments on the curve_fit function also, but > there was no easy way to do it. I don't know that we need a formal > review process at this point, but it would be nice if changes to the > code showed up somewhere with tools that made it easy to add comments > that went directly back to the submitter. I also miss that on the > tickets, where some way to contact the submitter could sometimes be > helpful. 'Course, we don't want the spam machines grabbing the > addresses ;) Yes, such tools would be very nice, and I would welcome warmly any changes to the workflow that accommodate them. -Travis -- Travis Oliphant Enthought, Inc. (512) 536-1057 (office) (512) 536-1059 (fax) http://www.enthought.com oliphant at enthought.com From oliphant at enthought.com Wed Feb 25 17:52:55 2009 From: oliphant at enthought.com (Travis E. Oliphant) Date: Wed, 25 Feb 2009 16:52:55 -0600 Subject: [SciPy-dev] Scipy workflow (and not tools).
In-Reply-To: <1e2af89e0902251201y3f501928sf1089c3a129b3b45@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <9457e7c80902241502p5be85f0atb730788ba7ab66b6@mail.gmail.com> <49A582E0.9000001@enthought.com> <9457e7c80902251145i577e832by7bf4b35017859385@mail.gmail.com> <1e2af89e0902251201y3f501928sf1089c3a129b3b45@mail.gmail.com> Message-ID: <49A5CBC7.1080304@enthought.com> > And I will give an example that involves myself. I added a patch that > was partly tested and not properly benchmarked to the 0.7 matlab io > and rendered it more or less unusable for large datasets. > > I speak only for myself, but I like having people have a look at my > code, teaching me stuff I don't know, or just checking things that I > didn't think of myself. > > So, how about this: > > A proposal > ------------- > > We set up a patch review policy. The review involves checking for and > suggesting tests and documentation. That's the default. If you don't > want this to happen to your code, then you ask for an opt-out. > > Who do you ask? Who decides whether or not you get one? I think it's better to have a recommended policy of review but not a requirement. Then, a tool like coverage that shows which code has been reviewed by more than one pair of eyes. And then let that information along with who committed the code and who are the people "curating" a particular sub-package guide the release manager. -Travis From matthew.brett at gmail.com Wed Feb 25 17:58:17 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 25 Feb 2009 14:58:17 -0800 Subject: [SciPy-dev] Scipy workflow (and not tools). 
In-Reply-To: <49A5CBC7.1080304@enthought.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <9457e7c80902241502p5be85f0atb730788ba7ab66b6@mail.gmail.com> <49A582E0.9000001@enthought.com> <9457e7c80902251145i577e832by7bf4b35017859385@mail.gmail.com> <1e2af89e0902251201y3f501928sf1089c3a129b3b45@mail.gmail.com> <49A5CBC7.1080304@enthought.com> Message-ID: <1e2af89e0902251458u756f5743p5cf2696449af0eaf@mail.gmail.com> Hi, >> We set up a patch review policy. The review involves checking for and >> suggesting tests and documentation. That's the default. If you don't >> want this to happen to your code, then you ask for an opt-out. >> >> > Who do you ask? The release manager, I suppose, or the several people who are doing it. > Who decides whether or not you get one? Same, but, the threshold for someone who has already committed code should be low. It's clear that some people don't like working that way - and that's fine. > I think it's > better to have a recommended policy of review but not a requirement. Yes, right. So the recommendation is of the form 'this is our usual policy; if you would like to opt out of the policy, let us know, and expect us to agree'. See you, Matthew From oliphant at enthought.com Wed Feb 25 18:18:37 2009 From: oliphant at enthought.com (Travis E.
Oliphant) Date: Wed, 25 Feb 2009 17:18:37 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> Message-ID: <49A5D1CD.6070700@enthought.com> I've written a lot of response to various comments and I'd like to summarize and extend a few of my comments: 1) We absolutely need to improve the quality of SciPy, and that does mean more tests, documentation, and reviews --- and most importantly faster releases. Right now, a release happens when someone steps up to be a release manager and commits to making it happen. I don't know how to promise that on a regular cycle with only volunteer effort. I would love to have the resources to fund SciPy release management. 2) I think we are doing a decent job of commits having tests and documentation. We should continue to remind each other of the need for quality code in SciPy (and continue to clean up code that is there). 3) There are pieces of SciPy that need work (interpolate stands out most in my mind right now). I have changes to the interpolate code that I have not yet committed because I was waiting for the release of 0.7 but I really want to commit. Who is interested in reviewing this? I'm happy to work with additional eyes, but my current workflow is "commit code I think is working along with some tests and docstrings", and then let review/improve happen on the trunk. 
I don't really like having lots of branches checked out of a code-base in order to manage a different workflow. I'm open to being educated about approaches that work better. 4) Bug-fix commits are a different thing than feature-enhancement commits. We should have different expectations of them. 5) We do have scikits for more experimental additions to live so that SciPy should become more of a stable, documentation-rich library. But, the problem there is distribution. EPD and Enstaller (our BSD-licensed version of setuptools) is one answer to that distribution problem. There are others. 6) I very much appreciate all the work people do on SciPy. I think our biggest lack more than anything else is the "full-time" person that can respond to the user community and keep the momentum moving. -Travis From charlesr.harris at gmail.com Wed Feb 25 19:03:26 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 25 Feb 2009 17:03:26 -0700 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <49A5D1CD.6070700@enthought.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> Message-ID: On Wed, Feb 25, 2009 at 4:18 PM, Travis E. Oliphant wrote: > > I've written a lot of response to various comments and I'd like to > summarize and extend a few of my comments: > > 1) We absolutely need to improve the quality of SciPy, and that does > mean more tests, documentation, and reviews --- and most importantly > faster releases. 
Right now, a release happens when someone steps up > to be a release manager and commits to making it happen. I don't know > how to promise that on a regular cycle with only volunteer effort. I > would love to have the resources to fund SciPy release management. > > 2) I think we are doing a decent job of commits having tests and > documentation. We should continue to remind each other of the need > for quality code in SciPy (and continue to clean up code that is there). > > 3) There are pieces of SciPy that need work (interpolate stands out most > in my mind right now). I have changes to the interpolate code that I > have not yet committed because I was waiting for the release of 0.7 but > I really want to commit. Who is interested in reviewing this? I'm > happy to work with additional eyes, but my current workflow is "commit > code I think is working along with some tests and docstrings", and then > let review/improve happen on the trunk. I don't really like having > lots of branches checked out of a code-base in order to manage a > different workflow. I'm open to being educated about approaches that > work better. > Interpolate stands out in my mind also, along with signal processing. Mostly because last time I looked -- a long time ago -- they were pretty messy and I haven't seen much work done on them since. I think in this case it would be helpful if you summarized your intended changes and interfaces and gave a short explanation of your motivation. By the time the code actually shows up in SVN might be a little late. I think a similar approach would have helped with curve_fit, not least since I have found Gauss-Newton with numerical derivatives to out-perform Levenberg-Marquardt by ~50x in some problems and give better answers. Then there is the question of when to stop the iterations. Also, I couldn't see if the function and data could be array valued, which can be handy in some cases.
For instance, I recently fit a case where the parameters moved a large set of points around on a unit sphere in order to minimize the distance to data points on the sphere. In this case the output of f was most conveniently represented as an array of vectors. Mind, I don't mind the function itself so much as giving it a name that implies more generality than I think it has. Chuck From pav at iki.fi Wed Feb 25 21:03:50 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 26 Feb 2009 02:03:50 +0000 (UTC) Subject: [SciPy-dev] The future of SciPy and its development infrastructure References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <3d375d730902231740t2204860cnb71e961eb3db3dcd@mail.gmail.com> Message-ID: Tue, 24 Feb 2009 23:32:43 -0600, Peter Wang wrote: [clip: SVN post-commit hook for the git mirror] > > How does the above interact with, and what are the ramifications for: > > - user accounts and permissions The CGI idea may have been too complicated. The simplest way to go is probably just to run an update script directly from SVN post-commit hook, sudoed and backgrounded. If it needs to be on a different host, run it via SSH+public key, e.g. ssh -i PRIVATEKEY USERNAME@HOST /home/USERNAME/bin/sync-scipy-git $REV $REPOS \ < /dev/null > /dev/null 2>&1 & If you want me to maintain this, I think the hook can just SSH to my account on new.scipy.org. I'll then make sure that the script ~/bin/sync-scipy-git works as intended -- everything runs then on my account, which makes things simple. It's also easy to migrate this to run under another user, if necessary. Btw, git and git-svn are not yet installed on new.scipy.org... (Also, I got carried away and wrote the CGI solution as well: http://github.com/pv/git-svn-automirror/, so it's there if we want to use it...)
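[Editorial sketch, not part of the original message.] The backgrounded SSH hook described above could also be written as a small Python post-commit script. This is only a rough illustration under stated assumptions: PRIVATEKEY, USERNAME, HOST and the sync-scipy-git path are the placeholders from the message, while the helper names build_sync_command and launch_detached are hypothetical and were not proposed on the list.

```python
import subprocess
import sys


def build_sync_command(key_path, user, host, rev, repos):
    """Return the ssh argv that asks the mirror host to run the sync script.

    The remote script path mirrors the placeholder used in the message;
    it is an assumption, not a known installed location.
    """
    remote_script = f"/home/{user}/bin/sync-scipy-git"
    return ["ssh", "-i", key_path, f"{user}@{host}",
            remote_script, str(rev), repos]


def launch_detached(argv):
    """Start argv in the background with every standard stream detached.

    Detaching stdin/stdout/stderr lets the hook return immediately, so a
    stuck mirror update should not hold up the committing client.
    """
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # POSIX: run in a separate session
    )


if __name__ == "__main__" and len(sys.argv) >= 3:
    # Subversion invokes the hook as: post-commit REPOS REV
    repos, rev = sys.argv[1], sys.argv[2]
    launch_detached(
        build_sync_command("PRIVATEKEY", "USERNAME", "HOST", rev, repos)
    )
```

The hook does not wait for the child process, which matches the point made later in the thread that the mirror update must be backgrounded so it cannot lock up the SVN repository.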
> > - subdomains (git-to-svn for numpy and various scipy subdomains like > mpi4py, etc.) > Subdomains are probably not important for this mirroring business, especially if we take the SSH/sudo way. If we end up liking Git a lot and switching away from SVN, it might be nice to have git.scipy.org running gitweb, though. But this is probably for the future. > > - logging/monitoring (so we can detect if the CGI goes wrong/zombie/ > berserk) > The script can itself log what it does, and maybe one can also set some ulimits in it. The most important thing probably is that the post-commit hook backgrounds the Git mirror update, so that the update can't lock up the SVN repo even if it goes haywire. -- Pauli Virtanen From pav at iki.fi Wed Feb 25 21:39:13 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 26 Feb 2009 02:39:13 +0000 (UTC) Subject: [SciPy-dev] The future of SciPy and its development infrastructure References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> Message-ID: Wed, 25 Feb 2009 17:18:37 -0600, Travis E. Oliphant wrote: [clip] > 3) There are pieces of SciPy that need work (interpolate stands out most > in my mind right now). I have changes to the interpolate code that I > have not yet committed because I was waiting for the release of 0.7 but > I really want to commit. Who is interested in reviewing this? 
I'm > happy to work with additional eyes, but my current workflow is "commit > code I think is working along with some tests and docstrings", and then > let review/improve happen on the trunk. The codereview.appspot.com tool is very fast to use, e.g. via the http://codereview.appspot.com/static/upload.py tool. So I'd suggest just uploading the patches there even before commit; it can't do any harm. The problem with reviewing code after commit in trunk is that it takes more effort to correct or ask about dubious points. > I don't really like having lots of branches checked out of a code-base > in order to manage a different workflow. I'm open to being educated > about approaches that work better. I've found git-svn quite good for maintaining topic branches. It can switch easily between them using the same working tree, so that compiles are fast, and editor just needs M-x revert-buffer. -- Pauli Virtanen From pav at iki.fi Wed Feb 25 22:03:58 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 26 Feb 2009 03:03:58 +0000 (UTC) Subject: [SciPy-dev] RFR 503, 849: more robust implementation of real Bessel I_v Message-ID: (For trying out the code review tool...) Scipy bugs #503 and #849 are due to a non-robust implementation of Bessel I function in Cephes. The following changes address this, by using an implementation from the Boost library, converted to C: http://codereview.appspot.com/20078 A git branch is here: http://github.com/pv/scipy-work/tree/ticket-503-special-iv-fix I'm thinking that this code would be ready to be committed in. I'm not aware of bugs in it.
-- Pauli Virtanen From michael.abshoff at googlemail.com Wed Feb 25 23:15:30 2009 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Wed, 25 Feb 2009 20:15:30 -0800 Subject: [SciPy-dev] cleaning out wiki spam In-Reply-To: References: <9457e7c80902220602v7c96e5b7v815bd6f6c470f7b6@mail.gmail.com> <1C629713-69B1-4093-8050-F1226865C2AE@enthought.com> <20090222212703.GR6701@phare.normalesup.org> <49A1C644.5030205@gmail.com> <49A3A265.9060200@gmail.com> <09ECD184-2822-4A91-9F58-1BB4122FD590@enthought.com> Message-ID: <49A61762.3010608@gmail.com> Fernando Perez wrote: > On Wed, Feb 25, 2009 at 4:20 AM, Peter Wang wrote: > Good idea, I just did it (in fact it's only 97 long, I cleaned up a > few more after sending my email, so those are really 'pure ham' now, > since I checked every one of them). > > BTW, I'm sure you have your tools by now for the cleanup, but in case > this is useful, here's the little script I used. I found it easier to > check interactively in small batches by pattern rather than doing one > giant regexp run: > > /home/ipython/usr/bin/movepages > > It still takes time, since you have to look for false positives. > > In any case, many thanks for all your work, the moin wikis do feel > already a LOT more responsive. I don't know how many times in the > last few weeks I got timeout errors on the scipy cookbook, and now > it's fairly snappy. This was a real problem, and it's much better > now. Cool. Since at least moinmoin releases prior to 1.7.2 do not delete Spam attempts I regularly check with ls -latr in pages for those and whack them, too. > Cheers, > > f Cheers, Michael From nmb at wartburg.edu Thu Feb 26 00:42:02 2009 From: nmb at wartburg.edu (Neil Martinsen-Burrell) Date: Thu, 26 Feb 2009 05:42:02 +0000 (UTC) Subject: [SciPy-dev] Scipy workflow (and not tools).
References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> Message-ID: Rob Clewley gmail.com> writes: [...] > So, can't there be informal teams of curatorship so that not everyone > involved has to be really familiar with the tools discussed in the > other thread?! Unfortunately I cannot afford the time to ride the > waves of changing fashion in VCS, etc. > > Wouldn't this help to get more people involved? ... those many people > that Gael correctly assumes are out there but staying silent! I am the kind of person that you want developing code for Scipy. I prove the existence of a non-empty class of people who are out here but stay silent (no longer!). I am a persistent lurker on these lists. I'm a heavy user of Numpy and Scipy in my research. I use Numpy and Scipy in the classes I teach. I contribute to other Python-based OSS projects in my small spare time. When you folks talk about attracting people to work on Scipy, I should be the kind of person you are thinking about (and I am legion?). I'd like to share some of my thoughts on the issues of code review, tests, documentation and workflow in the hopes of offering a non-insider perspective. 1) Code review is very helpful for me as a new contributor. I am much more likely to contribute in a context in which I feel that whatever code I *can* produce is going to be reviewed and I can work on it to bring it up to Scipy standards. If I feel that I have to produce picture-perfect Python on my first try, I am much less likely to try in the first place. Code review is a perfect place for interested people (me!) to learn how to be active people. It is also a positive-feedback loop, as other interested people see the mentoring process that someone else has gone through with code review and feel themselves up to the task of trying to contribute. 
For this reason, I think it is a benefit for code reviews to take place in public fora such as mailing lists, not exclusively in special code-review applications/domains. 2) Unit testing is also important for me as a new contributor. If I would like to mess around with something that I don't understand in order to learn something, unit testing allows me to experiment effectively. Without unit tests, I cannot be an effective experimentalist in my hacking. In addition, other projects have trained me to unit test my contributions, so that is what I would most likely be doing if I were to contribute and I would like to feel that my effort to write tests is valued. 3) Documenting code seems like a very important standard to uphold for new contributors. As someone who *might* contribute, I don't yet have a fixed notion of what is good enough code. So, if I do decide to send something up for public consumption, then I am easy to convince that I need to do more documentation. 4) Workflow and tools are extremely important for me as a new contributor. One of the things that keeps me from developing even small patches for Scipy is SVN. If I want to make a change, I have to check out the trunk and then develop my change *completely without the benefit of version control*. I am not allowed to make any intermediate commits while I learn my way through the coding process. I must submit a fully formed patch without ever being able to checkpoint my own progress. This is basically a deal-breaker for me. I don't enjoy coding without a safety net, especially large changes, especially test-driven changes and especially heavily documented changes. I want to be able to polish my patch using the power of version control. Not having this makes me enjoy scipy development less which makes me less likely to contribute. As a fairly early convert to DVCS, I am used to being able to use my local branch of the project however I need to in my own development process. 
Being able to commit to a local branch as I see fit also helps produce
well-tested and well-documented code *and* enables effective multi-step code
review. Particularly with Bazaar's bundle concept where the history of a
local branch can be swapped via email (not just the patch), reviewers can
merge a bundle from an email and review directly in the branch as I developed
it. Their suggestions can then be incorporated into new revisions in my
local branch, which can then be submitted again for more polishing. (I
imagine git and Mercurial have similar lightweight capabilities for
exchanging branches; I just don't have experience with them.)

I hope that my thoughts help clarify this group's thinking about what sort of
things can help bring in new contributors. (Oh, and I've got some ideas for
scipy.stats ;)

-Neil

From ondrej at certik.cz Thu Feb 26 02:04:46 2009
From: ondrej at certik.cz (Ondrej Certik)
Date: Wed, 25 Feb 2009 23:04:46 -0800
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To: <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com>
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com>
        <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com>
Message-ID: <85b5c3130902252304n16ee6766x4e7382425efcf65d@mail.gmail.com>

Hi,

On Mon, Feb 23, 2009 at 8:04 AM, Stéfan van der Walt wrote:
> [If you only have 30 seconds to read this email, read the bold text only]
>
> Dear SciPy developers
>
> The past while has seen a rocky ride with the SciPy servers, but yesterday
> Peter Wang announced that he is attending to the situation. This, then,
> seems like the perfect time to stand back and take a look at our
> infrastructure, and whether we should continue with the current setup.
>
> To put this conversation into context, we have to face the facts: SciPy has
> a large user community relative to the number of developers.
> A big library of code, used by many scientists, is supported by a small
> handful of people all over the world. We cannot afford a high barrier to
> contribution, and we have to lower the effort it takes for a developer to
> merge contributed code.
>
> I'd like to propose two changes to the status quo:
>
> 1. Change to a distributed revision control system, encouraging more open
> collaboration.
> 2. Determine guidelines for code acceptance, in terms of unit tests,
> documentation and peer review.
>
> Allow me to motivate these changes, and then suggest practical approaches
> for their implementation:
>
> Subversion allows only a selected group of developers to change the SciPy
> source code. This does not encourage a culture of meritocracy, but worse,
> has practical implications, in that users cannot merge their own patches. I
> won't discuss the advantages of distributed revision control here, but note
> that it shifts responsibility from the current core developers to
> contributors; that benefits us all!
>
> This ties in with my second point: code review. The current developers have
> access to SVN because they are experienced programmers with knowledge of
> SciPy's scientific domains of application. We are unable to employ this
> scarce resource fully, because it simply takes too long to merge a patch
> from Trac, review it, *bring it up to scratch*, and commit it. We have to
> put a system in place which allows contributors to take responsibility for
> their own patches, and for core developers to guide and advise during this
> process. As it is, we have many patches waiting on Trac for up to a year or
> more without any feedback; that is not acceptable.
>
> My view on testing is simple: untested code is probably broken code (and I
> can show examples from the past year's commit logs to corroborate this
> statement). As for documentation, we cannot afford to be without it.
>
> Implementation:
>
> Enthought generously hosts SciPy, and I hope they will continue doing so.
> New software will need to be installed on the server, but we have many hands
> willing to tackle that task: David Cournapeau and myself included. Before
> deploying to scipy.org, we will configure a different server as a proof of
> concept.
>
> 1) Distributed revision control system: David Cournapeau and myself have
> been test driving Git [1] on SciPy and NumPy for a while. It is fast, well
> supported, has great branch support, and is simple to use for the average
> contributor, while allowing powerful patch-carving for the more adventurous.
>
> 2) Ticketing back-end: David is exploring RedMine [2], and I'd like to take
> a look at InDefero [3], but we'll do a careful analysis of trac-git (like
> FedoraHosted) too.
>
> Thank you for taking the time to deliberate on SciPy's future. I would love
> to hear your comments.

I read through the whole thread and I fully agree with Stefan and I support
him. Git is +1, I think it's the best tool these days.

I noticed several times that Stefan had to fix patches committed by other
people and that is very, very bad. It's wasting Stefan's time and I just
think that broken patches should never be allowed to get in.

I also think that peer review is absolutely necessary and if there is a
right process for it, I can promise that I will be reviewing too. In fact, I
suggested that in the past already. So I think, Travis, you don't have to be
afraid that the code will stall. Besides, it works for Sage and other
projects as well. We used that in sympy too -- and we have far fewer
developers in sympy than there are in scipy. So if we can do it, imho scipy
can too.

So, +1 to what Stefan said. I also think, by reading this thread, that most
of the people agree with Stefan.

Ondrej

From ondrej at certik.cz Thu Feb 26 02:15:07 2009
From: ondrej at certik.cz (Ondrej Certik)
Date: Wed, 25 Feb 2009 23:15:07 -0800
Subject: [SciPy-dev] Scipy workflow (and not tools).
In-Reply-To:
References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com>
Message-ID: <85b5c3130902252315i79436623jfec4fc8cff17678@mail.gmail.com>

On Wed, Feb 25, 2009 at 9:42 PM, Neil Martinsen-Burrell wrote:
> [...]

Yes, +1 to everything you said. Also I agree with what Stefan said and I
think most of the others sort of agree with this too. So I hope some change
will happen soon in this direction. :) E.g. dvcs and peer review.

I also agree with what Robert Kern said about the experimentalist in the
corner --- I think let's just start peer review and see what happens. Make
it easy for people like me to see what patches are waiting for review so
that I can go through them and do the review (=making myself responsible
for the patches if I say they are ok, or otherwise offer suggestions).

Ondrej

From ellisonbg.net at gmail.com Thu Feb 26 02:21:40 2009
From: ellisonbg.net at gmail.com (Brian Granger)
Date: Wed, 25 Feb 2009 23:21:40 -0800
Subject: [SciPy-dev] Scipy workflow (and not tools).
In-Reply-To:
References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com>
Message-ID: <6ce0ac130902252321j6139238by634364acd2bd07b2@mail.gmail.com>

Neil,

Thanks for speaking up! I think there are *many* people in your
situation, including myself - I too am mostly a silent watcher of
SciPy and I would be much more likely to contribute if the things you
list were a part of the Scipy development culture:

* Tests
* Code review
* Documentation
* Good tools and workflow.

I think it is an unproven myth that these things are "barriers" for
people who want to write code. In most cases that I have seen, these
things *encourage* new people to contribute to a project and greatly
improve the quality of the code being written by newbies and veterans
alike.

Cheers,

Brian

> I am the kind of person that you want developing code for Scipy. [...]
>
> -Neil
>
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev

From nwagner at iam.uni-stuttgart.de Thu Feb 26 02:30:18 2009
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Thu, 26 Feb 2009 08:30:18 +0100
Subject: [SciPy-dev] RFR 503, 849: more robust implementation of real Bessel I_v
In-Reply-To:
References:
Message-ID:

On Thu, 26 Feb 2009 03:03:58 +0000 (UTC) Pauli Virtanen wrote:
> (For trying out the code review tool...)
>
> Scipy bugs #503 and #849 are due to a non-robust implementation of Bessel
> I function in Cephes.
> The following changes address this, by using an implementation from the
> Boost library, converted to C:
>
> http://codereview.appspot.com/20078
>
> A git branch is here:
>
> http://github.com/pv/scipy-work/tree/ticket-503-special-iv-fix
>
> I'm thinking that this code would be ready to be committed in. I'm not
> aware of bugs in it.
>
> --
> Pauli Virtanen

How can I apply your patch?

Nils

From ondrej at certik.cz Thu Feb 26 02:39:41 2009
From: ondrej at certik.cz (Ondrej Certik)
Date: Wed, 25 Feb 2009 23:39:41 -0800
Subject: [SciPy-dev] Scipy workflow (and not tools).
In-Reply-To: <6ce0ac130902252321j6139238by634364acd2bd07b2@mail.gmail.com>
References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com>
        <6ce0ac130902252321j6139238by634364acd2bd07b2@mail.gmail.com>
Message-ID: <85b5c3130902252339i2ee47crce0d18b8ab49455b@mail.gmail.com>

On Wed, Feb 25, 2009 at 11:21 PM, Brian Granger wrote:
> Neil,
>
> Thanks for speaking up! I think there are *many* people in your
> situation, including myself - I too am mostly a silent watcher of
> SciPy and I would be much more likely to contribute if the things you
> list were a part of the Scipy development culture:
>
> * Tests
> * Code review
> * Documentation
> * Good tools and workflow.
>
> I think it is an unproven myth that these things are "barriers" for
> people who want to write code. In most cases that I have seen, these
> things *encourage* new people to contribute to a project and greatly
> improve the quality of the code being written by newbies and veterans
> alike.

Yep, I count myself in that group too.

Ondrej

From robince at gmail.com Thu Feb 26 03:38:57 2009
From: robince at gmail.com (Robin)
Date: Thu, 26 Feb 2009 08:38:57 +0000
Subject: [SciPy-dev] Scipy workflow (and not tools).
In-Reply-To:
References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com>
Message-ID:

On Thu, Feb 26, 2009 at 5:42 AM, Neil Martinsen-Burrell wrote:
> [...]

As another long-time lurker I would also support everything Neil said.

I also wanted to add the point that what stops me recommending scipy more
widely to my colleagues is not that there is not enough code in it - it is
that it is not stable enough to rely on for their work. That is perhaps a
bit harsh, but I am sure that the first time one of my colleagues lost 1/2
a day because of a scipy bug (as I have done quite a few times) they would
be back to MATLAB. So I would agree with Stefan and the others that the
priority is not getting more code in per se, but improving the quality and
frequency of releases to get a platform whose stability compares with
MATLAB before adding more stuff.

Cheers

Robin

From pav at iki.fi Thu Feb 26 05:06:27 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Thu, 26 Feb 2009 10:06:27 +0000 (UTC)
Subject: [SciPy-dev] RFR 503, 849: more robust implementation of real Bessel I_v
References:
Message-ID:

Thu, 26 Feb 2009 08:30:18 +0100, Nils Wagner wrote:
> Pauli Virtanen wrote:
>> http://codereview.appspot.com/20078
>> http://github.com/pv/scipy-work/tree/ticket-503-special-iv-fix
[clip]
> How can I apply your patch ?

There's the "Download raw patch set" on the codereview page, on top of
the table containing the individual patches. Apply it with

    patch -p1 < issue_20078_1.diff

on top of SVN checkout.

***

Alternatively, check out the git branch: first, get the SVN mirror

    git clone git://github.com/pv/scipy-svn.git scipy.git

This takes some time, as the branch contains the whole history of
Scipy, but you need to do this only once.

Then get my branch:

    cd scipy.git
    git remote add pauli git://github.com/pv/scipy-work.git
    git fetch pauli
    git remote show pauli

Switch the working tree to it:

    git checkout pauli/ticket-503-special-iv-fix

Examine what was done there:

    git log master..
    git diff master
    git show bb21c
    git show 7f738

If you want to hack on it yourself, work on a branch of your own:

    git checkout -b ticket-503-special-iv-fix

If you want to get the SVN tags and branches, tell git you want them:

    cd scipy.git
    git remote rm origin
    git remote add --mirror origin git://github.com/pv/scipy-svn.git
    git fetch

    git branch -r
    git log tags/0.7.0..trunk
    git diff tags/0.7.0 trunk

--
Pauli Virtanen

From nwagner at iam.uni-stuttgart.de Thu Feb 26 05:35:25 2009
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Thu, 26 Feb 2009 11:35:25 +0100
Subject: [SciPy-dev] RFR 503, 849: more robust implementation of real Bessel I_v
In-Reply-To:
References:
Message-ID:

On Thu, 26 Feb 2009 10:06:27 +0000 (UTC) Pauli Virtanen wrote:
> [clip]

Hi Pauli,

Thank you very much for your detailed instructions. They are really
helpful. I have applied the patch. Here is the output of scipy.test()

======================================================================
FAIL: test_yn_zeros (test_basic.TestBessel)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/data/home/nwagner/local/lib/python2.5/site-packages/scipy/special/tests/test_basic.py", line 1598, in test_yn_zeros
    488.98055964441374646], rtol=1e-19)
  File "/data/home/nwagner/local/lib/python2.5/site-packages/scipy/special/tests/test_basic.py", line 38, in assert_tol_equal
    verbose=verbose, header=header)
  File "/data/home/nwagner/local/lib/python2.5/site-packages/numpy/testing/utils.py", line 295, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=1e-19, atol=0

(mismatch 100.0%)
 x: array([ 450.136, 463.057, 472.807, 481.274, 488.981])
 y: array([ 450.136, 463.057, 472.807, 481.274, 488.981])

======================================================================
FAIL: test_ynp_zeros (test_basic.TestBessel)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/data/home/nwagner/local/lib/python2.5/site-packages/scipy/special/tests/test_basic.py", line 1604, in test_ynp_zeros
assert_tol_equal(yvp(443, ao), 0, atol=1e-15) File "/data/home/nwagner/local/lib/python2.5/site-packages/scipy/special/tests/test_basic.py", line 38, in assert_tol_equal verbose=verbose, header=header) File "/data/home/nwagner/local/lib/python2.5/site-packages/numpy/testing/utils.py", line 295, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=1e-07, atol=1e-15 (mismatch 100.0%) x: array([ 1.239e-10, -8.119e-16, 3.608e-16, 5.898e-16, 1.226e-15]) y: array(0) ====================================================================== FAIL: test_yv_cephes_vs_amos (test_basic.TestBessel) ---------------------------------------------------------------------- Traceback (most recent call last): File "/data/home/nwagner/local/lib/python2.5/site-packages/scipy/special/tests/test_basic.py", line 1664, in test_yv_cephes_vs_amos self.check_cephes_vs_amos(yv, yn, rtol=1e-11, atol=1e-305) File "/data/home/nwagner/local/lib/python2.5/site-packages/scipy/special/tests/test_basic.py", line 1653, in check_cephes_vs_amos assert c2.imag != 0, (v, z) AssertionError: (301, 1.0) ---------------------------------------------------------------------- Ran 3703 tests in 58.088s FAILED (KNOWNFAIL=2, SKIP=17, failures=3) Cheers, Nils From gelston at doosanbabcock.com Thu Feb 26 06:17:50 2009 From: gelston at doosanbabcock.com (Elston, Gareth R) Date: Thu, 26 Feb 2009 11:17:50 -0000 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: Message-ID: <9D4464CAAAB788439D66EE2432F9B5F1056A829E@00001EXCH.uk.mitsuibabcock.com> > -----Original Message----- > From: scipy-dev-bounces at scipy.org [mailto:scipy-dev-bounces at scipy.org] On Behalf Of scipy-dev-request at scipy.org > Sent: 26 February 2009 10:36 > To: scipy-dev at scipy.org > Subject: Scipy-dev Digest, Vol 64, Issue 56 > > Message: 1 > Date: Wed, 25 Feb 2009 23:21:40 -0800 > From: Brian Granger > Subject: Re: [SciPy-dev] Scipy workflow (and not tools). 
> To: SciPy Developers List > Message-ID: > <6ce0ac130902252321j6139238by634364acd2bd07b2 at mail.gmail.com> > Content-Type: text/plain; charset=ISO-8859-1 > > Neil, > > Thanks for speaking up! I think there are *many* people in your > situation, including myself - I too am mostly a silent watcher of > SciPy and I would be much more likely to contribute if the things you > list were a part of the Scipy development culture: > > * Tests > * Code review > * Documentation > * Good tools and workflow. > > I think it is an unproven myth that these things are "barriers" for > people who want to write code. In most cases that I have seen, these > things *encourage* new people to contribute to a project and greatly > improve the quality of the code being written by newbies and veterans > alike. > > Cheers, > > Brian I'm another long-time lurker who's been thinking about contributing for a while but who's been a bit uncertain about where to start. These discussions, plus the git branches Pauli has created, are encouraging me greatly to roll up my sleeves and delve in. Cheers, Gareth.
From stefan at sun.ac.za Thu Feb 26 07:17:07 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Thu, 26 Feb 2009 14:17:07 +0200 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <49A5D1CD.6070700@enthought.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> Message-ID: <9457e7c80902260417q26c20c3es96d26dc0b187691f@mail.gmail.com> Hi Travis 2009/2/26 Travis E. Oliphant : > 1) We absolutely need to improve the quality of SciPy, and that does > mean more tests, documentation, and reviews --- and most importantly > faster releases. Right now, a release happens when someone steps up > to be a release manager and commits to making it happen. I don't know > how to promise that on a regular cycle with only volunteer effort. I > would love to have the resources to fund SciPy release management. If the release process wasn't so painful, maybe more people would volunteer? > 2) I think we are doing a decent job of commits having tests and > documentation. We should continue to remind each other of the need > for quality code in SciPy (and continue to clean up code that is there). I don't want to complain all the time (I really hate complaining), which is why I want a policy in place. Policy sounds formal, so let me rather say: I'd like us to come to a consensus on the type of changes that are appropriate.
If we did, then the term "decent", as you use it above, becomes more clearly defined. > 3) There are pieces of SciPy that need work (interpolate stands out most > in my mind right now). I have changes to the interpolate code that I > have not yet committed because I was waiting for the release of 0.7 but > I really want to commit. Who is interested in reviewing this? I'd be glad to. Pauli's suggestion of codereview.appspot.com sounds good, since we don't have any better infrastructure in place. > 4) Bug-fix commits are a different thing than feature-enhancement > commits. We should have different expectations of them. I agree, to an extent. I think it is an ideal opportunity to add a test (since, clearly, the current test suite didn't catch the problem, and since you had to study the broken code in order to fix it); but in such a case it's more important to have the bug fixed. Unfortunately, without a test you won't be absolutely certain that it's fixed everywhere, but the process at least converges in the right direction. > 5) We do have scikits for more experimental additions to live so that > SciPy should become more of a stable, documentation-rich library. But, > the problem there is distribution. EPD and Enstaller (our BSD-licensed > version of setuptools) is one answer to that distribution problem. > There are others. You guys are doing a fantastic job, keep it up. Also thanks to Pierre Raybaut, whose Python(x,y) distribution is making life so easy for our students. I don't know if you've visited the portal to SciKits: http://scikits.appspot.com. If we can make any changes to facilitate packaging, let me know. > 6) I very much appreciate all the work people do on SciPy. I think > our biggest lack more than anything else is the "full-time" person that > can respond to the user community and keep the momentum moving. Absolutely. I've often wondered how hard it would be to obtain such funding, but to date I haven't made any proposals.
Regards Stéfan From argriffi at ncsu.edu Thu Feb 26 08:55:46 2009 From: argriffi at ncsu.edu (Alex Griffing) Date: Thu, 26 Feb 2009 08:55:46 -0500 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <9D4464CAAAB788439D66EE2432F9B5F1056A829E@00001EXCH.uk.mitsuibabcock.com> References: <9D4464CAAAB788439D66EE2432F9B5F1056A829E@00001EXCH.uk.mitsuibabcock.com> Message-ID: <49A69F62.6020603@ncsu.edu> > I'm another long-time lurker who's been thinking about contributing for > a while but who's been a bit uncertain about where to start. These > discussions, plus the git branches Pauli has created, are encouraging me > greatly to roll up my sleeves and delve in. > > Cheers, > Gareth. Hi I'm also mostly a lurker. Here's a story about the time I tried to contribute to scipy. Maybe a year ago I wanted a procedure to attempt to optimize some aspect of a multidimensional function. I found scipy.optimize, but the procedure that I tried failed by dividing by zero. Undaunted, I started digging in the scipy source code, finding the error in the broyden2 function. The problem was that the algorithm was finding the correct solution in fewer than the default number of iterations and was dividing by an error term that was zero. I sent an email to a scipy mailing list saying that this function needed some kind of check to see if it was done (error near zero) so that it could stop iterating so that it would not divide by zero. I got a reply saying that I needed to write a patch that included the fix to the function and a new test that used to fail but that now passes. So I simplified my failing code, I changed the broyden2 function in a way that I thought would fix the problem, and I sent the code to the mailing list. I got a reply saying that what they meant by a patch was an svn diff. So I installed svn and I checked out the scipy code with the idea that I could change my local copy, test my changes, and send the diff to the mailing list.
After spending some time trying to get this working, I stopped for the following reasons: 1) Binary format and path problems were causing me grief. 2) I already had a scipy that worked, and I didn't want to break this. 3) Neil's comment: """ If I want to make a change, I have to check out the trunk and then develop my change *completely without the benefit of version control*. I am not allowed to make any intermediate commits while I learn my way through the coding process. I must submit a fully formed patch without ever being able to checkpoint my own progress. This is basically a deal-breaker for me. """ I think I still use code that works around this bug by using the iteration keyword argument to specify a fewer-than-default number of iterations so that the error never reaches zero. Alex From josef.pktd at gmail.com Thu Feb 26 10:18:49 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 26 Feb 2009 10:18:49 -0500 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> Message-ID: <1cd32cbb0902260718g24d8bbffu21db14e780b6fc8f@mail.gmail.com> On Thu, Feb 26, 2009 at 3:38 AM, Robin wrote: > > As another long time lurker I would also support everything Neil said. > > I also wanted to add the point, that what stops me recommending scipy > more widely to my colleagues is not that there is not enough code in > it - it is that it is not stable enough to rely on for their work. > That is perhaps a bit harsh, but I am sure that the first time one of > my colleagues lost 1/2 a day because of a scipy bug (as I have done > quite a few times) they would be back to MATLAB. > > So I would agree with Stefan and the others that the priority is not > getting more code in per se, but improving the quality and frequency > of releases to get a platform whose stability compares with MATLAB > before adding more stuff. 
> I think we are not seeing enough trac tickets about missing tests, with tests included as patches. For a user who is familiar with a part of scipy, it would be relatively easy to provide a test. This would reduce the chance that parts get broken by accident, and would signal in any refactoring that the interface should be changed only with a proper deprecation warning. So users could contribute to scipy and make it more stable for their own work. Similarly, when I was working my way through parts of scipy, I found that examples (or tests that can be used as examples) are missing. This often makes it difficult to figure out what the exact format of the call parameters and the limitations of the functions are. Example: signal.ltisys: no tests, no examples, good general description. For someone not familiar with the matlab signal toolbox, it is not clear what the exact requirements for the matrices of the state space representation are. But for users of this, it might be much easier to come up with examples and tests than for me, who has to work through the exceptions that are raised and the source code. I also agree with Neil. This is exactly the situation I was in half a year ago. Before getting commit access, I had several local copies of files and finally a bzr branch to keep track of my changes. A more systematic workflow for this would be a big improvement. But for rewriting relatively confined parts (which is most of stats but may not apply to other parts), I still prefer to work with stand-alone scripts (under my own local version control), and integrate them into scipy when they are ready. The review of my changes by Per Brodtkorb was very helpful. However, my main quality control was to increase the test coverage for stats.distributions from around 50% to above 90%, with statistical tests that made sure that the numbers are at least approximately correct (up to statistical noise and numerical precision).
Since I was also relatively new to numpy, I might not have coded everything in the most efficient way, but at least I felt relatively sure that each change I made passed the basic (statistical) tests. And I'm still reluctant to apply any bug fixes without full verification and testing. This slows down the bug fixing and enhancements but lowers the chance that we introduce new bugs. High test coverage would also make it easier to apply new patches or enhancements since we don't have to wait for the next round of bug reports to verify that everything still works. I think once scipy has a reasonable test coverage, the development and release process would go quite a bit faster. Using nose testing is a huge improvement in the testing workflow. And I hope that we will see lots of trac tickets with patches for missing tests. Josef From gael.varoquaux at normalesup.org Thu Feb 26 10:26:21 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 26 Feb 2009 16:26:21 +0100 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <1cd32cbb0902260718g24d8bbffu21db14e780b6fc8f@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <1cd32cbb0902260718g24d8bbffu21db14e780b6fc8f@mail.gmail.com> Message-ID: <20090226152621.GE26861@phare.normalesup.org> On Thu, Feb 26, 2009 at 10:18:49AM -0500, josef.pktd at gmail.com wrote: > On Thu, Feb 26, 2009 at 3:38 AM, Robin wrote: > I think we are not seeing enough trac tickets about missing tests > with tests included as patches. I think one of the issues is that a lot of people don't really know what tests are, how to write them, and how to run them. In addition, they don't really master version control, and as someone pointed out in the discussion they might not know how to make a patch. A few years ago, this was the case for me. A document on 'how to contribute to scipy' that explains the workflow would probably help a lot.
The sympy guys have done this, and I know people on the sympy mailing list are often pointed to this document. I think the core part of such a document could actually be common to many projects. I will add to my TODO list to write one for Mayavi (in sphinx), because that is the project I know best, and we can try and port it to Scipy once it is done. I won't get to doing this anytime soon (I am super busy currently), so if someone wants to beat me in doing this for scipy, just go ahead! Gaël From jason-sage at creativetrax.com Thu Feb 26 10:40:24 2009 From: jason-sage at creativetrax.com (jason-sage at creativetrax.com) Date: Thu, 26 Feb 2009 09:40:24 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> Message-ID: <49A6B7E8.5010105@creativetrax.com> Perry Greenfield wrote: > > 2) While I understand the desire to increase the quality of commits to > scipy by putting in a more formal process, like making sure code is > reviewed, tests are present, and documentation is provided, I too, > like Travis, worry that this may inhibit many useful contributions. > Rather than act as a barrier, why not just have some sort of "seal of > approval" for things that have gone through that process. Lots of projects have -stable and -dev branches. The -stable branch for scipy could involve the "seal of approval" with review, doctests, etc.
The -dev branch could be the unreviewed code. This lets Travis commit to something and get his patches out there, but also clearly defines a line in the sand between reviewed and unreviewed code. I realize that scipy already has something of -dev and -stable branches, based on releases. Maybe this idea boils down to: only reviewed code is allowed in an official release, but there is a -dev branch with all code available as well. As code is reviewed, it is moved into the -stable branch and released in the next release. In reality, using a DVCS, each developer's copy of the repository then becomes a private -dev branch that can be pulled from. Developers get to commit and publish unreviewed changes, and someone (the release manager) can pull in to -stable the changes that are reviewed. The release manager could also pull all changes from developer repositories into an official -dev branch if you wanted to have a central clearing house for what everyone is working on. Jason -- Jason Grout From oliphant at enthought.com Thu Feb 26 10:52:03 2009 From: oliphant at enthought.com (Travis E. Oliphant) Date: Thu, 26 Feb 2009 09:52:03 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> Message-ID: <49A6BAA3.2000706@enthought.com> Charles R Harris wrote: > > Interpolate stands out in my mind also, along with signal processing. 
> Mostly because last time I looked -- a long time ago -- they were > pretty messy and I haven't seen much work done on them since. I think > in this case it would be helpful if you summarized your intended > changes and interfaces and gave a short explanation of your > motivation. By the time the code actually shows up in SVN might be a > little late. An intern did some work here last summer (and I did some work two summers ago). Last summer's work never got integrated. The goal is to unify and expand the interface to interp1d and interp2d and interpnd --- to allow for future improvement while improving the API as well as a few of the algorithms. Basically, interpolate is one thing we teach quite a bit and it's a little embarrassing every time we teach it. > > I think a similar approach would have helped with curve_fit, not least > since I have found Gauss-Newton with numerical derivatives to > out-perform Levenberg-Marquardt by ~50x in some problems and give > better answers. Then there is the question of when to stop the > iterations. Also, I couldn't see if the function and data could be > array valued, which can be handy in some cases. For instance, I > recently fit a case where the parameters moved a large set of points > around on a unit sphere in order to minimize the distance to data > points on the sphere. In this case the output of f was most > conveniently represented as an array of vectors. Mind, I don't mind > the function itself so much as giving it a name that implies more > generality than I think it has. I'm very happy to change the name and/or the algorithm to improve the generality / speed. Anything you have would be welcome. Perhaps other people have a different perspective, but my perspective of the trunk is that it is not a release and so changes to new functionality in the trunk can be made with no concern of "backward-compatibility" until a release is made. Given the review tools I've seen.
My perspective on the best way to review is to look at changes to the trunk and just make the changes that you see as needed, or if it's not clear to you how to make the changes, you can put comments in the code about what you would like to see done and then perhaps somebody else can figure out how to make that change. In that process, we should all play nicely with each other and respect each other's opinions. If an occasional point can't be resolved between the interested parties because of basically differences of opinion, then we can: 1) vote on the list and if that doesn't clearly resolve the question, 2) the steering committee votes and makes a decision. That's my perspective on how changes are made. -Travis From oliphant at enthought.com Thu Feb 26 10:58:03 2009 From: oliphant at enthought.com (Travis E. Oliphant) Date: Thu, 26 Feb 2009 09:58:03 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> Message-ID: <49A6BC0B.4040301@enthought.com> Pauli Virtanen wrote: > Wed, 25 Feb 2009 17:18:37 -0600, Travis E. Oliphant wrote: > [clip] > >> 3) There are pieces of SciPy that need work (interpolate stands out most >> in my mind right now). I have changes to the interpolate code that I >> have not yet committed because I was waiting for the release of 0.7 but >> I really want to commit. Who is interested in reviewing this? 
I'm >> happy to work with additional eyes, but my current workflow is "commit >> code I think is working along with some tests and docstrings", and then >> let review/improve happen on the trunk. >> > > The codereview.appspot.com tool is very fast to use, eg. via the > > http://codereview.appspot.com/static/upload.py > > tool. So I'd suggest to just uploading the patches there even before > commit; it can't do any harm. > The harm is the effort to do it. Interacting with a web-page is slower than svn commit. This extra step in the process does make a difference when you are time-crunched. > The problem with reviewing code after commit in trunk is that it takes > more effort to correct or ask about dubious points. > I disagree with this statement. Why does it take more effort than reviewing code on the trunk? You can do an svn diff to get the code changes, and do the review exactly as you could with any other tool. One way to see this is the difference between asking for permission or asking for forgiveness. Both have their place in social activities, but we shouldn't institutionalize one over the other. >> I don't really like having lots of branches checked out of a code-base >> in order to manage a different workflow. I'm open to being educated >> about approaches that work better. >> > > I've found git-svn quite good for maintaining topic branches. It can > switch easily between them using the same working tree, so that compiles > are fast, and editor just needs M-x revert-buffer. > Thanks for the tip. At some point I may be able to invest some time in learning about git-svn. How do you switch between branches using git-svn. With svn it's svn switch http://some-name-I-always-have-to-look-up-and-takes-time. -Travis From guyer at nist.gov Thu Feb 26 11:00:32 2009 From: guyer at nist.gov (Jonathan Guyer) Date: Thu, 26 Feb 2009 11:00:32 -0500 Subject: [SciPy-dev] Scipy workflow (and not tools). 
In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> Message-ID: <2CC58DDD-5CBB-48E7-825C-8912129668A7@nist.gov> On Feb 26, 2009, at 12:42 AM, Neil Martinsen-Burrell wrote: > other projects have trained me to unit test my contributions, so > that is > what I would most likely be doing if I were to contribute and I > would like to > feel that my effort to write tests is valued. I've never seen a single post in this thread, or on this list for that matter, that indicated that anybody thought that the effort to write tests was not valued. Quite the contrary. Everybody wants tests; it's just a question of whether there's something else they want even more. If I may very unfairly summarize the debate this far: Stéfan, et al.: There aren't enough tests. All code must have tests! Under penalty of death! Travis, et al.: But then we shall have neither code *nor* tests! Stéfan, et al.: Good! I would characterize the debate as the value of writing tests *relative* to the value of writing anything else, not whether you should be writing and using tests at all. Absolutely you should. > One of the things that keeps me from developing even small patches > for Scipy > is SVN. If I want to make a change, I have to check out the trunk > and then > develop my change *completely without the benefit of version control*. Nothing forces you to develop without version control. See the SVN Book's discussion on "vendor branches" for one approach. We hack on several other people's tools in our own repository and periodically synch to their efforts or send them patches from ours. I am *not* saying that DVCS is a bad idea. I don't think it is. In this instance, it's clearly superior to the svn model, but please let's stop making svn out to be worse than it is, because I guarantee that you're going to be bitten by some variant of the same issues with bzr, git, or whatever.
Look at David Cournapeau's blog on why he's ditching bzr in favor of git, after ditching svn in favor of bzr; I look forward to his essay on the atrocities of git. I don't mean that to be snide; it's *good* that tools are getting better and it's *good* that people like David are actually comparing them head to head and telling us about their experiences. I get that good and pleasant tools make people more productive, e.g., while I can get work done with Windows, I can get a lot more work done with something else because I don't have to devote so much energy to profanity. If some DVCS will make life more pleasant for both release managers and new contributors like yourself, by all means go for it, but don't kid yourself about the strengths of the tool you're leaving behind or the weaknesses of the tool you're adopting. From gael.varoquaux at normalesup.org Thu Feb 26 11:04:59 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 26 Feb 2009 17:04:59 +0100 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <2CC58DDD-5CBB-48E7-825C-8912129668A7@nist.gov> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <2CC58DDD-5CBB-48E7-825C-8912129668A7@nist.gov> Message-ID: <20090226160459.GA1525@phare.normalesup.org> On Thu, Feb 26, 2009 at 11:00:32AM -0500, Jonathan Guyer wrote: > e.g., while I can get work done with Windows, I can get a lot more > work done with something else because I don't have to devote so much > energy to profanity. Fantastic!! I love this sentence. 
Gaël From stefan at sun.ac.za Thu Feb 26 11:15:01 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Thu, 26 Feb 2009 18:15:01 +0200 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <49A6BC0B.4040301@enthought.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> Message-ID: <9457e7c80902260815o4ea6c341y52c99547a4f771bc@mail.gmail.com> 2009/2/26 Travis E. Oliphant : >> tool. So I'd suggest to just uploading the patches there even before >> commit; it can't do any harm. >> > The harm is the effort to do it. Interacting with a web-page is > slower than svn commit. This extra step in the process does make a > difference when you are time-crunched. Code review tools (such as rietveld) have command line interfaces. You make your change, and call "upload.py", which automatically does an SVN diff and uploads your patch for you. Cheers Stéfan From david at ar.media.kyoto-u.ac.jp Thu Feb 26 10:59:46 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 27 Feb 2009 00:59:46 +0900 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <2CC58DDD-5CBB-48E7-825C-8912129668A7@nist.gov> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <2CC58DDD-5CBB-48E7-825C-8912129668A7@nist.gov> Message-ID: <49A6BC72.4070202@ar.media.kyoto-u.ac.jp> Jonathan Guyer wrote: > > Nothing forces you to develop without version control. See the SVN > Book's discussion on "vendor branches" for one approach.
We hack on > several other people's tools in our own repository and periodically > synch to their efforts or send them patches from ours. I am *not* > saying that DVCS is a bad idea. I don't think it is. In this instance, > it's clearly superior to the svn model, but please let's stop making > svn out to be worse than it is, because I guarantee that you're going > to be bitten by some variant of the same issues with bzr, git, or > whatever. Look at David Cournapeau's blog on why he's ditching bzr in > favor of git, after ditching svn in favor of bzr; You got it wrong: I've never ditched svn, since I have never used it for my own projects. David From guyer at nist.gov Thu Feb 26 11:20:43 2009 From: guyer at nist.gov (Jonathan Guyer) Date: Thu, 26 Feb 2009 11:20:43 -0500 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <49A6BC72.4070202@ar.media.kyoto-u.ac.jp> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <2CC58DDD-5CBB-48E7-825C-8912129668A7@nist.gov> <49A6BC72.4070202@ar.media.kyoto-u.ac.jp> Message-ID: On Feb 26, 2009, at 10:59 AM, David Cournapeau wrote: > Jonathan Guyer wrote: >> >> Look at David Cournapeau's blog on why he's ditching bzr in >> favor of git, after ditching svn in favor of bzr; > > You got it wrong: I've never ditched svn, since I have never used it > for > my own projects. Fair enough. Let me amend: "Look at David Cournapeau's blog on why he's ditching bzr in favor of git, after making an ad hominem attack on svn". Better? From stefan at sun.ac.za Thu Feb 26 11:23:51 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 26 Feb 2009 18:23:51 +0200 Subject: [SciPy-dev] Scipy workflow (and not tools). 
In-Reply-To: <2CC58DDD-5CBB-48E7-825C-8912129668A7@nist.gov> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <2CC58DDD-5CBB-48E7-825C-8912129668A7@nist.gov> Message-ID: <9457e7c80902260823g7f81a4bds2857bb51006a5192@mail.gmail.com> 2009/2/26 Jonathan Guyer : > If I may very unfairly summarize the debate this far: > > Stéfan, et al.: There aren't enough tests. All code must have tests! > Under penalty of death! > Travis, et al.: But then we shall have neither code *nor* tests! > Stéfan, et al.: Good! I agree, your summary isn't accurate: the world isn't that black and white. To summarise a profound story I heard the other day: "A wise man first dams up the river, before trying to empty the lake." Unless we test new code, how do we make time to clean up the old code? I certainly would not like to see SciPy stagnate because of hard-line policy. My fear is that, unless we do something in order to protect our scarce developer resources, we will see the same thing happening. Regards Stéfan From ondrej at certik.cz Thu Feb 26 11:31:02 2009 From: ondrej at certik.cz (Ondrej Certik) Date: Thu, 26 Feb 2009 08:31:02 -0800 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <49A69F62.6020603@ncsu.edu> References: <9D4464CAAAB788439D66EE2432F9B5F1056A829E@00001EXCH.uk.mitsuibabcock.com> <49A69F62.6020603@ncsu.edu> Message-ID: <85b5c3130902260831h60a4e5a6k7fc4ac2365be24b1@mail.gmail.com> On Thu, Feb 26, 2009 at 5:55 AM, Alex Griffing wrote: > >> I'm another long-time lurker who's been thinking about contributing for >> a while but who's been a bit uncertain about where to start. These >> discussions, plus the git branches Pauli has created, are encouraging me >> greatly to roll up my sleeves and delve in. >> >> Cheers, >> Gareth. > > Hi I'm also mostly a lurker. Here's a story about the time I tried to > contribute to scipy. > > Maybe a year ago I wanted a procedure to attempt to optimize some aspect > of a multidimensional function.
I found scipy.optimize, but the > procedure that I tried failed by dividing by zero. Undaunted, I started > digging in the scipy source code, finding the error in the broyden2 > function. The problem was that the algorithm was finding the correct > solution in fewer than the default number of iterations and was dividing > by an error term that was zero. > > I sent an email to a scipy mailing list saying that this function needed > some kind of check to see if it was done (error near zero) so that it > could stop iterating so that it would not divide by zero. I got a reply > saying that I needed to write a patch that included the fix to the > function and a new test that used to fail but that now passes. So I > simplified my failing code, I changed the broyden2 function in a way > that I thought would fix the problem, and I sent the code to the mailing > list. I got a reply saying that what they meant by a patch was an svn diff. > > So I installed svn and I checked out the scipy code with the idea that I > could change my local copy, test my changes, and send the diff to the > mailing list. After spending some time trying to get this working, I > stopped for the following reasons: > > 1) Binary format and path problems were causing me grief. > 2) I already had a scipy that worked, and I didn't want to break this. > 3) Neil's comment: > """ > If I want to make a change, I have to check out the trunk and then > develop my change *completely without the benefit of version control*. > I am not allowed to make any intermediate commits while I learn my way > through the coding process. I must submit a fully formed patch without > ever being able to checkpoint my own progress. This is basically a > deal-breaker for me. > """ > Hi Alex, it was me who was trying to help you with this. The problem was that people who have svn access to scipy (e.g. I don't) are very busy, so you must submit your patches in a way that is easy for them to apply.
I agree that git is much easier to use for that, but imho it's not difficult to create a patch with svn as well, especially if you have a working scipy on your computer. Just check out the svn, copy your working scipy files over it, do "svn di" and send us the patch. > I think I still use code that works around this bug by using the > iteration keyword argument to specify a fewer-than-default number of > iterations so that the error never reaches zero. If you find time, please do send us the patch with the fix+tests; it will be applied. Thanks a lot, Ondrej From charlesr.harris at gmail.com Thu Feb 26 11:32:36 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 26 Feb 2009 09:32:36 -0700 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <9457e7c80902260823g7f81a4bds2857bb51006a5192@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <2CC58DDD-5CBB-48E7-825C-8912129668A7@nist.gov> <9457e7c80902260823g7f81a4bds2857bb51006a5192@mail.gmail.com> Message-ID: On Thu, Feb 26, 2009 at 9:23 AM, Stéfan van der Walt wrote: > 2009/2/26 Jonathan Guyer : > > If I may very unfairly summarize the debate this far: > > > > Stéfan, et al.: There aren't enough tests. All code must have tests! > > Under penalty of death! > > Travis, et al.: But then we shall have neither code *nor* tests! > > Stéfan, et al.: Good! > > I agree, your summary isn't accurate: the world isn't that black and > white. To summarise a profound story I heard the other day: > > "A wise man first dams up the river, before trying to empty the lake." > > Unless we test new code, how do we make time to clean up the old code? > > I certainly would not like to see SciPy stagnate because of hard-line > policy. My fear is that, unless we do something in order to protect > our scarce developer resources, we will see the same thing happening. > People, people, let's not kindle the flames. Or at least wait till I get the hotdogs and marshmallows.
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Feb 26 11:35:20 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 26 Feb 2009 09:35:20 -0700 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <85b5c3130902260831h60a4e5a6k7fc4ac2365be24b1@mail.gmail.com> References: <9D4464CAAAB788439D66EE2432F9B5F1056A829E@00001EXCH.uk.mitsuibabcock.com> <49A69F62.6020603@ncsu.edu> <85b5c3130902260831h60a4e5a6k7fc4ac2365be24b1@mail.gmail.com> Message-ID: On Thu, Feb 26, 2009 at 9:31 AM, Ondrej Certik wrote: > On Thu, Feb 26, 2009 at 5:55 AM, Alex Griffing wrote: > > > >> I'm another long-time lurker who's been thinking about contributing for > >> a while but who's been a bit uncertain about where to start. These > >> discussions, plus the git branches Pauli has created, are encouraging me > >> greatly to roll up my sleeves and delve in. > >> > >> Cheers, > >> Gareth. > > > > Hi I'm also mostly a lurker. Here's a story about the time I tried to > > contribute to scipy. > > > > Maybe a year ago I wanted a procedure to attempt to optimize some aspect > > of a multidimensional function. I found scipy.optimize, but the > > procedure that I tried failed by dividing by zero. Undaunted, I started > > digging in the scipy source code, finding the error in the broyden2 > > function. The problem was that the algorithm was finding the correct > > solution in fewer than the default number of iterations and was dividing > > by an error term that was zero. > > > > I sent an email to a scipy mailing list saying that this function needed > > some kind of check to see if it was done (error near zero) so that it > > could stop iterating so that it would not divide by zero. I got a reply > > saying that I needed to write a patch that included the fix to the > > function and a new test that used to fail but that now passes. 
So I > > simplified my failing code, I changed the broyden2 function in a way > > that I thought would fix the problem, and I sent the code to the mailing > > list. I got a reply saying that what they meant by a patch was an svn > diff. > > > > So I installed svn and I checked out the scipy code with the idea that I > > could change my local copy, test my changes, and send the diff to the > > mailing list. After spending some time trying to get this working, I > > stopped for the following reasons: > > > > 1) Binary format and path problems were causing me grief. > > 2) I already had a scipy that worked, and I didn't want to break this. > > 3) Neil's comment: > > """ > > If I want to make a change, I have to check out the trunk and then > > develop my change *completely without the benefit of version control*. > > I am not allowed to make any intermediate commits while I learn my way > > through the coding process. I must submit a fully formed patch without > > ever being able to checkpoint my own progress. This is basically a > > deal-breaker for me. > > """ > > > > Hi Alex, > > it was my who was trying to help you with this. The problem was, that > people who have svn access to scipy (e.g. I don't) are very busy, and > do you must submit your patches in a way so that it is easy for them > to apply it. I agree that git is much easier to use for that, but imho > it's not difficult to create a patch with svn as well, especially if > you have a working scipy on your computer. > Do you want SVN access? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej at certik.cz Thu Feb 26 11:45:38 2009 From: ondrej at certik.cz (Ondrej Certik) Date: Thu, 26 Feb 2009 08:45:38 -0800 Subject: [SciPy-dev] Scipy workflow (and not tools). 
In-Reply-To: References: <9D4464CAAAB788439D66EE2432F9B5F1056A829E@00001EXCH.uk.mitsuibabcock.com> <49A69F62.6020603@ncsu.edu> <85b5c3130902260831h60a4e5a6k7fc4ac2365be24b1@mail.gmail.com> Message-ID: <85b5c3130902260845v30d341c6h4afdbc79715e979f@mail.gmail.com> On Thu, Feb 26, 2009 at 8:35 AM, Charles R Harris wrote: > > > On Thu, Feb 26, 2009 at 9:31 AM, Ondrej Certik wrote: >> >> On Thu, Feb 26, 2009 at 5:55 AM, Alex Griffing wrote: >> > >> >> I'm another long-time lurker who's been thinking about contributing for >> >> a while but who's been a bit uncertain about where to start. These >> >> discussions, plus the git branches Pauli has created, are encouraging >> >> me >> >> greatly to roll up my sleeves and delve in. >> >> >> >> Cheers, >> >> Gareth. >> > >> > Hi I'm also mostly a lurker. Here's a story about the time I tried to >> > contribute to scipy. >> > >> > Maybe a year ago I wanted a procedure to attempt to optimize some aspect >> > of a multidimensional function. I found scipy.optimize, but the >> > procedure that I tried failed by dividing by zero. Undaunted, I started >> > digging in the scipy source code, finding the error in the broyden2 >> > function. The problem was that the algorithm was finding the correct >> > solution in fewer than the default number of iterations and was dividing >> > by an error term that was zero. >> > >> > I sent an email to a scipy mailing list saying that this function needed >> > some kind of check to see if it was done (error near zero) so that it >> > could stop iterating so that it would not divide by zero. I got a reply >> > saying that I needed to write a patch that included the fix to the >> > function and a new test that used to fail but that now passes. So I >> > simplified my failing code, I changed the broyden2 function in a way >> > that I thought would fix the problem, and I sent the code to the mailing >> > list. I got a reply saying that what they meant by a patch was an svn >> > diff.
>> > >> > So I installed svn and I checked out the scipy code with the idea that I >> > could change my local copy, test my changes, and send the diff to the >> > mailing list. After spending some time trying to get this working, I >> > stopped for the following reasons: >> > >> > 1) Binary format and path problems were causing me grief. >> > 2) I already had a scipy that worked, and I didn't want to break this. >> > 3) Neil's comment: >> > """ >> > If I want to make a change, I have to check out the trunk and then >> > develop my change *completely without the benefit of version control*. >> > I am not allowed to make any intermediate commits while I learn my way >> > through the coding process. I must submit a fully formed patch without >> > ever being able to checkpoint my own progress. This is basically a >> > deal-breaker for me. >> > """ >> > >> >> Hi Alex, >> >> it was me who was trying to help you with this. The problem was that >> people who have svn access to scipy (e.g. I don't) are very busy, and >> so you must submit your patches in a way so that it is easy for them >> to apply it. I agree that git is much easier to use for that, but imho >> it's not difficult to create a patch with svn as well, especially if >> you have a working scipy on your computer. > > Do you want SVN access? No: I had it in the past, but I forgot how to log in, or maybe it stopped working, I don't know. In any case I much prefer to send patches that are properly reviewed and then applied by someone who is working with the scipy tree on a daily basis (as opposed to me who only occasionally sends a patch). And Stefan did a great job with this. So this is a hint to Alex ---- just send a nice patch that works and Stefan will apply it. The only thing that doesn't work with this scenario is that my name will not appear on the nice plot that Travis was showing at euroscipy (and I think at scipy2008 too). But once we move to git, all the names of the contributors will be preserved.
Then I don't see any reason why people like me should have push access --- I just publish my branch at github, Stefan pulls it and pushes it in. Ondrej From cournape at gmail.com Thu Feb 26 11:49:37 2009 From: cournape at gmail.com (David Cournapeau) Date: Fri, 27 Feb 2009 01:49:37 +0900 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <2CC58DDD-5CBB-48E7-825C-8912129668A7@nist.gov> <49A6BC72.4070202@ar.media.kyoto-u.ac.jp> Message-ID: <5b8d13220902260849x208c57b0r20e681aae2f60868@mail.gmail.com> On Fri, Feb 27, 2009 at 1:20 AM, Jonathan Guyer wrote: > > On Feb 26, 2009, at 10:59 AM, David Cournapeau wrote: > >> Jonathan Guyer wrote: >>> >>> Look at David Cournapeau's blog on why he's ditching bzr in >>> favor of git, after ditching svn in favor of bzr; >> >> You got it wrong: I've never ditched svn, since I have never used it >> for >> my own projects. > > Fair enough. Let me amend: "Look at David Cournapeau's blog on why > he's ditching bzr in > favor of git, after making an ad hominem attack on svn". Better? Not really. I know svn relatively well, and I think my svn complaints are valid. They may not be enough to make a change for numpy/scipy, but I notice that Stefan, Pauli and me, who account for maybe 50 % of the commits in the last 6 months, all use git-svn, even though we have commit rights. David From sturla at molden.no Thu Feb 26 12:02:03 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 26 Feb 2009 18:02:03 +0100 Subject: [SciPy-dev] Scipy workflow (and not tools). 
In-Reply-To: <20090226160459.GA1525@phare.normalesup.org> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <2CC58DDD-5CBB-48E7-825C-8912129668A7@nist.gov> <20090226160459.GA1525@phare.normalesup.org> Message-ID: <49A6CB0B.4050609@molden.no> On 2/26/2009 5:04 PM, Gael Varoquaux wrote: > On Thu, Feb 26, 2009 at 11:00:32AM -0500, Jonathan Guyer wrote: >> e.g., while I can get work done with Windows, I can get a lot more >> work done with something else because I don't have to devote so much >> energy to profanity. > > Fantastic!! I love this sentence. On Linux, every call to malloc must be matched with a call to free. On Windows, millions of calls to HeapAlloc can be matched with a single call to HeapDestroy. Windows prevents impossible-to-find memory leaks, and preserves my sanity when working with complex data structures. I really like Windows. There are even software and hardware drivers for it. Linux may be free as in freedom, but as Kris Kristofferson noted some 30 years ago: "Freedom's just another word for nothing left to lose. Nothing ain't worth nothing but it's free." Sturla Molden From robert.kern at gmail.com Thu Feb 26 12:25:44 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 26 Feb 2009 11:25:44 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <49A6BC0B.4040301@enthought.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> Message-ID: <3d375d730902260925x332bae2mef6bc9c0907b5918@mail.gmail.com> On Thu, Feb 26, 2009 at 09:58, Travis E.
Oliphant wrote: > Pauli Virtanen wrote: >> Wed, 25 Feb 2009 17:18:37 -0600, Travis E. Oliphant wrote: >> [clip] >> >>> 3) There are pieces of SciPy that need work (interpolate stands out most >>> in my mind right now). I have changes to the interpolate code that I >>> have not yet committed because I was waiting for the release of 0.7 but >>> I really want to commit. Who is interested in reviewing this? I'm >>> happy to work with additional eyes, but my current workflow is "commit >>> code I think is working along with some tests and docstrings", and then >>> let review/improve happen on the trunk. >>> >> >> The codereview.appspot.com tool is very fast to use, e.g. via the >> >> http://codereview.appspot.com/static/upload.py >> >> tool. So I'd suggest just uploading the patches there even before >> commit; it can't do any harm. >> > The harm is the effort to do it. Interacting with a web-page is > slower than svn commit. That's why there are CLI tools to submit the review. > This extra step in the process does make a > difference when you are time-crunched. We're usually not. >> The problem with reviewing code after commit in trunk is that it takes >> more effort to correct or ask about dubious points. >> > I disagree with this statement. Why does it take more effort than > reviewing code on the trunk? You can do an svn diff to get the code > changes, and do the review exactly as you could with any other tool. Because looking at a web page is easier, I've found. The communicating that happens afterwards is also easier. Please, *try* it for a month. I believe that you are speaking from ignorance. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco From charlesr.harris at gmail.com Thu Feb 26 12:36:03 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 26 Feb 2009 10:36:03 -0700 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <3d375d730902260925x332bae2mef6bc9c0907b5918@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> <3d375d730902260925x332bae2mef6bc9c0907b5918@mail.gmail.com> Message-ID: On Thu, Feb 26, 2009 at 10:25 AM, Robert Kern wrote: > On Thu, Feb 26, 2009 at 09:58, Travis E. Oliphant > wrote: > > Pauli Virtanen wrote: > >> Wed, 25 Feb 2009 17:18:37 -0600, Travis E. Oliphant wrote: > >> [clip] > >> > >>> 3) There are pieces of SciPy that need work (interpolate stands out > most > >>> in my mind right now). I have changes to the interpolate code that I > >>> have not yet committed because I was waiting for the release of 0.7 but > >>> I really want to commit. Who is interested in reviewing this? I'm > >>> happy to work with additional eyes, but my current workflow is "commit > >>> code I think is working along with some tests and docstrings", and then > >>> let review/improve happen on the trunk. > >>> > >> > >> The codereview.appspot.com tool is very fast to use, eg. via the > >> > >> http://codereview.appspot.com/static/upload.py > >> > >> tool. So I'd suggest to just uploading the patches there even before > >> commit; it can't do any harm. > >> > > The harm is the effort to do it. Interacting with a web-page is > > slower than svn commit. > > That's why there are CLI tools to submit the review. > > > This extra step in the process does make a > > difference when you are time-crunched. 
> > We're usually not. > > >> The problem with reviewing code after commit in trunk is that it takes > >> more effort to correct or ask about dubious points. > >> > > I disagree with this statement. Why does it take more effort than > > reviewing code on the trunk? You can do an svn diff to get the code > > changes, and do the review exactly as you could with any other tool. > > Because looking at a web page is easier, I've found. The communicating > that happens afterwards is also easier. > > Please, *try* it for a month. I believe that you are speaking from > ignorance. > I think a brief howto somewhere would help. It isn't easy to keep up with new tools and learn new habits without some help. Chuck From gael.varoquaux at normalesup.org Thu Feb 26 12:37:44 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 26 Feb 2009 18:37:44 +0100 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> <3d375d730902260925x332bae2mef6bc9c0907b5918@mail.gmail.com> Message-ID: <20090226173744.GF1525@phare.normalesup.org> On Thu, Feb 26, 2009 at 10:36:03AM -0700, Charles R Harris wrote: > I think a brief howto somewhere would help. It isn't easy to keep up with > new tools and learn new habits without some help. Especially when you are very busy. It's a common problem: learning new tools makes you more productive, so you do more when you have very little time. But if you have very little free time, you don't have time to learn new tools...
Gaël From ellisonbg.net at gmail.com Thu Feb 26 12:39:22 2009 From: ellisonbg.net at gmail.com (Brian Granger) Date: Thu, 26 Feb 2009 09:39:22 -0800 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> Message-ID: <6ce0ac130902260939o53f85d36ib4a6fe22926f21f6@mail.gmail.com> > As another long time lurker I would also support everything Neil said. > > I also wanted to add the point, that what stops me recommending scipy > more widely to my colleagues is not that there is not enough code in > it - it is that it is not stable enough to rely on for their work. > That is perhaps a bit harsh, but I am sure that the first time one of > my colleagues lost 1/2 a day because of a scipy bug (as I have done > quite a few times) they would be back to MATLAB. > > So I would agree with Stefan and the others that the priority is not > getting more code in per se, but improving the quality and frequency > of releases to get a platform whose stability compares with MATLAB > before adding more stuff. +1 From ondrej at certik.cz Thu Feb 26 12:44:43 2009 From: ondrej at certik.cz (Ondrej Certik) Date: Thu, 26 Feb 2009 09:44:43 -0800 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <20090226173744.GF1525@phare.normalesup.org> References: <49A339F5.1040703@enthought.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> <3d375d730902260925x332bae2mef6bc9c0907b5918@mail.gmail.com> <20090226173744.GF1525@phare.normalesup.org> Message-ID: <85b5c3130902260944i18e7105q8f6c7639a14899f7@mail.gmail.com> On Thu, Feb 26, 2009 at 9:37 AM, Gael Varoquaux wrote: > On Thu, Feb 26, 2009 at 10:36:03AM -0700, Charles R Harris wrote: >> I think a brief howto somewhere would help. It isn't easy to keep up with >>
new tools and learn new habits without some help. > > Especially when you are very busy. > > It's a common problem: learning new tools makes you more productive, so > you do more when you have very little time. But if you have very little > free time, you don't have time to learn new tools... I am sure Stefan will write a howto; if not, I will. E.g. I'll try to fix something in scipy (hint: broyden2), go through the whole procedure (e.g. git, review, ...) and document the way, so that you can then just follow my howto. Ondrej From michael.abshoff at googlemail.com Thu Feb 26 12:16:17 2009 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Thu, 26 Feb 2009 09:16:17 -0800 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <9457e7c80902260823g7f81a4bds2857bb51006a5192@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <2CC58DDD-5CBB-48E7-825C-8912129668A7@nist.gov> <9457e7c80902260823g7f81a4bds2857bb51006a5192@mail.gmail.com> Message-ID: <49A6CE61.5020607@gmail.com> Stéfan van der Walt wrote: > 2009/2/26 Jonathan Guyer : Hi, >> If I may very unfairly summarize the debate this far: >> >> Stéfan, et al.: There aren't enough tests. All code must have tests! >> Under penalty of death! >> Travis, et al.: But then we shall have neither code *nor* tests! >> Stéfan, et al.: Good! > > I agree, your summary isn't accurate: the world isn't that black and > white. To summarise a profound story I heard the other day: > > "A wise man first dams up the river, before trying to empty the lake." > > Unless we test new code, how do we make time to clean up the old code? > > I certainly would not like to see SciPy stagnate because of hard-line > policy. My fear is that, unless we do something in order to protect > our scarce developer resources, we will see the same thing happening.
Well, I have been following this thread so far and abstained from commenting except for one small email, but here are a couple more comments from the peanut gallery, knowing too well that most people see things differently to the conclusion I will reach at the end: 1. If code is not tested it is broken. It might not be broken on your machine, but how about different endianness or pointer sizes? How about different compilers, BLAS backends, etc.? 2. Tests illustrate how to use the code and that makes code more accessible for a person who is not the author to use as well as improve the code. And good coverage also lessens the risk that changes on one end of the codebase break things in other places. I believe the two points above are more or less agreed upon in the discussion here so far by everybody (probably not the "untested code is broken" bit); the main dividing issue is whether imposing "too much" or "too restrictive" process will hurt Scipy more than it will help. Tests can also be used to do other things, e.g. in Sage we use doctests to automatically look for leaks as well as to look for speed regressions. The speed regression code is not yet automatically run, but due to experiencing slowdowns over and over again it is a high priority thing to get the needed infrastructure bits finished. Now the rather controversial point: In Sage we do not want your contribution if you do not have 100% coverage. There is code sitting in trac where the person has been told to get coverage up to standard, otherwise the code will not be merged. Obviously if the code is good and just missing some coverage the reviewer is often willing to add the missing bits to get the code up to standard. But we will not merge code that isn't fully tested or has known failures. The decision to mandate review and coverage was made after a not too long discussion in person at a Sage Days in September 2007 and it took a while to convince the last person that this was a good idea.
We lost one developer who was not willing to write doctests for his code and all I can say is "good riddance". It wasn't a particularly important piece of code he was working on and I believe that other external factors also made that person drop out from the Sage project. Losing a person might seem bad and it certainly is not a good thing, but while the policy of mandatory coverage and then later on of mandatory review caused pain and required adjustments to the way we did things, it has shown great long-term benefits. Review does not catch every issue, but it sure as hell catches a lot before the code even goes in. And once things have been merged they are in, and unless someone takes the time right there and then to review, the issues will likely just melt into the background. If you look into areas of Sage with low coverage you will find code that is in need of a lot of attention. That code was merged into Sage back in the day before review and coverage requirements and it clearly shows. The argument that some people make that people can go back and do review once the code is in certainly holds; the problem is that there are often more pressing issues to do right now than to deal with problems one could deal with later [the code is already in tree:)] and so very little happens about that code there and then unless it is truly bad. The comment about "damming the river" certainly fits very well here. We can and do release Sage versions that are generally better than the previous release with much higher frequency than just about any other project out there. This is largely possible because of coverage and review. At one point we had some critical bugs that popped up at a rather important installation and William and I went from the decision to make a Sage release, to having a handful of bugs fixed and tested that they worked, to the final release tarball in less than 12 hours.
And one more thing about trivial and obvious fixes not needing review: Those are sometimes the worst and hardest bugs to fix in the right way. Too often people have attempted to "fix" things in Sage not understanding the underlying implementation details and so on. We have been burned by "trivial" one-line fixes often enough to develop a lot of skepticism that without testing anything can be verified to be fixed. One example here is one of the bugs that originally led to the decision to mandate the coverage policy: We had a bug in an external library where a characteristic polynomial for a matrix over the integers was wrong. * First try: Fix the issue in the library and update it in Sage Oops, forgot the fix, *nobody* noticed since no doctest in Sage was added that did check for the bug, let's redo it. * Second try: Fix the issue in the library and update it in Sage, add doctest Testing the build revealed that everything worked fine. A couple days later we cut an rc release, but behold: doctesting reveals that on a big-endian PPC box the fix in the upstream library was *completely* broken. So the third try was the charm. Without mandated tests for fixes we would have caught this bug at some point for sure, but it would have likely been way more work to determine which change caused the problem since the side effects of broken code in that specific case can be at far places in the codebase. I am not asking anyone to agree with me, I just wanted to point out things the Sage project has been doing and share some experience I have collected as the Sage release manager. There is no one-size-fits-all for a project, and project-specific culture as well as customs are hard to change, but it can be done. I believe that the Sage project has greatly benefited from the policy changes described above.
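To make the "fix plus doctest" pattern from the characteristic-polynomial story concrete, here is a small hedged sketch. It is not Sage's code or the external library's: charpoly_coeffs is a hypothetical helper, written here with the Faddeev-LeVerrier recurrence purely so that there is something to attach a doctest to. The doctest is what gets re-run on every platform and would flag a wrong polynomial.

```python
import doctest


def charpoly_coeffs(a):
    """Coefficients of det(t*I - a) for a square integer matrix, lowest degree first.

    Hypothetical example function; the doctests below act as regression tests.

    >>> charpoly_coeffs([[2, 1], [1, 2]])
    [3, -4, 1]
    >>> charpoly_coeffs([[0, 1], [-1, 0]])
    [1, 0, 1]
    """
    n = len(a)

    def matmul(p, q):
        # Plain integer matrix product of two n-by-n matrices.
        return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    coeffs = [0] * n + [1]            # leading coefficient is 1
    m = [[0] * n for _ in range(n)]   # M_0 = 0
    for k in range(1, n + 1):
        # M_k = a * M_{k-1} + c_{n-k+1} * I
        am = matmul(a, m)
        m = [[am[i][j] + (coeffs[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        # c_{n-k} = -trace(a * M_k) / k; the division is exact for integer input.
        am_next = matmul(a, m)
        trace = sum(am_next[i][i] for i in range(n))
        coeffs[n - k] = -trace // k
    return coeffs


if __name__ == "__main__":
    # Runs the examples in the docstring; silent when they all pass.
    doctest.testmod()
```

The point of the sketch is the docstring, not the algorithm: once the expected output is recorded next to the code, a "fix" that was never applied, or one that breaks on another architecture, shows up the next time the doctests run.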
Now you need to come to your own conclusions :) > Regards > Stéfan Cheers, Michael > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > From oliphant at enthought.com Thu Feb 26 13:01:55 2009 From: oliphant at enthought.com (Travis E. Oliphant) Date: Thu, 26 Feb 2009 12:01:55 -0600 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <6ce0ac130902252321j6139238by634364acd2bd07b2@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <6ce0ac130902252321j6139238by634364acd2bd07b2@mail.gmail.com> Message-ID: <49A6D913.9040809@enthought.com> Brian Granger wrote: > Neil, > > Thanks for speaking up! I think there are *many* people in your > situation, including myself - I too am mostly a silent watcher of > SciPy and I would be much more likely to contribute if the things you > list were a part of the Scipy development culture: > > * Tests > * Code review > * Documentation > What is standing in the way of these things being done more often? Tests are happening, code review is happening, and documentation is happening. What exactly is the problem except lack of time from people who have historically committed to SciPy? I want these things to happen and try to do them whenever I submit new code. But, time is a factor, and people will disagree about "what constitutes a test" and "what constitutes good documentation." There is a lot of code in SciPy that was contributed by me very early on that may not live up to the same standard of testing and documentation that people have. Is that the fundamental problem --- history? > * Good tools and workflow. > This, I do see as a problem. The value of DVCS is that it handles branches better and new contributors *want* to work on branches and this will encourage *everybody* (including me) to work on branches.
I've been committing to trunk for many years on SciPy and NumPy without pre-commit code review. It's hard for me to break that habit and I'm resisting requests that I stop doing that. I'm willing to stop it -- if it will really help SciPy progress. However, I'm not convinced that this kind of commit behavior by me and others is what is really stalling SciPy development. It could be that other commit patterns are not advertised enough to assist newcomers and I'm all for advertising other commit patterns that help people contribute. So, let's do that advertising. I'm very encouraged by the experiments with a git-svn bridge and would love to see an issue-tracker with a command-line interface. I'd also love to see more volunteers who are willing to tackle release management so that we can do it more regularly. Ultimately, those who do the work of release management will define what SciPy *is*. Best regards, -Travis From charlesr.harris at gmail.com Thu Feb 26 14:06:46 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 26 Feb 2009 12:06:46 -0700 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <49A6D913.9040809@enthought.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <6ce0ac130902252321j6139238by634364acd2bd07b2@mail.gmail.com> <49A6D913.9040809@enthought.com> Message-ID: Several lurkers have expressed an interest in working on scipy. How can we get them involved in working with the code? We need these people. Chuck
From pav at iki.fi Thu Feb 26 14:28:53 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 26 Feb 2009 19:28:53 +0000 (UTC) Subject: [SciPy-dev] The future of SciPy and its development infrastructure References: <49A339F5.1040703@enthought.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> <3d375d730902260925x332bae2mef6bc9c0907b5918@mail.gmail.com> <20090226173744.GF1525@phare.normalesup.org> <85b5c3130902260944i18e7105q8f6c7639a14899f7@mail.gmail.com> Message-ID: Thu, 26 Feb 2009 09:44:43 -0800, Ondrej Certik wrote: [clip] > I am sure Stefan will write a howto, if not, I will, e.g. I'll try to > fix something in scipy (hint: broyden2), go through the whole procedure > (e.g. git, review, ...) and document the way, so that you can then just > follow my howto. I now notice that I still haven't found time to make progress on the scipy.optimize.nonlin rewrite... I moved my git branches around a bit; you can find my current work here: http://github.com/pv/scipy-work/tree/ticket-791-optimize-nonlin-rewrite It'd be great if we managed to finish this, as it's been on hold now for some time. -- Pauli Virtanen From oliphant at enthought.com Thu Feb 26 14:39:47 2009 From: oliphant at enthought.com (Travis E.
Oliphant) Date: Thu, 26 Feb 2009 13:39:47 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <9457e7c80902260417q26c20c3es96d26dc0b187691f@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <9457e7c80902260417q26c20c3es96d26dc0b187691f@mail.gmail.com> Message-ID: <49A6F003.7080406@enthought.com> Stéfan van der Walt wrote: > Hi Travis > > >> 3) There are pieces of SciPy that need work (interpolate stands out most >> in my mind right now). I have changes to the interpolate code that I >> have not yet committed because I was waiting for the release of 0.7 but >> I really want to commit. Who is interested in reviewing this? >> > > I'd be glad to. Pauli's suggestion of codereview.appspot.com sounds > good, since we don't have any better infrastructure in place. > As I've mentioned, I'm all for others doing this if it helps them feel more comfortable contributing to SciPy. I'm not interested in using this tool because of the increased effort. I have always been and remain very interested in reviews / feedback / comments / fixes to check-ins that I make to the trunk. If the check-in email is not sufficient for large changes, I am willing to send an email to interested parties about changes that have been made pointing to the svn diff in the Trac. If someone else would like to take those changes and have a discussion with some other tool, that is also fine.
I'm very impatient because my windows of time to work on something are small and if I have to "wait-for-review" before something gets checked-in, I suspect I will get impatient because it increases the mental-time I have to spend on getting something fixed / improved in SciPy. I'm very aware of many of the improvements that need to be made, but don't have a lot of time to spend on them. >> 4) Bug-fix commits are a different thing than feature-enhancement >> commits. We should have different expectations of them. >> > > I agree, to an extent. I think it is an ideal opportunity to add a > test (since, clearly, the current test suite didn't catch the problem, > and since you had to study the broken code in order to fix it); but in > such a case it's more important to have the bug fixed. Unfortunately, > without a test you won't be absolutely certain that it's fixed > everywhere, but the process at least converges in the right direction. > I agree, but would mention that a unit test only lets you test against that particular feature/bug that was called out in the test. Without code-coverage you don't have any guarantees or even a guarantee that the unit-test is written well-enough to catch the more subtle and harder to replicate bugs. So, yes, unit-tests are good, but they are not a panacea to the goal of quality code. >> 6) I very much appreciate all the work people do on SciPy. I think >> our biggest lack more than anything else is the "full-time" person that >> can respond to the user community and keep the momentum moving. >> > > Absolutely. I've often wondered how hard it would be to obtain such > funding, but to date I haven't made any proposals. > Right now it looks to me that we have a steady-stream of students, academics, and the dedicated Robert Kern. I was hopeful that I would be able to make SciPy-growth work while I was in academia but it didn't work out for me. Right now, I'm excited to continue to help Enthought in its support of SciPy. 
I'm hopeful that will lead to me having more time to spend on SciPy myself, but it's possible that it won't work out that way. It's great to see others that have stepped up and are continuing to step up to move SciPy forward. Best regards, -Travis From oliphant at enthought.com Thu Feb 26 14:44:22 2009 From: oliphant at enthought.com (Travis E. Oliphant) Date: Thu, 26 Feb 2009 13:44:22 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <49A6B7E8.5010105@creativetrax.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A6B7E8.5010105@creativetrax.com> Message-ID: <49A6F116.5050405@enthought.com> jason-sage at creativetrax.com wrote: > Perry Greenfield wrote: > >> 2) While I understand the desire to increase the quality of commits to >> scipy by putting in a more formal process, like making sure code is >> reviewed, tests are present, and documentation is provided, I too, >> like Travis, worry that this may inhibit many useful contributions. >> Rather than act as a barrier, why not just have some sort of "seal of >> approval" for things that have gone through that process. >> > > > Lots of projects have -stable and -dev branches. The -stable branch for > scipy could involve the "seal of approval" with review, doctests, etc. > The -dev branch could be the unreviewed code. 
This lets Travis commit > to something and get his patches out there, but also clearly defines a > line in the sand between reviewed and unreviewed code. I realize that > scipy already has something of -dev and -stable branches, based on > releases. Maybe this idea boils down to: only reviewed code is allowed > in an official release, but there is a -dev branch with all code > available as well. As code is reviewed, it is moved into the -stable > branch and released in the next release. > This may be a good solution for us in the short term, prior to choosing a DVCS. > In reality, using a DVCS, each developer's copy of the repository then > becomes a private -dev branch that can be pulled from. Developers get > to commit and publish unreviewed changes, and someone (the release > manager) can pull in to -stable the changes that are reviewed. The > release manager could also pull all changes from developer repositories > into an official -dev branch if you wanted to have a central clearing > house for what everyone is working on. > This sounds like a good workflow that solves the concerns I have while still allowing a stable branch to emerge with well-documented / tested / reviewed code. It seems like we could do this today --- I like it. -Travis From pav at iki.fi Thu Feb 26 14:45:42 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 26 Feb 2009 19:45:42 +0000 (UTC) Subject: [SciPy-dev] Scikits portal suggestions (Was: The future of SciPy...) 
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <9457e7c80902260417q26c20c3es96d26dc0b187691f@mail.gmail.com> Message-ID: Thu, 26 Feb 2009 14:17:07 +0200, Stéfan van der Walt wrote: [clip] > I don't know if you've visited the portal to SciKits: > http://scikits.appspot.com. If we can make any changes to facilitate > packaging, let me know. I think a couple of things should be done on the portal to make it feel more finished: 1. Write a blurb for all of the scikits. (I can also do this, if you tell me how...) The portal page currently looks quite empty and somewhat discouraging: http://scikits.appspot.com/scikits Or should this happen automatically via PyPi? 2. A link to the portal should be added in a visible place @ scipy.org, when it's ready. 3. PyPi links & instructions for packages that are not in PyPi should be hidden. 4.
openopt seems to live nowadays at openopt.org -- Pauli Virtanen From robert.kern at gmail.com Thu Feb 26 14:47:03 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 26 Feb 2009 13:47:03 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <49A6F003.7080406@enthought.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <9457e7c80902260417q26c20c3es96d26dc0b187691f@mail.gmail.com> <49A6F003.7080406@enthought.com> Message-ID: <3d375d730902261147t102c1823yaa8a558c5ff29b87@mail.gmail.com> On Thu, Feb 26, 2009 at 13:39, Travis E. Oliphant wrote: > I'm very impatient because my windows of time to work on something are > small and if I have to "wait-for-review" before something gets > checked-in, I suspect I will get impatient because it increases the > mental-time I have to spend on getting something fixed / improved in > SciPy. I'm very aware of many of the improvements that need to be > made, but don't have a lot of time to spend on them. Why do you care so much about checking it in to the trunk immediately? Toss it onto the review site with the CLI tool, and let someone else finish it. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco From charlesr.harris at gmail.com Thu Feb 26 14:48:18 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 26 Feb 2009 12:48:18 -0700 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <49A6F116.5050405@enthought.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A6B7E8.5010105@creativetrax.com> <49A6F116.5050405@enthought.com> Message-ID: On Thu, Feb 26, 2009 at 12:44 PM, Travis E. Oliphant wrote: > jason-sage at creativetrax.com wrote: > > Perry Greenfield wrote: > > > >> 2) While I understand the desire to increase the quality of commits to > >> scipy by putting in a more formal process, like making sure code is > >> reviewed, tests are present, and documentation is provided, I too, > >> like Travis, worry that this may inhibit many useful contributions. > >> Rather than act as a barrier, why not just have some sort of "seal of > >> approval" for things that have gone through that process. > >> > > > > > > Lots of projects have -stable and -dev branches. The -stable branch for > > scipy could involve the "seal of approval" with review, doctests, etc. > > The -dev branch could be the unreviewed code. This lets Travis commit > > to something and get his patches out there, but also clearly defines a > > line in the sand between reviewed and unreviewed code. I realize that > > scipy already has something of -dev and -stable branches, based on > > releases. Maybe this idea boils down to: only reviewed code is allowed > > in an official release, but there is a -dev branch with all code > > available as well. 
As code is reviewed, it is moved into the -stable > > branch and released in the next release. > > > This may be a good solution for us in the short term, prior to choosing > a DVCS. > > > In reality, using a DVCS, each developer's copy of the repository then > > becomes a private -dev branch that can be pulled from. Developers get > > to commit and publish unreviewed changes, and someone (the release > > manager) can pull in to -stable the changes that are reviewed. The > > release manager could also pull all changes from developer repositories > > into an official -dev branch if you wanted to have a central clearing > > house for what everyone is working on. > > > This sounds like a good workflow that solves the concerns I have while > still allowing a stable branch to emerge with well-documented / tested / > reviewed code. > Can someone walk Travis through the process so that he can make his commits somewhere? Then we can look them over and pull them into the trunk. Chuck From pgmdevlist at gmail.com Thu Feb 26 14:54:00 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 26 Feb 2009 14:54:00 -0500 Subject: [SciPy-dev] Scikits portal suggestions (Was: The future of SciPy...)
In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <9457e7c80902260417q26c20c3es96d26dc0b187691f@mail.gmail.com> Message-ID: On Feb 26, 2009, at 2:45 PM, Pauli Virtanen wrote: > Thu, 26 Feb 2009 14:17:07 +0200, Stéfan van der Walt wrote: > [clip] >> I don't know if you've visited the portal to SciKits: >> http://scikits.appspot.com. If we can make any changes to facilitate >> packaging, let me know. > > I think a couple of things should be done on the portal to make it > feel > more finished: > > 1. Write a blurb for all of the scikits. (I can also do this, if you > tell > me how...) The portal page currently looks quite empty and somewhat > discouraging: > > http://scikits.appspot.com/scikits > > Or should this happen automatically via PyPi? Mmh, scikits.timeseries is not on PyPi yet. The latest sources require numpy 1.3.x, and I have to wait until 1.3 is officially released to release our first official version. > > 3. PyPi links & instructions for packages that are not in PyPi should > be hidden. Or updated: we have a fairly comprehensive doc on sourceforge (pytseries.sourceforge.net). How can I update the page on scikits.appspot.com? From cournape at gmail.com Thu Feb 26 14:54:20 2009 From: cournape at gmail.com (David Cournapeau) Date: Fri, 27 Feb 2009 04:54:20 +0900 Subject: [SciPy-dev] Scikits portal suggestions (Was: The future of SciPy...)
In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <9457e7c80902260417q26c20c3es96d26dc0b187691f@mail.gmail.com> Message-ID: <5b8d13220902261154r4be223b2obd334d91b63d1c5a@mail.gmail.com> On Fri, Feb 27, 2009 at 4:45 AM, Pauli Virtanen wrote: > Thu, 26 Feb 2009 14:17:07 +0200, Stéfan van der Walt wrote: > [clip] >> I don't know if you've visited the portal to SciKits: >> http://scikits.appspot.com. If we can make any changes to facilitate >> packaging, let me know. > > I think a couple of things should be done on the portal to make it feel > more finished: > > 1. Write a blurb for all of the scikits. (I can also do this, if you tell > me how...) The portal page currently looks quite empty and somewhat > discouraging: > > http://scikits.appspot.com/scikits > > Or should this happen automatically via PyPi? Yes, it should be automatic. It corresponds to the short description of the package. I will fix it for my own packages, at least. There seems to be a problem with some packages which should not be there, though (scikits.em, for example - they are parts of scikits.learn - maybe a bug somewhere, since they don't exist in pypi). David From oliphant at enthought.com Thu Feb 26 14:55:50 2009 From: oliphant at enthought.com (Travis E. Oliphant) Date: Thu, 26 Feb 2009 13:55:50 -0600 Subject: [SciPy-dev] Scipy workflow (and not tools).
In-Reply-To: <49A6CE61.5020607@gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <2CC58DDD-5CBB-48E7-825C-8912129668A7@nist.gov> <9457e7c80902260823g7f81a4bds2857bb51006a5192@mail.gmail.com> <49A6CE61.5020607@gmail.com> Message-ID: <49A6F3C6.4060900@enthought.com> Michael Abshoff wrote: > I am not asking anyone to agree with me, I just wanted to point out > things the Sage project has been doing and share some experience I have > collected as the Sage release manager. There is no one-size fits all for > a project and project specific culture as well as customs are hard to > change, but it can be done. I believe that the Sage project has greatly > benefited from the policy changes described above. Now you need to come > to your own conclusions :) > Thank you for the detailed examples and for sharing your experiences. This kind of feedback is very valuable. One of the things I've admired about Sage is how much raw time people seem to devote to writing code / building web-pages / and writing documentation / tests. Having someone with as much energy as William seems to me to make a big difference. That was my experience with the NumPy port as well. Things move forward when there is someone spending a lot of time paying attention and pushing. *Just* the spare time of people causes projects to move forward more slowly. 
-Travis From ondrej at certik.cz Thu Feb 26 15:12:12 2009 From: ondrej at certik.cz (Ondrej Certik) Date: Thu, 26 Feb 2009 12:12:12 -0800 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <49A339F5.1040703@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> <3d375d730902260925x332bae2mef6bc9c0907b5918@mail.gmail.com> <20090226173744.GF1525@phare.normalesup.org> <85b5c3130902260944i18e7105q8f6c7639a14899f7@mail.gmail.com> Message-ID: <85b5c3130902261212hef15135kce65e9a08f41a98c@mail.gmail.com> On Thu, Feb 26, 2009 at 11:28 AM, Pauli Virtanen wrote: > Thu, 26 Feb 2009 09:44:43 -0800, Ondrej Certik wrote: > [clip] >> I am sure Stefan will write a howto, if not, I will, e.g. I'll try to >> fix something in scipy (hint: broyden2), go through the whole procedure >> (e.g. git, review, ...) and document the way, so that you can then just >> follow my howto. > > I now notice that I still haven't found time to make progress on the > scipy.optimize.nonlin rewrite... I moved my git branches around a bit, > you can find my current work here: > > http://github.com/pv/scipy-work/tree/ticket-791-optimize-nonlin-rewrite > > It'd be great if we managed to finish this, as it's been on hold now > for some time. Yes, sorry about it --- I also haven't found time to help with this, but it's on my todo. Ondrej From ondrej at certik.cz Thu Feb 26 15:14:34 2009 From: ondrej at certik.cz (Ondrej Certik) Date: Thu, 26 Feb 2009 12:14:34 -0800 Subject: [SciPy-dev] Scipy workflow (and not tools).
In-Reply-To: <49A6F3C6.4060900@enthought.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <2CC58DDD-5CBB-48E7-825C-8912129668A7@nist.gov> <9457e7c80902260823g7f81a4bds2857bb51006a5192@mail.gmail.com> <49A6CE61.5020607@gmail.com> <49A6F3C6.4060900@enthought.com> Message-ID: <85b5c3130902261214n376e96b0q9c517234338b5916@mail.gmail.com> On Thu, Feb 26, 2009 at 11:55 AM, Travis E. Oliphant wrote: > Michael Abshoff wrote: >> I am not asking anyone to agree with me, I just wanted to point out >> things the Sage project has been doing and share some experience I have >> collected as the Sage release manager. There is no one-size fits all for >> a project and project specific culture as well as customs are hard to >> change, but it can be done. I believe that the Sage project has greatly >> benefited from the policy changes described above. Now you need to come >> to your own conclusions :) >> > Thank you for the detailed examples and for sharing your experiences. > This kind of feedback is very valuable. > > One of the things I've admired about Sage is how much raw time people > seem to devote to writing code / building web-pages / and writing > documentation / tests. > > Having someone with as much energy as William seems to me to make a big > difference. That was my experience with the NumPy port as well. > Things move forward when there is someone spending a lot of time paying > attention and pushing. *Just* the spare time of people causes projects > to move forward more slowly. Yep, I have exactly the same experience with sympy --- if I push things hard day and night, people will join and we grow, if it's just the spare time of other people, things move forward more slowly.
Ondrej From jason-sage at creativetrax.com Thu Feb 26 15:24:13 2009 From: jason-sage at creativetrax.com (jason-sage at creativetrax.com) Date: Thu, 26 Feb 2009 14:24:13 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> <3d375d730902260925x332bae2mef6bc9c0907b5918@mail.gmail.com> Message-ID: <49A6FA6D.9030802@creativetrax.com> Charles R Harris wrote: >> Because looking at a web page is easier, I've found. The communicating >> that happens afterwards is also easier. >> >> Please, *try* it for a month. I believe that you are speaking from >> ignorance. >> >> > > I think a brief howto somewhere would help. It isn't easy to keep up with > new tools and learn new habits without some help. > > +1 to some way of having the reviews easily accessible. I just looked at a review that Charles did and already learned some things about conventions in scipy, just based on the line-by-line commenting he did on a review. Jason From ellisonbg.net at gmail.com Thu Feb 26 15:40:21 2009 From: ellisonbg.net at gmail.com (Brian Granger) Date: Thu, 26 Feb 2009 12:40:21 -0800 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <49A6D913.9040809@enthought.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <6ce0ac130902252321j6139238by634364acd2bd07b2@mail.gmail.com> <49A6D913.9040809@enthought.com> Message-ID: <6ce0ac130902261240y3596278fo30693766a0194d5@mail.gmail.com> >> * Tests >> * Code review >> * Documentation >> > What is standing in the way of these things being done more often? 
> Tests are happening, code review is happening, and documentation is > happening. What exactly is the problem except lack of time from people > who have historically committed to SciPy? Lack of time *can* be an issue for veteran developers. But for eager new developers this is not typically the problem. For these people, you simply need to make it perfectly clear how they can contribute to the project. The more specific we can be the better. As a semi-outsider, here is what I *perceive* the Scipy model to be currently (for new users): * Checkout the SVN trunk. * Make your changes. * Contact the list and tell them about your new code. * Then ??? This procedure is summarized on the scipy development page in the following language: "Interested people can get repository write access as well" AND "If you have some new code you'd like to see included in SciPy, the first thing to do is make a SciKit" Here is what I would like to see = this would make me want to contribute code to scipy: Contributing to SciPy is easy and anyone can do it. Here is what you do: * Create a branch using git/bzr/hg (we have to pick one). * Write your code. * Add tests and documentation to your code. * Run the test suite * Post your branch to github/launchpad/bitbucket * Submit your branch for review using... When someone is eager to write code, this is what they need!!! > I want these things to happen and try to do them whenever I submit new > code. But, time is a factor, and people will disagree about "what > constitutes a test" and "what constitutes good documentation." > > There is a lot of code in SciPy that was contributed by me very early on > that may not live up to the same standard of testing and documentation > that people have. Is that the fundamental problem --- history? No project can escape its history. But for an eager developer, there is really no difference (from a workflow/testing/review perspective) between writing new code and improving old code.
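The steps in the list above could look roughly like this with git. This is only a minimal sketch: git stands in for whichever of git/bzr/hg gets picked, it runs in a throwaway repository, and the branch name, file names and identity are placeholders rather than anything SciPy has agreed on.

```shell
#!/bin/sh
# Minimal sketch of the branch-based workflow, run in a throwaway repository.
# git is used only for illustration; the branch name, file names and
# identity below are placeholders, not SciPy conventions.
set -e

workdir=$(mktemp -d)
cd "$workdir"
git init -q .
git config user.name "Example Contributor"
git config user.email "contributor@example.invalid"

# Stand-in for the existing project history.
printf 'def cube(x):\n    return x ** 3\n' > cube.py
git add cube.py
git commit -q -m "Initial import"

# 1. Create a branch for the change.
git checkout -q -b add-cube-test

# 2-3. Write the code together with its tests and documentation.
printf 'from cube import cube\n\ndef test_cube():\n    assert cube(2) == 8\n' > test_cube.py
git add test_cube.py
git commit -q -m "Add a test for cube"

# 4. Run the test suite here, before publishing anything.
# 5. Publish the branch on a hosting site of your choice (git push).
# 6. Ask for review on the list, pointing at the published branch.

# Show what would be submitted for review.
git log --oneline
```

The point of working on a branch is that the trunk stays untouched until the change has been reviewed.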
>> * Good tools and workflow. >> > This, I do see as a problem. The value of DVCS is that it handles > branches better and new contributors *want* to work on branches and this > will encourage *everybody* (including me) to work on branches. +1 > I've been committing to trunk for many years on SciPy and NumPy without > pre-commit code review. It's hard for me to break that habit and I'm > resisting requests that I stop doing that. I'm willing to stop it -- I agree that this is a hard habit to break. From stefan at sun.ac.za Thu Feb 26 16:07:14 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 26 Feb 2009 23:07:14 +0200 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <6ce0ac130902261240y3596278fo30693766a0194d5@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <6ce0ac130902252321j6139238by634364acd2bd07b2@mail.gmail.com> <49A6D913.9040809@enthought.com> <6ce0ac130902261240y3596278fo30693766a0194d5@mail.gmail.com> Message-ID: <9457e7c80902261307m1d4cdedckc7df633763a9b29d@mail.gmail.com> 2009/2/26 Brian Granger : > Contributing to SciPy is easy and anyone can do it. Here is what you do: > > * Create a branch using git/bzr/hg (we have to pick one). > * Write your code. > * Add tests and documentation to your code. > * Run the test suite > * Post your branch to github/launchpad/bitbucket > * Submit your branch for review using... I'll have a first draft of a developer's guide ready in the morning. Regards Stéfan From stefan at sun.ac.za Thu Feb 26 16:11:36 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 26 Feb 2009 23:11:36 +0200 Subject: [SciPy-dev] Scikits portal suggestions (Was: The future of SciPy...)
In-Reply-To: <5b8d13220902261154r4be223b2obd334d91b63d1c5a@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <9457e7c80902260417q26c20c3es96d26dc0b187691f@mail.gmail.com> <5b8d13220902261154r4be223b2obd334d91b63d1c5a@mail.gmail.com> Message-ID: <9457e7c80902261311x4cdbb6afhff507298dfabf155@mail.gmail.com> 2009/2/26 David Cournapeau : > There seems to be a problem with some packages which should not be > there, though (scikits.em, for example - they are parts of > scikits.learn - maybe a bug somewhere, since they don't exist in > pypi). We scan both the SVN repository and PyPi. If anyone wants editing access to the text on those pages, we'll give you the appropriate permission. The descriptions are taken straight from PyPi, so if your package is registered there it should reflect correctly on scikits.appspot. If OpenOpt no longer lives in the scikits SVN, we should probably remove it. Cheers Stéfan From charlesr.harris at gmail.com Thu Feb 26 16:18:22 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 26 Feb 2009 14:18:22 -0700 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <6ce0ac130902261240y3596278fo30693766a0194d5@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <6ce0ac130902252321j6139238by634364acd2bd07b2@mail.gmail.com> <49A6D913.9040809@enthought.com> <6ce0ac130902261240y3596278fo30693766a0194d5@mail.gmail.com> Message-ID: On Thu, Feb 26, 2009 at 1:40 PM, Brian Granger wrote: > >> * Tests > >> * Code review > >> * Documentation > >> > > What is standing in the way of these things being done more often?
> > Tests are happening, code review is happening, and documentation is > > happening. What exactly is the problem except lack of time from people > > who have historically committed to SciPy? > > Lack of time *can* be an issue for veteran developers. But for eager > new developers this is not typically the problem. For these people, > you simply need to make it perfectly clear how they can contribute to > the project. The more specific we can be the better. > > As a semi-outsider, here is what I *perceive* the Scipy model to be > currently (for new users): > > * Checkout the SVN trunk. > * Make your changes. > * Contact the list and tell them about your new code. > * Then ??? > > This procedure is summarized on the scipy development page in the > following language: > > "Interested people can get repository write access as well" > > AND > > "If you have some new code you'd like to see included in SciPy, the > first thing to do is make a SciKit" > > Here is what I would like to see = this would make me want to > contribute code to scipy: > > Contributing to SciPy is easy and anyone can do it. Here is what you do: > > * Create a branch using git/bzr/hg (we have to pick one). > * Write your code. > * Add tests and documentation to your code. > * Run the test suite > * Post your branch to github/launchpad/bitbucket > * Submit your branch for review using... > > When someone is eager to write code, this is what they need!!! > Nice, I like it. +2. Chuck From ellisonbg.net at gmail.com Thu Feb 26 16:31:38 2009 From: ellisonbg.net at gmail.com (Brian Granger) Date: Thu, 26 Feb 2009 13:31:38 -0800 Subject: [SciPy-dev] Scipy workflow (and not tools).
In-Reply-To: <9457e7c80902261307m1d4cdedckc7df633763a9b29d@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <6ce0ac130902252321j6139238by634364acd2bd07b2@mail.gmail.com> <49A6D913.9040809@enthought.com> <6ce0ac130902261240y3596278fo30693766a0194d5@mail.gmail.com> <9457e7c80902261307m1d4cdedckc7df633763a9b29d@mail.gmail.com> Message-ID: <6ce0ac130902261331r35586aa2v99c77369f63ba7e1@mail.gmail.com> > I'll have a first draft of a developer's guide ready in the morning. Fantastic! Can this: http://www.scipy.org/Developer_Zone page be updated to reflect the changes. Cheers, Brian From charlesr.harris at gmail.com Thu Feb 26 16:46:23 2009 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 26 Feb 2009 14:46:23 -0700 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: <9457e7c80902261307m1d4cdedckc7df633763a9b29d@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <6ce0ac130902252321j6139238by634364acd2bd07b2@mail.gmail.com> <49A6D913.9040809@enthought.com> <6ce0ac130902261240y3596278fo30693766a0194d5@mail.gmail.com> <9457e7c80902261307m1d4cdedckc7df633763a9b29d@mail.gmail.com> Message-ID: On Thu, Feb 26, 2009 at 2:07 PM, St?fan van der Walt wrote: > 2009/2/26 Brian Granger : > > Contributing to SciPy is easy and anyone can do it. Here is what you do: > > > > * Create a branch using git/bzr/hg (we have to pick one). > > * Write your code. > > * Add tests and documentation to your code. > > * Run the test suite > > * Post your branch to github/launchpad/bitbucket > > * Submit your branch for review using... > > I'll have a first draft of a developer's guide ready in the morning. > Getting the buildbots to support this procedure will also help with the testing. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From argriffi at ncsu.edu Thu Feb 26 17:02:58 2009
From: argriffi at ncsu.edu (alex)
Date: Thu, 26 Feb 2009 17:02:58 -0500
Subject: [SciPy-dev] nonlin patch
In-Reply-To: <85b5c3130902260831h60a4e5a6k7fc4ac2365be24b1@mail.gmail.com>
References: <9D4464CAAAB788439D66EE2432F9B5F1056A829E@00001EXCH.uk.mitsuibabcock.com> <49A69F62.6020603@ncsu.edu> <85b5c3130902260831h60a4e5a6k7fc4ac2365be24b1@mail.gmail.com>
Message-ID: <49A71192.4010502@ncsu.edu>

Ondrej Certik wrote:
> [...]
>
> Just checkout the svn, copy your working scipy files over it, do "svn
> di" and send us the patch.
>

Here is a bug-exposing patch that adds three tests to test_nonlin.py.
One of them currently fails, and the fact that the other two pass helps
to show the nature of the problem (if you use a small enough number of
iterations or start your guess far enough from the true answer, then the
bug is not triggered). This is probably fixed by Pauli Virtanen's work,
but I haven't checked this.

Alex

Index: scipy/optimize/tests/test_nonlin.py
===================================================================
--- scipy/optimize/tests/test_nonlin.py (revision 5597)
+++ scipy/optimize/tests/test_nonlin.py (working copy)
@@ -3,12 +3,13 @@
 May 2007
 """
 
+import numpy
+
 from numpy.testing import *
 from scipy.optimize import nonlin
 from numpy import matrix, diag
 
 
-
 def F(x):
     def p3(y):
         return float(y.T*y)*y
@@ -24,6 +25,40 @@
     return tuple(f.flat)
 
 
+def G(x):
+    return [v**3 - 1 for v in x]
+
+class TestNonlinEasy(TestCase):
+    """ Test case for a stupidly easy function optimization problem.
+    """
+    def broyden_helper(self, initial_guess, expected_result, iterations=None):
+        if iterations:
+            x = nonlin.broyden2(G, initial_guess, iter=iterations)
+        else:
+            x = nonlin.broyden2(G, initial_guess)
+        x_array = numpy.array(x)
+        xout_array = numpy.array(expected_result)
+        eps = 1e-9
+        errmsg = 'got %s but expected %s' % (str(x_array), str(xout_array))
+        difference = numpy.array(x_array - xout_array)
+        assert nonlin.norm(difference) < eps, errmsg
+        assert nonlin.norm(G(x)) < eps
+
+    def test_broyden2_near_default_iterations(self):
+        initial_guess = [1.1, 1.1, 1.1]
+        expected_result = [1, 1, 1]
+        self.broyden_helper(initial_guess, expected_result)
+
+    def test_broyden2_near_fewer_iterations(self):
+        initial_guess = [1.1, 1.1, 1.1]
+        expected_result = [1, 1, 1]
+        self.broyden_helper(initial_guess, expected_result, 8)
+
+    def test_broyden2_far(self):
+        initial_guess = [2, 2, 2]
+        expected_result = [1, 1, 1]
+        self.broyden_helper(initial_guess, expected_result)
+
 class TestNonlin(TestCase):
     """ Test case for a simple constrained entropy maximization problem
     (the machine translation example of Berger et al in

From bsouthey at gmail.com Thu Feb 26 17:02:44 2009
From: bsouthey at gmail.com (Bruce Southey)
Date: Thu, 26 Feb 2009 16:02:44 -0600
Subject: [SciPy-dev] Updating and improving the statistical capabilities in Scipy
Message-ID: <49A71184.2010407@gmail.com>

Hi,
I apologize in advance if this is the wrong approach. All this talk has
inspired me to do something about developing the statistics of Scipy.

We need to develop a strategy to improve the statistical functions
within Scipy. A central requirement is a sort of code review to ensure
that the existing functions have adequate documentation (see also the
Scipy documentation Marathon
http://www.scipy.org/Developer_Zone/DocMarathon2008) and have
appropriate tests for functionality and accuracy.
Basically I strongly believe that we must carefully use a slow
divide-and-conquer approach to succeed simply due to the scope involved.

I would be extremely interested in what people would like to see so that
we can develop specific goals and action plans to update and improve
them. I consider at the least the following major areas currently
present in Scipy that I am aware of:

1) Statistical distributions: Josef has vastly improved this.
2) Uni- and multi-variate kernel density estimation - currently only
   Gaussian is available.
3) Basic statistical functions - available for standard and masked
   arrays, but inconsistently.
4) Model fitting aspects that integrate different code within Scipy
   (including Jonathan Taylor's model class - which is really impressive
   - and the Cookbook ols) to provide important functionality including
   general linear models, generalized linear models, and generalized
   additive models.

I would suggest that we develop some type of PEP structure as a starting
point for discussion, as well as using different threads to address
different areas and future directions. Therefore I have put together
something to address the basic statistical functions in a separate
thread.

Thanks
Bruce

From josef.pktd at gmail.com Thu Feb 26 17:09:14 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 26 Feb 2009 17:09:14 -0500
Subject: [SciPy-dev] Scipy workflow (and not tools).
In-Reply-To: <6ce0ac130902261240y3596278fo30693766a0194d5@mail.gmail.com>
References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <6ce0ac130902252321j6139238by634364acd2bd07b2@mail.gmail.com> <49A6D913.9040809@enthought.com> <6ce0ac130902261240y3596278fo30693766a0194d5@mail.gmail.com>
Message-ID: <1cd32cbb0902261409w7553e5dna54a9fca2eba09cd@mail.gmail.com>

> Contributing to SciPy is easy and anyone can do it. Here is what you do:
>
> * Create a branch using git/bzr/hg (we have to pick one).
> * Write your code.
> * Add tests and documentation to your code. > * Run the test suite > * Post your branch to github/launchpad/bitbucket > * Submit your branch for review using... > > When someone is eager to write code, this is what they need!!! For bug fixes and small changes to scipy and if someone wants to submit some tests, this is a pretty bit of work, if you don't have the infrastructure set up for it. For example to be able to upload to a branch on launchpad, I was struggling several hours with the ssl authorization which was screwed up in my bazar install. When I am looking and using some open source project and I want to report some problems and propose some fixes, I don't want to have to install a new revision control system such as git, create accounts at several websites, just to report some bugfixes as in the story of Alex. To require a complete rebuild of scipy and creating branches for changing several lines of code and adding some tests, seems a lot to demand for someone that doesn't already have the particular full development environment installed. So while the decentralized version control will help those developers that want to get more strongly involved, we shouldn't make it sound like that is the only way to contribute. On the other hand, it is clear that the more work it is for committing and reviewing developers the slower will be a response, especially for parts of scipy that doesn't have a "maintainer". Josef From robert.kern at gmail.com Thu Feb 26 17:12:31 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 26 Feb 2009 16:12:31 -0600 Subject: [SciPy-dev] Scipy workflow (and not tools). 
In-Reply-To: <1cd32cbb0902261409w7553e5dna54a9fca2eba09cd@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <6ce0ac130902252321j6139238by634364acd2bd07b2@mail.gmail.com> <49A6D913.9040809@enthought.com> <6ce0ac130902261240y3596278fo30693766a0194d5@mail.gmail.com> <1cd32cbb0902261409w7553e5dna54a9fca2eba09cd@mail.gmail.com> Message-ID: <3d375d730902261412h59787689mcec2e0a9376251fa@mail.gmail.com> On Thu, Feb 26, 2009 at 16:09, wrote: >> Contributing to SciPy is easy and anyone can do it. ?Here is what you do: >> >> * Create a branch using git/bzr/hg (we have to pick one). >> * Write your code. >> * Add tests and documentation to your code. >> * Run the test suite >> * Post your branch to github/launchpad/bitbucket >> * Submit your branch for review using... >> >> When someone is eager to write code, this is what they need!!! > > For bug fixes and small changes to scipy and if someone wants to > submit some tests, this is a pretty bit of work, if you don't have the > infrastructure set up for it. > > For example to be able to upload to a branch on launchpad, I was > struggling several hours with the ssl authorization which was screwed > up in my bazar install. > When I am looking and using some open source project and I want to > report some problems and propose some fixes, I don't want to have to > install a new revision control system such as git, create accounts at > several websites, just to report some bugfixes as in the story of > Alex. To require a complete rebuild of scipy and creating branches for > changing several lines of code and adding some tests, seems a lot to > demand for someone that doesn't already have the particular full > development environment installed. > > So while the decentralized version control will help those developers > that want to get more strongly involved, we shouldn't make it sound > like that is the only way to contribute. 
> On the other hand, it is clear that the more work it is for committing
> and reviewing developers the slower will be a response, especially for
> parts of scipy that doesn't have a "maintainer".

The lighterweight version is

* Check out the source.
* Write your code and as much docs and tests as you can.
* Use to upload your patch to the review site.

Making and publishing a branch is for people who do this regularly.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From bsouthey at gmail.com Thu Feb 26 17:26:39 2009
From: bsouthey at gmail.com (Bruce Southey)
Date: Thu, 26 Feb 2009 16:26:39 -0600
Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy
Message-ID: <49A7171F.5070500@gmail.com>

Hi,
I do apologize in advance if this is considered inappropriate but my
goal is to advance the stats capabilities in Scipy. I do recognize that
there is a large chunk of excellent work so part of this is simply to
ensure that there is adequate documentation and tests.

The following is my attempt at a PEP to provide some direction on how to
improve the basic statistical functions within Scipy. I do have a list
of the individual functions and the arguments involved but I decided it
was inappropriate to attach it here.

Probably the main aspect on which I would like feedback is whether or
not there should be a single interface to these basic statistical
functions.

Thanks
Bruce

PEP: Improving the basic statistical functions in Scipy
Authors: Bruce Southey
Created: 26-Feb-2009

Abstract
========
This current PEP is orientated towards addressing the fundamental
problems with the basic statistical functions in Scipy. The outcome is
to provide Scipy with a consistent, well-tested and documented set of
basic statistical functions that are available to different array types.

Motivation
========
This PEP addresses the basic statistical functions available in the
stats component of Scipy. These functions are defined in the following
files:

stats.py - Defines many statistical functions and imports statlib.
morestats.py - Adds additional statistical functions to stats.py
_support.py - Defines the functions used in stats.py but is also
    circular because it imports stats
mstats.py - Just imports functions from mstats_basic.py and
    mstats_extras.py
mstats_basic.py - Defines statistical functions for masked arrays
mstats_extras.py - Defines additional statistical functions for masked
    arrays

In total there are 178 unique functions defined in these files, some of
which are private or internal and some have the same name but are
defined slightly differently between standard and masked arrays. A list
of these functions is available.

While the functions are defined for standard arrays and masked arrays,
not all functions are available for both array types. For example, the
majority of functions defined in stats.py for standard arrays are
available in mstats_basic.py. But none of the standard array functions
defined in morestats.py are available for masked arrays. Also, none of
these functions are directly supported for other array types available
to Scipy, such as record arrays, record arrays that contain masked data,
and sparse arrays.

Specification
========
1) Provide the same basic statistical functions with the same arguments
for standard and masked arrays.
2) Utilize a single interface.
For example, the gmean function (note _chk_asarray is defined
differently):

stats.py:

def gmean(a, axis=0):
    a, axis = _chk_asarray(a, axis)
    log_a = np.log(a)
    return np.exp(log_a.mean(axis=axis))

mstats_basic.py:

def gmean(a, axis=0):
    a, axis = _chk_asarray(a, axis)
    log_a = ma.log(a)
    return ma.exp(log_a.mean(axis=axis))

Instead, a single function can be defined as:

def gmean(a, axis=0):
    log_a = np.log(a)
    return np.exp(log_a.mean(axis=axis))

import numpy as np
import numpy.ma as ma
X=[1,2,3,4,5]
a=np.array(X)
m=ma.array(X, mask=[0,0,0,0,0])
np.exp((np.log(X).mean())) #2.6051710846973517
np.exp((np.log(a).mean())) #2.6051710846973517
np.exp((np.log(m).mean())) #2.6051710846973517

3) Deprecation and removal of unnecessary functions such as linregress.
4) Clean up style issues including:
   a) White space usage
   b) Consistent arguments such as 'a' vs 'x' and the usage of *args
   c) Uniquely identifying functions.
      i) rootfunc and tempfunc are defined two and three times,
         respectively, in morestats.py but have different arguments.
      ii) makestr is defined twice in _support.py, once as a main
          function and once as a subfunction of printcc.
5) Ensure info.py is complete and correct.
6) Improve the documentation of basic statistical functions in
connection with the Scipy documentation Marathon
(http://www.scipy.org/Developer_Zone/DocMarathon2008)
7) Improve the tests of the basic statistical functions:
   i) All functions should at least have basic test coverage that
      indicates whether or not they are functional.
   ii) Important functions should have tests that include unexpected
       elements like NaNs, positive and negative infinity and other
       unexpected inputs.
   iii) Ideally there should be tests that check the function accuracy.
8) Extension of the functions to other array types available to Scipy,
such as record arrays, record arrays that contain masked data, and
sparse arrays. Perhaps beyond the scope of this PEP.
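
To illustrate item 2 with a mask that actually hides a value (the example above uses an all-zero mask), here is a minimal sketch. It assumes only NumPy and relies on NumPy ufuncs such as np.log carrying the mask of a masked array through to the result; the extra observation of 100 is made up for the illustration.

```python
import numpy as np
import numpy.ma as ma


def gmean(a, axis=0):
    """Geometric mean along `axis`; masked entries, if any, are skipped."""
    log_a = np.log(a)  # for a masked array the result keeps the mask
    return np.exp(log_a.mean(axis=axis))


values = [1, 2, 3, 4, 5]
plain = gmean(np.array(values))

# Same data plus one extra observation that is masked out.
masked = gmean(ma.array(values + [100], mask=[0, 0, 0, 0, 0, 1]))

print(plain, masked)
```

Both calls go through the same code path, and the masked observation does not contribute to the second result. Whether this holds for every function in stats.py would still need checking case by case.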

Backwards Compatibility
========
There is no guarantee that the outcome will maintain complete backwards
compatibility because a consistent API is required across different
array types. However, any changes to existing APIs must be justified,
such as ensuring the same keywords between functions for different
array types.

From gael.varoquaux at normalesup.org Thu Feb 26 17:27:00 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Thu, 26 Feb 2009 23:27:00 +0100
Subject: [SciPy-dev] Updating and improving the statistical capabilities in Scipy
In-Reply-To: <49A71184.2010407@gmail.com>
References: <49A71184.2010407@gmail.com>
Message-ID: <20090226222700.GC23810@phare.normalesup.org>

On Thu, Feb 26, 2009 at 04:02:44PM -0600, Bruce Southey wrote:
> I would suggest that we develop some type of PEP structure as starting
> point for discussion as well as using different threads to address
> different areas as well as future directions.

My 2 cents: Find a place that seems important to you, and that you
believe you can fix or improve, and do what you believe is right, and
propose patches or discussions around these patches.

The reason I say this is that I noticed that I was thrilled by your
e-mail, and immediately carrying on moving it outside of my mailbox
without replying to it. We are a bunch of busy folks, and you might not
get anywhere with large-scope discussions, although people are
enthusiastic about your ideas.
:( Ga?l From lists_ravi at lavabit.com Thu Feb 26 17:33:43 2009 From: lists_ravi at lavabit.com (Ravi) Date: Thu, 26 Feb 2009 17:33:43 -0500 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <49A6B7E8.5010105@creativetrax.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A6B7E8.5010105@creativetrax.com> Message-ID: <200902261733.43451.lists_ravi@lavabit.com> On Thursday 26 February 2009 10:40:24 jason-sage at creativetrax.com wrote: > > 2) While I understand the desire to increase the quality of commits to ? > > scipy by putting in a more formal process, like making sure code is ? > > reviewed, tests are present, and documentation is provided, I too, ? > > like Travis, worry that this may inhibit many useful contributions. ? > > Rather than act as a barrier, why not just have some sort of "seal of ? > > approval" for things that have gone through that process. > > Lots of projects have -stable and -dev branches. Please, NO! This does not scale. This is the process used in Boost and makes the release process a nightmare. Boost svn commit access is not easily gifted; virtually every library in Boost goes through extensive review and its authors are generally way-above-average programmers. In spite of all the preceding, getting a Boost release out is a superhuman effort: - tracking commits between branches is a full-time job - the -dev branch becomes "the wild west" very easily (slippery slope) - if no feature freeze, bug fixes get done even slower What you want, in my humble opinion, is what the KDE people call "always summer in the trunk". There really is only one scalable way to manage this: a DVCS. Instead of using trunk as a playground, use the "trunk" in your clone ("master" in git). So long as your clone is published somewhere (anywhere on the web, for instance), any authorized committer into the main repository can pull it. 
There is *exactly* one more step compared to using svn trunk, viz.,
sending a mail out to the mailing list indicating the change. But then,
you were going to do that anyway, weren't you?

The best example I can think of is Xorg. See http://cgit.freedesktop.org
for a list of everyone's local repositories. (Of course, scipy may
choose bzr over git, but the point still stands.) Let's say "airlied"
just finished implementing the feature "drm" for the submodule "radeon
r6xx". *His workflow is exactly the same as svn*:

  git clone ...                           = svn co / svn up
  # make changes
  git commit -a -m "drm rework finished"  = svn commit
  # send mail to the mailing list

Any interested party in the mailing list reviews the code and pulls it
into the main repository. Note that "airlied" does no more work with the
DVCS than with svn; the workflow for "airlied" has not changed other
than the command substitution above.

The above is actually a true story. Of course, now note that instead of
"airlied" (a sanctified committer), it could be you and anyone
interested (core developer or not) can simply pull from you and test it.
This lowers the barriers to newbies, in my limited experience.

Regards,
Ravi

From gael.varoquaux at normalesup.org Thu Feb 26 17:34:34 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Thu, 26 Feb 2009 23:34:34 +0100
Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy
In-Reply-To: <49A7171F.5070500@gmail.com>
References: <49A7171F.5070500@gmail.com>
Message-ID: <20090226223434.GD23810@phare.normalesup.org>

On Thu, Feb 26, 2009 at 04:26:39PM -0600, Bruce Southey wrote:
> 1) Provide the same basic statistical functions with the same arguments
> for standard and masked arrays.

Sounds great.

> 4) Cleanup styles issues including:
> a) White space usage
> b) Consistent arguments such as 'a' vs 'x' and the usage of *args
> c) Uniquely identifying functions.
> i) rootfunc and tempfunc defined two and three times, respectively, in > morestats.py but have different arguments. > ii) makestr is defined twice in _support.py, once a main function and > once as a subfunction of printcc. Excellent. > 5) Ensure info.py is complete and correct. Great. > 6) Improve the documentation of basic statistical functions in > connection with the Scipy documentation Marathon > (http://www.scipy.org/Developer_Zone/DocMarathon2008) Go for it! > 7) Improve the tests of the basic statistical functions: > i) All functions should have at least have basic test coverage that > indicates whether or not it is functional. > ii) Important functions should have tests that include unexpected > elements like Nan's, positive and negative infinity and other unexpected > inputs. > iii) Ideally there should be tests that check the function accuracy. Good. Hell, I can't say more. If you can get just half of what you have listed up there, it would be fantastic. I'll try to review your work (which I expect you will be putting up on a code review site, as discussed previously), but I don't promise anything: it can be hard for me to find time to sit down and do something serious on top of my current workload. Cheers, Ga?l From pav at iki.fi Thu Feb 26 17:36:57 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 26 Feb 2009 22:36:57 +0000 (UTC) Subject: [SciPy-dev] Scipy workflow (and not tools). References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <6ce0ac130902252321j6139238by634364acd2bd07b2@mail.gmail.com> <49A6D913.9040809@enthought.com> <6ce0ac130902261240y3596278fo30693766a0194d5@mail.gmail.com> <9457e7c80902261307m1d4cdedckc7df633763a9b29d@mail.gmail.com> Message-ID: Thu, 26 Feb 2009 23:07:14 +0200, St?fan van der Walt wrote: > 2009/2/26 Brian Granger : >> Contributing to SciPy is easy and anyone can do it. ?Here is what you >> do: >> >> * Create a branch using git/bzr/hg (we have to pick one). * Write your >> code. 
>> * Add tests and documentation to your code. * Run the test suite >> * Post your branch to github/launchpad/bitbucket * Submit your branch >> for review using... > > I'll have a first draft of a developer's guide ready in the morning. If you want to discuss Git, you can probably steal from here: http://scipy.org/scipy/numpy/wiki/GitMirror -- Pauli Virtanen From stefan at sun.ac.za Thu Feb 26 17:45:16 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 27 Feb 2009 00:45:16 +0200 Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy In-Reply-To: <49A7171F.5070500@gmail.com> References: <49A7171F.5070500@gmail.com> Message-ID: <9457e7c80902261445o7b556713hb7e2bc4e89649f00@mail.gmail.com> Hi Bruce 2009/2/27 Bruce Southey : > The following is my attempt of a PEP to provide some direction on how to > improve the basic statistical functions within Scipy. I do have a list > of the individual functions and the arguments involved but I decided it > was in appropriate to attach it here. Thank you for all the thoughtful suggestions. I think some of these issues can already be turned into tickets, i.e. API inconsistencies, missing test coverage, broken docs, etc. It might be useful to do so, so that we know what needs to be done next. Regards St?fan From ellisonbg.net at gmail.com Thu Feb 26 17:46:31 2009 From: ellisonbg.net at gmail.com (Brian Granger) Date: Thu, 26 Feb 2009 14:46:31 -0800 Subject: [SciPy-dev] Scipy workflow (and not tools). 
In-Reply-To: <3d375d730902261412h59787689mcec2e0a9376251fa@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <6ce0ac130902252321j6139238by634364acd2bd07b2@mail.gmail.com> <49A6D913.9040809@enthought.com> <6ce0ac130902261240y3596278fo30693766a0194d5@mail.gmail.com> <1cd32cbb0902261409w7553e5dna54a9fca2eba09cd@mail.gmail.com> <3d375d730902261412h59787689mcec2e0a9376251fa@mail.gmail.com> Message-ID: <6ce0ac130902261446k2691e60al182f876a193911aa@mail.gmail.com> > The lighterweight version is > > * Check out the source. > * Write your code and as much docs and tests as you can. > * Use to upload your patch to the review site. > > Making and publishing a branch is for people who do this regularly. Yes, the workflow should support people who want to write code but don't know how to or want to make and publish a branch. But I think we *should* encourage people to make branches (if a DVCS is used) simply because it gives them local version control and allows more flexibility for how the patches are created (for example rebasing). Brian From stefan at sun.ac.za Thu Feb 26 17:48:00 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 27 Feb 2009 00:48:00 +0200 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <6ce0ac130902252321j6139238by634364acd2bd07b2@mail.gmail.com> <49A6D913.9040809@enthought.com> <6ce0ac130902261240y3596278fo30693766a0194d5@mail.gmail.com> <9457e7c80902261307m1d4cdedckc7df633763a9b29d@mail.gmail.com> Message-ID: <9457e7c80902261448y2719b96bg8225a0717969eedd@mail.gmail.com> 2009/2/27 Pauli Virtanen : > If you want to discuss Git, you can probably steal from here: > > ? ? ? ?http://scipy.org/scipy/numpy/wiki/GitMirror Ah, yes, good reminder! Could you give me a quick rundown of why you used --mirror earlier on when adding the remote? Thanks! 
St?fan From pav at iki.fi Thu Feb 26 18:21:13 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 26 Feb 2009 23:21:13 +0000 (UTC) Subject: [SciPy-dev] The future of SciPy and its development infrastructure References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> Message-ID: Thu, 26 Feb 2009 09:58:03 -0600, Travis E. Oliphant wrote: [clip] > The harm is the effort to do it. Interacting with a web-page is > slower than svn commit. This extra step in the process does make a > difference when you are time-crunched. > >> The problem with reviewing code after commit in trunk is that it takes >> more effort to correct or ask about dubious points. > > I disagree with this statement. Why does it take more effort than > reviewing code on the trunk? You can do an svn diff to get the code > changes, and do the review exactly as you could with any other tool. I partly agree with this: if you want to immediately fix something "wrong" in the suggested change yourself, a web-based code review tool gets in the way. (But I haven't yet tried how well the command-line tool would work...) But if you want to suggest some changes to the author of the commit, or ask some specifics, the code review tool works fairly well as a communication tool. Less hassle than commenting on a mailing list, and more organized. I note that Github offers a similar "remarks-in-commits" feature. 
Could be worth a try to check how this works in practice.

> One way to see this is the difference between asking for permission or
> asking for forgiveness. Both have their place in social activities,
> but we shouldn't institutionalize one over the other.

Another social aspect is that asking for permission has much more
positive connotations than asking for forgiveness.

Anyway, difficult to tell how a mandatory review policy would affect
Scipy's development. I'd be reluctant to jump headfirst to requiring it,
without experimenting, even if Sage and Sympy have had good experience
about it. Nevertheless, I'm going to try to change my own workflow and
see what happens.

[clip]
>> I've found git-svn quite good for maintaining topic branches. It can
>> switch easily between them using the same working tree, so that
>> compiles are fast, and editor just needs M-x revert-buffer.
>
> Thanks for the tip. At some point I may be able to invest some time in
> learning about git-svn. How do you switch between branches using
> git-svn. With svn it's svn switch
> http://some-name-I-always-have-to-look-up-and-takes-time.

Check what branches you have

    git branch

Check what branches other people have

    git branch -r

Switch working tree to a different branch

    git checkout BRANCHNAME

And it's fast. I think here Git beats SVN, Bzr and Mercurial in ease of
use. (Mercurial does have some similar features, but IMO they are a bit
less mature.)

--
Pauli Virtanen

From pav at iki.fi Thu Feb 26 18:24:18 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Thu, 26 Feb 2009 23:24:18 +0000 (UTC)
Subject: [SciPy-dev] RFR 503, 849: more robust implementation of real Bessel I_v
References:
Message-ID:

Thu, 26 Feb 2009 03:03:58 +0000, Pauli Virtanen wrote:
> (For trying out the code review tool...)
>
> Scipy bugs #503 and #849 are due to a non-robust implementation of
> Bessel I function in Cephes.
The following changes address this, by > using an implementation from the Boost library, converted to C: > > http://codereview.appspot.com/20078 Urgh, the codereview app does not add any distinguishing headers to the mails it sends for each comment. Does someone know if something can be done to this? (Meanwhile, my .procmailrc says :0B which is mildly funny at this time of the day.) -- Pauli Virtanen From oliphant at enthought.com Thu Feb 26 18:44:22 2009 From: oliphant at enthought.com (Travis E. Oliphant) Date: Thu, 26 Feb 2009 17:44:22 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <49A6FA6D.9030802@creativetrax.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> <3d375d730902260925x332bae2mef6bc9c0907b5918@mail.gmail.com> <49A6FA6D.9030802@creativetrax.com> Message-ID: <49A72956.2090104@enthought.com> jason-sage at creativetrax.com wrote: > Charles R Harris wrote: > >>> Because looking at a web page is easier, I've found. The communicating >>> that happens afterwards is also easier. >>> >>> Please, *try* it for a month. I believe that you are speaking from >>> ignorance. >>> >>> >>> >> I think a brief howto somewhere would help. It isn't easy to keep up with >> new tools and learn new habits without some help. >> >> >> > > +1 to some way of having the reviews easily accessible. I just looked > at a review that Charles did and already learned some things about > conventions in scipy, just based on the line-by-line commenting he did > on a review. > This is a valuable aspect of the review process that I had not considered... 
I'm actually at the point where I'm willing to try it as an experiment. My biggest concern is having a bunch of code sitting in a queue, neither reviewed nor committed --- or the review process becoming too onerous and code not making it through because of what I would consider to be "ticky-tacky technicalities." If the process brings more people to the project, then it can ameliorate the first concern entirely, but possibly escalate the second.

But, Robert's penchant for experimentation is charming me into doing something different and actually trying it out. So, who's going to show me what to actually do? There's no rush; I won't be able to push the interpolate stuff out until next week at the earliest.

-Travis

From josef.pktd at gmail.com Thu Feb 26 18:47:08 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 26 Feb 2009 18:47:08 -0500
Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy
In-Reply-To: <49A7171F.5070500@gmail.com>
References: <49A7171F.5070500@gmail.com>
Message-ID: <1cd32cbb0902261547v43de301du7199b3bb7af26c47@mail.gmail.com>

I think a discussion of a roadmap for stats will be very useful.

Currently my priority is still your point

7 iii) Ideally there should be tests that check the function accuracy.

I consider this the main point of almost all my work on stats. And there are still some incorrect parts left.

The next step I have in mind for the current code base is to evaluate whether functions are ok, can be generalized (e.g. in dimension), or are trivial and should be removed.

Next are changes in the interface and combining or comparing mstats and stats. Here, I don't have a clear opinion yet of how far we can or want to consistently generalize all statistical functions to the different types of arrays. In many cases I looked at, the masked array version looked sufficiently different that I would be reluctant to merge them.
One radical alternative would be to deprecate stats.stats and expand mstats, since it is already better designed to handle different array types. But I like the "simple" versions in stats, and I'm curious about any speed difference. General tools to interface with different array types would still be useful and should be carefully designed, e.g. functions like ols that have a plain ndarray core but can access the data from structured arrays and masked arrays.

After the changes to the current statistical functions, I was considering areas of statistics that have only partial coverage. Non-parametric tests are well represented, and I have some extensions for tests for discrete distributions. I think ANOVA, which I have never used myself, has a very incomplete collection, which I guess is a historical accident, since Gary Strangman had, I think, more ANOVA functions that are not included in stats. So instead of having a laundry list of functions (some of which don't seem to have been used for years), I would prefer at least a conceptual grouping around statistical topics. Regression, of course, is currently MIA.

The next large interface issue, especially for enhancements, is whether to use functions or proper classes. I think for some statistical analyses the current statistical functions, once cleaned up, work fine. However, even R returns result classes (or whatever their equivalent is) for every statistical test, while in Python we use Matlab-style functions. This will change when models are included again.

I have a list of functions that have no test coverage and a list (not written down) of functions with suspected or known bugs, and it would be useful to get a wider opinion about which functions and interfaces are important. Working on the list of functions on the wiki page may be simpler for collecting comments than going through the statistical review in trac.
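To make the functions-versus-result-classes question above a little more concrete, here is a minimal sketch using only the Python standard library. The names TTestResult and ttest_1samp_sketch are hypothetical and are not existing scipy.stats API; no p-value is computed, since that would need the t distribution's CDF. The only point is that a small result type can keep tuple-style unpacking working while also giving named access.

```python
import math
import statistics
from collections import namedtuple

# Hypothetical result type (not an existing scipy.stats name). Being a
# namedtuple, it still unpacks like a plain tuple of return values.
TTestResult = namedtuple("TTestResult", ["statistic", "df", "mean", "stderr"])


def ttest_1samp_sketch(sample, popmean):
    """Return the one-sample t statistic of `sample` against `popmean`.

    No p-value is computed; that would need the t distribution's CDF.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.mean(sample)
    # Standard error of the mean, using the sample standard deviation.
    stderr = statistics.stdev(sample) / math.sqrt(n)
    if stderr == 0:
        raise ValueError("sample has zero variance")
    return TTestResult((mean - popmean) / stderr, n - 1, mean, stderr)


res = ttest_1samp_sketch([2.0, 4.0, 6.0], 3.0)
t, df = res[:2]           # function-style tuple unpacking still works
by_name = res.statistic   # attribute access on the same object
```

Whether something like this is preferable to full classes with methods is exactly the open design question; the sketch only shows that the two calling styles need not exclude each other.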
Overall, I think there is still a lot of work to do before I start to worry about white space issues. Josef From oliphant at enthought.com Thu Feb 26 18:52:09 2009 From: oliphant at enthought.com (Travis E. Oliphant) Date: Thu, 26 Feb 2009 17:52:09 -0600 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <9457e7c80902230804u32283340q16cb71bac179cebf@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> Message-ID: <49A72B29.7040900@enthought.com> Pauli Virtanen wrote: > Thu, 26 Feb 2009 09:58:03 -0600, Travis E. Oliphant wrote: > [clip] > > > Anyway, difficult to tell how a mandatory review policy would affect > Scipy's development. I'd be reluctant to jump headfirst to requiring it, > without experimenting, even if Sage and Sympy have had good experience > about it. Nevertheless, I'm going to try to change my own workflow and > see what happens. > I'm willing to do this as well. I love seeing the enthusiasm behind getting a description of how to contribute to SciPy up on the wiki and the effort put in to testing DVCS. That alone is worth encouraging if I can by switching my own workflow. 
-Travis

From robert.kern at gmail.com Thu Feb 26 18:57:36 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 26 Feb 2009 17:57:36 -0600
Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy
In-Reply-To: <1cd32cbb0902261547v43de301du7199b3bb7af26c47@mail.gmail.com>
References: <49A7171F.5070500@gmail.com> <1cd32cbb0902261547v43de301du7199b3bb7af26c47@mail.gmail.com>
Message-ID: <3d375d730902261557h4e73a4d6kd332623c3e5860a9@mail.gmail.com>

On Thu, Feb 26, 2009 at 17:47, wrote:
> After, the changes to the current statistical function, I was
> considering areas of statistics that have partial but incomplete
> coverage. Non-parametric tests are well represented, and I have some
> extension for tests for discrete distributions. I think ANOVA, which I
> never used myself, has a very incomplete collection, which, I guess is
> a historical accident since Gary Strangman had, I think more ANOVA
> functions that are not included in stats.

It's no accident. I removed it. It was a big monster function. Gary had a big disclaimer on it saying that it basically worked for the use case he had at the time, but it was far from a good general implementation. It printed things without being asked. No one knew if it actually worked. Gary removed it from later versions of his code, etc. Some of us decided that it was better for someone with an interest in ANOVA to reimplement it from scratch.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco

From pav at iki.fi Thu Feb 26 19:08:01 2009
From: pav at iki.fi (Pauli Virtanen)
Date: Fri, 27 Feb 2009 00:08:01 +0000 (UTC)
Subject: [SciPy-dev] Scipy workflow (and not tools).
References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <6ce0ac130902252321j6139238by634364acd2bd07b2@mail.gmail.com> <49A6D913.9040809@enthought.com> <6ce0ac130902261240y3596278fo30693766a0194d5@mail.gmail.com> <9457e7c80902261307m1d4cdedckc7df633763a9b29d@mail.gmail.com> <9457e7c80902261448y2719b96bg8225a0717969eedd@mail.gmail.com>
Message-ID:

Fri, 27 Feb 2009 00:48:00 +0200, Stéfan van der Walt wrote:
> 2009/2/27 Pauli Virtanen :
>> If you want to discuss Git, you can probably steal from here:
>>
>>        http://scipy.org/scipy/numpy/wiki/GitMirror
>
> Ah, yes, good reminder!
>
> Could you give me a quick rundown of why you used --mirror earlier on
> when adding the remote?

The --mirror option adds

        fetch = +refs/*:refs/*
        mirror = yes

to [remote "origin"]. So one wouldn't need to edit .git/config manually.

However, the --mirror has another effect which I missed earlier: it makes the remote consider all heads its own, so that "git remote prune origin" would drop all branches, including local ones. Similar issue with "git fetch". So I think it's not the correct solution.

    ***

But all of that is moot now. I finally figured out that I must push to the mirror with

        git push git@github.com:pv/numpy-svn.git \
                +refs/remotes/*:refs/heads/* +master

Then it can be cloned simply with

        git clone --origin svn git://github.com/pv/scipy-svn.git

And "--origin svn" only because we want svn/trunk instead of origin/trunk. Also git-svn can be activated:

        git svn init -s --prefix=svn/ http://svn.scipy.org/svn/scipy
        git svn rebase -l

And as a bonus, the SVN branches are visible on Github!

I'll update the GitMirror page.
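As an illustration of what the refspec in the push command above does, here is a minimal Python sketch of how a pattern such as +refs/remotes/*:refs/heads/* rewrites ref names. The helper names parse_refspec and map_ref and the sample ref names are made up for this example, the matching is simplified to a single * wildcard, and short names such as "master" are left unexpanded; this is not Git's own implementation.

```python
def parse_refspec(refspec):
    """Split '+src:dst' into (force, src, dst).

    A leading '+' requests a forced (non-fast-forward) update. If no ':'
    is present, the destination is taken to be the same as the source.
    """
    force = refspec.startswith("+")
    if force:
        refspec = refspec[1:]
    src, sep, dst = refspec.partition(":")
    if not sep:
        dst = src
    return force, src, dst


def map_ref(refspec, ref):
    """Return the destination name for `ref`, or None if it does not match."""
    _, src, dst = parse_refspec(refspec)
    if "*" not in src:
        return dst if ref == src else None
    prefix, _, suffix = src.partition("*")
    if len(ref) < len(prefix) + len(suffix):
        return None
    if not (ref.startswith(prefix) and ref.endswith(suffix)):
        return None
    # The part matched by '*' is substituted into the destination pattern.
    middle = ref[len(prefix):len(ref) - len(suffix)]
    return dst.replace("*", middle, 1)


# Sample ref names, chosen for illustration only.
for name in ["refs/remotes/trunk", "refs/remotes/tags/0.7.0", "refs/heads/master"]:
    print(name, "->", map_ref("+refs/remotes/*:refs/heads/*", name))
```

Under these assumptions, every remote-tracking ref created by git-svn ends up as an ordinary branch on the mirror, which is consistent with the SVN branches becoming visible on Github.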
-- Pauli Virtanen From cournape at gmail.com Thu Feb 26 22:11:32 2009 From: cournape at gmail.com (David Cournapeau) Date: Fri, 27 Feb 2009 12:11:32 +0900 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> Message-ID: <5b8d13220902261911i6b4de9bh5b7b1ebfcca66921@mail.gmail.com> On Fri, Feb 27, 2009 at 8:21 AM, Pauli Virtanen wrote: > Thu, 26 Feb 2009 09:58:03 -0600, Travis E. Oliphant wrote: > [clip] >> The harm is the effort to do it. ? ?Interacting with a web-page is >> slower than svn commit. ? This extra step in the process does make a >> difference when you are time-crunched. >> >>> The problem with reviewing code after commit in trunk is that it takes >>> more effort to correct or ask about dubious points. >> >> I disagree with this statement. ? Why does it take more effort than >> reviewing code on the trunk? ? ?You can do an svn diff to get the code >> changes, and do the review exactly as you could with any other tool. > > I partly agree with this: if you want to immediately fix something > "wrong" in the suggested change yourself, a web-based code review tool > gets in the way. (But I haven't yet tried how well the command-line tool > would work...) > > But if you want to suggest some changes to the author of the commit, or > ask some specifics, the code review tool works fairly well as a > communication tool. Less hassle than commenting on a mailing list, and > more organized. > > I note that Github offers a similar "remarks-in-commits" feature. Could > be worth a try to check how this works in practice. 
> >> One way to see this is the difference between asking for permission or >> asking for forgiveness. Both have their place in social activities, >> but we shouldn't institutionalize one over the other. > > Another social aspect is that asking for permission has much more > positive connotations than asking for forgiveness. > > Anyway, difficult to tell how a mandatory review policy would affect > Scipy's development. I'd be reluctant to jump headfirst to requiring it, > without experimenting, even if Sage and Sympy have had good experience > about it. Nevertheless, I'm going to try to change my own workflow and > see what happens. > > [clip] >>> I've found git-svn quite good for maintaining topic branches. It can >>> switch easily between them using the same working tree, so that >>> compiles are fast, and editor just needs M-x revert-buffer. >> >> Thanks for the tip. At some point I may be able to invest some time in >> learning about git-svn. How do you switch between branches using >> git-svn. ?With svn it's svn switch >> http://some-name-I-always-have-to-look-up-and-takes-time. > > Check what branches you have > > ? ? ? ?git branch > > Check what branches other people have > > ? ? ? ?git branch -r > > Switch working tree to a different branch > > ? ? ? ?git checkout BRANCHNAME > > And it's fast. I think here Git beats SVN, Bzr and Mercurial in ease of > use. (Mercurial does have some similar features, but IMO they are a bit > less mature.) For me, that's one of the big killer feature of git. Switching branches is really cheap, both in time and in terms of command lines. When you add the ability to compare branches between them, git just blows away bzr. For example, when I want to get an idea of the development between two branches, I can do: git diff branch1..branch2 # --stat option is useful too to get a global view git log branch1..branch2 In scipy, both of those takes less than a second, even for thousand of commits of difference. 
For release, or to make sure I merge what I think I am merging, this is very helpful. You just never do it with svn, because it is so slow (takes minutes) and the syntax to compare branches is awful. This alone is one of the reasons why I vastly prefer git to bzr, too.

cheers,

David

From ondrej at certik.cz Thu Feb 26 22:20:35 2009
From: ondrej at certik.cz (Ondrej Certik)
Date: Thu, 26 Feb 2009 19:20:35 -0800
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To: <5b8d13220902261911i6b4de9bh5b7b1ebfcca66921@mail.gmail.com>
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> <5b8d13220902261911i6b4de9bh5b7b1ebfcca66921@mail.gmail.com>
Message-ID: <85b5c3130902261920q7336765cs14ef2ae743496f42@mail.gmail.com>

On Thu, Feb 26, 2009 at 7:11 PM, David Cournapeau wrote:
> On Fri, Feb 27, 2009 at 8:21 AM, Pauli Virtanen wrote:
>> Thu, 26 Feb 2009 09:58:03 -0600, Travis E. Oliphant wrote:
>> [clip]
>>> The harm is the effort to do it.  Interacting with a web-page is
>>> slower than svn commit.  This extra step in the process does make a
>>> difference when you are time-crunched.
>>>
>>>> The problem with reviewing code after commit in trunk is that it takes
>>>> more effort to correct or ask about dubious points.
>>>
>>> I disagree with this statement.  Why does it take more effort than
>>> reviewing code on the trunk?  You can do an svn diff to get the code
>>> changes, and do the review exactly as you could with any other tool.
>>
>> I partly agree with this: if you want to immediately fix something
>> "wrong" in the suggested change yourself, a web-based code review tool
>> gets in the way.
(But I haven't yet tried how well the command-line tool >> would work...) >> >> But if you want to suggest some changes to the author of the commit, or >> ask some specifics, the code review tool works fairly well as a >> communication tool. Less hassle than commenting on a mailing list, and >> more organized. >> >> I note that Github offers a similar "remarks-in-commits" feature. Could >> be worth a try to check how this works in practice. >> >>> One way to see this is the difference between asking for permission or >>> asking for forgiveness. Both have their place in social activities, >>> but we shouldn't institutionalize one over the other. >> >> Another social aspect is that asking for permission has much more >> positive connotations than asking for forgiveness. >> >> Anyway, difficult to tell how a mandatory review policy would affect >> Scipy's development. I'd be reluctant to jump headfirst to requiring it, >> without experimenting, even if Sage and Sympy have had good experience >> about it. Nevertheless, I'm going to try to change my own workflow and >> see what happens. >> >> [clip] >>>> I've found git-svn quite good for maintaining topic branches. It can >>>> switch easily between them using the same working tree, so that >>>> compiles are fast, and editor just needs M-x revert-buffer. >>> >>> Thanks for the tip. At some point I may be able to invest some time in >>> learning about git-svn. How do you switch between branches using >>> git-svn. ?With svn it's svn switch >>> http://some-name-I-always-have-to-look-up-and-takes-time. >> >> Check what branches you have >> >> ? ? ? ?git branch >> >> Check what branches other people have >> >> ? ? ? ?git branch -r >> >> Switch working tree to a different branch >> >> ? ? ? ?git checkout BRANCHNAME >> >> And it's fast. I think here Git beats SVN, Bzr and Mercurial in ease of >> use. (Mercurial does have some similar features, but IMO they are a bit >> less mature.) 
> > For me, that's one of the big killer feature of git. Switching > branches is really cheap, both in time and in terms of command lines. > When you add the ability to compare branches between them, git just > blows away bzr. > > For example, when I want to get an idea of the development between two > branches, I can do: > > git diff branch1..branch2 # --stat option is useful too to get a global view > git log branch1..branch2 > > In scipy, both of those takes less than a second, even for thousand of > commits of difference. For release, of to make sure I merge what I > think I am merging, this is very helpful. You just never do it with > svn, because it is so slow (takes minutes) and the syntax to compare > branches is awful. This alone is one of the reason why I vastly prefer > git to bzr, too. Exactly, and also git cherry-pick, to pickup some particular patches from the other branch (e.g. some fixes etc.). Those are things that really rock and once you get used to it, you never want to come back. Ondrej From cournape at gmail.com Thu Feb 26 23:23:30 2009 From: cournape at gmail.com (David Cournapeau) Date: Fri, 27 Feb 2009 13:23:30 +0900 Subject: [SciPy-dev] Scipy workflow (and not tools). In-Reply-To: References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <6ce0ac130902252321j6139238by634364acd2bd07b2@mail.gmail.com> <49A6D913.9040809@enthought.com> <6ce0ac130902261240y3596278fo30693766a0194d5@mail.gmail.com> <9457e7c80902261307m1d4cdedckc7df633763a9b29d@mail.gmail.com> <9457e7c80902261448y2719b96bg8225a0717969eedd@mail.gmail.com> Message-ID: <5b8d13220902262023i675a4bc4ra8bf981267fb2156@mail.gmail.com> On Fri, Feb 27, 2009 at 9:08 AM, Pauli Virtanen wrote: > Fri, 27 Feb 2009 00:48:00 +0200, St?fan van der Walt wrote: > >> 2009/2/27 Pauli Virtanen : >>> If you want to discuss Git, you can probably steal from here: >>> >>> ? ? ? ?http://scipy.org/scipy/numpy/wiki/GitMirror >> >> Ah, yes, good reminder! 
>> >> Could you give me a quick rundown of why you used --mirror earlier on >> when adding the remote? > > The --mirror option adds > > ? ? ? ?fetch = +refs/*:refs/* > ? ? ? ?mirror = yes > > to [remote "origin"]. So one wouldn't need to edit .git/config manually. > > However, the --mirror has another effect which I missed earlier: it makes > the remote consider all heads its own, so that "git remote prune origin" > would drop all branches, including local ones. Similar issue with > "git fetch". So I think it's not the correct solution. > > ? ?*** > > But all of that is moot now. I finally figured out that I must push to > the mirror with > > ? ? ? ?git push git at github.com:pv/numpy-svn.git \ > ? ? ? ? ? ? ? ?+refs/remotes/*:refs/heads/* +master > > Then it can be cloned simply with > > ? ? ? ?git clone --origin svn git://github.com/pv/scipy-svn.git > > And "--origin svn" only because we want svn/trunk instead of origin/trunk. > Also git-svn can be activated: > > ? ? ? ?git svn init -s --prefix=svn/ http://svn.scipy.org/svn/scipy > ? ? ? ?git svn rebase -l > > And as a bonus, the SVN branches are visible on Github! Ah, nice, I did not find a way to do this - I used a dirty script to get local branches and update them instead. One thing which is still annoying is that tags are considered as branches - but I guess there is no way around it, since svn does not have any tag concept. David From ondrej at certik.cz Thu Feb 26 23:54:21 2009 From: ondrej at certik.cz (Ondrej Certik) Date: Thu, 26 Feb 2009 20:54:21 -0800 Subject: [SciPy-dev] Scipy workflow (and not tools). 
In-Reply-To: <5b8d13220902262023i675a4bc4ra8bf981267fb2156@mail.gmail.com> References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <6ce0ac130902252321j6139238by634364acd2bd07b2@mail.gmail.com> <49A6D913.9040809@enthought.com> <6ce0ac130902261240y3596278fo30693766a0194d5@mail.gmail.com> <9457e7c80902261307m1d4cdedckc7df633763a9b29d@mail.gmail.com> <9457e7c80902261448y2719b96bg8225a0717969eedd@mail.gmail.com> <5b8d13220902262023i675a4bc4ra8bf981267fb2156@mail.gmail.com> Message-ID: <85b5c3130902262054i27fca37fq7e2d58c6cf06626a@mail.gmail.com> On Thu, Feb 26, 2009 at 8:23 PM, David Cournapeau wrote: > On Fri, Feb 27, 2009 at 9:08 AM, Pauli Virtanen wrote: >> Fri, 27 Feb 2009 00:48:00 +0200, St?fan van der Walt wrote: >> >>> 2009/2/27 Pauli Virtanen : >>>> If you want to discuss Git, you can probably steal from here: >>>> >>>> ? ? ? ?http://scipy.org/scipy/numpy/wiki/GitMirror >>> >>> Ah, yes, good reminder! >>> >>> Could you give me a quick rundown of why you used --mirror earlier on >>> when adding the remote? >> >> The --mirror option adds >> >> ? ? ? ?fetch = +refs/*:refs/* >> ? ? ? ?mirror = yes >> >> to [remote "origin"]. So one wouldn't need to edit .git/config manually. >> >> However, the --mirror has another effect which I missed earlier: it makes >> the remote consider all heads its own, so that "git remote prune origin" >> would drop all branches, including local ones. Similar issue with >> "git fetch". So I think it's not the correct solution. >> >> ? ?*** >> >> But all of that is moot now. I finally figured out that I must push to >> the mirror with >> >> ? ? ? ?git push git at github.com:pv/numpy-svn.git \ >> ? ? ? ? ? ? ? ?+refs/remotes/*:refs/heads/* +master >> >> Then it can be cloned simply with >> >> ? ? ? ?git clone --origin svn git://github.com/pv/scipy-svn.git >> >> And "--origin svn" only because we want svn/trunk instead of origin/trunk. >> Also git-svn can be activated: >> >> ? ? ? 
git svn init -s --prefix=svn/ http://svn.scipy.org/svn/scipy
>>        git svn rebase -l
>>
>> And as a bonus, the SVN branches are visible on Github!
>
> Ah, nice, I did not find a way to do this - I used a dirty script to
> get local branches and update them instead. One thing which is still
> annoying is that tags are considered as branches - but I guess there
> is no way around it, since svn does not have any tag concept.

Btw, I guess you already know it, but if you need to clone the git repository (for example David's) and then you would like to update it using git-svn with the latest svn from scipy, here is the howto:

http://subtlegradient.com/articles/2008/04/22/cloning-a-git-svn-clone

E.g. basically:

    git svn init http://macromates.com/svn/Bundles/trunk/Bundles/Ruby.tmbundle -R svn
    cp .git/refs/remotes/origin/master .git/refs/remotes/git-svn
    git svn fetch

Ondrej

From stefan at sun.ac.za Fri Feb 27 00:37:46 2009
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Fri, 27 Feb 2009 07:37:46 +0200
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To: <49A72956.2090104@enthought.com>
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> <3d375d730902260925x332bae2mef6bc9c0907b5918@mail.gmail.com> <49A6FA6D.9030802@creativetrax.com> <49A72956.2090104@enthought.com> <9457e7c80902262137j69478ba9h910dcfac3949af26@mail.gmail.com>
Message-ID: <9457e7c80902262137j69478ba9h910dcfac3949af26@mail.gmail.com>

Hi Travis

2009/2/27 Travis E. Oliphant :
> My biggest concern is having a bunch of code sitting in a
> queue and not reviewed, nor committed --- or the review processes become
> too onerous and code not making it through because of what I would
> consider to be "ticky-tacky technicalities."

That's a very valid concern.
David and I are experimenting with different issue trackers and plugins for trac, to see how best to generate a "review pool". I.e., what I'd like to see is that, if you only have 5 minutes to work on SciPy in the evening, you can

a) Go to trac and click on "tickets for review"
b) Review a couple of tickets

or

a) Go to trac and click on "reviewed tickets"
b) Apply those patches

or

a) Go to trac and click on "unresolved issues"
b) Fix the bug
c) Upload the patch for review

Technically, (c) is a bit challenging. I note your concern that it would become difficult to check in, so what I would like is to have a script such as

    scipy-submit -t 212 -m "Do not deallocate memory after object disposal."

which then uploads the patch to the codereview site, and adds a link to ticket 212 with the commit message and review URL. All of this can be done via the web, but I'd prefer to have a CLI available.

Do you have any suggestions or further concerns?

Thanks
Stéfan

From stefan at sun.ac.za Fri Feb 27 00:42:52 2009
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Fri, 27 Feb 2009 07:42:52 +0200
Subject: [SciPy-dev] RFR 503, 849: more robust implementation of real Bessel I_v
In-Reply-To:
References:
Message-ID: <9457e7c80902262142w333024a4nd1a2e0afb8082231@mail.gmail.com>

2009/2/27 Pauli Virtanen :
> Urgh, the codereview app does not add any distinguishing headers to the
> mails it sends for each comment.

I'll ask on the rietveld list.
Cheers
Stéfan

From prabhu at aero.iitb.ac.in Fri Feb 27 00:52:55 2009
From: prabhu at aero.iitb.ac.in (Prabhu Ramachandran)
Date: Fri, 27 Feb 2009 11:22:55 +0530
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To: <9457e7c80902262137j69478ba9h910dcfac3949af26@mail.gmail.com>
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> <3d375d730902260925x332bae2mef6bc9c0907b5918@mail.gmail.com> <49A6FA6D.9030802@creativetrax.com> <49A72956.2090104@enthought.com> <9457e7c80902262137j69478ba9h910dcfac3949af26@mail.gmail.com>
Message-ID: <49A77FB7.7010406@aero.iitb.ac.in>

On 02/27/09 11:07, Stéfan van der Walt wrote:
> scipy-submit -t 212 -m "Do not deallocate memory after object disposal."
>
> which then uploads the patch to the codereview site, and adds a link
> to ticket 212 with the commit message and review URL. All of this can
> be done via the web, but I'd prefer to have a CLI available.
>
> Do you have any suggestions or further concerns?

I've not contributed anything in years to scipy but I have a practical problem that might be worth addressing eventually (others might be in a similar position) -- my entire network is firewalled and I can only access the web behind an authenticated http proxy. The firewall does allow ssh connections out though but that seems useless to access a git repository hosted on github say.

The git user guide does not mention the word proxy (google wasn't too much help either) and it would be nice if all the tools allowed people to use the workflow from behind a firewall without too much pain. This may or may not be possible right away and may be low priority but is worth keeping in mind.

Thanks.
prabhu

From oliphant at enthought.com Fri Feb 27 01:00:49 2009
From: oliphant at enthought.com (Travis E.
Oliphant)
Date: Fri, 27 Feb 2009 00:00:49 -0600
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To: <9457e7c80902262137j69478ba9h910dcfac3949af26@mail.gmail.com>
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> <3d375d730902260925x332bae2mef6bc9c0907b5918@mail.gmail.com> <49A6FA6D.9030802@creativetrax.com> <49A72956.2090104@enthought.com> <9457e7c80902262137j69478ba9h910dcfac3949af26@mail.gmail.com>
Message-ID: <49A78191.20902@enthought.com>

Stéfan van der Walt wrote:
> Hi Travis
>
> 2009/2/27 Travis E. Oliphant :
>
>> My biggest concern is having a bunch of code sitting in a
>> queue and not reviewed, nor committed --- or the review processes become
>> too onerous and code not making it through because of what I would
>> consider to be "ticky-tacky technicalities."
>>
>
> That's a very valid concern. David and I are experimenting with
> different issue trackers and plugins for trac, to see how best to
> generate a "review pool". I.e., what I'd like to see is that, if you
> only have 5 minutes to work on SciPy in the evening, you can
>
> a) Go to trac and click on "tickets for review"
> b) Review a couple of tickets
>
> or
>
> a) Go to trac and click on "reviewed tickets"
> b) Apply those patches
>
> or
>
> a) Go to trac and click on "unresolved issues"
> b) Fix the bug
> c) Upload the patch for review
>
> Technically, (c) is a bit challenging. I note your concern that it
> would become difficult to check in, so what I would like is to have a
> script such as
>
> scipy-submit -t 212 -m "Do not deallocate memory after object disposal."
>

Something like that would be nice! Thanks for the continued effort at improving workflow.
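The scipy-submit command quoted above is only a proposal in this thread; no such tool is described beyond its two options. As a rough sketch of what its command-line front end might look like, the following uses argparse from the Python standard library. The long option names (--ticket, --message), the helper names, and the comment format are all assumptions made for illustration, and the upload step is a placeholder because the thread does not specify how the patch would reach the code review site.

```python
import argparse


def build_parser():
    """Command-line parser for the hypothetical scipy-submit script."""
    parser = argparse.ArgumentParser(prog="scipy-submit")
    # -t and -m come from the example in the thread; the long forms are assumed.
    parser.add_argument("-t", "--ticket", type=int, required=True,
                        help="Trac ticket number the patch belongs to")
    parser.add_argument("-m", "--message", required=True,
                        help="commit message describing the change")
    return parser


def ticket_comment(ticket, message, review_url):
    """Format the text that would be attached to the ticket (assumed layout)."""
    return "Ticket #%d: %s\nReview: %s" % (ticket, message, review_url)


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Placeholder only: uploading the patch and obtaining the real review
    # URL is not specified in the thread, so nothing is sent anywhere.
    review_url = "<review URL returned by the upload step>"
    return ticket_comment(args.ticket, args.message, review_url)
```

Calling main(["-t", "212", "-m", "Do not deallocate memory after object disposal."]) would return the comment text for ticket 212 with the placeholder URL in place of a real review link.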
-Travis

From david at ar.media.kyoto-u.ac.jp Fri Feb 27 00:48:01 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 27 Feb 2009 14:48:01 +0900
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To: <49A77FB7.7010406@aero.iitb.ac.in>
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> <3d375d730902260925x332bae2mef6bc9c0907b5918@mail.gmail.com> <49A6FA6D.9030802@creativetrax.com> <49A72956.2090104@enthought.com> <9457e7c80902262137j69478ba9h910dcfac3949af26@mail.gmail.com> <49A77FB7.7010406@aero.iitb.ac.in>
Message-ID: <49A77E91.4010101@ar.media.kyoto-u.ac.jp>

Prabhu Ramachandran wrote:
> On 02/27/09 11:07, Stéfan van der Walt wrote:
>
>> scipy-submit -t 212 -m "Do not deallocate memory after object disposal."
>>
>> which then uploads the patch to the codereview site, and adds a link
>> to ticket 212 with the commit message and review URL. All of this can
>> be done via the web, but I'd prefer to have a CLI available.
>>
>> Do you have any suggestions or further concerns?
>>
>
> I've not contributed anything in years to scipy but I have a practical
> problem that might be worth addressing eventually (others might be in a
> similar position) -- my entire network is firewalled and I can only
> access the web behind an authenticated http proxy.

I have similar issues, and I agree those are valid concerns. Those can be very painful to handle. In my case, there is no DNS server, the names are resolved by the proxy; my workstation can only resolve the proxy name. This breaks most applications out there. ssh is not easy, because ssh cannot resolve names - for git, I managed to get things worked out for github using corkscrew.
This is the kind of thing which I managed to do once and hope never to have to do again, so I can't tell you exactly how to do it:

http://en.wikipedia.org/wiki/Corkscrew_(program)

My .ssh/config looks like this for github

    Host gitproxy
    User git
    HostName ssh.github.com
    Port 443
    ProxyCommand /usr/bin/corkscrew www 3128 %h %p
    IdentityFile /home/david/.ssh/id_rsa.pub

Where www is the name of my proxy and 3128 the port.

FWIW, svn has similar problems. I could never commit anything from a former internship location because of some proxy limitations - it is one of the reasons which pushed me into git for scipy development, actually. If you can't access either ssh or proxy, my experience is that you are more or less screwed with any tool out there - but with DVCS, you can at least put your changes aside and commit them later from an easier connection.

cheers,

David

From david at ar.media.kyoto-u.ac.jp Fri Feb 27 03:28:53 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 27 Feb 2009 17:28:53 +0900
Subject: [SciPy-dev] Improving the bug tracking workflow: starting document
Message-ID: <49A7A445.4010800@ar.media.kyoto-u.ac.jp>

Hi,

Following the discussions, I have started to write a small document highlighting my current gripes with trac. I focus on some common scenarios, and pinpoint trac limitations. I mention possible new tools at the end, but that's not the main point: everybody who is also dissatisfied with trac, and maybe even more importantly people who are currently satisfied and think their scenario is not covered, should feel free to comment/modify it:

http://scipy.org/scipy/numpy/wiki/ImprovingIssueWorkflow

I put the initial version in svn as well: http://projects.scipy.org/scipy/numpy/browser/trunk/doc/neps/newbugtracker.rst.
cheers, David From pav at iki.fi Fri Feb 27 04:18:22 2009 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 27 Feb 2009 09:18:22 +0000 (UTC) Subject: [SciPy-dev] The future of SciPy and its development infrastructure References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> <3d375d730902260925x332bae2mef6bc9c0907b5918@mail.gmail.com> <49A6FA6D.9030802@creativetrax.com> <49A72956.2090104@enthought.com> <9457e7c80902262137j69478ba9h910dcfac3949af26@mail.gmail.com> <49A77FB7.7010406@aero.iitb.ac.in> Message-ID: Fri, 27 Feb 2009 11:22:55 +0530, Prabhu Ramachandran wrote: [clip] > I've not contributed anything in years to scipy but I have a practical > problem that might be worth addressing eventually (others might be in a > similar position) -- my entire network is firewalled and I can only > access the web behind an authenticated http proxy. The firewall does > allow ssh connections out though but that seems useless to access a git > repository hosted on github say. Git can clone over HTTP, just change git:// to http:// and it seems to work. I can also clone through a proxy with export http_proxy=http://username:password@proxy:port/ git clone http://whatever Pushing over HTTP is another question... It's probably not possible to push to Github over HTTPS, but maybe there are places that you can push to with only HTTP authentication. -- Pauli Virtanen From pav at iki.fi Fri Feb 27 04:21:39 2009 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 27 Feb 2009 09:21:39 +0000 (UTC) Subject: [SciPy-dev] Scipy workflow (and not tools).
References: <1e2af89e0902241059r145a10d3n118745ac24f80a7b@mail.gmail.com> <6ce0ac130902252321j6139238by634364acd2bd07b2@mail.gmail.com> <49A6D913.9040809@enthought.com> <6ce0ac130902261240y3596278fo30693766a0194d5@mail.gmail.com> <9457e7c80902261307m1d4cdedckc7df633763a9b29d@mail.gmail.com> <9457e7c80902261448y2719b96bg8225a0717969eedd@mail.gmail.com> <5b8d13220902262023i675a4bc4ra8bf981267fb2156@mail.gmail.com> <85b5c3130902262054i27fca37fq7e2d58c6cf06626a@mail.gmail.com> Message-ID: Thu, 26 Feb 2009 20:54:21 -0800, Ondrej Certik wrote: [clip] > Btw, I guess you already know it, but if you need to clone the git > repository (for example David's) and then you would like to update it > using git-svn with the latest svn from scipy, here is the howto: > > http://subtlegradient.com/articles/2008/04/22/cloning-a-git-svn-clone Yes, instructions for this are available also on the page http://scipy.org/scipy/numpy/wiki/GitMirror But if the mirror is up-to-date (and I hope we manage to get the SVN post-commit hook installed), there's no need to do this, you can just ''git fetch''. 
-- Pauli Virtanen From waller at guldbyn.se Fri Feb 27 04:29:52 2009 From: waller at guldbyn.se (Stefan Waller) Date: Fri, 27 Feb 2009 10:29:52 +0100 Subject: [SciPy-dev] unsubscibe In-Reply-To: References: Message-ID: <004501c998bd$f1de5310$d59af930$@se> -----Original Message----- From: scipy-dev-bounces at scipy.org [mailto:scipy-dev-bounces at scipy.org] On Behalf Of scipy-dev-request at scipy.org Sent: 27 February 2009 10:19 To: scipy-dev at scipy.org Subject: Scipy-dev Digest, Vol 64, Issue 66 From josef.pktd at gmail.com Fri Feb 27 07:56:30 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 27 Feb 2009 07:56:30 -0500 Subject: [SciPy-dev] Improving the bug tracking workflow: starting document In-Reply-To: <49A7A445.4010800@ar.media.kyoto-u.ac.jp> References: <49A7A445.4010800@ar.media.kyoto-u.ac.jp> Message-ID: <1cd32cbb0902270456y3f9ebc10s1eed3ad9fef0010f@mail.gmail.com> On Fri, Feb 27, 2009 at 3:28 AM, David Cournapeau wrote: > Hi, > > Following the discussions, I have started to write a small document > highlighting my current gripes with trac. I focus on some common > scenarios, and pinpoint trac limitations. I mention possible new tools > at the end, but that's not the main point: everybody who is also > dissatisfied with trac, and maybe even more importantly people who are > currently satisfied and think their scenario is not covered should feel > free to comment/modify it: > > http://scipy.org/scipy/numpy/wiki/ImprovingIssueWorkflow > > I put the initial version in svn as well: > http://projects.scipy.org/scipy/numpy/browser/trunk/doc/neps/newbugtracker.rst.
> > cheers, > > David Just two quick comments: * I like the integration of the bug tracker and svn; browsing between old tickets and revisions is pretty easy. Similarly, the integrated timeline for svn and issue tracker makes tracking new code and issues easy. * Eclipse integration with trac issues works well with Mylyn, but I haven't used it much, and not for scipy; Eclipse integration with svn is very good. Josef From josef.pktd at gmail.com Fri Feb 27 09:25:46 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 27 Feb 2009 09:25:46 -0500 Subject: [SciPy-dev] Improving the bug tracking workflow: starting document In-Reply-To: <1cd32cbb0902270456y3f9ebc10s1eed3ad9fef0010f@mail.gmail.com> References: <49A7A445.4010800@ar.media.kyoto-u.ac.jp> <1cd32cbb0902270456y3f9ebc10s1eed3ad9fef0010f@mail.gmail.com> Message-ID: <1cd32cbb0902270625t69592f4n583c4eabcbafc481@mail.gmail.com> On Fri, Feb 27, 2009 at 7:56 AM, wrote: > On Fri, Feb 27, 2009 at 3:28 AM, David Cournapeau > wrote: >> Hi, >> >> Following the discussions, I have started to write a small document >> highlighting my current gripes with trac. I focus on some common >> scenarios, and pinpoint trac limitations. I mention possible new tools >> at the end, but that's not the main point: everybody who is also >> dissatisfied with trac, and maybe even more importantly people who are >> currently satisfied and think their scenario is not covered should feel >> free to comment/modify it: >> >> http://scipy.org/scipy/numpy/wiki/ImprovingIssueWorkflow >> >> I put the initial version in svn as well: >> http://projects.scipy.org/scipy/numpy/browser/trunk/doc/neps/newbugtracker.rst. >> >> cheers, >> >> David > > Just two quick comments: > > * I like the integration of the bug tracker and svn; browsing between > old tickets and revisions is pretty easy. Similarly, the integrated > timeline for svn and issue tracker makes tracking new code and issues > easy.
> > * Eclipse integration with trac issues works well with Mylyn, but I > haven't used it much, and not for scipy; > Eclipse integration with svn is very good. > > Josef > I connected my Eclipse Mylyn with the scipy trac tickets and have tickets that are assigned to me on my local computer where I can mark them as read or unread. But for now I like the web interface better. But this works after some trial and error. Also, since David mentioned sql queries in another thread, I set up a report that sorts tickets by change time. It helps to see which tickets were recently commented on. But since this is the first time I have done this, and I'm not very familiar with sql, this still needs improvements. However, this helps with the main problem I had with the trac ticket listing, and maybe some additional specialized reports will make keeping an overview of tickets easier, even with the current trac version. Josef From michael.abshoff at googlemail.com Fri Feb 27 09:39:59 2009 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Fri, 27 Feb 2009 06:39:59 -0800 Subject: [SciPy-dev] Improving the bug tracking workflow: starting document In-Reply-To: <1cd32cbb0902270625t69592f4n583c4eabcbafc481@mail.gmail.com> References: <49A7A445.4010800@ar.media.kyoto-u.ac.jp> <1cd32cbb0902270456y3f9ebc10s1eed3ad9fef0010f@mail.gmail.com> <1cd32cbb0902270625t69592f4n583c4eabcbafc481@mail.gmail.com> Message-ID: <49A7FB3F.2070908@gmail.com> josef.pktd at gmail.com wrote: > On Fri, Feb 27, 2009 at 7:56 AM, wrote: > However, this helps with the main problem I had with the trac ticket > listing, and maybe some additional specialized reports will make > keeping an overview of tickets easier, even with the current trac > version. The scipy trac seems to be version 0.10.2, which has a number of known security issues. Aside from that, trac releases prior to 0.11 leak memory when using Apache, i.e.
http://trac.edgewall.org/ticket/6614 which was a well known and often complained about bug, so many of your performance problems will likely go away once you upgrade (it is my impression that you use trac+Apache). > Josef Cheers, Michael > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > From bsouthey at gmail.com Fri Feb 27 10:03:03 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 27 Feb 2009 09:03:03 -0600 Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy In-Reply-To: <1cd32cbb0902261547v43de301du7199b3bb7af26c47@mail.gmail.com> References: <49A7171F.5070500@gmail.com> <1cd32cbb0902261547v43de301du7199b3bb7af26c47@mail.gmail.com> Message-ID: <49A800A7.4080407@gmail.com> josef.pktd at gmail.com wrote: > I think a discussion for a roadmap for stats will be very useful. > > Currently my priority is still your point > 7 iii) Ideally there should be tests that check the function accuracy. > > I consider this the main point of almost all my work on stats. And there are > still some incorrect parts left. > Yes, that is why I added it. > The next part for the current code base, that I think about, was to > evaluate function > whether they are ok, can be generalized, e.g. dimension, or are > trivial and should be > removed. > I agree, as I do think some are a consequence of the porting process and have never received the appropriate follow-up over time. > Next are changes in the interface and combining or comparing mstats and stats. > Here, I don't have a clear opinion yet of how far we can or want to > consistently generalize all statistical functions to the different > type of arrays. In many cases I looked at, the masked array version > looked sufficiently different that I would be reluctant to merge them.
> One radical alternative would be to deprecate stats.stats and expand > mstats, since it is already better designed to handle different array > types. But I like the "simple" versions in stats, and I'm curious > about any speed difference. > This is the main issue that prevents me from going further with this aspect! I do not find that suggestion radical at all, as I am for just using masked arrays because I do not perceive a speed difference. (Okay, I am perhaps unusual in that I work with large datasets and complex models, so differences of a few seconds are not that meaningful to me.) It would be less work to convert the missing functions, as there are about 85 functions missing from the masked version. > But general tools to interface to different array types would be > useful and should be carefully designed, e.g. functions like ols that > have a plain ndarray core, but can access the data from structured > arrays and masked arrays. > > After the changes to the current statistical functions, I was > considering areas of statistics that have partial but incomplete > coverage. Non-parametric tests are well represented, and I have some > extension for tests for discrete distributions. I think ANOVA, which I > never used myself, has a very incomplete collection, which, I guess, is > a historical accident since Gary Strangman had, I think, more ANOVA > functions that are not included in stats. > So instead of having a laundry list of functions (some of which don't > seem to have been used for years), I would prefer at least a > conceptual grouping around statistical topics. Regression of course > is currently MIA. > > Even after Robert's reply on that, stats.py at least still has linregress (simple regression with one variable) and glm that address these. However, there is a strong case that both of these should also be removed in favor of a better approach.
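As a rough illustration of the point about linregress (a sketch added for clarity, not code from this thread): simple regression with one explanatory variable is a special case of ordinary least squares, so a general solver recovers the same slope and intercept. The example assumes NumPy's `numpy.linalg.lstsq` and `numpy.corrcoef`; the data and variable names are made up.

```python
import numpy as np

# Made-up data for illustration: y is roughly 2*x + 1 plus a little noise.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8, 11.0])

# Closed-form simple regression (what a one-variable routine computes).
x_mean, y_mean = x.mean(), y.mean()
slope = ((x - x_mean) * (y - y_mean)).sum() / ((x - x_mean) ** 2).sum()
intercept = y_mean - slope * x_mean

# The same fit via general least squares: design matrix with a constant column.
X = np.column_stack([x, np.ones_like(x)])
coef, residuals, rank, sing_vals = np.linalg.lstsq(X, y, rcond=None)

# Correlation coefficient taken from the 2x2 correlation matrix.
r = np.corrcoef(x, y)[0, 1]

print(slope, intercept)
print(coef)
print(r)
```

Adding more columns to `X` extends the same call to several regressors or dummy variables, which is the generality the one-variable function lacks.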
I agree that doing things like general linear models (eg regression and ANOVA assuming normality), generalized linear models and such need a careful design that integrates where possible existing solutions. Even SAS has different procedures and different modules are available for R to do these. But must be a separate discussion. > The next large interface issue, especially for enhancements, is > whether to use functions or proper classes. I think for some > statistical analysis the current statistical function, once cleaned > up, work fine. However, even R returns result classes (or whatever > their equivalent is) for every statistical test, while in python we > use matlab style functions. > > This will change when models will be included again. > Excellent! > I have a list of functions that have no test coverage, a list (not > written down) of functions that have bug suspects or known bugs, and > it would be useful to get a wider opinion about which functions and > interfaces are important > Working on the list of functions on the wiki page maybe simpler for > collecting comments than going through the statistical review in trac. > I agree that we need to address what functions we really need and what interface is required. From that we can address the required tests and documentation. > Overall, I think there is still a lot of work to do before I start to > worry about white space issues. > Yeah, I just figured that we should correct any of these coding styles issues on the way. 
Thanks for all the comments, Bruce From josef.pktd at gmail.com Fri Feb 27 11:19:24 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 27 Feb 2009 11:19:24 -0500 Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy In-Reply-To: <49A800A7.4080407@gmail.com> References: <49A7171F.5070500@gmail.com> <1cd32cbb0902261547v43de301du7199b3bb7af26c47@mail.gmail.com> <49A800A7.4080407@gmail.com> Message-ID: <1cd32cbb0902270819l7da77aacv9ff8f7ac3c60fd1b@mail.gmail.com> On Fri, Feb 27, 2009 at 10:03 AM, Bruce Southey wrote: > josef.pktd at gmail.com wrote: >> I think a discussion for a roadmap for stats will be very useful. >> >> Currently my priority is still your point >> 7 iii) Ideally there should be tests that check the function accuracy. >> >> I consider this the main point of almost all my work on stats. And there are >> still some incorrect parts left. >> > Yes, that is why I added it. >> The next part for the current code base, that I think about, was to >> evaluate function >> whether they are ok, can be generalized, e.g. dimension, or are >> trivial and should be >> removed. >> > I agree as I do think some are a consequence of the porting process and > have never received the appropriate followup over time. >> Next are changes in the interface and combining or comparing mstats and stats. >> Here, I don't have a clear opinion yet of how far we can or want to >> consistently generalize all statistical functions to the different >> type of arrays. In many cases I looked at, the masked array version >> looked sufficiently different that I would be reluctant to merge them. >> One radical alternative would be to deprecate stats.stats and expand >> mstats, since it is already better designed to handle different array >> types. But I like the "simple" versions in stats, and I'm curious >> about any speed difference. >> > The main issue that prevents me from going further with this aspect!
> > I do not find it that radical at all to suggest that as I am for just > using masked arrays because I do not perceive a speed difference. (Okay > I am perhaps unusual in that I work with large datasets and complex > models so differences of a few seconds are not that meaningful to me.) > It would be less work to convert the missing as there are about 85 > functions missing from masked. > I don't know what the current range of use cases for stats is. But for example in matlab, I have some ols estimation in an innerloop where I wouldn't want much overhead. But in this case, it would always be possible to go back to raw linalg.lstsq. The other disadvantage for me is that it is much easier to write functions that work for plain arrays, since I'm not working with masked/missing data. It's ok if the handling of different array types can be done in the interface of the function, but translating some statistical formulas into code or porting it from another language will be more difficult for me if I have to worry about missing values all the time. An example that I looked at recently, is statistical analysis of panel data, with a balanced panel the linear algebra and matrix operations are much easier than with an unbalanced panel. What I would like to do, but didn't have the time yet is to run the tests for stats.stats on stats.mstats. This way even if we would have some duplicate functions, we would have some cross check that they are consistent, and it would be a reminder for bug fixing also the other version. > >> But general tools to interface to different array types would be >> useful and should be carefully designed, e.g. function like ols that >> have a plain ndarray core, but can access the data from structured >> arrays and masked arrays. >> >> After, the changes to the current statistical function, I was >> considering areas of statistics that have partial but incomplete >> coverage. 
Non-parametric tests are well represented, and I have some >> extension for tests for discrete distributions. I think ANOVA, which I >> never used myself, has a very incomplete collection, which, I guess is >> a historical accident since Gary Strangman had, I think more ANOVA >> functions that are not included in stats. >> So instead of having a laundry list of functions, (some of which don't >> seem to have been used for years), I would prefer at least a >> conceptional grouping around statistical topics. Regression of course >> is currently MIA. >> >> > Even after Robert's reply on that, stats.py at least still has > linregress (simple regression with one variable) and glm that address > these. However, there is a strong case that both of these should also be > removed in favor of a better approach. > I don't really count linregress as a "serious" statistical function, since the restriction to one explanatory variable has no computational advantage if we have access to linalg. Similarly, I don't know what the purpose of pointbiserial is, if you can use np.corrcoef for the correlation coefficient or stats.pearsonr for the p-values. My impression is that these are historical functions, when there was no easy access to fast computers and full matrix and array packages. stats.glm is a bit of a misnomer it is just a t-test for the regression on one dummy variable, not an estimator. But again I don't see an advantage compared to ols with multivariate regressors and dummy variables. > I agree that doing things like general linear models (eg regression and > ANOVA assuming normality), generalized linear models and such need a > careful design that integrates where possible existing solutions. Even > SAS has different procedures and different modules are available for R > to do these. But must be a separate discussion. > >> The next large interface issue, especially for enhancements, is >> whether to use functions or proper classes. 
I think for some >> statistical analysis the current statistical function, once cleaned >> up, work fine. However, even R returns result classes (or whatever >> their equivalent is) for every statistical test, while in python we >> use matlab style functions. >> >> This will change when models will be included again. There are still bugs in it, and test coverage is still low. If anyone wants to help in the review, bug hunting or adding test the current version is in nipy at https://code.launchpad.net/~nipy-developers/nipy/trunk-josef-models >> > Excellent! >> I have a list of functions that have no test coverage, a list (not >> written down) of functions that have bug suspects or known bugs, and >> it would be useful to get a wider opinion about which functions and >> interfaces are important >> Working on the list of functions on the wiki page maybe simpler for >> collecting comments than going through the statistical review in trac. >> > I agree that we need to address what functions we really need and what > interface is required. From that we can address the required tests and > documentation. > >> Overall, I think there is still a lot of work to do before I start to >> worry about white space issues. >> > Yeah, ?I just figured that we should correct any of these coding styles > issues on the way. I'm slowly getting used to the formatting requirements, and at least during code changes, I try to stick to it. > > Thanks for all the comments, > Bruce > Josef From sturla at molden.no Fri Feb 27 11:27:36 2009 From: sturla at molden.no (Sturla Molden) Date: Fri, 27 Feb 2009 17:27:36 +0100 Subject: [SciPy-dev] Implementation of a parallel cKDTree Message-ID: <49A81478.7050807@molden.no> I have fiddled a bit with scipy.spatial.cKDTree for better performance on multicore CPUs. I have used threading.Thread instead of OpenMP, so no special compilation or compiler is required. The number of threads defaults to the number of processors if it can be determined. 
The performance is not much different from what I get with OpenMP. It is faster than using cKDTree with multiprocessing and shared memory. Memory handling is also improved. There are checks for NULL pointers returned by malloc or realloc. setjmp/longjmp is used for error handling if malloc or realloc fail. A memory pool is used to make sure all complex data structures are cleaned up properly. I have assumed that crt functions malloc, realloc and free are thread safe. This is usually the case. If they are not, they must be wrapped with calls to PyGILState_Ensure and PyGILState_Release. I have not done this as it could impair scalability. Regards, Sturla Molden -------------- next part -------------- A non-text attachment was scrubbed... Name: ckdtree_mt.pyx Type: / Size: 29331 bytes Desc: not available URL: From josef.pktd at gmail.com Fri Feb 27 11:42:51 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 27 Feb 2009 11:42:51 -0500 Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy In-Reply-To: <1cd32cbb0902270819l7da77aacv9ff8f7ac3c60fd1b@mail.gmail.com> References: <49A7171F.5070500@gmail.com> <1cd32cbb0902261547v43de301du7199b3bb7af26c47@mail.gmail.com> <49A800A7.4080407@gmail.com> <1cd32cbb0902270819l7da77aacv9ff8f7ac3c60fd1b@mail.gmail.com> Message-ID: <1cd32cbb0902270842i94f9ff6x1bcdf3e7897d64a5@mail.gmail.com> One more issue for the design of statistical function is the availability of using weights. I was looking at calculating weighted means and variances and so on, but the current situation doesn't look very good. There is np.average and the new curvefit allows for weights. http://scipy.org/scipy/scipy/ticket/604 has a full set of statistical functions using weights, but I couldn't make up my mind about how this should fit in. Many of the functions are very short wrappers and would increase the number of functions without necessarily a big benefit. 
But an efficient implementation of statistical functions that allow weights would make the use of dummy variables and the conversion of masked arrays to use the weighted functions easier, (use mask as dummy variable for the weight.) This won't help for all cases where masked arrays are used, but looking at specific functions and coming up with a good general design would be very useful. Josef From bsouthey at gmail.com Fri Feb 27 12:42:20 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 27 Feb 2009 11:42:20 -0600 Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy In-Reply-To: <1cd32cbb0902270819l7da77aacv9ff8f7ac3c60fd1b@mail.gmail.com> References: <49A7171F.5070500@gmail.com> <1cd32cbb0902261547v43de301du7199b3bb7af26c47@mail.gmail.com> <49A800A7.4080407@gmail.com> <1cd32cbb0902270819l7da77aacv9ff8f7ac3c60fd1b@mail.gmail.com> Message-ID: <49A825FC.1080902@gmail.com> josef.pktd at gmail.com wrote: [snip] > What I would like to do, but didn't have the time yet is to run the > tests for stats.stats > on stats.mstats. This way even if we would have some duplicate > functions, we would > have some cross check that they are consistent, and it would be a reminder for > bug fixing also the other version. > Okay, I do not know how to get timeit to work with numpy/scipy but this is not how I would like it to be. 
But I managed somehow to (unfairly) compare the geometric mean function (gmean) using this code:

import timeit
stand_t=timeit.Timer('scipy.stats.stats.gmean(X, axis=xs)', 'import numpy, scipy.stats.stats; X=numpy.random.gamma(shape=2, scale=1, size=(1,10)); xs=None').timeit(1000)
masked_t=timeit.Timer('scipy.stats.mstats.gmean(X, axis=xs)', 'import numpy, scipy.stats.stats; X=numpy.random.gamma(shape=2, scale=1, size=(1,10)); xs=None').timeit(1000)
numpy_t=timeit.Timer('numpy.exp((numpy.log(X).mean()))', 'import numpy, numpy.random; X=numpy.random.gamma(shape=2, scale=1, size=(1,10))').timeit(1000)

I use Linux and Python 2.5, but my system is very busy, so this is perhaps not that fair as a benchmark.
numpy.__version__ '1.3.0.dev6338'
scipy.__version__ '0.8.0.dev5597'

There is a cost of using _chk_asarray in this case, which decreases as the array size increases. (I am not sure that _chk_asarray is really needed anyhow.)
There is a huge cost for using masked arrays at small sizes, but it decreases as the array size increases.

For a 1 by 10 array, the difference between the masked and non-masked versions was 0.13 seconds over 1000 runs, with a masked to non-masked ratio of 7.94.
For a 1 by 10000 array, the difference between the masked and non-masked versions was 0.07 seconds over 1000 runs, with a masked to non-masked ratio of 2.14.

However, briefly looking at some of these functions, I think that numpy/scipy would naturally handle the array type, since I know that numpy.exp((numpy.log(X).mean())) works whether X is a regular array or a masked array. If so, then there is no reason for different functions unless we need to address masks.
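A minimal sketch of the same kind of comparison, assuming only numpy and the standard timeit module, and passing callables instead of statement strings. The names gmean_plain, gmean_masked and compare are made up for illustration; they use the exp-of-mean-log formula from this message, not the scipy.stats or scipy.stats.mstats implementations, so the timings will not match the numbers reported here.

```python
import timeit

import numpy as np


def gmean_plain(x):
    # Geometric mean via logs, for a plain ndarray of positive values.
    return np.exp(np.log(x).mean())


def gmean_masked(x):
    # Same formula, but the log goes through numpy.ma, which carries
    # a mask through the computation.
    return np.exp(np.ma.log(x).mean())


def compare(size, number=1000, seed=0):
    # Time both versions on the same gamma-distributed data.
    rng = np.random.RandomState(seed)
    x = rng.gamma(shape=2.0, scale=1.0, size=(1, size))
    xm = np.ma.array(x)
    t_plain = timeit.timeit(lambda: gmean_plain(x), number=number)
    t_masked = timeit.timeit(lambda: gmean_masked(xm), number=number)
    return t_plain, t_masked


if __name__ == "__main__":
    for size in (10, 10000):
        t_plain, t_masked = compare(size)
        print(size, t_plain, t_masked, t_masked / t_plain)
```

Building the data once outside the timed callable keeps the random number generation out of the measurement, which the string-based setup argument also does.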
Bruce

From josef.pktd at gmail.com Fri Feb 27 13:27:24 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 27 Feb 2009 13:27:24 -0500
Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy
In-Reply-To: <49A825FC.1080902@gmail.com>
References: <49A7171F.5070500@gmail.com> <1cd32cbb0902261547v43de301du7199b3bb7af26c47@mail.gmail.com> <49A800A7.4080407@gmail.com> <1cd32cbb0902270819l7da77aacv9ff8f7ac3c60fd1b@mail.gmail.com> <49A825FC.1080902@gmail.com>
Message-ID: <1cd32cbb0902271027l30786169he2bf06d0e01f0b09@mail.gmail.com>

On Fri, Feb 27, 2009 at 12:42 PM, Bruce Southey wrote:
> josef.pktd at gmail.com wrote:
> [snip]
>> What I would like to do, but didn't have the time yet is to run the
>> tests for stats.stats
>> on stats.mstats. This way even if we would have some duplicate
>> functions, we would
>> have some cross check that they are consistent, and it would be a reminder for
>> bug fixing also the other version.
>>
> Okay, I do not know how to get timeit to work with numpy/scipy but this
> is not how I would like it to be. But I managed somehow to (unfairly)
> compare the geometric means function (gmean) using this code:
> import timeit
> stand_t=timeit.Timer('scipy.stats.stats.gmean(X, axis=xs)', 'import
> numpy, scipy.stats.stats; X=numpy.random.gamma(shape=2, scale=1,
> size=(1,10)); xs=None').timeit(1000)
> masked_t=timeit.Timer('scipy.stats.mstats.gmean(X, axis=xs)', 'import
> numpy, scipy.stats.stats; X=numpy.random.gamma(shape=2, scale=1,
> size=(1,10)); xs=None').timeit(1000)
> numpy_t=timeit.Timer('numpy.exp((numpy.log(X).mean()))', 'import numpy,
> numpy.random; X=numpy.random.gamma(shape=2, scale=1,
> size=(1,10))').timeit(1000)
>
> I use Linux and Python 2.5 but my system is very busy so perhaps not
> that fair for benchmarks.
> numpy.__version__ '1.3.0.dev6338'
> scipy.__version__ '0.8.0.dev5597'
>
> There is a cost of using _chk_asarray in this case which decreases as
> the array size increases.
(I am not sure that _chk_asarray is really
> needed anyhow.)
> There is a huge cost for using masked array for small sizes but
> decreases as the array size increases.
>
> For 1 by 10 array, the difference between masked and non masked versions
> was 0.13 seconds to do it 1000 times with the ratio of masked to non
> masked = 7.94
> For 1 by 10000 array, the difference between masked and non masked
> versions was 0.07 seconds to do it 1000 times with the ratio of masked
> to non masked = 2.14
>
> However, briefly looking at some of these functions, I think that
> numpy/scipy would naturally handle the array type as I know
> numpy.exp((numpy.log(X).mean())) this works whether X is the usual array
> or if it is a masked array. If so then there is no reason for different
> functions unless we need to address masks.
>
>
> Bruce
>

I just ran the stats.stats tests using mstats instead of stats. I didn't look at the results carefully, but there are some numerical inconsistencies between the two implementations that need to be checked. I attached the test results to http://scipy.org/scipy/scipy/ticket/845.

Your timing numbers don't sound so bad in absolute terms, but if it is inside an optimization loop, e.g. for maximum likelihood estimation, then an 8-fold slowdown can get painful.

The main problem for the basic functions, I think, are those functions that need a loop because the data is not rectangular and cannot use simple broadcasting and matrix/array operations. On the other hand, I don't think that the masked array functions have been checked for performance ("premature optimization"), since many of them are still relatively new.
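One way to keep the masked-array overhead out of an inner loop is to strip the mask once and run the loop on a plain ndarray. The following is only a sketch under assumptions: the exponential log-likelihood and the grid search are stand-ins chosen for illustration, and to_plain, exp_loglike and fit_rate are hypothetical names, not scipy functions. It also only applies when dropping masked values once is correct for every step of the computation.

```python
import numpy as np
import numpy.ma as ma


def to_plain(x):
    """Drop masked entries once and return a 1-D plain ndarray."""
    if isinstance(x, ma.MaskedArray):
        return x.compressed()
    return np.asarray(x, dtype=float).ravel()


def exp_loglike(rate, data):
    """Log-likelihood of i.i.d. exponential data (plain ndarray only)."""
    return data.size * np.log(rate) - rate * data.sum()


def fit_rate(x, grid):
    """Pick the rate on `grid` with the highest log-likelihood."""
    data = to_plain(x)  # the mask is handled here, outside the loop
    values = [exp_loglike(rate, data) for rate in grid]
    return grid[int(np.argmax(values))]


if __name__ == "__main__":
    x = ma.array([1.0, 2.0, 3.0, 100.0], mask=[0, 0, 0, 1])
    grid = np.linspace(0.1, 2.0, 20)
    print(fit_rate(x, grid))
```

The masked value 100.0 never reaches exp_loglike, so the loop body contains no numpy.ma calls at all.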
Josef From pav at iki.fi Fri Feb 27 13:28:51 2009 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 27 Feb 2009 18:28:51 +0000 (UTC) Subject: [SciPy-dev] The future of SciPy and its development infrastructure References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> <3d375d730902260925x332bae2mef6bc9c0907b5918@mail.gmail.com> <49A6FA6D.9030802@creativetrax.com> <49A72956.2090104@enthought.com> <9457e7c80902262137j69478ba9h910dcfac3949af26@mail.gmail.com> <49A77FB7.7010406@aero.iitb.ac.in> Message-ID: Fri, 27 Feb 2009 11:22:55 +0530, Prabhu Ramachandran wrote: [clip] > I've not contributed anything in years to scipy but I have a practical > problem that might be worth addressing eventually (others might be in a > similar position) -- my entire network is firewalled and I can only > access the web behind an authenticated http proxy. The firewall does > allow ssh connections out though but that seems useless to access a git ^^^^^^^^^^^^^^^^^^^^^^^^^ SSH is the default transport protocol for Git. -- Pauli Virtanen From pgmdevlist at gmail.com Fri Feb 27 13:54:10 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 27 Feb 2009 13:54:10 -0500 Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy In-Reply-To: <1cd32cbb0902270819l7da77aacv9ff8f7ac3c60fd1b@mail.gmail.com> References: <49A7171F.5070500@gmail.com> <1cd32cbb0902261547v43de301du7199b3bb7af26c47@mail.gmail.com> <49A800A7.4080407@gmail.com> <1cd32cbb0902270819l7da77aacv9ff8f7ac3c60fd1b@mail.gmail.com> Message-ID: <64BE787E-059E-4BC4-A9BC-F40B294442DD@gmail.com> All, I followed the thread without actively participating, but as the author of stats.mstats, I feel compelled to jump in. 
When I started working on masked versions of the scipy.stats functions, numpy.ma wasn't part of numpy per se (if I remember correctly), and the package hadn't been thoroughly checked. Modifying scipy.stats to recognize masked arrays (for example, by changing _chk_asarray) wasn't really an option at the time, because numpy.ma was still considered experimental. The easiest option was therefore just to duplicate the functions. I checked that the results were consistent with the non-masked versions at the time, and checked against R also, so I was fairly confident in the results. However, I didn't strive for exhaustiveness: I coded the functions I needed and some of their direct relatives, but never tried to extend this to the more complex functions.

I'm all in favor of merging the masked and non-masked versions: that's cleaner, and easier to debug and maintain should there be some changes in signature (or even just doc). There are a few aspects we must keep in mind, however:

* Standard numpy functions usually work well with masked arrays: if the input is a MA, the np function should call MA.__array_wrap__, which will transform the result back to a MA. I have the nagging feeling it's not completely fool-proof, however, but the numpy.ma functions should always work. If the input is not a MA, then the output will never be masked, which may be a problem.

Consider this example:
>>> x=np.array([0,1,2])
>>> np.log(x).mean()
-inf
>>> ma.log(x).mean()
0.34657359027997264
>>> np.log(ma.array(x)).mean()
0.34657359027997264

If we don't transform x to a MA, or don't use the numpy.ma function, we just get a NaN/Inf as result. Otherwise, we get a nice float.

* Systematically using MA may be problematic: I can picture cases where a standard ndarray is expected as output when a standard ndarray is given as input. If we force the conversion, the result will be a MA. Should we convert it back to a ndarray? Using .filled()? But then, with which fill_value?
In that case, we may want to consider a "usemask" flag: if usemask=True and the input is a ndarray, then the output will be a MA, otherwise it'll be a ndarray. Using a MA as input would set usemask to True no matter what. * The correlation functions discard masked values pair-wise: we can pre-process the inputs and still use the standard functions, so no problem here. * Some functions (threshold) can work directly w/ MA. * Some functions (the ones based on ranking) should behave differently whether the input has masked values (as missing values must be taken as ties). About optimization and speed test: * There's definitely some room for improvement here: for example, instead of using the count method, we could use the count function to prevent any unnecessary conversion to MA. (I'd need to optimize the count function, but that should be easy...). That'll depend on what we decide for handling MA. * Just running tests w/ the masked versions of the function will always show that they are slower, of course. * Slight differences of the order of 1e-15 should not really matter. All, don't hesitate to contact me on or off-list if you have some specific questions about implementation details. 
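The "usemask" idea above can be sketched on a single function. This is only an illustration under assumptions: the flag does not exist in scipy.stats, the gmean below is a stand-in written for this note rather than the scipy implementation, and returning ma.exp of the masked mean is one possible choice among several.

```python
import numpy as np
import numpy.ma as ma


def gmean(a, axis=None, usemask=False):
    """Geometric mean with a hypothetical `usemask` flag."""
    if isinstance(a, ma.MaskedArray):
        # A masked input forces masked handling, whatever the flag says.
        usemask = True
    if usemask:
        am = ma.asarray(a)
        # ma.log masks entries outside the domain of log (values <= 0),
        # so they are dropped from the mean instead of giving -inf.
        return ma.exp(ma.log(am).mean(axis=axis))
    # Plain path: ordinary ndarray semantics, no mask is ever created.
    a = np.asarray(a)
    return np.exp(np.log(a).mean(axis=axis))


if __name__ == "__main__":
    print(gmean([1.0, 4.0]))                 # plain ndarray path
    print(gmean([0, 1, 2], usemask=True))    # the zero is masked by ma.log
    print(gmean(ma.array([[1.0, 4.0], [4.0, 1.0]]), axis=0))
```

Note that with axis=None the masked path returns a scalar rather than a MaskedArray, since the reduction collapses to a single value; with an explicit axis the result stays a MaskedArray.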
From prabhu at aero.iitb.ac.in Fri Feb 27 14:16:05 2009 From: prabhu at aero.iitb.ac.in (Prabhu Ramachandran) Date: Sat, 28 Feb 2009 00:46:05 +0530 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> <3d375d730902260925x332bae2mef6bc9c0907b5918@mail.gmail.com> <49A6FA6D.9030802@creativetrax.com> <49A72956.2090104@enthought.com> <9457e7c80902262137j69478ba9h910dcfac3949af26@mail.gmail.com> <49A77FB7.7010406@aero.iitb.ac.in> Message-ID: <49A83BF5.5070401@aero.iitb.ac.in> On 02/27/09 23:58, Pauli Virtanen wrote: > Fri, 27 Feb 2009 11:22:55 +0530, Prabhu Ramachandran wrote: > [clip] >> I've not contributed anything in years to scipy but I have a practical >> problem that might be worth addressing eventually (others might be in a >> similar position) -- my entire network is firewalled and I can only >> access the web behind an authenticated http proxy. The firewall does >> allow ssh connections out though but that seems useless to access a git > ^^^^^^^^^^^^^^^^^^^^^^^^^ > > SSH is the default transport protocol for Git. Hmm, this doesn't seem to work. My guess is that it uses a different port which clearly won't work unless I force the admins here to open up the git port. $ git clone --origin svn git://github.com/pv/scipy-svn.git scipy.git Initialized empty Git repository in /home/prabhu/src/git/scipy.git/.git/ github.com[0: 65.74.177.129]: errno=Connection timed out fatal: unable to connect a socket (Connection timed out) fetch-pack from 'git://github.com/pv/scipy-svn.git' failed. Note, that I can certainly ssh just fine to the outside world. 
cheers, prabhu From josef.pktd at gmail.com Fri Feb 27 14:48:30 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 27 Feb 2009 14:48:30 -0500 Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy In-Reply-To: <64BE787E-059E-4BC4-A9BC-F40B294442DD@gmail.com> References: <49A7171F.5070500@gmail.com> <1cd32cbb0902261547v43de301du7199b3bb7af26c47@mail.gmail.com> <49A800A7.4080407@gmail.com> <1cd32cbb0902270819l7da77aacv9ff8f7ac3c60fd1b@mail.gmail.com> <64BE787E-059E-4BC4-A9BC-F40B294442DD@gmail.com> Message-ID: <1cd32cbb0902271148u12111254t72b76ddbc5efb17b@mail.gmail.com> On Fri, Feb 27, 2009 at 1:54 PM, Pierre GM wrote: > All, > I followed the thread without actively participating, but as the > author of stats.mstats, I feel compelled to jump in. > > When I started working on some masked versions of scipy.stats version, > numpy.ma wasn't part of numpy per se (if I remember correctly), and > the package hadn't been thouroughly checked. Modifying scipy.stats to > recognize masked arrays (for example, by changing _chk_array) wasn't > really an option at the time, because numpy.ma was still considered as > experimental. The easiest was therefore just to duplicate the > functions. I checked that the results were consistent with the non- > masked versions at the time, checked against R also, so I was fairly > confident in the results. However, I didn't strive for exhaustivity: I > coded the functions I needed and some of them direct relatives, but > never tried to expand some more complex functions. > > I'm all in favor for merging the masked and non-masked versions: > that's cleaner, easier to debug and maintain should there be some > changes in signature (or even just doc). There's a few aspects we must > keep in mind however: > > * standard numpy functions usually work well with masked arrays: if > the input is MA, the np function should call MA.__array_wrap__ which > will transform the result back to a MA. 
I have the nagging feeling
> it's not completely fool-proof, however, but the numpy.ma functions
> should always work. if the input is not a MA, then the output will
> never be masked, which may be a problem.
>
> Consider this example:
> >>> x=np.array([0,1,2])
> >>> np.log(x).mean()
> -inf
> >>> ma.log(x).mean()
> 0.34657359027997264
> >>> np.log(ma.array(x)).mean()
> 0.34657359027997264
>
> If we don't transform x to a MA, or don't use the numpy.ma function,
> we just get a NaN/Inf as results. Otherwise, we get a nice float.
>
> * Systematically using MA may be problematic: I can picture cases
> where a standard ndarray is expected as output when a standard ndarray
> is given as input. If we force the conversion, the result will be a
> MA. Should we convert it back to a ndarray ? Using .filled() ? But
> then, with which filling_value ?
>
> In that case, we may want to consider a "usemask" flag: if
> usemask=True and the input is a ndarray, then the output will be a MA,
> otherwise it'll be a ndarray. Using a MA as input would set usemask to
> True no matter what.
>
> * The correlation functions discard masked values pair-wise: we can
> pre-process the inputs and still use the standard functions, so no
> problem here.
>
> * Some functions (threshold) can work directly w/ MA.
>
> * Some functions (the ones based on ranking) should behave differently
> whether the input has masked values (as missing values must be taken
> as ties).
>
>
> About optimization and speed test:
> * There's definitely some room for improvement here: for example,
> instead of using the count method, we could use the count function to
> prevent any unnecessary conversion to MA. (I'd need to optimize the
> count function, but that should be easy...). That'll depend on what we
> decide for handling MA.
> * Just running tests w/ the masked versions of the function will
> always show that they are slower, of course.
> * Slight differences of the order of 1e-15 should not really matter.
> > All, don't hesitate to contact me on or off-list if you have some > specific questions about implementation details. > I still need to look at several examples, before I get a better feeling of how this will work. I just looked at the implementation of ma.var, ma.cov, ma.exp and a few more, and I think they are very well written and I don't see any way how their performance could be improved. Given that gmean is a very simple function, I was pretty surprised about the difference in timing. Now, I think that the main slowdown is that the mask has to be checked in every operation that calls a ma.* version of a function. As we discussed for the OLS case for larger statistical functions, building the main workload with plain arrays will save a lot of overhead. This works for cases where a single compression or fill is correct for all required numerical operations. If we get the correct setup (interface, conversion) for two kinds of functions, any array to ma core of the function", and "any array to plain core", then it will be easier, at least for me, to follow this pattern when (re)writing functions. One more issue is the treatment of nan and masked values, for example, if a function produces nans because of a zero division, then I would want to treat it differently than a missing value in the data. If it is automatically included in the mask then this distinction is lost. Or is there a different use case for this? In your log example, I wouldn't want to get a nice number back. I want the function to complain. Silently changing the definition of mathematical operations creates a huge potential for errors (that's why I also don't like the silent conversions when casting to int) For example, if this is maximum likelihood estimation, the log likelihood is -inf and not some nice number. >>> x=np.array([0,1,2]) >>> np.log(x).mean() I think if users want nice numbers, then they should mask them in the first place. 
Actually, I didn't realize this before, that ma adds additional points to the mask. But, before we start to rewrite and refactor across the board, I still want to finish cleaning up the existing functions and resolve some of the current inconsistencies. Josef From prabhu at aero.iitb.ac.in Fri Feb 27 14:53:56 2009 From: prabhu at aero.iitb.ac.in (Prabhu Ramachandran) Date: Sat, 28 Feb 2009 01:23:56 +0530 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <49A77E91.4010101@ar.media.kyoto-u.ac.jp> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> <3d375d730902260925x332bae2mef6bc9c0907b5918@mail.gmail.com> <49A6FA6D.9030802@creativetrax.com> <49A72956.2090104@enthought.com> <9457e7c80902262137j69478ba9h910dcfac3949af26@mail.gmail.com> <49A77FB7.7010406@aero.iitb.ac.in> <49A77E91.4010101@ar.media.kyoto-u.ac.jp> Message-ID: <49A844D4.6070600@aero.iitb.ac.in> On 02/27/09 11:18, David Cournapeau wrote: > http://en.wikipedia.org/wiki/Corkscrew_(program) > > My .ssh/config looks like this for github > > Host gitproxy > User git > HostName ssh.github.com > Port 443 > ProxyCommand /usr/bin/corkscrew www 3128 %h %p > IdentityFile /home/david/.ssh/id_rsa.pub > > Where www is the name of my proxy and 3128 the port. > > FWIW, svn has similar problems. I could never commit anything from a > former internship location because of some proxy limitations - it is one > of the reasons which pushed me into git for scipy development, actually. > If you can't access either ssh or proxy, my experience is that you are > more or less screwed with any tool out there - but with DVCS, you can at > least put your changes aside and commit them later from an easier > connection. Thanks for the information. 
Unfortunately this doesn't seem to work for me although the network policy isn't anywhere as draconian as yours was/is. I tried cloning using different approaches but none seems to work, maybe I'm doing something wrong: 1. I setup my .ssh/config suitably based on the above (and experimented with various options) Host github.com User git HostName github.com # also tried ssh.github.com Port 443 ProxyCommand /usr/bin/corkscrew my_proxy.iitb.ac.in 80 %h %p /home/prabhu/.ssh/auth IdentityFile /home/prabhu/.ssh/id_dsa.pub 2. $ git clone --origin svn git://github.com/pv/scipy-svn.git scipy.git And it does not work at all. I get this: Initialized empty Git repository in /.../scipy.git/.git/ and nothing for a long while and eventually something like this: github.com[0: 65.74.177.129]: errno=Connection timed out fatal: unable to connect a socket (Connection timed out) fetch-pack from 'git://github.com/pv/scipy-svn.git' failed. svn has worked well for me in this regard. I have always been able to checkin and checkout stuff with svn. Finally, this worked: proxycmd git clone --origin svn http://github.com/pv/scipy-svn.git scipy.git proxycmd is just a simple shell script that prompts for my password and sets up the http_proxy for the subsequent command. 
cheers, prabhu From prabhu at aero.iitb.ac.in Fri Feb 27 14:55:05 2009 From: prabhu at aero.iitb.ac.in (Prabhu Ramachandran) Date: Sat, 28 Feb 2009 01:25:05 +0530 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> <3d375d730902260925x332bae2mef6bc9c0907b5918@mail.gmail.com> <49A6FA6D.9030802@creativetrax.com> <49A72956.2090104@enthought.com> <9457e7c80902262137j69478ba9h910dcfac3949af26@mail.gmail.com> <49A77FB7.7010406@aero.iitb.ac.in> Message-ID: <49A84519.4080706@aero.iitb.ac.in> On 02/27/09 14:48, Pauli Virtanen wrote: > Fri, 27 Feb 2009 11:22:55 +0530, Prabhu Ramachandran wrote: > [clip] >> I've not contributed anything in years to scipy but I have a practical >> problem that might be worth addressing eventually (others might be in a >> similar position) -- my entire network is firewalled and I can only >> access the web behind an authenticated http proxy. The firewall does >> allow ssh connections out though but that seems useless to access a git >> repository hosted on github say. > > Git can clone over HTTP, just change git:// to http:// and > it seems to work. I can also clone through a proxy with > > export http_proxy=http://username:password at proxy:port/ > git clone http://whatever Thanks, this works. > Pushing over HTTP is another question... It's probably not > possible to push to Github over HTTPS, but maybe there are > places that you can push to with only HTTP authentication. OK, thanks. 
prabhu From pgmdevlist at gmail.com Fri Feb 27 15:05:18 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 27 Feb 2009 15:05:18 -0500 Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy In-Reply-To: <1cd32cbb0902271148u12111254t72b76ddbc5efb17b@mail.gmail.com> References: <49A7171F.5070500@gmail.com> <1cd32cbb0902261547v43de301du7199b3bb7af26c47@mail.gmail.com> <49A800A7.4080407@gmail.com> <1cd32cbb0902270819l7da77aacv9ff8f7ac3c60fd1b@mail.gmail.com> <64BE787E-059E-4BC4-A9BC-F40B294442DD@gmail.com> <1cd32cbb0902271148u12111254t72b76ddbc5efb17b@mail.gmail.com> Message-ID: <25A812EA-411A-4CDF-9D59-FF3836370C43@gmail.com> > > Given that gmean is a very simple function, I was pretty surprised > about > the difference in timing. Now, I think that the main slowdown is > that the > mask has to be checked in every operation that calls a ma.* version > of a function. It's actually a tad more complex: ma.log checks the mask of the input, but also converts the output to a MA when needed, with all the overhead of MA.__array_finalize__. > As we discussed for the OLS case for larger statistical functions, > building > the main workload with plain arrays will save a lot of overhead. > This works > for cases where a single compression or fill is correct for all > required > numerical operations. That's indeed the way to go: preprocess a MA to transform it into a ndarray (by dropping masked values, or processing them afterwards), perform the operation, revert to MA if needed. > One more issue is the treatment of nan and masked values, for > example, if a > function produces nans because of a zero division, then I would want > to treat > it differently than a missing value in the data. If it is > automatically included in the > mask then this distinction is lost. Or is there a different use case > for this? Nope. If a value get masked by an operation, you won't be able to track it (unless by comparing the mask of the output w/ the mask of the input). 
> In your log example, I wouldn't want to get a nice number back. I want
> the function to complain.

Because you work w/ ndarrays. If I work w/ MA, I expect it not to crash but to drop the masked values.

> But, before we start to rewrite and refactor across the board, I still
> want to finish cleaning up the existing functions and resolve some of
> the current inconsistencies.

Well, you may double the workload. One way would be to first agree on how we should refactor/reorganize the functions, then clean the ndarray part of each function. We can always raise a NotImplementedError if the input is a MA w/ missing values.

From cournape at gmail.com Fri Feb 27 15:07:10 2009
From: cournape at gmail.com (David Cournapeau)
Date: Sat, 28 Feb 2009 05:07:10 +0900
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To: <49A844D4.6070600@aero.iitb.ac.in>
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A6BC0B.4040301@enthought.com> <3d375d730902260925x332bae2mef6bc9c0907b5918@mail.gmail.com> <49A6FA6D.9030802@creativetrax.com> <49A72956.2090104@enthought.com> <9457e7c80902262137j69478ba9h910dcfac3949af26@mail.gmail.com> <49A77FB7.7010406@aero.iitb.ac.in> <49A77E91.4010101@ar.media.kyoto-u.ac.jp> <49A844D4.6070600@aero.iitb.ac.in>
Message-ID: <5b8d13220902271207m7825e7d2p62fc883cab6570c2@mail.gmail.com>

On Sat, Feb 28, 2009 at 4:53 AM, Prabhu Ramachandran wrote:
> Host github.com
>         User git
>         HostName github.com # also tried ssh.github.com
>         Port 443
>         ProxyCommand /usr/bin/corkscrew my_proxy.iitb.ac.in 80 %h %p /home/prabhu/.ssh/auth
>         IdentityFile /home/prabhu/.ssh/id_dsa.pub

If you cannot go through port 443, that may explain it. One way to check the connection is to ssh directly to github.com (with user git). It will fail (it is not supposed to work), but will tell you something like:

PTY allocation request failed on channel 0
Hi cournape!
You've successfully authenticated, but GitHub does not provide shell access. Connection to github.com closed. > > svn has worked well for me in this regard. ?I have always been able to > checkin and checkout stuff with svn. Yes, it may work for svn and not for git: the related network requirements are not the same. In some cases, http is the only method - on the draconian environment, I used http + push at the end of the day at home. David From josef.pktd at gmail.com Fri Feb 27 15:14:55 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 27 Feb 2009 15:14:55 -0500 Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy In-Reply-To: <1cd32cbb0902271148u12111254t72b76ddbc5efb17b@mail.gmail.com> References: <49A7171F.5070500@gmail.com> <1cd32cbb0902261547v43de301du7199b3bb7af26c47@mail.gmail.com> <49A800A7.4080407@gmail.com> <1cd32cbb0902270819l7da77aacv9ff8f7ac3c60fd1b@mail.gmail.com> <64BE787E-059E-4BC4-A9BC-F40B294442DD@gmail.com> <1cd32cbb0902271148u12111254t72b76ddbc5efb17b@mail.gmail.com> Message-ID: <1cd32cbb0902271214q2e08a68ciead44452234f2f30@mail.gmail.com> > In your log example, I wouldn't want to get a nice number back. I want > the function > to complain. Silently changing the definition of mathematical > operations creates a > huge potential for errors (that's why I also don't like the silent > conversions when casting to int) > For example, if this is maximum likelihood estimation, the log > likelihood is -inf > and not some nice number. >>>> x=np.array([0,1,2]) >>>> np.log(x).mean() > I think if users want nice numbers, then they should mask them in the > first place. the more I think, about >>> np.ma.log([0,1,2]).sum() 0.69314718055994529 >>> np.log([0,1,2]).sum() -inf the more worried, I get about using ma functions. 
One example:
In the fit method of the distributions with bounded support, if there are observations outside of the bounds, then the negative log-likelihood is set to inf:

    cond0 = (x <= self.a) | (x >= self.b)
    if (any(cond0)):
        return inf
    else:
        N = len(x)
        return self._nnlf(x, *args) + N*log(scale)

In this case, it might still produce the correct result since the check is before the aggregation. However, this is implementation specific. If I had assigned the inf before the summation of the log-likelihood contributions, ma.log would have removed them, and killed the boundary check.

So when working with masked array functions, it is necessary to always keep in mind that the math is defined differently, which promises many happy hours of bug hunting.

Josef

From dwf at cs.toronto.edu Fri Feb 27 15:20:20 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Fri, 27 Feb 2009 15:20:20 -0500
Subject: [SciPy-dev] Scikits portal suggestions (Was: The future of SciPy...)
In-Reply-To:
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <1e2af89e0902231029l3111cb59xe7c0d393f381e7b3@mail.gmail.com> <9457e7c80902231205u46d51746xa4c2f87992aa4c59@mail.gmail.com> <1e2af89e0902231229n20568b00kbfdf0786d7157d2@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <9457e7c80902260417q26c20c3es96d26dc0b187691f@mail.gmail.com>
Message-ID:

On 26-Feb-09, at 2:45 PM, Pauli Virtanen wrote:
> 3. PyPi links & instructions for packages that are not in PyPi should
> be hidden.

It sticks out to me that there are a lot of broken PyPI links throughout the portal site. Also, I'm not sure what can be done about it, but the whole site is quite slow for me.
David

From pgmdevlist at gmail.com Fri Feb 27 15:40:59 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 27 Feb 2009 15:40:59 -0500
Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy
In-Reply-To: <1cd32cbb0902271214q2e08a68ciead44452234f2f30@mail.gmail.com>
References: <49A7171F.5070500@gmail.com> <1cd32cbb0902261547v43de301du7199b3bb7af26c47@mail.gmail.com> <49A800A7.4080407@gmail.com> <1cd32cbb0902270819l7da77aacv9ff8f7ac3c60fd1b@mail.gmail.com> <64BE787E-059E-4BC4-A9BC-F40B294442DD@gmail.com> <1cd32cbb0902271148u12111254t72b76ddbc5efb17b@mail.gmail.com> <1cd32cbb0902271214q2e08a68ciead44452234f2f30@mail.gmail.com>
Message-ID: <6CF41A55-2514-410F-8C68-CA46D26A628F@gmail.com>

On Feb 27, 2009, at 3:14 PM, josef.pktd at gmail.com wrote:
> One example:
> In the fit method of the distributions with bounded support, if there
> are observations outside of the bound than the negative log-likelihood
> is set to inf:
>
>     cond0 = (x <= self.a) | (x >= self.b)
>     if (any(cond0)):
>         return inf
>     else:
>         N = len(x)
>         return self._nnlf(x, *args) + N*log(scale)
>
> In this case, it might still produce the correct result since the
> check is before the aggregation. However, this is implementation
> specific. If I had assigned the inf before the summation of the
> log-likelihood contributions, ma.log would have removed them, and
> killed the boundary check.

OK, so you don't want to use the ma functions there. The problem is that you won't be able to use the np versions on a MA either:

>>> x = ma.array([0,1,2],mask=[0,1,0])
>>> np.log(x)
masked_array(data = [-- -- 0.69314718056],
             mask = [ True True False],
             fill_value = 1e+20)

np.log(x) first works on the data, then calls MA.__array_wrap__. This function checks the initial mask, then the context of the function: as it's a domained function, the entries outside the domain are masked as well.

For this kind of problem, the easiest is to decouple:
1. Take a view of the input as a standard ndarray.
2. Process the view
3. Add the mask of the input if needed.

With the previous example, that'd be roughly

>>> ma.array(np.log(x.view(ndarray)), mask=ma.getmask(x))
masked_array(data = [-inf -- 0.69314718056],
             mask = [False True False],
             fill_value = 1e+20)

You keep the masked entry at index 1, but don't mask the entry at index 0.

> So when working with masked array functions, it is necessary to always
> keep in mind that the math is defined differently, which promises many
> happy hours of bug hunting.

Indeed. But once again, the masked versions of the functions are more for convenience. If you need performance, you have to preprocess the inputs by transforming them into standard ndarrays one way or another.
In the case of correlation functions, for example, you can suppress missing values pair-wise (that is, drop the entries of x if the corresponding entries of y are masked, and vice versa).
For a basic linear fit, that might be an approach. A second would be to work by intervals, the limits of the intervals being a masked value.
For more complex fitting (e.g., loess), problems arise. You can bypass them temporarily by raising a NotImplementedError if the inputs are masked; it'd then be up to the user to find a way to fill the inputs.
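
The decoupling steps and the pair-wise suppression described above could be sketched roughly as follows. This is only an illustration, assuming a current NumPy; the helper names `log_keep_input_mask` and `drop_masked_pairs` are hypothetical and do not exist in numpy.ma or scipy.stats.

```python
import numpy as np
import numpy.ma as ma


def log_keep_input_mask(x):
    """Hypothetical helper: log of the raw data, keeping only the input mask."""
    x = ma.asarray(x)
    data = x.view(np.ndarray)              # 1. plain ndarray view of the data
    with np.errstate(divide="ignore", invalid="ignore"):
        out = np.log(data)                 # 2. process the view (log(0) -> -inf)
    # 3. reattach the mask of the input; entries outside the domain of log
    #    stay visible as -inf/nan instead of being silently masked
    return ma.array(out, mask=ma.getmaskarray(x))


def drop_masked_pairs(x, y):
    """Hypothetical helper: drop every pair where x or y is masked."""
    x = ma.asarray(x)
    y = ma.asarray(y)
    keep = ~(ma.getmaskarray(x) | ma.getmaskarray(y))
    return x.view(np.ndarray)[keep], y.view(np.ndarray)[keep]


x = ma.array([0, 1, 2], mask=[0, 1, 0])
print(log_keep_input_mask(x))
```

The two plain arrays returned by `drop_masked_pairs` can then be passed to any ndarray-only routine, which is the preprocessing route suggested for correlation-type functions.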
From pav at iki.fi Fri Feb 27 15:44:14 2009 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 27 Feb 2009 20:44:14 +0000 (UTC) Subject: [SciPy-dev] The future of SciPy and its development infrastructure References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <49A6BC0B.4040301@enthought.com> <3d375d730902260925x332bae2mef6bc9c0907b5918@mail.gmail.com> <49A6FA6D.9030802@creativetrax.com> <49A72956.2090104@enthought.com> <9457e7c80902262137j69478ba9h910dcfac3949af26@mail.gmail.com> <49A77FB7.7010406@aero.iitb.ac.in> <49A83BF5.5070401@aero.iitb.ac.in> Message-ID: Sat, 28 Feb 2009 00:46:05 +0530, Prabhu Ramachandran wrote: [clip] >> SSH is the default transport protocol for Git. > > Hmm, this doesn't seem to work. My guess is that it uses a different > port which clearly won't work unless I force the admins here to open up > the git port. Hmm, that was a direct quote from the git manual, but apparently I took it out of context. > $ git clone --origin svn git://github.com/pv/scipy-svn.git scipy.git > Initialized empty Git repository in /home/prabhu/src/git/scipy.git/.git/ > github.com[0: 65.74.177.129]: errno=Connection timed out fatal: unable > to connect a socket (Connection timed out) fetch-pack from > 'git://github.com/pv/scipy-svn.git' failed. > > Note, that I can certainly ssh just fine to the outside world. You can push via SSH (it does not go via the git:// protocol), but you need first to create an account and set up your public SSH key. Then you can do git push git at github.com:USERNAME/my-repo.git and this does go through port 22. So I think that you can 1) Clone other people's repositories via HTTP. 2) Push to your repository via SSH. So I think Github should work, even in an environment restricted like yours. 
-- Pauli Virtanen From prabhu at aero.iitb.ac.in Fri Feb 27 15:46:04 2009 From: prabhu at aero.iitb.ac.in (Prabhu Ramachandran) Date: Sat, 28 Feb 2009 02:16:04 +0530 Subject: [SciPy-dev] The future of SciPy and its development infrastructure In-Reply-To: <5b8d13220902271207m7825e7d2p62fc883cab6570c2@mail.gmail.com> References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A6BC0B.4040301@enthought.com> <3d375d730902260925x332bae2mef6bc9c0907b5918@mail.gmail.com> <49A6FA6D.9030802@creativetrax.com> <49A72956.2090104@enthought.com> <9457e7c80902262137j69478ba9h910dcfac3949af26@mail.gmail.com> <49A77FB7.7010406@aero.iitb.ac.in> <49A77E91.4010101@ar.media.kyoto-u.ac.jp> <49A844D4.6070600@aero.iitb.ac.in> <5b8d13220902271207m7825e7d2p62fc883cab6570c2@mail.gmail.com> Message-ID: <49A8510C.1010201@aero.iitb.ac.in> On 02/28/09 01:37, David Cournapeau wrote: > On Sat, Feb 28, 2009 at 4:53 AM, Prabhu Ramachandran > wrote: > >> Host github.com >> User git >> HostName github.com # also tried ssh.github.com >> Port 443 >> ProxyCommand /usr/bin/corkscrew my_proxy.iitb.ac.in 80 %h %p >> /home/prabhu/.ssh/auth >> IdentityFile /home/prabhu/.ssh/id_dsa.pub > > If you cannot go through port 443, that may explain it. One way to > check the connection is to ssh directly to github.com (with user git). > It will fail (it is not support to work), but will tell you something > like: > > PTY allocation request failed on channel 0 > Hi cournape! You've successfully authenticated, but GitHub does not > provide shell access. > Connection to github.com closed. Mine tells me this: $ ssh -p 443 github.com The authenticity of host '[ssh.github.com]:443 ()' can't be established. RSA key fingerprint is 16:27:ac:a5:76:28:2d:36:63:1b:56:4d:eb:df:a6:48. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '[ssh.github.com]:443' (RSA) to the list of known hosts. Permission denied (publickey). So it looks like it does work but does not authenticate. 
I hope I don't have to set up a login with github.

cheers,
prabhu

From cournape at gmail.com Fri Feb 27 15:56:26 2009
From: cournape at gmail.com (David Cournapeau)
Date: Sat, 28 Feb 2009 05:56:26 +0900
Subject: [SciPy-dev] The future of SciPy and its development infrastructure
In-Reply-To: <49A8510C.1010201@aero.iitb.ac.in>
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A6FA6D.9030802@creativetrax.com> <49A72956.2090104@enthought.com> <9457e7c80902262137j69478ba9h910dcfac3949af26@mail.gmail.com> <49A77FB7.7010406@aero.iitb.ac.in> <49A77E91.4010101@ar.media.kyoto-u.ac.jp> <49A844D4.6070600@aero.iitb.ac.in> <5b8d13220902271207m7825e7d2p62fc883cab6570c2@mail.gmail.com> <49A8510C.1010201@aero.iitb.ac.in>
Message-ID: <5b8d13220902271256n312bf55btd622f732f82c7c38@mail.gmail.com>

On Sat, Feb 28, 2009 at 5:46 AM, Prabhu Ramachandran wrote:
> On 02/28/09 01:37, David Cournapeau wrote:
>> On Sat, Feb 28, 2009 at 4:53 AM, Prabhu Ramachandran
>> wrote:
>>
>>> Host github.com
>>>         User git
>>>         HostName github.com # also tried ssh.github.com
>>>         Port 443
>>>         ProxyCommand /usr/bin/corkscrew my_proxy.iitb.ac.in 80 %h %p /home/prabhu/.ssh/auth
>>>         IdentityFile /home/prabhu/.ssh/id_dsa.pub
>>
>> If you cannot go through port 443, that may explain it. One way to
>> check the connection is to ssh directly to github.com (with user git).
>> It will fail (it is not supposed to work), but will tell you something
>> like:
>>
>> PTY allocation request failed on channel 0
>> Hi cournape! You've successfully authenticated, but GitHub does not
>> provide shell access.
>> Connection to github.com closed.
>
> Mine tells me this:
>
> $ ssh -p 443 github.com
> The authenticity of host '[ssh.github.com]:443 ( command>)' can't be established.
> RSA key fingerprint is 16:27:ac:a5:76:28:2d:36:63:1b:56:4d:eb:df:a6:48.
> Are you sure you want to continue connecting (yes/no)?
yes > Warning: Permanently added '[ssh.github.com]:443' (RSA) to the list of > known hosts. > Permission denied (publickey). > > So it looks like it does work but does not authenticate. ?I hope I don > t have to setup a login with github. If you want to connect through ssh, I am afraid you don't have a choice. But of course, the no hassle solution is to just clone from http - or even simpler, to get the autogenerated tarball (independently of DVCS or nor, I think that's something we should support anyway). cheers, David From jason-sage at creativetrax.com Fri Feb 27 16:52:04 2009 From: jason-sage at creativetrax.com (jason-sage at creativetrax.com) Date: Fri, 27 Feb 2009 15:52:04 -0600 Subject: [SciPy-dev] Matrix exponential Message-ID: <49A86084.5090506@creativetrax.com> John Cremona posted the following message to the sage development list about matrix exponentials. I'm copying it to here since it asks about the scipy matrix exponential method (we say numpy below, but we really mean scipy...) John Cremona wrote: > >> I have just been to a colloquium talk by numerical analyst Nick Higham > >> (Manchester) called "How to compute and not to compute a matrix > >> exponential". He has new methods which are now in mathematica, matlab > >> and NAG but (apparantly) nowhere else. He only seemed interested in > >> getting good speed & precision to 16 decimals but (when I asked) > >> confirmed that the methods should apply to give arbitrary precision. > >> > >> I just checked and see that Sage's matrix exp() uses something stupid > >> except over RDF/CDF where it uses a pade approximation method via > >> numpy. The method of the talk was a variant of that, the main trick > >> being to use exactly the right order of Pade approx. so maximise > >> precision and speed. > >> > >> I would like to know how good the numpy method is, and whether it can > >> be improved to this "state of the art" version at least for RDF. Then > >> it could be another selling point for Sage. 
> > > > Could you CC the numpy devlist as well on this? It sounds exciting! I will if you give me the address (or you can perhaps?). It might be worth including Higham's URL: http://www.maths.manchester.ac.uk/~higham/ as he has lots of his talks up there including some which are similar to the one I heard. From josef.pktd at gmail.com Fri Feb 27 16:52:11 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 27 Feb 2009 16:52:11 -0500 Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy In-Reply-To: <6CF41A55-2514-410F-8C68-CA46D26A628F@gmail.com> References: <49A7171F.5070500@gmail.com> <1cd32cbb0902261547v43de301du7199b3bb7af26c47@mail.gmail.com> <49A800A7.4080407@gmail.com> <1cd32cbb0902270819l7da77aacv9ff8f7ac3c60fd1b@mail.gmail.com> <64BE787E-059E-4BC4-A9BC-F40B294442DD@gmail.com> <1cd32cbb0902271148u12111254t72b76ddbc5efb17b@mail.gmail.com> <1cd32cbb0902271214q2e08a68ciead44452234f2f30@mail.gmail.com> <6CF41A55-2514-410F-8C68-CA46D26A628F@gmail.com> Message-ID: <1cd32cbb0902271352j928a978g3a3f0cd4b53b69a4@mail.gmail.com> > > Indeed. But once again, the masked versions of the function are more > for convenience. If you need performance, you have to preprocess the > inputs by transforming them into standard ndarrays one way or another. > In the case of correlation functions, for example, you can suppress > missing values pair-wise (that is, drop the entries of x if the > corresponding entries of y are masked, and vice-versa). > For basic linear fit, that might be an approach. A second would be to > work by intervals, the limits of the intervals being a masked value. > For more complex fitting (eg, loess), problems arise. You can bypass > them temporarily by raisong a NotImplementedError if the inputs are > masked, it'd be up to the user to find a way to fill the inputs. 
For most of the current statistical functions, with the exception of different tie handling, I think that we can expand the _chk_asarray to do the necessary preprocessing. I also thought that the return for these functions should be easy, since most of them return statistics and not data arrays. However, looking at some examples, it is not obvious to me which return result and type you would like to have. A good example is `moment`, here is some of the current returns. I think they cover the main return patterns. >>> x array([ 0., 1., NaN, 2.]) masked arrays with masked nan: ------------------------------------------------ do you need a masked array as return type, since all values are valid? how about for t-statistic and p-values? Do p-values need to be masked arrays? >>> stats.mstats.moment(np.ma.fix_invalid(np.ma.column_stack([x,x])),3) masked_array(data = [0.0 0.0], mask = [False False], fill_value = 1e+020) >>> stats.mstats.moment(np.ma.fix_invalid(np.ma.column_stack([x,x])),2) masked_array(data = [0.666666666667 0.666666666667], mask = [False False], fill_value = 1e+020) >>> stats.mstats.moment(np.ma.fix_invalid(np.ma.column_stack([x,x])),1) #inconsistent return type array([ 0., 0.]) masked array without masked values ----------------------------------------------------- same as above about return type >>> stats.mstats.moment(np.ma.column_stack([np.arange(4),np.arange(4)]),2) masked_array(data = [ 1.25 1.25], mask = False, fill_value = 1e+020) masked array with nan that is not masked ------------------------------------------------------------- masked array in, masked array out, nan results converted to mask is this desired? 
>>> stats.mstats.moment(np.ma.column_stack([x,x]),3) masked_array(data = [-- --], mask = [ True True], fill_value = 1e+020) >>> stats.mstats.moment(np.ma.column_stack([x,x]),0) masked_array(data = [-- --], mask = [ True True], fill_value = 1e+020) >>> stats.mstats.moment(np.ma.column_stack([x,x]),1) array([ 0., 0.]) ndarray with nans ------------------------- converted to masked array, nans are masked. here I want to get ndarray with nans returned >>> stats.mstats.moment(np.column_stack([x,x]),0) masked_array(data = [-- --], mask = [ True True], fill_value = 1e+020) >>> stats.mstats.moment(np.column_stack([x,x]),1) array([ 0., 0.]) ndarray without nans ------------------------------ this should return ndarray >>> stats.mstats.moment(np.column_stack([np.arange(4),np.arange(4)]),2) masked_array(data = [ 1.25 1.25], mask = False, fill_value = 1e+020) >>> stats.mstats.moment(np.column_stack([np.arange(4),np.arange(4)]),1) array([ 0., 0.]) If this return "API" is specified, then it is possible to work out some examples to see how the merged function works. If you think converting nans that are the result of calculations to masked arrays are important, then we could add a keyword argument that implies a fix_invalid before returning the results. Josef From robert.kern at gmail.com Fri Feb 27 16:59:43 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 27 Feb 2009 15:59:43 -0600 Subject: [SciPy-dev] Matrix exponential In-Reply-To: <49A86084.5090506@creativetrax.com> References: <49A86084.5090506@creativetrax.com> Message-ID: <3d375d730902271359m404ec3e4n990ce491e63f470@mail.gmail.com> On Fri, Feb 27, 2009 at 15:52, wrote: > John Cremona posted the following message to the sage development list > about matrix exponentials. ?I'm copying it to here since it asks about > the scipy matrix exponential method (we say numpy below, but we really > mean scipy...) 
> > John Cremona wrote: > >> >> I have just been to a colloquium talk by numerical analyst Nick Higham >> >> (Manchester) called "How to compute and not to compute a matrix >> >> exponential". ?He has new methods which are now in mathematica, matlab >> >> and NAG but (apparantly) nowhere else. Are the good methods in this paper? A New Scaling and Squaring Algorithm for the Matrix Exponential (with Awad Al-Mohy), MIMS EPrint 2009.9, January 2009. [new] http://eprints.ma.man.ac.uk/1217/01/covered/MIMS_ep2009_9.pdf -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pgmdevlist at gmail.com Fri Feb 27 17:47:08 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 27 Feb 2009 17:47:08 -0500 Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy In-Reply-To: <1cd32cbb0902271352j928a978g3a3f0cd4b53b69a4@mail.gmail.com> References: <49A7171F.5070500@gmail.com> <1cd32cbb0902261547v43de301du7199b3bb7af26c47@mail.gmail.com> <49A800A7.4080407@gmail.com> <1cd32cbb0902270819l7da77aacv9ff8f7ac3c60fd1b@mail.gmail.com> <64BE787E-059E-4BC4-A9BC-F40B294442DD@gmail.com> <1cd32cbb0902271148u12111254t72b76ddbc5efb17b@mail.gmail.com> <1cd32cbb0902271214q2e08a68ciead44452234f2f30@mail.gmail.com> <6CF41A55-2514-410F-8C68-CA46D26A628F@gmail.com> <1cd32cbb0902271352j928a978g3a3f0cd4b53b69a4@mail.gmail.com> Message-ID: <7CFAD058-CB6E-4CAB-A59B-4AF03FB365A7@gmail.com> On Feb 27, 2009, at 4:52 PM, josef.pktd at gmail.com wrote: > > For most of the current statistical functions, with the exception of > different tie handling, I think that we can expand the _chk_asarray to > do the necessary preprocessing. Mmh. _chk_asarray will always return a MA. Is it what you want? 
Are you An idea is then to use the 'usemask' parameter I was talking about earlier: * if usemask is False (default), return a ndarray * If usemask is True, return a MA * if the input is a MA (w/ or w/o missing values), set usemask to True, and mask the NaNs/Infs first w/ ma.fix_invalid. That way, we need only one function. If we really need it, we can have duplicate functions in scipy.mstats where usemask is set to True by default. Now, for the actual implementation: * usemask=False and some NaNs: return NaN * usemask=True: use the ma implementation. >>>> stats.mstats.moment(np.ma.fix_invalid(np.ma.column_stack([x,x])), >>>> 1) #inconsistent return type > array([ 0., 0.]) That's a bug, we should have a MA. From efiring at hawaii.edu Fri Feb 27 18:01:29 2009 From: efiring at hawaii.edu (Eric Firing) Date: Fri, 27 Feb 2009 13:01:29 -1000 Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy In-Reply-To: <7CFAD058-CB6E-4CAB-A59B-4AF03FB365A7@gmail.com> References: <49A7171F.5070500@gmail.com> <1cd32cbb0902261547v43de301du7199b3bb7af26c47@mail.gmail.com> <49A800A7.4080407@gmail.com> <1cd32cbb0902270819l7da77aacv9ff8f7ac3c60fd1b@mail.gmail.com> <64BE787E-059E-4BC4-A9BC-F40B294442DD@gmail.com> <1cd32cbb0902271148u12111254t72b76ddbc5efb17b@mail.gmail.com> <1cd32cbb0902271214q2e08a68ciead44452234f2f30@mail.gmail.com> <6CF41A55-2514-410F-8C68-CA46D26A628F@gmail.com> <1cd32cbb0902271352j928a978g3a3f0cd4b53b69a4@mail.gmail.com> <7CFAD058-CB6E-4CAB-A59B-4AF03FB365A7@gmail.com> Message-ID: <49A870C9.8000208@hawaii.edu> Pierre GM wrote: > On Feb 27, 2009, at 4:52 PM, josef.pktd at gmail.com wrote: >> For most of the current statistical functions, with the exception of >> different tie handling, I think that we can expand the _chk_asarray to >> do the necessary preprocessing. > > Mmh. _chk_asarray will always return a MA. Is it what you want? 
Are you > > An idea is then to use the 'usemask' parameter I was talking about > earlier: > * if usemask is False (default), return a ndarray > * If usemask is True, return a MA > * if the input is a MA (w/ or w/o missing values), set usemask to > True, and mask the NaNs/Infs first w/ ma.fix_invalid. This may not be appropriate for scipy, but for my own purposes I included a third option for the similar "masked" kwarg in a simple stats class: http://currents.soest.hawaii.edu/hg/hgwebdir.cgi/pycurrents/file/7b4103d34cc8/num/stats.py#l1 masked='auto' makes the output masked if and only if the input is masked. Eric > > That way, we need only one function. If we really need it, we can have > duplicate functions in scipy.mstats where usemask is set to True by > default. > > Now, for the actual implementation: > * usemask=False and some NaNs: return NaN > * usemask=True: use the ma implementation. > > > >>>>> stats.mstats.moment(np.ma.fix_invalid(np.ma.column_stack([x,x])), >>>>> 1) #inconsistent return type >> array([ 0., 0.]) > > That's a bug, we should have a MA. 
> > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev From josef.pktd at gmail.com Fri Feb 27 18:13:06 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 27 Feb 2009 18:13:06 -0500 Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy In-Reply-To: <7CFAD058-CB6E-4CAB-A59B-4AF03FB365A7@gmail.com> References: <49A7171F.5070500@gmail.com> <1cd32cbb0902261547v43de301du7199b3bb7af26c47@mail.gmail.com> <49A800A7.4080407@gmail.com> <1cd32cbb0902270819l7da77aacv9ff8f7ac3c60fd1b@mail.gmail.com> <64BE787E-059E-4BC4-A9BC-F40B294442DD@gmail.com> <1cd32cbb0902271148u12111254t72b76ddbc5efb17b@mail.gmail.com> <1cd32cbb0902271214q2e08a68ciead44452234f2f30@mail.gmail.com> <6CF41A55-2514-410F-8C68-CA46D26A628F@gmail.com> <1cd32cbb0902271352j928a978g3a3f0cd4b53b69a4@mail.gmail.com> <7CFAD058-CB6E-4CAB-A59B-4AF03FB365A7@gmail.com> Message-ID: <1cd32cbb0902271513s5f8a3ff0qeccca599cc704e7d@mail.gmail.com> On Fri, Feb 27, 2009 at 5:47 PM, Pierre GM wrote: > > On Feb 27, 2009, at 4:52 PM, josef.pktd at gmail.com wrote: >> >> For most of the current statistical functions, with the exception of >> different tie handling, I think that we can expand the _chk_asarray to >> do the necessary preprocessing. > > Mmh. _chk_asarray will always return a MA. Is it what you want? Are you > No, what I meant was, that _chk_asarray is currently called for preprocessing in most functions, so it will be easy to use a replacement function to obtain the preprocessed (e.g. compressed) data, and whatever flags (usemask) we need, in the main body of the function and for the decision about the return type. 
> An idea is then to use the 'usemask' parameter I was talking about > earlier: > * if usemask is False (default), return a ndarray > * If usemask is True, return a MA > * if the input is a MA (w/ or w/o missing values), set usemask to > True, and mask the NaNs/Infs first w/ ma.fix_invalid. > > That way, we need only one function. If we really need it, we can have > duplicate functions in scipy.mstats where usemask is set to True by > default. > > Now, for the actual implementation: > * usemask=False and some NaNs: return NaN > * usemask=True: use the ma implementation. > That clarifies the API. I will try to write a prototype, but I spend too much time on scipy this week. > >>>>> stats.mstats.moment(np.ma.fix_invalid(np.ma.column_stack([x,x])), >>>>> 1) #inconsistent return type >> array([ 0., ?0.]) > > That's a bug, we should have a MA. > From josef.pktd at gmail.com Fri Feb 27 18:27:49 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 27 Feb 2009 18:27:49 -0500 Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy In-Reply-To: <49A870C9.8000208@hawaii.edu> References: <49A7171F.5070500@gmail.com> <49A800A7.4080407@gmail.com> <1cd32cbb0902270819l7da77aacv9ff8f7ac3c60fd1b@mail.gmail.com> <64BE787E-059E-4BC4-A9BC-F40B294442DD@gmail.com> <1cd32cbb0902271148u12111254t72b76ddbc5efb17b@mail.gmail.com> <1cd32cbb0902271214q2e08a68ciead44452234f2f30@mail.gmail.com> <6CF41A55-2514-410F-8C68-CA46D26A628F@gmail.com> <1cd32cbb0902271352j928a978g3a3f0cd4b53b69a4@mail.gmail.com> <7CFAD058-CB6E-4CAB-A59B-4AF03FB365A7@gmail.com> <49A870C9.8000208@hawaii.edu> Message-ID: <1cd32cbb0902271527s74f1bf9ds24c9a86d2713b2c6@mail.gmail.com> On Fri, Feb 27, 2009 at 6:01 PM, Eric Firing wrote: > Pierre GM wrote: >> On Feb 27, 2009, at 4:52 PM, josef.pktd at gmail.com wrote: >>> For most of the current statistical functions, with the exception of >>> different tie handling, I think that we can expand the _chk_asarray to >>> do the necessary 
preprocessing. >> >> Mmh. _chk_asarray will always return a MA. Is it what you want? Are you >> >> An idea is then to use the 'usemask' parameter I was talking about >> earlier: >> * if usemask is False (default), return a ndarray >> * If usemask is True, return a MA >> * if the input is a MA (w/ or w/o missing values), set usemask to >> True, and mask the NaNs/Infs first w/ ma.fix_invalid. > > This may not be appropriate for scipy, but for my own purposes I > included a third option for the similar "masked" kwarg in a simple stats > class: > > http://currents.soest.hawaii.edu/hg/hgwebdir.cgi/pycurrents/file/7b4103d34cc8/num/stats.py#l1 > > masked='auto' makes the output masked if and only if the input is masked. > > Eric > Yes, your class looks similar to what I have in mind. But I didn't see a license statement to know whether I'm allowed to look. Also, your broadcastable (squeeze) option looks like a very useful idea. Two differences that I think of are to have the main part in ndarrays while your _y is a masked array, and at this stage we won't switch to classes for the basic statistical functions. Additionally, if I rewrite these functions I would like to get also weights in. 
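
A minimal sketch of how the `usemask` switch quoted above might decide the return type, using a plain mean as a stand-in for a statistical function. The function name `mean_usemask` and its signature are assumptions made for illustration; nothing like it exists in scipy.stats, and the weights and 'auto' options mentioned here are left out.

```python
import numpy as np
import numpy.ma as ma


def mean_usemask(a, axis=0, usemask=False):
    """Hypothetical mean illustrating the proposed `usemask` behaviour."""
    # A masked input (with or without missing values) forces usemask=True.
    if isinstance(a, ma.MaskedArray):
        usemask = True
    if usemask:
        # Mask NaNs/Infs first, then use the ma implementation;
        # the result is always returned as a masked array.
        a = ma.fix_invalid(a)
        return ma.asarray(ma.mean(a, axis=axis))
    # usemask=False: plain ndarray in, plain ndarray out, NaNs propagate.
    return np.mean(np.asarray(a, dtype=float), axis=axis)


x = np.array([0.0, 1.0, np.nan, 2.0])
cols = np.column_stack([x, x])
print(mean_usemask(cols))                # NaN in each column
print(mean_usemask(cols, usemask=True))  # NaNs masked before averaging
```

With this layout the ndarray path never touches numpy.ma, so callers who do not use masked arrays pay no overhead for them.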
Josef From bsouthey at gmail.com Fri Feb 27 22:04:42 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 27 Feb 2009 21:04:42 -0600 Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy In-Reply-To: <1cd32cbb0902271513s5f8a3ff0qeccca599cc704e7d@mail.gmail.com> References: <49A7171F.5070500@gmail.com> <49A800A7.4080407@gmail.com> <1cd32cbb0902270819l7da77aacv9ff8f7ac3c60fd1b@mail.gmail.com> <64BE787E-059E-4BC4-A9BC-F40B294442DD@gmail.com> <1cd32cbb0902271148u12111254t72b76ddbc5efb17b@mail.gmail.com> <1cd32cbb0902271214q2e08a68ciead44452234f2f30@mail.gmail.com> <6CF41A55-2514-410F-8C68-CA46D26A628F@gmail.com> <1cd32cbb0902271352j928a978g3a3f0cd4b53b69a4@mail.gmail.com> <7CFAD058-CB6E-4CAB-A59B-4AF03FB365A7@gmail.com> <1cd32cbb0902271513s5f8a3ff0qeccca599cc704e7d@mail.gmail.com> Message-ID: On Fri, Feb 27, 2009 at 5:13 PM, wrote: > On Fri, Feb 27, 2009 at 5:47 PM, Pierre GM wrote: >> >> On Feb 27, 2009, at 4:52 PM, josef.pktd at gmail.com wrote: >>> >>> For most of the current statistical functions, with the exception of >>> different tie handling, I think that we can expand the _chk_asarray to >>> do the necessary preprocessing. >> >> Mmh. _chk_asarray will always return a MA. Is it what you want? Are you >> > No, what I meant was, that _chk_asarray is currently called for > preprocessing in most functions, so it will be easy to use a replacement > function to obtain the preprocessed (e.g. compressed) data, and whatever > flags (usemask) we need, in the main body of the function and for the > decision about the return type. > I really do not see the requirement for _chk_asarray at all. When a user passes a typical array or masked array then there should be no further processing required. Also _chk_asarray will use ravel() if axis is None but my understanding of many numpy functions operate over a flattened array when there is no axis defined. 
The only case that needs addressing is when a user supplies an object that can be converted to an array otherwise a error needs to be raised. After conversion to an array no further processing is required and even that conversion in some cases will be done within the existing functions. > >> An idea is then to use the 'usemask' parameter I was talking about >> earlier: >> * if usemask is False (default), return a ndarray >> * If usemask is True, return a MA >> * if the input is a MA (w/ or w/o missing values), set usemask to >> True, and mask the NaNs/Infs first w/ ma.fix_invalid. >> >> That way, we need only one function. If we really need it, we can have >> duplicate functions in scipy.mstats where usemask is set to True by >> default. >> >> Now, for the actual implementation: >> * usemask=False and some NaNs: return NaN >> * usemask=True: use the ma implementation. >> > > That clarifies the API. I will try to write a prototype, but I spend > too much time on scipy this week. This is a little messy and there has been discussion regarding this elsewhere. In these terms there are two distinct issues: 1) If the array contains non-finite numbers (NaN, positive and negative infinity) then perhaps the user can strip these out first for example R's mean function has the argument 'na.rm = FALSE'. 2) If non-finite elements arise during the function like taking the log of zero then I think that the user must know these have occurred rather than be forced to check the mask - especially if they have already masked values for other reasons like incomplete data. 
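The first issue above (letting the user strip missing values, in the spirit of R's na.rm) could be sketched as follows. The function name and the na_rm keyword are hypothetical; this version drops only NaN and leaves infinities in place, which is one possible choice rather than an agreed behaviour:

```python
import numpy as np


def mean_na_rm(a, axis=None, na_rm=False):
    """Mean of an array-like with optional NaN removal (sketch)."""
    a = np.asarray(a, dtype=float)
    if not na_rm:
        # Default: any NaN in the input propagates to the result.
        return np.mean(a, axis=axis)

    keep = ~np.isnan(a)
    total = np.where(keep, a, 0.0).sum(axis=axis)
    count = keep.sum(axis=axis)
    # A slice with no valid entries yields nan instead of raising.
    with np.errstate(invalid="ignore", divide="ignore"):
        return total / count
```

For example, mean_na_rm([1.0, np.nan, 3.0]) is nan, while passing na_rm=True averages the two remaining values.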
Bruce From josef.pktd at gmail.com Fri Feb 27 22:52:55 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 27 Feb 2009 22:52:55 -0500 Subject: [SciPy-dev] PEP: Improving the basic statistical functions in Scipy In-Reply-To: References: <49A7171F.5070500@gmail.com> <1cd32cbb0902270819l7da77aacv9ff8f7ac3c60fd1b@mail.gmail.com> <64BE787E-059E-4BC4-A9BC-F40B294442DD@gmail.com> <1cd32cbb0902271148u12111254t72b76ddbc5efb17b@mail.gmail.com> <1cd32cbb0902271214q2e08a68ciead44452234f2f30@mail.gmail.com> <6CF41A55-2514-410F-8C68-CA46D26A628F@gmail.com> <1cd32cbb0902271352j928a978g3a3f0cd4b53b69a4@mail.gmail.com> <7CFAD058-CB6E-4CAB-A59B-4AF03FB365A7@gmail.com> <1cd32cbb0902271513s5f8a3ff0qeccca599cc704e7d@mail.gmail.com> Message-ID: <1cd32cbb0902271952q27e04c3am7ef0c51300d0453f@mail.gmail.com> On Fri, Feb 27, 2009 at 10:04 PM, Bruce Southey wrote: > On Fri, Feb 27, 2009 at 5:13 PM, ? wrote: >> On Fri, Feb 27, 2009 at 5:47 PM, Pierre GM wrote: >>> >>> On Feb 27, 2009, at 4:52 PM, josef.pktd at gmail.com wrote: >>>> >>>> For most of the current statistical functions, with the exception of >>>> different tie handling, I think that we can expand the _chk_asarray to >>>> do the necessary preprocessing. >>> >>> Mmh. _chk_asarray will always return a MA. Is it what you want? Are you >>> >> No, what I meant was, that _chk_asarray is currently called for >> preprocessing in most functions, so it will be easy to use a replacement >> function to obtain the preprocessed (e.g. compressed) data, and whatever >> flags (usemask) we need, in the main body of the function and for the >> decision about the return type. >> > I really do not see the requirement for _chk_asarray at all. When a > user passes a typical array or masked array then there should be no > further processing required. Also _chk_asarray will use ravel() if > axis is None but my understanding of many numpy functions operate over > a flattened array when there is no axis defined. 
> > The only case that needs addressing is when a user supplies an object > that can be converted to an array otherwise a error needs to be > raised. After conversion to an array no further processing is required > and even that conversion in some cases will be done within the > existing functions. The current usage allows to pass lists instead of arrays. This is very convenient for interactive use but might also have other uses, e.g when building a list incrementally. And I thought asarray doesn't have much cost if it is already an array. I didn't look systematically at ravel, but while axis=None works automatically for many numpy functions, for more complex statistical functions more control over the dimension of the input arrays is necessary. Many statistical functions are only designed for 1d or 2d and controlling the dimension at the beginning simplifies the main part of the functions. I had some cases where I was struggling for a while with the dimensions and axis, but in many cases it could be redundant. If we want to handle different array types with the same function then the _chk_asarray call will be replaced by the type specific preprocessing. > > >> >>> An idea is then to use the 'usemask' parameter I was talking about >>> earlier: >>> * if usemask is False (default), return a ndarray >>> * If usemask is True, return a MA >>> * if the input is a MA (w/ or w/o missing values), set usemask to >>> True, and mask the NaNs/Infs first w/ ma.fix_invalid. >>> >>> That way, we need only one function. If we really need it, we can have >>> duplicate functions in scipy.mstats where usemask is set to True by >>> default. >>> >>> Now, for the actual implementation: >>> * usemask=False and some NaNs: return NaN >>> * usemask=True: use the ma implementation. >>> >> >> That clarifies the API. I will try to write a prototype, but I spend >> too much time on scipy this week. > > This is a little messy and there has been discussion regarding this > elsewhere. 
In these terms there are two distinct issues: > 1) If the array contains non-finite numbers (NaN, positive and > negative infinity) then perhaps the user can strip these out first for > example R's mean function has the argument 'na.rm = FALSE'. If the merged functions are able to handle masked arrays and plain ndarrays, then we can also offer the user the option for the treatment of nans, this would make the separate nanmean, ... obsolete. Operations on inf might be too ambiguous and I would think they are the responsibility of the user. And if there is an inf*0 then I want to give them the nan back, and the user can decide what to do. In general inf is a legitimate number and might or should propagate correctly (if the user wants to leave them in) e.g. >>> stats.norm.cdf(-np.inf) 0.0 >>> stats.norm.cdf(np.inf) 1.0 > 2) If non-finite elements arise during the function like taking the > log of zero then I think that the user must know these have occurred > rather than be forced to check the mask - especially if they have > already masked values for other reasons like incomplete data. I agree and I want this behavior for ndarrays, for masked arrays I'm less involved since I'm not using them (yet). I like Eric's use of a trivariate (?) choice with "auto" which adds one option for the user: masked='auto' : True|False|'auto' determines the output; if True, output will be a masked array; if False, output will be an ndarray with nan used as a bad flag if necessary; if 'auto', output will match input What the exact definition is for masked arrays in the case "auto", is up to the masked array users. Josef From stefan at sun.ac.za Sat Feb 28 04:14:08 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Sat, 28 Feb 2009 11:14:08 +0200 Subject: [SciPy-dev] Scikits portal suggestions (Was: The future of SciPy...)
In-Reply-To: References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <9457e7c80902260417q26c20c3es96d26dc0b187691f@mail.gmail.com> Message-ID: <9457e7c80902280114q479da81eyf3ea98b0dbf7f816@mail.gmail.com> 2009/2/27 David Warde-Farley : >> 3. PyPi links & instructions for packages that are not in PyPi should >> be hidden. > > It sticks out to me that there are a lot of broken PyPI links > throughout the portal site. Those are the SciKits that haven't been registered with PyPi. We can easily remove them from the list, but I thought it's better to have them there to start off with. > Also, I'm not sure what can be done about it, but the whole site is > quite slow for me. I'll forward your comments to the developer, thanks. Cheers Stéfan From jason-sage at creativetrax.com Sat Feb 28 05:04:00 2009 From: jason-sage at creativetrax.com (jason-sage at creativetrax.com) Date: Sat, 28 Feb 2009 04:04:00 -0600 Subject: [SciPy-dev] Matrix exponential In-Reply-To: <3d375d730902271359m404ec3e4n990ce491e63f470@mail.gmail.com> References: <49A86084.5090506@creativetrax.com> <3d375d730902271359m404ec3e4n990ce491e63f470@mail.gmail.com> Message-ID: <49A90C10.2040902@creativetrax.com> (I think John probably needs to be CC'd, since I don't know if he's subscribed to this list; so I'm CCing him this message. John, Robert Kern's query is below.) Robert Kern wrote: > On Fri, Feb 27, 2009 at 15:52, wrote: > >> John Cremona posted the following message to the sage development list >> about matrix exponentials. I'm copying it to here since it asks about >> the scipy matrix exponential method (we say numpy below, but we really >> mean scipy...)
>> >> John Cremona wrote: >> >> >>>>> I have just been to a colloquium talk by numerical analyst Nick Higham >>>>> (Manchester) called "How to compute and not to compute a matrix >>>>> exponential". He has new methods which are now in mathematica, matlab >>>>> and NAG but (apparently) nowhere else. >>>>> > > Are the good methods in this paper? > > A New Scaling and Squaring Algorithm for the Matrix Exponential (with > Awad Al-Mohy), MIMS EPrint 2009.9, January 2009. [new] > > http://eprints.ma.man.ac.uk/1217/01/covered/MIMS_ep2009_9.pdf > > From jason-sage at creativetrax.com Sat Feb 28 05:09:50 2009 From: jason-sage at creativetrax.com (jason-sage at creativetrax.com) Date: Sat, 28 Feb 2009 04:09:50 -0600 Subject: [SciPy-dev] Matrix exponential In-Reply-To: <158627b90902271451x5a46daf3q6736a7fc9e2bb7d2@mail.gmail.com> References: <49A86084.5090506@creativetrax.com> <158627b90902271451x5a46daf3q6736a7fc9e2bb7d2@mail.gmail.com> Message-ID: <49A90D6E.3060407@creativetrax.com> (It appears that this postscript from John (at the top of the message below) didn't make it to the scipy-dev list, probably because he's not subscribed, so I'm forwarding the message to the scipy-dev list.) -Jason John Cremona wrote: > PS An interesting quote from one of Higham's talks: > > "The availability of expm(A) in > early versions of MATLAB > quite possibly contributed to > the system's technical and commercial success." > -- Cleve Moler (2003) > > I get the impression that this is used a lot, though they only seem to > want double precision (as opposed to multi) which is both fast and has > predictably bounded error. The method is a variant of a standard one > (Pade approximations) with some nice tricks as some once-and-for all > parameter tuning (for which he said they used Maple!) > > John Cremona > > 2009/2/27 : > >> John Cremona posted the following message to the sage development list about >> matrix exponentials.
I'm copying it to here since it asks about the scipy >> matrix exponential method (we say numpy below, but we really mean scipy...) >> >> John Cremona wrote: >> >> >>>>> I have just been to a colloquium talk by numerical analyst Nick Higham >>>>> (Manchester) called "How to compute and not to compute a matrix >>>>> exponential". He has new methods which are now in mathematica, matlab >>>>> and NAG but (apparantly) nowhere else. He only seemed interested in >>>>> getting good speed & precision to 16 decimals but (when I asked) >>>>> confirmed that the methods should apply to give arbitrary precision. >>>>> >>>>> I just checked and see that Sage's matrix exp() uses something stupid >>>>> except over RDF/CDF where it uses a pade approximation method via >>>>> numpy. The method of the talk was a variant of that, the main trick >>>>> being to use exactly the right order of Pade approx. so maximise >>>>> precision and speed. >>>>> >>>>> I would like to know how good the numpy method is, and whether it can >>>>> be improved to this "state of the art" version at least for RDF. Then >>>>> it could be another selling point for Sage. >>>>> >>> Could you CC the numpy devlist as well on this? It sounds exciting! >>> >> I will if you give me the address (or you can perhaps?). It might be >> worth including Higham's URL: >> http://www.maths.manchester.ac.uk/~higham/ as he has lots of his >> talks up there including some which are similar to the one I heard. >> >> >> >> >> > > From pav at iki.fi Sat Feb 28 06:33:10 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 28 Feb 2009 11:33:10 +0000 (UTC) Subject: [SciPy-dev] Scikits portal suggestions (Was: The future of SciPy...) 
References: <9457e7c80902230230t568101ebib7124e97c636ad5f@mail.gmail.com> <49A339F5.1040703@enthought.com> <1e2af89e0902231621l5f1e3a5bnb2b363356cbdfca0@mail.gmail.com> <9457e7c80902232151m7913fc22la7fda16384faf60e@mail.gmail.com> <49A5AB15.4060609@enthought.com> <8C95DED3-E418-40AE-983C-6547E4EC7083@stsci.edu> <49A5D1CD.6070700@enthought.com> <9457e7c80902260417q26c20c3es96d26dc0b187691f@mail.gmail.com> <9457e7c80902280114q479da81eyf3ea98b0dbf7f816@mail.gmail.com> Message-ID: Sat, 28 Feb 2009 11:14:08 +0200, Stéfan van der Walt wrote: > 2009/2/27 David Warde-Farley : >>> 3. PyPi links & instructions for packages that are not in PyPi should >>> be hidden. >> >> It sticks out to me that there are a lot of broken PyPI links >> throughout the portal site. > > Those are the SciKits that haven't been registered with PyPi. We can > easily remove them from the list, but I thought it's better to have them > there to start off with. I think it's good to have the scikits there, but if a Scikit isn't in PyPi, it's best not to show broken PyPi links or easy_install instructions. -- Pauli Virtanen From jason-sage at creativetrax.com Sat Feb 28 07:31:31 2009 From: jason-sage at creativetrax.com (jason-sage at creativetrax.com) Date: Sat, 28 Feb 2009 06:31:31 -0600 Subject: [SciPy-dev] Matrix exponential In-Reply-To: <3d375d730902271359m404ec3e4n990ce491e63f470@mail.gmail.com> References: <49A86084.5090506@creativetrax.com> <3d375d730902271359m404ec3e4n990ce491e63f470@mail.gmail.com> Message-ID: <49A92EA3.7010600@creativetrax.com> Robert Kern wrote: > On Fri, Feb 27, 2009 at 15:52, wrote: > >> John Cremona posted the following message to the sage development list >> about matrix exponentials. I'm copying it to here since it asks about >> the scipy matrix exponential method (we say numpy below, but we really >> mean scipy...)
>> >> John Cremona wrote: >> >> >>>>> I have just been to a colloquium talk by numerical analyst Nick Higham >>>>> (Manchester) called "How to compute and not to compute a matrix >>>>> exponential". He has new methods which are now in mathematica, matlab >>>>> and NAG but (apparantly) nowhere else. >>>>> > > Are the good methods in this paper? > > A New Scaling and Squaring Algorithm for the Matrix Exponential (with > Awad Al-Mohy), MIMS EPrint 2009.9, January 2009. [new] > > http://eprints.ma.man.ac.uk/1217/01/covered/MIMS_ep2009_9.pdf > > John replied to me by private email and said: "The answer to their question is -- yes, that paper looks like what he was talking about. If people wish to look into it, good. Otherwise I don't mind!" Thanks, Jason From grh at mur.at Sat Feb 28 07:50:47 2009 From: grh at mur.at (Georg Holzmann) Date: Sat, 28 Feb 2009 13:50:47 +0100 Subject: [SciPy-dev] talkbox scikit Message-ID: <49A93327.9000701@mur.at> Hallo David ! I want to contribute some code to the talkbox scikit and have some questions. - First, is this scikit also for audio signal processing / audio (music) feature extraction, or mainly for speech only ? - Second, I have implemented some (random) code for audio signal processing which IMHO would be nice to have in a scikit: * Implementation of a Generalized Cross Correlation (GCC) with various pre-whitening filters. (after "The Generalized Correlation Method for Estimation of Time Delay" by Charles Knapp and Clifford Carter, programmed with looking at the matlab GCC implementation by Davide Renzi) this function is used for robustly determine the time delay between two real signals * Equivalent Rectangular Bandwidth Filter Coefficients for biquad IIR Filters. (implemented after "An Efficient Implementation of the Patterson-Holdsworth Auditory Filter Bank" by Malcolm Slaney) * Filter coefficients for a bank of Gammatone filters. 
(implemented after "An Efficient Implementation of the Patterson-Holdsworth Auditory Filter Bank" by Malcolm Slaney) Implementation also with multiple biquad filters, to avoid numerical instabilities. * Common filter parameters for audio biquad IIR filters (after "Cookbook formulae for audio EQ biquad filter coefficients", http://www.musicdsp.org/files/Audio-EQ-Cookbook.txt) * Conversion of linear IIR filter parameters to a minimum phase filter with the same amplitude response. * MFCC feature extraction (but I have seen that you already have implemented mfccs...) * I plan to implement more audio/music feature extraction methods in the near future (chroma features, beat features, beat-synchronous features ...) - OK, would this be suited for talkbox ? If yes, are there any guidelines on how to contribute (I will adapt the code to your scikit style, writing more tests and docs) - I mean, should I just send you the code by mail ? - In which categories should I put all these ? So I propose all the filter parameter calculations in talkbox/fbanks/, feature extraction methods of course into talkbox/features/ and the generalized cross correlation maybe into talkbox/tools/correlations.py, or maybe in a separate file ... ? - And last but not least, is this the right mailing list for such discussions ;) ? Or are there any special lists for scikits ? OK, thanks for any feedback ! LG Georg From cournape at gmail.com Sat Feb 28 11:57:38 2009 From: cournape at gmail.com (David Cournapeau) Date: Sun, 1 Mar 2009 01:57:38 +0900 Subject: [SciPy-dev] Improving the bug tracking workflow: starting document In-Reply-To: <1cd32cbb0902270456y3f9ebc10s1eed3ad9fef0010f@mail.gmail.com> References: <49A7A445.4010800@ar.media.kyoto-u.ac.jp> <1cd32cbb0902270456y3f9ebc10s1eed3ad9fef0010f@mail.gmail.com> Message-ID: <5b8d13220902280857o2495ba6brc8b3ad94f7d42c6f@mail.gmail.com> On Fri, Feb 27, 2009 at 9:56 PM, wrote: > On Fri, Feb 27, 2009 at 3:28 AM, David Cournapeau > wrote: >> Hi, >> >>
Following the discussions, I have started to write a small document >> highlighting my current gripes with trac. I focus on some common >> scenario, and pin-point trac limitations. I mention possible new tools >> at the end, but that's not the main point: everybody who is also >> dissatisfied with trac, and maybe even more importantly people who are >> currently satisfied and think their scenario is not covered should feel >> free to comment/modify it: >> >> http://scipy.org/scipy/numpy/wiki/ImprovingIssueWorkflow >> >> I put the initial version in svn as well: >> http://projects.scipy.org/scipy/numpy/browser/trunk/doc/neps/newbugtracker.rst. >> >> cheers, >> >> David > > Just two quick comments: > > * I like the integration of the bug tracker and svn, browsing between > old tickets and revisions is pretty easy. Similarly, integrated > timeline for svn and issue tracker makes tracking new code and issues > easy. Yes, I agree this is a useful functionality. I think most integrated solutions ala trac/redmine and co have this feature. > * eclipse integration with trac issues works well with mylyn, but I > haven't used it much and not for scipy, > eclipse integration with svn is very good. I note that you care about eclipse integration. This is an important point I think for other people as well, and something I can't really document myself, as I don't use an IDE.
David From cournape at gmail.com Sat Feb 28 12:01:37 2009 From: cournape at gmail.com (David Cournapeau) Date: Sun, 1 Mar 2009 02:01:37 +0900 Subject: [SciPy-dev] Improving the bug tracking workflow: starting document In-Reply-To: <1cd32cbb0902270625t69592f4n583c4eabcbafc481@mail.gmail.com> References: <49A7A445.4010800@ar.media.kyoto-u.ac.jp> <1cd32cbb0902270456y3f9ebc10s1eed3ad9fef0010f@mail.gmail.com> <1cd32cbb0902270625t69592f4n583c4eabcbafc481@mail.gmail.com> Message-ID: <5b8d13220902280901s11ac7e99mb183c990b0ed6077@mail.gmail.com> On Fri, Feb 27, 2009 at 11:25 PM, wrote: > > I connected my eclipse mylyn with the scipy trac tickets and have > tickets that are assigned to me on my local computer where I can mark > them as read or unread. But for now I like the web interface better. Different people, different workflows :) Web-UI is of course a must, that's definitely easier for newcomers anyway (no need to install anything). Some things could be improved in the WEB-UI as well: batch editing, for example, is something I really like with redmine - reassigning things in batch has to be done through SQL I believe. > Also, since David mentioned sql queries in another thread, I set up a > report that sorts tickets by change time. It helps to see which > tickets were recently commented on. But since this is the first time > I do this, and I'm not very familiar with sql, this still needs > improvements. With trac 0.11, there will not be much need for SQL: there is finally a user-friendly interface for advanced queries with trac macro systems. This does not help much for batch editing, though. Maybe we will have to implement our own plugin to trac if we stay with trac.
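As an illustration of the kind of report quoted above (open tickets sorted by change time), here is a small self-contained sketch. It runs against a toy in-memory table; the column names are assumed to resemble Trac's ticket table, and the ticket rows are invented sample data:

```python
import sqlite3

# Report query: open tickets, most recently changed first.
REPORT_SQL = """
SELECT id AS ticket, summary, owner, changetime AS modified
FROM ticket
WHERE status <> 'closed'
ORDER BY changetime DESC
"""


def recently_changed(conn):
    """Run the report and return its rows as a list of tuples."""
    return conn.execute(REPORT_SQL).fetchall()


# Toy stand-in for the tracker database (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ticket (id INTEGER PRIMARY KEY, summary TEXT, "
    "owner TEXT, status TEXT, changetime INTEGER)"
)
conn.executemany(
    "INSERT INTO ticket VALUES (?, ?, ?, ?, ?)",
    [
        (101, "example ticket A", "alice", "new", 1235800000),
        (102, "example ticket B", "bob", "closed", 1235900000),
        (103, "example ticket C", "alice", "assigned", 1235850000),
    ],
)
rows = recently_changed(conn)
```

Adding a filter on owner, milestone or component to the WHERE clause would give the per-person or per-milestone variants.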
David From stefan at sun.ac.za Sat Feb 28 14:32:41 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Sat, 28 Feb 2009 21:32:41 +0200 Subject: [SciPy-dev] Improving the bug tracking workflow: starting document In-Reply-To: <49A7FB3F.2070908@gmail.com> References: <49A7A445.4010800@ar.media.kyoto-u.ac.jp> <1cd32cbb0902270456y3f9ebc10s1eed3ad9fef0010f@mail.gmail.com> <1cd32cbb0902270625t69592f4n583c4eabcbafc481@mail.gmail.com> <49A7FB3F.2070908@gmail.com> Message-ID: <9457e7c80902281132q1416965ena47c0af9d7b550b9@mail.gmail.com> 2009/2/27 Michael Abshoff : > The scipy trac seems to be version 0.10.2 which has a number of known > security issues. That's about to change! We've already set up a new NumPy Trac for testing. Hopefully we can soon switch over to http://new.scipy.org/trac/numpy/timeline You'll notice that the new server and 0.11 is much more responsive. Cheers Stéfan From michael.abshoff at googlemail.com Sat Feb 28 14:47:04 2009 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Sat, 28 Feb 2009 11:47:04 -0800 Subject: [SciPy-dev] Improving the bug tracking workflow: starting document In-Reply-To: <9457e7c80902281132q1416965ena47c0af9d7b550b9@mail.gmail.com> References: <49A7A445.4010800@ar.media.kyoto-u.ac.jp> <1cd32cbb0902270456y3f9ebc10s1eed3ad9fef0010f@mail.gmail.com> <1cd32cbb0902270625t69592f4n583c4eabcbafc481@mail.gmail.com> <49A7FB3F.2070908@gmail.com> <9457e7c80902281132q1416965ena47c0af9d7b550b9@mail.gmail.com> Message-ID: <49A994B8.2050105@gmail.com> Stéfan van der Walt wrote: > 2009/2/27 Michael Abshoff : >> The scipy trac seems to be version 0.10.2 which has a number of known >> security issues. > > That's about to change! We've already set up a new NumPy Trac for > testing. Hopefully we can soon switch over to > > http://new.scipy.org/trac/numpy/timeline > > You'll notice that the new server and 0.11 is much more responsive.
Well, I rarely look at numpy or scipy's trac since I have plenty of things to do, too, and I am not the guy dealing with numpy/scipy problems in Sage at the moment :) I was just surprised that such an obvious problem with trac+apache or other crippling issues with your trac install weren't just fixed. It took William a couple hours to migrate four trac installs, one of them with a 400 MB database. If Sage's trac had the performance of the ones you had to deal with I would not have rested until the issue were fixed since no working trac means standstill for Sage. > Cheers > Stéfan Cheers, Michael > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > From josef.pktd at gmail.com Sat Feb 28 15:44:40 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 28 Feb 2009 15:44:40 -0500 Subject: [SciPy-dev] Improving the bug tracking workflow: starting document In-Reply-To: <5b8d13220902280901s11ac7e99mb183c990b0ed6077@mail.gmail.com> References: <49A7A445.4010800@ar.media.kyoto-u.ac.jp> <1cd32cbb0902270456y3f9ebc10s1eed3ad9fef0010f@mail.gmail.com> <1cd32cbb0902270625t69592f4n583c4eabcbafc481@mail.gmail.com> <5b8d13220902280901s11ac7e99mb183c990b0ed6077@mail.gmail.com> Message-ID: <1cd32cbb0902281244v705bfd35wa9ede0a7f47408df@mail.gmail.com> On Sat, Feb 28, 2009 at 12:01 PM, David Cournapeau wrote: > On Fri, Feb 27, 2009 at 11:25 PM, wrote: > >> >> I connected my eclipse mylyn with the scipy trac tickets and have >> tickets that are assigned to me on my local computer where I can mark >> them as read or unread. But for now I like the web interface better. > > Different people, different workflows :) Web-UI is of course a must, > that's definitely easier for newcomers anyway (no need to install > anything).
Some things could be improved in the WEB-UI as well: batch > editing, for example, is something I really like with redmine - > reassigning things in batch has to be done through SQL I believe. > >> Also, since David mentioned sql queries in another thread, I set up a >> report that sorts tickets by change time. It helps to see which >> tickets where recently commented on. But since this is the first time, >> I do this, and I'm not very familiar with sql, this still needs >> improvements. > > With trac 0.11, there will be no much need for SQL: there is finally a > user-friendly interface for advanced queries with trac macro systems. > This does not help much for batch editing, though. Maybe we will have > to implement our own plugin to trac if we stay with trac. > > David From what I have seen from a quick look at the new trac 0.11.3, it still doesn't allow custom queries sorted by changedate. So it will still be useful to set up a set of reports by changedate and milestone or version or component as in http://trac.edgewall.org/report Do you mean the ticket query macro for the wiki or is there some other way for doing advanced queries? I didn't find the ticket query macro very useful for my own use since it doesn't provide the nice table. The XML-RPC interface to trac might work quite well for scripting, this is also how eclipse/mylyn synchronizes with the trac tickets. On eclipse usage: I used it, together with svn integration, quite a bit for pure python packages. Since I build and install scipy into new directories each time, I find the workspace concept of eclipse a bit cumbersome, and if I'm not sure how things work, I just work with the Idle shell. But since eclipse is distributed with pythonxy, there might be more users of eclipse.
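A hedged sketch of the scripting route mentioned above: listing open tickets by change time over XML-RPC. It assumes the Trac instance has the XmlRpcPlugin installed and that it exposes ticket.query and ticket.get (the latter returning id, creation time, change time and an attribute dict); the URL in the usage note is a placeholder:

```python
import xmlrpc.client


def connect(url):
    """Open an XML-RPC proxy; dates arrive as datetime objects."""
    return xmlrpc.client.ServerProxy(url, use_builtin_types=True)


def tickets_by_change_time(server, query="status!=closed"):
    """Return (changed, ticket_id, summary) tuples, newest change first.

    `server` is anything exposing ticket.query(qstr) and ticket.get(id)
    in the shape assumed above.
    """
    rows = []
    for ticket_id in server.ticket.query(query):
        tid, _created, changed, attrs = server.ticket.get(ticket_id)
        rows.append((changed, tid, attrs.get("summary", "")))
    rows.sort(reverse=True)
    return rows
```

Usage would look like tickets_by_change_time(connect("https://example.org/trac/project/login/xmlrpc")), with the real endpoint and credentials substituted.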
Josef From cournape at gmail.com Sat Feb 28 16:39:00 2009 From: cournape at gmail.com (David Cournapeau) Date: Sun, 1 Mar 2009 06:39:00 +0900 Subject: [SciPy-dev] Improving the bug tracking workflow: starting document In-Reply-To: <1cd32cbb0902281244v705bfd35wa9ede0a7f47408df@mail.gmail.com> References: <49A7A445.4010800@ar.media.kyoto-u.ac.jp> <1cd32cbb0902270456y3f9ebc10s1eed3ad9fef0010f@mail.gmail.com> <1cd32cbb0902270625t69592f4n583c4eabcbafc481@mail.gmail.com> <5b8d13220902280901s11ac7e99mb183c990b0ed6077@mail.gmail.com> <1cd32cbb0902281244v705bfd35wa9ede0a7f47408df@mail.gmail.com> Message-ID: <5b8d13220902281339v643d0ea0xd7a43fe51a0d332@mail.gmail.com> On Sun, Mar 1, 2009 at 5:44 AM, wrote: > On Sat, Feb 28, 2009 at 12:01 PM, David Cournapeau wrote: >> On Fri, Feb 27, 2009 at 11:25 PM, ? wrote: >> >>> >>> I connected my eclipse mylyn with the scipy trac tickets and have >>> tickets that are assigned to me on my local computer where I can mark >>> them as read or unread. But for now I like the web interface better. >> >> Different people, different workflows :) Web-UI is of course a must, >> that's definitely easier for newcomers anyway (no need to install >> anything). Some things could be improved in the WEB-UI as well: batch >> editing, for example, is something I really like with redmine - >> reassigning things in batch has to be done through SQL I believe. >> >>> Also, since David mentioned sql queries in another thread, I set up a >>> report that sorts tickets by change time. It helps to see which >>> tickets where recently commented on. But since this is the first time, >>> I do this, and I'm not very familiar with sql, this still needs >>> improvements. >> >> With trac 0.11, there will be no much need for SQL: there is finally a >> user-friendly interface for advanced queries with trac macro systems. >> This does not help much for batch editing, though. Maybe we will have >> to implement our own plugin to trac if we stay with trac. 
>> >> David > > From what I have seen from a quick look at the new trac 0.11.3, it still > doesn't allow custom queries sorted by changedate. So it will still > be useful to set up a set of reports by changedate and milestone or > version or component as in http://trac.edgewall.org/report It looks like I read a bit too quickly, and it will be for 0.12: http://trac.edgewall.org/milestone/0.12 > Do you mean the ticket query macro for the wiki or is there some other > way for doing advanced queries? I didn't find the ticket query macro > very useful for my own use since it doesn't provide the nice table. I don't care too much about the presentation - this can be improved. As you mentioned, the xml-rpc might be very useful - I may be able to program something for my own workflow. After all, that's what python is for :) cheers, David