From charlesr.harris at gmail.com  Sun Apr  1 16:52:53 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 1 Apr 2007 14:52:53 -0600
Subject: [SciPy-dev] Violation of array scalar multiplication rules?
Message-ID:

Just asking.

In [35]: type(array(1.0)*2)
Out[35]: <type 'numpy.float64'>

In [36]: type(array(1.0))
Out[36]: <type 'numpy.ndarray'>

Chuck

From david at ar.media.kyoto-u.ac.jp  Tue Apr  3 07:11:48 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 03 Apr 2007 20:11:48 +0900
Subject: [SciPy-dev] Improving scipy.clusters
Message-ID: <46123674.2070607@ar.media.kyoto-u.ac.jp>

Hi there,

    I would like to clean up and improve the kmeans algorithm (more
initialization schemes for the algorithm, and better docs). I already
have write access to the scipy svn repo, but as scipy.clusters is
neither in the sandbox nor my code, I would like to make sure it is OK,

cheers,

David

From robert.kern at gmail.com  Tue Apr  3 13:02:50 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 03 Apr 2007 12:02:50 -0500
Subject: [SciPy-dev] Improving scipy.clusters
In-Reply-To: <46123674.2070607@ar.media.kyoto-u.ac.jp>
References: <46123674.2070607@ar.media.kyoto-u.ac.jp>
Message-ID: <461288BA.6060205@gmail.com>

David Cournapeau wrote:
> Hi there,
>
>     I would like to clean up and improve the kmeans algorithm (more
> initialization schemes for the algorithm, and better docs). I already
> have write access to the scipy svn repo, but as scipy.clusters is
> neither in the sandbox nor my code, I would like to make sure it is OK,

It is. Thank you!

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From oliphant at ee.byu.edu  Tue Apr  3 14:28:07 2007
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Tue, 03 Apr 2007 12:28:07 -0600
Subject: [SciPy-dev] Improving scipy.clusters
In-Reply-To: <46123674.2070607@ar.media.kyoto-u.ac.jp>
References: <46123674.2070607@ar.media.kyoto-u.ac.jp>
Message-ID: <46129CB7.5060403@ee.byu.edu>

David Cournapeau wrote:
> Hi there,
>
>     I would like to clean up and improve the kmeans algorithm (more
> initialization schemes for the algorithm, and better docs). I already
> have write access to the scipy svn repo, but as scipy.clusters is
> neither in the sandbox nor my code, I would like to make sure it is OK,

I'm pretty sure nobody else is working on it at the moment so please do
whatever you can.

Many thanks,

-Travis

From oliphant at ee.byu.edu  Tue Apr  3 18:43:29 2007
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Tue, 03 Apr 2007 16:43:29 -0600
Subject: [SciPy-dev] NumPy 1.0.2 released
Message-ID: <4612D891.8040200@ee.byu.edu>

To all SciPy / NumPy users:

NumPy 1.0.2 was released yesterday (4-02-07). Get it by following the
download link at

http://numpy.scipy.org

This is a bug-fix release with a couple of additional features. Thanks
to everybody who helped track down and fix bugs.

-Travis

From bgoli at sun.ac.za  Wed Apr  4 02:02:08 2007
From: bgoli at sun.ac.za (Brett Olivier)
Date: Wed, 4 Apr 2007 08:02:08 +0200
Subject: [SciPy-dev] scipy.test "generic 1d filter" crashes interpreter on windows
Message-ID: <200704040802.08155.bgoli@sun.ac.za>

Hi

Running scipy.test() after installing an SVN version of scipy on
windows ('0.5.3.dev2895') causes the interpreter to crash on the
"generic 1d filter" test.
This problem seems to be windows-specific and has appeared sometime
after '0.5.3.dev2866'.

scipy.test(1,10)
generation of a binary structure 3 ... ok
generation of a binary structure 4 ... ok
generic filter 1 ... ERROR
generic 1d filter 1

Build environment
scipy.__version__ = '0.5.3.dev2895'
numpy.__version__ = '1.0.3.dev3657'
WinXP, Python2.4, MinGW (gcc 3.4.5), ATLAS 3.7.11

TIA
Brett

From stefan at sun.ac.za  Wed Apr  4 04:25:52 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Wed, 4 Apr 2007 10:25:52 +0200
Subject: [SciPy-dev] scipy.test "generic 1d filter" crashes interpreter on windows
In-Reply-To: <200704040802.08155.bgoli@sun.ac.za>
References: <200704040802.08155.bgoli@sun.ac.za>
Message-ID: <20070404082552.GP18196@mentat.za.net>

Hi Brett

I fixed a couple of issues in ndimage regarding spline interpolation,
which required porting some of the code from numarray to numpy. The
memory leak was probably introduced then.

I ran the whole test suite under valgrind, which didn't report any
problems under linux (just did it again to make sure, and it's still
fine). Unfortunately, I am not familiar enough with windows systems to
know what the equivalent of valgrind would be. Is there any way you
can localise the problem further?

Just to be safe, please make sure you are doing a clean build of scipy > r2889.

Regards
Stéfan

On Wed, Apr 04, 2007 at 08:02:08AM +0200, Brett Olivier wrote:
> Hi
>
> Running scipy.test() after installing an SVN version of scipy on
> windows ('0.5.3.dev2895') causes the interpreter to crash on
> the "generic 1d filter" test. This problem seems to be
> windows-specific and has appeared sometime after '0.5.3.dev2866'.
>
> scipy.test(1,10)
> generation of a binary structure 3 ... ok
> generation of a binary structure 4 ... ok
> generic filter 1 ... ERROR
> generic 1d filter 1
>
> Build environment
> scipy.__version__ = '0.5.3.dev2895'
> numpy.__version__ = '1.0.3.dev3657'
> WinXP, Python2.4, MinGW (gcc 3.4.5), ATLAS 3.7.11
>
> TIA
> Brett

From cimrman3 at ntc.zcu.cz  Wed Apr  4 04:37:42 2007
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Wed, 04 Apr 2007 10:37:42 +0200
Subject: [SciPy-dev] UMFPACKv4.4 and swig memory leak of type 'void *'
In-Reply-To: <45E54D84.8010103@iam.uni-stuttgart.de>
References: <45E54D84.8010103@iam.uni-stuttgart.de>
Message-ID: <461363D6.3010305@ntc.zcu.cz>

Nils Wagner wrote:
> Hi all,
>
> scipy.test(1,10) reports a lot of memory leaks, e.g.
>
> Getting factors of complex matrixswig/python detected a memory leak of
> type 'void *', no destructor found.
> Getting factors of real matrixswig/python detected a memory leak of type
> 'void *', no destructor found.
> Solve with UMFPACK: double precision complexswig/python detected a
> memory leak of type 'void *', no destructor found.
> swig/python detected a memory leak of type 'void *', no destructor found.
> ... ok
> Solve: single precision complexUse minimum degree ordering on A'+A.
> ... ok
> Solve with UMFPACK: double precisionswig/python detected a memory leak
> of type 'void *', no destructor found.
> swig/python detected a memory leak of type 'void *', no destructor found.
> ... ok
>
> Is this a swig problem?
>
> I am using
>
> Numpy version 1.0.2.dev3562
> Scipy version 0.5.3.dev2774

Hi Nils,

it was not a swig problem but mine (passing a bad 'own' flag in a
typemap). One can be really blind...

Now it seems fixed. (rev. 2896)

r.
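Going back to Brett's "generic 1d filter" crash above: a minimal
standalone script along the following lines may help localise the
problem. This is only a sketch, assuming scipy.ndimage's
numarray-derived API, where the callback receives the padded input
line and must fill the output line in place; if this bare script also
brings down the interpreter on windows, the problem is in the
generic_filter1d callback machinery rather than in the test setup.

    import numpy
    from scipy import ndimage

    def _moving_sum(iline, oline):
        # iline is the padded input line; with filter_size=3 it is two
        # samples longer than oline, which must be filled in place
        oline[...] = iline[:-2] + iline[1:-1] + iline[2:]

    a = numpy.arange(12, dtype=numpy.float64).reshape(3, 4)
    print ndimage.generic_filter1d(a, _moving_sum, filter_size=3)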
From a.schmolck at gmx.net  Fri Apr  6 19:05:58 2007
From: a.schmolck at gmx.net (Alexander Schmolck)
Date: 07 Apr 2007 00:05:58 +0100
Subject: [SciPy-dev] mlabwrap scikit [Was: Re: scikits project]
References: <460BC5FD.4050602@gmail.com> <460C7070.2000904@gmail.com>
Message-ID:

Robert Kern writes:

[on the question whether the svn-layout {branch,tags,trunk}/subproject
(Robert) or subproject/{branch,tags,trunk} (me) is preferable for
scikits]

> By and large, it simply doesn't matter to "get everything related to your
> project". Believe me, you don't want all of branches/ and tags/ in a
> checkout.

I have done so in order to get some overview over a project that I was
unfamiliar with, but I agree that that's hardly an important use case.
Migration convenience seems more relevant, but unless rearranging dir
structure in svn repositories is really hard, that's not a big factor
either.

> On the other hand, "getting all of the packages in scikits" does matter.

Sure, but not that often either. Why would many people want to
frequently update the svn versions of all of several unrelated and
fairly special-purpose scientific packages? I would assume the typical
case is that someone hacks e.g. mlabwrap and updates that from time to
time rather than a dozen other projects.

> IMO, the inconvenience of prefixing your branches and tags is secondary to
> the performance problems of svn:external, which slows down all checkouts and
> updates for everyone

Only dirs with svn:externals would be affected, right? Since the only
such dir would be /scikits, which most people won't check out, I don't
think there would be much of an impact (see above).

> (although I'll have to double-check that claim for svn:external links within
> the same repository).

I think you're right: I added an external 'ext' on mlabwrap to
mlabwrap/test, and it appears to slow down checkout; I've got a very
old svn client, but you can try yourself:

    svn co http://scipy.org/svn/scikits/trunk/mlabwrap/

> Also, it appears that Trac doesn't like svn:external to other repositories;
> I'm not sure if that extends to svn:external within the same repository.
>
> http://trac.edgewall.org/wiki/TracFaq#DoesTracsupportsvn:externalsubversionrepositories

For this purpose it seems to work fine: it just displays the
svn:externals property, which is all you want here, I think.

Anyway, the choice between either layout (i.e. subproject
internal/external {tags,branches,trunk}) seems not terribly important.
Each approach has advantages and disadvantages, and apparently both are
common; the 'official' SVN book authors prefer internal, as do I, but
write about the toplevel {tags,branches,trunk} approach:

    There's nothing particularly incorrect about such a layout, but it
    may or may not seem as intuitive for your users. Especially in
    large, multi-project situations with many users, those users may
    tend to be familiar with only one or two of the projects in the
    repository. But the projects-as-branch-siblings tends to
    de-emphasize project individuality and focus on the entire set of
    projects as a single entity. That's a social issue though. We like
    our originally suggested arrangement for purely practical reasons:
    it's easier to ask about (or modify, or migrate elsewhere) the
    entire history of a single project when there's a single repository
    path that holds the entire history (past, present, tagged, and
    branched) for that project and that project alone.
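For reference, the externals timing experiment described above amounts
to something like the following sketch. The external's name 'ext' and
its target follow the test mentioned in the message; pre-1.5 svn
expects "dir URL" lines in the svn:externals property value:

    svn co http://scipy.org/svn/scikits/trunk/mlabwrap/
    cd mlabwrap
    svn propset svn:externals "ext http://scipy.org/svn/scikits/trunk/mlabwrap/test" .
    svn commit -m "add 'ext' external for the checkout-speed test"
    time svn co http://scipy.org/svn/scikits/trunk/mlabwrap/ mlabwrap-timed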
> > Maybe the structure I proposed would also make importing and exporting of
> > projects slightly easier (because it mirrors the typical layout of an
> > individual svn project). Speaking of which -- is there something I can do with
> > the existing CVS so that it can be easily imported in the scikits svn (in
> > which case we can get rid of what's already checked in), or would importing
> > the CVS involve a hassle in any case, because then I'll just archive it on
> > sourceforge.
>
> I don't know. You'll have to read the cvs2svn documentation.

Sorry, I should have phrased this better. I have already read the
cvs2svn docs and converted the CVS repository (just a single trunk, no
branches or tags). If I put a tar.bz2 of that repository on the web,
can someone with admin rights easily install it in lieu of the
existing repository? Is Berkeley db fine as backend? If it's not easy
and quick to do, I'm also happy to lose the revision history and
abandon conversion attempts.

> > The other thing I've been wondering is if such a setup couldn't also be made
> > to accommodate something like Stefan van der Walt's layout proposal, which as
> > far as I can see would allow for the most convenient way possible to grab all
> > scikits and build them:
> >
> >     setup.py
> >     scikits/
> >         __init__.py
> >         -> mlabwrap/
> >                mlabwrap_setup.py
> >                __init__.py
> >                awmstools.py
> >                ...
> >         -> some_other_scikit/
> >                some_other_scikit_setup.py
>
> Having two ways to install something is just begging for trouble.

I wasn't advocating two ways; you always call scikits/setup.py and it
just installs different amounts of stuff depending on how many subdirs
you've checked out.

> > Couldn't one have a toplevel setup.py that just runs all
> > scikits/DIRNAME/DIRNAME_setup.py's it can find, passing through the command line
> > options (or something along those lines[1])?
>
> That's unworkable. I've tried.

I suspected that it might turn out to be. Pity.

> >> If so it should also use numpy.distutils. Just make sure to import
> >> setuptools first.
> >>
> >>     import setuptools
> >>     from numpy.distutils.core import setup
> >>     ...
> >
> > Is there a recipe/template for this somewhere? Googling "scipy setuptools"
> > comes up with [link scrubbed] as the first hit, which seems to indicate
> > that setuptools is still a bit alpha and the docs can't be trusted if one
> > wants something that actually works.
>
> Fernando's a curmudgeon, and that page is old. Ignore him. :-)

OK, I'm going the ``ez_setup.py`` way.

> Like I said, just import setuptools before you import numpy.distutils. Then use
> numpy.distutils as normal to handle all of the building and stuff. setuptools
> adds some keywords to setup() that you should also provide, namely
>
>     namespace_packages=['scikits'],
>
> That's all that's necessary. There's no particular magic to combining setuptools
> and numpy.distutils.

Well, it's not quite obvious how to fully take advantage of setuptools,
though. One of the main reasons for using it is that it's meant to
download and install dependencies automatically, but that can't work if
my setup.py imports something from its sole dependency (numpy) to start
with. Surely there must be some way to write packages that depend on
numpy but can be installed automatically (and download numpy if
required)? And why do I need to use numpy.distutils in the first place?
I find [link scrubbed] rather unhelpful and I didn't find any other
documentation.

Another thing: should ``scikits/__init__.py`` be really empty?
From [link scrubbed] it looks a bit to me like it should contain:

    __import__('pkg_resources').declare_namespace(__name__)

?

Finally, what is the preferred download url for scikits projects?
Should I continue to host the file-release on SF, or should they go
somewhere else? Same for the webpage.

So I think some kind of template for scikit authors would be useful and
I'd suggest that once I've got setup.py etc. ironed out I put some info
for other prospective scikit authors on a wikipage on scipy.org -- what
would be a good place?

Finally, just to double check, does this directory structure look good
to you:

    mlabwrap/
        setup.py
        README.txt
        tests/          # N.B. renamed from 'test'
            test_mlabwrap.py
            ...
        scikits/
            __init__.py # empty
            mlabwrap/
                __init__.py
                _mlabwrap.py
                mlabraw.py
                -> awmstools.py
                -> awmsmeta.py

?

thanks,

'as

From robert.kern at gmail.com  Fri Apr  6 22:38:42 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 06 Apr 2007 21:38:42 -0500
Subject: [SciPy-dev] mlabwrap scikit [Was: Re: scikits project]
In-Reply-To:
References: <460BC5FD.4050602@gmail.com> <460C7070.2000904@gmail.com>
Message-ID: <46170432.5080205@gmail.com>

Alexander Schmolck wrote:
> Robert Kern writes:

> Anyway, the choice between either layout (i.e. subproject internal/external
> {tags,branches,trunk}) seems not terribly important.

Fine. Please let's drop the issue, then.

>>> Maybe the structure I proposed would also make importing and exporting of
>>> projects slightly easier (because it mirrors the typical layout of an
>>> individual svn project). Speaking of which -- is there something I can do with
>>> the existing CVS so that it can be easily imported in the scikits svn (in
>>> which case we can get rid of what's already checked in), or would importing
>>> the CVS involve a hassle in any case, because then I'll just archive it on
>>> sourceforge.
>> I don't know. You'll have to read the cvs2svn documentation.
>
> Sorry, I should have phrased this better. I have already read the
> cvs2svn docs and converted the CVS repository (just a single trunk, no
> branches or tags). If I put a tar.bz2 of that repository on the web,
> can someone with admin rights easily install it in lieu of the
> existing repository?

Is there no conversion tool that simply puts revisions into an existing
directory of a repository instead of making a new repository?

> Is Berkeley db
> fine as backend?

No. We use the fsfs.

>>>> If so it should also use numpy.distutils. Just make sure to import
>>>> setuptools first.
>>>>
>>>>     import setuptools
>>>>     from numpy.distutils.core import setup
>>>>     ...
>>> Is there a recipe/template for this somewhere? Googling "scipy setuptools"
>>> comes up with [link scrubbed] as the first hit, which seems to indicate
>>> that setuptools is still a bit alpha and the docs can't be trusted if one
>>> wants something that actually works.
>> Fernando's a curmudgeon, and that page is old. Ignore him. :-)
>
> OK, I'm going the ``ez_setup.py`` way.

No, ez_setup.py is deprecated. That part of the setuptools docs is out
of date. When the Cheeseshop comes back up read this page with
up-to-date information about how to get going with setuptools:

http://cheeseshop.python.org/pypi/setuptools

>> Like I said, just import setuptools before you import numpy.distutils. Then use
>> numpy.distutils as normal to handle all of the building and stuff. setuptools
>> adds some keywords to setup() that you should also provide, namely
>>
>>     namespace_packages=['scikits'],
>>
>> That's all that's necessary. There's no particular magic to combining setuptools
>> and numpy.distutils.
>
> Well, it's not quite obvious how to fully take advantage of setuptools,
> though. One of the main reasons for using it is that it's meant to
> download and install dependencies automatically, but that can't work if
> my setup.py imports something from its sole dependency (numpy) to start
> with. Surely there must be some way to write packages that depend on
> numpy but can be installed automatically (and download numpy if
> required)?

Not really. The dependencies that you can specify are requirements for
using the package after it is installed, not requirements for building
the package. The structure of distutils setup.py files pretty much
enforces this. setuptools doesn't really get to do anything until
setup() is called, i.e. after you've used your dependencies.

You might be able to hack something together with
pkg_resources.WorkingSet.resolve(), though.

http://peak.telecommunity.com/DevCenter/PkgResources#workingset-methods-and-attributes

> And why do I need to use
> numpy.distutils in the first place?

You don't strictly have to. numpy.get_include() is probably sufficient
for you, but it has the same problem.

> Another thing: should ``scikits/__init__.py`` be really empty? From
> [link scrubbed] it looks a bit to me like it should contain:
>
>     __import__('pkg_resources').declare_namespace(__name__)
>
> ?

Yes, apologies.

> Finally, what is the preferred download url for scikits projects?
> Should I continue to host the file-release on SF, or should they go
> somewhere else?

I recommend putting them on the Cheeseshop.

> Same for the webpage.

scipy.org wiki if you can.

> So I think some kind of template for scikit authors would be useful and
> I'd suggest that once I've got setup.py etc. ironed out I put some info
> for other prospective scikit authors on a wikipage on scipy.org -- what
> would be a good place?

http://projects.scipy.org/scipy/scikits

> Finally, just to double check, does this directory structure look good
> to you:
>
>     mlabwrap/
>         setup.py
>         README.txt
>         tests/          # N.B. renamed from 'test'
>             test_mlabwrap.py
>             ...
>         scikits/
>             __init__.py # empty
>             mlabwrap/
>                 __init__.py
>                 _mlabwrap.py
>                 mlabraw.py
>                 -> awmstools.py
>                 -> awmsmeta.py
>
> ?

Yes.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From guyer at nist.gov  Fri Apr  6 23:29:23 2007
From: guyer at nist.gov (Jonathan Guyer)
Date: Fri, 6 Apr 2007 23:29:23 -0400
Subject: [SciPy-dev] mlabwrap scikit [Was: Re: scikits project]
In-Reply-To: <46170432.5080205@gmail.com>
References: <460BC5FD.4050602@gmail.com> <460C7070.2000904@gmail.com> <46170432.5080205@gmail.com>
Message-ID: <064E82D4-481F-4C72-B2F1-B6ED35AE25EE@nist.gov>

On Apr 6, 2007, at 10:38 PM, Robert Kern wrote:

> Alexander Schmolck wrote:
>> Sorry, I should have phrased this better. I have already read the
>> cvs2svn docs and converted the CVS repository (just a single trunk,
>> no branches or tags). If I put a tar.bz2 of that repository on the
>> web, can someone with admin rights easily install it in lieu of the
>> existing repository?
>
> Is there no conversion tool that simply puts revisions into an
> existing directory of a repository instead of making a new repository?

cvs2svn can do this.

http://cvs2svn.tigris.org/faq.html talks about an "options file
method" that I haven't used, but the older dumpfile mechanism worked
fine for us.
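For the archives, the dumpfile route Jonathan mentions looks roughly
like the following sketch. The paths are placeholders, --trunk-only
matches the trunk-only conversion described earlier in the thread, and
svnadmin load has to be run by someone with filesystem access to the
scikits repository:

    cvs2svn --trunk-only --dumpfile=mlabwrap.dump /path/to/mlabwrap-cvsrepo
    svnadmin load --parent-dir trunk/mlabwrap /path/to/scikits-svnrepo < mlabwrap.dump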
From wnbell at gmail.com  Sun Apr  8 03:35:39 2007
From: wnbell at gmail.com (Nathan Bell)
Date: Sun, 8 Apr 2007 01:35:39 -0600
Subject: [SciPy-dev] [Numpy-discussion] Tuning sparse stuff in NumPy
In-Reply-To: <46095404.4020106@ntc.zcu.cz>
References: <4607E40E.5000300@ntc.zcu.cz> <4607EDCD.5020001@ntc.zcu.cz> <46095404.4020106@ntc.zcu.cz>
Message-ID:

On 3/27/07, Robert Cimrman wrote:
> ok. now which version of scipy (scipy.__version__) do you use (you may
> have posted it, but I missed it)? Not so long ago, there was an effort
> by Nathan Bell and others reimplementing sparsetools + scipy.sparse to
> get better usability and performance. My (almost latest) version is
> 0.5.3.dev2860.

Robert, did David find the source of his performance problems? I
suspect that he was using an older version of sparsetools, but I'd
like to know for sure.

--
Nathan Bell wnbell at gmail.com

From david at ar.media.kyoto-u.ac.jp  Sun Apr  8 22:57:01 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 09 Apr 2007 11:57:01 +0900
Subject: [SciPy-dev] Scipy and LAPACK 3.1.* ?
Message-ID: <4619AB7D.2070104@ar.media.kyoto-u.ac.jp>

Hi there,

    I tried to compile numpy/scipy with recent LAPACK and BLAS versions
(LAPACK 3.1.1, BLAS from the netlib package, not from LAPACK, using
gfortran as a compiler everywhere), and I got several errors when
testing scipy:

======================================================================
FAIL: check_syevr (scipy.lib.tests.test_lapack.test_flapack_float)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/david/local/lib/python2.5/site-packages/scipy/lib/lapack/tests/esv_tests.py",
line 41, in check_syevr
    assert_array_almost_equal(w,exact_w)
  File "/home/david/local/lib/python2.5/site-packages/numpy/testing/utils.py",
line 230, in assert_array_almost_equal
    header='Arrays are not almost equal')
  File "/home/david/local/lib/python2.5/site-packages/numpy/testing/utils.py",
line 215, in assert_array_compare
    assert cond, msg
AssertionError:
Arrays are not almost equal

(mismatch 33.3333333333%)
 x: array([-0.66992444,  0.48769468,  9.18222618], dtype=float32)
 y: array([-0.66992434,  0.48769389,  9.18223045])

======================================================================
FAIL: check_syevr_irange (scipy.lib.tests.test_lapack.test_flapack_float)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/david/local/lib/python2.5/site-packages/scipy/lib/lapack/tests/esv_tests.py",
line 66, in check_syevr_irange
    assert_array_almost_equal(w,exact_w[rslice])
  File "/home/david/local/lib/python2.5/site-packages/numpy/testing/utils.py",
line 230, in assert_array_almost_equal
    header='Arrays are not almost equal')
  File "/home/david/local/lib/python2.5/site-packages/numpy/testing/utils.py",
line 215, in assert_array_compare
    assert cond, msg
AssertionError:
Arrays are not almost equal

(mismatch 33.3333333333%)
 x: array([-0.66992444,  0.48769468,  9.18222618], dtype=float32)
 y: array([-0.66992434,  0.48769389,  9.18223045])

----------------------------------------------------------------------

The different dtype may suggest an error while compiling the
BLAS/LAPACK, but I tested the libraries with the official tester
without any error.

cheers,

David

From matthew.brett at gmail.com  Mon Apr  9 06:24:31 2007
From: matthew.brett at gmail.com (Matthew Brett)
Date: Mon, 9 Apr 2007 11:24:31 +0100
Subject: [SciPy-dev] ctypes requirement?
Message-ID: <1e2af89e0704090324x9b8afedl6b1dd9d048c1095c@mail.gmail.com>

Hi,

I was thinking of using ctypes in some of the scipy matlab read /
write routines. I noticed that a couple of sandbox packages are
already using it. Does the team think the time has come for ctypes to
be a requirement for scipy? It will make some development easier.

Thanks,

Matthew

From a.schmolck at gmx.net  Mon Apr  9 21:41:05 2007
From: a.schmolck at gmx.net (Alexander Schmolck)
Date: 10 Apr 2007 02:41:05 +0100
Subject: [SciPy-dev] mlabwrap scikit [Was: Re: scikits project]
In-Reply-To: <46170432.5080205@gmail.com>
References: <460BC5FD.4050602@gmail.com> <460C7070.2000904@gmail.com> <46170432.5080205@gmail.com>
Message-ID:

Robert Kern writes:

> >>>> If so it should also use numpy.distutils. Just make sure to import
> >>>> setuptools first.
> >>>>
> >>>>     import setuptools
> >>>>     from numpy.distutils.core import setup
> >>>>     ...
> >>> Is there a recipe/template for this somewhere? Googling "scipy setuptools"
> >>> comes up with [link scrubbed] as the first hit, which seems to indicate
> >>> that setuptools is still a bit alpha and the docs can't be trusted if one
> >>> wants something that actually works.
> >> Fernando's a curmudgeon, and that page is old. Ignore him. :-)
> >
> > OK, I'm going the ``ez_setup.py`` way.
>
> No, ez_setup.py is deprecated. That part of the setuptools docs is out
> of date. When the Cheeseshop comes back up read this page with
> up-to-date information about how to get going with setuptools:
>
> http://cheeseshop.python.org/pypi/setuptools

OK.

> > Surely there must be some way to write packages that depend on numpy but
> > can be installed automatically (and download numpy if required)?
>
> Not really. The dependencies that you can specify are requirements for
> using the package after it is installed, not requirements for building
> the package. The structure of distutils setup.py files pretty much
> enforces this. setuptools doesn't really get to do anything until
> setup() is called, i.e. after you've used your dependencies.
>
> You might be able to hack something together with
> pkg_resources.WorkingSet.resolve(), though.
>
> http://peak.telecommunity.com/DevCenter/PkgResources#workingset-methods-and-attributes

Sometimes

> > And why do I need to use
> > numpy.distutils in the first place?
>
> You don't strictly have to. numpy.get_include() is probably sufficient
> for you, but it has the same problem.
>
> > Another thing: should ``scikits/__init__.py`` be really empty? From
> > [link scrubbed] it looks a bit to me like it should contain:
> >
> >     __import__('pkg_resources').declare_namespace(__name__)
> >
> > ?
>
> Yes, apologies.

OK, I think the upshot of all this is that I'll figure out how to do a
robust and user-friendly setuptools-based package another time. I don't
want to delay the release of 1.0 any further so I'll release
mlabwrap-1.0final on SF as a distutils-based install with the old
(scikits-less) package structure. Since post-1.0 mlabwrap will
represent a break anyway (Numeric support will be dropped, newer
versions of Python and Matlab might be required and the interface might
change in some not backwards-compatible ways), the need to change the
import statement is maybe not a bad thing.

> > Finally, what is the preferred download url for scikits projects?
> > Should I continue to host the file-release on SF, or should they go
> > somewhere else?
>
> I recommend putting them on the Cheeseshop.

Thanks, will do.

> > Same for the webpage.
>
> scipy.org wiki if you can.
I could make http://www.scipy.org/MlabWrap the project webpage and move
its current (developer-only contents) into some subdir -- does that
sound reasonable?

> > So I think some kind of template for scikit authors would be useful and
> > I'd suggest that once I've got setup.py etc. ironed out I put some info
> > for other prospective scikit authors on a wikipage on scipy.org -- what
> > would be a good place?
>
> http://projects.scipy.org/scipy/scikits

I tried to create a trac account ('aschmolck') there but it didn't
work; a new account seems to have been created (the name is taken) but
I can't log into it. Do I need some other type of account first in
order to be able to create the trac account? When I click on "create
account" I get an authorization dialog, but the just created login and
password don't work (I verified by going through the same process with
login/passwd test_user).

thanks,

'as

From cimrman3 at ntc.zcu.cz  Tue Apr 10 04:43:55 2007
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Tue, 10 Apr 2007 10:43:55 +0200
Subject: [SciPy-dev] [Numpy-discussion] Tuning sparse stuff in NumPy
In-Reply-To:
References: <4607E40E.5000300@ntc.zcu.cz> <4607EDCD.5020001@ntc.zcu.cz> <46095404.4020106@ntc.zcu.cz>
Message-ID: <461B4E4B.3080608@ntc.zcu.cz>

Nathan Bell wrote:
> On 3/27/07, Robert Cimrman wrote:
>> ok. now which version of scipy (scipy.__version__) do you use (you may
>> have posted it, but I missed it)? Not so long ago, there was an effort
>> by Nathan Bell and others reimplementing sparsetools + scipy.sparse to
>> get better usability and performance. My (almost latest) version is
>> 0.5.3.dev2860.
>
> Robert, did David find the source of his performance problems? I
> suspect that he was using an older version of sparsetools, but I'd
> like to know for sure.

I am not sure either, but I think the slowness he perceived was due to
using the version 0.5.2. David, did you try your benchmarks with the
latest SVN version?

r.

From cimrman3 at ntc.zcu.cz  Tue Apr 10 07:12:50 2007
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Tue, 10 Apr 2007 13:12:50 +0200
Subject: [SciPy-dev] eigh implementation inconsistency
Message-ID: <461B7132.40008@ntc.zcu.cz>

I need to solve a symmetric generalized eigenvalue problem, so I have
looked at the linalg module and found a thing that seems inconsistent
to me.

For general (unsymmetric) problems there is the 'eig' function which
allows for solving both the regular (via *geev) and generalized (via
*ggev) eigenvalue problems.

On the other hand, the function 'eigh' for symmetric (or hermitian)
problems does not allow the generalized problems, even though there are
functions in lapack to do it (dsygv, chegv).

I would modify eigh to accept an optional 'b' argument just like eig
does. What must be done to have dsygv, chegv wrappers generated? They
are not generated now, IMHO.

r.

From nwagner at iam.uni-stuttgart.de  Tue Apr 10 07:16:15 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Tue, 10 Apr 2007 13:16:15 +0200
Subject: [SciPy-dev] eigh implementation inconsistency
In-Reply-To: <461B7132.40008@ntc.zcu.cz>
References: <461B7132.40008@ntc.zcu.cz>
Message-ID: <461B71FF.6040709@iam.uni-stuttgart.de>

Robert Cimrman wrote:
> I need to solve a symmetric generalized eigenvalue problem, so I have
> looked at the linalg module and found a thing that seems inconsistent
> to me.
>
> For general (unsymmetric) problems there is the 'eig' function which
> allows for solving both the regular (via *geev) and generalized (via
> *ggev) eigenvalue problems.
>
> On the other hand, the function 'eigh' for symmetric (or hermitian)
> problems does not allow the generalized problems, even though there are
> functions in lapack to do it (dsygv, chegv).
>
> I would modify eigh to accept an optional 'b' argument just like eig
> does. What must be done to have dsygv, chegv wrappers generated? They
> are not generated now, IMHO.
>
> r.
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>
Sounds good to me! It would be a nice improvement.

Nils

From nwagner at iam.uni-stuttgart.de  Tue Apr 10 07:41:00 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Tue, 10 Apr 2007 13:41:00 +0200
Subject: [SciPy-dev] Scipy and LAPACK 3.1.* ?
In-Reply-To: <4619AB7D.2070104@ar.media.kyoto-u.ac.jp>
References: <4619AB7D.2070104@ar.media.kyoto-u.ac.jp>
Message-ID: <461B77CC.9070009@iam.uni-stuttgart.de>

David Cournapeau wrote:
> Hi there,
>
>     I tried to compile numpy/scipy with recent LAPACK and BLAS versions
> (LAPACK 3.1.1, BLAS from the netlib package, not from LAPACK, using
> gfortran as a compiler everywhere), and I got several errors when
> testing scipy:
>
> ======================================================================
> FAIL: check_syevr (scipy.lib.tests.test_lapack.test_flapack_float)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/david/local/lib/python2.5/site-packages/scipy/lib/lapack/tests/esv_tests.py",
> line 41, in check_syevr
>     assert_array_almost_equal(w,exact_w)
>   File "/home/david/local/lib/python2.5/site-packages/numpy/testing/utils.py",
> line 230, in assert_array_almost_equal
>     header='Arrays are not almost equal')
>   File "/home/david/local/lib/python2.5/site-packages/numpy/testing/utils.py",
> line 215, in assert_array_compare
>     assert cond, msg
> AssertionError:
> Arrays are not almost equal
>
> (mismatch 33.3333333333%)
>  x: array([-0.66992444,  0.48769468,  9.18222618], dtype=float32)
>  y: array([-0.66992434,  0.48769389,  9.18223045])
>
> ======================================================================
> FAIL: check_syevr_irange (scipy.lib.tests.test_lapack.test_flapack_float)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/david/local/lib/python2.5/site-packages/scipy/lib/lapack/tests/esv_tests.py",
> line 66, in check_syevr_irange
>     assert_array_almost_equal(w,exact_w[rslice])
>   File "/home/david/local/lib/python2.5/site-packages/numpy/testing/utils.py",
> line 230, in assert_array_almost_equal
>     header='Arrays are not almost equal')
>   File "/home/david/local/lib/python2.5/site-packages/numpy/testing/utils.py",
> line 215, in assert_array_compare
>     assert cond, msg
> AssertionError:
> Arrays are not almost equal
>
> (mismatch 33.3333333333%)
>  x: array([-0.66992444,  0.48769468,  9.18222618], dtype=float32)
>  y: array([-0.66992434,  0.48769389,  9.18223045])
>
> ----------------------------------------------------------------------
>
> The different dtype may suggest an error while compiling the
> BLAS/LAPACK, but I tested the libraries with the official tester
> without any error.
>
> cheers,
>
> David
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>
Hi David,

I can confirm your findings.
BTW, is there a way to obtain which version of LAPACK is in use via
scipy.show_config()? I mean something like

[('ATLAS_INFO', '"\\"3.7.28\\""')]

Cheers,

Nils

From nwagner at iam.uni-stuttgart.de  Tue Apr 10 07:43:25 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Tue, 10 Apr 2007 13:43:25 +0200
Subject: [SciPy-dev] Deadline for scipy release 0.5.3
Message-ID: <461B785D.8090802@iam.uni-stuttgart.de>

Hi all,

I was wondering if there is a deadline for the next scipy release?

Nils

From opossumnano at gmail.com  Tue Apr 10 08:08:46 2007
From: opossumnano at gmail.com (Tiziano Zito)
Date: Tue, 10 Apr 2007 14:08:46 +0200
Subject: [SciPy-dev] eigh implementation inconsistency
In-Reply-To: <461B7132.40008@ntc.zcu.cz>
References: <461B7132.40008@ntc.zcu.cz>
Message-ID:

you can also have a look at the symeig module:

http://mdp-toolkit.sourceforge.net/symeig.html

it is a wrapper of all the generalized symmetric eigenvalue problem
routines in lapack, i.e. EVR, GV, GVD, GVX, including those for
extracting only a subset of eigenvalues.

cheers,
tiziano

On 4/10/07, Robert Cimrman wrote:
>
> I need to solve a symmetric generalized eigenvalue problem, so I have
> looked at the linalg module and found a thing that seems inconsistent
> to me.
>
> For general (unsymmetric) problems there is the 'eig' function which
> allows for solving both the regular (via *geev) and generalized (via
> *ggev) eigenvalue problems.
>
> On the other hand, the function 'eigh' for symmetric (or hermitian)
> problems does not allow the generalized problems, even though there are
> functions in lapack to do it (dsygv, chegv).
>
> I would modify eigh to accept an optional 'b' argument just like eig
> does. What must be done to have dsygv, chegv wrappers generated? They
> are not generated now, IMHO.
>
> r.
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>

From nwagner at iam.uni-stuttgart.de  Tue Apr 10 08:08:43 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Tue, 10 Apr 2007 14:08:43 +0200
Subject: [SciPy-dev] UMFPACKv4.4 and swig memory leak of type 'void *'
In-Reply-To: <461363D6.3010305@ntc.zcu.cz>
References: <45E54D84.8010103@iam.uni-stuttgart.de> <461363D6.3010305@ntc.zcu.cz>
Message-ID: <461B7E4B.3000302@iam.uni-stuttgart.de>

Robert Cimrman wrote:
> Nils Wagner wrote:
>
>> Hi all,
>>
>> scipy.test(1,10) reports a lot of memory leaks, e.g.
>>
>> Getting factors of complex matrixswig/python detected a memory leak of
>> type 'void *', no destructor found.
>> Getting factors of real matrixswig/python detected a memory leak of type
>> 'void *', no destructor found.
>> Solve with UMFPACK: double precision complexswig/python detected a
>> memory leak of type 'void *', no destructor found.
>> swig/python detected a memory leak of type 'void *', no destructor found.
>> ... ok
>> Solve: single precision complexUse minimum degree ordering on A'+A.
>> ... ok
>> Solve with UMFPACK: double precisionswig/python detected a memory leak
>> of type 'void *', no destructor found.
>> swig/python detected a memory leak of type 'void *', no destructor found.
>> ... ok
>>
>> Is this a swig problem?
>>
>> I am using
>>
>> Numpy version 1.0.2.dev3562
>> Scipy version 0.5.3.dev2774
>
> Hi Nils,
>
> it was not a swig problem but mine (passing a bad 'own' flag in a
> typemap). One can be really blind...
>
> Now it seems fixed. (rev. 2896)
>
> r.
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>
Hi Robert,

Thanks a lot!

Nils

From cimrman3 at ntc.zcu.cz  Tue Apr 10 08:22:00 2007
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Tue, 10 Apr 2007 14:22:00 +0200
Subject: [SciPy-dev] eigh implementation inconsistency
In-Reply-To:
References: <461B7132.40008@ntc.zcu.cz>
Message-ID: <461B8168.7030807@ntc.zcu.cz>

Tiziano Zito wrote:
> you can also have a look at the symeig module:
> http://mdp-toolkit.sourceforge.net/symeig.html
> it is a wrapper of all the generalized symmetric eigenvalue problem
> routines in lapack, i.e. EVR, GV, GVD, GVX, including those for
> extracting only a subset of eigenvalues.
>
> cheers,
> tiziano

This is great! Exactly what I am looking for... I assume you would not
consider changing its license from LGPL to BSD so that it could be
included in SciPy? :-)

r.

From opossumnano at gmail.com  Tue Apr 10 08:42:58 2007
From: opossumnano at gmail.com (Tiziano Zito)
Date: Tue, 10 Apr 2007 14:42:58 +0200
Subject: [SciPy-dev] eigh implementation inconsistency
In-Reply-To: <461B8168.7030807@ntc.zcu.cz>
References: <461B7132.40008@ntc.zcu.cz> <461B8168.7030807@ntc.zcu.cz>
Message-ID:

We would change the license to whatever is needed to include it in
SciPy, but I think the problem is that our pyf files (those needed by
f2py to generate the C extension module) have been heavily (and
manually :-)) tuned and do not resemble the pyf files used by scipy
to generate other lapack wrappers. An inclusion "as is" has been
excluded for this reason by Pearu a couple of years ago. Rewriting the
pyf to match those of scipy is tedious work that we are not going to
do in the near future. Unless Pearu changed his mind or someone
volunteers to do the hard work, in which case I would help as far as I
can, it is not going to happen soon.

bye,
tiziano

On 4/10/07, Robert Cimrman wrote:
> Tiziano Zito wrote:
> > you can also have a look at the symeig module:
> > http://mdp-toolkit.sourceforge.net/symeig.html
> > it is a wrapper of all the generalized symmetric eigenvalue problem
> > routines in lapack, i.e. EVR, GV, GVD, GVX, including those for
> > extracting only a subset of eigenvalues.
> >
> > cheers,
> > tiziano
>
> This is great! Exactly what I am looking for... I assume you would not
> consider changing its license from LGPL to BSD so that it could be
> included in SciPy? :-)
>
> r.
>
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>

From aisaac at american.edu  Tue Apr 10 09:07:07 2007
From: aisaac at american.edu (Alan G Isaac)
Date: Tue, 10 Apr 2007 09:07:07 -0400
Subject: [SciPy-dev] eigh implementation inconsistency
In-Reply-To:
References: <461B7132.40008@ntc.zcu.cz> <461B8168.7030807@ntc.zcu.cz>
Message-ID:

On Tue, 10 Apr 2007, Tiziano Zito apparently wrote:
> We would change the license to whatever is needed to include it in
> SciPy, but I think the problem is that our pyf files (those needed by
> f2py to generate the C extension module) have been heavily (and
> manually :-)) tuned and do not resemble the pyf files used by scipy
> to generate other lapack wrappers. An inclusion "as is"
> has been excluded for this reason by Pearu a couple of
> years ago.
Even if a volunteer to rewrite the pyf files does not emerge
immediately, changing the license *now* will ensure that some future
interested party does not pass by the opportunity out of licensing
concerns.

fwiw,
Alan Isaac

From cimrman3 at ntc.zcu.cz  Tue Apr 10 10:54:06 2007
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Tue, 10 Apr 2007 16:54:06 +0200
Subject: [SciPy-dev] eigh implementation inconsistency
In-Reply-To:
References: <461B7132.40008@ntc.zcu.cz> <461B8168.7030807@ntc.zcu.cz>
Message-ID: <461BA50E.2090509@ntc.zcu.cz>

Alan G Isaac wrote:
> On Tue, 10 Apr 2007, Tiziano Zito apparently wrote:
>> We would change the license to whatever is needed to include it in
>> SciPy, but I think the problem is that our pyf files (those needed by
>> f2py to generate the C extension module) have been heavily (and
>> manually :-)) tuned and do not resemble the pyf files used by scipy
>> to generate other lapack wrappers. An inclusion "as is"
>> has been excluded for this reason by Pearu a couple of
>> years ago.
>
> Even if a volunteer to rewrite the pyf files does not emerge
> immediately, changing the license *now* will ensure that
> some future interested party does not pass by the
> opportunity out of licensing concerns.

What is more, once and if it is BSD-licensed, it could get included and
_used_ from the scipy sandbox just as it is right now. Later it could
get merged into the main tree, if there is interest and will... Or a
new scikit?

regards,
r.

From steve at shrogers.com  Wed Apr 11 08:38:59 2007
From: steve at shrogers.com (Steven H. Rogers)
Date: Wed, 11 Apr 2007 06:38:59 -0600
Subject: [SciPy-dev] ctypes requirement?
In-Reply-To: <1e2af89e0704090324x9b8afedl6b1dd9d048c1095c@mail.gmail.com>
References: <1e2af89e0704090324x9b8afedl6b1dd9d048c1095c@mail.gmail.com>
Message-ID: <461CD6E3.3020807@shrogers.com>

Matthew Brett wrote:
>
> ... Does the team think the time has come for ctypes to
> be a requirement for scipy? It will make some development easier.
>

I'm just a lurker here, but FWIW I think ctypes is a reasonable
requirement. We're still using Python 2.4.3, but include ctypes with
our distribution.

# Steve

From ondrej at certik.cz  Wed Apr 11 13:02:42 2007
From: ondrej at certik.cz (Ondrej Certik)
Date: Wed, 11 Apr 2007 19:02:42 +0200
Subject: [SciPy-dev] SciPy improvements
Message-ID: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com>

Hi,

I am studying theoretical physics and I have collected a lot of useful
python code that I believe could go to SciPy. So first I'll review
what I have, and if you find it interesting, I would like to discuss
how I could implement it in SciPy.

1) Optimization

http://chemev.googlecode.com/svn/trunk/chemev/optimization/

I have a Differential Evolution optimizer, a Simplex optimizer, and an
mcmc optimizer (not well tested yet). I took code from someone else but
adapted the interface to SciPy's one:

def fmin_de(f,x0,callback=None,iter=None):

Those are unconstrained optimizers. Then I have constraints code that
applies a logistic function to the fitting variable and allows me to
do constrained optimization. For example the L-BFGS with my constraints
converges 7x faster than the original L-BFGS-B on my problem.
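(Not the chemev code itself; just a sketch of the logistic-transform
trick described above, with made-up helper names. Any unconstrained
scipy optimizer can stand in for fmin:)

    import numpy
    from scipy.optimize import fmin

    def to_bounded(t, lo, hi):
        # logistic map from the unconstrained variable t into (lo, hi)
        return lo + (hi - lo) / (1.0 + numpy.exp(-t))

    def from_bounded(x, lo, hi):
        # inverse map; turns a feasible initial guess into t-space
        return -numpy.log((hi - lo) / (x - lo) - 1.0)

    def fmin_box(f, x0, lo, hi):
        # run an unconstrained simplex search in t-space; f only ever
        # sees points strictly inside the box (lo, hi)
        g = lambda t: f(to_bounded(t, lo, hi))
        t = fmin(g, from_bounded(numpy.asarray(x0, float), lo, hi))
        return to_bounded(t, lo, hi)

The same wrapping works around fmin_bfgs or any other unconstrained
minimizer, which is the kind of combination the 7x comparison above
refers to.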
2) Nonlinear solvers

I have written these nonlinear solvers for the problem R(x) = 0, where
x and R have dimension "n":

  broyden1 - Broyden's first method - is a quasi-Newton-Raphson method for
      updating an approximate Jacobian and then inverting it
  broyden2 - Broyden's second method - the same as broyden1, but updates the
      inverse Jacobian directly
  broyden3 - Broyden's second method - the same as broyden2, but instead of
      directly computing the inverse Jacobian, it remembers how to construct
      it using vectors, and when computing inv(J)*F, it uses those vectors to
      compute this product, thus avoiding the expensive NxN matrix
      multiplication.
  broyden_generalized - Generalized Broyden's method, the same as broyden2,
      but instead of approximating the full NxN Jacobian, it constructs it at
      every iteration in a way that avoids the NxN matrix multiplication.
      This is not as precise as broyden3.
  anderson - extended Anderson method, the same as broyden_generalized,
      but adds w_0^2*I before taking the inversion to improve the stability
  anderson2 - the Anderson method, the same as anderson, but formulated
      differently
  linear_mixing
  exciting_mixing

I use them in the self-consistent cycle of Density Functional Theory
(so I use the terminology of the DFT literature in the names of the
methods).

Also I am writing a BFGS solver with linesearch that should behave even
better than the Broyden scheme.

Of course I am trying to use SciPy's code (like linesearch) wherever
possible.

3) PETSC bindings

I found these nice petsc bindings:

http://cheeseshop.python.org/pypi/petsc4py/0.7.2

I believe this could also be an optional package in SciPy. Because if
SciPy has some sparse matrix code, then it should definitely also have
this.

4) Finite element code?

I have my own code that uses libmesh:

http://libmesh.sourceforge.net/

and calls tetgen and parses input from gmsh etc. It can convert the
mesh, refine it, solve it, etc. Webpages are here:

http://code.google.com/p/femgeom/
http://code.google.com/p/libmeshpetscpackage/
http://code.google.com/p/grainmodel/

I am not sure here if it should belong to SciPy. Probably not.

5) Symbolic manipulation in Python

http://code.google.com/p/sympy/

We'll have some Google Summer of Code students working on this, and
also I am not sure if it belongs to SciPy. However, this project looks
very promising.

-----------------
So that's it.

I have some comments on SciPy:

1) Documentation

Virtually none; I just use the source code to understand what SciPy
can do and how. But the docstrings are good though. I would suggest
updating

http://www.scipy.org/doc/api_docs/

more often (for example I didn't find there the new l-bfgs code).

2) What is the official NumPy page? I believe it should be

http://numpy.org/

however, it points to a sourceforge page. I believe this homepage
should contain all the relevant information about numpy and contain
links to the fee-based documentation and possibly some tutorials.
The documentation of NumPy is scattered across many pages and I find
it confusing.

I know you have some list here:

http://www.scipy.org/MigratingFromPlone

But I am quite confused by the whole SciPy page. I think less but
unconfusing information is better, but that's just my opinion. I think
the front page of both SciPy and NumPy should be clean and simple with
a clear link to the documentation.
Like

http://www.pytables.org/moin/PyTables
http://matplotlib.sourceforge.net/

However, the documentation:

http://www.scipy.org/Documentation

is confusing, because, except for the fee-based Guide to NumPy, it's
not clear what the official SciPy doc is or what the best way of
learning SciPy is.

So I am interested in your opinions and then I'll just integrate my
code into scipy.optimize and send you a patch to this list?

Ondrej

From scipy2mdjhs78c at jenningsstory.com  Wed Apr 11 13:19:54 2007
From: scipy2mdjhs78c at jenningsstory.com (Andy Jennings)
Date: Wed, 11 Apr 2007 10:19:54 -0700
Subject: [SciPy-dev] lmder patch
Message-ID: <149ddc5e0704111019q462c02a5s11286b46b57bb0ad@mail.gmail.com>

optimize.leastsq is not converging when I use a jacobian function.
MATRIXC2F is transposing the wrong way. Below is a patch that fixes it
in jac_multipack_lm_function, but I'm not sure this is the right fix.
It looks like jac_multipack_calling_function might have the same
problem, in which case maybe it's better to fix the definition of
MATRIXC2F in Lib/optimize/minpack.h. I guess I would want a test case
for hybrj to be sure.

Lib/integrate/multipack.h and Lib/interpolate/multipack.h also define
a MATRIXC2F macro. I have no idea if they need to be looked at as
well.

AJennings

P.S. I just signed up for a trac account a few days ago. I have to
submit patches to someone with the rights to commit, right? Is this
scipy-dev list the best place to do it?

Index: Lib/optimize/__minpack.h
===================================================================
--- Lib/optimize/__minpack.h    (revision 2901)
+++ Lib/optimize/__minpack.h    (working copy)
@@ -149,7 +149,7 @@
     return -1;
   }
   if (multipack_jac_transpose == 1)
-    MATRIXC2F(fjac, result_array->data, *n, *ldfjac)
+    MATRIXC2F(fjac, result_array->data, *ldfjac, *n)
   else
     memcpy(fjac, result_array->data, (*n)*(*ldfjac)*sizeof(double));
 }

From robert.kern at gmail.com  Wed Apr 11 15:25:15 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 11 Apr 2007 14:25:15 -0500
Subject: [SciPy-dev] lmder patch
In-Reply-To: <149ddc5e0704111019q462c02a5s11286b46b57bb0ad@mail.gmail.com>
References: <149ddc5e0704111019q462c02a5s11286b46b57bb0ad@mail.gmail.com>
Message-ID: <461D361B.1080107@gmail.com>

Andy Jennings wrote:

> P.S. I just signed up for a trac account a few days ago. I have to
> submit patches to someone with the rights to commit, right? Is this
> scipy-dev list the best place to do it?

No, please open a ticket so it doesn't get lost. At the bottom of the
page where you enter the ticket's information, there will be a checkbox
labeled something like "I have files to attach to this ticket." Check
that box, submit the ticket, then you will be presented a page where
you can upload the patch.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From robert.kern at gmail.com  Wed Apr 11 16:09:13 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 11 Apr 2007 15:09:13 -0500
Subject: [SciPy-dev] SciPy improvements
In-Reply-To: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com>
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com>
Message-ID: <461D4069.8070101@gmail.com>

Ondrej Certik wrote:
> Hi,
>
> I am studying theoretical physics and I have collected a lot of useful
> python code that I believe could go to SciPy. So first I'll review
> what I have, and if you find it interesting, I would like to discuss
> how I could implement it in SciPy.

Excellent! Thank you! One thing to be aware of is that scipy uses the
BSD license, so you would need to relicense your code under the BSD
license and get permission from the others who have contributed to the
code you are submitting.

> 1) Optimization
>
> http://chemev.googlecode.com/svn/trunk/chemev/optimization/
>
> I have a Differential Evolution optimizer, a Simplex optimizer, and an
> mcmc optimizer (not well tested yet). I took code from someone else but
> adapted the interface to SciPy's one:
>
> def fmin_de(f,x0,callback=None,iter=None):

Well, fmin() is already an implementation of the simplex algorithm. How
does yours compare?

We can't include the MCMC optimizer until we have an implementation of
Metropolis-Hastings in scipy itself; we're not going to depend on an
external PyMC.

As for the differential evolution code, with all respect to you and
James Phillips, it's not the best way to implement that algorithm in
Python. It's a straight translation of the C++ code so it doesn't make
use of numpy at all. I have an implementation that does:

http://svn.scipy.org/svn/scipy/trunk/Lib/sandbox/rkern/diffev.py

It was written for pre-numpy scipy, so it may need some sprucing-up
before it works.

> Those are unconstrained optimizers. Then I have constraints code that
> applies a logistic function to the fitting variable and allows me to
> do constrained optimization. For example the L-BFGS with my constraints
> converges 7x faster than the original L-BFGS-B on my problem.

Interesting. Let's toss it up on the Cookbook first and pound on it a
bit. I have qualms about applying such transformations to the domains
of target functions and then using derivative-based optimizers on them,
but those qualms might be baseless. Still, I'd rather experiment first
before putting them into scipy.

> 2) Nonlinear solvers
>
> I have written these nonlinear solvers for the problem R(x) = 0, where
> x and R have dimension "n":
>
>   broyden1 - Broyden's first method - is a quasi-Newton-Raphson method for
>       updating an approximate Jacobian and then inverting it
>   broyden2 - Broyden's second method - the same as broyden1, but updates the
>       inverse Jacobian directly
>   broyden3 - Broyden's second method - the same as broyden2, but instead of
>       directly computing the inverse Jacobian, it remembers how to construct
>       it using vectors, and when computing inv(J)*F, it uses those vectors to
>       compute this product, thus avoiding the expensive NxN matrix
>       multiplication.
>   broyden_generalized - Generalized Broyden's method, the same as broyden2,
>       but instead of approximating the full NxN Jacobian, it constructs it at
>       every iteration in a way that avoids the NxN matrix multiplication.
>       This is not as precise as broyden3.
>   anderson - extended Anderson method, the same as broyden_generalized,
>       but adds w_0^2*I before taking the inversion to improve the stability
>   anderson2 - the Anderson method, the same as anderson, but formulated
>       differently
>   linear_mixing
>   exciting_mixing
>
> I use them in the self-consistent cycle of Density Functional Theory
> (so I use the terminology of the DFT literature in the names of the
> methods).
>
> Also I am writing a BFGS solver with linesearch that should behave even
> better than the Broyden scheme.
>
> Of course I am trying to use SciPy's code (like linesearch) wherever
> possible.

That's fantastic. I'd love to see them. Are they in chemev? I don't see
them.

> 3) PETSC bindings
>
> I found these nice petsc bindings:
>
> http://cheeseshop.python.org/pypi/petsc4py/0.7.2
>
> I believe this could also be an optional package in SciPy. Because if
> SciPy has some sparse matrix code, then it should definitely also have
> this.

I don't know. It has a fine (and probably better) existence separate
from scipy.

> 4) Finite element code?
>
> I have my own code that uses libmesh:
>
> http://libmesh.sourceforge.net/
>
> and calls tetgen and parses input from gmsh etc. It can convert the
> mesh, refine it, solve it, etc. Webpages are here:
>
> http://code.google.com/p/femgeom/
> http://code.google.com/p/libmeshpetscpackage/
> http://code.google.com/p/grainmodel/
>
> I am not sure here if it should belong to SciPy. Probably not.

I think you are right. You can't really get around the licenses of your
dependencies here.

> 5) Symbolic manipulation in Python
>
> http://code.google.com/p/sympy/
>
> We'll have some Google Summer of Code students working on this, and
> also I am not sure if it belongs to SciPy. However, this project looks
> very promising.

Again, I think it has a fine existence separate from scipy. A reason to
bring it into scipy would be such that other parts of scipy would use
it to implement their own stuff. Otherwise, I don't think there is much
point.

> -----------------
> So that's it.
>
> I have some comments on SciPy:
>
> 1) Documentation
>
> Virtually none; I just use the source code to understand what SciPy
> can do and how. But the docstrings are good though. I would suggest
> updating
>
> http://www.scipy.org/doc/api_docs/
>
> more often (for example I didn't find there the new l-bfgs code).
>
> 2) What is the official NumPy page?

http://numpy.scipy.org

> I believe it should be
>
> http://numpy.org/
>
> however, it points to a sourceforge page.

Correct. I don't know who still owns that domain.

> I believe this homepage
> should contain all the relevant information about numpy and contain
> links to the fee-based documentation and possibly some tutorials.
> The documentation of NumPy is scattered across many pages and I find
> it confusing.
>
> I know you have some list here:
>
> http://www.scipy.org/MigratingFromPlone
>
> But I am quite confused by the whole SciPy page. I think less but
> unconfusing information is better, but that's just my opinion. I think
> the front page of both SciPy and NumPy should be clean and simple with
> a clear link to the documentation. Like
>
> http://www.pytables.org/moin/PyTables
> http://matplotlib.sourceforge.net/

Please, by all means submit your recommendations for reorganization of
that page. Because the front page is special, I'd recommend submitting
your modifications as a ticket to our Trac (see below) instead of just
editing it. The other pages on the wiki, please modify them as you see
fit.
> However, the documentation:
>
> http://www.scipy.org/Documentation
>
> is confusing, because except for the fee-based guide to NumPy, it's not
> clear what the official SciPy doc is and what the best way of
> learning SciPy is.

There really is no official scipy doc at this time. That's part of the problem.

> So I am interested in your opinions and then I'll just integrate my
> code into scipy.optimize and send you a patch to this list?

Register an account with the scipy Trac (click "Register" in the upper-right corner):

http://projects.scipy.org/scipy/scipy

Then make a new ticket and attach your patch to that. Submit enough patches, and we'll just give you SVN access.

-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From matthieu.brucher at gmail.com Wed Apr 11 16:56:45 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Wed, 11 Apr 2007 22:56:45 +0200
Subject: [SciPy-dev] SciPy improvements
In-Reply-To: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com>
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com>
Message-ID:

> 2) Nonlinear solvers
>
> I have written these nonlinear solvers for the problem R(x) = 0, where
> x and R have dimension "n":
>
>   broyden1 - Broyden's first method - is a quasi-Newton-Raphson method for
>     updating an approximate Jacobian and then inverting it
>   broyden2 - Broyden's second method - the same as broyden1, but updates the
>     inverse Jacobian directly
>   broyden3 - Broyden's second method - the same as broyden2, but instead of
>     directly computing the inverse Jacobian, it remembers how to construct
>     it using vectors, and when computing inv(J)*F, it uses those vectors to
>     compute this product, thus avoiding the expensive NxN matrix
>     multiplication.
>   broyden_generalized - Generalized Broyden's method, the same as broyden2,
>     but instead of approximating the full NxN Jacobian, it constructs it at
>     every iteration in a way that avoids the NxN matrix multiplication.
>     This is not as precise as broyden3.
>   anderson - extended Anderson method, the same as broyden_generalized,
>     but adds w_0^2*I before taking the inversion to improve the stability
>   anderson2 - the Anderson method, the same as anderson, but formulated
>     differently
>   linear_mixing
>   exciting_mixing
>
> I use them in the self-consistent cycle of Density Functional
> Theory (so I use the terminology of the DFT literature in the names of the
> methods).

Could the part that computes the step be separated from the function itself and the optimizer? I'm trying to "modularize" nonlinear solvers so as to select more efficiently what is needed - kind of optimizer, kind of step, kind of stopping criterion, ...

Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jh at physics.ucf.edu Thu Apr 12 14:22:18 2007
From: jh at physics.ucf.edu (Joe Harrington)
Date: Thu, 12 Apr 2007 14:22:18 -0400
Subject: [SciPy-dev] SciPy improvements
In-Reply-To: (scipy-dev-request@scipy.org)
References:
Message-ID: <200704121822.l3CIMI09003701@glup.physics.ucf.edu>

Just a comment on Robert's otherwise-excellent reply: we agreed some time ago that the forum for discussing changes to the site is this list, not trac. This is because many participants such as myself are not involved in code development and do not have (or need) trac accounts.
We should *not* encourage people simply to romp in the pages and restructure as they please, as the web site is in constant public view by more than just developers and should therefore not be a playground for testing ideas (except for DevZone, which is specifically for that purpose). Shortly after the switch to the Moin site, someone went in and rewrote a bunch of the pages to follow their own style, and it made us realize that an open invitation to edit was not the best idea. Small changes like adding a link or an entry in a list are of course fine to make. Changing a page's overall structure should at least get a brief review by the list. The page layouts are simple enough that improvements can either be discussed based on a posted description, or actually made by example. For the latter, copy the page onto a page hanging off of DevZone and post an email pointing to it and asking for comment. For obvious reasons, the front page can only be modified by a few people, not just anyone with an account. It would be best if people making regular changes identified themselves in DevZone as site maintainers so that others can find them. That said, I agree restructuring is called for in some cases, and as Robert pointed out, in the doc area what's really needed is a doc. I think we'll be quick to cheer on anything reasonable in either area. --jh-- From robert.kern at gmail.com Thu Apr 12 17:06:55 2007 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 12 Apr 2007 16:06:55 -0500 Subject: [SciPy-dev] SciPy improvements In-Reply-To: <200704121822.l3CIMI09003701@glup.physics.ucf.edu> References: <200704121822.l3CIMI09003701@glup.physics.ucf.edu> Message-ID: <461E9F6F.8090004@gmail.com> Joe Harrington wrote: > Just a comment on Robert's otherwise-excellent reply, we agreed some > time ago that the forum for discussing changes to the site is this > list, not trac. This is because many participants such as myself are > not involved in code development and do not have (or need) trac > accounts. I don't actually recall any agreement to that effect. The Trac exists for tracking issues for all parts of the project, not just bugs in the code. If you want to be active in the project even you are not a developer of code, use the Trac. > We should *not* encourage people simply to romp in the > pages and restructure as they please, as the web site is in constant > public view by more than just developers and should therefore not be a > playground for testing ideas (except for DevZone, which is > specifically for that purpose). Shortly after the switch to the Moin > site, someone went in and rewrote a bunch of the pages to follow their > own style, and it made us realize that an open invitation to edit was > not the best idea. Again, I'm pretty sure that there was no such universal realization. I explicitly do encourage people to use the Wiki and change pages as they see fit (except for FrontPage because it's special). If you are concerned about changes, watch the RSS feed and change things back if you disagree with the changes. If there is continued disagreement, then bring it to the list. That's how Wikis are supposed to work. We have no lack of suggestions on this list about how a page ought to look or what it ought to have. What we lack are people actually willing to put the time in to do the edits and provide the content. I encourage the latter; the former needs no such encouragement. 
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From mforbes at physics.ubc.ca Thu Apr 12 17:31:54 2007
From: mforbes at physics.ubc.ca (Michael McNeil Forbes)
Date: Thu, 12 Apr 2007 14:31:54 -0700
Subject: [SciPy-dev] SciPy improvements
In-Reply-To:
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com>
Message-ID:

On 11 Apr 2007, at 1:56 PM, Matthieu Brucher wrote:
> 2) Nonlinear solvers
>
> I have written these nonlinear solvers for the problem R(x) = 0, where
> x and R have dimension "n":
>
>   broyden1 - Broyden's first method - is a quasi-Newton-Raphson method for
>     updating an approximate Jacobian and then inverting it
>   broyden2 - Broyden's second method - the same as broyden1, but updates the
>     inverse Jacobian directly
>   broyden3 - Broyden's second method - the same as broyden2, but instead of
>     directly computing the inverse Jacobian, it remembers how to construct
>     it using vectors, and when computing inv(J)*F, it uses those vectors to
>     compute this product, thus avoiding the expensive NxN matrix
>     multiplication.
>   broyden_generalized - Generalized Broyden's method, the same as broyden2,
>     but instead of approximating the full NxN Jacobian, it constructs it at
>     every iteration in a way that avoids the NxN matrix multiplication.
>     This is not as precise as broyden3.
>   anderson - extended Anderson method, the same as broyden_generalized,
>     but adds w_0^2*I before taking the inversion to improve the stability
>   anderson2 - the Anderson method, the same as anderson, but formulated
>     differently
>   linear_mixing
>   exciting_mixing
>
> I use them in the self-consistent cycle of Density Functional
> Theory (so I use the terminology of the DFT literature in the names of the
> methods).
>
> Could the part that computes the step be separated from the
> function itself and the optimizer? I'm trying to "modularize"
> nonlinear solvers so as to select more efficiently what is needed -
> kind of optimizer, kind of step, kind of stopping criterion, ... -
> Matthieu

It should be possible to modularize these with a step class that maintains state (the Jacobian, or its inverse etc.).

(Where is the latest version of your optimization proposal? I have not had a chance to look at it yet, but have been meaning to and would like to look at the latest version. Perhaps we should make a page on the Wiki to collect suggestions and code samples.)

I have been meaning to get a good Broyden based algorithm coded for python for a while. I have a MATLAB version of a globally convergent Broyden implementation using a linesearch as a base with a couple of unusual features that might be useful (not specific to Broyden based methods). (The code is based on the presentation of the algorithm given in Numerical Recipes with some modifications suggested by Dennis and Schnabel's book and is partially documented using noweb.)

http://www.phys.washington.edu/users/mforbes/projects/broyden/

1) Variable tolerances. The idea is to quickly estimate the starting Jacobian with low tolerance calculations and then improve the tolerances as the code converges to the solution. This is useful if the function R(x) is computed with numerical integration or some similar technique that is quick for low tolerances but expensive for high tolerance functions.
(I have rarely seen this technique in generic optimization algorithms, but found it invaluable for several problems.)

2) Real-time bailout. This allows you to compute for a specified length of time and then return the best solution found within that time frame. Most algorithms simply count function calls.

3) Selective refinement of the Jacobian as the iteration proceeds. This amounts to monitoring the condition number of the Jacobian and recomputing parts of it selectively if it becomes ill-conditioned. (For this I update the QR factorization of J rather than maintaining inv(J)).

These things all slow down the fundamental algorithm, but are very useful when the function R(x) is extremely expensive to evaluate.

Michael.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ondrej at certik.cz Thu Apr 12 19:31:35 2007
From: ondrej at certik.cz (Ondrej Certik)
Date: Fri, 13 Apr 2007 01:31:35 +0200
Subject: [SciPy-dev] SciPy improvements
In-Reply-To:
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com>
Message-ID: <85b5c3130704121631u19c1ef22w48ce1854c0897108@mail.gmail.com>

Hi, thanks to all of you for your thorough response.

1) The DE code from Robert:

http://svn.scipy.org/svn/scipy/trunk/Lib/sandbox/rkern/diffev.py

looks incredibly simple; I'll check on my problem whether it behaves better than my ugly one. But that's the thing - I didn't find any mention of it in the documentation, otherwise I would be using your code, because it's simpler (=better). Also I didn't find it in google. I'll check the simplex method from SciPy - when I was using it, I just needed a simple script, not a whole dependence on SciPy, and it was very difficult to get just the simplex algorithm out of SciPy.

2) I am sending my Broyden update methods in the attachment together with tests (you need py.test to run them). I am using a test from some article, I think from Vanderbilt. However, my code is giving slightly different results. I'll polish it and send it as a patch to SciPy in the trac, as directed, so you can just look at how I do it. But when you do it, you will see there is really nothing to it - it's very trivial. However, my code doesn't use linesearch. Also I am curious how the BFGS method is going to work when I implement it - it's very similar to Broyden, except the update of the Jacobian is a little different.

Could you Michael please also rewrite your code in Python?

http://www.phys.washington.edu/users/mforbes/projects/broyden/

It would be nice to have all of it in SciPy with the same interface. BTW - why are you using matlab at all? To me, the python with numpy is better than anything else I've programmed in, including matlab.

3) about the logistic transformation - I was sceptical too, until I tried it, and it was converging faster by a factor of 7x on my problem (chemev). So for me it's enough justification, but of course I am not saying that it must converge faster for any problem. However, I implemented it as a wrapper function above all the unconstrained algorithms with the SciPy interface, so the user is not forced to use it - he can just try it and see, as I did (a minimal sketch of the idea follows below). I'll post it to the cookbook.

4) About petsc - I know it's another dependency. However, I noticed you are using umfpack in SciPy. So why not petsc? I think it contains many more (sometimes better) solvers (depends on the problem). It seems logical to me to either use nothing, or the best library available, which I believe is petsc.
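[A minimal sketch of the logistic-transform wrapper described in point 3 above, assuming only numpy and scipy.optimize.fmin; the name fmin_constrained and its signature are hypothetical, not the chemev code:

import numpy as np
from scipy.optimize import fmin

def fmin_constrained(f, x0, lo, hi, **kw):
    # Box-constrained minimization via an unconstrained optimizer:
    # optimize over y in R^n, mapped into (lo, hi) by a logistic function.
    lo = np.asarray(lo, float)
    hi = np.asarray(hi, float)
    def to_box(y):
        # componentwise logistic map R -> (lo, hi)
        return lo + (hi - lo) / (1.0 + np.exp(-y))
    def from_box(x):
        # inverse (logit) map; x0 must lie strictly inside the box
        p = (np.asarray(x, float) - lo) / (hi - lo)
        return np.log(p / (1.0 - p))
    y_opt = fmin(lambda y: f(to_box(y)), from_box(x0), **kw)
    return to_box(y_opt)

Robert's caution above still applies: the map flattens the objective near the bounds, so gradients there become tiny and derivative-based optimizers can stall; the 7x speedup reported here is problem-specific.]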
5) Documentation: the front page is quite fine; however, the documentation needs a complete redesign in my opinion. First - I believe numpy should be separated from SciPy and have its own page (numpy.org), but if you think it belongs under the hood of scipy.org, then ok.

So, I'll copy the page:

http://www.scipy.org/Documentation

into some new one, and redesign it as I would like it to be, and then you'll tell me what you think about it. The same with other pages if I get a better idea about them. This way I shouldn't spoil anything in case you wouldn't like it. Because I don't have just a couple of small fixes.

Ondrej

On 4/12/07, Michael McNeil Forbes wrote:
> On 11 Apr 2007, at 1:56 PM, Matthieu Brucher wrote:
> > 2) Nonlinear solvers
> >
> > I have written these nonlinear solvers for the problem R(x) = 0, where
> > x and R have dimension "n":
> >
> >   broyden1 - Broyden's first method - is a quasi-Newton-Raphson method for
> >     updating an approximate Jacobian and then inverting it
> >   broyden2 - Broyden's second method - the same as broyden1, but updates the
> >     inverse Jacobian directly
> >   broyden3 - Broyden's second method - the same as broyden2, but instead of
> >     directly computing the inverse Jacobian, it remembers how to construct
> >     it using vectors, and when computing inv(J)*F, it uses those vectors to
> >     compute this product, thus avoiding the expensive NxN matrix
> >     multiplication.
> >   broyden_generalized - Generalized Broyden's method, the same as broyden2,
> >     but instead of approximating the full NxN Jacobian, it constructs it at
> >     every iteration in a way that avoids the NxN matrix multiplication.
> >     This is not as precise as broyden3.
> >   anderson - extended Anderson method, the same as broyden_generalized,
> >     but adds w_0^2*I before taking the inversion to improve the stability
> >   anderson2 - the Anderson method, the same as anderson, but formulated
> >     differently
> >   linear_mixing
> >   exciting_mixing
> >
> > I use them in the self-consistent cycle of Density Functional
> > Theory (so I use the terminology of the DFT literature in the names of the
> > methods).
>
> Could the part that computes the step be separated from the function itself
> and the optimizer? I'm trying to "modularize" nonlinear solvers so as to
> select more efficiently what is needed - kind of optimizer, kind of step,
> kind of stopping criterion, ... -
>
> Matthieu
>
> It should be possible to modularize these with a step class that maintains
> state (the Jacobian, or its inverse etc.).
>
> (Where is the latest version of your optimization proposal? I have not had
> a chance to look at it yet, but have been meaning to and would like to look
> at the latest version. Perhaps we should make a page on the Wiki to collect
> suggestions and code samples.)
>
> I have been meaning to get a good Broyden based algorithm coded for python
> for a while. I have a MATLAB version of a globally convergent Broyden
> implementation using a linesearch as a base with a couple of unusual
> features that might be useful (not specific to Broyden based methods).
> (The code is based on the presentation of the algorithm given in Numerical
> Recipes with some modifications suggested by Dennis and Schnabel's book and
> is partially documented using noweb.)
>
> http://www.phys.washington.edu/users/mforbes/projects/broyden/
>
> 1) Variable tolerances.
> The idea is to quickly estimate the starting Jacobian with low tolerance
> calculations and then improve the tolerances as the code converges to the
> solution. This is useful if the function R(x) is computed with numerical
> integration or some similar technique that is quick for low tolerances but
> expensive for high tolerance functions. (I have rarely seen this technique
> in generic optimization algorithms, but found it invaluable for several
> problems.)
>
> 2) Real-time bailout. This allows you to compute for a specified length of
> time and then return the best solution found within that time frame. Most
> algorithms simply count function calls.
>
> 3) Selective refinement of the Jacobian as the iteration proceeds. This
> amounts to monitoring the condition number of the Jacobian and recomputing
> parts of it selectively if it becomes ill-conditioned. (For this I update
> the QR factorization of J rather than maintaining inv(J)).
>
> These things all slow down the fundamental algorithm, but are very useful
> when the function R(x) is extremely expensive to evaluate.
>
> Michael.
>
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev

-------------- next part --------------
A non-text attachment was scrubbed...
Name: solvers.py
Type: text/x-python
Size: 12553 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_solvers.py
Type: text/x-python
Size: 1948 bytes
Desc: not available
URL:

From robert.kern at gmail.com Thu Apr 12 19:45:15 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 12 Apr 2007 18:45:15 -0500
Subject: [SciPy-dev] SciPy improvements
In-Reply-To: <85b5c3130704121631u19c1ef22w48ce1854c0897108@mail.gmail.com>
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121631u19c1ef22w48ce1854c0897108@mail.gmail.com>
Message-ID: <461EC48B.50105@gmail.com>

Ondrej Certik wrote:
> 3) about the logistic transformation - I was sceptical too, until I
> tried it, and it was converging faster by a factor of 7x on my
> problem (chemev). So for me it's enough justification, but of course I
> am not saying that it must converge faster for any problem.

I'm sure it works faster; I'd just like to make sure that it always gives the correct answer.

> 4) About petsc - I know it's another dependency. However, I
> noticed you are using umfpack in SciPy. So why not petsc? I think it
> contains many more (sometimes better) solvers (depends on the
> problem). It seems logical to me to either use nothing, or the best
> library available, which I believe is petsc.

Well, I wasn't as much of a dependency/no-optional-features freak when the optional UMFPACK stuff went in. Also, IIRC the wrappers for UMFPACK were written specifically for scipy; they didn't exist as a separate package beforehand. petsc4py already exists. Unless we decide that some other feature of scipy needs it, there is no reason that I can see for bringing it into the scipy package.
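[Ondrej's solvers.py attachment above was scrubbed by the archive, so as a stand-in here is a minimal sketch of the "bad Broyden" (broyden2-style) rank-one update of the inverse Jacobian described in his point 2 — an illustration under assumed sign conventions, not the attached code:

import numpy as np

def broyden2(R, x0, iters=30, alpha=0.1):
    # Solve R(x) = 0, maintaining an approximation H of inv(J) directly.
    x = np.asarray(x0, float)
    H = -alpha * np.eye(x.size)   # initial inv(J): plain mixing (sign convention assumed)
    F = np.asarray(R(x), float)
    for _ in range(iters):
        dx = -np.dot(H, F)        # quasi-Newton step
        x = x + dx
        Fnew = np.asarray(R(x), float)
        dF = Fnew - F
        denom = np.dot(dF, dF)
        if denom == 0.0:          # residual no longer changes: done
            break
        # rank-one update enforcing the secant condition H * dF = dx
        H = H + np.outer(dx - np.dot(H, dF), dF) / denom
        F = Fnew
    return x

Each update costs O(n^2) rather than the O(n^3) of refactoring a full Jacobian; the broyden3 variant in the list earlier goes further and stores only the update vectors, so inv(J)*F can be applied without ever forming the NxN matrix.]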
> 5)documentation: the front page is quite fine, however the > documentation needs complete redesign in my opinion. First - I believe > the numpy should be separated from SciPy and have it's own page > (numpy.org), but if you think it belongs under the hood of scipy.org, > then ok. > > So, I'll copy the page: > > http://www.scipy.org/Documentation > > into some new one, and redesign it as I would like it to be, and then > you'll tell me what you think about it. The same with other pages if > I'll get a better idea about them. This way I shouldn't spoil anything > in case you wouldn't like it. Because I don't have just couple of > small fixes. As you like. Thank you! -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From mforbes at physics.ubc.ca Thu Apr 12 20:55:48 2007 From: mforbes at physics.ubc.ca (Michael McNeil Forbes) Date: Thu, 12 Apr 2007 17:55:48 -0700 Subject: [SciPy-dev] SciPy improvements In-Reply-To: <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> Message-ID: <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> On 12 Apr 2007, at 4:32 PM, Ondrej Certik wrote: > Could you Michael please also rewrite your code to Python? > > http://www.phys.washington.edu/users/mforbes/projects/broyden/ > > It would be nice to have all of it in SciPy with the same interface. > > BTW - why are you using matlab at all? To me, the python with numpy is > better than anything else I've programmed in, including matlab. I wrote this code back when I had easy access to matlab and before I knew python. I simply have not had time to port the broyden code to python yet. Hopefully I will find time soon (it would also be nice to organize the pieces in a modular fashion as Matthieu suggested, but I have simply not had time to look over that yet.) There are still a few things I miss about matlab, especially a good line-by-line profiler, but generally I agree that python+numpy is much better for programming. Michael. From wbaxter at gmail.com Thu Apr 12 22:32:41 2007 From: wbaxter at gmail.com (Bill Baxter) Date: Fri, 13 Apr 2007 11:32:41 +0900 Subject: [SciPy-dev] SciPy improvements In-Reply-To: <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> Message-ID: On 4/13/07, Michael McNeil Forbes wrote: > On 12 Apr 2007, at 4:32 PM, Ondrej Certik wrote: > > > Could you Michael please also rewrite your code to Python? > > > > http://www.phys.washington.edu/users/mforbes/projects/broyden/ > > > > It would be nice to have all of it in SciPy with the same interface. > > > > BTW - why are you using matlab at all? To me, the python with numpy is > > better than anything else I've programmed in, including matlab. I don't know if you were serious or not, but Matlab still has a huge amount of inertia and a very good network effect going for it. Plus many more years of development behind it than numpy. You're much more likely to be able to find random algorithms on the net implemented in Matlab than in Python/Numpy, that is if you even need to go looking, because a lot of stuff is already in Matlab to begin with. 
So while I personally agree that Numpy is better than Matlab codewise, a code in the hand beats two in the bush. I'd be very happy to drop Matlab completely if you could just port the NetLab (http://www.ncrg.aston.ac.uk/netlab/) and a few other things for me. :-) --bb

From david at ar.media.kyoto-u.ac.jp Thu Apr 12 22:38:10 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 13 Apr 2007 11:38:10 +0900
Subject: [SciPy-dev] Scipy and LAPACK 3.1.* ?
In-Reply-To: <461B77CC.9070009@iam.uni-stuttgart.de>
References: <4619AB7D.2070104@ar.media.kyoto-u.ac.jp> <461B77CC.9070009@iam.uni-stuttgart.de>
Message-ID: <461EED12.2080909@ar.media.kyoto-u.ac.jp>

Nils Wagner wrote:
> Hi David,
>
> I can confirm your findings.
> BTW, is there a way to obtain the information which version of LAPACK is
> used via scipy.show_config() ?
> I mean something like [('ATLAS_INFO', '"\\"3.7.28\\""')]

I guess not, because the LAPACK sources themselves do not seem to have any API to retrieve the current version. I will try to see where this scipy error is coming from, then,

David

From david at ar.media.kyoto-u.ac.jp Thu Apr 12 22:53:56 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 13 Apr 2007 11:53:56 +0900
Subject: [SciPy-dev] SciPy improvements
In-Reply-To:
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca>
Message-ID: <461EF0C4.6060506@ar.media.kyoto-u.ac.jp>

Bill Baxter wrote:
> I don't know if you were serious or not, but Matlab still has a huge
> amount of inertia and a very good network effect going for it. Plus
> many more years of development behind it than numpy. You're much more
> likely to be able to find random algorithms on the net implemented in
> Matlab than in Python/Numpy, that is if you even need to go looking,
> because a lot of stuff is already in Matlab to begin with.
>
> So while I personally agree that Numpy is better than Matlab codewise,
> a code in the hand beats two in the bush. I'd be very happy to drop
> Matlab completely if you could just port the NetLab
> (http://www.ncrg.aston.ac.uk/netlab/) and a few other things for me.

Well, netlab is a huge package, but google just announced that my SoC project pymachine, a set of toolboxes for machine learning-related algorithms, has been accepted:

http://code.google.com/soc/psf/appinfo.html?csaid=44CD86A83707638A

(the full proposal can be found here: http://www.ar.media.kyoto-u.ac.jp/members/david/fullproposal.html)

So expect some news (and more importantly, some code) on this front in the coming months,

cheers,

David

From wbaxter at gmail.com Thu Apr 12 23:13:34 2007
From: wbaxter at gmail.com (Bill Baxter)
Date: Fri, 13 Apr 2007 12:13:34 +0900
Subject: [SciPy-dev] SciPy improvements
In-Reply-To: <461EF0C4.6060506@ar.media.kyoto-u.ac.jp>
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp>
Message-ID:

On 4/13/07, David Cournapeau wrote:
> Well, netlab is a huge package, but google just announced that my SoC
> project pymachine, a set of toolboxes for machine learning-related
> algorithms, has been accepted:

That's great to hear. I've been keeping an eye on your progress with pyEM and such. Very promising.
Incidentally I work with some speech and graphics guys from ATR, where I see you worked previously. Do you know Yotsukura-san, Kawamoto-san, or Nakamura-san? (I think Nakamura-san may even be the head of ATR now.)

> http://code.google.com/soc/psf/appinfo.html?csaid=44CD86A83707638A (the
> full proposal can be found here:
> http://www.ar.media.kyoto-u.ac.jp/members/david/fullproposal.html)
> So expect some news (and more importantly, some code) on this front in
> the coming months,

I would be interested in joining a dev list on this or something like that (or open dev blog? or wiki?) if you start such a thing. I assume you have to have discussions with your mentor anyway. If possible it'd be nice to peek in on those conversations. --bb

From robert.kern at gmail.com Thu Apr 12 23:20:08 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 12 Apr 2007 22:20:08 -0500
Subject: [SciPy-dev] SciPy improvements
In-Reply-To:
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp>
Message-ID: <461EF6E8.1020108@gmail.com>

Bill Baxter wrote:
> On 4/13/07, David Cournapeau wrote:
>> http://code.google.com/soc/psf/appinfo.html?csaid=44CD86A83707638A (the
>> full proposal can be found here:
>> http://www.ar.media.kyoto-u.ac.jp/members/david/fullproposal.html)
>
>> So expect some news (and more importantly, some code) on this front in
>> the coming months,
>
> I would be interested in joining a dev list on this or something like
> that (or open dev blog? or wiki?) if you start such a thing. I assume
> you have to have discussions with your mentor anyway. If possible
> it'd be nice to peek in on those conversations.

David is welcome to use scipy-dev and scipy.org especially since a good chunk of the project involves scipy packages.

-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From david at ar.media.kyoto-u.ac.jp Fri Apr 13 00:59:49 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 13 Apr 2007 13:59:49 +0900
Subject: [SciPy-dev] SciPy improvements
In-Reply-To:
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp>
Message-ID: <461F0E45.5000009@ar.media.kyoto-u.ac.jp>

Bill Baxter wrote:
> On 4/13/07, David Cournapeau wrote:
>
>> Well, netlab is a huge package, but google just announced that my SoC
>> project pymachine, a set of toolboxes for machine learning-related
>> algorithms, has been accepted:
>
> That's great to hear. I've been keeping an eye on your progress with
> pyEM and such. Very promising.
>
> Incidentally I work with some speech and graphics guys from ATR, where
> I see you worked previously. Do you know Yotsukura-san, Kawamoto-san,
> or Nakamura-san? (I think Nakamura-san may even be the head of ATR now.)

I indeed worked there for some time before starting my PhD program at Kyodai, but not in the speech lab (I worked in the now defunct MIS lab).

> I would be interested in joining a dev list on this or something like
> that (or open dev blog? or wiki?) if you start such a thing. I assume
> you have to have discussions with your mentor anyway.
> If possible it'd be nice to peek in on those conversations.

There is nothing started yet, and some things need to be fixed with my mentor before things get started, but as Robert said, most if not all discussion related to it would happen here and follow the usual scipy process (scipy SVN, Trac, etc...).

David

From wbaxter at gmail.com Fri Apr 13 01:41:10 2007
From: wbaxter at gmail.com (Bill Baxter)
Date: Fri, 13 Apr 2007 14:41:10 +0900
Subject: [SciPy-dev] SciPy improvements
In-Reply-To: <461F0E45.5000009@ar.media.kyoto-u.ac.jp>
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp> <461F0E45.5000009@ar.media.kyoto-u.ac.jp>
Message-ID:

On 4/13/07, David Cournapeau wrote:
> Bill Baxter wrote:
> > I would be interested in joining a dev list on this or something like
> > that (or open dev blog? or wiki?) if you start such a thing. I assume
> > you have to have discussions with your mentor anyway. If possible
> > it'd be nice to peek in on those conversations.
>
> There is nothing started yet, and some things need to be fixed with my
> mentor before things get started, but as Robert said, most if not all
> discussion related to it would happen here and follow the usual scipy
> process (scipy SVN, Trac, etc...).

Great then.

The project page mentions SVM. In addition to SVM I'm interested in things like PPCA, kernel PCA, RBF networks, gaussian processes and GPLVM. Are you going to try to go in the direction of a modular structure with reusable bits for all kernel methods, or is the plan targeted specifically at SVM?

The basic components of this stuff (like RBFs) also make for good scattered data interpolation schemes. I hear questions every so often on the list about good ways to do that, so making the tools for the machine learning toolkit easy to use for people who just want to interpolate data would be nice.

Going in a slightly different direction, meshfree methods for solving partial differential equations also build on tools like RBF and moving least squares interpolation. So for that reason too, it would be nice to have a reusable API layer for those things.

You mention also that you're planning to unify row vec vs. column vec conventions.
Just wanted to put my vote in for row vectors! For a > number of reasons > 1) It seems to be the more common usage in machine learning literature > 2) with Numpy's default C-contiguous data it puts individual vectors > in contiguous memory. > 3) it's easier to print something that's Nx5 than 5xN > 4) "for vec in lotsofvecs:" works with row vectors, but needs a > transpose for column vectors. > 5) accessing a vector becomes just data[i] instead of data[:,i] which > makes it easier to go back and forth between a python list of vectors > and a numpy 2d array of vectors. One more: 6) mat(avec) where avec is 1-D returns a row vector rather than a column vector. From robert.kern at gmail.com Fri Apr 13 01:50:54 2007 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 13 Apr 2007 00:50:54 -0500 Subject: [SciPy-dev] SciPy improvements In-Reply-To: References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp> <461F0E45.5000009@ar.media.kyoto-u.ac.jp> Message-ID: <461F1A3E.1000508@gmail.com> Bill Baxter wrote: > The project page mentions SVM. In addition to SVM I'm interested in > things like PPCA, kernel PCA, RBF networks, gaussian processes and > GPLVM. Are you going to try to go in the direction of a modular > structure with reusable bits for for all kernel methods, or is the > plan to targeted specifically SVM? On that note, I have some Gaussian process code in a Mercurial repository here (click the "manifest" button to browse the source): http://www.enthought.com/~rkern/cgi-bin/hgwebdir.cgi/gp/ It's based on the treatment by Rasmussen and Williams: http://www.gaussianprocess.org/gpml/ The covariance functions I implement there might be useful in other methods, too. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From david at ar.media.kyoto-u.ac.jp Fri Apr 13 01:57:31 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 13 Apr 2007 14:57:31 +0900 Subject: [SciPy-dev] SciPy improvements In-Reply-To: References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp> <461F0E45.5000009@ar.media.kyoto-u.ac.jp> Message-ID: <461F1BCB.9000306@ar.media.kyoto-u.ac.jp> Bill Baxter wrote: > On 4/13/07, David Cournapeau wrote: >> Bill Baxter wrote: >>> On 4/13/07, David Cournapeau wrote: >>> I would be interested in joining a dev list on this or something like >>> that (or open dev blog? or wiki?) if you start such a thing. I assume >>> you have to have discussions with your mentor anyway. If possible >>> it'd be nice to peek in on those conversations. >>> >> There is nothing started yet, and some things need to be fixed with my >> mentor before things get started, but as Robert said, most if not all >> discussion related to it would happen here and follow the usual scipy >> process (scipy SVN, Trac, etc...). > > Great then. > > The project page mentions SVM. In addition to SVM I'm interested in > things like PPCA, kernel PCA, RBF networks, gaussian processes and > GPLVM. 
Are you going to try to go in the direction of a modular > structure with reusable bits for for all kernel methods, or is the > plan to targeted specifically SVM? The plan is really about unifying and improving existing toolboxes, and provides a higher level API (which would end up in scikits for various reasons). Depending on the time left, I will add some algorithms later. Of course, the goal is that other people will also jump in to add new algorithms (I for example will add some recent advances for mixture like ensemble learning, outside of the SoC if necessary). > > The basic components of this stuff (like RBFs) also make for good > scattered data interpolation schemes. I hear questions every so often > on the list about good ways to do that, so making the tools for the > machine learning toolkit easy to use for people who just want to > interpolate data would be nice. > > Going in a slightly different direction, meshfree methods for solving > partial differential equations also build on tools like RBF and moving > least squares interpolation. So for that reason too, it would be nice > to have a reusable api layer for those things. > > You mention also that you're planning to unify row vec vs. column vec > conventions. Just wanted to put my vote in for row vectors! For a > number of reasons > 1) It seems to be the more common usage in machine learning literature > 2) with Numpy's default C-contiguous data it puts individual vectors > in contiguous memory. > 3) it's easier to print something that's Nx5 than 5xN > 4) "for vec in lotsofvecs:" works with row vectors, but needs a > transpose for column vectors. > 5) accessing a vector becomes just data[i] instead of data[:,i] which > makes it easier to go back and forth between a python list of vectors > and a numpy 2d array of vectors. I have not given a lot of thoughts about it yet; what matters the most is that all algo have the same conventions. Nevertheless, my experience so far in numpy is similar to yours with regard to ML algorithms (except point 2: depending on the algo. you need contiguous access along one dimension, and my impression is that in numpy, this matters a lot performance wise, at least much more than in matlab). David From david at ar.media.kyoto-u.ac.jp Fri Apr 13 02:08:14 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 13 Apr 2007 15:08:14 +0900 Subject: [SciPy-dev] SciPy improvements In-Reply-To: <461F1A3E.1000508@gmail.com> References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp> <461F0E45.5000009@ar.media.kyoto-u.ac.jp> <461F1A3E.1000508@gmail.com> Message-ID: <461F1E4E.1040705@ar.media.kyoto-u.ac.jp> Robert Kern wrote: > Bill Baxter wrote: > >> The project page mentions SVM. In addition to SVM I'm interested in >> things like PPCA, kernel PCA, RBF networks, gaussian processes and >> GPLVM. Are you going to try to go in the direction of a modular >> structure with reusable bits for for all kernel methods, or is the >> plan to targeted specifically SVM? 
> > On that note, I have some Gaussian process code in a Mercurial repository
> > here (click the "manifest" button to browse the source):
> >
> > http://www.enthought.com/~rkern/cgi-bin/hgwebdir.cgi/gp/
> >
> > It's based on the treatment by Rasmussen and Williams:
> >
> > http://www.gaussianprocess.org/gpml/
> >
> > The covariance functions I implement there might be useful in other
> > methods, too.

Thanks for those links, I will take a look at the code you wrote.

Since you're here, I have some questions concerning chaco for the visualization part of the project. Basically, I am unsure about whether I should use chaco or matplotlib. I do not know chaco very well yet, but it seems much better API- and performance-wise than matplotlib for interactive visualization. The problem is that it still does not have a lot of visibility in the community compared to matplotlib, and it is still pretty complicated to install. I do not care much about those points myself, but seeing how installation problems are one of the big difficulties for newcomers to numpy/scipy, I am a bit concerned. Is this impression founded, and if it is, is there a chance to see improvements on those fronts in the next few months?

David

From matthieu.brucher at gmail.com Fri Apr 13 02:23:12 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Fri, 13 Apr 2007 08:23:12 +0200
Subject: [SciPy-dev] SciPy improvements
In-Reply-To:
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp> <461F0E45.5000009@ar.media.kyoto-u.ac.jp>
Message-ID:

> The project page mentions SVM. In addition to SVM I'm interested in
> things like PPCA, kernel PCA, RBF networks, gaussian processes and
> GPLVM. Are you going to try to go in the direction of a modular
> structure with reusable bits for all kernel methods, or is the
> plan targeted specifically at SVM?

Doesn't scipy have SVMs already? Perhaps not as modularized as it could be? PPCA is PCA IIRC (Tipping 97, it's part of my PhD thesis), and KPCA is not a big deal if kernels are in a module and have a method taking 2 arguments. BTW, even the svd could directly take a kernel as an argument, the default kernel being the scalar product? I'm in favour of fine-grained modules - like the optimisation module I proposed - and allowing people to choose which kernel they want, even if the kernel was designed for SVM, is a good thing; the "kernel trick" should be almost universal :)

Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From david at ar.media.kyoto-u.ac.jp Fri Apr 13 02:40:18 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 13 Apr 2007 15:40:18 +0900
Subject: [SciPy-dev] SciPy improvements
In-Reply-To:
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp> <461F0E45.5000009@ar.media.kyoto-u.ac.jp>
Message-ID: <461F25D2.6050405@ar.media.kyoto-u.ac.jp>

Matthieu Brucher wrote:
> The project page mentions SVM. In addition to SVM I'm interested in
> things like PPCA, kernel PCA, RBF networks, gaussian processes and
> GPLVM.
Are you going to try to go in the direction of a modular > structure with reusable bits for for all kernel methods, or is the > plan to targeted specifically SVM? > > > Don't scipy have SVMs already ? Perhaps not as modularized at it could > be ? > PPCA is PCA IIRC (Tipping 97, it's part of my Phd thesis), KPCA is not > a big deal if kernels are in a module, and if they have a method > taking 2 arguments. BTW, even the svd could directly take a kernel as > an argument, the default kernel being the scalar product ? > I'm in favour of fine-grained modules - like for the optimisation > module I proposed -, and allowing pepole to choose which kernel they > want, even if the kernel was designed for SVM, is a good thing, the > "kernel trick" should be almost universal :) The project is first about unifying *existing* packages: basically, make them first class citizen doc-wise and api-wise, so that they can be moved out of the sandbox. SVM, EM for mixture fall in this category. cheers, David From wbaxter at gmail.com Fri Apr 13 02:46:27 2007 From: wbaxter at gmail.com (Bill Baxter) Date: Fri, 13 Apr 2007 15:46:27 +0900 Subject: [SciPy-dev] SciPy improvements In-Reply-To: References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp> <461F0E45.5000009@ar.media.kyoto-u.ac.jp> Message-ID: On 4/13/07, Matthieu Brucher wrote: > > > The project page mentions SVM. In addition to SVM I'm interested in > > things like PPCA, kernel PCA, RBF networks, gaussian processes and > > GPLVM. Are you going to try to go in the direction of a modular > > structure with reusable bits for for all kernel methods, or is the > > plan to targeted specifically SVM? > > Don't scipy have SVMs already ? Perhaps not as modularized at it could be ? It might have something in sandbox, but as far as I'm concerned 'in scipy.sandbox' is synonymous with 'not in scipy'. > PPCA is PCA IIRC (Tipping 97, it's part of my Phd thesis), Yes, pretty much so, just with some variances added into the diagonal of the matrix at the right place. > KPCA is not a big > deal if kernels are in a module, and if they have a method taking 2 > arguments. Right. Low level kernel stuff in reusable lib makes a lot of sense. If there were a lib of functions with common kernels and their derivatives and possibly even 2nd derivatives, that would cover a lot of territory. (I could use some 2nd derivatives right now...) --bb From oliphant.travis at ieee.org Fri Apr 13 03:06:41 2007 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri, 13 Apr 2007 01:06:41 -0600 Subject: [SciPy-dev] SciPy improvements In-Reply-To: <461F1BCB.9000306@ar.media.kyoto-u.ac.jp> References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp> <461F0E45.5000009@ar.media.kyoto-u.ac.jp> <461F1BCB.9000306@ar.media.kyoto-u.ac.jp> Message-ID: <461F2C01.2070904@ieee.org> David Cournapeau wrote: > I have not given a lot of thoughts about it yet; what matters the most > is that all algo have the same conventions. Nevertheless, my experience > so far in numpy is similar to yours with regard to ML algorithms (except > point 2: depending on the algo. 
you need contiguous access along one
> dimension, and my impression is that in numpy, this matters a lot
> performance-wise, at least much more than in matlab).

If I understand you correctly, this is not as true as it once was. There was some benchmark code that showed a slow-down on transposed arrays until we caught the bug that was causing it. What is important is that your inner-loops are running over data that is "closest" together in memory. Exactly how close is close enough depends on cache size.

-Travis

From nwagner at iam.uni-stuttgart.de Fri Apr 13 03:01:07 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Fri, 13 Apr 2007 09:01:07 +0200
Subject: [SciPy-dev] petsc - UMFPACK and scipy
Message-ID: <461F2AB3.1080407@iam.uni-stuttgart.de>

Ondrej,

Can you please show me an example where petsc solvers are "better" than UMFPACK?

Nils

4) About petsc - I know it's another dependency. However, I noticed you are using umfpack in SciPy. So why not petsc? I think it contains many more (sometimes better) solvers (depends on the problem). It seems logical to me to either use nothing, or the best library available, which I believe is petsc.

From matthieu.brucher at gmail.com Fri Apr 13 03:01:46 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Fri, 13 Apr 2007 09:01:46 +0200
Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user)
In-Reply-To:
References:
Message-ID:

A little update of my proposal:
- each step can be updated after each iteration; it will be enhanced so that everything computed in the iteration will be passed on, in case it is needed to update the step. That could be useful for approximated steps
- added a simple Damped optimizer: it tries to take a step, and if the cost is higher than before, half a step is tested, ...
- a function object is created if the function argument is not passed (takes the arg 'fun' as the cost function, gradient for the gradient, ...). Some safeguards must still be implemented.

I was thinking of the limits of this architecture:
- definitely all quasi-Newton optimizers can be ported to this framework, as well as all semi-quadratic ones
- constrained optimization will not, unless it is modified so that it can, but as I do not use such optimizers in my PhD thesis, I do not know them well enough

But even the simplex/polytope optimizer (fmin) can be expressed in the framework - it is useless though, as it would be slower - and can take advantage of the different stopping criteria. BTW, I used some parts of this framework in an EM algorithm with an AIC-based optimizer on top.

As I said in another thread, I'm in favour of fine-grained modules, even if some wrapper can provide simple optimization procedures (a minimal sketch of such interfaces follows below).

Matthieu

2007/3/26, Matthieu Brucher :
> > OK, I see why you want that approach.
> > (So that you can still pass a single object around in your
> > optimizer module.) Yes, that seems right...
>
> Exactly :)
>
> > This seems to bundle naturally with a specific optimizer?
>
> I'm not an expert in optimization, but I attended several classes/seminars
> on the subject, and at least the usual simple optimizers - the standard
> optimizer, all damped approaches, and all the others that use a step and a
> criterion test - use this interface, and with a lot of different steps that
> are usual - gradient, every conjugate gradient variant, (quasi-)Newton -
> or criteria.
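[A minimal sketch of the decomposition Matthieu describes — an optimizer loop driving pluggable step and stopping-criterion objects. All class and method names here are hypothetical, chosen to illustrate the shape of the proposal rather than its actual code:

import numpy as np

class GradientStep(object):
    # Step direction: plain gradient descent.
    def __call__(self, function, point, state):
        return -function.gradient(point)

class SmallGradientCriterion(object):
    # Stop when the gradient norm falls below a tolerance.
    def __init__(self, tol=1e-6):
        self.tol = tol
    def __call__(self, function, point, state):
        return np.linalg.norm(function.gradient(point)) < self.tol

class StandardOptimizer(object):
    # Drives the iteration; step and criterion are swappable parts.
    def __init__(self, function, step, criterion, x0, stepsize=0.1, maxiter=1000):
        self.function, self.step, self.criterion = function, step, criterion
        self.point = np.asarray(x0, float)
        self.stepsize, self.maxiter = stepsize, maxiter
        self.state = {'iteration': 0}
    def iterate(self):
        direction = self.step(self.function, self.point, self.state)
        self.point = self.point + self.stepsize * direction
        self.state['iteration'] += 1
    def optimize(self):
        while (self.state['iteration'] < self.maxiter
               and not self.criterion(self.function, self.point, self.state)):
            self.iterate()
        return self.point

# assumed function object: exposes __call__(x) and gradient(x)
class Quadratic(object):
    def __call__(self, x):
        return float(np.dot(x, x))
    def gradient(self, x):
        return 2.0 * x

opt = StandardOptimizer(Quadratic(), GradientStep(), SmallGradientCriterion(), x0=[3.0, -2.0])
print(opt.optimize())

A damped optimizer of the kind described above would override only iterate(), halving the step while the cost increases; the step and criterion objects are reused unchanged.]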
> I even suppose it can do very well in semi-quadratic optimization, with
> very little change, but I have to finish some work before I can read some
> books on the subject to begin implementing it in Python.
>
> > If so, the class definition should reside in the StandardOptimizer module.
> >
> > Cheers,
> > Alan Isaac
> >
> > PS For readability, I think Optimizer should define
> > a "virtual" iterate method. E.g.,
> > def iterate(self):
> >     return NotImplemented
>
> Yes, it seems better.
>
> Thanks for the opinion !
>
> Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: optimizerProposal_02.tar.gz
Type: application/x-gzip
Size: 3861 bytes
Desc: not available
URL: 

From david at ar.media.kyoto-u.ac.jp  Fri Apr 13 03:05:31 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 13 Apr 2007 16:05:31 +0900
Subject: [SciPy-dev] petsc - UMFPACK and scipy
In-Reply-To: <461F2AB3.1080407@iam.uni-stuttgart.de>
References: <461F2AB3.1080407@iam.uni-stuttgart.de>
Message-ID: <461F2BBB.7060406@ar.media.kyoto-u.ac.jp>

Nils Wagner wrote:
> Ondrej,
>
> Please can you show me an example where petsc solvers are "better" than
> UMFPACK.
>
> Nils
>
> 4) About the petsc - I know it's another dependency. However, I
> noticed you are using umfpack in SciPy. So why not petsc? I think it
> contains many more (sometimes better) solvers (it depends on the
> problem). It seems logical to me to either use nothing, or the best
> library available, which I believe is petsc.

I can give you one situation where adding a dependency makes things more
complicated: when you are a packager. I am trying to "evangelize"
numpy/scipy, and one problem people face is installation. When you are a
user, adding a dependency is great: it gives you more code, more API to
leverage. When you are a packager, each dependency is a mess. I am
working on rpm and debian packages of numpy and scipy, and 99.9 % of the
problems are the dependencies. LAPACK and BLAS are already quite
difficult to package correctly (debian was the only distribution to do
it correctly for a long time), UMFPACK is kind of a pain to compile too
(it depends on two other packages), and let's not even start talking
about ATLAS, which poses significant challenges by its very nature
(again, only debian has done it correctly, with Fedora copying their
method). And I only have experience on linux, where at least every
distribution uses the same compiler suite.

Most of those libraries are not really commonly used, and as such are
not provided by distributors most of the time. I wouldn't be surprised
if this is one of the reasons why Robert is reluctant to add more
dependencies.

David

From matthieu.brucher at gmail.com  Fri Apr 13 03:11:42 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Fri, 13 Apr 2007 09:11:42 +0200
Subject: [SciPy-dev] SciPy improvements
In-Reply-To: 
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com>
	<85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com>
	<30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca>
	<461EF0C4.6060506@ar.media.kyoto-u.ac.jp>
	<461F0E45.5000009@ar.media.kyoto-u.ac.jp>
Message-ID: 

> > Doesn't scipy have SVMs already ? Perhaps not as modularized as it could
> be ?
>
> It might have something in the sandbox, but as far as I'm concerned 'in
> scipy.sandbox' is synonymous with 'not in scipy'.

OK, I read David's answer - BTW, hello Gabou :) -.
I'm looking forward to this; I will probably use SVMs in the near
future, as I'm moving toward Python.

> PPCA is PCA IIRC (Tipping 97, it's part of my PhD thesis),
>
> Yes, pretty much so, just with some variances added into the diagonal
> of the matrix at the right place.

OK, so what you want is in fact the implementation of the optimization
algorithm given at the end of the paper ? Because in the standard form
of PPCA, the error variance is isotropic, and in that case it is
tantamount to simple PCA.

> KPCA is not a big
> > deal if kernels are in a module, and if they have a method taking 2
> > arguments.
>
> Right. Low-level kernel stuff in a reusable lib makes a lot of sense.
> If there were a lib of functions with common kernels and their
> derivatives and possibly even 2nd derivatives, that would cover a lot
> of territory. (I could use some 2nd derivatives right now...)

I would appreciate that as well, for a modified mean-shift algorithm.
I'll probably port the Isomap and LLE algorithms too; they are widely
used for manifold learning (they can be expressed as KPCA algorithms
with a particular kernel).

Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com  Fri Apr 13 03:49:17 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 13 Apr 2007 02:49:17 -0500
Subject: [SciPy-dev] SciPy improvements
In-Reply-To: <461F1E4E.1040705@ar.media.kyoto-u.ac.jp>
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com>
	<85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com>
	<30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca>
	<461EF0C4.6060506@ar.media.kyoto-u.ac.jp>
	<461F0E45.5000009@ar.media.kyoto-u.ac.jp>
	<461F1A3E.1000508@gmail.com>
	<461F1E4E.1040705@ar.media.kyoto-u.ac.jp>
Message-ID: <461F35FD.9040102@gmail.com>

David Cournapeau wrote:
> Since you're here, I have some questions concerning chaco for the
> visualization part of the project. Basically, I am unsure about whether
> I should use chaco or matplotlib. I do not know chaco very well yet,
> but it seems much better API- and performance-wise compared to matplotlib
> for interactive visualization. The problem is that it still does not
> have a lot of visibility in the community compared to matplotlib, and it
> is still pretty complicated to install. I do not care much about those
> points myself, but seeing how installation problems are one of the big
> difficulties for newcomers to numpy/scipy, I am a bit concerned. Is this
> impression founded, and if it is, is there a chance to see improvements
> on those fronts in the next few months ?

Just installing Chaco and the stuff it depends on can actually be
somewhat easier than matplotlib: we don't bother with external libjpeg,
libpng, and libfreetype libraries. The major issue would actually be
disabling the building of TVTK if you don't have VTK installed and you
only care about Chaco, but even that's a comment-out-one-line operation.

In the coming weeks, though, we will be playing around with a
reorganization of the repository along the lines of the scikits layout,
if you've been following that conversation. That would enable one to
just build the enthought subpackages that you need, or allow
easy_install to do so. Even if we don't end up reorganizing the trunk
that way, we'll probably have such a reorganized mirror of the trunk
using svn:externals to get the same effect for distribution purposes.
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From cimrman3 at ntc.zcu.cz Fri Apr 13 06:59:33 2007 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Fri, 13 Apr 2007 12:59:33 +0200 Subject: [SciPy-dev] SciPy improvements In-Reply-To: <461EC48B.50105@gmail.com> References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121631u19c1ef22w48ce1854c0897108@mail.gmail.com> <461EC48B.50105@gmail.com> Message-ID: <461F6295.8080303@ntc.zcu.cz> Robert Kern wrote: > Ondrej Certik wrote: >> 4) About the petsc - I know it's another dependence. However, I >> noticed you are using umfpack in SciPy. So why not petsc? I think it >> contains much more (sometimes better) solvers (depends on the >> problem). It's seems logical to me, to either use nothing, or the best >> library available, which I believe is petsc. > > Well, I wasn't as much of a dependency/no-optional-features freak when the > optional UMFPACK stuff went in. Also, IIRC the wrappers for UMFPACK were written > specifically for scipy; they didn't exist as a separate package beforehand. > petsc4py already exists. Unless if we decide that some other feature of scipy > needs it, there is no reason that I can see for bringing it into the scipy package. Yes, it was written specifically for scipy. Actually I was really 'forced' to write UMFPACK wrappers as at that time (long long ago) there was not a fast enough direct sparse solver in scipy. r. From cimrman3 at ntc.zcu.cz Fri Apr 13 07:13:59 2007 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Fri, 13 Apr 2007 13:13:59 +0200 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <461F2AB3.1080407@iam.uni-stuttgart.de> References: <461F2AB3.1080407@iam.uni-stuttgart.de> Message-ID: <461F65F7.6000909@ntc.zcu.cz> Nils Wagner wrote: > Please can you show me an example where petsc solvers are "better" than > UMFPACK. Petsc is really a superpackage providing many parallel linear solvers (iterative, direct, preconditioners, ...) together with nonlinear solvers, time steppers, etc. The solvers can be both petsc-native or external packages, nevertheless all are accessed via a uniform interface. IMHO UMFPACK is one of the optional external solvers petsc can use, so to answer your question, petsc can do anything that UMFPACK does and much more. r. From ondrej at certik.cz Fri Apr 13 07:29:01 2007 From: ondrej at certik.cz (Ondrej Certik) Date: Fri, 13 Apr 2007 13:29:01 +0200 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <461F65F7.6000909@ntc.zcu.cz> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F65F7.6000909@ntc.zcu.cz> Message-ID: <85b5c3130704130429i2ba0d7a7i166969fd4c548158@mail.gmail.com> On 4/13/07, Robert Cimrman wrote: > Nils Wagner wrote: > > Please can you show me an example where petsc solvers are "better" than > > UMFPACK. > > Petsc is really a superpackage providing many parallel linear solvers > (iterative, direct, preconditioners, ...) together with nonlinear > solvers, time steppers, etc. The solvers can be both petsc-native or > external packages, nevertheless all are accessed via a uniform > interface. IMHO UMFPACK is one of the optional external solvers petsc > can use, so to answer your question, petsc can do anything that UMFPACK > does and much more. Yes, it's exactly like this. 
Thus, there is a question of whether SciPy should support sparse
solvers (my answer is yes), and if so, it should support petsc;
otherwise I, for example, am not going to use it, as I want to try
several solvers depending on the problem.

What I am trying to say is that I don't want to write two versions of
my code - one for petsc and a second one for SciPy. And from the Zen of
Python:

There should be one-- and preferably only one --obvious way to do it.

Ondra

From nwagner at iam.uni-stuttgart.de  Fri Apr 13 07:32:53 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Fri, 13 Apr 2007 13:32:53 +0200
Subject: [SciPy-dev] petsc - UMFPACK and scipy
In-Reply-To: <461F65F7.6000909@ntc.zcu.cz>
References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F65F7.6000909@ntc.zcu.cz>
Message-ID: <461F6A65.4010605@iam.uni-stuttgart.de>

Robert Cimrman wrote:
> Nils Wagner wrote:
>
>> Please can you show me an example where petsc solvers are "better" than
>> UMFPACK.
>>
>
> Petsc is really a superpackage providing many parallel linear solvers
> (iterative, direct, preconditioners, ...) together with nonlinear
> solvers, time steppers, etc. The solvers can be both petsc-native or
> external packages, nevertheless all are accessed via a uniform
> interface. IMHO UMFPACK is one of the optional external solvers petsc
> can use, so to answer your question, petsc can do anything that UMFPACK
> does and much more.
>
> r.
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>
The current UMFPACK version is 5.0.3. Which versions are supported by
scipy ? As I mentioned in a previous email, I still have trouble
installing versions other than 4.4. Any pointer on how to install more
recent versions of UMFPACK would be appreciated.

When you are talking about superpackages, how about Sundials ?
http://www.llnl.gov/CASC/sundials/

AFAIK the ode solvers in scipy cannot handle events, and a DAE solver
is also missing in scipy.
http://www.wolfram.com/products/mathematica/newin51/eventhandling.html

Are there plans to add this functionality to scipy (in the form of
scikits) ?

Nils

From cimrman3 at ntc.zcu.cz  Fri Apr 13 08:24:04 2007
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Fri, 13 Apr 2007 14:24:04 +0200
Subject: [SciPy-dev] petsc - UMFPACK and scipy
In-Reply-To: <85b5c3130704130429i2ba0d7a7i166969fd4c548158@mail.gmail.com>
References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F65F7.6000909@ntc.zcu.cz>
	<85b5c3130704130429i2ba0d7a7i166969fd4c548158@mail.gmail.com>
Message-ID: <461F7664.7070001@ntc.zcu.cz>

Ondrej Certik wrote:
> On 4/13/07, Robert Cimrman wrote:
>> Nils Wagner wrote:
>>> Please can you show me an example where petsc solvers are "better" than
>>> UMFPACK.
>> Petsc is really a superpackage providing many parallel linear solvers
>> (iterative, direct, preconditioners, ...) together with nonlinear
>> solvers, time steppers, etc. The solvers can be both petsc-native or
>> external packages, nevertheless all are accessed via a uniform
>> interface. IMHO UMFPACK is one of the optional external solvers petsc
>> can use, so to answer your question, petsc can do anything that UMFPACK
>> does and much more.
>
> Yes, it's exactly like this. Thus, there is a question of whether SciPy
> should support sparse solvers (my answer is yes), and if so, it
> should support petsc; otherwise I, for example, am not going to use
> it, as I want to try several solvers depending on the problem.
My problems tend to be such that only direct solvers work :)

> What I am trying to say is that I don't want to write two versions of
> my code - one for petsc and a second one for SciPy. And from the Zen of
> Python:
>
> There should be one-- and preferably only one --obvious way to do it.

Well, you can very well use both petsc and scipy/numpy together. afaik
petsc4py depends on numpy, so you need that in any case, and scipy is a
set of very useful modules built on top of numpy (particularly its
multidimensional array data type), addressing different fields of
(scientific) computation, not just solving linear systems. It is true
that the sparse matrix support in scipy is not as mature as some users
need, but this can change :). So for now, you can use petsc (or <insert
your favourite sparse matrix package here>) for sparse stuff if you
like, and scipy for other things that are not in petsc.
There is no contradiction, imho.

Just my 2kc,

r.

From cimrman3 at ntc.zcu.cz  Fri Apr 13 08:36:41 2007
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Fri, 13 Apr 2007 14:36:41 +0200
Subject: [SciPy-dev] petsc - UMFPACK and scipy
In-Reply-To: <461F6A65.4010605@iam.uni-stuttgart.de>
References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F65F7.6000909@ntc.zcu.cz>
	<461F6A65.4010605@iam.uni-stuttgart.de>
Message-ID: <461F7959.4050105@ntc.zcu.cz>

Nils Wagner wrote:
> Robert Cimrman wrote:
>> Nils Wagner wrote:
>>
>>> Please can you show me an example where petsc solvers are
>>> "better" than UMFPACK.
>>>
>> Petsc is really a superpackage providing many parallel linear
>> solvers (iterative, direct, preconditioners, ...) together with
>> nonlinear solvers, time steppers, etc. The solvers can be both
>> petsc-native or external packages, nevertheless all are accessed
>> via a uniform interface. IMHO UMFPACK is one of the optional
>> external solvers petsc can use, so to answer your question, petsc
>> can do anything that UMFPACK does and much more.
>>
> The current UMFPACK version is 5.0.3. Which versions are supported by
> scipy ? As I mentioned in a previous email, I still have trouble
> installing versions other than 4.4. Any pointer on how to install more
> recent versions of UMFPACK would be appreciated.

I can use 5.0 without problems, so 5.0.3 should work too.

I have downloaded the whole UFsparse suite, edited UFconfig/UFconfig.mk,
cd'd into UMFPACK and typed 'make' (the normal UMFPACK installation
procedure). Then I edited my numpy/site.cfg to reflect the installation
paths. I do remember you had some problems with this step, but I do not
know why. Any other recent UMFPACK users out there?

r.

ps: I am leaving for one week, so I will not be able to answer UMFPACK
related questions :)

From nwagner at iam.uni-stuttgart.de  Fri Apr 13 08:36:48 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Fri, 13 Apr 2007 14:36:48 +0200
Subject: [SciPy-dev] petsc - UMFPACK and scipy
In-Reply-To: <461F7664.7070001@ntc.zcu.cz>
References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F65F7.6000909@ntc.zcu.cz>
	<85b5c3130704130429i2ba0d7a7i166969fd4c548158@mail.gmail.com>
	<461F7664.7070001@ntc.zcu.cz>
Message-ID: <461F7960.4090102@iam.uni-stuttgart.de>

Robert Cimrman wrote:
> Ondrej Certik wrote:
>
>> On 4/13/07, Robert Cimrman wrote:
>>
>>> Nils Wagner wrote:
>>>> Please can you show me an example where petsc solvers are "better" than
>>>> UMFPACK.
>>>>
>>> Petsc is really a superpackage providing many parallel linear solvers
>>> (iterative, direct, preconditioners, ...) together with nonlinear
>>> solvers, time steppers, etc.
The solvers can be both petsc-native or
>>> external packages, nevertheless all are accessed via a uniform
>>> interface. IMHO UMFPACK is one of the optional external solvers petsc
>>> can use, so to answer your question, petsc can do anything that UMFPACK
>>> does and much more.
>>>
>> Yes, it's exactly like this. Thus, there is a question of whether SciPy
>> should support sparse solvers (my answer is yes), and if so, it
>> should support petsc; otherwise I, for example, am not going to use
>> it, as I want to try several solvers depending on the problem.
>>
> My problems tend to be such that only direct solvers work :)
>
>> What I am trying to say is that I don't want to write two versions of
>> my code - one for petsc and a second one for SciPy. And from the Zen of
>> Python:
>>
>> There should be one-- and preferably only one --obvious way to do it.
>>
> Well, you can very well use both petsc and scipy/numpy together. afaik
> petsc4py depends on numpy, so you need that in any case, and scipy is a
> set of very useful modules built on top of numpy (particularly its
> multidimensional array data type), addressing different fields of
> (scientific) computation, not just solving linear systems. It is true
> that the sparse matrix support in scipy is not as mature as some users
> need, but this can change :). So for now, you can use petsc (or <insert
> your favourite sparse matrix package here>) for sparse stuff if you
> like, and scipy for other things that are not in petsc.
> There is no contradiction, imho.
>
> Just my 2kc,
>
> r.
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>
Unfortunately petsc4py has no tutorial.
http://www.cimec.org.ar/python/petsc4py.html#tutorial

I guess that many users prefer well-documented packages.

Nils

From ondrej at certik.cz  Fri Apr 13 08:44:46 2007
From: ondrej at certik.cz (Ondrej Certik)
Date: Fri, 13 Apr 2007 14:44:46 +0200
Subject: [SciPy-dev] petsc - UMFPACK and scipy
In-Reply-To: <461F7960.4090102@iam.uni-stuttgart.de>
References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F65F7.6000909@ntc.zcu.cz>
	<85b5c3130704130429i2ba0d7a7i166969fd4c548158@mail.gmail.com>
	<461F7664.7070001@ntc.zcu.cz> <461F7960.4090102@iam.uni-stuttgart.de>
Message-ID: <85b5c3130704130544g66a2c3ei27828ca6e174df0f@mail.gmail.com>

> Unfortunately petsc4py has no tutorial.
> http://www.cimec.org.ar/python/petsc4py.html#tutorial
>
> I guess that many users prefer well-documented packages.

Well, is there a tutorial for umfpack in scipy? I only found this:

http://www.scipy.org/doc/api_docs/scipy.sparse.html

But that's about the same amount of documentation as petsc4py has in
the form of docstrings.

Ondrej

From nwagner at iam.uni-stuttgart.de  Fri Apr 13 08:52:22 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Fri, 13 Apr 2007 14:52:22 +0200
Subject: [SciPy-dev] petsc - UMFPACK and scipy
In-Reply-To: <85b5c3130704130544g66a2c3ei27828ca6e174df0f@mail.gmail.com>
References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F65F7.6000909@ntc.zcu.cz>
	<85b5c3130704130429i2ba0d7a7i166969fd4c548158@mail.gmail.com>
	<461F7664.7070001@ntc.zcu.cz> <461F7960.4090102@iam.uni-stuttgart.de>
	<85b5c3130704130544g66a2c3ei27828ca6e174df0f@mail.gmail.com>
Message-ID: <461F7D06.4080908@iam.uni-stuttgart.de>

Ondrej Certik wrote:
>> Unfortunately petsc4py has no tutorial.
>> http://www.cimec.org.ar/python/petsc4py.html#tutorial
>>
>> I guess that many users prefer well-documented packages.
>>
> Well, is there a tutorial for umfpack in scipy? I only found this:
>
> http://www.scipy.org/doc/api_docs/scipy.sparse.html
>
> But that's about the same amount of documentation as petsc4py has in
> the form of docstrings.
>
> Ondrej
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>
Try

>>> from scipy import *
>>> help (linsolve)

Nils

From matthieu.brucher at gmail.com  Fri Apr 13 12:14:55 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Fri, 13 Apr 2007 18:14:55 +0200
Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user)
In-Reply-To: 
References: 
Message-ID: 

A new proposal...
I refactored the code for the line search part; it is now another
module. The damped optimizer of the last proposal is now a damped line
search; by default no line search is performed at all.

Matthieu

2007/4/13, Matthieu Brucher :
>
> A little update of my proposal :
>
> - each step can be updated after each iteration; it will be enhanced so
> that everything computed in the iteration is passed on, in case it is
> needed to update the step. That could be useful for approximated steps
> - added a simple Damped optimizer: it tries to take a step; if the cost is
> higher than before, half a step is tested, ...
> - a function object is created if the function argument is not passed
> (it takes the arg 'fun' as the cost function, 'gradient' for the gradient, ...).
> Some safeguards must still be implemented.
>
> I was thinking of the limits of this architecture :
> - definitely all quasi-Newton optimizers can be ported to this framework,
> as well as all semi-quadratic ones
> - constrained optimization will not, unless it is modified so that it can;
> but as I do not use such optimizers in my PhD thesis, I do not know them
> well enough
>
> But even the simplex/polytope optimizer (fmin) can be expressed in the
> framework - it is useless though, as it would be slower - and can take
> advantage of the different stopping criteria. BTW, I used some parts of
> this framework in an EM algorithm with an AIC-based optimizer on top.
>
> As I said in another thread, I'm in favour of fine-grained modules, even
> if some wrapper can provide simple optimization procedures.
>
> Matthieu
>
> 2007/3/26, Matthieu Brucher < matthieu.brucher at gmail.com>:
> > > OK, I see why you want that approach.
> > > (So that you can still pass a single object around in your
> > > optimizer module.)  Yes, that seems right...
> >
> > Exactly :)
> >
> > > This seems to bundle naturally with a specific optimizer?
> >
> > I'm not an expert in optimization, but I attended several classes/seminars
> > on the subject, and at least the usual simple optimizers - the standard
> > optimizer, all damped approaches, and all the others that use a step and a
> > criterion test - use this interface, and with a lot of different steps that
> > are usual - gradient, every conjugated gradient solution, (quasi-)Newton -
> > or criteria.
> > I even suppose it can do very well in semi-quadratic optimization, with
> > very little change, but I have to finish some work before I can read some
> > books on the subject to begin implementing it in Python.
> >
> > > If so, the class definition should reside in the StandardOptimizer
> > > module.
> > > > > > Cheers, > > > Alan Isaac > > > > > > PS For readability, I think Optimizer should define > > > a "virtual" iterate method. E.g., > > > def iterate(self): > > > return NotImplemented > > > > > > Yes, it seems better. > > > > Thanks for the opinion ! > > > > Matthieu > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: optimizerProposal_03.zip Type: application/zip Size: 7587 bytes Desc: not available URL: From robert.kern at gmail.com Fri Apr 13 14:14:00 2007 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 13 Apr 2007 13:14:00 -0500 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <461F6A65.4010605@iam.uni-stuttgart.de> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F65F7.6000909@ntc.zcu.cz> <461F6A65.4010605@iam.uni-stuttgart.de> Message-ID: <461FC868.9090500@gmail.com> Nils Wagner wrote: > When you are talking about superpackages how about Sundials ? You keep asking this question over and over again, and you get the same answer every time: it will be wrapped when the people who want it wrapped put the effort into wrapping it. If you want to see wrappers for SUNDIALS, stop asking the question and start writing code. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From wnbell at gmail.com Fri Apr 13 14:42:40 2007 From: wnbell at gmail.com (Nathan Bell) Date: Fri, 13 Apr 2007 13:42:40 -0500 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <461F7959.4050105@ntc.zcu.cz> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F65F7.6000909@ntc.zcu.cz> <461F6A65.4010605@iam.uni-stuttgart.de> <461F7959.4050105@ntc.zcu.cz> Message-ID: On 4/13/07, Robert Cimrman wrote: > I can use 5.0 without problems, so 5.0.3 should work too. > > I have downloaded the whole UFsparse suite, edited UFconfig/UFconfig.mk, > cd into UMFPACK and typed 'make' (normal UMFPACK installation > procedure). Then I edited my numpy/site.cfg to reflect the installation > paths. I do remember you had some problems with this step, but I do not > know why. > > Any other recent UMFPACK users out there? I'm using SuiteSparse version 2.1.1 dated 09/11/2006 which includes UMFPACK 5.0.1. IIRC I'm also using GotoBLAS for UMFPACK. I installed SuiteSparse to /opt/SuiteSparse and copied all the UMFPACK, CHOLMOD, AMD etc. header files to /usr/include. I remember having problems when trying to point SciPy at the header files in /opt/SuiteSparse. 
My site.cfg has the following entries: [amd] library_dirs = /opt/SuiteSparse/AMD/Lib include_dirs = /opt/SuiteSparse/AMD/Include amd_libs = amd [umfpack] library_dirs = /opt/SuiteSparse/UMFPACK/Lib include_dirs = /opt/SuiteSparse/UMFPACK/Include umfpack_libs = umfpack When running SciPy's setup.py I get the following output: umfpack_info: amd_info: FOUND: libraries = ['amd'] library_dirs = ['/opt/SuiteSparse/AMD/Lib'] swig_opts = ['-I/opt/SuiteSparse/AMD/Include'] define_macros = [('SCIPY_AMD_H', None)] include_dirs = ['/opt/SuiteSparse/AMD/Include'] FOUND: libraries = ['umfpack', 'amd'] library_dirs = ['/opt/SuiteSparse/UMFPACK/Lib', '/opt/SuiteSparse/AMD/Lib'] swig_opts = ['-I/opt/SuiteSparse/UMFPACK/Include', '-I/opt/SuiteSparse/AMD/Include'] define_macros = [('SCIPY_UMFPACK_H', None), ('SCIPY_AMD_H', None)] include_dirs = ['/opt/SuiteSparse/UMFPACK/Include', '/opt/SuiteSparse/AMD/Include'] -- Nathan Bell wnbell at gmail.com From ondrej at certik.cz Fri Apr 13 19:57:53 2007 From: ondrej at certik.cz (Ondrej Certik) Date: Sat, 14 Apr 2007 01:57:53 +0200 Subject: [SciPy-dev] SciPy improvements In-Reply-To: <461EC48B.50105@gmail.com> References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121631u19c1ef22w48ce1854c0897108@mail.gmail.com> <461EC48B.50105@gmail.com> Message-ID: <85b5c3130704131657l101e8927wb7e3235acc4c73ab@mail.gmail.com> > > So, I'll copy the page: > > > > http://www.scipy.org/Documentation > > > > into some new one, and redesign it as I would like it to be, and then > > you'll tell me what you think about it. The same with other pages if > > I'll get a better idea about them. This way I shouldn't spoil anything > > in case you wouldn't like it. Because I don't have just couple of > > small fixes. > > As you like. Thank you! You can check my draft of a new documentation: http://www.scipy.org/DocumentationNew and the original one: http://www.scipy.org/Documentation I removed broken links, merged very simple pages with the tutorial and removed duplicates. Tell me, if you like it and if I should continue - I would merge SciPy Tutorial with Tutorial II, then I would merge all porting wikis into one central with links (or just add links to one of them) and possibly some more simplifying - it's still too much complicated with too much (confusing) links. Ondrej From mforbes at physics.ubc.ca Sat Apr 14 11:37:17 2007 From: mforbes at physics.ubc.ca (Michael McNeil Forbes) Date: Sat, 14 Apr 2007 08:37:17 -0700 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: References: Message-ID: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> I started a discussion page on the Trac for design ideas etc. about modular optimization. Right now I am just adding questions I have about things. As it becomes more coherent, we can bring these questions/ideas to the list for comments. http://projects.scipy.org/scipy/scipy/wiki/DevelopmentIdeas/ ModularOptimization Michael. P.S. What is the best way to share code ideas on the wiki? Small bits work well inline, but for larger chunks it would be nice to be able to attach files. Unfortunately none of the wiki's deals well with attached files (no versioning and sometimes no way of modifying the file). On 13 Apr 2007, at 9:14 AM, Matthieu Brucher wrote: > A new proposal... > I refactored the code for the line search part, it is now a another > module. The damped optimizer of the last proposal is now a damped > line search, by default no line search is performed at all. 
> > Matthieu

From matthieu.brucher at gmail.com  Sat Apr 14 11:59:37 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Sat, 14 Apr 2007 17:59:37 +0200
Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user)
In-Reply-To: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca>
References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca>
Message-ID: 

Good point ;)
I could make those changes safe for the f or func part... Using an
object to optimize is, for me, better than a collection of functions,
although a collection of functions can be made into an object if
needed.

For the interface, I suppose that assembling an optimizer is not
something everybody will want to do; that's why some optimizers are
proposed out of the box in MatLab toolboxes, for instance. But allowing
an optimizer to be customized rapidly can be a real advantage over all
other optimization packages.

One of the members of the lab I'm studying in said to me that he didn't
see why such modularization was pertinent. He used for his application
(warping an image) a Levenberg-Marquardt optimizer with constraints,
and the line search was performed with interval analysis. Until some
days ago, I thought that he was right - that only some optimizers can
be expressed in "my" framework. Now, I think that even his optimization
could be expressed, and if he wanted to modify something in the
optimizer, it would be much simpler with this architecture, in Python,
than what he has now, in C. He made some stuff very specific to his
function, as a lot of people would want to do but couldn't with a fixed
interface like MatLab's; yet in fact a lot could be expressed in terms
of a specific step, a specific line search, a specific criterion and a
specific function/set of parameters.

Until some time ago, I thought that modules with criteria, steps and
optimizers would be enough; now I think I missed the fact that a lot of
optimizers share the line search, and that it should be another module.

I'm writing some other test functions (shamelessly taken from
_Engineering Optimization_ by Rao) with other line searches and steps;
I'll keep you posted.

Matthieu

2007/4/14, Michael McNeil Forbes :
>
> I started a discussion page on the Trac for design ideas etc. about
> modular optimization. Right now I am just adding questions I have
> about things. As it becomes more coherent, we can bring these
> questions/ideas to the list for comments.
>
> http://projects.scipy.org/scipy/scipy/wiki/DevelopmentIdeas/
> ModularOptimization
>
> Michael.
>
> P.S. What is the best way to share code ideas on the wiki? Small
> bits work well inline, but for larger chunks it would be nice to be
> able to attach files. Unfortunately none of the wiki's deals well
> with attached files (no versioning and sometimes no way of modifying
> the file).
>
> On 13 Apr 2007, at 9:14 AM, Matthieu Brucher wrote:
>
> > A new proposal...
> > I refactored the code for the line search part; it is now another
> > module. The damped optimizer of the last proposal is now a damped
> > line search; by default no line search is performed at all.
> >
> > Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From guyer at nist.gov Sat Apr 14 12:06:43 2007 From: guyer at nist.gov (Jonathan Guyer) Date: Sat, 14 Apr 2007 12:06:43 -0400 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> Message-ID: On Apr 14, 2007, at 11:37 AM, Michael McNeil Forbes wrote: > P.S. What is the best way to share code ideas on the wiki? Small > bits work well inline, but for larger chunks it would be nice to be > able to attach files. Unfortunately none of the wiki's deals well > with attached files (no versioning and sometimes no way of modifying > the file). In Trac, you can check things into the associated repository and then use source: and diff: links. From mforbes at physics.ubc.ca Sat Apr 14 15:45:23 2007 From: mforbes at physics.ubc.ca (Michael McNeil Forbes) Date: Sat, 14 Apr 2007 12:45:23 -0700 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> Message-ID: On 14 Apr 2007, at 8:59 AM, Matthieu Brucher wrote: > Good point ;) > I could make those changes safe for the f or func part... Using an > object to optimize is for me better than a collection of functions > although a collection of functions can be made into an object if > needed. > > > For the interface, I suppose that assembling a optimizer is not > something everybody will want to do, that's why some optimizers are > proposed out of the box in MatLab toolboxes for instance, but > allowing to customize rapidly an optimizer can be a real advantage > over all other optimization packages. And one can easily make convenience functions which take standard arguments and package them internally. I think that the interface should be flexible enough to allow users to just call the optimizers with a few standard arguments like they are used to, but allow users to "build" more complicated/more customized optimizers as they need. Also, it would be nice if an optimizer could be "tuned" to a particular problem (i.e. have a piece of code that tries several algorithms and parameter values to see which is fastest.) > One of the members of the lab I studying in said to me that he did > see if such modularization was pertinent. He used for its > application (warping an image) a Levenberg-Marquardt optimizer with > constraints and the line-search was performed with interval > analysis. Until some days ago, I thought that he was right, that > only some of optimizers can be expressed in "my" framework. Now, I > think that even his optimization could be expressed, and if he > wanted to modify something in the optimizer, it would be much > simpler with this architecture, in Python, that what he has now, in > C. He made some stuff very specific for his function, as a lot of > people would want to do, but couldn't with a fixed interface ike > MatLab's, but in fact a lot could be expressed in terms of a > specific step, a specific line search, a specific criterion and a > specific function/set of parameters. > > Until some time ago, I thought that modules with criteria, steps > and optimizers would be enough, now I think I missed the fact that > a lot of optimizers share the line search, and that it should be > onother module. My immediate goal is to try and get the interface and module structure well defined so that I know where to put the pieces of my Broyden code when I rip it apart. 
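To make that concrete, here is a minimal sketch of how the assembled
pieces might fit together, assuming numpy (all class names here are
made up to illustrate the idea, not taken from the posted proposal):

import numpy as np

class GradientStep(object):
    # Step module: steepest-descent direction from the function's gradient.
    def __call__(self, function, x):
        return -function.gradient(x)

class BacktrackingLineSearch(object):
    # Line-search module: halve the step until the cost decreases.
    def __init__(self, shrink=0.5, max_tries=30):
        self.shrink, self.max_tries = shrink, max_tries
    def __call__(self, function, x, direction):
        t, f0 = 1.0, function(x)
        for _ in range(self.max_tries):
            if function(x + t * direction) < f0:
                return x + t * direction
            t *= self.shrink
        return x

class ValueCriterion(object):
    # Criterion module: stop when the cost stops decreasing.
    def __init__(self, ftol=1e-8):
        self.ftol = ftol
    def __call__(self, old_value, new_value):
        return abs(old_value - new_value) <= self.ftol

class StandardOptimizer(object):
    # The glue: iterate step -> line search -> criterion.
    def __init__(self, function, step, line_search, criterion, maxiter=1000):
        self.function, self.step = function, step
        self.line_search, self.criterion = line_search, criterion
        self.maxiter = maxiter
    def optimize(self, x0):
        x = np.asarray(x0, dtype=float)
        for _ in range(self.maxiter):
            d = self.step(self.function, x)
            x_new = self.line_search(self.function, x, d)
            if self.criterion(self.function(x), self.function(x_new)):
                return x_new
            x = x_new
        return x

A convenience function could then build a default combination of these
modules for users who just want to call a single routine, while anyone
else swaps in their own step, line search or criterion.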
One question about coupling: Useful criteria for globally convergent
algorithms include testing the gradients and/or curvature of the
function. In the Broyden algorithm, for example, these would be
maintained by the "step" object, but the criterion object would need
to access these. Likewise, if a "function" object can compute its
own derivatives, then the "criterion" object should access it from
there. Any ideas on how to deal with these couplings? Perhaps the
"function" object should maintain all the state (approximate
Jacobians etc.).

Michael.

From matthieu.brucher at gmail.com  Sat Apr 14 16:51:15 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Sat, 14 Apr 2007 22:51:15 +0200
Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user)
In-Reply-To: 
References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca>
Message-ID: 

> And one can easily make convenience functions which take standard
> arguments and package them internally. I think that the interface
> should be flexible enough to allow users to just call the optimizers
> with a few standard arguments like they are used to, but allow users
> to "build" more complicated/more customized optimizers as they need.
> Also, it would be nice if an optimizer could be "tuned" to a
> particular problem (i.e. have a piece of code that tries several
> algorithms and parameter values to see which is fastest.)

Exactly.

> My immediate goal is to try and get the interface and module
> structure well defined so that I know where to put the pieces of my
> Broyden code when I rip it apart.

I can help you with it :)

> One question about coupling: Useful criteria for globally convergent
> algorithms include testing the gradients and/or curvature of the
> function.

Simple: the criterion takes, for the moment, the current iteration
number, the former value, the current value, and the same for the
parameters. It can be modified to add the gradient if needed - I think
the step would be a better choice ? -

> In the Broyden algorithm, for example, these would be
> maintained by the "step" object,
> but the criterion object would need
> to access these.

Access what exactly ?

> Likewise, if a "function" object can compute its
> own derivatives, then the "criterion" object should access it from
> there.

I don't think that the criterion needs to access this, because it
would mean it knows more than it should, from an object-oriented point
of view, but this can be discussed :)

> Any ideas on how to deal with these couplings? Perhaps the
> "function" object should maintain all the state (approximate
> Jacobians etc.).

I don't think so; the function provides methods to compute the
gradient, hessian, ... but only the step object knows what to do with
them: approximate a hessian, what was already approximated, ... A step
object is associated with one optimizer; a function object can be
optimized several times. If it has a state, it couldn't be used with
several optimizers without reinitializing it, and it is not intuitive
enough.

I've been thinking about a correct architecture for several months
now, and this is what I think is a good one :
- a function to optimize that provides some methods to compute the
cost, the gradient, the hessian, ... only basic stuff
- an object that is responsible for the optimization, the glue between
all modules -> the optimizer
- an object that tells if the optimization has converged.
It needs the current iteration number, several last values, parameters, perhaps other things, but these things should be discussed - an object that computes a new step, takes a function to optimize, can have a state - to compute approximate hessian or inverse hessian - a line search that can find a new candidate - section method, damped method, no method at all, with a state (Marquardt), ... With these five objects, I _think_ every unconstrained method can be expressed. For the constraints, I suppose the step and the line search should be adapted, but no other module needs to be touched. I implemented the golden and fibonacci section, it's pretty straightforward to add other line searches or steps, I'll try to add some before I submit it on TRAC. Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From mforbes at physics.ubc.ca Mon Apr 16 11:39:08 2007 From: mforbes at physics.ubc.ca (Michael McNeil Forbes) Date: Mon, 16 Apr 2007 08:39:08 -0700 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> Message-ID: <47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca> On 14 Apr 2007, at 1:51 PM, Matthieu Brucher wrote: > > One question about coupling: Useful criteria for globally convergent > algorithms include testing the gradients and/or curvature of the > function. > > Simple, the criterion takes, for the moment, the current iteration > number, the former value, the current value, same for the > parameters. It can be modified to add the gradient if needed - I > think the step would be a better choice ? > > In the Broyden algorithm, for example, these would be > maintained by the "step" object, > > but the criteria object would need > to access these. > > Access what exactly ? "these == gradient/hessian information" The criterion needs access to this information, but the question is: who serves it? If the "function" can compute these, then it should naturally serve this information. With the Broyden method, you suggest that the "step" would serve this information. Thus, there are two objects (depending on the choice of method) that maintain and provide gradient information. After thinking about this some more, I am beginning to like the idea that only the "function" object be responsible for the Jacobian. If the function can compute the Jacobian directly: great, use a newton- like method. If it can't, then do its best to approximate it (i.e. the "Broyden" part of the algorithm would be encoded in the function object rather than the step object." The "function" object alone then serves up information about the value of the function at a given point, as well as the gradient and hessian at that point (either exact or approximate) to the criterion, step, and any other objects that need it. > Likewise, if a "function" object can compute its > own derivatives, then the "ceriterion" object should access it from > there. > > > I don't think that the criterion need to access this, because it > would mean it knows more than it should, from an object-oriented > point of view, but this can be discussed :) Certain termination criteria need access to the derivatives to make sure that they terminate. It would query the function object for this information. Other criteria may need to query the "step" object to find out the size of the previous steps. 
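A criterion that consumes such served-up gradient information might
look like this minimal sketch, assuming numpy (the 'state' dictionary
and the names are hypothetical, not from the posted code):

import numpy as np

class GradientCriterion(object):
    # Stop when the gradient norm falls below a tolerance.  It does not
    # care whether 'gradient' came from the function (exact) or from a
    # Broyden-style approximation: it only reads the served value.
    def __init__(self, gtol=1e-6):
        self.gtol = gtol

    def __call__(self, state):
        # 'state' is assumed to be filled in each iteration by the
        # optimizer with whatever the other modules produced.
        return np.linalg.norm(state['gradient']) < self.gtol

Whoever fills in state['gradient'] - the function object or the step
object - is exactly the coupling question being discussed here.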
The "criterion" should not maintain any of these internally, just rely on the values served by the other objects: this does not break the encapsulation, it just couples the objects more tightly, but sophisticated criteria need this coupling. > Any ideas on how to deal with these couplings? Perhaps the > "function" object should maintain all the state (approximate > jacobians etc.). > > I don't think so, the function provides methods to compute > gradient, hessian, ... but only the step object knows what to do > with it : approximate a hessian, what was already approximated, ... > A step object is associated with one optimizer, a function object > can be optimized several times. If it has a state, it couldn't be > used with several optimizers without reinitializing it, and it is > not intuitive enough. The "function" object maintains all the information known about the function: how to compute the function, how to compute/approximate derivatives etc. If the user does not supply code for directly computing derivatives, but wants to use an optimization method that makes use of gradient information, then the function object should do its best to provide approximate information. The essence behind the Broyden methods is to approximate the Jacobian information in a clever and cheap way. I really think the natural place for this is in the "function" object, not the "step". > I've thinking about a correct architecture for several months now, > and that is what I think is a good one : > - a function to optimize that provides some method to compute the > cost, the gradient, hessian, ... only basic stuff > - an object that is responsible for the optimization, the glue > between all modules -> optimizer > - an object that tells if the optimization has converged. It needs > the current iteration number, several last values, parameters, > perhaps other things, but these things should be discussed > - an object that computes a new step, takes a function to optimize, > can have a state - to compute approximate hessian or inverse hessian > - a line search that can find a new candidate - section method, > damped method, no method at all, with a state (Marquardt), ... > > With these five objects, I _think_ every unconstrained method can > be expressed. For the constraints, I suppose the step and the line > search should be adapted, but no other module needs to be touched. Please describe how you think the Broyden root-finding method would fit within this scheme. Which object would maintain the state of the approximate Jacobian? Michael. From matthieu.brucher at gmail.com Mon Apr 16 12:07:44 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 16 Apr 2007 18:07:44 +0200 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: <47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca> References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> <47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca> Message-ID: I suspected it would become more problematic to decouple everything, but not that soon :) "these == gradient/hessian information" > > The criterion needs access to this information, but the question is: > who serves it? If the "function" can compute these, then it should > naturally serve this information. With the Broyden method, you > suggest that the "step" would serve this information. Thus, there > are two objects (depending on the choice of method) that maintain and > provide gradient information. 
I'll look in the literature for the Broyden method; if I see the whole
algorithm, I think I'll be able to answer your questions ;)

> After thinking about this some more, I am beginning to like the idea
> that only the "function" object be responsible for the Jacobian. If
> the function can compute the Jacobian directly: great, use a
> newton-like method. If it can't, then do its best to approximate it
> (i.e. the "Broyden" part of the algorithm would be encoded in the
> function object rather than the step object.)

I think that if the function knows, on its own, how to compute the
Jacobian, the hessian, ... it should provide them. When it does not,
it shouldn't be the man sitting at the computer who modifies his
function to add a Broyden algorithm to the function object. He should
only say to the optimizer that the function does not compute the
Jacobian, by using another module. What module ? That is a question
for later. The goal of this is to have a clean architecture, and
adding a way to compute something directly in the function - something
that is not dependent on the function, but on the step - is not a good
thing.

> The "function" object alone then serves up information about the
> value of the function at a given point, as well as the gradient and
> hessian at that point (either exact or approximate) to the criterion,
> step, and any other objects that need it.

I'm OK with it as long as it is not an approximation algorithm that is
based on the gradient, ... to compute for instance the hessian. Such
an approximation algorithm is generic, and as such it should be put in
another module or in a function superclass.

> I don't think that the criterion needs to access this, because it
> would mean it knows more than it should, from an object-oriented
> point of view, but this can be discussed :)

> Certain termination criteria need access to the derivatives to make
> sure that they terminate. It would query the function object for
> this information. Other criteria may need to query the "step" object
> to find out the size of the previous steps.

The step is not the right one for this; it's the line search object's
goal to find the correct step size, and such intel is given back to
the optimizer core, because there, everything is saved - everything
should be saved with a call to recordHistory -. What could be done is
that every object - step or line search - returns, along with the
result - the result being the step, the new candidate, ... - a
dictionary with such values. In that case, the criterion can choose
what it needs directly inside it.

> The "criterion" should
> not maintain any of these internally, just rely on the values served
> by the other objects: this does not break the encapsulation, it just
> couples the objects more tightly, but sophisticated criteria need
> this coupling.

For the moment, the state was not in the criterion; one cannot know
how many times it could be called inside an optimizer. This state is
maintained by the optimizer itself - it contains the last 2 values,
the last 2 sets of parameters -, but I suppose that if we have the new
candidate, the step and its size, those can be removed, and so the
dictionary chooses what it needs.

> I don't think so, the function provides methods to compute
> gradient, hessian, ... but only the step object knows what to do
> with it : approximate a hessian, what was already approximated, ...
> A step object is associated with one optimizer, a function object
> can be optimized several times.
> If it has a state, it couldn't be
> used with several optimizers without reinitializing it, and it is
> not intuitive enough.

> The "function" object maintains all the information known about the
> function: how to compute the function, how to compute/approximate
> derivatives etc. If the user does not supply code for directly
> computing derivatives, but wants to use an optimization method that
> makes use of gradient information, then the function object should do
> its best to provide approximate information. The essence behind the
> Broyden methods is to approximate the Jacobian information in a
> clever and cheap way.

That would mean that it can have a state; I really do not support this
approach. The Broyden method _is_ a way to get a step from a function
that does not provide some intel - the Jacobian, for instance -, so it
is not a function thing, it is a step mode.

> I really think the natural place for this is in the "function"
> object, not the "step".

> > With these five objects, I _think_ every unconstrained method can
> > be expressed. For the constraints, I suppose the step and the line
> > search should be adapted, but no other module needs to be touched.

> Please describe how you think the Broyden root-finding method would
> fit within this scheme. Which object would maintain the state of the
> approximate Jacobian?

No problem, I'll check in my two optimization books this evening,
provided I have enough time - I'm a little late on some important work
projects :| -

Thanks for your patience and your will to create generic optimizers :)

Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ondrej at certik.cz  Mon Apr 16 12:11:34 2007
From: ondrej at certik.cz (Ondrej Certik)
Date: Mon, 16 Apr 2007 18:11:34 +0200
Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user)
In-Reply-To: 
References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca>
	<47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca>
Message-ID: <85b5c3130704160911x3177b24aq3a5de7da3828581a@mail.gmail.com>

> I'll look in the literature for the Broyden method; if I see the whole
> algorithm, I think I'll be able to answer your questions ;)

As I understand the Broyden update, the whole trick is that you don't
need the precise Jacobian and it is subsequently approximated at every
iteration. So all that is needed (and in my problems actually all that
is available) is the function value.

Ondrej

From matthieu.brucher at gmail.com  Mon Apr 16 12:20:50 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Mon, 16 Apr 2007 18:20:50 +0200
Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user)
In-Reply-To: <85b5c3130704160911x3177b24aq3a5de7da3828581a@mail.gmail.com>
References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca>
	<47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca>
	<85b5c3130704160911x3177b24aq3a5de7da3828581a@mail.gmail.com>
Message-ID: 

2007/4/16, Ondrej Certik :
> > I'll look in the literature for the Broyden method; if I see the whole
> > algorithm, I think I'll be able to answer your questions ;)
>
> As I understand the Broyden update, the whole trick is that you don't
> need the precise Jacobian and it is subsequently approximated at every
> iteration. So all that is needed (and in my problems actually all that
> is available) is the function value.
> > Ondrej

That's what I understood from the discussion - well, every quasi-Newton
algorithm does this to some extent -, but I'll check so I can propose a
full example.

Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mforbes at physics.ubc.ca  Mon Apr 16 14:47:59 2007
From: mforbes at physics.ubc.ca (Michael McNeil Forbes)
Date: Mon, 16 Apr 2007 11:47:59 -0700
Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user)
In-Reply-To: 
References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca>
	<47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca>
Message-ID: 

On 16 Apr 2007, at 9:07 AM, Matthieu Brucher wrote:
> I'll look in the literature for the Broyden method; if I see the
> whole algorithm, I think I'll be able to answer your questions ;)

Basically, an approximated Jacobian is used to determine the step
direction and/or size (depending on the "step" module etc.)

The key to the Broyden approach is that the information about F(x+dx)
is used to update the approximate Jacobian (think multidimensional
secant method) in a clever way without any additional function
evaluations (there is not a unique way to do this, and some choices
work better than others).

Thus, think of Broyden methods as quasi-Newton methods but with a
cheap and very approximate Jacobian (hence one usually uses a robust
line search method to make sure that one is always descending).
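In code, the update just described boils down to a rank-one correction
of the approximate Jacobian. A minimal sketch, assuming numpy (the
helper names are made up, and this is only the "good" Broyden variant;
other variants update differently):

import numpy as np

def broyden_update(J, dx, df):
    # Broyden's "good" rank-one update: enforce the secant condition
    # J_new * dx == df while changing J as little as possible.
    # dx = x_new - x_old, df = F(x_new) - F(x_old); no extra F calls.
    return J + np.outer(df - np.dot(J, dx), dx) / np.dot(dx, dx)

def broyden_root(F, x0, tol=1e-10, maxiter=100):
    # Minimal quasi-Newton root finder built on the update above.
    x = np.asarray(x0, dtype=float)
    J = np.eye(len(x))                 # crude initial Jacobian guess
    f = F(x)
    for _ in range(maxiter):
        dx = np.linalg.solve(J, -f)    # quasi-Newton step
        x = x + dx
        f_old, f = f, F(x)
        if np.linalg.norm(f) < tol:
            break
        J = broyden_update(J, dx, f - f_old)
    return x

A robust version would wrap the step in exactly the kind of line
search mentioned above.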
There would be a set of modules behind the interface provided by Function() that implement these various techniques for computing and/or estimating the derivatives, including the Broyden method. The user sitting at the computer does nothing other than select from a set of options (opts) what methods he wants the library to use. Note, the user could pass explicit things to Function() too, like a custom function that computes numerical derivatives. > The "function" object alone then serves up information about the > value of the function at a given point, as well as the gradient and > hessian at that point (either exact or approximate) to the criterion, > step, and any other objects that need it. > > I'm OK with it as long as it is not an approximation algorithm that > is based on the gradient, ..., used to compute for instance the hessian. Such > an approximation algorithm is generic, and as such it should be put > in another module or in a function superclass. A Function "superclass" is what I had in mind. > ... > Certain termination criteria need access to the derivatives to make > sure that they terminate. It would query the function object for > this information. Other criteria may need to query the "step" object > to find out the size of the previous steps. > > The step is not the right place; it's the line search object's goal to > find the correct step size, and such information is given back to the > optimizer core, because there, everything is saved - everything > should be saved with a call to recordHistory -. What could be done > is that every object - step or line search - returns along with the > result - the result being the step, the new candidate, ... - a > dictionary with such values. In that case, the criterion can > choose what it needs directly inside it. Yes, it seems that the optimizer should maintain information about the history. The question I have is about the flow of information: I imagine that the criterion object should be able to query the optimization object for the information that it needs. We should define an interface of things that the optimizer can serve up to the various components. This interface can be extended as required to support more sophisticated algorithms. > The "criterion" should > not maintain any of these internally, just rely on the values served > by the other objects: this does not break the encapsulation, it just > couples the objects more tightly, but sophisticated criteria need > this coupling. > For the moment, the state was not in the criterion; one cannot know > how many times it could be called inside an optimizer. This state is > maintained by the optimizer itself - it contains the last 2 values, > the last 2 sets of parameters -, but I suppose that if we have the > new candidate, the step and its size, those can be removed, and so > the dictionary chooses what it needs. > > > > I don't think so, the function provides methods to compute > > gradient, hessian, ... but only the step object knows what to do > > with them : approximate a hessian, what was already approximated, ... > > A step object is associated with one optimizer, a function object > > can be optimized several times. If it has a state, it couldn't be > > used with several optimizers without reinitializing it, and it is > > not intuitive enough. > > The "function" object maintains all the information known about the > function: how to compute the function, how to compute/approximate > derivatives etc.
If the user does not supply code for directly > computing derivatives, but wants to use an optimization method that > makes use of gradient information, then the function object should do > its best to provide approximate information. The essence behind the > Broyden methods is to approximate the Jacobian information in a > clever and cheap way. > > That would mean that it can have a state; I really do not support > this approach. The Broyden method _is_ a way to get a step from a function > that does not provide some information - the Jacobian, for instance -, so it is > not a function thing, it is a step mode. I disagree. I think of the Broyden algorithm as a way of maintaining the Jacobian. The way to get the step is independent of this, though it may use the Jacobian information to help it. The Broyden part of the algorithm is solely to approximate the Jacobian cheaply. > ... > Please describe how you think the Broyden root-finding method would > fit within this scheme. Which object would maintain the state of the > approximate Jacobian? > > No problem, I'll check my two optimization books this evening > provided I have enough time - I'm a little late on some important > work projects :| - Maybe I will see things differently when you do this, but I am pretty convinced right now that the Function() object is the best place for the Broyden part of the algorithm. Michael. P.S. No hurry. I might also disappear from time to time when busy;-)
From matthieu.brucher at gmail.com Mon Apr 16 15:27:23 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 16 Apr 2007 21:27:23 +0200 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> <47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca> Message-ID: > > Basically, an approximate Jacobian is used to determine the step > direction and/or size (depending on the "step" module etc.) > > The key to the Broyden approach is that the information about F(x+dx) > is used to update the approximate Jacobian (think multidimensional > secant method) in a clever way without any additional function > evaluations (there is not a unique way to do this and some choices > work better than others). > > Thus, think of Broyden methods as Quasi-Newton methods but with a > cheap and very approximate Jacobian (hence, one usually uses a robust > line search method to make sure that one is always descending). I read some docs; it seems Broyden is a class of different steps, am I wrong? And it tries to approximate the Hessian of the function. My view is that the person sitting at the computer does one of the > following things: > > >>> F1 = Function(f) > >>> F2 = Function(f,opts) > >>> F3 = Function(f,df,ddf,opts) > etc. OK, that is not what a function is :) A function is the set of f, df, ddf but not with the options. What you are exposing is the construction of an optimizer ;) Did you see the code in my different proposals ? In fact, you have a function class - for instance a Rosenbrock class - that defines several methods, like gradient, hessian, ... without a real state - a real state being something other than, for instance, the number of dimensions for the Rosenbrock function, the points that need to be approximated, ... a real state is something that is dependent on the subsequent calls to the functor, the gradient, ... - so that this function can be reused efficiently. Then you use an instance of this class to be optimized.
You choose your step mode with its parameters like gradient, conjugate gradient, Newton, Quasi-Newton, ... You choose your line search with its own parameters - tolerance, ... - like section methods, interpolation methods, ... Finally, you choose your stopping criterion. Then you make something like :

optimizer = StandardOptimizer(function = myFunction, step = myStep, .......)
optimizer.optimize()

That is a modular design, and that is why some basic functions must be provided so that people who don't care about the underlying design really do not have to care. Then if someone wants a specific, non-standard optimizer, one just has to select the wanted modules - for instance, a conjugate gradient with a golden-section line search and a relative value criterion instead of a Fibonacci search and an absolute criterion -. It can be more cumbersome at the start, but once some modules are made, assembling them will be easier, and tests will be more fun :) In this first case, the object F1 can compute f(x), and will use > finite differences or some more complicated method to compute > derivatives df(x) and ddf(x) if required by the optimization > algorithm. In F2, the user provides options that specify how to do these > computations (for example, step size h; should > a centred difference be used? Perhaps the function is cheap and a > Richardson extrapolation should be used for higher accuracy. If f is > analytic and supports complex arguments, then the difference step > should be h=eps*1j. Maybe f has been implemented using an automatic > differentiation library etc. Just throwing out ideas here...) Yes, that's exactly an optimizer, not a function ;) A Function "superclass" is what I had in mind. As I said, that would make the function stateful, so this instance must not be shared among optimizers, and so it is more error prone :( Yes, it seems that the optimizer should maintain information about > the history. The question I have is about the flow of information: I > imagine that the criterion object should be able to query the > optimization object for the information that it needs. We should > define an interface of things that the optimizer can serve up to the > various components. This interface can be extended as required to > support more sophisticated algorithms. If each module returns a tuple with the result and a set of the parameters used, and if the criterion gets all these sets in a dictionary, it will be able to cope with more specific cases of steps or line searches. For the communication between the optimizer and the modules, the only communication is 'I want this and this' from the optimizer, and the rest is defined when instantiating the different modules. For instance, the line search doesn't need to know how the step is computed, and in fact neither does the optimizer. The step knows the function and what it needs from it; if that is not provided, the optimization should fail. > That would mean that it can have a state; I really do not support > this approach. The Broyden method _is_ a way to get a step from a function > that does not provide some information - the Jacobian, for instance -, so it is > not a function thing, it is a step mode. > > I disagree. I think of the Broyden algorithm as a way of maintaining > the Jacobian. The way to get the step is independent of this, though > it may use the Jacobian information to help it. The Broyden part of > the algorithm is solely to approximate the Jacobian cheaply.
The Broyden algorithm is a step; the literature presents it as a step, nothing else. Think of 3D computer graphics. Some graphics cards - the function - provide some features, others don't. When they are needed, the driver - the step or the line search - provides a software approach - an approximation -, but does not modify the GPU to add the features. Here, it is exactly the same: a step module is an adapter that provides a step, but never modifies the function. Maybe I will see things differently when you do this, but I am pretty > convinced right now that the Function() object is the best place for > the Broyden part of the algorithm. And now ? :) For instance, the conjugate gradient must remember the last step. It is exactly like the Broyden algorithm, only simpler, but it has a state. If the last gradient were to be stored inside the function and if this function were to be optimized again with another starting point, the first step would be wrong... If I have some time, I'll program the FR conjugate-gradient step so that you can see how it is made ;) Michael. > > P.S. No hurry. I might also disappear from time to time when busy;-) Me too ;) Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL:
From mforbes at physics.ubc.ca Mon Apr 16 16:13:22 2007 From: mforbes at physics.ubc.ca (Michael McNeil Forbes) Date: Mon, 16 Apr 2007 13:13:22 -0700 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> <47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca> Message-ID: On 16 Apr 2007, at 12:27 PM, Matthieu Brucher wrote: > Basically, an approximate Jacobian is used to determine the step > direction and/or size (depending on the "step" module etc.) > > The key to the Broyden approach is that the information about F(x+dx) > is used to update the approximate Jacobian (think multidimensional > secant method) in a clever way without any additional function > evaluations (there is not a unique way to do this and some choices > work better than others). > > Thus, think of Broyden methods as Quasi-Newton methods but with a > cheap and very approximate Jacobian (hence, one usually uses a robust > line search method to make sure that one is always descending). > > > I read some docs; it seems Broyden is a class of different steps, am > I wrong? And it tries to approximate the Hessian of the function. I am thinking of root finding G(x) = 0 right now rather than optimization (I have not thought about the optimization problem yet). The way the Broyden root finder works is that you start with an approximate Jacobian J0 (you could start with the identity for example). So, start with an approximate J0 at position x0.

G0 = G(x0)
dx = -inv(J0)*G0  # This is a Quasi-Newton step. Use your favourite step method here...
x1 = x0 + dx
G1 = G(x1)

Now the Broyden part comes in. It computes a new approximate Jacobian J1 at position x1 using J0, dG and dx such that

J1*dx = dG

This is the secant condition. There are many ways to do this in multiple dimensions and the various Broyden methods choose one of these. The most common is the BFGS choice, but there are other choices with different convergence properties. Now start over with J0 = J1 and x0 = x1 and repeat until convergence is met. Michael. P.S. More comments to follow.
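For concreteness, a minimal NumPy sketch of the loop just described, using the classic rank-1 update J1 = J0 + ((dG - J0*dx) * dx^T) / (dx^T*dx), which satisfies the secant condition J1*dx = dG. All names and defaults are illustrative only, and there is no line search here, unlike a robust implementation:

import numpy as np

def broyden_root(G, x0, J0=None, tol=1e-10, maxiter=100):
    """Find a root of G(x) = 0 with a Broyden-style secant update."""
    x = np.asarray(x0, dtype=float)
    g = G(x)
    J = np.eye(len(x)) if J0 is None else np.asarray(J0, dtype=float)
    for _ in range(maxiter):
        dx = -np.linalg.solve(J, g)   # quasi-Newton step dx = -inv(J)*g
        x = x + dx
        g_new = G(x)
        dg = g_new - g
        # rank-1 secant update: the new J satisfies J*dx = dg
        J = J + np.outer(dg - np.dot(J, dx), dx) / np.dot(dx, dx)
        g = g_new
        if np.linalg.norm(g) < tol:
            return x
    raise RuntimeError("Broyden iteration did not converge")

# e.g. broyden_root(lambda x: x**2 - np.array([2.0, 3.0]), [1.0, 1.0])
# converges towards [sqrt(2), sqrt(3)] without ever forming the true Jacobian.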
From ondrej at certik.cz Mon Apr 16 17:29:54 2007 From: ondrej at certik.cz (Ondrej Certik) Date: Mon, 16 Apr 2007 23:29:54 +0200 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> <47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca> Message-ID: <85b5c3130704161429u5ef93120sc257a749b576f62d@mail.gmail.com> You might want to check my email (a couple of days ago) about the Broyden methods together with Python code and tests. I didn't have time to implement it in scipy yet, but you can already use it. Ondrej On 4/16/07, Michael McNeil Forbes wrote: > On 16 Apr 2007, at 12:27 PM, Matthieu Brucher wrote: > > > Basically, an approximate Jacobian is used to determine the step > > direction and/or size (depending on the "step" module etc.) > > > > The key to the Broyden approach is that the information about F(x+dx) > > is used to update the approximate Jacobian (think multidimensional > > secant method) in a clever way without any additional function > > evaluations (there is not a unique way to do this and some choices > > work better than others). > > > > Thus, think of Broyden methods as Quasi-Newton methods but with a > > cheap and very approximate Jacobian (hence, one usually uses a robust > > line search method to make sure that one is always descending). > > > > > > I read some docs; it seems Broyden is a class of different steps, am > > I wrong? And it tries to approximate the Hessian of the function. > > I am thinking of root finding G(x) = 0 right now rather than > optimization (I have not thought about the optimization problem > yet). The way the Broyden root finder works is that you start with > an approximate Jacobian J0 (you could start with the identity for > example). > > So, start with an approximate J0 at position x0. > > G0 = G(x0) > dx = -inv(J0)*G0 # This is a Quasi-Newton step. Use your favourite > step method here... > x1 = x0 + dx > G1 = G(x1) > > Now the Broyden part comes in. It computes a new approximate > Jacobian J1 at position x1 using J0, dG and dx such that > > J1*dx = dG > > This is the secant condition. There are many ways to do this in > multiple dimensions and the various Broyden methods choose one of > these. The most common is the BFGS choice, but there are other > choices with different convergence properties. > > Now start over with J0 = J1 and x0 = x1 and repeat until convergence > is met. > > Michael. > > P.S. More comments to follow. > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev >
From matthieu.brucher at gmail.com Tue Apr 17 01:26:34 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 17 Apr 2007 07:26:34 +0200 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: <85b5c3130704161429u5ef93120sc257a749b576f62d@mail.gmail.com> References: <47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca> <85b5c3130704161429u5ef93120sc257a749b576f62d@mail.gmail.com> Message-ID: I'll check them today, thanks for the tests too Matthieu 2007/4/16, Ondrej Certik : > > You might want to check my email (a couple of days ago) about the > Broyden methods together with Python code and tests. I didn't have > time to implement it in scipy yet, but you can already use it.
> > Ondrej > > On 4/16/07, Michael McNeil Forbes wrote: > > On 16 Apr 2007, at 12:27 PM, Matthieu Brucher wrote: > > > > > Basically, an approximate Jacobian is used to determine the step > > > direction and/or size (depending on the "step" module etc.) > > > > > > The key to the Broyden approach is that the information about F(x+dx) > > > is used to update the approximate Jacobian (think multidimensional > > > secant method) in a clever way without any additional function > > > evaluations (there is not a unique way to do this and some choices > > > work better than others). > > > > > > Thus, think of Broyden methods as Quasi-Newton methods but with a > > > cheap and very approximate Jacobian (hence, one usually uses a robust > > > line search method to make sure that one is always descending). > > > > > > > > > I read some docs; it seems Broyden is a class of different steps, am > > > I wrong? And it tries to approximate the Hessian of the function. > > > > I am thinking of root finding G(x) = 0 right now rather than > > optimization (I have not thought about the optimization problem > > yet). The way the Broyden root finder works is that you start with > > an approximate Jacobian J0 (you could start with the identity for > > example). > > > > So, start with an approximate J0 at position x0. > > > > G0 = G(x0) > > dx = -inv(J0)*G0 # This is a Quasi-Newton step. Use your favourite > > step method here... > > x1 = x0 + dx > > G1 = G(x1) > > > > Now the Broyden part comes in. It computes a new approximate > > Jacobian J1 at position x1 using J0, dG and dx such that > > > > J1*dx = dG > > > > This is the secant condition. There are many ways to do this in > > multiple dimensions and the various Broyden methods choose one of > > these. The most common is the BFGS choice, but there are other > > choices with different convergence properties. > > > > Now start over with J0 = J1 and x0 = x1 and repeat until convergence > > is met. > > > > Michael. > > > > P.S. More comments to follow. > > _______________________________________________ > > Scipy-dev mailing list > > Scipy-dev at scipy.org > > http://projects.scipy.org/mailman/listinfo/scipy-dev > > > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL:
From matthieu.brucher at gmail.com Tue Apr 17 04:12:12 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 17 Apr 2007 10:12:12 +0200 Subject: [SciPy-dev] SciPy improvements In-Reply-To: References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp> <461F0E45.5000009@ar.media.kyoto-u.ac.jp> Message-ID: > > The project page mentions SVM. In addition to SVM I'm interested in > things like PPCA, kernel PCA, RBF networks, gaussian processes and > GPLVM. Are you going to try to go in the direction of a modular > structure with reusable bits for all kernel methods, or is the > plan to target SVM specifically? > There is a review of dimensionality reduction algorithms at http://www.cs.unimaas.nl/l.vandermaaten/Laurens%20van%20der%20Maaten/Matlab%20Toolbox%20for%20Dimensionality%20Reduction.html that could be worth porting... one day... Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL:
From fullung at gmail.com Tue Apr 17 06:46:22 2007 From: fullung at gmail.com (Albert Strasheim) Date: Tue, 17 Apr 2007 12:46:22 +0200 Subject: [SciPy-dev] Windows build with MSVC Message-ID: <002001c780dd$a3fd9060$0100a8c0@sun.ac.za> Hello all For lack of other sources of pain in my life, I'm giving the SciPy build with MSVC and G77 a go again. I've already fixed a few issues: http://projects.scipy.org/scipy/scipy/changeset/2927 http://projects.scipy.org/scipy/scipy/changeset/2928 I'm currently running into problems building the scipy.special._cephes extension. I get the following error:

cephes.lib(exp10.obj) : error LNK2001: unresolved external symbol _isnan
cephes.lib(cbrt.obj) : error LNK2001: unresolved external symbol _isnan
cephes.lib(unity.obj) : error LNK2019: unresolved external symbol _isnan referenced in function _cephes_expm1
cephes.lib(ndtr.obj) : error LNK2019: unresolved external symbol _isnan referenced in function _cephes_erfc
cephes.lib(gamma.obj) : error LNK2001: unresolved external symbol _isnan
cephes.lib(exp2.obj) : error LNK2001: unresolved external symbol _isnan
build\lib.win32-2.4\scipy\special\_cephes.pyd : fatal error LNK1120: 1 unresolved externals

It seems the cephes macro soup has gotten isnan wrong for the MSVC build. The MSDN docs for isnan are here: http://msdn2.microsoft.com/en-us/library/tzthab44(VS.80).aspx Maybe someone can figure out the right mix of defines and mconf.h magic to make it build again? Thanks! Regards, Albert
From john at curioussymbols.com Tue Apr 17 09:31:50 2007 From: john at curioussymbols.com (John Pye) Date: Tue, 17 Apr 2007 23:31:50 +1000 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <461F2BBB.7060406@ar.media.kyoto-u.ac.jp> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F2BBB.7060406@ar.media.kyoto-u.ac.jp> Message-ID: <4624CC46.4030404@curioussymbols.com> Hi David David Cournapeau wrote: > UMFPACK is kind of a pain to compile too > (depends on two other packages), and let's not even start talking about > ATLAS, [...] Just in case you didn't know, UMFPACK (as well as Tim Davis' other code) is part of Debian/Ubuntu. It's named 'libufsparse'. AFAIK it was not in Fedora Core 6 (but this may have changed). Cheers JP
From david at ar.media.kyoto-u.ac.jp Tue Apr 17 23:19:00 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 18 Apr 2007 12:19:00 +0900 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <4624CC46.4030404@curioussymbols.com> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F2BBB.7060406@ar.media.kyoto-u.ac.jp> <4624CC46.4030404@curioussymbols.com> Message-ID: <46258E24.8030602@ar.media.kyoto-u.ac.jp> John Pye wrote: > Hi David > > David Cournapeau wrote: > >> UMFPACK is kind of a pain to compile too >> (depends on two other packages), and let's not even start talking about >> ATLAS, [...] >> > Just in case you didn't know, UMFPACK (as well as Tim Davis' other code) > is part of Debian/Ubuntu. It's named 'libufsparse'. AFAIK it was not in > Fedora Core 6 (but this may have changed). > I know that Debian/Ubuntu makes it much easier to build numpy and scipy than other distributions, basically 2-3 lines to install them and all necessary dependencies :) But I was speaking for other distributions. Sadly, my experience with other distributions (FC 5-6 and openSUSE, at least) shows that they are not on par with Debian, at least for those packages. And then, you have other OSes.
Anyway, I was just highlighting why adding dependencies makes the packaging job more difficult. David From john at curioussymbols.com Wed Apr 18 01:34:42 2007 From: john at curioussymbols.com (John Pye) Date: Wed, 18 Apr 2007 15:34:42 +1000 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <46258E24.8030602@ar.media.kyoto-u.ac.jp> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F2BBB.7060406@ar.media.kyoto-u.ac.jp> <4624CC46.4030404@curioussymbols.com> <46258E24.8030602@ar.media.kyoto-u.ac.jp> Message-ID: <4625ADF2.7060503@curioussymbols.com> David Cournapeau wrote: > John Pye wrote: > >> Hi David >> >> David Cournapeau wrote: >> >> >>> UMFPACK is kind of a pain to compile too >>> (depends on two other packages), and let's not even start talking about >>> ATLAS, [...] >>> >> Just in case you didn't know, UMFPACK (as well as Tim Davis' other code) >> is part of Debian/Ubuntu. It's named 'libufsparse'. AFAIK it was not in >> Fedora Core 6 (but this may have changed). >> >> > I know that Debian/Ubuntu makes it much easier to build numpy and scipy > than other distributions, basically 2-3 lines to install them and all > necessary dependencies :) But I was speaking for other distributions. > Sadly, my experience with other distributions (FC 5-6 and openSUSE, at > least) show that they are not on par with debian, at least for those > packages. And then, you have other OS. > Anyway, I was just highlighting why adding dependencies makes the > packaging job more difficult. Perhaps the focus should be on getting UMFPACK added to those distros, rather than working out how to build them as part of scipy? The only place where this isn't so great is Windows. Windows is always a special case :-) From robert.kern at gmail.com Wed Apr 18 01:36:53 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 18 Apr 2007 00:36:53 -0500 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <4625ADF2.7060503@curioussymbols.com> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F2BBB.7060406@ar.media.kyoto-u.ac.jp> <4624CC46.4030404@curioussymbols.com> <46258E24.8030602@ar.media.kyoto-u.ac.jp> <4625ADF2.7060503@curioussymbols.com> Message-ID: <4625AE75.3040300@gmail.com> John Pye wrote: > Perhaps the focus should be on getting UMFPACK added to those distros, > rather than working out how to build them as part of scipy? Since no one is suggesting building the UMFPACK library as part of the scipy build process, it appears that everyone already agrees with you. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From david at ar.media.kyoto-u.ac.jp Wed Apr 18 01:46:39 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 18 Apr 2007 14:46:39 +0900 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <4625ADF2.7060503@curioussymbols.com> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F2BBB.7060406@ar.media.kyoto-u.ac.jp> <4624CC46.4030404@curioussymbols.com> <46258E24.8030602@ar.media.kyoto-u.ac.jp> <4625ADF2.7060503@curioussymbols.com> Message-ID: <4625B0BF.5030805@ar.media.kyoto-u.ac.jp> John Pye wrote: > > Perhaps the focus should be on getting UMFPACK added to those distros, > rather than working out how to build them as part of scipy? The only > place where this isn't so great is Windows. 
Windows is always a special > case :-) > Talk is cheap :) This is exactly what I've started a few days ago; you are more than welcome to help me: http://software.opensuse.org/download/home:/ashigabou/ I already have binary packages for numpy + scipy using NETLIB BLAS and LAPACK (also packaged by myself, as I had various problems with the ones distributed by Fedora and openSUSE), am working on atlas 3.7.* (which is by far the most difficult), and intend to do at least fftw3 and umfpack along the way. cheers, David
From mforbes at physics.ubc.ca Wed Apr 18 02:04:22 2007 From: mforbes at physics.ubc.ca (Michael McNeil Forbes) Date: Tue, 17 Apr 2007 23:04:22 -0700 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> <47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca> Message-ID: <199D13FD-7C7E-4F58-98EF-F2E8D3B43953@physics.ubc.ca> Okay, I think we are thinking similar things with different terminology: I think you are saying that only one object should maintain state (your "optimizer") (I was originally sharing the state, which I agree can cause problems). If so, I agree, but to me it seems that object should be called a "partially optimized function". I think of an "optimizer" as something which modifies state rather than something that maintains state. I am thinking of code like:

------
roughly_locate_minimum = Optimizer(criterion=extremelyWeakCriterion, step=slowRobustStep, ...)
find_precise_minimum = Optimizer(criterion=preciseCriterion, step=fasterStep, ...)

f = Rosenbrock(...)
x0 = ...

f_min = OptimizedFunction(f, x0)
f_min.optimize(optimizer=roughly_locate_minimum)
f_min.optimize(optimizer=find_precise_minimum)

# OR (this reads better to me, but the functions should return copies of f_min, so may not be desirable for performance reasons)
f_min = roughly_locate_minimum(f_min)
f_min = find_precise_minimum(f_min)

# Then one can query f_min for results:
print f_min.x   # Best current approximation to optimum
print f_min.f
print f_min.err # Estimated error
print f_min.df
# etc...
-----

The f_min object keeps track of all state, can be passed from one optimizer to another, etc. In my mind, it is simply an object that has accumulated information about a function. The idea I have in mind is that f is extremely expensive to compute, thus the object with state f_min accumulates more and more information as it goes along. Ultimately this information could be used in many ways, for example: - f_min could keep track of roughly how long it takes to compute f(x), thus providing estimates of the time required to complete a calculation. - f_min could keep track of values and use interpolation to provide fast guesses etc. Does this mesh with your idea of an "optimizer"? I think it is strictly equivalent, but looking at the line of code "optimizer.optimize()" is much less useful to me than "f_min.optimize(optimizer=...)". What would your ideal "user" code look like for the above use-case? I will try to flesh out a more detailed structure for the OptimizedFunction class, Michael. On 16 Apr 2007, at 12:27 PM, Matthieu Brucher wrote: > ... > OK, that is not what a function is :) > A function is the set of f, df, ddf but not with the options. What > you are exposing is the construction of an optimizer ;) > > Did you see the code in my different proposals ? > In fact, you have a function class - for instance a Rosenbrock class > - that defines several methods, like gradient, hessian, ...
without > a real state - a real state being something other than, for instance, > the number of dimensions for the Rosenbrock function, the points > that need to be approximated, ... a real state is something that is > dependent on the subsequent calls to the functor, the gradient, ... > - so that this function can be reused efficiently. > Then you use an instance of this class to be optimized. > You choose your step mode with its parameters like gradient, > conjugate gradient, Newton, Quasi-Newton, ... > You choose your line search with its own parameters - > tolerance, ... - like section methods, interpolation methods, ... > Finally, you choose your stopping criterion. > Then you make something like : > optimizer = StandardOptimizer(function = myFunction, step = > myStep, .......) > optimizer.optimize() > > That is a modular design, and that is why some basic functions must > be provided so that people who don't care about the underlying > design really do not have to care. Then if someone wants a > specific, non-standard optimizer, one just has to select the > wanted modules - for instance, a conjugate gradient with a golden- > section line search and a relative value criterion instead of a > Fibonacci search and an absolute criterion -. > > It can be more cumbersome at the start, but once some modules are > made, assembling them will be easier, and tests will be more fun :) > ...
From matthieu.brucher at gmail.com Wed Apr 18 02:36:00 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 18 Apr 2007 08:36:00 +0200 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: <199D13FD-7C7E-4F58-98EF-F2E8D3B43953@physics.ubc.ca> References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> <47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca> <199D13FD-7C7E-4F58-98EF-F2E8D3B43953@physics.ubc.ca> Message-ID: 2007/4/18, Michael McNeil Forbes : > > Okay, I think we are thinking similar things with different terminology: Yes, I think that too. I think you are saying that only one object should maintain state > (your "optimizer") (I was originally sharing the state, which I agree > can cause problems). If so, I agree, but to me it seems that object > should be called a "partially optimized function". I think of an > "optimizer" as something which modifies state rather than something > that maintains state. Well, in fact the real object that has a state is the step; the optimizer could have a state, but I do not currently use that approach. I'll think about the dependencies that keeping the state only in the optimizer would lead to. That would mean that each call to the step, the criterion or the line search would take another parameter, the state of the optimizer. Let's say it's a dict. Each object would take and modify some values, and in fact that is what I pass to the recordHistory function - I think I'll rename it record_history to be scipy coding-standard compliant -, so there is not much trouble in doing this. I am thinking of code like: > > ------ > roughly_locate_minimum = Optimizer(criterion=extremelyWeakCriterion, step=slowRobustStep, ...) > find_precise_minimum = Optimizer(criterion=preciseCriterion, step=fasterStep, ...) > > f = Rosenbrock(...) > x0 = ...
> > f_min = OptimizedFunction(f,x0) > f_min.optimize(optimizer=roughly_locate_minimum) > f_min.optimize(optimizer=find_precise_minimum) > > # OR (this reads better to me, but the functions should return copies > of f_min, so may not be desirable for performance reasons) > f_min = roughly_locate_minimum(f_min) > f_min = find_precise_minimum(f_min) > > # Then one can query f_min for results: > print f_min.x # Best current approximation to optimum > print f_min.f > print f_min.err # Estimated error > print f_min.df > # etc... > ----- > > The f_min object keeps track of all state, can be passed from one > optimizer to another, etc. In my mind, it is simply an object that > has accumulated information about a function. You mean you would want to possibly share the state between optimizers ? The idea I have in > mind is that f is extremely expensive to compute, thus the object > with state f_min accumulates more and more information as it goes > along. Well, a part of this is done by the recordHistory, but I don't think that saving the whole state at every iteration in f_min is a good idea from a memory point of view. Why not save the last state, with all the needed values? For instance, the old and new values, the old and new parameters, the old and new step, the new gradient, ... I think the number of iterations should be there as well ;) Ultimately this information could be used in many ways, for > example: > > - f_min could keep track of roughly how long it takes to compute f(x), > thus providing estimates of the time required to complete a > calculation. > - f_min could keep track of values and use interpolation to provide > fast guesses etc. > > Does this mesh with your idea of an "optimizer"? I think it is > strictly equivalent, but looking at the line of code > "optimizer.optimize()" is much less useful to me than "f_min.optimize(optimizer=...)". > > What would your ideal "user" code look like for the above use-case? Well, not exactly, it would be almost like it. I do not know what you want to put in OptimizedFunction and what its role is exactly. My ideal code is straightforward - in fact it's more the current code - :

f = Rosenbrock(...)
x0 = ...
roughly_locate_minimum_optimizer = StandardOptimizer(function = f, x0 = x0, step = Step.SomeStep(...), lineSearch = LineSearch.InexactLineSearch(...), criterion = Criterion.SomeCriterion(...), record = SomeRecordingFunctionIfNeeded)
local_minimum = roughly_locate_minimum_optimizer.optimize()
precisely_locate_minimum_optimizer = StandardOptimizer(function = f, x0 = local_minimum, step = Step.SomeOtherStep(...), lineSearch = LineSearch.ExactLineSearch(...), criterion = Criterion.SomeOtherCriterion(...), record = SomeRecordingFunctionIfNeeded)
minimum = precisely_locate_minimum_optimizer.optimize()

Using the OptimizedFunction to save a state shared by different optimizers that do not save the same things could lead to side effects that are difficult to track. What could be done, as I said, is to output the state at the end of the optimizer, and perhaps allow the user to give it to a new optimizer. It would only take the part that it needs. I will try to flesh out a more detailed structure for the > OptimizedFunction class, > Michael. I'm looking forward to seeing this ;) Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL:
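For concreteness, one possible skeleton for the OptimizedFunction idea discussed in the two messages above - purely illustrative, assuming NumPy; nothing here is existing scipy code:

import numpy as np

class OptimizedFunction(object):
    """A function bundled with the knowledge accumulated about it:
    the best point so far and every evaluation seen so far."""
    def __init__(self, f, x0):
        self.f = f
        self.x = np.asarray(x0, dtype=float)   # best current estimate
        self.history = []                      # (x, f(x)) pairs
    def __call__(self, x):
        value = self.f(x)
        self.history.append((np.array(x, copy=True), value))
        return value
    def optimize(self, optimizer):
        # the optimizer refines the estimate; all state stays in self
        self.x = optimizer(self, self.x)
        return self.x

Here an "optimizer" is anything callable as optimizer(function, x0) that returns an improved x; whether this state should live in the function object or in the optimizer is exactly the point under discussion.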
From matthieu.brucher at gmail.com Wed Apr 18 04:35:09 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 18 Apr 2007 10:35:09 +0200 Subject: [SciPy-dev] Updated generic optimizers proposal Message-ID: Hi, I'm launching a new thread; the last one was pretty big, and as I put almost every piece of advice into this proposal, I thought it would be better. First, I used the scipy coding standard; I hope I didn't forget anything. I do not know where it should go in the scipy tree at the moment, and the tests are visual for now; I have to make them automatic, but I do not know the test framework used by scipy, so I have to check it first. So, the proposal :

- combining several objects to make an optimizer
- a function should be an object defining the __call__ method and gradient, hessian, ... if needed. It can be passed as several separate functions, as Alan suggested; a new object is then created
- an optimizer is a combination of a function, a step_kind, a line_search, a criterion and a starting point x0.
- the result of the optimization is returned after a call to the optimize() method
- every object (step or line_search) saves its modifications in a state variable in the optimizer. This variable can be accessed if needed after the optimization.
- after each iteration, a record function is called with this state variable - it is a dict, BTW -; if you want to save the whole dict, don't forget to copy it, as it is modified during the optimization

For the moment, the following are implemented :

- a standard algorithm, which only calls step_kind then line_search for a new candidate - the next optimizer would be one that calls a modifying function on the computed result, which can be useful in some cases -
- criteria :
  - monotony criterion : the cost is decreasing - a factor can be used to allow an error -
  - relative value criterion : the relative change in the value is compared to a fixed error
  - absolute value criterion : the same with the absolute change
- step :
  - gradient step
  - Newton step
  - Fletcher-Reeves conjugate gradient step - other conjugate gradients will be available -
- line search :
  - no line search, just take the step
  - damped search, an inexact line search that searches in the step direction for a set of parameters that decreases the cost by dividing the step size by two while the cost is not decreasing
  - Golden section search
  - Fibonacci search

I'm not adding other criteria, steps or line searches for now, as my time is finite when doing a structural change. There are 3 classic optimization test functions in the package - Rosenbrock, Powell and a quadratic function - feel free to try them. Sometimes the optimizer converges to the true minimum, sometimes it does not; I tried to propose several combinations to show that not every combination manages to find the minimum. Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: optimizer_proposal_04.tar.gz Type: application/x-gzip Size: 5903 bytes Desc: not available URL:
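A minimal sketch of how pieces in the spirit of this proposal could assemble - the class names follow the proposal's vocabulary, but the bodies are illustrative guesses, not the code from the attached archive:

import numpy as np

class GradientStep(object):
    def __call__(self, function, x, state):
        return -function.gradient(x)   # steepest-descent direction

class DampedLineSearch(object):
    def __init__(self, damping=0.5, min_step=1e-10):
        self.damping, self.min_step = damping, min_step
    def __call__(self, function, x, direction, state):
        step = 1.0
        # halve the step while the cost does not decrease
        while function(x + step * direction) >= function(x) and step > self.min_step:
            step *= self.damping
        state['step_size'] = step
        return x + step * direction

class RelativeValueCriterion(object):
    def __init__(self, tol=1e-8):
        self.tol = tol
    def __call__(self, state):
        old, new = state.get('old_value'), state.get('new_value')
        return old is not None and abs(new - old) <= self.tol * abs(old)

class StandardOptimizer(object):
    def __init__(self, function, x0, step_kind, line_search, criterion):
        self.function, self.x0 = function, np.asarray(x0, dtype=float)
        self.step_kind, self.line_search, self.criterion = step_kind, line_search, criterion
        self.state = {}   # shared record of the iteration, a plain dict
    def optimize(self, maxiter=10000):
        x = self.x0
        self.state['new_value'] = self.function(x)
        for _ in range(maxiter):
            direction = self.step_kind(self.function, x, self.state)
            x = self.line_search(self.function, x, direction, self.state)
            self.state['old_value'] = self.state['new_value']
            self.state['new_value'] = self.function(x)
            if self.criterion(self.state):
                break
        return x

For example, minimizing a simple quadratic, reusing the illustrative Function wrapper sketched earlier in this thread:

opt = StandardOptimizer(function=Function(lambda x: float(np.dot(x, x))), x0=[3., 4.], step_kind=GradientStep(), line_search=DampedLineSearch(), criterion=RelativeValueCriterion())
print opt.optimize()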
From ondrej at certik.cz Wed Apr 18 05:57:55 2007 From: ondrej at certik.cz (Ondrej Certik) Date: Wed, 18 Apr 2007 11:57:55 +0200 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <4625AE75.3040300@gmail.com> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F2BBB.7060406@ar.media.kyoto-u.ac.jp> <4624CC46.4030404@curioussymbols.com> <46258E24.8030602@ar.media.kyoto-u.ac.jp> <4625ADF2.7060503@curioussymbols.com> <4625AE75.3040300@gmail.com> Message-ID: <85b5c3130704180257n1174397bn268a232baadaca7e@mail.gmail.com> I think Python projects which are not just Python bindings to some C++/Fortran library (and SciPy definitely is not) should work in such a way that you just download them and import them in Python, without any installation, and it should just work, at least for the basic things. This way SciPy can have any number of (optional) modules, as it will not add to the complexity of the installation. Of course you need NumPy, but I assume NumPy is already installed. But I just use apt-get install python-scipy anyway, so I don't care. Ondrej On 4/18/07, Robert Kern wrote: > John Pye wrote: > > > Perhaps the focus should be on getting UMFPACK added to those distros, > > rather than working out how to build them as part of scipy? > > Since no one is suggesting building the UMFPACK library as part of the scipy > build process, it appears that everyone already agrees with you. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless enigma > that is made terrible by our own mad attempt to interpret it as though it had > an underlying truth." > -- Umberto Eco > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev >
From david at ar.media.kyoto-u.ac.jp Wed Apr 18 06:44:14 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 18 Apr 2007 19:44:14 +0900 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <85b5c3130704180257n1174397bn268a232baadaca7e@mail.gmail.com> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F2BBB.7060406@ar.media.kyoto-u.ac.jp> <4624CC46.4030404@curioussymbols.com> <46258E24.8030602@ar.media.kyoto-u.ac.jp> <4625ADF2.7060503@curioussymbols.com> <4625AE75.3040300@gmail.com> <85b5c3130704180257n1174397bn268a232baadaca7e@mail.gmail.com> Message-ID: <4625F67E.1040807@ar.media.kyoto-u.ac.jp> Ondrej Certik wrote: > I think Python projects which are not just Python bindings to some > C++/Fortran library (and SciPy definitely is not) should work in such a way > that you just download them and import them in Python, without any > installation, and it should just work, at least for the basic things. > On Linux at least, I don't see any other solution than packaging numpy/scipy with rpm/deb/whatever the distribution uses. This is the only reliable way to distribute binaries on Linux; trying otherwise is cumbersome, difficult, and a waste of time I think.
David
From nmarais at sun.ac.za Wed Apr 18 09:55:03 2007 From: nmarais at sun.ac.za (Neilen Marais) Date: Wed, 18 Apr 2007 15:55:03 +0200 Subject: [SciPy-dev] Compiling pysparse from the sandbox Message-ID: Hi, I tried compiling pysparse from the sandbox in a recent svn SciPy, but I get the following compiler error:

In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:17:
Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec_transp':
Lib/sandbox/pysparse/src/ll_mat.c:760: error: 'CONTIGUOUS' undeclared (first use in this function)
Lib/sandbox/pysparse/src/ll_mat.c:760: error: (Each undeclared identifier is reported only once
Lib/sandbox/pysparse/src/ll_mat.c:760: error: for each function it appears in.)
Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec':
Lib/sandbox/pysparse/src/ll_mat.c:797: error: 'CONTIGUOUS' undeclared (first use in this function)
In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:18:
Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec_transp':
Lib/sandbox/pysparse/src/csr_mat.c:119: error: 'CONTIGUOUS' undeclared (first use in this function)
Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec':
Lib/sandbox/pysparse/src/csr_mat.c:146: error: 'CONTIGUOUS' undeclared (first use in this function)
In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:19:
Lib/sandbox/pysparse/src/sss_mat.c: In function 'SSSMat_matvec':
Lib/sandbox/pysparse/src/sss_mat.c:83: error: 'CONTIGUOUS' undeclared (first use in this function)
In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:17:
Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec_transp':
Lib/sandbox/pysparse/src/ll_mat.c:760: error: 'CONTIGUOUS' undeclared (first use in this function)
Lib/sandbox/pysparse/src/ll_mat.c:760: error: (Each undeclared identifier is reported only once
Lib/sandbox/pysparse/src/ll_mat.c:760: error: for each function it appears in.)
Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec':
Lib/sandbox/pysparse/src/ll_mat.c:797: error: 'CONTIGUOUS' undeclared (first use in this function)
In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:18:
Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec_transp':
Lib/sandbox/pysparse/src/csr_mat.c:119: error: 'CONTIGUOUS' undeclared (first use in this function)
Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec':
Lib/sandbox/pysparse/src/csr_mat.c:146: error: 'CONTIGUOUS' undeclared (first use in this function)
In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:19:
Lib/sandbox/pysparse/src/sss_mat.c: In function 'SSSMat_matvec':
Lib/sandbox/pysparse/src/sss_mat.c:83: error: 'CONTIGUOUS' undeclared (first use in this function)
error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -g -O2 -Wall -fPIC -ILib/sandbox/pysparse/include/ -I/usr/lib/python2.4/site-packages/numpy/core/include -I/usr/include/python2.4 -c Lib/sandbox/pysparse/src/spmatrixmodule.c -o build/temp.linux-x86_64-2.4/Lib/sandbox/pysparse/src/spmatrixmodule.o" failed with exit status 1

Is this a known problem, or have I forgotten to install some development library? I'm running Ubuntu Edgy AMD64.
Thanks Neilen -- you know its kind of tragic we live in the new world but we've lost the magic -- Battery 9 (www.battery9.co.za)
From nwagner at iam.uni-stuttgart.de Wed Apr 18 10:38:04 2007 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Wed, 18 Apr 2007 16:38:04 +0200 Subject: [SciPy-dev] Compiling pysparse from the sandbox In-Reply-To: References: Message-ID: <46262D4C.2020005@iam.uni-stuttgart.de> Neilen Marais wrote: > Hi, > > I tried compiling pysparse from the sandbox in a recent svn SciPy, but I > get the following compiler error:
> In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:17:
> Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec_transp':
> Lib/sandbox/pysparse/src/ll_mat.c:760: error: 'CONTIGUOUS' undeclared (first use in this function)
> Lib/sandbox/pysparse/src/ll_mat.c:760: error: (Each undeclared identifier is reported only once
> Lib/sandbox/pysparse/src/ll_mat.c:760: error: for each function it appears in.)
> Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec':
> Lib/sandbox/pysparse/src/ll_mat.c:797: error: 'CONTIGUOUS' undeclared (first use in this function)
> In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:18:
> Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec_transp':
> Lib/sandbox/pysparse/src/csr_mat.c:119: error: 'CONTIGUOUS' undeclared (first use in this function)
> Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec':
> Lib/sandbox/pysparse/src/csr_mat.c:146: error: 'CONTIGUOUS' undeclared (first use in this function)
> In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:19:
> Lib/sandbox/pysparse/src/sss_mat.c: In function 'SSSMat_matvec':
> Lib/sandbox/pysparse/src/sss_mat.c:83: error: 'CONTIGUOUS' undeclared (first use in this function)
> In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:17:
> Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec_transp':
> Lib/sandbox/pysparse/src/ll_mat.c:760: error: 'CONTIGUOUS' undeclared (first use in this function)
> Lib/sandbox/pysparse/src/ll_mat.c:760: error: (Each undeclared identifier is reported only once
> Lib/sandbox/pysparse/src/ll_mat.c:760: error: for each function it appears in.)
> Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec':
> Lib/sandbox/pysparse/src/ll_mat.c:797: error: 'CONTIGUOUS' undeclared (first use in this function)
> In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:18:
> Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec_transp':
> Lib/sandbox/pysparse/src/csr_mat.c:119: error: 'CONTIGUOUS' undeclared (first use in this function)
> Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec':
> Lib/sandbox/pysparse/src/csr_mat.c:146: error: 'CONTIGUOUS' undeclared (first use in this function)
> In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:19:
> Lib/sandbox/pysparse/src/sss_mat.c: In function 'SSSMat_matvec':
> Lib/sandbox/pysparse/src/sss_mat.c:83: error: 'CONTIGUOUS' undeclared (first use in this function)
> error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -g -O2 -Wall -fPIC -ILib/sandbox/pysparse/include/ -I/usr/lib/python2.4/site-packages/numpy/core/include -I/usr/include/python2.4 -c Lib/sandbox/pysparse/src/spmatrixmodule.c -o build/temp.linux-x86_64-2.4/Lib/sandbox/pysparse/src/spmatrixmodule.o" failed with exit status 1
> > Is this a known problem, or have I forgotten to install some development > library? I'm running Ubuntu Edgy AMD64.
> > Thanks > Neilen > > Hi Neilen, I just tried to install pysparse from the sandbox. I can only confirm the failure. Nils
From robert.kern at gmail.com Wed Apr 18 11:24:52 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 18 Apr 2007 10:24:52 -0500 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <85b5c3130704180257n1174397bn268a232baadaca7e@mail.gmail.com> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F2BBB.7060406@ar.media.kyoto-u.ac.jp> <4624CC46.4030404@curioussymbols.com> <46258E24.8030602@ar.media.kyoto-u.ac.jp> <4625ADF2.7060503@curioussymbols.com> <4625AE75.3040300@gmail.com> <85b5c3130704180257n1174397bn268a232baadaca7e@mail.gmail.com> Message-ID: <46263844.4080709@gmail.com> Ondrej Certik wrote: > I think Python projects which are not just Python bindings to some > C++/Fortran library (and SciPy definitely is not) should work in such a way > that you just download them and import them in Python, without any > installation, and it should just work, at least for the basic things. But most of the things in scipy that aren't *just* wrappers around C or FORTRAN *use* just-wrappers of C and FORTRAN libraries. I'm sorry that this doesn't meet your standards, but there's really no getting around it. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
From wbaxter at gmail.com Wed Apr 18 14:53:22 2007 From: wbaxter at gmail.com (Bill Baxter) Date: Thu, 19 Apr 2007 03:53:22 +0900 Subject: [SciPy-dev] Compiling pysparse from the sandbox In-Reply-To: <46262D4C.2020005@iam.uni-stuttgart.de> References: <46262D4C.2020005@iam.uni-stuttgart.de> Message-ID: I think the name changed from CONTIGUOUS to NPY_CONTIGUOUS (or NPY_C_CONTIGUOUS if you want to be a little more explicit). Or try including numpy/noprefix.h instead of whatever it is including. That's apparently a compatibility header. So is this pysparse in the sandbox the same pysparse as here? http://sourceforge.net/project/showfiles.php?group_id=101403 Has it been further developed/fixed in the sandbox? --bb On 4/18/07, Nils Wagner wrote: > Neilen Marais wrote: > > Hi, > > > > I tried compiling pysparse from the sandbox in a recent svn SciPy, but I > > get the following compiler error:
> > In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:17:
> > Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec_transp':
> > Lib/sandbox/pysparse/src/ll_mat.c:760: error: 'CONTIGUOUS' undeclared (first use in this function)
> > Lib/sandbox/pysparse/src/ll_mat.c:760: error: (Each undeclared identifier is reported only once
> > Lib/sandbox/pysparse/src/ll_mat.c:760: error: for each function it appears in.)
> > Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec':
> > Lib/sandbox/pysparse/src/ll_mat.c:797: error: 'CONTIGUOUS' undeclared (first use in this function)
> > In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:18:
> > Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec_transp':
> > Lib/sandbox/pysparse/src/csr_mat.c:119: error: 'CONTIGUOUS' undeclared (first use in this function)
> > Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec':
> > Lib/sandbox/pysparse/src/csr_mat.c:146: error: 'CONTIGUOUS' undeclared (first use in this function)
> > In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:19:
> > Lib/sandbox/pysparse/src/sss_mat.c: In function 'SSSMat_matvec':
> > Lib/sandbox/pysparse/src/sss_mat.c:83: error: 'CONTIGUOUS' undeclared (first use in this function)
> > In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:17:
> > Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec_transp':
> > Lib/sandbox/pysparse/src/ll_mat.c:760: error: 'CONTIGUOUS' undeclared (first use in this function)
> > Lib/sandbox/pysparse/src/ll_mat.c:760: error: (Each undeclared identifier is reported only once
> > Lib/sandbox/pysparse/src/ll_mat.c:760: error: for each function it appears in.)
> > Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec':
> > Lib/sandbox/pysparse/src/ll_mat.c:797: error: 'CONTIGUOUS' undeclared (first use in this function)
> > In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:18:
> > Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec_transp':
> > Lib/sandbox/pysparse/src/csr_mat.c:119: error: 'CONTIGUOUS' undeclared (first use in this function)
> > Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec':
> > Lib/sandbox/pysparse/src/csr_mat.c:146: error: 'CONTIGUOUS' undeclared (first use in this function)
> > In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:19:
> > Lib/sandbox/pysparse/src/sss_mat.c: In function 'SSSMat_matvec':
> > Lib/sandbox/pysparse/src/sss_mat.c:83: error: 'CONTIGUOUS' undeclared (first use in this function)
> > error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -g -O2 -Wall -fPIC -ILib/sandbox/pysparse/include/ -I/usr/lib/python2.4/site-packages/numpy/core/include -I/usr/include/python2.4 -c Lib/sandbox/pysparse/src/spmatrixmodule.c -o build/temp.linux-x86_64-2.4/Lib/sandbox/pysparse/src/spmatrixmodule.o" failed with exit status 1
> > > > Is this a known problem, or have I forgotten to install some development > > library? I'm running Ubuntu Edgy AMD64. > > > > Thanks > > Neilen > > > Hi Neilen, > > I just tried to install pysparse from the sandbox. > I can only confirm the failure. > > Nils > > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev >
From travis at enthought.com Tue Apr 17 09:02:55 2007 From: travis at enthought.com (Travis Vaught) Date: Tue, 17 Apr 2007 08:02:55 -0500 Subject: [SciPy-dev] ANN: SciPy 2007 Conference Message-ID: <5F432F58-2741-458F-B643-274D09B701C8@enthought.com> Greetings, The *SciPy 2007 Conference* has been scheduled for mid-August at CalTech.
http://www.scipy.org/SciPy2007 Here's the rough schedule: Tutorials: August 14-15 (Tuesday and Wednesday) Conference: August 16-17 (Thursday and Friday) Sprints: August 18 (Saturday) Exciting things are happening in the Python community, and the SciPy 2007 Conference is an excellent opportunity to exchange ideas, learn techniques, contribute code and affect the direction of scientific computing (or just to learn what all the fuss is about). Last year's conference saw a near-doubling of attendance to 138, and we're looking forward to continued gains in participation. We'll be announcing the Keynote Speaker and providing a detailed schedule in the coming weeks. Registration: ------------- Registration is now open. You may register online at https://www.enthought.com/scipy07. Early registration for the conference is $150.00 and includes breakfast and lunch Thursday & Friday and a very nice dinner Thursday night. Tutorial registration is an additional $75.00. After July 15, 2007, conference registration will increase to $200.00 (tutorial registration will remain the same at $75.00). Call for Presenters ------------------- If you are interested in presenting at the conference, you may submit an abstract in Plain Text, PDF or MS Word formats to abstracts at scipy.org -- the deadline for abstract submission is July 6, 2007. Papers and/or presentation slides are acceptable and are due by August 3, 2007. Tutorial Sessions ----------------- Last year's conference saw an overwhelming turnout for our first-ever tutorial sessions. In order to better accommodate the community interest in tutorials, we've expanded them to 2 days and are providing food (requiring us to charge a modest fee for tutorials this year). A tentative list of topics for tutorials includes: - Wrapping Code with Python (extension module development) - Building Rich Scientific Applications with Python - Using SciPy for Statistical Analysis - Using SciPy for Signal Processing and Image Processing - Using Python as a Scientific IDE/Workbench - Others... This is a preliminary list; topics will change and be extended. If you'd like to present a tutorial, or are interested in a particular topic for a tutorial, please email the SciPy users mailing list (link below). A current list will be maintained here: http://www.scipy.org/SciPy2007/Tutorials Coding Sprints -------------- We've dedicated the Saturday after the conference for a Coding Sprint. Please include any ideas for Sprint topics on the Sprints wiki page here: http://www.scipy.org/SciPy2007/Sprints We're looking forward to another great conference! Best, Travis ------------- Links to various SciPy and NumPy mailing lists may be found here: http://www.scipy.org/Mailing_Lists From daniel.wheeler at nist.gov Wed Apr 18 18:41:33 2007 From: daniel.wheeler at nist.gov (Daniel Wheeler) Date: Wed, 18 Apr 2007 22:41:33 +0000 Subject: [SciPy-dev] Compiling pysparse from the sandbox In-Reply-To: References: <46262D4C.2020005@iam.uni-stuttgart.de> Message-ID: On Apr 18, 2007, at 6:53 PM, Bill Baxter wrote: > I think the name changed from CONTIGUOUS to NPY_CONTIGUOUS (or > NPY_C_CONTIGUOUS if you want to be a little more explicit). Or try > including numpy/noprefix.h instead of whatever it is including. > That's apparently a compatibility header. > > So is this pysparse in the sandbox the same pysparse as here? Probably not. Pysparse at sourceforge is being maintained and is at release 1.0. It's been updated to use numpy with the numpy/noprefix.h method mentioned above. 
Use cvs rather than 1.0 as 1.0 doesn't see the numpy header files.

> http://sourceforge.net/project/showfiles.php?group_id=101403
>
> Has it been further developed/fixed in the sandbox?
>
> --bb
>

--
Daniel Wheeler
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From wnbell at gmail.com Wed Apr 18 22:02:46 2007
From: wnbell at gmail.com (Nathan Bell)
Date: Wed, 18 Apr 2007 20:02:46 -0600
Subject: [SciPy-dev] Compiling pysparse from the sandbox
In-Reply-To:
References: <46262D4C.2020005@iam.uni-stuttgart.de>
Message-ID:

On 4/18/07, Daniel Wheeler wrote:
> Probably not. Pysparse at sourceforge is being maintained and is at release
> 1.0. It's been updated
> to use numpy with the numpy/noprefix.h method mentioned above. Use cvs
> rather than 1.0 as 1.0
> doesn't see the numpy header files.

Just out of curiosity, is there important functionality that PySparse offers that's not currently available in SciPy? From what I can tell, PySparse has a few preconditioners and an eigensolver, in addition to what SciPy also has.

Is there an interest in including these or any other sparse features in SciPy?

I have some Algebraic Multigrid code (AMG) that I've been working on for a while. I've implemented the so-called "classical" AMG of Ruge & Stuben and also Smoothed Aggregation as described by Vanek et al.

Would others be interested in using AMG in SciPy? For those not familiar with AMG, or multigrid in general - multigrid can solve linear systems that arise in certain elliptic PDEs (e.g. Poisson equations, heat diffusion, linear elasticity, etc) in optimal time. Furthermore, the AMG methods mentioned above are "black box" in the sense that only the matrix needs to be provided to the solver - so no knowledge of the mesh geometry is necessary.

Also, are the iterative methods (pcg,gmres,etc.) reentrant? I recall having problems using cg with a preconditioner that also called cg (for a coarse level solve).

--
Nathan Bell wnbell at gmail.com

From oliphant.travis at ieee.org Thu Apr 19 00:01:21 2007
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed, 18 Apr 2007 22:01:21 -0600
Subject: [SciPy-dev] Compiling pysparse from the sandbox
In-Reply-To:
References: <46262D4C.2020005@iam.uni-stuttgart.de>
Message-ID: <4626E991.9040604@ieee.org>

Nathan Bell wrote:
> On 4/18/07, Daniel Wheeler wrote:
>
>> Probably not. Pysparse at sourceforge is being maintained and is at release
>> 1.0. It's been updated
>> to use numpy with the numpy/noprefix.h method mentioned above. Use cvs
>> rather than 1.0 as 1.0
>> doesn't see the numpy header files.
>>
>
> Just out of curiosity, is there important functionality that PySparse
> offers that's not currently available in SciPy? From what I can tell,
> PySparse has a few preconditioners and an eigensolver, in addition to
> what SciPy also has.
>
> Is there an interest in including these or any other sparse features in SciPy?
>
> I have some Algebraic Multigrid code (AMG) that I've been working on
> for a while. I've implemented the so-called "classical" AMG of Ruge &
> Stuben and also Smoothed Aggregation as described by Vanek et al.
>
> Would others be interested in using AMG in SciPy? For those not
> familiar with AMG, or multigrid in general - multigrid can solve
> linear systems that arise in certain elliptic PDEs (e.g. Poisson
> equations, heat diffusion, linear elasticity, etc) in optimal time.
> Furthermore, the AMG methods mentioned above are "black box" in the
> sense that only the matrix needs to be provided to the solver - so no
> knowledge of the mesh geometry is necessary.
>
I'm interested.

> Also, are the iterative methods (pcg,gmres,etc.) reentrant? I recall
> having problems using cg with a preconditioner that also called cg
> (for a coarse level solve).
>
Yes, because f2py should be re-entrant.

-Travis

From wbaxter at gmail.com Wed Apr 18 23:54:50 2007
From: wbaxter at gmail.com (Bill Baxter)
Date: Thu, 19 Apr 2007 12:54:50 +0900
Subject: [SciPy-dev] Compiling pysparse from the sandbox
In-Reply-To:
References: <46262D4C.2020005@iam.uni-stuttgart.de>
Message-ID:

On 4/19/07, Nathan Bell wrote:
> On 4/18/07, Daniel Wheeler wrote:
> > Probably not. Pysparse at sourceforge is being maintained and is at release
> > 1.0. It's been updated
> > to use numpy with the numpy/noprefix.h method mentioned above. Use cvs
> > rather than 1.0 as 1.0
> > doesn't see the numpy header files.
>
> Just out of curiosity, is there important functionality that PySparse
> offers that's not currently available in SciPy? From what I can tell,
> PySparse has a few preconditioners and an eigensolver, in addition to
> what SciPy also has.

SciPy's sparse matrices also lack any way to operate based on a sparse index, which is a fundamental operation in FEM codes. Basically you need to be able to do something like

idx = [1,8,13,15]
K[ix_[idx,idx]] += node_contribution

pysparse has an update_add_masked function which can do that efficiently (although it would obviously be better if regular numpy indexing just worked.)

> Is there an interest in including these or any other sparse features in SciPy?
>
> I have some Algebraic Multigrid code (AMG) that I've been working on
> for a while. I've implemented the so-called "classical" AMG of Ruge &
> Stuben and also Smoothed Aggregation as described by Vanek et al.
>
> Would others be interested in using AMG in SciPy? For those not
> familiar with AMG, or multigrid in general - multigrid can solve
> linear systems that arise in certain elliptic PDEs (e.g. Poisson
> equations, heat diffusion, linear elasticity, etc) in optimal time.
> Furthermore, the AMG methods mentioned above are "black box" in the
> sense that only the matrix needs to be provided to the solver - so no
> knowledge of the mesh geometry is necessary.

Sounds great to me. I've toyed with MG, but never got very far. Non-power-of-two grids and boundary conditions were too tricksy for me. But the O(N) solve time is great if someone else does all the hard work for me. :-)

> Also, are the iterative methods (pcg,gmres,etc.) reentrant? I recall
> having problems using cg with a preconditioner that also called cg
> (for a coarse level solve).

--bb

From nwagner at iam.uni-stuttgart.de Thu Apr 19 02:04:40 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Thu, 19 Apr 2007 08:04:40 +0200
Subject: [SciPy-dev] from scipy.sandbox import pysparse
Message-ID: <46270678.4060801@iam.uni-stuttgart.de>

Hi all,

Compiling pysparse from the sandbox works for me but I cannot import pysparse.

Traceback (most recent call last):
File "/usr/lib64/python2.4/site-packages/scipy/sandbox/pysparse/tests/test_sparray.py", line 3, in ?
from sparray import sparray
ImportError: No module named sparray

>>> from scipy.sandbox import pysparse
Traceback (most recent call last):
File "", line 1, in ?
File "/usr/lib64/python2.4/site-packages/scipy/sandbox/pysparse/__init__.py", line 4, in ?
from spmatrix import *
ImportError: No module named spmatrix

>>> import scipy
>>> scipy.__version__
'0.5.3.dev2933'

Nils

From guyer at nist.gov Thu Apr 19 08:59:37 2007
From: guyer at nist.gov (Jonathan Guyer)
Date: Thu, 19 Apr 2007 08:59:37 -0400
Subject: [SciPy-dev] Compiling pysparse from the sandbox
In-Reply-To:
References: <46262D4C.2020005@iam.uni-stuttgart.de>
Message-ID: <7B887682-E255-4B4B-B583-52C905661345@nist.gov>

On Apr 18, 2007, at 11:54 PM, Bill Baxter wrote:

> SciPy's sparse matrices also lack any way to operate based on a sparse
> index, which is a fundamental operation in FEM codes. Basically you
> need to be able to do something like
> idx = [1,8,13,15]
> K[ix_[idx,idx]] += node_contribution
>
> pysparse has an update_add_masked function which can do that
> efficiently (although it would obviously be better if regular numpy
> indexing just worked.)

This is one important advantage, although update_add_mask() was not suitable for our needs (we wanted to build sparse matrices from dense vectors, rather than building sparse matrices from small dense matrices). We added (and Roman accepted) an update_add_at() function that did what we wanted. Either is blindingly faster than what could be done in SciPy the last time I looked. SciPy's syntax is more Pythonic, but PySparse can build matrices *much* more efficiently.

More broadly, the advantages of PySparse over SciPy all boil down to speed. About a year and a half ago, I posted some benchmarking info to the SciPy wiki, but it went away with the move to the new wiki. I have no idea where it's gone, and I don't seem to have a copy of it. It used to be at .

> Sounds great to me. I've toyed with MG, but never got very far.
> Non-power-of-two grids and boundary conditions were too tricksy for
> me. But the O(N) solve time is great if someone else does all the
> hard work for me. :-)

Likewise. We're very interested in multigrid for FiPy, and even more interested if it doesn't require us to think very hard.

From edschofield at gmail.com Thu Apr 19 11:46:37 2007
From: edschofield at gmail.com (Ed Schofield)
Date: Thu, 19 Apr 2007 16:46:37 +0100
Subject: [SciPy-dev] Compiling pysparse from the sandbox
In-Reply-To:
References: <46262D4C.2020005@iam.uni-stuttgart.de>
Message-ID: <1b5a37350704190846o2c5bf6acj5f177779613712c0@mail.gmail.com>

> On 4/18/07, Daniel Wheeler wrote:
>
> On Apr 18, 2007, at 6:53 PM, Bill Baxter wrote:
>
>> I think the name changed from CONTIGUOUS to NPY_CONTIGUOUS (or
>> NPY_C_CONTIGUOUS if you want to be a little more explicit). Or try
>> including numpy/noprefix.h instead of whatever it is including.
>> That's apparently a compatibility header.
>
>> So is this pysparse in the sandbox the same pysparse as here?
>
> Probably not. Pysparse at sourceforge is being maintained and is at release
> 1.0. It's been updated
> to use numpy with the numpy/noprefix.h method mentioned above. Use cvs
> rather than 1.0 as 1.0
> doesn't see the numpy header files.

That's interesting to hear; I haven't been following PySparse development for a while. Around the time of the Numeric -> SciPy core transition, I took a snapshot of PySparse, patched it to compile with NumPy and added it to the sandbox, with the idea of using it as a basis for improvements to (or even a re-write of) the scipy.sparse module. But we decided to keep the existing scipy.sparse matrix types, and when I came to add the new list-based lil_matrix type, it turned out to be easier to design it from scratch rather than port PySparse's ll_mat.c code to Python.
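(For a concrete picture of the indexed-accumulation pattern Bill and Jonathan describe above, here is a minimal sketch against scipy.sparse; the shapes, indices and values are made up, and the explicit Python loop stands in for what pysparse's update_add_mask/update_add_at do in C:)

import numpy as np
from scipy.sparse import lil_matrix

# Hypothetical element data: one dense 4x4 contribution scattered into a
# 20x20 global stiffness matrix at the listed degrees of freedom.
idx = [1, 8, 13, 15]
node_contribution = np.ones((4, 4))

K = lil_matrix((20, 20))   # lil_matrix is the type meant for incremental building
for a, i in enumerate(idx):
    for b, j in enumerate(idx):
        # Entry-by-entry accumulation into the sparse global matrix.
        K[i, j] += node_contribution[a, b]

print(K.todense())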
So I see no reason to keep the pysparse snapshot in the sandbox any longer. I've actually been meaning for a while to propose that we delete it from the tree...

-- Ed

From nwagner at iam.uni-stuttgart.de Thu Apr 19 13:24:13 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Thu, 19 Apr 2007 19:24:13 +0200
Subject: [SciPy-dev] Compiling pysparse from the sandbox
In-Reply-To: <1b5a37350704190846o2c5bf6acj5f177779613712c0@mail.gmail.com>
References: <46262D4C.2020005@iam.uni-stuttgart.de> <1b5a37350704190846o2c5bf6acj5f177779613712c0@mail.gmail.com>
Message-ID:

On Thu, 19 Apr 2007 16:46:37 +0100 "Ed Schofield" wrote:

>> On 4/18/07, Daniel Wheeler wrote:
>>
>> On Apr 18, 2007, at 6:53 PM, Bill Baxter wrote:
>>
>>> I think the name changed from CONTIGUOUS to NPY_CONTIGUOUS (or
>>> NPY_C_CONTIGUOUS if you want to be a little more explicit). Or try
>>> including numpy/noprefix.h instead of whatever it is including.
>>> That's apparently a compatibility header.
>>
>>> So is this pysparse in the sandbox the same pysparse as here?
>>
>> Probably not. Pysparse at sourceforge is being maintained and is at release
>> 1.0. It's been updated to use numpy with the numpy/noprefix.h method mentioned
>> above. Use cvs rather than 1.0 as 1.0 doesn't see the numpy header files.
>
> That's interesting to hear; I haven't been following PySparse
> development for a while. Around the time of the Numeric -> SciPy core
> transition, I took a snapshot of PySparse, patched it to compile with
> NumPy and added it to the sandbox, with the idea of using it as a
> basis for improvements to (or even a re-write of) the scipy.sparse
> module. But we decided to keep the existing scipy.sparse matrix types,
> and when I came to add the new list-based lil_matrix type, it turned
> out to be easier to design it from scratch rather than port PySparse's
> ll_mat.c code to Python.
>
> So I see no reason to keep the pysparse snapshot in the sandbox any
> longer. I've actually been meaning for a while to propose that we
> delete it from the tree...
>
> -- Ed
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev

Hi Ed,

Just now I have installed the latest pysparse via cvs, but I need some advice to improve my local setup.py. I have already installed BLAS/LAPACK, ATLAS and UMFPACK from source. So, who can send me a customized setup.py for pysparse ?

Nils

From spacey-scipy-dev at lenin.net Thu Apr 19 15:51:18 2007
From: spacey-scipy-dev at lenin.net (Peter C. Norton)
Date: Thu, 19 Apr 2007 12:51:18 -0700
Subject: [SciPy-dev] Specifying library names when configuring numpy or scipy?
Message-ID: <20070419195118.GL5812@lenin.net>

Summary: With python-2.5.1c1, numpy-1.0.2, and scipy-0.5.2, the distutils included in numpy doesn't allow me to specify the name of my platform-specific optimized blas+lapack library in the blas_opt or lapack_opt sections.

I've tried using the following names in the blas_opt and lapack_opt "libraries" sections: libsunperf, sunperf. To build and install scipy, I can create symlinks from the real location of libsunperf to a temporary location and then perform the build, as long as the names being symlinked to are "libblas.so" and "liblapack.so". This is a problem, since it means that to use scipy, I will need to have these symlinks in place somewhere in the loader's path, forever.
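(For diagnosing this kind of thing, numpy.distutils can be asked directly what it resolves -- a sketch; get_info and the 'blas_opt'/'lapack_opt' section names are the real numpy.distutils entry points, and the script should be run from the directory holding your site.cfg, if any:)

# Ask numpy's build machinery which optimized BLAS/LAPACK it can find
# on this machine, so you can see whether a custom "libraries" setting
# is being picked up at all.
from numpy.distutils.system_info import get_info

print(get_info('blas_opt'))    # e.g. {'libraries': [...], 'library_dirs': [...]}
print(get_info('lapack_opt'))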
I'm guessing this relates to my having the same problem with the numpy distutils package, but with numpy I could use a static library, and that keeps me from having to jerry-rig symlinks around my filesystem. scipy doesn't seem to work with static libraries.

Is this inability to use alternative optimized library names on anyone's radar? I'm not familiar with distutils in general, or how it should work for numpy/scipy here, but I'd be happy to look at this and give feedback on it. I'm guessing that the problem is that the classes blas_opt and lapack_opt need some fleshing out. Does this just need to be done still?

Thanks,
-Peter

--
The 5 year plan: In five years we'll make up another plan. Or just re-use this one.

From nmarais at sun.ac.za Thu Apr 19 19:06:15 2007
From: nmarais at sun.ac.za (Neilen Marais)
Date: Fri, 20 Apr 2007 01:06:15 +0200
Subject: [SciPy-dev] Solution for problem with f90 allocatable arrays on 64-bit systems
Message-ID: <1177023975.13148.47.camel@localhost.localdomain>

Hi,

I've found the problem in the generated code that results in the bug described here: http://projects.scipy.org/scipy/numpy/ticket/147

The problem is an index which should be specified as pointer size but is specified as integer in the fortran part of the module code. When wrapping the following code: (geomwrap.f90)

MODULE mesh
! The 4 node indices per element that define all the mesh elements
INTEGER, DIMENSION(:,:), ALLOCATABLE :: element_nodes
CONTAINS
SUBROUTINE init_mesh()
  ALLOCATE(element_nodes(4,2))
  element_nodes(:,1) = (/1, 2, 3, 4/)
  element_nodes(:,2) = (/1, 3, 4, 5/)
END SUBROUTINE init_mesh
END MODULE mesh

and generating wrappers:

f2py --fcompiler=gnu95 -m geo --build-dir test -c geomwrap.f90

in the file geo-f2pywrappers2.f90

! -*- f90 -*-
! This file is autogenerated with f2py (version:2_3718)
! It contains Fortran 90 wrappers to fortran functions.
subroutine f2py_mesh_getdims_element_nodes(r,s,f2pysetdata,flag)
  use mesh, only: d => element_nodes
  integer flag
  external f2pysetdata
  logical ns
  integer s(*),r,i,j

the definition for s(*) should be integer(8) (at least on 64-bit machines), i.e.

! -*- f90 -*-
! This file is autogenerated with f2py (version:2_3718)
! It contains Fortran 90 wrappers to fortran functions.
subroutine f2py_mesh_getdims_element_nodes(r,s,f2pysetdata,flag)
  use mesh, only: d => element_nodes
  integer flag
  external f2pysetdata
  logical ns
  integer(8) s(*)
  integer r,i,j

is correct. I'm going to patch f2py on my machine to generate integer(8) sized s(*), but this will of course be wrong on 32-bit machines. What is the recommended way of getting a pointer-sized int in fortran?

Thanks
Neilen

From cimrman3 at ntc.zcu.cz Fri Apr 20 04:50:23 2007
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Fri, 20 Apr 2007 10:50:23 +0200
Subject: [SciPy-dev] Compiling pysparse from the sandbox
In-Reply-To:
References: <46262D4C.2020005@iam.uni-stuttgart.de>
Message-ID: <46287ECF.8090806@ntc.zcu.cz>

Nathan Bell wrote:
> Just out of curiosity, is there important functionality that PySparse
> offers that's not currently available in SciPy? From what I can tell,
> PySparse has a few preconditioners and an eigensolver, in addition to
> what SciPy also has.
>
> Is there an interest in including these or any other sparse features in SciPy?
>
> I have some Algebraic Multigrid code (AMG) that I've been working on
> for a while. I've implemented the so-called "classical" AMG of Ruge &
> Stuben and also Smoothed Aggregation as described by Vanek et al.
>
> Would others be interested in using AMG in SciPy? For those not
> familiar with AMG, or multigrid in general - multigrid can solve
> linear systems that arise in certain elliptic PDEs (e.g. Poisson
> equations, heat diffusion, linear elasticity, etc) in optimal time.
> Furthermore, the AMG methods mentioned above are "black box" in the
> sense that only the matrix needs to be provided to the solver - so no
> knowledge of the mesh geometry is necessary.

I'd love to see an AMG implementation in SciPy!

r.

From nwagner at iam.uni-stuttgart.de Fri Apr 20 04:56:35 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Fri, 20 Apr 2007 10:56:35 +0200
Subject: [SciPy-dev] Compiling pysparse from the sandbox
In-Reply-To: <46287ECF.8090806@ntc.zcu.cz>
References: <46262D4C.2020005@iam.uni-stuttgart.de> <46287ECF.8090806@ntc.zcu.cz>
Message-ID: <46288043.8010505@iam.uni-stuttgart.de>

Robert Cimrman wrote:
> Nathan Bell wrote:
>
>> Just out of curiosity, is there important functionality that PySparse
>> offers that's not currently available in SciPy? From what I can tell,
>> PySparse has a few preconditioners and an eigensolver, in addition to
>> what SciPy also has.
>>
>> Is there an interest in including these or any other sparse features in SciPy?
>>
>> I have some Algebraic Multigrid code (AMG) that I've been working on
>> for a while. I've implemented the so-called "classical" AMG of Ruge &
>> Stuben and also Smoothed Aggregation as described by Vanek et al.
>>
>> Would others be interested in using AMG in SciPy? For those not
>> familiar with AMG, or multigrid in general - multigrid can solve
>> linear systems that arise in certain elliptic PDEs (e.g. Poisson
>> equations, heat diffusion, linear elasticity, etc) in optimal time.
>> Furthermore, the AMG methods mentioned above are "black box" in the
>> sense that only the matrix needs to be provided to the solver - so no
>> knowledge of the mesh geometry is necessary.
>>
>
> I'd love to see an AMG implementation in SciPy!
>
> r.
>
+1

Nils

> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>

From jtravs at gmail.com Fri Apr 20 14:03:42 2007
From: jtravs at gmail.com (John Travers)
Date: Fri, 20 Apr 2007 19:03:42 +0100
Subject: [SciPy-dev] Genetic algorithm optimization
Message-ID: <3a1077e70704201103v6bcf2ad5jab711584888383ed@mail.gmail.com>

Hi all,

I have recently needed a genetic algorithm code for some work. There appear to be various python implementations, which I didn't try, and the package in the sandbox looked promising initially, but I think a large amount of work needs to be done to get it going again. As I was pushed for time I took an easy option and wrapped what appears to be a fairly robust implementation in Fortran 77 called pikaia:

http://www.hao.ucar.edu/Public/models/pikaia/pikaia.html

It has apparently been used in a number of scientific research projects and appears to be quite mature. I have a very simple f2py interface (attached along with a patch to the pikaia source which I slightly modified) and python calling function (also attached). While it doesn't provide a fully customizable, all bells and whistles GA library which some people may want (see attached ga.py for docs on what features it does have), it does work very well for standard global optimization problems (see example, also attached).
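(To make the idea concrete, here is a toy GA in the same spirit -- a from-scratch sketch, not the attached ga.py interface; following pikaia's conventions, the parameters live in the unit hypercube and a fitness function is maximized. The objective and all settings are made up for illustration:)

import numpy as np

def fitness(x):
    # Toy multimodal objective on [0,1]^2; the global maximum is at (0.5, 0.5).
    r2 = (x[0] - 0.5)**2 + (x[1] - 0.5)**2
    return np.cos(9.0 * np.pi * (x[0] - 0.5)) * np.exp(-8.0 * r2)

def toy_ga(f, ndim, popsize=50, ngen=200, pmut=0.05, seed=0):
    rng = np.random.RandomState(seed)
    pop = rng.rand(popsize, ndim)                 # random initial population
    for gen in range(ngen):
        scores = np.array([f(ind) for ind in pop])
        order = np.argsort(scores)[::-1]
        parents = pop[order[:popsize // 2]]       # truncation selection
        # Uniform crossover between randomly paired parents.
        a = parents[rng.randint(len(parents), size=popsize)]
        b = parents[rng.randint(len(parents), size=popsize)]
        pop = np.where(rng.rand(popsize, ndim) < 0.5, a, b)
        # Creep mutation, clipped back into the unit hypercube.
        mut = rng.rand(popsize, ndim) < pmut
        pop = np.clip(pop + mut * rng.normal(0.0, 0.1, pop.shape), 0.0, 1.0)
    scores = np.array([f(ind) for ind in pop])
    return pop[scores.argmax()], scores.max()

best, fbest = toy_ga(fitness, 2)
print(best, fbest)   # should land near (0.5, 0.5) with fitness near 1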
It would fit very neatly in the optimization package of scipy, as it is a small, one-function routine; see the attached test script for an example. The GA code itself is explicitly placed in the public domain. The file you can download from the above website contains a quicksort and uniform random number routine which are under the ACM license, which is incompatible with scipy. But these parts are easily replaced with compatibly licensed versions (suggestions for such routines would be welcome).

What do people think about including it in the optimization module?

Cheers,
John

-------------- next part --------------
A non-text attachment was scrubbed...
Name: pikaia.diff
Type: text/x-patch
Size: 2966 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pikaia.pyf
Type: application/octet-stream
Size: 833 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ga.py
Type: text/x-python
Size: 4511 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_ga.py
Type: text/x-python
Size: 340 bytes
Desc: not available
URL:

From nmarais at sun.ac.za Sat Apr 21 16:15:13 2007
From: nmarais at sun.ac.za (Neilen Marais)
Date: Sat, 21 Apr 2007 22:15:13 +0200
Subject: [SciPy-dev] Solution for problem with f90 allocatable arrays on 64-bit systems
References: <1177023975.13148.47.camel@localhost.localdomain>
Message-ID:

On Fri, 20 Apr 2007 01:06:15 +0200, Neilen Marais wrote:

Following up on myself, the problem is now fixed in SVN.

> Hi,
>
> I've found the problem in the generated code that results in the bug
> described here: http://projects.scipy.org/scipy/numpy/ticket/147
>

--
you know its kind of tragic
we live in the new world
but we've lost the magic
-- Battery 9 (www.battery9.co.za)

From jtravs at gmail.com Sun Apr 22 09:09:02 2007
From: jtravs at gmail.com (John Travers)
Date: Sun, 22 Apr 2007 14:09:02 +0100
Subject: [SciPy-dev] BVP solver and dependence on Fortran 90/95
Message-ID: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com>

Hi all,

I have a number of modules I'm working on for my own use which depend on Fortran 90/95 language features. One is a python wrapper to a boundary value problem solver (http://cs.smu.ca/~muir/BVP_SOLVER_Webpage.shtml) which would be a very good addition to scipy I think (the authors have explicitly agreed that it can be used in scipy).

My question is: what is the policy on scipy modules depending on Fortran 90/95 code. There currently doesn't appear to be any in the repository. However with the availability of gfortran now broadening and the fact that most (all?) commercial compilers now have full support, is it any longer a problem (putting aside f2py support)?

Cheers,
John

From robert.kern at gmail.com Sun Apr 22 14:31:16 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Sun, 22 Apr 2007 13:31:16 -0500
Subject: [SciPy-dev] BVP solver and dependence on Fortran 90/95
In-Reply-To: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com>
References: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com>
Message-ID: <462BA9F4.3070708@gmail.com>

John Travers wrote:
> Hi all,
>
> I have a number of modules I'm working on for my own use which depend
> on Fortran 90/95 language features.
> One is a python wrapper to a
> boundary value problem solver
> (http://cs.smu.ca/~muir/BVP_SOLVER_Webpage.shtml) which would be a
> very good addition to scipy I think (the authors have explicitly
> agreed that it can be used in scipy).
>
> My question is: what is the policy on scipy modules depending on
> Fortran 90/95 code. There currently doesn't appear to be any in the
> repository. However with the availability of gfortran now broadening
> and the fact that most (all?) commercial compilers now have full
> support, is it any longer a problem (putting aside f2py support)?

Personally, I still don't trust gfortran to be non-buggy.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco

From stefan at sun.ac.za Mon Apr 23 03:40:23 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Mon, 23 Apr 2007 09:40:23 +0200
Subject: [SciPy-dev] BVP solver and dependence on Fortran 90/95
In-Reply-To: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com>
References: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com>
Message-ID: <20070423074023.GJ6933@mentat.za.net>

Hi John

On Sun, Apr 22, 2007 at 02:09:02PM +0100, John Travers wrote:
> My question is: what is the policy on scipy modules depending on
> Fortran 90/95 code. There currently doesn't appear to be any in the
> repository. However with the availability of gfortran now broadening
> and the fact that most (all?) commercial compilers now have full
> support, is it any longer a problem (putting aside f2py support)?

I'd suggest that you start by putting the code in a sandbox directory. There you can experiment setting up the building process, and once we're satisfied we can move it over to the rest of scipy. Even if we don't manage right away, having the code there for the future is a good idea, otherwise it may be forgotten.

Maybe gfortran does manage to build your code, although I suspect Robert's concern is that it may not consistently (over different versions) deliver correctly functioning binaries?

Cheers
Stéfan

From david at ar.media.kyoto-u.ac.jp Mon Apr 23 03:58:19 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 23 Apr 2007 16:58:19 +0900
Subject: [SciPy-dev] BVP solver and dependence on Fortran 90/95
In-Reply-To: <20070423074023.GJ6933@mentat.za.net>
References: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com> <20070423074023.GJ6933@mentat.za.net>
Message-ID: <462C671B.2050701@ar.media.kyoto-u.ac.jp>

Stefan van der Walt wrote:
> Hi John
>
> On Sun, Apr 22, 2007 at 02:09:02PM +0100, John Travers wrote:
>> My question is: what is the policy on scipy modules depending on
>> Fortran 90/95 code. There currently doesn't appear to be any in the
>> repository. However with the availability of gfortran now broadening
>> and the fact that most (all?) commercial compilers now have full
>> support, is it any longer a problem (putting aside f2py support)?
>
> I'd suggest that you start by putting the code in a sandbox directory.
> There you can experiment setting up the building process, and once
> we're satisfied we can move it over to the rest of scipy. Even if we
> don't manage right away, having the code there for the future is a
> good idea, otherwise it may be forgotten.
>
> Maybe gfortran does manage to build your code, although I suspect
> Robert's concern is that it may not consistently (over different
> versions) deliver correctly functioning binaries?

Debian and Ubuntu still use g77 as the main fortran compiler, too, which is binary incompatible with gfortran.

David

From david at ar.media.kyoto-u.ac.jp Mon Apr 23 04:10:13 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 23 Apr 2007 17:10:13 +0900
Subject: [SciPy-dev] Status of the doc format for scipy code ?
Message-ID: <462C69E5.9070905@ar.media.kyoto-u.ac.jp>

Hi,

I wanted to know what is the status of the doc format for scipy (http://projects.scipy.org/scipy/numpy/wiki/DocstringStandards) ? Is anything settled down ?

cheers,

David

From matthew.brett at gmail.com Mon Apr 23 04:29:01 2007
From: matthew.brett at gmail.com (Matthew Brett)
Date: Mon, 23 Apr 2007 09:29:01 +0100
Subject: [SciPy-dev] BVP solver and dependence on Fortran 90/95
In-Reply-To: <20070423074023.GJ6933@mentat.za.net>
References: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com> <20070423074023.GJ6933@mentat.za.net>
Message-ID: <1e2af89e0704230129w6664c58i5f716b129035c865@mail.gmail.com>

Hi,

> I'd suggest that you start by putting the code in a sandbox directory.
> There you can experiment setting up the building process, and once
> we're satisfied we can move it over to the rest of scipy.

That seems like a good idea. Why not make your build conditional on the use of an f95 compiler? At least that will make your code available to those of us who do use gfortran.

Best,

Matthew

From jtravs at gmail.com Mon Apr 23 04:47:29 2007
From: jtravs at gmail.com (John Travers)
Date: Mon, 23 Apr 2007 09:47:29 +0100
Subject: [SciPy-dev] BVP solver and dependence on Fortran 90/95
In-Reply-To: <20070423074023.GJ6933@mentat.za.net>
References: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com> <20070423074023.GJ6933@mentat.za.net>
Message-ID: <3a1077e70704230147i4290be39m872d83e527dc59cb@mail.gmail.com>

On 23/04/07, Stefan van der Walt wrote:
> Maybe gfortran does manage to build your code, although I suspect
> Robert's concern is that it may not consistently (over different
> versions) deliver correctly functioning binaries?

Well, I've used gfortran 4.1 - 4.3 (current svn) to build and use the BVP code and pass all tests. So I think they are getting there... There is currently a very large amount of work going on with gfortran. Version 4.2, which should hit linux distributions this year, is already a pretty competitive f95 compiler. But I understand the concerns.

So I'll get it into svn at some point then look at selective compilation with a suitable compiler.

Cheers,
John

From stefan at sun.ac.za Mon Apr 23 05:09:15 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Mon, 23 Apr 2007 11:09:15 +0200
Subject: [SciPy-dev] Status of the doc format for scipy code ?
In-Reply-To: <462C69E5.9070905@ar.media.kyoto-u.ac.jp>
References: <462C69E5.9070905@ar.media.kyoto-u.ac.jp>
Message-ID: <20070423090915.GM6933@mentat.za.net>

On Mon, Apr 23, 2007 at 05:10:13PM +0900, David Cournapeau wrote:
> Hi,
>
> I wanted to know what is the status of the doc format for scipy
> (http://projects.scipy.org/scipy/numpy/wiki/DocstringStandards) ? Is
> anything settled down ?
The latest document I know of is at

http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/HOWTO_DOCUMENT.txt

Cheers
Stéfan

From david at ar.media.kyoto-u.ac.jp Mon Apr 23 06:10:14 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 23 Apr 2007 19:10:14 +0900
Subject: [SciPy-dev] Status of the doc format for scipy code ?
In-Reply-To: <20070423090915.GM6933@mentat.za.net>
References: <462C69E5.9070905@ar.media.kyoto-u.ac.jp> <20070423090915.GM6933@mentat.za.net>
Message-ID: <462C8606.9090004@ar.media.kyoto-u.ac.jp>

Stefan van der Walt wrote:
> On Mon, Apr 23, 2007 at 05:10:13PM +0900, David Cournapeau wrote:
>> Hi,
>>
>> I wanted to know what is the status of the doc format for scipy
>> (http://projects.scipy.org/scipy/numpy/wiki/DocstringStandards) ? Is
>> anything settled down ?
>
> The latest document I know of is at
>
> http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/HOWTO_DOCUMENT.txt

Do you know how to process it ? I tried epydoc beta, and it does not work on a simple example, and dies on numpy sources...

David

From nmarais at sun.ac.za Mon Apr 23 12:52:08 2007
From: nmarais at sun.ac.za (Neilen Marais)
Date: Mon, 23 Apr 2007 18:52:08 +0200
Subject: [SciPy-dev] BVP solver and dependence on Fortran 90/95
References: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com> <462BA9F4.3070708@gmail.com>
Message-ID:

On Sun, 22 Apr 2007 13:31:16 -0500, Robert Kern wrote:

> John Travers wrote:
>> My question is: what is the policy on scipy modules depending on
>> Fortran 90/95 code. There currently doesn't appear to be any in the
>> repository. However with the availability of gfortran now broadening
>> and the fact that most (all?) commercial compilers now have full
>> support, is it any longer a problem (putting aside f2py support)?
>
> Personally, I still don't trust gfortran to be non-buggy.

But presumably those willing to take the risk should be able to use f90/95 modules? I think the sooner we start at least giving people the option to use gfortran, the sooner gfortran will gain our trust. It will at least allow gfortran/scipy incompatibilities to surface and allow us to report problems to the gfortran developers.

I'm a little bit in the dark about the g77/gfortran binary incompatibility. Does this mean that f90 code compiled with gfortran can't call f77 code compiled with g77? That would be a pretty sticky situation for users of distribution packaged libs if the distro is using g77...

Regards
Neilen

--
you know its kind of tragic
we live in the new world
but we've lost the magic
-- Battery 9 (www.battery9.co.za)

From robert.kern at gmail.com Mon Apr 23 12:57:13 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 23 Apr 2007 11:57:13 -0500
Subject: [SciPy-dev] BVP solver and dependence on Fortran 90/95
In-Reply-To:
References: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com> <462BA9F4.3070708@gmail.com>
Message-ID: <462CE569.4060303@gmail.com>

Neilen Marais wrote:
> On Sun, 22 Apr 2007 13:31:16 -0500, Robert Kern wrote:
>
>> John Travers wrote:
>
>>> My question is: what is the policy on scipy modules depending on
>>> Fortran 90/95 code. There currently doesn't appear to be any in the
>>> repository. However with the availability of gfortran now broadening
>>> and the fact that most (all?) commercial compilers now have full
>>> support, is it any longer a problem (putting aside f2py support)?
>> Personally, I still don't trust gfortran to be non-buggy.
>
> But presumably those willing to take the risk should be able to use f90/95
> modules?

Well, when you can make installing a subpackage of scipy optional, I'm more than happy to do that. Until then, keep it in the sandbox or in scikits.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco

From jtravs at gmail.com Mon Apr 23 13:09:51 2007
From: jtravs at gmail.com (John Travers)
Date: Mon, 23 Apr 2007 18:09:51 +0100
Subject: [SciPy-dev] BVP solver and dependence on Fortran 90/95
In-Reply-To: <462CE569.4060303@gmail.com>
References: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com> <462BA9F4.3070708@gmail.com> <462CE569.4060303@gmail.com>
Message-ID: <3a1077e70704231009k4a089ab5g62b21544b3148d8b@mail.gmail.com>

On 23/04/07, Robert Kern wrote:
> Neilen Marais wrote:
> > But presumably those willing to take the risk should be able to use f90/95
> > modules?
>
> Well, when you can make installing a subpackage of scipy optional, I'm more than
> happy to do that. Until then, keep it in the sandbox or in scikits.
>

I think I'll put it in a scikit for now, so that it gains more visibility without being a problem for scipy. However, we will need to seriously address the f95 issue in the future as more and more high quality numeric code is produced in this language (for very good reason: allocatable arrays make for much neater and more robust codes when one needs to dynamically adjust meshes etc.). This BVP solver is just one of many examples I use.

Cheers,
John

From stefan at sun.ac.za Mon Apr 23 16:24:09 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Mon, 23 Apr 2007 22:24:09 +0200
Subject: [SciPy-dev] Status of the doc format for scipy code ?
In-Reply-To: <462C8606.9090004@ar.media.kyoto-u.ac.jp>
References: <462C69E5.9070905@ar.media.kyoto-u.ac.jp> <20070423090915.GM6933@mentat.za.net> <462C8606.9090004@ar.media.kyoto-u.ac.jp>
Message-ID: <20070423202409.GQ6933@mentat.za.net>

On Mon, Apr 23, 2007 at 07:10:14PM +0900, David Cournapeau wrote:
> Stefan van der Walt wrote:
> > On Mon, Apr 23, 2007 at 05:10:13PM +0900, David Cournapeau wrote:
> >> I wanted to know what is the status of the doc format for scipy
> >> (http://projects.scipy.org/scipy/numpy/wiki/DocstringStandards) ? Is
> >> anything settled down ?
> >
> > The latest document I know of is at
> >
> > http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/HOWTO_DOCUMENT.txt
> Do you know how to process it ? I tried epydoc beta, and it does not
> work on a simple example, and dies on numpy sources...

AFAIK, no parser has been written for that format yet.

Cheers
Stéfan

From jmt at twilley.org Mon Apr 23 18:19:23 2007
From: jmt at twilley.org (Jack Twilley)
Date: Mon, 23 Apr 2007 15:19:23 -0700
Subject: [SciPy-dev] Disappearing websites and unlicensed code
Message-ID: <462D30EB.1090606@twilley.org>

I was looking for some examples on using Python's wave module to read and write WAV files and came across the tutorial formerly found at http://scipy.mit.edu/tutorials/wave.pdf -- I say formerly found because I haven't been able to get to that website since I first found the link via Google. I was able to access the file via Google's cache and I have been using the code for private work, but the code in the tutorial is unlicensed as it sits.
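(For what it's worth, the core of that tutorial's task fits in a few lines -- a sketch using only the stdlib wave module plus numpy, assuming a 16-bit mono file; 'test.wav' is a placeholder filename:)

import wave
import numpy as np

# Read a 16-bit mono WAV file into a numpy array of samples.
w = wave.open('test.wav', 'rb')
frames = w.readframes(w.getnframes())
data = np.frombuffer(frames, dtype=np.int16)
rate = w.getframerate()
w.close()
print(rate, len(data))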
I checked the web for scipy and found the website which points to this mailing list and thought I'd ask here if anyone knows how the aforementioned code is licensed and what happened to the MIT site.

Jack.

From david at ar.media.kyoto-u.ac.jp Tue Apr 24 00:55:05 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 24 Apr 2007 13:55:05 +0900
Subject: [SciPy-dev] Dataset for examples and license
Message-ID: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp>

Hi,

I would like to know what should be done when including some dataset in scipy ? For example, during the development of my project pymachine, I would like to include some famous data like iris/old faithful data, etc... for demo of classic machine learning algorithms. R has some interesting data, but is licensed under the GPL, and I am not quite sure what the status of the data is wrt the license ? Does GPL also cover raw data ?

cheers,

David

From peridot.faceted at gmail.com Tue Apr 24 01:53:45 2007
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Tue, 24 Apr 2007 01:53:45 -0400
Subject: [SciPy-dev] Dataset for examples and license
In-Reply-To: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp>
References: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp>
Message-ID:

On 24/04/07, David Cournapeau wrote:
> Hi,
>
> I would like to know what should be done when including some dataset
> in scipy ? For example, during the development of my project pymachine,
> I would like to include some famous data like iris/old faithful data,
> etc... for demo of classic machine learning algorithms. R has some
> interesting data, but is licensed under the GPL, and I am not quite
> sure what the status of the data is wrt the license ? Does GPL also
> cover raw data ?

Not necessarily appropriate for machine learning, and this doesn't answer your question, but there's lots of astronomy data which is public (and in fact I think in the public domain as it's a NASA product).

For inclusion in scipy, supposing the license is fine, if the data is small (a few kilobytes?) it can go in a test case. (Does scipy *have* a collection of example code in the distribution? It would be nice...) If it's bigger (a few megabytes?) it could go on the Wiki; if it's really big it could probably go on the Wikimedia Commons (though do they support arbitrary file types?).

Uh, I should say, I'm not a scipy developer, so this is rather my best guess at what they would permit.

Anne

From stefan at sun.ac.za Tue Apr 24 03:07:43 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Tue, 24 Apr 2007 09:07:43 +0200
Subject: [SciPy-dev] Disappearing websites and unlicensed code
In-Reply-To: <462D30EB.1090606@twilley.org>
References: <462D30EB.1090606@twilley.org>
Message-ID: <20070424070742.GR6933@mentat.za.net>

Hi Jack

On Mon, Apr 23, 2007 at 03:19:23PM -0700, Jack Twilley wrote:
> I was looking for some examples on using Python's wave module to read
> and write WAV files and came across the tutorial formerly found at
> http://scipy.mit.edu/tutorials/wave.pdf -- I say formerly found because
> I haven't been able to get to that website since I first found the link
> via Google.

Unfortunately, I don't know the answer to your question.
But I can point you in the direction of David's pyaudiolab, which you may find useful nonetheless:

http://www.ar.media.kyoto-u.ac.jp/members/david/softwares/pyaudiolab/

Cheers
Stéfan

From rudolph at ska.ac.za Tue Apr 24 04:49:21 2007
From: rudolph at ska.ac.za (Rudolph van der Merwe)
Date: Tue, 24 Apr 2007 10:49:21 +0200
Subject: [SciPy-dev] Possible bug in scipy.stats.kurtosistest()
Message-ID: <97670e910704240149u60078b8bmeafb53c06fc9f844@mail.gmail.com>

I think I found a bug in the kurtosistest function in the scipy.stats module. The relevant file is scipy/stats/stats.py

The kurtosistest function seems to be a direct implementation of the algorithm described by D'Agostino et al. in the following paper:

R. B. D'Agostino, A. Belanger, and R. B. D'Agostino, Jr. "A Suggestion for Using Powerful and Informative Tests of Normality", The American Statistician, Vol. 44, No. 4. (Nov., 1990), pp. 316-321

One of the first steps in the algorithm is calculating the kurtosis of the distribution. This is done using the stats.kurtosis function. This function can either calculate the Fisher or the Pearson form of the kurtosis. The scipy code in kurtosistest makes use of the Fisher form, which is the default for the kurtosis function. The algorithm in D'Agostino's paper, however, makes use of the Pearson form of the kurtosis. I verified this issue on the numerical example given in the paper. The results calculated by scipy.stats.kurtosistest only agree with the paper if the call to kurtosis in kurtosistest is changed from

b2 = kurtosis(a, axis)

to

b2 = kurtosis(a, axis, fisher=False)

Since kurtosistest is used by the scipy.stats.normaltest function (which does a D'Agostino-Pearson normality test), this bug affects that function's correctness as well.

Is there a more formal way of filing this bug than posting a message on this forum?

Rudolph

P.S. I'm using Scipy version 0.5.2

--
Rudolph van der Merwe
KAT (Karoo Array Telescope) / www.kat.ac.za

From david at ar.media.kyoto-u.ac.jp Tue Apr 24 09:01:20 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 24 Apr 2007 22:01:20 +0900
Subject: [SciPy-dev] Dataset for examples and license
In-Reply-To:
References: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp>
Message-ID: <462DFFA0.6060806@ar.media.kyoto-u.ac.jp>

Anne Archibald wrote:
> On 24/04/07, David Cournapeau wrote:
>> Hi,
>>
>> I would like to know what should be done when including some dataset
>> in scipy ? For example, during the development of my project pymachine,
>> I would like to include some famous data like iris/old faithful data,
>> etc... for demo of classic machine learning algorithms. R has some
>> interesting data, but is licensed under the GPL, and I am not quite
>> sure what the status of the data is wrt the license ? Does GPL also
>> cover raw data ?
>
> Not necessarily appropriate for machine learning, and this doesn't
> answer your question, but there's lots of astronomy data which is
> public (and in fact I think in the public domain as it's a NASA
> product).
>
> For inclusion in scipy, supposing the license is fine, if the data is
> small (a few kilobytes?) it can go in a test case. (Does scipy *have*
> a collection of example code in the distribution? It would be nice...)
> If it's bigger (a few megabytes?) it could go on the Wiki; if it's
> really big it could probably go on the Wikimedia Commons (though do
> they support arbitrary file types?).
Well, I guess once scipy is modularized and can be installed package by package, having a package dataset ala R would be nice. For now, I have a small python script which converts those datasets to hdf5, so they can be read easily from python, and if including them in scipy is OK license-wise, I can easily add the data as a package for distribution (the compressed, pickled, related data takes ~ 100 kb).

David

From david.huard at gmail.com Tue Apr 24 09:08:17 2007
From: david.huard at gmail.com (David Huard)
Date: Tue, 24 Apr 2007 09:08:17 -0400
Subject: [SciPy-dev] Status of the doc format for scipy code ?
In-Reply-To: <20070423202409.GQ6933@mentat.za.net>
References: <462C69E5.9070905@ar.media.kyoto-u.ac.jp> <20070423090915.GM6933@mentat.za.net> <462C8606.9090004@ar.media.kyoto-u.ac.jp> <20070423202409.GQ6933@mentat.za.net>
Message-ID: <91cf711d0704240608l38592dcbg3fd00cb3021d3b83@mail.gmail.com>

> > http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/HOWTO_DOCUMENT.txt
> > Do you know how to process it ? I tried epydoc beta, and it does not
> > work on a simple example, and dies on numpy sources...

I updated epydoc and the patch I wrote a while ago to parse something very near the scipy template is now outdated.

On the other hand, simple examples should work... Do you get a latex error or a python error ?

If you get a latex inputenc error, it may be due to using latin1 encoding instead of utf8 or whatever your encoding is (look in the header of api.tex).

HTH,

David
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From david at ar.media.kyoto-u.ac.jp Tue Apr 24 09:10:08 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 24 Apr 2007 22:10:08 +0900
Subject: [SciPy-dev] Status of the doc format for scipy code ?
In-Reply-To: <91cf711d0704240608l38592dcbg3fd00cb3021d3b83@mail.gmail.com>
References: <462C69E5.9070905@ar.media.kyoto-u.ac.jp> <20070423090915.GM6933@mentat.za.net> <462C8606.9090004@ar.media.kyoto-u.ac.jp> <20070423202409.GQ6933@mentat.za.net> <91cf711d0704240608l38592dcbg3fd00cb3021d3b83@mail.gmail.com>
Message-ID: <462E01B0.6020609@ar.media.kyoto-u.ac.jp>

David Huard wrote:
>
> > > http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/HOWTO_DOCUMENT.txt
> > Do you know how to process it ? I tried epydoc beta, and it does not
> > work on a simple example, and dies on numpy sources...
>
> I updated epydoc and the patch I wrote a while ago to parse something
> very near the scipy template is now outdated.
>
> On the other hand, simple examples should work...
>
> Do you get a latex error or a python error ?
Well, actually, I was just stupid, and forgot to set __docformat__ to reST before processing my python files... Now, I can get them processed.
>
> If you get a latex inputenc error, it may be due to using latin1
> encoding instead of utf8 or whatever your encoding is (look in the
> header of api.tex).
Indeed, the package is called twice, once with utf8, once with latin1; I haven't really looked into it.

Does the latex part of the docstring work ? Eg, I didn't find a way to add a comment into latex and make it work with :lm:eqn, but I am not sure about the syntax (and didn't find any examples in numpy/scipy sources).
Thanks,

David

From david at ar.media.kyoto-u.ac.jp Tue Apr 24 09:11:35 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 24 Apr 2007 22:11:35 +0900
Subject: [SciPy-dev] Disappearing websites and unlicensed code
In-Reply-To: <20070424070742.GR6933@mentat.za.net>
References: <462D30EB.1090606@twilley.org> <20070424070742.GR6933@mentat.za.net>
Message-ID: <462E0207.1000400@ar.media.kyoto-u.ac.jp>

Stefan van der Walt wrote:
> Hi Jack
>
> On Mon, Apr 23, 2007 at 03:19:23PM -0700, Jack Twilley wrote:
>> I was looking for some examples on using Python's wave module to read
>> and write WAV files and came across the tutorial formerly found at
>> http://scipy.mit.edu/tutorials/wave.pdf -- I say formerly found because
>> I haven't been able to get to that website since I first found the link
>> via Google.
>
> Unfortunately, I don't know the answer to your question. But I can
> point you in the direction of David's pyaudiolab, which you may find
> useful nonetheless:
>
> http://www.ar.media.kyoto-u.ac.jp/members/david/softwares/pyaudiolab/
>
This reminds me I can import it into scikits, now :)

David

From ondrej at certik.cz Tue Apr 24 09:20:50 2007
From: ondrej at certik.cz (Ondrej Certik)
Date: Tue, 24 Apr 2007 15:20:50 +0200
Subject: [SciPy-dev] SciPy improvements
In-Reply-To: <461D4069.8070101@gmail.com>
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <461D4069.8070101@gmail.com>
Message-ID: <85b5c3130704240620x24ac89e9yc1c8dcbb016320de@mail.gmail.com>

> Register an account with the scipy Trac (click "Register" in the upper-right
> corner):
>
> http://projects.scipy.org/scipy/scipy
>
> Then make a new ticket and attach your patch to that. Submit enough patches, and
> we'll just give you SVN access.

Hi,

the patch is here:

http://projects.scipy.org/scipy/scipy/ticket/402

it should be enough to apply it in the scipy root dir.

Notes: I created a new module nonlin, and put the solvers and tests there (I adapted them to the scipy test framework). I am interested in criticisms, like whether I should rather put it into the optimize module (I think optimization is a different field), or what else should be done.

I have a question about how you work when developing scipy? I do:

1) I play in the trunk, implement something

2) in the scipy root dir I execute:

rm -rf ../dist/; ./setup.py install --prefix ../dist

3) in the parent directory, I execute this short script, which tests that my change in 1) works fine:

import sys
sys.path.insert(0, "dist/lib/python2.4/site-packages/")
import scipy
if scipy.version.release:
    # the original raised a bare string; an Exception instance is safer
    raise Exception("The svn version was not imported! Fix your paths")
from scipy import nonlin
nonlin.test()

However, I am interested if you have some better approach.

Thanks,
Ondrej

From david.huard at gmail.com Tue Apr 24 09:40:47 2007
From: david.huard at gmail.com (David Huard)
Date: Tue, 24 Apr 2007 09:40:47 -0400
Subject: [SciPy-dev] Status of the doc format for scipy code ?
In-Reply-To: <462E01B0.6020609@ar.media.kyoto-u.ac.jp>
References: <462C69E5.9070905@ar.media.kyoto-u.ac.jp> <20070423090915.GM6933@mentat.za.net> <462C8606.9090004@ar.media.kyoto-u.ac.jp> <20070423202409.GQ6933@mentat.za.net> <91cf711d0704240608l38592dcbg3fd00cb3021d3b83@mail.gmail.com> <462E01B0.6020609@ar.media.kyoto-u.ac.jp>
Message-ID: <91cf711d0704240640w789d34f1m1c2c937750f6b232@mail.gmail.com>

Here is a patch to the latest docutils svn implementing the math role from Jens. You can look at docutils/sandbox/jens/latex-math/test/test.txt for an example.
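(A quick way to check the role from Python -- a sketch; publish_string is standard docutils, and it assumes the patched docutils is first on sys.path so that the math role is available:)

# Run a snippet using the math role through the docutils LaTeX writer.
from docutils.core import publish_string

src = "The energy is :math:`E = m c^2`."
print(publish_string(src, writer_name='latex'))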
Note that I changed the name of the role from latex-math to math, so you'll have to replace occurrences of latex-math by math. Then run

rst2latex test.txt | pdflatex

Cheers,

David

2007/4/24, David Cournapeau :
>
> David Huard wrote:
> >
> > > > http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/HOWTO_DOCUMENT.txt
> > > Do you know how to process it ? I tried epydoc beta, and it does not
> > > work on a simple example, and dies on numpy sources...
> >
> > I updated epydoc and the patch I wrote a while ago to parse something
> > very near the scipy template is now outdated.
> >
> > On the other hand, simple examples should work...
> >
> > Do you get a latex error or a python error ?
> Well, actually, I was just stupid, and forgot to set __docformat__ to
> reST before processing my python files... Now, I can get them processed.
> >
> > If you get a latex inputenc error, it may be due to using latin1
> > encoding instead of utf8 or whatever your encoding is (look in the
> > header of api.tex).
> Indeed, the package is called twice, once with utf8, once with latin1; I
> haven't really looked into it.
>
> Does the latex part of the docstring work ? Eg, I didn't find a way to
> add a comment into latex and make it work with :lm:eqn, but I am not
> sure about the syntax (and didn't find any examples in numpy/scipy
> sources).
>
> Thanks,
>
> David
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: latex-math.patch
Type: text/x-patch
Size: 23110 bytes
Desc: not available
URL:

From robert.kern at gmail.com Tue Apr 24 12:37:39 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 24 Apr 2007 11:37:39 -0500
Subject: [SciPy-dev] Dataset for examples and license
In-Reply-To: <462DFFA0.6060806@ar.media.kyoto-u.ac.jp>
References: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp> <462DFFA0.6060806@ar.media.kyoto-u.ac.jp>
Message-ID: <462E3253.6010604@gmail.com>

David Cournapeau wrote:
> Well, I guess once scipy is modularized and can be installed package by
> package, having a package dataset ala R would be nice. For now, I have a
> small python script which converts those datasets to hdf5, so they can be
> read easily from python, and if including them in scipy is OK
> license-wise, I can easily add the data as a package for distribution
> (the compressed, pickled, related data takes ~ 100 kb).

I'm fiddling around with a convention for data packages. Let's suppose we have a namespace package scipydata. Each data package would be a subpackage under scipydata. It would provide some conventionally-named metadata to describe the dataset (`__doc__` to describe the dataset in prose, `source`, `copyright`, etc.) and a load() callable that would load the dataset and return a dictionary with its data.
Particularly, it would be provide utilities to read some kind of configuration file or environment variable which establishes a cache directory such that large datasets can be downloaded from a website once and loaded from disk thereafter. The scipydata packages could then be distributed extremely easily as eggs, and getting your dataset would be as simple as $ easy_install scipydata.cournapeaus_data Does that sound good to you? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From peridot.faceted at gmail.com Tue Apr 24 17:07:50 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Tue, 24 Apr 2007 17:07:50 -0400 Subject: [SciPy-dev] Possible bug in scipy.stats.kurtosistest() In-Reply-To: <97670e910704240149u60078b8bmeafb53c06fc9f844@mail.gmail.com> References: <97670e910704240149u60078b8bmeafb53c06fc9f844@mail.gmail.com> Message-ID: On 24/04/07, Rudolph van der Merwe wrote: > I think I found a bug in the kurtosistest function in the scipy.stats > module. The relevant file is scipy/stats/stats.py You're probably right; the statistics tools are in desperate need of review. In fact, from your nice detailed descrpition, I'm quite convinced there is a bug. > Is there a more formal way of filing this bug than posting a message > on this forum? There is the scipy Trac bug tracker. I'm afraid you have to log in to report bugs. I've filed this one for you, at http://projects.scipy.org/scipy/scipy/ticket/403 Thank you for letting us know! Anne M. Archibald From oliphant at ee.byu.edu Tue Apr 24 18:05:50 2007 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Tue, 24 Apr 2007 16:05:50 -0600 Subject: [SciPy-dev] Possible bug in scipy.stats.kurtosistest() In-Reply-To: <97670e910704240149u60078b8bmeafb53c06fc9f844@mail.gmail.com> References: <97670e910704240149u60078b8bmeafb53c06fc9f844@mail.gmail.com> Message-ID: <462E7F3E.1000003@ee.byu.edu> Rudolph van der Merwe wrote: >Since kurtosistest is used by the scipy.stats.normaltest function >(which does a D'Agostino-Pearson normality test), this bug affects >that function's correctness as well. > >Is there a more formal way of filing this bug than posting a message > > >on this forum > There is the bug-tracker Trac pages which requires spam-avoiding registration but which helps us not lose sight of important problems. On the other hand, bringing the problem to the attention of a developer is usually the best way to get it solved. Based on your recommendations, this bug has been fixed. Thank you for your help. -Travis From david at ar.media.kyoto-u.ac.jp Tue Apr 24 22:06:28 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 25 Apr 2007 11:06:28 +0900 Subject: [SciPy-dev] Dataset for examples and license In-Reply-To: <462E3253.6010604@gmail.com> References: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp> <462DFFA0.6060806@ar.media.kyoto-u.ac.jp> <462E3253.6010604@gmail.com> Message-ID: <462EB7A4.4030804@ar.media.kyoto-u.ac.jp> Robert Kern wrote: > > I'm fiddling around with a convention for data packages. Let's suppose we have a > namespace package scipydata. Each data package would be a subpackage under > scipydata. It would provide some conventionally-named metadata to describe the > dataset (`__doc__` to describe the dataset in prose, `source`, `copyright`, > etc.) and a load() callable that would load the dataset and return a dictionary > with its data. 
The load() callable could do whatever it needs to load the data. > It might just return objects that are defined in code (e.g. numpy.array([...])) > if they are small enough. Or it might read a CSV, NetCDF4, or HDF5 file that is > included in the package. Or it might download something from a website or FTP site. > > The scipydata.util package would provide some utilities to help writing > scipydata packages. Particularly, it would be provide utilities to read some > kind of configuration file or environment variable which establishes a cache > directory such that large datasets can be downloaded from a website once and > loaded from disk thereafter. > > The scipydata packages could then be distributed extremely easily as eggs, and > getting your dataset would be as simple as > > $ easy_install scipydata.cournapeaus_data > > Does that sound good to you? I don't see any problem with that approach, and I am sure you know much better than me how to organize things for easy distribution. I think everybody agreeing on one file format is important (I have a preference for hdf5, since it is well supported under python through pytables, and has a full C api). For really small dataset, CSV could be OK. Would scipydata be in scipy ? (I am asking again for license reasons :) ). David From david at ar.media.kyoto-u.ac.jp Tue Apr 24 22:20:22 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 25 Apr 2007 11:20:22 +0900 Subject: [SciPy-dev] scikits developement conventions ? Message-ID: <462EBAE6.1000906@ar.media.kyoto-u.ac.jp> Hi, Now that scikits is finally in place, I would like to know what the "conventions" are for development. The only webpage I found for info is the following: https://projects.scipy.org/scipy/scikits/. Is that right ? My questions are: - who has write access to the repository ? - how does one create a new project (people with write access can create a project ?) cheers, David From steve at shrogers.com Tue Apr 24 23:01:55 2007 From: steve at shrogers.com (Steven H. Rogers) Date: Tue, 24 Apr 2007 21:01:55 -0600 Subject: [SciPy-dev] Dataset for examples and license In-Reply-To: <462E3253.6010604@gmail.com> References: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp> <462DFFA0.6060806@ar.media.kyoto-u.ac.jp> <462E3253.6010604@gmail.com> Message-ID: <462EC4A3.8030405@shrogers.com> Robert Kern wrote: > David Cournapeau wrote: > >> Well, I guess once scipy is modularized and can be installed package by >> package, having a package dataset ala R would be nice. For now, I have a >> small python script which convert those dataset to hdf5, so they can be >> read easily from python, and if including them to scipy is OK >> license-wise, I can easily add the data as a package for distribution >> (the compressed, pickled, related data takes ~ 100 kb). > > I'm fiddling around with a convention for data packages. Let's suppose we have a > namespace package scipydata. Each data package would be a subpackage under > scipydata. It would provide some conventionally-named metadata to describe the > dataset (`__doc__` to describe the dataset in prose, `source`, `copyright`, > etc.) and a load() callable that would load the dataset and return a dictionary > with its data. The load() callable could do whatever it needs to load the data. > It might just return objects that are defined in code (e.g. numpy.array([...])) > if they are small enough. Or it might read a CSV, NetCDF4, or HDF5 file that is > included in the package. Or it might download something from a website or FTP site. 
> > The scipydata.util package would provide some utilities to help writing > scipydata packages. Particularly, it would be provide utilities to read some > kind of configuration file or environment variable which establishes a cache > directory such that large datasets can be downloaded from a website once and > loaded from disk thereafter. > > The scipydata packages could then be distributed extremely easily as eggs, and > getting your dataset would be as simple as > > $ easy_install scipydata.cournapeaus_data > > Does that sound good to you? > Yes, it does. # Steve From robert.kern at gmail.com Wed Apr 25 01:37:13 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 25 Apr 2007 00:37:13 -0500 Subject: [SciPy-dev] Dataset for examples and license In-Reply-To: <462EB7A4.4030804@ar.media.kyoto-u.ac.jp> References: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp> <462DFFA0.6060806@ar.media.kyoto-u.ac.jp> <462E3253.6010604@gmail.com> <462EB7A4.4030804@ar.media.kyoto-u.ac.jp> Message-ID: <462EE909.8040801@gmail.com> David Cournapeau wrote: > I don't see any problem with that approach, and I am sure you know much > better than me how to organize things for easy distribution. I think > everybody agreeing on one file format is important (I have a preference > for hdf5, since it is well supported under python through pytables, and > has a full C api). I don't agree. My design goal was to be able to expose a single interface (load()) in front of any file format or data source. I imagined that many of the data sources would be from other packages that are out of our direct control and which we did not want to copy-and-paste into our own repository. > For really small dataset, CSV could be OK. > > Would scipydata be in scipy ? (I am asking again for license reasons :) ). No, it would be a separate namespace package. Each scipydata subpackage could specify its own license. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Wed Apr 25 01:41:58 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 25 Apr 2007 00:41:58 -0500 Subject: [SciPy-dev] scikits developement conventions ? In-Reply-To: <462EBAE6.1000906@ar.media.kyoto-u.ac.jp> References: <462EBAE6.1000906@ar.media.kyoto-u.ac.jp> Message-ID: <462EEA26.6000509@gmail.com> David Cournapeau wrote: > Hi, > > Now that scikits is finally in place, I would like to know what the > "conventions" are for development. The only webpage I found for info is > the following: https://projects.scipy.org/scipy/scikits/. Is that right ? That's it at the moment. You can also look at my responses in the thread "mlabwrap scikit". I'll try to flesh out the page this week. > My questions are: > - who has write access to the repository ? Ask Jeff Strunk for access. Tell him I said to give you access. > - how does one create a new project (people with write access can > create a project ?) svn add, essentially. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From david at ar.media.kyoto-u.ac.jp Wed Apr 25 02:42:40 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 25 Apr 2007 15:42:40 +0900 Subject: [SciPy-dev] Dataset for examples and license In-Reply-To: <462EE909.8040801@gmail.com> References: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp> <462DFFA0.6060806@ar.media.kyoto-u.ac.jp> <462E3253.6010604@gmail.com> <462EB7A4.4030804@ar.media.kyoto-u.ac.jp> <462EE909.8040801@gmail.com> Message-ID: <462EF860.1040406@ar.media.kyoto-u.ac.jp> Robert Kern wrote: > David Cournapeau wrote: >> I don't see any problem with that approach, and I am sure you know much >> better than me how to organize things for easy distribution. I think >> everybody agreeing on one file format is important (I have a preference >> for hdf5, since it is well supported under python through pytables, and >> has a full C api). > > I don't agree. My design goal was to be able to expose a single interface > (load()) in front of any file format or data source. I imagined that many of the > data sources would be from other packages that are out of our direct control and > which we did not want to copy-and-paste into our own repository. Ah, ok, I was more in the "r spirit" and their dataset package. Using your approach is actually more general: we can always have one package with "basic" datasets if this is seen useful, with our own data. David From matthieu.brucher at gmail.com Wed Apr 25 05:17:50 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 25 Apr 2007 11:17:50 +0200 Subject: [SciPy-dev] Updated generic optimizers proposal In-Reply-To: References: Message-ID: Since nobody answered to this mail, I submitted the last developments to the TRAC : http://projects.scipy.org/scipy/scipy/ticket/405 I added some conjugate grasient steps : PRP, CW, D and DY, and a special optimizer that can modify the set of parameters before ans after an iteration - this is useful when a set of parameters has some invariants, or to add noise at each iteration, ... - Matthieu 2007/4/18, Matthieu Brucher : > > Hi, > > I'm lauching a new thread, the last was pretty big, and as I almost put > every advice in this proposal, I thought it would be better. > First, I used scipy coding standard, I hope I didn't forget something. > > I do not know where it would be put at the moment on my scipy tree, and > the tests are visual for the moment, I have to make them automatic, but I do > not know the framework used by scipy, I have to check it first. > > So, the proposal : > - combining several objects to make an optimizer > - a function should be an object defining the __call__ method and graient, > hessian, ... if needed. It can be passed as several separate functions as > Alan suggested it, a new object is then created > - an optimizer is a combination of a function, a step_kind, a line_search, > a criterion and a starting point x0. > - the result of the optimization is return after a call to the optimize() > method > - every object (step or line_search) saves its modification in a state > variable in the optimizer. This variable can be accessed if needed after the > optimization. 
> - after each iteration, a record function is called with this state > variable - it is a dict, BTW -, if you want to save the whole dict, don't > forget to copy it, as it is modified during the optimization > > For the moment are implemented : > - a standard algorithm, only calls step_kind then line_search for a new > candidate - the next optimizer would be one that calls a modifying function > on the computed result, that can be useful in some cases - > - criteria : > - monotony criterion : the cost is decreasing - a factor can be used to > allow an error - > - relative value criterion : the relative value error is higher than a > fixed error > - absolute value criterion : the same with the absolute error > - step : > - gradient step > - Newton step > - Fletcher-Reeves conjugate gradient step - other conjugate gradient will > be available - > - line search : > - no line search, just take the step > - damped search, it's an inexact line search, that searches in the step > direction a set of parameters than decreases the cost by dividing by two the > step size while the cost is not decreasing > - Golden section search > - Fibonacci search > > I'm not pulling other criterion, step or line search, as my time is finite > when doing a structural change. > > There are 3 classic optimization test functions in the package, > Rosenbrock, Powell and a quadratic function, feel free to try them. > Sometimes, the optimizer converges to the true minimum, sometimes it does > not, I tried to propose several solutions to show that every combinaison > does not manage to find the minimum. > > Matthieu > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From william.ratcliff at gmail.com Wed Apr 25 05:36:12 2007 From: william.ratcliff at gmail.com (william ratcliff) Date: Wed, 25 Apr 2007 05:36:12 -0400 Subject: [SciPy-dev] Updated generic optimizers proposal In-Reply-To: References: Message-ID: <827183970704250236r9c39769w25c53d4f6d1d29df@mail.gmail.com> say, who's responsible for the anneal portion of optimize? I'd like to check in a minor tweak which implements simple upper and bounds on the fit parameters. Thanks, William On 4/18/07, Matthieu Brucher wrote: > > Hi, > > I'm lauching a new thread, the last was pretty big, and as I almost put > every advice in this proposal, I thought it would be better. > First, I used scipy coding standard, I hope I didn't forget something. > > I do not know where it would be put at the moment on my scipy tree, and > the tests are visual for the moment, I have to make them automatic, but I do > not know the framework used by scipy, I have to check it first. > > So, the proposal : > - combining several objects to make an optimizer > - a function should be an object defining the __call__ method and graient, > hessian, ... if needed. It can be passed as several separate functions as > Alan suggested it, a new object is then created > - an optimizer is a combination of a function, a step_kind, a line_search, > a criterion and a starting point x0. > - the result of the optimization is return after a call to the optimize() > method > - every object (step or line_search) saves its modification in a state > variable in the optimizer. This variable can be accessed if needed after the > optimization. 
> - after each iteration, a record function is called with this state > variable - it is a dict, BTW -, if you want to save the whole dict, don't > forget to copy it, as it is modified during the optimization > > For the moment are implemented : > - a standard algorithm, only calls step_kind then line_search for a new > candidate - the next optimizer would be one that calls a modifying function > on the computed result, that can be useful in some cases - > - criteria : > - monotony criterion : the cost is decreasing - a factor can be used to > allow an error - > - relative value criterion : the relative value error is higher than a > fixed error > - absolute value criterion : the same with the absolute error > - step : > - gradient step > - Newton step > - Fletcher-Reeves conjugate gradient step - other conjugate gradient will > be available - > - line search : > - no line search, just take the step > - damped search, it's an inexact line search, that searches in the step > direction a set of parameters than decreases the cost by dividing by two the > step size while the cost is not decreasing > - Golden section search > - Fibonacci search > > I'm not pulling other criterion, step or line search, as my time is finite > when doing a structural change. > > There are 3 classic optimization test functions in the package, > Rosenbrock, Powell and a quadratic function, feel free to try them. > Sometimes, the optimizer converges to the true minimum, sometimes it does > not, I tried to propose several solutions to show that every combinaison > does not manage to find the minimum. > > Matthieu > > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Wed Apr 25 06:47:29 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Wed, 25 Apr 2007 12:47:29 +0200 Subject: [SciPy-dev] Dataset for examples and license In-Reply-To: <462EE909.8040801@gmail.com> References: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp> <462DFFA0.6060806@ar.media.kyoto-u.ac.jp> <462E3253.6010604@gmail.com> <462EB7A4.4030804@ar.media.kyoto-u.ac.jp> <462EE909.8040801@gmail.com> Message-ID: <20070425104728.GE6933@mentat.za.net> On Wed, Apr 25, 2007 at 12:37:13AM -0500, Robert Kern wrote: > David Cournapeau wrote: > > I don't see any problem with that approach, and I am sure you know much > > better than me how to organize things for easy distribution. I think > > everybody agreeing on one file format is important (I have a preference > > for hdf5, since it is well supported under python through pytables, and > > has a full C api). > > I don't agree. My design goal was to be able to expose a single interface > (load()) in front of any file format or data source. I imagined that many of the > data sources would be from other packages that are out of our direct control and > which we did not want to copy-and-paste into our own repository. I like the generic 'load()' approach. I often work with large image datasets, where you never want to load the whole thing into memory at once. The above interface would allow me to construct a cached dictionary, which only returns an image on request. 
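Concretely, a minimal subpackage under this convention might look like the sketch below (every name here is hypothetical, and the caching is reduced to a module-level dict; a real package would use the proposed scipydata.util helpers to locate a cache directory and fetch `source` into it):

    """Synthetic grayscale test images (a stand-in dataset for this sketch)."""
    # hypothetical contents of scipydata/images/__init__.py
    source = "http://www.example.org/images.tar.gz"   # placeholder, not a real URL
    copyright = "Public domain"

    _cache = {}

    def load():
        """Return a dict holding the dataset, reading each item at most once."""
        if not _cache:
            # A real package would download `source` into the cache directory
            # on first use; to keep the sketch self-contained we just build a
            # small array in code.
            import numpy
            _cache['image0'] = numpy.zeros((64, 64))
        return _cache

Callers only ever see load(), which is what keeps in-code arrays, CSV, HDF5, and downloaded sources interchangeable.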
Regards Stéfan

From david at ar.media.kyoto-u.ac.jp Wed Apr 25 07:08:32 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 25 Apr 2007 20:08:32 +0900 Subject: [SciPy-dev] Dataset for examples and license In-Reply-To: <462EE909.8040801@gmail.com> References: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp> <462DFFA0.6060806@ar.media.kyoto-u.ac.jp> <462E3253.6010604@gmail.com> <462EB7A4.4030804@ar.media.kyoto-u.ac.jp> <462EE909.8040801@gmail.com> Message-ID: <462F36B0.2000603@ar.media.kyoto-u.ac.jp> Robert Kern wrote: > David Cournapeau wrote: >> I don't see any problem with that approach, and I am sure you know much >> better than me how to organize things for easy distribution. I think >> everybody agreeing on one file format is important (I have a preference >> for hdf5, since it is well supported under python through pytables, and >> has a full C api). > > I don't agree. My design goal was to be able to expose a single interface > (load()) in front of any file format or data source. I imagined that many of the > data sources would be from other packages that are out of our direct control and > which we did not want to copy-and-paste into our own repository. > >> For really small dataset, CSV could be OK. >> >> Would scipydata be in scipy ? (I am asking again for license reasons :) ). > > No, it would be a separate namespace package. Each scipydata subpackage could > specify its own license. >

Ok, I set up something really trivial, so that we can start discussing the details before releasing anything. It is available here (bzr archive): http://www.ar.media.kyoto-u.ac.jp/members/david/archives/scipydata/scipydata.dev This basically defines two subpackages of scipydata, iris and oldfaithful. Each dataset being small, they are defined in python files. In ipython, you can do:

    >> from scipydata import iris
    >> ?iris
    Type:           module
    Base Class:     <type 'module'>
    String Form:    <module 'scipydata.iris' from '/usr/media/boulot/src/sigtools/scdata/scipydata/iris/__init__.py'>
    Namespace:      Interactive
    File:           /usr/media/boulot/src/sigtools/scdata/scipydata/iris/__init__.py
    Docstring:
        This famous (Fisher's or Anderson's) iris data set gives the
        measurements in centimeters of the variables sepal length and width
        and petal length and width, respectively, for 50 flowers from each
        of 3 species of iris. The species are Iris setosa, versicolor, and
        virginica.

    >> data = iris.load()

Something which would be nice is that when you type iris, you get the docstring, e.g. something like a __repr__ method, but for a module. I don't know if this is possible (it may have undesirable effects, too). Also, I don't know whether it is worthwhile to have something like a DataInfo class which holds all the metadata, so that you can get everything at once (a la help(faithful) in R, for people who know R). cheers, David

From openopt at ukr.net Wed Apr 25 07:33:20 2007 From: openopt at ukr.net (dmitrey) Date: Wed, 25 Apr 2007 14:33:20 +0300 Subject: [SciPy-dev] Updated generic optimizers proposal In-Reply-To: <827183970704250236r9c39769w25c53d4f6d1d29df@mail.gmail.com> References: <827183970704250236r9c39769w25c53d4f6d1d29df@mail.gmail.com> Message-ID: <462F3C80.5030202@ukr.net>

Hello! I have been accepted into the GSoC program with a project related to scipy and optimization. If no one else here is able to or has the time, I would take a look (however, I can't spend much time before the summer starts because of my exams; and anneal (as well as other global solvers) is not my specialty).
I think lb-ub bounds can hardly be implemented in a simple way, because the result depends very much on the quality of the random point generator, and that generator needs to be much better than a simple lb+rand*(ub-lb); otherwise all points will be located in a thin area near their average value (the same problem is present in the integration of functions f: R^n -> R in high dimensions, n >> 1). I took a look at the generators by Joachim Vandekerckhove in his anneal (connected to my openopt for MATLAB/Octave); they seem to be too primitive. BTW, AFAIK anneal is currently considered deprecated; there are better global solvers, for example GRASP-based ones. WBR, D.

william ratcliff wrote: > say, who's responsible for the anneal portion of optimize? I'd like > to check in a minor tweak which implements simple upper and bounds on > the fit parameters. > > Thanks, > William > > On 4/18/07, *Matthieu Brucher* > > wrote: > > Hi, > > I'm lauching a new thread, the last was pretty big, and as I > almost put every advice in this proposal, I thought it would be > better. > First, I used scipy coding standard, I hope I didn't forget > something. > > I do not know where it would be put at the moment on my scipy > tree, and the tests are visual for the moment, I have to make them > automatic, but I do not know the framework used by scipy, I have > to check it first. > > So, the proposal : > - combining several objects to make an optimizer > - a function should be an object defining the __call__ method and > graient, hessian, ... if needed. It can be passed as several > separate functions as Alan suggested it, a new object is then created > - an optimizer is a combination of a function, a step_kind, a > line_search, a criterion and a starting point x0. > - the result of the optimization is return after a call to the > optimize() method > - every object (step or line_search) saves its modification in a > state variable in the optimizer. This variable can be accessed if > needed after the optimization.
Sometimes, the optimizer converges to the true minimum, > sometimes it does not, I tried to propose several solutions to > show that every combinaison does not manage to find the minimum. > > Matthieu > > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > > > > ------------------------------------------------------------------------ > > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > From david at ar.media.kyoto-u.ac.jp Wed Apr 25 07:43:09 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 25 Apr 2007 20:43:09 +0900 Subject: [SciPy-dev] Status of the doc format for scipy code ? In-Reply-To: <91cf711d0704240640w789d34f1m1c2c937750f6b232@mail.gmail.com> References: <462C69E5.9070905@ar.media.kyoto-u.ac.jp> <20070423090915.GM6933@mentat.za.net> <462C8606.9090004@ar.media.kyoto-u.ac.jp> <20070423202409.GQ6933@mentat.za.net> <91cf711d0704240608l38592dcbg3fd00cb3021d3b83@mail.gmail.com> <462E01B0.6020609@ar.media.kyoto-u.ac.jp> <91cf711d0704240640w789d34f1m1c2c937750f6b232@mail.gmail.com> Message-ID: <462F3ECD.5060701@ar.media.kyoto-u.ac.jp> David Huard wrote: > Here is a patch to the latest docutils svn implementing the math role > from Jens. > > You can look at docutils/sandbox/jens/latex-math/test/test.txt for an > example. Note that I changed the name of the role from latex-math to > math, so you'll have to replace occurrences of latex-math by math. > Then run > > rst2latex test.txt | pdflatex > Excuse my ignorance, but how do I use it in a docstring ? Does epydoc uses the docutils' rest parser ? David From guyer at nist.gov Wed Apr 25 08:32:59 2007 From: guyer at nist.gov (Jonathan Guyer) Date: Wed, 25 Apr 2007 08:32:59 -0400 Subject: [SciPy-dev] Status of the doc format for scipy code ? In-Reply-To: <462F3ECD.5060701@ar.media.kyoto-u.ac.jp> References: <462C69E5.9070905@ar.media.kyoto-u.ac.jp> <20070423090915.GM6933@mentat.za.net> <462C8606.9090004@ar.media.kyoto-u.ac.jp> <20070423202409.GQ6933@mentat.za.net> <91cf711d0704240608l38592dcbg3fd00cb3021d3b83@mail.gmail.com> <462E01B0.6020609@ar.media.kyoto-u.ac.jp> <91cf711d0704240640w789d34f1m1c2c937750f6b232@mail.gmail.com> <462F3ECD.5060701@ar.media.kyoto-u.ac.jp> Message-ID: On Apr 25, 2007, at 7:43 AM, David Cournapeau wrote: > Does epydoc > uses the docutils' rest parser ? Yes. From william.ratcliff at gmail.com Wed Apr 25 12:04:47 2007 From: william.ratcliff at gmail.com (william ratcliff) Date: Wed, 25 Apr 2007 12:04:47 -0400 Subject: [SciPy-dev] Updated generic optimizers proposal In-Reply-To: <462F3C80.5030202@ukr.net> References: <827183970704250236r9c39769w25c53d4f6d1d29df@mail.gmail.com> <462F3C80.5030202@ukr.net> Message-ID: <827183970704250904j117ca962m5383980d8fb1d4b8@mail.gmail.com> The 'simple' way applies only to the anneal algorithm in scipy. When one chooses steps in a simulated annealing algorithm, there is always the question of how to step from the current point. For anneal, it is currently done based on an upper bound and lower bound (in one option). However, there is nothing to prevent the searcher from crawling its way out of the box. When most people imagine themselves searching in a bounded parameter space, that is not the expected behavior. Now, it is possible that the optimum solution is truly outside of the box and that the searcher is doing the right thing. 
However, if that is not the case, then there is a problem. So, what is one to do? The first obvious thing to try is to say, if you reach the edge of a bounded parameter, stay there. However, that is not ideal as you get stuck and can't explore the rest of the phase space. So, I use the simple heuristic that if a trial move is to take you outside of the box, simply stay where you are. In the next cycle, try to move again. This will keep you in the box, and if there is truly a solution outside of the box, will still move you towards the walls and let you know that maybe you've set your bounds improperly. Now, there are questions of efficiency. For example, how easy is it to get out of corners? Should one do reflections? However, I believe that my rather simple heuristic will preserve detailed balance and results in an algorithm that has the expected behavior and is better than having no option ;> As for deprecation--is it really true that scipy.optimize.anneal is deprecated? As for issues of this global optimizer or that global optimizer, why not let the user decide based on their expectations of their fitting surface? For some truly glassy surfaces, one is forced into techniques like simulated annealing, parrallel tempering, genetic algorithms, etc. and I imagine that their relative performance is based strongly on the particular problem that their are trying to solve. Cheers, WIlliam Ratcliff On 4/25/07, dmitrey wrote: > > Hallo! > I have been accepted for participating in the GSoC program with project > related to scipy and optimization. > If noone else here is able or have no time, I would take a look > (however, I can't spend much time before summer start because of my > exams; and anneal (as well as other global solvers) is not my specialty). > > I think that lb-ub bounds can hardly be implemented in a simple way > because it depends very much on rand points generator quality, and the > latter should be much more better than simple lb+rand*(ub-lb) elseware > all points will be located in a thin area near their average value (same > problem is present in integration of functions f: R^n->R with high > dimensions (n>>1) ). > I took a look at the generators by Joachim Vandekerckhove in his anneal > (connected to my openopt for MATLAB/Octave), they seems to be too > primitive. > BTW afaik anneal currenlty is concerned as deprecated (I don't know > better English word, not "up-to-date", old one), there are better global > solvers, for example GRASP-based. > WBR, D. > > william ratcliff wrote: > > say, who's responsible for the anneal portion of optimize? I'd like > > to check in a minor tweak which implements simple upper and bounds on > > the fit parameters. > > > > Thanks, > > William > > > > On 4/18/07, *Matthieu Brucher* > > wrote: > > > > Hi, > > > > I'm lauching a new thread, the last was pretty big, and as I > > almost put every advice in this proposal, I thought it would be > > better. > > First, I used scipy coding standard, I hope I didn't forget > > something. > > > > I do not know where it would be put at the moment on my scipy > > tree, and the tests are visual for the moment, I have to make them > > automatic, but I do not know the framework used by scipy, I have > > to check it first. > > > > So, the proposal : > > - combining several objects to make an optimizer > > - a function should be an object defining the __call__ method and > > graient, hessian, ... if needed. 
It can be passed as several > > separate functions as Alan suggested it, a new object is then > created > > - an optimizer is a combination of a function, a step_kind, a > > line_search, a criterion and a starting point x0. > > - the result of the optimization is return after a call to the > > optimize() method > > - every object (step or line_search) saves its modification in a > > state variable in the optimizer. This variable can be accessed if > > needed after the optimization. > > - after each iteration, a record function is called with this > > state variable - it is a dict, BTW -, if you want to save the > > whole dict, don't forget to copy it, as it is modified during the > > optimization > > > > For the moment are implemented : > > - a standard algorithm, only calls step_kind then line_search for > > a new candidate - the next optimizer would be one that calls a > > modifying function on the computed result, that can be useful in > > some cases - > > - criteria : > > - monotony criterion : the cost is decreasing - a factor can be > > used to allow an error - > > - relative value criterion : the relative value error is higher > > than a fixed error > > - absolute value criterion : the same with the absolute error > > - step : > > - gradient step > > - Newton step > > - Fletcher-Reeves conjugate gradient step - other conjugate > > gradient will be available - > > - line search : > > - no line search, just take the step > > - damped search, it's an inexact line search, that searches in > > the step direction a set of parameters than decreases the cost by > > dividing by two the step size while the cost is not decreasing > > - Golden section search > > - Fibonacci search > > > > I'm not pulling other criterion, step or line search, as my time > > is finite when doing a structural change. > > > > There are 3 classic optimization test functions in the package, > > Rosenbrock, Powell and a quadratic function, feel free to try > > them. Sometimes, the optimizer converges to the true minimum, > > sometimes it does not, I tried to propose several solutions to > > show that every combinaison does not manage to find the minimum. > > > > Matthieu > > > > _______________________________________________ > > Scipy-dev mailing list > > Scipy-dev at scipy.org > > http://projects.scipy.org/mailman/listinfo/scipy-dev > > > > > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > Scipy-dev mailing list > > Scipy-dev at scipy.org > > http://projects.scipy.org/mailman/listinfo/scipy-dev > > > > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Apr 25 12:18:42 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 25 Apr 2007 11:18:42 -0500 Subject: [SciPy-dev] Updated generic optimizers proposal In-Reply-To: <827183970704250904j117ca962m5383980d8fb1d4b8@mail.gmail.com> References: <827183970704250236r9c39769w25c53d4f6d1d29df@mail.gmail.com> <462F3C80.5030202@ukr.net> <827183970704250904j117ca962m5383980d8fb1d4b8@mail.gmail.com> Message-ID: <462F7F62.7070000@gmail.com> william ratcliff wrote: > As for deprecation--is it really true that > scipy.optimize.anneal is deprecated? I think Dmitrey simply meant that no one has maintained it in a while, which is true. 
To answer your question, no one has claimed stewardship of anneal, so please suggest modifications as you think would help. We appreciate your contributions. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From openopt at ukr.net Wed Apr 25 13:25:18 2007 From: openopt at ukr.net (dmitrey) Date: Wed, 25 Apr 2007 20:25:18 +0300 Subject: [SciPy-dev] Updated generic optimizers proposal In-Reply-To: <827183970704250904j117ca962m5383980d8fb1d4b8@mail.gmail.com> References: <827183970704250236r9c39769w25c53d4f6d1d29df@mail.gmail.com> <462F3C80.5030202@ukr.net> <827183970704250904j117ca962m5383980d8fb1d4b8@mail.gmail.com> Message-ID: <462F8EFE.8070008@ukr.net>

Check for the case:

    objFun(x) = sum(x)
    x0 = [0 0 0 0 0 0 0 0 0 1] (or very small numbers instead of zeros, smaller than typicalDeltaX/50 for example)
    lb = zeros
    ub = ones (or any other)

So if you use a random shift for all coords (x = x_prev + deltaX, all coords of deltaX random), the probability of "move" is 2^(-9) = 1/512 and the probability of "stay" is 1 - 2^(-9) = 511/512. This is valid for the current update_guess() from anneal:

    class fast_sa(base_schedule):
        def update_guess(self, x0):
            x0 = asarray(x0)
            u = squeeze(random.uniform(0.0, 1.0, size=self.dims))
            T = self.T
            y = sign(u-0.5)*T*((1+1.0/T)**abs(2*u-1)-1.0)
            xc = y*(self.upper - self.lower)  # so xc = deltaX changes ALL coords
            xnew = x0 + xc
            return xnew

    class cauchy_sa(base_schedule):
        def update_guess(self, x0):
            x0 = asarray(x0)
            numbers = squeeze(random.uniform(-pi/2, pi/2, size=self.dims))
            xc = self.learn_rate * self.T * tan(numbers)
            xnew = x0 + xc  # ALSO modifies ALL coords
            return xnew

    class boltzmann_sa(base_schedule):
        def update_guess(self, x0):
            std = minimum(sqrt(self.T)*ones(self.dims),
                          (self.upper-self.lower)/3.0/self.learn_rate)
            x0 = asarray(x0)
            xc = squeeze(random.normal(0, 1.0, size=self.dims))
            xnew = x0 + xc*std*self.learn_rate  # ALSO modifies ALL coords
            return xnew

If you use a random shift for 1 coord only (sequential), there can be other problems.

WBR, D.

william ratcliff wrote: > The 'simple' way applies only to the anneal algorithm in scipy. When > one chooses steps in a simulated annealing algorithm, there is always > the question of how to step from the current point. For anneal, it is > currently done based on an upper bound and lower bound (in one > option). However, there is nothing to prevent the searcher from > crawling its way out of the box. When most people imagine themselves > searching in a bounded parameter space, that is not the expected > behavior. Now, it is possible that the optimum solution is truly > outside of the box and that the searcher is doing the right thing. > However, if that is not the case, then there is a problem. So, what > is one to do? The first obvious thing to try is to say, if you reach > the edge of a bounded parameter, stay there. However, that is not > ideal as you get stuck and can't explore the rest of the phase space.
However, I believe that > my rather simple heuristic will preserve detailed balance and results > in an algorithm that has the expected behavior and is better than > having no option ;> > > As for deprecation--is it really true that > scipy.optimize.anneal is deprecated? > > As for issues of this global optimizer or that global optimizer, why > not let the user decide based on their expectations of their fitting > surface? For some truly glassy surfaces, one is forced into > techniques like simulated annealing, parrallel tempering, genetic > algorithms, etc. and I imagine that their relative performance is > based strongly on the particular problem that their are trying to solve. > > Cheers, > WIlliam Ratcliff From millman at berkeley.edu Wed Apr 25 14:34:48 2007 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 25 Apr 2007 11:34:48 -0700 Subject: [SciPy-dev] scikits developement conventions ? In-Reply-To: <462EEA26.6000509@gmail.com> References: <462EBAE6.1000906@ar.media.kyoto-u.ac.jp> <462EEA26.6000509@gmail.com> Message-ID: On 4/24/07, Robert Kern wrote: > > David Cournapeau wrote: > > Now that scikits is finally in place, I would like to know what the > > "conventions" are for development. The only webpage I found for info is > > the following: https://projects.scipy.org/scipy/scikits/. Is that right > ? > > That's it at the moment. You can also look at my responses in the thread > "mlabwrap scikit". I'll try to flesh out the page this week. > Last week Alexander Schmolck and I started converting mlabwrap to the be the first scikits project. We ran short on time and didn't get everything finished. I will be adding the project to svn later this week. Alex also started working on the distilling the scipy-dev threads on the wiki page. Once Robert fleshes it out a little better, it should make more sense. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From william.ratcliff at gmail.com Wed Apr 25 19:19:26 2007 From: william.ratcliff at gmail.com (william ratcliff) Date: Wed, 25 Apr 2007 19:19:26 -0400 Subject: [SciPy-dev] Updated generic optimizers proposal In-Reply-To: <462F8EFE.8070008@ukr.net> References: <827183970704250236r9c39769w25c53d4f6d1d29df@mail.gmail.com> <462F3C80.5030202@ukr.net> <827183970704250904j117ca962m5383980d8fb1d4b8@mail.gmail.com> <462F8EFE.8070008@ukr.net> Message-ID: <827183970704251619o4a635428q8875608b36870d03@mail.gmail.com>

What I suggest is simply a slightly modified version of what's in anneal.py, to add the following:

    class simple_sa(base_schedule):
        def init(self, **options):
            self.__dict__.update(options)
            if self.m is None:
                self.m = 1.0
            if self.n is None:
                self.n = 1.0
            self.c = self.m * exp(-self.n * self.quench)

        def update_guess(self, x0):
            x0 = asarray(x0)
            T = self.T
            myFlag = True
            while myFlag:
                u = squeeze(random.uniform(0.0, 1.0, size=self.dims))
                y = sign(u-0.5)*T*((1+1.0/T)**abs(2*u-1)-1.0)
                xc = y*(self.upper - self.lower)
                xt = x0 + xc
                indu = where(xt > self.upper)  # find where it goes above the upper bounds
                indl = where(xt < self.lower)  # find where it goes below the lower bounds
                # NOTE: the archive scrubbed the rest of this post after the
                # "<" above; presumably the loop exits once no component
                # violates either bound, along these lines:
                if len(indu[0]) == 0 and len(indl[0]) == 0:
                    myFlag = False
            return xt

On 4/25/07, dmitrey wrote:
> Check for the case:
> objFun(x) = sum(x)
> x0 = [0 0 0 0 0 0 0 0 0 1] (or very small numbers instead of zeros,
> smaller than typicalDeltaX/50 for example)
> lb = zeros
> ub = ones (or any other)
>
> so if you use random shift for all coords (x = x_prev + deltaX, all
> coords of deltaX are random), the probability of "move" is 2^(-9)=1/512
> and probability of "stay" is 1-2^(-9) = 511/512.
> this is valid for current update_guess() from anneal
>
> class fast_sa(base_schedule):
>     def update_guess(self, x0):
>         x0 = asarray(x0)
>         u = squeeze(random.uniform(0.0, 1.0, size=self.dims))
>         T = self.T
>         y = sign(u-0.5)*T*((1+1.0/T)**abs(2*u-1)-1.0)
>         xc = y*(self.upper - self.lower)
>         xnew = x0 + xc
>         return xnew
>
> class cauchy_sa(base_schedule):
>     def update_guess(self, x0):
>         x0 = asarray(x0)
>         numbers = squeeze(random.uniform(-pi/2, pi/2, size=self.dims))
>         xc = self.learn_rate * self.T * tan(numbers)
>         xnew = x0 + xc
>         return xnew
>
> class boltzmann_sa(base_schedule):
>     def update_guess(self, x0):
>         std = minimum(sqrt(self.T)*ones(self.dims),
>                       (self.upper-self.lower)/3.0/self.learn_rate)
>         x0 = asarray(x0)
>         xc = squeeze(random.normal(0, 1.0, size=self.dims))
>         xnew = x0 + xc*std*self.learn_rate
>         return xnew
>
> If you use random shift for 1 coord only (sequential) there can be other
> problems.
>
> WBR, D.
>
> william ratcliff wrote:
> > The 'simple' way applies only to the anneal algorithm in scipy. When
> > one chooses steps in a simulated annealing algorithm, there is always
> > the question of how to step from the current point. For anneal, it is
> > currently done based on an upper bound and lower bound (in one
> > option). However, there is nothing to prevent the searcher from
> > crawling its way out of the box. When most people imagine themselves
> > searching in a bounded parameter space, that is not the expected
> > behavior. Now, it is possible that the optimum solution is truly
> > outside of the box and that the searcher is doing the right thing.
> > However, if that is not the case, then there is a problem. So, what
> > is one to do? The first obvious thing to try is to say, if you reach
> > the edge of a bounded parameter, stay there. However, that is not
> > ideal as you get stuck and can't explore the rest of the phase space.
> > So, I use the simple heuristic that if a trial move is to take you > > outside of the box, simply stay where you are. In the next cycle, try > > to move again. This will keep you in the box, and if there is truly > > a solution outside of the box, will still move you towards the walls > > and let you know that maybe you've set your bounds improperly. Now, > > there are questions of efficiency. For example, how easy is it to get > > out of corners? Should one do reflections? However, I believe that > > my rather simple heuristic will preserve detailed balance and results > > in an algorithm that has the expected behavior and is better than > > having no option ;> > > > > As for deprecation--is it really true that > > scipy.optimize.anneal is deprecated? > > > > As for issues of this global optimizer or that global optimizer, why > > not let the user decide based on their expectations of their fitting > > surface? For some truly glassy surfaces, one is forced into > > techniques like simulated annealing, parrallel tempering, genetic > > algorithms, etc. and I imagine that their relative performance is > > based strongly on the particular problem that their are trying to solve. > > > > Cheers, > > WIlliam Ratcliff > > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Thu Apr 26 05:25:29 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 26 Apr 2007 18:25:29 +0900 Subject: [SciPy-dev] Status of the doc format for scipy code ? In-Reply-To: <91cf711d0704240640w789d34f1m1c2c937750f6b232@mail.gmail.com> References: <462C69E5.9070905@ar.media.kyoto-u.ac.jp> <20070423090915.GM6933@mentat.za.net> <462C8606.9090004@ar.media.kyoto-u.ac.jp> <20070423202409.GQ6933@mentat.za.net> <91cf711d0704240608l38592dcbg3fd00cb3021d3b83@mail.gmail.com> <462E01B0.6020609@ar.media.kyoto-u.ac.jp> <91cf711d0704240640w789d34f1m1c2c937750f6b232@mail.gmail.com> Message-ID: <46307009.3030307@ar.media.kyoto-u.ac.jp> David Huard wrote: > Here is a patch to the latest docutils svn implementing the math role > from Jens. > > You can look at docutils/sandbox/jens/latex-math/test/test.txt for an > example. Note that I changed the name of the role from latex-math to > math, so you'll have to replace occurrences of latex-math by math. > Then run > > rst2latex test.txt | pdflatex Thanks, this is working great ! I have another questions regarding epydoc's usage. First, when I have a Examples section in my docstring, it is put before :Parameters: in the output html, even if I put the example section after in the docstring. Why is that ? David From matthieu.brucher at gmail.com Thu Apr 26 06:43:07 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 26 Apr 2007 12:43:07 +0200 Subject: [SciPy-dev] Which matrix library in C++ for scipy Message-ID: Hello, I'm wondering what C++ matrix library I should for future inclusion of code in scipy. What I want to do is to port a class that can do Parzen window or K-neighboors, and it uses my own matrix library. Should I use Blitz ++ ? Matthieu -------------- next part -------------- An HTML attachment was scrubbed... 
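Stripped of anneal.py's scheduler plumbing, the heuristic debated in this thread is just rejection of out-of-bounds proposals. A self-contained sketch (the function and names below are illustrative, not scipy code; with max_tries=1 it behaves as the stay-where-you-are variant William describes):

    import numpy as np

    def bounded_step(x0, T, lower, upper, max_tries=1):
        """Propose a 'fast'-schedule annealing move, rejecting any move
        that leaves [lower, upper]; if every attempt is rejected, stay put."""
        x0 = np.asarray(x0, dtype=float)
        for _ in range(max_tries):
            u = np.random.uniform(0.0, 1.0, size=x0.shape)
            y = np.sign(u - 0.5) * T * ((1.0 + 1.0/T)**np.abs(2*u - 1) - 1.0)
            xt = x0 + y * (upper - lower)
            if np.all(xt >= lower) and np.all(xt <= upper):
                return xt
        return x0  # every proposal left the box: stay where we are

    # dmitrey's corner case: nine coordinates sitting on the lower bound.
    x0 = np.array([0.0]*9 + [0.5])
    trials = [bounded_step(x0, T=1.0, lower=0.0, upper=1.0) for _ in range(4096)]
    stuck = sum(np.all(t == x0) for t in trials)
    print(stuck / 4096.0)   # roughly 0.999, close to dmitrey's 1 - 2**-9 estimate

Whether one should stay put on rejection (the single-try kernel William argues preserves detailed balance) or resample until an in-bounds point appears (what the while loop in his posted code does) is exactly the trade-off the two posters are arguing about.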
URL: From ndbecker2 at gmail.com Thu Apr 26 06:47:02 2007 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 26 Apr 2007 06:47:02 -0400 Subject: [SciPy-dev] Which matrix library in C++ for scipy References: Message-ID: Matthieu Brucher wrote: > Hello, > > I'm wondering what C++ matrix library I should for future inclusion of > code in scipy. What I want to do is to port a class that can do Parzen > window or K-neighboors, and it uses my own matrix library. Should I use > Blitz ++ ? > At this point, the best choice for myself has been using boost::python with boost::ublas. I also use boost::multi_array. From matthieu.brucher at gmail.com Thu Apr 26 07:24:56 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 26 Apr 2007 13:24:56 +0200 Subject: [SciPy-dev] Which matrix library in C++ for scipy In-Reply-To: References: Message-ID: > At this point, the best choice for myself has been using boost::python > with > boost::ublas. I also use boost::multi_array. > Does scipy depend on Boost ? In that case, I can use it. I posted a message in the user list some time ago about neighboors algorithms in Python, and I had a link to BioPython, based on C++. I could use it, but don't want to install BioPython, and I'm willing to contribute my own version - I have it for a long time, and as I'm migrating toward Python for my work, much more easy to use - to scipy. In that perspective, I don't want to have a dependency on something that is not already there, and in scipy and in my own C++ code - that works very well ;) -. Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Thu Apr 26 07:23:33 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 26 Apr 2007 20:23:33 +0900 Subject: [SciPy-dev] Which matrix library in C++ for scipy In-Reply-To: References: Message-ID: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> Matthieu Brucher wrote: > > At this point, the best choice for myself has been using > boost::python with > boost::ublas. I also use boost::multi_array. > > > Does scipy depend on Boost ? In that case, I can use it. > I posted a message in the user list some time ago about neighboors > algorithms in Python, and I had a link to BioPython, based on C++. I > could use it, but don't want to install BioPython, and I'm willing to > contribute my own version - I have it for a long time, and as I'm > migrating toward Python for my work, much more easy to use - to scipy. > In that perspective, I don't want to have a dependency on something > that is not already there, and in scipy and in my own C++ code - that > works very well ;) -. scipy does certainly not depend on Boost. Blitz is used in weave, but I don't know if this is mandatory (and it includes the blitz library); I think that actually, a C++ compiler is not even mandatory. Maybe I am missing something, but why the need for a matrix library for basic neighbors algorithm ? David From matthieu.brucher at gmail.com Thu Apr 26 07:43:39 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 26 Apr 2007 13:43:39 +0200 Subject: [SciPy-dev] Which matrix library in C++ for scipy In-Reply-To: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> Message-ID: > > scipy does certainly not depend on Boost. Blitz is used in weave, but I > don't know if this is mandatory (and it includes the blitz library); I > think that actually, a C++ compiler is not even mandatory. 
Maybe I am > missing something, but why the need for a matrix library for basic > neighbors algorithm ? That's right, I do not have the utility for a fully-fludged matrix library, but basic stuff yes. For a neighbooring algorithm, I can use basic computations, norms, ... Yes, I could program them directly, but if there is already something in scipy, no use to do it again ;) Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Thu Apr 26 07:44:19 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 26 Apr 2007 20:44:19 +0900 Subject: [SciPy-dev] Which matrix library in C++ for scipy In-Reply-To: References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> Message-ID: <46309093.5020809@ar.media.kyoto-u.ac.jp> Matthieu Brucher wrote: > > scipy does certainly not depend on Boost. Blitz is used in weave, > but I > don't know if this is mandatory (and it includes the blitz > library); I > think that actually, a C++ compiler is not even mandatory. Maybe I am > missing something, but why the need for a matrix library for basic > neighbors algorithm ? > > > That's right, I do not have the utility for a fully-fludged matrix > library, but basic stuff yes. For a neighbooring algorithm, I can use > basic computations, norms, ... > Yes, I could program them directly, but if there is already something > in scipy, no use to do it again ;) I don't think there is anything like that in scipy. Something which could be useful would be to have a C++ class which reflects a numpy array, for seamless integration between eg boost.python and numpy. But that would be quite a challenge to get it right, I think. David From ndbecker2 at gmail.com Thu Apr 26 08:58:11 2007 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 26 Apr 2007 08:58:11 -0400 Subject: [SciPy-dev] Which matrix library in C++ for scipy References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> <46309093.5020809@ar.media.kyoto-u.ac.jp> Message-ID: David Cournapeau wrote: > Matthieu Brucher wrote: >> >> scipy does certainly not depend on Boost. Blitz is used in weave, >> but I >> don't know if this is mandatory (and it includes the blitz >> library); I >> think that actually, a C++ compiler is not even mandatory. Maybe I am >> missing something, but why the need for a matrix library for basic >> neighbors algorithm ? >> >> >> That's right, I do not have the utility for a fully-fludged matrix >> library, but basic stuff yes. For a neighbooring algorithm, I can use >> basic computations, norms, ... >> Yes, I could program them directly, but if there is already something >> in scipy, no use to do it again ;) > I don't think there is anything like that in scipy. Something which > could be useful would be to have a C++ class which reflects a numpy > array, for seamless integration between eg boost.python and numpy. But > that would be quite a challenge to get it right, I think. > > David There is numerical interface in boost::python. I don't use this approach myself. Here's why. I write all basic algorithms in c++. I try to use modern, generic programming when writing them. There is AFAIK, no reasonable way to interface such code to numerical/numpy. The C interface to numpy is too low-level. IOW, I like writing in c++, and I don't want to have to write code at such a low-level interface as would be needed to interface to numpy. So my approach is: 1. Write c++ algorithms with generic interfaces (where feasible). 2. 
When it is not feasible to use generic container types, I use boost::ublas::{vector/matrix} explicitly. 3. The above c++ code is parametrized (templated) on the container types. 4. Explicit instantiations of (3) are then exposed to python, normally specifying ublas::{vector/matrix} as the input/output types. This doesn't, of course, directly interoperate with numpy. I can, however, convert between numpy arrays and ublas::matrix (which currently requires copying the data, unfortunately). From matthieu.brucher at gmail.com Thu Apr 26 08:59:43 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 26 Apr 2007 14:59:43 +0200 Subject: [SciPy-dev] Which matrix library in C++ for scipy In-Reply-To: <46309093.5020809@ar.media.kyoto-u.ac.jp> References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> <46309093.5020809@ar.media.kyoto-u.ac.jp> Message-ID: > > I don't think there is anything like that in scipy. Something which > could be useful would be to have a C++ class which reflects a numpy > array, for seamless integration between eg boost.python and numpy. But > that would be quite a challenge to get it right, I think. > OK, that confirms what I thought - nothing is in scipy at the moment -. Well, I hope someone will have time and patience to make this bridge in the future ;) Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Thu Apr 26 09:03:47 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 26 Apr 2007 15:03:47 +0200 Subject: [SciPy-dev] Which matrix library in C++ for scipy In-Reply-To: References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> <46309093.5020809@ar.media.kyoto-u.ac.jp> Message-ID: > > There is numerical interface in boost::python. > > I don't use this approach myself. Here's why. I write all basic > algorithms > in c++. I try to use modern, generic programming when writing them. Same for me ;) There is AFAIK, no reasonable way to interface such code to numerical/numpy. > The C interface to numpy is too low-level. IOW, I like writing in c++, > and > I don't want to have to write code at such a low-level interface as would > be needed to interface to numpy. > > So my approach is: > > 1. Write c++ algorithms with generic interfaces (where feasible). > 2. When it is not feasible to use generic container types, I use > boost::ublas::{vector/matrix} explicitly. > 3. The above c++ code is parametrized (templated) on the container types. > 4. Explicit instantiations of (3) are then exposed to python, normally > specifying ublas::{vector/matrix} as the input/output types. > > This doesn't, of course, directly interoperate with numpy. I can, > however, > convert between numpy arrays and ublas::matrix (which currently requires > copying the data, unfortunately). What I'm doing after the last thread I started on numpy (SWIG, boost.pythonor ctypes) is using ctypes, so I have my headers, I make a C/C++ bridge and I use ctypes. If you have a (good) boost.python/whatever example for template instantiation and use in Python then, I'm sure a lot of people will thank you for the rest of your life - and even after ;) - Matthieu -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From david at ar.media.kyoto-u.ac.jp Thu Apr 26 09:07:13 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 26 Apr 2007 22:07:13 +0900 Subject: [SciPy-dev] Which matrix library in C++ for scipy In-Reply-To: References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> <46309093.5020809@ar.media.kyoto-u.ac.jp> Message-ID: <4630A401.3050008@ar.media.kyoto-u.ac.jp> Neal Becker wrote: > David Cournapeau wrote: >>> y, but if there is already something >>> in scipy, no use to do it again ;) >> I don't think there is anything like that in scipy. Something which >> could be useful would be to have a C++ class which reflects a numpy >> array, for seamless integration between eg boost.python and numpy. But >> that would be quite a challenge to get it right, I think. >> >> David > > There is numerical interface in boost::python. > > I don't use this approach myself. Here's why. I write all basic algorithms > in c++. I try to use modern, generic programming when writing them.

Well, I myself try to avoid C++ like the plague :) What I meant was to have a C++ class which reflects the numpy array in a sensible manner. For example, "automatic" memory management (at least as automatic as it can get in C++), having the member functions of the C class reflected in C++. The C api of numpy already defines a class, reflected in python. The job would be to do the same in C++. Eg:

    dtype d("float")
    narray A((5, 4), d), B((4, 5), d)
    narray C()
    C = dot(A, B)

Of course, to do it right is difficult and would require someone who knows both numpy and C++ very well. David

From ndbecker2 at gmail.com Thu Apr 26 09:26:23 2007 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 26 Apr 2007 09:26:23 -0400 Subject: [SciPy-dev] Which matrix library in C++ for scipy References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> <46309093.5020809@ar.media.kyoto-u.ac.jp> Message-ID: Matthieu Brucher wrote: >> >> There is numerical interface in boost::python. >> >> I don't use this approach myself. Here's why. I write all basic >> algorithms >> in c++. I try to use modern, generic programming when writing them. > > > > Same for me ;) > > > There is AFAIK, no reasonable way to interface such code to > numerical/numpy. >> The C interface to numpy is too low-level. IOW, I like writing in c++, >> and >> I don't want to have to write code at such a low-level interface as would >> be needed to interface to numpy. >> >> So my approach is: >> >> 1. Write c++ algorithms with generic interfaces (where feasible). >> 2. When it is not feasible to use generic container types, I use >> boost::ublas::{vector/matrix} explicitly. >> 3. The above c++ code is parametrized (templated) on the container types. >> 4. Explicit instantiations of (3) are then exposed to python, normally >> specifying ublas::{vector/matrix} as the input/output types. >> >> This doesn't, of course, directly interoperate with numpy. I can, >> however, >> convert between numpy arrays and ublas::matrix (which currently requires >> copying the data, unfortunately). > > > What I'm doing after the last thread I started on numpy (SWIG, > boost.python or ctypes) is using ctypes, so I have my headers, I make a > C/C++ bridge and > I use ctypes. > If you have a (good) boost.python/whatever example for template > instantiation and use in Python then, I'm sure a lot of people will thank > you for the rest of your life - and even after ;) - > > Matthieu

I've got loads of examples. Good is subjective. What kind of example would you like?
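While the boost::python example gets settled, the ctypes route Matthieu mentions is easy to sketch on the Python side. Everything below is hypothetical (the shared library, its name, and the C signature), but it shows the shape of such a bridge:

    # assumed C entry point, compiled into libbridge.so:
    #     double sum_sq(const double *x, int n);
    import ctypes
    import numpy as np
    from numpy.ctypeslib import ndpointer

    lib = ctypes.CDLL('./libbridge.so')       # hypothetical compiled C/C++ bridge
    lib.sum_sq.restype = ctypes.c_double
    lib.sum_sq.argtypes = [ndpointer(dtype=np.float64, flags='C_CONTIGUOUS'),
                           ctypes.c_int]

    x = np.arange(5, dtype=np.float64)
    print(lib.sum_sq(x, x.size))              # the C side reads x's buffer directly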
From fperez.net at gmail.com Thu Apr 26 09:36:27 2007 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 26 Apr 2007 07:36:27 -0600 Subject: [SciPy-dev] Which matrix library in C++ for scipy In-Reply-To: References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> <46309093.5020809@ar.media.kyoto-u.ac.jp> Message-ID: On 4/26/07, Neal Becker wrote: > So my approach is: > > 1. Write c++ algorithms with generic interfaces (where feasible). > 2. When it is not feasible to use generic container types, I use > boost::ublas::{vector/matrix} explicitly. > 3. The above c++ code is parametrized (templated) on the container types. > 4. Explicit instantiations of (3) are then exposed to python, normally > specifying ublas::{vector/matrix} as the input/output types. > > This doesn't, of course, directly interoperate with numpy. I can, however, > convert between numpy arrays and ublas::matrix (which currently requires > copying the data, unfortunately). I'm curious (being a rather primitive C++ user) as to why you don't like/use/prefer Blitz++ for this particular use? Blitz arrays are fairly numpy-like in much of their behavior, and one can be instantiated out of a numpy array with minimal cost (copying only the striding info, not the actual data). That's what weave uses both for weave.blitz and for weave.inline when type_converters=blitz is passed. Thanks, f From ndbecker2 at gmail.com Thu Apr 26 09:44:43 2007 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 26 Apr 2007 09:44:43 -0400 Subject: [SciPy-dev] Which matrix library in C++ for scipy References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> <46309093.5020809@ar.media.kyoto-u.ac.jp> Message-ID: Fernando Perez wrote: > On 4/26/07, Neal Becker wrote: > >> So my approach is: >> >> 1. Write c++ algorithms with generic interfaces (where feasible). >> 2. When it is not feasible to use generic container types, I use >> boost::ublas::{vector/matrix} explicitly. >> 3. The above c++ code is parametrized (templated) on the container types. >> 4. Explicit instantiations of (3) are then exposed to python, normally >> specifying ublas::{vector/matrix} as the input/output types. >> >> This doesn't, of course, directly interoperate with numpy. I can, >> however, convert between numpy arrays and ublas::matrix (which currently >> requires copying the data, unfortunately). > > I'm curious (being a rather primitive C++ user) as to why you don't > like/use/prefer Blitz++ for this particular use? Blitz arrays are > fairly numpy-like in much of their behavior, and one can be > instantiated out of a numpy array with minimal cost (copying only the > striding info, not the actual data). That's what weave uses both for > weave.blitz and for weave.inline when type_converters=blitz is passed. > > Thanks, > > f I did try to evaluate this some time back, and don't recall the reasoning - but it may be because blitz++ development appears to have stopped, and I don't want to invest in something that is going to die. The last release was Oct 2005. From matthieu.brucher at gmail.com Thu Apr 26 09:58:43 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 26 Apr 2007 15:58:43 +0200 Subject: [SciPy-dev] Which matrix library in C++ for scipy In-Reply-To: References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> <46309093.5020809@ar.media.kyoto-u.ac.jp> Message-ID: > > I've got loads of examples. Good is subjective. What kind of example > would you like?
> For instance a simple example with an array in input and another one in output :) Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Thu Apr 26 10:06:04 2007 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 26 Apr 2007 08:06:04 -0600 Subject: [SciPy-dev] Which matrix library in C++ for scipy In-Reply-To: References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> <46309093.5020809@ar.media.kyoto-u.ac.jp> Message-ID: On 4/26/07, Neal Becker wrote: > Fernando Perez wrote: > > I'm curious (being a rather primitive C++ user) as to why you don't > > like/use/prefer Blitz++ for this particular use? Blitz arrays are > > fairly numpy-like in much of their behavior, and one can be > > instantiated out of a numpy array with minimal cost (copying only the > > striding info, not the actual data). That's what weave uses both for > > weave.blitz and for weave.inline when type_converters=blitz is passed. > > > > Thanks, > > > > f > > I did try to evaluate this some time back, and don't recall the reasoning - > but it may be because blitz++ development appears to have stopped, and I > don't want to invest in something that is going to die. The last release > was Oct 2005. Fair enough. From a quick scan of their ML archives (I haven't been subscribed for a long time) it seems that there's still /some/ activity, but Blitz has indeed suffered for a long time from lack of solid development. Julian Cummings --the current maintainer-- does his best, but I have the feeling that this is a project that is mostly love-and-spare-time for him, so it understandably gets a small time slice allocation. It's unfortunate, I think, given how well some aspects of numpy arrays map to Blitz ones (esp. the no-copy part). It looks like an opportunity for a good C++ programmer looking for an interesting project to adopt and re-energize. Cheers, f From ndbecker2 at gmail.com Thu Apr 26 10:09:31 2007 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 26 Apr 2007 10:09:31 -0400 Subject: [SciPy-dev] Which matrix library in C++ for scipy References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> <46309093.5020809@ar.media.kyoto-u.ac.jp> Message-ID: Matthieu Brucher wrote: >> >> I've got loads of examples. Good is subjective. What kind of example >> would you like? >> > > For instance a simple example with an array in input and another one in > output :) > > Matthieu boost::python can expose classes as well as functions, but since you are only asking for a (lowly) function, this is a simple example.
First, generic c++ code for computing a difference of 1 sample:

template <typename scalar_t>
inline std::complex<scalar_t> diff_demod (std::complex<scalar_t> z1, std::complex<scalar_t> z2, scalar_t epsilon) {
  return z2 * std::conj (Limit2<std::complex<scalar_t> > (z1, epsilon));
}

Now code which takes 2 input containers and applies the above, returning an output container:

template <typename out_t, typename in1_t, typename in2_t, typename scalar_t>
inline out_t diff_demod2 (in1_t const& in1, in2_t const& in2, scalar_t epsilon) {
  if (boost::size (in1) != boost::size (in2))
    throw std::runtime_error ("diff_demod size mismatch");
  out_t out (boost::size (in1));
  typename boost::range_const_iterator<in1_t>::type i1 = boost::begin(in1);
  typename boost::range_const_iterator<in2_t>::type i2 = boost::begin(in2);
  typename boost::range_iterator<out_t>::type o = boost::begin(out);
  for (; i1 != boost::end (in1); ++i1, ++i2, ++o)
    *o = diff_demod (*i1, *i2, epsilon);
  return out;
}

Another variant:

template <typename out_t, typename in_t, typename scalar_t>
inline out_t diff_demod1 (in_t const& in, scalar_t epsilon) {
  out_t out (boost::size (in));
  typename boost::range_const_iterator<in_t>::type i = boost::begin(in);
  typename boost::range_iterator<out_t>::type o = boost::begin(out);
  typename boost::range_value<in_t>::type prev = 0;
  for (; i != boost::end (in); ++i, ++o) {
    *o = diff_demod (prev, *i, epsilon);
    prev = *i;
  }
  return out;
}

Now expose this to python:

BOOST_PYTHON_MODULE(limit)
{
  def ("Limit", &Compute<std::complex<double>, ublas::vector<std::complex<double> >, double>,
       (arg ("in"), arg ("epsilon")=1e-6));
  def ("DiffDemod2", &diff_demod2<ublas::vector<std::complex<double> >, ublas::vector<std::complex<double> >, ublas::vector<std::complex<double> >, double>,
       (arg ("in1"), arg ("in2"), arg ("epsilon")=1e-6));
  def ("DiffDemod1", &diff_demod1<ublas::vector<std::complex<double> >, ublas::vector<std::complex<double> >, double>,
       (arg ("in"), arg ("epsilon")=1e-6));
}

From david.huard at gmail.com Thu Apr 26 10:31:40 2007 From: david.huard at gmail.com (David Huard) Date: Thu, 26 Apr 2007 10:31:40 -0400 Subject: [SciPy-dev] Status of the doc format for scipy code ? In-Reply-To: <46307009.3030307@ar.media.kyoto-u.ac.jp> References: <462C69E5.9070905@ar.media.kyoto-u.ac.jp> <20070423090915.GM6933@mentat.za.net> <462C8606.9090004@ar.media.kyoto-u.ac.jp> <20070423202409.GQ6933@mentat.za.net> <91cf711d0704240608l38592dcbg3fd00cb3021d3b83@mail.gmail.com> <462E01B0.6020609@ar.media.kyoto-u.ac.jp> <91cf711d0704240640w789d34f1m1c2c937750f6b232@mail.gmail.com> <46307009.3030307@ar.media.kyoto-u.ac.jp> Message-ID: <91cf711d0704260731q6ab61bacvdcaa6a1ade63a5bf@mail.gmail.com> epydoc orders the elements of the docstring according to a hard-coded list. Here is a small patch to epydoc that lets you insert examples as

:Example:
    an example here

    >>> trythis(a)
    [3,4,5]

There is certainly a better way to achieve this but I'm still not aware of it. HTH, David 2007/4/26, David Cournapeau : > > David Huard wrote: > > Here is a patch to the latest docutils svn implementing the math role > > from Jens. > > > > You can look at docutils/sandbox/jens/latex-math/test/test.txt for an > > example. Note that I changed the name of the role from latex-math to > > math, so you'll have to replace occurrences of latex-math by math. > > Then run > > > > rst2latex test.txt | pdflatex > Thanks, this is working great ! I have another question regarding > epydoc's usage. First, when I have an Examples section in my docstring, > it is put before :Parameters: in the output html, even if I put the > example section after in the docstring. Why is that ? > > David > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: example.patch Type: text/x-patch Size: 3920 bytes Desc: not available URL:
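
The section-ordering problem David Huard describes is easiest to see on a concrete docstring. The function below is invented for illustration (it just echoes the trythis example from his message); the point is only the relative order of the :Parameters: and :Example: sections, which epydoc rearranges:

def trythis(a):
    """Add 2 to each element of `a`.

    :Parameters:
        a : array_like
            Input sequence.

    :Example:
        an example here

        >>> trythis([1, 2, 3])
        [3, 4, 5]
    """
    return [x + 2 for x in a]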
From david at ar.media.kyoto-u.ac.jp Fri Apr 27 06:43:52 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 27 Apr 2007 19:43:52 +0900 Subject: [SciPy-dev] Cleaning and fixing fft in scipy ? Message-ID: <4631D3E8.30400@ar.media.kyoto-u.ac.jp> Hi, When fixing the fft problem with prime numbers in numpy, I came again across the fft implementation in scipy, and related problems (suboptimal solution for fftw, see eg scipy ticket #1). I would like to improve the situation: - First, I think the module needs serious cleaning, as for now, it is a bunch of C files with many #define all across the code, making it difficult to track things. I propose to split the sources by implementation (fft_fftw3.c, fft_fftw2.c, fft_mkl.c, etc...); this can be done without any consequence on the implementation. - Then, improving fft where it is needed. Does this sound ok to scipy developers ? I should come up with a patch for zfft within the day; even if I try to keep changes minimal, I would need people willing to test (I can easily test on linux with fftw2 and 3, and if necessary with mkl on the same platform. I would prefer avoiding testing on windows myself, as I have only a Japanese windows, making things extremely painful for me on this already painful platform). cheers, David From robert.kern at gmail.com Fri Apr 27 12:35:02 2007 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 27 Apr 2007 11:35:02 -0500 Subject: [SciPy-dev] Cleaning and fixing fft in scipy ? In-Reply-To: <4631D3E8.30400@ar.media.kyoto-u.ac.jp> References: <4631D3E8.30400@ar.media.kyoto-u.ac.jp> Message-ID: <46322636.6000907@gmail.com> David Cournapeau wrote: > Hi, > > When fixing the fft problem with prime numbers in numpy, I came again > across the fft implementation in scipy, and related problems (suboptimal > solution for fftw, see eg scipy ticket #1). I would like to improve the > situation: > - First, I think the module needs serious cleaning, as for now, it > is a bunch of C files with many #define all across the code, making it > difficult to track things. I propose to split the sources by > implementation (fft_fftw3.c, fft_fftw2.c, fft_mkl.c, etc...); this can be > done without any consequence on the implementation. > - Then, improving fft where it is needed. > > Does this sound ok to scipy developers ? Go for it. Thank you! -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From oliphant.travis at ieee.org Fri Apr 27 14:54:13 2007 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri, 27 Apr 2007 12:54:13 -0600 Subject: [SciPy-dev] Cleaning and fixing fft in scipy ? In-Reply-To: <4631D3E8.30400@ar.media.kyoto-u.ac.jp> References: <4631D3E8.30400@ar.media.kyoto-u.ac.jp> Message-ID: <463246D5.6040303@ieee.org> David Cournapeau wrote: > Hi, > > When fixing the fft problem with prime numbers in numpy, I came again > across the fft implementation in scipy, and related problems (suboptimal > solution for fftw, see eg scipy ticket #1). I would like to improve the > situation: > - First, I think the module needs serious cleaning, as for now, it > is a bunch of C files with many #define all across the code, making it > difficult to track things.
Could you be more clear about where the problem in your eyes lies? There are multiple sources for the fft (original fftpack files + interface files to other fft libraries if the user has those installed). The ifdefs that I could find are just in the interface files to the "other" fft libraries that a user might have installed. I don't care if that is redone so that the setup.py file just uses different sources as opposed to defining pre-processor variables, but you might check with Pearu since he is the author of those interfaces. But, I also don't really see the problem with the way it is done now. > I propose to split the sources by > implementation (fft_fftw3.c, fft_fftw2.c, fft_mkl.c, etc...); this can be > done without any consequence on the implementation. > - Then, improving fft where it is needed. > Please indicate what improvements you will be making. The fft's are performed by external libraries; I'm hesitant to start altering what those libraries are doing without clear justification. If you want to improve the interface to the optional fftw and mkl libraries that is one thing, but just saying you are going to "fix the fft implementation" is not reassuring. -Travis From oliphant.travis at ieee.org Fri Apr 27 15:04:36 2007 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri, 27 Apr 2007 13:04:36 -0600 Subject: [SciPy-dev] Cleaning and fixing fft in scipy ? In-Reply-To: <463246D5.6040303@ieee.org> References: <4631D3E8.30400@ar.media.kyoto-u.ac.jp> <463246D5.6040303@ieee.org> Message-ID: <46324944.7010208@ieee.org> Travis Oliphant wrote: > Could you be more clear about where the problem in your eyes lies? > There are multiple sources for the fft (original fftpack files + > interface files to other fft libraries if the user has those installed).
>> >> The ifdefs that I could find are just in the interface files to the >> "other" fft libraries that a user might have installed. I don't care if >> that is redone so that the setup.py file just uses different sources as >> opposed to defining pre-processor variables, but you might check with >> Pearu since he is the author of those interfaces. >> >> But, I also don't really see the problem with the way it is done now. >> > > I don't mean to discourage effort here. I'm just hesitant to start > changing things without some kind of feedback from the original author > (Pearu in this case). I agree, and that's exactly why I sent this email. > > I just want to make sure he is aware of and agrees with changes that are > being made. I know Pearu spent a fair bit of time creating the current > fft interfaces. They may not be perfect but they are quite flexible and > allow people to use fft's from multiple libraries. I don't intend to change anything in the API. The reasoning is that scipy's implementation of fft is suboptimal when using fftw3: it was really slow for some time, and I applied a quick fix, which is neither efficient nor elegant (see ticket #1 of scipy trac: http://projects.scipy.org/scipy/scipy/ticket/1). This patch took me a long time to get right because of the way the fft sources are organized. I understand that style is a matter of preferences more than anything else, and I am certainly not a reference, but I really think that with the following kind of code, it is really difficult to understand what's going on:

int i;
complex_double *ptr = inout;
#ifndef WITH_MKL
#if defined(WITH_FFTW)
fftw_plan plan = NULL;
#endif
#endif
#if defined WITH_MKL
DFTI_DESCRIPTOR_HANDLE desc_handle;
#else
double* wsave = NULL;
#endif
#ifdef WITH_FFTWORK
coef_dbl* coef = NULL;
#endif
#ifndef WITH_MKL
#ifdef WITH_DJBFFT
int j;
complex_double *ptrc = NULL;
unsigned int *f = NULL;
#endif
#endif
#ifdef WITH_FFTWORK
if (ispow2le2e30(n)) {
    i = get_cache_id_zfftwork(n);
    coef = caches_zfftwork[i].coef;
} else
#endif
#ifndef WITH_MKL
#ifdef WITH_DJBFFT
switch (n) {
case 2:;case 4:;case 8:;case 16:;case 32:;case 64:;case 128:;case 256:;
case 512:;case 1024:;case 2048:;case 4096:;case 8192:
    i = get_cache_id_zdjbfft(n);
    f = caches_zdjbfft[i].f;
    ptrc = (complex_double*)caches_zdjbfft[i].ptr;
}
if (f==0)
#endif
#endif
#ifdef WITH_MKL
desc_handle = caches_zmklfft[get_cache_id_zmklfft(n)].desc_handle;
#elif defined WITH_FFTW
plan = caches_zfftw[get_cache_id_zfftw(n,direction)].plan;
#else
wsave = caches_zfftpack[get_cache_id_zfftpack(n)].wsave;
#endif

I think it is much more readable to have one file with (fftw3)

"""
complex_double *ptr = inout;
fftw_complex *ptrm = NULL;
fftw_plan plan = NULL;
int i;

plan = caches_zfftw[get_cache_id_zfftw(n, dir)].plan;
"""

One with (fftw case)

"""
int i;
complex_double *ptr = inout;
fftw_plan plan = NULL;

plan = caches_zfftw[get_cache_id_zfftw(n, direction)].plan;
"""

Etc... I guess the #ifdef solution was OK with one or two implementations, but now, I fail to see how anyone can understand what the function does, which variables it is using, etc... To improve the fftw3 implementation, I think we must have a better caching system, and I don't see how I could do that with the way the source is organized right now. > > Most of the fft behavior comes from those multiple libraries. So, I > just want to be clear about which fft implementation and interface is > being modified and exactly what the problems are.
No interface would be changed; actually, I wouldn't be surprised if the compiled object code were exactly the same with my suggestion for cleaning up (To be sure that I don't introduce bugs, I actually compare each new file with the preprocessed original file with gcc -E). David From pearu at cens.ioc.ee Sat Apr 28 06:06:42 2007 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Sat, 28 Apr 2007 13:06:42 +0300 (EEST) Subject: [SciPy-dev] Cleaning and fixing fft in scipy ? In-Reply-To: <4632F4CD.5010302@ar.media.kyoto-u.ac.jp> References: <4631D3E8.30400@ar.media.kyoto-u.ac.jp> <463246D5.6040303@ieee.org> <46324944.7010208@ieee.org> <4632F4CD.5010302@ar.media.kyoto-u.ac.jp> Message-ID: <58658.84.50.128.236.1177754802.squirrel@cens.ioc.ee> On Sat, April 28, 2007 10:16 am, David Cournapeau wrote: > Travis Oliphant wrote: >> I don't mean to discourage effort here. I'm just hesitant to start >> changing things without some kind of feedback from the original author >> (Pearu in this case). > I agree, and that's exactly why I sent this email. I certainly agree that the scipy.fftpack implementation can be improved (as any piece of software). The current implementation was an attempt to support a large number of different fft implementations (they all have their own pros-and-cons) that had (sometimes very) different APIs. And hence the complexity of the current implementation. If I were to start rewriting scipy.fftpack, then I would probably choose a different approach to the current one too. However, I doubt that any other implementation with the same goal would be easy for non-authors to understand or read, due to the variety of APIs of the different fft implementations. In summary, I am not against rewriting scipy.fftpack provided that
1) it is carried out in scipy.sandbox until it becomes more-or-less equivalent to the current scipy.fftpack in terms of features, unit-testing, supported fft backends, and documentation.
2) there are performance tests demonstrating that the new implementation outperforms the old one at least with the following backends: Fortran fftpack (the default for most users), fftw2, and fftw3.
Finally, I would also like to note that the efficiency of scipy.fftpack caching should be tested with both short sequences (N=64,...,512) and long sequences (N>1024), as the same scheme could be very dependent on the sequence size as far as performance is concerned. Best regards, Pearu From pearu at cens.ioc.ee Sat Apr 28 06:14:16 2007 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Sat, 28 Apr 2007 13:14:16 +0300 (EEST) Subject: [SciPy-dev] Cleaning and fixing fft in scipy ? In-Reply-To: <58658.84.50.128.236.1177754802.squirrel@cens.ioc.ee> References: <4631D3E8.30400@ar.media.kyoto-u.ac.jp> <463246D5.6040303@ieee.org> <46324944.7010208@ieee.org> <4632F4CD.5010302@ar.media.kyoto-u.ac.jp> <58658.84.50.128.236.1177754802.squirrel@cens.ioc.ee> Message-ID: <58855.84.50.128.236.1177755256.squirrel@cens.ioc.ee> On Sat, April 28, 2007 1:06 pm, Pearu Peterson wrote: > In summary, I am not against rewriting scipy.fftpack provided that > 1) it is carried out in scipy.sandbox until it becomes more-or-less > equivalent to the current scipy.fftpack in terms of features, > unit-testing, supported fft backends, and documentation. I now reread your original proposal and I think that this condition can be relaxed provided that after applying any patches to scipy.fftpack all fftpack unittests pass ok.

Pearu
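
Pearu's request for performance tests over both short and long transforms might look something like the following sketch; the sizes and repeat counts are arbitrary choices for illustration, not scipy's actual benchmark code:

import timeit

# Time scipy.fftpack.fft for short (64..512) and long (>1024) sizes.
# Taking min() over repeats gives the least noisy estimate; 1000 calls
# taking t seconds works out to t milliseconds per call.
for n in (64, 256, 512, 2048, 8192):
    setup = ("from scipy.fftpack import fft\n"
             "from numpy.random import rand\n"
             "x = rand(%d) + 1j*rand(%d)" % (n, n))
    best = min(timeit.Timer("fft(x)", setup).repeat(repeat=3, number=1000))
    print "n = %5d: %.3f ms per fft" % (n, best)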
From david at ar.media.kyoto-u.ac.jp Sat Apr 28 06:52:18 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 28 Apr 2007 19:52:18 +0900 Subject: [SciPy-dev] Cleaning and fixing fft in scipy ? In-Reply-To: <58855.84.50.128.236.1177755256.squirrel@cens.ioc.ee> References: <4631D3E8.30400@ar.media.kyoto-u.ac.jp> <463246D5.6040303@ieee.org> <46324944.7010208@ieee.org> <4632F4CD.5010302@ar.media.kyoto-u.ac.jp> <58658.84.50.128.236.1177754802.squirrel@cens.ioc.ee> <58855.84.50.128.236.1177755256.squirrel@cens.ioc.ee> Message-ID: <46332762.10701@ar.media.kyoto-u.ac.jp> Pearu Peterson wrote: > On Sat, April 28, 2007 1:06 pm, Pearu Peterson wrote: > >> In summary, I am not against rewriting scipy.fftpack provided that >> 1) it is carried out in scipy.sandbox until it becomes more-or-less >> equivalent to the current scipy.fftpack in terms of features, >> unit-testing, supported fft backends, and documentation. > > I now reread your original proposal and I think that this condition > can be relaxed provided that after applying any patches to > scipy.fftpack all fftpack unittests pass ok. Thanks for your email, Pearu. I would certainly not submit a patch which does not pass the unit tests; the problem being that I cannot test all implementations (I can test fftpack/fftw2/fftw3 on Linux easily; I cannot get djbfft to compile on my machine; if necessary, I can install the MKL; I don't know whether OS and compiler matter a lot for fftpack, as I don't intend to touch the setup part). I think that I was not really clear in my former emails: I do not suggest a rewrite, but a two-step improvement:
- one which does not change anything in the code except its organization (one file per fft API instead of the #ifdefs)
- once the first step is done and considered acceptable by the scipy developers, I would like to improve fftw3, so that it is at least on par with fftw2: right now, for fftw3, the arrays are copied twice.
I think > fftw3 requires a more sophisticated caching scheme, because with fftw3 > you cannot use the same plan when the input/output are changed, at least > with the basic API. For example, Octave is using such a scheme. Other > implementations would be untouched. I am ok with both steps. Let us know when the first step is complete so that we can test that the interface is working for different backends. Pearu From a.schmolck at gmx.net Mon Apr 30 04:43:54 2007 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Mon, 30 Apr 2007 09:43:54 +0100 Subject: [SciPy-dev] mlabwrap high-level user interface In-Reply-To: <1e2af89e0704300058k3ee12257j1be5399bcc15639d@mail.gmail.com> (Matthew Brett's message of "Mon\, 30 Apr 2007 08\:58\:43 +0100") References: <796269930704300050k272ba290t30d60098a9a72b3a@mail.gmail.com> <1e2af89e0704300058k3ee12257j1be5399bcc15639d@mail.gmail.com> Message-ID: "Matthew Brett" writes: > Hi guys, > > I like the interface. We might have to think of a different module > name though, because matplotlib has a 'mlab' submodule of the same > name, for matlab compatibility functions, I don't think we need to worry too much about this -- IMO using a plotting package like matplotlib to obtain matlab-like functions (rather than numpy, scipy or whatever) is pretty yucky anyway and people can easily use either an ``import as`` or a a fully qualified name (``matplotlib.mlab`` or ``mlabwrap.mlab``). alex From ondrej at certik.cz Mon Apr 30 05:07:14 2007 From: ondrej at certik.cz (Ondrej Certik) Date: Mon, 30 Apr 2007 11:07:14 +0200 Subject: [SciPy-dev] patch implementing broyden schemes in SciPy Message-ID: <85b5c3130704300207pe900d9areeecdbb3e930c4d3@mail.gmail.com> Hi, I submitted a patch for Broyden schemes together with tests: http://projects.scipy.org/scipy/scipy/ticket/402 My email describing it (together with a question) is here: http://projects.scipy.org/pipermail/scipy-dev/2007-April/006979.html I won't have time for it this week and also the patch is for the svn version from a week ago, but if some of you would like to look at that, just let me know and I'll update it to the top svn. I will be glad to help the code meet SciPy's standards - just let me know if I should change anything. Thanks very much, Ondrej Certik From a.schmolck at gmx.net Mon Apr 30 05:57:36 2007 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Mon, 30 Apr 2007 10:57:36 +0100 Subject: [SciPy-dev] mlab namespace thought References: <1e2af89e0704211119x66ea9258i74868484c2ca62e@mail.gmail.com> <1e2af89e0704222326mf00c14ayec7a3d2e9ce0a0e1@mail.gmail.com> Message-ID: Hi Matthew, [I think this is likely to be of interest to other mlabwrap developers as well, so I hope it's OK if I move this to scipy-dev, and re-arrange to bottom-posting style. Sorry it took me some time to get back on this -- I've taken longer than expected to readjust to my European day-cycle] >> "Matthew Brett" writes: >> > Hi, >> > >> > I was just thinking about the problem of the kludginess of >> > >> >>>> from mlabwrap import mlab >> >>>> import mlabraw >> >>>> A, B, C = mlab.svd([[1,2],[1,3]], 0, nout=3) >> >>>> pymat.put(mlab._session, "X", [[1,2], [1,3]]) >> > >> > It seems to me the problem is the fact that the mlab namespace has to >> > be kept free of anything that could be a matlab function. 
>> > >> > So how about: >> > >> > import mlab >> > from mlab import exec as mle >> > >> > A, B, C = mle.svd([[1,2],[1,3]], 0, nout=3) >> > mlab.put(X, [[1,2], [1,3]]) > On 4/23/07, Alexander Schmolck wrote: >> >> Hi Matthew, >> >> why not just use >> >> mlab._set('X', [[1,2], [1,3]]) >> >> :) >> >> alex >> > >> > Best, "Matthew Brett" writes: > Yer - sorry - I forgot that your're right. :) > > But, in general, you've got the problem that any new method of > attribute in the mlab module has to be preprended with an underscore, > which is a bit awkward - as you know, the underscore is in general a > semi-formal clue to the programmer that the feature is private and > should not be used outside the module itself. I speak only for myself, > but I have the instinct to think hard before I use an underscore > function or attribute. And it makes it difficult for you to indicate > which functions etc are really private and which are not. Agreed -- there's still double-underscore, but that has name-mangling implications; OTOH > I realize this would mean an API change in mlabwrap... I can definitely see where you're coming from -- It is true that underscore method names carry a strong connotation with "messing with internals", but there are two considerations why I prefer the current scheme to having several objects: 1. There's a good technical reason why there's a single ``mlab`` that mediates access to a matlab session, rather than several handles for different purposes (``mle`` etc.): mlabwrap supports multiple matlab sessions and each such session is encapsulated in a MlabWrap class instance -- ``mlab`` is just the 'default' session that gets created as one imports mlabwrap. If one uses different objects to interface to a single session, synchronization issues can arise (e.g. one's ``mle`` might in fact accidentally refer to a different session than one's ``mlab``). 2. Finally, getting and setting variables, *is*, to my mind "messing with internals" -- the interface metaphor of mlabwrap is that matlab is a python library -- you call functions in the mlab "module" just like in any other module and the things you pass in and get out are python objects (possibly proxying matlab objects). Setting named variables in matlab-space is lower-level than that The main use case I see for it is querying and setting global variables that control the behavior of a matlab package (or matlab itself). Whilst I'd be happy to give greater prominence to ``mlab._set`` and ``mlab._get`` in the documentation (and I think de-emphasizing mlabraw might also be a good idea) or reserve the dictionary access notation for this use as Brian suggested, my feeling is that it is not that often needed. If you have some other, non-marginal use case it would really be good if you could send me some examples. I think that it'd be feasible to think up a more "natural" way to handle such variable setting (e.g. one could possibly make ``mlab.some_global_var = 3`` work; but there are some difficulties associated with doing that; e.g. the syntactic equivalence between nullary function call and variable access in matlab; see my other post or scipy.org/MlabWrap) if there is a demonstrated need. More on this in my replies to Brian. cheers, alex From jtravs at gmail.com Mon Apr 30 07:17:07 2007 From: jtravs at gmail.com (John Travers) Date: Mon, 30 Apr 2007 12:17:07 +0100 Subject: [SciPy-dev] Cleaning and fixing fft in scipy ? 
In-Reply-To: <46332762.10701@ar.media.kyoto-u.ac.jp> References: <4631D3E8.30400@ar.media.kyoto-u.ac.jp> <463246D5.6040303@ieee.org> <46324944.7010208@ieee.org> <4632F4CD.5010302@ar.media.kyoto-u.ac.jp> <58658.84.50.128.236.1177754802.squirrel@cens.ioc.ee> <58855.84.50.128.236.1177755256.squirrel@cens.ioc.ee> <46332762.10701@ar.media.kyoto-u.ac.jp> Message-ID: <3a1077e70704300417p5b65036eh2cc099cc5b43a708@mail.gmail.com> On 28/04/07, David Cournapeau wrote: > Thanks for your email, Pearu. I would certainly not submit a patch which > do not pass the unittest; problem being I cannot test all > implementations (I can test fftpack/fftw2/fftw3 on Linux easily; I > cannot get djbfft to compile on my machine; if necessary, I can install > the MKL; I don't know whether OS and compiler matters a lot for fftpack, > as I don't intend to touch the setup part). I can easily test the MKL version. I wrote the original patch for the MKL support and agree some improvements to the source would be nice. In fact MKL is only currently supported for complex ffts so I might get round to extending that at some point after you have done your improvements. Cheers, John From nwagner at iam.uni-stuttgart.de Mon Apr 30 08:32:03 2007 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Mon, 30 Apr 2007 14:32:03 +0200 Subject: [SciPy-dev] mlabwrap high-level user interface In-Reply-To: References: <796269930704300050k272ba290t30d60098a9a72b3a@mail.gmail.com> <1e2af89e0704300058k3ee12257j1be5399bcc15639d@mail.gmail.com> Message-ID: <4635E1C3.1080100@iam.uni-stuttgart.de> Alexander Schmolck wrote: > "Matthew Brett" writes: > > >> Hi guys, >> >> I like the interface. We might have to think of a different module >> name though, because matplotlib has a 'mlab' submodule of the same >> name, for matlab compatibility functions, >> > > I don't think we need to worry too much about this -- IMO using a plotting > package like matplotlib to obtain matlab-like functions (rather than numpy, > scipy or whatever) is pretty yucky anyway and people can easily use either an > ``import as`` or a a fully qualified name (``matplotlib.mlab`` or > ``mlabwrap.mlab``). > > alex > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > BTW I cannot get access to mlabwrap via svn. (as described http://www.scipy.org/MlabWrap) Is it a temporary problem ? svn co http://scipy.org/svn/scikits/trunk/mlabwrab/ svn: URL 'http://scipy.org/svn/scikits/trunk/mlabwrap' doesn't exist Nils From a.schmolck at gmx.net Mon Apr 30 08:47:25 2007 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Mon, 30 Apr 2007 13:47:25 +0100 Subject: [SciPy-dev] mlabwrap high-level user interface In-Reply-To: <4635E1C3.1080100@iam.uni-stuttgart.de> (Nils Wagner's message of "Mon\, 30 Apr 2007 14\:32\:03 +0200") References: <796269930704300050k272ba290t30d60098a9a72b3a@mail.gmail.com> <1e2af89e0704300058k3ee12257j1be5399bcc15639d@mail.gmail.com> <4635E1C3.1080100@iam.uni-stuttgart.de> Message-ID: Nils Wagner writes: > BTW I cannot get access to mlabwrap via svn. > (as described http://www.scipy.org/MlabWrap) > > Is it a temporary problem ? Sort of -- the repository is temporarily located on a server of Berkeley university (since we needed some svn server were we had root access when we migrated from CVS), but it should move to quite soon. 
I think Jarrod Millman (who set up the server) now also has admin rights to the scipy.org/scikits svn and should transfer the server pretty soon; he's mentioned to me in an email that he's very busy right now, which might cause additional delay -- but either I or Jarrod will write a short note to this list, once the move is accomplished. cheers, 'as From a.schmolck at gmx.net Mon Apr 30 08:52:28 2007 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Mon, 30 Apr 2007 13:52:28 +0100 Subject: [SciPy-dev] mlabwrap high-level user interface In-Reply-To: <796269930704300050k272ba290t30d60098a9a72b3a@mail.gmail.com> (Brian Hawthorne's message of "Mon\, 30 Apr 2007 00\:50\:17 -0700") References: <796269930704300050k272ba290t30d60098a9a72b3a@mail.gmail.com> Message-ID: [Moving this to scipy-dev] "Brian Hawthorne" writes: > Hello, > I'm hoping to get feedback from you all on some suggestions for what a > cleaner mlabwrap API might look like. > >>>> from scikits.mlab import engine # engine is an instance of MlabEngine > (or simply Engine) This is simply about renaming scikits.mlabwrap to scikits.mlab, MlabWrap to MlabEngine and mlab to engine, right? I agree that mlabwrap is not necessarily the most elegant name (pymat is better, but was already taken), but I'm not sure I'd prefer engine to mlab -- it's more generic and more to type; it's also slightly the wrong emphasis for my taste (I don't want people to focus on the fact that they're doing something that involves using the matlab engine; the abstraction level offered by mlabwrap is above that of matlab's engine protocol). Anyway, my suggestion is that we first deal with the meat (sorting out hybrid proxying, scikits and testing infrastructure etc.) and worry about cosmetic issues like renamings afterwards -- I don't think naming issues are unimportant but they can be dealt with once the functionality we want is there; conversely changes in the core design can render cosmetic decisions obsolete. >>>> engine.X = [[1,2,3],[4,5,6],[7,8,9]] # raw put, implement __setattr__ >>>> engine["X"] = [[1,2,3],[4,5,6],[7,8,9]] # raw put, implement __setitem__ >>>> X = engine.X # raw get, implement __getattr__ >>>> X = engine["X"] # raw get, implement __getitem__ Do I understand you correctly that you want 1&2 and 3&4 to be equivalent (javscript-style)? If so I'm against it -- in python there's ideally one and only one obvious way. I propose using ``.X`` for function calls and ``.["X"]`` for variable access (yup, there *is* a reason why I think one wants to seperate these, see my reply to your other post). >>>> Y = engine("matlab code") # raw eval, implement __call__ Using ``engine["x"]`` viz. ``mlab["x"]`` for variable access and ``mlab("do something")`` for raw evaluation looks pretty attractive to me. The only reason I can see why one might want to reserve ``__call__`` is that it could also mkae for a convenient customization syntax, e.g:: mlab(flatten_row_vecs=True).sin([1,2,3]) which might be more important than having a shorter way to spell ``mlab._do("do something")``. With python2.5 one could also use the with syntax for that, but the syntax above could still be slightly more convenient for single calls with non-default options. 
>>>> a = engine.some_matlab_object_with_nested_attributes # return an > ObjectProxy >>>> a.b.c # in matlab, call (a.subsref("b")).subsref("c") >>>> a["b.c"] # bypas "normal" indexing to call a.subsref("b.c") > > Regarding packaging, the code currently in _mlabwrap.py could move into the > __init__.py file (since it's imported into there anyway), then we could > delete > _mlabwrap.py. _mlabwrap.py is there for a purpose, albeit a fairly egotistical one: it makes switching buffers in emacs easier (i.e. generic names like __init__.py are inconvenient if you got a UI that lets you switch to the desired buffer (viz opened file for non-emacs users) by entering some unique substring). I'd be happy to get rid of _mlabwrap.py if there are any downsides associated with having it, but I'm currently not aware of any. > Also, the ctypes version Taylor is working on won't require any non-python > extension code (I believe), so mlabrawmodule.so will disappear too, Yup. > and the package directory will be left with nothing but awmsmeta.py and > awmstools.py. Taylor also mentioned that the ctypes version currently relies > on some small C utilities, but that they can probably be done away with > (please correct me if I'm wrong here). No, I'd indeed like to get rid of all C(++) code. > I also suggest moving the tests dir under the scikits.mlab package so it can > be found by numpytest. IIRC the current directory structure (and the position of ``tests/``) corresponds to what Robert Kern suggested -- I'm not really familiar with numpytest yet, but I suppose if it expects tests to be in the location you say that's presumably an oversight. I'm about to write a scikits related email to the list anyway, so I'll ask in there. > As far as changing mlabwrap to mlab, it's not super important, but it would > be a bit easier to type, and I think in general python wrappers don't > use the word "wrap" in their names. I don't think the typing argument matters much, since unlike e.g. ``os`` ``mlabwrap`` is not meant to be to be used as a qualifying prefix -- the standard idiom is ``from mlabwrap import mlab``, so mlabwrap only needs to get type once per file/session. > If we did that, it would also make sense for the project itself to be called > mlab (the fewer names the better). It's common in python, but IMO stuff like ``StringIO.StringIO`` is piss-poor design in a language where both the use of unqualified names and qualified names is common (and importing module handles later on is often required for reloading and other interactive shell use). I've been frequently annoyed and even once or twice been bitten by this so I'd really rather not have ``mlab.mlab`` even if it would mean fewer names. OTOH now we've got the additional leading ``scikits.`` anyway, the respective ``mlab``s are presumably sufficiently disambiguated... so I guess I'm open to this renaming, but I suggest we deal with other things first, as mentioned above. 
cheers, alex From a.schmolck at gmx.net Mon Apr 30 09:02:35 2007 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Mon, 30 Apr 2007 14:02:35 +0100 Subject: [SciPy-dev] mlabwrap high-level user interface In-Reply-To: <796269930704300114n1c3e85d6x691504d424fbdd4f@mail.gmail.com> (Brian Hawthorne's message of "Mon\, 30 Apr 2007 01\:14\:24 -0700") References: <796269930704300050k272ba290t30d60098a9a72b3a@mail.gmail.com> <796269930704300114n1c3e85d6x691504d424fbdd4f@mail.gmail.com> Message-ID: [moving this to scipy-dev] "Brian Hawthorne" writes: > Ah, forgot one critical use case, the function call! > >>>> res = engine.svd(X, nout=1) >>>> res = engine["svd"](X, nout=1) As I mentioned in the other poist, I don't like that. There ought to be one way to do things, and the first type of call is clearly what we'd like function calls to typically look like. I suspect that ``engine["svd"]`` ought to be equivalent to ``engine.svd()``. The reason why I'd at least strongly consider this funny looking behavior that is that nullary-function call vs. variable lookup is below the interface level in matlab (i.e. syntactically undistinguishable and hence something that one is often at liberty to change for a variable that isn't intended for mutation; a bit like property access can be transparently replaced by a function call in python (using e.g __getattr__) but not in lesser languages like java or C++). > This would fall under the exact same code as the raw get below (here > returning a function proxy). Oh, and one more suggestion, if we're going to > treat the engine as a dict (which it is), it might also be nice to have a > keys method to return all the names in the matlab namespace: > >>>> mlab_vars = engine.keys() # or "vars" or "names", whatever I'm against conflating matlab and MLabWrap-instance method namespaces, especially since I don't see a very compelling use-case for a ``.keys`` method. > Though i guess if matlab already has a command for that, you could just use > that ;) Indeed and if there isn't you might have a hard time implementing it anyway ;) > One nice thing about this design is that the engine object has no > statically defined public methods, so you don't have to worry so much about > a method name conflicting with a matlab name. Yes, I think we really want to avoid that. > I think it's fine to define a private RawEngine class under the hood which > implements open, close, get, set, and eval. Then the main user-facing engine > will just have a reference to it and delegate to it. Hmm, sorry I'm not 100% sure what you're proposing here -- would RawEngine essentially be like mlabraw, but ctypes based with a class interface? If so, yes that would indeed be the approach I'd advocate for the ctypes mlabraw replacement (the only reason that the C++ code doesn't work like this is that in the case of mlabraw.cpp the effort involved would IMO not have justified the reward). IIRC, David's ctypes code already looks like this. alex