From charlesr.harris at gmail.com  Sun Apr  1 16:52:53 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 1 Apr 2007 14:52:53 -0600
Subject: [SciPy-dev] Violation of array scalar multiplication rules?
Message-ID:

Just asking.

In [35]: type(array(1.0)*2)
Out[35]: <type 'numpy.float64'>

In [36]: type(array(1.0))
Out[36]: <type 'numpy.ndarray'>

Chuck

From david at ar.media.kyoto-u.ac.jp  Tue Apr  3 07:11:48 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 03 Apr 2007 20:11:48 +0900
Subject: [SciPy-dev] Improving scipy.clusters
Message-ID: <46123674.2070607@ar.media.kyoto-u.ac.jp>

Hi there,

    I would like to clean up and improve the kmeans algorithm (more
initialization schemes for the algorithm, and better docs). I already
have write access to the scipy svn repo, but as scipy.clusters is
neither in the sandbox nor my code, I would like to make sure it is OK,

cheers,

David

From robert.kern at gmail.com  Tue Apr  3 13:02:50 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 03 Apr 2007 12:02:50 -0500
Subject: [SciPy-dev] Improving scipy.clusters
In-Reply-To: <46123674.2070607@ar.media.kyoto-u.ac.jp>
References: <46123674.2070607@ar.media.kyoto-u.ac.jp>
Message-ID: <461288BA.6060205@gmail.com>

David Cournapeau wrote:
> Hi there,
>
>     I would like to clean up and improve the kmeans algorithm (more
> initialization schemes for the algorithm, and better docs). I already
> have write access to the scipy svn repo, but as scipy.clusters is
> neither in the sandbox nor my code, I would like to make sure it is OK,

It is. Thank you!

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From oliphant at ee.byu.edu  Tue Apr  3 14:28:07 2007
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Tue, 03 Apr 2007 12:28:07 -0600
Subject: [SciPy-dev] Improving scipy.clusters
In-Reply-To: <46123674.2070607@ar.media.kyoto-u.ac.jp>
References: <46123674.2070607@ar.media.kyoto-u.ac.jp>
Message-ID: <46129CB7.5060403@ee.byu.edu>

David Cournapeau wrote:
> Hi there,
>
>     I would like to clean up and improve the kmeans algorithm (more
> initialization schemes for the algorithm, and better docs). I already
> have write access to the scipy svn repo, but as scipy.clusters is
> neither in the sandbox nor my code, I would like to make sure it is OK,

I'm pretty sure nobody else is working on it at the moment so please do
whatever you can.

Many thanks,

-Travis

From oliphant at ee.byu.edu  Tue Apr  3 18:43:29 2007
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Tue, 03 Apr 2007 16:43:29 -0600
Subject: [SciPy-dev] NumPy 1.0.2 released
Message-ID: <4612D891.8040200@ee.byu.edu>

To all SciPy / NumPy users:

NumPy 1.0.2 was released yesterday (4-02-07). Get it by following the
download link at

http://numpy.scipy.org

This is a bug-fix release with a couple of additional features. Thanks
to everybody who helped track down and fix bugs.

-Travis

From bgoli at sun.ac.za  Wed Apr  4 02:02:08 2007
From: bgoli at sun.ac.za (Brett Olivier)
Date: Wed, 4 Apr 2007 08:02:08 +0200
Subject: [SciPy-dev] scipy.test "generic 1d filter" crashes interpreter on windows
Message-ID: <200704040802.08155.bgoli@sun.ac.za>

Hi

Running scipy.test() after installing an SVN version of scipy on
windows ('0.5.3.dev2895') causes the interpreter to crash on the
"generic 1d filter" test.
This problem seems to be windows-specific and has appeared sometime
after '0.5.3.dev2866'.

scipy.test(1,10)
generation of a binary structure 3 ... ok
generation of a binary structure 4 ... ok
generic filter 1 ... ERROR
generic 1d filter 1

Build environment
scipy.__version__ = '0.5.3.dev2895'
numpy.__version__ = '1.0.3.dev3657'
WinXP, Python2.4, MinGW (gcc 3.4.5), ATLAS 3.7.11

TIA
Brett

From stefan at sun.ac.za  Wed Apr  4 04:25:52 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Wed, 4 Apr 2007 10:25:52 +0200
Subject: [SciPy-dev] scipy.test "generic 1d filter" crashes interpreter on windows
In-Reply-To: <200704040802.08155.bgoli@sun.ac.za>
References: <200704040802.08155.bgoli@sun.ac.za>
Message-ID: <20070404082552.GP18196@mentat.za.net>

Hi Brett

I fixed a couple of issues in ndimage regarding spline interpolation,
which required porting some of the code from numarray to numpy. The
memory leak was probably introduced then.

I ran the whole test suite under valgrind, which didn't report any
problems under linux (just did it again to make sure, and it's still
fine). Unfortunately, I am not familiar enough with windows systems to
know what the equivalent of valgrind would be. Is there any way you
can localise the problem further?

Just to be safe, please make sure you are doing a clean build of scipy > r2889.

Regards
Stéfan

On Wed, Apr 04, 2007 at 08:02:08AM +0200, Brett Olivier wrote:
> Hi
>
> Running scipy.test() after installing an SVN version of scipy on
> windows ('0.5.3.dev2895') causes the interpreter to crash on
> the "generic 1d filter" test. This problem seems to be
> windows-specific and has appeared sometime after '0.5.3.dev2866'.
>
> scipy.test(1,10)
> generation of a binary structure 3 ... ok
> generation of a binary structure 4 ... ok
> generic filter 1 ... ERROR
> generic 1d filter 1
>
> Build environment
> scipy.__version__ = '0.5.3.dev2895'
> numpy.__version__ = '1.0.3.dev3657'
> WinXP, Python2.4, MinGW (gcc 3.4.5), ATLAS 3.7.11
>
> TIA
> Brett

From cimrman3 at ntc.zcu.cz  Wed Apr  4 04:37:42 2007
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Wed, 04 Apr 2007 10:37:42 +0200
Subject: [SciPy-dev] UMFPACKv4.4 and swig memory leak of type 'void *'
In-Reply-To: <45E54D84.8010103@iam.uni-stuttgart.de>
References: <45E54D84.8010103@iam.uni-stuttgart.de>
Message-ID: <461363D6.3010305@ntc.zcu.cz>

Nils Wagner wrote:
> Hi all,
>
> scipy.test(1,10) reports a lot of memory leaks, e.g.
>
> Getting factors of complex matrixswig/python detected a memory leak of
> type 'void *', no destructor found.
> Getting factors of real matrixswig/python detected a memory leak of type
> 'void *', no destructor found.
> Solve with UMFPACK: double precision complexswig/python detected a
> memory leak of type 'void *', no destructor found.
> swig/python detected a memory leak of type 'void *', no destructor found.
> ... ok
> Solve: single precision complexUse minimum degree ordering on A'+A.
> ... ok
> Solve with UMFPACK: double precisionswig/python detected a memory leak
> of type 'void *', no destructor found.
> swig/python detected a memory leak of type 'void *', no destructor found.
> ... ok
>
> Is this a swig problem?
>
> I am using
>
> Numpy version 1.0.2.dev3562
> Scipy version 0.5.3.dev2774

Hi Nils,

it was not a swig problem but mine (passing a bad 'own' flag in a
typemap). One can be really blind...

Now it seems fixed. (rev. 2896)

r.
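Going back to Brett's "generic 1d filter" crash above: a minimal
standalone script along the following lines may help localise the
problem. This is only a sketch, assuming scipy.ndimage's
numarray-derived API, where the callback receives the padded input
line and must fill the output line in place; if this bare script also
brings down the interpreter on windows, the problem is in the
generic_filter1d callback machinery rather than in the test setup.

    import numpy
    from scipy import ndimage

    def _moving_sum(iline, oline):
        # iline is the padded input line; with filter_size=3 it is two
        # samples longer than oline, which must be filled in place
        oline[...] = iline[:-2] + iline[1:-1] + iline[2:]

    a = numpy.arange(12, dtype=numpy.float64).reshape(3, 4)
    print ndimage.generic_filter1d(a, _moving_sum, filter_size=3)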
From a.schmolck at gmx.net  Fri Apr  6 19:05:58 2007
From: a.schmolck at gmx.net (Alexander Schmolck)
Date: 07 Apr 2007 00:05:58 +0100
Subject: [SciPy-dev] mlabwrap scikit [Was: Re: scikits project]
References: <460BC5FD.4050602@gmail.com> <460C7070.2000904@gmail.com>
Message-ID:

Robert Kern writes:

[on the question whether the svn-layout {branch,tags,trunk}/subproject
(Robert) or subproject/{branch,tags,trunk} (me) is preferable for
scikits]

> By and large, it simply doesn't matter to "get everything related to your
> project". Believe me, you don't want all of branches/ and tags/ in a
> checkout.

I have done so in order to get some overview over a project that I was
unfamiliar with, but I agree that that's hardly an important use case.
Migration convenience seems more relevant, but unless rearranging dir
structure in svn repositories is really hard, that's not a big factor
either.

> On the other hand, "getting all of the packages in scikits" does matter.

Sure, but not that often either. Why would many people want to
frequently update the svn versions of all of several unrelated and
fairly special-purpose scientific packages? I would assume the typical
case is that someone hacks e.g. mlabwrap and updates that from time to
time rather than a dozen other projects.

> IMO, the inconvenience of prefixing your branches and tags is secondary to
> the performance problems of svn:external, which slows down all checkouts and
> updates for everyone

Only dirs with svn:externals would be affected, right? Since the only
such dir would be /scikits, which most people won't check out, I don't
think there would be much of an impact (see above).

> (although I'll have to double-check that claim for svn:external links within
> the same repository).

I think you're right: I added an external 'ext' on mlabwrap to
mlabwrap/test, and it appears to slow down checkout; I've got a very
old svn client, but you can try yourself:

    svn co http://scipy.org/svn/scikits/trunk/mlabwrap/

> Also, it appears that Trac doesn't like svn:external to other repositories;
> I'm not sure if that extends to svn:external within the same repository.
>
> http://trac.edgewall.org/wiki/TracFaq#DoesTracsupportsvn:externalsubversionrepositories

For this purpose it seems to work fine: it just displays the
svn:externals property, which is all you want here, I think.

Anyway, the choice between either layout (i.e. subproject
internal/external {tags,branches,trunk}) seems not terribly important.
Each approach has advantages and disadvantages, and apparently both are
common; the 'official' SVN book authors prefer internal, as do I, but
write about the toplevel {tags,branches,trunk} approach:

    There's nothing particularly incorrect about such a layout, but it
    may or may not seem as intuitive for your users. Especially in
    large, multi-project situations with many users, those users may
    tend to be familiar with only one or two of the projects in the
    repository. But the projects-as-branch-siblings tends to
    de-emphasize project individuality and focus on the entire set of
    projects as a single entity. That's a social issue though. We like
    our originally suggested arrangement for purely practical reasons:
    it's easier to ask about (or modify, or migrate elsewhere) the
    entire history of a single project when there's a single repository
    path that holds the entire history (past, present, tagged, and
    branched) for that project and that project alone.
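For reference, the externals timing experiment described above amounts
to something like the following sketch. The external's name 'ext' and
its target follow the test mentioned in the message; pre-1.5 svn
expects "dir URL" lines in the svn:externals property value:

    svn co http://scipy.org/svn/scikits/trunk/mlabwrap/
    cd mlabwrap
    svn propset svn:externals "ext http://scipy.org/svn/scikits/trunk/mlabwrap/test" .
    svn commit -m "add 'ext' external for the checkout-speed test"
    time svn co http://scipy.org/svn/scikits/trunk/mlabwrap/ mlabwrap-timed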
> > Maybe the structure I proposed would also make importing and exporting of
> > projects slightly easier (because it mirrors the typical layout of an
> > individual svn project). Speaking of which -- is there something I can do with
> > the existing CVS so that it can be easily imported in the scikits svn (in
> > which case we can get rid of what's already checked in), or would importing
> > the CVS involve a hassle in any case, because then I'll just archive it on
> > sourceforge.
>
> I don't know. You'll have to read the cvs2svn documentation.

Sorry, I should have phrased this better. I have already read the
cvs2svn docs and converted the CVS repository (just a single trunk, no
branches or tags). If I put a tar.bz2 of that repository on the web,
can someone with admin rights easily install it in lieu of the
existing repository? Is Berkeley db fine as backend? If it's not easy
and quick to do, I'm also happy to lose the revision history and
abandon conversion attempts.

> > The other thing I've been wondering is if such a setup couldn't also be made
> > to accommodate something like Stefan van der Walt's layout proposal, which as
> > far as I can see would allow for the most convenient way possible to grab all
> > scikits and build them:
> >
> >     setup.py
> >     scikits/
> >         __init__.py
> >         -> mlabwrap/
> >                mlabwrap_setup.py
> >                __init__.py
> >                awmstools.py
> >                ...
> >         -> some_other_scikit/
> >                some_other_scikit_setup.py
>
> Having two ways to install something is just begging for trouble.

I wasn't advocating two ways; you always call scikits/setup.py and it
just installs different amounts of stuff depending on how many subdirs
you've checked out.

> > Couldn't one have a toplevel setup.py that just runs all
> > scikits/DIRNAME/DIRNAME_setup.py's it can find, passing through the command line
> > options (or something along those lines[1])?
>
> That's unworkable. I've tried.

I suspected that it might turn out to be. Pity.

> >> If so it should also use numpy.distutils. Just make sure to import
> >> setuptools first.
> >>
> >>     import setuptools
> >>     from numpy.distutils.core import setup
> >>     ...
> >
> > Is there a recipe/template for this somewhere? Googling "scipy setuptools"
> > comes up with [link scrubbed] as the first hit, which seems to indicate
> > that setuptools is still a bit alpha and the docs can't be trusted if one
> > wants something that actually works.
>
> Fernando's a curmudgeon, and that page is old. Ignore him. :-)

OK, I'm going the ``ez_setup.py`` way.

> Like I said, just import setuptools before you import numpy.distutils. Then use
> numpy.distutils as normal to handle all of the building and stuff. setuptools
> adds some keywords to setup() that you should also provide, namely
>
>     namespace_packages=['scikits'],
>
> That's all that's necessary. There's no particular magic to combining setuptools
> and numpy.distutils.

Well, it's not quite obvious how to fully take advantage of setuptools,
though. One of the main reasons for using it is that it's meant to
download and install dependencies automatically, but that can't work if
my setup.py imports something from its sole dependency (numpy) to start
with. Surely there must be some way to write packages that depend on
numpy but can be installed automatically (and download numpy if
required)? And why do I need to use numpy.distutils in the first place?
I find [link scrubbed] rather unhelpful and I didn't find any other
documentation.

Another thing: should ``scikits/__init__.py`` be really empty?
From [link scrubbed] it looks a bit to me like it should contain:

    __import__('pkg_resources').declare_namespace(__name__)

?

Finally, what is the preferred download url for scikits projects?
Should I continue to host the file-release on SF, or should they go
somewhere else? Same for the webpage.

So I think some kind of template for scikit authors would be useful and
I'd suggest that once I've got setup.py etc. ironed out I put some info
for other prospective scikit authors on a wikipage on scipy.org -- what
would be a good place?

Finally, just to double check, does this directory structure look good
to you:

    mlabwrap/
        setup.py
        README.txt
        tests/          # N.B. renamed from 'test'
            test_mlabwrap.py
            ...
        scikits/
            __init__.py # empty
            mlabwrap/
                __init__.py
                _mlabwrap.py
                mlabraw.py
                -> awmstools.py
                -> awmsmeta.py

?

thanks,

'as

From robert.kern at gmail.com  Fri Apr  6 22:38:42 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 06 Apr 2007 21:38:42 -0500
Subject: [SciPy-dev] mlabwrap scikit [Was: Re: scikits project]
In-Reply-To:
References: <460BC5FD.4050602@gmail.com> <460C7070.2000904@gmail.com>
Message-ID: <46170432.5080205@gmail.com>

Alexander Schmolck wrote:
> Robert Kern writes:

> Anyway, the choice between either layout (i.e. subproject internal/external
> {tags,branches,trunk}) seems not terribly important.

Fine. Please let's drop the issue, then.

>>> Maybe the structure I proposed would also make importing and exporting of
>>> projects slightly easier (because it mirrors the typical layout of an
>>> individual svn project). Speaking of which -- is there something I can do with
>>> the existing CVS so that it can be easily imported in the scikits svn (in
>>> which case we can get rid of what's already checked in), or would importing
>>> the CVS involve a hassle in any case, because then I'll just archive it on
>>> sourceforge.
>> I don't know. You'll have to read the cvs2svn documentation.
>
> Sorry, I should have phrased this better. I have already read the
> cvs2svn docs and converted the CVS repository (just a single trunk, no
> branches or tags). If I put a tar.bz2 of that repository on the web,
> can someone with admin rights easily install it in lieu of the
> existing repository?

Is there no conversion tool that simply puts revisions into an existing
directory of a repository instead of making a new repository?

> Is Berkeley db
> fine as backend?

No. We use the fsfs.

>>>> If so it should also use numpy.distutils. Just make sure to import
>>>> setuptools first.
>>>>
>>>>     import setuptools
>>>>     from numpy.distutils.core import setup
>>>>     ...
>>> Is there a recipe/template for this somewhere? Googling "scipy setuptools"
>>> comes up with [link scrubbed] as the first hit, which seems to indicate
>>> that setuptools is still a bit alpha and the docs can't be trusted if one
>>> wants something that actually works.
>> Fernando's a curmudgeon, and that page is old. Ignore him. :-)
>
> OK, I'm going the ``ez_setup.py`` way.

No, ez_setup.py is deprecated. That part of the setuptools docs is out
of date. When the Cheeseshop comes back up read this page with
up-to-date information about how to get going with setuptools:

http://cheeseshop.python.org/pypi/setuptools

>> Like I said, just import setuptools before you import numpy.distutils. Then use
>> numpy.distutils as normal to handle all of the building and stuff. setuptools
>> adds some keywords to setup() that you should also provide, namely
>>
>>     namespace_packages=['scikits'],
>>
>> That's all that's necessary. There's no particular magic to combining setuptools
>> and numpy.distutils.
>
> Well, it's not quite obvious how to fully take advantage of setuptools,
> though. One of the main reasons for using it is that it's meant to
> download and install dependencies automatically, but that can't work if
> my setup.py imports something from its sole dependency (numpy) to start
> with. Surely there must be some way to write packages that depend on
> numpy but can be installed automatically (and download numpy if
> required)?

Not really. The dependencies that you can specify are requirements for
using the package after it is installed, not requirements for building
the package. The structure of distutils setup.py files pretty much
enforces this. setuptools doesn't really get to do anything until
setup() is called, i.e. after you've used your dependencies.

You might be able to hack something together with
pkg_resources.WorkingSet.resolve(), though.

http://peak.telecommunity.com/DevCenter/PkgResources#workingset-methods-and-attributes

> And why do I need to use
> numpy.distutils in the first place?

You don't strictly have to. numpy.get_include() is probably sufficient
for you, but it has the same problem.

> Another thing: should ``scikits/__init__.py`` be really empty? From
> [link scrubbed] it looks a bit to me like it should contain:
>
>     __import__('pkg_resources').declare_namespace(__name__)
>
> ?

Yes, apologies.

> Finally, what is the preferred download url for scikits projects?
> Should I continue to host the file-release on SF, or should they go
> somewhere else?

I recommend putting them on the Cheeseshop.

> Same for the webpage.

scipy.org wiki if you can.

> So I think some kind of template for scikit authors would be useful and
> I'd suggest that once I've got setup.py etc. ironed out I put some info
> for other prospective scikit authors on a wikipage on scipy.org -- what
> would be a good place?

http://projects.scipy.org/scipy/scikits

> Finally, just to double check, does this directory structure look good
> to you:
>
>     mlabwrap/
>         setup.py
>         README.txt
>         tests/          # N.B. renamed from 'test'
>             test_mlabwrap.py
>             ...
>         scikits/
>             __init__.py # empty
>             mlabwrap/
>                 __init__.py
>                 _mlabwrap.py
>                 mlabraw.py
>                 -> awmstools.py
>                 -> awmsmeta.py
>
> ?

Yes.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From guyer at nist.gov  Fri Apr  6 23:29:23 2007
From: guyer at nist.gov (Jonathan Guyer)
Date: Fri, 6 Apr 2007 23:29:23 -0400
Subject: [SciPy-dev] mlabwrap scikit [Was: Re: scikits project]
In-Reply-To: <46170432.5080205@gmail.com>
References: <460BC5FD.4050602@gmail.com> <460C7070.2000904@gmail.com> <46170432.5080205@gmail.com>
Message-ID: <064E82D4-481F-4C72-B2F1-B6ED35AE25EE@nist.gov>

On Apr 6, 2007, at 10:38 PM, Robert Kern wrote:

> Alexander Schmolck wrote:
>> Sorry, I should have phrased this better. I have already read the
>> cvs2svn docs and converted the CVS repository (just a single trunk,
>> no branches or tags). If I put a tar.bz2 of that repository on the
>> web, can someone with admin rights easily install it in lieu of the
>> existing repository?
>
> Is there no conversion tool that simply puts revisions into an
> existing directory of a repository instead of making a new repository?

cvs2svn can do this.

http://cvs2svn.tigris.org/faq.html talks about an "options file
method" that I haven't used, but the older dumpfile mechanism worked
fine for us.
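For the archives, the dumpfile route Jonathan mentions looks roughly
like the following sketch. The paths are placeholders, --trunk-only
matches the trunk-only conversion described earlier in the thread, and
svnadmin load has to be run by someone with filesystem access to the
scikits repository:

    cvs2svn --trunk-only --dumpfile=mlabwrap.dump /path/to/mlabwrap-cvsrepo
    svnadmin load --parent-dir trunk/mlabwrap /path/to/scikits-svnrepo < mlabwrap.dump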
From wnbell at gmail.com  Sun Apr  8 03:35:39 2007
From: wnbell at gmail.com (Nathan Bell)
Date: Sun, 8 Apr 2007 01:35:39 -0600
Subject: [SciPy-dev] [Numpy-discussion] Tuning sparse stuff in NumPy
In-Reply-To: <46095404.4020106@ntc.zcu.cz>
References: <4607E40E.5000300@ntc.zcu.cz> <4607EDCD.5020001@ntc.zcu.cz> <46095404.4020106@ntc.zcu.cz>
Message-ID:

On 3/27/07, Robert Cimrman wrote:
> ok. now which version of scipy (scipy.__version__) do you use (you may
> have posted it, but I missed it)? Not so long ago, there was an effort
> by Nathan Bell and others reimplementing sparsetools + scipy.sparse to
> get better usability and performance. My (almost latest) version is
> 0.5.3.dev2860.

Robert, did David find the source of his performance problems? I
suspect that he was using an older version of sparsetools, but I'd
like to know for sure.

--
Nathan Bell wnbell at gmail.com

From david at ar.media.kyoto-u.ac.jp  Sun Apr  8 22:57:01 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 09 Apr 2007 11:57:01 +0900
Subject: [SciPy-dev] Scipy and LAPACK 3.1.* ?
Message-ID: <4619AB7D.2070104@ar.media.kyoto-u.ac.jp>

Hi there,

    I tried to compile numpy/scipy with recent LAPACK and BLAS versions
(LAPACK 3.1.1, BLAS from the netlib package, not from LAPACK, using
gfortran as a compiler everywhere), and I got several errors when
testing scipy:

======================================================================
FAIL: check_syevr (scipy.lib.tests.test_lapack.test_flapack_float)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/david/local/lib/python2.5/site-packages/scipy/lib/lapack/tests/esv_tests.py",
line 41, in check_syevr
    assert_array_almost_equal(w,exact_w)
  File "/home/david/local/lib/python2.5/site-packages/numpy/testing/utils.py",
line 230, in assert_array_almost_equal
    header='Arrays are not almost equal')
  File "/home/david/local/lib/python2.5/site-packages/numpy/testing/utils.py",
line 215, in assert_array_compare
    assert cond, msg
AssertionError:
Arrays are not almost equal

(mismatch 33.3333333333%)
 x: array([-0.66992444,  0.48769468,  9.18222618], dtype=float32)
 y: array([-0.66992434,  0.48769389,  9.18223045])

======================================================================
FAIL: check_syevr_irange (scipy.lib.tests.test_lapack.test_flapack_float)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/david/local/lib/python2.5/site-packages/scipy/lib/lapack/tests/esv_tests.py",
line 66, in check_syevr_irange
    assert_array_almost_equal(w,exact_w[rslice])
  File "/home/david/local/lib/python2.5/site-packages/numpy/testing/utils.py",
line 230, in assert_array_almost_equal
    header='Arrays are not almost equal')
  File "/home/david/local/lib/python2.5/site-packages/numpy/testing/utils.py",
line 215, in assert_array_compare
    assert cond, msg
AssertionError:
Arrays are not almost equal

(mismatch 33.3333333333%)
 x: array([-0.66992444,  0.48769468,  9.18222618], dtype=float32)
 y: array([-0.66992434,  0.48769389,  9.18223045])

----------------------------------------------------------------------

The different dtype may suggest an error while compiling the
BLAS/LAPACK, but I tested the libraries with the official tester
without any error.

cheers,

David

From matthew.brett at gmail.com  Mon Apr  9 06:24:31 2007
From: matthew.brett at gmail.com (Matthew Brett)
Date: Mon, 9 Apr 2007 11:24:31 +0100
Subject: [SciPy-dev] ctypes requirement?
Message-ID: <1e2af89e0704090324x9b8afedl6b1dd9d048c1095c@mail.gmail.com>

Hi,

I was thinking of using ctypes in some of the scipy matlab read /
write routines. I noticed that a couple of sandbox packages are
already using it. Does the team think the time has come for ctypes to
be a requirement for scipy? It will make some development easier.

Thanks,

Matthew

From a.schmolck at gmx.net  Mon Apr  9 21:41:05 2007
From: a.schmolck at gmx.net (Alexander Schmolck)
Date: 10 Apr 2007 02:41:05 +0100
Subject: [SciPy-dev] mlabwrap scikit [Was: Re: scikits project]
In-Reply-To: <46170432.5080205@gmail.com>
References: <460BC5FD.4050602@gmail.com> <460C7070.2000904@gmail.com> <46170432.5080205@gmail.com>
Message-ID:

Robert Kern writes:

> >>>> If so it should also use numpy.distutils. Just make sure to import
> >>>> setuptools first.
> >>>>
> >>>>     import setuptools
> >>>>     from numpy.distutils.core import setup
> >>>>     ...
> >>> Is there a recipe/template for this somewhere? Googling "scipy setuptools"
> >>> comes up with [link scrubbed] as the first hit, which seems to indicate
> >>> that setuptools is still a bit alpha and the docs can't be trusted if one
> >>> wants something that actually works.
> >> Fernando's a curmudgeon, and that page is old. Ignore him. :-)
> >
> > OK, I'm going the ``ez_setup.py`` way.
>
> No, ez_setup.py is deprecated. That part of the setuptools docs is out
> of date. When the Cheeseshop comes back up read this page with
> up-to-date information about how to get going with setuptools:
>
> http://cheeseshop.python.org/pypi/setuptools

OK.

> > Surely there must be some way to write packages that depend on numpy but
> > can be installed automatically (and download numpy if required)?
>
> Not really. The dependencies that you can specify are requirements for
> using the package after it is installed, not requirements for building
> the package. The structure of distutils setup.py files pretty much
> enforces this. setuptools doesn't really get to do anything until
> setup() is called, i.e. after you've used your dependencies.
>
> You might be able to hack something together with
> pkg_resources.WorkingSet.resolve(), though.
>
> http://peak.telecommunity.com/DevCenter/PkgResources#workingset-methods-and-attributes

Sometimes

> > And why do I need to use
> > numpy.distutils in the first place?
>
> You don't strictly have to. numpy.get_include() is probably sufficient
> for you, but it has the same problem.
>
> > Another thing: should ``scikits/__init__.py`` be really empty? From
> > [link scrubbed] it looks a bit to me like it should contain:
> >
> >     __import__('pkg_resources').declare_namespace(__name__)
> >
> > ?
>
> Yes, apologies.

OK, I think the upshot of all this is that I'll figure out how to do a
robust and user-friendly setuptools-based package another time. I don't
want to delay the release of 1.0 any further so I'll release
mlabwrap-1.0final on SF as a distutils-based install with the old
(scikits-less) package structure. Since post-1.0 mlabwrap will
represent a break anyway (Numeric support will be dropped, newer
versions of Python and Matlab might be required and the interface might
change in some not backwards-compatible ways), the need to change the
import statement is maybe not a bad thing.

> > Finally, what is the preferred download url for scikits projects?
> > Should I continue to host the file-release on SF, or should they go
> > somewhere else?
>
> I recommend putting them on the Cheeseshop.

Thanks, will do.

> > Same for the webpage.
>
> scipy.org wiki if you can.
I could make http://www.scipy.org/MlabWrap the project webpage and move
its current (developer-only contents) into some subdir -- does that
sound reasonable?

> > So I think some kind of template for scikit authors would be useful and
> > I'd suggest that once I've got setup.py etc. ironed out I put some info
> > for other prospective scikit authors on a wikipage on scipy.org -- what
> > would be a good place?
>
> http://projects.scipy.org/scipy/scikits

I tried to create a trac account ('aschmolck') there but it didn't
work; a new account seems to have been created (the name is taken) but
I can't log into it. Do I need some other type of account first in
order to be able to create the trac account? When I click on "create
account" I get an authorization dialog, but the just created login and
password don't work (I verified by going through the same process with
login/passwd test_user).

thanks,

'as

From cimrman3 at ntc.zcu.cz  Tue Apr 10 04:43:55 2007
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Tue, 10 Apr 2007 10:43:55 +0200
Subject: [SciPy-dev] [Numpy-discussion] Tuning sparse stuff in NumPy
In-Reply-To:
References: <4607E40E.5000300@ntc.zcu.cz> <4607EDCD.5020001@ntc.zcu.cz> <46095404.4020106@ntc.zcu.cz>
Message-ID: <461B4E4B.3080608@ntc.zcu.cz>

Nathan Bell wrote:
> On 3/27/07, Robert Cimrman wrote:
>> ok. now which version of scipy (scipy.__version__) do you use (you may
>> have posted it, but I missed it)? Not so long ago, there was an effort
>> by Nathan Bell and others reimplementing sparsetools + scipy.sparse to
>> get better usability and performance. My (almost latest) version is
>> 0.5.3.dev2860.
>
> Robert, did David find the source of his performance problems? I
> suspect that he was using an older version of sparsetools, but I'd
> like to know for sure.

I am not sure either, but I think the slowness he perceived was due to
using the version 0.5.2. David, did you try your benchmarks with the
latest SVN version?

r.

From cimrman3 at ntc.zcu.cz  Tue Apr 10 07:12:50 2007
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Tue, 10 Apr 2007 13:12:50 +0200
Subject: [SciPy-dev] eigh implementation inconsistency
Message-ID: <461B7132.40008@ntc.zcu.cz>

I need to solve a symmetric generalized eigenvalue problem, so I have
looked at the linalg module and found a thing that seems inconsistent
to me.

For general (unsymmetric) problems there is the 'eig' function which
allows for solving both the regular (via *geev) and generalized (via
*ggev) eigenvalue problems.

On the other hand, the function 'eigh' for symmetric (or hermitian)
problems does not allow the generalized problems, even though there are
functions in lapack to do it (dsygv, chegv).

I would modify eigh to accept an optional 'b' argument just like eig
does. What must be done to have dsygv, chegv wrappers generated? They
are not generated now, IMHO.

r.

From nwagner at iam.uni-stuttgart.de  Tue Apr 10 07:16:15 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Tue, 10 Apr 2007 13:16:15 +0200
Subject: [SciPy-dev] eigh implementation inconsistency
In-Reply-To: <461B7132.40008@ntc.zcu.cz>
References: <461B7132.40008@ntc.zcu.cz>
Message-ID: <461B71FF.6040709@iam.uni-stuttgart.de>

Robert Cimrman wrote:
> I need to solve a symmetric generalized eigenvalue problem, so I have
> looked at the linalg module and found a thing that seems inconsistent
> to me.
>
> For general (unsymmetric) problems there is the 'eig' function which
> allows for solving both the regular (via *geev) and generalized (via
> *ggev) eigenvalue problems.
>
> On the other hand, the function 'eigh' for symmetric (or hermitian)
> problems does not allow the generalized problems, even though there are
> functions in lapack to do it (dsygv, chegv).
>
> I would modify eigh to accept an optional 'b' argument just like eig
> does. What must be done to have dsygv, chegv wrappers generated? They
> are not generated now, IMHO.
>
> r.
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>
Sounds good to me! It would be a nice improvement.

Nils

From nwagner at iam.uni-stuttgart.de  Tue Apr 10 07:41:00 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Tue, 10 Apr 2007 13:41:00 +0200
Subject: [SciPy-dev] Scipy and LAPACK 3.1.* ?
In-Reply-To: <4619AB7D.2070104@ar.media.kyoto-u.ac.jp>
References: <4619AB7D.2070104@ar.media.kyoto-u.ac.jp>
Message-ID: <461B77CC.9070009@iam.uni-stuttgart.de>

David Cournapeau wrote:
> Hi there,
>
>     I tried to compile numpy/scipy with recent LAPACK and BLAS versions
> (LAPACK 3.1.1, BLAS from the netlib package, not from LAPACK, using
> gfortran as a compiler everywhere), and I got several errors when
> testing scipy:
>
> ======================================================================
> FAIL: check_syevr (scipy.lib.tests.test_lapack.test_flapack_float)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/david/local/lib/python2.5/site-packages/scipy/lib/lapack/tests/esv_tests.py",
> line 41, in check_syevr
>     assert_array_almost_equal(w,exact_w)
>   File "/home/david/local/lib/python2.5/site-packages/numpy/testing/utils.py",
> line 230, in assert_array_almost_equal
>     header='Arrays are not almost equal')
>   File "/home/david/local/lib/python2.5/site-packages/numpy/testing/utils.py",
> line 215, in assert_array_compare
>     assert cond, msg
> AssertionError:
> Arrays are not almost equal
>
> (mismatch 33.3333333333%)
>  x: array([-0.66992444,  0.48769468,  9.18222618], dtype=float32)
>  y: array([-0.66992434,  0.48769389,  9.18223045])
>
> ======================================================================
> FAIL: check_syevr_irange (scipy.lib.tests.test_lapack.test_flapack_float)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/david/local/lib/python2.5/site-packages/scipy/lib/lapack/tests/esv_tests.py",
> line 66, in check_syevr_irange
>     assert_array_almost_equal(w,exact_w[rslice])
>   File "/home/david/local/lib/python2.5/site-packages/numpy/testing/utils.py",
> line 230, in assert_array_almost_equal
>     header='Arrays are not almost equal')
>   File "/home/david/local/lib/python2.5/site-packages/numpy/testing/utils.py",
> line 215, in assert_array_compare
>     assert cond, msg
> AssertionError:
> Arrays are not almost equal
>
> (mismatch 33.3333333333%)
>  x: array([-0.66992444,  0.48769468,  9.18222618], dtype=float32)
>  y: array([-0.66992434,  0.48769389,  9.18223045])
>
> ----------------------------------------------------------------------
>
> The different dtype may suggest an error while compiling the
> BLAS/LAPACK, but I tested the libraries with the official tester
> without any error.
>
> cheers,
>
> David
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>
Hi David,

I can confirm your findings.
BTW, is there a way to obtain which version of LAPACK is in use via
scipy.show_config()? I mean something like

[('ATLAS_INFO', '"\\"3.7.28\\""')]

Cheers,

Nils

From nwagner at iam.uni-stuttgart.de  Tue Apr 10 07:43:25 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Tue, 10 Apr 2007 13:43:25 +0200
Subject: [SciPy-dev] Deadline for scipy release 0.5.3
Message-ID: <461B785D.8090802@iam.uni-stuttgart.de>

Hi all,

I was wondering if there is a deadline for the next scipy release?

Nils

From opossumnano at gmail.com  Tue Apr 10 08:08:46 2007
From: opossumnano at gmail.com (Tiziano Zito)
Date: Tue, 10 Apr 2007 14:08:46 +0200
Subject: [SciPy-dev] eigh implementation inconsistency
In-Reply-To: <461B7132.40008@ntc.zcu.cz>
References: <461B7132.40008@ntc.zcu.cz>
Message-ID:

you can also have a look at the symeig module:

http://mdp-toolkit.sourceforge.net/symeig.html

it is a wrapper of all the generalized symmetric eigenvalue problem
routines in lapack, i.e. EVR, GV, GVD, GVX, including those for
extracting only a subset of eigenvalues.

cheers,
tiziano

On 4/10/07, Robert Cimrman wrote:
>
> I need to solve a symmetric generalized eigenvalue problem, so I have
> looked at the linalg module and found a thing that seems inconsistent
> to me.
>
> For general (unsymmetric) problems there is the 'eig' function which
> allows for solving both the regular (via *geev) and generalized (via
> *ggev) eigenvalue problems.
>
> On the other hand, the function 'eigh' for symmetric (or hermitian)
> problems does not allow the generalized problems, even though there are
> functions in lapack to do it (dsygv, chegv).
>
> I would modify eigh to accept an optional 'b' argument just like eig
> does. What must be done to have dsygv, chegv wrappers generated? They
> are not generated now, IMHO.
>
> r.
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>

From nwagner at iam.uni-stuttgart.de  Tue Apr 10 08:08:43 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Tue, 10 Apr 2007 14:08:43 +0200
Subject: [SciPy-dev] UMFPACKv4.4 and swig memory leak of type 'void *'
In-Reply-To: <461363D6.3010305@ntc.zcu.cz>
References: <45E54D84.8010103@iam.uni-stuttgart.de> <461363D6.3010305@ntc.zcu.cz>
Message-ID: <461B7E4B.3000302@iam.uni-stuttgart.de>

Robert Cimrman wrote:
> Nils Wagner wrote:
>
>> Hi all,
>>
>> scipy.test(1,10) reports a lot of memory leaks, e.g.
>>
>> Getting factors of complex matrixswig/python detected a memory leak of
>> type 'void *', no destructor found.
>> Getting factors of real matrixswig/python detected a memory leak of type
>> 'void *', no destructor found.
>> Solve with UMFPACK: double precision complexswig/python detected a
>> memory leak of type 'void *', no destructor found.
>> swig/python detected a memory leak of type 'void *', no destructor found.
>> ... ok
>> Solve: single precision complexUse minimum degree ordering on A'+A.
>> ... ok
>> Solve with UMFPACK: double precisionswig/python detected a memory leak
>> of type 'void *', no destructor found.
>> swig/python detected a memory leak of type 'void *', no destructor found.
>> ... ok
>>
>> Is this a swig problem?
>>
>> I am using
>>
>> Numpy version 1.0.2.dev3562
>> Scipy version 0.5.3.dev2774
>
> Hi Nils,
>
> it was not a swig problem but mine (passing a bad 'own' flag in a
> typemap). One can be really blind...
>
> Now it seems fixed. (rev. 2896)
>
> r.
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>
Hi Robert,

Thanks a lot!

Nils

From cimrman3 at ntc.zcu.cz  Tue Apr 10 08:22:00 2007
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Tue, 10 Apr 2007 14:22:00 +0200
Subject: [SciPy-dev] eigh implementation inconsistency
In-Reply-To:
References: <461B7132.40008@ntc.zcu.cz>
Message-ID: <461B8168.7030807@ntc.zcu.cz>

Tiziano Zito wrote:
> you can also have a look at the symeig module:
> http://mdp-toolkit.sourceforge.net/symeig.html
> it is a wrapper of all the generalized symmetric eigenvalue problem
> routines in lapack, i.e. EVR, GV, GVD, GVX, including those for
> extracting only a subset of eigenvalues.
>
> cheers,
> tiziano

This is great! Exactly what I am looking for... I assume you would not
consider changing its license from LGPL to BSD so that it could be
included in SciPy? :-)

r.

From opossumnano at gmail.com  Tue Apr 10 08:42:58 2007
From: opossumnano at gmail.com (Tiziano Zito)
Date: Tue, 10 Apr 2007 14:42:58 +0200
Subject: [SciPy-dev] eigh implementation inconsistency
In-Reply-To: <461B8168.7030807@ntc.zcu.cz>
References: <461B7132.40008@ntc.zcu.cz> <461B8168.7030807@ntc.zcu.cz>
Message-ID:

We would change the license to whatever is needed to include it in
SciPy, but I think the problem is that our pyf files (those needed by
f2py to generate the C extension module) have been heavily (and
manually :-)) tuned and do not resemble the pyf files used by scipy
to generate other lapack wrappers. An inclusion "as is" has been
excluded for this reason by Pearu a couple of years ago. Rewriting the
pyf to match those of scipy is tedious work that we are not going to
do in the near future. Unless Pearu changed his mind or someone
volunteers to do the hard work, in which case I would help as far as I
can, it is not going to happen soon.

bye,
tiziano

On 4/10/07, Robert Cimrman wrote:
> Tiziano Zito wrote:
> > you can also have a look at the symeig module:
> > http://mdp-toolkit.sourceforge.net/symeig.html
> > it is a wrapper of all the generalized symmetric eigenvalue problem
> > routines in lapack, i.e. EVR, GV, GVD, GVX, including those for
> > extracting only a subset of eigenvalues.
> >
> > cheers,
> > tiziano
>
> This is great! Exactly what I am looking for... I assume you would not
> consider changing its license from LGPL to BSD so that it could be
> included in SciPy? :-)
>
> r.
>
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>

From aisaac at american.edu  Tue Apr 10 09:07:07 2007
From: aisaac at american.edu (Alan G Isaac)
Date: Tue, 10 Apr 2007 09:07:07 -0400
Subject: [SciPy-dev] eigh implementation inconsistency
In-Reply-To:
References: <461B7132.40008@ntc.zcu.cz> <461B8168.7030807@ntc.zcu.cz>
Message-ID:

On Tue, 10 Apr 2007, Tiziano Zito apparently wrote:
> We would change the license to whatever is needed to include it in
> SciPy, but I think the problem is that our pyf files (those needed by
> f2py to generate the C extension module) have been heavily (and
> manually :-)) tuned and do not resemble the pyf files used by scipy
> to generate other lapack wrappers. An inclusion "as is"
> has been excluded for this reason by Pearu a couple of
> years ago.
Even if a volunteer to rewrite the pyf files does not emerge
immediately, changing the license *now* will ensure that some future
interested party does not pass by the opportunity out of licensing
concerns.

fwiw,
Alan Isaac

From cimrman3 at ntc.zcu.cz  Tue Apr 10 10:54:06 2007
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Tue, 10 Apr 2007 16:54:06 +0200
Subject: [SciPy-dev] eigh implementation inconsistency
In-Reply-To:
References: <461B7132.40008@ntc.zcu.cz> <461B8168.7030807@ntc.zcu.cz>
Message-ID: <461BA50E.2090509@ntc.zcu.cz>

Alan G Isaac wrote:
> On Tue, 10 Apr 2007, Tiziano Zito apparently wrote:
>> We would change the license to whatever is needed to include it in
>> SciPy, but I think the problem is that our pyf files (those needed by
>> f2py to generate the C extension module) have been heavily (and
>> manually :-)) tuned and do not resemble the pyf files used by scipy
>> to generate other lapack wrappers. An inclusion "as is"
>> has been excluded for this reason by Pearu a couple of
>> years ago.
>
> Even if a volunteer to rewrite the pyf files does not emerge
> immediately, changing the license *now* will ensure that
> some future interested party does not pass by the
> opportunity out of licensing concerns.

What is more, once and if it is BSD-licensed, it could get included and
_used_ from the scipy sandbox just as it is right now. Later it could
get merged into the main tree, if there is interest and will... Or a
new scikit?

regards,
r.

From steve at shrogers.com  Wed Apr 11 08:38:59 2007
From: steve at shrogers.com (Steven H. Rogers)
Date: Wed, 11 Apr 2007 06:38:59 -0600
Subject: [SciPy-dev] ctypes requirement?
In-Reply-To: <1e2af89e0704090324x9b8afedl6b1dd9d048c1095c@mail.gmail.com>
References: <1e2af89e0704090324x9b8afedl6b1dd9d048c1095c@mail.gmail.com>
Message-ID: <461CD6E3.3020807@shrogers.com>

Matthew Brett wrote:
>
> ... Does the team think the time has come for ctypes to
> be a requirement for scipy? It will make some development easier.
>

I'm just a lurker here, but FWIW I think ctypes is a reasonable
requirement. We're still using Python 2.4.3, but include ctypes with
our distribution.

# Steve

From ondrej at certik.cz  Wed Apr 11 13:02:42 2007
From: ondrej at certik.cz (Ondrej Certik)
Date: Wed, 11 Apr 2007 19:02:42 +0200
Subject: [SciPy-dev] SciPy improvements
Message-ID: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com>

Hi,

I am studying theoretical physics and I have collected a lot of useful
python code that I believe could go to SciPy. So first I'll review
what I have, and if you find it interesting, I would like to discuss
how I could implement it in SciPy.

1) Optimization

http://chemev.googlecode.com/svn/trunk/chemev/optimization/

I have a Differential Evolution optimizer, a Simplex optimizer, and an
mcmc optimizer (not well tested yet). I took code from someone else but
adapted the interface to SciPy's one:

def fmin_de(f,x0,callback=None,iter=None):

Those are unconstrained optimizers. Then I have constraints code that
applies a logistic function to the fitting variable and allows me to
do constrained optimization. For example the L-BFGS with my constraints
converges 7x faster than the original L-BFGS-B on my problem.
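(Not the chemev code itself; just a sketch of the logistic-transform
trick described above, with made-up helper names. Any unconstrained
scipy optimizer can stand in for fmin:)

    import numpy
    from scipy.optimize import fmin

    def to_bounded(t, lo, hi):
        # logistic map from the unconstrained variable t into (lo, hi)
        return lo + (hi - lo) / (1.0 + numpy.exp(-t))

    def from_bounded(x, lo, hi):
        # inverse map; turns a feasible initial guess into t-space
        return -numpy.log((hi - lo) / (x - lo) - 1.0)

    def fmin_box(f, x0, lo, hi):
        # run an unconstrained simplex search in t-space; f only ever
        # sees points strictly inside the box (lo, hi)
        g = lambda t: f(to_bounded(t, lo, hi))
        t = fmin(g, from_bounded(numpy.asarray(x0, float), lo, hi))
        return to_bounded(t, lo, hi)

The same wrapping works around fmin_bfgs or any other unconstrained
minimizer, which is the kind of combination the 7x comparison above
refers to.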
2) Nonlinear solvers

I have written these nonlinear solvers for the problem R(x) = 0, where
x and R have dimension "n":

  broyden1 - Broyden's first method - is a quasi-Newton-Raphson method for
      updating an approximate Jacobian and then inverting it
  broyden2 - Broyden's second method - the same as broyden1, but updates the
      inverse Jacobian directly
  broyden3 - Broyden's second method - the same as broyden2, but instead of
      directly computing the inverse Jacobian, it remembers how to construct
      it using vectors, and when computing inv(J)*F, it uses those vectors to
      compute this product, thus avoiding the expensive NxN matrix
      multiplication.
  broyden_generalized - Generalized Broyden's method, the same as broyden2,
      but instead of approximating the full NxN Jacobian, it constructs it at
      every iteration in a way that avoids the NxN matrix multiplication.
      This is not as precise as broyden3.
  anderson - extended Anderson method, the same as broyden_generalized,
      but adds w_0^2*I before taking the inversion to improve the stability
  anderson2 - the Anderson method, the same as anderson, but formulated
      differently
  linear_mixing
  exciting_mixing

I use them in the self-consistent cycle of Density Functional Theory
(so I use the terminology of the DFT literature in the names of the
methods).

Also I am writing a BFGS solver with linesearch that should behave even
better than the Broyden scheme.

Of course I am trying to use SciPy's code (like linesearch) wherever
possible.

3) PETSC bindings

I found these nice petsc bindings:

http://cheeseshop.python.org/pypi/petsc4py/0.7.2

I believe this could also be an optional package in SciPy. Because if
SciPy has some sparse matrix code, then it should definitely also have
this.

4) Finite element code?

I have my own code that uses libmesh:

http://libmesh.sourceforge.net/

and calls tetgen and parses input from gmsh etc. It can convert the
mesh, refine it, solve it, etc. Webpages are here:

http://code.google.com/p/femgeom/
http://code.google.com/p/libmeshpetscpackage/
http://code.google.com/p/grainmodel/

I am not sure here if it should belong to SciPy. Probably not.

5) Symbolic manipulation in Python

http://code.google.com/p/sympy/

We'll have some Google Summer of Code students working on this, and
also I am not sure if it belongs to SciPy. However, this project looks
very promising.

-----------------
So that's it.

I have some comments on SciPy:

1) Documentation

Virtually none; I just use the source code to understand what SciPy
can do and how. But the docstrings are good though. I would suggest
updating

http://www.scipy.org/doc/api_docs/

more often (for example I didn't find there the new l-bfgs code).

2) What is the official NumPy page? I believe it should be

http://numpy.org/

however, it points to a sourceforge page. I believe this homepage
should contain all the relevant information about numpy and contain
links to the fee-based documentation and possibly some tutorials.
The documentation of NumPy is scattered across many pages and I find
it confusing.

I know you have some list here:

http://www.scipy.org/MigratingFromPlone

But I am quite confused by the whole SciPy page. I think less but
unconfusing information is better, but that's just my opinion. I think
the front page of both SciPy and NumPy should be clean and simple with
a clear link to the documentation.
Like

http://www.pytables.org/moin/PyTables
http://matplotlib.sourceforge.net/

However, the documentation:

http://www.scipy.org/Documentation

is confusing, because, except for the fee-based Guide to NumPy, it's
not clear what the official SciPy doc is or what the best way of
learning SciPy is.

So I am interested in your opinions and then I'll just integrate my
code into scipy.optimize and send you a patch to this list?

Ondrej

From scipy2mdjhs78c at jenningsstory.com  Wed Apr 11 13:19:54 2007
From: scipy2mdjhs78c at jenningsstory.com (Andy Jennings)
Date: Wed, 11 Apr 2007 10:19:54 -0700
Subject: [SciPy-dev] lmder patch
Message-ID: <149ddc5e0704111019q462c02a5s11286b46b57bb0ad@mail.gmail.com>

optimize.leastsq is not converging when I use a jacobian function.
MATRIXC2F is transposing the wrong way. Below is a patch that fixes it
in jac_multipack_lm_function, but I'm not sure this is the right fix.
It looks like jac_multipack_calling_function might have the same
problem, in which case maybe it's better to fix the definition of
MATRIXC2F in Lib/optimize/minpack.h. I guess I would want a test case
for hybrj to be sure.

Lib/integrate/multipack.h and Lib/interpolate/multipack.h also define
a MATRIXC2F macro. I have no idea if they need to be looked at as
well.

AJennings

P.S. I just signed up for a trac account a few days ago. I have to
submit patches to someone with the rights to commit, right? Is this
scipy-dev list the best place to do it?

Index: Lib/optimize/__minpack.h
===================================================================
--- Lib/optimize/__minpack.h    (revision 2901)
+++ Lib/optimize/__minpack.h    (working copy)
@@ -149,7 +149,7 @@
     return -1;
   }
   if (multipack_jac_transpose == 1)
-    MATRIXC2F(fjac, result_array->data, *n, *ldfjac)
+    MATRIXC2F(fjac, result_array->data, *ldfjac, *n)
   else
     memcpy(fjac, result_array->data, (*n)*(*ldfjac)*sizeof(double));
 }

From robert.kern at gmail.com  Wed Apr 11 15:25:15 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 11 Apr 2007 14:25:15 -0500
Subject: [SciPy-dev] lmder patch
In-Reply-To: <149ddc5e0704111019q462c02a5s11286b46b57bb0ad@mail.gmail.com>
References: <149ddc5e0704111019q462c02a5s11286b46b57bb0ad@mail.gmail.com>
Message-ID: <461D361B.1080107@gmail.com>

Andy Jennings wrote:

> P.S. I just signed up for a trac account a few days ago. I have to
> submit patches to someone with the rights to commit, right? Is this
> scipy-dev list the best place to do it?

No, please open a ticket so it doesn't get lost. At the bottom of the
page where you enter the ticket's information, there will be a checkbox
labeled something like "I have files to attach to this ticket." Check
that box, submit the ticket, then you will be presented a page where
you can upload the patch.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From robert.kern at gmail.com  Wed Apr 11 16:09:13 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 11 Apr 2007 15:09:13 -0500
Subject: [SciPy-dev] SciPy improvements
In-Reply-To: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com>
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com>
Message-ID: <461D4069.8070101@gmail.com>

Ondrej Certik wrote:
> Hi,
>
> I am studying theoretical physics and I have collected a lot of useful
> python code that I believe could go to SciPy. So first I'll review
> what I have, and if you find it interesting, I would like to discuss
> how I could implement it in SciPy.

Excellent! Thank you! One thing to be aware of is that scipy uses the
BSD license, so you would need to relicense your code under the BSD
license and get permission from the others who have contributed to the
code you are submitting.

> 1) Optimization
>
> http://chemev.googlecode.com/svn/trunk/chemev/optimization/
>
> I have a Differential Evolution optimizer, a Simplex optimizer, and an
> mcmc optimizer (not well tested yet). I took code from someone else but
> adapted the interface to SciPy's one:
>
> def fmin_de(f,x0,callback=None,iter=None):

Well, fmin() is already an implementation of the simplex algorithm. How
does yours compare?

We can't include the MCMC optimizer until we have an implementation of
Metropolis-Hastings in scipy itself; we're not going to depend on an
external PyMC.

As for the differential evolution code, with all respect to you and
James Phillips, it's not the best way to implement that algorithm in
Python. It's a straight translation of the C++ code so it doesn't make
use of numpy at all. I have an implementation that does:

http://svn.scipy.org/svn/scipy/trunk/Lib/sandbox/rkern/diffev.py

It was written for pre-numpy scipy, so it may need some sprucing-up
before it works.

> Those are unconstrained optimizers. Then I have constraints code that
> applies a logistic function to the fitting variable and allows me to
> do constrained optimization. For example the L-BFGS with my constraints
> converges 7x faster than the original L-BFGS-B on my problem.

Interesting. Let's toss it up on the Cookbook first and pound on it a
bit. I have qualms about applying such transformations to the domains
of target functions and then using derivative-based optimizers on them,
but those qualms might be baseless. Still, I'd rather experiment first
before putting them into scipy.

> 2) Nonlinear solvers
>
> I have written these nonlinear solvers for the problem R(x) = 0, where
> x and R have dimension "n":
>
>   broyden1 - Broyden's first method - is a quasi-Newton-Raphson method for
>       updating an approximate Jacobian and then inverting it
>   broyden2 - Broyden's second method - the same as broyden1, but updates the
>       inverse Jacobian directly
>   broyden3 - Broyden's second method - the same as broyden2, but instead of
>       directly computing the inverse Jacobian, it remembers how to construct
>       it using vectors, and when computing inv(J)*F, it uses those vectors to
>       compute this product, thus avoiding the expensive NxN matrix
>       multiplication.
>   broyden_generalized - Generalized Broyden's method, the same as broyden2,
>       but instead of approximating the full NxN Jacobian, it constructs it at
>       every iteration in a way that avoids the NxN matrix multiplication.
>       This is not as precise as broyden3.
>   anderson - extended Anderson method, the same as broyden_generalized,
>       but adds w_0^2*I before taking the inversion to improve the stability
>   anderson2 - the Anderson method, the same as anderson, but formulated
>       differently
>   linear_mixing
>   exciting_mixing
>
> I use them in the self-consistent cycle of Density Functional Theory
> (so I use the terminology of the DFT literature in the names of the
> methods).
>
> Also I am writing a BFGS solver with linesearch that should behave even
> better than the Broyden scheme.
>
> Of course I am trying to use SciPy's code (like linesearch) wherever
> possible.

That's fantastic. I'd love to see them. Are they in chemev? I don't see
them.

> 3) PETSC bindings
>
> I found these nice petsc bindings:
>
> http://cheeseshop.python.org/pypi/petsc4py/0.7.2
>
> I believe this could also be an optional package in SciPy. Because if
> SciPy has some sparse matrix code, then it should definitely also have
> this.

I don't know. It has a fine (and probably better) existence separate
from scipy.

> 4) Finite element code?
>
> I have my own code that uses libmesh:
>
> http://libmesh.sourceforge.net/
>
> and calls tetgen and parses input from gmsh etc. It can convert the
> mesh, refine it, solve it, etc. Webpages are here:
>
> http://code.google.com/p/femgeom/
> http://code.google.com/p/libmeshpetscpackage/
> http://code.google.com/p/grainmodel/
>
> I am not sure here if it should belong to SciPy. Probably not.

I think you are right. You can't really get around the licenses of your
dependencies here.

> 5) Symbolic manipulation in Python
>
> http://code.google.com/p/sympy/
>
> We'll have some Google Summer of Code students working on this, and
> also I am not sure if it belongs to SciPy. However, this project looks
> very promising.

Again, I think it has a fine existence separate from scipy. A reason to
bring it into scipy would be such that other parts of scipy would use
it to implement their own stuff. Otherwise, I don't think there is much
point.

> -----------------
> So that's it.
>
> I have some comments on SciPy:
>
> 1) Documentation
>
> Virtually none; I just use the source code to understand what SciPy
> can do and how. But the docstrings are good though. I would suggest
> updating
>
> http://www.scipy.org/doc/api_docs/
>
> more often (for example I didn't find there the new l-bfgs code).
>
> 2) What is the official NumPy page?

http://numpy.scipy.org

> I believe it should be
>
> http://numpy.org/
>
> however, it points to a sourceforge page.

Correct. I don't know who still owns that domain.

> I believe this homepage
> should contain all the relevant information about numpy and contain
> links to the fee-based documentation and possibly some tutorials.
> The documentation of NumPy is scattered across many pages and I find
> it confusing.
>
> I know you have some list here:
>
> http://www.scipy.org/MigratingFromPlone
>
> But I am quite confused by the whole SciPy page. I think less but
> unconfusing information is better, but that's just my opinion. I think
> the front page of both SciPy and NumPy should be clean and simple with
> a clear link to the documentation. Like
>
> http://www.pytables.org/moin/PyTables
> http://matplotlib.sourceforge.net/

Please, by all means submit your recommendations for reorganization of
that page. Because the front page is special, I'd recommend submitting
your modifications as a ticket to our Trac (see below) instead of just
editing it. The other pages on the wiki, please modify them as you see
fit.
> However, the documentation:
>
> http://www.scipy.org/Documentation
>
> is confusing, because except for the fee-based guide to NumPy, it's not
> clear what the official SciPy doc is and what the best way of
> learning SciPy is.

There really is no official scipy doc at this time. That's part of the problem.

> So I am interested in your opinions and then I'll just integrate my
> code into scipy.optimize and send you a patch to this list?

Register an account with the scipy Trac (click "Register" in the upper-right corner):

http://projects.scipy.org/scipy/scipy

Then make a new ticket and attach your patch to that. Submit enough patches, and we'll just give you SVN access.

-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From matthieu.brucher at gmail.com Wed Apr 11 16:56:45 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Wed, 11 Apr 2007 22:56:45 +0200
Subject: [SciPy-dev] SciPy improvements
In-Reply-To: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com>
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com>
Message-ID:

> 2) Nonlinear solvers
>
> I have written these nonlinear solvers for the problem R(x) = 0, where
> x and R have dimension "n":
>
>   broyden1 - Broyden's first method - is a quasi-Newton-Raphson method for
>     updating an approximate Jacobian and then inverting it
>   broyden2 - Broyden's second method - the same as broyden1, but updates the
>     inverse Jacobian directly
>   broyden3 - Broyden's second method - the same as broyden2, but instead of
>     directly computing the inverse Jacobian, it remembers how to construct
>     it using vectors, and when computing inv(J)*F, it uses those vectors to
>     compute this product, thus avoiding the expensive NxN matrix
>     multiplication.
>   broyden_generalized - Generalized Broyden's method, the same as broyden2,
>     but instead of approximating the full NxN Jacobian, it constructs it at
>     every iteration in a way that avoids the NxN matrix multiplication.
>     This is not as precise as broyden3.
>   anderson - extended Anderson method, the same as broyden_generalized,
>     but adds w_0^2*I before taking the inversion to improve the stability
>   anderson2 - the Anderson method, the same as anderson, but formulated
>     differently
>   linear_mixing
>   exciting_mixing
>
> I use them in the self-consistent cycle of Density Functional
> Theory (so I use the terminology of the DFT literature in the names of the
> methods).

Could the part that computes the step be separated from the function itself and the optimizer? I'm trying to "modularize" nonlinear solvers so as to select more efficiently what is needed - kind of optimizer, kind of step, kind of stopping criterion, ...

Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jh at physics.ucf.edu Thu Apr 12 14:22:18 2007
From: jh at physics.ucf.edu (Joe Harrington)
Date: Thu, 12 Apr 2007 14:22:18 -0400
Subject: [SciPy-dev] SciPy improvements
In-Reply-To: (scipy-dev-request@scipy.org)
References:
Message-ID: <200704121822.l3CIMI09003701@glup.physics.ucf.edu>

Just a comment on Robert's otherwise-excellent reply: we agreed some time ago that the forum for discussing changes to the site is this list, not trac. This is because many participants such as myself are not involved in code development and do not have (or need) trac accounts.
We should *not* encourage people simply to romp in the pages and restructure as they please, as the web site is in constant public view by more than just developers and should therefore not be a playground for testing ideas (except for DevZone, which is specifically for that purpose). Shortly after the switch to the Moin site, someone went in and rewrote a bunch of the pages to follow their own style, and it made us realize that an open invitation to edit was not the best idea. Small changes like adding a link or an entry in a list are of course fine to make. Changing a page's overall structure should at least get a brief review by the list. The page layouts are simple enough that improvements can either be discussed based on a posted description, or actually made by example. For the latter, copy the page onto a page hanging off of DevZone and post an email pointing to it and asking for comment. For obvious reasons, the front page can only be modified by a few people, not just anyone with an account. It would be best if people making regular changes identified themselves in DevZone as site maintainers so that others can find them. That said, I agree restructuring is called for in some cases, and as Robert pointed out, in the doc area what's really needed is a doc. I think we'll be quick to cheer on anything reasonable in either area. --jh-- From robert.kern at gmail.com Thu Apr 12 17:06:55 2007 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 12 Apr 2007 16:06:55 -0500 Subject: [SciPy-dev] SciPy improvements In-Reply-To: <200704121822.l3CIMI09003701@glup.physics.ucf.edu> References: <200704121822.l3CIMI09003701@glup.physics.ucf.edu> Message-ID: <461E9F6F.8090004@gmail.com> Joe Harrington wrote: > Just a comment on Robert's otherwise-excellent reply, we agreed some > time ago that the forum for discussing changes to the site is this > list, not trac. This is because many participants such as myself are > not involved in code development and do not have (or need) trac > accounts. I don't actually recall any agreement to that effect. The Trac exists for tracking issues for all parts of the project, not just bugs in the code. If you want to be active in the project even you are not a developer of code, use the Trac. > We should *not* encourage people simply to romp in the > pages and restructure as they please, as the web site is in constant > public view by more than just developers and should therefore not be a > playground for testing ideas (except for DevZone, which is > specifically for that purpose). Shortly after the switch to the Moin > site, someone went in and rewrote a bunch of the pages to follow their > own style, and it made us realize that an open invitation to edit was > not the best idea. Again, I'm pretty sure that there was no such universal realization. I explicitly do encourage people to use the Wiki and change pages as they see fit (except for FrontPage because it's special). If you are concerned about changes, watch the RSS feed and change things back if you disagree with the changes. If there is continued disagreement, then bring it to the list. That's how Wikis are supposed to work. We have no lack of suggestions on this list about how a page ought to look or what it ought to have. What we lack are people actually willing to put the time in to do the edits and provide the content. I encourage the latter; the former needs no such encouragement. 
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From mforbes at physics.ubc.ca Thu Apr 12 17:31:54 2007
From: mforbes at physics.ubc.ca (Michael McNeil Forbes)
Date: Thu, 12 Apr 2007 14:31:54 -0700
Subject: [SciPy-dev] SciPy improvements
In-Reply-To:
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com>
Message-ID:

On 11 Apr 2007, at 1:56 PM, Matthieu Brucher wrote:
> 2) Nonlinear solvers
>
> I have written these nonlinear solvers for the problem R(x) = 0, where
> x and R have dimension "n":
>
>   broyden1 - Broyden's first method - is a quasi-Newton-Raphson method for
>     updating an approximate Jacobian and then inverting it
>   broyden2 - Broyden's second method - the same as broyden1, but updates the
>     inverse Jacobian directly
>   broyden3 - Broyden's second method - the same as broyden2, but instead of
>     directly computing the inverse Jacobian, it remembers how to construct
>     it using vectors, and when computing inv(J)*F, it uses those vectors to
>     compute this product, thus avoiding the expensive NxN matrix
>     multiplication.
>   broyden_generalized - Generalized Broyden's method, the same as broyden2,
>     but instead of approximating the full NxN Jacobian, it constructs it at
>     every iteration in a way that avoids the NxN matrix multiplication.
>     This is not as precise as broyden3.
>   anderson - extended Anderson method, the same as broyden_generalized,
>     but adds w_0^2*I before taking the inversion to improve the stability
>   anderson2 - the Anderson method, the same as anderson, but formulated
>     differently
>   linear_mixing
>   exciting_mixing
>
> I use them in the self-consistent cycle of Density Functional
> Theory (so I use the terminology of the DFT literature in the names of the
> methods).
>
> Could the part that computes the step be separated from the
> function itself and the optimizer? I'm trying to "modularize"
> nonlinear solvers so as to select more efficiently what is needed -
> kind of optimizer, kind of step, kind of stopping criterion, ... -
> Matthieu

It should be possible to modularize these with a step class that maintains state (the Jacobian, or its inverse etc.).

(Where is the latest version of your optimization proposal? I have not had a chance to look at it yet, but have been meaning to and would like to look at the latest version. Perhaps we should make a page on the Wiki to collect suggestions and code samples.)

I have been meaning to get a good Broyden based algorithm coded for python for a while. I have a MATLAB version of a globally convergent Broyden implementation using a linesearch as a base with a couple of unusual features that might be useful (not specific to Broyden based methods). (The code is based on the presentation of the algorithm given in Numerical Recipes with some modifications suggested by Dennis and Schnabel's book and is partially documented using noweb.)

http://www.phys.washington.edu/users/mforbes/projects/broyden/

1) Variable tolerances. The idea is to quickly estimate the starting Jacobian with low tolerance calculations and then improve the tolerances as the code converges to the solution. This is useful if the function R(x) is computed with numerical integration or some similar technique that is quick for low tolerances but expensive for high tolerance functions.
(I have rarely seen this technique in generic optimization algorithms, but found it invaluable for several problems.)

2) Real-time bailout. This allows you to compute for a specified length of time and then return the best solution found within that time frame. Most algorithms simply count function calls.

3) Selective refinement of the Jacobian as the iteration proceeds. This amounts to monitoring the condition number of the Jacobian and recomputing parts of it selectively if it becomes ill-conditioned. (For this I update the QR factorization of J rather than maintaining inv(J)).

These things all slow down the fundamental algorithm, but are very useful when the function R(x) is extremely expensive to evaluate.

Michael.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ondrej at certik.cz Thu Apr 12 19:31:35 2007
From: ondrej at certik.cz (Ondrej Certik)
Date: Fri, 13 Apr 2007 01:31:35 +0200
Subject: [SciPy-dev] SciPy improvements
In-Reply-To:
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com>
Message-ID: <85b5c3130704121631u19c1ef22w48ce1854c0897108@mail.gmail.com>

Hi, thanks to all of you for your thorough response.

1) The DE code from Robert:

http://svn.scipy.org/svn/scipy/trunk/Lib/sandbox/rkern/diffev.py

looks incredibly simple; I'll check on my problem whether it behaves better than my ugly one. But that's the thing - I didn't find any mention of it in the documentation, otherwise I would be using your code, because it's simpler (=better). Also I didn't find it in google. I'll check the simplex method from SciPy - when I was using it, I just needed a simple script, not a whole dependence on SciPy, and it was very difficult to get just the simplex algorithm out of SciPy.

2) I am sending my Broyden update methods in the attachment together with tests (you need py.test to run them). I am using a test from some article, I think from Vanderbilt. However, my code is giving slightly different results. I'll polish it and send it as a patch to SciPy in the trac, as directed, so you can just look at how I do it. But when you do it, you will see there is really nothing to it - it's very trivial. However, my code doesn't use linesearch. Also I am curious how the BFGS method is going to work when I implement it - it's very similar to Broyden, except the update of the Jacobian is a little different.

Could you Michael please also rewrite your code in Python?

http://www.phys.washington.edu/users/mforbes/projects/broyden/

It would be nice to have all of it in SciPy with the same interface. BTW - why are you using matlab at all? To me, the python with numpy is better than anything else I've programmed in, including matlab.

3) about the logistic transformation - I was sceptical too, until I tried it, and it was converging faster by a factor of 7x on my problem (chemev). So for me it's enough justification, but of course I am not saying that it must converge faster for any problem. However, I implemented it as a wrapper function above all the unconstrained algorithms with the SciPy interface, so the user is not forced to use it - he can just try it and see, as I did (a minimal sketch of the idea follows below). I'll post it to the cookbook.

4) About petsc - I know it's another dependency. However, I noticed you are using umfpack in SciPy. So why not petsc? I think it contains many more (sometimes better) solvers (depends on the problem). It seems logical to me to either use nothing, or the best library available, which I believe is petsc.
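[A minimal sketch of the logistic-transform wrapper described in point 3 above, assuming only numpy and scipy.optimize.fmin; the name fmin_constrained and its signature are hypothetical, not the chemev code:

import numpy as np
from scipy.optimize import fmin

def fmin_constrained(f, x0, lo, hi, **kw):
    # Box-constrained minimization via an unconstrained optimizer:
    # optimize over y in R^n, mapped into (lo, hi) by a logistic function.
    lo = np.asarray(lo, float)
    hi = np.asarray(hi, float)
    def to_box(y):
        # componentwise logistic map R -> (lo, hi)
        return lo + (hi - lo) / (1.0 + np.exp(-y))
    def from_box(x):
        # inverse (logit) map; x0 must lie strictly inside the box
        p = (np.asarray(x, float) - lo) / (hi - lo)
        return np.log(p / (1.0 - p))
    y_opt = fmin(lambda y: f(to_box(y)), from_box(x0), **kw)
    return to_box(y_opt)

Robert's caution above still applies: the map flattens the objective near the bounds, so gradients there become tiny and derivative-based optimizers can stall; the 7x speedup reported here is problem-specific.]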
5) Documentation: the front page is quite fine; however, the documentation needs a complete redesign in my opinion. First - I believe numpy should be separated from SciPy and have its own page (numpy.org), but if you think it belongs under the hood of scipy.org, then ok.

So, I'll copy the page:

http://www.scipy.org/Documentation

into some new one, and redesign it as I would like it to be, and then you'll tell me what you think about it. The same with other pages if I get a better idea about them. This way I shouldn't spoil anything in case you wouldn't like it. Because I don't have just a couple of small fixes.

Ondrej

On 4/12/07, Michael McNeil Forbes wrote:
> On 11 Apr 2007, at 1:56 PM, Matthieu Brucher wrote:
> > 2) Nonlinear solvers
> >
> > I have written these nonlinear solvers for the problem R(x) = 0, where
> > x and R have dimension "n":
> >
> >   broyden1 - Broyden's first method - is a quasi-Newton-Raphson method for
> >     updating an approximate Jacobian and then inverting it
> >   broyden2 - Broyden's second method - the same as broyden1, but updates the
> >     inverse Jacobian directly
> >   broyden3 - Broyden's second method - the same as broyden2, but instead of
> >     directly computing the inverse Jacobian, it remembers how to construct
> >     it using vectors, and when computing inv(J)*F, it uses those vectors to
> >     compute this product, thus avoiding the expensive NxN matrix
> >     multiplication.
> >   broyden_generalized - Generalized Broyden's method, the same as broyden2,
> >     but instead of approximating the full NxN Jacobian, it constructs it at
> >     every iteration in a way that avoids the NxN matrix multiplication.
> >     This is not as precise as broyden3.
> >   anderson - extended Anderson method, the same as broyden_generalized,
> >     but adds w_0^2*I before taking the inversion to improve the stability
> >   anderson2 - the Anderson method, the same as anderson, but formulated
> >     differently
> >   linear_mixing
> >   exciting_mixing
> >
> > I use them in the self-consistent cycle of Density Functional
> > Theory (so I use the terminology of the DFT literature in the names of the
> > methods).
>
> Could the part that computes the step be separated from the function itself
> and the optimizer? I'm trying to "modularize" nonlinear solvers so as to
> select more efficiently what is needed - kind of optimizer, kind of step,
> kind of stopping criterion, ... -
>
> Matthieu
>
> It should be possible to modularize these with a step class that maintains
> state (the Jacobian, or its inverse etc.).
>
> (Where is the latest version of your optimization proposal? I have not had
> a chance to look at it yet, but have been meaning to and would like to look
> at the latest version. Perhaps we should make a page on the Wiki to collect
> suggestions and code samples.)
>
> I have been meaning to get a good Broyden based algorithm coded for python
> for a while. I have a MATLAB version of a globally convergent Broyden
> implementation using a linesearch as a base with a couple of unusual
> features that might be useful (not specific to Broyden based methods).
> (The code is based on the presentation of the algorithm given in Numerical
> Recipes with some modifications suggested by Dennis and Schnabel's book and
> is partially documented using noweb.)
>
> http://www.phys.washington.edu/users/mforbes/projects/broyden/
>
> 1) Variable tolerances.
> The idea is to quickly estimate the starting Jacobian with low tolerance
> calculations and then improve the tolerances as the code converges to the
> solution. This is useful if the function R(x) is computed with numerical
> integration or some similar technique that is quick for low tolerances but
> expensive for high tolerance functions. (I have rarely seen this technique
> in generic optimization algorithms, but found it invaluable for several
> problems.)
>
> 2) Real-time bailout. This allows you to compute for a specified length of
> time and then return the best solution found within that time frame. Most
> algorithms simply count function calls.
>
> 3) Selective refinement of the Jacobian as the iteration proceeds. This
> amounts to monitoring the condition number of the Jacobian and recomputing
> parts of it selectively if it becomes ill-conditioned. (For this I update
> the QR factorization of J rather than maintaining inv(J)).
>
> These things all slow down the fundamental algorithm, but are very useful
> when the function R(x) is extremely expensive to evaluate.
>
> Michael.
>
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev

-------------- next part --------------
A non-text attachment was scrubbed...
Name: solvers.py
Type: text/x-python
Size: 12553 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_solvers.py
Type: text/x-python
Size: 1948 bytes
Desc: not available
URL:

From robert.kern at gmail.com Thu Apr 12 19:45:15 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 12 Apr 2007 18:45:15 -0500
Subject: [SciPy-dev] SciPy improvements
In-Reply-To: <85b5c3130704121631u19c1ef22w48ce1854c0897108@mail.gmail.com>
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121631u19c1ef22w48ce1854c0897108@mail.gmail.com>
Message-ID: <461EC48B.50105@gmail.com>

Ondrej Certik wrote:
> 3) about the logistic transformation - I was sceptical too, until I
> tried it, and it was converging faster by a factor of 7x on my
> problem (chemev). So for me it's enough justification, but of course I
> am not saying that it must converge faster for any problem.

I'm sure it works faster; I'd just like to make sure that it always gives the correct answer.

> 4) About petsc - I know it's another dependency. However, I
> noticed you are using umfpack in SciPy. So why not petsc? I think it
> contains many more (sometimes better) solvers (depends on the
> problem). It seems logical to me to either use nothing, or the best
> library available, which I believe is petsc.

Well, I wasn't as much of a dependency/no-optional-features freak when the optional UMFPACK stuff went in. Also, IIRC the wrappers for UMFPACK were written specifically for scipy; they didn't exist as a separate package beforehand. petsc4py already exists. Unless we decide that some other feature of scipy needs it, there is no reason that I can see for bringing it into the scipy package.
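[Ondrej's solvers.py attachment above was scrubbed by the archive, so as a stand-in here is a minimal sketch of the "bad Broyden" (broyden2-style) rank-one update of the inverse Jacobian described in his point 2 — an illustration under assumed sign conventions, not the attached code:

import numpy as np

def broyden2(R, x0, iters=30, alpha=0.1):
    # Solve R(x) = 0, maintaining an approximation H of inv(J) directly.
    x = np.asarray(x0, float)
    H = -alpha * np.eye(x.size)   # initial inv(J): plain mixing (sign convention assumed)
    F = np.asarray(R(x), float)
    for _ in range(iters):
        dx = -np.dot(H, F)        # quasi-Newton step
        x = x + dx
        Fnew = np.asarray(R(x), float)
        dF = Fnew - F
        denom = np.dot(dF, dF)
        if denom == 0.0:          # residual no longer changes: done
            break
        # rank-one update enforcing the secant condition H * dF = dx
        H = H + np.outer(dx - np.dot(H, dF), dF) / denom
        F = Fnew
    return x

Each update costs O(n^2) rather than the O(n^3) of refactoring a full Jacobian; the broyden3 variant in the list earlier goes further and stores only the update vectors, so inv(J)*F can be applied without ever forming the NxN matrix.]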
> 5)documentation: the front page is quite fine, however the > documentation needs complete redesign in my opinion. First - I believe > the numpy should be separated from SciPy and have it's own page > (numpy.org), but if you think it belongs under the hood of scipy.org, > then ok. > > So, I'll copy the page: > > http://www.scipy.org/Documentation > > into some new one, and redesign it as I would like it to be, and then > you'll tell me what you think about it. The same with other pages if > I'll get a better idea about them. This way I shouldn't spoil anything > in case you wouldn't like it. Because I don't have just couple of > small fixes. As you like. Thank you! -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From mforbes at physics.ubc.ca Thu Apr 12 20:55:48 2007 From: mforbes at physics.ubc.ca (Michael McNeil Forbes) Date: Thu, 12 Apr 2007 17:55:48 -0700 Subject: [SciPy-dev] SciPy improvements In-Reply-To: <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> Message-ID: <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> On 12 Apr 2007, at 4:32 PM, Ondrej Certik wrote: > Could you Michael please also rewrite your code to Python? > > http://www.phys.washington.edu/users/mforbes/projects/broyden/ > > It would be nice to have all of it in SciPy with the same interface. > > BTW - why are you using matlab at all? To me, the python with numpy is > better than anything else I've programmed in, including matlab. I wrote this code back when I had easy access to matlab and before I knew python. I simply have not had time to port the broyden code to python yet. Hopefully I will find time soon (it would also be nice to organize the pieces in a modular fashion as Matthieu suggested, but I have simply not had time to look over that yet.) There are still a few things I miss about matlab, especially a good line-by-line profiler, but generally I agree that python+numpy is much better for programming. Michael. From wbaxter at gmail.com Thu Apr 12 22:32:41 2007 From: wbaxter at gmail.com (Bill Baxter) Date: Fri, 13 Apr 2007 11:32:41 +0900 Subject: [SciPy-dev] SciPy improvements In-Reply-To: <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> Message-ID: On 4/13/07, Michael McNeil Forbes wrote: > On 12 Apr 2007, at 4:32 PM, Ondrej Certik wrote: > > > Could you Michael please also rewrite your code to Python? > > > > http://www.phys.washington.edu/users/mforbes/projects/broyden/ > > > > It would be nice to have all of it in SciPy with the same interface. > > > > BTW - why are you using matlab at all? To me, the python with numpy is > > better than anything else I've programmed in, including matlab. I don't know if you were serious or not, but Matlab still has a huge amount of inertia and a very good network effect going for it. Plus many more years of development behind it than numpy. You're much more likely to be able to find random algorithms on the net implemented in Matlab than in Python/Numpy, that is if you even need to go looking, because a lot of stuff is already in Matlab to begin with. 
So while I personally agree that Numpy is better than Matlab codewise, a code in the hand beats two in the bush. I'd be very happy to drop Matlab completely if you could just port the NetLab (http://www.ncrg.aston.ac.uk/netlab/) and a few other things for me. :-) --bb

From david at ar.media.kyoto-u.ac.jp Thu Apr 12 22:38:10 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 13 Apr 2007 11:38:10 +0900
Subject: [SciPy-dev] Scipy and LAPACK 3.1.* ?
In-Reply-To: <461B77CC.9070009@iam.uni-stuttgart.de>
References: <4619AB7D.2070104@ar.media.kyoto-u.ac.jp> <461B77CC.9070009@iam.uni-stuttgart.de>
Message-ID: <461EED12.2080909@ar.media.kyoto-u.ac.jp>

Nils Wagner wrote:
> Hi David,
>
> I can confirm your findings.
> BTW, is there a way to obtain the information which version of LAPACK is
> used via scipy.show_config() ?
> I mean something like [('ATLAS_INFO', '"\\"3.7.28\\""')]

I guess not, because the LAPACK sources themselves do not seem to have any API to retrieve the current version. I will try to see where this scipy error is coming from, then,

David

From david at ar.media.kyoto-u.ac.jp Thu Apr 12 22:53:56 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 13 Apr 2007 11:53:56 +0900
Subject: [SciPy-dev] SciPy improvements
In-Reply-To:
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca>
Message-ID: <461EF0C4.6060506@ar.media.kyoto-u.ac.jp>

Bill Baxter wrote:
> I don't know if you were serious or not, but Matlab still has a huge
> amount of inertia and a very good network effect going for it. Plus
> many more years of development behind it than numpy. You're much more
> likely to be able to find random algorithms on the net implemented in
> Matlab than in Python/Numpy, that is if you even need to go looking,
> because a lot of stuff is already in Matlab to begin with.
>
> So while I personally agree that Numpy is better than Matlab codewise,
> a code in the hand beats two in the bush. I'd be very happy to drop
> Matlab completely if you could just port the NetLab
> (http://www.ncrg.aston.ac.uk/netlab/) and a few other things for me.

Well, netlab is a huge package, but google just announced that my SoC project pymachine, a set of toolboxes for machine learning-related algorithms, has been accepted:

http://code.google.com/soc/psf/appinfo.html?csaid=44CD86A83707638A

(the full proposal can be found here: http://www.ar.media.kyoto-u.ac.jp/members/david/fullproposal.html)

So expect some news (and more importantly, some code) on this front in the coming months,

cheers,

David

From wbaxter at gmail.com Thu Apr 12 23:13:34 2007
From: wbaxter at gmail.com (Bill Baxter)
Date: Fri, 13 Apr 2007 12:13:34 +0900
Subject: [SciPy-dev] SciPy improvements
In-Reply-To: <461EF0C4.6060506@ar.media.kyoto-u.ac.jp>
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp>
Message-ID:

On 4/13/07, David Cournapeau wrote:
> Well, netlab is a huge package, but google just announced that my SoC
> project pymachine, a set of toolboxes for machine learning-related
> algorithms, has been accepted:

That's great to hear. I've been keeping an eye on your progress with pyEM and such. Very promising.
Incidentally I work with some speech and graphics guys from ATR, where I see you worked previously. Do you know Yotsukura-san, Kawamoto-san, or Nakamura-san? (I think Nakamura-san may even be the head of ATR now.)

> http://code.google.com/soc/psf/appinfo.html?csaid=44CD86A83707638A (the
> full proposal can be found here:
> http://www.ar.media.kyoto-u.ac.jp/members/david/fullproposal.html)
> So expect some news (and more importantly, some code) on this front in
> the coming months,

I would be interested in joining a dev list on this or something like that (or open dev blog? or wiki?) if you start such a thing. I assume you have to have discussions with your mentor anyway. If possible it'd be nice to peek in on those conversations. --bb

From robert.kern at gmail.com Thu Apr 12 23:20:08 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 12 Apr 2007 22:20:08 -0500
Subject: [SciPy-dev] SciPy improvements
In-Reply-To:
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp>
Message-ID: <461EF6E8.1020108@gmail.com>

Bill Baxter wrote:
> On 4/13/07, David Cournapeau wrote:
>> http://code.google.com/soc/psf/appinfo.html?csaid=44CD86A83707638A (the
>> full proposal can be found here:
>> http://www.ar.media.kyoto-u.ac.jp/members/david/fullproposal.html)
>
>> So expect some news (and more importantly, some code) on this front in
>> the coming months,
>
> I would be interested in joining a dev list on this or something like
> that (or open dev blog? or wiki?) if you start such a thing. I assume
> you have to have discussions with your mentor anyway. If possible
> it'd be nice to peek in on those conversations.

David is welcome to use scipy-dev and scipy.org especially since a good chunk of the project involves scipy packages.

-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From david at ar.media.kyoto-u.ac.jp Fri Apr 13 00:59:49 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 13 Apr 2007 13:59:49 +0900
Subject: [SciPy-dev] SciPy improvements
In-Reply-To:
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp>
Message-ID: <461F0E45.5000009@ar.media.kyoto-u.ac.jp>

Bill Baxter wrote:
> On 4/13/07, David Cournapeau wrote:
>
>> Well, netlab is a huge package, but google just announced that my SoC
>> project pymachine, a set of toolboxes for machine learning-related
>> algorithms, has been accepted:
>
> That's great to hear. I've been keeping an eye on your progress with
> pyEM and such. Very promising.
>
> Incidentally I work with some speech and graphics guys from ATR, where
> I see you worked previously. Do you know Yotsukura-san, Kawamoto-san,
> or Nakamura-san? (I think Nakamura-san may even be the head of ATR now.)

I indeed worked there for some time before starting my PhD program at Kyodai, but not in the speech lab (I worked in the now defunct MIS lab).

> I would be interested in joining a dev list on this or something like
> that (or open dev blog? or wiki?) if you start such a thing. I assume
> you have to have discussions with your mentor anyway.
> If possible it'd be nice to peek in on those conversations.

There is nothing started yet, and some things need to be fixed with my mentor before things get started, but as Robert said, most if not all discussion related to it would happen here and follow the usual scipy process (scipy SVN, Trac, etc...).

David

From wbaxter at gmail.com Fri Apr 13 01:41:10 2007
From: wbaxter at gmail.com (Bill Baxter)
Date: Fri, 13 Apr 2007 14:41:10 +0900
Subject: [SciPy-dev] SciPy improvements
In-Reply-To: <461F0E45.5000009@ar.media.kyoto-u.ac.jp>
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp> <461F0E45.5000009@ar.media.kyoto-u.ac.jp>
Message-ID:

On 4/13/07, David Cournapeau wrote:
> Bill Baxter wrote:
> > I would be interested in joining a dev list on this or something like
> > that (or open dev blog? or wiki?) if you start such a thing. I assume
> > you have to have discussions with your mentor anyway. If possible
> > it'd be nice to peek in on those conversations.
>
> There is nothing started yet, and some things need to be fixed with my
> mentor before things get started, but as Robert said, most if not all
> discussion related to it would happen here and follow the usual scipy
> process (scipy SVN, Trac, etc...).

Great then.

The project page mentions SVM. In addition to SVM I'm interested in things like PPCA, kernel PCA, RBF networks, gaussian processes and GPLVM. Are you going to try to go in the direction of a modular structure with reusable bits for all kernel methods, or is the plan targeted specifically at SVM?

The basic components of this stuff (like RBFs) also make for good scattered data interpolation schemes. I hear questions every so often on the list about good ways to do that, so making the tools for the machine learning toolkit easy to use for people who just want to interpolate data would be nice.

Going in a slightly different direction, meshfree methods for solving partial differential equations also build on tools like RBF and moving least squares interpolation. So for that reason too, it would be nice to have a reusable API layer for those things.

You mention also that you're planning to unify row vec vs. column vec conventions.
Just wanted to put my vote in for row vectors! For a > number of reasons > 1) It seems to be the more common usage in machine learning literature > 2) with Numpy's default C-contiguous data it puts individual vectors > in contiguous memory. > 3) it's easier to print something that's Nx5 than 5xN > 4) "for vec in lotsofvecs:" works with row vectors, but needs a > transpose for column vectors. > 5) accessing a vector becomes just data[i] instead of data[:,i] which > makes it easier to go back and forth between a python list of vectors > and a numpy 2d array of vectors. One more: 6) mat(avec) where avec is 1-D returns a row vector rather than a column vector. From robert.kern at gmail.com Fri Apr 13 01:50:54 2007 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 13 Apr 2007 00:50:54 -0500 Subject: [SciPy-dev] SciPy improvements In-Reply-To: References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp> <461F0E45.5000009@ar.media.kyoto-u.ac.jp> Message-ID: <461F1A3E.1000508@gmail.com> Bill Baxter wrote: > The project page mentions SVM. In addition to SVM I'm interested in > things like PPCA, kernel PCA, RBF networks, gaussian processes and > GPLVM. Are you going to try to go in the direction of a modular > structure with reusable bits for for all kernel methods, or is the > plan to targeted specifically SVM? On that note, I have some Gaussian process code in a Mercurial repository here (click the "manifest" button to browse the source): http://www.enthought.com/~rkern/cgi-bin/hgwebdir.cgi/gp/ It's based on the treatment by Rasmussen and Williams: http://www.gaussianprocess.org/gpml/ The covariance functions I implement there might be useful in other methods, too. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From david at ar.media.kyoto-u.ac.jp Fri Apr 13 01:57:31 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 13 Apr 2007 14:57:31 +0900 Subject: [SciPy-dev] SciPy improvements In-Reply-To: References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp> <461F0E45.5000009@ar.media.kyoto-u.ac.jp> Message-ID: <461F1BCB.9000306@ar.media.kyoto-u.ac.jp> Bill Baxter wrote: > On 4/13/07, David Cournapeau wrote: >> Bill Baxter wrote: >>> On 4/13/07, David Cournapeau wrote: >>> I would be interested in joining a dev list on this or something like >>> that (or open dev blog? or wiki?) if you start such a thing. I assume >>> you have to have discussions with your mentor anyway. If possible >>> it'd be nice to peek in on those conversations. >>> >> There is nothing started yet, and some things need to be fixed with my >> mentor before things get started, but as Robert said, most if not all >> discussion related to it would happen here and follow the usual scipy >> process (scipy SVN, Trac, etc...). > > Great then. > > The project page mentions SVM. In addition to SVM I'm interested in > things like PPCA, kernel PCA, RBF networks, gaussian processes and > GPLVM. 
Are you going to try to go in the direction of a modular > structure with reusable bits for for all kernel methods, or is the > plan to targeted specifically SVM? The plan is really about unifying and improving existing toolboxes, and provides a higher level API (which would end up in scikits for various reasons). Depending on the time left, I will add some algorithms later. Of course, the goal is that other people will also jump in to add new algorithms (I for example will add some recent advances for mixture like ensemble learning, outside of the SoC if necessary). > > The basic components of this stuff (like RBFs) also make for good > scattered data interpolation schemes. I hear questions every so often > on the list about good ways to do that, so making the tools for the > machine learning toolkit easy to use for people who just want to > interpolate data would be nice. > > Going in a slightly different direction, meshfree methods for solving > partial differential equations also build on tools like RBF and moving > least squares interpolation. So for that reason too, it would be nice > to have a reusable api layer for those things. > > You mention also that you're planning to unify row vec vs. column vec > conventions. Just wanted to put my vote in for row vectors! For a > number of reasons > 1) It seems to be the more common usage in machine learning literature > 2) with Numpy's default C-contiguous data it puts individual vectors > in contiguous memory. > 3) it's easier to print something that's Nx5 than 5xN > 4) "for vec in lotsofvecs:" works with row vectors, but needs a > transpose for column vectors. > 5) accessing a vector becomes just data[i] instead of data[:,i] which > makes it easier to go back and forth between a python list of vectors > and a numpy 2d array of vectors. I have not given a lot of thoughts about it yet; what matters the most is that all algo have the same conventions. Nevertheless, my experience so far in numpy is similar to yours with regard to ML algorithms (except point 2: depending on the algo. you need contiguous access along one dimension, and my impression is that in numpy, this matters a lot performance wise, at least much more than in matlab). David From david at ar.media.kyoto-u.ac.jp Fri Apr 13 02:08:14 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 13 Apr 2007 15:08:14 +0900 Subject: [SciPy-dev] SciPy improvements In-Reply-To: <461F1A3E.1000508@gmail.com> References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp> <461F0E45.5000009@ar.media.kyoto-u.ac.jp> <461F1A3E.1000508@gmail.com> Message-ID: <461F1E4E.1040705@ar.media.kyoto-u.ac.jp> Robert Kern wrote: > Bill Baxter wrote: > >> The project page mentions SVM. In addition to SVM I'm interested in >> things like PPCA, kernel PCA, RBF networks, gaussian processes and >> GPLVM. Are you going to try to go in the direction of a modular >> structure with reusable bits for for all kernel methods, or is the >> plan to targeted specifically SVM? 
> > On that note, I have some Gaussian process code in a Mercurial repository
> > here (click the "manifest" button to browse the source):
> >
> > http://www.enthought.com/~rkern/cgi-bin/hgwebdir.cgi/gp/
> >
> > It's based on the treatment by Rasmussen and Williams:
> >
> > http://www.gaussianprocess.org/gpml/
> >
> > The covariance functions I implement there might be useful in other
> > methods, too.

Thanks for those links, I will take a look at the code you wrote.

Since you're here, I have some questions concerning chaco for the visualization part of the project. Basically, I am unsure about whether I should use chaco or matplotlib. I do not know chaco very well yet, but it seems much better API- and performance-wise than matplotlib for interactive visualization. The problem is that it still does not have a lot of visibility in the community compared to matplotlib, and it is still pretty complicated to install. I do not care much about those points myself, but seeing how installation problems are one of the big difficulties for newcomers to numpy/scipy, I am a bit concerned. Is this impression founded, and if it is, is there a chance to see improvements on those fronts in the next few months?

David

From matthieu.brucher at gmail.com Fri Apr 13 02:23:12 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Fri, 13 Apr 2007 08:23:12 +0200
Subject: [SciPy-dev] SciPy improvements
In-Reply-To:
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp> <461F0E45.5000009@ar.media.kyoto-u.ac.jp>
Message-ID:

> The project page mentions SVM. In addition to SVM I'm interested in
> things like PPCA, kernel PCA, RBF networks, gaussian processes and
> GPLVM. Are you going to try to go in the direction of a modular
> structure with reusable bits for all kernel methods, or is the
> plan targeted specifically at SVM?

Doesn't scipy have SVMs already? Perhaps not as modularized as it could be? PPCA is PCA IIRC (Tipping 97, it's part of my PhD thesis), and KPCA is not a big deal if kernels are in a module and have a method taking 2 arguments. BTW, even the svd could directly take a kernel as an argument, the default kernel being the scalar product? I'm in favour of fine-grained modules - like the optimisation module I proposed - and allowing people to choose which kernel they want, even if the kernel was designed for SVM, is a good thing; the "kernel trick" should be almost universal :)

Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From david at ar.media.kyoto-u.ac.jp Fri Apr 13 02:40:18 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 13 Apr 2007 15:40:18 +0900
Subject: [SciPy-dev] SciPy improvements
In-Reply-To:
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp> <461F0E45.5000009@ar.media.kyoto-u.ac.jp>
Message-ID: <461F25D2.6050405@ar.media.kyoto-u.ac.jp>

Matthieu Brucher wrote:
> The project page mentions SVM. In addition to SVM I'm interested in
> things like PPCA, kernel PCA, RBF networks, gaussian processes and
> GPLVM.
Are you going to try to go in the direction of a modular > structure with reusable bits for for all kernel methods, or is the > plan to targeted specifically SVM? > > > Don't scipy have SVMs already ? Perhaps not as modularized at it could > be ? > PPCA is PCA IIRC (Tipping 97, it's part of my Phd thesis), KPCA is not > a big deal if kernels are in a module, and if they have a method > taking 2 arguments. BTW, even the svd could directly take a kernel as > an argument, the default kernel being the scalar product ? > I'm in favour of fine-grained modules - like for the optimisation > module I proposed -, and allowing pepole to choose which kernel they > want, even if the kernel was designed for SVM, is a good thing, the > "kernel trick" should be almost universal :) The project is first about unifying *existing* packages: basically, make them first class citizen doc-wise and api-wise, so that they can be moved out of the sandbox. SVM, EM for mixture fall in this category. cheers, David From wbaxter at gmail.com Fri Apr 13 02:46:27 2007 From: wbaxter at gmail.com (Bill Baxter) Date: Fri, 13 Apr 2007 15:46:27 +0900 Subject: [SciPy-dev] SciPy improvements In-Reply-To: References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp> <461F0E45.5000009@ar.media.kyoto-u.ac.jp> Message-ID: On 4/13/07, Matthieu Brucher wrote: > > > The project page mentions SVM. In addition to SVM I'm interested in > > things like PPCA, kernel PCA, RBF networks, gaussian processes and > > GPLVM. Are you going to try to go in the direction of a modular > > structure with reusable bits for for all kernel methods, or is the > > plan to targeted specifically SVM? > > Don't scipy have SVMs already ? Perhaps not as modularized at it could be ? It might have something in sandbox, but as far as I'm concerned 'in scipy.sandbox' is synonymous with 'not in scipy'. > PPCA is PCA IIRC (Tipping 97, it's part of my Phd thesis), Yes, pretty much so, just with some variances added into the diagonal of the matrix at the right place. > KPCA is not a big > deal if kernels are in a module, and if they have a method taking 2 > arguments. Right. Low level kernel stuff in reusable lib makes a lot of sense. If there were a lib of functions with common kernels and their derivatives and possibly even 2nd derivatives, that would cover a lot of territory. (I could use some 2nd derivatives right now...) --bb From oliphant.travis at ieee.org Fri Apr 13 03:06:41 2007 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri, 13 Apr 2007 01:06:41 -0600 Subject: [SciPy-dev] SciPy improvements In-Reply-To: <461F1BCB.9000306@ar.media.kyoto-u.ac.jp> References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp> <461F0E45.5000009@ar.media.kyoto-u.ac.jp> <461F1BCB.9000306@ar.media.kyoto-u.ac.jp> Message-ID: <461F2C01.2070904@ieee.org> David Cournapeau wrote: > I have not given a lot of thoughts about it yet; what matters the most > is that all algo have the same conventions. Nevertheless, my experience > so far in numpy is similar to yours with regard to ML algorithms (except > point 2: depending on the algo. 
you need contiguous access along one
> dimension, and my impression is that in numpy, this matters a lot
> performance-wise, at least much more than in matlab).

If I understand you correctly, this is not as true as it once was. There was some benchmark code that showed a slow-down on transposed arrays until we caught the bug that was causing it. What is important is that your inner-loops are running over data that is "closest" together in memory. Exactly how close is close enough depends on cache size.

-Travis

From nwagner at iam.uni-stuttgart.de Fri Apr 13 03:01:07 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Fri, 13 Apr 2007 09:01:07 +0200
Subject: [SciPy-dev] petsc - UMFPACK and scipy
Message-ID: <461F2AB3.1080407@iam.uni-stuttgart.de>

Ondrej,

Can you please show me an example where petsc solvers are "better" than UMFPACK?

Nils

4) About petsc - I know it's another dependency. However, I noticed you are using umfpack in SciPy. So why not petsc? I think it contains many more (sometimes better) solvers (depends on the problem). It seems logical to me to either use nothing, or the best library available, which I believe is petsc.

From matthieu.brucher at gmail.com Fri Apr 13 03:01:46 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Fri, 13 Apr 2007 09:01:46 +0200
Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user)
In-Reply-To:
References:
Message-ID:

A little update of my proposal:
- each step can be updated after each iteration; it will be enhanced so that everything computed in the iteration will be passed on, in case it is needed to update the step. That could be useful for approximated steps
- added a simple Damped optimizer: it tries to take a step, and if the cost is higher than before, half a step is tested, ...
- a function object is created if the function argument is not passed (takes the arg 'fun' as the cost function, gradient for the gradient, ...). Some safeguards must still be implemented.

I was thinking of the limits of this architecture:
- definitely all quasi-Newton optimizers can be ported to this framework, as well as all semi-quadratic ones
- constrained optimization will not, unless it is modified so that it can, but as I do not use such optimizers in my PhD thesis, I do not know them well enough

But even the simplex/polytope optimizer (fmin) can be expressed in the framework - it is useless though, as it would be slower - and can take advantage of the different stopping criteria. BTW, I used some parts of this framework in an EM algorithm with an AIC-based optimizer on top.

As I said in another thread, I'm in favour of fine-grained modules, even if some wrapper can provide simple optimization procedures (a minimal sketch of such interfaces follows below).

Matthieu

2007/3/26, Matthieu Brucher :
> > OK, I see why you want that approach.
> > (So that you can still pass a single object around in your
> > optimizer module.) Yes, that seems right...
>
> Exactly :)
>
> > This seems to bundle naturally with a specific optimizer?
>
> I'm not an expert in optimization, but I attended several classes/seminars
> on the subject, and at least the usual simple optimizers - the standard
> optimizer, all damped approaches, and all the others that use a step and a
> criterion test - use this interface, and with a lot of different steps that
> are usual - gradient, every conjugate gradient variant, (quasi-)Newton -
> or criteria.
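[A minimal sketch of the decomposition Matthieu describes — an optimizer loop driving pluggable step and stopping-criterion objects. All class and method names here are hypothetical, chosen to illustrate the shape of the proposal rather than its actual code:

import numpy as np

class GradientStep(object):
    # Step direction: plain gradient descent.
    def __call__(self, function, point, state):
        return -function.gradient(point)

class SmallGradientCriterion(object):
    # Stop when the gradient norm falls below a tolerance.
    def __init__(self, tol=1e-6):
        self.tol = tol
    def __call__(self, function, point, state):
        return np.linalg.norm(function.gradient(point)) < self.tol

class StandardOptimizer(object):
    # Drives the iteration; step and criterion are swappable parts.
    def __init__(self, function, step, criterion, x0, stepsize=0.1, maxiter=1000):
        self.function, self.step, self.criterion = function, step, criterion
        self.point = np.asarray(x0, float)
        self.stepsize, self.maxiter = stepsize, maxiter
        self.state = {'iteration': 0}
    def iterate(self):
        direction = self.step(self.function, self.point, self.state)
        self.point = self.point + self.stepsize * direction
        self.state['iteration'] += 1
    def optimize(self):
        while (self.state['iteration'] < self.maxiter
               and not self.criterion(self.function, self.point, self.state)):
            self.iterate()
        return self.point

# assumed function object: exposes __call__(x) and gradient(x)
class Quadratic(object):
    def __call__(self, x):
        return float(np.dot(x, x))
    def gradient(self, x):
        return 2.0 * x

opt = StandardOptimizer(Quadratic(), GradientStep(), SmallGradientCriterion(), x0=[3.0, -2.0])
print(opt.optimize())

A damped optimizer of the kind described above would override only iterate(), halving the step while the cost increases; the step and criterion objects are reused unchanged.]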
> I even suppose it can do very well in semi-quadratic optimization, with
> very little change, but I have to finish some work before I can read some
> books on the subject to begin implementing it in Python.
>
> > If so, the class definition should reside in the StandardOptimizer module.
> >
> > Cheers,
> > Alan Isaac
> >
> > PS For readability, I think Optimizer should define
> > a "virtual" iterate method. E.g.,
> > def iterate(self):
> >     return NotImplemented
>
> Yes, it seems better.
>
> Thanks for the opinion !
>
> Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: optimizerProposal_02.tar.gz
Type: application/x-gzip
Size: 3861 bytes
Desc: not available
URL: 

From david at ar.media.kyoto-u.ac.jp  Fri Apr 13 03:05:31 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 13 Apr 2007 16:05:31 +0900
Subject: [SciPy-dev] petsc - UMFPACK and scipy
In-Reply-To: <461F2AB3.1080407@iam.uni-stuttgart.de>
References: <461F2AB3.1080407@iam.uni-stuttgart.de>
Message-ID: <461F2BBB.7060406@ar.media.kyoto-u.ac.jp>

Nils Wagner wrote:
> Ondrej,
>
> Please can you show me an example where petsc solvers are "better" than
> UMFPACK.
>
> Nils
>
> 4) About the petsc - I know it's another dependency. However, I
> noticed you are using umfpack in SciPy. So why not petsc? I think it
> contains many more (sometimes better) solvers (it depends on the
> problem). It seems logical to me to either use nothing, or the best
> library available, which I believe is petsc.

I can give you one situation where adding a dependency makes things more
complicated: when you are a packager. I am trying to "evangelize"
numpy/scipy, and one problem people face is installation. When you are a
user, adding a dependency is great: it gives you more code, more API to
leverage. When you are a packager, each dependency is a mess. I am
working on rpm and debian packages of numpy and scipy, and 99.9 % of the
problems are the dependencies. LAPACK and BLAS are already quite
difficult to package correctly (debian was the only distribution to do
it correctly for a long time), UMFPACK is kind of a pain to compile too
(it depends on two other packages), and let's not even start talking
about ATLAS, which poses significant challenges by its very nature
(again, only debian has done it correctly, with Fedora copying their
method). And I only have experience on linux, where at least every
distribution uses the same compiler suite.

Most of those libraries are not really commonly used, and as such are
not provided by distributors most of the time. I wouldn't be surprised
if this is one of the reasons why Robert is reluctant to add more
dependencies.

David

From matthieu.brucher at gmail.com  Fri Apr 13 03:11:42 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Fri, 13 Apr 2007 09:11:42 +0200
Subject: [SciPy-dev] SciPy improvements
In-Reply-To: 
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com>
	<85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com>
	<30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca>
	<461EF0C4.6060506@ar.media.kyoto-u.ac.jp>
	<461F0E45.5000009@ar.media.kyoto-u.ac.jp>
Message-ID: 

> > Doesn't scipy have SVMs already ? Perhaps not as modularized as it could
> be ?
>
> It might have something in the sandbox, but as far as I'm concerned 'in
> scipy.sandbox' is synonymous with 'not in scipy'.

OK, I read David's answer - BTW, hello Gabou :) -.
I'm looking forward to this; I will probably use SVMs in the near
future, as I'm moving toward Python.

> PPCA is PCA IIRC (Tipping 97, it's part of my PhD thesis),
>
> Yes, pretty much so, just with some variances added into the diagonal
> of the matrix at the right place.

OK, so what you want is in fact the implementation of the optimization
algorithm given at the end of the paper ? Because in the standard form
of PPCA, the error variance is isotropic, and in that case it is
tantamount to simple PCA.

> KPCA is not a big
> > deal if kernels are in a module, and if they have a method taking 2
> > arguments.
>
> Right. Low-level kernel stuff in a reusable lib makes a lot of sense.
> If there were a lib of functions with common kernels and their
> derivatives and possibly even 2nd derivatives, that would cover a lot
> of territory. (I could use some 2nd derivatives right now...)

I would appreciate that as well, for a modified mean-shift algorithm.
I'll probably port the Isomap and LLE algorithms too; they are widely
used for manifold learning (they can be expressed as KPCA algorithms
with a particular kernel).

Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com  Fri Apr 13 03:49:17 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 13 Apr 2007 02:49:17 -0500
Subject: [SciPy-dev] SciPy improvements
In-Reply-To: <461F1E4E.1040705@ar.media.kyoto-u.ac.jp>
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com>
	<85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com>
	<30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca>
	<461EF0C4.6060506@ar.media.kyoto-u.ac.jp>
	<461F0E45.5000009@ar.media.kyoto-u.ac.jp>
	<461F1A3E.1000508@gmail.com>
	<461F1E4E.1040705@ar.media.kyoto-u.ac.jp>
Message-ID: <461F35FD.9040102@gmail.com>

David Cournapeau wrote:
> Since you're here, I have some questions concerning chaco for the
> visualization part of the project. Basically, I am unsure about whether
> I should use chaco or matplotlib. I do not know chaco very well yet,
> but it seems much better API- and performance-wise compared to matplotlib
> for interactive visualization. The problem is that it still does not
> have a lot of visibility in the community compared to matplotlib, and it
> is still pretty complicated to install. I do not care much about those
> points myself, but seeing how installation problems are one of the big
> difficulties for newcomers to numpy/scipy, I am a bit concerned. Is this
> impression founded, and if it is, is there a chance to see improvements
> on those fronts in the next few months ?

Just installing Chaco and the stuff it depends on can actually be
somewhat easier than matplotlib: we don't bother with external libjpeg,
libpng, and libfreetype libraries. The major issue would actually be
disabling the building of TVTK if you don't have VTK installed and you
only care about Chaco, but even that's a comment-out-one-line operation.

In the coming weeks, though, we will be playing around with a
reorganization of the repository along the lines of the scikits layout,
if you've been following that conversation. That would enable one to
just build the enthought subpackages that you need, or allow
easy_install to do so. Even if we don't end up reorganizing the trunk
that way, we'll probably have such a reorganized mirror of the trunk
using svn:externals to get the same effect for distribution purposes.
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From cimrman3 at ntc.zcu.cz Fri Apr 13 06:59:33 2007 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Fri, 13 Apr 2007 12:59:33 +0200 Subject: [SciPy-dev] SciPy improvements In-Reply-To: <461EC48B.50105@gmail.com> References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121631u19c1ef22w48ce1854c0897108@mail.gmail.com> <461EC48B.50105@gmail.com> Message-ID: <461F6295.8080303@ntc.zcu.cz> Robert Kern wrote: > Ondrej Certik wrote: >> 4) About the petsc - I know it's another dependence. However, I >> noticed you are using umfpack in SciPy. So why not petsc? I think it >> contains much more (sometimes better) solvers (depends on the >> problem). It's seems logical to me, to either use nothing, or the best >> library available, which I believe is petsc. > > Well, I wasn't as much of a dependency/no-optional-features freak when the > optional UMFPACK stuff went in. Also, IIRC the wrappers for UMFPACK were written > specifically for scipy; they didn't exist as a separate package beforehand. > petsc4py already exists. Unless if we decide that some other feature of scipy > needs it, there is no reason that I can see for bringing it into the scipy package. Yes, it was written specifically for scipy. Actually I was really 'forced' to write UMFPACK wrappers as at that time (long long ago) there was not a fast enough direct sparse solver in scipy. r. From cimrman3 at ntc.zcu.cz Fri Apr 13 07:13:59 2007 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Fri, 13 Apr 2007 13:13:59 +0200 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <461F2AB3.1080407@iam.uni-stuttgart.de> References: <461F2AB3.1080407@iam.uni-stuttgart.de> Message-ID: <461F65F7.6000909@ntc.zcu.cz> Nils Wagner wrote: > Please can you show me an example where petsc solvers are "better" than > UMFPACK. Petsc is really a superpackage providing many parallel linear solvers (iterative, direct, preconditioners, ...) together with nonlinear solvers, time steppers, etc. The solvers can be both petsc-native or external packages, nevertheless all are accessed via a uniform interface. IMHO UMFPACK is one of the optional external solvers petsc can use, so to answer your question, petsc can do anything that UMFPACK does and much more. r. From ondrej at certik.cz Fri Apr 13 07:29:01 2007 From: ondrej at certik.cz (Ondrej Certik) Date: Fri, 13 Apr 2007 13:29:01 +0200 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <461F65F7.6000909@ntc.zcu.cz> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F65F7.6000909@ntc.zcu.cz> Message-ID: <85b5c3130704130429i2ba0d7a7i166969fd4c548158@mail.gmail.com> On 4/13/07, Robert Cimrman wrote: > Nils Wagner wrote: > > Please can you show me an example where petsc solvers are "better" than > > UMFPACK. > > Petsc is really a superpackage providing many parallel linear solvers > (iterative, direct, preconditioners, ...) together with nonlinear > solvers, time steppers, etc. The solvers can be both petsc-native or > external packages, nevertheless all are accessed via a uniform > interface. IMHO UMFPACK is one of the optional external solvers petsc > can use, so to answer your question, petsc can do anything that UMFPACK > does and much more. Yes, it's exactly like this. 
Thus, there is a question of whether SciPy should support sparse
solvers (my answer is yes), and if so, it should support petsc;
otherwise I, for example, am not going to use it, as I want to try
several solvers depending on the problem.

What I am trying to say is that I don't want to write two versions of
my code - one for petsc and a second one for SciPy. And from the Zen of
Python:

There should be one-- and preferably only one --obvious way to do it.

Ondra

From nwagner at iam.uni-stuttgart.de  Fri Apr 13 07:32:53 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Fri, 13 Apr 2007 13:32:53 +0200
Subject: [SciPy-dev] petsc - UMFPACK and scipy
In-Reply-To: <461F65F7.6000909@ntc.zcu.cz>
References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F65F7.6000909@ntc.zcu.cz>
Message-ID: <461F6A65.4010605@iam.uni-stuttgart.de>

Robert Cimrman wrote:
> Nils Wagner wrote:
>
>> Please can you show me an example where petsc solvers are "better" than
>> UMFPACK.
>>
>
> Petsc is really a superpackage providing many parallel linear solvers
> (iterative, direct, preconditioners, ...) together with nonlinear
> solvers, time steppers, etc. The solvers can be both petsc-native or
> external packages, nevertheless all are accessed via a uniform
> interface. IMHO UMFPACK is one of the optional external solvers petsc
> can use, so to answer your question, petsc can do anything that UMFPACK
> does and much more.
>
> r.
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>
The current UMFPACK version is 5.0.3. Which versions are supported by
scipy ? As I mentioned in a previous email, I still have trouble
installing versions other than 4.4. Any pointer on how to install more
recent versions of UMFPACK would be appreciated.

When you are talking about superpackages, how about Sundials ?
http://www.llnl.gov/CASC/sundials/

AFAIK the ode solvers in scipy cannot handle events, and a DAE solver
is also missing in scipy.
http://www.wolfram.com/products/mathematica/newin51/eventhandling.html

Are there plans to add this functionality to scipy (in the form of
scikits) ?

Nils

From cimrman3 at ntc.zcu.cz  Fri Apr 13 08:24:04 2007
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Fri, 13 Apr 2007 14:24:04 +0200
Subject: [SciPy-dev] petsc - UMFPACK and scipy
In-Reply-To: <85b5c3130704130429i2ba0d7a7i166969fd4c548158@mail.gmail.com>
References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F65F7.6000909@ntc.zcu.cz>
	<85b5c3130704130429i2ba0d7a7i166969fd4c548158@mail.gmail.com>
Message-ID: <461F7664.7070001@ntc.zcu.cz>

Ondrej Certik wrote:
> On 4/13/07, Robert Cimrman wrote:
>> Nils Wagner wrote:
>>> Please can you show me an example where petsc solvers are "better" than
>>> UMFPACK.
>> Petsc is really a superpackage providing many parallel linear solvers
>> (iterative, direct, preconditioners, ...) together with nonlinear
>> solvers, time steppers, etc. The solvers can be both petsc-native or
>> external packages, nevertheless all are accessed via a uniform
>> interface. IMHO UMFPACK is one of the optional external solvers petsc
>> can use, so to answer your question, petsc can do anything that UMFPACK
>> does and much more.
>
> Yes, it's exactly like this. Thus, there is a question of whether SciPy
> should support sparse solvers (my answer is yes), and if so, it
> should support petsc; otherwise I, for example, am not going to use
> it, as I want to try several solvers depending on the problem.
My problems tend to be such that only direct solvers work :)

> What I am trying to say is that I don't want to write two versions of
> my code - one for petsc and a second one for SciPy. And from the Zen of
> Python:
>
> There should be one-- and preferably only one --obvious way to do it.

Well, you can very well use both petsc and scipy/numpy together. afaik
petsc4py depends on numpy, so you need that in any case, and scipy is a
set of very useful modules built on top of numpy (particularly its
multidimensional array data type), addressing different fields of
(scientific) computation, not just solving linear systems. It is true
that the sparse matrix support in scipy is not as mature as some users
need, but this can change :). So for now, you can use petsc (or <insert
your favourite sparse matrix package here>) for sparse stuff if you
like, and scipy for other things that are not in petsc.
There is no contradiction, imho.

Just my 2kc,

r.

From cimrman3 at ntc.zcu.cz  Fri Apr 13 08:36:41 2007
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Fri, 13 Apr 2007 14:36:41 +0200
Subject: [SciPy-dev] petsc - UMFPACK and scipy
In-Reply-To: <461F6A65.4010605@iam.uni-stuttgart.de>
References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F65F7.6000909@ntc.zcu.cz>
	<461F6A65.4010605@iam.uni-stuttgart.de>
Message-ID: <461F7959.4050105@ntc.zcu.cz>

Nils Wagner wrote:
> Robert Cimrman wrote:
>> Nils Wagner wrote:
>>
>>> Please can you show me an example where petsc solvers are
>>> "better" than UMFPACK.
>>>
>> Petsc is really a superpackage providing many parallel linear
>> solvers (iterative, direct, preconditioners, ...) together with
>> nonlinear solvers, time steppers, etc. The solvers can be both
>> petsc-native or external packages, nevertheless all are accessed
>> via a uniform interface. IMHO UMFPACK is one of the optional
>> external solvers petsc can use, so to answer your question, petsc
>> can do anything that UMFPACK does and much more.
>>
> The current UMFPACK version is 5.0.3. Which versions are supported by
> scipy ? As I mentioned in a previous email, I still have trouble
> installing versions other than 4.4. Any pointer on how to install more
> recent versions of UMFPACK would be appreciated.

I can use 5.0 without problems, so 5.0.3 should work too.

I have downloaded the whole UFsparse suite, edited UFconfig/UFconfig.mk,
cd'd into UMFPACK and typed 'make' (the normal UMFPACK installation
procedure). Then I edited my numpy/site.cfg to reflect the installation
paths. I do remember you had some problems with this step, but I do not
know why. Any other recent UMFPACK users out there?

r.

ps: I am leaving for one week, so I will not be able to answer UMFPACK
related questions :)

From nwagner at iam.uni-stuttgart.de  Fri Apr 13 08:36:48 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Fri, 13 Apr 2007 14:36:48 +0200
Subject: [SciPy-dev] petsc - UMFPACK and scipy
In-Reply-To: <461F7664.7070001@ntc.zcu.cz>
References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F65F7.6000909@ntc.zcu.cz>
	<85b5c3130704130429i2ba0d7a7i166969fd4c548158@mail.gmail.com>
	<461F7664.7070001@ntc.zcu.cz>
Message-ID: <461F7960.4090102@iam.uni-stuttgart.de>

Robert Cimrman wrote:
> Ondrej Certik wrote:
>
>> On 4/13/07, Robert Cimrman wrote:
>>
>>> Nils Wagner wrote:
>>>> Please can you show me an example where petsc solvers are "better" than
>>>> UMFPACK.
>>>>
>>> Petsc is really a superpackage providing many parallel linear solvers
>>> (iterative, direct, preconditioners, ...) together with nonlinear
>>> solvers, time steppers, etc.
The solvers can be both petsc-native or
>>> external packages, nevertheless all are accessed via a uniform
>>> interface. IMHO UMFPACK is one of the optional external solvers petsc
>>> can use, so to answer your question, petsc can do anything that UMFPACK
>>> does and much more.
>>>
>> Yes, it's exactly like this. Thus, there is a question of whether SciPy
>> should support sparse solvers (my answer is yes), and if so, it
>> should support petsc; otherwise I, for example, am not going to use
>> it, as I want to try several solvers depending on the problem.
>>
> My problems tend to be such that only direct solvers work :)
>
>> What I am trying to say is that I don't want to write two versions of
>> my code - one for petsc and a second one for SciPy. And from the Zen of
>> Python:
>>
>> There should be one-- and preferably only one --obvious way to do it.
>>
> Well, you can very well use both petsc and scipy/numpy together. afaik
> petsc4py depends on numpy, so you need that in any case, and scipy is a
> set of very useful modules built on top of numpy (particularly its
> multidimensional array data type), addressing different fields of
> (scientific) computation, not just solving linear systems. It is true
> that the sparse matrix support in scipy is not as mature as some users
> need, but this can change :). So for now, you can use petsc (or <insert
> your favourite sparse matrix package here>) for sparse stuff if you
> like, and scipy for other things that are not in petsc.
> There is no contradiction, imho.
>
> Just my 2kc,
>
> r.
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>
Unfortunately petsc4py has no tutorial.
http://www.cimec.org.ar/python/petsc4py.html#tutorial

I guess that many users prefer well-documented packages.

Nils

From ondrej at certik.cz  Fri Apr 13 08:44:46 2007
From: ondrej at certik.cz (Ondrej Certik)
Date: Fri, 13 Apr 2007 14:44:46 +0200
Subject: [SciPy-dev] petsc - UMFPACK and scipy
In-Reply-To: <461F7960.4090102@iam.uni-stuttgart.de>
References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F65F7.6000909@ntc.zcu.cz>
	<85b5c3130704130429i2ba0d7a7i166969fd4c548158@mail.gmail.com>
	<461F7664.7070001@ntc.zcu.cz> <461F7960.4090102@iam.uni-stuttgart.de>
Message-ID: <85b5c3130704130544g66a2c3ei27828ca6e174df0f@mail.gmail.com>

> Unfortunately petsc4py has no tutorial.
> http://www.cimec.org.ar/python/petsc4py.html#tutorial
>
> I guess that many users prefer well-documented packages.

Well, is there a tutorial for umfpack in scipy? I only found this:

http://www.scipy.org/doc/api_docs/scipy.sparse.html

But that's about the same amount of documentation as petsc4py has in
the form of docstrings.

Ondrej

From nwagner at iam.uni-stuttgart.de  Fri Apr 13 08:52:22 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Fri, 13 Apr 2007 14:52:22 +0200
Subject: [SciPy-dev] petsc - UMFPACK and scipy
In-Reply-To: <85b5c3130704130544g66a2c3ei27828ca6e174df0f@mail.gmail.com>
References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F65F7.6000909@ntc.zcu.cz>
	<85b5c3130704130429i2ba0d7a7i166969fd4c548158@mail.gmail.com>
	<461F7664.7070001@ntc.zcu.cz> <461F7960.4090102@iam.uni-stuttgart.de>
	<85b5c3130704130544g66a2c3ei27828ca6e174df0f@mail.gmail.com>
Message-ID: <461F7D06.4080908@iam.uni-stuttgart.de>

Ondrej Certik wrote:
>> Unfortunately petsc4py has no tutorial.
>> http://www.cimec.org.ar/python/petsc4py.html#tutorial
>>
>> I guess that many users prefer well-documented packages.
>>
> Well, is there a tutorial for umfpack in scipy? I only found this:
>
> http://www.scipy.org/doc/api_docs/scipy.sparse.html
>
> But that's about the same amount of documentation as petsc4py has in
> the form of docstrings.
>
> Ondrej
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>
Try

>>> from scipy import *
>>> help (linsolve)

Nils

From matthieu.brucher at gmail.com  Fri Apr 13 12:14:55 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Fri, 13 Apr 2007 18:14:55 +0200
Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user)
In-Reply-To: 
References: 
Message-ID: 

A new proposal...
I refactored the code for the line search part; it is now another
module. The damped optimizer of the last proposal is now a damped line
search; by default no line search is performed at all.

Matthieu

2007/4/13, Matthieu Brucher :
>
> A little update of my proposal :
>
> - each step can be updated after each iteration; it will be enhanced so
> that everything computed in the iteration is passed on, in case it is
> needed to update the step. That could be useful for approximated steps
> - added a simple Damped optimizer: it tries to take a step; if the cost is
> higher than before, half a step is tested, ...
> - a function object is created if the function argument is not passed
> (it takes the arg 'fun' as the cost function, 'gradient' for the gradient, ...).
> Some safeguards must still be implemented.
>
> I was thinking of the limits of this architecture :
> - definitely all quasi-Newton optimizers can be ported to this framework,
> as well as all semi-quadratic ones
> - constrained optimization will not, unless it is modified so that it can;
> but as I do not use such optimizers in my PhD thesis, I do not know them
> well enough
>
> But even the simplex/polytope optimizer (fmin) can be expressed in the
> framework - it is useless though, as it would be slower - and can take
> advantage of the different stopping criteria. BTW, I used some parts of
> this framework in an EM algorithm with an AIC-based optimizer on top.
>
> As I said in another thread, I'm in favour of fine-grained modules, even
> if some wrapper can provide simple optimization procedures.
>
> Matthieu
>
> 2007/3/26, Matthieu Brucher < matthieu.brucher at gmail.com>:
> > > OK, I see why you want that approach.
> > > (So that you can still pass a single object around in your
> > > optimizer module.)  Yes, that seems right...
> >
> > Exactly :)
> >
> > > This seems to bundle naturally with a specific optimizer?
> >
> > I'm not an expert in optimization, but I attended several classes/seminars
> > on the subject, and at least the usual simple optimizers - the standard
> > optimizer, all damped approaches, and all the others that use a step and a
> > criterion test - use this interface, and with a lot of different steps that
> > are usual - gradient, every conjugated gradient solution, (quasi-)Newton -
> > or criteria.
> > I even suppose it can do very well in semi-quadratic optimization, with
> > very little change, but I have to finish some work before I can read some
> > books on the subject to begin implementing it in Python.
> >
> > > If so, the class definition should reside in the StandardOptimizer
> > > module.
> > > > > > Cheers, > > > Alan Isaac > > > > > > PS For readability, I think Optimizer should define > > > a "virtual" iterate method. E.g., > > > def iterate(self): > > > return NotImplemented > > > > > > Yes, it seems better. > > > > Thanks for the opinion ! > > > > Matthieu > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: optimizerProposal_03.zip Type: application/zip Size: 7587 bytes Desc: not available URL: From robert.kern at gmail.com Fri Apr 13 14:14:00 2007 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 13 Apr 2007 13:14:00 -0500 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <461F6A65.4010605@iam.uni-stuttgart.de> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F65F7.6000909@ntc.zcu.cz> <461F6A65.4010605@iam.uni-stuttgart.de> Message-ID: <461FC868.9090500@gmail.com> Nils Wagner wrote: > When you are talking about superpackages how about Sundials ? You keep asking this question over and over again, and you get the same answer every time: it will be wrapped when the people who want it wrapped put the effort into wrapping it. If you want to see wrappers for SUNDIALS, stop asking the question and start writing code. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From wnbell at gmail.com Fri Apr 13 14:42:40 2007 From: wnbell at gmail.com (Nathan Bell) Date: Fri, 13 Apr 2007 13:42:40 -0500 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <461F7959.4050105@ntc.zcu.cz> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F65F7.6000909@ntc.zcu.cz> <461F6A65.4010605@iam.uni-stuttgart.de> <461F7959.4050105@ntc.zcu.cz> Message-ID: On 4/13/07, Robert Cimrman wrote: > I can use 5.0 without problems, so 5.0.3 should work too. > > I have downloaded the whole UFsparse suite, edited UFconfig/UFconfig.mk, > cd into UMFPACK and typed 'make' (normal UMFPACK installation > procedure). Then I edited my numpy/site.cfg to reflect the installation > paths. I do remember you had some problems with this step, but I do not > know why. > > Any other recent UMFPACK users out there? I'm using SuiteSparse version 2.1.1 dated 09/11/2006 which includes UMFPACK 5.0.1. IIRC I'm also using GotoBLAS for UMFPACK. I installed SuiteSparse to /opt/SuiteSparse and copied all the UMFPACK, CHOLMOD, AMD etc. header files to /usr/include. I remember having problems when trying to point SciPy at the header files in /opt/SuiteSparse. 
My site.cfg has the following entries: [amd] library_dirs = /opt/SuiteSparse/AMD/Lib include_dirs = /opt/SuiteSparse/AMD/Include amd_libs = amd [umfpack] library_dirs = /opt/SuiteSparse/UMFPACK/Lib include_dirs = /opt/SuiteSparse/UMFPACK/Include umfpack_libs = umfpack When running SciPy's setup.py I get the following output: umfpack_info: amd_info: FOUND: libraries = ['amd'] library_dirs = ['/opt/SuiteSparse/AMD/Lib'] swig_opts = ['-I/opt/SuiteSparse/AMD/Include'] define_macros = [('SCIPY_AMD_H', None)] include_dirs = ['/opt/SuiteSparse/AMD/Include'] FOUND: libraries = ['umfpack', 'amd'] library_dirs = ['/opt/SuiteSparse/UMFPACK/Lib', '/opt/SuiteSparse/AMD/Lib'] swig_opts = ['-I/opt/SuiteSparse/UMFPACK/Include', '-I/opt/SuiteSparse/AMD/Include'] define_macros = [('SCIPY_UMFPACK_H', None), ('SCIPY_AMD_H', None)] include_dirs = ['/opt/SuiteSparse/UMFPACK/Include', '/opt/SuiteSparse/AMD/Include'] -- Nathan Bell wnbell at gmail.com From ondrej at certik.cz Fri Apr 13 19:57:53 2007 From: ondrej at certik.cz (Ondrej Certik) Date: Sat, 14 Apr 2007 01:57:53 +0200 Subject: [SciPy-dev] SciPy improvements In-Reply-To: <461EC48B.50105@gmail.com> References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121631u19c1ef22w48ce1854c0897108@mail.gmail.com> <461EC48B.50105@gmail.com> Message-ID: <85b5c3130704131657l101e8927wb7e3235acc4c73ab@mail.gmail.com> > > So, I'll copy the page: > > > > http://www.scipy.org/Documentation > > > > into some new one, and redesign it as I would like it to be, and then > > you'll tell me what you think about it. The same with other pages if > > I'll get a better idea about them. This way I shouldn't spoil anything > > in case you wouldn't like it. Because I don't have just couple of > > small fixes. > > As you like. Thank you! You can check my draft of a new documentation: http://www.scipy.org/DocumentationNew and the original one: http://www.scipy.org/Documentation I removed broken links, merged very simple pages with the tutorial and removed duplicates. Tell me, if you like it and if I should continue - I would merge SciPy Tutorial with Tutorial II, then I would merge all porting wikis into one central with links (or just add links to one of them) and possibly some more simplifying - it's still too much complicated with too much (confusing) links. Ondrej From mforbes at physics.ubc.ca Sat Apr 14 11:37:17 2007 From: mforbes at physics.ubc.ca (Michael McNeil Forbes) Date: Sat, 14 Apr 2007 08:37:17 -0700 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: References: Message-ID: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> I started a discussion page on the Trac for design ideas etc. about modular optimization. Right now I am just adding questions I have about things. As it becomes more coherent, we can bring these questions/ideas to the list for comments. http://projects.scipy.org/scipy/scipy/wiki/DevelopmentIdeas/ ModularOptimization Michael. P.S. What is the best way to share code ideas on the wiki? Small bits work well inline, but for larger chunks it would be nice to be able to attach files. Unfortunately none of the wiki's deals well with attached files (no versioning and sometimes no way of modifying the file). On 13 Apr 2007, at 9:14 AM, Matthieu Brucher wrote: > A new proposal... > I refactored the code for the line search part, it is now a another > module. The damped optimizer of the last proposal is now a damped > line search, by default no line search is performed at all. 
> > Matthieu

From matthieu.brucher at gmail.com  Sat Apr 14 11:59:37 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Sat, 14 Apr 2007 17:59:37 +0200
Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user)
In-Reply-To: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca>
References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca>
Message-ID: 

Good point ;)
I could make those changes safe for the f or func part... Using an
object to optimize is, for me, better than a collection of functions,
although a collection of functions can be made into an object if
needed.

For the interface, I suppose that assembling an optimizer is not
something everybody will want to do; that's why some optimizers are
proposed out of the box in MatLab toolboxes, for instance. But allowing
an optimizer to be customized rapidly can be a real advantage over all
other optimization packages.

One of the members of the lab I'm studying in said to me that he didn't
see why such modularization was pertinent. He used for his application
(warping an image) a Levenberg-Marquardt optimizer with constraints,
and the line search was performed with interval analysis. Until some
days ago, I thought that he was right - that only some optimizers can
be expressed in "my" framework. Now, I think that even his optimization
could be expressed, and if he wanted to modify something in the
optimizer, it would be much simpler with this architecture, in Python,
than what he has now, in C. He made some stuff very specific to his
function, as a lot of people would want to do but couldn't with a fixed
interface like MatLab's; yet in fact a lot could be expressed in terms
of a specific step, a specific line search, a specific criterion and a
specific function/set of parameters.

Until some time ago, I thought that modules with criteria, steps and
optimizers would be enough; now I think I missed the fact that a lot of
optimizers share the line search, and that it should be another module.

I'm writing some other test functions (shamelessly taken from
_Engineering Optimization_ by Rao) with other line searches and steps;
I'll keep you posted.

Matthieu

2007/4/14, Michael McNeil Forbes :
>
> I started a discussion page on the Trac for design ideas etc. about
> modular optimization. Right now I am just adding questions I have
> about things. As it becomes more coherent, we can bring these
> questions/ideas to the list for comments.
>
> http://projects.scipy.org/scipy/scipy/wiki/DevelopmentIdeas/
> ModularOptimization
>
> Michael.
>
> P.S. What is the best way to share code ideas on the wiki? Small
> bits work well inline, but for larger chunks it would be nice to be
> able to attach files. Unfortunately none of the wiki's deals well
> with attached files (no versioning and sometimes no way of modifying
> the file).
>
> On 13 Apr 2007, at 9:14 AM, Matthieu Brucher wrote:
>
> > A new proposal...
> > I refactored the code for the line search part; it is now another
> > module. The damped optimizer of the last proposal is now a damped
> > line search; by default no line search is performed at all.
> >
> > Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From guyer at nist.gov Sat Apr 14 12:06:43 2007 From: guyer at nist.gov (Jonathan Guyer) Date: Sat, 14 Apr 2007 12:06:43 -0400 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> Message-ID: On Apr 14, 2007, at 11:37 AM, Michael McNeil Forbes wrote: > P.S. What is the best way to share code ideas on the wiki? Small > bits work well inline, but for larger chunks it would be nice to be > able to attach files. Unfortunately none of the wiki's deals well > with attached files (no versioning and sometimes no way of modifying > the file). In Trac, you can check things into the associated repository and then use source: and diff: links. From mforbes at physics.ubc.ca Sat Apr 14 15:45:23 2007 From: mforbes at physics.ubc.ca (Michael McNeil Forbes) Date: Sat, 14 Apr 2007 12:45:23 -0700 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> Message-ID: On 14 Apr 2007, at 8:59 AM, Matthieu Brucher wrote: > Good point ;) > I could make those changes safe for the f or func part... Using an > object to optimize is for me better than a collection of functions > although a collection of functions can be made into an object if > needed. > > > For the interface, I suppose that assembling a optimizer is not > something everybody will want to do, that's why some optimizers are > proposed out of the box in MatLab toolboxes for instance, but > allowing to customize rapidly an optimizer can be a real advantage > over all other optimization packages. And one can easily make convenience functions which take standard arguments and package them internally. I think that the interface should be flexible enough to allow users to just call the optimizers with a few standard arguments like they are used to, but allow users to "build" more complicated/more customized optimizers as they need. Also, it would be nice if an optimizer could be "tuned" to a particular problem (i.e. have a piece of code that tries several algorithms and parameter values to see which is fastest.) > One of the members of the lab I studying in said to me that he did > see if such modularization was pertinent. He used for its > application (warping an image) a Levenberg-Marquardt optimizer with > constraints and the line-search was performed with interval > analysis. Until some days ago, I thought that he was right, that > only some of optimizers can be expressed in "my" framework. Now, I > think that even his optimization could be expressed, and if he > wanted to modify something in the optimizer, it would be much > simpler with this architecture, in Python, that what he has now, in > C. He made some stuff very specific for his function, as a lot of > people would want to do, but couldn't with a fixed interface ike > MatLab's, but in fact a lot could be expressed in terms of a > specific step, a specific line search, a specific criterion and a > specific function/set of parameters. > > Until some time ago, I thought that modules with criteria, steps > and optimizers would be enough, now I think I missed the fact that > a lot of optimizers share the line search, and that it should be > onother module. My immediate goal is to try and get the interface and module structure well defined so that I know where to put the pieces of my Broyden code when I rip it apart. 
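To make that concrete, here is a minimal sketch of how the assembled
pieces might fit together, assuming numpy (all class names here are
made up to illustrate the idea, not taken from the posted proposal):

import numpy as np

class GradientStep(object):
    # Step module: steepest-descent direction from the function's gradient.
    def __call__(self, function, x):
        return -function.gradient(x)

class BacktrackingLineSearch(object):
    # Line-search module: halve the step until the cost decreases.
    def __init__(self, shrink=0.5, max_tries=30):
        self.shrink, self.max_tries = shrink, max_tries
    def __call__(self, function, x, direction):
        t, f0 = 1.0, function(x)
        for _ in range(self.max_tries):
            if function(x + t * direction) < f0:
                return x + t * direction
            t *= self.shrink
        return x

class ValueCriterion(object):
    # Criterion module: stop when the cost stops decreasing.
    def __init__(self, ftol=1e-8):
        self.ftol = ftol
    def __call__(self, old_value, new_value):
        return abs(old_value - new_value) <= self.ftol

class StandardOptimizer(object):
    # The glue: iterate step -> line search -> criterion.
    def __init__(self, function, step, line_search, criterion, maxiter=1000):
        self.function, self.step = function, step
        self.line_search, self.criterion = line_search, criterion
        self.maxiter = maxiter
    def optimize(self, x0):
        x = np.asarray(x0, dtype=float)
        for _ in range(self.maxiter):
            d = self.step(self.function, x)
            x_new = self.line_search(self.function, x, d)
            if self.criterion(self.function(x), self.function(x_new)):
                return x_new
            x = x_new
        return x

A convenience function could then build a default combination of these
modules for users who just want to call a single routine, while anyone
else swaps in their own step, line search or criterion.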
One question about coupling: Useful criteria for globally convergent
algorithms include testing the gradients and/or curvature of the
function. In the Broyden algorithm, for example, these would be
maintained by the "step" object, but the criterion object would need
to access these. Likewise, if a "function" object can compute its
own derivatives, then the "criterion" object should access it from
there. Any ideas on how to deal with these couplings? Perhaps the
"function" object should maintain all the state (approximate
Jacobians etc.).

Michael.

From matthieu.brucher at gmail.com  Sat Apr 14 16:51:15 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Sat, 14 Apr 2007 22:51:15 +0200
Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user)
In-Reply-To: 
References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca>
Message-ID: 

> And one can easily make convenience functions which take standard
> arguments and package them internally. I think that the interface
> should be flexible enough to allow users to just call the optimizers
> with a few standard arguments like they are used to, but allow users
> to "build" more complicated/more customized optimizers as they need.
> Also, it would be nice if an optimizer could be "tuned" to a
> particular problem (i.e. have a piece of code that tries several
> algorithms and parameter values to see which is fastest.)

Exactly.

> My immediate goal is to try and get the interface and module
> structure well defined so that I know where to put the pieces of my
> Broyden code when I rip it apart.

I can help you with it :)

> One question about coupling: Useful criteria for globally convergent
> algorithms include testing the gradients and/or curvature of the
> function.

Simple: the criterion takes, for the moment, the current iteration
number, the former value, the current value, and the same for the
parameters. It can be modified to add the gradient if needed - I think
the step would be a better choice ? -

> In the Broyden algorithm, for example, these would be
> maintained by the "step" object,
> but the criterion object would need
> to access these.

Access what exactly ?

> Likewise, if a "function" object can compute its
> own derivatives, then the "criterion" object should access it from
> there.

I don't think that the criterion needs to access this, because it
would mean it knows more than it should, from an object-oriented point
of view, but this can be discussed :)

> Any ideas on how to deal with these couplings? Perhaps the
> "function" object should maintain all the state (approximate
> Jacobians etc.).

I don't think so; the function provides methods to compute the
gradient, hessian, ... but only the step object knows what to do with
them: approximate a hessian, what was already approximated, ... A step
object is associated with one optimizer; a function object can be
optimized several times. If it has a state, it couldn't be used with
several optimizers without reinitializing it, and it is not intuitive
enough.

I've been thinking about a correct architecture for several months
now, and this is what I think is a good one :
- a function to optimize that provides some methods to compute the
cost, the gradient, the hessian, ... only basic stuff
- an object that is responsible for the optimization, the glue between
all modules -> the optimizer
- an object that tells if the optimization has converged.
It needs the current iteration number, several last values, parameters, perhaps other things, but these things should be discussed - an object that computes a new step, takes a function to optimize, can have a state - to compute approximate hessian or inverse hessian - a line search that can find a new candidate - section method, damped method, no method at all, with a state (Marquardt), ... With these five objects, I _think_ every unconstrained method can be expressed. For the constraints, I suppose the step and the line search should be adapted, but no other module needs to be touched. I implemented the golden and fibonacci section, it's pretty straightforward to add other line searches or steps, I'll try to add some before I submit it on TRAC. Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From mforbes at physics.ubc.ca Mon Apr 16 11:39:08 2007 From: mforbes at physics.ubc.ca (Michael McNeil Forbes) Date: Mon, 16 Apr 2007 08:39:08 -0700 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> Message-ID: <47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca> On 14 Apr 2007, at 1:51 PM, Matthieu Brucher wrote: > > One question about coupling: Useful criteria for globally convergent > algorithms include testing the gradients and/or curvature of the > function. > > Simple, the criterion takes, for the moment, the current iteration > number, the former value, the current value, same for the > parameters. It can be modified to add the gradient if needed - I > think the step would be a better choice ? > > In the Broyden algorithm, for example, these would be > maintained by the "step" object, > > but the criteria object would need > to access these. > > Access what exactly ? "these == gradient/hessian information" The criterion needs access to this information, but the question is: who serves it? If the "function" can compute these, then it should naturally serve this information. With the Broyden method, you suggest that the "step" would serve this information. Thus, there are two objects (depending on the choice of method) that maintain and provide gradient information. After thinking about this some more, I am beginning to like the idea that only the "function" object be responsible for the Jacobian. If the function can compute the Jacobian directly: great, use a newton- like method. If it can't, then do its best to approximate it (i.e. the "Broyden" part of the algorithm would be encoded in the function object rather than the step object." The "function" object alone then serves up information about the value of the function at a given point, as well as the gradient and hessian at that point (either exact or approximate) to the criterion, step, and any other objects that need it. > Likewise, if a "function" object can compute its > own derivatives, then the "ceriterion" object should access it from > there. > > > I don't think that the criterion need to access this, because it > would mean it knows more than it should, from an object-oriented > point of view, but this can be discussed :) Certain termination criteria need access to the derivatives to make sure that they terminate. It would query the function object for this information. Other criteria may need to query the "step" object to find out the size of the previous steps. 
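A criterion that consumes such served-up gradient information might
look like this minimal sketch, assuming numpy (the 'state' dictionary
and the names are hypothetical, not from the posted code):

import numpy as np

class GradientCriterion(object):
    # Stop when the gradient norm falls below a tolerance.  It does not
    # care whether 'gradient' came from the function (exact) or from a
    # Broyden-style approximation: it only reads the served value.
    def __init__(self, gtol=1e-6):
        self.gtol = gtol

    def __call__(self, state):
        # 'state' is assumed to be filled in each iteration by the
        # optimizer with whatever the other modules produced.
        return np.linalg.norm(state['gradient']) < self.gtol

Whoever fills in state['gradient'] - the function object or the step
object - is exactly the coupling question being discussed here.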
The "criterion" should not maintain any of these internally, just rely on the values served by the other objects: this does not break the encapsulation, it just couples the objects more tightly, but sophisticated criteria need this coupling. > Any ideas on how to deal with these couplings? Perhaps the > "function" object should maintain all the state (approximate > jacobians etc.). > > I don't think so, the function provides methods to compute > gradient, hessian, ... but only the step object knows what to do > with it : approximate a hessian, what was already approximated, ... > A step object is associated with one optimizer, a function object > can be optimized several times. If it has a state, it couldn't be > used with several optimizers without reinitializing it, and it is > not intuitive enough. The "function" object maintains all the information known about the function: how to compute the function, how to compute/approximate derivatives etc. If the user does not supply code for directly computing derivatives, but wants to use an optimization method that makes use of gradient information, then the function object should do its best to provide approximate information. The essence behind the Broyden methods is to approximate the Jacobian information in a clever and cheap way. I really think the natural place for this is in the "function" object, not the "step". > I've thinking about a correct architecture for several months now, > and that is what I think is a good one : > - a function to optimize that provides some method to compute the > cost, the gradient, hessian, ... only basic stuff > - an object that is responsible for the optimization, the glue > between all modules -> optimizer > - an object that tells if the optimization has converged. It needs > the current iteration number, several last values, parameters, > perhaps other things, but these things should be discussed > - an object that computes a new step, takes a function to optimize, > can have a state - to compute approximate hessian or inverse hessian > - a line search that can find a new candidate - section method, > damped method, no method at all, with a state (Marquardt), ... > > With these five objects, I _think_ every unconstrained method can > be expressed. For the constraints, I suppose the step and the line > search should be adapted, but no other module needs to be touched. Please describe how you think the Broyden root-finding method would fit within this scheme. Which object would maintain the state of the approximate Jacobian? Michael. From matthieu.brucher at gmail.com Mon Apr 16 12:07:44 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 16 Apr 2007 18:07:44 +0200 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: <47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca> References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> <47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca> Message-ID: I suspected it would become more problematic to decouple everything, but not that soon :) "these == gradient/hessian information" > > The criterion needs access to this information, but the question is: > who serves it? If the "function" can compute these, then it should > naturally serve this information. With the Broyden method, you > suggest that the "step" would serve this information. Thus, there > are two objects (depending on the choice of method) that maintain and > provide gradient information. 
I'll look in the literature for the Broyden method; if I see the whole
algorithm, I think I'll be able to answer your questions ;)

> After thinking about this some more, I am beginning to like the idea
> that only the "function" object be responsible for the Jacobian. If
> the function can compute the Jacobian directly: great, use a
> newton-like method. If it can't, then do its best to approximate it
> (i.e. the "Broyden" part of the algorithm would be encoded in the
> function object rather than the step object.)

I think that if the function knows, on its own, how to compute the
Jacobian, the hessian, ... it should provide them. When it does not,
it shouldn't be the man sitting at the computer who modifies his
function to add a Broyden algorithm to the function object. He should
only say to the optimizer that the function does not compute the
Jacobian, by using another module. What module ? That is a question
for later. The goal of this is to have a clean architecture, and
adding a way to compute something directly in the function - something
that is not dependent on the function, but on the step - is not a good
thing.

> The "function" object alone then serves up information about the
> value of the function at a given point, as well as the gradient and
> hessian at that point (either exact or approximate) to the criterion,
> step, and any other objects that need it.

I'm OK with it as long as it is not an approximation algorithm that is
based on the gradient, ... to compute for instance the hessian. Such
an approximation algorithm is generic, and as such it should be put in
another module or in a function superclass.

> I don't think that the criterion needs to access this, because it
> would mean it knows more than it should, from an object-oriented
> point of view, but this can be discussed :)

> Certain termination criteria need access to the derivatives to make
> sure that they terminate. It would query the function object for
> this information. Other criteria may need to query the "step" object
> to find out the size of the previous steps.

The step is not the right one for this; it's the line search object's
goal to find the correct step size, and such intel is given back to
the optimizer core, because there, everything is saved - everything
should be saved with a call to recordHistory -. What could be done is
that every object - step or line search - returns, along with the
result - the result being the step, the new candidate, ... - a
dictionary with such values. In that case, the criterion can choose
what it needs directly inside it.

> The "criterion" should
> not maintain any of these internally, just rely on the values served
> by the other objects: this does not break the encapsulation, it just
> couples the objects more tightly, but sophisticated criteria need
> this coupling.

For the moment, the state was not in the criterion; one cannot know
how many times it could be called inside an optimizer. This state is
maintained by the optimizer itself - it contains the last 2 values,
the last 2 sets of parameters -, but I suppose that if we have the new
candidate, the step and its size, those can be removed, and so the
dictionary chooses what it needs.

> I don't think so, the function provides methods to compute
> gradient, hessian, ... but only the step object knows what to do
> with it : approximate a hessian, what was already approximated, ...
> A step object is associated with one optimizer, a function object
> can be optimized several times.
> If it has a state, it couldn't be
> used with several optimizers without reinitializing it, and it is
> not intuitive enough.

> The "function" object maintains all the information known about the
> function: how to compute the function, how to compute/approximate
> derivatives etc. If the user does not supply code for directly
> computing derivatives, but wants to use an optimization method that
> makes use of gradient information, then the function object should do
> its best to provide approximate information. The essence behind the
> Broyden methods is to approximate the Jacobian information in a
> clever and cheap way.

That would mean that it can have a state; I really do not support this
approach. The Broyden method _is_ a way to get a step from a function
that does not provide some intel - the Jacobian, for instance -, so it
is not a function thing, it is a step mode.

> I really think the natural place for this is in the "function"
> object, not the "step".

> > With these five objects, I _think_ every unconstrained method can
> > be expressed. For the constraints, I suppose the step and the line
> > search should be adapted, but no other module needs to be touched.

> Please describe how you think the Broyden root-finding method would
> fit within this scheme. Which object would maintain the state of the
> approximate Jacobian?

No problem, I'll check in my two optimization books this evening,
provided I have enough time - I'm a little late on some important work
projects :| -

Thanks for your patience and your will to create generic optimizers :)

Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ondrej at certik.cz  Mon Apr 16 12:11:34 2007
From: ondrej at certik.cz (Ondrej Certik)
Date: Mon, 16 Apr 2007 18:11:34 +0200
Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user)
In-Reply-To: 
References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca>
	<47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca>
Message-ID: <85b5c3130704160911x3177b24aq3a5de7da3828581a@mail.gmail.com>

> I'll look in the literature for the Broyden method; if I see the whole
> algorithm, I think I'll be able to answer your questions ;)

As I understand the Broyden update, the whole trick is that you don't
need the precise Jacobian and it is subsequently approximated at every
iteration. So all that is needed (and in my problems actually all that
is available) is the function value.

Ondrej

From matthieu.brucher at gmail.com  Mon Apr 16 12:20:50 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Mon, 16 Apr 2007 18:20:50 +0200
Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user)
In-Reply-To: <85b5c3130704160911x3177b24aq3a5de7da3828581a@mail.gmail.com>
References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca>
	<47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca>
	<85b5c3130704160911x3177b24aq3a5de7da3828581a@mail.gmail.com>
Message-ID: 

2007/4/16, Ondrej Certik :
> > I'll look in the literature for the Broyden method; if I see the whole
> > algorithm, I think I'll be able to answer your questions ;)
>
> As I understand the Broyden update, the whole trick is that you don't
> need the precise Jacobian and it is subsequently approximated at every
> iteration. So all that is needed (and in my problems actually all that
> is available) is the function value.
> > Ondrej

That's what I understood from the discussion - well, every quasi-Newton
algorithm does this to some extent -, but I'll check so I can propose a
full example.

Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mforbes at physics.ubc.ca  Mon Apr 16 14:47:59 2007
From: mforbes at physics.ubc.ca (Michael McNeil Forbes)
Date: Mon, 16 Apr 2007 11:47:59 -0700
Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user)
In-Reply-To: 
References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca>
	<47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca>
Message-ID: 

On 16 Apr 2007, at 9:07 AM, Matthieu Brucher wrote:
> I'll look in the literature for the Broyden method; if I see the
> whole algorithm, I think I'll be able to answer your questions ;)

Basically, an approximated Jacobian is used to determine the step
direction and/or size (depending on the "step" module etc.)

The key to the Broyden approach is that the information about F(x+dx)
is used to update the approximate Jacobian (think multidimensional
secant method) in a clever way without any additional function
evaluations (there is not a unique way to do this, and some choices
work better than others).

Thus, think of Broyden methods as quasi-Newton methods but with a
cheap and very approximate Jacobian (hence one usually uses a robust
line search method to make sure that one is always descending).
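In code, the update just described boils down to a rank-one correction
of the approximate Jacobian. A minimal sketch, assuming numpy (the
helper names are made up, and this is only the "good" Broyden variant;
other variants update differently):

import numpy as np

def broyden_update(J, dx, df):
    # Broyden's "good" rank-one update: enforce the secant condition
    # J_new * dx == df while changing J as little as possible.
    # dx = x_new - x_old, df = F(x_new) - F(x_old); no extra F calls.
    return J + np.outer(df - np.dot(J, dx), dx) / np.dot(dx, dx)

def broyden_root(F, x0, tol=1e-10, maxiter=100):
    # Minimal quasi-Newton root finder built on the update above.
    x = np.asarray(x0, dtype=float)
    J = np.eye(len(x))                 # crude initial Jacobian guess
    f = F(x)
    for _ in range(maxiter):
        dx = np.linalg.solve(J, -f)    # quasi-Newton step
        x = x + dx
        f_old, f = f, F(x)
        if np.linalg.norm(f) < tol:
            break
        J = broyden_update(J, dx, f - f_old)
    return x

A robust version would wrap the step in exactly the kind of line
search mentioned above.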
There would be a set of modules behind the interface provided by Function() that implement these various techniques for computing and/or estimating the derivatives, including the Broyden method. The user sitting at the computer does nothing other than select from a set of options (opts) what methods he wants the library to use. Note, the user could pass explicit things to Function() too, like a custom function that computes numerical derivatives. > The "function" object alone then serves up information about the > value of the function at a given point, as well as the gradient and > hessian at that point (either exact or approximate) to the criterion, > step, and any other objects that need it. > > I'm OK with it as long as it is not an approximation algorithm that > is based on the gradient, ..., used to compute for instance the hessian. Such > an approximation algorithm is generic, and as such it should be put > in another module or in a function superclass. A Function "superclass" is what I had in mind. > ... > Certain termination criteria need access to the derivatives to make > sure that they terminate. It would query the function object for > this information. Other criteria may need to query the "step" object > to find out the size of the previous steps. > > The step is not the right place; it's the line search object's goal to > find the correct step size, and such information is given back to the > optimizer core, because there, everything is saved - everything > should be saved with a call to recordHistory -. What could be done > is that every object - step or line search - returns along with the > result - the result being the step, the new candidate, ... - a > dictionary with such values. In that case, the criterion can > choose what it needs directly inside it. Yes, it seems that the optimizer should maintain information about the history. The question I have is about the flow of information: I imagine that the criterion object should be able to query the optimization object for the information that it needs. We should define an interface of things that the optimizer can serve up to the various components. This interface can be extended as required to support more sophisticated algorithms. > The "criterion" should > not maintain any of these internally, just rely on the values served > by the other objects: this does not break the encapsulation, it just > couples the objects more tightly, but sophisticated criteria need > this coupling. > For the moment, the state was not in the criterion; one cannot know > how many times it could be called inside an optimizer. This state is > maintained by the optimizer itself - it contains the last 2 values, > the last 2 sets of parameters -, but I suppose that if we have the > new candidate, the step and its size, those can be removed, and so > the dictionary chooses what it needs. > > > > I don't think so, the function provides methods to compute > > gradient, hessian, ... but only the step object knows what to do > > with them : approximate a hessian, what was already approximated, ... > > A step object is associated with one optimizer, a function object > > can be optimized several times. If it has a state, it couldn't be > > used with several optimizers without reinitializing it, and it is > > not intuitive enough. > > The "function" object maintains all the information known about the > function: how to compute the function, how to compute/approximate > derivatives etc.
If the user does not supply code for directly > computing derivatives, but wants to use an optimization method that > makes use of gradient information, then the function object should do > its best to provide approximate information. The essence behind the > Broyden methods is to approximate the Jacobian information in a > clever and cheap way. > > That would mean that it can have a state; I really do not support > this approach. The Broyden method _is_ a way to get a step from a function > that does not provide some information - the Jacobian, for instance -, so it is > not a function thing, it is a step mode. I disagree. I think of the Broyden algorithm as a way of maintaining the Jacobian. The way to get the step is independent of this, though it may use the Jacobian information to help it. The Broyden part of the algorithm is solely to approximate the Jacobian cheaply. > ... > Please describe how you think the Broyden root-finding method would > fit within this scheme. Which object would maintain the state of the > approximate Jacobian? > > No problem, I'll check my two optimization books this evening > provided I have enough time - I'm a little late on some important > work projects :| - Maybe I will see things differently when you do this, but I am pretty convinced right now that the Function() object is the best place for the Broyden part of the algorithm. Michael. P.S. No hurry. I might also disappear from time to time when busy;-)
From matthieu.brucher at gmail.com Mon Apr 16 15:27:23 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 16 Apr 2007 21:27:23 +0200 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> <47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca> Message-ID: > > Basically, an approximate Jacobian is used to determine the step > direction and/or size (depending on the "step" module etc.) > > The key to the Broyden approach is that the information about F(x+dx) > is used to update the approximate Jacobian (think multidimensional > secant method) in a clever way without any additional function > evaluations (there is not a unique way to do this and some choices > work better than others). > > Thus, think of Broyden methods as Quasi-Newton methods but with a > cheap and very approximate Jacobian (hence, one usually uses a robust > line search method to make sure that one is always descending). I read some docs; it seems Broyden is a class of different steps, am I wrong? And it tries to approximate the Hessian of the function. My view is that the person sitting at the computer does one of the > following things: > > >>> F1 = Function(f) > >>> F2 = Function(f,opts) > >>> F3 = Function(f,df,ddf,opts) > etc. OK, that is not what a function is :) A function is the set of f, df, ddf but not with the options. What you are exposing is the construction of an optimizer ;) Did you see the code in my different proposals ? In fact, you have a function class - for instance a Rosenbrock class - that defines several methods, like gradient, hessian, ... without a real state - a real state being something other than, for instance, the number of dimensions for the Rosenbrock function, the points that need to be approximated, ... a real state is something that is dependent on the subsequent calls to the functor, the gradient, ... - so that this function can be reused efficiently. Then you use an instance of this class to be optimized.
You choose your step mode with its parameters like gradient, conjugate gradient, Newton, Quasi-Newton, ... You choose your line search with its own parameters - tolerance, ... - like section methods, interpolation methods, ... Finally, you choose your stopping criterion. Then you make something like :

optimizer = StandardOptimizer(function = myFunction, step = myStep, .......)
optimizer.optimize()

That is a modular design, and that is why some basic functions must be provided so that people who don't care about the underlying design really do not have to care. Then if someone wants a specific, non-standard optimizer, one just has to select the wanted modules - for instance, a conjugate gradient with a golden-section line search and a relative value criterion instead of a Fibonacci search and an absolute criterion -. It can be more cumbersome at the start, but once some modules are made, assembling them will be easier, and tests will be more fun :) In this first case, the object F1 can compute f(x), and will use > finite differences or some more complicated method to compute > derivatives df(x) and ddf(x) if required by the optimization > algorithm. In F2, the user provides options that specify how to do these > computations (for example, step size h; should > a centred difference be used? Perhaps the function is cheap and a > Richardson extrapolation should be used for higher accuracy. If f is > analytic and supports complex arguments, then the difference step > should be h=eps*1j. Maybe f has been implemented using an automatic > differentiation library etc. Just throwing out ideas here...) Yes, that's exactly an optimizer, not a function ;) A Function "superclass" is what I had in mind. As I said, that would make the function stateful, so this instance must not be shared among optimizers, and so it is more error prone :( Yes, it seems that the optimizer should maintain information about > the history. The question I have is about the flow of information: I > imagine that the criterion object should be able to query the > optimization object for the information that it needs. We should > define an interface of things that the optimizer can serve up to the > various components. This interface can be extended as required to > support more sophisticated algorithms. If each module returns a tuple with the result and a set of the parameters used, and if the criterion gets all these sets in a dictionary, it will be able to cope with more specific cases of steps or line searches. For the communication between the optimizer and the modules, the only communication is 'I want this and this' from the optimizer, and the rest is defined when instantiating the different modules. For instance, the line search doesn't need to know how the step is computed, and in fact neither does the optimizer. The step knows the function and what it needs from it; if that is not provided, the optimization should fail. > That would mean that it can have a state; I really do not support > this approach. The Broyden method _is_ a way to get a step from a function > that does not provide some information - the Jacobian, for instance -, so it is > not a function thing, it is a step mode. > > I disagree. I think of the Broyden algorithm as a way of maintaining > the Jacobian. The way to get the step is independent of this, though > it may use the Jacobian information to help it. The Broyden part of > the algorithm is solely to approximate the Jacobian cheaply.
The Broyden algorithm is a step; the literature presents it as a step, nothing else. Think of 3D computer graphics. Some graphics cards - the function - provide some features, others don't. When they are needed, the driver - the step or the line search - provides a software approach - an approximation -, but does not modify the GPU to add the features. Here, it is exactly the same: a step module is an adapter that provides a step, but never modifies the function. Maybe I will see things differently when you do this, but I am pretty > convinced right now that the Function() object is the best place for > the Broyden part of the algorithm. And now ? :) For instance, the conjugate gradient must remember the last step. It is exactly like the Broyden algorithm, only simpler, but it has a state. If the last gradient were to be stored inside the function and if this function were to be optimized again with another starting point, the first step would be wrong... If I have some time, I'll program the FR conjugate-gradient step so that you can see how it is made ;) Michael. > > P.S. No hurry. I might also disappear from time to time when busy;-) Me too ;) Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL:
From mforbes at physics.ubc.ca Mon Apr 16 16:13:22 2007 From: mforbes at physics.ubc.ca (Michael McNeil Forbes) Date: Mon, 16 Apr 2007 13:13:22 -0700 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> <47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca> Message-ID: On 16 Apr 2007, at 12:27 PM, Matthieu Brucher wrote: > Basically, an approximate Jacobian is used to determine the step > direction and/or size (depending on the "step" module etc.) > > The key to the Broyden approach is that the information about F(x+dx) > is used to update the approximate Jacobian (think multidimensional > secant method) in a clever way without any additional function > evaluations (there is not a unique way to do this and some choices > work better than others). > > Thus, think of Broyden methods as Quasi-Newton methods but with a > cheap and very approximate Jacobian (hence, one usually uses a robust > line search method to make sure that one is always descending). > > > I read some docs; it seems Broyden is a class of different steps, am > I wrong? And it tries to approximate the Hessian of the function. I am thinking of root finding G(x) = 0 right now rather than optimization (I have not thought about the optimization problem yet). The way the Broyden root finder works is that you start with an approximate Jacobian J0 (you could start with the identity for example). So, start with an approximate J0 at position x0.

G0 = G(x0)
dx = -inv(J0)*G0  # This is a Quasi-Newton step. Use your favourite step method here...
x1 = x0 + dx
G1 = G(x1)

Now the Broyden part comes in. It computes a new approximate Jacobian J1 at position x1 using J0, dG and dx such that

J1*dx = dG

This is the secant condition. There are many ways to do this in multiple dimensions and the various Broyden methods choose one of these. The most common is the BFGS choice, but there are other choices with different convergence properties. Now start over with J0 = J1 and x0 = x1 and repeat until convergence is met. Michael. P.S. More comments to follow.
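For concreteness, a minimal NumPy sketch of the loop just described, using the classic rank-1 update J1 = J0 + ((dG - J0*dx) * dx^T) / (dx^T*dx), which satisfies the secant condition J1*dx = dG. All names and defaults are illustrative only, and there is no line search here, unlike a robust implementation:

import numpy as np

def broyden_root(G, x0, J0=None, tol=1e-10, maxiter=100):
    """Find a root of G(x) = 0 with a Broyden-style secant update."""
    x = np.asarray(x0, dtype=float)
    g = G(x)
    J = np.eye(len(x)) if J0 is None else np.asarray(J0, dtype=float)
    for _ in range(maxiter):
        dx = -np.linalg.solve(J, g)   # quasi-Newton step dx = -inv(J)*g
        x = x + dx
        g_new = G(x)
        dg = g_new - g
        # rank-1 secant update: the new J satisfies J*dx = dg
        J = J + np.outer(dg - np.dot(J, dx), dx) / np.dot(dx, dx)
        g = g_new
        if np.linalg.norm(g) < tol:
            return x
    raise RuntimeError("Broyden iteration did not converge")

# e.g. broyden_root(lambda x: x**2 - np.array([2.0, 3.0]), [1.0, 1.0])
# converges towards [sqrt(2), sqrt(3)] without ever forming the true Jacobian.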
From ondrej at certik.cz Mon Apr 16 17:29:54 2007 From: ondrej at certik.cz (Ondrej Certik) Date: Mon, 16 Apr 2007 23:29:54 +0200 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> <47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca> Message-ID: <85b5c3130704161429u5ef93120sc257a749b576f62d@mail.gmail.com> You might want to check my email (a couple of days ago) about the Broyden methods together with Python code and tests. I didn't have time to implement it in scipy yet, but you can already use it. Ondrej On 4/16/07, Michael McNeil Forbes wrote: > On 16 Apr 2007, at 12:27 PM, Matthieu Brucher wrote: > > > Basically, an approximate Jacobian is used to determine the step > > direction and/or size (depending on the "step" module etc.) > > > > The key to the Broyden approach is that the information about F(x+dx) > > is used to update the approximate Jacobian (think multidimensional > > secant method) in a clever way without any additional function > > evaluations (there is not a unique way to do this and some choices > > work better than others). > > > > Thus, think of Broyden methods as Quasi-Newton methods but with a > > cheap and very approximate Jacobian (hence, one usually uses a robust > > line search method to make sure that one is always descending). > > > > > > I read some docs; it seems Broyden is a class of different steps, am > > I wrong? And it tries to approximate the Hessian of the function. > > I am thinking of root finding G(x) = 0 right now rather than > optimization (I have not thought about the optimization problem > yet). The way the Broyden root finder works is that you start with > an approximate Jacobian J0 (you could start with the identity for > example). > > So, start with an approximate J0 at position x0. > > G0 = G(x0) > dx = -inv(J0)*G0 # This is a Quasi-Newton step. Use your favourite > step method here... > x1 = x0 + dx > G1 = G(x1) > > Now the Broyden part comes in. It computes a new approximate > Jacobian J1 at position x1 using J0, dG and dx such that > > J1*dx = dG > > This is the secant condition. There are many ways to do this in > multiple dimensions and the various Broyden methods choose one of > these. The most common is the BFGS choice, but there are other > choices with different convergence properties. > > Now start over with J0 = J1 and x0 = x1 and repeat until convergence > is met. > > Michael. > > P.S. More comments to follow. > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev >
From matthieu.brucher at gmail.com Tue Apr 17 01:26:34 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 17 Apr 2007 07:26:34 +0200 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: <85b5c3130704161429u5ef93120sc257a749b576f62d@mail.gmail.com> References: <47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca> <85b5c3130704161429u5ef93120sc257a749b576f62d@mail.gmail.com> Message-ID: I'll check them today, thanks for the tests too Matthieu 2007/4/16, Ondrej Certik : > > You might want to check my email (a couple of days ago) about the > Broyden methods together with Python code and tests. I didn't have > time to implement it in scipy yet, but you can already use it.
> > Ondrej > > On 4/16/07, Michael McNeil Forbes wrote: > > On 16 Apr 2007, at 12:27 PM, Matthieu Brucher wrote: > > > > > Basically, an approximate Jacobian is used to determine the step > > > direction and/or size (depending on the "step" module etc.) > > > > > > The key to the Broyden approach is that the information about F(x+dx) > > > is used to update the approximate Jacobian (think multidimensional > > > secant method) in a clever way without any additional function > > > evaluations (there is not a unique way to do this and some choices > > > work better than others). > > > > > > Thus, think of Broyden methods as Quasi-Newton methods but with a > > > cheap and very approximate Jacobian (hence, one usually uses a robust > > > line search method to make sure that one is always descending). > > > > > > > > > I read some docs; it seems Broyden is a class of different steps, am > > > I wrong? And it tries to approximate the Hessian of the function. > > > > I am thinking of root finding G(x) = 0 right now rather than > > optimization (I have not thought about the optimization problem > > yet). The way the Broyden root finder works is that you start with > > an approximate Jacobian J0 (you could start with the identity for > > example). > > > > So, start with an approximate J0 at position x0. > > > > G0 = G(x0) > > dx = -inv(J0)*G0 # This is a Quasi-Newton step. Use your favourite > > step method here... > > x1 = x0 + dx > > G1 = G(x1) > > > > Now the Broyden part comes in. It computes a new approximate > > Jacobian J1 at position x1 using J0, dG and dx such that > > > > J1*dx = dG > > > > This is the secant condition. There are many ways to do this in > > multiple dimensions and the various Broyden methods choose one of > > these. The most common is the BFGS choice, but there are other > > choices with different convergence properties. > > > > Now start over with J0 = J1 and x0 = x1 and repeat until convergence > > is met. > > > > Michael. > > > > P.S. More comments to follow. > > _______________________________________________ > > Scipy-dev mailing list > > Scipy-dev at scipy.org > > http://projects.scipy.org/mailman/listinfo/scipy-dev > > > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL:
From matthieu.brucher at gmail.com Tue Apr 17 04:12:12 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 17 Apr 2007 10:12:12 +0200 Subject: [SciPy-dev] SciPy improvements In-Reply-To: References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <85b5c3130704121632w5a31ec3sc09fa161e82ee661@mail.gmail.com> <30EF75A9-4002-4EBE-8784-CDBDBFC69720@physics.ubc.ca> <461EF0C4.6060506@ar.media.kyoto-u.ac.jp> <461F0E45.5000009@ar.media.kyoto-u.ac.jp> Message-ID: > > The project page mentions SVM. In addition to SVM I'm interested in > things like PPCA, kernel PCA, RBF networks, gaussian processes and > GPLVM. Are you going to try to go in the direction of a modular > structure with reusable bits for all kernel methods, or is the > plan to target SVM specifically? > There is a review of dimensionality reduction algorithms at http://www.cs.unimaas.nl/l.vandermaaten/Laurens%20van%20der%20Maaten/Matlab%20Toolbox%20for%20Dimensionality%20Reduction.html that could be worth porting... one day... Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL:
From fullung at gmail.com Tue Apr 17 06:46:22 2007 From: fullung at gmail.com (Albert Strasheim) Date: Tue, 17 Apr 2007 12:46:22 +0200 Subject: [SciPy-dev] Windows build with MSVC Message-ID: <002001c780dd$a3fd9060$0100a8c0@sun.ac.za> Hello all For lack of other sources of pain in my life, I'm giving the SciPy build with MSVC and G77 a go again. I've already fixed a few issues: http://projects.scipy.org/scipy/scipy/changeset/2927 http://projects.scipy.org/scipy/scipy/changeset/2928 I'm currently running into problems building the scipy.special._cephes extension. I get the following error:

cephes.lib(exp10.obj) : error LNK2001: unresolved external symbol _isnan
cephes.lib(cbrt.obj) : error LNK2001: unresolved external symbol _isnan
cephes.lib(unity.obj) : error LNK2019: unresolved external symbol _isnan referenced in function _cephes_expm1
cephes.lib(ndtr.obj) : error LNK2019: unresolved external symbol _isnan referenced in function _cephes_erfc
cephes.lib(gamma.obj) : error LNK2001: unresolved external symbol _isnan
cephes.lib(exp2.obj) : error LNK2001: unresolved external symbol _isnan
build\lib.win32-2.4\scipy\special\_cephes.pyd : fatal error LNK1120: 1 unresolved externals

It seems the cephes macro soup has gotten isnan wrong for the MSVC build. The MSDN docs for isnan are here: http://msdn2.microsoft.com/en-us/library/tzthab44(VS.80).aspx Maybe someone can figure out the right mix of defines and mconf.h magic to make it build again? Thanks! Regards, Albert
From john at curioussymbols.com Tue Apr 17 09:31:50 2007 From: john at curioussymbols.com (John Pye) Date: Tue, 17 Apr 2007 23:31:50 +1000 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <461F2BBB.7060406@ar.media.kyoto-u.ac.jp> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F2BBB.7060406@ar.media.kyoto-u.ac.jp> Message-ID: <4624CC46.4030404@curioussymbols.com> Hi David David Cournapeau wrote: > UMFPACK is kind of a pain to compile too > (depends on two other packages), and let's not even start talking about > ATLAS, [...] Just in case you didn't know, UMFPACK (as well as Tim Davis' other code) is part of Debian/Ubuntu. It's named 'libufsparse'. AFAIK it was not in Fedora Core 6 (but this may have changed). Cheers JP
From david at ar.media.kyoto-u.ac.jp Tue Apr 17 23:19:00 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 18 Apr 2007 12:19:00 +0900 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <4624CC46.4030404@curioussymbols.com> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F2BBB.7060406@ar.media.kyoto-u.ac.jp> <4624CC46.4030404@curioussymbols.com> Message-ID: <46258E24.8030602@ar.media.kyoto-u.ac.jp> John Pye wrote: > Hi David > > David Cournapeau wrote: > >> UMFPACK is kind of a pain to compile too >> (depends on two other packages), and let's not even start talking about >> ATLAS, [...] >> > Just in case you didn't know, UMFPACK (as well as Tim Davis' other code) > is part of Debian/Ubuntu. It's named 'libufsparse'. AFAIK it was not in > Fedora Core 6 (but this may have changed). > I know that Debian/Ubuntu makes it much easier to build numpy and scipy than other distributions, basically 2-3 lines to install them and all necessary dependencies :) But I was speaking for other distributions. Sadly, my experience with other distributions (FC 5-6 and openSUSE, at least) shows that they are not on par with Debian, at least for those packages. And then, you have other OSes.
Anyway, I was just highlighting why adding dependencies makes the packaging job more difficult. David From john at curioussymbols.com Wed Apr 18 01:34:42 2007 From: john at curioussymbols.com (John Pye) Date: Wed, 18 Apr 2007 15:34:42 +1000 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <46258E24.8030602@ar.media.kyoto-u.ac.jp> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F2BBB.7060406@ar.media.kyoto-u.ac.jp> <4624CC46.4030404@curioussymbols.com> <46258E24.8030602@ar.media.kyoto-u.ac.jp> Message-ID: <4625ADF2.7060503@curioussymbols.com> David Cournapeau wrote: > John Pye wrote: > >> Hi David >> >> David Cournapeau wrote: >> >> >>> UMFPACK is kind of a pain to compile too >>> (depends on two other packages), and let's not even start talking about >>> ATLAS, [...] >>> >> Just in case you didn't know, UMFPACK (as well as Tim Davis' other code) >> is part of Debian/Ubuntu. It's named 'libufsparse'. AFAIK it was not in >> Fedora Core 6 (but this may have changed). >> >> > I know that Debian/Ubuntu makes it much easier to build numpy and scipy > than other distributions, basically 2-3 lines to install them and all > necessary dependencies :) But I was speaking for other distributions. > Sadly, my experience with other distributions (FC 5-6 and openSUSE, at > least) show that they are not on par with debian, at least for those > packages. And then, you have other OS. > Anyway, I was just highlighting why adding dependencies makes the > packaging job more difficult. Perhaps the focus should be on getting UMFPACK added to those distros, rather than working out how to build them as part of scipy? The only place where this isn't so great is Windows. Windows is always a special case :-) From robert.kern at gmail.com Wed Apr 18 01:36:53 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 18 Apr 2007 00:36:53 -0500 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <4625ADF2.7060503@curioussymbols.com> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F2BBB.7060406@ar.media.kyoto-u.ac.jp> <4624CC46.4030404@curioussymbols.com> <46258E24.8030602@ar.media.kyoto-u.ac.jp> <4625ADF2.7060503@curioussymbols.com> Message-ID: <4625AE75.3040300@gmail.com> John Pye wrote: > Perhaps the focus should be on getting UMFPACK added to those distros, > rather than working out how to build them as part of scipy? Since no one is suggesting building the UMFPACK library as part of the scipy build process, it appears that everyone already agrees with you. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From david at ar.media.kyoto-u.ac.jp Wed Apr 18 01:46:39 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 18 Apr 2007 14:46:39 +0900 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <4625ADF2.7060503@curioussymbols.com> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F2BBB.7060406@ar.media.kyoto-u.ac.jp> <4624CC46.4030404@curioussymbols.com> <46258E24.8030602@ar.media.kyoto-u.ac.jp> <4625ADF2.7060503@curioussymbols.com> Message-ID: <4625B0BF.5030805@ar.media.kyoto-u.ac.jp> John Pye wrote: > > Perhaps the focus should be on getting UMFPACK added to those distros, > rather than working out how to build them as part of scipy? The only > place where this isn't so great is Windows. 
Windows is always a special > case :-) > Talk is cheap :) This is exactly what I've started a few days ago; you are more than welcome to help me: http://software.opensuse.org/download/home:/ashigabou/ I already have binary packages for numpy + scipy using NETLIB BLAS and LAPACK (also packaged by myself, as I had various problems with the ones distributed by Fedora and openSUSE), am working on atlas 3.7.* (which is by far the most difficult), and intend to do at least fftw3 and umfpack along the way. cheers, David
From mforbes at physics.ubc.ca Wed Apr 18 02:04:22 2007 From: mforbes at physics.ubc.ca (Michael McNeil Forbes) Date: Tue, 17 Apr 2007 23:04:22 -0700 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> <47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca> Message-ID: <199D13FD-7C7E-4F58-98EF-F2E8D3B43953@physics.ubc.ca> Okay, I think we are thinking similar things with different terminology: I think you are saying that only one object should maintain state (your "optimizer") (I was originally sharing the state, which I agree can cause problems). If so, I agree, but to me it seems that object should be called a "partially optimized function". I think of an "optimizer" as something which modifies state rather than something that maintains state. I am thinking of code like:

------
roughly_locate_minimum = Optimizer(criterion=extremelyWeakCriterion, step=slowRobustStep, ...)
find_precise_minimum = Optimizer(criterion=preciseCriterion, step=fasterStep, ...)

f = Rosenbrock(...)
x0 = ...

f_min = OptimizedFunction(f, x0)
f_min.optimize(optimizer=roughly_locate_minimum)
f_min.optimize(optimizer=find_precise_minimum)

# OR (this reads better to me, but the functions should return copies of f_min, so may not be desirable for performance reasons)
f_min = roughly_locate_minimum(f_min)
f_min = find_precise_minimum(f_min)

# Then one can query f_min for results:
print f_min.x   # Best current approximation to optimum
print f_min.f
print f_min.err # Estimated error
print f_min.df
# etc...
-----

The f_min object keeps track of all state, can be passed from one optimizer to another, etc. In my mind, it is simply an object that has accumulated information about a function. The idea I have in mind is that f is extremely expensive to compute, thus the object with state f_min accumulates more and more information as it goes along. Ultimately this information could be used in many ways, for example: - f_min could keep track of roughly how long it takes to compute f(x), thus providing estimates of the time required to complete a calculation. - f_min could keep track of values and use interpolation to provide fast guesses etc. Does this mesh with your idea of an "optimizer"? I think it is strictly equivalent, but looking at the line of code "optimizer.optimize()" is much less useful to me than "f_min.optimize(optimizer=...)". What would your ideal "user" code look like for the above use-case? I will try to flesh out a more detailed structure for the OptimizedFunction class, Michael. On 16 Apr 2007, at 12:27 PM, Matthieu Brucher wrote: > ... > OK, that is not what a function is :) > A function is the set of f, df, ddf but not with the options. What > you are exposing is the construction of an optimizer ;) > > Did you see the code in my different proposals ? > In fact, you have a function class - for instance a Rosenbrock class > - that defines several methods, like gradient, hessian, ...
without > a real state - a real state being something other than, for instance, > the number of dimensions for the Rosenbrock function, the points > that need to be approximated, ... a real state is something that is > dependent on the subsequent calls to the functor, the gradient, ... > - so that this function can be reused efficiently. > Then you use an instance of this class to be optimized. > You choose your step mode with its parameters like gradient, > conjugate gradient, Newton, Quasi-Newton, ... > You choose your line search with its own parameters - > tolerance, ... - like section methods, interpolation methods, ... > Finally, you choose your stopping criterion. > Then you make something like : > optimizer = StandardOptimizer(function = myFunction, step = > myStep, .......) > optimizer.optimize() > > That is a modular design, and that is why some basic functions must > be provided so that people who don't care about the underlying > design really do not have to care. Then if someone wants a > specific, non-standard optimizer, one just has to select the > wanted modules - for instance, a conjugate gradient with a golden- > section line search and a relative value criterion instead of a > Fibonacci search and an absolute criterion -. > > It can be more cumbersome at the start, but once some modules are > made, assembling them will be easier, and tests will be more fun :) > ...
From matthieu.brucher at gmail.com Wed Apr 18 02:36:00 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 18 Apr 2007 08:36:00 +0200 Subject: [SciPy-dev] Proposal for more generic optimizers (posted before on scipy-user) In-Reply-To: <199D13FD-7C7E-4F58-98EF-F2E8D3B43953@physics.ubc.ca> References: <7D8EB91C-8762-4BF4-B45D-58E9F0EA93C4@physics.ubc.ca> <47499294-EEF7-49F8-BE6A-CFBC67CA4707@physics.ubc.ca> <199D13FD-7C7E-4F58-98EF-F2E8D3B43953@physics.ubc.ca> Message-ID: 2007/4/18, Michael McNeil Forbes : > > Okay, I think we are thinking similar things with different terminology: Yes, I think that too. I think you are saying that only one object should maintain state > (your "optimizer") (I was originally sharing the state, which I agree > can cause problems). If so, I agree, but to me it seems that object > should be called a "partially optimized function". I think of an > "optimizer" as something which modifies state rather than something > that maintains state. Well, in fact the real object that has a state is the step; the optimizer could have a state, but I do not currently use that approach. I'll think about the dependencies that keeping the state only in the optimizer would lead to. That would mean that each call to the step, the criterion or the line search would take another parameter, the state of the optimizer. Let's say it's a dict. Each object would take and modify some values, and in fact that is what I pass to the recordHistory function - I think I'll rename it record_history to be scipy coding-standard compliant -, so there is not much trouble in doing this. I am thinking of code like: > > ------ > roughly_locate_minimum = Optimizer(criterion=extremelyWeakCriterion, step=slowRobustStep, ...) > find_precise_minimum = Optimizer(criterion=preciseCriterion, step=fasterStep, ...) > > f = Rosenbrock(...) > x0 = ...
> > f_min = OptimizedFunction(f,x0) > f_min.optimize(optimizer=roughly_locate_minimum) > f_min.optimize(optimizer=find_precise_minimum) > > # OR (this reads better to me, but the functions should return copies > of f_min, so may not be desirable for performance reasons) > f_min = roughly_locate_minimum(f_min) > f_min = find_precise_minimum(f_min) > > # Then one can query f_min for results: > print f_min.x # Best current approximation to optimum > print f_min.f > print f_min.err # Estimated error > print f_min.df > # etc... > ----- > > The f_min object keeps track of all state, can be passed from one > optimizer to another, etc. In my mind, it is simply an object that > has accumulated information about a function. You mean you would want to possibly share the state between optimizers ? The idea I have in > mind is that f is extremely expensive to compute, thus the object > with state f_min accumulates more and more information as it goes > along. Well, a part of this is done by the recordHistory, but I don't think that saving the whole state at every iteration in f_min is a good idea from a memory point of view. Why not save the last state, with all the needed values? For instance, the old and new values, the old and new parameters, the old and new step, the new gradient, ... I think the number of iterations should be there as well ;) Ultimately this information could be used in many ways, for > example: > > - f_min could keep track of roughly how long it takes to compute f(x), > thus providing estimates of the time required to complete a > calculation. > - f_min could keep track of values and use interpolation to provide > fast guesses etc. > > Does this mesh with your idea of an "optimizer"? I think it is > strictly equivalent, but looking at the line of code > "optimizer.optimize()" is much less useful to me than "f_min.optimize(optimizer=...)". > > What would your ideal "user" code look like for the above use-case? Well, not exactly, it would be almost like it. I do not know what you want to put in OptimizedFunction and what its role is exactly. My ideal code is straightforward - in fact it's more the current code - :

f = Rosenbrock(...)
x0 = ...
roughly_locate_minimum_optimizer = StandardOptimizer(function = f, x0 = x0, step = Step.SomeStep(...), lineSearch = LineSearch.InexactLineSearch(...), criterion = Criterion.SomeCriterion(...), record = SomeRecordingFunctionIfNeeded)
local_minimum = roughly_locate_minimum_optimizer.optimize()
precisely_locate_minimum_optimizer = StandardOptimizer(function = f, x0 = local_minimum, step = Step.SomeOtherStep(...), lineSearch = LineSearch.ExactLineSearch(...), criterion = Criterion.SomeOtherCriterion(...), record = SomeRecordingFunctionIfNeeded)
minimum = precisely_locate_minimum_optimizer.optimize()

Using the OptimizedFunction to save a state shared by different optimizers that do not save the same things could lead to side effects that are difficult to track. What could be done, as I said, is to output the state at the end of the optimizer, and perhaps allow the user to give it to a new optimizer. It would only take the part that it needs. I will try to flesh out a more detailed structure for the > OptimizedFunction class, > Michael. I'm looking forward to seeing this ;) Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL:
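For concreteness, one possible skeleton for the OptimizedFunction idea discussed in the two messages above - purely illustrative, assuming NumPy; nothing here is existing scipy code:

import numpy as np

class OptimizedFunction(object):
    """A function bundled with the knowledge accumulated about it:
    the best point so far and every evaluation seen so far."""
    def __init__(self, f, x0):
        self.f = f
        self.x = np.asarray(x0, dtype=float)   # best current estimate
        self.history = []                      # (x, f(x)) pairs
    def __call__(self, x):
        value = self.f(x)
        self.history.append((np.array(x, copy=True), value))
        return value
    def optimize(self, optimizer):
        # the optimizer refines the estimate; all state stays in self
        self.x = optimizer(self, self.x)
        return self.x

Here an "optimizer" is anything callable as optimizer(function, x0) that returns an improved x; whether this state should live in the function object or in the optimizer is exactly the point under discussion.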
From matthieu.brucher at gmail.com Wed Apr 18 04:35:09 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 18 Apr 2007 10:35:09 +0200 Subject: [SciPy-dev] Updated generic optimizers proposal Message-ID: Hi, I'm launching a new thread; the last one was pretty big, and as I put almost every piece of advice into this proposal, I thought it would be better. First, I used the scipy coding standard; I hope I didn't forget anything. I do not know where it should go in the scipy tree at the moment, and the tests are visual for now; I have to make them automatic, but I do not know the test framework used by scipy, so I have to check it first. So, the proposal :

- combining several objects to make an optimizer
- a function should be an object defining the __call__ method and gradient, hessian, ... if needed. It can be passed as several separate functions, as Alan suggested; a new object is then created
- an optimizer is a combination of a function, a step_kind, a line_search, a criterion and a starting point x0.
- the result of the optimization is returned after a call to the optimize() method
- every object (step or line_search) saves its modifications in a state variable in the optimizer. This variable can be accessed if needed after the optimization.
- after each iteration, a record function is called with this state variable - it is a dict, BTW -; if you want to save the whole dict, don't forget to copy it, as it is modified during the optimization

For the moment, the following are implemented :

- a standard algorithm, which only calls step_kind then line_search for a new candidate - the next optimizer would be one that calls a modifying function on the computed result, which can be useful in some cases -
- criteria :
  - monotony criterion : the cost is decreasing - a factor can be used to allow an error -
  - relative value criterion : the relative change in the value is compared to a fixed error
  - absolute value criterion : the same with the absolute change
- step :
  - gradient step
  - Newton step
  - Fletcher-Reeves conjugate gradient step - other conjugate gradients will be available -
- line search :
  - no line search, just take the step
  - damped search, an inexact line search that searches in the step direction for a set of parameters that decreases the cost by dividing the step size by two while the cost is not decreasing
  - Golden section search
  - Fibonacci search

I'm not adding other criteria, steps or line searches for now, as my time is finite when doing a structural change. There are 3 classic optimization test functions in the package - Rosenbrock, Powell and a quadratic function - feel free to try them. Sometimes the optimizer converges to the true minimum, sometimes it does not; I tried to propose several combinations to show that not every combination manages to find the minimum. Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: optimizer_proposal_04.tar.gz Type: application/x-gzip Size: 5903 bytes Desc: not available URL:
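A minimal sketch of how pieces in the spirit of this proposal could assemble - the class names follow the proposal's vocabulary, but the bodies are illustrative guesses, not the code from the attached archive:

import numpy as np

class GradientStep(object):
    def __call__(self, function, x, state):
        return -function.gradient(x)   # steepest-descent direction

class DampedLineSearch(object):
    def __init__(self, damping=0.5, min_step=1e-10):
        self.damping, self.min_step = damping, min_step
    def __call__(self, function, x, direction, state):
        step = 1.0
        # halve the step while the cost does not decrease
        while function(x + step * direction) >= function(x) and step > self.min_step:
            step *= self.damping
        state['step_size'] = step
        return x + step * direction

class RelativeValueCriterion(object):
    def __init__(self, tol=1e-8):
        self.tol = tol
    def __call__(self, state):
        old, new = state.get('old_value'), state.get('new_value')
        return old is not None and abs(new - old) <= self.tol * abs(old)

class StandardOptimizer(object):
    def __init__(self, function, x0, step_kind, line_search, criterion):
        self.function, self.x0 = function, np.asarray(x0, dtype=float)
        self.step_kind, self.line_search, self.criterion = step_kind, line_search, criterion
        self.state = {}   # shared record of the iteration, a plain dict
    def optimize(self, maxiter=10000):
        x = self.x0
        self.state['new_value'] = self.function(x)
        for _ in range(maxiter):
            direction = self.step_kind(self.function, x, self.state)
            x = self.line_search(self.function, x, direction, self.state)
            self.state['old_value'] = self.state['new_value']
            self.state['new_value'] = self.function(x)
            if self.criterion(self.state):
                break
        return x

For example, minimizing a simple quadratic, reusing the illustrative Function wrapper sketched earlier in this thread:

opt = StandardOptimizer(function=Function(lambda x: float(np.dot(x, x))), x0=[3., 4.], step_kind=GradientStep(), line_search=DampedLineSearch(), criterion=RelativeValueCriterion())
print opt.optimize()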
From ondrej at certik.cz Wed Apr 18 05:57:55 2007 From: ondrej at certik.cz (Ondrej Certik) Date: Wed, 18 Apr 2007 11:57:55 +0200 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <4625AE75.3040300@gmail.com> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F2BBB.7060406@ar.media.kyoto-u.ac.jp> <4624CC46.4030404@curioussymbols.com> <46258E24.8030602@ar.media.kyoto-u.ac.jp> <4625ADF2.7060503@curioussymbols.com> <4625AE75.3040300@gmail.com> Message-ID: <85b5c3130704180257n1174397bn268a232baadaca7e@mail.gmail.com> I think Python projects which are not just Python bindings to some C++/Fortran library (and SciPy definitely is not) should work in such a way that you just download them and import them in Python, without any installation, and it should just work, at least for the basic things. This way SciPy can have any number of (optional) modules, as it will not add to the complexity of the installation. Of course you need NumPy, but I assume NumPy is already installed. But I just use apt-get install python-scipy anyway, so I don't care. Ondrej On 4/18/07, Robert Kern wrote: > John Pye wrote: > > > Perhaps the focus should be on getting UMFPACK added to those distros, > > rather than working out how to build them as part of scipy? > > Since no one is suggesting building the UMFPACK library as part of the scipy > build process, it appears that everyone already agrees with you. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless enigma > that is made terrible by our own mad attempt to interpret it as though it had > an underlying truth." > -- Umberto Eco > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev >
From david at ar.media.kyoto-u.ac.jp Wed Apr 18 06:44:14 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 18 Apr 2007 19:44:14 +0900 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <85b5c3130704180257n1174397bn268a232baadaca7e@mail.gmail.com> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F2BBB.7060406@ar.media.kyoto-u.ac.jp> <4624CC46.4030404@curioussymbols.com> <46258E24.8030602@ar.media.kyoto-u.ac.jp> <4625ADF2.7060503@curioussymbols.com> <4625AE75.3040300@gmail.com> <85b5c3130704180257n1174397bn268a232baadaca7e@mail.gmail.com> Message-ID: <4625F67E.1040807@ar.media.kyoto-u.ac.jp> Ondrej Certik wrote: > I think Python projects which are not just Python bindings to some > C++/Fortran library (and SciPy definitely is not) should work in such a way > that you just download them and import them in Python, without any > installation, and it should just work, at least for the basic things. > On Linux at least, I don't see any other solution than packaging numpy/scipy with rpm/deb/whatever the distribution uses. This is the only reliable way to distribute binaries on Linux; trying otherwise is cumbersome, difficult, and a waste of time I think.
David
From nmarais at sun.ac.za Wed Apr 18 09:55:03 2007 From: nmarais at sun.ac.za (Neilen Marais) Date: Wed, 18 Apr 2007 15:55:03 +0200 Subject: [SciPy-dev] Compiling pysparse from the sandbox Message-ID: Hi, I tried compiling pysparse from the sandbox in a recent svn SciPy, but I get the following compiler error:

In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:17:
Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec_transp':
Lib/sandbox/pysparse/src/ll_mat.c:760: error: 'CONTIGUOUS' undeclared (first use in this function)
Lib/sandbox/pysparse/src/ll_mat.c:760: error: (Each undeclared identifier is reported only once
Lib/sandbox/pysparse/src/ll_mat.c:760: error: for each function it appears in.)
Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec':
Lib/sandbox/pysparse/src/ll_mat.c:797: error: 'CONTIGUOUS' undeclared (first use in this function)
In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:18:
Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec_transp':
Lib/sandbox/pysparse/src/csr_mat.c:119: error: 'CONTIGUOUS' undeclared (first use in this function)
Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec':
Lib/sandbox/pysparse/src/csr_mat.c:146: error: 'CONTIGUOUS' undeclared (first use in this function)
In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:19:
Lib/sandbox/pysparse/src/sss_mat.c: In function 'SSSMat_matvec':
Lib/sandbox/pysparse/src/sss_mat.c:83: error: 'CONTIGUOUS' undeclared (first use in this function)
In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:17:
Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec_transp':
Lib/sandbox/pysparse/src/ll_mat.c:760: error: 'CONTIGUOUS' undeclared (first use in this function)
Lib/sandbox/pysparse/src/ll_mat.c:760: error: (Each undeclared identifier is reported only once
Lib/sandbox/pysparse/src/ll_mat.c:760: error: for each function it appears in.)
Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec':
Lib/sandbox/pysparse/src/ll_mat.c:797: error: 'CONTIGUOUS' undeclared (first use in this function)
In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:18:
Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec_transp':
Lib/sandbox/pysparse/src/csr_mat.c:119: error: 'CONTIGUOUS' undeclared (first use in this function)
Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec':
Lib/sandbox/pysparse/src/csr_mat.c:146: error: 'CONTIGUOUS' undeclared (first use in this function)
In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:19:
Lib/sandbox/pysparse/src/sss_mat.c: In function 'SSSMat_matvec':
Lib/sandbox/pysparse/src/sss_mat.c:83: error: 'CONTIGUOUS' undeclared (first use in this function)
error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -g -O2 -Wall -fPIC -ILib/sandbox/pysparse/include/ -I/usr/lib/python2.4/site-packages/numpy/core/include -I/usr/include/python2.4 -c Lib/sandbox/pysparse/src/spmatrixmodule.c -o build/temp.linux-x86_64-2.4/Lib/sandbox/pysparse/src/spmatrixmodule.o" failed with exit status 1

Is this a known problem, or have I forgotten to install some development library? I'm running Ubuntu Edgy AMD64.
Thanks Neilen -- you know its kind of tragic we live in the new world but we've lost the magic -- Battery 9 (www.battery9.co.za)
From nwagner at iam.uni-stuttgart.de Wed Apr 18 10:38:04 2007 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Wed, 18 Apr 2007 16:38:04 +0200 Subject: [SciPy-dev] Compiling pysparse from the sandbox In-Reply-To: References: Message-ID: <46262D4C.2020005@iam.uni-stuttgart.de> Neilen Marais wrote: > Hi, > > I tried compiling pysparse from the sandbox in a recent svn SciPy, but I > get the following compiler error:
> In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:17:
> Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec_transp':
> Lib/sandbox/pysparse/src/ll_mat.c:760: error: 'CONTIGUOUS' undeclared (first use in this function)
> Lib/sandbox/pysparse/src/ll_mat.c:760: error: (Each undeclared identifier is reported only once
> Lib/sandbox/pysparse/src/ll_mat.c:760: error: for each function it appears in.)
> Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec':
> Lib/sandbox/pysparse/src/ll_mat.c:797: error: 'CONTIGUOUS' undeclared (first use in this function)
> In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:18:
> Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec_transp':
> Lib/sandbox/pysparse/src/csr_mat.c:119: error: 'CONTIGUOUS' undeclared (first use in this function)
> Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec':
> Lib/sandbox/pysparse/src/csr_mat.c:146: error: 'CONTIGUOUS' undeclared (first use in this function)
> In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:19:
> Lib/sandbox/pysparse/src/sss_mat.c: In function 'SSSMat_matvec':
> Lib/sandbox/pysparse/src/sss_mat.c:83: error: 'CONTIGUOUS' undeclared (first use in this function)
> In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:17:
> Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec_transp':
> Lib/sandbox/pysparse/src/ll_mat.c:760: error: 'CONTIGUOUS' undeclared (first use in this function)
> Lib/sandbox/pysparse/src/ll_mat.c:760: error: (Each undeclared identifier is reported only once
> Lib/sandbox/pysparse/src/ll_mat.c:760: error: for each function it appears in.)
> Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec':
> Lib/sandbox/pysparse/src/ll_mat.c:797: error: 'CONTIGUOUS' undeclared (first use in this function)
> In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:18:
> Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec_transp':
> Lib/sandbox/pysparse/src/csr_mat.c:119: error: 'CONTIGUOUS' undeclared (first use in this function)
> Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec':
> Lib/sandbox/pysparse/src/csr_mat.c:146: error: 'CONTIGUOUS' undeclared (first use in this function)
> In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:19:
> Lib/sandbox/pysparse/src/sss_mat.c: In function 'SSSMat_matvec':
> Lib/sandbox/pysparse/src/sss_mat.c:83: error: 'CONTIGUOUS' undeclared (first use in this function)
> error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -g -O2 -Wall -fPIC -ILib/sandbox/pysparse/include/ -I/usr/lib/python2.4/site-packages/numpy/core/include -I/usr/include/python2.4 -c Lib/sandbox/pysparse/src/spmatrixmodule.c -o build/temp.linux-x86_64-2.4/Lib/sandbox/pysparse/src/spmatrixmodule.o" failed with exit status 1
> > Is this a known problem, or have I forgotten to install some development > library? I'm running Ubuntu Edgy AMD64.
> > Thanks > Neilen > > Hi Neilen, I just tried to install pysparse from the sandbox. I can only confirm the failure. Nils
From robert.kern at gmail.com Wed Apr 18 11:24:52 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 18 Apr 2007 10:24:52 -0500 Subject: [SciPy-dev] petsc - UMFPACK and scipy In-Reply-To: <85b5c3130704180257n1174397bn268a232baadaca7e@mail.gmail.com> References: <461F2AB3.1080407@iam.uni-stuttgart.de> <461F2BBB.7060406@ar.media.kyoto-u.ac.jp> <4624CC46.4030404@curioussymbols.com> <46258E24.8030602@ar.media.kyoto-u.ac.jp> <4625ADF2.7060503@curioussymbols.com> <4625AE75.3040300@gmail.com> <85b5c3130704180257n1174397bn268a232baadaca7e@mail.gmail.com> Message-ID: <46263844.4080709@gmail.com> Ondrej Certik wrote: > I think Python projects which are not just Python bindings to some > C++/Fortran library (and SciPy definitely is not) should work in such a way > that you just download them and import them in Python, without any > installation, and it should just work, at least for the basic things. But most of the things in scipy that aren't *just* wrappers around C or FORTRAN *use* just-wrappers of C and FORTRAN libraries. I'm sorry that this doesn't meet your standards, but there's really no getting around it. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
From wbaxter at gmail.com Wed Apr 18 14:53:22 2007 From: wbaxter at gmail.com (Bill Baxter) Date: Thu, 19 Apr 2007 03:53:22 +0900 Subject: [SciPy-dev] Compiling pysparse from the sandbox In-Reply-To: <46262D4C.2020005@iam.uni-stuttgart.de> References: <46262D4C.2020005@iam.uni-stuttgart.de> Message-ID: I think the name changed from CONTIGUOUS to NPY_CONTIGUOUS (or NPY_C_CONTIGUOUS if you want to be a little more explicit). Or try including numpy/noprefix.h instead of whatever it is including. That's apparently a compatibility header. So is this pysparse in the sandbox the same pysparse as here? http://sourceforge.net/project/showfiles.php?group_id=101403 Has it been further developed/fixed in the sandbox? --bb On 4/18/07, Nils Wagner wrote: > Neilen Marais wrote: > > Hi, > > > > I tried compiling pysparse from the sandbox in a recent svn SciPy, but I > > get the following compiler error:
> > In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:17:
> > Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec_transp':
> > Lib/sandbox/pysparse/src/ll_mat.c:760: error: 'CONTIGUOUS' undeclared (first use in this function)
> > Lib/sandbox/pysparse/src/ll_mat.c:760: error: (Each undeclared identifier is reported only once
> > Lib/sandbox/pysparse/src/ll_mat.c:760: error: for each function it appears in.)
> > Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec':
> > Lib/sandbox/pysparse/src/ll_mat.c:797: error: 'CONTIGUOUS' undeclared (first use in this function)
> > In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:18:
> > Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec_transp':
> > Lib/sandbox/pysparse/src/csr_mat.c:119: error: 'CONTIGUOUS' undeclared (first use in this function)
> > Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec':
> > Lib/sandbox/pysparse/src/csr_mat.c:146: error: 'CONTIGUOUS' undeclared (first use in this function)
> > In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:19:
> > Lib/sandbox/pysparse/src/sss_mat.c: In function 'SSSMat_matvec':
> > Lib/sandbox/pysparse/src/sss_mat.c:83: error: 'CONTIGUOUS' undeclared (first use in this function)
> > In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:17:
> > Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec_transp':
> > Lib/sandbox/pysparse/src/ll_mat.c:760: error: 'CONTIGUOUS' undeclared (first use in this function)
> > Lib/sandbox/pysparse/src/ll_mat.c:760: error: (Each undeclared identifier is reported only once
> > Lib/sandbox/pysparse/src/ll_mat.c:760: error: for each function it appears in.)
> > Lib/sandbox/pysparse/src/ll_mat.c: In function 'LLMat_matvec':
> > Lib/sandbox/pysparse/src/ll_mat.c:797: error: 'CONTIGUOUS' undeclared (first use in this function)
> > In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:18:
> > Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec_transp':
> > Lib/sandbox/pysparse/src/csr_mat.c:119: error: 'CONTIGUOUS' undeclared (first use in this function)
> > Lib/sandbox/pysparse/src/csr_mat.c: In function 'CSRMat_matvec':
> > Lib/sandbox/pysparse/src/csr_mat.c:146: error: 'CONTIGUOUS' undeclared (first use in this function)
> > In file included from Lib/sandbox/pysparse/src/spmatrixmodule.c:19:
> > Lib/sandbox/pysparse/src/sss_mat.c: In function 'SSSMat_matvec':
> > Lib/sandbox/pysparse/src/sss_mat.c:83: error: 'CONTIGUOUS' undeclared (first use in this function)
> > error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -g -O2 -Wall -fPIC -ILib/sandbox/pysparse/include/ -I/usr/lib/python2.4/site-packages/numpy/core/include -I/usr/include/python2.4 -c Lib/sandbox/pysparse/src/spmatrixmodule.c -o build/temp.linux-x86_64-2.4/Lib/sandbox/pysparse/src/spmatrixmodule.o" failed with exit status 1
> > > > Is this a known problem, or have I forgotten to install some development > > library? I'm running Ubuntu Edgy AMD64. > > > > Thanks > > Neilen > > > Hi Neilen, > > I just tried to install pysparse from the sandbox. > I can only confirm the failure. > > Nils > > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev >
From travis at enthought.com Tue Apr 17 09:02:55 2007 From: travis at enthought.com (Travis Vaught) Date: Tue, 17 Apr 2007 08:02:55 -0500 Subject: [SciPy-dev] ANN: SciPy 2007 Conference Message-ID: <5F432F58-2741-458F-B643-274D09B701C8@enthought.com> Greetings, The *SciPy 2007 Conference* has been scheduled for mid-August at CalTech.
http://www.scipy.org/SciPy2007 Here's the rough schedule: Tutorials: August 14-15 (Tuesday and Wednesday) Conference: August 16-17 (Thursday and Friday) Sprints: August 18 (Saturday) Exciting things are happening in the Python community, and the SciPy 2007 Conference is an excellent opportunity to exchange ideas, learn techniques, contribute code and affect the direction of scientific computing (or just to learn what all the fuss is about). Last year's conference saw a near-doubling of attendance to 138, and we're looking forward to continued gains in participation. We'll be announcing the Keynote Speaker and providing a detailed schedule in the coming weeks. Registration: ------------- Registration is now open. You may register online at https://www.enthought.com/scipy07. Early registration for the conference is $150.00 and includes breakfast and lunch Thursday & Friday and a very nice dinner Thursday night. Tutorial registration is an additional $75.00. After July 15, 2007, conference registration will increase to $200.00 (tutorial registration will remain the same at $75.00). Call for Presenters ------------------- If you are interested in presenting at the conference, you may submit an abstract in Plain Text, PDF or MS Word formats to abstracts at scipy.org -- the deadline for abstract submission is July 6, 2007. Papers and/or presentation slides are acceptable and are due by August 3, 2007. Tutorial Sessions ----------------- Last year's conference saw an overwhelming turnout for our first-ever tutorial sessions. In order to better accommodate the community interest in tutorials, we've expanded them to 2 days and are providing food (requiring us to charge a modest fee for tutorials this year). A tentative list of topics for tutorials includes: - Wrapping Code with Python (extension module development) - Building Rich Scientific Applications with Python - Using SciPy for Statistical Analysis - Using SciPy for Signal Processing and Image Processing - Using Python as a Scientific IDE/Workbench - Others... This is a preliminary list; topics will change and be extended. If you'd like to present a tutorial, or are interested in a particular topic for a tutorial, please email the SciPy users mailing list (link below). A current list will be maintained here: http://www.scipy.org/SciPy2007/Tutorials Coding Sprints -------------- We've dedicated the Saturday after the conference for a Coding Sprint. Please include any ideas for Sprint topics on the Sprints wiki page here: http://www.scipy.org/SciPy2007/Sprints We're looking forward to another great conference! Best, Travis ------------- Links to various SciPy and NumPy mailing lists may be found here: http://www.scipy.org/Mailing_Lists From daniel.wheeler at nist.gov Wed Apr 18 18:41:33 2007 From: daniel.wheeler at nist.gov (Daniel Wheeler) Date: Wed, 18 Apr 2007 22:41:33 +0000 Subject: [SciPy-dev] Compiling pysparse from the sandbox In-Reply-To: References: <46262D4C.2020005@iam.uni-stuttgart.de> Message-ID: On Apr 18, 2007, at 6:53 PM, Bill Baxter wrote: > I think the name changed from CONTIGUOUS to NPY_CONTIGUOUS (or > NPY_C_CONTIGUOUS if you want to be a little more explicit). Or try > including numpy/noprefix.h instead of whatever it is including. > That's apparently a compatibility header. > > So is this pysparse in the sandbox the same pysparse as here? Probably not. Pysparse at sourceforge is being maintained and is at release 1.0. It's been updated to use numpy with the numpy/noprefix.h method mentioned above. 
Use cvs rather than 1.0 as 1.0 doesn't see the numpy header files.

> http://sourceforge.net/project/showfiles.php?group_id=101403
>
> Has it been further developed/fixed in the sandbox?
>
> --bb
>

--
Daniel Wheeler
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From wnbell at gmail.com Wed Apr 18 22:02:46 2007
From: wnbell at gmail.com (Nathan Bell)
Date: Wed, 18 Apr 2007 20:02:46 -0600
Subject: [SciPy-dev] Compiling pysparse from the sandbox
In-Reply-To:
References: <46262D4C.2020005@iam.uni-stuttgart.de>
Message-ID:

On 4/18/07, Daniel Wheeler wrote:
> Probably not. Pysparse at sourceforge is being maintained and is at release
> 1.0. It's been updated
> to use numpy with the numpy/noprefix.h method mentioned above. Use cvs
> rather than 1.0 as 1.0
> doesn't see the numpy header files.

Just out of curiosity, is there important functionality that PySparse offers that's not currently available in SciPy? From what I can tell, PySparse has a few preconditioners and an eigensolver, in addition to what SciPy also has.

Is there an interest in including these or any other sparse features in SciPy?

I have some Algebraic Multigrid code (AMG) that I've been working on for a while. I've implemented the so-called "classical" AMG of Ruge & Stuben and also Smoothed Aggregation as described by Vanek et al.

Would others be interested in using AMG in SciPy? For those not familiar with AMG, or multigrid in general - multigrid can solve linear systems that arise in certain elliptic PDEs (e.g. Poisson equations, heat diffusion, linear elasticity, etc) in optimal time. Furthermore, the AMG methods mentioned above are "black box" in the sense that only the matrix needs to be provided to the solver - so no knowledge of the mesh geometry is necessary.

Also, are the iterative methods (pcg,gmres,etc.) reentrant? I recall having problems using cg with a preconditioner that also called cg (for a coarse level solve).

--
Nathan Bell wnbell at gmail.com

From oliphant.travis at ieee.org Thu Apr 19 00:01:21 2007
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed, 18 Apr 2007 22:01:21 -0600
Subject: [SciPy-dev] Compiling pysparse from the sandbox
In-Reply-To:
References: <46262D4C.2020005@iam.uni-stuttgart.de>
Message-ID: <4626E991.9040604@ieee.org>

Nathan Bell wrote:
> On 4/18/07, Daniel Wheeler wrote:
>
>> Probably not. Pysparse at sourceforge is being maintained and is at release
>> 1.0. It's been updated
>> to use numpy with the numpy/noprefix.h method mentioned above. Use cvs
>> rather than 1.0 as 1.0
>> doesn't see the numpy header files.
>>
>
> Just out of curiosity, is there important functionality that PySparse
> offers that's not currently available in SciPy? From what I can tell,
> PySparse has a few preconditioners and an eigensolver, in addition to
> what SciPy also has.
>
> Is there an interest in including these or any other sparse features in SciPy?
>
> I have some Algebraic Multigrid code (AMG) that I've been working on
> for a while. I've implemented the so-called "classical" AMG of Ruge &
> Stuben and also Smoothed Aggregation as described by Vanek et al.
>
> Would others be interested in using AMG in SciPy? For those not
> familiar with AMG, or multigrid in general - multigrid can solve
> linear systems that arise in certain elliptic PDEs (e.g. Poisson
> equations, heat diffusion, linear elasticity, etc) in optimal time.
> Furthermore, the AMG methods mentioned above are "black box" in the
> sense that only the matrix needs to be provided to the solver - so no
> knowledge of the mesh geometry is necessary.
>
I'm interested.

> Also, are the iterative methods (pcg,gmres,etc.) reentrant? I recall
> having problems using cg with a preconditioner that also called cg
> (for a coarse level solve).
>
Yes, because f2py should be re-entrant.

-Travis

From wbaxter at gmail.com Wed Apr 18 23:54:50 2007
From: wbaxter at gmail.com (Bill Baxter)
Date: Thu, 19 Apr 2007 12:54:50 +0900
Subject: [SciPy-dev] Compiling pysparse from the sandbox
In-Reply-To:
References: <46262D4C.2020005@iam.uni-stuttgart.de>
Message-ID:

On 4/19/07, Nathan Bell wrote:
> On 4/18/07, Daniel Wheeler wrote:
> > Probably not. Pysparse at sourceforge is being maintained and is at release
> > 1.0. It's been updated
> > to use numpy with the numpy/noprefix.h method mentioned above. Use cvs
> > rather than 1.0 as 1.0
> > doesn't see the numpy header files.
>
> Just out of curiosity, is there important functionality that PySparse
> offers that's not currently available in SciPy? From what I can tell,
> PySparse has a few preconditioners and an eigensolver, in addition to
> what SciPy also has.

SciPy's sparse matrices also lack any way to operate based on a sparse index, which is a fundamental operation in FEM codes. Basically you need to be able to do something like

idx = [1,8,13,15]
K[ix_[idx,idx]] += node_contribution

pysparse has an update_add_masked function which can do that efficiently (although it would obviously be better if regular numpy indexing just worked.)

> Is there an interest in including these or any other sparse features in SciPy?
>
> I have some Algebraic Multigrid code (AMG) that I've been working on
> for a while. I've implemented the so-called "classical" AMG of Ruge &
> Stuben and also Smoothed Aggregation as described by Vanek et al.
>
> Would others be interested in using AMG in SciPy? For those not
> familiar with AMG, or multigrid in general - multigrid can solve
> linear systems that arise in certain elliptic PDEs (e.g. Poisson
> equations, heat diffusion, linear elasticity, etc) in optimal time.
> Furthermore, the AMG methods mentioned above are "black box" in the
> sense that only the matrix needs to be provided to the solver - so no
> knowledge of the mesh geometry is necessary.

Sounds great to me. I've toyed with MG, but never got very far. Non-power-of-two grids and boundary conditions were too tricksy for me. But the O(N) solve time is great if someone else does all the hard work for me. :-)

> Also, are the iterative methods (pcg,gmres,etc.) reentrant? I recall
> having problems using cg with a preconditioner that also called cg
> (for a coarse level solve).

--bb

From nwagner at iam.uni-stuttgart.de Thu Apr 19 02:04:40 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Thu, 19 Apr 2007 08:04:40 +0200
Subject: [SciPy-dev] from scipy.sandbox import pysparse
Message-ID: <46270678.4060801@iam.uni-stuttgart.de>

Hi all,

Compiling pysparse from the sandbox works for me but I cannot import pysparse.

Traceback (most recent call last):
File "/usr/lib64/python2.4/site-packages/scipy/sandbox/pysparse/tests/test_sparray.py", line 3, in ?
from sparray import sparray
ImportError: No module named sparray

>>> from scipy.sandbox import pysparse
Traceback (most recent call last):
File "", line 1, in ?
File "/usr/lib64/python2.4/site-packages/scipy/sandbox/pysparse/__init__.py", line 4, in ?
from spmatrix import *
ImportError: No module named spmatrix

>>> import scipy
>>> scipy.__version__
'0.5.3.dev2933'

Nils

From guyer at nist.gov Thu Apr 19 08:59:37 2007
From: guyer at nist.gov (Jonathan Guyer)
Date: Thu, 19 Apr 2007 08:59:37 -0400
Subject: [SciPy-dev] Compiling pysparse from the sandbox
In-Reply-To:
References: <46262D4C.2020005@iam.uni-stuttgart.de>
Message-ID: <7B887682-E255-4B4B-B583-52C905661345@nist.gov>

On Apr 18, 2007, at 11:54 PM, Bill Baxter wrote:

> SciPy's sparse matrices also lack any way to operate based on a sparse
> index, which is a fundamental operation in FEM codes. Basically you
> need to be able to do something like
> idx = [1,8,13,15]
> K[ix_[idx,idx]] += node_contribution
>
> pysparse has an update_add_masked function which can do that
> efficiently (although it would obviously be better if regular numpy
> indexing just worked.)

This is one important advantage, although update_add_mask() was not suitable for our needs (we wanted to build sparse matrices from dense vectors, rather than building sparse matrices from small dense matrices). We added (and Roman accepted) an update_add_at() function that did what we wanted. Either is blindingly faster than what could be done in SciPy the last time I looked. SciPy's syntax is more Pythonic, but PySparse can build matrices *much* more efficiently.

More broadly, the advantages of PySparse over SciPy all boil down to speed. About a year and a half ago, I posted some benchmarking info to the SciPy wiki, but it went away with the move to the new wiki. I have no idea where it's gone, and I don't seem to have a copy of it. It used to be at .

> Sounds great to me. I've toyed with MG, but never got very far.
> Non-power-of-two grids and boundary conditions were too tricksy for
> me. But the O(N) solve time is great if someone else does all the
> hard work for me. :-)

Likewise. We're very interested in multigrid for FiPy, and even more interested if it doesn't require us to think very hard.

From edschofield at gmail.com Thu Apr 19 11:46:37 2007
From: edschofield at gmail.com (Ed Schofield)
Date: Thu, 19 Apr 2007 16:46:37 +0100
Subject: [SciPy-dev] Compiling pysparse from the sandbox
In-Reply-To:
References: <46262D4C.2020005@iam.uni-stuttgart.de>
Message-ID: <1b5a37350704190846o2c5bf6acj5f177779613712c0@mail.gmail.com>

> On 4/18/07, Daniel Wheeler wrote:
>
> On Apr 18, 2007, at 6:53 PM, Bill Baxter wrote:
>
>> I think the name changed from CONTIGUOUS to NPY_CONTIGUOUS (or
>> NPY_C_CONTIGUOUS if you want to be a little more explicit). Or try
>> including numpy/noprefix.h instead of whatever it is including.
>> That's apparently a compatibility header.
>
>> So is this pysparse in the sandbox the same pysparse as here?
>
> Probably not. Pysparse at sourceforge is being maintained and is at release
> 1.0. It's been updated
> to use numpy with the numpy/noprefix.h method mentioned above. Use cvs
> rather than 1.0 as 1.0
> doesn't see the numpy header files.

That's interesting to hear; I haven't been following PySparse development for a while. Around the time of the Numeric -> SciPy core transition, I took a snapshot of PySparse, patched it to compile with NumPy and added it to the sandbox, with the idea of using it as a basis for improvements to (or even a re-write of) the scipy.sparse module. But we decided to keep the existing scipy.sparse matrix types, and when I came to add the new list-based lil_matrix type, it turned out to be easier to design it from scratch rather than port PySparse's ll_mat.c code to Python.
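(For a concrete picture of the indexed-accumulation pattern Bill and Jonathan describe above, here is a minimal sketch against scipy.sparse; the shapes, indices and values are made up, and the explicit Python loop stands in for what pysparse's update_add_mask/update_add_at do in C:)

import numpy as np
from scipy.sparse import lil_matrix

# Hypothetical element data: one dense 4x4 contribution scattered into a
# 20x20 global stiffness matrix at the listed degrees of freedom.
idx = [1, 8, 13, 15]
node_contribution = np.ones((4, 4))

K = lil_matrix((20, 20))   # lil_matrix is the type meant for incremental building
for a, i in enumerate(idx):
    for b, j in enumerate(idx):
        # Entry-by-entry accumulation into the sparse global matrix.
        K[i, j] += node_contribution[a, b]

print(K.todense())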
So I see no reason to keep the pysparse snapshot in the sandbox any longer. I've actually been meaning for a while to propose that we delete it from the tree...

-- Ed

From nwagner at iam.uni-stuttgart.de Thu Apr 19 13:24:13 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Thu, 19 Apr 2007 19:24:13 +0200
Subject: [SciPy-dev] Compiling pysparse from the sandbox
In-Reply-To: <1b5a37350704190846o2c5bf6acj5f177779613712c0@mail.gmail.com>
References: <46262D4C.2020005@iam.uni-stuttgart.de> <1b5a37350704190846o2c5bf6acj5f177779613712c0@mail.gmail.com>
Message-ID:

On Thu, 19 Apr 2007 16:46:37 +0100 "Ed Schofield" wrote:

>> On 4/18/07, Daniel Wheeler wrote:
>>
>> On Apr 18, 2007, at 6:53 PM, Bill Baxter wrote:
>>
>>> I think the name changed from CONTIGUOUS to NPY_CONTIGUOUS (or
>>> NPY_C_CONTIGUOUS if you want to be a little more explicit). Or try
>>> including numpy/noprefix.h instead of whatever it is including.
>>> That's apparently a compatibility header.
>>
>>> So is this pysparse in the sandbox the same pysparse as here?
>>
>> Probably not. Pysparse at sourceforge is being maintained and is at release
>> 1.0. It's been updated to use numpy with the numpy/noprefix.h method mentioned
>> above. Use cvs rather than 1.0 as 1.0 doesn't see the numpy header files.
>
> That's interesting to hear; I haven't been following PySparse
> development for a while. Around the time of the Numeric -> SciPy core
> transition, I took a snapshot of PySparse, patched it to compile with
> NumPy and added it to the sandbox, with the idea of using it as a
> basis for improvements to (or even a re-write of) the scipy.sparse
> module. But we decided to keep the existing scipy.sparse matrix types,
> and when I came to add the new list-based lil_matrix type, it turned
> out to be easier to design it from scratch rather than port PySparse's
> ll_mat.c code to Python.
>
> So I see no reason to keep the pysparse snapshot in the sandbox any
> longer. I've actually been meaning for a while to propose that we
> delete it from the tree...
>
> -- Ed
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev

Hi Ed,

Just now I have installed the latest pysparse via cvs, but I need some advice to improve my local setup.py. I have already installed BLAS/LAPACK, ATLAS and UMFPACK from source. So, who can send me a customized setup.py for pysparse ?

Nils

From spacey-scipy-dev at lenin.net Thu Apr 19 15:51:18 2007
From: spacey-scipy-dev at lenin.net (Peter C. Norton)
Date: Thu, 19 Apr 2007 12:51:18 -0700
Subject: [SciPy-dev] Specifying library names when configuring numpy or scipy?
Message-ID: <20070419195118.GL5812@lenin.net>

Summary: With python-2.5.1c1, numpy-1.0.2, and scipy-0.5.2, the distutils included in numpy doesn't allow me to specify the name of my platform-specific optimized blas+lapack library in the blas_opt or lapack_opt sections.

I've tried using the following names in the blas_opt and lapack_opt "libraries" sections: libsunperf, sunperf. To build and install scipy, I can create symlinks from the real location of libsunperf to a temporary location and then perform the build, as long as the names being symlinked to are "libblas.so" and "liblapack.so". This is a problem, since it means that to use scipy, I will need to have these symlinks in place somewhere in the loader's path, forever.
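(For diagnosing this kind of thing, numpy.distutils can be asked directly what it resolves -- a sketch; get_info and the 'blas_opt'/'lapack_opt' section names are the real numpy.distutils entry points, and the script should be run from the directory holding your site.cfg, if any:)

# Ask numpy's build machinery which optimized BLAS/LAPACK it can find
# on this machine, so you can see whether a custom "libraries" setting
# is being picked up at all.
from numpy.distutils.system_info import get_info

print(get_info('blas_opt'))    # e.g. {'libraries': [...], 'library_dirs': [...]}
print(get_info('lapack_opt'))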
I'm guessing this relates to my having the same problem with the numpy distutils package, but with numpy I could use a static library, and that keeps me from having to jerry-rig symlinks around my filesystem. scipy doesn't seem to work with static libraries.

Is this inability to use alternative optimized library names on anyone's radar? I'm not familiar with distutils in general, or how it should work for numpy/scipy here, but I'd be happy to look at this and give feedback on it. I'm guessing that the problem is that the classes blas_opt and lapack_opt need some fleshing out. Does this just need to be done still?

Thanks,
-Peter

--
The 5 year plan: In five years we'll make up another plan. Or just re-use this one.

From nmarais at sun.ac.za Thu Apr 19 19:06:15 2007
From: nmarais at sun.ac.za (Neilen Marais)
Date: Fri, 20 Apr 2007 01:06:15 +0200
Subject: [SciPy-dev] Solution for problem with f90 allocatable arrays on 64-bit systems
Message-ID: <1177023975.13148.47.camel@localhost.localdomain>

Hi,

I've found the problem in the generated code that results in the bug described here: http://projects.scipy.org/scipy/numpy/ticket/147

The problem is an index which should be specified as pointer size but is specified as integer in the fortran part of the module code. When wrapping the following code: (geomwrap.f90)

MODULE mesh
! The 4 node indices per element that define all the mesh elements
INTEGER, DIMENSION(:,:), ALLOCATABLE :: element_nodes
CONTAINS
SUBROUTINE init_mesh()
  ALLOCATE(element_nodes(4,2))
  element_nodes(:,1) = (/1, 2, 3, 4/)
  element_nodes(:,2) = (/1, 3, 4, 5/)
END SUBROUTINE init_mesh
END MODULE mesh

and generating wrappers:

f2py --fcompiler=gnu95 -m geo --build-dir test -c geomwrap.f90

in the file geo-f2pywrappers2.f90

! -*- f90 -*-
! This file is autogenerated with f2py (version:2_3718)
! It contains Fortran 90 wrappers to fortran functions.
subroutine f2py_mesh_getdims_element_nodes(r,s,f2pysetdata,flag)
  use mesh, only: d => element_nodes
  integer flag
  external f2pysetdata
  logical ns
  integer s(*),r,i,j

the definition for s(*) should be integer(8) (at least on 64-bit machines), i.e.

! -*- f90 -*-
! This file is autogenerated with f2py (version:2_3718)
! It contains Fortran 90 wrappers to fortran functions.
subroutine f2py_mesh_getdims_element_nodes(r,s,f2pysetdata,flag)
  use mesh, only: d => element_nodes
  integer flag
  external f2pysetdata
  logical ns
  integer(8) s(*)
  integer r,i,j

is correct. I'm going to patch f2py on my machine to generate integer(8) sized s(*), but this will of course be wrong on 32-bit machines. What is the recommended way of getting a pointer-sized int in fortran?

Thanks
Neilen

From cimrman3 at ntc.zcu.cz Fri Apr 20 04:50:23 2007
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Fri, 20 Apr 2007 10:50:23 +0200
Subject: [SciPy-dev] Compiling pysparse from the sandbox
In-Reply-To:
References: <46262D4C.2020005@iam.uni-stuttgart.de>
Message-ID: <46287ECF.8090806@ntc.zcu.cz>

Nathan Bell wrote:
> Just out of curiosity, is there important functionality that PySparse
> offers that's not currently available in SciPy? From what I can tell,
> PySparse has a few preconditioners and an eigensolver, in addition to
> what SciPy also has.
>
> Is there an interest in including these or any other sparse features in SciPy?
>
> I have some Algebraic Multigrid code (AMG) that I've been working on
> for a while. I've implemented the so-called "classical" AMG of Ruge &
> Stuben and also Smoothed Aggregation as described by Vanek et al.
>
> Would others be interested in using AMG in SciPy? For those not
> familiar with AMG, or multigrid in general - multigrid can solve
> linear systems that arise in certain elliptic PDEs (e.g. Poisson
> equations, heat diffusion, linear elasticity, etc) in optimal time.
> Furthermore, the AMG methods mentioned above are "black box" in the
> sense that only the matrix needs to be provided to the solver - so no
> knowledge of the mesh geometry is necessary.

I'd love to see an AMG implementation in SciPy!

r.

From nwagner at iam.uni-stuttgart.de Fri Apr 20 04:56:35 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Fri, 20 Apr 2007 10:56:35 +0200
Subject: [SciPy-dev] Compiling pysparse from the sandbox
In-Reply-To: <46287ECF.8090806@ntc.zcu.cz>
References: <46262D4C.2020005@iam.uni-stuttgart.de> <46287ECF.8090806@ntc.zcu.cz>
Message-ID: <46288043.8010505@iam.uni-stuttgart.de>

Robert Cimrman wrote:
> Nathan Bell wrote:
>
>> Just out of curiosity, is there important functionality that PySparse
>> offers that's not currently available in SciPy? From what I can tell,
>> PySparse has a few preconditioners and an eigensolver, in addition to
>> what SciPy also has.
>>
>> Is there an interest in including these or any other sparse features in SciPy?
>>
>> I have some Algebraic Multigrid code (AMG) that I've been working on
>> for a while. I've implemented the so-called "classical" AMG of Ruge &
>> Stuben and also Smoothed Aggregation as described by Vanek et al.
>>
>> Would others be interested in using AMG in SciPy? For those not
>> familiar with AMG, or multigrid in general - multigrid can solve
>> linear systems that arise in certain elliptic PDEs (e.g. Poisson
>> equations, heat diffusion, linear elasticity, etc) in optimal time.
>> Furthermore, the AMG methods mentioned above are "black box" in the
>> sense that only the matrix needs to be provided to the solver - so no
>> knowledge of the mesh geometry is necessary.
>>
>
> I'd love to see an AMG implementation in SciPy!
>
> r.
>
+1

Nils

> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>

From jtravs at gmail.com Fri Apr 20 14:03:42 2007
From: jtravs at gmail.com (John Travers)
Date: Fri, 20 Apr 2007 19:03:42 +0100
Subject: [SciPy-dev] Genetic algorithm optimization
Message-ID: <3a1077e70704201103v6bcf2ad5jab711584888383ed@mail.gmail.com>

Hi all,

I have recently needed a genetic algorithm code for some work. There appear to be various python implementations, which I didn't try, and the package in the sandbox looked promising initially, but I think a large amount of work needs to be done to get it going again. As I was pushed for time I took an easy option and wrapped what appears to be a fairly robust implementation in Fortran 77 called pikaia:

http://www.hao.ucar.edu/Public/models/pikaia/pikaia.html

It has apparently been used in a number of scientific research projects and appears to be quite mature. I have a very simple f2py interface (attached along with a patch to the pikaia source which I slightly modified) and python calling function (also attached). While it doesn't provide a fully customizable, all bells and whistles GA library which some people may want (see attached ga.py for docs on what features it does have), it does work very well for standard global optimization problems (see example, also attached).
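(To make the idea concrete, here is a toy GA in the same spirit -- a from-scratch sketch, not the attached ga.py interface; following pikaia's conventions, the parameters live in the unit hypercube and a fitness function is maximized. The objective and all settings are made up for illustration:)

import numpy as np

def fitness(x):
    # Toy multimodal objective on [0,1]^2; the global maximum is at (0.5, 0.5).
    r2 = (x[0] - 0.5)**2 + (x[1] - 0.5)**2
    return np.cos(9.0 * np.pi * (x[0] - 0.5)) * np.exp(-8.0 * r2)

def toy_ga(f, ndim, popsize=50, ngen=200, pmut=0.05, seed=0):
    rng = np.random.RandomState(seed)
    pop = rng.rand(popsize, ndim)                 # random initial population
    for gen in range(ngen):
        scores = np.array([f(ind) for ind in pop])
        order = np.argsort(scores)[::-1]
        parents = pop[order[:popsize // 2]]       # truncation selection
        # Uniform crossover between randomly paired parents.
        a = parents[rng.randint(len(parents), size=popsize)]
        b = parents[rng.randint(len(parents), size=popsize)]
        pop = np.where(rng.rand(popsize, ndim) < 0.5, a, b)
        # Creep mutation, clipped back into the unit hypercube.
        mut = rng.rand(popsize, ndim) < pmut
        pop = np.clip(pop + mut * rng.normal(0.0, 0.1, pop.shape), 0.0, 1.0)
    scores = np.array([f(ind) for ind in pop])
    return pop[scores.argmax()], scores.max()

best, fbest = toy_ga(fitness, 2)
print(best, fbest)   # should land near (0.5, 0.5) with fitness near 1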
It would fit very neatly in the optimization package of scipy, as it is a small, one-function routine; see the attached test script for an example. The GA code itself is explicitly placed in the public domain. The file you can download from the above website contains a quicksort and uniform random number routine which are under the ACM license, which is incompatible with scipy. But these parts are easily replaced with compatibly licensed versions (suggestions for such routines would be welcome).

What do people think about including it in the optimization module?

Cheers,
John

-------------- next part --------------
A non-text attachment was scrubbed...
Name: pikaia.diff
Type: text/x-patch
Size: 2966 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pikaia.pyf
Type: application/octet-stream
Size: 833 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ga.py
Type: text/x-python
Size: 4511 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_ga.py
Type: text/x-python
Size: 340 bytes
Desc: not available
URL:

From nmarais at sun.ac.za Sat Apr 21 16:15:13 2007
From: nmarais at sun.ac.za (Neilen Marais)
Date: Sat, 21 Apr 2007 22:15:13 +0200
Subject: [SciPy-dev] Solution for problem with f90 allocatable arrays on 64-bit systems
References: <1177023975.13148.47.camel@localhost.localdomain>
Message-ID:

On Fri, 20 Apr 2007 01:06:15 +0200, Neilen Marais wrote:

Following up on myself, the problem is now fixed in SVN.

> Hi,
>
> I've found the problem in the generated code that results in the bug
> described here: http://projects.scipy.org/scipy/numpy/ticket/147
>

--
you know its kind of tragic
we live in the new world
but we've lost the magic
-- Battery 9 (www.battery9.co.za)

From jtravs at gmail.com Sun Apr 22 09:09:02 2007
From: jtravs at gmail.com (John Travers)
Date: Sun, 22 Apr 2007 14:09:02 +0100
Subject: [SciPy-dev] BVP solver and dependence on Fortran 90/95
Message-ID: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com>

Hi all,

I have a number of modules I'm working on for my own use which depend on Fortran 90/95 language features. One is a python wrapper to a boundary value problem solver (http://cs.smu.ca/~muir/BVP_SOLVER_Webpage.shtml) which would be a very good addition to scipy I think (the authors have explicitly agreed that it can be used in scipy).

My question is: what is the policy on scipy modules depending on Fortran 90/95 code. There currently doesn't appear to be any in the repository. However with the availability of gfortran now broadening and the fact that most (all?) commercial compilers now have full support, is it any longer a problem (putting aside f2py support)?

Cheers,
John

From robert.kern at gmail.com Sun Apr 22 14:31:16 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Sun, 22 Apr 2007 13:31:16 -0500
Subject: [SciPy-dev] BVP solver and dependence on Fortran 90/95
In-Reply-To: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com>
References: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com>
Message-ID: <462BA9F4.3070708@gmail.com>

John Travers wrote:
> Hi all,
>
> I have a number of modules I'm working on for my own use which depend
> on Fortran 90/95 language features.
> One is a python wrapper to a
> boundary value problem solver
> (http://cs.smu.ca/~muir/BVP_SOLVER_Webpage.shtml) which would be a
> very good addition to scipy I think (the authors have explicitly
> agreed that it can be used in scipy).
>
> My question is: what is the policy on scipy modules depending on
> Fortran 90/95 code. There currently doesn't appear to be any in the
> repository. However with the availability of gfortran now broadening
> and the fact that most (all?) commercial compilers now have full
> support, is it any longer a problem (putting aside f2py support)?

Personally, I still don't trust gfortran to be non-buggy.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco

From stefan at sun.ac.za Mon Apr 23 03:40:23 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Mon, 23 Apr 2007 09:40:23 +0200
Subject: [SciPy-dev] BVP solver and dependence on Fortran 90/95
In-Reply-To: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com>
References: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com>
Message-ID: <20070423074023.GJ6933@mentat.za.net>

Hi John

On Sun, Apr 22, 2007 at 02:09:02PM +0100, John Travers wrote:
> My question is: what is the policy on scipy modules depending on
> Fortran 90/95 code. There currently doesn't appear to be any in the
> repository. However with the availability of gfortran now broadening
> and the fact that most (all?) commercial compilers now have full
> support, is it any longer a problem (putting aside f2py support)?

I'd suggest that you start by putting the code in a sandbox directory. There you can experiment setting up the building process, and once we're satisfied we can move it over to the rest of scipy. Even if we don't manage right away, having the code there for the future is a good idea, otherwise it may be forgotten.

Maybe gfortran does manage to build your code, although I suspect Robert's concern is that it may not consistently (over different versions) deliver correctly functioning binaries?

Cheers
Stéfan

From david at ar.media.kyoto-u.ac.jp Mon Apr 23 03:58:19 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 23 Apr 2007 16:58:19 +0900
Subject: [SciPy-dev] BVP solver and dependence on Fortran 90/95
In-Reply-To: <20070423074023.GJ6933@mentat.za.net>
References: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com> <20070423074023.GJ6933@mentat.za.net>
Message-ID: <462C671B.2050701@ar.media.kyoto-u.ac.jp>

Stefan van der Walt wrote:
> Hi John
>
> On Sun, Apr 22, 2007 at 02:09:02PM +0100, John Travers wrote:
>> My question is: what is the policy on scipy modules depending on
>> Fortran 90/95 code. There currently doesn't appear to be any in the
>> repository. However with the availability of gfortran now broadening
>> and the fact that most (all?) commercial compilers now have full
>> support, is it any longer a problem (putting aside f2py support)?
>
> I'd suggest that you start by putting the code in a sandbox directory.
> There you can experiment setting up the building process, and once
> we're satisfied we can move it over to the rest of scipy. Even if we
> don't manage right away, having the code there for the future is a
> good idea, otherwise it may be forgotten.
>
> Maybe gfortran does manage to build your code, although I suspect
> Robert's concern is that it may not consistently (over different
> versions) deliver correctly functioning binaries?

Debian and Ubuntu still use g77 as the main fortran compiler, too, which is binary incompatible with gfortran.

David

From david at ar.media.kyoto-u.ac.jp Mon Apr 23 04:10:13 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 23 Apr 2007 17:10:13 +0900
Subject: [SciPy-dev] Status of the doc format for scipy code ?
Message-ID: <462C69E5.9070905@ar.media.kyoto-u.ac.jp>

Hi,

I wanted to know what is the status of the doc format for scipy (http://projects.scipy.org/scipy/numpy/wiki/DocstringStandards) ? Is anything settled down ?

cheers,

David

From matthew.brett at gmail.com Mon Apr 23 04:29:01 2007
From: matthew.brett at gmail.com (Matthew Brett)
Date: Mon, 23 Apr 2007 09:29:01 +0100
Subject: [SciPy-dev] BVP solver and dependence on Fortran 90/95
In-Reply-To: <20070423074023.GJ6933@mentat.za.net>
References: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com> <20070423074023.GJ6933@mentat.za.net>
Message-ID: <1e2af89e0704230129w6664c58i5f716b129035c865@mail.gmail.com>

Hi,

> I'd suggest that you start by putting the code in a sandbox directory.
> There you can experiment setting up the building process, and once
> we're satisfied we can move it over to the rest of scipy.

That seems like a good idea. Why not make your build conditional on the use of an f95 compiler? At least that will make your code available to those of us who do use gfortran.

Best,

Matthew

From jtravs at gmail.com Mon Apr 23 04:47:29 2007
From: jtravs at gmail.com (John Travers)
Date: Mon, 23 Apr 2007 09:47:29 +0100
Subject: [SciPy-dev] BVP solver and dependence on Fortran 90/95
In-Reply-To: <20070423074023.GJ6933@mentat.za.net>
References: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com> <20070423074023.GJ6933@mentat.za.net>
Message-ID: <3a1077e70704230147i4290be39m872d83e527dc59cb@mail.gmail.com>

On 23/04/07, Stefan van der Walt wrote:
> Maybe gfortran does manage to build your code, although I suspect
> Robert's concern is that it may not consistently (over different
> versions) deliver correctly functioning binaries?

Well, I've used gfortran 4.1 - 4.3 (current svn) to build and use the BVP code and pass all tests. So I think they are getting there... There is currently a very large amount of work going on with gfortran. Version 4.2, which should hit linux distributions this year, is already a pretty competitive f95 compiler. But I understand the concerns.

So I'll get it into svn at some point then look at selective compilation with a suitable compiler.

Cheers,
John

From stefan at sun.ac.za Mon Apr 23 05:09:15 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Mon, 23 Apr 2007 11:09:15 +0200
Subject: [SciPy-dev] Status of the doc format for scipy code ?
In-Reply-To: <462C69E5.9070905@ar.media.kyoto-u.ac.jp>
References: <462C69E5.9070905@ar.media.kyoto-u.ac.jp>
Message-ID: <20070423090915.GM6933@mentat.za.net>

On Mon, Apr 23, 2007 at 05:10:13PM +0900, David Cournapeau wrote:
> Hi,
>
> I wanted to know what is the status of the doc format for scipy
> (http://projects.scipy.org/scipy/numpy/wiki/DocstringStandards) ? Is
> anything settled down ?
The latest document I know of is at

http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/HOWTO_DOCUMENT.txt

Cheers
Stéfan

From david at ar.media.kyoto-u.ac.jp Mon Apr 23 06:10:14 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 23 Apr 2007 19:10:14 +0900
Subject: [SciPy-dev] Status of the doc format for scipy code ?
In-Reply-To: <20070423090915.GM6933@mentat.za.net>
References: <462C69E5.9070905@ar.media.kyoto-u.ac.jp> <20070423090915.GM6933@mentat.za.net>
Message-ID: <462C8606.9090004@ar.media.kyoto-u.ac.jp>

Stefan van der Walt wrote:
> On Mon, Apr 23, 2007 at 05:10:13PM +0900, David Cournapeau wrote:
>> Hi,
>>
>> I wanted to know what is the status of the doc format for scipy
>> (http://projects.scipy.org/scipy/numpy/wiki/DocstringStandards) ? Is
>> anything settled down ?
>
> The latest document I know of is at
>
> http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/HOWTO_DOCUMENT.txt

Do you know how to process it ? I tried epydoc beta, and it does not work on a simple example, and dies on numpy sources...

David

From nmarais at sun.ac.za Mon Apr 23 12:52:08 2007
From: nmarais at sun.ac.za (Neilen Marais)
Date: Mon, 23 Apr 2007 18:52:08 +0200
Subject: [SciPy-dev] BVP solver and dependence on Fortran 90/95
References: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com> <462BA9F4.3070708@gmail.com>
Message-ID:

On Sun, 22 Apr 2007 13:31:16 -0500, Robert Kern wrote:

> John Travers wrote:
>> My question is: what is the policy on scipy modules depending on
>> Fortran 90/95 code. There currently doesn't appear to be any in the
>> repository. However with the availability of gfortran now broadening
>> and the fact that most (all?) commercial compilers now have full
>> support, is it any longer a problem (putting aside f2py support)?
>
> Personally, I still don't trust gfortran to be non-buggy.

But presumably those willing to take the risk should be able to use f90/95 modules? I think the sooner we start at least giving people the option to use gfortran, the sooner gfortran will gain our trust. It will at least allow gfortran/scipy incompatibilities to surface and allow us to report problems to the gfortran developers.

I'm a little bit in the dark about the g77/gfortran binary incompatibility. Does this mean that f90 code compiled with gfortran can't call f77 code compiled with g77? That would be a pretty sticky situation for users of distribution packaged libs if the distro is using g77...

Regards
Neilen

--
you know its kind of tragic
we live in the new world
but we've lost the magic
-- Battery 9 (www.battery9.co.za)

From robert.kern at gmail.com Mon Apr 23 12:57:13 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 23 Apr 2007 11:57:13 -0500
Subject: [SciPy-dev] BVP solver and dependence on Fortran 90/95
In-Reply-To:
References: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com> <462BA9F4.3070708@gmail.com>
Message-ID: <462CE569.4060303@gmail.com>

Neilen Marais wrote:
> On Sun, 22 Apr 2007 13:31:16 -0500, Robert Kern wrote:
>
>> John Travers wrote:
>
>>> My question is: what is the policy on scipy modules depending on
>>> Fortran 90/95 code. There currently doesn't appear to be any in the
>>> repository. However with the availability of gfortran now broadening
>>> and the fact that most (all?) commercial compilers now have full
>>> support, is it any longer a problem (putting aside f2py support)?
>> Personally, I still don't trust gfortran to be non-buggy.
>
> But presumably those willing to take the risk should be able to use f90/95
> modules?

Well, when you can make installing a subpackage of scipy optional, I'm more than happy to do that. Until then, keep it in the sandbox or in scikits.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco

From jtravs at gmail.com Mon Apr 23 13:09:51 2007
From: jtravs at gmail.com (John Travers)
Date: Mon, 23 Apr 2007 18:09:51 +0100
Subject: [SciPy-dev] BVP solver and dependence on Fortran 90/95
In-Reply-To: <462CE569.4060303@gmail.com>
References: <3a1077e70704220609s11da1a07pea7a898a1354c29f@mail.gmail.com> <462BA9F4.3070708@gmail.com> <462CE569.4060303@gmail.com>
Message-ID: <3a1077e70704231009k4a089ab5g62b21544b3148d8b@mail.gmail.com>

On 23/04/07, Robert Kern wrote:
> Neilen Marais wrote:
> > But presumably those willing to take the risk should be able to use f90/95
> > modules?
>
> Well, when you can make installing a subpackage of scipy optional, I'm more than
> happy to do that. Until then, keep it in the sandbox or in scikits.
>

I think I'll put it in a scikit for now, so that it gains more visibility without being a problem for scipy. However, we will need to seriously address the f95 issue in the future as more and more high quality numeric code is produced in this language (for very good reason: allocatable arrays make for much neater and more robust codes when one needs to dynamically adjust meshes etc.). This BVP solver is just one of many examples I use.

Cheers,
John

From stefan at sun.ac.za Mon Apr 23 16:24:09 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Mon, 23 Apr 2007 22:24:09 +0200
Subject: [SciPy-dev] Status of the doc format for scipy code ?
In-Reply-To: <462C8606.9090004@ar.media.kyoto-u.ac.jp>
References: <462C69E5.9070905@ar.media.kyoto-u.ac.jp> <20070423090915.GM6933@mentat.za.net> <462C8606.9090004@ar.media.kyoto-u.ac.jp>
Message-ID: <20070423202409.GQ6933@mentat.za.net>

On Mon, Apr 23, 2007 at 07:10:14PM +0900, David Cournapeau wrote:
> Stefan van der Walt wrote:
> > On Mon, Apr 23, 2007 at 05:10:13PM +0900, David Cournapeau wrote:
> >> I wanted to know what is the status of the doc format for scipy
> >> (http://projects.scipy.org/scipy/numpy/wiki/DocstringStandards) ? Is
> >> anything settled down ?
> >
> > The latest document I know of is at
> >
> > http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/HOWTO_DOCUMENT.txt
> Do you know how to process it ? I tried epydoc beta, and it does not
> work on a simple example, and dies on numpy sources...

AFAIK, no parser has been written for that format yet.

Cheers
Stéfan

From jmt at twilley.org Mon Apr 23 18:19:23 2007
From: jmt at twilley.org (Jack Twilley)
Date: Mon, 23 Apr 2007 15:19:23 -0700
Subject: [SciPy-dev] Disappearing websites and unlicensed code
Message-ID: <462D30EB.1090606@twilley.org>

I was looking for some examples on using Python's wave module to read and write WAV files and came across the tutorial formerly found at http://scipy.mit.edu/tutorials/wave.pdf -- I say formerly found because I haven't been able to get to that website since I first found the link via Google. I was able to access the file via Google's cache and I have been using the code for private work, but the code in the tutorial is unlicensed as it sits.
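(For what it's worth, the core of that tutorial's task fits in a few lines -- a sketch using only the stdlib wave module plus numpy, assuming a 16-bit mono file; 'test.wav' is a placeholder filename:)

import wave
import numpy as np

# Read a 16-bit mono WAV file into a numpy array of samples.
w = wave.open('test.wav', 'rb')
frames = w.readframes(w.getnframes())
data = np.frombuffer(frames, dtype=np.int16)
rate = w.getframerate()
w.close()
print(rate, len(data))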
I checked the web for scipy and found the website which points to this mailing list and thought I'd ask here if anyone knows how the aforementioned code is licensed and what happened to the MIT site.

Jack.

From david at ar.media.kyoto-u.ac.jp Tue Apr 24 00:55:05 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 24 Apr 2007 13:55:05 +0900
Subject: [SciPy-dev] Dataset for examples and license
Message-ID: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp>

Hi,

I would like to know what should be done when including some dataset in scipy ? For example, during the development of my project pymachine, I would like to include some famous data like iris/old faithful data, etc... for demo of classic machine learning algorithms. R has some interesting data, but is licensed under the GPL, and I am not quite sure what the status of the data is wrt the license ? Does GPL also cover raw data ?

cheers,

David

From peridot.faceted at gmail.com Tue Apr 24 01:53:45 2007
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Tue, 24 Apr 2007 01:53:45 -0400
Subject: [SciPy-dev] Dataset for examples and license
In-Reply-To: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp>
References: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp>
Message-ID:

On 24/04/07, David Cournapeau wrote:
> Hi,
>
> I would like to know what should be done when including some dataset
> in scipy ? For example, during the development of my project pymachine,
> I would like to include some famous data like iris/old faithful data,
> etc... for demo of classic machine learning algorithms. R has some
> interesting data, but is licensed under the GPL, and I am not quite
> sure what the status of the data is wrt the license ? Does GPL also
> cover raw data ?

Not necessarily appropriate for machine learning, and this doesn't answer your question, but there's lots of astronomy data which is public (and in fact I think in the public domain as it's a NASA product).

For inclusion in scipy, supposing the license is fine, if the data is small (a few kilobytes?) it can go in a test case. (Does scipy *have* a collection of example code in the distribution? It would be nice...) If it's bigger (a few megabytes?) it could go on the Wiki; if it's really big it could probably go on the Wikimedia Commons (though do they support arbitrary file types?).

Uh, I should say, I'm not a scipy developer, so this is rather my best guess at what they would permit.

Anne

From stefan at sun.ac.za Tue Apr 24 03:07:43 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Tue, 24 Apr 2007 09:07:43 +0200
Subject: [SciPy-dev] Disappearing websites and unlicensed code
In-Reply-To: <462D30EB.1090606@twilley.org>
References: <462D30EB.1090606@twilley.org>
Message-ID: <20070424070742.GR6933@mentat.za.net>

Hi Jack

On Mon, Apr 23, 2007 at 03:19:23PM -0700, Jack Twilley wrote:
> I was looking for some examples on using Python's wave module to read
> and write WAV files and came across the tutorial formerly found at
> http://scipy.mit.edu/tutorials/wave.pdf -- I say formerly found because
> I haven't been able to get to that website since I first found the link
> via Google.

Unfortunately, I don't know the answer to your question.
But I can point you in the direction of David's pyaudiolab, which you may find useful nonetheless:

http://www.ar.media.kyoto-u.ac.jp/members/david/softwares/pyaudiolab/

Cheers
Stéfan

From rudolph at ska.ac.za Tue Apr 24 04:49:21 2007
From: rudolph at ska.ac.za (Rudolph van der Merwe)
Date: Tue, 24 Apr 2007 10:49:21 +0200
Subject: [SciPy-dev] Possible bug in scipy.stats.kurtosistest()
Message-ID: <97670e910704240149u60078b8bmeafb53c06fc9f844@mail.gmail.com>

I think I found a bug in the kurtosistest function in the scipy.stats module. The relevant file is scipy/stats/stats.py

The kurtosistest function seems to be a direct implementation of the algorithm described by D'Agostino et al. in the following paper:

R. B. D'Agostino, A. Belanger, and R. B. D'Agostino, Jr. "A Suggestion for Using Powerful and Informative Tests of Normality", The American Statistician, Vol. 44, No. 4. (Nov., 1990), pp. 316-321

One of the first steps in the algorithm is calculating the kurtosis of the distribution. This is done using the stats.kurtosis function. This function can either calculate the Fisher or the Pearson form of the kurtosis. The scipy code in kurtosistest makes use of the Fisher form, which is the default for the kurtosis function. The algorithm in D'Agostino's paper, however, makes use of the Pearson form of the kurtosis. I verified this issue on the numerical example given in the paper. The results calculated by scipy.stats.kurtosistest only agree with the paper if the call to kurtosis in kurtosistest is changed from

b2 = kurtosis(a, axis)

to

b2 = kurtosis(a, axis, fisher=False)

Since kurtosistest is used by the scipy.stats.normaltest function (which does a D'Agostino-Pearson normality test), this bug affects that function's correctness as well.

Is there a more formal way of filing this bug than posting a message on this forum?

Rudolph

P.S. I'm using Scipy version 0.5.2

--
Rudolph van der Merwe
KAT (Karoo Array Telescope) / www.kat.ac.za

From david at ar.media.kyoto-u.ac.jp Tue Apr 24 09:01:20 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 24 Apr 2007 22:01:20 +0900
Subject: [SciPy-dev] Dataset for examples and license
In-Reply-To:
References: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp>
Message-ID: <462DFFA0.6060806@ar.media.kyoto-u.ac.jp>

Anne Archibald wrote:
> On 24/04/07, David Cournapeau wrote:
>> Hi,
>>
>> I would like to know what should be done when including some dataset
>> in scipy ? For example, during the development of my project pymachine,
>> I would like to include some famous data like iris/old faithful data,
>> etc... for demo of classic machine learning algorithms. R has some
>> interesting data, but is licensed under the GPL, and I am not quite
>> sure what the status of the data is wrt the license ? Does GPL also
>> cover raw data ?
>
> Not necessarily appropriate for machine learning, and this doesn't
> answer your question, but there's lots of astronomy data which is
> public (and in fact I think in the public domain as it's a NASA
> product).
>
> For inclusion in scipy, supposing the license is fine, if the data is
> small (a few kilobytes?) it can go in a test case. (Does scipy *have*
> a collection of example code in the distribution? It would be nice...)
> If it's bigger (a few megabytes?) it could go on the Wiki; if it's
> really big it could probably go on the Wikimedia Commons (though do
> they support arbitrary file types?).
Well, I guess once scipy is modularized and can be installed package by package, having a package dataset ala R would be nice. For now, I have a small python script which converts those datasets to hdf5, so they can be read easily from python, and if including them in scipy is OK license-wise, I can easily add the data as a package for distribution (the compressed, pickled, related data takes ~ 100 kb).

David

From david.huard at gmail.com Tue Apr 24 09:08:17 2007
From: david.huard at gmail.com (David Huard)
Date: Tue, 24 Apr 2007 09:08:17 -0400
Subject: [SciPy-dev] Status of the doc format for scipy code ?
In-Reply-To: <20070423202409.GQ6933@mentat.za.net>
References: <462C69E5.9070905@ar.media.kyoto-u.ac.jp> <20070423090915.GM6933@mentat.za.net> <462C8606.9090004@ar.media.kyoto-u.ac.jp> <20070423202409.GQ6933@mentat.za.net>
Message-ID: <91cf711d0704240608l38592dcbg3fd00cb3021d3b83@mail.gmail.com>

> > http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/HOWTO_DOCUMENT.txt
> > Do you know how to process it ? I tried epydoc beta, and it does not
> > work on a simple example, and dies on numpy sources...

I updated epydoc and the patch I wrote a while ago to parse something very near the scipy template is now outdated.

On the other hand, simple examples should work... Do you get a latex error or a python error ?

If you get a latex inputenc error, it may be due to using latin1 encoding instead of utf8 or whatever your encoding is (look in the header of api.tex).

HTH,

David
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From david at ar.media.kyoto-u.ac.jp Tue Apr 24 09:10:08 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 24 Apr 2007 22:10:08 +0900
Subject: [SciPy-dev] Status of the doc format for scipy code ?
In-Reply-To: <91cf711d0704240608l38592dcbg3fd00cb3021d3b83@mail.gmail.com>
References: <462C69E5.9070905@ar.media.kyoto-u.ac.jp> <20070423090915.GM6933@mentat.za.net> <462C8606.9090004@ar.media.kyoto-u.ac.jp> <20070423202409.GQ6933@mentat.za.net> <91cf711d0704240608l38592dcbg3fd00cb3021d3b83@mail.gmail.com>
Message-ID: <462E01B0.6020609@ar.media.kyoto-u.ac.jp>

David Huard wrote:
>
> > > http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/HOWTO_DOCUMENT.txt
> > Do you know how to process it ? I tried epydoc beta, and it does not
> > work on a simple example, and dies on numpy sources...
>
> I updated epydoc and the patch I wrote a while ago to parse something
> very near the scipy template is now outdated.
>
> On the other hand, simple examples should work...
>
> Do you get a latex error or a python error ?
Well, actually, I was just stupid, and forgot to set __docformat__ to reST before processing my python files... Now, I can get them processed.
>
> If you get a latex inputenc error, it may be due to using latin1
> encoding instead of utf8 or whatever your encoding is (look in the
> header of api.tex).
Indeed, the package is called twice, once with utf8, once with latin1; I haven't really looked into it.

Does the latex part of the docstring work ? Eg, I didn't find a way to add a comment into latex and make it work with :lm:eqn, but I am not sure about the syntax (and didn't find any examples in numpy/scipy sources).
Thanks,

David

From david at ar.media.kyoto-u.ac.jp Tue Apr 24 09:11:35 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 24 Apr 2007 22:11:35 +0900
Subject: [SciPy-dev] Disappearing websites and unlicensed code
In-Reply-To: <20070424070742.GR6933@mentat.za.net>
References: <462D30EB.1090606@twilley.org> <20070424070742.GR6933@mentat.za.net>
Message-ID: <462E0207.1000400@ar.media.kyoto-u.ac.jp>

Stefan van der Walt wrote:
> Hi Jack
>
> On Mon, Apr 23, 2007 at 03:19:23PM -0700, Jack Twilley wrote:
>> I was looking for some examples on using Python's wave module to read
>> and write WAV files and came across the tutorial formerly found at
>> http://scipy.mit.edu/tutorials/wave.pdf -- I say formerly found because
>> I haven't been able to get to that website since I first found the link
>> via Google.
>
> Unfortunately, I don't know the answer to your question. But I can
> point you in the direction of David's pyaudiolab, which you may find
> useful nonetheless:
>
> http://www.ar.media.kyoto-u.ac.jp/members/david/softwares/pyaudiolab/
>
This reminds me I can import it into scikits, now :)

David

From ondrej at certik.cz Tue Apr 24 09:20:50 2007
From: ondrej at certik.cz (Ondrej Certik)
Date: Tue, 24 Apr 2007 15:20:50 +0200
Subject: [SciPy-dev] SciPy improvements
In-Reply-To: <461D4069.8070101@gmail.com>
References: <85b5c3130704111002j751d0d69h290924693ad1e640@mail.gmail.com> <461D4069.8070101@gmail.com>
Message-ID: <85b5c3130704240620x24ac89e9yc1c8dcbb016320de@mail.gmail.com>

> Register an account with the scipy Trac (click "Register" in the upper-right
> corner):
>
> http://projects.scipy.org/scipy/scipy
>
> Then make a new ticket and attach your patch to that. Submit enough patches, and
> we'll just give you SVN access.

Hi,

the patch is here:

http://projects.scipy.org/scipy/scipy/ticket/402

it should be enough to apply it in the scipy root dir.

Notes: I created a new module nonlin, and put the solvers and tests there (I adapted them to the scipy test framework). I am interested in criticisms, like whether I should rather put it into the optimize module (I think optimization is a different field), or what else should be done.

I have a question about how you work when developing scipy? I do:

1) I play in the trunk, implement something

2) in the scipy root dir I execute:

rm -rf ../dist/; ./setup.py install --prefix ../dist

3) in the parent directory, I execute this short script, which tests that my change in 1) works fine:

import sys
sys.path.insert(0, "dist/lib/python2.4/site-packages/")
import scipy
if scipy.version.release:
    # the original raised a bare string; an Exception instance is safer
    raise Exception("The svn version was not imported! Fix your paths")
from scipy import nonlin
nonlin.test()

However, I am interested if you have some better approach.

Thanks,
Ondrej

From david.huard at gmail.com Tue Apr 24 09:40:47 2007
From: david.huard at gmail.com (David Huard)
Date: Tue, 24 Apr 2007 09:40:47 -0400
Subject: [SciPy-dev] Status of the doc format for scipy code ?
In-Reply-To: <462E01B0.6020609@ar.media.kyoto-u.ac.jp>
References: <462C69E5.9070905@ar.media.kyoto-u.ac.jp> <20070423090915.GM6933@mentat.za.net> <462C8606.9090004@ar.media.kyoto-u.ac.jp> <20070423202409.GQ6933@mentat.za.net> <91cf711d0704240608l38592dcbg3fd00cb3021d3b83@mail.gmail.com> <462E01B0.6020609@ar.media.kyoto-u.ac.jp>
Message-ID: <91cf711d0704240640w789d34f1m1c2c937750f6b232@mail.gmail.com>

Here is a patch to the latest docutils svn implementing the math role from Jens. You can look at docutils/sandbox/jens/latex-math/test/test.txt for an example.
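(A quick way to check the role from Python -- a sketch; publish_string is standard docutils, and it assumes the patched docutils is first on sys.path so that the math role is available:)

# Run a snippet using the math role through the docutils LaTeX writer.
from docutils.core import publish_string

src = "The energy is :math:`E = m c^2`."
print(publish_string(src, writer_name='latex'))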
Note that I changed the name of the role from latex-math to math, so you'll have to replace occurrences of latex-math by math. Then run

rst2latex test.txt | pdflatex

Cheers,

David

2007/4/24, David Cournapeau :
>
> David Huard wrote:
> >
> > > > http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/HOWTO_DOCUMENT.txt
> > > Do you know how to process it ? I tried epydoc beta, and it does not
> > > work on a simple example, and dies on numpy sources...
> >
> > I updated epydoc and the patch I wrote a while ago to parse something
> > very near the scipy template is now outdated.
> >
> > On the other hand, simple examples should work...
> >
> > Do you get a latex error or a python error ?
> Well, actually, I was just stupid, and forgot to set __docformat__ to
> reST before processing my python files... Now, I can get them processed.
> >
> > If you get a latex inputenc error, it may be due to using latin1
> > encoding instead of utf8 or whatever your encoding is (look in the
> > header of api.tex).
> Indeed, the package is called twice, once with utf8, once with latin1; I
> haven't really looked into it.
>
> Does the latex part of the docstring work ? Eg, I didn't find a way to
> add a comment into latex and make it work with :lm:eqn, but I am not
> sure about the syntax (and didn't find any examples in numpy/scipy
> sources).
>
> Thanks,
>
> David
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: latex-math.patch
Type: text/x-patch
Size: 23110 bytes
Desc: not available
URL:

From robert.kern at gmail.com Tue Apr 24 12:37:39 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 24 Apr 2007 11:37:39 -0500
Subject: [SciPy-dev] Dataset for examples and license
In-Reply-To: <462DFFA0.6060806@ar.media.kyoto-u.ac.jp>
References: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp> <462DFFA0.6060806@ar.media.kyoto-u.ac.jp>
Message-ID: <462E3253.6010604@gmail.com>

David Cournapeau wrote:
> Well, I guess once scipy is modularized and can be installed package by
> package, having a package dataset ala R would be nice. For now, I have a
> small python script which converts those datasets to hdf5, so they can be
> read easily from python, and if including them in scipy is OK
> license-wise, I can easily add the data as a package for distribution
> (the compressed, pickled, related data takes ~ 100 kb).

I'm fiddling around with a convention for data packages. Let's suppose we have a namespace package scipydata. Each data package would be a subpackage under scipydata. It would provide some conventionally-named metadata to describe the dataset (`__doc__` to describe the dataset in prose, `source`, `copyright`, etc.) and a load() callable that would load the dataset and return a dictionary with its data.
Particularly, it would be provide utilities to read some kind of configuration file or environment variable which establishes a cache directory such that large datasets can be downloaded from a website once and loaded from disk thereafter. The scipydata packages could then be distributed extremely easily as eggs, and getting your dataset would be as simple as $ easy_install scipydata.cournapeaus_data Does that sound good to you? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From peridot.faceted at gmail.com Tue Apr 24 17:07:50 2007 From: peridot.faceted at gmail.com (Anne Archibald) Date: Tue, 24 Apr 2007 17:07:50 -0400 Subject: [SciPy-dev] Possible bug in scipy.stats.kurtosistest() In-Reply-To: <97670e910704240149u60078b8bmeafb53c06fc9f844@mail.gmail.com> References: <97670e910704240149u60078b8bmeafb53c06fc9f844@mail.gmail.com> Message-ID: On 24/04/07, Rudolph van der Merwe wrote: > I think I found a bug in the kurtosistest function in the scipy.stats > module. The relevant file is scipy/stats/stats.py You're probably right; the statistics tools are in desperate need of review. In fact, from your nice detailed descrpition, I'm quite convinced there is a bug. > Is there a more formal way of filing this bug than posting a message > on this forum? There is the scipy Trac bug tracker. I'm afraid you have to log in to report bugs. I've filed this one for you, at http://projects.scipy.org/scipy/scipy/ticket/403 Thank you for letting us know! Anne M. Archibald From oliphant at ee.byu.edu Tue Apr 24 18:05:50 2007 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Tue, 24 Apr 2007 16:05:50 -0600 Subject: [SciPy-dev] Possible bug in scipy.stats.kurtosistest() In-Reply-To: <97670e910704240149u60078b8bmeafb53c06fc9f844@mail.gmail.com> References: <97670e910704240149u60078b8bmeafb53c06fc9f844@mail.gmail.com> Message-ID: <462E7F3E.1000003@ee.byu.edu> Rudolph van der Merwe wrote: >Since kurtosistest is used by the scipy.stats.normaltest function >(which does a D'Agostino-Pearson normality test), this bug affects >that function's correctness as well. > >Is there a more formal way of filing this bug than posting a message > > >on this forum > There is the bug-tracker Trac pages which requires spam-avoiding registration but which helps us not lose sight of important problems. On the other hand, bringing the problem to the attention of a developer is usually the best way to get it solved. Based on your recommendations, this bug has been fixed. Thank you for your help. -Travis From david at ar.media.kyoto-u.ac.jp Tue Apr 24 22:06:28 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 25 Apr 2007 11:06:28 +0900 Subject: [SciPy-dev] Dataset for examples and license In-Reply-To: <462E3253.6010604@gmail.com> References: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp> <462DFFA0.6060806@ar.media.kyoto-u.ac.jp> <462E3253.6010604@gmail.com> Message-ID: <462EB7A4.4030804@ar.media.kyoto-u.ac.jp> Robert Kern wrote: > > I'm fiddling around with a convention for data packages. Let's suppose we have a > namespace package scipydata. Each data package would be a subpackage under > scipydata. It would provide some conventionally-named metadata to describe the > dataset (`__doc__` to describe the dataset in prose, `source`, `copyright`, > etc.) and a load() callable that would load the dataset and return a dictionary > with its data. 
The load() callable could do whatever it needs to load the data. > It might just return objects that are defined in code (e.g. numpy.array([...])) > if they are small enough. Or it might read a CSV, NetCDF4, or HDF5 file that is > included in the package. Or it might download something from a website or FTP site. > > The scipydata.util package would provide some utilities to help writing > scipydata packages. Particularly, it would be provide utilities to read some > kind of configuration file or environment variable which establishes a cache > directory such that large datasets can be downloaded from a website once and > loaded from disk thereafter. > > The scipydata packages could then be distributed extremely easily as eggs, and > getting your dataset would be as simple as > > $ easy_install scipydata.cournapeaus_data > > Does that sound good to you? I don't see any problem with that approach, and I am sure you know much better than me how to organize things for easy distribution. I think everybody agreeing on one file format is important (I have a preference for hdf5, since it is well supported under python through pytables, and has a full C api). For really small dataset, CSV could be OK. Would scipydata be in scipy ? (I am asking again for license reasons :) ). David From david at ar.media.kyoto-u.ac.jp Tue Apr 24 22:20:22 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 25 Apr 2007 11:20:22 +0900 Subject: [SciPy-dev] scikits developement conventions ? Message-ID: <462EBAE6.1000906@ar.media.kyoto-u.ac.jp> Hi, Now that scikits is finally in place, I would like to know what the "conventions" are for development. The only webpage I found for info is the following: https://projects.scipy.org/scipy/scikits/. Is that right ? My questions are: - who has write access to the repository ? - how does one create a new project (people with write access can create a project ?) cheers, David From steve at shrogers.com Tue Apr 24 23:01:55 2007 From: steve at shrogers.com (Steven H. Rogers) Date: Tue, 24 Apr 2007 21:01:55 -0600 Subject: [SciPy-dev] Dataset for examples and license In-Reply-To: <462E3253.6010604@gmail.com> References: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp> <462DFFA0.6060806@ar.media.kyoto-u.ac.jp> <462E3253.6010604@gmail.com> Message-ID: <462EC4A3.8030405@shrogers.com> Robert Kern wrote: > David Cournapeau wrote: > >> Well, I guess once scipy is modularized and can be installed package by >> package, having a package dataset ala R would be nice. For now, I have a >> small python script which convert those dataset to hdf5, so they can be >> read easily from python, and if including them to scipy is OK >> license-wise, I can easily add the data as a package for distribution >> (the compressed, pickled, related data takes ~ 100 kb). > > I'm fiddling around with a convention for data packages. Let's suppose we have a > namespace package scipydata. Each data package would be a subpackage under > scipydata. It would provide some conventionally-named metadata to describe the > dataset (`__doc__` to describe the dataset in prose, `source`, `copyright`, > etc.) and a load() callable that would load the dataset and return a dictionary > with its data. The load() callable could do whatever it needs to load the data. > It might just return objects that are defined in code (e.g. numpy.array([...])) > if they are small enough. Or it might read a CSV, NetCDF4, or HDF5 file that is > included in the package. Or it might download something from a website or FTP site. 
> > The scipydata.util package would provide some utilities to help writing > scipydata packages. Particularly, it would be provide utilities to read some > kind of configuration file or environment variable which establishes a cache > directory such that large datasets can be downloaded from a website once and > loaded from disk thereafter. > > The scipydata packages could then be distributed extremely easily as eggs, and > getting your dataset would be as simple as > > $ easy_install scipydata.cournapeaus_data > > Does that sound good to you? > Yes, it does. # Steve From robert.kern at gmail.com Wed Apr 25 01:37:13 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 25 Apr 2007 00:37:13 -0500 Subject: [SciPy-dev] Dataset for examples and license In-Reply-To: <462EB7A4.4030804@ar.media.kyoto-u.ac.jp> References: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp> <462DFFA0.6060806@ar.media.kyoto-u.ac.jp> <462E3253.6010604@gmail.com> <462EB7A4.4030804@ar.media.kyoto-u.ac.jp> Message-ID: <462EE909.8040801@gmail.com> David Cournapeau wrote: > I don't see any problem with that approach, and I am sure you know much > better than me how to organize things for easy distribution. I think > everybody agreeing on one file format is important (I have a preference > for hdf5, since it is well supported under python through pytables, and > has a full C api). I don't agree. My design goal was to be able to expose a single interface (load()) in front of any file format or data source. I imagined that many of the data sources would be from other packages that are out of our direct control and which we did not want to copy-and-paste into our own repository. > For really small dataset, CSV could be OK. > > Would scipydata be in scipy ? (I am asking again for license reasons :) ). No, it would be a separate namespace package. Each scipydata subpackage could specify its own license. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Wed Apr 25 01:41:58 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 25 Apr 2007 00:41:58 -0500 Subject: [SciPy-dev] scikits developement conventions ? In-Reply-To: <462EBAE6.1000906@ar.media.kyoto-u.ac.jp> References: <462EBAE6.1000906@ar.media.kyoto-u.ac.jp> Message-ID: <462EEA26.6000509@gmail.com> David Cournapeau wrote: > Hi, > > Now that scikits is finally in place, I would like to know what the > "conventions" are for development. The only webpage I found for info is > the following: https://projects.scipy.org/scipy/scikits/. Is that right ? That's it at the moment. You can also look at my responses in the thread "mlabwrap scikit". I'll try to flesh out the page this week. > My questions are: > - who has write access to the repository ? Ask Jeff Strunk for access. Tell him I said to give you access. > - how does one create a new project (people with write access can > create a project ?) svn add, essentially. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From david at ar.media.kyoto-u.ac.jp Wed Apr 25 02:42:40 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 25 Apr 2007 15:42:40 +0900 Subject: [SciPy-dev] Dataset for examples and license In-Reply-To: <462EE909.8040801@gmail.com> References: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp> <462DFFA0.6060806@ar.media.kyoto-u.ac.jp> <462E3253.6010604@gmail.com> <462EB7A4.4030804@ar.media.kyoto-u.ac.jp> <462EE909.8040801@gmail.com> Message-ID: <462EF860.1040406@ar.media.kyoto-u.ac.jp> Robert Kern wrote: > David Cournapeau wrote: >> I don't see any problem with that approach, and I am sure you know much >> better than me how to organize things for easy distribution. I think >> everybody agreeing on one file format is important (I have a preference >> for hdf5, since it is well supported under python through pytables, and >> has a full C api). > > I don't agree. My design goal was to be able to expose a single interface > (load()) in front of any file format or data source. I imagined that many of the > data sources would be from other packages that are out of our direct control and > which we did not want to copy-and-paste into our own repository. Ah, ok, I was more in the "r spirit" and their dataset package. Using your approach is actually more general: we can always have one package with "basic" datasets if this is seen useful, with our own data. David From matthieu.brucher at gmail.com Wed Apr 25 05:17:50 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 25 Apr 2007 11:17:50 +0200 Subject: [SciPy-dev] Updated generic optimizers proposal In-Reply-To: References: Message-ID: Since nobody answered to this mail, I submitted the last developments to the TRAC : http://projects.scipy.org/scipy/scipy/ticket/405 I added some conjugate grasient steps : PRP, CW, D and DY, and a special optimizer that can modify the set of parameters before ans after an iteration - this is useful when a set of parameters has some invariants, or to add noise at each iteration, ... - Matthieu 2007/4/18, Matthieu Brucher : > > Hi, > > I'm lauching a new thread, the last was pretty big, and as I almost put > every advice in this proposal, I thought it would be better. > First, I used scipy coding standard, I hope I didn't forget something. > > I do not know where it would be put at the moment on my scipy tree, and > the tests are visual for the moment, I have to make them automatic, but I do > not know the framework used by scipy, I have to check it first. > > So, the proposal : > - combining several objects to make an optimizer > - a function should be an object defining the __call__ method and graient, > hessian, ... if needed. It can be passed as several separate functions as > Alan suggested it, a new object is then created > - an optimizer is a combination of a function, a step_kind, a line_search, > a criterion and a starting point x0. > - the result of the optimization is return after a call to the optimize() > method > - every object (step or line_search) saves its modification in a state > variable in the optimizer. This variable can be accessed if needed after the > optimization. 
> - after each iteration, a record function is called with this state > variable - it is a dict, BTW -, if you want to save the whole dict, don't > forget to copy it, as it is modified during the optimization > > For the moment are implemented : > - a standard algorithm, only calls step_kind then line_search for a new > candidate - the next optimizer would be one that calls a modifying function > on the computed result, that can be useful in some cases - > - criteria : > - monotony criterion : the cost is decreasing - a factor can be used to > allow an error - > - relative value criterion : the relative value error is higher than a > fixed error > - absolute value criterion : the same with the absolute error > - step : > - gradient step > - Newton step > - Fletcher-Reeves conjugate gradient step - other conjugate gradient will > be available - > - line search : > - no line search, just take the step > - damped search, it's an inexact line search, that searches in the step > direction a set of parameters than decreases the cost by dividing by two the > step size while the cost is not decreasing > - Golden section search > - Fibonacci search > > I'm not pulling other criterion, step or line search, as my time is finite > when doing a structural change. > > There are 3 classic optimization test functions in the package, > Rosenbrock, Powell and a quadratic function, feel free to try them. > Sometimes, the optimizer converges to the true minimum, sometimes it does > not, I tried to propose several solutions to show that every combinaison > does not manage to find the minimum. > > Matthieu > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From william.ratcliff at gmail.com Wed Apr 25 05:36:12 2007 From: william.ratcliff at gmail.com (william ratcliff) Date: Wed, 25 Apr 2007 05:36:12 -0400 Subject: [SciPy-dev] Updated generic optimizers proposal In-Reply-To: References: Message-ID: <827183970704250236r9c39769w25c53d4f6d1d29df@mail.gmail.com> say, who's responsible for the anneal portion of optimize? I'd like to check in a minor tweak which implements simple upper and bounds on the fit parameters. Thanks, William On 4/18/07, Matthieu Brucher wrote: > > Hi, > > I'm lauching a new thread, the last was pretty big, and as I almost put > every advice in this proposal, I thought it would be better. > First, I used scipy coding standard, I hope I didn't forget something. > > I do not know where it would be put at the moment on my scipy tree, and > the tests are visual for the moment, I have to make them automatic, but I do > not know the framework used by scipy, I have to check it first. > > So, the proposal : > - combining several objects to make an optimizer > - a function should be an object defining the __call__ method and graient, > hessian, ... if needed. It can be passed as several separate functions as > Alan suggested it, a new object is then created > - an optimizer is a combination of a function, a step_kind, a line_search, > a criterion and a starting point x0. > - the result of the optimization is return after a call to the optimize() > method > - every object (step or line_search) saves its modification in a state > variable in the optimizer. This variable can be accessed if needed after the > optimization. 
> - after each iteration, a record function is called with this state > variable - it is a dict, BTW -, if you want to save the whole dict, don't > forget to copy it, as it is modified during the optimization > > For the moment are implemented : > - a standard algorithm, only calls step_kind then line_search for a new > candidate - the next optimizer would be one that calls a modifying function > on the computed result, that can be useful in some cases - > - criteria : > - monotony criterion : the cost is decreasing - a factor can be used to > allow an error - > - relative value criterion : the relative value error is higher than a > fixed error > - absolute value criterion : the same with the absolute error > - step : > - gradient step > - Newton step > - Fletcher-Reeves conjugate gradient step - other conjugate gradient will > be available - > - line search : > - no line search, just take the step > - damped search, it's an inexact line search, that searches in the step > direction a set of parameters than decreases the cost by dividing by two the > step size while the cost is not decreasing > - Golden section search > - Fibonacci search > > I'm not pulling other criterion, step or line search, as my time is finite > when doing a structural change. > > There are 3 classic optimization test functions in the package, > Rosenbrock, Powell and a quadratic function, feel free to try them. > Sometimes, the optimizer converges to the true minimum, sometimes it does > not, I tried to propose several solutions to show that every combinaison > does not manage to find the minimum. > > Matthieu > > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Wed Apr 25 06:47:29 2007 From: stefan at sun.ac.za (Stefan van der Walt) Date: Wed, 25 Apr 2007 12:47:29 +0200 Subject: [SciPy-dev] Dataset for examples and license In-Reply-To: <462EE909.8040801@gmail.com> References: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp> <462DFFA0.6060806@ar.media.kyoto-u.ac.jp> <462E3253.6010604@gmail.com> <462EB7A4.4030804@ar.media.kyoto-u.ac.jp> <462EE909.8040801@gmail.com> Message-ID: <20070425104728.GE6933@mentat.za.net> On Wed, Apr 25, 2007 at 12:37:13AM -0500, Robert Kern wrote: > David Cournapeau wrote: > > I don't see any problem with that approach, and I am sure you know much > > better than me how to organize things for easy distribution. I think > > everybody agreeing on one file format is important (I have a preference > > for hdf5, since it is well supported under python through pytables, and > > has a full C api). > > I don't agree. My design goal was to be able to expose a single interface > (load()) in front of any file format or data source. I imagined that many of the > data sources would be from other packages that are out of our direct control and > which we did not want to copy-and-paste into our own repository. I like the generic 'load()' approach. I often work with large image datasets, where you never want to load the whole thing into memory at once. The above interface would allow me to construct a cached dictionary, which only returns an image on request. 
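Concretely, a minimal subpackage under this convention might look like the sketch below (every name here is hypothetical, and the caching is reduced to a module-level dict; a real package would use the proposed scipydata.util helpers to locate a cache directory and fetch `source` into it):

    """Synthetic grayscale test images (a stand-in dataset for this sketch)."""
    # hypothetical contents of scipydata/images/__init__.py
    source = "http://www.example.org/images.tar.gz"   # placeholder, not a real URL
    copyright = "Public domain"

    _cache = {}

    def load():
        """Return a dict holding the dataset, reading each item at most once."""
        if not _cache:
            # A real package would download `source` into the cache directory
            # on first use; to keep the sketch self-contained we just build a
            # small array in code.
            import numpy
            _cache['image0'] = numpy.zeros((64, 64))
        return _cache

Callers only ever see load(), which is what keeps in-code arrays, CSV, HDF5, and downloaded sources interchangeable.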
Regards Stéfan

From david at ar.media.kyoto-u.ac.jp Wed Apr 25 07:08:32 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 25 Apr 2007 20:08:32 +0900 Subject: [SciPy-dev] Dataset for examples and license In-Reply-To: <462EE909.8040801@gmail.com> References: <462D8DA9.9030204@ar.media.kyoto-u.ac.jp> <462DFFA0.6060806@ar.media.kyoto-u.ac.jp> <462E3253.6010604@gmail.com> <462EB7A4.4030804@ar.media.kyoto-u.ac.jp> <462EE909.8040801@gmail.com> Message-ID: <462F36B0.2000603@ar.media.kyoto-u.ac.jp> Robert Kern wrote: > David Cournapeau wrote: >> I don't see any problem with that approach, and I am sure you know much >> better than me how to organize things for easy distribution. I think >> everybody agreeing on one file format is important (I have a preference >> for hdf5, since it is well supported under python through pytables, and >> has a full C api). > > I don't agree. My design goal was to be able to expose a single interface > (load()) in front of any file format or data source. I imagined that many of the > data sources would be from other packages that are out of our direct control and > which we did not want to copy-and-paste into our own repository. > >> For really small dataset, CSV could be OK. >> >> Would scipydata be in scipy ? (I am asking again for license reasons :) ). > > No, it would be a separate namespace package. Each scipydata subpackage could > specify its own license. >

Ok, I set up something really trivial, so that we can start discussing the details before releasing anything. It is available here (bzr archive): http://www.ar.media.kyoto-u.ac.jp/members/david/archives/scipydata/scipydata.dev This basically defines two subpackages of scipydata, iris and oldfaithful. Each dataset being small, they are defined in python files. In ipython, you can do:

    >> from scipydata import iris
    >> ?iris
    Type:           module
    Base Class:     <type 'module'>
    String Form:    <module 'scipydata.iris' from '/usr/media/boulot/src/sigtools/scdata/scipydata/iris/__init__.py'>
    Namespace:      Interactive
    File:           /usr/media/boulot/src/sigtools/scdata/scipydata/iris/__init__.py
    Docstring:
        This famous (Fisher's or Anderson's) iris data set gives the
        measurements in centimeters of the variables sepal length and width
        and petal length and width, respectively, for 50 flowers from each
        of 3 species of iris. The species are Iris setosa, versicolor, and
        virginica.

    >> data = iris.load()

Something which would be nice is that when you type iris, you get the docstring, e.g. something like a __repr__ method, but for a module. I don't know if this is possible (it may have undesirable effects, too). Also, I don't know whether it is worthwhile to have something like a DataInfo class which holds all the metadata, so that you can get everything at once (a la help(faithful) in R, for people who know R). cheers, David

From openopt at ukr.net Wed Apr 25 07:33:20 2007 From: openopt at ukr.net (dmitrey) Date: Wed, 25 Apr 2007 14:33:20 +0300 Subject: [SciPy-dev] Updated generic optimizers proposal In-Reply-To: <827183970704250236r9c39769w25c53d4f6d1d29df@mail.gmail.com> References: <827183970704250236r9c39769w25c53d4f6d1d29df@mail.gmail.com> Message-ID: <462F3C80.5030202@ukr.net>

Hello! I have been accepted into the GSoC program with a project related to scipy and optimization. If no one else here is able to or has the time, I would take a look (however, I can't spend much time before the summer starts because of my exams; and anneal (as well as other global solvers) is not my specialty).
I think lb-ub bounds can hardly be implemented in a simple way, because the result depends very much on the quality of the random point generator, and that generator needs to be much better than a simple lb+rand*(ub-lb); otherwise all points will be located in a thin area near their average value (the same problem is present in the integration of functions f: R^n -> R in high dimensions, n >> 1). I took a look at the generators by Joachim Vandekerckhove in his anneal (connected to my openopt for MATLAB/Octave); they seem to be too primitive. BTW, AFAIK anneal is currently considered deprecated; there are better global solvers, for example GRASP-based ones. WBR, D.

william ratcliff wrote: > say, who's responsible for the anneal portion of optimize? I'd like > to check in a minor tweak which implements simple upper and bounds on > the fit parameters. > > Thanks, > William > > On 4/18/07, *Matthieu Brucher* > > wrote: > > Hi, > > I'm lauching a new thread, the last was pretty big, and as I > almost put every advice in this proposal, I thought it would be > better. > First, I used scipy coding standard, I hope I didn't forget > something. > > I do not know where it would be put at the moment on my scipy > tree, and the tests are visual for the moment, I have to make them > automatic, but I do not know the framework used by scipy, I have > to check it first. > > So, the proposal : > - combining several objects to make an optimizer > - a function should be an object defining the __call__ method and > graient, hessian, ... if needed. It can be passed as several > separate functions as Alan suggested it, a new object is then created > - an optimizer is a combination of a function, a step_kind, a > line_search, a criterion and a starting point x0. > - the result of the optimization is return after a call to the > optimize() method > - every object (step or line_search) saves its modification in a > state variable in the optimizer. This variable can be accessed if > needed after the optimization.
Sometimes, the optimizer converges to the true minimum, > sometimes it does not, I tried to propose several solutions to > show that every combinaison does not manage to find the minimum. > > Matthieu > > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > > > > ------------------------------------------------------------------------ > > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > From david at ar.media.kyoto-u.ac.jp Wed Apr 25 07:43:09 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 25 Apr 2007 20:43:09 +0900 Subject: [SciPy-dev] Status of the doc format for scipy code ? In-Reply-To: <91cf711d0704240640w789d34f1m1c2c937750f6b232@mail.gmail.com> References: <462C69E5.9070905@ar.media.kyoto-u.ac.jp> <20070423090915.GM6933@mentat.za.net> <462C8606.9090004@ar.media.kyoto-u.ac.jp> <20070423202409.GQ6933@mentat.za.net> <91cf711d0704240608l38592dcbg3fd00cb3021d3b83@mail.gmail.com> <462E01B0.6020609@ar.media.kyoto-u.ac.jp> <91cf711d0704240640w789d34f1m1c2c937750f6b232@mail.gmail.com> Message-ID: <462F3ECD.5060701@ar.media.kyoto-u.ac.jp> David Huard wrote: > Here is a patch to the latest docutils svn implementing the math role > from Jens. > > You can look at docutils/sandbox/jens/latex-math/test/test.txt for an > example. Note that I changed the name of the role from latex-math to > math, so you'll have to replace occurrences of latex-math by math. > Then run > > rst2latex test.txt | pdflatex > Excuse my ignorance, but how do I use it in a docstring ? Does epydoc uses the docutils' rest parser ? David From guyer at nist.gov Wed Apr 25 08:32:59 2007 From: guyer at nist.gov (Jonathan Guyer) Date: Wed, 25 Apr 2007 08:32:59 -0400 Subject: [SciPy-dev] Status of the doc format for scipy code ? In-Reply-To: <462F3ECD.5060701@ar.media.kyoto-u.ac.jp> References: <462C69E5.9070905@ar.media.kyoto-u.ac.jp> <20070423090915.GM6933@mentat.za.net> <462C8606.9090004@ar.media.kyoto-u.ac.jp> <20070423202409.GQ6933@mentat.za.net> <91cf711d0704240608l38592dcbg3fd00cb3021d3b83@mail.gmail.com> <462E01B0.6020609@ar.media.kyoto-u.ac.jp> <91cf711d0704240640w789d34f1m1c2c937750f6b232@mail.gmail.com> <462F3ECD.5060701@ar.media.kyoto-u.ac.jp> Message-ID: On Apr 25, 2007, at 7:43 AM, David Cournapeau wrote: > Does epydoc > uses the docutils' rest parser ? Yes. From william.ratcliff at gmail.com Wed Apr 25 12:04:47 2007 From: william.ratcliff at gmail.com (william ratcliff) Date: Wed, 25 Apr 2007 12:04:47 -0400 Subject: [SciPy-dev] Updated generic optimizers proposal In-Reply-To: <462F3C80.5030202@ukr.net> References: <827183970704250236r9c39769w25c53d4f6d1d29df@mail.gmail.com> <462F3C80.5030202@ukr.net> Message-ID: <827183970704250904j117ca962m5383980d8fb1d4b8@mail.gmail.com> The 'simple' way applies only to the anneal algorithm in scipy. When one chooses steps in a simulated annealing algorithm, there is always the question of how to step from the current point. For anneal, it is currently done based on an upper bound and lower bound (in one option). However, there is nothing to prevent the searcher from crawling its way out of the box. When most people imagine themselves searching in a bounded parameter space, that is not the expected behavior. Now, it is possible that the optimum solution is truly outside of the box and that the searcher is doing the right thing. 
However, if that is not the case, then there is a problem. So, what is one to do? The first obvious thing to try is to say, if you reach the edge of a bounded parameter, stay there. However, that is not ideal as you get stuck and can't explore the rest of the phase space. So, I use the simple heuristic that if a trial move is to take you outside of the box, simply stay where you are. In the next cycle, try to move again. This will keep you in the box, and if there is truly a solution outside of the box, will still move you towards the walls and let you know that maybe you've set your bounds improperly. Now, there are questions of efficiency. For example, how easy is it to get out of corners? Should one do reflections? However, I believe that my rather simple heuristic will preserve detailed balance and results in an algorithm that has the expected behavior and is better than having no option ;> As for deprecation--is it really true that scipy.optimize.anneal is deprecated? As for issues of this global optimizer or that global optimizer, why not let the user decide based on their expectations of their fitting surface? For some truly glassy surfaces, one is forced into techniques like simulated annealing, parrallel tempering, genetic algorithms, etc. and I imagine that their relative performance is based strongly on the particular problem that their are trying to solve. Cheers, WIlliam Ratcliff On 4/25/07, dmitrey wrote: > > Hallo! > I have been accepted for participating in the GSoC program with project > related to scipy and optimization. > If noone else here is able or have no time, I would take a look > (however, I can't spend much time before summer start because of my > exams; and anneal (as well as other global solvers) is not my specialty). > > I think that lb-ub bounds can hardly be implemented in a simple way > because it depends very much on rand points generator quality, and the > latter should be much more better than simple lb+rand*(ub-lb) elseware > all points will be located in a thin area near their average value (same > problem is present in integration of functions f: R^n->R with high > dimensions (n>>1) ). > I took a look at the generators by Joachim Vandekerckhove in his anneal > (connected to my openopt for MATLAB/Octave), they seems to be too > primitive. > BTW afaik anneal currenlty is concerned as deprecated (I don't know > better English word, not "up-to-date", old one), there are better global > solvers, for example GRASP-based. > WBR, D. > > william ratcliff wrote: > > say, who's responsible for the anneal portion of optimize? I'd like > > to check in a minor tweak which implements simple upper and bounds on > > the fit parameters. > > > > Thanks, > > William > > > > On 4/18/07, *Matthieu Brucher* > > wrote: > > > > Hi, > > > > I'm lauching a new thread, the last was pretty big, and as I > > almost put every advice in this proposal, I thought it would be > > better. > > First, I used scipy coding standard, I hope I didn't forget > > something. > > > > I do not know where it would be put at the moment on my scipy > > tree, and the tests are visual for the moment, I have to make them > > automatic, but I do not know the framework used by scipy, I have > > to check it first. > > > > So, the proposal : > > - combining several objects to make an optimizer > > - a function should be an object defining the __call__ method and > > graient, hessian, ... if needed. 
It can be passed as several > > separate functions as Alan suggested it, a new object is then > created > > - an optimizer is a combination of a function, a step_kind, a > > line_search, a criterion and a starting point x0. > > - the result of the optimization is return after a call to the > > optimize() method > > - every object (step or line_search) saves its modification in a > > state variable in the optimizer. This variable can be accessed if > > needed after the optimization. > > - after each iteration, a record function is called with this > > state variable - it is a dict, BTW -, if you want to save the > > whole dict, don't forget to copy it, as it is modified during the > > optimization > > > > For the moment are implemented : > > - a standard algorithm, only calls step_kind then line_search for > > a new candidate - the next optimizer would be one that calls a > > modifying function on the computed result, that can be useful in > > some cases - > > - criteria : > > - monotony criterion : the cost is decreasing - a factor can be > > used to allow an error - > > - relative value criterion : the relative value error is higher > > than a fixed error > > - absolute value criterion : the same with the absolute error > > - step : > > - gradient step > > - Newton step > > - Fletcher-Reeves conjugate gradient step - other conjugate > > gradient will be available - > > - line search : > > - no line search, just take the step > > - damped search, it's an inexact line search, that searches in > > the step direction a set of parameters than decreases the cost by > > dividing by two the step size while the cost is not decreasing > > - Golden section search > > - Fibonacci search > > > > I'm not pulling other criterion, step or line search, as my time > > is finite when doing a structural change. > > > > There are 3 classic optimization test functions in the package, > > Rosenbrock, Powell and a quadratic function, feel free to try > > them. Sometimes, the optimizer converges to the true minimum, > > sometimes it does not, I tried to propose several solutions to > > show that every combinaison does not manage to find the minimum. > > > > Matthieu > > > > _______________________________________________ > > Scipy-dev mailing list > > Scipy-dev at scipy.org > > http://projects.scipy.org/mailman/listinfo/scipy-dev > > > > > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > Scipy-dev mailing list > > Scipy-dev at scipy.org > > http://projects.scipy.org/mailman/listinfo/scipy-dev > > > > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Apr 25 12:18:42 2007 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 25 Apr 2007 11:18:42 -0500 Subject: [SciPy-dev] Updated generic optimizers proposal In-Reply-To: <827183970704250904j117ca962m5383980d8fb1d4b8@mail.gmail.com> References: <827183970704250236r9c39769w25c53d4f6d1d29df@mail.gmail.com> <462F3C80.5030202@ukr.net> <827183970704250904j117ca962m5383980d8fb1d4b8@mail.gmail.com> Message-ID: <462F7F62.7070000@gmail.com> william ratcliff wrote: > As for deprecation--is it really true that > scipy.optimize.anneal is deprecated? I think Dmitrey simply meant that no one has maintained it in a while, which is true. 
To answer your question, no one has claimed stewardship of anneal, so please suggest modifications as you think would help. We appreciate your contributions. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From openopt at ukr.net Wed Apr 25 13:25:18 2007 From: openopt at ukr.net (dmitrey) Date: Wed, 25 Apr 2007 20:25:18 +0300 Subject: [SciPy-dev] Updated generic optimizers proposal In-Reply-To: <827183970704250904j117ca962m5383980d8fb1d4b8@mail.gmail.com> References: <827183970704250236r9c39769w25c53d4f6d1d29df@mail.gmail.com> <462F3C80.5030202@ukr.net> <827183970704250904j117ca962m5383980d8fb1d4b8@mail.gmail.com> Message-ID: <462F8EFE.8070008@ukr.net>

Check for the case:

    objFun(x) = sum(x)
    x0 = [0 0 0 0 0 0 0 0 0 1] (or very small numbers instead of zeros, smaller than typicalDeltaX/50 for example)
    lb = zeros
    ub = ones (or any other)

So if you use a random shift for all coords (x = x_prev + deltaX, all coords of deltaX random), the probability of "move" is 2^(-9) = 1/512 and the probability of "stay" is 1 - 2^(-9) = 511/512. This is valid for the current update_guess() from anneal:

    class fast_sa(base_schedule):
        def update_guess(self, x0):
            x0 = asarray(x0)
            u = squeeze(random.uniform(0.0, 1.0, size=self.dims))
            T = self.T
            y = sign(u-0.5)*T*((1+1.0/T)**abs(2*u-1)-1.0)
            xc = y*(self.upper - self.lower)  # so xc = deltaX changes ALL coords
            xnew = x0 + xc
            return xnew

    class cauchy_sa(base_schedule):
        def update_guess(self, x0):
            x0 = asarray(x0)
            numbers = squeeze(random.uniform(-pi/2, pi/2, size=self.dims))
            xc = self.learn_rate * self.T * tan(numbers)
            xnew = x0 + xc  # ALSO modifies ALL coords
            return xnew

    class boltzmann_sa(base_schedule):
        def update_guess(self, x0):
            std = minimum(sqrt(self.T)*ones(self.dims),
                          (self.upper-self.lower)/3.0/self.learn_rate)
            x0 = asarray(x0)
            xc = squeeze(random.normal(0, 1.0, size=self.dims))
            xnew = x0 + xc*std*self.learn_rate  # ALSO modifies ALL coords
            return xnew

If you use a random shift for 1 coord only (sequential), there can be other problems.

WBR, D.

william ratcliff wrote: > The 'simple' way applies only to the anneal algorithm in scipy. When > one chooses steps in a simulated annealing algorithm, there is always > the question of how to step from the current point. For anneal, it is > currently done based on an upper bound and lower bound (in one > option). However, there is nothing to prevent the searcher from > crawling its way out of the box. When most people imagine themselves > searching in a bounded parameter space, that is not the expected > behavior. Now, it is possible that the optimum solution is truly > outside of the box and that the searcher is doing the right thing. > However, if that is not the case, then there is a problem. So, what > is one to do? The first obvious thing to try is to say, if you reach > the edge of a bounded parameter, stay there. However, that is not > ideal as you get stuck and can't explore the rest of the phase space.
However, I believe that > my rather simple heuristic will preserve detailed balance and results > in an algorithm that has the expected behavior and is better than > having no option ;> > > As for deprecation--is it really true that > scipy.optimize.anneal is deprecated? > > As for issues of this global optimizer or that global optimizer, why > not let the user decide based on their expectations of their fitting > surface? For some truly glassy surfaces, one is forced into > techniques like simulated annealing, parrallel tempering, genetic > algorithms, etc. and I imagine that their relative performance is > based strongly on the particular problem that their are trying to solve. > > Cheers, > WIlliam Ratcliff From millman at berkeley.edu Wed Apr 25 14:34:48 2007 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 25 Apr 2007 11:34:48 -0700 Subject: [SciPy-dev] scikits developement conventions ? In-Reply-To: <462EEA26.6000509@gmail.com> References: <462EBAE6.1000906@ar.media.kyoto-u.ac.jp> <462EEA26.6000509@gmail.com> Message-ID: On 4/24/07, Robert Kern wrote: > > David Cournapeau wrote: > > Now that scikits is finally in place, I would like to know what the > > "conventions" are for development. The only webpage I found for info is > > the following: https://projects.scipy.org/scipy/scikits/. Is that right > ? > > That's it at the moment. You can also look at my responses in the thread > "mlabwrap scikit". I'll try to flesh out the page this week. > Last week Alexander Schmolck and I started converting mlabwrap to the be the first scikits project. We ran short on time and didn't get everything finished. I will be adding the project to svn later this week. Alex also started working on the distilling the scipy-dev threads on the wiki page. Once Robert fleshes it out a little better, it should make more sense. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From william.ratcliff at gmail.com Wed Apr 25 19:19:26 2007 From: william.ratcliff at gmail.com (william ratcliff) Date: Wed, 25 Apr 2007 19:19:26 -0400 Subject: [SciPy-dev] Updated generic optimizers proposal In-Reply-To: <462F8EFE.8070008@ukr.net> References: <827183970704250236r9c39769w25c53d4f6d1d29df@mail.gmail.com> <462F3C80.5030202@ukr.net> <827183970704250904j117ca962m5383980d8fb1d4b8@mail.gmail.com> <462F8EFE.8070008@ukr.net> Message-ID: <827183970704251619o4a635428q8875608b36870d03@mail.gmail.com>

What I suggest is simply a slightly modified version of what's in anneal.py, to add the following:

    class simple_sa(base_schedule):
        def init(self, **options):
            self.__dict__.update(options)
            if self.m is None:
                self.m = 1.0
            if self.n is None:
                self.n = 1.0
            self.c = self.m * exp(-self.n * self.quench)

        def update_guess(self, x0):
            x0 = asarray(x0)
            T = self.T
            myFlag = True
            while myFlag:
                u = squeeze(random.uniform(0.0, 1.0, size=self.dims))
                y = sign(u-0.5)*T*((1+1.0/T)**abs(2*u-1)-1.0)
                xc = y*(self.upper - self.lower)
                xt = x0 + xc
                indu = where(xt > self.upper)  # find where it goes above the upper bounds
                indl = where(xt < self.lower)  # find where it goes below the lower bounds
                # NOTE: the archive scrubbed the rest of this post after the
                # "<" above; presumably the loop exits once no component
                # violates either bound, along these lines:
                if len(indu[0]) == 0 and len(indl[0]) == 0:
                    myFlag = False
            return xt

On 4/25/07, dmitrey wrote:
> Check for the case:
> objFun(x) = sum(x)
> x0 = [0 0 0 0 0 0 0 0 0 1] (or very small numbers instead of zeros,
> smaller than typicalDeltaX/50 for example)
> lb = zeros
> ub = ones (or any other)
>
> so if you use random shift for all coords (x = x_prev + deltaX, all
> coords of deltaX are random), the probability of "move" is 2^(-9)=1/512
> and probability of "stay" is 1-2^(-9) = 511/512.
> this is valid for current update_guess() from anneal
>
> class fast_sa(base_schedule):
>     def update_guess(self, x0):
>         x0 = asarray(x0)
>         u = squeeze(random.uniform(0.0, 1.0, size=self.dims))
>         T = self.T
>         y = sign(u-0.5)*T*((1+1.0/T)**abs(2*u-1)-1.0)
>         xc = y*(self.upper - self.lower)
>         xnew = x0 + xc
>         return xnew
>
> class cauchy_sa(base_schedule):
>     def update_guess(self, x0):
>         x0 = asarray(x0)
>         numbers = squeeze(random.uniform(-pi/2, pi/2, size=self.dims))
>         xc = self.learn_rate * self.T * tan(numbers)
>         xnew = x0 + xc
>         return xnew
>
> class boltzmann_sa(base_schedule):
>     def update_guess(self, x0):
>         std = minimum(sqrt(self.T)*ones(self.dims),
>                       (self.upper-self.lower)/3.0/self.learn_rate)
>         x0 = asarray(x0)
>         xc = squeeze(random.normal(0, 1.0, size=self.dims))
>         xnew = x0 + xc*std*self.learn_rate
>         return xnew
>
> If you use random shift for 1 coord only (sequential) there can be other
> problems.
>
> WBR, D.
>
> william ratcliff wrote:
> > The 'simple' way applies only to the anneal algorithm in scipy. When
> > one chooses steps in a simulated annealing algorithm, there is always
> > the question of how to step from the current point. For anneal, it is
> > currently done based on an upper bound and lower bound (in one
> > option). However, there is nothing to prevent the searcher from
> > crawling its way out of the box. When most people imagine themselves
> > searching in a bounded parameter space, that is not the expected
> > behavior. Now, it is possible that the optimum solution is truly
> > outside of the box and that the searcher is doing the right thing.
> > However, if that is not the case, then there is a problem. So, what
> > is one to do? The first obvious thing to try is to say, if you reach
> > the edge of a bounded parameter, stay there. However, that is not
> > ideal as you get stuck and can't explore the rest of the phase space.
> > So, I use the simple heuristic that if a trial move is to take you > > outside of the box, simply stay where you are. In the next cycle, try > > to move again. This will keep you in the box, and if there is truly > > a solution outside of the box, will still move you towards the walls > > and let you know that maybe you've set your bounds improperly. Now, > > there are questions of efficiency. For example, how easy is it to get > > out of corners? Should one do reflections? However, I believe that > > my rather simple heuristic will preserve detailed balance and results > > in an algorithm that has the expected behavior and is better than > > having no option ;> > > > > As for deprecation--is it really true that > > scipy.optimize.anneal is deprecated? > > > > As for issues of this global optimizer or that global optimizer, why > > not let the user decide based on their expectations of their fitting > > surface? For some truly glassy surfaces, one is forced into > > techniques like simulated annealing, parrallel tempering, genetic > > algorithms, etc. and I imagine that their relative performance is > > based strongly on the particular problem that their are trying to solve. > > > > Cheers, > > WIlliam Ratcliff > > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Thu Apr 26 05:25:29 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 26 Apr 2007 18:25:29 +0900 Subject: [SciPy-dev] Status of the doc format for scipy code ? In-Reply-To: <91cf711d0704240640w789d34f1m1c2c937750f6b232@mail.gmail.com> References: <462C69E5.9070905@ar.media.kyoto-u.ac.jp> <20070423090915.GM6933@mentat.za.net> <462C8606.9090004@ar.media.kyoto-u.ac.jp> <20070423202409.GQ6933@mentat.za.net> <91cf711d0704240608l38592dcbg3fd00cb3021d3b83@mail.gmail.com> <462E01B0.6020609@ar.media.kyoto-u.ac.jp> <91cf711d0704240640w789d34f1m1c2c937750f6b232@mail.gmail.com> Message-ID: <46307009.3030307@ar.media.kyoto-u.ac.jp> David Huard wrote: > Here is a patch to the latest docutils svn implementing the math role > from Jens. > > You can look at docutils/sandbox/jens/latex-math/test/test.txt for an > example. Note that I changed the name of the role from latex-math to > math, so you'll have to replace occurrences of latex-math by math. > Then run > > rst2latex test.txt | pdflatex Thanks, this is working great ! I have another questions regarding epydoc's usage. First, when I have a Examples section in my docstring, it is put before :Parameters: in the output html, even if I put the example section after in the docstring. Why is that ? David From matthieu.brucher at gmail.com Thu Apr 26 06:43:07 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 26 Apr 2007 12:43:07 +0200 Subject: [SciPy-dev] Which matrix library in C++ for scipy Message-ID: Hello, I'm wondering what C++ matrix library I should for future inclusion of code in scipy. What I want to do is to port a class that can do Parzen window or K-neighboors, and it uses my own matrix library. Should I use Blitz ++ ? Matthieu -------------- next part -------------- An HTML attachment was scrubbed... 
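Stripped of anneal.py's scheduler plumbing, the heuristic debated in this thread is just rejection of out-of-bounds proposals. A self-contained sketch (the function and names below are illustrative, not scipy code; with max_tries=1 it behaves as the stay-where-you-are variant William describes):

    import numpy as np

    def bounded_step(x0, T, lower, upper, max_tries=1):
        """Propose a 'fast'-schedule annealing move, rejecting any move
        that leaves [lower, upper]; if every attempt is rejected, stay put."""
        x0 = np.asarray(x0, dtype=float)
        for _ in range(max_tries):
            u = np.random.uniform(0.0, 1.0, size=x0.shape)
            y = np.sign(u - 0.5) * T * ((1.0 + 1.0/T)**np.abs(2*u - 1) - 1.0)
            xt = x0 + y * (upper - lower)
            if np.all(xt >= lower) and np.all(xt <= upper):
                return xt
        return x0  # every proposal left the box: stay where we are

    # dmitrey's corner case: nine coordinates sitting on the lower bound.
    x0 = np.array([0.0]*9 + [0.5])
    trials = [bounded_step(x0, T=1.0, lower=0.0, upper=1.0) for _ in range(4096)]
    stuck = sum(np.all(t == x0) for t in trials)
    print(stuck / 4096.0)   # roughly 0.999, close to dmitrey's 1 - 2**-9 estimate

Whether one should stay put on rejection (the single-try kernel William argues preserves detailed balance) or resample until an in-bounds point appears (what the while loop in his posted code does) is exactly the trade-off the two posters are arguing about.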
URL: From ndbecker2 at gmail.com Thu Apr 26 06:47:02 2007 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 26 Apr 2007 06:47:02 -0400 Subject: [SciPy-dev] Which matrix library in C++ for scipy References: Message-ID: Matthieu Brucher wrote: > Hello, > > I'm wondering what C++ matrix library I should for future inclusion of > code in scipy. What I want to do is to port a class that can do Parzen > window or K-neighboors, and it uses my own matrix library. Should I use > Blitz ++ ? > At this point, the best choice for myself has been using boost::python with boost::ublas. I also use boost::multi_array. From matthieu.brucher at gmail.com Thu Apr 26 07:24:56 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 26 Apr 2007 13:24:56 +0200 Subject: [SciPy-dev] Which matrix library in C++ for scipy In-Reply-To: References: Message-ID: > At this point, the best choice for myself has been using boost::python > with > boost::ublas. I also use boost::multi_array. > Does scipy depend on Boost ? In that case, I can use it. I posted a message in the user list some time ago about neighboors algorithms in Python, and I had a link to BioPython, based on C++. I could use it, but don't want to install BioPython, and I'm willing to contribute my own version - I have it for a long time, and as I'm migrating toward Python for my work, much more easy to use - to scipy. In that perspective, I don't want to have a dependency on something that is not already there, and in scipy and in my own C++ code - that works very well ;) -. Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Thu Apr 26 07:23:33 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 26 Apr 2007 20:23:33 +0900 Subject: [SciPy-dev] Which matrix library in C++ for scipy In-Reply-To: References: Message-ID: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> Matthieu Brucher wrote: > > At this point, the best choice for myself has been using > boost::python with > boost::ublas. I also use boost::multi_array. > > > Does scipy depend on Boost ? In that case, I can use it. > I posted a message in the user list some time ago about neighboors > algorithms in Python, and I had a link to BioPython, based on C++. I > could use it, but don't want to install BioPython, and I'm willing to > contribute my own version - I have it for a long time, and as I'm > migrating toward Python for my work, much more easy to use - to scipy. > In that perspective, I don't want to have a dependency on something > that is not already there, and in scipy and in my own C++ code - that > works very well ;) -. scipy does certainly not depend on Boost. Blitz is used in weave, but I don't know if this is mandatory (and it includes the blitz library); I think that actually, a C++ compiler is not even mandatory. Maybe I am missing something, but why the need for a matrix library for basic neighbors algorithm ? David From matthieu.brucher at gmail.com Thu Apr 26 07:43:39 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 26 Apr 2007 13:43:39 +0200 Subject: [SciPy-dev] Which matrix library in C++ for scipy In-Reply-To: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> Message-ID: > > scipy does certainly not depend on Boost. Blitz is used in weave, but I > don't know if this is mandatory (and it includes the blitz library); I > think that actually, a C++ compiler is not even mandatory. 
Maybe I am > missing something, but why the need for a matrix library for basic > neighbors algorithm ? That's right, I do not have the utility for a fully-fludged matrix library, but basic stuff yes. For a neighbooring algorithm, I can use basic computations, norms, ... Yes, I could program them directly, but if there is already something in scipy, no use to do it again ;) Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Thu Apr 26 07:44:19 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 26 Apr 2007 20:44:19 +0900 Subject: [SciPy-dev] Which matrix library in C++ for scipy In-Reply-To: References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> Message-ID: <46309093.5020809@ar.media.kyoto-u.ac.jp> Matthieu Brucher wrote: > > scipy does certainly not depend on Boost. Blitz is used in weave, > but I > don't know if this is mandatory (and it includes the blitz > library); I > think that actually, a C++ compiler is not even mandatory. Maybe I am > missing something, but why the need for a matrix library for basic > neighbors algorithm ? > > > That's right, I do not have the utility for a fully-fludged matrix > library, but basic stuff yes. For a neighbooring algorithm, I can use > basic computations, norms, ... > Yes, I could program them directly, but if there is already something > in scipy, no use to do it again ;) I don't think there is anything like that in scipy. Something which could be useful would be to have a C++ class which reflects a numpy array, for seamless integration between eg boost.python and numpy. But that would be quite a challenge to get it right, I think. David From ndbecker2 at gmail.com Thu Apr 26 08:58:11 2007 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 26 Apr 2007 08:58:11 -0400 Subject: [SciPy-dev] Which matrix library in C++ for scipy References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> <46309093.5020809@ar.media.kyoto-u.ac.jp> Message-ID: David Cournapeau wrote: > Matthieu Brucher wrote: >> >> scipy does certainly not depend on Boost. Blitz is used in weave, >> but I >> don't know if this is mandatory (and it includes the blitz >> library); I >> think that actually, a C++ compiler is not even mandatory. Maybe I am >> missing something, but why the need for a matrix library for basic >> neighbors algorithm ? >> >> >> That's right, I do not have the utility for a fully-fludged matrix >> library, but basic stuff yes. For a neighbooring algorithm, I can use >> basic computations, norms, ... >> Yes, I could program them directly, but if there is already something >> in scipy, no use to do it again ;) > I don't think there is anything like that in scipy. Something which > could be useful would be to have a C++ class which reflects a numpy > array, for seamless integration between eg boost.python and numpy. But > that would be quite a challenge to get it right, I think. > > David There is numerical interface in boost::python. I don't use this approach myself. Here's why. I write all basic algorithms in c++. I try to use modern, generic programming when writing them. There is AFAIK, no reasonable way to interface such code to numerical/numpy. The C interface to numpy is too low-level. IOW, I like writing in c++, and I don't want to have to write code at such a low-level interface as would be needed to interface to numpy. So my approach is: 1. Write c++ algorithms with generic interfaces (where feasible). 2. 
When it is not feasible to use generic container types, I use boost::ublas::{vector/matrix} explicitly. 3. The above c++ code is parametrized (templated) on the container types. 4. Explicit instantiations of (3) are then exposed to python, normally specifying ublas::{vector/matrix} as the input/output types. This doesn't, of course, directly interoperate with numpy. I can, however, convert between numpy arrays and ublas::matrix (which currently requires copying the data, unfortunately). From matthieu.brucher at gmail.com Thu Apr 26 08:59:43 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 26 Apr 2007 14:59:43 +0200 Subject: [SciPy-dev] Which matrix library in C++ for scipy In-Reply-To: <46309093.5020809@ar.media.kyoto-u.ac.jp> References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> <46309093.5020809@ar.media.kyoto-u.ac.jp> Message-ID: > > I don't think there is anything like that in scipy. Something which > could be useful would be to have a C++ class which reflects a numpy > array, for seamless integration between eg boost.python and numpy. But > that would be quite a challenge to get it right, I think. > OK, that confirms what I thought - nothing is in scipy at the moment -. Well, I hope someone will have time and patience to make this bridge in the future ;) Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Thu Apr 26 09:03:47 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 26 Apr 2007 15:03:47 +0200 Subject: [SciPy-dev] Which matrix library in C++ for scipy In-Reply-To: References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> <46309093.5020809@ar.media.kyoto-u.ac.jp> Message-ID: > > There is numerical interface in boost::python. > > I don't use this approach myself. Here's why. I write all basic > algorithms > in c++. I try to use modern, generic programming when writing them. Same for me ;) There is AFAIK, no reasonable way to interface such code to numerical/numpy. > The C interface to numpy is too low-level. IOW, I like writing in c++, > and > I don't want to have to write code at such a low-level interface as would > be needed to interface to numpy. > > So my approach is: > > 1. Write c++ algorithms with generic interfaces (where feasible). > 2. When it is not feasible to use generic container types, I use > boost::ublas::{vector/matrix} explicitly. > 3. The above c++ code is parametrized (templated) on the container types. > 4. Explicit instantiations of (3) are then exposed to python, normally > specifying ublas::{vector/matrix} as the input/output types. > > This doesn't, of course, directly interoperate with numpy. I can, > however, > convert between numpy arrays and ublas::matrix (which currently requires > copying the data, unfortunately). What I'm doing after the last thread I started on numpy (SWIG, boost.pythonor ctypes) is using ctypes, so I have my headers, I make a C/C++ bridge and I use ctypes. If you have a (good) boost.python/whatever example for template instantiation and use in Python then, I'm sure a lot of people will thank you for the rest of your life - and even after ;) - Matthieu -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From david at ar.media.kyoto-u.ac.jp Thu Apr 26 09:07:13 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 26 Apr 2007 22:07:13 +0900 Subject: [SciPy-dev] Which matrix library in C++ for scipy In-Reply-To: References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> <46309093.5020809@ar.media.kyoto-u.ac.jp> Message-ID: <4630A401.3050008@ar.media.kyoto-u.ac.jp> Neal Becker wrote: > David Cournapeau wrote: >>> y, but if there is already something >>> in scipy, no use to do it again ;) >> I don't think there is anything like that in scipy. Something which >> could be useful would be to have a C++ class which reflects a numpy >> array, for seamless integration between eg boost.python and numpy. But >> that would be quite a challenge to get it right, I think. >> >> David > > There is numerical interface in boost::python. > > I don't use this approach myself. Here's why. I write all basic algorithms > in c++. I try to use modern, generic programming when writing them.

Well, I myself try to avoid C++ like the plague :) What I meant was to have a C++ class which reflects the numpy array in a sensible manner. For example, "automatic" memory management (at least as automatic as it can get in C++), having the member functions of the C class reflected in C++. The C api of numpy already defines a class, reflected in python. The job would be to do the same in C++. Eg:

    dtype d("float")
    narray A((5, 4), d), B((4, 5), d)
    narray C()
    C = dot(A, B)

Of course, to do it right is difficult and would require someone who knows both numpy and C++ very well. David

From ndbecker2 at gmail.com Thu Apr 26 09:26:23 2007 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 26 Apr 2007 09:26:23 -0400 Subject: [SciPy-dev] Which matrix library in C++ for scipy References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> <46309093.5020809@ar.media.kyoto-u.ac.jp> Message-ID: Matthieu Brucher wrote: >> >> There is numerical interface in boost::python. >> >> I don't use this approach myself. Here's why. I write all basic >> algorithms >> in c++. I try to use modern, generic programming when writing them. > > > > Same for me ;) > > > There is AFAIK, no reasonable way to interface such code to > numerical/numpy. >> The C interface to numpy is too low-level. IOW, I like writing in c++, >> and >> I don't want to have to write code at such a low-level interface as would >> be needed to interface to numpy. >> >> So my approach is: >> >> 1. Write c++ algorithms with generic interfaces (where feasible). >> 2. When it is not feasible to use generic container types, I use >> boost::ublas::{vector/matrix} explicitly. >> 3. The above c++ code is parametrized (templated) on the container types. >> 4. Explicit instantiations of (3) are then exposed to python, normally >> specifying ublas::{vector/matrix} as the input/output types. >> >> This doesn't, of course, directly interoperate with numpy. I can, >> however, >> convert between numpy arrays and ublas::matrix (which currently requires >> copying the data, unfortunately). > > > What I'm doing after the last thread I started on numpy (SWIG, > boost.python or ctypes) is using ctypes, so I have my headers, I make a > C/C++ bridge and > I use ctypes. > If you have a (good) boost.python/whatever example for template > instantiation and use in Python then, I'm sure a lot of people will thank > you for the rest of your life - and even after ;) - > > Matthieu

I've got loads of examples. Good is subjective. What kind of example would you like?
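While the boost::python example gets settled, the ctypes route Matthieu mentions is easy to sketch on the Python side. Everything below is hypothetical (the shared library, its name, and the C signature), but it shows the shape of such a bridge:

    # assumed C entry point, compiled into libbridge.so:
    #     double sum_sq(const double *x, int n);
    import ctypes
    import numpy as np
    from numpy.ctypeslib import ndpointer

    lib = ctypes.CDLL('./libbridge.so')       # hypothetical compiled C/C++ bridge
    lib.sum_sq.restype = ctypes.c_double
    lib.sum_sq.argtypes = [ndpointer(dtype=np.float64, flags='C_CONTIGUOUS'),
                           ctypes.c_int]

    x = np.arange(5, dtype=np.float64)
    print(lib.sum_sq(x, x.size))              # the C side reads x's buffer directly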
From fperez.net at gmail.com Thu Apr 26 09:36:27 2007 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 26 Apr 2007 07:36:27 -0600 Subject: [SciPy-dev] Which matrix library in C++ for scipy In-Reply-To: References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> <46309093.5020809@ar.media.kyoto-u.ac.jp> Message-ID: On 4/26/07, Neal Becker wrote: > So my approach is: > > 1. Write c++ algorithms with generic interfaces (where feasible). > 2. When it is not feasible to use generic container types, I use > boost::ublas::{vector/matrix} explicitly. > 3. The above c++ code is parametrized (templated) on the container types. > 4. Explicit instantiations of (3) are then exposed to python, normally > specifying ublas::{vector/matrix} as the input/output types. > > This doesn't, of course, directly interoperate with numpy. I can, however, > convert between numpy arrays and ublas::matrix (which currently requires > copying the data, unfortunately). I'm curious (being a rather primitive C++ user) as to why you don't like/use/prefer Blitz++ for this particular use? Blitz arrays are fairly numpy-like in much of their behavior, and one can be instantiated out of a numpy array with minimal cost (copying only the striding info, not the actual data). That's what weave uses both for weave.blitz and for weave.inline when type_converters=blitz is passed. Thanks, f From ndbecker2 at gmail.com Thu Apr 26 09:44:43 2007 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 26 Apr 2007 09:44:43 -0400 Subject: [SciPy-dev] Which matrix library in C++ for scipy References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> <46309093.5020809@ar.media.kyoto-u.ac.jp> Message-ID: Fernando Perez wrote: > On 4/26/07, Neal Becker wrote: > >> So my approach is: >> >> 1. Write c++ algorithms with generic interfaces (where feasible). >> 2. When it is not feasible to use generic container types, I use >> boost::ublas::{vector/matrix} explicitly. >> 3. The above c++ code is parametrized (templated) on the container types. >> 4. Explicit instantiations of (3) are then exposed to python, normally >> specifying ublas::{vector/matrix} as the input/output types. >> >> This doesn't, of course, directly interoperate with numpy. I can, >> however, convert between numpy arrays and ublas::matrix (which currently >> requires copying the data, unfortunately). > > I'm curious (being a rather primitive C++ user) as to why you don't > like/use/prefer Blitz++ for this particular use? Blitz arrays are > fairly numpy-like in much of their behavior, and one can be > instantiated out of a numpy array with minimal cost (copying only the > striding info, not the actual data). That's what weave uses both for > weave.blitz and for weave.inline when type_converters=blitz is passed. > > Thanks, > > f I did try to evaluate this some time back, and don't recall the reasoning - but it may be because blitz++ development appears to have stopped, and I don't want to invest in something that is going to die. The last release was Oct 2005. From matthieu.brucher at gmail.com Thu Apr 26 09:58:43 2007 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 26 Apr 2007 15:58:43 +0200 Subject: [SciPy-dev] Which matrix library in C++ for scipy In-Reply-To: References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> <46309093.5020809@ar.media.kyoto-u.ac.jp> Message-ID: > > I've got loads of examples. Good is subjective. What kind of example > would you like?
> For instance a simple example with an array in input and another one in output :) Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Thu Apr 26 10:06:04 2007 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 26 Apr 2007 08:06:04 -0600 Subject: [SciPy-dev] Which matrix library in C++ for scipy In-Reply-To: References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> <46309093.5020809@ar.media.kyoto-u.ac.jp> Message-ID: On 4/26/07, Neal Becker wrote: > Fernando Perez wrote: > > I'm curious (being a rather primitive C++ user) as to why you don't > > like/use/prefer Blitz++ for this particular use? Blitz arrays are > > fairly numpy-like in much of their behavior, and one can be > > instantiated out of a numpy array with minimal cost (copying only the > > striding info, not the actual data). That's what weave uses both for > > weave.blitz and for weave.inline when type_converters=blitz is passed. > > > > Thanks, > > > > f > > I did try to evaluate this some time back, and don't recall the reasoning - > but it may be because blitz++ development appears to have stopped, and I > don't want to invest in something that is going to die. The last release > was Oct 2005. Fair enough. From a quick scan of their ML archives (I haven't been subscribed for a long time) it seems that there's still /some/ activity, but Blitz has indeed suffered for a long time from lack of solid development. Julian Cummings --the current maintainer-- does his best, but I have the feeling that this is a project that is mostly love-and-spare-time for him, so it understandably gets a small time slice allocation. It's unfortunate, I think, given how well some aspects of numpy arrays map to Blitz ones (esp. the no-copy part). It looks like an opportunity for a good C++ programmer looking for an interesting project to adopt and re-energize. Cheers, f From ndbecker2 at gmail.com Thu Apr 26 10:09:31 2007 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 26 Apr 2007 10:09:31 -0400 Subject: [SciPy-dev] Which matrix library in C++ for scipy References: <46308BB5.1070009@ar.media.kyoto-u.ac.jp> <46309093.5020809@ar.media.kyoto-u.ac.jp> Message-ID: Matthieu Brucher wrote: >> >> I've got loads of examples. Good is subjective. What kind of example >> would you like? >> > > For instance a simple example with an array in input and another one in > output :) > > Matthieu boost::python can expose classes as well as functions, but since you are only asking for a (lowly) function, this is a simple example.
First, generic c++ code for computing a difference of 1 sample:

template <typename scalar_t>
inline std::complex<scalar_t> diff_demod (std::complex<scalar_t> z1, std::complex<scalar_t> z2, scalar_t epsilon) {
  return z2 * std::conj (Limit2<std::complex<scalar_t> > (z1, epsilon));
}

Now code which takes 2 input containers and applies the above, returning an output container:

template <typename out_t, typename in1_t, typename in2_t, typename scalar_t>
inline out_t diff_demod2 (in1_t const& in1, in2_t const& in2, scalar_t epsilon) {
  if (boost::size (in1) != boost::size (in2))
    throw std::runtime_error ("diff_demod size mismatch");
  out_t out (boost::size (in1));
  typename boost::range_const_iterator<in1_t>::type i1 = boost::begin(in1);
  typename boost::range_const_iterator<in2_t>::type i2 = boost::begin(in2);
  typename boost::range_iterator<out_t>::type o = boost::begin(out);
  for (; i1 != boost::end (in1); ++i1, ++i2, ++o)
    *o = diff_demod (*i1, *i2, epsilon);
  return out;
}

Another variant:

template <typename out_t, typename in_t, typename scalar_t>
inline out_t diff_demod1 (in_t const& in, scalar_t epsilon) {
  out_t out (boost::size (in));
  typename boost::range_const_iterator<in_t>::type i = boost::begin(in);
  typename boost::range_iterator<out_t>::type o = boost::begin(out);
  typename boost::range_value<in_t>::type prev = 0;
  for (; i != boost::end (in); ++i, ++o) {
    *o = diff_demod (prev, *i, epsilon);
    prev = *i;
  }
  return out;
}

Now expose this to python:

BOOST_PYTHON_MODULE(limit)
{
  def ("Limit", &Compute<std::complex<double>, ublas::vector<std::complex<double> >, double>,
       (arg ("in"), arg ("epsilon")=1e-6));
  def ("DiffDemod2", &diff_demod2<ublas::vector<std::complex<double> >, ublas::vector<std::complex<double> >, ublas::vector<std::complex<double> >, double>,
       (arg ("in1"), arg ("in2"), arg ("epsilon")=1e-6));
  def ("DiffDemod1", &diff_demod1<ublas::vector<std::complex<double> >, ublas::vector<std::complex<double> >, double>,
       (arg ("in"), arg ("epsilon")=1e-6));
}

From david.huard at gmail.com Thu Apr 26 10:31:40 2007 From: david.huard at gmail.com (David Huard) Date: Thu, 26 Apr 2007 10:31:40 -0400 Subject: [SciPy-dev] Status of the doc format for scipy code ? In-Reply-To: <46307009.3030307@ar.media.kyoto-u.ac.jp> References: <462C69E5.9070905@ar.media.kyoto-u.ac.jp> <20070423090915.GM6933@mentat.za.net> <462C8606.9090004@ar.media.kyoto-u.ac.jp> <20070423202409.GQ6933@mentat.za.net> <91cf711d0704240608l38592dcbg3fd00cb3021d3b83@mail.gmail.com> <462E01B0.6020609@ar.media.kyoto-u.ac.jp> <91cf711d0704240640w789d34f1m1c2c937750f6b232@mail.gmail.com> <46307009.3030307@ar.media.kyoto-u.ac.jp> Message-ID: <91cf711d0704260731q6ab61bacvdcaa6a1ade63a5bf@mail.gmail.com> epydoc orders the elements of the docstring according to a hard-coded list. Here is a small patch to epydoc that lets you insert examples as

:Example:
    an example here

    >>> trythis(a)
    [3,4,5]

There is certainly a better way to achieve this but I'm still not aware of it. HTH, David 2007/4/26, David Cournapeau : > > David Huard wrote: > > Here is a patch to the latest docutils svn implementing the math role > > from Jens. > > > > You can look at docutils/sandbox/jens/latex-math/test/test.txt for an > > example. Note that I changed the name of the role from latex-math to > > math, so you'll have to replace occurrences of latex-math by math. > > Then run > > > > rst2latex test.txt | pdflatex > Thanks, this is working great ! I have another question regarding > epydoc's usage. First, when I have an Examples section in my docstring, > it is put before :Parameters: in the output html, even if I put the > example section after in the docstring. Why is that ? > > David > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: example.patch Type: text/x-patch Size: 3920 bytes Desc: not available URL:
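
The section-ordering problem David Huard describes is easiest to see on a concrete docstring. The function below is invented for illustration (it just echoes the trythis example from his message); the point is only the relative order of the :Parameters: and :Example: sections, which epydoc rearranges:

def trythis(a):
    """Add 2 to each element of `a`.

    :Parameters:
        a : array_like
            Input sequence.

    :Example:
        an example here

        >>> trythis([1, 2, 3])
        [3, 4, 5]
    """
    return [x + 2 for x in a]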
From david at ar.media.kyoto-u.ac.jp Fri Apr 27 06:43:52 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 27 Apr 2007 19:43:52 +0900 Subject: [SciPy-dev] Cleaning and fixing fft in scipy ? Message-ID: <4631D3E8.30400@ar.media.kyoto-u.ac.jp> Hi, When fixing the fft problem with prime numbers in numpy, I came again across the fft implementation in scipy, and related problems (suboptimal solution for fftw, see eg scipy ticket #1). I would like to improve the situation: - First, I think the module needs serious cleaning, as for now, it is a bunch of C files with many #define all across the code, making it difficult to track things. I propose to split the sources by implementation (fft_fftw3.c, fft_fftw2.c, fft_mkl.c, etc...); this can be done without any consequence on the implementation. - Then, improving fft where it is needed. Does this sound ok to scipy developers ? I should come up with a patch for zfft within the day; even if I try to keep changes minimal, I would need people willing to test (I can easily test on linux with fftw2 and 3, and if necessary with mkl on the same platform. I would prefer avoiding testing on windows myself, as I have only a Japanese windows, making things extremely painful for me on this already painful platform). cheers, David From robert.kern at gmail.com Fri Apr 27 12:35:02 2007 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 27 Apr 2007 11:35:02 -0500 Subject: [SciPy-dev] Cleaning and fixing fft in scipy ? In-Reply-To: <4631D3E8.30400@ar.media.kyoto-u.ac.jp> References: <4631D3E8.30400@ar.media.kyoto-u.ac.jp> Message-ID: <46322636.6000907@gmail.com> David Cournapeau wrote: > Hi, > > When fixing the fft problem with prime numbers in numpy, I came again > across the fft implementation in scipy, and related problems (suboptimal > solution for fftw, see eg scipy ticket #1). I would like to improve the > situation: > - First, I think the module needs serious cleaning, as for now, it > is a bunch of C files with many #define all across the code, making it > difficult to track things. I propose to split the sources by > implementation (fft_fftw3.c, fft_fftw2.c, fft_mkl.c, etc...); this can be > done without any consequence on the implementation. > - Then, improving fft where it is needed. > > Does this sound ok to scipy developers ? Go for it. Thank you! -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From oliphant.travis at ieee.org Fri Apr 27 14:54:13 2007 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri, 27 Apr 2007 12:54:13 -0600 Subject: [SciPy-dev] Cleaning and fixing fft in scipy ? In-Reply-To: <4631D3E8.30400@ar.media.kyoto-u.ac.jp> References: <4631D3E8.30400@ar.media.kyoto-u.ac.jp> Message-ID: <463246D5.6040303@ieee.org> David Cournapeau wrote: > Hi, > > When fixing the fft problem with prime numbers in numpy, I came again > across the fft implementation in scipy, and related problems (suboptimal > solution for fftw, see eg scipy ticket #1). I would like to improve the > situation: > - First, I think the module needs serious cleaning, as for now, it > is a bunch of C files with many #define all across the code, making it > difficult to track things.
Could you be more clear about where the problem in your eyes lies? There are multiple sources for the fft (original fftpack files + interface files to other fft libraries if the user has those installed). The ifdefs that I could find are just in the interface files to the "other" fft libraries that a user might have installed. I don't care if that is redone so that the setup.py file just uses different sources as opposed to defining pre-processor variables, but you might check with Pearu since he is the author of those interfaces. But, I also don't really see the problem with the way it is done now. > I propose to split the sources by > implementation (fft_fftw3.c, fft_fftw2.c, fft_mkl.c, etc...); this can be > done without any consequence on the implementation. > - Then, improving fft where it is needed. > Please indicate what improvements you will be making. The fft's are performed by external libraries; I'm hesitant to start altering what those libraries are doing without clear justification. If you want to improve the interface to the optional fftw and mkl libraries that is one thing, but just saying you are going to "fix the fft implementation" is not reassuring. -Travis From oliphant.travis at ieee.org Fri Apr 27 15:04:36 2007 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri, 27 Apr 2007 13:04:36 -0600 Subject: [SciPy-dev] Cleaning and fixing fft in scipy ? In-Reply-To: <463246D5.6040303@ieee.org> References: <4631D3E8.30400@ar.media.kyoto-u.ac.jp> <463246D5.6040303@ieee.org> Message-ID: <46324944.7010208@ieee.org> Travis Oliphant wrote: > Could you be more clear about where the problem in your eyes lies? > There are multiple sources for the fft (original fftpack files + > interface files to other fft libraries if the user has those installed).
>> >> The ifdefs that I could find are just in the interface files to the >> "other" fft libraries that a user might have installed. I don't care if >> that is redone so that the setup.py file just uses different sources as >> opposed to defining pre-processor variables, but you might check with >> Pearu since he is the author of those interfaces. >> >> But, I also don't really see the problem with the way it is done now. >> > > I don't mean to discourage effort here. I'm just hesitant to start > changing things without some kind of feedback from the original author > (Pearu in this case). I agree, and that's exactly why I sent this email. > > I just want to make sure he is aware of and agrees with changes that are > being made. I know Pearu spent a fair bit of time creating the current > fft interfaces. They may not be perfect but they are quite flexible and > allow people to use fft's from multiple libraries. I don't intend to change anything in the API. The reasoning is that scipy's implementation of fft is suboptimal when using fftw3: it was really slow for some time, and I applied a quick fix, which is neither efficient nor elegant (see ticket #1 of scipy trac: http://projects.scipy.org/scipy/scipy/ticket/1). This patch took me a long time to get right because of the way the fft sources are organized. I understand that style is a matter of preferences more than anything else, and I am certainly not a reference, but I really think that with the following kind of code, it is really difficult to understand what's going on:

int i;
complex_double *ptr = inout;
#ifndef WITH_MKL
#if defined(WITH_FFTW)
fftw_plan plan = NULL;
#endif
#endif
#if defined WITH_MKL
DFTI_DESCRIPTOR_HANDLE desc_handle;
#else
double* wsave = NULL;
#endif
#ifdef WITH_FFTWORK
coef_dbl* coef = NULL;
#endif
#ifndef WITH_MKL
#ifdef WITH_DJBFFT
int j;
complex_double *ptrc = NULL;
unsigned int *f = NULL;
#endif
#endif
#ifdef WITH_FFTWORK
if (ispow2le2e30(n)) {
    i = get_cache_id_zfftwork(n);
    coef = caches_zfftwork[i].coef;
} else
#endif
#ifndef WITH_MKL
#ifdef WITH_DJBFFT
switch (n) {
case 2:;case 4:;case 8:;case 16:;case 32:;case 64:;case 128:;case 256:;
case 512:;case 1024:;case 2048:;case 4096:;case 8192:
    i = get_cache_id_zdjbfft(n);
    f = caches_zdjbfft[i].f;
    ptrc = (complex_double*)caches_zdjbfft[i].ptr;
}
if (f==0)
#endif
#endif
#ifdef WITH_MKL
desc_handle = caches_zmklfft[get_cache_id_zmklfft(n)].desc_handle;
#elif defined WITH_FFTW
plan = caches_zfftw[get_cache_id_zfftw(n,direction)].plan;
#else
wsave = caches_zfftpack[get_cache_id_zfftpack(n)].wsave;
#endif

I think it is much more readable to have one file with (fftw3)

"""
complex_double *ptr = inout;
fftw_complex *ptrm = NULL;
fftw_plan plan = NULL;
int i;

plan = caches_zfftw[get_cache_id_zfftw(n, dir)].plan;
"""

One with (fftw case)

"""
int i;
complex_double *ptr = inout;
fftw_plan plan = NULL;

plan = caches_zfftw[get_cache_id_zfftw(n, direction)].plan;
"""

Etc... I guess the #ifdef solution was OK with one or two implementations, but now, I fail to see how anyone can understand what the function does, which variables it is using, etc... To improve the fftw3 implementation, I think we must have a better caching system, and I don't see how I could do that with the way the source is organized right now. > > Most of the fft behavior comes from those multiple libraries. So, I > just want to be clear about which fft implementation and interface is > being modified and exactly what the problems are.
No interface would be changed; actually, I wouldn't be surprised if the compiled object code were exactly the same with my suggestion for cleaning up (To be sure that I don't introduce bugs, I actually compare each new file with the preprocessed original file with gcc -E). David From pearu at cens.ioc.ee Sat Apr 28 06:06:42 2007 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Sat, 28 Apr 2007 13:06:42 +0300 (EEST) Subject: [SciPy-dev] Cleaning and fixing fft in scipy ? In-Reply-To: <4632F4CD.5010302@ar.media.kyoto-u.ac.jp> References: <4631D3E8.30400@ar.media.kyoto-u.ac.jp> <463246D5.6040303@ieee.org> <46324944.7010208@ieee.org> <4632F4CD.5010302@ar.media.kyoto-u.ac.jp> Message-ID: <58658.84.50.128.236.1177754802.squirrel@cens.ioc.ee> On Sat, April 28, 2007 10:16 am, David Cournapeau wrote: > Travis Oliphant wrote: >> I don't mean to discourage effort here. I'm just hesitant to start >> changing things without some kind of feedback from the original author >> (Pearu in this case). > I agree, and that's exactly why I sent this email. I certainly agree that the scipy.fftpack implementation can be improved (as any piece of software). The current implementation was an attempt to support a large number of different fft implementations (they all have their own pros-and-cons) that had (sometimes very) different APIs. And hence the complexity of the current implementation. If I were to start rewriting scipy.fftpack, then I would probably choose a different approach to the current one too. However, I doubt that any other implementation with the same goal would be easy for non-authors to understand or read, due to the variety of APIs of the different fft implementations. In summary, I am not against rewriting scipy.fftpack provided that
1) it is carried out in scipy.sandbox until it becomes more-or-less equivalent to the current scipy.fftpack in terms of features, unit-testing, supported fft backends, and documentation.
2) there are performance tests demonstrating that the new implementation outperforms the old one at least with the following backends: Fortran fftpack (the default for most users), fftw2, and fftw3.
Finally, I would also like to note that the efficiency of scipy.fftpack caching should be tested with both short sequences (N=64,...,512) and long sequences (N>1024), as the same scheme could be very dependent on the sequence size as far as performance is concerned. Best regards, Pearu From pearu at cens.ioc.ee Sat Apr 28 06:14:16 2007 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Sat, 28 Apr 2007 13:14:16 +0300 (EEST) Subject: [SciPy-dev] Cleaning and fixing fft in scipy ? In-Reply-To: <58658.84.50.128.236.1177754802.squirrel@cens.ioc.ee> References: <4631D3E8.30400@ar.media.kyoto-u.ac.jp> <463246D5.6040303@ieee.org> <46324944.7010208@ieee.org> <4632F4CD.5010302@ar.media.kyoto-u.ac.jp> <58658.84.50.128.236.1177754802.squirrel@cens.ioc.ee> Message-ID: <58855.84.50.128.236.1177755256.squirrel@cens.ioc.ee> On Sat, April 28, 2007 1:06 pm, Pearu Peterson wrote: > In summary, I am not against rewriting scipy.fftpack provided that > 1) it is carried out in scipy.sandbox until it becomes more-or-less > equivalent to the current scipy.fftpack in terms of features, > unit-testing, supported fft backends, and documentation. I now reread your original proposal and I think that this condition can be relaxed provided that after applying any patches to scipy.fftpack all fftpack unittests pass ok.

Pearu
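
Pearu's request for performance tests over both short and long transforms might look something like the following sketch; the sizes and repeat counts are arbitrary choices for illustration, not scipy's actual benchmark code:

import timeit

# Time scipy.fftpack.fft for short (64..512) and long (>1024) sizes.
# Taking min() over repeats gives the least noisy estimate; 1000 calls
# taking t seconds works out to t milliseconds per call.
for n in (64, 256, 512, 2048, 8192):
    setup = ("from scipy.fftpack import fft\n"
             "from numpy.random import rand\n"
             "x = rand(%d) + 1j*rand(%d)" % (n, n))
    best = min(timeit.Timer("fft(x)", setup).repeat(repeat=3, number=1000))
    print "n = %5d: %.3f ms per fft" % (n, best)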
From david at ar.media.kyoto-u.ac.jp Sat Apr 28 06:52:18 2007 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 28 Apr 2007 19:52:18 +0900 Subject: [SciPy-dev] Cleaning and fixing fft in scipy ? In-Reply-To: <58855.84.50.128.236.1177755256.squirrel@cens.ioc.ee> References: <4631D3E8.30400@ar.media.kyoto-u.ac.jp> <463246D5.6040303@ieee.org> <46324944.7010208@ieee.org> <4632F4CD.5010302@ar.media.kyoto-u.ac.jp> <58658.84.50.128.236.1177754802.squirrel@cens.ioc.ee> <58855.84.50.128.236.1177755256.squirrel@cens.ioc.ee> Message-ID: <46332762.10701@ar.media.kyoto-u.ac.jp> Pearu Peterson wrote: > On Sat, April 28, 2007 1:06 pm, Pearu Peterson wrote: > >> In summary, I am not against rewriting scipy.fftpack provided that >> 1) it is carried out in scipy.sandbox until it becomes more-or-less >> equivalent to the current scipy.fftpack in terms of features, >> unit-testing, supported fft backends, and documentation. > > I now reread your original proposal and I think that this condition > can be relaxed provided that after applying any patches to > scipy.fftpack all fftpack unittests pass ok. Thanks for your email, Pearu. I would certainly not submit a patch which does not pass the unit tests; the problem being that I cannot test all implementations (I can test fftpack/fftw2/fftw3 on Linux easily; I cannot get djbfft to compile on my machine; if necessary, I can install the MKL; I don't know whether OS and compiler matter a lot for fftpack, as I don't intend to touch the setup part). I think that I was not really clear in my former emails: I do not suggest a rewrite, but a two-step improvement:
- one which does not change anything in the code except its organization (one file per fft API instead of the #ifdefs)
- once the first step is done and considered acceptable by the scipy developers, I would like to improve fftw3, so that it is at least on par with fftw2: right now, for fftw3, the arrays are copied twice.
I think > fftw3 requires a more sophisticated caching scheme, because with fftw3 > you cannot use the same plan when the input/output are changed, at least > with the basic API. For example, Octave is using such a scheme. Other > implementations would be untouched. I am ok with both steps. Let us know when the first step is complete so that we can test that the interface is working for different backends. Pearu From a.schmolck at gmx.net Mon Apr 30 04:43:54 2007 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Mon, 30 Apr 2007 09:43:54 +0100 Subject: [SciPy-dev] mlabwrap high-level user interface In-Reply-To: <1e2af89e0704300058k3ee12257j1be5399bcc15639d@mail.gmail.com> (Matthew Brett's message of "Mon\, 30 Apr 2007 08\:58\:43 +0100") References: <796269930704300050k272ba290t30d60098a9a72b3a@mail.gmail.com> <1e2af89e0704300058k3ee12257j1be5399bcc15639d@mail.gmail.com> Message-ID: "Matthew Brett" writes: > Hi guys, > > I like the interface. We might have to think of a different module > name though, because matplotlib has a 'mlab' submodule of the same > name, for matlab compatibility functions, I don't think we need to worry too much about this -- IMO using a plotting package like matplotlib to obtain matlab-like functions (rather than numpy, scipy or whatever) is pretty yucky anyway and people can easily use either an ``import as`` or a a fully qualified name (``matplotlib.mlab`` or ``mlabwrap.mlab``). alex From ondrej at certik.cz Mon Apr 30 05:07:14 2007 From: ondrej at certik.cz (Ondrej Certik) Date: Mon, 30 Apr 2007 11:07:14 +0200 Subject: [SciPy-dev] patch implementing broyden schemes in SciPy Message-ID: <85b5c3130704300207pe900d9areeecdbb3e930c4d3@mail.gmail.com> Hi, I submitted a patch for Broyden schemes together with tests: http://projects.scipy.org/scipy/scipy/ticket/402 My email describing it (together with a question) is here: http://projects.scipy.org/pipermail/scipy-dev/2007-April/006979.html I won't have time for it this week and also the patch is for the svn version from a week ago, but if some of you would like to look at that, just let me know and I'll update it to the top svn. I will be glad to help the code meet SciPy's standards - just let me know if I should change anything. Thanks very much, Ondrej Certik From a.schmolck at gmx.net Mon Apr 30 05:57:36 2007 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Mon, 30 Apr 2007 10:57:36 +0100 Subject: [SciPy-dev] mlab namespace thought References: <1e2af89e0704211119x66ea9258i74868484c2ca62e@mail.gmail.com> <1e2af89e0704222326mf00c14ayec7a3d2e9ce0a0e1@mail.gmail.com> Message-ID: Hi Matthew, [I think this is likely to be of interest to other mlabwrap developers as well, so I hope it's OK if I move this to scipy-dev, and re-arrange to bottom-posting style. Sorry it took me some time to get back on this -- I've taken longer than expected to readjust to my European day-cycle] >> "Matthew Brett" writes: >> > Hi, >> > >> > I was just thinking about the problem of the kludginess of >> > >> >>>> from mlabwrap import mlab >> >>>> import mlabraw >> >>>> A, B, C = mlab.svd([[1,2],[1,3]], 0, nout=3) >> >>>> pymat.put(mlab._session, "X", [[1,2], [1,3]]) >> > >> > It seems to me the problem is the fact that the mlab namespace has to >> > be kept free of anything that could be a matlab function. 
>> > >> > So how about: >> > >> > import mlab >> > from mlab import exec as mle >> > >> > A, B, C = mle.svd([[1,2],[1,3]], 0, nout=3) >> > mlab.put(X, [[1,2], [1,3]]) > On 4/23/07, Alexander Schmolck wrote: >> >> Hi Matthew, >> >> why not just use >> >> mlab._set('X', [[1,2], [1,3]]) >> >> :) >> >> alex >> > >> > Best, "Matthew Brett" writes: > Yer - sorry - I forgot that your're right. :) > > But, in general, you've got the problem that any new method of > attribute in the mlab module has to be preprended with an underscore, > which is a bit awkward - as you know, the underscore is in general a > semi-formal clue to the programmer that the feature is private and > should not be used outside the module itself. I speak only for myself, > but I have the instinct to think hard before I use an underscore > function or attribute. And it makes it difficult for you to indicate > which functions etc are really private and which are not. Agreed -- there's still double-underscore, but that has name-mangling implications; OTOH > I realize this would mean an API change in mlabwrap... I can definitely see where you're coming from -- It is true that underscore method names carry a strong connotation with "messing with internals", but there are two considerations why I prefer the current scheme to having several objects: 1. There's a good technical reason why there's a single ``mlab`` that mediates access to a matlab session, rather than several handles for different purposes (``mle`` etc.): mlabwrap supports multiple matlab sessions and each such session is encapsulated in a MlabWrap class instance -- ``mlab`` is just the 'default' session that gets created as one imports mlabwrap. If one uses different objects to interface to a single session, synchronization issues can arise (e.g. one's ``mle`` might in fact accidentally refer to a different session than one's ``mlab``). 2. Finally, getting and setting variables, *is*, to my mind "messing with internals" -- the interface metaphor of mlabwrap is that matlab is a python library -- you call functions in the mlab "module" just like in any other module and the things you pass in and get out are python objects (possibly proxying matlab objects). Setting named variables in matlab-space is lower-level than that The main use case I see for it is querying and setting global variables that control the behavior of a matlab package (or matlab itself). Whilst I'd be happy to give greater prominence to ``mlab._set`` and ``mlab._get`` in the documentation (and I think de-emphasizing mlabraw might also be a good idea) or reserve the dictionary access notation for this use as Brian suggested, my feeling is that it is not that often needed. If you have some other, non-marginal use case it would really be good if you could send me some examples. I think that it'd be feasible to think up a more "natural" way to handle such variable setting (e.g. one could possibly make ``mlab.some_global_var = 3`` work; but there are some difficulties associated with doing that; e.g. the syntactic equivalence between nullary function call and variable access in matlab; see my other post or scipy.org/MlabWrap) if there is a demonstrated need. More on this in my replies to Brian. cheers, alex From jtravs at gmail.com Mon Apr 30 07:17:07 2007 From: jtravs at gmail.com (John Travers) Date: Mon, 30 Apr 2007 12:17:07 +0100 Subject: [SciPy-dev] Cleaning and fixing fft in scipy ? 
In-Reply-To: <46332762.10701@ar.media.kyoto-u.ac.jp> References: <4631D3E8.30400@ar.media.kyoto-u.ac.jp> <463246D5.6040303@ieee.org> <46324944.7010208@ieee.org> <4632F4CD.5010302@ar.media.kyoto-u.ac.jp> <58658.84.50.128.236.1177754802.squirrel@cens.ioc.ee> <58855.84.50.128.236.1177755256.squirrel@cens.ioc.ee> <46332762.10701@ar.media.kyoto-u.ac.jp> Message-ID: <3a1077e70704300417p5b65036eh2cc099cc5b43a708@mail.gmail.com> On 28/04/07, David Cournapeau wrote: > Thanks for your email, Pearu. I would certainly not submit a patch which > do not pass the unittest; problem being I cannot test all > implementations (I can test fftpack/fftw2/fftw3 on Linux easily; I > cannot get djbfft to compile on my machine; if necessary, I can install > the MKL; I don't know whether OS and compiler matters a lot for fftpack, > as I don't intend to touch the setup part). I can easily test the MKL version. I wrote the original patch for the MKL support and agree some improvements to the source would be nice. In fact MKL is only currently supported for complex ffts so I might get round to extending that at some point after you have done your improvements. Cheers, John From nwagner at iam.uni-stuttgart.de Mon Apr 30 08:32:03 2007 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Mon, 30 Apr 2007 14:32:03 +0200 Subject: [SciPy-dev] mlabwrap high-level user interface In-Reply-To: References: <796269930704300050k272ba290t30d60098a9a72b3a@mail.gmail.com> <1e2af89e0704300058k3ee12257j1be5399bcc15639d@mail.gmail.com> Message-ID: <4635E1C3.1080100@iam.uni-stuttgart.de> Alexander Schmolck wrote: > "Matthew Brett" writes: > > >> Hi guys, >> >> I like the interface. We might have to think of a different module >> name though, because matplotlib has a 'mlab' submodule of the same >> name, for matlab compatibility functions, >> > > I don't think we need to worry too much about this -- IMO using a plotting > package like matplotlib to obtain matlab-like functions (rather than numpy, > scipy or whatever) is pretty yucky anyway and people can easily use either an > ``import as`` or a a fully qualified name (``matplotlib.mlab`` or > ``mlabwrap.mlab``). > > alex > _______________________________________________ > Scipy-dev mailing list > Scipy-dev at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-dev > BTW I cannot get access to mlabwrap via svn. (as described http://www.scipy.org/MlabWrap) Is it a temporary problem ? svn co http://scipy.org/svn/scikits/trunk/mlabwrab/ svn: URL 'http://scipy.org/svn/scikits/trunk/mlabwrap' doesn't exist Nils From a.schmolck at gmx.net Mon Apr 30 08:47:25 2007 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Mon, 30 Apr 2007 13:47:25 +0100 Subject: [SciPy-dev] mlabwrap high-level user interface In-Reply-To: <4635E1C3.1080100@iam.uni-stuttgart.de> (Nils Wagner's message of "Mon\, 30 Apr 2007 14\:32\:03 +0200") References: <796269930704300050k272ba290t30d60098a9a72b3a@mail.gmail.com> <1e2af89e0704300058k3ee12257j1be5399bcc15639d@mail.gmail.com> <4635E1C3.1080100@iam.uni-stuttgart.de> Message-ID: Nils Wagner writes: > BTW I cannot get access to mlabwrap via svn. > (as described http://www.scipy.org/MlabWrap) > > Is it a temporary problem ? Sort of -- the repository is temporarily located on a server of Berkeley university (since we needed some svn server were we had root access when we migrated from CVS), but it should move to quite soon. 
I think Jarrod Millman (who set up the server) now also has admin rights to the scipy.org/scikits svn and should transfer the server pretty soon; he's mentioned to me in an email that he's very busy right now, which might cause additional delay -- but either I or Jarrod will write a short note to this list, once the move is accomplished. cheers, 'as From a.schmolck at gmx.net Mon Apr 30 08:52:28 2007 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Mon, 30 Apr 2007 13:52:28 +0100 Subject: [SciPy-dev] mlabwrap high-level user interface In-Reply-To: <796269930704300050k272ba290t30d60098a9a72b3a@mail.gmail.com> (Brian Hawthorne's message of "Mon\, 30 Apr 2007 00\:50\:17 -0700") References: <796269930704300050k272ba290t30d60098a9a72b3a@mail.gmail.com> Message-ID: [Moving this to scipy-dev] "Brian Hawthorne" writes: > Hello, > I'm hoping to get feedback from you all on some suggestions for what a > cleaner mlabwrap API might look like. > >>>> from scikits.mlab import engine # engine is an instance of MlabEngine > (or simply Engine) This is simply about renaming scikits.mlabwrap to scikits.mlab, MlabWrap to MlabEngine and mlab to engine, right? I agree that mlabwrap is not necessarily the most elegant name (pymat is better, but was already taken), but I'm not sure I'd prefer engine to mlab -- it's more generic and more to type; it's also slightly the wrong emphasis for my taste (I don't want people to focus on the fact that they're doing something that involves using the matlab engine; the abstraction level offered by mlabwrap is above that of matlab's engine protocol). Anyway, my suggestion is that we first deal with the meat (sorting out hybrid proxying, scikits and testing infrastructure etc.) and worry about cosmetic issues like renamings afterwards -- I don't think naming issues are unimportant but they can be dealt with once the functionality we want is there; conversely changes in the core design can render cosmetic decisions obsolete. >>>> engine.X = [[1,2,3],[4,5,6],[7,8,9]] # raw put, implement __setattr__ >>>> engine["X"] = [[1,2,3],[4,5,6],[7,8,9]] # raw put, implement __setitem__ >>>> X = engine.X # raw get, implement __getattr__ >>>> X = engine["X"] # raw get, implement __getitem__ Do I understand you correctly that you want 1&2 and 3&4 to be equivalent (javscript-style)? If so I'm against it -- in python there's ideally one and only one obvious way. I propose using ``.X`` for function calls and ``.["X"]`` for variable access (yup, there *is* a reason why I think one wants to seperate these, see my reply to your other post). >>>> Y = engine("matlab code") # raw eval, implement __call__ Using ``engine["x"]`` viz. ``mlab["x"]`` for variable access and ``mlab("do something")`` for raw evaluation looks pretty attractive to me. The only reason I can see why one might want to reserve ``__call__`` is that it could also mkae for a convenient customization syntax, e.g:: mlab(flatten_row_vecs=True).sin([1,2,3]) which might be more important than having a shorter way to spell ``mlab._do("do something")``. With python2.5 one could also use the with syntax for that, but the syntax above could still be slightly more convenient for single calls with non-default options. 
>>>> a = engine.some_matlab_object_with_nested_attributes # return an > ObjectProxy >>>> a.b.c # in matlab, call (a.subsref("b")).subsref("c") >>>> a["b.c"] # bypas "normal" indexing to call a.subsref("b.c") > > Regarding packaging, the code currently in _mlabwrap.py could move into the > __init__.py file (since it's imported into there anyway), then we could > delete > _mlabwrap.py. _mlabwrap.py is there for a purpose, albeit a fairly egotistical one: it makes switching buffers in emacs easier (i.e. generic names like __init__.py are inconvenient if you got a UI that lets you switch to the desired buffer (viz opened file for non-emacs users) by entering some unique substring). I'd be happy to get rid of _mlabwrap.py if there are any downsides associated with having it, but I'm currently not aware of any. > Also, the ctypes version Taylor is working on won't require any non-python > extension code (I believe), so mlabrawmodule.so will disappear too, Yup. > and the package directory will be left with nothing but awmsmeta.py and > awmstools.py. Taylor also mentioned that the ctypes version currently relies > on some small C utilities, but that they can probably be done away with > (please correct me if I'm wrong here). No, I'd indeed like to get rid of all C(++) code. > I also suggest moving the tests dir under the scikits.mlab package so it can > be found by numpytest. IIRC the current directory structure (and the position of ``tests/``) corresponds to what Robert Kern suggested -- I'm not really familiar with numpytest yet, but I suppose if it expects tests to be in the location you say that's presumably an oversight. I'm about to write a scikits related email to the list anyway, so I'll ask in there. > As far as changing mlabwrap to mlab, it's not super important, but it would > be a bit easier to type, and I think in general python wrappers don't > use the word "wrap" in their names. I don't think the typing argument matters much, since unlike e.g. ``os`` ``mlabwrap`` is not meant to be to be used as a qualifying prefix -- the standard idiom is ``from mlabwrap import mlab``, so mlabwrap only needs to get type once per file/session. > If we did that, it would also make sense for the project itself to be called > mlab (the fewer names the better). It's common in python, but IMO stuff like ``StringIO.StringIO`` is piss-poor design in a language where both the use of unqualified names and qualified names is common (and importing module handles later on is often required for reloading and other interactive shell use). I've been frequently annoyed and even once or twice been bitten by this so I'd really rather not have ``mlab.mlab`` even if it would mean fewer names. OTOH now we've got the additional leading ``scikits.`` anyway, the respective ``mlab``s are presumably sufficiently disambiguated... so I guess I'm open to this renaming, but I suggest we deal with other things first, as mentioned above. 
cheers, alex From a.schmolck at gmx.net Mon Apr 30 09:02:35 2007 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Mon, 30 Apr 2007 14:02:35 +0100 Subject: [SciPy-dev] mlabwrap high-level user interface In-Reply-To: <796269930704300114n1c3e85d6x691504d424fbdd4f@mail.gmail.com> (Brian Hawthorne's message of "Mon\, 30 Apr 2007 01\:14\:24 -0700") References: <796269930704300050k272ba290t30d60098a9a72b3a@mail.gmail.com> <796269930704300114n1c3e85d6x691504d424fbdd4f@mail.gmail.com> Message-ID: [moving this to scipy-dev] "Brian Hawthorne" writes: > Ah, forgot one critical use case, the function call! > >>>> res = engine.svd(X, nout=1) >>>> res = engine["svd"](X, nout=1) As I mentioned in the other poist, I don't like that. There ought to be one way to do things, and the first type of call is clearly what we'd like function calls to typically look like. I suspect that ``engine["svd"]`` ought to be equivalent to ``engine.svd()``. The reason why I'd at least strongly consider this funny looking behavior that is that nullary-function call vs. variable lookup is below the interface level in matlab (i.e. syntactically undistinguishable and hence something that one is often at liberty to change for a variable that isn't intended for mutation; a bit like property access can be transparently replaced by a function call in python (using e.g __getattr__) but not in lesser languages like java or C++). > This would fall under the exact same code as the raw get below (here > returning a function proxy). Oh, and one more suggestion, if we're going to > treat the engine as a dict (which it is), it might also be nice to have a > keys method to return all the names in the matlab namespace: > >>>> mlab_vars = engine.keys() # or "vars" or "names", whatever I'm against conflating matlab and MLabWrap-instance method namespaces, especially since I don't see a very compelling use-case for a ``.keys`` method. > Though i guess if matlab already has a command for that, you could just use > that ;) Indeed and if there isn't you might have a hard time implementing it anyway ;) > One nice thing about this design is that the engine object has no > statically defined public methods, so you don't have to worry so much about > a method name conflicting with a matlab name. Yes, I think we really want to avoid that. > I think it's fine to define a private RawEngine class under the hood which > implements open, close, get, set, and eval. Then the main user-facing engine > will just have a reference to it and delegate to it. Hmm, sorry I'm not 100% sure what you're proposing here -- would RawEngine essentially be like mlabraw, but ctypes based with a class interface? If so, yes that would indeed be the approach I'd advocate for the ctypes mlabraw replacement (the only reason that the C++ code doesn't work like this is that in the case of mlabraw.cpp the effort involved would IMO not have justified the reward). IIRC, David's ctypes code already looks like this. alex