From oliverh at promanthan.com Sat Dec 1 15:15:55 2012 From: oliverh at promanthan.com (Oliver M. Haynold) Date: Sat, 1 Dec 2012 14:15:55 -0600 Subject: [SciPy-User] skew-normal distribution and Owen's T function Message-ID: I've coded up the skew-normal probability distribution, following closely the logic from the sn package in CRAN, for SciPy. I've put a very early version (skew-t isn't implemented yet, which is the main purpose of the package) on http://promanthan.com/randomstuff/skewt-0.0.1.tgz There are also unit tests that compare the results of my package to those of the sn package in R. Skew-t should be ready later this month. As always, any feedback and bug reports are highly appreciated.

From josef.pktd at gmail.com Sat Dec 1 16:53:10 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 1 Dec 2012 16:53:10 -0500 Subject: [SciPy-User] skew-normal distribution and Owen's T function In-Reply-To: References: Message-ID: On Sat, Dec 1, 2012 at 3:15 PM, Oliver M. Haynold wrote: > I've coded up the skew-normal probability distribution, following closely the logic from the sn package in CRAN, for SciPy. I've put a very early version (skew-t isn't implemented yet, which is the main purpose of the package) on > > http://promanthan.com/randomstuff/skewt-0.0.1.tgz > > There are also unit tests that compare the results of my package to those of the sn package in R. Skew-t should be ready later this month. As always, any feedback and bug reports are highly appreciated. Very good, thank you; I was looking for this. I recently bumped into Owen's T again for skewed distributions. After a first look: I would split _tOwen out into a standalone function, since it might also be useful in other places. scipy.special might also be a good location for this function, if the implementation is not specific to the current usage.
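For reference, the pieces under discussion can be sketched in a few lines: Azzalini's skew-normal density, and its cdf expressed through Owen's T. The function names here are mine, not the package's (only _tOwen is from Oliver's code), and Owen's T is done by plain quadrature of its defining integral as a slow but dependable reference; a series expansion, or scipy.special.owens_t where a later scipy is available, would be the faster choice.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def owens_t(h, a):
    # Owen's T by direct quadrature of its defining integral:
    # T(h, a) = (1/2pi) * int_0^a exp(-h^2 (1+t^2)/2) / (1+t^2) dt
    f = lambda t: np.exp(-0.5 * h ** 2 * (1.0 + t ** 2)) / (1.0 + t ** 2)
    return quad(f, 0.0, a)[0] / (2.0 * np.pi)

def skewnorm_pdf(x, alpha):
    # Azzalini's skew-normal density: 2 * phi(x) * Phi(alpha * x)
    return 2.0 * norm.pdf(x) * norm.cdf(alpha * x)

def skewnorm_cdf(x, alpha):
    # The cdf is where Owen's T enters: F(x) = Phi(x) - 2 * T(x, alpha)
    return norm.cdf(x) - 2.0 * owens_t(x, alpha)
```

Handy spot checks for a test suite: for alpha = 0 both formulas collapse to the standard normal, and since T(0, a) = arctan(a)/(2*pi), skewnorm_cdf(0, 1) comes out to exactly 0.25.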
Josef > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user

From takowl at gmail.com Sun Dec 2 17:12:03 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Sun, 2 Dec 2012 22:12:03 +0000 Subject: [SciPy-User] Fwd: New Scipy website In-Reply-To: References: Message-ID: On 27 November 2012 20:11, Ralf Gommers wrote: > You're right that we can't have broken links, so this should be done. > Before deciding to move things around, it would be good to have a complete > overview of all pages that are at scipy.org/xxx or xxx.scipy.org. Then > for each one you can decide if it has to be included in the Sphinx-based > site, or put somewhere else with a redirect. Cleaning up some content at > the same time would be even better. The overview of all pages will also > allow you to write a test script to check no links break. > > Moving things to Sphinx is a task that parallelizes nicely and can > gradually be done after flipping the switch on the new site, but I think we > do need the complete overview beforehand. > The complete (long) list of pages on the current site is at http://www.scipy.org/TitleIndex I've pulled out the pages I think should have redirects. This list can be collaboratively edited if I've missed things: http://piratepad.net/tnY97djjTC Thanks, Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL:

From zhushazang at gmail.com Sun Dec 2 19:18:26 2012 From: zhushazang at gmail.com (Rodolfo Timoteo da Silva) Date: Sun, 02 Dec 2012 22:18:26 -0200 Subject: [SciPy-User] New Scipy website In-Reply-To: References: Message-ID: <50BBEFEB.3000500@gmail.com> Hi, just to report how to install on Gentoo: emerge scipy Best regards Rodolfo On 26-11-2012 12:14, Thomas Kluyver wrote: > I'd like to discuss how we can move the new SciPy website (currently > at http://scipy.github.com/) to be the homepage of scipy.org.
> > The new site is still something of a work in progress, but it's been > there for years, and we can't continue to maintain two sites > indefinitely. We've already been bitten by people referring to out of > date FAQs, and I think the aesthetics of the new site are a marked > improvement. So, first, what does the new site need before it can go > live? > > Two things I'm already aware of: > > - Installation instructions for other package managers (so far we > cover apt and macports). Please can users of other Linux distros send > me the corresponding installation commands? > http://scipy.github.com/install.html > > - The 'Getting started' page should be rewritten to describe the Scipy > stack, and from a more newcomer-ish perspective. > > Of course, there's a lot of useful information on the existing site - > such as the Cookbook - that it's impractical to move to the new site. > So, I suggest that we move the current MoinMoin-based site to > wiki.scipy.org, and provide redirects so we > don't break existing links. Unfortunately, Github pages can't do > proper HTTP-based redirects. So we can either use hackish HTML/JS > redirects, or host the new site on a server where we can set up proper > redirects. > > Thanks, > Thomas > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed...
URL: From takowl at gmail.com Sun Dec 2 19:38:06 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Mon, 3 Dec 2012 00:38:06 +0000 Subject: [SciPy-User] New Scipy website In-Reply-To: <50BBEFEB.3000500@gmail.com> References: <50BBEFEB.3000500@gmail.com> Message-ID: On 3 December 2012 00:18, Rodolfo Timoteo da Silva wrote: > Hi, just to report how to install into Gentoo > > *emerge scipy* > Thanks, Rodolfo, I'd like to provide instructions for how to install the entire Scipy Stack - the set of core packages we've chosen. That's numpy, scipy, matplotlib, ipython (including the notebook), sympy, pandas and nose. Can the package names be chained together in a single emerge command? And what versions will the user get by default? You can see the Scipy Stack specification here: http://scipy.github.com/stackspec.html Best wishes, Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL:

From takowl at gmail.com Mon Dec 3 13:02:40 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Mon, 3 Dec 2012 18:02:40 +0000 Subject: [SciPy-User] New Scipy website In-Reply-To: References: Message-ID: From the lack of offers for a new server to host on, I assume that we're going to keep using Github pages.
So I'll put something together to do a "poor man's redirect" on the pages I listed: http://piratepad.net/tnY97djjTC Thanks, Thomas On 26 November 2012 14:14, Thomas Kluyver wrote: > I'd like to discuss how we can move the new SciPy website (currently at > http://scipy.github.com/ ) to be the homepage of scipy.org. > > The new site is still something of a work in progress, but it's been there > for years, and we can't continue to maintain two sites indefinitely. We've > already been bitten by people referring to out of date FAQs, and I think > the aesthetics of the new site are a marked improvement. So, first, what > does the new site need before it can go live?* > > *Two things I'm already aware of: > > - Installation instructions for other package managers (so far we cover > apt and macports). Please can users of other Linux distros send me the > corresponding installation commands? > http://scipy.github.com/install.html > > - The 'Getting started' page should be rewritten to describe the Scipy > stack, and from a more newcomer-ish perspective. > > Of course, there's a lot of useful information on the existing site - such > as the Cookbook - that it's impractical to move to the new site. So, I > suggest that we move the current MoinMoin-based site to wiki.scipy.org, > and provide redirects so we don't break existing links. Unfortunately, > Github pages can't do proper HTTP-based redirects. So we can either use > hackish HTML/JS redirects, or host the new site on a server where we can > set up proper redirects. > > Thanks, > Thomas > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pdeepak at instem.res.in Sun Dec 2 23:14:07 2012 From: pdeepak at instem.res.in (Poduval Deepak Balakrishnan) Date: Mon, 3 Dec 2012 09:44:07 +0530 Subject: [SciPy-User] Help Scipy Installation Message-ID: <7bab8c8a4df595dbe43ecef657178089.squirrel@webmail.instem.res.in> Hi, I am deepak. 
I installed numpy but when I try to install scipy I get the following error. Please help

Traceback (most recent call last):
  File "setup.py", line 208, in <module>
    setup_package()
  File "setup.py", line 145, in setup_package
    from numpy.distutils.core import setup
  File "/usr/lib/python2.6/site-packages/numpy-1.6.1-0.egg/numpy/__init__.py", line 137, in <module>
    import add_newdocs
  File "/usr/lib/python2.6/site-packages/numpy-1.6.1-0.egg/numpy/add_newdocs.py", line 9, in <module>
    from numpy.lib import add_newdoc
  File "/usr/lib/python2.6/site-packages/numpy-1.6.1-0.egg/numpy/lib/__init__.py", line 4, in <module>
    from type_check import *
  File "/usr/lib/python2.6/site-packages/numpy-1.6.1-0.egg/numpy/lib/type_check.py", line 8, in <module>
    import numpy.core.numeric as _nx
  File "/usr/lib/python2.6/site-packages/numpy-1.6.1-0.egg/numpy/core/__init__.py", line 5, in <module>
    import multiarray
ImportError: libpython2.7.so.1.0: cannot open shared object file: No such file or directory

Thank you in advance. Best regards Poduval Deepak Balakrishnan, Junior Research Fellow, Das's Lab, inStem, NCBS, GKVK Campus, Bellary Road, Bangalore Karnataka - 560065

From aanderso at med.wayne.edu Mon Dec 3 11:29:35 2012 From: aanderso at med.wayne.edu (Anderson, Amy) Date: Mon, 3 Dec 2012 16:29:35 +0000 Subject: [SciPy-User] running script error (eigen symmetric) Message-ID: <13BFE00357AE78459CE7894943A31CA91BD932BE@MED-CORE07A.med.wayne.edu> Dear Scipy users, I am a new python and scipy user and I am looking for some help to troubleshoot an error I have been getting when trying to run a script in python. The script I am trying to run is called pyClusterROI and it needs python, pynifti, scipy, and numpy to run. I have installed all these programs successfully using macports on Mac OS 10.8.
I unarchived the scripts and ran the test script, getting this error:

$ python pyClusterROI_test.py
Traceback (most recent call last):
  File "pyClusterROI_test.py", line 48, in <module>
    from make_local_connectivity_scorr import *
  File "/Users/matthewnye/Downloads/pyClusterROI/make_local_connectivity_scorr.py", line 40, in <module>
    from scipy.sparse.linalg.eigen.arpack import eigen_symmetric
ImportError: cannot import name eigen_symmetric

I have been trying to hunt down what this error may be, and it seems that older versions of scipy used "scipy.sparse.linalg.eigen.arpack.eigen_symmetric()", whereas in newer versions it has been renamed to "scipy.sparse.linalg.eigen.arpack.eigs()." Does anyone know what might be causing this error, or if it is a version problem, is there any way to get an older version of scipy through macports? Also, I have considered installing all these modules through a manual install, but since I am a novice user I did not want to attempt that if there was a potentially simpler solution. Any help would be much appreciated! -Amy -------------- next part -------------- An HTML attachment was scrubbed... URL:

From matthew.brett at gmail.com Mon Dec 3 14:56:31 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 3 Dec 2012 11:56:31 -0800 Subject: [SciPy-User] New Scipy website In-Reply-To: References: Message-ID: Hi, On Mon, Dec 3, 2012 at 10:02 AM, Thomas Kluyver wrote: > From the lack of offers for a new server to host on, I assume that we're > going to keep using Github pages. So I'll put something together to do a > "poor man's redirect" on the pages I listed: Is there an advantage to hosting somewhere other than github pages? I'm very happy to host on my space if that's an advantage. Sourceforge is an obvious free alternative for html page hosting.
Best, Matthew

From davidmenhur at gmail.com Mon Dec 3 14:58:14 2012 From: davidmenhur at gmail.com (Daπid) Date: Mon, 3 Dec 2012 20:58:14 +0100 Subject: [SciPy-User] running script error (eigen symmetric) In-Reply-To: <13BFE00357AE78459CE7894943A31CA91BD932BE@MED-CORE07A.med.wayne.edu> References: <13BFE00357AE78459CE7894943A31CA91BD932BE@MED-CORE07A.med.wayne.edu> Message-ID: On a Mac, the python command corresponds to the Python that comes shipped with the system, not the one installed by macports - and therefore not the one that has scipy installed. The macports one is named something like python-macports2.6 or so (type python and tab/double tab on the shell to make sure). On Mon, Dec 3, 2012 at 5:29 PM, Anderson, Amy wrote: > Dear Scipy users, > > I am a new python and scipy user and I am looking for some help to > troubleshoot an error I have been getting when trying to run a script in > python. The script I am trying to run is called pyCluster ROI and it needs > python, pynifti, scipy, and numpy to run. I have installed all these > programs successfully using macports on Mac OS 10.8. > > I unarchived the scripts and ran the test script getting this error: > > $ python pyClusterROI_test.py > > Traceback (most recent call last): > > File "pyClusterROI_test.py", line 48, in > > from make_local_connectivity_scorr import * > > File > "/Users/matthewnye/Downloads/pyClusterROI/make_local_connectivity_scorr.py", > line 40, in > > from scipy.sparse.linalg.eigen.arpack import eigen_symmetric > > ImportError: cannot import name eigen_symmetric > > > I have been trying to hunt down what this error may be and it seems that > older versions of scipy might have used > "scipy.sparse.linalg.eigen.arpack.eigen_symmetric()" whereas in new versions > it has been renamed as "scipy.sparse.linalg.eigen.arpack.eigs()." > > > Does anyone know what might be causing this error or if it is a version > problem is there any way to get an older version of scipy through macports?
> Also I have considered installing all these modules through a manual install > but since i am a novice user I did not want to attempt that if there was a > potentially simpler solution. > > Any help would be much appreciated! > > -Amy > > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From cournape at gmail.com Mon Dec 3 14:59:04 2012 From: cournape at gmail.com (David Cournapeau) Date: Mon, 3 Dec 2012 20:59:04 +0100 Subject: [SciPy-User] Help Scipy Installation In-Reply-To: <7bab8c8a4df595dbe43ecef657178089.squirrel@webmail.instem.res.in> References: <7bab8c8a4df595dbe43ecef657178089.squirrel@webmail.instem.res.in> Message-ID: Hi Deepak On Mon, Dec 3, 2012 at 5:14 AM, Poduval Deepak Balakrishnan wrote: > Hi, > > I am deepak. I installed numpy but when i try and install scipy I get the > following error. Please help > > Traceback (most recent call last): > File "setup.py", line 208, in > setup_package() > File "setup.py", line 145, in setup_package > from numpy.distutils.core import setup > File > "/usr/lib/python2.6/site-packages/numpy-1.6.1-0.egg/numpy/__init__.py", > line 137, in > import add_newdocs > File > "/usr/lib/python2.6/site-packages/numpy-1.6.1-0.egg/numpy/add_newdocs.py", > line 9, in > from numpy.lib import add_newdoc > File > "/usr/lib/python2.6/site-packages/numpy-1.6.1-0.egg/numpy/lib/__init__.py", > line 4, in > from type_check import * > File > "/usr/lib/python2.6/site-packages/numpy-1.6.1-0.egg/numpy/lib/type_check.py", > line 8, in > import numpy.core.numeric as _nx > File > "/usr/lib/python2.6/site-packages/numpy-1.6.1-0.egg/numpy/core/__init__.py", > line 5, in > import multiarray > ImportError: libpython2.7.so.1.0: cannot open shared object file: No such > file or directory > It looks like you installed numpy in the directory expected by python 2.6 (/usr/lib/python2.6/...), but built it with python 2.7 (since it is looking 
for the python 2.7 .so file). How exactly did you install numpy? Also, what's the value of your PYTHONPATH variable? thanks, David

From amyla333 at gmail.com Mon Dec 3 16:19:15 2012 From: amyla333 at gmail.com (Amy Anderson) Date: Mon, 3 Dec 2012 16:19:15 -0500 Subject: [SciPy-User] running script error (eigen symmetric) In-Reply-To: References: <13BFE00357AE78459CE7894943A31CA91BD932BE@MED-CORE07A.med.wayne.edu> Message-ID: I am actually having an issue getting an older version of Scipy to install through macports. I need a version of scipy before 0.9, because the eigen_symmetric function was not renamed yet. I have been using macports to install scipy etc. because I thought it was the easier way, but at this point I really just need to get this script working and need an older version of scipy. If anyone has any ideas please let me know. On Mon, Dec 3, 2012 at 2:58 PM, Daπid wrote: > In Mac, the python command corresponds to the python that come shipped > with it, not the one installed by macports, and therefore, the one > that has scipy installed. It is something like python-macports2.6 or > so (type python and tab/double tab on the shell to make sure). > > On Mon, Dec 3, 2012 at 5:29 PM, Anderson, Amy > wrote: > > Dear Scipy users, > > > > I am a new python and scipy user and I am looking for some help to > > troubleshoot an error I have been getting when trying to run a script in > > python. The script I am trying to run is called pyCluster ROI and it > needs > > python, pynifti, scipy, and numpy to run. I have installed all these > > programs successfully using macports on Mac OS 10.8.
> > > > I unarchived the scripts and ran the test script getting this error: > > > > $ python pyClusterROI_test.py > > > > Traceback (most recent call last): > > > > File "pyClusterROI_test.py", line 48, in > > > > from make_local_connectivity_scorr import * > > > > File > > > "/Users/matthewnye/Downloads/pyClusterROI/make_local_connectivity_scorr.py", > > line 40, in > > > > from scipy.sparse.linalg.eigen.arpack import eigen_symmetric > > > > ImportError: cannot import name eigen_symmetric > > > > > > I have been trying to hunt down what this error may be and it seems that > > older versions of scipy might have used > > ?scipy.sparse.linalg.eigen.arpack.eigen symmetric()? where as new > versions > > it has been renamed as "scipy.sparse.linalg.eigen.arpack.eigs()." > > > > > > Does anyone know what might be causing this error or if it is a version > > problem is there any way to get an older version of scipy through > macports? > > Also I have considered installing all these modules through a manual > install > > but since i am a novice user I did not want to attempt that if there was > a > > potentially simpler solution. > > > > Any help would be much appreciated! > > > > -Amy > > > > > > > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... 
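For anyone updating the script rather than downgrading scipy: the symmetric solver survived the rename as eigsh (eigs is its general, non-symmetric sibling). A minimal sketch, assuming scipy 0.9 or later, against a made-up symmetric matrix — a 1-D discrete Laplacian standing in for pyClusterROI's actual connectivity matrix:

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import eigsh  # the renamed symmetric solver; eigs is the general one

# Hypothetical stand-in matrix: a 1-D discrete Laplacian, sparse and symmetric,
# playing the role of the connectivity matrix pyClusterROI builds.
n = 50
lap = diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csr")

# Old call (scipy < 0.9) was roughly: eigen_symmetric(lap, k=4, which="SM")
# New call -- sigma=0 turns on shift-invert mode, the robust way to get the
# smallest eigenpairs (plain which="SM" tends to converge slowly in ARPACK):
vals, vecs = eigsh(lap, k=4, sigma=0, which="LM")
```

For this Laplacian the eigenvalues are known in closed form, 2 - 2*cos(k*pi/(n+1)), which makes a convenient sanity check when porting old eigen_symmetric calls.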
URL: From deil.christoph at googlemail.com Mon Dec 3 16:29:27 2012 From: deil.christoph at googlemail.com (Christoph Deil) Date: Mon, 3 Dec 2012 22:29:27 +0100 Subject: [SciPy-User] running script error (eigen symmetric) In-Reply-To: References: <13BFE00357AE78459CE7894943A31CA91BD932BE@MED-CORE07A.med.wayne.edu> Message-ID: <8304C676-A87C-4B63-9554-9744C435D4E6@gmail.com> Dear Amy, the scipy you installed via Macports is version 0.11:

$ port installed py27-scipy
The following ports are currently installed:
  py27-scipy @0.11.0_0+gcc45 (active)

Indeed for that version of scipy there is no function eigen_symmetric in scipy.sparse.linalg.eigen.arpack, as the ImportError in your script said:

$ python
Python 2.7.3 (default, Oct 22 2012, 20:01:15)
[GCC 4.2.1 Compatible Apple Clang 4.0 ((tags/Apple/clang-421.0.60))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from scipy.sparse.linalg.eigen.arpack import eigen_symmetric
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name eigen_symmetric
>>> from scipy.sparse.linalg.eigen.arpack import eigs
>>> exit()
$

It is possible to install an older version of scipy with Macports, but it looks complicated (I've never done this): https://trac.macports.org/wiki/howto/InstallingOlderPort Installing scipy 0.9 on a modern Mac outside Macports is also not simple. You could try a binary or source install from http://sourceforge.net/projects/scipy/files/scipy/. Is updating your script to use eigs instead of eigen_symmetric an option? Or ask the person who wrote it to update it for you? Christoph On Dec 3, 2012, at 10:19 PM, Amy Anderson wrote: > I am actually having an issue getting an older version of Scipy to install through macports. I need a version of scipy before 0.9 because the eigen_symmetric function was not renamed yet.
I have been using macports to install scipy etc because I thought it was the easier way but at this point I really just need to get this script working and need an older version of scipy. if anyone has any ideas please let me know. > > On Mon, Dec 3, 2012 at 2:58 PM, Da?id wrote: > In Mac, the python command corresponds to the python that come shipped > with it, not the one installed by macports, and therefore, the one > that has scipy installed. It is something like python-macports2.6 or > so (tupe python and tab/double tab on the shell to make sure). > > On Mon, Dec 3, 2012 at 5:29 PM, Anderson, Amy wrote: > > Dear Scipy users, > > > > I am a new python and scipy user and I am looking for some help to > > troubleshoot an error I have been getting when trying to run a script in > > python. The script I am trying to run is called pyCluster ROI and it needs > > python, pynifti, scipy, and numpy to run. I have installed all these > > programs successfully using macports on Mac OS 10.8. > > > > I unarchived the scripts and ran the test script getting this error: > > > > $ python pyClusterROI_test.py > > > > Traceback (most recent call last): > > > > File "pyClusterROI_test.py", line 48, in > > > > from make_local_connectivity_scorr import * > > > > File > > "/Users/matthewnye/Downloads/pyClusterROI/make_local_connectivity_scorr.py", > > line 40, in > > > > from scipy.sparse.linalg.eigen.arpack import eigen_symmetric > > > > ImportError: cannot import name eigen_symmetric > > > > > > I have been trying to hunt down what this error may be and it seems that > > older versions of scipy might have used > > ?scipy.sparse.linalg.eigen.arpack.eigen symmetric()? where as new versions > > it has been renamed as "scipy.sparse.linalg.eigen.arpack.eigs()." > > > > > > Does anyone know what might be causing this error or if it is a version > > problem is there any way to get an older version of scipy through macports? 
> > Also I have considered installing all these modules through a manual install > > but since I am a novice user I did not want to attempt that if there was a > > potentially simpler solution. > > > > Any help would be much appreciated! > > > > -Amy > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL:

From pav at iki.fi Mon Dec 3 16:39:37 2012 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 03 Dec 2012 23:39:37 +0200 Subject: [SciPy-User] running script error (eigen symmetric) In-Reply-To: <8304C676-A87C-4B63-9554-9744C435D4E6@gmail.com> References: <13BFE00357AE78459CE7894943A31CA91BD932BE@MED-CORE07A.med.wayne.edu> <8304C676-A87C-4B63-9554-9744C435D4E6@gmail.com> Message-ID: 03.12.2012 23:29, Christoph Deil wrote: [clip]
>>>> from scipy.sparse.linalg.eigen.arpack import eigs
>>>> exit()

Or, better, as documented:

>>> from scipy.sparse.linalg import eigs

http://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.eigs.html -- Pauli Virtanen

From takowl at gmail.com Mon Dec 3 17:13:19 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Mon, 3 Dec 2012 22:13:19 +0000 Subject: [SciPy-User] New Scipy website In-Reply-To: <1101fb90-1df0-413e-883b-fa0b9cb3e334@googlegroups.com> References: <1101fb90-1df0-413e-883b-fa0b9cb3e334@googlegroups.com> Message-ID: On 3 December 2012 21:44, BrettRMurphy wrote: > Sorry it took us at Enthought a while to get back to this thread.
> > In addition to the server, we've got some people who can help out with > setting things up and getting the new site up and running. To this end, > Ognen D has set up a private mailing list for people who will be working on > the effort. The list is private as they'll be discussing data security and > privacy issues. The current list is Ognen D, Prabhu R, Thomas K, Pauli V, > David C, Dhruv B and Parth B. If anyone is interested in joining, please > send a note to ognen at enthought,com > > We're looking forward to helping get SciPy.org updated. > Thanks Brett, Robert. Should I have received an invitation or notification about this mailing list? Matthew, the disadvantage of Github pages is that you can't create proper HTTP redirects. Thanks for the offer of hosting, although we'll probably use Enthought's offer. Best wishes, Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Mon Dec 3 21:16:30 2012 From: travis at continuum.io (Travis Oliphant) Date: Mon, 3 Dec 2012 21:16:30 -0500 Subject: [SciPy-User] New Scipy website In-Reply-To: References: <1101fb90-1df0-413e-883b-fa0b9cb3e334@googlegroups.com> Message-ID: NumFocus can also easily host and administer an EC2 server if that is needed. If Enthought is willing to continue to host the SciPy server then that is great. This is also something that NumFocus is able to do at any time. If there are other projects that would like a server setup for them --- if github pages is not enough --- just contact admin at numfocus.org. On Dec 3, 2012, at 5:13 PM, Thomas Kluyver wrote: > On 3 December 2012 21:44, BrettRMurphy wrote: > Sorry it took us at Enthought a while to get back to this thread. > > In addition to the server, we've got some people who can help out with setting things up and getting the new site up and running. To this end, Ognen D has set up a private mailing list for people who will be working on the effort. 
The list is private as they'll be discussing data security and privacy issues. The current list is Ognen D, Prabhu R, Thomas K, Pauli V, David C, Dhruv B and Parth B. If anyone is interested in joining, please send a note to ognen at enthought,com > > We're looking forward to helping get SciPy.org updated. > > Thanks Brett, Robert. Should I have received an invitation or notification about this mailing list? > > Matthew, the disadvantage of Github pages is that you can't create proper HTTP redirects. Thanks for the offer of hosting, although we'll probably use Enthought's offer. > > Best wishes, > Thomas > -- > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Dec 3 22:30:10 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 3 Dec 2012 22:30:10 -0500 Subject: [SciPy-User] distributions - who got the most ? Message-ID: scipy.stats has more than 90 distributions. Do we want to increase it by almost a factor of 10? :) While looking for the cdf of a distribution, I found this : http://www.mathworks.com/matlabcentral/fileexchange/35008-generation-of-random-variates He collected 870 distributions (under BSD license). Includes generic random number generation. Even though there are some variations of distributions counted separately, given my quick browsing this looks impressive and a good source for code and references. Coding style is not great but it's 10 years or so of collecting distributions. Josef From travis at continuum.io Mon Dec 3 22:50:43 2012 From: travis at continuum.io (Travis Oliphant) Date: Mon, 3 Dec 2012 21:50:43 -0600 Subject: [SciPy-User] distributions - who got the most ? In-Reply-To: References: Message-ID: <504F8B26-F48D-41CA-A049-D010460A97FA@continuum.io> On Dec 3, 2012, at 9:30 PM, josef.pktd at gmail.com wrote: > scipy.stats has more than 90 distributions. > > Do we want to increase it by almost a factor of 10? :) > That is pretty cool. 
I would personally love to see this sort of thing. This might be a good project to propose to NumFOCUS for an enterprising young student / mentor combination under the Technical Fellowship Program. Josef would be a good mentor for this sort of thing. The John Hunter Technical Fellowship is a 3 month to 18 month program where a student / post-doc works with a mentor (both of whom receive funding) on some project of interest to the scientific computing community. Proposals are reviewed by a grant team. We are just getting this program started and need good projects and mentors. Students can submit proposals and can often be matched with a mentor. Contact numfocus at googlegroups.org to inquire or send a proposal. -Travis > While looking for the cdf of a distribution, I found this : > http://www.mathworks.com/matlabcentral/fileexchange/35008-generation-of-random-variates > > He collected 870 distributions (under BSD license). Includes generic > random number generation. > > Even though there are some variations of distributions counted > separately, given my quick browsing this looks impressive and a good > source for code and references. > Coding style is not great but it's 10 years or so of collecting distributions. > > Josef > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From kmichael.aye at gmail.com Tue Dec 4 02:33:17 2012 From: kmichael.aye at gmail.com (Michael Aye) Date: Mon, 3 Dec 2012 23:33:17 -0800 Subject: [SciPy-User] distributions - who got the most ? References: <504F8B26-F48D-41CA-A049-D010460A97FA@continuum.io> Message-ID: On 2012-12-04 03:50:43 +0000, Travis Oliphant said: > On Dec 3, 2012, at 9:30 PM, josef.pktd at gmail.com wrote: > >> scipy.stats has more than 90 distributions. >> >> Do we want to increase it by almost a factor of 10? :) >> > > That is pretty cool. I would personally love to see this sort of > thing. 
This might be a good project to propose to NumFOCUS for an > enterprising young student / mentor combination under the Technical > Fellowship Program. Josef would be a good mentor for this sort of > thing. > > The John Hunter Technical Fellowship is a 3 month to 18 month program > where a student / post-doc works with a mentor (both of whom receive > funding) on some project of interest to the scientific computing > community. Proposals are reviewed by a grant team. We are just > getting this program started and need good projects and mentors. > Students can submit proposals and can often be matched with a mentor. > > Contact numfocus at googlegroups.org to inquire or send a proposal. Interesting, I just thought about something like this. Will part-time be possible? I am asking here, because the list might want to know this as well. Michael > > -Travis > > > > >> While looking for the cdf of a distribution, I found this : >> http://www.mathworks.com/matlabcentral/fileexchange/35008-generation-of-random-variates >> >> >> He collected 870 distributions (under BSD license). Includes generic >> random number generation. >> >> Even though there are some variations of distributions counted >> separately, given my quick browsing this looks impressive and a good >> source for code and references. >> Coding style is not great but it's 10 years or so of collecting distributions. >> >> Josef >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user From rkern at enthought.com Mon Dec 3 15:17:11 2012 From: rkern at enthought.com (Robert Kern) Date: Mon, 3 Dec 2012 20:17:11 +0000 Subject: [SciPy-User] New Scipy website In-Reply-To: References: Message-ID: On Mon, Dec 3, 2012 at 6:02 PM, Thomas Kluyver wrote: > From the lack of offers for a new server to host on, I assume that we're > going to keep using Github pages. 
Enthought is happy to continue to donate a server for the SciPy website (on EC2 infrastructure most likely).

--
Robert Kern
Enthought

From bmurphy at enthought.com Mon Dec 3 16:44:43 2012
From: bmurphy at enthought.com (BrettRMurphy)
Date: Mon, 3 Dec 2012 13:44:43 -0800 (PST)
Subject: [SciPy-User] New Scipy website
In-Reply-To: References: Message-ID: <1101fb90-1df0-413e-883b-fa0b9cb3e334@googlegroups.com>

Sorry it took us at Enthought a while to get back to this thread.

In addition to the server, we've got some people who can help out with setting things up and getting the new site up and running. To this end, Ognen D has set up a private mailing list for people who will be working on the effort. The list is private as they'll be discussing data security and privacy issues. The current list is Ognen D, Prabhu R, Thomas K, Pauli V, David C, Dhruv B and Parth B. If anyone is interested in joining, please send a note to ognen at enthought.com.

We're looking forward to helping get SciPy.org updated.

Brett Murphy
Enthought

On Monday, December 3, 2012 2:17:11 PM UTC-6, Robert Kern wrote:
> On Mon, Dec 3, 2012 at 6:02 PM, Thomas Kluyver wrote:
> > From the lack of offers for a new server to host on, I assume that we're
> > going to keep using Github pages.
>
> Enthought is happy to continue to donate a server for the SciPy
> website (on EC2 infrastructure most likely).
>
> --
> Robert Kern
> Enthought
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From bmurphy at enthought.com Mon Dec 3 17:42:47 2012
From: bmurphy at enthought.com (BrettRMurphy)
Date: Mon, 3 Dec 2012 14:42:47 -0800 (PST)
Subject: [SciPy-User] New Scipy website
In-Reply-To: References: <1101fb90-1df0-413e-883b-fa0b9cb3e334@googlegroups.com>
Message-ID: <9b096976-1734-43a1-aea1-f2162abec511@googlegroups.com>

I assume Ognen will update the list members soon.
--
Brett
Enthought

On Monday, December 3, 2012 4:13:19 PM UTC-6, Thomas Kluyver wrote:
> On 3 December 2012 21:44, BrettRMurphy wrote:
>> Sorry it took us at Enthought a while to get back to this thread.
>>
>> In addition to the server, we've got some people who can help out with
>> setting things up and getting the new site up and running. To this end,
>> Ognen D has set up a private mailing list for people who will be working on
>> the effort. The list is private as they'll be discussing data security and
>> privacy issues. The current list is Ognen D, Prabhu R, Thomas K, Pauli V,
>> David C, Dhruv B and Parth B. If anyone is interested in joining, please
>> send a note to ognen at enthought.com
>>
>> We're looking forward to helping get SciPy.org updated.
>
> Thanks Brett, Robert. Should I have received an invitation or notification
> about this mailing list?
>
> Matthew, the disadvantage of Github pages is that you can't create proper
> HTTP redirects. Thanks for the offer of hosting, although we'll probably
> use Enthought's offer.
>
> Best wishes,
> Thomas
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From servant.mathieu at gmail.com Tue Dec 4 10:55:37 2012
From: servant.mathieu at gmail.com (servant mathieu)
Date: Tue, 4 Dec 2012 16:55:37 +0100
Subject: [SciPy-User] equivalent of R quantile() function in scipy
Message-ID:

Dear list,

From an array X of values, the quantile() function in R can return the score at any given specified quantile: e.g., quant_values = quantile(X, probs = c(.1, .3, .5, .7, .9)).

The scoreatpercentile() function in scipy seems to do the same thing. However, you can specify only one quantile per function call, e.g. quant_value = scoreatpercentile(X, per = 10) etc.

Is it possible to return more than one quantile per function call?

Cheers,
Mathieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jsseabold at gmail.com Tue Dec 4 11:14:48 2012
From: jsseabold at gmail.com (Skipper Seabold)
Date: Tue, 4 Dec 2012 11:14:48 -0500
Subject: [SciPy-User] equivalent of R quantile() function in scipy
In-Reply-To: References: Message-ID:

On Tue, Dec 4, 2012 at 10:55 AM, servant mathieu wrote:
> Dear list,
>
> From an array X of values, the quantile() function in R can return the
> score at any given specified quantile: e.g., quant_values = quantile(X,
> probs = c(.1, .3, .5, .7, .9)).
>
> The scoreatpercentile() function in scipy seems to do the same thing.
> However, you can specify only one quantile per function call, e.g.
> quant_value = scoreatpercentile(X, per = 10) etc.
>

Yes, this is pretty annoying.

> Is it possible to return more than one quantile per function call?

You can use percentile in numpy. Note that it expects the percentiles in 0-100.

quants = np.percentile(np.random.randn(25), [10, 30, 50, 70, 90])

Skipper

From jjhelmus at gmail.com Tue Dec 4 11:20:16 2012
From: jjhelmus at gmail.com (Jonathan Helmus)
Date: Tue, 04 Dec 2012 11:20:16 -0500
Subject: [SciPy-User] equivalent of R quantile() function in scipy
In-Reply-To: References: Message-ID: <50BE22C0.6040103@gmail.com>

Mathieu,

numpy.percentile can accept a sequence of percentiles as the second parameter:

In [8]: probs = [10.0, 30.0, 50.0, 70.0, 90.0]

In [9]: a = np.arange(100)

In [10]: np.percentile(a, probs)
Out[10]: [9.9000000000000004, 29.699999999999999, 49.5, 69.299999999999997, 89.100000000000009]

Or you could use a list comprehension with scipy.stats.scoreatpercentile:

In [11]: [scipy.stats.scoreatpercentile(a, i) for i in probs]
Out[11]: [9.9000000000000004, 29.699999999999999, 49.5, 69.299999999999997, 89.100000000000009]

The first solution is probably faster as the sequence is only sorted once.
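As a quick check that the two routes agree for this example (a throwaway sanity-check script, not a benchmark):

```python
import numpy as np
from scipy import stats

probs = [10.0, 30.0, 50.0, 70.0, 90.0]
a = np.arange(100)

# one call, the data is sorted once
via_numpy = np.percentile(a, probs)

# one call (and one sort) per percentile
via_scipy = [stats.scoreatpercentile(a, p) for p in probs]

print(via_numpy)   # matches the Out[10] values above
```

Both use linear interpolation between order statistics by default, which is why the outputs in Out[10] and Out[11] are identical.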
Hope this helps,

- Jonathan Helmus

On 12/04/2012 10:55 AM, servant mathieu wrote:
> Dear list,
> From an array X of values, the quantile() function in R can return
> the score at any given specified quantile: e.g., quant_values =
> quantile(X, probs = c(.1, .3, .5, .7, .9)).
> The scoreatpercentile() function in scipy seems to do the same thing.
> However, you can specify only one quantile per function call, e.g.
> quant_value = scoreatpercentile(X, per = 10) etc.
> Is it possible to return more than one quantile per function call?
> Cheers,
> Mathieu
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pav at iki.fi Tue Dec 4 11:26:55 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Tue, 04 Dec 2012 18:26:55 +0200
Subject: [SciPy-User] equivalent of R quantile() function in scipy
In-Reply-To: <50BE22C0.6040103@gmail.com>
References: <50BE22C0.6040103@gmail.com>
Message-ID:

04.12.2012 18:20, Jonathan Helmus kirjoitti:
> numpy.percentile can accept a sequence of percentiles as the second
> parameter:
>
> In [8]: probs = [10.0, 30.0, 50.0, 70.0, 90.0]
>
> In [9]: a = np.arange(100)
>
> In [10]: np.percentile(a, probs)
[clip]
> In [11]: [scipy.stats.scoreatpercentile(a, i) for i in probs]
> Out[11]:
[clip]

It could be useful if someone would take a look at whether the implementation in scipy can be easily adapted for that.
-- Pauli Virtanen From jjhelmus at gmail.com Tue Dec 4 11:44:47 2012 From: jjhelmus at gmail.com (Jonathan Helmus) Date: Tue, 04 Dec 2012 11:44:47 -0500 Subject: [SciPy-User] equivalent of R quantile() function in scipy In-Reply-To: References: <50BE22C0.6040103@gmail.com> Message-ID: <50BE287F.1020800@gmail.com> On 12/04/2012 11:26 AM, Pauli Virtanen wrote: > 04.12.2012 18:20, Jonathan Helmus kirjoitti: >> numpy.percentile can can accept a sequence of percentiles as the second >> parameter: >> >> In [8]: probs = [10.0, 30.0, 50.0, 70.0, 90.0] >> >> In [9]: a = np.arange(100) >> >> In [10]: np.percentile(a, probs) > [clip] >> In [11]: [scipy.stats.scoreatpercentile(a, i) for i in probs] >> Out[11]: > [clip] > > It could be useful if someone would take a look if that the > implementation in scipy can be easily adapted for that. > NumPy's implementation is pure python and quite straightforward, I'll put together a pull request to include similar functionality in SciPy. Adding the limit optional parameter shouldn't be too hard. - Jonathan Helmus From jsseabold at gmail.com Tue Dec 4 11:43:08 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 4 Dec 2012 11:43:08 -0500 Subject: [SciPy-User] equivalent of R quantile() function in scipy In-Reply-To: <50BE287F.1020800@gmail.com> References: <50BE22C0.6040103@gmail.com> <50BE287F.1020800@gmail.com> Message-ID: On Tue, Dec 4, 2012 at 11:44 AM, Jonathan Helmus wrote: > On 12/04/2012 11:26 AM, Pauli Virtanen wrote: >> 04.12.2012 18:20, Jonathan Helmus kirjoitti: >>> numpy.percentile can can accept a sequence of percentiles as the second >>> parameter: >>> >>> In [8]: probs = [10.0, 30.0, 50.0, 70.0, 90.0] >>> >>> In [9]: a = np.arange(100) >>> >>> In [10]: np.percentile(a, probs) >> [clip] >>> In [11]: [scipy.stats.scoreatpercentile(a, i) for i in probs] >>> Out[11]: >> [clip] >> >> It could be useful if someone would take a look if that the >> implementation in scipy can be easily adapted for that. 
>> > NumPy's implementation is pure python and quite straightforward, I'll > put together a pull request to include similar functionality in SciPy. > Adding the limit optional parameter shouldn't be too hard. > Great thanks for looking at this. Could you also look at adding an axis keyword to scoreatpercentile? Should be simple, but IIRC this was missing the last time I looked at this. Skipper From pav at iki.fi Tue Dec 4 11:45:23 2012 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 04 Dec 2012 18:45:23 +0200 Subject: [SciPy-User] equivalent of R quantile() function in scipy In-Reply-To: <50BE287F.1020800@gmail.com> References: <50BE22C0.6040103@gmail.com> <50BE287F.1020800@gmail.com> Message-ID: 04.12.2012 18:44, Jonathan Helmus kirjoitti: > On 12/04/2012 11:26 AM, Pauli Virtanen wrote: [clip] >> It could be useful if someone would take a look if that the >> implementation in scipy can be easily adapted for that. >> > NumPy's implementation is pure python and quite straightforward, I'll > put together a pull request to include similar functionality in SciPy. > Adding the limit optional parameter shouldn't be too hard. Great, that's fast action :) Cheers, Pauli From ralf.gommers at gmail.com Tue Dec 4 16:01:36 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 4 Dec 2012 22:01:36 +0100 Subject: [SciPy-User] distributions - who got the most ? In-Reply-To: References: Message-ID: On Tue, Dec 4, 2012 at 4:30 AM, wrote: > scipy.stats has more than 90 distributions. > > Do we want to increase it by almost a factor of 10? :) > > While looking for the cdf of a distribution, I found this : > > http://www.mathworks.com/matlabcentral/fileexchange/35008-generation-of-random-variates > > He collected 870 distributions (under BSD license). Includes generic > random number generation. > > Even though there are some variations of distributions counted > separately, given my quick browsing this looks impressive and a good > source for code and references. 
> Coding style is not great but it's 10 years or so of collecting > distributions. > Adding a lot of distributions sounds fine to me. That many distributions would need to go into a separate namespace. Any additions should be complete though (the Matlab code only has pdf/cdf) and well tested. The Matlab code doesn't look all that useful except for the references ("coding style is not great" is really too kind). I also don't trust the BSD license that's put on it, many files have different author names in them with no mention of where they came from. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Dec 4 20:15:56 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 4 Dec 2012 20:15:56 -0500 Subject: [SciPy-User] distributions - who got the most ? In-Reply-To: References: Message-ID: On Tue, Dec 4, 2012 at 4:01 PM, Ralf Gommers wrote: > > > > On Tue, Dec 4, 2012 at 4:30 AM, wrote: >> >> scipy.stats has more than 90 distributions. >> >> Do we want to increase it by almost a factor of 10? :) >> >> While looking for the cdf of a distribution, I found this : >> >> http://www.mathworks.com/matlabcentral/fileexchange/35008-generation-of-random-variates >> >> He collected 870 distributions (under BSD license). Includes generic >> random number generation. >> >> Even though there are some variations of distributions counted >> separately, given my quick browsing this looks impressive and a good >> source for code and references. >> Coding style is not great but it's 10 years or so of collecting >> distributions. > > > Adding a lot of distributions sounds fine to me. That many distributions > would need to go into a separate namespace. > > Any additions should be complete though (the Matlab code only has pdf/cdf) > and well tested. The Matlab code doesn't look all that useful except for the > references ("coding style is not great" is really too kind). 
> I also don't trust the BSD license that's put on it, many files have
> different author names in them with no mention of where they came from.

The matlab code includes several "special" functions that look mostly copied from other authors. This would need checking, but I doubt we need many of those since we have scipy.special. We are missing some special functions for distributions, but I didn't check whether he has any of those.

The pdfs, and the cdfs when available, look like they were implemented by the author, at least it looks that way for the small sample that I checked. (Code quality varies a lot, but many distributions are vectorized or can be easily vectorized from his code.)

Given the pdf, the rest could all be derived generically. But it won't be efficient.

Also, I just saw that sympy could become useful to derive extra properties
http://matthewrocklin.com/blog/work/2012/12/03/Characteristic-Functions/
sympy.stats also works based only on the pdf (from what I have seen).

I'm a bit skeptical about the number of distributions that are actually generally useful and not just used once in a journal article. My impression from tracking several statistics journals is that there are at least 10 new distributions each year.

As an example, he has a long list of poisson mixture distributions that I never heard of except for negative binomial. They might be useful in some cases, but a more general class might cover it better.

From a brief look at his reference
http://scholar.google.com/scholar?cluster=6061641765696455790&hl=en&as_sdt=0,5&as_vis=1
I think it might not be necessary to implement all details for 5 or more distributions separately. According to Google the paper has only 4 citations. See also 1) below.

But there are a lot of distributions, or classes/categories of distributions, that scipy is missing and that are, for example, available in R, but in R they are spread out over many packages.
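To put the "derived generically" point in code (just a sketch; the half-normal and all the names here are stand-ins made up for illustration): a subclass of stats.rv_continuous that defines nothing but _pdf still gets cdf, ppf, rvs and moments from the generic machinery, with every cdf value paid for by a numerical integration:

```python
import numpy as np
from scipy import stats

class HalfNormalDemo(stats.rv_continuous):
    # only the pdf is supplied; cdf, ppf, rvs and the moments all
    # fall back on rv_continuous's generic (and slow) machinery
    def _pdf(self, x):
        return np.sqrt(2.0 / np.pi) * np.exp(-x ** 2 / 2.0)

halfnorm_demo = HalfNormalDemo(a=0.0, name='halfnorm_demo')

# the generic cdf integrates the pdf numerically on each call
print(halfnorm_demo.cdf(1.0))   # close to stats.halfnorm.cdf(1.0)
```

Vectorized _cdf/_ppf implementations are what make the existing scipy.stats distributions fast; the generic fallback is correct but slow.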
Josef

1) another reference for poisson mixtures (technical, not a quick read, but a funny table)

Karlis, D. and Xekalaki, E. (2005), Mixed Poisson Distributions. International Statistical Review, 73: 35-58. doi: 10.1111/j.1751-5823.2005.tb00250.x
http://scholar.google.com/scholar?cluster=4455890634693542956&hl=en&as_sdt=2005&sciodt=0,5

--------------------------
Table 1. Some mixed Poisson distributions.

Mixed Poisson Distribution | Mixing Distribution | A Key Reference
Negative Binomial | Gamma | Greenwood & Yule (1920)
Geometric | Exponential | Johnson et al. (1992)
Poisson-Linear Exponential Family | Linear Exponential Family | Sankaran (1969)
Poisson-Lindley | Lindley | Sankaran (1970)
Poisson-Linear Exponential | Linear Exponential | Kling & Goovaerts (1993)
Poisson-Lognormal | Lognormal | Bulmer (1974)
Poisson-Confluent Hypergeometric Series | Confluent Hypergeometric Series | Bhattacharya (1966)
Poisson-Generalized Inverse Gaussian | Generalized Inverse Gaussian | Sichel (1974)
Sichel | Inverse Gaussian | Sichel (1975)
Poisson-Inverse Gamma | Inverse Gamma | Willmot (1993)
Poisson-Truncated Normal | Truncated Normal | Patil (1964)
Generalized Waring | Gamma Product Ratio | Irwin (1975)
Simple Waring | Exponential Beta | Pielou (1962)
Yule | Beta with Specific Parameter Values | Simon (1955)
Poisson-Generalized Pareto | Generalized Pareto | Kempton (1975)
Poisson-Beta I | Beta Type I | Holla & Bhattacharya (1965)
Poisson-Beta II | Beta Type II | Gurland (1958)
Poisson-Truncated Beta II | Truncated Beta Type II | Willmot (1986)
Poisson-Uniform | Uniform | Bhattacharya (1966)
Poisson-Truncated Gamma | Truncated Gamma | Willmot (1993)
Poisson-Generalized Gamma | Generalized Gamma | Albrecht (1984)
Dellaporte | Shifted Gamma | Ruohonen (1988)
Poisson-Modified Bessel of the 3rd Kind | Modified Bessel of the 3rd Kind | Ong & Muthaloo (1995)
Poisson-Pareto | Pareto | Willmot (1993)
Poisson-Shifted Pareto | Shifted Pareto | Willmot (1993)
Poisson-Pearson Family | Pearson's Family of Distributions | Albrecht (1982)
Poisson-Log-Student | Log-Student | Gaver & O'Muircheartaigh (1987)
Poisson-Power Function | Power Function Distribution | Rai (1971)
Poisson-Lomax | Lomax | Al-Awadhi & Ghitany (2001)
Poisson-Power Variance | Power Variance Family | Hougaard et al. (1997)
Neyman | Poisson | Douglas (1980)
Other Discrete Distributions | Johnson et al. (1992)
-------------------------------------------------------------------

> Ralf
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>

From sepideh at isc.ac.uk Wed Dec 5 10:11:28 2012
From: sepideh at isc.ac.uk (Sepideh Rastin)
Date: Wed, 05 Dec 2012 15:11:28 +0000
Subject: [SciPy-User] Need help to use rv_continuous.fit
References: 3d375d731003220750i8801ed7o3dfaaa2990f7b057@mail.gmail.com
Message-ID: <50BF6420.20307@isc.ac.uk>

Hi there,
I have histograms that will form the likelihood function and need to find the best normal distributions.
I would like to know if it would be possible to create my input array to rv_continuous.fit using the data of my histograms.
Kind regards,
Sep

From malandrea80 at gmail.com Wed Dec 5 11:12:05 2012
From: malandrea80 at gmail.com (MalAndrea)
Date: Wed, 5 Dec 2012 17:12:05 +0100
Subject: [SciPy-User] Parallel Differential Evolution
Message-ID:

Hi,

I have followed Rob's suggestions on this, i.e. to substitute:

(1). *numpy.zeros(X)* instead of *flex.double(X, 0)*
(2). *1000*numpy.ones(X)* instead of *flex.double(X, 1000)*
(3). *numpy.min(X)* for *flex.min(X)*, etc. (mean, sum)
(4). *numpy.random.uniform(size=N)* for *flex.random_double(N)*

However, to get this to work, I also had to modify the following:

- modification (4). only works when floats are meant to be used, I guess, so in the only case: *rnd = numpy.random.uniform(size=self.vector_length)* instead of *flex.random_double(N)*.
In the other two cases it is *random_values = numpy.random.random_integers(low=0.0, high=1.0, size=N)* instead of *flex.random_double(N)*
- Also, *numpy.nanargmin* instead of *flex.min_index*
- Also, *numpy.argsort* instead of *flex.sort_permutation*
- Also, *.copy()* instead of *.deep_copy()*
- Finally, *numpy.random.seed(0)* instead of *flex.set_random_seed(0)*

Now, it works (at least for the Rosenbrock function).

Thank you,
Andrea
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From josef.pktd at gmail.com Wed Dec 5 11:44:53 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 5 Dec 2012 11:44:53 -0500
Subject: [SciPy-User] Need help to use rv_continuous.fit
In-Reply-To: <50BF6420.20307@isc.ac.uk>
References: <50BF6420.20307@isc.ac.uk>
Message-ID:

On Wed, Dec 5, 2012 at 10:11 AM, Sepideh Rastin wrote:
> Hi there,
> I have histograms that will form the likelihood function and need to
> find the best normal distributions.
> I would like to know if it would be possible to create my input array to
> rv_continuous.fit using the data of my histograms.

Several answers:

No, you cannot use the histogram for fit() directly. stats.norm.fit() expects the original data with individual observations.

It's possible to create an artificial dataset by just creating observations based on the histogram, np.repeat(bin_centers, bin_counts) or something like this.

Fitting a normal distribution is "boring": sample mean and standard deviation are estimates of loc and scale. There are several scripts on the web (scipy cookbook, scipy central, ...) on how to fit a normal pdf directly to a histogram.

If you have only a small number of bins, then using the above will cause a discretization bias (reference ?).
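The np.repeat idea spelled out (a self-contained sketch with made-up numbers; bin_counts and bin_edges as returned by np.histogram):

```python
import numpy as np
from scipy import stats

np.random.seed(12345)
data = np.random.normal(loc=2.0, scale=0.5, size=10000)

bin_counts, bin_edges = np.histogram(data, bins=50)
bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2.0

# artificial dataset: every bin center repeated by its count
fake_data = np.repeat(bin_centers, bin_counts)

loc, scale = stats.norm.fit(fake_data)
print(loc, scale)   # close to the loc/scale of the original data
```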
In this small-bins case, I would fit either the histogram or the cumulative histogram to the discrete probabilities from the discretization, for example something like minimizing

def fun(cumhist, bin_edges, loc, scale):
    diff = cumhist - stats.norm.cdf(bin_edges, loc=loc, scale=scale)
    # fix or drop the first element
    return (diff**2).sum()

Josef

> Kind regards,
> Sep
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

From jjhelmus at gmail.com Wed Dec 5 12:33:20 2012
From: jjhelmus at gmail.com (Jonathan Helmus)
Date: Wed, 05 Dec 2012 12:33:20 -0500
Subject: [SciPy-User] equivalent of R quantile() function in scipy
In-Reply-To: References: <50BE22C0.6040103@gmail.com> <50BE287F.1020800@gmail.com>
Message-ID: <50BF8560.4050702@gmail.com>

On 12/04/2012 11:43 AM, Skipper Seabold wrote:
> On Tue, Dec 4, 2012 at 11:44 AM, Jonathan Helmus wrote:
>> On 12/04/2012 11:26 AM, Pauli Virtanen wrote:
>>> 04.12.2012 18:20, Jonathan Helmus kirjoitti:
>>>> numpy.percentile can accept a sequence of percentiles as the second
>>>> parameter:
>>>>
>>>> In [8]: probs = [10.0, 30.0, 50.0, 70.0, 90.0]
>>>>
>>>> In [9]: a = np.arange(100)
>>>>
>>>> In [10]: np.percentile(a, probs)
>>> [clip]
>>>> In [11]: [scipy.stats.scoreatpercentile(a, i) for i in probs]
>>>> Out[11]:
>>> [clip]
>>>
>>> It could be useful if someone would take a look at whether the
>>> implementation in scipy can be easily adapted for that.
>>>
>> NumPy's implementation is pure python and quite straightforward, I'll
>> put together a pull request to include similar functionality in SciPy.
>> Adding the limit optional parameter shouldn't be too hard.
>>
> Great, thanks for looking at this. Could you also look at adding an
> axis keyword to scoreatpercentile? Should be simple, but IIRC this was
> missing the last time I looked at this.
> > Skipper I made a pull request that adds sequences of percentiles and an axis keyword to the scoreatpercentile function. https://github.com/scipy/scipy/pull/374 We can move any further discussion to the PR comments. - Jonathan Helmus From ndbecker2 at gmail.com Wed Dec 5 14:01:44 2012 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 05 Dec 2012 14:01:44 -0500 Subject: [SciPy-User] typo in stats tutorial? Message-ID: The text says there's very little difference between Scott's Rule and Silverman's Rule. With the code, I don't think there'd be any difference at all: from scipy import stats x1 = np.array([-7, -5, 1, 4, 5], dtype=np.float) kde1 = stats.gaussian_kde(x1) kde2 = stats.gaussian_kde(x1, bw_method='silverman') fig = plt.figure() ax = fig.add_subplot(111) ax.plot(x1, np.zeros(x1.shape), 'b+', ms=20) # rug plot x_eval = np.linspace(-10, 10, num=200) ax.plot(x_eval, kde1(x_eval), 'k-', label="Scott's Rule") ax.plot(x_eval, kde1(x_eval), 'r-', label="Silverman's Rule") plt.show() From josef.pktd at gmail.com Wed Dec 5 14:20:00 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 5 Dec 2012 14:20:00 -0500 Subject: [SciPy-User] typo in stats tutorial? In-Reply-To: References: Message-ID: On Wed, Dec 5, 2012 at 2:01 PM, Neal Becker wrote: > The text says there's very little difference between Scott's Rule and > Silverman's Rule. With the code, I don't think there'd be any difference at > all: > > from scipy import stats > > x1 = np.array([-7, -5, 1, 4, 5], dtype=np.float) > kde1 = stats.gaussian_kde(x1) > kde2 = stats.gaussian_kde(x1, bw_method='silverman') > > fig = plt.figure() > ax = fig.add_subplot(111) > > ax.plot(x1, np.zeros(x1.shape), 'b+', ms=20) # rug plot > x_eval = np.linspace(-10, 10, num=200) > ax.plot(x_eval, kde1(x_eval), 'k-', label="Scott's Rule") > ax.plot(x_eval, kde1(x_eval), 'r-', label="Silverman's Rule") kde2 what are we supposed to see? 
nice, example runs without changes in python 3.2.3 (the only python on my computer that seems to have a recent scipy and matplotlib) Josef > > plt.show() > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From jrocher at enthought.com Wed Dec 5 16:48:57 2012 From: jrocher at enthought.com (Jonathan Rocher) Date: Wed, 5 Dec 2012 15:48:57 -0600 Subject: [SciPy-User] SciPy 2013 in Austin TX, June 24th-29th Message-ID: Dear all, We are extremely excited to announce that the 12th SciPy conference will take place in Austin TX from June 24th to June 29th 2013. http://conference.scipy.org/scipy2013/ This year again, we will have tutorials, BOFs, sprints in addition to the conference. We also have many ideas to make this coming edition even more exciting and productive than the past years and will be probing the community for ideas and volunteers. In the meanwhile, please save the dates, signup for the newsletter and pass the word around. Hope to see many of you then. Andy Terrel, Co-Chair of Scipy2013 Jonathan Rocher, Co-Chair of Scipy2013 -------------- next part -------------- An HTML attachment was scrubbed... URL: From rmain at gmx.de Wed Dec 5 17:23:49 2012 From: rmain at gmx.de (rmain at gmx.de) Date: Wed, 05 Dec 2012 23:23:49 +0100 Subject: [SciPy-User] How to convolve with transfer function Message-ID: <50BFC975.1090905@gmx.de> Hi everyone. I want to do the following: I have an image and want to apply the transfer function http://www.abload.de/img/tfbiif7.png to my image. However, in the article I am using it said the following: "Due to the singularity in the log function at the origin, one cannot construct an analytic expression for the shape of the Log-Gabor function in the spatial domain. Therefore, the filters are constructed in the frequency domain." So it has to be used in the frequency domain. I first have to transform my image via FFT. 
For that, I found the function numpy.fft.rfft2 which does the transformation. However, I'm at a loss on how to apply the transfer function to the array. I found the scipy.signal.convolve method, however that one seems to work with arrays. But I can't figure out how to convolve with this transfer function if it can't be defined as an array.

Would you please help me with my problem?

Cheers
Robert

From malandrea80 at gmail.com Wed Dec 5 18:48:58 2012
From: malandrea80 at gmail.com (MalAndrea)
Date: Thu, 6 Dec 2012 00:48:58 +0100
Subject: [SciPy-User] Parallel Differential Evolution
In-Reply-To: References: Message-ID:

Hi,

I have followed Rob's suggestions on this, i.e. to substitute:

(1). *numpy.zeros(X)* instead of *flex.double(X, 0)*
(2). *1000*numpy.ones(X)* instead of *flex.double(X, 1000)*
(3). *numpy.min(X)* for *flex.min(X)*, etc. (mean, sum)
(4). *numpy.random.uniform(size=N)* for *flex.random_double(N)*

However, to get this to work, I also had to modify the following:

- modification (4). only works when floats are meant to be used, I guess, so in the only case: *rnd = numpy.random.uniform(size=self.vector_length)* instead of *flex.random_double(N)*. In the other two cases it is *random_values = numpy.random.random_integers(low=0.0, high=1.0, size=N)* instead of *flex.random_double(N)*
- Also, *numpy.nanargmin* instead of *flex.min_index*
- Also, *numpy.argsort* instead of *flex.sort_permutation*
- Also, *.copy()* instead of *.deep_copy()*
- Finally, *numpy.random.seed(0)* instead of *flex.set_random_seed(0)*

Now, it works (at least for the Rosenbrock function).

Thank you,
Andrea

P.S. Sorry, this is the (old) message I referred to:

Hi,
> Maybe the following is a useful implementation for you:
> http://cci.lbl.gov/cctbx_sources/scitbx/differential_evolution.py
>
> According to this file you need only the scitbx library, which
> unfortunately looks non-trivial to install (it seems to be related to
> Boost).
> However, this looks like an old code (stdlib no longer exists,
> you'd just write "import random" now) and all the work done with
> arrays using the scitbx library (index of minimum value, mean value,
> etc.) can easily be done with the much more easily-installed numpy
> library from scipy.org. You probably just need that, a simple graphing
> package such as matplotlib, and a working environment such as Ipython.
> Binary windows installers are available for all these from a simple
> google search for "install X windows" where X is any of the above
> packages (they will be found at sourceforge.net). Don't forget to
> match the binary installer's version to your version of Python.
>
> So instead of importing flex you'd import numpy and then make the
> following substitutions (not tested):
>
> numpy.zeros(X) instead of flex.double(X, 0)
>
> 1000*numpy.ones(X) instead of flex.double(X, 1000)
>
> numpy.min(X) for flex.min(X), etc.
>
> numpy.random.uniform(size=N) for flex.random_double(N)
>
> If you get stuck post your code back here and someone can take a look.
> Hope this helps,
> Rob
>
> On Wed, Feb 24, 2010 at 6:02 PM, Debabrata Midya wrote:
>> Hi SciPy-Users,
>>
>> Thanks in advance.
>>
>> I am Deb, Statistician at NSW Department of Services, Technology and
>> Administration, Sydney, Australia.
>>
>> I am new to this user group as well as to Python and Parallel Python.
>>
>> I am interested to use Differential Evolution (DE) using Parallel Python on
>> Windows XP. I have some experience in DE using GNU GCC compiler.
>>
>> Is there any software for it? If yes, anyone can assist me what are the
>> softwares I need to install and their locations please.
>>
>> Once again, thank you very much for the time you have given.
>>
>> I am looking forward for your reply.
>>
>> Regards,
>>
>> Deb
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From josef.pktd at gmail.com Wed Dec 5 20:19:49 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 5 Dec 2012 20:19:49 -0500 Subject: [SciPy-User] How to convolve with transfer function In-Reply-To: <50BFC975.1090905@gmx.de> References: <50BFC975.1090905@gmx.de> Message-ID: On Wed, Dec 5, 2012 at 5:23 PM, wrote: > Hi everyone. I want to do the following: > I have an image and want to apply the transfer function > http://www.abload.de/img/tfbiif7.png to my image. However, in the > article I am using it said the following: > "Due to the singularity in the log function at the origin, one cannot > construct an analytic expression for the shape > of the Log-Gabor function in the spatial domain. Therefore, the filters > are constructed in the frequency domain." > So it has to be used in the frequency domain. I first have to transform > my image via FFT. For that, I found the function numpy.fft.rfft2 which > does the transformation. However I'm at a loss on how to apply the > transfer function to the array. I found the scipy.signal.convolve > method, however that one seems to work with arrays. But I can't figure > out how to convolve with this transfer function if it can't be defined > as an array. > Would you please help me with my problem? Hopefully there will be a more specific answer. What I did in a similar case (but not with images) is to take fftconvolve https://github.com/scipy/scipy/blob/v0.11.0/scipy/signal/signaltools.py#L134 replace IN1 *= fftn(in2, fsize) with IN1 *= G(w) where G(w) is the Fourier transform of the transfer function that you have given in your formula. The problem that I usually had was figuring out what the frequencies, w, are supposed to be, and what shape G(w) is supposed to have. 
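For a 2-d image, the same idea might look roughly like the following (a sketch; the f0 and sigma_ratio values are made-up examples, not taken from the article, and zeroing the response at the origin is the usual way around the log singularity):

```python
import numpy as np

def radial_freq(shape):
    # radial spatial frequency for every bin of a 2-d FFT
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    return np.hypot(fx, fy)

def log_gabor(r, f0=0.1, sigma_ratio=0.55):
    # one common form of the log-Gabor transfer function;
    # the response at r == 0 is set to 0
    out = np.zeros_like(r)
    nz = r > 0
    out[nz] = np.exp(-np.log(r[nz] / f0) ** 2
                     / (2.0 * np.log(sigma_ratio) ** 2))
    return out

def apply_transfer(image, transfer):
    # multiply in the frequency domain, then transform back
    F = np.fft.fft2(image)
    G = transfer(radial_freq(image.shape))
    return np.real(np.fft.ifft2(F * G))
```

With transfer = lambda r: np.ones_like(r) the image comes back unchanged, which is a cheap way to check the plumbing before worrying about the filter shape itself.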
(recipe how to cannibalize scipy source, not for your specific problem)

Josef

>
> Cheers
> Robert
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

From paul.kienzle at nist.gov Wed Dec 5 11:31:03 2012
From: paul.kienzle at nist.gov (Paul Kienzle)
Date: Wed, 5 Dec 2012 11:31:03 -0500
Subject: [SciPy-User] Need help to use rv_continuous.fit
In-Reply-To: <50BF6420.20307@isc.ac.uk>
References: 3d375d731003220750i8801ed7o3dfaaa2990f7b057@mail.gmail.com
	<50BF6420.20307@isc.ac.uk>
Message-ID: 

On Dec 5, 2012, at 10:11 AM, Sepideh Rastin wrote:

> Hi there,
> I have histograms that will form the likelihood function and need to
> find the best normal distributions.
> I would like to know if it would be possible to create my input array to
> rv_continuous.fit using the data of my histograms.

If you have the raw data, you can just use mean() and std() functions. E.g.,

from pylab import *

n = 1000
mu = 2
sigma = 0.3

def G(x,m,s):
    return exp(-((x-m)/s)**2/2)/sqrt(2*pi*s**2)

# fake data
s = randn(n)*sigma + mu

# estimates
mu_hat, sigma_hat = mean(s), std(s)

# plot
t = linspace(mu-3*sigma,mu+3*sigma,200)
bins = linspace(t[0],t[-1],21)
step = bins[1]-bins[0]
hist(s,bins=bins)
plot(t,G(t,mu_hat,sigma_hat)*n*step,'g',hold=True)
show()

    - Paul

Paul Kienzle
pkienzle at nist.gov

From monocongo at gmail.com Thu Dec 6 15:10:26 2012
From: monocongo at gmail.com (James Adams)
Date: Thu, 6 Dec 2012 15:10:26 -0500
Subject: [SciPy-User] Hydroclimpy: unable to find sqlite module
Message-ID: 

I have installed the scikits.hydroclimpy package but when I try to
verify that it's installed correctly via an import command within the
python interpreter I get an error message telling me that the sqlite
module is not present:

$ python
Python 2.7.3 (default, Dec 6 2012, 14:20:34)
[GCC 4.4.6 20120305 (Red Hat 4.4.6-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import scikits.hydroclimpy
Traceback (most recent call last):
  File "", line 1, in
  File "/usr/local/lib/python2.7/site-packages/scikits.hydroclimpy-0.67.1-py2.7-linux-x86_64.egg/scikits/hydroclimpy/__init__.py", line 19, in
    import io
  File "/usr/local/lib/python2.7/site-packages/scikits.hydroclimpy-0.67.1-py2.7-linux-x86_64.egg/scikits/hydroclimpy/io/__init__.py", line 7, in
    import sqlite
ImportError: No module named sqlite


I do however have the sqlite3 module installed.

Can anyone suggest a work around for this?

--James

From cgohlke at uci.edu Thu Dec 6 15:34:49 2012
From: cgohlke at uci.edu (Christoph Gohlke)
Date: Thu, 06 Dec 2012 12:34:49 -0800
Subject: [SciPy-User] Hydroclimpy: unable to find sqlite module
In-Reply-To: 
References: 
Message-ID: <50C10169.6000200@uci.edu>

On 12/6/2012 12:10 PM, James Adams wrote:
> I have installed the scikits.hydroclimpy package but when I try to
> verify that it's installed correctly via an import command within the
> python interpreter I get an error message telling me that the sqlite
> module is not present:
>
> $ python
> Python 2.7.3 (default, Dec 6 2012, 14:20:34)
> [GCC 4.4.6 20120305 (Red Hat 4.4.6-4)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>>>> import scikits.hydroclimpy
> Traceback (most recent call last):
>   File "", line 1, in
>   File "/usr/local/lib/python2.7/site-packages/scikits.hydroclimpy-0.67.1-py2.7-linux-x86_64.egg/scikits/hydroclimpy/__init__.py",
> line 19, in
>     import io
>   File "/usr/local/lib/python2.7/site-packages/scikits.hydroclimpy-0.67.1-py2.7-linux-x86_64.egg/scikits/hydroclimpy/io/__init__.py",
> line 7, in
>     import sqlite
> ImportError: No module named sqlite
>
>
> I do however have the sqlite3 module installed.
>
> Can anyone suggest a work around for this?
>
> --James

sqlite.py is missing in trunk.
You could try to copy the file from the pierregm branch.

Christoph

From monocongo at gmail.com Thu Dec 6 16:03:24 2012
From: monocongo at gmail.com (James Adams)
Date: Thu, 6 Dec 2012 16:03:24 -0500
Subject: [SciPy-User] Hydroclimpy: unable to find sqlite module
Message-ID: 

Copying the sqlite.py file you referenced into the io directory and
then rebuilding/reinstalling hydroclimpy did the trick.

Thanks!

--James

From ralf.gommers at gmail.com Sat Dec 8 05:36:44 2012
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sat, 8 Dec 2012 11:36:44 +0100
Subject: [SciPy-User] distributions - who got the most ?
In-Reply-To: 
References: 
Message-ID: 

On Wed, Dec 5, 2012 at 2:15 AM,  wrote:

> On Tue, Dec 4, 2012 at 4:01 PM, Ralf Gommers  wrote:
> >
> > On Tue, Dec 4, 2012 at 4:30 AM,  wrote:
> >>
> >> scipy.stats has more than 90 distributions.
> >>
> >> Do we want to increase it by almost a factor of 10? :)
> >>
> >> While looking for the cdf of a distribution, I found this :
> >>
> >> http://www.mathworks.com/matlabcentral/fileexchange/35008-generation-of-random-variates
> >>
> >> He collected 870 distributions (under BSD license). Includes generic
> >> random number generation.
> >>
> >> Even though there are some variations of distributions counted
> >> separately, given my quick browsing this looks impressive and a good
> >> source for code and references.
> >> Coding style is not great but it's 10 years or so of collecting
> >> distributions.
> >
> > Adding a lot of distributions sounds fine to me. That many distributions
> > would need to go into a separate namespace.
> >
> > Any additions should be complete though (the Matlab code only has pdf/cdf)
> > and well tested. The Matlab code doesn't look all that useful except for the
> > references ("coding style is not great" is really too kind). I also don't
> > trust the BSD license that's put on it, many files have different author
> > names in them with no mention of where they came from.
>
> The matlab code includes several "special" functions that look mostly
> copied from other authors.
> This would need checking, but I doubt we need many of those since we
> have scipy.special.
> We are missing some special functions for distributions, but I didn't
> check whether he has any of those.
> The pdfs, and the cdfs when available, look like they were implemented
> by the author, at least it looks that way for the small sample that I
> checked.
> (Code quality varies a lot, but many distributions are vectorized or
> can be easily vectorized from his code.)
>
> Given the pdf, the rest could all be derived generically. But it won't
> be efficient.

True, but that doesn't feel quite right. Tickets are being opened
regularly about precision issues due to using the generic methods.
Same for speed, but that's perhaps less critical. Generic methods
are often not good enough.

> Also, I just saw that sympy could become useful to derive extra properties
> http://matthewrocklin.com/blog/work/2012/12/03/Characteristic-Functions/
> sympy.stats also works based only on the pdf (from what I have seen).
>
> I'm a bit skeptical about the number of distributions that are
> actually generally useful and not just used once in a journal article.
> My impression from tracking several statistics journals is that there
> are at least 10 new distributions each year.
>
> As an example, he has a long list of poisson mixture distributions
> that I never heard of except for negative binomial. They might be
> useful in some cases, but a more general class might cover it better.
> From a brief look at his reference
> http://scholar.google.com/scholar?cluster=6061641765696455790&hl=en&as_sdt=0,5&as_vis=1
> I think it might not be necessary to implement all details for 5 or
> more distributions separately.
> According to Google the paper has only 4 citations.
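[Editor's aside] The point that "given the pdf, the rest could all be derived generically, but it won't be efficient" is exactly what scipy.stats.rv_continuous does: subclass it with only _pdf, and cdf, ppf, rvs, moments and so on fall back to the slow generic numerical machinery. A minimal sketch with a made-up toy distribution (the class name and parameters are illustrative, not from scipy):

```python
import numpy as np
from scipy import stats

class TriangularRamp(stats.rv_continuous):
    """Toy distribution with pdf f(x) = 2x on [0, 1].

    Only _pdf is supplied; cdf, ppf, rvs, mean, etc. come from
    rv_continuous's generic (numerical, hence slow) methods.
    """
    def _pdf(self, x):
        return 2.0 * x

# a and b set the support; everything else is derived from _pdf
ramp = TriangularRamp(a=0.0, b=1.0, name='ramp')

print(ramp.cdf(0.5))   # ~0.25, via numerical integration of _pdf
print(ramp.mean())     # ~2/3
print(ramp.ppf(0.25))  # ~0.5, via numerical inversion of the cdf
```

This illustrates both halves of the quoted exchange: the generic machinery does give you a complete distribution from the pdf alone, but every cdf call is a quadrature and every ppf call a root find, which is where the precision and speed complaints come from.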
see also 1)

> But there are a lot of distributions, or classes/categories of
> distributions that scipy is missing, and are for example available in
> R, but in R they are spread out over many packages.

Keeping a list of those in a ticket could be useful.

Ralf

> Josef
>
> 1) another reference for poisson mixtures (technical, not a quick
> read, but a funny table)
>
> Karlis, D. and Xekalaki, E. (2005), Mixed Poisson Distributions.
> International Statistical Review, 73: 35-58. doi:
> 10.1111/j.1751-5823.2005.tb00250.x
>
> http://scholar.google.com/scholar?cluster=4455890634693542956&hl=en&as_sdt=2005&sciodt=0,5
>
> --------------------------
> Table 1
> Some mixed Poisson distributions.
> Mixed Poisson Distribution | Mixing Distribution | A Key Reference
> Negative Binomial | Gamma | Greenwood & Yule (1920)
> Geometric | Exponential | Johnson et al. (1992)
> Poisson-Linear Exponential Family | Linear Exponential Family | Sankaran (1969)
> Poisson-Lindley | Lindley | Sankaran (1970)
> Poisson-Linear Exponential | Linear Exponential | Kling & Goovaerts (1993)
> Poisson-Lognormal | Lognormal | Bulmer (1974)
> Poisson-Confluent Hypergeometric Series | Confluent Hypergeometric Series | Bhattacharya (1966)
> Poisson-Generalized Inverse Gaussian | Generalized Inverse Gaussian | Sichel (1974)
> Sichel | Inverse Gaussian | Sichel (1975)
> Poisson-Inverse Gamma | Inverse Gamma | Willmot (1993)
> Poisson-Truncated Normal | Truncated Normal | Patil (1964)
> Generalized Waring | Gamma Product Ratio | Irwin (1975)
> Simple Waring | Exponential Beta | Pielou (1962)
> Yule | Beta with Specific Parameter Values | Simon (1955)
> Poisson-Generalized Pareto | Generalized Pareto | Kempton (1975)
> Poisson-Beta I | Beta Type I | Holla & Bhattacharya (1965)
> Poisson-Beta II | Beta Type II | Gurland (1958)
> Poisson-Truncated Beta II | Truncated Beta Type II | Willmot (1986)
> Poisson-Uniform | Uniform | Bhattacharya (1966)
> Poisson-Truncated Gamma | Truncated Gamma | Willmot (1993)
> Poisson-Generalized Gamma | Generalized Gamma | Albrecht (1984)
> Dellaporte | Shifted Gamma | Ruohonen (1988)
> Poisson-Modified Bessel of the 3rd Kind | Modified Bessel of the 3rd Kind | Ong & Muthaloo (1995)
> Poisson-Pareto | Pareto | Willmot (1993)
> Poisson-Shifted Pareto | Shifted Pareto | Willmot (1993)
> Poisson-Pearson Family | Pearson's Family of Distributions | Albrecht (1982)
> Poisson-Log-Student | Log-Student | Gaver & O'Muircheartaigh (1987)
> Poisson-Power Function | Power Function Distribution | Rai (1971)
> Poisson-Lomax | Lomax | Al-Awadhi & Ghitany (2001)
> Poisson-Power Variance | Power Variance Family | Hougaard et al. (1997)
> Neyman | Poisson | Douglas (1980)
> Other Discrete Distributions | | Johnson et al. (1992)
> -------------------------------------------------------------------

> > Ralf
> >
> > _______________________________________________
> > SciPy-User mailing list
> > SciPy-User at scipy.org
> > http://mail.scipy.org/mailman/listinfo/scipy-user
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From fgnu32 at yahoo.com Sun Dec 9 23:01:30 2012
From: fgnu32 at yahoo.com (Fg Nu)
Date: Sun, 9 Dec 2012 20:01:30 -0800 (PST)
Subject: [SciPy-User] Logistic regression using SciPy
Message-ID: <1355112090.68806.YahooMailNeo@web160105.mail.bf1.yahoo.com>

I am trying to code up logistic regression in Python using the SciPy
"fmin_bfgs" function, but am running into some issues. I wrote functions
for the logistic (sigmoid) transformation function, and the cost
function, and those work fine (I have used the optimized values of the
parameter vector found via canned software to test the functions, and
those match up). I am not that sure of my implementation of the gradient
function, but it looks reasonable.

Here is the code:

#==================================================

# purpose: logistic regression
import numpy as np
import scipy as sp
import scipy.optimize

import matplotlib as mpl
import os

# prepare the data
data = np.loadtxt('data.csv', delimiter=',', skiprows=1)
vY = data[:, 0]
mX = data[:, 1:]
intercept = np.ones(mX.shape[0]).reshape(mX.shape[0], 1)
mX = np.concatenate((intercept, mX), axis = 1)
iK = mX.shape[1]
iN = mX.shape[0]

# logistic transformation
def logit(mX, vBeta):
    return((1/(1.0 + np.exp(-np.dot(mX, vBeta)))))

# test function call
vBeta0 = np.array([-.10296645, -.0332327, -.01209484, .44626211, .92554137, .53973828,
    1.7993371, .7148045])
logit(mX, vBeta0)

# cost function
def logLikelihoodLogit(vBeta, mX, vY):
    return(-(np.sum(vY*np.log(logit(mX, vBeta)) + (1-vY)*(np.log(1-logit(mX, vBeta))))))
logLikelihoodLogit(vBeta0, mX, vY) # test function call

# gradient function
def likelihoodScore(vBeta, mX, vY):
    return(np.dot(mX.T,
                  ((np.dot(mX, vBeta) - vY)/
                   np.dot(mX, vBeta)).reshape(iN, 1)).reshape(iK, 1))

likelihoodScore(vBeta0, mX, vY).shape # test function call

# optimize the function (without gradient)
optimLogit = scipy.optimize.fmin_bfgs(logLikelihoodLogit,
                                      x0 = np.array([-.1, -.03, -.01, .44, .92, .53,
                                                     1.8, .71]),
                                      args = (mX, vY), gtol = 1e-3)

# optimize the function (with gradient)
optimLogit = scipy.optimize.fmin_bfgs(logLikelihoodLogit,
                                      x0 = np.array([-.1, -.03, -.01, .44, .92, .53,
                                                     1.8, .71]), fprime = likelihoodScore,
                                      args = (mX, vY), gtol = 1e-3)
#=====================================================

* The first optimization (without gradient) ends with a whole lot of stuff about division by zero.
* The second optimization (with gradient) ends with a matrices not aligned error, which probably means I have got the way the gradient is to be returned wrong.

Any help with this is appreciated. If anyone wants to try this, the data is included below.

low,age,lwt,race,smoke,ptl,ht,ui
0,19,182,2,0,0,0,1
0,33,155,3,0,0,0,0
0,20,105,1,1,0,0,0
0,21,108,1,1,0,0,1
0,18,107,1,1,0,0,1
0,21,124,3,0,0,0,0
0,22,118,1,0,0,0,0
0,17,103,3,0,0,0,0
0,29,123,1,1,0,0,0
0,26,113,1,1,0,0,0
0,19,95,3,0,0,0,0
0,19,150,3,0,0,0,0
0,22,95,3,0,0,1,0
0,30,107,3,0,1,0,1
0,18,100,1,1,0,0,0
0,18,100,1,1,0,0,0
0,15,98,2,0,0,0,0
0,25,118,1,1,0,0,0
0,20,120,3,0,0,0,1
0,28,120,1,1,0,0,0
0,32,121,3,0,0,0,0
0,31,100,1,0,0,0,1
0,36,202,1,0,0,0,0
0,28,120,3,0,0,0,0
0,25,120,3,0,0,0,1
0,28,167,1,0,0,0,0
0,17,122,1,1,0,0,0
0,29,150,1,0,0,0,0
0,26,168,2,1,0,0,0
0,17,113,2,0,0,0,0
0,17,113,2,0,0,0,0
0,24,90,1,1,1,0,0
0,35,121,2,1,1,0,0
0,25,155,1,0,0,0,0
0,25,125,2,0,0,0,0
0,29,140,1,1,0,0,0
0,19,138,1,1,0,0,0
0,27,124,1,1,0,0,0
0,31,215,1,1,0,0,0
0,33,109,1,1,0,0,0
0,21,185,2,1,0,0,0
0,19,189,1,0,0,0,0
0,23,130,2,0,0,0,0
0,21,160,1,0,0,0,0
0,18,90,1,1,0,0,1
0,18,90,1,1,0,0,1
0,32,132,1,0,0,0,0
0,19,132,3,0,0,0,0
0,24,115,1,0,0,0,0
0,22,85,3,1,0,0,0
0,22,120,1,0,0,1,0
0,23,128,3,0,0,0,0
0,22,130,1,1,0,0,0
0,30,95,1,1,0,0,0
0,19,115,3,0,0,0,0
0,16,110,3,0,0,0,0
0,21,110,3,1,0,0,1
0,30,153,3,0,0,0,0
0,20,103,3,0,0,0,0
0,17,119,3,0,0,0,0
0,17,119,3,0,0,0,0
0,23,119,3,0,0,0,0
0,24,110,3,0,0,0,0
0,28,140,1,0,0,0,0
0,26,133,3,1,2,0,0
0,20,169,3,0,1,0,1
0,24,115,3,0,0,0,0
0,28,250,3,1,0,0,0
0,20,141,1,0,2,0,1
0,22,158,2,0,1,0,0
0,22,112,1,1,2,0,0
0,31,150,3,1,0,0,0
0,23,115,3,1,0,0,0
0,16,112,2,0,0,0,0
0,16,135,1,1,0,0,0
0,18,229,2,0,0,0,0
0,25,140,1,0,0,0,0
0,32,134,1,1,1,0,0
0,20,121,2,1,0,0,0
0,23,190,1,0,0,0,0
0,22,131,1,0,0,0,0
0,32,170,1,0,0,0,0
0,30,110,3,0,0,0,0
0,20,127,3,0,0,0,0
0,23,123,3,0,0,0,0
0,17,120,3,1,0,0,0
0,19,105,3,0,0,0,0
0,23,130,1,0,0,0,0
0,36,175,1,0,0,0,0
0,22,125,1,0,0,0,0
0,24,133,1,0,0,0,0
0,21,134,3,0,0,0,0
0,19,235,1,1,0,1,0
0,25,95,1,1,3,0,1
0,16,135,1,1,0,0,0
0,29,135,1,0,0,0,0
0,29,154,1,0,0,0,0
0,19,147,1,1,0,0,0
0,19,147,1,1,0,0,0
0,30,137,1,0,0,0,0
0,24,110,1,0,0,0,0
0,19,184,1,1,0,1,0
0,24,110,3,0,1,0,0
0,23,110,1,0,0,0,0
0,20,120,3,0,0,0,0
0,25,241,2,0,0,1,0
0,30,112,1,0,0,0,0
0,22,169,1,0,0,0,0
0,18,120,1,1,0,0,0
0,16,170,2,0,0,0,0
0,32,186,1,0,0,0,0
0,18,120,3,0,0,0,0
0,29,130,1,1,0,0,0
0,33,117,1,0,0,0,1
0,20,170,1,1,0,0,0
0,28,134,3,0,0,0,0
0,14,135,1,0,0,0,0
0,28,130,3,0,0,0,0
0,25,120,1,0,0,0,0
0,16,95,3,0,0,0,0
0,20,158,1,0,0,0,0
0,26,160,3,0,0,0,0
0,21,115,1,0,0,0,0
0,22,129,1,0,0,0,0
0,25,130,1,0,0,0,0
0,31,120,1,0,0,0,0
0,35,170,1,0,1,0,0
0,19,120,1,1,0,0,0
0,24,116,1,0,0,0,0
0,45,123,1,0,0,0,0
1,28,120,3,1,1,0,1
1,29,130,1,0,0,0,1
1,34,187,2,1,0,1,0
1,25,105,3,0,1,1,0
1,25,85,3,0,0,0,1
1,27,150,3,0,0,0,0
1,23,97,3,0,0,0,1
1,24,128,2,0,1,0,0
1,24,132,3,0,0,1,0
1,21,165,1,1,0,1,0
1,32,105,1,1,0,0,0
1,19,91,1,1,2,0,1
1,25,115,3,0,0,0,0
1,16,130,3,0,0,0,0
1,25,92,1,1,0,0,0
1,20,150,1,1,0,0,0
1,21,200,2,0,0,0,1
1,24,155,1,1,1,0,0
1,21,103,3,0,0,0,0
1,20,125,3,0,0,0,1
1,25,89,3,0,2,0,0
1,19,102,1,0,0,0,0
1,19,112,1,1,0,0,1
1,26,117,1,1,1,0,0
1,24,138,1,0,0,0,0
1,17,130,3,1,1,0,1
1,20,120,2,1,0,0,0
1,22,130,1,1,1,0,1
1,27,130,2,0,0,0,1
1,20,80,3,1,0,0,1
1,17,110,1,1,0,0,0
1,25,105,3,0,1,0,0
1,20,109,3,0,0,0,0
1,18,148,3,0,0,0,0
1,18,110,2,1,1,0,0
1,20,121,1,1,1,0,1
1,21,100,3,0,1,0,0
1,26,96,3,0,0,0,0
1,31,102,1,1,1,0,0
1,15,110,1,0,0,0,0
1,23,187,2,1,0,0,0
1,20,122,2,1,0,0,0
1,24,105,2,1,0,0,0
1,15,115,3,0,0,0,1
1,23,120,3,0,0,0,0
1,30,142,1,1,1,0,0
1,22,130,1,1,0,0,0
1,17,120,1,1,0,0,0
1,23,110,1,1,1,0,0
1,17,120,2,0,0,0,0
1,26,154,3,0,1,1,0
1,20,106,3,0,0,0,0
1,26,190,1,1,0,0,0
1,14,101,3,1,1,0,0
1,28,95,1,1,0,0,0
1,14,100,3,0,0,0,0
1,23,94,3,1,0,0,0
1,17,142,2,0,0,1,0
1,21,130,1,1,0,1,0

Thanks.

From d.warde.farley at gmail.com Sun Dec 9 23:22:35 2012
From: d.warde.farley at gmail.com (David Warde-Farley)
Date: Sun, 9 Dec 2012 23:22:35 -0500
Subject: [SciPy-User] Logistic regression using SciPy
In-Reply-To: <1355112090.68806.YahooMailNeo@web160105.mail.bf1.yahoo.com>
References: <1355112090.68806.YahooMailNeo@web160105.mail.bf1.yahoo.com>
Message-ID: 

First, the way you've written the log likelihood is numerically
unstable. Consider simplifying the expression (using logarithm laws
and breaking apart the logistic function) and using the log1p function
where appropriate.

Second, the optimization problem is going to be extremely ill
conditioned given the very different scales of your different
predictors. You should probably mean-center and divide by the standard
deviation.

Third, there's a check_grad function in scipy.optimize that can be
used to troubleshoot gradient issues.

Fourth, there's a pre-rolled version of this in scikit-learn that will
probably be a good deal faster (it wraps a C library) and certainly
better tested than home-rolling it.

David

On Sun, Dec 9, 2012 at 11:01 PM, Fg Nu  wrote:
>
> I am trying to code up logistic regression in Python using the SciPy "fmin_bfgs" function, but am running into some issues.
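[Editor's aside] David's first point can be made concrete. With p = 1/(1+exp(-z)) and z = X beta, the logarithm laws give log(p) = z - log(1+exp(z)) and log(1-p) = -log(1+exp(z)), so the negative log-likelihood collapses to sum(log(1+exp(z)) - y*z), which np.logaddexp(0, z) evaluates without overflow; the gradient is then X.T @ (sigmoid(z) - y), returned as the 1-d array fmin_bfgs expects. This is a sketch under new variable names, not the original poster's code:

```python
import numpy as np
from scipy.optimize import check_grad, fmin_bfgs

def neg_log_likelihood(beta, X, y):
    # -loglik = sum_i [log(1 + exp(z_i)) - y_i * z_i], z = X beta;
    # np.logaddexp(0, z) computes log(1 + exp(z)) stably
    z = X.dot(beta)
    return np.sum(np.logaddexp(0.0, z) - y * z)

def neg_score(beta, X, y):
    # gradient of the negative log-likelihood: X^T (sigmoid(z) - y),
    # as a 1-d array, which is what fmin_bfgs wants from fprime
    z = X.dot(beta)
    return X.T.dot(1.0 / (1.0 + np.exp(-z)) - y)

# synthetic data for illustration (intercept column plus 3 predictors)
rng = np.random.RandomState(0)
X = np.column_stack([np.ones(200), rng.randn(200, 3)])
y = (rng.rand(200) < 0.5).astype(float)

beta0 = np.zeros(X.shape[1])
print(check_grad(neg_log_likelihood, neg_score, beta0, X, y))  # small if the gradient is right
beta_hat = fmin_bfgs(neg_log_likelihood, beta0, fprime=neg_score,
                     args=(X, y), disp=False)
```

check_grad (David's third point) compares the analytic gradient to finite differences, which would also have flagged the shape mismatch behind the "matrices not aligned" error.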
I wrote functions for the logistic (sigmoid) transformation function, and the cost function, and those work fine (I have used the optimized values of the parameter vector found via canned software to test the functions, and those match up). I am not that sure of my implementation of the gradient function, but it looks reasonable. > > Here is the code: > > #================================================== > > # purpose: logistic regression > import numpy as np > import scipy as sp > import scipy.optimize > > import matplotlib as mpl > import os > > # prepare the data > data = np.loadtxt('data.csv', delimiter=',', skiprows=1) > vY = data[:, 0] > mX = data[:, 1:] > intercept = np.ones(mX.shape[0]).reshape(mX.shape[0], 1) > mX = np.concatenate((intercept, mX), axis = 1) > iK = mX.shape[1] > iN = mX.shape[0] > > # logistic transformation > def logit(mX, vBeta): > return((1/(1.0 + np.exp(-np.dot(mX, vBeta))))) > > # test function call > vBeta0 = np.array([-.10296645, -.0332327, -.01209484, .44626211, .92554137, .53973828, > 1.7993371, .7148045 ]) > logit(mX, vBeta0) > > # cost function > def logLikelihoodLogit(vBeta, mX, vY): > return(-(np.sum(vY*np.log(logit(mX, vBeta)) + (1-vY)*(np.log(1-logit(mX, vBeta)))))) > logLikelihoodLogit(vBeta0, mX, vY) # test function call > > # gradient function > def likelihoodScore(vBeta, mX, vY): > return(np.dot(mX.T, > ((np.dot(mX, vBeta) - vY)/ > np.dot(mX, vBeta)).reshape(iN, 1)).reshape(iK, 1)) > > likelihoodScore(vBeta0, mX, vY).shape # test function call > > # optimize the function (without gradient) > optimLogit = scipy.optimize.fmin_bfgs(logLikelihoodLogit, > x0 = np.array([-.1, -.03, -.01, .44, .92, .53, > 1.8, .71]), > args = (mX, vY), gtol = 1e-3) > > # optimize the function (with gradient) > optimLogit = scipy.optimize.fmin_bfgs(logLikelihoodLogit, > x0 = np.array([-.1, -.03, -.01, .44, .92, .53, > 1.8, .71]), fprime = likelihoodScore, > args = (mX, vY), gtol = 1e-3) > #===================================================== > 
> * The first optimization (without gradient) ends with a whole lot of stuff about division by zero. > > * The second optimization (with gradient) ends with a matrices not aligned error, which probably means I have got the way the gradient is to be returned wrong. > > Any help with this is appreciated. If anyone wants to try this, the data is included below. > > low,age,lwt,race,smoke,ptl,ht,ui > 0,19,182,2,0,0,0,1 > 0,33,155,3,0,0,0,0 > 0,20,105,1,1,0,0,0 > 0,21,108,1,1,0,0,1 > 0,18,107,1,1,0,0,1 > 0,21,124,3,0,0,0,0 > 0,22,118,1,0,0,0,0 > 0,17,103,3,0,0,0,0 > 0,29,123,1,1,0,0,0 > 0,26,113,1,1,0,0,0 > 0,19,95,3,0,0,0,0 > 0,19,150,3,0,0,0,0 > 0,22,95,3,0,0,1,0 > 0,30,107,3,0,1,0,1 > 0,18,100,1,1,0,0,0 > 0,18,100,1,1,0,0,0 > 0,15,98,2,0,0,0,0 > 0,25,118,1,1,0,0,0 > 0,20,120,3,0,0,0,1 > 0,28,120,1,1,0,0,0 > 0,32,121,3,0,0,0,0 > 0,31,100,1,0,0,0,1 > 0,36,202,1,0,0,0,0 > 0,28,120,3,0,0,0,0 > 0,25,120,3,0,0,0,1 > 0,28,167,1,0,0,0,0 > 0,17,122,1,1,0,0,0 > 0,29,150,1,0,0,0,0 > 0,26,168,2,1,0,0,0 > 0,17,113,2,0,0,0,0 > 0,17,113,2,0,0,0,0 > 0,24,90,1,1,1,0,0 > 0,35,121,2,1,1,0,0 > 0,25,155,1,0,0,0,0 > 0,25,125,2,0,0,0,0 > 0,29,140,1,1,0,0,0 > 0,19,138,1,1,0,0,0 > 0,27,124,1,1,0,0,0 > 0,31,215,1,1,0,0,0 > 0,33,109,1,1,0,0,0 > 0,21,185,2,1,0,0,0 > 0,19,189,1,0,0,0,0 > 0,23,130,2,0,0,0,0 > 0,21,160,1,0,0,0,0 > 0,18,90,1,1,0,0,1 > 0,18,90,1,1,0,0,1 > 0,32,132,1,0,0,0,0 > 0,19,132,3,0,0,0,0 > 0,24,115,1,0,0,0,0 > 0,22,85,3,1,0,0,0 > 0,22,120,1,0,0,1,0 > 0,23,128,3,0,0,0,0 > 0,22,130,1,1,0,0,0 > 0,30,95,1,1,0,0,0 > 0,19,115,3,0,0,0,0 > 0,16,110,3,0,0,0,0 > 0,21,110,3,1,0,0,1 > 0,30,153,3,0,0,0,0 > 0,20,103,3,0,0,0,0 > 0,17,119,3,0,0,0,0 > 0,17,119,3,0,0,0,0 > 0,23,119,3,0,0,0,0 > 0,24,110,3,0,0,0,0 > 0,28,140,1,0,0,0,0 > 0,26,133,3,1,2,0,0 > 0,20,169,3,0,1,0,1 > 0,24,115,3,0,0,0,0 > 0,28,250,3,1,0,0,0 > 0,20,141,1,0,2,0,1 > 0,22,158,2,0,1,0,0 > 0,22,112,1,1,2,0,0 > 0,31,150,3,1,0,0,0 > 0,23,115,3,1,0,0,0 > 0,16,112,2,0,0,0,0 > 0,16,135,1,1,0,0,0 > 0,18,229,2,0,0,0,0 > 
0,25,140,1,0,0,0,0 > 0,32,134,1,1,1,0,0 > 0,20,121,2,1,0,0,0 > 0,23,190,1,0,0,0,0 > 0,22,131,1,0,0,0,0 > 0,32,170,1,0,0,0,0 > 0,30,110,3,0,0,0,0 > 0,20,127,3,0,0,0,0 > 0,23,123,3,0,0,0,0 > 0,17,120,3,1,0,0,0 > 0,19,105,3,0,0,0,0 > 0,23,130,1,0,0,0,0 > 0,36,175,1,0,0,0,0 > 0,22,125,1,0,0,0,0 > 0,24,133,1,0,0,0,0 > 0,21,134,3,0,0,0,0 > 0,19,235,1,1,0,1,0 > 0,25,95,1,1,3,0,1 > 0,16,135,1,1,0,0,0 > 0,29,135,1,0,0,0,0 > 0,29,154,1,0,0,0,0 > 0,19,147,1,1,0,0,0 > 0,19,147,1,1,0,0,0 > 0,30,137,1,0,0,0,0 > 0,24,110,1,0,0,0,0 > 0,19,184,1,1,0,1,0 > 0,24,110,3,0,1,0,0 > 0,23,110,1,0,0,0,0 > 0,20,120,3,0,0,0,0 > 0,25,241,2,0,0,1,0 > 0,30,112,1,0,0,0,0 > 0,22,169,1,0,0,0,0 > 0,18,120,1,1,0,0,0 > 0,16,170,2,0,0,0,0 > 0,32,186,1,0,0,0,0 > 0,18,120,3,0,0,0,0 > 0,29,130,1,1,0,0,0 > 0,33,117,1,0,0,0,1 > 0,20,170,1,1,0,0,0 > 0,28,134,3,0,0,0,0 > 0,14,135,1,0,0,0,0 > 0,28,130,3,0,0,0,0 > 0,25,120,1,0,0,0,0 > 0,16,95,3,0,0,0,0 > 0,20,158,1,0,0,0,0 > 0,26,160,3,0,0,0,0 > 0,21,115,1,0,0,0,0 > 0,22,129,1,0,0,0,0 > 0,25,130,1,0,0,0,0 > 0,31,120,1,0,0,0,0 > 0,35,170,1,0,1,0,0 > 0,19,120,1,1,0,0,0 > 0,24,116,1,0,0,0,0 > 0,45,123,1,0,0,0,0 > 1,28,120,3,1,1,0,1 > 1,29,130,1,0,0,0,1 > 1,34,187,2,1,0,1,0 > 1,25,105,3,0,1,1,0 > 1,25,85,3,0,0,0,1 > 1,27,150,3,0,0,0,0 > 1,23,97,3,0,0,0,1 > 1,24,128,2,0,1,0,0 > 1,24,132,3,0,0,1,0 > 1,21,165,1,1,0,1,0 > 1,32,105,1,1,0,0,0 > 1,19,91,1,1,2,0,1 > 1,25,115,3,0,0,0,0 > 1,16,130,3,0,0,0,0 > 1,25,92,1,1,0,0,0 > 1,20,150,1,1,0,0,0 > 1,21,200,2,0,0,0,1 > 1,24,155,1,1,1,0,0 > 1,21,103,3,0,0,0,0 > 1,20,125,3,0,0,0,1 > 1,25,89,3,0,2,0,0 > 1,19,102,1,0,0,0,0 > 1,19,112,1,1,0,0,1 > 1,26,117,1,1,1,0,0 > 1,24,138,1,0,0,0,0 > 1,17,130,3,1,1,0,1 > 1,20,120,2,1,0,0,0 > 1,22,130,1,1,1,0,1 > 1,27,130,2,0,0,0,1 > 1,20,80,3,1,0,0,1 > 1,17,110,1,1,0,0,0 > 1,25,105,3,0,1,0,0 > 1,20,109,3,0,0,0,0 > 1,18,148,3,0,0,0,0 > 1,18,110,2,1,1,0,0 > 1,20,121,1,1,1,0,1 > 1,21,100,3,0,1,0,0 > 1,26,96,3,0,0,0,0 > 1,31,102,1,1,1,0,0 > 1,15,110,1,0,0,0,0 > 1,23,187,2,1,0,0,0 > 
1,20,122,2,1,0,0,0 > 1,24,105,2,1,0,0,0 > 1,15,115,3,0,0,0,1 > 1,23,120,3,0,0,0,0 > 1,30,142,1,1,1,0,0 > 1,22,130,1,1,0,0,0 > 1,17,120,1,1,0,0,0 > 1,23,110,1,1,1,0,0 > 1,17,120,2,0,0,0,0 > 1,26,154,3,0,1,1,0 > 1,20,106,3,0,0,0,0 > 1,26,190,1,1,0,0,0 > 1,14,101,3,1,1,0,0 > 1,28,95,1,1,0,0,0 > 1,14,100,3,0,0,0,0 > 1,23,94,3,1,0,0,0 > 1,17,142,2,0,0,1,0 > 1,21,130,1,1,0,1,0 > > Thanks. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Sun Dec 9 23:42:20 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 9 Dec 2012 23:42:20 -0500 Subject: [SciPy-User] Logistic regression using SciPy In-Reply-To: References: <1355112090.68806.YahooMailNeo@web160105.mail.bf1.yahoo.com> Message-ID: On Sun, Dec 9, 2012 at 11:22 PM, David Warde-Farley wrote: > First, the way you've written the log likelihood is numerically > unstable. Consider simplifying the expression (using logarithm laws > and breaking apart logistic function) and using the log1p function > where appropriate. > > Second, the optimization problem is going to be extremely ill > conditioned given the very different scales of your different > predictors. You should probably mean-center and divide by the standard > deviation. > > Third, there's a check_grad function in scipy.optimize that can be > used to troubleshoot gradient issues. > > Fourth, there's a pre-rolled of this in scikit-learn that will > probably be a good deal faster (it wraps a C library) and certainly > better tested than home-rolling it. or statsmodels, which uses analytical gradient and hessians, and gives you additional result statistics and (statistical) tests. Josef > > David > > On Sun, Dec 9, 2012 at 11:01 PM, Fg Nu wrote: >> >> >> I am trying to code up logistic regression in Python using the SciPy "fmin_bfgs" function, but am running into some issues. 
I wrote functions for the logistic (sigmoid) transformation function, and the cost function, and those work fine (I have used the optimized values of the parameter vector found via canned software to test the functions, and those match up). I am not that sure of my implementation of the gradient function, but it looks reasonable. >> >> Here is the code: >> >> #================================================== >> >> # purpose: logistic regression >> import numpy as np >> import scipy as sp >> import scipy.optimize >> >> import matplotlib as mpl >> import os >> >> # prepare the data >> data = np.loadtxt('data.csv', delimiter=',', skiprows=1) >> vY = data[:, 0] >> mX = data[:, 1:] >> intercept = np.ones(mX.shape[0]).reshape(mX.shape[0], 1) >> mX = np.concatenate((intercept, mX), axis = 1) >> iK = mX.shape[1] >> iN = mX.shape[0] >> >> # logistic transformation >> def logit(mX, vBeta): >> return((1/(1.0 + np.exp(-np.dot(mX, vBeta))))) >> >> # test function call >> vBeta0 = np.array([-.10296645, -.0332327, -.01209484, .44626211, .92554137, .53973828, >> 1.7993371, .7148045 ]) >> logit(mX, vBeta0) >> >> # cost function >> def logLikelihoodLogit(vBeta, mX, vY): >> return(-(np.sum(vY*np.log(logit(mX, vBeta)) + (1-vY)*(np.log(1-logit(mX, vBeta)))))) >> logLikelihoodLogit(vBeta0, mX, vY) # test function call >> >> # gradient function >> def likelihoodScore(vBeta, mX, vY): >> return(np.dot(mX.T, >> ((np.dot(mX, vBeta) - vY)/ >> np.dot(mX, vBeta)).reshape(iN, 1)).reshape(iK, 1)) >> >> likelihoodScore(vBeta0, mX, vY).shape # test function call >> >> # optimize the function (without gradient) >> optimLogit = scipy.optimize.fmin_bfgs(logLikelihoodLogit, >> x0 = np.array([-.1, -.03, -.01, .44, .92, .53, >> 1.8, .71]), >> args = (mX, vY), gtol = 1e-3) >> >> # optimize the function (with gradient) >> optimLogit = scipy.optimize.fmin_bfgs(logLikelihoodLogit, >> x0 = np.array([-.1, -.03, -.01, .44, .92, .53, >> 1.8, .71]), fprime = likelihoodScore, >> args = (mX, vY), gtol = 1e-3) >> 
#===================================================== >> >> * The first optimization (without gradient) ends with a whole lot of stuff about division by zero. >> >> * The second optimization (with gradient) ends with a matrices not aligned error, which probably means I have got the way the gradient is to be returned wrong. >> >> Any help with this is appreciated. If anyone wants to try this, the data is included below. >> >> low,age,lwt,race,smoke,ptl,ht,ui >> 0,19,182,2,0,0,0,1 >> 0,33,155,3,0,0,0,0 >> 0,20,105,1,1,0,0,0 >> 0,21,108,1,1,0,0,1 >> 0,18,107,1,1,0,0,1 >> 0,21,124,3,0,0,0,0 >> 0,22,118,1,0,0,0,0 >> 0,17,103,3,0,0,0,0 >> 0,29,123,1,1,0,0,0 >> 0,26,113,1,1,0,0,0 >> 0,19,95,3,0,0,0,0 >> 0,19,150,3,0,0,0,0 >> 0,22,95,3,0,0,1,0 >> 0,30,107,3,0,1,0,1 >> 0,18,100,1,1,0,0,0 >> 0,18,100,1,1,0,0,0 >> 0,15,98,2,0,0,0,0 >> 0,25,118,1,1,0,0,0 >> 0,20,120,3,0,0,0,1 >> 0,28,120,1,1,0,0,0 >> 0,32,121,3,0,0,0,0 >> 0,31,100,1,0,0,0,1 >> 0,36,202,1,0,0,0,0 >> 0,28,120,3,0,0,0,0 >> 0,25,120,3,0,0,0,1 >> 0,28,167,1,0,0,0,0 >> 0,17,122,1,1,0,0,0 >> 0,29,150,1,0,0,0,0 >> 0,26,168,2,1,0,0,0 >> 0,17,113,2,0,0,0,0 >> 0,17,113,2,0,0,0,0 >> 0,24,90,1,1,1,0,0 >> 0,35,121,2,1,1,0,0 >> 0,25,155,1,0,0,0,0 >> 0,25,125,2,0,0,0,0 >> 0,29,140,1,1,0,0,0 >> 0,19,138,1,1,0,0,0 >> 0,27,124,1,1,0,0,0 >> 0,31,215,1,1,0,0,0 >> 0,33,109,1,1,0,0,0 >> 0,21,185,2,1,0,0,0 >> 0,19,189,1,0,0,0,0 >> 0,23,130,2,0,0,0,0 >> 0,21,160,1,0,0,0,0 >> 0,18,90,1,1,0,0,1 >> 0,18,90,1,1,0,0,1 >> 0,32,132,1,0,0,0,0 >> 0,19,132,3,0,0,0,0 >> 0,24,115,1,0,0,0,0 >> 0,22,85,3,1,0,0,0 >> 0,22,120,1,0,0,1,0 >> 0,23,128,3,0,0,0,0 >> 0,22,130,1,1,0,0,0 >> 0,30,95,1,1,0,0,0 >> 0,19,115,3,0,0,0,0 >> 0,16,110,3,0,0,0,0 >> 0,21,110,3,1,0,0,1 >> 0,30,153,3,0,0,0,0 >> 0,20,103,3,0,0,0,0 >> 0,17,119,3,0,0,0,0 >> 0,17,119,3,0,0,0,0 >> 0,23,119,3,0,0,0,0 >> 0,24,110,3,0,0,0,0 >> 0,28,140,1,0,0,0,0 >> 0,26,133,3,1,2,0,0 >> 0,20,169,3,0,1,0,1 >> 0,24,115,3,0,0,0,0 >> 0,28,250,3,1,0,0,0 >> 0,20,141,1,0,2,0,1 >> 0,22,158,2,0,1,0,0 >> 
0,22,112,1,1,2,0,0 >> 0,31,150,3,1,0,0,0 >> 0,23,115,3,1,0,0,0 >> 0,16,112,2,0,0,0,0 >> 0,16,135,1,1,0,0,0 >> 0,18,229,2,0,0,0,0 >> 0,25,140,1,0,0,0,0 >> 0,32,134,1,1,1,0,0 >> 0,20,121,2,1,0,0,0 >> 0,23,190,1,0,0,0,0 >> 0,22,131,1,0,0,0,0 >> 0,32,170,1,0,0,0,0 >> 0,30,110,3,0,0,0,0 >> 0,20,127,3,0,0,0,0 >> 0,23,123,3,0,0,0,0 >> 0,17,120,3,1,0,0,0 >> 0,19,105,3,0,0,0,0 >> 0,23,130,1,0,0,0,0 >> 0,36,175,1,0,0,0,0 >> 0,22,125,1,0,0,0,0 >> 0,24,133,1,0,0,0,0 >> 0,21,134,3,0,0,0,0 >> 0,19,235,1,1,0,1,0 >> 0,25,95,1,1,3,0,1 >> 0,16,135,1,1,0,0,0 >> 0,29,135,1,0,0,0,0 >> 0,29,154,1,0,0,0,0 >> 0,19,147,1,1,0,0,0 >> 0,19,147,1,1,0,0,0 >> 0,30,137,1,0,0,0,0 >> 0,24,110,1,0,0,0,0 >> 0,19,184,1,1,0,1,0 >> 0,24,110,3,0,1,0,0 >> 0,23,110,1,0,0,0,0 >> 0,20,120,3,0,0,0,0 >> 0,25,241,2,0,0,1,0 >> 0,30,112,1,0,0,0,0 >> 0,22,169,1,0,0,0,0 >> 0,18,120,1,1,0,0,0 >> 0,16,170,2,0,0,0,0 >> 0,32,186,1,0,0,0,0 >> 0,18,120,3,0,0,0,0 >> 0,29,130,1,1,0,0,0 >> 0,33,117,1,0,0,0,1 >> 0,20,170,1,1,0,0,0 >> 0,28,134,3,0,0,0,0 >> 0,14,135,1,0,0,0,0 >> 0,28,130,3,0,0,0,0 >> 0,25,120,1,0,0,0,0 >> 0,16,95,3,0,0,0,0 >> 0,20,158,1,0,0,0,0 >> 0,26,160,3,0,0,0,0 >> 0,21,115,1,0,0,0,0 >> 0,22,129,1,0,0,0,0 >> 0,25,130,1,0,0,0,0 >> 0,31,120,1,0,0,0,0 >> 0,35,170,1,0,1,0,0 >> 0,19,120,1,1,0,0,0 >> 0,24,116,1,0,0,0,0 >> 0,45,123,1,0,0,0,0 >> 1,28,120,3,1,1,0,1 >> 1,29,130,1,0,0,0,1 >> 1,34,187,2,1,0,1,0 >> 1,25,105,3,0,1,1,0 >> 1,25,85,3,0,0,0,1 >> 1,27,150,3,0,0,0,0 >> 1,23,97,3,0,0,0,1 >> 1,24,128,2,0,1,0,0 >> 1,24,132,3,0,0,1,0 >> 1,21,165,1,1,0,1,0 >> 1,32,105,1,1,0,0,0 >> 1,19,91,1,1,2,0,1 >> 1,25,115,3,0,0,0,0 >> 1,16,130,3,0,0,0,0 >> 1,25,92,1,1,0,0,0 >> 1,20,150,1,1,0,0,0 >> 1,21,200,2,0,0,0,1 >> 1,24,155,1,1,1,0,0 >> 1,21,103,3,0,0,0,0 >> 1,20,125,3,0,0,0,1 >> 1,25,89,3,0,2,0,0 >> 1,19,102,1,0,0,0,0 >> 1,19,112,1,1,0,0,1 >> 1,26,117,1,1,1,0,0 >> 1,24,138,1,0,0,0,0 >> 1,17,130,3,1,1,0,1 >> 1,20,120,2,1,0,0,0 >> 1,22,130,1,1,1,0,1 >> 1,27,130,2,0,0,0,1 >> 1,20,80,3,1,0,0,1 >> 1,17,110,1,1,0,0,0 >> 
1,25,105,3,0,1,0,0 >> 1,20,109,3,0,0,0,0 >> 1,18,148,3,0,0,0,0 >> 1,18,110,2,1,1,0,0 >> 1,20,121,1,1,1,0,1 >> 1,21,100,3,0,1,0,0 >> 1,26,96,3,0,0,0,0 >> 1,31,102,1,1,1,0,0 >> 1,15,110,1,0,0,0,0 >> 1,23,187,2,1,0,0,0 >> 1,20,122,2,1,0,0,0 >> 1,24,105,2,1,0,0,0 >> 1,15,115,3,0,0,0,1 >> 1,23,120,3,0,0,0,0 >> 1,30,142,1,1,1,0,0 >> 1,22,130,1,1,0,0,0 >> 1,17,120,1,1,0,0,0 >> 1,23,110,1,1,1,0,0 >> 1,17,120,2,0,0,0,0 >> 1,26,154,3,0,1,1,0 >> 1,20,106,3,0,0,0,0 >> 1,26,190,1,1,0,0,0 >> 1,14,101,3,1,1,0,0 >> 1,28,95,1,1,0,0,0 >> 1,14,100,3,0,0,0,0 >> 1,23,94,3,1,0,0,0 >> 1,17,142,2,0,0,1,0 >> 1,21,130,1,1,0,1,0 >> >> Thanks. >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From fgnu32 at yahoo.com Mon Dec 10 00:00:39 2012 From: fgnu32 at yahoo.com (Fg Nu) Date: Sun, 9 Dec 2012 21:00:39 -0800 (PST) Subject: [SciPy-User] Logistic regression using SciPy In-Reply-To: References: <1355112090.68806.YahooMailNeo@web160105.mail.bf1.yahoo.com> Message-ID: <1355115639.66117.YahooMailNeo@web160106.mail.bf1.yahoo.com> Thanks Josef, but I am aware of the alternatives but am interested in rolling my own. ----- Original Message ----- From: "josef.pktd at gmail.com" To: SciPy Users List Cc: Sent: Monday, December 10, 2012 10:12 AM Subject: Re: [SciPy-User] Logistic regression using SciPy On Sun, Dec 9, 2012 at 11:22 PM, David Warde-Farley wrote: > First, the way you've written the log likelihood is numerically > unstable. Consider simplifying the expression (using logarithm laws > and breaking apart logistic function) and using the log1p function > where appropriate. > > Second, the optimization problem is going to be extremely ill > conditioned given the very different scales of your different > predictors. 
You should probably mean-center and divide by the standard
> deviation.
>
> Third, there's a check_grad function in scipy.optimize that can be
> used to troubleshoot gradient issues.
>
> Fourth, there's a pre-rolled of this in scikit-learn that will
> probably be a good deal faster (it wraps a C library) and certainly
> better tested than home-rolling it.

or statsmodels, which uses analytical gradient and hessians, and gives
you additional result statistics and (statistical) tests.

Josef

>
> David
>
> On Sun, Dec 9, 2012 at 11:01 PM, Fg Nu wrote:
>>
>> I am trying to code up logistic regression in Python using the SciPy "fmin_bfgs" function, but am running into some issues. I wrote functions for the logistic (sigmoid) transformation function, and the cost function, and those work fine (I have used the optimized values of the parameter vector found via canned software to test the functions, and those match up). I am not that sure of my implementation of the gradient function, but it looks reasonable.
>>
>> Here is the code:
>>
>> #==================================================
>>
>> # purpose: logistic regression
>> import numpy as np
>> import scipy as sp
>> import scipy.optimize
>>
>> import matplotlib as mpl
>> import os
>>
>> # prepare the data
>> data = np.loadtxt('data.csv', delimiter=',', skiprows=1)
>> vY = data[:, 0]
>> mX = data[:, 1:]
>> intercept = np.ones(mX.shape[0]).reshape(mX.shape[0], 1)
>> mX = np.concatenate((intercept, mX), axis = 1)
>> iK = mX.shape[1]
>> iN = mX.shape[0]
>>
>> # logistic transformation
>> def logit(mX, vBeta):
>>     return((1/(1.0 + np.exp(-np.dot(mX, vBeta)))))
>>
>> # test function call
>> vBeta0 = np.array([-.10296645, -.0332327, -.01209484, .44626211, .92554137, .53973828,
>>                    1.7993371, .7148045])
>> logit(mX, vBeta0)
>>
>> # cost function
>> def logLikelihoodLogit(vBeta, mX, vY):
>>     return(-(np.sum(vY*np.log(logit(mX, vBeta)) + (1-vY)*(np.log(1-logit(mX, vBeta))))))
>> logLikelihoodLogit(vBeta0, mX, vY) # test function call
>>
>> # gradient function
>> def likelihoodScore(vBeta, mX, vY):
>>     return(np.dot(mX.T,
>>                   ((np.dot(mX, vBeta) - vY)/
>>                     np.dot(mX, vBeta)).reshape(iN, 1)).reshape(iK, 1))
>>
>> likelihoodScore(vBeta0, mX, vY).shape # test function call
>>
>> # optimize the function (without gradient)
>> optimLogit = scipy.optimize.fmin_bfgs(logLikelihoodLogit,
>>                                       x0 = np.array([-.1, -.03, -.01, .44, .92, .53,
>>                                                      1.8, .71]),
>>                                       args = (mX, vY), gtol = 1e-3)
>>
>> # optimize the function (with gradient)
>> optimLogit = scipy.optimize.fmin_bfgs(logLikelihoodLogit,
>>                                       x0 = np.array([-.1, -.03, -.01, .44, .92, .53,
>>                                                      1.8, .71]), fprime = likelihoodScore,
>>                                       args = (mX, vY), gtol = 1e-3)
>> #=====================================================
>>
>> * The first optimization (without gradient) ends with a whole lot of stuff about division by zero.
>>
>> * The second optimization (with gradient) ends with a matrices not aligned error, which probably means I have got the way the gradient is to be returned wrong.
>>
>> Any help with this is appreciated. If anyone wants to try this, the data is included below.
>>
>> low,age,lwt,race,smoke,ptl,ht,ui
>> 0,19,182,2,0,0,0,1
>> 0,33,155,3,0,0,0,0
>> 0,20,105,1,1,0,0,0
>> 0,21,108,1,1,0,0,1
>> 0,18,107,1,1,0,0,1
>> 0,21,124,3,0,0,0,0
>> 0,22,118,1,0,0,0,0
>> 0,17,103,3,0,0,0,0
>> 0,29,123,1,1,0,0,0
>> 0,26,113,1,1,0,0,0
>> 0,19,95,3,0,0,0,0
>> 0,19,150,3,0,0,0,0
>> 0,22,95,3,0,0,1,0
>> 0,30,107,3,0,1,0,1
>> 0,18,100,1,1,0,0,0
>> [remaining data rows elided -- the full dataset appears in the original post quoted above]
>>
>> Thanks.
>>
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
_______________________________________________
SciPy-User mailing list
SciPy-User at scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-user

From josef.pktd at gmail.com Mon Dec 10 00:28:58 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 10 Dec 2012 00:28:58 -0500
Subject: [SciPy-User] Logistic regression using SciPy
In-Reply-To: <1355115639.66117.YahooMailNeo@web160106.mail.bf1.yahoo.com>
References: <1355112090.68806.YahooMailNeo@web160105.mail.bf1.yahoo.com> <1355115639.66117.YahooMailNeo@web160106.mail.bf1.yahoo.com>
Message-ID:

On Mon, Dec 10, 2012 at 12:00 AM, Fg Nu wrote:
>
> Thanks Josef, but I am aware of the alternatives but am interested in rolling my own.

That's always a good exercise :)

I think your last reshape in likelihoodScore is wrong. The return
should be 1-dimensional, for example with .ravel()

Josef

>
> ----- Original Message -----
> From: "josef.pktd at gmail.com"
> To: SciPy Users List
> Sent: Monday, December 10, 2012 10:12 AM
> Subject: Re: [SciPy-User] Logistic regression using SciPy
>
> On Sun, Dec 9, 2012 at 11:22 PM, David Warde-Farley wrote:
>> First, the way you've written the log likelihood is numerically
>> unstable.
Consider simplifying the expression (using logarithm laws
>> and breaking apart logistic function) and using the log1p function
>> where appropriate.
>>
>> Second, the optimization problem is going to be extremely ill
>> conditioned given the very different scales of your different
>> predictors. You should probably mean-center and divide by the standard
>> deviation.
>>
>> Third, there's a check_grad function in scipy.optimize that can be
>> used to troubleshoot gradient issues.
>>
>> Fourth, there's a pre-rolled of this in scikit-learn that will
>> probably be a good deal faster (it wraps a C library) and certainly
>> better tested than home-rolling it.
>
> or statsmodels, which uses analytical gradient and hessians, and gives
> you additional result statistics and (statistical) tests.
>
> Josef
>
>>
>> David
>>
>> On Sun, Dec 9, 2012 at 11:01 PM, Fg Nu wrote:
>>>
>>> I am trying to code up logistic regression in Python using the SciPy "fmin_bfgs" function, but am running into some issues. I wrote functions for the logistic (sigmoid) transformation function, and the cost function, and those work fine (I have used the optimized values of the parameter vector found via canned software to test the functions, and those match up). I am not that sure of my implementation of the gradient function, but it looks reasonable.
>>> Here is the code:
>>>
>>> [code listing elided -- identical to the listing quoted in full earlier in the thread]
>>>
>>> * The first optimization (without gradient) ends with a whole lot of stuff about division by zero.
>>>
>>> * The second optimization (with gradient) ends with a matrices not aligned error, which probably means I have got the way the gradient is to be returned wrong.
>>>
>>> Any help with this is appreciated. If anyone wants to try this, the data is included below.
>>>
>>> low,age,lwt,race,smoke,ptl,ht,ui
>>> [remaining data rows elided -- the full dataset appears in the original post quoted above]
>>>
>>> Thanks.
>>>
>>> _______________________________________________
>>> SciPy-User mailing list
>>> SciPy-User at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

From fgnu32 at yahoo.com Mon Dec 10 01:26:12 2012
From: fgnu32 at yahoo.com (Fg Nu)
Date: Sun, 9 Dec 2012 22:26:12 -0800 (PST)
Subject: [SciPy-User] Logistic regression using SciPy
In-Reply-To: <1355112090.68806.YahooMailNeo@web160105.mail.bf1.yahoo.com>
References: <1355112090.68806.YahooMailNeo@web160105.mail.bf1.yahoo.com>
Message-ID: <1355120772.80318.YahooMailNeo@web160102.mail.bf1.yahoo.com>

For anyone interested, here is the final code, where everything works. Basically, I reparametrized the likelihood function. Also, added a call to the check_grad function.

#=====================================================

# purpose: logistic regression
import numpy as np
import scipy as sp
import scipy.optimize

import matplotlib as mpl
import os

# prepare the data
data = np.loadtxt('data.csv', delimiter=',', skiprows=1)
vY = data[:, 0]
mX = data[:, 1:]
# mX = (mX - np.mean(mX))/np.std(mX)  # standardize the data; if required
intercept = np.ones(mX.shape[0]).reshape(mX.shape[0], 1)
mX = np.concatenate((intercept, mX), axis = 1)
iK = mX.shape[1]
iN = mX.shape[0]

# logistic transformation
def logit(mX, vBeta):
    return((np.exp(np.dot(mX, vBeta))/(1.0 + np.exp(np.dot(mX, vBeta)))))

# test function call
vBeta0 = np.array([-.10296645, -.0332327, -.01209484, .44626211, .92554137, .53973828,
                   1.7993371, .7148045])
logit(mX, vBeta0)

# cost function
def logLikelihoodLogit(vBeta, mX, vY):
    return(-(np.sum(vY*np.log(logit(mX, vBeta)) + (1-vY)*(np.log(1-logit(mX, vBeta))))))
logLikelihoodLogit(vBeta0, mX, vY) # test function call

# different parametrization of the cost function
def logLikelihoodLogitVerbose(vBeta, mX, vY):
    return(-(np.sum(vY*(np.dot(mX, vBeta) - np.log((1.0 + np.exp(np.dot(mX, vBeta))))) +
                    (1-vY)*(-np.log((1.0 + np.exp(np.dot(mX, vBeta))))))))
logLikelihoodLogitVerbose(vBeta0, mX, vY)  # test function call

# gradient function
def likelihoodScore(vBeta, mX, vY):
    return(np.dot(mX.T,
                  (logit(mX, vBeta) - vY)))

likelihoodScore(vBeta0, mX, vY).shape # test function call

sp.optimize.check_grad(logLikelihoodLogitVerbose, likelihoodScore,
                       vBeta0, mX, vY)  # check that the analytical gradient is close to
                                        # the numerical gradient

# optimize the function (without gradient)
optimLogit = scipy.optimize.fmin_bfgs(logLikelihoodLogitVerbose,
                                      x0 = np.array([-.1, -.03, -.01, .44, .92, .53,
                                                     1.8, .71]),
                                      args = (mX, vY), gtol = 1e-3)

# optimize the function (with gradient)
optimLogit = scipy.optimize.fmin_bfgs(logLikelihoodLogitVerbose,
                                      x0 = np.array([-.1, -.03, -.01, .44, .92, .53,
                                                     1.8, .71]), fprime = likelihoodScore,
                                      args = (mX, vY), gtol = 1e-3)
#=====================================================

Thanks for the helpful suggestions from everyone on- and off-list.

----- Original Message -----
From: Fg Nu
To: "scipy-user at scipy.org"
Sent: Monday, December 10, 2012 9:31 AM
Subject: Logistic regression using SciPy

[original message elided -- the same code and dataset are quoted in full earlier in the thread]
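The reparametrized cost function above avoids evaluating np.log(logit(...)) directly, which overflows once the linear predictor gets large. David's log1p suggestion can also be expressed with np.logaddexp, since log(1 + exp(eta)) = logaddexp(0, eta). A minimal sketch of that idea (not from the thread; the function names are hypothetical, and a tiny synthetic design matrix stands in for the CSV data):

```python
import numpy as np
from scipy.special import expit  # numerically stable sigmoid

def neg_log_likelihood(vBeta, mX, vY):
    # linear predictor eta = X @ beta
    eta = np.dot(mX, vBeta)
    # negative log-likelihood: -sum(y*eta - log(1 + exp(eta))),
    # with the log term evaluated stably as logaddexp(0, eta)
    return -np.sum(vY * eta - np.logaddexp(0.0, eta))

def score(vBeta, mX, vY):
    # gradient of the negative log-likelihood: X.T @ (sigmoid(eta) - y)
    return np.dot(mX.T, expit(np.dot(mX, vBeta)) - vY)

# extreme linear predictors (+/-500) overflow the naive log(logit(...)) form,
# but the logaddexp version stays finite
mX = np.array([[1.0, 500.0], [1.0, -500.0]])
vY = np.array([1.0, 0.0])
vBeta = np.array([0.0, 1.0])
print(np.isfinite(neg_log_likelihood(vBeta, mX, vY)))  # True
```

scipy.optimize.check_grad can then be used, as in the listing above, to confirm that score agrees with a finite-difference gradient of neg_log_likelihood at moderate parameter values.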
From alan.isaac at gmail.com Mon Dec 10 08:23:22 2012
From: alan.isaac at gmail.com (Alan G Isaac)
Date: Mon, 10 Dec 2012 08:23:22 -0500
Subject: [SciPy-User] Logistic regression using SciPy
In-Reply-To: <1355115639.66117.YahooMailNeo@web160106.mail.bf1.yahoo.com>
References: <1355112090.68806.YahooMailNeo@web160105.mail.bf1.yahoo.com> <1355115639.66117.YahooMailNeo@web160106.mail.bf1.yahoo.com>
Message-ID: <50C5E24A.8040201@gmail.com>

On 12/10/2012 12:00 AM, Fg Nu wrote:
> I am aware of the alternatives but am interested in rolling my own.

At least look at statsmodels, because you are headed down the wrong track.

Alan Isaac

From alicsailmitedu at gmail.com Mon Dec 10 01:23:07 2012
From: alicsailmitedu at gmail.com (ali rahimi)
Date: Sun, 9 Dec 2012 22:23:07 -0800
Subject: [SciPy-User] computing f and fprime in one evaluation in scipy.optimize
Message-ID: <12E86D26-1E67-4669-A123-E171BDEECDF8@gmail.com>

When minimizing a function f(x), df/dx can often also be computed with little additional effort. Yet the routines in scipy.optimize accept df/dx as a separate function fprime, which must repeat much of the computation of f.

Do the scipy.optimize routines allow me to compute f and fprime simultaneously in one call? This is important to me because each evaluation in my case takes ~10 minutes, and a few iterations of the optimizer is all I need.

From gregor.thalhammer at gmail.com Mon Dec 10 09:54:55 2012
From: gregor.thalhammer at gmail.com (Gregor Thalhammer)
Date: Mon, 10 Dec 2012 15:54:55 +0100
Subject: [SciPy-User] computing f and fprime in one evaluation in scipy.optimize
In-Reply-To: <12E86D26-1E67-4669-A123-E171BDEECDF8@gmail.com>
References: <12E86D26-1E67-4669-A123-E171BDEECDF8@gmail.com>
Message-ID: <223DC4FB-F581-4C65-9D37-727A0A0251C3@gmail.com>

On 10.12.2012, at 07:23, ali rahimi wrote:
> When minimizing a function f(x), df/dx can often also be computed with little additional effort.
Yet the routines in scipy.optimize accept df/dx as a separate function fprime which must repeat much of the computation of f. > > do the scipy.optimize routines allow me to compute f and fprime simultaneous in one call? this is important to me because each evaluation in my case takes ~10 minutes and a few iterations of the optimizer is all i need. They are called separately. For a more efficient calculation you could save the result of f, e.g. in a class attribute or a global variable, and reuse it in fprime. And don't forget to also store the parameters to check whether fprime is indeed called with the same params. Gregor From jsseabold at gmail.com Mon Dec 10 10:06:27 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 10 Dec 2012 10:06:27 -0500 Subject: [SciPy-User] computing f and fprime in one evaluation in scipy.optimize In-Reply-To: <12E86D26-1E67-4669-A123-E171BDEECDF8@gmail.com> References: <12E86D26-1E67-4669-A123-E171BDEECDF8@gmail.com> Message-ID: On Mon, Dec 10, 2012 at 1:23 AM, ali rahimi wrote: > When minimizing a function f(x), df/dx can often also be computed with little additional effort. Yet the routines in scipy.optimize accept df/dx as a separate function fprime which must repeat much of the computation of f. > > do the scipy.optimize routines allow me to compute f and fprime simultaneous in one call? this is important to me because each evaluation in my case takes ~10 minutes and a few iterations of the optimizer is all i need. AFAIK, fmin_l_bfgs_b is the only optimize function written so that f can also return fprime, but I agree that this could be a nice option to the other ones. 
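[Editor's note: the caching approach Gregor describes — compute f and its gradient together, store the pair along with the parameters, and serve f and fprime from the cache — can be sketched like this. The class and the quadratic placeholder objective are invented for illustration.]

```python
import numpy as np

class CachedObjective:
    """Evaluate f and fprime in one shot and cache the pair keyed on x."""

    def __init__(self, f_and_grad):
        self.f_and_grad = f_and_grad
        self._x = None
        self._value = None
        self._grad = None

    def _update(self, x):
        x = np.asarray(x, dtype=float)
        # Recompute only when called with new parameters.
        if self._x is None or not np.array_equal(x, self._x):
            self._value, self._grad = self.f_and_grad(x)
            self._x = x.copy()

    def f(self, x):
        self._update(x)
        return self._value

    def fprime(self, x):
        self._update(x)
        return self._grad

def expensive_f_and_grad(x):
    # Placeholder for a ~10 minute computation: a quadratic whose
    # gradient falls out of the same evaluation.
    return np.dot(x, x), 2.0 * x

obj = CachedObjective(expensive_f_and_grad)
x0 = np.array([1.0, 2.0])
print(obj.f(x0), obj.fprime(x0))  # one expensive evaluation serves both
```

obj.f and obj.fprime can then be passed as f and fprime to any of the fmin_* routines that take a separate gradient function.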
Skipper From pav at iki.fi Mon Dec 10 10:11:03 2012 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 10 Dec 2012 15:11:03 +0000 (UTC) Subject: [SciPy-User] =?utf-8?q?computing_f_and_fprime_in_one_evaluation_i?= =?utf-8?q?n=09scipy=2Eoptimize?= References: <12E86D26-1E67-4669-A123-E171BDEECDF8@gmail.com> Message-ID: Skipper Seabold gmail.com> writes: [clip] > AFAIK, fmin_l_bfgs_b is the only optimize function written so that f > can also return fprime, but I agree that this could be a nice option > to the other ones. The good news is that minimize(f, .., jac=True) [1] also works like this, and works for all of the solvers (it caches the jacobian). So the problem is already solved in Scipy 0.11.0 :) [1] http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html -- Pauli Virtanen From matthieu.brucher at gmail.com Mon Dec 10 10:22:16 2012 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 10 Dec 2012 16:22:16 +0100 Subject: [SciPy-User] computing f and fprime in one evaluation in scipy.optimize In-Reply-To: References: <12E86D26-1E67-4669-A123-E171BDEECDF8@gmail.com> Message-ID: Hi Pauli, What is the Jacobian for a CG or BFGS optimization? There are no Jacobian in this case for non-least square optimization if I'm not mistaken (at least in no reference that I know of). Shouldn't this be the gradient function? Cheers, 2012/12/10 Pauli Virtanen > Skipper Seabold gmail.com> writes: > [clip] > > AFAIK, fmin_l_bfgs_b is the only optimize function written so that f > > can also return fprime, but I agree that this could be a nice option > > to the other ones. > > The good news is that minimize(f, .., jac=True) [1] also works like this, > and works for all of the solvers (it caches the jacobian). 
> > So the problem is already solved in Scipy 0.11.0 :) > > [1] > > http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html > > -- > Pauli Virtanen > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher Music band: http://liliejay.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From fgnu32 at yahoo.com Mon Dec 10 10:25:36 2012 From: fgnu32 at yahoo.com (Fg Nu) Date: Mon, 10 Dec 2012 07:25:36 -0800 (PST) Subject: [SciPy-User] Logistic regression using SciPy In-Reply-To: <50C5E24A.8040201@gmail.com> References: <1355112090.68806.YahooMailNeo@web160105.mail.bf1.yahoo.com> <1355115639.66117.YahooMailNeo@web160106.mail.bf1.yahoo.com> <50C5E24A.8040201@gmail.com> Message-ID: <1355153136.96410.YahooMailNeo@web160101.mail.bf1.yahoo.com> ----- Original Message ----- From: Alan G Isaac To: Fg Nu ; SciPy Users List Cc: Sent: Monday, December 10, 2012 6:53 PM Subject: Re: [SciPy-User] Logistic regression using SciPy On 12/10/2012 12:00 AM, Fg Nu wrote: > I am aware of the alternatives but am interested in rolling my own. At least look at statsmodels, because you are headed down the wrong track. Thanks, but I have fixed my problem, and have posted back to the list. 
From sturla at molden.no Mon Dec 10 11:26:50 2012 From: sturla at molden.no (Sturla Molden) Date: Mon, 10 Dec 2012 17:26:50 +0100 Subject: [SciPy-User] Logistic regression using SciPy In-Reply-To: <1355153136.96410.YahooMailNeo@web160101.mail.bf1.yahoo.com> References: <1355112090.68806.YahooMailNeo@web160105.mail.bf1.yahoo.com> <1355115639.66117.YahooMailNeo@web160106.mail.bf1.yahoo.com> <50C5E24A.8040201@gmail.com> <1355153136.96410.YahooMailNeo@web160101.mail.bf1.yahoo.com> Message-ID: If it is of any interest, I have working code to do logistic regression with SciPy. Sturla Sendt fra min iPad Den 10. des. 2012 kl. 16:25 skrev Fg Nu : > > > > > ----- Original Message ----- > From: Alan G Isaac > To: Fg Nu ; SciPy Users List > Cc: > Sent: Monday, December 10, 2012 6:53 PM > Subject: Re: [SciPy-User] Logistic regression using SciPy > > On 12/10/2012 12:00 AM, Fg Nu wrote: >> I am aware of the alternatives but am interested in rolling my own. > > > At least look at statsmodels, > because you are headed down the wrong track. > > Thanks, but I have fixed my problem, and have posted back to the list. > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From pav at iki.fi Mon Dec 10 11:32:29 2012 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 10 Dec 2012 16:32:29 +0000 (UTC) Subject: [SciPy-User] =?utf-8?q?computing_f_and_fprime_in_one_evaluation_i?= =?utf-8?q?n=09scipy=2Eoptimize?= References: <12E86D26-1E67-4669-A123-E171BDEECDF8@gmail.com> Message-ID: Matthieu Brucher gmail.com> writes: > What is the Jacobian for a CG or BFGS optimization? There are no Jacobian in this case for non-least square optimization if I'm not mistaken (at least in no reference that I know of). Shouldn't this be the gradient function? Yeah, it means the gradient. If you nit-pick enough, the gradient is the Jacobian of a scalar function. 
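[Editor's note: Pauli's minimize(..., jac=True) route in practice — the objective returns the value and its gradient from a single evaluation. A sketch on the Rosenbrock function:]

```python
import numpy as np
from scipy.optimize import minimize

def f_and_grad(x):
    # Rosenbrock function and its gradient, computed in one evaluation.
    value = 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2
    grad = np.array([
        -400.0 * x[0] * (x[1] - x[0] ** 2) - 2.0 * (1.0 - x[0]),
        200.0 * (x[1] - x[0] ** 2),
    ])
    return value, grad

# jac=True tells minimize that the objective returns (value, gradient).
res = minimize(f_and_grad, np.array([-1.0, 1.0]), jac=True, method='BFGS')
print(res.x)  # should approach the minimum at [1, 1]
```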
Pauli From matthieu.brucher at gmail.com Mon Dec 10 11:38:14 2012 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 10 Dec 2012 17:38:14 +0100 Subject: [SciPy-User] computing f and fprime in one evaluation in scipy.optimize In-Reply-To: References: <12E86D26-1E67-4669-A123-E171BDEECDF8@gmail.com> Message-ID: OK, that's what I thought, but perhaps people expecting to find the word gradient somewhere in the document would not make the connection :/ Cheers, 2012/12/10 Pauli Virtanen > Matthieu Brucher gmail.com> writes: > > What is the Jacobian for a CG or BFGS optimization? There are no > Jacobian in > this case for non-least square optimization if I'm not mistaken (at least > in no > reference that I know of). Shouldn't this be the gradient function? > > Yeah, it means the gradient. If you nit-pick enough, > the gradient is the Jacobian of a scalar function. > > Pauli > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher Music band: http://liliejay.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Mon Dec 10 12:04:15 2012 From: sturla at molden.no (Sturla Molden) Date: Mon, 10 Dec 2012 18:04:15 +0100 Subject: [SciPy-User] computing f and fprime in one evaluation in scipy.optimize In-Reply-To: References: <12E86D26-1E67-4669-A123-E171BDEECDF8@gmail.com> Message-ID: <01D8D1CE-1910-405D-9AB6-3FA7D2B9CE3F@molden.no> Den 10. des. 2012 kl. 16:11 skrev Pauli Virtanen : > Skipper Seabold gmail.com> writes: > [clip] >> AFAIK, fmin_l_bfgs_b is the only optimize function written so that f >> can also return fprime, but I agree that this could be a nice option >> to the other ones. 
> > The good news is that minimize(f, .., jac=True) [1] also works like this, > and works for all of the solvers (it caches the jacobian). This is a problem I have run into with sp.optimize.leastsq as well. Caching did not solve the problem, as the Jacobian is rank ndata x nparam, and caching it made me run out of memory. Interestingly, MINPACK lmder.f obtains the residuals and the Jaobian in a single function call, thus the problem is not there in the original Fortran code. It is introduced by the SciPy wrapper. On the other hand, I also prefer to use LAPACK for least-squares solvers. I don't think the built-in QR in MINPACK is very efficient compared to e.g. MKL. (Which is why I seriously consider to write my own LM routine.) Sturla From klonuo at gmail.com Tue Dec 11 07:03:04 2012 From: klonuo at gmail.com (klo uo) Date: Tue, 11 Dec 2012 13:03:04 +0100 Subject: [SciPy-User] scipy.org is down Message-ID: I can't access scipy.org and it seems it's not just me: http://www.downforeveryoneorjustme.com/scipy.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From klonuo at gmail.com Tue Dec 11 07:05:48 2012 From: klonuo at gmail.com (klo uo) Date: Tue, 11 Dec 2012 13:05:48 +0100 Subject: [SciPy-User] scipy.org is down In-Reply-To: References: Message-ID: Browsing mailing list seems like this is alternative: http://scipy.github.com/ So perhaps currently there is some transition issue On Tue, Dec 11, 2012 at 1:03 PM, klo uo wrote: > I can't access scipy.org and it seems it's not just me: > > http://www.downforeveryoneorjustme.com/scipy.org > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From takowl at gmail.com Tue Dec 11 07:21:42 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Tue, 11 Dec 2012 12:21:42 +0000 Subject: [SciPy-User] scipy.org is down In-Reply-To: References: Message-ID: On 11 December 2012 12:05, klo uo wrote: > Browsing mailing list seems like this is alternative: > http://scipy.github.com/ > So perhaps currently there is some transition issue > That's the new site, I'm hoping we'll be able to put it up as the main site before long. Another advantage is that it's a static site, so hopefully there's less to go wrong. Unfortunately, the server for the current scipy.org site is rather unreliable. I think a workaround was implemented that involved automatically restarting the server regularly, so maybe it'll come back in an hour or two. Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: From klonuo at gmail.com Tue Dec 11 07:24:36 2012 From: klonuo at gmail.com (klo uo) Date: Tue, 11 Dec 2012 13:24:36 +0100 Subject: [SciPy-User] scipy.org is down In-Reply-To: References: Message-ID: OK, thanks Thomas Google has the cache so it can be browsed that way On Tue, Dec 11, 2012 at 1:21 PM, Thomas Kluyver wrote: > On 11 December 2012 12:05, klo uo wrote: > >> Browsing mailing list seems like this is alternative: >> http://scipy.github.com/ >> So perhaps currently there is some transition issue >> > > That's the new site, I'm hoping we'll be able to put it up as the main > site before long. Another advantage is that it's a static site, so > hopefully there's less to go wrong. > > Unfortunately, the server for the current scipy.org site is rather > unreliable. I think a workaround was implemented that involved > automatically restarting the server regularly, so maybe it'll come back in > an hour or two. 
> > Thomas > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From j.l.anderson at phonecoop.coop Tue Dec 11 10:13:08 2012 From: j.l.anderson at phonecoop.coop (Joseph Anderson) Date: Tue, 11 Dec 2012 15:13:08 +0000 Subject: [SciPy-User] solving question Message-ID: <82663A13-C785-400E-87C7-52F941347896@phonecoop.coop> Hello All, Hoping for some advice with this somewhat naive question... I have a set of equations that look like this: my_f(a0, x0) + my_f(a1, x1) + ... + my_f(aN, xN) - b0 = 0 my_f(a0, x0) + my_f(a1, x1) + ... + my_f(aN, xN) - b1 = 0 . . . my_f(a0, x0) + my_f(a1, x1) + ... + my_f(aN, xN) - bN = 0 my_f() is my own function (some trig, I can post if that will help). a0... aN and b0... bN are known. I need to find x0... xN. Any advice on finding x0... xN using scipy? Thanks in advance for you're help!! My best, ~~ Joseph Anderson Artist: http://joseph-anderson.org Ambisonic Toolkit: http://ambisonictoolkit.net -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wkerzendorf at gmail.com Wed Dec 12 17:08:26 2012 From: wkerzendorf at gmail.com (Wolfgang Kerzendorf) Date: Wed, 12 Dec 2012 17:08:26 -0500 Subject: [SciPy-User] scipy interpolate.interp1d spline slowness Message-ID: <6DF4A272-81C3-4A61-BFD0-BB7614448290@gmail.com> Hello Scipyers, I've just stumbled across a problem with interpolate.interp1d: ------------- import numpy as np from scipy import interpolate x = arange(1000) y = y = np.random.random_integers(0, 900, 1000) %timeit interp = interpolate.interp1d(x, y, kind='cubic') 1 loops, best of 3: 3.63 s per loop #the call for the interpolation is really quick afterwards (a couple ms) tck = interpolate.splrep(x, y, s=0) %timeit interpolate.splev(x_new, tck, der=0) 100 loops, best of 3: 5.51 ms per loop ------ I do understand that these are different spline interpolations (but that's as far as my knowledge goes). I was just annoyed at the person saying: Ah, you see python is slow - which it is not as shown by the second scipy command. Would it be possible to switch the spline interpolator used in interpolate.interp1d to the B-Splines, or to give an option to switch between different spline interpolators (maybe with a warning: slow). Ah - a last question: Why don't you use the issues tab on the github page? 
Thanks in advance, Wolfgang From deshpande.jaidev at gmail.com Wed Dec 12 17:22:10 2012 From: deshpande.jaidev at gmail.com (Jaidev Deshpande) Date: Thu, 13 Dec 2012 03:52:10 +0530 Subject: [SciPy-User] scipy interpolate.interp1d spline slowness In-Reply-To: <6DF4A272-81C3-4A61-BFD0-BB7614448290@gmail.com> References: <6DF4A272-81C3-4A61-BFD0-BB7614448290@gmail.com> Message-ID: ---------- Forwarded message ---------- From: Wolfgang Kerzendorf Date: Thu, Dec 13, 2012 at 3:38 AM Subject: [SciPy-User] scipy interpolate.interp1d spline slowness To: SciPy Users List Hello Scipyers, I've just stumbled across a problem with interpolate.interp1d: ------------- import numpy as np from scipy import interpolate x = arange(1000) y = y = np.random.random_integers(0, 900, 1000) %timeit interp = interpolate.interp1d(x, y, kind='cubic') 1 loops, best of 3: 3.63 s per loop #the call for the interpolation is really quick afterwards (a couple ms) tck = interpolate.splrep(x, y, s=0) %timeit interpolate.splev(x_new, tck, der=0) 100 loops, best of 3: 5.51 ms per loop ------ I do understand that these are different spline interpolations (but that's as far as my knowledge goes). I was just annoyed at the person saying: Ah, you see python is slow - which it is not as shown by the second scipy command. Would it be possible to switch the spline interpolator used in interpolate.interp1d to the B-Splines, or to give an option to switch between different spline interpolators (maybe with a warning: slow). Hi, I don't know how much I can help, but I've noticed that scipy.interpolate.splrep and scipy.interpolate.splev are slightly faster than interp1d. It should suffice for cubic splines. HTH Ah - a last question: Why don't you use the issues tab on the github page? 
Thanks in advance, Wolfgang _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -- JD From josef.pktd at gmail.com Wed Dec 12 17:25:00 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 12 Dec 2012 17:25:00 -0500 Subject: [SciPy-User] scipy interpolate.interp1d spline slowness In-Reply-To: <6DF4A272-81C3-4A61-BFD0-BB7614448290@gmail.com> References: <6DF4A272-81C3-4A61-BFD0-BB7614448290@gmail.com> Message-ID: On Wed, Dec 12, 2012 at 5:08 PM, Wolfgang Kerzendorf wrote: > Hello Scipyers, > > I've just stumbled across a problem with interpolate.interp1d: > ------------- > import numpy as np > from scipy import interpolate > x = arange(1000) > y = y = np.random.random_integers(0, 900, 1000) > > %timeit interp = interpolate.interp1d(x, y, kind='cubic') > 1 loops, best of 3: 3.63 s per loop > #the call for the interpolation is really quick afterwards (a couple ms) > > tck = interpolate.splrep(x, y, s=0) > %timeit interpolate.splev(x_new, tck, der=0) > 100 loops, best of 3: 5.51 ms per loop It looks to me, you are timing two different things here, with interp1d you time the spline creation with splev you time the evaluation. for "cubic", interp1d uses _fitpack._bspleval so I wouldn't expect much difference in timing. But I didn't check whether there is a difference in what the wrappers are doing Josef > > ------ > I do understand that these are different spline interpolations (but that's as far as my knowledge goes). I was just annoyed at the person saying: Ah, you see python is slow - which it is not as shown by the second scipy command. > > Would it be possible to switch the spline interpolator used in interpolate.interp1d to the B-Splines, or to give an option to switch between different spline interpolators (maybe with a warning: slow). > > Ah - a last question: Why don't you use the issues tab on the github page? 
> > Thanks in advance, > Wolfgang > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From wkerzendorf at gmail.com Wed Dec 12 18:25:02 2012 From: wkerzendorf at gmail.com (Wolfgang Kerzendorf) Date: Wed, 12 Dec 2012 18:25:02 -0500 Subject: [SciPy-User] scipy interpolate.interp1d spline slowness In-Reply-To: References: <6DF4A272-81C3-4A61-BFD0-BB7614448290@gmail.com> Message-ID: <652C4822-1171-47EE-BE0D-0DFA27BD9DA5@gmail.com> Hi Josef, Your'e absolutely right - I just omitted the other timings as they seemed small compared to the interp1d: %timeit interp(x_new) 100 loops, best of 3: 3.89 ms per loop %timeit tck = interpolate.splrep(x, y, s=0) 1000 loops, best of 3: 204 us per loop It would be great if you look into this. Thanks Wolfgang On 2012-12-12, at 5:25 PM, josef.pktd at gmail.com wrote: > On Wed, Dec 12, 2012 at 5:08 PM, Wolfgang Kerzendorf > wrote: >> Hello Scipyers, >> >> I've just stumbled across a problem with interpolate.interp1d: >> ------------- >> import numpy as np >> from scipy import interpolate >> x = arange(1000) >> y = y = np.random.random_integers(0, 900, 1000) >> >> %timeit interp = interpolate.interp1d(x, y, kind='cubic') >> 1 loops, best of 3: 3.63 s per loop >> #the call for the interpolation is really quick afterwards (a couple ms) >> >> tck = interpolate.splrep(x, y, s=0) >> %timeit interpolate.splev(x_new, tck, der=0) >> 100 loops, best of 3: 5.51 ms per loop > > It looks to me, you are timing two different things here, with > interp1d you time the spline creation with splev you time the > evaluation. > > for "cubic", interp1d uses _fitpack._bspleval so I wouldn't expect > much difference in timing. > But I didn't check whether there is a difference in what the wrappers are doing > > Josef > > > >> >> ------ >> I do understand that these are different spline interpolations (but that's as far as my knowledge goes). 
I was just annoyed at the person saying: Ah, you see python is slow - which it is not as shown by the second scipy command. >> >> Would it be possible to switch the spline interpolator used in interpolate.interp1d to the B-Splines, or to give an option to switch between different spline interpolators (maybe with a warning: slow). >> >> Ah - a last question: Why don't you use the issues tab on the github page? >> >> Thanks in advance, >> Wolfgang >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From wkerzendorf at gmail.com Wed Dec 12 18:26:49 2012 From: wkerzendorf at gmail.com (Wolfgang Kerzendorf) Date: Wed, 12 Dec 2012 18:26:49 -0500 Subject: [SciPy-User] scipy interpolate.interp1d spline slowness In-Reply-To: <652C4822-1171-47EE-BE0D-0DFA27BD9DA5@gmail.com> References: <6DF4A272-81C3-4A61-BFD0-BB7614448290@gmail.com> <652C4822-1171-47EE-BE0D-0DFA27BD9DA5@gmail.com> Message-ID: Sorry again, i should include that interp came from: interp = interpolate.interp1d(x, y, kind='cubic') Cheers W On 2012-12-12, at 6:25 PM, Wolfgang Kerzendorf wrote: > Hi Josef, > > Your'e absolutely right - I just omitted the other timings as they seemed small compared to the interp1d: > > %timeit interp(x_new) > 100 loops, best of 3: 3.89 ms per loop > > %timeit tck = interpolate.splrep(x, y, s=0) > 1000 loops, best of 3: 204 us per loop > > It would be great if you look into this. 
> > Thanks > Wolfgang > On 2012-12-12, at 5:25 PM, josef.pktd at gmail.com wrote: > >> On Wed, Dec 12, 2012 at 5:08 PM, Wolfgang Kerzendorf >> wrote: >>> Hello Scipyers, >>> >>> I've just stumbled across a problem with interpolate.interp1d: >>> ------------- >>> import numpy as np >>> from scipy import interpolate >>> x = arange(1000) >>> y = y = np.random.random_integers(0, 900, 1000) >>> >>> %timeit interp = interpolate.interp1d(x, y, kind='cubic') >>> 1 loops, best of 3: 3.63 s per loop >>> #the call for the interpolation is really quick afterwards (a couple ms) >>> >>> tck = interpolate.splrep(x, y, s=0) >>> %timeit interpolate.splev(x_new, tck, der=0) >>> 100 loops, best of 3: 5.51 ms per loop >> >> It looks to me, you are timing two different things here, with >> interp1d you time the spline creation with splev you time the >> evaluation. >> >> for "cubic", interp1d uses _fitpack._bspleval so I wouldn't expect >> much difference in timing. >> But I didn't check whether there is a difference in what the wrappers are doing >> >> Josef >> >> >> >>> >>> ------ >>> I do understand that these are different spline interpolations (but that's as far as my knowledge goes). I was just annoyed at the person saying: Ah, you see python is slow - which it is not as shown by the second scipy command. >>> >>> Would it be possible to switch the spline interpolator used in interpolate.interp1d to the B-Splines, or to give an option to switch between different spline interpolators (maybe with a warning: slow). >>> >>> Ah - a last question: Why don't you use the issues tab on the github page? 
>>> >>> Thanks in advance, >>> Wolfgang >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Wed Dec 12 19:10:41 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 12 Dec 2012 19:10:41 -0500 Subject: [SciPy-User] scipy interpolate.interp1d spline slowness In-Reply-To: References: <6DF4A272-81C3-4A61-BFD0-BB7614448290@gmail.com> <652C4822-1171-47EE-BE0D-0DFA27BD9DA5@gmail.com> Message-ID: Can you please follow our convention and post at bottom or inline. On Wed, Dec 12, 2012 at 6:26 PM, Wolfgang Kerzendorf wrote: > Sorry again, i should include that interp came from: interp = interpolate.interp1d(x, y, kind='cubic') > Cheers > W > On 2012-12-12, at 6:25 PM, Wolfgang Kerzendorf wrote: > >> Hi Josef, >> >> Your'e absolutely right - I just omitted the other timings as they seemed small compared to the interp1d: >> >> %timeit interp(x_new) >> 100 loops, best of 3: 3.89 ms per loop that looks faster than what you had before : >> %timeit interpolate.splev(x_new, tck, der=0) >> 100 loops, best of 3: 5.51 ms per loop I don't understand the setup code for interp1d cubic, _find_smoothest for interpolation ? but that looks slow in your timing. >> >> %timeit tck = interpolate.splrep(x, y, s=0) >> 1000 loops, best of 3: 204 us per loop >> >> It would be great if you look into this. Sorry, not me, I don't have time for this. I went recently roughly through the source for interp1d "linear", and so the other parts just on the side. 
I have never seen this extra spline code in interpolate.py there are a lot of potentially nice options 'natural', 'second','clamped', 'endslope', 'first', 'not-a-knot', 'runout', 'parabolic' and a lot of NotImplementedError and one more wrapper around _fitpack It looks like a major undertaking to just find one's way around the splines and fitpack wrapper codes. I stick to UnivariateSplines for "cubic". Josef >> >> Thanks >> Wolfgang >> On 2012-12-12, at 5:25 PM, josef.pktd at gmail.com wrote: >> >>> On Wed, Dec 12, 2012 at 5:08 PM, Wolfgang Kerzendorf >>> wrote: >>>> Hello Scipyers, >>>> >>>> I've just stumbled across a problem with interpolate.interp1d: >>>> ------------- >>>> import numpy as np >>>> from scipy import interpolate >>>> x = arange(1000) >>>> y = y = np.random.random_integers(0, 900, 1000) >>>> >>>> %timeit interp = interpolate.interp1d(x, y, kind='cubic') >>>> 1 loops, best of 3: 3.63 s per loop >>>> #the call for the interpolation is really quick afterwards (a couple ms) >>>> >>>> tck = interpolate.splrep(x, y, s=0) >>>> %timeit interpolate.splev(x_new, tck, der=0) >>>> 100 loops, best of 3: 5.51 ms per loop >>> >>> It looks to me, you are timing two different things here, with >>> interp1d you time the spline creation with splev you time the >>> evaluation. >>> >>> for "cubic", interp1d uses _fitpack._bspleval so I wouldn't expect >>> much difference in timing. >>> But I didn't check whether there is a difference in what the wrappers are doing >>> >>> Josef >>> >>> >>> >>>> >>>> ------ >>>> I do understand that these are different spline interpolations (but that's as far as my knowledge goes). I was just annoyed at the person saying: Ah, you see python is slow - which it is not as shown by the second scipy command. >>>> >>>> Would it be possible to switch the spline interpolator used in interpolate.interp1d to the B-Splines, or to give an option to switch between different spline interpolators (maybe with a warning: slow). 
>>>> >>>> Ah - a last question: Why don't you use the issues tab on the github page? >>>> >>>> Thanks in advance, >>>> Wolfgang >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From pav at iki.fi Thu Dec 13 07:18:19 2012 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 13 Dec 2012 12:18:19 +0000 (UTC) Subject: [SciPy-User] scipy interpolate.interp1d spline slowness References: <6DF4A272-81C3-4A61-BFD0-BB7614448290@gmail.com> Message-ID: Wolfgang Kerzendorf gmail.com> writes: > I've just stumbled across a problem with interpolate.interp1d: > ------------- > import numpy as np > from scipy import interpolate > x = arange(1000) > y = y = np.random.random_integers(0, 900, 1000) > > %timeit interp = interpolate.interp1d(x, y, kind='cubic') > 1 loops, best of 3: 3.63 s per loop This is indeed slow. The problem is probably that the spline fitting routine splmake() does not make use of the bandedness of the spline interpolation matrix, and so it is inefficient. Switching to FITPACK splines could be a better option --- I think the duplicated (and partial) spline functionality is not very useful. -- Pauli Virtanen From perry at stsci.edu Thu Dec 13 09:37:58 2012 From: perry at stsci.edu (Perry Greenfield) Date: Thu, 13 Dec 2012 09:37:58 -0500 Subject: [SciPy-User] JOB: positions available at STScI References: Message-ID: <76734032-916A-43E9-A363-F04A94A71426@stsci.edu> The Science Software Branch at the Space Telescope Science Institute has openings for python developers. 
If you are interested please apply through this link: https://rn11.ultipro.com/SPA1004/JobBoard/JobDetails.aspx?__ID=*41BCE81D823F3F8C The relevant section is the one labeled "Science Data Calibration and Data Analysis" Although the position is advertised as a junior position, candidates with more experience may be considered for higher level positions. From bgamari.foss at gmail.com Thu Dec 13 12:30:41 2012 From: bgamari.foss at gmail.com (Ben Gamari) Date: Thu, 13 Dec 2012 12:30:41 -0500 Subject: [SciPy-User] scipy.stats.distributions lacking categorical and multinomial? Message-ID: <87txrpuaf2.fsf@gmail.com> Is there any particular reason for the categorical (generalized Bernoulli) and multinomial (generalized binomial) distribution not being included in scipy.stats.distributions? Cheers, - Ben From robert.kern at gmail.com Thu Dec 13 12:36:50 2012 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 13 Dec 2012 17:36:50 +0000 Subject: [SciPy-User] scipy.stats.distributions lacking categorical and multinomial? In-Reply-To: <87txrpuaf2.fsf@gmail.com> References: <87txrpuaf2.fsf@gmail.com> Message-ID: On Thu, Dec 13, 2012 at 5:30 PM, Ben Gamari wrote: > Is there any particular reason for the categorical (generalized > Bernoulli) and multinomial (generalized binomial) distribution not being > included in scipy.stats.distributions? scipy.stats.distributions only provides a framework for univariate distributions. It's not clear what the interfaces should be for multivariate distributions, but you are welcome to make proposals. -- Robert Kern From jsseabold at gmail.com Thu Dec 13 12:44:36 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 13 Dec 2012 12:44:36 -0500 Subject: [SciPy-User] scipy.stats.distributions lacking categorical and multinomial? 
In-Reply-To: References: <87txrpuaf2.fsf@gmail.com> Message-ID: On Thu, Dec 13, 2012 at 12:36 PM, Robert Kern wrote: > On Thu, Dec 13, 2012 at 5:30 PM, Ben Gamari wrote: >> Is there any particular reason for the categorical (generalized >> Bernoulli) and multinomial (generalized binomial) distribution not being >> included in scipy.stats.distributions? > > scipy.stats.distributions only provides a framework for univariate > distributions. It's not clear what the interfaces should be for > multivariate distributions, but you are welcome to make proposals. > I don't know what your needs are, but FWIW, I've been able to get by using np.random.multinomial for sampling and either rolling my own functions or using PyMC for the included categorical_like and multinomial_like. Skipper From jjstickel at gmail.com Thu Dec 13 15:50:24 2012 From: jjstickel at gmail.com (Jonathan Stickel) Date: Thu, 13 Dec 2012 13:50:24 -0700 Subject: [SciPy-User] scipy interpolate.interp1d spline slowness In-Reply-To: References: Message-ID: <50CA3F90.3080008@gmail.com> On 12/13/12 10:26 , scipy-user-request at scipy.org wrote: > Date: Wed, 12 Dec 2012 17:25:00 -0500 > From:josef > Subject: Re: [SciPy-User] scipy interpolate.interp1d spline slowness > To: SciPy Users List > > On Wed, Dec 12, 2012 at 5:08 PM, Wolfgang Kerzendorf > wrote: >> >Hello Scipyers, >> > >> >I've just stumbled across a problem with interpolate.interp1d: >> >------------- >> >import numpy as np >> >from scipy import interpolate >> >x = arange(1000) >> >y = y = np.random.random_integers(0, 900, 1000) >> > >> >%timeit interp = interpolate.interp1d(x, y, kind='cubic') >> >1 loops, best of 3: 3.63 s per loop >> >#the call for the interpolation is really quick afterwards (a couple ms) >> > >> >tck = interpolate.splrep(x, y, s=0) >> >%timeit interpolate.splev(x_new, tck, der=0) >> >100 loops, best of 3: 5.51 ms per loop > It looks to me, you are timing two different things here, with > interp1d you time the spline creation with 
splev you time the > evaluation. > > for "cubic", interp1d uses _fitpack._bspleval so I wouldn't expect > much difference in timing. > But I didn't check whether there is a difference in what the wrappers are doing > I think this is the appropriate comparison: In [1]: import numpy as np In [2]: from scipy import interpolate as ipt In [3]: x = np.arange(1000) In [4]: y = 100*np.random.rand(1000) In [5]: xp = x[:-1] + 0.5 In [6]: %timeit yp1 = ipt.interp1d(x,y,kind=3)(xp) 1 loops, best of 3: 5.08 s per loop In [7]: %timeit yp2 = ipt.splev( xp, ipt.splrep(x,y,k=3) ) 1000 loops, best of 3: 326 us per loop I also don't claim to know the details of the two implementations, but I now use splrep/splev for any cubic spline interpolation of more than a few points. Jonathan From pawel.kw at gmail.com Fri Dec 14 07:06:09 2012 From: pawel.kw at gmail.com (=?ISO-8859-2?Q?Pawe=B3_Kwa=B6niewski?=) Date: Fri, 14 Dec 2012 13:06:09 +0100 Subject: [SciPy-User] cubic spline interpolation - derivative value in an end point In-Reply-To: References: Message-ID: Hi, Actually, I'm having a similar problem now. Your hack should do the job - thanks Eric. Cheers, Pawe? 2012/11/27 Maxim > Thank you a lot, Eric! > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From philip at semanchuk.com Fri Dec 14 09:34:57 2012 From: philip at semanchuk.com (Philip Semanchuk) Date: Fri, 14 Dec 2012 09:34:57 -0500 Subject: [SciPy-User] Calling LAPACK function dbdsqr()? Message-ID: <1208070C-2DA0-4982-BF4E-B85C7ECF67AD@semanchuk.com> Hi all, I'm porting some Fortran code that makes use of a number of BLAS and LAPACK functions, including dbdsqr(). I've found all of the functions I need via scipy.linalg.lapack.get_lapack_funcs/get_blas_funcs() except for dbdsqr(). 
I see that the numpy source code (I looked at numpy-1.6.0b2) contains dbdsqr() in numpy/linalg/dlapack_lite.c, but grep can't find that string in the binary distribution on my Mac nor on Linux. If it's buried in a numpy binary somewhere, I'm comfortable with using ctypes to call it, but I suspect it isn't. Can anyone point me to a cross-platform (OS X, Linux & Windows) way I can call this function without compiling code myself? I'm unfortunately quite naïve about the math in the code I'm porting, so I'm porting the code blindly -- if you ask me what problem I'm trying to solve with dbdsqr(), I won't be able to explain. Thanks in advance for any suggestions, Philip PS - Please pardon if you already saw this question on the numpy list. From jjhelmus at gmail.com Fri Dec 14 10:38:35 2012 From: jjhelmus at gmail.com (Jonathan Helmus) Date: Fri, 14 Dec 2012 10:38:35 -0500 Subject: [SciPy-User] Calling LAPACK function dbdsqr()? In-Reply-To: <1208070C-2DA0-4982-BF4E-B85C7ECF67AD@semanchuk.com> References: <1208070C-2DA0-4982-BF4E-B85C7ECF67AD@semanchuk.com> Message-ID: <50CB47FB.9090404@gmail.com> On 12/14/2012 09:34 AM, Philip Semanchuk wrote: > Hi all, > I'm porting some Fortran code that makes use of a number of BLAS and LAPACK functions, including dbdsqr(). I've found all of the functions I need via scipy.linalg.lapack.get_lapack_funcs/get_blas_funcs() except for dbdsqr(). > > I see that the numpy source code (I looked at numpy-1.6.0b2) contains dbdsqr() in numpy/linalg/dlapack_lite.c, but grep can't find that string in the binary distribution on my Mac nor on Linux. If it's buried in a numpy binary somewhere, I'm comfortable with using ctypes to call it, but I suspect it isn't. > > Can anyone point me to a cross-platform (OS X, Linux & Windows) way I can call this function without compiling code myself?
> > I'm unfortunately quite naïve about the math in the code I'm porting, so I'm porting the code blindly -- if you ask me what problem I'm trying to solve with dbdsqr(), I won't be able to explain. > > Thanks in advance for any suggestions, > Philip Philip, Not sure if this is portable or is possible on all or even most Scipy installs but I can find a pointer to the LAPACK dbdsqr function in the clapack/flapack shared library using ctypes on my EPD 7.3.1 rh5 install: In [1]: from ctypes import * In [2]: import scipy.linalg In [3]: lib = CDLL(scipy.linalg.lapack.clapack.__file__) In [4]: lib.dbdsqr Out[4]: <_FuncPtr object at 0x2d306d0> This also works with the scipy.linalg.lapack.flapack.__file__ and np.linalg.lapack_lite.__file__. I would think it would be possible to create a Python wrapper around this with the ctypes module (>= Python 2.5) which properly defines the argument and return types, etc., but it is not ideal. Cheers, - Jonathan Helmus From sturla at molden.no Fri Dec 14 10:42:44 2012 From: sturla at molden.no (Sturla Molden) Date: Fri, 14 Dec 2012 16:42:44 +0100 Subject: [SciPy-User] Calling LAPACK function dbdsqr()? In-Reply-To: <1208070C-2DA0-4982-BF4E-B85C7ECF67AD@semanchuk.com> References: <1208070C-2DA0-4982-BF4E-B85C7ECF67AD@semanchuk.com> Message-ID: <50CB48F4.3000007@molden.no> If you are building SciPy from source, just add the interface to scipy/linalg/flapack.pyf.src and scipy.linalg.lapack.get_lapack_funcs will get it for you. It would be nice if SciPy exposed all of LAPACK, even the parts it doesn't use, but currently it doesn't. If you're not building SciPy from source, you can e.g. write a .pyf file for dbdsqr, run it through f2py, and link with your LAPACK library. If you prefer to avoid Fortran, but already have a LAPACK library, you can also use LAPACKE with Cython (or ctypes).
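To make the .pyf suggestion concrete, here is a rough, untested sketch of what an f2py signature for dbdsqr might look like. The argument list follows the LAPACK documentation for DBDSQR; the intent attributes and the work-array size are guesses that would need checking against the existing entries in scipy/linalg/flapack.pyf.src before use.

```fortran
! Hypothetical sketch only -- not the actual scipy wrapper.
! Goes inside the existing interface block of a .pyf file.
subroutine dbdsqr(uplo,n,ncvt,nru,ncc,d,e,vt,ldvt,u,ldu,c,ldc,work,info)
  character :: uplo
  integer intent(in) :: n, ncvt, nru, ncc, ldvt, ldu, ldc
  double precision dimension(n), intent(in,out,copy) :: d
  double precision dimension(n-1), intent(in,out,copy) :: e
  double precision dimension(ldvt,ncvt), intent(in,out,copy) :: vt
  double precision dimension(ldu,n), intent(in,out,copy) :: u
  double precision dimension(ldc,ncc), intent(in,out,copy) :: c
  double precision dimension(4*n), intent(cache,hide) :: work
  integer intent(out) :: info
end subroutine dbdsqr
```

Compiled with something like `f2py -c dbdsqr.pyf -llapack`, this would expose the routine as a regular Python function, though the exact flags depend on where your LAPACK lives.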
Sturla On 14.12.2012 15:34, Philip Semanchuk wrote: > Hi all, > I'm porting some Fortran code that makes use of a number of BLAS and LAPACK functions, including dbdsqr(). I've found all of the functions I need via scipy.linalg.lapack.get_lapack_funcs/get_blas_funcs() except for dbdsqr(). > > I see that the numpy source code (I looked at numpy-1.6.0b2) contains dbdsqr() in numpy/linalg/dlapack_lite.c, but grep can't find that string in the binary distribution on my Mac nor on Linux. If it's buried in a numpy binary somewhere, I'm comfortable with using ctypes to call it, but I suspect it isn't. > > Can anyone point me to a cross-platform (OS X, Linux& Windows) way I can call this function without compiling code myself? > > I'm unfortunately quite na?ve about the math in the code I'm porting, so I'm porting the code blindly -- if you ask me what problem I'm trying to solve with dbdsqr(), I won't be able to explain. > > Thanks in advance for any suggestions, > Philip > > PS - Please pardon if you already saw this question on the numpy list. > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From philip at semanchuk.com Fri Dec 14 10:43:52 2012 From: philip at semanchuk.com (Philip Semanchuk) Date: Fri, 14 Dec 2012 10:43:52 -0500 Subject: [SciPy-User] Calling LAPACK function dbdsqr()? In-Reply-To: <50CB47FB.9090404@gmail.com> References: <1208070C-2DA0-4982-BF4E-B85C7ECF67AD@semanchuk.com> <50CB47FB.9090404@gmail.com> Message-ID: <1C358F5F-1D30-4ED3-B868-0B6D37E85521@semanchuk.com> On Dec 14, 2012, at 10:38 AM, Jonathan Helmus wrote: > On 12/14/2012 09:34 AM, Philip Semanchuk wrote: >> Hi all, >> I'm porting some Fortran code that makes use of a number of BLAS and LAPACK functions, including dbdsqr(). I've found all of the functions I need via scipy.linalg.lapack.get_lapack_funcs/get_blas_funcs() except for dbdsqr(). 
>> >> I see that the numpy source code (I looked at numpy-1.6.0b2) contains dbdsqr() in numpy/linalg/dlapack_lite.c, but grep can't find that string in the binary distribution on my Mac nor on Linux. If it's buried in a numpy binary somewhere, I'm comfortable with using ctypes to call it, but I suspect it isn't. >> >> Can anyone point me to a cross-platform (OS X, Linux& Windows) way I can call this function without compiling code myself? >> >> I'm unfortunately quite na?ve about the math in the code I'm porting, so I'm porting the code blindly -- if you ask me what problem I'm trying to solve with dbdsqr(), I won't be able to explain. >> >> Thanks in advance for any suggestions, >> Philip > Philip, > > Not sure if this is portable or is possible on all or even most Scipy > installs but I can find a pointer to the LAPACK dbdsqr function in the > clapack/flapack shared library using ctypes on my EDP 7.3.1 rh5 install: > > > In [1]: from ctypes import * > > In [2]: import scipy.linalg > > In [3]: lib = CDLL(scipy.linalg.lapack.clapack.__file__) > > In [4]: lib.dbdsqr > Out[4]: <_FuncPtr object at 0x2d306d0> > > This also works with the scipy.linalg.lapack.flapack.__file__ and > np.linalg.lapack_lite.__file__. I would think it would be possible to > create a Python wrapper around this with the ctypes module (>= Python > 2.5) which properly defines the arguments and returned types, etc but it > is not ideal. Awesome! Believe me, this is ideal compared to the alternatives. Your code also works on my OS X install using scipy 0.11. It doesn't work under Linux Mint 13 w/scipy 0.9.0 nor WinXP w/scipy 0.10.0. What's odd is that the latest source code for scipy only has one reference to this function, and it is documentation saying "Not Implemented". I'll have to experiment a little to see if this is coming to scipy via numpy. Thank you so much, Jonathan. 
And for nmrglue, too ;) Cheers Philip From philip at semanchuk.com Fri Dec 14 10:48:09 2012 From: philip at semanchuk.com (Philip Semanchuk) Date: Fri, 14 Dec 2012 10:48:09 -0500 Subject: [SciPy-User] Calling LAPACK function dbdsqr()? In-Reply-To: <50CB48F4.3000007@molden.no> References: <1208070C-2DA0-4982-BF4E-B85C7ECF67AD@semanchuk.com> <50CB48F4.3000007@molden.no> Message-ID: On Dec 14, 2012, at 10:42 AM, Sturla Molden wrote: > If you are building SciPy from source, just add the interface to > scipy/linalg/flapack.pyf.src and scipy.linalg.lapack.get_lapack_funcs > will get it for you. I would be nice if SciPy exposed all of LAPACK, > even the parts it doesn't use, but currently it don't. > > If you're not building SciPy from source, yopu can e.g. write a .pyf > file for dbdsqr and use f2py and and link with your LAPACK library. If > you prefer to avoid Fortran, but already have a LAPACK library, you can > also use LAPACKE with Cython (or ctypes). Thanks, Sturla. Our code is distributed to others as part of an application suite. We don't ask our users to compile anything, nor can we necessarily predict what libraries will be available on the machine outside of our app's prerequisites. Our app requires numpy and scipy, so I was really hoping to get at it from one of those two libraries. Cheers Philip > On 14.12.2012 15:34, Philip Semanchuk wrote: >> Hi all, >> I'm porting some Fortran code that makes use of a number of BLAS and LAPACK functions, including dbdsqr(). I've found all of the functions I need via scipy.linalg.lapack.get_lapack_funcs/get_blas_funcs() except for dbdsqr(). >> >> I see that the numpy source code (I looked at numpy-1.6.0b2) contains dbdsqr() in numpy/linalg/dlapack_lite.c, but grep can't find that string in the binary distribution on my Mac nor on Linux. If it's buried in a numpy binary somewhere, I'm comfortable with using ctypes to call it, but I suspect it isn't. 
>> >> Can anyone point me to a cross-platform (OS X, Linux& Windows) way I can call this function without compiling code myself? >> >> I'm unfortunately quite na?ve about the math in the code I'm porting, so I'm porting the code blindly -- if you ask me what problem I'm trying to solve with dbdsqr(), I won't be able to explain. >> >> Thanks in advance for any suggestions, >> Philip >> >> PS - Please pardon if you already saw this question on the numpy list. >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From sturla at molden.no Fri Dec 14 10:54:22 2012 From: sturla at molden.no (Sturla Molden) Date: Fri, 14 Dec 2012 16:54:22 +0100 Subject: [SciPy-User] Calling LAPACK function dbdsqr()? In-Reply-To: <50CB47FB.9090404@gmail.com> References: <1208070C-2DA0-4982-BF4E-B85C7ECF67AD@semanchuk.com> <50CB47FB.9090404@gmail.com> Message-ID: <50CB4BAE.1050105@molden.no> On 14.12.2012 16:38, Jonathan Helmus wrote: > Not sure if this is portable or is possible on all or even most Scipy > installs but I can find a pointer to the LAPACK dbdsqr function in the > clapack/flapack shared library using ctypes on my EDP 7.3.1 rh5 install: > > > In [1]: from ctypes import * > > In [2]: import scipy.linalg > > In [3]: lib = CDLL(scipy.linalg.lapack.clapack.__file__) That will not be cross-platform, but depend on ABI for the Fortran compiler used on the particular platform. Cross-platform ABI dependency is a PITA. CLAPACK is also abandonware, LAPACKE is the current C/C++ interface. Thus with any recent LAPACK, there will be no clapack in SciPy. I hope that SciPy at some point could start to use LAPACKE. Another cross-platform solution is to use the Fortran 2003 ISO C binding and Cython (i.e. fwrap instead of f2py). 
Sturla From p.zaffino at yahoo.it Fri Dec 14 10:57:57 2012 From: p.zaffino at yahoo.it (Paolo Zaffino) Date: Fri, 14 Dec 2012 15:57:57 +0000 (GMT) Subject: [SciPy-User] Points fitting (non lin) Message-ID: <1355500677.78276.YahooMailNeo@web171602.mail.ir2.yahoo.com> Dear Scipy community, I have a set of 2D points and I would like to compute a curve that fits them. The points are ordered in a precise way (not in ascending order) and I can't change this order (the curve should fit the points in that order). I'm interested in a non-linear fit (the ideal case would be several intervals of quadratic curves, i.e. a piecewise quadratic). Does anyone have any advice? Thank you very much. Regards. Paolo From philip at semanchuk.com Fri Dec 14 11:03:11 2012 From: philip at semanchuk.com (Philip Semanchuk) Date: Fri, 14 Dec 2012 11:03:11 -0500 Subject: [SciPy-User] Calling LAPACK function dbdsqr()? In-Reply-To: <50CB4BAE.1050105@molden.no> References: <1208070C-2DA0-4982-BF4E-B85C7ECF67AD@semanchuk.com> <50CB47FB.9090404@gmail.com> <50CB4BAE.1050105@molden.no> Message-ID: On Dec 14, 2012, at 10:54 AM, Sturla Molden wrote: > On 14.12.2012 16:38, Jonathan Helmus wrote: > >> Not sure if this is portable or is possible on all or even most Scipy >> installs but I can find a pointer to the LAPACK dbdsqr function in the >> clapack/flapack shared library using ctypes on my EDP 7.3.1 rh5 install: >> >> >> In [1]: from ctypes import * >> >> In [2]: import scipy.linalg >> >> In [3]: lib = CDLL(scipy.linalg.lapack.clapack.__file__) > > That will not be cross-platform, but depend on ABI for the Fortran > compiler used on the particular platform. Cross-platform ABI dependency > is a PITA. > > CLAPACK is also abandonware, LAPACKE is the current C/C++ interface. > Thus with any recent LAPACK, there will be no clapack in SciPy. Hmmm, well that's disappointing to hear, but better to know now than finding out the hard way later.
Maybe a pure-Python + numpy/scipy solution for this just isn't possible right now. Tack/Thanks P From sturla at molden.no Fri Dec 14 11:07:44 2012 From: sturla at molden.no (Sturla Molden) Date: Fri, 14 Dec 2012 17:07:44 +0100 Subject: [SciPy-User] Calling LAPACK function dbdsqr()? In-Reply-To: References: <1208070C-2DA0-4982-BF4E-B85C7ECF67AD@semanchuk.com> <50CB48F4.3000007@molden.no> Message-ID: <50CB4ED0.1070203@molden.no> On 14.12.2012 16:48, Philip Semanchuk wrote: > Thanks, Sturla. Our code is distributed to others as part of an application suite. We don't ask our users to compile anything, nor can we necessarily predict what libraries will be available on the machine outside of our app's prerequisites. Our app requires numpy and scipy, so I was really hoping to get at it from one of those two libraries. I know, I constantly run into the same issue. For example with the latest Enthought Python, I do this to get LAPACK: intel_mkl = ctypes.CDLL('mk2_rt.dll') Then I can call LAPACKE functions in the MKL library: LAPACKE_dbdsqr = intel_mkl.LAPACKE_dbdsqr And if I want to use it from Cython, I must do this conversion: <void *>( ctypes.cast(LAPACKE_dbdsqr, ctypes.c_void_p).value ) which gives me the address of LAPACKE_dbdsqr in a void*. Sturla From pawel.kw at gmail.com Fri Dec 14 11:25:31 2012 From: pawel.kw at gmail.com (Paweł Kwaśniewski) Date: Fri, 14 Dec 2012 17:25:31 +0100 Subject: [SciPy-User] Points fitting (non lin) In-Reply-To: <1355500677.78276.YahooMailNeo@web171602.mail.ir2.yahoo.com> References: <1355500677.78276.YahooMailNeo@web171602.mail.ir2.yahoo.com> Message-ID: Dear Paolo, I'm not sure I understand your problem correctly, but this sounds like a spline fitting job. You can read more about this here: http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html Is this what you are looking for? Cheers, Paweł
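To make the spline-fitting suggestion concrete: for points that must be fitted in a given order even though x is not monotonic, the usual scipy tool is a parametric spline via scipy.interpolate.splprep. A minimal sketch (the four points are illustrative only):

```python
import numpy as np
from scipy import interpolate

# Ordered points; note x is not monotonically increasing.
x = np.array([1.0, 2.0, 4.0, 3.0])
y = np.array([1.0, 2.0, 2.0, 1.0])

# Fit a parametric quadratic spline through the points in the given
# order.  s=0 forces exact interpolation; u holds the parameter values
# splprep assigned to the input points (normalized to [0, 1]).
tck, u = interpolate.splprep([x, y], k=2, s=0)

# Evaluating at the original parameter values recovers the points.
xi, yi = interpolate.splev(u, tck)

# A fine parameter grid traces out the whole curve, e.g. for plotting.
xc, yc = interpolate.splev(np.linspace(0.0, 1.0, 100), tck)
```

With s=0 the curve passes through every input point in order; increasing s trades exactness for smoothness.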
2012/12/14 Paolo Zaffino > Dear Scipy community, > > I have a set of points (2D) and I would compute a curve that fits them. > The points are ordered in a precise way (not crescent order) and I can't > change this order (the curve should fit the points in that order). > I'm interseting in a non linear fit (the ideal case would be more > intervals of quadratic curves). > Has anyone any advice about? > > Thank you very much. > Regards. > Paolo > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Fri Dec 14 11:30:11 2012 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 14 Dec 2012 16:30:11 +0000 (UTC) Subject: [SciPy-User] Calling LAPACK function dbdsqr()? References: <1208070C-2DA0-4982-BF4E-B85C7ECF67AD@semanchuk.com> <50CB48F4.3000007@molden.no> <50CB4ED0.1070203@molden.no> Message-ID: Sturla Molden molden.no> writes: > On 14.12.2012 16:48, Philip Semanchuk wrote: > > > Thanks, Sturla. Our code is distributed to others as part of an application suite. We don't ask our users to > compile anything, nor can we necessarily predict what libraries will be available on the machine outside > of our app's prerequisites. Our app requires numpy and scipy, so I was really hoping to get at it from one of > those two libraries. > > I know, I constantly run into the same issue. I think this means that someone needs to go through the LAPACK wrappers, and add any functions that are missing. 
Pull requests are accepted ;) -- Pauli Virtanen From pav at iki.fi Fri Dec 14 11:33:07 2012 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 14 Dec 2012 16:33:07 +0000 (UTC) Subject: [SciPy-User] Points fitting (non lin) References: <1355500677.78276.YahooMailNeo@web171602.mail.ir2.yahoo.com> Message-ID: Paolo Zaffino yahoo.it> writes: > I have a set of points (2D) and I would compute a curve that fits them. > The points are ordered in a precise way (not crescent order) and > I can't change this order (the curve should fit the points in that order). > I'm interseting in a non linear fit (the ideal case would be more > intervals of quadratic curves). > Has anyone any advice about? splprep may be the function you are looking for; it fits a parametric spline to a set of points. Read the documentation on how to choose the smoothing `s` parameter. http://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.splprep.html -- Pauli Virtanen From dmtmakris at gmail.com Thu Dec 13 14:48:56 2012 From: dmtmakris at gmail.com (maxstirner) Date: Thu, 13 Dec 2012 11:48:56 -0800 (PST) Subject: [SciPy-User] [SciPy-user] decimal order bessel function zeros Message-ID: <34794804.post@talk.nabble.com> Hello all, jn_zeros(n, nt) calculates nt zeros of the order-n Bessel function (n is an integer). How can I calculate the nt zeros of an order-n Bessel function when n is not an integer (i.e. has a decimal order)? -- View this message in context: http://old.nabble.com/decimal-order-bessel-function-zeros-tp34794804p34794804.html Sent from the Scipy-User mailing list archive at Nabble.com. From p.zaffino at yahoo.it Fri Dec 14 11:42:47 2012 From: p.zaffino at yahoo.it (Paolo Zaffino) Date: Fri, 14 Dec 2012 16:42:47 +0000 (GMT) Subject: [SciPy-User] Points fitting (non lin) In-Reply-To: References: <1355500677.78276.YahooMailNeo@web171602.mail.ir2.yahoo.com> Message-ID: <1355503367.42200.YahooMailNeo@web171602.mail.ir2.yahoo.com> Dear Paweł, thank you for the reply. I'll try to explain the issue better.
I have these points (in this order): P1 = (1,1) P2 = (2,2) P3 = (4,2) P4 = (3,1) I need to fit the points in the order P1,P2,P3,P4 even if the x coord of P3 is greater than that of P4. I thought of a piecewise quadratic curve, but other solutions are welcome. Thanks a lot. Paolo ________________________________ From: Paweł Kwaśniewski To: Paolo Zaffino ; SciPy Users List Sent: Friday, 14 December 2012 17:25 Subject: Re: [SciPy-User] Points fitting (non lin) Dear Paolo, I'm not sure I understand correctly your problem, but this sounds like a spline fitting job. You can read more about this here: http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html Is this what you are looking for? Cheers, Paweł 2012/12/14 Paolo Zaffino Dear Scipy community, > > >I have a set of points (2D) and I would compute a curve that fits them. >The points are ordered in a precise way (not crescent order) and I can't change this order (the curve should fit the points in that order). >I'm interseting in a non linear fit (the ideal case would be more intervals of quadratic curves). >Has anyone any advice about? > > >Thank you very much. >Regards. Paolo >_______________________________________________ >SciPy-User mailing list >SciPy-User at scipy.org >http://mail.scipy.org/mailman/listinfo/scipy-user > > From sturla at molden.no Fri Dec 14 11:46:11 2012 From: sturla at molden.no (Sturla Molden) Date: Fri, 14 Dec 2012 17:46:11 +0100 Subject: [SciPy-User] Calling LAPACK function dbdsqr()? In-Reply-To: References: <1208070C-2DA0-4982-BF4E-B85C7ECF67AD@semanchuk.com> <50CB48F4.3000007@molden.no> <50CB4ED0.1070203@molden.no> Message-ID: <50CB57D3.4000800@molden.no> On 14.12.2012 17:30, Pauli Virtanen wrote: > I think this means that someone needs to go through the LAPACK > wrappers, and add any functions that are missing.
> > Pull requests are accepted ;) Yes, preferably all of LAPACK should be defined in scipy/linalg/flapack.pyf. But now it just has the functions that SciPy needs. It will take quite some work. And I presume tests will be needed too? Sturla From pav at iki.fi Fri Dec 14 11:47:42 2012 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 14 Dec 2012 16:47:42 +0000 (UTC) Subject: [SciPy-User] Calling LAPACK function dbdsqr()? References: <1208070C-2DA0-4982-BF4E-B85C7ECF67AD@semanchuk.com> <50CB47FB.9090404@gmail.com> <50CB4BAE.1050105@molden.no> Message-ID: Philip Semanchuk semanchuk.com> writes: [clip] > > CLAPACK is also abandonware, LAPACKE is the current C/C++ interface. > > Thus with any recent LAPACK, there will be no clapack in SciPy. > > Hmmm, well that's disappointing to hear, but better to know > now than finding out the hard way later. > > Maybe a pure-Python + numpy/scipy solution for this just > isn't possible right now. However, you should still be able to use ctypes to interface with the Fortran routines in scipy/linalg/flapack.so. While the CLAPACK is not always present, the Fortran-based LAPACK is. This may however require some fiddling with the calling conventions. So testing with multiple platforms will be needed, but perhaps you are OK with doing that... -- Pauli Virtanen From zachary.pincus at yale.edu Fri Dec 14 11:50:25 2012 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Fri, 14 Dec 2012 11:50:25 -0500 Subject: [SciPy-User] Points fitting (non lin) In-Reply-To: <1355503367.42200.YahooMailNeo@web171602.mail.ir2.yahoo.com> References: <1355500677.78276.YahooMailNeo@web171602.mail.ir2.yahoo.com> <1355503367.42200.YahooMailNeo@web171602.mail.ir2.yahoo.com> Message-ID: <9E04AB38-D38C-4C6D-881E-E294898EA3AC@yale.edu> On Dec 14, 2012, at 11:42 AM, Paolo Zaffino wrote: > Dear Paweł, > thank you for the reply. > I try to explain better the issue.
> I have these points (in this order): > > P1 = (1,1) > P2 = (2,2) > P3 = (4,2) > P4 = (3,1) > > I need to fit the points in the order P1,P2,P3,P4 even if the x coord of P3 is greater than P4. > I thought to quadratic piecewise curve but other solutions are welcome. > > Thanks a lot. > Paolo You will want to fit a parametric spline of some degree with some amount (or no) smoothing. I'd look at the splprep function in scipy.interpolate. The trick is you associate each point with some monotonic parameter value, and then interpolate along that parameter (say t) to get your x, y coordinates. E.g.: t x y 0 1 1 1 2 2 2 4 2 3 3 1 Then if you were interpolating linearly, at t=0.5, you would have (1.5, 1.5) as the coordinate. As above, splprep will generate splines of a desired order (linear, quadratic, cubic, etc.) and with a user-specified smoothing parameter (s), which can be set to zero to get exact interpolation of the input coordinates, potentially at the cost of ringing (sometimes quite bad) away from the input coordinate. So you will need to plot the interpolated values, both at the input t-values, as well as at intermediate t's, to see if the output is sane. Hopefully this is somewhat clear, or at least enough to get you started. Please read the documentation for splprep and splev for more information. Zach > > Da: Pawe? Kwa?niewski > A: Paolo Zaffino ; SciPy Users List > Inviato: Venerd? 14 Dicembre 2012 17:25 > Oggetto: Re: [SciPy-User] Points fitting (non lin) > > Dear Paolo, > > I'm not sure I understand correctly your problem, but this sounds like a spline fitting job. You can read more about this here: http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html > > Is this what you are looking for? > > Cheers, > > Pawe? > > > > 2012/12/14 Paolo Zaffino > Dear Scipy community, > > I have a set of points (2D) and I would compute a curve that fits them. 
> The points are ordered in a precise way (not crescent order) and I can't change this order (the curve should fit the points in that order). > I'm interseting in a non linear fit (the ideal case would be more intervals of quadratic curves). > Has anyone any advice about? > > Thank you very much. > Regards. > Paolo > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From pav at iki.fi Fri Dec 14 12:03:04 2012 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 14 Dec 2012 17:03:04 +0000 (UTC) Subject: [SciPy-User] Calling LAPACK function dbdsqr()? References: <1208070C-2DA0-4982-BF4E-B85C7ECF67AD@semanchuk.com> <50CB48F4.3000007@molden.no> <50CB4ED0.1070203@molden.no> <50CB57D3.4000800@molden.no> Message-ID: Sturla Molden molden.no> writes: > On 14.12.2012 17:30, Pauli Virtanen wrote: > > I think this means that someone needs to go through the LAPACK > > wrappers, and add any functions that are missing. > > > > Pull requests are accepted ;) > > Yes, perferably all of LAPACK should be defined in > > scipy/linalg/flapack.pyf > > But now it just has the functions that SciPy needs. > > It will take quite some work. And I presume tests will be needed too? We don't have tests for all the existing routines. Just smoke tests would be good enough, I think --- work space query and calling the routine on some random data. I'd guess writing these would be boring, but on the other hand writing the .pyf file probably take more time. Probably doesn't make so much sense checking any of the results, as we trust LAPACK. 
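A smoke test of the kind described above can be quite short. A sketch of the pattern, using a routine that scipy already wraps (gesv, the LU solver) — call it on random data and check only the info flag, leaving the numerics to LAPACK:

```python
import numpy as np
from scipy.linalg.lapack import get_lapack_funcs

# Random but well-conditioned test problem (diagonally dominant).
rng = np.random.RandomState(1234)
n = 5
a = rng.rand(n, n) + n * np.eye(n)
b = rng.rand(n, 1)

# Look up the wrapped routine the same way scipy's own code does;
# get_lapack_funcs picks the d/z variant matching the array dtypes.
gesv, = get_lapack_funcs(('gesv',), (a, b))

# The "smoke test": call it and check only that it completed.
lu, piv, x, info = gesv(a, b)
assert info == 0
```

A suite of such calls, plus a workspace query (lwork=-1) for routines that take one, would cover the "did the wrapper survive the trip through f2py" question without re-verifying LAPACK itself.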
-- Pauli Virtanen From pawel.kw at gmail.com Fri Dec 14 12:13:57 2012 From: pawel.kw at gmail.com (Paweł Kwaśniewski) Date: Fri, 14 Dec 2012 18:13:57 +0100 Subject: [SciPy-User] Points fitting (non lin) In-Reply-To: <9E04AB38-D38C-4C6D-881E-E294898EA3AC@yale.edu> References: <1355500677.78276.YahooMailNeo@web171602.mail.ir2.yahoo.com> <1355503367.42200.YahooMailNeo@web171602.mail.ir2.yahoo.com> <9E04AB38-D38C-4C6D-881E-E294898EA3AC@yale.edu> Message-ID: So, as far as I understand, the scipy.interpolate.UnivariateSpline() function is equivalent to interpolate.splrep - according to the documentation string it is just a more modern, object oriented implementation. In practice, it means that scipy.interpolate.UnivariateSpline() returns a function which can then be evaluated on the desired x axis (I actually learned this today...). In the case of the example data you gave, this would be something like this: from numpy import array from scipy import interpolate x = array((1,2,4,3)) y = array((1,2,2,1)) # Calculate the spline f = interpolate.UnivariateSpline(x,y) # Evaluate the spline fx = f(x) Paweł 2012/12/14 Zachary Pincus > > On Dec 14, 2012, at 11:42 AM, Paolo Zaffino wrote: > > > Dear Paweł, > > thank you for the reply. > > I try to explain better the issue. > > I have these points (in this order): > > > > P1 = (1,1) > > P2 = (2,2) > > P3 = (4,2) > > P4 = (3,1) > > > > I need to fit the points in the order P1,P2,P3,P4 even if the x coord of > P3 is greater than P4. > > I thought to quadratic piecewise curve but other solutions are welcome. > > > > Thanks a lot. > > Paolo > > You will want to fit a parametric spline of some degree with some amount > (or no) smoothing. I'd look at the splprep function in scipy.interpolate. > > The trick is you associate each point with some monotonic parameter value, > and then interpolate along that parameter (say t) to get your x, y > coordinates.
> > E.g.: > t x y > 0 1 1 > 1 2 2 > 2 4 2 > 3 3 1 > > Then if you were interpolating linearly, at t=0.5, you would have (1.5, > 1.5) as the coordinate. > > As above, splprep will generate splines of a desired order (linear, > quadratic, cubic, etc.) and with a user-specified smoothing parameter (s), > which can be set to zero to get exact interpolation of the input > coordinates, potentially at the cost of ringing (sometimes quite bad) away > from the input coordinate. So you will need to plot the interpolated > values, both at the input t-values, as well as at intermediate t's, to see > if the output is sane. > > Hopefully this is somewhat clear, or at least enough to get you started. > Please read the documentation for splprep and splev for more information. > Zach > > > > > > > Da: Pawe? Kwa?niewski > > A: Paolo Zaffino ; SciPy Users List < > scipy-user at scipy.org> > > Inviato: Venerd? 14 Dicembre 2012 17:25 > > Oggetto: Re: [SciPy-User] Points fitting (non lin) > > > > Dear Paolo, > > > > I'm not sure I understand correctly your problem, but this sounds like a > spline fitting job. You can read more about this here: > http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html > > > > Is this what you are looking for? > > > > Cheers, > > > > Pawe? > > > > > > > > 2012/12/14 Paolo Zaffino > > Dear Scipy community, > > > > I have a set of points (2D) and I would compute a curve that fits them. > > The points are ordered in a precise way (not crescent order) and I can't > change this order (the curve should fit the points in that order). > > I'm interseting in a non linear fit (the ideal case would be more > intervals of quadratic curves). > > Has anyone any advice about? > > > > Thank you very much. > > Regards. 
> > Paolo > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From philip at semanchuk.com Fri Dec 14 14:24:54 2012 From: philip at semanchuk.com (Philip Semanchuk) Date: Fri, 14 Dec 2012 14:24:54 -0500 Subject: [SciPy-User] Calling LAPACK function dbdsqr()? In-Reply-To: References: <1208070C-2DA0-4982-BF4E-B85C7ECF67AD@semanchuk.com> <50CB47FB.9090404@gmail.com> <50CB4BAE.1050105@molden.no> Message-ID: <6DD12DBD-2348-4F66-89B0-7ED338F933CA@semanchuk.com> On Dec 14, 2012, at 11:47 AM, Pauli Virtanen wrote: > Philip Semanchuk semanchuk.com> writes: > [clip] >>> CLAPACK is also abandonware, LAPACKE is the current C/C++ interface. >>> Thus with any recent LAPACK, there will be no clapack in SciPy. >> >> Hmmm, well that's disappointing to hear, but better to know >> now than finding out the hard way later. >> >> Maybe a pure-Python + numpy/scipy solution for this just >> isn't possible right now. > > However, you should still be able to use ctypes to interface with the > Fortran routines in scipy/linalg/flapack.so > > While the CLAPACK is not always present, the Fortran-based LAPACK is. > > This may however require some fiddling with the calling conventions. > So testing with multiple platforms will be needed, but perhaps you > are OK with doing that... Hmm, this just gets more and more interesting... My Python code was already calling a compiled Fortran version of dbdsqr() via ctypes. 
(This version of dbdsqr() is in our custom library -- the one I'm porting -- and it's the only thing I haven't yet been able to port to Python.) Just now I was able to swap out the reference to our custom library and use Jonathan Helmus' technique to get a scipy reference to dbdsqr. All I changed was the library reference, and it worked! So now I'm totally independent of our compiled custom library under OS X. That's excellent news. The not-so-excellent news is that on Windows I installed scipy 0.11.0 (the same version that's on my Mac) and it does *not* seem to provide a reference to dbdsqr(). I also looked for _dbdsqr, dbdsqr_, _dbdsqr_, etc. and it's just not there. I can get a reference to e.g. zgelss -- >>> hasattr(scipy.linalg.lapack.flapack, 'zgelss') True But not dbdsqr. I guess this is an artifact of the LAPACK library against which scipy was compiled? bye Philip From pav at iki.fi Fri Dec 14 14:38:41 2012 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 14 Dec 2012 21:38:41 +0200 Subject: [SciPy-User] Calling LAPACK function dbdsqr()? In-Reply-To: <6DD12DBD-2348-4F66-89B0-7ED338F933CA@semanchuk.com> References: <1208070C-2DA0-4982-BF4E-B85C7ECF67AD@semanchuk.com> <50CB47FB.9090404@gmail.com> <50CB4BAE.1050105@molden.no> <6DD12DBD-2348-4F66-89B0-7ED338F933CA@semanchuk.com> Message-ID: On 14.12.2012 21:24, Philip Semanchuk wrote: [clip] > The not-so-excellent news is that on Windows I installed scipy 0.11.0 > (the same version that's on my Mac) and it does *not* seem to provide > a reference to dbdsqr(). I also looked for _dbdsqr, dbdsqr_, _dbdsqr_, > etc. and it's just not there. I can get a reference to e.g. zgelss -- > >>>> hasattr(scipy.linalg.lapack.flapack, 'zgelss') > True > > But not dbdsqr. I guess this is an artifact of the LAPACK library > against which scipy was compiled? No, the result from hasattr only means that the routine is not wrapped via f2py. The Scipy LAPACK wrappers always wrap the exact same set of routines.
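Pauli's distinction — hasattr on the wrapper module reports only what f2py wrapped, not what the underlying library contains — is easy to probe. A minimal sketch (assuming a modern SciPy, where the wrappers live directly on scipy.linalg.lapack rather than the flapack submodule used above, and where the set of wrapped routines has grown since 0.11):

```python
# Probe which LAPACK routines SciPy's f2py layer exposes. This only
# reports wrapper availability; whether the raw Fortran symbol also
# exists inside the shared library is a separate, platform-dependent
# question.
from scipy.linalg import lapack

print(hasattr(lapack, 'zgelss'))   # True: zgelss is in the wrapped set
print(hasattr(lapack, 'dbdsqr'))   # not wrapped in the 0.11-era set; may differ in later releases
```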
However, if a routine is not wrapped, the compiler might not include it in the built dll file. This probably is platform dependent. -- Pauli Virtanen From philip at semanchuk.com Fri Dec 14 15:15:44 2012 From: philip at semanchuk.com (Philip Semanchuk) Date: Fri, 14 Dec 2012 15:15:44 -0500 Subject: [SciPy-User] Calling LAPACK function dbdsqr()? In-Reply-To: References: <1208070C-2DA0-4982-BF4E-B85C7ECF67AD@semanchuk.com> <50CB47FB.9090404@gmail.com> <50CB4BAE.1050105@molden.no> <6DD12DBD-2348-4F66-89B0-7ED338F933CA@semanchuk.com> Message-ID: <26A717B5-2CC5-4A8F-8629-5DB45F46AF6F@semanchuk.com> On Dec 14, 2012, at 2:38 PM, Pauli Virtanen wrote: > On 14.12.2012 21:24, Philip Semanchuk wrote: > [clip] >> The not-so-excellent news is that on Windows I installed scipy 0.11.0 >> (the same version that's on my Mac) and it does *not* seem to provide >> a reference to dbdsqr(). I also looked for _dbdsqr, dbdsqr_, _dbdsqr_, >> etc. and it's just not there. I can get a reference to e.g. zgelss -- >> >>>>> hasattr(scipy.linalg.lapack.flapack, 'zgelss') >> True >> >> But not dbdsqr. I guess this is an artifact of the LAPACK library >> against which scipy was compiled? > > No, the result from hasattr only means that the routine is not wrapped > via f2py. The Scipy LAPACK wrappers always wrap the exact same set of > routines. > > However, if a routine is not wrapped, the compiler might not include it > in the built dll file. This probably is platform dependent. Oh, I see. When I use `dumpbin.exe /SYMBOLS flapack.pyd`, I see dbdsqr referenced, but I can't access it under any name. ISTR that some other LAPACK function in scipy uses dbdsqr(), so it should be in there but perhaps the symbol is not exported. I can't find it in any of the other .pyd files in site-packages\scipy\linalg.
Cheers Philip From michael.aye at ucla.edu Fri Dec 14 19:50:19 2012 From: michael.aye at ucla.edu (Michael Aye) Date: Fri, 14 Dec 2012 16:50:19 -0800 Subject: [SciPy-User] scipy.interpolate.InterpolatedUnivariateSpline Message-ID: The code example for InterpolatedUnivariateSpline here: http://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.InterpolatedUnivariateSpline.html#scipy.interpolate.InterpolatedUnivariateSpline has a copy/paste error. It imports and works with UnivariateSpline instead of InterpolatedUnivariateSpline. Nice weekend everyone! Michael From sturla at molden.no Sat Dec 15 07:42:21 2012 From: sturla at molden.no (Sturla Molden) Date: Sat, 15 Dec 2012 13:42:21 +0100 Subject: [SciPy-User] Calling LAPACK function dbdsqr()? In-Reply-To: References: <1208070C-2DA0-4982-BF4E-B85C7ECF67AD@semanchuk.com> <50CB47FB.9090404@gmail.com> <50CB4BAE.1050105@molden.no> <6DD12DBD-2348-4F66-89B0-7ED338F933CA@semanchuk.com> Message-ID: <38F58C46-1690-4A55-AE58-A2324DE5127F@molden.no> On 14 Dec 2012, at 20:38, Pauli Virtanen wrote: > On 14.12.2012 21:24, Philip Semanchuk wrote: > [clip] >> The not-so-excellent news is that on Windows I installed scipy 0.11.0 >> (the same version that's on my Mac) and it does *not* seem to provide >> a reference to dbdsqr(). I also looked for _dbdsqr, dbdsqr_, _dbdsqr_, >> etc. and it's just not there. I can get a reference to e.g. zgelss --
It will help to build a dynamic LAPACK DLL first and link NumPy/SciPy against that. Sturla From tmp50 at ukr.net Sat Dec 15 10:13:31 2012 From: tmp50 at ukr.net (Dmitrey) Date: Sat, 15 Dec 2012 17:13:31 +0200 Subject: [SciPy-User] [ANN] OpenOpt Suite release 0.43 Message-ID: <1943.1355584411.5558559243999444992@ffe6.ukr.net> Hi all, I'm glad to inform you about the new OpenOpt release 0.43 (2012-Dec-15): * interalg now can solve SNLE in 2nd mode (parameter dataHandling = "raw", before - only "sorted") * Many other improvements for interalg * Some improvements for FuncDesigner kernel * FuncDesigner ODE now has 3 arguments instead of 4 (backward incompatibility!), e.g. {t: np.linspace(0,1,100)} or mere np.linspace(0,1,100) if your ODE right side is time-independent * FuncDesigner stochastic addon now can handle some problems with gradient-based NLP / NSP solvers * Many minor improvements and some bugfixes Visit openopt.org for more details. Regards, D. -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.zaffino at yahoo.it Sun Dec 16 09:29:35 2012 From: p.zaffino at yahoo.it (Paolo Zaffino) Date: Sun, 16 Dec 2012 15:29:35 +0100 Subject: [SciPy-User] Points fitting (non lin) In-Reply-To: <9E04AB38-D38C-4C6D-881E-E294898EA3AC@yale.edu> References: <1355500677.78276.YahooMailNeo@web171602.mail.ir2.yahoo.com> <1355503367.42200.YahooMailNeo@web171602.mail.ir2.yahoo.com> <9E04AB38-D38C-4C6D-881E-E294898EA3AC@yale.edu> Message-ID: <50CDDACF.10305@yahoo.it> Dear Zach, I didn't understand the trick...sorry. Please, can you explain it again? Thanks a lot. Paolo On 14/12/2012 17:50, Zachary Pincus wrote: > On Dec 14, 2012, at 11:42 AM, Paolo Zaffino wrote: > >> Dear Paweł, >> thank you for the reply. >> I'll try to explain the issue better. >> I have these points (in this order): >> >> P1 = (1,1) >> P2 = (2,2) >> P3 = (4,2) >> P4 = (3,1) >> >> I need to fit the points in the order P1,P2,P3,P4 even if the x coord of P3 is greater than P4.
>> I thought of a piecewise quadratic curve, but other solutions are welcome. >> >> Thanks a lot. >> Paolo > You will want to fit a parametric spline of some degree with some amount (or no) smoothing. I'd look at the splprep function in scipy.interpolate. > > The trick is you associate each point with some monotonic parameter value, and then interpolate along that parameter (say t) to get your x, y coordinates. > > E.g.: > t x y > 0 1 1 > 1 2 2 > 2 4 2 > 3 3 1 > > Then if you were interpolating linearly, at t=0.5, you would have (1.5, 1.5) as the coordinate. > > As above, splprep will generate splines of a desired order (linear, quadratic, cubic, etc.) and with a user-specified smoothing parameter (s), which can be set to zero to get exact interpolation of the input coordinates, potentially at the cost of ringing (sometimes quite bad) away from the input coordinate. So you will need to plot the interpolated values, both at the input t-values, as well as at intermediate t's, to see if the output is sane. > > Hopefully this is somewhat clear, or at least enough to get you started. Please read the documentation for splprep and splev for more information. > Zach > > > >> From: Paweł Kwaśniewski >> To: Paolo Zaffino ; SciPy Users List >> Sent: Friday, 14 December 2012 17:25 >> Subject: Re: [SciPy-User] Points fitting (non lin) >> >> Dear Paolo, >> >> I'm not sure I understand your problem correctly, but this sounds like a spline fitting job. You can read more about this here: http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html >> >> Is this what you are looking for? >> >> Cheers, >> >> Paweł >> >> >> >> 2012/12/14 Paolo Zaffino >> Dear Scipy community, >> >> I have a set of points (2D) and I would like to compute a curve that fits them. >> The points are ordered in a precise way (not ascending order) and I can't change this order (the curve should fit the points in that order).
>> I'm interested in a non-linear fit (the ideal case would be several intervals of quadratic curves). >> Does anyone have any advice? >> >> Thank you very much. >> Regards. >> Paolo >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From zachary.pincus at yale.edu Sun Dec 16 14:44:34 2012 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Sun, 16 Dec 2012 14:44:34 -0500 Subject: [SciPy-User] Points fitting (non lin) In-Reply-To: <50CDDACF.10305@yahoo.it> References: <1355500677.78276.YahooMailNeo@web171602.mail.ir2.yahoo.com> <1355503367.42200.YahooMailNeo@web171602.mail.ir2.yahoo.com> <9E04AB38-D38C-4C6D-881E-E294898EA3AC@yale.edu> <50CDDACF.10305@yahoo.it> Message-ID: <089C98C7-4E58-46EC-83A7-CDE506C1634C@yale.edu> > Dear Zach, > I didn't understand the trick...sorry. > Please, can you explain it again? You'll want to read up about parametric representation of curves on wikipedia etc. Obviously, the coordinates you listed: P1 = (1,1) P2 = (2,2) P3 = (4,2) P4 = (3,1) do not trace out y values that are a proper function of x, because there are some places where the curve would have multiple y-values for a given x-value (at 3, say). So, as you correctly understood, you can't just try to fit a function y=f(x) to your data and expect that to work. What you can, however, do is fit TWO functions to your data, using a dummy variable. For example, imagine a point moving along your curve at a given speed, so that at any given time t, the position of the point is given as (x(t), y(t)).
So for each of your points above P1-P4, associate a value T1-T4. For example, t x y 0 1 1 1 2 2 2 4 2 3 3 1 is just your data above but with a t parameter indexing each point. So at t=0, x=1 and y=1. At t=1, x=2 and y=2. If you had a functional form for x(t) and y(t), you could then estimate the point on the curve midway between the first (t=0) and second (t=1) points by evaluating x(0.5) and y(0.5). To do this, one could fit x(t) and y(t) separately (because as long as the t values are monotonic, x(t) and y(t) are proper functions for any smooth curve that you could draw on a plane). Fitting a spline to these data is the most straightforward way to do this, and fortunately, scipy.interpolate.splprep does this all for you and gives you a spline that can be evaluated at different t positions by splev. If spline fitting is unfamiliar to you, then parametric spline fitting will seem even less comprehensible. So you might want to experiment by fitting other simpler functional curves in scipy with splrep and then moving on. (There should be interpolation examples aplenty online.) Zach > Thanks a lot. > Paolo > > On 14/12/2012 17:50, Zachary Pincus wrote: >> On Dec 14, 2012, at 11:42 AM, Paolo Zaffino wrote: >> >>> Dear Paweł, >>> thank you for the reply. >>> I'll try to explain the issue better. >>> I have these points (in this order): >>> >>> P1 = (1,1) >>> P2 = (2,2) >>> P3 = (4,2) >>> P4 = (3,1) >>> >>> I need to fit the points in the order P1,P2,P3,P4 even if the x coord of P3 is greater than P4. >>> I thought of a piecewise quadratic curve, but other solutions are welcome. >>> >>> Thanks a lot. >>> Paolo >> You will want to fit a parametric spline of some degree with some amount (or no) smoothing. I'd look at the splprep function in scipy.interpolate. >> >> The trick is you associate each point with some monotonic parameter value, and then interpolate along that parameter (say t) to get your x, y coordinates.
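The parametric-spline recipe described above can be written out concretely. A minimal sketch using the four points from the thread (splprep chooses the parameter values itself and returns them as u; s=0 forces exact interpolation):

```python
import numpy as np
from scipy.interpolate import splprep, splev

# The ordered points P1..P4 from the thread; note x is not monotonic.
x = np.array([1.0, 2.0, 4.0, 3.0])
y = np.array([1.0, 2.0, 2.0, 1.0])

# Fit a parametric cubic spline through the points, in order.
# s=0 requests exact interpolation rather than smoothing.
tck, u = splprep([x, y], s=0)

# Evaluate the curve densely in the parameter, e.g. to plot it.
u_fine = np.linspace(0, 1, 200)
x_fine, y_fine = splev(u_fine, tck)

# Evaluating back at the fitted parameter values recovers the inputs.
x_at_u, y_at_u = splev(u, tck)
print(np.allclose(x_at_u, x), np.allclose(y_at_u, y))  # True True
```

Plotting (x_fine, y_fine) shows whether the interpolant rings between the data points, which is the sanity check recommended above.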
>> >> E.g.: >> t x y >> 0 1 1 >> 1 2 2 >> 2 4 2 >> 3 3 1 >> >> Then if you were interpolating linearly, at t=0.5, you would have (1.5, 1.5) as the coordinate. >> >> As above, splprep will generate splines of a desired order (linear, quadratic, cubic, etc.) and with a user-specified smoothing parameter (s), which can be set to zero to get exact interpolation of the input coordinates, potentially at the cost of ringing (sometimes quite bad) away from the input coordinate. So you will need to plot the interpolated values, both at the input t-values, as well as at intermediate t's, to see if the output is sane. >> >> Hopefully this is somewhat clear, or at least enough to get you started. Please read the documentation for splprep and splev for more information. >> Zach >> >> >> >>> From: Paweł Kwaśniewski >>> To: Paolo Zaffino ; SciPy Users List >>> Sent: Friday, 14 December 2012 17:25 >>> Subject: Re: [SciPy-User] Points fitting (non lin) >>> >>> Dear Paolo, >>> >>> I'm not sure I understand your problem correctly, but this sounds like a spline fitting job. You can read more about this here: http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html >>> >>> Is this what you are looking for? >>> >>> Cheers, >>> >>> Paweł >>> >>> >>> >>> 2012/12/14 Paolo Zaffino >>> Dear Scipy community, >>> >>> I have a set of points (2D) and I would like to compute a curve that fits them. >>> The points are ordered in a precise way (not ascending order) and I can't change this order (the curve should fit the points in that order). >>> I'm interested in a non-linear fit (the ideal case would be several intervals of quadratic curves). >>> Does anyone have any advice? >>> >>> Thank you very much. >>> Regards.
>>> Paolo >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From pierre.raybaut at gmail.com Mon Dec 17 08:57:28 2012 From: pierre.raybaut at gmail.com (Pierre Raybaut) Date: Mon, 17 Dec 2012 14:57:28 +0100 Subject: [SciPy-User] ANN: WinPython v2.7.3.2 Message-ID: Hi all, I'm pleased to announce that WinPython v2.7.3.2 has been released for 32-bit and 64-bit Windows platforms: http://code.google.com/p/winpython/ This is mainly a maintenance release (many packages have been updated since v2.7.3.1). WinPython is a free open-source portable distribution of Python for Windows, designed for scientists. 
It is a full-featured (see http://code.google.com/p/winpython/wiki/PackageIndex) Python-based scientific environment: * Designed for scientists (thanks to the integrated libraries NumPy, SciPy, Matplotlib, guiqwt, etc.): * Regular *scientific users*: interactive data processing and visualization using Python with Spyder * *Advanced scientific users and software developers*: Python applications development with Spyder, version control with Mercurial and other development tools (like gettext) * *Portable*: preconfigured, it should run out of the box on any machine under Windows (without any installation requirements) and the folder containing WinPython can be moved to any location (local, network or removable drive) * *Flexible*: one can install (or should I write "use" as it's portable) as many WinPython versions as necessary (like isolated and self-consistent environments), even if those versions are running different versions of Python (2.7, 3.x in the near future) or different architectures (32bit or 64bit) on the same machine * *Customizable*: using the integrated package manager (wppm, as WinPython Package Manager), it's possible to install, uninstall or upgrade Python packages (see http://code.google.com/p/winpython/wiki/WPPM for more details on supported package formats). *WinPython is not an attempt to replace Python(x,y)*, this is just something different (see http://code.google.com/p/winpython/wiki/Roadmap): more flexible, easier to maintain, movable and less invasive for the OS, but certainly less user-friendly, with fewer packages/contents and without any integration to Windows explorer [*]. [*] Actually there is an optional integration into Windows explorer, providing the same features as the official Python installer regarding file associations and context menu entry (this option may be activated through the WinPython Control Panel). Enjoy!
-Pierre From amyla333 at gmail.com Mon Dec 17 18:22:04 2012 From: amyla333 at gmail.com (Amy Anderson) Date: Mon, 17 Dec 2012 18:22:04 -0500 Subject: [SciPy-User] running script error (eigen symmetric) In-Reply-To: <13BFE00357AE78459CE7894943A31CA91BD932BE@MED-CORE07A.med.wayne.edu> References: <13BFE00357AE78459CE7894943A31CA91BD932BE@MED-CORE07A.med.wayne.edu> Message-ID: Hi scipy users, I am trying to run this script in python and the script is now working with the test data (eigen problem solved) but when I input my own data to be called by script I got the following errors. [moriaht-5:~/Desktop/pyClusterROI] moriah% python pyClusterROI_test.py tcorr connectivity subject2002.nii Traceback (most recent call last): File "pyClusterROI_test.py", line 89, in make_local_connectivity_tcorr( infiles[i], maskname, outname, 0.5 ) File "/Users/moriah/Desktop/pyClusterROI/make_local_connectivity_tcorr.py", line 114, in make_local_connectivity_tcorr msk=csc_matrix((range(1,m+1),(iv,zeros(m))),shape=(prod(sz[1:]),1)) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/sparse/compressed.py", line 44, in __init__ other = self.__class__( coo_matrix(arg1, shape=shape) ) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/sparse/coo.py", line 188, in __init__ self._check() File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/sparse/coo.py", line 220, in _check raise ValueError('row index exceedes matrix dimensions') ValueError: row index exceedes matrix dimensions I was wondering if anyone might know what this error might mean or how to fix it? I do not know what is meant by "row index exceeds matrix dimensions" as this is the first time I am using scipy to run a script. 
On Mon, Dec 3, 2012 at 11:29 AM, Anderson, Amy wrote: > Dear Scipy users, > > I am a new python and scipy user and I am looking for some help to > troubleshoot an error I have been getting when trying to run a script in > python. The script I am trying to run is called pyCluster ROI and it needs > python, pynifti, scipy, and numpy to run. I have installed all these > programs successfully using macports on Mac OS 10.8. > > I unarchived the scripts and ran the test script getting this error: > > $ python pyClusterROI_test.py > > Traceback (most recent call last): > > File "pyClusterROI_test.py", line 48, in > > from make_local_connectivity_scorr import * > > File > "/Users/matthewnye/Downloads/pyClusterROI/make_local_connectivity_scorr.py", > line 40, in > > from scipy.sparse.linalg.eigen.arpack import eigen_symmetric > > ImportError: cannot import name eigen_symmetric > > I have been trying to hunt down what this error may be and it seems that > older versions of scipy might have used "scipy.sparse.linalg.eigen.arpack.eigen_symmetric()" whereas in new versions it has been renamed "scipy.sparse.linalg.eigen.arpack.eigs()". > > Does anyone know what might be causing this error or, if it is a version > problem, is there any way to get an older version of scipy through macports? > Also I have considered installing all these modules through a manual > install but since I am a novice user I did not want to attempt that if > there was a potentially simpler solution. > > Any help would be much appreciated! > > -Amy > > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From issa at aims.ac.za Mon Dec 17 19:12:10 2012 From: issa at aims.ac.za (Issa Karambal) Date: Tue, 18 Dec 2012 00:12:10 +0000 Subject: [SciPy-User] help: scipy.ode set_integrator dopri5 or dop853 Message-ID: Hi all I am trying to integrate a simple ode, which is successful when calling the 'vode' integrator. However, with 'dopri5' or 'dop853' as the integrator, I have the following error: create_cb_arglist: Failed to build argument list (siz) with enough arguments (tot-opt) required by user-supplied function (siz,tot,opt=2,3,0). Traceback (most recent call last): File "schro1.py", line 45, in print Evans(3) File "schro1.py", line 24, in Evans r.integrate(r.t+dt) File "/usr/lib/python2.7/dist-packages/scipy/integrate/ode.py", line 326, in integrate self.f_params,self.jac_params) File "/usr/lib/python2.7/dist-packages/scipy/integrate/ode.py", line 749, in run x,y,iwork,idid = self.runner(*((f,t0,y0,t1) + tuple(self.call_args))) _dop.error: failed in processing argument list for call-back fcn. I do not understand the above error, so any help is more than welcome. Thanks Issa -------------- next part -------------- An HTML attachment was scrubbed... URL: From francescoboccacci at libero.it Tue Dec 18 04:20:46 2012 From: francescoboccacci at libero.it (francescoboccacci at libero.it) Date: Tue, 18 Dec 2012 10:20:46 +0100 (CET) Subject: [SciPy-User] kernel method Message-ID: <8188718.2110881355822446172.JavaMail.root@wmail38> Hi all, I'm new to the SciPy world and I need some help, of course. I'd like to find some methods used in R in scipy. In particular, two methods: - kernelUD - lscv.cruncher Can anyone suggest the equivalent methods in scipy?
Thanks Francesco From a.klein at science-applied.nl Tue Dec 18 05:10:42 2012 From: a.klein at science-applied.nl (Almar Klein) Date: Tue, 18 Dec 2012 11:10:42 +0100 Subject: [SciPy-User] ANN: visvis 1.8 - The object oriented approach to visualization Message-ID: Dear all, On behalf of the Visvis development team, I'm pleased to announce version 1.8 of Visvis - the object oriented approach to visualization. Essentially, visvis is an object oriented layer of Python on top of OpenGl, thereby combining the power of OpenGl with the usability of Python. A Matlab-like interface in the form of a set of functions allows easy creation of objects (e.g. plot(), imshow(), volshow(), surf()). The most notable changes are: - Better handling of NaN, Inf in line objects, barplots, boxplots, and calculation of axes limits. - The bar plot now supports the bottom keyword, allowing for stacked bar charts. - Meshes can now be made (semi)transparent by giving RGBA values to mesh.faceColor. - Better separation of widget and figure creation, making it easier to embed visvis in threaded applications. Website: https://code.google.com/p/visvis/ Discussion group: http://groups.google.com/group/visvis Release notes: https://code.google.com/p/visvis/wiki/releaseNotes Regards, Almar -------------- next part -------------- An HTML attachment was scrubbed... URL: From zachary.pincus at yale.edu Tue Dec 18 10:14:48 2012 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Tue, 18 Dec 2012 10:14:48 -0500 Subject: [SciPy-User] kernel method In-Reply-To: <8188718.2110881355822446172.JavaMail.root@wmail38> References: <8188718.2110881355822446172.JavaMail.root@wmail38> Message-ID: <3ED7E4DB-7921-4D3A-997A-B53F0850DFC8@yale.edu> > I'd like to find some method used in R in scipy. In particular 2 method: > > - kernelUD > -lscv.cruncher Perhaps you could briefly describe what these functions do? From brief googling, they do not appear to be part of the standard R distribution. 
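(If kernelUD is the kernel utilization distribution estimator from R's adehabitat package — a 2-D kernel density estimate of relocation data, with lscv referring to least-squares cross-validation bandwidth selection — then the closest scipy building block is scipy.stats.gaussian_kde. A minimal sketch under that assumption, with invented data:)

```python
import numpy as np
from scipy.stats import gaussian_kde

# Invented "relocation" data: 500 (x, y) positions, shape (2, n),
# which is the layout gaussian_kde expects.
rng = np.random.default_rng(0)
xy = rng.normal(size=(2, 500))

# Gaussian KDE with the default (rule-of-thumb) bandwidth. adehabitat's
# LSCV bandwidth selection has no direct scipy equivalent; gaussian_kde
# accepts a custom bandwidth via its bw_method argument instead.
kde = gaussian_kde(xy)

# Evaluate the density on a grid -- the "utilization distribution".
xg, yg = np.mgrid[-3:3:50j, -3:3:50j]
density = kde(np.vstack([xg.ravel(), yg.ravel()])).reshape(xg.shape)
print(density.shape)  # (50, 50)
```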
I and likely many others on the list would be happy to help you find replacements for this functionality in scipy (and/or beyond), but not everyone (and especially not me) is expert enough in R (and specifically whatever package the above functions came from) to know what those functions do just based on their names. Similarly, a bit of description about how you plan to use the functions can be helpful to those suggesting what approaches might be best in scipy, in the case that there is not an exact drop-in replacement. (Indeed, often I find that when I ask a more general question about "what's the best way to solve problem X", rather than just asking about a specific function in scipy, I learn a whole lot from the experts on the list and often come away with a much better understanding of the problem and a better route to solving it...) Good luck making the R to python transition! Zach On Dec 18, 2012, at 4:20 AM, francescoboccacci at libero.it wrote: > HI all, > i'm new of the scipy's world and i need some help of course. > I'd like to find some method used in R in scipy. In particular 2 method: > > - kernelUD > -lscv.cruncher > > > anyone can suggest me which are the same methods in scipy? > > Thanks > > Francesco > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From aronne.merrelli at gmail.com Tue Dec 18 11:14:50 2012 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Tue, 18 Dec 2012 10:14:50 -0600 Subject: [SciPy-User] running script error (eigen symmetric) In-Reply-To: References: <13BFE00357AE78459CE7894943A31CA91BD932BE@MED-CORE07A.med.wayne.edu> Message-ID: On Mon, Dec 17, 2012 at 5:22 PM, Amy Anderson wrote: > Hi scipy users, > > I am trying to run this script in python and the script is now working > with the test data (eigen problem solved) but when I input my own data to > be called by script I got the following errors.
> > [moriaht-5:~/Desktop/pyClusterROI] moriah% python pyClusterROI_test.py > tcorr connectivity subject2002.nii > > Traceback (most recent call last): > File "pyClusterROI_test.py", line 89, in > make_local_connectivity_tcorr( infiles[i], maskname, outname, 0.5 ) > File > "/Users/moriah/Desktop/pyClusterROI/make_local_connectivity_tcorr.py", line > 114, in make_local_connectivity_tcorr > msk=csc_matrix((range(1,m+1),(iv,zeros(m))),shape=(prod(sz[1:]),1)) > File > "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/sparse/compressed.py", > line 44, in __init__ > other = self.__class__( coo_matrix(arg1, shape=shape) ) > File > "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/sparse/coo.py", > line 188, in __init__ > self._check() > File > "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/sparse/coo.py", > line 220, in _check > raise ValueError('row index exceedes matrix dimensions') > ValueError: row index exceedes matrix dimensions > > > I was wondering if anyone might know what this error might mean or how to > fix it? I do not know what is meant by "row index exceeds matrix > dimensions" as this is the first time I am using scipy to run a script. > > Hi, It is difficult to debug this without seeing the actual code you are running, but it seems likely that the inputs to the CSC matrix constructor are in error. Can you set a breakpoint and check the inputs at this call? msk=csc_matrix((range(1,m+1),(iv,zeros(m))),shape=(prod(sz[1:]),1)) what are the runtime values of m, iv, and sz? I can trigger the same ValueError that you see, by giving the constructor a set of row, column entries in the matrix, that are outside the shape of the matrix as described by the shape keyword. 
For example: In [21]: test = scipy.sparse.csc_matrix( ([5,6], ([0,5],[0,0])), shape = (3,1) ) ValueError: row index exceedes matrix dimensions That call is attempting to put a matrix element at (5,0) while the shape is only (3,1). So, making shape (6,1) works: In [22]: test = scipy.sparse.csc_matrix( ([5,6], ([0,5],[0,0])), shape = (6,1) ) In [23]: test.todense() Out[23]: matrix([[5], [0], [0], [0], [0], [6]]) -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Tue Dec 18 13:41:59 2012 From: sturla at molden.no (Sturla Molden) Date: Tue, 18 Dec 2012 19:41:59 +0100 Subject: [SciPy-User] Fwd: [Numpy-discussion] Support for python 2.4 dropped. Should we drop 2.5 also? References: Message-ID: Does this apply for SciPy as well? Sturla Sent from my iPad Forwarded message: > From: Charles R Harris > Date: 16 December 2012 18:28:34 CET > To: numpy-discussion > Subject: Re: [Numpy-discussion] Support for python 2.4 dropped. Should we drop 2.5 also? > Reply-To: Discussion of Numerical Python > > > > On Thu, Dec 13, 2012 at 10:38 AM, Charles R Harris wrote: >> The previous proposal to drop python 2.4 support garnered no opposition. How about dropping support for python 2.5 also? > > The proposal to drop support for python 2.5 and 2.4 in numpy 1.8 has carried. It is now a todo issue on github. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL:
I haven't seen any objections to this proposal (http://thread.gmane.org/gmane.comp.python.scientific.devel/17114/focus=17117) and several +1's, so yes. SciPy 0.12.0 will support Python versions 2.6 - 3.3.

Ralf

> Sturla
>
> Sent from my iPad
>
> Forwarded message:
>
> *From:* Charles R Harris
> *Date:* 16 December 2012 18:28:34 CET
> *To:* numpy-discussion
> *Subject:* *Re: [Numpy-discussion] Support for python 2.4 dropped. Should we drop 2.5 also?*
> *Reply-To:* Discussion of Numerical Python
>
> On Thu, Dec 13, 2012 at 10:38 AM, Charles R Harris <
> charlesr.harris at gmail.com> wrote:
>
>> The previous proposal to drop python 2.4 support garnered no opposition.
>> How about dropping support for python 2.5 also?
>
> The proposal to drop support for python 2.5 and 2.4 in numpy 1.8 has
> carried. It is now a todo issue on github.
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From klonuo at gmail.com Tue Dec 18 19:39:48 2012
From: klonuo at gmail.com (klo uo)
Date: Wed, 19 Dec 2012 01:39:48 +0100
Subject: [SciPy-User] Scikits statuses?
Message-ID:

I browsed scikits (http://scikits.appspot.com/scikits) and, looking at one of the packages, I noticed that it's in a poor state - the project started long ago but stopped right at the start. That gave me the idea to try to determine the state of the available scikits.

I copied the text from the scikits page and saved it as a text file. Then I used pypi as a reference; the results are in this little script: http://nbviewer.ipython.org/url/dl.dropbox.com/u/6735093/scikits.ipynb

As a bonus, I also attached a bar plot of pypi download counts per package.
Scikits not hosted on pypi are mainly hosted on scipy.org and versioned with subversion:

delaunay
eartho
hydroclimpy
mlabwrap
pymat
rsformats
scattpy
timeseries
umfpack

Their status is questionable: if I'm not wrong, all of them were released years ago, without documentation or even a description, and not even their names are correct. Exceptions are timeseries (its author abandoned it) and hydroclimpy, which doesn't look like a scikit to me at all, but a universe unto itself.

The others, listed by last update (note: two duplicate packages, because of the naming transition):

scikit-aero 20121125
scikits.odes 20121122
scikits.optimization 20121114
scikits.fitting 20121029
scikit-fmm 20121016
scikit-image 20121014
scikits-image 20121014
scikit-learn 20121008
scikits.bvp_solver 20120926
scikit-rf 20120925
scikits.bootstrap 20120822
scikit-vi 20120609
scikit-commpy 20120404
scikits.learn 20110921
scikits.statsmodels 20110824
scikits.cuda 20110523
scikits.datasmooth 20110414
scikits.audiolab 20100723
scikits.sparse 20091214
scikits.bvp1lg 20091012
scikits.talkbox 20090827
scikits.samplerate 20090722
scikits.vectorplot 20081202
scikits.example 20081105
scikits.ann 20080131

What I want to suggest from this little adventure is to sort the scikits packages by last update on the scikits home page. Then scikits not updated for more than a year could be flagged with a new status column describing the state of the package. I believe it would help the general user from the start, and a potential developer considering a takeover. It would also show that the scikits idea is active, instead of the current static display with half the packages in an unknown state. As for those in the first listing, "abandoned" or "deprecated" is IMHO suitable, if not necessary.

[image: Inline image 1]
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png Type: image/png Size: 8910 bytes Desc: not available URL: From josef.pktd at gmail.com Tue Dec 18 20:07:40 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 18 Dec 2012 20:07:40 -0500 Subject: [SciPy-User] Scikits statuses? In-Reply-To: References: Message-ID: On Tue, Dec 18, 2012 at 7:39 PM, klo uo wrote: > I browsed scikits (http://scikits.appspot.com/scikits) and looking at one > of the packages I noticed that it's in poor state - project started long > ago but finished right at it. That give me an idea to try and determine the > state of available scikits. > > I copied text from scikits page and saved it as text file. Then I used > pypi as reference and results are in this little script: > http://nbviewer.ipython.org/url/dl.dropbox.com/u/6735093/scikits.ipynb > > As a bonus I attached also a bar plot of pypi download counts per package. > > > Scikits not hosted on pypi, are mainly hosted on scipy.org and versioned > by subversion: > > delaunay > eartho > hydroclimpy > mlabwrap > pymat > rsformats > scattpy > timeseries > umfpack > > Their status is questionable, and if I'm not wrong all those are released > years ago, w/o documentation, not even description, neither their names are > correct. > Exceptions are timeseries (author abandoned it), and hydroclimpy which > doesn't look as a scikit to me at all, but universe for itself. 
> > Others, listed by last update (note: two duplicate packages, because of > naming transition): > > scikit-aero 20121125 > scikits.odes 20121122 > scikits.optimization 20121114 > scikits.fitting 20121029 > scikit-fmm 20121016 > scikit-image 20121014 > scikits-image 20121014 > scikit-learn 20121008 > scikits.bvp_solver 20120926 > scikit-rf 20120925 > scikits.bootstrap 20120822 > scikit-vi 20120609 > scikit-commpy 20120404 > scikits.learn 20110921 > scikits.statsmodels 20110824 > renamed to statsmodels to drop the namespace package problems Skipper just asked the question whether it's possible to point appspot to our new pypi page http://pypi.python.org/pypi/statsmodels (easy_install statsmodels actually installs the preliminary 0.5. version https://github.com/statsmodels/statsmodels/downloads ) > scikits.cuda 20110523 > scikits.datasmooth 20110414 > scikits.audiolab 20100723 > scikits.sparse 20091214 > scikits.bvp1lg 20091012 > scikits.talkbox 20090827 > scikits.samplerate 20090722 > scikits.vectorplot 20081202 > scikits.example 20081105 > scikits.ann 20080131 > > What I want to suggest from this little adventure is to sort scikits > packages by last update, on scikits home page. Then maybe scikits not > updated for more then a year, flag with new status column, describing the > state of the package. I believe it would help general user from start, and > to potential developer for takeover. Also it will show that scikits idea is > active, instead current static display with half packages in unknown state. > > As for those in first listing, "abandoned" or "deprecated" is IMHO > suitable, if not necessary. > [image: Inline image 1] > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image.png Type: image/png Size: 8910 bytes Desc: not available URL:

From klonuo at gmail.com Tue Dec 18 21:31:12 2012
From: klonuo at gmail.com (klo uo)
Date: Wed, 19 Dec 2012 03:31:12 +0100
Subject: [SciPy-User] Scikits statuses?
In-Reply-To: References: Message-ID:

On Wed, Dec 19, 2012 at 2:07 AM, wrote:

> > scikits.statsmodels 20110824
>
> renamed to statsmodels to drop the namespace package problems
> Skipper just asked the question whether it's possible to point appspot to
> our new pypi page
> http://pypi.python.org/pypi/statsmodels
>
> (easy_install statsmodels actually installs the preliminary 0.5. version
> https://github.com/statsmodels/statsmodels/downloads )

I overlooked that some packages offer multiple releases of the same version, and I also didn't consider the counts for each version! I can correct the first, but right now I can't make pypi list all versions for each package. If I manage it, I'll post a corrected plot.

I also forgot to include version numbers, and I use this opportunity to do so:

scikit-aero 0.1 20121125
scikits.odes 2.0.2 20121122
scikits.optimization 0.2 20121114
scikits.fitting 0.5.1 20121029
scikit-fmm 0.0.4 20121016
scikit-image 0.7.2 20121014
scikits-image 0.7.1 20121014
scikit-learn 0.12.1 20121008
scikits.bvp_solver 1.1 20120926
scikit-rf 0.12 20120925
scikits.bootstrap 0.2dev 20120822
statsmodels 0.4.3 20120702
scikit-vi 0.1 20120609
scikit-commpy 0.1.0 20120404
scikits.learn 0.8.1 20110921
scikits.cuda 0.041 20110523
scikits.datasmooth 0.61 20110414
scikits.audiolab 0.11.0 20100723
scikits.sparse 0.1 20091214
scikits.bvp1lg 0.2.5 20091012
scikits.talkbox 0.2.5 20090827
scikits.samplerate 0.3.3 20090722
scikits.vectorplot 0.1.1 20081202
scikits.example 0.1 20081105
scikits.ann 0.2.dev-r803 20080131

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From klonuo at gmail.com Tue Dec 18 22:00:28 2012
From: klonuo at gmail.com (klo uo)
Date: Wed, 19 Dec 2012 04:00:28 +0100
Subject: [SciPy-User] Scikits statuses?
In-Reply-To: References: Message-ID:

On Wed, Dec 19, 2012 at 3:31 AM, klo uo wrote:

> If I manage it, I'll post a corrected plot.

Here is code that takes into consideration all available pypi versions and the different packaging for each version:

========================================
dat = {}
for kit in kits:
    try:
        if not 'scikit' in kit:  # help pypi find the right package
            kit = 'scikits.' + kit
        k = pypi.search({'name': kit})[-1]
        versions = pypi.package_releases(k['name'], 1)
        dl = 0
        for v in versions:
            d = pypi.release_urls(k['name'], v)
            for package in d:
                dl += package['downloads']
        dat[kit] = {'name': k['name'],
                    'last_version': k['version'],
                    'versions': versions,
                    'upload_time': d[0]['upload_time'],
                    'downloads': dl}
    except IndexError:
        # package either not in pypi or w/o release urls
        print kit.replace('scikits.', '')
========================================

I added 'statsmodels' w/o the scikit prefix by hand.
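For context: the snippet above assumes an XML-RPC proxy object named `pypi` that is created elsewhere in the notebook (presumably an `xmlrpclib.ServerProxy('http://pypi.python.org/pypi')` instance; PyPI's XML-RPC download counts have since been retired). A self-contained sketch of just the aggregation logic, with the XML-RPC calls replaced by a stub — the package name and counts below are invented for illustration, not real PyPI data:

```python
# Hedged sketch: 'FakePyPI' stands in for the XML-RPC proxy the original
# snippet calls 'pypi'; the data here is made up for illustration only.
class FakePyPI(object):
    releases = {'scikit-example': ['0.2', '0.1']}
    urls = {('scikit-example', '0.2'): [{'downloads': 30, 'upload_time': '20121014'}],
            ('scikit-example', '0.1'): [{'downloads': 12, 'upload_time': '20120101'},
                                        {'downloads': 8,  'upload_time': '20120101'}]}

    def package_releases(self, name, show_hidden):
        # Mirrors pypi.package_releases(name, 1): all versions of a package.
        return self.releases.get(name, [])

    def release_urls(self, name, version):
        # Mirrors pypi.release_urls(name, version): one dict per release file.
        return self.urls.get((name, version), [])

def total_downloads(pypi, name):
    """Sum download counts over every release file of every version,
    the same double loop as in the script above."""
    dl = 0
    for version in pypi.package_releases(name, 1):
        for pkg_file in pypi.release_urls(name, version):
            dl += pkg_file['downloads']
    return dl

print(total_downloads(FakePyPI(), 'scikit-example'))  # 30 + 12 + 8 = 50
```

Swapping `FakePyPI()` for a real XML-RPC proxy recovers the behavior of the original loop, minus the name normalization and the `dat` bookkeeping.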
Here is the resultant table with the overall d/l count (and the plot is in the attachment):

scikits.fitting 894 ['0.5.1', '0.5']
scikits.cuda 1765 ['0.041', '0.04']
scikit-aero 111 ['0.1']
scikit-commpy 469 ['0.1.0']
scikits.samplerate 5480 ['0.3.3', '0.3.2', '0.3.1', '0.3.0']
scikits.bvp_solver 9395 ['1.1', '0.3.0', '0.2.5', '0.2.4', '0.2.3', '0.2.2']
scikit-fmm 2357 ['0.0.4', '0.0.3', '0.0.2', '0.0.1']
scikit-rf 2737 ['0.12', '0.11', '0.1']
scikits.statsmodels 18853 ['0.3.1', '0.3.0rc1', '0.3.0', '0.2.0', '0.1.0b1']
scikits.optimization 2184 ['0.2', '0.1']
scikits.audiolab 20368 ['0.11.0', '0.10.2', '0.10.1', '0.10.0', '0.9.0', '0.8']
scikits.talkbox 7761 ['0.2.5', '0.2.4.dev', '0.2.3', '0.2.2', '0.2.1', '0.2']
scikits.datasmooth 1588 ['0.61', '0.5']
scikit-image 1329 ['0.7.2']
scikits-image 7743 ['0.7.1', '0.7.0', '0.6.1', '0.6', '0.5', '0.4.2', '0.4.1']
scikit-vi 316 ['0.1']
scikits.vectorplot 2010 ['0.1.1', '0.1']
scikits.ann 8243 ['0.2.dev-r803', '0.2.dev-r801', '0.2.dev-r800']
scikits.scattpy 555 ['0.1.1', '0.1.0']
scikit-learn 39756 ['0.12.1', '0.12', '0.11', '0.10', '0.9']
scikits.sparse 1012 ['0.1']
scikits.example 2833 ['0.1']
statsmodels 12091 ['0.4.3', '0.4.1', '0.4.0rc2', '0.4.0']
scikits.bvp1lg 1117 ['0.2.5']
scikits.learn 7166 ['0.8.1']

[image: Inline image 1]
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png Type: image/png Size: 10461 bytes Desc: not available URL:

From andrewhdmundell at gmail.com Wed Dec 19 07:16:39 2012
From: andrewhdmundell at gmail.com (amundell)
Date: Wed, 19 Dec 2012 04:16:39 -0800 (PST)
Subject: [SciPy-User] [SciPy-user] Two Sample Kolmogorov-Smirnov Test scipy vs R
Message-ID: <34814758.post@talk.nabble.com>

I am currently creating a statistical app where I am comparing my hypothesis test results with R and Python (scipy) libraries. So far so good with most tests.
However I have found a discrepancy with the R and Python results for the Two-Sample Kolmogorov-Smirnov Tests. Below are data vectors I have been using obviously formatted for both R ks.test and scipy.stat.msstats.ks_twosamp methods. sample1=[23.4, 30.9, 18.8, 23.0, 21.4, 1, 24.6, 23.8, 24.1, 18.7, 16.3, 20.3, 14.9, 35.4, 21.6, 21.2, 21.0, 15.0, 15.6, 24.0, 34.6, 40.9, 30.7, 24.5, 16.6, 1, 21.7, 1, 23.6, 1, 25.7, 19.3, 46.9, 23.3, 21.8, 33.3, 24.9, 24.4, 1, 19.8, 17.2, 21.5, 25.5, 23.3, 18.6, 22.0, 29.8, 33.3, 1, 21.3, 18.6, 26.8, 19.4, 21.1, 21.2, 20.5, 19.8, 26.3, 39.3, 21.4, 22.6, 1, 35.3, 7.0, 19.3, 21.3, 10.1, 20.2, 1, 36.2, 16.7, 21.1, 39.1, 19.9, 32.1, 23.1, 21.8, 30.4, 19.62, 15.5] sample2=[16.5, 1, 22.6, 25.3, 23.7, 1, 23.3, 23.9, 16.2, 23.0, 21.6, 10.8, 12.2, 23.6, 10.1, 24.4, 16.4, 11.7, 17.7, 34.3, 24.3, 18.7, 27.5, 25.8, 22.5, 14.2, 21.7, 1, 31.2, 13.8, 29.7, 23.1, 26.1, 25.1, 23.4, 21.7, 24.4, 13.2, 22.1, 26.7, 22.7, 1, 18.2, 28.7, 29.1, 27.4, 22.3, 13.2, 22.5, 25.0, 1, 6.6, 23.7, 23.5, 17.3, 24.6, 27.8, 29.7, 25.3, 19.9, 18.2, 26.2, 20.4, 23.3, 26.7, 26.0, 1, 25.1, 33.1, 35.0, 25.3, 23.6, 23.2, 20.2, 24.7, 22.6, 39.1, 26.5, 22.7] Running the tests: R: TT = ks.test(sample1, sample2) TG = ks.test(sample1, sample2, alternative="greater") TL = ks.test(sample1, sample2, alternative="less") TT Result: D = 0.2204, p-value = 0.04205 alternative hypothesis: two-sided TG Result: D^+ = 0.2204, p-value = 0.02102 alternative hypothesis: the CDF of x lies above that of y TL Result: D^- = 0.1242, p-value = 0.2933 alternative hypothesis: the CDF of x lies below that of y Scipy: TT=scipy.stats.mstats.ks_twosamp(sample1, sample2) TU=scipy.stats.mstats.ks_twosamp(sample1, sample2, alternative='greater') TL=scipy.stats.mstats.ks_twosamp(sample1, sample2, alternative='less') TT Result: D= 0.220411392405, p-value= 0.0420492678738 TU Result: D= 0.124208860759 p-value: 0.293327703926 TL Result: D=: 0.220411392405, p-value: 0.0210248293393 So as it can be seen from the 
results the one tailed upper and lower values seemed to be reversed. In my app my results were more consistent with R's. Am I missing something obvious here i.e. with definitions? or is there potentially a bug in the scipy code? Any help will be much appreciated. Cheers. -- View this message in context: http://old.nabble.com/Two-Sample-Kolmogorov-Smirnov-Test-scipy-vs-R-tp34814758p34814758.html Sent from the Scipy-User mailing list archive at Nabble.com. From josef.pktd at gmail.com Wed Dec 19 10:11:29 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 19 Dec 2012 10:11:29 -0500 Subject: [SciPy-User] [SciPy-user] Two Sample Kolmogorov-Smirnov Test scipy vs R In-Reply-To: <34814758.post@talk.nabble.com> References: <34814758.post@talk.nabble.com> Message-ID: On Wed, Dec 19, 2012 at 7:16 AM, amundell wrote: > > I am currently creating a statistical app where I am comparing my hypothesis > test results with R and Python (scipy) libraries. So far so good with most > test. However I have found a discrepancy with the R and Python results for > the Two-Sample Kolmogorov-Smirnov Tests. Below are data vectors I have been > using obviously formatted for both R ks.test and > scipy.stat.msstats.ks_twosamp methods. 
> > sample1=[23.4, 30.9, 18.8, 23.0, 21.4, 1, 24.6, 23.8, 24.1, 18.7, 16.3, > 20.3, > 14.9, 35.4, 21.6, 21.2, 21.0, 15.0, 15.6, 24.0, 34.6, 40.9, > 30.7, > 24.5, 16.6, 1, 21.7, 1, 23.6, 1, 25.7, 19.3, 46.9, 23.3, 21.8, > 33.3, > 24.9, 24.4, 1, 19.8, 17.2, 21.5, 25.5, 23.3, 18.6, 22.0, 29.8, > 33.3, > 1, 21.3, 18.6, 26.8, 19.4, 21.1, 21.2, 20.5, 19.8, 26.3, 39.3, > 21.4, > 22.6, 1, 35.3, 7.0, 19.3, 21.3, 10.1, 20.2, 1, 36.2, 16.7, > 21.1, 39.1, > 19.9, 32.1, 23.1, 21.8, 30.4, 19.62, 15.5] > > sample2=[16.5, 1, 22.6, 25.3, 23.7, 1, 23.3, 23.9, 16.2, 23.0, 21.6, 10.8, > 12.2, > 23.6, 10.1, 24.4, 16.4, 11.7, 17.7, 34.3, 24.3, 18.7, 27.5, > 25.8, 22.5, > 14.2, 21.7, 1, 31.2, 13.8, 29.7, 23.1, 26.1, 25.1, 23.4, 21.7, > 24.4, 13.2, > 22.1, 26.7, 22.7, 1, 18.2, 28.7, 29.1, 27.4, 22.3, 13.2, 22.5, > 25.0, 1, > 6.6, 23.7, 23.5, 17.3, 24.6, 27.8, 29.7, 25.3, 19.9, 18.2, > 26.2, 20.4, > 23.3, 26.7, 26.0, 1, 25.1, 33.1, 35.0, 25.3, 23.6, 23.2, 20.2, > 24.7, 22.6, > 39.1, 26.5, 22.7] > > Running the tests: > R: > TT = ks.test(sample1, sample2) > TG = ks.test(sample1, sample2, alternative="greater") > TL = ks.test(sample1, sample2, alternative="less") > > TT Result: D = 0.2204, p-value = 0.04205 alternative hypothesis: two-sided > TG Result: D^+ = 0.2204, p-value = 0.02102 alternative hypothesis: the CDF > of x lies above that of y > TL Result: D^- = 0.1242, p-value = 0.2933 alternative hypothesis: the CDF > of x lies below that of y > > Scipy: > > TT=scipy.stats.mstats.ks_twosamp(sample1, sample2) > TU=scipy.stats.mstats.ks_twosamp(sample1, sample2, alternative='greater') > TL=scipy.stats.mstats.ks_twosamp(sample1, sample2, alternative='less') > > TT Result: D= 0.220411392405, p-value= 0.0420492678738 > TU Result: D= 0.124208860759 p-value: 0.293327703926 > TL Result: D=: 0.220411392405, p-value: 0.0210248293393 > > So as it can be seen from the results the one tailed upper and lower values > seemed to be reversed. In my app my results were more consistent with R's. 
> Am I missing something obvious here i.e. with definitions? or is there > potentially a bug in the scipy code? > Any help will be much appreciated. Cheers. It's not really a bug, since the documentation for mstats.ks_2samp doesn't specify what is meant by greater or less. But I think this should be clarified in the documentation and changed. For stats.kstest I followed the R definition for the one-sided tests, IIRC Aside: One part that I always find confusing in this is that having a larger cdf means that the random values are smaller (in a stochastic dominance sense) http://en.wikipedia.org/wiki/Stochastic_dominance#First-order_stochastic_dominance "A dominating B means that F_A(x) <= F_B(x) for all x, with strict inequality at some x" However Kolmogorov-Smirnov only looks at the maximum deviation of the cdfs in either direction. In your example we have several intersections import matplotlib.pyplot as plt plt.figure() n1 = len(sample1) n2 = len(sample2) plt.step(np.sort(sample1), np.arange(1, n1+1)/(n1+1.), label='sample1') plt.step(np.sort(sample2), np.arange(1, n2+1)/(n2+1.), label='sample2') plt.legend() import statsmodels.graphics.gofplots as smgp fig2 = smgp.qqplot_2samples(np.asarray(sample1)[:-1], np.asarray(sample2), line='45') #requires equal length plt.show() Josef > > > > > -- > View this message in context: http://old.nabble.com/Two-Sample-Kolmogorov-Smirnov-Test-scipy-vs-R-tp34814758p34814758.html > Sent from the Scipy-User mailing list archive at Nabble.com. 
> > _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

From helmrp at yahoo.com Wed Dec 19 10:39:30 2012
From: helmrp at yahoo.com (The Helmbolds)
Date: Wed, 19 Dec 2012 07:39:30 -0800 (PST)
Subject: [SciPy-User] Problem with ODE
References: Message-ID: <1355931570.2729.YahooMailNeo@web31804.mail.mud.yahoo.com>

On 18 Dec 2012 Issa wrote (heavily abbreviated):

    "I am trying to integrate a simple ode which is successful when calling
    'vode' integrator. However,
    with 'dopri5' or 'dop853' as integrator, I have the following error: . . "

I have no idea what the error message means, but the following may be of some help.

Suppose we want to solve the following system of ODEs:

    dx/dt = -Dy
    dy/dt = -Ax

for parameters A = 0.01, D = 0.015 and initial values x(0) = 1000 and y(0) = 1500. We can put w = (x, y) and proceed as follows:

    print
    print "ODE-CLASS Results using DOPRI5"
    t0 = 0
    x0 = 1000
    y0 = 1500
    w0 = (x0, y0)
    params = (0.01, 0.015)
    def func_2(t, w, args):    # Note reversal: use (t, w, ...) instead of (w, t, ...).
        A, D = args            # Note no '*'.
        x, y = w
        xdot = -D*y
        ydot = -A*x
        return [xdot, ydot]
    def J_2(t, w, args):       # Jacobian. Note order of (t, w, ...) instead of (w, t, ...).
        A, D = args            # Note no '*'.
        x, y = w
        return [[0, -A], [-D, 0]]
    import scipy
    from scipy.integrate import ode

    # Create an "integrator object" of type "ode"
    #   initialized with the above 'func' and Jacobian.
    intobj = ode(func_2, J_2)

    # Set this 'intobj' to use the 'dopri5' integrator,
    #   using its defaults.
    intobj.set_integrator('dopri5')

    # Set the intobj's initial value and
    #   corresponding initial time.
    intobj.set_initial_value(w0, 0.0)

    # Set the parameters to be used in intobj's
    #   'func' and Jacobian evaluations.
    intobj.set_f_params(params)     # Don't use '*params'.
    intobj.set_jac_params(params)   # Don't use '*params'.

    # Set the final time and time-step to
    #   use for reporting the results.
    tf = 10
    tstep = 1
    t0 = 0.0

    # Display the results, accepting default values
    #   for the remaining options.
    print
    while intobj.successful() and intobj.t < tf:
        intobj.integrate(intobj.t + tstep)
        outstr = '%s, %s' %(intobj.t, intobj.y)
        print outstr

#+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    print
    print "ODE-CLASS Results using DOP853"
    t0 = 0
    x0 = 1000
    y0 = 1500
    w0 = (x0, y0)
    params = (0.01, 0.015)
    def func_2(t, w, args):    # Note reversal: use (t, w, ...) instead of (w, t, ...).
        A, D = args            # Note no '*'.
        x, y = w
        xdot = -D*y
        ydot = -A*x
        return [xdot, ydot]
    def J_2(t, w, args):       # Jacobian. Note order of (t, w, ...) instead of (w, t, ...).
        A, D = args            # Note no '*'.
        x, y = w
        return [[0, -A], [-D, 0]]
    import scipy
    from scipy.integrate import ode

    # Create an "integrator object" of type "ode"
    #   initialized with the above 'func' and Jacobian.
    intobj = ode(func_2, J_2)

    # Set this 'intobj' to use the 'dop853' integrator,
    #   using its defaults.
    intobj.set_integrator('dop853')

    # Set the intobj's initial value and
    #   corresponding initial time.
    intobj.set_initial_value(w0, 0.0)

    # Set the parameters to be used in intobj's
    #   'func' and Jacobian evaluations.
    intobj.set_f_params(params)     # Don't use '*params'.
    intobj.set_jac_params(params)   # Don't use '*params'.

    # Set the final time and time-step to
    #   use for reporting the results.
    tf = 10
    tstep = 1
    t0 = 0.0

    # Display the results, accepting default values
    #   for the remaining options.
    print
    while intobj.successful() and intobj.t < tf:
        intobj.integrate(intobj.t + tstep)
        outstr = '%s, %s' %(intobj.t, intobj.y)
        print outstr

#++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

I get essentially the following output from either of these, or with the theoretical solution.

    1.0, [  977.57443843  1490.1122514 ]
    2.0, [  955.29551487  1480.44802244]
    3.0, [  933.15988742  1471.00586346]
    4.0, [  911.1642357   1461.78435811]
    5.0, [  889.30526033  1452.78212316]
    6.0, [  867.57968241  1443.99780825]
    7.0, [  845.98424307  1435.43009571]
    8.0, [  824.51570296  1427.07770039]
    9.0, [  803.17084174  1418.93936939]
    10.0, [  781.94645766  1411.01388197]

Hope this helps.
Bob H

From issa at aims.ac.za Wed Dec 19 12:31:48 2012
From: issa at aims.ac.za (Issa Karambal)
Date: Wed, 19 Dec 2012 17:31:48 +0000
Subject: [SciPy-User] Problem with ODE
In-Reply-To: <1355931570.2729.YahooMailNeo@web31804.mail.mud.yahoo.com>
References: <1355931570.2729.YahooMailNeo@web31804.mail.mud.yahoo.com>
Message-ID:

Hi,

Thanks to your code I figured out the error I've been getting. The set_f_params function takes a tuple as parameter for the 'dopri5' and 'dop853' integrators, unlike the 'vode' integrator. Thank you very much for your help.

On 19 December 2012 15:39, The Helmbolds wrote:
> On 18 Dec 2012 Issa wrote (heavily abbreviated):
>
> "I am trying to integrate a simple ode which is successful when calling
> 'vode' integrator. However,
> with 'dopri5' or 'dop853' as integrator, I have the following error: . . "
>
> I have no idea what the error message means, but the following may be of
> some help.
> Suppose we want to solve the following system of ODEs: > dx/dt = -Dy > dy/dt = -Ax > for parameters A = 0.01, D = 0.015 and initial > values x(0) = 1000 and y(0) = 1500. We can > put w = (x, y) and proceed as follows: > > print > print "ODE-CLASS Results using DOPRI5" > t0 = 0 > x0 = 1000 > y0 = 1500 > w0 = (x0, y0) > params = (0.01, 0.015) > def func_2(t, w, args): # Note reversal: use (t, w, ...) instead of > (w, t, ...). > A, D = args # Note no '*'. > x, y = w > xdot = -D*y > ydot = -A*x > return [xdot, ydot] > def J_2(t, w, args): # Jacobian. Note order of (t, w , ...) > instead of (w, t, ...). > A, D = args # Note no '*'. > x, y = w > return [[0, -A], [-D, 0]] > import scipy > from scipy.integrate import ode > > # Create an "integrator object" of type "ode" > # initialized with the above 'func' and Jacobian. > intobj = ode(func_2, J_2) > > # Set this 'intobj' to use the 'dopri5' integrator, > # using its defaults. > intobj.set_integrator('dopri5') > > # Set the intobj's initial value and > # coresponding intial time. > intobj.set_initial_value(w0, 0.0) > # Set the parameters to be used in intobj's > # 'func' and Jacobian evaluations. > intobj.set_f_params(params) # Don't use '*params'. > intobj.set_jac_params(params) # Don't use '*params'. > > # Set the final time and time-step to > # use for reporting the results. > tf = 10 > tstep = 1 > t0 = 0.0 > > # Display the results, accepting default values > # for the remaining options. > print > while intobj.successful() and intobj.t < tf: > intobj.integrate(intobj.t + tstep) > outstr = '%s, %s' %(intobj.t, intobj.y) > print outstr > > #+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > > print > print "ODE-CLASS Results using DOP853" > t0 = 0 > x0 = 1000 > y0 = 1500 > w0 = (x0, y0) > params = (0.01, 0.015) > def func_2(t, w, args): # Note reversal: use (t, w, ...) instead of > (w, t, ...). > A, D = args # Note no '*'. 
> x, y = w > xdot = -D*y > ydot = -A*x > return [xdot, ydot] > def J_2(t, w, args): # Jacobian. Note order of (t, w , ...) > instead of (w, t, ...). > A, D = args # Note no '*'. > x, y = w > return [[0, -A], [-D, 0]] > import scipy > from scipy.integrate import ode > > # Create an "integrator object" of type "ode" > # initialized with the above 'func' and Jacobian. > intobj = ode(func_2, J_2) > > # Set this 'intobj' to use the 'dop853' integrator, > # using its defaults. > intobj.set_integrator('dop853') > > # Set the intobj's initial value and > # coresponding initial time. > intobj.set_initial_value(w0, 0.0) > # Set the parameters to be used in intobj's > # 'func' and Jacobian evaluations. > intobj.set_f_params(params) # Don't use '*params'. > intobj.set_jac_params(params) # Don't use '*params'. > > # Set the final time and time-step to > # use for reporting the results. > tf = 10 > tstep = 1 > t0 = 0.0 > > # Display the results, accepting default values > # for the remaining options. > print > while intobj.successful() and intobj.t < tf: > intobj.integrate(intobj.t + tstep) > outstr = '%s, %s' %(intobj.t, intobj.y) > print outstr > #++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > I get essentially the following output from either of these, or with the > theoretical solution. > > 1.0, [ 977.57443843 1490.1122514 ] > 2.0, [ 955.29551487 1480.44802244] > 3.0, [ 933.15988742 1471.00586346] > 4.0, [ 911.1642357 1461.78435811] > 5.0, [ 889.30526033 1452.78212316] > 6.0, [ 867.57968241 1443.99780825] > 7.0, [ 845.98424307 1435.43009571] > 8.0, [ 824.51570296 1427.07770039] > 9.0, [ 803.17084174 1418.93936939] > 10.0, [ 781.94645766 1411.01388197] > > Hope this helps. > Bob H > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From andrewhdmundell at gmail.com Wed Dec 19 13:06:39 2012
From: andrewhdmundell at gmail.com (amundell)
Date: Wed, 19 Dec 2012 10:06:39 -0800 (PST)
Subject: [SciPy-User] [SciPy-user] Two Sample Kolmogorov-Smirnov Test scipy vs R
In-Reply-To: References: <34814758.post@talk.nabble.com>
Message-ID: <34816338.post@talk.nabble.com>

Hi Josef,

Thanks for your quick response and information. It does seem a little confusing, as R uses these parameter names as well, but I guess the documentation can clarify this. Incidentally, I did make some comparison tests with scipy.stats.kstest, with the SciPy results appearing more accurate, as confirmed by a third statistical software package. In the two-sample test, R and the third software matched closely in their tail data. I am quite new to working with this type of test and also a little confused by the meanings of "greater" and "less" in this scenario. I am going to make some further investigations into the test theory as well as the algorithms being used in R and other packages and see if I can come up with an answer. Will keep you posted.

Thanks again,
Andrew

josef.pktd wrote:
>
> On Wed, Dec 19, 2012 at 7:16 AM, amundell
> wrote:
>>
>> I am currently creating a statistical app where I am comparing my
>> hypothesis
>> test results with R and Python (scipy) libraries. So far so good with
>> most
>> test. However I have found a discrepancy with the R and Python results
>> for
>> the Two-Sample Kolmogorov-Smirnov Tests. Below are data vectors I have
>> been
>> using obviously formatted for both R ks.test and
>> scipy.stat.msstats.ks_twosamp methods.
>>
>> sample1 = [23.4, 30.9, 18.8, 23.0, 21.4, 1, 24.6, 23.8, 24.1, 18.7, 16.3, 20.3,
>>            14.9, 35.4, 21.6, 21.2, 21.0, 15.0, 15.6, 24.0, 34.6, 40.9, 30.7,
>>            24.5, 16.6, 1, 21.7, 1, 23.6, 1, 25.7, 19.3, 46.9, 23.3, 21.8, 33.3,
>>            24.9, 24.4, 1, 19.8, 17.2, 21.5, 25.5, 23.3, 18.6, 22.0, 29.8, 33.3,
>>            1, 21.3, 18.6, 26.8, 19.4, 21.1, 21.2, 20.5, 19.8, 26.3, 39.3, 21.4,
>>            22.6, 1, 35.3, 7.0, 19.3, 21.3, 10.1, 20.2, 1, 36.2, 16.7, 21.1, 39.1,
>>            19.9, 32.1, 23.1, 21.8, 30.4, 19.62, 15.5]
>>
>> sample2 = [16.5, 1, 22.6, 25.3, 23.7, 1, 23.3, 23.9, 16.2, 23.0, 21.6, 10.8, 12.2,
>>            23.6, 10.1, 24.4, 16.4, 11.7, 17.7, 34.3, 24.3, 18.7, 27.5, 25.8, 22.5,
>>            14.2, 21.7, 1, 31.2, 13.8, 29.7, 23.1, 26.1, 25.1, 23.4, 21.7, 24.4, 13.2,
>>            22.1, 26.7, 22.7, 1, 18.2, 28.7, 29.1, 27.4, 22.3, 13.2, 22.5, 25.0, 1,
>>            6.6, 23.7, 23.5, 17.3, 24.6, 27.8, 29.7, 25.3, 19.9, 18.2, 26.2, 20.4,
>>            23.3, 26.7, 26.0, 1, 25.1, 33.1, 35.0, 25.3, 23.6, 23.2, 20.2, 24.7, 22.6,
>>            39.1, 26.5, 22.7]
>>
>> Running the tests:
>>
>> R:
>> TT = ks.test(sample1, sample2)
>> TG = ks.test(sample1, sample2, alternative="greater")
>> TL = ks.test(sample1, sample2, alternative="less")
>>
>> TT Result: D = 0.2204, p-value = 0.04205; alternative hypothesis: two-sided
>> TG Result: D^+ = 0.2204, p-value = 0.02102; alternative hypothesis: the
>>            CDF of x lies above that of y
>> TL Result: D^- = 0.1242, p-value = 0.2933; alternative hypothesis: the
>>            CDF of x lies below that of y
>>
>> SciPy:
>>
>> TT = scipy.stats.mstats.ks_twosamp(sample1, sample2)
>> TU = scipy.stats.mstats.ks_twosamp(sample1, sample2, alternative='greater')
>> TL = scipy.stats.mstats.ks_twosamp(sample1, sample2, alternative='less')
>>
>> TT Result: D = 0.220411392405, p-value = 0.0420492678738
>> TU Result: D = 0.124208860759, p-value = 0.293327703926
>> TL Result: D = 0.220411392405, p-value = 0.0210248293393
>>
>> So, as can be seen from the results, the one-tailed upper and lower
>> values seem to be reversed. In my app my results were more consistent
>> with R's. Am I missing something obvious here, i.e. with definitions,
>> or is there potentially a bug in the scipy code?
>> Any help will be much appreciated. Cheers.
>
> It's not really a bug, since the documentation for mstats.ks_2samp
> doesn't specify what is meant by greater or less.
>
> But I think this should be clarified in the documentation and changed.
> For stats.kstest I followed the R definition for the one-sided tests, IIRC.
>
>
> Aside:
>
> One part that I always find confusing in this is that having a larger
> cdf means that the random values are smaller (in a stochastic
> dominance sense)
> http://en.wikipedia.org/wiki/Stochastic_dominance#First-order_stochastic_dominance
> "A dominating B means that F_A(x) <= F_B(x) for all x, with strict
> inequality at some x"
>
> However, Kolmogorov-Smirnov only looks at the maximum deviation of the
> cdfs in either direction.
> In your example we have several intersections:
>
> import numpy as np
> import matplotlib.pyplot as plt
> plt.figure()
> n1 = len(sample1)
> n2 = len(sample2)
> plt.step(np.sort(sample1), np.arange(1, n1+1)/(n1+1.), label='sample1')
> plt.step(np.sort(sample2), np.arange(1, n2+1)/(n2+1.), label='sample2')
> plt.legend()
>
> import statsmodels.graphics.gofplots as smgp
> fig2 = smgp.qqplot_2samples(np.asarray(sample1)[:-1],
>                             np.asarray(sample2), line='45')  # requires equal length
> plt.show()
>
> Josef
>>
>> --
>> View this message in context:
>> http://old.nabble.com/Two-Sample-Kolmogorov-Smirnov-Test-scipy-vs-R-tp34814758p34814758.html
>> Sent from the Scipy-User mailing list archive at Nabble.com.
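[Editor's note: for readers working through this thread, both one-sided statistics can be reproduced from the empirical CDFs with plain NumPy. This is an illustrative sketch only (the helper `ks_one_sided` is my name, not a scipy or R API); it uses the data vectors posted above and should reproduce the D values reported by both R and scipy.stats.mstats.ks_twosamp.]

```python
import numpy as np

# Data vectors from the post above.
sample1 = [23.4, 30.9, 18.8, 23.0, 21.4, 1, 24.6, 23.8, 24.1, 18.7, 16.3, 20.3,
           14.9, 35.4, 21.6, 21.2, 21.0, 15.0, 15.6, 24.0, 34.6, 40.9, 30.7,
           24.5, 16.6, 1, 21.7, 1, 23.6, 1, 25.7, 19.3, 46.9, 23.3, 21.8, 33.3,
           24.9, 24.4, 1, 19.8, 17.2, 21.5, 25.5, 23.3, 18.6, 22.0, 29.8, 33.3,
           1, 21.3, 18.6, 26.8, 19.4, 21.1, 21.2, 20.5, 19.8, 26.3, 39.3, 21.4,
           22.6, 1, 35.3, 7.0, 19.3, 21.3, 10.1, 20.2, 1, 36.2, 16.7, 21.1, 39.1,
           19.9, 32.1, 23.1, 21.8, 30.4, 19.62, 15.5]
sample2 = [16.5, 1, 22.6, 25.3, 23.7, 1, 23.3, 23.9, 16.2, 23.0, 21.6, 10.8, 12.2,
           23.6, 10.1, 24.4, 16.4, 11.7, 17.7, 34.3, 24.3, 18.7, 27.5, 25.8, 22.5,
           14.2, 21.7, 1, 31.2, 13.8, 29.7, 23.1, 26.1, 25.1, 23.4, 21.7, 24.4, 13.2,
           22.1, 26.7, 22.7, 1, 18.2, 28.7, 29.1, 27.4, 22.3, 13.2, 22.5, 25.0, 1,
           6.6, 23.7, 23.5, 17.3, 24.6, 27.8, 29.7, 25.3, 19.9, 18.2, 26.2, 20.4,
           23.3, 26.7, 26.0, 1, 25.1, 33.1, 35.0, 25.3, 23.6, 23.2, 20.2, 24.7, 22.6,
           39.1, 26.5, 22.7]

def ks_one_sided(x, y):
    """Return (d_plus, d_minus): the maxima of F_x - F_y and of F_y - F_x,
    where F_x, F_y are the empirical CDFs.  Both step functions can only
    change at the pooled sample points, so evaluating there suffices."""
    x, y = np.sort(x), np.sort(y)
    pooled = np.concatenate([x, y])
    f_x = np.searchsorted(x, pooled, side='right') / len(x)
    f_y = np.searchsorted(y, pooled, side='right') / len(y)
    return float(np.max(f_x - f_y)), float(np.max(f_y - f_x))

d_plus, d_minus = ks_one_sided(sample1, sample2)
d_two_sided = max(d_plus, d_minus)
# The pair {d_plus, d_minus} should be {0.2204..., 0.1242...}, matching
# R's D^+ and D^-; d_two_sided should match the two-sided D above.
```

Which of these two maxima a given package labels 'greater' is exactly the naming question raised in this thread.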
>>
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user

--
View this message in context: http://old.nabble.com/Two-Sample-Kolmogorov-Smirnov-Test-scipy-vs-R-tp34814758p34816338.html
Sent from the Scipy-User mailing list archive at Nabble.com.

From p.zaffino at yahoo.it Wed Dec 19 19:11:35 2012
From: p.zaffino at yahoo.it (Paolo Zaffino)
Date: Thu, 20 Dec 2012 01:11:35 +0100
Subject: [SciPy-User] Points fitting (non lin)
In-Reply-To: <089C98C7-4E58-46EC-83A7-CDE506C1634C@yale.edu>
References: <1355500677.78276.YahooMailNeo@web171602.mail.ir2.yahoo.com> <1355503367.42200.YahooMailNeo@web171602.mail.ir2.yahoo.com> <9E04AB38-D38C-4C6D-881E-E294898EA3AC@yale.edu> <50CDDACF.10305@yahoo.it> <089C98C7-4E58-46EC-83A7-CDE506C1634C@yale.edu>
Message-ID: <50D257B7.8090106@yahoo.it>

Dear Zach,
thank you very much... I have understood now.
Regards.
Paolo

On 16/12/2012 20:44, Zachary Pincus wrote:
>> Dear Zach,
>> I didn't understand the trick... sorry.
>> Please, can you explain it again?
> You'll want to read up about parametric representation of curves on
> wikipedia etc.
>
> Obviously, the coordinates you listed:
> P1 = (1,1)
> P2 = (2,2)
> P3 = (4,2)
> P4 = (3,1)
> do not trace out y values that are a proper function of x, because there
> are some places where the curve would have multiple y-values for a given
> x-value (at x=3, say). So, as you correctly understood, you can't just
> try to fit a function y=f(x) to your data and expect that to work.
>
> What you can, however, do is fit TWO functions to your data, using a
> dummy variable. For example, imagine a point moving along your curve at
> a given speed, so that at any given time t, the position of the point is
> given as (x(t), y(t)).
>
> So for each of your points above P1-P4, associate a value T1-T4.
>
> For example,
>
> t  x  y
> 0  1  1
> 1  2  2
> 2  4  2
> 3  3  1
>
> is just your data above, but with a t parameter indexing each point. So
> at t=0, x=1 and y=1. At t=1, x=2 and y=2. If you had a functional form
> for x(t) and y(t), you could then estimate the point on the curve midway
> between the first (t=0) and second (t=1) points by evaluating x(0.5) and
> y(0.5).
>
> To do this, one could fit x(t) and y(t) separately (because as long as
> the t values are monotonic, x(t) and y(t) are proper functions for any
> smooth curve that you could draw on a plane). Fitting a spline to these
> data is the most straightforward way to do this, and fortunately,
> scipy.interpolate.splprep does this all for you and gives you a spline
> that can be evaluated at different t positions by splev.
>
> If spline fitting is unfamiliar to you, then parametric spline fitting
> will seem even less comprehensible. So you might want to experiment by
> fitting other, simpler functional curves in scipy with splrep first and
> then moving on. (There should be interpolation examples aplenty online.)
>
> Zach
>
>
>> Thanks a lot.
>> Paolo
>>
>> On 14/12/2012 17:50, Zachary Pincus wrote:
>>> On Dec 14, 2012, at 11:42 AM, Paolo Zaffino wrote:
>>>
>>>> Dear Paweł,
>>>> thank you for the reply.
>>>> I'll try to explain the issue better.
>>>> I have these points (in this order):
>>>>
>>>> P1 = (1,1)
>>>> P2 = (2,2)
>>>> P3 = (4,2)
>>>> P4 = (3,1)
>>>>
>>>> I need to fit the points in the order P1, P2, P3, P4 even if the x
>>>> coord of P3 is greater than P4's.
>>>> I thought of a piecewise quadratic curve, but other solutions are welcome.
>>>>
>>>> Thanks a lot.
>>>> Paolo
>>> You will want to fit a parametric spline of some degree with some
>>> amount (or no) smoothing. I'd look at the splprep function in
>>> scipy.interpolate.
>>>
>>> The trick is you associate each point with some monotonic parameter
>>> value, and then interpolate along that parameter (say t) to get your
>>> x, y coordinates.
>>>
>>> E.g.:
>>>
>>> t  x  y
>>> 0  1  1
>>> 1  2  2
>>> 2  4  2
>>> 3  3  1
>>>
>>> Then if you were interpolating linearly, at t=0.5, you would have
>>> (1.5, 1.5) as the coordinate.
>>>
>>> As above, splprep will generate splines of a desired order (linear,
>>> quadratic, cubic, etc.) and with a user-specified smoothing parameter
>>> (s), which can be set to zero to get exact interpolation of the input
>>> coordinates, potentially at the cost of ringing (sometimes quite bad)
>>> away from the input coordinates. So you will need to plot the
>>> interpolated values, both at the input t-values as well as at
>>> intermediate t's, to see if the output is sane.
>>>
>>> Hopefully this is somewhat clear, or at least enough to get you
>>> started. Please read the documentation for splprep and splev for more
>>> information.
>>> Zach
>>>
>>>
>>>> From: Paweł Kwaśniewski
>>>> To: Paolo Zaffino ; SciPy Users List
>>>> Sent: Friday, 14 December 2012 17:25
>>>> Subject: Re: [SciPy-User] Points fitting (non lin)
>>>>
>>>> Dear Paolo,
>>>>
>>>> I'm not sure I understand your problem correctly, but this sounds
>>>> like a spline fitting job. You can read more about this here:
>>>> http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html
>>>>
>>>> Is this what you are looking for?
>>>>
>>>> Cheers,
>>>>
>>>> Paweł
>>>>
>>>>
>>>> 2012/12/14 Paolo Zaffino
>>>> Dear Scipy community,
>>>>
>>>> I have a set of points (2D) and I would like to compute a curve that
>>>> fits them.
>>>> The points are ordered in a precise way (not in increasing x order)
>>>> and I can't change this order (the curve should fit the points in
>>>> that order).
>>>> I'm interested in a non-linear fit (the ideal case would be several
>>>> intervals of quadratic curves).
>>>> Has anyone any advice about this?
>>>>
>>>> Thank you very much.
>>>> Regards.
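[Editor's note: the parametric recipe quoted above is easy to verify numerically. The sketch below uses plain NumPy and linear interpolation only (the smooth-spline variant would go through scipy.interpolate.splprep/splev exactly as described); `curve_at` is an illustrative name of mine, not a library function.]

```python
import numpy as np

# The four points, in the required order, with a monotonic parameter t.
t = np.array([0.0, 1.0, 2.0, 3.0])
x = np.array([1.0, 2.0, 4.0, 3.0])  # x-coordinates of P1..P4
y = np.array([1.0, 2.0, 2.0, 1.0])  # y-coordinates of P1..P4

def curve_at(ti):
    """Evaluate the parametric curve (x(t), y(t)) at parameter ti by
    interpolating each coordinate against t independently.  Note that x
    need not be monotonic; only t must be."""
    return float(np.interp(ti, t, x)), float(np.interp(ti, t, y))

print(curve_at(0.5))  # midway between P1 and P2: (1.5, 1.5)
print(curve_at(2.5))  # midway between P3 and P4: (3.5, 1.5)
```

The same evaluation with splines instead of straight segments is what splprep/splev provide, including the non-monotonic x at t=2.5, which a plain y=f(x) fit could not represent.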
>>>> Paolo
>>>>
>>>> _______________________________________________
>>>> SciPy-User mailing list
>>>> SciPy-User at scipy.org
>>>> http://mail.scipy.org/mailman/listinfo/scipy-user

From michael.aye at ucla.edu Wed Dec 19 21:52:16 2012
From: michael.aye at ucla.edu (Michael Aye)
Date: Wed, 19 Dec 2012 18:52:16 -0800
Subject: [SciPy-User] spline interpolators silently failing on NaNs
Message-ID: 

Hi,

I was just bitten by something nasty. I am fitting some data from pandas,
and as pandas deals so fantastically with NaNs, I forgot to check myself
before handing the data over to InterpolatedUnivariateSpline. To my
surprise, and after 2 hours of 'practising' all kinds of interpolations, I
realised my oversight. To my next surprise, the creation of the
interpolator function did not complain a single bit (pun not intended).
Just the application of the interpolator to the new x-data returned only
NaNs.

It is also notable that griddata had no such problem, but I cannot use it
because I need extrapolation as well.

Should / could the spline interpolators maybe check for NaNs and complain
about their existence?
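[Editor's note: until such a check exists in scipy itself, a thin wrapper in user code catches this early. A sketch; the name `fit_checked` and its error message are mine, not a scipy API.]

```python
import numpy as np

def fit_checked(fitter, x, y, **kwargs):
    """Call an interpolator constructor (e.g.
    scipy.interpolate.InterpolatedUnivariateSpline) but fail loudly on
    NaNs instead of silently building an all-NaN interpolant."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if np.isnan(x).any() or np.isnan(y).any():
        raise ValueError("input contains NaN; drop or fill values "
                         "(e.g. pandas dropna()/interpolate()) before fitting")
    return fitter(x, y, **kwargs)
```

With this in place, `fit_checked(InterpolatedUnivariateSpline, xdata, ydata)` raises immediately at construction time rather than returning NaNs at evaluation time.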
Michael

From andrewhdmundell at gmail.com Thu Dec 20 06:50:38 2012
From: andrewhdmundell at gmail.com (amundell)
Date: Thu, 20 Dec 2012 03:50:38 -0800 (PST)
Subject: [SciPy-User] [SciPy-user] Two Sample Kolmogorov-Smirnov Test scipy vs R
In-Reply-To: <34816338.post@talk.nabble.com>
References: <34814758.post@talk.nabble.com> <34816338.post@talk.nabble.com>
Message-ID: <34819082.post@talk.nabble.com>

Hi again Josef, after investigating a few different algorithm approaches
(most notably in Press, Teukolsky et al., Numerical Recipes) and looking
into R's source, it does seem that 'greater' and 'less' do refer to the
maximum (D) deviation of the first sample's (sample1's) cdf 'above' and
'below' sample2's cdf. By observing and calculating the maximum values in
the plot of my data, they are consistent with the results that SciPy
(scipy.stats.mstats.ks_twosamp) generates. I have just confirmed that I am
generating similar results to SciPy's in a third software package. I
therefore believe SciPy's results are accurate.

Andrew

--
View this message in context: http://old.nabble.com/Two-Sample-Kolmogorov-Smirnov-Test-scipy-vs-R-tp34814758p34819082.html
Sent from the Scipy-User mailing list archive at Nabble.com.

From josef.pktd at gmail.com Thu Dec 20 08:16:48 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 20 Dec 2012 08:16:48 -0500
Subject: [SciPy-User] [SciPy-user] Two Sample Kolmogorov-Smirnov Test scipy vs R
In-Reply-To: <34819082.post@talk.nabble.com>
References: <34814758.post@talk.nabble.com> <34816338.post@talk.nabble.com> <34819082.post@talk.nabble.com>
Message-ID: 

On Thu, Dec 20, 2012 at 6:50 AM, amundell wrote:
> Hi again Josef, after investigating a few different algorithm approaches
> (most notably in Press, Teukolsky et al.
Numerical Recipes) and looking into
> R's source, it does seem that 'greater' and 'less' do refer to the
> maximum (D) deviation of the first sample's (sample1's) cdf 'above' and
> 'below' sample2's cdf. By observing and calculating the maximum values in
> the plot of my data, they are consistent with the results that SciPy
> (scipy.stats.mstats.ks_twosamp) generates. I have just confirmed that I
> am generating similar results to SciPy's in a third software package. I
> therefore believe SciPy's results are accurate.

The results are numerically correct, but the definition of 'greater' and
'less' is still reversed in scipy.stats.mstats.

BTW, there is a tie correction in the mstats version, where I don't know
whether it has the same asymptotic distribution for the test statistic.

Josef

From andrewhdmundell at gmail.com Thu Dec 20 08:34:40 2012
From: andrewhdmundell at gmail.com (amundell)
Date: Thu, 20 Dec 2012 05:34:40 -0800 (PST)
Subject: [SciPy-User] [SciPy-user] Two Sample Kolmogorov-Smirnov Test scipy vs R
In-Reply-To: References: <34814758.post@talk.nabble.com> <34816338.post@talk.nabble.com> <34819082.post@talk.nabble.com>
Message-ID: <34819413.post@talk.nabble.com>
Not sure if I made some handling errors yesterday (perhaps samples were reversed) or changes had been made to the code in scipy.stats.mstats in the meantime. If it was the former (i.e. from my end), I do apologise for taking up your time with this. Andrew On Thu, Dec 20, 2012 at 6:50 AM, amundell wrote: > > Hi again Josef, after investigating a few different alogrithm approaches > (most notably in Press, Teukolsky et al. Numerical Recipes) and looking > into > R's source), it does seem that 'greater' and 'less' does refer to maximum > (D) deviation of the first sample's (sample1's) cdf 'above' and 'below' > the > sample2's cdf. By observing and calculating the maximum values in the plot > of my data, they are consistent with the results that SciPy > (scipy.stats.mstats.ks_twosamp) generate. I have just confirmed I am > generating similar results in a third software package to SciPy. I > therefore > believe SciPy's results are accurate. The results are numerically correct, but the definition of 'greater' and 'less' is still reversed in scipy.stats.mstats.? BTW there is a tie-correction in the mstats version, where I don't know whether it has the same asymptotic distribution for the test statistic. Josef -- View this message in context: http://old.nabble.com/Two-Sample-Kolmogorov-Smirnov-Test-scipy-vs-R-tp34814758p34819413.html Sent from the Scipy-User mailing list archive at Nabble.com. From a.klein at science-applied.nl Thu Dec 20 11:16:41 2012 From: a.klein at science-applied.nl (Almar Klein) Date: Thu, 20 Dec 2012 17:16:41 +0100 Subject: [SciPy-User] ANN: IEP 3.1 - the Interactive Editor for Python Message-ID: Dear all, We're pleased to announce version 3.1 of the Interactive Editor for Python. IEP is a cross-platform Python IDE focused on interactivity and introspection, which makes it very suitable for scientific computing. Its practical design is aimed at simplicity and efficiency. IEP is written in Python 3 and Qt. 
Binaries are available for Windows, Linux, and Mac.

A selection of new features:

- a new powerful menu for the shells widget
- multiple lines can be copy-pasted (and executed) in the shell
- run selection (F9) runs the selected statement if on a single line
- multiple comment/docstring reshaping tools
- goto-line
- the shell widget can now be moved on Mac
- a wizard to help new users on their way

Website: http://code.google.com/p/iep/
Discussion group: http://groups.google.com/group/iep
Release notes: http://code.google.com/p/iep/wiki/Release#version_3.1_(release_candidate_out_now)

Regards,
Rob, Ludo & Almar

From a.klein at science-applied.nl Thu Dec 20 16:21:18 2012
From: a.klein at science-applied.nl (Almar Klein)
Date: Thu, 20 Dec 2012 22:21:18 +0100
Subject: [SciPy-User] ANN: IEP 3.1 - the Interactive Editor for Python
In-Reply-To: References: Message-ID: 

A note to Ubuntu 12.10 users:
I just found out (thanks to Thomas Kluyver) that on Ubuntu 12.10, IEP has
a nasty problem between Qt and X, related to scrollbar overlays.
I will fix this ASAP and come out with a 3.1.1 release.

- Almar

On 20 December 2012 17:16, Almar Klein wrote:
> Dear all,
>
> We're pleased to announce version 3.1 of the Interactive Editor for
> Python.
>
> IEP is a cross-platform Python IDE focused on interactivity and
> introspection, which makes it very suitable for scientific computing.
> Its practical design is aimed at simplicity and efficiency. IEP is
> written in Python 3 and Qt.
>
> Binaries are available for Windows, Linux, and Mac.
>
> A selection of new features:
>
> - a new powerful menu for the shells widget
> - multiple lines can be copy-pasted (and executed) in the shell
> - run selection (F9) runs the selected statement if on a single line
> - multiple comment/docstring reshaping tools
> - goto-line
> - the shell widget can now be moved on Mac
> - a wizard to help new users on their way
>
> Website: http://code.google.com/p/iep/
> Discussion group: http://groups.google.com/group/iep
>
> Regards,
> Rob, Ludo & Almar

--
Almar Klein, PhD
Science Applied
phone: +31 6 19268652
e-mail: a.klein at science-applied.nl

From a.klein at science-applied.nl Thu Dec 20 21:35:00 2012
From: a.klein at science-applied.nl (Almar Klein)
Date: Fri, 21 Dec 2012 03:35:00 +0100
Subject: [SciPy-User] ANN: IEP 3.1 - the Interactive Editor for Python
In-Reply-To: References: Message-ID: 

Binaries for 3.1.1 are now available (for Linux).

- Almar

From fccoelho at gmail.com Fri Dec 21 05:26:33 2012
From: fccoelho at gmail.com (Flavio Coelho)
Date: Fri, 21 Dec 2012 08:26:33 -0200
Subject: [SciPy-User] ANN: IEP 3.1 - the Interactive Editor for Python
In-Reply-To: References: Message-ID: 

Would it be too much to ask for an IEP package on PyPI? It would greatly
facilitate installation... Spyder has one.

cheers,

Flávio

--
Flávio Codeço Coelho
================
+55(21) 3799-5567
Professor
Escola de Matemática Aplicada
Fundação Getúlio Vargas
Praia de Botafogo, 190 sala 312
Rio de Janeiro - RJ 22250-900
Brasil

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From a.klein at science-applied.nl Fri Dec 21 07:56:50 2012 From: a.klein at science-applied.nl (Almar Klein) Date: Fri, 21 Dec 2012 13:56:50 +0100 Subject: [SciPy-User] Introducing Pyzo - scientific computing in Python 3 Message-ID: Hi all, It is with great pleasure that we announce the first version of the Pyzo distro: a Python distribution for scientific computing, based on Python 3. For more information, and to give it a try, visit http://www.pyzo.org. Pyzo is available for Windows and Linux, and we expect a version for Mac soon. The distribution is portable, thus providing a way to install a scientific Python stack on computers without the need for admin rights. Naturally, the Pyzo distro is compliant with the scipy-stack, and also comes with packages such as scikit-image and scikit-learn. With Pyzo we want to make scientific computing in Python more easily accessible. We try to make the user experience coherent. Pyzo uses IEP as the default front-end IDE, and IPython (with notebook) is also available. We especially hope to make it easier for newcomers (such as Matlab converts) to join our awesome community. We realize that for a large part of the current community Python 2 is still the most common version. Nevertheless, we hope that you will give it a try and let us know what you think. Happy holidays, - Almar, Rob and Ludo PS: sorry for the cross-post, but I think that at least the introduction would be of interest to both scipy-user and numfocus. -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.klein at science-applied.nl Fri Dec 21 08:00:36 2012 From: a.klein at science-applied.nl (Almar Klein) Date: Fri, 21 Dec 2012 14:00:36 +0100 Subject: [SciPy-User] ANN: IEP 3.1 - the Interactive Editor for Python In-Reply-To: References: Message-ID: No, it would not be too much.
In fact there already is one: http://pypi.python.org/pypi/iep/3.1 (but I do realize now that I should update it to 3.1.1, will do that right now) - Almar On 21 December 2012 11:26, Flavio Coelho wrote: > Would it be too much to ask for a IEP package in PyPI? it would greatly > facilitate installation... Spyder has one. > > cheers, > > Fl?vio > > On Fri, Dec 21, 2012 at 12:35 AM, Almar Klein wrote: > >> Binaries for 3.1.1 are now available (for Linux). >> - Almar >> >> >> On 20 December 2012 22:21, Almar Klein wrote: >> >>> A note to Ubuntu 12.10 users: >>> I just found out (thanks to Thomas Kluyver) that on Ubuntu 12.10, IEP >>> has some nasty problem between Qt and X, related to scrollbar overlays. >>> I will fix this asap and come with a 3.1.1 release. >>> >>> - Almar >>> >>> >>> >>> On 20 December 2012 17:16, Almar Klein wrote: >>> >>>> Dear all, >>>> >>>> We're pleased to announce version 3.1 of the Interactive Editor for >>>> Python. >>>> >>>> IEP is a cross-platform Python IDE focused on interactivity and >>>> introspection, which makes it very suitable for scientific computing. Its >>>> practical design is aimed at simplicity and efficiency. IEP is written >>>> in Python 3 and Qt. >>>> >>>> Binaries are available for Windows, Linux, and Mac. 
>>>> >>>> A selection of new features are: >>>> >>>> - a new powerful menu for the shells widget >>>> - multiple lines can be copy-pasted (and executed) in the shell >>>> - run selection (F9) runs the selected statement if on a single line >>>> - multiple comments/docstring reshaping >>>> - goto-line >>>> - the shell widget can now be moved on Mac >>>> - a wizard to help new users on their way >>>> >>>> >>>> Website: http://code.google.com/p/iep/ >>>> Discussion group: http://groups.google.com/group/iep_ >>>> Release notes: h*ttp://code.google.com/p/iep/wiki/Release >>>> #version_3.1_(release_candidate_out_now)* >>>> >>>> Regards, >>>> Rob, Ludo & Almar >>>> >>>> >>> >>> >>> -- >>> Almar Klein, PhD >>> Science Applied >>> phone: +31 6 19268652 >>> e-mail: a.klein at science-applied.nl >>> >> >> >> >> -- >> Almar Klein, PhD >> Science Applied >> phone: +31 6 19268652 >> e-mail: a.klein at science-applied.nl >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > > -- > Fl?vio Code?o Coelho > ================ > +55(21) 3799-5567 > Professor > Escola de Matem?tica Aplicada > Funda??o Get?lio Vargas > Praia de Botafogo, 190 sala 312 > Rio de Janeiro - RJ > 22250-900 > Brasil > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- Almar Klein, PhD Science Applied phone: +31 6 19268652 e-mail: a.klein at science-applied.nl -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewhdmundell at gmail.com Fri Dec 21 09:21:43 2012 From: andrewhdmundell at gmail.com (amundell) Date: Fri, 21 Dec 2012 06:21:43 -0800 (PST) Subject: [SciPy-User] [SciPy-user] One Sample Sign Test for a Median in SciPy? Message-ID: <34823879.post@talk.nabble.com> Is there a One Sample Sign Test in SciPy?, i.e. 
non parametric test for testing the null hypothesis that the median of a population, from which the tested sample is drawn, is equal to a hypothesised value. This type of test is included in the BSDA library in R (SIGN.test: http://rss.acs.unt.edu/Rdoc/library/BSDA/html/sign.test.html) and I was looking to see if I could make some comparisons with SciPy. . Thanks Andrew. -- View this message in context: http://old.nabble.com/One-Sample-Sign-Test-for-a-Median-in-SciPy--tp34823879p34823879.html Sent from the Scipy-User mailing list archive at Nabble.com. From fccoelho at gmail.com Fri Dec 21 13:11:34 2012 From: fccoelho at gmail.com (Flavio Coelho) Date: Fri, 21 Dec 2012 16:11:34 -0200 Subject: [SciPy-User] ANN: IEP 3.1 - the Interactive Editor for Python In-Reply-To: References: Message-ID: yes, but the installation is broken (at least for me: ubuntu 12.10): $ sudo pip install iep [sudo] password for fccoelho: Downloading/unpacking iep Downloading iep-3.1.1.linux64.tar.gz (17.3Mb): 17.3Mb downloaded Running setup.py egg_info for package iep Traceback (most recent call last): File "", line 14, in IOError: [Errno 2] No such file or directory: '/home/fccoelho/build/iep/setup.py' Complete output from command python setup.py egg_info: Traceback (most recent call last): File "", line 14, in IOError: [Errno 2] No such file or directory: '/home/fccoelho/build/iep/setup.py' ---------------------------------------- Command python setup.py egg_info failed with error code 1 in /home/fccoelho/build/iep Storing complete log in /home/fccoelho/.pip/pip.log On Fri, Dec 21, 2012 at 11:00 AM, Almar Klein wrote: > No, it would not be too much. In fact there already is one: > http://pypi.python.org/pypi/iep/3.1 > > (but I do realize now that I should update it to 3.1.1, will do that right > now) > > - Almar > > > On 21 December 2012 11:26, Flavio Coelho wrote: > >> Would it be too much to ask for a IEP package in PyPI? it would greatly >> facilitate installation... Spyder has one. 
>> >> cheers, >> >> Fl?vio >> >> On Fri, Dec 21, 2012 at 12:35 AM, Almar Klein > > wrote: >> >>> Binaries for 3.1.1 are now available (for Linux). >>> - Almar >>> >>> >>> On 20 December 2012 22:21, Almar Klein wrote: >>> >>>> A note to Ubuntu 12.10 users: >>>> I just found out (thanks to Thomas Kluyver) that on Ubuntu 12.10, IEP >>>> has some nasty problem between Qt and X, related to scrollbar overlays. >>>> I will fix this asap and come with a 3.1.1 release. >>>> >>>> - Almar >>>> >>>> >>>> >>>> On 20 December 2012 17:16, Almar Klein wrote: >>>> >>>>> Dear all, >>>>> >>>>> We're pleased to announce version 3.1 of the Interactive Editor for >>>>> Python. >>>>> >>>>> IEP is a cross-platform Python IDE focused on interactivity and >>>>> introspection, which makes it very suitable for scientific computing. Its >>>>> practical design is aimed at simplicity and efficiency. IEP is >>>>> written in Python 3 and Qt. >>>>> >>>>> Binaries are available for Windows, Linux, and Mac. >>>>> >>>>> A selection of new features are: >>>>> >>>>> - a new powerful menu for the shells widget >>>>> - multiple lines can be copy-pasted (and executed) in the shell >>>>> - run selection (F9) runs the selected statement if on a single >>>>> line >>>>> - multiple comments/docstring reshaping >>>>> - goto-line >>>>> - the shell widget can now be moved on Mac >>>>> - a wizard to help new users on their way >>>>> >>>>> >>>>> Website: http://code.google.com/p/iep/ >>>>> Discussion group: http://groups.google.com/group/iep_ >>>>> Release notes: h*ttp://code.google.com/p/iep/wiki/Release >>>>> #version_3.1_(release_candidate_out_now)* >>>>> >>>>> Regards, >>>>> Rob, Ludo & Almar >>>>> >>>>> >>>> >>>> >>>> -- >>>> Almar Klein, PhD >>>> Science Applied >>>> phone: +31 6 19268652 >>>> e-mail: a.klein at science-applied.nl >>>> >>> >>> >>> >>> -- >>> Almar Klein, PhD >>> Science Applied >>> phone: +31 6 19268652 >>> e-mail: a.klein at science-applied.nl >>> >>> 
_______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> >> >> -- >> Fl?vio Code?o Coelho >> ================ >> +55(21) 3799-5567 >> Professor >> Escola de Matem?tica Aplicada >> Funda??o Get?lio Vargas >> Praia de Botafogo, 190 sala 312 >> Rio de Janeiro - RJ >> 22250-900 >> Brasil >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > > -- > Almar Klein, PhD > Science Applied > phone: +31 6 19268652 > e-mail: a.klein at science-applied.nl > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- Fl?vio Code?o Coelho ================ +55(21) 3799-5567 Professor Escola de Matem?tica Aplicada Funda??o Get?lio Vargas Praia de Botafogo, 190 sala 312 Rio de Janeiro - RJ 22250-900 Brasil -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From osman at fuse.net Fri Dec 21 13:27:48 2012 From: osman at fuse.net (osman buyukisik) Date: Fri, 21 Dec 2012 13:27:48 -0500 Subject: [SciPy-User] ANN: IEP 3.1 - the Interactive Editor for Python In-Reply-To: References: Message-ID: <50D4AA24.2040404@fuse.net> On 12/21/2012 01:11 PM, Flavio Coelho wrote: > yes, but the installation is broken (at least for me: ubuntu 12.10): > > $ sudo pip install iep > [sudo] password for fccoelho: > Downloading/unpacking iep > Downloading iep-3.1.1.linux64.tar.gz (17.3Mb): 17.3Mb downloaded > Running setup.py egg_info for package iep > Traceback (most recent call last): > File "", line 14, in > IOError: [Errno 2] No such file or directory: > '/home/fccoelho/build/iep/setup.py' > Complete output from command python setup.py egg_info: > Traceback (most recent call last): > > File "", line 14, in > > IOError: [Errno 2] No such file or directory: > '/home/fccoelho/build/iep/setup.py' > > ---------------------------------------- > Command python setup.py egg_info failed with error code 1 in > /home/fccoelho/build/iep > Storing complete log in /home/fccoelho/.pip/pip.log > > With ubuntu 11.10 I have the same error: sudo pip install iep [sudo] password for osman: Downloading/unpacking iep Downloading iep-3.1.1.linux64.tar.gz (17.3Mb): 17.3Mb downloaded Running setup.py egg_info for package iep Traceback (most recent call last): File "", line 14, in IOError: [Errno 2] No such file or directory: '/home/osman/build/iep/setup.py' Complete output from command python setup.py egg_info: Traceback (most recent call last): File "", line 14, in IOError: [Errno 2] No such file or directory: '/home/osman/build/iep/setup.py' ---------------------------------------- Command python setup.py egg_info failed with error code 1 Storing complete log in /home/osman/.pip/pip.log osman at osman-AV019AA-ABA-p6228p:~$ From josef.pktd at gmail.com Fri Dec 21 13:49:17 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 21 Dec 2012 
13:49:17 -0500 Subject: [SciPy-User] [SciPy-user] One Sample Sign Test for a Median in SciPy? In-Reply-To: <34823879.post@talk.nabble.com> References: <34823879.post@talk.nabble.com> Message-ID: On Fri, Dec 21, 2012 at 9:21 AM, amundell wrote: > > Is there a One Sample Sign Test in SciPy?, i.e. non parametric test for > testing the null hypothesis that the median of a population, from which the > tested sample is drawn, is equal to a hypothesised value. > This type of test is included in the BSDA library in R (SIGN.test: > http://rss.acs.unt.edu/Rdoc/library/BSDA/html/sign.test.html) and I was > looking to see if I could make some comparisons with SciPy. Thanks, Andrew. I thought it is somewhere, but I don't find any for the example in the R help, maybe something like this >>> xs = (7.8, 6.6, 6.5, 7.4, 7.3, 7., 6.4, 7.1, 6.7, 7.6, 6.8) >>> xs = np.asarray(xs) >>> med=6.5; stats.binom_test([(xs<med).sum(), (xs>med).sum()], p=0.5) 0.021484374999999997 same p-value as in R A one-sample version of wilcoxon signed rank test also seems to be missing. Josef > > -- > View this message in context: http://old.nabble.com/One-Sample-Sign-Test-for-a-Median-in-SciPy--tp34823879p34823879.html > Sent from the Scipy-User mailing list archive at Nabble.com. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From andrewhdmundell at gmail.com Fri Dec 21 14:37:01 2012 From: andrewhdmundell at gmail.com (amundell) Date: Fri, 21 Dec 2012 11:37:01 -0800 (PST) Subject: [SciPy-User] [SciPy-user] One Sample Sign Test for a Median in SciPy? In-Reply-To: References: <34823879.post@talk.nabble.com> Message-ID: <34825074.post@talk.nabble.com> Thanks Josef, I tried your suggestion with the data set I was running and getting the same result (p-value) as R and with third-party software. I will let you know if I encounter any further issues.
I will be testing the one-sample wilcoxon signed rank test at some future date, so thanks for letting me know that it's not present. Cheers, Andrew josef.pktd wrote: > > On Fri, Dec 21, 2012 at 9:21 AM, amundell > wrote: >> >> Is there a One Sample Sign Test in SciPy?, i.e. non parametric test for >> testing the null hypothesis that the median of a population, from which >> the >> tested sample is drawn, is equal to a hypothesised value. >> This type of test is included in the BSDA library in R (SIGN.test: >> http://rss.acs.unt.edu/Rdoc/library/BSDA/html/sign.test.html) and I was >> looking to see if I could make some comparisons with SciPy. Thanks, >> Andrew. > > I thought it is somewhere, but I don't find any > > for the example in the R help, maybe something like this > >>>> xs = (7.8, 6.6, 6.5, 7.4, 7.3, 7., 6.4, 7.1, 6.7, 7.6, 6.8) >>>> xs = np.asarray(xs) >>>> med=6.5; stats.binom_test([(xs<med).sum(), (xs>med).sum()], p=0.5) > 0.021484374999999997 > same p-value as in R > > A one-sample version of wilcoxon signed rank test also seems to be > missing. > > Josef > >> >> -- >> View this message in context: >> http://old.nabble.com/One-Sample-Sign-Test-for-a-Median-in-SciPy--tp34823879p34823879.html >> Sent from the Scipy-User mailing list archive at Nabble.com. >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- View this message in context: http://old.nabble.com/One-Sample-Sign-Test-for-a-Median-in-SciPy--tp34823879p34825074.html Sent from the Scipy-User mailing list archive at Nabble.com.
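[Editor's note] The binomial construction josef describes above can be written out without scipy at all. A minimal pure-Python sketch of the two-sided one-sample sign test (the helper name `sign_test` is illustrative, not a scipy or R API); as in R's SIGN.test, observations tied with the hypothesised median are dropped:

```python
from math import comb  # Python 3.8+

def sign_test(data, med):
    """Two-sided one-sample sign test against hypothesised median `med`.

    Counts observations above and below `med` (ties are excluded) and
    computes the exact two-sided binomial p-value for p = 0.5.
    """
    above = sum(1 for v in data if v > med)
    below = sum(1 for v in data if v < med)
    n = above + below                 # ties with the median are dropped
    k = min(above, below)
    # Two-sided p-value: 2 * P(X <= k) for X ~ Binomial(n, 0.5), capped at 1
    p = 2 * sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(p, 1.0)

xs = (7.8, 6.6, 6.5, 7.4, 7.3, 7.0, 6.4, 7.1, 6.7, 7.6, 6.8)
print(sign_test(xs, 6.5))  # 0.021484375
```

On the R-help data used in the thread this reproduces the 0.0215 p-value reported above (9 of 10 untied observations lie above 6.5).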
From ralf.gommers at gmail.com Fri Dec 21 16:33:57 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 21 Dec 2012 22:33:57 +0100 Subject: [SciPy-User] spline interpolators silently failing on NaNs In-Reply-To: References: Message-ID: On Thu, Dec 20, 2012 at 3:52 AM, Michael Aye wrote: > Hi, > > I was just bitten by something nasty. > I am fitting some data from pandas, and as pandas deals so > fantastically with NaNs, I forgot to check myself before handing the > data over to InterpolatedUnivariateSpline. > To my surprise and after 2 hours of 'practising' all kinds of > interpolations, I realised my overlook. > To my next surprise the creation of the interpolator function did not > complain a single bit (pun not intented). Just the application of the > interpolator to the new x-data returned only NaNs. > > It is also notable, that griddata had no such problem, but I cannot use > it because I need extrapolation as well. > > Should / Could the spline interpolators maybe check for NaNs and > complain at their existence? > Yes, a check in functions in fitpack2.py for NaNs as well as duplicate points and monotonicity where applicable would be quite useful. If anyone would like to have a go at it, that would be great. It wouldn't be too hard to implement - would be a nice problem to get started with contributing to scipy perhaps. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From rainexpected at theo.to Fri Dec 21 16:49:23 2012 From: rainexpected at theo.to (Ted To) Date: Fri, 21 Dec 2012 16:49:23 -0500 Subject: [SciPy-User] ANN: IEP 3.1 - the Interactive Editor for Python In-Reply-To: <50D4AA24.2040404@fuse.net> References: <50D4AA24.2040404@fuse.net> Message-ID: <50D4D963.1000401@theo.to> Same error with a Wheezy box. 
On 12/21/2012 01:27 PM, osman buyukisik wrote: > On 12/21/2012 01:11 PM, Flavio Coelho wrote: >> yes, but the installation is broken (at least for me: ubuntu 12.10): >> >> $ sudo pip install iep >> [sudo] password for fccoelho: >> Downloading/unpacking iep >> Downloading iep-3.1.1.linux64.tar.gz (17.3Mb): 17.3Mb downloaded >> Running setup.py egg_info for package iep >> Traceback (most recent call last): >> File "", line 14, in >> IOError: [Errno 2] No such file or directory: >> '/home/fccoelho/build/iep/setup.py' >> Complete output from command python setup.py egg_info: >> Traceback (most recent call last): >> >> File "", line 14, in >> >> IOError: [Errno 2] No such file or directory: >> '/home/fccoelho/build/iep/setup.py' >> >> ---------------------------------------- >> Command python setup.py egg_info failed with error code 1 in >> /home/fccoelho/build/iep >> Storing complete log in /home/fccoelho/.pip/pip.log >> >> > With ubuntu 11.10 I have the same error: > sudo pip install iep > [sudo] password for osman: > Downloading/unpacking iep > Downloading iep-3.1.1.linux64.tar.gz (17.3Mb): 17.3Mb downloaded > Running setup.py egg_info for package iep > Traceback (most recent call last): > File "", line 14, in > IOError: [Errno 2] No such file or directory: > '/home/osman/build/iep/setup.py' > Complete output from command python setup.py egg_info: > Traceback (most recent call last): > > File "", line 14, in > > IOError: [Errno 2] No such file or directory: > '/home/osman/build/iep/setup.py' > > ---------------------------------------- > Command python setup.py egg_info failed with error code 1 > Storing complete log in /home/osman/.pip/pip.log > osman at osman-AV019AA-ABA-p6228p:~$ > > From pav at iki.fi Fri Dec 21 17:02:09 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 22 Dec 2012 00:02:09 +0200 Subject: [SciPy-User] spline interpolators silently failing on NaNs In-Reply-To: References: Message-ID: 21.12.2012 23:33, Ralf Gommers kirjoitti: [clip] > Yes, a 
check in functions in fitpack2.py for NaNs as well as duplicate > points and monotonicity where applicable would be quite useful. The NaN check could be done on the output of the spline fitting routine --- then, you can just check one of the values (since NaNs always propagate to fill the whole spline fit), rather than needing to check all of the input values, which is slower. -- Pauli Virtanen From a.klein at science-applied.nl Fri Dec 21 17:09:24 2012 From: a.klein at science-applied.nl (Almar Klein) Date: Fri, 21 Dec 2012 23:09:24 +0100 Subject: [SciPy-User] ANN: IEP 3.1 - the Interactive Editor for Python In-Reply-To: <50D4AA24.2040404@fuse.net> References: <50D4AA24.2040404@fuse.net> Message-ID: Mmm. Installation works using the code from the code repository, but I haven't properly tested on the code that distutils uploads to pypi. I more or less assumed that would work as well. Sorry for the inconvenience. My suggestion (besides using the binaries) is to install from the cloned repository, see also https://code.google.com/p/iep/issues/detail?id=139 Thanks for the report!
- Almar On 21 December 2012 19:27, osman buyukisik wrote: > On 12/21/2012 01:11 PM, Flavio Coelho wrote: > > yes, but the installation is broken (at least for me: ubuntu 12.10): > > > > $ sudo pip install iep > > [sudo] password for fccoelho: > > Downloading/unpacking iep > > Downloading iep-3.1.1.linux64.tar.gz (17.3Mb): 17.3Mb downloaded > > Running setup.py egg_info for package iep > > Traceback (most recent call last): > > File "", line 14, in > > IOError: [Errno 2] No such file or directory: > > '/home/fccoelho/build/iep/setup.py' > > Complete output from command python setup.py egg_info: > > Traceback (most recent call last): > > > > File "", line 14, in > > > > IOError: [Errno 2] No such file or directory: > > '/home/fccoelho/build/iep/setup.py' > > > > ---------------------------------------- > > Command python setup.py egg_info failed with error code 1 in > > /home/fccoelho/build/iep > > Storing complete log in /home/fccoelho/.pip/pip.log > > > > > With ubuntu 11.10 I have the same error: > sudo pip install iep > [sudo] password for osman: > Downloading/unpacking iep > Downloading iep-3.1.1.linux64.tar.gz (17.3Mb): 17.3Mb downloaded > Running setup.py egg_info for package iep > Traceback (most recent call last): > File "", line 14, in > IOError: [Errno 2] No such file or directory: > '/home/osman/build/iep/setup.py' > Complete output from command python setup.py egg_info: > Traceback (most recent call last): > > File "", line 14, in > > IOError: [Errno 2] No such file or directory: > '/home/osman/build/iep/setup.py' > > ---------------------------------------- > Command python setup.py egg_info failed with error code 1 > Storing complete log in /home/osman/.pip/pip.log > osman at osman-AV019AA-ABA-p6228p:~$ > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Almar Klein, PhD Science Applied phone: +31 6 19268652 e-mail: a.klein at science-applied.nl 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From a.klein at science-applied.nl Fri Dec 21 18:09:15 2012 From: a.klein at science-applied.nl (Almar Klein) Date: Sat, 22 Dec 2012 00:09:15 +0100 Subject: [SciPy-User] ANN: IEP 3.1 - the Interactive Editor for Python In-Reply-To: <50D4D963.1000401@theo.to> References: <50D4AA24.2040404@fuse.net> <50D4D963.1000401@theo.to> Message-ID: The archive uploaded to pypi works just fine when extracting and using "setup.py install" The problem is that pip tries to pull in the source code from IEPs download page, and downloads the archive that contains the binaries, which, of course, does not have a setup.py. I removed the download_link attribute, but pip's behavior stays the same. Any ideas why that is? - Almar On 21 December 2012 22:49, Ted To wrote: > Same error with a Wheezy box. > > On 12/21/2012 01:27 PM, osman buyukisik wrote: > > On 12/21/2012 01:11 PM, Flavio Coelho wrote: > >> yes, but the installation is broken (at least for me: ubuntu 12.10): > >> > >> $ sudo pip install iep > >> [sudo] password for fccoelho: > >> Downloading/unpacking iep > >> Downloading iep-3.1.1.linux64.tar.gz (17.3Mb): 17.3Mb downloaded > >> Running setup.py egg_info for package iep > >> Traceback (most recent call last): > >> File "", line 14, in > >> IOError: [Errno 2] No such file or directory: > >> '/home/fccoelho/build/iep/setup.py' > >> Complete output from command python setup.py egg_info: > >> Traceback (most recent call last): > >> > >> File "", line 14, in > >> > >> IOError: [Errno 2] No such file or directory: > >> '/home/fccoelho/build/iep/setup.py' > >> > >> ---------------------------------------- > >> Command python setup.py egg_info failed with error code 1 in > >> /home/fccoelho/build/iep > >> Storing complete log in /home/fccoelho/.pip/pip.log > >> > >> > > With ubuntu 11.10 I have the same error: > > sudo pip install iep > > [sudo] password for osman: > > Downloading/unpacking iep > > 
Downloading iep-3.1.1.linux64.tar.gz (17.3Mb): 17.3Mb downloaded > > Running setup.py egg_info for package iep > > Traceback (most recent call last): > > File "", line 14, in > > IOError: [Errno 2] No such file or directory: > > '/home/osman/build/iep/setup.py' > > Complete output from command python setup.py egg_info: > > Traceback (most recent call last): > > > > File "", line 14, in > > > > IOError: [Errno 2] No such file or directory: > > '/home/osman/build/iep/setup.py' > > > > ---------------------------------------- > > Command python setup.py egg_info failed with error code 1 > > Storing complete log in /home/osman/.pip/pip.log > > osman at osman-AV019AA-ABA-p6228p:~$ > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Almar Klein, PhD Science Applied phone: +31 6 19268652 e-mail: a.klein at science-applied.nl -------------- next part -------------- An HTML attachment was scrubbed... URL: From javgarber at gmail.com Fri Dec 21 20:06:39 2012 From: javgarber at gmail.com (Javier) Date: Sat, 22 Dec 2012 01:06:39 +0000 (UTC) Subject: [SciPy-User] Scipy installation error Message-ID: Hi all! I just bought a new computer and I haven't been able to get scipy working, I've followed this manual: http://www.scipy.org/Installing_SciPy/Linux and I got the error below so I tried with pip install and I also got the same error. 
The installations don't return any errors, but when I try to use it: from scipy.stats import * File "/usr/lib/python2.7/dist-packages/scipy/stats/__init__.py", line 321, in <module> from stats import * File "/usr/lib/python2.7/dist-packages/scipy/stats/stats.py", line 194, in <module> import scipy.linalg as linalg File "/usr/lib/python2.7/dist-packages/scipy/linalg/__init__.py", line 133, in <module> from basic import * File "/usr/lib/python2.7/dist-packages/scipy/linalg/basic.py", line 12, in <module> from lapack import get_lapack_funcs File "/usr/lib/python2.7/dist-packages/scipy/linalg/lapack.py", line 14, in <module> from scipy.linalg import flapack ImportError: /usr/lib/python2.7/dist-packages/scipy/linalg/flapack.so: undefined symbol: sgges_ Hope somebody can help me, it's driving me crazy... Thanks! From takowl at gmail.com Sat Dec 22 06:57:16 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Sat, 22 Dec 2012 11:57:16 +0000 Subject: [SciPy-User] Scipy installation error In-Reply-To: References: Message-ID: Hi Javier, On 22 December 2012 01:06, Javier wrote: > I just bought a new computer and I haven't been able to get scipy working, > I've > followed this manual: http://www.scipy.org/Installing_SciPy/Linux and I > got the > error below so I tried with pip install and I also got the same error. The > installations don't return any errors, but when I try to use it: Which distribution do you use? In most cases, it's easiest to install numpy & scipy from the repositories for your distribution. E.g. on Ubuntu, "sudo apt-get install python-scipy". It's probably only worth installing from source if you need something that was added in the newest version of Scipy. Best wishes, Thomas -------------- next part -------------- An HTML attachment was scrubbed...
URL: From pawel.kw at gmail.com Sat Dec 22 07:51:02 2012 From: pawel.kw at gmail.com (pawel.kw at gmail.com) Date: Sat, 22 Dec 2012 13:51:02 +0100 Subject: [SciPy-User] spline interpolators silently failing on NaNs In-Reply-To: Message-ID: <50d5acb6.236bb40a.031f.643d@mx.google.com> Hi, I have also encountered this problem on several occasions - every time I forget about this and it takes some time before I realise what's wrong. I could try to implement a NaN check, resulting in an error message. Can someone point me to some instructions on how to contribute properly? I guess there's an official way of submitting code - I've never done this. I have some experience with svn and git (I use it regularly for some private projects), I'd be happy to contribute. Cheers, Pawel On Dec 21, 2012 23:02, Pauli Virtanen wrote: 21.12.2012 23:33, Ralf Gommers kirjoitti: [clip] > Yes, a check in functions in fitpack2.py for NaNs as well as duplicate > points and monotonicity where applicable would be quite useful. The NaN check could be done on the output of the spline fitting routine --- then, you can just check one of the values (since NaNs always propagate to fill the whole spline fit), rather than needing to check all of the input values, which is slower. -- Pauli Virtanen _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed...
URL: From pav at iki.fi Sat Dec 22 08:21:39 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 22 Dec 2012 15:21:39 +0200 Subject: [SciPy-User] spline interpolators silently failing on NaNs In-Reply-To: <50d5acb6.236bb40a.031f.643d@mx.google.com> References: <50d5acb6.236bb40a.031f.643d@mx.google.com> Message-ID: 22.12.2012 14:51, pawel.kw at gmail.com kirjoitti: > I have also encountered this problem on several occasions-every time I > forget about this and it takes some time before I realise what's wrong. > I could try to implement a NaN check, resulting with an error message. > Can someone point me to some instructions on how to contribute properly? > I guess there's an official way of submitting code - I've never done > this. I have some experience with svn and git (I use it regularly for > some private projects), I'd be happy to contribute. You can take a look at here: https://github.com/scipy/scipy/blob/master/HACKING.rst.txt These checks would go around here: https://github.com/scipy/scipy/blob/master/scipy/interpolate/fitpack2.py#L147 https://github.com/scipy/scipy/blob/master/scipy/interpolate/fitpack.py#L469 https://github.com/scipy/scipy/blob/master/scipy/interpolate/fitpack.py#L898 https://github.com/scipy/scipy/blob/master/scipy/interpolate/fitpack2.py#L634 https://github.com/scipy/scipy/blob/master/scipy/interpolate/fitpack2.py#L706 https://github.com/scipy/scipy/blob/master/scipy/interpolate/fitpack2.py#L766 and so on... (Yes, there are two separate fitpack wrappers.) -- Pauli Virtanen From kevin.gullikson.signup at gmail.com Sat Dec 22 11:22:38 2012 From: kevin.gullikson.signup at gmail.com (Kevin Gullikson) Date: Sat, 22 Dec 2012 10:22:38 -0600 Subject: [SciPy-User] spline interpolators silently failing on NaNs In-Reply-To: <50d5acb6.236bb40a.031f.643d@mx.google.com> References: <50d5acb6.236bb40a.031f.643d@mx.google.com> Message-ID: We may also want to check that the x array is monotonic. 
I had an issue once where one of the x points was repeated and it made the whole spline return NaNs (also silently) On Sat, Dec 22, 2012 at 6:51 AM, wrote: > Hi, > > I have also encountered this problem on several occasions-every time I > forget about this and it takes some time before I realise what's wrong. I > could try to implement a NaN check, resulting with an error message. Can > someone point me to some instructions on how to contribute properly? I > guess there's an official way of submitting code - I've never done this. I > have some experience with svn and git (I use it regularly for some private > projects), I'd be happy to contribute. > > Cheers, > > Pawel > > > ------------------------------ > On Dec 21, 2012 23:02, Pauli Virtanen wrote: > 21.12.2012 23:33, Ralf Gommers kirjoitti: > [clip] > > Yes, a check in functions in fitpack2.py for NaNs as well as duplicate > > points and monotonicity where applicable would be quite useful. > > The NaN check could be done on the output of the spline fitting routine > --- then, you can just check one of the values (since NaNs always > propagate to fill the whole spline fit), rather than needing to check > all of the input values, which is slower. > > -- > Pauli Virtanen > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... 
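[Editor's note] The two silent-failure modes discussed in this thread (NaNs in the input arrays, and repeated or otherwise non-monotonic x values) can both be caught before or after calling the spline routines. A minimal pure-Python sketch; the helper names here are hypothetical, not the scipy API:

```python
import math

def validate_spline_input(x, y):
    """Hypothetical pre-fit check: reject input that would make FITPACK
    silently return an all-NaN spline. `x` and `y` are sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if any(math.isnan(v) for v in x) or any(math.isnan(v) for v in y):
        raise ValueError("x and y must not contain NaN")
    # Duplicate x values were the case reported in this thread
    if any(b <= a for a, b in zip(x, x[1:])):
        raise ValueError("x must be strictly increasing (no duplicate points)")

def fit_is_nan(coeffs):
    """The cheaper post-fit check suggested above: NaNs propagate through
    the whole fit, so inspecting a single spline coefficient is enough."""
    return math.isnan(coeffs[0])
```

The post-fit check is O(1) per fit; the input scan is O(n) but also catches the duplicate-x case before any work is done.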
URL: From pawel.kw at gmail.com Sat Dec 22 12:28:12 2012 From: pawel.kw at gmail.com (=?ISO-8859-2?Q?Pawe=B3_Kwa=B6niewski?=) Date: Sat, 22 Dec 2012 18:28:12 +0100 Subject: [SciPy-User] spline interpolators silently failing on NaNs In-Reply-To: References: <50d5acb6.236bb40a.031f.643d@mx.google.com> Message-ID: OK, I'll try to work on that, but I must warn you that it won't be quick - I'm starting my Christmas travel tomorrow... Pawe? 2012/12/22 Kevin Gullikson > We may also want to check that the x array is monotonic. I had an issue > once where one of the x points was repeated and it made the whole spline > return NaNs (also silently) > > > On Sat, Dec 22, 2012 at 6:51 AM, wrote: > >> Hi, >> >> I have also encountered this problem on several occasions-every time I >> forget about this and it takes some time before I realise what's wrong. I >> could try to implement a NaN check, resulting with an error message. Can >> someone point me to some instructions on how to contribute properly? I >> guess there's an official way of submitting code - I've never done this. I >> have some experience with svn and git (I use it regularly for some private >> projects), I'd be happy to contribute. >> >> Cheers, >> >> Pawel >> >> >> ------------------------------ >> On Dec 21, 2012 23:02, Pauli Virtanen wrote: >> 21.12.2012 23:33, Ralf Gommers kirjoitti: >> [clip] >> > Yes, a check in functions in fitpack2.py for NaNs as well as duplicate >> > points and monotonicity where applicable would be quite useful. >> >> The NaN check could be done on the output of the spline fitting routine >> --- then, you can just check one of the values (since NaNs always >> propagate to fill the whole spline fit), rather than needing to check >> all of the input values, which is slower. 
>> >> -- >> Pauli Virtanen >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.klein at science-applied.nl Sat Dec 22 18:27:07 2012 From: a.klein at science-applied.nl (Almar Klein) Date: Sun, 23 Dec 2012 00:27:07 +0100 Subject: [SciPy-User] ANN: IEP 3.1 - the Interactive Editor for Python In-Reply-To: References: <50D4AA24.2040404@fuse.net> <50D4D963.1000401@theo.to> Message-ID: Mmm, the problem was that pip looks in http://pypi.python.org/*simple*/iep/, which also listed the download page from version 3.1 of iep. I removed the 3.1 entry of IEP from pypi, and pip seems to work fine now... On 22 December 2012 00:09, Almar Klein wrote: > The archive uploaded to pypi works just fine when extracting and using > "setup.py install" > > The problem is that pip tries to pull in the source code from IEPs > download page, and downloads the archive that contains the binaries, which, > of course, does not have a setup.py. > > I removed the download_link attribute, but pip's behavior stays the same. > Any ideas why that is? > > - Almar > > > On 21 December 2012 22:49, Ted To wrote: > >> Same error with a Wheezy box. 
>> >> On 12/21/2012 01:27 PM, osman buyukisik wrote: >> > On 12/21/2012 01:11 PM, Flavio Coelho wrote: >> >> yes, but the installation is broken (at least for me: ubuntu 12.10): >> >> >> >> $ sudo pip install iep >> >> [sudo] password for fccoelho: >> >> Downloading/unpacking iep >> >> Downloading iep-3.1.1.linux64.tar.gz (17.3Mb): 17.3Mb downloaded >> >> Running setup.py egg_info for package iep >> >> Traceback (most recent call last): >> >> File "<string>", line 14, in <module> >> >> IOError: [Errno 2] No such file or directory: >> >> '/home/fccoelho/build/iep/setup.py' >> >> Complete output from command python setup.py egg_info: >> >> Traceback (most recent call last): >> >> >> >> File "<string>", line 14, in <module> >> >> >> >> IOError: [Errno 2] No such file or directory: >> >> '/home/fccoelho/build/iep/setup.py' >> >> >> >> ---------------------------------------- >> >> Command python setup.py egg_info failed with error code 1 in >> >> /home/fccoelho/build/iep >> >> Storing complete log in /home/fccoelho/.pip/pip.log >> >> >> >> >> > With ubuntu 11.10 I have the same error: >> > sudo pip install iep >> > [sudo] password for osman: >> > Downloading/unpacking iep >> > Downloading iep-3.1.1.linux64.tar.gz (17.3Mb): 17.3Mb downloaded >> > Running setup.py egg_info for package iep >> > Traceback (most recent call last): >> > File "<string>", line 14, in <module> >> > IOError: [Errno 2] No such file or directory: >> > '/home/osman/build/iep/setup.py' >> > Complete output from command python setup.py egg_info: >> > Traceback (most recent call last): >> > >> > File "<string>", line 14, in <module> >> > >> > IOError: [Errno 2] No such file or directory: >> > '/home/osman/build/iep/setup.py' >> > >> > ---------------------------------------- >> > Command python setup.py egg_info failed with error code 1 >> > Storing complete log in /home/osman/.pip/pip.log >> > osman at osman-AV019AA-ABA-p6228p:~$ >> > >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> 
http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > -- > Almar Klein, PhD > Science Applied > phone: +31 6 19268652 > e-mail: a.klein at science-applied.nl > -- Almar Klein, PhD Science Applied phone: +31 6 19268652 e-mail: a.klein at science-applied.nl -------------- next part -------------- An HTML attachment was scrubbed... URL: From almar.klein at gmail.com Thu Dec 20 11:14:05 2012 From: almar.klein at gmail.com (Almar Klein) Date: Thu, 20 Dec 2012 17:14:05 +0100 Subject: [SciPy-User] ANN: IEP 3.1 - the Interactive Editor for Python Message-ID: Dear all, We're pleased to announce version 3.1 of the Interactive Editor for Python. IEP is a cross-platform Python IDE focused on interactivity and introspection, which makes it very suitable for scientific computing. Its practical design is aimed at simplicity and efficiency. IEP is written in Python 3 and Qt. Binaries are available for Windows, Linux, and Mac. A selection of new features are: - a new powerful menu for the shells widget - multiple lines can be copy-pasted (and executed) in the shell - run selection (F9) runs the selected statement if on a single line - multiple comments/docstring reshaping - goto-line - the shell widget can now be moved on Mac - a wizard to help new users on their way Website: http://code.google.com/p/iep/ Discussion group: http://groups.google.com/group/iep_ Release notes: http://code.google.com/p/iep/wiki/Release#version_3.1_(release_candidate_out_now) Regards, Rob, Ludo & Almar -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From btemperton at gmail.com Thu Dec 20 17:44:43 2012 From: btemperton at gmail.com (Ben Temperton) Date: Thu, 20 Dec 2012 14:44:43 -0800 (PST) Subject: [SciPy-User] Identifying variables driving divergence in Scipy Message-ID: <7f28c3ba-46cf-42c5-a9fe-6fa7e6e27c93@googlegroups.com> Hi there, I have a matrix of 37 samples and ~1300 variables that are clearly clustering into 3 distinct groups using: import scipy.cluster.hierarchy as sch import scipy.spatial.distance as dist d1= dist.pdist(mtx2, 'braycurtis') Y1 = sch.linkage(d1, method='average', metric='braycurtis') Z1 = sch.dendrogram(Y1, labels=mtx2.index) What I would like to do is find out which variables are driving the divergence in this clustering, i.e. which ones are greater in group A than group B and vice versa. In tools like Primer, this would be done with a SIMPER test, and I am hoping someone can point me in the direction of a Scipy equivalent as I am trying to keep all my data analysis in ipython notebook. Many thanks, Ben -------------- next part -------------- An HTML attachment was scrubbed... URL: From l.j.j.vandervelden at gmail.com Sat Dec 22 10:35:42 2012 From: l.j.j.vandervelden at gmail.com (Luuk van der Velden) Date: Sat, 22 Dec 2012 07:35:42 -0800 (PST) Subject: [SciPy-User] Avoiding inner for loops?? In-Reply-To: <82d5dfed-e68c-44b6-94f9-7f2557ddece4@googlegroups.com> References: <82d5dfed-e68c-44b6-94f9-7f2557ddece4@googlegroups.com> Message-ID: Consider using broadcasting to three 3D parameters matrices (A,B,C), then creating a 'ufunc' that takes three parameter at a time (a,b,c) for every identical position in the three arrays (A,B,C). So giving the broadcasting arrays as input to a ufunction which maps the model function on the three paramater arrays. greets, Luuk On Sunday, August 19, 2012 11:07:59 AM UTC+2, Martin De Kauwe wrote: > > Hi, > > I need to avoid (at least) two inner for loops in what I am trying to do > otherwise my processing takes forever. 
What is the best way to transfer > what I am doing into a more "numpy way"? Essentially I am trying to call a > model again for various different parameter combinations. The example is > fictional, by the grid_size would ideally grow > 500 and by doing so the > processing speed becomes very slow the way I have set things up.. > > thanks. > > example. > > > import numpy as np > > def fake_model(data1, data2, p1, p2, p3): > """ complete nonsense """ > return data1 + data2 * p1 * p2 * p3 > > data1 = np.random.rand(10) # the size of this arrays varies might be 10 > might be 15 etc > data2 = np.random.rand(10) # the size of this arrays varies might be 10 > might be 15 etc > obs = np.random.rand(10) # the size of this arrays varies might be 10 > might be 15 etc > > grid_size = 10 # Ideally this would be a large number > param1 = np.linspace(5.0, 350, grid_size) > param2 = np.linspace(5.0, 550, grid_size) > param3 = np.linspace(1E-8, 10.5, grid_size) > ss = np.zeros(0) > > for p1 in param1: > for p2 in param2: > for p3 in param3: > ans = fake_model(data1, data2, p1, p2, p3) > > ss = np.append(ss, np.sum(obs - ans)**2) > print np.sum(obs - ans)**2 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From toogad at gmail.com Sun Dec 23 17:25:00 2012 From: toogad at gmail.com (Kai Zhang) Date: Sun, 23 Dec 2012 16:25:00 -0600 Subject: [SciPy-User] Anyone Interested in Developing Simulation Software for Biological Wastewater Treatment? Message-ID: Hi All, I am an environmental engineer who focus on biological wastewater treatment system design, including both municipal and industrial applications. I am interested in expanding the application of Python and Scipy into biological wastewater treatment modeling. There are expensive commercial softwares like BioWin and GPS-X, but they are proprietary, not available for Linux (not sure about MacOSX), and many engineers in small design firms don't have access to them. Does anyone have the same interest? 
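A sketch of the broadcasting approach Luuk describes, applied to the fictional example above. Inserting axes lets the three parameter vectors broadcast against each other, so every parameter combination is evaluated in one call. Note the intermediate array grows as grid_size**3 * len(data1), so for grid_size near 500 you would need to chunk one axis:

```python
import numpy as np

def fake_model(data1, data2, p1, p2, p3):
    """ complete nonsense """
    return data1 + data2 * p1 * p2 * p3

data1 = np.random.rand(10)
data2 = np.random.rand(10)
obs = np.random.rand(10)

grid_size = 10
# New axes so the parameter vectors broadcast against each other;
# the trailing axis is left free for the data arrays.
p1 = np.linspace(5.0, 350, grid_size)[:, None, None, None]
p2 = np.linspace(5.0, 550, grid_size)[None, :, None, None]
p3 = np.linspace(1e-8, 10.5, grid_size)[None, None, :, None]

# ans has shape (grid_size, grid_size, grid_size, 10)
ans = fake_model(data1, data2, p1, p2, p3)
ss = np.sum(obs - ans, axis=-1) ** 2  # same quantity as np.sum(obs - ans)**2
```

ss.ravel() matches the order of the original nested loops (p1 outermost, p3 innermost).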
Maybe we can form a team to work something out. I have to admit I am fairly new to Python and Scipy. But I am very confident that this can be done nicely with the great support from this community and the easy use of Python. And it will be a great benefit to both students and engineers that are interested in wastewater treatment modeling. Thank you all in advance! Best Regards, Kai Zhang -- Kai Zhang ---------------------------------------------------------- Coincidence is God's way of remaining anonymous. -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Dec 24 15:26:38 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 24 Dec 2012 15:26:38 -0500 Subject: [SciPy-User] scoreatpercentile vectorized Message-ID: An exercise to see if scoreatpercentile can be vectorized https://gist.github.com/4370540 Josef From programmercplusplus at yahoo.com Tue Dec 25 11:16:49 2012 From: programmercplusplus at yahoo.com (Cplusplus Programmer) Date: Tue, 25 Dec 2012 16:16:49 +0000 (GMT) Subject: [SciPy-User] fancy indexing lil_matrix extremely slow Message-ID: <1356452209.74565.YahooMailNeo@web171505.mail.ir2.yahoo.com> Hello All, I am using scipy version 0.9. I have a sparse matrix, of which I do not know beforehand what values it will get. What I did was first generating a scipy.sparse.lil_matrix of zeros (N x N matrix). To assign values to the sparse matrix I use fancy slicing, e.g. for i in range( N ): myLilMatrix[ 0: N-4, i ] = myArrayWithValuesToAssign It works fine, however, it is very very slow. I could not use other sparse matrix types, since those do not support fancy indexing for assignment. I was wondering if there is a way to speed up sparse matrix assignments? If the scipy.sparse module does not support this, does anyone know another module which will be faster? regards, pcpp -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cmutel at gmail.com Tue Dec 25 11:27:42 2012 From: cmutel at gmail.com (Christopher Mutel) Date: Tue, 25 Dec 2012 17:27:42 +0100 Subject: [SciPy-User] fancy indexing lil_matrix extremely slow In-Reply-To: <1356452209.74565.YahooMailNeo@web171505.mail.ir2.yahoo.com> References: <1356452209.74565.YahooMailNeo@web171505.mail.ir2.yahoo.com> Message-ID: It sounds like you would be better served using a COO matrix constructor instead of LIL ( http://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.coo_matrix.html). From your example code, you don't actually need slicing. On Tue, Dec 25, 2012 at 5:16 PM, Cplusplus Programmer < programmercplusplus at yahoo.com> wrote: > Hello All, > > I am using scipy version 0.9. I have a sparse matrix, of which I do not > know beforehand what values it will get. > What I did was first generating a scipy.sparse.lil_matrix of zeros (N x N > matrix). > > To assign values to the sparse matrix I use fancy slicing, e.g. > > for i in range( N ): > myLilMatrix[ 0: N-4, i ] = myArrayWithValuesToAssign > > It works fine, however, it is very very slow. I could not use other sparse > matrix types, since those do not support fancy indexing for assignment. I > was wondering if there is a way to speed up sparse matrix assignments? > > If the scipy.sparse module does not support this, does anyone know another > module which will be faster? > > regards, > > pcpp > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... 
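A sketch of what Christopher's COO suggestion could look like for the loop above: build the (row, col, data) triplets once with plain numpy and hand them to coo_matrix in a single call. N and the assigned values here are made-up stand-ins for the original poster's data:

```python
import numpy as np
from scipy import sparse

N = 100
values = np.random.rand(N - 4)  # stand-in for myArrayWithValuesToAssign

# One entry per (row, column) pair that the original loop touched:
rows = np.tile(np.arange(N - 4), N)    # 0..N-5, once per column
cols = np.repeat(np.arange(N), N - 4)  # column index for each entry
data = np.tile(values, N)              # the same column vector in every column

M = sparse.coo_matrix((data, (rows, cols)), shape=(N, N)).tocsr()
```

coo_matrix is the usual construction format because building it is a single vectorized operation and conversion to CSR/CSC afterwards is cheap.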
Here is a sample: import scipy.sparse as ss D = ss.rand(49837, 1000, format='csr', density=0.1) E = ss.rand(1000, 49837, format='csr', density=0.1) D * E This is the error that I get: --------------------------------------------------------------------------- ValueError Traceback (most recent call last) in () ----> 1 D * E /usr/local/lib/python2.7/dist-packages/scipy/sparse/base.pyc in __mul__(self, other) 254 if self.shape[1] != other.shape[0]: 255 raise ValueError('dimension mismatch') --> 256 return self._mul_sparse_matrix(other) 257 258 try: /usr/local/lib/python2.7/dist-packages/scipy/sparse/compressed.py in _mul_sparse_matrix(self, other) 296 297 nnz = indptr[-1] --> 298 indices = np.empty(nnz, dtype=np.intc) 299 data = np.empty(nnz, dtype=upcast(self.dtype,other.dtype)) 300 ValueError: negative dimensions are not allowed Apparently, the error seems to be because of dtype=np.intc. Is there a work around for this? Thanks and Regards, Pranava From sebastian at sipsolutions.net Tue Dec 25 13:47:06 2012 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 25 Dec 2012 19:47:06 +0100 Subject: [SciPy-User] Help: Multiplying two sparse matrices In-Reply-To: References: Message-ID: <1356461226.26515.3.camel@sebastian-laptop> On Tue, 2012-12-25 at 19:34 +0100, Pranava Swaroop Madhyastha wrote: > Hi, > > I have a problem multiplying two sparse matrices. 
Here is a sample: > > import scipy.sparse as ss > > D = ss.rand(49837, 1000, format='csr', density=0.1) > > E = ss.rand(1000, 49837, format='csr', density=0.1) > > D * E > > This is the error that I get: > > --------------------------------------------------------------------------- > ValueError Traceback (most recent call last) > in () > ----> 1 D * E > > /usr/local/lib/python2.7/dist-packages/scipy/sparse/base.pyc in > __mul__(self, other) > 254 if self.shape[1] != other.shape[0]: > 255 raise ValueError('dimension mismatch') > --> 256 return self._mul_sparse_matrix(other) > 257 > 258 try: > > /usr/local/lib/python2.7/dist-packages/scipy/sparse/compressed.py in > _mul_sparse_matrix(self, other) > 296 > 297 nnz = indptr[-1] > --> 298 indices = np.empty(nnz, dtype=np.intc) > 299 data = np.empty(nnz, dtype=upcast(self.dtype,other.dtype)) > 300 > > ValueError: negative dimensions are not allowed > This is because the result has too many nonzero elements. Scipy seems to use 32bit integers to save positions, and if you have too many non-zero elements, 32bit integers can overflow easily... Maybe if you feel like it, you could do a PR against scipy to change the error message with something understandable. I don't know if there is a great reason to use 32bit integers, though I guess it saves quite a bit of RAM if it is large enough. However, if your result has so many nonzero elements, it seems sounds like the sparsity is probably so low, you probably could just as well use dense matrix (for the result, if possible), for this example. Regards, Sebastian > > Apparently, the error seems to be because of dtype=np.intc. Is there a > work around for this? 
> > Thanks and Regards, > Pranava > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From pranava at lsi.upc.edu Tue Dec 25 14:02:22 2012 From: pranava at lsi.upc.edu (Pranava Swaroop Madhyastha) Date: Tue, 25 Dec 2012 20:02:22 +0100 Subject: [SciPy-User] Help: Multiplying two sparse matrices In-Reply-To: <1356461226.26515.3.camel@sebastian-laptop> References: <1356461226.26515.3.camel@sebastian-laptop> Message-ID: On Tue, Dec 25, 2012 at 7:47 PM, Sebastian Berg wrote: > This is because the result has too many nonzero elements. Scipy seems to > use 32bit integers to save positions, and if you have too many non-zero > elements, 32bit integers can overflow easily... Maybe if you feel like > it, you could do a PR against scipy to change the error message with > something understandable. I don't know if there is a great reason to use > 32bit integers, though I guess it saves quite a bit of RAM if it is > large enough. I will look into it. > > However, if your result has so many nonzero elements, it seems sounds > like the sparsity is probably so low, you probably could just as well > use dense matrix (for the result, if possible), for this example. I am actually constrained by RAM. Is there any other way around without converting it to dense matrix? > > Regards, > > Sebastian From ralf.gommers at gmail.com Wed Dec 26 09:17:32 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 26 Dec 2012 15:17:32 +0100 Subject: [SciPy-User] scoreatpercentile vectorized In-Reply-To: References: Message-ID: On Mon, Dec 24, 2012 at 9:26 PM, wrote: > An exercise to see if scoreatpercentile can be vectorized > > https://gist.github.com/4370540 > Vectorized `per` already part of https://github.com/scipy/scipy/pull/374 Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
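Returning to the sparse-product ValueError above: Sebastian's 32-bit-index explanation can be checked with a quick back-of-the-envelope estimate. Treating nonzeros as independent with the stated densities, the expected fill of D * E already exceeds what a 32-bit index can address:

```python
# Rough fill estimate for C = D * E with D (m x k), E (k x n),
# both 10% dense, treating entries as independent.
m, k, n = 49837, 1000, 49837
d = 0.1
p_nonzero = 1.0 - (1.0 - d * d) ** k  # chance a given entry of C is nonzero
est_nnz = p_nonzero * m * n

print(p_nonzero)             # ~0.99996: the product is effectively dense
print(est_nnz > 2**31 - 1)   # True: too many nonzeros for 32-bit indices
```

So the overflow is not a corner case at these densities: the result is essentially dense, which also supports the suggestion to form the product as a dense array if RAM allows, or to chunk the computation (e.g. a block of rows of D at a time) if it does not.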
URL: From josef.pktd at gmail.com Wed Dec 26 09:37:52 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 26 Dec 2012 09:37:52 -0500 Subject: [SciPy-User] scoreatpercentile vectorized In-Reply-To: References: Message-ID: On Wed, Dec 26, 2012 at 9:17 AM, Ralf Gommers wrote: > > > > On Mon, Dec 24, 2012 at 9:26 PM, wrote: >> >> An exercise to see if scoreatpercentile can be vectorized >> >> https://gist.github.com/4370540 > > > Vectorized `per` already part of https://github.com/scipy/scipy/pull/374 That, like the numpy version, is vectorized by looping over all per Josef > > Ralf > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From ndbecker2 at gmail.com Wed Dec 26 11:38:01 2012 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 26 Dec 2012 11:38:01 -0500 Subject: [SciPy-User] suggestions for modeling Message-ID: I'm looking for some suggestions for modeling some data. Suppose I'm given a set of black boxes, which have some nonlinear input/output relations. Each box will be similar, but there will be unit-to-unit variations. I want to model the general input/output relation as some curve having relatively few parameters to approximate the unit-to-unit variations. Any ideas? From francescoboccacci at libero.it Wed Dec 26 11:47:26 2012 From: francescoboccacci at libero.it (francescoboccacci at libero.it) Date: Wed, 26 Dec 2012 17:47:26 +0100 (CET) Subject: [SciPy-User] UD utilization distribution kernel function Message-ID: <16364885.3721151356540446303.JavaMail.defaultUser@defaultHost> Hi all, i'd like to found some kernel function in scipy to replicate the same behavior of UD calculation of the package adehabitat (R). Is there some kernel functions that can i use? 
Thanks Francesco From bahtiyor_zohidov at mail.ru Tue Dec 25 11:55:45 2012 From: bahtiyor_zohidov at mail.ru (=?UTF-8?B?SGFwcHltYW4=?=) Date: Tue, 25 Dec 2012 20:55:45 +0400 Subject: [SciPy-User] =?utf-8?q?WHICH_ONE_IS_BETTER=3F=3F_Loadtxt=28=29_or?= =?utf-8?b?IG9wZW4oJ2ZpbGVuYW1lJywgJ3InKSBtZXRob2Q=?= Message-ID: <1356454545.292651328@f305.mail.ru> I had 120x150 matrix which I have to read Rows by Cols... for example: M=(120,150) . This matrix exists under name: BigFile.txt So, M=loadtxt ( 'BigFile.txt' ). After processing M matrix I have M1 ?matrix which has the same Row and Cols (120 x 150 ) I saved it: savetxt ( ' Bigprocess.txt ', M1 ). The problem is I can not control Rows and Cols to be the same order of rows and cols as in the M matrix The second way: or is it better to use open ('filename.txt') ?method ?????? QUESTION: 1) WHich one (loadtxt and savetxt) or (open ('filename.txt') and write (data)) faster to write my matrix 2) How can I control over M1 matrix using loadtxt method?????? Any answers will be appreciated!!!! -- ????? ????????? ?????? @Mail.Ru. ??? ??????????? ? ????? ????????? - m.mail.ru -------------- next part -------------- An HTML attachment was scrubbed... URL: From amschne at umich.edu Wed Dec 26 15:31:36 2012 From: amschne at umich.edu (Adam Schneider) Date: Wed, 26 Dec 2012 12:31:36 -0800 Subject: [SciPy-User] numpy.test() fail mac with homebrew python 2.7 Message-ID: After installing numpy version 1.6.2 and scipy version 0.11.0 via pip 1.2.1 for python 2.7.3 (from homebrew) on my MacBook Air with core i7 processors on OS X 10.8.2, I ran tests using numpy.test() and scipy.test() which both failed. Below are three failure lines from the numpy.test, and attached is the full log. FAIL: Test automatically generated assignments which overlap in memory. FAIL: test_nditer.test_iter_broadcasting_errors FAIL: test_nditer.test_iter_array_cast Will you please let me know how big of an issue this is and the best way to fix it? 
Thank you, Adam -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: numpy_fail.log Type: application/octet-stream Size: 57998 bytes Desc: not available URL: From renato.fabbri at gmail.com Wed Dec 26 16:17:52 2012 From: renato.fabbri at gmail.com (Renato Fabbri) Date: Wed, 26 Dec 2012 19:17:52 -0200 Subject: [SciPy-User] WHICH ONE IS BETTER?? Loadtxt() or open('filename', 'r') method In-Reply-To: <1356454545.292651328@f305.mail.ru> References: <1356454545.292651328@f305.mail.ru> Message-ID: try it with ipython's timeit ;-) or make loops and measure them with time.time() itself. paste back results if any pls cheers, happyman 2012/12/25 Happyman : > > I had 120x150 matrix which I have to read Rows by Cols... > for example: > M=(120,150) . This matrix exists under name: BigFile.txt > So, M=loadtxt ( 'BigFile.txt' ). > > After processing M matrix I have M1 matrix which has the same Row and Cols > (120 x 150 ) > > I saved it: savetxt ( ' Bigprocess.txt ', M1 ). The problem is I can not > control Rows and Cols to be the same order of rows and cols as in the M > matrix > > The second way: or is it better to use open ('filename.txt') method ?????? > QUESTION: > 1) WHich one (loadtxt and savetxt) or (open ('filename.txt') and write > (data)) faster to write my matrix > 2) How can I control over M1 matrix using loadtxt method?????? > > Any answers will be appreciated!!!! > > -- > ????? ????????? ?????? @Mail.Ru. > ??? ??????????? ? ????? ????????? 
- m.mail.ru > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- GNU/Linux User #479299 labmacambira.sf.net From ralf.gommers at gmail.com Thu Dec 27 14:37:47 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 27 Dec 2012 20:37:47 +0100 Subject: [SciPy-User] numpy.test() fail mac with homebrew python 2.7 In-Reply-To: References: Message-ID: On Wed, Dec 26, 2012 at 9:31 PM, Adam Schneider wrote: > After installing numpy version 1.6.2 and scipy version 0.11.0 via pip > 1.2.1 for python 2.7.3 (from homebrew) on my MacBook Air with core i7 > processors on OS X 10.8.2, I ran tests using numpy.test() and scipy.test() > which both failed. Below are three failure lines from the numpy.test, and > attached is the full log. > > FAIL: Test automatically generated assignments which overlap in memory. > FAIL: test_nditer.test_iter_broadcasting_errors > FAIL: test_nditer.test_iter_array_cast > > Will you please let me know how big of an issue this is and the best way > to fix it? > Hi Adam, It looks like all the errors come from files left over from a previous numpy install. The basis and cast methods were added in 1.7.x. An "rm -rf numpy/" and clean reinstall should fix it. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Thu Dec 27 16:39:33 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 27 Dec 2012 22:39:33 +0100 Subject: [SciPy-User] numpy.test() fail mac with homebrew python 2.7 In-Reply-To: References: <190AEC12-2F22-461D-8ED0-378801B65ADF@umich.edu> Message-ID: On Thu, Dec 27, 2012 at 10:09 PM, Adam Schneider wrote: > Hey Ralf, thanks for the quick response. 
> > I tried a clean install, but scipy still fails to install with this error: > numpy.distutils.npy_pkg_config.PkgNotFound: Could not find file(s) > ['/usr/local/lib/python2.7/site-packages/numpy/core/lib/npy-pkg-config/npymath.ini'] > > It looks like it is the same issue that people are running into here: > http://stackoverflow.com/questions/12574604/scipy-install-on-mountain-lion-failing > > Trying the workaround suggested on SO and installing numpy from top of > tree allows scipy to install from pip, but numpy.test() fails the following > test: > > FAIL: Test numpy dot with different order C, F > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/usr/local/lib/python2.7/site-packages/nose/case.py", line 197, in > runTest > self.test(*self.arg) > File > "/usr/local/lib/python2.7/site-packages/numpy/core/tests/test_blasdot.py", > line 119, in test_dot_array_order > assert_array_equal(a.dot(a_T), a.dot(a.T)) > File "/usr/local/lib/python2.7/site-packages/numpy/testing/utils.py", > line 719, in assert_array_equal > verbose=verbose, header='Arrays are not equal') > File "/usr/local/lib/python2.7/site-packages/numpy/testing/utils.py", > line 645, in assert_array_compare > raise AssertionError(msg) > AssertionError: > Arrays are not equal > > (mismatch 11.0%) > x: array([[ 8.33980975, -1.73402744, 8.39918067, -0.92846214, > -1.42408847, 3.07286699, 2.08420574, -0.48730198, > 0.1639567 , -1.04598212],... > y: array([[ 8.33980975, -1.73402744, 8.39918067, -0.92846214, > -1.42408847, 3.07286699, 2.08420574, -0.48730198, > 0.1639567 , -1.04598212],... > > ---------------------------------------------------------------------- > Ran 4801 tests in 18.344s > > FAILED (KNOWNFAIL=5, SKIP=5, failures=1) > > > Should this be a known fail? > No, it shouldn't. Doesn't look serious but should be fixed. Could you open an issue on Github, and list the compilers and build command used? 
> Also, should scipy from pip (0.11.0) install cleanly with numpy from pip > (1.6.2)? > It should, but apparently it's broken in a not very reproducible way. That's going to take some time to fix probably. This sort of thing is why I usually avoid pip & co by the way. Ralf > On Thu, Dec 27, 2012 at 12:17 PM, Jordan Schneider wrote: > >> >> >> Sent from my iPad >> >> Begin forwarded message: >> >> *From:* Ralf Gommers >> *Date:* December 27, 2012, 2:37:47 PM EST >> *To:* SciPy Users List >> *Cc:* Jordan Schneider >> *Subject:* *Re: [SciPy-User] numpy.test() fail mac with homebrew python >> 2.7* >> >> >> >> >> On Wed, Dec 26, 2012 at 9:31 PM, Adam Schneider wrote: >> >>> After installing numpy version 1.6.2 and scipy version 0.11.0 via pip >>> 1.2.1 for python 2.7.3 (from homebrew) on my MacBook Air with core i7 >>> processors on OS X 10.8.2, I ran tests using numpy.test() and scipy.test() >>> which both failed. Below are three failure lines from the numpy.test, and >>> attached is the full log. >>> >>> FAIL: Test automatically generated assignments which overlap in memory. >>> FAIL: test_nditer.test_iter_broadcasting_errors >>> FAIL: test_nditer.test_iter_array_cast >>> >>> Will you please let me know how big of an issue this is and the best way >>> to fix it? >>> >> >> Hi Adam, >> >> It looks like all the errors come from files left over from a previous >> numpy install. The basis and cast methods were added in 1.7.x. >> >> An "rm -rf numpy/" and clean reinstall should fix it. >> >> Ralf >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From amschne at umich.edu Thu Dec 27 16:09:20 2012 From: amschne at umich.edu (Adam Schneider) Date: Thu, 27 Dec 2012 13:09:20 -0800 Subject: [SciPy-User] numpy.test() fail mac with homebrew python 2.7 In-Reply-To: <190AEC12-2F22-461D-8ED0-378801B65ADF@umich.edu> References: <190AEC12-2F22-461D-8ED0-378801B65ADF@umich.edu> Message-ID: Hey Ralf, thanks for the quick response. 
I tried a clean install, but scipy still fails to install with this error: numpy.distutils.npy_pkg_config.PkgNotFound: Could not find file(s) ['/usr/local/lib/python2.7/site-packages/numpy/core/lib/npy-pkg-config/npymath.ini'] It looks like it is the same issue that people are running into here: http://stackoverflow.com/questions/12574604/scipy-install-on-mountain-lion-failing Trying the workaround suggested on SO and installing numpy from top of tree allows scipy to install from pip, but numpy.test() fails the following test: FAIL: Test numpy dot with different order C, F ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/usr/local/lib/python2.7/site-packages/numpy/core/tests/test_blasdot.py", line 119, in test_dot_array_order assert_array_equal(a.dot(a_T), a.dot(a.T)) File "/usr/local/lib/python2.7/site-packages/numpy/testing/utils.py", line 719, in assert_array_equal verbose=verbose, header='Arrays are not equal') File "/usr/local/lib/python2.7/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not equal (mismatch 11.0%) x: array([[ 8.33980975, -1.73402744, 8.39918067, -0.92846214, -1.42408847, 3.07286699, 2.08420574, -0.48730198, 0.1639567 , -1.04598212],... y: array([[ 8.33980975, -1.73402744, 8.39918067, -0.92846214, -1.42408847, 3.07286699, 2.08420574, -0.48730198, 0.1639567 , -1.04598212],... ---------------------------------------------------------------------- Ran 4801 tests in 18.344s FAILED (KNOWNFAIL=5, SKIP=5, failures=1) Should this be a known fail? Also, should scipy from pip (0.11.0) install cleanly with numpy from pip (1.6.2)? 
Thanks again, Adam On Thu, Dec 27, 2012 at 12:17 PM, Jordan Schneider wrote: > > > Sent from my iPad > > Begin forwarded message: > > *From:* Ralf Gommers > *Date:* December 27, 2012, 2:37:47 PM EST > *To:* SciPy Users List > *Cc:* Jordan Schneider > *Subject:* *Re: [SciPy-User] numpy.test() fail mac with homebrew python > 2.7* > > > > > On Wed, Dec 26, 2012 at 9:31 PM, Adam Schneider wrote: > >> After installing numpy version 1.6.2 and scipy version 0.11.0 via pip >> 1.2.1 for python 2.7.3 (from homebrew) on my MacBook Air with core i7 >> processors on OS X 10.8.2, I ran tests using numpy.test() and scipy.test() >> which both failed. Below are three failure lines from the numpy.test, and >> attached is the full log. >> >> FAIL: Test automatically generated assignments which overlap in memory. >> FAIL: test_nditer.test_iter_broadcasting_errors >> FAIL: test_nditer.test_iter_array_cast >> >> Will you please let me know how big of an issue this is and the best way >> to fix it? >> > > Hi Adam, > > It looks like all the errors come from files left over from a previous > numpy install. The basis and cast methods were added in 1.7.x. > > An "rm -rf numpy/" and clean reinstall should fix it. > > Ralf > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From amschne at umich.edu Thu Dec 27 21:21:12 2012 From: amschne at umich.edu (Adam Schneider) Date: Thu, 27 Dec 2012 18:21:12 -0800 Subject: [SciPy-User] numpy.test() fail mac with homebrew python 2.7 In-Reply-To: References: <190AEC12-2F22-461D-8ED0-378801B65ADF@umich.edu> Message-ID: Thanks for the help. numpy 1.7.0b2 and 1.6.2 are passing tests. I filed an issue for 1.8.0.dev-a8c641f: https://github.com/numpy/numpy/issues/2860 However, scipy 0.11.0 and top of tree (0.12.0.dev-b69fe18) are not passing tests (installing directly from distutils and from pip). Attached is the output from 0.11.0 tests (72 failures). Any idea why this might be happening? 
Thanks, Adam On Thu, Dec 27, 2012 at 1:39 PM, Ralf Gommers wrote: > > > > On Thu, Dec 27, 2012 at 10:09 PM, Adam Schneider wrote: > >> Hey Ralf, thanks for the quick response. >> >> I tried a clean install, but scipy still fails to install with this error: >> numpy.distutils.npy_pkg_config.PkgNotFound: Could not find file(s) >> ['/usr/local/lib/python2.7/site-packages/numpy/core/lib/npy-pkg-config/npymath.ini'] >> >> It looks like it is the same issue that people are running into here: >> http://stackoverflow.com/questions/12574604/scipy-install-on-mountain-lion-failing >> >> Trying the workaround suggested on SO and installing numpy from top of >> tree allows scipy to install from pip, but numpy.test() fails the following >> test: >> >> FAIL: Test numpy dot with different order C, F >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File "/usr/local/lib/python2.7/site-packages/nose/case.py", line 197, >> in runTest >> self.test(*self.arg) >> File >> "/usr/local/lib/python2.7/site-packages/numpy/core/tests/test_blasdot.py", >> line 119, in test_dot_array_order >> assert_array_equal(a.dot(a_T), a.dot(a.T)) >> File "/usr/local/lib/python2.7/site-packages/numpy/testing/utils.py", >> line 719, in assert_array_equal >> verbose=verbose, header='Arrays are not equal') >> File "/usr/local/lib/python2.7/site-packages/numpy/testing/utils.py", >> line 645, in assert_array_compare >> raise AssertionError(msg) >> AssertionError: >> Arrays are not equal >> >> (mismatch 11.0%) >> x: array([[ 8.33980975, -1.73402744, 8.39918067, -0.92846214, >> -1.42408847, 3.07286699, 2.08420574, -0.48730198, >> 0.1639567 , -1.04598212],... >> y: array([[ 8.33980975, -1.73402744, 8.39918067, -0.92846214, >> -1.42408847, 3.07286699, 2.08420574, -0.48730198, >> 0.1639567 , -1.04598212],... 
>> >> ---------------------------------------------------------------------- >> Ran 4801 tests in 18.344s >> >> FAILED (KNOWNFAIL=5, SKIP=5, failures=1) >> >> >> Should this be a known fail? >> > > No, it shouldn't. Doesn't look serious but should be fixed. Could you open > an issue on Github, and list the compilers and build command used? > > >> Also, should scipy from pip (0.11.0) install cleanly with numpy from pip >> (1.6.2)? >> > > It should, but apparently it's broken in a not very reproducible way. > That's going to take some time to fix probably. This sort of thing is why I > usually avoid pip & co by the way. > > Ralf > > >> On Thu, Dec 27, 2012 at 12:17 PM, Jordan Schneider wrote: >> >>> >>> >>> Sent from my iPad >>> >>> Begin forwarded message: >>> >>> *From:* Ralf Gommers >>> *Date:* December 27, 2012, 2:37:47 PM EST >>> *To:* SciPy Users List >>> *Cc:* Jordan Schneider >>> *Subject:* *Re: [SciPy-User] numpy.test() fail mac with homebrew python >>> 2.7* >>> >>> >>> >>> >>> On Wed, Dec 26, 2012 at 9:31 PM, Adam Schneider wrote: >>> >>>> After installing numpy version 1.6.2 and scipy version 0.11.0 via pip >>>> 1.2.1 for python 2.7.3 (from homebrew) on my MacBook Air with core i7 >>>> processors on OS X 10.8.2, I ran tests using numpy.test() and scipy.test() >>>> which both failed. Below are three failure lines from the numpy.test, and >>>> attached is the full log. >>>> >>>> FAIL: Test automatically generated assignments which overlap in memory. >>>> FAIL: test_nditer.test_iter_broadcasting_errors >>>> FAIL: test_nditer.test_iter_array_cast >>>> >>>> Will you please let me know how big of an issue this is and the best >>>> way to fix it? >>>> >>> >>> Hi Adam, >>> >>> It looks like all the errors come from files left over from a previous >>> numpy install. The basis and cast methods were added in 1.7.x. >>> >>> An "rm -rf numpy/" and clean reinstall should fix it. 
>>>
>>> Ralf
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: scipy_0.11.0_failuretests.log
Type: application/octet-stream
Size: 110394 bytes
Desc: not available
URL: 

From bahtiyor_zohidov at mail.ru  Sat Dec 29 22:32:13 2012
From: bahtiyor_zohidov at mail.ru (Happyman)
Date: Sun, 30 Dec 2012 07:32:13 +0400
Subject: [SciPy-User] 3D array problem in Python
Message-ID: <1356838333.805656881@f211.mail.ru>

Hello

I have 3 dimensional array which I want to calculate in a huge process. Everything is working well if I use ordinary way which is unsuitable in Python like the following:

nums=32
rows=120
cols=150

for k in range(0,nums):
    for i in range(0,rows):
        for j in range(0,cols):
            if float ( R[ k ] [ i ] [ j ] ) == 0.0:
                val11 [ i ] =0.0
            else:
                val11[ i ] [ j ], val22[ i ][ j ] = integrate.quad( lambda x : F1(x)*F2(x) , 0 , pi)

But, this calculation takes so long time, let's say about 1 hour (theoretically)... Is there any better way to easily and fast calculate the process such as [ F( i ) for i in xlist ] or something like that rather than using for loop?
import numpy as np

a, b = integrate.quad( lambda x : F1(x)*F2(x) , 0 , pi)

val11 = np.array(val11)
val22 = np.array(val22)
R = np.array(R)

val11[:] = 0
val22[:] = 0
for k in range(nums):
    nonz = R[k] > 0
    val11[nonz] = a
    val22[nonz] = b

Or if the above is not possible, try this:

pairs = [ [ [ integrate.quad( lambda x : F1(x)*F2(x) , 0 , pi) if R[k][i][j] > 0 else (0, 0)
              for j in range(cols) ]
            for i in range(rows) ]
          for k in range(nums) ]

But then from pairs you have to work out how to extract vals11 and vals22.

HTH
Cheers
--
Oleksandr (Sasha) Huziy

2012/12/29 Happyman

> Hello
>
> I have 3 dimensional array which I want to calculate in a huge process.
> Everything is working well if I use ordinary way which is unsuitable in
> Python like the following:
>
> nums=32
> rows=120
> cols=150
>
> for k in range(0,nums):
>     for i in range(0,rows):
>         for j in range(0,cols):
>             if float ( R[ k ] [ i ] [ j ] ) == 0.0:
>                 val11 [ i ] =0.0
>             else:
>                 val11[ i ] [ j ], val22[ i ][ j ] = integrate.quad( lambda x : F1(x)*F2(x) , 0 , pi)
>
> But, this calculation takes so long time, let's say about 1 hour
> (theoretically)... Is there any better way to easily and fast calculate the
> process such as [ F( i ) for i in xlist ] or something like that rather
> than using for loop?
>
>
>
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From vaggi.federico at gmail.com Sun Dec 30 14:09:37 2012 From: vaggi.federico at gmail.com (federico vaggi) Date: Sun, 30 Dec 2012 20:09:37 +0100 Subject: [SciPy-User] 3D array problem in Python (Happyman) Message-ID: There are faster ways to loop over the array using array slicing: http://www.scipy.org/Tentative_NumPy_Tutorial#head-864862d3f2bb4c32f04260fac61eb4ef34788c4c If, however, you have to apply a slow function to all non-zero elements, that will probably be your bottleneck, and fast access using array slicing won't help you. By the way, as a quick hint, an easy way to access all non-zero elements of an array is using logical indexing, which you will be familiar with if you've used MATLAB before. You appear however to be using python lists and not numpy arrays judging by your syntax. > ---------------------------------------------------------------------- > > Message: 1 > Date: Sun, 30 Dec 2012 07:32:13 +0400 > From: Happyman > Subject: [SciPy-User] 3D array problem in Python > To: scipy-user > Message-ID: <1356838333.805656881 at f211.mail.ru> > Content-Type: text/plain; charset="utf-8" > > Hello > > I have 3 dimensional array ?which I want ?to calculate in a huge process. > Everything is working well if I use ordinary way which is unsuitable in > Python like the following: > > nums=32 > rows=120 > cols=150 > > for k in range(0,nums): > ? ? ? ? ? for i in range(0,rows): > ? ? ? ? ? ? ? ? ? ? ?for j in range(0,cols): > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? if float ( R[ k ] [ i ] [ j ] ) == > 0.0: > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?val11 [ i ] =0.0 > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? else: > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?val11[ i ] [ j ], val22[ > i ][ j ] = integrate.quad( lambda x : ?F1(x)*F2(x) , 0 , pi) > > But, this calculation takes so long time, let's say about ?1 hour > (theoretically)... 
Is there any better way to easily and fast calculate the
> process such as [ F( i ) for i in xlist ] or something like that rather
> than using for loop?
>
>
>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> http://mail.scipy.org/pipermail/scipy-user/attachments/20121230/72f77b63/attachment-0001.html
>
> ------------------------------
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>
>
> End of SciPy-User Digest, Vol 112, Issue 44
> *******************************************
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From thoger.emil at gmail.com  Sun Dec 30 10:47:07 2012
From: thoger.emil at gmail.com (Thøger Rivera-Thorsen)
Date: Sun, 30 Dec 2012 16:47:07 +0100
Subject: [SciPy-User] 3D array problem in Python
In-Reply-To: <1356838333.805656881@f211.mail.ru>
References: <1356838333.805656881@f211.mail.ru>
Message-ID: <50E061FB.2090507@gmail.com>

Use np.where() or logical indexing (same thing, really) to mask your array, then perform the operations. Let's say your array is called A:

A[A == 0.0] = 0.0

A[A != 0.0] = [...etc.]

This, of course, only works if the operation for an entry doesn't depend on other entries in the array; but it should give you a great speed gain.

Cheers;

Emil

On 12/30/2012 04:32 AM, Happyman wrote:
> Hello
>
> I have 3 dimensional array which I want to calculate in a huge
> process.
Everything is working well if I use ordinary way which is > unsuitable in Python like the following: > > nums=32 > rows=120 > cols=150 > > for k in range(0,nums): > for i in range(0,rows): > for j in range(0,cols): > if float ( R[ k ] [ i ] [ j ] ) == 0.0: > val11 [ i ] =0.0 > else: > val11[ i ] [ j ], val22[ i ][ j ] = integrate.quad( > lambda x : F1(x)*F2(x) , 0 , pi) > > But, this calculation takes so long time, let's say about 1 hour > (theoretically)... Is there any better way to easily and fast > calculate the process such as [ F( i ) for i in xlist ] or something > like that rather than using for loop? > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From thoger.emil at gmail.com Sun Dec 30 12:03:09 2012 From: thoger.emil at gmail.com (=?UTF-8?B?VGjDuGdlciBSaXZlcmEtVGhvcnNlbg==?=) Date: Sun, 30 Dec 2012 18:03:09 +0100 Subject: [SciPy-User] 3D array problem in Python In-Reply-To: <1356884214.944243056@f294.mail.ru> References: <1356838333.805656881@f211.mail.ru> <50E061FB.2090507@gmail.com> <1356884214.944243056@f294.mail.ru> Message-ID: <50E073CD.6090402@gmail.com> Oops, we should probably reply to the list again; On 12/30/2012 05:16 PM, Happyman wrote: > Hi > > Thanks for your answer... I also tried to do so unfortunately I could > not manage that because of no very good experience in Python... > The way you propose is really good indeed but how can i do it??? > > Two huge problems I always encounter in Python are: > 1) Array with scalars...always!!!!! > 2) for loop I can understand Python is script language but how to > deal with it??? > The idea behind NumPy is exactly that it performs element-wise operations on arrays of N dimensions. Loops should be reserved only to the cases where an operation on an element depends on the state of other elements. 
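Pulling this thread's advice together: since the integrand in the original question does not depend on k, i or j, it can be integrated exactly once and the result written into the output arrays through a boolean mask, with no triple loop. A minimal runnable sketch (F1 and F2 here are placeholders standing in for the poster's unknown integrand, and the shapes and val11/val22 semantics are assumptions):

```python
import numpy as np
from scipy import integrate

# Placeholders for the poster's unknown integrand pieces.
F1 = np.sin
F2 = np.sin

nums, rows, cols = 4, 12, 15          # smaller than the original 32x120x150
rng = np.random.RandomState(0)
R = rng.rand(nums, rows, cols)
R[R < 0.3] = 0.0                      # sprinkle in some exact zeros

# The integrand does not depend on k, i, j: integrate exactly once.
# With these placeholder functions, the integral of sin(x)**2 over
# [0, pi] is pi/2.
a, b = integrate.quad(lambda x: F1(x) * F2(x), 0, np.pi)

val11 = np.zeros((rows, cols))
val22 = np.zeros((rows, cols))

# Positions that are nonzero in any slice k end up set to (a, b), which
# matches the original loop, since it never resets val11/val22 between
# iterations over k.
nonz = (R != 0.0).any(axis=0)
val11[nonz] = a
val22[nonz] = b
```

If the real integrand did depend on the array values, the mask would still cut the number of quad calls down to the nonzero entries only.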
To perform an element-wise operation on two arrays, they need to be of the same dimensions (and the same orientation - that is, a size (3, 6, 3) is not compatible with a size (3, 3, 6) until you've transposed it).

But suppose you have two arrays:

import numpy as np

A = np.array([1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12.]).reshape((3, 4))

B = np.random.random(A.shape)

C = A + B

D = A**B

But if you change the orientation of one, it goes wrong: C*A.transpose() will not work.

And you can perform operations on only those elements in an array that meet some logical condition, for example:

E = np.zeros_like(A)  # Just to create the array

E[B<0.5] = A[B<0.5]

E[B>0.5] = C[B>0.5]

Or simply:

B[B>5] += 1.

Note, again, that there is *no* looping going on here.

> Especially, when I create some function in which arguments can take
> one value are messed up with negotiation wit outer coming variables
> into my function!!
> for example,
> arg1, arg2 are arrays!
> def f( arg1, arg2 ):
>     some process, loops everthing
>     return val
>
I am not sure I understand the last question; what exactly is it that goes wrong?

Cheers,
Emil

>
>
>
> Sunday, 30 December 2012, 16:47 +01:00, from Thøger Rivera-Thorsen:
>
> Use np.where() or logical indexing (same thing, really) to mask
> your array, then perform the operations. Let's say your array is
> called A:
>
> A[A == 0.0] = 0.0
>
> A[A != 0.0] = [...etc.]
>
> This, of course, only works if the operation for an entry doesn't
> depend on other entries in the array; but it should give you a
> great speed gain.
>
> Cheers;
>
> Emil
>
> On 12/30/2012 04:32 AM, Happyman wrote:
>> Hello
>>
>> I have 3 dimensional array which I want to calculate in a huge
>> process.
Everything is working well if I use ordinary way which
>> is unsuitable in Python like the following:
>>
>> nums=32
>> rows=120
>> cols=150
>>
>> for k in range(0,nums):
>>     for i in range(0,rows):
>>         for j in range(0,cols):
>>             if float ( R[ k ] [ i ] [ j ] ) == 0.0:
>>                 val11 [ i ] =0.0
>>             else:
>>                 val11[ i ] [ j ], val22[ i ][ j ] =
>> integrate.quad( lambda x : F1(x)*F2(x) , 0 , pi)
>>
>> But, this calculation takes so long time, let's say about 1 hour
>> (theoretically)... Is there any better way to easily and fast
>> calculate the process such as [ F( i ) for i in xlist ] or
>> something like that rather than using for loop?
>>
>>
>>
>>
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From thoger.emil at gmail.com  Sun Dec 30 12:11:16 2012
From: thoger.emil at gmail.com (Thøger Rivera-Thorsen)
Date: Sun, 30 Dec 2012 18:11:16 +0100
Subject: [SciPy-User] 3D array problem in Python
In-Reply-To: <1356884214.944243056@f294.mail.ru>
References: <1356838333.805656881@f211.mail.ru> <50E061FB.2090507@gmail.com> <1356884214.944243056@f294.mail.ru>
Message-ID: <50E075B4.5020209@gmail.com>

By the way; I don't know if they are useful to you, but here are some introductory slides from a workshop I gave for our undergraduate students at Stockholm University. The Python part starts at slide 18.

Emil

On 12/30/2012 05:16 PM, Happyman wrote:
> Hi
>
> Thanks for your answer... I also tried to do so unfortunately I could
> not manage that because of no very good experience in Python...
> The way you propose is really good indeed but how can i do it???
>
> Two huge problems I always encounter in Python are:
> 1) Array with scalars...always!!!!!
> 2) for loop I can understand Python is script language but how to
> deal with it???
> > Especially, when I create some function in which arguments can take > one value are messed up with negotiation wit outer coming variables > into my function!! > for example, > arg1, arg2 are arrays! > def f( arg1, arg2 ): > some process, loops everthing > return val > > > > > > ???????????, 30 ??????? 2012, 16:47 +01:00 ?? Th?ger Rivera-Thorsen > : > > Use np.where() or logical indexing (same thing, really) to mask > your array, then perform the operations. Let's say your array is > called A: > > A[float(A) == 0.0] = 0.0 > > A[float(A) != 0.0] = [...etc.] > > > This, of course, only works if the operation for an entry doesn't > depend on other entries in the array; but it should give you a > great speed gain. > > Cheers; > > Emil > > > > On 12/30/2012 04:32 AM, Happyman wrote: >> Hello >> >> I have 3 dimensional array which I want to calculate in a huge >> process. Everything is working well if I use ordinary way which >> is unsuitable in Python like the following: >> >> nums=32 >> rows=120 >> cols=150 >> >> for k in range(0,nums): >> for i in range(0,rows): >> for j in range(0,cols): >> if float ( R[ k ] [ i ] [ j ] ) == 0.0: >> val11 [ i ] =0.0 >> else: >> val11[ i ] [ j ], val22[ i ][ j ] = >> integrate.quad( lambda x : F1(x)*F2(x) , 0 , pi) >> >> But, this calculation takes so long time, let's say about 1 hour >> (theoretically)... Is there any better way to easily and fast >> calculate the process such as [ F( i ) for i in xlist ] or >> something like that rather than using for loop? >> >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Tools-Print-OneOnOne.pdf
Type: application/pdf
Size: 1847695 bytes
Desc: not available
URL: 

From thoger.emil at gmail.com  Sun Dec 30 14:29:42 2012
From: thoger.emil at gmail.com (Thøger Rivera-Thorsen)
Date: Sun, 30 Dec 2012 20:29:42 +0100
Subject: [SciPy-User] 3D array problem in Python
In-Reply-To: <1356889571.638175686@f257.mail.ru>
References: <1356838333.805656881@f211.mail.ru> <1356884214.944243056@f294.mail.ru> <50E073CD.6090402@gmail.com> <1356889571.638175686@f257.mail.ru>
Message-ID: <50E09626.5070500@gmail.com>

On 12/30/2012 06:46 PM, Happyman wrote:
> Great thanks Emil
>
> because of being in a new step in python may be I am not realizing the
> "advantages" of it!!!
>
> I am going to try to do so according to your advice hopefully.
>
> What I wanted to explain by showing the function which you did not get
> is I have two separately independent args (arg1, arg2) which contains
> arrays, one and 3 dimensional respectively..
>
> arg1=A -> 1-Dimen.
> arg2=B -> 3 Dimen.
>
> def F(arg1, arg2):
>     # this is just example we can put any function or process here!!
>     return array(value)  # as an example
>
> I wanted to get faster result by avoiding loops inside of function
> here...it seems really (of course for me) confusing...to me. I tried
> but could not do it!
>
I think the answer depends on the specific operations you want the function to do. As an example, you could imagine that A is a (25,) 1D array, and B is a (33, 25) 2D array. Remember, the latter means 33 rows, 25 columns.

Let's say we want to add A to each of the rows of B. We could of course loop, but we don't want to do that. Instead we can create a (33, 25) array where each row is A, and then add this array to B:

import numpy as np

A = np.random.random(25)
B = np.ones((33, 25))

def F(Arg1, Arg2):
    # We want it to be a 2D array so we can define if it's a row or a column.
    # Default is that it's a row:
    C = np.array(Arg1, ndmin=2)
    print C.shape  # Just to see it
    C = C.repeat(33, axis=0)  # Stack 33 rows on top of each other
    out = Arg2 + C  # Now, element-wise addition.
    return out

my_sum = F(A, B)
print my_sum

Hope this helps... A shortcut to avoid some of these steps is to define A as a row or column explicitly:

A = np.random.random((1, 25))  # Or ((25, 1)) if you want it to be a column.

Then you can just A.transpose() or A.T if you need that.

Try googling "Numpy for matlab users" and "NumPy for IDL users", even if you don't use matlab or IDL. They have some pretty good lists of basic array handling functionality (and NumPy is designed to take you a very long way with basic functionality).

/Emil

>
> Sunday, 30 December 2012, 18:03 +01:00, from Thøger Rivera-Thorsen:
>
> Oops, we should probably reply to the list again;
>
> On 12/30/2012 05:16 PM, Happyman wrote:
>> Hi
>>
>> Thanks for your answer... I also tried to do so unfortunately I
>> could not manage that because of no very good experience in Python...
>> The way you propose is really good indeed but how can i do it???
>>
>> Two huge problems I always encounter in Python are:
>> 1) Array with scalars...always!!!!!
>> 2) for loop I can understand Python is script language but
>> how to deal with it???
>>
>
> The idea behind NumPy is exactly that it performs element-wise
> operations on arrays of N dimensions. Loops should be reserved
> only to the cases where an operation on an element depends on the
> state of other elements.
> To perform an element-wise operation on two arrays, they need to > be of the same dimensions (and the same orientation - that is, a > size (3, 6, 3) is not compatible with a size (3, 3, 6) until > you've transposed it) > > But suppose you have two arrays: > > import numpy as np > > A = np.array([1., 2., 3., 4., 5., 6., 7., 8., 9., 19., 11., 12.]).reshape((3, 4)) > > B = np.random.random(A.shape) > > C = A + B > > D = A**B > > But if you change the orientation of one, it goes wrong: > C*A.transpose() will not work. > > And you can perform operations on only elements in an array that > meet some logical condition, for example: > > E = np.zeros_like(A) # Just to create the array > > E[B<0.5] = A[B<0.5] > > E[B>0.5] = C[B>0.5] > > Or simply: > > B[B>5] += 1. > > Note, again, tyhat there is *no* looping going on here. > >> Especially, when I create some function in which arguments can >> take one value are messed up with negotiation wit outer coming >> variables into my function!! >> for example, >> arg1, arg2 are arrays! >> def f( arg1, arg2 ): >> some process, loops everthing >> return val >> > I am not sure I understand the last question; what exactly is it > that goes wrong? > > Cheers, > Emil > > >> >> >> >> >> ???????????, 30 ??????? 2012, 16:47 +01:00 ?? Th?ger >> Rivera-Thorsen >> : >> >> Use np.where() or logical indexing (same thing, really) to >> mask your array, then perform the operations. Let's say your >> array is called A: >> >> A[float(A) == 0.0] = 0.0 >> >> A[float(A) != 0.0] = [...etc.] >> >> >> This, of course, only works if the operation for an entry >> doesn't depend on other entries in the array; but it should >> give you a great speed gain. >> >> Cheers; >> >> Emil >> >> >> >> On 12/30/2012 04:32 AM, Happyman wrote: >>> Hello >>> >>> I have 3 dimensional array which I want to calculate in a >>> huge process. 
Everything is working well if I use ordinary >>> way which is unsuitable in Python like the following: >>> >>> nums=32 >>> rows=120 >>> cols=150 >>> >>> for k in range(0,nums): >>> for i in range(0,rows): >>> for j in range(0,cols): >>> if float ( R[ k ] [ i ] [ j ] ) >>> == 0.0: >>> val11 [ i ] =0.0 >>> else: >>> val11[ i ] [ j ], val22[ i ][ j ] = integrate.quad( lambda >>> x : F1(x)*F2(x) , 0 , pi) >>> >>> But, this calculation takes so long time, let's say about 1 >>> hour (theoretically)... Is there any better way to easily >>> and fast calculate the process such as [ F( i ) for i in >>> xlist ] or something like that rather than using for loop? >>> >>> >>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From amschne at umich.edu Sun Dec 30 19:16:30 2012 From: amschne at umich.edu (Adam Schneider) Date: Sun, 30 Dec 2012 19:16:30 -0500 Subject: [SciPy-User] Fwd: scipy.test() fails for 0.11.0 on OS X 10.8.2 In-Reply-To: References: Message-ID: Hello, I'm trying to install scipy on my macbook and am getting the following failure(s) when running scipy.test(): On OS X 10.8.2 (Core i7), scipy.test() for scipy version 0.11.0 fails the following test: ``` In [2]: import scipy In [3]: scipy.test() Running unit tests for scipy NumPy version 1.8.0.dev-a8c641f NumPy is installed in /usr/local/lib/python2.7/site-packages/numpy SciPy version 0.11.0 SciPy is installed in /usr/local/lib/python2.7/site-packages/scipy Python version 2.7.3 (default, Dec 3 2012, 00:18:53) [GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] nose version 1.2.1 ``` ...(see https://gist.github.com/4416094 for full failure log) ``` Ran 5481 tests in 58.156s FAILED (KNOWNFAIL=13, SKIP=42, errors=48, failures=72) Out[3]: ``` Here's my compiler info: ``` build_src building 
py_modules sources building library "npymath" sources customize Gnu95FCompiler Found executable /usr/local/bin/gfortran customize Gnu95FCompiler customize Gnu95FCompiler using config C compiler: cc -fno-strict-aliasing -fno-common -dynamic -I/usr/local/include -DNDEBUG -g -O3 -Wall -Wstrict-prototypes compile options: '-I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core/src/private -I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core/src -I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core -I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core/src/npymath -I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core/src/multiarray -I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core/src/umath -I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core/src/npysort -I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core/include -I/usr/local/Cellar/python/2.7.3/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c' ``` ``` $ cc --version Apple clang version 4.1 (tags/Apple/clang-421.11.66) (based on LLVM 3.1svn) Target: x86_64-apple-darwin12.2.1 Thread model: posix $ gfortran --version GNU Fortran (GCC) 4.7.2 $ python --version Python 2.7.3 ``` This version of scipy was installed using pip. What can I do to eliminate these failures and get scipy working properly? Thank you for the help, Adam On Sun, Dec 30, 2012 at 7:01 PM, wrote: > This is a members-only list. Your message has been automatically > rejected, since it came from a non-member's email address. Please > make sure to use the email account that you used to join this list. 
> > > > ---------- Forwarded message ---------- > From: Adam Schneider > To: scipy-dev at scipy.org > Cc: Jordan Schneider > Date: Sun, 30 Dec 2012 19:06:34 -0500 > Subject: scipy.test() fails for 0.11.0 on OS X 10.8.2 > Hello, > > I'm trying to install scipy on my macbook and am getting the following > failure(s) when running scipy.test(): > > On OS X 10.8.2 (Core i7), scipy.test() for scipy version 0.11.0 fails the > following test: > > ``` > In [2]: import scipy > > In [3]: scipy.test() > Running unit tests for scipy > NumPy version 1.8.0.dev-a8c641f > NumPy is installed in /usr/local/lib/python2.7/site-packages/numpy > SciPy version 0.11.0 > SciPy is installed in /usr/local/lib/python2.7/site-packages/scipy > Python version 2.7.3 (default, Dec 3 2012, 00:18:53) [GCC 4.2.1 > Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] > nose version 1.2.1 > ``` > ...(see https://gist.github.com/4416094 for full failure log) > > ``` > Ran 5481 tests in 58.156s > > FAILED (KNOWNFAIL=13, SKIP=42, errors=48, failures=72) > Out[3]: > ``` > Here's my compiler info: > ``` > build_src > building py_modules sources > building library "npymath" sources > customize Gnu95FCompiler > Found executable /usr/local/bin/gfortran > customize Gnu95FCompiler > customize Gnu95FCompiler using config > C compiler: cc -fno-strict-aliasing -fno-common -dynamic > -I/usr/local/include -DNDEBUG -g -O3 -Wall -Wstrict-prototypes > > compile options: > '-I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core/src/private > -I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core/src > -I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core > -I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core/src/npymath > -I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core/src/multiarray > 
> -I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core/src/umath
> -I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core/src/npysort
> -I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core/include
> -I/usr/local/Cellar/python/2.7.3/Frameworks/Python.framework/Versions/2.7/include/python2.7
> -c'
> ```
> ```
> $ cc --version
> Apple clang version 4.1 (tags/Apple/clang-421.11.66) (based on LLVM 3.1svn)
> Target: x86_64-apple-darwin12.2.1
> Thread model: posix
>
> $ gfortran --version
> GNU Fortran (GCC) 4.7.2
>
> $ python --version
> Python 2.7.3
> ```
>
> This version of scipy was installed using pip. What can I do to eliminate
> these failures and get scipy working properly?
>
> Thank you for the help,
>
> Adam
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ralf.gommers at gmail.com  Mon Dec 31 12:14:50 2012
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Mon, 31 Dec 2012 18:14:50 +0100
Subject: [SciPy-User] Fwd: scipy.test() fails for 0.11.0 on OS X 10.8.2
In-Reply-To:
References:
Message-ID:

On Mon, Dec 31, 2012 at 1:16 AM, Adam Schneider wrote:

> Hello,
>
> I'm trying to install scipy on my macbook and am getting the following
> failure(s) when running scipy.test():
>
> On OS X 10.8.2 (Core i7), scipy.test() for scipy version 0.11.0 fails the
> following test:
>
> ```
> In [2]: import scipy
>
> In [3]: scipy.test()
> Running unit tests for scipy
> NumPy version 1.8.0.dev-a8c641f
> NumPy is installed in /usr/local/lib/python2.7/site-packages/numpy
> SciPy version 0.11.0
> SciPy is installed in /usr/local/lib/python2.7/site-packages/scipy
> Python version 2.7.3 (default, Dec 3 2012, 00:18:53) [GCC 4.2.1
> Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))]
> nose version 1.2.1
> ```
> ...(see https://gist.github.com/4416094 for full failure log)
>
> ```
> Ran 5481 tests in 58.156s
>
> FAILED (KNOWNFAIL=13, SKIP=42, errors=48, failures=72)
> Out[3]:
> ```
> Here's my compiler info:
> ```
> build_src
> building py_modules sources
> building library "npymath" sources
> customize Gnu95FCompiler
> Found executable /usr/local/bin/gfortran
> customize Gnu95FCompiler
> customize Gnu95FCompiler using config
> C compiler: cc -fno-strict-aliasing -fno-common -dynamic
> -I/usr/local/include -DNDEBUG -g -O3 -Wall -Wstrict-prototypes
>
> compile options:
> '-I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core/src/private
> -I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core/src
> -I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core
> -I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core/src/npymath
> -I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core/src/multiarray
> -I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core/src/umath
> -I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core/src/npysort
> -I/private/var/folders/sz/fc0nh0x16fv3ytl31cff6tq80000gn/T/pip-build/numpy-dev/numpy/core/include
> -I/usr/local/Cellar/python/2.7.3/Frameworks/Python.framework/Versions/2.7/include/python2.7
> -c'
> ```
> ```
> $ cc --version
> Apple clang version 4.1 (tags/Apple/clang-421.11.66) (based on LLVM 3.1svn)
> Target: x86_64-apple-darwin12.2.1
> Thread model: posix
>
> $ gfortran --version
> GNU Fortran (GCC) 4.7.2
>
> $ python --version
> Python 2.7.3
> ```
>
> This version of scipy was installed using pip. What can I do to eliminate
> these failures and get scipy working properly?

The "DeprecationWarning: non-integer scalar index" errors are due to a very
recent change in numpy master for which scipy still has to be updated.
They're not a problem, the code being tested will only stop working with
numpy 1.9.
For the failures there's no good solution on OS X 10.8 except for not
building against the Accelerate Framework but against Netlib BLAS/LAPACK
instead. A description of how to do that can be found at the bottom of
http://projects.scipy.org/scipy/ticket/1737.

Ralf

> Thank you for the help,
>
> Adam
>
> On Sun, Dec 30, 2012 at 7:01 PM, wrote:
>
>> This is a members-only list. Your message has been automatically
>> rejected, since it came from a non-member's email address. Please
>> make sure to use the email account that you used to join this list.
>>
>> ---------- Forwarded message ----------
>> From: Adam Schneider
>> To: scipy-dev at scipy.org
>> Cc: Jordan Schneider
>> Date: Sun, 30 Dec 2012 19:06:34 -0500
>> Subject: scipy.test() fails for 0.11.0 on OS X 10.8.2
>> [...]
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
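
[Archive note: a short illustration of the indexing change Ralf mentions. The numpy change he refers to deprecated non-integer (e.g. float) scalar indices; the snippet below is not scipy's test code, just a minimal sketch of the behaviour being deprecated.]

```python
import numpy as np

a = np.arange(10)

# Integer scalar indices are the supported form and keep working:
print(a[3])  # prints 3

# Float scalar indices like a[3.0] were long accepted, but on numpy
# master of this era they emit "DeprecationWarning: non-integer scalar
# index" -- the warnings in scipy's test suite come from code paths of
# this kind, which scipy had to update before numpy 1.9 removed the
# deprecated behaviour entirely.
```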
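
[Archive note: the Netlib-based build Ralf points to amounts to compiling the reference BLAS/LAPACK and telling numpy.distutils where they are before rebuilding scipy from source. A rough sketch follows; the library paths are hypothetical examples, and the actual build steps for the reference libraries are the ones described in ticket 1737, not shown here.]

```shell
# Hypothetical paths -- point these at wherever the Netlib reference
# BLAS/LAPACK static libraries were actually built on your machine.
export BLAS="$HOME/src/BLAS/libfblas.a"
export LAPACK="$HOME/src/lapack-3.4.2/liblapack.a"

# numpy.distutils reads the BLAS and LAPACK environment variables when
# scipy is rebuilt from source, so it links against the reference
# libraries instead of the Accelerate Framework, e.g.:
#   pip install --no-deps scipy
echo "Building against BLAS=$BLAS LAPACK=$LAPACK"
```

The key point is that the environment variables must be set in the same shell session that runs the scipy build, since they are read at compile time, not at runtime.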