From pierre.raybaut at gmail.com Sun Apr 1 03:36:35 2012 From: pierre.raybaut at gmail.com (Pierre Raybaut) Date: Sun, 1 Apr 2012 09:36:35 +0200 Subject: [SciPy-User] ANN: Spyder v2.1.9 Message-ID: Hi all, On the behalf of Spyder's development team (http://code.google.com/p/spyderlib/people/list), I'm pleased to announce that Spyder v2.1.9 has been released and is available for Windows XP/Vista/7, GNU/Linux and MacOS X: http://code.google.com/p/spyderlib/ This is a pure maintenance release -- a lot of bugs were fixed since v2.1.8: http://code.google.com/p/spyderlib/wiki/ChangeLog Spyder is a free, open-source (MIT license) interactive development environment for the Python language with advanced editing, interactive testing, debugging and introspection features. Originally designed to provide MATLAB-like features (integrated help, interactive console, variable explorer with GUI-based editors for dictionaries, NumPy arrays, ...), it is strongly oriented towards scientific computing and software development. Thanks to the `spyderlib` library, Spyder also provides powerful ready-to-use widgets: embedded Python console (example: http://packages.python.org/guiqwt/_images/sift3.png), NumPy array editor (example: http://packages.python.org/guiqwt/_images/sift2.png), dictionary editor, source code editor, etc. Description of key features with tasty screenshots can be found at: http://code.google.com/p/spyderlib/wiki/Features On Windows platforms, Spyder is also available as a stand-alone executable (don't forget to disable UAC on Vista/7). This all-in-one portable version is still experimental (for example, it does not embed sphinx -- meaning no rich text mode for the object inspector) but it should provide a working version of Spyder for Windows platforms without having to install anything else (except Python 2.x itself, of course). Don't forget to follow Spyder updates/news: * on the project website: http://code.google.com/p/spyderlib/ * and on our official blog: http://spyder-ide.blogspot.com/ Last, but not least, we welcome any contribution that helps making Spyder an efficient scientific development/computing environment. Join us to help creating your favourite environment! (http://code.google.com/p/spyderlib/wiki/NoteForContributors) Enjoy! -Pierre From lpc at cmu.edu Mon Apr 2 12:25:44 2012 From: lpc at cmu.edu (Luis Pedro Coelho) Date: Mon, 02 Apr 2012 17:25:44 +0100 Subject: [SciPy-User] ndimage/morphology - binary dilation and erosion? In-Reply-To: References: Message-ID: <3847715.zaxhO1o2BG@rabbit> On Saturday, March 31, 2012 08:08:41 PM klo uo wrote: > I tried grey opening on sample image with both modules. Approach seems > good and result is bit identical with both modules (footprint=square(3)), > and I thought to comment on differences on both modules: > > - skimage requires converting data type to 'uint8' and won't accept > anything less > - ndimage grey opening is 3 times faster on my PC Mahotas (which I wrote): http://luispedro.org/software/mahotas is closer in implementation to ndimage and should be as fast (as well as supporting multiple types). It doesn't have the open() operation, but you can dilate() & erode() yourself: def open(f, Bc, output=None): output = mahotas.dilate(f, Bc, output=output) return mahotas.erode(f, Bc, output=output) (Also, I think that the skimage erode() & dilate() are for flat structuring elements only, but that doesn't seem to be an issue for you). 
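A rough usage sketch of the open() helper above, in case it is useful -- the sample image and the 3x3 structuring element below are only placeholders (dtypes kept identical just to be safe), and it assumes numpy and mahotas are importable:

import numpy as np
import mahotas

def open(f, Bc, output=None):
    # the helper from above: dilate() followed by erode()
    output = mahotas.dilate(f, Bc, output=output)
    return mahotas.erode(f, Bc, output=output)

img = np.random.randint(0, 256, size=(64, 64)).astype(np.uint8)  # placeholder grey-level image
Bc = np.ones((3, 3), dtype=np.uint8)                              # placeholder square(3)-style element
result = open(img, Bc)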
HTH, -- Luis Pedro Coelho | Institute for Molecular Medicine | http://luispedro.org -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part. URL: From lb489 at cam.ac.uk Tue Apr 3 06:54:27 2012 From: lb489 at cam.ac.uk (L. Barrott) Date: 03 Apr 2012 11:54:27 +0100 Subject: [SciPy-User] Problems with dopri45 Message-ID: Hello, I have been trying to get scipy to solve a set of coupled odes and in particular I want to use the dopri 45 method as I want to compare the results to the ode45 method in MATLAB. The code runs along the lines of: def func (t, Y, params): ... return (Ydot) with Y a vector. The other ode methods (except dop853 of course) solve this fine but even if I use the example code on the documentation page the dopri method returns the following error create_cb_arglist: Failed to build argument list (siz) with enough arguments (tot-opt) required by user-supplied function (siz,tot,opt=2,3,0). ...(traceback stuff) _dop.error: failed in processing argument list for call-back fcn. Any ideas where I am going wrong? Many thanks LB From warren.weckesser at enthought.com Tue Apr 3 08:42:40 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Tue, 3 Apr 2012 07:42:40 -0500 Subject: [SciPy-User] Problems with dopri45 In-Reply-To: References: Message-ID: On Tue, Apr 3, 2012 at 5:54 AM, L. Barrott wrote: > Hello, > > I have been trying to get scipy to solve a set of coupled odes and in > particular I want to use the dopri 45 method as I want to compare the > results to the ode45 method in MATLAB. The code runs along the lines of: > > def func (t, Y, params): > ... > return (Ydot) > > with Y a vector. The other ode methods (except dop853 of course) solve this > fine but even if I use the example code on the documentation page the dopri > method returns the following error > > create_cb_arglist: Failed to build argument list (siz) with enough > arguments (tot-opt) required by user-supplied function (siz,tot,opt=2,3,0). > ...(traceback stuff) _dop.error: failed in processing argument list for > call-back fcn. > > Any ideas where I am going wrong? > > Many thanks > LB > It would help to see more of your code. Could you include a complete, self-contained script that demonstrates the error? Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From lb489 at cam.ac.uk Tue Apr 3 12:29:16 2012 From: lb489 at cam.ac.uk (L. Barrott) Date: 03 Apr 2012 17:29:16 +0100 Subject: [SciPy-User] Problems with dopri45 Message-ID: >On Tue, Apr 3, 2012 at 5:54 AM, L. Barrott wrote: > >> Hello, >> >> I have been trying to get scipy to solve a set of coupled odes and in >> particular I want to use the dopri 45 method as I want to compare the >> results to the ode45 method in MATLAB. The code runs along the lines of: >> >> def func (t, Y, params): >> ... >> return (Ydot) >> >> with Y a vector. The other ode methods (except dop853 of course) solve >> this fine but even if I use the example code on the documentation page >> the dopri method returns the following error >> >> create_cb_arglist: Failed to build argument list (siz) with enough >> arguments (tot-opt) required by user-supplied function >> (siz,tot,opt=2,3,0). ...(traceback stuff) _dop.error: failed in >> processing argument list for call-back fcn. >> >> Any ideas where I am going wrong? >> >> Many thanks >> LB >> > > >It would help to see more of your code. 
Could you include a complete, >self-contained script that demonstrates the error? > >Warren Even something as simple as; from scipy.integrate import ode y0, t0 = [0, 1], 0 def func (t, y, x): return [x, y[0]] r = ode(func).set_integrator ('dopri5') r.set_initial_value(y0, t0).set_f_params(1) t1 = 10 dt = 0.1 while r.successful() and r.t < t1: r.integrate(r.t+dt) Will fail and this is lifted straight from the documentation as far as I can see. The full error message is create_cb_arglist: Failed to build argument list (siz) with enough arguments (tot-opt) required by user-supplied function (siz,tot,opt=2,3,0). Traceback (most recent call last): File "", line 2, in File "/usr/lib/python2.7/dist-packages/scipy/integrate/ode.py", line 326, in integrate self.f_params,self.jac_params) File "/usr/lib/python2.7/dist-packages/scipy/integrate/ode.py", line 745, in run x,y,iwork,idid = self.runner(*((f,t0,y0,t1) + tuple(self.call_args))) _dop.error: failed in processing argument list for call-back fcn. LB From warren.weckesser at enthought.com Tue Apr 3 12:42:26 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Tue, 3 Apr 2012 11:42:26 -0500 Subject: [SciPy-User] Problems with dopri45 In-Reply-To: References: Message-ID: On Tue, Apr 3, 2012 at 11:29 AM, L. Barrott wrote: > >On Tue, Apr 3, 2012 at 5:54 AM, L. Barrott wrote: > > > >> Hello, > >> > >> I have been trying to get scipy to solve a set of coupled odes and in > >> particular I want to use the dopri 45 method as I want to compare the > >> results to the ode45 method in MATLAB. The code runs along the lines of: > >> > >> def func (t, Y, params): > >> ... > >> return (Ydot) > >> > >> with Y a vector. The other ode methods (except dop853 of course) solve > >> this fine but even if I use the example code on the documentation page > >> the dopri method returns the following error > >> > >> create_cb_arglist: Failed to build argument list (siz) with enough > >> arguments (tot-opt) required by user-supplied function > >> (siz,tot,opt=2,3,0). ...(traceback stuff) _dop.error: failed in > >> processing argument list for call-back fcn. > >> > >> Any ideas where I am going wrong? > >> > >> Many thanks > >> LB > >> > > > > > >It would help to see more of your code. Could you include a complete, > >self-contained script that demonstrates the error? > > > >Warren > > Even something as simple as; > > from scipy.integrate import ode > y0, t0 = [0, 1], 0 > def func (t, y, x): > return [x, y[0]] > r = ode(func).set_integrator ('dopri5') > r.set_initial_value(y0, t0).set_f_params(1) > t1 = 10 > dt = 0.1 > while r.successful() and r.t < t1: > r.integrate(r.t+dt) > > Will fail and this is lifted straight from the documentation as far as I > can see. The full error message is > > create_cb_arglist: Failed to build argument list (siz) with enough > arguments (tot-opt) required by user-supplied function (siz,tot,opt=2,3,0). > Traceback (most recent call last): > File "", line 2, in > File "/usr/lib/python2.7/dist-packages/scipy/integrate/ode.py", line 326, > in integrate > self.f_params,self.jac_params) > File "/usr/lib/python2.7/dist-packages/scipy/integrate/ode.py", line 745, > in run > x,y,iwork,idid = self.runner(*((f,t0,y0,t1) + tuple(self.call_args))) > _dop.error: failed in processing argument list for call-back fcn. > > LB > I suspect you are using version 0.9 (or earlier) of scipy. 
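A quick way to check the installed version is: import scipy; print(scipy.__version__)

If upgrading is not an option right away, one workaround that may sidestep the failing f_params code path is to bake the parameter into the callback with a closure instead of calling set_f_params -- only a sketch, not verified against 0.9:

from scipy.integrate import ode

def func(t, y, x):
    return [x, y[0]]

x = 1  # the value that would otherwise be passed via set_f_params
r = ode(lambda t, y: func(t, y, x)).set_integrator('dopri5')
r.set_initial_value([0, 1], 0)
while r.successful() and r.t < 10:
    r.integrate(r.t + 0.1)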
This looks like a bug that was fixed in 0.10: http://projects.scipy.org/scipy/ticket/1392 Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.pinto at gmail.com Tue Apr 3 20:21:06 2012 From: nicolas.pinto at gmail.com (Nicolas Pinto) Date: Tue, 3 Apr 2012 20:21:06 -0400 Subject: [SciPy-User] linalg.eigh hangs only after importing sparse module In-Reply-To: References: Message-ID: > To get further, the following information is needed: > > - which platform? x86_64 Gentoo Linux, with gcc-4.5.3 > - which binaries? what do you mean ? > - which LAPACK? atlas-3.8.0 > on how to fix or debug the issue. However, if it's really the C++ > runtime that is causing the problems, then compiling Numpy/Scipy with a > different compiler could fix the problem. I can try to compile with another gcc version. Thanks for your help. N > > -- > Pauli Virtanen > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Nicolas Pinto http://web.mit.edu/pinto From mok at bioxray.dk Wed Apr 4 03:00:49 2012 From: mok at bioxray.dk (Morten Kjeldgaard) Date: Wed, 4 Apr 2012 09:00:49 +0200 Subject: [SciPy-User] Rician distributions lacks sigma parameter Message-ID: <378699F3-B5D7-4722-AD59-551226F02877@bioxray.dk> Hi, I am a new reader of this list, please forgive me if this issue has already discussed. I have been wanting to use the rician distribution (stats.rice) to analyze some data, but it seems that the implementation in scipy does not take the distribution's sigma parameter into account; rather, it has been set to 1.0. The wikipedia article shows the traditional formulation of the rician distribution [0]. I understand that some distributions, e.g. stats.norm, use the scale parameter to define std, but this does not seem to be the case with stats.rice. Any ideas on how to get around this, without actually modifying distribution.py? I have no experience with the internals of scipy, and wouldn't know how to modify it correctly. Cheers, Morten [0] http://en.wikipedia.org/wiki/Rice_distribution From jeremy at jeremysanders.net Wed Apr 4 04:09:48 2012 From: jeremy at jeremysanders.net (Jeremy Sanders) Date: Wed, 04 Apr 2012 09:09:48 +0100 Subject: [SciPy-User] ANN: Veusz 1.15 Message-ID: For possible interest to members of this mailing list... Veusz 1.15 ---------- http://home.gna.org/veusz/ Veusz is a Qt4 based scientific plotting package. It is written in Python, using PyQt4 for display and user-interfaces, and numpy for handling the numeric data. Veusz is designed to produce publication-ready Postscript/PDF/SVG output. The user interface aims to be simple, consistent and powerful. Veusz provides a GUI, command line, embedding and scripting interface (based on Python) to its plotting facilities. It also allows for manipulation and editing of datasets. Data can be captured from external sources such as Internet sockets or other programs. 
Changes in 1.15: * Improved hatching: - More hatch styles - Adjust spacing of hatching - Change hatching line style - Allow hatching background color * Axes will not extend beyond specified min and max values * Add options to extend axes by 2, 5, 10 and 15% of data range * Ctrl+MouseWheel zooms in and out of plot * Full screen graph view mode * New dataset plugins - Linear interpolation - Cumulative value - Rolling average - Subtract mean / minimum * Allow grid widgets to be placed in grid widgets * Catch EnvironmentError exceptions on Windows * Allow multiple datasets to be selected in dataset browser * Allow tagging of datasets and allow datasets be grouped by tags in dataset browser * Allow text to be written as text in SVG, rather than curves * Add DBus interface to program, if DBus is installed * 2D QDP support * Add setup.py options for packagers --veusz-resource-dir : location of data files --disable-install-docs * Add title option for keys Minor changes: * Use / rather than \ for path separator in saved file names for Windows/Unix compatibility * Add diamond fill error bar type * Add \color and \marker commands to text renderer * Support labels on xy datasets if one of x or y datasets missing * Reorganise dataset plugin menu * Fix links in INSTALL/README * Floating point intervals in capture dialog Bug fixes: * Trap case where nan values could be plotted * Fix error if website not accessible in exception dialog * Crash when min and max of axes are too similar * Fix clipping of paths after transform in SVG files * Fix crash in picker * Fix crash if duplication of characters in CSV date format * Fix crash in tool tip in dataset browser * Fix GlobalColor error (on certain dark color sets) * Fix blocked data import if no descriptor * Fix crash if log contours and minimum is zero * Bug fix https://bugzilla.redhat.com/show_bug.cgi?id=800196 Features of package: * X-Y plots (with errorbars) * Line and function plots * Contour plots * Images (with colour mappings and colorbars) * Stepped plots (for histograms) * Bar graphs * Vector field plots * Box plots * Polar plots * Ternary plots * Plotting dates * Fitting functions to data * Stacked plots and arrays of plots * Plot keys * Plot labels * Shapes and arrows on plots * LaTeX-like formatting for text * EPS/PDF/PNG/SVG/EMF export * Scripting interface * Dataset creation/manipulation * Embed Veusz within other programs * Text, CSV, FITS, NPY/NPZ, QDP, binary and user-plugin importing * Data can be captured from external sources * User defined functions, constants and can import external Python functions * Plugin interface to allow user to write or load code to - import data using new formats - make new datasets, optionally linked to existing datasets - arbitrarily manipulate the document * Data picker * Interactive tutorial * Multithreaded rendering Requirements for source install: Python (2.4 or greater required) http://www.python.org/ Qt >= 4.4 (free edition) http://www.trolltech.com/products/qt/ PyQt >= 4.3 (SIP is required to be installed first) http://www.riverbankcomputing.co.uk/software/pyqt/ http://www.riverbankcomputing.co.uk/software/sip/ numpy >= 1.0 http://numpy.scipy.org/ Optional: PyFITS >= 1.1 (optional for FITS import) http://www.stsci.edu/resources/software_hardware/pyfits pyemf >= 2.0.0 (optional for EMF export) http://pyemf.sourceforge.net/ PyMinuit >= 1.1.2 (optional improved fitting) http://code.google.com/p/pyminuit/ For EMF and better SVG export, PyQt >= 4.6 or better is required, to fix a bug in the C++ wrapping 
dbus-python, for dbus interface http://dbus.freedesktop.org/doc/dbus-python/ For documentation on using Veusz, see the "Documents" directory. The manual is in PDF, HTML and text format (generated from docbook). The examples are also useful documentation. Please also see and contribute to the Veusz wiki: http://barmag.net/veusz-wiki/ Issues with the current version: * Some recent versions of PyQt/SIP will causes crashes when exporting SVG files. Update to 4.7.4 (if released) or a recent snapshot to solve this problem. If you enjoy using Veusz, we would love to hear from you. Please join the mailing lists at https://gna.org/mail/?group=veusz to discuss new features or if you'd like to contribute code. The latest code can always be found in the Git repository at https://github.com/jeremysanders/veusz.git. From josef.pktd at gmail.com Wed Apr 4 07:30:03 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 4 Apr 2012 07:30:03 -0400 Subject: [SciPy-User] Rician distributions lacks sigma parameter In-Reply-To: <378699F3-B5D7-4722-AD59-551226F02877@bioxray.dk> References: <378699F3-B5D7-4722-AD59-551226F02877@bioxray.dk> Message-ID: On Wed, Apr 4, 2012 at 3:00 AM, Morten Kjeldgaard wrote: > Hi, > > I am a new reader of this list, please forgive me if this issue has > already discussed. > > I have been wanting to use the rician distribution (stats.rice) to > analyze some data, but it seems that the implementation in scipy does > not take the distribution's sigma parameter into account; rather, it > has been set to 1.0. The wikipedia article shows the traditional > formulation of the rician distribution [0]. > > I understand that some distributions, e.g. stats.norm, use the scale > parameter to define std, but this does not seem to be the case with > stats.rice. > > Any ideas on how to get around this, without actually modifying > distribution.py? I have no experience with the internals of scipy, and > wouldn't know how to modify it correctly. > > Cheers, > Morten > > [0] http://en.wikipedia.org/wiki/Rice_distribution location loc and scale are handled generically for all distribution. you can add loc=some number and scale= some number to almost all methods of the distributions. This replaces x by (x-loc)/scale in the calculation in the function, e.g. the _pdf, (the pdf gets an additional 1/scale in front for the transformation) For example: from scipy import stats >>> x = np.linspace(0, 10, 100) >>> import matplotlib.pyplot as plt >>> for s in [0.5, 1, 2, 5, 10]: plt.plot(x, stats.rice.pdf(x, 0.5 , scale=s)) ... [] [] [] [] [] >>> plt.show() However, I don't see how the rice_gen._pdf matches up with the formula on the Wikipedia page. It looks to me that it uses a different parameterization for the shape parameter v. (Or I didn't have enough coffee yet) bugs in this case (only _pdf is defined) could be possible, because the tests only check for consistency across methods, but in most cases the distributions are not externally verified. 
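To make the generic loc/scale handling above concrete, here is a small check (with arbitrary numbers) that pdf(x, b, loc=loc, scale=scale) agrees with pdf((x - loc)/scale, b)/scale:

import numpy as np
from scipy import stats

x = np.linspace(0.5, 10, 20)
b, loc, scale = 0.5, 0.0, 2.0   # arbitrary shape, location and scale

direct = stats.rice.pdf(x, b, loc=loc, scale=scale)
manual = stats.rice.pdf((x - loc) / scale, b) / scale   # x -> (x-loc)/scale, with an extra 1/scale in front
np.testing.assert_allclose(direct, manual)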
Josef > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Wed Apr 4 07:51:42 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 4 Apr 2012 07:51:42 -0400 Subject: [SciPy-User] Rician distributions lacks sigma parameter In-Reply-To: References: <378699F3-B5D7-4722-AD59-551226F02877@bioxray.dk> Message-ID: On Wed, Apr 4, 2012 at 7:30 AM, wrote: > On Wed, Apr 4, 2012 at 3:00 AM, Morten Kjeldgaard wrote: >> Hi, >> >> I am a new reader of this list, please forgive me if this issue has >> already discussed. >> >> I have been wanting to use the rician distribution (stats.rice) to >> analyze some data, but it seems that the implementation in scipy does >> not take the distribution's sigma parameter into account; rather, it >> has been set to 1.0. The wikipedia article shows the traditional >> formulation of the rician distribution [0]. >> >> I understand that some distributions, e.g. stats.norm, use the scale >> parameter to define std, but this does not seem to be the case with >> stats.rice. >> >> Any ideas on how to get around this, without actually modifying >> distribution.py? I have no experience with the internals of scipy, and >> wouldn't know how to modify it correctly. >> >> Cheers, >> Morten >> >> [0] http://en.wikipedia.org/wiki/Rice_distribution > > location loc and scale are handled generically for all distribution. > > you can add loc=some number and scale= some number to almost all > methods of the distributions. This replaces x by (x-loc)/scale in the > calculation in the function, e.g. the _pdf, (the pdf gets an > additional 1/scale in front for the transformation) > > For example: > from scipy import stats >>>> x = np.linspace(0, 10, 100) >>>> import matplotlib.pyplot as plt > >>>> for s ?in [0.5, 1, 2, 5, 10]: plt.plot(x, stats.rice.pdf(x, 0.5 , scale=s)) > ... > [] > [] > [] > [] > [] >>>> plt.show() > > However, I don't see how the rice_gen._pdf matches up with the formula > on the Wikipedia page. > It looks to me that it uses a different parameterization for the shape > ?parameter v. (Or I didn't have enough coffee yet) > > bugs in this case (only _pdf is defined) could be possible, because > the tests only check for consistency across methods, but in most cases > the distributions are not externally verified. (after another coffee) the shape parameter in stats.rice corresponds to (v/sigma) in the Wikipedia page. This is a consequence of the generic treatment of location and scale. Josef > > Josef > > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user From warren.weckesser at enthought.com Wed Apr 4 07:58:19 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 4 Apr 2012 06:58:19 -0500 Subject: [SciPy-User] Rician distributions lacks sigma parameter In-Reply-To: <378699F3-B5D7-4722-AD59-551226F02877@bioxray.dk> References: <378699F3-B5D7-4722-AD59-551226F02877@bioxray.dk> Message-ID: On Wed, Apr 4, 2012 at 2:00 AM, Morten Kjeldgaard wrote: > Hi, > > I am a new reader of this list, please forgive me if this issue has > already discussed. > > I have been wanting to use the rician distribution (stats.rice) to > analyze some data, but it seems that the implementation in scipy does > not take the distribution's sigma parameter into account; rather, it > has been set to 1.0. 
The wikipedia article shows the traditional > formulation of the rician distribution [0]. > > I understand that some distributions, e.g. stats.norm, use the scale > parameter to define std, but this does not seem to be the case with > stats.rice. > > Any ideas on how to get around this, without actually modifying > distribution.py? I have no experience with the internals of scipy, and > wouldn't know how to modify it correctly. > > Cheers, > Morten > > [0] http://en.wikipedia.org/wiki/Rice_distribution > > Hi Morten, Given the parameters nu and sigma (as shown in the wikipedia article), you use scipy.stats.rice by setting the scale=sigma and the shape parameter b=nu/sigma. You can use the following script to verify this: ----- import numpy as np from scipy.special import i0 from scipy.stats import rice import matplotlib.pyplot as plt def rice_pdf(x, nu, sigma): s2 = sigma**2 f = x/s2 * np.exp(-(x**2 + nu**2)/(2*s2)) * i0(x*nu/s2) return f x = np.linspace(0, 8, 201) nu = 3.45 sigma = 0.35 my_pdf = rice_pdf(x, nu, sigma) sp_pdf = rice.pdf(x, nu / sigma, scale=sigma) np.testing.assert_allclose(my_pdf, sp_pdf) plt.plot(x, my_pdf, label="rice_pdf") plt.plot(x, sp_pdf, label="stats.rice.pdf") plt.legend(loc='best') plt.show() ----- Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From mok at bioxray.dk Wed Apr 4 13:33:13 2012 From: mok at bioxray.dk (Morten Kjeldgaard) Date: Wed, 4 Apr 2012 19:33:13 +0200 Subject: [SciPy-User] Rician distributions lacks sigma parameter In-Reply-To: References: <378699F3-B5D7-4722-AD59-551226F02877@bioxray.dk> Message-ID: Thanks for replies Josef and Warren! I think my current limitation is that I don't fully grasp how the shape and scale parameters are propagated to the individual stats distributions, and alas the autogenerated documentation isn't always very helpful. > Given the parameters nu and sigma (as shown in the wikipedia > article), you use scipy.stats.rice by setting the scale=sigma and > the shape parameter b=nu/sigma. You can use the following script to > verify this: Like you write, the script works fine with parameters (nu, sigma) = (3.45, 0.35), but it actually fails when I try to reproduce the plots in the wikipedia article. When setting (nu, sigma) = (0, 1), I get the following: AssertionError: Not equal to tolerance rtol=1e-07, atol=0 x and y nan location mismatch: x: array([ 0.00000000e+00, 3.99680128e-02, 7.97444092e-02, 1.19139103e-01, 1.57965051e-01, 1.96039735e-01, 2.33186584e-01, 2.69236346e-01, 3.04028363e-01,... y: array([ nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,... In other words, your defined function rice_pdf works, but stats.rice does not. Cheers, Morten From josef.pktd at gmail.com Wed Apr 4 13:53:24 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 4 Apr 2012 13:53:24 -0400 Subject: [SciPy-User] Rician distributions lacks sigma parameter In-Reply-To: References: <378699F3-B5D7-4722-AD59-551226F02877@bioxray.dk> Message-ID: On Wed, Apr 4, 2012 at 1:33 PM, Morten Kjeldgaard wrote: > Thanks for replies Josef and Warren! > > I think my current limitation is that I don't fully grasp how the > shape and scale parameters are propagated to the individual stats > distributions, and alas the autogenerated documentation isn't always > very helpful. 
> >> Given the parameters nu and sigma (as shown in the wikipedia >> article), you use scipy.stats.rice by setting the scale=sigma and >> the shape parameter b=nu/sigma. ?You can use the following script to >> verify this: > > Like you write, the script works fine with parameters (nu, sigma) = > (3.45, 0.35), but it actually fails when I try to reproduce the plots > in the wikipedia article. When setting (nu, sigma) = (0, 1), I get the > following: the shape parameter nu has to be strictly positive, eg. nu=1e-10 works there is a problem with the calculation for nu equal to zero Josef > > AssertionError: > Not equal to tolerance rtol=1e-07, atol=0 > > x and y nan location mismatch: > ?x: array([ ?0.00000000e+00, ? 3.99680128e-02, ? 7.97444092e-02, > ? ? ? ? ?1.19139103e-01, ? 1.57965051e-01, ? 1.96039735e-01, > ? ? ? ? ?2.33186584e-01, ? 2.69236346e-01, ? 3.04028363e-01,... > ?y: array([ nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, > nan, ?nan, > ? ? ? ? nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, > nan, > ? ? ? ? nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, > nan,... > > In other words, your defined function rice_pdf works, but stats.rice > does not. > > Cheers, > Morten > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Wed Apr 4 13:59:39 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 4 Apr 2012 13:59:39 -0400 Subject: [SciPy-User] Rician distributions lacks sigma parameter In-Reply-To: References: <378699F3-B5D7-4722-AD59-551226F02877@bioxray.dk> Message-ID: On Wed, Apr 4, 2012 at 1:53 PM, wrote: > On Wed, Apr 4, 2012 at 1:33 PM, Morten Kjeldgaard wrote: >> Thanks for replies Josef and Warren! >> >> I think my current limitation is that I don't fully grasp how the >> shape and scale parameters are propagated to the individual stats >> distributions, and alas the autogenerated documentation isn't always >> very helpful. >> >>> Given the parameters nu and sigma (as shown in the wikipedia >>> article), you use scipy.stats.rice by setting the scale=sigma and >>> the shape parameter b=nu/sigma. ?You can use the following script to >>> verify this: >> >> Like you write, the script works fine with parameters (nu, sigma) = >> (3.45, 0.35), but it actually fails when I try to reproduce the plots >> in the wikipedia article. When setting (nu, sigma) = (0, 1), I get the >> following: > > the shape parameter nu has to be strictly positive, eg. nu=1e-10 works > there is a problem with the calculation for nu equal to zero But the _pdf doesn't have a problem >>> stats.rice._pdf(np.linspace(0,4,11),0.) array([ 0. , 0.36924654, 0.58091923, 0.58410271, 0.44485968, 0.27067057, 0.13472343, 0.05555507, 0.01912327, 0.00552172, 0.00134185]) >>> stats.rice.pdf(np.linspace(0,4,11),0.) array([ nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]) >>> so it should be possible to fix it for the nu=0 case, (define a>= 0 ?) Josef > > Josef > >> >> AssertionError: >> Not equal to tolerance rtol=1e-07, atol=0 >> >> x and y nan location mismatch: >> ?x: array([ ?0.00000000e+00, ? 3.99680128e-02, ? 7.97444092e-02, >> ? ? ? ? ?1.19139103e-01, ? 1.57965051e-01, ? 1.96039735e-01, >> ? ? ? ? ?2.33186584e-01, ? 2.69236346e-01, ? 3.04028363e-01,... >> ?y: array([ nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, >> nan, ?nan, >> ? ? ? ? nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, >> nan, >> ? ? ? ? 
nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, >> nan,... >> >> In other words, your defined function rice_pdf works, but stats.rice >> does not. >> >> Cheers, >> Morten >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user From benwhalley at gmail.com Wed Apr 4 15:54:41 2012 From: benwhalley at gmail.com (Ben) Date: Wed, 4 Apr 2012 19:54:41 +0000 (UTC) Subject: [SciPy-User] nan's in stats.spearmanr Message-ID: Apologies if this seems obvious to others, but I'm using both functions from pandas and stats.spearmanr in different bits of my code and noticed something odd. Is the following output expected? from pandas import DataFrame from scipy import stats a = [1, nan, 2] b = [1, 2, 2] df = DataFrame(zip(a,b)) stats.spearmanr(a,b) gives: (0.86602540378443871, 0.3333333333333332) df.corr(method="spearman") 0 1 0 1 1 1 1 1 Removing the nan from a produces identical results. I had expected the first output, but perhaps I'm not understanding how scipy likes to handle nan. Any advice much appreciated. Regards, Ben From josef.pktd at gmail.com Wed Apr 4 16:57:28 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 4 Apr 2012 16:57:28 -0400 Subject: [SciPy-User] Rician distributions lacks sigma parameter In-Reply-To: References: <378699F3-B5D7-4722-AD59-551226F02877@bioxray.dk> Message-ID: On Wed, Apr 4, 2012 at 1:59 PM, wrote: > On Wed, Apr 4, 2012 at 1:53 PM, ? wrote: >> On Wed, Apr 4, 2012 at 1:33 PM, Morten Kjeldgaard wrote: >>> Thanks for replies Josef and Warren! >>> >>> I think my current limitation is that I don't fully grasp how the >>> shape and scale parameters are propagated to the individual stats >>> distributions, and alas the autogenerated documentation isn't always >>> very helpful. >>> >>>> Given the parameters nu and sigma (as shown in the wikipedia >>>> article), you use scipy.stats.rice by setting the scale=sigma and >>>> the shape parameter b=nu/sigma. ?You can use the following script to >>>> verify this: >>> >>> Like you write, the script works fine with parameters (nu, sigma) = >>> (3.45, 0.35), but it actually fails when I try to reproduce the plots >>> in the wikipedia article. When setting (nu, sigma) = (0, 1), I get the >>> following: >> >> the shape parameter nu has to be strictly positive, eg. nu=1e-10 works >> there is a problem with the calculation for nu equal to zero > > But the _pdf doesn't have a problem >>>> stats.rice._pdf(np.linspace(0,4,11),0.) > array([ 0. ? ? ? ?, ?0.36924654, ?0.58091923, ?0.58410271, ?0.44485968, > ? ? ? ?0.27067057, ?0.13472343, ?0.05555507, ?0.01912327, ?0.00552172, > ? ? ? ?0.00134185]) > >>>> stats.rice.pdf(np.linspace(0,4,11),0.) > array([ nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan]) >>>> > > so it should be possible to fix it for the nu=0 case, (define a>= 0 ?) should be a ticket: define _argcheck for rice something like the following (but I didn't check which args _argcheck is supposed to have) >>> def _argcheck(self, *args): return args >=0 ... >>> stats.rice._argcheck = _argcheck >>> stats.rice.pdf(np.linspace(0,4,11),0.) array([ 0. , 0.36924654, 0.58091923, 0.58410271, 0.44485968, 0.27067057, 0.13472343, 0.05555507, 0.01912327, 0.00552172, 0.00134185]) >>> stats.rice._pdf(np.linspace(0,4,11),0.) array([ 0. 
, 0.36924654, 0.58091923, 0.58410271, 0.44485968, 0.27067057, 0.13472343, 0.05555507, 0.01912327, 0.00552172, 0.00134185]) Josef > > Josef > > >> >> Josef >> >>> >>> AssertionError: >>> Not equal to tolerance rtol=1e-07, atol=0 >>> >>> x and y nan location mismatch: >>> ?x: array([ ?0.00000000e+00, ? 3.99680128e-02, ? 7.97444092e-02, >>> ? ? ? ? ?1.19139103e-01, ? 1.57965051e-01, ? 1.96039735e-01, >>> ? ? ? ? ?2.33186584e-01, ? 2.69236346e-01, ? 3.04028363e-01,... >>> ?y: array([ nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, >>> nan, ?nan, >>> ? ? ? ? nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, >>> nan, >>> ? ? ? ? nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, >>> nan,... >>> >>> In other words, your defined function rice_pdf works, but stats.rice >>> does not. >>> >>> Cheers, >>> Morten >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user From warren.weckesser at enthought.com Wed Apr 4 17:00:06 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 4 Apr 2012 16:00:06 -0500 Subject: [SciPy-User] Rician distributions lacks sigma parameter In-Reply-To: References: <378699F3-B5D7-4722-AD59-551226F02877@bioxray.dk> Message-ID: On Wed, Apr 4, 2012 at 12:59 PM, wrote: > On Wed, Apr 4, 2012 at 1:53 PM, wrote: > > On Wed, Apr 4, 2012 at 1:33 PM, Morten Kjeldgaard > wrote: > >> Thanks for replies Josef and Warren! > >> > >> I think my current limitation is that I don't fully grasp how the > >> shape and scale parameters are propagated to the individual stats > >> distributions, and alas the autogenerated documentation isn't always > >> very helpful. > >> > >>> Given the parameters nu and sigma (as shown in the wikipedia > >>> article), you use scipy.stats.rice by setting the scale=sigma and > >>> the shape parameter b=nu/sigma. You can use the following script to > >>> verify this: > >> > >> Like you write, the script works fine with parameters (nu, sigma) = > >> (3.45, 0.35), but it actually fails when I try to reproduce the plots > >> in the wikipedia article. When setting (nu, sigma) = (0, 1), I get the > >> following: > > > > the shape parameter nu has to be strictly positive, eg. nu=1e-10 works > > there is a problem with the calculation for nu equal to zero > > But the _pdf doesn't have a problem > >>> stats.rice._pdf(np.linspace(0,4,11),0.) > array([ 0. , 0.36924654, 0.58091923, 0.58410271, 0.44485968, > 0.27067057, 0.13472343, 0.05555507, 0.01912327, 0.00552172, > 0.00134185]) > > >>> stats.rice.pdf(np.linspace(0,4,11),0.) > array([ nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]) > >>> > > so it should be possible to fix it for the nu=0 case, (define a>= 0 ?) > > Josef > > I created a ticket for this: http://projects.scipy.org/scipy/ticket/1639 Warren > > > > > Josef > > > >> > >> AssertionError: > >> Not equal to tolerance rtol=1e-07, atol=0 > >> > >> x and y nan location mismatch: > >> x: array([ 0.00000000e+00, 3.99680128e-02, 7.97444092e-02, > >> 1.19139103e-01, 1.57965051e-01, 1.96039735e-01, > >> 2.33186584e-01, 2.69236346e-01, 3.04028363e-01,... > >> y: array([ nan, nan, nan, nan, nan, nan, nan, nan, nan, > >> nan, nan, > >> nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, > >> nan, > >> nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, > >> nan,... > >> > >> In other words, your defined function rice_pdf works, but stats.rice > >> does not. 
> >> > >> Cheers, > >> Morten > >> > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Wed Apr 4 17:04:29 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 4 Apr 2012 16:04:29 -0500 Subject: [SciPy-User] Rician distributions lacks sigma parameter In-Reply-To: References: <378699F3-B5D7-4722-AD59-551226F02877@bioxray.dk> Message-ID: On Wed, Apr 4, 2012 at 3:57 PM, wrote: > On Wed, Apr 4, 2012 at 1:59 PM, wrote: > > On Wed, Apr 4, 2012 at 1:53 PM, wrote: > >> On Wed, Apr 4, 2012 at 1:33 PM, Morten Kjeldgaard > wrote: > >>> Thanks for replies Josef and Warren! > >>> > >>> I think my current limitation is that I don't fully grasp how the > >>> shape and scale parameters are propagated to the individual stats > >>> distributions, and alas the autogenerated documentation isn't always > >>> very helpful. > >>> > >>>> Given the parameters nu and sigma (as shown in the wikipedia > >>>> article), you use scipy.stats.rice by setting the scale=sigma and > >>>> the shape parameter b=nu/sigma. You can use the following script to > >>>> verify this: > >>> > >>> Like you write, the script works fine with parameters (nu, sigma) = > >>> (3.45, 0.35), but it actually fails when I try to reproduce the plots > >>> in the wikipedia article. When setting (nu, sigma) = (0, 1), I get the > >>> following: > >> > >> the shape parameter nu has to be strictly positive, eg. nu=1e-10 works > >> there is a problem with the calculation for nu equal to zero > > > > But the _pdf doesn't have a problem > >>>> stats.rice._pdf(np.linspace(0,4,11),0.) > > array([ 0. , 0.36924654, 0.58091923, 0.58410271, 0.44485968, > > 0.27067057, 0.13472343, 0.05555507, 0.01912327, 0.00552172, > > 0.00134185]) > > > >>>> stats.rice.pdf(np.linspace(0,4,11),0.) > > array([ nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]) > >>>> > > > > so it should be possible to fix it for the nu=0 case, (define a>= 0 ?) > > should be a ticket: define _argcheck for rice > > something like the following (but I didn't check which args _argcheck > is supposed to have) > > >>> def _argcheck(self, *args): return args >=0 > ... > >>> stats.rice._argcheck = _argcheck > >>> stats.rice.pdf(np.linspace(0,4,11),0.) > array([ 0. , 0.36924654, 0.58091923, 0.58410271, 0.44485968, > 0.27067057, 0.13472343, 0.05555507, 0.01912327, 0.00552172, > 0.00134185]) > >>> stats.rice._pdf(np.linspace(0,4,11),0.) > array([ 0. , 0.36924654, 0.58091923, 0.58410271, 0.44485968, > 0.27067057, 0.13472343, 0.05555507, 0.01912327, 0.00552172, > 0.00134185]) > > Josef > > Once again, your post arrived just as I was finishing mine. Kinda spooky. Could you add that comment to the ticket? ( http://projects.scipy.org/scipy/ticket/1639) Warren > > > > Josef > > > > > >> > >> Josef > >> > >>> > >>> AssertionError: > >>> Not equal to tolerance rtol=1e-07, atol=0 > >>> > >>> x and y nan location mismatch: > >>> x: array([ 0.00000000e+00, 3.99680128e-02, 7.97444092e-02, > >>> 1.19139103e-01, 1.57965051e-01, 1.96039735e-01, > >>> 2.33186584e-01, 2.69236346e-01, 3.04028363e-01,... 
> >>> y: array([ nan, nan, nan, nan, nan, nan, nan, nan, nan,
> >>> nan, nan,
> >>> nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
> >>> nan,
> >>> nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
> >>> nan,...
> >>>
> >>> In other words, your defined function rice_pdf works, but stats.rice
> >>> does not.
> >>>
> >>> Cheers,
> >>> Morten
> >>>
> >>> _______________________________________________
> >>> SciPy-User mailing list
> >>> SciPy-User at scipy.org
> >>> http://mail.scipy.org/mailman/listinfo/scipy-user
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From warren.weckesser at enthought.com Wed Apr 4 17:30:35 2012
From: warren.weckesser at enthought.com (Warren Weckesser)
Date: Wed, 4 Apr 2012 16:30:35 -0500
Subject: [SciPy-User] SciPy 2012 - The Eleventh Annual Conference on Scientific Computing with Python
Message-ID: 

SciPy 2012, the eleventh annual Conference on Scientific Computing with Python, will be held July 16-21, 2012, in Austin, Texas.

At this conference, novel scientific applications and libraries related to data acquisition, analysis, dissemination and visualization using Python are presented. Attended by leading figures from both academia and industry, it is an excellent opportunity to experience the cutting edge of scientific software development.

The conference is preceded by two days of tutorials, during which community experts provide training on several scientific Python packages. Following the main conference will be two days of coding sprints.

We invite you to give a talk or present a poster at SciPy 2012. The list of topics that are appropriate for the conference includes (but is not limited to):

- new Python libraries for science and engineering;
- applications of Python in solving scientific or computational problems;
- high performance, parallel and GPU computing with Python;
- use of Python in science education.

Specialized Tracks

Two specialized tracks run in parallel to the main conference:

- High Performance Computing with Python
  Whether your algorithm is distributed, threaded, memory intensive or latency bound, Python is making headway into the problem. We are looking for performance driven designs and applications in Python. Candidates include the use of Python within a parallel application, new architectures, and ways of making traditional applications execute more efficiently.

- Visualization
  They say a picture is worth a thousand words--we're interested in both! Python provides numerous visualization tools that allow scientists to show off their work, and we want to know about any new tools and techniques out there. Come show off your latest graphics, whether it's an old library with a slick new feature, a new library out to challenge the status quo, or simply a beautiful result.

Domain-specific Mini-symposia

Mini-symposia on the following topics are also being organized:

- Computational bioinformatics
- Meteorology and climatology
- Astronomy and astrophysics
- Geophysics

Talks, papers and posters

We invite you to take part by submitting a talk or poster abstract. Instructions are on the conference website: http://conference.scipy.org/scipy2012/talks.php

Selected talks are included as papers in the peer-reviewed conference proceedings, to be published online.

Tutorials

Tutorials will be given July 16-17.
We invite instructors to submit proposals for half-day tutorials on topics relevant to scientific computing with Python. See http://conference.scipy.org/scipy2012/tutorials.php for information about submitting a tutorial proposal. To encourage tutorials of the highest quality, the instructor (or team of instructors) is given a $1,000 stipend for each half day tutorial. Student/Community Scholarships We anticipate providing funding for students and for active members of the SciPy community who otherwise might not be able to attend the conference. See http://conference.scipy.org/scipy2012/student.php for scholarship application guidelines. Be a Sponsor The SciPy conference could not run without the generous support of the institutions and corporations who share our enthusiasm for Python as a tool for science. Please consider sponsoring SciPy 2012. For more information, see http://conference.scipy.org/scipy2012/sponsor/index.php Important dates: Monday, April 30: Talk abstracts and tutorial proposals due. Monday, May 7: Accepted tutorials announced. Monday, May 13: Accepted talks announced. Monday, June 18: Early registration ends. (Price increases after this date.) Sunday, July 8: Online registration ends. Monday-Tuesday, July 16 - 17: Tutorials Wednesday-Thursday, July 18 - July 19: Conference Friday-Saturday, July 20 - July 21: Sprints We look forward to seeing you all in Austin this year! The SciPy 2012 Team http://conference.scipy.org/scipy2012/organizers.php -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Apr 4 17:55:18 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 4 Apr 2012 17:55:18 -0400 Subject: [SciPy-User] Rician distributions lacks sigma parameter In-Reply-To: References: <378699F3-B5D7-4722-AD59-551226F02877@bioxray.dk> Message-ID: On Wed, Apr 4, 2012 at 5:04 PM, Warren Weckesser wrote: > > > On Wed, Apr 4, 2012 at 3:57 PM, wrote: >> >> On Wed, Apr 4, 2012 at 1:59 PM, ? wrote: >> > On Wed, Apr 4, 2012 at 1:53 PM, ? wrote: >> >> On Wed, Apr 4, 2012 at 1:33 PM, Morten Kjeldgaard >> >> wrote: >> >>> Thanks for replies Josef and Warren! >> >>> >> >>> I think my current limitation is that I don't fully grasp how the >> >>> shape and scale parameters are propagated to the individual stats >> >>> distributions, and alas the autogenerated documentation isn't always >> >>> very helpful. >> >>> >> >>>> Given the parameters nu and sigma (as shown in the wikipedia >> >>>> article), you use scipy.stats.rice by setting the scale=sigma and >> >>>> the shape parameter b=nu/sigma. ?You can use the following script to >> >>>> verify this: >> >>> >> >>> Like you write, the script works fine with parameters (nu, sigma) = >> >>> (3.45, 0.35), but it actually fails when I try to reproduce the plots >> >>> in the wikipedia article. When setting (nu, sigma) = (0, 1), I get the >> >>> following: >> >> >> >> the shape parameter nu has to be strictly positive, eg. nu=1e-10 works >> >> there is a problem with the calculation for nu equal to zero >> > >> > But the _pdf doesn't have a problem >> >>>> stats.rice._pdf(np.linspace(0,4,11),0.) >> > array([ 0. ? ? ? ?, ?0.36924654, ?0.58091923, ?0.58410271, ?0.44485968, >> > ? ? ? ?0.27067057, ?0.13472343, ?0.05555507, ?0.01912327, ?0.00552172, >> > ? ? ? ?0.00134185]) >> > >> >>>> stats.rice.pdf(np.linspace(0,4,11),0.) 
>> > array([ nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, >> > ?nan]) >> >>>> >> > >> > so it should be possible to fix it for the nu=0 case, (define a>= 0 ?) >> >> should be a ticket: define _argcheck for rice >> >> something like the following (but I didn't check which args _argcheck >> is supposed to have) >> >> >>> def _argcheck(self, *args): return args >=0 >> ... >> >>> stats.rice._argcheck = _argcheck >> >>> stats.rice.pdf(np.linspace(0,4,11),0.) >> array([ 0. ? ? ? ?, ?0.36924654, ?0.58091923, ?0.58410271, ?0.44485968, >> ? ? ? ?0.27067057, ?0.13472343, ?0.05555507, ?0.01912327, ?0.00552172, >> ? ? ? ?0.00134185]) >> >>> stats.rice._pdf(np.linspace(0,4,11),0.) >> array([ 0. ? ? ? ?, ?0.36924654, ?0.58091923, ?0.58410271, ?0.44485968, >> ? ? ? ?0.27067057, ?0.13472343, ?0.05555507, ?0.01912327, ?0.00552172, >> ? ? ? ?0.00134185]) >> >> Josef >> > > > > Once again, your post arrived just as I was finishing mine.? Kinda spooky. Better 2 than 0. > > Could you add that comment to the ticket? > (http://projects.scipy.org/scipy/ticket/1639) Done, I didn't check the source file for distributions (just pulling up source of functions of 0.9 that I have installed), but it should be just 2-3 lines to fix this. Josef > > Warren > > >> >> > >> > Josef >> > >> > >> >> >> >> Josef >> >> >> >>> >> >>> AssertionError: >> >>> Not equal to tolerance rtol=1e-07, atol=0 >> >>> >> >>> x and y nan location mismatch: >> >>> ?x: array([ ?0.00000000e+00, ? 3.99680128e-02, ? 7.97444092e-02, >> >>> ? ? ? ? ?1.19139103e-01, ? 1.57965051e-01, ? 1.96039735e-01, >> >>> ? ? ? ? ?2.33186584e-01, ? 2.69236346e-01, ? 3.04028363e-01,... >> >>> ?y: array([ nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, >> >>> nan, ?nan, >> >>> ? ? ? ? nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, >> >>> nan, >> >>> ? ? ? ? nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, >> >>> nan,... >> >>> >> >>> In other words, your defined function rice_pdf works, but stats.rice >> >>> does not. >> >>> >> >>> Cheers, >> >>> Morten >> >>> >> >>> _______________________________________________ >> >>> SciPy-User mailing list >> >>> SciPy-User at scipy.org >> >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Wed Apr 4 18:08:28 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 4 Apr 2012 18:08:28 -0400 Subject: [SciPy-User] Rician distributions lacks sigma parameter In-Reply-To: References: <378699F3-B5D7-4722-AD59-551226F02877@bioxray.dk> Message-ID: On Wed, Apr 4, 2012 at 5:55 PM, wrote: > On Wed, Apr 4, 2012 at 5:04 PM, Warren Weckesser > wrote: >> >> >> On Wed, Apr 4, 2012 at 3:57 PM, wrote: >>> >>> On Wed, Apr 4, 2012 at 1:59 PM, ? wrote: >>> > On Wed, Apr 4, 2012 at 1:53 PM, ? wrote: >>> >> On Wed, Apr 4, 2012 at 1:33 PM, Morten Kjeldgaard >>> >> wrote: >>> >>> Thanks for replies Josef and Warren! >>> >>> >>> >>> I think my current limitation is that I don't fully grasp how the >>> >>> shape and scale parameters are propagated to the individual stats >>> >>> distributions, and alas the autogenerated documentation isn't always >>> >>> very helpful. 
>>> >>> >>> >>>> Given the parameters nu and sigma (as shown in the wikipedia >>> >>>> article), you use scipy.stats.rice by setting the scale=sigma and >>> >>>> the shape parameter b=nu/sigma. ?You can use the following script to >>> >>>> verify this: >>> >>> >>> >>> Like you write, the script works fine with parameters (nu, sigma) = >>> >>> (3.45, 0.35), but it actually fails when I try to reproduce the plots >>> >>> in the wikipedia article. When setting (nu, sigma) = (0, 1), I get the >>> >>> following: >>> >> >>> >> the shape parameter nu has to be strictly positive, eg. nu=1e-10 works >>> >> there is a problem with the calculation for nu equal to zero >>> > >>> > But the _pdf doesn't have a problem >>> >>>> stats.rice._pdf(np.linspace(0,4,11),0.) >>> > array([ 0. ? ? ? ?, ?0.36924654, ?0.58091923, ?0.58410271, ?0.44485968, >>> > ? ? ? ?0.27067057, ?0.13472343, ?0.05555507, ?0.01912327, ?0.00552172, >>> > ? ? ? ?0.00134185]) >>> > >>> >>>> stats.rice.pdf(np.linspace(0,4,11),0.) >>> > array([ nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, >>> > ?nan]) >>> >>>> >>> > >>> > so it should be possible to fix it for the nu=0 case, (define a>= 0 ?) >>> >>> should be a ticket: define _argcheck for rice >>> >>> something like the following (but I didn't check which args _argcheck >>> is supposed to have) >>> >>> >>> def _argcheck(self, *args): return args >=0 >>> ... >>> >>> stats.rice._argcheck = _argcheck >>> >>> stats.rice.pdf(np.linspace(0,4,11),0.) >>> array([ 0. ? ? ? ?, ?0.36924654, ?0.58091923, ?0.58410271, ?0.44485968, >>> ? ? ? ?0.27067057, ?0.13472343, ?0.05555507, ?0.01912327, ?0.00552172, >>> ? ? ? ?0.00134185]) >>> >>> stats.rice._pdf(np.linspace(0,4,11),0.) >>> array([ 0. ? ? ? ?, ?0.36924654, ?0.58091923, ?0.58410271, ?0.44485968, >>> ? ? ? ?0.27067057, ?0.13472343, ?0.05555507, ?0.01912327, ?0.00552172, >>> ? ? ? ?0.00134185]) >>> >>> Josef >>> >> >> >> >> Once again, your post arrived just as I was finishing mine.? Kinda spooky. > > Better 2 than 0. > >> >> Could you add that comment to the ticket? >> (http://projects.scipy.org/scipy/ticket/1639) > > Done, > I didn't check the source file for distributions (just pulling up > source of functions of 0.9 that I have installed), > but it should be just 2-3 lines to fix this. Planning for the future: I think we could start to switch to personalized docstrings for distributions as the show up on the mailing list or in tickets. For example in the case of rice it would be good to have the difference in the parameterization in the doc string, so we don't have to do a search to find it again, or figure out the same thing each time again. Morten, thanks for finding the corner case. It's always difficult to figure out the limits for which the distributions are supposed to work. Josef > > Josef > >> >> Warren >> >> >>> >>> > >>> > Josef >>> > >>> > >>> >> >>> >> Josef >>> >> >>> >>> >>> >>> AssertionError: >>> >>> Not equal to tolerance rtol=1e-07, atol=0 >>> >>> >>> >>> x and y nan location mismatch: >>> >>> ?x: array([ ?0.00000000e+00, ? 3.99680128e-02, ? 7.97444092e-02, >>> >>> ? ? ? ? ?1.19139103e-01, ? 1.57965051e-01, ? 1.96039735e-01, >>> >>> ? ? ? ? ?2.33186584e-01, ? 2.69236346e-01, ? 3.04028363e-01,... >>> >>> ?y: array([ nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, >>> >>> nan, ?nan, >>> >>> ? ? ? ? nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, >>> >>> nan, >>> >>> ? ? ? ? nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, ?nan, >>> >>> nan,... 
>>> >>> >>> >>> In other words, your defined function rice_pdf works, but stats.rice >>> >>> does not. >>> >>> >>> >>> Cheers, >>> >>> Morten >>> >>> >>> >>> _______________________________________________ >>> >>> SciPy-User mailing list >>> >>> SciPy-User at scipy.org >>> >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> From josef.pktd at gmail.com Wed Apr 4 18:34:01 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 4 Apr 2012 18:34:01 -0400 Subject: [SciPy-User] nan's in stats.spearmanr In-Reply-To: References: Message-ID: On Wed, Apr 4, 2012 at 3:54 PM, Ben wrote: > Apologies if this seems obvious to others, but I'm using both functions from > pandas and stats.spearmanr in different bits of my code and noticed something > odd. ?Is the following output expected? > > from ?pandas import DataFrame > from scipy import stats > a = [1, nan, 2] > b = [1, 2, 2] > df = DataFrame(zip(a,b)) > stats.spearmanr(a,b) > > gives: (0.86602540378443871, 0.3333333333333332) > > df.corr(method="spearman") > ? 0 ?1 > 0 ?1 ?1 > 1 ?1 ?1 > > Removing the nan from a produces identical results. I had expected the first > output, but perhaps I'm not ?understanding how scipy likes to handle nan. scipy.stats doesn't handle nans in most cases, they are just ignored (what the outcome is depends on the implementation details) the correct answer should be in stats.mstats, which uses masked arrays to handle nan cases >>> am = np.ma.fix_invalid(a) >>> bm = np.ma.fix_invalid(b) >>> stats.mstats.spearmanr(am, bm) (1.0, 0.0) Josef > > Any advice much appreciated. > > Regards, > > Ben > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From wesmckinn at gmail.com Wed Apr 4 20:51:52 2012 From: wesmckinn at gmail.com (Wes McKinney) Date: Wed, 4 Apr 2012 20:51:52 -0400 Subject: [SciPy-User] nan's in stats.spearmanr In-Reply-To: References: Message-ID: On Wed, Apr 4, 2012 at 6:34 PM, wrote: > On Wed, Apr 4, 2012 at 3:54 PM, Ben wrote: >> Apologies if this seems obvious to others, but I'm using both functions from >> pandas and stats.spearmanr in different bits of my code and noticed something >> odd. ?Is the following output expected? >> >> from ?pandas import DataFrame >> from scipy import stats >> a = [1, nan, 2] >> b = [1, 2, 2] >> df = DataFrame(zip(a,b)) >> stats.spearmanr(a,b) >> >> gives: (0.86602540378443871, 0.3333333333333332) >> >> df.corr(method="spearman") >> ? 0 ?1 >> 0 ?1 ?1 >> 1 ?1 ?1 >> >> Removing the nan from a produces identical results. I had expected the first >> output, but perhaps I'm not ?understanding how scipy likes to handle nan. > > scipy.stats doesn't handle nans in most cases, they are just ignored > (what the outcome is depends on the implementation details) > > the correct answer should be in stats.mstats, which uses masked arrays > to handle nan cases > >>>> am = np.ma.fix_invalid(a) >>>> bm = np.ma.fix_invalid(b) >>>> stats.mstats.spearmanr(am, bm) > (1.0, 0.0) > > Josef > > >> >> Any advice much appreciated. 
>> >> Regards, >> >> Ben >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user pandas excludes NaN's by default so the output looks correct based on what Josef wrote From nicolas.pinto at gmail.com Thu Apr 5 14:33:34 2012 From: nicolas.pinto at gmail.com (Nicolas Pinto) Date: Thu, 5 Apr 2012 14:33:34 -0400 Subject: [SciPy-User] linalg.eigh hangs only after importing sparse module In-Reply-To: References: Message-ID: I tried (in a virtualenv) to compile with gcc-4.2.4, gcc-4.3.6 and gcc-4.4.6 and all failed. Any suggestion on what to try next? Thanks again. N On Tue, Apr 3, 2012 at 8:21 PM, Nicolas Pinto wrote: >> To get further, the following information is needed: >> >> - which platform? > > x86_64 Gentoo Linux, with gcc-4.5.3 > >> - which binaries? > > what do you mean ? > >> - which LAPACK? > > atlas-3.8.0 > >> on how to fix or debug the issue. However, if it's really the C++ >> runtime that is causing the problems, then compiling Numpy/Scipy with a >> different compiler could fix the problem. > > I can try to compile with another gcc version. > > Thanks for your help. > > N > >> >> -- >> Pauli Virtanen >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > -- > Nicolas Pinto > http://web.mit.edu/pinto -- Nicolas Pinto http://web.mit.edu/pinto From pav at iki.fi Thu Apr 5 15:00:27 2012 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 05 Apr 2012 21:00:27 +0200 Subject: [SciPy-User] linalg.eigh hangs only after importing sparse module In-Reply-To: References: Message-ID: 05.04.2012 20:33, Nicolas Pinto kirjoitti: > I tried (in a virtualenv) to compile with gcc-4.2.4, gcc-4.3.6 and > gcc-4.4.6 and all failed. Any suggestion on what to try next? Suggestions - Check with different versions of ATLAS and Lapack. - Run the program under valgrind, with python-specific suppressions enabled. Python's source distribution has a suppression file for Valgrind. - Run the program in gdb, first having compiled everything with debug symbols. (OPT="-ggdb" FOPT="-ggdb" python setup.py build ...) Check if the values passed down to the LAPACK routine are OK. Unfortunately, these take some work. Pauli From wkerzendorf at gmail.com Thu Apr 5 15:48:33 2012 From: wkerzendorf at gmail.com (Wolfgang Kerzendorf) Date: Thu, 5 Apr 2012 15:48:33 -0400 Subject: [SciPy-User] n-dimensional sparse array Message-ID: <2D1CA913-E8B9-4EE5-A9E4-83363161BFAB@gmail.com> Is there a scipy or numpy (or any other package) way to store an n-dimensional sparse array? I did only find the sparsematrix which is only for two dimensions. Cheers Wolfgang From apratap at lbl.gov Thu Apr 5 17:10:33 2012 From: apratap at lbl.gov (Abhishek Pratap) Date: Thu, 5 Apr 2012 14:10:33 -0700 Subject: [SciPy-User] Help with clustering : Memory Error with large dataset Message-ID: Hey Guys I am re-posting a message I had sent to numpy mailing list earlier. In summary I need help with clustering. My input dataset is about 1-2 million x,y coordinates which I would like to cluster together for ex using DBSCAN algo. I tried it on a small data set and it works fine. When I increase my input size it crashes. Can I be more efficient ? More details copied below. Thanks! 
-Abhi ===message from numpy mailing list==== I am new to both python and more so to numpy. I am trying to cluster close to a 900K points using DBSCAN algo. My input is a list of ~900k tuples each having two points (x,y) coordinates. I am converting them to numpy array and passing them to pdist method of scipy.spatial.distance for calculating distance between each point. Here is some size info on my numpy array shape of input array : (828575, 2) Size : 6872000 bytes I think the error has something to do with the default double dtype of numpy array of pdist function. I would appreciate if you could help me debug this. I am sure I overlooking some naive thing here See the traceback below. MemoryError Traceback (most recent call last) /house/homedirs/a/apratap/Dropbox/dev/ipython/ in () 36 37 print cleaned_senseBam ---> 38 cluster_pet_points_per_chromosome(sense_bamFile) /house/homedirs/a/apratap/Dropbox/dev/ipython/ in cluster_pet_points_per_chromosome(bamFile) 30 print 'Size of list points is %d' % sys.getsizeof(points) 31 print 'Size of numpy array is %d' % sys.getsizeof(points_array) ---> 32 cluster_points_DBSCAN(points_array) 33 #print points_array 34 /house/homedirs/a/apratap/Dropbox/dev/ipython/ in cluster_points_DBSCAN(data_numpy_array) 9 def cluster_points_DBSCAN(data_numpy_array): 10 #eucledian distance calculation ---> 11 D = distance.pdist(data_numpy_array) 12 S = distance.squareform(D) 13 H = 1 - S/np.max(S) /house/homedirs/a/apratap/playground/software/epd-7.2-2-rh5-x86_64/lib/python2.7/site-packages/scipy/spatial/distance.pyc in pdist(X, metric, p, w, V, VI) 1155 1156 m, n = s -> 1157 dm = np.zeros((m * (m - 1) / 2,), dtype=np.double) 1158 1159 wmink_names = ['wminkowski', 'wmi', 'wm', 'wpnorm'] From surfcast23 at gmail.com Thu Apr 5 18:46:54 2012 From: surfcast23 at gmail.com (surfcast23) Date: Thu, 5 Apr 2012 15:46:54 -0700 (PDT) Subject: [SciPy-User] [SciPy-user] ValueError: The truth value of an array with more than one element is ambiguous. Message-ID: <33583156.post@talk.nabble.com> Hi, I have an if statement and what I want it to do is go through arrays and find the common elements in all three arrays. When I try the code below I get this error ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() Can some one explain the error to me and how I might be able to fix it. Thanks in advance. if min <= Xa <=max & min <= Ya <=max & min <= Za <=max: print("in range") else: print("Not in range") -- View this message in context: http://old.nabble.com/ValueError%3A-The-truth-value-of-an-array-with-more-than-one-element-is-ambiguous.-tp33583156p33583156.html Sent from the Scipy-User mailing list archive at Nabble.com. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tsyu80 at gmail.com Fri Apr 6 00:54:40 2012 From: tsyu80 at gmail.com (Tony Yu) Date: Fri, 6 Apr 2012 00:54:40 -0400 Subject: [SciPy-User] [SciPy-user] ValueError: The truth value of an array with more than one element is ambiguous. In-Reply-To: <33583156.post@talk.nabble.com> References: <33583156.post@talk.nabble.com> Message-ID: On Thu, Apr 5, 2012 at 6:46 PM, surfcast23 wrote: > Hi, I have an if statement and what I want it to do is go through arrays > and find the common elements in all three arrays. When I try the code below > I get this error * ValueError: The truth value of an array with more than > one element is ambiguous. Use a.any() or a.all()* Can some one explain > the error to me and how I might be able to fix it. 
Thanks in advance. *if > min <= Xa <=max & min <= Ya <=max & min <= Za <=max: print("in range") > else: print("Not in range")* This explanation may or may not be clear, but your question is answered in this communication . Roughly: 1) Python's default behavior for chained comparisons don't work as you'd expect for numpy arrays. 2) Python doesn't allow numpy to change this default behavior (at least currently, and maybe never ). Nevertheless, you can get around this by separating the comparisons >>> if (min <= Xa) & (Xa <= max): Note the use of `&` instead of `and`, which is at the heart of the issue . Hope that helps, -Tony -------------- next part -------------- An HTML attachment was scrubbed... URL: From cpeters at edisonmission.com Fri Apr 6 04:01:52 2012 From: cpeters at edisonmission.com (Christopher Peters) Date: Fri, 6 Apr 2012 04:01:52 -0400 Subject: [SciPy-User] AUTO: Christopher Peters is out of the office (returning 04/09/2012) Message-ID: I am out of the office until 04/09/2012. I am out of the office. Please email urgent requests to Mike McDonald. Note: This is an automated response to your message "Re: [SciPy-User] [SciPy-user] ValueError: The truth value of an array with more than one element is ambiguous." sent on 4/6/2012 12:54:40 AM. This is the only notification you will receive while this person is away. From yann.ziegler at etu.unistra.fr Thu Apr 5 10:50:37 2012 From: yann.ziegler at etu.unistra.fr (Yann ZIEGLER) Date: Thu, 05 Apr 2012 16:50:37 +0200 Subject: [SciPy-User] Strange behavior of sph_jn Message-ID: <20120405165037.9i1619zicks8goo4@webmail.u-strasbg.fr> Hi everybody, I noticed a strange behavior of the sph_jn function (and it really bothers me!). Here is a copy-past of what I get (from scipy.special import sph_jn) in IPython: In [1]: sph_jn(2,3350.506) Out[1]: (array([ 2.98461400e-04, -6.76491641e-07, -2.98462005e-04]), array([ 6.76491641e-07, 2.98461803e-04, -4.09252598e-07])) In [2]: sph_jn(2,3350.507) Out[2]: (array([ inf, -3.78029638e-07, -inf]), array([ 3.78029638e-07, inf, inf])) As you can see, there is a 'jump' (a very big one!) between 3350.506 and 3350.507 for some orders and some derivatives. What happens ? As a point of comparison, Mathematica (http://www.wolframalpha.com/input/?i=SpericalBesselJ[2%2C3350.507]) says: J_2(3350.507) ~= -0.000298462... wich seems to be far more correct. Is there any 'numerical disaster' ? Thanks, Yann From kevin.gullikson at gmail.com Thu Apr 5 19:10:01 2012 From: kevin.gullikson at gmail.com (Kevin Gullikson) Date: Thu, 5 Apr 2012 18:10:01 -0500 Subject: [SciPy-User] [SciPy-user] ValueError: The truth value of an array with more than one element is ambiguous. In-Reply-To: <33583156.post@talk.nabble.com> References: <33583156.post@talk.nabble.com> Message-ID: My only thought is that Xa, Ya, and/or Za are arrays? That array will show up if you have say: a = [False, True] if a: ... Kevin Gullikson On Thu, Apr 5, 2012 at 5:46 PM, surfcast23 wrote: > Hi, I have an if statement and what I want it to do is go through arrays > and find the common elements in all three arrays. When I try the code below > I get this error * ValueError: The truth value of an array with more than > one element is ambiguous. Use a.any() or a.all()* Can some one explain > the error to me and how I might be able to fix it. Thanks in advance. 
*if > min <= Xa <=max & min <= Ya <=max & min <= Za <=max: print("in range") > else: print("Not in range") * > ------------------------------ > View this message in context: ValueError: The truth value of an array > with more than one element is ambiguous. > Sent from the Scipy-User mailing list archiveat Nabble.com. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tsyu80 at gmail.com Fri Apr 6 09:40:27 2012 From: tsyu80 at gmail.com (Tony Yu) Date: Fri, 6 Apr 2012 09:40:27 -0400 Subject: [SciPy-User] [SciPy-user] ValueError: The truth value of an array with more than one element is ambiguous. In-Reply-To: References: <33583156.post@talk.nabble.com> Message-ID: On Fri, Apr 6, 2012 at 12:54 AM, Tony Yu wrote: > > > On Thu, Apr 5, 2012 at 6:46 PM, surfcast23 wrote: > >> Hi, I have an if statement and what I want it to do is go through arrays >> and find the common elements in all three arrays. When I try the code below >> I get this error * ValueError: The truth value of an array with more >> than one element is ambiguous. Use a.any() or a.all()* Can some one >> explain the error to me and how I might be able to fix it. Thanks in >> advance. *if min <= Xa <=max & min <= Ya <=max & min <= Za <=max: >> print("in range") else: print("Not in range")* > > > This explanation may or may not be clear, but your question is answered > in this communication > . > > Roughly: > 1) Python's default behavior for chained comparisons don't work as you'd > expect for numpy arrays. > 2) Python doesn't allow numpy to change this default behavior (at least > currently, and maybe never > ). > > Nevertheless, you can get around this by separating the comparisons > > >>> if (min <= Xa) & (Xa <= max): > > Note the use of `&` instead of `and`, which is at the heart of the issue > . > > Hope that helps, > -Tony > Oops, I think I got myself mixed up in the explanation. Separating the comparisons fixes one error; For example, the following: >>> (min <= Xa) & (Xa <= max) will return an array of bools instead of raising an error (as you would get with `min <= Xa <= max`). This is what I meant to explain above. But, throwing an `if` in front of that comparison still doesn't work because it's ambiguous: Should `np.array([True False])` be true or false? Instead you should check `np.all(np.array([True False]))`, which evaluates as False since not-all elements are True, or `np.any(np.array([True False]))`, which evaluates as True since one element is True. -Tony -------------- next part -------------- An HTML attachment was scrubbed... URL: From surfcast23 at gmail.com Fri Apr 6 16:52:30 2012 From: surfcast23 at gmail.com (surfcast23) Date: Fri, 6 Apr 2012 13:52:30 -0700 (PDT) Subject: [SciPy-User] [SciPy-user] ValueError: The truth value of an array with more than one element is ambiguous. In-Reply-To: References: <33583156.post@talk.nabble.com> Message-ID: <33645519.post@talk.nabble.com> Hi Tony, Thanks for the help. Would I be able to use the np.any and np.all functions to count the number of true occurrences? Tony Yu-3 wrote: > > On Fri, Apr 6, 2012 at 12:54 AM, Tony Yu wrote: > >> >> >> On Thu, Apr 5, 2012 at 6:46 PM, surfcast23 wrote: >> >>> Hi, I have an if statement and what I want it to do is go through arrays >>> and find the common elements in all three arrays. 
When I try the code >>> below >>> I get this error * ValueError: The truth value of an array with more >>> than one element is ambiguous. Use a.any() or a.all()* Can some one >>> explain the error to me and how I might be able to fix it. Thanks in >>> advance. *if min <= Xa <=max & min <= Ya <=max & min <= Za <=max: >>> print("in range") else: print("Not in range")* >> >> >> This explanation may or may not be clear, but your question is answered >> in this >> communication >> . >> >> Roughly: >> 1) Python's default behavior for chained comparisons don't work as you'd >> expect for numpy arrays. >> 2) Python doesn't allow numpy to change this default behavior (at least >> currently, and maybe >> never >> ). >> >> Nevertheless, you can get around this by separating the comparisons >> >> >>> if (min <= Xa) & (Xa <= max): >> >> Note the use of `&` instead of `and`, which is at the heart of the >> issue >> . >> >> Hope that helps, >> -Tony >> > > Oops, I think I got myself mixed up in the explanation. Separating the > comparisons fixes one error; For example, the following: > >>>> (min <= Xa) & (Xa <= max) > > will return an array of bools instead of raising an error (as you would > get > with `min <= Xa <= max`). This is what I meant to explain above. > > But, throwing an `if` in front of that comparison still doesn't work > because it's ambiguous: Should `np.array([True False])` be true or false? > Instead you should check `np.all(np.array([True False]))`, which evaluates > as False since not-all elements are True, or `np.any(np.array([True > False]))`, which evaluates as True since one element is True. > > -Tony > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- View this message in context: http://old.nabble.com/ValueError%3A-The-truth-value-of-an-array-with-more-than-one-element-is-ambiguous.-tp33583156p33645519.html Sent from the Scipy-User mailing list archive at Nabble.com. From zachary.pincus at yale.edu Fri Apr 6 17:11:58 2012 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Fri, 6 Apr 2012 17:11:58 -0400 Subject: [SciPy-User] [SciPy-user] ValueError: The truth value of an array with more than one element is ambiguous. In-Reply-To: <33645519.post@talk.nabble.com> References: <33583156.post@talk.nabble.com> <33645519.post@talk.nabble.com> Message-ID: <1DBE014C-C991-4BF1-84A1-B8A1E6A6D639@yale.edu> > Thanks for the help. Would I be able to use the np.any and np.all > functions to count the number of true occurrences? I typically use np.sum() (or arr.sum() where arr is a numpy array) to count the number of True values (which count as 1 in boolean arrays, where Falses are 0.) > > > Tony Yu-3 wrote: >> >> On Fri, Apr 6, 2012 at 12:54 AM, Tony Yu wrote: >> >>> >>> >>> On Thu, Apr 5, 2012 at 6:46 PM, surfcast23 wrote: >>> >>>> Hi, I have an if statement and what I want it to do is go through arrays >>>> and find the common elements in all three arrays. When I try the code >>>> below >>>> I get this error * ValueError: The truth value of an array with more >>>> than one element is ambiguous. Use a.any() or a.all()* Can some one >>>> explain the error to me and how I might be able to fix it. Thanks in >>>> advance. *if min <= Xa <=max & min <= Ya <=max & min <= Za <=max: >>>> print("in range") else: print("Not in range")* >>> >>> >>> This explanation may or may not be clear, but your question is answered >>> in this >>> communication >>> . 
>>> >>> Roughly: >>> 1) Python's default behavior for chained comparisons don't work as you'd >>> expect for numpy arrays. >>> 2) Python doesn't allow numpy to change this default behavior (at least >>> currently, and maybe >>> never >>> ). >>> >>> Nevertheless, you can get around this by separating the comparisons >>> >>>>>> if (min <= Xa) & (Xa <= max): >>> >>> Note the use of `&` instead of `and`, which is at the heart of the >>> issue >>> . >>> >>> Hope that helps, >>> -Tony >>> >> >> Oops, I think I got myself mixed up in the explanation. Separating the >> comparisons fixes one error; For example, the following: >> >>>>> (min <= Xa) & (Xa <= max) >> >> will return an array of bools instead of raising an error (as you would >> get >> with `min <= Xa <= max`). This is what I meant to explain above. >> >> But, throwing an `if` in front of that comparison still doesn't work >> because it's ambiguous: Should `np.array([True False])` be true or false? >> Instead you should check `np.all(np.array([True False]))`, which evaluates >> as False since not-all elements are True, or `np.any(np.array([True >> False]))`, which evaluates as True since one element is True. >> >> -Tony >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > -- > View this message in context: http://old.nabble.com/ValueError%3A-The-truth-value-of-an-array-with-more-than-one-element-is-ambiguous.-tp33583156p33645519.html > Sent from the Scipy-User mailing list archive at Nabble.com. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From tsyu80 at gmail.com Fri Apr 6 17:13:41 2012 From: tsyu80 at gmail.com (Tony Yu) Date: Fri, 6 Apr 2012 17:13:41 -0400 Subject: [SciPy-User] [SciPy-user] ValueError: The truth value of an array with more than one element is ambiguous. In-Reply-To: <33645519.post@talk.nabble.com> References: <33583156.post@talk.nabble.com> <33645519.post@talk.nabble.com> Message-ID: On Fri, Apr 6, 2012 at 4:52 PM, surfcast23 wrote: > > Hi Tony, > > Thanks for the help. Would I be able to use the np.any and np.all > functions to count the number of true occurrences? > > Nope, but there are a few other ways of counting. The easiest is to call `np.sum` since True = 1, False = 0; e.g.: >>> np.sum(np.array([True, False, True, False, True])) 3 You can also use `np.where` or `np.nonzero` to return indices of nonzero (i.e. True) elements and then get the length of the index array. Note, however, that these functions return a tuple of indices (with a length equal to the array dimensions) so you'll have to grab one of the index arrays first: >>> idx_true = np.nonzero(np.array([True, False, True, False, True])) >>> len(idx_true[0]) 3 Best, -Tony > Tony Yu-3 wrote: > > > > On Fri, Apr 6, 2012 at 12:54 AM, Tony Yu wrote: > > > >> > >> > >> On Thu, Apr 5, 2012 at 6:46 PM, surfcast23 > wrote: > >> > >>> Hi, I have an if statement and what I want it to do is go through > arrays > >>> and find the common elements in all three arrays. When I try the code > >>> below > >>> I get this error * ValueError: The truth value of an array with more > >>> than one element is ambiguous. Use a.any() or a.all()* Can some one > >>> explain the error to me and how I might be able to fix it. Thanks in > >>> advance. 
*if min <= Xa <=max & min <= Ya <=max & min <= Za <=max: > >>> print("in range") else: print("Not in range")* > >> > >> > >> This explanation may or may not be clear, but your question is answered > >> in this > >> communication< > http://mail.python.org/pipermail/python-ideas/2011-October/012278.html> > >> . > >> > >> Roughly: > >> 1) Python's default behavior for chained comparisons don't work as you'd > >> expect for numpy arrays. > >> 2) Python doesn't allow numpy to change this default behavior (at least > >> currently, and maybe > >> never< > http://mail.python.org/pipermail/python-dev/2012-March/117510.html> > >> ). > >> > >> Nevertheless, you can get around this by separating the comparisons > >> > >> >>> if (min <= Xa) & (Xa <= max): > >> > >> Note the use of `&` instead of `and`, which is at the heart of the > >> issue > >> . > >> > >> Hope that helps, > >> -Tony > >> > > > > Oops, I think I got myself mixed up in the explanation. Separating the > > comparisons fixes one error; For example, the following: > > > >>>> (min <= Xa) & (Xa <= max) > > > > will return an array of bools instead of raising an error (as you would > > get > > with `min <= Xa <= max`). This is what I meant to explain above. > > > > But, throwing an `if` in front of that comparison still doesn't work > > because it's ambiguous: Should `np.array([True False])` be true or false? > > Instead you should check `np.all(np.array([True False]))`, which > evaluates > > as False since not-all elements are True, or `np.any(np.array([True > > False]))`, which evaluates as True since one element is True. > > > > -Tony > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > -- > View this message in context: > http://old.nabble.com/ValueError%3A-The-truth-value-of-an-array-with-more-than-one-element-is-ambiguous.-tp33583156p33645519.html > Sent from the Scipy-User mailing list archive at Nabble.com. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From surfcast23 at gmail.com Fri Apr 6 22:16:07 2012 From: surfcast23 at gmail.com (surfcast23) Date: Fri, 6 Apr 2012 19:16:07 -0700 (PDT) Subject: [SciPy-User] [SciPy-user] ValueError: The truth value of an array with more than one element is ambiguous. In-Reply-To: References: <33583156.post@talk.nabble.com> <33645519.post@talk.nabble.com> Message-ID: <33647200.post@talk.nabble.com> Thank you Tony Tony Yu-3 wrote: > > On Fri, Apr 6, 2012 at 4:52 PM, surfcast23 wrote: > >> >> Hi Tony, >> >> Thanks for the help. Would I be able to use the np.any and np.all >> functions to count the number of true occurrences? >> >> > Nope, but there are a few other ways of counting. The easiest is to call > `np.sum` since True = 1, False = 0; e.g.: > >>>> np.sum(np.array([True, False, True, False, True])) > 3 > > You can also use `np.where` or `np.nonzero` to return indices of nonzero > (i.e. True) elements and then get the length of the index array. 
Note, > however, that these functions return a tuple of indices (with a length > equal to the array dimensions) so you'll have to grab one of the index > arrays first: > >>>> idx_true = np.nonzero(np.array([True, False, True, False, True])) >>>> len(idx_true[0]) > 3 > > Best, > -Tony > > >> Tony Yu-3 wrote: >> > >> > On Fri, Apr 6, 2012 at 12:54 AM, Tony Yu wrote: >> > >> >> >> >> >> >> On Thu, Apr 5, 2012 at 6:46 PM, surfcast23 >> wrote: >> >> >> >>> Hi, I have an if statement and what I want it to do is go through >> arrays >> >>> and find the common elements in all three arrays. When I try the code >> >>> below >> >>> I get this error * ValueError: The truth value of an array with more >> >>> than one element is ambiguous. Use a.any() or a.all()* Can some one >> >>> explain the error to me and how I might be able to fix it. Thanks in >> >>> advance. *if min <= Xa <=max & min <= Ya <=max & min <= Za <=max: >> >>> print("in range") else: print("Not in range")* >> >> >> >> >> >> This explanation may or may not be clear, but your question is >> answered >> >> in this >> >> communication< >> http://mail.python.org/pipermail/python-ideas/2011-October/012278.html> >> >> . >> >> >> >> Roughly: >> >> 1) Python's default behavior for chained comparisons don't work as >> you'd >> >> expect for numpy arrays. >> >> 2) Python doesn't allow numpy to change this default behavior (at >> least >> >> currently, and maybe >> >> never< >> http://mail.python.org/pipermail/python-dev/2012-March/117510.html> >> >> ). >> >> >> >> Nevertheless, you can get around this by separating the comparisons >> >> >> >> >>> if (min <= Xa) & (Xa <= max): >> >> >> >> Note the use of `&` instead of `and`, which is at the heart of the >> >> issue >> >> . >> >> >> >> Hope that helps, >> >> -Tony >> >> >> > >> > Oops, I think I got myself mixed up in the explanation. Separating the >> > comparisons fixes one error; For example, the following: >> > >> >>>> (min <= Xa) & (Xa <= max) >> > >> > will return an array of bools instead of raising an error (as you would >> > get >> > with `min <= Xa <= max`). This is what I meant to explain above. >> > >> > But, throwing an `if` in front of that comparison still doesn't work >> > because it's ambiguous: Should `np.array([True False])` be true or >> false? >> > Instead you should check `np.all(np.array([True False]))`, which >> evaluates >> > as False since not-all elements are True, or `np.any(np.array([True >> > False]))`, which evaluates as True since one element is True. >> > >> > -Tony >> > >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> > >> >> -- >> View this message in context: >> http://old.nabble.com/ValueError%3A-The-truth-value-of-an-array-with-more-than-one-element-is-ambiguous.-tp33583156p33645519.html >> Sent from the Scipy-User mailing list archive at Nabble.com. >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- View this message in context: http://old.nabble.com/ValueError%3A-The-truth-value-of-an-array-with-more-than-one-element-is-ambiguous.-tp33583156p33647200.html Sent from the Scipy-User mailing list archive at Nabble.com. 
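Putting the pieces of this exchange together, a small illustrative sketch of the range test plus counting; the Xa/Ya/Za names mirror the question, the sample values are made up, and `lo`/`hi` stand in for the original `min`/`max`, which would otherwise shadow the Python builtins:

import numpy as np

Xa = np.array([0.2, 0.5, 1.7])
Ya = np.array([0.1, 0.9, 0.4])
Za = np.array([0.3, 2.1, 0.6])
lo, hi = 0.0, 1.0

# element-wise range test for each array, combined with the bitwise & operator
in_range = (lo <= Xa) & (Xa <= hi) & (lo <= Ya) & (Ya <= hi) & (lo <= Za) & (Za <= hi)

print(in_range.any())   # True  -> at least one point lies in the box
print(in_range.all())   # False -> not every point lies in the box
print(in_range.sum())   # 1     -> number of points inside (True counts as 1)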
From surfcast23 at gmail.com Fri Apr 6 22:16:33 2012 From: surfcast23 at gmail.com (surfcast23) Date: Fri, 6 Apr 2012 19:16:33 -0700 (PDT) Subject: [SciPy-User] [SciPy-user] ValueError: The truth value of an array with more than one element is ambiguous. In-Reply-To: <1DBE014C-C991-4BF1-84A1-B8A1E6A6D639@yale.edu> References: <33583156.post@talk.nabble.com> <33645519.post@talk.nabble.com> <1DBE014C-C991-4BF1-84A1-B8A1E6A6D639@yale.edu> Message-ID: <33647201.post@talk.nabble.com> Thanks Zachary Zachary Pincus-2 wrote: > >> Thanks for the help. Would I be able to use the np.any and np.all >> functions to count the number of true occurrences? > > I typically use np.sum() (or arr.sum() where arr is a numpy array) to > count the number of True values (which count as 1 in boolean arrays, where > Falses are 0.) > >> >> >> Tony Yu-3 wrote: >>> >>> On Fri, Apr 6, 2012 at 12:54 AM, Tony Yu wrote: >>> >>>> >>>> >>>> On Thu, Apr 5, 2012 at 6:46 PM, surfcast23 >>>> wrote: >>>> >>>>> Hi, I have an if statement and what I want it to do is go through >>>>> arrays >>>>> and find the common elements in all three arrays. When I try the code >>>>> below >>>>> I get this error * ValueError: The truth value of an array with more >>>>> than one element is ambiguous. Use a.any() or a.all()* Can some one >>>>> explain the error to me and how I might be able to fix it. Thanks in >>>>> advance. *if min <= Xa <=max & min <= Ya <=max & min <= Za <=max: >>>>> print("in range") else: print("Not in range")* >>>> >>>> >>>> This explanation may or may not be clear, but your question is answered >>>> in this >>>> communication >>>> . >>>> >>>> Roughly: >>>> 1) Python's default behavior for chained comparisons don't work as >>>> you'd >>>> expect for numpy arrays. >>>> 2) Python doesn't allow numpy to change this default behavior (at least >>>> currently, and maybe >>>> never >>>> ). >>>> >>>> Nevertheless, you can get around this by separating the comparisons >>>> >>>>>>> if (min <= Xa) & (Xa <= max): >>>> >>>> Note the use of `&` instead of `and`, which is at the heart of the >>>> issue >>>> . >>>> >>>> Hope that helps, >>>> -Tony >>>> >>> >>> Oops, I think I got myself mixed up in the explanation. Separating the >>> comparisons fixes one error; For example, the following: >>> >>>>>> (min <= Xa) & (Xa <= max) >>> >>> will return an array of bools instead of raising an error (as you would >>> get >>> with `min <= Xa <= max`). This is what I meant to explain above. >>> >>> But, throwing an `if` in front of that comparison still doesn't work >>> because it's ambiguous: Should `np.array([True False])` be true or >>> false? >>> Instead you should check `np.all(np.array([True False]))`, which >>> evaluates >>> as False since not-all elements are True, or `np.any(np.array([True >>> False]))`, which evaluates as True since one element is True. >>> >>> -Tony >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> >> -- >> View this message in context: >> http://old.nabble.com/ValueError%3A-The-truth-value-of-an-array-with-more-than-one-element-is-ambiguous.-tp33583156p33645519.html >> Sent from the Scipy-User mailing list archive at Nabble.com. 
>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- View this message in context: http://old.nabble.com/ValueError%3A-The-truth-value-of-an-array-with-more-than-one-element-is-ambiguous.-tp33583156p33647201.html Sent from the Scipy-User mailing list archive at Nabble.com. From kevin.gullikson at gmail.com Fri Apr 6 17:00:21 2012 From: kevin.gullikson at gmail.com (Kevin Gullikson) Date: Fri, 6 Apr 2012 16:00:21 -0500 Subject: [SciPy-User] [SciPy-user] ValueError: The truth value of an array with more than one element is ambiguous. In-Reply-To: <33645519.post@talk.nabble.com> References: <33583156.post@talk.nabble.com> <33645519.post@talk.nabble.com> Message-ID: Would I be able to use the np.any and np.all functions to count the number of true occurrences? You can count the number of true occurences with np.sum(): >>> a = np.array((True, False, False, True, True)) >>> a.sum() 3 Kevin On Fri, Apr 6, 2012 at 3:52 PM, surfcast23 wrote: > > Hi Tony, > > Thanks for the help. Would I be able to use the np.any and np.all > functions to count the number of true occurrences? > > > > Tony Yu-3 wrote: > > > > On Fri, Apr 6, 2012 at 12:54 AM, Tony Yu wrote: > > > >> > >> > >> On Thu, Apr 5, 2012 at 6:46 PM, surfcast23 > wrote: > >> > >>> Hi, I have an if statement and what I want it to do is go through > arrays > >>> and find the common elements in all three arrays. When I try the code > >>> below > >>> I get this error * ValueError: The truth value of an array with more > >>> than one element is ambiguous. Use a.any() or a.all()* Can some one > >>> explain the error to me and how I might be able to fix it. Thanks in > >>> advance. *if min <= Xa <=max & min <= Ya <=max & min <= Za <=max: > >>> print("in range") else: print("Not in range")* > >> > >> > >> This explanation may or may not be clear, but your question is answered > >> in this > >> communication< > http://mail.python.org/pipermail/python-ideas/2011-October/012278.html> > >> . > >> > >> Roughly: > >> 1) Python's default behavior for chained comparisons don't work as you'd > >> expect for numpy arrays. > >> 2) Python doesn't allow numpy to change this default behavior (at least > >> currently, and maybe > >> never< > http://mail.python.org/pipermail/python-dev/2012-March/117510.html> > >> ). > >> > >> Nevertheless, you can get around this by separating the comparisons > >> > >> >>> if (min <= Xa) & (Xa <= max): > >> > >> Note the use of `&` instead of `and`, which is at the heart of the > >> issue > >> . > >> > >> Hope that helps, > >> -Tony > >> > > > > Oops, I think I got myself mixed up in the explanation. Separating the > > comparisons fixes one error; For example, the following: > > > >>>> (min <= Xa) & (Xa <= max) > > > > will return an array of bools instead of raising an error (as you would > > get > > with `min <= Xa <= max`). This is what I meant to explain above. > > > > But, throwing an `if` in front of that comparison still doesn't work > > because it's ambiguous: Should `np.array([True False])` be true or false? > > Instead you should check `np.all(np.array([True False]))`, which > evaluates > > as False since not-all elements are True, or `np.any(np.array([True > > False]))`, which evaluates as True since one element is True. 
> > > > -Tony > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > -- > View this message in context: > http://old.nabble.com/ValueError%3A-The-truth-value-of-an-array-with-more-than-one-element-is-ambiguous.-tp33583156p33645519.html > Sent from the Scipy-User mailing list archive at Nabble.com. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Sat Apr 7 14:38:53 2012 From: srean.list at gmail.com (srean) Date: Sat, 7 Apr 2012 13:38:53 -0500 Subject: [SciPy-User] Help with clustering : Memory Error with large dataset In-Reply-To: References: Message-ID: I think what you are asking is more of a research question than a Scipy/Numpy question. You have to think about the problem to see how you can reduce the amount of data (sampling, streaming, multi-staged clustering, min-hashing, space-filling curves etc) and think of the space complexity of the algorithm that suits it. At this scale, plug and play doesn't work as often; some work in appropriately reformulating the problem is required. On Thu, Apr 5, 2012 at 4:10 PM, Abhishek Pratap wrote: > Hey Guys > > I am re-posting a message I had sent to numpy mailing list earlier. In > summary I need help with clustering. My input dataset is about 1-2 > million x,y coordinates which I would like to cluster together for ex > using DBSCAN algo. I tried it on a small data set and it works fine. > When I increase my input size it crashes. Can I be more efficient ? > More details copied below. > > Thanks! > -Abhi > > > ===message from numpy mailing list==== > > > I am new to both python and more so to numpy. I am trying to cluster > close to a 900K points using DBSCAN algo. My input is a list of ~900k > tuples each having two points (x,y) coordinates. I am converting them > to numpy array and passing them to pdist method of > scipy.spatial.distance for calculating distance between each point. > > Here is some size info on my numpy array > shape of input array : (828575, 2) > Size : 6872000 bytes > > I think the error has something to do with the default double dtype of > numpy array of pdist function. I would appreciate if you could help me > debug this. I am sure I overlooking some naive thing here > > See the traceback below.
H = 1 - S/np.max(S) > > /house/homedirs/a/apratap/playground/software/epd-7.2-2-rh5-x86_64/lib/python2.7/site-packages/scipy/spatial/distance.pyc > in pdist(X, metric, p, w, V, VI) > ?1155 > ?1156 ? ? m, n = s > -> 1157 ? ? dm = np.zeros((m * (m - 1) / 2,), dtype=np.double) > ?1158 > ?1159 ? ? wmink_names = ['wminkowski', 'wmi', 'wm', 'wpnorm'] > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From emanuele at relativita.com Sat Apr 7 15:21:40 2012 From: emanuele at relativita.com (Emanuele Olivetti) Date: Sat, 07 Apr 2012 21:21:40 +0200 Subject: [SciPy-User] Help with clustering : Memory Error with large dataset In-Reply-To: References: Message-ID: <4F8093C4.4000909@relativita.com> Hi, You might want to have a look to scikit-learn http://scikit-learn.org In particular to: http://scikit-learn.org/stable/modules/clustering.html Maybe using a connectivity matrix (e.g. through a BallTree) could solve the computational issue. See for example: http://scikit-learn.org/stable/auto_examples/cluster/plot_ward_structured_vs_unstructured.html#example-cluster-plot-ward-structured-vs-unstructured-py Another possibility could be their clever implementation of k-means. Best, Emanuele On 04/05/2012 11:10 PM, Abhishek Pratap wrote: > Hey Guys > > I am re-posting a message I had sent to numpy mailing list earlier. In > summary I need help with clustering. My input dataset is about 1-2 > million x,y coordinates which I would like to cluster together for ex > using DBSCAN algo. I tried it on a small data set and it works fine. > When I increase my input size it crashes. Can I be more efficient ? > More details copied below. > > Thanks! > -Abhi > > > ===message from numpy mailing list==== > > > I am new to both python and more so to numpy. I am trying to cluster > close to a 900K points using DBSCAN algo. My input is a list of ~900k > tuples each having two points (x,y) coordinates. I am converting them > to numpy array and passing them to pdist method of > scipy.spatial.distance for calculating distance between each point. > > Here is some size info on my numpy array > shape of input array : (828575, 2) > Size : 6872000 bytes > > I think the error has something to do with the default double dtype of > numpy array of pdist function. I would appreciate if you could help me > debug this. I am sure I overlooking some naive thing here > > See the traceback below. 
> > > MemoryError Traceback (most recent call last) > /house/homedirs/a/apratap/Dropbox/dev/ipython/ > in() > 36 > 37 print cleaned_senseBam > ---> 38 cluster_pet_points_per_chromosome(sense_bamFile) > > /house/homedirs/a/apratap/Dropbox/dev/ipython/ > in cluster_pet_points_per_chromosome(bamFile) > 30 print 'Size of list points is %d' % sys.getsizeof(points) > 31 print 'Size of numpy array is %d' % > sys.getsizeof(points_array) > ---> 32 cluster_points_DBSCAN(points_array) > 33 #print points_array > > 34 > > /house/homedirs/a/apratap/Dropbox/dev/ipython/ > in cluster_points_DBSCAN(data_numpy_array) > 9 def cluster_points_DBSCAN(data_numpy_array): > 10 #eucledian distance calculation > > ---> 11 D = distance.pdist(data_numpy_array) > 12 S = distance.squareform(D) > 13 H = 1 - S/np.max(S) > > /house/homedirs/a/apratap/playground/software/epd-7.2-2-rh5-x86_64/lib/python2.7/site-packages/scipy/spatial/distance.pyc > in pdist(X, metric, p, w, V, VI) > 1155 > 1156 m, n = s > -> 1157 dm = np.zeros((m * (m - 1) / 2,), dtype=np.double) > 1158 > 1159 wmink_names = ['wminkowski', 'wmi', 'wm', 'wpnorm'] > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From surfcast23 at gmail.com Sat Apr 7 16:32:15 2012 From: surfcast23 at gmail.com (surfcast23) Date: Sat, 7 Apr 2012 13:32:15 -0700 (PDT) Subject: [SciPy-User] [SciPy-user] ValueError: The truth value of an array with more than one element is ambiguous. In-Reply-To: References: <33583156.post@talk.nabble.com> Message-ID: <33649757.post@talk.nabble.com> Hi, I am still a little confused as how to use numpy.all() to evaluate all three arrays for a specific range vales. I looked at the documentation, but did not see any examples that were close to what I need. Tony Yu-3 wrote: > > On Fri, Apr 6, 2012 at 12:54 AM, Tony Yu wrote: > >> >> >> On Thu, Apr 5, 2012 at 6:46 PM, surfcast23 wrote: >> >>> Hi, I have an if statement and what I want it to do is go through arrays >>> and find the common elements in all three arrays. When I try the code >>> below >>> I get this error * ValueError: The truth value of an array with more >>> than one element is ambiguous. Use a.any() or a.all()* Can some one >>> explain the error to me and how I might be able to fix it. Thanks in >>> advance. *if min <= Xa <=max & min <= Ya <=max & min <= Za <=max: >>> print("in range") else: print("Not in range")* >> >> >> This explanation may or may not be clear, but your question is answered >> in this >> communication >> . >> >> Roughly: >> 1) Python's default behavior for chained comparisons don't work as you'd >> expect for numpy arrays. >> 2) Python doesn't allow numpy to change this default behavior (at least >> currently, and maybe >> never >> ). >> >> Nevertheless, you can get around this by separating the comparisons >> >> >>> if (min <= Xa) & (Xa <= max): >> >> Note the use of `&` instead of `and`, which is at the heart of the >> issue >> . >> >> Hope that helps, >> -Tony >> > > Oops, I think I got myself mixed up in the explanation. Separating the > comparisons fixes one error; For example, the following: > >>>> (min <= Xa) & (Xa <= max) > > will return an array of bools instead of raising an error (as you would > get > with `min <= Xa <= max`). This is what I meant to explain above. > > But, throwing an `if` in front of that comparison still doesn't work > because it's ambiguous: Should `np.array([True False])` be true or false? 
> Instead you should check `np.all(np.array([True False]))`, which evaluates > as False since not-all elements are True, or `np.any(np.array([True > False]))`, which evaluates as True since one element is True. > > -Tony > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- View this message in context: http://old.nabble.com/ValueError%3A-The-truth-value-of-an-array-with-more-than-one-element-is-ambiguous.-tp33583156p33649757.html Sent from the Scipy-User mailing list archive at Nabble.com. From tsyu80 at gmail.com Sun Apr 8 10:38:57 2012 From: tsyu80 at gmail.com (Tony Yu) Date: Sun, 8 Apr 2012 10:38:57 -0400 Subject: [SciPy-User] [SciPy-user] ValueError: The truth value of an array with more than one element is ambiguous. In-Reply-To: <33649757.post@talk.nabble.com> References: <33583156.post@talk.nabble.com> <33649757.post@talk.nabble.com> Message-ID: On Sat, Apr 7, 2012 at 4:32 PM, surfcast23 wrote: > > Hi, > > I am still a little confused as how to use numpy.all() to evaluate all > three arrays for a specific range vales. I looked at the documentation, but > did not see any examples that were close to what I need. > > Calling `np.all` will just return a single True or False so you can combine all those return values with `and`. The important part is that you split up your chained comparisons. So you might write: if np.all(Xa >= min) and np.all(Xa <=max) and np.all(Ya >= min) and np.all(Ya <=max) and np.all(Za >= min) and np.all(Za <=max): ... Not the most elegant solution, but it works. Alternatively, you could use a list comprehension to collapse some of this down if np.all([np.all(a >= min) and np.all(a <=max) for a in (Xa, Ya, Za)]): ... The inner calls to `np.all` check that all elements of an array are greater than min or less than max. The outer call to `np.all` checks that all three arrays (Xa, Ya, Za) satisfies this requirement. Probably unnecessary information ------- In previous replies, I needed to use bitwise-and (i.e. `&`) only when combining bool arrays; for example, `(a >= min) & (a >= max)`. In the above examples, I have to call `np.all` on each comparison *before* combining with `and`. In contrast, the `&`-operator will combine *each* element of the first comparison with each element of the second. This gives a bool array, which you'd need to call `np.all` on if you want to put it in an if-statement: if np.all([np.all((a >= min) & (a <=max)) for a in (Xa, Ya, Za)]): ... -Tony Tony Yu-3 wrote: > > > > On Fri, Apr 6, 2012 at 12:54 AM, Tony Yu wrote: > > > >> > >> > >> On Thu, Apr 5, 2012 at 6:46 PM, surfcast23 > wrote: > >> > >>> Hi, I have an if statement and what I want it to do is go through > arrays > >>> and find the common elements in all three arrays. When I try the code > >>> below > >>> I get this error * ValueError: The truth value of an array with more > >>> than one element is ambiguous. Use a.any() or a.all()* Can some one > >>> explain the error to me and how I might be able to fix it. Thanks in > >>> advance. *if min <= Xa <=max & min <= Ya <=max & min <= Za <=max: > >>> print("in range") else: print("Not in range")* > >> > >> > >> This explanation may or may not be clear, but your question is answered > >> in this > >> communication< > http://mail.python.org/pipermail/python-ideas/2011-October/012278.html> > >> . > >> > >> Roughly: > >> 1) Python's default behavior for chained comparisons don't work as you'd > >> expect for numpy arrays. 
> >> 2) Python doesn't allow numpy to change this default behavior (at least > >> currently, and maybe > >> never< > http://mail.python.org/pipermail/python-dev/2012-March/117510.html> > >> ). > >> > >> Nevertheless, you can get around this by separating the comparisons > >> > >> >>> if (min <= Xa) & (Xa <= max): > >> > >> Note the use of `&` instead of `and`, which is at the heart of the > >> issue > >> . > >> > >> Hope that helps, > >> -Tony > >> > > > > Oops, I think I got myself mixed up in the explanation. Separating the > > comparisons fixes one error; For example, the following: > > > >>>> (min <= Xa) & (Xa <= max) > > > > will return an array of bools instead of raising an error (as you would > > get > > with `min <= Xa <= max`). This is what I meant to explain above. > > > > But, throwing an `if` in front of that comparison still doesn't work > > because it's ambiguous: Should `np.array([True False])` be true or false? > > Instead you should check `np.all(np.array([True False]))`, which > evaluates > > as False since not-all elements are True, or `np.any(np.array([True > > False]))`, which evaluates as True since one element is True. > > > > -Tony > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > -- > View this message in context: > http://old.nabble.com/ValueError%3A-The-truth-value-of-an-array-with-more-than-one-element-is-ambiguous.-tp33583156p33649757.html > Sent from the Scipy-User mailing list archive at Nabble.com. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanforeest at gmail.com Mon Apr 9 17:04:39 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Mon, 9 Apr 2012 23:04:39 +0200 Subject: [SciPy-User] scipy.stats convolution of two distributions Message-ID: Hi, In one of my projects I built some code that depends in a nice and generic way on the methods of rv_continuous in scipy.stats. Now it turns out that I shot myself in the foot because I need to run a test with the sum (convolution) of three such distributions. As far as I can see there is no standard way to achieve this in scipy.stats. Does anybody know of a good (generic) way to do this? If not, would it actually be useful to add such functionality to scipy.stats, if possible at all? After some struggling I wrote the code below, but this is a first attempt. I am very interested in obtaining feedback to turn this into something that is useful for a larger population than just me. Thanks in advance Nicky import numpy as np import scipy.stats from scipy.stats import poisson, uniform, expon import pylab as pl # I need a grid since np.convolve requires two arrays. 
# choose the grid such that it covers the numerically relevant support # of the distributions grid = np.arange(0., 5., 0.001) # I need P(D1+D2+D3 <= x) D1 = expon(scale = 1./2) D2 = expon(scale = 1./3) D3 = expon(scale = 1./6) class convolved_gen(scipy.stats.rv_continuous): def __init__(self, D1, D2, grid): self.D1 = D1 self.D2 = D2 delta = grid[1]-grid[0] p1 = self.D1.pdf(grid) p2 = self.D2.pdf(grid)*delta self.conv = np.convolve(p1, p2) super(convolved_gen, self).__init__(name = "convolved") def _cdf(self, grid): cdf = np.cumsum(self.conv) return cdf/cdf[-1] # ensure that cdf[-1] = 1 def _pdf(self, grid): return self.conv[:len(grid)] def _stats(self): m = self.D1.stats("m") + self.D2.stats("m") v = self.D1.stats("v") + self.D2.stats("v") return m, v, 0., 0. convolved = convolved_gen(D1, D2, grid) conv = convolved() convolved2 = convolved_gen(conv, D3, grid) conv2 = convolved2() pl.plot(grid,D1.cdf(grid)) pl.plot(grid,conv.cdf(grid)) pl.plot(grid,conv2.cdf(grid)) pl.show() From josef.pktd at gmail.com Mon Apr 9 18:06:14 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 9 Apr 2012 18:06:14 -0400 Subject: [SciPy-User] scipy.stats convolution of two distributions In-Reply-To: References: Message-ID: On Mon, Apr 9, 2012 at 5:04 PM, nicky van foreest wrote: > Hi, > > In one of my projects I built some code that depends in a nice and > generic way on the methods of rv_continuous in scipy.stats. Now it > turns out that I shot myself in the foot because I need to run a test > with the sum (convolution) of three such distributions. As far as I > can see there is no standard way to achieve this in scipy.stats. Does > anybody know of a good (generic) way to do this? If not, would it > actually be useful to add such functionality to scipy.stats, if > possible at all? > > After some struggling I wrote the code below, but this is a first > attempt. I am very interested in obtaining feedback to turn this into > something that is useful for a larger population than just me. Sounds fun. The plots are a bit misleading, because matplotlib is doing the interpolation for you pl.plot(grid,D1.cdf(grid), '.') pl.plot(grid,conv.cdf(grid), '.') pl.plot(grid,conv2.cdf(grid), '.') The main problem is the mixing of continuous distribution and discrete grid. pdf, cdf, .... want to evaluate at a point and with floating points it's perilous (as we discussed before). As I read it, neither your cdf nor pdf return specific points. I think the easiest would be to work with linear interpolation interp1d on a fine grid, but I never checked how fast this is for a fine grid. If cdf and pdf are defined with a piecewise polynomial, then they can be evaluated at any points and the generic class, rv_continuous, should be able to handle everything. The alternative would be to work with the finite grid and use the fixed spacing to define a lattice distribution, but that doesn't exist in scipy. I haven't thought about the convolution itself yet, (an alternative would be using fft to work with the characteristic function.) If you use this for a test, how much do you care about having accurate tails, or is effectively truncating the distribution ok? Josef > > Thanks in advance > > Nicky > > import numpy as np > import scipy.stats > from scipy.stats import poisson, uniform, expon > import pylab as pl > > # I need a grid since np.convolve requires two arrays. 
> > # choose the grid such that it covers the numerically relevant support > # of the distributions > grid = np.arange(0., 5., 0.001) > > # I need P(D1+D2+D3 <= x) > D1 = expon(scale = 1./2) > D2 = expon(scale = 1./3) > D3 = expon(scale = 1./6) > > class convolved_gen(scipy.stats.rv_continuous): > ? ?def __init__(self, D1, D2, grid): > ? ? ? ?self.D1 = D1 > ? ? ? ?self.D2 = D2 > ? ? ? ?delta = grid[1]-grid[0] > ? ? ? ?p1 = self.D1.pdf(grid) > ? ? ? ?p2 = self.D2.pdf(grid)*delta > ? ? ? ?self.conv = np.convolve(p1, p2) > > ? ? ? ?super(convolved_gen, self).__init__(name = "convolved") > > ? ?def _cdf(self, grid): > ? ? ? ?cdf = np.cumsum(self.conv) > ? ? ? ?return cdf/cdf[-1] # ensure that cdf[-1] = 1 > > ? ?def _pdf(self, grid): > ? ? ? ?return self.conv[:len(grid)] > > ? ?def _stats(self): > ? ? ? ?m = self.D1.stats("m") + self.D2.stats("m") > ? ? ? ?v = self.D1.stats("v") + self.D2.stats("v") > ? ? ? ?return m, v, 0., 0. > > > > convolved = convolved_gen(D1, D2, grid) > conv = convolved() > > convolved2 = convolved_gen(conv, D3, grid) > conv2 = convolved2() > > pl.plot(grid,D1.cdf(grid)) > pl.plot(grid,conv.cdf(grid)) > pl.plot(grid,conv2.cdf(grid)) > pl.show() > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Mon Apr 9 18:36:30 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 9 Apr 2012 18:36:30 -0400 Subject: [SciPy-User] scipy.stats convolution of two distributions In-Reply-To: References: Message-ID: On Mon, Apr 9, 2012 at 6:06 PM, wrote: > On Mon, Apr 9, 2012 at 5:04 PM, nicky van foreest wrote: >> Hi, >> >> In one of my projects I built some code that depends in a nice and >> generic way on the methods of rv_continuous in scipy.stats. Now it >> turns out that I shot myself in the foot because I need to run a test >> with the sum (convolution) of three such distributions. As far as I >> can see there is no standard way to achieve this in scipy.stats. Does >> anybody know of a good (generic) way to do this? If not, would it >> actually be useful to add such functionality to scipy.stats, if >> possible at all? >> >> After some struggling I wrote the code below, but this is a first >> attempt. I am very interested in obtaining feedback to turn this into >> something that is useful for a larger population than just me. > > Sounds fun. > > The plots are a bit misleading, because matplotlib is doing the > interpolation for you > > pl.plot(grid,D1.cdf(grid), '.') > pl.plot(grid,conv.cdf(grid), '.') > pl.plot(grid,conv2.cdf(grid), '.') > > The main problem is the mixing of continuous distribution and discrete > grid. pdf, cdf, .... want to evaluate at a point and with floating > points it's perilous (as we discussed before). > > As I read it, neither your cdf nor pdf return specific points. > > I think the easiest would be to work with linear interpolation > interp1d on a fine grid, but I never checked how fast this is for a > fine grid. > If cdf and pdf are defined with a piecewise polynomial, then they can > be evaluated at any points and the generic class, rv_continuous, > should be able to handle everything. > The alternative would be to work with the finite grid and use the > fixed spacing to define a lattice distribution, but that doesn't exist > in scipy. > > I haven't thought about the convolution itself yet, (an alternative > would be using fft to work with the characteristic function.) 
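For what it's worth, a minimal sketch of the fft route mentioned just above, applied to the discretized pdfs from the original script; scipy.signal.fftconvolve is assumed, the grid and scale parameters are the ones from that script, and the accuracy is still limited by the grid spacing and by truncating the support at 5:

import numpy as np
from scipy.stats import expon
from scipy.signal import fftconvolve

grid = np.arange(0., 5., 0.001)
dx = grid[1] - grid[0]
p1 = expon(scale=1./2).pdf(grid)
p2 = expon(scale=1./3).pdf(grid)

# discretized pdf of D1 + D2 on the same grid; fftconvolve gives essentially
# the same numbers as np.convolve but scales as O(n log n) on a fine grid
pdf_sum = fftconvolve(p1, p2)[:len(grid)] * dx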
> > If you use this for a test, how much do you care about having accurate > tails, or is effectively truncating the distribution ok? The other question I thought about in a similar situation is, what is the usage or access pattern. For many cases, I'm not really interested in evaluating the pdf at specific points, but over a range or interval of points. In this case relying on floating point access doesn't look like the best way to go, and I spend more time on the `expect` method. Calculating expectation of a function w.r.t the distribution that can use the internal representation instead of the generic integrate.quad. Josef > > Josef > >> >> Thanks in advance >> >> Nicky >> >> import numpy as np >> import scipy.stats >> from scipy.stats import poisson, uniform, expon >> import pylab as pl >> >> # I need a grid since np.convolve requires two arrays. >> >> # choose the grid such that it covers the numerically relevant support >> # of the distributions >> grid = np.arange(0., 5., 0.001) >> >> # I need P(D1+D2+D3 <= x) >> D1 = expon(scale = 1./2) >> D2 = expon(scale = 1./3) >> D3 = expon(scale = 1./6) >> >> class convolved_gen(scipy.stats.rv_continuous): >> ? ?def __init__(self, D1, D2, grid): >> ? ? ? ?self.D1 = D1 >> ? ? ? ?self.D2 = D2 >> ? ? ? ?delta = grid[1]-grid[0] >> ? ? ? ?p1 = self.D1.pdf(grid) >> ? ? ? ?p2 = self.D2.pdf(grid)*delta >> ? ? ? ?self.conv = np.convolve(p1, p2) >> >> ? ? ? ?super(convolved_gen, self).__init__(name = "convolved") >> >> ? ?def _cdf(self, grid): >> ? ? ? ?cdf = np.cumsum(self.conv) >> ? ? ? ?return cdf/cdf[-1] # ensure that cdf[-1] = 1 >> >> ? ?def _pdf(self, grid): >> ? ? ? ?return self.conv[:len(grid)] >> >> ? ?def _stats(self): >> ? ? ? ?m = self.D1.stats("m") + self.D2.stats("m") >> ? ? ? ?v = self.D1.stats("v") + self.D2.stats("v") >> ? ? ? ?return m, v, 0., 0. >> >> >> >> convolved = convolved_gen(D1, D2, grid) >> conv = convolved() >> >> convolved2 = convolved_gen(conv, D3, grid) >> conv2 = convolved2() >> >> pl.plot(grid,D1.cdf(grid)) >> pl.plot(grid,conv.cdf(grid)) >> pl.plot(grid,conv2.cdf(grid)) >> pl.show() >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Mon Apr 9 20:45:16 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 9 Apr 2012 20:45:16 -0400 Subject: [SciPy-User] scipy.stats convolution of two distributions In-Reply-To: References: Message-ID: On Mon, Apr 9, 2012 at 6:36 PM, wrote: > On Mon, Apr 9, 2012 at 6:06 PM, ? wrote: >> On Mon, Apr 9, 2012 at 5:04 PM, nicky van foreest wrote: >>> Hi, >>> >>> In one of my projects I built some code that depends in a nice and >>> generic way on the methods of rv_continuous in scipy.stats. Now it >>> turns out that I shot myself in the foot because I need to run a test >>> with the sum (convolution) of three such distributions. As far as I >>> can see there is no standard way to achieve this in scipy.stats. Does >>> anybody know of a good (generic) way to do this? If not, would it >>> actually be useful to add such functionality to scipy.stats, if >>> possible at all? >>> >>> After some struggling I wrote the code below, but this is a first >>> attempt. I am very interested in obtaining feedback to turn this into >>> something that is useful for a larger population than just me. >> >> Sounds fun. 
>> >> The plots are a bit misleading, because matplotlib is doing the >> interpolation for you >> >> pl.plot(grid,D1.cdf(grid), '.') >> pl.plot(grid,conv.cdf(grid), '.') >> pl.plot(grid,conv2.cdf(grid), '.') >> >> The main problem is the mixing of continuous distribution and discrete >> grid. pdf, cdf, .... want to evaluate at a point and with floating >> points it's perilous (as we discussed before). >> >> As I read it, neither your cdf nor pdf return specific points. >> >> I think the easiest would be to work with linear interpolation >> interp1d on a fine grid, but I never checked how fast this is for a >> fine grid. >> If cdf and pdf are defined with a piecewise polynomial, then they can >> be evaluated at any points and the generic class, rv_continuous, >> should be able to handle everything. >> The alternative would be to work with the finite grid and use the >> fixed spacing to define a lattice distribution, but that doesn't exist >> in scipy. >> >> I haven't thought about the convolution itself yet, (an alternative >> would be using fft to work with the characteristic function.) >> >> If you use this for a test, how much do you care about having accurate >> tails, or is effectively truncating the distribution ok? > > The other question I thought about in a similar situation is, what is > the usage or access pattern. > > For many cases, I'm not really interested in evaluating the pdf at > specific points, but over a range or interval of points. In this case > relying on floating point access doesn't look like the best way to go, > and I spend more time on the `expect` method. Calculating expectation > of a function w.r.t the distribution that can use the internal > representation instead of the generic integrate.quad. a test case to check numerical accuracy of discretized approximation/convolution Di = expon(scale = 1./2) sum of identical exponentially distributed random variables is gamma http://en.wikipedia.org/wiki/Gamma_distribution 1 exponential and corresponding gamma >>> convolved.D1.pdf(grid[:10]) array([ 2. , 1.996004 , 1.99201598, 1.98803593, 1.98406383, 1.98009967, 1.97614343, 1.97219509, 1.96825464, 1.96432206]) >>> stats.gamma.pdf(grid[:10], 1, scale=1/2.) array([ 2. , 1.996004 , 1.99201598, 1.98803593, 1.98406383, 1.98009967, 1.97614343, 1.97219509, 1.96825464, 1.96432206]) sum of 2 exponentials >>> stats.gamma.pdf(grid[:10], 2, scale=1/2.) array([ 0. , 0.00399201, 0.00796806, 0.01192822, 0.01587251, 0.019801 , 0.02371372, 0.02761073, 0.03149207, 0.0353578 ]) >>> convolved.pdf(np.zeros(10)) array([ 0.004 , 0.00798402, 0.0119521 , 0.01590429, 0.01984064, 0.0237612 , 0.02766601, 0.03155512, 0.03542858, 0.03928644]) >>> stats.gamma.pdf(grid[1:21], 2, scale=1/2.) - convolved.pdf(np.zeros(20)) array([ -7.99200533e-06, -1.59520746e-05, -2.38803035e-05, -3.17767875e-05, -3.96416218e-05, -4.74749013e-05, -5.52767208e-05, -6.30471746e-05, -7.07863571e-05, -7.84943621e-05, -8.61712832e-05, -9.38172141e-05, -1.01432248e-04, -1.09016477e-04, -1.16569995e-04, -1.24092894e-04, -1.31585266e-04, -1.39047203e-04, -1.46478797e-04, -1.53880139e-04]) sum of 3 exponentials >>> convolved2.conv[:2000:100] array([ 8.00000000e-06, 3.37382569e-02, 1.08865338e-01, 1.99552301e-01, 2.89730911e-01, 3.70089661e-01, 4.35890673e-01, 4.85403437e-01, 5.18794908e-01, 5.37354948e-01, 5.42966239e-01, 5.37750775e-01, 5.23842475e-01, 5.03248651e-01, 4.77772987e-01, 4.48980181e-01, 4.18187929e-01, 3.86476082e-01, 3.54705854e-01, 3.23544178e-01]) >>> stats.gamma.pdf(grid[1:2001:100], 3, scale=1/2.) 
array([ 3.99200799e-06, 3.33407414e-02, 1.08109964e-01, 1.98494147e-01, 2.88432744e-01, 3.68614464e-01, 4.34297139e-01, 4.83743524e-01, 5.17112771e-01, 5.35686765e-01, 5.41340592e-01, 5.36189346e-01, 5.22360899e-01, 5.01857412e-01, 4.76478297e-01, 4.47784794e-01, 4.17091870e-01, 3.85477285e-01, 3.53800704e-01, 3.22727969e-01]) >>> stats.gamma.pdf(grid[1:2001:100], 3, scale=1/2.) - convolved2.conv[:2000:100] array([ -4.00799201e-06, -3.97515433e-04, -7.55373610e-04, -1.05815476e-03, -1.29816640e-03, -1.47519705e-03, -1.59353434e-03, -1.65991307e-03, -1.68213690e-03, -1.66818281e-03, -1.62564706e-03, -1.56142889e-03, -1.48157626e-03, -1.39123891e-03, -1.29468978e-03, -1.19538731e-03, -1.09605963e-03, -9.98797767e-04, -9.05149579e-04, -8.16209174e-04]) values shifted by one ? >>> np.argmax(stats.gamma.pdf(grid, 3, scale=1/2.)) 1000 >>> np.argmax(convolved2.conv) 999 Josef > > Josef > > >> >> Josef >> >>> >>> Thanks in advance >>> >>> Nicky >>> >>> import numpy as np >>> import scipy.stats >>> from scipy.stats import poisson, uniform, expon >>> import pylab as pl >>> >>> # I need a grid since np.convolve requires two arrays. >>> >>> # choose the grid such that it covers the numerically relevant support >>> # of the distributions >>> grid = np.arange(0., 5., 0.001) >>> >>> # I need P(D1+D2+D3 <= x) >>> D1 = expon(scale = 1./2) >>> D2 = expon(scale = 1./3) >>> D3 = expon(scale = 1./6) >>> >>> class convolved_gen(scipy.stats.rv_continuous): >>> ? ?def __init__(self, D1, D2, grid): >>> ? ? ? ?self.D1 = D1 >>> ? ? ? ?self.D2 = D2 >>> ? ? ? ?delta = grid[1]-grid[0] >>> ? ? ? ?p1 = self.D1.pdf(grid) >>> ? ? ? ?p2 = self.D2.pdf(grid)*delta >>> ? ? ? ?self.conv = np.convolve(p1, p2) >>> >>> ? ? ? ?super(convolved_gen, self).__init__(name = "convolved") >>> >>> ? ?def _cdf(self, grid): >>> ? ? ? ?cdf = np.cumsum(self.conv) >>> ? ? ? ?return cdf/cdf[-1] # ensure that cdf[-1] = 1 >>> >>> ? ?def _pdf(self, grid): >>> ? ? ? ?return self.conv[:len(grid)] >>> >>> ? ?def _stats(self): >>> ? ? ? ?m = self.D1.stats("m") + self.D2.stats("m") >>> ? ? ? ?v = self.D1.stats("v") + self.D2.stats("v") >>> ? ? ? ?return m, v, 0., 0. >>> >>> >>> >>> convolved = convolved_gen(D1, D2, grid) >>> conv = convolved() >>> >>> convolved2 = convolved_gen(conv, D3, grid) >>> conv2 = convolved2() >>> >>> pl.plot(grid,D1.cdf(grid)) >>> pl.plot(grid,conv.cdf(grid)) >>> pl.plot(grid,conv2.cdf(grid)) >>> pl.show() >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Mon Apr 9 21:09:19 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 9 Apr 2012 21:09:19 -0400 Subject: [SciPy-User] scipy.stats convolution of two distributions In-Reply-To: References: Message-ID: On Mon, Apr 9, 2012 at 8:45 PM, wrote: > On Mon, Apr 9, 2012 at 6:36 PM, ? wrote: >> On Mon, Apr 9, 2012 at 6:06 PM, ? wrote: >>> On Mon, Apr 9, 2012 at 5:04 PM, nicky van foreest wrote: >>>> Hi, >>>> >>>> In one of my projects I built some code that depends in a nice and >>>> generic way on the methods of rv_continuous in scipy.stats. Now it >>>> turns out that I shot myself in the foot because I need to run a test >>>> with the sum (convolution) of three such distributions. As far as I >>>> can see there is no standard way to achieve this in scipy.stats. Does >>>> anybody know of a good (generic) way to do this? If not, would it >>>> actually be useful to add such functionality to scipy.stats, if >>>> possible at all? 
>>>> >>>> After some struggling I wrote the code below, but this is a first >>>> attempt. I am very interested in obtaining feedback to turn this into >>>> something that is useful for a larger population than just me. >>> >>> Sounds fun. >>> >>> The plots are a bit misleading, because matplotlib is doing the >>> interpolation for you >>> >>> pl.plot(grid,D1.cdf(grid), '.') >>> pl.plot(grid,conv.cdf(grid), '.') >>> pl.plot(grid,conv2.cdf(grid), '.') >>> >>> The main problem is the mixing of continuous distribution and discrete >>> grid. pdf, cdf, .... want to evaluate at a point and with floating >>> points it's perilous (as we discussed before). >>> >>> As I read it, neither your cdf nor pdf return specific points. >>> >>> I think the easiest would be to work with linear interpolation >>> interp1d on a fine grid, but I never checked how fast this is for a >>> fine grid. >>> If cdf and pdf are defined with a piecewise polynomial, then they can >>> be evaluated at any points and the generic class, rv_continuous, >>> should be able to handle everything. >>> The alternative would be to work with the finite grid and use the >>> fixed spacing to define a lattice distribution, but that doesn't exist >>> in scipy. >>> >>> I haven't thought about the convolution itself yet, (an alternative >>> would be using fft to work with the characteristic function.) >>> >>> If you use this for a test, how much do you care about having accurate >>> tails, or is effectively truncating the distribution ok? >> >> The other question I thought about in a similar situation is, what is >> the usage or access pattern. >> >> For many cases, I'm not really interested in evaluating the pdf at >> specific points, but over a range or interval of points. In this case >> relying on floating point access doesn't look like the best way to go, >> and I spend more time on the `expect` method. Calculating expectation >> of a function w.r.t the distribution that can use the internal >> representation instead of the generic integrate.quad. > > > a test case to check numerical accuracy of discretized approximation/convolution > > Di = expon(scale = 1./2) > sum of identical exponentially distributed random variables is gamma > http://en.wikipedia.org/wiki/Gamma_distribution > > 1 exponential and corresponding gamma > >>>> convolved.D1.pdf(grid[:10]) > array([ 2. ? ? ? ?, ?1.996004 ?, ?1.99201598, ?1.98803593, ?1.98406383, > ? ? ? ?1.98009967, ?1.97614343, ?1.97219509, ?1.96825464, ?1.96432206]) >>>> stats.gamma.pdf(grid[:10], 1, scale=1/2.) > array([ 2. ? ? ? ?, ?1.996004 ?, ?1.99201598, ?1.98803593, ?1.98406383, > ? ? ? ?1.98009967, ?1.97614343, ?1.97219509, ?1.96825464, ?1.96432206]) > > sum of 2 exponentials > >>>> stats.gamma.pdf(grid[:10], 2, scale=1/2.) > array([ 0. ? ? ? ?, ?0.00399201, ?0.00796806, ?0.01192822, ?0.01587251, > ? ? ? ?0.019801 ?, ?0.02371372, ?0.02761073, ?0.03149207, ?0.0353578 ]) >>>> convolved.pdf(np.zeros(10)) > array([ 0.004 ? ? , ?0.00798402, ?0.0119521 , ?0.01590429, ?0.01984064, > ? ? ? ?0.0237612 , ?0.02766601, ?0.03155512, ?0.03542858, ?0.03928644]) >>>> stats.gamma.pdf(grid[1:21], 2, scale=1/2.) - convolved.pdf(np.zeros(20)) > array([ -7.99200533e-06, ?-1.59520746e-05, ?-2.38803035e-05, > ? ? ? ?-3.17767875e-05, ?-3.96416218e-05, ?-4.74749013e-05, > ? ? ? ?-5.52767208e-05, ?-6.30471746e-05, ?-7.07863571e-05, > ? ? ? ?-7.84943621e-05, ?-8.61712832e-05, ?-9.38172141e-05, > ? ? ? ?-1.01432248e-04, ?-1.09016477e-04, ?-1.16569995e-04, > ? ? ? ?-1.24092894e-04, ?-1.31585266e-04, ?-1.39047203e-04, > ? ? ? 
?-1.46478797e-04, ?-1.53880139e-04]) > > sum of 3 exponentials > >>>> convolved2.conv[:2000:100] > array([ ?8.00000000e-06, ? 3.37382569e-02, ? 1.08865338e-01, > ? ? ? ? 1.99552301e-01, ? 2.89730911e-01, ? 3.70089661e-01, > ? ? ? ? 4.35890673e-01, ? 4.85403437e-01, ? 5.18794908e-01, > ? ? ? ? 5.37354948e-01, ? 5.42966239e-01, ? 5.37750775e-01, > ? ? ? ? 5.23842475e-01, ? 5.03248651e-01, ? 4.77772987e-01, > ? ? ? ? 4.48980181e-01, ? 4.18187929e-01, ? 3.86476082e-01, > ? ? ? ? 3.54705854e-01, ? 3.23544178e-01]) >>>> stats.gamma.pdf(grid[1:2001:100], 3, scale=1/2.) > array([ ?3.99200799e-06, ? 3.33407414e-02, ? 1.08109964e-01, > ? ? ? ? 1.98494147e-01, ? 2.88432744e-01, ? 3.68614464e-01, > ? ? ? ? 4.34297139e-01, ? 4.83743524e-01, ? 5.17112771e-01, > ? ? ? ? 5.35686765e-01, ? 5.41340592e-01, ? 5.36189346e-01, > ? ? ? ? 5.22360899e-01, ? 5.01857412e-01, ? 4.76478297e-01, > ? ? ? ? 4.47784794e-01, ? 4.17091870e-01, ? 3.85477285e-01, > ? ? ? ? 3.53800704e-01, ? 3.22727969e-01]) >>>> stats.gamma.pdf(grid[1:2001:100], 3, scale=1/2.) - convolved2.conv[:2000:100] > array([ -4.00799201e-06, ?-3.97515433e-04, ?-7.55373610e-04, > ? ? ? ?-1.05815476e-03, ?-1.29816640e-03, ?-1.47519705e-03, > ? ? ? ?-1.59353434e-03, ?-1.65991307e-03, ?-1.68213690e-03, > ? ? ? ?-1.66818281e-03, ?-1.62564706e-03, ?-1.56142889e-03, > ? ? ? ?-1.48157626e-03, ?-1.39123891e-03, ?-1.29468978e-03, > ? ? ? ?-1.19538731e-03, ?-1.09605963e-03, ?-9.98797767e-04, > ? ? ? ?-9.05149579e-04, ?-8.16209174e-04]) > > values shifted by one ? > >>>> np.argmax(stats.gamma.pdf(grid, 3, scale=1/2.)) > 1000 > >>>> np.argmax(convolved2.conv) > 999 on the other hand, the convolution with 3 distribution looks accurate at around 1e-4, so depending on the application this works pretty well >>> stats.gamma.cdf(grid[1:2001:100], 3, scale=1/2.) - convolved2.cdf(np.zeros(2000))[:2000:100] array([ -6.64904892e-09, -3.41833566e-05, -1.12776463e-04, -2.11506843e-04, -3.14716605e-04, -4.12809581e-04, -5.00378449e-04, -5.74846082e-04, -6.35489967e-04, -6.82750381e-04, -7.17747313e-04, -7.41949767e-04, -7.56955276e-04, -7.64348246e-04, -7.65613908e-04, -7.62090849e-04, -7.54949690e-04, -7.45188971e-04, -7.33641860e-04, -7.20989200e-04]) for 2 distributions >>> np.max(np.abs(stats.gamma.cdf(grid[1:2001], 2, scale=1/2.) - convolved.cdf(np.zeros(2000))[:2000])) 0.00039327739646649595 Josef > > Josef > >> >> Josef >> >> >>> >>> Josef >>> >>>> >>>> Thanks in advance >>>> >>>> Nicky >>>> >>>> import numpy as np >>>> import scipy.stats >>>> from scipy.stats import poisson, uniform, expon >>>> import pylab as pl >>>> >>>> # I need a grid since np.convolve requires two arrays. >>>> >>>> # choose the grid such that it covers the numerically relevant support >>>> # of the distributions >>>> grid = np.arange(0., 5., 0.001) >>>> >>>> # I need P(D1+D2+D3 <= x) >>>> D1 = expon(scale = 1./2) >>>> D2 = expon(scale = 1./3) >>>> D3 = expon(scale = 1./6) >>>> >>>> class convolved_gen(scipy.stats.rv_continuous): >>>> ? ?def __init__(self, D1, D2, grid): >>>> ? ? ? ?self.D1 = D1 >>>> ? ? ? ?self.D2 = D2 >>>> ? ? ? ?delta = grid[1]-grid[0] >>>> ? ? ? ?p1 = self.D1.pdf(grid) >>>> ? ? ? ?p2 = self.D2.pdf(grid)*delta >>>> ? ? ? ?self.conv = np.convolve(p1, p2) >>>> >>>> ? ? ? ?super(convolved_gen, self).__init__(name = "convolved") >>>> >>>> ? ?def _cdf(self, grid): >>>> ? ? ? ?cdf = np.cumsum(self.conv) >>>> ? ? ? ?return cdf/cdf[-1] # ensure that cdf[-1] = 1 >>>> >>>> ? ?def _pdf(self, grid): >>>> ? ? ? ?return self.conv[:len(grid)] >>>> >>>> ? ?def _stats(self): >>>> ? ? ? 
?m = self.D1.stats("m") + self.D2.stats("m") >>>> ? ? ? ?v = self.D1.stats("v") + self.D2.stats("v") >>>> ? ? ? ?return m, v, 0., 0. >>>> >>>> >>>> >>>> convolved = convolved_gen(D1, D2, grid) >>>> conv = convolved() >>>> >>>> convolved2 = convolved_gen(conv, D3, grid) >>>> conv2 = convolved2() >>>> >>>> pl.plot(grid,D1.cdf(grid)) >>>> pl.plot(grid,conv.cdf(grid)) >>>> pl.plot(grid,conv2.cdf(grid)) >>>> pl.show() >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user From vanforeest at gmail.com Tue Apr 10 06:07:23 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Tue, 10 Apr 2012 12:07:23 +0200 Subject: [SciPy-User] scipy.stats convolution of two distributions In-Reply-To: References: Message-ID: Hi Josef, Thanks for all your answers. This was more than was hoping for. I think I have to do some serious studying before I can deal with all your feedback. As a start I found the following article, which seems (telling from the abstract) a good starting point. From your list of mails above I'll also try to make a list of requirements that a convolution operator on probability distributions should offer. On computing the distribution function of the sum of independentrandomvariables Mani K. Agrawal*, a, [Author Vitae], Salah E. Elmaghraby*, b, , [Author Vitae] a Manugistics, Inc., 9200 E. Panorama Circle, Englewood, CO 80112, USA b Department of Industrial Engineering, College of Engineering, North Carolina State University, Campus Box 7906, Raleigh, NC 27695-7906, USA Received 1 August 1998. Revised 1 July 1999. Available online 4 December 2000. http://dx.doi.org/10.1016/S0305-0548(99)00133-1, How to Cite or Link Using DOI Cited by in Scopus (7) Permissions & Reprints On 10 April 2012 03:09, wrote: > On Mon, Apr 9, 2012 at 8:45 PM, ? wrote: >> On Mon, Apr 9, 2012 at 6:36 PM, ? wrote: >>> On Mon, Apr 9, 2012 at 6:06 PM, ? wrote: >>>> On Mon, Apr 9, 2012 at 5:04 PM, nicky van foreest wrote: >>>>> Hi, >>>>> >>>>> In one of my projects I built some code that depends in a nice and >>>>> generic way on the methods of rv_continuous in scipy.stats. Now it >>>>> turns out that I shot myself in the foot because I need to run a test >>>>> with the sum (convolution) of three such distributions. As far as I >>>>> can see there is no standard way to achieve this in scipy.stats. Does >>>>> anybody know of a good (generic) way to do this? If not, would it >>>>> actually be useful to add such functionality to scipy.stats, if >>>>> possible at all? >>>>> >>>>> After some struggling I wrote the code below, but this is a first >>>>> attempt. I am very interested in obtaining feedback to turn this into >>>>> something that is useful for a larger population than just me. >>>> >>>> Sounds fun. >>>> >>>> The plots are a bit misleading, because matplotlib is doing the >>>> interpolation for you >>>> >>>> pl.plot(grid,D1.cdf(grid), '.') >>>> pl.plot(grid,conv.cdf(grid), '.') >>>> pl.plot(grid,conv2.cdf(grid), '.') >>>> >>>> The main problem is the mixing of continuous distribution and discrete >>>> grid. pdf, cdf, .... want to evaluate at a point and with floating >>>> points it's perilous (as we discussed before). >>>> >>>> As I read it, neither your cdf nor pdf return specific points. >>>> >>>> I think the easiest would be to work with linear interpolation >>>> interp1d on a fine grid, but I never checked how fast this is for a >>>> fine grid. 
>>>> If cdf and pdf are defined with a piecewise polynomial, then they can >>>> be evaluated at any points and the generic class, rv_continuous, >>>> should be able to handle everything. >>>> The alternative would be to work with the finite grid and use the >>>> fixed spacing to define a lattice distribution, but that doesn't exist >>>> in scipy. >>>> >>>> I haven't thought about the convolution itself yet, (an alternative >>>> would be using fft to work with the characteristic function.) >>>> >>>> If you use this for a test, how much do you care about having accurate >>>> tails, or is effectively truncating the distribution ok? >>> >>> The other question I thought about in a similar situation is, what is >>> the usage or access pattern. >>> >>> For many cases, I'm not really interested in evaluating the pdf at >>> specific points, but over a range or interval of points. In this case >>> relying on floating point access doesn't look like the best way to go, >>> and I spend more time on the `expect` method. Calculating expectation >>> of a function w.r.t the distribution that can use the internal >>> representation instead of the generic integrate.quad. >> >> >> a test case to check numerical accuracy of discretized approximation/convolution >> >> Di = expon(scale = 1./2) >> sum of identical exponentially distributed random variables is gamma >> http://en.wikipedia.org/wiki/Gamma_distribution >> >> 1 exponential and corresponding gamma >> >>>>> convolved.D1.pdf(grid[:10]) >> array([ 2. ? ? ? ?, ?1.996004 ?, ?1.99201598, ?1.98803593, ?1.98406383, >> ? ? ? ?1.98009967, ?1.97614343, ?1.97219509, ?1.96825464, ?1.96432206]) >>>>> stats.gamma.pdf(grid[:10], 1, scale=1/2.) >> array([ 2. ? ? ? ?, ?1.996004 ?, ?1.99201598, ?1.98803593, ?1.98406383, >> ? ? ? ?1.98009967, ?1.97614343, ?1.97219509, ?1.96825464, ?1.96432206]) >> >> sum of 2 exponentials >> >>>>> stats.gamma.pdf(grid[:10], 2, scale=1/2.) >> array([ 0. ? ? ? ?, ?0.00399201, ?0.00796806, ?0.01192822, ?0.01587251, >> ? ? ? ?0.019801 ?, ?0.02371372, ?0.02761073, ?0.03149207, ?0.0353578 ]) >>>>> convolved.pdf(np.zeros(10)) >> array([ 0.004 ? ? , ?0.00798402, ?0.0119521 , ?0.01590429, ?0.01984064, >> ? ? ? ?0.0237612 , ?0.02766601, ?0.03155512, ?0.03542858, ?0.03928644]) >>>>> stats.gamma.pdf(grid[1:21], 2, scale=1/2.) - convolved.pdf(np.zeros(20)) >> array([ -7.99200533e-06, ?-1.59520746e-05, ?-2.38803035e-05, >> ? ? ? ?-3.17767875e-05, ?-3.96416218e-05, ?-4.74749013e-05, >> ? ? ? ?-5.52767208e-05, ?-6.30471746e-05, ?-7.07863571e-05, >> ? ? ? ?-7.84943621e-05, ?-8.61712832e-05, ?-9.38172141e-05, >> ? ? ? ?-1.01432248e-04, ?-1.09016477e-04, ?-1.16569995e-04, >> ? ? ? ?-1.24092894e-04, ?-1.31585266e-04, ?-1.39047203e-04, >> ? ? ? ?-1.46478797e-04, ?-1.53880139e-04]) >> >> sum of 3 exponentials >> >>>>> convolved2.conv[:2000:100] >> array([ ?8.00000000e-06, ? 3.37382569e-02, ? 1.08865338e-01, >> ? ? ? ? 1.99552301e-01, ? 2.89730911e-01, ? 3.70089661e-01, >> ? ? ? ? 4.35890673e-01, ? 4.85403437e-01, ? 5.18794908e-01, >> ? ? ? ? 5.37354948e-01, ? 5.42966239e-01, ? 5.37750775e-01, >> ? ? ? ? 5.23842475e-01, ? 5.03248651e-01, ? 4.77772987e-01, >> ? ? ? ? 4.48980181e-01, ? 4.18187929e-01, ? 3.86476082e-01, >> ? ? ? ? 3.54705854e-01, ? 3.23544178e-01]) >>>>> stats.gamma.pdf(grid[1:2001:100], 3, scale=1/2.) >> array([ ?3.99200799e-06, ? 3.33407414e-02, ? 1.08109964e-01, >> ? ? ? ? 1.98494147e-01, ? 2.88432744e-01, ? 3.68614464e-01, >> ? ? ? ? 4.34297139e-01, ? 4.83743524e-01, ? 5.17112771e-01, >> ? ? ? ? 5.35686765e-01, ? 5.41340592e-01, ? 
5.36189346e-01, >> ? ? ? ? 5.22360899e-01, ? 5.01857412e-01, ? 4.76478297e-01, >> ? ? ? ? 4.47784794e-01, ? 4.17091870e-01, ? 3.85477285e-01, >> ? ? ? ? 3.53800704e-01, ? 3.22727969e-01]) >>>>> stats.gamma.pdf(grid[1:2001:100], 3, scale=1/2.) - convolved2.conv[:2000:100] >> array([ -4.00799201e-06, ?-3.97515433e-04, ?-7.55373610e-04, >> ? ? ? ?-1.05815476e-03, ?-1.29816640e-03, ?-1.47519705e-03, >> ? ? ? ?-1.59353434e-03, ?-1.65991307e-03, ?-1.68213690e-03, >> ? ? ? ?-1.66818281e-03, ?-1.62564706e-03, ?-1.56142889e-03, >> ? ? ? ?-1.48157626e-03, ?-1.39123891e-03, ?-1.29468978e-03, >> ? ? ? ?-1.19538731e-03, ?-1.09605963e-03, ?-9.98797767e-04, >> ? ? ? ?-9.05149579e-04, ?-8.16209174e-04]) >> >> values shifted by one ? >> >>>>> np.argmax(stats.gamma.pdf(grid, 3, scale=1/2.)) >> 1000 >> >>>>> np.argmax(convolved2.conv) >> 999 > > > on the other hand, the convolution with 3 distribution looks accurate > at around 1e-4, so depending on the application this works pretty well > >>>> stats.gamma.cdf(grid[1:2001:100], 3, scale=1/2.) - convolved2.cdf(np.zeros(2000))[:2000:100] > array([ -6.64904892e-09, ?-3.41833566e-05, ?-1.12776463e-04, > ? ? ? ?-2.11506843e-04, ?-3.14716605e-04, ?-4.12809581e-04, > ? ? ? ?-5.00378449e-04, ?-5.74846082e-04, ?-6.35489967e-04, > ? ? ? ?-6.82750381e-04, ?-7.17747313e-04, ?-7.41949767e-04, > ? ? ? ?-7.56955276e-04, ?-7.64348246e-04, ?-7.65613908e-04, > ? ? ? ?-7.62090849e-04, ?-7.54949690e-04, ?-7.45188971e-04, > ? ? ? ?-7.33641860e-04, ?-7.20989200e-04]) > > for 2 distributions > >>>> np.max(np.abs(stats.gamma.cdf(grid[1:2001], 2, scale=1/2.) - convolved.cdf(np.zeros(2000))[:2000])) > 0.00039327739646649595 > > Josef > >> >> Josef >> >>> >>> Josef >>> >>> >>>> >>>> Josef >>>> >>>>> >>>>> Thanks in advance >>>>> >>>>> Nicky >>>>> >>>>> import numpy as np >>>>> import scipy.stats >>>>> from scipy.stats import poisson, uniform, expon >>>>> import pylab as pl >>>>> >>>>> # I need a grid since np.convolve requires two arrays. >>>>> >>>>> # choose the grid such that it covers the numerically relevant support >>>>> # of the distributions >>>>> grid = np.arange(0., 5., 0.001) >>>>> >>>>> # I need P(D1+D2+D3 <= x) >>>>> D1 = expon(scale = 1./2) >>>>> D2 = expon(scale = 1./3) >>>>> D3 = expon(scale = 1./6) >>>>> >>>>> class convolved_gen(scipy.stats.rv_continuous): >>>>> ? ?def __init__(self, D1, D2, grid): >>>>> ? ? ? ?self.D1 = D1 >>>>> ? ? ? ?self.D2 = D2 >>>>> ? ? ? ?delta = grid[1]-grid[0] >>>>> ? ? ? ?p1 = self.D1.pdf(grid) >>>>> ? ? ? ?p2 = self.D2.pdf(grid)*delta >>>>> ? ? ? ?self.conv = np.convolve(p1, p2) >>>>> >>>>> ? ? ? ?super(convolved_gen, self).__init__(name = "convolved") >>>>> >>>>> ? ?def _cdf(self, grid): >>>>> ? ? ? ?cdf = np.cumsum(self.conv) >>>>> ? ? ? ?return cdf/cdf[-1] # ensure that cdf[-1] = 1 >>>>> >>>>> ? ?def _pdf(self, grid): >>>>> ? ? ? ?return self.conv[:len(grid)] >>>>> >>>>> ? ?def _stats(self): >>>>> ? ? ? ?m = self.D1.stats("m") + self.D2.stats("m") >>>>> ? ? ? ?v = self.D1.stats("v") + self.D2.stats("v") >>>>> ? ? ? ?return m, v, 0., 0. 
>>>>> >>>>> >>>>> >>>>> convolved = convolved_gen(D1, D2, grid) >>>>> conv = convolved() >>>>> >>>>> convolved2 = convolved_gen(conv, D3, grid) >>>>> conv2 = convolved2() >>>>> >>>>> pl.plot(grid,D1.cdf(grid)) >>>>> pl.plot(grid,conv.cdf(grid)) >>>>> pl.plot(grid,conv2.cdf(grid)) >>>>> pl.show() >>>>> _______________________________________________ >>>>> SciPy-User mailing list >>>>> SciPy-User at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From xrodgers at gmail.com Tue Apr 10 05:27:44 2012 From: xrodgers at gmail.com (Chris Rodgers) Date: Tue, 10 Apr 2012 02:27:44 -0700 Subject: [SciPy-User] synchronizing timestamps from different systems; unpaired linear regression Message-ID: I have what seems like a straightforward problem but it is becoming more difficult than I thought. I have two different computers recording timestamps from the same stream of events. I get lists X and Y from each computer and the question is how to figure out which entry in X corresponds to which entry in Y. Complications: 1) There are an unknown number of missing or spurious events in each list. I do not know which events in X match up to which in Y. 2) The temporal offset between the two lists is unknown, because each timer begins at a different time. 3) The clocks seem to run at slightly different speeds (~0.3% difference adds up to about 10 seconds over my 1hr recording time). I know this problem is solvable because once you find the temporal offset and clock-speed ratio, the matching timestamps agree to within 10ms. That is, there is a strong linear relationship between some unknown X->Y mapping. Basically, the problem is: given list X and list Y, and specifying a certain minimum R**2 value, what is the largest set of matched points from X and Y that satisfy this R**2 value? I have tried googling "unmatched linear regression" but this must not be the right search term. One approach that I've tried is to create an analog trace for X and Y with a Gaussian centered at each timestamp, then finding the lag that optimizes the cross-correlation between the two. This is good for finding the temporal offset but can't handle the clock-speed difference. (Also it takes a really long time because the series are 1hr of data sampled at 10Hz.) Then I can choose the closest matches between X and Y and fit them with a line, which gives me the clock-difference parameter. The problem is that there are a ton of local minima created by how I choose to match up the points in X and Y, so it gets stuck on the wrong answer. Any tips? Thanks! Chris PS: my current code and test data is here: https://github.com/cxrodgers/DiscreteAnalyze -- Chris Rodgers Graduate Student Helen Wills Neuroscience Institute University of California - Berkeley From yann.ziegler at etu.unistra.fr Tue Apr 10 09:49:50 2012 From: yann.ziegler at etu.unistra.fr (ZIEGLER Yann (ETU EOT)) Date: Tue, 10 Apr 2012 15:49:50 +0200 Subject: [SciPy-User] =?utf-8?q?_Strange_behavior_of_sph=5Fjn?= Message-ID: <4554-4f843a80-2f-65abee80@225308846> Hi, It seems that the bug I have with sph_jn for some values is due to scipy itself. I finded someone having the same kind of trouble on projects.scipy.org/scipy (the website seems to have some technical problems at this time). 
If anyone is interested in it, here is the solution I have found to circumvent this bug:

Using the definition of the Spherical Bessel function in terms of the ordinary Bessel function
(http://functions.wolfram.com/Bessel-TypeFunctions/SphericalBesselJ/02/), I wrote -- nothing
extraordinary, this is a 2-line function -- my own SphericalBessel function (naive but
seemingly correct) for positive or negative order n and complex argument z:

import numpy as np
from scipy.special import jn

def SphericalBessel(n, z):
    # complex square root of z: sqrt(|z|) * exp(1j*angle(z)/2)
    zsqrt = np.sqrt(np.abs(z)) * np.exp(1j * np.angle(z) / 2)
    return np.sqrt(np.pi/2) / zsqrt * jn(n + 0.5, z)

Yann

From josef.pktd at gmail.com  Tue Apr 10 10:13:52 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 10 Apr 2012 10:13:52 -0400
Subject: [SciPy-User] synchronizing timestamps from different systems; unpaired linear regression
In-Reply-To: 
References: 
Message-ID: 

On Tue, Apr 10, 2012 at 5:27 AM, Chris Rodgers wrote:
> I have what seems like a straightforward problem but it is becoming
> more difficult than I thought. I have two different computers
> recording timestamps from the same stream of events. I get lists X and
> Y from each computer and the question is how to figure out which entry
> in X corresponds to which entry in Y.
>
> Complications:
> 1) There are an unknown number of missing or spurious events in each
> list. I do not know which events in X match up to which in Y.
> 2) The temporal offset between the two lists is unknown, because each
> timer begins at a different time.
> 3) The clocks seem to run at slightly different speeds (~0.3%
> difference adds up to about 10 seconds over my 1hr recording time).
>
> I know this problem is solvable because once you find the temporal
> offset and clock-speed ratio, the matching timestamps agree to within
> 10ms. That is, there is a strong linear relationship between some
> unknown X->Y mapping.
>
> Basically, the problem is: given list X and list Y, and specifying a
> certain minimum R**2 value, what is the largest set of matched points
> from X and Y that satisfy this R**2 value? I have tried googling
> "unmatched linear regression" but this must not be the right search
> term.
>
> One approach that I've tried is to create an analog trace for X and Y
> with a Gaussian centered at each timestamp, then finding the lag that
> optimizes the cross-correlation between the two. This is good for
> finding the temporal offset but can't handle the clock-speed
> difference. (Also it takes a really long time because the series are
> 1hr of data sampled at 10Hz.) Then I can choose the closest matches
> between X and Y and fit them with a line, which gives me the
> clock-difference parameter. The problem is that there are a ton of
> local minima created by how I choose to match up the points in X and
> Y, so it gets stuck on the wrong answer.
>
> Any tips?
> Chris > > PS: my current code and test data is here: > https://github.com/cxrodgers/DiscreteAnalyze > > -- > Chris Rodgers > Graduate Student > Helen Wills Neuroscience Institute > University of California - Berkeley > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From charlesr.harris at gmail.com Tue Apr 10 10:24:58 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 10 Apr 2012 08:24:58 -0600 Subject: [SciPy-User] synchronizing timestamps from different systems; unpaired linear regression In-Reply-To: References: Message-ID: On Tue, Apr 10, 2012 at 3:27 AM, Chris Rodgers wrote: > I have what seems like a straightforward problem but it is becoming > more difficult than I thought. I have two different computers > recording timestamps from the same stream of events. I get lists X and > Y from each computer and the question is how to figure out which entry > in X corresponds to which entry in Y. > > Complications: > 1) There are an unknown number of missing or spurious events in each > list. I do not know which events in X match up to which in Y. > 2) The temporal offset between the two lists is unknown, because each > timer begins at a different time. > 3) The clocks seem to run at slightly different speeds (~0.3% > difference adds up to about 10 seconds over my 1hr recording time). > > I know this problem is solvable because once you find the temporal > offset and clock-speed ratio, the matching timestamps agree to within > 10ms. That is, there is a strong linear relationship between some > unknown X->Y mapping. > > Basically, the problem is: given list X and list Y, and specifying a > certain minimum R**2 value, what is the largest set of matched points > from X and Y that satisfy this R**2 value? I have tried googling > "unmatched linear regression" but this must not be the right search > term. > > One approach that I've tried is to create an analog trace for X and Y > with a Gaussian centered at each timestamp, then finding the lag that > optimizes the cross-correlation between the two. This is good for > finding the temporal offset but can't handle the clock-speed > difference. (Also it takes a really long time because the series are > 1hr of data sampled at 10Hz.) Then I can choose the closest matches > between X and Y and fit them with a line, which gives me the > clock-difference parameter. The problem is that there are a ton of > local minima created by how I choose to match up the points in X and > Y, so it gets stuck on the wrong answer. > > This is a tricky problem, especially if you need to support windows with it's limited tick rate. NTP is a good tool on linux, and you can use it to synchronize networked machines to a reference machine, which might well do what you need. Much depends on the required time resolution. There are also ways to deal with windows machines, but I forget the details. Google around, there is a lot of material out there. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Apr 10 10:56:10 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 10 Apr 2012 10:56:10 -0400 Subject: [SciPy-User] scipy.stats convolution of two distributions In-Reply-To: References: Message-ID: On Tue, Apr 10, 2012 at 6:07 AM, nicky van foreest wrote: > Hi Josef, > > Thanks for all your answers. This was more than was hoping for. 
I > think I have to do some serious studying before I can deal with all > your feedback. As a start I found the following article, which seems > (telling from the abstract) a good starting point. From your list of > mails above I'll also try to make a list of requirements that a > convolution operator on probability distributions should offer. I think how "fancy" you want to get will depend a lot on what you want to use it for. For example, I was reading the literature using characteristic function for applications where the tail probabilities should be accurate. Tails are difficult to get to a good precision, and there are many papers with various tricks, none of which I tried to implement. In the example, if I increase the truncation to 10, grid = np.arange(0., 10., 0.001), 10000 points which is still fast with convolution, then the percent or absolute difference to the true distribution in the gamma case doesn't look so bad. mean and variance >>> convolved2.stats() (array(1.5), array(0.75)) >>> (grid[1:10000] * convolved2.conv[1:10000]).sum()*0.001 1.502997188928352 >>> ((grid[1:10000] - 1.5)**2 * convolved2.conv[1:10000]).sum()*0.001 0.7522175059430164 >>> ((grid[1:10000] - 1.502997188928352)**2 * convolved2.conv[1:10000]).sum()*0.001 0.7522355563037656 (I didn't try to figure out the convolution boundaries.) Josef > > On computing the distribution function of the sum of independentrandomvariables > Mani K. Agrawal*, a, ?[Author Vitae], Salah E. Elmaghraby*, b, , ?[Author Vitae] > a Manugistics, Inc., 9200 E. Panorama Circle, Englewood, CO 80112, USA > b Department of Industrial Engineering, College of Engineering, North > Carolina State University, Campus Box 7906, Raleigh, NC 27695-7906, > USA > Received 1 August 1998. Revised 1 July 1999. Available online 4 December 2000. > http://dx.doi.org/10.1016/S0305-0548(99)00133-1, How to Cite or Link Using DOI > Cited by in Scopus (7) > Permissions & Reprints > > > On 10 April 2012 03:09, ? wrote: >> On Mon, Apr 9, 2012 at 8:45 PM, ? wrote: >>> On Mon, Apr 9, 2012 at 6:36 PM, ? wrote: >>>> On Mon, Apr 9, 2012 at 6:06 PM, ? wrote: >>>>> On Mon, Apr 9, 2012 at 5:04 PM, nicky van foreest wrote: >>>>>> Hi, >>>>>> >>>>>> In one of my projects I built some code that depends in a nice and >>>>>> generic way on the methods of rv_continuous in scipy.stats. Now it >>>>>> turns out that I shot myself in the foot because I need to run a test >>>>>> with the sum (convolution) of three such distributions. As far as I >>>>>> can see there is no standard way to achieve this in scipy.stats. Does >>>>>> anybody know of a good (generic) way to do this? If not, would it >>>>>> actually be useful to add such functionality to scipy.stats, if >>>>>> possible at all? >>>>>> >>>>>> After some struggling I wrote the code below, but this is a first >>>>>> attempt. I am very interested in obtaining feedback to turn this into >>>>>> something that is useful for a larger population than just me. >>>>> >>>>> Sounds fun. >>>>> >>>>> The plots are a bit misleading, because matplotlib is doing the >>>>> interpolation for you >>>>> >>>>> pl.plot(grid,D1.cdf(grid), '.') >>>>> pl.plot(grid,conv.cdf(grid), '.') >>>>> pl.plot(grid,conv2.cdf(grid), '.') >>>>> >>>>> The main problem is the mixing of continuous distribution and discrete >>>>> grid. pdf, cdf, .... want to evaluate at a point and with floating >>>>> points it's perilous (as we discussed before). >>>>> >>>>> As I read it, neither your cdf nor pdf return specific points. 
>>>>> >>>>> I think the easiest would be to work with linear interpolation >>>>> interp1d on a fine grid, but I never checked how fast this is for a >>>>> fine grid. >>>>> If cdf and pdf are defined with a piecewise polynomial, then they can >>>>> be evaluated at any points and the generic class, rv_continuous, >>>>> should be able to handle everything. >>>>> The alternative would be to work with the finite grid and use the >>>>> fixed spacing to define a lattice distribution, but that doesn't exist >>>>> in scipy. >>>>> >>>>> I haven't thought about the convolution itself yet, (an alternative >>>>> would be using fft to work with the characteristic function.) >>>>> >>>>> If you use this for a test, how much do you care about having accurate >>>>> tails, or is effectively truncating the distribution ok? >>>> >>>> The other question I thought about in a similar situation is, what is >>>> the usage or access pattern. >>>> >>>> For many cases, I'm not really interested in evaluating the pdf at >>>> specific points, but over a range or interval of points. In this case >>>> relying on floating point access doesn't look like the best way to go, >>>> and I spend more time on the `expect` method. Calculating expectation >>>> of a function w.r.t the distribution that can use the internal >>>> representation instead of the generic integrate.quad. >>> >>> >>> a test case to check numerical accuracy of discretized approximation/convolution >>> >>> Di = expon(scale = 1./2) >>> sum of identical exponentially distributed random variables is gamma >>> http://en.wikipedia.org/wiki/Gamma_distribution >>> >>> 1 exponential and corresponding gamma >>> >>>>>> convolved.D1.pdf(grid[:10]) >>> array([ 2. ? ? ? ?, ?1.996004 ?, ?1.99201598, ?1.98803593, ?1.98406383, >>> ? ? ? ?1.98009967, ?1.97614343, ?1.97219509, ?1.96825464, ?1.96432206]) >>>>>> stats.gamma.pdf(grid[:10], 1, scale=1/2.) >>> array([ 2. ? ? ? ?, ?1.996004 ?, ?1.99201598, ?1.98803593, ?1.98406383, >>> ? ? ? ?1.98009967, ?1.97614343, ?1.97219509, ?1.96825464, ?1.96432206]) >>> >>> sum of 2 exponentials >>> >>>>>> stats.gamma.pdf(grid[:10], 2, scale=1/2.) >>> array([ 0. ? ? ? ?, ?0.00399201, ?0.00796806, ?0.01192822, ?0.01587251, >>> ? ? ? ?0.019801 ?, ?0.02371372, ?0.02761073, ?0.03149207, ?0.0353578 ]) >>>>>> convolved.pdf(np.zeros(10)) >>> array([ 0.004 ? ? , ?0.00798402, ?0.0119521 , ?0.01590429, ?0.01984064, >>> ? ? ? ?0.0237612 , ?0.02766601, ?0.03155512, ?0.03542858, ?0.03928644]) >>>>>> stats.gamma.pdf(grid[1:21], 2, scale=1/2.) - convolved.pdf(np.zeros(20)) >>> array([ -7.99200533e-06, ?-1.59520746e-05, ?-2.38803035e-05, >>> ? ? ? ?-3.17767875e-05, ?-3.96416218e-05, ?-4.74749013e-05, >>> ? ? ? ?-5.52767208e-05, ?-6.30471746e-05, ?-7.07863571e-05, >>> ? ? ? ?-7.84943621e-05, ?-8.61712832e-05, ?-9.38172141e-05, >>> ? ? ? ?-1.01432248e-04, ?-1.09016477e-04, ?-1.16569995e-04, >>> ? ? ? ?-1.24092894e-04, ?-1.31585266e-04, ?-1.39047203e-04, >>> ? ? ? ?-1.46478797e-04, ?-1.53880139e-04]) >>> >>> sum of 3 exponentials >>> >>>>>> convolved2.conv[:2000:100] >>> array([ ?8.00000000e-06, ? 3.37382569e-02, ? 1.08865338e-01, >>> ? ? ? ? 1.99552301e-01, ? 2.89730911e-01, ? 3.70089661e-01, >>> ? ? ? ? 4.35890673e-01, ? 4.85403437e-01, ? 5.18794908e-01, >>> ? ? ? ? 5.37354948e-01, ? 5.42966239e-01, ? 5.37750775e-01, >>> ? ? ? ? 5.23842475e-01, ? 5.03248651e-01, ? 4.77772987e-01, >>> ? ? ? ? 4.48980181e-01, ? 4.18187929e-01, ? 3.86476082e-01, >>> ? ? ? ? 3.54705854e-01, ? 3.23544178e-01]) >>>>>> stats.gamma.pdf(grid[1:2001:100], 3, scale=1/2.) 
>>> array([ ?3.99200799e-06, ? 3.33407414e-02, ? 1.08109964e-01, >>> ? ? ? ? 1.98494147e-01, ? 2.88432744e-01, ? 3.68614464e-01, >>> ? ? ? ? 4.34297139e-01, ? 4.83743524e-01, ? 5.17112771e-01, >>> ? ? ? ? 5.35686765e-01, ? 5.41340592e-01, ? 5.36189346e-01, >>> ? ? ? ? 5.22360899e-01, ? 5.01857412e-01, ? 4.76478297e-01, >>> ? ? ? ? 4.47784794e-01, ? 4.17091870e-01, ? 3.85477285e-01, >>> ? ? ? ? 3.53800704e-01, ? 3.22727969e-01]) >>>>>> stats.gamma.pdf(grid[1:2001:100], 3, scale=1/2.) - convolved2.conv[:2000:100] >>> array([ -4.00799201e-06, ?-3.97515433e-04, ?-7.55373610e-04, >>> ? ? ? ?-1.05815476e-03, ?-1.29816640e-03, ?-1.47519705e-03, >>> ? ? ? ?-1.59353434e-03, ?-1.65991307e-03, ?-1.68213690e-03, >>> ? ? ? ?-1.66818281e-03, ?-1.62564706e-03, ?-1.56142889e-03, >>> ? ? ? ?-1.48157626e-03, ?-1.39123891e-03, ?-1.29468978e-03, >>> ? ? ? ?-1.19538731e-03, ?-1.09605963e-03, ?-9.98797767e-04, >>> ? ? ? ?-9.05149579e-04, ?-8.16209174e-04]) >>> >>> values shifted by one ? >>> >>>>>> np.argmax(stats.gamma.pdf(grid, 3, scale=1/2.)) >>> 1000 >>> >>>>>> np.argmax(convolved2.conv) >>> 999 >> >> >> on the other hand, the convolution with 3 distribution looks accurate >> at around 1e-4, so depending on the application this works pretty well >> >>>>> stats.gamma.cdf(grid[1:2001:100], 3, scale=1/2.) - convolved2.cdf(np.zeros(2000))[:2000:100] >> array([ -6.64904892e-09, ?-3.41833566e-05, ?-1.12776463e-04, >> ? ? ? ?-2.11506843e-04, ?-3.14716605e-04, ?-4.12809581e-04, >> ? ? ? ?-5.00378449e-04, ?-5.74846082e-04, ?-6.35489967e-04, >> ? ? ? ?-6.82750381e-04, ?-7.17747313e-04, ?-7.41949767e-04, >> ? ? ? ?-7.56955276e-04, ?-7.64348246e-04, ?-7.65613908e-04, >> ? ? ? ?-7.62090849e-04, ?-7.54949690e-04, ?-7.45188971e-04, >> ? ? ? ?-7.33641860e-04, ?-7.20989200e-04]) >> >> for 2 distributions >> >>>>> np.max(np.abs(stats.gamma.cdf(grid[1:2001], 2, scale=1/2.) - convolved.cdf(np.zeros(2000))[:2000])) >> 0.00039327739646649595 >> >> Josef >> >>> >>> Josef >>> >>>> >>>> Josef >>>> >>>> >>>>> >>>>> Josef >>>>> >>>>>> >>>>>> Thanks in advance >>>>>> >>>>>> Nicky >>>>>> >>>>>> import numpy as np >>>>>> import scipy.stats >>>>>> from scipy.stats import poisson, uniform, expon >>>>>> import pylab as pl >>>>>> >>>>>> # I need a grid since np.convolve requires two arrays. >>>>>> >>>>>> # choose the grid such that it covers the numerically relevant support >>>>>> # of the distributions >>>>>> grid = np.arange(0., 5., 0.001) >>>>>> >>>>>> # I need P(D1+D2+D3 <= x) >>>>>> D1 = expon(scale = 1./2) >>>>>> D2 = expon(scale = 1./3) >>>>>> D3 = expon(scale = 1./6) >>>>>> >>>>>> class convolved_gen(scipy.stats.rv_continuous): >>>>>> ? ?def __init__(self, D1, D2, grid): >>>>>> ? ? ? ?self.D1 = D1 >>>>>> ? ? ? ?self.D2 = D2 >>>>>> ? ? ? ?delta = grid[1]-grid[0] >>>>>> ? ? ? ?p1 = self.D1.pdf(grid) >>>>>> ? ? ? ?p2 = self.D2.pdf(grid)*delta >>>>>> ? ? ? ?self.conv = np.convolve(p1, p2) >>>>>> >>>>>> ? ? ? ?super(convolved_gen, self).__init__(name = "convolved") >>>>>> >>>>>> ? ?def _cdf(self, grid): >>>>>> ? ? ? ?cdf = np.cumsum(self.conv) >>>>>> ? ? ? ?return cdf/cdf[-1] # ensure that cdf[-1] = 1 >>>>>> >>>>>> ? ?def _pdf(self, grid): >>>>>> ? ? ? ?return self.conv[:len(grid)] >>>>>> >>>>>> ? ?def _stats(self): >>>>>> ? ? ? ?m = self.D1.stats("m") + self.D2.stats("m") >>>>>> ? ? ? ?v = self.D1.stats("v") + self.D2.stats("v") >>>>>> ? ? ? ?return m, v, 0., 0. 
>>>>>> >>>>>> >>>>>> >>>>>> convolved = convolved_gen(D1, D2, grid) >>>>>> conv = convolved() >>>>>> >>>>>> convolved2 = convolved_gen(conv, D3, grid) >>>>>> conv2 = convolved2() >>>>>> >>>>>> pl.plot(grid,D1.cdf(grid)) >>>>>> pl.plot(grid,conv.cdf(grid)) >>>>>> pl.plot(grid,conv2.cdf(grid)) >>>>>> pl.show() >>>>>> _______________________________________________ >>>>>> SciPy-User mailing list >>>>>> SciPy-User at scipy.org >>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From jkhilmer at chemistry.montana.edu Tue Apr 10 11:31:01 2012 From: jkhilmer at chemistry.montana.edu (jkhilmer at chemistry.montana.edu) Date: Tue, 10 Apr 2012 09:31:01 -0600 Subject: [SciPy-User] synchronizing timestamps from different systems; unpaired linear regression In-Reply-To: References: Message-ID: If these computers are networked, why not have them communicating while recording the data? Don't adjust the system clocks during the run, but record inter-computer timestamps. At a 10Hz sampling rate, even a very naive non-NTP should be sufficient to keep a precision close to your recording interval. Jonathan On Tue, Apr 10, 2012 at 8:24 AM, Charles R Harris wrote: > > > On Tue, Apr 10, 2012 at 3:27 AM, Chris Rodgers wrote: >> >> I have what seems like a straightforward problem but it is becoming >> more difficult than I thought. I have two different computers >> recording timestamps from the same stream of events. I get lists X and >> Y from each computer and the question is how to figure out which entry >> in X corresponds to which entry in Y. >> >> Complications: >> 1) There are an unknown number of missing or spurious events in each >> list. I do not know which events in X match up to which in Y. >> 2) The temporal offset between the two lists is unknown, because each >> timer begins at a different time. >> 3) The clocks seem to run at slightly different speeds (~0.3% >> difference adds up to about 10 seconds over my 1hr recording time). >> >> I know this problem is solvable because once you find the temporal >> offset and clock-speed ratio, the matching timestamps agree to within >> 10ms. That is, there is a strong linear relationship between some >> unknown X->Y mapping. >> >> Basically, the problem is: given list X and list Y, and specifying a >> certain minimum R**2 value, what is the largest set of matched points >> from X and Y that satisfy this R**2 value? I have tried googling >> "unmatched linear regression" but this must not be the right search >> term. >> >> One approach that I've tried is to create an analog trace for X and Y >> with a Gaussian centered at each timestamp, then finding the lag that >> optimizes the cross-correlation between the two. This is good for >> finding the temporal offset but can't handle the clock-speed >> difference. (Also it takes a really long time because the series are >> 1hr of data sampled at 10Hz.) Then I can choose the closest matches >> between X and Y and fit them with a line, which gives me the >> clock-difference parameter. The problem is that there are a ton of >> local minima created by how I choose to match up the points in X and >> Y, so it gets stuck on the wrong answer. 
>> > > This is a tricky problem, especially if you need to support windows with > it's limited tick rate. NTP is a good tool on linux, and you can use it to > synchronize networked machines to a reference machine, which might well do > what you need. Much depends on the required time resolution. There are also > ways to deal with windows machines, but I forget the details. Google around, > there is a lot of material out there. > > Chuck > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From cournape at gmail.com Tue Apr 10 16:02:11 2012 From: cournape at gmail.com (David Cournapeau) Date: Tue, 10 Apr 2012 21:02:11 +0100 Subject: [SciPy-User] synchronizing timestamps from different systems; unpaired linear regression In-Reply-To: References: Message-ID: On Tue, Apr 10, 2012 at 10:27 AM, Chris Rodgers wrote: > I have what seems like a straightforward problem but it is becoming > more difficult than I thought. I have two different computers > recording timestamps from the same stream of events. I get lists X and > Y from each computer and the question is how to figure out which entry > in X corresponds to which entry in Y. > Could you describe your problem with more details ? Do you really need to match X to Y lists, or is that just an intermediary to the actualy problem ? At least issues 2 and 3 are handled with Lamport timestamps ( Lamport timestamps allow for partial ordering between unsynchronized machines) http://en.wikipedia.org/wiki/Lamport_timestamps David -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Apr 10 17:18:02 2012 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 10 Apr 2012 22:18:02 +0100 Subject: [SciPy-User] synchronizing timestamps from different systems; unpaired linear regression In-Reply-To: References: Message-ID: On Tue, Apr 10, 2012 at 10:27 AM, Chris Rodgers wrote: > I have what seems like a straightforward problem but it is becoming > more difficult than I thought. I have two different computers > recording timestamps from the same stream of events. I get lists X and > Y from each computer and the question is how to figure out which entry > in X corresponds to which entry in Y. > > Complications: > 1) There are an unknown number of missing or spurious events in each > list. I do not know which events in X match up to which in Y. > 2) The temporal offset between the two lists is unknown, because each > timer begins at a different time. > 3) The clocks seem to run at slightly different speeds (~0.3% > difference adds up to about 10 seconds over my 1hr recording time). Tricky! Others have given you plenty of advice for how to change the setup in ways that might help, but my guess is you can solve it with the data at hand. There's a 2-parameter space of (offset, relative clockspeed) that you're trying to search. You need a smooth and quick-to-evaluate function that gets larger when these numbers are closer to accurate, that you can hand to your optimizer. (Smooth to avoid the local minima problem you mention; quick-to-evaluate for the obvious reason.) How about, given a candidate offset + clockspeed, remap your Y events into the the purported X clock domain, and score each event by its squared distance from the nearest X event. This ignores the issue of matching, on the assumption that mismatches between the lists are rare enough that they won't matter. 
Given that you're trying to extract 2 numbers worth of information from 36000 samples, you should be able to get away with a fair amount of sloppiness. And it's very concise to write down and fast to compute (untested code): x_times = np.array([x_time1, x_time2, x_time3, ...], dtype=float) y_times = np.array([y_time1, y_time2, y_time3, ...], dtype=float) x_midpoints = x_times[:-1] + np.diff(x) / 2. def objective(offset, clockspeed): # adjust parametrization to suit adj_y_times = y_times * clockspeed + offset closest_x_times = np.searchsorted(x_midpoints, adj_y_times) return np.sum((y_times - x_times[closest_x_times]) ** 2) Each evaluation is O(n log n). Worth a try, anyway... Good luck, -- Nathaniel > > I know this problem is solvable because once you find the temporal > offset and clock-speed ratio, the matching timestamps agree to within > 10ms. That is, there is a strong linear relationship between some > unknown X->Y mapping. > > Basically, the problem is: given list X and list Y, and specifying a > certain minimum R**2 value, what is the largest set of matched points > from X and Y that satisfy this R**2 value? I have tried googling > "unmatched linear regression" but this must not be the right search > term. > > One approach that I've tried is to create an analog trace for X and Y > with a Gaussian centered at each timestamp, then finding the lag that > optimizes the cross-correlation between the two. This is good for > finding the temporal offset but can't handle the clock-speed > difference. (Also it takes a really long time because the series are > 1hr of data sampled at 10Hz.) Then I can choose the closest matches > between X and Y and fit them with a line, which gives me the > clock-difference parameter. The problem is that there are a ton of > local minima created by how I choose to match up the points in X and > Y, so it gets stuck on the wrong answer. > > Any tips? > > Thanks! > Chris > > PS: my current code and test data is here: > https://github.com/cxrodgers/DiscreteAnalyze > > -- > Chris Rodgers > Graduate Student > Helen Wills Neuroscience Institute > University of California - Berkeley > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From njs at pobox.com Tue Apr 10 17:22:58 2012 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 10 Apr 2012 22:22:58 +0100 Subject: [SciPy-User] synchronizing timestamps from different systems; unpaired linear regression In-Reply-To: References: Message-ID: On Tue, Apr 10, 2012 at 10:18 PM, Nathaniel Smith wrote: > ? ?return np.sum((y_times - x_times[closest_x_times]) ** 2) On further thought, squaring is probably exactly the wrong transformation here -- squared error focuses on minimizing the large errors, and in this case we know that the large errors are caused by events that got dropped on the X side, and that these contain no information about the proper (offset, clockspeed). np.sqrt(np.abs(...)) would probably do better, or something similar that flattens out for larger values. Easy to play around with, though. Also on further thought, it might make sense to run that both directions, and match x values against y values too. 
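Something like this, maybe -- again untested, reusing the x_times/y_times
arrays and the midpoint/searchsorted trick from the earlier sketch
(offset_guess is just whatever rough offset the cross-correlation step
already gives you):

import numpy as np
from scipy import optimize

def flat_loss(errs):
    # grows slowly for large errors, so events that were dropped on one
    # side contribute little to the total
    return np.sqrt(np.abs(errs))

def one_way_score(ref_times, other_times):
    # distance from each event in other_times to the nearest event in
    # ref_times (both assumed sorted)
    midpoints = ref_times[:-1] + np.diff(ref_times) / 2.
    nearest = np.searchsorted(midpoints, other_times)
    return np.sum(flat_loss(other_times - ref_times[nearest]))

def objective(params):
    offset, clockspeed = params
    adj_y_times = y_times * clockspeed + offset
    # score both directions so spurious/missing events on either side
    # are treated symmetrically
    return (one_way_score(x_times, adj_y_times)
            + one_way_score(adj_y_times, x_times))

offset_guess = 0.0  # e.g. the lag found by the cross-correlation step
best_offset, best_clockspeed = optimize.fmin(objective, [offset_guess, 1.0])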
-- Nathaniel From aronne.merrelli at gmail.com Wed Apr 11 00:42:20 2012 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Tue, 10 Apr 2012 23:42:20 -0500 Subject: [SciPy-User] Strange behavior of sph_jn In-Reply-To: <4554-4f843a80-2f-65abee80@225308846> References: <4554-4f843a80-2f-65abee80@225308846> Message-ID: On Tue, Apr 10, 2012 at 8:49 AM, ZIEGLER Yann (ETU EOT) wrote: > Hi, > > It seems that the bug I have with sph_jn for some values is due to scipy > itself. I finded someone having the same kind of trouble on > projects.scipy.org/scipy (the website seems to have some technical problems > at this time). Hi, I believe that the SciPy implementation is from this reference (it is mentioned in the header file of the fortran file in SciPy): "Computation of Special Functions", 1996, John Wiley & Sons, Inc. Shanjie Zhang and Jianming Jin The fortran subroutine called from the python level sph_jn is 'SPHJ'; I don't understand the implementation very well (it is very cryptic to my untrained eye, and I don't have access to the book), but at the core of it is an iterative algorithm that just breaks down for the large arguments that you were testing. I think it is running into the max floating point exponent. So, I don't think there is any SciPy bug exactly, it is just this particular numerical implementation of the spherical bessel function doesn't work for those large arguments. Since you don't need the derivatives, then your own simple function relating it to the bessel function should work fine. Aronne From lb489 at cam.ac.uk Wed Apr 11 09:23:29 2012 From: lb489 at cam.ac.uk (L. Barrott) Date: 11 Apr 2012 14:23:29 +0100 Subject: [SciPy-User] (no subject) Message-ID: >On Tue, Apr 3, 2012 at 11:29 AM, L. Barrott wrote: > >> > On Tue, Apr 3, 2012 at 5:54 AM, L. Barrott wrote: >> > >> >> Hello, >> >> >> >> I have been trying to get scipy to solve a set of coupled odes and >> >> in particular I want to use the dopri 45 method as I want to compare >> >> the results to the ode45 method in MATLAB. The code runs along the >> >> lines of: >> >> >> >> def func (t, Y, params): >> >> ... >> >> return (Ydot) >> >> >> >> with Y a vector. The other ode methods (except dop853 of course) solve >> >> this fine but even if I use the example code on the documentation page >> >> the dopri method returns the following error >> >> >> >> create_cb_arglist: Failed to build argument list (siz) with enough >> >> arguments (tot-opt) required by user-supplied function >> >> (siz,tot,opt=2,3,0). ...(traceback stuff) _dop.error: failed in >> >> processing argument list for call-back fcn. >> >> >> >> Any ideas where I am going wrong? >> >> >> >> Many thanks >> >> LB >> >> >> > >> > >> > It would help to see more of your code. Could you include a complete, >> > self-contained script that demonstrates the error? >> > >> > Warren >> >> Even something as simple as; >> >> from scipy.integrate import ode >> y0, t0 = [0, 1], 0 >> def func (t, y, x): >> return [x, y[0]] >> r = ode(func).set_integrator ('dopri5') >> r.set_initial_value(y0, t0).set_f_params(1) >> t1 = 10 >> dt = 0.1 >> while r.successful() and r.t < t1: >> r.integrate(r.t+dt) >> >> Will fail and this is lifted straight from the documentation as far as I >> can see. The full error message is >> >> create_cb_arglist: Failed to build argument list (siz) with enough >> arguments (tot-opt) required by user-supplied function >> (siz,tot,opt=2,3,0). 
Traceback (most recent call last): >> File "", line 2, in >> File "/usr/lib/python2.7/dist-packages/scipy/integrate/ode.py", line >> 326, in integrate >> self.f_params,self.jac_params) >> File "/usr/lib/python2.7/dist-packages/scipy/integrate/ode.py", line >> 745, in run >> x,y,iwork,idid = self.runner(*((f,t0,y0,t1) + tuple(self.call_args))) >> _dop.error: failed in processing argument list for call-back fcn. >> >> LB >> > > >I suspect you are using version 0.9 (or earlier) of scipy. This looks like >a bug that was fixed in 0.10 > > http://projects.scipy.org/scipy/ticket/1392 > > >Warren Sorry for being slow about responding, that works now. Many thanks. LB From xrodgers at gmail.com Wed Apr 11 00:16:33 2012 From: xrodgers at gmail.com (Chris Rodgers) Date: Tue, 10 Apr 2012 21:16:33 -0700 Subject: [SciPy-User] synchronizing timestamps from different systems; unpaired linear regression In-Reply-To: References: Message-ID: Dear all Thanks very much for the suggestions! Re a new hardware implementation: I bet this would totally work and honestly is probably the fastest way to get it working. I think even a rough system clock would do the trick. The downsides are 1) many data have already been collected with the old setup; 2) I'm getting stubbornly interested in this problem for its own sake since it feel so solvable. So perhaps I'll change the hardware for future data and keep working on algorithms for the old data. (I'd never heard of Lamport timestamps. The wikipedia article is really interesting. If I understand it correctly, it would still require a hardware change though.) Re Nathaniel's suggestion: I think this is pretty similar to the algorithm I'm currently using. Pseudocode: current_guess = estimate_from_correlation(x, y) for timescale in decreasing_order: xm, ym = find_matches( x, y, current_guess, within=timescale) current_guess = linfit(xm, ym) The problem is the local minima caused by mismatch errors. If the clockspeed estimate is off, then late events are incorrectly matched with a delay of one event. Then the updated guess moves closer to this incorrect solution. So by killing off the points that disagree, we reinforce the current orthodoxy! Actually the truest objective function would be the number of matches within some specified error. ERR = .1 def objective(offset, clockspeed): # adjust parametrization to suit adj_y_times = y_times * clockspeed + offset closest_x_times = np.searchsorted(x_midpoints, adj_y_times) pred_err = abs(adj_y_times - x_midpoints[closest_x_times]) closest_good = closest_x_times[pred_err < ERR] return len(unique(closest_good)) That function has some ugly non-smoothness due to the len(unique(...)). Would something like optimize.brent work for this or am I on the wrong track? Thanks again all! Chris On Tue, Apr 10, 2012 at 2:22 PM, Nathaniel Smith wrote: > On Tue, Apr 10, 2012 at 10:18 PM, Nathaniel Smith wrote: >> ? ?return np.sum((y_times - x_times[closest_x_times]) ** 2) > > On further thought, squaring is probably exactly the wrong > transformation here -- squared error focuses on minimizing the large > errors, and in this case we know that the large errors are caused by > events that got dropped on the X side, and that these contain no > information about the proper (offset, clockspeed). > np.sqrt(np.abs(...)) would probably do better, or something similar > that flattens out for larger values. Easy to play around with, though. > > Also on further thought, it might make sense to run that both > directions, and match x values against y values too. 
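Note that optimize.brent minimizes a function of a single variable, so it does not directly apply to the two-parameter (offset, clockspeed) search; for a piecewise-constant counting objective like the one above, a coarse grid search such as scipy.optimize.brute is the more natural tool. A sketch, assuming x_times, y_times and x_midpoints as defined earlier in the thread, measuring the error against the event times themselves rather than the midpoints, and with made-up grid ranges that would have to come from the actual recording setup:

import numpy as np
from scipy import optimize

ERR = 0.1

def neg_match_count(params, x_times, x_midpoints, y_times):
    # Negated match count, since brute() minimizes.
    offset, clockspeed = params
    adj_y_times = y_times * clockspeed + offset
    nearest = np.searchsorted(x_midpoints, adj_y_times)
    pred_err = np.abs(adj_y_times - x_times[nearest])
    matched = nearest[pred_err < ERR]
    return -len(np.unique(matched))

# Placeholder ranges: offsets in seconds, clock-speed ratios near 1.
ranges = (slice(-30.0, 30.0, 0.5), slice(0.995, 1.005, 1e-4))
best = optimize.brute(neg_match_count, ranges,
                      args=(x_times, x_midpoints, y_times),
                      finish=None)  # no local polish: the objective is piecewise constant
print(best)  # (offset, clockspeed) with the most matches on this grid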
> > -- Nathaniel > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From yann.ziegler at etu.unistra.fr Wed Apr 11 04:36:34 2012 From: yann.ziegler at etu.unistra.fr (ZIEGLER Yann (ETU EOT)) Date: Wed, 11 Apr 2012 10:36:34 +0200 Subject: [SciPy-User] =?utf-8?q?Strange_behavior_of_sph=5Fjn?= In-Reply-To: <4554-4f843a80-2f-65abee80@225308846> Message-ID: <3a7c-4f854280-171-656b8480@225579481> > On Tue, Apr 10, 2012 at 8:49 AM, ZIEGLER Yann (ETU EOT) > wrote: > > Hi, > > > > It seems that the bug I have with sph_jn for some values is due to scipy > > itself. I finded someone having the same kind of trouble on > > projects.scipy.org/scipy (the website seems to have some technical problems > > at this time). > > > Hi, > > I believe that the SciPy implementation is from this reference (it is > mentioned in the header file of the fortran file in SciPy): > > "Computation of Special Functions", 1996, John Wiley & Sons, Inc. > Shanjie Zhang and Jianming Jin > > The fortran subroutine called from the python level sph_jn is 'SPHJ'; > I don't understand the implementation very well (it is very cryptic to > my untrained eye, and I don't have access to the book), but at the > core of it is an iterative algorithm that just breaks down for the > large arguments that you were testing. I think it is running into the > max floating point exponent. You're probably right, it is coherent with what I read in some docs : for big values it is efficient enough to use the asymptotic approximation of these functions... but I'm not convinced anyway returning inf or NaN is the behavior one could expect by calling sph_jn which must be defined for all real or complex numbers. So, you're right: I was mistaken in speaking about a bug, but for me it's a pitfall that should be removed from scipy... at least for people like me who don't know all the ins and outs of special functions computation. > So, I don't think there is any SciPy bug exactly, it is just this > particular numerical implementation of the spherical bessel function > doesn't work for those large arguments. Since you don't need the > derivatives, then your own simple function relating it to the bessel > function should work fine. > > Aronne In fact, I needed the derivatives but, fortunately, there are some useful recurrence relations, such as : http://functions.wolfram.com/Bessel-TypeFunctions/SphericalBesselJ/20/01/02/ Thanks for your comment! Yann PS : Sorry, I'm not familiar with mailing lists and I've created a new message when I answered previously. I hope it will be all right now! From travis at continuum.io Wed Apr 11 09:48:35 2012 From: travis at continuum.io (Travis Oliphant) Date: Wed, 11 Apr 2012 08:48:35 -0500 Subject: [SciPy-User] Strange behavior of sph_jn In-Reply-To: <4554-4f843a80-2f-65abee80@225308846> References: <4554-4f843a80-2f-65abee80@225308846> Message-ID: <33214579-2597-498C-88E3-3D4ABF6EF270@continuum.io> Thanks for this excellent feedback. It would be very helpful if you could file a bug at http://projects.scipy.org/scipy That way we we don't lose track of the problem you ran in to. The algorithm should be adjusted to handle these values. Best regards, -Travis On Apr 10, 2012, at 8:49 AM, ZIEGLER Yann (ETU EOT) wrote: > Hi, > > It seems that the bug I have with sph_jn for some values is due to scipy > itself. 
I finded someone having the same kind of trouble on > projects.scipy.org/scipy (the website seems to have some technical problems > at this time). > > If anyone is interested in it, here is the solution I have finded to > circumvent this bug : > > Using the definition of Spherical Bessel function involving Bessel function > (http://functions.wolfram.com/Bessel-TypeFunctions/SphericalBesselJ/02/), > I wrote -- nothing extraordinary, this is a 2-lines function -- my own > SphericalBessel function (naive but seemingly correct) for positive or > negative order n and complex argument z : > > import numpy as np > from scipy.special import jn > > def SphericalBessel(n,z): > > zsqrt = np.sqrt(np.abs(z)) * np.exp(np.angle(z)/2) > > return np.sqrt(np.pi/2) / zsqrt * jn(n+0.5, z) > > Yann > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Apr 11 11:31:41 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 11 Apr 2012 09:31:41 -0600 Subject: [SciPy-User] synchronizing timestamps from different systems; unpaired linear regression In-Reply-To: References: Message-ID: On Tue, Apr 10, 2012 at 10:16 PM, Chris Rodgers wrote: > Dear all > > Thanks very much for the suggestions! > > Re a new hardware implementation: I bet this would totally work and > honestly is probably the fastest way to get it working. I think even a > rough system clock would do the trick. The downsides are 1) many data > have already been collected with the old setup; 2) I'm getting > stubbornly interested in this problem for its own sake since it feel > so solvable. So perhaps I'll change the hardware for future data and > keep working on algorithms for the old data. (I'd never heard of > Lamport timestamps. The wikipedia article is really interesting. If I > understand it correctly, it would still require a hardware change > though.) > > > Re Nathaniel's suggestion: > I think this is pretty similar to the algorithm I'm currently using. > Pseudocode: > > current_guess = estimate_from_correlation(x, y) > for timescale in decreasing_order: > xm, ym = find_matches( > x, y, current_guess, within=timescale) > current_guess = linfit(xm, ym) > > The problem is the local minima caused by mismatch errors. If the > clockspeed estimate is off, then late events are incorrectly matched > with a delay of one event. Then the updated guess moves closer to this > incorrect solution. So by killing off the points that disagree, we > reinforce the current orthodoxy! > > Actually the truest objective function would be the number of matches > within some specified error. > ERR = .1 > def objective(offset, clockspeed): > # adjust parametrization to suit > adj_y_times = y_times * clockspeed + offset > closest_x_times = np.searchsorted(x_midpoints, adj_y_times) > pred_err = abs(adj_y_times - x_midpoints[closest_x_times]) > closest_good = closest_x_times[pred_err < ERR] > return len(unique(closest_good)) > > > That function has some ugly non-smoothness due to the > len(unique(...)). Would something like optimize.brent work for this or > am I on the wrong track? > > > You can also get some useful information from NTP (or chrony). 
For instance on Fedora 16, which uses chrony instead of the ntp utilities to synchronize the time, chronyc> tracking Reference ID : 66.219.59.208 (mx1.mailfighter.net) Stratum : 3 Ref time (UTC) : Wed Apr 11 15:17:33 2012 System time : 0.000000349 seconds fast of NTP time Frequency : 41.465 ppm fast Residual freq : 0.001 ppm Skew : 0.104 ppm Root delay : 0.063986 seconds Root dispersion : 0.023441 seconds The Frequency might be useful to you. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From claumann at physics.harvard.edu Wed Apr 11 12:18:02 2012 From: claumann at physics.harvard.edu (Chris Laumann) Date: Wed, 11 Apr 2012 12:18:02 -0400 Subject: [SciPy-User] CSR sorting/duplication of entries Message-ID: <72B1BE95-6CEE-4C26-A16F-BAC0CC217870@physics.harvard.edu> Hi all- Does anybody know whether it is safe (or rather, what breaks) if you use a CSR matrix whose rows are not sorted and may even include duplicate entries? Are duplicates treated additively? The main scipy.sparse documentation page notes that CSR rows are not necessarily sorted but doesn't address duplication. It also simply cryptically notes that you may need to sort the CSR rows when talking to other libraries, but doesn't specifically address whether they are always safe within scipy submodules.. Best, Chris From njs at pobox.com Wed Apr 11 12:37:39 2012 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 11 Apr 2012 17:37:39 +0100 Subject: [SciPy-User] synchronizing timestamps from different systems; unpaired linear regression In-Reply-To: References: Message-ID: On Wed, Apr 11, 2012 at 5:16 AM, Chris Rodgers wrote: > Re Nathaniel's suggestion: > I think this is pretty similar to the algorithm I'm currently using. Pseudocode: > > current_guess = estimate_from_correlation(x, y) > for timescale in decreasing_order: > ?xm, ym = find_matches( > ? ?x, y, current_guess, within=timescale) > ?current_guess = linfit(xm, ym) > > The problem is the local minima caused by mismatch errors. If the > clockspeed estimate is off, then late events are incorrectly matched > with a delay of one event. Then the updated guess moves closer to this > incorrect solution. So by killing off the points that disagree, we > reinforce the current orthodoxy! Yes, that's why I was suggesting doing something similar to your current algorithm, but different in ways that might avoid this problem :-). > Actually the truest objective function would be the number of matches > within some specified error. > ERR = .1 > def objective(offset, clockspeed): > ? # adjust parametrization to suit > ? adj_y_times = y_times * clockspeed + offset > ? closest_x_times = np.searchsorted(x_midpoints, adj_y_times) > ? pred_err = abs(adj_y_times - x_midpoints[closest_x_times]) > ? closest_good = closest_x_times[pred_err < ERR] > ? return len(unique(closest_good)) > > > That function has some ugly non-smoothness due to the > len(unique(...)). Would something like optimize.brent work for this or > am I on the wrong track? That also has the problem that the objective function is totally flat whenever the points are all within ERR, so you are guaranteed to get an offset inaccuracy on the same order of magnitude as ERR. It sounds like your clocks have a lot of jitter, though, if you can't do better than 10 ms agreement, so maybe you can't get more accurate than that anyway. 
The "perfect" solution to this would involve writing down a probabilistic model that had some distribution of per-event jitter, some likelihood of events being lost, etc., and then maximize the likelihood of the data. But calculating this likelihood would probably require considering all possible pairings of X and Y values, which would be extremely expensive to compute. But the approximation where you assume you have the correct matching is not so good, as you notice. So if I were you I'd just fiddle around with these different suggestions until you got something that worked :-). -- Nathaniel From charlesr.harris at gmail.com Wed Apr 11 12:53:20 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 11 Apr 2012 10:53:20 -0600 Subject: [SciPy-User] synchronizing timestamps from different systems; unpaired linear regression In-Reply-To: References: Message-ID: On Wed, Apr 11, 2012 at 10:37 AM, Nathaniel Smith wrote: > On Wed, Apr 11, 2012 at 5:16 AM, Chris Rodgers wrote: > > Re Nathaniel's suggestion: > > I think this is pretty similar to the algorithm I'm currently using. > Pseudocode: > > > > current_guess = estimate_from_correlation(x, y) > > for timescale in decreasing_order: > > xm, ym = find_matches( > > x, y, current_guess, within=timescale) > > current_guess = linfit(xm, ym) > > > > The problem is the local minima caused by mismatch errors. If the > > clockspeed estimate is off, then late events are incorrectly matched > > with a delay of one event. Then the updated guess moves closer to this > > incorrect solution. So by killing off the points that disagree, we > > reinforce the current orthodoxy! > > Yes, that's why I was suggesting doing something similar to your > current algorithm, but different in ways that might avoid this problem > :-). > > > Actually the truest objective function would be the number of matches > > within some specified error. > > ERR = .1 > > def objective(offset, clockspeed): > > # adjust parametrization to suit > > adj_y_times = y_times * clockspeed + offset > > closest_x_times = np.searchsorted(x_midpoints, adj_y_times) > > pred_err = abs(adj_y_times - x_midpoints[closest_x_times]) > > closest_good = closest_x_times[pred_err < ERR] > > return len(unique(closest_good)) > > > > > > That function has some ugly non-smoothness due to the > > len(unique(...)). Would something like optimize.brent work for this or > > am I on the wrong track? > > That also has the problem that the objective function is totally flat > whenever the points are all within ERR, so you are guaranteed to get > an offset inaccuracy on the same order of magnitude as ERR. It sounds > like your clocks have a lot of jitter, though, if you can't do better > than 10 ms agreement, so maybe you can't get more accurate than that > anyway. > > I'd like to know the platform. The default Window's clock ticks at 15 Hz, IIRC. From pav at iki.fi Wed Apr 11 14:13:41 2012 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 11 Apr 2012 20:13:41 +0200 Subject: [SciPy-User] Strange behavior of sph_jn In-Reply-To: <33214579-2597-498C-88E3-3D4ABF6EF270@continuum.io> References: <4554-4f843a80-2f-65abee80@225308846> <33214579-2597-498C-88E3-3D4ABF6EF270@continuum.io> Message-ID: Hi, 11.04.2012 15:48, Travis Oliphant kirjoitti: > Thanks for this excellent feedback. It would be very helpful if you > could file a bug at > > http://projects.scipy.org/scipy > > That way we we don't lose track of the problem you ran in to. The > algorithm should be adjusted to handle these values. 
One was submitted some time ago: http://projects.scipy.org/scipy/ticket/1640 The algorithm probably would need some asymptotic expansions. This is unfortunately a pain to do so that it works for all parameters (see our jv implementation, which does things properly). One option is just to fall back to relate the spherical functions to the usual one, and reuse `jv` and `yv` which are more robust. -- Pauli Virtanen From pengyu.ut at gmail.com Wed Apr 11 17:39:09 2012 From: pengyu.ut at gmail.com (Peng Yu) Date: Wed, 11 Apr 2012 16:39:09 -0500 Subject: [SciPy-User] scipy installation error on Mac OS X (10.6.8) (with pip) Message-ID: Hi, I get the following error. When I try to instal scipy (pip install scipy). Does anybody know how to fix the problem? .... error: Command "/opt/local/bin/g95 -shared -shared build/temp.macosx-10.6-intel-2.7/build/src.macosx-10.6-intel-2.7/scipy/fftpack/_fftpackmodule.o build/temp.macosx-10.6-intel-2.7/scipy/fftpack/src/zfft.o build/temp.macosx-10.6-intel-2.7/scipy/fftpack/src/drfft.o build/temp.macosx-10.6-intel-2.7/scipy/fftpack/src/zrfft.o build/temp.macosx-10.6-intel-2.7/scipy/fftpack/src/zfftnd.o build/temp.macosx-10.6-intel-2.7/build/src.macosx-10.6-intel-2.7/scipy/fftpack/src/dct.o build/temp.macosx-10.6-intel-2.7/build/src.macosx-10.6-intel-2.7/scipy/fftpack/src/dst.o build/temp.macosx-10.6-intel-2.7/build/src.macosx-10.6-intel-2.7/fortranobject.o -Lbuild/temp.macosx-10.6-intel-2.7 -ldfftpack -lfftpack -o build/lib.macosx-10.6-intel-2.7/scipy/fftpack/_fftpack.so" failed with exit status 1 -- Regards, Peng From cournape at gmail.com Wed Apr 11 17:51:59 2012 From: cournape at gmail.com (David Cournapeau) Date: Wed, 11 Apr 2012 22:51:59 +0100 Subject: [SciPy-User] scipy installation error on Mac OS X (10.6.8) (with pip) In-Reply-To: References: Message-ID: On Wed, Apr 11, 2012 at 10:39 PM, Peng Yu wrote: > Hi, > > I get the following error. When I try to instal scipy (pip install > scipy). Does anybody know how to fix the problem? > Don't use g95 as a fortran compiler, it is known to be severly broken, especially on mac os x. Use gfortran itself. David -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Thu Apr 12 03:00:53 2012 From: srean.list at gmail.com (srean) Date: Thu, 12 Apr 2012 02:00:53 -0500 Subject: [SciPy-User] synchronizing timestamps from different systems; unpaired linear regression In-Reply-To: References: Message-ID: There have many good suggestions, I will add another. One option would be to model this via dynamic programming. I think searching for "sequence alignment" and "time warping" will give you a lot of helpful hits. The basic idea is that one models sequence-B to be a modification of sequence-A where "tokens" could have been added, deleted or perturbed by noise (all modeled as statistically independent of each other) and that there is a transformation that connects the indices of the two sequences. Computationally this is potentially costlier but you can get to the global minimum if the time warp wasnt there. It does reasonably well even if it is. Regarding previous suggestions: If you have some periodicity in the signals, then computing autocorrelation on the two streams will also give you an idea of whats the slew rate between the clocks and cross-correlation will point towards an offset once you have accounted for the slew rate. (All this is assuming there arent too many 'adds' and 'deletes'). 
The numbers you will get will not be exact but perhaps a good point to initialize your algorithm with. On Tue, Apr 10, 2012 at 11:16 PM, Chris Rodgers wrote: > Dear all > > Thanks very much for the suggestions! > > Re a new hardware implementation: I bet this would totally work and > honestly is probably the fastest way to get it working. I think even a > rough system clock would do the trick. The downsides are 1) many data > have already been collected with the old setup; 2) I'm getting > stubbornly interested in this problem for its own sake since it feel > so solvable. So perhaps I'll change the hardware for future data and > keep working on algorithms for the old data. (I'd never heard of > Lamport timestamps. The wikipedia article is really interesting. If I > understand it correctly, it would still require a hardware change > though.) > > > Re Nathaniel's suggestion: > I think this is pretty similar to the algorithm I'm currently using. Pseudocode: > > current_guess = estimate_from_correlation(x, y) > for timescale in decreasing_order: > ?xm, ym = find_matches( > ? ?x, y, current_guess, within=timescale) > ?current_guess = linfit(xm, ym) > > The problem is the local minima caused by mismatch errors. If the > clockspeed estimate is off, then late events are incorrectly matched > with a delay of one event. Then the updated guess moves closer to this > incorrect solution. So by killing off the points that disagree, we > reinforce the current orthodoxy! > > Actually the truest objective function would be the number of matches > within some specified error. > ERR = .1 > def objective(offset, clockspeed): > ? # adjust parametrization to suit > ? adj_y_times = y_times * clockspeed + offset > ? closest_x_times = np.searchsorted(x_midpoints, adj_y_times) > ? pred_err = abs(adj_y_times - x_midpoints[closest_x_times]) > ? closest_good = closest_x_times[pred_err < ERR] > ? return len(unique(closest_good)) > > > That function has some ugly non-smoothness due to the > len(unique(...)). Would something like optimize.brent work for this or > am I on the wrong track? > > > Thanks again all! > Chris > > On Tue, Apr 10, 2012 at 2:22 PM, Nathaniel Smith wrote: >> On Tue, Apr 10, 2012 at 10:18 PM, Nathaniel Smith wrote: >>> ? ?return np.sum((y_times - x_times[closest_x_times]) ** 2) >> >> On further thought, squaring is probably exactly the wrong >> transformation here -- squared error focuses on minimizing the large >> errors, and in this case we know that the large errors are caused by >> events that got dropped on the X side, and that these contain no >> information about the proper (offset, clockspeed). >> np.sqrt(np.abs(...)) would probably do better, or something similar >> that flattens out for larger values. Easy to play around with, though. >> >> Also on further thought, it might make sense to run that both >> directions, and match x values against y values too. 
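A minimal sketch of the sequence-alignment idea: a Needleman-Wunsch-style dynamic program that pairs events from the two lists while allowing insertions and deletions. This is not code from the thread; it assumes a current (offset, clockspeed) guess from one of the approaches above, the gap cost is a made-up tuning knob, and the pure-Python O(len(x) * len(y)) loop would need to be banded around the diagonal (or vectorized) for hour-long recordings:

import numpy as np

def align_events(x_times, y_times, offset, clockspeed, gap=0.05):
    # Map Y onto the X clock, then align the two sorted timestamp lists.
    # Matched events pay their absolute time difference; unmatched events
    # (insertions/deletions) pay `gap`.  Returns matched (i, j) index pairs.
    x = np.asarray(x_times, dtype=float)
    adj_y = np.asarray(y_times, dtype=float) * clockspeed + offset
    n, m = len(x), len(adj_y)
    D = np.zeros((n + 1, m + 1))
    D[:, 0] = np.arange(n + 1) * gap
    D[0, :] = np.arange(m + 1) * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = D[i - 1, j - 1] + abs(x[i - 1] - adj_y[j - 1])
            D[i, j] = min(match, D[i - 1, j] + gap, D[i, j - 1] + gap)
    # Trace back to recover which events were paired.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if D[i, j] == D[i - 1, j - 1] + abs(x[i - 1] - adj_y[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif D[i, j] == D[i - 1, j] + gap:
            i -= 1
        else:
            j -= 1
    return pairs[::-1]

The matched pairs can then be fed back into the straight-line fit from the earlier pseudocode and the whole procedure iterated.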
>> >> -- Nathaniel >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From xrodgers at gmail.com Thu Apr 12 01:02:47 2012 From: xrodgers at gmail.com (Chris Rodgers) Date: Wed, 11 Apr 2012 22:02:47 -0700 Subject: [SciPy-User] synchronizing timestamps from different systems; unpaired linear regression In-Reply-To: References: Message-ID: The platform is specialized because the data come from equipment in my lab. One system is a 3KHz real-time operating system detecting voltage pulses on one input. The other is a data-acquistion card sampling at 30KHz and detecting deflections on another input. The temporal uncertainty arises not from the raw sampling rates but from the uncertainties in the pulse detection algorithms. > Yes, that's why I was suggesting doing something similar to your > current algorithm, but different in ways that might avoid this problem I could be wrong but I think your suggestion retains the matching problem implicitly, because it uses the closest value in X to the transformed point from Y. I wrote a new objective function that uses the sum of squared error between every value in X and every transformed point from Y (but thresholds any squared error that is above a certain point). I played around with this function, and the one you suggested, and a couple of others, using scipy.optimize.brute. It was a fun exercise but the energy landscape is pathological. There are local minima for every nearby mismatch error, and the minimum corresponding to the true solution is extremely narrow and surrounded by peaks. The problem is that a good guess for one parameter (dilation say) means that a small error in the other parameter (offset) results in a very bad solution. It still feels like there should be a way to do this from the statistics of the input. I tried some crazier things like taking many subsamples of X and Y, fitting them, and then looking at the distribution of all the discovered fits. But the problem with this is that I'm very unlikely to choose useful subsamples, of course. I think I'll have to use the brute force approach with very fine sampling, or one of the practical hardware solutions suggested previously. Thanks! Chris On Wed, Apr 11, 2012 at 9:53 AM, Charles R Harris wrote: > > > On Wed, Apr 11, 2012 at 10:37 AM, Nathaniel Smith wrote: >> >> On Wed, Apr 11, 2012 at 5:16 AM, Chris Rodgers wrote: >> > Re Nathaniel's suggestion: >> > I think this is pretty similar to the algorithm I'm currently using. >> > Pseudocode: >> > >> > current_guess = estimate_from_correlation(x, y) >> > for timescale in decreasing_order: >> > ?xm, ym = find_matches( >> > ? ?x, y, current_guess, within=timescale) >> > ?current_guess = linfit(xm, ym) >> > >> > The problem is the local minima caused by mismatch errors. If the >> > clockspeed estimate is off, then late events are incorrectly matched >> > with a delay of one event. Then the updated guess moves closer to this >> > incorrect solution. So by killing off the points that disagree, we >> > reinforce the current orthodoxy! >> >> Yes, that's why I was suggesting doing something similar to your >> current algorithm, but different in ways that might avoid this problem >> :-). 
>> >> > Actually the truest objective function would be the number of matches >> > within some specified error. >> > ERR = .1 >> > def objective(offset, clockspeed): >> > ? # adjust parametrization to suit >> > ? adj_y_times = y_times * clockspeed + offset >> > ? closest_x_times = np.searchsorted(x_midpoints, adj_y_times) >> > ? pred_err = abs(adj_y_times - x_midpoints[closest_x_times]) >> > ? closest_good = closest_x_times[pred_err < ERR] >> > ? return len(unique(closest_good)) >> > >> > >> > That function has some ugly non-smoothness due to the >> > len(unique(...)). Would something like optimize.brent work for this or >> > am I on the wrong track? >> >> That also has the problem that the objective function is totally flat >> whenever the points are all within ERR, so you are guaranteed to get >> an offset inaccuracy on the same order of magnitude as ERR. It sounds >> like your clocks have a lot of jitter, though, if you can't do better >> than 10 ms agreement, so maybe you can't get more accurate than that >> anyway. >> > > I'd like to know the platform. The default Window's clock ticks at 15 Hz, > IIRC. > > > Chuck > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From yann.ziegler at etu.unistra.fr Thu Apr 12 04:33:26 2012 From: yann.ziegler at etu.unistra.fr (ZIEGLER Yann (ETU EOT)) Date: Thu, 12 Apr 2012 10:33:26 +0200 Subject: [SciPy-User] =?utf-8?q?Strange_behavior_of_sph=5Fjn?= Message-ID: <7338-4f869380-1d-3722e100@21137754> > Thanks for this excellent feedback. You're welcome, thanks to all for your reactivity and comments. Even if I eventually found the solution elsewhere, it helped me to clarify the origin of my problem. > It would be very helpful if you > could file a bug at > > http://projects.scipy.org/scipy > > That way we we don't lose track of the problem you ran in to. > The algorithm should be adjusted to handle these values. > > Best regards, > > -Travis In fact, I'm the author of the last ticket: http://projects.scipy.org/scipy/ticket/1640 But in a previous mail I spoke about this one: http://projects.scipy.org/scipy/ticket/1114 I didn't realize immediately that its author exactly described the problem I have reported here and gave the same trick to solve it :-S ... but 'in my defence' its ticket dates back to 2010! -Yann From sgarcia at olfac.univ-lyon1.fr Thu Apr 12 10:14:05 2012 From: sgarcia at olfac.univ-lyon1.fr (Samuel Garcia) Date: Thu, 12 Apr 2012 16:14:05 +0200 Subject: [SciPy-User] synchronizing timestamps from different systems; unpaired linear regression In-Reply-To: References: Message-ID: <4F86E32D.2000501@olfac.univ-lyon1.fr> What algo do you use for the online spike detection (sorting) ? Do you have a commercial system or a home made system ? Did you tried some Metroplis Hasting like method : * sample a subset of point * estimate offset, clockspeed at each step * stamp bad point X that have not a pair Y with that estimated offset and clockspeed * sample again only poitn (that are not bad stamp) * check is loglike is better or not and reject or accept the new state Samuel Le 12/04/2012 07:02, Chris Rodgers a ?crit : > The platform is specialized because the data come from equipment in my > lab. One system is a 3KHz real-time operating system detecting voltage > pulses on one input. The other is a data-acquistion card sampling at > 30KHz and detecting deflections on another input. 
The temporal > uncertainty arises not from the raw sampling rates but from the > uncertainties in the pulse detection algorithms. > >> Yes, that's why I was suggesting doing something similar to your >> current algorithm, but different in ways that might avoid this problem > I could be wrong but I think your suggestion retains the matching > problem implicitly, because it uses the closest value in X to the > transformed point from Y. > > I wrote a new objective function that uses the sum of squared error > between every value in X and every transformed point from Y (but > thresholds any squared error that is above a certain point). I played > around with this function, and the one you suggested, and a couple of > others, using scipy.optimize.brute. It was a fun exercise but the > energy landscape is pathological. There are local minima for every > nearby mismatch error, and the minimum corresponding to the true > solution is extremely narrow and surrounded by peaks. The problem is > that a good guess for one parameter (dilation say) means that a small > error in the other parameter (offset) results in a very bad solution. > > It still feels like there should be a way to do this from the > statistics of the input. I tried some crazier things like taking many > subsamples of X and Y, fitting them, and then looking at the > distribution of all the discovered fits. But the problem with this is > that I'm very unlikely to choose useful subsamples, of course. I think > I'll have to use the brute force approach with very fine sampling, or > one of the practical hardware solutions suggested previously. > > Thanks! > Chris > > On Wed, Apr 11, 2012 at 9:53 AM, Charles R Harris > wrote: >> >> On Wed, Apr 11, 2012 at 10:37 AM, Nathaniel Smith wrote: >>> On Wed, Apr 11, 2012 at 5:16 AM, Chris Rodgers wrote: >>>> Re Nathaniel's suggestion: >>>> I think this is pretty similar to the algorithm I'm currently using. >>>> Pseudocode: >>>> >>>> current_guess = estimate_from_correlation(x, y) >>>> for timescale in decreasing_order: >>>> xm, ym = find_matches( >>>> x, y, current_guess, within=timescale) >>>> current_guess = linfit(xm, ym) >>>> >>>> The problem is the local minima caused by mismatch errors. If the >>>> clockspeed estimate is off, then late events are incorrectly matched >>>> with a delay of one event. Then the updated guess moves closer to this >>>> incorrect solution. So by killing off the points that disagree, we >>>> reinforce the current orthodoxy! >>> Yes, that's why I was suggesting doing something similar to your >>> current algorithm, but different in ways that might avoid this problem >>> :-). >>> >>>> Actually the truest objective function would be the number of matches >>>> within some specified error. >>>> ERR = .1 >>>> def objective(offset, clockspeed): >>>> # adjust parametrization to suit >>>> adj_y_times = y_times * clockspeed + offset >>>> closest_x_times = np.searchsorted(x_midpoints, adj_y_times) >>>> pred_err = abs(adj_y_times - x_midpoints[closest_x_times]) >>>> closest_good = closest_x_times[pred_err< ERR] >>>> return len(unique(closest_good)) >>>> >>>> >>>> That function has some ugly non-smoothness due to the >>>> len(unique(...)). Would something like optimize.brent work for this or >>>> am I on the wrong track? >>> That also has the problem that the objective function is totally flat >>> whenever the points are all within ERR, so you are guaranteed to get >>> an offset inaccuracy on the same order of magnitude as ERR. 
It sounds >>> like your clocks have a lot of jitter, though, if you can't do better >>> than 10 ms agreement, so maybe you can't get more accurate than that >>> anyway. >>> >> I'd like to know the platform. The default Window's clock ticks at 15 Hz, >> IIRC. >> >> > >> Chuck >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Samuel Garcia Lyon Neuroscience CNRS - UMR5292 - INSERM U1028 - Universite Claude Bernard LYON 1 Equipe R et D 50, avenue Tony Garnier 69366 LYON Cedex 07 FRANCE T?l : 04 37 28 74 24 Fax : 04 37 28 76 01 http://olfac.univ-lyon1.fr/unite/equipe-07/ http://neuralensemble.org/trac/OpenElectrophy ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From josef.pktd at gmail.com Thu Apr 12 11:13:44 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 12 Apr 2012 11:13:44 -0400 Subject: [SciPy-User] synchronizing timestamps from different systems; unpaired linear regression In-Reply-To: <4F86E32D.2000501@olfac.univ-lyon1.fr> References: <4F86E32D.2000501@olfac.univ-lyon1.fr> Message-ID: On Thu, Apr 12, 2012 at 10:14 AM, Samuel Garcia wrote: > What algo do you use for the online spike detection (sorting) ? > Do you have a commercial system or a home made system ? > > Did you tried some Metroplis Hasting like method : > ?* sample a subset of point > ?* estimate offset, clockspeed at each step > ?* stamp bad point X that have not a pair Y with that estimated offset > and clockspeed > ?* sample again only poitn (that are not bad stamp) > ?* check is loglike is better or not and reject or accept the new state similar: estimate offset from many samples and clockspeed e.g. from the time difference of nearby samples. If you get lot's of estimates that are wrong but many that are right, then a robust estimator like statsmodels.RLM might be able to estimate the offset and clockspeed from the individual estimates by down weighting the outliers. (similar to trimmed means) maybe iterate with a better initial estimate for the individual subsamples. (I have no idea what data like this might look like) Josef > > > > Samuel > > > > Le 12/04/2012 07:02, Chris Rodgers a ?crit : >> The platform is specialized because the data come from equipment in my >> lab. One system is a 3KHz real-time operating system detecting voltage >> pulses on one input. The other is a data-acquistion card sampling at >> 30KHz and detecting deflections on another input. The temporal >> uncertainty arises not from the raw sampling rates but from the >> uncertainties in the pulse detection algorithms. >> >>> Yes, that's why I was suggesting doing something similar to your >>> current algorithm, but different in ways that might avoid this problem >> I could be wrong but I think your suggestion retains the matching >> problem implicitly, because it uses the closest value in X to the >> transformed point from Y. >> >> I wrote a new objective function that uses the sum of squared error >> between every value in X and every transformed point from Y (but >> thresholds any squared error that is above a certain point). I played >> around with this function, and the one you suggested, and a couple of >> others, using scipy.optimize.brute. It was a fun exercise but the >> energy landscape is pathological. 
There are local minima for every >> nearby mismatch error, and the minimum corresponding to the true >> solution is extremely narrow and surrounded by peaks. The problem is >> that a good guess for one parameter (dilation say) means that a small >> error in the other parameter (offset) results in a very bad solution. >> >> It still feels like there should be a way to do this from the >> statistics of the input. I tried some crazier things like taking many >> subsamples of X and Y, fitting them, and then looking at the >> distribution of all the discovered fits. But the problem with this is >> that I'm very unlikely to choose useful subsamples, of course. I think >> I'll have to use the brute force approach with very fine sampling, or >> one of the practical hardware solutions suggested previously. >> >> Thanks! >> Chris >> >> On Wed, Apr 11, 2012 at 9:53 AM, Charles R Harris >> ?wrote: >>> >>> On Wed, Apr 11, 2012 at 10:37 AM, Nathaniel Smith ?wrote: >>>> On Wed, Apr 11, 2012 at 5:16 AM, Chris Rodgers ?wrote: >>>>> Re Nathaniel's suggestion: >>>>> I think this is pretty similar to the algorithm I'm currently using. >>>>> Pseudocode: >>>>> >>>>> current_guess = estimate_from_correlation(x, y) >>>>> for timescale in decreasing_order: >>>>> ? xm, ym = find_matches( >>>>> ? ? x, y, current_guess, within=timescale) >>>>> ? current_guess = linfit(xm, ym) >>>>> >>>>> The problem is the local minima caused by mismatch errors. If the >>>>> clockspeed estimate is off, then late events are incorrectly matched >>>>> with a delay of one event. Then the updated guess moves closer to this >>>>> incorrect solution. So by killing off the points that disagree, we >>>>> reinforce the current orthodoxy! >>>> Yes, that's why I was suggesting doing something similar to your >>>> current algorithm, but different in ways that might avoid this problem >>>> :-). >>>> >>>>> Actually the truest objective function would be the number of matches >>>>> within some specified error. >>>>> ERR = .1 >>>>> def objective(offset, clockspeed): >>>>> ? ?# adjust parametrization to suit >>>>> ? ?adj_y_times = y_times * clockspeed + offset >>>>> ? ?closest_x_times = np.searchsorted(x_midpoints, adj_y_times) >>>>> ? ?pred_err = abs(adj_y_times - x_midpoints[closest_x_times]) >>>>> ? ?closest_good = closest_x_times[pred_err< ?ERR] >>>>> ? ?return len(unique(closest_good)) >>>>> >>>>> >>>>> That function has some ugly non-smoothness due to the >>>>> len(unique(...)). Would something like optimize.brent work for this or >>>>> am I on the wrong track? >>>> That also has the problem that the objective function is totally flat >>>> whenever the points are all within ERR, so you are guaranteed to get >>>> an offset inaccuracy on the same order of magnitude as ERR. It sounds >>>> like your clocks have a lot of jitter, though, if you can't do better >>>> than 10 ms agreement, so maybe you can't get more accurate than that >>>> anyway. >>>> >>> I'd like to know the platform. The default Window's clock ticks at 15 Hz, >>> IIRC. 
>>> >>> >> >>> Chuck >>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > -- > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > Samuel Garcia > Lyon Neuroscience > CNRS - UMR5292 - ?INSERM U1028 - ?Universite Claude Bernard LYON 1 > Equipe R et D > 50, avenue Tony Garnier > 69366 LYON Cedex 07 > FRANCE > T?l : 04 37 28 74 24 > Fax : 04 37 28 76 01 > http://olfac.univ-lyon1.fr/unite/equipe-07/ > http://neuralensemble.org/trac/OpenElectrophy > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From sturla at molden.no Thu Apr 12 11:25:51 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 12 Apr 2012 17:25:51 +0200 Subject: [SciPy-User] synchronizing timestamps from different systems; unpaired linear regression In-Reply-To: References: Message-ID: <4F86F3FF.1010807@molden.no> I had a similar problem with a 16 channel recording setup I used for neuroscience. The ADC clock rates were slightly different, but after some minutes of recording the error was two orders of magnitude larger than the waveforms I was recording. I solved the problem by measuring the relative clock rates very accurately (by time-staming TTL signals from the parallel port a computer running MS-DOS), upsampling to some biological ridiculous sampling rate (with an FIR least squares interpolation filter and FFTs), and then downsampling to approximate synchrony. All in all it worked well, but I nevertheless modified the system by connecting all ADCs to a single oscillator. The problem had gone undetected by the manufacturer because each oscillator served four ADCs, and synchrony was only tested between ADCs served by the same oscillator. And I learned an important lesson: It can take less time to redo the whole experiment than fix corrupted data. But my anger at wasting time with a faulty recording system made me waste even more time correcting the corrupted data files. Sturla On 10.04.2012 11:27, Chris Rodgers wrote: > I have what seems like a straightforward problem but it is becoming > more difficult than I thought. I have two different computers > recording timestamps from the same stream of events. I get lists X and > Y from each computer and the question is how to figure out which entry > in X corresponds to which entry in Y. > > Complications: > 1) There are an unknown number of missing or spurious events in each > list. I do not know which events in X match up to which in Y. > 2) The temporal offset between the two lists is unknown, because each > timer begins at a different time. > 3) The clocks seem to run at slightly different speeds (~0.3% > difference adds up to about 10 seconds over my 1hr recording time). > > I know this problem is solvable because once you find the temporal > offset and clock-speed ratio, the matching timestamps agree to within > 10ms. That is, there is a strong linear relationship between some > unknown X->Y mapping. > > Basically, the problem is: given list X and list Y, and specifying a > certain minimum R**2 value, what is the largest set of matched points > from X and Y that satisfy this R**2 value? 
I have tried googling > "unmatched linear regression" but this must not be the right search > term. > > One approach that I've tried is to create an analog trace for X and Y > with a Gaussian centered at each timestamp, then finding the lag that > optimizes the cross-correlation between the two. This is good for > finding the temporal offset but can't handle the clock-speed > difference. (Also it takes a really long time because the series are > 1hr of data sampled at 10Hz.) Then I can choose the closest matches > between X and Y and fit them with a line, which gives me the > clock-difference parameter. The problem is that there are a ton of > local minima created by how I choose to match up the points in X and > Y, so it gets stuck on the wrong answer. > > Any tips? > > Thanks! > Chris > > PS: my current code and test data is here: > https://github.com/cxrodgers/DiscreteAnalyze > > -- > Chris Rodgers > Graduate Student > Helen Wills Neuroscience Institute > University of California - Berkeley > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From sturla at molden.no Thu Apr 12 11:35:40 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 12 Apr 2012 17:35:40 +0200 Subject: [SciPy-User] synchronizing timestamps from different systems; unpaired linear regression In-Reply-To: <4F86F3FF.1010807@molden.no> References: <4F86F3FF.1010807@molden.no> Message-ID: <4F86F64C.6040009@molden.no> On 12.04.2012 17:25, Sturla Molden wrote: > I solved the problem by measuring the relative clock rates very > accurately (by time-staming TTL signals from the parallel port a > computer running MS-DOS), And this was because a $5000 biomedial TTL generator was not accurate enough ... unlike an 10-15 year old 386 computer I found in the trash. Sturla From njs at pobox.com Thu Apr 12 12:59:03 2012 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 12 Apr 2012 17:59:03 +0100 Subject: [SciPy-User] synchronizing timestamps from different systems; unpaired linear regression In-Reply-To: <4F86F3FF.1010807@molden.no> References: <4F86F3FF.1010807@molden.no> Message-ID: On Thu, Apr 12, 2012 at 4:25 PM, Sturla Molden wrote: > I solved the problem by measuring the relative clock rates very > accurately (by time-staming TTL signals from the parallel port a > computer running MS-DOS), If anyone goes this way, it turns out that Linux these days can actually handle this sort of precision-timed TTL signaling about as well as MS-DOS, and it's a heck of a lot easier to set up. You can even do it from Python+Cython. - N From wesmckinn at gmail.com Thu Apr 12 13:55:35 2012 From: wesmckinn at gmail.com (Wes McKinney) Date: Thu, 12 Apr 2012 13:55:35 -0400 Subject: [SciPy-User] ANN: pandas 0.7.3 release, upcoming time series work Message-ID: hi all, I'm very pleased to announce the pandas 0.7.3 release. This is primarily a bug fix release from 0.7.2, but also includes a number of new features. The most noticeable ones are a number of new plotting types and options. What's new: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html pandas 0.8.0 will be the largest pandas release in some time. We have been working intensely to expand its time series capabilities and to incorporate code and features from scikits.timeseries, thus enabling users to migrate their code to pandas. From 0.8.0 onward, pandas will use NumPy's datetime64 dtype and thus require at least NumPy version 1.6.1. 
I am extremely excited about this step forward, as pandas will be one of the most featureful and highest performance tools for time series data in any language. If this is of interest to you, tune in to the timeseries branch on GitHub, which will soon be merged into master. Targeting a mid-May release for 0.8.0, so the more user feedback between now and then the better. - Wes What is it ========== pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with relational, time series, or any other kind of labeled data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Links ===== Release Notes: http://github.com/pydata/pandas/blob/master/RELEASE.rst Documentation: http://pandas.pydata.org Installers: http://pypi.python.org/pypi/pandas Code Repository: http://github.com/pydata/pandas Mailing List: http://groups.google.com/group/pydata Blogs: http://blog.wesmckinney.com and http://blog.lambdafoundry.com From surfcast23 at gmail.com Fri Apr 13 00:06:20 2012 From: surfcast23 at gmail.com (surfcast23) Date: Thu, 12 Apr 2012 21:06:20 -0700 (PDT) Subject: [SciPy-User] [SciPy-user] Looping over one variable keeping the others constant Message-ID: <33679664.post@talk.nabble.com> Hi All, I would like to know how one can loop over one variable incrementing it while keeping the second fixed. With the final out put being a table of values. For example hold y fixed and increment x then increment y and re-increment x so you would get something like this x,y 1,1 2,1 3,1 1,2 2,2 3,2 thanks in advance -- View this message in context: http://old.nabble.com/Looping-over-one-variable-keeping-the-others-constant-tp33679664p33679664.html Sent from the Scipy-User mailing list archive at Nabble.com. From zhibo.xiao at gmail.com Fri Apr 13 01:37:12 2012 From: zhibo.xiao at gmail.com (aurora1625) Date: Thu, 12 Apr 2012 22:37:12 -0700 (PDT) Subject: [SciPy-User] [SciPy-user] optimize fmin_cg parameter passing Message-ID: <33679809.post@talk.nabble.com> Hi, everyone I am new to Python and Scipy and I met a problem when trying to optimize a function. My problem is: my f function has several parameters(datatype:array), it's like f = \sum lambda_i * phi_i + \zeta *(exp(lambda_i + \nu_i)) this function is about lambda, so i get fprime = \sum phi_i + \zeta *(exp(lambda_i + \nu_i)) I want to pass in an initial lambda value, but how can I pass in other parameter, phi, zeta and nu. I guess my function is about lambda, and the other parameters should be constant and fixed, but I really don't know how to pass them in. Thank you for your help! -- View this message in context: http://old.nabble.com/optimize-fmin_cg-parameter-passing-tp33679809p33679809.html Sent from the Scipy-User mailing list archive at Nabble.com. From opossumnano at gmail.com Fri Apr 13 03:03:32 2012 From: opossumnano at gmail.com (Tiziano Zito) Date: Fri, 13 Apr 2012 09:03:32 +0200 (CEST) Subject: [SciPy-User] =?utf-8?q?=5BReminder=5D_Summer_School_=22Advanced_S?= =?utf-8?q?cientific_Programming_in_Python=22_in_Kiel=2C_Germany?= Message-ID: <20120413070332.985A7249AC6@mail.bccn-berlin> ?Advanced Scientific Programming in Python ========================================= a Summer School by the G-Node and the Institute of Experimental and Applied Physics, Christian-Albrechts-Universit?t zu Kiel Scientists spend more and more time writing, maintaining, and debugging software. 
While techniques for doing this efficiently have evolved, only few scientists actually use them. As a result, instead of doing their research, they spend far too much time writing deficient code and reinventing the wheel. In this course we will present a selection of advanced programming techniques, incorporating theoretical lectures and practical exercises tailored to the needs of a programming scientist. New skills will be tested in a real programming project: we will team up to develop an entertaining scientific computer game. We use the Python programming language for the entire course. Python works as a simple programming language for beginners, but more importantly, it also works great in scientific simulations and data analysis. We show how clean language design, ease of extensibility, and the great wealth of open source libraries for scientific computing and data visualization are driving Python to become a standard tool for the programming scientist. This school is targeted at Master or PhD students and Post-docs from all areas of science. Competence in Python or in another language such as Java, C/C++, MATLAB, or Mathematica is absolutely required. Basic knowledge of Python is assumed. Participants without any prior experience with Python should work through the proposed introductory materials before the course. Date and Location ================= September 2?7, 2012. Kiel, Germany. Preliminary Program =================== Day 0 (Sun Sept 2) ? Best Programming Practices - Best Practices, Development Methodologies and the Zen of Python - Version control with git - Object-oriented programming & design patterns Day 1 (Mon Sept 3) ? Software Carpentry - Test-driven development, unit testing & quality assurance - Debugging, profiling and benchmarking techniques - Best practices in data visualization - Programming in teams Day 2 (Tue Sept 4) ? Scientific Tools for Python - Advanced NumPy - The Quest for Speed (intro): Interfacing to C with Cython - Advanced Python I: idioms, useful built-in data structures, generators Day 3 (Wed Sept 5) ? The Quest for Speed - Writing parallel applications in Python - Programming project Day 4 (Thu Sept 6) ? Efficient Memory Management - When parallelization does not help: the starving CPUs problem - Advanced Python II: decorators and context managers - Programming project Day 5 (Fri Sept 7) ? Practical Software Development - Programming project - The Pelita Tournament Every evening we will have the tutors' consultation hour: Tutors will answer your questions and give suggestions for your own projects. Applications ============ You can apply on-line at http://python.g-node.org Applications must be submitted before 23:59 UTC, May 1, 2012. Notifications of acceptance will be sent by June 1, 2012. No fee is charged but participants should take care of travel, living, and accommodation expenses. Candidates will be selected on the basis of their profile. Places are limited: acceptance rate last time was around 20%. Prerequisites: You are supposed to know the basics of Python to participate in the lectures. You are encouraged to go through the introductory material available on the website. 
Faculty ======= - Francesc Alted, Continuum Analytics Inc., USA - Pietro Berkes, Enthought Inc., UK - Valentin Haenel, Blue Brain Project, École Polytechnique Fédérale de Lausanne, Switzerland - Zbigniew Jędrzejewski-Szmek, Faculty of Physics, University of Warsaw, Poland - Eilif Muller, Blue Brain Project, École Polytechnique Fédérale de Lausanne, Switzerland - Emanuele Olivetti, NeuroInformatics Laboratory, Fondazione Bruno Kessler and University of Trento, Italy - Rike-Benjamin Schuppner, Technologit GbR, Germany - Bartosz Teleńczuk, Unité de Neurosciences Information et Complexité, Centre National de la Recherche Scientifique, France - Stéfan van der Walt, Helen Wills Neuroscience Institute, University of California Berkeley, USA - Bastian Venthur, Berlin Institute of Technology and Bernstein Focus Neurotechnology, Germany - Niko Wilbert, TNG Technology Consulting GmbH, Germany - Tiziano Zito, Institute for Theoretical Biology, Humboldt-Universität zu Berlin, Germany Organized by Christian T. Steigies and Christian Drews of the Institute of Experimental and Applied Physics, Christian-Albrechts-Universität zu Kiel, and by Zbigniew Jędrzejewski-Szmek and Tiziano Zito for the German Neuroinformatics Node of the INCF. Website: http://python.g-node.org Contact: python-info at g-node.org From jsseabold at gmail.com Fri Apr 13 10:19:01 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Fri, 13 Apr 2012 10:19:01 -0400 Subject: [SciPy-User] [SciPy-user] optimize fmin_cg parameter passing In-Reply-To: <33679809.post@talk.nabble.com> References: <33679809.post@talk.nabble.com> Message-ID: On Fri, Apr 13, 2012 at 1:37 AM, aurora1625 wrote: > > Hi, everyone > > I am new to Python and Scipy and I met a problem when trying to optimize a > function. > > My problem is: > > my f function has several parameters(datatype:array), it's like > > f = \sum lambda_i * phi_i + \zeta *(exp(lambda_i + \nu_i)) > > this function is about lambda, so i get > > fprime = \sum phi_i + \zeta *(exp(lambda_i + \nu_i)) > > I want to pass in an initial lambda value, but how can I pass in other > parameter, phi, zeta and nu. > > I guess my function is about lambda, and the other parameters should be > constant and fixed, but I really don't know how to pass them in. > Use the args keyword for this. From the docstring of fmin_cg args : tuple, optional Extra arguments passed to f and fprime. Skipper
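A minimal, self-contained sketch of the args approach (the objective below and the example values of phi, zeta and nu are made up for illustration, not the original poster's actual model):

    import numpy as np
    from scipy.optimize import fmin_cg

    def f(lam, phi, zeta, nu):
        # lam is the variable being optimized; phi, zeta and nu stay fixed
        return np.sum(lam * phi + zeta * np.exp(lam + nu))

    def fprime(lam, phi, zeta, nu):
        # element-wise gradient with respect to lam
        return phi + zeta * np.exp(lam + nu)

    phi = np.array([-1.0, -2.0, -3.0])   # negative here so the objective has a minimum
    nu = np.array([0.1, 0.2, 0.3])
    zeta = 0.5
    lam0 = np.zeros(3)

    lam_opt = fmin_cg(f, lam0, fprime=fprime, args=(phi, zeta, nu))

The same args tuple is accepted by the other scipy.optimize minimizers as well.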
From cameron.hayne at dftmicrosystems.com Fri Apr 13 03:13:41 2012 From: cameron.hayne at dftmicrosystems.com (Cameron Hayne) Date: Fri, 13 Apr 2012 03:13:41 -0400 Subject: [SciPy-User] [SciPy-user] optimize fmin_cg parameter passing In-Reply-To: <33679809.post@talk.nabble.com> References: <33679809.post@talk.nabble.com> Message-ID: <4D34A6F9-F204-4670-B143-C502FEF38BEE@hayne.net> On 13-Apr-12, at 1:37 AM, aurora1625 wrote: > I am new to Python and Scipy and I met a problem when trying to > optimize a > function. > My problem is: > my f function has several parameters(datatype:array), it's like > f = \sum lambda_i * phi_i + \zeta *(exp(lambda_i + \nu_i)) > this function is about lambda, so i get > fprime = \sum phi_i + \zeta *(exp(lambda_i + \nu_i)) > I want to pass in an initial lambda value, but how can I pass in other > parameter, phi, zeta and nu. > I guess my function is about lambda, and the other parameters should > be > constant and fixed, but I really don't know how to pass them in. I'm not sure I have correctly understood what you are asking but it seems that what you want is a locally defined function. In Python, you can define a function inside another function and the locally-defined function preserves all the values of variables used from the outer function. For example:

    def funcA(x, y):
        def funcB(z):
            return x**2 + y**2 + z**2
        print funcB(42)

In your case, you can make the inner function use the values of phi, zeta and nu, and then pass the inner function to the optimization routine. -- Cameron Hayne macdev at hayne.net From denis.laxalde at mcgill.ca Fri Apr 13 09:31:50 2012 From: denis.laxalde at mcgill.ca (Denis Laxalde) Date: Fri, 13 Apr 2012 09:31:50 -0400 Subject: [SciPy-User] [SciPy-user] optimize fmin_cg parameter passing In-Reply-To: <33679809.post@talk.nabble.com> References: <33679809.post@talk.nabble.com> Message-ID: <20120413093150.6515e326@mcgill.ca> aurora1625 wrote: > My problem is: > > my f function has several parameters(datatype:array), it's like > > f = \sum lambda_i * phi_i + \zeta *(exp(lambda_i + \nu_i)) > > this function is about lambda, so i get > > fprime = \sum phi_i + \zeta *(exp(lambda_i + \nu_i)) > > I want to pass in an initial lambda value, but how can I pass in other > parameter, phi, zeta and nu. > > I guess my function is about lambda, and the other parameters should be > constant and fixed, but I really don't know how to pass them in. Extra parameters usually go in the `args::tuple` parameter for minimizers in scipy, e.g.: fmin(f, x0, args=(phi, zeta, nu), ...) -- Denis From vanforeest at gmail.com Fri Apr 13 15:15:13 2012 From: vanforeest at gmail.com (nicky van foreest) Date: Fri, 13 Apr 2012 21:15:13 +0200 Subject: [SciPy-User] [SciPy-user] Looping over one variable keeping the others constant In-Reply-To: <33679664.post@talk.nabble.com> References: <33679664.post@talk.nabble.com> Message-ID: Hi Perhaps you can find something useful in the python itertools module. itertools.product seems to do the job. Nicky On Apr 13, 2012 6:06 AM, "surfcast23" wrote: > > Hi All, > > I would like to know how one can loop over one variable incrementing it > while keeping the second fixed. With the final out put being a table of > values. For example hold y fixed and increment x then increment y and > re-increment x so you would get something like this > > x,y > 1,1 > 2,1 > 3,1 > 1,2 > 2,2 > 3,2 > > thanks in advance > -- > View this message in context: > http://old.nabble.com/Looping-over-one-variable-keeping-the-others-constant-tp33679664p33679664.html > Sent from the Scipy-User mailing list archive at Nabble.com. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL:
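A minimal sketch of the itertools.product suggestion above (the lists just reproduce the example table from the question):

    import itertools

    xs = [1, 2, 3]
    ys = [1, 2]

    # product varies its last argument fastest, so putting ys first gives
    # the "sweep x while y stays fixed" ordering from the original post
    for y, x in itertools.product(ys, xs):
        print x, y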
From zachary.pincus at yale.edu Fri Apr 13 15:21:46 2012 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Fri, 13 Apr 2012 15:21:46 -0400 Subject: [SciPy-User] [SciPy-user] Looping over one variable keeping the others constant In-Reply-To: <33679664.post@talk.nabble.com> References: <33679664.post@talk.nabble.com> Message-ID: > x,y > 1,1 > 2,1 > 3,1 > 1,2 > 2,2 > 3,2 > If you just want integer increments like a counter (as in the above), you could use numpy.ndindex, e.g.: In [20]: numpy.array(list(numpy.ndindex(2,3)))+1 Out[20]: array([[1, 1], [1, 2], [1, 3], [2, 1], [2, 2], [2, 3]]) From ecarlson at eng.ua.edu Fri Apr 13 15:47:47 2012 From: ecarlson at eng.ua.edu (Eric Carlson) Date: Fri, 13 Apr 2012 14:47:47 -0500 Subject: [SciPy-User] sparse matrix construction question Message-ID: Hello, I have a million row sparse matrix that can be created almost instantaneously using spdiags and csr format. I need to add one column and one row, and insert some nonzeros at various (symmetric) locations in the row and columns. I can easily accomplish this using lil format, but this increases the construction to around 20 seconds from about .05 seconds. I can probably work something out in fortran + f2py pretty easily, but ... Any python sparse matrix wizards out there with some tips? Cheers, Eric Carlson From cmutel at gmail.com Fri Apr 13 16:16:57 2012 From: cmutel at gmail.com (Christopher Mutel) Date: Fri, 13 Apr 2012 22:16:57 +0200 Subject: [SciPy-User] sparse matrix construction question In-Reply-To: References: Message-ID: One easy way is to construct a new CSR matrix, extending on the numpy arrays "data", "indptr", and "indices" from the original matrix. Because I can never quite figure out the CSR format, I usually convert to a COO matrix, where it is easy to extend the "data", "row", and "col" numpy arrays. This can be slower, but is definitely much faster than the LIL matrix conversion. See e.g. http://stackoverflow.com/questions/6844998/is-there-an-efficient-way-of-concatenating-scipy-sparse-matrices -Chris On Fri, Apr 13, 2012 at 9:47 PM, Eric Carlson wrote: > Hello, > I have a million row sparse matrix that can be created almost > instantaneously using spdiags and csr format. I need to add one column > and one row, and insert some nonzeros at various (symmetric) locations > in the row and columns. I can easily accomplish this using lil format, > but this increases the construction to around 20 seconds from about .05 > seconds. > > I can probably work something out in fortran + f2py pretty easily, but ... > > Any python sparse matrix wizards out there with some tips? > > Cheers, > Eric Carlson > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user
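A rough sketch of the COO-based approach described above: growing an existing CSR matrix by one row and one column plus a few symmetric couplings (the matrix, positions and values below are made up for illustration):

    import numpy as np
    import scipy.sparse as sp

    n = 1000000
    # original matrix, built cheaply with spdiags and stored as CSR
    A = sp.spdiags([np.ones(n), -2.0 * np.ones(n), np.ones(n)],
                   [-1, 0, 1], n, n).tocsr()

    # positions and values of the new symmetric entries
    idx = np.array([0, 10, 500000])
    vals = np.array([1.5, 2.5, 3.5])

    C = A.tocoo()
    rows = np.concatenate([C.row, idx, np.repeat(n, len(idx)), [n]])
    cols = np.concatenate([C.col, np.repeat(n, len(idx)), idx, [n]])
    data = np.concatenate([C.data, vals, vals, [4.0]])

    # one (n+1) x (n+1) matrix with the extra row, column and diagonal entry
    B = sp.coo_matrix((data, (rows, cols)), shape=(n + 1, n + 1)).tocsr()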
From Andrew.G.York+scipy at gmail.com Fri Apr 13 16:51:50 2012 From: Andrew.G.York+scipy at gmail.com (Andrew York) Date: Fri, 13 Apr 2012 16:51:50 -0400 Subject: [SciPy-User] [SciPy-user] Looping over one variable keeping the others constant In-Reply-To: References: <33679664.post@talk.nabble.com> Message-ID: I might be missing the point. Does the OP want something as simple as:

    for i in range(3):
        for j in range(3):
            print i, j

On Fri, Apr 13, 2012 at 3:21 PM, Zachary Pincus wrote: > > x,y > > 1,1 > > 2,1 > > 3,1 > > 1,2 > > 2,2 > > 3,2 > > > > If you just want integer increments like a counter (as in the above), you > could use numpy.ndindex, e.g.: > > In [20]: numpy.array(list(numpy.ndindex(2,3)))+1 > Out[20]: > array([[1, 1], > [1, 2], > [1, 3], > [2, 1], > [2, 2], > [2, 3]]) > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.pinto at gmail.com Fri Apr 13 23:20:08 2012 From: nicolas.pinto at gmail.com (Nicolas Pinto) Date: Fri, 13 Apr 2012 23:20:08 -0400 Subject: [SciPy-User] ssymm missing ? Message-ID: Hello, I'm trying to use "ssymm" from blas but it's missing on my system: % python -c "from scipy.linalg.blas import cblas; assert hasattr(cblas, 'ssymm')" Traceback (most recent call last): File "", line 1, in AssertionError Other functions import fine, e.g.: % python -c "from scipy.linalg.blas import cblas; assert hasattr(cblas, 'sgemm')" % python -c "import scipy as sp; print sp.__version__" 0.10.0 Is it a problem on my side ? Thanks for your help. Regards, Nicolas From e.antero.tammi at gmail.com Sat Apr 14 01:09:45 2012 From: e.antero.tammi at gmail.com (eat) Date: Sat, 14 Apr 2012 08:09:45 +0300 Subject: [SciPy-User] [SciPy-user] Looping over one variable keeping the others constant In-Reply-To: <33679664.post@talk.nabble.com> References: <33679664.post@talk.nabble.com> Message-ID: Hi, On Fri, Apr 13, 2012 at 7:06 AM, surfcast23 wrote: > > Hi All, > > I would like to know how one can loop over one variable incrementing it > while keeping the second fixed. With the final out put being a table of > values. For example hold y fixed and increment x then increment y and > re-increment x so you would get something like this > > x,y > 1,1 > 2,1 > 3,1 > 1,2 > 2,2 > 3,2 > Perhaps something along these lines will help you: In []: c_[tile(arange(3), 2), arange(2).repeat(3)]+ 1 Out[]: array([[1, 1], [2, 1], [3, 1], [1, 2], [2, 2], [3, 2]]) My 2 cents, -eat > > thanks in advance > -- > View this message in context: > http://old.nabble.com/Looping-over-one-variable-keeping-the-others-constant-tp33679664p33679664.html > Sent from the Scipy-User mailing list archive at Nabble.com. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sat Apr 14 04:56:54 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 14 Apr 2012 10:56:54 +0200 Subject: [SciPy-User] Contributing to SciPy guide Message-ID: Hi all, It has been pointed out by a number of people that it's not so easy to get started with contributing to SciPy, and better documentation may help here. So I've written a guide for this. It would be great to get some feedback especially from people who've found it difficult to find this information before. And if you haven't contributed before but were thinking about doing so, perhaps this is a good opportunity to get started!
Pull Request: https://github.com/scipy/scipy/pull/191 Rendered guide: https://github.com/rgommers/scipy/blob/howto-contribute/doc/HOWTO_CONTRIBUTE.rst.txt Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ecarlson at eng.ua.edu Sat Apr 14 08:52:25 2012 From: ecarlson at eng.ua.edu (Eric Carlson) Date: Sat, 14 Apr 2012 07:52:25 -0500 Subject: [SciPy-User] sparse matrix construction question In-Reply-To: References: Message-ID: perfect - thanks for the assist On 4/13/2012 3:16 PM, Christopher Mutel wrote: > One easy way is to construct a new CSR matrix, extending on the numpy > arrays "data", "indptr", and "indices" from the original matrix. > Because I can never quite figure out the CSR format, I usually > convert to a COO matrix, where it is easy to extend the "data", "row", > and "col" numpy arrays. This can be slower, but is definitely much > faster than the LIL matrix conversion. > > See e.g. http://stackoverflow.com/questions/6844998/is-there-an-efficient-way-of-concatenating-scipy-sparse-matrices > > -Chris > > On Fri, Apr 13, 2012 at 9:47 PM, Eric Carlson wrote: >> Hello, >> I have a million row sparse matrix that can be created almost >> instantaneously using spdiags and csr format. I need to add one column >> and one row, and insert some nonzeros at various (symmetric) locations >> in the row and columns. I can easily accomplish this using lil format, >> but this increases the construction to around 20 seconds from about .05 >> seconds. >> >> I can probably work something out in fortran + f2py pretty easily, but ... >> >> Any python sparse matrix wizards out there with some tips? >> >> Cheers, >> Eric Carlson >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user From surfcast23 at gmail.com Sat Apr 14 18:25:40 2012 From: surfcast23 at gmail.com (surfcast23) Date: Sat, 14 Apr 2012 15:25:40 -0700 (PDT) Subject: [SciPy-User] [SciPy-user] Looping over one variable keeping the others constant In-Reply-To: <33679664.post@talk.nabble.com> References: <33679664.post@talk.nabble.com> Message-ID: <33688622.post@talk.nabble.com> Thanks for everyone's replies. I'm sorry I am only just replying; I was out of town for a few days. I will look into everyone's suggestions. The main thing is that I need a function that can use equations such as x[i]= const+x x+=const y[i]= const+y y+=const So that x and y increment by whatever the constant is and that new value is what is used to get the next array element. surfcast23 wrote: > > Hi All, > > I would like to know how one can loop over one variable incrementing it > while keeping the second fixed. With the final out put being a table of > values. For example hold y fixed and increment x then increment y and > re-increment x so you would get something like this > > x,y > 1,1 > 2,1 > 3,1 > 1,2 > 2,2 > 3,2 > > thanks in advance > -- View this message in context: http://old.nabble.com/Looping-over-one-variable-keeping-the-others-constant-tp33679664p33688622.html Sent from the Scipy-User mailing list archive at Nabble.com.
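A minimal numpy sketch of the constant-increment table described in the follow-up above (the start values, step sizes and lengths are made up for illustration):

    import numpy as np

    x0, dx, nx = 1.0, 0.5, 3   # start, step and number of x values
    y0, dy, ny = 1.0, 2.0, 2   # start, step and number of y values

    x = x0 + dx * np.arange(nx)   # 1.0, 1.5, 2.0
    y = y0 + dy * np.arange(ny)   # 1.0, 3.0

    # sweep x for each fixed y, as in the original example table
    table = np.column_stack((np.tile(x, ny), np.repeat(y, nx)))
    print table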
From zhibo.xiao at gmail.com Sat Apr 14 20:33:13 2012 From: zhibo.xiao at gmail.com (=?UTF-8?B?6IKW5pm65Y2aIChaaGlibyBYaWFvKSAvIFNlYW4g?=) Date: Sun, 15 Apr 2012 08:33:13 +0800 Subject: [SciPy-User] [SciPy-user] optimize fmin_cg parameter passing In-Reply-To: References: <33679809.post@talk.nabble.com> Message-ID: Hi, Thank you for your replies, it seems everyone's answer works and I will look into them all. On Fri, Apr 13, 2012 at 22:19, Skipper Seabold wrote: > On Fri, Apr 13, 2012 at 1:37 AM, aurora1625 wrote: > > > > Hi, everyone > > > > I am new to Python and Scipy and I met a problem when trying to optimize > a > > function. > > > > My problem is: > > > > my f function has several parameters(datatype:array), it's like > > > > f = \sum lambda_i * phi_i + \zeta *(exp(lambda_i + \nu_i)) > > > > this function is about lambda, so i get > > > > fprime = \sum phi_i + \zeta *(exp(lambda_i + \nu_i)) > > > > I want to pass in an initial lambda value, but how can I pass in other > > parameter, phi, zeta and nu. > > > > I guess my function is about lambda, and the other parameters should be > > constant and fixed, but I really don't know how to pass them in. > > > > Use the args keyword for this. From the docstring of fmin_cg > > args : tuple, optional > Extra arguments passed to f and fprime. > > Skipper > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sun Apr 15 05:08:24 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 15 Apr 2012 11:08:24 +0200 Subject: [SciPy-User] ssymm missing ? In-Reply-To: References: Message-ID: Hi, 14.04.2012 05:20, Nicolas Pinto kirjoitti: > I'm trying to use "ssymm" from blas but it's missing on my system: > > % python -c "from scipy.linalg.blas import cblas; assert > hasattr(cblas, 'ssymm')" > Traceback (most recent call last): > File "", line 1, in > AssertionError [clip] > Is it a problem on my side ? I think the wrappers for that BLAS function (and the other L3 functions apart from GEMM) are missing. They could be added to scipy/linalg/fblas_l3.pyf.src -- Pauli Virtanen From alexandre.gramfort at inria.fr Sun Apr 15 08:28:32 2012 From: alexandre.gramfort at inria.fr (Alexandre Gramfort) Date: Sun, 15 Apr 2012 14:28:32 +0200 Subject: [SciPy-User] [SciPy-Dev] Contributing to SciPy guide In-Reply-To: References: Message-ID: Dear Ralf, thanks for this helpful document. What I think is missing is info and link to pages that explain how to get started with a dev version of scipy. Many users have a released version and have never compiled it although they could be contributors. FAQ could be, how to work with a dev version of scipy while keeping the last release to switch between both? You can find such info on the web but it might be worth centralizing them. If it already exists, please let me know and forget this message. my 2c, Alex On Sat, Apr 14, 2012 at 10:56 AM, Ralf Gommers wrote: > Hi all, > > It has been pointed out by a number of people that it's not so easy to get > started with contributing to SciPy, and better documentation may help here. > So I've written a guide for this. It would be great to get some feedback > especially from people who've found it difficult to find this information > before. And if you haven't contributed before but were thinking about doing > so, perhaps this is a good opportunity to get started! 
> > Pull Request: https://github.com/scipy/scipy/pull/191 > Rendered guide: > https://github.com/rgommers/scipy/blob/howto-contribute/doc/HOWTO_CONTRIBUTE.rst.txt > > Cheers, > Ralf > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From surfcast23 at gmail.com Sun Apr 15 18:44:52 2012 From: surfcast23 at gmail.com (surfcast23) Date: Sun, 15 Apr 2012 15:44:52 -0700 (PDT) Subject: [SciPy-User] [SciPy-user] Looping over one variable keeping the others constant In-Reply-To: <33688622.post@talk.nabble.com> References: <33679664.post@talk.nabble.com> <33688622.post@talk.nabble.com> Message-ID: <33688997.post@talk.nabble.com> I looked through itertools in the documentation but did not find anything. I will see if I can adapt the other suggestions. surfcast23 wrote: > > Thanks for every ones replies. I'm sorry I am just replying I was out of > town for a few days. I will look into everyones suggestions. The main > thing is that I need a function that can use equations such as > > x[i]= const+x > x+=const > y[i]= const+y > y+=const > > So that x and y increment by whatever the constant is and that new value > is what is used to get the next array element. > > > > surfcast23 wrote: >> >> Hi All, >> >> I would like to know how one can loop over one variable incrementing it >> while keeping the second fixed. With the final out put being a table of >> values. For example hold y fixed and increment x then increment y and >> re-increment x so you would get something like this >> >> x,y >> 1,1 >> 2,1 >> 3,1 >> 1,2 >> 2,2 >> 3,2 >> >> thanks in advance >> > > -- View this message in context: http://old.nabble.com/Looping-over-one-variable-keeping-the-others-constant-tp33679664p33688997.html Sent from the Scipy-User mailing list archive at Nabble.com. From xrodgers at gmail.com Fri Apr 13 19:22:16 2012 From: xrodgers at gmail.com (Chris Rodgers) Date: Fri, 13 Apr 2012 16:22:16 -0700 Subject: [SciPy-User] synchronizing timestamps from different systems; unpaired linear regression In-Reply-To: References: <4F86F3FF.1010807@molden.no> Message-ID: On Thu, Apr 12, 2012 at 12:00 AM, srean wrote: > There have many good suggestions, I will add another. One option would > be to model this via dynamic programming. I think searching for > "sequence alignment" and "time warping" will give you a lot of helpful > hits. That is a great tip, thanks. I knew someone must have worked on this problem before. I will see how they solved the problem instead of reinventing the wheel. I see DTW is included in mlpy. Also, estimating clock skew from autocorrelations sounds like it would definitely work. Samuel: Actually these are not spikes but behavioral events (nosepokes, stimulus onsets) > And this was because a $5000 biomedial TTL generator was not accurate > enough ... unlike an 10-15 year old 386 computer I found in the trash. My case is not the same because these clocks were never intended to be used in this way. But I sympathize with your frustration. One can sell the exact same piece of equipment for 10x more if it is labeled as "biomedical". If there is one thing I have learned in my PhD, it is that you make a lot more money selling research hardware than using it. Thanks all! 
On Thu, Apr 12, 2012 at 9:59 AM, Nathaniel Smith wrote: > On Thu, Apr 12, 2012 at 4:25 PM, Sturla Molden wrote: >> I solved the problem by measuring the relative clock rates very >> accurately (by time-staming TTL signals from the parallel port a >> computer running MS-DOS), > > If anyone goes this way, it turns out that Linux these days can > actually handle this sort of precision-timed TTL signaling about as > well as MS-DOS, and it's a heck of a lot easier to set up. You can > even do it from Python+Cython. > > - N > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From sergio_r at mail.com Sun Apr 15 16:12:09 2012 From: sergio_r at mail.com (sergio) Date: Sun, 15 Apr 2012 13:12:09 -0700 (PDT) Subject: [SciPy-User] How to run individual failed scipy.test() tests? Message-ID: Hello all, I just manually installed python 2.7 with numpy 1.6.1, scipy0.10.1 and nose version 1.1.2 under ubuntu 10.04. I would like to know how to run failed tests of numpy.test('full') and scipy.test('full') individually, and to read a description of what exactly the test does. Thanks in advance for your help, Sergio PS. In what follows I am including the summary from each test command: >>> numpy.test('full') Running unit tests for numpy NumPy version 1.6.1 NumPy is installed in /home/myProg/Python272/Linux32b/lib/python2.7/ site- packages/numpy Python version 2.7.2 (default, Apr 14 2012, 20:45:06) [GCC 4.4.3] nose version 1.1.2 ====================================================================== FAIL: test_kind.TestKind.test_all ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/nose/ case.py", line 197, in runTest self.test(*self.arg) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/numpy /f2py/tests/test_kind.py", line 30, in test_all 'selectedrealkind(%s): expected %r but got %r' % (i, selected_real_kind(i), selectedrealkind(i))) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/numpy /testing/utils.py", line 34, in assert_ raise AssertionError(msg) AssertionError: selectedrealkind(16): expected 10 but got 16 ---------------------------------------------------------------------- Ran 3560 tests in 259.496s FAILED (KNOWNFAIL=3, SKIP=4, failures=1) >>> scipy.test('full') ...................... 
====================================================================== ERROR: test_iv_cephes_vs_amos_mass_test (test_basic.TestBessel) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /special/tests/test_basic.py", line 1642, in test_iv_cephes_vs_amos_mass_test c1 = special.iv(v, x) RuntimeWarning: divide by zero encountered in iv ====================================================================== ERROR: test_fdtri (test_basic.TestCephes) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /special/tests/test_basic.py", line 153, in test_fdtri cephes.fdtri(1,1,0.5) RuntimeWarning: invalid value encountered in fdtri ====================================================================== ERROR: test_continuous_extra.test_cont_extra(, (0.4141193182605212,), 'loggamma loc, scale test') ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/nose/ case.py", line 197, in runTest self.test(*self.arg) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /stats/tests/test_continuous_extra.py", line 78, in check_loc_scale m,v = distfn.stats(*arg) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /stats/distributions.py", line 1631, in stats mu = self._munp(1.0,*goodargs) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /stats/distributions.py", line 4119, in _munp return self._mom0_sc(n,*args) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /stats/distributions.py", line 1165, in _mom0_sc self.b, args=(m,)+args)[0] File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /integrate/quadpack.py", line 247, in quad retval = _quad(func,a,b,args,full_output,epsabs,epsrel,limit,points) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /integrate/quadpack.py", line 314, in _quad return _quadpack._qagie(func,bound,infbounds,args,full_output,epsabs,epsrel, limit) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /stats/distributions.py", line 1162, in _mom_integ0 return x**m * self.pdf(x,*args) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /stats/distributions.py", line 1262, in pdf place(output,cond,self._pdf(*goodargs) / scale) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /stats/distributions.py", line 4112, in _pdf return exp(c*x-exp(x)-gamln(c)) RuntimeWarning: overflow encountered in exp ====================================================================== ERROR: test_continuous_extra.test_cont_extra(, (1.8771398388773268,), 'lomax loc, scale test') ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/nose/ case.py", line 197, in runTest self.test(*self.arg) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /stats/tests/test_continuous_extra.py", line 78, in check_loc_scale m,v = distfn.stats(*arg) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy 
/stats/distributions.py", line 1617, in stats mu, mu2, g1, g2 = self._stats(*args) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /stats/distributions.py", line 4643, in _stats mu, mu2, g1, g2 = pareto.stats(c, loc=-1.0, moments='mvsk') File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /stats/distributions.py", line 1615, in stats mu, mu2, g1, g2 = self._stats(*args,**{'moments':moments}) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /stats/distributions.py", line 4594, in _stats vals = 2*(bt+1.0)*sqrt(b-2.0)/((b-3.0)*sqrt(b)) RuntimeWarning: invalid value encountered in sqrt ====================================================================== ERROR: test_discrete_basic.test_discrete_extra(, (30, 12, 6), 'hypergeom entropy nan test') ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/nose/ case.py", line 197, in runTest self.test(*self.arg) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /stats/tests/test_discrete_basic.py", line 199, in check_entropy ent = distfn.entropy(*arg) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /stats/distributions.py", line 6314, in entropy place(output,cond0,self.vecentropy(*goodargs)) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/numpy /lib/function_base.py", line 1862, in __call__ theout = self.thefunc(*newargs) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /stats/distributions.py", line 6668, in _entropy lvals = where(vals==0.0,0.0,log(vals)) RuntimeWarning: divide by zero encountered in log ====================================================================== ERROR: test_discrete_basic.test_discrete_extra(, (21, 3, 12), 'hypergeom entropy nan test') ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/nose/ case.py", line 197, in runTest self.test(*self.arg) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /stats/tests/test_discrete_basic.py", line 199, in check_entropy ent = distfn.entropy(*arg) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /stats/distributions.py", line 6314, in entropy place(output,cond0,self.vecentropy(*goodargs)) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/numpy /lib/function_base.py", line 1862, in __call__ theout = self.thefunc(*newargs) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /stats/distributions.py", line 6668, in _entropy lvals = where(vals==0.0,0.0,log(vals)) RuntimeWarning: divide by zero encountered in log ====================================================================== ERROR: test_fit (test_distributions.TestFitMethod) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /stats/tests/test_distributions.py", line 439, in test_fit vals2 = distfunc.fit(res, optimizer='powell') File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /stats/distributions.py", line 1874, in fit vals = optimizer(func,x0,args=(ravel(data),),disp=0) File 
"/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /optimize/optimize.py", line 1621, in fmin_powell fval, x, direc1 = _linesearch_powell(func, x, direc1, tol=xtol*100) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /optimize/optimize.py", line 1491, in _linesearch_powell alpha_min, fret, iter, num = brent(myfunc, full_output=1, tol=tol) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /optimize/optimize.py", line 1312, in brent brent.optimize() File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /optimize/optimize.py", line 1213, in optimize tmp2 = (x-v)*(fx-fw) RuntimeWarning: invalid value encountered in double_scalars ====================================================================== ERROR: test_fix_fit (test_distributions.TestFitMethod) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /stats/tests/test_distributions.py", line 460, in test_fix_fit vals2 = distfunc.fit(res,fscale=1) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /stats/distributions.py", line 1874, in fit vals = optimizer(func,x0,args=(ravel(data),),disp=0) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /optimize/optimize.py", line 301, in fmin and max(abs(fsim[0]-fsim[1:])) <= ftol): RuntimeWarning: invalid value encountered in subtract ====================================================================== FAIL: test_mio.test_mat4_3d ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/nose/ case.py", line 197, in runTest self.test(*self.arg) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /io/matlab/tests/test_mio.py", line 740, in test_mat4_3d stream, {'a': arr}, True, '4') File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/numpy /testing/utils.py", line 1008, in assert_raises return nose.tools.assert_raises(*args,**kwargs) AssertionError: DeprecationWarning not raised ====================================================================== FAIL: test_datatypes.test_uint64_max ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/nose/ case.py", line 197, in runTest self.test(*self.arg) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /ndimage/tests/test_datatypes.py", line 57, in test_uint64_max assert_true(x[1] > (2**63)) AssertionError: False is not true ====================================================================== FAIL: Regression test for #651: better handling of badly conditioned ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/scipy /signal/tests/test_filter_design.py", line 34, in test_bad_filter assert_raises(BadCoefficients, tf2zpk, [1e-15], [1.0, 1.0]) File "/home/srojas/myProg/Python272/Linux32b/lib/python2.7/site- packages/numpy /testing/utils.py", line 1008, in assert_raises return nose.tools.assert_raises(*args,**kwargs) AssertionError: BadCoefficients not raised 
---------------------------------------------------------------------- Ran 5836 tests in 2718.233s FAILED (KNOWNFAIL=14, SKIP=42, errors=8, failures=3) From eric at depagne.org Mon Apr 16 10:55:03 2012 From: eric at depagne.org (Éric Depagne) Date: Mon, 16 Apr 2012 16:55:03 +0200 Subject: [SciPy-User] inverting pdist. Message-ID: <201204161655.03204.eric@depagne.org> Hi. I'm using scipy.spatial.distance.pdist. I was wondering if it was possible to invert the result to get the elements of the input matrix that produced a given result. Let me explain a little. Say we have a, b, c and d. The possible distances are ab, ac, ad, bc, bd and cd. pdist will give the values for those 6 distances. Say the largest is at index 5. How can I get from the pdist result array that it is the bd distance? And if it is not possible, is there another way of doing so? My data are such that I have close to 100k distances to compute. Thanks. Éric. Un clavier azerty en vaut deux ---------------------------------------------------------- Éric Depagne eric at depagne.org From josef.pktd at gmail.com Mon Apr 16 11:10:22 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 16 Apr 2012 11:10:22 -0400 Subject: [SciPy-User] How to run individual failed scipy.test() tests? In-Reply-To: References: Message-ID: On Sun, Apr 15, 2012 at 4:12 PM, sergio wrote: > Hello all, > > I just manually installed python 2.7 with numpy 1.6.1, scipy0.10.1 > and nose version 1.1.2 under ubuntu 10.04. > > I would like to know how to run failed tests of numpy.test('full') and > scipy.test('full') individually, > and to read a description of what exactly the test does. I usually run individual test files http://readthedocs.org/docs/nose/en/latest/usage.html#selecting-tests or use the --pdb option to get into a test error. Josef > > Thanks in advance for your help, > > Sergio
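For the pdist question above, a minimal sketch of mapping a condensed-distance index back to its pair of points (the toy data is made up; numpy.triu_indices walks the upper triangle in the same order pdist does):

    import numpy as np
    from scipy.spatial.distance import pdist

    pts = np.random.rand(4, 2)   # four points a, b, c, d in 2-D
    d = pdist(pts)               # 6 condensed distances: ab, ac, ad, bc, bd, cd

    rows, cols = np.triu_indices(len(pts), k=1)
    k = d.argmax()               # index of the largest distance
    print "largest distance is between points", rows[k], "and", cols[k]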
From jsalvati at u.washington.edu Tue Apr 17 13:35:29 2012 From: jsalvati at u.washington.edu (John Salvatier) Date: Tue, 17 Apr 2012 10:35:29 -0700 Subject: [SciPy-User] fmin_bfgs failing on simple problem Message-ID: Hi all! I am having a problem with the fmin_bfgs solver that's surprising to me. Here's the toy problem I've set up:

    from scipy.optimize import fmin_bfgs, fmin_ncg
    from numpy import *
    import numpy as np

    def f(x):
        if x < 0:
            return 1.79769313e+308
        else:
            return x + 1./x

    xs = fmin_bfgs(f, array([10.]), retall=True)

The solver returns [nan] as the solution. The problem is designed to be stiff: between 0 and 1, it slopes upward to infinity but between 1 and infinity, it slopes up at a slope of 1. Left of 0 the function has a "nearly infinite" value. If bfgs encounters a value that's larger than the current value, it should try a different step size, no? Why does fmin_bfgs fail in this way? Cheers, John -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Apr 17 14:13:13 2012 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 17 Apr 2012 19:13:13 +0100 Subject: [SciPy-User] fmin_bfgs failing on simple problem In-Reply-To: References: Message-ID: On Tue, Apr 17, 2012 at 6:35 PM, John Salvatier wrote: > Hi all! > > I am having a problem with the fmin_bfgs solver that's surprising to me. > Here's the toy problem I've set up: > > from scipy.optimize import fmin_bfgs, fmin_ncg > from numpy import * > import numpy as np > > def f(x ): > if x < 0: > return 1.79769313e+308 > else : > return x + 1./x > > > xs = fmin_bfgs(f, array( [10.]), retall = True) > > The solver returns [nan] as the solution. > > The problem is designed to be stiff: between 0 and 1, it slopes upward to > infinity but between 1 and infinity, it slopes up at a slope of 1. Left of 0 > the function has a "nearly infinite" value. If bfgs encounters a value > that's larger than the current value, it should try a different step size, > no? Why does fmin_bfgs fail in this way? I can't reproduce this (on my computer it converges to 0.99999992), but have you tried making that < into a <=? The divide-by-zero at f(0) might be making it freak out.
-- Nathaniel From jsalvati at u.washington.edu Tue Apr 17 14:13:24 2012 From: jsalvati at u.washington.edu (John Salvatier) Date: Tue, 17 Apr 2012 11:13:24 -0700 Subject: [SciPy-User] fmin_bfgs failing on simple problem In-Reply-To: References: Message-ID: I think the problem comes in the line search function: scalar_search_wolfe1 The step size (stp) is alternately too large (putting the suggestion point in the bad region) or much too small (too small of a change). This seems like it's not well suited to stiff problems. Is there a different solver that deals well with stiff problems? John On Tue, Apr 17, 2012 at 10:35 AM, John Salvatier wrote: > Hi all! > > I am having a problem with the fmin_bfgs solver that's surprising to me. > Here's the toy problem I've set up: > > from scipy.optimize import fmin_bfgs, fmin_ncg > from numpy import * > import numpy as np > > def f(x ): > if x < 0: > return 1.79769313e+308 > else : > return x + 1./x > > > xs = fmin_bfgs(f, array( [10.]), retall = True) > > The solver returns [nan] as the solution. > > The problem is designed to be stiff: between 0 and 1, it slopes upward to > infinity but between 1 and infinity, it slopes up at a slope of 1. Left of > 0 the function has a "nearly infinite" value. If bfgs encounters a value > that's larger than the current value, it should try a different step size, > no? Why does fmin_bfgs fail in this way? > > Cheers, > John > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsalvati at u.washington.edu Tue Apr 17 14:14:31 2012 From: jsalvati at u.washington.edu (John Salvatier) Date: Tue, 17 Apr 2012 11:14:31 -0700 Subject: [SciPy-User] fmin_bfgs failing on simple problem In-Reply-To: References: Message-ID: Well that's good news. I have scipy .9.0b1, what version do you have? On Tue, Apr 17, 2012 at 11:13 AM, Nathaniel Smith wrote: > On Tue, Apr 17, 2012 at 6:35 PM, John Salvatier > wrote: > > Hi all! > > > > I am having a problem with the fmin_bfgs solver that's surprising to me. > > Here's the toy problem I've set up: > > > > from scipy.optimize import fmin_bfgs, fmin_ncg > > from numpy import * > > import numpy as np > > > > def f(x ): > > if x < 0: > > return 1.79769313e+308 > > else : > > return x + 1./x > > > > > > xs = fmin_bfgs(f, array( [10.]), retall = True) > > > > The solver returns [nan] as the solution. > > > > The problem is designed to be stiff: between 0 and 1, it slopes upward to > > infinity but between 1 and infinity, it slopes up at a slope of 1. Left > of 0 > > the function has a "nearly infinite" value. If bfgs encounters a value > > that's larger than the current value, it should try a different step > size, > > no? Why does fmin_bfgs fail in this way? > > I can't reproduce this (on my computer it converges to 0.99999992), > but have you tried making that < into a <=? The divide-by-zero at f(0) > might be making it freak out. > > -- Nathaniel > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Apr 17 14:16:08 2012 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 17 Apr 2012 19:16:08 +0100 Subject: [SciPy-User] fmin_bfgs failing on simple problem In-Reply-To: References: Message-ID: On Tue, Apr 17, 2012 at 7:14 PM, John Salvatier wrote: > Well that's good news. I have scipy .9.0b1, what version do you have? 
Less good news: I have 0.8.0 :-) - N > On Tue, Apr 17, 2012 at 11:13 AM, Nathaniel Smith wrote: >> >> On Tue, Apr 17, 2012 at 6:35 PM, John Salvatier >> wrote: >> > Hi all! >> > >> > I am having a problem with the fmin_bfgs solver that's surprising to me. >> > Here's the toy problem I've set up: >> > >> > from scipy.optimize import fmin_bfgs, fmin_ncg >> > from numpy import * >> > import numpy as np >> > >> > def f(x ): >> > ? ? if x < 0: >> > ? ? ? ? return 1.79769313e+308 >> > ? ? else : >> > ? ? ? ? return x + 1./x >> > >> > >> > xs = fmin_bfgs(f, array( [10.]), retall = True) >> > >> > The solver returns [nan] as the solution. >> > >> > The problem is designed to be stiff: between 0 and 1, it slopes upward >> > to >> > infinity but between 1 and infinity, it slopes up at a slope of 1. Left >> > of 0 >> > the function has a "nearly infinite" value.?If bfgs encounters ?a value >> > that's larger than the current value, it should try a different step >> > size, >> > no??Why does fmin_bfgs fail in this way? >> >> I can't reproduce this (on my computer it converges to 0.99999992), >> but have you tried making that < into a <=? The divide-by-zero at f(0) >> might be making it freak out. >> >> -- Nathaniel >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jsalvati at u.washington.edu Tue Apr 17 14:56:57 2012 From: jsalvati at u.washington.edu (John Salvatier) Date: Tue, 17 Apr 2012 11:56:57 -0700 Subject: [SciPy-User] fmin_bfgs failing on simple problem In-Reply-To: References: Message-ID: Hmm, that's too bad. Looks like there was a big refactoring of linesearch.py ( https://github.com/scipy/scipy/blob/master/scipy/optimize/linesearch.py ) a couple of years ago ( https://github.com/scipy/scipy/commit/fefef2d73200d535b95ce0f21dcfe122301a967d ) Thanks for the help Nathaniel :) On Tue, Apr 17, 2012 at 11:16 AM, Nathaniel Smith wrote: > On Tue, Apr 17, 2012 at 7:14 PM, John Salvatier > wrote: > > Well that's good news. I have scipy .9.0b1, what version do you have? > > Less good news: I have 0.8.0 :-) > > - N > > > On Tue, Apr 17, 2012 at 11:13 AM, Nathaniel Smith wrote: > >> > >> On Tue, Apr 17, 2012 at 6:35 PM, John Salvatier > >> wrote: > >> > Hi all! > >> > > >> > I am having a problem with the fmin_bfgs solver that's surprising to > me. > >> > Here's the toy problem I've set up: > >> > > >> > from scipy.optimize import fmin_bfgs, fmin_ncg > >> > from numpy import * > >> > import numpy as np > >> > > >> > def f(x ): > >> > if x < 0: > >> > return 1.79769313e+308 > >> > else : > >> > return x + 1./x > >> > > >> > > >> > xs = fmin_bfgs(f, array( [10.]), retall = True) > >> > > >> > The solver returns [nan] as the solution. > >> > > >> > The problem is designed to be stiff: between 0 and 1, it slopes upward > >> > to > >> > infinity but between 1 and infinity, it slopes up at a slope of 1. > Left > >> > of 0 > >> > the function has a "nearly infinite" value. If bfgs encounters a > value > >> > that's larger than the current value, it should try a different step > >> > size, > >> > no? Why does fmin_bfgs fail in this way? > >> > >> I can't reproduce this (on my computer it converges to 0.99999992), > >> but have you tried making that < into a <=? The divide-by-zero at f(0) > >> might be making it freak out. 
> >> > >> -- Nathaniel > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsalvati at u.washington.edu Tue Apr 17 15:53:27 2012 From: jsalvati at u.washington.edu (John Salvatier) Date: Tue, 17 Apr 2012 12:53:27 -0700 Subject: [SciPy-User] fmin_bfgs failing on simple problem In-Reply-To: References: Message-ID: It seems that what was actually happening is that in line_search_wolfe2, calling scalar_search_wolfe2 calling _zoom calling _cubicmin was returning a NaN if you add a test at the end of _cubicmin testing for a NaN (and then returning None), it finds the right minimum. It looks like _quadmin probably would have the same problem. I'll file a bug report. On Tue, Apr 17, 2012 at 11:56 AM, John Salvatier wrote: > Hmm, that's too bad. Looks like there was a big refactoring of > linesearch.py ( > https://github.com/scipy/scipy/blob/master/scipy/optimize/linesearch.py ) > a couple of years ago ( > https://github.com/scipy/scipy/commit/fefef2d73200d535b95ce0f21dcfe122301a967d > ) > > Thanks for the help Nathaniel :) > > > On Tue, Apr 17, 2012 at 11:16 AM, Nathaniel Smith wrote: > >> On Tue, Apr 17, 2012 at 7:14 PM, John Salvatier >> wrote: >> > Well that's good news. I have scipy .9.0b1, what version do you have? >> >> Less good news: I have 0.8.0 :-) >> >> - N >> >> > On Tue, Apr 17, 2012 at 11:13 AM, Nathaniel Smith >> wrote: >> >> >> >> On Tue, Apr 17, 2012 at 6:35 PM, John Salvatier >> >> wrote: >> >> > Hi all! >> >> > >> >> > I am having a problem with the fmin_bfgs solver that's surprising to >> me. >> >> > Here's the toy problem I've set up: >> >> > >> >> > from scipy.optimize import fmin_bfgs, fmin_ncg >> >> > from numpy import * >> >> > import numpy as np >> >> > >> >> > def f(x ): >> >> > if x < 0: >> >> > return 1.79769313e+308 >> >> > else : >> >> > return x + 1./x >> >> > >> >> > >> >> > xs = fmin_bfgs(f, array( [10.]), retall = True) >> >> > >> >> > The solver returns [nan] as the solution. >> >> > >> >> > The problem is designed to be stiff: between 0 and 1, it slopes >> upward >> >> > to >> >> > infinity but between 1 and infinity, it slopes up at a slope of 1. >> Left >> >> > of 0 >> >> > the function has a "nearly infinite" value. If bfgs encounters a >> value >> >> > that's larger than the current value, it should try a different step >> >> > size, >> >> > no? Why does fmin_bfgs fail in this way? >> >> >> >> I can't reproduce this (on my computer it converges to 0.99999992), >> >> but have you tried making that < into a <=? The divide-by-zero at f(0) >> >> might be making it freak out. 
>> >> >> >> -- Nathaniel >> >> _______________________________________________ >> >> SciPy-User mailing list >> >> SciPy-User at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> > >> > >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Tue Apr 17 16:20:02 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 17 Apr 2012 13:20:02 -0700 Subject: [SciPy-User] All of the PyData videos are now up at the Marakana site Message-ID: Hi folks, A number of you expressed interest in attending the PyData workshop last month and unfortunately we had very tight space restrictions. But thanks to the team at Marakana, who pitched in and were willing to film, edit and post videos for many of the talks, you can access them all here: http://marakana.com/s/2012_pydata_workshop,1090/index.html They are in 720p so you can actually read the terminals, though I think you have to click the YouTube link to be able to change the resolution. Enjoy! f From Kathleen.M.Tacina at nasa.gov Wed Apr 18 12:39:57 2012 From: Kathleen.M.Tacina at nasa.gov (Kathleen M Tacina) Date: Wed, 18 Apr 2012 16:39:57 +0000 Subject: [SciPy-User] fmin_bfgs failing on simple problem In-Reply-To: References: Message-ID: <1334767197.4548.155.camel@MOSES.grc.nasa.gov> I've recreated this problem on my machine. It appears to be relating to floating point precision, and only occurs when the x<0 return value is extremely large. >>> def f2(x,badval=1e20): ... if x<=0: ... return badval ... else: ... return x+1/x ... >>> ans=1. >>> badval=1e20 >>> while abs(ans-1.)<1e-5: ... badval = 10*badval ... ans = fmin_bfgs(f2,10,args=(badval,)) The while loop ends when badval is 1.0e301, with ans as nan and the following error message: Warning: Desired error not necessarily achieved due to precision loss. Current function value: nan Iterations: 45 Function evaluations: 1989 Gradient evaluations: 654 But it works with badval=1.0e300. Would it hurt your problem to use 1e200 as your "nearly infinite" value instead of 1.79769313e+308? Or am I missing something here? On Tue, 2012-04-17 at 13:56 -0500, John Salvatier wrote: > Hmm, that's too bad. Looks like there was a big refactoring of > linesearch.py > ( https://github.com/scipy/scipy/blob/master/scipy/optimize/linesearch.py ) a couple of years ago ( https://github.com/scipy/scipy/commit/fefef2d73200d535b95ce0f21dcfe122301a967d ) > > > > Thanks for the help Nathaniel :) > > > On Tue, Apr 17, 2012 at 11:16 AM, Nathaniel Smith > wrote: > > On Tue, Apr 17, 2012 at 7:14 PM, John Salvatier > wrote: > > Well that's good news. I have scipy .9.0b1, what version do > you have? > > > > Less good news: I have 0.8.0 :-) > > - N > > > > On Tue, Apr 17, 2012 at 11:13 AM, Nathaniel Smith > wrote: > >> > >> On Tue, Apr 17, 2012 at 6:35 PM, John Salvatier > >> wrote: > >> > Hi all! > >> > > >> > I am having a problem with the fmin_bfgs solver that's > surprising to me. 
> >> > Here's the toy problem I've set up: > >> > > >> > from scipy.optimize import fmin_bfgs, fmin_ncg > >> > from numpy import * > >> > import numpy as np > >> > > >> > def f(x ): > >> > if x < 0: > >> > return 1.79769313e+308 > >> > else : > >> > return x + 1./x > >> > > >> > > >> > xs = fmin_bfgs(f, array( [10.]), retall = True) > >> > > >> > The solver returns [nan] as the solution. > >> > > >> > The problem is designed to be stiff: between 0 and 1, it > slopes upward > >> > to > >> > infinity but between 1 and infinity, it slopes up at a > slope of 1. Left > >> > of 0 > >> > the function has a "nearly infinite" value. If bfgs > encounters a value > >> > that's larger than the current value, it should try a > different step > >> > size, > >> > no? Why does fmin_bfgs fail in this way? > >> > >> I can't reproduce this (on my computer it converges to > 0.99999992), > >> but have you tried making that < into a <=? The > divide-by-zero at f(0) > >> might be making it freak out. > >> > >> -- Nathaniel > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsalvati at u.washington.edu Wed Apr 18 12:57:39 2012 From: jsalvati at u.washington.edu (John Salvatier) Date: Wed, 18 Apr 2012 09:57:39 -0700 Subject: [SciPy-User] fmin_bfgs failing on simple problem In-Reply-To: <1334767197.4548.155.camel@MOSES.grc.nasa.gov> References: <1334767197.4548.155.camel@MOSES.grc.nasa.gov> Message-ID: Thanks Kathleen! Good idea. It probably wouldn't hurt in my case. The problem also comes up if you use "inf" too, though which I think would be more common. In any case, I've filed a bug report: http://projects.scipy.org/scipy/ticket/1644 John On Wed, Apr 18, 2012 at 9:39 AM, Kathleen M Tacina < Kathleen.M.Tacina at nasa.gov> wrote: > ** > I've recreated this problem on my machine. It appears to be relating to > floating point precision, and only occurs when the x<0 return value is > extremely large. > > > > >>> def f2(x,badval=1e20): > ... if x<=0: > ... return badval > ... else: > ... return x+1/x > ... > >>> ans=1. > >>> badval=1e20 > >>> while abs(ans-1.)<1e-5: > ... badval = 10*badval > ... ans = fmin_bfgs(f2,10,args=(badval,)) > > > The while loop ends when badval is 1.0e301, with ans as nan and the > following error message: > Warning: Desired error not necessarily achieved due to precision loss. > Current function value: nan > Iterations: 45 > Function evaluations: 1989 > Gradient evaluations: 654 > > But it works with badval=1.0e300. Would it hurt your problem to use 1e200 > as your "nearly infinite" value instead of 1.79769313e+308? > > Or am I missing something here? > > > > On Tue, 2012-04-17 at 13:56 -0500, John Salvatier wrote: > > Hmm, that's too bad. 
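To make the workaround from this exchange concrete: the failure only shows up when the x < 0 return value is within a few orders of magnitude of the float64 maximum (or is inf), so keeping the penalty large but comfortably finite sidesteps the NaN from the line-search interpolation. A minimal sketch, assuming the fmin_bfgs versions discussed in this thread; the name BIG and the value 1e200 come straight from the suggestion above:

import numpy as np
from scipy.optimize import fmin_bfgs

BIG = 1e200                  # large, but well below np.finfo(float).max (~1.8e308)

def f(x):
    x = float(x)
    if x <= 0:               # penalise the infeasible region with a finite value
        return BIG
    return x + 1.0 / x

xopt = fmin_bfgs(f, np.array([10.0]), disp=False)
print(xopt)                  # ends up close to 1.0, the minimiser of x + 1/x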
Looks like there was a big refactoring of > linesearch.py ( > https://github.com/scipy/scipy/blob/master/scipy/optimize/linesearch.py ) > a couple of years ago ( > https://github.com/scipy/scipy/commit/fefef2d73200d535b95ce0f21dcfe122301a967d > ) > > > > Thanks for the help Nathaniel :) > > On Tue, Apr 17, 2012 at 11:16 AM, Nathaniel Smith wrote: > > On Tue, Apr 17, 2012 at 7:14 PM, John Salvatier > wrote: > > Well that's good news. I have scipy .9.0b1, what version do you have? > > > Less good news: I have 0.8.0 :-) > > - N > > > > On Tue, Apr 17, 2012 at 11:13 AM, Nathaniel Smith wrote: > >> > >> On Tue, Apr 17, 2012 at 6:35 PM, John Salvatier > >> wrote: > >> > Hi all! > >> > > >> > I am having a problem with the fmin_bfgs solver that's surprising to > me. > >> > Here's the toy problem I've set up: > >> > > >> > from scipy.optimize import fmin_bfgs, fmin_ncg > >> > from numpy import * > >> > import numpy as np > >> > > >> > def f(x ): > >> > if x < 0: > >> > return 1.79769313e+308 > >> > else : > >> > return x + 1./x > >> > > >> > > >> > xs = fmin_bfgs(f, array( [10.]), retall = True) > >> > > >> > The solver returns [nan] as the solution. > >> > > >> > The problem is designed to be stiff: between 0 and 1, it slopes upward > >> > to > >> > infinity but between 1 and infinity, it slopes up at a slope of 1. > Left > >> > of 0 > >> > the function has a "nearly infinite" value. If bfgs encounters a > value > >> > that's larger than the current value, it should try a different step > >> > size, > >> > no? Why does fmin_bfgs fail in this way? > >> > >> I can't reproduce this (on my computer it converges to 0.99999992), > >> but have you tried making that < into a <=? The divide-by-zero at f(0) > >> might be making it freak out. > >> > >> -- Nathaniel > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From khiggins3 at csc.com Wed Apr 18 11:52:18 2012 From: khiggins3 at csc.com (Kenneth P Higgins) Date: Wed, 18 Apr 2012 11:52:18 -0400 Subject: [SciPy-User] scipy lapack Message-ID: An HTML attachment was scrubbed... URL: From leofagundesdemello at gmail.com Wed Apr 18 21:09:05 2012 From: leofagundesdemello at gmail.com (Leonardo Mello) Date: Wed, 18 Apr 2012 18:09:05 -0700 (PDT) Subject: [SciPy-User] python + fortran Message-ID: <1847da6c-c9ff-4674-a122-405db0fb6335@f37g2000yqc.googlegroups.com> Ol? pessoal ! Estou com um problema para fazer chamadas do FORTRAN no python. Se eu coloco uma estrutura (TYPE) no c?digo o f2py n?o cria o arquivo.so E se utilizo chamdas de m?dulos com USE o f2py cria o arquivo.so mas quando entro no python e tento importar n?o rola. Algu?m sabe como utilizar essa biblioteca f2py que possa me ajudar. 
Vlw From leofagundesdemello at gmail.com Wed Apr 18 21:25:33 2012 From: leofagundesdemello at gmail.com (Leonardo Mello) Date: Wed, 18 Apr 2012 18:25:33 -0700 (PDT) Subject: [SciPy-User] python + fortran (english) Message-ID: <4c27b94e-afb4-4cad-b05a-ab8828447147@35g2000yqq.googlegroups.com> Hello! I'm having a problem calling FORTRAN from Python. If I put a structure (TYPE) in the code, f2py does not create the .so file. And if I use modules with USE, f2py does create the .so file, but when I enter Python and try to import it, the import does not work. Does anyone who knows how to use the f2py library have a suggestion that could help me? Thank you PS: excuse my english From srean.list at gmail.com Thu Apr 19 00:52:49 2012 From: srean.list at gmail.com (srean) Date: Wed, 18 Apr 2012 23:52:49 -0500 Subject: [SciPy-User] inverting pdist. In-Reply-To: <201204161655.03204.eric@depagne.org> References: <201204161655.03204.eric@depagne.org> Message-ID: I lost you on your description about the index of the largest distance. If what you are looking for is, given all pairwise distances ab, ac, ad, bc, bd, cd, to find the coordinates of a, b, c and d, then the problem you are looking for is multidimensional scaling (MDS). You will get lots of hits if you search for it. In general it does not have a unique solution. On the other hand, if you had all pairs of dot products (or similarities) you could obtain the coordinates by an eigen decomposition of the pairwise dot product matrix. On Mon, Apr 16, 2012 at 9:55 AM, Éric Depagne wrote: > Hi. > > I'm using scipy.spatial.distance.pdist. > > I was wondering if it was possible to invert the result to get the elements of > the input matrix that produced a given result. > Let me explain a little. > > Say we have a, b, c and d. The possible distances are ab, ac, ad, bc, bd and > cd. pdist will give the values for those 6 distances. Say the largest is at > index 5. How can I get from the pdist result array that it is the bd distance? > > > And if it is not, is there another way of doing so? My data are such that I > have close to 100k distances to compute. > > Thanks. > Éric. > > Un clavier azerty en vaut deux > ---------------------------------------------------------- > Éric Depagne                          eric at depagne.org > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From lists at hilboll.de Thu Apr 19 02:56:12 2012 From: lists at hilboll.de (andreasrainfarn@googlemail.com) Date: Thu, 19 Apr 2012 08:56:12 +0200 Subject: [SciPy-User] scipy lapack In-Reply-To: References: Message-ID: <93ca6573-8cc9-4320-8868-f141b7f8403d@email.android.com> Let the LAPACK environment variable point to your liblapack.a. See www.scipy.org/Installing_SciPy/BuildingGeneral Cheers, Andreas Kenneth P Higgins wrote: >_______________________________________________ >SciPy-User mailing list >SciPy-User at scipy.org >http://mail.scipy.org/mailman/listinfo/scipy-user From sturla at molden.no Thu Apr 19 08:11:02 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 19 Apr 2012 14:11:02 +0200 Subject: [SciPy-User] python + fortran (english) In-Reply-To: <4c27b94e-afb4-4cad-b05a-ab8828447147@35g2000yqq.googlegroups.com> References: <4c27b94e-afb4-4cad-b05a-ab8828447147@35g2000yqq.googlegroups.com> Message-ID: <4F9000D6.5080704@molden.no> Fortran derived types are a tricky issue. The problem is that Python can only interface with external C libraries.
So to call Fortran it must use C code as proxy, which is what f2py does. Unfortunately a Fortran type does not map to a C struct on the binary level. The binary interface (ABI) is compiler dependent. The solution is to use "Fortran 2003 ISO C bindings" and let a Fortran subroutine map a C struct to a Fortran type. Use the Fortran C bindings to export the subroutine as a C callable void function. Now you can call the Fortran library from Python with ctypes, Cython, f2py, Swig, CXX, Boost.Python, Python C API, or whatever you prefer. In theory f2py or fwrap could do this mapping to Fortran derived types. But I am not sure if they do. I would not waste time on it, just write the wrapper by hand. It is not tedious if you only have to expose a few Fortran subroutines to Python. Sturla On 19.04.2012 03:25, Leonardo Mello wrote: > Hello! > > I'm having a problem getting calls from FORTRAN in Python. > If I put a structure (TYPE) in the code does not create the f2py > arquivo.so > And if I use of modules with the USE f2py file.so but creates the > when I enter the python and try to import does not work. > > Does anyone know how to use this library f2py you can help me. > > Thank you > > PS: excuse my english > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From edepagne at aip.de Thu Apr 19 03:07:38 2012 From: edepagne at aip.de (=?iso-8859-1?q?=C9ric_Depagne?=) Date: Thu, 19 Apr 2012 09:07:38 +0200 Subject: [SciPy-User] inverting pdist. In-Reply-To: References: <201204161655.03204.eric@depagne.org> Message-ID: <201204190907.38907.edepagne@aip.de> Le jeudi 19 avril 2012 06:52:49, srean a ?crit : Hi. Sorry for my unclear explanation. My choice of wording for the subject of my email was very poor, sorry for that. I have the coordinates of a,b,c and d from the start. What I needed (see the end of my message) was a way to find what are the two points for which their distance (given by pdist) is the largest. So in my cases, I have a, b, c and d, their coordinates, and pdist gives me the values ab, ac,ad bc, bd and cd. I can then extract the largest value (thus the longest distance). Then I want to find to which pair it corresponds. > I lost you on your description about the index of the largest > distance. If what you are looking for is that given all pairwise > distance ab, ac, ad, bc, bd, cd to find the coordinates of a,b,c and > d, then the problem you are looking for is multidimensional scaling > (MDS). You will get lots of hits if you search for it. In general it > does not have a unique solution. On the other hand if you had all > pairs of dot products (or similarities) you can obtain the coordinates > by an eigen decomposition of the pairwise dot product matrix. I've kept googling since, and I've found that Ben Root did exactly what I need here: http://old.nabble.com/unravel_index-for-pdist--p32151477.html Cheers, ?ric. > > On Mon, Apr 16, 2012 at 9:55 AM, ?ric Depagne wrote: > > Hi. > > > > I'm using scipy.spatial.distance.pdist. > > > > I was wondering if it was possible to invert the result to get the > > elements of the input matrix that produced a given result. > > Let me explain a little. > > > > Say we have a,b, c and d. The possible distances are ab, ac, ad, bc, bd > > and cd. pdist will give the values for those 6 distances. Say the > > largest is at index 5. How can I get from the pdist result array that it > > is the bd distance? 
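For the pdist question quoted here: one way to map a position in the condensed distance vector back to the pair of observations is to go through squareform and numpy.unravel_index, which is essentially the unravel_index approach from the thread Éric links to above. A sketch with toy data (the 100x3 array is only an example):

import numpy as np
from scipy.spatial.distance import pdist, squareform

X = np.random.rand(100, 3)       # toy data: 100 points in 3-D
d = pdist(X)                     # condensed vector of n*(n-1)/2 distances

D = squareform(d)                # expand back to the full n x n matrix
i, j = np.unravel_index(D.argmax(), D.shape)
print(i, j, D[i, j])             # the pair of observations with the largest distance
# (for very large n the pair can also be computed arithmetically from the
# condensed index, avoiding the dense n x n matrix)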
> > > > > > And if it is not, is there another way of doing so? My data are such that > > I have close to 100k distances to compute. > > > > Thanks. > > ?ric. > > > > Un clavier azerty en vaut deux > > ---------------------------------------------------------- > > ?ric Depagne eric at depagne.org > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Un clavier azerty en vaut deux ---------------------------------------------------------- ?ric Depagne edepagne at aip.de Leibniz-Institut f?r Astrophysik Potsdam An der Sternwarte 16 14482 Potsdam Germany ---------------------------------------------------------- From harijay at gmail.com Fri Apr 20 11:53:21 2012 From: harijay at gmail.com (hari jayaram) Date: Fri, 20 Apr 2012 11:53:21 -0400 Subject: [SciPy-User] building on Ubuntu Linux from github source : BLAS and LAPACK not getting picked up from environment variable Message-ID: Hi I am on 64 bit Ubuntu , python 2.6.5 , GCC 4.4.3. I want to compile my own scipy since the ubuntu package on lucid installs a scipy which does not have the scipy.optimize.curve_fit I followed the build instructions at the url below to install from the git source . http://www.scipy.org/Installing_SciPy/BuildingGeneral I could install numpy from the git source . Then I installed lapack and blas libraries in my home directory following the instructions at BuildingGeneral. I set the environment variables BLAS and LAPACK in my ~/.bashrc and sourced them and then tried the setup.py for scipy sudo python setup.py install I get the following errors implying that BLAS was not getting picked up. I even copied the libblas.a and liblapack.a files to /usr/lib, linked it to liblapack.so ...but none of these helped find lapack or blas "numpy.distutils.system_info.BlasNotFoundError: Blas (http://www.netlib.org/blas/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [blas]) or by setting the BLAS environment variable." I am wondering if the build instructions are different for the github tree. The detailed error is given below. Thanks for your help Hari hari at hari:~/scipy$ sudo python setup.py install --prefix=/usr/local blas_opt_info: blas_mkl_info: libraries mkl,vml,guide not found in ['/usr/local/lib', '/usr/lib64', '/usr/lib'] NOT AVAILABLE atlas_blas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in ['/usr/local/lib', '/usr/lib64', '/usr/lib'] NOT AVAILABLE atlas_blas_info: libraries f77blas,cblas,atlas not found in ['/usr/local/lib', '/usr/lib64', '/usr/lib'] NOT AVAILABLE /usr/local/lib/python2.6/dist-packages/numpy/distutils/system_info.py:1474: UserWarning: Atlas (http://math-atlas.sourceforge.net/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [atlas]) or by setting the ATLAS environment variable. warnings.warn(AtlasNotFoundError.__doc__) blas_info: libraries blas not found in ['/usr/local/lib', '/usr/lib64', '/usr/lib'] NOT AVAILABLE /usr/local/lib/python2.6/dist-packages/numpy/distutils/system_info.py:1483: UserWarning: Blas (http://www.netlib.org/blas/) libraries not found. 
Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [blas]) or by setting the BLAS environment variable. warnings.warn(BlasNotFoundError.__doc__) blas_src_info: NOT AVAILABLE /usr/local/lib/python2.6/dist-packages/numpy/distutils/system_info.py:1486: UserWarning: Blas (http://www.netlib.org/blas/) sources not found. Directories to search for the sources can be specified in the numpy/distutils/site.cfg file (section [blas_src]) or by setting the BLAS_SRC environment variable. warnings.warn(BlasSrcNotFoundError.__doc__) Traceback (most recent call last): File "setup.py", line 208, in setup_package() File "setup.py", line 199, in setup_package configuration=configuration ) File "/usr/local/lib/python2.6/dist-packages/numpy/distutils/core.py", line 152, in setup config = configuration() File "setup.py", line 136, in configuration config.add_subpackage('scipy') File "/usr/local/lib/python2.6/dist-packages/numpy/distutils/misc_util.py", line 1002, in add_subpackage caller_level = 2) File "/usr/local/lib/python2.6/dist-packages/numpy/distutils/misc_util.py", line 971, in get_subpackage caller_level = caller_level + 1) File "/usr/local/lib/python2.6/dist-packages/numpy/distutils/misc_util.py", line 908, in _get_configuration_from_setup_py config = setup_module.configuration(*args) File "scipy/setup.py", line 8, in configuration config.add_subpackage('integrate') File "/usr/local/lib/python2.6/dist-packages/numpy/distutils/misc_util.py", line 1002, in add_subpackage caller_level = 2) File "/usr/local/lib/python2.6/dist-packages/numpy/distutils/misc_util.py", line 971, in get_subpackage caller_level = caller_level + 1) File "/usr/local/lib/python2.6/dist-packages/numpy/distutils/misc_util.py", line 908, in _get_configuration_from_setup_py config = setup_module.configuration(*args) File "scipy/integrate/setup.py", line 10, in configuration blas_opt = get_info('blas_opt',notfound_action=2) File "/usr/local/lib/python2.6/dist-packages/numpy/distutils/system_info.py", line 325, in get_info return cl().get_info(notfound_action) File "/usr/local/lib/python2.6/dist-packages/numpy/distutils/system_info.py", line 484, in get_info raise self.notfounderror(self.notfounderror.__doc__) numpy.distutils.system_info.BlasNotFoundError: Blas (http://www.netlib.org/blas/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [blas]) or by setting the BLAS environment variable. hari at hari:~/scipy$ export BLAS=/home/hari/blas/BLAS/libblas.a hari at hari:~/scipy$ export LAPACK=/home/hari/LAPACK/lapack-3.4.0/liblapack.a hari at hari:~/scipy$ sudo python setup.py install --prefix=/usr/local blas_opt_info: blas_mkl_info: libraries mkl,vml,guide not found in ['/usr/local/lib', '/usr/lib64', '/usr/lib'] NOT AVAILABLE atlas_blas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in ['/usr/local/lib', '/usr/lib64', '/usr/lib'] NOT AVAILABLE atlas_blas_info: libraries f77blas,cblas,atlas not found in ['/usr/local/lib', '/usr/lib64', '/usr/lib'] NOT AVAILABLE /usr/local/lib/python2.6/dist-packages/numpy/distutils/system_info.py:1474: UserWarning: Atlas (http://math-atlas.sourceforge.net/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [atlas]) or by setting the ATLAS environment variable. 
warnings.warn(AtlasNotFoundError.__doc__) blas_info: libraries blas not found in ['/usr/local/lib', '/usr/lib64', '/usr/lib'] NOT AVAILABLE /usr/local/lib/python2.6/dist-packages/numpy/distutils/system_info.py:1483: UserWarning: Blas (http://www.netlib.org/blas/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [blas]) or by setting the BLAS environment variable. warnings.warn(BlasNotFoundError.__doc__) blas_src_info: NOT AVAILABLE /usr/local/lib/python2.6/dist-packages/numpy/distutils/system_info.py:1486: UserWarning: Blas (http://www.netlib.org/blas/) sources not found. Directories to search for the sources can be specified in the numpy/distutils/site.cfg file (section [blas_src]) or by setting the BLAS_SRC environment variable. warnings.warn(BlasSrcNotFoundError.__doc__) Traceback (most recent call last): File "setup.py", line 208, in setup_package() File "setup.py", line 199, in setup_package configuration=configuration ) File "/usr/local/lib/python2.6/dist-packages/numpy/distutils/core.py", line 152, in setup config = configuration() File "setup.py", line 136, in configuration config.add_subpackage('scipy') File "/usr/local/lib/python2.6/dist-packages/numpy/distutils/misc_util.py", line 1002, in add_subpackage caller_level = 2) File "/usr/local/lib/python2.6/dist-packages/numpy/distutils/misc_util.py", line 971, in get_subpackage caller_level = caller_level + 1) File "/usr/local/lib/python2.6/dist-packages/numpy/distutils/misc_util.py", line 908, in _get_configuration_from_setup_py config = setup_module.configuration(*args) File "scipy/setup.py", line 8, in configuration config.add_subpackage('integrate') File "/usr/local/lib/python2.6/dist-packages/numpy/distutils/misc_util.py", line 1002, in add_subpackage caller_level = 2) File "/usr/local/lib/python2.6/dist-packages/numpy/distutils/misc_util.py", line 971, in get_subpackage caller_level = caller_level + 1) File "/usr/local/lib/python2.6/dist-packages/numpy/distutils/misc_util.py", line 908, in _get_configuration_from_setup_py config = setup_module.configuration(*args) File "scipy/integrate/setup.py", line 10, in configuration blas_opt = get_info('blas_opt',notfound_action=2) File "/usr/local/lib/python2.6/dist-packages/numpy/distutils/system_info.py", line 325, in get_info return cl().get_info(notfound_action) File "/usr/local/lib/python2.6/dist-packages/numpy/distutils/system_info.py", line 484, in get_info raise self.notfounderror(self.notfounderror.__doc__) numpy.distutils.system_info.BlasNotFoundError: Blas (http://www.netlib.org/blas/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [blas]) or by setting the BLAS environment variable. From sergio_r at mail.com Fri Apr 20 12:53:51 2012 From: sergio_r at mail.com (Sergio Rojas) Date: Fri, 20 Apr 2012 12:53:51 -0400 Subject: [SciPy-User] Testing scipy: a curiosity Message-ID: <20120420165351.167370@gmx.com> Hello all, Running the scipy tests in sequence: scipy.test() then, after the previous test finished without errors or failures I run: scipy.test('full') some tests fails (see below). Exiting and starting a new python session and executing only the test suite: scipy.test('full') No failures or errors are reported. The test ends with: ... ---------------------------------------------------------------------- Ran 5832 tests in 423.039s OK (KNOWNFAIL=14, SKIP=42) What could be the problem?. Can we trust these tests results? 
Is there a way to run these failed tests individually so one can make sure the scipy system is working properly? perhaps an especific hard example could be usefull. Sergio >$ python_gnu Python 2.7.2 (default, Apr 18 2012, 13:06:38) [GCC 4.6.1] on linux3 Type "help", "copyright", "credits" or "license" for more information. >>> >>> >>> scipy.test('full') Running unit tests for scipy NumPy version 1.6.1 NumPy is installed in /home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/si te-packages/numpy SciPy version 0.10.1 SciPy is installed in /home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/si te-packages/scipy Python version 2.7.2 (default, Apr 18 2012, 13:06:38) [GCC 4.6.1] nose version 1.1.2 ..... ====================================================================== ERROR: test_iv_cephes_vs_amos_mass_test (test_basic.TestBessel) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/sc ipy/special/tests/test_basic.py", line 1642, in test_iv_cephes_vs_amos_mass_test c1 = special.iv(v, x) RuntimeWarning: divide by zero encountered in iv ====================================================================== FAIL: test_mio.test_mat4_3d ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/no se/case.py", line 197, in runTest self.test(*self.arg) File "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/sc ipy/io/matlab/tests/test_mio.py", line 740, in test_mat4_3d stream, {'a': arr}, True, '4') File "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/nu mpy/testing/utils.py", line 1008, in assert_raises return nose.tools.assert_raises(*args,**kwargs) AssertionError: DeprecationWarning not raised ====================================================================== FAIL: Regression test for #651: better handling of badly conditioned ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/sc ipy/signal/tests/test_filter_design.py", line 34, in test_bad_filter assert_raises(BadCoefficients, tf2zpk, [1e-15], [1.0, 1.0]) File "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/nu mpy/testing/utils.py", line 1008, in assert_raises return nose.tools.assert_raises(*args,**kwargs) AssertionError: BadCoefficients not raised ---------------------------------------------------------------------- Ran 5832 tests in 252.953s FAILED (KNOWNFAIL=14, SKIP=42, errors=1, failures=2) scipy.test('full') scipy.test() ====================================================================== FAIL: test_mio.test_mat4_3d ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/no se/case.py", line 197, in runTest self.test(*self.arg) File "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/sc ipy/io/matlab/tests/test_mio.py", line 740, in test_mat4_3d stream, {'a': arr}, True, '4') File "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/nu mpy/testing/utils.py", line 1008, in assert_raises return nose.tools.assert_raises(*args,**kwargs) AssertionError: DeprecationWarning not raised 
====================================================================== FAIL: Regression test for #651: better handling of badly conditioned ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/sc ipy/signal/tests/test_filter_design.py", line 34, in test_bad_filter assert_raises(BadCoefficients, tf2zpk, [1e-15], [1.0, 1.0]) File "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/nu mpy/testing/utils.py", line 1008, in assert_raises return nose.tools.assert_raises(*args,**kwargs) AssertionError: BadCoefficients not raised ---------------------------------------------------------------------- Ran 5101 tests in 40.225s -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Fri Apr 20 13:07:24 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 20 Apr 2012 19:07:24 +0200 Subject: [SciPy-User] Testing scipy: a curiosity In-Reply-To: <20120420165351.167370@gmx.com> References: <20120420165351.167370@gmx.com> Message-ID: On Fri, Apr 20, 2012 at 6:53 PM, Sergio Rojas wrote: > > Hello all, > > > Running the scipy tests in sequence: > > scipy.test() > then, after the previous test finished without errors > or failures I run: > scipy.test('full') > some tests fails (see below). > > Exiting and starting a new python session and > executing only the test suite: > > scipy.test('full') > > No failures or errors are reported. The test ends with: > ... > ---------------------------------------------------------------------- > Ran 5832 tests in 423.039s > OK (KNOWNFAIL=14, SKIP=42) > > > What could be the problem?. Can we trust these tests results? > Python's default behavior is to raise warnings from the same place in the code only once. So when you re-run the tests it doesn't raise those warnings. The tests check that a warning is raised. Hence the failures. > Is there a way to run these failed tests individually so one can > make sure the scipy system is working properly? perhaps an > especific hard example could be usefull. > > See http://readthedocs.org/docs/nose/en/latest/usage.html#selecting-tests Ralf > > Sergio > >$ python_gnu > Python 2.7.2 (default, Apr 18 2012, 13:06:38) > [GCC 4.6.1] on linux3 > Type "help", "copyright", "credits" or "license" for more information. > >>> > >>> > >>> scipy.test('full') > Running unit tests for scipy > NumPy version 1.6.1 > NumPy is installed in > /home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/si > te-packages/numpy > SciPy version 0.10.1 > SciPy is installed in > /home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/si > te-packages/scipy > Python version 2.7.2 (default, Apr 18 2012, 13:06:38) [GCC 4.6.1] > nose version 1.1.2 > > ..... 
> > ====================================================================== > ERROR: test_iv_cephes_vs_amos_mass_test (test_basic.TestBessel) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/sc > ipy/special/tests/test_basic.py", line 1642, in > test_iv_cephes_vs_amos_mass_test > c1 = special.iv(v, x) > RuntimeWarning: divide by zero encountered in iv > > ====================================================================== > FAIL: test_mio.test_mat4_3d > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/no > se/case.py", line 197, in runTest > self.test(*self.arg) > File > "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/sc > ipy/io/matlab/tests/test_mio.py", line 740, in test_mat4_3d > stream, {'a': arr}, True, '4') > File > "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/nu > mpy/testing/utils.py", line 1008, in assert_raises > return nose.tools.assert_raises(*args,**kwargs) > AssertionError: DeprecationWarning not raised > > ====================================================================== > FAIL: Regression test for #651: better handling of badly conditioned > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/sc > ipy/signal/tests/test_filter_design.py", line 34, in test_bad_filter > assert_raises(BadCoefficients, tf2zpk, [1e-15], [1.0, 1.0]) > File > "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/nu > mpy/testing/utils.py", line 1008, in assert_raises > return nose.tools.assert_raises(*args,**kwargs) > AssertionError: BadCoefficients not raised > > ---------------------------------------------------------------------- > Ran 5832 tests in 252.953s > > FAILED (KNOWNFAIL=14, SKIP=42, errors=1, failures=2) > > > scipy.test('full') > scipy.test() > > ====================================================================== > FAIL: test_mio.test_mat4_3d > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/no > se/case.py", line 197, in runTest > self.test(*self.arg) > File > "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/sc > ipy/io/matlab/tests/test_mio.py", line 740, in test_mat4_3d > stream, {'a': arr}, True, '4') > File > "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/nu > mpy/testing/utils.py", line 1008, in assert_raises > return nose.tools.assert_raises(*args,**kwargs) > AssertionError: DeprecationWarning not raised > > ====================================================================== > FAIL: Regression test for #651: better handling of badly conditioned > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/sc > ipy/signal/tests/test_filter_design.py", line 34, in test_bad_filter > assert_raises(BadCoefficients, tf2zpk, [1e-15], [1.0, 1.0]) > File > "/home/srojas/myPROG/Python272GNU/Linux64b/lib/python2.7/site-packages/nu > mpy/testing/utils.py", line 1008, in assert_raises > return 
nose.tools.assert_raises(*args,**kwargs) > AssertionError: BadCoefficients not raised > > ---------------------------------------------------------------------- > Ran 5101 tests in 40.225s > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From duiliotadeu at gmail.com Fri Apr 20 15:07:24 2012 From: duiliotadeu at gmail.com (Duilio Tadeu) Date: Fri, 20 Apr 2012 16:07:24 -0300 Subject: [SciPy-User] setting submatrix of sparse matrix Message-ID: Hi, I am developing a simple Finite Elements code and there a part of the code where I need to set some elements of the sparse matrix (actually a submatrix), But I could not make it work. I have tried many ways. I describe the better way I could try. The array index idx is 1x3, so idx1 and idx2 is 3x3. The (small) matrix M is 3x3. The matrix A is a sparse matrix, lil_matrix sparse matrix of scipy.sparse ###### partial code ################## M = local_matrix(x,y) M=M.ravel() idx1,idx2 = numpy.meshgrid(idx,idx) idx1=idx1.ravel() idx2=idx2.ravel() A[idx1,idx2] = A[idx1,idx2] + M ################################### In fact, I am trying to migrate from MATLAB to Scipy/Numpy, but this part is taking some time and is becoming frustrating, since I could not find any kind of way to perform this task. In MATLAB, the code would be just: M = local_matrix(x,y) A[idx,idx] = A[idx,idx] + M Which is shorter and easy. I send next the error message. Any help is welcome. Thanks, Duilio 131 idx2=idx2.ravel() 132 --> 133 A[idx1,idx2] = A[idx1,idx2] + M /usr/lib/python2.6/dist-packages/scipy/sparse/lil.pyc in __setitem__(self, index, x) 318 else: 319 for ii, jj, xx in zip(i, j, x): --> 320 self._insertat(ii, jj, xx) 321 elif isinstance(i, slice) or issequence(i): 322 rows = self.rows[i] /usr/lib/python2.6/dist-packages/scipy/sparse/lil.pyc in _insertat(self, i, j, x) 230 row = self.rows[i] 231 data = self.data[i] --> 232 self._insertat2(row, data, j, x) 233 234 def _insertat2(self, row, data, j, x): /usr/lib/python2.6/dist-packages/scipy/sparse/lil.pyc in _insertat2(self, row, data, j, x) 244 245 if not np.isscalar(x): --> 246 raise ValueError('setting an array element with a sequence') 247 248 try: ValueError: setting an array element with a sequence From a.klein at science-applied.nl Fri Apr 20 16:41:14 2012 From: a.klein at science-applied.nl (Almar Klein) Date: Fri, 20 Apr 2012 22:41:14 +0200 Subject: [SciPy-User] ANN: IEP 3.0.beta - the Interactive Editor for Python Message-ID: Dear all, On behalf of Rob, Ludo and myself, I'm pleased to finally announce version 3.0.beta of the Interactive Editor for Python. Give it at try and let us know what you think! IEP is a cross-platform Python IDE focused on interactivity and introspection, which makes it very suitable for scientific computing. Its practical design is aimed at simplicity and efficiency. IEP is written in Python 3 and Qt. Binaries are available for Windows, Linux, and Mac. Website: http://code.google.com/p/iep/ Discussion group: http://groups.google.com/group/iep_ Release notes: http://code.google.com/p/iep/wiki/Release Regards, Almar -- Almar Klein, PhD Science Applied phone: +31 6 19268652 e-mail: a.klein at science-applied.nl -------------- next part -------------- An HTML attachment was scrubbed... 
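Regarding the lil_matrix assembly question earlier in this digest: the ValueError in that traceback comes from lil_matrix refusing an array-valued right-hand side for fancy indexing, so the usual workarounds are scalar updates in a double loop or, for large meshes, collecting (row, col, value) triplets and building the global matrix once with coo_matrix, which sums duplicate entries exactly as finite-element assembly needs. A rough sketch; n, idx and M below are stand-ins for the poster's own sizes, index map and element matrix:

import numpy as np
import scipy.sparse as sp

n = 10                                   # illustrative global matrix size
A = sp.lil_matrix((n, n))
idx = np.array([2, 5, 7])                # hypothetical local-to-global index map
M = np.arange(9.0).reshape(3, 3)         # hypothetical 3x3 element matrix

# workaround 1: scalar updates only
for a, i in enumerate(idx):
    for b, j in enumerate(idx):
        A[int(i), int(j)] += M[a, b]

# workaround 2: build the global matrix once from triplets (extend these
# arrays over all elements in a real assembly loop)
rows = np.repeat(idx, len(idx))
cols = np.tile(idx, len(idx))
A2 = sp.coo_matrix((M.ravel(), (rows, cols)), shape=(n, n)).tocsr()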
URL: From otrov at hush.ai Sat Apr 21 05:36:06 2012 From: otrov at hush.ai (Kliment) Date: Sat, 21 Apr 2012 11:36:06 +0200 Subject: [SciPy-User] Download moinmoin scipy cookbook locally Message-ID: <20120421093606.A06A614DBD4@smtp.hushmail.com> Hey guys, I use Zim (Desktop wiki) and accidentally found about moinmoin script (http://zim- wiki.org/wiki/doku.php?id=script_to_convert_moinmoin_pages_to_zim) that could transform mm to Zim format. I immediately thought on Scipy cookbook and went for it. However I know nothing about moinmoin and initially I tried to "wget" http://www.scipy.org/Cookbook recursively - bad idea for many reasons. So I thought to ask here, how can I download moinmoin scipy cookbook locally? TIA From ralf.gommers at googlemail.com Sun Apr 22 05:53:52 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 22 Apr 2012 11:53:52 +0200 Subject: [SciPy-User] Download moinmoin scipy cookbook locally In-Reply-To: <20120421093606.A06A614DBD4@smtp.hushmail.com> References: <20120421093606.A06A614DBD4@smtp.hushmail.com> Message-ID: On Sat, Apr 21, 2012 at 11:36 AM, Kliment wrote: > Hey guys, > > I use Zim (Desktop wiki) and accidentally found about moinmoin > script (http://zim- > wiki.org/wiki/doku.php?id=script_to_convert_moinmoin_pages_to_zim) > that could transform mm to Zim format. I immediately thought on > Scipy cookbook and went for it. > > However I know nothing about moinmoin and initially I tried to > "wget" http://www.scipy.org/Cookbook recursively - bad idea for > many reasons. > So I thought to ask here, how can I download moinmoin scipy > cookbook locally? > I'm not sure that there's a better way than grabbing all links from the main cookbook page that start with href="/Cookbook/. Shouldn't be that difficult. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From emmanuelle.gouillart at normalesup.org Sun Apr 22 10:07:18 2012 From: emmanuelle.gouillart at normalesup.org (Emmanuelle Gouillart) Date: Sun, 22 Apr 2012 16:07:18 +0200 Subject: [SciPy-User] Euroscipy 2012 - abstract deadline soon (April 30) + sprints Message-ID: <20120422140718.GB565@phare.normalesup.org> Hello, this is a reminder of the approaching deadline for abstract submission at the Euroscipy 2012 conference: the deadline is April 30, in one week. Euroscipy 2012 will be held in **Brussels**, **August 23-27**, at the Universit? Libre de Bruxelles (ULB, Solbosch Campus). The EuroSciPy meeting is a cross-disciplinary gathering focused on the use and development of the Python language in scientific research and industry. This event strives to bring together both users and developers of scientific tools, as well as academic research and state of the art industry. More information about the conference, including practical information, are found on the conference website http://www.euroscipy.org/conference/euroscipy2012 We are soliciting talks and posters that discuss topics related to scientific computing using Python. These include applications, teaching, future development directions, and research. We welcome contributions from the industry as well as the academic world. Submission guidelines are found on http://www.euroscipy.org/card/euroscipy2012_call_for_contributions Also, rooms are available at the ULB for sprints on Tuesday August 28th and Wednesday 29th. If you wish to organize a sprint at Euroscipy, please get in touch with Berkin Malkoc (malkocb at itu.edu.tr). 
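Back to the Cookbook download question above, a small sketch of the link-grabbing approach suggested there. It assumes only that the index page links its recipes as href="/Cookbook/..."; the ?action=raw query is the usual MoinMoin way to fetch the raw wiki markup, which is what a MoinMoin-to-Zim converter would want as input:

import re
try:
    from urllib2 import urlopen           # Python 2
except ImportError:
    from urllib.request import urlopen    # Python 3

index = urlopen("http://www.scipy.org/Cookbook").read().decode("utf-8", "replace")
pages = sorted(set(re.findall(r'href="/(Cookbook/[^"#?]+)"', index)))
for page in pages:
    raw = urlopen("http://www.scipy.org/%s?action=raw" % page).read()
    open(page.replace("/", "_") + ".moin", "wb").write(raw)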
Any other questions should be addressed exclusively to org-team at lists.euroscipy.org We apologize for the inconvenience if you received this e-mail through several mailing-lists. -- Emmanuelle, for the organizing team From josef.pktd at gmail.com Sun Apr 22 11:00:44 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 22 Apr 2012 11:00:44 -0400 Subject: [SciPy-User] OT: global optimization, hybrid global local search Message-ID: I'm looking at nonlinear regression, function estimation again. I'm interested in combining global search with local optimizers, which I think should be much faster in many of our problems. Anyone with ideas, experience, code? I just searched and browsed a few papers, but it's not my specialization. "In particular, the problems which are very hard to be solved by gradient descent-based methods, or the ones which have computationally expensive objective functions are very good candidates to be solved by the proposed methods." *) *) Hybrid optimization with improved tabu search http://www.sciencedirect.com/science/article/pii/S1568494610001511 Genetic and Nelder?Mead algorithms hybridized for a more accurate global optimization of continuous multiminima functions http://www.sciencedirect.com/science/article/pii/S0377221702004010 A hybrid method combining continuous tabu search and Nelder?Mead simplex algorithms for the global optimization of multiminima functions http://www.sciencedirect.com/science/article/pii/S0377221703006301 Hybridizing exact methods and metaheuristics: A taxonomy http://www.sciencedirect.com/science/article/pii/S0377221708003597 mostly integer programming Hybrid simulated annealing and direct search method for nonlinear unconstrained global optimization http://www.tandfonline.com/doi/abs/10.1080/1055678021000030084 Josef (Why does scipy not have any good global optimizers?) From denis-bz-gg at t-online.de Sun Apr 22 13:04:14 2012 From: denis-bz-gg at t-online.de (denis) Date: Sun, 22 Apr 2012 19:04:14 +0200 Subject: [SciPy-User] OT: global optimization, hybrid global local search In-Reply-To: References: Message-ID: On 22/04/2012 17:00, josef.pktd at gmail.com wrote: > I'm looking at nonlinear regression, function estimation again. > I'm interested in combining global search with local optimizers, which > I think should be much faster in many of our problems. > > Anyone with ideas, experience, code? ... Hi Josef, agree that hybrid methods have potential but like clearer goals before rushing in to code -- de gustabus. "Optimization" covers a HUGE range -- interactive, e.g. run 10 NM then look +- .1 then 10 NM more ... vs fully-automatic dimension: 2d / 3d visualizable, 4d .. say 10d, higher smooth / Gaussian noise / noisy but no noise model user gradients / finite-difference gradient est (low fruit) / no-deriv convex / not many application areas with a correspondingly huge range of optimizers, frameworks, plot / visualizers scipy.optimize, scikit-learn stuff, nlopt, cvxopt, stuff in R ... and a huge range of users from curve_fit to people who want X (but don't use it if it's there already) not to mention *lots* of papers on 1-user methods. There are more optimizers than test functions -- https://github.com/denis-bz/opt/{scopt,nlopt}/test/*sum show how noisy some no-deriv optimizers are on Powell's difficult sin-cos function. Do you use leastsq, any comments on that ? cf. Martin Teichmann rewrite https://github.com/scipy/scipy/pull/90 In short, we have to concentrate; suggestions ? 
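For concreteness, the simplest hybrid of the kind being discussed here -- a cheap global stage feeding a local gradient-based refinement -- only takes a few lines around the existing scipy.optimize local solvers. A sketch; the restart count, bounds and the Himmelblau test function are illustrative only:

import numpy as np
from scipy.optimize import fmin_bfgs

def multistart(func, bounds, n_starts=20, seed=0):
    # global stage: random starting points inside the box
    # local stage:  BFGS polish from each start, keep the best result
    rng = np.random.RandomState(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    best_x, best_f = None, np.inf
    for _ in range(n_starts):
        x0 = lo + rng.rand(len(lo)) * (hi - lo)
        x = fmin_bfgs(func, x0, disp=False)
        fx = func(x)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f

# toy usage: Himmelblau's function has four global minima in [-5, 5]^2
f = lambda p: (p[0]**2 + p[1] - 11)**2 + (p[0] + p[1]**2 - 7)**2
print(multistart(f, [(-5, 5), (-5, 5)]))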
cheers -- denis > Josef > (Why does scipy not have any good global optimizers?) Examples please ? deja vu: late March rant on "why doesn't leastsq do bounds" ? Well it can, easily From josef.pktd at gmail.com Sun Apr 22 13:31:57 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 22 Apr 2012 13:31:57 -0400 Subject: [SciPy-User] OT: global optimization, hybrid global local search In-Reply-To: References: Message-ID: On Sun, Apr 22, 2012 at 1:04 PM, denis wrote: > On 22/04/2012 17:00, josef.pktd at gmail.com wrote: >> I'm looking at nonlinear regression, function estimation again. >> I'm interested in combining global search with local optimizers, which >> I think should be much faster in many of our problems. >> >> Anyone with ideas, experience, code? > ... > Hi Josef, > > agree that hybrid methods have potential > but like clearer goals before rushing in to code -- de gustabus. > > "Optimization" covers a HUGE range -- > ? ? interactive, e.g. run 10 NM then look +- .1 then 10 NM more ... > ? ? ? ? ?vs fully-automatic > ? ? dimension: 2d / 3d visualizable, 4d .. say 10d, higher > ? ? smooth / Gaussian noise / noisy but no noise model > ? ? user gradients / finite-difference gradient est (low fruit) / no-deriv > ? ? convex / not > ? ? many application areas > > with a correspondingly huge range of optimizers, frameworks, plot / visualizers > scipy.optimize, scikit-learn stuff, nlopt, cvxopt, stuff in R ... > and a huge range of users from curve_fit > to people who want X (but don't use it if it's there already) > not to mention *lots* of papers on 1-user methods. > > There are more optimizers than test functions -- > https://github.com/denis-bz/opt/{scopt,nlopt}/test/*sum > show how noisy some no-deriv optimizers are on Powell's difficult sin-cos function. > > > Do you use leastsq, any comments on that ? > cf. Martin Teichmann rewrite https://github.com/scipy/scipy/pull/90 > > > In short, we have to concentrate; suggestions ? http://itl.nist.gov/div898/strd/nls/nls_main.shtml 5 to 20 parameters, smooth least squares estimation or maximize log-likelihood possibly with minimal user intervention what I sometimes do: choose random starting parameters then use leastsq, fmin or fmin_bfgs Instead of random starting parameters, I would like to have a not home-made, global directed search. simulated annealing, differential evolution, ... sound too random to me (and a waste of computer time) - not quite verified tabu search (I don't know yet what that is) + leastsq or fmin ? I'm a user of optimization algorithm and avoid looking at the inside, if I don't have to. Josef > > cheers > ? -- denis > > >> Josef >> (Why does scipy not have any good global optimizers?) > > Examples please ? > deja vu: late March rant on "why doesn't leastsq do bounds" ? > Well it can, easily > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From scott.sinclair.za at gmail.com Tue Apr 24 02:21:00 2012 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Tue, 24 Apr 2012 08:21:00 +0200 Subject: [SciPy-User] scipy-user, numpy-user on google groups ? In-Reply-To: <4F9528B6.3010907@t-online.de> References: <4F9528B6.3010907@t-online.de> Message-ID: On 23 April 2012 12:02, denis wrote: > Hi Scott, > ?would you know what's happened to scipy-user and ?numpy-user on google > groups ? > They seem to be dormant / unresponsive script since about 1 April > or else, what other newsreader would you recommend ? 
> Thanks, cheers > ?-- denis I'm not sure who set up the Google groups to follow the lists, or how to get them working again (it seems the Google account has been disabled by Mailman due to excessive bounces). If you don't want to subscribe to the lists directly, you can try the Gmane archives: http://dir.gmane.org/gmane.comp.python.numeric.general http://dir.gmane.org/gmane.comp.python.scientific.user Cheers, Scott From tmp50 at ukr.net Tue Apr 24 09:04:41 2012 From: tmp50 at ukr.net (Dmitrey) Date: Tue, 24 Apr 2012 16:04:41 +0300 Subject: [SciPy-User] [ANN] Optimization with categorical variables, disjunctive (and other logical) constraints Message-ID: <10202.1335272681.1300527922027298816@ffe17.ukr.net> hi all, free solver interalg for global nonlinear optimization with specifiable accuracy now can handle categorical variables, disjunctive (and other logical) constraints, thus making it available to solve GDP, possibly in multiobjective form. There are ~ 2 months till next OpenOpt release, but I guess someone may find it useful for his purposes right now. See here for more details. Regards, D. -------------- next part -------------- An HTML attachment was scrubbed... URL: From denis-bz-gg at t-online.de Tue Apr 24 10:56:10 2012 From: denis-bz-gg at t-online.de (denis) Date: Tue, 24 Apr 2012 16:56:10 +0200 Subject: [SciPy-User] OT: global optimization, hybrid global local search In-Reply-To: References: Message-ID: On 22/04/2012 19:31, josef.pktd at gmail.com wrote: > On Sun, Apr 22, 2012 at 1:04 PM, denis wrote: >> On 22/04/2012 17:00, josef.pktd at gmail.com wrote: >>> I'm looking at nonlinear regression, function estimation again. >>> I'm interested in combining global search with local optimizers, which >>> I think should be much faster in many of our problems. > http://itl.nist.gov/div898/strd/nls/nls_main.shtml > > 5 to 20 parameters, smooth > least squares estimation or maximize log-likelihood > possibly with minimal user intervention > > what I sometimes do: choose random starting parameters then use > leastsq, fmin or fmin_bfgs > > Instead of random starting parameters, I would like to have a not > home-made, global directed search. Josef, fwiw, leastsq_bounds (http://projects.scipy.org/scipy/ticket/1631) works fine on this NIST "Higher difficulty" testcase if you just bound [0, 2*x0]: Info: loaded MGH09: opt params .193 .191 .123 .136 xmin .19 .19 .12 .14 x0 25 39 42 39 test_leastsq_bounds.py MGH09 box [[ 0. 0. 0. 0.] [ 50. 78. 83. 78.]] boxweights [0, 10, 20] ftol .001 boxweight 0: err .00016 pfit 1.21 783 4.09e+03 1.65e+03 neval 56 boxweight 10: err 2.8e-05 pfit .193 .197 .124 .139 neval 297 boxweight 20: err 2.8e-05 pfit .193 .197 .124 .139 neval 297 Thanks to Matt Newville for making the NIST testcases available. But they're all 1d curve-fitting from the 90 s, pretty small (your link). Sure, curve_fits can shoot off to infinity but if a person can easily fix that (bounds, scaling) then that's enough -- cf. the Betty Crocker effect. Yes there's a whole zoo of interesting untried general methods, papers ... but with more methods than real test cases, where's our market -- which way is up ? 
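One non-home-made option that already ships with scipy is optimize.brute: it evaluates the objective on a grid over the given ranges and then, by default, polishes the best grid point with fmin, i.e. a coarse global stage followed by a local one. A short sketch with an illustrative two-parameter objective:

import numpy as np
from scipy import optimize

# a 2-D objective with many local minima (illustrative only)
f = lambda p: np.sin(3.0 * p[0]) * np.sin(3.0 * p[1]) + 0.05 * (p[0]**2 + p[1]**2)

xbest = optimize.brute(f, ranges=((-3, 3), (-3, 3)), Ns=25, finish=optimize.fmin)
print(xbest, f(xbest))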
cheers -- denis From fccoelho at gmail.com Wed Apr 25 08:05:10 2012 From: fccoelho at gmail.com (Flavio Coelho) Date: Wed, 25 Apr 2012 09:05:10 -0300 Subject: [SciPy-User] Job Oportunities in Brazil Message-ID: First of all I'd like to apologize for the cross-post, but I'd like to bring to the attention of these Important Python scientific communities, a couple of opportunities for young PhDs in our institution, located at Rio de Janeiro, Brasil. Getulio Vargas Foundation is one of the top Research and Educational Institutions in Brasil. We are the largest Think-Tank of Latin America, and rank among the top 30 in the world[1]. Though we are not an University we have both undergraduate and graduate programs, which are very highly ranked. - Statistician: http://emap.fgv.br/files/edital-estatistica-en.pdf - Mathematician: http://emap.fgv.br/files/edital-matematica-aplicada-en.pdf These two jobs are to work at the Applied Mathematics School, where I work. I'd be delighted to have more academic Pythonistas around me, so please take your time to look through the job descriptions, and if you feel up to the challenge, please apply! cheers, [1] http://www.postwesternworld.com/2012/01/26/fgv-ranks-among-worlds-top-thirty-think-tanks/ -- Fl?vio Code?o Coelho ================ +55(21) 3799-5567 Professor Escola de Matem?tica Aplicada Funda??o Get?lio Vargas Rio de Janeiro - RJ Brasil -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark.pundurs at nokia.com Wed Apr 25 13:39:52 2012 From: mark.pundurs at nokia.com (Pundurs Mark (Nokia-LC/Chicago)) Date: Wed, 25 Apr 2012 12:39:52 -0500 Subject: [SciPy-User] cluster.hierarchy.fcluster: Choosing value for threshold t? Message-ID: <8A18D8FA4293104C9A710494FD6C273CB7BEEADB@hq-ex-mb03.ad.navteq.com> I have ~8000 observations of 1-D data (not a standard use case for cluster.hierarchy, I suspect). I send the output of linkage to fcluster; if I choose a t of 1.5, the output array has only one unique value, and if I choose t=0.5, the output array has as many unique values as does the input data. I can see from dendrogram that there are intermediate levels of clustering; apart from trial and error, how do I find a t that returns such a clustering? (Does the vertical axis of the dendrogram have anything to do with fcluster's t argument? In the dendrogram, clusters split at integer values of the vertical component.) Mark Pundurs Data Analyst - Traffic Location & Commerce Chicago The information contained in this communication may be CONFIDENTIAL and is intended only for the use of the recipient(s) named above. If you are not the intended recipient, you are hereby notified that any dissemination, distribution, or copying of this communication, or any of its contents, is strictly prohibited. If you have received this communication in error, please notify the sender and delete/destroy the original message and any copy of it from your computer or paper files. From sergio_r at mail.com Fri Apr 27 07:27:31 2012 From: sergio_r at mail.com (Sergio Rojas) Date: Fri, 27 Apr 2012 07:27:31 -0400 Subject: [SciPy-User] Testing scipy: a curiosity (Ralf Gommers) Message-ID: <20120427112731.17940@gmx.com> On Fri, Apr 20, 2012 at 6:53 PM, Sergio Rojas wrote: > > Hello all, > > > Running the scipy tests in sequence: > > scipy.test() > then, after the previous test finished without errors > or failures I run: > scipy.test('full') > some tests fails (see below). 
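On the cluster.hierarchy.fcluster question above: the vertical axis of the dendrogram is the linkage (cophenetic) distance, but fcluster's default criterion is 'inconsistent', where t thresholds an inconsistency coefficient rather than a height -- which would explain the jump from all-singletons at t=0.5 to a single cluster at t=1.5. Passing criterion='distance' makes t a cut height on the dendrogram's vertical axis, and criterion='maxclust' asks for at most t flat clusters directly. A rough sketch on made-up 1-D data (obs stands in for the ~8000 observations):

import numpy as np
from scipy.cluster import hierarchy

obs = np.random.randn(200, 1)                  # 1-D observations as an (n, 1) array
Z = hierarchy.linkage(obs, method='average')

labels = hierarchy.fcluster(Z, t=2.0, criterion='distance')   # cut at height 2.0
print(len(np.unique(labels)))

labels5 = hierarchy.fcluster(Z, t=5, criterion='maxclust')    # ask for <= 5 clusters
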
> > Exiting and starting a new python session and > executing only the test suite: > > scipy.test('full') > > No failures or errors are reported. The test ends with: > ... > ---------------------------------------------------------------------- > Ran 5832 tests in 423.039s > OK (KNOWNFAIL=14, SKIP=42) > > > What could be the problem?. Can we trust these tests results? > Python's default behavior is to raise warnings from the same place in the code only once. So when you re-run the tests it doesn't raise those warnings. The tests check that a warning is raised. Hence the failures. Ralf Thanks for replying, Ralf. A natural questions then raises Ralf, Can one obtain fictitious python's error messages or failures of a computation because of such python's default behavior? How can we toggle that to a most reasonable behavior? or how can we associate an error or failure (what hint can we have from the system) to such behavior? Sergio -------------- next part -------------- An HTML attachment was scrubbed... URL: From jniehof at lanl.gov Fri Apr 27 10:16:53 2012 From: jniehof at lanl.gov (Jonathan T. Niehof) Date: Fri, 27 Apr 2012 08:16:53 -0600 Subject: [SciPy-User] Testing scipy: a curiosity (Ralf Gommers) In-Reply-To: <20120427112731.17940@gmx.com> References: <20120427112731.17940@gmx.com> Message-ID: <4F9AAA55.7010301@lanl.gov> On 04/27/2012 05:27 AM, Sergio Rojas wrote: > Thanks for replying, Ralf. A natural questions then raises Ralf, > Can one obtain fictitious python's error messages or failures of a computation > because of such > python's default behavior? How can we toggle that to a most reasonable behavior? > or how can we associate an error or failure (what hint can we have from the system) > to such behavior? http://docs.python.org/library/warnings.html Running python as "python -W all" will always raise warnings. Running as "python -W error" will turn warnings into errors. -- Jonathan Niehof ISR-3 Space Data Systems Los Alamos National Laboratory MS-D466 Los Alamos, NM 87545 Phone: 505-667-9595 email: jniehof at lanl.gov Correspondence / Technical data or Software Publicly Available From warren.weckesser at enthought.com Thu Apr 26 12:20:45 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Thu, 26 Apr 2012 11:20:45 -0500 Subject: [SciPy-User] SciPy 2012 - The Eleventh Annual Conference on Scientific Computing with Python In-Reply-To: References: Message-ID: Dear all, (Sorry if you receive this announcement multiple times.) Registration for SciPy 2012, the eleventh annual Conference on Scientific Computing with Python, is open! Go to https://conference.scipy.org/scipy2012/register/index.php We would like to remind you that the submissions for talks, posters and tutorials are open *until April 30th, *which is just around the corner. For more information see: http://conference.scipy.org/scipy2012/tutorials.php http://conference.scipy.org/scipy2012/talks/index.php For talks or posters, all we need is an abstract. Tutorials require more significant preparation. If you are preparing a tutorial, please send a brief note to Jonathan Rocher (jrocher at enthought.com) to indicate your intent. We look forward to seeing many of you this summer. Kind regards, The SciPy 2012 organizers scipy2012 at scipy.org On Wed, Apr 4, 2012 at 4:30 PM, Warren Weckesser < warren.weckesser at enthought.com> wrote: > SciPy 2012, the eleventh annual Conference on Scientific Computing with > Python, will be held July 16?21, 2012, in Austin, Texas. 
> > At this conference, novel scientific applications and libraries related to > data acquisition, analysis, dissemination and visualization using Python > are presented. Attended by leading figures from both academia and industry, > it is an excellent opportunity to experience the cutting edge of scientific > software development. > > The conference is preceded by two days of tutorials, during which > community experts provide training on several scientific Python packages. > Following the main conference will be two days of coding sprints. > > We invite you to give a talk or present a poster at SciPy 2012. > > The list of topics that are appropriate for the conference includes (but > is not limited to): > > - new Python libraries for science and engineering; > - applications of Python in solving scientific or computational > problems; > - high performance, parallel and GPU computing with Python; > - use of Python in science education. > > > > Specialized Tracks > > Two specialized tracks run in parallel to the main conference: > > - High Performance Computing with Python > Whether your algorithm is distributed, threaded, memory intensive or > latency bound, Python is making headway into the problem. We are looking > for performance driven designs and applications in Python. Candidates > include the use of Python within a parallel application, new architectures, > and ways of making traditional applications execute more efficiently. > > > - Visualization > They say a picture is worth a thousand words--we?re interested in > both! Python provides numerous visualization tools that allow scientists > to show off their work, and we want to know about any new tools and > techniques out there. Come show off your latest graphics, whether it?s an > old library with a slick new feature, a new library out to challenge the > status quo, or simply a beautiful result. > > > > Domain-specific Mini-symposia > > Mini-symposia on the following topics are also being organized: > > - Computational bioinformatics > - Meteorology and climatology > - Astronomy and astrophysics > - Geophysics > > > > Talks, papers and posters > > We invite you to take part by submitting a talk or poster abstract. > Instructions are on the conference website: > > > http://conference.scipy.org/scipy2012/talks.php > > Selected talks are included as papers in the peer-reviewed conference > proceedings, to be published online. > > > Tutorials > > Tutorials will be given July 16?17. We invite instructors to submit > proposals for half-day tutorials on topics relevant to scientific computing > with Python. See > > http://conference.scipy.org/scipy2012/tutorials.php > > for information about submitting a tutorial proposal. To encourage > tutorials of the highest quality, the instructor (or team of instructors) > is given a $1,000 stipend for each half day tutorial. > > > Student/Community Scholarships > > We anticipate providing funding for students and for active members of the > SciPy community who otherwise might not be able to attend the conference. > See > > http://conference.scipy.org/scipy2012/student.php > > for scholarship application guidelines. > > > Be a Sponsor > > The SciPy conference could not run without the generous support of the > institutions and corporations who share our enthusiasm for Python as a tool > for science. Please consider sponsoring SciPy 2012. 
For more information, > see > > http://conference.scipy.org/scipy2012/sponsor/index.php > > > Important dates: > > Monday, April 30: Talk abstracts and tutorial proposals due. > Monday, May 7: Accepted tutorials announced. > Monday, May 13: Accepted talks announced. > > Monday, June 18: Early registration ends. (Price increases after this > date.) > Sunday, July 8: Online registration ends. > > Monday-Tuesday, July 16 - 17: Tutorials > Wednesday-Thursday, July 18 - July 19: Conference > Friday-Saturday, July 20 - July 21: Sprints > > We look forward to seeing you all in Austin this year! > > The SciPy 2012 Team > http://conference.scipy.org/scipy2012/organizers.php > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Apr 27 14:52:00 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 27 Apr 2012 14:52:00 -0400 Subject: [SciPy-User] ANN: statsmodels 0.4.0 Message-ID: We are pleased to announce the release of statsmodels 0.4.0. The big changes in this release are that most models can now be used with Pandas dataframes, and that we dropped the scikits namespace. Importing scikits.statsmodels is still possible but will be removed in the future. Pandas is now a required dependency. For more changes including some breaks in backwards compatibility see below. Josef and Skipper What it is ========== Statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics and estimation and inference for statistical models. Documentation for the 0.4 version is currently at http://statsmodels.sourceforge.net/devel/ Main Changes and Additions in 0.4.0 ----------------------------------- * Added pandas dependency. * Cython source is built automatically if cython and compiler are present * Support use of dates in timeseries models * Improved plots - Violin plots - Bean Plots - QQ Plots * Added lowess function * Support for pandas Series and DataFrame objects. Results instances return pandas objects if the models are fit using pandas objects. * Full Python 3 compatibility * Fix bugs in genfromdta. Convert Stata .dta format to structured array preserving all types. Conversion is much faster now. * Improved documentation * Models and results are pickleable via save/load, optionally saving the model data. * Kernel Density Estimation now uses Cython and is considerably faster. * Diagnostics for outlier and influence statistics in OLS * Added El Nino Sea Surface Temperatures dataset * Numerous bug fixes * Internal code refactoring * Improved documentation including examples as part of HTML * ... *Changes that break backwards compatibility* * Deprecated scikits namespace. The recommended import is now:: import statsmodels.api as sm * model.predict methods signature is now (params, exog, ...) where before it assumed that the model had been fit and omitted the params argument. (this removed circularity between models and results instances) * For consistency with other multi-equation models, the parameters of MNLogit are now transposed. * tools.tools.ECDF -> distributions.ECDF * tools.tools.monotone_fn_inverter -> distributions.monotone_fn_inverter * tools.tools.StepFunction -> distributions.StepFunction Main Features ============= * linear regression models: Generalized least squares (including weighted least squares and least squares with autoregressive errors), ordinary least squares. 
* glm: Generalized linear models with support for all of the one-parameter exponential family distributions. * discrete: regression with discrete dependent variables, including Logit, Probit, MNLogit, Poisson, based on maximum likelihood estimators * rlm: Robust linear models with support for several M-estimators. * tsa: models for time series analysis - univariate time series analysis: AR, ARIMA - vector autoregressive models, VAR and structural VAR - descriptive statistics and process models for time series analysis * nonparametric : (Univariate) kernel density estimators * datasets: Datasets to be distributed and used for examples and in testing. * stats: a wide range of statistical tests - diagnostics and specification tests - goodness-of-fit and normality tests - functions for multiple testing - various additional statistical tests * iolib - Tools for reading Stata .dta files into numpy arrays. - printing table output to ascii, latex, and html * miscellaneous models * sandbox: statsmodels contains a sandbox folder with code in various stages of developement and testing which is not considered "production ready". This covers among others Mixed (repeated measures) Models, GARCH models, general method of moments (GMM) estimators, kernel regression, various extensions to scipy.stats.distributions, panel data models, generalized additive models and information theoretic measures. Where to get it =============== The master branch on GitHub is the most up to date code https://www.github.com/statsmodels/statsmodels Source download of release tags are available on GitHub https://github.com/statsmodels/statsmodels/tags Binaries and source distributions are available from PyPi http://pypi.python.org/pypi/statsmodels/ From dineshbvadhia at hotmail.com Fri Apr 27 22:41:31 2012 From: dineshbvadhia at hotmail.com (Dinesh Vadhia) Date: Fri, 27 Apr 2012 19:41:31 -0700 Subject: [SciPy-User] (no subject) Message-ID: http://lutedeniz.com/orndkty/62076.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From timmichelsen at gmx-topmail.de Sat Apr 28 12:44:17 2012 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Sat, 28 Apr 2012 18:44:17 +0200 Subject: [SciPy-User] please add to planet Message-ID: Please add the following feeds to planet.scipy.org http://pythonxynews.blogspot.de/ http://spyder-ide.blogspot.de/ Thanks. From servant.mathieu at gmail.com Sat Apr 28 13:24:17 2012 From: servant.mathieu at gmail.com (servant mathieu) Date: Sat, 28 Apr 2012 19:24:17 +0200 Subject: [SciPy-User] upgrading scipy on windows seven 64 bits Message-ID: Hi there, The Python (x,y) distribution (version 2.7) is already installed on my computer (running natively windows seven 64 bits). At the moment, the scipy version is '0.9'. I'm interested in using the scipy.optimize.minimize function ( http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html#scipy.optimize.minimize), but I can't manage to find it. Did you change the name of this general-purpose optimization function? Cheers, Mat -------------- next part -------------- An HTML attachment was scrubbed... 
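On the scipy.optimize.minimize question just above: minimize is a new unified front end that only appears in scipy 0.11, so it is not in the 0.9 shipped with Python(x,y). The same solvers are reachable in 0.9 through the individual fmin_* functions; a rough correspondence, shown on the Rosenbrock test function bundled with scipy.optimize:

import numpy as np
from scipy import optimize

x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])

# scipy >= 0.11: minimize(optimize.rosen, x0, method='Nelder-Mead')
x_nm = optimize.fmin(optimize.rosen, x0)

# scipy >= 0.11: minimize(optimize.rosen, x0, method='BFGS', jac=optimize.rosen_der)
x_bfgs = optimize.fmin_bfgs(optimize.rosen, x0, fprime=optimize.rosen_der)

# scipy >= 0.11: minimize(optimize.rosen, x0, method='Powell')
x_pw = optimize.fmin_powell(optimize.rosen, x0)
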
URL: From josef.pktd at gmail.com Sat Apr 28 13:28:36 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 28 Apr 2012 13:28:36 -0400 Subject: [SciPy-User] upgrading scipy on windows seven 64 bits In-Reply-To: References: Message-ID: On Sat, Apr 28, 2012 at 1:24 PM, servant mathieu wrote: > Hi there, > > The Python (x,y) distribution (version 2.7)?is already installed on my > computer (running natively?windows seven 64 bits). At the moment, the scipy > version is '0.9'. > > I'm interested in using the scipy.optimize.minimize function ( > http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html#scipy.optimize.minimize),?but?I > can't manage to find it. Did you change the name of this general-purpose > optimization function? the docstring says New in version 0.11.0. so it's not released yet, it's only available if you build scipy master Josef > > Cheers, > Mat > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From paulojamorim at gmail.com Sat Apr 28 17:50:55 2012 From: paulojamorim at gmail.com (Paulo Henrique Junqueira Amorim) Date: Sat, 28 Apr 2012 18:50:55 -0300 Subject: [SciPy-User] upgrading scipy on windows seven 64 bits In-Reply-To: References: Message-ID: Hi, Is there any provision for the release of a release? Has http://www.lfd.uci.edu/~gohlke/pythonlibs/ with releases, but scipy requieres numpy compiled with MKL from Intel (you may have problem with license). Regards, Paulo On 28 April 2012 14:28, wrote: > On Sat, Apr 28, 2012 at 1:24 PM, servant mathieu > wrote: > > Hi there, > > > > The Python (x,y) distribution (version 2.7) is already installed on my > > computer (running natively windows seven 64 bits). At the moment, the > scipy > > version is '0.9'. > > > > I'm interested in using the scipy.optimize.minimize function ( > > > http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html#scipy.optimize.minimize > ), but I > > can't manage to find it. Did you change the name of this general-purpose > > optimization function? > > the docstring says > New in version 0.11.0. > > so it's not released yet, it's only available if you build scipy master > > Josef > > > > > Cheers, > > Mat > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From paulojamorim at gmail.com Sat Apr 28 17:52:47 2012 From: paulojamorim at gmail.com (Paulo Henrique Junqueira Amorim) Date: Sat, 28 Apr 2012 18:52:47 -0300 Subject: [SciPy-User] upgrading scipy on windows seven 64 bits In-Reply-To: References: Message-ID: Hi, Is there any prevision for a release? Has http://www.lfd.uci.edu/~gohlke/pythonlibs/ with releases, but scipy requieres numpy compiled with MKL from Intel (you may have problem with license). Regards, Paulo On 28 April 2012 18:50, Paulo Henrique Junqueira Amorim < paulojamorim at gmail.com> wrote: > Hi, > > Is there any provision for the release of a release? > > Has http://www.lfd.uci.edu/~gohlke/pythonlibs/ with releases, but scipy > requieres numpy compiled with MKL from Intel (you may have problem with > license). 
> > Regards, > Paulo > > > On 28 April 2012 14:28, wrote: > >> On Sat, Apr 28, 2012 at 1:24 PM, servant mathieu >> wrote: >> > Hi there, >> > >> > The Python (x,y) distribution (version 2.7) is already installed on my >> > computer (running natively windows seven 64 bits). At the moment, the >> scipy >> > version is '0.9'. >> > >> > I'm interested in using the scipy.optimize.minimize function ( >> > >> http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html#scipy.optimize.minimize >> ), but I >> > can't manage to find it. Did you change the name of this general-purpose >> > optimization function? >> >> the docstring says >> New in version 0.11.0. >> >> so it's not released yet, it's only available if you build scipy master >> >> Josef >> >> > >> > Cheers, >> > Mat >> > >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanleeuwen.martin at gmail.com Sat Apr 28 19:40:52 2012 From: vanleeuwen.martin at gmail.com (Martin van Leeuwen) Date: Sat, 28 Apr 2012 16:40:52 -0700 Subject: [SciPy-User] (no subject) Message-ID: Hi, I was trying to work with the mlab.pipeline.triangle_filter in mayavi but couldn't figure out how the method works or how to work with the returned data structure. I assume it is similar to working with the edges in the Delaunay Graph Example: http://github.enthought.com/mayavi/mayavi/auto/example_delaunay_graph.html#example-delaunay-graph , but I would need a bit more information particular to this triangle filter. The final data structure I would like to get it the triangles in is OBJ format (triangle indices into vectors array). Thanks for any help. Martin -------------- next part -------------- An HTML attachment was scrubbed... URL: From servant.mathieu at gmail.com Mon Apr 30 05:15:50 2012 From: servant.mathieu at gmail.com (servant mathieu) Date: Mon, 30 Apr 2012 11:15:50 +0200 Subject: [SciPy-User] problem with the fmin function Message-ID: Dear scipy users, I'm trying to minimize a chi-square value by a simplex routine. My code contains two functions: the first one, called spotdiffusion(a, ter , v , sda , rd) generates some simulated data for a given set of parameter values. The second function, called chis ( a, ter , v , sda , rd), computes a chi-square value from the comparison of observed and simulated data, for a given set of parameter values. To do so, the chis function calls the spotdiffusion one. Here is a simplified structure of my code: def spotdiffusion (a, ter , v , sda , rd): do simulation computations returns an array of simulated data def chis (a, ter , v , sda , rd): observed data simulated_data = spotdiffusion ( a, ter , v , sda , rd) do chi-square computations returns the chi-square value (float) Now I want now to find a, ter, v, sda and rda values which minimize the chi-square. 
Here is my attempt: x0 = np.array ([0.11,0.25,0.35, 1.7, 0.017]) ####initial guess xopt = fmin (chis, x0, maxiter=300) However, python returns an error: Traceback (most recent call last): File "", line 1, in File "C:\Python27\lib\site-packages\spyderlib\widgets\externalshell\startup.py", line 128, in runfile execfile(filename, glbs) File "C:\Users\mathieu\Desktop\modeling\spotlight diffusion model\fitting_spotlight.py", line 246, in xopt = fmin (chis, x0, maxiter=300) File "C:\Python27\lib\site-packages\scipy\optimize\optimize.py", line 257, in fmin fsim[0] = func(x0) File "C:\Python27\lib\site-packages\scipy\optimize\optimize.py", line 176, in function_wrapper return function(x, *args) TypeError: chis() takes exactly 5 arguments (1 given) I don't understand where is my mistake. Any help would be appreciated! Cheers, Mathieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Apr 30 07:08:30 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 30 Apr 2012 07:08:30 -0400 Subject: [SciPy-User] problem with the fmin function In-Reply-To: References: Message-ID: On Mon, Apr 30, 2012 at 5:15 AM, servant mathieu wrote: > Dear scipy users, > > I'm trying to minimize a chi-square value by a simplex routine. My code > contains two functions: the first one, called spotdiffusion(a, ter , v , sda > , rd) generates some simulated data for a given set of parameter values. The > second function, called chis ( a, ter , v , sda , rd), computes a chi-square > value from the comparison of observed and?simulated data, for a given set of > parameter values.?To do so, the chis function calls the spotdiffusion one. > Here is a simplified structure of my code: > > def spotdiffusion (a, ter , v , sda , rd): > ?????????????do simulation computations > ?????????????returns an array of simulated data > > def chis (a, ter , v , sda , rd): the parameter is just one list or array, and you need to unpack, for example def chis (x): a, ter , v , sda , rd = x Josef > ????????????? observed data > ????????????? simulated_data = spotdiffusion ( a, ter , v , sda , rd) > ????????????? do chi-square computations > ????????????? returns the chi-square value (float) > > Now I want now to find a, ter, v, sda and rda values which minimize the > chi-square. Here is my attempt: > > x0 = np.array ([0.11,0.25,0.35, 1.7, 0.017]) ####initial guess > xopt = fmin (chis, x0, maxiter=300) > > However, python returns an error: > Traceback (most recent call last): > ? File "", line 1, in > ? File > "C:\Python27\lib\site-packages\spyderlib\widgets\externalshell\startup.py", > line 128, in runfile > ??? execfile(filename, glbs) > ? File "C:\Users\mathieu\Desktop\modeling\spotlight diffusion > model\fitting_spotlight.py", line 246, in > ??? xopt = fmin (chis, x0, maxiter=300) > ? File "C:\Python27\lib\site-packages\scipy\optimize\optimize.py", line 257, > in fmin > ??? fsim[0] = func(x0) > ? File "C:\Python27\lib\site-packages\scipy\optimize\optimize.py", line 176, > in function_wrapper > ??? return function(x, *args) > TypeError: chis() takes exactly 5 arguments (1 given) > > I don't understand where is my mistake. Any help would be appreciated! 
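To make the fix above concrete: fmin optimizes over a single parameter vector, so the objective should take one array argument and unpack it, while fixed quantities such as the observed data can be passed through fmin's args keyword. A minimal sketch -- the spotdiffusion body and the observed data below are dummy stand-ins so the snippet runs, not the model from this thread:

import numpy as np
from scipy.optimize import fmin

def spotdiffusion(a, ter, v, sda, rd):
    # dummy simulator standing in for the real one
    t = np.linspace(0.0, 1.0, 50)
    return a * np.exp(-t / ter) + v * t + sda * rd

observed = spotdiffusion(0.12, 0.3, 0.4, 1.5, 0.02)

def chis(x, observed):
    a, ter, v, sda, rd = x                      # unpack the parameter vector
    simulated = spotdiffusion(a, ter, v, sda, rd)
    return np.sum((observed - simulated) ** 2)  # chi-square-style discrepancy

x0 = np.array([0.11, 0.25, 0.35, 1.7, 0.017])
xopt = fmin(chis, x0, args=(observed,), maxiter=300)
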
> > Cheers, > Mathieu > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From guziy.sasha at gmail.com Mon Apr 30 07:15:38 2012 From: guziy.sasha at gmail.com (Oleksandr Huziy) Date: Mon, 30 Apr 2012 07:15:38 -0400 Subject: [SciPy-User] problem with the fmin function In-Reply-To: References: Message-ID: Hi, error message says it needs 5 input arguments and you are giving one (the list but still only one), you could change your chis function def chis (x): a, ter , v , sda , rd = x observed data simulated_data = spotdiffusion ( a, ter , v , sda , rd) do chi-square computations returns the chi-square value (float) cheers -- Oleksandr Huziy 2012/4/30 servant mathieu > Dear scipy users, > > I'm trying to minimize a chi-square value by a simplex routine. My code > contains two functions: the first one, called spotdiffusion(a, ter , v , > sda , rd) generates some simulated data for a given set of parameter > values. The second function, called chis ( a, ter , v , sda , rd), computes > a chi-square value from the comparison of observed and simulated data, for > a given set of parameter values. To do so, the chis function calls the > spotdiffusion one. Here is a simplified structure of my code: > > def spotdiffusion (a, ter , v , sda , rd): > do simulation computations > returns an array of simulated data > > def chis (a, ter , v , sda , rd): > observed data > simulated_data = spotdiffusion ( a, ter , v , sda , rd) > do chi-square computations > returns the chi-square value (float) > > Now I want now to find a, ter, v, sda and rda values which minimize the > chi-square. Here is my attempt: > > x0 = np.array ([0.11,0.25,0.35, 1.7, 0.017]) ####initial guess > xopt = fmin (chis, x0, maxiter=300) > > However, python returns an error: > Traceback (most recent call last): > File "", line 1, in > File > "C:\Python27\lib\site-packages\spyderlib\widgets\externalshell\startup.py", > line 128, in runfile > execfile(filename, glbs) > File "C:\Users\mathieu\Desktop\modeling\spotlight diffusion > model\fitting_spotlight.py", line 246, in > xopt = fmin (chis, x0, maxiter=300) > File "C:\Python27\lib\site-packages\scipy\optimize\optimize.py", line > 257, in fmin > fsim[0] = func(x0) > File "C:\Python27\lib\site-packages\scipy\optimize\optimize.py", line > 176, in function_wrapper > return function(x, *args) > TypeError: chis() takes exactly 5 arguments (1 given) > > I don't understand where is my mistake. Any help would be appreciated! > > Cheers, > Mathieu > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bacmsantos at gmail.com Mon Apr 30 09:02:06 2012 From: bacmsantos at gmail.com (Bruno Santos) Date: Mon, 30 Apr 2012 14:02:06 +0100 Subject: [SciPy-User] Alternative to R phyper In-Reply-To: References: Message-ID: Hello everyone, I have a bit of code where I am using rpy2 to import R phyper so I can perform an hypergeometric test. Unfortunately our cluster does not have a functional installation of rpy2 working. So I am wondering if I could translate to scipy which would make the code completly independent of R. The python code I have is as following: def lphyper(self,q,m,n,k): """ self.phyper(self,q,m,n,k) Calculate p-value using R function phyper from rpy2 low-level interface. 
"R Documentation phyper(q, m, n, k, lower.tail = TRUE, log.p = FALSE) q: vector of quantiles representing the number of white balls drawn without replacement from an urn which contains both black and white balls. m: the number of white balls in the urn. n: the number of black balls in the urn. k: the number of balls drawn from the urn. log.p: logical; if TRUE, probabilities p are given as log(p). lower.tail: logical; if TRUE (default), probabilities are P[X <= x], otherwise, P[X > x]. """ phyper_q = SexpVector([q,], rinterface.INTSXP) phyper_m = SexpVector([m,], rinterface.INTSXP) phyper_n = SexpVector([n,], rinterface.INTSXP) phyper_k = SexpVector([k,], rinterface.INTSXP) return phyper(phyper_q,phyper_m,phyper_n,phyper_k,**myparams)[0] I have looked to scipy.stats.hypergeom but it is giving me a different result which is also negative. > 1-phyper(45, 92, 7518, 1329) [1] 6.92113e-13 In [24]: stats.hypergeom.sf(45,(92+7518),92,1329) Out[24]: -8.4343643180773142e-12 This was supposed to be an error with an older version of scipy but I am using more recent versions of it which should not contain the error anymore: In [26]: numpy.__version__ Out[26]: '1.5.1' In [27]: scipy.__version__ Out[27]: '0.9.0' thank you very much in advance for any help. Best, Bruno -------------- next part -------------- An HTML attachment was scrubbed... URL: From servant.mathieu at gmail.com Mon Apr 30 09:54:48 2012 From: servant.mathieu at gmail.com (servant mathieu) Date: Mon, 30 Apr 2012 15:54:48 +0200 Subject: [SciPy-User] problem with the fmin function In-Reply-To: References: Message-ID: Hi Oleksandr, Thank you very much, it works perfectly. The simplex (Nelder-Mead) routine is rather slow. I'm looking for an alternative, faster algorithm which could do the job (minimization of a chi-square) as well. Any idea? Cheers, Mathieu 2012/4/30 Oleksandr Huziy > Hi, error message says it needs 5 input arguments and you are giving one > (the list but still only one), you could change your chis function > > def chis (x): > a, ter , v , sda , rd = x > observed data > simulated_data = spotdiffusion ( a, ter , v , sda , rd) > do chi-square computations > returns the chi-square value (float) > > cheers > > > -- > Oleksandr Huziy > > 2012/4/30 servant mathieu > >> Dear scipy users, >> >> I'm trying to minimize a chi-square value by a simplex routine. My code >> contains two functions: the first one, called spotdiffusion(a, ter , v , >> sda , rd) generates some simulated data for a given set of parameter >> values. The second function, called chis ( a, ter , v , sda , rd), computes >> a chi-square value from the comparison of observed and simulated data, for >> a given set of parameter values. To do so, the chis function calls the >> spotdiffusion one. Here is a simplified structure of my code: >> >> def spotdiffusion (a, ter , v , sda , rd): >> do simulation computations >> returns an array of simulated data >> >> def chis (a, ter , v , sda , rd): >> observed data >> simulated_data = spotdiffusion ( a, ter , v , sda , rd) >> do chi-square computations >> returns the chi-square value (float) >> >> Now I want now to find a, ter, v, sda and rda values which minimize the >> chi-square. 
Here is my attempt: >> >> x0 = np.array ([0.11,0.25,0.35, 1.7, 0.017]) ####initial guess >> xopt = fmin (chis, x0, maxiter=300) >> >> However, python returns an error: >> Traceback (most recent call last): >> File "", line 1, in >> File >> "C:\Python27\lib\site-packages\spyderlib\widgets\externalshell\startup.py", >> line 128, in runfile >> execfile(filename, glbs) >> File "C:\Users\mathieu\Desktop\modeling\spotlight diffusion >> model\fitting_spotlight.py", line 246, in >> xopt = fmin (chis, x0, maxiter=300) >> File "C:\Python27\lib\site-packages\scipy\optimize\optimize.py", line >> 257, in fmin >> fsim[0] = func(x0) >> File "C:\Python27\lib\site-packages\scipy\optimize\optimize.py", line >> 176, in function_wrapper >> return function(x, *args) >> TypeError: chis() takes exactly 5 arguments (1 given) >> >> I don't understand where is my mistake. Any help would be appreciated! >> >> Cheers, >> Mathieu >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guziy.sasha at gmail.com Mon Apr 30 10:01:20 2012 From: guziy.sasha at gmail.com (Oleksandr Huziy) Date: Mon, 30 Apr 2012 10:01:20 -0400 Subject: [SciPy-User] problem with the fmin function In-Reply-To: References: Message-ID: Try powell method, it could be faster. http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fmin_powell.html#scipy.optimize.fmin_powell you may also check out others here http://docs.scipy.org/doc/scipy/reference/optimize.html Cheers -- Oleksandr 2012/4/30 servant mathieu > Hi Oleksandr, > > Thank you very much, it works perfectly. > The simplex (Nelder-Mead) routine is rather slow. I'm looking for an > alternative, faster algorithm which could do the job (minimization of a > chi-square) as well. Any idea? > Cheers, > Mathieu > > > > > 2012/4/30 Oleksandr Huziy > >> Hi, error message says it needs 5 input arguments and you are giving one >> (the list but still only one), you could change your chis function >> >> def chis (x): >> a, ter , v , sda , rd = x >> observed data >> simulated_data = spotdiffusion ( a, ter , v , sda , rd) >> do chi-square computations >> returns the chi-square value (float) >> >> cheers >> >> >> -- >> Oleksandr Huziy >> >> 2012/4/30 servant mathieu >> >>> Dear scipy users, >>> >>> I'm trying to minimize a chi-square value by a simplex routine. My code >>> contains two functions: the first one, called spotdiffusion(a, ter , v , >>> sda , rd) generates some simulated data for a given set of parameter >>> values. The second function, called chis ( a, ter , v , sda , rd), computes >>> a chi-square value from the comparison of observed and simulated data, for >>> a given set of parameter values. To do so, the chis function calls the >>> spotdiffusion one. Here is a simplified structure of my code: >>> >>> def spotdiffusion (a, ter , v , sda , rd): >>> do simulation computations >>> returns an array of simulated data >>> >>> def chis (a, ter , v , sda , rd): >>> observed data >>> simulated_data = spotdiffusion ( a, ter , v , sda , rd) >>> do chi-square computations >>> returns the chi-square value (float) >>> >>> Now I want now to find a, ter, v, sda and rda values which minimize the >>> chi-square. 
Here is my attempt: >>> >>> x0 = np.array ([0.11,0.25,0.35, 1.7, 0.017]) ####initial guess >>> xopt = fmin (chis, x0, maxiter=300) >>> >>> However, python returns an error: >>> Traceback (most recent call last): >>> File "", line 1, in >>> File >>> "C:\Python27\lib\site-packages\spyderlib\widgets\externalshell\startup.py", >>> line 128, in runfile >>> execfile(filename, glbs) >>> File "C:\Users\mathieu\Desktop\modeling\spotlight diffusion >>> model\fitting_spotlight.py", line 246, in >>> xopt = fmin (chis, x0, maxiter=300) >>> File "C:\Python27\lib\site-packages\scipy\optimize\optimize.py", line >>> 257, in fmin >>> fsim[0] = func(x0) >>> File "C:\Python27\lib\site-packages\scipy\optimize\optimize.py", line >>> 176, in function_wrapper >>> return function(x, *args) >>> TypeError: chis() takes exactly 5 arguments (1 given) >>> >>> I don't understand where is my mistake. Any help would be appreciated! >>> >>> Cheers, >>> Mathieu >>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From emmanuelle.gouillart at normalesup.org Mon Apr 30 17:44:58 2012 From: emmanuelle.gouillart at normalesup.org (Emmanuelle Gouillart) Date: Mon, 30 Apr 2012 23:44:58 +0200 Subject: [SciPy-User] Euroscipy 2012 deadline extension: May 7th Message-ID: <20120430214458.GB16609@phare.normalesup.org> The committee of the Euroscipy 2012 conference has extended the deadline for abstract submission to **Monday May 7th, midnight** (Brussels time). Up to then, new abstracts may be submitted on http://www.euroscipy.org/conference/euroscipy2012, and already-submitted abstracts can be modified. We are very much looking forward to your submissions to the conference. Euroscipy 2012 is the annual European conference for scientists using Python. It will be held August 23-27 2012 in Brussels, Belgium. It is also still possible to propose sprints that will take place after the conference, please write to Berkin Malkoc (malkocb at itu.edu.tr) for practical organization (rooms, ...). Any other questions should be addressed exclusively to org-team at lists.euroscipy.org -- Emmanuelle From jrennie at gmail.com Mon Apr 30 18:15:40 2012 From: jrennie at gmail.com (Jason Rennie) Date: Mon, 30 Apr 2012 18:15:40 -0400 Subject: [SciPy-User] setting submatrix of sparse matrix In-Reply-To: References: Message-ID: When I work with sparse matrices, I prepare a list of row-column-value tuples, construct a COOmatrix using that RCV data, then convert to CSRor CSC. I never modify a sparse matrix once built. It sounds like this may be a common approach per Christopher Mutel's message in the "sparse matrix construction question" thread from a few weeks ago. Cheers, Jason On Fri, Apr 20, 2012 at 3:07 PM, Duilio Tadeu wrote: > Hi, > > I am developing a simple Finite Elements code and there a part of the code > where I need to set some elements of the sparse matrix (actually a > submatrix), > But I could not make it work. I have tried many ways. > > I describe the better way I could try. 
The array index idx is 1x3, > so idx1 and idx2 is 3x3. > The (small) matrix M is 3x3. The matrix A is a sparse matrix, > lil_matrix sparse matrix of scipy.sparse > > ###### partial code ################## > > M = local_matrix(x,y) > M=M.ravel() > > idx1,idx2 = numpy.meshgrid(idx,idx) > idx1=idx1.ravel() > idx2=idx2.ravel() > > A[idx1,idx2] = A[idx1,idx2] + M > > ################################### > > In fact, I am trying to migrate from MATLAB to Scipy/Numpy, but this > part is taking some time and is > becoming frustrating, since I could not find any kind of way to > perform this task. > In MATLAB, the code would be just: > > M = local_matrix(x,y) > A[idx,idx] = A[idx,idx] + M > > Which is shorter and easy. I send next the error message. Any help > is welcome. Thanks, Duilio > > 131 idx2=idx2.ravel() > 132 > --> 133 A[idx1,idx2] = A[idx1,idx2] + M > > /usr/lib/python2.6/dist-packages/scipy/sparse/lil.pyc in > __setitem__(self, index, x) > 318 else: > 319 for ii, jj, xx in zip(i, j, x): > --> 320 self._insertat(ii, jj, xx) > 321 elif isinstance(i, slice) or issequence(i): > 322 rows = self.rows[i] > > /usr/lib/python2.6/dist-packages/scipy/sparse/lil.pyc in > _insertat(self, i, j, x) > 230 row = self.rows[i] > 231 data = self.data[i] > --> 232 self._insertat2(row, data, j, x) > 233 > 234 def _insertat2(self, row, data, j, x): > > /usr/lib/python2.6/dist-packages/scipy/sparse/lil.pyc in > _insertat2(self, row, data, j, x) > 244 > 245 if not np.isscalar(x): > --> 246 raise ValueError('setting an array element with a > sequence') > 247 > 248 try: > > ValueError: setting an array element with a sequence > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Jason Rennie Research Scientist ITA Software by Google +1 617-446-3651 -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Mon Apr 30 21:12:52 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Mon, 30 Apr 2012 20:12:52 -0500 Subject: [SciPy-User] SciPy 2012 Abstract and Tutorial Deadlines Extended Message-ID: SciPy 2012 Conference Deadlines Extended Didn't quite finish your abstract or tutorial yet? Good news: the SciPy 2012 organizers have extended the deadline until Friday, May 4. Proposals for tutorials and abstracts for talks and posters are now due by midnight (Austin time, CDT), May 4. For the many of you who have already submitted an abstract or tutorial: thanks! If you need to make corrections to an abstract or tutorial that you have already submitted, you may resubmit it by the same deadline. The SciPy 2012 Organizers -------------- next part -------------- An HTML attachment was scrubbed... URL:
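To round off the sparse-matrix thread above: a minimal sketch of the build-then-convert approach Jason describes, applied to the kind of assembly Duilio's finite-element code does. The element loop, the local 3x3 matrix and the index arrays are placeholders; the useful property is that coo_matrix sums duplicate (row, col) entries when it is converted to CSR/CSC, which gives exactly the A[idx, idx] += M accumulation the MATLAB code relies on:

import numpy as np
from scipy import sparse

ndof = 100                                      # placeholder problem size
rows, cols, vals = [], [], []

for element in range(50):                       # placeholder element loop
    idx = np.random.randint(0, ndof, size=3)    # stand-in for the element's dof indices
    M = np.random.rand(3, 3)                    # stand-in for local_matrix(x, y)
    rows.extend(np.repeat(idx, 3))              # M[i, j] is accumulated into A[idx[i], idx[j]]
    cols.extend(np.tile(idx, 3))
    vals.extend(M.ravel())

A = sparse.coo_matrix((vals, (rows, cols)), shape=(ndof, ndof)).tocsr()

For large meshes this also tends to be considerably faster than updating a lil_matrix element by element.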